taskplaner/.planning/research/FEATURES.md
Thomas Richter 5dbabe6a2d docs: complete v2.0 CI/CD and observability research
Files:
- STACK-v2-cicd-observability.md (ArgoCD, Prometheus, Loki, Alloy)
- FEATURES.md (updated with CI/CD and observability section)
- ARCHITECTURE.md (updated with v2.0 integration architecture)
- PITFALLS-CICD-OBSERVABILITY.md (14 critical/moderate/minor pitfalls)
- SUMMARY-v2-cicd-observability.md (synthesis with roadmap implications)

Key findings:
- Stack: kube-prometheus-stack + Loki monolithic + Alloy (Promtail EOL March 2026)
- Architecture: 3-phase approach - GitOps first, observability second, CI tests last
- Critical pitfall: ArgoCD TLS redirect loop, Loki disk exhaustion, k3s metrics config

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 03:29:23 +01:00

# Feature Research
**Domain:** Personal Task/Notes Web App (self-hosted, single-user)
**Researched:** 2026-01-29
**Confidence:** MEDIUM (based on competitor analysis via WebFetch + domain knowledge)
## Feature Landscape
### Table Stakes (Users Expect These)
Features users assume exist. Missing these = product feels incomplete.
| Feature | Why Expected | Complexity | Notes |
|---------|--------------|------------|-------|
| Create/Edit/Delete items | Basic CRUD is fundamental | LOW | Text input with persistence |
| Distinguish tasks vs thoughts | Project requirement; users expect to filter by type | LOW | Boolean or enum field |
| Mark tasks complete | Core task management; every competitor has this | LOW | Status toggle |
| Cross-device access | Project requirement; web app enables this inherently | LOW | Responsive design |
| Search | Users expect to find content quickly; all competitors offer this | MEDIUM | Full-text search on title + content |
| Tags/labels | Standard organization pattern (Todoist, Bear, Simplenote all have) | MEDIUM | Many-to-many relationship |
| Image attachments | Project requirement for digitizing paper notes | MEDIUM | File upload, storage, display |
| Mobile-friendly UI | "Any device" access means mobile must work | MEDIUM | Responsive design, touch targets |
| Data persistence | Notes must survive restart | LOW | Database storage |
| Quick capture | Fast entry is table stakes (Todoist: "Capture...the moment they come to you") | LOW | Minimal friction input |
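
The table-stakes rows above imply a small data model. The following is a minimal sketch, assuming an in-memory list instead of a database; the `Item` shape, its field names, and the `searchItems` helper are illustrative assumptions and not taken from the project.

```typescript
// Hypothetical data model: all names below are illustrative assumptions.
type ItemKind = "task" | "thought";

interface Item {
  id: number;
  kind: ItemKind;      // "Distinguish tasks vs thoughts" as an enum-like field
  title: string;
  content: string;
  completed: boolean;  // status toggle; only meaningful for tasks
  tags: string[];      // a join table in a real database; flattened here
}

// Case-insensitive substring match over title + content, optionally narrowed by tag.
function searchItems(items: Item[], query: string, tag?: string): Item[] {
  const needle = query.trim().toLowerCase();
  return items.filter((item) => {
    const haystack = `${item.title}\n${item.content}`.toLowerCase();
    const matchesText = needle === "" || haystack.includes(needle);
    const matchesTag = tag === undefined || item.tags.includes(tag);
    return matchesText && matchesTag;
  });
}

const sampleItems: Item[] = [
  { id: 1, kind: "task", title: "Buy milk", content: "2 litres", completed: false, tags: ["home"] },
  { id: 2, kind: "thought", title: "App idea", content: "Scan paper notes", completed: false, tags: ["ideas"] },
  { id: 3, kind: "task", title: "Scan receipts", content: "", completed: true, tags: ["home"] },
];

// Text-only search, then the same search narrowed to the "home" tag.
console.log(searchItems(sampleItems, "scan").map((item) => item.id));
console.log(searchItems(sampleItems, "scan", "home").map((item) => item.id));
```

A substring scan like this is only reasonable for a small personal data set; a database-backed full-text index would replace it in practice.
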
### Differentiators (Competitive Advantage)
Features that set the product apart. Not required, but valuable.
| Feature | Value Proposition | Complexity | Notes |
|---------|-------------------|------------|-------|
| OCR on images | Search text within uploaded images (Evernote, Bear Pro have this) | HIGH | Requires OCR library/service |
| Image annotation | Mark up photos of paper notes | HIGH | Canvas drawing, save state |
| Natural language dates | "tomorrow", "next Monday" (Todoist signature feature) | MEDIUM | Date parsing library |
| Recurring tasks | Habits and repeated items (Todoist core feature) | MEDIUM | RRULE or simple patterns |
| Offline support | Work without internet, sync later | HIGH | Service worker, conflict resolution |
| Keyboard shortcuts | Power user efficiency | LOW | Event handlers |
| Dark mode | User preference, reduces eye strain | LOW | CSS variables, theme toggle |
| Markdown support | Rich formatting without WYSIWYG complexity (Simplenote, Bear, Obsidian) | MEDIUM | Markdown parser + preview |
| Note linking | Connect related items (Obsidian's core feature) | MEDIUM | Internal link syntax, backlinks |
| Document scanning | Camera capture with perspective correction (Evernote) | HIGH | Camera API, image processing |
| Export/backup | Data portability, user owns data (Obsidian philosophy) | LOW | JSON/Markdown export |
| Drag-and-drop reorder | Intuitive organization (Todoist Upcoming view) | MEDIUM | Sortable library, persist order |
| Pin/favorite items | Quick access to important items (Bear) | LOW | Boolean field, UI section |
| Due dates with reminders | Time-sensitive tasks (all task apps have this) | MEDIUM | Date field + notification system |
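
As an illustration of why Export/backup is rated LOW complexity, here is a minimal sketch of a Markdown export. The `ExportItem` shape and the output layout (checkbox list with `#tag` suffixes) are assumptions made for this example, not a format the project defines.

```typescript
// Hypothetical item shape used only for this export sketch.
interface ExportItem {
  kind: "task" | "thought";
  title: string;
  content: string;
  completed: boolean;
  tags: string[];
}

// Renders tasks as checkbox list entries and thoughts as plain bullets.
function exportToMarkdown(items: ExportItem[]): string {
  return items
    .map((item) => {
      const prefix =
        item.kind === "task" ? (item.completed ? "- [x] " : "- [ ] ") : "- ";
      const tags =
        item.tags.length > 0 ? " " + item.tags.map((t) => `#${t}`).join(" ") : "";
      // Indent the body so it stays attached to its list entry.
      const body = item.content ? "\n  " + item.content.replace(/\n/g, "\n  ") : "";
      return `${prefix}${item.title}${tags}${body}`;
    })
    .join("\n");
}

// JSON export needs no custom code at all.
function exportToJson(items: ExportItem[]): string {
  return JSON.stringify(items, null, 2);
}

console.log(
  exportToMarkdown([
    { kind: "task", title: "Buy milk", content: "", completed: true, tags: ["home"] },
    { kind: "thought", title: "App idea", content: "Scan paper notes", completed: false, tags: [] },
  ]),
);
```
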
### Anti-Features (Commonly Requested, Often Problematic)
Features that seem good but create problems for a personal, single-user app.
| Feature | Why Requested | Why Problematic | Alternative |
|---------|---------------|-----------------|-------------|
| Real-time collaboration | "Maybe I'll share with family" | Massive complexity (OT/CRDT), scope creep, conflicts with "personal" app | Export/share single notes manually |
| AI-powered categorization | Trendy, seems smart | Over-engineering for personal use; manual tags are clearer | Good tag UX + search |
| Complex folder hierarchies | "I want to organize everything" | Deep nesting causes friction; flat + tags is more flexible | Tags with hierarchy (nested tags) |
| Kanban boards | Looks nice, seems productive | Overhead for personal tasks; simple list often better | Optional board view later |
| Multiple note types | "Journals, wikis, tasks, etc." | Complicates the data model and UI; blurs the simple task/thought distinction | Two types (task/thought) with tags |
| Social features | Share achievements, collaborate | Out of scope for self-hosted personal app | None |
| Heavy WYSIWYG editor | "I want formatting" | Bloated, complex, mobile-unfriendly | Markdown with preview |
| Notifications/alerts on web | Keep me on track | Browser notifications are annoying, unreliable | Focus on capture, not nagging |
| Version history | "I might want to undo" | Storage overhead, complexity for personal use | Simple edit; consider soft-delete |
| Multi-user/auth | "Maybe I'll share the server" | Security complexity, out of scope | Single-user by design |
## Feature Dependencies
```
[Search]
    └──requires──> [Data persistence]

[Tags]
    └──requires──> [Data persistence]
    └──enhances──> [Search] (filter by tag)

[Image attachments]
    └──requires──> [Data persistence]
    └──requires──> [File storage]

[OCR on images]
    └──requires──> [Image attachments]

[Image annotation]
    └──requires──> [Image attachments]

[Markdown support]
    └──enhances──> [Create/Edit items]

[Note linking]
    └──requires──> [Data persistence]
    └──enhances──> [Search] (backlink discovery)

[Recurring tasks]
    └──requires──> [Mark tasks complete]

[Due dates with reminders]
    └──requires──> [Distinguish tasks vs thoughts]

[Offline support]
    └──conflicts──> [Real-time collaboration] (sync conflicts)
    └──requires──> [Data persistence]

[Export/backup]
    └──requires──> [Data persistence]
```
### Dependency Notes
- **OCR requires Image attachments:** Cannot search images if they don't exist
- **Tags enhance Search:** Tag filtering is a search feature
- **Offline conflicts with Real-time collab:** Sync conflict resolution is hard; single-user sidesteps this
- **Recurring tasks require completion status:** Need to know when to regenerate
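
The `requires` edges in the diagram can be turned into a build order with a depth-first topological sort. This is a sketch for illustration only: the `requires` map copies the edges above, while `buildOrder` is a hypothetical helper name.

```typescript
// "requires" edges copied from the dependency diagram; "enhances" and
// "conflicts" edges are left out because they do not constrain build order.
const requires: Record<string, string[]> = {
  "Search": ["Data persistence"],
  "Tags": ["Data persistence"],
  "Image attachments": ["Data persistence", "File storage"],
  "OCR on images": ["Image attachments"],
  "Image annotation": ["Image attachments"],
  "Note linking": ["Data persistence"],
  "Recurring tasks": ["Mark tasks complete"],
  "Due dates with reminders": ["Distinguish tasks vs thoughts"],
  "Offline support": ["Data persistence"],
  "Export/backup": ["Data persistence"],
};

// Depth-first topological sort: every feature appears after its prerequisites.
function buildOrder(graph: Record<string, string[]>): string[] {
  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();

  const visit = (feature: string): void => {
    const current = state.get(feature);
    if (current === "done") return;
    if (current === "visiting") throw new Error(`Dependency cycle at ${feature}`);
    state.set(feature, "visiting");
    for (const dependency of graph[feature] ?? []) visit(dependency);
    state.set(feature, "done");
    order.push(feature);
  };

  for (const feature of Object.keys(graph)) visit(feature);
  return order;
}

console.log(buildOrder(requires));
```

Any order it returns places Data persistence before Search and Image attachments before OCR, which matches the MVP sequencing in the next section.
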
## MVP Definition
### Launch With (v1)
Minimum viable product based on project requirements:
- [x] Create/Edit/Delete items (task or thought) -- core CRUD
- [x] Distinguish tasks vs thoughts -- project requirement
- [x] Mark tasks complete -- essential task management
- [x] Image attachments -- project requirement for digitizing paper notes
- [x] Tags for organization -- project requirement
- [x] Search -- project requirement
- [x] Mobile-responsive UI -- "any device" requirement
- [x] Containerized deployment -- project requirement
### Add After Validation (v1.x)
Features to add once core is working:
- [ ] Dark mode -- low complexity, high user satisfaction
- [ ] Keyboard shortcuts -- power user efficiency
- [ ] Pin/favorite items -- quick access to important items
- [ ] Export to JSON/Markdown -- data portability
- [ ] Due dates on tasks -- natural extension of task type
- [ ] Drag-and-drop reorder -- better organization UX
### Future Consideration (v2+)
Features to defer until the core app has proven itself in daily use:
- [ ] Markdown support -- adds complexity to editing
- [ ] Natural language dates -- requires parsing library
- [ ] Recurring tasks -- adds state machine complexity
- [ ] OCR on images -- requires external service/library
- [ ] Note linking -- changes how users think about the app
- [ ] Offline support -- significant complexity (service worker, sync)
## Feature Prioritization Matrix
| Feature | User Value | Implementation Cost | Priority |
|---------|------------|---------------------|----------|
| Create/Edit/Delete | HIGH | LOW | P1 |
| Task vs Thought distinction | HIGH | LOW | P1 |
| Mark complete | HIGH | LOW | P1 |
| Image attachments | HIGH | MEDIUM | P1 |
| Tags | HIGH | MEDIUM | P1 |
| Search | HIGH | MEDIUM | P1 |
| Mobile-responsive | HIGH | MEDIUM | P1 |
| Dark mode | MEDIUM | LOW | P2 |
| Keyboard shortcuts | MEDIUM | LOW | P2 |
| Pin/favorite | MEDIUM | LOW | P2 |
| Export | MEDIUM | LOW | P2 |
| Due dates | MEDIUM | MEDIUM | P2 |
| Drag-and-drop | MEDIUM | MEDIUM | P2 |
| Markdown | MEDIUM | MEDIUM | P3 |
| Natural language dates | LOW | MEDIUM | P3 |
| Recurring tasks | MEDIUM | HIGH | P3 |
| OCR | LOW | HIGH | P3 |
| Note linking | LOW | MEDIUM | P3 |
| Offline support | LOW | HIGH | P3 |
**Priority key:**
- P1: Must have for launch (MVP)
- P2: Should have, add when possible (v1.x)
- P3: Nice to have, future consideration (v2+)
## Competitor Feature Analysis
| Feature | Todoist | Bear | Simplenote | Obsidian | Our Approach |
|---------|---------|------|------------|----------|--------------|
| Task management | Core focus | Checkbox in notes | Basic lists | Plugin | First-class task type |
| Notes/thoughts | Via descriptions | Core focus | Core focus | Core focus | First-class thought type |
| Image attachments | Yes | Yes (with sketching) | No | Yes | Yes, for paper note capture |
| Tags | Yes (labels) | Yes (with icons) | Yes | Yes | Yes, simple tags |
| Search | Advanced filters | OCR search (Pro) | Instant search | Full-text | Full-text + tag filter |
| Sync | Cloud (their servers) | iCloud | Cloud (their servers) | Local-first, optional sync | Self-hosted, cross-device via web |
| Markdown | Limited | Yes | Yes | Yes (core) | Defer to v2+ |
| Offline | Yes | Yes | Yes | Yes (local-first) | Defer to v2+ |
| Mobile | Native apps | Native (Apple only) | Native apps | Native apps | Responsive web |
**Our differentiation:** Self-hosted, single-user simplicity with explicit task/thought distinction and image attachment for digitizing paper notes. No account required, no cloud dependency, you own your data.
## Confidence Notes
| Section | Confidence | Rationale |
|---------|------------|-----------|
| Table Stakes | HIGH | Verified against Todoist, Bear, Simplenote, Evernote via WebFetch |
| Differentiators | MEDIUM | Based on competitor features; value proposition is hypothesis |
| Anti-Features | MEDIUM | Based on domain experience; specific to single-user context |
| Dependencies | HIGH | Logical dependencies from requirements |
| MVP Definition | HIGH | Derived directly from project requirements |
## Sources
- Todoist features page (verified via WebFetch)
- Obsidian home page (verified via WebFetch)
- Bear app home page (verified via WebFetch)
- Simplenote home page (verified via WebFetch)
- Evernote features page (verified via WebFetch)
---
# CI/CD and Observability Features
**Domain:** CI/CD pipelines and Kubernetes observability for personal project
**Researched:** 2026-02-03
**Context:** Single-user, self-hosted TaskPlanner app with existing basic Gitea Actions pipeline
## Current State
Based on the existing `.gitea/workflows/build.yaml`:
- Build and push Docker images to Gitea Container Registry
- Docker layer caching enabled
- Automatic Helm values update with new image tag
- No tests in pipeline
- No GitOps automation (ArgoCD defined but requires manual sync)
- No observability stack
---
## Table Stakes
Features required for production-grade operations. Missing any of these means the system is incomplete for reliable self-hosting.
### CI/CD Pipeline
| Feature | Why Expected | Complexity | Notes |
|---------|--------------|------------|-------|
| **Automated tests in pipeline** | Catch bugs before deployment; without tests, pipeline is just a build script | Low | Start with unit tests (70% of test pyramid), add integration tests later |
| **Build caching** | Already have this | - | Using Docker layer cache to registry |
| **Lint/static analysis** | Catch errors early (fail fast principle) | Low | ESLint, TypeScript checking |
| **Pipeline as code** | Already have this | - | Workflow defined in `.gitea/workflows/` |
| **Automated deployment trigger** | Manual `helm upgrade` defeats CI/CD purpose | Low | ArgoCD auto-sync on Git changes |
| **Container image tagging** | Already have this | - | SHA-based tags plus a `latest` tag |
### GitOps
| Feature | Why Expected | Complexity | Notes |
|---------|--------------|------------|-------|
| **Git as single source of truth** | Core GitOps principle; cluster state should match Git | Low | ArgoCD watches Git repo, syncs to cluster |
| **Auto-sync** | Manual sync defeats GitOps purpose | Low | ArgoCD `syncPolicy.automated.enabled: true` |
| **Self-healing** | Prevents drift; if someone edits a resource with `kubectl`, ArgoCD reverts the change | Low | ArgoCD `syncPolicy.automated.selfHeal: true` |
| **Health checks** | Know if deployment succeeded | Low | ArgoCD built-in health status |
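
To make the GitOps rows concrete, here is a sketch of an ArgoCD `Application` manifest built as a plain object and printed as JSON, which `kubectl apply -f` accepts alongside YAML. Every input value (app name, repository URL, chart path, namespaces) is a placeholder assumption. The sketch sets only `selfHeal` and `prune` inside the `automated` block; whether your ArgoCD version also accepts the explicit `enabled` flag from the table should be checked against its documentation.

```typescript
// Hypothetical inputs: every value passed in below is a placeholder.
interface AppParams {
  name: string;
  repoURL: string;
  path: string;
  namespace: string;
}

function buildApplication(p: AppParams) {
  return {
    apiVersion: "argoproj.io/v1alpha1",
    kind: "Application",
    // Assumes ArgoCD is installed in the conventional "argocd" namespace.
    metadata: { name: p.name, namespace: "argocd" },
    spec: {
      project: "default",
      source: { repoURL: p.repoURL, targetRevision: "HEAD", path: p.path },
      destination: { server: "https://kubernetes.default.svc", namespace: p.namespace },
      syncPolicy: {
        // The presence of the "automated" block turns on auto-sync.
        automated: {
          selfHeal: true, // revert manual changes made in the cluster
          prune: false,   // keep resources that were deleted from Git
        },
      },
    },
  };
}

const manifest = buildApplication({
  name: "taskplanner",
  repoURL: "https://git.example.com/user/taskplanner.git",
  path: "helm/taskplanner",
  namespace: "default",
});

console.log(JSON.stringify(manifest, null, 2));
```
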
### Observability
| Feature | Why Expected | Complexity | Notes |
|---------|--------------|------------|-------|
| **Basic metrics collection** | Know if app is running, resource usage | Medium | Prometheus + kube-state-metrics |
| **Metrics visualization** | Metrics without dashboards are useless | Low | Grafana with pre-built Kubernetes dashboards |
| **Container logs aggregation** | Debug issues without `kubectl logs` | Medium | Loki (lightweight, label-based) |
| **Basic alerting** | Know when something breaks | Low | Alertmanager with 3-5 critical alerts |
---
## Differentiators
Features that add significant value but are not strictly required for a single-user personal app. Implement them for the learning value or for improved reliability.
### CI/CD Pipeline
| Feature | Value Proposition | Complexity | Notes |
|---------|-------------------|------------|-------|
| **Smoke tests on deploy** | Verify deployment actually works | Medium | Hit health endpoint after deploy |
| **Build notifications** | Know when builds fail without watching | Low | Slack/Discord/email webhook |
| **DORA metrics tracking** | Track deployment frequency, lead time | Medium | Measure CI/CD effectiveness |
| **Parallel test execution** | Faster feedback on larger test suites | Medium | Only valuable with substantial test suite |
| **Dependency vulnerability scanning** | Catch security issues early | Low | `npm audit`, Trivy for container images |
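
The smoke-test row ("hit health endpoint after deploy") could look roughly like the sketch below. The `/health` path, the retry count, and the delay are assumptions; the fetcher is injected so the function can be exercised without a network.

```typescript
// Minimal response shape; the global fetch Response satisfies it structurally.
type HealthFetcher = (url: string) => Promise<{ ok: boolean; status: number }>;

// Polls the endpoint until it answers with a 2xx status or attempts run out.
async function smokeTest(
  url: string,
  fetcher: HealthFetcher,
  attempts = 5,
  delayMs = 2000,
): Promise<boolean> {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      const response = await fetcher(url);
      if (response.ok) return true;
    } catch {
      // Network error: the pod may still be starting, so fall through and retry.
    }
    if (attempt < attempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}

// Usage in a pipeline step (hypothetical URL), exiting non-zero on failure:
// smokeTest("https://taskplanner.example.com/health", (u) => fetch(u))
//   .then((healthy) => process.exit(healthy ? 0 : 1));
```

Retrying matters here because ArgoCD syncs asynchronously, so the new pod may not be ready at the moment the pipeline step starts.
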
### GitOps
| Feature | Value Proposition | Complexity | Notes |
|---------|-------------------|------------|-------|
| **Automated pruning** | Remove resources deleted from Git | Low | ArgoCD `prune: true` |
| **Sync windows** | Control when syncs happen | Low | Useful if you want maintenance windows |
| **Application health dashboard** | Visual cluster state | Low | ArgoCD UI already provides this |
| **Git commit status** | See deployment status in Gitea | Medium | ArgoCD notifications to Git |
### Observability
| Feature | Value Proposition | Complexity | Notes |
|---------|-------------------|------------|-------|
| **Application-level metrics** | Track business metrics (tasks created, etc.) | Medium | Custom Prometheus metrics in app |
| **Request tracing** | Debug latency issues | High | OpenTelemetry, Tempo/Jaeger |
| **SLO/SLI dashboards** | Define and track reliability targets | Medium | Error budgets, latency percentiles |
| **Log-based alerting** | Alert on error patterns | Medium | Loki alerting rules |
| **Uptime monitoring** | External availability check | Low | Uptime Kuma or similar |
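
For the application-level metrics row, the sketch below shows what a custom counter looks like in the Prometheus text exposition format. It is hand-rolled to stay self-contained; a real app would more likely use a client library. The metric name `taskplanner_items_created_total` and the `kind` label are assumptions.

```typescript
// Minimal counter that renders itself in the Prometheus text format.
class CounterMetric {
  private readonly samples = new Map<string, number>();

  constructor(
    private readonly name: string,
    private readonly help: string,
  ) {}

  // Increments the series identified by the given label set.
  inc(labels: Record<string, string> = {}, amount = 1): void {
    const key = CounterMetric.formatLabels(labels);
    this.samples.set(key, (this.samples.get(key) ?? 0) + amount);
  }

  // Produces the body a /metrics endpoint would return for this counter.
  render(): string {
    const lines = [`# HELP ${this.name} ${this.help}`, `# TYPE ${this.name} counter`];
    this.samples.forEach((value, labels) => {
      lines.push(`${this.name}${labels} ${value}`);
    });
    return lines.join("\n") + "\n";
  }

  // Sorts label names and escapes backslash, quote, and newline in values.
  private static formatLabels(labels: Record<string, string>): string {
    const pairs = Object.keys(labels)
      .sort()
      .map((key) => {
        const escaped = (labels[key] ?? "")
          .replace(/\\/g, "\\\\")
          .replace(/"/g, '\\"')
          .replace(/\n/g, "\\n");
        return `${key}="${escaped}"`;
      });
    return pairs.length > 0 ? `{${pairs.join(",")}}` : "";
  }
}

const itemsCreated = new CounterMetric(
  "taskplanner_items_created_total",
  "Items created, by kind.",
);
itemsCreated.inc({ kind: "task" });
itemsCreated.inc({ kind: "thought" });
console.log(itemsCreated.render());
```

Keeping the label set to a few fixed values such as `kind` avoids the high-cardinality metrics that the complexity budget flags as over-budget.
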
---
## Anti-Features
Features that are overkill for a single-user personal app. Actively avoid these to prevent over-engineering.
| Anti-Feature | Why Avoid | What to Do Instead |
|--------------|-----------|-------------------|
| **Multi-environment promotion (dev/staging/prod)** | Single user, single environment | Deploy directly to prod; use feature flags if needed |
| **Blue-green/canary deployments** | Complex rollout for single user is overkill | Simple rolling update; ArgoCD rollback if needed |
| **Full E2E test suite in CI** | Expensive, slow, diminishing returns for personal app | Unit + smoke tests; manual E2E when needed |
| **High availability ArgoCD** | HA is for multi-team, multi-tenant | Single replica ArgoCD is fine |
| **Distributed tracing** | Overkill unless debugging microservices latency | Only add if you have multiple services with latency issues |
| **ELK stack for logging** | Resource-heavy; Elasticsearch needs significant memory | Use Loki instead (label-based, lightweight) |
| **Full APM solution** | DataDog/NewRelic-style solutions are enterprise-focused | Prometheus + Grafana + Loki covers personal needs |
| **Secrets management (Vault)** | Complex for single user with few secrets | Kubernetes Secrets or Sealed Secrets |
| **Policy enforcement (OPA/Gatekeeper)** | You are the only user; no policy conflicts | Skip entirely |
| **Multi-cluster management** | Single cluster, single app | Skip entirely |
| **Cost optimization/FinOps** | Personal project; cost is fixed/minimal | Skip entirely |
| **AI-assisted observability** | Marketing hype; manual review is fine at this scale | Skip entirely |
---
## Feature Dependencies
```
Automated Tests
      |
      v
Lint/Static Analysis --> Build --> Push Image --> Update Git
                                                      |
                                                      v
                                              ArgoCD Auto-Sync
                                                      |
                                                      v
                                              Health Check Pass
                                                      |
                                                      v
                                             Deployment Complete
                                                      |
                                                      v
                                      Metrics/Logs Available in Grafana
```
Key ordering constraints:
1. Tests before build (fail fast)
2. ArgoCD watches Git, so Git update triggers deploy
3. The observability stack must be running before the app's metrics and logs can be collected, so deploy it before relying on dashboards or alerts
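
The fail-fast ordering can be sketched as a sequential runner that stops at the first failing stage. The stage names follow the diagram; the `Stage` interface and `runPipeline` function are hypothetical, and each stage is reduced to a function returning pass or fail.

```typescript
interface Stage {
  name: string;
  run: () => boolean; // true = stage passed
}

interface PipelineResult {
  completed: string[];
  failedAt: string | null;
}

// Runs stages in order and stops at the first failure, so a failing
// test never leads to a build, a pushed image, or a Git update.
function runPipeline(stages: Stage[]): PipelineResult {
  const completed: string[] = [];
  for (const stage of stages) {
    if (!stage.run()) {
      return { completed, failedAt: stage.name };
    }
    completed.push(stage.name);
  }
  return { completed, failedAt: null };
}

const result = runPipeline([
  { name: "tests", run: () => true },
  { name: "lint", run: () => false }, // simulated lint failure
  { name: "build", run: () => true },
  { name: "push-image", run: () => true },
  { name: "update-git", run: () => true },
]);

console.log(result);
```

Because the Git update is the last stage, ArgoCD only ever sees commits that passed every earlier check.
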
---
## MVP Recommendation for CI/CD and Observability
For production-grade operations on a personal project, prioritize in this order:
### Phase 1: GitOps Foundation
1. Enable ArgoCD auto-sync with self-healing
2. Add basic health checks
*Rationale:* Eliminates manual `helm upgrade`, establishes GitOps workflow
### Phase 2: Basic Observability
1. Prometheus + Grafana (kube-prometheus-stack helm chart)
2. Loki for log aggregation
3. 3-5 critical alerts (pod crashes, high memory, app down)
*Rationale:* Can't operate what you can't see; minimum viable observability
### Phase 3: CI Pipeline Hardening
1. Add unit tests to pipeline
2. Add linting/type checking
3. Smoke test after deploy (optional)
*Rationale:* Tests catch bugs before they reach production
### Defer to Later (if ever)
- Application-level custom metrics
- SLO dashboards
- Advanced alerting
- Request tracing
- Extensive E2E tests
---
## Complexity Budget
For a single-user personal project, the total complexity budget should be LOW-MEDIUM:
| Category | Recommended Complexity | Over-Budget Indicator |
|----------|----------------------|----------------------|
| CI Pipeline | LOW | More than 10 min build time; complex test matrix |
| GitOps | LOW | Multi-environment promotion; complex sync policies |
| Metrics | MEDIUM | Custom exporters; high-cardinality metrics |
| Logging | LOW | Full-text search; complex log parsing |
| Alerting | LOW | More than 10 alerts; complex routing |
| Tracing | SKIP | Any tracing for single-service app |
---
## Essential Alerts for Personal Project
These five alerts are sufficient for a single-user app; the conditions follow common Kubernetes alerting guidance:
| Alert | Condition | Why Critical |
|-------|-----------|--------------|
| **Pod CrashLooping** | restarts > 3 in 15 min | App is failing repeatedly |
| **Pod OOMKilled** | OOM event detected | Memory limits too low or leak |
| **High Memory Usage** | memory > 85% for 5 min | Approaching resource limits |
| **App Unavailable** | probe failures > 3 | Users cannot access app |
| **Disk Running Low** | disk > 80% used | Persistent storage filling up |
**Key principle:** Alerts should be symptom-based and actionable. If an alert fires and you don't need to do anything, remove it.
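
The thresholds in the table can be pinned down as a small evaluation function. In a real setup these would be Prometheus alerting rules; this sketch only encodes the conditions, and the `AppSnapshot` shape and its field names are assumptions.

```typescript
// Hypothetical snapshot of the signals the five alerts look at.
interface AppSnapshot {
  restartsLast15Min: number;
  oomKilled: boolean;
  memoryRatioSamplesLast5Min: number[]; // used/limit, each between 0 and 1
  consecutiveProbeFailures: number;
  diskUsedRatio: number;                // between 0 and 1
}

// Returns the names of all alerts whose condition currently holds.
function firingAlerts(s: AppSnapshot): string[] {
  const alerts: string[] = [];
  if (s.restartsLast15Min > 3) alerts.push("Pod CrashLooping");
  if (s.oomKilled) alerts.push("Pod OOMKilled");
  // "for 5 min" means every sample in the window is above the threshold.
  const memorySustained =
    s.memoryRatioSamplesLast5Min.length > 0 &&
    s.memoryRatioSamplesLast5Min.every((ratio) => ratio > 0.85);
  if (memorySustained) alerts.push("High Memory Usage");
  if (s.consecutiveProbeFailures > 3) alerts.push("App Unavailable");
  if (s.diskUsedRatio > 0.8) alerts.push("Disk Running Low");
  return alerts;
}

console.log(
  firingAlerts({
    restartsLast15Min: 0,
    oomKilled: false,
    memoryRatioSamplesLast5Min: [0.9, 0.88, 0.91],
    consecutiveProbeFailures: 0,
    diskUsedRatio: 0.82,
  }),
);
```

Requiring every sample in the window to exceed the threshold keeps a short memory spike from paging you, in line with the actionable-alerts principle above.
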
---
## Sources
### CI/CD Best Practices
- [TeamCity CI/CD Guide](https://www.jetbrains.com/teamcity/ci-cd-guide/ci-cd-best-practices/)
- [Spacelift CI/CD Best Practices](https://spacelift.io/blog/ci-cd-best-practices)
- [GitLab CI/CD Best Practices](https://about.gitlab.com/blog/how-to-keep-up-with-ci-cd-best-practices/)
- [AWS CI/CD Best Practices](https://docs.aws.amazon.com/prescriptive-guidance/latest/strategy-cicd-litmus/cicd-best-practices.html)
### Observability
- [Kubernetes Observability Trends 2026](https://www.usdsi.org/data-science-insights/kubernetes-observability-and-monitoring-trends-in-2026)
- [Spectro Cloud: Choosing the Right Monitoring Stack](https://www.spectrocloud.com/blog/choosing-the-right-kubernetes-monitoring-stack)
- [ClickHouse: Mastering Kubernetes Observability](https://clickhouse.com/resources/engineering/mastering-kubernetes-observability-guide)
- [Kubernetes Official Observability Docs](https://kubernetes.io/docs/concepts/cluster-administration/observability/)
### ArgoCD/GitOps
- [ArgoCD Auto Sync Documentation](https://argo-cd.readthedocs.io/en/stable/user-guide/auto_sync/)
- [ArgoCD Best Practices](https://argo-cd.readthedocs.io/en/stable/user-guide/best_practices/)
- [mkdev: ArgoCD Self-Heal and Sync Windows](https://mkdev.me/posts/argo-cd-self-heal-sync-windows-and-diffing)
### Alerting
- [Sysdig: Alerting on Kubernetes](https://www.sysdig.com/blog/alerting-kubernetes)
- [Groundcover: Kubernetes Alerting](https://www.groundcover.com/kubernetes-monitoring/kubernetes-alerting)
- [Sematext: 10 Must-Have Kubernetes Alerts](https://sematext.com/blog/top-10-must-have-alerts-for-kubernetes/)
### Logging
- [Plural: Loki vs ELK for Kubernetes](https://www.plural.sh/blog/loki-vs-elk-kubernetes/)
- [Loki vs ELK Comparison](https://alexandre-vazquez.com/loki-vs-elk/)
### Testing Pyramid
- [CircleCI: Testing Pyramid](https://circleci.com/blog/testing-pyramid/)
- [Semaphore: Testing Pyramid](https://semaphore.io/blog/testing-pyramid)
- [AWS: Testing Stages in CI/CD](https://docs.aws.amazon.com/whitepapers/latest/practicing-continuous-integration-continuous-delivery/testing-stages-in-continuous-integration-and-continuous-delivery.html)
### Homelab/Personal Projects
- [Prometheus and Grafana Homelab Setup](https://unixorn.github.io/post/homelab/homelab-setup-prometheus-and-grafana/)
- [Better Stack: Install Prometheus/Grafana with Helm](https://betterstack.com/community/questions/install-prometheus-and-grafana-on-kubernetes-with-helm/)