taskplaner/.planning/phases/08-observability-stack/08-01-PLAN.md
Thomas Richter 8c3dc137ca docs(08): create phase plan
Phase 08: Observability Stack
- 3 plans in 2 waves
- Wave 1: 08-01 (metrics), 08-02 (Alloy) - parallel
- Wave 2: 08-03 (verification) - depends on both
- Ready for execution

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 21:24:24 +01:00


---
phase: 08-observability-stack
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
  - package.json
  - src/routes/metrics/+server.ts
  - src/lib/server/metrics.ts
  - helm/taskplaner/templates/servicemonitor.yaml
  - helm/taskplaner/values.yaml
autonomous: true
must_haves:
  truths:
    - "TaskPlanner /metrics endpoint returns Prometheus-format text"
    - "ServiceMonitor exists in Helm chart templates"
    - "Prometheus can discover TaskPlanner via ServiceMonitor"
  artifacts:
    - path: "src/routes/metrics/+server.ts"
      provides: "Prometheus metrics HTTP endpoint"
      exports: ["GET"]
    - path: "src/lib/server/metrics.ts"
      provides: "prom-client registry and metrics definitions"
      contains: "collectDefaultMetrics"
    - path: "helm/taskplaner/templates/servicemonitor.yaml"
      provides: "ServiceMonitor for Prometheus Operator"
      contains: "kind: ServiceMonitor"
  key_links:
    - from: "src/routes/metrics/+server.ts"
      to: "src/lib/server/metrics.ts"
      via: "import register"
      pattern: "import.*register.*from.*metrics"
    - from: "helm/taskplaner/templates/servicemonitor.yaml"
      to: "tp-app service"
      via: "selector matchLabels"
      pattern: "selector.*matchLabels"
---
Add Prometheus metrics endpoint to TaskPlanner and ServiceMonitor for scraping

Purpose: Enable Prometheus to collect application metrics from TaskPlanner (OBS-08, OBS-01)

Output: /metrics endpoint returning prom-client default metrics, ServiceMonitor in Helm chart

<execution_context> @/home/tho/.claude/get-shit-done/workflows/execute-plan.md @/home/tho/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/08-observability-stack/CONTEXT.md
@package.json
@src/routes/health/+server.ts
@helm/taskplaner/values.yaml
@helm/taskplaner/templates/service.yaml

Task 1: Add prom-client and create /metrics endpoint

Files: package.json, src/lib/server/metrics.ts, src/routes/metrics/+server.ts

1. Install prom-client:
   ```bash
   npm install prom-client
   ```

2. Create src/lib/server/metrics.ts:
   - Import prom-client's Registry, collectDefaultMetrics
   - Create a new Registry instance
   - Call collectDefaultMetrics({ register: registry }) to collect Node.js process metrics
   - Export the registry
   - Keep it minimal - just default metrics (memory, CPU, event loop lag)

3. Create src/routes/metrics/+server.ts:
   - Import the registry from $lib/server/metrics
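   - Create a GET handler that returns the awaited result of registry.metrics() (it returns a Promise in current prom-client versions) with Content-Type: text/plain; version=0.0.4 (also available as registry.contentType)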
   - Handle errors gracefully (return 500 on failure)
   - Pattern follows existing /health endpoint structure
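
The handler described in step 3 could look roughly like the sketch below. It is not the project's actual code: `MetricsSource` and `createMetricsHandler` are hypothetical names introduced here, and the interface only mirrors the two prom-client `Registry` members the handler needs (`metrics()` and `contentType`), so the sketch runs without prom-client installed.

```typescript
// Hypothetical structural stand-in for the prom-client Registry members used here.
interface MetricsSource {
  contentType: string;
  metrics(): Promise<string>;
}

// Builds a GET handler that serves the metrics text, or a 500 if collection fails.
export function createMetricsHandler(source: MetricsSource): () => Promise<Response> {
  return async () => {
    try {
      // metrics() is asynchronous, so the body must be awaited before responding.
      const body = await source.metrics();
      return new Response(body, {
        status: 200,
        headers: { 'Content-Type': source.contentType },
      });
    } catch {
      // Collection failed: answer with a plain-text 500 instead of throwing.
      return new Response('Failed to collect metrics', {
        status: 500,
        headers: { 'Content-Type': 'text/plain' },
      });
    }
  };
}
```

In `src/routes/metrics/+server.ts` this might be wired up as `export const GET = createMetricsHandler(registry)`, with `registry` imported from `$lib/server/metrics`. Passing the registry in, rather than importing it inside the handler, keeps the handler testable with a fake source.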

NOTE: prom-client is the standard Node.js Prometheus client. Use default metrics only - no custom metrics needed for this phase.
Verify:
1. npm run build completes without errors
2. npm run dev, then curl http://localhost:5173/metrics returns text starting with "# HELP" or "# TYPE"
3. Response Content-Type header includes "text/plain"

Done: /metrics endpoint returns Prometheus-format metrics including process_cpu_seconds_total, nodejs_heap_size_total_bytes

Task 2: Add ServiceMonitor to Helm chart

Files: helm/taskplaner/templates/servicemonitor.yaml, helm/taskplaner/values.yaml

1. Create helm/taskplaner/templates/servicemonitor.yaml:
   ```yaml
   {{- if .Values.metrics.enabled }}
   apiVersion: monitoring.coreos.com/v1
   kind: ServiceMonitor
   metadata:
     name: {{ include "taskplaner.fullname" . }}
     labels:
       {{- include "taskplaner.labels" . | nindent 4 }}
   spec:
     selector:
       matchLabels:
         {{- include "taskplaner.selectorLabels" . | nindent 6 }}
     endpoints:
       - port: http
         path: /metrics
         interval: {{ .Values.metrics.interval | default "30s" }}
     namespaceSelector:
       matchNames:
         - {{ .Release.Namespace }}
   {{- end }}
   ```

2. Update helm/taskplaner/values.yaml - add metrics section:
   ```yaml
   # Prometheus metrics
   metrics:
     enabled: true
     interval: 30s
   ```

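3. Ensure the service template exposes a port named "http" (check the existing service.yaml). The ServiceMonitor's `port: http` refers to the Service port's `name` field, not its `targetPort`, so the Service port itself must carry `name: http`.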

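NOTE: The ServiceMonitor uses the monitoring.coreos.com/v1 API, whose CRDs kube-prometheus-stack installs. The namespaceSelector restricts discovery to the namespace the chart is released into (.Release.Namespace), so Prometheus finds TaskPlanner there; this is the default namespace only if the release is installed into it.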
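
As a rough illustration of how the `matchLabels` selector in the template decides whether it picks up the Service, the sketch below models the matching rule: every selector label must be present on the target with the same value, and extra labels on the target are ignored. The function name, the release name `my-release`, and the `app.kubernetes.io/instance` and `app.kubernetes.io/managed-by` labels are assumptions for illustration; only `app.kubernetes.io/name: taskplaner` comes from this plan.

```typescript
type Labels = Record<string, string>;

// A matchLabels selector matches when every selector key exists on the target
// with an identical value. Labels that appear only on the target do not matter.
function matchLabelsSelects(selector: Labels, target: Labels): boolean {
  return Object.entries(selector).every(([key, value]) => target[key] === value);
}

// Hypothetical output of the chart's selectorLabels helper.
const selectorLabels: Labels = {
  'app.kubernetes.io/name': 'taskplaner',
  'app.kubernetes.io/instance': 'my-release', // assumed release name
};

// Hypothetical labels on the rendered Service: selector labels plus extras.
const serviceLabels: Labels = {
  ...selectorLabels,
  'app.kubernetes.io/managed-by': 'Helm',
};

console.log(matchLabelsSelects(selectorLabels, serviceLabels)); // true
```

If the Service lacked one of the selector labels, or carried a different value for it, the function would return `false`, which corresponds to Prometheus never discovering the target.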
Verify:
1. helm template ./helm/taskplaner includes ServiceMonitor resource
2. helm template output shows selector matching app.kubernetes.io/name: taskplaner
3. No helm lint errors

Done: ServiceMonitor template renders correctly with selector matching TaskPlanner service, ready for Prometheus to discover

Verification:
- [ ] npm run build succeeds
- [ ] curl localhost:5173/metrics returns Prometheus-format text
- [ ] helm template ./helm/taskplaner shows ServiceMonitor resource
- [ ] ServiceMonitor selector matches service labels
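
The "returns Prometheus-format text" check could also be automated instead of eyeballing curl output. The sketch below is one possible approach, not part of the plan: it scans the response body for `# TYPE <name> <type>` lines and reports which expected metrics are missing. The function names are hypothetical; the two metric names are the ones listed for Task 1.

```typescript
// Collects the metric names declared by "# TYPE <name> <type>" lines.
function declaredMetricNames(body: string): Set<string> {
  const names = new Set<string>();
  for (const line of body.split('\n')) {
    const name = /^# TYPE (\S+) \S+/.exec(line)?.[1];
    if (name) names.add(name);
  }
  return names;
}

// Returns the required metric names that the body does not declare.
function missingMetrics(body: string, required: string[]): string[] {
  const declared = declaredMetricNames(body);
  return required.filter((name) => !declared.has(name));
}

// Metric names this plan expects from prom-client's default metrics.
const requiredMetrics = ['process_cpu_seconds_total', 'nodejs_heap_size_total_bytes'];
```

With the dev server running, the body could be fetched with `await (await fetch('http://localhost:5173/metrics')).text()` and passed to `missingMetrics(body, requiredMetrics)`; an empty array would mean both metrics are exposed.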

<success_criteria>

  1. /metrics endpoint returns Prometheus-format metrics (process metrics, heap size, event loop)
  2. ServiceMonitor added to Helm chart templates
  3. ServiceMonitor enabled by default in values.yaml
  4. Build and type check pass

</success_criteria>

After completion, create `.planning/phases/08-observability-stack/08-01-SUMMARY.md`