---
phase: 08-observability-stack
plan: 02
type: execute
wave: 1
depends_on: []
files_modified:
  - helm/alloy/values.yaml (new)
  - helm/alloy/Chart.yaml (new)
autonomous: true
must_haves:
  truths:
    - "Alloy DaemonSet runs on all nodes"
    - "Alloy forwards logs to Loki"
    - "Promtail DaemonSet is removed"
  artifacts:
    - path: "helm/alloy/Chart.yaml"
      provides: "Alloy Helm chart wrapper"
      contains: "name: alloy"
    - path: "helm/alloy/values.yaml"
      provides: "Alloy configuration for Loki forwarding"
      contains: "loki.write"
  key_links:
    - from: "Alloy pods"
      to: "loki-stack:3100"
      via: "loki.write endpoint"
      pattern: "endpoint.*loki"
---

# Migrate from Promtail to Grafana Alloy for log collection

**Purpose:** Replace EOL Promtail (EOL March 2026) with a Grafana Alloy DaemonSet (OBS-04)
**Output:** Alloy DaemonSet forwarding logs to Loki; Promtail removed

@/home/tho/.claude/get-shit-done/workflows/execute-plan.md
@/home/tho/.claude/get-shit-done/templates/summary.md
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/08-observability-stack/CONTEXT.md

## Task 1: Deploy Grafana Alloy via Helm

Files: `helm/alloy/Chart.yaml`, `helm/alloy/values.yaml`

1. Create the `helm/alloy` directory and `Chart.yaml` as an umbrella chart:

```yaml
apiVersion: v2
name: alloy
description: Grafana Alloy log collector
version: 0.1.0
dependencies:
  - name: alloy
    version: "0.12.*"
    repository: https://grafana.github.io/helm-charts
```
2. Create `helm/alloy/values.yaml` with a minimal config for Loki forwarding:

```yaml
alloy:
  alloy:
    configMap:
      content: |
        // Discover pods and collect logs
        discovery.kubernetes "pods" {
          role = "pod"
        }

        // Relabel to extract pod metadata
        discovery.relabel "pods" {
          targets = discovery.kubernetes.pods.targets
          rule {
            source_labels = ["__meta_kubernetes_namespace"]
            target_label  = "namespace"
          }
          rule {
            source_labels = ["__meta_kubernetes_pod_name"]
            target_label  = "pod"
          }
          rule {
            source_labels = ["__meta_kubernetes_pod_container_name"]
            target_label  = "container"
          }
        }

        // Collect logs from discovered pods
        loki.source.kubernetes "pods" {
          targets    = discovery.relabel.pods.output
          forward_to = [loki.write.default.receiver]
        }

        // Forward to Loki
        loki.write "default" {
          endpoint {
            url = "http://loki-stack.monitoring.svc.cluster.local:3100/loki/api/v1/push"
          }
        }
  controller:
    type: daemonset
  serviceAccount:
    create: true
```

3. Add the Grafana Helm repo and build the chart dependencies:

```bash
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
cd helm/alloy && helm dependency build
```

4. Deploy Alloy to the monitoring namespace:

```bash
helm upgrade --install alloy ./helm/alloy -n monitoring --create-namespace
```

5. Verify the Alloy pods are running:

```bash
kubectl get pods -n monitoring -l app.kubernetes.io/name=alloy
```

Expected: 5 pods (one per node) in Running state.

NOTE:
- Alloy is configured in the River language (not YAML), embedded as a string under `configMap.content`
- Labels (namespace, pod, container) match the existing Promtail labels, so existing LogQL queries keep working
- The Loki endpoint is cluster-internal: `loki-stack.monitoring.svc.cluster.local:3100`

Verification:
1. `kubectl get pods -n monitoring -l app.kubernetes.io/name=alloy` shows 5 Running pods
2. `kubectl logs -n monitoring -l app.kubernetes.io/name=alloy --tail=20` shows no errors
3. Alloy logs show the `loki.write` component started successfully

Done: Alloy DaemonSet deployed with 5 pods collecting logs and forwarding to Loki.

## Task 2: Verify log flow and remove Promtail

(no files - kubectl operations)
1. Generate a test log by restarting the TaskPlanner pod (the deployment is named `taskplaner`):

```bash
kubectl rollout restart deployment taskplaner
```

2. Wait for the pod to be ready:

```bash
kubectl rollout status deployment taskplaner --timeout=60s
```

3. Verify the logs appear in Loki via LogCLI or curl:

```bash
# Query recent TaskPlanner logs via the Loki API
kubectl run --rm -it logtest --image=curlimages/curl --restart=Never -- \
  curl -s "http://loki-stack.monitoring.svc.cluster.local:3100/loki/api/v1/query_range" \
  --data-urlencode 'query={namespace="default",pod=~"taskplaner.*"}' \
  --data-urlencode 'limit=5'
```

Expected: JSON response with "result" containing log entries.

4. Once logs are confirmed flowing via Alloy, remove Promtail:

```bash
# Find the Promtail release, if any
helm list -n monitoring | grep promtail

# If found, uninstall it (the release name varies by install method):
helm uninstall loki-stack-promtail -n monitoring 2>/dev/null || \
  helm uninstall promtail -n monitoring 2>/dev/null || \
  kubectl delete daemonset -n monitoring -l app=promtail
```

5. Verify Promtail is gone:

```bash
kubectl get pods -n monitoring | grep -i promtail
```

Expected: no promtail pods.

6. Verify logs are still flowing after Promtail removal (repeat step 3).

NOTE: Promtail may be installed as part of loki-stack or as a separate release. Check both.

Verification:
1. The Loki API returns TaskPlanner log entries
2. `kubectl get pods -n monitoring` shows NO promtail pods
3. `kubectl get pods -n monitoring` shows the Alloy pods still running
4. A second Loki query after Promtail removal still returns logs

Done: logs confirmed flowing from Alloy to Loki; Promtail DaemonSet removed from the cluster.

Checklist:
- [ ] Alloy DaemonSet has 5 Running pods (one per node)
- [ ] Alloy pods show no errors in logs
- [ ] Loki API returns TaskPlanner log entries
- [ ] Promtail pods no longer exist
- [ ] Log flow continues after Promtail removal

Success criteria:
1. Alloy DaemonSet running on all 5 nodes
2. Logs from TaskPlanner appear in Loki within 60 seconds of generation
3. Promtail DaemonSet completely removed
4. No log collection gap (Alloy verified before Promtail removal)

After completion, create `.planning/phases/08-observability-stack/08-02-SUMMARY.md`
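Task 1's "one pod per node" expectation (step 5) can be checked mechanically rather than by eye. A minimal sketch: the `check_coverage` helper is hypothetical, and the commented kubectl lines assume the `app.kubernetes.io/name=alloy` label set by the chart, as used in the steps above:

```shell
# Hypothetical helper: compare node count against Running Alloy pod count.
check_coverage() {
  # $1 = node count, $2 = Running Alloy pod count
  if [ "$1" -gt 0 ] && [ "$1" -eq "$2" ]; then
    echo "ok: alloy covers all $1 nodes"
  else
    echo "gap: $2 of $1 nodes covered"
  fi
}

# On the cluster the counts would come from kubectl, e.g.:
#   nodes=$(kubectl get nodes --no-headers | wc -l)
#   pods=$(kubectl get pods -n monitoring -l app.kubernetes.io/name=alloy \
#           --field-selector=status.phase=Running --no-headers | wc -l)
#   check_coverage "$nodes" "$pods"
check_coverage 5 5   # → ok: alloy covers all 5 nodes
check_coverage 5 4   # → gap: 4 of 5 nodes covered
```

This turns the "Expected: 5 pods" eyeball check into something a follow-up automation step could gate on.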
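Similarly, Task 2 step 3's "JSON response with result entries" can be asserted in the shell instead of read manually. A sketch over an illustrative, hand-written `query_range` payload (not captured from the cluster); in practice `response` would hold the curl output from step 3:

```shell
# Illustrative sample of a successful Loki query_range response (shape only).
response='{"status":"success","data":{"resultType":"streams","result":[{"stream":{"namespace":"default","pod":"taskplaner-7d9f"},"values":[["1700000000000000000","app started"]]}]}}'

# A query can succeed yet match nothing ("result":[]), so check both the
# status field and that at least one stream object came back.
if printf '%s' "$response" | grep -q '"status":"success"' &&
   printf '%s' "$response" | grep -q '"result":\[{'; then
  echo "logs flowing"
else
  echo "no matching logs"
fi
```

The empty-result case matters here: after Promtail removal (step 6), a `"status":"success"` response with an empty `result` array would mean collection silently stopped, which is exactly the gap the success criteria rule out.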