Technology Stack: CI/CD Testing, ArgoCD GitOps, and Observability
Project: TaskPlanner v2.0 Production Operations
Researched: 2026-02-03
Scope: Stack additions for existing k3s-deployed SvelteKit app
Executive Summary
This research covers three areas: (1) adding tests to the existing Gitea Actions pipeline, (2) ArgoCD for GitOps deployment automation, and (3) Prometheus/Grafana/Loki observability. The existing setup already has ArgoCD configured; research focuses on validating that configuration and adding the observability stack.
Key finding: Promtail is EOL on 2026-03-02. Use Grafana Alloy instead for log collection.
1. CI/CD Testing Stack
Recommended Stack
| Component | Version | Purpose | Rationale |
|---|---|---|---|
| Playwright | ^1.58.1 (existing) | E2E testing | Already configured, comprehensive browser automation |
| Vitest | ^3.0.0 | Unit/component tests | Official Svelte recommendation for Vite-based projects |
| @testing-library/svelte | ^5.0.0 | Component testing utilities | Streamlined component assertions |
| mcr.microsoft.com/playwright | v1.58.1 | CI browser execution | Pre-installed browsers, eliminates install step |
Why This Stack
Playwright (keep existing): Already configured with playwright.config.ts and tests/docker-deployment.spec.ts. The existing tests cover critical paths: health endpoint, CSRF-protected form submissions, and data persistence. Extend rather than replace.
Vitest (add): Svelte officially recommends Vitest for unit and component testing when using Vite (which SvelteKit uses). Vitest shares Vite's config, eliminating configuration overhead, and its Jest-compatible API means existing Jest muscle memory transfers directly.
NOT recommended:
- Jest: Requires separate configuration, slower than Vitest, no Vite integration
- Cypress: Overlaps with Playwright; adding both creates maintenance burden
- @vitest/browser with Playwright: Adds complexity; save for later if jsdom proves insufficient
Gitea Actions Workflow Updates
The existing workflow at .gitea/workflows/build.yaml needs a test stage. Gitea Actions uses GitHub Actions syntax.
Recommended workflow structure:
name: Build and Push
on:
push:
branches: [master, main]
pull_request:
branches: [master, main]
env:
REGISTRY: git.kube2.tricnet.de
IMAGE_NAME: tho/taskplaner
jobs:
test:
runs-on: ubuntu-latest
container:
image: mcr.microsoft.com/playwright:v1.58.1-noble
steps:
- uses: actions/checkout@v4
- name: Install dependencies
run: npm ci
- name: Run type check
run: npm run check
- name: Run unit tests
run: npm run test:unit
- name: Run E2E tests
run: npm run test:e2e
env:
CI: true
build:
needs: test
runs-on: ubuntu-latest
if: github.event_name != 'pull_request'
steps:
# ... existing build steps ...
Key decisions:
- Use Playwright Docker image to avoid browser installation (saves 2-3 minutes)
- Run tests before build to fail fast
- Only build/push on push to master, not PRs
- Type checking (svelte-check) catches errors before runtime
Package.json Scripts to Add
{
"scripts": {
"test": "npm run test:unit && npm run test:e2e",
"test:unit": "vitest run",
"test:unit:watch": "vitest",
"test:e2e": "playwright test",
"test:e2e:docker": "BASE_URL=http://localhost:3000 playwright test tests/docker-deployment.spec.ts"
}
}
Installation
# Add Vitest and testing utilities
npm install -D vitest @testing-library/svelte jsdom
Vitest Configuration
Create vitest.config.ts:
import { defineConfig } from 'vitest/config';
import { sveltekit } from '@sveltejs/kit/vite';
export default defineConfig({
plugins: [sveltekit()],
test: {
include: ['src/**/*.{test,spec}.{js,ts}'],
environment: 'jsdom',
globals: true,
setupFiles: ['./src/test-setup.ts']
}
});
Confidence: HIGH
Sources:
- Svelte Testing Documentation - Official recommendation for Vitest
- Playwright CI Setup - Docker image and CI best practices
- Existing playwright.config.ts in project
2. ArgoCD GitOps Stack
Current State
ArgoCD is already configured in argocd/application.yaml. The configuration is correct and follows best practices:
syncPolicy:
automated:
prune: true # Removes resources deleted from Git
selfHeal: true # Reverts manual changes
Recommended Stack
| Component | Version | Purpose | Rationale |
|---|---|---|---|
| ArgoCD Helm Chart | 9.4.0 | GitOps controller | Latest stable, deploys ArgoCD v3.3.0 |
What's Already Done (No Changes Needed)
- Application manifest: argocd/application.yaml correctly points to helm/taskplaner
- Auto-sync enabled: automated.prune and selfHeal are configured
- Git-based image tags: Pipeline updates values.yaml with new image tag
- Namespace creation: CreateNamespace=true is set
What May Need Verification
- ArgoCD installation: Verify ArgoCD is actually deployed on the k3s cluster
- Repository credentials: If the Gitea repo is private, ArgoCD needs credentials
- Registry secret: The gitea-registry-secret placeholder needs real credentials
Installation (if ArgoCD not yet installed)
# Add ArgoCD Helm repository
helm repo add argo https://argoproj.github.io/argo-helm
helm repo update
# Install ArgoCD (minimal for single-node k3s)
helm install argocd argo/argo-cd \
--namespace argocd \
--create-namespace \
--set server.service.type=ClusterIP \
--set configs.params.server\.insecure=true # If behind Traefik TLS termination
Apply Application
kubectl apply -f argocd/application.yaml
NOT Recommended
- ArgoCD Image Updater: Overkill for single-app deployment; the current approach of updating values.yaml in Git is simpler and provides a better audit trail
- ApplicationSets: Unnecessary for single environment
- App of Apps pattern: Unnecessary complexity for one application
Confidence: HIGH
Sources:
- ArgoCD Helm Chart on Artifact Hub - Version 9.4.0 confirmed
- ArgoCD Helm GitHub Releases - Release notes
- Existing argocd/application.yaml in project
3. Observability Stack
Recommended Stack
| Component | Chart | Version | Purpose |
|---|---|---|---|
| kube-prometheus-stack | prometheus-community/kube-prometheus-stack | 81.4.2 | Prometheus + Grafana + Alertmanager |
| Loki | grafana/loki | 6.51.0 | Log aggregation (monolithic mode) |
| Grafana Alloy | grafana/alloy | 1.5.3 | Log collection agent |
Why This Stack
kube-prometheus-stack (not standalone Prometheus): Single chart deploys Prometheus, Grafana, Alertmanager, node-exporter, and kube-state-metrics. Pre-configured with Kubernetes dashboards. This is the standard approach.
Loki (not ELK/Elasticsearch): "Like Prometheus, but for logs." Integrates natively with Grafana. Much lower resource footprint than Elasticsearch. Uses same label-based querying as Prometheus.
Grafana Alloy (not Promtail): CRITICAL - Promtail reaches End-of-Life on 2026-03-02 (next month). Grafana Alloy is the official replacement. It's based on OpenTelemetry Collector and supports logs, metrics, and traces in one agent.
NOT Recommended
- Promtail: EOL 2026-03-02. Do not install; use Alloy
- loki-stack Helm chart: Deprecated, no longer maintained
- Elasticsearch/ELK: Resource-heavy, complex, overkill for single-user app
- Loki microservices mode: Requires 3+ nodes, object storage; overkill for personal app
- Separate Prometheus + Grafana charts: kube-prometheus-stack bundles them correctly
Architecture
+------------------+
| Grafana |
| (Dashboards/UI) |
+--------+---------+
|
+--------------------+--------------------+
| |
+--------v---------+ +----------v---------+
| Prometheus | | Loki |
| (Metrics) | | (Logs) |
+--------+---------+ +----------+---------+
| |
+--------------+---------------+ |
| | | |
+-----v-----+ +-----v-----+ +------v------+ +--------v---------+
| node- | | kube- | | TaskPlanner | | Grafana Alloy |
| exporter | | state- | | /metrics | | (Log Shipper) |
| | | metrics | | | | |
+-----------+ +-----------+ +-------------+ +------------------+
Installation
# Add Helm repositories
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
# Create monitoring namespace
kubectl create namespace monitoring
# Install kube-prometheus-stack
helm install prometheus prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--values prometheus-values.yaml
# Install Loki (monolithic mode for single-node)
helm install loki grafana/loki \
--namespace monitoring \
--values loki-values.yaml
# Install Alloy for log collection
helm install alloy grafana/alloy \
--namespace monitoring \
--values alloy-values.yaml
Recommended Values Files
prometheus-values.yaml (minimal for k3s single-node)
# Reduce resource usage for single-node k3s
prometheus:
prometheusSpec:
retention: 15d
resources:
requests:
cpu: 200m
memory: 512Mi
limits:
cpu: 1000m
memory: 2Gi
storageSpec:
volumeClaimTemplate:
spec:
storageClassName: longhorn # Use existing Longhorn
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 20Gi
alertmanager:
alertmanagerSpec:
resources:
requests:
cpu: 50m
memory: 64Mi
limits:
cpu: 200m
memory: 256Mi
storage:
volumeClaimTemplate:
spec:
storageClassName: longhorn
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 5Gi
grafana:
persistence:
enabled: true
storageClassName: longhorn
size: 5Gi
# Grafana will be exposed via Traefik
ingress:
enabled: true
ingressClassName: traefik
annotations:
cert-manager.io/cluster-issuer: letsencrypt-prod
hosts:
- grafana.kube2.tricnet.de
tls:
- secretName: grafana-tls
hosts:
- grafana.kube2.tricnet.de
# Disable components not needed for single-node
kubeControllerManager:
enabled: false # k3s bundles this differently
kubeScheduler:
enabled: false # k3s bundles this differently
kubeProxy:
enabled: false # k3s uses different proxy
loki-values.yaml (monolithic mode)
deploymentMode: SingleBinary
loki:
auth_enabled: false
commonConfig:
replication_factor: 1
storage:
type: filesystem
schemaConfig:
configs:
- from: "2024-01-01"
store: tsdb
object_store: filesystem
schema: v13
index:
prefix: loki_index_
period: 24h
singleBinary:
replicas: 1
resources:
requests:
cpu: 100m
memory: 256Mi
limits:
cpu: 500m
memory: 1Gi
persistence:
enabled: true
storageClass: longhorn
size: 10Gi
# Disable components not needed for monolithic
backend:
replicas: 0
read:
replicas: 0
write:
replicas: 0
# Gateway not needed for internal access
gateway:
enabled: false
alloy-values.yaml
alloy:
configMap:
content: |-
// Discover and collect logs from all pods
discovery.kubernetes "pods" {
role = "pod"
}
discovery.relabel "pods" {
targets = discovery.kubernetes.pods.targets
rule {
source_labels = ["__meta_kubernetes_namespace"]
target_label = "namespace"
}
rule {
source_labels = ["__meta_kubernetes_pod_name"]
target_label = "pod"
}
rule {
source_labels = ["__meta_kubernetes_pod_container_name"]
target_label = "container"
}
}
loki.source.kubernetes "pods" {
targets = discovery.relabel.pods.output
forward_to = [loki.write.local.receiver]
}
loki.write "local" {
endpoint {
url = "http://loki.monitoring.svc:3100/loki/api/v1/push"
}
}
controller:
type: daemonset
resources:
requests:
cpu: 50m
memory: 64Mi
limits:
cpu: 200m
memory: 256Mi
TaskPlanner Metrics Endpoint
The app needs a /metrics endpoint for Prometheus to scrape. SvelteKit options:
- prom-client library (recommended): Standard Prometheus client for Node.js
- Custom endpoint: Simple counter/gauge implementation
Install the client library:
npm install prom-client
Add ServiceMonitor for Prometheus to scrape TaskPlanner:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: taskplaner
namespace: monitoring
labels:
release: prometheus # Must match Prometheus selector
spec:
selector:
matchLabels:
app.kubernetes.io/name: taskplaner
namespaceSelector:
matchNames:
- default
endpoints:
- port: http
path: /metrics
interval: 30s
Resource Summary
Total additional resource requirements for observability:
| Component | CPU Request | Memory Request | Storage |
|---|---|---|---|
| Prometheus | 200m | 512Mi | 20Gi |
| Alertmanager | 50m | 64Mi | 5Gi |
| Grafana | 100m | 128Mi | 5Gi |
| Loki | 100m | 256Mi | 10Gi |
| Alloy (per node) | 50m | 64Mi | - |
| Total | ~500m | ~1Gi | 40Gi |
This fits comfortably on a single k3s node with 4+ cores and 8GB+ RAM.
Confidence: HIGH
Sources:
- kube-prometheus-stack on Artifact Hub - Version 81.4.2
- Grafana Loki Helm Installation - Monolithic mode guidance
- Grafana Alloy Kubernetes Deployment - Alloy setup
- Promtail Deprecation Notice - EOL 2026-03-02
- Migrate from Promtail to Alloy - Migration guide
Summary: What to Install
Immediate Actions
| Category | Add | Version | Notes |
|---|---|---|---|
| Testing | vitest | ^3.0.0 | Unit tests |
| Testing | @testing-library/svelte | ^5.0.0 | Component testing |
| Metrics | prom-client | ^15.0.0 | Prometheus metrics from app |
Helm Charts to Deploy
| Chart | Repository | Version | Namespace |
|---|---|---|---|
| kube-prometheus-stack | prometheus-community | 81.4.2 | monitoring |
| loki | grafana | 6.51.0 | monitoring |
| alloy | grafana | 1.5.3 | monitoring |
Already Configured (Verify, Don't Re-install)
| Component | Status | Action |
|---|---|---|
| ArgoCD Application | Configured in argocd/application.yaml | Verify ArgoCD is running |
| Playwright | Configured in playwright.config.ts | Keep, extend tests |
Do NOT Install
| Component | Reason |
|---|---|
| Promtail | EOL 2026-03-02, use Alloy instead |
| loki-stack chart | Deprecated, unmaintained |
| Elasticsearch/ELK | Overkill, resource-heavy |
| Jest | Vitest is better for Vite projects |
| ArgoCD Image Updater | Current Git-based approach is simpler |
Helm Repository Commands
# Add all needed repositories
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add grafana https://grafana.github.io/helm-charts
helm repo add argo https://argoproj.github.io/argo-helm
helm repo update
# Verify
helm search repo prometheus-community/kube-prometheus-stack
helm search repo grafana/loki
helm search repo grafana/alloy
helm search repo argo/argo-cd
Sources
Official Documentation
- Svelte Testing
- Playwright CI Setup
- ArgoCD Helm Chart
- kube-prometheus-stack
- Grafana Loki Helm
- Grafana Alloy
Critical Updates
- Promtail EOL Notice - EOL 2026-03-02
- Promtail to Alloy Migration