---

*Architecture research for: Personal task/notes web application*
*Researched: 2026-01-29*

---
# v2.0 Architecture: CI/CD and Observability Integration

**Domain:** GitOps CI/CD and Observability Stack
**Researched:** 2026-02-03
**Confidence:** HIGH (verified with official documentation)

## Executive Summary

This section details how ArgoCD, Prometheus, Grafana, and Loki integrate with the existing k3s/Gitea/Traefik architecture. The integration follows established patterns for self-hosted Kubernetes observability stacks, with specific considerations for k3s's lightweight nature and Traefik as the ingress controller.

Key insight: the existing CI/CD foundation (Gitea Actions + ArgoCD Application) is already in place. This milestone adds observability and operational automation rather than building from scratch.
## Current Architecture Overview

```
                  Internet
                     |
                 [Traefik]
                 (Ingress)
                     |
  +------------------+------------------+
  |                  |                  |
task.kube2       git.kube2          (future)
.tricnet.de      .tricnet.de        argocd/grafana
  |                  |
[TaskPlaner]     [Gitea]
(default ns)     + Actions
  |                Runner
  |                  |
[Longhorn PVC]       |
(data store)         |
                     v
           [Container Registry]
           git.kube2.tricnet.de
```
### Existing Components

| Component | Namespace | Purpose | Status |
|-----------|-----------|---------|--------|
| k3s | - | Kubernetes distribution | Running |
| Traefik | kube-system | Ingress controller | Running |
| Longhorn | longhorn-system | Persistent storage | Running |
| cert-manager | cert-manager | TLS certificates | Running |
| Gitea | gitea (assumed) | Git hosting + CI | Running |
| TaskPlaner | default | Application | Running |
| ArgoCD Application | argocd | GitOps deployment | Defined (may need install) |

### Existing CI/CD Pipeline

From `.gitea/workflows/build.yaml`:

1. Push to master triggers Gitea Actions
2. Build Docker image with BuildX
3. Push to Gitea Container Registry
4. Update Helm values.yaml with new image tag
5. Commit with `[skip ci]`
6. ArgoCD detects change and syncs

**Current gap:** ArgoCD may not be installed yet (the Application manifest exists but needs an ArgoCD server).
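For orientation, the six pipeline steps map onto a Gitea Actions workflow of roughly this shape. This is a hedged sketch, not the repository's actual `build.yaml`: the action versions, secret names, chart path, and `<org>` placeholder are all assumptions.

```yaml
name: build
on:
  push:
    branches: [master]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: git.kube2.tricnet.de
          username: ${{ secrets.REGISTRY_USER }}
          password: ${{ secrets.REGISTRY_TOKEN }}
      - uses: docker/build-push-action@v6
        with:
          push: true
          tags: git.kube2.tricnet.de/<org>/taskplaner:${{ github.sha }}
      # Steps 4-5: bump the Helm image tag and commit with [skip ci]
      - name: Update values.yaml
        run: |
          sed -i "s|^\(\s*tag:\).*|\1 ${{ github.sha }}|" chart/values.yaml
          git config user.name ci && git config user.email ci@local
          git commit -am "chore: deploy ${{ github.sha }} [skip ci]"
          git push
```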
## Integration Architecture

### Target State

```
                        Internet
                           |
                       [Traefik]
                       (Ingress)
                           |
  +----------+----------+----------+----------+----------+
  |          |          |          |          |          |
task.*     git.*     argocd.*  grafana.*  (internal)
  |          |          |          |          |
[TaskPlaner] [Gitea]  [ArgoCD]  [Grafana]  [Prometheus]
  |          |          |          |        [Loki]
  |          |          |          |        [Alloy]
  |          +---webhook---> |     |
  |                                |
  +------ metrics ------+----------+--------->+
  +------ logs ---------+---------[Alloy]---->+ (to Loki)
```
### Namespace Strategy

| Namespace | Components | Rationale |
|-----------|------------|-----------|
| `argocd` | ArgoCD server, repo-server, application-controller | Standard convention; ClusterRoleBinding expects this |
| `monitoring` | Prometheus, Grafana, Alertmanager | Consolidate observability; kube-prometheus-stack default |
| `loki` | Loki, Alloy (DaemonSet) | Separate from metrics for resource isolation |
| `default` | TaskPlaner | Existing app deployment |
| `gitea` | Gitea + Actions Runner | Assumed existing |

**Alternative considered:** All observability in a single namespace.
**Decision:** Separate `monitoring` and `loki` because:
- Different scaling characteristics (Alloy is a DaemonSet, Prometheus a StatefulSet)
- Easier resource quota management
- Standard community practice
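The quota point can be made concrete: a dedicated `loki` namespace lets log storage and compute be capped independently of the metrics stack. A minimal sketch (the numbers are illustrative, not sized recommendations):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: loki-quota
  namespace: loki
spec:
  hard:
    requests.cpu: "1"
    requests.memory: 1Gi
    requests.storage: 20Gi  # caps the sum of all PVC requests in the namespace
```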
## Component Integration Details

### 1. ArgoCD Integration

**Installation Method:** Helm chart from `argo/argo-cd`

**Integration Points:**

| Integration | How | Configuration |
|-------------|-----|---------------|
| Gitea Repository | HTTPS clone | Repository credential in argocd-secret |
| Gitea Webhook | POST to `/api/webhook` | Reduces sync delay from 3 min to seconds |
| Traefik Ingress | IngressRoute or Ingress | `server.insecure=true` to avoid redirect loops |
| TLS | cert-manager annotation | Let's Encrypt via existing cluster-issuer |

**Critical Configuration:**

```yaml
# Helm values for ArgoCD with Traefik
configs:
  params:
    server.insecure: true  # Required: Traefik handles TLS

server:
  ingress:
    enabled: true
    ingressClassName: traefik
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-prod
    hosts:
      - argocd.kube2.tricnet.de
    tls:
      - secretName: argocd-tls
        hosts:
          - argocd.kube2.tricnet.de
```
**Webhook Setup for Gitea:**

1. In the ArgoCD secret, set `webhook.gogs.secret` (Gitea uses Gogs-compatible webhooks)
2. In the Gitea repository settings, add a webhook:
   - URL: `https://argocd.kube2.tricnet.de/api/webhook`
   - Content type: `application/json`
   - Secret: Same as configured in ArgoCD

**Known Limitation:** Webhooks work for Applications but not for ApplicationSets with Gitea.
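Step 1 can be scripted rather than edited by hand. A sketch using a merge patch so the other keys in `argocd-secret` survive (the secret value is a placeholder and must match what is entered in Gitea):

```shell
kubectl -n argocd patch secret argocd-secret \
  --type merge \
  -p '{"stringData": {"webhook.gogs.secret": "<shared-webhook-secret>"}}'
```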
### 2. Prometheus/Grafana Integration (kube-prometheus-stack)

**Installation Method:** Helm chart `prometheus-community/kube-prometheus-stack`

**Integration Points:**

| Integration | How | Configuration |
|-------------|-----|---------------|
| k3s metrics | Exposed kube-* endpoints | k3s config modification required |
| Traefik metrics | ServiceMonitor | Traefik exposes `:9100/metrics` |
| TaskPlaner metrics | ServiceMonitor (future) | App must expose `/metrics` endpoint |
| Grafana UI | Traefik Ingress | Standard Kubernetes Ingress |

**Critical k3s Configuration:**
k3s binds the controller-manager, scheduler, and kube-proxy metrics endpoints to localhost by default. For Prometheus to scrape them, expose them on 0.0.0.0.

Create or modify `/etc/rancher/k3s/config.yaml`:

```yaml
kube-controller-manager-arg:
  - "bind-address=0.0.0.0"
kube-proxy-arg:
  - "metrics-bind-address=0.0.0.0"
kube-scheduler-arg:
  - "bind-address=0.0.0.0"
```

Then restart k3s: `sudo systemctl restart k3s`
**k3s-specific Helm values:**

```yaml
# Disable etcd monitoring (k3s uses sqlite, not etcd)
defaultRules:
  rules:
    etcd: false

kubeEtcd:
  enabled: false

# Fix endpoint discovery for k3s
kubeControllerManager:
  enabled: true
  endpoints:
    - <k3s-server-ip>
  service:
    enabled: true
    port: 10257
    targetPort: 10257

kubeScheduler:
  enabled: true
  endpoints:
    - <k3s-server-ip>
  service:
    enabled: true
    port: 10259
    targetPort: 10259

kubeProxy:
  enabled: true
  endpoints:
    - <k3s-server-ip>
  service:
    enabled: true
    port: 10249
    targetPort: 10249

# Grafana ingress
grafana:
  ingress:
    enabled: true
    ingressClassName: traefik
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-prod
    hosts:
      - grafana.kube2.tricnet.de
    tls:
      - secretName: grafana-tls
        hosts:
          - grafana.kube2.tricnet.de
```
**ServiceMonitor for TaskPlaner (future):**

Once TaskPlaner exposes `/metrics`:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: taskplaner
  namespace: monitoring
  labels:
    release: prometheus  # Must match the kube-prometheus-stack release name
spec:
  namespaceSelector:
    matchNames:
      - default
  selector:
    matchLabels:
      app.kubernetes.io/name: taskplaner
  endpoints:
    - port: http
      path: /metrics
      interval: 30s
```
### 3. Loki + Alloy Integration (Log Aggregation)

**Important:** Promtail is deprecated (LTS until February 2026, EOL March 2026). Use **Grafana Alloy** instead.

**Installation Method:**
- Loki: Helm chart `grafana/loki` (monolithic mode for a single node)
- Alloy: Helm chart `grafana/alloy`

**Integration Points:**

| Integration | How | Configuration |
|-------------|-----|---------------|
| Pod logs | Alloy DaemonSet | Mounts `/var/log/pods` |
| Loki storage | Longhorn PVC or MinIO | Single-binary uses filesystem |
| Grafana datasource | Auto-configured | kube-prometheus-stack integration |
| k3s node logs | Alloy journal reader | journalctl access |

**Deployment Mode Decision:**

| Mode | When to Use | Our Choice |
|------|-------------|------------|
| Monolithic (single-binary) | Small deployments, <100GB/day | **Yes - single node k3s** |
| Simple Scalable | Medium deployments | No |
| Microservices | Large scale, HA required | No |
**Loki Helm values (monolithic):**

```yaml
deploymentMode: SingleBinary

singleBinary:
  replicas: 1
  persistence:
    enabled: true
    storageClass: longhorn
    size: 10Gi

# Disable components not needed in monolithic mode
read:
  replicas: 0
write:
  replicas: 0
backend:
  replicas: 0

# Use filesystem storage (not S3/MinIO, for simplicity)
loki:
  storage:
    type: filesystem
  schemaConfig:
    configs:
      - from: "2024-01-01"
        store: tsdb
        object_store: filesystem
        schema: v13
        index:
          prefix: index_
          period: 24h
```
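Note that these values set no retention, so Loki keeps chunks indefinitely and will eventually exhaust the 10Gi PVC. A hedged sketch of the additional keys for the same chart (the 7-day period is an assumption to tune):

```yaml
loki:
  limits_config:
    retention_period: 168h  # keep 7 days of logs
  compactor:
    retention_enabled: true
    delete_request_store: filesystem  # required once retention is enabled
```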
**Alloy DaemonSet Configuration:**

```yaml
# alloy-values.yaml
alloy:
  configMap:
    create: true
    content: |
      // Kubernetes discovery
      discovery.kubernetes "pods" {
        role = "pod"
      }

      // Kubernetes logs collection
      loki.source.kubernetes "pods" {
        targets    = discovery.kubernetes.pods.targets
        forward_to = [loki.write.default.receiver]
      }

      // Send to Loki
      loki.write "default" {
        endpoint {
          url = "http://loki.loki.svc.cluster.local:3100/loki/api/v1/push"
        }
      }
```
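The integration table above also lists k3s node logs via the journal reader. That would be one more component in the same Alloy configuration, sketched here on the assumption that the host's `/var/log/journal` is mounted into the Alloy pods:

```
// Node-level systemd logs (including the k3s service itself)
loki.source.journal "node" {
  path       = "/var/log/journal"
  labels     = { job = "systemd-journal" }
  forward_to = [loki.write.default.receiver]
}
```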
### 4. Traefik Metrics Integration

Traefik already exposes Prometheus metrics. Enable scraping:

**Option A: ServiceMonitor (if using kube-prometheus-stack)**

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: traefik
  namespace: monitoring
  labels:
    release: prometheus
spec:
  namespaceSelector:
    matchNames:
      - kube-system
  selector:
    matchLabels:
      app.kubernetes.io/name: traefik
  endpoints:
    - port: metrics
      path: /metrics
      interval: 30s
```

**Option B: Verify Traefik metrics are enabled**

Check that the Traefik deployment args include:

```
--entrypoints.metrics.address=:9100
--metrics.prometheus=true
--metrics.prometheus.entryPoint=metrics
```
## Data Flow Diagrams

### Metrics Flow

```
+------------------+   +------------------+   +------------------+
| TaskPlaner       |   | Traefik          |   | k3s core         |
| /metrics         |   | :9100/metrics    |   | :10249,10257...  |
+--------+---------+   +--------+---------+   +--------+---------+
         |                      |                      |
         +----------------------+----------------------+
                                |
                                v
                     +-------------------+
                     | Prometheus        |
                     | (ServiceMonitors) |
                     +--------+----------+
                              |
                              v
                     +-------------------+
                     | Grafana           |
                     | (Dashboards)      |
                     +-------------------+
```

### Log Flow

```
+------------------+   +------------------+   +------------------+
| TaskPlaner       |   | Traefik          |   | Other Pods       |
| stdout/stderr    |   | access logs      |   | stdout/stderr    |
+--------+---------+   +--------+---------+   +--------+---------+
         |                      |                      |
         +----------------------+----------------------+
                                |
                         /var/log/pods
                                |
                                v
                     +-------------------+
                     | Alloy DaemonSet   |
                     | (log collection)  |
                     +--------+----------+
                              |
                              v
                     +-------------------+
                     | Loki              |
                     | (log storage)     |
                     +--------+----------+
                              |
                              v
                     +-------------------+
                     | Grafana           |
                     | (log queries)     |
                     +-------------------+
```

### GitOps Flow

```
+------------+     +------------+     +---------------+     +------------+
| Developer  | --> | Gitea      | --> | Gitea Actions | --> | Container  |
| git push   |     | Repository |     | (build.yaml)  |     | Registry   |
+------------+     +-----+------+     +-------+-------+     +------------+
                         |                    |
                         |        (update values.yaml)
                         |                    |
                         v                    v
                   +------------+       +------------+
                   | Webhook    | ----> | ArgoCD     |
                   | (notify)   |       | Server     |
                   +------------+       +-----+------+
                                              |
                                          (sync app)
                                              |
                                              v
                                        +------------+
                                        | Kubernetes |
                                        | (deploy)   |
                                        +------------+
```
## Build Order (Dependencies)

Based on component dependencies, recommended installation order:

### Phase 1: ArgoCD (no dependencies on observability)

```
1. Install ArgoCD via Helm
   - Creates namespace: argocd
   - Verify existing Application manifest works
   - Configure Gitea webhook

Dependencies: None (Traefik already running)
Validates: GitOps pipeline end-to-end
```

### Phase 2: kube-prometheus-stack (foundational observability)

```
2. Configure k3s metrics exposure
   - Modify /etc/rancher/k3s/config.yaml
   - Restart k3s

3. Install kube-prometheus-stack via Helm
   - Creates namespace: monitoring
   - Includes: Prometheus, Grafana, Alertmanager
   - Includes: Default dashboards and alerts

Dependencies: k3s metrics exposed
Validates: Basic cluster monitoring working
```

### Phase 3: Loki + Alloy (log aggregation)

```
4. Install Loki via Helm (monolithic mode)
   - Creates namespace: loki
   - Configure storage with Longhorn

5. Install Alloy via Helm
   - DaemonSet in loki namespace
   - Configure Kubernetes log discovery
   - Point to Loki endpoint

6. Add Loki datasource to Grafana
   - URL: http://loki.loki.svc.cluster.local:3100

Dependencies: Grafana from step 3, storage
Validates: Logs visible in Grafana Explore
```
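Step 6 does not have to be a manual UI action: the kube-prometheus-stack chart accepts extra datasources, so the Loki datasource can be provisioned from the same Helm values. A sketch (the datasource name is arbitrary):

```yaml
grafana:
  additionalDataSources:
    - name: Loki
      type: loki
      access: proxy
      url: http://loki.loki.svc.cluster.local:3100
```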
### Phase 4: Application Integration

```
7. Add TaskPlaner metrics endpoint (if not exists)
   - Expose /metrics in app
   - Create ServiceMonitor

8. Create application dashboards in Grafana
   - TaskPlaner-specific metrics
   - Request latency, error rates

Dependencies: All previous phases
Validates: Full observability of application
```
## Resource Requirements

| Component | CPU Request | Memory Request | Storage |
|-----------|-------------|----------------|---------|
| ArgoCD (all) | 500m | 512Mi | - |
| Prometheus | 200m | 512Mi | 10Gi (Longhorn) |
| Grafana | 100m | 256Mi | 1Gi (Longhorn) |
| Alertmanager | 50m | 64Mi | 1Gi (Longhorn) |
| Loki | 200m | 256Mi | 10Gi (Longhorn) |
| Alloy (per node) | 100m | 128Mi | - |

**Total additional:** ~1.2 CPU cores, ~1.7Gi RAM, ~22Gi storage
## Security Considerations

### Network Policies

Consider network policies to restrict:
- Prometheus scraping only from the monitoring namespace
- Loki ingestion only from Alloy
- Grafana access only via Traefik
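As an illustration of the second bullet, a NetworkPolicy limiting Loki's push port to the Alloy pods might look like this. The pod labels are assumptions and must match what the charts actually apply:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: loki-ingest-from-alloy-only
  namespace: loki
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: loki
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: alloy
      ports:
        - protocol: TCP
          port: 3100
```

Note that Grafana (in `monitoring`) also queries Loki on port 3100, so a matching namespaceSelector rule would be needed before enforcing this.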
### Secrets Management

| Secret | Location | Purpose |
|--------|----------|---------|
| `argocd-initial-admin-secret` | argocd ns | Initial admin password |
| `argocd-secret` | argocd ns | Webhook secrets, repo credentials |
| `grafana-admin` | monitoring ns | Grafana admin password |

### Ingress Authentication

For production, consider:
- ArgoCD: Built-in OIDC/OAuth integration
- Grafana: Built-in auth (local, LDAP, OAuth)
- Prometheus: Traefik BasicAuth middleware (already a pattern in use)
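For the Prometheus bullet, the Traefik CRD version of that pattern is roughly the following; the secret name is an assumption, and the referenced Secret must contain htpasswd-format `users` data:

```yaml
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: prometheus-basicauth
  namespace: monitoring
spec:
  basicAuth:
    secret: prometheus-basicauth-users
```

It would then be attached to the Prometheus Ingress via the `traefik.ingress.kubernetes.io/router.middlewares: monitoring-prometheus-basicauth@kubernetescrd` annotation.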
## Anti-Patterns to Avoid

### 1. Skipping k3s Metrics Configuration

**What happens:** Prometheus installs, but most dashboards show "No data"
**Prevention:** Configure k3s to expose metrics BEFORE installing kube-prometheus-stack

### 2. Using Promtail Instead of Alloy

**What happens:** Technical debt - Promtail EOL is March 2026
**Prevention:** Use Alloy from the start; migration documentation exists

### 3. Running Loki in Microservices Mode for Small Clusters

**What happens:** Unnecessary complexity and resource overhead
**Prevention:** Monolithic mode for clusters under 100GB/day log volume

### 4. Forgetting server.insecure for ArgoCD with Traefik

**What happens:** Redirect loop (ERR_TOO_MANY_REDIRECTS)
**Prevention:** Always set `configs.params.server.insecure=true` when Traefik handles TLS

### 5. ServiceMonitor Label Mismatch

**What happens:** Prometheus doesn't discover custom ServiceMonitors
**Prevention:** Ensure the `release: <helm-release-name>` label matches the kube-prometheus-stack release
## Sources

**ArgoCD:**
- [ArgoCD Webhook Configuration](https://argo-cd.readthedocs.io/en/stable/operator-manual/webhook/)
- [ArgoCD Ingress Configuration](https://argo-cd.readthedocs.io/en/stable/operator-manual/ingress/)
- [ArgoCD Installation](https://argo-cd.readthedocs.io/en/stable/operator-manual/installation/)
- [Mastering GitOps: ArgoCD and Gitea on Kubernetes](https://blog.stackademic.com/mastering-gitops-a-comprehensive-guide-to-self-hosting-argocd-and-gitea-on-kubernetes-9cdf36856c38)

**Prometheus/Grafana:**
- [kube-prometheus-stack Helm Chart](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack)
- [Prometheus on K3s](https://fabianlee.org/2022/07/02/prometheus-installing-kube-prometheus-stack-on-k3s-cluster/)
- [K3s Monitoring Guide](https://github.com/cablespaghetti/k3s-monitoring)
- [ServiceMonitor Explained](https://dkbalachandar.wordpress.com/2025/07/21/kubernetes-servicemonitor-explained-how-to-monitor-services-with-prometheus/)

**Loki/Alloy:**
- [Loki Monolithic Installation](https://grafana.com/docs/loki/latest/setup/install/helm/install-monolithic/)
- [Loki Deployment Modes](https://grafana.com/docs/loki/latest/get-started/deployment-modes/)
- [Migrate from Promtail to Alloy](https://grafana.com/docs/alloy/latest/set-up/migrate/from-promtail/)
- [Grafana Loki 3.4 Release](https://grafana.com/blog/2025/02/13/grafana-loki-3.4-standardized-storage-config-sizing-guidance-and-promtail-merging-into-alloy/)
- [Alloy Replacing Promtail](https://docs-bigbang.dso.mil/latest/docs/adrs/0004-alloy-replacing-promtail/)

**Traefik Integration:**
- [Traefik Metrics with Prometheus](https://traefik.io/blog/capture-traefik-metrics-for-apps-on-kubernetes-with-prometheus)

---

*Last updated: 2026-02-03*