Monitoring with Grafana
Introduction
Grafana is an open-source platform for monitoring and observability. It lets you query, visualize, alert on, and understand your metrics no matter where they are stored. In this post, we will see how to monitor a Kubernetes cluster using Grafana and Prometheus.
Prerequisites
- A Kubernetes cluster
- Helm
- The following arguments added to your k3s server configuration to expose additional control-plane metrics (see the example after this list):
--kube-controller-manager-arg bind-address=0.0.0.0
--kube-proxy-arg metrics-bind-address=0.0.0.0
--kube-scheduler-arg bind-address=0.0.0.0
--etcd-expose-metrics true
--kubelet-arg containerd=/run/k3s/containerd/containerd.sock
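How you pass these flags depends on how k3s was set up. As a sketch, assuming k3s is installed with the official install script on a server node using embedded etcd, the flags can be appended to the server command:
# Sketch only: re-running the k3s install script with the extra flags appended.
# Adapt this to however your k3s service is managed (systemd unit, config.yaml, etc.).
curl -sfL https://get.k3s.io | sh -s - server \
  --kube-controller-manager-arg bind-address=0.0.0.0 \
  --kube-proxy-arg metrics-bind-address=0.0.0.0 \
  --kube-scheduler-arg bind-address=0.0.0.0 \
  --etcd-expose-metrics true \
  --kubelet-arg containerd=/run/k3s/containerd/containerd.sock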
Install Grafana & Prometheus
Add the Helm repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
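To confirm the repository was added correctly, you can search it for the chart we are about to install:
helm search repo prometheus-community/kube-prometheus-stack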
Create a namespace for monitoring
kubectl create namespace monitoring
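Optionally, verify the namespace exists before continuing:
kubectl get namespace monitoring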
Create a secret for Grafana
Create a secret manifest file
- Create a file named grafana-admin-credential.yml with the following content:
apiVersion: v1
kind: Secret
metadata:
  name: grafana-admin-credentials
  namespace: monitoring
type: Opaque
stringData:
  GF_SECURITY_ADMIN_PASSWORD: "admin" # Change this password
  GF_SECURITY_ADMIN_USER: "admin" # Change this username
Apply the secret
kubectl apply -f grafana-admin-credential.yml
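You can check that the secret was created and decode one of its keys to make sure the values are what you expect:
kubectl -n monitoring get secret grafana-admin-credentials
kubectl -n monitoring get secret grafana-admin-credentials -o jsonpath='{.data.GF_SECURITY_ADMIN_USER}' | base64 -d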
Create a values file for Grafana
- Create a file named values.yaml with the following content:
fullnameOverride: prometheus

defaultRules:
  create: true
  rules:
    alertmanager: true
    etcd: true
    configReloaders: true
    general: true
    k8s: true
    kubeApiserverAvailability: true
    kubeApiserverBurnrate: true
    kubeApiserverHistogram: true
    kubeApiserverSlos: true
    kubelet: true
    kubeProxy: true
    kubePrometheusGeneral: true
    kubePrometheusNodeRecording: true
    kubernetesApps: true
    kubernetesResources: true
    kubernetesStorage: true
    kubernetesSystem: true
    kubeScheduler: true
    kubeStateMetrics: true
    network: true
    node: true
    nodeExporterAlerting: true
    nodeExporterRecording: true
    prometheus: true
    prometheusOperator: true

alertmanager:
  fullnameOverride: alertmanager
  enabled: true
  ingress:
    enabled: false

grafana:
  enabled: true
  fullnameOverride: grafana
  forceDeployDatasources: false
  forceDeployDashboards: false
  defaultDashboardsEnabled: true
  defaultDashboardsTimezone: utc
  serviceMonitor:
    enabled: true
  sidecar:
    dashboards:
      provider:
        allowUiUpdates: true
  admin:
    existingSecret: grafana-admin-credentials
    userKey: GF_SECURITY_ADMIN_USER
    passwordKey: GF_SECURITY_ADMIN_PASSWORD

kubeApiServer:
  enabled: true

kubelet:
  enabled: true
  serviceMonitor:
    metricRelabelings:
      - action: replace
        sourceLabels:
          - node
        targetLabel: instance

kubeControllerManager:
  enabled: true
  endpoints: # ips of master node
    - 10.10.10.30
    - 10.10.10.31
    - 10.10.10.32

coreDns:
  enabled: true

kubeDns:
  enabled: false

kubeEtcd:
  enabled: true
  endpoints: # ips of master node
    - 10.10.10.30
    - 10.10.10.31
    - 10.10.10.32
  service:
    enabled: true
    port: 2381
    targetPort: 2381

kubeScheduler:
  enabled: true
  endpoints: # ips of master node
    - 10.10.10.30
    - 10.10.10.31
    - 10.10.10.32

kubeProxy:
  enabled: true
  endpoints: # ips of master node
    - 10.10.10.30
    - 10.10.10.31
    - 10.10.10.32

kubeStateMetrics:
  enabled: true

kube-state-metrics:
  fullnameOverride: kube-state-metrics
  selfMonitor:
    enabled: true
  prometheus:
    monitor:
      enabled: true
      relabelings:
        - action: replace
          regex: (.*)
          replacement: $1
          sourceLabels:
            - __meta_kubernetes_pod_node_name
          targetLabel: kubernetes_node

nodeExporter:
  enabled: true
  serviceMonitor:
    relabelings:
      - action: replace
        regex: (.*)
        replacement: $1
        sourceLabels:
          - __meta_kubernetes_pod_node_name
        targetLabel: kubernetes_node

prometheus-node-exporter:
  fullnameOverride: node-exporter
  podLabels:
    jobLabel: node-exporter
  extraArgs:
    - --collector.filesystem.mount-points-exclude=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/.+)($|/)
    - --collector.filesystem.fs-types-exclude=^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$
  service:
    portName: http-metrics
  prometheus:
    monitor:
      enabled: true
      relabelings:
        - action: replace
          regex: (.*)
          replacement: $1
          sourceLabels:
            - __meta_kubernetes_pod_node_name
          targetLabel: kubernetes_node
  resources:
    requests:
      memory: 512Mi
      cpu: 250m
    limits:
      memory: 1024Mi

prometheusOperator:
  enabled: true
  prometheusConfigReloader:
    resources:
      requests:
        cpu: 200m
        memory: 50Mi
      limits:
        memory: 100Mi

prometheus:
  enabled: true
  prometheusSpec:
    replicas: 1
    replicaExternalLabelName: "replica"
    ruleSelectorNilUsesHelmValues: false
    serviceMonitorSelectorNilUsesHelmValues: false
    podMonitorSelectorNilUsesHelmValues: false
    probeSelectorNilUsesHelmValues: false
    retention: 6h
    enableAdminAPI: true
    walCompression: true

thanosRuler:
  enabled: false
- Replace the IP addresses of the master nodes in the values file with those of your own cluster (see the command below to list them).
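If you are not sure which addresses to use, list your nodes and take the INTERNAL-IP values of the control-plane (master) nodes:
kubectl get nodes -o wide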
Install Grafana & Prometheus with Helm
helm install prometheus prometheus-community/kube-prometheus-stack -n monitoring -f values.yaml
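The release takes a minute or two to come up. You can watch the pods in the monitoring namespace until everything reports Running:
kubectl -n monitoring get pods -w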
Error loading config
If you get the following error:
Error loading config (--config.file=/etc/prometheus/config_out/prometheus.env.yaml)
You can fix it by opening a shell in the Prometheus pod and editing the generated configuration file there.
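To get that shell, exec into the Prometheus container (the pod name below is illustrative; use the exact name reported by the first command):
kubectl -n monitoring get pods
# The pod name here is an assumption; replace it with the prometheus-* pod from the listing above
kubectl -n monitoring exec -it prometheus-prometheus-prometheus-0 -c prometheus -- sh
Once inside the container, open the generated configuration file: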
cd /etc/prometheus/config_out
vi prometheus.env.yaml
Then, add the following content:
global:
  scrape_interval: 15s
  evaluation_interval: 15s
The pod should restart and the error should be fixed.
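To double-check that Prometheus is healthy again, you can port-forward the prometheus-operated service created by the Prometheus Operator and open http://localhost:9090/targets in a browser:
kubectl -n monitoring port-forward svc/prometheus-operated 9090:9090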
Access Grafana
Create an ingress for Grafana
- Create a file named grafana-ingress.yml with the following content:
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: grafana-ingress
  namespace: monitoring
  annotations:
    kubernetes.io/ingress.class: traefik-external
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`www.gf.your-domain.com`) # Change this to your domain
      kind: Rule
      services:
        - name: grafana
          port: 80
    - match: Host(`gf.your-domain.com`) # Change this to your domain
      kind: Rule
      services:
        - name: grafana
          port: 80
      middlewares:
        - name: default-headers
  tls:
    secretName: tls # Change this to your tls secret
Apply the ingress
kubectl apply -f grafana-ingress.yml
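Assuming the Traefik CRDs are installed in your cluster, you can confirm the route was created:
kubectl -n monitoring get ingressroute grafana-ingress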
After adding an entry for gf.your-domain.com in your DNS server, you should be able to access Grafana at https://gf.your-domain.com.
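If you want to reach Grafana before the DNS entry and certificate are in place, you can also port-forward the Grafana service directly (the service name and port match the fullnameOverride and ingress definition above) and browse to http://localhost:3000:
kubectl -n monitoring port-forward svc/grafana 3000:80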
Conclusion
In this post, we saw how to monitor a Kubernetes cluster using Grafana and Prometheus. We installed them with Helm and created an ingress to access Grafana. We also saw how to fix the "Error loading config" issue in Prometheus. In a future post, we will see how to create dashboards and alerts in Grafana to monitor the Kubernetes cluster and other resources.