Development and Maintenance Guide for the Grafana Package
Grafana is a modified/customized version of an upstream chart. The steps below describe how to update to a new version of the Grafana package.
How to upgrade the Grafana Package chart
1. Navigate to the upstream chart repo and folder and find the tag (e.g., `grafana-x.x.x`) that corresponds with the new chart version for this update.
2. From the root of the repo, run `kpt pkg update chart@<tag> --strategy alpha-git-patch`, replacing `<tag>` with the tag you found in step 1. You may be prompted to resolve some conflicts. Choose what makes sense: keep any BB additions/changes, and keep any upstream additions/changes.
3. See the Big Bang Modifications section below for the changes that need to be made to the `chart/values.yaml` and `chart/templates/_helpers.tpl` files.
4. Modify the `version` in `Chart.yaml`. Append `-bb.0` to the chart version from upstream.
5. Update dependencies to the latest BB gluon library version using `helm dependency update ./chart`.
6. Check for changes to the dashboards provided with `kube-prometheus-stack`. Also check for changes to the dashboard sync Python script from upstream (`hack/sync_grafana_dashboards.py`). If there are changes, read the Syncing Dashboards section below.
7. Update `CHANGELOG.md`, adding an entry for the new version and noting all changes (at minimum it should include `Updated Grafana chart to x.x.x` and `Updated image versions to latest in IB (grafana: x.x.x, etc)`).
8. Generate the `README.md` updates by following the guide in gluon.
9. Push up your changes and validate that CI passes. If there are any failures, follow the information in the pipeline to make the necessary updates and reach out to the team if needed.
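The version bump and the two commands from the steps above can be sketched in Python. This is a minimal sketch, not part of the package: `bigbang_chart_version` and `upgrade_commands` are hypothetical helper names, `7.3.9` is only an example version, and the commands are built as argument lists rather than executed.

```python
def bigbang_chart_version(upstream_version: str, bb_revision: int = 0) -> str:
    """Return the Big Bang chart version for an upstream chart version.

    Appends the -bb.N suffix described in the upgrade steps (N defaults to 0).
    """
    version = upstream_version.strip()
    if "-bb." in version:
        raise ValueError(f"{version!r} already carries a -bb suffix")
    return f"{version}-bb.{bb_revision}"


def upgrade_commands(tag: str) -> list[list[str]]:
    """Build (but do not run) the commands named in the upgrade steps."""
    return [
        ["kpt", "pkg", "update", f"chart@{tag}", "--strategy", "alpha-git-patch"],
        ["helm", "dependency", "update", "./chart"],
    ]


if __name__ == "__main__":
    # Example values only; substitute the tag found in step 1.
    print(bigbang_chart_version("7.3.9"))
    for command in upgrade_commands("grafana-7.3.9"):
        print(" ".join(command))
```

Each returned list could be passed to `subprocess.run` from the root of the repo. Conflict resolution during the `kpt` update remains a manual step.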
Testing a new Grafana version
- As part of your MR that modifies Big Bang packages, modify bigbang/tests/test-values.yaml on your branch to enable your packages for the CI/CD MR testing.
  - To do this, at a minimum, follow the instructions at bigbang/docs/developer/test-package-against-bb.md with changes for Grafana enabled (the below is a reference; actual changes could be more, depending on what was changed in Grafana in the package MR).

    grafana:
      enabled: true
      git:
        tag: null
        branch: <my-package-branch-that-needs-testing>
      values:
        istio:
          hardened:
            enabled: true
      ### Additional components of Grafana should be changed to reflect testing changes introduced in the package MR

- Perform the steps below for manual testing. Our CI provides a good set of basic smoke tests (use the `debug` label), but it is beneficial to run some additional checks.
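The reference override above can be parameterized by branch name. The following is a rough sketch that assumes nothing beyond the keys shown in the reference; `grafana_test_values` is a hypothetical helper, and the JSON output is only for inspection.

```python
import json


def grafana_test_values(branch: str) -> dict:
    """Return the Grafana section of the test values as a plain dict.

    Mirrors the reference override: the tag is cleared and the package
    branch under test is set instead.
    """
    if not branch:
        raise ValueError("a package branch name is required")
    return {
        "grafana": {
            "enabled": True,
            "git": {"tag": None, "branch": branch},
            "values": {"istio": {"hardened": {"enabled": True}}},
        }
    }


if __name__ == "__main__":
    # "my-feature-branch" is a placeholder branch name.
    print(json.dumps(grafana_test_values("my-feature-branch"), indent=2))
```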
Deploy Grafana as a part of BigBang
overrides/testing-grafana.yaml

flux:
  interval: 1m
  rollback:
    cleanupOnFail: false

networkPolicies:
  enabled: true

clusterAuditor:
  enabled: false

gatekeeper:
  enabled: false

neuvector:
  enabled: false

istioOperator:
  enabled: true

istio:
  enabled: true
  values:
    hardened:
      enabled: true

monitoring:
  enabled: true

grafana:
  enabled: true
  git:
    tag: null
    branch: "renovate/ironbank"
  sso:
    enabled: true
    grafana:
      client_id: platform1_a8604cc9-f5e9-4656-802d-d05624370245_bb8-grafana
      scopes: "openid Grafana"
  values:
    istio:
      enabled: true
      hardened:
        enabled: true

loki:
  enabled: true

promtail:
  enabled: true

tempo:
  enabled: true

kyverno:
  enabled: false

kyvernoPolicies:
  enabled: false

kyvernoReporter:
  enabled: false

jaeger:
  enabled: false

kiali:
  enabled: false

elasticsearchKibana:
  enabled: false

eckOperator:
  enabled: false

fluentbit:
  enabled: false

twistlock:
  enabled: false

- Visit https://grafana.dev.bigbang.mil and log in.
- Navigate to Dashboards, click on Kubernetes / Compute Resources / Cluster, and validate that data is loaded.
Big Bang Modifications
Modifications made to upstream chart
chart/values.yaml
- Line 3: Ensure `global.imageRegistry` is set to `registry1.dso.mil`.
  global:
    # -- Overrides the Docker registry globally for all images
    imageRegistry: registry1.dso.mil
- Line 19: Ensure `openshift: false` is present.
  openshift: false
- Line 100-103: Ensure the `image` configuration is set to the following, where `X.Y.Z` is the correct version:
  image:
    repository: ironbank/big-bang/grafana/grafana-plugins
    # Overrides the Grafana image tag whose default is the chart appVersion
    tag: "X.Y.Z"
    #sha: ""
- Line 112: Ensure `image.pullSecrets` is supplied.
  pullSecrets:
    - private-registry
- Line 115-117: Ensure the `testFramework` configuration is set to the following, using the correct version for `tag`:
  testFramework:
    enabled: false
    image: ironbank/opensource/bats/bats
    tag: "v1.4.1"
- Line 140-142: Ensure the `securityContext` values for `runAsUser`, `runAsGroup`, and `fsGroup` are set to `65532`:
  securityContext:
    runAsNonRoot: true
    runAsUser: 65532
    runAsGroup: 65532
    fsGroup: 65532
- Line 177-179: Ensure the `downloadDashboardsImage` configuration is set to the following, where `X.Y.Z` is the correct version:
  downloadDashboardsImage:
    repository: ironbank/big-bang/base
    tag: X.Y.Z
    #sha: ""
- Line 185-191: Ensure the `downloadDashboards.resources` configuration is set to the following:
  resources:
    limits:
      cpu: 20m
      memory: 20Mi
    requests:
      cpu: 20m
      memory: 20Mi
- Line 213-215: Set the required `app` and `version` label defaults for Kiali:
  podLabels:
    app: monitoring-grafana
    version: "{{ .Chart.AppVersion }}"
- Line 240: Ensure `service.portName` is set to `http-service`.
  portName: http-service
- Line 252: Ensure `serviceMonitor.interval` is set to `1m`.
  interval: 1m
- Line 310: Ensure `resources` is set to the following:
  resources:
    limits:
      cpu: 100m
      memory: 256Mi
    requests:
      cpu: 100m
      memory: 256Mi
- Line 410: Ensure `initChownData.enabled` is set to `false`.
  initChownData:
    ## If false, data ownership will not be reset at startup
    ## This allows the grafana-server to be run with an arbitrary user
    ##
    enabled: false
- Line 415-416: Ensure `initChownData.image.repository` and `initChownData.image.tag` are set to the following:
  image:
    repository: ironbank/redhat/ubi/ubi9-minimal
    tag: "9.4"
- Line 423-429: Ensure `initChownData.resources` is set to the following:
  resources:
    limits:
      cpu: 100m
      memory: 128Mi
    requests:
      cpu: 100m
      memory: 128Mi
- Line 441: Ensure `adminPassword` is set to `prom-operator`.
  adminPassword: prom-operator
- Line 790-792: Ensure that `grafana.ini.analytics` has these values:
  analytics:
    reporting_enabled: false
    check_for_updates: false
- Line 804-827: Ensure the following section is added to the `grafana.ini` configuration:
  auth.generic_oauth:
    enabled: false
    client_id: grafana #this is a sample client_id, review docs/KEYCLOAK.md
    client_secret: secret #this is a sample secret, review docs/KEYCLOAK.md
    scopes: Grafana #this is a sample client scope, review docs/KEYCLOAK.md
    auth_url: https://login.dso.mil/auth/realms/baby-yoda/protocol/openid-connect/auth
    token_url: https://login.dso.mil/auth/realms/baby-yoda/protocol/openid-connect/token
    api_url: https://login.dso.mil/auth/realms/baby-yoda/protocol/openid-connect/userinfo
    allow_sign_up: true
    role_attribute_path: Viewer
    # tls_skip_verify_insecure: false
    # tls_client_cert: ""
    # tls_client_key: ""
    # tls_client_ca : /etc/oidc/ca.pem
    # allowed_domains: mycompany.com mycompany.org
    # empty_scopes: false
  plugin.grafana-piechart-panel:
    path: /var/lib/bb-plugins/piechart-panel
  plugin.grafana-polystat-panel:
    path: /var/lib/bb-plugins/polystat-panel
  plugin.redis-datasource:
    path: /var/lib/bb-plugins/redis-datasource
  security:
    angular_support_enabled: false
- Line 884-885: Ensure that `sidecar.image.repository` and `sidecar.image.tag` are set to the following:
  sidecar:
    image:
      repository: ironbank/kiwigrid/k8s-sidecar
      tag: 1.27.5
- Line 887-893: Ensure that `sidecar.resources` is set to the following:
  resources:
    limits:
      cpu: 100m
      memory: 100Mi
    requests:
      cpu: 100m
      memory: 100Mi
- Line 951: Ensure `sidecar.dashboards.enabled` is set to `true`.
  dashboards:
    enabled: true
- Line 973: Ensure `sidecar.dashboards.labelValue` is set to `"1"`.
  labelValue: "1"
- Line 983: Ensure `sidecar.dashboards.searchNamespace` is set to `ALL`.
  searchNamespace: ALL
- Line 1028-1032: Ensure `sidecar.dashboards.multicluster` is set to the following:
  multicluster:
    global:
      enabled: true
    etcd:
      enabled: true
- Line 1034: Ensure `sidecar.datasources.enabled` is set to `true`.
  datasources:
    enabled: true
- Line 1055: Ensure `sidecar.datasources.labelValue` is set to `"1"`.
  labelValue: "1"
- Line 1188-1190: Ensure `imageRenderer.image.registry` is removed and `imageRenderer.image.repository` is overridden.
  image:
    # image-renderer Image repository
    repository: docker.io/grafana/grafana-image-renderer
- Line 124-: Ensure `imageRenderer.service.portName` is set to `http-web`.
  portName: http-web
- Line 1405: Ensure `assertNoLeakedSecrets` is set to `false`.
  assertNoLeakedSecrets: false
- EOF: Add the following extra configurations to the bottom of the file:
  defaultDashboardsEnabled:
    enabled: true
  coreDns:
    enabled: true
  kubeEtcd:
    enabled: true
  kubeApiServer:
    enabled: true
  kubeControllerManager:
    enabled: true
  kubelet:
    enabled: true
    namespace: kube-system
  kubeProxy:
    enabled: true
  kubeScheduler:
    enabled: true
  nodeExporter:
    enabled: true
    operatingSystems:
      linux:
        enabled: true
      darwin:
        enabled: true
      windows:
        enabled: true
  windowsMonitoring:
    enabled: true
  prometheusRemoteWriteDashboards: true
  networkPolicies:
    enabled: false
    ingressLabels:
      app: public-ingressgateway
      istio: ingressgateway
    additionalPolicies: []
  defaultDashboardsEditable: true
  domain: dev.bigbang.mil
  istio:
    enabled: false
    hardened:
      enabled: false
      outboundTrafficPolicyMode: "REGISTRY_ONLY"
      customServiceEntries: []
      # - name: "allow-google"
      #   enabled: true
      #   spec:
      #     hosts:
      #       - google.com
      #     location: MESH_EXTERNAL
      #     ports:
      #       - number: 443
      #         protocol: TLS
      #         name: https
      #     resolution: DNS
      customAuthorizationPolicies: []
      # - name: "allow-nothing"
      #   enabled: true
      #   spec: {}
      kiali:
        enabled: true
        namespaces:
          - kiali
        principals:
          - cluster.local/ns/kiali/sa/kiali-service-account
    grafana:
      # Toggle vs creation
      enabled: true
      annotations: {}
      labels: {}
      gateways:
        - istio-system/main
      hosts:
        - grafana.{{ .Values.domain }}
      service: ""
      port: ""
      namespace: ""
    injection: disabled
    mtls:
      # Note that setting this to STRICT requires additional configuration for Prometheus and monitors.
      # Review `./docs/istio-mtls-metrics.md` for additional information.
      mode: STRICT
  sso:
    enabled: false
  bbtests:
    enabled: false
    cypress:
      artifacts: true
      envs:
        cypress_grafana_url: 'http://grafana:80'
      resources:
        requests:
          cpu: 2
          memory: 2Gi
        limits:
          cpu: 2
          memory: 2Gi
    istio:
      sidecar:
        resources:
          cpu:
            requests: 100m
            limits: 2000m
          memory:
            requests: 512Mi
            limits: 2048Mi
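After a `kpt` update it is easy to miss one of the overrides listed above. The sketch below checks a handful of the scalar settings against an already-parsed values dictionary. It is illustrative only: `EXPECTED`, `lookup`, and `find_mismatches` are hypothetical names, the key paths assume the nesting implied by the bullets (for example a top-level `securityContext`), and loading `chart/values.yaml` into a dict (for instance with PyYAML's `yaml.safe_load`) is left to the caller.

```python
# A subset of the Big Bang overrides listed above, keyed by path in values.yaml.
EXPECTED = {
    ("global", "imageRegistry"): "registry1.dso.mil",
    ("openshift",): False,
    ("securityContext", "runAsUser"): 65532,
    ("securityContext", "runAsGroup"): 65532,
    ("securityContext", "fsGroup"): 65532,
    ("service", "portName"): "http-service",
    ("serviceMonitor", "interval"): "1m",
    ("initChownData", "enabled"): False,
    ("sidecar", "dashboards", "enabled"): True,
    ("sidecar", "dashboards", "searchNamespace"): "ALL",
    ("sidecar", "datasources", "enabled"): True,
    ("assertNoLeakedSecrets",): False,
}

_MISSING = object()  # sentinel so that None/False values are not mistaken for "absent"


def lookup(values: dict, path: tuple):
    """Walk a nested dict along path; return _MISSING if any key is absent."""
    node = values
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return _MISSING
        node = node[key]
    return node


def find_mismatches(values: dict, expected: dict = EXPECTED) -> list[str]:
    """Return one human-readable message per missing or wrong setting."""
    problems = []
    for path, want in expected.items():
        dotted = ".".join(path)
        got = lookup(values, path)
        if got is _MISSING:
            problems.append(f"{dotted}: missing (expected {want!r})")
        elif type(got) is not type(want) or got != want:
            problems.append(f"{dotted}: expected {want!r}, got {got!r}")
    return problems
```

An empty result only means these particular keys match. The image, resources, `grafana.ini`, and end-of-file additions still need to be reviewed by hand.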
chart/templates/_helpers.tpl
- Line 84: Set `app.kubernetes.io/instance` to `monitoring-monitoring`.
  app.kubernetes.io/instance: monitoring-monitoring
- Line 104: Set `app.kubernetes.io/instance` to `monitoring-monitoring`.
  app.kubernetes.io/instance: monitoring-monitoring
- EOF: Ensure this section is added to the bottom of the file:
{{/*
Find hostname from uri
*/}}
{{- define "grafana.hostnameFromUri" -}}
{{- $match := . | toString | regexFind "//.*" -}}
{{- $hostWithPort := regexSplit "/" ($match | trimAll "//") -1 -}}
{{- $host := regexSplit ":" (first $hostWithPort) -1 -}}
{{- printf "%s" (first $host) -}}
{{- end -}}
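To make the behavior of `grafana.hostnameFromUri` easier to reason about, here is a rough Python mirror of the same steps. It is not part of the chart; `hostname_from_uri` is a hypothetical name, and the mapping to the Sprig functions is noted in the comments.

```python
import re


def hostname_from_uri(uri) -> str:
    """Python mirror of the grafana.hostnameFromUri template logic."""
    match = re.search(r"//.*", str(uri))          # regexFind "//.*"
    found = match.group(0) if match else ""       # regexFind yields "" on no match
    trimmed = found.strip("/")                    # trimAll "//" strips "/" from both ends
    host_with_port = re.split("/", trimmed)[0]    # first (regexSplit "/" ... -1)
    return re.split(":", host_with_port)[0]       # first (regexSplit ":" ... -1)


if __name__ == "__main__":
    print(hostname_from_uri(
        "https://login.dso.mil/auth/realms/baby-yoda/protocol/openid-connect/auth"
    ))
```

The path and any port are dropped, so a URI such as `http://grafana:80` reduces to `grafana`. A value without `//` yields an empty string.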
hack/sync_grafana_dashboards.py
- Line 92: Change the `'prometheus-remote-write'` entry of `condition_map` to:
  'prometheus-remote-write': ' .Values.prometheusRemoteWriteDashboards',
automountServiceAccountToken
The mutating Kyverno policy named `update-automountserviceaccounttokens` is leveraged to harden all ServiceAccounts in this package with `automountServiceAccountToken: false`. This policy is configured by namespace in the Big Bang umbrella chart repository at chart/templates/kyverno-policies/values.yaml.
This policy revokes access to the K8s API for Pods utilizing said ServiceAccounts. If a Pod truly requires access to the K8s API (for app functionality), the Pod is added to the `pods:` array of the same mutating policy. This grants the Pod access to the API and creates a Kyverno PolicyException to prevent an alert.
Syncing Dashboards
We ship the Grafana package separately because of https://repo1.dso.mil/big-bang/product/packages/monitoring/-/issues/110 and https://github.com/prometheus-community/helm-charts/issues/3548: no fix that resolved the issue ever reached our environments.
When the dashboards and script are updated upstream, we must pull in the new scripts from `hack/` in `kube-prometheus-stack`, modify them so that any new values are present in this chart, and revert any references to `.Values.grafana` back to just `.Values.` since this is the Grafana chart.
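Reverting the `.Values.grafana` references is a mechanical substitution, so it can be sketched as a small helper. This is an illustration under stated assumptions rather than part of the upstream script: `strip_grafana_prefix` is a hypothetical name, and it only rewrites references that continue with a further key.

```python
import re

# Matches ".Values.grafana." only when another key follows, so a name such as
# ".Values.grafanaFoo" is left untouched.
_GRAFANA_PREFIX = re.compile(r"\.Values\.grafana\.(?=\w)")


def strip_grafana_prefix(template_text: str) -> str:
    """Rewrite .Values.grafana.<key> references to .Values.<key>."""
    return _GRAFANA_PREFIX.sub(".Values.", template_text)


if __name__ == "__main__":
    line = "{{- if .Values.grafana.sidecar.dashboards.enabled }}"
    print(strip_grafana_prefix(line))
```

Review the resulting diff before committing, since a bare `.Values.grafana` with no trailing key is not handled by this pattern.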
Before running the Python script, ensure the relative locations are correct, e.g.:
charts = [
    {
        'source': '../../monitoring/chart/files/dashboards/k8s-coredns.json',  # points to the local Big Bang monitoring chart/files/dashboards
        'destination': '../chart/templates/dashboards/dashboards-1.14',  # points to this Grafana package's chart/templates/dashboards (e.g. when run from the hack/ folder)
        ...,
    },
    {
        ...,
        'destination': '../chart/templates/dashboards/dashboards-1.14',
        ...,
    },
]
Push up any changes to the `dashboards/dashboards-1.14` folder. Deploy the chart in dev and ensure the dashboards modified upstream (coredns/node-exporter/etcd) still import and show data as they did before the changes/upgrades.