Installing GitLab + Backup on Kubernetes - Part 1

In this guide, we will explain how to install GitLab Community Edition on Kubernetes and how to set up an automated backup schedule for it using Veeam Kasten.

We will also demonstrate a disaster scenario where we lose our entire GitLab installation, and then recover it completely from the backups. Regularly testing that you can actually recover from your backups is highly recommended in general.

In short, the guide covers:

  • Hosting GitLab on a Kubernetes cluster.
  • Exposing it to the internet.
  • Managing TLS certificates properly.
  • Backup and recovery of your GitLab instance using Veeam Kasten.

Glossary

GitLab Community Edition

You have surely heard of GitHub.com, currently the most popular software development collaboration platform. However, it runs on someone else's computers, Microsoft's in this case. They do sell an on-premises version of GitHub, but that can get quite expensive.

GitLab Community Edition is a free and open source alternative to GitHub with a comparable feature set that you can host on your own.

GitLab also has an Enterprise Edition, hence the distinctive name of the open variant: Community Edition.

Kubernetes

As a software developer, you may be familiar with Docker, Docker Compose or Podman for running software inside isolated containers. Kubernetes is like those, but on steroids.

Kubernetes is designed to be scalable. In practice, this means running your software in a cluster across multiple machines. Unlike Docker though, it has more moving parts, and a steeper learning curve. But for growing businesses and larger organizations, a system that can scale is very important, and Kubernetes is the most popular container orchestration platform for this purpose.

Although there is a steep learning curve, installing Kubernetes can be very easy. There are several Kubernetes distributions that bundle all the necessary components for you, and let you install everything with just a few commands.

Here is a list of distributions to download:

  • Minikube: Good for development on your own machine. ← Pick this if you're new to Kubernetes.
  • K0s: Quick to install and production-ready.
  • K3s: Designed to scale across multiple smaller machines, even IoT devices.
  • Talos: A stripped-down and hardened Linux distribution that includes Kubernetes and is designed only to run it.

Veeam Kasten

Veeam Kasten is a backup solution for Kubernetes. It automates backing up software hosted on Kubernetes, and automates the recovery process. You should have backups, and if you're using Kubernetes, something like Kasten makes creating backups and recovering from them much easier.

Unlike Kubernetes itself and GitLab CE, Kasten is not fully open source, but it is free for clusters with up to 5 nodes. This means that even if your cluster has five nodes, each with a 98-thread CPU and 256 GB of RAM, you're still eligible to run Kasten using the free license. The free license is perpetual too.

Although Kasten as a complete solution is not entirely open source, it depends on many open source components, including its data mover Kopia, which is licensed under the Apache License 2.0. As a side note, Kopia can be run as a desktop application and is actually great for creating automated, deduplicated and encrypted home backups. The more you know.

Prerequisites

Kubernetes basics

  1. The Helm tool installed on your machine.
  2. A Kubernetes cluster whose node(s) fulfils:

S3 storage

For the Kubernetes backups, we recommend using S3 storage because it allows you to make the data immutable (for example with S3 Object Lock). In the case of a ransomware attack, the attacker won't be able to modify or delete your locked backup data.

If you don't have any S3 storage available, feel free to contact us! We can recommend S3 solutions or services that suit your needs.

A domain name

A domain name whose subdomains point to your cluster.

In other words, a DNS A-record with the name * must be set to an IP address of your cluster.

This is important because GitLab will create several subdomain ingresses:

  • gitlab.yourdomain.com
  • registry.yourdomain.com
  • kas.yourdomain.com
  • minio.yourdomain.com

And for repositories that house their own "pages" websites:

  • *.pages.yourdomain.com
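
Before installing anything, it can help to confirm that these names actually resolve. The following is a minimal sketch, assuming a Linux machine where getent is available; the function names and the "demo.pages" label (an arbitrary name used to exercise the wildcard record) are our own and not part of GitLab:

```shell
#!/bin/sh
# Print the hostnames listed above for a given base domain.
# "demo.pages" is an arbitrary label chosen to exercise the wildcard record.
gitlab_hosts() {
  for sub in gitlab registry kas minio demo.pages; do
    printf '%s.%s\n' "$sub" "$1"
  done
}

# Report whether each hostname resolves.
# Assumes getent is available, as it is on most Linux systems.
check_gitlab_dns() {
  for host in $(gitlab_hosts "$1"); do
    if getent hosts "$host" >/dev/null 2>&1; then
      echo "ok       $host"
    else
      echo "MISSING  $host"
    fi
  done
}

# Usage: check_gitlab_dns yourdomain.com
```

Any line reported as MISSING suggests that the wildcard A record has not been created or has not propagated yet.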

The easiest way to set this up is by using a static public IP address. This makes the cluster publicly reachable and simplifies the installation considerably, because GitLab can then use the ACME HTTP-01 challenge to request TLS certificates from Let's Encrypt.

If, however, your cluster is not publicly accessible, such as on a development machine, then you will be limited to generating certificates using ACME DNS-01. It is a more powerful method, but it is more tedious to set up and is beyond the scope of this guide. You must also ensure that the IP address you point the domain at is reachable by both you and your cluster's pods (127.0.0.1 does not qualify). You have at least three choices here:

  • If you have access to your internet router, make your LAN IP address static by pinning it to your network card's MAC address in your work/home router. Then use your LAN IP address.
  • Run the cluster inside a VM, make the VM's LAN IP address static and use that for the domain name.
  • Use the Kubernetes Ingress Controller service IP:
    1. Expose the entire Kubernetes service network to the host.
    2. Make your Ingress Controller service IP address static. You must choose a valid service IP address; which addresses are valid depends on the service IP range of your cluster.
    3. Set the aforementioned DNS A record of your domain name to the service IP address.

Optional

These are required for GitLab to work, but GitLab's Helm chart can install them for you, which is why they are optional prerequisites. However, if you don't have them yet and don't want GitLab's Helm chart to manage them (they are cluster-wide components that may be useful for applications other than GitLab), you can install them separately:

  1. Cert Manager.
  2. An Ingress Controller such as Nginx or Traefik.

Regardless, it's up to you.

Configuring GitLab

We will be installing GitLab using its Helm chart. Before doing so, create the configuration file values.yaml.

values.yaml:

global:
  # Install the Community Edition (CE).
  edition: ce
  hosts:
    # Your domain name.
    domain: "example.com"  # CHANGE THIS
    
  # If you want to use the built-in ingress controller along with ACME HTTP-01, 
  # remove this section:
  ingress:
    # If you intend to use ACME HTTP-01, set this to true:
    configureCertmanager: false
    # The kind of ingress controller (nginx, traefik, ...).
    # This is important because the Helm chart will choose a strategy to expose
    # port 22 (Git over SSH) based on the controller that you use.
    provider: traefik  # CHANGE THIS
    # The ingress controller name on your cluster.
    class: traefik  # CHANGE THIS
    annotations:
      # Only include this if you have an ACME DNS-01 Cert Manager ClusterIssuer:
      cert-manager.io/cluster-issuer: letsencrypt-production  # CHANGE THIS
    tls:
      enabled: true
  
  pages:
    enabled: true
    accessControl: true
    customDomainMode: https
    
  # Remove the `email` and `smtp` sections if you don't have an email
  # address for GitLab to use:
  email:
    display_name: "GitLab"
    from: "noreply@example.com"  # CHANGE THIS
  smtp:
    enabled: true
    address: mail.example.com  # CHANGE THIS
    port: 465
    tls: true
    user_name: "noreply@example.com"  # CHANGE THIS
    password:
      secret: smtp-password
    authentication: login

# If you intend to use ACME DNS-01, remove this section, 
# otherwise set the email address, because it's needed to generate certificates:
certmanager-issuer:
  email: admin@yourdomain.com  # CHANGE THIS

gitlab-runner:
  runners:
    privileged: true

# Cert Manager will create one certificate for each ingress, so each cert secret
# will need its own name.
gitlab:
  webservice:
    ingress:
      tls:
        secretName: gitlab-ingress-cert
  kas:
    ingress:
      tls:
        secretName: kas-ingress-cert
  gitlab-pages:
    ingress:
      tls:
        secretName: pages-ingress-cert
registry:
  ingress:
    tls:
      secretName: registry-ingress-cert
minio:
  ingress:
    tls:
      secretName: minio-ingress-cert

# Set this to false if Cert Manager is already installed on your cluster:
installCertmanager: true
nginx-ingress:
  # Set this to false if an ingress controller is already installed on 
  # your cluster, and you DON'T want to use the built-in ingress controller:
  enabled: true

There are four things you have to consider:

  1. Should you use ACME HTTP-01 or DNS-01?
  2. Should you use the built-in ingress controller or not?
  3. Should you use the built-in Cert Manager or not?
  4. Should your GitLab instance be able to send emails or not (SMTP)?

You will have to customize the above values.yaml depending on how you answer these questions.

The changes for each of the four combinations are:

ACME HTTP-01 with the built-in ingress controller:

  • Remove global.ingress.
  • Set certmanager-issuer.email.
  • Requirement: The cluster must be publicly accessible.

ACME HTTP-01 with your cluster's own ingress controller:

  • Set global.ingress.provider.
  • Set global.ingress.class.
  • Set global.ingress.configureCertmanager to true.
  • Set nginx-ingress.enabled to false.
  • Set certmanager-issuer.email.
  • Requirement: The cluster must be publicly accessible.

ACME DNS-01 with the built-in ingress controller:

  • Create a Cert Manager ClusterIssuer that is able to issue certificates for *.yourdomain.com and set global.ingress.annotations.cert-manager.io/cluster-issuer.
  • Set global.ingress.provider to "nginx".
  • Set global.ingress.class to "gitlab-nginx".

ACME DNS-01 with your cluster's own ingress controller:

  • Create a Cert Manager ClusterIssuer that is able to issue certificates for *.yourdomain.com and set global.ingress.annotations.cert-manager.io/cluster-issuer.
  • Set global.ingress.provider.
  • Set global.ingress.class.
  • Set nginx-ingress.enabled to false.
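
To make one of these combinations concrete, here is a sketch of the keys involved for ACME HTTP-01 with your cluster's own ingress controller. The file name values-http01.yaml is hypothetical, and traefik and the email address are placeholders to change:

```shell
#!/bin/sh
# Write only the keys that matter for "ACME HTTP-01 + existing controller".
# values-http01.yaml is a hypothetical file name; "traefik" and the email
# address are placeholders.
cat > values-http01.yaml <<'EOF'
global:
  ingress:
    configureCertmanager: true
    provider: traefik
    class: traefik
certmanager-issuer:
  email: admin@yourdomain.com
nginx-ingress:
  enabled: false
EOF

echo "Wrote values-http01.yaml"
```

Helm merges multiple -f files from left to right, so a file like this could be passed after values.yaml to override those keys. Note that merging does not remove keys: the cert-manager.io/cluster-issuer annotation, which is only meant for DNS-01, would still have to be deleted from values.yaml by hand.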

Note: GitLab has many more configuration options that may interest you. In this article, we only cover the most essential options to get a running installation. For example, in production environments (especially larger ones), there are extra considerations to take into account to increase scalability and availability.

Configuring your own Traefik ingress controller

If you have decided to use your own (Traefik) ingress controller, do the following. Otherwise, ignore this section.

In your Traefik values.yaml, add the SSH/shell entrypoint under ports:

ports:
  # ...
  gitlab-shell:
    port: 22
    expose:
      default: true
    exposedPort: 22
    protocol: TCP

Then run a helm upgrade ... on your Traefik installation to apply the new settings.

This will cause Traefik to expose port 22, allowing GitLab to use it for its SSH-based git interface.
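
To confirm that the upgrade took effect, you can check whether the Traefik Service now lists port 22. This is only a sketch: the function name is our own, and the defaults assume a Service named traefik in a namespace named traefik, which may differ on your cluster:

```shell
#!/bin/sh
# Succeeds if the given Service lists port 22.
# Defaults assume a Service named "traefik" in the "traefik" namespace;
# pass your own namespace and Service name if they differ.
traefik_exposes_ssh() {
  ns="${1:-traefik}"
  svc="${2:-traefik}"
  kubectl get service "$svc" --namespace "$ns" \
    -o jsonpath='{.spec.ports[*].port}' | tr ' ' '\n' | grep -qx 22
}

# Usage: traefik_exposes_ssh traefik traefik && echo "port 22 is exposed"
```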

Installing GitLab

SMTP password

If you have configured an email address for GitLab, do the following. Otherwise, ignore this section.

Run the command below and replace YOUR_SMTP_PASSWORD with your email password:

# Making sure the namespace is available
kubectl create namespace gitlab
# Create the secret
kubectl create secret \
  generic \
  smtp-password \
  --from-literal=password=YOUR_SMTP_PASSWORD \
  --namespace gitlab

Add the GitLab Helm chart repository

helm repo add gitlab https://charts.gitlab.io/
helm repo update

Install GitLab

helm upgrade --install gitlab gitlab/gitlab \
  --namespace gitlab \
  --create-namespace \
  --timeout 600s \
  -f values.yaml

If you wish to change any settings in values.yaml, do so, then run the above command again to apply the changes to your GitLab installation.
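
It can take several minutes before all GitLab components are up. A small sketch for following the progress is shown below. The function name is our own, and the deployment name gitlab-webservice-default is an assumption based on the release name gitlab used above. Listing certificates only works when Cert Manager is installed:

```shell
#!/bin/sh
# Wait for the main web component, then list pods, ingresses and certificates.
# Assumes the Helm release is called "gitlab", in which case the web
# deployment is expected to be named gitlab-webservice-default.
gitlab_status() {
  ns="${1:-gitlab}"
  kubectl rollout status deployment/gitlab-webservice-default \
    --namespace "$ns" --timeout=600s
  kubectl get pods --namespace "$ns"
  kubectl get ingress --namespace "$ns"
  kubectl get certificate --namespace "$ns"
}

# Usage: gitlab_status gitlab
```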

Logging in to GitLab

You should eventually be able to visit https://gitlab.yourdomain.com/.

The admin username is root, and the password can be retrieved by running:

kubectl get secret gitlab-gitlab-initial-root-password -n gitlab -ojsonpath='{.data.password}' | base64 --decode ; echo

Installing Veeam Kasten

Installing the backup solution Kasten should be a lot easier as it won't require any ingresses or publicly recognized TLS certificates.

Following the official installation guide should be straightforward.

Adding the --create-namespace flag to the helm install command is useful though, as it creates the kasten-io namespace if it does not exist yet.
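
For reference, the installation typically boils down to something like the sketch below. The repository URL and the release name k10 are what we recall from the official guide, so treat them as assumptions and check them against the current documentation. The function wrapper is our own:

```shell
#!/bin/sh
# Add the Kasten chart repository and install Kasten into kasten-io.
# The repository URL and release name "k10" are assumptions taken from the
# official installation guide; verify them before running.
install_kasten() {
  helm repo add kasten https://charts.kasten.io/
  helm repo update
  helm install k10 kasten/k10 --namespace kasten-io --create-namespace
}

# Usage: install_kasten
```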

Admin user

You will need an administrative Kubernetes user to log in to Kasten.

To create an admin user, write a file called admin.yaml:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
  
---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard
  
---

apiVersion: v1
kind: Secret
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
  annotations:
    kubernetes.io/service-account.name: "admin-user"   
type: kubernetes.io/service-account-token

Then apply it:

kubectl apply -f admin.yaml
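
Note that admin.yaml places the ServiceAccount and Secret in the kubernetes-dashboard namespace. Namespaced resources can only be created in a namespace that exists, so on a cluster without the Kubernetes Dashboard the namespace has to be created first. A minimal sketch (the function name is our own):

```shell
#!/bin/sh
# Create the namespace only if it is missing, then apply the manifest.
# Piping a client-side dry run into "kubectl apply" makes the namespace
# creation safe to run repeatedly.
apply_admin_user() {
  kubectl create namespace kubernetes-dashboard --dry-run=client -o yaml \
    | kubectl apply -f -
  kubectl apply -f admin.yaml
}

# Usage: apply_admin_user
```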

Logging in to Veeam Kasten

As the installation instructions suggested, make the Kasten Dashboard accessible to you by running:

kubectl --namespace kasten-io port-forward service/gateway 8080:80

The URL will be: http://127.0.0.1:8080/k10/#/

Retrieve the long-lived access token of the admin user, and use it to log in to Kasten:

kubectl get secret admin-user -n kubernetes-dashboard -o jsonpath="{.data.token}" | base64 -d

Be careful: this token does not expire and grants access to the entire cluster, so keep it secure.
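
If you would rather not keep a non-expiring token around, a time-limited one can be requested instead. The sketch below assumes Kubernetes 1.24 or newer, where kubectl create token is available. The function name and the 24h default are our own choices:

```shell
#!/bin/sh
# Request a time-limited token for the admin-user ServiceAccount.
# Requires kubectl 1.24+; the duration defaults to 24 hours.
kasten_token() {
  kubectl --namespace kubernetes-dashboard create token admin-user \
    --duration="${1:-24h}"
}

# Usage: kasten_token 1h
```

The cluster may cap the requested duration, so the token you receive can be shorter-lived than what you asked for.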

Coming up Next in Part 2

In part 2, we will show how to actually back up and restore GitLab using Kasten.
