Allow configuration of project cloning approach

Context

I'm trying to make GDK work inside a remote development workspace.

Currently, when a workspace spins up, one or two init containers are injected into the pod:

editor (optional; injected in my case because gl/inject-editor is true)
project_cloner (currently an alpine/git image)

The project_cloner container has memory and CPU limits applied. These limits work well for a small project, but when cloning a large Git repository the container exceeds its memory limit and is killed, and you will see OOMKilled errors in the workspace pod.

You can see this happening in the following output:

`kubectl` shell output

❯ kubectl describe pod workspace-61728-2083197-b1wkpa-fd8b85464-9n7sm
Name:             workspace-61728-2083197-b1wkpa-fd8b85464-9n7sm
Namespace:        gl-rd-ns-61728-2083197-b1wkpa
Priority:         0
Service Account:  default
Node:             gke-rhook-068b113c-g-rhook-068b113c-g-15da185b-r80g/10.10.0.4
Start Time:       Fri, 02 Jun 2023 16:50:35 +0100
Labels:           agent.gitlab.com/id=61728
                  pod-template-hash=fd8b85464
Annotations:      config.k8s.io/owning-inventory: workspace-61728-2083197-b1wkpa-workspace-inventory
                  workspaces.gitlab.com/host-template: {{.port}}-workspace-61728-2083197-b1wkpa.workspace.sting-ray.za.net
                  workspaces.gitlab.com/id: 129
Status:           Pending
IP:               10.164.1.23
IPs:
  IP:           10.164.1.23
Controlled By:  ReplicaSet/workspace-61728-2083197-b1wkpa-fd8b85464
Init Containers:
  gl-cloner-injector-gl-cloner-injector-command-1:
    Container ID:  containerd://63f96ca08052839621db1661e7c3a162a60ab3ea7c319212444027c30f88d085
    Image:         alpine/git:2.36.3
    Image ID:      docker.io/alpine/git@sha256:66b210a97bc07bfd4019826bcd13a488b371a6cbe2630a4b37d23275658bd3f2
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
      -c
    Args:
      if [ ! -d '/projects/git-lab-rdev' ];
      then
        git clone --branch master https://gitlab.com/srza/git-lab-rdev.git /projects/git-lab-rdev;
        cd /projects/git-lab-rdev;
        git config user.name "${GIT_AUTHOR_NAME}";
        git config user.email "${GIT_AUTHOR_EMAIL}";
      fi
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    128
      Started:      Fri, 02 Jun 2023 17:50:28 +0100
      Finished:     Fri, 02 Jun 2023 17:50:58 +0100
    Ready:          False
    Restart Count:  15
    Limits:
      cpu:     500m
      memory:  128Mi
    Requests:
      cpu:     30m
      memory:  32Mi
    Environment:
      PROJECTS_ROOT:     /projects
      PROJECT_SOURCE:    /projects
      GIT_AUTHOR_NAME:   Raimund Hook
      GIT_AUTHOR_EMAIL:  rhook@gitlab.com
    Mounts:
      /projects from gl-workspace-data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-7r6xb (ro)
  gl-editor-injector-gl-editor-injector-command-2:
    Container ID:   
    Image:          registry.gitlab.com/gitlab-org/gitlab-web-ide-vscode-fork/web-ide-injector:1
    Image ID:       
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     500m
      memory:  128Mi
    Requests:
      cpu:     30m
      memory:  32Mi
    Environment:
      EDITOR_VOLUME_DIR:  /projects/.gl-editor
      EDITOR_PORT:        60001
      PROJECTS_ROOT:      /projects
      PROJECT_SOURCE:     /projects
    Mounts:
      /projects from gl-workspace-data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-7r6xb (ro)
Containers:
  tooling-container:
    Container ID:  
    Image:         registry.gitlab.com/gitlab-org/community-relations/contributor-success/gitlab-rd-web-ide-docker:stingrayza-add-gdk-no-entry
    Image ID:      
    Ports:         2222/TCP, 3000/TCP, 3005/TCP, 3010/TCP, 3808/TCP, 5000/TCP, 5778/TCP, 9000/TCP, 9122/TCP, 60001/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
    Command:
      /projects/.gl-editor/start_server.sh
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     6
      memory:  16384M
    Requests:
      cpu:     500m
      memory:  8192M
    Environment:
      EDITOR_VOLUME_DIR:  /projects/.gl-editor
      EDITOR_PORT:        60001
      PROJECTS_ROOT:      /projects
      PROJECT_SOURCE:     /projects
    Mounts:
      /projects from gl-workspace-data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-7r6xb (ro)
Conditions:
  Type              Status
  Initialized       False 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  gl-workspace-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  workspace-61728-2083197-b1wkpa-gl-workspace-data
    ReadOnly:   false
  kube-api-access-7r6xb:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                   From                     Message
  ----     ------                  ----                  ----                     -------
  Normal   Scheduled               63m                   default-scheduler        Successfully assigned gl-rd-ns-61728-2083197-b1wkpa/workspace-61728-2083197-b1wkpa-fd8b85464-9n7sm to gke-rhook-068b113c-g-rhook-068b113c-g-15da185b-r80g
  Normal   SuccessfulAttachVolume  63m                   attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-89d8169c-77e4-4db4-a4a6-bafb19651144"
  Normal   Pulled                  63m                   kubelet                  Successfully pulled image "alpine/git:2.36.3" in 218.229351ms (218.237709ms including waiting)
  Normal   Pulled                  62m                   kubelet                  Successfully pulled image "alpine/git:2.36.3" in 192.874535ms (192.888493ms including waiting)
  Normal   Pulled                  61m                   kubelet                  Successfully pulled image "alpine/git:2.36.3" in 229.153365ms (229.168848ms including waiting)
  Normal   Started                 61m (x4 over 63m)     kubelet                  Started container gl-cloner-injector-gl-cloner-injector-command-1
  Normal   Pulled                  61m                   kubelet                  Successfully pulled image "alpine/git:2.36.3" in 264.354124ms (264.362319ms including waiting)
  Normal   Pulling                 59m (x5 over 63m)     kubelet                  Pulling image "alpine/git:2.36.3"
  Normal   Created                 59m (x5 over 63m)     kubelet                  Created container gl-cloner-injector-gl-cloner-injector-command-1
  Normal   Pulled                  59m                   kubelet                  Successfully pulled image "alpine/git:2.36.3" in 238.758303ms (238.775917ms including waiting)
  Warning  BackOff                 3m2s (x238 over 62m)  kubelet                  Back-off restarting failed container
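
To pick the failing container out of output like the above, a small filter can help. This is only a sketch: the function name oom_killed_containers is made up for illustration, and it assumes the indentation that `kubectl describe pod` uses in the output shown here (container names indented by two spaces, fields by four, sub-fields by six), which is a human-readable layout rather than a stable interface.

```shell
# Hypothetical helper: reads `kubectl describe pod` output on stdin and prints
# the name of every container whose last termination reason was OOMKilled.
# Assumes the indentation seen in the output above.
oom_killed_containers() {
  awk '
    # Container names are indented by exactly two spaces and end with a colon.
    /^  [^ ].*:$/ { name = $1; sub(/:$/, "", name) }
    # Remember that the next Reason line belongs to "Last State".
    /^    Last State:/ { in_last = 1; next }
    in_last && /^      Reason:/ { if ($2 == "OOMKilled") print name; in_last = 0 }
    # Any other four-space field ends the "Last State" block.
    /^    [^ ]/ { in_last = 0 }
  '
}
```

Piping the describe output for the pod above through this function (kubectl describe pod workspace-61728-2083197-b1wkpa-fd8b85464-9n7sm | oom_killed_containers) should print only gl-cloner-injector-gl-cloner-injector-command-1, since the editor injector has not started yet.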

The memory consumption of git can be seen here:

Memory Screenshot

On a large repository like GitLab, the git clone can consume 1.6 GB of RAM.

The init container has a hard-coded memory limit of 128Mi, which is not nearly enough for a large repository.

My devfile, by contrast, already specifies significantly larger limits for the workspace container than the init container gets.

Devfile

schemaVersion: 2.2.0
components:
  - name: tooling-container
    attributes:
      gl/inject-editor: true
    container:
      image: registry.gitlab.com/gitlab-org/community-relations/contributor-success/gitlab-rd-web-ide-docker:stingrayza-add-gdk-no-entry
      memoryRequest: 8192M
      memoryLimit: 16384M
      cpuRequest: 500m
      cpuLimit: 6000m
      endpoints:
        - name: ssh-2222
          targetPort: 2222
        - name: http-3000
          targetPort: 3000

Proposal

This could be solved in at least one of three ways:

Make the init container that does the cloning optional. I would then have to clone manually in my IDE once the workspace has started. (In my particular case this might be best, as we would need to do the clone later anyway.)
Allow the user to specify limits for the init container, or have the init container inherit the limits of the actual workspace container.
Allow a shallow clone, which may take significantly fewer resources up front, at the cost of some later operations taking a bit longer.
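
As a rough illustration of the shallow-clone option, the cloner script could look something like the sketch below. The shallow_clone function and its arguments are hypothetical, not the current project_cloner implementation; only the git flags (--depth, --branch, fetch --unshallow) are standard Git. Whether this brings memory use under the limit for a given repository would need to be measured.

```shell
# Hypothetical variant of the cloner script: fetch only the tip commit of one
# branch instead of the full history.
# Arguments: repository URL, branch name, target directory.
shallow_clone() {
  repo_url="$1"
  branch="$2"
  target="$3"
  # Same guard as the existing script: skip if the project is already there.
  if [ ! -d "$target" ]; then
    # --depth 1 truncates history to one commit and implies --single-branch.
    git clone --depth 1 --branch "$branch" "$repo_url" "$target"
  fi
}

# Later, from inside the running workspace, the full history can be restored
# on demand, with the larger limits of the workspace container:
#   git -C /projects/git-lab-rdev fetch --unshallow
```

For the repository in this issue the call would be shallow_clone https://gitlab.com/srza/git-lab-rdev.git master /projects/git-lab-rdev. The trade-off is the one noted above: history-dependent operations such as log or blame past the tip stay limited until the repository is unshallowed.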

Related Issues

Expose settings to control the size of memory a... (&12807), which is about exposing settings for the memory and volume size of the project cloner init container
BUG: Errors are not properly handled when proje... (#471531 - closed), which relates to issues on the agent side when cloning fails
Increase hardcoded defaults for project cloner ... (#477834 - closed), which is a stopgap solution to workspace failures, by increasing the hardcoded project cloner memory and volume size

Edited Aug 07, 2024 by Chad Woolley