GAP Documentation

Migrating applications to multiregion pipeline

We strongly encourage you to complete the migration in one sitting. In our experience it takes no more than one hour. Running the old and new pipelines at the same time is not supported and can lead to unexpected synchronization problems.
The US production cluster (p-us1-01) will be enabled by the end of April 2026 at the latest. This may happen earlier; we will make an announcement before enabling it.

Migration steps:

  1. Add the new cluster(s) to your local kubeconfig so they can be used in, e.g., k9s (if you are not yet authenticated to GCP locally, follow the related points here):

    gcloud container clusters get-credentials s-us1-01-gap-primary --region us-east4 --project ems-gap-s-us1-01 --dns-endpoint
    gcloud container clusters get-credentials p-us1-01-gap-primary --region us-east4 --project ems-gap-p-us1-01 --dns-endpoint
    
  2. Remove autoSyncProduction from your gap.yaml if you have it set and commit this change.

    Disable autoSyncProduction before the migration to avoid situations that look confusing at first glance, where the production app is out of sync after the migration. For example, while the old applications have not been deleted yet, they can take back ownership of the resources and redeploy an old state.

    The new pipeline also doesn’t support auto-sync to production environments for compliance reasons, so leaving the setting in place would only cause confusion: it becomes a no-op after the migration.

  3. Use the usePrebuiltImage feature of gap.yaml if not doing so already:

    • Add a github workflow that builds your application, similarly to the example defined here
    • Add usePrebuiltImage: true to the root of the gap.yaml.
  4. If you were already using usePrebuiltImage before:

    • add the github actions workflow level env var IMAGE_TAG with the value of, for example, the github.sha built-in variable:
    env:
      IMAGE_TAG: "${{ github.sha }}"
    
    • add the IMAGE_NAME env var as below, so it is pushed to the new registry and repository:
    env:
      IMAGE_NAME: europe-west3-docker.pkg.dev/sap-ems-base-infra-package-p/gap-images/<your-application-name>
    
    • update the following in your image building step in the github actions workflow:
      • your tag value in docker build to use the workflow level environment variable set above. Notice that the latest tag was removed, as the new image repositories have tag immutability enabled:
        run: docker build . --file Dockerfile --tag ${{ env.IMAGE_NAME }}:${{ env.IMAGE_TAG }}
        
      • update the docker registry authentication command before docker push at the last stage:
        gcloud auth configure-docker europe-west3-docker.pkg.dev
        
  5. Migrate to new image specification format:

    • Use new image specification format with the below example in gap.yaml. The registry is automatically calculated for the region, if not set explicitly. The tag will be automatically generated by the update-image-tag step:
    image:
        repository: sap-ems-base-infra-package-p/gap-images/<your-application-name>
    
  6. Add GAP-Workflow application to your repository: https://github.com/apps/gap-workflows/installations/107694084

    • If your repo has branch protection policies that require PRs or block pushes to master/main, add the GAP-Workflow application to the Bypass list on GitHub.
  7. If you have, or will need, environment-level override directories in the gap folder, add them under the new environment names. Configuration that is common across instances can go into the staging-defaults and production-defaults folders instead. Keep any existing staging and/or production directories for now:

    • s-eu1-01 (the current EU staging; if a staging folder exists, copy it to this name and do not delete the original yet)
    • p-eu1-01 (the current EU production; if a production folder exists, copy it to this name and do not delete the original yet)
    • s-us1-01 (for the US staging)
    • p-us1-01 (for the US production)
    • staging-defaults (common configs across staging instances)
    • production-defaults (common configs across prod instances)

    Your gap folder should look like this during the migration (old directories kept until step 15):

    gap/
    ├── gap.yaml
    ├── staging/                   # old — keep until migration is complete (step 15)
    │   └── gap.yaml
    ├── production/                # old — keep until migration is complete (step 15)
    │   └── gap.yaml
    ├── staging-defaults/          # (optional) overrides for all staging clusters
    │   └── gap.yaml
    ├── production-defaults/       # (optional) overrides for all production clusters
    │   └── gap.yaml
    ├── s-eu1-01/                  # new EU staging (copied from staging/)
    │   └── gap.yaml
    ├── p-eu1-01/                  # new EU production (copied from production/)
    │   └── gap.yaml
    ├── s-us1-01/                  # new US staging
    │   └── gap.yaml
    └── p-us1-01/                  # new US production
        └── gap.yaml
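
The EU directories in the layout above can be prepared with a couple of copy commands. The following is a minimal sketch, assuming the overrides live directly under the gap folder; the `prepare_env_dirs` function name is illustrative and not part of any GAP tooling.

```shell
#!/bin/sh
# Sketch: copy the old env override directories to the new instance names.
# Assumption: overrides live in <gap_dir>/staging and <gap_dir>/production.
prepare_env_dirs() {
  gap_dir="$1"
  # Copy (do not move) so the old directories stay in place during the migration.
  if [ -d "$gap_dir/staging" ] && [ ! -e "$gap_dir/s-eu1-01" ]; then
    cp -R "$gap_dir/staging" "$gap_dir/s-eu1-01"
  fi
  if [ -d "$gap_dir/production" ] && [ ! -e "$gap_dir/p-eu1-01" ]; then
    cp -R "$gap_dir/production" "$gap_dir/p-eu1-01"
  fi
}
```

Calling `prepare_env_dirs gap` from the repository root creates the two EU directories if they are missing. The US directories are left out on purpose, since their content depends on the application.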
    

If you don’t commit these before adding the application to gap-registry, the first manifest generation could lead to dangling resources.
The migration is a great opportunity to review and rationalize your resource requests for each environment. Staging apps often have significantly inflated requests relative to their actual usage. Check your apps on our Grafana dashboards (EU staging and EU production) and set appropriate values per cluster. See the resource details guide for practical guidance. We are currently working on logging and monitoring solutions for the new multi-region clusters.
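
Before continuing, the outcome of steps 2, 3 and 7 can be checked with a small script. This is only a sketch: it relies on simple line matching rather than real YAML parsing, and the `preflight` function name is hypothetical.

```shell
#!/bin/sh
# Sketch of a pre-flight check for steps 2, 3 and 7.
# Assumption: line-based matching is good enough for a typical gap.yaml.
preflight() {
  gap_dir="$1"
  rc=0
  # Step 2: autoSyncProduction must be removed.
  if grep -Eq '^[[:space:]]*autoSyncProduction:' "$gap_dir/gap.yaml"; then
    echo "FAIL: autoSyncProduction is still set in $gap_dir/gap.yaml"
    rc=1
  fi
  # Step 3: usePrebuiltImage: true must be at the root (no indentation).
  if ! grep -Eq '^usePrebuiltImage:[[:space:]]*true' "$gap_dir/gap.yaml"; then
    echo "FAIL: usePrebuiltImage: true is missing in $gap_dir/gap.yaml"
    rc=1
  fi
  # Step 7: every old override directory needs its copy under the new name.
  for pair in staging:s-eu1-01 production:p-eu1-01; do
    old=${pair%%:*}
    new=${pair##*:}
    if [ -d "$gap_dir/$old" ] && [ ! -d "$gap_dir/$new" ]; then
      echo "FAIL: $gap_dir/$old exists but $gap_dir/$new does not"
      rc=1
    fi
  done
  return "$rc"
}
```

Running `preflight gap` prints one line per problem and returns a non-zero status if anything is missing, so it can also be used as a guard in a local script.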

  8. If you need to make egress calls to the outside (non-GCP), please refer to this doc.

  9. Add a GitHub workflow that updates the image tag in gap.yaml. Include the paths-ignore filter so that the update-image-tag job below does not run when only the gap folder changed, since no new image is built in that case:

    Make sure you replace the following placeholders:

    • <name-of-container-building-step>
    • <name-of-the-main-branch>
      on:
        push:
          paths-ignore:
            - 'gap/**'
    
      jobs:
        update-image-tag:
          timeout-minutes: 10
          permissions:
            contents: "write"
            id-token: "write"
          needs:
            - <name-of-container-building-step>
          runs-on: ubuntu-latest
          if: ${{ github.ref == 'refs/heads/<name-of-the-main-branch>' && github.actor != 'gap-workflows[bot]' }}
          steps:
          - uses: actions/create-github-app-token@v3
            id: app-token
            with:
              client-id: ${{ secrets.GAP_WORKFLOW_APP_ID }}
              private-key: ${{ secrets.GAP_WORKFLOW_APP_PEM }}
          - uses: "actions/checkout@v6"
            with:
              token: ${{ steps.app-token.outputs.token }}
          - name: Get GitHub App User ID
            id: get-user-id
            run: echo "user-id=$(gh api "/users/${{ steps.app-token.outputs.app-slug }}[bot]" --jq .id)" >> "$GITHUB_OUTPUT"
            env:
              GH_TOKEN: ${{ steps.app-token.outputs.token }}
          - name: Update image tag
            run: |
              # strenv keeps the tag a string even if it looks numeric
              yq -i '.image.tag = strenv(IMAGE_TAG)' gap/gap.yaml
          - name: Commit and push changes
            id: commit_gap_yaml
            run: |
              git config --global user.name '${{ steps.app-token.outputs.app-slug }}[bot]'
              git config --global user.email '${{ steps.get-user-id.outputs.user-id }}+${{ steps.app-token.outputs.app-slug }}[bot]@users.noreply.github.com'
              git add gap/gap.yaml
              git commit -m "Update image tag to ${{ env.IMAGE_TAG }}"
              git push origin HEAD:${{ github.ref_name }}
    
  10. Remove the submit-build step (sometimes called gap-deploy) from the current workflow.

    Using the old and new workflow at the same time is not supported as it can lead to unexpected problems. It’s much easier to just clean it up.

  11. Add an entry to https://github.com/emartech/gap-registry via a PR; the GAP team will approve and merge it. If you cannot create a PR, reach out to the GAP team on the #infra-support Slack channel or open a ticket in the team’s Jira project.

    • create the app.yaml with the below specification: your-namespace/your-app-name/app.yaml (e.g cloud-platform/gap-docs/app.yaml)
      • inside app.yaml:
        • app
          • name: name of your app, same as in gap.yaml name
          • repo: url to your app repo, such as https://github.com/emartech/gap-docs.git
          • labels: (optional)
            • group: (string, optional) the value set for the group key can be used to group together Applications on Argo CD Applications view UI (if set on gap.yaml with the applicationLabels field, please move it over to the app.yaml)
          • path: (optional, defaults to “gap”) if there are multiple gap folders in the app repo, name of the gap folder in the root for the application
          • autoSyncToStaging: (optional, defaults to true) if using preDeploy, set to false for the first sync in order to only sync ServiceAccount first. The field value only applies to staging environments.
          • instanceConfig: (optional) per-instance overrides
            • <instance-key>: (e.g: s-us1-01)
              • disable: true (if set to true, the application will not be deployed on the specified instance. If it is set to true after the app has already been deployed, the app must also be deleted in Argo CD with the default Foreground propagation policy.)
          • targetRepoRevision: (optional, defaults to "HEAD", which means the most current revision in the repo) can be set to a tag/branch/commit

    Example:

    app:
      name: <my-application>
      repo: https://github.com/emartech/<my-app-repo-name>.git
    
Make sure you have committed your cluster-specific overrides in your application’s gap folder before adding the application to gap-registry; otherwise the first manifest generation could lead to dangling resources. See step 7 for more details.
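
For reference, an app.yaml that uses the optional fields listed above could be created as follows. This is a sketch: the namespace, application name and group value are placeholders, the nesting follows the field list in this step, and the `write_app_yaml` helper is hypothetical.

```shell
#!/bin/sh
# Sketch: write an app.yaml that uses the optional fields described above.
# All values (namespace, app name, group) are placeholders, not real entries.
write_app_yaml() {
  target_dir="$1"   # e.g. my-namespace/my-app inside a gap-registry checkout
  mkdir -p "$target_dir"
  cat > "$target_dir/app.yaml" <<'EOF'
app:
  name: my-application
  repo: https://github.com/emartech/my-app-repo-name.git
  labels:
    group: my-team            # optional: groups apps in the Argo CD UI
  path: gap                   # optional: defaults to "gap"
  autoSyncToStaging: true     # optional: defaults to true
  instanceConfig:
    s-us1-01:
      disable: true           # do not deploy to US staging
  targetRepoRevision: HEAD    # optional: tag, branch or commit
EOF
}
```

Only name and repo are required; every other field can be left out to get the defaults described above.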
  12. If needed, make sure workload identity is configured (e.g. to access your databases) with the help of this guide.
  13. Don’t forget to create the secrets for your application, for example with gap-cli config:init.
  14. Verify that the new applications have appeared in Argo CD for the new cluster(s). It can take a few minutes for them to show up correctly. Unless autoSyncToStaging was set to false, the staging environments will be auto-synced.
    • Check the status for the new staging application. If autoSyncToStaging was not set to false, the staging app will take ownership of your old app’s resources automatically.
    • After checking that the diff for the production app seems good (nothing unexpected, mostly metadata changes) sync the new application (if using preDeploy, sync the ServiceAccount first), so it takes ownership of the old resources.
      • If you use preDeploy and forgot to set autoSyncToStaging to false for the first staging run, or synced all production resources without syncing the ServiceAccount first, click the button that displays the sync in progress, and on the page that opens click Terminate.

SharedResourceWarning will show up on Argo CD in the App Conditions. These can safely be ignored; they will resolve after the sync completes. You see them because you already have an application with the old name and workflow.

If the app shows as OutOfSync after the initial sync but auto-sync does not trigger, this is expected behaviour. Argo CD will not re-attempt automated sync for a commit SHA it has already successfully synced against, even if chart metadata or generation fields have since changed. Manually sync the application once from the Argo CD UI to resolve it.

  15. At this point, delete the staging and production env-level override directories from the gap folder.
  16. usePrebuiltImage is now a no-op unless you roll back, so you can remove it.
  17. Ask the GAP team to delete the no longer necessary directory in the gap-applications repository by creating a ticket in the GAP team’s Jira project with the template below.

    Important: Avoid creating separate tickets for each application. Ideally, the whole namespace is removed with a single ticket once all apps are migrated. However, if you are migrating multiple applications and the rollout is spread out over time, do open a ticket per app rather than waiting, because running duplicate Argo CD apps in parallel for extended periods is not recommended.

Dear GAP team,

Please remove the following folder(s) from the gap-applications repository for our migration process:
- <your-namespace>
or
- <your-namespace>/<your-migrated-app>
- <your-namespace>/<your-other-migrated-app>
This namespace/application has been successfully migrated and is no longer needed in the gap-applications repository.

and please delete our application from Argo CD with the non-cascading option.

Cheers,
<your-name>

Rollback to the old pipeline:

  1. Set app.autoSyncToStaging field to false in the app.yaml of the app that is being rolled back to the old pipeline.
  2. (If using env level gap folder overrides) Bring back the staging and/or production directories in the gap folder for the proper overrides to apply.
  3. Remove the paths-ignore filter from the on: trigger in the workflow. It might look like this:
    on:
      push:
        paths-ignore:
          - 'gap/**'
    
  4. Add back the latest tag in your docker build command:

    run: docker build . --file Dockerfile --tag ${{ env.IMAGE_NAME }}:latest --tag ${{ env.IMAGE_NAME }}:${{ env.IMAGE_TAG }}

  5. Comment out the update-image-tag workflow job.
  6. Put back the submit-build step which was deleted, and sync the applications that it brings up.

Edge cases and troubleshooting:

Monorepo with multiple gap yaml files and applications:

If you have a monorepo with multiple gap.yaml files and multiple workflows, each building its own application, you can end up with a race condition in the update-image-tag step.

To rectify this you can use a retry mechanism:

      - uses: nick-fields/retry@v3
        name: Commit and push changes
        id: commit_gap_yaml
        with:
          timeout_minutes: 10
          max_attempts: 10
          retry_on_exit_code: 1
          on_retry_command: echo "Retrying commit and push due to parallel GAP deployment changes..."
          command: |
            git pull --rebase origin ${{ github.ref_name }}
            git config --global user.name '${{ steps.app-token.outputs.app-slug }}[bot]'
            git config --global user.email '${{ steps.get-user-id.outputs.user-id }}+${{ steps.app-token.outputs.app-slug }}[bot]@users.noreply.github.com'
            git add gap-workers/gap.yaml
            git commit -m "Update image tag to ${{ env.IMAGE_TAG }}"
            git push origin HEAD:${{ github.ref_name }}
Each application in a monorepo needs its own entry in the gap-registry. Create a separate <your-namespace>/<your-app-name>/app.yaml file for each application, and make sure each entry uses the correct app.path to point to its respective gap folder (e.g. app.path: gap-workers).
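
The same retry idea can be expressed in plain shell, which may help when reasoning about what happens on each attempt. The sketch below is not the implementation of the retry action used above. `retry` and `commit_and_push` are hypothetical helpers, and `BRANCH` and `IMAGE_TAG` are assumed to be set by the caller.

```shell
#!/bin/sh
# Sketch: run a command up to <max_attempts> times, pausing <delay> seconds between tries.
retry() {
  max_attempts="$1"
  delay="$2"
  shift 2
  attempt=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed, retrying..." >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Hypothetical helper: commit the updated gap.yaml, rebase onto the latest
# remote state, then push. Assumes BRANCH and IMAGE_TAG are set and that
# gap-workers/gap.yaml has already been updated.
commit_and_push() {
  git add gap-workers/gap.yaml
  # Commit only when something is staged; an earlier attempt may have committed already.
  git diff --cached --quiet || git commit -m "Update image tag to $IMAGE_TAG" || return 1
  git pull --rebase origin "$BRANCH" || return 1
  git push origin "HEAD:$BRANCH"
}
```

A call such as `retry 10 5 commit_and_push` would then try up to ten times. Committing before the rebase keeps the working tree clean for `git pull --rebase`, and the staged-changes check avoids a failing empty commit on later attempts.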

Applications using preDeploy:

If your application uses preDeploy, the first sync requires extra care because Argo CD needs to create the ServiceAccount before it can run the pre-deploy job and sync the rest of the resources.

Before the first sync:

  • In your app.yaml in the gap-registry, set autoSyncToStaging: false. This prevents Argo CD from auto-syncing all resources at once before the ServiceAccount exists.

First sync (staging and production):

  1. Manually sync only the ServiceAccount resource first, using the selective sync option in the Argo CD UI. (Click on the ‘…’ on the box of the SA resource and select Sync)
  2. Once the ServiceAccount is in place, sync the remaining resources.

If you forgot to set autoSyncToStaging: false (staging) or synced everything at once without the ServiceAccount first:

  • Click on the sync operation that is in progress in the Argo CD UI, and on the page that appears click Terminate.
  • Then follow the steps above from the beginning.

After the first successful sync, you can (and probably should) set autoSyncToStaging: true again in your app.yaml.

Applications not building their own image:

So far we have seen only very few applications that do not build their own image but use one provided by another team (e.g. contact-data-deletion-v2). You should only use this approach if the image is updated infrequently. In such cases the migration steps are mostly the same, with the following differences:

  • You will need to remove the submit-build step, but you don’t need to add the update-image-tag job, as no image is built in the workflow.
  • You will need to specify the image repository and tag in the gap.yaml file. Eg.:
    image:
      repository: sap-ems-base-infra-package-p/gap-images/contact-data-deletion-v2
      tag: "2.7" # make sure you use the quotes for such tags, otherwise it will be misinterpreted as a float number and not a string, which causes the schema validation to fail and the pipeline to break
    
  • When using patches, the full image path should be used, because there is no registry interpolation. E.g.: image: europe-west3-docker.pkg.dev/sap-ems-base-infra-package-p/gap-images/contact-data-deletion-v2:2.7 instead of just image: contact-data-deletion-v2:2.7
  • You will need to change the image tag manually in the gap.yaml file for every new image version, when notified by the team that provides the image to you.
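
Because the tag is maintained by hand in this setup, a quick check for the quoting pitfall mentioned above can be useful. The following sketch assumes the tag sits on its own `tag:` line. Both function names are hypothetical, and the registry host is the one used in the examples of this guide.

```shell
#!/bin/sh
# Sketch: flag image tags that YAML would parse as a number instead of a string.
# Assumption: the tag sits on its own "tag:" line, as in the example above.
has_unquoted_numeric_tag() {
  grep -Eq '^[[:space:]]*tag:[[:space:]]*[0-9]+(\.[0-9]+)?[[:space:]]*(#.*)?$' "$1"
}

# Build the full image reference needed in patches, where no registry
# interpolation happens. The registry host is taken from this guide's examples.
full_image_ref() {
  registry="europe-west3-docker.pkg.dev"
  printf '%s/%s:%s\n' "$registry" "$1" "$2"
}
```

For example, `has_unquoted_numeric_tag gap/gap.yaml && echo "quote the image tag"` warns about an unquoted numeric tag, while `full_image_ref <repository> <tag>` prints the value to use in a patch.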