# GitLab Pipelines
As seen in the CI overview, developers can easily iterate on their changes both on their local machines and in the GitLab pipelines. In this section we cover the versatility and possible uses of these pipelines so you can get the most out of them.
Pipelines consist of stages and jobs. Stages are logical groupings of jobs. Our pipeline does not rely on stages to sequence the jobs; instead, it uses the Directed Acyclic Graph (DAG) feature from GitLab. This allows jobs to start executing as soon as their dependencies are satisfied, which improves parallelism and reduces the runtime of our pipelines.
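GitLab expresses such a DAG with the `needs` keyword. The snippet below is a minimal, illustrative sketch and not an excerpt from our actual `.gitlab-ci.yml`; the job names match jobs described later in this section, but the `script` contents are placeholders.

```yaml
# Minimal sketch of a DAG in GitLab CI: `needs` lets a job start as soon as the
# listed jobs finish, regardless of stage ordering.
stages:
  - build
  - "build:image"
  - system-test

build-cta-rpm:
  stage: build
  script:
    - ./build_rpm.sh                        # placeholder build step

build-ctageneric-from-local-rpms:
  stage: "build:image"
  needs: ["build-cta-rpm"]                  # starts right after the RPM build finishes
  script:
    - ./build_image.sh                      # placeholder image build step

test-client:
  stage: system-test
  needs: ["build-ctageneric-from-local-rpms"]
  script:
    - ./run_system_test.sh test-client      # placeholder test invocation
```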
Below you can find the DAG of the default pipeline running in CTA:
## Pipeline stages and jobs
A job is the building block of the CI and jobs are logically grouped into stages. Our CI consists of the following stages and jobs, which may or may not be executed depending on the type of pipeline:
- stage: `prepare`
    - `modify-project-json`: on certain pipeline types, dependency versions change. This job modifies the `project.json` by updating those dependency versions so that they are used in the rest of the build/deploy process.
- stage: `validate`
    - `validate-catalogue-schema-version`: performs various consistency checks between the schema versions indicated in the `project.json` and those in the `cta-catalogue-schema` submodule.
    - `validate-pipeline-variables`: checks that the input pipeline variables are consistent with each other. It also checks e.g. that if a custom dependency version is provided, said version is indeed available.
    - `validate-project-json`: ensures the integrity, completeness and correctness of the `project.json` file in the root of the repository.
- stage: `lint`
    - `cppcheck`: lightweight static analysis tool. For cppcheck, a number of errors are suppressed based on the `.cppcheck-suppression` file.
    - `clang-format-report`: checks the clang-format compliance of the lines that differ from the main branch and generates a git patch.
    - `clang-format-apply`: applies the git patch generated by the `clang-format-report` job by creating a new commit on the branch this job was executed on. This job can only be triggered manually.
    - SonarCloud (external): to complement this stage, we also analyze the project with SonarCloud; the analysis results can be found on sonarcloud.io. It is not executed synchronously with the pipeline because the analysis is heavy and takes too long to be integrated into the developer workflow. To run it we use a GitHub mirror of the CTA repository that does the analysis for every commit on the main branch. You should check the results of the analysis after your commits reach the main branch to verify that the committed code did not introduce any new issues.
- stage: `build`
    - `build-cta-srpm`: builds the `srpm`s for the current commit. The output is used by the `rpm` build stage.
    - `build-cta-rpm`: builds the `rpm`s for the current commit. The output can be used to build a container image for the development setup or uploaded to a repository as a tagged CTA version.
    - `export-docs`: converts the man pages of the project to markdown and uploads them as an artifact to be consumed by the eoscta-docs project.
- stage: `build:image`
    - `build-ctageneric-from-local-rpms`: builds and uploads the `ctageneric` container image from the RPMs generated in the build stage.
    - `build-ctageneric-from-remote-rpms`: builds and uploads the `ctageneric` container image from the RPMs of a specific CTA version available in the testing repo.
- stage: `test`
    - `test-cta-valgrind`: runs valgrind tests to check for memory leaks.
    - `integration-test-cta`: tests executable invocation and CTA's threading code.
    - `unit-test-postgresql`: series of CTA Catalogue unit tests run against a live Postgres DB.
    - `unit-test-oracle`: series of CTA Catalogue unit tests run against a live Oracle DB.
    - `test-cta-release`: checks that the `cta-release` RPM works as expected.
- stage: `system-test`
    - `test-client`: tests REST API compliance; file immutability; archival, retrieval, eviction, retrieval abort and deletion of 10,000 files; multiple retrieve test; idempotent prepare; deletion on `closew` errors; eviction before archival; the EOS evict command; and ObjectStore queue cleanup.
    - `test-client-gfal2`: archival, retrieval, eviction and deletion of 2,000 files using the gfal2 library (the core library for FTS): 1,000 files are tested against the XRootD protocol and the other 1,000 against the HTTP protocol. It also checks for activity passing through the gfal2 stack.
    - `test-repack`: tests of repacking workflows.
    - `test-cta-admin`: exercises and tests the different `cta-admin` commands.
    - `test-liquibase-update`: tests the upgrade and downgrade of the different schema versions of the Catalogue.
    - `test-external-tape-formats`: tests the support of tapes configured by other tape software.
    - `test-regression-dCache`: dCache regression tests.
    - `stress-test`: runs the stress test on a dedicated runner.
- stage: `review`
    - `danger-review`: runs the Danger bot. Only triggered in merge requests.
    - `danger-review-gate`: job that ensures the merge request is blocked (by failing the pipeline) if the Danger checks fail. The `danger-review` job should run as early as possible, but it should not immediately auto-cancel the entire pipeline on failure. Therefore, the responsibility of failing the pipeline on a `danger-review` failure is delegated to `danger-review-gate`, which runs at the end of the pipeline (a sketch of this pattern is shown after this list).
- stage: `release`
    - `changelog-preview`: produces a preview of the changelog based on all the commits between two commits (by default, the latest commit and the latest tag).
    - `changelog-update`: generates a merge request with an update to the `CHANGELOG.md` file.
    - `internal-release-cta`: publishes the RPMs to a CTA internal repo, making them available to be deployed in the stress tests and later stages.
    - `public-release-cta-unstable`: publishes the RPMs to the public unstable repository.
    - `public-release-cta-testing`: publishes the RPMs to the public testing repository.
    - `public-release-cta-stable`: publishes the RPMs to the public stable repository.
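The `danger-review`/`danger-review-gate` split described in the review stage follows a common GitLab pattern: the early job is allowed to fail without cancelling the pipeline, and a late gate job turns that failure into a pipeline failure. The snippet below is a sketch of how such a pattern can be wired up; it is not the actual job definitions, and the `run_danger.sh` script and `danger_status` artifact name are invented for illustration.

```yaml
# Illustrative sketch of the review/gate pattern (not the real job definitions).
danger-review:
  stage: review
  allow_failure: true          # keep the pipeline going even if this job errors out
  script:
    # Record the Danger result in an artifact instead of failing the job outright.
    # run_danger.sh and danger_status are placeholders for this example.
    - ./run_danger.sh && echo "ok" > danger_status || echo "failed" > danger_status
  artifacts:
    paths:
      - danger_status

danger-review-gate:
  stage: review
  needs: ["danger-review"]     # in practice this gate runs at the end of the pipeline
  script:
    # Fail the pipeline here if Danger reported a problem earlier.
    - test "$(cat danger_status)" = "ok"
```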
## System tests organization and design constraints
For system tests we have 3 runners at our disposal, and each runner can only run one test at a time. The current run time of the longest system test (`test_client.sh`) is around 20 minutes. Whenever a test is run, a fresh virtual environment is created and then destroyed after the test; creating the environment has an overhead of ~2 minutes, while destroying it is much faster. This is important to ensure consistency and reproducibility.
Ideally, the tests should be logically grouped together: related workflows should be tested within the same environment, which helps to better understand the source of a failure. In any case, the logs produced by the tests should make clear what was being tested and why it failed.
This ideal is not always achievable, as it is crucial to find the right balance between the number of tests and their execution length in order to minimize overall pipeline time. Having a single test containing everything leads to resource under-utilization and longer pipelines, especially when there are not many developers pushing to the repository at the same time; splitting the tests too much creates an excessive amount of environment-setup overhead and wasted time, especially when many pipelines are being executed at the same time.
## Triggering Pipelines
We introduced the logical concept of a pipeline type to our CI to address the growing requirements for additional functionality and regression tests. Currently we have the following types of pipelines:
- `DEFAULT`: the full pipeline that runs most of the jobs, including validation, linting, building and testing. This runs automatically on merge requests and on pushes to `main`.
- `SYSTEM_TEST_ONLY`: a subset of the pipeline that skips the build stage and instead uses the latest Docker image built by `main`. Useful if only the system tests were changed.
- `REGR_AGAINST_CTA_MAIN`: runs the default pipeline, but with possibly different versions of XRootD or EOS.
- `REGR_AGAINST_CTA_VERSION`: builds and uploads a Docker image of the provided (existing) CTA version and tests it against a provided version of EOS.
- `IMAGE_FROM_CTA_VERSION`: builds and uploads a Docker image of the provided (existing) CTA version.
These pipeline types all accept various pipeline variables that can be customized. The variables that can be (safely) adjusted are listed, with their corresponding descriptions, when you trigger a pipeline. As stated above, the `validate-pipeline-variables` job should warn you early if there is something wrong with the variables you passed. Note that it is difficult for these checks to be exhaustive, so always double-check what you pass in.
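Inside the CI configuration, jobs can react to the selected pipeline type through `rules`. The sketch below illustrates the idea only; the variable name `PIPELINE_TYPE` and the rule details are assumptions for illustration, not the exact contents of our `.gitlab-ci.yml`.

```yaml
# Illustrative sketch only: PIPELINE_TYPE and these rules are assumptions,
# not the exact contents of the CTA CI configuration.
build-cta-rpm:
  stage: build
  rules:
    # Skip the build when only the system tests are being exercised.
    - if: '$PIPELINE_TYPE == "SYSTEM_TEST_ONLY"'
      when: never
    - when: on_success
  script:
    - ./build_rpm.sh                       # placeholder

test-client:
  stage: system-test
  rules:
    - when: on_success                     # system tests run for every pipeline type
  script:
    - ./run_system_test.sh test-client     # placeholder
```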
## Scheduled Pipelines
Certain jobs and workflows are unnecessary to run in every single pipeline, or take too long to do so. However, these should still be run periodically. For this we use scheduled pipelines, which can be checked at: https://gitlab.cern.ch/cta/CTA/-/pipeline_schedules. The main scheduled pipelines we currently have:
- Run the CI with the Postgres scheduler.
- Run the CI with Ninja as the build generator instead of Unix Makefiles.
- Run the CI with Valgrind enabled.
- Run the CI with Oracle support disabled.
These scheduled pipelines help keep the developer workflow as streamlined as possible while checking that new changes do not break compatibility with different configurations. The drawback of this approach is that developers must check whether their changes caused any of the nightly pipelines to fail. Any failure of the nightly pipelines should be taken seriously; if such a failure occurs, a ticket should be created for it so that it can be fixed.
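Scheduled pipelines are distinguishable in the CI configuration through the predefined `CI_PIPELINE_SOURCE` variable, and each schedule can set its own variables to select a configuration. The sketch below shows one way such a nightly variant could be selected; `SCHEDULED_VARIANT`, `BUILD_GENERATOR` and the build commands are invented for this example and do not reflect the actual CTA configuration.

```yaml
# Illustrative sketch only: CI_PIPELINE_SOURCE is a predefined GitLab variable,
# but SCHEDULED_VARIANT, BUILD_GENERATOR and the build commands are invented.
# Each schedule in the GitLab UI would set SCHEDULED_VARIANT to e.g.
# "postgres-scheduler", "ninja", "valgrind" or "no-oracle".
build-cta-rpm:
  stage: build
  variables:
    BUILD_GENERATOR: "Unix Makefiles"
  rules:
    # Nightly schedule that builds with Ninja instead of Unix Makefiles.
    - if: '$CI_PIPELINE_SOURCE == "schedule" && $SCHEDULED_VARIANT == "ninja"'
      variables:
        BUILD_GENERATOR: "Ninja"
    # All other pipelines keep the default generator.
    - when: on_success
  script:
    - cmake -G "$BUILD_GENERATOR" ..       # placeholder build steps
    - cmake --build .
```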