GitLab CI Integration

Overview

Once configured for a repository, the GitLab CI integration analyzes project dependencies from a lockfile.
This can happen in a branch pipeline as a result of a commit or in a Merge Request (MR) pipeline. The
results appear in the pipeline logs and, for MR pipelines, as a note (comment) on the MR.
The CI job will return an error (i.e., fail the build) if any of the newly added/modified dependencies from the MR
fail to meet the project risk thresholds for any of the five Phylum risk domains:

  • Vulnerability (aka vul)
  • Malicious Code (aka mal)
  • Engineering (aka eng)
  • License (aka lic)
  • Author (aka aut)

See Phylum Risk Domains documentation for more detail.

NOTE: It is not enough to have the total project threshold set. Individual risk domain threshold values must be
set, either in the UI or with phylum-ci options, in order to enable analysis results for CI. Otherwise, the risk
domain is considered disabled and the threshold value used will be zero (0).

There will be no note if no dependencies were added or modified for a given MR.
If one or more dependencies are still processing (no results available), then the note will make that clear and
the CI job will only fail if dependencies that have completed analysis results do not meet the specified project
risk thresholds.

Prerequisites

The GitLab CI environment is primarily supported through the use of a Docker image.
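The prerequisites for using this image, as detailed in the Variables section, include:

  • A GitLab token with API access (only required in merge request pipelines)
  • A Phylum token with API access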

Configure .gitlab-ci.yml

Phylum analysis of dependencies can be added to existing CI workflows or run on its own with this minimal configuration:

stages:
  - QA

analyze_MR_with_Phylum:
  stage: QA
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  image: phylumio/phylum-ci:latest
  variables:
    GIT_STRATEGY: clone
    GITLAB_TOKEN: $GITLAB_TOKEN_VARIABLE_OR_SECRET_HERE
    PHYLUM_API_KEY: $PHYLUM_TOKEN_VARIABLE_OR_SECRET_HERE
  script:
    - phylum-ci

This configuration contains a single Quality Assurance stage named QA and will only run in merge request pipelines.
It does not override any of the phylum-ci arguments, which are all either optional or default to secure values.
Let's take a deeper dive into each part of the configuration:

Stage and Job names

The stage and job can be named differently, or the job can be added to an existing stage.

stages:
  - QA  # Name this what you like

analyze_MR_with_Phylum:  # Name this what you like
  stage: QA  # Change the stage where the job will run here

Job control

Choose when to run the job. The Phylum integration can run in the context of branch pipelines or merge request
pipelines, but merge request pipelines are given preferential treatment, so care should be taken to
avoid duplicate pipelines.

There are several ways to accomplish this goal. The first is to create a rule at the job level to specify that
the job should only run for merge request pipelines. Branch pipelines are the default type and will run when new
commits are pushed to a branch. If the desire is to only run the job for branch pipelines, then no rule limiting
the pipeline source should be specified.

  # This optional rule specifies to run the job for merge request pipelines only.
  # Remove these lines entirely to run the job for branch pipelines instead.
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"

It is also possible to allow for both pipeline types while ensuring only one runs at a time by using workflow
rules to automatically switch between branch pipelines and merge request pipelines. To do so, remove
any job level rules related to pipeline sources and add the following workflow level rules to the configuration:

workflow:
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    - if: $CI_COMMIT_BRANCH && $CI_OPEN_MERGE_REQUESTS
      when: never
    - if: $CI_COMMIT_BRANCH

See the GitLab CI/CD Job Control documentation for more detail.
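
As a rough sketch, the workflow rules above can be combined with the minimal configuration from earlier in this
document. The job below carries no job level rules, so the workflow rules alone decide which pipeline type runs.
The job name analyze_with_Phylum is a placeholder chosen for illustration, and the token values are the same
placeholders used elsewhere in this document.

```yaml
# Sketch: workflow rules pick one pipeline type per push, so the job needs no rules of its own.
workflow:
  rules:
    # Run merge request pipelines when the source is a merge request event.
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    # Skip the branch pipeline when the branch already has an open merge request.
    - if: $CI_COMMIT_BRANCH && $CI_OPEN_MERGE_REQUESTS
      when: never
    # Otherwise, run a branch pipeline.
    - if: $CI_COMMIT_BRANCH

stages:
  - QA

analyze_with_Phylum:  # placeholder job name
  stage: QA
  image: phylumio/phylum-ci:latest
  variables:
    GIT_STRATEGY: clone
    # Only needed when the pipeline runs for a merge request.
    GITLAB_TOKEN: $GITLAB_TOKEN_VARIABLE_OR_SECRET_HERE
    PHYLUM_API_KEY: $PHYLUM_TOKEN_VARIABLE_OR_SECRET_HERE
  script:
    - phylum-ci
```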

Docker image selection

Choose the Docker image tag to match your comfort level with image dependencies. latest is a "rolling" tag that
will point to the image created for the latest released phylum-ci Python package. A particular version tag
(e.g., 0.15.0-CLIv3.10.0) is created for each release of the phylum-ci Python package and should not change
once published.

However, to be certain that the image does not change (or to be warned when it does, because the pinned image
will no longer be available), use the SHA256 digest of the tag. The digest can be found by looking at the
phylumio/phylum-ci tags on Docker Hub or with the command:

# NOTE: The command-line JSON processor `jq` is used here for the sake of a one line example. It is not required.
❯ docker manifest inspect --verbose phylumio/phylum-ci:0.15.0-CLIv3.10.0 | jq .Descriptor.digest
"sha256:db450b4233484faf247fffbd28fc4f2b2d4d22cef12dfb1d8716be296690644e"

For instance, at the time of this writing, all of these tag references pointed to the same image:

  # NOTE: These are examples. Only one image line for `phylum-ci` is expected.

  # Not specifying a tag means a default of `latest`
  image: phylumio/phylum-ci

  # Be more explicit about wanting the `latest` tag
  image: phylumio/phylum-ci:latest

  # Use a specific release version of the `phylum-ci` package
  image: phylumio/phylum-ci:0.15.0-CLIv3.10.0

  # Use a specific image with its SHA256 digest
  image: phylumio/phylum-ci@sha256:db450b4233484faf247fffbd28fc4f2b2d4d22cef12dfb1d8716be296690644e

Only the last reference, by SHA256 digest, is guaranteed to always point to the same underlying image.

Variables

The job variables are used to ensure the phylum-ci tool is able to perform its job.

For instance, git is used within the phylum-ci package to do things like determine if there was a lockfile change
and, when specified, report on new dependencies only. Therefore, a clone of the repository is required to ensure that
the local working copy is always pristine and history is available to pull the requested information.
It may also be necessary to specify the clone depth (GIT_DEPTH) when a shallow clone does not contain enough history.
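
A minimal sketch of the clone-related variables follows, for the case where the job needs more history than a
shallow clone provides. It assumes that a GIT_DEPTH of 0 disables shallow cloning, which should be confirmed
against the GitLab shallow cloning reference for the GitLab version in use.

```yaml
analyze_MR_with_Phylum:
  variables:
    # Start every job from a fresh clone so the working copy is pristine.
    GIT_STRATEGY: clone
    # Assumption: "0" turns off shallow cloning and fetches the full history.
    # A positive value such as "50" limits the clone to that many commits.
    GIT_DEPTH: "0"
```

A full clone is slower on large repositories, so a positive depth that is just large enough may be the better
trade-off there.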

A GitLab token with API access is required to use the API (e.g., to post notes/comments). This can be a personal,
project, or group access token. The account used to create the token will be the one that appears to post the
notes/comments on the MR. Therefore, it might be worth looking into using a bot account, which is available for
project and group access tokens. See the GitLab Token Overview documentation for more info.

Note, the GitLab token is only required when this Phylum integration is used in merge request pipelines.
It is not required when used in branch pipelines.
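
For illustration, a branch pipeline variant of the job might look like the sketch below. It has no job level rules
and no GITLAB_TOKEN variable. The job name is a placeholder, and the QA stage is assumed to be defined as in the
minimal configuration.

```yaml
# Sketch: branch pipeline job. No rules limiting the pipeline source and no GitLab token.
analyze_branch_with_Phylum:  # placeholder job name
  stage: QA
  image: phylumio/phylum-ci:latest
  variables:
    GIT_STRATEGY: clone
    PHYLUM_API_KEY: $PHYLUM_TOKEN_VARIABLE_OR_SECRET_HERE
  script:
    - phylum-ci
```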

Note, using $CI_JOB_TOKEN as the value will work in some situations because "API authentication uses the job token,
by using the authorization of the user triggering the job." This is not recommended for anything other than temporary
personal use in private repositories as there is a chance that depending on it will cause failures when attempting to
do the same thing in different scenarios.

A Phylum token with API access is required to perform analysis on project dependencies.
Contact Phylum or register to gain access.
See also phylum auth register command documentation and consider using a bot or group account
for this token.

Values for the GITLAB_TOKEN and PHYLUM_API_KEY variables can come from a CI/CD Variable or an
External Secret. Since they are sensitive, care should be taken to protect them appropriately.

  variables:
    # References:
    # GIT_STRATEGY - https://docs.gitlab.com/ee/ci/runners/configure_runners.html#git-strategy
    # GIT_DEPTH - https://docs.gitlab.com/ee/ci/runners/configure_runners.html#shallow-cloning
    GIT_STRATEGY: clone
    # GIT_DEPTH: "50"

    # References for GitLab tokens:
    # All tokens - https://docs.gitlab.com/ee/security/token_overview.html
    # Personal - https://docs.gitlab.com/ee/user/profile/personal_access_tokens.html
    # Project - https://docs.gitlab.com/ee/user/project/settings/project_access_tokens.html
    # Group - https://docs.gitlab.com/ee/user/group/settings/group_access_tokens.html
    GITLAB_TOKEN: $GITLAB_TOKEN_VARIABLE_OR_SECRET_HERE

    # Contact Phylum (phylum.io/contact-us) or register (app.phylum.io/register) to gain
    # access. See also `phylum auth register` (docs.phylum.io/docs/phylum_auth_register)
    # command documentation. Consider using a bot or group account for this token.
    PHYLUM_API_KEY: $PHYLUM_TOKEN_VARIABLE_OR_SECRET_HERE

Script arguments

The script arguments to the Docker image are the way to exert control over the execution of the Phylum analysis.
The phylum-ci script entry point is expected to be called. It has a number of arguments that are all optional
and default to secure values. To view the arguments, their descriptions, and default values,
run the script with the --help option as specified in the Usage section of the top-level README.md or
view the script options output for the latest release.

  # NOTE: These are examples. Only one script entry line for `phylum-ci` is expected.
  script:
    # Use the defaults for all the arguments.
    # The default behavior is to only analyze newly added dependencies against
    # the risk domain threshold levels set at the Phylum project level.
    - phylum-ci

    # Consider all dependencies in analysis results instead of just the newly added ones.
    # The default is to only analyze newly added dependencies, which can be useful for
    # existing code bases that may not meet established project risk thresholds yet
    # but whose maintainers don't want to make things worse. Specifying `--all-deps` can
    # be useful for casting the widest net for strict adherence to Quality Assurance (QA) standards.
    - phylum-ci --all-deps

    # Some lockfile types (e.g., Python/pip `requirements.txt`) are ambiguous in that
    # they can be named differently and may or may not contain strict dependencies.
    # In these cases, it is best to specify an explicit lockfile path.
    - phylum-ci --lockfile requirements-prod.txt

    # Thresholds for the five risk domains may be set at the Phylum project level.
    # They can be set differently for CI environments to "fail the build."
    # Long commands: https://docs.gitlab.com/ee/ci/yaml/script.html#split-long-commands
    - |
      phylum-ci \
        --vul-threshold 60 \
        --mal-threshold 60 \
        --eng-threshold 70 \
        --lic-threshold 90 \
        --aut-threshold 80

    # Ensure the latest Phylum CLI is installed.
    - phylum-ci --force-install

    # Install a specific version of the Phylum CLI.
    - phylum-ci --phylum-release 3.8.0 --force-install

    # Mix and match for your specific use case.
    - |
      phylum-ci \
        --vul-threshold 60 \
        --mal-threshold 60 \
        --eng-threshold 70 \
        --lic-threshold 90 \
        --aut-threshold 80 \
        --lockfile requirements-prod.txt \
        --all-deps

Alternatives

It is also possible to make direct use of the phylum Python package within CI.
This may be necessary if the Docker image is unavailable or undesirable for some reason.
To use the phylum package, install it and call the desired entry points from a script under your control.
See the Installation and Usage sections of the README file for more detail.
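
One possible shape for such a job is sketched below. The base image tag and the pip install command are
assumptions for illustration and are not taken from the README. The sketch also assumes the package is published
under the name phylum and that the chosen image provides git. Check the Installation section for the supported
Python versions and install method.

```yaml
# Sketch only: runs phylum-ci from the Python package instead of the Docker image.
analyze_MR_with_Phylum:
  stage: QA
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  image: python:3.11  # assumed base image; any image with Python, pip, and git should do
  variables:
    GIT_STRATEGY: clone
    GITLAB_TOKEN: $GITLAB_TOKEN_VARIABLE_OR_SECRET_HERE
    PHYLUM_API_KEY: $PHYLUM_TOKEN_VARIABLE_OR_SECRET_HERE
  script:
    # Assumption: the package is installable from PyPI under the name "phylum".
    - pip install phylum
    # Call the same entry point that the Docker image exposes.
    - phylum-ci
```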