Azure Pipelines Integration

Overview

Azure Pipelines supports several different source repositories. This integration works with repos hosted on Azure Repos Git and GitHub.

Once configured for a repository, the Azure Pipelines integration will provide analysis of project dependencies from manifests and lockfiles. This can happen in a branch pipeline run from a CI trigger or in a Pull Request (PR) pipeline run from a PR trigger.

For PR triggered pipelines, analyzed dependencies will include any that are added/modified in the PR.

For CI triggered pipelines, the analyzed dependencies will be determined by comparing dependency files in the branch to the default branch. All dependencies will be analyzed when the CI triggered pipeline is run on the default branch.

The results are written to the pipeline logs and, unless the option to skip comments is provided, posted as a comment in a thread on the PR. The CI job will return an error (i.e., fail the build) if any of the analyzed dependencies fail to meet the established policy, unless audit mode is specified.

There will be no comment if no dependencies were added or modified for a given PR. There will be no comment when the results of the analysis are successful. If one or more dependencies are still processing (no results available), then the comment will make that clear and the CI pipeline job will only fail if dependencies that have completed analysis results do not meet the active policy.

Prerequisites

The Azure Pipelines environment is primarily supported through the use of a Docker image. The prerequisites for using this image are:

- Access to the phylumio/phylum-ci Docker image
- Azure DevOps Services is used with an Azure Repos Git or GitHub repository type
  - Azure DevOps Server versions are not guaranteed to work at this time
  - Bitbucket Cloud hosted repositories are not supported at this time
- An Azure token with API access
  - This is only required when:
    - The build repository is Azure Repos Git
    - PR triggers are enabled
    - Comment generation has not been skipped
  - Can be the default System.AccessToken provided automatically at the start of each pipeline build
    - The scoped build identity using this token needs the Contribute to pull requests permission
    - See documentation for using the token and setting its job authorization scope
  - Can be a personal access token (PAT) - see documentation
    - Needs at least the Pull Request Threads scope (read & write)
    - Consider using a service account for this token
- A GitHub PAT with API access
  - This is only required when:
    - The build repository is GitHub
    - PR triggers are enabled
    - Comment generation has not been skipped
  - Can be a fine-grained PAT
    - Needs repository access and permissions: read access to metadata and read/write access to pull requests
    - See permissions required for fine-grained PATs
  - Can be a classic PAT
    - Needs the repo scope or minimally the public_repo scope if private repositories are not used
    - See documentation
- A Phylum token with API access
  - Contact Phylum or register to gain access
    - See also phylum auth register command documentation
  - Consider using a bot or group account for this token
- Access to the Phylum API endpoints
  - That usually means a connection to the internet, optionally via a proxy
  - Support for on-premises installs is not available at this time

Configure `azure-pipelines.yml`

Phylum analysis of dependencies can be added to an existing pipeline or run on its own with this minimal configuration:

trigger:
  - main
pr:
  - main

jobs:
  - job: Phylum
    pool:
      vmImage: ubuntu-latest
    container: phylumio/phylum-ci:latest
    steps:
      - checkout: self
        fetchDepth: 0
        persistCredentials: true
      - script: phylum-ci -vv
        displayName: Analyze dependencies with Phylum
        env:
          PHYLUM_API_KEY: $(PHYLUM_TOKEN)
          AZURE_TOKEN: $(AZURE_PAT)     # For Azure repos only
          GITHUB_TOKEN: $(GITHUB_PAT)   # For GitHub repos only

This single stage pipeline configuration contains a single container job named Phylum, triggered to run on pushes or PRs targeting the main branch. It provides debug output but otherwise does not override any of the phylum-ci arguments, which are all either optional or default to secure values.

Let's take a deeper dive into each part of the configuration:

Pipeline control

Choose when to run the pipeline. See the YAML schema trigger definition and pr definition documentation for more detail. It is recommended to enable CI triggered branch pipelines for pushes to the default branch, to ensure the Phylum analysis results for that branch are current.

# This is a CI trigger that will cause the
# pipeline to run on pushes to the `main` branch
trigger:
  - main
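
As a sketch of a broader CI trigger, the full `trigger` syntax in the Azure Pipelines YAML schema accepts include and exclude lists of branches. The `releases/*` and `releases/old*` patterns here are hypothetical branch names chosen for illustration; they are not something this integration requires.

```yaml
# Hypothetical example: run on pushes to `main` and to release branches,
# except for release branches whose names start with `old`.
trigger:
  branches:
    include:
      - main
      - releases/*
    exclude:
      - releases/old*
```

Whichever branches are listed, keeping the default branch among them means the Phylum analysis results for that branch stay current.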

It is recommended to also enable PR validation for the target trigger branch(es). To do so for GitHub repos, use the pr keyword. See the YAML schema pr definition documentation for more detail.

# This is a PR trigger that will cause the pipeline to run when
# a pull request is opened with `main` as the target branch.
# NOTE: This has no effect for Azure Repos Git based repositories
pr:
  - main

To enable PR validation for Azure Repos Git, navigate to the branch policies for the desired branch (main in this example), and configure the Build validation policy for that branch. For more information, see the documentation on PR triggers for Azure Repos Git hosted repositories, PR triggers for GitHub, or more broadly events that trigger pipelines.

Job names

The job can be given a different name, or its steps can be included in an existing stage/job.

jobs:
  - job: Phylum  # Name this what you like
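
A minimal sketch of adding the analysis to a pipeline that already has other jobs might look like the following. The `Build` job and the `DependencyAnalysis` job name are hypothetical placeholders; only the container, checkout, and script settings come from the configuration described on this page.

```yaml
jobs:
  # Hypothetical existing job; replace with whatever the pipeline already does.
  - job: Build
    pool:
      vmImage: ubuntu-latest
    steps:
      - script: echo "existing build steps go here"

  # Phylum analysis added as a sibling job under a different name.
  - job: DependencyAnalysis
    pool:
      vmImage: ubuntu-latest
    container: phylumio/phylum-ci:latest
    steps:
      - checkout: self
        fetchDepth: 0
        persistCredentials: true
      - script: phylum-ci -vv
        displayName: Analyze dependencies with Phylum
        env:
          PHYLUM_API_KEY: $(PHYLUM_TOKEN)
          # Add AZURE_TOKEN or GITHUB_TOKEN here, depending on the repository type.
```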

Pool selection

The pool is specified at the job level here because this is a container job. While Azure Pipelines allows container jobs for windows-2019 and ubuntu-* base vmImage images, only ubuntu-* is supported by Phylum at this time. Keeping that restriction in mind, the pool can be specified at the pipeline or stage level instead. See the YAML schema pool definition documentation for more detail.

    pool:
      vmImage: ubuntu-latest
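
For comparison, here is a sketch of the same job with the pool declared once at the pipeline level. It assumes every job in the pipeline can share the `ubuntu-latest` image; a job that needs something else can still declare its own pool.

```yaml
# Pool declared at the pipeline level; jobs use it unless they declare their own.
pool:
  vmImage: ubuntu-latest

jobs:
  - job: Phylum
    container: phylumio/phylum-ci:latest
    steps:
      - checkout: self
        fetchDepth: 0
        persistCredentials: true
      - script: phylum-ci -vv
        displayName: Analyze dependencies with Phylum
        env:
          PHYLUM_API_KEY: $(PHYLUM_TOKEN)
```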

Docker image selection

The container is specified at the job level here because this is a container job where all steps in the job are meant to run with the same image. The container can also be specified as a resource at the pipeline level and then referenced by name in individual steps of a job instead. See the YAML schema jobs.job.container definition and resource definition documentation for more detail.
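
The resource-based alternative could be sketched as follows. The alias `phylum` is a hypothetical name, and the example assumes the step-level `target` property of the Azure Pipelines YAML schema is available on the agent in use; treat it as an illustration rather than a tested configuration.

```yaml
resources:
  containers:
    # `phylum` is a hypothetical alias for the image; any valid name works.
    - container: phylum
      image: phylumio/phylum-ci:latest

jobs:
  - job: Phylum
    pool:
      vmImage: ubuntu-latest
    steps:
      - checkout: self
        fetchDepth: 0
        persistCredentials: true
      # Only this step is directed at the `phylum` container resource.
      - script: phylum-ci -vv
        displayName: Analyze dependencies with Phylum
        target:
          container: phylum
        env:
          PHYLUM_API_KEY: $(PHYLUM_TOKEN)
```

This form can be useful when the other steps in the job need tools that are not present in the phylum-ci image.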

Choose the Docker image tag to match your comfort level with image dependencies. latest is a "rolling" tag that will point to the image created for the latest released phylum-ci Python package. A particular version tag (e.g., 0.42.4-CLIv6.1.2) is created for each release of the phylum-ci Python package and should not change once published.

To be certain that the image does not change (or to be alerted when it does, because the pinned image will no longer be available), use the SHA256 digest of the tag. The digest can be found by looking at the phylumio/phylum-ci tags on Docker Hub or with the command:

# The command-line JSON processor `jq` is used here for the sake of a one line example. It is not required.
❯ docker manifest inspect --verbose phylumio/phylum-ci:0.42.4-CLIv6.1.2 | jq .Descriptor.digest
"sha256:77b761ccef10edc28b0f009a40fbeab240bf004522edaaea05572dc3728b6ca6"

For instance, at the time of this writing, all of these tag references pointed to the same image:

    # NOTE: These are examples. Only one container line for `phylum-ci` is expected.

    # Be explicit about wanting the `latest` tag
    container: phylumio/phylum-ci:latest

    # Use a specific release version of the `phylum-ci` package
    container: phylumio/phylum-ci:0.42.4-CLIv6.1.2

    # Use a specific image with its SHA256 digest
    container: phylumio/phylum-ci@sha256:77b761ccef10edc28b0f009a40fbeab240bf004522edaaea05572dc3728b6ca6

Only the last reference, by SHA256 digest, is guaranteed to always point to the same underlying image.

The default phylum-ci Docker image contains git and the installed phylum Python package. It also contains an installed version of the Phylum CLI and all required tools needed for lockfile generation. An advantage of using the default Docker image is that the complete environment is packaged and made available with components that are known to work together.

One disadvantage of the default image is its size. It can take a while to download and may provide more tools than required for your specific use case. Special slim tags of the phylum-ci image are provided as an alternative. These tags differ from the default image in that they do not contain the tools needed for lockfile generation (with the exception of the pip tool). The slim tags are significantly smaller and allow for faster job run times. They are useful for those instances where no manifest files are present and/or only lockfiles are used.

Here are examples of using the slim image tags:

    # NOTE: These are examples. Only one container line for `phylum-ci` is expected.

    # Use the most current release of *both* `phylum-ci` and the Phylum CLI
    container: phylumio/phylum-ci:slim

    # Use the `slim` image with a specific release version of `phylum-ci` and Phylum CLI
    container: phylumio/phylum-ci:0.42.4-CLIv6.1.2-slim

Repository checkout

The phylum-ci logic for determining changes in dependency files requires git history beyond what is available in a shallow clone/checkout/fetch. To ensure the shallow fetch option is disabled for the pipeline, an explicit checkout step is specified here, with fetchDepth set to 0. It is also possible to disable the shallow fetch option in the pipeline settings UI. See the YAML schema steps.checkout definition documentation for more detail.

In order to support CI triggers, certain git operations are needed to determine the default branch name and set the remote HEAD ref for it since Azure Pipelines does not do so during repository checkout. These operations require git credentials to be available after the initial fetch, which is done with the persistCredentials property. This property is not required if CI triggers are disabled (e.g., via trigger: none).

      # Reference: https://learn.microsoft.com/azure/devops/pipelines/yaml-schema/steps-checkout
      - checkout: self
        fetchDepth: 0
        persistCredentials: true    # Needed only for CI triggers
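
As an illustration of the PR-only case, the sketch below disables CI triggers and therefore omits `persistCredentials`. It assumes a GitHub hosted repository, since the `pr` keyword has no effect for Azure Repos Git.

```yaml
# CI triggers disabled; only PR-triggered runs remain.
trigger: none
pr:
  - main

jobs:
  - job: Phylum
    pool:
      vmImage: ubuntu-latest
    container: phylumio/phylum-ci:latest
    steps:
      - checkout: self
        fetchDepth: 0   # Full history is still required
      - script: phylum-ci -vv
        displayName: Analyze dependencies with Phylum
        env:
          PHYLUM_API_KEY: $(PHYLUM_TOKEN)
          GITHUB_TOKEN: $(GITHUB_PAT)
```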

Script arguments

The arguments to the script step are the way to exert control over the execution of the Phylum analysis. The entry here will run as a script in the phylum-ci based container job. See the YAML schema steps.script definition and container job documentation for more detail.

The phylum-ci script entry point is expected to be called. It has a number of arguments that are all optional and defaulted to secure values. To view the arguments, their description, and default values, run the script with --help output as specified in the Usage section of the top-level README.md or view the script options output for the latest release.

      # NOTE: These are examples. Only one script entry line for `phylum-ci` is expected.

      # Use the defaults for all the arguments.
      # The default behavior is to only analyze newly added dependencies
      # against the active policy set at the Phylum project level.
      - script: phylum-ci

      # Provide debug level output. Highly recommended.
      - script: phylum-ci -vv

      # Consider all dependencies in analysis results instead of just the newly added ones.
      # The default is to only analyze newly added dependencies, which can be useful for
      # existing code bases that may not meet established policy rules yet but
      # should not be made any worse. Specifying `--all-deps` can be useful for
      # casting the widest net for strict adherence to Quality Assurance (QA) standards.
      - script: phylum-ci --all-deps

      # Some lockfile types (e.g., Python/pip `requirements.txt`) are ambiguous in
      # that they can be named differently and may or may not contain strict
      # dependencies. In these cases it is best to specify an explicit path, either
      # with the `--depfile` option or in a `.phylum_project` file. For more, see:
      # https://docs.phylum.io/knowledge_base/phylum_project_files
      - script: phylum-ci --depfile requirements-prod.txt

      # Specify multiple explicit dependency file paths.
      - script: phylum-ci --depfile requirements-prod.txt Cargo.toml path/to/dependency.file

      # Exclude dependency files by gitignore-style pattern.
      - script: phylum-ci --exclude "requirements-*.txt"

      # Specify multiple exclusion patterns.
      - script: phylum-ci --exclude "build.gradle" "tests/fixtures/"
      - script: |
          phylum-ci \
            --exclude "/requirements-*.txt" \
            --exclude "build.gradle" "fixtures/"

      # Force analysis for all dependencies in a manifest file. This is especially useful
      # for *workspace* manifest files where there is no companion lockfile (e.g., libraries).
      - script: phylum-ci --force-analysis --all-deps --depfile Cargo.toml

      # Perform analysis as part of an organization and/or group-owned project.
      # When an org is specified, a group name must also be specified.
      - script: phylum-ci --org my_org --group my_group
      - script: phylum-ci --group my_group

      # Analyze all dependencies in audit mode, to gain insight without failing builds.
      - script: phylum-ci --all-deps --audit

      # Ensure the latest Phylum CLI is installed.
      - script: phylum-ci --force-install

      # Install a specific version of the Phylum CLI.
      - script: phylum-ci --phylum-release 4.8.0 --force-install

      # Mix and match for your specific use case.
      - script: |
          phylum-ci \
            -vv \
            --org my_org \
            --group my_group \
            --depfile requirements-dev.txt \
            --depfile requirements-prod.txt path/to/dependency.file \
            --depfile Cargo.toml \
            --force-analysis \
            --all-deps

Script Variables

The script step environment variables are used to ensure the phylum-ci tool is able to perform its job.

A Phylum token with API access is required to perform analysis on project dependencies. Contact Phylum or register to gain access. See also phylum auth register command documentation and consider using a bot or group account for this token.

Azure Repos Git Build Repositories

An Azure DevOps token with API access is required to use the API (e.g., to post notes/comments) when the build repository is Azure Repos Git, PR triggers are enabled, and comment generation is not skipped. This can be the default System.AccessToken provided automatically at the start of each pipeline build for the scoped build identity or a personal access token (PAT).

If using a PAT, it will need at least the Pull Request Threads scope (read & write). The account used to create the PAT will be the one that appears to post the comments on the pull request. Therefore, it might be worth using a bot or service account. See the Azure DevOps documentation for using PATs to authenticate for more info.

If using the System.AccessToken, the scoped build identity it attaches to needs at least the Contribute to pull requests permission. For example, to use the System.AccessToken on a project-scoped identity, follow these steps:

1. Go to project settings
2. Select the Repos --> Repositories menu
3. Select the Security tab
4. Select the user {Project Name} Build Service ({Org Name})
   - NOTE: This user will only exist after the first time the pipeline has run
5. Ensure the Contribute to pull requests permission is set to Allow

See the Azure DevOps documentation for using the System.AccessToken and setting its job authorization scope.

GitHub Build Repositories

A GitHub PAT with API access is required to use the API (e.g., to post notes/comments) when the build repository is GitHub, PR triggers are enabled, and comment generation is not skipped. This can be a fine-grained or classic PAT.

If using a fine-grained PAT, it will need repository access and permissions for read access to metadata and read/write access to pull requests. See permissions required for fine-grained PATs for more info.

If using a classic PAT, it will need the repo scope or minimally the public_repo scope if private repositories are not used. See documentation for scopes for more info.

Setting Values

Values for the PHYLUM_API_KEY environment variable and either the AZURE_TOKEN or GITHUB_TOKEN environment variable (e.g., PHYLUM_TOKEN and either AZURE_PAT or GITHUB_PAT in the example here) can come from the pipeline UI, a variable group, or an Azure Key Vault. View the full documentation for how to set secret variables for more information. Since these tokens are sensitive, care should be taken to protect them appropriately.

        env:
          # Contact Phylum (phylum.io/contact-us) or register (app.phylum.io/register)
          # to gain access. Consider using a bot or group account for this token. See:
          # https://docs.phylum.io/knowledge_base/api-keys
          # This value (`PHYLUM_TOKEN`) will need to be set as a secret variable:
          # https://learn.microsoft.com/azure/devops/pipelines/process/set-secret-variables
          PHYLUM_API_KEY: $(PHYLUM_TOKEN)

          # NOTE: These are examples. Only one `AZURE_TOKEN` entry line is expected, and only
          #       when the build repository is hosted in Azure Repos Git with PR triggers
          #       and comment generation enabled.
          #
          # Use the `System.AccessToken` provided automatically at the start of each pipeline build.
          # This value does not have to be set as a secret variable since it is provided by default.
          AZURE_TOKEN: $(System.AccessToken)
          #
          # Use a personal access token (PAT).
          # This value (`AZURE_PAT`) will need to be set as a secret variable:
          # https://learn.microsoft.com/azure/devops/pipelines/process/set-secret-variables
          AZURE_TOKEN: $(AZURE_PAT)

          # NOTE: A `GITHUB_TOKEN` entry is only needed for GitHub hosted build repositories
          #       with PR triggers and comment generation enabled.
          #
          # Use a personal access token (PAT).
          # This value (`GITHUB_PAT`) will need to be set as a secret variable:
          # https://learn.microsoft.com/azure/devops/pipelines/process/set-secret-variables
          GITHUB_TOKEN: $(GITHUB_PAT)
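
When the secrets live in a variable group, the wiring might look like the sketch below. The group name `phylum-secrets` is hypothetical, and the group is assumed to define `PHYLUM_TOKEN` and `AZURE_PAT` as secret values. Secret variables are not exposed to scripts automatically, which is why they are still mapped explicitly under `env`.

```yaml
variables:
  # `phylum-secrets` is a hypothetical variable group name. It is assumed to
  # define PHYLUM_TOKEN and AZURE_PAT as secret variables.
  - group: phylum-secrets

jobs:
  - job: Phylum
    pool:
      vmImage: ubuntu-latest
    container: phylumio/phylum-ci:latest
    steps:
      - checkout: self
        fetchDepth: 0
        persistCredentials: true
      - script: phylum-ci -vv
        displayName: Analyze dependencies with Phylum
        env:
          # Secrets must be mapped into the step environment explicitly.
          PHYLUM_API_KEY: $(PHYLUM_TOKEN)
          AZURE_TOKEN: $(AZURE_PAT)
```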

Exit Codes

The Phylum analysis job will return a zero (0) exit code when it completes successfully and a non-zero code otherwise. The full and current list of exit codes is documented here and "Output Modification" options exist to be strict or loose with setting them.
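
Separately from the tool's own options, Azure Pipelines can be told how to treat a non-zero exit code from a step. The sketch below uses the standard `continueOnError` step property so that a failing analysis is reported without failing the job. This is a pipeline-level choice shown for illustration, not a setting provided by phylum-ci itself.

```yaml
steps:
  - checkout: self
    fetchDepth: 0
    persistCredentials: true
  - script: phylum-ci -vv
    displayName: Analyze dependencies with Phylum
    # A non-zero exit code is reported as "succeeded with issues"
    # instead of failing the job.
    continueOnError: true
    env:
      PHYLUM_API_KEY: $(PHYLUM_TOKEN)
```

The `--audit` option shown earlier is the more direct route when the goal is insight without failed builds; `continueOnError` is an alternative when the exit code itself should stay unchanged in the logs.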

Alternatives

It is also possible to make direct use of the phylum Python package within CI. This may be necessary if the Docker image is unavailable or undesirable for some reason. To use the phylum package, install it and call the desired entry points from a script under your control. See the Installation and Usage sections of the README file for more detail.
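
A rough sketch of a job that skips the Docker image could look like this. It assumes the package is installable from PyPI under the name `phylum`, that `git` is present on the hosted agent, and that `phylum-ci` can obtain the Phylum CLI on its own when it is not already installed. Check the README before relying on any of those assumptions.

```yaml
jobs:
  - job: Phylum
    pool:
      vmImage: ubuntu-latest
    steps:
      - checkout: self
        fetchDepth: 0
        persistCredentials: true
      # Select a Python interpreter on the hosted agent.
      - task: UsePythonVersion@0
        inputs:
          versionSpec: "3.x"
      # Assumption: the package name on PyPI is `phylum`.
      - script: |
          python -m pip install --upgrade pip
          python -m pip install phylum
          phylum-ci -vv
        displayName: Analyze dependencies with Phylum
        env:
          PHYLUM_API_KEY: $(PHYLUM_TOKEN)
```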
