Workflow Engine

Build workflow-engine

Workflow Engine Splash Logo

Workflow Engine is a security and delivery pipeline designed to orchestrate the process of building and scanning an application image for security vulnerabilities. It solves the problem of having to configure a hardened, predefined security pipeline using traditional CI/CD. Workflow Engine can be statically compiled as a binary and run on virtually any platform, CI/CD environment, or locally.

Getting Started

Install Prerequisites:

  • Container Engine
  • Docker or Podman CLI
  • Golang >= v1.22.0
  • Just (optional)

Compiling Workflow Engine

Running the just recipe will put the compiled binary into ./bin

just build

Or compile manually:

git clone <this-repo> <target-dir>
cd <target-dir>
mkdir bin
go build -o bin/workflow-engine ./cmd/workflow-engine

Optionally, if you want to include version metadata in the binary, use build arguments:

go build -ldflags="-X 'main.cliVersion=$(git describe --tags)' -X 'main.gitCommit=$(git rev-parse HEAD)' -X 'main.buildDate=$(date -u +%Y-%m-%dT%H:%M:%SZ)' -X 'main.gitDescription=$(git log -1 --pretty=%B)'" -o ./bin ./cmd/workflow-engine

Running A Pipeline

You can run the executable directly:

workflow-engine run debug

Configuring a Pipeline

Configuration Options:

  • Configuration via CLI flags
  • Environment Variables
  • Config File in JSON
  • Config File in YAML
  • Config File in TOML

Configuration Order-of-Precedence:

  1. CLI Flag
  2. Environment Variable
  3. Config File Value
  4. Default Value

Note: a value of "-" means unset / left blank

| Config Key | Environment Variable | Default Value | Description |
| --- | --- | --- | --- |
| codescan.enabled | WFE_CODE_SCAN_ENABLED | 1 | Enable/Disable the code scan pipeline |
| codescan.gitleaksfilename | WFE_CODE_SCAN_GITLEAKS_FILENAME | gitleaks-secrets-report.json | The filename for the gitleaks secret report - must contain 'gitleaks' |
| codescan.gitleakssrcdir | WFE_CODE_SCAN_GITLEAKS_SRC_DIR | . | The target directory for the gitleaks scan |
| codescan.semgrepfilename | WFE_CODE_SCAN_SEMGREP_FILENAME | semgrep-sast-report.json | The filename for the semgrep SAST report - must contain 'semgrep' |
| codescan.semgreprules | WFE_CODE_SCAN_SEMGREP_RULES | p/default | Semgrep ruleset manual override |
| deploy.enabled | WFE_IMAGE_PUBLISH_ENABLED | 1 | Enable/Disable the deploy pipeline |
| deploy.gatecheckconfigfilename | WFE_DEPLOY_GATECHECK_CONFIG_FILENAME | - | The filename for the gatecheck config |
| gatecheckbundlefilename | WFE_GATECHECK_BUNDLE_FILENAME | artifacts/gatecheck-bundle.tar.gz | The filename for the gatecheck bundle, a validatable archive of security artifacts |
| imagebuild.args | WFE_IMAGE_BUILD_ARGS | - | Comma-separated list of build-time variables |
| imagebuild.builddir | WFE_IMAGE_BUILD_DIR | . | The build directory to use during an image build |
| imagebuild.cachefrom | WFE_IMAGE_BUILD_CACHE_FROM | - | External cache sources (e.g., "user/app:cache", "type=local,src=path/to/dir") |
| imagebuild.cacheto | WFE_IMAGE_BUILD_CACHE_TO | - | Cache export destinations (e.g., "user/app:cache", "type=local,dest=path/to/dir") |
| imagebuild.dockerfile | WFE_IMAGE_BUILD_DOCKERFILE | Dockerfile | The Dockerfile/Containerfile to use during an image build |
| imagebuild.enabled | WFE_IMAGE_BUILD_ENABLED | 1 | Enable/Disable the image build pipeline |
| imagebuild.platform | WFE_IMAGE_BUILD_PLATFORM | - | The target platform for the build (e.g., linux/amd64) |
| imagebuild.squashlayers | WFE_IMAGE_BUILD_SQUASH_LAYERS | 0 | Squash image layers - only supported with the Podman CLI |
| imagebuild.target | WFE_IMAGE_BUILD_TARGET | - | The target build stage to build in a multi-stage Dockerfile |
| imagepublish.bundlepublishenabled | WFE_IMAGE_BUNDLE_PUBLISH_ENABLED | 1 | Enable/Disable the gatecheck artifact bundle publish task |
| imagepublish.bundletag | WFE_IMAGE_PUBLISH_BUNDLE_TAG | my-app/artifact-bundle:latest | The full image tag for the target gatecheck bundle image blob |
| imagepublish.enabled | WFE_IMAGE_PUBLISH_ENABLED | 1 | Enable/Disable the image publish pipeline |
| imagescan.clamavfilename | WFE_IMAGE_SCAN_CLAMAV_FILENAME | clamav-virus-report.txt | The filename for the clamscan virus report - must contain 'clamav' |
| imagescan.enabled | WFE_IMAGE_SCAN_ENABLED | 1 | Enable/Disable the image scan pipeline |
| imagescan.grypeconfigfilename | WFE_IMAGE_SCAN_GRYPE_CONFIG_FILENAME | - | The config filename for the grype vulnerability report |
| imagescan.grypefilename | WFE_IMAGE_SCAN_GRYPE_FILENAME | grype-vulnerability-report-full.json | The filename for the grype vulnerability report - must contain 'grype' |
| imagescan.syftfilename | WFE_IMAGE_SCAN_SYFT_FILENAME | syft-sbom-report.json | The filename for the syft SBOM report - must contain 'syft' |

Running in Docker

When running workflow-engine in a docker container, some pipelines need to run docker commands. In order for the docker CLI inside the workflow-engine container to connect to the docker daemon running on the host machine, you must either mount /var/run/docker.sock into the workflow-engine container, or provide configuration for accessing the docker daemon remotely with the DOCKER_HOST environment variable.

If you don't have access to Artifactory to pull the Omnibus base image, you can build the image manually from images/omnibus/Dockerfile.

Using /var/run/docker.sock

This approach assumes you have the docker daemon running on your host machine.

Example:

docker run -it --rm \
  `# Mount your Dockerfile and supporting files in the working directory: /app` \
  -v "$(pwd):/app:ro" \
  `# Mount docker.sock for use by the docker CLI running inside the container` \
  -v "/var/run/docker.sock:/var/run/docker.sock" \
  `# Run the workflow-engine container with the desired arguments` \
  workflow-engine run image-build

Using a Remote Daemon

For more information see the Docker CLI and Docker Daemon documentation pages.
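
As a minimal sketch (the host name, port, and TLS setup are placeholders; see the Docker documentation for securing a remote daemon), you can point the docker CLI inside the container at a remote daemon with DOCKER_HOST:

docker run -it --rm \
  `# Mount your Dockerfile and supporting files in the working directory: /app` \
  -v "$(pwd):/app:ro" \
  `# Point the docker CLI inside the container at a remote daemon (placeholder host)` \
  -e DOCKER_HOST="tcp://remote-docker-host:2375" \
  workflow-engine run image-build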

Using Podman in Docker

In addition to building images with Docker, it is also possible to build them with Podman. When running Podman in Docker, it is necessary either to launch the container in privileged mode or to run as the podman user:

docker run --user podman -it --rm \
  `# Mount your Dockerfile and supporting files in the working directory: /app` \
  -v "$(pwd):/app:ro" \
  `# Run the workflow-engine container with the desired arguments` \
  workflow-engine:local run image-build -i podman

If root access is needed, the easiest solution for using podman inside a docker container is to run the container in "privileged" mode:

docker run -it --rm \
  `# Mount your Dockerfile and supporting files in the working directory: /app` \
  -v "$(pwd):/app:ro" \
  `# Run the container in privileged mode so that podman is fully functional` \
  --privileged \
  `# Run the workflow-engine container with the desired arguments` \
  workflow-engine run image-build -i podman

Using Podman in Podman

To run the workflow-engine container using podman the process is quite similar, but there are a few additional security options required:

podman run --user podman  -it --rm \
  `# Mount your Dockerfile and supporting files in the working directory: /app` \
  -v "$(pwd):/app:ro" \
  `# Run the container with additional security options so that podman is fully functional` \
  --security-opt label=disable --device /dev/fuse \
  `# Run the workflow-engine container with the desired arguments` \
  workflow-engine:local run image-build -i podman

Getting Started

Github Access

You will need access to the https://github.com/nightwing-demo/workflow-engine repository. If you don't already have access, contact the Nightwing team to get access.

Required Tools

The following tools are required for running the Nightwing Workflow Engine.

Go

The Nightwing Workflow Engine is written in Go.

To install Go on a Mac, use Homebrew:

brew install go

Optional: if you would like Go-installed tools to be available on the command line, add the following to your ~/.zshrc or ~/.zprofile file:

# Go
export GOPATH=$HOME/go
export PATH=$PATH:$GOPATH/bin

If you are new to Go, or would like a refresher, here are some recommended resources:

Optional Tools

The following are optional tools that may be installed to enhance the developer experience.

mdbook

mdbook is written in Rust and requires Rust to be installed as a prerequisite.

To install Rust on a Mac or other Unix-like OS:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

If you've installed rustup in the past, you can update your installation by running:

rustup update

Once you have installed Rust, the following command can be used to build and install mdbook:

cargo install mdbook

Once mdbook is installed, you can serve it by going to the directory containing the mdbook markdown files and running:

mdbook serve

just

just is "just" a command runner. It is a handy way to save and run project-specific commands.

To install just, you can use the following command on Linux, macOS, or Windows to download the latest release; just replace <destination directory> with the directory where you'd like to put just:

curl --proto '=https' --tlsv1.2 -sSf https://just.systems/install.sh | bash -s -- --to <destination directory>

For example, to install just to ~/bin:

# create ~/bin
mkdir -p ~/bin

# download and extract just to ~/bin/just
curl --proto '=https' --tlsv1.2 -sSf https://just.systems/install.sh | bash -s -- --to ~/bin

# add `~/bin` to the paths that your shell searches for executables
# this line should be added to your shell's initialization file,
# e.g. `~/.bashrc` or `~/.zshrc`
export PATH="$PATH:$HOME/bin"

# just should now be executable
just --help

Workflow Engine CLI Configuration

The Workflow Engine CLI provides a set of commands to manage the configuration of your workflow engine. These commands allow you to initialize, list variables, render, and convert configuration files in various formats.

This documentation provides a comprehensive overview of the configuration management capabilities available in the Workflow Engine CLI. For further assistance or more detailed examples, refer to the CLI's help command or the official documentation.

Configuring Using Environment Variables, CLI Arguments, or Configuration Files

The Workflow Engine supports flexible configuration methods to suit various operational environments. You can configure the engine using environment variables, command-line (CLI) arguments, or configuration files in JSON, YAML, or TOML formats. This flexibility allows you to choose the most convenient way to set up your workflow engine based on your deployment and development needs.

Configuration Precedence

The Workflow Engine uses Viper under the hood to manage its configurations, which follows a specific order of precedence when merging configuration options:

  1. Command-line Arguments: These override values specified through other methods.
  2. Environment Variables: They take precedence over configuration files.
  3. Configuration Files: Supports JSON, YAML, and TOML formats. The engine reads these files if specified and merges them into the existing configuration.
  4. Default Values: Predefined in the code.

Using Environment Variables

Environment variables are a convenient way to configure the application in environments where file access might be restricted or for overriding specific configurations without changing the configuration files.

To use environment variables:

  • Prefix your environment variables with the engine's prefix (WFE_, as used in the configuration tables throughout this document) to avoid conflicts with other applications.
  • Use the environment variable names that correspond to the configuration options you wish to set, as in the sketch below.
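
A minimal sketch using variables from the configuration table above (the values are illustrative):

# disable the code scan pipeline and point the image build at a custom Dockerfile
export WFE_CODE_SCAN_ENABLED=0
export WFE_IMAGE_BUILD_DOCKERFILE=custom.Dockerfile
workflow-engine run image-build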

Using CLI Arguments

CLI arguments provide a way to specify configuration values when running a command. They are useful for temporary overrides or when scripting actions. For each configuration option, there is usually a corresponding flag that can be passed to the command.

For example:

./workflow-engine run image-build --build-dir . --dockerfile custom.Dockerfile

Using Configuration Files

Configuration files offer a structured and human-readable way to manage your application settings. The Workflow Engine supports JSON, YAML, and TOML formats, allowing you to choose the one that best fits your preferences or existing infrastructure.

  • JSON: A lightweight data-interchange format.
  • YAML: A human-readable data serialization standard.
  • TOML: A minimal configuration file format that's easy to read due to its clear semantics.

To specify which configuration file to use, you can typically pass the file path as a CLI argument or set an environment variable pointing to the file.
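
For example, here is a minimal sketch of a YAML config file, assuming the dotted config keys from the table in Configuring a Pipeline nest as YAML maps (the exact layout is an assumption; config init shows the authoritative structure):

# illustrative values only; see the configuration table for all keys
codescan:
  enabled: true
  semgreprules: p/default
imagebuild:
  builddir: .
  dockerfile: Dockerfile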

Merging Configuration

Workflow Engine merges configuration from different sources in the order of precedence mentioned above. If the same configuration is specified in multiple places, the source with the highest precedence overrides the others. This mechanism allows for flexible configuration strategies, such as defining default values in a file and overriding them with environment variables or CLI arguments as needed.
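
For instance, combining the --dockerfile flag and its WFE_BUILD_DOCKERFILE variable documented in the Image Build section (filenames are illustrative), the flag wins:

# the environment variable overrides any config file or default value...
export WFE_BUILD_DOCKERFILE=env.Dockerfile
# ...but the CLI flag takes precedence over the environment variable,
# so this build uses cli.Dockerfile
./workflow-engine run image-build --dockerfile cli.Dockerfile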

Commands - Managing the configuration file

config init

Initializes the configuration file with default settings.

config vars

Lists supported built-in variables that can be used in templates.

config render

Renders a configuration template using the --file flag or STDIN and writes the output to STDOUT.

config convert

Converts a configuration file from one format to another.

Examples

Render Configuration Template

Rendering a configuration template from config.json.tmpl to JSON format:

$ cat config.json.tmpl | ./workflow-engine config render

Output:

{
  "image": {...},
  "artifacts": {...}
}

Convert Configuration Format

Attempting to convert the configuration without specifying required flags results in an error:

$ cat config.json.tmpl | ./workflow-engine config render  | ./workflow-engine config convert

Error Output:

Error: at least one of the flags in the group [file input] is required

Successful conversion from JSON to TOML format:

$ cat config.json.tmpl | ./workflow-engine config render  | ./workflow-engine config convert -i json -o toml

Output:

[image]
buildDir = '.'
...

Image Build

Command Parameters

Build Directory

| CLI Flag | Variable Name | Config Field Name |
| --- | --- | --- |
| --build-dir | WFE_BUILD_DIR | image.buildDir |

The directory from which to build the container (typically, but not always, the directory where the Dockerfile is located). This parameter is optional, expects a string value, and defaults to the current working directory.

Dockerfile

| CLI Flag | Variable Name | Config Field Name |
| --- | --- | --- |
| --dockerfile | WFE_BUILD_DOCKERFILE | image.buildDockerfile |

Build Args

| CLI Flag | Variable Name | Config Field Name |
| --- | --- | --- |
| --build-arg | WFE_BUILD_ARGS | image.buildArgs |

Defines build arguments that are passed to the actual container image build command. This parameter is optional, and expects a mapping of string keys to string values, the exact format of which depends on the medium by which it is specified.

CLI Flag

The --build-arg flag can be specified multiple times to specify different args. The key and value for each arg should be specified as a string in the format key=value.
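
For example (the arg names are illustrative):

./workflow-engine run image-build --build-arg FOO=bar --build-arg BAZ=qux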

Environment Variable

The WFE_BUILD_ARGS environment variable must contain all the build arguments in a JSON formatted object (i.e. {"key":"value"}).
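
For example (values illustrative):

export WFE_BUILD_ARGS='{"FOO":"bar","BAZ":"qux"}'
./workflow-engine run image-build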

Configuration File

Similar to how build args are specified as an environment variable, build args in config files must be specified as a JSON formatted object. The following is an example YAML config file:

image:
  buildArgs: |-
    { "key": "value" }

Note that when specifying build args via the configuration file, special care must be taken to ensure that the case of the key is preserved. In the above example the value of buildArgs is a string, not a YAML object. When using a JSON config file this would need to be specified as follows:

{
	"image": {
		"buildArgs": "{ \"key\": \"value\" }"
	}
}

This is because the workflow-engine configuration file loader does not preserve the case of keys, and build args in Dockerfiles are case sensitive.

Tag

| CLI Flag | Variable Name | Config Field Name |
| --- | --- | --- |
| --tag | WFE_BUILD_TAG | image.buildTag |

Platform

| CLI Flag | Variable Name | Config Field Name |
| --- | --- | --- |
| --platform | WFE_BUILD_PLATFORM | image.buildPlatform |

Target

| CLI Flag | Variable Name | Config Field Name |
| --- | --- | --- |
| --target | WFE_BUILD_TARGET | image.buildTarget |

For multi-stage Dockerfiles this parameter specifies a named stage to build.
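
For example, to build only a hypothetical stage named builder:

./workflow-engine run image-build --target builder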

Cache To

| CLI Flag | Variable Name | Config Field Name |
| --- | --- | --- |
| --cache-to | WFE_BUILD_CACHE_TO | image.buildCacheTo |

Cache From

| CLI Flag | Variable Name | Config Field Name |
| --- | --- | --- |
| --cache-from | WFE_BUILD_CACHE_FROM | image.buildCacheFrom |

Squash Layers

| CLI Flag | Variable Name | Config Field Name |
| --- | --- | --- |
| --squash-layers | WFE_BUILD_SQUASH_LAYERS | image.buildSquashLayers |

Security Analysis

Overview

lorem ipsum blah blah text

Security Analysis Docs

Code-scan

Overview

lorem ipsum stuff

Using Code-scan On CLI

workflow-engine run code-scan [flags]

Flags

| Flag | Definition |
| --- | --- |
| --gitleaks-filename string | the output filename for the gitleaks vulnerability report |
| -h, --help | help for code-scan |
| --semgrep-experimental | use the osemgrep statically compiled binary |
| --semgrep-filename string | the output filename for the semgrep vulnerability report |
| --semgrep-rules string | the rules semgrep will use for the scan |

Code-scan Security Tools

Semgrep

Semgrep Logo

Table of Contents

  1. Overview
  2. Configuration
  3. Rulesets
  4. Logging Semgrep with Workflow-engine
  5. Handling False Positives & Problematic File(s)
  6. Official Semgrep Documentation & Resources

Overview

Semgrep is a static code analysis tool that provides a range of features for detecting and preventing security vulnerabilities and bugs in software. It is designed to help businesses improve their applications' security, increase reliability, and reduce the complexity and cost of performing code analysis. As applications become more complex and interconnected, it becomes increasingly difficult to identify and fix security vulnerabilities and bugs before they are exploited or cause problems in production. This can result in security breaches, data loss, and other issues that can damage a business's reputation and success.

Supported Languages

Apex · Bash · C · C++ · C# · Clojure · Dart · Dockerfile · Elixir · HTML · Go · Java · JavaScript · JSX · JSON · Julia · Jsonnet · Kotlin · Lisp · Lua · OCaml · PHP · Python · R · Ruby · Rust · Scala · Scheme · Solidity · Swift · Terraform · TypeScript · TSX · YAML · XML · Generic (ERB, Jinja, etc.)

Supported Package Managers

C# (NuGet) · Dart (Pub) · Go (Go modules, go mod) · Java (Gradle, Maven) · Javascript/Typescript (npm, Yarn, Yarn 2, Yarn 3, pnpm) · Kotlin (Gradle, Maven) · PHP (Composer) · Python (pip, pip-tool, Pipenv, Poetry) · Ruby (RubyGems) · Rust (Cargo) · Scala (Maven) · Swift (SwiftPM)

Configuration

Under the hood, Workflow Engine runs Semgrep with a certain set of base flags, then builds further security-tool functionality and user-experience improvements on top of the output of one of the two commands below.

It runs this over your git repository:

semgrep ci --json --config [semgrep-rule-config-file]

or this when affixed with the --semgrep-experimental flag:

osemgrep ci --json --experimental --config [semgrep-rule-config-file]

Semgrep with Workflow-engine Code-scan

On the command line, run the following with the necessary flags (described below) inside your git repository:

  workflow-engine run code-scan [semgrep-flags]

Flags


Input Flag:

  --semgrep-rules string

The path to a .yaml, .toml, or .json file containing the ruleset Semgrep will use while scanning your code (more on rulesets below). This can also be configured by setting the filename (with path) in an environment variable or in the workflow-engine config keys within wfe-config.yaml.


Output Flag:

  --semgrep-filename string    

The filename for the vulnerability report Semgrep outputs. More on vulnerability reports below.


Toggle Osemgrep Flag:

  --semgrep-experimental

Use Semgrep's experimental features, which are still in beta and have the potential to increase vulnerability detection. This runs osemgrep, a variant built upon Semgrep with OpenSSF security metrics in mind.


Env Variables

| Config Key | Environment Variable | Default Value | Description |
| --- | --- | --- | --- |
| codescan.semgrepfilename | WFE_CODE_SCAN_SEMGREP_FILENAME | semgrep-sast-report.json | The filename for the semgrep SAST report - must contain 'semgrep' |
| codescan.semgreprules | WFE_CODE_SCAN_SEMGREP_RULES | p/default | Semgrep ruleset manual override |

Rulesets

rules:
  - id: dangerously-setting-html
    languages:
      - javascript
    message: dangerouslySetInnerHTML usage! Don't allow XSS!
    pattern: ...dangerouslySetInnerHTML(...)...
    severity: ERROR
    paths:
      include:
        - "*.jsx"
        - "*.js"

Semgrep operates on a set of rulesets, given by the user, that determine how your code is scanned. Rulesets are files with a .yaml, .json, or .toml extension.

To identify vulnerabilities at a basic level Semgrep requires:

  • Language to target
  • Message to display on vulnerability detection
  • Pattern(s) to match
  • Severity Rating from lowest to highest:
    • INFO
    • WARNING
    • ERROR

Furthermore, there are some advanced options, some of which can even amend or exclude certain code snippets.

Typically, rules and rulesets have already been written by various developers; thanks to Semgrep's open-source nature, you can find these below:

Or, if you're the type to blaze your own path, here's some documentation on how to write your own custom rules, including examples of advanced pattern-matching syntax:


Below is a rule playground where you can test writing your own Semgrep rules:

Semgrep Rule Playground

Logging Semgrep with Workflow-engine

Within Workflow Engine, semgrep-sast-report.json is the default filename for Semgrep's output; it will appear in the artifacts directory, provided workflow-engine has read/write permissions. As covered above in Configuration, the --semgrep-filename flag configures a custom filename for the Semgrep report.

Furthermore, when Semgrep is enabled via the code-scan pipeline, workflow-engine run code-scan -v will output Semgrep's logs verbosely, along with those of the other code-scan tools.

The semgrep-sast-report.json contains the rules and snippets of code with potential vulnerabilities, as well as amended code for rules that carry a fix tag.

Workflow Engine uses Gatecheck to 'audit' the Semgrep logs once Semgrep has finished. It does so by scanning for vulnerabilities identified by Open Worldwide Application Security Project (OWASP) IDs. Workflow Engine reads STDERR, where errors from the code-scan tools are gathered, audits them via Gatecheck, and writes the audit to STDOUT. It also writes the logged output files to the artifacts/ directory in your working directory.

Ex.

| Check ID | Owasp IDs | Severity | Impact | Link |
| --- | --- | --- | --- | --- |
| react-dangerouslysetinnerhtml | A07:2017 - Cross-Site Scripting (XSS), A03:2021 - Injection | ERROR | MEDIUM | |

Handling False Positives & Problematic File(s)

Semgrep is a rather simplistic tool: it searches for vulnerabilities in your code based on the rules given to it. It is up to you to handle false positives and problematic file(s). There are a multitude of ways to do this, each of which increases the complexity of the base rule but also its power and specificity.

False Positives

You notice that Semgrep is screaming at you from the console in workflow-engine. You rage and rage as your terminal is just polluted with messages for a vulnerability you know is just a false positive.

Nosemgrep

Just add a comment containing nosemgrep on the line of the finding (or at the head of the function containing it) and, boom, the false positive goes away. A bare nosemgrep comment blocks all Semgrep rules on that line; for best practice, use // nosemgrep: rule-id-1, rule-id-2 to restrict the suppression to the rules causing the false positive. Here's more info on nosemgrep.
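
A minimal sketch, reusing the rule id from the example ruleset above (the variable names are illustrative):

// the markup below is already sanitized, so suppress only that one rule
const widget = <div dangerouslySetInnerHTML={{ __html: sanitizedHtml }} />; // nosemgrep: dangerously-setting-html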

Taint Analysis

Of course, the above is somewhat of a workaround and should mainly be considered when there are only a few places where false positives occur. The better way to handle false positives, once you understand their root cause, is to add taint options to rules. Options prefixed with taint_assume_safe_ and given a boolean value can be applied to places with false-positive vulnerabilities. False-positive taint options exist for:

  • Boolean inputs
  • Numeric inputs
  • Index inputs
  • Function names
  • Propagation (must taint its initialization)

Taints can also be used to track variables that can lead to vulnerabilities in code, allowing developers to see the flow of a potential vulnerability through a large code base. This is done by tainting the source variable and the sink, where the variable ends up at a potentially vulnerable function. If the variable mutates, it is best to also track its propagators and sanitizers; at a high level, these are functions that modify the tainted variable in some way, so the taint should change accordingly. An example of such a rule with taints appears below. Of course, if you'd like to know more, see the official documentation on Semgrep taint analysis.

Problematic File(s)

At a grander scale, if a whole file or directory of files is causing false positives, or you simply don't need to scan those files, there are multiple ways to handle it.

Down below are some examples of both:

.Semgrepignore

.semgrepignore works just like a .gitignore file: it gives Semgrep a list of paths to skip. Place this file in your root directory or in your working directory. The example below excludes the .gitignore file from scanning and, via the '**' glob, ANY node_modules directory when placed at the root directory.

.gitignore
.env
main_test.go
resources/
**/node_modules/**

Rules with Certain Paths

Semgrep allows two ways inside a rule to disregard or target specific files and directories: add the paths field, then the exclude and/or include subfields, each with its own list of files/directories. These values are strings.

Example of Rules with Path Specification and Taints
rules:
  - id: eqeq-is-bad
    mode: taint
    message: tainted value reaches a sensitive comparison
    languages:
      - go
    severity: ERROR
    # source(...) and clean(...) are placeholder patterns for illustration
    pattern-sources:
      - pattern: source(...)
    pattern-sinks:
      - pattern: $X == $Y
    pattern-sanitizers:
      - pattern: clean(...)
    paths:
      exclude:
        - "*_test.go"
        - "project/tests"
      include:
        - "project/server"

Official Semgrep Documentation & Resources

Troubleshooting

Mac M1 Docker Container Execution Failure

If you are running on a Mac M1, and are getting an error similar to:

ERR execution failure error="input:1: container.from.withEnvVariable.withExec.stdout process \"echo sample output from debug container\" did not complete successfully: exit code: 1\n\nStdout:\n\nStderr:\n"

You may need to install colima.

To install colima on a Mac using Homebrew:

brew install colima

Start colima:

colima start --arch x86_64

Then go ahead and run the workflow engine.

Registry Authentication Issues

If you are getting an error connecting to the GitHub container registry ghcr.io similar to:

ERR execution failure error="input:1: container.from unexpected status from HEAD request to https://ghcr.io/v2/nightwing-demo/omnibus/manifests/v1.0.0: 403 Forbidden\n

You will need to log in to the GitHub Container Registry as follows.

Login to GitHub Container Registry

To log in to the GitHub Container Registry, you will need to first create a GitHub Personal Access Token (PAT) and use the token to log in with the following command:

docker login ghcr.io
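
For non-interactive logins, GitHub's documentation recommends piping the token via --password-stdin (the variable and username below are placeholders):

# read the PAT from an environment variable instead of typing it interactively
echo $CR_PAT | docker login ghcr.io -u <github-username> --password-stdin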

Then go ahead and run the workflow engine in the same terminal window.

Developer Guide

TODO: Project info, goals, etc.

Getting Started

Project Layout

TODO: Add the philosophy behind the project layout

Shell

The Shell package (pkg/shell) is a library of commands and utilities used in workflow engine. The standard library way to execute shell commands is by using the os/exec package which has a lot of features and flexibility. In our case, we want to restrict the ability to arbitrarily execute shell commands by carefully selecting a sub-set of features for each command.

For example, if you look at the Syft CLI reference, you'll see dozens of commands and configuration options, all driven by parsing flags from the command string. This is an opinionated security pipeline, so we don't need every feature Syft provides. The user shouldn't have to care that we're using Syft to generate an SBOM which is then scanned by Grype for vulnerabilities; the idea of Workflow Engine is that it's all abstracted into the Security Analysis pipeline.

In the Shell package, all necessary commands are abstracted into native Go objects. Only the features actually used for a given command are implemented in this package.

The shell.Executable wraps the exec.Cmd struct and adds some convenient methods for building a command. Take the following shell command:

syft version -o json

Here's how you would execute it with the standard library's exec.Cmd:

cmd := exec.Command("syft", "version", "-o", "json")
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
// some other options
err := cmd.Run()

There's also additional logic required around the os/exec standard library. Since workflow engine is built around executing external binaries, the internal pkg/shell library abstracts many of the complexities involved in handling async patterns, possible interrupts, and parameters.

Commands can be represented as functions.

func SyftVersion(options ...OptionFunc) error {
	o := newOptions(options...)
	cmd := exec.Command("syft", "version")
	return run(cmd, o)
}

The OptionFunc variadic parameter allows the caller to modify the behavior of the command with an arbitrary number of OptionFunc(s).

newOptions generates the default Options structure and then applies all of the passed-in functions. The o variable can then be used to apply parameters to the command before execution.

Returning run hands off the execution phase of the command to a shared function that bootstraps a lot of useful functionality without needing to write supporting code for each new command.
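
A minimal sketch of this options pattern; the field and function names are assumptions for illustration, not the exact pkg/shell implementation:

package shell

import "io"

// Options holds shared parameters for every command; fields are illustrative.
type Options struct {
	DryRun bool
	Stdout io.Writer
}

// OptionFunc mutates an Options value before command execution.
type OptionFunc func(*Options)

// WithDryRun toggles logging the command instead of executing it.
func WithDryRun(enabled bool) OptionFunc {
	return func(o *Options) { o.DryRun = enabled }
}

// WithStdout redirects the command's standard output.
func WithStdout(w io.Writer) OptionFunc {
	return func(o *Options) { o.Stdout = w }
}

// newOptions builds the defaults, then applies each caller-supplied function.
func newOptions(options ...OptionFunc) *Options {
	o := &Options{}
	for _, apply := range options {
		apply(o)
	}
	return o
}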

For example, if you only want to output the command that would run, without actually running it:

dryRun := true
SyftVersion(WithStdout(os.Stdout), WithDryRun(dryRun))

This logs the final command without executing it.

The motivation behind this architecture is to simplify the methods for all sub-commands on an executable.

Implementing a new sub-command is trivial; just write a new function following the same pattern:

func SyftHelp(options ...OptionFunc) error {
	o := newOptions(options...)
	cmd := exec.Command("syft", "--help")
	return run(cmd, o)
}

If we wanted to build an OptionFunc for version that optionally writes JSON instead of plain text, it would go in the pkg/shell/shell.go file.

Since there aren't many commands, they all share the same configuration object Options.

func WithJSONOutput(enabled bool) OptionFunc {
	return func(o *Options) {
		o.JSONOutput = enabled
	}
}

Now, the version function can reference this field and change the shell command:

func SyftVersion(options ...OptionFunc) error {
	o := newOptions(options...)
	cmd := exec.Command("syft", "version")
	if o.JSONOutput {
		cmd = exec.Command("syft", "version", "-o", "json")
	}
	return run(cmd, o)
}

See pkg/shell/docker.go for a more complex example of a command with a lot of parameters.

Pipelines

Concepts

Concurrency

Workflow Engine PR #26

This PR contains a detailed explanation of the concurrency pattern used in the pipeline definitions.

Documentation

Too Long; Might Read (TL;MR)

A collection of thoughts around design decisions made in Workflow Engine, mostly ramblings that some people may or may not find useful.

Why CI/CD Flexible Configuration is Painful

In a traditional CI/CD environment, you would have to parse strings to build the exact command you want to execute.

Local Shell:

syft version

A GitLab CI/CD configuration lets us declare the execution environment by providing an image name:

syft-version:
  stage: scan
  image: anchore/syft:latest
  script:
    - syft version

What typically happens is configuration creep. If you need to print the version information in JSON (one of the many command options), you would have to provide multiple job variants in GitLab, only changing the script block, hiding each one behind an environment variable:

.syft:
  stage: scan
  image: anchore/syft:latest

syft-version:text:
  extends: .syft
  script:
    - syft version
  rules:
    - if: $SYFT_VERSION_JSON != "true"

syft-version:json:
  extends: .syft
  script:
    - syft version -o json
  rules:
    - if: $SYFT_VERSION_JSON == "true"

The complexity increases exponentially in a GitLab CI/CD file with each configuration option you wish to support.

Developer Notes

Tidy First

By: Kent Beck

"Tidy First?" suggests the following:

  • There isn’t a single way to do things, there are things that make sense in context, and you know your context
  • There are many distinct ways to tidy code, which make code easier to work with: guard clauses, removing dead code, normalizing symmetries, and so on
  • Tidying and logic changes are different types of work, and should be done in distinct pull requests
  • This speeds up pull request review, and on high-cohesion teams tidying commits shouldn’t require code review at all
  • Tidying should be done in small amounts, not large amounts
  • Tidying is usually best to do before changing application logic, to the extent that it reduces the cost of making the logical change
  • It’s also OK to tidy after your change, later when you have time, or even never (for code that doesn’t change much)
  • Coupling is really bad for maintainable code

Effective Go

link: Effective Go

Formatting in Go

To format the Go source files, run the following command:

go fmt .

VSCode Setup

Install the Go extension for VSCode for Go language support and highlighting.

If you would like to automatically format on save in VSCode, use the following settings in VSCode:

  1. Press Command ⌘ + , to view the settings.
  2. Search for editor.formatOnSave and set it to true.
  3. Search for editor.defaultFormatter and set it to Go.
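
Equivalently, a minimal sketch of the corresponding settings.json entries (golang.go is the Go extension's identifier):

{
  "editor.formatOnSave": true,
  "[go]": {
    "editor.defaultFormatter": "golang.go"
  }
}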