Workflow Engine
Workflow Engine is a security and delivery pipeline designed to orchestrate the process of building and scanning an application image for security vulnerabilities. It removes the need to hand-configure a hardened, predefined security pipeline in traditional CI/CD. Workflow Engine can be statically compiled as a binary and run on virtually any platform, CI/CD environment, or locally.
Getting Started
Install Prerequisites:
- Container engine with the Docker or Podman CLI
- Golang >= v1.22.0
- Just (optional)
Compiling Workflow Engine
Running the just recipe will put the compiled binary into ./bin
just build
OR compile manually
git clone <this-repo> <target-dir>
cd <target-dir>
mkdir bin
go build -o bin/workflow-engine ./cmd/workflow-engine
Optionally, if you want to include build metadata, use linker flags:
go build -ldflags="-X 'main.cliVersion=$(git describe --tags)' -X 'main.gitCommit=$(git rev-parse HEAD)' -X 'main.buildDate=$(date -u +%Y-%m-%dT%H:%M:%SZ)' -X 'main.gitDescription=$(git log -1 --pretty=%B)'" -o ./bin ./cmd/workflow-engine
Running A Pipeline
You can run the executable directly:
workflow-engine run debug
Configuring a Pipeline
Configuration Options:
- Configuration via CLI flags
- Environment Variables
- Config File in JSON
- Config File in YAML
- Config File in TOML
Configuration Order-of-Precedence:
- CLI Flag
- Environment Variable
- Config File Value
- Default Value
Note: a default value of `-` (none) means unset / left blank.
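For example, a CLI flag overrides an environment variable that targets the same option. A minimal sketch, using the image-build flags and the variables documented in the table below:

# the environment variable sets a Dockerfile...
export WFE_IMAGE_BUILD_DOCKERFILE=Containerfile
# ...but the CLI flag takes precedence, so custom.Dockerfile is used
workflow-engine run image-build --dockerfile custom.Dockerfile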
Config Key | Environment Variable | Default Value | Description |
---|---|---|---|
codescan.enabled | WFE_CODE_SCAN_ENABLED | 1 | Enable/Disable the code scan pipeline |
codescan.gitleaksfilename | WFE_CODE_SCAN_GITLEAKS_FILENAME | gitleaks-secrets-report.json | The filename for the gitleaks secret report - must contain 'gitleaks' |
codescan.gitleakssrcdir | WFE_CODE_SCAN_GITLEAKS_SRC_DIR | . | The target directory for the gitleaks scan |
codescan.semgrepfilename | WFE_CODE_SCAN_SEMGREP_FILENAME | semgrep-sast-report.json | The filename for the semgrep SAST report - must contain 'semgrep' |
codescan.semgreprules | WFE_CODE_SCAN_SEMGREP_RULES | p/default | Semgrep ruleset manual override |
deploy.enabled | WFE_IMAGE_PUBLISH_ENABLED | 1 | Enable/Disable the deploy pipeline |
deploy.gatecheckconfigfilename | WFE_DEPLOY_GATECHECK_CONFIG_FILENAME | - | The filename for the gatecheck config |
gatecheckbundlefilename | WFE_GATECHECK_BUNDLE_FILENAME | artifacts/gatecheck-bundle.tar.gz | The filename for the gatecheck bundle, a validatable archive of security artifacts |
imagebuild.args | WFE_IMAGE_BUILD_ARGS | - | Comma-separated list of build-time variables |
imagebuild.builddir | WFE_IMAGE_BUILD_DIR | . | The build directory to use during an image build |
imagebuild.cachefrom | WFE_IMAGE_BUILD_CACHE_FROM | - | External cache sources (e.g., "user/app:cache", "type=local,src=path/to/dir") |
imagebuild.cacheto | WFE_IMAGE_BUILD_CACHE_TO | - | Cache export destinations (e.g., "user/app:cache", "type=local,dest=path/to/dir") |
imagebuild.dockerfile | WFE_IMAGE_BUILD_DOCKERFILE | Dockerfile | The Dockerfile/Containerfile to use during an image build |
imagebuild.enabled | WFE_IMAGE_BUILD_ENABLED | 1 | Enable/Disable the image build pipeline |
imagebuild.platform | WFE_IMAGE_BUILD_PLATFORM | - | The target platform for the build (e.g., linux/amd64) |
imagebuild.squashlayers | WFE_IMAGE_BUILD_SQUASH_LAYERS | 0 | Squash image layers - only supported with the Podman CLI |
imagebuild.target | WFE_IMAGE_BUILD_TARGET | - | The named build stage to build in a multi-stage Dockerfile |
imagepublish.bundlepublishenabled | WFE_IMAGE_BUNDLE_PUBLISH_ENABLED | 1 | Enable/Disable gatecheck artifact bundle publish task |
imagepublish.bundletag | WFE_IMAGE_PUBLISH_BUNDLE_TAG | my-app/artifact-bundle:latest | The full image tag for the target gatecheck bundle image blob |
imagepublish.enabled | WFE_IMAGE_PUBLISH_ENABLED | 1 | Enable/Disable the image publish pipeline |
imagescan.clamavfilename | WFE_IMAGE_SCAN_CLAMAV_FILENAME | clamav-virus-report.txt | The filename for the clamscan virus report - must contain 'clamav' |
imagescan.enabled | WFE_IMAGE_SCAN_ENABLED | 1 | Enable/Disable the image scan pipeline |
imagescan.grypeconfigfilename | WFE_IMAGE_SCAN_GRYPE_CONFIG_FILENAME | - | The config filename for the grype vulnerability report |
imagescan.grypefilename | WFE_IMAGE_SCAN_GRYPE_FILENAME | grype-vulnerability-report-full.json | The filename for the grype vulnerability report - must contain 'grype' |
imagescan.syftfilename | WFE_IMAGE_SCAN_SYFT_FILENAME | syft-sbom-report.json | The filename for the syft SBOM report - must contain 'syft' |
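As a sketch of how the dotted config keys above nest in a config file (assuming Viper's usual key-to-section mapping; run config init for an authoritative starter file), a minimal YAML config might look like:

codescan:
  enabled: true
  semgreprules: p/default
imagebuild:
  dockerfile: Dockerfile
  platform: linux/amd64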
Running in Docker
When running workflow-engine in a Docker container, some pipelines need to run Docker commands. In order for the Docker CLI inside the workflow-engine container to connect to the Docker daemon running on the host machine, you must either mount the `/var/run/docker.sock` socket into the workflow-engine container, or provide configuration for accessing the Docker daemon remotely with the `DOCKER_HOST` environment variable.
If you don't have access to Artifactory to pull in the Omnibus base image, you can build the image manually from `images/omnibus/Dockerfile`.
Using /var/run/docker.sock
This approach assumes you have the docker daemon running on your host machine.
Example:
docker run -it --rm \
`# Mount your Dockerfile and supporting files in the working directory: /app` \
-v "$(pwd):/app:ro" \
`# Mount docker.sock for use by the docker CLI running inside the container` \
-v "/var/run/docker.sock:/var/run/docker.sock" \
`# Run the workflow-engine container with the desired arguments` \
workflow-engine run image-build
Using a Remote Daemon
For more information see the Docker CLI and Docker Daemon documentation pages.
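As a sketch, pointing the containerized Docker CLI at a remote daemon usually only requires passing `DOCKER_HOST` through to the container (the daemon address below is an illustrative placeholder):

docker run -it --rm \
`# Point the docker CLI inside the container at a remote daemon` \
-e DOCKER_HOST="tcp://remote-docker-host:2376" \
`# Mount your Dockerfile and supporting files in the working directory: /app` \
-v "$(pwd):/app:ro" \
`# Run the workflow-engine container with the desired arguments` \
workflow-engine run image-build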
Using Podman in Docker
In addition to building images with Docker, it is also possible to build them with Podman. When running Podman in Docker it is necessary to either launch the container in privileged mode, or to run as the podman user:
docker run --user podman -it --rm \
`# Mount your Dockerfile and supporting files in the working directory: /app` \
-v "$(pwd):/app:ro" \
`# Run the workflow-engine container with the desired arguments` \
workflow-engine:local run image-build -i podman
If root access is needed, the easiest solution for using podman inside a docker container is to run the container in "privileged" mode:
docker run -it --rm \
`# Mount your Dockerfile and supporting files in the working directory: /app` \
-v "$(pwd):/app:ro" \
`# Run the container in privileged mode so that podman is fully functional` \
--privileged \
`# Run the workflow-engine container with the desired arguments` \
workflow-engine run image-build -i podman
Using Podman in Podman
To run the workflow-engine container using podman the process is quite similar, but there are a few additional security options required:
podman run --user podman -it --rm \
`# Mount your Dockerfile and supporting files in the working directory: /app` \
-v "$(pwd):/app:ro" \
`# Run the container with additional security options so that podman is fully functional` \
--security-opt label=disable --device /dev/fuse \
`# Run the workflow-engine container with the desired arguments` \
workflow-engine:local run image-build -i podman
Getting Started
Github Access
You will need access to the https://github.com/nightwing-demo/workflow-engine repository. If you don't already have access, you can contact the following Nightwing team members to get access:
Required Tools
The following tools are required for running the Nightwing Workflow Engine.
Go
The Nightwing Workflow Engine is written in Go.
To install Go on a Mac, use Homebrew:
brew install go
Optional: if you would like Go-built tools to be available locally on the command line, add the following to your `~/.zshrc` or `~/.zprofile` file:
# Go
export GOPATH=$HOME/go
export PATH=$PATH:$GOPATH/bin
Recommended Resources
If you are new to Go, or would like a refresher, here are some recommended resources:
- Go Documentation
- 101 Go Mistakes and How to Avoid Them - a free, online summarized version can be found here
Optional Tools
The following are optional tools that may be installed to enhance the developer experience.
mdbook
mdbook is written in Rust and requires Rust to be installed as a prerequisite.
To install Rust on a Mac or other Unix-like OS:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
If you've installed rustup in the past, you can update your installation by running:
rustup update
Once you have installed Rust, the following command can be used to build and install mdbook:
cargo install mdbook
Once mdbook is installed, you can serve it by going to the directory containing the mdbook markdown files and running:
mdbook serve
just
just is "just" a command runner. It is a handy way to save and run project-specific commands.
You can use the following command on Linux, MacOS, or Windows to download the latest release; just replace `<destination directory>` with the directory where you'd like to put just:
curl --proto '=https' --tlsv1.2 -sSf https://just.systems/install.sh | bash -s -- --to <destination directory>
For example, to install just to `~/bin`:
# create ~/bin
mkdir -p ~/bin
# download and extract just to ~/bin/just
curl --proto '=https' --tlsv1.2 -sSf https://just.systems/install.sh | bash -s -- --to ~/bin
# add `~/bin` to the paths that your shell searches for executables
# this line should be added to your shell's initialization file,
# e.g. `~/.bashrc` or `~/.zshrc`
export PATH="$PATH:$HOME/bin"
# just should now be executable
just --help
Workflow Engine CLI Configuration
The Workflow Engine CLI provides a set of commands to manage the configuration of your workflow engine. These commands allow you to initialize, list variables, render, and convert configuration files in various formats.
This documentation provides a comprehensive overview of the configuration management capabilities available in the Workflow Engine CLI. For further assistance or more detailed examples, refer to the CLI's help command or the official documentation.
Configuring Using Environment Variables, CLI Arguments, or Configuration Files
The Workflow Engine supports flexible configuration methods to suit various operational environments. You can configure the engine using environment variables, command-line (CLI) arguments, or configuration files in JSON, YAML, or TOML formats. This flexibility allows you to choose the most convenient way to set up your workflow engine based on your deployment and development needs.
Configuration Precedence
The Workflow Engine uses Viper under the hood to manage its configurations, which follows a specific order of precedence when merging configuration options:
- Command-line Arguments: These override values specified through other methods.
- Environment Variables: They take precedence over configuration files.
- Configuration Files: Supports JSON, YAML, and TOML formats. The engine reads these files if specified and merges them into the existing configuration.
- Default Values: Predefined in the code.
Using Environment Variables
Environment variables are a convenient way to configure the application in environments where file access might be restricted or for overriding specific configurations without changing the configuration files.
To use environment variables:
- Prefix your environment variables with the application's prefix (e.g., `WFE_`) to avoid conflicts with other applications.
- Use the environment variable names that correspond to the configuration options you wish to set.
Using CLI Arguments
CLI arguments provide a way to specify configuration values when running a command. They are useful for temporary overrides or when scripting actions. For each configuration option, there is usually a corresponding flag that can be passed to the command.
For example:
./workflow-engine run image-build --build-dir . --dockerfile custom.Dockerfile
Using Configuration Files
Configuration files offer a structured and human-readable way to manage your application settings. The Workflow Engine supports JSON, YAML, and TOML formats, allowing you to choose the one that best fits your preferences or existing infrastructure.
- JSON: A lightweight data-interchange format.
- YAML: A human-readable data serialization standard.
- TOML: A minimal configuration file format that's easy to read due to its clear semantics.
To specify which configuration file to use, you can typically pass the file path as a CLI argument or set an environment variable pointing to the file.
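For example, with a hypothetical --config flag (the exact flag name may vary by command; check the CLI's --help output), an invocation might look like:

workflow-engine run image-build --config wfe-config.yaml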
Merging Configuration
Workflow Engine merges configuration from different sources in the order of precedence mentioned above. If the same configuration is specified in multiple places, the source with the highest precedence overrides the others. This mechanism allows for flexible configuration strategies, such as defining default values in a file and overriding them with environment variables or CLI arguments as needed.
Commands - Managing the configuration file
config init
Initializes the configuration file with default settings.
config vars
Lists supported built-in variables that can be used in templates.
config render
Renders a configuration template using the `--file` flag or STDIN and writes the output to STDOUT.
config convert
Converts a configuration file from one format to another.
Examples
Render Configuration Template
Rendering a configuration template from config.json.tmpl
to JSON format:
$ cat config.json.tmpl | ./workflow-engine config render
Output:
{
"image": {...},
"artifacts": {...}
}
Convert Configuration Format
Attempting to convert the configuration without specifying required flags results in an error:
$ cat config.json.tmpl | ./workflow-engine config render | ./workflow-engine config convert
Error Output:
Error: at least one of the flags in the group [file input] is required
Successful conversion from JSON to TOML format:
$ cat config.json.tmpl | ./workflow-engine config render | ./workflow-engine config convert -i json -o toml
Output:
[image]
buildDir = '.'
...
Image Build
Command Parameters
Build Directory
CLI Flag | Variable Name | Config Field Name |
---|---|---|
--build-dir | WFE_BUILD_DIR | image.buildDir |
The directory from which to build the container (typically, but not always, the directory where the Dockerfile is located). This parameter is optional, expects a string value, and defaults to the current working directory.
Dockerfile
CLI Flag | Variable Name | Config Field Name |
---|---|---|
--dockerfile | WFE_BUILD_DOCKERFILE | image.buildDockerfile |
Build Args
CLI Flag | Variable Name | Config Field Name |
---|---|---|
--build-arg | WFE_BUILD_ARGS | image.buildArgs |
Defines build arguments that are passed to the actual container image build command. This parameter is optional, and expects a mapping of string keys to string values, the exact format of which depends on the medium by which it is specified.
CLI Flag
The `--build-arg` flag can be specified multiple times to specify different args. The key and value for each arg should be specified as a string in the format `key=value`.
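For example, with two illustrative args:

workflow-engine run image-build --build-arg VERSION=1.2.3 --build-arg GIT_COMMIT=abc1234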
Environment Variable
The `WFE_BUILD_ARGS` environment variable must contain all the build arguments in a JSON-formatted object (i.e. `{"key":"value"}`).
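For example, using the same illustrative args:

export WFE_BUILD_ARGS='{"VERSION":"1.2.3","GIT_COMMIT":"abc1234"}'
workflow-engine run image-build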
Configuration File
Similar to how build args are specified as an environment variable, build args in config files must be specified as a JSON formatted object. The following is an example YAML config file:
image:
buildArgs: |-
{ "key": "value" }
Note that when specifying build args via the configuration file, special care must be taken to ensure that the case of each key is preserved. In the above example the value of `buildArgs` is a string, not a YAML object. When using a JSON config file this would need to be specified as follows:
{
"image": {
"buildArgs": "{ \"key\": \"value\" }"
}
}
This is because the workflow-engine configuration file loader does not preserve the case of keys, and build args in Dockerfiles are case sensitive.
Tag
CLI Flag | Variable Name | Config Field Name |
---|---|---|
--tag | WFE_BUILD_TAG | image.buildTag |
Platform
CLI Flag | Variable Name | Config Field Name |
---|---|---|
--platform | WFE_BUILD_PLATFORM | image.buildPlatform |
Target
CLI Flag | Variable Name | Config Field Name |
---|---|---|
--target | WFE_BUILD_TARGET | image.buildTarget |
For multi-stage Dockerfiles this parameter specifies a named stage to build.
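For example, given a multi-stage Dockerfile with a named builder stage (the stage name and contents here are illustrative), only that stage can be built:

FROM golang:1.22 AS builder
COPY . /src
RUN go build -o /out/app /src

FROM scratch
COPY --from=builder /out/app /app

Building only the builder stage:

workflow-engine run image-build --target builder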
Cache To
CLI Flag | Variable Name | Config Field Name |
---|---|---|
--cache-to | WFE_BUILD_CACHE_TO | image.buildCacheTo |
Cache From
CLI Flag | Variable Name | Config Field Name |
---|---|---|
--cache-from | WFE_BUILD_CACHE_FROM | image.buildCacheFrom |
Squash Layers
CLI Flag | Variable Name | Config Field Name |
---|---|---|
--squash-layers | WFE_BUILD_SQUASH_LAYERS | image.buildSquashLayers |
Security Analysis
Overview
TODO: Add overview.
Security Analysis Docs
Code-scan
Overview
TODO: Add overview.
Using Code-scan On CLI
workflow-engine run code-scan [flags]
Flags
Flags | Definition |
---|---|
--gitleaks-filename string | the output filename for the gitleaks vulnerability report |
-h, --help | help for code-scan |
--semgrep-experimental | use the osemgrep statically compiled binary |
--semgrep-filename string | the output filename for the semgrep vulnerability report |
--semgrep-rules string | the rules semgrep will use for the scan |
Code-scan Security Tools
Semgrep
Table of Contents
- Overview
- Configuration
- Rulesets
- Logging Semgrep with Workflow-engine
- Handling False Positives & Problematic File(s)
- Official Semgrep Documentation & Resources
Overview
Semgrep is a static code analysis tool that provides a range of features for detecting and preventing security vulnerabilities and bugs in software. It is designed to help businesses improve their applications' security, increase reliability, and reduce the complexity and cost of performing code analysis. As applications become more complex and interconnected, it becomes increasingly difficult to identify and fix security vulnerabilities and bugs before they are exploited or cause problems in production. This can result in security breaches, data loss, and other issues that can damage a business's reputation and success.
Supported Languages
Apex · Bash · C · C++ · C# · Clojure · Dart · Dockerfile · Elixir · HTML · Go · Java · JavaScript · JSX · JSON · Julia · Jsonnet · Kotlin · Lisp · Lua · OCaml · PHP · Python · R · Ruby · Rust · Scala · Scheme · Solidity · Swift · Terraform · TypeScript · TSX · YAML · XML · Generic (ERB, Jinja, etc.)
Supported Package Managers
C# (NuGet) · Dart (Pub) · Go (Go modules, go mod) · Java (Gradle, Maven) · Javascript/Typescript (npm, Yarn, Yarn 2, Yarn 3, pnpm) · Kotlin (Gradle, Maven) · PHP (Composer) · Python (pip, pip-tool, Pipenv, Poetry) · Ruby (RubyGems) · Rust (Cargo) · Scala (Maven) · Swift (SwiftPM)
Configuration
Under the hood, Workflow Engine runs Semgrep with a certain set of base flags, then layers further security-tool functionality and user-experience improvements on top of the output of one of the two optional commands below.
Runs this over your git repository:
semgrep ci --json --config [semgrep-rule-config-file]
or this when affixed with the `--semgrep-experimental` flag:
osemgrep ci --json --experimental --config [semgrep-rule-config-file]
Semgrep with Workflow-engine Code-scan
On the command line use the following with the necessary flags below in your git repo:
workflow-engine run code-scan [semgrep-flags]
Flags
Input Flag:
--semgrep-rules string
The path to a `.yaml`, `.toml`, or `.json` file with a ruleset Semgrep will use while scanning your code. More on rulesets here. This can be further configured by specifying the filename with its path in an environment variable or in the workflow-engine config keys within wfe-config.yaml.
Output Flag:
--semgrep-filename string
The filename Semgrep will output its vulnerability report to. More on the vulnerability reports here.
Toggle Osemgrep Flag:
--semgrep-experimental
Use Semgrep's experimental features that are still in beta and have the potential to increase vulnerability detection. This also switches to `osemgrep`, a variant built upon Semgrep with OpenSSF security metrics in mind.
Env Variables
Config Key | Environment Variable | Default Value | Description |
---|---|---|---|
codescan.semgrepfilename | WFE_CODE_SCAN_SEMGREP_FILENAME | semgrep-sast-report.json | The filename for the semgrep SAST report - must contain 'semgrep' |
codescan.semgreprules | WFE_CODE_SCAN_SEMGREP_RULES | p/default | Semgrep ruleset manual override |
Rulesets
rules:
  - id: dangerously-setting-html
    languages:
      - javascript
    message: dangerouslySetInnerHTML usage! Don't allow XSS!
    pattern: dangerouslySetInnerHTML(...)
    severity: ERROR
    paths:
      include:
        - "*.jsx"
        - "*.js"
Semgrep operates on a set of rulesets given by the user that determine what terms to scan your code for. These rulesets are given as files with a .yaml, .json, or .toml extension.
To identify vulnerabilities at a basic level Semgrep requires:
- Language to target
- Message to display on vulnerability detection
- Pattern(s) to match
- Severity Rating from lowest to highest:
- INFO
- WARNING
- ERROR
Furthermore, there are some advanced options, some of which can even amend or exclude certain code snippets.
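For instance, a rule can suggest amended code through Semgrep's fix key; a minimal sketch (the rule id and replacement are illustrative):

rules:
  - id: use-strict-equality
    languages:
      - javascript
    message: Use === instead of == to avoid type coercion.
    pattern: $X == $Y
    fix: $X === $Y
    severity: WARNING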
Typically, rules and rulesets have already been written by various developers; thanks to Semgrep's open-source nature, you can find these below:
Or, if you're the type to blaze your own path, here's some documentation on how to write your own custom rules, including examples of advanced pattern-matching syntax:
Below is a rule playground where you can test writing your own Semgrep rules:
Semgrep Rule Playground
Logging Semgrep with Workflow-engine
Within workflow-engine, `semgrep-sast-report.json` is the default filename for the Semgrep output; it will appear in the artifacts directory if workflow-engine is given read/write permissions. As covered above in Configuration, the `--semgrep-filename filename` flag configures a custom filename for the Semgrep report.
Furthermore, when Semgrep is enabled via code-scan, `workflow-engine run code-scan -v` will output the Semgrep results verbosely along with those of the other code-scan tools.
The `semgrep-sast-report.json` file contains rules and snippets of code with potential vulnerabilities, as well as amended code that has been fixed with the `fix` tag in the rule.
Workflow engine uses Gatecheck to 'audit' the Semgrep logs once Semgrep has finished. It does so by scanning for vulnerabilities identified by Open Worldwide Application Security Project (OWASP) IDs. Workflow-engine reads STDERR, where errors from the code-scan tools are gathered, audits them via Gatecheck, and outputs this audit to STDOUT. It also writes the logged output files into the `artifacts/` directory in your working directory.
Ex.
Check ID | Owasp IDs | Severity | Impact | Link |
---|---|---|---|---|
react-dangerouslysetinnerhtml | A07:2017 - Cross-Site Scripting (XSS), A03:2021 - Injection | ERROR | MEDIUM | |
Handling False Positives & Problematic File(s)
Semgrep is a rather simplistic tool that searches for vulnerabilities in your code based on the rules given to it. It is up to you to handle false positives and problematic file(s). There are a multitude of ways to do so; they increase the complexity of the base rule but also increase its power and specificity.
False Positives
You notice that Semgrep is screaming at you from the console in workflow-engine. You rage and rage as your terminal is just polluted with messages for a vulnerability you know is just a false positive.
Nosemgrep
Just add a comment with `nosemgrep` on the line next to the vulnerability, or at the function head of the block of code, and boom: false positives away. This blocks Semgrep entirely on that code; for best practice use `// nosemgrep: rule-id-1, rule-id-2, ...` to restrict only the specific rules that cause the false positive. Here's more info on nosemgrep.
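For example, suppressing only the dangerously-setting-html rule from the earlier ruleset on a single line:

element.dangerouslySetInnerHTML(markup); // nosemgrep: dangerously-setting-html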
Taint Analysis
Of course, the above is somewhat of a workaround and should mostly be reserved for cases where only a very few areas produce false positives. The better way to handle false positives is by adding taints to rules once you understand the root of the false positive. Taints can be applied to places with false-positive vulnerabilities, prepended with `taint_assume_safe_` and given a boolean value. False-positive taints are for:
- Boolean inputs
- Numeric inputs
- Index inputs
- Function names
- Propagation (must taint its initialization)
Taints can also be used to track variables that can lead to vulnerabilities in code, allowing developers to see the flow of a potential vulnerability in a large code base. This is done by tainting the source variable and the sink, where the variable ends up at a potentially vulnerable function. If the variable mutates, it is best to track the propagators and sanitizers of this variable as well; at a high level, these are functions that modify the tainted variable in some way, so the taint should change accordingly. Here's an example of such a rule with taints. Of course, if you'd like to know more, click here to see the official documentation on Semgrep taint analysis.
Problematic File(s)
At a grander scale, if a whole file or directory of files is causing a false positive, or you just don't need to scan these files, there are multitudes of ways to handle this.
Down below are some examples of both:
.Semgrepignore
`.semgrepignore` is just like a `.gitignore` file: it gives Semgrep a list of things not to look at, and it will skip over them. Place this file in your root directory or in your working directory. The example below excludes the `.gitignore` file from the scan, and any `node_modules` directory (denoted by `**`) will be excluded if this file is placed at the root directory.
.gitignore
.env
main_test.go
resources/
**/node_modules/**
Rules with Certain Paths
Semgrep allows two ways inside of a rule to disregard or specify files and directories: first add the paths field, then add the exclude and include subfields, each with its own list of files/directories. These values are strings.
Example of Rules with Path Specification and Taints
rules:
  - id: eqeq-is-bad
    mode: taint
    pattern-sources:
      - pattern: source(...)
    pattern-sinks:
      - pattern: sink(...)
    pattern-sanitizers:
      - pattern: clean(...)
    message: tainted data reaches a sink without being cleaned
    languages:
      - go
    severity: ERROR
    paths:
      exclude:
        - "*_test.go"
        - "project/tests"
      include:
        - "project/server"
Official Semgrep Documentation & Resources
- Semgrep Source Code
- Semgrep Official Documentation
- Semgrep Ruleset Registry
- Semgrep Rule Registry
- Semgrep Rule Playground
- Semgrep Custom Rule Declarations
- Semgrep Rule Taints
- Semgrep FAQs
Troubleshooting
Mac M1 Docker Container Execution Failure
If you are running on a Mac M1, and are getting an error similar to:
ERR execution failure error="input:1: container.from.withEnvVariable.withExec.stdout process \"echo sample output from debug container\" did not complete successfully: exit code: 1\n\nStdout:\n\nStderr:\n"
You may need to install colima.
To install colima on a Mac using Homebrew:
brew install colima
Start colima:
colima start --arch x86_64
Then go ahead and run the workflow engine.
Registry Authentication Issues
If you are getting an error connecting to the GitHub container registry ghcr.io similar to:
ERR execution failure error="input:1: container.from unexpected status from HEAD request to https://ghcr.io/v2/nightwing-demo/omnibus/manifests/v1.0.0: 403 Forbidden\n
You will need to login to the GitHub Container Registry as follows.
Login to GitHub Container Registry
To login to the GitHub Container Registry, you will need to first create a GitHub Personal Access Token (PAT) and use the token to login to the GitHub Container Registry using the following command:
docker login ghcr.io
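For a non-interactive login, the Docker CLI also accepts the token on STDIN; a sketch, assuming the PAT is stored in a GITHUB_PAT environment variable:

echo "$GITHUB_PAT" | docker login ghcr.io -u <github-username> --password-stdin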
Then go ahead and run the workflow engine in the same terminal window.
Developer Guide
TODO: Project info, goals, etc.
Getting Started
Project Layout
TODO: Add the philosophy behind the project layout
Shell
The Shell package (`pkg/shell`) is a library of commands and utilities used in workflow engine.
The standard library way to execute shell commands is by using the `os/exec` package, which has a lot of features and flexibility. In our case, we want to restrict the ability to arbitrarily execute shell commands by carefully selecting a sub-set of features for each command.
For example, if you look at the Syft CLI reference, you'll see dozens of commands and configuration options, all controlled by parsing flags from the command string. This is an opinionated security pipeline, so we don't need all the features Syft provides. The user shouldn't care that we're using Syft to generate an SBOM which is then scanned by Grype for vulnerabilities. The idea of Workflow Engine is that it's all abstracted into the Security Analysis pipeline.
In the Shell package, all necessary commands are abstracted into native Go objects. Only the features used by a given command are written into this package.
The shell.Executable wraps the exec.Cmd struct and adds some convenient methods for building a command.
syft version -o json
Here's how to execute regular standard library commands with exec.Cmd:
cmd := exec.Command("syft", "version", "-o", "json")
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
// some other options
err := cmd.Run()
There's also additional logic beyond what the `os/exec` standard library command provides. Since workflow engine is built around executing external binaries, there is an internal library, `pkg/shell`, used to abstract a lot of the complexities involved with handling async patterns, possible interrupts, and parameters.
Commands can be represented as functions.
func SyftVersion(options ...OptionFunc) error {
o := newOptions(options...)
cmd := exec.Command("syft", "version")
return run(cmd, o)
}
The `OptionFunc` variadic parameter allows the caller to modify the behavior of the command with an arbitrary number of `OptionFunc`(s).
`newOptions` generates the default `Options` structure and then applies all of the passed-in functions. The `o` variable can now be used to apply parameters to the command before execution.
Returning the `run` function hands off the execution phase of the command to another function, which bootstraps a lot of useful functionality without needing to write supporting code for each new command.
For example, if you only want to output what the command would run but not actually run the command,
dryRun := true
SyftVersion(WithStdout(os.Stdout), WithDryRun(dryRun))
This would log the final output command without executing.
The motivation behind this architecture is to simplify the methods for all sub-commands on an executable.
Implementing a new sub-command is trivial: just write a new function with the same pattern.
func SyftHelp(options ...OptionFunc) error {
o := newOptions(options...)
cmd := exec.Command("syft", "--help")
return run(cmd, o)
}
If we wanted to build an OptionFunc for version to optionally write JSON instead of plain text, it would go in the `pkg/shell/shell.go` file. Since there aren't many commands, they all share the same configuration object, `Options`.
func WithJSONOutput(enabled bool) OptionFunc {
	return func(o *Options) {
		// respect the caller's value instead of hard-coding true
		o.JSONOutput = enabled
	}
}
Now, the version function can reference this field and change the shell command:
func SyftVersion(options ...OptionFunc) error {
o := newOptions(options...)
cmd := exec.Command("syft", "version")
if o.JSONOutput {
cmd = exec.Command("syft", "version", "-o", "json")
}
return run(cmd, o)
}
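A caller can then opt in to JSON output like any other option:

err := SyftVersion(WithStdout(os.Stdout), WithJSONOutput(true))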
See `pkg/shell/docker.go` for a more complex example of a command with a lot of parameters.
Pipelines
Concepts
Concurrency
This PR contains a detailed explanation of the concurrency pattern used in the pipeline definitions.
Documentation
Too Long; Might Read (TL;MR)
A collection of thoughts around design decisions made in Workflow Engine, mostly ramblings that some people may or may not find useful.
Why CI/CD Flexible Configuration is Painful
In a traditional CI/CD environment, you would have to parse strings to build the exact command you want to execute.
Local Shell:
syft version
A GitLab CI/CD configuration lets us declare the execution environment by providing an image name:
syft-version:
stage: scan
image: anchore/syft:latest
script:
- syft version
What typically happens is configuration creep. If you need to print the version information in JSON (one of the many command options), you would have to provide multiple job variants in GitLab, only changing the script block, hiding each one behind an environment variable:
.syft:
stage: scan
image: anchore/syft:latest
syft-version:text:
extends: .syft
script:
- syft version
rules:
- if: $SYFT_VERSION_JSON != "true"
syft-version:json:
extends: .syft
script:
- syft version -o json
rules:
- if: $SYFT_VERSION_JSON == "true"
The complexity increases exponentially in a GitLab CI/CD file for each configuration option you wish to support.
Developer Notes
Tidy First
By: Kent Beck
"Tidy First?" suggests the following:
- There isn’t a single way to do things, there are things that make sense in context, and you know your context
- There are many distinct ways to tidy code, which make code easier to work with: guard clauses, removing dead code, normalizing symmetries, and so on
- Tidying and logic changes are different types of work, and should be done in distinct pull requests
- This speeds up pull request review, and on high-cohesion teams tidying commits shouldn’t require code review at all
- Tidying should be done in small amounts, not large amounts
- Tidying is usually best to do before changing application logic, to the extent that it reduces the cost of making the logical change
- It’s also OK to tidy after your change, later when you have time, or even never (for code that doesn’t change much)
- Coupling is really bad for maintainable code
Effective Go
link: Effective Go
Formatting in Go
To format the Go source files, run the following command:
go fmt .
VSCode Setup
Install the Go extension for VSCode for Go language support and highlighting.
If you would like to automatically format on save in VSCode, use the following settings in VSCode:
- Press `Command ⌘ + ,` to view the settings.
- Search for `editor.formatOnSave` and set it to `true`.
- Search for `editor.defaultFormatter` and set it to `Go`.