37 changes: 37 additions & 0 deletions sbom/BUILD.bazel
@@ -0,0 +1,37 @@
# SBOM Generation Package
#
# This package provides Bazel-native SBOM (Software Bill of Materials) generation
# using module extensions and aspects.
#
# Public API:
# - load("@score_tooling//sbom:defs.bzl", "sbom")
# - use_extension("@score_tooling//sbom:extensions.bzl", "sbom_metadata")

load("@rules_python//python:defs.bzl", "py_library")

package(default_visibility = ["//visibility:public"])

exports_files([
    "defs.bzl",
    "extensions.bzl",
    "repos.bzl",
    "repository_rules.bzl",
    "crates_metadata.json",
    "cpp_metadata.json",
])

# Filegroup for all SBOM-related bzl files
filegroup(
    name = "bzl_files",
    srcs = [
        "defs.bzl",
        "extensions.bzl",
        "//sbom/internal:bzl_files",
    ],
)

# npm wrapper (uses system-installed npm from PATH)
sh_binary(
    name = "npm_wrapper",
    srcs = ["npm_wrapper.sh"],
)
244 changes: 244 additions & 0 deletions sbom/SBOM_Readme.md
@@ -0,0 +1,244 @@
# SBOM Setup Guide

## 1. Configure MODULE.bazel

Add the SBOM metadata extension in your **root** MODULE.bazel (e.g. `reference_integration/MODULE.bazel`):

```starlark
# Enable SBOM metadata collection from all modules in the dependency graph
sbom_ext = use_extension("@score_tooling//sbom:extensions.bzl", "sbom_metadata")
use_repo(sbom_ext, "sbom_metadata")
```

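Manual license entries are usually not needed, because license metadata is collected automatically. For dependencies where it cannot be discovered, add an `sbom_ext.license(...)` entry as described under "Notes on Missing Data".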

## 2. Add SBOM Target in BUILD

```starlark
load("@score_tooling//sbom:defs.bzl", "sbom")

sbom(
    name = "my_sbom",
    targets = ["//my/app:binary"],
    component_name = "my_application",
    component_version = "1.0.0",
    # Auto-generate caches during build
    cargo_lockfile = "//:Cargo.lock",
    auto_crates_cache = True,
    auto_cdxgen = True,  # Requires system-installed npm/cdxgen (see below)
)
```

### Parameters

| Parameter | Description |
| :--- | :--- |
| `targets` | Bazel targets to include in SBOM |
| `component_name` | Main component name (defaults to rule name) |
| `component_version` | Version string |
| `output_formats` | `["spdx", "cyclonedx"]` (default: both) |
| `cargo_lockfile` | Cargo.lock used to auto-generate Rust crate cache |
| `auto_crates_cache` | Auto-generate crates cache when `cargo_lockfile` is set |
| `auto_cdxgen` | Auto-run cdxgen when no `cdxgen_sbom` is provided |

## 3. Install Prerequisites (for auto_cdxgen)

If using `auto_cdxgen = True` to automatically scan C++ dependencies:

```bash
# Install Node.js via nvm (recommended), then install cdxgen globally
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash
source ~/.bashrc
nvm install 20
npm install -g @cyclonedx/cdxgen

# Verify installation
which cdxgen
cdxgen --version
```

**Note:** If you don't have npm/cdxgen installed, set `auto_cdxgen = False` in your SBOM configuration.
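
Before enabling `auto_cdxgen`, it can help to confirm that the tools are actually visible on `PATH`. The following is a minimal sketch, not part of this package: the helper name `missing_tools` and the list of tool names are assumptions made for illustration.

```python
import shutil


def missing_tools(tools=("node", "npm", "cdxgen"), which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        # Hypothetical advice; the actual fallback is to set auto_cdxgen = False.
        print("Missing:", ", ".join(missing), "-> consider auto_cdxgen = False")
    else:
        print("node, npm, and cdxgen were found on PATH")
```

Note that this only checks the interactive shell's `PATH`; the environment seen inside a Bazel action may differ.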

## 4. Build

```bash
bazel build //:my_sbom
```

## 5. Output

Generated files in `bazel-bin/`:

- `my_sbom.spdx.json` — SPDX 2.3 format
- `my_sbom.cdx.json` — CycloneDX 1.6 format
- `my_sbom_crates_metadata.json` — Auto-generated Rust crate cache (if `auto_crates_cache = True`)
- `my_sbom_cdxgen.cdx.json` — C++ dependencies from cdxgen (if `auto_cdxgen = True`)
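
A quick way to sanity-check the generated CycloneDX file is to count components per package type. This is a minimal sketch that relies only on the standard CycloneDX `components[].purl` field; the function names are illustrative and not part of this package.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_cyclonedx(doc):
    """Count components per PURL type in a parsed CycloneDX document."""
    counts = Counter()
    for comp in doc.get("components", []):
        purl = comp.get("purl", "")
        # A PURL looks like "pkg:<type>/<name>@<version>".
        if purl.startswith("pkg:") and "/" in purl:
            counts[purl[4:].split("/", 1)[0]] += 1
        else:
            counts["unknown"] += 1
    return dict(counts)


def load_and_summarize(path):
    """Read a CycloneDX JSON file from disk and summarize it."""
    return summarize_cyclonedx(json.loads(Path(path).read_text(encoding="utf-8")))
```

For example, `load_and_summarize("bazel-bin/my_sbom.cdx.json")` would return a mapping such as `cargo` and `github` to their component counts, assuming the build above has been run.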

---

## Toolchain Components

### Core Tools

| Tool | Role | Required For |
|------|------|--------------|
| [Bazel](https://bazel.build) | Build system — rules, aspects, and module extensions drive dependency discovery and SBOM generation | All SBOM generation |
| [Python 3](https://www.python.org) | Runtime for the SBOM generator, formatters, and metadata extraction scripts | All SBOM generation |
| [crates.io API](https://crates.io) | Rust crate metadata source (license, version, checksums) | Rust metadata extraction when `auto_crates_cache = True` |
| [@cyclonedx/cdxgen](https://github.com/CycloneDX/cdxgen) | C++ dependency scanner and license discovery tool | C++ metadata extraction when `auto_cdxgen = True` |
| [Node.js / npm](https://nodejs.org) | Runtime for cdxgen | C++ metadata extraction when `auto_cdxgen = True` |

### Build-Time Components (Bazel-native, no external dependencies)

| Component | File | Role |
|-----------|------|------|
| **Public API** | `defs.bzl` | `sbom()` macro — user-facing entry point |
| **Module Extension** | `extensions.bzl` | Collects metadata from all modules in dependency graph |
| **Aspect** | `internal/aspect.bzl` | Traverses transitive deps of targets (`SbomDepsInfo` provider) |
| **Rule** | `internal/rules.bzl` | Orchestrates SBOM generation action |
| **Repository Rules** | `repos.bzl`, `repository_rules.bzl` | SBOM-aware `http_archive`/`git_repository` replacements |
| **Generator** | `internal/generator/sbom_generator.py` | Main Python executable — resolves components, loads caches, calls formatters |
| **SPDX Formatter** | `internal/generator/spdx_formatter.py` | Produces SPDX 2.3 JSON output |
| **CycloneDX Formatter** | `internal/generator/cyclonedx_formatter.py` | Produces CycloneDX 1.6 JSON output |
| **PURL Utilities** | `internal/generator/purl.py` | Package URL generation and parsing (`pkg:cargo`, `pkg:github`, `pkg:bazel`, `pkg:generic`) |
| **Rust Cache** | `crates_metadata.json` | Bundled Rust crate metadata (license, version, checksum, PURL) |
| **C++ Cache** | `cpp_metadata.json` | Bundled C++ dependency metadata (license, supplier, version, PURL) |
| **npm Wrapper** | `npm_wrapper.sh` | Shell wrapper to execute system-installed npm/cdxgen from Bazel sandbox |
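
To illustrate what Package URL generation and parsing involves, here is a simplified sketch. It is not the API of `internal/generator/purl.py`: the function names and signatures are assumptions, and qualifiers and subpaths from the PURL specification are deliberately ignored.

```python
from urllib.parse import quote, unquote


def make_purl(purl_type, name, version=None, namespace=None):
    """Build "pkg:<type>/<namespace>/<name>@<version>" (no qualifiers/subpath)."""
    segments = namespace.split("/") if namespace else []
    segments.append(name)
    purl = f"pkg:{purl_type}/" + "/".join(quote(s, safe="") for s in segments)
    if version:
        purl += "@" + quote(version, safe="")
    return purl


def parse_purl(purl):
    """Split a simple PURL into type, namespace, name, and version."""
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a package URL: {purl!r}")
    rest, _, version = purl[4:].partition("@")
    purl_type, _, path = rest.partition("/")
    segments = [unquote(s) for s in path.split("/")]
    return {
        "type": purl_type,
        "namespace": "/".join(segments[:-1]) or None,
        "name": segments[-1],
        "version": unquote(version) or None,
    }
```

With these helpers, `make_purl("github", "googletest", "1.17.0", namespace="google")` yields the same string that `cpp_metadata.json` stores for googletest.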

### Maintenance Scripts (auto-run during build or manual)

| Script | Purpose | When Auto-Run | External dependency |
|--------|---------|---------------|---------------------|
| `scripts/generate_crates_metadata_cache.py` | Regenerate Rust cache from Cargo.lock + crates.io API | ✅ When `auto_crates_cache = True` | Python 3.11+ (tomllib) |
| `scripts/generate_cpp_metadata_cache.py` | Convert cdxgen output to cpp_metadata.json cache | ❌ Manual only (see note below) | None |
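
The conversion from cdxgen output to the cache format can be pictured roughly as follows. This sketch is not the code of `scripts/generate_cpp_metadata_cache.py`; it assumes the input uses the standard CycloneDX `components[]` fields and that the cache is keyed by component name, as in the bundled `cpp_metadata.json`.

```python
def license_of(component):
    """Return the first SPDX expression, license id, or license name found."""
    for entry in component.get("licenses", []):
        if "expression" in entry:
            return entry["expression"]
        lic = entry.get("license", {})
        if lic.get("id") or lic.get("name"):
            return lic.get("id") or lic.get("name")
    return None


def to_cache(cdx_doc):
    """Map a parsed CycloneDX document to {name: {version, license, ...}}."""
    cache = {}
    for comp in cdx_doc.get("components", []):
        name = comp.get("name")
        if not name:
            continue  # skip entries that cannot be keyed
        entry = {
            "version": comp.get("version"),
            "license": license_of(comp),
            "supplier": (comp.get("supplier") or {}).get("name"),
            "purl": comp.get("purl"),
        }
        # Drop unknown values so the cache only holds real data.
        cache[name] = {k: v for k, v in entry.items() if v}
    return cache
```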

### Four-Phase Architecture

```
Phase 1: Loading        Phase 2: Analysis      Phase 3: Metadata Extraction      Phase 4: Generation
(extensions.bzl)        (aspect.bzl)           (rules.bzl - parallel)            (sbom_generator.py)

MODULE.bazel            Bazel targets
     |                       |                 ┌─ Cargo.lock (Rust)
     v                       v                 │       ↓
sbom_metadata ext --->  SbomDepsInfo   --->    │  generate_crates_cache.py
     |                  aspect                 │       ↓                         All metadata
     v                       |                 │  crates_metadata.json           combined
metadata.json                v                 │       ↓
                        _deps.json             │                                 + Python generator
                                               │                                       ↓
                                               └─ Source tree (C++)              .spdx.json
                                                       ↓                         .cdx.json
                                                  cdxgen --deep -r
                                                  (scans LICENSE,
                                                   CMake, headers)
                                                  cdxgen.cdx.json
```

**Key: Parallel execution** (Rust and C++ run independently, not sequentially)

**Phase Details:**

1. **Loading** — MODULE.bazel extension collects metadata from all modules in dependency graph
2. **Analysis** — Bazel aspect traverses target dependencies and extracts external repo information
3. **Metadata Extraction** — **Auto-generate enrichment data in parallel** (independent pipelines):

   **Branch A: Rust** (`auto_crates_cache = True`):
   - Input: `Cargo.lock` file (dependency lock file)
   - Process: Query crates.io API for each crate's license, version, checksum
   - Output: `crates_metadata.json` with complete Rust crate metadata

   **Branch B: C++** (`auto_cdxgen = True`):
   - Input: **Source tree** (C++ files, CMakeLists.txt, LICENSE files, headers)
   - Process: Run `cdxgen --deep -r` to scan for dependencies and licenses
   - Output: `cdxgen.cdx.json` with discovered C++ dependencies

   ⚡ **Note:** Both branches run **independently and can execute in parallel**. Neither depends on the other.

4. **Generation** — Combine all metadata (phases 1-3) and generate SPDX/CycloneDX output files

### What Is Excluded

- Dependencies not in the transitive dep graph of your `targets`
- Build toolchain repos matching `exclude_patterns` (e.g. `rules_rust`, `rules_cc`, `bazel_tools`, `platforms`)

## Example

See [reference_integration/BUILD](../../reference_integration/BUILD) for working SBOM targets with both `auto_crates_cache` and `auto_cdxgen` enabled, and [reference_integration/MODULE.bazel](../../reference_integration/MODULE.bazel) for the metadata extension setup.

---

## CISA 2025 Element Coverage (CycloneDX)

The table below maps the CISA 2025 draft elements to CycloneDX fields and notes current support in this SBOM generator.

| CISA 2025 Element | CycloneDX Field (JSON) | Support | Notes |
|---|---|---|---|
| Software Producer | `components[].supplier.name` (or manufacturer) | **Supported** | Component `supplier` is emitted when provided. Root producer is in `metadata.component.supplier`. |
| Component Name | `components[].name` | **Supported** | Single name; aliases are stored as `properties` with `cdx:alias`. |
| Component Version | `components[].version` | **Supported** | If the version is unknown and the source is a git repository with a `commit_date`, the version can fall back to that date. |
| Software Identifiers | `components[].purl`, `components[].cpe` | **Supported (PURL)** / **Optional (CPE)** | PURL is generated for all components. CPE is emitted only when provided in metadata. |
| Component Hash | `components[].hashes` | **Supported** | SHA-256 supported (Cargo lock + http_archive `sha256` + repo metadata). |
| License | `components[].licenses` | **Supported when known** | Requires license metadata from `sbom_ext.license(...)`, repo metadata, or caches. |
| Dependency Relationship | `dependencies` | **Supported** | Uses external repo dependency edges from Bazel aspect. |
| Pedigree / Derivation | `components[].pedigree` | **Supported (manual)** | Must be provided via metadata (`pedigree_*` fields). Not auto-deduced. |
| SBOM Author | `metadata.authors` | **Supported** | Set via `sbom_authors` in `sbom()` rule. |
| Tool Name | `metadata.tools` | **Supported** | Always includes `score-sbom-generator`; extra tools via `sbom_tools`. |
| Timestamp | `metadata.timestamp` | **Supported** | ISO 8601 UTC timestamp generated at build time. |
| Generation Context | `metadata.lifecycles` | **Supported** | Set via `generation_context` in `sbom()` rule (`pre-build`, `build`, `post-build`). |

### Notes on Missing Data
If a field is absent in output, it usually means the source metadata was not provided:
- Licenses and suppliers require `sbom_ext.license(...)` or repo metadata.
- CPE, aliases, and pedigree are optional and must be explicitly set.
- Rust crate licenses require a crates metadata cache; this can now be generated automatically when `cargo_lockfile` is provided to `sbom()`.
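
To find out which components are affected, the generated CycloneDX file can be scanned for absent fields. This is a minimal sketch using the field names from the coverage table; the function name and the choice of checked fields are assumptions, not part of the generator.

```python
def missing_fields(doc):
    """Return {component name: [absent fields]} for a parsed CycloneDX document."""
    checks = {
        "supplier": lambda c: bool((c.get("supplier") or {}).get("name")),
        "version": lambda c: bool(c.get("version")),
        "purl": lambda c: bool(c.get("purl")),
        "hashes": lambda c: bool(c.get("hashes")),
        "licenses": lambda c: bool(c.get("licenses")),
    }
    report = {}
    for comp in doc.get("components", []):
        gaps = [field for field, present in checks.items() if not present(comp)]
        if gaps:
            report[comp.get("name", "<unnamed>")] = gaps
    return report
```

Components that show up in the report are candidates for the explicit metadata entries described in this section.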

Examples (add to `MODULE.bazel`):

```starlark
# bazel_dep module (version from module graph)
sbom_ext.license(
    name = "googletest",
    license = "BSD-3-Clause",
    supplier = "Google LLC",
)

# http_archive dependency (explicit version)
sbom_ext.license(
    name = "boost",
    license = "BSL-1.0",
    version = "1.87.0",
    supplier = "Boost.org",
)

# git_repository dependency
sbom_ext.license(
    name = "iceoryx2",
    license = "Apache-2.0",
    version = "0.7.0",
    supplier = "Eclipse Foundation",
    remote = "https://github.com/eclipse-iceoryx/iceoryx2.git",
)

# Rust crate (type = "cargo")
sbom_ext.license(
    name = "tokio",
    license = "MIT",
    version = "1.10.0",
    type = "cargo",
    supplier = "Tokio Contributors",
)

# Optional metadata (CPE, aliases, pedigree)
sbom_ext.license(
    name = "linux-kernel",
    license = "GPL-2.0-only",
    version = "5.10.120",
    cpe = "cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*",
    aliases = ["linux", "kernel"],
    pedigree_ancestors = ["pkg:generic/linux-kernel@5.10.130"],
    pedigree_notes = "Backported CVE-2025-12345 fix from 5.10.130",
)
```
55 changes: 55 additions & 0 deletions sbom/cpp_metadata.json
@@ -0,0 +1,55 @@
{
  "boost": {
    "version": "1.87.0",
    "license": "BSL-1.0",
    "supplier": "Boost.org",
    "purl": "pkg:conan/boost@1.87.0",
    "url": "https://www.boost.org/"
  },
  "nlohmann-json": {
    "version": "3.11.3",
    "license": "MIT",
    "supplier": "Niels Lohmann",
    "purl": "pkg:conan/nlohmann_json@3.11.3",
    "url": "https://github.com/nlohmann/json"
  },
  "googletest": {
    "version": "1.17.0",
    "license": "BSD-3-Clause",
    "supplier": "Google LLC",
    "purl": "pkg:github/google/googletest@1.17.0",
    "url": "https://github.com/google/googletest"
  },
  "google_benchmark": {
    "version": "1.9.4",
    "license": "Apache-2.0",
    "supplier": "Google LLC",
    "purl": "pkg:github/google/benchmark@1.9.4",
    "url": "https://github.com/google/benchmark"
  },
  "flatbuffers": {
    "version": "25.2.10",
    "license": "Apache-2.0",
    "supplier": "Google LLC",
    "purl": "pkg:github/google/flatbuffers@25.2.10",
    "url": "https://github.com/google/flatbuffers"
  },
  "vsomeip": {
    "version": "3.6.0",
    "license": "MPL-2.0",
    "supplier": "COVESA",
    "purl": "pkg:github/COVESA/vsomeip@3.6.0",
    "url": "https://github.com/COVESA/vsomeip"
  },
  "json_schema_validator": {
    "version": "2.1.0",
    "license": "MIT",
    "supplier": "Patrick Boettcher",
    "purl": "pkg:github/pboettch/json-schema-validator@2.1.0"
  },
  "bazel_skylib": {
    "version": "1.8.1",
    "license": "Apache-2.0",
    "purl": "pkg:github/bazelbuild/bazel-skylib@1.8.1"
  }
}