Add SBoM ingestion service processing
What does this MR do and why?
Issue: #364709 (closed) Epic: &8024 (closed)
This MR adds a service for ingesting SBoM reports. The reports are JSON files which represent several objects. This service is responsible for taking pre-parsed representations of these reports, and persisting them into the database using bulk upserts. This MR adds the pre-processing to prepare these objects for insertion. The bulk insertions are implemented in !96575 (merged).
The following DDL diagram shows the relations and the order in which they are created:
All relations tie back to a single sbom_occurrence record, so an OccurrenceMap data structure is used to hold all attributes which are related to each other during processing. The service takes the report data, turns it into OccurrenceMaps, and then passes the OccurrenceMaps into the ingestion pipeline, which performs bulk upserts for each model. The following diagram illustrates the flow of data:
flowchart TD
IngestReportsWorker[IngestReportsWorker: Executes IngestReportsService when pipelines complete];
IngestReportsService[IngestReportsService: Collects reports from pipeline];
IngestReportService[IngestReportService: Turns a single report into batches of OccurrenceMaps];
IngestReportSliceService[IngestReportSliceService: Passes a batch of OccurrenceMaps into the ingestion pipeline];
IngestReportsWorker-- pipeline -->IngestReportsService
IngestReportsService-- sbom_report -->IngestReportService
IngestReportService-- "occurrence_maps (batched)" -->IngestReportSliceService
IngestReportSliceService-- "occurrence_maps (batched)" -->IngestComponents
subgraph Ingestion Pipeline
IngestComponents-- component_ids -->IngestComponentVersions
IngestComponentVersions-- component_version_ids -->IngestSources
IngestSources-- source_ids -->IngestOccurrences
end
This MR implements these classes up to, but not including, the ingestion pipeline.
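The batching step described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this MR: the Struct stand-ins for the parsed report objects, the `occurrence_map_batches` helper, the `BATCH_SIZE` value, and the third component name are all hypothetical, and the `*_id` accessors are only an assumption about how later pipeline tasks might record the IDs they create.

```ruby
# Hypothetical, simplified stand-ins for the pre-parsed report objects.
ReportComponent = Struct.new(:component_type, :name, :version, keyword_init: true)
ReportSource = Struct.new(:source_type, :data, keyword_init: true)
Report = Struct.new(:components, :source, keyword_init: true)

# Groups everything that belongs to one future sbom_occurrence record.
class OccurrenceMap
  attr_reader :report_component, :report_source
  # Assumed slots for IDs produced by later ingestion tasks.
  attr_accessor :component_id, :component_version_id, :source_id

  def initialize(report_component, report_source)
    @report_component = report_component
    @report_source = report_source
  end
end

# Assumed value, kept small so that the batching is visible.
BATCH_SIZE = 2

# Turns a single report into batches of OccurrenceMaps.
def occurrence_map_batches(report, batch_size: BATCH_SIZE)
  report.components
        .map { |component| OccurrenceMap.new(component, report.source) }
        .each_slice(batch_size)
        .to_a
end

report = Report.new(
  source: ReportSource.new(
    source_type: :dependency_scanning,
    data: { 'package_manager' => { 'name' => 'go' } }
  ),
  components: [
    ReportComponent.new(component_type: 'library', name: 'github.com/astaxie/beego', version: 'v1.10.0'),
    ReportComponent.new(component_type: 'library', name: 'github.com/davecgh/go-spew', version: 'v1.1.1'),
    ReportComponent.new(component_type: 'library', name: 'example.com/hypothetical/pkg', version: 'v0.0.1')
  ]
)

batches = occurrence_map_batches(report)

# Each batch would be handed to the ingestion pipeline in turn.
batches.each do |batch|
  names = batch.map { |map| map.report_component.name }.join(', ')
  puts "batch of #{batch.size}: #{names}"
end
```

Every map in a report shares the same source object, which mirrors the sample output in the validation steps, where consecutive maps point at the same Source instance.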
How to set up and validate locally
- Create a new project.
- Add the following .gitlab-ci.yml to the project:

      persist_sbom:
        image: alpine:latest
        script:
          - wget https://gitlab.com/-/snippets/2378046/raw/main/gl-sbom-npm-npm.cdx.json
          - wget https://gitlab.com/-/snippets/2378046/raw/main/gl-sbom-go-go.cdx.json
        artifacts:
          reports:
            cyclonedx:
              - gl-sbom-npm-npm.cdx.json
              - gl-sbom-go-go.cdx.json
- The pipeline should run and succeed. Note down the pipeline ID.
- Make this change:

      diff --git a/ee/app/services/sbom/ingestion/tasks/ingest_components.rb b/ee/app/services/sbom/ingestion/tasks/ingest_components.rb
      index f3ee5025553..c975f344aa1 100644
      --- a/ee/app/services/sbom/ingestion/tasks/ingest_components.rb
      +++ b/ee/app/services/sbom/ingestion/tasks/ingest_components.rb
      @@ -5,7 +5,11 @@ module Ingestion
           module Tasks
             class IngestComponents < Base
               def self.execute(pipeline, occurrence_maps)
      -          # Not yet implemented
      +          f = File.open(Rails.root.join('output.txt'), 'a')
      +          f.puts "Got occurrence maps"
      +          f.puts "Size: #{occurrence_maps.size}"
      +          PP.pp(occurrence_maps, f)
      +          f.close
               end
             end
           end
- Start the rails console:

      bundle exec rails c
- Invoke the service, replacing pipeline_id with the ID noted earlier:

      pipeline = Ci::Pipeline.find(pipeline_id)
      ::Sbom::Ingestion::IngestReportsService.execute(pipeline)
- Look in output.txt to see what got passed to the ingestion pipeline:

      $ head -n 30 output.txt
      Got occurrence maps
      Size: 15
      [#<Sbom::Ingestion::OccurrenceMap:0x000000013ac7aca0
        @report_component=
         #<Gitlab::Ci::Reports::Sbom::Component:0x000000010e1741c0
          @component_type="library",
          @name="github.com/astaxie/beego",
          @version="v1.10.0">,
        @report_source=
         #<Gitlab::Ci::Reports::Sbom::Source:0x000000010e17f818
          @data=
           {"input_file"=>{"path"=>"go.sum"},
            "source_file"=>{"path"=>"go.mod"},
            "package_manager"=>{"name"=>"go"},
            "language"=>{"name"=>"go"}},
          @fingerprint=
           "78f0613de674dc2d37f07d8662969754f46abbdfe7efd88fcc6cbe8d37df9058",
          @source_type=:dependency_scanning>>,
       #<Sbom::Ingestion::OccurrenceMap:0x000000013ac7ac50
        @report_component=
         #<Gitlab::Ci::Reports::Sbom::Component:0x000000010e174008
          @component_type="library",
          @name="github.com/davecgh/go-spew",
          @version="v1.1.1">,
        @report_source=
         #<Gitlab::Ci::Reports::Sbom::Source:0x000000010e17f818
          @data=
           {"input_file"=>{"path"=>"go.sum"},
            "source_file"=>{"path"=>"go.mod"},
            "package_manager"=>{"name"=>"go"},
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.