Update vulnerability advisory table column constraints
Why are we doing this work
The vulnerability_advisories
table was modeled to import the advisories in the advisory database, and has safeguards in place to control the size of the table e.g. an advisory description cannot be longer than 2048 characters. While importing the advisory database, we noticed that some of the advisories do not fit within these constraints. Also, some of the constraints don't match the ones defined in the JSON schema for the YAML files of the advisory database.
The status of the constraints are as follows:
DB column | DB limit | schema limit | max length | status | notes |
---|---|---|---|---|---|
description | 2048 | 1048576 | 5502 | could be compressed | |
title | 2048 | 255 | 228 | ||
component_name | 2048 | - | ? | should match sbom_components.name
|
|
solution | 2048 | - | ? | can be generated | |
not_impacted | 2048 | - | ? | can be generated | |
cvss_v2 | 128 | (pattern) | ? | ||
cvss_v3 | 128 | (pattern) | ? | ||
affected_range | 32 | - | 369 |
Additionally, the advisory database uses characters that are not UTF-8 compliant, for example a null byte, and this causes the ingestion of the advisor to fail. We should update the constraints of the advisory table used in Advisories stored in vulnerability_advisories a... (#375302 - closed) so that it handles these edge cases.
Relevant links
For a more detailed analysis see #371046 (comment 1347512212).
Non-functional requirements
-
Documentation: -
Feature flag: -
Performance: -
Testing: Verify if importing the advisory database produces no errors. The glad-to-csv can be used to produce an import script.
Implementation plan
-
Update the title
limit to255
-
Update the description
limit to8192
. See #406596 (comment 1348751418). -
Update the component_name
limit to255
-
Update the affected_range
limit to512
-
Update the links
column type tojsonb
to match gemnasium-db's JSON/YAML schema. -
Remove the urls
column to deduplicate data inlinks
column. -
Escape null bytes and other characters that do not comply with UTF-8 encoding. The null-bytes can be grepped via ripgrep like so rg '\\0'
.
Verification steps
TODO