Update package metadata schema to support license deduplication
Why are we doing this work
To deduplicate the current package metadata dataset, a new column is needed to store licenses without going through the join tables (pm_package_versions and pm_package_version_licenses).
To fully re-import the license dataset while keeping the sync positions on instances that already have package metadata, a new checkpoint needs to be created. The checkpoint has the same version_format and accesses the same URLs.
Relevant links
Implementation plan
- add migration to update pm_packages
  - add jsonb column
- update model and validation with json_schema
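
As a rough illustration of the validation step, here is a minimal plain-Ruby sketch. It is only a stand-in for the json_schema validation mentioned above, not the actual validator. The payload shape (a JSON array of non-empty license identifier strings) and the LicensesPayload module name are assumptions made for the example; this issue does not define the real schema.

```ruby
require 'json'

# Hypothetical helper: checks a raw JSON string against an ASSUMED shape
# for the new jsonb column (an array of non-empty license identifier strings).
# The real schema is not specified in this issue.
module LicensesPayload
  # Returns a list of human-readable error messages; empty when the payload is valid.
  def self.errors_for(raw_json)
    data = JSON.parse(raw_json)
    return ['must be an array'] unless data.is_a?(Array)

    # Collect one message per entry that is not a non-empty string.
    data.each_with_index.filter_map do |entry, index|
      "item #{index} must be a non-empty string" unless entry.is_a?(String) && !entry.empty?
    end
  rescue JSON::ParserError
    ['must be valid JSON']
  end

  def self.valid?(raw_json)
    errors_for(raw_json).empty?
  end
end
```

Under these assumptions, LicensesPayload.valid?('["MIT", "Apache-2.0"]') passes, while an object payload or an empty identifier is rejected with a message. In the model itself the same rules would be expressed in the JSON schema file rather than in hand-written Ruby.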
Edited by Igor Frenkel