Adds relation package_id to ml_candidates
What does this MR do and why?
Replaces an implicit relation between ml_candidates and packages_packages with an explicit one.
Adds a foreign key from ml_candidates to packages_packages. The key is populated when the package is created, through a new event, PackageCreatedEvent. Existing ml_candidate packages are backfilled by a migration.
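The backfill relies on the naming convention for candidate packages: the package name is ml_candidate_<id> and the version is "-". As a rough illustration only, the mapping from package rows to candidates could be sketched in plain Ruby as below. The helper names and the hash-shaped rows are hypothetical and not part of this MR; the actual backfill is the SQL UPDATE shown in the migration output.

```ruby
# Hypothetical helpers, not part of this MR. They mirror the convention the
# backfill relies on: a candidate's package is named "ml_candidate_<id>"
# and has version "-".
PACKAGE_NAME_PREFIX = 'ml_candidate_'
PACKAGE_VERSION = '-'

# Returns the candidate id encoded in the package name, or nil when the
# name/version pair does not follow the convention.
def candidate_id_for(package_name, version)
  return nil unless version == PACKAGE_VERSION
  return nil unless package_name.start_with?(PACKAGE_NAME_PREFIX)

  suffix = package_name.delete_prefix(PACKAGE_NAME_PREFIX)
  suffix.match?(/\A\d+\z/) ? suffix.to_i : nil
end

# Builds { candidate_id => package_id } from rows shaped like
# packages_packages records (assumed here to be plain hashes).
def candidate_to_package(packages)
  packages.each_with_object({}) do |pkg, mapping|
    candidate_id = candidate_id_for(pkg[:name], pkg[:version])
    mapping[candidate_id] = pkg[:id] if candidate_id
  end
end
```

For example, candidate_to_package([{ id: 126, name: 'ml_candidate_7', version: '-' }]) maps candidate 7 to package 126, while packages with other names or versions are skipped.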
Database
Queries
This query replaces: !104166 (merged)
SELECT
"packages_packages".*
FROM
"packages_packages"
WHERE
"packages_packages"."id" IN (126, 125)
Index Scan using packages_packages_pkey on packages_packages (cost=0.14..3.32 rows=2 width=89) (actual time=0.043..0.049 rows=2 loops=1)
  Index Cond: (id = ANY ('{126,125}'::bigint[]))
Migrations
Up
❯ bundle exec rails db:migrate
main: == 20230308154243 AddPackageIdToMlCandidates: migrating =======================
main: -- add_column(:ml_candidates, :package_id, :bigint, {:null=>true})
main: -> 0.0088s
main: == 20230308154243 AddPackageIdToMlCandidates: migrated (0.0346s) ==============
main: == 20230308154244 AddPackageIdForeignKeyToMlCandidates: migrating =============
main: -- transaction_open?()
main: -> 0.0000s
main: -- transaction_open?()
main: -> 0.0000s
main: -- execute("ALTER TABLE ml_candidates ADD CONSTRAINT fk_a1d5f1bc45 FOREIGN KEY (package_id) REFERENCES packages_packages (id) ON DELETE SET NULL NOT VALID;")
main: -> 0.0032s
main: -- execute("SET statement_timeout TO 0")
main: -> 0.0002s
main: -- execute("ALTER TABLE ml_candidates VALIDATE CONSTRAINT fk_a1d5f1bc45;")
main: -> 0.0032s
main: -- execute("RESET statement_timeout")
main: -> 0.0003s
main: == 20230308154244 AddPackageIdForeignKeyToMlCandidates: migrated (0.1146s) ====
main: == 20230308154245 AddIndexOnPackageIdForMlCandidates: migrating ===============
main: -- transaction_open?()
main: -> 0.0000s
main: -- view_exists?(:postgres_partitions)
main: -> 0.0028s
main: -- index_exists?(:ml_candidates, :package_id, {:name=>"index_ml_candidates_on_package_id", :algorithm=>:concurrently})
main: -> 0.0131s
main: -- add_index(:ml_candidates, :package_id, {:name=>"index_ml_candidates_on_package_id", :algorithm=>:concurrently})
main: -> 0.0082s
main: == 20230308154245 AddIndexOnPackageIdForMlCandidates: migrated (0.0870s) ======
main: == 20230313142631 BackfillMlCandidatesPackageId: migrating ====================
main: -- execute(" UPDATE ml_candidates\n SET package_id = candidate_id_to_package_id.package_id\n FROM (SELECT id as package_id, TRIM(LEADING 'ml_candidates_' FROM name) as candidate_id\n FROM packages_packages\n WHERE name LIKE 'ml_candidate_%'\n and version = '-') AS candidate_id_to_package_id\n WHERE cast(ml_candidates.id as text) = candidate_id_to_package_id.candidate_id\n")
main: -> 0.1023s
main: == 20230313142631 BackfillMlCandidatesPackageId: migrated (0.1615s) ===========
Down
❯ bundle exec rails db:migrate:down:main RAILS_ENV=development VERSION=20230313142631
main: == 20230313142631 BackfillMlCandidatesPackageId: reverting ====================
main: == 20230313142631 BackfillMlCandidatesPackageId: reverted (0.0233s) ===========
❯ bundle exec rails db:migrate:down:main RAILS_ENV=development VERSION=20230308154245
main: == 20230308154245 AddIndexOnPackageIdForMlCandidates: reverting ===============
main: -- transaction_open?()
main: -> 0.0003s
main: -- view_exists?(:postgres_partitions)
main: -> 0.1933s
main: -- indexes(:ml_candidates)
main: -> 0.0063s
main: -- execute("SET statement_timeout TO 0")
main: -> 0.0007s
main: -- remove_index(:ml_candidates, {:algorithm=>:concurrently, :name=>"index_ml_candidates_on_package_id"})
main: -> 0.0051s
main: -- execute("RESET statement_timeout")
main: -> 0.0008s
main: == 20230308154245 AddIndexOnPackageIdForMlCandidates: reverted (0.2388s) ======
❯ bundle exec rails db:migrate:down:main RAILS_ENV=development VERSION=20230308154244
main: == 20230308154244 AddPackageIdForeignKeyToMlCandidates: reverting =============
main: -- transaction_open?()
main: -> 0.0000s
main: -- remove_foreign_key(:ml_candidates, {:column=>:package_id})
main: -> 0.0062s
main: == 20230308154244 AddPackageIdForeignKeyToMlCandidates: reverted (0.2534s) ====
❯ bundle exec rails db:migrate:down:main RAILS_ENV=development VERSION=20230308154243
main: == 20230308154243 AddPackageIdToMlCandidates: reverting =======================
main: -- remove_column(:ml_candidates, :package_id, :bigint, {:null=>true})
main: -> 0.0045s
main: == 20230308154243 AddPackageIdToMlCandidates: reverted (0.0168s) ==============
How to set up and validate locally
- Enable the feature flag:
echo "Feature.enable(:ml_experiment_tracking)" | bundle exec rails c
- Create a project and a project access token with api scope:
export PROJECT_ID=<Your Project Id>
export GITLAB_PAT=<your api token>
- Create an Experiment:
curl -X POST -H "Authorization: Bearer $GITLAB_PAT" -d name=my_cool_experiment http://gdk.test:3000/api/v4/projects/$PROJECT_ID/ml/mlflow/api/2.0/mlflow/experiments/create
- Create a Run. The artifact_uri in the response should be a URL to the generic packages API, and look something like http://gdk.test:3000/api/v4/projects/21/packages/generic/ml_candidate_{id}/-/
curl -X POST -H "Authorization: Bearer $GITLAB_PAT" -d experiment_id=1 http://gdk.test:3000/api/v4/projects/$PROJECT_ID/ml/mlflow/api/2.0/mlflow/runs/create
- In the Rails console, make sure the candidate has no package associated yet (iid is the run_iid returned in the previous call):
Ml::Candidate.last.package_id
- Upload a file:
curl --header "PRIVATE-TOKEN: $GITLAB_PAT" --upload-file file.txt "{CANDIDATE_UPLOAD_URL}/file.txt"
- Check that the worker AssociateMlCandidateToPackageWorker has been called by looking at logs/sidekiq.log
- In the Rails console, make sure the candidate now has a package associated:
Ml::Candidate.last.package_id
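To make the artifact_uri check in the Create a Run step less error-prone, the expected URL can be computed and compared against the response. The following is a minimal sketch in plain Ruby; both helper names are hypothetical, the path layout is copied from the example URL above, and it assumes that {CANDIDATE_UPLOAD_URL} in the upload step is the artifact_uri returned by the run creation call.

```ruby
# Hypothetical helpers for local validation, not part of this MR.

# Builds the artifact_uri expected for a candidate, following the example
# http://gdk.test:3000/api/v4/projects/21/packages/generic/ml_candidate_{id}/-/
def expected_artifact_uri(base_url, project_id, candidate_id)
  "#{base_url}/api/v4/projects/#{project_id}/packages/generic/ml_candidate_#{candidate_id}/-/"
end

# Appends a file name to the artifact_uri, tolerating a trailing slash.
# Assumption: the upload URL is the artifact_uri followed by the file name.
def upload_url(artifact_uri, file_name)
  "#{artifact_uri.chomp('/')}/#{file_name}"
end

uri = expected_artifact_uri('http://gdk.test:3000', 21, 5)
puts uri
puts upload_url(uri, 'file.txt')
```

The printed upload URL can then be passed to the curl --upload-file command from the upload step.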
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.