Skip to content

Remove SELECT N+1 on software license policies

mo khan requested to merge 208723-pipelines-licenses into master

What does this MR do?

This change replaces an N+1 lookup by using the LicenseCompliance class to take care of loading/merging software license policies and detected licenses. The N+1 current exists in the LicenseScanningReportLicenseEntity and is still used by the /merge_requests/license_scannings_reports endpoint. This MR decouples the pipelines/licenses endpoint from this N+1.

Before:

class LicenseScanningReportLicenseEntity < Grape::Entity
  include RequestAwareEntity

  expose :name
  expose :classification
  expose :dependencies, using: LicenseScanningReportDependencyEntity
  expose :count
  expose :url

  def classification
    default = { id: nil, name: value_for(:name), approval_status: 'unclassified' }
    found = SoftwareLicensePoliciesFinder.new(request&.current_user, request&.project, name: value_for(:name)).find
    ManagedLicenseEntity.represent(found || default)
  end
end
[0] pry(main)> LicenseScanningReportsSerializer.new(project: Project.find(23), current_user: User.find_by(name: 'xlgmokha')).represent(Ci::Pipeline.find(61).license_scanning_report.licenses)
  Project Load (1.0ms)  SELECT "projects".* FROM "projects" WHERE "projects"."id" = 23 LIMIT 1
  User Load (2.4ms)  SELECT "users".* FROM "users" WHERE "users"."name" = 'xlgmokha' LIMIT 1
Creating scope :without_statuses. Overwriting existing method Ci::Pipeline.without_statuses.
  Ci::Pipeline Load (0.6ms)  SELECT "ci_pipelines".* FROM "ci_pipelines" WHERE "ci_pipelines"."id" = 61 LIMIT 1
  Ci::Build Load (1.2ms)  SELECT "ci_builds".* FROM "ci_builds" WHERE "ci_builds"."type" = 'Ci::Build' AND "ci_builds"."commit_id" = 61 AND ("ci_builds"."retried" = FALSE OR "ci_builds"."retried" IS NULL) AND (EXISTS (SELECT 1 FROM "ci_job_artifacts" WHERE (ci_builds.id = ci_job_artifacts.job_id) AND "ci_job_artifacts"."file_type" IN (10, 101)))
  Ci::JobArtifact Load (0.6ms)  SELECT "ci_job_artifacts".* FROM "ci_job_artifacts" WHERE "ci_job_artifacts"."job_id" = 929
  Project Load (0.9ms)  SELECT "projects".* FROM "projects" WHERE "projects"."id" = 24 LIMIT 1
  License Load (0.4ms)  SELECT "licenses".* FROM "licenses" ORDER BY "licenses"."id" DESC LIMIT 100
  ApplicationSetting Load (1.4ms)  SELECT "application_settings".* FROM "application_settings" ORDER BY "application_settings"."id" DESC LIMIT 1
  License Load (0.6ms)  SELECT "licenses".* FROM "licenses" ORDER BY "licenses"."id" DESC LIMIT 100
  ProjectFeature Load (0.7ms)  SELECT "project_features".* FROM "project_features" WHERE "project_features"."project_id" = 23 LIMIT 1
  SQL (1.1ms)  SELECT "software_license_policies"."id" AS t0_r0, "software_license_policies"."project_id" AS t0_r1, "software_license_policies"."software_license_id" AS t0_r2, "software_license_policies"."classification" AS t0_r3, "software_license_policies"."created_at" AS t0_r4, "software_license_policies"."updated_at" AS t0_r5, "software_licenses"."id" AS t1_r0, "software_licenses"."name" AS t1_r1, "software_licenses"."spdx_identifier" AS t1_r2 FROM "software_license_policies" INNER JOIN "software_licenses" ON "software_licenses"."id" = "software_license_policies"."software_license_id" WHERE "software_license_policies"."project_id" = 23 AND LOWER("software_licenses"."name") IN ('(bsd-2-clause or mit or apache-2.0)') LIMIT 1
  SQL (1.1ms)  SELECT "software_license_policies"."id" AS t0_r0, "software_license_policies"."project_id" AS t0_r1, "software_license_policies"."software_license_id" AS t0_r2, "software_license_policies"."classification" AS t0_r3, "software_license_policies"."created_at" AS t0_r4, "software_license_policies"."updated_at" AS t0_r5, "software_licenses"."id" AS t1_r0, "software_licenses"."name" AS t1_r1, "software_licenses"."spdx_identifier" AS t1_r2 FROM "software_license_policies" INNER JOIN "software_licenses" ON "software_licenses"."id" = "software_license_policies"."software_license_id" WHERE "software_license_policies"."project_id" = 23 AND LOWER("software_licenses"."name") IN ('(gpl-2.0 or mit)') LIMIT 1
  SQL (0.8ms)  SELECT "software_license_policies"."id" AS t0_r0, "software_license_policies"."project_id" AS t0_r1, "software_license_policies"."software_license_id" AS t0_r2, "software_license_policies"."classification" AS t0_r3, "software_license_policies"."created_at" AS t0_r4, "software_license_policies"."updated_at" AS t0_r5, "software_licenses"."id" AS t1_r0, "software_licenses"."name" AS t1_r1, "software_licenses"."spdx_identifier" AS t1_r2 FROM "software_license_policies" INNER JOIN "software_licenses" ON "software_licenses"."id" = "software_license_policies"."software_license_id" WHERE "software_license_policies"."project_id" = 23 AND LOWER("software_licenses"."name") IN ('(mit and bsd-3-clause)') LIMIT 1
  SQL (0.9ms)  SELECT "software_license_policies"."id" AS t0_r0, "software_license_policies"."project_id" AS t0_r1, "software_license_policies"."software_license_id" AS t0_r2, "software_license_policies"."classification" AS t0_r3, "software_license_policies"."created_at" AS t0_r4, "software_license_policies"."updated_at" AS t0_r5, "software_licenses"."id" AS t1_r0, "software_licenses"."name" AS t1_r1, "software_licenses"."spdx_identifier" AS t1_r2 FROM "software_license_policies" INNER JOIN "software_licenses" ON "software_licenses"."id" = "software_license_policies"."software_license_id" WHERE "software_license_policies"."project_id" = 23 AND LOWER("software_licenses"."name") IN ('(mit and zlib)') LIMIT 1
  SQL (0.9ms)  SELECT "software_license_policies"."id" AS t0_r0, "software_license_policies"."project_id" AS t0_r1, "software_license_policies"."software_license_id" AS t0_r2, "software_license_policies"."classification" AS t0_r3, "software_license_policies"."created_at" AS t0_r4, "software_license_policies"."updated_at" AS t0_r5, "software_licenses"."id" AS t1_r0, "software_licenses"."name" AS t1_r1, "software_licenses"."spdx_identifier" AS t1_r2 FROM "software_license_policies" INNER JOIN "software_licenses" ON "software_licenses"."id" = "software_license_policies"."software_license_id" WHERE "software_license_policies"."project_id" = 23 AND LOWER("software_licenses"."name") IN ('(mit or apache-2.0)') LIMIT 1
  SQL (1.1ms)  SELECT "software_license_policies"."id" AS t0_r0, "software_license_policies"."project_id" AS t0_r1, "software_license_policies"."software_license_id" AS t0_r2, "software_license_policies"."classification" AS t0_r3, "software_license_policies"."created_at" AS t0_r4, "software_license_policies"."updated_at" AS t0_r5, "software_licenses"."id" AS t1_r0, "software_licenses"."name" AS t1_r1, "software_licenses"."spdx_identifier" AS t1_r2 FROM "software_license_policies" INNER JOIN "software_licenses" ON "software_licenses"."id" = "software_license_policies"."software_license_id" WHERE "software_license_policies"."project_id" = 23 AND LOWER("software_licenses"."name") IN ('(mit or cc0-1.0)') LIMIT 1
  SQL (0.9ms)  SELECT "software_license_policies"."id" AS t0_r0, "software_license_policies"."project_id" AS t0_r1, "software_license_policies"."software_license_id" AS t0_r2, "software_license_policies"."classification" AS t0_r3, "software_license_policies"."created_at" AS t0_r4, "software_license_policies"."updated_at" AS t0_r5, "software_licenses"."id" AS t1_r0, "software_licenses"."name" AS t1_r1, "software_licenses"."spdx_identifier" AS t1_r2 FROM "software_license_policies" INNER JOIN "software_licenses" ON "software_licenses"."id" = "software_license_policies"."software_license_id" WHERE "software_license_policies"."project_id" = 23 AND LOWER("software_licenses"."name") IN ('(wtfpl or mit)') LIMIT 1
  SQL (0.8ms)  SELECT "software_license_policies"."id" AS t0_r0, "software_license_policies"."project_id" AS t0_r1, "software_license_policies"."software_license_id" AS t0_r2, "software_license_policies"."classification" AS t0_r3, "software_license_policies"."created_at" AS t0_r4, "software_license_policies"."updated_at" AS t0_r5, "software_licenses"."id" AS t1_r0, "software_licenses"."name" AS t1_r1, "software_licenses"."spdx_identifier" AS t1_r2 FROM "software_license_policies" INNER JOIN "software_licenses" ON "software_licenses"."id" = "software_license_policies"."software_license_id" WHERE "software_license_policies"."project_id" = 23 AND LOWER("software_licenses"."name") IN ('apache license 2.0') LIMIT 1
  SQL (0.9ms)  SELECT "software_license_policies"."id" AS t0_r0, "software_license_policies"."project_id" AS t0_r1, "software_license_policies"."software_license_id" AS t0_r2, "software_license_policies"."classification" AS t0_r3, "software_license_policies"."created_at" AS t0_r4, "software_license_policies"."updated_at" AS t0_r5, "software_licenses"."id" AS t1_r0, "software_licenses"."name" AS t1_r1, "software_licenses"."spdx_identifier" AS t1_r2 FROM "software_license_policies" INNER JOIN "software_licenses" ON "software_licenses"."id" = "software_license_policies"."software_license_id" WHERE "software_license_policies"."project_id" = 23 AND LOWER("software_licenses"."name") IN ('bsd 2-clause "simplified" license') LIMIT 1
  SQL (0.7ms)  SELECT "software_license_policies"."id" AS t0_r0, "software_license_policies"."project_id" AS t0_r1, "software_license_policies"."software_license_id" AS t0_r2, "software_license_policies"."classification" AS t0_r3, "software_license_policies"."created_at" AS t0_r4, "software_license_policies"."updated_at" AS t0_r5, "software_licenses"."id" AS t1_r0, "software_licenses"."name" AS t1_r1, "software_licenses"."spdx_identifier" AS t1_r2 FROM "software_license_policies" INNER JOIN "software_licenses" ON "software_licenses"."id" = "software_license_policies"."software_license_id" WHERE "software_license_policies"."project_id" = 23 AND LOWER("software_licenses"."name") IN ('bsd 3-clause "new" or "revised" license') LIMIT 1
  SQL (0.9ms)  SELECT "software_license_policies"."id" AS t0_r0, "software_license_policies"."project_id" AS t0_r1, "software_license_policies"."software_license_id" AS t0_r2, "software_license_policies"."classification" AS t0_r3, "software_license_policies"."created_at" AS t0_r4, "software_license_policies"."updated_at" AS t0_r5, "software_licenses"."id" AS t1_r0, "software_licenses"."name" AS t1_r1, "software_licenses"."spdx_identifier" AS t1_r2 FROM "software_license_policies" INNER JOIN "software_licenses" ON "software_licenses"."id" = "software_license_policies"."software_license_id" WHERE "software_license_policies"."project_id" = 23 AND LOWER("software_licenses"."name") IN ('bsd*') LIMIT 1
  SQL (0.8ms)  SELECT "software_license_policies"."id" AS t0_r0, "software_license_policies"."project_id" AS t0_r1, "software_license_policies"."software_license_id" AS t0_r2, "software_license_policies"."classification" AS t0_r3, "software_license_policies"."created_at" AS t0_r4, "software_license_policies"."updated_at" AS t0_r5, "software_licenses"."id" AS t1_r0, "software_licenses"."name" AS t1_r1, "software_licenses"."spdx_identifier" AS t1_r2 FROM "software_license_policies" INNER JOIN "software_licenses" ON "software_licenses"."id" = "software_license_policies"."software_license_id" WHERE "software_license_policies"."project_id" = 23 AND LOWER("software_licenses"."name") IN ('creative commons attribution 4.0 international') LIMIT 1
  SQL (0.8ms)  SELECT "software_license_policies"."id" AS t0_r0, "software_license_policies"."project_id" AS t0_r1, "software_license_policies"."software_license_id" AS t0_r2, "software_license_policies"."classification" AS t0_r3, "software_license_policies"."created_at" AS t0_r4, "software_license_policies"."updated_at" AS t0_r5, "software_licenses"."id" AS t1_r0, "software_licenses"."name" AS t1_r1, "software_licenses"."spdx_identifier" AS t1_r2 FROM "software_license_policies" INNER JOIN "software_licenses" ON "software_licenses"."id" = "software_license_policies"."software_license_id" WHERE "software_license_policies"."project_id" = 23 AND LOWER("software_licenses"."name") IN ('creative commons zero v1.0 universal') LIMIT 1
  SQL (0.8ms)  SELECT "software_license_policies"."id" AS t0_r0, "software_license_policies"."project_id" AS t0_r1, "software_license_policies"."software_license_id" AS t0_r2, "software_license_policies"."classification" AS t0_r3, "software_license_policies"."created_at" AS t0_r4, "software_license_policies"."updated_at" AS t0_r5, "software_licenses"."id" AS t1_r0, "software_licenses"."name" AS t1_r1, "software_licenses"."spdx_identifier" AS t1_r2 FROM "software_license_policies" INNER JOIN "software_licenses" ON "software_licenses"."id" = "software_license_policies"."software_license_id" WHERE "software_license_policies"."project_id" = 23 AND LOWER("software_licenses"."name") IN ('isc license') LIMIT 1
  SQL (0.8ms)  SELECT "software_license_policies"."id" AS t0_r0, "software_license_policies"."project_id" AS t0_r1, "software_license_policies"."software_license_id" AS t0_r2, "software_license_policies"."classification" AS t0_r3, "software_license_policies"."created_at" AS t0_r4, "software_license_policies"."updated_at" AS t0_r5, "software_licenses"."id" AS t1_r0, "software_licenses"."name" AS t1_r1, "software_licenses"."spdx_identifier" AS t1_r2 FROM "software_license_policies" INNER JOIN "software_licenses" ON "software_licenses"."id" = "software_license_policies"."software_license_id" WHERE "software_license_policies"."project_id" = 23 AND LOWER("software_licenses"."name") IN ('lil') LIMIT 1
  SQL (0.8ms)  SELECT "software_license_policies"."id" AS t0_r0, "software_license_policies"."project_id" AS t0_r1, "software_license_policies"."software_license_id" AS t0_r2, "software_license_policies"."classification" AS t0_r3, "software_license_policies"."created_at" AS t0_r4, "software_license_policies"."updated_at" AS t0_r5, "software_licenses"."id" AS t1_r0, "software_licenses"."name" AS t1_r1, "software_licenses"."spdx_identifier" AS t1_r2 FROM "software_license_policies" INNER JOIN "software_licenses" ON "software_licenses"."id" = "software_license_policies"."software_license_id" WHERE "software_license_policies"."project_id" = 23 AND LOWER("software_licenses"."name") IN ('mit license') LIMIT 1
  SQL (0.9ms)  SELECT "software_license_policies"."id" AS t0_r0, "software_license_policies"."project_id" AS t0_r1, "software_license_policies"."software_license_id" AS t0_r2, "software_license_policies"."classification" AS t0_r3, "software_license_policies"."created_at" AS t0_r4, "software_license_policies"."updated_at" AS t0_r5, "software_licenses"."id" AS t1_r0, "software_licenses"."name" AS t1_r1, "software_licenses"."spdx_identifier" AS t1_r2 FROM "software_license_policies" INNER JOIN "software_licenses" ON "software_licenses"."id" = "software_license_policies"."software_license_id" WHERE "software_license_policies"."project_id" = 23 AND LOWER("software_licenses"."name") IN ('mit*') LIMIT 1
  SQL (0.7ms)  SELECT "software_license_policies"."id" AS t0_r0, "software_license_policies"."project_id" AS t0_r1, "software_license_policies"."software_license_id" AS t0_r2, "software_license_policies"."classification" AS t0_r3, "software_license_policies"."created_at" AS t0_r4, "software_license_policies"."updated_at" AS t0_r5, "software_licenses"."id" AS t1_r0, "software_licenses"."name" AS t1_r1, "software_licenses"."spdx_identifier" AS t1_r2 FROM "software_license_policies" INNER JOIN "software_licenses" ON "software_licenses"."id" = "software_license_policies"."software_license_id" WHERE "software_license_policies"."project_id" = 23 AND LOWER("software_licenses"."name") IN ('mozilla public license 2.0') LIMIT 1
  SQL (0.7ms)  SELECT "software_license_policies"."id" AS t0_r0, "software_license_policies"."project_id" AS t0_r1, "software_license_policies"."software_license_id" AS t0_r2, "software_license_policies"."classification" AS t0_r3, "software_license_policies"."created_at" AS t0_r4, "software_license_policies"."updated_at" AS t0_r5, "software_licenses"."id" AS t1_r0, "software_licenses"."name" AS t1_r1, "software_licenses"."spdx_identifier" AS t1_r2 FROM "software_license_policies" INNER JOIN "software_licenses" ON "software_licenses"."id" = "software_license_policies"."software_license_id" WHERE "software_license_policies"."project_id" = 23 AND LOWER("software_licenses"."name") IN ('public domain') LIMIT 1
  SQL (0.5ms)  SELECT "software_license_policies"."id" AS t0_r0, "software_license_policies"."project_id" AS t0_r1, "software_license_policies"."software_license_id" AS t0_r2, "software_license_policies"."classification" AS t0_r3, "software_license_policies"."created_at" AS t0_r4, "software_license_policies"."updated_at" AS t0_r5, "software_licenses"."id" AS t1_r0, "software_licenses"."name" AS t1_r1, "software_licenses"."spdx_identifier" AS t1_r2 FROM "software_license_policies" INNER JOIN "software_licenses" ON "software_licenses"."id" = "software_license_policies"."software_license_id" WHERE "software_license_policies"."project_id" = 23 AND LOWER("software_licenses"."name") IN ('the unlicense') LIMIT 1
  SQL (0.5ms)  SELECT "software_license_policies"."id" AS t0_r0, "software_license_policies"."project_id" AS t0_r1, "software_license_policies"."software_license_id" AS t0_r2, "software_license_policies"."classification" AS t0_r3, "software_license_policies"."created_at" AS t0_r4, "software_license_policies"."updated_at" AS t0_r5, "software_licenses"."id" AS t1_r0, "software_licenses"."name" AS t1_r1, "software_licenses"."spdx_identifier" AS t1_r2 FROM "software_license_policies" INNER JOIN "software_licenses" ON "software_licenses"."id" = "software_license_policies"."software_license_id" WHERE "software_license_policies"."project_id" = 23 AND LOWER("software_licenses"."name") IN ('unknown') LIMIT 1

After:

class LicensePolicyEntity < Grape::Entity
  expose :name
  expose :classification do |entity|
    {
      id: entity.id,
      name: entity.name,
      approval_status: entity.approval_status
    }
  end
  expose :dependencies, using: LicenseScanningReportDependencyEntity
  expose :count do |entity|
    entity.dependencies.count
  end
  expose :url
end
[1] pry(main)> LicenseScanningReportsSerializer.new.represent(Ci::Pipeline.find(61).license_compliance.find_policies(detected_only: true))
  Ci::Pipeline Load (0.6ms)  SELECT "ci_pipelines".* FROM "ci_pipelines" WHERE "ci_pipelines"."id" = 61 LIMIT 1
   (0.4ms)  SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()))::float as lag
  Ci::Pipeline Load (1.9ms)  SELECT "ci_pipelines".* FROM "ci_pipelines" WHERE "ci_pipelines"."id" = 61 LIMIT 1
  Project Load (3.1ms)  SELECT "projects".* FROM "projects" WHERE "projects"."id" = 24 LIMIT 1
  Ci::Build Load (3.4ms)  SELECT "ci_builds".* FROM "ci_builds" WHERE "ci_builds"."type" = 'Ci::Build' AND "ci_builds"."commit_id" = 61 AND ("ci_builds"."retried" = FALSE OR "ci_builds"."retried" IS NULL) AND (EXISTS (SELECT 1 FROM "ci_job_artifacts" WHERE (ci_builds.id = ci_job_artifacts.job_id) AND "ci_job_artifacts"."file_type" IN (10, 101)))
  Ci::JobArtifact Load (0.4ms)  SELECT "ci_job_artifacts".* FROM "ci_job_artifacts" WHERE "ci_job_artifacts"."job_id" = 929
  Project Load (0.7ms)  SELECT "projects".* FROM "projects" WHERE "projects"."id" = 24 LIMIT 1
  License Load (0.5ms)  SELECT "licenses".* FROM "licenses" ORDER BY "licenses"."id" DESC LIMIT 100
  ApplicationSetting Load (2.7ms)  SELECT "application_settings".* FROM "application_settings" ORDER BY "application_settings"."id" DESC LIMIT 1
  Ci::Build Load (1.5ms)  SELECT "ci_builds".* FROM "ci_builds" WHERE "ci_builds"."type" = 'Ci::Build' AND "ci_builds"."commit_id" = 61 AND ("ci_builds"."retried" = FALSE OR "ci_builds"."retried" IS NULL) AND (EXISTS (SELECT 1 FROM "ci_job_artifacts" WHERE (ci_builds.id = ci_job_artifacts.job_id) AND "ci_job_artifacts"."file_type" = 6))
  Ci::Build Load (1.3ms)  SELECT "ci_builds".* FROM "ci_builds" WHERE "ci_builds"."type" = 'Ci::Build' AND "ci_builds"."commit_id" = 61 AND ("ci_builds"."retried" = FALSE OR "ci_builds"."retried" IS NULL) AND (EXISTS (SELECT 1 FROM "ci_job_artifacts" WHERE (ci_builds.id = ci_job_artifacts.job_id) AND "ci_job_artifacts"."file_type" IN (10, 101)))
  Ci::JobArtifact Load (0.4ms)  SELECT "ci_job_artifacts".* FROM "ci_job_artifacts" WHERE "ci_job_artifacts"."job_id" = 929
  Project Load (1.0ms)  SELECT "projects".* FROM "projects" WHERE "projects"."id" = 24 LIMIT 1
  License Load (0.4ms)  SELECT "licenses".* FROM "licenses" ORDER BY "licenses"."id" DESC LIMIT 100
  Namespace Load (1.9ms)  SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."id" = 57 LIMIT 1
  Route Load (1.0ms)  SELECT "routes".* FROM "routes" WHERE "routes"."source_id" = 57 AND "routes"."source_type" = 'Namespace' LIMIT 1
  SoftwareLicensePolicy Load (0.7ms)  SELECT "software_license_policies".* FROM "software_license_policies" WHERE "software_license_policies"."project_id" = 24 LIMIT 1000

A single request loads pipeline details that might include n number of detected software licenses and m number of software license policies. The SoftwareLicensePoliciesFinder.find is invoked n times to find 1 of m software license policies associated with a project.

The change I am submitting uses the LicenseCompliance class to load all the project.software_license_policies at once rather than n times.

Screenshots

screenshot-2020-06-18-1592521250

screenshot-2020-06-18-1592521289

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by 🤖 GitLab Bot 🤖

Merge request reports

Loading