Add models, finders for code hotspots MVP
What does this MR do?
This is one of the many small MRs coming out of the parent MR: !14719 (diffs)
This MR adds models, finder for code hotspots. For the first iteration we'd like to provide the top N files that have been committed into the project repository.
- Discovery issue: #12683 (closed)
- Implementation issue: #13165 (closed)
- Follow-up issue for writing documentation: #32816 (closed)
How it works
The feature would store how many times a file was committed for a given date (only the default_branch
). This MR is only focuses on the query part.
Example:
Commit 2018-05-06
Gemfile
app/models/user.rb
Commit 2018-05-06 (same day but a bit later)
app/models/user.rb
Commit 2018-06-01
Gemfile
Result:
-
app/models/user.rb
=> 2 -
Gemfile
=> 2
Database Structure
There has been a fairly substantial change in the feature, for the first iteration we want to count the number of commits and not the file edits (they are basically the same, but named a bit differently). Since the existing models are not used at all, I just removed analytics_repository_file_edits
and added a new table (analytics_repository_file_commits
).
Explanation for some of the fields:
-
committed_date
, we choose to store the date only since we wouldn't query this data by the timestamp (FE only has date picker). It also reduces the amount of data within the table. -
commit_count
, there can be several commits on the same day, so we'll just increment this column. (we use smallint, max 32K, unlikely that we'll have that amount of commits for a given file in a day)
Query
We have no data in the DB atm...
SELECT SUM("analytics_repository_file_commits"."commit_count") AS sum_struct_arel_attributes_attribute_relation_arel_table_0x0000,
"analytics_repository_files"."file_path" AS analytics_repository_files_file_path
FROM "analytics_repository_file_commits"
INNER JOIN "analytics_repository_files" ON "analytics_repository_files"."id" = "analytics_repository_file_commits"."analytics_repository_file_id"
WHERE "analytics_repository_file_commits"."project_id" = 5
AND "analytics_repository_file_commits"."committed_date" >= '2018-03-05'
AND "analytics_repository_file_commits"."committed_date" <= '2018-10-20'
GROUP BY "analytics_repository_files"."file_path"
ORDER BY SUM("analytics_repository_file_commits"."commit_count")
LIMIT 100;