GitLab's robots.txt file is out of date
GitLab's robots.txt includes lines such as:
Disallow: /*/*/repository/archive*
This is done to prevent crawlers from downloading archives, which is a very expensive operation.
However, since we changed our URL scheme, the robots.txt file no longer matches our URL patterns.
For example, archives are now at:
*/*/-/archive/....
*/*/*/-/archive/....
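To make the mismatch concrete, here is a minimal sketch that checks the existing rule against old-style and new-style archive paths. It assumes the common robots.txt wildcard convention (`*` matches any run of characters, a trailing `$` anchors the end, everything else is a prefix match). The sample paths and the candidate rule `/*/-/archive/` are hypothetical illustrations, not the rule that should necessarily ship.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumes the common wildcard convention: '*' matches any run of
    characters, a trailing '$' anchors the end of the path, and the
    rule is otherwise treated as a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(pattern: str, path: str) -> bool:
    """Return True if `path` is covered by the Disallow `pattern`."""
    return robots_pattern_to_regex(pattern).match(path) is not None


# Rule currently in robots.txt (quoted from this issue).
OLD_RULE = "/*/*/repository/archive*"
# Hypothetical candidate rule for the new URL scheme (an assumption).
NEW_RULE = "/*/-/archive/"

# Hypothetical example paths; group, project, and ref names are made up.
old_style_path = "/group/project/repository/archive.zip"
new_style_path = "/group/project/-/archive/main/project-main.zip"
nested_path = "/group/subgroup/project/-/archive/main/project-main.zip"

for path in (old_style_path, new_style_path, nested_path):
    print(path, is_disallowed(OLD_RULE, path), is_disallowed(NEW_RULE, path))
```

Under these assumptions, the existing rule only covers the old-style path, so crawlers are free to request archives at the new locations. A single wildcard before `/-/archive/` would cover projects at any nesting depth, because `*` can also match `/`.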