Explain This Vulnerability - Secret Detection Pre-Flight Check
We have identified situations in which the Explain This Vulnerability feature could send sensitive snippets of user code to the third-party AI service. To maximise our due diligence in protecting customer privacy, we need to institute pre-flight checks on the prompt before it is sent to the AI so that these situations are prevented.
The currently theorised scenario is that a scanner detects a vulnerability of a non-secret type in a snippet of code that also contains a secret. Because the scanner that detected the issue is not a secret detection scanner, GitLab would not know about the secret internally and would ship the code off to the AI as requested by the user.
Iteration 1 would be a deliberately simple solution: run basic secret detection checks on the prompt snippets to catch obvious passwords, keys, or tokens, and prevent the send-off if any are found.
Solution
The plan is a short-term iteration to mitigate the risk of secrets being sent to the AI: perform basic regex checks against the code snippet and remove it from the prompt if it poses a potential risk.
A non-exhaustive list of keywords or phrases to check for would be (a minimal sketch of the check follows this list):
- "secret"
- "api_key"
- "token"
- "ENV"
- "password"
- "encrypted/encryptions"
Implementation Plan
- Implement a rudimentary Explain This Vulnerability secret detection pre-flight check consisting of basic regex scans: scan the code we are adding to the prompt, and if it looks like it may include anything undesirable to send outside of GitLab, send a prompt without the code instead (see the sketch below).
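A minimal sketch of how the pre-flight check could wrap prompt construction, assuming hypothetical `build_prompt` and `preflight_prompt` helpers and reusing `contains_possible_secret` from the sketch in the Solution section; none of these names are from the actual GitLab codebase.

```python
from typing import Optional


def build_prompt(description: str, code_snippet: Optional[str]) -> str:
    """Assemble the AI prompt, including the code snippet only when one is given."""
    prompt = f"Explain this vulnerability:\n{description}"
    if code_snippet:
        prompt += f"\n\nRelevant code:\n{code_snippet}"
    return prompt


def preflight_prompt(description: str, code_snippet: str) -> str:
    """Drop the code snippet from the prompt when it may contain a secret."""
    if contains_possible_secret(code_snippet):
        return build_prompt(description, code_snippet=None)
    return build_prompt(description, code_snippet)
```

The design choice in this sketch is to fail closed: any match drops the entire snippet from the prompt rather than attempting to redact individual lines.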
Verification Steps
- Set up a project with a SAST vulnerability in which the code snippet contains one of the blocked keywords.
- Execute Explain This Vulnerability on that vulnerability. It should not be possible for the code to be sent to the AI (an automated test sketch follows these steps).
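Alongside the manual steps, automated tests against the hypothetical helpers sketched above could look like the following (plain asserts, runnable under e.g. pytest); the snippet values are illustrative.

```python
# Hypothetical automated checks complementing the manual verification steps;
# the function names mirror the sketches in the Solution and Implementation sections.


def test_snippet_with_blocked_keyword_is_dropped_from_prompt():
    risky_snippet = 'API_KEY = "AKIAIOSFODNN7EXAMPLE"'
    prompt = preflight_prompt("Hard-coded credential (SAST finding)", risky_snippet)
    assert risky_snippet not in prompt


def test_clean_snippet_is_kept_in_prompt():
    clean_snippet = "def add(a, b):\n    return a + b"
    prompt = preflight_prompt("Example SAST finding", clean_snippet)
    assert clean_snippet in prompt
```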