This is the work for my diploma thesis, Detecting Vulnerabilities/Malware with Python. I used four open-source projects from GitHub:
- GitLeaks: Detection of secrets such as passwords, API keys, and tokens in git repos and files
- GuardDog: Identification of malicious PyPI and npm packages or Go modules
- Safety: Python dependency vulnerability scanner
- Bearer: Static application security testing (SAST) tool that scans source code and analyzes data flows to discover, filter, and prioritize security and privacy risks
Steps to follow in order to use CrimePoirot:
- Enable WSL in terminal.
- In the root directory, clone CrimePoirot:
git clone https://github.com/kosmits-ai/CrimePoirot.git
- In the root directory, clone GitLeaks:
git clone https://github.com/gitleaks/gitleaks
- Build GitLeaks (this requires Go to be installed):
cd gitleaks
make build
- Create a virtual environment in the root directory and activate it:
python3 -m venv myenv
source myenv/bin/activate
- Navigate to CrimePoirot and install the required packages:
cd CrimePoirot
pip install -r requirements.txt
- Authenticate with Safety:
safety auth
- Install Bearer package:
sudo apt-get install apt-transport-https
echo "deb [trusted=yes] https://apt.fury.io/bearer/ /" | sudo tee -a /etc/apt/sources.list.d/fury.list
sudo apt-get update
sudo apt-get install bearer
- Run the four tools one after another:
python run_poirot.py
- You enter a GitHub repository URL.
- GitLeaks is running...
- Leaked credentials, if any, are stored in MongoDB.
- GuardDog is running... (the repository you scan must contain a requirements.txt)
- Malicious-package warnings, if any, are stored in MongoDB.
- Safety is running...
- Names of vulnerable packages, if any, are stored in MongoDB.
- Bearer is running...
- Counts of Critical, High, Medium, and Low vulnerabilities in the repository's source code.
PS. If a tool finds nothing, we still insert an output document with zero or empty values into MongoDB.
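The flow above (run each tool in turn, then record one document per tool even when nothing is found) can be sketched roughly as follows. This is only an illustration, not the code in run_poirot.py: the `Scanner` type, the `build_document` and `run_in_sequence` helpers, the document field names, and the stand-in scanners are all assumptions made for the example.

```python
from typing import Callable, Dict, List

# Hypothetical scanner signature: takes a repository URL, returns a list of findings.
Scanner = Callable[[str], List[dict]]


def build_document(tool: str, repo_url: str, findings: List[dict]) -> dict:
    """Build one result document; a clean scan yields count 0 and an empty list."""
    return {
        "tool": tool,          # assumed field name
        "repo_url": repo_url,  # assumed field name
        "count": len(findings),
        "findings": list(findings),
    }


def run_in_sequence(repo_url: str, scanners: Dict[str, Scanner]) -> List[dict]:
    """Run each scanner one after another and collect one document per tool."""
    documents = []
    for tool, scan in scanners.items():  # dicts preserve insertion order
        print(f"{tool} is running...")
        documents.append(build_document(tool, repo_url, scan(repo_url)))
    return documents


# Stand-in scanners for illustration only; they do not call the real tools.
example_scanners: Dict[str, Scanner] = {
    "GitLeaks": lambda url: [{"rule": "generic-api-key", "file": "config.py"}],
    "GuardDog": lambda url: [],
    "Safety": lambda url: [],
    "Bearer": lambda url: [],
}

docs = run_in_sequence("https://github.com/example/repo", example_scanners)
```

Each resulting dictionary could then be written to a collection, for example with pymongo's `insert_one`. Keeping the zero-valued documents means every scanned repository has a record for all four tools, which makes later averaging simpler.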
The main idea behind this project is to build a tool that checks several parameters affecting how far a specific repository can be trusted from a security standpoint. After evaluating the findings of Gitleaks, GuardDog, Safety, and Bearer for 100-150 random repositories, we calculate the mean value of each parameter. These means serve as a baseline: we measure how far a newly scanned repository deviates from them, and a trust score will be calculated from that deviation. The score gives the owner or developer a quick measure of how safe the repository is.
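One possible way to turn deviation from the baseline means into a score is sketched below. The formula is an assumption chosen for illustration, not the thesis's actual scoring method: it penalizes only values above the mean, scales each excess by `mean + 1` to avoid dividing by zero, and maps the average penalty to a score between 0 and 100. The parameter names and the numbers are made up.

```python
from statistics import mean
from typing import Dict, List


def baseline_means(scans: List[Dict[str, float]]) -> Dict[str, float]:
    """Mean of each parameter over the baseline repositories."""
    return {key: mean(scan[key] for scan in scans) for key in scans[0]}


def trust_score(repo: Dict[str, float], means: Dict[str, float]) -> float:
    """Map deviation above the baseline means to a score in (0, 100]."""
    # Only findings above the mean count against the repository (assumption).
    excesses = [
        max(0.0, repo[key] - avg) / (avg + 1.0)
        for key, avg in means.items()
    ]
    penalty = sum(excesses) / len(excesses)
    return 100.0 / (1.0 + penalty)


# Made-up baseline of two repositories; the real baseline uses 100-150.
baseline = [
    {"leaks": 2, "malicious": 0, "vulnerable": 4, "critical": 1},
    {"leaks": 0, "malicious": 0, "vulnerable": 2, "critical": 1},
]
means = baseline_means(baseline)

clean_repo = {"leaks": 0, "malicious": 0, "vulnerable": 0, "critical": 0}
risky_repo = {"leaks": 5, "malicious": 1, "vulnerable": 3, "critical": 3}

clean_score = trust_score(clean_repo, means)
risky_score = trust_score(risky_repo, means)
```

With this formula a repository at or below every mean gets the maximum score, and the score falls as findings exceed the baseline. Weighting some parameters more heavily than others (for example, malicious packages) would be a natural refinement.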