Commit f2dba22

Initial fix
1 parent 62b4e27 commit f2dba22

8 files changed, 109 additions and 6 deletions

.gitignore

+2 -1
@@ -1,3 +1,4 @@
 zenv/
 __pycache__/
-logs/
+
+logs.log

README.md

+73 -1
@@ -1,3 +1,75 @@
 # Python Phishing URL Detection
 ---
-Python 3.12.3
+**Python 3.11.9 _(Currently Using)_**
+
+
+## How to Run?
+
+- Clone or download [python-phishing-url-detection](https://github.com/sannjayy/python-phishing-url-detection)
+
+`git clone git@github.com:sannjayy/python-phishing-url-detection.git`
+
+
+- Create a virtual environment
+```bash
+python -m venv zenv
+source zenv/Scripts/activate # Windows
+source zenv/bin/activate # Mac
+```
+
+
+- Install basic requirements
+```bash
+pip install -r requirements.txt
+
+# OR INITIAL INSTALLATION
+pip install --upgrade pip
+pip install --upgrade setuptools
+
+pip install pandas whois httpx
+pip install pycaret # It will take sometime.
+```
+
+### Replace Domain
+
+```python
+if __name__ == "__main__":
+    phishing_url_1 = 'https://bafybeifqd2yktzvwjw5g42l2ghvxsxn76khhsgqpkaqfdhnqf3kiuiegw4.ipfs.dweb.link/'
+    phishing_url_2 = 'http://about-ads-microsoft-com.o365.frc.skyfencenet.com'
+    real_url_1 = 'https://chat.openai.com'
+    real_url_2 = 'https://github.com/'
+
+
+    print(predict(phishing_url_1))
+    print(predict(phishing_url_2))
+    print(predict(real_url_1))
+    print(predict(real_url_2))
+```
+
+### To Run
+
+```bash
+python main.py
+
+
+# OUTPUT: {'prediction_label': 0, 'prediction_score': 68.39}
+
+# 0 = False | 1 True
+```
+
+
+
+
+---
+---
+
+- 🌏 [GitHub Repo](https://github.com/sannjayy/python-phishing-url-detection)
+- 🌏 [Website](https://www.sanjaysikdar.dev)
+
+- 📖 [read.sanjaysikdar.dev](https://read.sanjaysikdar.dev)
+- 📦 [pypi releases](https://pypi.org/user/sannjayy/) | [npm releases](https://www.npmjs.com/~sannjayy)
+
+---
+
+[![](https://img.shields.io/github/followers/sannjayy?style=social)](https://github.com/sannjayy)
+Developed with ❤️ by *[sanjaysikdar.dev](https://www.sanjaysikdar.dev)*.

extractorFunctions.py

+1 -2
@@ -1,8 +1,7 @@
 # importing required packages for Address Bar Based feature Extraction
-from urllib.parse import urlparse,urlencode, unquote
+from urllib.parse import urlparse, urlencode, unquote
 import re
 # importing required packages for Domain Based Feature Extraction
-import whois
 from datetime import datetime
 
 

featureExtractor.py

+1 -1
@@ -42,7 +42,7 @@ def featureExtraction(url):
 
     features.append(ef.has_unicode(url)+ef.haveAtSign(url)+ef.havingIP(url))
 
-    with open('pca_model.pkl', 'rb') as file:
+    with open('model/pca_model.pkl', 'rb') as file:
         pca = pk.load(file)
 
     #converting the list to dataframe
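
The new relative path `model/pca_model.pkl` is resolved against the current working directory, so the script presumably has to be started from the repository root. Below is a minimal sketch of one way to make the lookup independent of the working directory, assuming the `model/` folder sits next to the module that loads it. `model_path` is a hypothetical helper and is not part of this commit.

```python
from pathlib import Path


def model_path(filename: str, anchor_file: str) -> Path:
    """Return the path of `filename` inside the model/ folder next to `anchor_file`.

    Assumption: model/ lives in the same directory as the calling module.
    """
    return Path(anchor_file).resolve().parent / "model" / filename


# Inside featureExtractor.py this could be used as:
#     with open(model_path("pca_model.pkl", __file__), "rb") as file:
#         ...
```

Passing `__file__` explicitly keeps the helper free of global state, which also makes it easy to check in isolation.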

main.py

+32
@@ -0,0 +1,32 @@
+from featureExtractor import featureExtraction
+from pycaret.classification import load_model, predict_model
+
+model = load_model('model/phishingdetection')
+
+
+def predict(url):
+    data = featureExtraction(url)
+    result = predict_model(model, data=data)
+
+    # Get the prediction score for the positive class (Phishing)
+    prediction_score = result['prediction_score'][0]
+    prediction_label = result['prediction_label'][0]
+    # domain_age = result['Domain_Age'][0]
+    # print('Result -> ', url)
+
+    return {
+        'prediction_label': prediction_label,
+        'prediction_score': prediction_score * 100,
+    }
+
+if __name__ == "__main__":
+    phishing_url_1 = 'https://bafybeifqd2yktzvwjw5g42l2ghvxsxn76khhsgqpkaqfdhnqf3kiuiegw4.ipfs.dweb.link/'
+    phishing_url_2 = 'http://about-ads-microsoft-com.o365.frc.skyfencenet.com'
+    real_url_1 = 'https://chat.openai.com'
+    real_url_2 = 'https://github.com/'
+
+
+    print(predict(phishing_url_1))
+    print(predict(phishing_url_2))
+    print(predict(real_url_1))
+    print(predict(real_url_2))
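
`predict()` returns a dict with `prediction_label` and a `prediction_score` that is already scaled to a percentage. Below is a minimal sketch of turning that dict into a readable one-line verdict, assuming label `1` means phishing, as the README's `0 = False | 1 True` note suggests. `describe` is a hypothetical helper and is not part of this commit, and the URL in the usage line is a placeholder.

```python
def describe(url: str, result: dict) -> str:
    """Format the dict returned by predict() as a one-line verdict.

    Assumption: prediction_label 1 means phishing and 0 means legitimate.
    """
    verdict = "phishing" if int(result["prediction_label"]) == 1 else "legitimate"
    return f"{url}: {verdict} ({result['prediction_score']:.2f}% score)"


# Sample dict copied from the README's example output; not a fresh model run.
print(describe("https://example.com/", {"prediction_label": 0, "prediction_score": 68.39}))
```

With the real model loaded, the same helper could wrap each call, for example `describe(real_url_2, predict(real_url_2))`.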
File renamed without changes.
File renamed without changes.

requirements.txt

-1
@@ -8,7 +8,6 @@ numpy==1.26.4
 pandas==2.2.2
 python-dateutil==2.9.0.post0
 pytz==2024.1
-regex==2024.4.16
 six==1.16.0
 sniffio==1.3.1
 tzdata==2024.1
