Commit a56b38d

README: Improve instructions
Improve instructions in `README.md` on installing and setting up the module. Add detailed instructions, fix commands where required, reference the `dataset` and `commons` modules. Signed-off-by: Razvan Deaconescu <[email protected]>
1 parent 4a28f02 commit a56b38d

1 file changed

+143
-86
lines changed

README.md

Lines changed: 143 additions & 86 deletions
Original file line number | Diff line number | Diff line change
@@ -8,13 +8,13 @@
88
- [Setup](#setup)
99
- [Usage](#usage)
1010
- [As a CLI Tool](#as-a-cli-tool)
11-
- [Arguments Dictionary Generation](#arguments-dictionary-generation)
12-
- [Input Streams Detection](#input-streams-detection)
13-
- [Arguments Fuzzing](#arguments-fuzzing)
14-
- [Help](#help)
11+
- [Generate Dictionary for Arguments](#generate-dictionary-for-arguments)
12+
- [Detect Input Streams](#detect-input-streams)
13+
- [Fuzz Arguments](#fuzz-arguments)
14+
- [Get Help](#get-help)
1515
- [As a Python Module](#as-a-python-module)
16-
- [Input Streams Detection](#input-streams-detection-1)
17-
- [Arguments Fuzzing](#arguments-fuzzing-1)
16+
- [Detect Input Streams](#detect-input-streams-1)
17+
- [Fuzz Arguments](#fuzz-arguments-1)
1818

1919
---
2020

@@ -23,122 +23,179 @@
2323
`attack_surface_approximation` is the CRS module that deals with the approximation of the attack surface in a vulnerable program.
2424

2525
Some input mechanisms are omitted: elements of the user interface, signals, devices and interrupts. At the moment, the supported mechanisms are the following:
26-
- Files;
27-
- Arguments;
28-
- Standard input;
29-
- Networking; and
30-
- Environment variables.
3126

32-
In addition, a custom fuzzer is implemented to discover arguments that trigger different code coverage. It takes arguments from a dictionary which can be handcrafted or generated with an exposed command, with an implemented heuristic.
27+
- files
28+
- command-line arguments
29+
- standard input
30+
- networking
31+
- environment variables
32+
33+
In addition, a custom fuzzer is implemented to discover arguments that trigger different code coverage.
34+
It takes arguments from a dictionary which can be handcrafted or generated with an exposed command, with an implemented heuristic.
3335

3436
Examples of arguments dictionaries can be found in `examples/dictionaries`:
35-
- `man.txt`, generated with the `man_parsing` heurstic and having 6605 entries; and
36-
- `generation.txt`, generated with the `generation` heuristic and having 62 entries.
37+
38+
- `man.txt`: generated with the `man_parsing` heuristic and having 6605 entries
39+
- `generation.txt`: generated with the `generation` heuristic and having 62 entries
3740

3841
### Limitations
3942

4043
- ELF format
4144
- x86 architecture
42-
- Non-static binaries
43-
- Symbols present (namely, no stripping is involved)
44-
- No obfuscation technique involved
45+
- dynamic binaries (static binaries are not supported)
46+
- symbols present (namely, no stripping is involved)
47+
- no obfuscation technique involved
4548

4649
## How It Works
4750

48-
The module works by automating Ghidra for static binary analysis. It extracts information and apply heuristics to determine if a given input stream is present.
51+
The module works by automating [Ghidra](https://ghidra-sre.org/) for static binary analysis.
52+
It extracts information and applies heuristics to determine if a given input stream is present.
4953

5054
Examples of such heuristics are:
51-
- For standard input, calls to `getc()` and `gets()`
52-
- For networking, calls to `recv()` and `recvfrom()`
53-
- For arguments, occurrences of `argc` and `argv` in the `main()`'s decompilation.
5455

55-
The argument fuzzer uses Docker and QBDI to detect basic block coverage.
56+
- for standard input: calls to `getc()` and `gets()`
57+
- for networking: calls to `recv()` and `recvfrom()`
58+
- for command-line arguments: occurrences of `argc` and `argv` in `main()`
59+
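The heuristics listed above can be illustrated with a small, self-contained sketch. It is not the module's implementation: the real module automates Ghidra, whereas this example assumes the analysis has already produced a list of called function names and the text of `main()`'s decompilation. The names `STREAM_INDICATORS` and `detect_streams` are hypothetical, and treating either `argc` or `argv` as sufficient is an assumption.

```python
import re

# Hypothetical mapping from stream name to the calls that hint at it.
# Only the functions named in the README are listed here.
STREAM_INDICATORS = {
    "stdin": {"getc", "gets"},
    "networking": {"recv", "recvfrom"},
}


def detect_streams(called_functions, main_decompilation=""):
    """Return a dict mapping each stream name to True or False."""
    called = set(called_functions)
    result = {
        stream: bool(called & indicators)
        for stream, indicators in STREAM_INDICATORS.items()
    }
    # Arguments heuristic: `argc` or `argv` appears as a whole word
    # in the decompiled text of main() (assumed to be a plain string).
    result["arguments"] = bool(
        re.search(r"\b(argc|argv)\b", main_decompilation)
    )
    return result


if __name__ == "__main__":
    print(detect_streams(["puts", "gets"], "int main(int argc, char **argv)"))
```

A lookup table keeps each heuristic a one-line addition, which is convenient if more indicator functions are added later.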
60+
The argument fuzzer uses [Docker](https://www.docker.com/) for running and [QBDI](https://qbdi.quarkslab.com/) to detect basic-block coverage.
5661

5762
## Setup
5863

59-
1. Ensure you have Docker installed.
60-
2. Install the required Python 3 packages via `poetry install --no-dev`.
61-
3. Ensure the Docker API is accessible by:
62-
- Running the module as `root`; or
63-
- Changing the Docker socket permissions (unsecure approach) via `chmod 777 /var/run/docker.sock`.
64+
1. Make sure you have set up the repositories and Python environment according to the [top-level instructions](https://github.com/open-crs#requirements).
65+
That is:
66+
67+
- Docker is installed and is properly running.
68+
Check using:
69+
70+
```console
71+
docker version
72+
docker ps -a
73+
docker run --rm hello-world
74+
```
75+
76+
These commands should run without errors.
77+
78+
- The current module repository and all other module repositories (particularly the [`dataset` repository](https://github.com/open-crs/dataset) and the [`commons` repository](https://github.com/open-crs/commons)) are cloned in the same directory.
79+
80+
- You are running all commands inside a Python virtual environment.
81+
There should be a `(.venv)` prefix in your prompt.
82+
83+
- You have installed Poetry in the virtual environment.
84+
If you run:
85+
86+
```console
87+
which poetry
88+
```
89+
90+
you should get a path ending with `.venv/bin/poetry`.
91+
92+
1. Disable the Python Keyring:
93+
94+
```console
95+
export PYTHON_KEYRING_BACKEND=keyring.backends.null.Keyring
96+
```
97+
98+
This works around a problem that may occur in certain situations, preventing Poetry from getting packages.
99+
100+
1. Install the required packages with Poetry (based on `pyproject.toml`):
101+
102+
```console
103+
poetry install --only main
104+
```
105+
106+
1. Create the `ghidra` and `qbdi_args_fuzzing` Docker images by using the [instructions in the `commons` repository](https://github.com/open-crs/commons?tab=readme-ov-file#setup).
107+
108+
1. Optionally, generate executables by using the [instructions in the `dataset` repository](https://github.com/open-crs/dataset).
64109

65110
## Usage
66111

112+
You can use the `attack_surface_approximation` module either standalone, as a CLI tool, or integrated into Python applications, as a Python module.
113+
67114
### As a CLI Tool
68115

69-
#### Arguments Dictionary Generation
116+
As a CLI tool, you can either use the `cli.py` module:
117+
118+
```console
119+
python attack_surface_approximation/cli.py
120+
```
121+
122+
or the Poetry interface:
70123

124+
```console
125+
poetry run attack_surface_approximation
71126
```
72-
➜ poetry run attack_surface_approximation generate --heuristic man --output args.txt --top 10
127+
128+
#### Generate Dictionary for Arguments
129+
130+
```console
131+
$ poetry run attack_surface_approximation generate --heuristic man_parsing --output args.txt --top 100
73132
Successfully generated dictionary with 10 arguments
74-
➜ cat args.txt
75-
--and
76-
--get
77-
--get-feedbacks
78-
--no-progress-meter
79-
--print-name
80-
-input
81-
-lmydep2
82-
-miniswhite
83-
-nM
84-
-prune
133+
134+
$ head args.txt
135+
--allow-unrelated-histories
136+
--analysis-display-unstable-clusters
137+
--auto-area-segmentation
138+
--backup-dir
139+
--callstack-filter
140+
--cidfile
141+
--class
142+
--codename
143+
--column
144+
--contained
85145
```
86146

87-
#### Input Streams Detection
147+
#### Detect Input Streams
88148

149+
Use an ELF i386 (32-bit) executable as the target for detecting input streams.
150+
151+
For example, you can use one of the executables generated in the [`dataset` repository](https://github.com/open-crs/dataset):
152+
153+
```console
154+
$ ../dataset/executables/toy_test_suite_1.elf
155+
Gimme two lines of input:
156+
aaa
157+
bbb
89158
```
90-
➜ ./crackme
91-
Enter the password: pass
92-
Wrong password!
93-
➜ poetry run attack_surface_approximation detect --elf crackme
159+
160+
Now, do the attack surface approximation:
161+
162+
```console
163+
$ poetry run attack_surface_approximation detect --elf $(pwd)/../dataset/executables/toy_test_suite_1.elf
94164
Several input mechanisms were detected for the given program:
95165

96-
┏━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
97-
┃ Stream ┃ Present ┃
98-
┡━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
99-
│ files │ No │
100-
│ arguments │ No │
101-
│ stdin │ Yes │
102-
│ networking │ No │
103-
│ environment_variables │ No │
104-
└──────────────────────┴─────────┘
166+
┏━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
167+
┃ Stream ┃ Present ┃
168+
┡━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
169+
│ STDIN │ Yes │
170+
│ ARGUMENTS │ Yes │
171+
│ FILES │ Yes │
172+
│ ENVIRONMENT_VARIABLE │ Yes │
173+
│ NETWORKING │ Yes │
174+
└──────────────────────┴─────────┘
105175
```
106176

107-
#### Arguments Fuzzing
177+
This executable uses all potential input streams.
108178

109-
```
110-
➜ poetry run attack_surface_approximation fuzz --elf /bin/uname --dictionary args.txt
179+
#### Fuzz Arguments
180+
181+
```console
182+
$ poetry run attack_surface_approximation fuzz --elf $(pwd)/../dataset/executables/toy_test_suite_1.elf --dictionary args.txt
111183
Several arguments were detected for the given program:
112184

113-
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
114-
┃ Argument ┃ Role ┃
115-
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
116-
│ - │ FLAG │
117-
│ -a │ FLAG │
118-
│ -a string │ STRING_ENABLER │
119-
│ -i │ FLAG │
120-
│ -i string │ STRING_ENABLER │
121-
│ -m │ FLAG │
122-
│ -m string │ STRING_ENABLER │
123-
│ -n │ FLAG │
124-
│ -n string │ STRING_ENABLER │
125-
│ -o │ FLAG │
126-
│ -o string │ STRING_ENABLER │
127-
│ -p │ FLAG │
128-
│ -p string │ STRING_ENABLER │
129-
│ -r │ FLAG │
130-
│ -r string │ STRING_ENABLER │
131-
│ -s │ FLAG │
132-
│ -s string │ STRING_ENABLER │
133-
│ -v │ FLAG │
134-
│ -v string │ STRING_ENABLER │
135-
└───────────┴────────────────┘
185+
┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
186+
┃ Argument ┃ Role ┃
187+
┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
188+
│ - │ FLAG │
189+
│ --re │ FLAG │
190+
│ --re string │ STRING_ENABLER │
191+
│ -mmusl │ FLAG │
192+
└─────────────┴────────────────┘
136193
```
137194

138-
#### Help
195+
#### Get Help
139196

140-
```
141-
poetry run attack_surface_approximation
197+
```console
198+
$ poetry run attack_surface_approximation
142199
Usage: attack_surface_approximation [OPTIONS] COMMAND [ARGS]...
143200

144201
Discovers the attack surface of vulnerable programs.
@@ -155,7 +212,7 @@ Commands:
155212

156213
### As a Python Module
157214

158-
#### Input Streams Detection
215+
#### Detect Input Streams
159216

160217
```python
161218
from attack_surface_approximation.static_input_streams_detection import \
@@ -165,11 +222,11 @@ detector = InputStreamsDetector(elf_filename)
165222
streams_list = detector.detect_all()
166223
```
167224

168-
#### Arguments Fuzzing
225+
#### Fuzz Arguments
169226

170227
```python
171228
from attack_surface_approximation.arguments_fuzzing import ArgumentsFuzzer
172229

173230
fuzzer = ArgumentsFuzzer(elf_filename, fuzzed_arguments)
174231
detected_arguments = fuzzer.get_all_valid_arguments()
175-
```
232+
```
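
The `fuzzed_arguments` value above is not defined in the snippet. As a minimal sketch, assuming the dictionary file holds one argument per line (as the `head args.txt` output suggests) and that the fuzzer accepts a list of strings, it could be built as follows. `load_arguments_dictionary` is a hypothetical helper, not part of the module.

```python
from pathlib import Path


def load_arguments_dictionary(path):
    """Read one argument per line, skipping blank lines and duplicates."""
    arguments = []
    seen = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        argument = line.strip()
        if argument and argument not in seen:
            seen.add(argument)
            arguments.append(argument)
    return arguments


# Hypothetical usage, assuming the constructor takes a list of strings:
# fuzzed_arguments = load_arguments_dictionary("args.txt")
# fuzzer = ArgumentsFuzzer(elf_filename, fuzzed_arguments)
```

Preserving the file order while dropping duplicates keeps runs reproducible for a given dictionary.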
