-
-
Notifications
You must be signed in to change notification settings - Fork 370
Add BytesSubInput that allows us to mutate sub-parts of a bytes-backed input #2220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pub trait HasBytesVec { | ||
/// The internal bytes map | ||
/// Contains mutateable and resizable bytes | ||
pub trait HasMutatorBytes: HasLen { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also split this trait into two; one that only supports static slices, and one that supports growing and shrinking? But it gets (a bit) more messy
libafl/src/inputs/bytessub.rs
Outdated
let (mut bytes_input, _len_orig) = init_bytes_input(); | ||
let bytes_input_cloned = bytes_input.clone(); | ||
|
||
let mut bsi = bytes_input.sub_input(..2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@riesentoaster does this help you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that my "main" Input
still needs to implement HasMutatorBytes
. I could just create a slice from all vectors in the struct concatenated, but that seems inefficient. Plus I'd then have to dynamically calculate the bounds by again querying the original Input
for the lengths of the vector parts and calculate offsets.
For this to be much prettier, I'd need a similar trait (maybe ExtractsMutatorBytes
?), which accepts a mapping/extracting closure instead of a simple index range, but can be implemented by any input
without the requirement for HasMutatorBytes
. Do you think that would be feasible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't fully understand what your goal is, but we can implement HasMutatorBytes
for Vec
, does that help?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really.
Some background: I have a custom Input
type that has multiple binary entries (Vec<u8>
), which are then mapped to command line parameters for my binary (coreutils in my case). I want to mutate them all individually according to the mutations defined in havoc_mutations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You want Multipart: https://github.com/AFLplusplus/LibAFL/blob/main/libafl/src/mutators/multi.rs
This is already default implemented, just change your input shape to MultipartInput<BytesInput>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you mean by that. I was just going to add a CommandLineArg
with a fixed name for the stdin (yes, I should maybe rename it, or just use Option<Option<Vec<u8>>>
since that's essentially what this is). The mapping from the MultipartInput
to the executor parts are not the issue, I'm writing custom logic there anyway.
I'll wait for this PR to be merged and try again then. And write an issue or something if I can't figure it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would you need to keep a CommandLineArg
key in the input that doesn't have a value? Or do you also want to mutate the list of parameters themselves?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, exactly. I'm doing differential fuzzing and I want to restrict the parameter space to command line arguments that don't alter the environment (such as output file args), which is why I cannot just use an arbitrary representation like an array of strings or a byte array which gets split at a magic value.
Within those, I want the fuzzer to explore based on different combinations of CLAs, including the presence/absence of them and different values for those that take parameters. So inputs passed to the executor (or how you'd pass them manually on the command line) could look something like this:
echo "This is a stdin byte array" | binary_under_test --arg1_with_param "value of param for arg1" --arg2_without_param --arg3_without_param --arg4_with_param "some value"
echo "This is another byte array for stdin" | binary_under_test --arg1_with_param "mutated value of param for arg1"
echo "More stdin" | binary_under_test --arg2_without_param --arg3_without_param
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds like a MutlipartInput will be the thing to use. Either way having the current PR in is good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either way having the current PR in is good
I'll check again once this PR is merged. Thank you for your suggestions!
@@ -351,7 +351,7 @@ where | |||
return result.replace(Ok(MutationResult::Skipped)); | |||
} | |||
bytes.truncate(new_size); | |||
core::mem::swap(input.bytes_mut(), &mut bytes); | |||
input.bytes_mut().copy_from_slice(&bytes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@addisoncrump this no longer works since it's not a Vec. I believe it shouldn't matter?
Also, why don't we pass the original slice to the custom mutator directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I guess because it max_size
big we want to allocate the dude (could make sense to keep a max_size vec around in the struct, though)
@rmalmain QEMU CI seems unhappy, probably flakey(?) From https://gitlab.com/qemu-project/berkeley-softfloat-3
* branch b64af41c3276f97f0e181920400ee056b9c88037 -> FETCH_HEAD
HEAD is now at b64af41 Fix typo in function 'softfloat_propagateNaNF128M' for RISC-V.
From https://gitlab.com/qemu-project/berkeley-testfloat-3
* branch e7af9751d9f9fd3b47911f51a5cfd08af256a9ab -> FETCH_HEAD
HEAD is now at e7af975 Fix -Wreturn-type errors in fNNRandom
thread 'main' panicked at /__w/LibAFL/LibAFL/libafl_qemu/libafl_qemu_build/src/build.rs:502:44:
Partial linked failure: Os { code: 2, kind: NotFound, message: "No such file or directory" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace |
Might have to undo aa77494 - with it we need to specify the input type for havoc_mutations... /edit: Was only the case for tmin, so it's ok |
Yes, I observed it in other PRs. Probably linked to #2219. I'm investigating. |
fn test_bytessubinput_use_vec() { | ||
let mut test_vec = vec![0, 1, 2, 3, 4]; | ||
let mut test_vec = &mut test_vec; | ||
let mut sub_vec = test_vec.sub_input(1..2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced this with a wrapper type, but it still looks similar enough
…splus#2220) * Add BytesSubMutator that allows us to mutate sub-parts of a bytes-backed input * no_std * fix string mutator * make build * Fix clippy on macOS * Docs * More docs * Better docs * --amend * Renamed bsi to sub_input. Too much BSI * More more * balance backticks * Make splicing usable with sub_input (not that it makes sense) * More annotations * more input annotations? * Implement HasMutatorBytes for &mut Vec * clippy * Use a wrapper type instead * Add wrapper type for Vec as well * Remove the duplicate BytesInput... lol
This is a handy feature for a bunch of use-cases.
I don't think it will currently work for splicing mutators, simply because they assume the input used in the mutator has the same type as the input in the corpus.
This should help for #2202 and more.