Skip to content

Add BytesSubInput that allows us to mutate sub-parts of a bytes-backed input #2220

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 22 commits into from
May 21, 2024

Conversation

domenukk
Copy link
Member

This is a handy feature for a bunch of use-cases.
I don't think it will currently work for splicing mutators, simply because they assume the input used in the mutator has the same type as the input in the corpus.

This should help for #2202 and more.

pub trait HasBytesVec {
/// The internal bytes map
/// Contains mutateable and resizable bytes
pub trait HasMutatorBytes: HasLen {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also split this trait into two; one that only supports static slices, and one that supports growing and shrinking? But it gets (a bit) more messy

let (mut bytes_input, _len_orig) = init_bytes_input();
let bytes_input_cloned = bytes_input.clone();

let mut bsi = bytes_input.sub_input(..2);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@riesentoaster does this help you?

Copy link
Contributor

@riesentoaster riesentoaster May 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that my "main" Input still needs to implement HasMutatorBytes. I could just create a slice from all vectors in the struct concatenated, but that seems inefficient. Plus I'd then have to dynamically calculate the bounds by again querying the original Input for the lengths of the vector parts and calculate offsets.

For this to be much prettier, I'd need a similar trait (maybe ExtractsMutatorBytes?), which accepts a mapping/extracting closure instead of a simple index range, but can be implemented by any input without the requirement for HasMutatorBytes. Do you think that would be feasible?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully understand what your goal is, but we can implement HasMutatorBytes for Vec, does that help?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really.

Some background: I have a custom Input type that has multiple binary entries (Vec<u8>), which are then mapped to command line parameters for my binary (coreutils in my case). I want to mutate them all individually according to the mutations defined in havoc_mutations.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You want Multipart: https://github.com/AFLplusplus/LibAFL/blob/main/libafl/src/mutators/multi.rs

This is already default implemented, just change your input shape to MultipartInput<BytesInput>.

Copy link
Contributor

@riesentoaster riesentoaster May 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you mean by that. I was just going to add a CommandLineArg with a fixed name for the stdin (yes, I should maybe rename it, or just use Option<Option<Vec<u8>>> since that's essentially what this is). The mapping from the MultipartInput to the executor parts are not the issue, I'm writing custom logic there anyway.

I'll wait for this PR to be merged and try again then. And write an issue or something if I can't figure it out.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would you need to keep a CommandLineArg key in the input that doesn't have a value? Or do you also want to mutate the list of parameters themselves?

Copy link
Contributor

@riesentoaster riesentoaster May 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly. I'm doing differential fuzzing and I want to restrict the parameter space to command line arguments that don't alter the environment (such as output file args), which is why I cannot just use an arbitrary representation like an array of strings or a byte array which gets split at a magic value.

Within those, I want the fuzzer to explore based on different combinations of CLAs, including the presence/absence of them and different values for those that take parameters. So inputs passed to the executor (or how you'd pass them manually on the command line) could look something like this:

echo "This is a stdin byte array"           | binary_under_test --arg1_with_param "value of param for arg1" --arg2_without_param --arg3_without_param --arg4_with_param "some value"
echo "This is another byte array for stdin" | binary_under_test --arg1_with_param "mutated value of param for arg1"
echo "More stdin"                           | binary_under_test --arg2_without_param --arg3_without_param

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like a MutlipartInput will be the thing to use. Either way having the current PR in is good

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either way having the current PR in is good

I'll check again once this PR is merged. Thank you for your suggestions!

@@ -351,7 +351,7 @@ where
return result.replace(Ok(MutationResult::Skipped));
}
bytes.truncate(new_size);
core::mem::swap(input.bytes_mut(), &mut bytes);
input.bytes_mut().copy_from_slice(&bytes);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@addisoncrump this no longer works since it's not a Vec. I believe it shouldn't matter?
Also, why don't we pass the original slice to the custom mutator directly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I guess because it max_size big we want to allocate the dude (could make sense to keep a max_size vec around in the struct, though)

@domenukk
Copy link
Member Author

@rmalmain QEMU CI seems unhappy, probably flakey(?)

  From https://gitlab.com/qemu-project/berkeley-softfloat-3
   * branch            b64af41c3276f97f0e181920400ee056b9c88037 -> FETCH_HEAD
  HEAD is now at b64af41 Fix typo in function 'softfloat_propagateNaNF128M' for RISC-V.
  From https://gitlab.com/qemu-project/berkeley-testfloat-3
   * branch            e7af9751d9f9fd3b47911f51a5cfd08af256a9ab -> FETCH_HEAD
  HEAD is now at e7af975 Fix -Wreturn-type errors in fNNRandom
  thread 'main' panicked at /__w/LibAFL/LibAFL/libafl_qemu/libafl_qemu_build/src/build.rs:502:44:
  Partial linked failure: Os { code: 2, kind: NotFound, message: "No such file or directory" }
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

@domenukk
Copy link
Member Author

domenukk commented May 20, 2024

Might have to undo aa77494 - with it we need to specify the input type for havoc_mutations...

/edit: Was only the case for tmin, so it's ok

@domenukk domenukk requested a review from addisoncrump May 20, 2024 08:03
@rmalmain
Copy link
Member

@rmalmain QEMU CI seems unhappy, probably flakey(?)

  From https://gitlab.com/qemu-project/berkeley-softfloat-3
   * branch            b64af41c3276f97f0e181920400ee056b9c88037 -> FETCH_HEAD
  HEAD is now at b64af41 Fix typo in function 'softfloat_propagateNaNF128M' for RISC-V.
  From https://gitlab.com/qemu-project/berkeley-testfloat-3
   * branch            e7af9751d9f9fd3b47911f51a5cfd08af256a9ab -> FETCH_HEAD
  HEAD is now at e7af975 Fix -Wreturn-type errors in fNNRandom
  thread 'main' panicked at /__w/LibAFL/LibAFL/libafl_qemu/libafl_qemu_build/src/build.rs:502:44:
  Partial linked failure: Os { code: 2, kind: NotFound, message: "No such file or directory" }
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Yes, I observed it in other PRs. Probably linked to #2219. I'm investigating.

fn test_bytessubinput_use_vec() {
let mut test_vec = vec![0, 1, 2, 3, 4];
let mut test_vec = &mut test_vec;
let mut sub_vec = test_vec.sub_input(1..2);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced this with a wrapper type, but it still looks similar enough

@domenukk domenukk changed the title Add BytesSubMutator that allows us to mutate sub-parts of a bytes-backed input Add BytesSubInput that allows us to mutate sub-parts of a bytes-backed input May 21, 2024
@domenukk domenukk merged commit 684b312 into main May 21, 2024
46 of 102 checks passed
@domenukk domenukk deleted the bytessubinput branch May 21, 2024 23:50
riesentoaster pushed a commit to riesentoaster/LibAFL that referenced this pull request May 24, 2024
…splus#2220)

* Add BytesSubMutator that allows us to mutate sub-parts of a bytes-backed input

* no_std

* fix string mutator

* make build

* Fix clippy on macOS

* Docs

* More docs

* Better docs

* --amend

* Renamed bsi to sub_input. Too much BSI

* More more

* balance backticks

* Make splicing usable with sub_input (not that it makes sense)

* More annotations

* more input annotations?

* Implement HasMutatorBytes for &mut Vec

* clippy

* Use a wrapper type instead

* Add wrapper type for Vec as well

* Remove the duplicate BytesInput... lol
@LCBH LCBH mentioned this pull request Apr 30, 2025
16 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants