(actually only two types of file, lol)
A command-line tool, written in Rust, that redacts text in one or more files wherever it matches a given regex.
It is inspired by the following:
- Given a file in .txt or .docx format and a list of regexes that one wishes to redact, raf redacts it, putting random alphanumeric characters in place of all text that matches any of the regexes.
- It also provides a mapping, in the form of a .json file, between the original text and the redacted text.
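The two points above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions, not raf's actual code: the match positions are passed in as byte ranges (in a real tool they would come from a regex engine), each replacement is assumed to have the same number of characters as the text it replaces, and XorShift64, random_alphanumeric, and redact_spans are hypothetical names. A tiny hand-written generator stands in for a proper random number source so the sketch needs no external crates.

```rust
/// Tiny xorshift64 generator so the sketch needs no external crates.
/// (Assumption: a real tool would more likely use a crate such as `rand`.)
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

const ALPHANUMERIC: &[u8] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

/// Returns `len` pseudo-random ASCII alphanumeric characters.
fn random_alphanumeric(rng: &mut XorShift64, len: usize) -> String {
    (0..len)
        .map(|_| {
            let index = (rng.next_u64() % ALPHANUMERIC.len() as u64) as usize;
            ALPHANUMERIC[index] as char
        })
        .collect()
}

/// Replaces each byte range in `spans` with random alphanumerics and
/// returns the redacted text plus (unredacted, redacted) pairs.
/// `spans` must be sorted, non-overlapping, and lie on char boundaries.
fn redact_spans(
    text: &str,
    spans: &[(usize, usize)],
    rng: &mut XorShift64,
) -> (String, Vec<(String, String)>) {
    let mut out = String::with_capacity(text.len());
    let mut mapping = Vec::new();
    let mut cursor = 0;
    for &(start, end) in spans {
        // Copy the untouched text between the previous match and this one.
        out.push_str(&text[cursor..start]);
        let original = &text[start..end];
        // Assumption: the replacement keeps the character count of the match.
        let replacement = random_alphanumeric(rng, original.chars().count());
        out.push_str(&replacement);
        mapping.push((original.to_string(), replacement));
        cursor = end;
    }
    out.push_str(&text[cursor..]);
    (out, mapping)
}

fn main() {
    let text = "Call 555-0100 or 555-0199 today.";
    // Byte ranges a regex matcher might report for a phone pattern.
    let spans = [(5, 13), (17, 25)];
    let mut rng = XorShift64(0x9E37_79B9_7F4A_7C15); // any non-zero seed
    let (redacted, mapping) = redact_spans(text, &spans, &mut rng);
    println!("{}", redacted);
    for (original, replacement) in &mapping {
        println!("{} -> {}", original, replacement);
    }
}
```

The returned pairs carry the same information as the .json mapping: the original text on one side and its replacement on the other.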
Disclaimer:
Sadly, for now, you can only redact .txt and .docx files.
You need to have Rust installed first.
- Git clone this repo
$ git clone https://github.com/jymchng/raf.git
- cargo run it
For example, to redact all phone numbers and emails in every file in a folder, use:
$ cargo run -- folder ./tests/test_files -t phone emails
Note the folder subcommand.
Or you can use:
$ cargo run -- folder ./tests/test_files -t phone email
This is because email and emails are categorized under the list of types in the patterns.json file, i.e.:
{
  "pattern": "([a-z0-9_+]([a-z0-9_+.\\-]*[a-z0-9_+\\-])?)@([a-z0-9]+([\\-\\.]{1}[a-z0-9]+)*\\.[a-z]{2,6})",
  "type": [
    "email",
    "emails"
  ]
}
To redact files in sub-directories of your folder, recursively, use:
$ cargo run -- folder ./tests/test_files -t phone -r
Note the -r parameter.
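A rough sketch of what a recursive switch like -r might do internally, using only the Rust standard library. It is an assumption about the behaviour, not raf's implementation: is_supported and collect_files are hypothetical names, and the extension check simply mirrors the .txt and .docx restriction from the disclaimer.

```rust
use std::env;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// True for the two formats the disclaimer says are supported.
fn is_supported(path: &Path) -> bool {
    matches!(
        path.extension().and_then(|ext| ext.to_str()),
        Some("txt") | Some("docx")
    )
}

/// Collects supported files in `dir`; descends into sub-directories
/// only when `recursive` is true (the assumed meaning of `-r`).
fn collect_files(dir: &Path, recursive: bool, found: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            if recursive {
                collect_files(&path, recursive, found)?;
            }
        } else if is_supported(&path) {
            found.push(path);
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Folder to scan: first command-line argument, or the current directory.
    let folder = env::args().nth(1).unwrap_or_else(|| ".".to_string());
    let mut found = Vec::new();
    collect_files(Path::new(&folder), true, &mut found)?;
    found.sort();
    for path in &found {
        println!("{}", path.display());
    }
    Ok(())
}
```

With recursive set to false, only the files directly inside the folder are returned, which corresponds to running the folder subcommand without -r under the assumption above.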
A .json file is also provided as a mapping between the redacted text and the original text.
Example:
[
  {
    "unredacted_text": "[email protected]",
    "redacted_text": "QLfyzaoQxQGHDWmaECMl1Gxdf8"
  },
  ...
  {
    "unredacted_text": "[email protected]",
    "redacted_text": "nM0F4AehzbLDw4RYhGn4"
  },
  ...
]
You can build and install the binary using:
$ cargo install --git https://github.com/jymchng/raf.git
Then use the commands as discussed in the previous section. Example:
$ raf file ./tests/test_files/file1.txt -t phone email
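The names passed to -t in commands like this one appear to be looked up in the type lists of patterns.json, which is why email and emails select the same regex. A minimal sketch of such a lookup follows; PatternEntry and select_patterns are hypothetical names, the phone pattern is a made-up placeholder, and raf's real lookup may differ.

```rust
/// One entry of patterns.json: a regex plus the type names that select it.
struct PatternEntry {
    pattern: &'static str,
    types: &'static [&'static str],
}

/// Email pattern copied from the patterns.json excerpt (JSON escapes removed).
const EMAIL: &str =
    r"([a-z0-9_+]([a-z0-9_+.\-]*[a-z0-9_+\-])?)@([a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,6})";

/// Placeholder only: this phone pattern is NOT taken from raf.
const PHONE_PLACEHOLDER: &str = r"\d{3}-\d{4}";

/// Returns the patterns whose type list contains any requested name,
/// in the order the entries are stored.
fn select_patterns<'a>(entries: &'a [PatternEntry], requested: &[&str]) -> Vec<&'a str> {
    entries
        .iter()
        .filter(|entry| entry.types.iter().any(|t| requested.contains(t)))
        .map(|entry| entry.pattern)
        .collect()
}

fn main() {
    let entries = [
        PatternEntry { pattern: EMAIL, types: &["email", "emails"] },
        PatternEntry { pattern: PHONE_PLACEHOLDER, types: &["phone"] },
    ];
    // Both spellings select the same set of patterns.
    for requested in [["phone", "email"], ["phone", "emails"]] {
        let selected = select_patterns(&entries, &requested);
        println!("{:?} selects {} pattern(s)", requested, selected.len());
    }
}
```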
Use the --help flag to read the help section of raf.
Example:
$ raf folder --help

