Commit a30315f

Merge branch 'master' of ssh://github.com/OWASP/java-html-sanitizer
2 parents e6710cf + d8776c7 commit a30315f

3 files changed

+149
-15
lines changed

README.md

Lines changed: 147 additions & 14 deletions
@@ -12,11 +12,25 @@ This code was written with security best practices in mind, has an
1212
extensive test suite, and has undergone
1313
[adversarial security review](docs/attack_review_ground_rules.md).
1414

15-
----
15+
## Table Of Contents
16+
17+
* [Getting Started](#getting-started)
18+
* [Prepackaged Policies](#prepackaged-policies)
19+
* [Crafting a policy](#crafting-a-policy)
20+
* [Custom policies](#custom-policies)
21+
* [Preprocessors](#preprocessors)
22+
* [Telemetry](#telemetry)
23+
* [Questions\?](#questions)
24+
* [Contributing](#contributing)
25+
* [Credits](#credits)
26+
27+
## Getting Started
1628

1729
[Getting Started](docs/getting_started.md) includes instructions on
1830
how to get started with or without Maven.
1931

32+
## Prepackaged Policies
33+
2034
You can use
2135
[prepackaged policies](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20190325.1/org/owasp/html/Sanitizers.html):
2236

@@ -25,7 +39,9 @@ PolicyFactory policy = Sanitizers.FORMATTING.and(Sanitizers.LINKS);
2539
String safeHTML = policy.sanitize(untrustedHTML);
2640
```
2741

28-
or the
42+
## Crafting a policy
43+
44+
The
2945
[tests](https://github.com/OWASP/java-html-sanitizer/blob/master/src/test/java/org/owasp/html/HtmlPolicyBuilderTest.java)
3046
show how to configure your own
3147
[policy](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20190325.1/org/owasp/html/HtmlPolicyBuilder.html):
@@ -40,20 +56,22 @@ PolicyFactory policy = new HtmlPolicyBuilder()
4056
String safeHTML = policy.sanitize(untrustedHTML);
4157
```
4258

43-
or you can write
59+
## Custom Policies
60+
61+
You can write
4462
[custom policies](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20190325.1/org/owasp/html/ElementPolicy.html)
4563
to do things like changing `h1`s to `div`s with a certain class:
4664

4765
```Java
4866
PolicyFactory policy = new HtmlPolicyBuilder()
4967
.allowElements("p")
5068
.allowElements(
51-
new ElementPolicy() {
52-
public String apply(String elementName, List<String> attrs) {
53-
attrs.add("class");
54-
attrs.add("header-" + elementName);
55-
return "div";
56-
}
69+
(String elementName, List<String> attrs) -> {
70+
// Add a class attribute.
71+
attrs.add("class");
72+
attrs.add("header-" + elementName);
73+
// Return elementName to include, null to drop.
74+
return "div";
5775
}, "h1", "h2", "h3", "h4", "h5", "h6")
5876
.toFactory();
5977
String safeHTML = policy.sanitize(untrustedHTML);
@@ -64,14 +82,129 @@ need to be explicitly whitelisted using the `allowWithoutAttributes()`
6482
method if you want them to be allowed through the filter when these
6583
elements do not include any attributes.
6684

67-
----
85+
[Attribute policies](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20190325.1/org/owasp/html/AttributePolicy.html) allow running custom code too. Adding an attribute policy will not water down any default policy like `style` or URL attribute checks.
86+
87+
```Java
88+
PolicyFactory policy = new HtmlPolicyBuilder()
89+
.allowElements("div", "span")
90+
.allowAttributes("data-foo")
91+
.matching(
92+
(String elementName, String attributeName, String value) -> {
93+
// Return value for the attribute or null to drop.
return value;  // Placeholder: keeps the attribute unchanged.
94+
})
95+
.onElements("div", "span")
96+
.toFactory();
97+
```
98+
99+
## Preprocessors
100+
101+
Preprocessors allow inserting text and making large-scale structural changes.
102+
103+
```Java
104+
PolicyFactory policy = new HtmlPolicyBuilder()
105+
// Use a preprocessor to be backwards compatible with the
106+
// <plaintext> element which
107+
.withPreprocessor(
108+
(HtmlStreamEventReceiver r) -> {
109+
// Provide user with info about links before they click.
110+
// Before: <a href="https://example.com/...">
111+
// After: (https://example.com) <a href="https://example.com/...">
112+
return new HtmlStreamEventReceiverWrapper(r) {
113+
@Override public void openTag(String elementName, List<String> attrs) {
114+
if ("a".equals(elementName)) {
115+
for (int i = 0, n = attrs.size(); i < n; i += 2) {
116+
if ("href".equals(attrs.get(i))) {
117+
String url = attrs.get(i + 1);
118+
String origin;
119+
try {
120+
URI uri = new URI(url);
121+
String scheme = uri.getScheme();
122+
String authority = uri.getRawAuthority();
123+
if (scheme == null && authority == null) {
124+
origin = null;
125+
} else {
126+
origin = (scheme != null ? scheme + ":" : "")
127+
+ (authority != null ? "//" + authority : "");
128+
}
129+
} catch (URISyntaxException ex) {
130+
origin = "about:invalid";
131+
}
132+
if (origin != null) {
133+
text(" (" + origin + ") ");
134+
}
135+
}
136+
}
137+
}
138+
super.openTag(elementName, attrs);
139+
}
140+
};
141+
})
142+
.allowElements("a")
143+
...
144+
.toFactory();
145+
146+
```
147+
148+
Preprocessing happens before a policy is applied, so it cannot affect the security
149+
of the output.
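
The origin computation inside the preprocessor example can be tried on its own, without the sanitizer on the classpath. The following is a minimal sketch using only `java.net.URI`; the class name `OriginLabel` and the method `originOf` are hypothetical names chosen for illustration and are not part of the library.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper for illustration; not part of the sanitizer library. */
final class OriginLabel {
  private OriginLabel() {}

  /**
   * Returns the scheme plus authority of a URL, null when the URL has
   * neither (for example a relative path), or "about:invalid" when the
   * URL does not parse.
   */
  static String originOf(String url) {
    try {
      URI uri = new URI(url);
      String scheme = uri.getScheme();
      String authority = uri.getRawAuthority();
      if (scheme == null && authority == null) {
        return null;
      }
      return (scheme != null ? scheme + ":" : "")
          + (authority != null ? "//" + authority : "");
    } catch (URISyntaxException ex) {
      return "about:invalid";
    }
  }
}
```

Under these assumptions, `OriginLabel.originOf("https://example.com/a")` yields `https://example.com`, and a relative path such as `/a/b` yields `null`, so no label would be emitted for it.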
150+
151+
## Telemetry
152+
153+
When a policy rejects an element or attribute it notifies an [HtmlChangeListener](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20190325.1/org/owasp/html/HtmlChangeListener.html).
154+
155+
You can use this to keep track of policy violation trends and find out when someone
156+
is making an effort to breach your security.
157+
158+
```Java
159+
PolicyFactory myPolicyFactory = ...;
160+
// If you need to associate reports with some context, you can do so.
161+
MyContextClass myContext = ...;
162+
163+
String sanitizedHtml = myPolicyFactory.sanitize(
164+
unsanitizedHtml,
165+
new HtmlChangeListener<MyContextClass>() {
166+
@Override
167+
public void discardedTag(MyContextClass context, String elementName) {
168+
// ...
169+
}
170+
@Override
171+
public void discardedAttributes(
172+
MyContextClass context, String elementName, String... attributeNames) {
173+
// ...
174+
}
175+
},
176+
myContext);
177+
```
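
To follow violation trends over time, the listener callbacks above could forward to a small counter. The following is a rough sketch using only the standard library; `ViolationTally` and its methods are hypothetical names and are not part of the sanitizer API. The assumption is that `discardedTag` would call `recordTag` and `discardedAttributes` would call `recordAttributes`.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical helper for illustration; not part of the sanitizer API. */
final class ViolationTally {
  // Thread-safe counters keyed by "<element>" or "element@attribute".
  private final ConcurrentHashMap<String, LongAdder> counts =
      new ConcurrentHashMap<>();

  /** Intended to be called from a discardedTag callback. */
  void recordTag(String elementName) {
    counts.computeIfAbsent("<" + elementName + ">", k -> new LongAdder())
        .increment();
  }

  /** Intended to be called from a discardedAttributes callback. */
  void recordAttributes(String elementName, String... attributeNames) {
    for (String attributeName : attributeNames) {
      counts.computeIfAbsent(
          elementName + "@" + attributeName, k -> new LongAdder())
          .increment();
    }
  }

  /** Returns a sorted copy of the current counts. */
  Map<String, Long> snapshot() {
    Map<String, Long> out = new TreeMap<>();
    counts.forEach((key, adder) -> out.put(key, adder.sum()));
    return out;
  }
}
```

A tally like this only reports what the policy removed; it says nothing about whether the original input was safe.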
178+
179+
**Note**: Even if a string sanitizes with no change notifications, the input
180+
string is not necessarily safe to use. Only use the output of the sanitizer.
181+
182+
The sanitizer ensures that the output is in a sub-set of HTML that commonly
183+
used HTML parsers will agree on the meaning of, but the absence of
184+
notifications does not mean that the input is in such a sub-set,
185+
only that it does not contain elements or attributes that were removed.
186+
187+
See ["Why sanitize when you can validate"](https://github.com/OWASP/java-html-sanitizer/blob/master/docs/html-validation.md) for more on this topic.
188+
189+
## Questions?
68190

69-
Subscribe to the
70-
[mailing list](http://groups.google.com/group/owasp-java-html-sanitizer-support)
71-
to be notified of known [Vulnerabilities](docs/vulnerabilities.md).
72191
If you wish to report a vulnerability, please see
73192
[AttackReviewGroundRules](docs/attack_review_ground_rules.md).
74193

75-
----
194+
Subscribe to the
195+
[mailing list](http://groups.google.com/group/owasp-java-html-sanitizer-support)
196+
to be notified of known [Vulnerabilities](docs/vulnerabilities.md) and important updates.
197+
198+
## Contributing
199+
200+
If you would like to contribute, please ping [@mvsamuel](https://twitter.com/mvsamuel) or [@manicode](https://twitter.com/manicode).
201+
202+
We welcome [issue reports](https://github.com/OWASP/java-html-sanitizer/issues) and PRs.
203+
PRs that change behavior or that add functionality should include both positive and
204+
[negative tests](https://www.guru99.com/negative-testing.html).
205+
206+
Please be aware that contributions fall under the [Apache 2.0 License](https://github.com/OWASP/java-html-sanitizer/blob/master/COPYING).
207+
208+
## Credits
76209

77210
[Thanks to everyone who has helped with criticism and code](docs/credits.md)

docs/client-side-templates.md

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ Many client-side templates look for special constructs in text nodes. Often, us
2222

2323
## Client side template / expression attributes
2424

25-
When filtering client-side templates, it should also be considered to fully cover attributes containing expressions and parseable information that might cause damage or lead to arbitary JavaScript execution.
25+
When filtering client-side templates, it should also be considered to fully cover attributes containing expressions and parseable information that might cause damage or lead to arbitrary JavaScript execution.
2626

2727
| Template Language | Attrbutes | Notes |
2828
|-------------------|-----------|-------|

docs/credits.md

Lines changed: 1 addition & 0 deletions
@@ -1,5 +1,6 @@
11
# Credits
22

3+
* 0xflotus
34
* agustin.lucchetti
45
* andy-h-chen
56
* augustd
