-
Notifications
You must be signed in to change notification settings - Fork 249
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix GH-121; Make attribute order deterministic #137
Fix GH-121; Make attribute order deterministic #137
Conversation
b34286f
to
bbe00c0
Compare
Passing Travis CI build: https://travis-ci.org/jsocol/bleach/builds/43805108 |
|
Thanks, @msabramo! Rather than making ordereddict a permanent dependency, especially since it's only for the oldest supported Python, can we do something like html5lib/html5lib-python#177? Mind doing that then squashing it all down to one commit? |
Sure, that makes sense. Coming up shortly... |
by passing `alphabetical_attributes=True` to `html5lib.serializer.HTMLSerializer` Fixes: mozillaGH-121
bbe00c0
to
2df578f
Compare
Changed and squashed. Tested with:
|
Now I'm wondering if I should of made it a flag in case some people don't like their attribute order getting changed... e.g.: diff --git a/bleach/__init__.py b/bleach/__init__.py
index 08f0532..51ee396 100644
--- a/bleach/__init__.py
+++ b/bleach/__init__.py
@@ -95,7 +95,8 @@ DEFAULT_CALLBACKS = [linkify_callbacks.nofollow]
def clean(text, tags=ALLOWED_TAGS, attributes=ALLOWED_ATTRIBUTES,
- styles=ALLOWED_STYLES, strip=False, strip_comments=True):
+ styles=ALLOWED_STYLES, strip=False, strip_comments=True,
+ alphabetical_attributes=True):
"""Clean an HTML fragment and return it"""
if not text:
return ''
@@ -111,7 +112,8 @@ def clean(text, tags=ALLOWED_TAGS, attributes=ALLOWED_ATTRIBUTES,
parser = html5lib.HTMLParser(tokenizer=s)
- return _render(parser.parseFragment(text))
+ return _render(parser.parseFragment(text),
+ alphabetical_attributes=alphabetical_attributes)
def linkify(text, callbacks=DEFAULT_CALLBACKS, skip_pre=False,
@@ -364,15 +366,17 @@ def linkify(text, callbacks=DEFAULT_CALLBACKS, skip_pre=False,
return _render(forest)
-def _render(tree):
+def _render(tree, alphabetical_attributes=True):
"""Try rendering as HTML, then XML, then give up."""
- return force_unicode(_serialize(tree))
+ return force_unicode(
+ _serialize(tree,
+ alphabetical_attributes=alphabetical_attributes))
-def _serialize(domtree):
+def _serialize(domtree, alphabetical_attributes=True):
walker = html5lib.treewalkers.getTreeWalker('etree')
stream = walker(domtree)
serializer = HTMLSerializer(quote_attr_values=True,
- alphabetical_attributes=True,
+ alphabetical_attributes=alphabetical_attributes,
omit_optional_tags=False)
return serializer.render(stream)
diff --git a/docs/clean.rst b/docs/clean.rst
index 2fb888b..5cd7e5b 100644
--- a/docs/clean.rst
+++ b/docs/clean.rst
@@ -8,7 +8,8 @@
``clean()`` is Bleach's HTML sanitization method::
def clean(text, tags=ALLOWED_TAGS, attributes=ALLOWED_ATTRIBUTES,
- styles=ALLOWED_STYLES, strip=False, strip_comments=True):
+ styles=ALLOWED_STYLES, strip=False, strip_comments=True,
+ alphabetical_attributes=True):
"""Clean an HTML fragment and return it."""
Given a fragment of HTML, Bleach will parse it according to the HTML5 parsing @jsocol: What do you think? Want me to throw that in there? |
No, without |
…erministic Fix GH-121; Make attribute order deterministic
Great! You're welcome, @jsocol and thank you for creating bleach and sharing it! |
Time for a new release on PyPI perhaps? That would help folks have reliable test runs and I could avoid depending on git in my PR to pypa/readme ... |
Oof, yeah it's been about a year. I'll get on that |
The new version of bleach now sorts attributes alphabetically so that the results are deterministic (see mozilla/bleach#137) so I had to reverse the order of attributes in the expected HTML.
Always serialize attributes in alpha order by passing
alphabetical_attributes=True
tohtml5lib.serializer.HTMLSerializer
.This allows greatly simplifying tests within bleach itself and in code that makes use of bleach.
Fixes: GH-121
Cc: @metalculus84, @dstufft