jsoup 1.16.1

jhy released this 29 Apr 06:32
062ebdb

jsoup Java HTML Parser release 1.16.1

Improvements

  • Jsoup.connect(String url) now natively supports URLs with Unicode characters in the path or query string; the caller no longer has to escape them. #1914
  • Calling Node.remove() on a node with no parent is now a no-op, rather than a validation error. #1898
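
A minimal sketch of the two improvements above, assuming jsoup 1.16.1 or later is on the classpath. The class name, method names, and the example.com URL are hypothetical and only serve the illustration; the connection is built but never executed, so no network access takes place.

```java
import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

final class ImprovementsSketch {

    // Builds (but does not execute) a connection for a URL whose path and
    // query contain raw, unescaped Unicode characters. The \\u escapes below
    // spell "cafe" with an accented e and a three-character Japanese word.
    static Connection unicodeConnection() {
        String url = "https://example.com/caf\u00e9?q=\u65e5\u672c\u8a9e";
        return Jsoup.connect(url);
    }

    // Calls remove() on an element that was never attached to a document.
    // With 1.16.1 this is expected to do nothing; earlier releases raised
    // a validation error here.
    static boolean removeOrphan() {
        Element orphan = new Element("p");
        orphan.remove();
        return orphan.parent() == null;
    }

    public static void main(String[] args) {
        Connection con = unicodeConnection();
        System.out.println("Connection created: " + (con != null));
        System.out.println("Orphan removed without error: " + removeOrphan());
    }
}
```

To actually fetch the page, one would call `get()` on the returned connection; that step is left out here so the sketch runs offline.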

Bug Fixes

  • Aligned the HTML Tree Builder processing steps for AfterBody and AfterAfterBody to the updated WHATWG standard, to not pop the stack to close <body> or <html> elements. This prevents an errant </html> closing the preceding structure. Also added appropriate error message outputs in this case. #1851
  • Corrected support for ruby elements (<ruby>, <rp>, <rt>, and <rtc>) to current spec. #1294
  • In Jsoup.connect(String url), if the input URL had components that were already % escaped, they would be escaped again, causing errors when fetched. #1902
  • When tracking input source positions, text in tables that was fostered had invalid positions. #1927
  • When pretty-printing, the first inline Element or Comment in a block would not be wrap-indented if it were preceded by a blank text node. #1906
  • When pretty-printing a <pre> containing block tags, those tags were incorrectly indented. #1891
  • When pretty-printing nested inlineable blocks (such as a <p> in a <td>), the inner element is now indented. #1926
  • <br> tags are now wrap-indented when in block tags (and not when in inline tags). #1911
  • The contents of a sufficiently large <textarea> with un-escaped HTML closing tags could be incorrectly parsed to an empty node. #1929
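
The double-escaping fix (#1902) is easier to picture with a small stand-alone example. The sketch below does not use jsoup's internal code; it uses the JDK's `URLEncoder` (Java 10 or later for the `Charset` overload) purely to show what happens when an already percent-escaped component is escaped a second time. The class and method names are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class DoubleEscapeSketch {

    // Percent-escapes a single URL component as UTF-8.
    static String escape(String component) {
        return URLEncoder.encode(component, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String raw = "\u00e9";                // a single accented e
        String once = escape(raw);            // "%C3%A9"
        String twice = escape(once);          // "%25C3%25A9"

        System.out.println("escaped once:  " + once);
        System.out.println("escaped twice: " + twice);
        // The second pass turns every '%' into "%25", so a server decoding
        // the result once sees the literal text "%C3%A9" instead of the
        // original character, and the request targets the wrong resource.
    }
}
```

This is the general failure mode the fix addresses: input that is already escaped needs to be left alone rather than escaped again.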