Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read out-of-bounds in TextEndsWithNewline #379

Closed
gaa-cifasis opened this issue Feb 28, 2016 · 3 comments
Closed

Read out-of-bounds in TextEndsWithNewline #379

gaa-cifasis opened this issue Feb 28, 2016 · 3 comments

Comments

@gaa-cifasis
Copy link

Hello,

We found a read out-of-bounds in tidy-html5 (git revision 03a643f). A test case to reproduce it is available here. You can see the ASAN report:

==29141== ERROR: AddressSanitizer: heap-buffer-overflow on address 0xb4b000ff at pc 0x81942b8 bp 0xbfffee48 sp 0xbfffee3c
READ of size 1 at 0xb4b000ff thread T0
#0 0x81942b7 (/home/vagrant/afl-tests/progs/tidy-html5/build/cmake/tidy+0x81942b7)
#1 0x81b4233 (/home/vagrant/afl-tests/progs/tidy-html5/build/cmake/tidy+0x81b4233)
#2 0x81b4233 (/home/vagrant/afl-tests/progs/tidy-html5/build/cmake/tidy+0x81b4233)
#3 0x81b74e3 (/home/vagrant/afl-tests/progs/tidy-html5/build/cmake/tidy+0x81b74e3)
#4 0x81b54a3 (/home/vagrant/afl-tests/progs/tidy-html5/build/cmake/tidy+0x81b54a3)
#5 0x81b54a3 (/home/vagrant/afl-tests/progs/tidy-html5/build/cmake/tidy+0x81b54a3)
#6 0x81b54a3 (/home/vagrant/afl-tests/progs/tidy-html5/build/cmake/tidy+0x81b54a3)
#7 0x81b4aeb (/home/vagrant/afl-tests/progs/tidy-html5/build/cmake/tidy+0x81b4aeb)
#8 0x80aaeb7 (/home/vagrant/afl-tests/progs/tidy-html5/build/cmake/tidy+0x80aaeb7)
#9 0x80541af (/home/vagrant/afl-tests/progs/tidy-html5/build/cmake/tidy+0x80541af)
#10 0xb6851a82 (/lib/i386-linux-gnu/libc-2.19.so+0x19a82)
#11 0x8058825 (/home/vagrant/afl-tests/progs/tidy-html5/build/cmake/tidy+0x8058825)
0xb4b000ff is located 1 bytes to the left of 8192-byte region [0xb4b00100,0xb4b02100)
allocated by thread T0 here:
#0 0xb69fc854 (/usr/lib/i386-linux-gnu/libasan.so.0.0.0+0x16854)
#1 0x82040c4 (/home/vagrant/afl-tests/progs/tidy-html5/build/cmake/tidy+0x82040c4)
Shadow bytes around the buggy address:
0x3695ffc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x3695ffd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x3695ffe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x3695fff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x36960000: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x36960010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa[fa]
0x36960020:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x36960030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x36960040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x36960050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x36960060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Heap righ redzone: fb
Freed Heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack partial redzone: f4
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
ASan internal: fe
==29141== ABORTING

and the gdb backtrace is here:

(gdb) bt
#0 0xb7fdd428 in __kernel_vsyscall ()
#1 0xb6866607 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#2 0xb6869a33 in __GI_abort () at abort.c:89
#3 0xb6a042e4 in ?? () from /usr/lib/i386-linux-gnu/libasan.so.0
#4 0xb69f858a in ?? () from /usr/lib/i386-linux-gnu/libasan.so.0
#5 0xb6a00f4b in ?? () from /usr/lib/i386-linux-gnu/libasan.so.0
#6 0xb69ffd3a in __asan_report_error () from /usr/lib/i386-linux-gnu/libasan.so.0
#7 0xb69f88ff in __asan_report_load1 () from /usr/lib/i386-linux-gnu/libasan.so.0
#8 0x081942b8 in TextEndsWithNewline (lexer=, mode=25, node=0xb49017c0)

at /home/vagrant/afl-tests/progs/tidy-html5/src/pprint.c:1870

#9 PPrintText (doc=0xb6601d00, mode=25, indent=0, node=0xb49017c0)

at /home/vagrant/afl-tests/progs/tidy-html5/src/pprint.c:1008

#10 0x081b4234 in PPrintScriptStyle (node=0xb4901830, indent=0, mode=25, doc=0xb6601d00)

at /home/vagrant/afl-tests/progs/tidy-html5/src/pprint.c:1995

#11 prvTidyPPrintTree (doc=0xb6601d00, mode=25, indent=0, node=0xb4901830)

at /home/vagrant/afl-tests/progs/tidy-html5/src/pprint.c:2237

#12 0x081b4234 in PPrintScriptStyle (node=0xb4901910, indent=0, mode=25, doc=0xb6601d00)

at /home/vagrant/afl-tests/progs/tidy-html5/src/pprint.c:1995

#13 prvTidyPPrintTree (doc=0xb6601d00, mode=0, indent=0, node=0xb4901910)

at /home/vagrant/afl-tests/progs/tidy-html5/src/pprint.c:2237

#14 0x081b74e4 in prvTidyPPrintTree (doc=0xb6601d00, mode=0, indent=0, node=0xb49019f0)

at /home/vagrant/afl-tests/progs/tidy-html5/src/pprint.c:2278

#15 0x081b54a4 in prvTidyPPrintTree (doc=0xb6601d00, mode=0, indent=0, node=0xb4901bb0)

at /home/vagrant/afl-tests/progs/tidy-html5/src/pprint.c:2348

#16 0x081b54a4 in prvTidyPPrintTree (doc=0xb6601d00, mode=0, indent=0, node=0xb4901a60)

at /home/vagrant/afl-tests/progs/tidy-html5/src/pprint.c:2348

#17 0x081b54a4 in prvTidyPPrintTree (doc=0xb6601d00, mode=0, indent=0, node=0xb4901b40)

at /home/vagrant/afl-tests/progs/tidy-html5/src/pprint.c:2348

---Type to continue, or q to quit---
#18 0x081b4aec in prvTidyPPrintTree (doc=0xb6601d00, mode=0, indent=0, node=0xb6601d00)

at /home/vagrant/afl-tests/progs/tidy-html5/src/pprint.c:2141

#19 0x080aaeb8 in tidyDocSaveStream (out=0xb6400640, doc=0xb6601d00)

at /home/vagrant/afl-tests/progs/tidy-html5/src/tidylib.c:1899

#20 tidyDocSaveStdout (doc=0xb6601d00) at /home/vagrant/afl-tests/progs/tidy-html5/src/tidylib.c:1096
#21 tidySaveStdout (tdoc=tdoc@entry=0xb6601d00) at /home/vagrant/afl-tests/progs/tidy-html5/src/tidylib.c:997
#22 0x080541b0 in main (argc=2, argv=) at /home/vagrant/afl-tests/progs/tidy-html5/console/tidy.c:1943

Regards,
Gus.

@gaa-cifasis gaa-cifasis changed the title Read out-of-band in TextEndsWithNewline Read out-of-bounds in TextEndsWithNewline Feb 28, 2016
@geoffmcl
Copy link
Contributor

@gaa-cifasis thank you for your usual detailed report... you certainly come across some weird documents... what characer encoding was the doument supposed to be? It is certainly not utf-8! But as discussed before, tidy should have no problem with any byte sequence...

Not in a position to push this to the repo today, but would be much appreciated if you could apply the follow patch, and try again -

diff --git a/src/pprint.c b/src/pprint.c
index bb7eccc..e3dbce5 100644
--- a/src/pprint.c
+++ b/src/pprint.c
@@ -1034,10 +1034,11 @@ static void PPrintText( TidyDocImpl* doc, uint mode, uint indent,
             ix = IncrWS( ix, end, indent, ixWS );
         }
         else if (( c == '&' ) && (TY_(HTMLVersion)(doc) == HT50) &&
-            (((ix + 1) == end) || (((ix + 1) < end) && (isspace(doc->lexer->lexbuf[ix+1])))) )
+            (((ix + 1) == end) || (((ix + 1) < end) && (isspace(doc->lexer->lexbuf[ix+1] & 0xff)))) )
         {
             /*\
              * Issue #207 - This is an unambiguous ampersand need not be 'quoted' in HTML5
+             * Issue #379 - Ensure character passed to isspace in range 0 to 255
             \*/
             PPrintChar( doc, c, (mode | CDATA) );
         }
@@ -1866,8 +1867,11 @@ static int TextEndsWithNewline(Lexer *lexer, Node *node, uint mode )
     if ( (mode & (CDATA|COMMENT)) && TY_(nodeIsText)(node) && node->end > node->start )
     {
         uint ch, ix = node->end - 1;
-        /* Skip non-newline whitespace. */
-        while ( ix >= node->start && (ch = (lexer->lexbuf[ix] & 0xff))
+        /*\
+         *  Skip non-newline whitespace. 
+         *  Issue #379 - Only if ix is GT start can it be decremented!
+        \*/
+        while ( ix > node->start && (ch = (lexer->lexbuf[ix] & 0xff))
                 && ( ch == ' ' || ch == '\t' || ch == '\r' ) )
             --ix;

The first is because your document contains some high bit characters, and without the & 0xff an assert can be triggered in debug form of isspace(int), due to sign extension...

The second is in the service TextEndsWithNewline() you pointed out and how we got away with the >= for so long is beyond me!

Obviously ix can only be decremented if it is greater than the start of the text. In this case we are checking the first character in the lexer buffer, and ix starts as zero, and it is a space, with very bad consequences...

And note here we do add & 0xff to ensure no sign extension, and uses the more tidy traditional testing of the character rather, than using isspace(). Maybe a fuller fix for the first part would be to repeat this logic there, but will leave that for the moment...

Anyway, will push this fix soonest but would be much appreciated if you could test the patch... This is against the latest master, version 5.1.42... Thanks...

@geoffmcl geoffmcl added the Bug label Feb 28, 2016
@geoffmcl geoffmcl added this to the 5.2 milestone Feb 28, 2016
geoffmcl added a commit that referenced this issue Mar 6, 2016
How this lasted so long in the code is a mystery! But of course it will
only be a read out-of-bounds if testing the first character in the lexer,
and it is a spacey char.

A big thanks to @gaa-cifasis for running ASAN tests on Tidy.
geoffmcl added a commit that referenced this issue Mar 6, 2016
@geoffmcl
Copy link
Contributor

geoffmcl commented Mar 6, 2016

@gaa-cifasis because I am currently doing extensive testing on #380 I have also pushed the above patch to the issue-380 branch, and bumped the version to 5.1.45-Exp2.

It would be great if you could checkout issue-380 and re-test... thanks...

In fact every ones help in testing this branch would be most appreciated... it now has two important fixes... thanks...

After sufficient testing of this issue and #380 in this branch will be merged to master...

@geoffmcl
Copy link
Contributor

@gaa-cifasis this is now in master, and as there have been no further comments for several weeks closing this...

Feel free to re-open, a file a new issue... thanks...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants