Use Sphinx 1.4.9 for now #15

methane · 2017-02-11T02:09:51Z

Sphinx 1.5 is stricter.
We should fix the issues it reports before using Sphinx 1.5 on Travis.

methane · 2017-02-11T02:33:49Z

The test failed even with sphinx-1.4.9.
#16 may fix it.

TODO:
- news etc.?
- test somehow? at least make sure semantic tests are adequate
- that "older version" path... shouldn't it be MAYBE?
- mention explicitly in commit message that *this* is the actual algorithm from UAX #15
- think if there are counter-cases where this is slower. If caller treats MAYBE same as NO... e.g. if caller actually just wants to normalize? May need to parametrize and offer both behaviors.

This lets us return a NO answer instead of MAYBE when that's what a Quick_Check property tells us; or also when that's what the canonical combining classes tell us, after a Quick_Check property has said "maybe".

At a quick test on my laptop, the existing code takes about 6.7 ms/MB (so 6.7 ns per byte) when the quick check returns MAYBE and it has to do the slow comparison:

    $ ./python -m timeit -s 'import unicodedata; s = "\uf900"*500000' -- \
        'unicodedata.is_normalized("NFD", s)'
    50 loops, best of 5: 6.67 msec per loop

With this patch, it gets the answer instantly (78 ns) on the same 1 MB string:

    $ ./python -m timeit -s 'import unicodedata; s = "\uf900"*500000' -- \
        'unicodedata.is_normalized("NFD", s)'
    5000000 loops, best of 5: 78 nsec per loop
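The shell measurement above can also be reproduced from inside Python. The following is a minimal sketch, assuming Python 3.8 or later (where `unicodedata.is_normalized` exists); the helper names `naive_is_normalized` and `per_call_seconds` are made up for illustration and are not part of the patch. It compares the built-in check against the plain normalize-and-compare it is meant to replace, on the same test string.

```python
import timeit
import unicodedata


def naive_is_normalized(form, s):
    # The question is_normalized is meant to answer, computed the slow way:
    # build the whole normalized string, then compare.
    return s == unicodedata.normalize(form, s)


def per_call_seconds(fn, number=20):
    # Average wall-clock seconds per call over `number` calls.
    return timeit.timeit(fn, number=number) / number


# Same test string as in the commit message: U+F900 repeated 500000 times.
s = "\uf900" * 500000

fast = per_call_seconds(lambda: unicodedata.is_normalized("NFD", s))
slow = per_call_seconds(lambda: naive_is_normalized("NFD", s))

print(f"is_normalized:         {fast * 1e6:10.1f} usec per call")
print(f"normalize-and-compare: {slow * 1e6:10.1f} usec per call")
```

The absolute numbers depend on the machine and the Python build, so they will not match the figures quoted above; only the relative gap between the two lines is of interest.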

The purpose of the `unicodedata.is_normalized` function is to answer the question `str == unicodedata.normalize(form, str)` more efficiently than writing just that, by using the "quick check" optimization described in the Unicode standard in UAX #15.

However, it turns out the code doesn't implement the full algorithm from the standard, and as a result we often miss the optimization and end up having to compute the whole normalized string after all.

Implement the standard's algorithm. This greatly speeds up `unicodedata.is_normalized` in many cases where our partial variant of quick-check had been returning MAYBE and the standard algorithm returns NO.

At a quick test on my desktop, the existing code takes about 4.4 ms/MB (so 4.4 ns per byte) when the partial quick-check returns MAYBE and it has to do the slow normalize-and-compare:

    $ build.base/python -m timeit -s 'import unicodedata; s = "\uf900"*500000' \
        -- 'unicodedata.is_normalized("NFD", s)'
    50 loops, best of 5: 4.39 msec per loop

With this patch, it gets the answer instantly (58 ns) on the same 1 MB string:

    $ build.dev/python -m timeit -s 'import unicodedata; s = "\uf900"*500000' \
        -- 'unicodedata.is_normalized("NFD", s)'
    5000000 loops, best of 5: 58.2 nsec per loop
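To make the quick-check idea concrete, here is a rough pure-Python sketch of the UAX #15 loop, restricted to NFD. It is not the C code from the patch. It rests on a stated assumption: that `NFD_Quick_Check` is NO exactly for characters with a canonical decomposition, plus the precomposed Hangul syllables (whose decomposition is algorithmic and not reported by `unicodedata.decomposition`). The function names are invented for this illustration. NFD has no MAYBE values, so this sketch only ever answers YES or NO; the composed forms (NFC, NFKC) do have MAYBE and are not covered here.

```python
import unicodedata

YES, NO = "YES", "NO"


def nfd_quick_check_char(ch):
    # Assumption: NFD_Quick_Check=NO means "has a canonical decomposition".
    # Precomposed Hangul syllables decompose algorithmically, so they are
    # handled by range rather than through unicodedata.decomposition().
    if 0xAC00 <= ord(ch) <= 0xD7A3:
        return NO
    decomp = unicodedata.decomposition(ch)
    # Compatibility decompositions carry a "<tag>" prefix and do not count.
    if decomp and not decomp.startswith("<"):
        return NO
    return YES


def quick_check_nfd(s):
    last_ccc = 0
    for ch in s:
        ccc = unicodedata.combining(ch)
        # Combining marks out of canonical order: definitely not normalized.
        if ccc != 0 and last_ccc > ccc:
            return NO
        # A character that can never appear in NFD: definitely not normalized.
        if nfd_quick_check_char(ch) == NO:
            return NO
        last_ccc = ccc
    return YES


print(quick_check_nfd("\uf900" * 5))    # NO, decided at the first character
print(quick_check_nfd("a\u0301\u0338")) # NO, combining classes 230 then 1
print(quick_check_nfd("a\u0338\u0301")) # YES
```

The two early `return NO` branches correspond to the two sources of a NO answer discussed in this thread: the Quick_Check property itself, and the canonical combining classes. Both let the function stop at the first offending character instead of normalizing the whole string.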

…H-15558)

The purpose of the `unicodedata.is_normalized` function is to answer the question `str == unicodedata.normalize(form, str)` more efficiently than writing just that, by using the "quick check" optimization described in the Unicode standard in UAX #15.

However, it turns out the code doesn't implement the full algorithm from the standard, and as a result we often miss the optimization and end up having to compute the whole normalized string after all.

Implement the standard's algorithm. This greatly speeds up `unicodedata.is_normalized` in many cases where our partial variant of quick-check had been returning MAYBE and the standard algorithm returns NO.

At a quick test on my desktop, the existing code takes about 4.4 ms/MB (so 4.4 ns per byte) when the partial quick-check returns MAYBE and it has to do the slow normalize-and-compare:

    $ build.base/python -m timeit -s 'import unicodedata; s = "\uf900"*500000' \
        -- 'unicodedata.is_normalized("NFD", s)'
    50 loops, best of 5: 4.39 msec per loop

With this patch, it gets the answer instantly (58 ns) on the same 1 MB string:

    $ build.dev/python -m timeit -s 'import unicodedata; s = "\uf900"*500000' \
        -- 'unicodedata.is_normalized("NFD", s)'
    5000000 loops, best of 5: 58.2 nsec per loop

This restores a small optimization that the original version of this code had for the `unicodedata.normalize` use case.

With this, that case is actually faster than in master!

    $ build.base/python -m timeit -s 'import unicodedata; s = "\u0338"*500000' \
        -- 'unicodedata.normalize("NFD", s)'
    500 loops, best of 5: 561 usec per loop

    $ build.dev/python -m timeit -s 'import unicodedata; s = "\u0338"*500000' \
        -- 'unicodedata.normalize("NFD", s)'
    500 loops, best of 5: 512 usec per loop

…orithm. (pythonGH-15558)

The purpose of the `unicodedata.is_normalized` function is to answer the question `str == unicodedata.normalize(form, str)` more efficiently than writing just that, by using the "quick check" optimization described in the Unicode standard in UAX #15.

However, it turns out the code doesn't implement the full algorithm from the standard, and as a result we often miss the optimization and end up having to compute the whole normalized string after all.

Implement the standard's algorithm. This greatly speeds up `unicodedata.is_normalized` in many cases where our partial variant of quick-check had been returning MAYBE and the standard algorithm returns NO.

At a quick test on my desktop, the existing code takes about 4.4 ms/MB (so 4.4 ns per byte) when the partial quick-check returns MAYBE and it has to do the slow normalize-and-compare:

    $ build.base/python -m timeit -s 'import unicodedata; s = "\uf900"*500000' \
        -- 'unicodedata.is_normalized("NFD", s)'
    50 loops, best of 5: 4.39 msec per loop

With this patch, it gets the answer instantly (58 ns) on the same 1 MB string:

    $ build.dev/python -m timeit -s 'import unicodedata; s = "\uf900"*500000' \
        -- 'unicodedata.is_normalized("NFD", s)'
    5000000 loops, best of 5: 58.2 nsec per loop

This restores a small optimization that the original version of this code had for the `unicodedata.normalize` use case.

With this, that case is actually faster than in master!

    $ build.base/python -m timeit -s 'import unicodedata; s = "\u0338"*500000' \
        -- 'unicodedata.normalize("NFD", s)'
    500 loops, best of 5: 561 usec per loop

    $ build.dev/python -m timeit -s 'import unicodedata; s = "\u0338"*500000' \
        -- 'unicodedata.normalize("NFD", s)'
    500 loops, best of 5: 512 usec per loop

(cherry picked from commit 2f09413)

Co-authored-by: Greg Price <gnprice@gmail.com>

16: Warn for specific thread module methods r=ltratt a=nanjekyejoannah

Don't merge until python#13 and python#14 are merged, since some helper code cuts across them.

This replaces python#15.

Threading module notes

Python 2:

```
>>> from thread import get_ident
>>> from threading import get_ident
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name get_ident
>>> import threading
>>> from threading import _get_ident
>>>
```

Python 3:

```
>>> from threading import get_ident
>>> from thread import get_ident
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'thread'
>>>
```

**Note:** There is no neutral way of porting

Co-authored-by: Joannah Nanjekye <jnanjekye@python.org>

Use Sphinx 1.4.9 for now

d9c54db

the-knights-who-say-ni added the CLA signed label Feb 11, 2017

vstinner approved these changes Feb 11, 2017

methane closed this Feb 11, 2017

methane deleted the sphinx-1.4 branch February 11, 2017 02:33

paulmon added a commit to paulmon/cpython that referenced this pull request Jan 10, 2019

Merge pull request python#15 from paulmon/win-arm32-fix-tests

abaaa92

Win arm32 fix tests

gnprice mentioned this pull request Aug 28, 2019

bpo-37966: Fully implement the UAX #15 quick-check algorithm. #15558

Merged

gnprice added a commit to gnprice/cpython that referenced this pull request Aug 29, 2019

Move UAX #15 link to doc-comment.

27e8122

miss-islington mentioned this pull request Sep 4, 2019

[3.8] closes bpo-37966: Fully implement the UAX GH-15 quick-check algorithm. (GH-15558) #15671

Merged

emmatyping added a commit to emmatyping/cpython that referenced this pull request Mar 16, 2020

Make __parameters__ lazy (python#15)

e50136d

Now we can also remove `__setstate__`.

pablogsal mentioned this pull request Jun 12, 2020

bpo-40958: Avoid buffer overflow in the parser when indexing the current line #20842

Closed

shihai1991 mentioned this pull request Jun 23, 2020

bpo-1635741: Enable unicode_release_interned() without insure or valgrind. #21087

Closed

larryhastings mentioned this pull request Jun 7, 2022

The Great Argument Clinic Conversion Derby Meta-Issue #64386

Closed

itachaaa mentioned this pull request Aug 22, 2022

Python 3.10 hang at exit in drop_gil() (due to resource warning at exit?) #91414

Open

mdboom mentioned this pull request Aug 25, 2022

Assert and incorrect error message when loading source file containing invalid UTF-8 #96268

Closed

mdboom mentioned this pull request Nov 22, 2022

Type punning (and strict aliasing) issue in Py_CLEAR() and Py_SETREF() macros: Python --enable-pystats is miscompiled #99701

Closed

ziegenbalg mentioned this pull request Nov 30, 2022

double free in io.TextIOWrapper #72573

Closed

gvanrossum mentioned this pull request Aug 22, 2023

heap-use-after-free in _PyFunction_LookupByVersion #108253

Closed

LinanV mentioned this pull request Oct 23, 2023

Objects/typeobject.c: No such file or directory. #111203

Closed

stasos24 mentioned this pull request Oct 24, 2023

Modules/cjkcodecs/_codecs_iso2022.c - read out of bounds #101180

Closed

williamhu020 mentioned this pull request Nov 5, 2023

Use the API C of 'Py-NewInterpreterFromConfig' to exit unexpectedly in multiple threads. #111751

Closed

kcatss mentioned this pull request Nov 15, 2023

Use-after-free in unregister() of atexit module #112127

Open

kcatss mentioned this pull request Jan 14, 2024

crash in long_vectorcall in longobject.c #114050

Closed

kcatss mentioned this pull request Feb 12, 2024

Segmentation Fault in pthread_getcpuclockid function in time module #115378

Closed

kcatss mentioned this pull request Feb 20, 2024

Use After Free at _heapreplace_max #115706

Closed

ngoldbaum mentioned this pull request Sep 23, 2024

Crash running PyO3 tests with --test-threads=1000 #124375

Closed
