Need to rewrite cython extensions in pure C #97

asvetlov · 2017-06-20T15:55:48Z

The reason is that multidict.add(key, val) is ten times slower than dict[key] = val.

This is because multidict stores its data internally as a list of cythonized _Item objects.
But creating a Python object (even a cythonized one) is too expensive for our use case.

The solution is to use C structs for the internal data, but Cython has no support for visiting values stored in such structures, i.e. for the tp_traverse and tp_clear slots.

Thus, for the sake of speed, we need a pure C implementation.


mind1m · 2017-06-30T15:15:39Z

Update.
First we are going to do the MVP version:

rewrite multidict._Pair as a C struct in Cython
multidict._items becomes a C array instead of a list
add the corresponding methods to manage the array size (start with 32 elements)
use PyObject* for Pair elements
use incref/decref for Pair elements
allow adding only simple types as elements (see _PyObject_GC_MAY_BE_TRACKED)
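
For illustration only, the plan above can be modelled in plain Python. This is a rough sketch, not the proposed Cython code: the class name PairArray, the doubling growth policy and the use of gc.is_tracked as a stand-in for _PyObject_GC_MAY_BE_TRACKED are all assumptions made for the example.

```python
import gc


class PairArray:
    """Python model of a growable C array of (key, value) slots."""

    INITIAL_CAPACITY = 32  # starting size suggested in the plan above

    def __init__(self):
        self.capacity = self.INITIAL_CAPACITY
        self.size = 0
        # Preallocated slots stand in for a malloc'ed array of structs.
        self._slots = [None] * self.capacity

    def _grow(self):
        # Double the capacity; in C this would be a realloc of the array.
        # (Doubling is an assumption; the plan does not fix a growth policy.)
        self._slots.extend([None] * self.capacity)
        self.capacity *= 2

    def append(self, key, value):
        for obj in (key, value):
            # Rough stand-in for _PyObject_GC_MAY_BE_TRACKED: reject objects
            # the garbage collector tracks, since they could form cycles.
            if gc.is_tracked(obj):
                raise TypeError("only simple (untracked) objects are allowed")
        if self.size == self.capacity:
            self._grow()
        # In C, this is where Py_INCREF(key) and Py_INCREF(value) would go.
        self._slots[self.size] = (key, value)
        self.size += 1
```

Because only untracked objects such as str, bytes or int are accepted, the stored pairs cannot take part in reference cycles, which is what lets the MVP get by without tp_traverse and tp_clear support.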

samuelcolvin · 2018-03-30T16:55:29Z

Given that Python has now agreed to guarantee insertion order for standard dicts in Python 3.6+, can't multidict just use a standard dictionary rather than a list as its core data structure and improve performance?

asvetlov · 2018-03-30T17:46:54Z

dict keeps insertion order but doesn't allow duplicate keys.
The HTTP standard requires this feature; it was the main reason the multidict package was created.
Many Python web frameworks have multidicts in some form: Django, Flask, Pyramid, etc.
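
To make the limitation concrete, here is a minimal sketch in plain Python (not multidict's actual code; the header values are made up for the example). A dict silently drops a repeated header, while a list of pairs keeps every occurrence:

```python
# Hypothetical header data: HTTP allows a header such as Set-Cookie to repeat.
raw_headers = [
    ("Set-Cookie", "a=1"),
    ("Content-Type", "text/plain"),
    ("Set-Cookie", "b=2"),
]

# A plain dict keeps insertion order but only one value per key:
# the later assignment overwrites the earlier one.
as_dict = {}
for key, value in raw_headers:
    as_dict[key] = value

# A list of (key, value) pairs preserves every occurrence in order.
as_pairs = list(raw_headers)
cookies = [value for key, value in as_pairs if key == "Set-Cookie"]

print(as_dict["Set-Cookie"])  # b=2
print(cookies)                # ['a=1', 'b=2']
```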

Sure, the library could be reimplemented by borrowing ideas from CPython's compact dict, but that is another level of complexity (and it needs C-level coding anyway).

Assuming that the usual number of HTTP headers is limited (10, 30, 50 at most), a sequential scan is as fast as a hash table lookup; everything should fit into the CPU cache.
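
As a toy sketch of what such a sequential scan looks like (the class name PairList is hypothetical and this is not the library's implementation; only the method names add, getone and getall mirror multidict's public API):

```python
class PairList:
    """Toy multi-value mapping backed by a flat list of (key, value) pairs."""

    def __init__(self):
        self._items = []

    def add(self, key, value):
        # Append without touching existing entries, so duplicates are kept.
        self._items.append((key, value))

    def getone(self, key):
        # Sequential scan: return the first value stored under key.
        for k, v in self._items:
            if k == key:
                return v
        raise KeyError(key)

    def getall(self, key):
        # Sequential scan collecting every value stored under key.
        return [v for k, v in self._items if k == key]


headers = PairList()
headers.add("Accept", "text/html")
headers.add("Accept", "application/json")
headers.add("Host", "example.com")
```

Each lookup walks the whole list, which is O(n), but with a few dozen entries stored contiguously that walk stays small.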

The current problem is the use of a Python list as internal storage; switching to a C array should help remove the bottleneck, I hope.

samuelcolvin · 2018-03-30T17:55:21Z

Makes sense. I was wondering if you could use a dict to avoid the sequential scan, but I see that it might not make much difference.

asvetlov · 2019-11-21T14:17:32Z

Duplicate of #249

asvetlov mentioned this issue Oct 23, 2017

Decorate _Pair by @cython.freelist #177

Closed

asvetlov mentioned this issue Mar 30, 2018

Speedup aiohttp web server aio-libs/aiohttp#2779

Open

asvetlov mentioned this issue May 18, 2018

Pair list c extension #234

Merged

webknjaz mentioned this issue Aug 28, 2019

Get rid of cython #249

Closed

asvetlov marked this as a duplicate of #249 Nov 21, 2019

asvetlov closed this as completed Nov 21, 2019
