Tree query improvements #3443

SchrodingersGat · 2022-07-31T13:35:50Z

#3438 focused my attention on the PartCategory and StockLocation list API endpoints.

Currently they are quite inefficient!

Taking the PartCategory list endpoint as an example: /api/part/category/?limit=<x>

| Number of Results | Number of Queries | Time (s) |
| --- | --- | --- |
| 10 | 24 | 0.1 |
| 100 | 203 | 0.3 |
| 500 | 1000 | 1.0 |
| 1000 | 1994 | 1.9 |
| 5000 | 8910 | 8.6 |

~~This is very slow! (The StockLocation endpoint has similar issues)~~

After fixes applied

| Number of Results | Number of Queries | Time (s) |
| --- | --- | --- |
| 10 | 6 | 0.1 |
| 100 | 6 | 0.1 |
| 500 | 6 | 0.25 |
| 1000 | 6 | 0.45 |
| 5000 | 6 | 2.5 |

There are two culprits here causing 1 + N query problems.

Part Count

Each category object has a field "parts", which is a count of the number of parts in the category (or any of its subcategories).

Solution

Use a query annotation (with a tree-based subquery) to annotate the queryset and prevent any additional database lookups.

This has been tested, and works very well - query count has been cut in half!
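To illustrate the idea, here is a minimal sketch of a tree-based subquery that counts parts for every category in a single query. It is not the annotation used in this PR: it runs raw SQL through Python's built-in `sqlite3` module, and the table layout (`category`, `part`) and the nested-set columns (`tree_id`, `lft`, `rght`) are assumptions made for the example.

```python
import sqlite3

# Hypothetical schema: categories stored as a nested set (tree_id, lft, rght).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (
    id INTEGER PRIMARY KEY, name TEXT,
    tree_id INTEGER, lft INTEGER, rght INTEGER
);
CREATE TABLE part (
    id INTEGER PRIMARY KEY, name TEXT,
    category_id INTEGER REFERENCES category(id)
);
""")

# Electronics contains Resistors and Capacitors; Mechanical is a separate tree.
conn.executemany("INSERT INTO category VALUES (?, ?, ?, ?, ?)", [
    (1, "Electronics", 1, 1, 6),
    (2, "Resistors", 1, 2, 3),
    (3, "Capacitors", 1, 4, 5),
    (4, "Mechanical", 2, 1, 2),
])
conn.executemany("INSERT INTO part (name, category_id) VALUES (?, ?)", [
    ("R 10k", 2), ("R 1k", 2), ("C 100nF", 3), ("Dev board", 1), ("M3 screw", 4),
])

# One query: the correlated subquery counts parts in the category itself
# and in every descendant (same tree, lft/rght inside the parent's range).
rows = conn.execute("""
SELECT c.name,
       (SELECT COUNT(*)
          FROM part p
          JOIN category sub ON p.category_id = sub.id
         WHERE sub.tree_id = c.tree_id
           AND sub.lft >= c.lft
           AND sub.rght <= c.rght) AS parts
  FROM category c
 ORDER BY c.id
""").fetchall()

part_counts = dict(rows)
print(part_counts)
```

The count for each row comes back with the row itself, so the number of queries no longer depends on the number of categories returned.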

Pathstring

The pathstring field is calculated "on the fly" for each part category. One DB query is required for each item in the results. For thousands of results, this means thousands of queries!

This currently presents a more difficult challenge to fix!

Options to pursue here:

Caching

We could use the "cache" framework to save and load pathstring values, so we can load from memory instead of the database. This works (and has been tested), reducing the number of database queries from O(n) to O(1).

However, the default caching framework can only hold a limited number of keys before older entries are dropped. This solution works up to a point, but is always going to be limited by the maximum number of cache entries allowed by the system.
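A rough sketch of why the entry limit matters. The `BoundedCache` class below is a hypothetical LRU-style stand-in, not Django's cache backend (whose eviction rules differ), and the limit of 300 entries is just an illustrative number. Each cache miss stands in for one database query.

```python
from collections import OrderedDict


class BoundedCache:
    """Tiny LRU-style cache with a hard entry limit (illustrative only)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._data = OrderedDict()

    def get(self, key):
        if key in self._data:
            self._data.move_to_end(key)  # mark as recently used
            return self._data[key]
        return None

    def set(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        # Drop the oldest entries once the limit is exceeded.
        while len(self._data) > self.max_entries:
            self._data.popitem(last=False)


def count_db_lookups(num_categories, max_entries, passes=2):
    """Count cache misses when listing all categories `passes` times."""
    cache = BoundedCache(max_entries)
    lookups = 0
    for _ in range(passes):
        for pk in range(num_categories):
            key = f"pathstring:{pk}"
            if cache.get(key) is None:
                lookups += 1  # stands in for one database query
                cache.set(key, f"path/for/{pk}")
    return lookups


fits = count_db_lookups(num_categories=200, max_entries=300)
overflow = count_db_lookups(num_categories=1000, max_entries=300)
print(fits, overflow)
```

When all keys fit, only the first pass touches the database. Once the number of categories exceeds the limit, entries are evicted before they are read again and every lookup falls back to the database.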

Annotation

We could (maybe) work out a queryset annotation to fetch the required data from the database and reduce the number of DB hits to a constant number.

However, this is a harder problem than any of our existing queryset annotations. It might not be possible, as we need to return multiple results per row, which is not generally possible with Django's annotation framework.

Pre-Calculate and Store

We could pre-calculate the pathstring field and store it in the database. Whenever the model is saved, we re-compute and save the "pathstring" again. As these values don't change very often, the extra cost on save should be acceptable.

However we would be limited by the available number of characters in the char field. This might be OK if we decide on a mechanism to handle "very long" path strings, maybe something like:

/some/path/which/is/super/long/....../middle/is/cut/out/and/replaced/with-dots/
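A minimal sketch of what such a mechanism could look like. The function name `build_pathstring`, the `...` filler, and the way the cut is split between head and tail are all assumptions for illustration; the 250-character default is a placeholder for whatever limit the char field ends up with.

```python
def build_pathstring(names, max_length=250, filler="..."):
    """Join category names into a path, cutting out the middle if too long.

    `names` is the list of names from the root down to the category itself.
    """
    path = "/".join(names)
    if len(path) <= max_length:
        return path

    # Keep the start and the end of the path, replace the middle with dots.
    keep = max_length - len(filler)
    head = (keep + 1) // 2
    tail = keep - head
    ending = path[-tail:] if tail > 0 else ""
    return path[:head] + filler + ending


short_path = build_pathstring(["Electronics", "Resistors"])
long_path = build_pathstring([f"segment{i}" for i in range(100)])
print(short_path)
print(len(long_path))
```

The value would be recomputed in the model's save path and written to the database, so list queries can read it like any other column.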

matmair

Looks good for now. Prerendering the path might be the best solution.

SchrodingersGat added 2 commits July 31, 2022 22:44

Allow part category table to be ordered by part count

ef724a9

Add queryset annotation for part-category part-count

947f4c4

- Uses subquery to annotate the part-count for sub-categories
- Huge reduction in number of queries

SchrodingersGat added the bug (Identifies a bug which needs to be addressed) and api (Relates to the API) labels Jul 31, 2022

SchrodingersGat added this to the 0.9.0 milestone Jul 31, 2022

SchrodingersGat requested a review from matmair July 31, 2022 13:37

matmair approved these changes Jul 31, 2022

SchrodingersGat added 5 commits August 1, 2022 09:58

Update 'pathstring' property of PartCategory and StockLocation

5658915

- No longer a dynamically calculated value
- Constructed when the model is saved, and then written to the database
- Limited to 250 characters

Data migration to re-construct pathstring for PartCategory objects

e1a2b7a

Fix for tree model save() method

0ce5538

Add unit tests for pathstring construction

e754204

Data migration for StockLocation pathstring values

71f99d4

SchrodingersGat mentioned this pull request Aug 1, 2022

Update API to match server changes inventree/inventree-app#196

Merged

SchrodingersGat added 6 commits August 1, 2022 11:18

Update part API

0b99207

- Add new annotation to PartLocationDetail view

Update API version

673938f

Apply similar annotation to StockLocation API endpoints

110cbc4

Extra tests for PartCategory API

d88fd8a

Unit test fixes

fa4e460

Allow PartCategory and StockLocation lists to be sorted by 'pathstring'

6ae5bd7

SchrodingersGat modified the milestones: 0.9.0, 0.8.0 Aug 1, 2022

Further unit test fixes

b7c937a

SchrodingersGat merged commit 175d955 into inventree:master Aug 1, 2022

SchrodingersGat deleted the category-query branch August 1, 2022 03:43

luwol03 mentioned this pull request Aug 1, 2022

[BUG] Treegrid is loading an eternity for huge amounts of data #3438

Closed
