Replies: 1 comment
---
Thanks @porscheme

1. As you can see from the profile/explain output, the query starts with a seek.
2. Due to the query/storage separation design, it is costly to fetch large amounts of data from storage into the query engine when some of the filters/limits cannot be pushed down to the storage side.
   2.1 On the NebulaGraph side, introducing more optimization rules and storage pushdown operators would help (progress: #2533). Here I can see that Filter/Limit is really costly, so there may be some room for optimization.
   2.2 On the query-composing side, adding more information to reduce the data being traversed would help:
      2.2.1 …
      2.2.2 Another approach could be to limit the traversal in the middle rather than only in the final phase:
         i. It could be something like the sketch after this list; if you check its plan, the limit is applied in the first part of the traversal.
         ii. Or, going even further, we could use …
   2.3 On the super-node perspective: when a few vertices may be connected to tons of vertices, and the queries only target sampled (limit/topN) data instead of fetching all of it, or we want to truncate the data for those super nodes, a configuration in storaged can help (see the config sketch after this list).

@Shylock-Hg, do you see any optimization rule that could help here in 2.1, please?
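The inline example referenced in 2.2.2.i did not survive extraction; below is a minimal sketch of the mid-traversal limit idea in nGQL. The `player`/`follow`/`serve`/`team` schema is an assumption borrowed for illustration, not the poster's actual data model, and `MATCH` on a tag assumes a tag index exists.

```ngql
// Hypothetical schema for illustration: (player)-[:follow]->(player)-[:serve]->(team).
// Rather than limiting only in the final RETURN, cap the intermediate
// result with WITH ... LIMIT so the second hop expands from at most
// 100 rows instead of the full first-hop result set.
// Assumes a tag index on `player` so MATCH can locate start vertices.
MATCH (v:player)-[:follow]->(v2:player)
WITH DISTINCT v2 LIMIT 100
MATCH (v2)-[:serve]->(v3:team)
RETURN v3 LIMIT 10;
```

Running this with `PROFILE`/`EXPLAIN` should show the limit applied in the first part of the traversal rather than only at the end.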
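For 2.3, the specific configuration name was cut off in the original text. As an assumption to verify: the NebulaGraph 2.x storaged documentation lists `max_edge_returned_per_vertex` (truncate the edges returned per vertex) and `enable_reservoir_sampling` (sample which edges are kept) for dealing with super nodes; check whether your version (v3.1.0) still ships these flags before relying on them.

```ini
# nebula-storaged.conf -- super-node truncation (flag names taken from the
# NebulaGraph 2.x storaged docs; availability on v3.1.0 is an assumption to verify)

# Return at most this many edges for any single vertex:
--max_edge_returned_per_vertex=10000

# Sample the kept edges with reservoir sampling instead of taking the first N:
--enable_reservoir_sampling=true
```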
---
Nebula version: v3.1.0
- graphd: 1 (128 GB RAM, 2 TB SSD)
- metad: 1 (128 GB RAM, 2 TB SSD)
- storaged: 3 (128 GB RAM, 2 TB SSD)
The query below took about 20 minutes:

Below is its profile:

I changed my query as shown below; there was a little improvement, but not enough:

Below is the profile of the changed query:
@wey-gu