Do you guys have any clever thoughts on how to represent hierarchical relationships in Qdrant?
I'm trying to make something like an outline or tree where nodes can be nested inside each other.
There are a few options -
1. Separate flat mapping collection:
Maintain an explicit `parents` collection, where the parent `P` of node `X` is stored in Qdrant with `{id=X, vector=[], payload=P}`. With this approach, I can find all the children of a node by scrolling on `/collections/parents` with a field match for `P`, or find the parent of node X just by looking up `id=X`. But this would require many subqueries to descend a tree.
2. Regular `parent` payload field in the node's collection:
Store a single parent field in each node's record in the collection, or store a children field of some kind, like a space-separated list. I think this would suffer from the query problem as well.
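
To make the "many subqueries" cost of options 1 and 2 concrete, here is a minimal sketch that uses a plain Python list as a stand-in for the collection. Nothing here calls Qdrant: `POINTS`, `scroll_by_parent` and `descendants` are made-up names, and `scroll_by_parent` only imitates what a filtered scroll on the `parent` field would return. It also assumes the filter can match any of several parent ids at once, so the cost is one query per tree level rather than one per node.

```python
from collections import namedtuple

# Hypothetical stand-in for a collection: each point has an id and a payload.
Point = namedtuple("Point", ["id", "payload"])

POINTS = [
    Point(1, {"name": "root", "parent": None}),
    Point(2, {"name": "a", "parent": 1}),
    Point(3, {"name": "b", "parent": 1}),
    Point(4, {"name": "a1", "parent": 2}),
    Point(5, {"name": "a1x", "parent": 4}),
]


def scroll_by_parent(parent_ids):
    """Imitates ONE filtered scroll: points whose `parent` is in parent_ids."""
    return [p for p in POINTS if p.payload["parent"] in parent_ids]


def descendants(root_id):
    """Walk the tree level by level.

    Returns (descendant ids, number of simulated queries issued).
    """
    found, frontier, queries = [], {root_id}, 0
    while frontier:
        level = scroll_by_parent(frontier)
        queries += 1
        found.extend(p.id for p in level)
        frontier = {p.id for p in level}
    return found, queries


ids, n_queries = descendants(1)
print(ids, n_queries)
```

The query count grows with the depth of the subtree (plus one final query that comes back empty), which is the part that makes a single `parent` field awkward for deep outlines.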
3. Multi-value payload fields for `parent` or `children`
I have yet to use the multi-value fields effectively. I could potentially put the entire "parent path" of a node as multiple field values, and this would let me query everything at once - by just finding anything with `{parent = P}` - even if each item has many parents.
4. Vector embedding as a learned representation of graph
This is some really interesting research:
https://dawn.cs.stanford.edu/2018/03/19/hyperbolics/
In it, the authors find a way to represent hierarchical relationships inside the embedding space itself. But this technique doesn't seem to extrapolate beyond the set of IDs it was trained with (would love to be wrong here), so probably not useful here.
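
Coming back to option 3 for a moment, here is a rough sketch of the "parent path as multiple field values" idea, again with a plain Python dict standing in for the collection. The field name `ancestors` and the helpers `add_node` and `subtree_ids` are made up for illustration. It assumes that a match condition on an array-valued payload field succeeds when any element matches, which is my understanding of how Qdrant treats arrays, but worth double-checking.

```python
def add_node(points, node_id, name, parent_id=None):
    """Store a node whose `ancestors` list is the parent's list plus the parent."""
    if parent_id is None:
        ancestors = []
    else:
        ancestors = points[parent_id]["ancestors"] + [parent_id]
    points[node_id] = {"name": name, "ancestors": ancestors}


def subtree_ids(points, ancestor_id):
    """Imitates ONE filtered query: points whose `ancestors` contains ancestor_id."""
    return sorted(
        pid for pid, payload in points.items()
        if ancestor_id in payload["ancestors"]
    )


points = {}
add_node(points, 1, "root")
add_node(points, 2, "a", parent_id=1)
add_node(points, 3, "b", parent_id=1)
add_node(points, 4, "a1", parent_id=2)
add_node(points, 5, "a1x", parent_id=4)

print(subtree_ids(points, 2))  # every descendant of node 2, in a single pass
```

The trade-off is on the write side: moving a node to a new parent means rewriting the `ancestors` list of every node underneath it.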
5. Vector embedding itself as a "path vector" of some kind
I think I could use each float in the point's vector as a key, and then do distance search, if I think about it a bit. Kinda like a float version of the SQL trick of embedding a "path string" like `0002.0057.0441` in a varchar and then using LIKE to find all children.
For example, you might represent a tree of vehicle makes and models like so:
But you'd need to pick the right "encoding" that will work with the chosen distance function, of course.
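
One hypothetical encoding that would work with Euclidean distance, purely as a sketch: give each level of the path one vector dimension and scale it by a weight that shrinks with depth, so that a difference at a higher level always outweighs every possible difference below it. The makes and models, `MAX_DEPTH`, `BASE` and all helper names below are assumptions for illustration. The distance check is done in plain Python rather than through Qdrant, where it would presumably correspond to a search with a distance threshold.

```python
import math

MAX_DEPTH = 3  # assumed maximum depth of the tree
BASE = 10      # assumed: per-level keys are 1..BASE-1, and 0 means "no node here"

# Level weights, largest first: [100, 10, 1] for the values above.
WEIGHTS = [BASE ** (MAX_DEPTH - 1 - i) for i in range(MAX_DEPTH)]


def encode(path):
    """Turn a path of per-level keys, e.g. (2, 1), into a weighted vector."""
    padded = list(path) + [0] * (MAX_DEPTH - len(path))
    return [float(key * w) for key, w in zip(padded, WEIGHTS)]


# Hypothetical tree of makes and models, keyed by path.
TREE = {
    "Toyota": (1,),
    "Toyota/Corolla": (1, 1),
    "Toyota/Camry": (1, 2),
    "Honda": (2,),
    "Honda/Civic": (2, 1),
    "Honda/Civic/Type R": (2, 1, 1),
}


def subtree(name):
    """Return the node and all its descendants via a Euclidean radius check.

    Descendants differ from the node only at deeper levels, so they lie
    strictly within the node's own level weight. Any other node differs at
    the node's level or above, so it lies at least that far away.
    """
    path = TREE[name]
    query = encode(path)
    radius = WEIGHTS[len(path) - 1]
    return sorted(
        other for other, other_path in TREE.items()
        if math.dist(query, encode(other_path)) < radius
    )


print(subtree("Toyota"))
print(subtree("Honda/Civic"))
```

This only holds while every key stays below `BASE`. Vectors are typically stored as 32-bit floats, which represent integers exactly only up to 2^24, so deep or wide trees would run out of precision fairly quickly.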
Open to any ideas and thoughts.
Thanks again to Qdrant team for this great package.