Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simple instructions for a self referential table #127

Open
8 tasks done
michaelmcandrew opened this issue Oct 11, 2021 · 12 comments
Open
8 tasks done

Simple instructions for a self referential table #127

michaelmcandrew opened this issue Oct 11, 2021 · 12 comments
Labels
question Further information is requested

Comments

@michaelmcandrew
Copy link

First Check

  • I added a very descriptive title to this issue.
  • I used the GitHub search to find a similar issue and didn't find it.
  • I searched the SQLModel documentation, with the integrated search.
  • I already searched in Google "How to X in SQLModel" and didn't find any information.
  • I already read and followed all the tutorial in the docs and didn't find an answer.
  • I already checked if it is not related to SQLModel but to Pydantic.
  • I already checked if it is not related to SQLModel but to SQLAlchemy.

Commit to Help

  • I commit to help with one of those options 👆

Example Code

class Node(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    text: str
    parent_id: Optional[int] = Field(foreign_key="node.id")
    # parent: Optional["Node"] not sure what to put here
    # children: List[Node] not sure what to put here either :)

Description

I am trying to create a simple self referential model - using the SQL model equivalent of the adjacency list pattern described here: https://docs.sqlalchemy.org/en/14/orm/self_referential.html

I am only a litte familiar with SQL alchemy but was unable to translate their example into one that would work with SQLmodel.

In your docs you said: "Based on SQLAlchemy [...] SQLModel is designed to satisfy the most common use cases and to be as simple and convenient as possible for those cases, providing the best developer experience". I was assuming that a self referential model would be a fairly common use case but totally appreciate that I could be wrong on this :)

I see that there is an sa_relationship param that you can pass 'complicated stuff' too but I was not sure whether I should be using that (or how I would do so if I was meant to) - sorry just a bit too new to this.

Crossing my fingers that it is straight forward to complete the commented lines in my example.

Operating System

Linux

Operating System Details

No response

SQLModel Version

0.0.4

Python Version

3.9.7

Additional Context

No response

@michaelmcandrew michaelmcandrew added the question Further information is requested label Oct 11, 2021
@michaelmcandrew
Copy link
Author

Did some more experimentation

With a model defined like so

class Node(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    text: str
    parent_id: Optional[int] = Field(foreign_key="node.id")
    children: List["Node"] = Relationship(back_populates="parent")
    parent: Optional["Node"] = Relationship(back_populates="children")

running session.exec(select(Node)).all() throws the following error;

sqlalchemy.exc.ArgumentError: Context.children and back-reference Context.parent are both of the same direction symbol('ONETOMANY'). Did you mean to set remote_side on the many-to-one side ?

Will carry on investigating...

@giubaru
Copy link

giubaru commented Oct 31, 2021

@michaelmcandrew did you try using sa_relationship_kwargs?

This is how I'm using it:

class AccountBase(SQLModel):
  total: float = 0.0
  name: str
 
class Account(AccountBase, table=True):
  __tablename__ = "accounts"

  id: Optional[int] = Field(default=None, primary_key=True)
  parent_id: Optional[int] = Field(default=None, foreign_key="accounts.id")
  children: List["Account"] = Relationship(
    sa_relationship_kwargs=dict(
      cascade="all",
      backref=backref("parent", remote_side="Account.id"),
    )  
  )

  def append(self, child: "Account"):
    self.children.append(child)

def main():
   with Session(engine) as session:
       account_1 = Account(total=200, name="Parent")
       account_2 = Account(total=300, name="Child")
       account_3 = Account(total=400, name="Grandson")

       account_2.append(account_3)
       account_1.append(account_2)

My references:

@michaelmcandrew
Copy link
Author

Thanks @giubaru - I suspected that sa_relationship_kwargs might be part of the solution but had not got round to testing it out. I will take a look and report back.

@michaelmcandrew
Copy link
Author

This is working for me @giubaru - thanks!

@bugchecker
Copy link

bugchecker commented Feb 23, 2022

Cannot use parent field

from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import backref, relationship
from sqlalchemy.ext.declarative import declarative_base
from sqlmodel import Field, Relationship, SQLModel
from typing import List, Optional

Base = declarative_base()


class NodeSa(Base):
    __tablename__ = 'node_sa'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('node_sa.id'))
    children = relationship(
        'NodeSa',
        backref=backref('parent', remote_side=[id]),
    )
print(NodeSa(parent=NodeSa()).parent)  # <__main__.NodeSa object at 0x7fde515462c0>


class NodeSm(SQLModel, table=True):
    id: int = Field(primary_key=True)
    parent_id: Optional[int] = Field(default=None, foreign_key='node_sm.id')
    children: List['NodeSm'] = Relationship(
        sa_relationship_kwargs=dict(
            backref=backref('parent', remote_side='NodeSm.id'),
        ),
    )
try:
    NodeSm(parent=NodeSm()).parent
except Exception as e:
    print(type(e))  # <class 'sqlalchemy.exc.NoForeignKeysError'>

@Cielquan
Copy link

Cielquan commented Apr 1, 2022

Cannot use parent field

from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import backref, relationship
from sqlalchemy.ext.declarative import declarative_base
from sqlmodel import Field, Relationship, SQLModel
from typing import List, Optional

Base = declarative_base()


class NodeSa(Base):
    __tablename__ = 'node_sa'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('node_sa.id'))
    children = relationship(
        'NodeSa',
        backref=backref('parent', remote_side=[id]),
    )
print(NodeSa(parent=NodeSa()).parent)  # <__main__.NodeSa object at 0x7fde515462c0>


class NodeSm(SQLModel, table=True):
    id: int = Field(primary_key=True)
    parent_id: Optional[int] = Field(default=None, foreign_key='node_sm.id')
    children: List['NodeSm'] = Relationship(
        sa_relationship_kwargs=dict(
            backref=backref('parent', remote_side='NodeSm.id'),
        ),
    )
try:
    NodeSm(parent=NodeSm()).parent
except Exception as e:
    print(type(e))  # <class 'sqlalchemy.exc.NoForeignKeysError'>

Have my own share of issues I am currently working on with this self referential stuff and came across your example.

The error is correct because the node_sm table does not exist at all.

You can fix the error by either:

removing the underscore:

- parent_id: Optional[int] = Field(default=None, foreign_key='node_sm.id')
+ parent_id: Optional[int] = Field(default=None, foreign_key='nodesm.id')

or setting the table name manually:

class NodeSm(SQLModel, table=True):
+    __tablename__ = "node_sm"
    id: int = Field(primary_key=True)
    ...

BUT then you directly get the next issue: None

When you try to print the NodeSm.parent like so:

try:
-    NodeSm(parent=NodeSm()).parent
+    print(NodeSm(parent=NodeSm()).parent)
except Exception as e:
    print(type(e))  # <class 'sqlalchemy.exc.NoForeignKeysError'>

You simply get None

EDIT

But you can work the other way around.

For readability add a new field:

class NodeSm(SQLModel, table=True):
    id: int = Field(primary_key=True)
+    name: str
    ...

Then change the way to connect them:

try:
-    print(NodeSm(name="child", parent=NodeSm()).parent)
+    child = NodeSm(name="child")
+    parent = NodeSm(name="parent")
+    parent.children.append(child)

+    print(f"{child.parent=}")
+    print(f"{parent.children=}")
except Exception as e:
    print(type(e))

The output is:

child.parent=NodeSm(name='parent', parent_id=None, children=[NodeSm(name='child', parent_id=None, parent=NodeSm(name='parent', parent_id=None, children=[...]))])
parent.children=[NodeSm(name='child', parent_id=None, parent=NodeSm(name='parent', parent_id=None, children=[...]))]

Root Cause

I also have a guess as to why you can add children, but not a parent. I think pydantic is the culprit here. Reason for this is simple:

On the NodeSm class we create an attribute called children but we never created an attribute called parent. Therefore pydantic just discards this information in the init of a new NodeSm instance. Sure the parent field exists but it is dynamically added by sqlalchemy and therefore it is unknown to pydantic.

In the line with print(f"{child.parent=}") you also get a mypy error which supports my assumption:
"NodeSm" has no attribute "parent"; maybe "parent_id"? [attr-defined]mypy(error)

EDIT 2

I did some further research and found the option to allow extra attributes for pydantic models in the pydantic docs.

The default is ignore -> no Exception and when you set it so forbid is throws one, but in this case the exception is somewhere catched and does not surface. BUT when you set it to allow it works:

...
+ from pydantic import Extra
...
-class NodeSm(SQLModel, table=True):
+class NodeSm(SQLModel, table=True, extra=Extra.allow):
    id: int = Field(primary_key=True)
    ...

Unfortunately this is a solution using a hammer, as we allow all possible extra attributes, even those who does not exist. I also saw that you can alter the JSON schema of a model in the pydantic docs, which could be the safest solution.

EDIT 3

The schema manipulation does not work. It changed nothing.

class Node(SQLModel, table=True):
    id: int = Field(primary_key=True)
    parent_id: Optional[int] = Field(default=None, foreign_key='node.id')
    children: List['Node'] = Relationship(
        sa_relationship_kwargs=dict(
            backref=backref('parent', remote_side='Node.id'),
        ),
    )

    class Config:
        @staticmethod
        def schema_extra(schema: Dict[str, Any], model: Type['Node']) -> None:
            schema['properties']['parent'] = {'title': 'Parent', '$ref': '#/definitions/Node'}

TL;DR

  1. The table name in the parent_id's foreign_key declaration is wrong. Set __tablename__ or remove the underscore. (See first part of this post)
  2. Allow extra fields for pydantic: extra=Extra.allow (See EDIT 2)

@johnziebro
Copy link

johnziebro commented May 12, 2022

So far, this seems to be the best question/answer on self-referential tables for SQLModel. With all the versions and edits it seems unclear as to what the final "simple" working solution was. Would it be possible to post a definitive example if this issue was resolved?

@Cielquan
Copy link

Well unfortunately this is only a workaround and not a real solution, because the typing information is missing. I think for a real solution there needs to be first party support for self-referential tables coming from SQLModel itself.

The project at work where this issue occurs got postponed and I will probably work on it again in the next 4 to 6 weeks, but no promises.

The TL;DR above is this workaround but to summarize, this should (I did not test it) give you a working self-referential table.

from typing import List, Optional

from pydantic import Extra
from sqlalchemy.orm import backref
from sqlmodel import Field, Relationship, SQLModel


class DBTable(SQLModel, table=True, extra=Extra.allow):

    __tablename__ = "db_table"

    id: int = Field(primary_key=True)
    parent_id: Optional[int] = Field(default=None, foreign_key="db_table.id")

    children: List["DBTable"] = Relationship(
        sa_relationship_kwargs=dict(
            backref=backref("parent", remote_side="DBTable.id"),
        ),
    )

On the DBTable class you 4 attributes:

  • id is the primary key
  • parent_id is the self-referential foreign key matching the DBTable.id of the parent
  • children is a list of DBTables which have this instance's id as parent_id
  • parent is the DBTable with the id specified as parent_id in this instance

But only the first two are actual columns in the database.

The issue with this workaround is, that the DBTable.parent attribute is dynamically created and thus typing information is missing. Therefore type checkers will complain if you use it and you cannot autocomplete it either. But because extra=Extra.allow is set pydantic won't remove the parent attribute, so it at least works.

I hope this helps.

@daniil-berg
Copy link
Contributor

I created an answer for this with more detailed explanations and tests here, but this is the short version:

from typing import Optional
from sqlmodel import Field, Relationship, SQLModel


class Node(SQLModel, table=True):
    __tablename__ = 'node'  # just to be explicit

    id: Optional[int] = Field(default=None, primary_key=True)
    data: str
    parent_id: Optional[int] = Field(
        foreign_key='node.id',  # notice the lowercase "n" to refer to the database table name
        default=None,
        nullable=True
    )
    parent: Optional['Node'] = Relationship(
        back_populates='children',
        sa_relationship_kwargs=dict(
            remote_side='Node.id'  # notice the uppercase "N" to refer to this table class
        )
    )
    children: list['Node'] = Relationship(back_populates='parent')
    # with Python `<3.9` use `typing.List` instead of `list`

Instead of using the legacy backref parameter, the desired model attributes parent and children are declared explicitly. This also allows type checking to work as expected. And no need for extra=Extra.allow.

Check the Stackoverflow post I linked above for some illustrations/tests.

I agree that this would be nice to have in the official docs. Maybe I will create a PR for this.

@xavipolo
Copy link

xavipolo commented Sep 3, 2022

I created an answer for this with more detailed explanations and tests here, but this is the short version:

Thank you very much, I've lost a couple of hours testing with ForwardRef, Self, etc. without success :-(

@philmade
Copy link

philmade commented Nov 30, 2022

Any success with these?

I am trying to create a join table to link a model (Paper) back to other Paper objects to make a "similar papers" function.

class PaperSimilarLink(SQLModel, table=True):
    paper_id: Optional[int] = Field(
        default=None, foreign_key="paper.id", primary_key=True
    )
    similar_id: Optional[int] = Field(
        default=None, foreign_key="paper.id", primary_key=True
    )

class Paper(SQLModel, HasRelatedObjects, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    similar: List["Paper"] = Relationship(
        back_populates="similar",
        link_model=PaperSimilarLink,
        sa_relationship=dict(
            primary_join=PaperSimilarLink.paper_id,
            secondary_join=PaperSimilarLink.similar_id,
        ),
    )


without the primary_join and secondary_join it creates an error:

sqlalchemy.exc.AmbiguousForeignKeysError: Could not determine join condition between parent/child tables on relationship PaperSection.similar - there are multiple foreign key paths linking the tables via secondary table 'papersectionsimilarlink'. Specify the 'foreign_keys' argument, providing a list of those columns which should be counted as containing a foreign key reference from the secondary table to each of the parent and child tables.

After adding it, it runs, but I get an error trying to access Paper.similar:

AttributeError: 'Paper' object has no attribute 'similar'

@meirdev
Copy link

meirdev commented Nov 30, 2022

Fix to your Paper model:

similar: List["Paper"] = Relationship(
        link_model=PaperSimilarLink,
        sa_relationship_kwargs=dict(
            primaryjoin="Paper.id == PaperSimilarLink.paper_id",
            secondaryjoin="Paper.id == PaperSimilarLink.similar_id",
        ),
    )

Working code:

from sqlmodel import SQLModel, Field, create_engine, Session, select, Relationship
from typing import Optional, List


class PaperSimilarLink(SQLModel, table=True):
    paper_id: Optional[int] = Field(
        default=None, foreign_key="paper.id", primary_key=True
    )
    similar_id: Optional[int] = Field(
        default=None, foreign_key="paper.id", primary_key=True
    )


class Paper(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    similar: List["Paper"] = Relationship(
        link_model=PaperSimilarLink,
        sa_relationship_kwargs=dict(
            primaryjoin="Paper.id == PaperSimilarLink.paper_id",
            secondaryjoin="Paper.id == PaperSimilarLink.similar_id",
        ),
    )


engine = create_engine("sqlite:///")

with Session(engine) as session:
    SQLModel.metadata.create_all(engine)

    p1 = Paper(id=1)
    p2 = Paper(id=2)
    p3 = Paper(id=3)
    session.add_all([p1, p2, p3])
    session.commit()

    session.refresh(p1)
    session.refresh(p2)
    session.refresh(p3)

    p1.similar = [p2, p3]
    session.add(p1)
    session.commit()

    statement = select(Paper)
    paper = session.exec(statement).unique().all()

    for i in paper:
        print(i.id, i.similar)

    """
    1 [Paper(id=2), Paper(id=3)]
    2 []
    3 []
    """

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

9 participants