Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: Use lazy rather than joined eager loading #1959

Closed
wants to merge 1 commit into from

Conversation

john-bodley
Copy link
Contributor

@john-bodley john-bodley commented Dec 10, 2022

Description

#1361 introduced Joined Eager Loading for joining one-to-many or many-to-many model relationships, however per the SQLAlchemy example this results in a slew of LEFT OUTER JOINs which will rapidly result into a expansive interim result set—from a multiplicative combinatorial sense—when a model has multiple high cardinality relationships. Furthermore the result set—which is repetitive in nature—requires further overhead by SQLAlchemy to de-duplicate the rows against the actual model record.

Given that the FAB response is an addition of relationships rather than a multiplication of relationships it seems like Lazy Loading is more desirable. Granted this results in one query per relationship, but the interim result sets are typically much smaller non-repetitive in nature (especially for high cardinality relationships) and thus more performant.

For example in Superset at Airbnb we have a virtual dataset which is comprised of a thousands of metrics and columns. Previously a query to obtain the dataset with the associated metric and column IDs would time out (the join would result in an interim result set of 100M+ rows) whereas with lazy loading the query takes << 5 seconds.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Is CRUD MVC related.
  • Is Auth, RBAC security related.
  • Changes the security db schema.
  • Introduces new feature
  • Removes existing feature

@dpgaspar
Copy link
Owner

@john-bodley thanks!

I'll check possible implications on other endpoints, maybe we can make this has an option set at the REST model view or model itself

@dpgaspar dpgaspar self-requested a review December 12, 2022 17:05
@john-bodley
Copy link
Contributor Author

john-bodley commented Dec 13, 2022

@dpgaspar I wonder rather than having FAB explicitly define the relationship loading technique it should defer to the loading technique defined by the model. This should ensure the loading is optimized based on the explicit implementation/use case.

This can be achieved via the default loader and having the models specify—the via the lazy parameter of the ORM relationship (which defaults to select)—the relationship loader, i.e., in the case of Superset's SqlaTable model this would be:

class SqlaTable(Model):
    columns = relationship(
        "TableColumn",
        lazy="selectin",
    )

If you're in agreement with this approach I'll close this PR in favor of new PR which uses default rather than joined (eager) loading.

@john-bodley john-bodley changed the title chore: Use lazy rather than joined (eager) loading chore: Use Lazy rather than Joined Eager Loading Dec 14, 2022
@john-bodley john-bodley changed the title chore: Use Lazy rather than Joined Eager Loading chore: Use lazy rather than joined eager loading Dec 14, 2022
@john-bodley
Copy link
Contributor Author

Closing in favor of #1961.

@john-bodley john-bodley deleted the patch-1 branch December 14, 2022 02:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants