Deferred computed fields? #40
@jerch a similar use case that may also be addressed by deferred computed fields is non-persisted computed fields. That may sound weird, but imagine we want to index some models and the indexed value depends on related models. Thanks to the dependency-resolving capabilities of DCF we can get notified when a field on a related model changes that impacts indexing, and reindex that model object, but we don't necessarily need to persist the computed field values in the DB for that.
@mobiware This would be like a trigger executing stuff on related data changes. Indeed an interesting idea.
AFAIU this is not yet implemented, is it? I need such a feature in my project, because of issues for us with running updates on […]. How I see the interaction with the library:
(3) and (4) probably should be a single step; it is hard for me to imagine a case where we would need to pre-lock records, but I keep it as a possibility here. We have lots and lots of logic in Django signal handlers and are considering this as a good alternative. Are there any plans to implement more control over the execution and change detection? Would you be able to review and accept PRs? cc: @jerch
Yepp, nothing implemented in this regard. It is currently on the backlog, as it seemed of low interest. From your problem description I am not quite sure if this will help you at all. But maybe it is just me; could you give a short yet more explanatory example of your task structures and where async offloading comes into play?
@jerch sorry for taking so long to answer. My case is probably not related to the OP's request, since my case is synchronous. If you are interested, this is how we adopted this library into our codebase:
```python
# proj/custom_computedfields_app.py
import sys

from computedfields.resolver import BOOT_RESOLVER
from computedfields.settings import settings
from django.apps import AppConfig
from django.db.models.signals import class_prepared


class ComputedfieldsConfig(AppConfig):
    name = 'computedfields'

    def __init__(self, *args, **kwargs):
        super(ComputedfieldsConfig, self).__init__(*args, **kwargs)
        class_prepared.connect(BOOT_RESOLVER.add_model)
        self.settings = settings

    def ready(self):
        # disconnect model discovery to avoid resolver issues with models created later at runtime
        class_prepared.disconnect(BOOT_RESOLVER.add_model)

        # do not run graph reduction in migrations and own commands,
        # which deal with it in their own specific way
        for token in ('makemigrations', 'migrate', 'help', 'rendergraph', 'createmap'):
            if token in sys.argv:  # pragma: no cover
                BOOT_RESOLVER.initialize(True)
                return

        # normal startup
        BOOT_RESOLVER.initialize()

        # connect signals
        from computedfields.handlers import (
            get_old_handler,
            m2m_handler,
            postdelete_handler,
            postsave_handler,
            predelete_handler,
        )
        from django.db.models.signals import m2m_changed, post_delete, post_save, pre_delete, pre_save

        # need to run those manually
        # pre_save.connect(
        #     get_old_handler, sender=None, weak=False, dispatch_uid='COMP_FIELD_PRESAVE')
        # post_save.connect(
        #     postsave_handler, sender=None, weak=False, dispatch_uid='COMP_FIELD')
        pre_delete.connect(
            predelete_handler, sender=None, weak=False, dispatch_uid='COMP_FIELD_PREDELETE')
        post_delete.connect(
            postdelete_handler, sender=None, weak=False, dispatch_uid='COMP_FIELD_POSTDELETE')
        m2m_changed.connect(
            m2m_handler, sender=None, weak=False, dispatch_uid='COMP_FIELD_M2M')
```
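Such a custom config would replace the stock app entry in the settings, roughly like this (a sketch; the module path follows the file comment above):

```python
# settings.py
INSTALLED_APPS = [
    # ...
    # point at the custom config instead of plain 'computedfields'
    'proj.custom_computedfields_app.ComputedfieldsConfig',
]
```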
```python
import contextlib
from collections import defaultdict

from computedfields.models import update_dependent
from django.db.transaction import Atomic, get_connection, on_commit


def update_computedfields():
    # TODO: deletions
    # NOTE: `changes` is our thread-local that collects all changes during the
    # transaction, to batch-process side effects and notifications at the end
    by_model_class = defaultdict(dict)
    for model in changes.changes:
        by_model_class[type(model)][model.pk] = model
    for model_cls, pks in by_model_class.items():
        queryset = model_cls.objects.filter(pk__in=pks.keys())
        update_dependent(queryset)


@contextlib.contextmanager
def atomic(using=None, savepoint=True, durable=False):  # NOTE: our custom transaction.atomic
    connection = get_connection(using)
    try:
        add_handler = False
        if not connection.in_atomic_block:
            add_handler = True
        with Atomic(using, savepoint, durable):  # NOTE: original Django atomic block
            if add_handler:
                # NOTE: run our handler at the end of the outermost transaction
                on_commit(update_computedfields, robust=False)
            yield
    finally:
        ...  # thread-local cleanup elided
```
… so we want to collect the list of changed objects and defer the graph update until the end of the transaction, because we have a lot of "broken" intermediate state during the transaction.
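Used in place of the stock transaction.atomic, the intended effect is roughly this (model names are made up for illustration):

```python
with atomic():
    order = Order.objects.get(pk=1)
    order.discount = 10
    order.save()        # only recorded in the thread-local, no graph update yet
    for item in order.items.all():
        item.price *= 2
        item.save()     # likewise deferred
# on commit: update_computedfields() runs once, batching update_dependent()
# per model class over all collected records
```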
@verschmelzen Yupp, looks still pretty synchronous to me. Though I wonder why the normal callstack did not work for your case - was that too computationally heavy? It is pretty clear that the computation will explode for complicated dependencies, esp. when done with sequential single model object changes. There are some ideas to overcome this by temporarily disabling the recalculation and doing batch calcs at the end, but that introduces its own set of issues, esp. around proper change tracking (see #148).
Given that you have a quite expensive drilldown to do for a computed field, with lots of waits (CPU-hungry work, waiting for the DB or some other resource), the current realtime / live-sync approach of django-computedfields is quite limiting.
There are ways to implement deferred handling yourself, e.g. with a sentinel computed field indicating a dirty state:
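A minimal sketch of that pattern, assuming an expensive aggregation over related records (all model and field names below are made up):

```python
from django.db import models
from computedfields.models import ComputedFieldsModel, computed

class Order(ComputedFieldsModel):
    # expensive target value, maintained later by a deferred job
    total = models.DecimalField(max_digits=12, decimal_places=2, default=0)

    # cheap sentinel: recalculated by the resolver whenever a dependency
    # changes, but it only flags the record instead of doing the real work
    @computed(models.BooleanField(default=True),
              depends=[('items', ['price', 'quantity'])])
    def dirty(self):
        return True
```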
Here the computed field just updates the dirty state based on changed dependencies of the long-running calculation, while the computation itself can be deferred to update things in the future:
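The deferred part could then run from a cronjob or task queue worker, e.g. (again just a sketch on the made-up model above):

```python
def recalc_dirty_orders():
    # do the expensive drilldown only for flagged records
    for order in Order.objects.filter(dirty=True):
        total = sum(item.price * item.quantity for item in order.items.all())
        # write via queryset update to bypass the live recalculation,
        # resetting the dirty flag in the same statement
        Order.objects.filter(pk=order.pk).update(total=total, dirty=False)
```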
Wouldn't it be more convenient to have something like that:
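A purely hypothetical interface; nothing of this exists yet, and the `deferred` flag is made up:

```python
class Order(ComputedFieldsModel):
    # hypothetical: `deferred=True` would make the resolver only track a
    # dirty state and postpone the actual recalculation
    @computed(models.DecimalField(max_digits=12, decimal_places=2, default=0),
              depends=[('items', ['price', 'quantity'])], deferred=True)
    def total(self):
        return sum(item.price * item.quantity for item in self.items.all())
```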
that automatically deals with the dirty state handling and postponed updates by itself, or via some scheduled update rules?
Just a rough idea atm; it is also already clear that those fields would have to be handled quite differently internally.
Well, that's quite a lot of diverging behavior, not sure if that should be done in this package at all. On the pro side it would still share the dependency-resolving work with sync computed fields, and open the package to really complicated, nasty tasks in the method code.