Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve instance-level operation hooks #488

Closed
wants to merge 1 commit into from

Conversation

fabien
Copy link
Contributor

@fabien fabien commented Mar 5, 2015

For consistency, all instance-level operations (prototype.updateAttributes and prototype.delete) will now have access to the current instance as part of the hook's context:

  • prototype.save + before save: the instance will be available as instance (unchanged behavior).
  • prototype.save + after save: the instance will be available as instance (unchanged behavior).
  • prototype.updateAttributes + before save: because the instance should not be manipulated directly, the instance will be available as currentInstance to distinguish between the instance that can be directly modified (it is omitted now). In other words, currentInstance provides the current 'state' of the instance, and it should be regarded as immutable. Any modifications should be applied to context.data, which incorporates the partial nature of the updateAttributes operation.
  • prototype.updateAttributes + after save: in after hooks, this difference is not important, and thus, the instance will be simply be available as instance.
  • prototype.remove/destroy/delete + before save: again, the difference is irrelevant, so the instance will be available as instance.
  • prototype.remove/destroy/delete + after save: the instance will be available as instance.

@altsang altsang added the #review label Mar 5, 2015
@fabien
Copy link
Contributor Author

fabien commented Mar 5, 2015

The next step would be some 'canned' before hooks which will do the following:

Model.observe('before delete', fetchInstances); // OR:

Model.observe('before delete', fetchIds);

function fetchInstances(ctx, next) { // queryOnDelete: 'full'
  ctx.Model.find({ where: ctx.where }, function(err, instances) {
    if (err) return next(err);
    ctx.hookState.instances = instances;
    next();
  });
};

function fetchIds(ctx, next) { // queryOnDelete: 'id'
  ctx.Model.find({ where: ctx.where, fields: ['id'] }, function(err, instances) {
    if (err) return next(err);
    ctx.hookState.ids = _.pluck(instances, 'id');
    next();
  });
};

Perhaps we can even allow this syntax: Model.observe('before delete', 'presetHookFnName');

And: Model.registerObserver('presetHookFnName', function(ctx, next) { ... }).

This would be the basis for #474 (comment)

@fabien
Copy link
Contributor Author

fabien commented Mar 5, 2015

@bajtos @raymondfeng @ritch please review

@bajtos
Copy link
Member

bajtos commented Mar 5, 2015

prototype.remove/destroy/delete + before save: again, the difference is irrelevant, so the instance will be available as instance.

I prefer to keep this consistent with "before save" and use currentInstance name.

prototype.updateAttributes + after save: in after hooks, this difference is not important, and thus, the instance will be simply be available as instance.
prototype.remove/destroy/delete + after save: the instance will be available as instance.

At one hand, nobody should be really expecting that changes to ctx.instance made in an "after" hook will be persisted. On the other hand, it introduces inconsistency. Until now, the recommendation was to check for ctx.instance and if it was not there, then the user should expect ctx.where and ctx.data (or ctx.where for "delete" hooks). With the proposal here, this rule no longer apply and one has to use different rules for different hooks.

In the light of the above, I prefer to consistently use ctx.currentInstance in all places where this PR adds this new context property.

@fabien
Copy link
Contributor Author

fabien commented Mar 5, 2015

OK, I can live with that. Any other thoughts? If not, I'll apply the changes requested here.

if (dataSource.connector.findOrCreate) {
observedContexts.should.eql(aTestModelCtx({
where: { id: existingInstance.id },
data: { id: existingInstance.id, name: 'updated name' }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo, you are executing updateOrCreate but testing for findOrCreate.

Atomic findOrCreate does not set ctx.currentInstance - we need to capture this in the documentation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a typo indeed. Sorry. Whenever an optimized (dataSource.connector.findOrCreate) call can be made, the currentInstance will not be available. However, in the normal, unoptimized scenario, updateAttributes is invoked internally, which does set currentInstance. This is why I had to make a distinction here.

@bajtos
Copy link
Member

bajtos commented Mar 5, 2015

See the comment above. Other than the comments above, the implementation looks good.

I'd like to hear from @ritch or @raymondfeng whether the name currentInstance is good enough (whether it makes clear what's it purpose), as they have fresh eyes.

My internet connection is poor in the evenings so I won't be able to continue this discussion until tomorrow.

@fabien
Copy link
Contributor Author

fabien commented Mar 5, 2015

@bajtos hope we can merge first thing in the morning. I won't be able to do much tonight, and I'm afraid I'll need a full day to port my code over ...

@bajtos
Copy link
Member

bajtos commented Mar 5, 2015

Frankly, expecting non-trivial pull requests to get landed the same day is unreasonable, I even dare to say silly. Please expect at least 2-3 working days.

@fabien
Copy link
Contributor Author

fabien commented Mar 5, 2015

@bajtos meanwhile I'll fiddle with my thumbs, OK? I have this hunch that nothing ever is really trivial in your opinion ... thanks anyway 👎

@fabien fabien mentioned this pull request Mar 5, 2015
8 tasks
@ritch
Copy link
Contributor

ritch commented Mar 5, 2015

I'm not worried about the name... my concern is we are adding yet another case you have to handle when using a hook. What would alleviate my concern is if you can use the same assumptions (eg. where vs instance) and not even know currentInstance exists.

Also... Why do we need currentInstance to do these:

Model.observe('before delete', fetchInstances); // OR:
Model.observe('before delete', fetchIds);

or is it only in the case of after save?

@fabien
Copy link
Contributor Author

fabien commented Mar 5, 2015

@ritch I tend to agree with you. Initially I only had currentInstance for the specific updateAttributes case. It's the only situation where it really matters.

Ideally, we would just forget about the currentInstance denominator and intelligently diff the changes to instance when performing updateAttributes. In that case instances could be the actual instance (===) or a temporary copy, used for accepting the changes, which then would be extracted for the partial update.

Finally, for consistency, before delete would also just receive instance, in the prototype.delete case.

If we can agree on such a solution, I'd be happy to work on it in the morning. I really need to know which direction we're heading with this - I'm stuck refactoring a huge code-base to be operation hooks compatible - and this Friday is an important deadline for it. @raymondfeng thoughts?

@raymondfeng
Copy link
Contributor

For the save context, I think there should be the following properties:

  1. Model - the model class (readonly)
  2. data - the data object to be sent to DB (this is the one that hooks should mutate to influence the save for all cases)
  3. instance - ideally it should be the object instance which the calling prototype method is on (readonly)
  4. where - the where clause of the criteria (read/write)
  5. hookState - the object to pass states from one hook to the next
  6. options - the options object (which might be used to propagate transactions later on)

The main concern for 3 is back-compatibility since instance is anti-intuitively used in the current implementation.

@fabien
Copy link
Contributor Author

fabien commented Mar 6, 2015

@raymondfeng if I gather correctly, this is similar to #483 (comment)

If an instance is passed in the save context, it should never be used to manipulate data - it is only there to get the current state (for comparison for example). Any changes would go into the data property of the context. Making it consistent with the batch operations as well.

@fabien
Copy link
Contributor Author

fabien commented Mar 6, 2015

However, working with data directly is rather low-level, because at the instance level there may be a setter, computed properties etc. that might affect other properties as a result of updated values. Therefor, I would suggest to with the solution above: #488 (comment)

When available, use instance directly for any modifications, and in the case of updateAttributes extract the diff to perform the partial update as usual.

Otherwise, accept the limitations of the low-level cx.data access during any kind of batch operation - this is as far as you can go, without having instances available to you - which is still an option if you'd fetch those with ctx.where.

@raymondfeng
Copy link
Contributor

@fabien Do you agree on the following?

  1. For prototype.save, data is the same as instance.
  2. For prototype.updateAttributes, data is the changes to be applied while instance is the model instance itself. (I think there is a bug in the current implementation which applies changes to the instance before making a DB call).
  3. For prototype.delete, data is undefined while instance is the model instance itself.

@bajtos
Copy link
Member

bajtos commented Mar 6, 2015

Let me cross-post my earlier comments explaining why I think it's bad idea to send data & instance as proposed by @raymondfeng.

#483 (comment)

If an instance is passed in the context, it should never be used to manipulate data - it is only there to get the current state (for comparison for example). Any changes would go into the data property of the context.

This is what I disagree with. One of the ideas behind ORM and OOP in general is that the objects should encapsulate the behaviour and hide implementation details. An example is a property with custom setter that creates a normalised version of the input value, or a prototype method that performs some calculation and updates several properties.

Another example is a code that changes some model properties and then would like to get a value computed from these changed properties.

That's why I believe users should use the instance object to manipulate the data in as many places as possible. Because if they had only the data object, then they would have to build a new instance in order to use these prototype methods.

Also because we perform a validation of the updated model instance, in many cases we would have to 1) extract data from the instance before calling the hooks, 2) pass the data back to the instance for validation. At the moment, we can skip 1) because hooks are manipulating the instance directly.

#483 (comment)

My concern is that converting the instance to data is not free, it takes some time and allocates extra memory. I don't want all users to pay this price.


Also I believe the hook implementations should have the information whether a partial or a full update will be performed, because the partial update may need special handling.

E.g. when applying partial update of "firstName" only on a model with "firstName", "lastName" and computed "fullName", then the hook has to do one of:

  • reject the update, because we don't have lastName to calculate "fullName" - this is the case for batch updateAll
  • modify the update data and include unmodified "lastName" and recomputed "fullName" - this is the case for instance updateAttributes

@fabien I would really appreciate if you can keep the work in the same pull request the next time, so that the conversation stays in single place.


@ritch I'm not worried about the name... my concern is we are adding yet another case you have to handle when using a hook. What would alleviate my concern is if you can use the same assumptions (eg. where vs instance) and not even know currentInstance exists.

See the second part of #483 (comment)

Any hook operating on an instance-level should also receive the instance in the context.

There are two basic ways how the hook may want to use the instance:

A) It wants to modify the data that will be persisted to the database, this is specific to "before save" hook. We must make it very clear whether the save will update the whole record (ctx.instance ATM) or perform a partial update (ctx.data ATM).

B) It wants to get more information about the model instance affected. IIRC, there are only two cases where we are not exposing this data even though we have it available: prototype.remove and prototype.updateAttributes.

Let's focus at updateAttributes first. For hooks interested in modifying the data (A), the API must make it very clear that there is no instance to manipulate, the hook must update the partial data object instead. Therefore we cannot use ctx.instance to provide the readonly info for (B), because that's inconsistent with the way all operations work, it means one can no longer check if(ctx.instance) to decide whether he has a full instance or partial data for modifications.

When it comes to prototype.remove, there are no restrictions similar to what we have in updateAttributes. However, I am reluctant to use ctx.instance simply because it will be inconsistent with the way how "before save" use ctx.instance.

My conclusion is that we need to come up with a new property name that will be set in the where+data case and that will make it very clear that the changes made to this instance will be ignored by the operation. Perhaps ctx.targetInstance would be a good name?

@bajtos
Copy link
Member

bajtos commented Mar 6, 2015

One more thought: while the current design of operation hooks may be perceived as too complex, it reflect the complexity of the DAO methods provided by LoopBack and forces users to think more about the different scenarios in which the hook may be invoked.

I am concerned that if we simplify the API too much, it may make it too easy to shoot oneself in the foot.

@raymondfeng
Copy link
Contributor

I think there are some confusions about the terms we use. Let me first clarify what I use:

  1. instance - the receiver of the method if it is a prototype method, for example, the instance will be order for order.updateAttributes().
  2. data - the data to be sent to the underlying connector for DB operations. Please note the data can be a plain JS object or an instance of the target model class (I believe that @bajtos uses instance for this case).

@bajtos
Copy link
Member

bajtos commented Mar 6, 2015

@Raymond For prototype.save, data is the same as instance.

This will break the contract that ctx.instance is immutable and does not change, because changes to ctx.data will be propagated to ctx.instance.

Other than that, I think this may be a reasonable compromise. I'd rather use a different name for ctx.instance, e.g. ctx.targetInstance, in order to preserve backwards compatibility with the current implementation.

Should the context include both where and ctx.targetInstance, or are these two properties mutually exclusive?

@fabien
Copy link
Contributor Author

fabien commented Mar 6, 2015

As I see it there are two ways to provide a consistent API - we'd have to choose one:

  1. whenever instance is available, you'd use it directly (save and updateAttributes, the latter will do a diff and perform the partial update)
  2. whenever data is available, you'd use it directly. If there's a instance (heck, even currentInstance) available, you can choose to inspect it for your modifications to data

@bajtos
Copy link
Member

bajtos commented Mar 6, 2015

@raymondfeng IIUC, if we go with your proposal, then ctx.data is an instance of the target model class in case of a full update, and a plain JS object in case of a partial update?

In other words, the following is correct?

var isFullUpdate = ctx.data instanceof ctx.Model;

@raymondfeng
Copy link
Contributor

var isFullUpdate = ctx.data instanceof ctx.Model;

Yes.

@fabien
Copy link
Contributor Author

fabien commented Mar 6, 2015

After giving this more thought, I think option 1 (and thus #488 (comment)) works most consistently. Principle Of Least Surprise, you know ...

@raymondfeng
Copy link
Contributor

What about the following?

  1. Add self to refer to the JS object instance that owns the prototype method
  2. self and where are exclusive
  3. Keep instance and data as-is (even though I really feel the term of instance is confusing)

@bajtos
Copy link
Member

bajtos commented Mar 6, 2015

For prototype.updateAttributes, data is the changes to be applied while instance is the model instance itself. (I think there is a bug in the current implementation which applies changes to the instance before making a DB call).

As far as I understand the current implementation, the changes are applied to the instance so that the instance can be validated. Which I find a bit weird, because we can't guarantee that the data we see in LoopBack will be the same that will end up in the database, since updateAttributes is making only a partial update.

Perhaps if we have prototype.update, performing a full property update, then we can remove the validation from updateProperties and keep it very lightweight partial update, similar to updateAll.

Another benefit of prototype.update is that the hooks will naturally receive ctx.instance, because we are sending the full instance to the database.

@fabien
Copy link
Contributor Author

fabien commented Mar 6, 2015

Note that instance is not actually immutable (not enforced) and it is currently the recommended way to modify during prototype.save. If we can extend this to updateAttributes and handle the partial update transparently.

@fabien
Copy link
Contributor Author

fabien commented Mar 6, 2015

@raymondfeng I personally like instance better than self.

@fabien
Copy link
Contributor Author

fabien commented Mar 9, 2015

@raymondfeng @bajtos I propose to land this PR and then refactor at a later stage - it would be a bit too much to expect those who ported their code over to the new hooks to do it differently in such short timespan. Perhaps towards v.3 is more appropriate?

@bajtos bajtos self-assigned this Mar 9, 2015
@raymondfeng
Copy link
Contributor

I'm fine to land your PR as a short-term workaround. But the complexity and counter-intuition on how to use the ctx object really bothers me. We should fix it ASAP and probably release the changes in loopback-datasource-juggler@3.0.0.

@raymondfeng
Copy link
Contributor

BTW, I think loopback-datasource-juggler deserves a new major release with the Promise and Hook features.

@bajtos
Copy link
Member

bajtos commented Mar 9, 2015

However, even though it goes against the philosophy of these new hooks, it would certainly be valuable to have ctx.operation and the current type of operation at hand ('create', 'update', ...).

I'd like to implement #397 for that.

@raymondfeng
Copy link
Contributor

@bajtos +1 to implement the per method (#397) based hooks. That will give developers maximum control and flexibility.

But even with that, the intent-based hooks are needing some fixes to make it consistent and intuitive.

@bajtos
Copy link
Member

bajtos commented Mar 9, 2015

BTW, I think loopback-datasource-juggler deserves a new major release with the Promise and Hook features.

IMO it's ok to release a minor version, because we are not breaking backwards compatiblity. That's the semver way. It also allows us to release a major version when we learn enough about the new hooks and find issues that require a major change.

@raymondfeng
Copy link
Contributor

IMO it's ok to release a minor version, because we are not breaking backwards compatibility. That's the semver way. It also allows us to release a major version when we learn enough about the new hooks and find issues that require a major change.

Clarification: I was not questioning what have been released. I meant to say if we fix the hooks with breaking changes, it's reasonable to release it together with the promise as a major release.

@bajtos
Copy link
Member

bajtos commented Mar 9, 2015

I don't want to repeat all my arguments why I still think the current design of operation hooks is good, I feel we are running in circles here.

Let's land this pull request to fix the @fabien's immediate problems (after the commit history is cleaned up) and start a new discussion on what is a better solution for this kind of hooks, hopefully getting more community input this time.

@bajtos
Copy link
Member

bajtos commented Mar 9, 2015

IMO it's ok to release a minor version, because we are not breaking backwards compatibility. That's the semver way. It also allows us to release a major version when we learn enough about the new hooks and find issues that require a major change.

Clarification: I was not questioning what have been released. I meant to say if we fix the hooks with breaking changes, it's reasonable to release it together with the promise as a major release.

👍, makes sense.

@bajtos
Copy link
Member

bajtos commented Mar 9, 2015

I'm really worried about the magic you had to put into your hook logic as seen from the gist with the current context structure. Let me pick a few examples:

var source = ctx.currentInstance || ctx.instance || ctx.data || {};
var target = ctx.instance || ctx.data || {};

Why do I have to check so many ctx properties to understand what's source and what's target?

Because there are subtle differences between "full update of an instance", "updateAttributes" and "updateAll".

However, I am ok with adding property getters ctx.source and ctx.target to provide a syntax-sugar API for users that don't care about these details.

@fabien
Copy link
Contributor Author

fabien commented Mar 9, 2015

@bajtos I'll clean it up so it can be merged. Thanks! 👍

@raymondfeng
Copy link
Contributor

For final fix to the intent-based hooks, I propose that we use a use-case driven approach. Please add or link your use cases to the wiki page: https://github.com/strongloop/loopback-datasource-juggler/wiki/Use-cases-for-intent-based-CRUD-hooks

@bajtos
Copy link
Member

bajtos commented Mar 9, 2015

@raymondfeng is there anything else to discuss in this PR? Can I merge it once @fabien cleans up the commit history?

@raymondfeng
Copy link
Contributor

Go ahead pls.

@fabien
Copy link
Contributor Author

fabien commented Mar 9, 2015

@bajtos I'll push the code in the morning ... sorry about the delay!

@fabien fabien force-pushed the feature/op-hooks branch from a08047b to 74ad455 Compare March 10, 2015 14:24
@fabien
Copy link
Contributor Author

fabien commented Mar 10, 2015

@bajtos cleaned up - can you merge please?

@bajtos
Copy link
Member

bajtos commented Mar 10, 2015

@fabien can you please beef up the commit message and explain the changes? If not then I'll try to do that myself tomorrow, so that this is finally landed.

@bajtos bajtos closed this in bc54104 Mar 11, 2015
@bajtos
Copy link
Member

bajtos commented Mar 11, 2015

Landed via bc54104

@fabien
Copy link
Contributor Author

fabien commented Mar 12, 2015

@bajtos I'm a bit late to the party - thanks for landing this patch and beefing up the commit!

@bajtos
Copy link
Member

bajtos commented Mar 12, 2015

You are welcome, thank you for your patience :) I'd like to wait with the release until #452 is landed too, so that we don't release minor versions too often.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants