-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Filter default properties in json middlewares instead of filter fn #6
base: syncback-merge
Are you sure you want to change the base?
Filter default properties in json middlewares instead of filter fn #6
Conversation
src/snapshot_middleware/mod.rs
Outdated
log::warn!( | ||
"Rojo cannot serialize the property {}.{name} in JSON files.\n\ | ||
If this is not acceptable, resave the Instance at '{}' manually as an RBXM or RBXMX.", | ||
new_inst.class, snapshot.get_new_inst_path(new_inst.referent()) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this warning also provide the path of the file where the instance is defined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can actually include SharedString
values in JSON now because of rojo-rbx/rbx-dom#414. I just forgot to change it.
We can probably just remove this warning altogether.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mind if we tackle that in a future PR? Making that change effects several tests and I'd rather not clutter this PR with them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we can handle that in a different PR. A bit out of scope for this one either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we allow rbx-dom's rbx_reflector to write defaults for SharedString properties (which is now possible after rojo-rbx/rbx-dom#414), then we won't have to alter any test data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there's a compelling reason to not have rbx-dom write SharedString defaults now. Seems like a good change.
This also needs tests, I'll get to that shortly |
rojo-test/syncback-tests/ignore_paths_removing/expected/default.project.json
Show resolved
Hide resolved
I've added a test that contains an input file with some instances that have all but one property set to their default values. In the output project, I've defined properties which should be removed from whatever JSON file describing the instance, since the input file has them at their defaults. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change does cause the project_reserialize
test to reserialize default.project.json
, which is counter to the whole point of the test. I realize the name isn't entirely clear though, so that's on me. The test is confirming that project files don't reserialize when they don't have to, so it shouldn't include the default.project.json
. The file in the test's output
probably needs to be changed to account for this PR.
Otherwise, I think this is a good change and it simplifies the middleware logic a fair bit and makes it more consistent. I like it.
rojo-test/syncback-tests/ignore_paths_removing/expected/default.project.json
Show resolved
Hide resolved
src/snapshot_middleware/mod.rs
Outdated
log::warn!( | ||
"Rojo cannot serialize the property {}.{name} in JSON files.\n\ | ||
If this is not acceptable, resave the Instance at '{}' manually as an RBXM or RBXMX.", | ||
new_inst.class, snapshot.get_new_inst_path(new_inst.referent()) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can actually include SharedString
values in JSON now because of rojo-rbx/rbx-dom#414. I just forgot to change it.
We can probably just remove this warning altogether.
...For some reason GitHub includes my responses both in my review and on your comments above. Some of the above comments are confusing without the conversation chain. Thanks GitHub. |
I've refactored again, so that the new function is responsible for creating the property and attribute maps. This further simplifies the call sites, at the cost of having to create a new property BTreeMap for project middlewares rather than mutating, and making the new function slightly new complicated. I still need to address project reserialization - |
Outside of the tests this looks fine to me! |
Okay, I think I've fixed Sorry for the delay on this! |
Alright, I've slightly reworked the second pass over properties in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. It's a more elegant way of handling this.
Any last qualms or concerns you can think of before I merge it?
The only outstanding concern I personally have is about performance, but since I'm not familiar with this code or the cost of the extra hashes and comparisons, and don't have a very large project to profile, I'll leave it up to your judgment. |
So, I ran this across Adopt Me to test whether it resulted in a significant slowdown. The good news is that it doesn't so the performance cost doesn't seem to exist. The bad news is that this change seems to unintentionally triggering reserialization for every model.json in Adopt Me, every time. I'll investigate the underlying cause soon, but we can't merge it as-is. 3000 changed files every time we run syncback on Adopt Me is a no-go, haha. |
Unfortunate, but not entirely unexpected - thanks for taking the time! |
Having a hard time reproducing this - I tried with a Part defined in a JSON model. Just making sure, but this is happening for JSON models (not project files), and this new implementation is causing actual changes in the JSON models? I'm curious to know what the changes are, if these JSON models have any commonality, etc. |
Looking at it, it seems like this is just a consequence of how this is implemented. The check whether something needs to have syncback ran on it is done in the main loop, but at that point there will be a mismatch between JSONs and the model as read from the file system, so it obviously triggers syncback on them. It doesn't happen for projects because there's an internal "should node reserialize" check for them, but it would if there wasn't. This is new behavior because the hashing does filtering ahead of time using the same property filtering that JSON files used, but this PR changed We can fix this by either restoring that behavior in hashing (a couple ways to do that but not very complicated either way) or by adding custom "should reserialize" behavior to the other JSON representations. I am inclined to just fix hashing, since it'll be less complicated and keep code in one place. |
The problem is that when default properties are unilaterally excluded from hashes (the previous behavior), then it is impossible for syncback to set properties to default values, if that's the only change for a given instance in the input file. So for example, if we have the following instance in a JSON model: {
"className": "Part",
"properties": {
"Size": [
1.0,
2.0,
3.0
],
"Anchored": true,
}
} and we have an input file where |
The way I had in mind was that we could have a check to see if a value was equal to the default value, and if so just leave it out of the hash. This adds a bit more overhead to the hashing process but it was essentially overhead that was already present prior to this change so I'm fine with it. |
Apologies if I'm misunderstanding, but isn't that what syncback hashing currently does, producing the undesirable behavior I described here which this PR is trying to fix? See here in rojo/src/syncback/property_filter.rs Lines 57 to 70 in 6249d47
...which is later called by hash::hash_inst_filtered :Lines 76 to 84 in 6249d47
...so I'm not quite sure what to do. Maybe one way we could help this is by implementing rojo-rbx/rbx-dom#385 using the "WeakDom populates defaults" strategy. That way, instances in the old tree (which are generated directly from the Rojo project, and therefore are currently missing any properties left unspecified) would have defaults populated, and will have a better chance of hashing identically to instances in the new tree. |
Sorry for the delay in getting back to you on this! Yeah, looking at it again that does seem to be the exact behavior we're trying to fix here. Hmm. I suppose that probably means we'll need to go with the lame version, where we have 'should reserialize' functions for the JSON middleware instead. Frustrating, but at least it gives us the opportunity to do it in one place instead of isolated per JSON middleware, right? Populating the default properties of Instances would probably also help, but I can't go "this'll work if we change something upstream" because right now it just doesn't work. |
Yep, I've done some work in that direction already, so I'll get started on finalizing what I have. We can revisit the strategy here once rojo-rbx/rbx-dom#385 is further along, but I agree that we should just get it working first. |
I've added some logic to decide whether a JSON model should reserialize. Testing this is harder than it should be: syncback seems to think that instances described by JSON models included directly in a Additionally, syncback doesn't seem to play nice at all with JSON models with children, and wants to syncback JSON model children as if they were either described in the project file, or underneath a directory. So, I can't really test those code paths. Fixing this is out of scope for this PR, but we should do further testing and figure out what's going wrong. Meta files should hopefully be much simpler, since they cannot describe children. |
Added some similar logic to meta files, and it is much simpler, hooray! I'm not sure if this needs another test, because the "all middleware" test seems to adequately exercise this code. |
New logic looks good, and it works, though this branch still causes 887 files to attempt to reserialize for Adopt Me. If I let it continue, only 4 of them actually do change. There are also some surprise JSON models in there, so I'll investigate those, but I suspect the rest of these are still just issues caused by hashing. We could fix them the same way but it does beg the question of why we even bother with the hashing if we're manually checking every Instance with default properties we want to filter anyway (this will be basically anything that's not an RBXM). |
What kinds of files are those 887 made up of? Are they still JSON models, or other kinds? Any more information from which we could create isolated test cases will help. I'm also curious what the 4 actual changes were - maybe it's the implementation removing JSON properties that were found to be at default values (which we'd expect)? wrt the value of hashing: the more I think about it, the more this entire issue around default properties seems caused by rbx-dom not having enough parity with Roblox's implementation a la rojo-rbx/rbx-dom#385. I think it's highly likely that if rbx-dom starts populating default properties, then we'll start seeing the hashes match up when there are no logical changes, allowing us to once again skip all of this later work. If it doesn't cause too much of a performance hit right now, then I don't see too much of a problem. Maybe there's another way to avoid the defect of sometimes dropping change-to-defaults, but I can't say I've thought of it yet. |
This PR:
filter_properties_preallocated
populate_unresolved_properties
, which is now responsible for filtering default propertiesI think this makes sense since only JSON middlewares care about omitting default properties from their file representations. However, this means that more instances may have to be hashed and compared. I'm not totally sure how this will impact performance - it'd be great to profile this on a large project!
These changes allow JSON middlewares to correctly remove properties when property values change to their defaults. Previously, properties at default values would be unilaterally filtered out during hashing, meaning that syncback would produce incorrect results when properties were set back to default values, like in rojo-rbx#937 (comment).