Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft: {find,canonicalize} vs unsafe{Find,Canonicalize} #12

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

alt-romes
Copy link
Owner

@alt-romes alt-romes commented Aug 28, 2022

Closes #1

Work in progress (documentation missing)

I realize I'm still not 101% convinced.
I do realize the price to pay for the safe find if the e-graph is already rebuilt is minimal, but perhaps I'm still hung up on that.

Currently, our only consumer of find is in Extraction here and here

It feels a bit weird that the e-graph is being rebuilt everytime find is called since it's being done so many times.

The performance difference isn't noticeable though.

@alt-romes alt-romes force-pushed the unsafe-find-canonicalize branch from 692d233 to 1208a91 Compare August 31, 2022 10:21
@alt-romes
Copy link
Owner Author

@aspiwack

This looks good to me now. It's an improvement and probably even more so for just-e-graph users.

There's one thing against the lazy-e-graph-forced-on-rebuild point of view that I'm wondering if we can fix, but which is not really a problem.

In the lazy point of view, calling find on an un-rebuilt e-graph forces the e-graph to be rebuilt.

That would make me expect that

egr = ...    -- (1) un-rebuilt e-graph
find 2 egr -- (2) forces e-graph to be rebuilt
find 3 egr -- (3) e-graph has been forced, so this should be now instantaneous

My case is with (3). Calling find a second time will still rebuild the e-graph completely, which makes the lazy argument a bit weaker.

Just wondering if we can make this fit in the lazy idea, or if it's better not to consider it 100% as a lazy e-graph (and not advertise it as such)

It would be very cool if we could use haskell's laziness to make (3) a reality though!

@alt-romes
Copy link
Owner Author

@aspiwack Friendly reminder regarding this PR to close #1

This looks good to me now. It's an improvement and probably even more so for just-e-graph users.

I can otherwise just merge it, but since you brought up the issue I thought you might have something to say about the resolution 🙂

@aspiwack
Copy link
Contributor

aspiwack commented Sep 9, 2022

Ouch. I'm sorry. Your question requires more than a 5min answer, that how it ended up slipping my mind. I'll give my thoughts by tomorrow (Saturday) night.

@aspiwack
Copy link
Contributor

Ok, I'm back.

Yes, your point is a very good one. I was, ironically, thinking in terms of mutable data structures, in which case there is no problem of going back in time.

I think that there are 3 possible attitudes that we can take. Well, kind of 4, but the fourth one is quite a bit of a stretch.

  1. The first point of view it to not care. This data structure is meant to be used in an imperative fashion, the pure implementation is a distraction. It's quite possible to take this stance, but I don't think that it's super tenable anymore.
  2. We can cache rebuild. The trick is to add an extra field to graphs, this one is not a strict field
    rebuilt_graph :: ReprUnionFind
    Whenever creating a graph, rebuilt_graph is set to rebuilt_graph = internalRebuild <the rest of the graph>
    Then
    rebuild gr = gr { unionFind = reuilt_graph gr }
    This would work, at the cost of a little more complexity each time you modify a graph.
    • It's worth taking into account that this method prevents you from having to replay a rebuild step . However, if you do an insertion then a rebuild, if you go back before the insertion and rebuild again, you still have to replay the rebuild (sans insertion). A “solution” to that, is to make an Okasaki-style data structure for the e-graph. But this sounds like a research problem, so let's not go there.
  3. We can consider that all this is just a little too fragile. Maybe rebuild should always be called explicitly after all. To avoid calling find on a bad graph we can make find partial. I think that they do something like this in the Rust library. The idea is to add a field to the e-graph
    rebuilt :: Bool
    The rebuilt field is set to True by rebuild, and to False by any mutation. Then find fails if the rebuilt field is False (this makes find partial).

I can't ansewr for you which attitude you should have to this. It's a design question, and depends on where you want to bring this library. It's up to you to choose what serves your purpose best.

@alt-romes
Copy link
Owner Author

@aspiwack

Small update:

I've taken a detour into reading Okasaki's book whose ideas are sounding really interesting. I'll attempt to make the e-graph emulate laziness and use some of its ideas in it. Some variation of (2) right now is the most appealing to me.

I'm trying to implement https://dl.acm.org/doi/pdf/10.1145/3485496 at the moment, and intend on trying some other related projects. I hope to experiment a little bit more with the user side of the library to help me in making these decisions a little bit better.

Thanks again!

Note from #1

(This is getting very off topic, but this is very neat! I'd love to see you demonstrate it to me some time)

I sent out an e-mail regarding this!

@alt-romes alt-romes force-pushed the master branch 2 times, most recently from eb291c6 to d2862ab Compare June 28, 2023 18:29
@alt-romes alt-romes changed the title {find,canonicalize} vs unsafe{Find,Canonicalize} Draft: {find,canonicalize} vs unsafe{Find,Canonicalize} Aug 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Rename {find,canonicalize} to unsafe{Find,Canonicalize}
2 participants