Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Take advantage of Java 8 language features #5

Open
wants to merge 21 commits into
base: master
Choose a base branch
from

Conversation

bmccutchon
Copy link

@bmccutchon bmccutchon commented May 12, 2016

I just took the time to enhance the project with Java 8 language features. The Generator syntax is more concise now, and it can be used to create Java 8 (parallel) streams. Also, Generators are now stateless by default, so you can iterate over a generator multiple times (unless it performs some mutation). The following is copied and pasted from my fork of your README.

Examples

The following is a simple generator that yields 1 and then 2:

Generator<Integer> simpleGenerator = s -> {
    s.yield(1);
    // Some logic here...
    s.yield(2);
};

for (Integer element : simpleGenerator)
    System.out.println(element);
// Prints "1", then "2".

Infinite generators are also possible:

Generator<Integer> infiniteGenerator = s -> {
    while (true)
        s.yield(1);
};

You can even use a generator to create a (parallel) Stream:

// Note that the generic parameter is necessary, or else Java can't determine
// the generator's type.
Generator.<Integer>stream(s -> {
    int i = 0;
    while (true) {
        s.yield(i++);
    }
}).limit(100).parallel() // and so on

bmccutchon and others added 12 commits May 11, 2016 15:43
Also use a lambda for a Runnable instead of an anonymous inner class.
This allows it to be used by functions that expect suppliers, for example, with
Stream.generate(). If the Supplier reaches the end of the underlying Iterator, a
NoSuchElementException is thrown.
Everything works yet, except... You can't use a generator multiple times, which
would be fine, except that Java caches "stateless" lambdas (even if the
functional interface is using a hack to give it state, but Java can't know
that).
This is the best solution I see to the problem with Java caching lambdas it
thinks are stateless.
It follows naturally from the last commit, and really only took a few lines of
code.
It had become redundant, as all it really did was hold an iterator. Now
GeneratorIterator takes its place.
The class no longer implements Supplier; that was a silly way of doing things.
It turns out that you can make a stream from an iterator easily. Generator is
now really stateless.
@mherrmann
Copy link
Owner

Thank you very much for the PR. It looks very interesting. Your commits are also very well encapsulated and easy to follow. I would like to incorporate your changes, but there are a couple of points I would like to discuss first:

  1. I would still like to make the old version of the library available to people who can't use Java 8 yet. I think the easiest way to do this would be for me to merge and release your changes (as v1.1 say) and update the readme to tell people that if they are on Java < 8 then they have to use the current release (v1.0).
  2. Your updated readme only contains examples using lambdas. I think it would be good to also still have examples showing the old syntax.
  3. Are your changes backwards-compatible? Ie. will existing users of the library be able to update the library while keeping their existing code base the same without breaking anything? This would be important I think.
  4. Making all Generators stateful/resettable is a no-go for me. I believe your implementation also results in a memory leak because of the static Map static iters in Generator.java. Each generator that is ever constructed remains in this map, slowly filling up memory. (A way around this would be to use Java's weak references.) Because of this (and other related headaches that I believe will result from statefulness), I'm very much against adding a reset method to the Generator interface. What could be done however would be to add a separate class (eg. ResettableGenerator) that can be used to wrap any existing generator. This class could have a reset method. It would keep the elements already yielded in a private list. When reset is called, it starts yielding the items from the list, until this list is exhausted at which point it yields from the wrapped generator again (adding the items to the internal list as it goes). The nice thing about this would also be that it avoids the memory leak (since the list would be contained in the ResettableGenerator instance so if this instance goes out of scope then all memory can be collected).

I haven't had time to look at all your changes in detail but these are the most important points I could see.

Cheers!
Michael

@bmccutchon
Copy link
Author

bmccutchon commented May 17, 2016

  1. Easy enough. Should I add that to my README or will you add that after you merge?
  2. Sure, I can add that.
  3. No, it isn't currently backwards-compatible, but I think that can be fixed. The one backward-incompatible change I made was removing the yield method from Generator, as lambdas can't call methods from their functional interfaces. Adding it back now would mean breaking the statelessness of the interface and going back to the Map of iters approach I recently did away with, but the real backwards-incompatible change is the change in method signature of run, which is basically the only way for lambdas to have access to the yield method. Instead, I think it would be better to split it into two types, a class Generator, which behaves in a backward-compatible way, and an interface GeneratorFunc, which behaves like the new Generator interface.
  4. No problem, I actually already changed that, if you look at c64c4b2, which I added after the original PR. It turns out I was able to make Generators stateless. Even before that, I was using weak references (via a WeakHashMap for iters), as I was aware of the need to avoid a memory leak.

bmccutchon and others added 5 commits May 17, 2016 12:36
Generator is now split into the backwards-compatible, stateful, abstract class
Generator and the newer functional interface GeneratorIterator.
Syntax highlighting is nice...
This became useful again because Generator is now stateful (which was required
for the yield() method to work in a backward-compatible way).
@bmccutchon
Copy link
Author

I think I've done pretty much everything you suggested (unless you want more than one README example using the old syntax). In the course of fixing your third point, I realized I basically did the same thing as your proposed solution for the fourth point, except that ResettableGenerator is just called Generator (for backwards compatibility) and the functional interface previously called Generator is now called GeneratorFunc.

@bmccutchon
Copy link
Author

Actually, I noticed after my last comment that you mention that the ResettableGenerator class (which I will just refer to as Generator) should have a list of yielded elements. Do you mean that, the first time the Generator runs, elements are stored in a list, and the second time, after reset is called, we just yield elements from that list? That would take more memory, so I don't imagine most people would want that (otherwise they would just build lists), but it could have the benefit of acting like a list whose elements are loaded lazily, although a lazy-loaded list class that loads from an iterator/iterable might be better for that.

Actually, I just thought of an even better way of handling that class. reset is meant to be called between invocations of iterator(), so why not make it private and call it at the beginning of iterator?

@bmccutchon
Copy link
Author

I just added another method for syntactic sugar. Are you still interested in merging this PR? I think the only question is whether or not you like the fact that all Generators are reusable (unlike in Python). There's no public reset() method anymore.

@mherrmann
Copy link
Owner

Hi Brian,

sorry I've been so out of touch. The main reason is that I am busy working on other projects, and java-generator-functions has very low priority for me. In fact, I am not even using it anymore myself, because I ran into performance issues of the thread-based approach on OS X (as mentioned in the README).

Your changes look fine, but they are basically a complete rewrite of the library. By merging your changes, I would take on responsibility of code that I mostly didn't write. Given that I'm already out of touch with the project, that's something I would rather not do. Having said that, I do think there is a very viable way forward: You could publish your code as a separate project, saying that it is "generator functions for Java 8+". And I could put a note at the top of "my" README saying that users who want Java 8 language features should look at your project. I think this would also be a very good solution to my point 1. above.

Sorry if I caused you unnecessary work. I hope it won't be in vain if you continue it as a separate project.

M

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants