Repeated chars not working as expected #506

Closed
esfomeado opened this issue Jul 19, 2023 · 6 comments

Comments

@esfomeado

Testing Problem

I had an issue where jqwik frequently generates duplicated values, which in my example should almost never occur.

I noticed that these duplicated values always consist of a single repeated character, e.g. AAAAAA, so I set repeatChars to 0, but such values are still generated. I also tried other values, but they seem to have no impact on how often a repeated char is used.

    @Test
    void bug() {
        StringArbitrary strings = Arbitraries.strings()
                .alpha()
                .ofMinLength(5)
                .ofMaxLength(25)
                .repeatChars(0);

        List<String> names = IntStream.range(0, 100)
                .mapToObj(i -> strings.sample())
                .toList();

        Map<String, Long> duplicateCount = names.stream()
                .filter(Objects::nonNull)
                .collect(Collectors.groupingBy(s -> s, Collectors.counting()));

        List<String> duplicateStrings = duplicateCount.entrySet().stream()
                .filter(entry -> entry.getValue() > 1)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());

        System.out.println("Duplicate strings: " + duplicateStrings);

        assertThat(duplicateStrings).isEmpty();
    }
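The expectation that duplicates "should almost never occur" can be checked with a rough estimate. The sketch below is not jqwik code. It assumes, purely for illustration and not as a description of jqwik's actual distributions, that lengths are drawn uniformly from 5 to 25 and that each char is drawn uniformly from 52 letters (a-z, A-Z). The class and method names are hypothetical.

```java
class DuplicateOdds {

    // Probability that two independent samples are equal, ASSUMING a uniform
    // length in [minLen, maxLen] and uniformly chosen chars (an assumption
    // made for this estimate, not jqwik's real generation strategy).
    static double pairCollisionProbability(int alphabetSize, int minLen, int maxLen) {
        int lengths = maxLen - minLen + 1;
        double sum = 0.0;
        for (int len = minLen; len <= maxLen; len++) {
            // Given that both samples have length len, all len chars must match.
            sum += Math.pow(alphabetSize, -len);
        }
        // Both samples pick the same given length with probability (1 / lengths)^2.
        return sum / ((double) lengths * lengths);
    }

    // Expected number of equal pairs among `samples` independent draws.
    static double expectedDuplicatePairs(int samples, double pairProbability) {
        double pairs = samples * (samples - 1) / 2.0;
        return pairs * pairProbability;
    }

    public static void main(String[] args) {
        double p = pairCollisionProbability(52, 5, 25);
        System.out.printf("pair collision probability: %.3e%n", p);
        System.out.printf("expected duplicate pairs in 100 samples: %.3e%n",
                expectedDuplicatePairs(100, p));
    }
}
```

Under these assumptions the expected number of duplicate pairs among 100 samples is far below one, so duplicates that show up regularly point to something other than chance.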
@jlink
Collaborator

jlink commented Jul 20, 2023

@esfomeado Thanks for raising that issue.
I see why you’d expect repeatChars(0) not to create any duplicate chars at all. Currently it just sets the probability of explicitly adding duplicate chars to 0. The effect you see is due to edge case resolution.

I’ll have to think about whether changing the behaviour or the documentation of repeatChars is the right thing to do. Maybe an additional uniqueChars() would be even clearer.

To solve your problem short-term, you can use withoutEdgeCases() on the strings arbitrary:

Arbitraries.strings().alpha()
		   .ofMinLength(5).ofMaxLength(25)
		   .repeatChars(0.0)
		   .withoutEdgeCases();

This will prevent "AAAAA" from being generated on purpose. Duplicate chars will still occur by chance, though.
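To illustrate why edge cases, rather than chance, explain the duplicated strings, here is a toy model in plain Java. It is not how jqwik is implemented: the edge-case list, the injection probability, and all names are hypothetical and chosen only to show the effect.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

class EdgeCaseToyModel {

    static final String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // Hypothetical edge cases; a real generator derives its own set.
    static final String[] EDGE_CASES = {"AAAAA", "zzzzz"};

    // Purely random string with a length in [minLen, maxLen].
    static String randomString(Random random, int minLen, int maxLen) {
        int len = minLen + random.nextInt(maxLen - minLen + 1);
        StringBuilder sb = new StringBuilder(len);
        for (int i = 0; i < len; i++) {
            sb.append(ALPHABET.charAt(random.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    // With the given probability return an edge case, otherwise a random string.
    static String sample(Random random, double edgeCaseProbability) {
        if (random.nextDouble() < edgeCaseProbability) {
            return EDGE_CASES[random.nextInt(EDGE_CASES.length)];
        }
        return randomString(random, 5, 25);
    }

    // Count how many samples repeat a value that was already seen.
    static int countDuplicates(long seed, int samples, double edgeCaseProbability) {
        Random random = new Random(seed);
        Set<String> seen = new HashSet<>();
        int duplicates = 0;
        for (int i = 0; i < samples; i++) {
            if (!seen.add(sample(random, edgeCaseProbability))) {
                duplicates++;
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        System.out.println("with edge cases:    " + countDuplicates(42L, 100, 0.05));
        System.out.println("without edge cases: " + countDuplicates(42L, 100, 0.0));
    }
}
```

In this model, any non-zero injection probability makes the same few edge-case strings recur, while setting it to zero leaves only chance collisions, which are extremely unlikely for strings of this length.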

If you definitely want no duplicates, you’d currently have to take a detour through something like:

Arbitraries.chars().alpha().list()
		   .ofMinSize(5).ofMaxSize(25)
		   .uniqueElements()
		   .map(chars -> chars.stream().map(String::valueOf).collect(Collectors.joining()));
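The idea behind this detour (pick distinct chars first, then join them into a string) can be sketched without jqwik in plain Java. This is only an illustration of the technique, using a shuffle instead of jqwik's list arbitrary. The class and method names are hypothetical, and the 52-letter alphabet is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class UniqueCharStrings {

    static final String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // Build a string of distinct chars: shuffle the alphabet, then take a prefix
    // whose length is chosen randomly from [minLen, maxLen].
    static String uniqueCharString(Random random, int minLen, int maxLen) {
        if (minLen < 0 || minLen > maxLen || maxLen > ALPHABET.length()) {
            throw new IllegalArgumentException("need 0 <= minLen <= maxLen <= " + ALPHABET.length());
        }
        List<Character> chars = new ArrayList<>();
        for (char c : ALPHABET.toCharArray()) {
            chars.add(c);
        }
        Collections.shuffle(chars, random);
        int len = minLen + random.nextInt(maxLen - minLen + 1);
        StringBuilder sb = new StringBuilder(len);
        for (char c : chars.subList(0, len)) {
            sb.append(c);
        }
        return sb.toString();
    }

    // True if no char occurs more than once in s.
    static boolean hasUniqueChars(String s) {
        return s.chars().distinct().count() == s.length();
    }

    public static void main(String[] args) {
        String s = uniqueCharString(new Random(), 5, 25);
        System.out.println(s + " unique: " + hasUniqueChars(s));
    }
}
```

A check like hasUniqueChars can also serve as the assertion in a test that verifies generated strings contain no repeated chars.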

@jlink
Collaborator

jlink commented Jul 20, 2023

I decided to introduce an experimental new API method: StringArbitrary.uniqueChars()

The open question is whether StringArbitrary.repeatChars(0.0) should log a warning, just call uniqueChars implicitly, or both.

@esfomeado What do you think?

@esfomeado
Author

esfomeado commented Jul 20, 2023

Thanks for the quick response.

@jlink Personally, I would expect the same behaviour as uniqueChars when it is set to 0, so I would say just call uniqueChars implicitly.

@jlink
Collaborator

jlink commented Jul 21, 2023

The StringArbitrary.uniqueChars() functionality and the fix for repeatChars(0.0) are available in 1.8.0-SNAPSHOT.

@jlink
Collaborator

jlink commented Jul 21, 2023

In addition, there's now an annotation @UniqueChars that can be applied to any String parameter in a property.

@jlink
Collaborator

jlink commented Jul 21, 2023

New features available in 1.8.0-SNAPSHOT

@jlink jlink closed this as completed Jul 21, 2023
@jlink jlink removed the in progress label Jul 21, 2023