Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Runelookup to avoid testing regexps that have no chance to match #111

Merged
merged 9 commits into from
Sep 20, 2020

Conversation

ceymard
Copy link
Contributor

@ceymard ceymard commented Sep 17, 2020

Let's do it.

lexer/stateful/pattern.go Show resolved Hide resolved
lexer/stateful/pattern.go Outdated Show resolved Hide resolved
lexer/stateful/pattern.go Show resolved Hide resolved
lexer/stateful/pattern_test.go Show resolved Hide resolved
lexer/stateful/pattern_test.go Outdated Show resolved Hide resolved
lexer/stateful/stateful.go Outdated Show resolved Hide resolved
lexer/stateful/stateful.go Outdated Show resolved Hide resolved
lexer/stateful/stateful.go Show resolved Hide resolved
lexer/stateful/stateful.go Outdated Show resolved Hide resolved
lexer/stateful/stateful.go Outdated Show resolved Hide resolved
@alecthomas alecthomas mentioned this pull request Sep 18, 2020
19 tasks
@ceymard ceymard changed the base branch from master to v1 September 20, 2020 13:19
Copy link
Owner

@alecthomas alecthomas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's merge this in, but please see the comment re. FIXME's and fold that into the next PR. The only FIXME's that should remain in production code are for potential improvements, but never bugs or edge cases that users will run into. The rationale here is that in reality FIXME's never get fixed.

// - what possible first runes it would match on
// - what possible match size can be expected (possibly zero, exactly one, one or more)
// The possible match size is for now not utilized, but could potentially be used to not
// even launch a regexp match if the rune is found is in the input.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, thanks!

// since otherwise it would take too much memory.
const (
charclassSizeLimit = 4096
mayMatchZero = 0
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW for future reference this incrementing constants is solved by iota:

const (
  mayMatchZero = iota
  mayMatchOneOrMore
  matchesExactlyOne
)

(yes, the syntax is bizarre)

return []*syntax.Regexp{r}, matchesExactlyOne
}

panic("should not get here")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would typically also include contextual information in the panic, otherwise it can be very difficult to debug in "real life".

"regexp/syntax"
"testing"

"github.com/stretchr/testify/assert"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, somewhat counter-intuitively, assert.* functions do not fail the test immediately, they continue on. This is almost never what you want, but this behaviour is available in testify/require.


var rule = &rules[i]

// FIXME: what about one char rules ?
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these FIXME's matter? ie. will user's encounter them or not?

If they will, they should be fixed now, as (generally) FIXME's are never fixed. Ironic.

}
return lexer.Token{}, err // FIXME
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, plus this has no context. Fix what? Delete the FIXME if it's not a real issue.

@alecthomas alecthomas merged commit 7621a32 into alecthomas:v1 Sep 20, 2020
alecthomas added a commit that referenced this pull request Oct 31, 2020
…tch (#111)"

This reverts commit 7621a32.

--

I've reverted this for a couple of reasons. The first is just a bug that
could be fixed, in  that ordering of rules with the same prefix aren't
respected (eg. `${` vs. `$(`).

The second reason is that on reflection I think that Participle will be
better served by keeping the default dynamic implementation fairly
simple and investing most optimisations into code generation. To that
end, while this lookup does bring some improvement (~5%-10% from
memory), it is also quite complex.
alecthomas added a commit that referenced this pull request Nov 26, 2020
…tch (#111)"

This reverts commit 7621a32.

--

I've reverted this for a couple of reasons. The first is just a bug that
could be fixed, in  that ordering of rules with the same prefix aren't
respected (eg. `${` vs. `$(`).

The second reason is that on reflection I think that Participle will be
better served by keeping the default dynamic implementation fairly
simple and investing most optimisations into code generation. To that
end, while this lookup does bring some improvement (~5%-10% from
memory), it is also quite complex.
alecthomas added a commit that referenced this pull request Nov 26, 2020
…tch (#111)"

This reverts commit 7621a32.

--

I've reverted this for a couple of reasons. The first is just a bug that
could be fixed, in  that ordering of rules with the same prefix aren't
respected (eg. `${` vs. `$(`).

The second reason is that on reflection I think that Participle will be
better served by keeping the default dynamic implementation fairly
simple and investing most optimisations into code generation. To that
end, while this lookup does bring some improvement (~5%-10% from
memory), it is also quite complex.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants