safe_load with implicit_resolver slows down #439
Comments
That's a neat bug. It seems to be triggered by the combination of using Not a fix, obviously, but
The problem seems to be here: Lines 145 to 152 in d0d660d
    # value = "foo"
    else:
        # value[0] = "f"
        resolvers = self.yaml_implicit_resolvers.get(value[0], [])
        # resolvers = [('tag:yaml.org,2002:bool', re.compile('^(?:yes|Yes|YES|no|No|NO\n |true|True|TRUE|false|False|FALSE\n |on|On|ON|off|Off|OFF)$', re.VERBOSE))]
    # XXX This is mutating yaml.SafeLoader.yaml_implicit_resolvers['f']
    resolvers += self.yaml_implicit_resolvers.get(None, [])
    # self.yaml_implicit_resolvers.get(None) returns the custom implicit resolver
    # The longer the list gets, the slower it gets
    for tag, regexp in resolvers:
        if regexp.match(value):
            return tag
Every time config:
foo: bar
bars:
- foo1: bar1
foo2: bar2
baz:
foo3: bar3
foo6:
foo7: bar7
Repeated calls to `resolve` can experience performance degradation if `add_implicit_resolver` has been called with `first=None` (to add an implicit resolver with an unspecified first character). For example, every time `foo` is encountered, the "wildcard implicit resolvers" (those registered with `first=None`) are appended to the list of implicit resolvers for strings starting with `f`, which normally holds only the resolver for booleans. The list `yaml_implicit_resolvers['f']` therefore keeps getting longer. The same happens for any value whose first character matches an existing implicit resolver. This change avoids unintentionally mutating the lists in the class-level dict `yaml_implicit_resolvers` by looping through a temporary copy. Fixes: yaml#439
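The mechanism described above can be reproduced without PyYAML. The following is a minimal sketch, not the library's code or the actual patch: `ToyResolver`, its `implicit_resolvers` dict, the simplified regexes, and the helper `shared_list_length_after` are all hypothetical stand-ins. It only illustrates why `+=` on a list fetched from a class-level dict grows that list on every call, and how copying first avoids it.

```python
import re


class ToyResolver:
    # Hypothetical stand-in for a class-level resolver registry, keyed by
    # first character. The None key holds "wildcard" resolvers (first=None).
    implicit_resolvers = {
        "f": [("bool", re.compile(r"^(?:false|False|FALSE)$"))],
        None: [("custom", re.compile(r"^\d+-\d+$"))],
    }

    def resolve_buggy(self, value):
        resolvers = self.implicit_resolvers.get(value[0], [])
        # In-place extend: this mutates the shared list stored under value[0].
        resolvers += self.implicit_resolvers.get(None, [])
        return self._first_match(resolvers, value)

    def resolve_fixed(self, value):
        # Copy first, so the class-level list is left untouched.
        resolvers = self.implicit_resolvers.get(value[0], [])[:]
        resolvers += self.implicit_resolvers.get(None, [])
        return self._first_match(resolvers, value)

    @staticmethod
    def _first_match(resolvers, value):
        for tag, regexp in resolvers:
            if regexp.match(value):
                return tag
        return "str"


def shared_list_length_after(method_name, calls=1000):
    # Reset the shared state so each run starts from the same registry.
    ToyResolver.implicit_resolvers["f"] = [
        ("bool", re.compile(r"^(?:false|False|FALSE)$"))
    ]
    resolver = ToyResolver()
    for _ in range(calls):
        getattr(resolver, method_name)("foo")
    return len(ToyResolver.implicit_resolvers["f"])


buggy_len = shared_list_length_after("resolve_buggy")
fixed_len = shared_list_length_after("resolve_fixed")
print(buggy_len, fixed_len)
```

In the buggy variant the shared list gains one wildcard entry per call, so every later lookup for a value starting with `f` has more regexes to try. The fixed variant pays for a small copy on each call but leaves the registry at its original size.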
Every time safe_load is called, it is slower than the previous call if an implicit resolver is used.
Without implicit resolvers, safe_load() takes the same time on every call.
pyyaml versions: '5.1.2', '5.3.1'
python 3.6.5, ubuntu 18.04
My output (code below, test data attached):
sample.txt