Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve implementation of rstrip and lstrip #22496

Merged
merged 8 commits into from
Jun 26, 2017

Conversation

bkamins
Copy link
Member

@bkamins bkamins commented Jun 23, 2017

I propose to consider the following:

  • change lstrip and rstrip to use SubString instead of allocating a new string;
  • removing chomp! without deprecation period; it is not exported and falls back to chomp anyway;
  • change readchomp to use chomp instead of chomp!.

bkamins added 3 commits June 23, 2017 23:17
`chomp` uses `SubString` so `chomp!` is not needed to ensure that copying of big string is not performed.
`chomp!` was only used in `readchomp` and I propose to replace it with `chomp` there. As `chomp!` was not exported I understand that it can be removed without deprecation.
In this way copying of string is avoided, and in `strip` currently the string is actually copied twice. A basic benchmark is given below:
```
julia> using BenchmarkTools

julia> s = " "^10*"x"^1000*" "^10;

julia> @benchmark strip(s) # current definition
BenchmarkTools.Trial:
  memory estimate:  2.13 KiB
  allocs estimate:  2
  --------------
  minimum time:     502.056 ns (0.00% GC)
  median time:      537.579 ns (0.00% GC)
  mean time:        631.383 ns (5.74% GC)
  maximum time:     6.003 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     197

julia> @benchmark strip(s) # with SubString definition
BenchmarkTools.Trial:
  memory estimate:  64 bytes
  allocs estimate:  2
  --------------
  minimum time:     334.062 ns (0.00% GC)
  median time:      339.819 ns (0.00% GC)
  mean time:        362.150 ns (1.00% GC)
  maximum time:     8.236 μs (93.19% GC)
  --------------
  samples:          10000
  evals/sample:     243
```
@kshyatt kshyatt added the strings "Strings!" label Jun 24, 2017
@nalimilan
Copy link
Member

The change to lstrip and rstrip sounds good to me, as it's consistent with what #18339 did for chomp. The removal of chomp! is a more complex issue: I think it was disabled by @JeffBezanson when changing the String representation, and the fact that he didn't completely remove the function probably means he plans to add it back at some point when it's possible again. So I would split the PR into two parts in the interest of merging the first one faster.

@nalimilan
Copy link
Member

BTW, the test failure is due to master being broken (fix in #22506).

@KristofferC KristofferC reopened this Jun 24, 2017
@tkelman
Copy link
Contributor

tkelman commented Jun 24, 2017

It looks like out of all registered packages, Compat might be the only one that is using Base.chomp!

@bkamins
Copy link
Member Author

bkamins commented Jun 24, 2017

I have updated the PR to keep chomp!. However, personally I feel that it breaks a fundamental invariant and should be removed. In readchomp I have retained change from chomp! to chomp.

base/io.jl Outdated
"""
readchomp(x) = chomp!(readstring(x))
readchomp(x) = chomp(readstring(x))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leave this out too since that's unrelated to the other change. Either we should remove chomp! and stop using it everywhere, or we should keep the code as-is.

@bkamins bkamins changed the title Improve implementation of rstrip and lstrip, remove chomp! Improve implementation of rstrip and lstrip Jun 25, 2017
@bkamins
Copy link
Member Author

bkamins commented Jun 25, 2017

OK. Leaving only lstrip and rstrip fix.

@KristofferC
Copy link
Member

Perhaps deserves a NEWS-entry?

Copy link
Member

@nalimilan nalimilan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's actually a trailing space in NEWS.md, which explains the CI failure. Otherwise, looks good.

@tkelman
Copy link
Contributor

tkelman commented Jun 25, 2017

should be squashed, on merge if not before

@KristofferC KristofferC merged commit 36379d2 into JuliaLang:master Jun 26, 2017
DrTodd13 pushed a commit to IntelLabs/julia that referenced this pull request Jun 26, 2017
* change chomp! to chomp in readchomp

`chomp` uses `SubString` so `chomp!` is not needed to ensure that copying of big string is not performed.

* Remove chomp! function

`chomp!` was only used in `readchomp` and I propose to replace it with `chomp` there. As `chomp!` was not exported I understand that it can be removed without deprecation.

* Make rstrip and lstrip use SubString

In this way copying of string is avoided, and in `strip` currently the string is actually copied twice. A basic benchmark is given below:
```
julia> using BenchmarkTools

julia> s = " "^10*"x"^1000*" "^10;

julia> @benchmark strip(s) # current definition
BenchmarkTools.Trial:
  memory estimate:  2.13 KiB
  allocs estimate:  2
  --------------
  minimum time:     502.056 ns (0.00% GC)
  median time:      537.579 ns (0.00% GC)
  mean time:        631.383 ns (5.74% GC)
  maximum time:     6.003 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     197

julia> @benchmark strip(s) # with SubString definition
BenchmarkTools.Trial:
  memory estimate:  64 bytes
  allocs estimate:  2
  --------------
  minimum time:     334.062 ns (0.00% GC)
  median time:      339.819 ns (0.00% GC)
  mean time:        362.150 ns (1.00% GC)
  maximum time:     8.236 μs (93.19% GC)
  --------------
  samples:          10000
  evals/sample:     243
```

* Correct strip tests

* Bring back chomp!

* reverse removal of chomp!

* Add NEWS entry for lstrip and rstrip.

* remove trailing whitespace in NEWS.md
tlnagy added a commit to GiovineItalia/Gadfly.jl that referenced this pull request Oct 19, 2018
strip on 1.0 returns a substring now (see JuliaLang/julia#22496).
It wasn't ever very necessary anyway.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
strings "Strings!"
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants