0.6.0: Introducing RE2::Scanner
Scanning
Thanks to a suggestion from Matthias Kadenbach, re2 now contains an API for incrementally scanning a string for matches. To use it, call scan on an instance of RE2::Regexp with the string you want to search:
scanner = RE2('(\d+)').scan("Some 1 long 23 string 4 containing 567 numbers")
scanner.scan #=> ["1"]
scanner.scan #=> ["23"]
The scanner in the example above is an instance of RE2::Scanner, whose main method, scan, returns the next match. Once no more matches are found, scan will return nil. You can use rewind to reset a scanner back to the beginning of the string.
The RE2::Scanner class also implements Ruby's Enumerator interface so you can call each and to_enum on it:
scanner = RE2('(\d+)').scan("Some 1 long 23 string 4 containing 567 numbers")
scanner.each do |match|
  puts match
end
No more in-place replacement
This release removes methods that previously altered strings in-place. This means re2_sub! and re2_gsub! are gone and RE2.Replace and RE2.GlobalReplace now return new strings rather than modifying their input.
Encoding awareness
Again, thanks to a bug report by Matthias Kadenbach: in Ruby 1.9 and later, re2 will now set the correct encoding for strings.
m = RE2('(\w+)', :utf8 => true).match("foo")
m[1].encoding # => #<Encoding:UTF-8>