Commit

Merge pull request #274 from didier-84/master
Fix sentence segmentation bug
camertron authored Apr 28, 2024
2 parents 65d0a66 + d9730a4 commit 4fcebf6
Showing 2 changed files with 8 additions and 1 deletion.
2 changes: 1 addition & 1 deletion lib/twitter_cldr/segmentation/rule_set.rb
@@ -31,7 +31,7 @@ def each_boundary(cursor, stop = cursor.length)

   until cursor.position >= stop || cursor.eos?
     state_machine.handle_next(cursor)
-    yield cursor.position if suppressions.should_break?(cursor)
+    yield cursor.position if cursor.eos? || suppressions.should_break?(cursor)
   end
 end
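
The effect of the added `cursor.eos?` check can be illustrated with a toy model. This is a minimal sketch, not the TwitterCldr implementation: `ToyCursor`, `toy_handle_next`, `toy_should_break?`, `toy_boundaries`, and the `EXCEPTIONS` list are hypothetical stand-ins, and the regex-based boundary search is an assumption standing in for the real state machine. It only shows how a loop of this shape can drop the final boundary when the text ends with a suppressed abbreviation followed by a space.

```ruby
# Toy model only: every name here is hypothetical and not part of TwitterCldr.
ToyCursor = Struct.new(:text, :position) do
  def eos?
    position >= text.length
  end
end

# Stand-in for the suppression (exception) list.
EXCEPTIONS = ["Mrs."].freeze

# Stand-in for state_machine.handle_next: move just past the next
# "period + whitespace" run, or to the end of the text if there is none.
def toy_handle_next(cursor)
  m = cursor.text.match(/\.\s+/, cursor.position)
  cursor.position = m ? m.end(0) : cursor.text.length
end

# Stand-in for suppressions.should_break?: refuse to break when the text
# before the candidate boundary ends with a listed exception.
def toy_should_break?(cursor)
  before = cursor.text[0...cursor.position].rstrip
  EXCEPTIONS.none? { |ex| before.end_with?(ex) }
end

# Same loop shape as each_boundary; the flag switches between the
# pre-fix condition (false) and the post-fix condition (true).
def toy_boundaries(text, always_break_at_eos:)
  cursor = ToyCursor.new(text, 0)
  boundaries = []
  until cursor.eos?
    toy_handle_next(cursor)
    if (always_break_at_eos && cursor.eos?) || toy_should_break?(cursor)
      boundaries << cursor.position
    end
  end
  boundaries
end

p toy_boundaries("I like the Mrs. ", always_break_at_eos: false) # => []
p toy_boundaries("I like the Mrs. ", always_break_at_eos: true)  # => [16]
```

In the toy model, the pre-fix condition yields no boundary at all for `"I like the Mrs. "`, so a caller slicing between boundaries would never see the final segment. With the end-of-string check, the last position is always reported, while mid-string suppressions still apply.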

7 changes: 7 additions & 0 deletions spec/segmentation/break_iterator_spec.rb
@@ -52,6 +52,13 @@
       "She's nice."
     ])
   end
+
+  it "splits correctly when a string ends with an exception directly followed by a single space" do
+    str = "I like the Mrs. "
+    expect(iterator.each_sentence(str).map { |word, _, _| word }).to eq([
+      "I like the Mrs. "
+    ])
+  end
 end
 
 context "without ULI exceptions" do