Hi folks,
I like this cool segmenter for quality and speed, but something is a bit weird.
from syntok.segmenter import analyze

text = '''Alexandri Aetoli Testimonia et Fragmenta. Studi e Testi 15. (1999)'''

for p in analyze(text):
    for s in p:
        print(' '.join(str(t) for t in s))
I got:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-15-1217f364130d> in <module>
1 for p in analyze(text):
----> 2 for s in p:
3 print(' '.join(str(t) for t in s))
4
~/Codebase/toolchain/__pypackages__/3.9/lib/syntok/segmenter.py in segment(tokens, bracket_skip_len)
106 State.max_bracket_skipping_length = int(bracket_skip_len)
107
--> 108 for state in Begin(tokens):
109 if state.at_sentence:
110 history = state.collect_history()
~/Codebase/toolchain/__pypackages__/3.9/lib/syntok/_segmentation_states.py in __iter__(self)
128 while state is not None:
129 yield state
--> 130 state = next(state, None)
131
132 @abstractmethod
~/Codebase/toolchain/__pypackages__/3.9/lib/syntok/_segmentation_states.py in __next__(self)
468 return Terminal(self._stream, self._queue, self._history)
469
--> 470 self._move() # Do not skip parenthesis if they open the sentence.
471
472 if self.next_is_a_terminal:
~/Codebase/toolchain/__pypackages__/3.9/lib/syntok/_segmentation_states.py in _move(self)
324 def _move(self) -> bool:
325 """Advance the queue, storing the old value in history."""
--> 326 self.__history.append(self.__queue.pop(0))
327
328 if not self.__queue:
IndexError: pop from empty list
Can anyone help me with this?
Looks like a regression from my latest update on handling parentheses. Your phrase probably needs to be converted to a test case, analyzed, and fixed. Can you confirm whether any 1.3 version works?
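To turn the reported phrase into a test case, something along these lines might work. This is only a sketch, not the test that ended up in the syntok repository: `segment_to_strings` and `check_trailing_parenthesis` are hypothetical helper names, and the analyzer is passed in as a parameter so the check does not depend on a particular syntok version.

```python
from typing import Callable, Iterable, List

# The input from this report: a parenthesized token at the very end of the text.
REPORTED_TEXT = "Alexandri Aetoli Testimonia et Fragmenta. Studi e Testi 15. (1999)"


def segment_to_strings(text: str, analyze: Callable[[str], Iterable]) -> List[str]:
    """Run an analyzer and render each sentence as space-joined token text."""
    return [
        " ".join(str(token).strip() for token in sentence)
        for paragraph in analyze(text)
        for sentence in paragraph
    ]


def check_trailing_parenthesis(analyze: Callable[[str], Iterable]) -> List[str]:
    """Fail if the analyzer raises IndexError or yields nothing for the reported text."""
    try:
        # The analyzer may be lazy, so iteration must happen inside the try block.
        sentences = segment_to_strings(REPORTED_TEXT, analyze)
    except IndexError as error:
        raise AssertionError(f"segmenter crashed on trailing parenthesis: {error}")
    assert sentences, "expected at least one sentence"
    return sentences


# Usage (assumes syntok is installed; analyze is the function used in the report):
#   from syntok.segmenter import analyze
#   check_trailing_parenthesis(analyze)
```

The check deliberately asserts only that segmentation completes and returns something, since the exact sentence boundaries are a separate question from the crash.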
➜ test pdm run python test.py
Alexandri Aetoli Testimonia et Fragmenta .
Studi e Testi 15 .
( 1999 )
➜ test pdm list --freeze
regex==2022.1.18
syntok==1.3.3
fnl changed the title from "Simple case failed" to "Parenthesis at the end of input cause IndexError" on Jan 30, 2022
This was a regression introduced by 1.4.1.
Thank you for pointing out the issue and helping in its review, @windreamer.
The issue is fixed in the latest release v1.4.2.
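For anyone unsure whether their environment has the affected release, a quick check could look like the sketch below. It assumes, based only on this thread, that 1.4.1 is the single affected version, and that syntok version strings are plain numeric `X.Y.Z`; `parse_release`, `is_affected`, and `installed_syntok_version` are hypothetical helper names.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple

# Release that introduced the regression, per this thread (assumed to be the only one).
AFFECTED = (1, 4, 1)


def parse_release(text: str) -> Tuple[int, ...]:
    """Parse a plain 'X.Y.Z' version string; assumes purely numeric components."""
    return tuple(int(part) for part in text.split("."))


def is_affected(text: str) -> bool:
    """Return True if the given version string is the release with the regression."""
    return parse_release(text) == AFFECTED


def installed_syntok_version() -> Optional[str]:
    """Return the installed syntok version, or None if syntok is not installed."""
    try:
        return version("syntok")
    except PackageNotFoundError:
        return None


# Usage:
#   found = installed_syntok_version()
#   if found is not None and is_affected(found):
#       print("Upgrade syntok to 1.4.2 or later")
```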