Error in ocropus-econf with some k-options #45

Closed
zuphilip opened this issue Jun 15, 2015 · 2 comments · Fixed by #131

@zuphilip
Collaborator

The ocropus-econf script fails with an error when the comparison is restricted to letters or digits only, e.g.:

$ ocropus-econf -k digits output/*/*.gt.txt
Traceback (most recent call last):
  File "/usr/local/bin/ocropus-econf", line 59, in <module>
    outputs = sorted(list(outputs))
  File "/usr/local/lib/python2.7/dist-packages/ocrolib/common.py", line 560, in parallel_map
    result = fun(e)
  File "/usr/local/bin/ocropus-econf", line 50, in process1
    err,cs = edist.xlevenshtein(txt,gt,context=args.context)
  File "/usr/local/lib/python2.7/dist-packages/ocrolib/edist.py", line 43, in xlevenshtein
    cost = current[n]
UnboundLocalError: local variable 'current' referenced before assignment
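
For readers unfamiliar with this error: in Python, an UnboundLocalError of this kind typically appears when a local variable is assigned only inside a loop body and the loop runs zero times. The following is a minimal sketch of that general pattern, not the actual code from edist.py; the function name `last_value` is hypothetical and only serves as an illustration.

```python
def last_value(n_rows):
    """Return the value computed in the final loop iteration.

    Hypothetical illustration: `current` is only assigned inside the loop,
    so it stays unbound when the loop body never executes (n_rows == 0).
    """
    for i in range(1, n_rows + 1):
        current = i * 10  # assigned only if the loop runs at least once
    return current        # raises UnboundLocalError when n_rows == 0


if __name__ == "__main__":
    print(last_value(2))  # works: the loop ran
    try:
        last_value(0)     # the loop never ran
    except UnboundLocalError as exc:
        print("failed as expected:", exc)
```

If the real function follows this pattern, an empty input string would be enough to trigger the traceback above.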
@zuphilip
Collaborator Author

Okay, it seems that the function xlevenshtein(a,b,context=1) does not yet cover the boundary cases:

  1. a="", b="" --> I guess we can simply catch this case at the beginning and return the trivial result, i.e. return 0,[] (I am guessing).
  2. a!="", b="" --> Maybe the simplest fix would be to call xlevenshtein(b,a,context=1). The cost should then be the same, and we just have to swap each entry of the confusion matrix list.

Note that both cases mean that the ground truth is empty, which is normally not a reasonable assumption. However, if we restrict the comparison, for example to digits only with the -k option, then these cases have to be handled as well.
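
The two boundary cases above could be sketched as a thin wrapper around the distance routine. This is a sketch under assumptions, not the fix that was eventually merged: `with_boundary_cases` and `toy_dist` are hypothetical names, and `toy_dist` is only a stand-in that returns a cost and a list of (output, truth) pairs and, like the failing case, refuses an empty second argument. The exact shape of the confusion entries in ocrolib is assumed here to be a list of pairs.

```python
def with_boundary_cases(dist_fn):
    """Wrap dist_fn(a, b, **kw) -> (cost, pairs) so that an empty b is handled."""
    def wrapped(a, b, **kwargs):
        if not a and not b:
            # Case 1: both strings empty, nothing to compare.
            return 0, []
        if not b:
            # Case 2: swap the arguments, then swap each confusion entry back.
            cost, pairs = dist_fn(b, a, **kwargs)
            return cost, [(y, x) for (x, y) in pairs]
        return dist_fn(a, b, **kwargs)
    return wrapped


def toy_dist(a, b):
    """Hypothetical stand-in: plain Levenshtein cost, undefined for empty b."""
    if not b:
        raise ValueError("b must be non-empty")
    prev = list(range(len(a) + 1))
    for j, cb in enumerate(b, 1):
        cur = [j]
        for i, ca in enumerate(a, 1):
            cur.append(min(prev[i] + 1,                # skip a character of b
                           cur[i - 1] + 1,             # skip a character of a
                           prev[i - 1] + (ca != cb)))  # match or substitute
        prev = cur
    # Simplified confusion list: one pair for the whole string if it differs.
    pairs = [(a, b)] if a != b else []
    return prev[len(a)], pairs


safe_dist = with_boundary_cases(toy_dist)

if __name__ == "__main__":
    print(safe_dist("", ""))
    print(safe_dist("123", ""))
    print(safe_dist("12", "13"))
```

Keeping the checks in a wrapper leaves the core routine untouched; the same two checks could equally be placed at the top of the function itself.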

@zuphilip
Collaborator Author

The implementation in hocr-tools looks different and seems to handle exactly these cases: https://github.com/tmbdev/hocr-tools/blob/master/hocr-eval-lines#L36-L57
