Classification_report is going really slow for mode='strict' #62

eloukas · 2020-10-16T10:28:15Z

I have a dummy dataset on my local machine.
My sklearn token-level evaluation (strict mode on/off) and my seqeval entity-level evaluation (strict mode off) together run in 5 seconds. The seqeval entity-level evaluation with mode='strict', however, takes around 70 seconds, which is far too long.

Is there any way to speed it up? Perhaps the code could be optimized further?

I can't run experiments with more data on my AWS machine using mode='strict'.
Evaluation with mode='strict' takes longer than training the neural models.

Many thanks!

Operating System: Ubuntu 18 (LTS)
Python Version: 3.8
Package Version: 1.1.0


Hironsan · 2020-10-16T11:41:26Z

Profile:

unique_labels
extended_tokens
is_valid
Enum is slow
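For anyone who wants to produce a similar profile, here is a minimal sketch using the standard-library cProfile module. It is not the exact procedure used for this issue: `profile_call` and `slow_evaluation` are hypothetical names, and `slow_evaluation` is only a stand-in for the call being measured (in this thread, classification_report with mode='strict').

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def slow_evaluation(n):
    # Hypothetical stand-in workload; replace with the real evaluation call.
    return sum(i * i for i in range(n))


result, report = profile_call(slow_evaluation, 100_000)
print(report)
```

The report lists the functions with the largest cumulative time first, which is where names such as the ones above would show up.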

eloukas · 2020-10-16T12:01:19Z

What do you mean? Do you want me to report anything from my current program?
Thank you again.

Hironsan · 2020-10-16T12:09:05Z

I profiled the code and found that the items above take a long time to execute.
I'm now thinking about how to solve the problem.

By the way, how many samples did you try to evaluate?

eloukas · 2020-10-16T13:15:39Z

Oh, ok, cool!
In my dummy dataset, the number of gold/predicted tokens is 8721.
(They are all in a single list nested inside an outer list.)

In my big dataset in the cloud, I probably have a gazillion of these :)
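To make the input shape concrete, here is a rough sketch that builds dummy data with the same layout: one outer list containing a single sequence of 8721 tags. The `make_dummy_tags` helper, the PER/LOC entity types, and the IOB2 tagging scheme are all assumptions for illustration; the actual dataset and tag scheme are not shown in this thread.

```python
import random


def make_dummy_tags(n_tokens, entity_types=("PER", "LOC"), seed=0):
    """Generate a well-formed IOB2 tag sequence of length n_tokens (hypothetical helper)."""
    rng = random.Random(seed)
    tags = []
    while len(tags) < n_tokens:
        if rng.random() < 0.7:
            tags.append("O")
        else:
            # Emit an entity span of 1 to 3 tokens: one B- tag, then I- tags.
            etype = rng.choice(entity_types)
            span = rng.randint(1, 3)
            tags.append(f"B-{etype}")
            tags.extend([f"I-{etype}"] * (span - 1))
    # Truncating may shorten the last entity, but the sequence stays valid IOB2.
    return tags[:n_tokens]


# "A list of a single list": all 8721 tags live in one inner sequence.
y_true = [make_dummy_tags(8721)]
y_pred = [make_dummy_tags(8721, seed=1)]
print(len(y_true), len(y_true[0]))
```

Lists shaped like `y_true` and `y_pred` can then be passed to whichever evaluation call is being timed.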

Hironsan added the enhancement New feature or request label Oct 16, 2020

Hironsan mentioned this issue Oct 17, 2020

Enhancement/speed up #63

Merged

Hironsan closed this as completed in #63 Oct 17, 2020
