Does the parser special-treat relation subtypes? #83

msklvsk · 2018-10-12T17:30:50Z

Imagine the parser is trying to decide between rela:subtype1, rela:subtype2 and relb. Let them have probabilities 0.25, 0.2 and 0.3 respectively. Will UDPipe simply select relb or will it select rela:subtype1 because universal rela is more probable (0.25+0.2)?


foxik · 2018-10-13T11:14:28Z

UDPipe indeed simply selects relb; it has no knowledge of universal deprels. I am leaving this open to evaluate the possibility of aggregating over universal types first (with the upcoming UDPipe 2 it will be easy to do so).
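To make the two decision rules from the question concrete, here is a minimal sketch using the probabilities given above. It is not UDPipe code: the function names are hypothetical, and it assumes the parser exposes one probability per label, which is a simplification of what a transition-based parser actually scores.

```python
from collections import defaultdict


def universal_type(label):
    """Return the part of a deprel before the first colon."""
    return label.split(":", 1)[0]


def pick_full_label(probs):
    """Argmax over complete labels, treating each deprel as an opaque string."""
    return max(probs, key=probs.get)


def pick_by_universal_type(probs):
    """Hypothetical alternative: sum probabilities per universal type first,
    then return the most probable full label within the winning type."""
    totals = defaultdict(float)
    for label, p in probs.items():
        totals[universal_type(label)] += p
    best_type = max(totals, key=totals.get)
    candidates = {l: p for l, p in probs.items() if universal_type(l) == best_type}
    return max(candidates, key=candidates.get)


probs = {"rela:subtype1": 0.25, "rela:subtype2": 0.2, "relb": 0.3}
print(pick_full_label(probs))         # relb
print(pick_by_universal_type(probs))  # rela:subtype1
```

The first rule corresponds to the behaviour described in the answer; the second is only one possible way of aggregating over universal types.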

ryszardtuora · 2021-01-13T23:27:49Z

I have a related question. I know that the conll18_ud_eval.py script ignores relation subtypes, and I have been under the impression (I can't pinpoint where it came from) that UDPipe also does so during training, and thus does not treat rela:subtype1 as an error when the gold label is rela:subtype2. I can't find any mention of this on the website or in the article now; maybe I just assumed, for no reason, that UDPipe uses the conll18_ud_eval.py script to choose between models from different epochs. I'd appreciate your input on this, @foxik.

EDIT: To be more precise, I am concerned with all the stages of training, evaluation, and prediction, i.e. does the loss function ignore subtypes, does the evaluation process for choosing the best iteration ignore the subtypes, and does this apply in any way during the stage of prediction (this last question I think is answered by your last post here).

foxik · 2021-01-14T20:14:27Z

UDPipe 1 significantly pre-dates the conll18_ud_eval.py script, and it just considers the given deprels to be strings, without interpreting them in any way. Therefore, rela:subtype1 and rela:subtype2 are treated as different labels during training (loss computation and heldout data evaluation) and prediction (each is predicted independently).

There is, however, one place where the subtypes are ignored: when running udpipe --accuracy (https://ufal.mff.cuni.cz/udpipe/1/users-manual#udpipe_accuracy), which measures the accuracy of a given model. In that case, the subtypes really are ignored, so that the reported numbers match those of the conll17_ud_eval script.

This is a design choice -- we try to reconstruct whatever the user has given us. If you are interested only in deprels without subtypes, you can remove them from the training data :-)
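Removing the subtypes from the training data can be done with a few lines of preprocessing. The sketch below assumes the data is in standard CoNLL-U format (ten tab-separated columns, DEPREL in the eighth); the function name `strip_subtypes` is made up for this example. It leaves the DEPS column untouched.

```python
def strip_subtypes(conllu_text):
    """Drop everything after the first colon in the DEPREL column of CoNLL-U text."""
    out = []
    for line in conllu_text.split("\n"):
        cols = line.split("\t")
        # Comments, blank lines, and anything that is not a 10-column
        # token line are passed through unchanged.
        if line.startswith("#") or len(cols) != 10:
            out.append(line)
            continue
        cols[7] = cols[7].split(":", 1)[0]  # DEPREL is the 8th column
        out.append("\t".join(cols))
    return "\n".join(out)


sample = (
    "# text = Dogs bark\n"
    "1\tDogs\tdog\tNOUN\t_\t_\t2\tnsubj:pass\t_\t_\n"
    "2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\t_\n"
    "\n"
)
print(strip_subtypes(sample))
```

A model trained on the stripped file can only ever predict the universal relations, so this is suitable when the subtypes are not needed downstream.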

ryszardtuora · 2021-01-15T08:08:26Z

I see. In the meantime I have found this piece of code: github.com/ufal/udpipe/blob/master/src/model/evaluator.cpp, lines 252-256. I take it that this is used only when running udpipe --accuracy and does not affect training in any way. Am I right?

foxik · 2021-01-15T10:09:10Z

Exactly. During training, https://github.com/ufal/udpipe/blob/master/src/parsito/parser/parser_nn_trainer.cpp#L476 is used, which compares the whole deprels.
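The difference between the two comparisons discussed in this thread can be illustrated with a small labeled attachment score function. This is only a sketch and not the code from either linked file; the `las` helper and the toy data are assumptions made for this example.

```python
def las(gold, pred, ignore_subtypes=False):
    """Labeled attachment score over parallel lists of (head, deprel) pairs."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of tokens")

    def norm(deprel):
        # Optionally compare only the universal part of the relation.
        return deprel.split(":", 1)[0] if ignore_subtypes else deprel

    correct = sum(
        1
        for (g_head, g_rel), (p_head, p_rel) in zip(gold, pred)
        if g_head == p_head and norm(g_rel) == norm(p_rel)
    )
    return correct / len(gold)


gold = [(2, "rela:subtype2"), (0, "root"), (2, "relb")]
pred = [(2, "rela:subtype1"), (0, "root"), (2, "relb")]

print(las(gold, pred))                        # whole deprels: 2 of 3 correct
print(las(gold, pred, ignore_subtypes=True))  # subtypes ignored: 3 of 3 correct
```

With whole-deprel comparison, predicting rela:subtype1 for a gold rela:subtype2 counts as an error; with subtypes ignored, it does not.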
