rs3->dis/svg conversion can't handle Szeryng example text #6

Open
arne-cl opened this issue Feb 23, 2021 · 4 comments


arne-cl commented Feb 23, 2021

feng-hirst-2014-result.rs3.txt

curl -XPOST localhost:9150/convert/rs3/dis -F input=@feng-hirst-2014-result.rs3.txt
{"error":"<class 'discoursegraphs.readwrite.rst.rs3.rs3tree.RSTTree'> can't handle input file 'feng-hirst-2014-result.rs3.txt'. Got: ","traceback":"Traceback (most recent call last):\n  File \"app.py\", line 113, in post\n    tree = read_function(temp_inputfile.name)\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 57, in __init__\n    self.tree = self.dt()\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 117, in dt\n    return self.root2tree(start_node=start_node)\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 140, in root2tree\n    return self.dt(start_node=root_nodes[0])\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 134, in dt\n    elem_id, elem, elem_type, start_node=start_node)\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 231, in group2tree\n    return self.dt(start_node=child_id)\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 134, in dt\n    elem_id, elem, elem_type, start_node=start_node)\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 245, in group2tree\n    sat_subtree = self.dt(start_node=sat_id)\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 134, in dt\n    elem_id, elem, elem_type, start_node=start_node)\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 174, in group2tree\n    subtree = self.dt(start_node=subtree_id)\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 134, in dt\n    elem_id, elem, elem_type, start_node=start_node)\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 247, in group2tree\n    nuc_subtree = self.dt(start_node=children['nucleus'])\n  File 
\"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 134, in dt\n    elem_id, elem, elem_type, start_node=start_node)\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 245, in group2tree\n    sat_subtree = self.dt(start_node=sat_id)\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 134, in dt\n    elem_id, elem, elem_type, start_node=start_node)\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 174, in group2tree\n    subtree = self.dt(start_node=subtree_id)\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 134, in dt\n    elem_id, elem, elem_type, start_node=start_node)\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 217, in group2tree\n    for child_id in other_child_ids]\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 134, in dt\n    elem_id, elem, elem_type, start_node=start_node)\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 174, in group2tree\n    subtree = self.dt(start_node=subtree_id)\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 134, in dt\n    elem_id, elem, elem_type, start_node=start_node)\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 245, in group2tree\n    sat_subtree = self.dt(start_node=sat_id)\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 134, in dt\n    elem_id, elem, elem_type, start_node=start_node)\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 178, in group2tree\n    for c in self.child_dict[elem_id]]\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 134, in dt\n    elem_id, elem, elem_type, start_node=start_node)\n  File 
\"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 174, in group2tree\n    subtree = self.dt(start_node=subtree_id)\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 134, in dt\n    elem_id, elem, elem_type, start_node=start_node)\n  File \"/opt/discoursegraphs/src/discoursegraphs/readwrite/rst/rs3/rs3tree.py\", line 266, in group2tree\n    assert len(children['nucleus']) == 1\nAssertionError\n"}
arne-cl added the bug label Feb 23, 2021

arne-cl commented Mar 14, 2021

Problem: group2tree asserts len(children['nucleus']) == 1, but here the element has two nucleus children:

>>> children
defaultdict(<type 'list'>, {'satellite': ['99'], 'nucleus': ['95', '97']})
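
To make this failure easier to diagnose, the bare assert could be replaced by a check that reports the offending IDs. This is only a sketch: `require_single_nucleus` is a hypothetical helper, not part of discoursegraphs, and `'PARENT'` is a placeholder because the ID of the parent element is not shown above.

```python
from collections import defaultdict


def require_single_nucleus(elem_id, children):
    """Return the only nucleus ID of an element, or raise a descriptive error.

    `children` maps 'nucleus' / 'satellite' to lists of child IDs,
    like the dict shown in the debugger output above.
    """
    nuclei = children.get('nucleus', [])
    satellites = children.get('satellite', [])
    if len(nuclei) != 1:
        raise ValueError(
            "element %s: expected exactly 1 nucleus, found %d (%s); "
            "satellites: %s"
            % (elem_id, len(nuclei), ', '.join(nuclei), ', '.join(satellites))
        )
    return nuclei[0]


# The values observed in this issue; 'PARENT' is a placeholder ID.
children = defaultdict(list, {'satellite': ['99'], 'nucleus': ['95', '97']})
try:
    require_single_nucleus('PARENT', children)
except ValueError as err:
    print(err)
```

With a message like this, the traceback would name the two nuclei directly instead of ending in an empty AssertionError.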


arne-cl commented Mar 14, 2021

minimal input:

Szeryng subsequently focused on teaching before resuming his concert career in 1954.

The "Le Duc" was the instrument on which he performed and recorded mostly, while the latter ("King David" Strad) was donated to the State of Israel.

feng-hirst-2014 output:

ParseTree('Elaboration[N][S]', [ParseTree('Temporal[N][S]', ['Szeryng subsequently focused on teaching', 'before resuming his concert career in 1954 .']), ParseTree('Elaboration[N][S]', ['The " Le Duc " was the instrument', ParseTree('Temporal[N][N]', ['on which he performed and recorded mostly ,', ParseTree('same-unit[N][N]', [ParseTree('same-unit[N][N]', [ParseTree('Elaboration[N][S]', ['while the latter', '( " King David "']), 'Strad )']), 'was donated to the State of Israel .'])])])])

feng-hirst-2014-result-minimal.rs3.txt
feng-hirst-converter-fail


arne-cl commented Mar 15, 2021

If we convert the original feng-hirst-2014 parser output to a tree,

$ curl -X POST -F "input=@feng-hirst-2014-result-minimal.fh2014" http://localhost:5000/convert/hilda/tree.prettyprint
                                                               Elaboration
                     _______________________________________________|________________
                    |                                                                S
                    |                                                                |
                    |                                                           Elaboration
                    |                              __________________________________|____________________________
                    |                             |                                                               S
                    |                             |                                                               |
                    |                             |                                                            Temporal
                    |                             |                  _____________________________________________|_____________
                    |                             |                 |                                                           N
                    |                             |                 |                                                           |
                    |                             |                 |                                                       same-unit
                    |                             |                 |                                              _____________|____________________
                    |                             |                 |                                             N                                  |
                    |                             |                 |                                             |                                  |
                    |                             |                 |                                         same-unit                              |
                    |                             |                 |                                _____________|______________________            |
                    N                             |                 |                               N                                    |           |
                    |                             |                 |                               |                                    |           |
                 Temporal                         |                 |                          Elaboration                               |           |
        ____________|____________                 |                 |                 ______________|_____________                       |           |
       N                         S                N                 N                N                            S                      N           N
       |                         |                |                 |                |                            |                      |           |
Szeryng subseque          before resuming  The " Le Duc "      on which he    while the latter             ( " King David "           Strad ) was donated to
ntly focused on             his concert    was the instrume   performed and                                                                    the State of
    teaching              career in 1954 .        nt        recorded mostly ,                                                                     Israel .

we see that it has two Temporal relations with different nuclearity:
The first one is (N: ... focused on teaching, S: before resuming ...).
The second one is (N: on which he performed ... mostly, N: while the latter ... was donated).
The "while" should not be interpreted in a temporal sense, but that's probably not the issue here.
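
The mixed nuclearity can also be read directly off the parser output without drawing the tree. A rough sketch, assuming the labels always have the form `Relation[X][Y]` with X and Y being N or S, as in the output above; `nuclearity_by_relation` is a hypothetical helper name:

```python
import re
from collections import defaultdict

# feng-hirst-2014 output for the minimal input, copied from this issue.
PARSE = """ParseTree('Elaboration[N][S]', [ParseTree('Temporal[N][S]', ['Szeryng subsequently focused on teaching', 'before resuming his concert career in 1954 .']), ParseTree('Elaboration[N][S]', ['The " Le Duc " was the instrument', ParseTree('Temporal[N][N]', ['on which he performed and recorded mostly ,', ParseTree('same-unit[N][N]', [ParseTree('same-unit[N][N]', [ParseTree('Elaboration[N][S]', ['while the latter', '( " King David "']), 'Strad )']), 'was donated to the State of Israel .'])])])])"""

# Matches e.g. ParseTree('Temporal[N][S]' and captures relation, left, right.
LABEL = re.compile(r"ParseTree\('([^'\[]+)\[([NS])\]\[([NS])\]'")


def nuclearity_by_relation(parse_str):
    """Collect the (left, right) nuclearity pairs seen for each relation name."""
    found = defaultdict(list)
    for rel, left, right in LABEL.findall(parse_str):
        found[rel].append((left, right))
    return dict(found)


pairs = nuclearity_by_relation(PARSE)
# Keep only relations that occur with more than one nuclearity pattern.
mixed = {rel: p for rel, p in pairs.items() if len(set(p)) > 1}
print(mixed)  # {'Temporal': [('N', 'S'), ('N', 'N')]}
```

Only Temporal shows up with both patterns here; Elaboration is always N-S and same-unit is always N-N.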


arne-cl commented Mar 15, 2021

The rs3 file has some oddities: the <relations> section lists the same relation names in several casings/spellings,
while in the <body> section we only find Temporal (and not temporal):

    <relations>
      <rel name="Manner-Means" type="rst"/>
      <rel name="Mannermeans" type="rst"/>
      ...
      <rel name="Same-Unit" type="multinuc"/>
      <rel name="Same-unit" type="multinuc"/>
      <rel name="Temporal" type="multinuc"/>
      <rel name="same-unit" type="multinuc"/>
      <rel name="same_unit" type="multinuc"/>
      <rel name="temporal" type="rst"/>
    </relations>
...
  <body>
    <segment id="5" parent="3" relname="Temporal">Szeryng subsequently focused on teaching</segment>
    <segment id="7" parent="3" relname="Temporal">before resuming his concert career in 1954 .</segment>
    <segment id="15" parent="13" relname="Temporal">on which he performed and recorded mostly ,</segment>
    <group id="17" type="multinuc" parent="13" relname="Temporal"/>
  </body>
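
A quick way to list such inconsistencies is to normalize the relation names and look for names declared with more than one type. This is a sketch using only the standard library; `normalize` and `conflicting_relation_types` are hypothetical helpers, the normalization rule (lowercase, drop hyphens/underscores) is an assumption, and the header below is the excerpt from above with the elided lines left out.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Excerpt of the <relations> section shown above (elided entries omitted).
RS3_RELATIONS = """<relations>
  <rel name="Manner-Means" type="rst"/>
  <rel name="Mannermeans" type="rst"/>
  <rel name="Same-Unit" type="multinuc"/>
  <rel name="Same-unit" type="multinuc"/>
  <rel name="Temporal" type="multinuc"/>
  <rel name="same-unit" type="multinuc"/>
  <rel name="same_unit" type="multinuc"/>
  <rel name="temporal" type="rst"/>
</relations>"""


def normalize(name):
    """Lowercase a relation name and drop hyphens, underscores and spaces."""
    return re.sub(r'[-_\s]', '', name).lower()


def conflicting_relation_types(relations_xml):
    """Map normalized relation names to their types if more than one is declared."""
    root = ET.fromstring(relations_xml)
    types = defaultdict(set)
    for rel in root.iter('rel'):
        types[normalize(rel.get('name'))].add(rel.get('type'))
    return {name: sorted(kinds) for name, kinds in types.items() if len(kinds) > 1}


print(conflicting_relation_types(RS3_RELATIONS))
# {'temporal': ['multinuc', 'rst']}
```

In this excerpt only Temporal/temporal is declared as both multinuc and rst. Since the body uses the multinuc spelling even for the N-S pair (segments 5 and 7), that may be related to the extra nucleus, but that is a guess and not confirmed.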
