-
Notifications
You must be signed in to change notification settings - Fork 0
/
TODO
97 lines (89 loc) · 7.32 KB
/
TODO
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
high priority:
- check reconstructed edges for root condition! currently we allow the user to "accept" a tree whose AVM is not compatible with a root condition. (a subsuming edge was root, at least).
- just now, it seems that using the default list of roots stored in the grammar image and offering a commandline option, the way ACE does, isn't so bad. users should be aware, though.
- idiom test too.
- write inferred discriminants
- on update, line up gold discriminants with current ones and set values where matches exist (instead of assuming they all are relevant still)
medium priority:
- show MRS before accepting
- verify local unifiability of twigs before offering them
- allow user to hide gold decisions, inferred decisions, when they aren't relevant
- add reset button -- remove all new decisions, turn all gold decisions to "off", set t-active=-1, ?clear comment?
- in case a span is selected and no discriminants are relevant *and* a unique tree is not available [yet] (so nothing is displayed), tell dan what we *do* know about that span -- one of:
- this can't be a constituent
- this *will* be a constituent, and here's the unary chain for it -- and maybe more of the partially determined tree?
lower priority:
- show old-style discriminants in addition to twigs
- allow rejection of discriminants
- show the derivation tree involved in reconstruction failures
- allow acceptance of all the remaining trees...?
- don't clobber results file ; if gold selected tree is not present, append it. otherwise noop.
- when writing to result files with 500 trees in them, updates get really slow. compute the whole new result file in memory and then write it out once, instead of per item?
unknown priority:
- color code for: overconstrained but no active gold
- when multiple active gold trees exist, color coding becomes murky...
- currently I compare to the first active tree and either show dark green or light blue.
- should maybe be golden brown if multiple trees remain, or dark green/auto-accept if just one
strange bug:
ws01 item 10030770 has 2 remaining trees (with all gold on), but no discriminants -- and all the lexemes are unambiguous.
in that same item, and also in 10030700, there is another UI glitch where hyphenated words are being displayed incorrectly
trec item 642 has 2 remaining trees but no discriminants
- "at" can be at_date_p or at_temp -- two lexemes with different COMPKEYs but identical lexical types and identical predications
ws01 item 10011280 claims 3 remaining trees, but two of them don't reconstruct
things that would be *nice*:
various UI enhancements
[ ] selective unpacking for verification of at least 1 valid parse, inspection of its probability, and possible acceptance thereof
[ ] possible implementation: expand the current 'unary closure' forest expansion operation to also include a 'context' field for up to N grandparents and also the choice of which packed representative of the immediate daughters to use. this would increase the in-memory forest size by a nontrivial factor, but would (maybe?) still be manageable...
- plenty of advantages of this:
- selective unpacking becomes very easy: best hypothesis for edge E is just ranked by its local score plus its daughters' best candidate scores.
- feature forest-based maxent training becomes within easy reach... if we can store the feature forests for all the training items in memory. heh. at least we could extract the feature forests themselves.
- the local context is large enough that some latent unification failures are likely to show up if they are locally replayed, allowing pruning of some of these edges
[ ] allow selection of old-style discriminants (i.e. break down unary chains), optionally
[ ] sentence-picker screen could show additional info about each item, e.g. number of parses, number of choices made, remaining ambiguity
[ ] a way to see an example [sub]tree before picking a discriminant
-- also perhaps an easy way to pull up the documentation for various rules/types
[ ] overlay/display of other annotation types (ILG, PTB)
[ ] write the labeled tree of the prefered unpacking to 'result'
improvements for special use cases:
- display various types of prior annotations:
- Emily would like to be able to view Inter-Linear Glosses along with the sentence being treebanked
- Stephan would like to be able to view an alternate tree (e.g. PTB) for the same sentence
done:
+ on update, write t-active = -1 explicitly for all items that fail to update
+ token avm strings in derivations need real IDs before them (slightly hard because 'struct tree' doesn't record those...)
+ allow rejection without narrowing to a single tree
+ when reporting failed reconstruction, offer the failed rule name, argument position, path and types. also preferably show the derivation tree.
+ when such an error occurs, force it to be hilighted and displayed to the user, instead of waiting for them to navigate to the span that would display it
+ whenever a new span becomes underlined in annotation, check that the newly-unique tree for that span is actually unifiable.
+ nothing should be saved to the profile until 'accept' or 'reject' is pressed, perhaps -- including decisions, comments, trees, whatnot
+ don't save comment until accept or reject is pressed
+ when rejecting an item, don't save the decisions leading up to that
+ UI to select a gold profile for an update operation (done via itsdb illusion of integration)
+ save decisions; preferably in real-time as they are made, so you can return to your session from another browser or after the server is reset
[ currently done in real-time, but oe suggests only on 'accept'/'reject'. ]
+ (fixed now:) updating from gold -- when update fails, the gold discriminants aren't actually showing up when we go in to manually fix it -- (reason was: gold path wasn't being taken from --gold for /parse? even though it was for auto-update)
+ when --items is passed and an update is requested, pay attention
+ add Exit button to auto update page
+ stephan wants -1's for all integer fields
+ also got the dates
+ commandline revision: --browser --items --gold --auto
+ exit button
+ comment editor
+ implement behavior of --items
+ write nothing instead of 0 when skipping an item
+ UI shows status of each item, both in annotator view and list view: accepted (>=1 t-active), rejected (0 t-active), unannotated (-1 t-active)
+ when saving, write 'decision', 'preference', 'tree' and 'result', including the derivation tree and the mrs of the preferred result
+ keep a cache of TSDB profiles; much faster than loading everytime we change sentences
+ apply gold inferred discriminants in addition to gold manual discriminants
+ pre-parse sentences and store their forests, to avoid having to spend time parsing while user is on-line
+ make sure we are supporting all the relevant old-style discriminants
+ hilight sentence span when mouseover a decision
+ speed up SVG rendering
+ use parse-node labels for tree drawing
+ display items with ambiguous tokenization properly
+ UI to allow inclusion of *some* but not *all* the old discriminants
+ item id display -- part of a prev/next/list interface
+ show parse tree when disambiguated (also for substrings when requested)
+ option to show *all* remaining discriminants (or, say, up to 20)
+ show lexical types instead of lexeme names
+ enable use of lexical type constraints from old tsdb decisions