fix case of special tokens in encoder #19

benlipkin · 2022-03-02T20:09:16Z

Special tokens added by the tokenizer cause off-by-one errors when indices are used to extract sentence representations from a context.

…urrent stim representation. in partial fulfilment of #19 (excludes special tokens in individual stim length computation, but includes them in the overall context anyway, so extraction is still affected)
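A minimal sketch of the remaining off-by-one, not the library's actual code: per-stimulus lengths are computed without special tokens, but the tokenized context still starts with one, so every slice is shifted. The whitespace tokenizer `toy_tokenize` and the `<bos>` token are hypothetical stand-ins for a real tokenizer.

```python
BOS = "<bos>"  # hypothetical special token; real tokenizers define their own


def toy_tokenize(text, add_special_tokens=False):
    """Whitespace tokenizer standing in for a real one (an assumption)."""
    tokens = text.split()
    return [BOS, *tokens] if add_special_tokens else tokens


stimuli = ["the cat sat", "on the mat"]

# The full context is tokenized once, with the special token included.
context_tokens = toy_tokenize(" ".join(stimuli), add_special_tokens=True)

# Stimulus lengths exclude special tokens, but slicing starts at index 0,
# so the leading special token shifts every span by one position.
extracted = []
start = 0
for stim in stimuli:
    n = len(toy_tokenize(stim))
    extracted.append(context_tokens[start:start + n])
    start += n

print(extracted)
```

The first slice picks up `<bos>` and drops "sat", and the second slice starts one token too early.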

aalok-sathe · 2022-03-25T16:16:37Z

Current behavior: special tokens are stripped from each stimulus when extracting stimulus-level representations evaluated within a context.
What remains: support for extracting the first-token, last-token, or special-token representation of a single stimulus. Special tokens are now stripped by default because, in context, they represent the whole context rather than any individual stimulus.
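As a rough illustration of the behavior described above (a sketch under stated assumptions, not the code in the referenced commits): stimulus spans can be offset by the number of leading special tokens in the context, and a first-token, last-token, or mean representation can then be picked from each span. The names `stimulus_spans`, `select`, and `n_leading_special` are hypothetical, and the whitespace tokenizer assumes a stimulus tokenizes the same way in and out of context, which may not hold for real subword tokenizers (see #31).

```python
def stimulus_spans(stimuli, tokenize, n_leading_special=1):
    """Return (start, end) token indices of each stimulus within the context,
    skipping the context's leading special tokens."""
    spans = []
    start = n_leading_special
    for stim in stimuli:
        n = len(tokenize(stim))  # length without special tokens
        spans.append((start, start + n))
        start += n
    return spans


def select(vectors, span, how="last"):
    """Pick one representation from the token vectors inside a span."""
    start, end = span
    chunk = vectors[start:end]
    if how == "first":
        return chunk[0]
    if how == "last":
        return chunk[-1]
    if how == "mean":
        # Average each dimension over the tokens in the span.
        return [sum(col) / len(chunk) for col in zip(*chunk)]
    raise ValueError(f"unknown selection method: {how}")


stimuli = ["the cat sat", "on the mat"]
context_tokens = ["<bos>"] + " ".join(stimuli).split()  # hypothetical special token
spans = stimulus_spans(stimuli, str.split)

# One toy 1-dimensional vector per context token, valued by position.
vectors = [[float(i)] for i in range(len(context_tokens))]
last_of_second = select(vectors, spans[1], how="last")
mean_of_first = select(vectors, spans[0], how="mean")
```

With the offset applied, `context_tokens[1:4]` covers exactly the first stimulus, and the special token at index 0 is left out of every span.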

…ect (#19)

aalok-sathe · 2022-03-25T19:05:25Z

whoops, that was an incorrect reference to this issue. it should have been #18 instead

aalok-sathe self-assigned this Mar 2, 2022

aalok-sathe added a commit that referenced this issue Mar 25, 2022

fix the special token offset issue (#19)

bb9bc0e

aalok-sathe added the lbs:encoders and bug labels Mar 25, 2022

aalok-sathe added a commit that referenced this issue Mar 25, 2022

allow subject_index=None and assume entire data is from a single subj…

8bbd056

…ect (#19)

aalok-sathe added the meta:pending-tests label Mar 25, 2022

benlipkin closed this as completed Mar 30, 2022

benlipkin reopened this Mar 30, 2022

aalok-sathe referenced this issue May 19, 2022

bugfixes in new indices extraction method to work with pytorch tensors

9b62b3b

aalok-sathe closed this as completed in 97bb72d May 20, 2022

aalok-sathe mentioned this issue May 20, 2022

We cannot rely on out-of-context tokenization to calculate tokenized offset lengths #31

Closed
