
An Error report about pipeline #3227

Closed
SizhaoXu opened this issue Mar 11, 2020 · 8 comments · Fixed by #3439
Labels
Core: Pipeline Internals of the library; Pipeline. Version mismatch

Comments

@SizhaoXu

🐛 Bug

Information

This may be an easy question, but it has been bothering me all day.

When I run the code:
nlp = pipeline("question-answering")

It always tells me:
Couldn't reach server at 'https://s3.amazonaws.com/models.huggingface.co/bert/distilbert-base-cased-distilled-squad-modelcard.json' to download model card file.
Creating an empty model card.

If I ignore it and continue to run the rest of the code:
nlp({
'question': 'What is the name of the repository ?',
'context': 'Pipeline have been included in the huggingface/transformers repository'
})

The following error appears:
KeyError: 'token_type_ids'
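The crash happens because the pipeline's SQuAD preprocessing reads `token_type_ids` directly from the tokenizer's output, and some tokenizers versions paired with DistilBERT omit that key. A minimal, self-contained illustration (the dicts below are hypothetical stand-ins for tokenizer output, not actual library calls):

```python
# Hypothetical tokenizer outputs: one encoding carries token_type_ids,
# one (as with some fast DistilBERT tokenizer versions) omits it.
with_segments = {"input_ids": [101, 2054, 102], "token_type_ids": [0, 0, 0]}
without_segments = {"input_ids": [101, 2054, 102]}

# The preprocessing accesses the key directly, so the second dict raises:
try:
    p_mask = without_segments["token_type_ids"]
except KeyError as err:
    print(f"KeyError: {err}")  # KeyError: 'token_type_ids'
```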

@EllieRoseS

EllieRoseS commented Mar 11, 2020

I have this same issue, but have no problems running:

nlp = pipeline("question-answering")

Note: To install the library, I had to install tokenizers version 0.6.0 separately, git clone the transformers repo and edit the setup.py file before installing as per @dafraile's answer for issue: #2831

Update: This error was fixed when I installed tokenizers==0.5.2

@LysandreJik LysandreJik added Core: Pipeline Internals of the library; Pipeline. Version mismatch labels Mar 19, 2020
@nreimers
Contributor

Unfortunately I have this issue too with the newest transformers 2.6.0 release.

Tokenizers is at version 0.5.2, but the newest version of tokenizers doesn't work either.

Any solutions to fix this issue?

@maximuslee1226

I have the same issue here. I first ran with my own tokenizer, which failed; then I tried the 03-pipelines.ipynb code with the QnA example and got the following error.

Environment:
tensorflow==2.0.0
tensorflow-estimator==2.0.1
tensorflow-gpu==2.0.0
torch==1.4.0
transformers==2.5.1
tokenizers==0.6.0

Code that I ran:
nlp_qa = pipeline('question-answering')
nlp_qa(context='Hugging Face is a French company based in New-York.', question='Where is based Hugging Face ?')

Error output:

HBox(children=(FloatProgress(value=0.0, description='Downloading', max=230.0, style=ProgressStyle(description_…

convert squad examples to features: 0%| | 0/1 [00:00<?, ?it/s]

RemoteTraceback Traceback (most recent call last)
RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/brandon/anaconda3/envs/transformers/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/home/brandon/anaconda3/envs/transformers/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/home/brandon/anaconda3/envs/transformers/lib/python3.7/site-packages/transformers/data/processors/squad.py", line 198, in squad_convert_example_to_features
p_mask = np.array(span["token_type_ids"])
KeyError: 'token_type_ids'
"""

The above exception was the direct cause of the following exception:

KeyError Traceback (most recent call last)
in ()
1 nlp_qa = pipeline('question-answering')
----> 2 nlp_qa(context='Hugging Face is a French company based in New-York.', question='Where is based Hugging Face ?')

~/anaconda3/envs/transformers/lib/python3.7/site-packages/transformers/pipelines.py in call(self, *texts, **kwargs)
968 False,
969 )
--> 970 for example in examples
971 ]
972 all_answers = []

~/anaconda3/envs/transformers/lib/python3.7/site-packages/transformers/pipelines.py in (.0)
968 False,
969 )
--> 970 for example in examples
971 ]
972 all_answers = []

~/anaconda3/envs/transformers/lib/python3.7/site-packages/transformers/data/processors/squad.py in squad_convert_examples_to_features(examples, tokenizer, max_seq_length, doc_stride, max_query_length, is_training, return_dataset, threads)
314 p.imap(annotate_, examples, chunksize=32),
315 total=len(examples),
--> 316 desc="convert squad examples to features",
317 )
318 )

~/anaconda3/envs/transformers/lib/python3.7/site-packages/tqdm/std.py in iter(self)
1106 fp_write=getattr(self.fp, 'write', sys.stderr.write))
1107
-> 1108 for obj in iterable:
1109 yield obj
1110 # Update and possibly print the progressbar.

~/anaconda3/envs/transformers/lib/python3.7/multiprocessing/pool.py in (.0)
323 result._set_length
324 ))
--> 325 return (item for chunk in result for item in chunk)
326
327 def imap_unordered(self, func, iterable, chunksize=1):

~/anaconda3/envs/transformers/lib/python3.7/multiprocessing/pool.py in next(self, timeout)
746 if success:
747 return value
--> 748 raise value
749
750 next = next # XXX

KeyError: 'token_type_ids'
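For reference, the failing line in squad.py (`p_mask = np.array(span["token_type_ids"])`) is a direct key access. A defensive variant (a sketch only, not the library's actual fix) would fall back to all-zero segment ids when the tokenizer doesn't return them:

```python
# `span` stands in for the encoding dict built inside
# squad_convert_example_to_features (hypothetical data).
span = {"input_ids": [101, 7592, 102]}  # no token_type_ids key

# span["token_type_ids"] would raise KeyError here.
# Defensive fallback: treat every token as segment 0.
p_mask = span.get("token_type_ids", [0] * len(span["input_ids"]))
print(p_mask)  # [0, 0, 0]
```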

@maximuslee1226

Any help would be greatly appreciated!

@paras55

paras55 commented Mar 27, 2020

Use:
pip install transformers==2.5.1
instead of:
pip install transformers

@maximuslee1226

Thank you @paras55, your solution worked for me!

@LysandreJik
Member

Installing v2.7.0 should work as well.

@ypapanik

ypapanik commented Apr 2, 2020

2.7.0 fails with the same error (at least with tokenizers==0.5.2)
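Since the failures in this thread come down to a transformers/tokenizers version mismatch, it helps to print the installed versions before reporting. A stdlib-only sketch (`importlib.metadata` requires Python 3.8+):

```python
from importlib.metadata import version, PackageNotFoundError

# Print the installed version of each package, or note its absence.
for pkg in ("transformers", "tokenizers"):
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, "is not installed")
```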

julien-c added a commit that referenced this issue Apr 7, 2020

7 participants