I get an error at the build_index step:
Traceback (most recent call last):
File "build_index.py", line 114, in <module>
main(args)
File "build_index.py", line 85, in main
used_data, used_ids, max_norm = get_features(args.batch_size, args.norm_th, vocab, model, used_data, used_ids, max_norm_cf=args.max_norm_cf)
File "/root/miniconda3/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 26, in decorate_context
return func(*args, **kwargs)
File "/root/autodl-tmp/TM/retriever.py", line 342, in get_features
cur_vecs = model(batch, batch_first=True).detach().cpu().numpy()
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 161, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 171, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
output.reraise()
File "/root/miniconda3/lib/python3.8/site-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
TypeError: Caught TypeError in replica 1 on device 1.
Original Traceback (most recent call last):
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
output = module(*input, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
TypeError: forward() missing 1 required positional argument: 'input_ids'
It looks like a multi-GPU issue, but the code already specifies GPU 0, so I'm not sure what's going wrong here. Could you take a look?
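Editor's note: a common cause of this pattern is that `CUDA_VISIBLE_DEVICES` is set after torch has already initialized CUDA, so it has no effect and `nn.DataParallel` still replicates the model across both cards; the scatter then leaves replica 1 without its `input_ids`. A minimal sketch of the ordering that does take effect (the `DataParallel` line uses hypothetical names, not the actual ones from build_index.py):

```python
import os

# CUDA_VISIBLE_DEVICES is only honored if it is set before torch first
# touches CUDA, so it must come before "import torch" (or be exported in
# the shell: CUDA_VISIBLE_DEVICES=0 python build_index.py).
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# import torch  # only after the device mask is in place

# Alternatively, restrict DataParallel explicitly when wrapping the model
# (hypothetical variable name), so only one replica is ever created:
#   model = torch.nn.DataParallel(model, device_ids=[0])

print(os.environ["CUDA_VISIBLE_DEVICES"])  # → 0
```

With a single visible device there is only one replica, so the "missing positional argument in replica 1" scatter problem cannot occur.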
If I add os.environ["CUDA_VISIBLE_DEVICES"]='0', or run on a machine with only one GPU, I get the following error instead:
Traceback (most recent call last):
File "build_index.py", line 115, in <module>
main(args)
File "build_index.py", line 90, in main
mips.train(used_data)
File "/root/autodl-tmp/TM/mips.py", line 43, in train
self.index.train(data)
File "/root/miniconda3/lib/python3.8/site-packages/faiss/__init__.py", line 144, in replacement_train
self.train_c(n, swig_ptr(x))
File "/root/miniconda3/lib/python3.8/site-packages/faiss/swigfaiss.py", line 4134, in train
return _swigfaiss.GpuIndexIVFScalarQuantizer_train(self, n, x)
RuntimeError: Error in virtual void faiss::Clustering::train(faiss::Clustering::idx_t, const float*, faiss::Index&) at Clustering.cpp:82: Error: 'nx >= k' failed: Number of training points (1) should be at least as large as number of clusters (1024)
I'd appreciate any help, thank you.
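Editor's note: the `'nx >= k' failed` message means `index.train()` received a single vector while the IVF index was configured with 1024 cluster centroids, i.e. the feature-extraction step populated `used_data` with (almost) nothing. A pure-Python guard (hypothetical helper name, not part of the repo) makes the mismatch visible before faiss aborts:

```python
def check_ivf_train_size(train_vecs, nlist=1024):
    """Pre-flight check before faiss index.train(): k-means clustering
    requires at least nlist training points (faiss recommends on the
    order of 39 * nlist for stable centroids)."""
    n = len(train_vecs)
    if n < nlist:
        raise ValueError(
            f"IVF training needs at least {nlist} vectors, got {n}; "
            f"check that the feature-extraction step actually populated "
            f"the training data"
        )
    return n

# One vector, as in the traceback above, is rejected up front:
try:
    check_ivf_train_size([[0.0] * 128])
except ValueError as err:
    print("refused:", err)

# A corpus-sized sample passes:
print(check_ivf_train_size([[0.0] * 128] * 2048))  # → 2048
```

If the guard trips, the fix is upstream (the earlier `get_features` failure), not in faiss; only for a genuinely tiny corpus would lowering `nlist` be appropriate.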