Question on AraBERT-Trainer-HyperParameterOpt-NER Notebook #179
-
For predictions I suggest using the `pipeline` API. Also, if your model is based on arabertv2 with pre-segmentation, it might cause issues, but I'm not sure.
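A minimal sketch of what that could look like, assuming the fine-tuned checkpoint was saved together with its token-classification head (the checkpoint path here is only a placeholder):

```python
from transformers import pipeline, AutoModelForTokenClassification, AutoTokenizer
from arabert.preprocess import ArabertPreprocessor

model_name = "aubmindlab/bert-base-arabertv02"
checkpoint = "/path/to/finetuned-ner-checkpoint"  # placeholder: directory the Trainer saved to

# Loading with the token-classification class keeps the classifier head and the id2label mapping
model = AutoModelForTokenClassification.from_pretrained(checkpoint)
tokenizer = AutoTokenizer.from_pretrained(model_name)
arabert_prep = ArabertPreprocessor(model_name)

# aggregation_strategy="simple" groups word pieces back into whole entity spans
ner = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

text = arabert_prep.preprocess("محمد ذهب إلى أمريكا للحصول على شهادة الماجستير.")
print(ner(text))
```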
-
I tried to do that using this code:

```python
from transformers import pipeline, AutoModel, AutoModelForTokenClassification, AutoTokenizer

model_name = 'aubmindlab/bert-base-arabertv02'
pipe = pipeline("ner", model=arabert_model, tokenizer=tokenizer)
```

It shows me this error:

```
Some weights of the model checkpoint at /gdrive/MyDrive/LearningSpacy were not used when initializing BertModel: ['classifier.weight', 'classifier.bias']

KeyError                                  Traceback (most recent call last)
4 frames
KeyError: 410
```
-
Hi everyone!
I'm training an AraBERT model for NER, exactly as you did in this notebook. After training and saving the model, I would like to see the predictions for some samples, so I started by writing the following code:

```python
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification
from arabert.preprocess import ArabertPreprocessor

model_name = 'aubmindlab/bert-base-arabertv02'

def predict(sample_text):
    encoded_text = tokenizer.encode_plus(
        sample_text,
        max_length=138,
        add_special_tokens=True,
        return_token_type_ids=False,
        padding='max_length',
        return_attention_mask=True,
        return_tensors='pt',
    )
    input_ids = encoded_text['input_ids']
    attention_mask = encoded_text['attention_mask']
    # The token-classification head returns one logit vector per token position
    with torch.no_grad():
        output = arabert_model(input_ids, attention_mask)
    label_indices = np.argmax(output.logits.numpy(), axis=2)
    print(f'Text: {sample_text}')
    print(f'Tags: {label_indices}')
    return label_indices

# Load the fine-tuned checkpoint with its classification head, plus the preprocessor and tokenizer
arabert_model = AutoModelForTokenClassification.from_pretrained('/gdrive/MyDrive/AraBERT Model Config')
arabert_prep = ArabertPreprocessor(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

text = "محمد ذهب إلى أمريكا للحصول على شهادة الماجستير."
text_preprocessed = arabert_prep.preprocess(text)
predict(text_preprocessed)
```
I don't know if my mapping is correct or not, and I also want the output to be in the form [B-PER, O, O, B-LOC, O, O, O, O].
How can I accomplish that?
Thank you
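One way to turn the predicted indices into tag strings, sketched under the assumption that the fine-tuned checkpoint was saved with `AutoModelForTokenClassification` so its config carries the `id2label` mapping from training (otherwise the ids come back as generic LABEL_0, LABEL_1, ...):

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

model = AutoModelForTokenClassification.from_pretrained('/gdrive/MyDrive/AraBERT Model Config')
tokenizer = AutoTokenizer.from_pretrained('aubmindlab/bert-base-arabertv02')

def predict_tags(text):
    # Tokenize without padding so every position corresponds to a real token
    encoded = tokenizer(text, return_tensors='pt')
    with torch.no_grad():
        logits = model(**encoded).logits
    label_ids = logits.argmax(dim=-1)[0].tolist()
    tokens = tokenizer.convert_ids_to_tokens(encoded['input_ids'][0])
    # Map each predicted id to its tag name and drop the [CLS]/[SEP] special tokens
    return [
        (tok, model.config.id2label[i])
        for tok, i in zip(tokens, label_ids)
        if tok not in tokenizer.all_special_tokens
    ]

print(predict_tags(text_preprocessed))
```

Note that this yields one tag per word piece; to get exactly one tag per word, as in [B-PER, O, O, B-LOC, ...], the subword tokens still need to be grouped back into words, which is what the pipeline's aggregation step handles.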