Models hub (#13913)

--------- Co-authored-by: ahmedlone127 <ahmedlone127@gmail.com> * 2023-06-27-roberta_embeddings_robertinh_gl (#13868) * Add model 2023-06-27-roberta_embeddings_robertinh_gl * Add model 2023-06-27-roberta_embeddings_roberta_base_wechsel_german_de * Add model 2023-06-27-roberta_embeddings_roberta_base_russian_v0_ru * Add model 2023-06-27-roberta_embeddings_ruperta_base_finetuned_spa_constitution_en * Add model 2023-06-27-roberta_embeddings_robasqu_eu * Add model 2023-06-27-roberta_embeddings_roberta_ko_small_ko * Add model 2023-06-27-roberta_embeddings_hindi_hi * Add model 2023-06-27-roberta_embeddings_sundanese_roberta_base_su * Add model 2023-06-27-roberta_embeddings_roberta_pubmed_en * Add model 2023-06-27-roberta_embeddings_distilroberta_base_climate_f_en * Add model 2023-06-27-roberta_embeddings_roberta_urdu_small_ur * Add model 2023-06-27-roberta_embeddings_BR_BERTo_pt * Add model 2023-06-27-roberta_embeddings_distilroberta_base_climate_d_s_en * Add model 2023-06-27-roberta_embeddings_distilroberta_base_climate_d_en * Add model 2023-06-27-roberta_embeddings_ukr_roberta_base_uk * Add model 2023-06-27-roberta_embeddings_roberta_base_wechsel_french_fr * Add model 2023-06-27-roberta_embeddings_Bible_roberta_base_en * Add model 2023-06-27-roberta_embeddings_bertin_roberta_large_spanish_es * Add model 2023-06-27-roberta_embeddings_roberta_base_wechsel_chinese_zh * Add model 2023-06-27-roberta_embeddings_bertin_roberta_base_spanish_es * Add model 2023-06-27-roberta_embeddings_bertin_base_gaussian_es * Add model 2023-06-27-roberta_embeddings_bertin_base_random_exp_512seqlen_es * Add model 2023-06-27-roberta_embeddings_RuPERTa_base_es * Add model 2023-06-27-roberta_embeddings_roberta_base_bne_es * Add model 2023-06-27-roberta_embeddings_bertin_base_stepwise_exp_512seqlen_es * Add model 2023-06-27-roberta_embeddings_MedRoBERTa.nl_nl * Add model 2023-06-27-roberta_embeddings_bertin_base_random_es * Add model 2023-06-27-roberta_embeddings_RoBERTalex_es * Add model 
2023-06-27-roberta_embeddings_SecRoBERTa_en * Add model 2023-06-27-roberta_embeddings_KanBERTo_kn * Add model 2023-06-27-roberta_embeddings_distilroberta_base_finetuned_jira_qt_issue_title_en * Add model 2023-06-27-roberta_embeddings_MedRoBERTa.nl_nl * Add model 2023-06-27-roberta_embeddings_distilroberta_base_finetuned_jira_qt_issue_titles_and_bodies_en * Add model 2023-06-27-roberta_embeddings_bertin_base_stepwise_es * Add model 2023-06-27-roberta_embeddings_KanBERTo_kn * Add model 2023-06-27-roberta_embeddings_bertin_base_gaussian_exp_512seqlen_es * Add model 2023-06-27-roberta_embeddings_mlm_spanish_roberta_base_es * Add model 2023-06-27-roberta_embeddings_KNUBert_kn * Add model 2023-06-27-roberta_embeddings_javanese_roberta_small_jv * Add model 2023-06-27-roberta_embeddings_indonesian_roberta_base_id * Add model 2023-06-27-roberta_embeddings_indic_transformers_hi_roberta_hi * Add model 2023-06-27-roberta_embeddings_indo_roberta_small_id * Add model 2023-06-27-roberta_embeddings_fairlex_scotus_minilm_en * Add model 2023-06-27-roberta_embeddings_indic_transformers_te_roberta_te * Add model 2023-06-27-roberta_embeddings_javanese_roberta_small_imdb_jv * Add model 2023-06-27-roberta_embeddings_jurisbert_es * Add model 2023-06-27-roberta_embeddings_roberta_base_indonesian_522M_id * Add model 2023-06-27-roberta_embeddings_fairlex_ecthr_minilm_en * Add model 2023-06-27-roberta_embeddings_muppet_roberta_base_en --------- Co-authored-by: ahmedlone127 <ahmedlone127@gmail.com> * Add model 2023-06-29-xlmroberta_embeddings_paraphrase_mpnet_base_v2_xx (#13872) Co-authored-by: Damla-Gurbaz <dml.grbz.01@gmail.com> * 2023-06-08-instructor_base_en (#13850) * Add model 2023-06-08-instructor_base_en * Update 2023-06-08-instructor_base_en.md * Add model 2023-06-21-e5_base_v2_en * Add model 2023-06-21-e5_base_en * Add model 2023-06-21-e5_large_v2_en * Add model 2023-06-21-e5_large_en * Add model 2023-06-21-e5_small_v2_en * Add model 2023-06-21-e5_small_en * Add model 
2023-06-21-instructor_large_en --------- Co-authored-by: prabod <prabod@rathnayaka.me> Co-authored-by: Maziyar Panahi <maziyar.panahi@iscpif.fr> * 2023-06-28-roberta_base_en (#13871) * Add model 2023-06-28-roberta_base_en * Add model 2023-06-28-roberta_base_opt_en * Add model 2023-06-28-roberta_base_quantized_en * Add model 2023-06-28-small_bert_L2_768_en * Add model 2023-06-28-small_bert_L2_768_opt_en * Add model 2023-06-28-small_bert_L2_768_quantized_en * Add model 2023-06-28-distilbert_base_cased_en * Add model 2023-06-28-distilbert_base_cased_opt_en * Add model 2023-06-28-distilbert_base_cased_quantized_en * Add model 2023-06-28-deberta_v3_base_en * Add model 2023-06-28-deberta_v3_base_opt_en * Add model 2023-06-28-deberta_v3_base_quantized_en * Add model 2023-06-28-distilbert_base_uncased_en * Add model 2023-06-28-distilbert_base_uncased_opt_en * Add model 2023-06-28-distilbert_base_uncased_quantized_en * Add model 2023-06-28-distilbert_base_multilingual_cased_xx * Add model 2023-06-28-distilbert_base_multilingual_cased_xx * Add model 2023-06-28-distilbert_base_multilingual_cased_opt_xx * Add model 2023-06-28-distilbert_base_multilingual_cased_quantized_xx * Add model 2023-06-28-distilbert_embeddings_distilbert_base_german_cased_de * Add model 2023-06-28-distilbert_embeddings_distilbert_base_german_cased_opt_de * Add model 2023-06-28-distilbert_embeddings_distilbert_base_german_cased_quantized_de * Add model 2023-06-29-bert_base_cased_en * Add model 2023-06-29-bert_base_cased_opt_en * Add model 2023-06-29-bert_base_cased_quantized_en --------- Co-authored-by: ahmedlone127 <ahmedlone127@gmail.com> * Add model 2023-07-05-image_classifier_convnext_tiny_224_local_en (#13879) Co-authored-by: gadde5300 <gadde5300@gmail.com> * Add model 2023-07-06-quora_distilbert_multilingual_en (#13882) Co-authored-by: purulalwani <purulalwani@gmail.com> * removed duplicated sections (#13885) * Add model 2023-07-20-xlm_roberta_large_zero_shot_classifier_xnli_anli_xx (#13900) 
Co-authored-by: ahmedlone127 <ahmedlone127@gmail.com> * Add model 2023-07-28-twitter_xlm_roberta_base_sentiment_en (#13905) Co-authored-by: veerdhwaj <veerdhwaj@aol.com> * 2023-07-30-albert_embeddings_ALR_BERT_ro (#13910) * Add model 2023-07-30-albert_embeddings_ALR_BERT_ro * Add model 2023-07-30-albert_embeddings_albert_base_japanese_v1_ja * Add model 2023-07-30-albert_embeddings_albert_large_arabic_ar * Add model 2023-07-30-albert_embeddings_albert_fa_base_v2_fa * Add model 2023-07-30-albert_embeddings_albert_german_ner_de * Add model 2023-07-30-albert_embeddings_albert_fa_zwnj_base_v2_fa * Add model 2023-07-30-albert_embeddings_marathi_albert_mr * Add model 2023-07-30-albert_embeddings_albert_tiny_bahasa_cased_ms * Add model 2023-07-30-albert_embeddings_albert_base_bahasa_cased_ms * Add model 2023-07-30-albert_embeddings_fralbert_base_fr * Add model 2023-07-30-albert_embeddings_marathi_albert_v2_mr * Add model 2023-07-30-albert_embeddings_albert_base_arabic_ar * Add model 2023-07-30-albert_embeddings_albert_large_bahasa_cased_ms * Add model 2023-07-30-camembert_embeddings_das22_10_camembert_pretrained_fr * Add model 2023-07-30-camembert_embeddings_zhenghuabin_generic_model_fr * Add model 2023-07-30-camembert_embeddings_das22_10_camembert_pretrained_fr * Add model 2023-07-30-camembert_embeddings_camembert_mlm_fr * Add model 2023-07-30-camembert_embeddings_edge2992_generic_model_fr * Add model 2023-07-30-camembert_embeddings_elusive_magnolia_generic_model_fr * Add model 2023-07-30-camembert_embeddings_zhenghuabin_generic_model_fr * Add model 2023-07-30-camembert_embeddings_camembert_aux_amandes_mt * Add model 2023-07-30-camembert_embeddings_elliotsmith_generic_model_fr * Add model 2023-07-30-camembert_embeddings_dianeshan_generic_model_fr * fixed wrong version * Add model 2023-07-31-camembert_embeddings_ankitkupadhyay_generic_model_fr * Add model 2023-07-31-camembert_embeddings_devtrent_generic_model_fr * Add model 
2023-07-31-camembert_embeddings_eduardopds_generic_model_fr * Add model 2023-07-31-camembert_embeddings_adeiMousa_generic_model_fr * Add model 2023-07-31-camembert_embeddings_ericchchiu_generic_model_fr * Add model 2023-07-31-camembert_embeddings_Sebu_generic_model_fr * Add model 2023-07-31-camembert_embeddings_Weipeng_generic_model_fr * Add model 2023-07-31-camembert_embeddings_codingJacob_generic_model_fr * Add model 2023-07-31-camembert_embeddings_SummFinFR_fr * Add model 2023-07-31-camembert_embeddings_MYX4567_generic_model_fr * Add model 2023-07-31-camembert_embeddings_Katster_generic_model_fr * Add model 2023-07-31-camembert_embeddings_MYX4567_generic_model_fr * Add model 2023-07-31-camembert_embeddings_JonathanSum_generic_model_fr * Add model 2023-07-31-camembert_embeddings_Leisa_generic_model_fr * Add model 2023-07-31-camembert_embeddings_adam1224_generic_model_fr * Add model 2023-07-31-camembert_embeddings_est_roberta_et * Add model 2023-07-31-camembert_embeddings_generic2_fr * Add model 2023-07-31-camembert_embeddings_ysharma_generic_model_2_fr * Add model 2023-07-31-camembert_embeddings_DoyyingFace_generic_model_fr * Add model 2023-07-31-camembert_embeddings_Henrywang_generic_model_fr * Add model 2023-07-31-camembert_embeddings_xkang_generic_model_fr * Add model 2023-07-31-camembert_embeddings_wangst_generic_model_fr * Add model 2023-07-31-camembert_embeddings_seyfullah_generic_model_fr * Add model 2023-07-31-camembert_embeddings_tnagata_generic_model_fr * Add model 2023-07-31-camembert_embeddings_yancong_generic_model_fr * Add model 2023-07-31-camembert_embeddings_safik_generic_model_fr * Add model 2023-07-31-camembert_embeddings_tpanza_generic_model_fr * Add model 2023-07-31-camembert_embeddings_peterhsu_generic_model_fr * Add model 2023-07-31-camembert_embeddings_pgperrone_generic_model_fr * Add model 2023-07-31-camembert_embeddings_osanseviero_generic_model_fr * Add model 2023-07-31-camembert_embeddings_lijingxin_generic_model_fr * Add model 
2023-08-01-camembert_embeddings_kaushikacharya_generic_model_fr * Add model 2023-08-01-camembert_embeddings_new_generic_model_fr * Add model 2023-08-01-camembert_embeddings_mbateman_generic_model_fr * Add model 2023-08-01-camembert_embeddings_lijingxin_generic_model_2_fr * Add model 2023-08-01-camembert_embeddings_katrin_kc_generic_model_fr * Add model 2023-08-01-camembert_embeddings_linyi_generic_model_fr * Add model 2023-08-01-camembert_embeddings_lewtun_generic_model_fr * Add model 2023-08-01-camembert_embeddings_joe8zhang_generic_model_fr * Add model 2023-08-01-camembert_embeddings_sloberta_sl * Add model 2023-08-01-camembert_embeddings_generic_model_test_fr * Add model 2023-08-01-camembert_embeddings_jcai1_generic_model_fr * Add model 2023-08-01-camembert_embeddings_umberto_commoncrawl_cased_v1_it * Add model 2023-08-01-camembert_embeddings_DataikuNLP_camembert_base_fr * Add model 2023-08-01-camembert_embeddings_umberto_wikipedia_uncased_v1_it * Add model 2023-08-01-camembert_base_oscar_4gb_fr * Add model 2023-08-01-camembert_embeddings_distilcamembert_base_fr * Add model 2023-08-01-camembert_base_wikipedia_4gb_fr * Add model 2023-08-01-camembert_base_ccnet_fr * Add model 2023-08-01-camembert_base_oscar_4gb_fr * Add model 2023-08-01-camembert_embeddings_hackertec_generic_fr * Add model 2023-08-01-camembert_base_ccnet_fr * Add model 2023-08-01-camembert_embeddings_h4d35_generic_model_fr * Add model 2023-08-01-camembert_embeddings_bertweetfr_base_fr * Add model 2023-08-01-camembert_base_ccnet_4gb_fr * Add model 2023-08-01-camembert_base_ccnet_4gb_fr * Add model 2023-08-01-xlmroberta_embeddings_fairlex_fscs_minilm_xx * Add model 2023-08-01-xlmroberta_embeddings_fairlex_cail_minilm_zh * Add model 2023-08-01-camembert_base_fr * Add model 2023-08-01-camembert_base_opt_fr * Add model 2023-08-01-camembert_base_quantized_fr * Add model 2023-08-02-albert_base_uncased_en * Add model 2023-08-02-albert_base_uncased_opt_en * Add model 
2023-08-02-albert_base_uncased_quantized_en * Add model 2023-08-02-albert_large_uncased_en * Add model 2023-08-02-albert_large_uncased_en * Add model 2023-08-02-albert_large_uncased_opt_en * Add model 2023-08-02-albert_large_uncased_quantized_en --------- Co-authored-by: ahmedlone127 <ahmedlone127@gmail.com> * 2023-07-28-twitter_xlm_roberta_base_sentiment_en (#13906) * Add model 2023-07-28-twitter_xlm_roberta_base_sentiment_en * Add model 2023-07-31-twitter_xlm_roberta_base_sentiment_pdc_en * Add model 2023-07-31-sentiment_twitter_xlm_roBerta_pdc_en --------- Co-authored-by: veerdhwaj <veerdhwaj@aol.com> --------- Co-authored-by: jsl-models <74001263+jsl-models@users.noreply.github.com> Co-authored-by: ahmedlone127 <ahmedlone127@gmail.com> Co-authored-by: purulalwani <purulalwani@gmail.com> Co-authored-by: veerdhwaj <veerdhwaj@aol.com>
JohnSnowLabs · Aug 2, 2023 · 35478e0 · 35478e0
1 parent 2b2f93c
commit 35478e0
Showing 88 changed files with 8,403 additions and 0 deletions.
diff --git a/.../ahmedlone127/2023-07-20-xlm_roberta_large_zero_shot_classifier_xnli_anli_xx.md b/.../ahmedlone127/2023-07-20-xlm_roberta_large_zero_shot_classifier_xnli_anli_xx.md
@@ -0,0 +1,106 @@
+---
+layout: model
+title: XlmRoBerta Zero-Shot Classification Large xlm_roberta_large_zero_shot_classifier_xnli_anli
+author: John Snow Labs
+name: xlm_roberta_large_zero_shot_classifier_xnli_anli
+date: 2023-07-20
+tags: [zero_shot, xx, open_source, tensorflow]
+task: Zero-Shot Classification
+language: xx
+edition: Spark NLP 5.0.2
+spark_version: 3.0
+supported: true
+engine: tensorflow
+annotator: XlmRoBertaForZeroShotClassification
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+This model is intended for zero-shot text classification, especially in English. It was fine-tuned on NLI data using the XlmRoberta Large model.
+
+XlmRoBertaForZeroShotClassification uses a ModelForSequenceClassification trained on NLI (natural language inference) tasks. It is the equivalent of TFXLMRoBertaForZeroShotClassification models, but these models don’t require a hardcoded number of potential classes; the classes can be chosen at runtime. This usually makes it slower, but much more flexible.
+
+We used TFXLMRobertaForSequenceClassification to train this model and the XlmRoBertaForZeroShotClassification annotator in Spark NLP 🚀 for prediction at scale!
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+<button class="button button-orange" disabled>Live Demo</button>
+<button class="button button-orange" disabled>Open in Colab</button>
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/xlm_roberta_large_zero_shot_classifier_xnli_anli_xx_5.0.2_3.0_1689886974932.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/xlm_roberta_large_zero_shot_classifier_xnli_anli_xx_5.0.2_3.0_1689886974932.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+.setInputCol('text') \
+.setOutputCol('document')
+
+tokenizer = Tokenizer() \
+.setInputCols(['document']) \
+.setOutputCol('token')
+
+zeroShotClassifier = XlmRoBertaForZeroShotClassification \
+.pretrained('xlm_roberta_large_zero_shot_classifier_xnli_anli', 'xx') \
+.setInputCols(['token', 'document']) \
+.setOutputCol('class') \
+.setCaseSensitive(True) \
+.setMaxSentenceLength(512) \
+.setCandidateLabels(["urgent", "mobile", "travel", "movie", "music", "sport", "weather", "technology"])
+
+pipeline = Pipeline(stages=[
+document_assembler,
+tokenizer,
+zeroShotClassifier
+])
+
+example = spark.createDataFrame([['I have a problem with my iphone that needs to be resolved asap!!']]).toDF("text")
+result = pipeline.fit(example).transform(example)
+
+```
+```scala
+val document_assembler = new DocumentAssembler()
+.setInputCol("text")
+.setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+.setInputCols("document")
+.setOutputCol("token")
+
+val zeroShotClassifier = XlmRoBertaForZeroShotClassification.pretrained("xlm_roberta_large_zero_shot_classifier_xnli_anli", "xx")
+.setInputCols("document", "token")
+.setOutputCol("class")
+.setCaseSensitive(true)
+.setMaxSentenceLength(512)
+.setCandidateLabels(Array("urgent", "mobile", "travel", "movie", "music", "sport", "weather", "technology"))
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, zeroShotClassifier))
+val example = Seq("I have a problem with my iphone that needs to be resolved asap!!").toDS.toDF("text")
+val result = pipeline.fit(example).transform(example)
+```
+</div>
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|xlm_roberta_large_zero_shot_classifier_xnli_anli|
+|Compatibility:|Spark NLP 5.0.2+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[token, document]|
+|Output Labels:|[label]|
+|Language:|xx|
+|Size:|2.0 GB|
+|Case sensitive:|true|
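Note on the zero-shot card above: its description says the candidate classes can be chosen at runtime because the model is an NLI classifier. A minimal sketch of that idea in plain Python follows. It is not Spark NLP's implementation: the hypothesis template, the function names, and the toy scorer (a word-overlap stand-in for a real NLI model) are all assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List


def zero_shot_classify(
    text: str,
    candidate_labels: List[str],
    entailment_logit: Callable[[str, str], float],
    hypothesis_template: str = "This example is {}.",  # assumed template
) -> Dict[str, float]:
    """Score each label by how strongly the text entails a label hypothesis."""
    # Build one premise/hypothesis pair per candidate label at call time,
    # so the label set is not baked into the model.
    logits = [
        entailment_logit(text, hypothesis_template.format(label))
        for label in candidate_labels
    ]
    # Softmax over the entailment logits so the scores sum to 1.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(candidate_labels, exps)}


def toy_entailment_logit(premise: str, hypothesis: str) -> float:
    """Hypothetical stand-in for an NLI model: counts shared lowercase words."""
    premise_words = set(premise.lower().split())
    hypothesis_words = set(hypothesis.lower().rstrip(".").split())
    return float(len(premise_words & hypothesis_words))


scores = zero_shot_classify(
    "I need an urgent repair asap",
    ["urgent", "mobile", "travel"],
    toy_entailment_logit,
)
best = max(scores, key=scores.get)
print(best, scores)
```

Each label costs one forward pass of the NLI model, which is consistent with the card's remark that this approach is slower but more flexible than a fixed-class classifier.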
diff --git a/docs/_posts/ahmedlone127/2023-07-30-albert_embeddings_ALR_BERT_ro.md b/docs/_posts/ahmedlone127/2023-07-30-albert_embeddings_ALR_BERT_ro.md
@@ -0,0 +1,99 @@
+---
+layout: model
+title: Romanian ALBERT Embeddings (from dragosnicolae555)
+author: John Snow Labs
+name: albert_embeddings_ALR_BERT
+date: 2023-07-30
+tags: [albert, embeddings, ro, open_source, onnx]
+task: Embeddings
+language: ro
+edition: Spark NLP 5.0.2
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: AlbertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained ALBERT Embeddings model, uploaded to Hugging Face, adapted and imported into Spark NLP. `ALR_BERT` is a Romanian model originally trained by `dragosnicolae555`.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+<button class="button button-orange" disabled>Live Demo</button>
+<button class="button button-orange" disabled>Open in Colab</button>
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albert_embeddings_ALR_BERT_ro_5.0.2_3.0_1690752767725.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albert_embeddings_ALR_BERT_ro_5.0.2_3.0_1690752767725.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = DocumentAssembler() \
+.setInputCol("text") \
+.setOutputCol("document")
+
+tokenizer = Tokenizer() \
+.setInputCols("document") \
+.setOutputCol("token")
+
+embeddings = AlbertEmbeddings.pretrained("albert_embeddings_ALR_BERT","ro") \
+.setInputCols(["document", "token"]) \
+.setOutputCol("embeddings")
+
+pipeline = Pipeline(stages=[documentAssembler, tokenizer, embeddings])
+
+data = spark.createDataFrame([["Îmi place Spark NLP"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new DocumentAssembler() 
+.setInputCol("text") 
+.setOutputCol("document")
+
+val tokenizer = new Tokenizer() 
+.setInputCols(Array("document"))
+.setOutputCol("token")
+
+val embeddings = AlbertEmbeddings.pretrained("albert_embeddings_ALR_BERT","ro") 
+.setInputCols(Array("document", "token")) 
+.setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings))
+
+val data = Seq("Îmi place Spark NLP").toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("ro.embed.ALR_BERT").predict("""Îmi place Spark NLP""")
+```
+</div>
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|albert_embeddings_ALR_BERT|
+|Compatibility:|Spark NLP 5.0.2+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[sentence, token]|
+|Output Labels:|[bert]|
+|Language:|ro|
+|Size:|51.7 MB|
+|Case sensitive:|false|
diff --git a/docs/_posts/ahmedlone127/2023-07-30-albert_embeddings_albert_base_arabic_ar.md b/docs/_posts/ahmedlone127/2023-07-30-albert_embeddings_albert_base_arabic_ar.md
@@ -0,0 +1,99 @@
+---
+layout: model
+title: Arabic ALBERT Embeddings (Base)
+author: John Snow Labs
+name: albert_embeddings_albert_base_arabic
+date: 2023-07-30
+tags: [albert, embeddings, ar, open_source, onnx]
+task: Embeddings
+language: ar
+edition: Spark NLP 5.0.2
+spark_version: 3.0
+supported: true
+engine: onnx
+annotator: AlbertEmbeddings
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Pretrained ALBERT Embeddings model, uploaded to Hugging Face, adapted and imported into Spark NLP. `albert-base-arabic` is an Arabic model originally trained by `asafaya`.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+<button class="button button-orange" disabled>Live Demo</button>
+<button class="button button-orange" disabled>Open in Colab</button>
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/albert_embeddings_albert_base_arabic_ar_5.0.2_3.0_1690753212237.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/albert_embeddings_albert_base_arabic_ar_5.0.2_3.0_1690753212237.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = DocumentAssembler() \
+.setInputCol("text") \
+.setOutputCol("document")
+
+tokenizer = Tokenizer() \
+.setInputCols("document") \
+.setOutputCol("token")
+
+embeddings = AlbertEmbeddings.pretrained("albert_embeddings_albert_base_arabic","ar") \
+.setInputCols(["document", "token"]) \
+.setOutputCol("embeddings")
+
+pipeline = Pipeline(stages=[documentAssembler, tokenizer, embeddings])
+
+data = spark.createDataFrame([["أنا أحب شرارة NLP"]]).toDF("text")
+
+result = pipeline.fit(data).transform(data)
+```
+```scala
+val documentAssembler = new DocumentAssembler() 
+.setInputCol("text") 
+.setOutputCol("document")
+
+val tokenizer = new Tokenizer() 
+.setInputCols(Array("document"))
+.setOutputCol("token")
+
+val embeddings = AlbertEmbeddings.pretrained("albert_embeddings_albert_base_arabic","ar") 
+.setInputCols(Array("document", "token")) 
+.setOutputCol("embeddings")
+
+val pipeline = new Pipeline().setStages(Array(documentAssembler, tokenizer, embeddings))
+
+val data = Seq("أنا أحب شرارة NLP").toDF("text")
+
+val result = pipeline.fit(data).transform(data)
+```
+
+{:.nlu-block}
+```python
+import nlu
+nlu.load("ar.embed.albert").predict("""أنا أحب شرارة NLP""")
+```
+</div>
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|albert_embeddings_albert_base_arabic|
+|Compatibility:|Spark NLP 5.0.2+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[sentence, token]|
+|Output Labels:|[bert]|
+|Language:|ar|
+|Size:|42.0 MB|
+|Case sensitive:|false|
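Note on the download links in the cards above: the three archive names shown in this diff appear to follow the pattern `<name>_<language>_<Spark NLP edition>_<Spark version>_<13-digit number>.zip`. The sketch below splits a file name along that inferred pattern. The pattern itself, the helper name `parse_archive_name`, and the reading of the trailing number as epoch milliseconds are assumptions drawn only from these examples, not a documented naming contract.

```python
import re
from datetime import datetime, timezone

# Inferred from the archive names in this diff; treat as an assumption.
ARCHIVE_RE = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})_(?P<edition>\d+\.\d+\.\d+)"
    r"_(?P<spark>\d+\.\d+)_(?P<millis>\d{13})\.zip$"
)


def parse_archive_name(filename: str) -> dict:
    """Split a model archive file name into its apparent components."""
    match = ARCHIVE_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected archive name: {filename}")
    info = match.groupdict()
    # Interpret the trailing number as milliseconds since the Unix epoch.
    info["timestamp"] = datetime.fromtimestamp(
        int(info["millis"]) // 1000, tz=timezone.utc
    )
    return info


info = parse_archive_name("albert_embeddings_ALR_BERT_ro_5.0.2_3.0_1690752767725.zip")
print(info["name"], info["lang"], info["edition"], info["spark"], info["timestamp"].date())
```

For this file the decoded date falls on the same day as the card's `date` field (2023-07-30), which supports the epoch-milliseconds reading for this example at least.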