2023-09-12-tiny_mlm_glue_rte_en (#13975)
* Add model 2023-09-13-bert_base_finnish_cased_v1_fi

* Add model 2023-09-13-bert_tiny_finetuned_nan_labels_nepal_bhasa_longer_en

* Add model 2023-09-12-dlub_2022_mlm_en

* Add model 2023-09-13-retbert_en

* Add model 2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_6_en

* Add model 2023-09-13-model_65000_20ep_en

* Add model 2023-09-13-arbert_ar

* Add model 2023-09-12-norbert_no

* Add model 2023-09-13-tiny_clinicalbert_en

* Add model 2023-09-13-bio_minialbert_128_en

* Add model 2023-09-13-bert_base_uncased_finetuning_en

* Add model 2023-09-13-clinical_minialbert_312_en

* Add model 2023-09-13-model_bangla_bert_en

* Add model 2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_12_en

* Add model 2023-09-13-applicanttrackingsystembert_en

* Add model 2023-09-13-bert_base_uncased_2022_habana_test_5_en

* Add model 2023-09-13-muril_with_mlm_cased_temp_en

* Add model 2023-09-12-bert_base_greek_uncased_v1_el

* Add model 2023-09-12-minilmv2_l6_h384_distilled_from_bert_large_en

* Add model 2023-09-13-german_medbert_de

* Add model 2023-09-13-cordbert_1000_v1_en

* Add model 2023-09-12-small_mlm_glue_cola_from_scratch_custom_tokenizer_expand_vocab_en

* Add model 2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_14_en

* Add model 2023-09-13-bert_base_uncased_2022_habana_test_6_en

* Add model 2023-09-13-distil_biobert_en

* Add model 2023-09-13-bert_large_cased_finetuned_lowr100_2_cased_da_20_en

* Add model 2023-09-13-bert_base_finnish_uncased_v1_fi

* Add model 2023-09-13-compact_biobert_en

* Add model 2023-09-13-rubert_large_ru

* Add model 2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_en

* Add model 2023-09-13-scibert_scivocab_uncased_finetuned_scibero_en

* Add model 2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_16_en

* Add model 2023-09-13-bert_small_nan_labels_500_en

* Add model 2023-09-13-mwp_bert_english_en

* Add model 2023-09-13-bert_nlp_en

* Add model 2023-09-13-bert_dk_laptop_en

* Add model 2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_17_en

* Add model 2023-09-12-bertbasekk_1e_en

* Add model 2023-09-13-clr_finetuned_bert_large_uncased_en

* Add model 2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr100_40_en

* Add model 2023-09-13-hindi_bert_v1_hi

* Add model 2023-09-13-hindi_bert_en

* Add model 2023-09-13-bert_uncased_tiny_multi_emails_hq_en

* Add model 2023-09-13-bert_uncased_tiny_2xthicc_multi_emails_hq_en

* Add model 2023-09-13-bert_pt_laptop_en

* Add model 2023-09-13-tsonga_test_en

* Add model 2023-09-13-bert_large_cased_sigir_support_norwegian_label_40_sigir_tune2nd_lr100_labelled_30_en

* Add model 2023-09-13-bert_pt_rest_en

* Add model 2023-09-13-mbertu_arabic_en

* Add model 2023-09-13-hindi_marathi_dev_bert_hi

* Add model 2023-09-13-estbert_et

* Add model 2023-09-13-biobert_base_1.2_en

* Add model 2023-09-13-german_bert_base_german_cased_finetuned_en

* Add model 2023-09-13-bert_hinglish_big_en

* Add model 2023-09-13-marathi_tweets_bert_mr

* Add model 2023-09-13-scibert_scivocab_uncased_finetuned_scibert_agu_abstracts_en

* Add model 2023-09-13-dapt_bert_ko

* Add model 2023-09-13-adrbert_base_p1_en

* Add model 2023-09-12-bert_base_uncased_sparse_90_unstructured_pruneofa_en

* Add model 2023-09-13-dragon_plus_query_encoder_en

* Add model 2023-09-13-dragon_plus_context_encoder_en

* Add model 2023-09-12-small_mlm_glue_mrpc_custom_tokenizer_expand_vocab_en

* Add model 2023-09-13-moresexistbert_en

* Add model 2023-09-13-bert_uncased_l_2_h_256_a_4_mlm_multi_emails_hq_en

* Add model 2023-09-13-slimr_msmarco_passage_en

* Add model 2023-09-13-bert_large_swedish_uncased_en

* Add model 2023-09-13-mymodel1007_en

* Add model 2023-09-13-financialbert_en

* Add model 2023-09-13-recipe_bert_base_uncased_en

* Add model 2023-09-13-mlm_gh_issues_en

* Add model 2023-09-13-bert_base_uncased_rotten_tomatoes_en

* Add model 2023-09-13-bio_tinybert_en

* Add model 2023-09-13-bert_large_cased_finetuned_low20_cased_da_20_en

* Add model 2023-09-13-bert_small_finetuned_finer_en

* Add model 2023-09-13-biomedical_en

* Add model 2023-09-13-transformer_exercise_01_en

* Add model 2023-09-13-esci_mlm_alllang_bert_base_uncased_en

* Add model 2023-09-13-neuba_bert_en

* Add model 2023-09-13-marathi_tweets_bert_hateful_mr

* Add model 2023-09-13-bert_yelp_en

* Add model 2023-09-13-archeobertje_en

* Add model 2023-09-12-tiny_mlm_glue_rte_custom_tokenizer_expand_vocab_en

* Add model 2023-09-13-minialbert_128_en

* Add model 2023-09-13-rubiobert_en

* Add model 2023-09-13-slim_beir_scifact_old_en

* Add model 2023-09-13-burmese_awesome_model_alexyalunin_en

* Add model 2023-09-13-bert_base_uncased_bert_mask_complete_word_en

* Add model 2023-09-13-bert_small_finer_en

* Add model 2023-09-13-dziribert_ar

* Add model 2023-09-13-lesssexistbert_en

* Add model 2023-09-13-sw_v1_sw

* Add model 2023-09-13-czert_b_base_cased_cs

* Add model 2023-09-13-bert_java_bfp_single_en

* Add model 2023-09-13-chupeto_en

* Add model 2023-09-13-bert_small_finetuned_finer_longer10_en

* Add model 2023-09-13-estbert_512_et

* Add model 2023-09-13-bert_hinglish_small_en

* Add model 2023-09-13-algarlegal_large_arabertv2_en

* Add model 2023-09-13-absa_mlm_1_en

* Add model 2023-09-13-marathi_bert_scratch_mr

* Add model 2023-09-13-protaugment_lm_banking77_en

* Add model 2023-09-13-bert_base_uncased_lm_en

* Add model 2023-09-13-hindi_bert_scratch_hi

* Add model 2023-09-13-sentmae_beir_en

* Add model 2023-09-13-bert_large_swedish_nordic_pile_150_en

* Add model 2023-09-13-hindi_marathi_dev_bert_scratch_hi

* Add model 2023-09-13-bert_base_uncased_contents_en

* Add model 2023-09-13-bert_base_5lang_cased_xx

* Add model 2023-09-13-dpr_catalan_question_encoder_viquiquad_base_en

* Add model 2023-09-13-qst_ar

* Add model 2023-09-13-bert_small_pretrained_on_squad_en

* Add model 2023-09-13-bert_base_macedonian_bulgarian_cased_en

* Add model 2023-09-13-qsr_ar

* Add model 2023-09-13-bert_base_24_en

* Add model 2023-09-13-bert_base_macedonian_cased_en

* Add model 2023-09-13-e4a_permits_bert_base_romanian_cased_v1_en

* Add model 2023-09-13-qe3_ar

* Add model 2023-09-13-qe6_ar

* Add model 2023-09-13-bert_base_48_en

* Add model 2023-09-13-bert_csl_gold8k_en

* Add model 2023-09-12-tiny_mlm_glue_cola_custom_tokenizer_expand_vocab_en

* Add model 2023-09-13-bert_base_cased_finetuned_imdb_en

* Add model 2023-09-13-bert_uncased_l_10_h_512_a_8_cord19_200616_en

* Add model 2023-09-13-legalnlp_bert_pt

* Add model 2023-09-13-bertimbau_base_finetuned_lener_breton_pt

* Add model 2023-09-13-bert_base_uncased_reviews_2_en

* Add model 2023-09-13-bert_base_cased_finetuned_semeval2017_mlm_en

* Add model 2023-09-13-topic_erica_bert_en

* Add model 2023-09-13-opticalbert_cased_en

* Add model 2023-09-13-bert_uncased_l_6_h_128_a_2_cord19_200616_en

* Add model 2023-09-13-notram_bert_norwegian_cased_080321_no

* Add model 2023-09-13-bert_small_finer_longer_en

* Add model 2023-09-13-dummy_model_aripo99_en

* Add model 2023-09-12-bert_base_parsbert_uncased_en

* Add model 2023-09-13-opticalbert_uncased_en

* Add model 2023-09-13-bert_base_uncased_reviews_3_en

* Add model 2023-09-13-bert_base_72_en

* Add model 2023-09-13-opticalpurebert_cased_en

* Add model 2023-09-13-opticalpurebert_uncased_en

* Add model 2023-09-13-chefberto_italian_cased_it

* Add model 2023-09-13-bert_base_96_en

* Add model 2023-09-13-wineberto_italian_cased_it

* Add model 2023-09-13-malay_bert_en

* Add model 2023-09-13-mathbert_custom_en

* Add model 2023-09-13-dpr_catalan_passage_encoder_viquiquad_base_en

* Add model 2023-09-13-wobert_chinese_plus_zh

* Add model 2023-09-12-mbert_tlm_sent_english_german_en

* Add model 2023-09-13-dam_bert_base_mlm_msmarco_lotte_write_test_en

* Add model 2023-09-13-bert_uncased_l_4_h_512_a_8_cord19_200616_en

* Add model 2023-09-13-test_telsayed_en

* Add model 2023-09-13-contractbr_bert_base_portuguese_en

* Add model 2023-09-13-javanese_bert_small_imdb_jv

* Add model 2023-09-13-bert_mini_arabic_ar

* Add model 2023-09-13-alberti_bert_base_multilingual_cased_linhd_postdata_xx

* Add model 2023-09-13-e4a_covid_bert_base_romanian_cased_v1_en

* Add model 2023-09-13-qse_en

* Add model 2023-09-13-marbert_ar

* Add model 2023-09-13-hindi_tweets_bert_hi

* Add model 2023-09-13-bert_base_uncased_finetuned_bertbero_en

* Add model 2023-09-13-bert_large_cased_finetuned_lowr10_0_cased_da_20_en

* Add model 2023-09-13-bert_base_arabertv02_twitter_ar

* Add model 2023-09-13-bert_wwm_words_law_en

* Add model 2023-09-13-dbbert_el

* Add model 2023-09-13-bert_base_arabertv02_ar

* Add model 2023-09-13-burmese_awesome_eli5_mlm_model_en

* Add model 2023-09-13-bert_yelp_local_en

* Add model 2023-09-13-simlm_base_msmarco_en

* Add model 2023-09-13-mymodel005_en

* Add model 2023-09-13-mymodel007_wbmitcast_en

* Add model 2023-09-13-condenser_en

* Add model 2023-09-13-qsrt_ar

* Add model 2023-09-13-mbert_finetuned_pytorch_en

* Add model 2023-09-13-bert_uncased_l_2_h_512_a_8_cord19_200616_en

* Add model 2023-09-13-clinical_pubmed_bert_base_128_en

* Add model 2023-09-13-energybert_en

* Add model 2023-09-13-bert_base_uncased_noisy_orcas_1.0positive_0.5_negative_margin1.0_cosine_en

* Add model 2023-09-13-improvedabg_20_epochs_en

* Add model 2023-09-13-labse_english_russian_erzya_v1_ru

* Add model 2023-09-13-inlegalbert_en

* Add model 2023-09-13-bert_base_uncased_test_en

* Add model 2023-09-13-hindi_tweets_bert_hateful_hi

* Add model 2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_40_3_en

* Add model 2023-09-13-gujibert_fan_en

* Add model 2023-09-13-dictbert_en

* Add model 2023-09-13-mabepa_sts_es

* Add model 2023-09-13-hindi_least_haitian_1m_hi

* Add model 2023-09-12-small_mlm_glue_mrpc_from_scratch_custom_tokenizer_expand_vocab_en

* Add model 2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_40_2_en

* Add model 2023-09-13-hebert_en

* Add model 2023-09-12-cro_cov_csebert_en

* Add model 2023-09-13-paraphraserplus_1epoch_en

* Add model 2023-09-13-nl1_en

* Add model 2023-09-13-parlamint_en

* Add model 2023-09-13-bert_base_finetuned_wellness_en

* Add model 2023-09-13-ern3_en

* Add model 2023-09-13-splade_cocondenser_selfdistil_baseplate_en

* Add model 2023-09-12-akeylegalbert_inscotus_and_ledgar_en

* Add model 2023-09-13-bert_base_dutch_cased_en

* Add model 2023-09-12-bert_base_english_french_chinese_cased_en

* Add model 2023-09-13-kpfbert_base_en

* Add model 2023-09-13-kcbert_base_petition_en

* Add model 2023-09-13-batterybert_uncased_en

* Add model 2023-09-13-berel_sivan22_he

* Add model 2023-09-13-first_model_en

* Add model 2023-09-13-bert_base_uncased_bert_mask_complete_word_updated_vocab_en

* Add model 2023-09-13-bert_base_120_en

* Add model 2023-09-13-nepalibert_ne

* Add model 2023-09-13-bert_cn_wudi7758521521_en

* Add model 2023-09-12-bert_c2_english_german_en

* Add model 2023-09-13-bert_small_finetuned_eurlex_longer_en

* Add model 2023-09-13-bert_base_uncased_finetuned_gap_en

* Add model 2023-09-13-clr_finetuned_bert_base_uncased_en

* Add model 2023-09-13-biblitbert_1_en

* Add model 2023-09-13-model_ankai_en

* Add model 2023-09-13-bert_base_standard_bahasa_cased_en

* Add model 2023-09-13-bert_portuguese_institutional_corpus_v.1_en

* Add model 2023-09-13-bert_base_uncased_issues_128_twidfeel_en

* Add model 2023-09-13-bert_base_vn_finetuned_portuguese_en

* Add model 2023-09-13-bert_model_nyashavision22_en

* Add model 2023-09-13-kcbert_base_dev_en

* Add model 2023-09-13-bert_finetuning_test_xiejiafang_en

* Add model 2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr50_8_fast_13_en

* Add model 2023-09-13-bert_base_uncased_dstc9_en

* Add model 2023-09-13-bert_embding_finetuned_spmlm_02_en

* Add model 2023-09-13-sentmae_en

* Add model 2023-09-13-bert_base_uncased_issues_128_munsu_en

* Add model 2023-09-13-pcscibert_cased_en

* Add model 2023-09-13-bert_base_uncased_2022_habana_test_1_en

* Add model 2023-09-13-kcbert_large_dev_en

* Add model 2023-09-13-bert_truncate_en

* Add model 2023-09-13-bert_large_arabertv2_ar

* Add model 2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_20_0_en

* Add model 2023-09-13-kcbert_large_en

* Add model 2023-09-12-cysecbert_en

* Add model 2023-09-13-scibert_scivocab_uncased_long_4096_en

* Add model 2023-09-13-bert_large_cased_sigir_support_refute_norwegian_label_40_2nd_test_lr10_20_1_en

* Add model 2023-09-13-bert_finetuning_test1_en

* Add model 2023-09-13-bert_concat_3_en

* Add model 2023-09-13-bert_base_spanish_wwm_cased_finetuned_literature_pro_en

* Add model 2023-09-13-bioptimus_en

* Add model 2023-09-13-bert_base_nli_stsb_en

* Add model 2023-09-13-batterybert_cased_en

* Add model 2023-09-13-bert_finetune_simcse_truncate_en

* Add model 2023-09-13-bert_base_nli_en

* Add model 2023-09-13-further_train_original_10_en

* Add model 2023-09-13-bert_concat_3_finetune_simcse_truncate_en

* Add model 2023-09-13-bert_base_uncased_binwang_en

* Add model 2023-09-13-legalbert_adept_en

* Add model 2023-09-13-tlm_ag_medium_scale_en

* Add model 2023-09-13-bert_concat_2_finetune_simcse_truncate_en

---------

Co-authored-by: ahmedlone127 <ahmedlone127@gmail.com>
jsl-models and ahmedlone127 authored Sep 13, 2023
1 parent 16c83c2 commit f3c878e
Showing 766 changed files with 71,238 additions and 0 deletions.
93 changes: 93 additions & 0 deletions docs/_posts/ahmedlone127/2023-09-12-20split_dataset_en.md
@@ -0,0 +1,93 @@
---
layout: model
title: English 20split_dataset BertEmbeddings from Billwzl
author: John Snow Labs
name: 20split_dataset
date: 2023-09-12
tags: [bert, en, open_source, fill_mask, onnx]
task: Embeddings
language: en
edition: Spark NLP 5.1.1
spark_version: 3.0
supported: true
engine: onnx
annotator: BertEmbeddings
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `20split_dataset` is an English model originally trained by Billwzl.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/20split_dataset_en_5.1.1_3.0_1694558868282.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/20split_dataset_en_5.1.1_3.0_1694558868282.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
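
The archive name in the links above appears to pack the card's metadata into one string: model name, language, Spark NLP version, Spark version, and a build timestamp in milliseconds. The following is a minimal sketch under that assumption. The layout is inferred from this card's front matter, not from documented naming rules, and `parse_artifact_name` is a hypothetical helper, not part of Spark NLP.

```python
import re
from datetime import datetime, timezone

# Assumed layout: <name>_<lang>_<Spark NLP version>_<Spark version>_<epoch millis>.zip
ARTIFACT_RE = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})"
    r"_(?P<nlp>\d+\.\d+\.\d+)_(?P<spark>\d+\.\d+)_(?P<ts>\d{13})\.zip$"
)

def parse_artifact_name(filename):
    """Split an archive file name into its assumed components."""
    match = ARTIFACT_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected artifact name: {filename}")
    parts = match.groupdict()
    # The trailing number is read as milliseconds since the Unix epoch (UTC).
    millis = int(parts.pop("ts"))
    parts["built"] = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
    return parts

info = parse_artifact_name("20split_dataset_en_5.1.1_3.0_1694558868282.zip")
print(info["name"], info["lang"], info["nlp"], info["spark"], info["built"].date())
# prints: 20split_dataset en 5.1.1 3.0 2023-09-12
```

The decoded date matches the `date` field in the front matter, which is some evidence that the timestamp reading is right, but the pattern should be treated as a convenience, not a contract.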

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python


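from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, BertEmbeddings

# `data` is a Spark DataFrame with a string column named "text"
document_assembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("documents")

# BertEmbeddings reads the "token" column, so tokenize first
tokenizer = Tokenizer() \
    .setInputCols(["documents"]) \
    .setOutputCol("token")

embeddings = BertEmbeddings.pretrained("20split_dataset", "en") \
    .setInputCols(["documents", "token"]) \
    .setOutputCol("embeddings")

pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])

pipelineModel = pipeline.fit(data)

pipelineDF = pipelineModel.transform(data)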

```
```scala


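import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.Tokenizer
import com.johnsnowlabs.nlp.embeddings.BertEmbeddings
import org.apache.spark.ml.Pipeline

// `data` is a DataFrame with a string column named "text"
val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("documents")

// BertEmbeddings reads the "token" column, so tokenize first
val tokenizer = new Tokenizer()
  .setInputCols(Array("documents"))
  .setOutputCol("token")

val embeddings = BertEmbeddings
  .pretrained("20split_dataset", "en")
  .setInputCols(Array("documents", "token"))
  .setOutputCol("embeddings")

val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))

val pipelineModel = pipeline.fit(data)

val pipelineDF = pipelineModel.transform(data)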


```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|20split_dataset|
|Compatibility:|Spark NLP 5.1.1+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[documents, token]|
|Output Labels:|[embeddings]|
|Language:|en|
|Size:|407.0 MB|

## References

https://huggingface.co/Billwzl/20split_dataset
@@ -0,0 +1,93 @@
---
layout: model
title: English abena_base_akuapem_twi_cased BertEmbeddings from Ghana-NLP
author: John Snow Labs
name: abena_base_akuapem_twi_cased
date: 2023-09-12
tags: [bert, en, open_source, fill_mask, onnx]
task: Embeddings
language: en
edition: Spark NLP 5.1.1
spark_version: 3.0
supported: true
engine: onnx
annotator: BertEmbeddings
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `abena_base_akuapem_twi_cased` is an English model originally trained by Ghana-NLP.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/abena_base_akuapem_twi_cased_en_5.1.1_3.0_1694558470329.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/abena_base_akuapem_twi_cased_en_5.1.1_3.0_1694558470329.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
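
The two links above point at the same object: the Download URL is the path-style HTTPS form of the S3 URI. As a minimal sketch of that relationship, the hypothetical helper `s3_to_https` below (not part of Spark NLP) rewrites one into the other, assuming the bucket is served through the `s3.amazonaws.com` endpoint as it is on this card.

```python
from urllib.parse import urlparse

def s3_to_https(s3_uri):
    """Turn s3://bucket/key into the path-style HTTPS URL used by the Download link."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {s3_uri}")
    # Path-style addressing: the bucket name becomes the first path segment.
    return f"https://s3.amazonaws.com/{parsed.netloc}{parsed.path}"

uri = (
    "s3://auxdata.johnsnowlabs.com/public/models/"
    "abena_base_akuapem_twi_cased_en_5.1.1_3.0_1694558470329.zip"
)
print(s3_to_https(uri))
```

This is handy when a script holds only the S3 URI but needs to fetch the archive with a plain HTTP client.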

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python


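from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, BertEmbeddings

# `data` is a Spark DataFrame with a string column named "text"
document_assembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("documents")

# BertEmbeddings reads the "token" column, so tokenize first
tokenizer = Tokenizer() \
    .setInputCols(["documents"]) \
    .setOutputCol("token")

embeddings = BertEmbeddings.pretrained("abena_base_akuapem_twi_cased", "en") \
    .setInputCols(["documents", "token"]) \
    .setOutputCol("embeddings")

pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])

pipelineModel = pipeline.fit(data)

pipelineDF = pipelineModel.transform(data)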

```
```scala


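import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.Tokenizer
import com.johnsnowlabs.nlp.embeddings.BertEmbeddings
import org.apache.spark.ml.Pipeline

// `data` is a DataFrame with a string column named "text"
val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("documents")

// BertEmbeddings reads the "token" column, so tokenize first
val tokenizer = new Tokenizer()
  .setInputCols(Array("documents"))
  .setOutputCol("token")

val embeddings = BertEmbeddings
  .pretrained("abena_base_akuapem_twi_cased", "en")
  .setInputCols(Array("documents", "token"))
  .setOutputCol("embeddings")

val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))

val pipelineModel = pipeline.fit(data)

val pipelineDF = pipelineModel.transform(data)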


```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|abena_base_akuapem_twi_cased|
|Compatibility:|Spark NLP 5.1.1+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[documents, token]|
|Output Labels:|[embeddings]|
|Language:|en|
|Size:|664.5 MB|

## References

https://huggingface.co/Ghana-NLP/abena-base-akuapem-twi-cased
@@ -0,0 +1,93 @@
---
layout: model
title: English abena_base_asante_twi_uncased BertEmbeddings from Ghana-NLP
author: John Snow Labs
name: abena_base_asante_twi_uncased
date: 2023-09-12
tags: [bert, en, open_source, fill_mask, onnx]
task: Embeddings
language: en
edition: Spark NLP 5.1.1
spark_version: 3.0
supported: true
engine: onnx
annotator: BertEmbeddings
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `abena_base_asante_twi_uncased` is an English model originally trained by Ghana-NLP.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/abena_base_asante_twi_uncased_en_5.1.1_3.0_1694558682719.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/abena_base_asante_twi_uncased_en_5.1.1_3.0_1694558682719.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python


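from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, BertEmbeddings

# `data` is a Spark DataFrame with a string column named "text"
document_assembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("documents")

# BertEmbeddings reads the "token" column, so tokenize first
tokenizer = Tokenizer() \
    .setInputCols(["documents"]) \
    .setOutputCol("token")

embeddings = BertEmbeddings.pretrained("abena_base_asante_twi_uncased", "en") \
    .setInputCols(["documents", "token"]) \
    .setOutputCol("embeddings")

pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])

pipelineModel = pipeline.fit(data)

pipelineDF = pipelineModel.transform(data)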

```
```scala


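import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.Tokenizer
import com.johnsnowlabs.nlp.embeddings.BertEmbeddings
import org.apache.spark.ml.Pipeline

// `data` is a DataFrame with a string column named "text"
val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("documents")

// BertEmbeddings reads the "token" column, so tokenize first
val tokenizer = new Tokenizer()
  .setInputCols(Array("documents"))
  .setOutputCol("token")

val embeddings = BertEmbeddings
  .pretrained("abena_base_asante_twi_uncased", "en")
  .setInputCols(Array("documents", "token"))
  .setOutputCol("embeddings")

val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))

val pipelineModel = pipeline.fit(data)

val pipelineDF = pipelineModel.transform(data)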


```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|abena_base_asante_twi_uncased|
|Compatibility:|Spark NLP 5.1.1+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[documents, token]|
|Output Labels:|[embeddings]|
|Language:|en|
|Size:|664.3 MB|

## References

https://huggingface.co/Ghana-NLP/abena-base-asante-twi-uncased
@@ -0,0 +1,93 @@
---
layout: model
title: English abstract_sim_query_pubmed BertEmbeddings from biu-nlp
author: John Snow Labs
name: abstract_sim_query_pubmed
date: 2023-09-12
tags: [bert, en, open_source, fill_mask, onnx]
task: Embeddings
language: en
edition: Spark NLP 5.1.1
spark_version: 3.0
supported: true
engine: onnx
annotator: BertEmbeddings
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained BertEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `abstract_sim_query_pubmed` is an English model originally trained by biu-nlp.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/abstract_sim_query_pubmed_en_5.1.1_3.0_1694561530585.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/abstract_sim_query_pubmed_en_5.1.1_3.0_1694561530585.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python


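from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, BertEmbeddings

# `data` is a Spark DataFrame with a string column named "text"
document_assembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("documents")

# BertEmbeddings reads the "token" column, so tokenize first
tokenizer = Tokenizer() \
    .setInputCols(["documents"]) \
    .setOutputCol("token")

embeddings = BertEmbeddings.pretrained("abstract_sim_query_pubmed", "en") \
    .setInputCols(["documents", "token"]) \
    .setOutputCol("embeddings")

pipeline = Pipeline().setStages([document_assembler, tokenizer, embeddings])

pipelineModel = pipeline.fit(data)

pipelineDF = pipelineModel.transform(data)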

```
```scala


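import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.Tokenizer
import com.johnsnowlabs.nlp.embeddings.BertEmbeddings
import org.apache.spark.ml.Pipeline

// `data` is a DataFrame with a string column named "text"
val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("documents")

// BertEmbeddings reads the "token" column, so tokenize first
val tokenizer = new Tokenizer()
  .setInputCols(Array("documents"))
  .setOutputCol("token")

val embeddings = BertEmbeddings
  .pretrained("abstract_sim_query_pubmed", "en")
  .setInputCols(Array("documents", "token"))
  .setOutputCol("embeddings")

val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings))

val pipelineModel = pipeline.fit(data)

val pipelineDF = pipelineModel.transform(data)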


```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|abstract_sim_query_pubmed|
|Compatibility:|Spark NLP 5.1.1+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[documents, token]|
|Output Labels:|[embeddings]|
|Language:|en|
|Size:|410.2 MB|

## References

https://huggingface.co/biu-nlp/abstract-sim-query-pubmed