diff --git a/docs/_posts/ahmedlone127/2024-02-01-bert_zero_shot_classifier_mnli_xx.md b/docs/_posts/ahmedlone127/2024-02-01-bert_zero_shot_classifier_mnli_xx.md
new file mode 100644
index 00000000000000..3ecb405f28a104
--- /dev/null
+++ b/docs/_posts/ahmedlone127/2024-02-01-bert_zero_shot_classifier_mnli_xx.md
@@ -0,0 +1,107 @@
---
layout: model
title: BERT Zero-Shot Classification Base - MNLI (bert_zero_shot_classifier_mnli)
author: John Snow Labs
name: bert_zero_shot_classifier_mnli
date: 2024-02-01
tags: [xx, open_source, onnx]
task: Zero-Shot Classification
language: xx
edition: Spark NLP 5.2.4
spark_version: 3.4
supported: true
engine: onnx
annotator: BertForZeroShotClassification
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

This model is intended for zero-shot text classification and is fine-tuned on MNLI.

BertForZeroShotClassification uses a sequence-classification model trained on NLI (natural language inference) tasks. It is equivalent to the BertForSequenceClassification models, except that it does not require a hardcoded set of potential classes: the candidate labels can be chosen at runtime. This usually makes inference slower, but far more flexible.

We used TFBertForSequenceClassification to train this model and the BertForZeroShotClassification annotator in Spark NLP 🚀 for prediction at scale!

## Predicted Entities



{:.btn-box}


[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bert_zero_shot_classifier_mnli_xx_5.2.4_3.4_1706784558791.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bert_zero_shot_classifier_mnli_xx_5.2.4_3.4_1706784558791.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use


{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, BertForZeroShotClassification
from pyspark.ml import Pipeline

document_assembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

# Candidate labels are supplied at runtime; no retraining is needed to change them.
zeroShotClassifier = BertForZeroShotClassification \
    .pretrained('bert_zero_shot_classifier_mnli', 'xx') \
    .setInputCols(['token', 'document']) \
    .setOutputCol('class') \
    .setCaseSensitive(True) \
    .setMaxSentenceLength(512) \
    .setCandidateLabels(["urgent", "mobile", "travel", "movie", "music", "sport", "weather", "technology"])

pipeline = Pipeline(stages=[
    document_assembler,
    tokenizer,
    zeroShotClassifier
])

example = spark.createDataFrame([['I have a problem with my iphone that needs to be resolved asap!!']]).toDF("text")
result = pipeline.fit(example).transform(example)
```
```scala
import com.johnsnowlabs.nlp.base.DocumentAssembler
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._

val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols("document")
  .setOutputCol("token")

// Candidate labels are supplied at runtime; no retraining is needed to change them.
val zeroShotClassifier = BertForZeroShotClassification.pretrained("bert_zero_shot_classifier_mnli", "xx")
  .setInputCols("document", "token")
  .setOutputCol("class")
  .setCaseSensitive(true)
  .setMaxSentenceLength(512)
  .setCandidateLabels(Array("urgent", "mobile", "travel", "movie", "music", "sport", "weather", "technology"))

val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, zeroShotClassifier))

val example = Seq("I have a problem with my iphone that needs to be resolved asap!!").toDS.toDF("text")

val result = pipeline.fit(example).transform(example)
```
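After `transform`, each document receives one annotation in the `class` output column, and the annotation's `result` field holds the chosen candidate label. The snippet below is a minimal sketch of how the predictions can be inspected; it assumes the `result` DataFrame produced by the Python example above.

```python
# Minimal sketch, assuming `result` is the DataFrame returned by the pipeline above.
# Selecting the nested field `class.result` pulls the predicted candidate label
# out of the annotation structs produced by the zero-shot classifier.
result.select("text", "class.result").show(truncate=False)
```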
+ +{:.model-param} +## Model Information + +{:.table-model} +|---|---| +|Model Name:|bert_zero_shot_classifier_mnli| +|Compatibility:|Spark NLP 5.2.4+| +|License:|Open Source| +|Edition:|Official| +|Input Labels:|[token, document]| +|Output Labels:|[label]| +|Language:|xx| +|Size:|409.1 MB| +|Case sensitive:|true| \ No newline at end of file