2023-10-19-asr_whisper_small_urdu_1000_64_1e_05_pretrain_arabic_en #14032

Merged
85 commits
7e27249
Add model 2023-10-19-asr_whisper_small_urdu_1000_64_1e_05_pretrain_ar…
ahmedlone127 Oct 19, 2023
21fcc76
Add model 2023-10-19-asr_whisper_small_chinese_tw_voidful_en
ahmedlone127 Oct 19, 2023
fabe975
Add model 2023-10-19-asr_whisper_small_chinese_tw_voidful_pipeline_en
ahmedlone127 Oct 19, 2023
fa03476
Add model 2023-10-19-asr_whisper_small_bak_en
ahmedlone127 Oct 19, 2023
92e1e8b
Add model 2023-10-19-asr_whisper_small_bak_pipeline_en
ahmedlone127 Oct 19, 2023
026879c
Add model 2023-10-19-asr_whisper_small_urdu_1000_64_1e_05_pretrain_ar…
ahmedlone127 Oct 19, 2023
d14cd49
Add model 2023-10-19-asr_whisper_small_urdu_1000_64_1e_05_pretrain_ar…
ahmedlone127 Oct 19, 2023
f185377
Add model 2023-10-19-asr_personal_whisper_small_english_model_en
ahmedlone127 Oct 19, 2023
7c2fef9
Add model 2023-10-19-asr_personal_whisper_small_english_model_pipelin…
ahmedlone127 Oct 19, 2023
fbe75c2
Add model 2023-10-19-asr_whisper_tiny_tamil_example_ta
ahmedlone127 Oct 19, 2023
ab01bed
Add model 2023-10-19-asr_whisper_tiny_tamil_example_pipeline_ta
ahmedlone127 Oct 19, 2023
7962694
Add model 2023-10-19-asr_whisper_small_hindi_xinhuang_pipeline_hi
ahmedlone127 Oct 19, 2023
c2eee67
Add model 2023-10-19-asr_whisper_small_swedish_test_3000_sv
ahmedlone127 Oct 19, 2023
10c9194
Add model 2023-10-19-asr_whisper_small_hindi_xinhuang_hi
ahmedlone127 Oct 19, 2023
8447349
Add model 2023-10-19-asr_whisper_small_swedish_test_3000_pipeline_sv
ahmedlone127 Oct 19, 2023
b4d269d
Add model 2023-10-19-asr_whisper_malayalam_first_model_ml
ahmedlone127 Oct 19, 2023
f7af64b
Add model 2023-10-19-asr_whisper_malayalam_first_model_pipeline_ml
ahmedlone127 Oct 19, 2023
b454a3c
Add model 2023-10-19-asr_whisper_small_lithuanian_deividasm_lt
ahmedlone127 Oct 19, 2023
53986f1
Add model 2023-10-19-asr_whisper_small_lithuanian_deividasm_pipeline_lt
ahmedlone127 Oct 19, 2023
cc5b6a1
Add model 2023-10-19-asr_whisper_lithuanian_finetune_lt
ahmedlone127 Oct 19, 2023
8e277f8
Add model 2023-10-19-asr_whisper_lithuanian_finetune_pipeline_lt
ahmedlone127 Oct 19, 2023
a09dfff
Add model 2023-10-19-asr_whisper_small_bengali_subhadeep_en
ahmedlone127 Oct 19, 2023
5e22fd1
Add model 2023-10-19-asr_whisper_small_bengali_subhadeep_pipeline_en
ahmedlone127 Oct 19, 2023
f8f53b0
Add model 2023-10-19-asr_whisper_small_chinesebasetw_zh
ahmedlone127 Oct 19, 2023
ddc3df1
Add model 2023-10-19-asr_whisper_small_chinesebasetw_pipeline_zh
ahmedlone127 Oct 19, 2023
d75e6a1
Add model 2023-10-19-asr_whisper_small_swedish_se_afroanton_en
ahmedlone127 Oct 19, 2023
59bd5c1
Add model 2023-10-19-asr_whisper_small_swedish_se_afroanton_pipeline_en
ahmedlone127 Oct 19, 2023
c7db256
Add model 2023-10-19-asr_whisper_small_uzbek_uz
ahmedlone127 Oct 19, 2023
9cf28b9
Add model 2023-10-19-asr_whisper_small_uzbek_pipeline_uz
ahmedlone127 Oct 19, 2023
d1a1e2d
Add model 2023-10-19-asr_whisper_small_nepali_np_ne
ahmedlone127 Oct 19, 2023
ad00188
Add model 2023-10-19-asr_whisper_small_nepali_np_pipeline_ne
ahmedlone127 Oct 19, 2023
5ecbc59
Add model 2023-10-19-asr_whisper_small_polish_aspik101_pl
ahmedlone127 Oct 19, 2023
097f9d9
Add model 2023-10-19-asr_whisper_small_polish_aspik101_pipeline_pl
ahmedlone127 Oct 19, 2023
8704372
Add model 2023-10-19-asr_whisper_tiny_polish_pl
ahmedlone127 Oct 19, 2023
974ac2c
Add model 2023-10-19-asr_whisper_tiny_polish_pipeline_pl
ahmedlone127 Oct 19, 2023
cf34cfb
Add model 2023-10-20-asr_whisper_small_english_blueraccoon_en
ahmedlone127 Oct 20, 2023
62e5d7b
Add model 2023-10-20-asr_whisper_small_english_blueraccoon_pipeline_en
ahmedlone127 Oct 20, 2023
c3f20a0
Add model 2023-10-20-asr_whisper_small_spanish_1e_6_en
ahmedlone127 Oct 20, 2023
3585b3f
Add model 2023-10-20-asr_whisper_small_dutch_nl
ahmedlone127 Oct 20, 2023
a795103
Add model 2023-10-20-asr_whisper_small_spanish_1e_6_pipeline_en
ahmedlone127 Oct 20, 2023
19e7985
Add model 2023-10-20-asr_whisper_small_dutch_pipeline_nl
ahmedlone127 Oct 20, 2023
796c3d9
Add model 2023-10-20-asr_whisper_small_armenian_hy
ahmedlone127 Oct 20, 2023
6dea9e6
Add model 2023-10-20-asr_whisper_small_armenian_pipeline_hy
ahmedlone127 Oct 20, 2023
201a4f5
Add model 2023-10-20-asr_whisper_small_finnish_sgangireddy_fi
ahmedlone127 Oct 20, 2023
f7c38f6
Add model 2023-10-20-asr_whisper_small_finnish_sgangireddy_pipeline_fi
ahmedlone127 Oct 20, 2023
cb37b3a
Add model 2023-10-20-asr_whisper_base_swedish_en
ahmedlone127 Oct 20, 2023
8cb7fc1
Add model 2023-10-20-asr_whisper_base_swedish_pipeline_en
ahmedlone127 Oct 20, 2023
a2db2d4
Add model 2023-10-20-asr_whisper_small_lithuanian_serbian_v2_en
ahmedlone127 Oct 20, 2023
7cdf6b7
Add model 2023-10-20-asr_whisper_small_lithuanian_serbian_v2_pipeline_en
ahmedlone127 Oct 20, 2023
b7d2a66
Add model 2023-10-20-asr_whisper_tiny_spanish_arpagon_es
ahmedlone127 Oct 20, 2023
ac4e9da
Add model 2023-10-20-asr_whisper_tiny_spanish_arpagon_pipeline_es
ahmedlone127 Oct 20, 2023
837e04f
Add model 2023-10-20-asr_whisper_small_french_yocel1_hi
ahmedlone127 Oct 20, 2023
6db7621
Add model 2023-10-20-asr_whisper_small_french_yocel1_pipeline_hi
ahmedlone127 Oct 20, 2023
58eea1b
Add model 2023-10-20-asr_whisper_small_hungarian_cv11_en
ahmedlone127 Oct 20, 2023
552568f
Add model 2023-10-20-asr_whisper_small_hungarian_cv11_pipeline_en
ahmedlone127 Oct 20, 2023
9bfc155
Add model 2023-10-20-asr_whisper_small_swedish_torileatherman_sv
ahmedlone127 Oct 20, 2023
cc1f6f9
Add model 2023-10-20-asr_whisper_small_swedish_torileatherman_pipelin…
ahmedlone127 Oct 20, 2023
37b0d3b
Add model 2023-10-20-asr_whisper_tiny_italian_local_en
ahmedlone127 Oct 20, 2023
43f7dd5
Add model 2023-10-20-asr_whisper_tiny_italian_local_pipeline_en
ahmedlone127 Oct 20, 2023
cbbc7d0
Add model 2023-10-20-asr_whisper_small_arabic_cv11_en
ahmedlone127 Oct 20, 2023
30cf63d
Add model 2023-10-20-asr_whisper_small_arabic_cv11_pipeline_en
ahmedlone127 Oct 20, 2023
a89c25b
Add model 2023-10-20-asr_whisper_small_pashto_ihanif_ps
ahmedlone127 Oct 20, 2023
17a0712
Add model 2023-10-20-asr_whisper_small_pashto_ihanif_pipeline_ps
ahmedlone127 Oct 20, 2023
957907b
Add model 2023-10-20-asr_whisper_small_swedish_english_se
ahmedlone127 Oct 20, 2023
f03efc1
Add model 2023-10-20-asr_whisper_small_swedish_english_pipeline_se
ahmedlone127 Oct 20, 2023
6cd35fb
Add model 2023-10-20-asr_whisper_small_japanese_vumichien_ja
ahmedlone127 Oct 20, 2023
243df76
Add model 2023-10-20-asr_whisper_small_japanese_vumichien_pipeline_ja
ahmedlone127 Oct 20, 2023
3cc2d48
Add model 2023-10-20-asr_whisper_small_mongolian_3_en
ahmedlone127 Oct 20, 2023
4c55972
Add model 2023-10-20-asr_whisper_small_mongolian_3_pipeline_en
ahmedlone127 Oct 20, 2023
f5a70d7
Add model 2023-10-20-asr_whisper_small_swe2_en
ahmedlone127 Oct 20, 2023
5320004
Add model 2023-10-20-asr_whisper_small_swe2_pipeline_en
ahmedlone127 Oct 20, 2023
1304371
Add model 2023-10-20-asr_whisper_tiny_italian_2_it
ahmedlone127 Oct 20, 2023
0bc0caa
Add model 2023-10-20-asr_whisper_tiny_italian_2_pipeline_it
ahmedlone127 Oct 20, 2023
98e7ca0
Add model 2023-10-20-asr_whisper_small_nob_no
ahmedlone127 Oct 20, 2023
62be72e
Add model 2023-10-20-asr_whisper_small_nob_pipeline_no
ahmedlone127 Oct 20, 2023
add983f
Add model 2023-10-20-asr_whisper_danish_small_augmented_da
ahmedlone127 Oct 20, 2023
e409bfa
Add model 2023-10-20-asr_whisper_danish_small_augmented_pipeline_da
ahmedlone127 Oct 20, 2023
f1d79e6
Add model 2023-10-20-asr_whisper_testrun1_en
ahmedlone127 Oct 20, 2023
f7a50ff
Add model 2023-10-20-asr_whisper_testrun1_pipeline_en
ahmedlone127 Oct 20, 2023
67bd0df
Add model 2023-10-20-asr_whisper_small_korean_fl_ko
ahmedlone127 Oct 20, 2023
4c03dc6
Add model 2023-10-20-asr_whisper_small_korean_fl_pipeline_ko
ahmedlone127 Oct 20, 2023
36abb31
Add model 2023-10-20-asr_whisper_small_spanish_ari_pipeline_es
ahmedlone127 Oct 20, 2023
f86f507
Add model 2023-10-20-asr_whisper_small_spanish_ari_es
ahmedlone127 Oct 20, 2023
8a6801b
Add model 2023-10-20-asr_whisper_small_punjabi_eastern_pa
ahmedlone127 Oct 20, 2023
1c2ca44
Add model 2023-10-20-asr_whisper_small_punjabi_eastern_pipeline_pa
ahmedlone127 Oct 20, 2023
@@ -0,0 +1,92 @@
---
layout: model
title: English asr_personal_whisper_small_english_model WhisperForCTC from fractalego
author: John Snow Labs
name: asr_personal_whisper_small_english_model
date: 2023-10-19
tags: [whisper, en, open_source, asr, onnx]
task: Automatic Speech Recognition
language: en
edition: Spark NLP 5.1.4
spark_version: 3.4
supported: true
engine: onnx
annotator: WhisperForCTC
article_header:
type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained WhisperForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `asr_personal_whisper_small_english_model` is an English model originally trained by fractalego.

This model is only compatible with PySpark 3.4 and above.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/asr_personal_whisper_small_english_model_en_5.1.4_3.4_1697754481302.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/asr_personal_whisper_small_english_model_en_5.1.4_3.4_1697754481302.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
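
The archive name in the download link appears to carry the same metadata as the front matter. A minimal sketch, assuming the layout `<name>_<language>_<Spark NLP version>_<Spark version>_<timestamp>.zip`; this layout is inferred from the link above rather than from a documented contract, and `parse_model_archive` is a hypothetical helper, not part of Spark NLP:

```python
import re

# Assumed layout: <name>_<lang>_<spark_nlp_version>_<spark_version>_<timestamp>.zip
ARCHIVE_RE = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})_(?P<spark_nlp>\d+\.\d+\.\d+)"
    r"_(?P<spark>\d+\.\d+)_(?P<timestamp>\d+)\.zip$"
)


def parse_model_archive(url):
    """Split a model archive URL or file name into its metadata fields."""
    filename = url.rsplit("/", 1)[-1]
    match = ARCHIVE_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognised archive name: {filename}")
    fields = match.groupdict()
    fields["timestamp"] = int(fields["timestamp"])
    return fields


info = parse_model_archive(
    "s3://auxdata.johnsnowlabs.com/public/models/"
    "asr_personal_whisper_small_english_model_en_5.1.4_3.4_1697754481302.zip"
)
print(info["name"], info["lang"], info["spark_nlp"], info["spark"])
# asr_personal_whisper_small_english_model en 5.1.4 3.4
```

A check like this could flag a card whose front matter (`name`, `language`, `edition`, `spark_version`) disagrees with its download link.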

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

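from pyspark.ml import Pipeline
from sparknlp.annotator import WhisperForCTC
from sparknlp.base import AudioAssembler

# `data` is a DataFrame whose "audio_content" column holds the audio as an array of floats
audioAssembler = AudioAssembler() \
    .setInputCol("audio_content") \
    .setOutputCol("audio_assembler")

# Download the pretrained model and transcribe the assembled audio
speechToText = WhisperForCTC.pretrained("asr_personal_whisper_small_english_model", "en") \
    .setInputCols(["audio_assembler"]) \
    .setOutputCol("text")

pipeline = Pipeline().setStages([audioAssembler, speechToText])
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)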

```
```scala

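import com.johnsnowlabs.nlp.AudioAssembler
import com.johnsnowlabs.nlp.annotators.audio.WhisperForCTC
import org.apache.spark.ml.Pipeline

// `data` is a DataFrame whose "audio_content" column holds the audio as an array of floats
val audioAssembler = new AudioAssembler()
  .setInputCol("audio_content")
  .setOutputCol("audio_assembler")

val speechToText = WhisperForCTC.pretrained("asr_personal_whisper_small_english_model", "en")
  .setInputCols(Array("audio_assembler"))
  .setOutputCol("text")

val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText))
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)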


```
</div>
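
The snippets above use a DataFrame `data` with an `audio_content` column but do not show how to build it. According to the Spark NLP documentation, `AudioAssembler` takes audio that has already been decoded into an array of floats. A minimal sketch of that decoding step using only the Python standard library, assuming a 16-bit PCM mono WAV file; `wav_to_floats` and the demo file are hypothetical, and scaling samples to the range [-1, 1] is an assumption that mirrors what common audio loaders return. Whisper checkpoints are trained on 16 kHz audio, so other sample rates would likely need resampling first (not shown).

```python
import os
import struct
import tempfile
import wave


def wav_to_floats(path):
    """Read a 16-bit PCM mono WAV file and return (sample_rate, float samples)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected 16-bit mono PCM audio")
        sample_rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())
    # WAV stores PCM samples as little-endian signed 16-bit integers
    pcm = struct.unpack(f"<{len(frames) // 2}h", frames)
    # Scale to [-1.0, 1.0); this normalisation is an assumption (see lead-in)
    return sample_rate, [sample / 32768.0 for sample in pcm]


# Round-trip check on a tiny synthetic file (four samples at 16 kHz)
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(16000)
    out.writeframes(struct.pack("<4h", 0, 16384, -32768, 32767))

rate, samples = wav_to_floats(demo_path)
print(rate, samples[:3])  # 16000 [0.0, 0.5, -1.0]
```

With a running `SparkSession` named `spark`, something like `spark.createDataFrame([[samples]], ["audio_content"])` would then give a one-row DataFrame to pass as `data`; treat that call as a sketch that has not been verified against this model.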

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|asr_personal_whisper_small_english_model|
|Compatibility:|Spark NLP 5.1.4+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[audio_assembler]|
|Output Labels:|[text]|
|Language:|en|
|Size:|1.7 GB|

## References

https://huggingface.co/fractalego/personal-whisper-small.en-model
@@ -0,0 +1,71 @@
---
layout: model
title: English asr_personal_whisper_small_english_model_pipeline pipeline WhisperForCTC from fractalego
author: John Snow Labs
name: asr_personal_whisper_small_english_model_pipeline
date: 2023-10-19
tags: [whisper, en, open_source, pipeline]
task: Automatic Speech Recognition
language: en
edition: Spark NLP 5.1.4
spark_version: 3.4
supported: true
annotator: PipelineModel
article_header:
type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained WhisperForCTC pipeline, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `asr_personal_whisper_small_english_model_pipeline` is an English model originally trained by fractalego.

This model is only compatible with PySpark 3.4 and above.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/asr_personal_whisper_small_english_model_pipeline_en_5.1.4_3.4_1697754518503.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/asr_personal_whisper_small_english_model_pipeline_en_5.1.4_3.4_1697754518503.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline('asr_personal_whisper_small_english_model_pipeline', lang = 'en')
annotations = pipeline.transform(audioDF)

```
```scala

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("asr_personal_whisper_small_english_model_pipeline", lang = "en")
val annotations = pipeline.transform(audioDF)

```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|asr_personal_whisper_small_english_model_pipeline|
|Type:|pipeline|
|Compatibility:|Spark NLP 5.1.4+|
|License:|Open Source|
|Edition:|Official|
|Language:|en|
|Size:|1.7 GB|

## References

https://huggingface.co/fractalego/personal-whisper-small.en-model

## Included Models

- AudioAssembler
- WhisperForCTC
@@ -0,0 +1,92 @@
---
layout: model
title: Lithuanian asr_whisper_lithuanian_finetune WhisperForCTC from daniel-rdt
author: John Snow Labs
name: asr_whisper_lithuanian_finetune
date: 2023-10-19
tags: [whisper, lt, open_source, asr, onnx]
task: Automatic Speech Recognition
language: lt
edition: Spark NLP 5.1.4
spark_version: 3.4
supported: true
engine: onnx
annotator: WhisperForCTC
article_header:
type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained WhisperForCTC model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `asr_whisper_lithuanian_finetune` is a Lithuanian model originally trained by daniel-rdt.

This model is only compatible with PySpark 3.4 and above.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/asr_whisper_lithuanian_finetune_lt_5.1.4_3.4_1697755801160.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/asr_whisper_lithuanian_finetune_lt_5.1.4_3.4_1697755801160.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

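from pyspark.ml import Pipeline
from sparknlp.annotator import WhisperForCTC
from sparknlp.base import AudioAssembler

# `data` is a DataFrame whose "audio_content" column holds the audio as an array of floats
audioAssembler = AudioAssembler() \
    .setInputCol("audio_content") \
    .setOutputCol("audio_assembler")

# Download the pretrained model and transcribe the assembled audio
speechToText = WhisperForCTC.pretrained("asr_whisper_lithuanian_finetune", "lt") \
    .setInputCols(["audio_assembler"]) \
    .setOutputCol("text")

pipeline = Pipeline().setStages([audioAssembler, speechToText])
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)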

```
```scala

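import com.johnsnowlabs.nlp.AudioAssembler
import com.johnsnowlabs.nlp.annotators.audio.WhisperForCTC
import org.apache.spark.ml.Pipeline

// `data` is a DataFrame whose "audio_content" column holds the audio as an array of floats
val audioAssembler = new AudioAssembler()
  .setInputCol("audio_content")
  .setOutputCol("audio_assembler")

val speechToText = WhisperForCTC.pretrained("asr_whisper_lithuanian_finetune", "lt")
  .setInputCols(Array("audio_assembler"))
  .setOutputCol("text")

val pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText))
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)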


```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|asr_whisper_lithuanian_finetune|
|Compatibility:|Spark NLP 5.1.4+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[audio_assembler]|
|Output Labels:|[text]|
|Language:|lt|
|Size:|1.7 GB|

## References

https://huggingface.co/daniel-rdt/whisper-lt-finetune
@@ -0,0 +1,71 @@
---
layout: model
title: Lithuanian asr_whisper_lithuanian_finetune_pipeline pipeline WhisperForCTC from daniel-rdt
author: John Snow Labs
name: asr_whisper_lithuanian_finetune_pipeline
date: 2023-10-19
tags: [whisper, lt, open_source, pipeline]
task: Automatic Speech Recognition
language: lt
edition: Spark NLP 5.1.4
spark_version: 3.4
supported: true
annotator: PipelineModel
article_header:
type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

Pretrained WhisperForCTC pipeline, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `asr_whisper_lithuanian_finetune_pipeline` is a Lithuanian model originally trained by daniel-rdt.

This model is only compatible with PySpark 3.4 and above.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/asr_whisper_lithuanian_finetune_pipeline_lt_5.1.4_3.4_1697755826126.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/asr_whisper_lithuanian_finetune_pipeline_lt_5.1.4_3.4_1697755826126.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python

from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline('asr_whisper_lithuanian_finetune_pipeline', lang = 'lt')
annotations = pipeline.transform(audioDF)

```
```scala

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("asr_whisper_lithuanian_finetune_pipeline", lang = "lt")
val annotations = pipeline.transform(audioDF)

```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|asr_whisper_lithuanian_finetune_pipeline|
|Type:|pipeline|
|Compatibility:|Spark NLP 5.1.4+|
|License:|Open Source|
|Edition:|Official|
|Language:|lt|
|Size:|1.7 GB|

## References

https://huggingface.co/daniel-rdt/whisper-lt-finetune

## Included Models

- AudioAssembler
- WhisperForCTC