Commit db517fd

[Bug fix] PC lexical + audio (NVIDIA#5109) (NVIDIA#5110)

* training running

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* revert

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* revert

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>
2 people authored and Hainan Xu committed Nov 29, 2022
1 parent 978620a commit db517fd
Showing 3 changed files with 5 additions and 4 deletions.
@@ -103,7 +103,7 @@ model:
     max_seq_length: 512

     sample_rate: 16000 # Target sample rate of the audio; can be used for downsampling or upsampling.
-    num_workers: 8
+    num_workers: 0

     # Number of jobs for tokenization and label encoding. If 0, then multiprocessing is not used. If null,
     # the number of jobs is equal to the number of CPU cores.
@@ -149,7 +149,7 @@ model:
     tar_metadata_file: null

     sample_rate: 16000
-    num_workers: 8
+    num_workers: 0

   test_ds:
     # if evaluation data is not in the model.train_ds.ds_item as the training data or multiple datasets are used for
@@ -180,7 +180,7 @@ model:
     tar_metadata_file: null

     sample_rate: 16000
-    num_workers: 8
+    num_workers: 0

   tokenizer:
     tokenizer_name: ${model.language_model.pretrained_model_name} # or sentencepiece
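
For orientation (not part of the commit): these dataset sections are OmegaConf/Hydra-style YAML, and the model later reads the value as cfg.num_workers in _setup_dataloader_from_config (third file below). A minimal sketch of that access pattern; apart from model.test_ds, which is visible in the diff, the layout here is an assumption:

from omegaconf import OmegaConf

# Sketch only; section names other than model.test_ds are assumptions, not the real config layout.
cfg = OmegaConf.create({"model": {"test_ds": {"sample_rate": 16000, "num_workers": 0}}})

# Attribute-style access, like cfg.num_workers inside _setup_dataloader_from_config.
print(cfg.model.test_ds.num_workers)  # -> 0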
@@ -663,6 +663,7 @@ def _get_features(
     create_progress_process = progress_queue is None
     if n_jobs is None:
         n_jobs = min(mp.cpu_count(), len(queries))
+
     if verbose:
         logging.info(f"Running tokenization with {n_jobs} jobs.")

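The config comment in the first file describes the n_jobs convention that _get_features (above) follows: null (None) falls back to the CPU count, 0 disables multiprocessing. A hypothetical sketch of that convention; run_tokenization and tokenize_one are illustrative names, not NeMo functions:

import multiprocessing as mp


def tokenize_one(query):
    # Stand-in for real tokenization.
    return query.split()


def run_tokenization(queries, n_jobs=None):
    if n_jobs is None:
        # null in the config: one job per CPU core, capped by the workload size.
        n_jobs = min(mp.cpu_count(), len(queries))
    if n_jobs == 0:
        # 0 in the config: no multiprocessing, run in the current process.
        return [tokenize_one(q) for q in queries]
    with mp.Pool(n_jobs) as pool:
        return pool.map(tokenize_one, queries)


print(run_tokenization(["hello world", "foo bar"], n_jobs=0))
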
@@ -889,7 +889,7 @@ def _setup_dataloader_from_config(self, cfg: DictConfig, train: bool) -> torch.u
             num_workers=cfg.num_workers,
             pin_memory=cfg.pin_memory,
             drop_last=cfg.drop_last,
-            persistent_workers=cfg.persistent_workers,
+            persistent_workers=cfg.persistent_workers if cfg.num_workers > 0 else False,
         )

     def _setup_infer_dataloader(
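
The guard added here matters because torch.utils.data.DataLoader raises a ValueError when persistent_workers=True is combined with num_workers=0, which the num_workers: 0 settings above could otherwise trigger. A minimal standalone sketch of the same pattern; ToyDataset and make_loader are illustrative, not NeMo code:

import torch
from torch.utils.data import DataLoader, Dataset


class ToyDataset(Dataset):
    # Illustrative stand-in for the real dataset.
    def __len__(self):
        return 4

    def __getitem__(self, idx):
        return torch.tensor(idx)


def make_loader(num_workers, persistent_workers):
    return DataLoader(
        ToyDataset(),
        batch_size=2,
        num_workers=num_workers,
        # Same guard as the commit: only forward the flag when workers are actually used.
        persistent_workers=persistent_workers if num_workers > 0 else False,
    )


loader = make_loader(num_workers=0, persistent_workers=True)  # no ValueError raised
print(next(iter(loader)))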