Getting error when num_workers > 0 #53

Closed
amilamad opened this issue Mar 5, 2018 · 5 comments

Comments

Contributor

amilamad commented Mar 5, 2018

Hi,
I tried to train the LJ Speech model with the latest master, and it fails with an error like the one below when

num_workers = 2
[image: screenshot of the error]
It looks like _frontend did not get assigned in the worker processes. I tried injecting the _frontend object into TextDataSource, but that failed. Is there a fix for this?

When I set num_workers = 0, training runs fine.
A quick Google search tells me that with num_workers = 0 all data loading is done in the main process.
My question is: will this slow down my training significantly?

Owner

r9y9 commented Mar 7, 2018

Seems to be a duplicate of #37.

r9y9 added the windows label Mar 7, 2018
Owner

r9y9 commented Mar 7, 2018

This is not really tested on Windows, so I'd recommend Linux instead if possible.

_frontend should be deepvoice3_pytorch.frontend.en for English datasets, not TextDataSource. I'm not sure what's happening in the num_workers=0 case.

Contributor Author

amilamad commented Mar 7, 2018

Oh, sorry again for the duplicate issue.
_frontend is initialized correctly with deepvoice3_pytorch.frontend.en in the main Python process, but when the worker processes access it, _frontend is None. I think this is due to the way Python worker processes are started on Windows.
I'll open an issue at https://github.com/peterjc123/pytorch-scripts

My training now runs fine with num_workers=0. Will it be significantly slower than with num_workers=2?
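
One way to sidestep the "_frontend is None in the workers" symptom described above is to resolve the frontend lazily inside the code that runs in the worker, instead of relying on a module-level global assigned in the main process. On Windows, worker processes are started fresh (spawn) rather than forked, so globals assigned at runtime in the parent are not carried over. The sketch below only illustrates the pattern and is not the project's actual fix: FakeEnglishFrontend, FRONTENDS and LazyTextSource are hypothetical stand-ins for deepvoice3_pytorch.frontend.en and TextDataSource.

```python
# Hypothetical stand-in for a text frontend such as deepvoice3_pytorch.frontend.en.
class FakeEnglishFrontend:
    @staticmethod
    def text_to_sequence(text):
        # Toy encoding: one integer per character.
        return [ord(ch) for ch in text]


# Hypothetical registry mapping a picklable name to a frontend.
FRONTENDS = {"en": FakeEnglishFrontend}


class LazyTextSource:
    """Stores only the frontend name and looks the frontend up on first use."""

    def __init__(self, texts, frontend_name="en"):
        self.texts = list(texts)
        self.frontend_name = frontend_name
        self._frontend = None  # resolved lazily, in whichever process calls collect()

    def _get_frontend(self):
        if self._frontend is None:
            self._frontend = FRONTENDS[self.frontend_name]
        return self._frontend

    def collect(self, index):
        return self._get_frontend().text_to_sequence(self.texts[index])


source = LazyTextSource(["hi"])
sequence = source.collect(0)
```

Because the object carries only a string until it is first used, it can be pickled and sent to a freshly started worker, which then performs the lookup itself.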

Contributor

engiecat commented Mar 8, 2018

@amilamad
I am currently working with the VCTK dataset on Windows 10, and num_workers=1 works well for me (I see about a 10% loss in performance).
[image: screenshot showing the THAllocator error]
As the screenshot shows, a THAllocator error does still occur (about once per day for me), but according to pytorch/pytorch#4239 it can be alleviated by lowering num_workers.
(For me, careful monitoring and restarting training from a checkpoint worked.)
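
The "monitor and restart from a checkpoint" workaround mentioned above could be automated with a small supervisor loop. This is a sketch under stated assumptions, not something taken from the repository: run_with_restarts, train_once and latest_checkpoint are hypothetical names, and the training function is assumed to raise RuntimeError when the data loader fails.

```python
def run_with_restarts(train_once, latest_checkpoint, max_restarts=5):
    """Call train_once(checkpoint) until it returns, restarting after failures.

    Returns the number of restarts that were needed.
    """
    restarts = 0
    while True:
        try:
            # Always resume from whatever checkpoint is newest at this point.
            train_once(latest_checkpoint())
            return restarts
        except RuntimeError:
            if restarts >= max_restarts:
                raise  # give up and surface the last error
            restarts += 1


# Toy usage: a "training run" that fails twice before finishing.
attempts = []


def flaky_train(checkpoint):
    attempts.append(checkpoint)
    if len(attempts) < 3:
        raise RuntimeError("simulated data-loader failure")


restarts_needed = run_with_restarts(flaky_train, lambda: f"step-{len(attempts)}")
```

Capping the number of restarts keeps a persistent failure from looping forever.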

amilamad pushed a commit to amilamad/deepvoice3_pytorch that referenced this issue Mar 10, 2018
r9y9 pushed a commit that referenced this issue Mar 10, 2018
* Fixed typeerror (torch.index_select received an invalid combination of arguments)

  File "synthesis.py", line 137, in <module>
    model, text, p=replace_pronunciation_prob, speaker_id=speaker_id, fast=True)
  File "synthesis.py", line 66, in tts
    sequence, text_positions=text_positions, speaker_ids=speaker_ids)
  File "H:\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "H:\Tensorflow_Study\git\deepvoice3_pytorch\deepvoice3_pytorch\__init__.py", line 79, in forward
    text_positions, frame_positions, input_lengths)
  File "H:\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "H:\Tensorflow_Study\git\deepvoice3_pytorch\deepvoice3_pytorch\__init__.py", line 116, in forward
    text_sequences, lengths=input_lengths, speaker_embed=speaker_embed)
  File "H:\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "H:\Tensorflow_Study\git\deepvoice3_pytorch\deepvoice3_pytorch\deepvoice3.py", line 75, in forward
    x = self.embed_tokens(text_sequences) <- change this to long!
  File "H:\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "H:\envs\pytorch\lib\site-packages\torch\nn\modules\sparse.py", line 103, in forward
    self.scale_grad_by_freq, self.sparse
  File "H:\envs\pytorch\lib\site-packages\torch\nn\_functions\thnn\sparse.py", line 59, in forward
    output = torch.index_select(weight, 0, indices.view(-1))
TypeError: torch.index_select received an invalid combination of arguments - got (torch.cuda.FloatTensor, int, torch.cuda.IntTensor), but expected (torch.cuda.FloatTensor source, int dim, torch.cuda.LongTensor index)

Changed text_sequences to long, as required by torch.index_select.

* Fixed Nonetype error in collect_features

* requirements.txt fix

* Memory Leakage bugfix + hparams change

* Pre-PR modifications

* Pre-PR modifications 2

* Pre-PR modifications 3

* Post-PR modification

* remove requirements.txt

* num_workers to 1 in train.py
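
The traceback in the commit message above comes down to an index dtype mismatch: the embedding lookup received a 32-bit integer tensor where that PyTorch version required a 64-bit one (LongTensor). A minimal sketch of the cast, assuming PyTorch is installed; the tensor values and layer sizes are made up for illustration, and newer PyTorch releases may accept 32-bit indices without the cast.

```python
import torch

# Small embedding table standing in for the model's embed_tokens layer.
embed_tokens = torch.nn.Embedding(num_embeddings=10, embedding_dim=4)

# Indices that arrive as 32-bit ints (for example, converted from an int32 array).
text_sequences = torch.tensor([[1, 2, 3]], dtype=torch.int32)

# Cast to 64-bit ints before the lookup.
text_sequences = text_sequences.long()
x = embed_tokens(text_sequences)
```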
engiecat added a commit to engiecat/deepvoice3_pytorch that referenced this issue Apr 30, 2018
r9y9#53 (comment) issue solved in PyTorch 0.4
r9y9 pushed a commit that referenced this issue Apr 30, 2018
…pport (#78)

* Windows log filename bugfix

* Revert "Windows log filename bugfix"

This reverts commit 5214c24.

* merge 2

* Windows Filename bugfix

In windows, this causes WinError 123

* Cleanup before PR

* JSON format Metadata support

Supports JSON format for dataset creation. Ensures compatibility with http://github.com/carpedm20/multi-Speaker-tacotron-tensorflow

* Web based Gentle aligner support

* README change + gentle patch

* .gitignore change

gitignore change

* Flake8 Fix

* Post PR commit - Also fixed #5

#53 (comment) issue solved in PyTorch 0.4

* Post-PR 2 - .gitignore
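
On the "Windows Filename bugfix" item above: WinError 123 is the error Windows reports for a syntactically invalid file or directory name. The commit message does not say which name was at fault, so the following is only a sketch of one common cause and remedy, assuming the name was built from a timestamp containing colons. safe_log_name is a hypothetical helper, not a function from the repository.

```python
import re
from datetime import datetime

# Characters Windows does not allow in file or directory names.
_WINDOWS_FORBIDDEN = re.compile(r'[<>:"/\\|?*]')


def safe_log_name(prefix, when):
    """Build a log name that is valid on Windows as well as on POSIX systems."""
    stamp = str(when).replace(" ", "_")  # still contains colons at this point
    return _WINDOWS_FORBIDDEN.sub("-", f"{prefix}_{stamp}")


name = safe_log_name("run", datetime(2018, 3, 10, 12, 34, 56))
```

Sanitizing on every platform, rather than only on Windows, keeps log directory names identical across machines.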
engiecat added a commit to engiecat/deepvoice3_pytorch that referenced this issue May 5, 2018
Windows Filename bugfix

In windows, this causes WinError 123

Windows Specific Filename bugfix (r9y9#58)

stale bot commented May 30, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label May 30, 2019
stale bot closed this as completed Jun 6, 2019