Run MarcoPolo in local machine with Jupyter Notebook #5

Open
SHADJIA opened this issue Jul 26, 2022 · 3 comments


SHADJIA commented Jul 26, 2022

Hello @chanwkimlab ,

I'm a beginner in the Python world. I would like to test your tool on my dataset, on my local machine, with Jupyter Notebook. Can you help me with your code? When I try your vignette, I get this error:

AssertionError                            Traceback (most recent call last)
C:\Users\AppData\Local\Temp/ipykernel_2416/4231211302.py in <module>
      5     adata.obs["size_factor"] = norm_factor/norm_factor.mean()
      6     print("size factor was calculated")
----> 7 regression_result = MarcoPolo.run_regression(adata=adata, size_factor_key="size_factor",
      8                          num_threads=8)
      9 # If you use a local machine, you can set `num_threads` to higher than 1 (maybe upto 4), which will speed up the regression a lot. For some reason, num_threads>1 does not seem to work well on colab (maybe due to the the limited RAM).
.
.
.
AssertionError: Torch not compiled with CUDA enabled

Do you have any idea how to resolve this issue?

Thanks a lot in advance.

Regards,
Sha

Owner

chanwkimlab commented Jul 26, 2022

Hi @SHADJIA,

Thank you for using our software. The error occurred because the run_regression function uses the GPU by default, but the PyTorch build installed on your local machine does not support GPU. There are two ways to resolve this:

  1. If you have a GPU on your local machine and intend to use it to accelerate the MarcoPolo algorithm, you should install a PyTorch version with CUDA support. I believe the PyTorch currently installed on your machine only supports CPU. Many instructions for installing CUDA and CUDA-enabled PyTorch are available online, such as this video: https://www.youtube.com/watch?v=GMSjDTU8Zlc.
  2. If you don't have a GPU, you can resolve the issue by changing the device parameter of the run_regression function from "cuda:0" to "cpu". However, MarcoPolo will then run much more slowly.
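
The two options above can be combined into a small device check. This is only a sketch: `pick_device` is a hypothetical helper, not part of MarcoPolo, and the commented-out call assumes that `run_regression` accepts the `device` parameter described above alongside the arguments shown in the traceback.

```python
def pick_device(preferred="cuda:0"):
    """Return `preferred` if PyTorch reports usable CUDA support, else "cpu".

    Hypothetical helper for illustration; it is not part of MarcoPolo.
    """
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to run on the GPU.
        return "cpu"
    if preferred.startswith("cuda") and not torch.cuda.is_available():
        # A CPU-only PyTorch build, or no visible GPU: fall back to the CPU.
        return "cpu"
    return preferred


device = pick_device()
print("Using device:", device)

# Assumed usage, based on the call in the traceback plus the `device` parameter:
# regression_result = MarcoPolo.run_regression(
#     adata=adata, size_factor_key="size_factor", num_threads=8, device=device
# )
```

On a machine with a CPU-only PyTorch build this selects "cpu", which avoids the "Torch not compiled with CUDA enabled" assertion at the cost of a slower regression.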

Please let me know if you have any other questions.

Best,
Chanwoo

Author

SHADJIA commented Jul 27, 2022

Hello @chanwkimlab ,

Thanks for your reply.

I don't have a GPU in my computer, so I ran the code with the CPU as the device. It has been running for 2 hours. Like in Colab, I can't see the progress bar in Jupyter. And although the regression code is running, "size factor was calculated" is not printed. I only get: The numbers of clusters to test: [1, 2]
Y: (22748, 16656) X: (22748, 1) s: (22748,)
Is this normal, or is there a problem with the execution?

Thanks once again.
Regards,
Sha

Owner

chanwkimlab commented Jul 27, 2022

Hi @SHADJIA,

Since you are using the CPU instead of a GPU and your input data is very large, it is normal for the regression to take longer than 2 hours. Also, you may not see "size factor was calculated" if your input data already contains a "size_factor" column. However, it is interesting that you don't see a progress bar. When I changed the device parameter from "cuda:0" to "cpu" in the Colab environment, I was still able to see the progress bar. For debugging purposes, you can manually edit the regression/trainer.py file and add the following lines to the fit_multiple_genes function. You can find the path where MarcoPolo was installed by running MarcoPolo.__file__ after executing import MarcoPolo.

 for iter_idx, exp_data_idx in enumerate(pbar):
+    if iter_idx % 10 == 0:
+        print(iter_idx)
     cell_dataset = CellDataset(Y_select[:, iter_idx:iter_idx + 1], X, s)
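
To find the file to edit, the path lookup described above can be sketched as follows. `package_file` is a hypothetical helper, not part of MarcoPolo. It uses `importlib.util.find_spec` so that it returns `None` instead of raising an error when the package is not installed. The location of regression/trainer.py inside the package is taken from the description above.

```python
import importlib.util
import os


def package_file(package_name, *relative_parts):
    """Return the path of a file inside an installed package, or None.

    Hypothetical helper for illustration; it only builds the path and does
    not check that the file exists.
    """
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        # Package not installed, or it has no __init__.py to anchor the path.
        return None
    package_dir = os.path.dirname(spec.origin)
    return os.path.join(package_dir, *relative_parts)


# Prints None if MarcoPolo is not installed in the current environment.
print(package_file("MarcoPolo", "regression", "trainer.py"))
```

After editing the file, restart the Jupyter kernel so that the modified module is imported again.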

Best,
Chanwoo
