nonconvexAG

Kai Yang

<kai.yang2 "at" mail.mcgill.ca>

License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)

This repository contains the simulation study code and results for my paper, Accelerated Gradient Methods for Sparse Statistical Learning with Nonconvex Penalties -- DOI link for publication or arXiv link for preprint.

  • All paper-related material is under this directory, including the code and both the intermediate and final outputs.
  • All simulations were run on Compute Canada. The bash job submission scripts include commands that print the computing resource name and details to the slurm outputs; the seff outputs report the computing time.
  • All studies using GPUs were run on Compute Canada Nvidia A100 GPU(s). The Compute Canada slurm outputs confirm a CUDA compute capability of 8.0 (cupy.cuda.Device.compute_capability returns the string '80', which stands for a compute capability of 8.0).
  • All the Python scripts referred to in the bash job submission scripts were generated from the Jupyter (IPython) notebooks, i.e., by running jupyter nbconvert *.ipynb --to python and moving the resulting Python scripts to a separate sub-directory called dist.
  • Compute Canada produces slurm-[jobID].out files containing the output of each script; I also created seff-[jobID].out files with the command seff [jobID] >> seff-[jobID].out to record and report the wall-clock time of each computing-time comparison job.
  • With identical simulation setups and the same $\epsilon$-convergence criteria, the seff files show that the computing-time simulations for the AG method finished within $20$ minutes for SCAD- or MCP-penalized logistic models, whereas the coordinate descent simulations for the same models could not finish within the 7-day time limit imposed by the Compute Canada Narval cluster.
  • Again, all the above simulations were run on identical GPUs -- Nvidia A100 with CUDA compute capability of 8.0.
  • To ensure a fair comparison, we implemented coordinate descent in Python/CuPy, following the state-of-the-art pseudo-code for the coordinate descent method (Breheny & Huang, 2011), and compared its computing time with that of AG.
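
For readers unfamiliar with the coordinate descent baseline, the following is a minimal sketch of the univariate thresholding updates described by Breheny & Huang (2011) for MCP and SCAD, assuming standardized covariates. It is not the code from this repository's notebooks: the function names are hypothetical, it uses plain Python floats rather than CuPy arrays, and the default `gamma` values (3 for MCP, 3.7 for SCAD) are conventional choices rather than settings confirmed by this README.

```python
def soft_threshold(z: float, lam: float) -> float:
    """Soft-thresholding operator S(z, lam) = sign(z) * max(|z| - lam, 0)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def mcp_threshold(z: float, lam: float, gamma: float = 3.0) -> float:
    """Univariate MCP update for a standardized covariate (requires gamma > 1)."""
    if abs(z) <= gamma * lam:
        # Rescaled soft-thresholding inside the penalized region.
        return soft_threshold(z, lam) / (1.0 - 1.0 / gamma)
    # Beyond gamma * lam the penalty is flat, so the estimate is unshrunk.
    return z


def scad_threshold(z: float, lam: float, gamma: float = 3.7) -> float:
    """Univariate SCAD update for a standardized covariate (requires gamma > 2)."""
    if abs(z) <= 2.0 * lam:
        # Lasso-like region: plain soft-thresholding.
        return soft_threshold(z, lam)
    if abs(z) <= gamma * lam:
        # Transition region between lasso-like shrinkage and no shrinkage.
        shrunk = soft_threshold(z, gamma * lam / (gamma - 1.0))
        return shrunk / (1.0 - 1.0 / (gamma - 1.0))
    # Large coefficients are left unshrunk.
    return z


if __name__ == "__main__":
    # z stands for the unpenalized univariate solution of one coordinate.
    for z in (0.5, 1.5, 3.0, 5.0):
        print(z, mcp_threshold(z, lam=1.0), scad_threshold(z, lam=1.0))
```

A full coordinate descent pass applies one of these updates to each coefficient in turn, which is inherently sequential; that is one plausible reason such a method benefits less from a GPU than a method built on full-gradient evaluations.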

| Model | Penalty | Comparison | Optimization Method | Output Data | Jupyter Notebook/R code | Bash Script | slurm file | seff output |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Penalized Linear Models (LM) | SCAD | Signal Recovery Performance | coordinate descent (ncvreg); with strong rule | R_results_SCAD_signal_recovery.npy | ncvreg_LM_sim.R | LM.sh | slurm-10933899.out | |
| Penalized Linear Models (LM) | MCP | Signal Recovery Performance | coordinate descent (ncvreg); with strong rule | R_results_MCP_signal_recovery.npy | ncvreg_LM_sim.R | LM.sh | slurm-10933899.out | |
| Penalized Logistic Models | SCAD | Signal Recovery Performance | coordinate descent (ncvreg); with strong rule | R_results_SCAD_signal_recovery.npy | ncvreg_logistic_sim.R | logistic.sh | slurm-10933900.out | |
| Penalized Logistic Models | MCP | Signal Recovery Performance | coordinate descent (ncvreg); with strong rule | R_results_MCP_signal_recovery.npy | ncvreg_logistic_sim.R | logistic.sh | slurm-10933900.out | |
| Penalized Linear Models (LM) | SCAD | Signal Recovery Performance | AG (proposed optimization hyperparameters); with strong rule | results_SCAD_signal_recovery.npy | task1.ipynb | task1.sh | slurm-10933901.out | |
| Penalized Linear Models (LM) | MCP | Signal Recovery Performance | AG (proposed optimization hyperparameters); with strong rule | results_MCP_signal_recovery.npy | task1.ipynb | task1.sh | slurm-10933901.out | |
| Penalized Logistic Models | SCAD | Signal Recovery Performance | AG (proposed optimization hyperparameters); with strong rule | results_SCAD_signal_recovery.npy | task2.ipynb | task2.sh | slurm-10933902.out | |
| Penalized Logistic Models | MCP | Signal Recovery Performance | AG (proposed optimization hyperparameters); with strong rule | results_MCP_signal_recovery.npy | task2.ipynb | task2.sh | slurm-10933902.out | |
| Penalized Linear Models (LM) | SCAD | Number of Gradient Evaluations | AG (proposed optimization hyperparameters), AG (original optimization hyperparameters), proximal gradient descent | SCAD_sim_results.npy | task1speed.ipynb | task1speed.sh | slurm-10933903.out | seff-10933903.out |
| Penalized Linear Models (LM) | SCAD | GPU Computing Time | AG (proposed optimization hyperparameters), coordinate descent (coded in Python/CuPy) | SCAD_sim_results.npy | task1speed.ipynb | task1speed.sh | slurm-10933903.out | seff-10933903.out |
| Penalized Linear Models (LM) | MCP | Number of Gradient Evaluations | AG (proposed optimization hyperparameters), AG (original optimization hyperparameters), proximal gradient descent | MCP_sim_results.npy | task1speed.ipynb | task1speed.sh | slurm-10933903.out | seff-10933903.out |
| Penalized Linear Models (LM) | MCP | GPU Computing Time | AG (proposed optimization hyperparameters), coordinate descent (coded in Python/CuPy) | MCP_sim_results.npy | task1speed.ipynb | task1speed.sh | slurm-10933903.out | seff-10933903.out |
| Penalized Logistic Models | SCAD | GPU Computing Time | coordinate descent (coded in Python/CuPy) | | task2speed_SCAD_coord_time.ipynb | task2speed_SCAD_coord_time.sh | slurm-10933904.out | seff-10933904.out |
| Penalized Logistic Models | MCP | GPU Computing Time | coordinate descent (coded in Python/CuPy) | | task2speed_MCP_coord_time.ipynb | task2speed_MCP_coord_time.sh | slurm-10933905.out | seff-10933905.out |
| Penalized Logistic Models | SCAD | GPU Computing Time | AG (proposed optimization hyperparameters) | SCAD_sim_results_AG_time.npy | task2speed_SCAD_AG_time.ipynb | task2speed_SCAD_AG_time.sh | slurm-10933906.out | seff-10933906.out |
| Penalized Logistic Models | MCP | GPU Computing Time | AG (proposed optimization hyperparameters) | MCP_sim_results_AG_time.npy | task2speed_MCP_AG_time.ipynb | task2speed_MCP_AG_time.sh | slurm-10933907.out | seff-10933907.out |
| Penalized Logistic Models | SCAD | Number of Gradient Evaluations | AG (proposed optimization hyperparameters), AG (original optimization hyperparameters), proximal gradient descent | SCAD_sim_results.npy | task2speed_SCAD.ipynb | task2speed_SCAD.sh | slurm-10933908.out | seff-10933908.out |
| Penalized Logistic Models | MCP | Number of Gradient Evaluations | AG (proposed optimization hyperparameters), AG (original optimization hyperparameters), proximal gradient descent | MCP_sim_results.npy | task2speed_MCP.ipynb | task2speed_MCP.sh | slurm-10933909.out | seff-10933909.out |
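
To compare wall-clock times across the seff outputs listed in the table, a small parser can help. The sketch below assumes each seff report contains a line of the form `Job Wall-clock time: [D-]HH:MM:SS`, which is the usual layout but may differ between Slurm versions. The function name `parse_wall_clock` is hypothetical, and the sample text is invented for illustration; it is not taken from any file in this repository.

```python
import re
from datetime import timedelta
from typing import Optional

# Matches e.g. "Job Wall-clock time: 00:15:30" or "Job Wall-clock time: 7-00:00:12".
_WALL_CLOCK = re.compile(
    r"^Job Wall-clock time:\s*(?:(\d+)-)?(\d+):(\d{2}):(\d{2})\s*$",
    re.MULTILINE,
)


def parse_wall_clock(seff_text: str) -> Optional[timedelta]:
    """Return the job wall-clock time found in a seff report, or None if absent."""
    match = _WALL_CLOCK.search(seff_text)
    if match is None:
        return None
    # The day count is optional, so a missing group is treated as zero.
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


if __name__ == "__main__":
    # Invented sample text, used only to exercise the parser.
    sample = (
        "Job ID: 12345\n"
        "State: COMPLETED (exit code 0)\n"
        "Job Wall-clock time: 00:15:30\n"
    )
    print(parse_wall_clock(sample))
```

In practice one would read each seff-[jobID].out file from disk and pass its contents to the parser; returning a `timedelta` makes the results directly comparable and sortable.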

Bibliography

  • Breheny, P., & Huang, J. (2011). Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection. Annals of Applied Statistics, 5(1), 232-253. https://doi.org/10.1214/10-AOAS388