- Introduction
- Getting Started
- Prerequisites
- Main Tutorials
- Example project layout
- Further Reading
- License and disclaimer
Let's learn Deep Learning deeply!
Open a terminal on a UNIX-like system (Linux, macOS, etc.), then clone this repository:
git clone git@github.com:seoklab/ldld.git
Also, don't forget to change the working directory.
cd ldld
CAUTION FOR ADVANCED USERS: The listed commands create a new environment named ldld and do nothing if it already exists. Please check whether it exists, and either remove it or check the packages in the existing one against the environment file.
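One way to check whether an environment named ldld already exists is to parse the JSON printed by `conda env list --json`, which lists environment prefixes under the "envs" key. This is a minimal sketch; the helper name and the sample paths are hypothetical.

```python
import json
import os


def conda_env_names(env_list_json):
    """Return environment names from the JSON printed by `conda env list --json`."""
    prefixes = json.loads(env_list_json).get("envs", [])
    # Each entry is the path of an environment prefix; its last path component
    # is the environment name (for the base environment, the install directory).
    return [os.path.basename(os.path.normpath(p)) for p in prefixes]


# Hypothetical sample output; real paths depend on where conda is installed.
sample = '{"envs": ["/home/user/miniconda3", "/home/user/miniconda3/envs/ldld"]}'
print("ldld" in conda_env_names(sample))
```

To run it against a real installation, pass in the standard output of `subprocess.run(["conda", "env", "list", "--json"], capture_output=True, text=True, check=True)`. An existing environment can be removed with `conda env remove -n ldld`.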
Two scripts are provided:
- Everyone: Run this command to install conda and set up an environment, which will be used throughout this repo. Also, don't forget to log in again after running the command. Once the script does its job, you're almost done, at least with the "hard" (i.e., terminal) part!
  ./scripts/bootstrap.sh
  NOTE: If you're a lab member, you must run this command on the cluster, not on your local machine.
- Lab members only (optional): First set up the SSH configuration (see init-remote.sh).
  To execute Jupyter notebooks on the compute nodes, ssh into the cluster [1], then clone this repo. Once the repo is cloned, run the following command:
  ./scripts/start-jupyter.sh [nodename]
  It submits a Slurm job to start a Jupyter server on a GPU compute node. The script then prints the URLs for accessing the Jupyter server to your terminal. Note the one starting with http://172.23, which will be used later for the VSCode Jupyter remote kernel connection. To use the started Jupyter server from your VSCode Jupyter extension, follow this guide.
conda makes managing Python packages much easier by creating isolated environments [2] for each project. You'll love it as soon as you get into "real world" Python development. One drawback of the package manager is that it takes some time to download the required packages (at least the first time). Both scripts will take around 30 minutes to complete.
There are two options just waiting for your choice:
- (Recommended) Download vscode and install it. If you're on a Linux-based machine with sudo access, it's not a bad idea to give snap a try with
  $ sudo snap install code --classic
  Once the editor is ready to go, set up these extensions:
  - Python
  - Jupyter
  - Markdown All in One
  - Remote - SSH (for remote development [3])
  and also take a look at these optional but useful extensions:
  - For a powerful language server: Pylance
  - For better editor suggestions: Visual Studio IntelliCode
  - For a beautiful (IMO [4]) color scheme: Noctis
- Start a Jupyter server with
  $ jupyter notebook
  at the project root, open a web browser, then connect to http://localhost:8888 [5]. Local port forwarding (ssh -L8888:localhost:8888) might be required if the Jupyter server runs on a remote machine.
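Because the port depends on configuration and on what is already running, it can help to check whether anything is listening on port 8888 before connecting or setting up forwarding. The helper below is a minimal sketch using only the standard library; its name and defaults are hypothetical.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


# True if a server (for example Jupyter) is already listening on port 8888.
print(port_in_use(8888))
```

If this prints False on your local machine while Jupyter runs remotely, the forwarded port is probably not set up yet.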
Type this command, which will be your friend while working in this repo. Type it every time before starting a Jupyter session. (ldld) will then appear in front of the prompt.
conda activate ldld
If you're using vscode (see the previous section), choose the environment named ldld when starting the kernel. The editor will open a dialog for the selection.
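To confirm from Python that the right environment is active, one option is to read the CONDA_DEFAULT_ENV variable, which conda sets on activation. This is a minimal sketch with a hypothetical helper name. Inside a kernel started by an editor the variable may be absent, so the interpreter path is printed as a second hint.

```python
import os
import sys


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or '' if none is active."""
    env = os.environ if environ is None else environ
    # conda sets CONDA_DEFAULT_ENV when an environment is activated.
    return env.get("CONDA_DEFAULT_ENV", "")


name = active_conda_env()
if name != "ldld":
    print(f"Active environment is {name!r}; run `conda activate ldld` first.")
# The interpreter path usually contains the environment name as well.
print("Interpreter:", sys.executable)
```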
- Basic university math, including multivariable calculus.
- Basic shell commands: refer to the related lecture and a blog post.
- Python programming: refer to the related lecture.
NOTE: Only Korean versions are available.
TIP: Text too big in vscode? Go to Settings - Notebook - Markup: Font Size and change it to the desired value (15 is a good start).
- What is "deep learning"?
- MLP: a.k.a. fully connected layers, linear layers, affine layers, feed-forward networks, and ...
- CNN: not to be confused with the prominent American news network.
- RNN: including Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), ...
- Transformer: Attention!
- GNN: shortest Wikipedia article, longest list of proposed models. Message passing neural networks (MPNN), graph transformers, etc., are all included in this category.
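The many names for fully connected layers all refer to the same computation, y = Wx + b, usually followed by a nonlinearity. The sketch below illustrates this in plain Python with hypothetical toy parameters. It is not the tutorials' implementation, which would normally use a tensor library.

```python
def affine(weights, bias, x):
    """Compute y = W x + b for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(v):
    """Element-wise max(0, v)."""
    return [max(0.0, vi) for vi in v]


# Hypothetical toy parameters: 2 inputs -> 2 hidden units -> 1 output.
W1, b1 = [[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]
W2, b2 = [[2.0, 1.0]], [0.5]

x = [3.0, 1.0]
hidden = relu(affine(W1, b1, x))  # first affine layer plus nonlinearity
y = affine(W2, b2, hidden)        # output layer
print(hidden, y)
```

Stacking two such layers with a nonlinearity in between already gives a small feed-forward network.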
This repository also serves as an example project layout. The following files and directories are included:
- src/: example source code layout for model, dataset, training, and utilities.
- pyproject.toml and setup.cfg: example project configuration, required for installing the project (pip install [-e] .).
Along with the ldld module, the project also ships a few executables for training the models: ldld-mlp, ldld-cnn, ldld-rnn, ldld-trs, and ldld-gnn. You can run them after installing the project.
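One common way to ship commands like these is a console-script entry point, where the installer generates a small wrapper that calls a Python function. The sketch below assumes that mechanism. The function names, options, and defaults are hypothetical and not taken from this repo's source.

```python
import argparse


def build_parser():
    """Build a command-line parser for a hypothetical training command."""
    parser = argparse.ArgumentParser(prog="ldld-mlp", description="Train a model.")
    parser.add_argument("--epochs", type=int, default=10, help="number of epochs")
    parser.add_argument("--lr", type=float, default=1e-3, help="learning rate")
    return parser


def main(argv=None):
    """Entry point; an installed wrapper script would call this with no arguments."""
    args = build_parser().parse_args(argv)
    # A real command would build the model and run the training loop here.
    print(f"Would train for {args.epochs} epochs with lr={args.lr}")
    return 0


# Simulate running the command with `--epochs 3`.
main(["--epochs", "3"])
```

With setuptools, such a function is typically registered under `console_scripts` in the project configuration, which makes the command available on the PATH after `pip install`.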
-
Prince, Simon J. D. Understanding Deep Learning; MIT Press, 2023. link (preprint)
This book has excellent visualizations of trending models. Visualization is always a key to understanding complex concepts.
-
Vaswani, A. et al. Attention is all you need. arXiv, 2017. DOI
The original paper introducing the Transformer model.
-
Kipf, Thomas N. et al. Semi-Supervised Classification with Graph Convolutional Networks. arXiv, 2017. DOI
The original paper introducing the Graph Convolutional Network (GCN) model.
-
Veličković, Petar et al. Graph Attention Networks. arXiv, 2018. DOI
The original paper introducing the Graph Attention Network (GAT) model.
-
Shi, Yunsheng et al. Masked Label Prediction: Unified Message Passing Model for Semi-Supervised Classification. arXiv, 2020. DOI
The paper introducing the graph transformer model we've implemented in the lecture.
The source code examples used in this project are licensed under the MIT License. Other contents of this repository are licensed under the Creative Commons Attribution-ShareAlike 4.0 International License.
Copyright © 2022- Seoul National University Lab of Computational Biology and Biomolecular Engineering.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Footnotes
1. Contact the system administrator if you're not sure what you're doing.
2. An isolated "environment" for specific versions of Python and packages.
3. If not sure, consider installing it. It wouldn't hurt!
4. Abbreviation of In My Opinion. Frequently used in developer communities.
5. 8888 is the default port for Jupyter servers. It may depend on configuration and current port usage.