# HPC Tutorial



Major acknowledgement to Ilia Kulikov and Richard Pang for contributions to this notebook. 

## EXTREMELY IMPORTANT NOTES

Do not run compute-heavy jobs on log-in nodes!
- HPC admins (and other HPC users) will be very upset and you might get into trouble. 
- Don't worry if you don't yet understand what "compute-heavy" or "log-in nodes" mean. By the end of this section, you should have a better idea!


Do not email NYU HPC unless absolutely necessary!!!
- Check the tutorials written by NYU HPC staff: https://sites.google.com/nyu.edu/nyu-hpc/hpc-systems/cloud-computing/hpc-bursting-to-cloud (and other pages on this site) 
- Check the Slurm documentation
- Ask on Campuswire


# Part 1: Logging in to Greene

If you're not on the NYU network, you have three options (choose one):
- Use the NYU VPN (please figure this out on your own) and ssh directly to Greene
- Gateway (`ssh [netid]@gw.hpc.nyu.edu`) -> Greene
- CIMS cluster (if you have access) -> Greene (if you don't have a CIMS account, don't worry: you're not at any disadvantage)

If you're on the NYU network (including through the NYU VPN): `ssh [netid]@greene.hpc.nyu.edu`


```
| \ | \ \ / / | | | | | | |  _ \ / ___|
|  \| |\ V /| | | | | |_| | |_) | |
| |\  | | | | |_| | |  _  |  __/| |___
|_| \_| |_|  \___/  |_| |_|_|    \____|

  ____
 / ___|_ __ ___  ___ _ __   ___
| |  _| '__/ _ \/ _ \ '_ \ / _ \
| |_| | | |  __/  __/ | | |  __/
 \____|_|  \___|\___|_| |_|\___|
 ```

# Part 2: Looking around the filesystem using bash

By default we are given the [Bash](https://www.gnu.org/software/bash/) shell upon successful connection. A shell is an environment in which we can run commands, programs, and shell scripts. Today we will use the shell to manage files, list the contents of the filesystem, and run the Vim text editor.

Please learn these commands on your own if you're not familiar with them: `cd`, `pwd`, `rm` (as well as flags like `-r` or `-f`), `ls` (as well as flags like `-l`, `-a`, or `-h`), `du` (as well as flags like `-h` or `-s`), `cp`, `scp`. Use `man ls` to check the documentation for `ls`, for example. Some quick examples:
- `touch <filename>`: create an empty file named `<filename>`.
- `rm <filename>`: remove the file named `<filename>`.
- `cp <fp1> <fp2>`: copy the file at path `<fp1>` to path `<fp2>`.
- `mv <fp1> <fp2>`: move the file at path `<fp1>` to path `<fp2>`. This is also used to rename files.
- `cd <path>`: change the current directory to `<path>`.


The login node should not be used to run anything related to your computations. Use it only for file management (`git`, `rsync`) and job management (`srun`, `salloc`, `sbatch`).

Bash holds a set of environment variables that other software uses to locate libraries, helper scripts, and directories:
```
[ik1147@log-2 ~]$ env
LD_LIBRARY_PATH=:/share/apps/centos/8/usr/lib:/share/apps/centos/8/lib64
SSH_CONNECTION=216.165.22.148 32920 216.165.13.138 22
ARCHIVE=/archive/ik1147
LANG=en_US.UTF-8
HISTCONTROL=ignoredups
HOSTNAME=log-2.nyu.cluster
SCRATCH=/scratch/ik1147
.....
```
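Programs can read these variables too. As a minimal sketch (the helper name `scratch_dir` is hypothetical, and the `/scratch/<netid>` fallback simply mirrors the `SCRATCH` value shown in the listing above), a Python script might look up `SCRATCH` and fall back to a default when the variable is unset:

```python
import os
from pathlib import Path


def scratch_dir(netid, env=None):
    """Return the scratch directory, preferring the SCRATCH environment variable.

    `env` defaults to os.environ; passing a plain dict makes the lookup easy to test.
    The fallback /scratch/<netid> is an assumption based on the listing above.
    """
    env = os.environ if env is None else env
    value = env.get("SCRATCH")
    if value:
        return Path(value)
    return Path("/scratch") / netid
```

For example, `scratch_dir("ab1234")` (with a made-up netid) returns whatever `SCRATCH` points to on a node where it is set, and `/scratch/ab1234` elsewhere.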

## Important: different filesystems

So far we have been on Greene (filesystem B below). There are other filesystems as well.


```
Local ---> Greene login node ---> Greene compute node (NOT USED IN THIS COURSE)
                            |
                             ---> Burst node    ---> GCP compute node
```

### Filesystem A: Local


For example, your laptop. 

### Filesystem B: Greene

| Filesystem | Env var | What for | Flushed | Quota |
| --- | --- | --- | --- | --- |
| /archive | \$ARCHIVE | long-term storage | NO | 2TB / 20K inodes |
| /home | \$HOME | probably nothing | NO | 50GB / 30K inodes |
| /scratch | \$SCRATCH | experiments/stuff | YES (60 days) | 5TB / 1M inodes |

Check your quota with `myquota`.
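Both limits matter: a directory full of tiny files can exhaust the inode quota long before the byte quota. To see which of your own directories is responsible, a rough sketch like the following can help. The function name `usage` is hypothetical; it counts each file and directory as one entry and sums apparent file sizes, so the numbers only approximate what the quota system reports.

```python
import os


def usage(root):
    """Return (total_bytes, entry_count) for everything under `root`.

    Each file and directory counts as one entry, approximating inode usage.
    Symbolic links are not followed.
    """
    total_bytes = 0
    entries = 1  # count `root` itself
    for dirpath, dirnames, filenames in os.walk(root):
        entries += len(dirnames) + len(filenames)
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total_bytes += os.lstat(path).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total_bytes, entries
```

Calling `usage("/scratch/[netid]/some-dataset")` on a directory you suspect is large returns the two totals to compare against the table above.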

For this course:
- You probably won't be using `/archive`.  
- You will store very very few things (maybe just a few lines of environment-related code) on `/home`.


How to get on a Greene login node?
- `ssh [netid]@greene.hpc.nyu.edu` (see above).

Where to store your temporary data (it will disappear after you exit the node)?
- `/tmp`

Where to store data you want to keep?
- `/scratch/[netid]` (recommended)


### Burst: something between B and C

- How to get on the Burst node? After `ssh [netid]@greene.hpc.nyu.edu`, run `ssh burst`.
- It mostly contains the files from B, but not those from C.

### Filesystem C: NYU HPC GCP

Where to store your temporary data (will disappear after you exit the node)?
- `/tmp` or `/mnt/ram`

Where to store your data you want to keep?
- `/home/[netid]` or `/scratch/[netid]`

How to get on GCP compute nodes? Our class will have one account `csci_ga_2590-2023sp`, and three partitions: `interactive`, `n1s8-v100-1`, `n1s16-v100-2`.

- For simple scripts / file operations: `srun --account=csci_ga_2590-2023sp --partition=interactive --pty /bin/bash`
  - Check `hostname`: this is on Google Cloud.
  - `lscpu`: 1-2 CPUs.
  - `free -m`: around 2GB memory.

- For GPUs
  - `srun --account=csci_ga_2590-2023sp --partition=n1s8-v100-1 --gres=gpu --time=1:00:00 --pty /bin/bash`

Always use the `interactive` partition if you're only doing very simple operations (e.g., moving files around or editing code with vim).

---------

### How to copy files around?

From A to B (you must be on the NYU network; VPN is also okay)
- On A, do `scp [optional flags] [file-path] [netid]@greene.hpc.nyu.edu:[greene-destination-path]`

From B to A (you must be on the NYU network; VPN is also okay)
- On A, do `scp [optional flags] [netid]@greene.hpc.nyu.edu:[file-path] [local-destination-path]`

From B to C
- On C, do `scp [optional flags] greene-dtn:[file-path] [gcp-destination-path]`

From C to B
- On C, do `scp [optional flags] [file-path] greene-dtn:[greene-destination-path]`

From A to C
- A -> B -> C

From C to A
- C -> B -> A
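
The routes above can be summarized in code. The sketch below is only an illustration: the function name `copy_steps` and the `staging` parameter are hypothetical, and it merely prints which `scp` command to run on which machine rather than running anything. Paths containing spaces would need quoting, which is left out for brevity.

```python
GREENE = "greene.hpc.nyu.edu"  # Greene login host named above
DTN = "greene-dtn"             # alias used from GCP nodes in the commands above


def copy_steps(direction, netid, src, dst, staging=None):
    """Return a list of (machine_to_run_on, command) pairs for a transfer.

    direction: one of "A->B", "B->A", "B->C", "C->B", "A->C", "C->A".
    staging:   intermediate path on Greene, required for the two-hop routes.
    """
    remote = f"{netid}@{GREENE}"
    single_hop = {
        "A->B": ("A", f"scp {src} {remote}:{dst}"),
        "B->A": ("A", f"scp {remote}:{src} {dst}"),
        "B->C": ("C", f"scp {DTN}:{src} {dst}"),
        "C->B": ("C", f"scp {src} {DTN}:{dst}"),
    }
    if direction in single_hop:
        return [single_hop[direction]]
    if staging is None:
        raise ValueError("two-hop transfers need a staging path on Greene")
    if direction == "A->C":
        return (copy_steps("A->B", netid, src, staging)
                + copy_steps("B->C", netid, staging, dst))
    if direction == "C->A":
        return (copy_steps("C->B", netid, src, staging)
                + copy_steps("B->A", netid, staging, dst))
    raise ValueError(f"unknown direction: {direction}")


# Example with a made-up netid: laptop -> Greene scratch -> GCP scratch.
for machine, command in copy_steps(
    "A->C", "ab1234", "data.tar", "/scratch/ab1234/data.tar",
    staging="/scratch/ab1234/data.tar",
):
    print(f"on {machine}: {command}")
```

Note that the two hops run on different machines: the first on your laptop (A), the second on a GCP node (C).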



# Part 3: Slurm, Burst, Singularity, and your typical workflow

Slurm and Singularity are both widely used in academia and industry, so they are good tools to know in general. The material below might seem overwhelming if this is your first time using these tools, but you'll get comfortable with them very soon!


[Slurm](https://slurm.schedmd.com/documentation.html) is a job management system that allocates resources (computers) to you based on your requests and can also run scripts that you pass to it.

[Singularity](https://sites.google.com/nyu.edu/nyu-hpc/hpc-systems/greene/software/singularity-with-miniconda) is software for running containers: isolated user-space environments that can loosely be thought of as lightweight virtual machines. The main idea of using a container is to provide an isolated user space on a compute node and to simplify node management by using a single OS container image.


# Part 3.1: Interactive setting 

The typical workflow for **interactive** use (running a script or debugging) looks like this:

1. Log in to Greene's login node.
2. Log in to the Burst node.
3. Request a job / computational resource and wait until Slurm grants it.
   - You always need to request a job for GPUs.
4. Execute Singularity and start a container instance.
5. Activate the conda environment with your own deep learning libraries.
6. Run your code, make changes, and debug.

## Step 1: Log in to Greene's login node.

Described above. NEVER run compute-heavy jobs on the login nodes. You'll get scolded by HPC admins and you may get into trouble.

## Step 2: Burst

`ssh burst`

Then check the hostname with `hostname`. Do not run compute-heavy jobs on this node either!

## Step 3: Requesting compute node(s)

On NYU HPC GCP, our class will have one account `csci_ga_2590-2023sp`, and three partitions: `interactive`, `n1s8-v100-1`, `n1s16-v100-2`.

A potential source of confusion: one partition is called `interactive`. This partition name and the "interactive setting" in the heading above do not refer to the same thing.

---------

For simple scripts / file operations 
- `srun --account=csci_ga_2590-2023sp --partition=interactive --pty /bin/bash`
- Check the `man` page! Or here: https://slurm.schedmd.com/srun.html
- After getting onto GCP node:
  - Check `hostname`: this is on Google Cloud. 
  - `lscpu`: 1-2 CPUs.
  - `free -m`: around 2GB memory. 

For GPUs
- `srun --account=csci_ga_2590-2023sp --partition=n1s8-v100-1 --gres=gpu --time=1:00:00 --pty /bin/bash`

```
bash-4.4$ nvidia-smi
Mon Feb  6 15:31:06 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.48.07    Driver Version: 515.48.07    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:00:04.0 Off |                    0 |
| N/A   36C    P0    40W / 300W |      0MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

```

To end an `srun`-submitted interactive job, press Ctrl+D or type `exit`.

Check what this job looks like in the Slurm queue:
```
bash-4.4$ squeue -u [netid]
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
            109533 n1s8-v100     bash  nhj4247  R       4:06      1 b-3-106
```
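
If you want to check on jobs from a script rather than by eye, the table can be parsed. The following is a minimal sketch that assumes the default column layout shown above; the function name `parse_squeue` is hypothetical, and `SAMPLE` is just the output above pasted in as a string.

```python
SAMPLE = """\
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
            109533 n1s8-v100     bash  nhj4247  R       4:06      1 b-3-106
"""


def parse_squeue(text):
    """Parse default `squeue` output into a list of dicts keyed by column name."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    jobs = []
    for line in lines[1:]:
        # Split into at most len(header) fields so that anything after the
        # first seven columns stays together in the last column.
        fields = line.split(None, len(header) - 1)
        jobs.append(dict(zip(header, fields)))
    return jobs


for job in parse_squeue(SAMPLE):
    print(job["JOBID"], job["ST"], job["NODELIST(REASON)"])
```

The `ST` column holds Slurm's compact state code: `R` means running, `PD` pending, and `CF` configuring (the node is still being brought up).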

---------

PROBLEM: If you don't touch your keyboard for a while, or if your internet connection is unstable, the job may die. A workaround is to use `tmux` or `screen`.



## Steps 4 and 5: Set up Singularity and activate a conda environment with your own deep learning libraries

First we copy over the empty filesystem image where we will later put our conda environment (you only need to do this once per semester):

```
# On Burst: first get on GCP
srun --account=csci_ga_2590-2023sp --partition=n1s8-v100-1 --gres=gpu --time=1:00:00 --pty /bin/bash
# Then download the overlay filesystem
cd /scratch/[netid]
scp greene-dtn:/scratch/work/public/overlay-fs-ext3/overlay-25GB-500K.ext3.gz .
```

Unzip the ext3 filesystem image. This may take about 5 minutes.
```
gunzip -vvv ./overlay-25GB-500K.ext3.gz
```

The overlay filesystem can be mounted as read-write (`rw`) or read-only (`ro`) when we use it with Singularity.
- read-write: use this one when setting up the environment (installing conda, libraries, and other static files).
- read-only: use this one when running your jobs. It has to be read-only since multiple processes will access the same image. A job will crash if any other job has already mounted the image as read-write.

Now let's launch a Singularity container with the fresh filesystem we just copied over. You only need to copy the `.sif` image once, but you need to run `singularity exec` every time you want to run GPU jobs:



```
# On GCP (assuming our current directory is /scratch/[netid])

scp -rp greene-dtn:/scratch/work/public/singularity/cuda11.4.2-cudnn8.2.4-devel-ubuntu20.04.3.sif . 

singularity exec --bind /scratch --nv --overlay /scratch/[netid]/overlay-25GB-500K.ext3:rw /scratch/[netid]/cuda11.4.2-cudnn8.2.4-devel-ubuntu20.04.3.sif /bin/bash
```


**Important**: if you want to use GPUs inside Singularity, add the `--nv` argument after `exec`.
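
Because the same long command is reused with small variations (`rw` while installing, `ro` while running jobs, with or without `--nv`), it can help to build it programmatically. This is a sketch only: the helper name `singularity_command` is hypothetical, and the default file names are the ones used in this tutorial and are assumed to live in `/scratch/[netid]`.

```python
def singularity_command(netid, mode, use_gpu=True,
                        overlay="overlay-25GB-500K.ext3",
                        image="cuda11.4.2-cudnn8.2.4-devel-ubuntu20.04.3.sif",
                        program=("/bin/bash",)):
    """Build the `singularity exec` argument list used in this tutorial.

    mode:    "rw" while setting up the environment, "ro" while running jobs.
    use_gpu: add --nv so that the GPU is visible inside the container.
    """
    if mode not in ("rw", "ro"):
        raise ValueError("mode must be 'rw' or 'ro'")
    scratch = f"/scratch/{netid}"
    command = ["singularity", "exec", "--bind", "/scratch"]
    if use_gpu:
        command.append("--nv")
    command += ["--overlay", f"{scratch}/{overlay}:{mode}", f"{scratch}/{image}"]
    command += list(program)
    return command


# Print the read-only variant for a made-up netid.
print(" ".join(singularity_command("ab1234", "ro")))
```

The returned list can be passed directly to `subprocess.run(..., check=True)` on a GCP compute node.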

We are going to install conda in the `/ext3/` folder, where your overlay filesystem is mounted.

```
## On GCP
Singularity> cd /ext3/
Singularity> wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
--2023-02-06 15:47:47--  https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
Resolving repo.anaconda.com (repo.anaconda.com)... 104.16.130.3, 104.16.131.3, 2606:4700::6810:8303, ...
Connecting to repo.anaconda.com (repo.anaconda.com)|104.16.130.3|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 72402405 (69M) [application/x-sh]
Saving to: 'Miniconda3-latest-Linux-x86_64.sh'

Miniconda3-latest-Linux-x86_6 100%[=================================================>]  69.05M   180MB/s    in 0.4s    

2023-02-06 15:47:47 (180 MB/s) - 'Miniconda3-latest-Linux-x86_64.sh' saved [72402405/72402405]
```


Now install conda. If the installer asks where to install, type `/ext3/miniconda3`. If the installer asks at the end whether to modify your bashrc, agree:

```
ik1147@cl01:/ext3$ bash ./Miniconda3-latest-Linux-x86_64.sh
PREFIX=/ext3/miniconda3
Unpacking payload ...
```




Many Python libraries store static files, such as pretrained models, on disk when you first load a particular model. Let's re-route the cache location to `$SCRATCH` (this is `/scratch/[netid]`; if the variable doesn't exist, type `/scratch/[netid]` instead of `$SCRATCH`, or simply set `SCRATCH=/scratch/[netid]`).

First, create folders in scratch: 
```
(base) ik1147@cl01:~$ mkdir $SCRATCH/.cache
(base) ik1147@cl01:~$ mkdir $SCRATCH/.conda
```
Now remove any existing cache directories from your home directory:
```
(base) ik1147@cl01:~$ rm -rfv .conda
(base) ik1147@cl01:~$ rm -rfv .cache
```
Now create symbolic links (symlinks) to scratch:
```
(base) ik1147@cl01:~$ ln -s $SCRATCH/.conda ./
(base) ik1147@cl01:~$ ln -s $SCRATCH/.cache ./

(base) ik1147@cl01:~$ ls -l .conda
lrwxrwxrwx 1 ik1147 ik1147 22 Feb 26 18:02 .conda -> /scratch/ik1147/.conda
```
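
The same three steps (create the target, remove the old directory, link) can be written as one repeatable function. This is a sketch under the assumption that you are comfortable deleting the existing directories, exactly as the `rm -rf` commands above do; the function name `reroute` is hypothetical.

```python
import shutil
from pathlib import Path


def reroute(home, scratch, names=(".cache", ".conda")):
    """Replace home/<name> with a symlink to scratch/<name> for each name.

    WARNING: like the `rm -rf` step above, this deletes any existing
    home/<name> directory before creating the link.
    """
    home, scratch = Path(home), Path(scratch)
    for name in names:
        target = scratch / name
        target.mkdir(parents=True, exist_ok=True)
        link = home / name
        # Check for a symlink first: is_dir() would follow it to the target.
        if link.is_symlink() or link.is_file():
            link.unlink()
        elif link.is_dir():
            shutil.rmtree(link)
        link.symlink_to(target, target_is_directory=True)
```

Running it twice is harmless: the second call just replaces each link with an identical one. A typical call would be `reroute(Path.home(), "/scratch/[netid]")` with your own netid filled in.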

Now let's install a few more libraries:
```
## Make sure that your conda environment is activated (e.g. run `conda activate`)
pip install torch
pip install transformers
pip install nlp
pip install scikit-learn  # the PyPI package is scikit-learn; it is imported as sklearn
```

Let's check how 'heavy' our filesystem has become:
```
(base) Singularity> du -sh /ext3/
3.9G	/ext3/
```

We are capped at 25GB, so we are good to go. Feel free to install other packages along the way, but remember to mount the filesystem with `rw`; otherwise you will get read-only errors.



## Step 6: Run code

Let's try a quick demo with PyTorch. Enter `python`:

```
Python 3.10.8 (main, Nov 24 2022, 14:13:03) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> x = torch.tensor([1,2])
>>> x
tensor([1, 2])
```

Make sure PyTorch can see the GPU!
```
(base) Singularity> python
Python 3.10.8 (main, Nov 24 2022, 14:13:03) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda
<module 'torch.cuda' from '/ext3/miniconda3/lib/python3.10/site-packages/torch/cuda/__init__.py'>
>>> torch.cuda.is_available()
True
>>> 
```

# Part 3.2: Batch setting

Personally, for very quick experiments or for debugging, I use the interactive setting (Part 3.1). For submitting lots of experiments (maybe I want to start some experiments and check the results in the morning), I use the batch setting.

The typical workflow for submitting batch jobs looks like this:

1. Log in to Greene's login node.
2. Log in to the Burst node.
3. Submit an `sbatch` script. Within this script, do the following:
   - Request a job / computational resource.
   - Execute Singularity and start a container instance.
   - Activate the conda environment with your own deep learning libraries.
   - Run your code.

Here we consider the following python script which uses the available GPU:
https://nyu-cs2590.github.io/course-material/spring2023/section/sec02/test_gpu.py

Now we want to submit a job from the Burst node using Slurm. The job will simply run this Python script on the allocated machine, save its stdout output to a file, and exit.

Make sure your SCRATCH is set to your scratch folder: `SCRATCH=/scratch/[netid]`

First, download the Python script that the job will run.

```
# Run from the burst node
cd $SCRATCH

wget https://nyu-cs2590.github.io/course-material/spring2023/section/sec02/test_gpu.py
```

Second, download the batch submission script:
```
cd $SCRATCH

wget https://nyu-cs2590.github.io/course-material/spring2023/section/sec02/gpu_job.slurm
```



We need to add the `/ext3/env.sh` script to your overlay filesystem. Here we use Singularity purely to copy over the script, i.e. we are not running any actual computations there:
```
[nhj4247@log-burst nhj4247]$ srun --account=csci_ga_2590-2023sp --partition=n1s8-v100-1 --gres=gpu --time=1:00:00 --pty /bin/bash
bash-4.4$ singularity exec --bind /scratch --nv --overlay /scratch/nhj4247/overlay-25GB-500K.ext3:rw /scratch/nhj4247/cuda11.4.2-cudnn8.2.4-devel-ubuntu20.04.3.sif /bin/bash

Singularity> 

Singularity> wget https://nyu-cs2590.github.io/course-material/spring2023/section/sec02/env.sh -O /ext3/env.sh
--2023-02-06 21:25:39--  https://nyu-cs2590.github.io/course-material/spring2023/section/sec02/env.sh
Resolving nyu-cs2590.github.io (nyu-cs2590.github.io)... 185.199.111.153, 185.199.109.153, 185.199.108.153, ...
Connecting to nyu-cs2590.github.io (nyu-cs2590.github.io)|185.199.111.153|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 98 [application/x-sh]
Saving to: '/ext3/env.sh'

/ext3/env.sh                  100%[=================================================>]      98  --.-KB/s    in 0s      

2023-02-06 21:25:40 (8.07 MB/s) - '/ext3/env.sh' saved [98/98]

Singularity> cat /ext3/env.sh 
#!/bin/bash

source /ext3/miniconda3/etc/profile.d/conda.sh
export PATH=/ext3/miniconda3/bin:$PATH
Singularity> exit
bash-4.4$ exit
```

Now we submit the job using the `sbatch` command and check on it using the `squeue` command.

Note: make sure to change the Singularity image name in `gpu_job.slurm` to the correct one: `cuda11.4.2-cudnn8.2.4-devel-ubuntu20.04.3.sif`.
```
[nhj4247@log-burst nhj4247]$ sbatch gpu_job.slurm 
Submitted batch job 109546
[nhj4247@log-burst nhj4247]$ squeue -u nhj4247
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
            109546 n1s8-v100 job_wgpu  nhj4247 CF       0:12      1 b-3-30
```
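
The downloaded `gpu_job.slurm` is not reproduced in these notes. As a rough idea of what a batch script for this workflow might contain, the sketch below renders one from Python. Everything in it is an assumption rather than the course's actual file: the function name `render_job_script` is hypothetical, the `%j_%x.out` output pattern (job ID, then job name) is chosen only because it would produce a log name like the `109546_job_wgpu.out` seen in this tutorial, and the overlay is mounted `ro` as recommended for running jobs.

```python
def render_job_script(netid, script="test_gpu.py", job_name="job_wgpu",
                      account="csci_ga_2590-2023sp", partition="n1s8-v100-1",
                      time_limit="1:00:00"):
    """Return the text of a hypothetical sbatch script for this tutorial's setup."""
    scratch = f"/scratch/{netid}"
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --account={account}",
        f"#SBATCH --partition={partition}",
        "#SBATCH --gres=gpu",
        f"#SBATCH --time={time_limit}",
        "#SBATCH --requeue",
        "#SBATCH --output=%j_%x.out",
        "",
        # Mount the overlay read-only, load conda via env.sh, then run the script.
        "singularity exec --bind /scratch --nv \\",
        f"    --overlay {scratch}/overlay-25GB-500K.ext3:ro \\",
        f"    {scratch}/cuda11.4.2-cudnn8.2.4-devel-ubuntu20.04.3.sif \\",
        f"    /bin/bash -c 'source /ext3/env.sh; python {scratch}/{script}'",
    ]
    return "\n".join(lines) + "\n"


print(render_job_script("ab1234"))  # made-up netid
```

Comparing this against the real `gpu_job.slurm` is a good way to check that you understand each line before submitting.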

After the job is done, we check the corresponding output log (it will be on the GCP filesystem, so you will need to request a node first):
```
bash-4.4$ cat 109546_job_wgpu.out 
Torch cuda available: True
GPU name: Tesla V100-SXM2-16GB


CPU matmul elapsed: 1.8376452922821045 sec.
GPU matmul elapsed: 4.6015520095825195 sec.
```

Note that the GPU matmul is slower than the CPU one in this log. This is probably because the first CUDA operation in a process pays one-time initialization costs; a timing taken after a warm-up call would likely look very different.

### Note

In the `sbatch` script above we need to use the `--requeue` option. Why? GCP uses preemptible instances:
- Max time: 24 hours.
- Your job might be killed (with low-to-medium probability) at any time through no fault of your own. When that happens the job is requeued, so your code needs to resume automatically from its last checkpoint (perhaps always named `checkpoint_last.pt`); if that file doesn't exist, initialize the network from scratch.
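
The resume-or-initialize pattern can be sketched without any deep learning library. In the example below, `pickle` and a small dictionary stand in for a real model; with PyTorch you would typically store the model's and optimizer's `state_dict()` and use `torch.save` / `torch.load` instead. The function names and the `save_every` parameter are hypothetical.

```python
import os
import pickle

CHECKPOINT = "checkpoint_last.pt"  # file name suggested above


def load_or_init(path=CHECKPOINT):
    """Return the saved training state, or a fresh one if no checkpoint exists."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    return {"step": 0, "weights": [0.0]}  # stand-in for a freshly initialized network


def save(state, path=CHECKPOINT):
    """Write to a temporary file, then rename, so a kill mid-write cannot corrupt the checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, path)


def train(total_steps, path=CHECKPOINT, save_every=10):
    """Train up to `total_steps`, resuming from `path` if it exists."""
    state = load_or_init(path)
    while state["step"] < total_steps:
        state["weights"][0] += 1.0  # placeholder for one optimization step
        state["step"] += 1
        if state["step"] % save_every == 0:
            save(state, path)
    save(state, path)
    return state
```

Because `train` always starts by calling `load_or_init`, a requeued job simply runs the same command again and continues from the last saved step. Saving more often costs I/O time but loses less work on preemption.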

Jupyter notebook example:
- You might find it simpler to run on your laptop (you can download Anaconda which includes Jupyter notebook), if you don't need GPUs. 
- If you need GPUs, on your GCP instance, go to `/share/apps/examples/jupyter` and edit `run-jupyter-gpu.sbatch` (change the account and the Singularity image). Then submit the script with `sbatch`.