# Tensorflow-gpu-test
This repo includes a simple script that verifies the TensorFlow installation is correct and tests whether a GPU device is recognized and used by TensorFlow.
It also serves as a basic tutorial on training models on an HPC cluster by submitting Slurm jobs. To run your own code, you only need to change the name of the Python file at the end of the 'gpu.slurm' script.
# Instructions
Start from a conda virtual environment with tensorflow-gpu installed:
```bash
$ conda install -c anaconda tensorflow-gpu
```
1) Submit the Slurm job (script taken from here: http://hpc.coventry.domains/software/cuda-and-gpu-use-on-hpc/submitting-gpu-based-job/):
```bash
$ sbatch gpu.slurm
```
This submits a Slurm job that runs the Python file 'gputest.py', which checks the TensorFlow installation and whether a GPU is being used.
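The contents of 'gputest.py' are not shown in this README. As a rough idea of what such a check might look like, here is a minimal sketch, assuming it only reports the TensorFlow version and the default GPU device. The function name `gpu_report` is hypothetical, and the printed labels are chosen to match the expected output shown in step 4.

```python
def gpu_report():
    """Return report lines in the style of the expected job output."""
    try:
        # Imported lazily so a missing install is reported instead of raised.
        import tensorflow as tf
    except ImportError as exc:
        return [f"Tensorflow import failed: {exc}"]

    # tf.test.gpu_device_name() returns an empty string when no GPU is visible.
    device = tf.test.gpu_device_name()
    return [
        f"Tensorflow version: {tf.__version__}",
        f"Default GPU device: {device}" if device else "No GPU device found",
    ]


if __name__ == "__main__":
    print("\n".join(gpu_report()))
```

Running this on a login node without a GPU would be expected to report that no GPU device was found, which is why the check is submitted through Slurm.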
2) An output file named 'slurm-XXXXXXX.out' should appear in your directory.
3) To display the contents of the file in the console:
```bash
$ cat NAME_OF_SLURM_OUTPUT_FILE
```
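If several jobs have been submitted, it can be unclear which output file is the newest. The following is a small sketch, assuming the output files follow the 'slurm-XXXXXXX.out' naming pattern mentioned above and live in the current directory. The helper name `latest_slurm_output` is hypothetical.

```python
from pathlib import Path


def latest_slurm_output(directory="."):
    """Return the most recently modified slurm-*.out file, or None if absent."""
    candidates = sorted(
        Path(directory).glob("slurm-*.out"),
        key=lambda p: p.stat().st_mtime,  # oldest first, newest last
    )
    return candidates[-1] if candidates else None


if __name__ == "__main__":
    path = latest_slurm_output()
    print(path.read_text() if path else "No slurm-*.out file found yet")
```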
4) If the GPU and TensorFlow are set up correctly, you should see something like this at the end of the output:
```bash
Tensorflow version: 2.4.1
Default GPU device: /device:GPU:0
```
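Instead of reading the output by eye, the two lines can be extracted programmatically. This is a minimal sketch, assuming the output contains lines in exactly the format shown above. The function name `check_output` is hypothetical.

```python
import re


def check_output(text):
    """Extract the Tensorflow version and GPU device from job output text."""
    version = re.search(r"^Tensorflow version:[ \t]*(\S+)", text, re.MULTILINE)
    device = re.search(r"^Default GPU device:[ \t]*(\S+)", text, re.MULTILINE)
    return {
        "version": version.group(1) if version else None,
        "device": device.group(1) if device else None,
    }


sample = "Tensorflow version: 2.4.1\nDefault GPU device: /device:GPU:0\n"
print(check_output(sample))
# prints {'version': '2.4.1', 'device': '/device:GPU:0'}
```

A `None` value for "device" would suggest that the job ran but TensorFlow did not see a GPU.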
5) To verify that more than one GPU is being used (e.g. 2):
First, change the '--gres' line in the 'gpu.slurm' file to this:
```bash
#SBATCH --gres=gpu:K20:2
```
Then, after resubmitting 'gpu.slurm' as explained in step 1, you should see lines like these in your new output file:
```bash
Adding visible gpu devices: 0, 1
Tensorflow version: 2.4.1
Default GPU device: /device:GPU:0
```
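To confirm the number of GPUs without scanning the log manually, the device list can be parsed from the output. This is a sketch under the assumption that the log contains an 'Adding visible gpu devices' line in the format shown above. The function name `visible_gpu_ids` is hypothetical.

```python
import re


def visible_gpu_ids(text):
    """Return GPU ids from the last 'Adding visible gpu devices' log line."""
    matches = re.findall(r"Adding visible gpu devices:\s*([\d, ]+)", text)
    if not matches:
        return []
    # Split the comma-separated id list, ignoring empty fragments.
    return [int(part) for part in matches[-1].split(",") if part.strip()]


sample = (
    "Adding visible gpu devices: 0, 1\n"
    "Tensorflow version: 2.4.1\n"
    "Default GPU device: /device:GPU:0\n"
)
print(len(visible_gpu_ids(sample)))
# prints 2
```

If the count is lower than the number requested in 'gpu.slurm', the allocation or the TensorFlow GPU setup is worth rechecking.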