You can install the VESSL CLI through pip:
pip install --upgrade vessl
Train nanoGPT with VESSL Run
To help you get started, we prepared a quickstart command that bundles several example YAML files for popular open-source models on GitHub. The quickstart command displays a list of example models. At this step, you will be asked to log in and grant access permission.
Select nanogpt from the list of models. This initiates a VESSL Run with the following
nanogpt.yaml file, which you can also check in your terminal as the Run starts.
name: nanogpt
image: nvcr.io/nvidia/pytorch:22.03-py3
resources:
  cluster: aws-apne2
  preset: v1.v100-1.mem-52
import:
  /root/examples: git://github.com/vessl-ai/examples
export:
  /output: vessl-artifact://
run:
  - workdir: /root/examples/nanogpt
    command: |
      pip install torchaudio -f https://download.pytorch.org/whl/cu111/torch_stable.html
      pip install transformers datasets tiktoken wandb tqdm
      python data/shakespeare_char/prepare.py
      python train.py config/train_shakespeare_char.py
      python sample.py --out_dir=out-shakespeare-char
The command performs the following as defined in the YAML file:
- Launch a training job & cluster on AWS with 1 NVIDIA V100 GPU.
- Configure the runtime with the CUDA-enabled NVIDIA PyTorch 22.03 container image.
- Mount the GitHub examples repo that contains nanoGPT and set the working directory.
- Run the task’s commands defined under run.
- Track training progress on VESSL.
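The mapping between the YAML keys and the steps listed above can be sketched in Python. This is only an illustration: the dict mirrors the nanogpt.yaml shown earlier, and `NANOGPT_SPEC` and `summarize_run` are hypothetical names introduced here, not part of the VESSL CLI or SDK.

```python
# Hypothetical sketch: a plain dict mirroring nanogpt.yaml.
# Nothing here calls VESSL; it only shows which key drives which step.
NANOGPT_SPEC = {
    "name": "nanogpt",
    "image": "nvcr.io/nvidia/pytorch:22.03-py3",
    "resources": {"cluster": "aws-apne2", "preset": "v1.v100-1.mem-52"},
    "import": {"/root/examples": "git://github.com/vessl-ai/examples"},
    "export": {"/output": "vessl-artifact://"},
    "run": [
        {
            "workdir": "/root/examples/nanogpt",
            "command": "\n".join(
                [
                    "pip install torchaudio -f https://download.pytorch.org/whl/cu111/torch_stable.html",
                    "pip install transformers datasets tiktoken wandb tqdm",
                    "python data/shakespeare_char/prepare.py",
                    "python train.py config/train_shakespeare_char.py",
                    "python sample.py --out_dir=out-shakespeare-char",
                ]
            ),
        }
    ],
}


def summarize_run(spec):
    """Return one human-readable line per step implied by the spec."""
    lines = []
    resources = spec["resources"]
    # resources -> where the job is launched and on which hardware preset
    lines.append(f"cluster={resources['cluster']} preset={resources['preset']}")
    # image -> the container runtime
    lines.append(f"image={spec['image']}")
    # import -> sources mounted into the container
    for path, source in spec.get("import", {}).items():
        lines.append(f"import {source} -> {path}")
    # export -> container paths saved after the Run
    for path, target in spec.get("export", {}).items():
        lines.append(f"export {path} -> {target}")
    # run -> commands executed in the working directory
    for step in spec["run"]:
        commands = [c for c in step["command"].splitlines() if c.strip()]
        lines.append(f"run {len(commands)} command(s) in {step['workdir']}")
    return lines


if __name__ == "__main__":
    for line in summarize_run(NANOGPT_SPEC):
        print(line)
```

Reading the spec this way makes it easier to see what to change when adapting the example: the preset for different hardware, the import source for your own code, and the command list for your own training script.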
Click the output link in your terminal to check the training progress for the Run along with the key metrics and hyperparameters.
You can also launch the same Run by copying and pasting the YAML above and running the following command.
vessl run -f nanogpt.yaml
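If you prefer to launch the Run from a script, the same command can be wrapped with Python's standard library. This is a minimal sketch: `build_run_command` and `launch` are hypothetical helper names, and the only VESSL-specific piece is the `vessl run -f` command shown above. It assumes the CLI is installed and that you are already logged in.

```python
import shutil
import subprocess


def build_run_command(yaml_path):
    """Return the argument list for `vessl run -f <yaml_path>`."""
    return ["vessl", "run", "-f", yaml_path]


def launch(yaml_path):
    """Run the VESSL CLI with the given YAML file and return its exit code."""
    # Fail early with a clear message if the CLI is not installed.
    if shutil.which("vessl") is None:
        raise RuntimeError("vessl CLI not found on PATH; install it with pip first")
    # check=True raises CalledProcessError if the CLI exits with a non-zero status.
    completed = subprocess.run(build_run_command(yaml_path), check=True)
    return completed.returncode


if __name__ == "__main__":
    launch("nanogpt.yaml")
```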
Run’s unified YAML interface is most useful when you (1) fine-tune a model with your own dataset, (2) scale it on your cloud or on-premises clusters, and (3) create a micro AI/ML app. Follow the guides below to experiment with popular models like Dreambooth Stable Diffusion and Segment Anything using VESSL Run.
Run a GPU-backed training job
Leverage the power of GPUs to train efficiently with a batch Run.
Run a GPU-backed Jupyter and SSH server
Start a real-time interactive Run session on GPUs.
Backup and Restore Data with VESSL Artifact
Run, Backup, Repeat: VESSL Run with VESSL Artifact.
Dataset for a Run
Multiple ways to configure a dataset for a Run.