Constructing a Run
With VESSL’s user-friendly Web Console, setting up a new machine learning run is easier than ever. There are two primary ways to create a run in the Web Console.
Create a Run from scratch
Metadata configuration annotates runs with additional contextual information, such as the name and tags. Note that the name is a required field and tags are unique within the project.
You can create a run on either VESSL’s managed cluster or your custom cluster. Start by selecting a cluster.
Cluster & Resource
VESSL managed cluster
Once you have selected VESSL’s managed cluster, you can view a list of available resources under the dropdown menu.
You also have the option to use spot instances.
Run on spot instance
Handling spot interruption and checkpointing to preserve your work.
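One common way to tolerate a spot interruption is to save progress periodically and resume from the latest checkpoint when the run restarts. The following is a minimal sketch of that pattern, not VESSL's own mechanism. The checkpoint path, the `train_one_epoch` placeholder, and the JSON state layout are assumptions made for illustration.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"epoch": 0, "loss": None}


def save_checkpoint(path, state):
    """Write atomically so an interruption mid-write cannot corrupt the file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def train_one_epoch(epoch):
    """Hypothetical stand-in for real training; returns a fake loss."""
    return 1.0 / (epoch + 1)


def train(path, total_epochs):
    state = load_checkpoint(path)
    # Resume from the epoch after the last one that was saved.
    for epoch in range(state["epoch"], total_epochs):
        state = {"epoch": epoch + 1, "loss": train_one_epoch(epoch)}
        save_checkpoint(path, state)
    return state
```

Calling `train` again with the same path after an interruption skips the epochs that were already saved. In practice the checkpoint file would need to live on storage that outlives the container.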
Check out the full list of resource types and corresponding prices:
Calculating fees according to the time and type of computational resources consumed.
The container image typically specifies the Docker image to be used for the run. The image contains all required dependencies and the environment needed to execute your machine learning model. You can use either a VESSL-managed image or your own custom image.
VESSL managed image
Managed images serve as wrapper images built on top of NVIDIA GPU Cloud (NGC) images, providing an optimized and streamlined environment for GPU-accelerated applications and workflows.
The volumes configuration plays a crucial role in managing data flows with respect to the run container. Three primary volume operations, including import and export, determine data accessibility and transfer mechanisms.
During an import operation, the specified data is downloaded into the run container. This is particularly useful when the container requires local access to certain data before or during execution.
- Code: Source code required for the run.
- Dataset: The dataset registered in VESSL Dataset.
- Model: Pre-trained ML checkpoints registered in VESSL Model Registry.
- VESSL Artifact: Storage managed within VESSL. You can use it as a backup volume.
- Object Storage: Data stored in a generic object storage.
- Files: Uploaded local files.
Backup and Restore Data
Run, Backup, Repeat: GPU-powered JupyterLab with VESSL Artifact
By understanding and correctly configuring these volume options, users can create a flexible and efficient data flow strategy in their VESSL Runs.
Start commands are a collection of commands that specify how a container should begin execution after it is initialized. These commands can be grouped into two categories.
- Commands that pair a working directory with the command to be run in the container.
- A wait command to introduce a delay before or between command execution.
The start command can be empty to signify an interactive run where the user is expected to manually execute commands within the container.
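The two categories above can be modeled as a small data structure. The sketch below is illustrative only: the `Command` and `Wait` class names and the rendering into a shell script are assumptions, not VESSL's actual schema or execution logic.

```python
import shlex
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Command:
    workdir: str  # directory to change into before running
    command: str  # shell command to execute


@dataclass
class Wait:
    seconds: int  # delay before the next step


Step = Union[Command, Wait]


def render_script(steps: List[Step]) -> str:
    """Render steps as a shell script; an empty list means an interactive run."""
    if not steps:
        return ""  # nothing to run; the user drives the container manually
    lines = []
    for step in steps:
        if isinstance(step, Wait):
            lines.append(f"sleep {step.seconds}")
        else:
            lines.append(f"cd {shlex.quote(step.workdir)} && {step.command}")
    return "\n".join(lines)


steps = [
    Command(workdir="/code", command="pip install -r requirements.txt"),
    Wait(seconds=10),
    Command(workdir="/code", command="python main.py"),
]
print(render_script(steps))
```

Each `Command` becomes a `cd` into its working directory followed by the command, and each `Wait` becomes a `sleep`.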
Interactive specifies whether the container allows interactive communication with the user.
It is particularly useful for debugging, data analysis, or running services that require user interaction. By default, the interactive run supports JupyterLab and SSH. Both Max runtime and Jupyter idle timeout are useful for managing resource usage and costs. You can also use multiple types of custom services via specified ports.
Port configuration is a list of maps that specifies the ports an application or service should expose. Each map within the list defines specific attributes of a port, such as its number, name, and type.
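As an illustration, a port list could be represented and checked as follows. The keys `name`, `type`, and `port`, and the set of allowed type values, are assumptions based on the attributes named above, not a confirmed VESSL schema.

```python
ALLOWED_TYPES = {"http", "tcp"}  # assumed set of port types


def validate_ports(ports):
    """Return a list of error messages; an empty list means the config is valid."""
    errors = []
    seen = set()
    for i, entry in enumerate(ports):
        number = entry.get("port")
        if not isinstance(number, int) or not 1 <= number <= 65535:
            errors.append(f"entry {i}: port must be an integer in 1..65535")
        elif number in seen:
            errors.append(f"entry {i}: port {number} is listed twice")
        else:
            seen.add(number)
        if not entry.get("name"):
            errors.append(f"entry {i}: name is required")
        if entry.get("type") not in ALLOWED_TYPES:
            errors.append(f"entry {i}: type must be one of {sorted(ALLOWED_TYPES)}")
    return errors


# Hypothetical example: two services exposed from the same container.
ports = [
    {"name": "tensorboard", "type": "http", "port": 6006},
    {"name": "api", "type": "http", "port": 8000},
]
print(validate_ports(ports))
```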
You can set environment variables as key-value pairs.
A typical machine learning run will include hyperparameters such as learning_rate and optimizer. You can also use them at runtime by appending them to the start command as follows.
python main.py \
  --learning_rate $learning_rate --optimizer $optimizer
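On the script side, the flags passed in the start command need to be parsed. The following is a minimal sketch of what a `main.py` might look like, assuming it uses `argparse`. Falling back to the environment variables when a flag is omitted is an illustrative choice, and the default values are placeholders.

```python
import argparse
import os


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Hypothetical training entry point")
    # Fall back to the environment variables when a flag is not passed.
    parser.add_argument(
        "--learning_rate",
        type=float,
        default=float(os.environ.get("learning_rate", "0.001")),
    )
    parser.add_argument(
        "--optimizer",
        default=os.environ.get("optimizer", "adam"),
    )
    return parser.parse_args(argv)


# Simulate the arguments the start command would pass after shell expansion.
args = parse_args(["--learning_rate", "0.01", "--optimizer", "sgd"])
print(f"learning_rate={args.learning_rate} optimizer={args.optimizer}")
```

In a real script, calling `parse_args()` with no argument reads the flags from the command line.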
If you have sensitive information like API keys or passwords that you need to include in your environment, you can mark these variables as secrets. The values will never be shown in the UI, providing an extra layer of security.
Service account name
A service account is a type of non-human account that, in Kubernetes, provides a distinct identity in a cluster. The account is useful for implementing identity-based security policies. Create one in a Kubernetes cluster and specify its name.
Checking the termination protection option keeps the run idle once it completes, so that you can access the container of the finished run.
Create a Run from template
Initiate a new run using a pre-configured template as a baseline. Instead of setting up each parameter and configuration from scratch, you can use a template that already has essential settings and parameters defined. This can significantly accelerate the deployment and testing phases of your projects by reusing configurations that are known to work well for specific use-cases.
The template typically comes in a YAML format. You can further customize these templates to better fit your specific requirements, making it a versatile tool for repetitive or complex tasks.
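Customizing a template amounts to overriding a few values while keeping the rest. The sketch below shows that idea on a plain dictionary, as the template might look after its YAML has been loaded. The keys and values are hypothetical, and `apply_overrides` is an illustrative helper rather than part of VESSL.

```python
import copy


def apply_overrides(template, overrides):
    """Return a copy of template with overrides merged in, recursing into nested maps."""
    result = copy.deepcopy(template)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = apply_overrides(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Hypothetical template contents, as they might look after loading the YAML.
template = {
    "name": "baseline-run",
    "env": {"learning_rate": "0.001", "optimizer": "adam"},
}
custom = apply_overrides(template, {"name": "sgd-run", "env": {"optimizer": "sgd"}})
print(custom)
```

The original template is left unchanged, so the same baseline can be reused for several variations.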
Additionally, for more advanced configurations and examples, you can visit VESSL Hub. The hub offers a variety of YAML examples that you can use as references.