Docker Setup¶
This guide describes how to provision a reproducible MarineGym training or development workspace using Docker. Containerising the workflow keeps dependencies consistent across machines and simplifies GPU access.
Prerequisites¶
Ubuntu 24.04 or newer.
NVIDIA GPU with CUDA support (RTX 40-series recommended).
Docker Engine installed and running.
Reliable internet connectivity for image downloads.
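Before proceeding, a quick sanity check of these prerequisites can save time later (exact output varies by machine):
lsb_release -ds              # expect Ubuntu 24.04 or newer
nvidia-smi                   # the GPU and driver version should be listed
docker --version             # Docker Engine must be installed
systemctl is-active docker   # should print "active"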
Environment Setup¶
Install NVIDIA Container Toolkit.
(Optional) Configure a network proxy for outbound services such as Weights & Biases.
Prepare local project directories.
Fetch the MarineGym container image.
Launch the container inside a screen session.
Run training workloads and monitor progress.
Install NVIDIA Container Toolkit¶
The toolkit exposes GPU resources to containers:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
Restart Docker after configuration changes to apply the runtime settings.
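To confirm the runtime was registered, you can inspect the daemon configuration; a full GPU smoke test appears in the Troubleshooting section below.
docker info | grep -i runtimes   # "nvidia" should appear among the runtimes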
(Optional) Configure Network Proxy¶
If outbound connectivity is restricted, install a CLI proxy client before running training jobs that report to cloud services such as Weights & Biases:
export url='https://fastly.jsdelivr.net/gh/juewuy/ShellCrash@master'
wget -q --no-check-certificate -O /tmp/install.sh "$url/install.sh"
bash /tmp/install.sh
source /etc/profile >/dev/null 2>&1
Skip this step if the host already has unrestricted access.
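Either way, it is worth checking that the services your jobs report to are reachable; for example, for Weights & Biases:
curl -sI https://api.wandb.ai | head -n 1   # an HTTP status line indicates outbound HTTPS works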
Prepare Project Files¶
Install screen for long-running sessions, then clone the project into a working directory:
sudo apt install -y screen
mkdir -p ~/Programing
cd ~/Programing
git clone https://github.com/Marine-RL/MarineGym.git
The repository will be available at ~/Programing/MarineGym.
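A quick check that the clone succeeded, assuming the paths above:
ls ~/Programing/MarineGym                        # repository contents, including scripts/
git -C ~/Programing/MarineGym log -1 --oneline   # latest commit on the default branch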
Download the Docker Image¶
Pull the pre-built MarineGym environment image:
docker pull raiots/marinegym:env
The image bundles Isaac Sim, Conda environments, and required Python packages.
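Once the pull completes, confirm the image is available locally:
docker images raiots/marinegym   # the env tag should be listed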
Start a Container in screen¶
Use screen to keep training active if the SSH session disconnects:
screen -S omni
Inside the screen session, launch the container:
docker run --rm --entrypoint bash -it \
--runtime=nvidia \
--gpus '"device=0"' \
-e "ACCEPT_EULA=Y" \
-e "PRIVACY_CONSENT=Y" \
--network=host \
-v /root/Programing/:/Programing \
raiots/marinegym:env
Parameter highlights:
--rm: remove the container when it exits.
--runtime=nvidia: enable GPU passthrough.
--gpus '"device=0"': bind GPU 0 (adjust as needed).
--network=host: inherit the host network stack.
-v /root/Programing/:/Programing: mount local project files into the container (adjust the host path if your clone lives elsewhere, e.g. ~/Programing for a non-root user).
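For development you may prefer a persistent, named container over a throwaway one; a minimal sketch, with an arbitrary container name:
# Same launch without --rm, so state survives exits
docker run --entrypoint bash -it \
  --name marinegym-dev \
  --runtime=nvidia \
  --gpus '"device=0"' \
  -e "ACCEPT_EULA=Y" \
  -e "PRIVACY_CONSENT=Y" \
  --network=host \
  -v /root/Programing/:/Programing \
  raiots/marinegym:env
# Later, restart and re-attach to the same container
docker start -ai marinegym-dev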
Run Training Jobs¶
Within the container, activate the Conda environment and move into the scripts directory:
cd /Programing/MarineGym/scripts
conda activate sim
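Before launching a long run, it can be worth confirming that the environment sees the GPU. This assumes the sim environment ships PyTorch, which is a reasonable but unverified assumption here:
python -c "import torch; print(torch.cuda.is_available(), torch.cuda.device_count())"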
Authenticate Weights & Biases (replace the placeholder with your API key):
export WANDB_API_KEY=<your_wandb_api_key>
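Alternatively, the wandb CLI can store the credential for you, or you can force local-only logging with an environment variable:
wandb login <your_wandb_api_key>   # persists the key for future runs
export WANDB_MODE=offline          # skip cloud syncing entirely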
Launch a representative training run:
python train.py task=Hover algo=ppo headless=false enable_livestream=false
Key parameters:
task: selects the training scenario (Hover in this example).
algo: selects the RL algorithm (PPO here).
headless: set to true to disable GUI rendering for server usage.
task.env.num_envs: number of parallel environments (tune based on GPU memory).
wandb.mode: set to offline if the host lacks external connectivity.
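Putting these overrides together, a server-friendly run might look like the following (the num_envs value is illustrative):
python train.py task=Hover algo=ppo headless=true \
    task.env.num_envs=1024 wandb.mode=offline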
Managing screen Sessions¶
Detach: Ctrl+A followed by D.
Reattach: screen -r omni.
List sessions: screen -ls.
Terminate: run exit inside the session or press Ctrl+D.
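For unattended runs, screen can also start detached and capture output to a log file; a small sketch:
screen -L -dmS omni   # start detached; -L logs output to screenlog.0
screen -r omni        # attach later to inspect progress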
Troubleshooting¶
GPU unavailable
Ensure the container sees the GPU:
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
No network access inside the container
Review firewall rules. As a fallback, launch with --network=bridge and map only the required ports.
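A minimal sketch of that fallback; the mapped port is hypothetical and depends on which services you actually expose:
docker run --rm --entrypoint bash -it \
  --runtime=nvidia --gpus '"device=0"' \
  -e "ACCEPT_EULA=Y" -e "PRIVACY_CONSENT=Y" \
  --network=bridge -p 8211:8211 \
  -v /root/Programing/:/Programing \
  raiots/marinegym:env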
Slow training
Reduce task.env.num_envs to lower memory usage, or scale up GPU resources. Monitor GPU and CPU utilisation with nvidia-smi and htop.
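For continuous monitoring from another screen window or SSH session:
watch -n 1 nvidia-smi   # refresh GPU utilisation every second
htop                    # CPU, memory, and per-process load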
Development Tips¶
Drop --rm to preserve container state between restarts.
Lower task.env.num_envs during debugging for faster iteration.
Set headless=false to visualise simulations when running on a workstation.
Use wandb.mode=offline to store training logs locally if internet access is restricted (these tips are combined in the sketch after this list).
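Combining these tips, a lightweight debugging invocation might look like this (the small num_envs value is illustrative):
python train.py task=Hover algo=ppo headless=false \
    task.env.num_envs=16 wandb.mode=offline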
Iterate with lightweight settings while prototyping, then scale up concurrency and resolution for production training.