March 5, 2026 5 min read

Anaconda to Apptainer: Reproducible Python Environments

Learn how to export your local Conda environment into a production-ready Apptainer container that runs identically across laptops, HPC clusters, and cloud systems. This step-by-step guide eliminates broken builds and version conflicts in distributed computing.

Build Reproducible Python Environments with Anaconda to Apptainer — Without Broken HPC Deployments

You’ve spent weeks perfecting a Python environment locally. Dependencies are locked in. Your code runs flawlessly on your laptop. Then you push it to the HPC cluster, and everything breaks.

Missing libraries. Version conflicts. Runtime errors at 3 AM when your job finally reaches the queue. The problem: your local Conda environment doesn’t travel. It’s fragile, system-dependent, and impossible to reproduce across different machines.

Apptainer (formerly Singularity) solves this. By containerizing your Conda environment into a portable .sif file, you get a frozen snapshot of your entire Python stack—OS, libraries, environment variables, everything—that runs identically everywhere.

What This Is

30-second pitch:

Apptainer is a lightweight container runtime designed for HPC environments (unlike Docker, which requires root). This tutorial shows you how to:

  1. Export your local Conda environment as a reproducible YAML snapshot
  2. Build an Apptainer container that includes that exact environment
  3. Bind local data and code folders to avoid copying terabytes of files
  4. Update containers using sandboxes without rebuilding from scratch

By the end, you’ll have a production-ready .sif file that runs identically on your laptop, university cluster, and cloud HPC systems.

Prerequisites

You’ll need:

  • Apptainer 1.0+ (or Singularity 3.8+) — installation guide
  • Conda/Miniconda — already installed locally
  • Linux system (native or WSL2 on Windows; macOS requires Apptainer VM)
  • Sudo access (required for building and sandbox operations)
  • ~5 GB disk space for the container image

Check your setup:

apptainer --version
conda --version

Step 1: Create and Populate Your Conda Environment Locally

Start with a clean environment. This becomes your “golden image.”

conda create -n conda-env python=3.12
conda activate conda-env
pip install pandas seaborn scipy matplotlib

Test that imports work:

python -c "import pandas; import seaborn; print('All packages loaded')"

Step 2: Export Your Conda Environment as YAML

This YAML file is the recipe for your container—it captures every package and version.

conda env export -n conda-env > config.yaml

Verify the file was created:

head -20 config.yaml

You should see a list of packages with pinned versions (e.g., pandas=2.0.1).

Step 3: Create the Apptainer Recipe File

Create a file called Singularity.def:

Bootstrap: docker
From: continuumio/miniconda3:latest

%files
    config.yaml /opt/config.yaml

%post
    apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
        && rm -rf /var/lib/apt/lists/*
    
    conda env create -f /opt/config.yaml
    conda clean --all -y

%environment
    export PATH=/opt/conda/envs/conda-env/bin:$PATH
    export CONDA_DEFAULT_ENV=conda-env

%runscript
    exec python "$@"

What each section does:

  • Bootstrap: docker — pulls a base image with Conda pre-installed
  • %files — copies your config.yaml into the container
  • %post — installs system dependencies and creates the Conda environment
  • %environment — sets PATH so Conda is always active
  • %runscript — makes the container executable with Python by default

Step 4: Build the Apptainer Container

sudo apptainer build conda.sif Singularity.def

This takes 3–10 minutes depending on package count. Once complete, you have conda.sif—your portable container.

⚠️ Note: sudo is required because Apptainer needs root access to create the image. On shared HPC clusters, check with your admin about unprivileged builds.

Core Workflow

Run Python Scripts Inside the Container

Execute any Python script without needing Conda on the host:

apptainer exec conda.sif python script.py

The container automatically activates your Conda environment.

Bind Local Folders (Critical for Data & Code)

If your code or datasets live outside the container, bind them instead of copying:

apptainer exec --bind /local/path/to/data:/container/data conda.sif python analysis.py

Why binding matters:

  • Your 2 TB dataset stays on disk; the container just sees it
  • Code updates don’t require rebuilding
  • Cluster storage mounts seamlessly

The syntax is --bind source:destination, where source is on your host and destination is inside the container.

Set Working Directory Inside Container

If your script expects files in a specific location, use --pwd:

apptainer exec --bind $(pwd):/work --pwd /work conda.sif python script.py

This maps your current directory to /work inside the container and sets it as the working directory.

Create a Sandbox for Testing Updates

A sandbox is a writable directory version of your container—use it to test changes before rebuilding:

sudo apptainer build --sandbox conda-env-sandbox conda.sif

This creates a folder called conda-env-sandbox/ with your entire container unpacked.

Modify the Sandbox

Open a writable shell:

sudo apptainer shell --writable conda-env-sandbox

Inside the container:

conda activate conda-env
pip install markdown
exit

Test your code with the new package without rebuilding the entire container.

Rebuild Container from Updated Sandbox

Once you’ve tested changes, create a new .sif:

sudo apptainer build conda-updated.sif conda-env-sandbox/

Now conda-updated.sif includes the new packages.

Practical Example

Scenario: You have a data analysis script that reads CSV files from a shared cluster directory, generates plots, and saves results.

Local setup:

  • Script: ~/projects/analysis/demo.py
  • Data: /cluster/shared/datasets/ (2 TB)
  • Output: ~/projects/analysis/results/

Step-by-step:

1. Create and export the Conda environment:

conda create -n data-analysis python=3.10
conda activate data-analysis
pip install pandas matplotlib seaborn scikit-learn
conda env export > config.yaml

2. Build the container:

sudo apptainer build analysis.sif Singularity.def

3. Run on the HPC cluster with bound data:

apptainer exec \
  --bind /cluster/shared/datasets:/data \
  --bind ~/projects/analysis:/work \
  --pwd /work \
  analysis.sif python demo.py

What’s happening:

  • /cluster/shared/datasets is visible inside the container as /data
  • ~/projects/analysis (your code and output) is visible as /work
  • The working directory is /work, so relative paths work
  • Your script reads from /data and writes results to /work/results/

The script runs without copying any files. If you update demo.py, just re-run the command—no rebuild needed.

Common Issues & Fixes

“working data not found” error

You forgot to bind the folder or didn’t set --pwd. Use:

apptainer exec --bind $(pwd):/work --pwd /work conda.sif python script.py

Permission denied when building

Add sudo:

sudo apptainer build conda.sif Singularity.def

“conda: command not found” inside container

Verify %environment is in your .def file and includes the PATH export. Rebuild:

sudo apptainer build --force conda.sif Singularity.def

Sandbox changes aren’t reflected in the rebuilt container

Always rebuild after making changes:

sudo apptainer build conda-updated.sif conda-env-sandbox/

Container is too large (>5 GB)

Add cleanup to the %post section of your .def file:

conda clean --all -y
apt-get autoremove -y
apt-get clean

Then rebuild from scratch.

Next Steps

You now have a reproducible, portable container for any Python environment.

What to do next:

  • Test on your HPC cluster — copy conda.sif and verify bound paths work
  • Version your .def file — keep it in Git alongside your code for reproducibility
  • Automate rebuilds — script the export → build → test pipeline if packages change frequently
  • Share with collaborators — send them the .sif file; they run your exact environment without setup

What’s your biggest pain point with Python environments on HPC clusters right now? Reply in the comments—I’m building follow-ups on GPU support, distributed training, and cluster job submission integration.


What’s your current workflow for shipping Python environments to HPC clusters—and what’s broken about it?

Comments