73 changes: 73 additions & 0 deletions docs/developer/gadi_singularity/README.md
# Building Underworld3 for Gadi (NCI)

This directory contains two Containerfiles to build the Underworld3 (UW3) Singularity image for Gadi (nci.org.au).

Both use Rocky Linux 8.10 to match Gadi's OS for ABI compatibility.

## Build Order

Build commands must be run from the top-level `underworld3/` directory (the build context).
Builds targeting Gadi must use `--platform linux/amd64`.

### 1. Build PETSc layer

```bash
podman build . \
--platform linux/amd64 \
--format docker \
-t ghcr.io/<user>/petsc:3.25.0-ompi \
-f ./docs/developer/gadi_singularity/petsc.rhel
```

### 2. Push PETSc image to registry

```bash
podman push ghcr.io/<user>/petsc:3.25.0-ompi
```

### 3. Build Underworld3

```bash
podman build . \
--platform linux/amd64 \
--format docker \
--build-arg PETSC_IMAGE=ghcr.io/<user>/petsc:3.25.0-ompi \
--build-arg UW3_BRANCH=development \
-t ghcr.io/<user>/underworld3-gadi:latest \
-f ./docs/developer/gadi_singularity/underworld3.rhel
```

### 4. Push Underworld3 image

```bash
podman push ghcr.io/<user>/underworld3-gadi:latest
```
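
Before pulling on Gadi, the pushed image can be sanity-checked locally (a sketch; the `python3 -c` probe is an assumption about what a healthy image should provide):

```shell
# Run a throwaway container and confirm the core imports resolve
podman run --rm --platform linux/amd64 \
  ghcr.io/<user>/underworld3-gadi:latest \
  python3 -c "import petsc4py, underworld3; print('image OK')"
```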

## What Each File Does

- **petsc.rhel** — Builds PETSc 3.25.0 with full AMR support (petsc4py, slepc4py, mmg, parmmg, etc.)
- **underworld3.rhel** — Builds Underworld3 on top of the PETSc image

## Running on Gadi

Pull the image on Gadi (redirect cache to scratch to avoid home quota issues):

```bash
export SINGULARITY_CACHEDIR=/scratch/<project>/<user>/.singularity
module load singularity
singularity pull docker://ghcr.io/<user>/underworld3-gadi:latest
```

Run a script with MPI:

```bash
module load singularity
module load openmpi/4.1.7
mpiexec -n <ncpus> singularity exec underworld3-gadi_latest.sif python3 <script.py>
```
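
For batch work, the same commands can be wrapped in a PBS job script (a sketch only: the project code, queue, resource requests, and `my_script.py` below are placeholders to adapt):

```shell
#!/bin/bash
#PBS -P <project>
#PBS -q normal
#PBS -l ncpus=48
#PBS -l mem=190GB
#PBS -l walltime=02:00:00
#PBS -l storage=scratch/<project>
#PBS -l wd

module load singularity
module load openmpi/4.1.7

# run across all CPUs requested from PBS
mpiexec -n $PBS_NCPUS singularity exec underworld3-gadi_latest.sif python3 my_script.py
```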

## Notes

- OpenFabrics (mlx5_0) warnings in the job error log are harmless
- PostHog telemetry failures on compute nodes are harmless (no outbound internet)
- The ghcr.io images must be set to **public** for Singularity to pull without authentication
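
One way to confirm a package is public (and hence pullable without credentials) is an anonymous registry inspect. This is a sketch and assumes `skopeo` is installed:

```shell
# succeeds without any registry login only if the package is public
skopeo inspect --no-creds docker://ghcr.io/<user>/underworld3-gadi:latest > /dev/null \
  && echo "image is publicly pullable"
```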
197 changes: 197 additions & 0 deletions docs/developer/gadi_singularity/petsc.rhel
#####################################################################
# UW3 PETSc container
# Multi-stage Containerfile based on the UW2 version
# This builds PETSc according to the pixi amr-dev environment
# see https://docs.docker.com/get-started/docker-concepts/building-images/multi-stage-builds/
#
# Stages:
# 1. 'runtime'
# The runtime environment (packages, permissions, ENV vars.)
#    is consistent across all stages of this Containerfile.
#
# 2. 'builder'
#    The builder layer takes the runtime layer and adds compilers and build tools;
#    the software it builds is installed into /usr/local and /opt/venv
#
# 3. 'final' == runtime + min. builder
# The final image is a composite of the runtime layer and the
# minimal sections of the builder layer's final software stack.
#
# To build use podman from the top level underworld directory. i.e.
# $ podman build . \
# --platform linux/amd64 \
# --format docker \
# -t new_image_name \
#    -f ./docs/developer/gadi_singularity/petsc.rhel
#####################################################################

# The following are passed in via --build-arg
# and must go before the 1st FROM; see
# https://docs.docker.com/engine/reference/builder/#understand-how-arg-and-from-interact
ARG PYTHON_VERSION="3.12"
ARG PETSC_VERSION="3.25.0"
ARG BASE_IMAGE="quay.io/rockylinux/rockylinux:8.10"

# 1. Stage 1: 'runtime'
FROM ${BASE_IMAGE} as runtime
LABEL maintainer="https://github.com/underworldcode/"

# need to repeat ARGS after every FROM
ARG PYTHON_VERSION

#### Containerfile ENV vars - for all image stages
ENV LANG=C.UTF-8
ENV PYVER=${PYTHON_VERSION}

# gadi-specific settings
ENV OPENBLAS_NUM_THREADS=1
ENV OMPI_MCA_io=ompio

# add user jovyan
ENV NB_USER jovyan
ENV NB_HOME /home/$NB_USER
RUN useradd -m -s /bin/bash -N $NB_USER

RUN yum update -y \
&& yum install -y \
bash-completion \
openssh \
openblas \
python${PYVER}-pip \
python${PYVER}-devel \
openmpi \
findutils \
&& yum clean all \
&& rm -rf /var/cache/yum

# add system openmpi binaries to $PATH and /usr/local libraries to $LD_LIBRARY_PATH
ENV PATH=/usr/lib64/openmpi/bin:$PATH
ENV LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH

ENV PYOPT=/opt/venv
# build and set open permissions on virtual environment
RUN python${PYVER} -m venv $PYOPT \
&& chmod ugo+rwx $PYOPT

# define python env vars.
# prepending $PYOPT/bin to PATH means all pip installs go into the $PYOPT venv
ENV PATH=$PYOPT/bin:$PATH
ENV PYTHONPATH=$PYTHONPATH:$PYOPT/lib/python${PYVER}/site-packages

# runtime python requirements
RUN python${PYVER} -m pip install wheel \
"numpy<2"

# 2. Define the builder layer
FROM runtime as builder

ARG PETSC_VERSION
ARG PYTHON_VERSION

RUN yum install -y \
ca-certificates \
wget \
make \
gcc \
gcc-gfortran \
gcc-c++ \
cmake \
patch \
openblas \
zlib-devel \
openmpi-devel \
findutils \
git \
flex \
bison \
&& yum clean all \
&& rm -rf /var/cache/yum
# NOTE: flex and bison are needed to build PTScotch

RUN python${PYVER} -m pip install "cython>=3.1" \
"setuptools>=75" \
"meson" \
"meson-python" \
"ninja" \
&& python${PYVER} -m pip install --no-cache-dir --no-binary=mpi4py "mpi4py>=4,<5"
# no-binary for mpi4py to force build against openmpi, rather than mpich (default)

# copy patches into builder
COPY petsc-custom/patches/scotch-7.0.10-c23-fix.tar.gz /tmp/scotch.tar.gz
COPY petsc-custom/patches/plexfem-internal-boundary-ownership-fix.patch /tmp/

# get petsc
RUN mkdir -p /tmp/src
WORKDIR /tmp/src
RUN wget https://web.cels.anl.gov/projects/petsc/download/release-snapshots/petsc-lite-${PETSC_VERSION}.tar.gz --no-check-certificate \
&& tar -zxf petsc-lite-${PETSC_VERSION}.tar.gz
WORKDIR /tmp/src/petsc-${PETSC_VERSION}

# apply patch then configure
# patch may already be included in newer PETSc versions - skip gracefully if it doesn't apply
RUN if patch -p1 --dry-run < /tmp/plexfem-internal-boundary-ownership-fix.patch 2>/dev/null; then \
patch -p1 < /tmp/plexfem-internal-boundary-ownership-fix.patch; \
echo "plexfem patch applied successfully"; \
else \
echo "plexfem patch not applicable (may already be in this PETSc version), skipping"; \
fi
RUN python${PYVER} ./configure \
--with-debugging=0 \
--prefix=/usr/local \
--with-shared-libraries=1 \
--with-cxx-dialect=C++11 \
"--COPTFLAGS=-g -O3" "--CXXOPTFLAGS=-g -O3" "--FOPTFLAGS=-g -O3" \
--useThreads=0 \
--with-x=0 \
--with-pragmatic=1 \
--with-petsc4py=1 \
--with-slepc4py=1 \
--download-eigen=1 \
--download-metis=1 \
--download-parmetis=1 \
--download-mumps=1 \
--download-scalapack=1 \
--download-hypre=1 \
--download-superlu=1 \
--download-superlu_dist=1 \
--download-mmg=1 \
"--download-mmg-cmake-arguments=-DMMG_INSTALL_PRIVATE_HEADERS=ON -DUSE_SCOTCH=OFF" \
--download-parmmg=1 \
--download-pragmatic=1 \
"--download-ptscotch=/tmp/scotch.tar.gz" \
--download-slepc=1 \
--download-hdf5=1 \
--download-fblaslapack=1 \
--download-zlib=1 \
--download-ctetgen=1 \
--download-triangle=1 \
--with-make-np=2
RUN make PETSC_DIR=`pwd` PETSC_ARCH=arch-linux-c-opt all
RUN make PETSC_DIR=`pwd` PETSC_ARCH=arch-linux-c-opt install \
|| (echo "=== petsc4py build log ===" && \
cat arch-linux-c-opt/lib/petsc/conf/petsc4py.build.log 2>/dev/null && \
echo "=== slepc4py build log ===" && \
cat arch-linux-c-opt/lib/petsc/conf/slepc4py.build.log 2>/dev/null && \
exit 1)
RUN rm -rf /usr/local/share/petsc

# record builder stage packages used
RUN python${PYVER} -m pip freeze > /opt/requirements.txt \
&& dnf history userinstalled > /opt/packages.txt

# Stage 3: 'final'
FROM runtime as final

COPY --from=builder /opt /opt
COPY --from=builder /usr/local /usr/local

# MUST set PETSc environment variables
ENV PETSC_DIR=/usr/local
ENV PYTHONPATH=$PYTHONPATH:$PETSC_DIR/lib

# switch to not-root user and workspace
USER $NB_USER
WORKDIR $NB_HOME

# default command is to run jupyter lab
CMD ["jupyter-lab", "--no-browser", "--ip=0.0.0.0"]