Document approaches for site builds on top of EESSI #778
Changes from all commits
a4d5de8
259e935
6aebcec
dcf695f
87ccf16
c887a09
729cf88
c31746d
e4fac7e
b9205ca
7b027fc
3f49e7b
8fe6e19
0cddb6b
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,50 @@ | ||||||
| # Introduction | ||||||
| This documentation is aimed at HPC sites or other facilities that make EESSI available on their system, but would like to offer additional installations that are performed 'on top' of EESSI (i.e. using dependencies provided by EESSI). | ||||||
|
|
||||||
| There are several reasons why, as a site, you may want to offer additional software on top of EESSI. For example: | ||||||
|
|
||||||
| 1. You want to offer software that is not suitable for upstream deployment in EESSI (e.g. because it is proprietary, or because it is a development build or otherwise very specific build that is not useful for a general audience). | ||||||
| 2. You need to make software available on (very) short notice to your users, and cannot wait for it to be deployed in upstream EESSI. | ||||||
| 3. You want to retain full autonomy over what gets deployed. | ||||||
|
|
||||||
| While all of these are valid arguments, note that there is also one major downside to deploying things locally: you lose one of the core benefits of EESSI, namely that it provides _the same software on every system_. The more site-specific installations you have, the more difficult it will be for your users to move their workflows from e.g. their own development machine or cloud environment to your cluster, or to scale up to larger clusters. If you're doing site builds to make software available to your users on short notice, we highly encourage you to _also_ contribute the same software installation to upstream EESSI. This way, once accepted upstream, users that rely on that software retain their 'mobility'. | ||||||
|
|
||||||
| # Choosing your approach | ||||||
| There are two approaches to doing site builds, each with their own advantages and disadvantages. | ||||||
|
|
||||||
| 1. Perform site builds using EESSI-extend on a shared filesystem. | ||||||
| 2. Leverage EESSI's build procedure for site builds. In this approach, you use the EESSI build bot (`EESSI/eessi-bot-software-layer`) together with the EESSI build scripts (`EESSI/software-layer-scripts`) to build and deploy software into a CernVM-FS repository of your own. This means you build in a way that is essentially identical to how it is done for upstream EESSI; the only major difference is the target CernVM-FS repository. | ||||||
|
|
||||||
| In both cases, you build 'on top' of EESSI, meaning that dependencies that are already provided by EESSI will not be reinstalled: they will simply be loaded from EESSI. | ||||||
|
|
||||||
| Here, we list some advantages and disadvantages to help you choose which approach best suits your requirements. | ||||||
|
|
||||||
|
|
||||||
| ## Approach 1: using EESSI-extend on shared FS | ||||||
|
|
||||||
| Advantages: | ||||||
|
|
||||||
| - Easy to get started: no additional setup or knowledge needed | ||||||
| - Automatically optimizes for the host on which you run the installation, and installs in an architecture-specific prefix that matches the host architecture. This means you can install optimized software for each of your CPU/GPU architectures in an organized way. | ||||||
|
|
||||||
| Disadvantages: | ||||||
|
|
||||||
| - This is a manual procedure (unless you create your own automation around it). As such, it doesn't scale well to installing a large number of software packages and/or installing software for many different hardware targets. | ||||||
|
|
||||||
| - The fact that you get optimized installations means that on a very heterogeneous system, you will have to run the installation many times - once for each architecture on which you want to offer that particular piece of software. | ||||||
| - Shared filesystems (and especially _parallel_ filesystems) are generally ill-suited to serving software. This means start-up time can be quite long (you can find some numbers [here](../training-events/2025/tutorial-best-practices-cvmfs-hpc/performance.md)). | ||||||
|
|
||||||
|
|
||||||
| ## Approach 2: leveraging all of EESSI's tooling for site builds | ||||||
|
|
||||||
| Advantages: | ||||||
|
|
||||||
| - Highly automated | ||||||
| - Scalable to many architectures & installations | ||||||
| - Site builds are done based on a list of software in a GitHub repo - making it very transparent what is available / got added on your system | ||||||
| - Share maintenance on the automation with the EESSI community | ||||||
| - End-user look & feel are very similar to EESSI | ||||||
|
|
||||||
| Disadvantages: | ||||||
|
|
||||||
| - More setup time | ||||||
| - Requires more extensive knowledge (CVMFS, EESSI build bot, object store) | ||||||
|
Member
Review comment (on the misspelling "extnesive"): Why is codespell not catching this?
|
||||||
| - More hardware resources (CVMFS infrastructure, bot infrastructure) | ||||||
| - More components (software/hardware) to maintain | ||||||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,137 @@ | ||
| # Site builds on top of a shared file system | ||
|
|
||
| For this approach we use a shared file system for doing site installations on top of EESSI. | ||
|
|
||
| The setup for this approach is very simple, and it allows you to quickly get started with making additional installations available to the users of your infrastructure. | ||
|
|
||
| ## Requirements | ||
|
|
||
| The following setup is needed, and we assume this is already in place: | ||
|
|
||
| - EESSI is available on your build nodes (note that you still need to build on every CPU type that you want to support) | ||
| - See the [native installation page](../getting_access/native_installation.md) for instructions | ||
| - A shared file system to make the software installations available on all your nodes | ||
| - A user account with write access to the shared file system | ||
| - Optionally: Singularity or Apptainer to do the builds in a controlled and isolated environment | ||
|
|
||
| Ideally, you already have some workflow or automation in place to build software for your different node types. | ||
| It should be straightforward to adapt these for building on top of EESSI. | ||
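The requirements above can be checked before starting a build. The following is a minimal sketch, not something EESSI provides: `check_build_node` is a hypothetical helper, and the two paths in the usage note are only the example values used elsewhere on this page.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight helper (not part of EESSI).
# Usage: check_build_node <eessi_repo_dir> <site_prefix>
# Returns 0 if the EESSI repository directory is readable and the
# site prefix is a writable directory, 1 or 2 otherwise.
check_build_node() {
    local repo_dir="$1" site_prefix="$2"

    # The EESSI repository must be accessible on the build node.
    if [ ! -d "$repo_dir" ] || [ ! -r "$repo_dir" ]; then
        echo "EESSI repository not accessible: $repo_dir" >&2
        return 1
    fi

    # The build account needs write access to the shared file system.
    if [ ! -d "$site_prefix" ] || [ ! -w "$site_prefix" ]; then
        echo "No write access to site prefix: $site_prefix" >&2
        return 2
    fi

    echo "OK: $repo_dir is readable, $site_prefix is writable"
}
```

On a build node this could be called as `check_build_node /cvmfs/software.eessi.io /sharedfs/eessi`, once per CPU type you intend to build on.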
|
|
||
| ## Initialize EESSI and EESSI-extend | ||
|
|
||
| To get started, we initialize EESSI and load EESSI-extend on a build node, after configuring the EESSI environment for site installations. | ||
| Configuring the environment for site installations ensures that the installation directories will become world-readable. | ||
| By default, site installations will end up in the EESSI [host injections directory](../site_specific_config/host_injections.md). | ||
| If you have not configured this directory yet, it will point to `/opt/eessi`, meaning that your software installations will end up there as well. | ||
| Often you would want to use the host injections directory for node-specific files like GPU drivers, | ||
| while we would like the software installations to end up on the shared file system. | ||
| By setting the environment variable `$EESSI_SITE_SOFTWARE_PREFIX` before loading EESSI and EESSI-extend, | ||
| we can adjust the site software installation prefix and point it to the shared file system: | ||
|
|
||
| ``` { .bash .copy } | ||
| export EESSI_SITE_INSTALL=1 | ||
| export EESSI_SITE_SOFTWARE_PREFIX=/sharedfs/eessi | ||
| module load EESSI/2025.06 | ||
| module load EESSI-extend | ||
| ``` | ||
|
|
||
| !!! note | ||
|
|
||
| Note that we have to pick a specific EESSI version here, depending on the toolchain of the software that we want to install. | ||
| See [EESSI versions](../repositories/versions/) for more information about the different EESSI versions. | ||
|
|
||
| ## Start building | ||
|
|
||
| The environment should now be configured for doing site installations to the chosen prefix. | ||
| The `EESSI-extend` module automatically loads `EasyBuild`. To verify the setup, you can check its configuration: | ||
| ``` { .bash .copy } | ||
| $ eb --show-config | ||
|
|
||
| # | ||
| # Current EasyBuild configuration | ||
| # (C: command line argument, D: default value, E: environment variable, F: configuration file) | ||
| # | ||
| allow-loaded-modules (E) = EasyBuild, EESSI-extend | ||
| buildpath (E) = /tmp/user/easybuild/build | ||
| bwrap-installpath (E) = /tmp/user/easybuild/bwrap | ||
| containerpath (E) = /tmp/user/easybuild/containers | ||
| cuda-sanity-check-error-on-failed-checks (E) = True | ||
| debug (E) = True | ||
| experimental (E) = True | ||
| fail-on-mod-files-gcccore (E) = True | ||
| filter-deps (E) = binutils, bzip2, DBus, flex, gettext, gperf, help2man, intltool, libreadline, makeinfo, ncurses, NVPL, ParMETIS, util-linux, XZ, zlib | ||
| filter-env-vars (E) = LD_LIBRARY_PATH | ||
| hooks (E) = /cvmfs/software.eessi.io/versions/2025.06/init/easybuild/eb_hooks.py | ||
| ignore-osdeps (E) = True | ||
| installpath (E) = /sharedfs/eessi/versions/2025.06/software/linux/x86_64/amd/zen3 | ||
| ... | ||
| ``` | ||
|
|
||
| Now, we can start building, e.g.: | ||
|
|
||
| ``` { .bash .copy } | ||
| eb -r attr-2.5.2-GCCcore-14.3.0.eb | ||
| eb -r cowsay-3.04.eb | ||
| ``` | ||
|
|
||
| When the installation has completed, the software should be available in your prefix. | ||
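Since the same installations have to be repeated on every CPU type you want to support, it can help to keep the easyconfig names in a plain text file and loop over it. The sketch below assumes the environment has already been set up as described above; `build_all`, the list file format, and the `EB_CMD` override (used here for a dry run) are hypothetical conveniences, not part of EESSI or EasyBuild.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: run `eb -r` for every easyconfig listed in a file.
# The file contains one easyconfig name per line; blank lines and lines
# starting with '#' are ignored.
# Set EB_CMD (e.g. EB_CMD='echo eb') to print the commands instead of
# running EasyBuild.
build_all() {
    local list_file="$1" ec
    while IFS= read -r ec || [ -n "$ec" ]; do
        # Skip blank lines and comments.
        case "$ec" in ''|'#'*) continue ;; esac
        # EB_CMD is intentionally unquoted so 'echo eb' splits into two words.
        # stdin is redirected so the command cannot consume the list file.
        ${EB_CMD:-eb} -r "$ec" < /dev/null || {
            echo "build failed: $ec" >&2
            return 1
        }
    done < "$list_file"
}
```

Running `EB_CMD='echo eb' build_all site-builds.txt` first shows which commands would be executed; the same list file can then be reused unchanged on each build node.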
|
|
||
| ## Container | ||
|
|
||
| Instead of doing the builds on the host system itself, you could consider doing them in a container. | ||
| Using a minimal build container minimizes the risk of accidentally picking up host libraries (instead of the ones provided by EESSI), | ||
| and a container also provides a controlled and isolated environment. | ||
|
|
||
| In principle you could use any container, as long as you make sure that both the EESSI CVMFS repository and your shared file system are available in the container. | ||
| Assuming both are available on the build host, you can simply bind mount both of them into the container. | ||
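As a rough illustration of such a bind mount with Apptainer, the sketch below only prints the command line so that it can be inspected before use. It assumes that CVMFS is mounted under `/cvmfs` on the build host; the helper name `apptainer_build_cmd` and the image name `build-image.sif` are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print (not run) an Apptainer command that bind
# mounts both /cvmfs and the site prefix into a container.
# Usage: apptainer_build_cmd <image> <site_prefix>
# Assumes neither path contains whitespace.
apptainer_build_cmd() {
    local image="$1" site_prefix="$2"
    printf 'apptainer shell --bind /cvmfs --bind %s %s\n' "$site_prefix" "$image"
}
```

For example, `apptainer_build_cmd build-image.sif /sharedfs/eessi` prints a command that starts a shell in the container with both directories available at the same paths as on the host.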
|
|
||
| You can also use the [`eessi_container.sh` script](https://github.com/EESSI/software-layer-scripts/blob/main/eessi_container.sh), provided by EESSI, | ||
| which mounts the EESSI CVMFS repository inside the EESSI build container. In order to also bind mount your shared file system, you can use: | ||
| ``` { .bash .copy } | ||
| eessi_container.sh -b $EESSI_SITE_SOFTWARE_PREFIX | ||
| ``` | ||
|
|
||
| In the container you can then use the same build procedure as described before. | ||
|
|
||
|
|
||
| ## Using the software | ||
| The EESSI module should make sure that site installations are automatically picked up by the module environment, | ||
| as long as you make sure that `$EESSI_SITE_SOFTWARE_PREFIX` is always set to your prefix before loading the `EESSI` module: | ||
|
|
||
| ``` { .bash .copy } | ||
| export EESSI_SITE_SOFTWARE_PREFIX=/sharedfs/eessi | ||
| module load EESSI/2025.06 | ||
| module avail | ||
| ``` | ||
| This should show something like: | ||
| ``` { .bash .copy } | ||
| ---- /sharedfs/eessi/versions/2025.06/software/linux/x86_64/amd/zen3/modules/all ---- | ||
| attr/2.5.2-GCCcore-14.3.0 cowsay/3.04 | ||
|
|
||
| ---- /cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen3/modules/all ---- | ||
| Abseil/20240722.0-GCCcore-13.3.0 | ||
| absl-py/2.1.0-GCCcore-13.3.0 | ||
| ... | ||
| ``` | ||
|
|
||
| As you can see, the software was installed into a CPU-specific directory under the chosen prefix. | ||
| This allows you to redo the installation on other CPU types that you want to support, | ||
| such that each of them has access to an optimized installation. | ||
| Every time the EESSI module gets loaded, it will detect the CPU of that node and use the corresponding subtree in your prefix. | ||
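The layout of the CPU-specific subtree can be read off the paths shown above. As a minimal sketch, the hypothetical helper `site_installpath` (not part of EESSI) composes the install path from the prefix, the EESSI version, and the CPU subdirectory. The layout, including the `linux` component, is inferred from the example output on this page rather than from a documented interface.

```bash
#!/usr/bin/env bash
# Hypothetical helper: compose the site install path for one CPU target.
# Usage: site_installpath <site_prefix> <eessi_version> <cpu_subdir>
# The path layout is an assumption based on the example output above.
site_installpath() {
    local site_prefix="$1" eessi_version="$2" cpu_subdir="$3"
    printf '%s/versions/%s/software/linux/%s\n' \
        "$site_prefix" "$eessi_version" "$cpu_subdir"
}
```

With the values used on this page, `site_installpath /sharedfs/eessi 2025.06 x86_64/amd/zen3` gives the same path that `eb --show-config` reported as `installpath`, which makes it easy to see which CPU targets already have a subtree on the shared file system.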
|
|
||
| You can now simply load the software using: | ||
|
|
||
| ``` { .bash .copy } | ||
| module load cowsay/3.04 | ||
| cowsay "EESSI keeps the clusters moo-ving." | ||
| ____________________________________ | ||
| < EESSI keeps the clusters moo-ving. > | ||
| ------------------------------------ | ||
| \ ^__^ | ||
| \ (oo)\_______ | ||
| (__)\ )\/\ | ||
| ||----w | | ||
| || || | ||
|
|
||
| ``` | ||
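Because the prefix has to be set before the `EESSI` module is loaded, one option is to set it for all users in a login script. The snippet below is a sketch under assumptions: the file name `/etc/profile.d/eessi-site.sh` is hypothetical, and `/sharedfs/eessi` is the example prefix used on this page.

```bash
# /etc/profile.d/eessi-site.sh (hypothetical file name)
# Set the site software prefix for every login shell, unless the user
# has already chosen a prefix of their own.
: "${EESSI_SITE_SOFTWARE_PREFIX:=/sharedfs/eessi}"
export EESSI_SITE_SOFTWARE_PREFIX
```

Users who export their own `EESSI_SITE_SOFTWARE_PREFIX` before this script runs keep their value; everyone else picks up the site installations automatically when loading the `EESSI` module.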
|
|