The CBRZ comprises the supercomputer HSUper and the Interactive Scientific Computing Cloud (ISCC). Below you find some introductory information for users about the CBRZ systems and some testbeds.
Acknowledgement
The HPC cluster HSUper and the cloud solution ISCC have been provided by the project hpc.bw, funded by dtec.bw — Digitalization and Technology Research Center of the Bundeswehr. dtec.bw is funded by the European Union – NextGenerationEU.
See below for a suggestion of an acknowledgement statement for publications:
Computational resources (HPC cluster HSUper) have been provided by the project hpc.bw, funded by dtec.bw – Digitalization and
Technology Research Center of the Bundeswehr. dtec.bw is funded by the European Union – NextGenerationEU.
HSUper
HSUper is a supercomputer consisting of more than 580 nodes, divided into various partitions, with the largest partition currently comprising 569 nodes. Jobs can be submitted using the installed Slurm workload manager.
- Technical Specifications
- Partitions
- How to Access HSUper
- Available Software on HSUper
- USER-SPACK
- Storage & Quota
- Preparing Jobs & Testing
- How to Submit Jobs to HSUper
- Useful Environment Variables
Technical Specifications
The HSUper cluster consists of
- Regular nodes: 571 compute nodes, each equipped with 256 GB RAM and 2 Intel Icelake sockets; each socket features an Intel(R) Xeon(R) Platinum 8360Y processor with (up to) 36 cores, yielding a total of 72 cores per node
- Fat memory nodes: 5 compute nodes, each equipped with 1 TB RAM and 2 Intel Icelake sockets; each socket features an Intel(R) Xeon(R) Platinum 8360Y processor with (up to) 36 cores, yielding a total of 72 cores per node
- GPU nodes: 5 compute nodes, each equipped with 256 GB RAM, 2 Intel Icelake sockets, 2 NVidia A100 (40 GB) GPUs and 894 GB local scratch storage; each socket features an Intel(R) Xeon(R) Platinum 8360Y processor with (up to) 36 cores, yielding a total of 72 cores per node
In addition, HSUper provides access to a 1 PB BeeGFS file system as well as a 1 PB Ceph-based file system.
All nodes are connected by a non-blocking NVIDIA InfiniBand HDR100 fabric.
Partitions
HSUper provides different nodes (see above) and also different partitions in which compute jobs can be run. In the following, you find a list of available partitions and their restrictions:
- dev: 1 job per user on 1-2 regular nodes (nodes 1-571); wall-clock limit: 1h. Please use this partition for testing purposes only.
- small: Jobs on 1-5 regular nodes (nodes 3-571); wall-clock limit: 72h; exclusive node reservation
- small_shared: Same settings as small, but node resources are shared by default.
- small_fat: Jobs on 1-5 fat memory nodes (nodes 572-576); wall-clock limit: 24h; exclusive node reservation
- small_gpu: Jobs on 1-5 GPU nodes (gpu nodes 1-5); wall-clock limit: 24h
- medium: Jobs on 6-256 regular nodes (nodes 3-571); wall-clock limit: 24h; exclusive node reservation
- large: Jobs on >256 regular nodes (nodes 3-571; available to selected users only!); exclusive node reservation
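To inspect the currently configured partitions and the state of their nodes, you may use Slurm's sinfo command (the exact output depends on the current cluster configuration):
sinfo # list all partitions, their time limits and node states
sinfo -p small # restrict the output to the small partition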
How to Access HSUper
HSUper can be accessed from within the HSU network only. If you are not located at the HSU, you need to connect using the HSU VPN (see below). To be able to access HSUper, you need to be a registered HSUper user. Please apply for HPC access using the following form (available on campus/VPN only, login with RZ credentials), if needed.
Access HSUper using Linux / Windows 10+ / MacOS 10+:
Login using your RZ credentials. Open a terminal / command prompt and enter, replacing <rz-name> with your RZ name:
ssh <rz-name>@hsuper-login01.hsu-hh.de
Access HSUper using PuTTY:
You may also download and use “PuTTY” to manage different SSH connections (and more) using a GUI for settings. After opening PuTTY, put hsuper-login01.hsu-hh.de as the “Host Name” and click on “Open”. Enter your credentials afterwards.
Access HSUper using your public SSH key:
Instead of entering your password every time, you may set up login via SSH keys.
- If you have already created SSH keys, you may add your public key to the ~/.ssh/authorized_keys file on HSUper using your favourite text editor.
- Connect to HSUper with your private key:
ssh -i ~/path/to/private/key <rz-name>@hsuper-login01.hsu-hh.de
- PuTTY: Configuration category “Connection -> SSH -> Auth”. Browse for your private key file.
- Note: You may skip the key parameter if you have only one SSH key stored in your ~/.ssh/ folder, as it is then used automatically.
If you need to create new SSH keys, use the type “Ed25519” as it is currently seen as the fastest and most secure type. If you work from a terminal / command prompt, you may create a key using: ssh-keygen -t ed25519
Follow the instructions and remember where the public and private key are saved. Default values are fine.
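For example, assuming the key pair was created at the default location ~/.ssh/id_ed25519, you may copy the public key to HSUper with:
ssh-copy-id -i ~/.ssh/id_ed25519.pub <rz-name>@hsuper-login01.hsu-hh.de
This appends the public key to ~/.ssh/authorized_keys on HSUper, so subsequent logins only ask for the key's passphrase (if one was set) instead of your password.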
- Manual for Windows: Creating an SSH key and connecting with it.
X11 Forwarding:
If you are on a console using ssh, you may add the “-X” parameter to use X11 forwarding. This allows you to open graphical applications on the HSUper frontend and have their windows forwarded to your local computer (which can, however, be very slow). If you use PuTTY, you’ll find the option in the PuTTY configuration category “Connection -> SSH -> X11”. Have a look at PuTTY’s documentation for more information.
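For example, to enable X11 forwarding for an SSH session:
ssh -X <rz-name>@hsuper-login01.hsu-hh.de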
Using the HSU VPN:
- Install the OpenVPN client.
- (First time only: Download the HSU OpenVPN configuration file “HSU Open VPN Konfiguration 2022” from the webbox (Rechenzentrum) while on campus.)
- Start the VPN connection:
- Linux:
sudo openvpn --config /path/to/HSU_VPN_2022.ovpn
- Enter your RZ credentials.
- Windows:
- Start OpenVPN,
- First time only: Right click on the tray icon, import the VPN configuration file,
- Right click on the tray icon, click on HSU_VPN_2022 and enter your RZ credentials.
Available Software on HSUper
Most software packages are provided as modules and must be loaded. HSUper has its modules hierarchically organized by compiler and MPI implementation. The command module avail
shows all modules that are available in the current environment. If you are missing a software package, it might not be available for the selected compiler or MPI implementation. module spider <modulename>
is useful to find out how a module could be loaded. More information on the module system is provided by module --help
and the Lmod documentation.
If you cannot find a package using module spider
, verify if you have loaded the USER-SPACK module.
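A typical sequence could look like the following sketch; the module names are placeholders, so please check module avail and module spider for the exact names and versions available on HSUper:
module avail # list modules visible in the current environment
module spider openmpi # find out how an OpenMPI module can be loaded
module load <compiler-module> <mpi-module> # load the reported prerequisites and the desired module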
Currently available software:
- Apptainer (Singularity)
- slurm
- Lmod
- tmux
- Python3
- GNU Debugger (gdb)
- vim
- tcsh
- zsh
Modules:
- OpenMPI
- Intel® MPI Library (Intel oneAPI MPI)
- Intel® oneAPI Math Kernel Library (Intel oneAPI MKL)
- Intel® oneAPI Base Toolkit & HPC Toolkit
- Intel oneAPI DPC++/C++ Compiler
- Intel® C++ Compiler Classic
- Intel® oneAPI Threading Building Blocks (Intel oneAPI TBB)
- …
- NVIDIA HPC SDK
- NVIDIA C, C++, Fortran 77, Fortran 90 compilers
- …
- NVIDIA CUDA®
- TensorFlow (TODO: coming soon)
- R
- gcc
- cmake
- ccmake (part of the cmake module)
- eigen3
- bison
- flex
- Python3
- py-pybind
- OpenCL
- FFTW
- LAPACK
- ScaLAPACK
- OpenFOAM
- GNU Octave
- gnuplot
- Miniconda3
- Spack (USER-SPACK)
USER-SPACK
The module “USER-SPACK” creates a folder with the same name in your home directory. The module allows you to use Spack for installing packages, with all the advantages Spack offers. It integrates your locally installed Spack packages directly into the environment module system Lmod, provided they are compiled with a compiler other than the default system compiler. For example spack install <package> %gcc@12.1.0
installs <package> using gcc in version 12.1.0 and creates a module file for you. Not specifying a compiler (with %) results in using the system compiler and no module file being created. In that case you have to rely on spack load
.
Furthermore, all globally available software packages are made known to your local user Spack installation on the first load of the module; there is no need to recompile these or make them known to your Spack installation manually.
Hence, you are free to choose between spack
and module
for loading and unloading installed packages.
Tip: spack find -x
shows you only the explicitly (globally) installed packages, without any dependencies. The Spack documentation, with many examples and tutorials, can be found online.
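A minimal sketch of a typical USER-SPACK workflow; the package name and compiler version are examples only, and the exact name of the generated module may differ:
module load USER-SPACK # make your user Spack installation and its modules available
spack install fftw %gcc@12.1.0 # install a package with a non-default compiler; a module file is generated
module avail fftw # the freshly installed package should now show up as a module
spack find -x # list the explicitly installed packages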
If you cannot find a package that should exist globally using module spider
, you may unload USER-SPACK or update your local copy of the cluster module files using the following command: spack module lmod refresh --delete-tree --upstream-modules -y
Please note: USER-SPACK is only compatible with the following shells: bash/zsh/sh/csh
Storage & Quota
Every user has a personal quota on the BeeGFS parallel file system. Once the quota is reached, no more files can be written. The current status is shown at login, listing your personal quota (highlighted) as well as that of your (chair / research) groups.
Groups may have a project folder in the /beegfs/project/
folder.
Projects have an individual quota, independent of the user quota.
Feel free to submit a request using the ticket system if you need access to an existing project folder or need one created.
A /scratch
partition is mounted on every compute node and on the login node. You may use it for temporary files. Please note that files created there count towards your personal quota. Remember to clean up after your job has succeeded and the results have been saved.
In general, it is a good idea to keep your home directory clean and small.
Please note: The parallel file systems do not offer any backup solution. Deleted files are gone forever. Create backups of important files!
Preparing Jobs & Testing
Software should be compiled/installed on the login nodes – manually or using USER-SPACK (see above).
Containers (for Apptainer/Singularity) may be prepared on your local computer (or on the login node) and later uploaded to the cluster.
Jobs running on compute nodes should only execute within the already prepared environment. All environment preparation (downloading additional software packages and data from the Internet) should be done beforehand.
There is a special dev partition (see the Partitions section above) meant for testing purposes. Feel free to use that partition to test different job settings, as sketched below.
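For instance, a quick test of an existing job script on the dev partition could look like this (the job script name is just an example; command-line options override the corresponding #SBATCH settings in the script):
sbatch --partition=dev --nodes=1 --time=00:10:00 helloworld-omp.job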
How to Submit Jobs to HSUper
HSUper resources are managed via SLURM. Job scripts need to be written and scheduled. They describe how a certain compute task is to be executed. Exemplary SLURM job scripts are provided further below. They can be scheduled using sbatch
, e.g. sbatch helloworld-omp.job
.
For details on SLURM, refer to its documentation.
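Once a job has been submitted, it can be monitored and, if necessary, cancelled with standard SLURM commands, for example:
squeue -u $USER # list your pending and running jobs
scancel <jobid> # cancel a job by its ID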
Before executing your software on HSUper, double-check that a time limit is set (one that does not differ drastically from the actual run time) and that the job is capable of exploiting the requested resources efficiently:
- Is your software parallelized?
  If it is shared-memory parallelized (e.g. with OpenMP, Intel TBB, Cilk), you can use single compute nodes, but typically not multiple of them.
  If it is distributed-memory parallelized (e.g. with MPI or a PGAS approach such as the one used in Fortran 2008, UPC++, or Chapel), your program can potentially also run on multiple compute nodes at a time.
- Does your software support execution on graphics cards (GPUs)? If yes, you might want to consider using the GPU nodes of HSUper.
- Is your application extremely memory-intensive?
  If it requires even more than 256 GB of memory and is not parallelized for distributed-memory systems, you might still be able to execute your software on the fat memory nodes of HSUper. Note: this should be the exception; distributed-memory parallelism is highly recommended for most applications!
- Does your software only require a subset of the resources of a single node? Consider using the small_shared partition.
In the following, you find job script examples that execute a simple C++ test program on HSUper. The test program is the following — you may simply copy-paste it into a file helloworld.cpp or download it from the webbox:
#include<iostream>
#ifdef MYMPI
#include<mpi.h>
#endif
#ifdef MYOMP
#include<omp.h>
#endif
int main(int argc, char *argv[]){
int rank=0;
int size=1;
int threads=1;
#ifdef MYMPI
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD,&size);
MPI_Comm_rank(MPI_COMM_WORLD,&rank);
#endif
#ifdef MYOMP
threads=omp_get_max_threads();
#endif
if (rank==0){
std::cout<<"Hello World on root rank "<<rank << std::endl;
std::cout<<"Total number of ranks: "<< size << std::endl;
std::cout<<"Number of threads/rank: "<< threads << std::endl;
}
#ifdef MYMPI
MPI_Finalize();
#endif
return 0;
}
To run the examples, you may compile the program in different variants:
- Sequential (not parallel):
g++ helloworld.cpp -o helloworld_gnu
-> remark: sequential programs are not well-suited for execution on HSUper! Please make sure to run parallelized programs to efficiently exploit the given HPC hardware! For a respective job script, you could simply rely on the shared-memory example, excluding the use of threads and OpenMP.
- Shared-memory parallel using OpenMP:
g++ -DMYOMP -fopenmp helloworld.cpp -lgomp -o helloworld_gnu_omp
-> remark: this is sufficient to use up to all cores and threads, respectively, of one single compute node! Remember that HSUper is particularly good for massively parallel compute jobs, leveraging many compute nodes at a time (see next bullet).
- Distributed-memory parallel using MPI:
mpicxx -DMYMPI helloworld.cpp -lmpi -o helloworld_gnu_mpi
-> remark: this is sufficient to use all cores and, potentially, several nodes of HSUper. Make sure to load a valid MPI compiler beforehand (see the section on software: module load ... with ... representing an existing MPI module; check available modules via module avail).
- Distributed- and shared-memory parallel using both MPI and OpenMP:
mpicxx -DMYMPI -DMYOMP -fopenmp helloworld.cpp -lmpi -lgomp -o helloworld_gnu_mpi_omp
Examples:
- Single-node shared-memory parallel job using OpenMP:
The following batch script describes a job using 1 compute node (equipped with 72 cores), on which the OpenMP-parallel program is executed with 13 threads. A 2-minute time limit is set (the program should actually finish within few seconds). Output is written to the file helloworld-omp-%j.log, where %j corresponds to the job ID.
Please make sure to adapt the path in your script before you submit the job. The example assumes that all files are located in the folder helloworld in your home directory. You can also download the file from the webbox.
#!/bin/bash
#SBATCH --job-name=helloworld-omp # specifies a user-defined job name
#SBATCH --nodes=1 # number of compute nodes to be used
#SBATCH --ntasks=1 # number of MPI processes
#SBATCH --partition=small # partition (small_shared, small, medium, small_fat, small_gpu)
# special partitions: large (for selected users only!)
# job configuration testing partition: dev
#SBATCH --cpus-per-task=72 # number of cores per process
#SBATCH --time=00:02:00 # maximum wall clock limit for job execution
#SBATCH --output=helloworld-omp_%j.log # log file which will contain all output
# commands to be executed
cd $HOME/YOUR_PATH_TO_COMPILED_PROGRAM/helloworld
#cd /beegfs/home/YOUR_PATH_TO_COMPILED_PROGRAM/helloworld
export OMP_NUM_THREADS=13
./helloworld_gnu_omp
- Multi-node distributed-memory parallel job using MPI:
The following batch script describes a job using 3 compute nodes. On every compute node, 36 MPI processes are launched. One core is reserved per MPI process. A 2-minute time limit is set (the program should actually finish within few seconds). Output is written to the file helloworld-mpi-%j.log, where %j corresponds to the job ID.
Please make sure to adapt the path in your script before you submit the job. The example assumes that all files are located in the folder helloworld in your home directory. You may search for MPI implementations using ml spider mpi or ml spider openmpi. You can also download the file from the webbox.
#!/bin/bash
#SBATCH --job-name=helloworld-mpi # specifies a user-defined job name
#SBATCH --nodes=3 # number of compute nodes to be used
#SBATCH --ntasks-per-node=36 # number of MPI processes per node
#SBATCH --partition=small # partition (small_shared, small, medium, small_fat, small_gpu)
# special partitions: large (for selected users only!)
# job configuration testing partition: dev
#SBATCH --cpus-per-task=1 # number of cores per process
#SBATCH --time=00:02:00 # maximum wall clock limit for job execution
#SBATCH --output=helloworld-mpi_%j.log # log file which will contain all output
### some additional information (you can delete those lines)
echo "#==================================================#"
echo " num nodes: " $SLURM_JOB_NUM_NODES
echo " num tasks: " $SLURM_NTASKS
echo " cpus per task: " $SLURM_CPUS_PER_TASK
echo " nodes used: " $SLURM_JOB_NODELIST
echo " job cpus used: " $SLURM_JOB_CPUS_PER_NODE
echo "#==================================================#"
# commands to be executed
# modify the following line to load a specific MPI implementation
module load mpi
cd $HOME/YOUR_PATH_TO_COMPILED_PROGRAM/helloworld
#cd /beegfs/home/YOUR_PATH_TO_COMPILED_PROGRAM/helloworld
# use the SLURM variable "ntasks" to set the number of MPI processes;
# here, ntasks is computed from "nodes" and "ntasks-per-node"; alternatively
# specify, e.g., ntasks directly (instead of ntasks-per-node)
mpirun -np $SLURM_NTASKS ./helloworld_gnu_mpi
- Multi-node distributed-memory/shared-memory parallel job using MPI/OpenMP:
The following batch script describes a job using 3 compute nodes. On every compute node, 2 MPI processes are launched. Each process comprises 36 cores. A 2-minute time limit is set (the program should actually finish within few seconds). Output is written to the file helloworld-mpi-omp-%j.log, where %j corresponds to the job ID.
Please make sure to adapt the path in your script before you submit the job. The example assumes that all files are located in the folder helloworld in your home directory. You may search for MPI implementations using ml spider mpi or ml spider openmpi. You can also download the file from the webbox.
#!/bin/bash
#SBATCH --job-name=helloworld-mpi-omp # specifies a user-defined job name
#SBATCH --nodes=3 # number of compute nodes to be used
#SBATCH --ntasks-per-node=2 # number of MPI processes per node
#SBATCH --partition=small
# partition (small_shared, small, medium, small_fat, small_gpu)
# special partitions: large (for selected users only!)
# job configuration testing partition: dev
#SBATCH --cpus-per-task=36 # number of cores per process
#SBATCH --time=00:02:00 # maximum wall clock limit for job execution
#SBATCH --output=helloworld-mpi-omp_%j.log # log file which will contain all output
### some additional information (you can delete those lines)
echo "#==================================================#"
echo " num nodes: " $SLURM_JOB_NUM_NODES
echo " num tasks: " $SLURM_NTASKS
echo " cpus per task: " $SLURM_CPUS_PER_TASK
echo " nodes used: " $SLURM_JOB_NODELIST
echo " job cpus used: " $SLURM_JOB_CPUS_PER_NODE
echo "#==================================================#"
# commands to be executed
# modify the following line to load a specific MPI implementation
module load mpi
cd $HOME/YOUR_PATH_TO_COMPILED_PROGRAM/helloworld
#cd /beegfs/home/YOUR_PATH_TO_COMPILED_PROGRAM/helloworld
# use the SLURM variable "ntasks" to set the number of MPI processes;
# here, ntasks is computed from "nodes" and "ntasks-per-node"
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
mpirun -np $SLURM_NTASKS ./helloworld_gnu_mpi_omp
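The examples above cover CPU-only jobs. For the small_gpu partition, GPUs are typically requested via SLURM's generic resource (GRES) mechanism; the following is a hypothetical sketch only, since the exact GPU request syntax (e.g. --gres=gpu:1 vs. --gpus=1) depends on the cluster's SLURM configuration, so please verify it before use, e.g. via the support channels listed at the end of this page.
#!/bin/bash
#SBATCH --job-name=gpu-test # specifies a user-defined job name
#SBATCH --partition=small_gpu # GPU partition
#SBATCH --nodes=1 # number of compute nodes to be used
#SBATCH --ntasks=1 # number of processes
#SBATCH --gres=gpu:1 # assumption: GRES-based request for one GPU; verify on HSUper
#SBATCH --time=00:02:00 # maximum wall clock limit for job execution
#SBATCH --output=gpu-test_%j.log # log file which will contain all output
# commands to be executed
nvidia-smi # print information about the allocated GPU(s)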
Useful Environment Variables
- $HOME – Path to one's home directory
- $PROJECT – Path to the BeeGFS project directory
- $SLURM_TMPDIR – Path to the temporary folder for the current Slurm job. On job completion, all files created there are automatically deleted. Hence, the user's BeeGFS quota is not affected by temporary files created in this path once the Slurm job completes. To keep files, essential files should be copied during the job's runtime (copy commands in the Slurm job script); see the sketch after this list.
  - On compute nodes: $SLURM_TMPDIR points to the node's memory. Please be careful not to run out of memory by writing too much data there.
  - On GPU nodes: $SLURM_TMPDIR points to a local SSD with a total of 894 GB. Memory is not affected by writing data to $SLURM_TMPDIR. The storage might be shared if the node is not allocated exclusively.
- $SCRATCH – Path to the /scratch directory.
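A minimal sketch of how $SLURM_TMPDIR could be used inside a job script; the file and directory names are placeholders:
# work in the per-job temporary directory instead of BeeGFS
cd $SLURM_TMPDIR
cp $HOME/my_project/input.dat . # stage input data
$HOME/my_project/my_program input.dat # run the program in the temporary directory
cp results.dat $HOME/my_project/ # copy results back before the job ends; $SLURM_TMPDIR is cleaned up afterwards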
ISCC
The Interactive Scientific Computing Cloud (ISCC) aims to fill the gap for users with specific needs that cannot be containerized as well as users with workloads that cannot be satisfied by existing local machines and do not yet require the entire power of HSUper.
Technical Specifications
The ISCC cluster consists of 12 hosts in total, with the same hardware specifications as HSUper, except for the interconnect, which provides 50 Gb/s over Ethernet instead of InfiniBand HDR100.
- Regular hosts: 10 hosts, each equipped with 256 GB RAM and 2 Intel Icelake sockets; each socket features an Intel(R) Xeon(R) Platinum 8360Y processor with 32 cores, yielding a total of 64 cores per host.
- GPU hosts: 2 hosts, each equipped with 1 TB RAM, 2 Intel Icelake sockets, 8 NVidia A30 (24 GB) GPUs and 2 TB local scratch storage; each socket features an Intel(R) Xeon(R) Platinum 8360Y processor with 32 cores, yielding a total of 64 cores per host.
Virtual Machines (VM)
The maximum number of resources a virtual machine can acquire is the same as a single host can provide.
Resources can be divided as finely as a single core, 4 MB of RAM, and a quarter of a single NVidia A30 GPU.
In theory, this allows a GPU host to be sliced into 32 equal virtual machines, each using a quarter of an NVidia A30, 2 CPU cores and 32 GB of RAM.
As resources are scarce, it is important to release unused resources (e.g. by shutting down or deleting a VM).
VMs can be requested as needed and their resources can be changed as the needs of a VM or the entire ISCC evolve.
The guest operating system can be chosen freely, but should take into account the use case.
Storage
VMs generally do not have local storage but use storage provided by Ceph over a 50 GbE network.
Increasing or decreasing the storage of a VM is generally possible, but sometimes the time and effort does not justify the result. Therefore, it is important to make a good guess when creating the VM or requesting the resources for a VM.
In addition, HSUper’s 1 PB BeeGFS file system may be mounted via SSHFS, utilizing the 50 GbE network connection of the ISCC; a sketch is shown below.
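A hypothetical example of such a mount from inside a VM, assuming sshfs is installed in the guest OS (the mount point and remote path are placeholders; replace <rz-name> accordingly):
mkdir -p ~/hsuper-home
sshfs <rz-name>@hsuper-login01.hsu-hh.de:/beegfs/home/<rz-name> ~/hsuper-home
# ... work with the files ...
fusermount -u ~/hsuper-home # unmount when done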
How to Access ISCC Resources
The ISCC can be accessed from within the HSU network only. If you are not located at the HSU, you need to connect using the HSU VPN (see above).
In order to use ISCC resources, someone needs to provide you with access information for a VM running in the ISCC.
Please apply for ISCC resources using the following form (available on campus/VPN only, login with RZ credentials, Microsoft’s multi-factor authorisation may intercept your request).
Please note: We are still testing and experimenting with the ISCC. We would therefore appreciate your feedback.
Fujitsu A64FX Testbed
Please note: Not part of the CBRZ! Access requests are handled separately.
- 8 nodes
- 48 cores @2.00 GHz per node
- 32 GB RAM per node
AMD Epyc Testbed
Please note: Not part of the CBRZ! Access requests are handled separately.
- 4 nodes
- 64 cores @2.45 GHz per node
- 256 GB RAM per node
Support
The administration / technical support can be reached via the ticket system (login with RZ credentials, Microsoft’s multi-factor authorisation may intercept your request):
- CBRZ: HSUper Access Registration
- CBRZ: HSUper Project Access Request
- CBRZ: ISCC Administration & Resources Access Registration
- CBRZ: HSUper/ISCC Support
Alternatively, send an email to: [email protected]
Last modified: 19 October 2023