For Head TAs

This topic contains information for head TAs who intend to use the cluster for a course they are responsible for, including using JupyterHub to run Jupyter notebooks.

To set up a course, please contact our service desk with the information requested below.

Provisions

The cluster can only be used for teaching courses held for the D-INFK. It's not intended for projects or theses.

For plain CPU jobs we recommend using the free tier on Euler, as we only have a few CPU cores available, just enough for Jupyter notebooks.

Planning

Before we can set up the environment for your course, we need the following information.

1. What You Need on the Cluster

Currently the cluster provides three different options for using its resources:

Interactive Jupyter sessions using GPUs. The constraining factor is the number of available GPUs.
Interactive Jupyter sessions using CPUs. The constraining factor is available RAM.
SLURM jobs using GPUs. The constraining factor is the number of available GPUs.

You can request one option, a combination of two, or all three.

2. Course Tag

We need a tag for your course, either an abbreviation or something short. aml or mixed_reality would be fine.

The tag will be used for course folder and SLURM account names as well as for the access groups we will create.

If you requested more than one option in step 1, we will add additional course tags with jupyter, jobs or cpu suffixes as needed, but these will only be relevant for starting jobs or Jupyter servers.

3. Number of Students

We only have 256 GPUs for GPU courses and CPU nodes with 1536GB of RAM for CPU Jupyter sessions. We need to know how many students you anticipate in your course and how likely it is that they will all be using the cluster at the same time.
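
As a rough planning aid, the following Python sketch estimates the peak number of simultaneous GPU sessions and the smallest team size that would keep the course within a given GPU share. The function names, the one-GPU-per-session default, and the idea that a course could use all 256 GPUs are assumptions made for illustration; the cluster is shared, so the share actually available to a course may be smaller.

```python
TOTAL_GPUS = 256  # figure quoted in the text above


def peak_sessions(num_students: int, concurrency_percent: int) -> int:
    """Expected number of students on the cluster at the same time (rounded up)."""
    if not 0 < concurrency_percent <= 100:
        raise ValueError("concurrency_percent must be in 1..100")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-num_students * concurrency_percent // 100)


def min_team_size(sessions: int, available_gpus: int = TOTAL_GPUS,
                  gpus_per_session: int = 1) -> int:
    """Smallest team size so that one session per team fits the GPU share.

    Assumes (hypothetically) that each session uses `gpus_per_session` GPUs
    and that only one team member runs a session at a time.
    """
    capacity = available_gpus // gpus_per_session
    if capacity < 1:
        raise ValueError("not enough GPUs for a single session")
    return max(1, -(-sessions // capacity))


# Example: 600 students, half of them expected to work at the same time.
sessions = peak_sessions(600, 50)
team_size = min_team_size(sessions)
```

A team size of 1 means that no teams are needed under these assumptions.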

For large classes you may need to build teams. In this case we need to know which students form a team. If you give us a ratio we can also automatically assign students to teams.

Teams will be configured in a way that only one member of a team can run a job or Jupyter session at any given time. Collaboration is not possible.
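
To illustrate what assigning students to teams by a ratio could look like, here is a minimal Python sketch that splits a list of logins into teams of a fixed size. This is not the procedure the service desk uses; the function name, the sorting of logins, and the team labels (team01, team02, ...) are hypothetical.

```python
def assign_teams(logins: list[str], team_size: int) -> dict[str, list[str]]:
    """Split logins into consecutive teams of at most `team_size` members."""
    if team_size < 1:
        raise ValueError("team_size must be at least 1")
    ordered = sorted(logins)  # deterministic order; the real assignment may differ
    teams: dict[str, list[str]] = {}
    for start in range(0, len(ordered), team_size):
        label = f"team{start // team_size + 1:02d}"  # hypothetical naming scheme
        teams[label] = ordered[start:start + team_size]
    return teams


# Example with made-up logins and a ratio of two students per team.
teams = assign_teams(["carol", "alice", "bob", "dave", "erin"], 2)
```

If the number of students is not a multiple of the team size, the last team simply ends up smaller.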

4. Maximum Runtime

Currently we have two different priorities for GPU job runtime, and you will need to choose the one that best fits your course:

interactive: Has priority over jobs, maximum 60 min runtime, fills cluster completely if necessary. This is intended for running Jupyter notebooks or interactive sessions and nothing else.
jobs: Negotiable limit up to seven days, but jobs are preempted by interactive sessions and automatically restarted. Checkpointing is strongly recommended!
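
Because preempted jobs are restarted from the beginning, student code should save its progress regularly and resume from the last saved state. The following Python sketch shows one possible pattern using a JSON file that is replaced atomically. The file name, the state layout, and the `stop_after` parameter (which only simulates a preemption) are assumptions made for illustration; real training code would typically save model and optimizer state with its framework's own tools.

```python
import json
import os
from pathlib import Path
from typing import Optional


def load_checkpoint(path: Path) -> dict:
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if path.exists():
        return json.loads(path.read_text())
    return {"step": 0, "total": 0}


def save_checkpoint(path: Path, state: dict) -> None:
    """Write to a temporary file first, then replace the checkpoint atomically."""
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(json.dumps(state))
    os.replace(tmp, path)  # a half-written checkpoint never becomes visible


def run(path: Path, num_steps: int, checkpoint_every: int = 10,
        stop_after: Optional[int] = None) -> dict:
    """Resume from the last checkpoint and continue until num_steps is reached."""
    state = load_checkpoint(path)
    while state["step"] < num_steps:
        if stop_after is not None and state["step"] >= stop_after:
            break  # simulated preemption: exit without saving
        state["total"] += state["step"]  # placeholder for the real work
        state["step"] += 1
        if state["step"] % checkpoint_every == 0 or state["step"] == num_steps:
            save_checkpoint(path, state)
    return state
```

With this pattern, a restarted job repeats at most `checkpoint_every` steps instead of the whole run.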

For CPU jobs there is only one priority:

interactive: Maximum 120 min runtime. This is intended for running Jupyter notebooks or interactive sessions and nothing else.

Please consider what the longest runtime of any of your students' jobs is going to be.

For Jupyter sessions, the server is killed when the time runs out and the user has to start a new one.

5. Total Number of Job Hours for a Student or Team

How much time will a student or a team need to comfortably complete all exercises or projects of a semester? A student or team will not be able to run new jobs once this amount of time is used up.

To encourage efficient use of the resources, we do not want this number to be too high. This is a shared resource, so your students are not entitled to a GPU or CPU 24/7 for a whole semester.
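
One way to arrive at a reasonable number is to multiply the expected number of runs by their typical duration and add a modest margin. The Python sketch below does this arithmetic. The function name, the inputs, and the default safety factor of 1.5 are hypothetical; choose values that match your own exercises.

```python
def job_hours_budget(num_assignments: int, runs_per_assignment: int,
                     hours_per_run: float, safety_factor: float = 1.5) -> float:
    """Estimate the job hours one student or team needs for a semester."""
    if safety_factor < 1:
        raise ValueError("safety_factor should be at least 1")
    base_hours = num_assignments * runs_per_assignment * hours_per_run
    return base_hours * safety_factor  # margin for failed or repeated runs


# Example: 6 assignments, about 10 runs each, 30 minutes per run.
per_team = job_hours_budget(6, 10, 0.5)
# Course-wide total for a hypothetical 120 teams.
course_total = per_team * 120
```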

6. Data

What data will your students use or produce and how much?

Each course gets a folder /cluster/courses/{course tag} where you can put data sets or whatever the students need. This folder will be writable by the TAs and readable by the students.

Each of your students has a 20GB home directory and a directory under /work/scratch which can temporarily hold up to 100GB (see here). If they need a storage location for large data or, in the case of teams, for collaboration, we can allocate space in /work/courses. In this case we need to know how much each user or team needs and what the total for the whole course is expected to be.

7. Software

We only install software that comes with Ubuntu. If something you need is missing, please let us know. Login nodes and compute nodes have the same software installed.

You can always compile your own software into /cluster/courses/{course tag} and have your students add a location there to their $PATH.

Additional Information

To complete the setup for production we also need the following, but it is OK to supply this information at a later point in time:

The list of additional TA logins.
The initial list of student logins or mail addresses, for instance from eDoz.

Managing Users

Adding users to teams is done manually; please send requests to our service desk.

Adding users to your course is done by simply adding users to a group we provide you. Users will be created or removed on the cluster with a delay of up to 15 minutes.