For Head TAs

This topic contains information for head TAs who intend to use the cluster for a course they are responsible for, including using JupyterHub to run notebook servers.

To set up a course, please contact our service desk with the information requested below.

Use of the cluster for D-INFK courses is free.

Planning

Before we can set up the environment, we need the following information.

1. Course Tag

We need a tag for your course: an abbreviation or another short name. For example, aml or mixed_reality would be fine.

The tag will be used for the course folder and SLURM account names, as well as for the access groups we will create.

2. CPU or GPU Access

Currently the cluster has resources for:

  • Interactive Jupyter sessions using GPUs. The constraining factor is the number of GPUs.
  • Jobs using GPUs. The constraining factor is the number of GPUs.
  • Interactive Jupyter sessions using CPUs. The constraining factor is RAM.

Unless your course is complex, you only need one of these options, but we need to know which. If you need more than one, we will create two or three course tags.

For plain CPU jobs, we recommend using the free tier on Euler.

3. Number of Students

We only have 256 GPUs for GPU courses and CPU nodes with 1536 GB of RAM for CPU Jupyter sessions. For large classes you may need to form teams. In this case we need to know which students form a team. If you give us a ratio, we can also assign students to teams automatically.

Teams are configured so that only one member of a team can run a job or Jupyter session at any given time. Collaboration is currently not implemented.
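
The ratio-based assignment mentioned above can be pictured with a short sketch. The function name, the made-up logins, and the round-robin strategy are assumptions for illustration only; the procedure actually used on the cluster may differ.

```python
import math


def assign_teams(logins, students_per_slot):
    """Split student logins into teams of at most `students_per_slot` members.

    Hypothetical helper: each team shares one job or Jupyter slot, so the
    number of teams equals the number of slots the course needs.
    """
    if students_per_slot < 1:
        raise ValueError("students_per_slot must be at least 1")
    ordered = sorted(logins)
    team_count = math.ceil(len(ordered) / students_per_slot)
    teams = [[] for _ in range(team_count)]
    # Round-robin keeps team sizes within one member of each other.
    for index, login in enumerate(ordered):
        teams[index % team_count].append(login)
    return teams


# Made-up logins: 7 students at a ratio of 3 students per slot give 3 teams.
example_teams = assign_teams(["s1", "s2", "s3", "s4", "s5", "s6", "s7"], 3)
# example_teams == [["s1", "s4", "s7"], ["s2", "s5"], ["s3", "s6"]]
```

If you prefer to choose the teams yourself, a plain list of logins per team carries the same information.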

4. Maximum Job Runtime

Currently we have four different priorities for GPU job runtime, and you will need to choose the one that fits your course best:

  • Interactive: Has priority over everything else, maximum 60 min runtime, fills cluster completely if necessary. This is intended for running Jupyter notebooks or interactive sessions and nothing else.
  • Short: 10 min runtime.
  • Medium: 60 min runtime.
  • Long: Negotiable limit, but jobs are preempted by Interactive/Short/Medium priority jobs and automatically restarted. Checkpointing is heavily recommended!
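
As an illustration of checkpointing for Long priority jobs, here is a minimal sketch that uses only the Python standard library. The file layout, the state dictionary, and the checkpoint interval are assumptions; a real training job would store model and optimizer state with its framework's own tools.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write `state` via a temporary file so a preempted job never leaves a half-written checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)  # replaces the old checkpoint in one step


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    if not os.path.exists(path):
        return {"step": 0, "total": 0}
    with open(path) as handle:
        return json.load(handle)


def run(path, steps, checkpoint_every=10):
    """Resume from the last checkpoint and continue until `steps` is reached."""
    state = load_checkpoint(path)
    while state["step"] < steps:
        state["total"] += state["step"]  # stand-in for the real work of one step
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state
```

Because a restarted job calls `run` with the same checkpoint path, it continues from the last saved step instead of starting over; at most `checkpoint_every` steps are repeated.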

Please consider the longest runtime that your students' jobs will need.

For CPU Jupyter sessions the default session time is one hour. After that the server is killed and the user has to start a new one. For small courses this can be extended to two hours.

5. Total Number of Job Hours for a Student

How much time will a student (or a team) need to comfortably complete all exercises of a semester? Once this amount of time is used up, the student (or team) will not be able to run new jobs.

To encourage efficient use of the resources, we do not want this number to be too high. The cluster is a shared resource, so your students are not entitled to a GPU or CPU 24x7 for a whole semester.
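
One way to arrive at a number is to add up the expected runs per exercise. The sketch below is only an illustration: the function name, the exercise figures, and the 1.5x safety margin are made-up assumptions, not cluster policy.

```python
def job_hour_budget(exercises, safety_factor=1.5):
    """Estimate the job hours one student (or team) needs per semester.

    `exercises` is a list of (expected_runs, minutes_per_run) pairs.
    """
    minutes = sum(runs * minutes_per_run for runs, minutes_per_run in exercises)
    return minutes * safety_factor / 60


# Made-up course: 20 runs of 10 min plus 10 runs of 60 min, with a 1.5x margin.
budget_hours = job_hour_budget([(20, 10), (10, 60)])  # 20.0 hours
```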

Data

Each course gets a folder /cluster/courses/{course tag} where you can put data sets or whatever the students need. This folder will be writable by the TAs and readable by the students.

If you have large data (>500 GB), please contact us. We may need to offload this data to a different, slower location.

Group Work Directories

If students form teams, we can create an additional work directory per team if this is justified.

Software

We only install software that comes with Ubuntu. If something you need is missing, please let us know. Login nodes and compute nodes have the same software installed.

You can also compile your own software, install it under /cluster/courses/{course tag}, and have your students add its location to their $PATH.

Python and Anaconda

We recommend that you set up a Python virtual environment or Anaconda installation under /cluster/courses/{course tag} with all the packages your course needs. This is especially recommended if you need something like PyTorch, which takes up multiple gigabytes. Then tell your students how to activate the environment.
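
A shared virtual environment can be created with Python's standard `venv` module, for example. This is a sketch under assumptions: the helper name and the sub-folder name "venv" are hypothetical, and `course_dir` stands for your /cluster/courses/{course tag} folder.

```python
import venv
from pathlib import Path


def create_course_env(course_dir, name="venv", with_pip=True):
    """Create a virtual environment inside the course folder and return its path.

    The sub-folder name "venv" is an assumption, not a cluster convention.
    """
    env_dir = Path(course_dir) / name
    venv.create(env_dir, with_pip=with_pip)
    return env_dir
```

On Linux, `venv` places an activation script at bin/activate inside the environment folder. Students can source that script in their shell to use the shared packages without installing anything in their home directory.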

Students can of course set up their own Python environment or install Anaconda in their home directory, but that will use up a lot of disk space.

We keep recent Anaconda and Miniconda installers ready in /cluster/data. Please use these for your class instead of letting each student download them.

Jupyter

Using Jupyter with https://student-jupyter.inf.ethz.ch requires you to set up a Jupyter environment under /cluster/courses/{course tag}, which we will then enable in the hub's environment chooser. Instructions can be found here.

Additional Information

To complete the setup for production we also need the following, but it is OK to supply it later:

  • The list of student logins or mail addresses, for instance from eDoz.
  • The list of additional TA logins.

At the moment, you need to report to us any users who have to be added.

Page URL: https://www.isg.inf.ethz.ch/bin/view/Main/ServicesClusterComputingStudentClusterForHeadTAs
2024-07-25
© 2024 Eidgenössische Technische Hochschule Zürich