Supercomputing Division, Information Technology Center, The University of Tokyo

Reedbush FAQ

Application for use

I would like to apply for the use of the Reedbush system. Could you tell me the start date for application and other details?
Application for use is accepted at any time as long as there are computational resources available. Please fill in the necessary fields in the application form for use of the Reedbush system, affix your seal, and submit it to the Research Support Team, Information Strategy Group, Information Systems Department, The University of Tokyo.
Are there any points I should check (take note of) before applying for and using the Reedbush system?
Various manuals are available on the application method, method of use, and other information pertaining to the Reedbush system. Please refer to these for details about the application method and service contents. (Manuals and guides provided by the manufacturers are only available to users.)

Introduction to Reedbush courses
Reedbush tokens
Cost of use for Reedbush (Applicable from April 1, 2020)
Cost of use by month for Reedbush (Applicable from April 1, 2020)
QuickStartGuide

(Explanations are provided on the overview of how to use the Reedbush system, etc.
For details on how to use the system, please refer to the guides and manuals provided in the User Support Portal.)
How is the quota for disk space within a group set?
In the disk space allocated to a group, a directory is prepared for each individual user registered in the group (“/lustre/gz00/z3000*” in the following example), as well as a directory shared by all the users within the group (“/lustre/gz00/share” in the following example). Quotas are set for each user and for the group, and they apply across all of these directories.
When four nodes are applied for, the user quota and the group quota are each 4TB.

Group course
Tokens
(34,560 node-hours allocated per 4 nodes applied for)
Disk space for the group
(4TB allocated per 4 nodes applied for)
/lustre/gz00/share
/lustre/gz00/z30000
/lustre/gz00/z30001
/lustre/gz00/z30002
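
The allocation figures in this example can be read as a simple proportion. The following is a minimal sketch, assuming that both tokens and group disk space scale linearly in units of 4 nodes applied for. The function name `group_allocation` is hypothetical and is not part of any Reedbush tool.

```python
TOKENS_PER_4_NODES = 34_560  # node-hours, taken from the example above
DISK_TB_PER_4_NODES = 4      # TB, taken from the example above


def group_allocation(nodes_applied: int) -> dict:
    """Return the token and disk allocation for a group course.

    Assumption: allocations scale linearly in units of 4 nodes.
    """
    if nodes_applied <= 0 or nodes_applied % 4 != 0:
        raise ValueError("nodes_applied is assumed to be a positive multiple of 4")
    units = nodes_applied // 4
    return {
        "tokens_node_hours": units * TOKENS_PER_4_NODES,
        "disk_tb": units * DISK_TB_PER_4_NODES,
    }


print(group_allocation(4))
print(group_allocation(8))
```

Numerically, 34,560 equals 4 × 24 × 360. Please confirm the actual allocation for your application in the cost-of-use documents.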
It is stipulated that each group must have a group administrator. What kind of administration does the group administrator carry out?
For the Reedbush system, the number of tokens and nodes that are used to execute batch jobs is allocated to the entire group. However, the group administrator has the authority to set and change the maximum amount of these computational resources allocated to each user. Note that the maximum number of tokens or nodes set for a user cannot exceed the maximum number allocated to the group.

Functions of the group administrator
・Change the maximum number of tokens allocated to users within the group (Default value: Maximum number allocated to the group)
・Change the maximum number of nodes allocated to users within the group (Default value: Maximum number allocated to the group)
Is it possible to add users from a personal (group) course to my own group? In such cases, will new user numbers be assigned?
It is possible to add users to the group. However, when adding existing users of the Center to the group, such users are registered as additional users of the group based on the user numbers that they are currently using. Hence, new user numbers will not be assigned.
When adding users from a personal (group) course to my own group, how will the resources allocated to users in the group (tokens, disk storage) be handled?
For new users who are added to the group, approval for the use of tokens available for the group will be granted, and a user directory will be created in the space allocated to users within the group.
When adding user “z40000” to the group
Group course
Tokens
(34,560 node-hours allocated per 4 nodes applied for)
z40000 can use the tokens
Disk space within the group
(4TB allocated per 4 nodes applied for)
/lustre/gz00/share
/lustre/gz00/z30000
/lustre/gz00/z30001
/lustre/gz00/z30002
/lustre/gz00/z40000 (directory created for z40000’s use)

When deleting user “z40000” from the group

Group course
Tokens
(34,560 node-hours allocated per 4 nodes applied for)
z40000 cannot use the tokens
Disk space within the group
(4TB allocated per 4 nodes applied for)
/lustre/gz00/share
/lustre/gz00/z30000
/lustre/gz00/z30001
/lustre/gz00/z30002
/lustre/gz00/z40000 (directory for z40000’s use is deleted)


Quotas for users

For users who are affiliated with multiple courses, the quota applied to each user in the /lustre domain is the largest of the amounts allocated to those courses (the amounts are not totaled).
For example, a user in a group course for which 4TB is allocated has a maximum quota of 4TB. This quota remains unchanged as long as the amounts allocated to the user’s other courses do not exceed 4TB. (However, the number of directories to which the user can write files increases.)
If the same user also belongs to a group course for which 8TB is allocated, the user’s maximum usage quota becomes 8TB. Examples are shown below.

Example of users who are affiliated with multiple courses (Example 1)
Affiliated course     Maximum usage quota
Personal course       1TB
Group course A        4TB
Maximum usage quota for the user: 4TB

Example of users who are affiliated with multiple courses (Example 2)
Affiliated course     Maximum usage quota
Personal course       1TB
Group course A        4TB
Group course B        8TB
Maximum usage quota for the user: 8TB

The quota for each course (group*) is enforced separately from the quota for users, so please pay attention to the group ownership of your files.
*Unique groups are also allocated for the personal course.
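
The rule for users in multiple courses (take the largest allocation, do not add them up) can be sketched as follows. This is only an illustration of the two examples given in this answer; the function name `effective_user_quota_tb` and the dictionary layout are hypothetical.

```python
def effective_user_quota_tb(course_quotas_tb: dict) -> int:
    """Return the largest single-course allocation; allocations are not summed."""
    if not course_quotas_tb:
        raise ValueError("the user must belong to at least one course")
    return max(course_quotas_tb.values())


example_1 = {"Personal course": 1, "Group course A": 4}
example_2 = {"Personal course": 1, "Group course A": 4, "Group course B": 8}

print(effective_user_quota_tb(example_1))  # 4
print(effective_user_quota_tb(example_2))  # 8
```

Note that this models only the per-user quota. The per-course (group) quotas are enforced separately, as stated above.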
Is it possible to register for multiple groups?
Yes, it is. The representative of the group to be added should submit the application for use.
I am a group user, but the group I am registered with will stop using the system. I would like to continue using the supercomputer system. What should I do?
Please submit a new application for use under the group course or personal course before the period of use of the current group you are registered with expires. Please note that if you submit an application for use under the group course or personal course after the period of use expires, you will not be able to retain the same user number and keep the files that you had been using.
Is it possible to add users for a group course at any time?
Yes, it is. Fill in the necessary fields in the application form for changes for the Reedbush system, and submit it to the Research Support Team, Information Strategy Group, Information Systems Department, The University of Tokyo.
Is there a maximum number of users who can be registered under the group course?
There are no limits to the number of users who can apply (be registered) for the group course. However, regardless of the number of users, the number of tokens that are allocated to the group is fixed. Hence, this fixed number of tokens will be used (consumed) by all the users registered in the group.
There are descriptions about the number of nodes applied for, and the maximum number of nodes. Could you tell me the difference between these?
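The number of nodes applied for is the number of nodes stated in your application, while the maximum number of nodes is the upper limit on the nodes that a job can use. Under the Reedbush system, it is possible to use more nodes than the number applied for, up to the maximum number of nodes. However, when executing a batch job that uses more nodes than the number applied for, the number of tokens consumed will increase.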


Token

Could you tell me about the tokens?
For explanations and FAQ about tokens, please refer to the webpage about “Tokens.”


Overall system, service contents

What types of uses does educational use refer to?
The Information Technology Center, the University of Tokyo (hereafter, “the Center”) provides supercomputing resources for practical exercises in graduate and undergraduate classes.
Supercomputers are increasingly used as a tool for practicing methods applied in a wide range of research fields, including climate, fluid analysis, structural analysis, molecular science, nanotechnology, and aerospace. They are already being used for practical exercises related to structural analysis, Earth science, and fluid-related subjects in specialized undergraduate courses as well as graduate classes.
The use of supercomputers in education is expected to contribute to human resource development and to broaden the use of such systems. Hence, the Center accepts applications at any time for use in lectures and workshops (including intensive lectures) conducted by faculty members or teachers at graduate schools, undergraduate schools, and vocational institutes.
Use for educational purposes is not limited to use within the University of Tokyo. Applications for use outside of the University of Tokyo are also accepted. For details, please refer to “Educational Use.”

Available resources
Reedbush-U: Maximum execution time per job of 10 minutes, and maximum of 8 nodes (288 cores)
Reedbush-H: Maximum execution time per job of 10 minutes, and maximum of 2 nodes (4 GPUs)
Is it possible to change the login shell?
Changes can be made using the “chsh” command. The standard login shell is set to “bash.”


Compilers

How do I check for version updates for the compilers?
Information on version updates for compilers is provided on the User Support Portal. Please check the portal for more details.
Are debug profilers, etc. available?
Intel Inspector XE, Intel VTune Amplifier, Intel Trace Analyzer & Collector, TotalView, NVIDIA Visual Profiler, etc. are available. For details, please refer to the User’s Guide for the Reedbush System (Overview/Reedbush-U, Overview/Reedbush-H, Overview/Reedbush-L) and other documents available in the User Support Portal (- System Documentation).


Libraries

Could you tell me about the versions of the libraries and applications provided?
The versions of the libraries and applications provided by the Center are listed as follows (as of June 2017). It is possible to obtain the version information using the “module” command. For more information on how to use the “module” command, please refer to the User’s Guide.
Library                      Version
Intel library (MKL)          2017 update 2
SuperLU                      5.2.0
SuperLU MT                   3.1
SuperLU DIST                 5.1.0
METIS                        4.0.3, 5.1.0
MT-METIS                     0.4.4
ParMETIS                     4.0.3
Scotch                       6.0.4
PT-Scotch                    6.0.4
PETSc                        3.7.1
GNU Scientific Library       2.1
netcdf (C language)          4.4.0
netcdf (C++ language)        4.3.0
netcdf (FORTRAN language)    4.4.4
Parallel netCDF              1.7.0
Xabclib                      1.03
ppOpen-APPL/FEM              1.0.1
ppOpen-APPL/FDM              0.3.1
ppOpen-MATH/MP               1.0.0
ppOpen-APPL/FDM-AT           1.0.0
ppOpen-APPL/DEM-util         1.0.0
ppOpen-APPL/BEM              0.4.0
ppOpen-APPL/AMR-FDM          0.3.0
ppOpen-APPL/BEM-AT           0.1.0
ppOpen-APPL/FVM              0.3.0
ppOpen-MATH/VIS              0.2.0
ppOpen-AT                    1.0.0
MassiveThreads               0.95
OpenJDK                      1.8.0.91-0.b14
Boost                        1.61
CUDA                         8.0.44
MAGMA                        2.2.0
OpenCV                       3.2.0
ITK                          4.11.0
Theano                       0.8.2
Anaconda                     4.3.0
ROOT                         6.08.00
TensorFlow                   1.0.0

Application                  Version
OpenFOAM                     3.0.1
ABINIT-MP                    7.0
PHASE/0                      2015.01
FrontFlow/blue               8.1
FrontISTR                    4.4
REVOCAP Coupler              2.1
REVOCAP Refiner              1.1.01
OpenMX                       3.8
xTAPP                        150401
AkaiKKR                      cpa2002v009c
MODYLAS                      1.0.4
ALPS                         2.1.1-r6176
feram                        0.24.02
GROMACS                      5.1.2
BLAST                        2.3.0
R                            3.2.5
bioconductor                 3.2
BioPerl                      1.6.924
BioRuby                      1.5.0
BWA                          0.7.13
GATK                         3.5
SAMtools                     1.3.1
K MapReduce                  1.8.1
Spark                        1.6.1
Torch                        7
Caffe                        1.0.0-rc4
Chainer                      1.24.0
GEANT4                       4.10.03
I would like to use the library (xxxx). Could you tell me how to use it?
For details, please refer to the User’s Guide for the Reedbush System (Overview/Reedbush-U, Overview/Reedbush-H, Overview/Reedbush-L) and other documents available in the User Support Portal (- Browse documents).


Applications

I would like to use the application (xxxx) provided. Could you tell me how to use it?
For details, please refer to the User’s Guide for the Reedbush System (Overview/Reedbush-U, Overview/Reedbush-H, Overview/Reedbush-L) and other documents available in the User Support Portal (- Browse documents).
Can I use software that I have bought separately?
You may install and use your own software in the user directory after checking the license and other details.


Job administration system

What are the types of job classes available?
Please refer to “Job Classes.”
Are there any commands I can use to check the job classes that I can submit jobs to?
You can check this using the “rbstat --rsc” command. Details about the “rbstat” command are provided in the User’s Guide for the Reedbush System (Overview/Reedbush-U) (- 4.2.1. rbstat command) available in the User Support Portal.
What is the maximum number of jobs that can be submitted, and the maximum number of jobs that can be executed simultaneously?
The number of jobs differs depending on the course used (personal or group course). Please refer to the following table.

Reedbush-U System
Maximum number of jobs that can be executed simultaneously
  Personal course: 2
  Group course (user unit): No limit
  Group course (group unit): 12 when the number of nodes applied for is 4; 15 when the number of nodes applied for is 8; thereafter, add 3 jobs per 8 nodes
Maximum number of submissions
  Personal course: 8
  Group course (user unit): No limit
  Group course (group unit): 4 times the maximum number that can be executed simultaneously (48 when the number of nodes applied for is 4, and 60 when the number of nodes applied for is 8)
Maximum number of nodes (job unit)
  Personal course: 128 (previously 16)
  Group course (user unit): 128
  Group course (group unit): 128

However, for the u-debug queue, the maximum number of jobs that can be executed simultaneously per user is 4 (2 for the personal course), in order to avoid long job execution waiting times.

The maximum number of nodes for the personal course has been relaxed. (April 2, 2018)
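
The group-unit limits for Reedbush-U listed above follow a simple rule, sketched below. This is one reading of the table, assuming that the number of nodes applied for is either 4 or a multiple of 8 and that “add 3 jobs per 8 nodes” means 3 more jobs for each additional 8 nodes beyond the first 8. The function name `reedbush_u_group_limits` is hypothetical.

```python
def reedbush_u_group_limits(nodes_applied: int) -> tuple:
    """Return (max_running, max_submitted) for a group on Reedbush-U.

    Assumption: nodes_applied is 4 or a multiple of 8.
    """
    if nodes_applied == 4:
        max_running = 12
    elif nodes_applied >= 8 and nodes_applied % 8 == 0:
        # 15 jobs for the first 8 nodes, plus 3 for each further 8 nodes.
        max_running = 15 + 3 * (nodes_applied // 8 - 1)
    else:
        raise ValueError("this sketch only covers 4 nodes or multiples of 8")
    # Submissions are limited to 4 times the simultaneous-execution limit.
    return max_running, 4 * max_running


print(reedbush_u_group_limits(4))  # (12, 48)
print(reedbush_u_group_limits(8))  # (15, 60)
```

The values actually applied to your group are the ones reported by the “rbstat --limit” command.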


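Reedbush-H System
Maximum number of jobs that can be executed simultaneously
  Personal course: 10 (previously 100)
  Group course (user unit): No limit
  Group course (group unit): 10 (previously 100)
Maximum number of submissions
  Personal course: 10 (previously 100)
  Group course (user unit): No limit
  Group course (group unit): 10 (previously 100)
Maximum number of nodes (job unit)
  Personal course: 32
  Group course (user unit): 32
  Group course (group unit): 32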

However, for the h-debug queue, the maximum number of jobs that can be executed simultaneously per user is 1, in order to avoid long job execution waiting times.

The maximum number of jobs that can be executed simultaneously and the maximum number of jobs that can be submitted, which had been relaxed since July 2017, have been revised. (March 16, 2018)


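Reedbush-L System
Maximum number of jobs that can be executed simultaneously
  Personal course: 10 (previously 100)
  Group course (user unit): No limit
  Group course (group unit): 10 (previously 100)
Maximum number of submissions
  Personal course: 10 (previously 100)
  Group course (user unit): No limit
  Group course (group unit): 10 (previously 100)
Maximum number of nodes (job unit)
  Personal course: 16
  Group course (user unit): 16
  Group course (group unit): 16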

The maximum number of jobs that can be executed simultaneously and the maximum number of jobs that can be submitted, which had been relaxed since July 2017, have been revised. (March 16, 2018)


The above is applicable for each course (application unit) for users who are affiliated with multiple courses.

You can check the maximum number of jobs that can be executed simultaneously and the maximum number of jobs that can be submitted using the following command.

$ rbstat --limit
PROJECT     U-SUBMIT     U-RUN    H-SUBMIT     H-RUN    L-SUBMIT     L-RUN
gXXX            7/60      0/15        0/10      0/10        0/10      0/10
pXXXXX           0/8       0/2        0/10      0/10        0/10      0/10
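
If you want to process this output in a script, the following minimal sketch parses the sample shown above. It assumes that each cell has the form “current/maximum” and that the column layout matches the sample. The function name `parse_limits` is hypothetical, and the real output on your account may differ.

```python
SAMPLE = """\
PROJECT     U-SUBMIT     U-RUN    H-SUBMIT     H-RUN    L-SUBMIT     L-RUN
gXXX            7/60      0/15        0/10      0/10        0/10      0/10
pXXXXX           0/8       0/2        0/10      0/10        0/10      0/10
"""


def parse_limits(text: str) -> dict:
    """Map each project to {column name: (current, maximum)}."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()[1:]  # column names after PROJECT
    result = {}
    for line in lines[1:]:
        project, *cells = line.split()
        result[project] = {
            name: tuple(int(part) for part in cell.split("/"))
            for name, cell in zip(header, cells)
        }
    return result


limits = parse_limits(SAMPLE)
used, maximum = limits["gXXX"]["U-SUBMIT"]
print(f"gXXX can still submit {maximum - used} Reedbush-U jobs")
```

In practice, the text would come from running “rbstat --limit” on a login node rather than from a fixed string.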
Could you tell me how to start up a batch job and an interactive job?
A batch job is submitted using the “qsub” command. For details on the “qsub” command, please refer to the User’s Guide for the Reedbush System (Overview/Reedbush-U) (- 4.1.2.3. Submission of batch requests, 4.1.2.4. Submission of interactive jobs) available in the User Support Portal.
Could you tell me how to delete a job?
The “qdel” command is used to delete a job. For details, please refer to the User’s Guide for the Reedbush System (Overview/Reedbush-U) (- 4.2.2. Delete a batch job) available in the User Support Portal.
Could you tell me how to check the execution status, etc. of a job?
The “rbstat” command is used to check the execution status, etc. of a job. For details, please refer to the User’s Guide for the Reedbush System (Overview/Reedbush-U) (- 4.2.1. rbstat command) available in the User Support Portal.
Job execution does not start.
Please check the following points.
  1. There are cases where the system is waiting to secure the computational resources required for the job awaiting execution. You can check the overall usage status of the system using the “rbstat --nodeuse” command.
      $ rbstat --nodeuse
      RSCGRP                                        Used/Total
      u-debug            ************-------------  48%( 26/ 54)
      u-short            ************************* 100%( 16/ 16)
      u-regular          ****************---------  63%(184/294)
      u-regular-low      -------------------------   0%(  0/294)
      h-debug            -------------------------   0%(  0/ 16)
      h-short            ******-------------------  25%(  2/  8)
      h-regular          *******************------  77%( 74/ 96)
      l-regular          ***----------------------  11%(  6/ 56)

  2. As job scheduling is carried out on a FIFO basis, there are cases where the system is waiting for a job submitted earlier to finish.
  3. The computation time specified for a job awaiting execution may be longer than the time remaining until service suspension. You can check the time remaining until service suspension using the “rbstat” command.
  4. There are cases where the maximum number of jobs that can be executed simultaneously has been reached. Please check if your other jobs are being executed, or if another user from the same group is executing jobs. You can use the “rbstat --limit” command to check the number of jobs that are being executed within the group, and the maximum number of jobs that can be executed simultaneously.
  5. A system problem may have occurred. Please inquire via the webpage for consultation and inquiries on system use.
  * There are cases where you can speed up the start of a job by making use of the backfill function. This can be done by setting an appropriate walltime (elapsed time) in the job script. For details, please refer to the User’s Guide for the Reedbush System (Overview/Reedbush-U) (- 4.1.2.8. Backfill function) available in the User Support Portal.
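
As an illustration of the first check, the “rbstat --nodeuse” output can be turned into a count of free nodes per resource group. The following is a minimal sketch that parses the sample output shown in this answer, assuming that the figures in parentheses are used and total node counts. The function name `free_nodes` and the regular expression are hypothetical and tied to this sample layout.

```python
import re

SAMPLE = """\
RSCGRP                                        Used/Total
u-debug            ************-------------  48%( 26/ 54)
u-short            ************************* 100%( 16/ 16)
u-regular          ****************---------  63%(184/294)
u-regular-low      -------------------------   0%(  0/294)
h-debug            -------------------------   0%(  0/ 16)
h-short            ******-------------------  25%(  2/  8)
h-regular          *******************------  77%( 74/ 96)
l-regular          ***----------------------  11%(  6/ 56)
"""

# name, usage bar, percentage, then "(used/total)" with optional padding
LINE = re.compile(r"^\s*(\S+)\s+[*-]+\s+(\d+)%\(\s*(\d+)/\s*(\d+)\)")


def free_nodes(text: str) -> dict:
    """Return {resource group: total - used} for each data line."""
    free = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:  # the header line does not match and is skipped
            name, _percent, used, total = match.groups()
            free[name] = int(total) - int(used)
    return free


for group, count in free_nodes(SAMPLE).items():
    print(f"{group}: {count} free")
```

A resource group with no free nodes, such as u-short in this sample, is a likely reason for a job to remain queued.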


User Support Portal/Key registration

What can I do through the User Support Portal?
The User Support Portal allows users to register the public keys that are used to login to the Reedbush system, and to browse through the User’s Guide, tuning guide, manuals provided by the manufacturer, and information about version updates for compilers, etc. Some of the group administrator functions can also be carried out through the User Support Portal. For details, please refer to the Guide for Reedbush Group Administrator Functions.
I tried to register the public key through the User Support Portal, but could not login (authentication failed).
There are some points to note (shown in red outside the box) with regard to the input of the character string displayed in the default password field in the notification of user registration (hard copy) sent out to you. Please check these points and try again.
If you do not know your password, please contact the reception (Email address for inquiries).
I have changed the permissions in the .ssh directory, and cannot login anymore. What should I do?
Please contact us by providing the necessary information via the webpage for consultation and inquiries on system use. You will not be able to login if there are any errors in the permissions for the directory or files in question, or in the saved key. Hence, please take great care when editing the information. (Please back up any files, etc. beforehand, and check if you can connect from other terminals before you logout.)
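
Before logging out after editing the .ssh directory, a quick permission check may help. The following is a minimal sketch, assuming the common OpenSSH behavior in which key files are rejected when the .ssh directory or the files in it are writable by group or other users. This FAQ does not state the exact conditions enforced on the Reedbush system, and the function name `risky_ssh_permissions` is hypothetical.

```python
import os
import stat
from pathlib import Path


def risky_ssh_permissions(ssh_dir: Path) -> list:
    """List the directory and its files that group or other users can write to."""
    problems = []
    candidates = [ssh_dir] + sorted(p for p in ssh_dir.iterdir() if p.is_file())
    for path in candidates:
        mode = stat.S_IMODE(os.stat(path).st_mode)
        if mode & (stat.S_IWGRP | stat.S_IWOTH):
            problems.append(f"{path}: mode {mode:o} is writable by group/other")
    return problems


if __name__ == "__main__":
    ssh_dir = Path.home() / ".ssh"
    if ssh_dir.is_dir():
        for problem in risky_ssh_permissions(ssh_dir):
            print(problem)
```

An empty result does not guarantee that login will succeed, so please still confirm that you can connect from another terminal before you log out.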
