Supercomputing Division, Information Technology Center, The University of Tokyo

FAQ for Reedbush-H Large-Scale HPC Challenge

Use of the system

Could you tell me more about the Reedbush-H Large-Scale HPC Challenge?
The Large-Scale HPC Challenge is an open-call program that gives a research group exclusive use of all compute nodes (120 nodes, 240 GPUs) available on the Reedbush-H Supercomputer System. For details, please refer to “Reedbush-H Large-Scale HPC Challenge.”
Are there procedures that need to be completed in order to use the Reedbush-H system under the Large-Scale HPC Challenge?
To use Reedbush-H under the Large-Scale HPC Challenge, you must submit an application during one of the calls for proposals held several times each year, and your project must be selected. For details, please refer to “Reedbush-H Large-Scale HPC Challenge.”
What are the criteria for using the Reedbush-H system under the Large-Scale HPC Challenge?
Use of the supercomputer under this initiative is limited to research projects that involve large-scale computation using a maximum of 120 nodes. In addition, the applicant and members of the research group must have a track record in large-scale computation using parallel computing systems in Japan or abroad. A wide range of research fields related to HPC is eligible. However, use of the system under this initiative is limited to programs developed by the research group itself and open-source programs; software developed by software vendors may not be used. For details, please refer to “Research subjects for the Reedbush-H Large-Scale HPC Challenge.”
Are there any restrictions on the members who can use the Reedbush-H system under the Large-Scale HPC Challenge?
Applicants must be researchers affiliated with universities or public organizations in Japan, or persons affiliated with private corporations. If the applicant or any member of the research group belongs to a corporation, additional documents must be submitted, as listed below. For details, please refer to “Application Eligibility for the Large-Scale HPC Challenge.”

For research groups that include members from corporations

It is necessary to submit one of the following documents.
  • Photocopy of joint research agreement
  • Photocopy of a written pledge stating that appropriate supervision will be carried out, and photocopy of contract agreement
  • Written pledge stating compliance with the purpose of use set out in the Terms of Use

For research groups that include members who are foreign nationals

Please check to ensure that you are not in violation of laws and regulations related to export trade.
For details, please refer to “Use of supercomputers by foreign nationals and those residing abroad.”
I have received approval for use of the supercomputer under the Large-Scale HPC Challenge. Is it possible to add users other than the users I have named when submitting my application? Is it also possible to add disk storage, etc.? If it is possible, what are the application procedures for doing so?
Please inquire via e-mail (reception e-mail address).
Your project, including its usage environment, is screened when you submit the project proposal for the Large-Scale HPC Challenge. Adding users or disk storage therefore requires the project to be screened again. The additions may be approved as a result of this second screening. However, because rescreening takes time, approval may not be granted before your scheduled date for using the supercomputer. Before submitting your application, please check all required items carefully, including the users of the Reedbush-H Supercomputer System and the disk storage you need.
Could you tell me the flow of procedures leading up to the use of the supercomputer under the Large-Scale HPC Challenge?
For the Large-Scale HPC Challenge, the project proposal submitted by the user is screened and selected by the screening committee. If a project is selected, the applicant will be notified of the date and time of use determined by the Center based on the choices listed by the applicant. A form to apply for the use of the supercomputer system will also be attached to the e-mail. Please fill in the necessary fields, and submit it to the Research Support Team, Information Strategy Group, Information Systems Department, the University of Tokyo. Approximately one month before use, a notification of user registration containing the date of use and other details will be sent out. After use, you will be required to submit a report of the results, submit an article on the results to a PR magazine, and/or submit a brief report to a peer-reviewed international conference. You may also be invited to make a presentation at a seminar or workshop organized or co-organized by the Center.
[Figure: Large-Scale HPC Challenge, flow leading up to the use of the system (overview)]
I am already a user of the Reedbush-H Supercomputer System. Will a new user number be issued for use of the system under the Large-Scale HPC Challenge?
If the user selected for the Large-Scale HPC Challenge is an existing user of the system, a new user number will not be issued. Instead, only the tokens required for the Large-Scale HPC Challenge will be added; these tokens can be used by all members of the group for the Large-Scale HPC Challenge. For group members who are not yet users of the system, a user number that allows use of the supercomputer only during the Large-Scale HPC Challenge period will be issued, along with the corresponding number of tokens.
Will a new directory (file storage, etc.) be added for use under the Large-Scale HPC Challenge?
The group will be registered to use the supercomputer under the Large-Scale HPC Challenge. Hence, a directory that will be available for use only during the period of the Large-Scale HPC Challenge will be added. After the end of this usage period, the directory/files, etc. that have been added will all be deleted.
How do I use the supercomputer system after I have been granted approval for use under the Large-Scale HPC Challenge?
The notification of user registration, sent out approximately one month before use under the Large-Scale HPC Challenge, contains a project code for use in this Challenge. For the Large-Scale HPC Challenge, 7,200 node-hours of tokens and disk storage (16 TB per group on /lustre) will be assigned. The tokens can be used for debugging and other work until the date of the Large-Scale HPC Challenge. The period of use permitted under the Large-Scale HPC Challenge runs from the first day of the month of the Large-Scale HPC Challenge to the day before the Large-Scale HPC Challenge in the following month. Once the period of use ends, the group using the system under the Large-Scale HPC Challenge will have its permission to use the system cancelled, and the allocated tokens as well as the files saved on disk will be deleted. Login nodes can also be used on the day of the Large-Scale HPC Challenge, so please log in and submit jobs as usual. During the Large-Scale HPC Challenge, tokens will not be consumed.
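The period of use described above can be worked out from two dates: the day of your own Large-Scale HPC Challenge and the day of the Challenge held in the following month. The following is a minimal sketch, assuming GNU date is available. The dates and the function names period_start and period_end are hypothetical and serve only as an illustration; the actual dates are those given in the notification from the Center.

```shell
#!/bin/sh
# Hypothetical dates for illustration only (not taken from the FAQ).
challenge_date="2018-05-24"        # day of this group's Challenge
next_challenge_date="2018-06-21"   # day of the Challenge in the following month

# period_start DATE -> first day of the month containing DATE (YYYY-MM-01)
period_start() {
    echo "${1%-*}-01"
}

# period_end DATE -> the day before DATE (assumes GNU date for the -d option)
period_end() {
    date -u -d "$1 -1 day" +%Y-%m-%d
}

start=$(period_start "$challenge_date")
end=$(period_end "$next_challenge_date")
echo "Period of use: $start to $end"
echo "Back up or move your files no later than $end"
```

Files and tokens are deleted once the period ends, so the end date is also the practical deadline for backups.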
Can I submit the report of results in English? If so, are there any examples of how I should write the acknowledgements?
You may submit the report in English. For examples of English acknowledgements, please refer to “English translation of acknowledgements when reporting the research results in academic papers, etc.”

Tokens

How many tokens will be allocated for use under the Reedbush-H Large-Scale HPC Challenge?
A total of 7,200 node-hours of tokens (equivalent to 4 nodes) is allocated for use under the Large-Scale HPC Challenge. These tokens can be used for debugging and other work until the day of the Large-Scale HPC Challenge.
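As a rough planning aid, the arithmetic behind a node-hour budget can be sketched as follows. This is a minimal sketch that assumes a job consumes tokens at a simple rate of nodes × elapsed hours; the actual consumption rate on Reedbush-H is set by the Center and may differ. The function names and the sample job size are hypothetical.

```shell
#!/bin/sh
# Assumption: token cost = number of nodes x elapsed hours (whole hours only).

TOKEN_BUDGET=7200   # node-hours allocated per group, as stated in this FAQ

# node_hours NODES HOURS -> node-hours consumed by one job
node_hours() {
    echo $(( $1 * $2 ))
}

# remaining_budget USED -> node-hours left out of TOKEN_BUDGET
remaining_budget() {
    echo $(( TOKEN_BUDGET - $1 ))
}

# Hypothetical example: a 16-node debugging run lasting 2 hours
used=$(node_hours 16 2)
echo "Used: $used node-hours, remaining: $(remaining_budget "$used") node-hours"
```

Under the same assumption, the full budget corresponds to 60 hours on all 120 nodes, since 120 × 60 = 7,200.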
If the number of tokens allocated initially is 7,200 node-hours, is there a possibility that I may not have enough tokens for the Large-Scale HPC Challenge?
The queues used on the day of the Large-Scale HPC Challenge do not consume tokens. Batch jobs executed on that day therefore do not use up your allocated tokens, so you will be able to execute them.
Are there any restrictions on the tokens that can be used under the Large-Scale HPC Challenge?
There are no particular restrictions on the tokens that can be used under the Large-Scale HPC Challenge. The queues available are therefore the same as those for general users.
I have used up all the tokens allocated for use under the Large-Scale HPC Challenge. Is it possible to add tokens?
It is not permitted to add tokens on top of the tokens you have been allocated for use under the Large-Scale HPC Challenge.
Could you tell me the user number for use in the Large-Scale HPC Challenge, and the deadline for using the tokens? Also, by when should I back up my files?
Tokens are valid until the day before the Large-Scale HPC Challenge held in the month following your use.

If you have acquired a user number for use in the Large-Scale HPC Challenge

As described above, tokens are valid until the day before the Large-Scale HPC Challenge held in the month following your use. Please back up or move your files within this period of permitted use. When the validity period expires, the user number, files, and so on will be deleted. The Center does not back up any files, so please complete the backup and transfer of your files during the period of use.

If you have added tokens for use in the Large-Scale HPC Challenge

As described above, tokens are valid until the day before the Large-Scale HPC Challenge held in the month following your use. Please note that tokens added under the Large-Scale HPC Challenge can no longer be used after the validity period expires. If your disk storage limit was raised under the Large-Scale HPC Challenge, it will be reset to the original value set at user registration once the validity period expires. Please manage your disk usage with this limit in mind.

Job submission and administration

On the day of the Reedbush-H Large-Scale HPC Challenge, will it be possible to use the login node?
Yes, it will be possible.
Please log in to the login node and submit jobs as usual.
Could you tell me more about the project code, queue names (and how to specify them), etc. that can be used under the Reedbush-H Large-Scale HPC Challenge?
Please refer to the notification of user registration, which contains details about the project code, queue names, etc. that can be used under the Large-Scale HPC Challenge.
An example job script is provided below. For details, please refer to the user’s guide in the User Support Portal (System Documentation).
[Example]
 #!/bin/sh
 # Queue
 #PBS -q h-challenge
 # Number of nodes, cores per node, MPI processes per node, OpenMP threads per process
 #PBS -l select=120:ncpus=36:mpiprocs=36:ompthreads=1
 # Project code
 #PBS -W group_list=gx10
 # Elapsed time (walltime) limit
 #PBS -l walltime=5:00:00
 # Job name
 #PBS -N job1

 # Execute the program
 mpirun ./a.out
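To see what the resource request in the example amounts to, the select specification can be broken down into its fields. The following is a minimal sketch, assuming the usual PBS meaning of the fields (select is the number of nodes, mpiprocs the MPI processes per node, ompthreads the OpenMP threads per process). The helper function field is hypothetical and is not part of PBS or of the Reedbush environment.

```shell
#!/bin/sh
# field SPEC KEY -> value of KEY in a colon-separated key=value specification
# (hypothetical helper for illustration)
field() {
    printf '%s\n' "$1" | tr ':' '\n' | sed -n "s/^$2=//p"
}

spec="select=120:ncpus=36:mpiprocs=36:ompthreads=1"

nodes=$(field "$spec" select)
ncpus=$(field "$spec" ncpus)
ranks_per_node=$(field "$spec" mpiprocs)
threads_per_rank=$(field "$spec" ompthreads)

# Total MPI processes across the whole job
total_ranks=$(( nodes * ranks_per_node ))
# Cores occupied on each node (should not exceed ncpus)
cores_used_per_node=$(( ranks_per_node * threads_per_rank ))

echo "Total MPI processes: $total_ranks"
echo "Cores used per node: $cores_used_per_node of $ncpus"
```

For a hybrid MPI/OpenMP run, the same check helps confirm that mpiprocs × ompthreads stays within ncpus.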
Could you tell me how to check a job that is being executed?
You can check using the rbstat command.
What is the number of batch jobs that can be submitted simultaneously under the Reedbush-H Large-Scale HPC Challenge?
The maximum number of batch jobs that can be submitted simultaneously is 240 per user.
What is the number of batch jobs that can be executed simultaneously under the Reedbush-H Large-Scale HPC Challenge?
The maximum number of batch jobs that can be executed simultaneously is 120 per user.
What is the scheduling method for the Large-Scale HPC Challenge?
Jobs are scheduled in FIFO (first in, first out) order.
Is there a way of checking the number of tokens used by users registered under the Large-Scale HPC Challenge?
You can check the number of tokens you have used through the show_token command.
Group administrators can check the number of tokens for users in their group.
My job does not start running.

It is possible that another user from the same group is executing a job. Use the rbstat command to check if jobs submitted by other users are being executed.

Is it possible to execute an interactive job during the Large-Scale HPC Challenge? Also, is it possible to use the debug queue?
While the Large-Scale HPC Challenge is in progress, all compute nodes are in use, so none of the other queues, such as h-debug, h-short, h-interactive, and h-regular, will be available.

Contacting the Center

During the Reedbush-H Large-Scale HPC Challenge, a suspected system failure has occurred. I would like to make an inquiry.
Please write to the contact e-mail address for the day of the Challenge, which is provided on the notification of user registration for Reedbush.
*Please do not use the page for submitting questions and inquiries about use of the systems. In particular, inquiries submitted through that page will not be answered at night.
I would like to consult with someone about programs in relation to the Reedbush-H Large-Scale HPC Challenge.
Please inquire via the page for submitting questions and inquiries about use of the systems. Please note that, depending on the content of your inquiry, we may not be able to provide an answer during the Large-Scale HPC Challenge period.
