
Announcement of the 12th ASE Seminar

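 We are pleased to announce the 12th Advanced Supercomputing Environment (ASE) Seminar.
 This time, we have invited Dr. Osni Marques of Lawrence Berkeley National Laboratory to give an invited talk on the present and future of application development environments.
 In addition, as invited speakers from within Japan, Assistant Professor Kensuke Aishima of The University of Tokyo will speak on the dqds method for singular value computation, Associate Professor Tomohiro Suzuki of Yamanashi University on recent progress in QR factorization on modern computers, and Mr. Kazuma Yamamoto of the University of Tsukuba on a parallel algorithm for nonlinear eigenvalue problems.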

12th ASE Seminar: Advance Notice

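Date and time: Wednesday, April 25, 2012, 13:30 - 18:10 (banquet from 18:40)
Venue: Telecommunication Lecture Room, 4th Floor, Information Technology Center, The University of Tokyo (Yayoi area, Hongo Campus)
Organizer: Supercomputing Division, Information Technology Center, The University of Tokyo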


If you plan to attend the banquet, please send the attendance form at the end of this page.

Program

12th Advanced Supercomputing Environment (ASE) Seminar


25th April 2012 (Wed)
Information Technology Center, The University of Tokyo (Hongo Campus)
4th Floor, Telecommunication Lecture Room


【Invited Talks】

13:30 - 14:15

Invited Speaker

 Dr. Kensuke Aishima (The University of Tokyo, Japan), Dr. Yuji Nakatsukasa (University of Manchester, UK), Dr. Ichitaro Yamazaki (University of Tennessee, USA)

Title

 dqds with aggressive deflation for singular values

Abstract

 Matrix singular values play an important role in many applications. Accordingly, numerical methods for computing singular values are of great importance in practice. In order to compute the singular values, the given matrix is first transformed to a bidiagonal matrix with suitable orthogonal transformations, and then a certain iterative method is applied to the bidiagonal matrix. In 1994, Fernando and Parlett discovered the differential quotient difference with shifts (dqds) algorithm for computing singular values of bidiagonal matrices to high relative accuracy. The dqds algorithm is currently implemented in LAPACK as the DLASQ routine. Our objective is to reduce the dqds runtime without loss of high relative accuracy. More specifically, we incorporate into dqds a technique called aggressive deflation, which has been applied successfully to the Hessenberg QR algorithm. We propose an efficient and stable implementation by taking advantage of the bidiagonal structure. Numerical results are also shown to illustrate that our aggressive deflation strategy often reduces the dqds runtime significantly. In addition, a shift-free version of our algorithm has the potential to be parallelized in a pipelined fashion. Our mixed forward-backward stability analysis proves that with our proposed deflation strategy, all the singular values are computed to high relative accuracy.

14:25 - 15:10

Invited Speaker

 Associate Professor Tomohiro Suzuki (Yamanashi University, Japan)

Title

 On implementations of tile QR factorization algorithm for recent hardware

Abstract

 There are many important matrix factorizations in dense linear algebra. Classic implementations suffer from performance limitations due to the use of Level 1 and Level 2 BLAS operations. Scalability limitations exist even in blocked algorithms, which are rich in Level 3 BLAS. Such limitations are called fork-join bottlenecks. In order to take advantage of the architectural features of recent multi-core and many-core systems, tile algorithms for matrix factorization have been proposed. In this talk, we present our implementations of the tile QR factorization algorithm for a GPU system and a multi-core CPU cluster system. For the multi-core cluster system, i.e., the T2K Open Supercomputer (U. Tokyo), it is implemented with a hybrid OpenMP and MPI programming model. For the GPU system, we also show an implementation for a multi-GPU environment. In order to achieve high performance, it is important to tune each subprogram (kernel) of the tile algorithm. In addition, proper scheduling that checks the dependencies among all kernels is equally important. Some studies on optimized scheduling for the tile QR algorithm are reported.

15:20 - 16:20

Invited Speaker

 Dr. Osni Marques (Lawrence Berkeley National Laboratory, USA)

Title

 Dealing with Application Development -- Now and Henceforth

Abstract

 The development of simulation codes is often a costly process that results from the combination of the increasingly complex problems to be solved and the evolution of computer architectures. Practitioners are expected to develop highly efficient codes, although emerging computer architectures pose formidable challenges in achieving adequate levels of performance. Code developers usually have a range of programming choices (MPI, OpenMP, PGAS languages, CUDA, and the emerging OpenACC), but their benefits may not be clear. To ease the development process, scientific software libraries are increasingly used in simulation codes: in many cases, this approach has lessened the development effort, contributed to an optimal usage of the available computational resources, and lessened issues related to portability and application lifecycle. However, how will advances in programming and hardware impact libraries? This presentation will discuss some of these issues.

16:30 - 17:15

Invited Speaker

 Mr. Kazuma Yamamoto, Mr. Yasuyuki Maeda, Mr. Yasunori Futamura, Professor Tetsuya Sakurai (Department of Computer Science, University of Tsukuba, Japan)

Title

 Adaptive parallel algorithm for stochastic estimation of nonlinear eigenvalue density

Abstract

 A numerical method that estimates the eigenvalue density of nonlinear eigenvalue problems in a specified region has been proposed. Nonlinear eigenvalue problems arise in science and engineering. Since eigensolvers require parameter settings that depend on the eigenvalues, accuracy and parallel efficiency can be improved by using the eigenvalue density. In this presentation, we propose an algorithm for efficient execution of the estimation method on parallel computers. The conventional approach requires the solution of a linear system at each integration point, and these points are uniformly distributed on the complex plane. This causes load imbalance and a large computational cost because the solution time of the linear systems varies. The proposed master-worker type adaptive algorithm improves the load balance and reduces the computational cost by placing integration points according to the density of eigenvalues in the specified region. Moreover, we propose a look-ahead algorithm that balances the loads more efficiently by recycling the variables in the linear solver. We evaluate the efficiency of the proposed algorithms with several numerical examples.


【Regular Presentations】

17:25 - 18:10

Speaker

 Satoshi Itoh (Information Technology Center, The University of Tokyo, Japan)

Title

 Study of plugging-in AT mechanism in OpenFOAM

Abstract

 OpenFOAM is an open source CFD software package. Because it is free software and developers can describe the governing equations simply through its intuitive interface, it is widely used. OpenFOAM is based on the finite volume method (FVM), so its main application is CFD. However, it is difficult to achieve high performance with it on high-end machines such as supercomputers. We are developing ppOpen-AT, an auto-tuning (AT) infrastructure for ppOpen-HPC. ppOpen-HPC is numerical middleware for the post-petascale era, and one of its features is an auto-tuning mechanism (ppOpen-AT). We chose OpenFOAM as one of the test applications. In this study, we optimize OpenFOAM manually as a first step toward auto-tuning. We show numerical results on T2K and discuss the AT methodology for OpenFOAM.

18:10 Closing Remarks

 Takahiro Katagiri (The University of Tokyo)


18:40 Banquet near Nezu Station

-------------------------------------------------------------------------------
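Banquet attendance form (send to katagiri@cc.u-tokyo.ac.jp)
 Deadline: Sunday, April 15


Date and time: Wednesday, April 25, 2012, from 18:40
Place: Near Nezu
Fee: About 4,000 yen


I will attend the ASE Seminar banquet.


Name:
Affiliation:
Remarks: If you need a receipt, please note it here. (Addressee:      )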
-------------------------------------------------------------------------------

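Seminar Format

  • The seminar is open to the public and is not limited to users of the Center.
  • Participation is free of charge and, in general, advance registration is not required.

If you would like to be sure of receiving notices of future seminars, please register for the mailing list. To request registration, please use the contact information below.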

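Contact for This Seminar

2-11-16 Yayoi, Bunkyo-ku, Tokyo 113-8658, Japan
Information Technology Center, The University of Tokyo


ASE Seminar Organizer: Associate Professor Takahiro Katagiri
E-mail: katagiri@cc.u-tokyo.ac.jp
(Please change the "@" to a half-width character before sending.)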