Parallel and distributed computing, high-performance computing, and supercomputing technologies are regarded today as among the most important directions of scientific and technical development in many leading countries, including Russia. High-performance computing makes it possible to solve many fundamental and applied scientific and technical problems that require large-scale computation.
Purposeful work on a system of parallel and distributed computing education is needed to prepare specialists for the realities of the future superparallel computing world. Only three components together – hardware, software and education – will create a stable basis for the development of the entire high-performance computing domain.
With these issues in mind, the educational community has initiated various activities to promote education in Parallel and Distributed Computing (PDC). A dedicated knowledge area called Parallel and Distributed Computing was included in Informatics Curricula 2013. Starting in 2010, the IEEE Computer Society Technical Committee on Parallel Processing (TCPP), with NSF support, has developed a draft parallel and distributed computing curriculum. In addition, the national Supercomputing Education Project was started in Russia in 2010. The State University of Nizhni Novgorod (UNN) is one of the key participants in the project. The UNN HPC Center was established at the university. The center currently conducts research in High Performance Computing and Supercomputing Technologies and supports teaching in these areas at UNN. Improving the quality of PDC education is among the center's priorities.
The main goal of the project is to update the Bachelor’s degree curriculum at the CMC so that it meets the requirements of the parallel computing world.
The NSF/TCPP Curriculum Initiative recommendations fully correspond to the curriculum renewal plans and are used as the basis for this task.
To implement the project, the team will follow these provisions:
The course is aimed at teaching classical data structures (stacks, queues, lists, trees, tables) and their use in algorithm development.
The goal of the course development is to teach parallel programming for distributed-memory systems, using MPI as the basis.
The renewed course will give additional information on modern high-performance systems that provide huge computational power (more than a petaflops). Students will be taught the basics of the multi-process organization of programs: the concept of a process and how it differs from a thread; process independence and communication; communication problems (overhead of data transfer, blocking while waiting for data, deadlocks); message passing methods (point-to-point and collective operations, asynchronous data transfer, evaluation of communication complexity); and the representation of a parallel program as a set of concurrently executing processes.
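To make "evaluation of communication complexity" concrete, the sketch below uses the common latency-bandwidth (linear) cost model. This model is an assumption chosen for illustration, not necessarily the one used in the course, and the function names and parameter values are hypothetical.

```python
def point_to_point_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimated time to send one message: startup latency plus transfer time."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


def tree_broadcast_time(num_processes, message_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimated time of a broadcast organized as a binomial tree.

    Each step doubles the number of processes holding the data, so
    ceil(log2(p)) point-to-point steps are needed (0 steps for p == 1).
    """
    steps = (num_processes - 1).bit_length()  # integer form of ceil(log2(p))
    return steps * point_to_point_time(message_bytes, latency_s, bandwidth_bytes_per_s)


# Illustrative (made-up) network parameters: 1 microsecond latency, 1 GB/s bandwidth.
one_message = point_to_point_time(1_000_000, 1e-6, 1e9)
broadcast_8 = tree_broadcast_time(8, 1_000_000, 1e-6, 1e9)
```

Under this model the latency term dominates for small messages and the bandwidth term for large ones, which is one reason collective operations are usually analyzed separately from point-to-point transfers.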
The basics of the MPI technology (processes and communicators, operations for sending and receiving messages, typical communication operations) will be taught to students to help them master these concepts and apply them in practice. The technology will be taught using various computational tasks as examples: matrix computations, graph processing, and Monte Carlo methods.
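As a rough illustration of how a Monte Carlo task maps onto processes and a collective operation, the sketch below estimates pi. It is plain Python that only simulates the ranks in a loop, with no MPI library involved; the function names and the per-rank seeding scheme are assumptions made for this example.

```python
import random


def local_hits(rank, samples, seed=12345):
    """Work done by one simulated rank: count random points inside the quarter circle."""
    rng = random.Random(seed + rank)  # separate random stream per rank (illustrative scheme)
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(num_ranks, samples_per_rank):
    """Combine per-rank counts into one estimate of pi."""
    # In an MPI program every rank would call local_hits() for itself and a
    # sum reduction (for example MPI_Reduce with MPI_SUM) would collect the total.
    partial_counts = [local_hits(rank, samples_per_rank) for rank in range(num_ranks)]
    total_hits = sum(partial_counts)
    return 4.0 * total_hits / (num_ranks * samples_per_rank)


pi_estimate = estimate_pi(num_ranks=4, samples_per_rank=50_000)
```

The ranks never exchange data during sampling, so the only communication is the final reduction, which is why Monte Carlo methods are a convenient first exercise for distributed-memory programming.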
In general, the added learning materials correspond to the TCPP Curriculum recommendations on parallel computing for distributed-memory systems.
The course covers the basic concepts, methods and algorithms underlying the architecture and operation of operating systems. The subject includes the models and algorithms used to implement various subsystems, application runtime environments, and the architecture of state-of-the-art operating systems. Special attention is given to multitasking systems and to multiprocess and multithreaded applications, which are critical for developing programs for multiprocessor and multicore systems.
This is the main course for students studying parallel computing. CMC has offered this course since 1995. It is continually updated to cover new achievements in the field of parallel computing.
During this course students study examples and classifications of parallel systems, parallel computing performance factors, the computation and communication complexity of algorithms, advanced topics in OpenMP and MPI, and methods for developing parallel algorithms and programs, using matrix computations, sorting, graph processing, optimization and other tasks as examples. A distinctive feature of the course is that it shows how to predict the efficiency of the developed parallel algorithms and how to confirm the predictions by computational experiments.
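One simple way to predict efficiency and then compare it with an experiment is sketched below. It uses Amdahl's law as the prediction model, which is an assumption made for illustration; the course may rely on more detailed computation and communication models. All names and numbers are hypothetical.

```python
def amdahl_speedup(serial_fraction, processors):
    """Predicted speedup when a fixed fraction of the work cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def efficiency(speedup, processors):
    """Efficiency is speedup divided by the number of processors used."""
    return speedup / processors


def measured_speedup(serial_time_s, parallel_time_s):
    """Speedup obtained from timings of a computational experiment."""
    return serial_time_s / parallel_time_s


# Prediction for a program with 10% serial work on 8 processors.
predicted = amdahl_speedup(0.1, 8)
predicted_efficiency = efficiency(predicted, 8)

# Hypothetical timings from an experiment, in seconds.
observed = measured_speedup(serial_time_s=40.0, parallel_time_s=10.0)
observed_efficiency = efficiency(observed, 8)
```

Comparing `predicted_efficiency` with `observed_efficiency` shows how far the experiment departs from the model; a large gap usually points to costs the model ignores, such as communication or load imbalance.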
The course includes an extended laboratory practicum. Upon completion of the course, each student presents an individual project.
The course will be updated by adding sections on the development of parallel algorithms for computing systems with a hierarchical structure (many nodes with distributed memory, where each node may be a multiprocessor and each processor may be multicore). Developing parallel programs for such systems requires combining OpenMP and MPI. Mastering this approach takes considerable effort and many computational experiments.
The course covers many TCPP Curriculum topics and may serve as a basis for advanced-level courses on parallel computing.
The main objective of the course is to study the basic principles of developing programs that use Intel Xeon Phi efficiently and to acquire the corresponding skills.
The course includes the following topics:
The course examines the construction and the performance analysis of deep neural networks using the Intel® neon™ Framework.
The following topics are covered:
The course examines the practical application of deep learning to real-world computer vision problems that arise in developing video surveillance systems.
The following topics are covered.
The course examines well-known numerical algorithms for solving systems of linear algebraic equations, as well as a range of issues related to parallelizing these algorithms on shared-memory systems.
The course covers the following problems:
The main objective of the course is to study basic iterative algorithms for solving linear systems and to gain experience in developing parallel numerical algorithms that run efficiently on shared-memory systems. It involves solving the following problems:
The main objective of this course is to study numerical methods for solving ordinary and partial differential equations and approaches to their parallelization for shared memory systems. The course involves the following problems:
The purpose of this course is to study mathematical models, methods and technologies of parallel programming for multicore and multiprocessor computers to a degree sufficient for a successful start in the field of parallel programming. The proposed set of knowledge and skills forms a theoretical basis for methods of developing complex programs and includes such topics as the purposes and objectives of parallel data processing, construction principles for parallel computing systems, modeling and analysis of parallel computations, development principles for parallel programs and algorithms, systems for parallel programming, and parallel numerical algorithms for solving standard problems of computational mathematics.
The training course covers parallel programming technology for developing high-performance implementations of time-consuming algorithms to be executed on parallel computing systems with cluster architecture. The lectures cover the fundamental principles of the Message Passing Interface (MPI) technology, the structure of the MPI library, types of communication between processes, derived data types, virtual topologies, and blocking and nonblocking communications in point-to-point and collective modes. The laboratory practice reviews the phases of parallel software development: developing a serial implementation as a baseline for comparison, developing the parallel version, and analyzing it. Training is based on test problems that do not require knowledge of particular application domains beyond what the course itself provides.
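The development phases described above (serial baseline, parallel version, analysis) can be sketched on a small test problem. The example below is a stand-in only: it uses Python threads instead of MPI processes, the dot product and all function names are chosen for illustration, and because of the interpreter's global lock it checks the correctness of the decomposition rather than demonstrating speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def dot_serial(a, b):
    """Phase 1: serial reference implementation used as the comparison baseline."""
    return sum(x * y for x, y in zip(a, b))


def block_bounds(n, parts):
    """Split range(n) into `parts` contiguous blocks of nearly equal size."""
    base, extra = divmod(n, parts)
    bounds, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def dot_parallel(a, b, workers=4):
    """Phase 2: each worker computes a partial sum over its own block."""
    def partial_sum(block):
        lo, hi = block
        return sum(a[i] * b[i] for i in range(lo, hi))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Summing the partial results plays the role of a reduction.
        return sum(pool.map(partial_sum, block_bounds(len(a), workers)))


# Phase 3 (analysis) starts with checking the parallel result against the baseline.
a = list(range(1000))
b = list(range(1000, 2000))
results_match = dot_serial(a, b) == dot_parallel(a, b, workers=4)
```

Integer inputs are used so that the two results can be compared exactly; with floating-point data the summation order differs between the versions and a tolerance would be needed.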
The purpose of this course is to study MPI, one of the technologies of parallel programming. The course also studies mathematical models, methods and technologies of parallel programming for multicore and multiprocessor computers to a degree sufficient for a successful start in the field of parallel programming.
The purpose of this course is to provide the skills and knowledge required for a successful start of professional activity in parallel programming on shared-memory systems. The course covers both the theory of parallel computing with Intel Threading Building Blocks (TBB) and practical knowledge and skills of TBB-based parallel programming.
The main objective of the course is to introduce the basic principles of GPU programming and to help students acquire practical skills in it. The course covers the following topics: