5449: Introduction to High-Performance Deep Learning
Instructors: Prof. Dhabaleswar K. (DK) Panda and Dr. Hari Subramoni
Autumn 2022
Course Number: 5449
Class Number: 37009 (Grad) and 37010 (Undergrad)
Credits: 3
Course Time: WF 11:10 am - 12:25 pm
Classroom: Pomerene Hall 250
Course Description:
Recent advancements in Artificial Intelligence (AI) have been fueled
by the resurgence of Deep Neural Networks (DNNs); Deep Learning (DL)
frameworks like PyTorch, TensorFlow, MXNet, and Chainer; Machine
Learning (ML) algorithms like K-means; and data science frameworks
like Dask.
DNNs have found widespread applications in classical areas like
Image Recognition, Speech Processing, and Textual Analysis, as well
as in areas like Cancer Detection, Medical Imaging, Physics,
Materials Science, and even Autonomous Vehicle systems. However,
scaling distributed training with scale-up and scale-out approaches
is still challenging. This is leading to the emergence of a new
field called "High-Performance Deep Learning".
The objectives of this course are to understand the principles and
practice of this emerging field, its open set of challenges, and how
modern HPC technologies can be used to accelerate DL training.
Topics to be Covered
- High-Performance Deep Learning: Issues, Trends, and Challenges
- Introduction to Deep Learning and Terminologies
- The Past, Present, and Future of Deep Learning
- What are Deep Neural Networks?
- Diverse Applications of Deep Learning
- Overview of commonly used Terminologies
- Introduction to HPC Technologies
- GPUs, CPUs, and TPUs
- High-Performance Networking (InfiniBand, HSE and RoCE)
- MPI, CUDA-Aware MPI, NCCL
- DGX-1, DGX-2, IBM Power-AI
- Overview of DL and ML Frameworks
- TensorFlow
- Facebook Torch/PyTorch
- Caffe and Caffe2
- Chainer/ChainerMN
- MXNet
- LBANN
- Horovod
- DeepSpeed
- Overview of State-of-the-art DL Models
- ImageNet and VGG
- GoogLeNet
- ResNet
- NASNet
- DeepSpeech
- Challenges for Exploiting HPC for DL
- Overall Challenges
- Need for Co-Design
- The need for Co-Designing DL/ML frameworks and HPC Middleware
- Solutions and Case Studies
- Generalized Solutions
- NVIDIA NCCL, Baidu-allreduce, Facebook Gloo
- CPU and GPU-Based Training
- Co-Designs
- Deep Learning and Big Data
- Latest and Emerging Trends
- Larger-scale Models
- Out-of-core DNNs
- Standardization on Benchmarking, OpenAI, ONNX
- SqueezeNet
- Neural Network Processors (Graphcore, Loihi, Pohoiki, Habana, Cerebras, etc.)
Text:
Selected papers from the literature, including papers focusing on
past and ongoing research activities in the group.
Laboratory Exercises:
The course will involve laboratory exercises for students to
experiment with Deep Learning frameworks. These exercises will be
carried out on OSC (Ohio Supercomputer Center) clusters using
GPUs. This will give students hands-on knowledge in the area
of high-performance deep learning.
Last Updated: June 15, 2022