Digging into Deep Learning through Algorithms and Hardware Architectures
Prof. Brad McDanel’s research focuses on the design of specialized algorithms and hardware architectures to make deep learning more energy efficient. More broadly, he works across the areas of deep learning, computer vision, hardware architecture and computer networks.
“I was initially drawn to working in this area, like many researchers, as a result of all the interesting achievements we have seen in the past decade due to the development of deep learning,” says the computer scientist. “However, these systems are quite computationally expensive, requiring a large amount of energy to operate. Given recent trends, this energy efficiency problem appears to be getting worse over time as larger deep learning systems are being developed and deployed for a variety of new problems.”
Prof. McDanel’s main goal is to discover new techniques for improving the computational efficiency of neural networks without hurting model performance (how well the system solves the problem). He has been actively working on several lines of research in this area (quantization, pruning, adaptive computation, hardware design) over the past five years. Recently, the professor has begun focusing more on improving the efficiency of the training phase, as it is less well studied by the community than the efficiency of deployed models. “In the long-term,” the professor comments, “I am interested in studying alternative ideas for training neural networks that do not rely on the current standard approach, for example, gradient-based methods such as SGD.”
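To give a flavor of one of these lines of work, here is a minimal post-training weight-quantization sketch. The bit width, symmetric scaling scheme, and tensor shapes are illustrative only and are not drawn from Prof. McDanel's own designs:

```python
import numpy as np

def quantize_weights(w, num_bits=8):
    """Uniformly quantize a float weight tensor to signed num_bits integers.

    Returns the integer codes and the scale needed to recover approximate
    float values. Smaller bit widths save memory and arithmetic energy at
    the cost of larger rounding error.
    """
    qmax = 2 ** (num_bits - 1) - 1            # e.g. 127 for 8 bits
    scale = np.max(np.abs(w)) / qmax          # map the largest weight to qmax
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map integer codes back to approximate float weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
q, scale = quantize_weights(w, num_bits=8)
w_hat = dequantize(q, scale)
```

With 8 bits, the per-element reconstruction error is bounded by half the scale, which is why quantized inference can often match full-precision accuracy while using a quarter of the memory of 32-bit floats.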
The computer scientist is very interested in involving students in his research. “One of the fun parts of deep learning research is that anyone can start experimenting by training small models on their personal computer within an hour of looking into it,” he says. “This sort of experimentation is what initially drew me to working in this area. I am in the process of building a large state-of-the-art system with multiple GPUs to perform larger training tasks. Students will also get a chance to operate the system.”
One research project that gained attention, co-authored with collaborators Surat Teerapittayanon and HT Kung, concerned performing computation in deep neural networks (DNNs) on a conditional basis. The core idea is that some samples can be processed quickly with little effort, while others are more difficult and need additional attention. This general idea is not new and bears some relation to Viola–Jones object detection from the early 2000s. Their main innovation was a joint-training framework across multiple decision points that improves the performance of such a system.
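The conditional-computation idea above can be sketched as an early-exit forward pass. This is a generic illustration, not code from the paper: the entropy-based confidence test, threshold value, toy stages, and class count are all made up for the example:

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)                        # subtract max for numerical stability
    e = np.exp(z)
    return e / np.sum(e)

def entropy(p):
    """Entropy of a probability vector; low entropy = confident prediction."""
    return -np.sum(p * np.log(p + 1e-12))

def early_exit_forward(x, stages, exit_heads, threshold=0.5):
    """Run network stages in order. After each stage, a cheap exit head
    classifies the intermediate features; if that prediction is confident
    enough (entropy below the threshold), return early and skip the
    remaining, more expensive stages."""
    h = x
    for i, (stage, head) in enumerate(zip(stages, exit_heads)):
        h = stage(h)                         # main-branch computation
        p = softmax(head(h))                 # side-branch classifier
        if entropy(p) < threshold or i == len(stages) - 1:
            return int(np.argmax(p)), i      # (predicted class, exit taken)

# Toy 3-stage "network": random linear stages with 4-class exit heads.
rng = np.random.default_rng(1)
stages = [lambda h, W=rng.standard_normal((8, 8)): np.tanh(W @ h)
          for _ in range(3)]
exit_heads = [lambda h, W=rng.standard_normal((4, 8)): W @ h
              for _ in range(3)]
pred, exit_taken = early_exit_forward(rng.standard_normal(8), stages, exit_heads)
```

Easy samples exit at an early decision point and skip most of the computation; hard samples run the full depth. The joint-training contribution described in the article addresses the companion problem: training all exit heads together so the early decision points stay accurate.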
“I plan on continuing some collaboration with my PhD advisor at Harvard University on designing efficient deep learning hardware. Additionally, I will soon start collaborating with a group at a research university in Brazil, the Federal University of Rio Grande do Norte. This group focuses on improving the efficiency of deep learning systems for embedded microcontrollers,” comments the professor about his future plans.
Prof. McDanel received his Ph.D. in Computer Science from Harvard University and his M.S. and B.S. in Computer Science from Wake Forest University. He enjoys playing chess and participated in tournament chess in school. He mostly plays online these days and is open to accepting challenges if any student is interested. The professor’s free time “is mostly spent chasing my 18-month-old daughter around our apartment.”