Top Five Takeaways from NeurIPS 2018

Although I’ve known about NIPS since my physics days, I never could have imagined the sheer scale of this world-famous conference on Artificial Intelligence. Renamed “NeurIPS” in 2018, the 32-year-old AI conference had over 1,000 accepted papers and nearly 8,000 registrants in attendance in Montreal for one day of tutorials; three days of invited talks, parallel sessions, and poster sessions; and two full days of specialized workshops. Despite the scale and significant tech industry presence, NeurIPS is first and foremost an academic science conference.
The sheer variety and volume of topics at NeurIPS make it impossible to see everything at the conference. Because of this, other attendees might come away with quite different takeaways than mine. I should also mention some personal disappointment that ethical issues in the field (such as diversity, systemic bias, and privacy) were not more broadly addressed at the conference. With that in mind, here are my top five NeurIPS (formerly NIPS) takeaways:
  1. Subtle Architectural Advancements

After a few years of major architectural papers and announcements, such as Generative Adversarial Networks and Capsule Networks, 2018 was a quiet year in comparison. My own view is that the major architectural advance in 2018 belongs to Neural Ordinary Differential Equations (already being dubbed ‘ODENets’ by some) by Ricky Chen et al. at the University of Toronto (https://arxiv.org/pdf/1806.07366.pdf). The paper also shared top honors in the NeurIPS organizing committee’s best paper category.
‘ODENets’ aren’t new; you can find a good introduction at https://srome.github.io/Using-Ordinary-Differential-Equations-To-Design-State-of-the-Art-Residual-Style-Layers/. What is new in the paper by Chen et al. is the thorough investigation of ODENet properties, including tests on image recognition and continuous-time models. Both will likely be relevant for deep learning on videos, and continuous-time models would likely be helpful for challenging time series modeling tasks such as connected devices with a high sampling rate. Possibly the most useful result of the paper, however, is showing how well ODENets perform in density estimation tasks.
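To make the core idea concrete, here is a minimal PyTorch sketch (my own illustration, not code from the paper) contrasting a discrete residual update with an ODE block that integrates learned dynamics over a time interval. The `ODEFunc` module, layer sizes, and fixed-step Euler loop are illustrative stand-ins for the adaptive, adjoint-based solvers the authors actually use.

```python
import torch
import torch.nn as nn

class ODEFunc(nn.Module):
    """Illustrative learned dynamics f(h, t): the right-hand side of dh/dt = f(h, t)."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.Tanh(), nn.Linear(64, dim))

    def forward(self, t, h):
        return self.net(h)

def residual_block(h, f):
    # A ResNet layer is one discrete step: h_{t+1} = h_t + f(h_t)
    return h + f(0.0, h)

def ode_block(h, f, t0=0.0, t1=1.0, steps=10):
    # An ODE block instead integrates dh/dt = f(h, t) from t0 to t1.
    # Fixed-step Euler here purely for illustration; the paper uses
    # adaptive solvers with the adjoint method for memory efficiency.
    dt = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        h = h + dt * f(t, h)
        t += dt
    return h

f = ODEFunc(dim=8)
h0 = torch.randn(4, 8)               # batch of 4 hidden states
print(residual_block(h0, f).shape)   # torch.Size([4, 8])
print(ode_block(h0, f).shape)        # torch.Size([4, 8])
```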
  2. Improved fundamental understanding of existing approaches

To me, ‘Sanity Checks’ was the dominant theme of NeurIPS 2018. Quite a few papers focused on improving fundamental understanding of neural network architectures. A good ‘sanity check’ example comes from Pierre Baldi of UC Irvine. He defined Neuronal Capacity using arguments from information theory (https://papers.nips.cc/paper/7999-on-neuronal-capacity) to show that deeper neural networks have “less neuronal capacity” than shallow networks, meaning that shallower networks can approximate more functions. However, the functions deep neural networks can learn are smoother and better behaved, thereby acting as another form of regularization that he termed ‘structural regularization’.
Another good ‘sanity check’ paper is from Shibani Santurkar et al., “How Does Batch Normalization Help Optimization?”. They convincingly demonstrate (http://papers.nips.cc/paper/7515-how-does-batch-normalization-help-optimization) that Batch Norm makes gradients more stable and smooths the loss landscape, thereby making neural network training faster and more reliable. If BatchNorm can be applied to your use case, definitely use it.
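As a practical illustration (my own sketch, not code from the paper), the standard PyTorch pattern is simply to insert a Batch Norm layer between a convolution and its nonlinearity; the layer sizes below are arbitrary placeholders.

```python
import torch
import torch.nn as nn

# A typical conv block with Batch Norm between the convolution and the
# nonlinearity; channel and kernel sizes are arbitrary placeholders.
block = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1),
    nn.BatchNorm2d(32),   # normalizes each channel's activations over the batch
    nn.ReLU(),
)

x = torch.randn(16, 3, 28, 28)   # batch of 16 RGB "images"
print(block(x).shape)            # torch.Size([16, 32, 28, 28])
```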
Other authors pushed popular algorithms forward. U-Net is a popular architecture for image segmentation, and a team at DeepMind defined a ‘probabilistic U-Net’ (https://arxiv.org/abs/1806.05034) for applications such as medical imaging, where image segmentation can strongly depend on the underlying pathology. Graph Neural Networks are a perpetually popular topic, and Stanford presented an innovative pooling technique (https://arxiv.org/abs/1806.08804) to summarize (i.e., ‘pool’) local information in a graph.
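To sketch the pooling idea (an illustrative rendering of the paper’s coarsening step, with made-up tensor shapes and a random assignment matrix standing in for the learned one): a soft assignment matrix S maps the n original nodes onto k coarser clusters, and both the node features and the adjacency matrix are pooled through S.

```python
import torch

def coarsen_graph(X, A, S):
    """Soft graph pooling in the spirit of DiffPool (illustrative shapes only).
    X: (n, d) node features, A: (n, n) adjacency,
    S: (n, k) soft assignment of the n nodes to k clusters (rows sum to 1)."""
    X_pooled = S.T @ X       # (k, d)  pooled cluster features
    A_pooled = S.T @ A @ S   # (k, k)  pooled cluster adjacency
    return X_pooled, A_pooled

n, d, k = 6, 4, 2
X = torch.randn(n, d)
A = (torch.rand(n, n) > 0.5).float()
# In the paper, S comes from a separate GNN followed by a row-wise softmax;
# here a random softmaxed matrix just demonstrates the shapes.
S = torch.softmax(torch.randn(n, k), dim=1)
X_p, A_p = coarsen_graph(X, A, S)
print(X_p.shape, A_p.shape)   # torch.Size([2, 4]) torch.Size([2, 2])
```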
There are many other examples in addition to the papers above. It is encouraging to see researchers both in universities and in the private sector spending significant time to understand how neural networks function.
  3. Popular algorithms were shown to have substantial limitations

The other side of the ‘sanity checks’ theme is that many popular algorithmic approaches were shown to have substantial limitations. Q-learning is a popular algorithm for learning policies (which action to take) in Reinforcement Learning; however, Google Brain showed that Q-learning can suffer from a severe form of bias known as ‘delusional bias’ (in their NeurIPS paper “Non-delusional Q-learning and Value Iteration”). In image recognition, several popular techniques exist to construct saliency maps from neural network gradients to rank which input pixels are most important in feature construction and classification. One of the most important NeurIPS results again comes from Google Brain (http://papers.nips.cc/paper/8160-sanity-checks-for-saliency-maps), which developed a rigorous saliency map testing framework. Unfortunately, several popular approaches failed the tests in this framework, calling into question many previously published research results in the image recognition domain.
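For context on what such tests probe, here is a minimal sketch (mine, not from the paper) of the simplest gradient-based saliency map: the gradient of a class score with respect to the input pixels, with a tiny placeholder classifier standing in for a real network.

```python
import torch
import torch.nn as nn

# Placeholder classifier; any differentiable model works the same way.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
model.eval()

x = torch.randn(1, 1, 28, 28, requires_grad=True)   # one "image"
score = model(x)[0, 3]    # score of an arbitrary target class
score.backward()          # backpropagate to the input pixels

# Vanilla gradient saliency: |d score / d pixel| for each pixel.
saliency = x.grad.abs().squeeze()
print(saliency.shape)     # torch.Size([28, 28])
```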
  4. Improved collaboration between researchers/engineers and stakeholders is key to driving AI forward

“Those [Reinforcement Learning] simulations may help make results reproducible, which is good, but they miss a lot of the complexity of the natural world, which may make the work less meaningful, and less rigorous” – Joelle Pineau
Encouragingly, this was a widely held sentiment at NeurIPS 2018. “From personal experience, 90% of our failed projects were because of poor communication and/or collaboration, not lack of technical expertise,” remarked a managing Machine Learning Engineer at Doc.ai. In her talk at the NeurIPS ML4Health workshop, Barbara Engelhardt observed that physicians’ trust in Machine Learning is currently fragile, and that machine learning practitioners need to collaborate deeply with medical professionals for artificial intelligence to truly have an impact on health care.
 
  5. AI in Drug Development is still Young

Although there were nearly 8,000 participants at NeurIPS 2018, representation from the Pharma industry was relatively small (see, for example, https://medium.com/@longevity/how-serious-is-big-pharma-about-ai-in-drug-discovery-f4cfe23cfe85). Additionally, many professionals in the tech industry lump the challenges of Machine Learning in health care and drug development together under one broad brushstroke of “health care,” even though the problems are often quite different.
Nevertheless, it was great to attend an ‘Artificial Intelligence in Drug Development’ lunch with researchers from industry applying the latest advancements in AI/ML to drug development, including representatives from GSK, Pfizer, and the AI startups InSitro, phonemic.ai, and benevolent.ai. Beyond discussing the usual challenges of applying these approaches to complex, multidimensional data, there was universal agreement that the field is in desperate need of standardized benchmark datasets for comparing algorithmic approaches (a direct analogy to common benchmark datasets in the image recognition domain, such as MNIST and CIFAR). There was also a desire to have a dedicated ‘AI in Drug Development’ workshop at a future NeurIPS conference.
