Thread: Circuits
In the original narrative of deep learning, each neuron builds progressively more abstract, meaningful features by composing features in the preceding layer. In recent years, there’s been some skepticism of this view, but what happens if you take it really seriously? InceptionV1 is a classic vision model with around 10,000 unique neurons — a large number, but …
Growing Neural Cellular Automata
This article is part of the Differentiable Self-organizing Systems Thread, an experimental format collecting invited short articles delving into differentiable self-organizing systems, interspersed with critical commentary from several experts in adjacent fields. Most multicellular organisms begin their life as a single egg cell – a single cell …
Visualizing the Impact of Feature Attribution Baselines
Path attribution methods are a gradient-based way of explaining deep models. These methods require choosing a hyperparameter known as the baseline input. What does this hyperparameter mean, and how important is it? In this article, we investigate these questions using image classification networks as a case study. We discuss several different ways to choose a …
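To make the baseline hyperparameter concrete, here is a minimal NumPy sketch of one path attribution method, integrated gradients, applied to a toy model. The model `f`, the example input, and the two baselines compared are assumptions made for illustration, not examples from the article.

```python
# Minimal sketch of integrated gradients with an explicit baseline.
# The toy model f and both baselines below are illustrative assumptions.
import numpy as np

def f(x):
    # Toy differentiable "model": logistic function of a fixed linear score.
    w = np.array([1.5, -2.0, 0.5])
    return 1.0 / (1.0 + np.exp(-x @ w))

def grad_f(x, eps=1e-5):
    # Central-difference approximation to the gradient of f at x.
    g = np.zeros_like(x)
    for i in range(x.size):
        d = np.zeros_like(x)
        d[i] = eps
        g[i] = (f(x + d) - f(x - d)) / (2 * eps)
    return g

def integrated_gradients(x, baseline, steps=50):
    # Average the gradient along the straight-line path from the baseline to x,
    # then scale by (x - baseline): attributions are measured relative to the baseline.
    alphas = np.linspace(0.0, 1.0, steps)
    avg_grad = np.mean([grad_f(baseline + a * (x - baseline)) for a in alphas], axis=0)
    return (x - baseline) * avg_grad

x = np.array([0.8, 0.3, -0.4])
print(integrated_gradients(x, baseline=np.zeros_like(x)))      # zero baseline
print(integrated_gradients(x, baseline=np.full_like(x, 0.5)))  # constant baseline
```

Swapping the baseline changes the attributions even though the model and input stay fixed, which is exactly why the choice matters.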
Computing Receptive Fields of Convolutional Neural Networks
While deep neural networks have overwhelmingly established state-of-the-art results in many artificial intelligence problems, they can still be difficult to develop and debug. Recent research on deep learning understanding has focused on feature visualization, theoretical guarantees, model interpretability, and generalization. In this work, we analyze deep neural networks from a complementary …
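As a quick illustration of the arithmetic involved, the sketch below applies the standard receptive-field recurrences (the effective stride multiplies by each layer's stride, and the receptive field grows by (kernel − 1) × stride accumulated so far) to a hypothetical stack of layers; the stack itself is made up for illustration.

```python
# Sketch of the standard recurrences for the receptive field of a stack of
# convolution / pooling layers: the effective stride ("jump") multiplies by
# each layer's stride, and the receptive field grows by (kernel - 1) * jump.
def receptive_field(layers):
    """layers: list of (kernel_size, stride) pairs, in input-to-output order."""
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf, jump

# Hypothetical stack: three 3x3 convs with stride 2, then one 3x3 conv with stride 1.
rf, jump = receptive_field([(3, 2), (3, 2), (3, 2), (3, 1)])
print(f"receptive field: {rf} input pixels, effective stride: {jump}")
```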
The Paths Perspective on Value Learning
In the last few years, reinforcement learning (RL) has made remarkable progress, including beating world-champion Go players, controlling robotic hands, and even painting pictures. One of the key sub-problems of RL is value estimation – learning the long-term consequences of being in a state. This can be tricky because future returns are generally noisy, …
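For a concrete instance of value estimation, here is a minimal tabular TD(0) sketch; the two-state environment, discount factor, and step size are illustrative assumptions rather than anything taken from the article.

```python
# Minimal tabular TD(0) sketch on a made-up two-state loop: the value of a
# state is nudged toward the one-step bootstrapped target r + gamma * V(s').
gamma, alpha = 0.9, 0.1
V = {"A": 0.0, "B": 0.0}                           # value estimates
transitions = {"A": ("B", 0.0), "B": ("A", 1.0)}   # state -> (next state, reward)

s = "A"
for _ in range(10_000):
    s_next, r = transitions[s]
    V[s] += alpha * (r + gamma * V[s_next] - V[s])  # TD(0) update
    s = s_next

print(V)  # estimated long-term return of being in A vs. B
```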
Learning from Incorrectly Labeled Data
Section 3.2 of Ilyas et al. (2019) shows that training a model on only adversarial errors leads to non-trivial generalization on the original test set. We show that these experiments are a specific case of learning from errors. We start with a counterintuitive result — we take a completely mislabeled training set (without modifying the inputs) and …
Adversarial Examples are Just Bugs, Too
We demonstrate that there exist adversarial examples which are just “bugs”: aberrations in the classifier that are not intrinsic properties of the data distribution. In particular, we give a new method for constructing adversarial examples which: do not transfer between models, and do not leak “non-robust features” which allow for learning, in the sense of …
Adversarially Robust Neural Style Transfer
A figure in Ilyas et al. that struck me as particularly interesting was the following graph, showing a correlation between adversarial transferability across architectures and their tendency to learn similar non-robust features: adversarial transferability vs. test accuracy of different architectures trained on ResNet-50’s non-robust features. One way to interpret this graph is that it shows …
Two Examples of Useful, Non-Robust Features
Ilyas et al. define a feature as a function $f$ that takes $x$ from the data distribution $(x, y) \sim \mathcal{D}$ into a real number, restricted to have mean zero and unit variance. A feature is said to be …
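Written out, the normalization constraints in that definition are as follows (the domain symbol $\mathcal{X}$ is notation chosen here for the input space):

```latex
f : \mathcal{X} \to \mathbb{R}, \qquad
\mathbb{E}_{(x, y) \sim \mathcal{D}}\!\left[ f(x) \right] = 0, \qquad
\mathbb{E}_{(x, y) \sim \mathcal{D}}\!\left[ f(x)^{2} \right] = 1 .
```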