Making Neural Networks Linear Again: Projection and Beyond

Sun 16.02 10:30 - 11:30

Abstract: Every day, somewhere, a researcher mutters, “If only neural networks were linear, this problem would be solved.” Linear operations offer powerful tools: projection onto subspaces, eigendecomposition, and more. This talk explores their equivalents in the non-linear world of neural networks, with a special focus on projection, generalized by idempotent operators, i.e., operators satisfying f(f(x)) = f(x). The Idempotent Generative Network (IGN) is a generative model trained by enforcing two main objectives: (1) data from the target distribution map to themselves, f(x) = x, which defines the target manifold; and (2) latents project onto this manifold via the idempotence condition f(f(z)) = f(z). IGN generates data in a single step, can refine its outputs iteratively, and projects corrupted data back onto the distribution. This projection ability gives rise to Idempotent Test-Time Training (IT³), a method for adapting models at test time using only the current out-of-distribution (OOD) input. During training, the model f receives an input x along with either the ground-truth label y or a neutral "don't know" signal ∅. At test time, given a corrupted/OOD input x, a brief training session minimizes ||f(x, f(x, ∅)) - f(x, ∅)||, making f(x, ⋅) idempotent. IT³ works across architectures and tasks, demonstrated for MLPs, CNNs, and GNNs on corrupted images, tabular data, OOD facial age prediction, and aerodynamic predictions. Finally, I'll ask: "Who says neural networks are non-linear?" They are only non-linear with respect to the standard vector spaces! In ongoing work, we construct vector spaces X and Y with their own addition, negation, and scalar multiplication, under which f: X → Y becomes truly linear. This enables novel applications, including spectral decomposition, zero-shot solutions to non-linear inverse problems via the pseudo-inverse, and architecture-enforced idempotence.
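
To make the stated objectives concrete, below is a minimal sketch in PyTorch (an assumption; the abstract does not specify a framework). The function names ign_losses and it3_adapt, the stop-gradient handling, and the hyperparameters steps and lr are illustrative choices, not the authors' implementation; the code only mirrors the objectives quoted above and omits the rest of the training recipes.

    # Minimal sketch of the two objectives described in the abstract.
    # Assumed: PyTorch; hypothetical model interfaces and hyperparameters.
    import torch
    import torch.nn as nn

    def ign_losses(f: nn.Module, x: torch.Tensor, z: torch.Tensor):
        """IGN objectives as stated in the abstract:
        (1) target data map to themselves:     f(x) = x
        (2) latents project onto the manifold: f(f(z)) = f(z)
        (Gradient-flow details and any additional terms of the full method
        are intentionally left out here.)"""
        recon = (f(x) - x).pow(2).mean()   # enforce f(x) = x on real data
        fz = f(z)
        idem = (f(fz) - fz).pow(2).mean()  # enforce f(f(z)) = f(z) on latents
        return recon, idem

    def it3_adapt(f: nn.Module, x: torch.Tensor, dont_know: torch.Tensor,
                  steps: int = 5, lr: float = 1e-4) -> torch.Tensor:
        """IT³ test-time session: briefly minimize ||f(x, f(x, ∅)) - f(x, ∅)||
        so that f(x, ·) becomes idempotent on the current (possibly OOD) input x.
        Here f takes (input, label-or-∅); detaching the first prediction is a
        simplifying assumption."""
        opt = torch.optim.SGD(f.parameters(), lr=lr)
        for _ in range(steps):
            y0 = f(x, dont_know)        # prediction given the "don't know" signal ∅
            y1 = f(x, y0.detach())      # feed the prediction back in
            loss = (y1 - y0.detach()).pow(2).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
        return f(x, dont_know)          # adapted prediction

The single-step vs. iterative behaviour mentioned in the abstract corresponds to applying f once or repeatedly: since f is trained toward idempotence, extra applications act as refinement/projection steps rather than changing the result arbitrarily.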

Speaker

Assaf Shocher

Berkeley & Nvidia