Weight Space Representation Learning via Neural Field Adaptation

Zhuoqian Yang, Mathieu Salzmann, Sabine Süsstrunk
EPFL CVPR 2026
TL;DR
We show that constraining neural field weight optimization to a structured low-rank subspace, via a pretrained base model and multiplicative LoRA, enables the weights themselves to serve as semantically meaningful data representations for both discriminative and generative tasks.

Key Insight

01
Independently optimized neural network weights can exhibit semantic structure when constrained by appropriate inductive biases: a pre-trained base model combined with low-rank adaptation.
02
Multiplicative LoRA (mLoRA-Asym) weights converge to a linear mode during optimization. This structured weight space geometry correlates strongly with generative performance in diffusion models, making downstream learning significantly easier.
Overview: LoRA-based weight space representation

Overview of the weight space representation pipeline. Each data instance is encoded as a set of low-rank weight deltas relative to a shared base model.

Weight space structure analysis

Weight space structure analysis. mLoRA-Asym weights exhibit strong linear mode connectivity, a key geometric property that enables high-quality generation.

This work takes a different approach from prior weight space learning: rather than building model-agnostic weight encoders that process weights as external data, we focus on enforcing structure directly on the weight space itself, making the weights serve as effective representations without additional encoding steps.

Method

The core challenge: neural network weights have permutation symmetry. Reordering neurons produces a different weight vector encoding the exact same function. This makes the weight distribution wildly multi-modal, and learning over it nearly impossible.

Goal: Collapse equivalent weight configurations into a single canonical form. Remove the symmetry, and the weight space becomes smooth, structured, and learnable.
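The permutation symmetry described above is easy to verify numerically. The following sketch (illustrative only; the tiny MLP and its sizes are arbitrary, not the paper's architecture) shows that reordering hidden neurons changes the weight vector but not the function:

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny 2-layer MLP: x -> relu(W1 @ x) -> W2 @ h
W1 = rng.standard_normal((8, 4))   # hidden x input
W2 = rng.standard_normal((3, 8))   # output x hidden
x = rng.standard_normal(4)

def mlp(W1, W2, x):
    return W2 @ np.maximum(W1 @ x, 0.0)

# Permute the hidden neurons: rows of W1 and matching columns of W2.
perm = rng.permutation(8)
W1_p, W2_p = W1[perm], W2[:, perm]

# Same function, different point in weight space.
assert np.allclose(mlp(W1, W2, x), mlp(W1_p, W2_p, x))
assert not np.allclose(W1, W1_p)
```

Every such permutation yields a distinct weight vector computing the identical function, which is what makes the raw weight distribution multi-modal.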
External
Permutations of base network neurons create equivalent weight configs across all LoRA instances.

Fix: Share a single pre-trained base model. All instances reference the same neuron ordering. Symmetry eliminated.
Internal
The rank dimensions inside the LoRA factors can be freely mixed. For any invertible G:
(BG)(G⁻¹A) = BA
This gives a GL(r)-fold equivalence class per function.
Fix: Asymmetric masking.
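The internal GL(r) symmetry can likewise be checked directly. A minimal numpy sketch (dimensions chosen arbitrarily for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
r, d_in, d_out = 4, 16, 16
A = rng.standard_normal((r, d_in))    # down-projection factor
B = rng.standard_normal((d_out, r))   # up-projection factor

# Any invertible G acting on the rank dimension leaves the product unchanged.
G = rng.standard_normal((r, r))
assert abs(np.linalg.det(G)) > 1e-6   # invertible with probability 1
B_g, A_g = B @ G, np.linalg.inv(G) @ A

# (BG)(G^-1 A) = BA: infinitely many factor pairs, one weight delta.
assert np.allclose(B @ A, B_g @ A_g)
```

Permutations of the rank dimension are just the special case where G is a permutation matrix, so without further constraints every LoRA instance sits in a continuous equivalence class.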

Our solution for permutation symmetries:

LoRA Factorization Each layer's adapted weight is formed multiplicatively as W′ = W ⊙ (BA) (multiplicative LoRA). The rank components can still be freely reordered.
Asymmetric Masking For every layer, randomly freeze √d_out entries per row of A. The frozen pattern is shared across all instances but unique per layer.
Symmetry Broken Each rank component now has a unique fingerprint from its frozen positions. Permuting ranks changes the mask pattern, so equivalent configs no longer exist.
Why multiplicative LoRA? Frozen entries are simply zeroed out, circumventing feature entanglement in neural fields. The result: weights converge to a linear mode with consistent structure across random initializations.
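The three steps above can be sketched as follows. This is a schematic of one mLoRA-Asym layer under our reading of the text: the exact masking granularity (which entries of A are frozen, and that frozen means zeroed) is an assumption for illustration, and the dimensions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
d_out, d_in, r = 16, 16, 4

W = rng.standard_normal((d_out, d_in))     # shared pre-trained base weight (frozen)

# Per-layer frozen pattern: zero out sqrt(d_out) random entries in each
# row of A. The pattern is fixed once and shared across all data instances.
n_frozen = int(np.sqrt(d_out))
mask = np.ones((r, d_in))
for i in range(r):
    mask[i, rng.choice(d_in, size=n_frozen, replace=False)] = 0.0

A = rng.standard_normal((r, d_in)) * mask  # trainable factor, masked
B = rng.standard_normal((d_out, r))        # trainable factor

def mlora_forward(x):
    # Multiplicative LoRA: BA rescales the base weight elementwise.
    W_adapted = W * (B @ (A * mask))       # mask re-applied every step
    return W_adapted @ x

x = rng.standard_normal(d_in)
y = mlora_forward(x)
assert y.shape == (d_out,)
```

Because each rank component carries its own frozen pattern, swapping two rows of A no longer produces a valid configuration, which is how the GL(r) equivalence class is collapsed.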

Results

Qualitative generation results

Qualitative generation results. mLoRA-Asym produces sharper, more coherent outputs compared to baselines across both 2D and 3D data.

Take-away: The first successful weight-space generation on high-resolution natural images.

Reconstruction Quality

Method               FFHQ #Params   FFHQ PSNR ↑   ShapeNet #Params   ShapeNet CD ↓
Standalone MLP             27,357         35.11             30,196            2.57
LoRA (additive)            27,395         35.2              29,696            3.10
mLoRA-Asym (Ours)          26,307         36.91             27,539            2.41

mLoRA-Asym achieves the best reconstruction quality on both 2D faces (FFHQ) and 3D shapes (ShapeNet).

Generation Quality (Latent Diffusion on Weights)

Method               FFHQ FD ↓   ShapeNet Multi FD ↓
HyperDiffusion           0.241                 0.117
mLoRA-Asym (Ours)        0.073                 0.026

Structured weight space geometry directly translates to generation quality: mLoRA-Asym reduces Fréchet Distance by over 3× on FFHQ and 4× on ShapeNet Multi compared to HyperDiffusion.

Discriminative Analysis: ShapeNet Classification

Method               Logistic Regression Accuracy ↑
Standalone MLP                                78.1%
mLoRA-Asym (Ours)                             90.0%

A linear classifier achieves 90% accuracy over 10 ShapeNet categories, confirming that the weight space encodes semantic structure.
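The linear-probe setup can be sketched as follows. Note the features here are synthetic Gaussian clusters standing in for the flattened per-instance mLoRA-Asym deltas (which we do not have access to in this sketch); only the probe protocol matches the experiment.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

# Stand-in features: in the experiment, each ShapeNet instance's flattened
# weight deltas would be one row of X. Here we fabricate 10 clusters.
n_classes, n_per, dim = 10, 30, 64
centers = rng.standard_normal((n_classes, dim)) * 3.0
X = np.concatenate([c + rng.standard_normal((n_per, dim)) for c in centers])
y = np.repeat(np.arange(n_classes), n_per)

# A plain linear probe, no nonlinear feature extraction.
probe = LogisticRegression(max_iter=1000).fit(X, y)
acc = probe.score(X, y)
assert acc > 0.9  # separable representations -> a linear probe suffices
```

The point of the probe is that no learned encoder sits between the weights and the classifier: if a linear model separates the classes, the structure lives in the weight space itself.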

t-SNE visualization of weight space

t-SNE visualization of weight representations. mLoRA-Asym weights form tight, well-separated clusters per semantic category, demonstrating strong structure in weight space.

Citation

@inproceedings{yang2026wsr,
  title     = {Weight Space Representation Learning via Neural Field Adaptation},
  author    = {Yang, Zhuoqian and Salzmann, Mathieu and S{\"u}sstrunk, Sabine},
  booktitle = {Proceedings of the IEEE/CVF Conference on
               Computer Vision and Pattern Recognition (CVPR)},
  year      = {2026}
}