Research·arXiv cs.LG·Jun 24

An Analysis of Posterior Collapse, Parameterization and Initialization in Variational Deep Gaussian Processes

Deep Gaussian Processes (DGPs) face a fundamental training failure in which the variational posterior collapses to the prior, effectively ignoring the training data. This paper traces posterior collapse to the DSVI (doubly stochastic variational inference) algorithm and the standard linear prior mean initialization across hidden layers, revealing why a seemingly beneficial design choice undermines learning. The finding matters because DGPs remain attractive for uncertainty quantification in safety-critical domains, and fixing this pathology could unlock practical deployment where current methods fail silently.

Modelwire context

Explainer

The paper identifies that posterior collapse in Deep Gaussian Processes isn't inevitable but stems from a specific algorithmic choice (DSVI) combined with a conventional initialization pattern. The finding is that the failure is reproducible and traceable, not a mysterious training instability.
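One way to make "collapse to the prior" concrete is to monitor the KL divergence between a layer's variational posterior and its prior: a value near zero means the layer has learned nothing beyond the prior. The sketch below is a generic NumPy diagnostic, not the paper's method. The RBF kernel, the inducing inputs `Z`, the linear prior mean `0.5 * Z`, and the "informed" posterior are all hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_kl(m_q, S_q, m_p, S_p):
    """KL( N(m_q, S_q) || N(m_p, S_p) ) for full-covariance Gaussians."""
    k = m_q.shape[0]
    L_p = np.linalg.cholesky(S_p)
    L_q = np.linalg.cholesky(S_q)
    # Use triangular factors instead of forming S_p^{-1} explicitly.
    A = np.linalg.solve(L_p, L_q)        # L_p^{-1} L_q
    d = np.linalg.solve(L_p, m_p - m_q)  # L_p^{-1} (m_p - m_q)
    trace_term = np.sum(A ** 2)          # tr(S_p^{-1} S_q)
    maha_term = d @ d                    # (m_p - m_q)^T S_p^{-1} (m_p - m_q)
    logdet_p = 2.0 * np.sum(np.log(np.diag(L_p)))
    logdet_q = 2.0 * np.sum(np.log(np.diag(L_q)))
    return 0.5 * (trace_term + maha_term - k + logdet_p - logdet_q)

def rbf_kernel(Z, lengthscale=1.0, jitter=1e-6):
    """RBF kernel matrix on 1-D inputs, with jitter for numerical stability."""
    sq_dist = (Z[:, None] - Z[None, :]) ** 2
    return np.exp(-0.5 * sq_dist / lengthscale**2) + jitter * np.eye(len(Z))

# Hypothetical hidden layer: 8 inducing inputs and a linear prior mean.
Z = np.linspace(-2.0, 2.0, 8)
K = rbf_kernel(Z)
prior_mean = 0.5 * Z

# Collapsed layer: the variational posterior equals the prior exactly.
kl_collapsed = gaussian_kl(prior_mean, K, prior_mean, K)

# Layer that has moved away from the prior (shifted mean, tighter covariance).
kl_informed = gaussian_kl(prior_mean + 1.0, 0.1 * K, prior_mean, K)

print(f"KL, collapsed layer: {kl_collapsed:.3e}")
print(f"KL, informed layer:  {kl_informed:.3f}")
```

Tracking this quantity per layer during training would flag a collapsed layer even when the overall training objective still looks reasonable, which is the sense in which such failures can be silent.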

This connects to the VAE layer work from the same day, which also addresses how to make classical probabilistic deep learning methods more reliable and modular. Both papers treat variational inference as a component that needs fixing rather than replacing. Where the VAE paper focuses on architectural integration, this DGP work focuses on the internal mechanics that prevent these probabilistic models from learning at all. The shared thread is recognition that variational methods remain valuable but require careful tuning to work in practice.

If follow-up work shows that replacing DSVI with an alternative inference algorithm, or modifying the prior mean initialization, recovers DGP performance on standard uncertainty benchmarks (e.g., UCI regression, OOD detection) without added computational cost, the fix is actionable. If the proposed changes work only on toy problems or require problem-specific tuning, the diagnosis remains academic.

Coverage we drew on

Variational Autoencoder Layer · arXiv cs.LG

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.

Mentions: Deep Gaussian Processes · DSVI · Variational Inference · Kullback-Leibler Divergence

Read the full story at arXiv cs.LG (arxiv.org)
