Dear all,
I hope you will join us for the November LINCC Tech Talk, which will take place on Thursday, November 13, at 10am PT on Zoom (https://ls.st/lincc-talks). Arkaprabha Ganguli will speak about representation learning in scientific datasets with disentangled generative ML models.
Title: Enhancing interpretability in generative ML modeling: statistically disentangled latent spaces guided by generative factors in scientific datasets
Abstract: Scientific discovery often requires identifying relationships among noisy, biased, and uncertain measurements. Although data-driven models can achieve strong predictive performance, they often lack interpretability in these real scientific contexts. In extragalactic astronomy and cosmology, for example, we wish to link observed galaxy images and spectra to underlying physical drivers such as the dark-matter environment and evolutionary history; however, many dynamical parameters remain poorly constrained. Typical AI pipelines excel at classification or redshift estimation yet provide limited insight into these mechanisms, in part because their latent spaces entangle multiple generative factors and rarely exploit the data’s multi-fidelity nature. We present an encoder–decoder generative framework that learns disentangled representations by injecting domain-specific auxiliary information into the latent space. Well-understood generative factors are allocated to separate, interpretable dimensions, while uncertain or unknown factors remain entangled. This design improves semantic structure, enhances explainability, and increases robustness to adversarial or out-of-distribution inputs. We demonstrate the method on two testbeds: synthetic galaxy images with known structural parameters, and cosmic microwave background lensing maps with associated halo properties from simulations. The model maintains high reconstruction fidelity, and the disentangled dimensions correspond to physically plausible perturbations, providing a transparent link between learned features and underlying cosmic processes.
LINCC Tech Talks are held on the second Thursday of every month. Events are also advertised on our web page and provided in calendar form, and the #lincc-tech-talks LSSTC Slack channel is always available for discussions before, during, and after the talks.