Projects
Modeling Non-Gaussian Extragalactic Foregrounds with DDPMs*
Tech stack: PyTorch · Accelerate · DDPMs
GitHub
- Built diffusion models to learn the joint non-Gaussian distribution of cosmic infrared background (CIB) and thermal Sunyaev-Zel'dovich (tSZ) maps from the Agora simulations.
- Generated fast, realistic CIB–tSZ samples that match 2-point, 3-point, and 4-point statistics of the training data.
- Integrated DDPM-generated maps into lensing pipelines, reducing bias in reconstructed lensing power spectra on small angular scales.
Forecasting ΛCDM Cosmological Constraints with SPT-3G*
Tech stack: Python · Julia · Statistical Inference
GitHub
- Developed a covariance and parameter-forecasting framework for upcoming SPT-3G temperature, polarization, and lensing data.
- Implemented optimal joint estimation of auto/cross-frequency covariance matrices for lensing and unlensed bandpowers.
- Produced forecasts showing up to ~2× tighter parameter constraints relative to Planck alone, enabling robust consistency tests of ΛCDM and its extensions.
Generative Modeling of Galactic Dust with Variational Autoencoders
Tech stack: Python · TensorFlow · CUDA · VAEs
GitHub
- Trained VAEs to model Galactic dust emission maps from Planck observations.
- Demonstrated the ability to generate new dust realizations, reconstruct withheld maps, and produce constrained realizations.
- Provided generative tools useful for simulation-based inference and foreground-cleaning tests for CMB experiments.
Pawsitive Retrieval — RAG System over 5.5M Reddit Posts
Tech stack: NLP · HuggingFace · LanceDB · LangChain
GitHub
- Cleaned, normalized, and indexed a dataset of ~5.5M Reddit posts for large-scale retrieval experiments.
- Fine-tuned a domain-specific embedding model for semantic search and question answering.
- Achieved a 10–15% improvement in retrieval quality (MRR, NDCG) relative to baseline transformer embeddings.
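For readers unfamiliar with the two retrieval metrics named above, the sketch below shows how MRR and NDCG@k are commonly computed. It is a minimal illustration only: the function names, the toy document ids, and the choice of linear gain in NDCG are assumptions, and this is not the project's evaluation code.

```python
import math


def reciprocal_rank(ranked_ids, relevant_ids):
    """Return 1/rank of the first relevant item, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over queries; runs is a list of (ranked_ids, relevant_ids)."""
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)


def ndcg_at_k(ranked_ids, relevance, k):
    """NDCG@k with linear gain; relevance maps doc_id -> gain, missing ids count as 0."""
    gains = [relevance.get(doc_id, 0.0) for doc_id in ranked_ids[:k]]
    dcg = sum(g / math.log2(i + 2) for i, g in enumerate(gains))
    # Ideal ordering: the k largest gains, sorted from best to worst.
    ideal = sorted(relevance.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


# Toy example with hypothetical document ids (not real data).
toy_runs = [
    (["a", "b", "c"], {"b"}),  # first relevant hit at rank 2
    (["d", "e", "f"], {"d"}),  # first relevant hit at rank 1
]
toy_mrr = mean_reciprocal_rank(toy_runs)
toy_ndcg = ndcg_at_k(["a", "b", "c"], {"b": 1.0}, k=3)
```

A relative improvement such as the one quoted above would then be the ratio of the fine-tuned model's score to the baseline's score, minus one, evaluated on the same query set.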
Bayesian Inference Pipeline for CMB Lensing
Tech stack: Julia · CUDA · Bayesian Inference
GitHub
- Applied differentiable likelihood methods to extract non-Gaussian information from CMB temperature maps.
- Improved constraints on lensing amplitude and higher-order CMB statistics beyond traditional two-point analyses.
- Worked toward scalable, simulation-based inference techniques for next-generation CMB datasets.
*Private project repositories — contact me for access.