Principal Investigator: Rajiv Khanna
We frame coreset selection (i.e., data subset selection) via posterior sampling, so that the loss landscape induced by the subset better matches the full‑data landscape. This yields robustness under label corruption and in small‑budget regimes, and improves time‑to‑accuracy over state-of-the-art baselines. We report 20–200% time‑to‑best‑accuracy gains and lower memory use than the current state of the art across vision and NLP benchmarks.