SENTINEL: Evaluating Pipeline Robustness to Distributional Shifts
Principal Investigator: Romila Pradhan
Ensuring the robustness of data preprocessing
pipelines is essential for maintaining the reliability of machine
learning model performance in the face of real-world data
shifts. Traditional methods optimize preprocessing sequences
for specific datasets but often overlook their vulnerability to
future data variations. We demonstrate SENTINEL, a system for
evaluating the robustness of data preprocessing pipelines under
real-world covariate shift. SENTINEL quantifies component-wise
susceptibility using a composite vulnerability score (VS) and
learns a lightweight linear mapping from VS to downstream
performance change. For any given pipeline and drift scenario,
SENTINEL produces an explanation that classifies the pipeline as
vulnerable or non-vulnerable and indicates each component's likely contribution
to performance degradation. For shift-vulnerable scenarios,
we further suggest repaired pipelines that let system maintainers
preserve model performance without re-optimization. This
risk-aware assessment helps practitioners anticipate potential
performance changes and adapt their pipeline repair strategies
accordingly. On multiple real datasets, SENTINEL delivers a
practitioner-facing report that pinpoints brittle stages and recommends
repairs to mitigate drift-induced failures prior to deployment.
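
The mapping from VS to downstream performance change described above can be illustrated with a minimal sketch. It is not SENTINEL's implementation: the abstract does not define how VS is computed, so VS values are taken as given, and the function names, the calibration numbers, and the 0.05 tolerance are all hypothetical choices made for illustration. The sketch fits an ordinary least-squares line from VS to observed performance drop, then labels a pipeline as vulnerable when the predicted drop exceeds the tolerance.

```python
import numpy as np


def fit_linear_map(vs, perf_drop):
    """Least-squares fit of perf_drop ~ slope * vs + intercept."""
    slope, intercept = np.polyfit(vs, perf_drop, deg=1)
    return float(slope), float(intercept)


def classify(vs_value, slope, intercept, tolerance=0.05):
    """Label a pipeline from its predicted performance drop.

    `tolerance` is a hypothetical cutoff, not a value from the source.
    """
    predicted_drop = slope * vs_value + intercept
    label = "vulnerable" if predicted_drop > tolerance else "non-vulnerable"
    return label, predicted_drop


# Hypothetical calibration data: one (VS, observed drop) pair per
# pipeline and drift scenario. These numbers are made up.
vs = np.array([0.0, 0.1, 0.2, 0.4, 0.5])
perf_drop = np.array([0.00, 0.02, 0.04, 0.08, 0.10])

slope, intercept = fit_linear_map(vs, perf_drop)
label_high, drop_high = classify(0.5, slope, intercept)
label_low, drop_low = classify(0.1, slope, intercept)
print(label_high, label_low)
```

A linear map keeps the fit cheap and makes the link between the score and the predicted change easy to inspect, which is consistent with the "lightweight" description in the abstract; any richer model would be a different design choice.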
Personnel
Students: Jahid Hasan

