Knowledge Graph Construction for Resilient, Trustworthy, and Secure Software Supply Chains
Principal Investigator: Tianyi Zhang
This project will develop a unified knowledge graph that captures rich, up-to-date information about software components across heterogeneous software ecosystems. Building on our prior work on noise-robust open knowledge extraction, we will develop a new neural knowledge acquisition pipeline that (1) extracts software information from diverse sources, including official documentation, software release notes, bug reports, CVEs, and online discussions; (2) consolidates the extracted information through an array of quality control and fact-checking mechanisms; and (3) continuously updates the knowledge graph by tracking new information from these sources. The resulting knowledge graph will enable a novel multi-modal query interface for knowledge dissemination, as well as new risk mitigation approaches that perform deep scans of software systems, detect potential risks, and automatically repair them.
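As an illustration only, the three pipeline stages described above (extract, consolidate, update) could be sketched as follows. The sketch assumes that facts are stored as (subject, relation, object) triples and that consolidation means requiring agreement from at least two independent sources. Both are assumptions made here for illustration, not the project's actual design, and the neural extraction and fact-checking mechanisms are not shown. All names (`Fact`, `KnowledgeGraph`, `min_sources`, `examplelib`, the source labels, and the placeholder CVE identifier) are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One extracted statement plus the (hypothetical) source it came from."""
    subject: str
    relation: str
    obj: str
    source: str


class KnowledgeGraph:
    """Toy triple store that only trusts facts seen in enough distinct sources."""

    def __init__(self, min_sources=2):
        # Assumed consolidation rule: a triple is trusted once at least
        # `min_sources` different sources have reported it.
        self.min_sources = min_sources
        self._support = defaultdict(set)  # (subject, relation, obj) -> sources

    def add(self, fact):
        """Update step: record a newly extracted fact and its source."""
        triple = (fact.subject, fact.relation, fact.obj)
        self._support[triple].add(fact.source)

    def trusted(self):
        """Consolidation step: keep triples with enough independent support."""
        return sorted(
            triple
            for triple, sources in self._support.items()
            if len(sources) >= self.min_sources
        )

    def query(self, subject, relation):
        """Return trusted objects for a given subject and relation."""
        return sorted(
            o for (s, r, o) in self.trusted() if s == subject and r == relation
        )


# Placeholder data; "CVE-0000-0000" is not a real identifier.
kg = KnowledgeGraph(min_sources=2)
kg.add(Fact("examplelib 1.2", "affected_by", "CVE-0000-0000", "cve_feed"))
print(kg.query("examplelib 1.2", "affected_by"))  # [] (only one source so far)

kg.add(Fact("examplelib 1.2", "affected_by", "CVE-0000-0000", "bug_report"))
print(kg.query("examplelib 1.2", "affected_by"))  # ['CVE-0000-0000']
```

Counting distinct sources is a deliberately simple stand-in for quality control: it keeps a single noisy extraction from entering the graph, while repeated calls to `add` model the continuous update step.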
Personnel
Other PIs: Xiangyu Zhang
Students: Yifeng Di, Yuan Tian, Bonan Kou, Minghai Lu, Zhi Tu, Ruixin Wang, Weixi Tong, Jiahao Shi, Wei-Hao Chen
Representative Publications
Zhou, Zihan, Zhongkai Zhao, Bonan Kou, and Tianyi Zhang. "Decide: Knowledge-Based Version Incompatibility Detection in Deep Learning Stacks." In Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering, pp. 547-551. 2024.
Yuan Tian, Jonathan K. Kummerfeld, Toby Jia-Jun Li, and Tianyi Zhang. 2024. SQLucid: Grounding Natural Language Database Queries with Interactive Explanations. In Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (UIST '24). Association for Computing Machinery, New York, NY, USA, Article 12, 1–20. https://doi.org/10.1145/3654777.3676368
Tian, Yuan, Zheng Zhang, Zheng Ning, Toby Li, Jonathan K. Kummerfeld, and Tianyi Zhang. "Interactive Text-to-SQL Generation via Editable Step-by-Step Explanations." In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pp. 16149-16166. 2023.
Zhao, Zhongkai, Bonan Kou, Mohamed Yilmaz Ibrahim, Muhao Chen, and Tianyi Zhang. "Knowledge-Based Version Incompatibility Detection for Deep Learning." ESEC/FSE 2023.
Nguyen, Tai, Yifeng Di, Joohan Lee, Muhao Chen, and Tianyi Zhang. "Software Entity Recognition with Noise-Robust Learning." ASE 2023.
Keywords: Knowledge Graph, Security, Software Supply Chains, Vulnerabilities

