Congratulations to the Information Systems student group on having their scientific paper accepted for publication at the international conference ISDS 2025

Paper: “Enhancing Text-to-SQL Capabilities of Small Language Models via Schema Context Enrichment and Self-Correction”

Paper link: https://ctujs.ctu.edu.vn/index.php/ctujs/article/view/1907

Student authors:

  • Lê Gia Kiệt – HTCL2021 – Lead author

  • Lê Quốc Khánh – HTCL2021 – Co-author

  • Nguyễn Minh Nhựt – HVCH-HTTT – Co-author

Advisor:

Assoc. Prof. Dr. Nguyễn Đình Thuân

Abstract:

Translating natural language into SQL is essential for intuitive database access, yet open-source small language models (SLMs) still lag behind larger systems when faced with complex schemas and tight context windows. This paper introduces a two-phase workflow designed to enhance the Text-to-SQL capabilities of SLMs. Phase 1 (offline) transforms the database schema into a graph, partitions it with Louvain community detection, and enriches each component in a cluster with metadata, relationships, and sample rows. Phase 2 (at runtime) selects the relevant tables, generates SQL queries, and iteratively refines the SQL through an execution-driven feedback loop until the query executes successfully. Evaluated on the Spider test set, our pipeline raises Qwen-2.5-Coder-14B to 86.2% Execution Accuracy (EX), surpassing its zero-shot baseline, outperforming all contemporary SLM + ICL approaches, and narrowing the gap to GPT-4-based systems, all while running on consumer-grade hardware. Ablation studies confirm that both schema enrichment and self-correction contribute significantly to the improvement. The study concludes that this workflow provides a practical methodology for deploying resource-efficient open-source SLMs in Text-to-SQL applications, effectively mitigating common challenges. An open-source implementation is released to support further research.
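To make the Phase 2 feedback loop in the abstract concrete, here is a minimal sketch of execution-driven refinement. It is not the authors' implementation: the function `refine_until_executable`, the `max_rounds` cap, the `stub_generator` standing in for the language model, and the toy `singer` table are all assumptions made for illustration, and SQLite is used only because it ships with Python.

```python
import sqlite3
from typing import Callable, Optional

# Hypothetical signature: (question, previous_sql, error_message) -> new SQL text.
Generator = Callable[[str, Optional[str], Optional[str]], str]


def refine_until_executable(
    conn: sqlite3.Connection,
    question: str,
    generate_sql: Generator,
    max_rounds: int = 3,  # assumed cap; the abstract does not state one
) -> Optional[str]:
    """Return the first candidate query that executes, or None if all rounds fail."""
    previous_sql: Optional[str] = None
    error: Optional[str] = None
    for _ in range(max_rounds):
        sql = generate_sql(question, previous_sql, error)
        try:
            conn.execute(sql).fetchall()  # execution is the feedback signal
            return sql
        except sqlite3.Error as exc:
            # Feed the failed query and the database error into the next attempt.
            previous_sql, error = sql, str(exc)
    return None


def stub_generator(
    question: str, previous_sql: Optional[str], error: Optional[str]
) -> str:
    """Stand-in for the model: misspells a column first, then corrects it."""
    if error is None:
        return "SELECT nme FROM singer WHERE age > 30"
    return "SELECT name FROM singer WHERE age > 30"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO singer VALUES (?, ?)", [("An", 25), ("Binh", 41)])

final_sql = refine_until_executable(
    conn, "Which singers are older than 30?", stub_generator
)
rows = conn.execute(final_sql).fetchall()
print(final_sql, rows)
```

In a real pipeline the stub would be replaced by a prompt to the model that includes the selected tables, the failed query, and the error text. Note that this loop only checks that a query executes, as the abstract describes; it does not check that the result answers the question.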

We sincerely thank Assoc. Prof. Dr. Nguyễn Đình Thuân, lecturer at the Faculty of Information Systems, for his scientific guidance, insightful feedback, and dedicated support throughout the research process, which helped us refine our ideas and methods and publish this paper.

---

The 3rd International Conference on Intelligent Systems and Data Science (ISDS 2025) will be held at CICT, Can Tho University on October 18–19, 2025. Following the success of ISDS 2023 at Can Tho University and ISDS 2024 at Nha Trang University, this year’s conference continues its mission to attract domestic and international researchers to present outstanding recent studies in the field of ICT. The ISDS conference serves as a forum for scientists to meet, exchange ideas, and collaborate, as well as a platform for students to share and learn about new results in intelligent systems and data science.

More information: https://www.facebook.com/share/p/19s8tFH8Gi/