This use case presents AI-powered green multi-cluster management for large-scale intelligent infrastructure. It uses AI-driven scheduling, GPU multiplexing, cross-cluster coordination, and multimodal explanation mechanisms to improve cluster utilization, reduce energy consumption, and support sustainable low-carbon AI infrastructure management.
@incollection{weng2025GreenMultiCluster,
  author={Weng, Qizhen and Fan, Yuankai},
  title={AI for Green Multi-Cluster: Intelligent Management towards Green and Low-Carbon, Large-scale Multi-Clusters},
  booktitle={AI for Good Innovate for Impact Report},
  publisher={International Telecommunication Union},
  month=jul,
  year={2025},
  chapter={4.2-Climate Change},
  section={Use Case 8},
  pages={182--187},
}
Rethinking Data in NL2SQL: A Survey of What We Have and What We Expect
Yuankai Fan, Qizhen Weng, Yin Chen, and X. Sean Wang
This survey rethinks the role of data in NL2SQL, reviewing what data resources, benchmarks, and practices the community currently has and outlining what future NL2SQL systems and datasets should support.
@article{fan2025NL2SQL,
  author={Fan, Yuankai and Weng, Qizhen and Chen, Yin and Wang, X. Sean},
  title={Rethinking Data in {NL2SQL}: A Survey of What We Have and What We Expect},
  journal={Vicinagearth},
  volume={2},
  number={1},
  pages={15},
  year={2025},
  month=nov,
  issn={3005-060X},
  doi={10.1007/s44336-025-00026-9},
}
Grounding Natural Language to SQL Translation with Data-Based Self-Explanations
Yuankai Fan, Tonghui Ren, Can Huang, Zhenying He, and X. Sean Wang
In International Conference on Data Engineering 2025
Natural Language Interfaces for Databases empower non-technical users to interact with data using natural language (NL). Advanced approaches, utilizing either neural sequence-to-sequence or more recent sophisticated large-scale language models, typically implement NL to SQL (NL2SQL) translation in an end-to-end fashion. However, like humans, these end-to-end translation models may not always generate the best SQL output on their first try. In this paper, we propose CycleSQL, an iterative framework designed for end-to-end translation models to autonomously generate the best output through self-evaluation. The main idea of CycleSQL is to introduce data-grounded NL explanations of query results as self-provided feedback, and use the feedback to validate the correctness of the translation iteratively, hence improving the overall translation accuracy.
@inproceedings{cyclesql2025,
  author={Fan, Yuankai and Ren, Tonghui and Huang, Can and He, Zhenying and Wang, X. Sean},
  title={Grounding Natural Language to SQL Translation with Data-Based Self-Explanations},
  booktitle={International Conference on Data Engineering},
  pages={29--42},
  year={2025},
}
2024
A Confidence-based Knowledge Integration Framework for Cross-Domain Table Question Answering
Yuankai Fan, Tonghui Ren, Can Huang, Beini Zheng, Yinan Jing, Zhenying He, Jinbao Li, and Jianxin Li
Recent advancements in TableQA leverage sequence-to-sequence (Seq2seq) deep learning models to accurately respond to natural language queries. These models achieve this by converting the queries into SQL queries, using information drawn from one or more tables. However, Seq2seq models often produce uncertain, low-confidence predictions when they distribute probability mass across multiple outputs during a decoding step, which frequently leads to translation errors. To tackle this problem, we present CKIF, a confidence-based knowledge integration framework that uses a two-stage deep-learning-based ranking technique to mitigate the low-confidence problem commonly associated with Seq2seq models for TableQA.
@article{ckif2024,
  author={Fan, Yuankai and Ren, Tonghui and Huang, Can and Zheng, Beini and Jing, Yinan and He, Zhenying and Li, Jinbao and Li, Jianxin},
  title={A Confidence-based Knowledge Integration Framework for Cross-Domain Table Question Answering},
  journal={Knowledge-Based Systems},
  volume={306},
  pages={112718},
  year={2024},
}
MetaSQL: A Generate-then-Rank Framework for Natural Language to SQL Translation
Yuankai Fan, Zhenying He, Tonghui Ren, Can Huang, Yinan Jing, Kai Zhang, and X. Sean Wang
In International Conference on Data Engineering 2024
The Natural Language Interface to Databases (NLIDB) empowers non-technical users with database access through intuitive natural language interactions. Advanced approaches, utilizing neural sequence-to-sequence models or large-scale language models, typically employ auto-regressive decoding to generate unique SQL queries sequentially. In this paper, we propose MetaSQL, a unified generate-then-rank framework that can be flexibly integrated with existing NLIDBs to consistently improve their translation accuracy. MetaSQL introduces query metadata to control the generation of better SQL query candidates and uses learning-to-rank algorithms to retrieve globally optimized queries.
@inproceedings{metasql2024,
  author={Fan, Yuankai and He, Zhenying and Ren, Tonghui and Huang, Can and Jing, Yinan and Zhang, Kai and Wang, X. Sean},
  title={MetaSQL: A Generate-then-Rank Framework for Natural Language to SQL Translation},
  booktitle={International Conference on Data Engineering},
  pages={1765--1778},
  year={2024},
}
2023
GAR: A Generate-and-Rank Approach for Natural Language to SQL Translation
Yuankai Fan, Zhenying He, Tonghui Ren, Dianjun Guo, Lin Chen, Ruisi Zhu, Guanduo Chen, Yinan Jing, Kai Zhang, and X. Sean Wang
In International Conference on Data Engineering 2023
A Natural Language Interface to Databases (NLIDB) aims to help end-users access databases. State-of-the-art approaches primarily construct language translation models to convert NL queries to SQL queries. While these models exhibit good performance on NLIDB benchmarks, the translation accuracy seems to have stalled between 70% and 75%, and most erroneous translations happen with complex queries that require an understanding of the structure and semantics specific to a database. This paper proposes a Generate-and-Rank approach called GAR.
@inproceedings{gar2023,
  author={Fan, Yuankai and He, Zhenying and Ren, Tonghui and Guo, Dianjun and Chen, Lin and Zhu, Ruisi and Chen, Guanduo and Jing, Yinan and Zhang, Kai and Wang, X. Sean},
  title={{GAR}: A Generate-and-Rank Approach for Natural Language to SQL Translation},
  booktitle={International Conference on Data Engineering},
  pages={110--122},
  year={2023},
}
GenSql: A Generative Natural Language Interface to Database Systems
Yuankai Fan, Tonghui Ren, Zhenying He, X. Sean Wang, Ye Zhang, and Xingang Li
In International Conference on Data Engineering Apr 2023
GenSql is a generative natural language interface to database systems that demonstrates natural language to SQL translation capabilities for interactive database access.
@inproceedings{gensql2023,
  author={Fan, Yuankai and Ren, Tonghui and He, Zhenying and Wang, X. Sean and Zhang, Ye and Li, Xingang},
  title={{GenSql}: A Generative Natural Language Interface to Database Systems},
  booktitle={International Conference on Data Engineering},
  pages={3603--3606},
  month=apr,
  year={2023},
  doi={10.1109/ICDE55515.2023.00278},
}