publications
publications by category in reverse chronological order. generated by jekyll-scholar.
2025
- DataMan: Data Manager for Pre-training Large Language Models. In The Thirteenth International Conference on Learning Representations, 2025
2024
- TableGPT2: A Large Multimodal Model with Tabular Data Integration. arXiv preprint arXiv:2411.02059, 2024
- From Laws to Motivation: Guiding Exploration through Law-Based Reasoning and Rewards. In Intrinsically-Motivated and Open-Ended Learning Workshop @ NeurIPS 2024, 2024
- On LLMs-Driven Synthetic Data Generation, Curation, and Evaluation: A Survey. In Findings of the Association for Computational Linguistics ACL 2024, 2024
- Embedding and Gradient Say Wrong: A White-Box Method for Hallucination Detection. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
- A Comparative Study on Reasoning Patterns of OpenAI’s o1 Model. arXiv preprint arXiv:2410.13639, 2024
- When Quantum Computing Meets Database: A Hybrid Sampling Framework for Approximate Query Processing. IEEE Transactions on Knowledge and Data Engineering, 2024
- FlowBench: Revisiting and Benchmarking Workflow-Guided Planning for LLM-based Agents. In Findings of the Association for Computational Linguistics: EMNLP 2024, 2024