
🔥DotaMath: Decomposition of Thought with Code Assistance and Self-correction for Mathematical Reasoning

Chengpeng Li, Guanting Dong, Mingfeng Xue, Ru Peng, Xiang Wang, Dayiheng Liu

University of Science and Technology of China

Qwen, Alibaba Inc.

📃 arXiv Paper • 🤗 Dataset (Hugging Face) • 📚 Dataset (Google Drive)
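As a quick way to inspect DotaMathQA, the snippet below sketches loading it with the Hugging Face `datasets` library. The repository id and split name are placeholders rather than confirmed paths; substitute the actual id from the Hugging Face link above.

```python
from datasets import load_dataset

# NOTE: "ChengpengLi1003/DotaMathQA" is a placeholder id used only for illustration;
# take the real dataset path from the Hugging Face link above.
dotamathqa = load_dataset("ChengpengLi1003/DotaMathQA", split="train")  # split name assumed

print(len(dotamathqa))   # number of query-response pairs
print(dotamathqa[0])     # inspect one annotated tool-use trajectory
```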


If you find this work helpful for your research, please cite it:

@article{li2024dotamath,
  author       = {Chengpeng Li and
                  Guanting Dong and
                  Mingfeng Xue and
                  Ru Peng and
                  Xiang Wang and
                  Dayiheng Liu},
  title        = {DotaMath: Decomposition of Thought with Code Assistance and Self-correction
                  for Mathematical Reasoning},
  journal      = {CoRR},
  volume       = {abs/2407.04078},
  year         = {2024},
  url          = {https://doi.org/10.48550/arXiv.2407.04078},
  doi          = {10.48550/ARXIV.2407.04078},
  eprinttype   = {arXiv},
  eprint       = {2407.04078},
  timestamp    = {Wed, 07 Aug 2024 21:29:45 +0200},
  biburl       = {https://dblp.org/rec/journals/corr/abs-2407-04078.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}

💥 News

Introduction

Large language models (LLMs) have made significant strides in solving simple math problems but still struggle with complex tasks. This paper presents DotaMath, a series of LLMs that utilize thought decomposition, code assistance, and self-correction for mathematical reasoning. DotaMath tackles complex problems by breaking them down into simpler subtasks, using code to solve these subtasks, receiving detailed feedback from the code interpreter, and engaging in self-reflection. By annotating diverse interactive tool-use trajectories and applying query evolution on the GSM8K and MATH datasets, we create an instruction fine-tuning dataset called DotaMathQA, consisting of 574K query-response pairs. We train several base LLMs using imitation learning on DotaMathQA, resulting in models that outperform open-source LLMs on various benchmarks. Notably, DotaMath-deepseek-7B achieves 64.8% on the MATH dataset and 86.7% on GSM8K, maintaining strong competitiveness across multiple benchmarks (Avg. 80.1%). We believe the DotaMath paradigm will pave the way for tackling intricate mathematical problems.
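To make the loop described above concrete, here is a minimal sketch of the decompose / code / self-correct cycle. It is not the released inference code: `generate` stands in for any completion call to a DotaMath checkpoint, and the prompts, the subprocess-based interpreter, and the correction limit are assumptions made for illustration.

```python
import subprocess
import sys
import tempfile

MAX_CORRECTIONS = 2  # self-correction rounds per subtask (assumed, not from the paper)


def run_code(code: str, timeout: int = 10) -> tuple[bool, str]:
    """Execute a generated snippet in a fresh interpreter and return (ok, feedback)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run([sys.executable, path],
                              capture_output=True, text=True, timeout=timeout)
        ok = proc.returncode == 0
        return ok, (proc.stdout if ok else proc.stderr).strip()
    except subprocess.TimeoutExpired:
        return False, "interpreter timeout"


def solve(question: str, generate) -> str:
    """Decompose a question, solve each subtask with code, and self-correct on errors.

    `generate(prompt) -> str` is a placeholder for a call to a DotaMath model.
    """
    # 1. Thought decomposition: break the problem into simpler subtasks.
    subtasks = [s for s in generate(f"Decompose into subtasks:\n{question}").splitlines()
                if s.strip()]
    context = question
    for i, subtask in enumerate(subtasks, start=1):
        # 2. Code assistance: write Python that prints this subtask's result.
        code = generate(f"{context}\nSubtask {i}: {subtask}\n"
                        "Write Python code that prints the result.")
        ok, feedback = run_code(code)
        # 3. Self-correction: revise the code using the interpreter's feedback.
        for _ in range(MAX_CORRECTIONS):
            if ok:
                break
            code = generate(f"The code failed with:\n{feedback}\nRevise it:\n{code}")
            ok, feedback = run_code(code)
        context += f"\nIntermediate result {i}: {feedback}"
    # 4. Final answer from the accumulated intermediate results.
    return generate(f"{context}\nGive the final answer.")
```

In the trained models, the decomposition, code, and corrections are emitted as a single interleaved trajectory rather than through separate prompts; the separation here only makes the three components explicit.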

Overall Framework

(Figure: overall framework of DotaMath.)

Main Result

(Figure: main results on mathematical reasoning benchmarks.)
