Deep Learning for Mathematical Reasoning (DL4MATH)

Awesome License: MIT Survey

This repository is the reading list on Deep Learning for Mathematical Reasoning (DL4MATH).

Contributors: Pan Lu @UCLA, Liang Qiu @UCLA, Wenhao Yu @Notre Dame, Sean Welleck @UW, Kai-Wei Chang @UCLA

For more details, please refer to the paper: A Survey of Deep Learning for Mathematical Reasoning.

🔔 If you have any suggestions or notice something we missed, please don't hesitate to let us know. You can email Pan Lu directly ([email protected]), comment on Twitter, or post an issue on this repo.

🧰 Resources

Related Surveys

  • A Survey of Question Answering for Math and Science Problem, arXiv:1705.04530 [paper]
  • The Gap of Semantic Parsing: A Survey on Automatic Math Word Problem Solvers, TPAMI 2019 [paper]
  • Representing Numbers in NLP: a Survey and a Vision, NAACL 2021 [paper]
  • Survey on Mathematical Word Problem Solving Using Natural Language Processing, ICIICT 2021 [paper]
  • A Survey in Mathematical Language Processing, arXiv:2205.15231 [paper]
  • Partial Differential Equations Meet Deep Neural Networks: A Survey, arXiv:2211.05567 [paper]
  • 🔥 Reasoning with Language Model Prompting: A Survey, arXiv:2212.09597 [paper]
  • 🔥 Towards Reasoning in Large Language Models: A Survey, arXiv:2212.10403 [paper]
  • 🔥 A Survey for In-context Learning, arXiv:2301.00234 [paper]

Related Blogs

  • 🔥 How does GPT Obtain its Ability? Tracing Emergent Abilities of Language Models to their Sources, Dec 2022, Yao Fu’s Notion [link]

Workshops

  • 🔥 The 1st MATH-AI Workshop: the Role of Mathematical Reasoning in General Artificial Intelligence, ICLR 2021 [website]
  • 🔥 Math AI for Education: Bridging the Gap Between Research and Smart Education (MATHAI4ED), NeurIPS 2021 [website]
  • 🔥 The 1st Workshop on Mathematical Natural Language Processing, EMNLP 2022 [website]
  • 🔥 The 2nd MATH-AI Workshop: Toward Human-Level Mathematical Reasoning, NeurIPS 2022 [website]
  • 🔥 FLAIM: Formal Languages, AI and Mathematics, IHP & META 2022 [YouTube]
  • 🔥 AI to Assist Mathematical Reasoning: A Workshop, NASEM 2023 [YouTube]

Talks

  • Can GPT-3 do math? | Grant Sanderson and Lex Fridman, 2020 [YouTube]
  • Computer Scientist Explains One Concept in 5 Levels of Difficulty, 2022 [YouTube]

🎨 Mathematical Reasoning Benchmarks

Math Word Problems (MWP)

  • [AI2/Verb395] Learning to Solve Arithmetic Word Problems with Verb Categorization, EMNLP 2014 [paper]
  • [Alg514] Learning to automatically solve algebra word problems, ACL 2014 [paper]
  • [IL] Reasoning about Quantities in Natural Language, TACL 2015 [paper]
  • [SingleEQ] Parsing Algebraic Word Problems into Equations, TACL 2015 [paper]
  • [DRAW] Draw: A challenging and diverse algebra word problem set, 2015 [paper]
  • [Dolphin1878] Automatically solving number word problems by semantic parsing and reasoning, EMNLP 2015 [paper]
  • [Dolphin18K] How well do computers solve math word problems? large-scale dataset construction and evaluation, ACL 2016 [paper]
  • [MAWPS] MAWPS: A math word problem repository, NAACL-HLT 2016 [paper]
  • [AllArith] Unit dependency graph and its application to arithmetic word problem solving, AAAI 2017 [paper]
  • [DRAW-1K] Annotating Derivations: A New Evaluation Strategy and Dataset for Algebra Word Problems, EACL 2017 [paper]
  • 🔥 [Math23K] Deep neural solver for math word problems, EMNLP 2017 [paper]
  • [AQuA] Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems, ACL 2017 [paper]
  • [Aggregate] Mapping to Declarative Knowledge for Word Problem Solving, TACL 2018 [paper]
  • 🔥 [MathQA] MathQA: Towards interpretable math word problem solving with operation-based formalisms, NAACL-HLT 2019 [paper]
  • [ASDiv] A Diverse Corpus for Evaluating and Developing English Math Word Problem Solvers, ACL 2020 [paper]
  • [HMWP] Semantically-Aligned Universal Tree-Structured Solver for Math Word Problems, EMNLP 2020 [paper]
  • [Ape210K] Ape210k: A large-scale and template-rich dataset of math word problems, arXiv:2009.11506 [paper]
  • 🔥 [SVAMP] Are NLP Models really able to Solve Simple Math Word Problems?, NAACL-HLT 2021 [paper]
  • 🔥 [GSM8K] Training verifiers to solve math word problems, arXiv:2110.14168 [paper]
  • 🔥 [IconQA] IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning, NeurIPS 2021 [paper]
  • 🔥 [MathQA-Python] Program synthesis with large language models, arXiv:2108.07732 [paper]
  • [ArMATH] ArMATH: a Dataset for Solving Arabic Math Word Problems, LREC 2022 [paper]
  • 🔥 [TabMWP] Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning, arXiv:2209.14610, 2022 [paper]

Theorem Proving (TP)

  • [MML] Four Decades of Mizar, Journal of Automated Reasoning 2015 [paper]
  • [HolStep] HolStep: A Machine Learning Dataset for Higher-order Logic Theorem Proving, ICLR 2017 [paper]
  • [GamePad] GamePad: A Learning Environment for Theorem Proving, ICLR 2019 [paper]
  • 🔥 [CoqGym] Learning to Prove Theorems via Interacting with Proof Assistants, ICML 2019 [paper]
  • [HOList] HOList: An environment for machine learning of higher order logic theorem proving, ICML 2019 [paper]
  • [IsarStep] IsarStep: a Benchmark for High-level Mathematical Reasoning, ICLR 2021 [paper]
  • [LISA] LISA: Language models of ISAbelle proofs, AITP 2021 [paper]
  • [INT] INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving, ICLR 2021 [paper]
  • 🔥 [NaturalProofs] NaturalProofs: Mathematical Theorem Proving in Natural Language, NeurIPS 2021 [paper]
  • [NaturalProofs-Gen] NaturalProver: Grounded Mathematical Proof Generation with Language Models, NeurIPS 2022 [paper]
  • 🔥 [MiniF2F] MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics, ICLR 2022 [paper]
  • 🔥 [LeanStep] Proof Artifact Co-training for Theorem Proving with Language Models, ICLR 2022 [paper]
  • 🔥 [miniF2F+informal] Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs, arXiv:2210.12283 [paper]

Geometry Problem Solving (GPS)

  • 🔥 [GEOS] Solving geometry problems: Combining text and diagram interpretation, EMNLP 2015 [paper]
  • [GeoShader] Synthesis of solutions for shaded area geometry problems, The Thirtieth International FLAIRS Conference, 2017 [paper]
  • [GEOS++] From textbooks to knowledge: A case study in harvesting axiomatic knowledge from textbooks to solve geometry problems, EMNLP 2017 [paper]
  • [GEOS-OS] Learning to solve geometry problems from natural language demonstrations in textbooks, Proceedings of the 6th Joint Conference on Lexical and Computational Semantics, 2017 [paper]
  • 🔥 [Geometry3K] Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning, ACL 2021 [paper]
  • [GeoQA] GeoQA: A Geometric Question Answering Benchmark Towards Multimodal Numerical Reasoning, Findings of ACL 2021 [paper]
  • [GeoQA+] An Augmented Benchmark Dataset for Geometric Question Answering through Dual Parallel Text Encoding, COLING 2022 [paper]
  • 🔥 [UniGeo] UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression, EMNLP 2022 [paper]

Math Question Answering (MathQA)

  • [QUAREL] QUAREL: A Dataset and Models for Answering Questions about Qualitative Relationships, AAAI 2019 [paper]
  • [McTaco] “Going on a vacation” takes longer than “Going for a walk”: A Study of Temporal Commonsense Understanding, EMNLP 2019 [paper]
  • 🔥 [DROP] DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs, NAACL 2019 [paper]
  • 🔥 [Mathematics] Analysing Mathematical Reasoning Abilities of Neural Models, ICLR 2019 [paper]
  • [FinQA] FinQA: A Dataset of Numerical Reasoning over Financial Data, EMNLP 2021 [paper]
  • [Fermi] How Much Coffee Was Consumed During EMNLP 2019? Fermi Problems: A New Reasoning Challenge for AI, EMNLP 2021 [paper]
  • 🔥 [MATH, AMPS] Measuring Mathematical Problem Solving With the MATH Dataset, NeurIPS 2021 [paper]
  • [TAT-QA] TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance, ACL-IJCNLP 2021 [paper]
  • [MultiHiertt] MultiHiertt: Numerical Reasoning over Multi Hierarchical Tabular and Textual Data, ACL 2022 [paper]
  • [NumGLUE] NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks, ACL 2022 [paper]
  • 🔥 [Lila] Lila: A Unified Benchmark for Mathematical Reasoning, EMNLP 2022 [paper]

Other Quantitative Problems

  • [FigureQA] FigureQA: An annotated figure dataset for visual reasoning, arXiv:1710.07300 [paper]
  • 🔥 [DVQA] DVQA: Understanding data visualizations via question answering, CVPR 2018 [paper]
  • [DREAM] DREAM: A Challenge Dataset and Models for Dialogue-Based Reading Comprehension, TACL 2019 [paper]
  • [EQUATE] EQUATE: A Benchmark Evaluation Framework for Quantitative Reasoning in Natural Language Inference, CoNLL 2019 [paper]
  • 🔥 [NumerSense] Birds have four legs?! NumerSense: Probing Numerical Commonsense Knowledge of Pre-trained Language Models, EMNLP 2020 [paper]
  • [MNS] Machine Number Sense: A Dataset of Visual Arithmetic Problems for Abstract and Relational Reasoning, AAAI 2020 [paper]
  • [P3] Programming Puzzles, NeurIPS 2021 [paper]
  • [NOAHQA] NOAHQA: Numerical Reasoning with Interpretable Graph Question Answering Dataset, Findings of EMNLP 2021 [paper]
  • [ConvFinQA] ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering, arXiv:2210.03849 [paper]
  • [PGDP5K] PGDP5K: A Diagram Parsing Dataset for Plane Geometry Problems, arXiv:2205.0994 [paper]
  • [GeoRE] GeoRE: A Relation Extraction Dataset for Chinese Geometry Problems, NeurIPS 2021 MATHAI4ED Workshop [paper]
  • 🔥 [ScienceQA] Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering, NeurIPS 2022 [paper]

🧩 Neural Networks for Mathematical Reasoning

General Neural Networks

  • [LSTM] Long short-term memory, Neural Computation 1997 [paper]
  • [Seq2Seq] Sequence to sequence learning with neural networks, NeurIPS 2014 [paper]
  • [GRU] Learning Phrase Representations using RNN Encoder--Decoder for Statistical Machine Translation, EMNLP 2014 [paper]
  • [Attention] Neural machine translation by jointly learning to align and translate, arXiv:1409.0473 [paper]
  • [Attention] Show, attend and tell: Neural image caption generation with visual attention, ICML 2015 [paper]
  • [Faster-RCNN] Faster R-CNN: Towards real-time object detection with region proposal networks, NeurIPS 2015 [paper]
  • [TreeLSTM] Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks, ACL 2015 [paper]
  • [BiLSTM] Google's neural machine translation system: Bridging the gap between human and machine translation, arXiv:1609.08144 [paper]
  • [ResNet] Deep residual learning for image recognition, CVPR 2016 [paper]
  • [ConvS2S] Convolutional sequence to sequence learning, ICML 2017 [paper]
  • [Top-Down Attention] Bottom-up and top-down attention for image captioning and visual question answering, CVPR 2018 [paper]
  • [FiLM] FiLM: Visual reasoning with a general conditioning layer, AAAI 2018 [paper]
  • [BAN] Bilinear Attention Networks, NeurIPS 2018 [paper]
  • [DFAF] Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering, CVPR 2019 [paper]

Seq2Seq Networks for Math

  • 🔥 [DNS] Deep Neural Solver for Math Word Problems, EMNLP 2017 [paper]
  • 🔥 [AnsRat] Program induction by rationale generation: Learning to solve and explain algebraic word problems, ACL 2017 [paper]
  • [Math-EN] Translating a Math Word Problem to a Expression Tree, EMNLP 2018 [paper]
  • [CASS] Neural math word problem solver with reinforcement learning, COLING 2018 [paper]
  • [SelfAtt] Data-driven methods for solving algebra word problems, arXiv:1804.10718 [paper]
  • [S-Aligned] Semantically-Aligned Equation Generation for Solving and Reasoning Math Word Problems, NAACL 2019 [paper]
  • [T-RNN] Template-based math word problem solvers with recursive neural networks, AAAI 2019 [paper]
  • [GROUP-ATT] Modeling intra-relation in math word problems with different functional multi-head attentions, ACL 2019 [paper]
  • [QuaSP+] QUAREL: A Dataset and Models for Answering Questions about Qualitative Relationships, AAAI 2019 [paper]
  • [SMART] SMART: A Situation Model for Algebra Story Problems via Attributed Grammar, AAAI 2021 [paper]

Graph-based Networks for Math

  • [AST-Dec] Tree-structured decoding for solving math word problems, EMNLP 2019 [paper]
  • 🔥 [GTS] A Goal-Driven Tree-Structured Neural Model for Math Word Problems, IJCAI 2019 [paper]
  • [CoqGym] Learning to Prove Theorems via Interacting with Proof Assistants, ICML 2019 [paper]
  • [KA-S2T] A knowledge-aware sequence-to-tree network for math word problem solving, EMNLP 2020 [paper]
  • [TSN-MD, NT-LSTM] Solving arithmetic word problems by scoring equations with recursive neural networks, Expert Systems with Applications 2021 [paper]
  • [NS-Solver] Neural-Symbolic Solver for Math Word Problems with Auxiliary Tasks, ACL 2021 [paper]
  • [NumS2T] Math word problem solving with explicit numerical values, ACL 2021 [paper]
  • [HMS] HMS: A hierarchical solver with dependency-enhanced understanding for math word problem, AAAI 2021 [paper]
  • [LBF] Learning by fixing: Solving math word problems with weak supervision, AAAI 2021 [paper]
  • [Seq2DAG] A bottom-up dag structure extraction model for math word problems, AAAI 2021 [paper]
  • [Graph2Tree] Graph-to-Tree Neural Networks for Learning Structured Input-Output Translation with Applications to Semantic Parsing and Math Word Problem, EMNLP 2020 [paper]
  • [Multi-E/D] Solving math word problems with multi-encoders and multi-decoders, COLING 2020 [paper]
  • 🔥 [Graph2Tree] Graph-to-Tree Learning for Solving Math Word Problems, ACL 2020 [paper]
  • [EEH-G2T] An edge-enhanced hierarchical graph-to-tree network for math word problem solving, EMNLP 2021 [paper]

Other Neural Networks for Math

  • [DeepMath] DeepMath - deep sequence models for premise selection, NeurIPS 2016 [paper]
  • [Holophrasm] Holophrasm: a neural automated theorem prover for higher-order logic, arXiv:1608.02644 [paper]
  • 🔥 [CNNTP, WaveNetTP] Deep network guided proof search, arXiv:1701.06972 [paper]
  • 🔥 [MathDQN] MathDQN: Solving arithmetic word problems via deep reinforcement learning, AAAI 2018 [paper]
  • [DDT] Solving math word problems with double-decoder transformer, arXiv:1908.10924 [paper]
  • [DeepHOL] HOList: An environment for machine learning of higher order logic theorem proving, ICML 2019 [paper]
  • [NGS] GeoQA: A Geometric Question Answering Benchmark Towards Multimodal Numerical Reasoning, Findings of ACL 2021 [paper]
  • [PGDPNet] Learning to Understand Plane Geometry Diagram, NeurIPS 2022 MATH-AI Workshop [paper]

📜 Pre-trained Language Models for Mathematical Reasoning

General Pre-trained Language Models (<100B)

  • [Transformer] Attention is all you need, NeurIPS 2017 [paper]
  • [BERT] BERT: Pre-training of deep bidirectional transformers for language understanding, arXiv:1810.04805 [paper]
  • [T5] Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, JMLR 2020 [paper]
  • [RoBERTa] RoBERTa: A robustly optimized BERT pretraining approach, arXiv:1907.11692 [paper]
  • [GPT-2, 1.5B] Language models are unsupervised multitask learners, OpenAI Blog, 2019 [paper]
  • [BART] BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension, ACL 2020 [paper]
  • [ALBERT] ALBERT: A lite BERT for self-supervised learning of language representations, arXiv:1909.11942 [paper]
  • [GPT-Neo] The pile: An 800gb dataset of diverse text for language modeling, arXiv:2101.00027 [paper]
  • [VL-T5] Unifying Vision-and-Language Tasks via Text Generation, ICML 2021 [paper]

Self-Supervised Learning for Math

  • 🔥 [GenBERT] Injecting numerical reasoning skills into language models, ACL 2020 [paper]
  • 🔥 [GPT-f] Generative language modeling for automated theorem proving, arXiv:2009.03393 [paper]
  • [LISA] LISA: Language models of ISAbelle proofs, AITP 2021 [paper]
  • [MATH-PLM] Measuring Mathematical Problem Solving With the MATH Dataset, NeurIPS 2021 [paper]
  • [LIME] LIME: Learning inductive bias for primitives of mathematical reasoning, ICML 2021 [paper]
  • [NF-NSM] Injecting Numerical Reasoning Skills into Knowledge Base Question Answering Models, arXiv:2112.06109 [paper]
  • [MWP-BERT] MWP-BERT: Numeracy-augmented pre-training for math word problem solving, Findings of NAACL 2022 [paper]
  • [HTPS] HyperTree Proof Search for Neural Theorem Proving, arXiv:2205.11491 [paper]
  • [Thor] Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers, arXiv:2205.10893 [paper]
  • [Set] Insights into pre-training via simpler synthetic tasks, arXiv:2206.10139 [paper]
  • [PACT] Proof artifact co-training for theorem proving with language models, ICLR 2022 [paper]
  • 🔥 [TaPEX] TAPEX: Table Pre-training via Learning a Neural SQL Executor, ICLR 2022 [paper]
  • 🔥 [Minerva] Solving quantitative reasoning problems with language models, NeurIPS 2022 [paper]

Task-specific Fine-tuning for Math

  • [EPT] Point to the expression: Solving algebraic word problems using the expression-pointer transformer model, EMNLP 2020 [paper]
  • [Generate & Rank] Generate & Rank: A Multi-task Framework for Math Word Problems, EMNLP 2021 [paper]
  • [RPKHS] Improving Math Word Problems with Pre-trained Knowledge and Hierarchical Reasoning, EMNLP 2021 [paper]
  • [PatchTRM] IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning, NeurIPS 2021 [paper]
  • 🔥 [GSM8K-PLM] Training verifiers to solve math word problems, arXiv:2110.14168 [paper]
  • 🔥 [Inter-GPS] Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning, ACL 2021 [paper]
  • [Aristo] From ‘F’ to ‘A’ on the NY Regents science exams: An overview of the Aristo project, AI Magazine 2020 [paper]
  • [FinQANet] FinQA: A Dataset of Numerical Reasoning over Financial Data, EMNLP 2021 [paper]
  • [TAGOP] TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance, ACL-IJCNLP 2021 [paper]
  • [LAMT] Linear algebra with transformers, arXiv:2112.01898 [paper]
  • 🔥 [Scratchpad] Show your work: Scratchpads for intermediate computation with language models, arXiv:2112.00114 [paper]
  • [Self-Sampling] Learning from Self-Sampled Correct and Partially-Correct Programs, arXiv:2205.14318 [paper]
  • [DeductReasoner] Learning to Reason Deductively: Math Word Problem Solving as Complex Relation Extraction, ACL 2022 [paper]
  • [DPE-NGS] An Augmented Benchmark Dataset for Geometric Question Answering through Dual Parallel Text Encoding, COLING 2022 [paper]
  • [BERT-TD+CL] Seeking Patterns, Not just Memorizing Procedures: Contrastive Learning for Solving Math Word Problems, Findings of ACL 2022 [paper]
  • [MT2Net] MultiHiertt: Numerical Reasoning over Multi Hierarchical Tabular and Textual Data, ACL 2022 [paper]
  • [miniF2F-PLM] MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics, ICLR 2022 [paper]
  • 🔥 [NaturalProver] NaturalProver: Grounded Mathematical Proof Generation with Language Models, NeurIPS 2022 [paper]
  • 🔥 [UniGeo] UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression, EMNLP 2022 [paper]
  • 🔥 [Bhaskara] Lila: A Unified Benchmark for Mathematical Reasoning, EMNLP 2022 [paper]

🌠 In-context Learning for Mathematical Reasoning

General Large Language Models (100B+)

  • 🔥 [GPT-3, 175B] Language models are few-shot learners, NeurIPS 2020 [paper]
  • 🔥 [Codex, 175B] Evaluating large language models trained on code, arXiv:2107.03374 [paper]
  • 🔥 [PaLM, 540B] PaLM: Scaling Language Modeling with Pathways, arXiv:2204.02311 [paper]
  • 🔥 [ChatGPT, 175B] ChatGPT: Optimizing Language Models for Dialogue, November 30, 2022 [website]
  • ❓ [GPT-4]

In-context Example Selection

  • 🔥 [Few-shot-CoT] Chain of thought prompting elicits reasoning in large language models, NeurIPS 2022 [paper]
  • [Retrieval] Learning to retrieve prompts for in-context learning, NAACL-HLT 2022 [paper]
  • 🔥 [PromptPG-CoT] Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning, arXiv:2209.14610 [paper]
  • [Retrieval-CoT] Automatic Chain of Thought Prompting in Large Language Models, arXiv:2210.03493 [paper]
  • [Generate] Generate rather than retrieve: Large language models are strong context generators, arXiv:2209.10063 [paper]
  • [Complexity-CoT] Complexity-Based Prompting for Multi-Step Reasoning, arXiv:2210.00720 [paper]
  • [Auto-CoT] Automatic Chain of Thought Prompting in Large Language Models, arXiv:2210.03493 [paper]

High-quality Reasoning Chains

  • 🔥 [Self-Consistency-CoT] Self-consistency improves chain of thought reasoning in language models, arXiv:2203.11171 [paper]
  • 🔥 [Least-to-most CoT] Least-to-Most Prompting Enables Complex Reasoning in Large Language Models, arXiv:2205.10625 [paper]
  • On the Advance of Making Language Models Better Reasoners, arXiv:2206.02336 [paper]
  • Decomposed prompting: A modular approach for solving complex tasks, arXiv:2210.02406 [paper]
  • PAL: Program-aided Language Models, arXiv:2211.10435 [paper]
  • 🔥 [Few-shot-PoT] Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks, arXiv:2211.12588 [paper]

♣️ Other Related Work for Mathematical Reasoning

Early Work

  • Empirical explorations of the geometry theorem machine, Western Joint IRE-AIEE-ACM Computer Conference 1960 [paper]
  • Basic principles of mechanical theorem proving in elementary geometries, Journal of Automated Reasoning 1986 [paper]
  • Automated generation of readable proofs with geometric invariants, Journal of Automated Reasoning 1996 [paper]

Datasets

  • 🔥 [TextbookQA] Are You Smarter Than A Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension, CVPR 2017 [paper]
  • 🔥 [Raven] Raven: A dataset for relational and analogical visual reasoning, CVPR 2019 [paper]
  • [APPS] Measuring Coding Challenge Competence With APPS, NeurIPS 2021 [paper]
  • [PhysNLU] PhysNLU: A Language Resource for Evaluating Natural Language Understanding and Explanation Coherence in Physics, 2022 [paper]

Methods

  • My computer is an honor student—but how intelligent is it? Standardized tests as a measure of AI, AI Magazine 2016 [paper]
  • Learning pipelines with limited data and domain knowledge: A study in parsing physics problems, NeurIPS 2018 [paper]
  • Automatically proving plane geometry theorems stated by text and diagram, International Journal of Pattern Recognition and Artificial Intelligence 2019 [paper]
  • Classification and Clustering of arXiv Documents, Sections, and Abstracts, Comparing Encodings of Natural and Mathematical Language, JCDL 2020 [paper]

Latest Work (To be classified)

  • 🔥 Advancing mathematics by guiding human intuition with AI, Nature 2021 [paper]
  • [MWPToolkit] MWPToolkit: an open-source framework for deep learning-based math word problem solvers, AAAI 2022 [paper]
  • A deep reinforcement learning agent for geometry online tutoring, Knowledge and Information Systems 2022 [paper]
  • ELASTIC: Numerical Reasoning with Adaptive Symbolic Compiler, NeurIPS 2022 [paper]
  • Solving math word problems with process and outcome-based feedback, arXiv:2211.14275 [paper]
  • APOLLO: An Optimized Training Approach for Long-form Numerical Reasoning, arXiv:2212.07249 [paper]
  • Enhancing Financial Table and Text Question Answering with Tabular Graph and Numerical Reasoning, AACL 2022 [paper]
  • DyRRen: A Dynamic Retriever-Reranker-Generator Model for Numerical Reasoning over Tabular and Textual Data, AAAI 2023 [paper]
  • Generalizing Math Word Problem Solvers via Solution Diversification, arXiv:2212.00833 [paper]
  • Textual Enhanced Contrastive Learning for Solving Math Word Problems, arXiv:2211.16022 [paper]
  • Analogical Math Word Problems Solving with Enhanced Problem-Solution Association, EMNLP 2022 [paper]

Citation

If you find this repo useful, please kindly cite our survey:

@article{lu2022dl4math,
  title={A Survey of Deep Learning for Mathematical Reasoning},
  author={Lu, Pan and Qiu, Liang and Yu, Wenhao and Welleck, Sean and Chang, Kai-Wei},
  journal={arXiv preprint arXiv:2212.10535},
  year={2022}
}