## How to stimulate the arithmetic reasoning ability of large models? Scientists gi

Large models have garnered widespread attention over the past year or two, particularly for their achievements in solving mathematical problems.

In fact, as early as 2022, researchers at Google Research proposed Chain-of-Thought (CoT) prompting, an effective method for improving the mathematical reasoning of large models, and validated it in few-shot in-context learning [1].

Although this method was quickly and widely adopted, researchers in the field still know very little about how it stimulates the arithmetic reasoning ability of large models.

Previous work has mainly focused on experimentally observing how different components of a CoT prompt affect the arithmetic reasoning of large models.

Specifically, these studies replaced or removed parts of the CoT prompt, for example stripping the verbal reasoning from the CoT samples and leaving only the key mathematical formulas. They then judged whether the replaced or removed part contributes meaningfully to arithmetic reasoning by comparing the model's performance on existing arithmetic reasoning benchmarks before and after the change.

Although these studies uncovered several interesting phenomena, they still cannot explain, in terms of the internal mechanisms of neural networks, how CoT stimulates the arithmetic reasoning ability of large models.

At the same time, these studies have raised further questions, such as why different components of CoT affect the arithmetic reasoning of large models to varying degrees.

To address the above issues, Professor Yao Ziyu from George Mason University in the United States and his team started a series of explorations on the open-source Llama2 model from the perspective of "model interpretability" and proposed using "neuron activation" to systematically explain the phenomena observed in existing studies on CoT.

Recently, the related paper, titled "An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs," was accepted by the 2024 Annual Meeting of the Association for Computational Linguistics (ACL) [2].

Daking Rai, a doctoral student at George Mason University, is the first author, and Yao Ziyu serves as the corresponding author.

In the study, they first explored whether neurons in the Transformer feedforward layer express concepts related to arithmetic reasoning.

These concepts include arithmetic operations such as addition, subtraction, multiplication, and division; logical connectives used in arithmetic reasoning (e.g., "so," "next"); and other arithmetic calculation concepts (e.g., "percentage," "algorithm," "formula").

To discover the concept each neuron represents, they projected the neurons into the vocabulary space of the large model and summarized each neuron's meaning by annotating how strongly each concept is represented among the words obtained from the projection.

The research group proposed using GPT-4 to read and understand the vocabulary mapping of neurons to automate this process of neuron annotation and mining.
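The projection step described above can be sketched with a toy example. This is only an illustration under stated assumptions: the vocabulary, the unembedding matrix `UNEMBED`, the neuron's value vector, and the helper names `top_tokens` and `concept_fraction` are all hypothetical, and the hand-written concept word set stands in for the GPT-4 annotation the team used.

```python
# Toy sketch: read a feed-forward neuron through the vocabulary.
# All names, sizes, and numbers are hypothetical, not taken from Llama2.

VOCAB = ["add", "plus", "sum", "cat", "the", "minus"]

# Hypothetical unembedding matrix: one row (hidden size 3) per vocabulary token.
UNEMBED = [
    [0.9, 0.1, 0.0],   # add
    [0.8, 0.2, 0.1],   # plus
    [0.7, 0.0, 0.2],   # sum
    [0.0, 0.9, 0.1],   # cat
    [0.1, 0.1, 0.1],   # the
    [-0.6, 0.1, 0.0],  # minus
]


def top_tokens(value_vector, k=3):
    """Return the k tokens whose unembedding rows score highest for this neuron."""
    scores = [sum(w * v for w, v in zip(row, value_vector)) for row in UNEMBED]
    ranked = sorted(range(len(VOCAB)), key=lambda i: scores[i], reverse=True)
    return [VOCAB[i] for i in ranked[:k]]


def concept_fraction(tokens, concept_words):
    """Fraction of the given tokens that belong to a hand-picked concept word set."""
    if not tokens:
        return 0.0
    return sum(t in concept_words for t in tokens) / len(tokens)


neuron = [1.0, 0.0, 0.0]                 # hypothetical value vector of one neuron
tokens = top_tokens(neuron)              # tokens this neuron promotes most
addition_words = {"add", "plus", "sum"}  # stand-in for an automated annotation
print(tokens, concept_fraction(tokens, addition_words))
```

A neuron whose top tokens fall mostly inside one concept set would be labeled with that concept; where to set the cutoff is a design choice this sketch leaves open.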

The experiment showed that there are indeed neurons in the Transformer feedforward layer that represent arithmetic concepts. When these neurons are damaged, the arithmetic reasoning ability of the large model is compromised.

At the same time, the researchers observed that the activity level of these neurons is positively correlated with the arithmetic reasoning ability of the large model. This positive correlation explains why different prompts can have different effects on the model's arithmetic reasoning.
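To make "damaging" a neuron concrete, here is a minimal sketch of ablation in a toy feedforward layer. It assumes a simplified ReLU layer rather than Llama2's gated feedforward block, and the function `ffn`, its `scale` argument, and the weights are hypothetical.

```python
def relu(x):
    return x if x > 0 else 0.0


def ffn(x, keys, values, scale=None):
    """Toy feed-forward layer.

    activation_i = relu(x . key_i); output = sum_i activation_i * value_i.
    `scale` maps a neuron index to a multiplier (0.0 ablates that neuron).
    """
    scale = scale or {}
    out = [0.0] * len(values[0])
    for i, (key, value) in enumerate(zip(keys, values)):
        act = relu(sum(a * b for a, b in zip(x, key))) * scale.get(i, 1.0)
        out = [o + act * v for o, v in zip(out, value)]
    return out


# Hypothetical two-neuron layer.
KEYS = [[1.0, 0.0], [0.0, 1.0]]
VALUES = [[2.0, 0.0], [0.0, 3.0]]
x = [1.0, 1.0]

print(ffn(x, KEYS, VALUES))                  # both neurons contribute
print(ffn(x, KEYS, VALUES, scale={0: 0.0}))  # neuron 0 ablated
```

Ablating neuron 0 removes its contribution from the layer output while leaving neuron 1 untouched, which is the kind of intervention used to test whether a neuron matters.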

Based on these neurons, the team systematically explained four CoT-related phenomena observed in previous studies.

First, when mathematical formulas are removed from the CoT samples and only the results of the operations are left, the arithmetic reasoning ability of the large model is impaired.

Second, when the verbal reasoning is removed from the CoT samples and only the mathematical formulas are left, the model's ability is also impaired.

Third, when the CoT samples lose operational diversity, such as when all samples only involve addition operations, the model's ability is impaired.

Fourth, when the computation results of the CoT samples are incorrect but the reasoning process is correct, the model's capabilities are not significantly affected.
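One simple way to compare prompt variants like these is to count how many of the identified neurons are active under each. The sketch below assumes a fixed activation threshold; the function `count_active`, the neuron indices, and the activation values are invented for illustration and are not results from the paper.

```python
def count_active(activations, neuron_ids, threshold=0.0):
    """Count tracked neurons whose activation exceeds the threshold."""
    return sum(1 for i in neuron_ids if activations[i] > threshold)


# Invented activations for one layer under two prompt variants.
full_cot = [0.8, 0.0, 1.2, 0.5, 0.0]
no_equations = [0.1, 0.0, 0.0, 0.4, 0.0]
arithmetic_neurons = [0, 2, 3]  # hypothetical indices of annotated neurons

print(count_active(full_cot, arithmetic_neurons, threshold=0.3))
print(count_active(no_equations, arithmetic_neurons, threshold=0.3))
```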

"We see that these phenomena can basically be explained by the degree of activation of neurons. For example, the number of activated neurons decreases before and after the removal of mathematical formulas, explaining why the model's arithmetic reasoning ability is impaired," the researchers explained.

From an application perspective, this work has two potential uses.

Firstly, it can be used to predict the capabilities of large models.

In experiments, the researchers already observed that the activation level of neurons representing arithmetic reasoning is positively correlated with the arithmetic reasoning ability of the Llama2 model. This means that, in the future, it may be possible to predict a large model's capability on a specific task directly, without running benchmark tests.

Because benchmarking requires substantial manpower and resources, such as dataset annotation and compute, predicting capabilities directly from an understanding of a model's intrinsic mechanisms can also help save costs.
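A positive correlation of this kind can be quantified with a Pearson coefficient. The following is a minimal sketch; the function name `pearson` is an assumption, and the numbers in the usage lines are invented placeholders, not measurements from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if std_x == 0 or std_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (std_x * std_y)


# Invented placeholders: mean activation of arithmetic neurons per prompt
# variant, and the benchmark accuracy obtained with that same variant.
mean_activation = [0.2, 0.5, 0.9]
accuracy = [0.10, 0.30, 0.40]
print(pearson(mean_activation, accuracy))
```

A coefficient near 1 would support using activation as a proxy for capability, although a correlation measured on a few prompt variants says little about causation on its own.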

In addition, practitioners in this field hope that, in the near future, large models will be able to complete tasks beyond human capabilities. Because of the limits of human ability, however, such tasks cannot be turned into benchmark tests. Predicting a model's capabilities from its intrinsic mechanisms could sidestep this problem.

Secondly, a model's capabilities can be enhanced or weakened by controlling its intrinsic mechanisms.

"We believe that this application will become one of the important methods to improve the safety of large models in the future, and it also has the potential to achieve more efficient training of large models, such as by locating neurons with small data, and then achieving the purpose of model training by controlling the activation of neurons," said the research group.

In fact, in the second half of 2023, OpenAI announced its "Superalignment" initiative [3], which aims to help humans supervise and control superhuman AI models by encouraging scientific and technological innovation. Predicting and controlling model capabilities are two important tasks toward this goal.

"This achievement is a preliminary exploration in this direction, and we hope that we or other researchers can continue to explore in this direction in the future," the team said.

The research was inspired by "mechanistic interpretability," a subfield of model interpretability that has emerged rapidly and received widespread attention in recent years. Unlike earlier interpretability methods, mechanistic interpretability tries to understand a model's behavior by reverse engineering the neural network.

At present, such methods have been applied to interpreting the behavior and structural functions of large models.

"One of the studies that greatly inspired us is the exploration of the Transformer feedforward layer by researchers from the Allen Institute for Artificial Intelligence in the United States and Bar-Ilan University in Israel [4]," said the researchers.

The study found that when a large model predicts the next token, its Transformer feedforward layers build the prediction by progressively promoting related concepts in the vocabulary space. This promotion is achieved by activating neurons in the feedforward layers.

"This mechanism-level discovery inspired our conjecture: CoT may stimulate the arithmetic reasoning ability of large models because it effectively activates the feedforward-layer neurons that represent arithmetic reasoning concepts, and these neurons help strengthen the model's arithmetic reasoning," the research team said.
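In symbols, the feedforward mechanism described above can be written roughly as follows. The notation is chosen here for illustration and is an assumption: $x$ is the layer input, $k_i$ and $v_i$ are the key and value vectors of neuron $i$, $f$ is the activation function, $m_i$ is the resulting activation, and $E$ is the output embedding matrix. Normalization layers are ignored, and Llama2's gated feedforward variant computes $m_i$ differently.

```latex
\mathrm{FFN}(x) \;=\; \sum_{i=1}^{d_{\mathrm{ff}}} m_i \, v_i ,
\qquad m_i = f(x \cdot k_i)

E \, \bigl( x + \mathrm{FFN}(x) \bigr) \;=\; E \, x \;+\; \sum_{i=1}^{d_{\mathrm{ff}}} m_i \, ( E \, v_i )
```

Because the projection is linear, each neuron adds $m_i \, (E \, v_i)$ to the vocabulary scores, so a more strongly activated neuron promotes its associated tokens more strongly.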

Based on this, the research team wondered whether there is a mechanism that can directly enhance the arithmetic reasoning ability of large models, especially small-scale large models.

The team pointed out: "This is a very meaningful thing, because small-scale large models have unique operational efficiency, economic efficiency, and security."

At the same time, the team also saw research that improved small-scale large models in specific fields or tasks by collecting high-quality data or modifying the training objective function. However, applying mechanistic interpretability for this purpose is still in its infancy.

Nevertheless, the team's research was not smooth sailing, and it even stalled at the very beginning.

The most significant challenge was that they did not fully understand the internal mechanism of arithmetic reasoning in large models, so they naturally could not achieve the model control they envisioned.

"Therefore, my student Daking Rai, who is also the first author of this paper, and I decided to first focus on explaining the arithmetic reasoning of large models," said Yao Ziyu.

But they soon encountered the next challenge.

"Arithmetic reasoning" is a highly abstract concept, whereas large models make their predictions at the level of individual tokens.

To understand the arithmetic reasoning ability of large models from the perspective of "concept reinforcement by neurons in the vocabulary space," one must first ground this abstract concept in concepts at the level of specific tokens.

To bridge this gap, the research team first summarized several lower-level concepts related to arithmetic reasoning, including arithmetic operators, logical language expressions in arithmetic reasoning, and other arithmetic computation concepts.

They then used GPT-4 to efficiently annotate and search for neurons expressing these lower-level concepts. Subsequently, they validated the neurons they found by following previous studies.

"The experimental results prove that these neurons indeed play an important role in our experimental large model Llama2," said the research team.

This also gave them more confidence to continue exploring in this direction.

They then thought of using the activation state of these neurons to explain, in a unified way, the effect of CoT on the arithmetic reasoning ability of large models, including several phenomena observed in previous work.

The results essentially confirmed their conjecture: the stimulating effect of different CoT components on the arithmetic reasoning ability of large models can be explained by the activation of the relevant neurons.

However, the study also pointed out that neuron activation cannot explain all of the arithmetic reasoning performance of large models. Whether the findings on Llama2 apply to other large models also needs further verification.

It is also reported that Yao Ziyu's laboratory currently has several fully-funded doctoral positions for admission in the fall of 2025. For more details, please visit the team's website and consult by email.
