AI Agent - ReAct

ReAct - Reasoning and Acting

A unique feature of human intelligence is the ability to seamlessly combine task-oriented actions with verbal reasoning (or inner speech, which has been theorized to play an important role in human cognition by enabling self-regulation and maintaining working memory). This interleaving of “acting” and “reasoning” allows humans to learn new tasks quickly and to make robust decisions, even in unfamiliar situations.

ReAct: Synergizing Reasoning And Acting

Consider a general setup of an agent interacting with an environment for task solving. At time step $t$, an agent receives an observation $o_t \in \mathcal{O}$ from the environment and takes an action $a_t \in \mathcal{A}$ following some policy $\pi(a_t | c_t)$, where $c_t = (o_1, a_1, \cdots, o_{t-1}, a_{t-1}, o_t)$ is the context to the agent. Learning a policy is challenging when the mapping $c_t \mapsto a_t$ is highly implicit and requires extensive computation.
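The loop described above can be sketched in a few lines of Python. This is only a minimal illustration: the names `run_episode`, `toy_policy`, and `toy_env_step`, the string-valued observations and actions, and the `"finish"` stop action are all assumptions made for the example, not part of any particular agent framework.

```python
from typing import Callable, List

# The context c_t = (o_1, a_1, ..., o_{t-1}, a_{t-1}, o_t), stored as a flat list
# that alternates observations and actions and always ends with an observation.
Context = List[str]


def run_episode(
    policy: Callable[[Context], str],
    env_step: Callable[[str], str],
    first_observation: str,
    max_steps: int = 5,
) -> Context:
    """Run the observe -> act loop and return the full interaction history."""
    context: Context = [first_observation]      # c_1 = (o_1)
    for _ in range(max_steps):
        action = policy(context)                # a_t chosen from c_t
        context.append(action)
        if action == "finish":                  # hypothetical stop action
            break
        observation = env_step(action)          # environment returns o_{t+1}
        context.append(observation)
    return context


def toy_env_step(action: str) -> str:
    """Hypothetical environment: echoes back a result for the action."""
    return f"result of {action}"


def toy_policy(context: Context) -> str:
    """Hypothetical policy: search twice, then finish."""
    observations_seen = (len(context) + 1) // 2
    if observations_seen >= 3:
        return "finish"
    return f"search[{observations_seen}]"


if __name__ == "__main__":
    for item in run_episode(toy_policy, toy_env_step, "question"):
        print(item)
```

Note that the policy receives the whole context rather than only the latest observation, which is why the context grows with every step. With an LLM as the agent, `policy` would presumably be a model call that reads this history as its prompt.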

What is a policy?

The term “policy” here is a core concept borrowed directly from the field of Reinforcement Learning (RL). The field's standard textbook, [Reinforcement Learning: An Introduction] by Sutton and Barto, defines it as follows:

A policy defines the learning agent’s way of behaving at a given time. Roughly speaking, a policy is a mapping from perceived states of the environment to actions to be taken when in those states. It corresponds to what in psychology would be called a set of stimulus–response rules or associations. In some cases the policy may be a simple function or lookup table, whereas in others it may involve extensive computation such as a search process. The policy is the core of a reinforcement learning agent in the sense that it alone is sufficient to determine behavior. In general, policies may be stochastic, specifying probabilities for each action.

In short, a policy is the “brain” or “strategy” of an agent. It is the set of rules that dictates what action the agent will take in any given state.

To break that down:

  • Agent: The AI (in this case, the LLM).
  • Environment: The world the agent interacts with (e.g., a website, a code terminal, a Wikipedia API).
  • State: The agent’s current situation (e.g., the webpage it’s on, the code it has written so far, the user’s initial question).
  • Policy: The function that looks at the current state and decides what action to take next. It answers the question, “Given what’s happening right now, what should I do?”
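To make the definition concrete, here is a small sketch of the two simplest kinds of policy mentioned in the quote: a deterministic lookup table and a stochastic policy that assigns a probability to each action. The state names, action names, and probabilities are invented for illustration. An LLM-based agent would replace these tables with a model call.

```python
import random
from typing import Dict

# Deterministic policy as a lookup table: each state maps to exactly one action.
# All state and action names are hypothetical.
LOOKUP_POLICY: Dict[str, str] = {
    "question_received": "search",
    "search_results_shown": "read_page",
    "answer_found": "finish",
}


def deterministic_policy(state: str) -> str:
    """Return the single action the table prescribes for this state."""
    return LOOKUP_POLICY[state]


# Stochastic policy: each state maps to a probability distribution over actions.
STOCHASTIC_POLICY: Dict[str, Dict[str, float]] = {
    "question_received": {"search": 0.9, "finish": 0.1},
    "search_results_shown": {"read_page": 0.7, "search": 0.3},
    "answer_found": {"finish": 1.0},
}


def stochastic_policy(state: str, rng: random.Random) -> str:
    """Sample one action according to the probabilities for this state."""
    distribution = STOCHASTIC_POLICY[state]
    actions = list(distribution)
    weights = list(distribution.values())
    return rng.choices(actions, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs sample the same actions
    print(deterministic_policy("question_received"))
    print(stochastic_policy("question_received", rng))
```

Both functions answer the same question ("given this state, what should I do?"). The difference is that the stochastic version may return different actions for the same state on different calls.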
https://blogs.openml.io/posts/agent-react/
Author
OpenML Blogs
Published at
2026-05-05
License
CC BY-NC-SA 4.0