Definition
RL is a branch of machine learning that focuses on teaching agents to make decisions by interacting with an environment
Through a system of rewards and penalties, the agent learns to take actions that maximize the cumulative reward over time
What is RL
RL is best perceived as a category of problems rather than a mere collection of techniques
Components of a RL model
- The policy of an agent dictates its actions in a given state, serving as a guide to choose the appropriate action based on the current state.
- A reward signal represents a singular numeric value that defines the objective of a RL problem. The agent’s aim is to maximize its cumulative reward over time, with the reward signal indicating which events are advantageous or detrimental for the agent
- The value function is responsible for estimating the expected cumulative reward an agent can obtain from a specific state or action. By assigning values to states or actions based on their potential for future rewards, the value function assists the agent in decision-making processes
- The model of the environment (optional component) replicates the behavior of the actual environment or permits inferences about its future behavior. This model can be employed for planning and decision-making by simulating potential future scenarios