Reinforcement learning is a type of machine learning based on rewards and punishments. This article explains its definition, operation, and primary applications.
Artificial intelligence (AI) programs often rely on machine learning to improve their performance over time. Reinforcement learning is one such approach: it rewards an AI for desired actions and penalizes it for undesired ones.
Reinforcement learning takes place in an environment where the outcome of each action can be scored, which in practice usually means a simulation or another controlled setting. The programmer assigns positive and negative values (or “points”) to certain behaviors, and the AI is free to explore the environment to seek rewards and avoid punishments.
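The idea of assigning points and letting the AI explore can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the three behaviors, their point values, and the "epsilon-greedy" exploration rule (pick a random action some of the time, otherwise pick the action that currently looks best). It is one simple way to do this, not a definitive implementation.

```python
import random

# Hypothetical point values chosen by the programmer (illustrative only).
REWARDS = {"hit_wall": -1.0, "wait": 0.0, "reach_goal": 1.0}


def train(steps=500, epsilon=0.2, seed=0):
    """Estimate each action's value by trial and error."""
    rng = random.Random(seed)
    actions = list(REWARDS)
    estimates = {a: 0.0 for a in actions}  # the agent's current beliefs
    counts = {a: 0 for a in actions}       # how often each action was tried

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.choice(actions)
        else:
            # Exploit: take the action with the highest estimate so far.
            action = max(actions, key=estimates.get)

        reward = REWARDS[action]
        counts[action] += 1
        # Incremental running average of the rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates


if __name__ == "__main__":
    learned = train()
    print("Learned values:", learned)
    print("Preferred action:", max(learned, key=learned.get))
```

The agent starts out knowing nothing about the point values. It only discovers that the goal is worth seeking and the wall is worth avoiding by trying actions and observing the points it receives.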
Ideally, the AI will forgo a small short-term gain in favor of a larger long-term gain: given a choice between earning one point in one minute and earning 10 points in two minutes, it will delay gratification and go for the higher value. At the same time, it will learn to avoid actions that incur penalties and cause it to lose points.
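One common way to formalize the trade-off between sooner and later rewards is a discount factor, often written as gamma, which shrinks a reward for every unit of time the agent must wait. The sketch below applies that idea to the one-point versus 10-point choice described above. The function names and the specific gamma values are assumptions made for illustration.

```python
def discounted_value(reward, delay_minutes, gamma):
    """Value of a reward received after a delay, discounted once per minute."""
    return (gamma ** delay_minutes) * reward


def preferred_option(options, gamma):
    """Return the (name, reward, delay) option with the highest discounted value."""
    return max(options, key=lambda o: discounted_value(o[1], o[2], gamma))


# The two choices from the text: (name, points, minutes to wait).
OPTIONS = [("quick", 1, 1), ("patient", 10, 2)]

if __name__ == "__main__":
    # A gamma close to 1 means the agent cares about the future.
    print(preferred_option(OPTIONS, gamma=0.9)[0])   # patient
    # A gamma close to 0 means the agent is short-sighted.
    print(preferred_option(OPTIONS, gamma=0.05)[0])  # quick
```

With these numbers, the agent prefers the 10-point reward whenever gamma is above 0.1. A more patient agent (higher gamma) delays gratification, while a very short-sighted one takes the single point.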