Abhijeet Mittal
Abhijeet is a Data Scientist at Sigmoid. He mainly works in the domain of Recommendation Engines, Time Series Forecasting, Reinforcement Learning and Computer Vision.
Outsmarting Humans: An Introduction to Reinforcement Learning

The brain of a human child is remarkable. Even in a previously unknown situation, it makes a decision based on its primal knowledge. Depending on the outcome, it learns and remembers the best choices to make in that particular scenario. At a high level, this learning can be understood as a ‘trial and error’ process, in which the brain tries to maximize the occurrence of positive outcomes.
Reinforcement Learning
Reinforcement Learning starts from the same idea. An ideal machine is like a child’s brain: it can remember every decision taken in a given task, and its goal is to optimize the results. In Reinforcement Learning, the learner isn’t told which action to take, but must instead try actions and discover which ones yield the maximum reward. In the most interesting and challenging cases, actions may affect not only the immediate reward, but also the next situation and all subsequent rewards. These two characteristics, ‘trial and error search’ and ‘delayed reward’, are the most distinguishing features of Reinforcement Learning.
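To make ‘delayed reward’ concrete, here is a minimal Python sketch of one common way to score a sequence of rewards. It assumes a discount factor, called gamma here, which the article itself does not introduce: a reward that arrives t steps later counts for gamma to the power t of its face value, so later rewards still matter but less than immediate ones.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards, each weighted by gamma**t for its delay t in steps.

    gamma is an assumed discount factor between 0 and 1 (not from the article).
    """
    total = 0.0
    # Work backwards so each earlier step adds its reward to the
    # discounted value of everything that follows it.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


if __name__ == "__main__":
    # The same single reward is worth less the longer the agent waits for it.
    print(discounted_return([1, 0, 0], gamma=0.5))
    print(discounted_return([0, 0, 1], gamma=0.5))
```

With gamma set to 1 every reward counts equally, however late it arrives; smaller values make the agent more short-sighted.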
Machines vs Humans
Many of us have heard of the famous AlphaGo, built by Google’s DeepMind using Reinforcement Learning. This machine has even beaten the world champion Lee Sedol in the abstract strategy board game of Go!
Elon Musk, in a famous debate on AI with Jack Ma, explained how machines are becoming smarter than humans. Reinforcement Learning is one of the areas where machines have already proven that they can outsmart humans.
The Video Games Analogy
Reinforcement Learning can be understood through the example of a video game.

Fig: A Video Game Analogy of Reinforcement Learning

A typical video game usually consists of:
  • An agent (player) who moves around doing stuff
  • An environment that the agent exists in (map, room)
  • An action that the agent takes (moves upward one space, sells cloak)
  • A reward that the agent acquires (coins, killing other players)
  • A state that the agent currently exists in (on a particular square of a map, part of a room)
  • A goal that the agent may have (level up, getting as many rewards as possible)

The agent runs through sequences of state-action pairs in the given environment, observing the rewards that result, to figure out the best path to the goal.
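The loop described above can be sketched in a few lines of Python. Everything here is hypothetical and chosen only for illustration: `CorridorEnv` is a made-up five-square map with a coin on the last square, and `random_policy` is an agent that has not learnt anything yet.

```python
import random


class CorridorEnv:
    """Hypothetical 1-D map: squares 0..4, with a coin on the last square."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (move left) or +1 (move right); walls clamp the move.
        self.state = min(max(self.state + action, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def run_episode(env, policy, max_steps=100):
    """Play one episode and return the (state, action, reward) triples seen."""
    trajectory = []
    state = env.reset()
    for _ in range(max_steps):
        action = policy(state)
        next_state, reward, done = env.step(action)
        trajectory.append((state, action, reward))
        state = next_state
        if done:
            break
    return trajectory


def random_policy(state):
    """An untrained agent: ignores the state and moves at random."""
    return random.choice([-1, 1])


if __name__ == "__main__":
    random.seed(0)
    episode = run_episode(CorridorEnv(), random_policy)
    print(len(episode), "steps, total reward:", sum(r for _, _, r in episode))
```

The recorded triples are exactly the state-action pairs and rewards that a learning algorithm would use to improve the policy.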
Rules of Reinforcement Learning
The Markov decision process lays the foundation for Reinforcement Learning and formally describes a fully observable environment. There are two important parts of Reinforcement Learning:
Policy Learning: A policy is a function that maps a given state to the probabilities of selecting each possible action from that state. In simple terms, it identifies the actions an agent should take in order to maximize its reward. The policy is learnt by initially taking random decisions and then working backwards (updating weights) once there is a positive or negative end result.
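As a rough illustration of policy learning, the sketch below keeps one numeric preference per action for a single state, turns those preferences into probabilities with a softmax, and nudges them after every outcome. The update is a basic REINFORCE-style policy-gradient step, picked here as one simple option rather than taken from the article; the two-action setup, the rewards of +1 and -1, and the learning rate are all assumptions.

```python
import math
import random

ACTIONS = ["left", "right"]


def action_probabilities(preferences):
    """Softmax: turn one state's action preferences into probabilities."""
    highest = max(preferences)  # subtracted for numerical stability
    exps = [math.exp(p - highest) for p in preferences]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(preferences, rng):
    """Pick an action index at random, weighted by the current policy."""
    probs = action_probabilities(preferences)
    return rng.choices(range(len(probs)), weights=probs)[0]


def policy_gradient_update(preferences, action, reward, learning_rate=0.1):
    """Raise the preference of a rewarded action, lower a punished one."""
    probs = action_probabilities(preferences)
    for a in range(len(preferences)):
        # Gradient of log softmax probability of the chosen action.
        grad = (1.0 if a == action else 0.0) - probs[a]
        preferences[a] += learning_rate * reward * grad


def train(episodes=500, seed=0):
    """Assumed toy task: going right pays +1, going left pays -1."""
    rng = random.Random(seed)
    preferences = [0.0, 0.0]  # start with a uniformly random policy
    for _ in range(episodes):
        action = sample_action(preferences, rng)
        reward = 1.0 if ACTIONS[action] == "right" else -1.0
        policy_gradient_update(preferences, action, reward)
    return action_probabilities(preferences)


if __name__ == "__main__":
    print(dict(zip(ACTIONS, train())))
```

The policy starts out choosing both actions equally often and gradually shifts its probability towards the action that keeps getting rewarded.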
Value Functions (Q-Learning): These are functions of states, or of state-action pairs, that estimate how good it is for an agent to be in a given state, or how good it is for the agent to perform a given action in a given state. Unlike a policy, which takes only the state as input, Q-Learning takes two inputs, a state and an action, and returns a value for each pair. If you’re at an intersection, Q-Learning will tell you the expected value of each action that your agent could take (left, right, etc.).
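A minimal tabular Q-Learning sketch, under stated assumptions, looks like this. The five-square corridor, the reward of 1 for reaching the last square, and the values of the hypothetical parameters `alpha` (learning rate), `gamma` (discount factor) and `epsilon` (exploration rate) are illustrative choices, not details from the article.

```python
import random

N_STATES = 5          # squares 0..4; reaching square 4 ends the episode
ACTIONS = [-1, +1]    # move left, move right


def step(state, action):
    """Move along the corridor; only arriving at the last square pays 1."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # One value per (state, action) pair, all unknown (zero) at the start.
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                best = max(q[(state, a)] for a in ACTIONS)
                # Exploit, breaking ties at random.
                action = rng.choice([a for a in ACTIONS if q[(state, a)] == best])
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Move the estimate towards reward + discounted best future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


def greedy_action(q, state):
    """The action with the highest learnt value in this state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


if __name__ == "__main__":
    q = q_learning()
    for s in range(N_STATES - 1):
        print(s, {a: round(q[(s, a)], 3) for a in ACTIONS}, "->", greedy_action(q, s))
```

After training, the table plays the role of the intersection example: for each square it holds one value per action, and the agent simply picks the action with the larger one.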
Applications of Reinforcement Learning
Reinforcement Learning has found applications in many areas, from robotics and games to self-driving cars. Well-known researchers such as Andrew Ng, Andrej Karpathy and David Silver are betting big on the future of Reinforcement Learning.
Final Words
To date, the idea of machines outsmarting humans in every field seems far-fetched. But the seed has been sown, and companies like Google and Tesla have shown that if machines and humans work together, the future has many opportunities to offer. As far as Reinforcement Learning is concerned, we at Sigmoid are excited about its future and its game-changing applications.

November 8th, 2019 | Streaming, Technology