9 Unique Machine Learning Interview Questions on Markov Chains

What are Markov Chains?

A Markov chain is a stochastic model used for decision-making and sequence prediction. It rests on the rule that the probability of transitioning from the current state to a future state depends only on the current state, and not on any previous states. For example, consider the sentence ‘I love writing code’. A Markov chain can be used to predict the most likely next part of speech (POS) in a sentence. The POS tagging of the sentence above is Pronoun Verb Verb Noun. A Markov chain considers only the probability of moving from one POS to the immediately following one, without regard to any POS before it. These probabilities are used to construct a matrix called the transition probability matrix, which states the probability of transitioning from one state to another in a given time step.
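As a minimal sketch of how such a matrix could be estimated, transitions between POS tags can simply be counted and normalized (the tiny tagged corpus below is hypothetical, invented for illustration):

```python
from collections import Counter, defaultdict

# Toy corpus of POS-tag sequences (hypothetical example data).
sequences = [
    ["PRON", "VERB", "VERB", "NOUN"],   # "I love writing code"
    ["PRON", "VERB", "NOUN"],           # "She reads books"
    ["PRON", "VERB", "VERB", "NOUN"],   # "They enjoy playing chess"
]

# Count transitions: current tag -> next tag.
counts = defaultdict(Counter)
for seq in sequences:
    for current, nxt in zip(seq, seq[1:]):
        counts[current][nxt] += 1

# Normalize counts into transition probabilities P(next | current).
transition = {
    current: {nxt: n / sum(nxts.values()) for nxt, n in nxts.items()}
    for current, nxts in counts.items()
}

print(transition["VERB"])  # {'VERB': 0.4, 'NOUN': 0.6}
```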

Markov chains have many applications, such as speech recognition, cryptography, web search ranking, and text interpretation. In machine learning, Markov chains are widely used in natural language processing.

How does a Markov Chain work?

A Markov chain essentially consists of a set of transitions, determined by some probability distribution, that satisfy the Markov property.

Consider, for example, a Markov chain over three states A, B, and C. Suppose the probability of moving from state A to B is 0.1, and likewise from B to A. The probability of moving from state B to C is 0.9, while the probability of moving from C to B is 0.5. The probability of moving from C to A is 0.5, while the probability of moving from A to C is 0.9.

To get a clearer picture of a Markov chain, a transition probability matrix is usually derived from it, with one row per current state and one column per next state. Since each row must sum to 1, the probabilities listed above leave the example chain with no self-transitions.
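A brief sketch of how this chain could be represented and simulated (the zero diagonal is implied by the probabilities stated above; the seed and step count are arbitrary choices):

```python
import numpy as np

states = ["A", "B", "C"]
# Rows are current states, columns are next states; each row sums to 1.
P = np.array([
    [0.0, 0.1, 0.9],   # from A
    [0.1, 0.0, 0.9],   # from B
    [0.5, 0.5, 0.0],   # from C
])

rng = np.random.default_rng(seed=0)
state = 0  # start in state A
walk = [states[state]]
for _ in range(10):
    # Sample the next state from the current state's row of probabilities.
    state = rng.choice(3, p=P[state])
    walk.append(states[state])

print(" -> ".join(walk))
```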

A Markov chain has two important properties. The first is the Markov property, which has been discussed earlier: for the sake of emphasis, it states that the probability of transitioning from the current state to a future state depends only on the current state. The second property is that the sum of the probabilities of all transitions out of a given state equals 1.
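In standard notation for a discrete-time chain with transition probabilities p_ij, the two properties read:

```latex
% Markov property: the next state depends only on the current state.
P(X_{t+1} = j \mid X_t = i, X_{t-1} = i_{t-1}, \dots, X_0 = i_0)
  = P(X_{t+1} = j \mid X_t = i) = p_{ij}

% Each row of the transition matrix sums to 1.
\sum_{j} p_{ij} = 1 \quad \text{for every state } i
```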

Markov chains are often used to find the stationary distribution, which gives the long-run probability of being in each state.
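One simple way to approximate the stationary distribution of the example chain above is to raise its transition matrix to a high power (a sketch; this works here because the example chain is irreducible and aperiodic):

```python
import numpy as np

P = np.array([
    [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.9],
    [0.5, 0.5, 0.0],
])

# For an irreducible, aperiodic chain, every row of P^n converges to the
# stationary distribution as n grows.
Pn = np.linalg.matrix_power(P, 100)
print(Pn[0])  # approximately [0.263, 0.263, 0.474], i.e. [5/19, 5/19, 9/19]
```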

Why do we need Markov Chains?

Markov chains may be modelled with computer simulations that imitate sequential logic, and random walks are a classic illustration of how they are used in mathematics. They are commonly used in economics, game theory, queueing (communication) theory, biology, and finance, and they appear extensively in statistical and information-theoretic contexts. While Markov chains with a state space of any size can be studied, the initial theory and most applications focus on examples with a finite (or countably infinite) number of states.

The Markov property is particularly useful for simplifying complex real-life processes, as it relies on only two states: the state at time t and the state at time t+1.

Markov Chain Machine Learning Questions & Answers

Let’s dive into a few interview questions pertinent to Markov chains. Try to answer each one in your head before reading the answer.

What are some of the applications of Markov Chains?

Markov chains can be used to model the expected movement of stock prices, and they appear in meteorology, genetics, information theory, search engines, speech recognition, cryptography, and so on.

How does Markov Chain’s stochastic process differ from the general stochastic process?

A Markov chain is a type of stochastic process that differs from a general stochastic process in that it must be “memoryless”. In other words, the steps that led up to the current state have no bearing on (the likelihood of) future transitions. This is known as the Markov property.

What are the classes of Markov Chains?

A Markov chain can be irreducible, and its states can be periodic, recurrent, transient, or absorbing.

  • If we can get from any state to any other state in one or more steps, the Markov chain is said to be irreducible.
  • A state in a Markov chain is said to be periodic if returns to it are only possible in numbers of steps that are multiples of some integer greater than 1.
  • A state is considered transient if there is a non-zero probability that the chain will never return to it; otherwise, it is recurrent.
  • If there is no way to leave a state once it is entered, the state is termed absorbing: absorbing states have no outgoing transitions (see the sketch after this list).
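As a rough illustration of two of these definitions, here is a sketch for detecting absorbing states and testing irreducibility (the example matrix is made up):

```python
import numpy as np

def absorbing_states(P):
    # A state is absorbing if it transitions to itself with probability 1.
    return [i for i in range(len(P)) if P[i, i] == 1.0]

def is_irreducible(P):
    # A chain is irreducible if every state can reach every other state.
    # (I + P)^(n-1) has a positive (i, j) entry iff j is reachable from i.
    n = len(P)
    reach = np.linalg.matrix_power(np.eye(n) + P, n - 1)
    return bool(np.all(reach > 0))

P = np.array([
    [0.5, 0.5, 0.0],
    [0.2, 0.3, 0.5],
    [0.0, 0.0, 1.0],   # state 2 is absorbing
])
print(absorbing_states(P))   # [2]
print(is_irreducible(P))     # False: nothing escapes state 2
```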

What information will Markov analysis provide?

The most evident information available from Markov analysis is the likelihood of being in a given state at some future time period, which is also the type of information available from a decision tree. These future state probabilities can be calculated using matrix algebra.
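A sketch of that matrix algebra, reusing the example chain from earlier (the initial distribution and horizon are arbitrary choices):

```python
import numpy as np

P = np.array([
    [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.9],
    [0.5, 0.5, 0.0],
])

pi0 = np.array([1.0, 0.0, 0.0])            # start in state A with certainty
pi5 = pi0 @ np.linalg.matrix_power(P, 5)   # distribution after 5 steps
print(pi5)  # probabilities of being in A, B, C at time 5
```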

What are the types of Markov Chains?

  • Discrete-time Markov chains: the index set T (the set of times at which the process is observed) is countable, so state changes occur at discrete time steps (t = 0, 1, 2, …).
  • Continuous-time Markov chains: in a CTMC, the index set T is a continuum, so the state can change at any instant in time (see the sketch below).
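To make the contrast concrete, here is a minimal sketch of simulating a CTMC from a rate matrix Q (the two-state chain and its rates are made up for illustration):

```python
import numpy as np

# Rate matrix Q for a 2-state CTMC: off-diagonal entries are transition
# rates, and each diagonal entry makes its row sum to zero.
Q = np.array([
    [-2.0,  2.0],
    [ 1.0, -1.0],
])

rng = np.random.default_rng(seed=1)
state, t = 0, 0.0
for _ in range(5):
    rate = -Q[state, state]
    t += rng.exponential(1.0 / rate)     # holding time is exponentially distributed
    probs = Q[state].clip(min=0) / rate  # jump probabilities to the other states
    state = rng.choice(2, p=probs)
    print(f"t = {t:.3f}: jumped to state {state}")
```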

What are the assumptions of Markov Chains?

  • The system has a finite number of states.
  • The states are mutually exclusive and collectively exhaustive.
  • The probability of transitioning from one state to another remains constant over time (the chain is time-homogeneous).

What are the advantages of Markov chains?

  • A Markov chain is fairly simple to estimate from sequential data.
  • It does not require deep insight into the underlying mechanism of change.
  • The Markov chain is quite informative: it can identify the parts of a process where we are deficient and allow us to make targeted improvements.
  • The computational requirements are low to modest, even for large systems.

What is the difference between a transition probability and an emission probability in Markov Chain?

The transition probability is simply the probability of moving from one state to another. The emission probability, on the other hand, is the probability of making a given observation while in a particular state, as in hidden Markov models.
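A small sketch in a hidden-Markov-model setting (all probabilities below are toy values chosen for illustration):

```python
import numpy as np

states = ["Noun", "Verb"]

# Transition probabilities between hidden states.
A = np.array([
    [0.3, 0.7],   # Noun -> Noun, Noun -> Verb
    [0.8, 0.2],   # Verb -> Noun, Verb -> Verb
])

# Emission probabilities: P(observed word | hidden state).
B = {
    "Noun": {"I": 0.6, "love": 0.4},
    "Verb": {"I": 0.1, "love": 0.9},
}

# Joint probability of the hidden path (Noun, Verb) emitting ("I", "love"),
# assuming the chain starts in "Noun" with probability 1.
p = 1.0 * B["Noun"]["I"] * A[0, 1] * B["Verb"]["love"]
print(p)  # 0.6 * 0.7 * 0.9 = 0.378
```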

What is a stationary distribution in Markov chains?

A stationary distribution in Markov chains is one where the probability distribution over states remains unchanged as the chain evolves; in other words, the Markov chain has converged. Every finite Markov chain has at least one stationary distribution, which is useful for determining the probability of being in a given state irrespective of time.
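Formally, a distribution π over the states is stationary when one step of the chain leaves it unchanged:

```latex
\pi P = \pi, \qquad \sum_i \pi_i = 1, \qquad \pi_i \ge 0
```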

Wrap Up

To summarize all the above, the importance of Markov chains for machine learning is quite clear. Markov chains can be used in a diverse range of problem spaces, including search and cryptography, and they are fairly easy to interpret, which makes them desirable in use cases where explainability is a requirement. Markov chains therefore deserve the attention of any ML developer.

Avi Arora

Avi is a Computer Science student at the Georgia Institute of Technology pursuing a Masters in Machine Learning. He is a software engineer working at Capital One, and the co-founder of the company Octtone. His company creates software products in the Health & Wellness space.