12 Unique ML Interview Questions on Linear Discriminant Analysis

Linear Discriminant Analysis vs. Principal Component Analysis, compared visually

Introduction

Not interested in background on Linear Discriminant Analysis? Skip ahead to the interview questions section below.

In this article, you will learn the theory behind LDA and how it differs from PCA, explained in a simplified manner.

By the end of this read, you should:

  • Understand the theory behind LDA
  • Know the difference between LDA and PCA
  • Know when to use LDA and what to use it for
  • Feel more confident taking on LDA questions in interviews.

Let’s begin by understanding LDA.

What is Linear Discriminant Analysis (LDA)?

LDA is a statistical method that finds the linear combination of features that best separates the data into classes. In other words, it looks for the linear boundary that most cleanly divides the classes. It is one of several discriminant analysis methods. LDA is also used for dimensionality reduction, and its simplicity and robustness make it well suited to classification problems.

Why is Linear Discriminant Analysis important?

Some of the reasons LDA is important include:

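  • It is versatile: the same model can serve as a classifier, a dimensionality reduction step, and a feature extractor
  • It handles multi-class problems directly, whereas basic logistic regression is binary and needs a multinomial or one-vs-rest extension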
  • It can be used for dimensionality reduction of features
  • It can be used for extracting features in face detection models.

How do LDAs work?

LDA attempts to maximize the separation between the projected class means while minimizing the variance within each projected class.

LDA works in a similar way to PCA. The aim of an LDA algorithm is to find the linear combination of features that gives the maximum separation between the groups present.

It calculates the discriminant scores as a linear combination of weights and centred data points. These weights are taken from eigenvectors. Unlike PCA, the eigenvectors are not calculated from the covariance matrix of the data but from the product of the inverse of the within-group scatter matrix and the between-group scatter matrix.
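As a rough illustration of the paragraph above, here is a minimal sketch of two-class LDA in two dimensions using only the standard library. For two classes the leading eigenvector reduces to the inverse within-group scatter matrix applied to the difference of the class means, which is what the sketch computes. The toy points and every function name are hypothetical and not taken from any library.

```python
# Minimal two-class LDA sketch in 2-D (toy data and names are hypothetical).

def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

def scatter(points, centre):
    """2x2 scatter matrix: sum of outer products of the centred points."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - centre[0], p[1] - centre[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def solve_2x2(a, b):
    """Solve a @ x = b for a 2x2 matrix a using Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [
        (a[1][1] * b[0] - a[0][1] * b[1]) / det,
        (a[0][0] * b[1] - a[1][0] * b[0]) / det,
    ]

def lda_weights(class_a, class_b):
    """Weights w = S_W^{-1} (mean_b - mean_a) for the two-class case."""
    m_a, m_b = mean(class_a), mean(class_b)
    s_a, s_b = scatter(class_a, m_a), scatter(class_b, m_b)
    s_w = [[s_a[i][j] + s_b[i][j] for j in range(2)] for i in range(2)]
    return solve_2x2(s_w, [m_b[0] - m_a[0], m_b[1] - m_a[1]])

def discriminant_scores(points, weights, centre):
    """Linear combination of the weights and the centred data points."""
    return [
        weights[0] * (p[0] - centre[0]) + weights[1] * (p[1] - centre[1])
        for p in points
    ]

class_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.0)]
class_b = [(6.0, 5.0), (7.0, 8.0), (8.0, 6.0), (7.0, 6.0)]

weights = lda_weights(class_a, class_b)
grand_mean = mean(class_a + class_b)
scores_a = discriminant_scores(class_a, weights, grand_mean)
scores_b = discriminant_scores(class_b, weights, grand_mean)
```

On this toy data every score in `scores_a` is negative and every score in `scores_b` is positive, so the projected classes do not overlap.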

Linear Discriminant Analysis ML Interview Questions/Answers

Try to answer each question in your head before reading the answer.

What is the most suitable application of LDA?

LDA is suitable for linearly separable data, i.e., datasets where a line (or a hyperplane, in higher dimensions) can effectively separate the classes. You can also use it when you need a simple classifier that is easy to explain, or as a dimensionality reduction step.

What is the difference between PCA and LDA?

Unlike Linear Discriminant Analysis (LDA), Principal Component Analysis (PCA) captures the directions of greatest variation in the data points, but it ignores class labels and so does not necessarily separate the classes well.

PCA finds linear combinations of features that account for as much variability as possible. LDA instead maximizes the separation between two or more groups, i.e., it tries to increase the distance between the groups relative to the spread within them.
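To make the contrast concrete, the following sketch compares the first PCA direction with the two-class LDA direction on a hypothetical toy dataset in which both classes are stretched along x but differ only in y. The data, the helper names, and the closed-form 2x2 formulas are assumptions chosen for illustration.

```python
import math

# Hypothetical toy data: large spread along x, class difference along y.
class_a = [(0.0, 0.0), (2.0, 0.2), (4.0, -0.2), (6.0, 0.0)]
class_b = [(0.0, 1.0), (2.0, 1.2), (4.0, 0.8), (6.0, 1.0)]

def mean(points):
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

def scatter(points, centre):
    """2x2 scatter matrix: sum of outer products of the centred points."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - centre[0], p[1] - centre[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

# PCA ignores the labels: take the leading eigenvector of the total scatter.
# For a symmetric 2x2 matrix its angle is 0.5 * atan2(2 * s_xy, s_xx - s_yy).
everything = class_a + class_b
s_t = scatter(everything, mean(everything))
angle = 0.5 * math.atan2(2.0 * s_t[0][1], s_t[0][0] - s_t[1][1])
pca_direction = [math.cos(angle), math.sin(angle)]

# LDA uses the labels: w = S_W^{-1} (mean_b - mean_a) in the two-class case.
m_a, m_b = mean(class_a), mean(class_b)
s_a, s_b = scatter(class_a, m_a), scatter(class_b, m_b)
s_w = [[s_a[i][j] + s_b[i][j] for j in range(2)] for i in range(2)]
diff = [m_b[0] - m_a[0], m_b[1] - m_a[1]]
det = s_w[0][0] * s_w[1][1] - s_w[0][1] * s_w[1][0]
w = [
    (s_w[1][1] * diff[0] - s_w[0][1] * diff[1]) / det,
    (s_w[0][0] * diff[1] - s_w[1][0] * diff[0]) / det,
]
norm = math.hypot(w[0], w[1])
lda_direction = [w[0] / norm, w[1] / norm]
```

Here `pca_direction` points almost entirely along x, where the data varies most, while `lda_direction` points almost entirely along y, where the classes actually differ.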

How does LDA calculate its maximum separation? 

The goal is for there to be little variance within the classes and more variance between the classes. LDA tries to reduce the distance between data points in the same class while increasing the distance between data points in different classes. The ratio of the between-class variance to the within-class variance should be as large as possible to ensure separation. Put simply, the class means should be far from each other and the data points should be close to their own class mean.

Linear Discriminant Analysis computes its weights from the eigenvectors of the product of the inverse of the within-group scatter matrix and the between-group scatter matrix, and then combines the features with those weights into a linear combination to get the discriminant scores.
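The description above can be written compactly in the standard textbook form. The notation is an assumption introduced here, not taken from the article: $x_i$ are the data points, $C_k$ is the set of points in class $k$, $n_k$ is its size, $\mu_k$ is its mean, $\mu$ is the overall mean, and $w$ is the weight vector.

```latex
\begin{align}
S_W &= \sum_{k=1}^{K} \sum_{i \in C_k} (x_i - \mu_k)(x_i - \mu_k)^{\top}
      && \text{(within-group scatter)} \\
S_B &= \sum_{k=1}^{K} n_k \, (\mu_k - \mu)(\mu_k - \mu)^{\top}
      && \text{(between-group scatter)} \\
J(w) &= \frac{w^{\top} S_B \, w}{w^{\top} S_W \, w}
      && \text{(ratio to maximize)} \\
S_W^{-1} S_B \, w &= \lambda \, w
      && \text{(eigenproblem solved by the maximizer)} \\
z_i &= w^{\top} (x_i - \mu)
      && \text{(discriminant score of a centred point)}
\end{align}
```

The eigenvector with the largest eigenvalue $\lambda$ gives the most discriminating direction.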

Is LDA a supervised or unsupervised method?

Linear Discriminant Analysis is a supervised learning method because it requires labeled data, in contrast to PCA which is an unsupervised method. It is a classifier so it needs to have predefined labels.

How do you estimate how much each variable contributes to the separation?

To estimate how much a variable contributes to the separation, we inspect the standardized discriminant function coefficients. A variable with a relatively large weight (in absolute value) separates the groups better than the others.
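One common convention, sketched below, standardizes a raw coefficient by multiplying it by the pooled within-class standard deviation of its variable, which makes coefficients comparable across variables measured in different units. The raw weights and the data are hypothetical, and the function names are made up for this example.

```python
import math

def pooled_within_std(classes, feature):
    """Pooled within-class standard deviation of one feature.

    classes is a list of classes, each a list of points (tuples).
    """
    total_ss = 0.0
    n_total = 0
    for points in classes:
        values = [p[feature] for p in points]
        m = sum(values) / len(values)
        total_ss += sum((v - m) ** 2 for v in values)
        n_total += len(values)
    # Divide by N - K: one degree of freedom is lost per class mean.
    return math.sqrt(total_ss / (n_total - len(classes)))

def standardized_coefficients(raw_weights, classes):
    """Scale each raw weight by the pooled within-class std of its feature."""
    return [w * pooled_within_std(classes, j) for j, w in enumerate(raw_weights)]

# Hypothetical data: the second feature is on a much larger scale.
class_a = [(1.0, 100.0), (3.0, 300.0)]
class_b = [(5.0, 200.0), (7.0, 400.0)]
raw_weights = [0.5, 0.01]  # assumed raw discriminant coefficients

standardized = standardized_coefficients(raw_weights, [class_a, class_b])
```

In this example the second variable has the smaller raw weight but the larger standardized coefficient, which is why raw weights alone can be misleading when variables are on different scales.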

What metrics can be calculated with LDA?

An LDA classifier can be evaluated with the usual classification metrics, such as:

  • Sensitivity: The ratio of true positives to the sum of true positives and false negatives
  • Specificity: The ratio of true negatives to the sum of true negatives and false positives
  • Accuracy: The ratio of correct predictions (true positives plus true negatives) to all predictions
  • AUC: The area under the ROC curve, which summarizes the overall performance of the classifier over all possible thresholds.
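AUC can be computed directly from discriminant scores without fixing a threshold. The sketch below uses the pairwise-ranking interpretation: the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The scores are hypothetical.

```python
def auc(scores_pos, scores_neg):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical discriminant scores for each true class.
positive_scores = [2.1, 0.4, 1.7, -0.2]
negative_scores = [-1.5, 0.9, -0.3, -2.0]

auc_value = auc(positive_scores, negative_scores)  # 14 of 16 pairs -> 0.875
```

A value of 1.0 means the scores rank every positive above every negative, while 0.5 is what random scores would give on average.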

What do the terms sensitivity and specificity mean?

Sensitivity measures how often the positive class is correctly predicted. It is calculated as the number of true positives divided by the sum of true positives and false negatives.

Specificity, on the other hand, measures how often the negative class is correctly predicted. It is calculated as the number of true negatives divided by the sum of true negatives and false positives.
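The two definitions translate into a few lines of code. This is a minimal sketch with hypothetical labels, where 1 marks the positive class and 0 the negative class.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth == positive:
            fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn

# Hypothetical true labels and predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
sensitivity = tp / (tp + fn)   # 3 / 4
specificity = tn / (tn + fp)   # 4 / 6
accuracy = (tp + tn) / len(y_true)  # 7 / 10
```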

How do you use Linear Discriminant Analysis as a classifier?

  • Split the data into training and validation sets
  • Calculate the discriminant weights and scores with the training data
  • Determine an appropriate cut-off value
  • Use the validation data to calculate discriminant scores with the pre-computed weights from the training data
  • Compare each score with the cut-off to make predictions, then evaluate the model using the metrics.
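The steps above can be sketched end to end for two classes in two dimensions. This is only an illustration under stated assumptions: the data is made up, the split is a fixed slice rather than a random one, and the cut-off is taken as the midpoint between the projected class means, which is one simple choice among several.

```python
def mean(points):
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

def scatter(points, centre):
    """2x2 scatter matrix: sum of outer products of the centred points."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - centre[0], p[1] - centre[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def fit(train_a, train_b):
    """Return (weights, cutoff) learned from the training data only."""
    m_a, m_b = mean(train_a), mean(train_b)
    s_a, s_b = scatter(train_a, m_a), scatter(train_b, m_b)
    s_w = [[s_a[i][j] + s_b[i][j] for j in range(2)] for i in range(2)]
    diff = [m_b[0] - m_a[0], m_b[1] - m_a[1]]
    det = s_w[0][0] * s_w[1][1] - s_w[0][1] * s_w[1][0]
    weights = [
        (s_w[1][1] * diff[0] - s_w[0][1] * diff[1]) / det,
        (s_w[0][0] * diff[1] - s_w[1][0] * diff[0]) / det,
    ]
    # Assumed cut-off: midpoint between the two projected class means.
    cutoff = 0.5 * (dot(weights, m_a) + dot(weights, m_b))
    return weights, cutoff

def predict(point, weights, cutoff):
    """Label 1 (class b) if the score exceeds the cut-off, else 0 (class a)."""
    return 1 if dot(weights, point) > cutoff else 0

# Step 1: split hypothetical data into training and validation sets.
class_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.0), (1.5, 2.5), (3.0, 2.0)]
class_b = [(6.0, 5.0), (7.0, 8.0), (8.0, 6.0), (7.0, 6.0), (6.5, 7.0), (8.0, 7.0)]
train_a, valid_a = class_a[:4], class_a[4:]
train_b, valid_b = class_b[:4], class_b[4:]

# Steps 2 and 3: weights and cut-off from the training data.
weights, cutoff = fit(train_a, train_b)

# Steps 4 and 5: score the validation data with the stored weights, then evaluate.
valid_points = valid_a + valid_b
valid_labels = [0] * len(valid_a) + [1] * len(valid_b)
predictions = [predict(p, weights, cutoff) for p in valid_points]
accuracy = sum(p == t for p, t in zip(predictions, valid_labels)) / len(valid_labels)
```

The weights and the cut-off are computed once from the training set and reused unchanged on the validation set, so the validation accuracy reflects data the model has not seen.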

Can LDA be used as a multi-class classifier? If so how would it work? 

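LDA can be used to separate more than two groups directly, unlike linear models that are inherently binary. With three groups, you’ll have at most two LDA equations (discriminant functions), with the first being the most distinguishing. With four groups, you’ll have at most three, and so on: K groups give at most K − 1 discriminant functions, and never more than the number of features.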
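For classification with several groups, a closely related formulation assigns one linear score function to each class and predicts the class with the highest score. Note that this sketch uses those per-class scores rather than extracting the K − 1 discriminant functions themselves. It assumes a covariance matrix shared by all classes and priors estimated from class frequencies. The data and names are hypothetical.

```python
import math

def mean(points):
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

def scatter(points, centre):
    """2x2 scatter matrix: sum of outer products of the centred points."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - centre[0], p[1] - centre[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def solve_2x2(a, b):
    """Solve a @ x = b for a 2x2 matrix a using Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [
        (a[1][1] * b[0] - a[0][1] * b[1]) / det,
        (a[0][0] * b[1] - a[1][0] * b[0]) / det,
    ]

def fit_multiclass(classes):
    """classes maps label -> list of 2-D points; returns label -> (weights, bias)."""
    n_total = sum(len(points) for points in classes.values())
    k = len(classes)
    means = {label: mean(points) for label, points in classes.items()}
    # Pooled covariance: within-class scatter divided by N - K.
    pooled = [[0.0, 0.0], [0.0, 0.0]]
    for label, points in classes.items():
        s = scatter(points, means[label])
        for i in range(2):
            for j in range(2):
                pooled[i][j] += s[i][j] / (n_total - k)
    model = {}
    for label, points in classes.items():
        m = means[label]
        w = solve_2x2(pooled, m)  # Sigma^{-1} mu_k
        prior = len(points) / n_total
        bias = -0.5 * (w[0] * m[0] + w[1] * m[1]) + math.log(prior)
        model[label] = (w, bias)
    return model

def predict(model, point):
    """Return the label whose linear score is highest for this point."""
    def score(label):
        w, bias = model[label]
        return w[0] * point[0] + w[1] * point[1] + bias
    return max(model, key=score)

classes = {
    "a": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    "b": [(5.0, 0.0), (6.0, 0.0), (5.0, 1.0), (6.0, 1.0)],
    "c": [(0.0, 5.0), (1.0, 5.0), (0.0, 6.0), (1.0, 6.0)],
}
model = fit_multiclass(classes)
labels = [predict(model, p) for p in [(0.8, 0.2), (5.2, 0.8), (0.3, 4.9)]]
```

Each new point is assigned to the group whose centre it is closest to after accounting for the shared spread of the data.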

What are some things LDA can be used for?

LDA is best suited to classification, but it can also be used for dimensionality reduction and feature extraction.

Can Linear Discriminant Analysis be used for clustering?

No, it cannot. Clustering is an unsupervised learning approach while LDA is a supervised learning algorithm. Unlike clustering, LDA is used when the classes or labels are known. 

What are the limitations of LDA?

  • It requires the classes to be linearly separable and is not applicable to nonlinear problems
  • It assumes your data has a normal distribution
  • It doesn’t perform well with imbalanced data.
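The linearity limitation is easy to see on a hypothetical XOR-style layout, where the two classes sit on opposite corners of a square. Both class means coincide, so the difference of means that LDA relies on is zero and no separating direction can be found.

```python
# Hypothetical XOR-style data: no single line separates the two classes.
class_a = [(0.0, 0.0), (1.0, 1.0)]
class_b = [(0.0, 1.0), (1.0, 0.0)]

def mean(points):
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

mean_a = mean(class_a)
mean_b = mean(class_b)

# LDA separates classes through the difference of their means;
# here that difference vanishes, so the between-class scatter is zero.
mean_difference = [mean_b[0] - mean_a[0], mean_b[1] - mean_a[1]]
```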

Wrap up

Explainability matters if you’re concerned about how your model makes decisions, and LDA helps you with that. LDA is a simple and explainable discriminant analysis algorithm that helps you with classification tasks, dimensionality reduction, and feature extraction. However, its biggest limitation is that it is not suitable for nonlinear data.

Avi Arora

Avi is a Computer Science student at the Georgia Institute of Technology pursuing a Master’s in Machine Learning. He is a software engineer working at Capital One and the co-founder of the company Octtone. His company creates software products in the Health & Wellness space.