Machine Learning Interview Questions
Introduction
In this article you will find essential machine learning interview questions geared towards beginners preparing for job or internship interviews. The questions are general and cover a broad range of topics.
Without further ado, let’s get into the quiz!
Beginner Machine Learning Interview Quiz
#1. Are classification and regression supervised learning tasks? Is community detection categorized as unsupervised learning or supervised learning?
#2. Does dimensionality reduction reduce the number of features or the number of samples in a dataset? Can reducing dimension help us save some computational cost?
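To make question #2 concrete, here is a minimal sketch of one common dimensionality reduction technique, PCA computed through an SVD. The data is synthetic, and the sizes (100 samples, 5 features, 2 retained components) are assumptions chosen only for illustration.

```python
import numpy as np

# Synthetic data: 100 samples (rows) and 5 features (columns). Assumed sizes.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

# Center each feature, then take the SVD of the centered data.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Project onto the first k principal directions (rows of Vt).
k = 2
X_reduced = X_centered @ Vt[:k].T

print(X.shape, "->", X_reduced.shape)
```

The projected array keeps one row per original sample and has only k columns, so any later computation that scales with the number of features has less work to do.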
#3. Are “for loops” efficient in Python machine learning code? What alternative could be used?
#4. Which statement is correct when comparing NumPy arrays to Python lists?
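As a rough illustration of question #3, the sketch below computes a sum of squares twice: once with an explicit Python loop and once as a single vectorized NumPy operation. The function names and the input size are made up for this example, and any speed difference depends on the machine and the workload.

```python
import numpy as np

def sum_of_squares_loop(values):
    """Sum of squares using an explicit Python loop."""
    total = 0.0
    for v in values:
        total += v * v
    return total

def sum_of_squares_vectorized(values):
    """Same computation expressed as one NumPy dot product."""
    arr = np.asarray(values, dtype=float)
    return float(np.dot(arr, arr))

data = np.arange(1000, dtype=float)
loop_result = sum_of_squares_loop(data)
vector_result = sum_of_squares_vectorized(data)
```

Both functions return the same value; the vectorized version hands the iteration to compiled NumPy code instead of running it in the Python interpreter.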
#5. If a square matrix A has dimensions (n × n) and linearly independent rows and columns, which of the following is False?
#6. Does the dot product represent the correlation between any two vectors? Is element-wise multiplication of two vectors the same as the dot product of two vectors?
#7. How many parameters does a Gaussian distribution have? If two Gaussian random variables are independent of each other, is their sum also Gaussian?
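For question #6, a small sketch with two arbitrary example vectors shows what each operation returns in NumPy. The vectors are assumptions picked for easy arithmetic.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# Element-wise multiplication returns a vector of pairwise products.
elementwise = a * b

# The dot product sums those pairwise products into a single scalar.
dot = np.dot(a, b)

print(elementwise, dot)
```

Summing the element-wise result gives the dot product, so the two operations are related but do not return the same kind of object.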
#8. Which statement is incorrect?
1. Supervised learning needs both the Dataset (x) and the Labels (y)
2. Unsupervised learning finds the relationship between a data point and its label
3. Supervised learning needs a relationship between input (x) and output (y)
4. Unsupervised learning finds similarity among the data points
#9. Which of the following equations are correct?
p(X) = Σₓ p(X, Y)
p(X, Y) = p(X | Y) p(X)
p(X) = Σᵧ p(X, Y)
#10. Which of the following expressions is still a Gaussian random variable? In the choices, X₁ and X₂ are independent Gaussian random variables, while A, B and b are constant matrices or vectors.
1. X₁ + X₂
2. AX₁ + b
3. AX₁ + BX₂
4. AX₁ + BX₂ + b
#11. You are trying to send a telegram about the weather in San Francisco. The weather has an equal probability of being Sunny, Cloudy, Rainy, or Windy (uniform, each with probability 1/4). What is the expected number of bits needed to encode this message, that is, what is the entropy limit?
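The entropy in question #11 can be checked numerically. This is a minimal sketch using the Shannon entropy formula H = -Σ p log₂ p; the helper name entropy_bits is made up for this example.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Four equally likely weather states: Sunny, Cloudy, Rainy, Windy.
weather = [0.25, 0.25, 0.25, 0.25]
h = entropy_bits(weather)
print(h)
```

For a uniform distribution over four outcomes the result equals log₂ 4.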
#12. Mutual information quantifies what?
#13. What is measured by Mutual Information? Do we usually look for a feature with high mutual information or a low one?
#14. How can we optimize the parameters of a non-probabilistic model with equality constraints? How do we optimize it with inequality constraints? What approach would we take to optimize a probabilistic model?
#15. When implementing an EM algorithm for a Gaussian Mixture Model (GMM), in the E step we:
#16. Is the area under the density curve of a mixture model equal to 1? Does a GMM perform soft assignment?
#17. Is the Gaussian Mixture Model (GMM) a supervised or unsupervised learning method? In a GMM, each data point may belong to ___?
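As background for the GMM questions above (#15 to #17), here is a minimal sketch of how per-point membership weights (often called responsibilities) can be computed for a one-dimensional mixture of two Gaussians. The data points, mixture weights, means, and standard deviations are all assumed values for illustration, and the function names are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def responsibilities(x, weights, means, stds):
    """Return an array of shape (n_points, n_components) whose rows sum to 1."""
    # Weighted density of every point under every component (broadcasting).
    weighted = weights * gaussian_pdf(x[:, None], means, stds)
    # Normalize per point so each row is a distribution over components.
    return weighted / weighted.sum(axis=1, keepdims=True)

x = np.array([-2.0, 0.0, 2.0])      # assumed data points
weights = np.array([0.5, 0.5])      # assumed mixing weights
means = np.array([-2.0, 2.0])       # assumed component means
stds = np.array([1.0, 1.0])         # assumed component standard deviations

resp = responsibilities(x, weights, means, stds)
print(resp)
```

Each row gives fractional membership of one point in each component, so a point halfway between the two means is split evenly while points near a mean lean strongly towards that component.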
#18. Assume the points in the image above have been manually assigned to their correct cluster. Which clustering algorithm, if any, would result in the same or similar classification of points?
#19. Which of the following is true for Single-link distance measurement?
A. It avoids chaining
B. It is sensitive to outliers
#20. In the figure below, if you draw a horizontal line for y = 2, what would be the total number of clusters formed?