Other Topics — Machine Learning Interview Questions
Introduction
Not interested in background on Collaborative Filtering? Skip to the questions here.
One common task for machine learning is providing recommendations to consumers in an attempt to generate more revenue, increase watch time, or improve some other metric. As such, knowledge of these algorithms is a very valuable skill to possess as a prospective Machine Learning Engineer or Intern.
This article will cover the what, how, and why of Collaborative Filtering. Then, you can test your skills with sample Interview Questions.
Article Overview
- What is Collaborative Filtering?
- How does Collaborative Filtering work?
- Collaborative Filtering Machine Learning Interview Q&A
- Wrap Up
What is Collaborative Filtering?
A recommender system can be built in machine learning using several different methods. One of the most widely used approaches is collaborative filtering. By definition, collaborative filtering is a recommendation technique in which a user’s preferences are predicted from the preferences of similar users. It uses both user and item data, typically in the form of a user-item matrix.
In industry, collaborative filtering is widely used by services such as YouTube, Netflix, Amazon, and Medium.
If your purchase history resembles that of another user who buys a product on Amazon, Amazon can recommend that same product to you. Similarly, YouTube incorporates this technique when recommending a video, Medium uses it when recommending an article, and collaborative filtering plays a huge role in the Netflix movie recommender system.
How does Collaborative Filtering work?
Let’s walk through an example using Medium as a case study. First, they develop a matrix organized in a user-item format. Given our case scenario, let’s call it the user-post matrix.
On Medium, the “items” are typically articles. Each element of the matrix is either a one or a zero, indicating whether a particular user has liked a particular post.
In more detail, if user j has liked article k, the corresponding entry of the user-item matrix is 1. If user j has not seen article k, the entry is 0.
User | P1 | P2 | P3 | P4 | P5 |
U1 | 1 | 0 | 1 | 1 | 1 |
U2 | 1 | 0 | 1 | 0 | 1 |
There are broadly two approaches to collaborative filtering: user-based collaborative filtering and item-based collaborative filtering.
User-based collaborative filtering
The goal here is to find users similar to the target user and to recommend posts that those similar users liked. The similarity between users can be measured with metrics such as the Jaccard distance, cosine similarity, or Hamming distance.
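As a rough illustration, the three measures named above can be computed for the U1 and U2 rows of the user-post matrix. This is a minimal sketch using NumPy; the helper names are made up for this example, and the Jaccard measure is written in its similarity form (one minus the Jaccard distance).

```python
import numpy as np

# Rows of the user-post matrix from the example (1 = liked, 0 = not seen).
u1 = np.array([1, 0, 1, 1, 1])
u2 = np.array([1, 0, 1, 0, 1])

def jaccard_similarity(a, b):
    """Posts liked by both users divided by posts liked by either user."""
    both = np.logical_and(a, b).sum()
    either = np.logical_or(a, b).sum()
    return float(both / either) if either else 0.0

def cosine_similarity(a, b):
    """Cosine of the angle between the two interaction vectors."""
    norm = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / norm) if norm else 0.0

def hamming_distance(a, b):
    """Fraction of posts on which the users disagree (a distance: lower is closer)."""
    return float(np.mean(a != b))

jaccard = jaccard_similarity(u1, u2)  # 3 shared likes / 4 posts liked by either = 0.75
cosine = cosine_similarity(u1, u2)    # 3 / (sqrt(4) * sqrt(3)), about 0.866
hamming = hamming_distance(u1, u2)    # the users differ on 1 of 5 posts = 0.2
```

Note that the first two values grow as users become more alike, while the Hamming value shrinks.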
Say we want to predict the value of a particular entry for a given user. We can use the formula
\hat{response}_{\:U2, \:P1} = \frac{\sum_{i} sim(U2, Ui) \cdot response_{\:Ui, \:P1}}{\sum_{i} sim(U2, Ui)}
In other words, the predicted response of user U2 to post P1 is the sum of the responses to P1 from the similar users Ui, each weighted by that user’s similarity to U2, normalized by the sum of those similarities.
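The similarity-weighted average over similar users could be sketched as follows. The function name, the toy responses, and the similarity values are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def predict_user_based(responses, similarities, target_user, item):
    """Similarity-weighted average of the other users' responses to `item`."""
    others = [u for u in range(responses.shape[0]) if u != target_user]
    weights = similarities[target_user, others]
    total = weights.sum()
    if total == 0:
        return 0.0  # no similar users to learn from
    return float(weights @ responses[others, item] / total)

# Hypothetical data: 3 users x 2 posts and a made-up user-user similarity matrix.
responses = np.array([[1, 0],
                      [1, 1],
                      [0, 0]])
similarities = np.array([[1.0, 0.8, 0.2],
                         [0.8, 1.0, 0.5],
                         [0.2, 0.5, 1.0]])

# User 0 on post 1: (0.8 * 1 + 0.2 * 0) / (0.8 + 0.2) = 0.8
prediction = predict_user_based(responses, similarities, target_user=0, item=1)
```

In practice the sum is often restricted to the most similar users rather than all of them; this sketch keeps every other user for simplicity.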
We used binary indicators in the example above. Now let’s say that users can rate a particular post from 1 to 5 stars, with zero meaning that the user hasn’t seen it.
User | P1 | P2 | P3 | P4 | P5 |
U1 | 3 | 1 | 4 | 1 | 2 |
U2 | 2 | 1 | 4 | 0 | 2 |
To gauge the similarity of these users, we can use metrics suited to numeric ratings, such as Euclidean distance, Manhattan distance, Pearson correlation, or cosine similarity. We then use the same formula as above to predict a response for a particular user.
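For the star ratings of U1 and U2, these metrics might be computed as below. One assumption in this sketch is that only posts rated by both users are compared, so that a zero (not seen) is not treated as a very low rating.

```python
import numpy as np

# Star ratings for U1 and U2; 0 means the user has not seen the post.
u1 = np.array([3, 1, 4, 1, 2], dtype=float)
u2 = np.array([2, 1, 4, 0, 2], dtype=float)

# Assumption: compare only the posts that both users have rated.
co_rated = (u1 > 0) & (u2 > 0)
a, b = u1[co_rated], u2[co_rated]

euclidean = float(np.linalg.norm(a - b))   # distance: lower is more similar
manhattan = float(np.abs(a - b).sum())     # distance: lower is more similar
cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
pearson = float(np.corrcoef(a, b)[0, 1])   # correlation in [-1, 1]
```

Here the two users differ by one star on a single co-rated post, so both distances come out to 1, while the cosine and Pearson values are close to 1.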
Item-based collaborative filtering
Here the recommendation is made by finding similar items based on how users interact with them. For example, items that receive similar ratings from the same group of users are considered similar. To build the item-item similarity matrix, we take the cosine similarity between the item (product or article) columns of the user-item matrix. Medium can use this to predict whether a given user will read a particular article, and Amazon can use it to predict whether a user will buy a particular product.
User | P1 | P2 |
U1 | 1 | 1 |
U2 | 0 | 1 |
U3 | 1 | 1 |
U4 | 1 | 1 |
U5 | 0 | 0 |
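For the five-user, two-post matrix above, the item-item cosine similarity could be computed along these lines. This is a minimal NumPy sketch, and the function name is an assumption made for illustration.

```python
import numpy as np

# User-item matrix: rows are U1..U5, columns are P1 and P2.
user_item = np.array([[1, 1],
                      [0, 1],
                      [1, 1],
                      [1, 1],
                      [0, 0]], dtype=float)

def item_item_cosine(matrix):
    """Cosine similarity between every pair of item columns."""
    norms = np.linalg.norm(matrix, axis=0)
    norms[norms == 0] = 1.0          # avoid dividing by zero for items nobody touched
    normalized = matrix / norms
    return normalized.T @ normalized

item_sim = item_item_cosine(user_item)
# item_sim[0, 1] is the P1-P2 similarity: 3 / (sqrt(3) * sqrt(4)), about 0.866
```

The result is a symmetric 2 x 2 matrix with ones on the diagonal, since every item is perfectly similar to itself.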
To predict a user’s entry from the item-item similarities, the formula is essentially the same as the one used for user-based collaborative filtering. The only change is that the weights are item-item similarities rather than user-user similarities.
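An item-based prediction might then look like the sketch below, where the weights come from an item-item similarity matrix and the average runs over the user's own responses to the other items. The ratings and similarity values are hypothetical.

```python
import numpy as np

def predict_item_based(responses, item_sim, user, item):
    """Average of the user's own responses, weighted by each item's similarity to `item`."""
    others = [i for i in range(responses.shape[1]) if i != item]
    weights = item_sim[item, others]
    total = weights.sum()
    if total == 0:
        return 0.0  # no similar items to learn from
    return float(weights @ responses[user, others] / total)

# Hypothetical data: 2 users x 3 items (0 = not rated) and a made-up item-item matrix.
responses = np.array([[5.0, 3.0, 0.0],
                      [4.0, 2.0, 1.0]])
item_sim = np.array([[1.0, 0.6, 0.4],
                     [0.6, 1.0, 0.1],
                     [0.4, 0.1, 1.0]])

# User 0 on item 2: (0.4 * 5 + 0.1 * 3) / (0.4 + 0.1) = 4.6
prediction = predict_item_based(responses, item_sim, user=0, item=2)
```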
The results can be evaluated with Mean-Squared Error or any regression problem metric.
Collaborative Filtering Machine Learning Questions & Answers
Let’s dive into a few interview questions pertinent to Collaborative Filtering. Try to answer them in your head before reading the answer.
The Hamming distance is a metric that can be used to calculate the similarity between users in a user-based collaborative filtering problem. To make use of Hamming distance (note that it is called distance and not similarity) correctly, our goal is to minimize the result (distance). Basically we maximize similarity and minimize distances.
It’s good to start off from an efficiency perspective. Since the matrices are typically sparse, such that many users don’t interact with many of the items, we have the time complexities:
User-based | Item-based |
O(U·P^2) → O(U·P) | O(U·P) → O(U+P) |
Now, in terms of how they actually work:
User-based filtering can provide better diversity because it compares users rather than products. If user x likes something very different from the items that user j has liked, we can still recommend those items to user j, because user j is similar to user x, not because the items are similar to some post user j has already seen.
The downside is that it is typically more expensive because you have to calculate the K nearest neighbours.
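The nearest-neighbour step could be sketched as below. The choice of cosine similarity, the function name, and the toy matrix are assumptions for illustration. The lookup compares the target user against every other user, which is where the cost comes from.

```python
import numpy as np

def k_nearest_users(user_item, target_user, k):
    """Indices of the k users most similar to target_user by cosine similarity."""
    norms = np.linalg.norm(user_item, axis=1)
    norms[norms == 0] = 1.0              # users with no interactions stay all-zero
    normalized = user_item / norms[:, None]
    sims = normalized @ normalized[target_user]
    sims[target_user] = -np.inf          # never return the target user themselves
    return np.argsort(-sims)[:k]

# Hypothetical 4 users x 4 items interaction matrix.
user_item = np.array([[1, 1, 0, 0],
                      [1, 1, 1, 0],
                      [0, 0, 1, 1],
                      [1, 0, 0, 0]], dtype=float)

neighbours = k_nearest_users(user_item, target_user=0, k=2)
# User 1 (similarity about 0.82) and user 3 (about 0.71) are closest to user 0.
```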
For item-based filtering, even though the time complexity appears larger here, most of the work can be done offline. In practice, this means that we calculate the item-item similarity matrix beforehand. A benefit is that fewer recalculations are typically needed, because items usually change less than users do. The drawback is a lack of diversity in the recommendations.
Item-based and user-based collaborative filtering both fall under memory-based recommender systems.
Matrix factorization is a technique for decomposing an m×n matrix into the product of an m×k matrix and a k×n matrix. It is used in a variety of applications, including image recognition and recommendation. The matrices in these settings are often sparse, since a user may rate only a small subset of the videos. Dimensionality reduction is another of the many uses of matrix factorization.
Matrix factorization produces latent features whose product describes the interaction between two distinct types of entities. In collaborative filtering, it is used to model the relationship between the “items” and the users.
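To make the shapes concrete, here is a small sketch that factors an m×n matrix into m×k and k×n pieces. It uses a truncated SVD on a fully observed, synthetic matrix, which is an assumption made for simplicity: real rating matrices have missing entries and are usually factored with other methods.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 6, 5, 2

# Synthetic matrix built to have exactly k latent features (hypothetical data).
ratings = rng.random((m, k)) @ rng.random((k, n))

# Truncated SVD: keep only the k largest singular values.
U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
user_factors = U[:, :k] * s[:k]    # m x k: one latent vector per user
item_factors = Vt[:k, :]           # k x n: one latent vector per item

approx = user_factors @ item_factors   # m x n reconstruction from the two factors
```

Because the synthetic matrix has rank k, the two small factors reproduce it almost exactly while storing m·k + k·n numbers instead of m·n.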
When using a Matrix Factorization approach, we are decomposing the large user-item matrix into lower dimensional user and item factors.
In order to learn these factors, we need to minimize a quadratic loss. Alternating Least Squares (ALS) is one technique for optimizing this loss function. It is an iterative procedure: it holds the user factors fixed and solves for the item factors, then holds the item factors fixed and solves for the user factors, getting closer to a factored representation of the original data with each iteration. In that respect it resembles gradient descent, although each step solves a least-squares problem rather than following a gradient.
Check out this stellar resource on ALS from Stanford.
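A compact ALS loop might look like the following. This is a sketch under stated assumptions rather than a reference implementation: it fits only the observed entries, adds a small L2 penalty (`reg`) so each solve is well defined, and uses a hypothetical rating matrix in which 0 marks a missing rating.

```python
import numpy as np

def als(ratings, mask, k=2, reg=0.1, iterations=20, seed=0):
    """Factor ratings (m x n) into users (m x k) and items (n x k) on observed entries."""
    rng = np.random.default_rng(seed)
    m, n = ratings.shape
    users = rng.normal(scale=0.1, size=(m, k))
    items = rng.normal(scale=0.1, size=(n, k))
    penalty = reg * np.eye(k)
    for _ in range(iterations):
        # Hold item factors fixed: each user is a small ridge regression.
        for u in range(m):
            seen = mask[u]
            V = items[seen]
            users[u] = np.linalg.solve(V.T @ V + penalty, V.T @ ratings[u, seen])
        # Hold user factors fixed: each item is a small ridge regression.
        for i in range(n):
            seen = mask[:, i]
            W = users[seen]
            items[i] = np.linalg.solve(W.T @ W + penalty, W.T @ ratings[seen, i])
    return users, items

def observed_mse(ratings, mask, users, items):
    """Mean squared error over the observed entries only."""
    error = (ratings - users @ items.T)[mask]
    return float(np.mean(error ** 2))

# Hypothetical 5 users x 4 items rating matrix; 0 marks a missing rating.
ratings = np.array([[5, 3, 0, 1],
                    [4, 0, 0, 1],
                    [1, 1, 0, 5],
                    [1, 0, 0, 4],
                    [0, 1, 5, 4]], dtype=float)
mask = ratings > 0

users, items = als(ratings, mask)
final_mse = observed_mse(ratings, mask, users, items)
predicted = users @ items.T   # includes predictions for the missing entries
```

Each inner solve is a closed-form least-squares problem, which is why no learning rate is needed.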
The following are all acceptable responses:
- Mean Absolute Error (MAE)
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
- Normalized Mean Absolute Error (NMAE)
- Mean Average Precision @ K
- Mean Average @ K
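A few of these metrics can be sketched in a handful of lines. The ratings, predictions, and item identifiers below are hypothetical. The precision-at-k helper is only the building block behind Mean Average Precision @ K, not the full metric.

```python
import numpy as np

def mae(actual, predicted):
    """Mean Absolute Error between true and predicted ratings."""
    return float(np.mean(np.abs(actual - predicted)))

def rmse(actual, predicted):
    """Root Mean Squared Error between true and predicted ratings."""
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that the user found relevant."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k

actual = np.array([4.0, 2.0, 5.0, 1.0])
predicted = np.array([3.0, 2.0, 4.0, 3.0])

mae_value = mae(actual, predicted)     # (1 + 0 + 1 + 2) / 4 = 1.0
rmse_value = rmse(actual, predicted)   # sqrt((1 + 0 + 1 + 4) / 4), about 1.22
p_at_3 = precision_at_k(["p3", "p1", "p7", "p2"], {"p1", "p2"}, k=3)  # 1 hit in top 3
```

The error metrics suit rating prediction, while the @ K metrics suit ranked recommendation lists.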
- It relies heavily on data.
- As the data changes, so do user preferences, and the business approach has to change with them.
- The model requires constant adjustment and improvement.
The two major data sources for a collaborative filtering model are:
Explicit data: Actively provided data by a user, such as responses to a questionnaire or survey.
Implicit data: Data inferred by a system based on a user’s behavior, such as a preference for categories of articles after viewing/reading multiple articles in the same category and liking, commenting, and sharing.
Collaborative filtering approaches to recommender systems produce new recommendations based entirely on previously recorded interactions between users and products. Content-based techniques take advantage of extra data about users and/or item metadata.
Netflix is a real-world example of an enterprise that uses collaborative filtering (as part of their recommender system structure). They would generate a list of suggested movies by analysing what movies and television shows the user has viewed and comparing it to the viewing habits of other users. This method takes advantage of user behaviour.
Pandora employs a content-based approach. Based on the attributes of a song or performer, it builds a ‘station’ that plays music with comparable qualities. When a user ‘dislikes’ a song, the station’s results are refined by de-emphasizing some attributes; when a user ‘likes’ a song, other attributes are emphasized.
Wrap Up
To summarize, the importance of collaborative filtering for businesses is hard to dispute. It can produce immense value and help improve many different key metrics of success. It plays a vital role, particularly in SaaS products, where the goal is to give the user the highest possible level of satisfaction. Collaborative filtering therefore deserves your attention as an ML developer.