Video Recommendation - Machine Learning System Design
Requirements
Maximize users' engagement and recommend new types of content to users.
Metrics
Offline metrics
Use precision, recall, ranking loss, and log loss.
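As a minimal sketch of how three of these offline metrics could be computed, the snippet below evaluates precision, recall, and log loss on a toy set of labels. The function names, the 0.5 decision threshold, and the sample data are illustrative assumptions, not part of the original design; ranking loss is left out because its definition depends on the ranking objective chosen.

```python
import math

def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = watched, 0 = not watched)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

# Toy data (made up for illustration): true watch labels and predicted probabilities.
y_true = [1, 0, 1, 1, 0]
y_prob = [0.9, 0.6, 0.4, 0.8, 0.1]
# Assumed threshold of 0.5 to turn probabilities into hard predictions.
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

precision, recall = precision_recall(y_true, y_pred)
loss = log_loss(y_true, y_prob)
```

Precision and recall need a thresholded prediction, while log loss is computed directly on the predicted probabilities, so it also penalizes predictions that are confident and wrong.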
Online metrics
Use A/B testing to compare click-through rate, watch time, and conversion rate.
Aim for reasonable precision and high recall.
Training
Ideally, we want to retrain the model several times a day to capture temporal changes.
Inference
The latency is between 100 ms and 200 ms.
Find the balance between exploitation (relevant content) and exploration (fresh, new content).
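One common way to trade relevance against fresh content is an epsilon-greedy mix. The sketch below is only an illustration of that idea, not the method prescribed by this design: the function name `mix_recommendations`, the `epsilon` value, and the two input lists are all assumptions.

```python
import random

def mix_recommendations(ranked, fresh, k, epsilon=0.1, rng=None):
    """Fill k slots: with probability epsilon take a random fresh video
    (exploration), otherwise take the next best ranked video (exploitation)."""
    rng = rng or random.Random()
    ranked_pool = list(ranked)
    ranked_set = set(ranked)
    # Keep only fresh videos that are not already in the ranked list.
    fresh_pool = [v for v in fresh if v not in ranked_set]
    result = []
    while len(result) < k and (ranked_pool or fresh_pool):
        explore = bool(fresh_pool) and rng.random() < epsilon
        if explore or not ranked_pool:
            pick = fresh_pool.pop(rng.randrange(len(fresh_pool)))
        else:
            pick = ranked_pool.pop(0)
        result.append(pick)
    return result

# Hypothetical inputs: videos sorted by relevance, plus newly uploaded videos.
ranked_videos = ["v1", "v2", "v3", "v4"]
fresh_videos = ["n1", "n2"]
feed = mix_recommendations(ranked_videos, fresh_videos, k=4, epsilon=0.2,
                           rng=random.Random(0))
```

Setting `epsilon=0` returns the ranked list unchanged, and raising it gives more slots to fresh videos at the cost of relevance.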
Infra
User/Video DB
↓ (millions)
Candidate Generation Service
↓ (hundreds)
Ranking Service ← Video Features
↓ (dozens)
Users
The candidate generation model finds relevant videos based on the user's watch history and the types of videos the user has watched.
The ranking model optimizes for view likelihood, i.e., videos with a high watch probability should be ranked high. This is a natural fit for logistic regression.
Candidate Generation Model
Ranking model
Sort the video candidates by predicted watch probability.
Feature engineering can use embeddings.
The sigmoid function outputs a value in the range (0, 1), which represents the watch probability used to recommend videos. So we use a sigmoid activation at the last layer of our ranking model.
When using a deep learning architecture, we can use ReLU (Rectified Linear Unit) as the activation function for hidden layers. The loss function can be cross-entropy loss.
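The pieces described above (ReLU hidden layer, sigmoid output, cross-entropy loss) can be sketched as a tiny forward pass in plain Python. The layer sizes, weights, and feature values below are made up for illustration; a real ranking model would learn its weights with a deep learning framework.

```python
import math

def relu(x):
    return max(0.0, x)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_watch_probability(features, w_hidden, b_hidden, w_out, b_out):
    """One ReLU hidden layer followed by a sigmoid output unit."""
    hidden = [relu(sum(w * x for w, x in zip(row, features)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    logit = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return sigmoid(logit)

def binary_cross_entropy(label, prob, eps=1e-12):
    """Cross-entropy loss for a single (label, predicted probability) pair."""
    prob = min(max(prob, eps), 1 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))

# Toy parameters (assumed, not trained): 2 input features, 2 hidden units.
w_hidden = [[0.5, -0.2], [-0.3, 0.8]]
b_hidden = [0.0, 0.1]
w_out = [1.0, -1.0]
b_out = 0.0

features = [1.0, 2.0]  # hypothetical user/video feature vector
prob = predict_watch_probability(features, w_hidden, b_hidden, w_out, b_out)
loss_if_watched = binary_cross_entropy(1, prob)
loss_if_skipped = binary_cross_entropy(0, prob)
```

With no hidden layer, the same sigmoid output and cross-entropy loss reduce to logistic regression.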
Flow
When a user requests video recommendations, the Application Server requests video candidates from the Candidate Generation Model. Once it receives the candidates, it passes the candidate list to the ranking model to be sorted. The ranking model estimates the watch probability of each candidate and returns the sorted list to the Application Server. The Application Server then returns the top videos that the user should watch.
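The request flow can be sketched end to end as follows. Everything here is a simplified assumption: candidate generation is reduced to a category match against the watch history, the ranking model is replaced by a lookup of made-up scores, and the names `generate_candidates`, `rank_candidates`, and `recommend` are hypothetical.

```python
def generate_candidates(watch_history, catalog, limit=100):
    """Stand-in for the Candidate Generation Model: unwatched videos that
    share a category with something the user has already watched."""
    watched = set(watch_history)
    liked_categories = {catalog[v] for v in watch_history if v in catalog}
    candidates = [v for v, category in catalog.items()
                  if category in liked_categories and v not in watched]
    return candidates[:limit]

def rank_candidates(candidates, score_fn):
    """Stand-in for the ranking model: sort by estimated watch probability."""
    return sorted(candidates, key=score_fn, reverse=True)

def recommend(watch_history, catalog, score_fn, top_k=10):
    """Stand-in for the Application Server: candidates, then ranking, then top-k."""
    candidates = generate_candidates(watch_history, catalog)
    ranked = rank_candidates(candidates, score_fn)
    return ranked[:top_k]

# Hypothetical catalog (video id -> category) and watch probabilities.
catalog = {"v1": "cooking", "v2": "cooking", "v3": "sports", "v4": "cooking"}
watch_probability = {"v2": 0.3, "v4": 0.9}

top_videos = recommend(
    watch_history=["v1"],
    catalog=catalog,
    score_fn=lambda video: watch_probability.get(video, 0.0),
    top_k=2,
)
```

Keeping the two stages as separate functions mirrors the funnel in the Infra section: a cheap filter narrows the catalog first, so the more expensive scoring only runs on a short candidate list.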