Rekha Project

  • How machine learning powers Facebook’s News Feed ranking algorithm

    Designing a personalized ranking system for more than 2 billion people (all with different interests) and a plethora of content to select from presents significant, complex challenges. This is something we tackle every day with News Feed ranking. Without machine learning (ML), people’s News Feeds could be flooded with content they don’t find as relevant or interesting, including overly promotional content or content from acquaintances who post frequently, which can bury the content from the people they’re closest to. Ranking exists to help solve these problems, but how can you build a system that presents so many different types of content in a way that’s personally relevant to billions of people around the world? We use ML to predict which content will matter most to each person to support a more engaging and positive experience. Models for meaningful interactions and quality content are powered by state-of-the-art ML, such as multitask learning on neural networks, embeddings, and offline learning systems. We are sharing new details of how we designed an ML-powered News Feed ranking system.

    Now let’s review how the aggregator works:

    Query inventory.

    • Purpose: We first need to collect all the candidate posts we can possibly rank for Juan (the cocker spaniel picture, the running video, etc.).
    • Membership: The eligible inventory includes any non-deleted post shared with Juan by a friend, Group, or Page that he is connected to that was made since his last login. But what about posts created before Juan’s last login that he hasn’t seen yet? Maybe these were higher quality or more relevant than the newer posts, but he simply didn’t have time to look at them.
    • Contact: Rekha, Johnnes
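
    The membership rule above can be sketched as a simple filter. This is a minimal illustration, not the production query: the `Post` fields, the source IDs, and the dates are hypothetical, and it applies only the "made since his last login" rule, so older unseen posts are not reconsidered here.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Post:
    post_id: str
    source_id: str        # friend, Group, or Page that shared the post (hypothetical IDs)
    created_at: datetime
    deleted: bool = False


def query_inventory(posts, connected_sources, last_login):
    """Return candidates: non-deleted, from a connected source, made since last login."""
    return [
        p for p in posts
        if not p.deleted
        and p.source_id in connected_sources
        and p.created_at > last_login
    ]


last_login = datetime(2021, 9, 20, 8, 0)
posts = [
    Post("spaniel_photo", "friend_a", datetime(2021, 9, 21, 9, 30)),
    Post("running_video", "group_runners", datetime(2021, 9, 22, 18, 0)),
    Post("old_post", "friend_a", datetime(2021, 9, 19, 12, 0)),          # before last login
    Post("deleted_post", "friend_a", datetime(2021, 9, 21, 10, 0), deleted=True),
    Post("stranger_post", "page_unknown", datetime(2021, 9, 21, 11, 0)),  # not connected
]
inventory = query_inventory(posts, {"friend_a", "group_runners"}, last_login)
inventory_ids = [p.post_id for p in inventory]
```

    Only the spaniel photo and the running video survive the filter; the other three posts each fail one of the three conditions.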

    Score X_it for Juan for each prediction (Y_ijt).

    • Purpose: Now that we have Juan’s inventory, we score each post using multitask neural nets.
    • Membership: There are many, many features (x_ijtc) we can use to predict Y_ijt, including the type of post, embeddings (i.e., feature representations generated by deep learning models), and what the viewer tends to interact with. To calculate this for more than 1,000 posts for each of billions of users, all in real time, we run these models for all candidate stories in parallel on multiple machines, called predictors.
    • Contact: Adil, Aksahy, Chirang
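
    To make the scoring step concrete, here is a toy sketch of a multitask forward pass fanned out over several workers. Everything specific in it is an assumption for illustration: the task names, the fixed weights, the three-feature input, and the use of threads to stand in for predictor machines. A real system would use trained networks and many more features.

```python
import math
from concurrent.futures import ThreadPoolExecutor

TASKS = ("like", "comment", "share")  # hypothetical prediction targets

# Toy fixed weights: 3 input features -> 2 shared hidden units -> one head per task.
SHARED_W = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
HEAD_W = {"like": [1.0, 0.5], "comment": [-0.5, 1.2], "share": [0.2, -0.3]}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(features):
    """Multitask forward pass: one shared representation, one probability per task."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, features))) for row in SHARED_W]
    return {t: sigmoid(sum(w * h for w, h in zip(HEAD_W[t], hidden))) for t in TASKS}


def score_chunk(chunk):
    """Work done by one 'predictor': score every post in its chunk."""
    return {post_id: predict(features) for post_id, features in chunk}


def score_inventory(inventory, num_predictors=4):
    """Split the inventory into chunks and score the chunks in parallel."""
    items = list(inventory.items())
    chunks = [items[i::num_predictors] for i in range(num_predictors)]
    predictions = {}
    with ThreadPoolExecutor(max_workers=num_predictors) as pool:
        for partial in pool.map(score_chunk, chunks):
            predictions.update(partial)
    return predictions


# Hypothetical feature vectors for ten candidate posts.
inventory = {f"post_{i}": [i % 3, (i * 7) % 5 / 5.0, 1.0] for i in range(10)}
predictions = score_inventory(inventory)
```

    Because each post is scored independently of the others, the work splits cleanly across predictors and the merged result matches a serial run.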

    Calculate a single score out of many predictions:

    • Purpose: Now that we have all the predictions, we can combine them into a single score. To do this, multiple passes are needed to save computational power and to apply rules, such as content type diversity (i.e., content type should be varied so that viewers don’t see redundant content types, such as multiple videos, one after another), that depend on an initial ranking score.
    • Membership: An earlier, lightweight pass narrows the inventory to roughly 500 posts, so later passes rank fewer stories with high recall and can use more powerful neural network models. Pass 1 is the main scoring pass, where each story is scored independently and then all ~500 eligible posts are ordered by score.
    • Contact: Shouvik, Lakshmi
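
    The multi-pass idea can be sketched as follows. This is an illustration under stated assumptions, not the actual ranking code: the text above does not say how predictions are combined, so the weighted sum, the weights, the `cheap_score` field, and the diversity penalty factor are all hypothetical choices.

```python
def combine(predictions, weights):
    """Collapse per-task predictions into one score (weighted sum is an assumption)."""
    return sum(weights[task] * p for task, p in predictions.items())


def rank(posts, weights, pass0_limit=500, diversity_penalty=0.5):
    # Early pass (assumed): a cheap score trims the inventory before the main pass.
    shortlist = sorted(posts, key=lambda p: p["cheap_score"], reverse=True)[:pass0_limit]

    # Pass 1: score each story independently, then order by score.
    scored = sorted(
        ((combine(p["predictions"], weights), p) for p in shortlist),
        key=lambda pair: pair[0],
        reverse=True,
    )

    # Later pass: content type diversity, which depends on the pass 1 scores.
    # A post whose type repeats the previous slot's type has its score discounted.
    remaining = list(scored)
    feed = []
    while remaining:
        prev_type = feed[-1]["type"] if feed else None
        best = max(
            remaining,
            key=lambda pair: pair[0]
            * (diversity_penalty if pair[1]["type"] == prev_type else 1.0),
        )
        remaining.remove(best)
        feed.append(best[1])
    return feed


weights = {"like": 1.0, "comment": 2.0}  # hypothetical weights
posts = [
    {"id": "v1", "type": "video", "cheap_score": 0.9,
     "predictions": {"like": 0.9, "comment": 0.5}},
    {"id": "v2", "type": "video", "cheap_score": 0.8,
     "predictions": {"like": 0.8, "comment": 0.4}},
    {"id": "p1", "type": "photo", "cheap_score": 0.7,
     "predictions": {"like": 0.6, "comment": 0.3}},
    {"id": "t1", "type": "text", "cheap_score": 0.1,
     "predictions": {"like": 0.1, "comment": 0.1}},
]
feed_ids = [p["id"] for p in rank(posts, weights, pass0_limit=3)]
```

    In this toy run the text post is dropped by the early pass, and the photo moves ahead of the second video even though the video scored higher in pass 1, because two videos in a row are penalized.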

Page last updated: 23 Sep 2021, 07:44 PM