Paper ID: 2203.14363

piRank: A Probabilistic Intent Based Ranking Framework for Facebook Search

Zhen Liao

While numerous studies have explored different machine learning approaches to search ranking, most focus on specific pre-defined problems, and only a few have studied a ranking framework that can be applied in a commercial search engine in a scalable way. Meanwhile, existing ranking models are often optimized for normalized discounted cumulative gain (NDCG) or online click-through rate (CTR), and both types of models rest on the assumption that high-quality training data can be easily obtained and generalizes well to unseen cases. In practice at Facebook search, we observed two issues with the training data for our ML models. First, tail query intents are barely covered in our human rating dataset. Second, search click logs are often noisy and hard to clean up for various reasons. To address these issues, in this paper we propose a probabilistic intent based ranking framework (piRank for short), which can: 1) provide a scalable framework to address various ranking issues for different query intents in a divide-and-conquer way; 2) improve system development agility, including iteration speed and system debuggability; 3) combine machine learning and empirically based algorithmic methods in a systematic way. We conducted extensive experiments and studies on top of the Facebook search engine system and validated the effectiveness of this new ranking architecture.

Submitted: Mar 27, 2022