Finding related searches with Spark
Unsupervised learning from user behaviour
When a user navigates a site, they leave a valuable trail of information: what their first search was, what they followed this search with, and so on. Using this data we can learn related searches automatically by co-occurrence counting.
This post takes you through the steps to get from raw search logs to results using the Spark cluster computing framework.
Spark provides a natural language for processing flows of data, and can be scaled out to a cluster when data growth dictates.
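To make the idea of co-occurrence counting concrete, here is a minimal sketch in plain Python rather than Spark, so that it runs without a cluster. The log format (session id, timestamp, query), the sample queries, and the function name `related_searches` are all assumptions made for illustration and are not taken from the real logs. Only consecutive searches within a session are counted as co-occurring, which is one of several reasonable definitions.

```python
from collections import Counter, defaultdict

# Hypothetical log records: (session_id, timestamp, query).
# The real log format is not specified here; this is an assumed shape.
logs = [
    ("s1", 1, "red shoes"),
    ("s1", 2, "red trainers"),
    ("s2", 1, "red shoes"),
    ("s2", 2, "red trainers"),
    ("s2", 3, "running socks"),
    ("s3", 1, "red shoes"),
    ("s3", 2, "red boots"),
]


def related_searches(records, top_n=3):
    """Count how often one search is directly followed by another."""
    # Step 1: group queries by session, ordered by timestamp.
    sessions = defaultdict(list)
    for session_id, _, query in sorted(records, key=lambda r: (r[0], r[1])):
        sessions[session_id].append(query)

    # Step 2: emit a (search, following search) pair for each consecutive
    # pair of queries in a session, and count the pairs.
    pair_counts = Counter()
    for queries in sessions.values():
        for first, following in zip(queries, queries[1:]):
            if first != following:  # ignore repeated identical searches
                pair_counts[(first, following)] += 1

    # Step 3: keep the most frequent followers for each search.
    related = defaultdict(list)
    for (first, following), count in pair_counts.most_common():
        if len(related[first]) < top_n:
            related[first].append((following, count))
    return dict(related)


related = related_searches(logs)
for query, followers in related.items():
    print(query, "->", followers)
```

The three steps (group by session, emit pairs, sum the counts per pair) are the same shape a Spark job would likely take, with the grouping and counting expressed as transformations on a distributed dataset instead of in-memory dictionaries.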