Building a Data Pipeline for a Recommender System



Recommendation engines are among the best-known, most widely used, and highest-value applications of machine learning. Yet while many resources cover the basics of training a recommendation model, relatively few explain how to use Nexus to build a data pipeline and actually deploy these models as part of a production-level machine learning ecosystem for a recommender system.


What you’ll build

You will use a Jupyter notebook to build a pipeline to train a recommendation system.

What you’ll learn

  • Set up a Nexus environment.
  • Query the data from Nexus using SPARQL.
  • Shape the data for collaborative filtering.
  • Run a classical collaborative filtering algorithm: matrix factorization.
  • Push the training output to Nexus.
  • Recommend movies by querying the output from Nexus.
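
To give a feel for the matrix factorization step listed above, here is a minimal sketch in plain NumPy. It is not the tutorial's own implementation: the function names (factorize, masked_rmse), the hyperparameters, and the tiny ratings matrix are all assumptions made up for illustration. The idea is to approximate a user-by-movie ratings matrix as the product of two small factor matrices, fitting only the entries that were actually rated.

```python
import numpy as np


def factorize(ratings, mask, n_factors=2, lr=0.01, reg=0.02, n_epochs=1000, seed=0):
    """Fit ratings ~= P @ Q.T on observed entries with batch gradient descent.

    All names and default values here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    # Small random starting factors for users (P) and items (Q).
    P = rng.normal(scale=0.1, size=(n_users, n_factors))
    Q = rng.normal(scale=0.1, size=(n_items, n_factors))
    for _ in range(n_epochs):
        # Error on observed entries only; unrated cells contribute nothing.
        err = mask * (ratings - P @ Q.T)
        # Gradient step with L2 regularization on both factor matrices.
        P_step = err @ Q - reg * P
        Q_step = err.T @ P - reg * Q
        P += lr * P_step
        Q += lr * Q_step
    return P, Q


def masked_rmse(ratings, mask, P, Q):
    """Root-mean-square error over the observed entries."""
    err = mask * (ratings - P @ Q.T)
    return float(np.sqrt((err ** 2).sum() / mask.sum()))


# Toy data invented for this sketch: rows are users, columns are movies,
# and 0 means "not rated".
ratings = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
], dtype=float)
mask = (ratings > 0).astype(float)

P, Q = factorize(ratings, mask)
predicted = P @ Q.T  # includes scores for movies a user has not rated yet
print(np.round(predicted, 2))
print("RMSE on observed ratings:", masked_rmse(ratings, mask, P, Q))
```

The predicted scores in the cells that were 0 in the input are what a recommender would rank to suggest unseen movies. In a pipeline like the one described here, the ratings would come from a Nexus query rather than a hard-coded array, and the factors or predictions would be pushed back to Nexus.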

What you’ll need

A Python environment with Jupyter notebook support.

Get the tutorial code

This tutorial code is available: