Given a movie as input, the machine learning algorithm returns the top 5 movies most similar to it.
-
Import numpy and pandas
-
This is a content-based movie recommendation system, so we keep only the important columns from the dataset, such as genres, id, keyword, title, etc.
-
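
A minimal sketch of the column selection, assuming the dataset is held in a pandas DataFrame. The sample rows and the `budget` column are made up for illustration; only genres, id, keyword, and title come from the description above.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real dataset.
movies = pd.DataFrame({
    "id": [1, 2],
    "title": ["Movie A", "Movie B"],
    "genres": ["Action Adventure", "Comedy"],
    "keyword": ["space hero", "wedding"],
    "budget": [100, 50],  # extra column, not needed for content-based matching
})

# Keep only the columns that describe the content of a movie.
selected_columns = ["id", "title", "genres", "keyword"]
movies = movies[selected_columns]
```
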
A dictionary stores all the required columns, and the data is cleaned by checking for null values and duplicates.
-
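
The null and duplicate checks could look roughly like this in pandas. The dictionary contents are invented for the example, and dropping the affected rows is an assumption; the original project may handle them differently.

```python
import pandas as pd

# Hypothetical dictionary of columns; the values are made up for illustration.
data = {
    "id": [1, 2, 2, 3],
    "title": ["Movie A", "Movie B", "Movie B", "Movie C"],
    "genres": ["Action", "Comedy", "Comedy", None],
}
movies = pd.DataFrame(data)

# Count the problems before cleaning.
null_counts = movies.isnull().sum()          # null values per column
duplicate_count = movies.duplicated().sum()  # fully duplicated rows

# Drop rows with nulls, then drop duplicate rows.
cleaned = movies.dropna().drop_duplicates().reset_index(drop=True)
```
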
Text Vectorization - Each movie is represented as a vector, and the 5 vectors closest to the given movie's vector are identified.
-
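
One way to picture the vectorization step is a bag-of-words count vector per movie, compared with cosine similarity. This is a sketch in plain NumPy under those assumptions; the tag strings are invented, and the original project may use a different vectorizer or distance measure.

```python
import numpy as np

# Hypothetical tag strings, one per movie (made up for illustration).
tags = [
    "space hero alien battle",
    "space alien invasion",
    "wedding comedy family",
]

# Build a vocabulary and a word-count vector for each movie.
vocabulary = sorted({word for text in tags for word in text.split()})
index = {word: i for i, word in enumerate(vocabulary)}
vectors = np.zeros((len(tags), len(vocabulary)))
for row, text in enumerate(tags):
    for word in text.split():
        vectors[row, index[word]] += 1

# Cosine similarity: dot product of the length-normalized vectors.
norms = np.linalg.norm(vectors, axis=1, keepdims=True)
unit = vectors / norms
similarity = unit @ unit.T
```

Here `similarity[i, j]` is higher when movies `i` and `j` share more words, so the two space movies score closer to each other than either does to the comedy.
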
The stemmer reduces different forms of the same word to a common stem so that they are counted as one word.
-
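
To illustrate what stemming does, here is a deliberately simplified suffix-stripping function. It is not the Porter algorithm or any library stemmer, and the suffix list is an assumption chosen for the example; a real project would more likely rely on an established stemmer.

```python
# Suffixes to strip, checked in this order (an illustrative list, not a full rule set).
SUFFIXES = ("ing", "ed", "es", "s")

def simple_stem(word: str) -> str:
    """Strip one common suffix, keeping at least three characters of stem."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def stem_text(text: str) -> str:
    """Stem every whitespace-separated word in a string."""
    return " ".join(simple_stem(word) for word in text.split())
```

With this toy version, "jumps", "jumped", and "jumping" all reduce to the same stem, so a vectorizer would count them in one column instead of three.
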
The recommend function at the end returns the 5 most similar movies.
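
A hedged sketch of what such a recommend function might look like, assuming a precomputed similarity matrix. The titles, the matrix values, and the `top_n` parameter are hypothetical and not taken from the original code.

```python
import numpy as np

# Hypothetical titles and a precomputed similarity matrix (values made up).
titles = ["Movie A", "Movie B", "Movie C", "Movie D"]
similarity = np.array([
    [1.0, 0.8, 0.1, 0.4],
    [0.8, 1.0, 0.2, 0.3],
    [0.1, 0.2, 1.0, 0.6],
    [0.4, 0.3, 0.6, 1.0],
])

def recommend(title, top_n=5):
    """Return up to top_n titles most similar to the given title."""
    if title not in titles:
        return []
    movie_index = titles.index(title)
    # Sort all movie indices by similarity, highest first.
    ranked = np.argsort(similarity[movie_index])[::-1]
    # Skip the movie itself, then keep the top_n closest.
    return [titles[i] for i in ranked if i != movie_index][:top_n]
```

Calling `recommend("Movie A", top_n=2)` on this toy data returns `["Movie B", "Movie D"]`; an unknown title yields an empty list.
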