This is Part 3 of our “Analyzing the Movielens data” series.
In Part 2, we answered the following by building Juxt flows:
- What is the average rating for each movie broken down by gender?
- What are the top 10 movies that men rate higher than women?
Continuing on, let’s address the next question:
- List only the good movies – the ones that got an average rating of 4.3 or higher
In the process, we’ll go over how to build custom filters using the Select building block.
The logical steps to address this question are:
- Calculate the average rating for every movie title (total aggregate, not broken down by gender)
- Select (filter) only the movies that meet the 4.3 cut-off.
As before, we start by fetching the data from the user DB with the Fetch from User DB module.
The average rating per title can be calculated using the built-in Rollup library module. (Recall that in the last example we used a Pivot Table to break the ratings down further by gender; the problem here is simpler.)
The Rollup module outputs just two parameters – Title (Group by parameter) and Mean-Rating (aggregated feature).
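To make the Rollup step concrete, here is a minimal Python sketch of a group-by-and-average over rating rows. It is not Juxt code: the function `rollup_mean`, the lowercase field names, and the sample rows are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows; in the flow these come from the user DB.
ratings = [
    {"title": "Movie A", "gender": "F", "rating": 5},
    {"title": "Movie A", "gender": "M", "rating": 4},
    {"title": "Movie B", "gender": "F", "rating": 3},
    {"title": "Movie B", "gender": "M", "rating": 4},
    {"title": "Movie B", "gender": "M", "rating": 2},
]


def rollup_mean(rows, group_key, value_key):
    """Group rows by group_key and average value_key within each group."""
    totals = defaultdict(lambda: [0.0, 0])  # group -> [running sum, count]
    for row in rows:
        acc = totals[row[group_key]]
        acc[0] += row[value_key]
        acc[1] += 1
    # One output row per group: the group-by value and the aggregated mean.
    return [
        {group_key: group, "mean-rating": total / count}
        for group, (total, count) in totals.items()
    ]


# Gender is ignored here: this is the total aggregate per title.
mean_ratings = rollup_mean(ratings, "title", "rating")
```

Each output row carries only the two fields described above: the group-by value and the aggregated mean.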
Now we need a mechanism to go over each entry and compare it against our selection criterion: a mean rating of 4.3 or higher.
We use the Select module for that. The Select module takes in the entries row by row and applies the user-specified filter logic to each one. Our logic here is simple, but the same mechanism lets you apply rather sophisticated logic involving multiple parameters.
In addition to the input data, the Select module has two other inputs: Context Parameters, which lets users provide any extra parameters the logic needs, and a drop-down menu for picking the filter.
In our example, we use the filter called good movie selector.
Selector Logic – Juxt uses key-value stores. A Lookup module with the key mean-rating reads the mean rating from each row and passes it to an If True comparator block, which compares it with the preset value from Context Parameters, in this case the number 4.3.
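The selector logic can be sketched in plain Python as a row-level predicate plus a context dictionary. This is only an analogy for the building blocks, not Juxt's implementation: `lookup`, `good_movie_selector`, `select`, and the `cutoff` context key are hypothetical names, and the comparison is assumed to be inclusive because the question asks for "4.3 or higher".

```python
def lookup(row, key):
    """Stand-in for the Lookup block: read one value from a key-value row."""
    return row[key]


def good_movie_selector(row, context):
    """Stand-in for the If True comparator: keep rows at or above the cutoff."""
    return lookup(row, "mean-rating") >= context["cutoff"]


def select(rows, predicate, context):
    """Stand-in for the Select block: apply the predicate row by row."""
    return [row for row in rows if predicate(row, context)]


# Hypothetical Rollup output.
rows = [
    {"title": "Movie A", "mean-rating": 4.5},
    {"title": "Movie B", "mean-rating": 3.0},
    {"title": "Movie C", "mean-rating": 4.3},
]

# The cutoff is passed in as a context parameter rather than hard-coded.
good_movies = select(rows, good_movie_selector, {"cutoff": 4.3})
```

Keeping the cutoff in the context rather than inside the predicate means the same filter can be reused with a different threshold without changing its logic.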
Finally, we render the results as an HTML Data Table.
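For readers who want to see what this rendering step amounts to outside Juxt, here is a rough Python sketch that turns key-value rows into an HTML table. The helper `to_html_table` and the sample row are hypothetical, and the markup is deliberately bare; the actual Data Table block presumably produces something richer.

```python
from html import escape


def to_html_table(rows, columns):
    """Render a list of key-value rows as a minimal HTML table string."""
    header = "".join(f"<th>{escape(col)}</th>" for col in columns)
    body = "".join(
        "<tr>"
        + "".join(f"<td>{escape(str(row[col]))}</td>" for col in columns)
        + "</tr>"
        for row in rows
    )
    return f"<table><tr>{header}</tr>{body}</table>"


# Hypothetical filtered output; escape() guards against markup in titles.
table_html = to_html_table(
    [{"title": "Movie A & Co", "mean-rating": 4.5}],
    ["title", "mean-rating"],
)
```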
A two-minute video of our discussion can be seen here.