-
Book Overview & Buying
-
Table Of Contents
-
Feedback & Rating

Machine Learning Algorithms

In some specific scenarios, the dataset can be structured like a matrix, where the rows represent a category and the columns represent another category. For example, let's suppose we have a set of feature vectors representing the preference (or rating) that a user expressed for a group of items. In this example, we can randomly create such a matrix, forcing 50% of ratings to be null (this is realistic considering that a user never rates all possible items):
import numpy as np nb_users = 100 nb_products = 150 max_rating = 10 up_matrix = np.random.randint(0, max_rating + 1, size=(nb_users, nb_products)) mask_matrix = np.random.randint(0, 2, size=(nb_users, nb_products)) up_matrix *= mask_matrix
In this case, we are assuming that 0 means that no rating has been provided, while a value bounded between 1 and 10 is an actual rating. The resulting matrix is shown in the following screenshot:
User-product matrix for the biclustering example
Given such a matrix, it's possible to assume...