
Machine Learning Algorithms

scikit-learn implements three Naive Bayes variants, each based on a different probability distribution: Bernoulli, Multinomial, and Gaussian. The first is a binary distribution, useful when a feature can be either present or absent. The second is a discrete distribution, used whenever a feature must be represented by a whole number (for example, in NLP, the frequency of a term in a document), while the third is a continuous distribution characterized by its mean and variance.
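The mapping between feature type and variant can be sketched with scikit-learn's actual classes; the random toy data below is purely illustrative:

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, MultinomialNB, GaussianNB

rng = np.random.RandomState(0)
y = rng.randint(0, 2, size=100)  # two classes

# Binary features (present/absent) -> BernoulliNB
X_bin = rng.randint(0, 2, size=(100, 5))
bnb = BernoulliNB().fit(X_bin, y)

# Non-negative integer counts (e.g. term frequencies) -> MultinomialNB
X_cnt = rng.randint(0, 10, size=(100, 5))
mnb = MultinomialNB().fit(X_cnt, y)

# Continuous features -> GaussianNB (learns per-class mean and variance)
X_con = rng.normal(size=(100, 5))
gnb = GaussianNB().fit(X_con, y)

print(bnb.predict(X_bin[:3]), mnb.predict(X_cnt[:3]), gnb.predict(X_con[:3]))
```

All three classes share the same `fit`/`predict` interface, so swapping variants only requires matching the estimator to the feature type.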
If X is a Bernoulli-distributed random variable, it can have only two possible outcomes (for simplicity, let's call them 0 and 1), and their probability is:

P(X = x) = p^x (1 - p)^(1 - x), where x ∈ {0, 1} and p = P(X = 1)
In general, the input vectors x are assumed to be multivariate Bernoulli distributed, with each feature binary and independent of the others. The parameters of the model are learned through a frequency count. Hence, if there are n samples with m features, the probability for the i-th feature is:

P(x_i = 1) = N_(x_i = 1) / n

where N_(x_i = 1) is the number of samples in which the i-th feature takes the value 1.
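The frequency count described above can be sketched in a few lines of NumPy. The tiny dataset and the add-one (Laplace) smoothing term are illustrative assumptions; smoothing is included because scikit-learn's BernoulliNB applies the same correction by default to avoid zero probabilities:

```python
import numpy as np

# Hypothetical toy binary dataset: n = 4 samples, m = 3 binary features
X = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 1, 1],
              [1, 0, 0]])
y = np.array([0, 0, 1, 1])

# Frequency-count estimate of P(x_i = 1 | y = c) per class,
# with Laplace (add-one) smoothing: (count + alpha) / (n_c + 2 * alpha)
alpha = 1.0
probs = {}
for c in np.unique(y):
    Xc = X[y == c]
    probs[c] = (Xc.sum(axis=0) + alpha) / (Xc.shape[0] + 2 * alpha)
    print(f"P(x_i = 1 | y = {c}) =", probs[c])
```

For class 0, feature 0 appears in 2 of 2 samples, so the smoothed estimate is (2 + 1) / (2 + 2) = 0.75; without smoothing it would be exactly 1, which is why the correction matters for unseen feature values.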