-
Book Overview & Buying
-
Table Of Contents
-
Feedback & Rating

Machine Learning Algorithms

In NLP, a very common pipeline can be subdivided into the following steps:
The pipeline is called Bag-of-Words and will be discussed in this chapter. A fundamental assumption is that the order of every single word in a sentence is not important. In fact, when defining a feature vector, as we're going to see, the measures taken into account are always related to frequencies, and therefore they are insensitive to the local positioning of all elements. From some viewpoints, this is a limitation because in a natural language the internal order of a sentence is necessary to preserve the meaning; however, there are many models that can work efficiently with texts without the complication of local sorting. When it's absolutely necessary to consider...