## Natural Language Processing: How are n-gram models used to solve NLP problems?

Natural language processing applies a range of methods to extract patterns and build a knowledge base from text data. The n-gram model is one such language model: it uses the previous N-1 words to predict the next word, where N is the size of the n-gram (for example, N = 3 for a trigram), not the length of the document or sentence.
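To make the idea concrete, here is a minimal sketch of how a token list can be cut into n-grams and then split into a context and a word to predict. The helper name `ngrams`, the example sentence, and the whitespace tokenization are assumptions made for illustration only.

```python
def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Toy sentence, split on whitespace (a real tokenizer would do more).
tokens = "the cat sat on the mat".split()

# With N = 3, each trigram holds N-1 = 2 context words plus the next word.
trigrams = ngrams(tokens, 3)
pairs = [(gram[:-1], gram[-1]) for gram in trigrams]

for context, next_word in pairs:
    print(context, "->", next_word)
```

Each printed pair is one training example for the model: the two words on the left are the context, and the word on the right is what the model should learn to predict.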

Besides sequence prediction, n-gram models are used for spelling correction (as in Google search), language translation, and text summarization.

#### Math behind n-grams

An n-gram model is based on the idea of computing the probability of a sentence or sequence of words.

Mathematically,

P(W) = P(w1, w2, w3, ...)

If we need to predict the upcoming word (w4) given the words seen so far, we want:

P(w4 | w1, w2, w3)

Here, we need the probability of several words occurring together, which is a joint probability and can be computed using the `Chain Rule`.

Conditional probability can be written as:

P(B | A) = P(A,B) / P(A)

=> P(A,B) = P (B | A) * P(A)
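As an illustration of the definition above, a conditional probability can be estimated from counts: how often A is followed by B, divided by how often A is followed by anything. The sketch below assumes a tiny made-up corpus and uses plain relative frequencies; the names `pair_counts`, `first_counts`, and `cond_prob` are hypothetical.

```python
from collections import Counter

# Toy corpus invented for this example.
corpus = ["i like tea", "i like coffee", "i drink tea"]

pair_counts = Counter()   # how often word a is immediately followed by word b
first_counts = Counter()  # how often word a is followed by any word

for sentence in corpus:
    words = sentence.split()
    for a, b in zip(words, words[1:]):
        pair_counts[(a, b)] += 1
        first_counts[a] += 1


def cond_prob(b, a):
    """Estimate P(b | a) as count(a, b) / count(a); 0.0 if a was never seen."""
    if first_counts[a] == 0:
        return 0.0
    return pair_counts[(a, b)] / first_counts[a]


print(cond_prob("like", "i"))    # "i" is followed by "like" in 2 of 3 cases
print(cond_prob("tea", "like"))  # "like" is followed by "tea" in 1 of 2 cases
```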

If we include more variables:

P(A,B,C,D,E) = P(A) P(B|A) P(C|A,B) P(D|A,B,C) P(E|A,B,C,D)

Therefore, we use the `Chain Rule` to compute the joint probability of the words in a sentence:

P(w1, w2, ..., wn) = P(w1) P(w2|w1) P(w3|w1,w2) ... P(wn|w1,...,wn-1)
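The chain rule can be sketched in code by multiplying one conditional probability per word, each conditioned on the full history before it. The example below is only a sketch under stated assumptions: the corpus is made up, each factor is a relative frequency of sentence prefixes, and the names `prefix_counts` and `chain_rule_prob` are hypothetical. The result is the estimated probability that a sentence starts with the given words.

```python
from collections import Counter

# Toy corpus invented for this example.
corpus = ["i like tea", "i like coffee", "i drink tea", "you like tea"]

# Count every sentence prefix, including the empty prefix ().
prefix_counts = Counter()
for sentence in corpus:
    words = sentence.split()
    for k in range(len(words) + 1):
        prefix_counts[tuple(words[:k])] += 1


def chain_rule_prob(words):
    """Multiply P(w_k | w_1..w_{k-1}) for each word, per the chain rule."""
    prob = 1.0
    for k in range(1, len(words) + 1):
        history = tuple(words[:k - 1])
        if prefix_counts[history] == 0:
            return 0.0  # history never observed, so no estimate is possible
        prob *= prefix_counts[tuple(words[:k])] / prefix_counts[history]
    return prob


# P(i) * P(like | i) * P(tea | i, like) = 3/4 * 2/3 * 1/2
print(chain_rule_prob(["i", "like", "tea"]))
```

Note that any word sequence whose history never appears in the corpus gets probability 0 here, which shows why conditioning on the full history becomes impractical for long sentences.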