Natural Language Processing : How are N-gram models used to solve NLP problems?

Natural language processing applies different methods to extract patterns and build a knowledge base from text data. The n-gram model is one such language model: it uses the previous N-1 words to predict the next word, where N is the size of the n-gram (N = 2 for a bigram, N = 3 for a trigram).
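As a minimal sketch of this idea, the snippet below builds a bigram model (N = 2) by counting which word follows each one-word context, then predicts the most frequent follower. The toy corpus and the helper names (`ngrams`, `build_model`, `predict_next`) are made up for illustration and are not part of any library.

```python
from collections import Counter, defaultdict


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_model(sentences, n):
    """Map each (n-1)-word context to a Counter of the words that follow it."""
    model = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for gram in ngrams(tokens, n):
            context, next_word = gram[:-1], gram[-1]
            model[context][next_word] += 1
    return model


def predict_next(model, context):
    """Return the most frequent word seen after the context, or None."""
    followers = model.get(tuple(context))
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy corpus, made up for illustration.
corpus = [
    "i like green tea",
    "i like green apples",
    "i like green tea a lot",
]

bigram_model = build_model(corpus, n=2)
print(ngrams("i like green tea".split(), 2))
print(predict_next(bigram_model, ["green"]))
```

Here "tea" follows "green" twice and "apples" only once, so the model predicts "tea". A context that never appeared in the corpus gives `None`.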

Along with sequence prediction, n-gram models are used for spelling correction (as in Google search), language translation and text summarization.

Math behind n-grams

An n-gram model is based on the idea of computing the probability of a sentence or sequence of words.

Mathematically,

P(W) = P(w1, w2, w3, ..., wn)

If we need to predict the upcoming word (w4) given the previous words,

P(w4 | w1, w2, w3)

Here, we need the probability of several words occurring together, which can be represented as a joint probability and expanded using the chain rule.

Conditional probability can be written as:

P(B | A) = P(A,B) / P(A)

=> P(A,B) = P (B | A) * P(A)

If we include more variables:

P(A,B,C,D,E) = P(A) P(B|A) P(C|A,B) P(D|A,B,C) P(E|A,B,C,D)
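Written for a sentence of n words, the same expansion reads as follows. The indexed notation is an assumption added here for compactness, and for i = 1 the condition is empty, so the first factor is simply P(w_1):

$$P(w_1, w_2, \ldots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \ldots, w_{i-1})$$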

Therefore, we use the chain rule to compute the joint probability of the words in a sentence.
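A rough sketch of the chain rule in code is given below. It assumes a tiny made-up corpus and estimates each conditional probability as a ratio of counts: how often a word sequence opens a sentence, divided by how often the sequence without its last word does. The function names are hypothetical.

```python
from collections import Counter

# Toy corpus, made up for illustration.
corpus = [
    "the cat sat",
    "the cat ran",
    "the dog sat",
    "a cat sat",
]

# Count how often each word sequence opens a sentence.
prefix_counts = Counter()
for sentence in corpus:
    tokens = tuple(sentence.split())
    for length in range(len(tokens) + 1):
        prefix_counts[tokens[:length]] += 1


def conditional(word, history):
    """Estimate P(word | history) as count(history + word) / count(history)."""
    history = tuple(history)
    if prefix_counts[history] == 0:
        return 0.0
    return prefix_counts[history + (word,)] / prefix_counts[history]


def sentence_probability(sentence):
    """Multiply the conditional probabilities, one per word (chain rule)."""
    tokens = sentence.split()
    probability = 1.0
    for i, word in enumerate(tokens):
        probability *= conditional(word, tokens[:i])
    return probability


print(conditional("sat", ["the", "cat"]))   # 1 of the 2 "the cat ..." sentences
print(sentence_probability("the cat sat"))  # 3/4 * 2/3 * 1/2
```

Conditioning on the full history quickly runs out of data on real text, which is why an n-gram model keeps only the previous N-1 words as context.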

Sijan Bhandari on

Sentiment Analysis - Data Preprocessing and Feature Engineering : part 4

In this step, we can follow some data preprocessing operations as described in the image below:

1. Load data

In [6]:
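import warnings
import pandas as pd

pd.set_option('display.max_colwidth', None)  # show full column text (use -1 on pandas < 1.0)
warnings.filterwarnings("ignore")  # suppress warning messages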
In [34]:
df = pd.read_csv('data.csv')
In [35]:
df.head()
Out[35]:
   title                                                                  sentiment
0  Fed official says weak data caused by weather, should not slow taper          -1
1  Fed's Charles Plosser sees high bar for change in pace of tapering             1
2  US open: Stocks fall after Fed official hints at accelerated tapering          1
3  Fed risks falling 'behind the curve', Charles Plosser says                     0
4  Fed's Plosser: Nasty Weather Has Curbed Job Growth                            -1
In [36]:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2000 entries, 0 to 1999
Data columns (total 2 columns):
title        2000 non-null object
sentiment    2000 non-null int64
dtypes: int64(1), object(1)
memory usage: 31.3+ KB
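Once the data is loaded, a common next preprocessing step is to normalise the text of each title. The sketch below is an assumption about what such a step could look like (lowercasing and splitting into word tokens with a regular expression); it is not taken from the original pipeline, and `clean_title` is a hypothetical helper.

```python
import re

# Two titles copied from the df.head() output above.
titles = [
    "Fed's Plosser: Nasty Weather Has Curbed Job Growth",
    "US open: Stocks fall after Fed official hints at accelerated tapering",
]


def clean_title(title):
    """Lowercase a title and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", title.lower())


for title in titles:
    print(clean_title(title))
```

With the DataFrame loaded above, the same function could be applied to every row with `df['title'].apply(clean_title)`.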

Implementation of Perceptron in Python

In [2]:
# -*- coding: utf-8 -*-
# @Author: Sijan
# @Date:   2018-04-03 12:46:52
# @Last Modified time: 2018-04-09 15:21:41

from random import randint


def step_function(result):
    """
    Step activation function: returns 1 if the value is greater than 0, otherwise 0.
    """
    if result > 0:
        return 1
    return 0


class Perceptron:
    """
    Perceptron class defines a neuron with attributes : weights, bias and learning rate.
    """

    def __init__(self, input_size):
        self.learning_rate = 0.5
        self.bias = randint(0, 1)
        self.weights = [randint(0, 1) for _ in range(input_size)]


def feedforward(perceptron, node_input):
    """
    Computes the weighted sum of the inputs plus the bias and applies the step function.
    """
    node_sum = 0
    node_sum += perceptron.bias

    for index, item in enumerate(node_input):
        # print('input node is', item)
        node_sum += item * perceptron.weights[index]

    return step_function(node_sum)


def adjust_weight(perceptron, node_input, error):
    """
    Adjusts the weights and bias based on the error, nudging them towards the desired output.

    """
    for index, item in enumerate(node_input):
        perceptron.weights[index] += item * error * perceptron.learning_rate

    perceptron.bias += error * perceptron.learning_rate


def train(perceptron, inputs, outputs):
    """
    Trains perceptron for given inputs.
    """
    for training_input, training_output in zip(inputs, outputs):
        actual_output = feedforward(perceptron, training_input)
        desired_output = training_output
        error = desired_output - actual_output
        adjust_weight(perceptron, training_input, error)
        print('weight after adjustment', perceptron.weights)
        print('bias after adjustment', perceptron.bias)


def predict(perceptron, test_input, test_output):
    """
    Predicts new inputs.
    """
    prediction = feedforward(perceptron, test_input)

    # if test_input[1] == test_output:
    print('input :%s gives output :%s' % (test_input, prediction))
    print('input :%s has true output :%s' % (test_input, test_output))


if __name__ == '__main__':

    train_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
    train_outputs = [0, 0, 0, 1]

    # train perceptron
    perceptron = Perceptron(2)
    epochs = 10

    for _ in range(epochs):
        train(perceptron, train_inputs, train_outputs)

    # test perceptron
    test_input = (1,1)
    test_output = 1
    predict(perceptron, test_input, test_output)
weight after adjustment [1.0, 0.0]
bias after adjustment 0.5
weight after adjustment [1.0, -0.5]
bias after adjustment 0.0
weight after adjustment [0.5, -0.5]
bias after adjustment -0.5
weight after adjustment [1.0, 0.0]
bias after adjustment 0.0
weight after adjustment [1.0, 0.0]
bias after adjustment 0.0
weight after adjustment [1.0, 0.0]
bias after adjustment 0.0
weight after adjustment [0.5, 0.0]
bias after adjustment -0.5
weight after adjustment [1.0, 0.5]
bias after adjustment 0.0
weight after adjustment [1.0, 0.5]
bias after adjustment 0.0
weight after adjustment [1.0, 0.0]
bias after adjustment -0.5
weight after adjustment [0.5, 0.0]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -0.5
weight after adjustment [1.0, 0.5]
bias after adjustment -0.5
weight after adjustment [1.0, 0.5]
bias after adjustment -0.5
weight after adjustment [0.5, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 1.0]
bias after adjustment -0.5
weight after adjustment [1.0, 1.0]
bias after adjustment -0.5
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
weight after adjustment [1.0, 0.5]
bias after adjustment -1.0
input :(1, 1) gives output :1
input :(1, 1) has true output :1
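The log above shows the weights settling after a few epochs, after which the remaining epochs change nothing. As a variation, here is a compact sketch that stops as soon as one full epoch produces no errors. It assumes a fixed all-zero starting point instead of `randint`, and the function names are hypothetical; the update rule and learning rate are the same as in the code above.

```python
def step(value):
    """Step activation: 1 if value > 0, else 0."""
    return 1 if value > 0 else 0


def predict(weights, bias, inputs):
    """Weighted sum of inputs plus bias, passed through the step function."""
    return step(bias + sum(w * x for w, x in zip(weights, inputs)))


def train_until_converged(samples, targets, learning_rate=0.5, max_epochs=100):
    """Repeat the perceptron update until one full epoch has no errors."""
    weights = [0.0] * len(samples[0])  # fixed start (an assumption), not randint
    bias = 0.0
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for inputs, target in zip(samples, targets):
            error = target - predict(weights, bias, inputs)
            if error != 0:
                errors += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * error
        if errors == 0:
            return weights, bias, epoch
    return weights, bias, max_epochs


and_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_outputs = [0, 0, 0, 1]

weights, bias, epochs_used = train_until_converged(and_inputs, and_outputs)
print(weights, bias, epochs_used)
for inputs in and_inputs:
    print(inputs, predict(weights, bias, inputs))
```

The `max_epochs` cap matters because the loop would never end on data that is not linearly separable (XOR, for example).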

Using Perceptron Model for classification : An illustrative Approach

In this post, we are going to devise a measurement tool (a perceptron model) to classify whether a person is infected by a disease or not.

In binary terms, the output will be

       {
            1   if infected 
            0   not infected
        } 

To build inputs for our neural network, we take readings from the patients and encode them as follows:

  body temperature = {
                          1   if body temperature > 99°F
                         -1   if body temperature <= 99°F (normal)
                     }

  heart rate = {
                      1   if heart rate is above the normal range (60 to 100)
                     -1   if heart rate is within the normal range (60 to 100)
                 }

  blood pressure = {
                          1   if blood pressure > 120/80
                         -1   if blood pressure <= 120/80 (normal)
                     }

So, the input from each patient will be represented as a three-dimensional vector:

  input = (body temperature, heart rate, blood pressure)

So, a person can now be represented as:

(1, -1, 1)
i.e. (body temperature > 99°F, heart rate within 60 to 100, blood pressure > 120/80)
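This encoding can be sketched as a small function. The name `encode_reading` and its parameters are hypothetical. Splitting blood pressure into systolic and diastolic values, and treating it as high when either number exceeds its limit, is an assumption, since the text only gives the 120/80 threshold.

```python
def encode_reading(temperature_f, heart_rate, systolic, diastolic):
    """Turn raw readings into the (temperature, heart rate, blood pressure) vector."""
    temperature = 1 if temperature_f > 99 else -1
    # Only rates above the 60 to 100 range are flagged, as in the text.
    pulse = 1 if heart_rate > 100 else -1
    # Assumption: pressure counts as high if either number exceeds 120/80.
    pressure = 1 if systolic > 120 or diastolic > 80 else -1
    return (temperature, pulse, pressure)


print(encode_reading(101.2, 72, 135, 85))
print(encode_reading(98.6, 70, 118, 76))
```

The first call reproduces the example vector (1, -1, 1): a fever, a normal heart rate and high blood pressure.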

Let us create two inputs with their desired output values:

      x1 = (1, 1, 1), d1 = 1 (infected)
      x2 = (-1, -1, -1), d2 = 0 (not infected)

Let us take initial values for the weights and bias:

      weights, w0 = (-1, 0.5, 0)
      bias, b0 = 0

And the activation function:

         A(S)   = {
                    1 if S >= 0
                    0 otherwise
                  }

STEP 1

Feed x1 = (1, 1, 1) into the network.

weighted_sum:

S = (-1, 0.5, 0) * (1, 1, 1)^T + 0
  = -1 + 0.5 + 0 + 0
  = -0.5

When passed through the activation function, A(-0.5) = 0 = y1. We passed an infected input vector, but our perceptron classified it as not infected. Let's calculate the error term:

             e = d1 - y1 = 1 - 0 = 1

Update weight as:

             w1 = w0 + e * x1 = (-1, 0.5, 0) + 1 * (1, 1, 1) = (0, 1.5, 1)

And, update bias as:

             b1 = b0 + e = 1
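
The arithmetic of this step can be checked with a few lines of Python. This is only a sketch of the calculation above; the helper names are hypothetical. Note that it takes b0 = 0, the bias value that appears in the weighted sum for S.

```python
def activation(s):
    """A(S): 1 if S >= 0, otherwise 0."""
    return 1 if s >= 0 else 0


def weighted_sum(weights, inputs, bias):
    """Dot product of weights and inputs, plus the bias."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


w0 = (-1, 0.5, 0)
b0 = 0  # the bias value used in the calculation of S above
x1, d1 = (1, 1, 1), 1

s = weighted_sum(w0, x1, b0)
y1 = activation(s)
e = d1 - y1

# Update rule from the text: w1 = w0 + e * x1, b1 = b0 + e
w1 = tuple(w + e * x for w, x in zip(w0, x1))
b1 = b0 + e

print(s, y1, e)
print(w1, b1)
```
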
STEP 2

Now, we feed the second input (-1, -1, -1) into our network.

weighted_sum :

S = w1 * x2^T + b1 
  = (0, 1.5, 1) * (-1, -1, -1)^T + 1
  = -1.5 - 1 + 1
  = -1.5

When passed through the activation function, A(-1.5) = 0 = y2. We passed a not infected input vector, and our perceptron successfully classified it as not infected.

STEP 3

Since our first input was misclassified in Step 1, we feed it into the network again with the updated weights.

weighted_sum :

S = w1 * x1^T + b1 
  = (0, 1.5, 1) * (1, 1, 1)^T + 1
  = 1.5 + 1 + 1
  = 3.5

When passed through the activation function, A(3.5) = 1 = y3. We passed an infected input vector, and our perceptron successfully classified it as infected.

Here, both input vectors are correctly classified, i.e. the algorithm has converged to a solution.
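The three steps above can be replayed as a loop that keeps cycling through the inputs until a full pass makes no mistakes. This is a minimal sketch, assuming the starting bias of 0 used in the Step 1 calculation; `train` and `max_passes` are hypothetical names.

```python
def activation(s):
    """A(S): 1 if S >= 0, otherwise 0."""
    return 1 if s >= 0 else 0


def train(samples, weights, bias, max_passes=10):
    """Cycle through the samples until a full pass makes no mistakes."""
    for passes in range(1, max_passes + 1):
        mistakes = 0
        for x, d in samples:
            s = sum(w * xi for w, xi in zip(weights, x)) + bias
            e = d - activation(s)
            if e != 0:
                mistakes += 1
                # Same update as in the text: w += e * x, b += e
                weights = [w + e * xi for w, xi in zip(weights, x)]
                bias += e
        if mistakes == 0:
            return weights, bias, passes
    return weights, bias, max_passes


samples = [((1, 1, 1), 1), ((-1, -1, -1), 0)]
weights, bias, passes = train(samples, [-1, 0.5, 0], 0)
print(weights, bias, passes)
```

The first pass corrects x1 and leaves x2 alone; the second pass finds no mistakes, which matches Steps 1 to 3.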

What is a perceptron and how does it work?

A perceptron is simply an artificial neuron capable of solving linear classification problems. It forms a single-layer feed-forward neural network.

In its simplest form, a perceptron takes binary input values and signals a binary output for decision making. The output decision (either 0 or 1) is based on the weighted sum of the inputs plus a bias.

Mathematically, a perceptron can be defined as:

$$O(n) = \begin{cases} 0 & \text{if } \sum_i w_i x_i + \theta \le 0 \\ 1 & \text{if } \sum_i w_i x_i + \theta > 0 \end{cases}$$

$\theta$ = threshold / bias
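The definition translates almost directly into code. In the sketch below, the weights and the value of theta are hand-picked (an assumption, not learned values) so that the perceptron behaves like a logical OR gate.

```python
def perceptron_output(inputs, weights, theta):
    """Return 1 if the weighted sum plus theta is greater than 0, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + theta
    return 1 if total > 0 else 0


# Hand-picked values for illustration: any single active input
# pushes the sum above zero, so the unit acts as an OR gate.
or_weights = [1, 1]
or_theta = -0.5

for inputs in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(inputs, perceptron_output(inputs, or_weights, or_theta))
```

Changing theta to -1.5 with the same weights would turn the unit into an AND gate, since both inputs would then have to be active to push the sum above zero.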

What is Deep Learning and Neural Network?

Deep learning, put simply, is a learning mechanism for neural networks. Neural networks are computational models that mimic the human nervous system and are capable of learning. Like the interconnected neurons in the human brain, a neural network is made up of interconnected nodes. Each node receives signals as a set of inputs, performs calculations and signals an output based on some activation value.

Here is a list of some problems that deep learning can solve:

  1. Classification : object and speech recognition, classifying sentiments from text
  2. Clustering : Fraud detection


Sentiment Analysis Data Collection : part 3

Once you have completed your problem analysis, you should spend only a couple of days gathering the related data.

Training / testing data collection

I have collected data from the news portal ekantipur.com for training and testing our sentiment model.

In [ ]:
# Here is the sample code that I have used for scraping news site.
# news_spider.py

# This module scrolls through news site (ekantipur.com) and collects news titles.
import time
import csv

from bs4 import BeautifulSoup

Sentiment Analysis Problem Statement : part 2

Though there are lots of challenges out there that can be solved using data science, I have come up with a very basic problem. Why am I choosing this? Simply because I do not want to waste my energy thinking about big problems, or waste yours reading about one only to forget it in a couple of hours :).

PROBLEM: We will classify the sentiment of news titles (whether positive, negative or neutral).

Problem Analysis:

For this part, we will answer the following questions:

  1. What will be the input for your program?


Sentiment Analysis Introduction : part 1

Sentiment analysis has been used as a tool to classify responses from your users/customers as 'positive', 'negative' or even 'neutral'. It combines two different disciplines, Natural Language Processing and Text Analysis, to extract information from text data. In sentiment analysis, an algorithm learns from labelled example data and predicts the labels of new, unseen data points. This approach is called supervised learning, as we will train our model with a corpus of labelled news.
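To make the supervised learning idea concrete, here is a minimal sketch that learns from a handful of labelled titles and predicts the label of unseen ones. It uses a small Naive Bayes classifier with add-one smoothing, which is a choice made here for illustration and not necessarily the model used in this series. The example titles and labels are made up.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count label frequencies and per-label word frequencies."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        tokens = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return label_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    label_counts, word_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total_examples)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((word_counts[label][token] + 1)
                              / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up labelled titles, for illustration only.
labelled_titles = [
    ("stocks rally on strong growth", "positive"),
    ("markets gain as profits rise", "positive"),
    ("stocks fall on weak data", "negative"),
    ("markets drop as losses grow", "negative"),
]

model = train_naive_bayes(labelled_titles)
print(predict(model, "profits rise on strong growth"))
print(predict(model, "weak data as losses grow"))
```

Neither test title appears in the training data, yet each is assigned the label whose training titles share most of its words.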

Why sentiment analysis?

People have different ways of expressing their attitudes, opinions or reviews towards a product, event, movie or even other people.
