My Youtube Channel

Please Subscribe

Flag of Nepal

Built in OpenGL

Word Cloud in Python

With masked image

Showing posts with label Jupyter Notebook. Show all posts
Showing posts with label Jupyter Notebook. Show all posts

Wednesday, December 7, 2022

N-grams for text processing


N-gram is an N sequence of words. It can be a unigram (one word), bigram (sequence of two words), trigram (sequence of three words), and so on. It focuses on a sequence of words. Such a method is very useful in speech recognition and predicting input text. It helps us to predict the next words that could occur in a given sequence. Search engines also use the n-gram technique to predict the next word while a search query is typed in a search bar. 

Let us consider a sentence: 

This is going to be an amazing experience.


Unigram for the above sentence can be written as:
  • "This"
  • "is"
  • "going"
  • "to"
  • "be"
  • "an"
  • "amazing"
  • "experience"

Bigram for the above sentence can be written as:

  • “This is”
  • “is going”
  • “going to”
  • "to be"
  • "be an"
  • "an amazing"
  • "amazing experience"

Trigram for the above sentence can be written as:

  • "This is going"
  • "is going to "
  • "going to be"
  • "to be an"
  • "be an amazing"
  • "an amazing experience"

With n-grams, we can use a bag of n-grams (for example, a bag of bigrams) instead of only using a bag of words. A bag of bigrams or trigrams is more powerful than using just a bag of words as it takes the context into consideration as well.


Sunday, August 16, 2020

Various features of Markdown in Jupyter Notebook


Output of Emphasis:




Output of List:



Output of Links:




Output of Images:





Output of table:




Output of Blockquotes:



Output of Horizontal Rule:



Output of Youtube links:




Output of Headers:



Get the Github link here.