Pandas is a popular library for data analysis built on top of the Python programming language. Pandas can be thought of as a digital toolbox that holds various tools for working with data. Pandas pairs well with other libraries for statistics, natural language processing, machine learning, visualization, and more. Pandas is…
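To make the toolbox analogy concrete, here is a minimal sketch of a few everyday Pandas operations on a toy DataFrame; the city and temperature columns are invented purely for illustration.

```python
import pandas as pd

# A minimal sketch: build a small DataFrame and run a few common operations.
df = pd.DataFrame({
    "city": ["Hanoi", "Da Nang", "Hanoi", "Hue"],
    "temp_c": [31.0, 29.5, 30.2, 28.8],
})

df["temp_f"] = df["temp_c"] * 9 / 5 + 32      # add a derived column
print(df[df["temp_c"] > 29])                  # filter rows by condition
print(df.groupby("city")["temp_c"].mean())    # aggregate by group
```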


ML Concept: What are Hyper-parameters?

The goal of ML applications is to create models that can master a task based on a dataset. But how do we know that our models are learning at an optimal rate? To find out, we need to regularly tune different aspects of the model and evaluate its performance. Think…
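As one concrete illustration, the sketch below tunes a single hyper-parameter, the regularization strength C of a scikit-learn logistic regression, by cross-validated grid search; the dataset and the candidate values are chosen only for the example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# C is a hyper-parameter: it is not learned from the training loss,
# so we pick it by evaluating model performance on held-out folds.
X, y = load_iris(return_X_y=True)
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    cv=5,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```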


Introduction

Let’s talk about the nature of learning. We are not born knowing much. Over the course of our lifetimes, we slowly gain an understanding of the world through interaction. We learn about cause and effect, or how the world responds to our actions. Once we have an understanding of how…


Sequence-to-sequence (S2S) models are a special case of a general family of models called encoder–decoder models. An encoder–decoder model is a composition of two models, an “encoder” and a “decoder,” that are typically jointly trained. The encoder model takes an input and produces an encoding or a representation (ϕ) of…
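One possible minimal encoder–decoder in PyTorch is sketched below, assuming GRU-based modules and toy vocabulary sizes; the class and variable names are illustrative rather than a reference implementation.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    # Maps an input token sequence to a representation phi (the final hidden state).
    def __init__(self, vocab_size, emb_dim=64, hid_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, src):                 # src: (batch, src_len)
        _, phi = self.rnn(self.embed(src))  # phi: (1, batch, hid_dim)
        return phi

class Decoder(nn.Module):
    # Conditions on phi and predicts the target sequence one step at a time.
    def __init__(self, vocab_size, emb_dim=64, hid_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, tgt, phi):            # tgt: (batch, tgt_len)
        h, _ = self.rnn(self.embed(tgt), phi)
        return self.out(h)                  # logits: (batch, tgt_len, vocab)

# "Jointly trained" means one loss whose gradients flow through both modules.
enc, dec = Encoder(1000), Decoder(1000)
src = torch.randint(0, 1000, (2, 7))
tgt = torch.randint(0, 1000, (2, 5))
logits = dec(tgt, enc(src))
```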


Sequence prediction tasks require us to label each item of a sequence. Such tasks are common in natural language processing. Some examples include language modeling, in which we predict the next word given a sequence of words at each step; part-of-speech tagging, in which we predict the grammatical part of…
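A bare-bones per-token tagger, in the spirit of part-of-speech tagging, might look like the following PyTorch sketch; the vocabulary size, tag count, and layer sizes are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class TokenTagger(nn.Module):
    # Emits one label distribution per input token.
    def __init__(self, vocab_size, num_tags, emb_dim=64, hid_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.classify = nn.Linear(hid_dim, num_tags)

    def forward(self, tokens):               # tokens: (batch, seq_len)
        h, _ = self.rnn(self.embed(tokens))
        return self.classify(h)               # one score vector per position

model = TokenTagger(vocab_size=5000, num_tags=17)
tokens = torch.randint(0, 5000, (1, 6))       # a toy 6-token sentence
tags = model(tokens).argmax(dim=-1)           # predicted tag id at each step
```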


A sequence is an ordered collection of items. Traditional machine learning assumes data points to be independently and identically distributed (IID), but in many situations, like with language, speech, and time-series data, one data item depends on the items that precede or follow it. Such data is also called sequence…
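A toy numerical illustration of why such data breaks the IID assumption; the AR(1)-style series and its coefficients below are invented for the example.

```python
import numpy as np

# Each value depends on the one before it, so the order carries information
# that shuffling destroys -- unlike truly IID samples.
rng = np.random.default_rng(0)
x = np.zeros(100)
for t in range(1, 100):
    x[t] = 0.9 * x[t - 1] + rng.normal(scale=0.1)

print(np.corrcoef(x[:-1], x[1:])[0, 1])       # strong neighbour correlation
shuffled = rng.permutation(x)
print(np.corrcoef(shuffled[:-1], shuffled[1:])[0, 1])  # near zero after shuffling
```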


Representing discrete types (e.g., words) as dense vectors is at the core of deep learning’s successes in NLP. The terms “representation learning” and “embedding” refer to learning this mapping from one discrete type to a point in the vector space. …
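A minimal sketch of that mapping, assuming a toy three-word vocabulary and using PyTorch's nn.Embedding as the lookup table:

```python
import torch
import torch.nn as nn

# An embedding maps each discrete type (here, a word id) to a point in a
# dense vector space. The vectors are parameters, so they are learned
# along with the rest of the model.
vocab = {"the": 0, "cat": 1, "sat": 2}        # toy vocabulary (assumed)
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=4)

ids = torch.tensor([vocab["the"], vocab["cat"], vocab["sat"]])
vectors = embedding(ids)                      # shape (3, 4): one dense vector per word
print(vectors.shape)
```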


In this section we discussed feature engineering techniques using neural networks, such as word embeddings and character embeddings. The advantage of using embedding-based features is that they create a dense, low-dimensional feature representation instead of the sparse, high-dimensional structure of bag-of-words, TF-IDF, and other such features. …
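That contrast can be sketched roughly as follows; the toy corpus, the random stand-in for a trained embedding table, and the 50-dimensional size are all assumptions for illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["the cat sat on the mat", "dogs chase cats"]   # toy corpus (assumed)

# Bag-of-words / TF-IDF: one dimension per vocabulary word, mostly zeros.
tfidf = TfidfVectorizer().fit_transform(docs)
print(tfidf.shape, "sparse, high-dimensional")

# Embedding-style features: averaging per-word dense vectors (random here,
# standing in for a trained embedding table) yields a low-dimensional,
# dense representation of each document.
rng = np.random.default_rng(0)
table = {w: rng.normal(size=50) for doc in docs for w in doc.split()}
doc_vecs = np.stack([np.mean([table[w] for w in d.split()], axis=0) for d in docs])
print(doc_vecs.shape, "dense, low-dimensional")
```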


Organizing is what you do before you do something, so that when you do it, it is not all mixed up

In this section we will look at one of the most popular tasks in NLP: text classification. It is concerned with assigning one or more groups to a given…
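A minimal sketch of such a classifier, assuming a scikit-learn TF-IDF plus logistic regression pipeline and a made-up four-example corpus:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labeled corpus, invented purely for illustration.
texts = ["great movie, loved it", "terrible plot, boring",
         "fantastic acting", "waste of time"]
labels = ["positive", "negative", "positive", "negative"]

# Text classification: learn to assign each text to one group.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["loved the acting"]))
```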

Duy Anh Nguyen

AI Researcher - NLP Practitioner
