Import Spam Filtering Dataset in Python | Tutorial No: 1

This is the first tutorial in a series on spam detection for NLP. There are many ways to implement spam detection, but the methods discussed here are easy to understand.

In this series of tutorials you will learn how to import a dataset in Python for spam detection, as well as the following:

How to import nltk in Python?
How to import sklearn in Python?
How to import matplotlib.pyplot as plt in Python?
How to import csv in Python?
How to import numpy as np in Python?
How to import re in Python?
How to import pandas in Python?
How to import wordcloud in Python?
How to import seaborn in Python?
How to import string in Python?
How to import regex in Python?

# Importing Mandatory Libraries

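To load the libraries in the Python shell, the Anaconda command line, or a Jupyter Notebook, run the following import statements. Importing does not install anything: csv, re, and string ship with Python, while nltk, sklearn, matplotlib, numpy, pandas, wordcloud, seaborn, and regex are third-party packages that must already be installed.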

>>>import nltk
>>>import sklearn
>>>import matplotlib.pyplot as plt
>>>import csv
>>>import numpy as np
>>>import re
>>>import pandas as pd
>>>import wordcloud
>>>import seaborn
>>>import string
>>>import regex
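If one of the imports above fails with ModuleNotFoundError, the library is probably not installed in the current environment. The following is a minimal sketch for checking this up front; the helper name find_missing_modules is hypothetical and not part of the original tutorial.

```python
import importlib.util

# Top-level module names imported in this tutorial.
REQUIRED_MODULES = [
    "nltk", "sklearn", "matplotlib", "csv", "numpy", "re",
    "pandas", "wordcloud", "seaborn", "string", "regex",
]


def find_missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    # find_spec returns None when a top-level module is not importable.
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = find_missing_modules(REQUIRED_MODULES)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All libraries are available.")
```

Missing packages can usually be installed with pip or conda. Note that the import name sklearn belongs to the package called scikit-learn.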

# Checking the Current Working Directory in Python

>>>import os
>>>os.getcwd()

Output: 'C:\\Users\\MuhammadAhmad\\jupyter using python'
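The working directory matters because a bare file name such as SMSSpamCollection is looked up relative to it. As a small sketch, assuming the dataset file is expected to sit directly in the working directory, you can confirm it is there before loading it; dataset_in_cwd is a hypothetical helper name.

```python
from pathlib import Path


def dataset_in_cwd(filename):
    """Return True if `filename` is a file in the current working directory."""
    return (Path.cwd() / filename).is_file()


print("Working directory:", Path.cwd())
print("Dataset found:", dataset_in_cwd("SMSSpamCollection"))
```

If the second line prints False, either move the file into the working directory or pass its full path when reading it.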

# Importing the Spam Filtering Dataset in Python

>>>smsspam = pd.read_csv('SMSSpamCollection', sep="\t", header=None)

>>>smsspam.head()
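In the read_csv call, the tab separator means each line holds a label, a tab character, and then the message, and header=None means the file has no header row. To illustrate that layout without the real file, here is a rough sketch using only the standard csv module on two made-up lines; the sample text and the function name read_labelled_messages are assumptions for illustration.

```python
import csv
import io

# Two made-up lines in the assumed layout: label, a tab, then the message.
sample = "ham\tSee you at lunch\nspam\tYou have won a prize, call now\n"


def read_labelled_messages(handle):
    """Parse tab-separated lines into (label, message) tuples."""
    # Quoting is disabled so quote characters inside messages stay literal.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [(row[0], row[1]) for row in reader if len(row) >= 2]


rows = read_labelled_messages(io.StringIO(sample))
for label, message in rows:
    print(label, "|", message)
```

This only demonstrates the file format. The pandas call shown in the tutorial remains the way the dataset is loaded in this series, and its default quote handling differs from the setting used in this sketch.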


# Renaming the Columns of the Dataset

>>>smsspam.columns = ['label', 'sms']
>>>smsspam.head()

# Checking Dataset Details

View the details of the smsspam dataset: the number of rows and columns, the number of "ham" and "spam" messages, and the number of missing labels and messages.

>>>print(f'input data has {len(smsspam)} rows, {len(smsspam.columns)} columns')
>>>print(f'ham = {len(smsspam[smsspam["label"] == "ham"])}')
>>>print(f'spam = {len(smsspam[smsspam["label"] == "spam"])}')
>>>print(f"number of missing label = {smsspam['label'].isnull().sum()}")
>>>print(f"number of missing msg = {smsspam['sms'].isnull().sum()}")

Outputs

input data has 5572 rows, 2 columns
ham = 4825
spam = 747
number of missing label = 0
number of missing msg = 0
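These counts show that the two classes are not balanced. As a quick worked calculation using only the numbers printed above (class_share is a hypothetical helper name), the share of each class is:

```python
# Counts taken from the output above.
ham, spam = 4825, 747
total = ham + spam


def class_share(count, total):
    """Return `count` as a percentage of `total`."""
    return 100 * count / total


print(f"total = {total}")
print(f"ham share = {class_share(ham, total):.1f}%")
print(f"spam share = {class_share(spam, total):.1f}%")
```

Roughly 13.4% of the messages are spam and 86.6% are ham, which is worth keeping in mind when a classifier is evaluated on this data.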

