Getting Familiar with our Data

The Fake News data set from Kaggle contains over 7,500 rows, or entries. I noticed some misaligned or improperly rendered rows, but this is roughly the size of our data. As for fields, there are 20 columns, one of which is ‘Type’, which categorizes each fake news entry. There seems to be some overlap among the types, and “BS”, the most common type, is fairly ambiguous. Still, the categories give you an idea of the kinds of fake news compiled.

Below are the total counts per Fake News Type:


Type          Count
bias            443
bs            11444
conspiracy      430
fake             19
hate            245
junksci         101
satire          146
state           121
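
For anyone who wants to reproduce a tally like the one above, here is a minimal sketch using only the Python standard library. It makes a few assumptions that are not from the data set itself: the file name `fake.csv` and the lowercase column name `type` are placeholders to adjust to your copy of the data, and the helper names `count_types` and `load_counts` are made up for illustration. Rows where the type field is missing or empty (such as the misaligned rows mentioned earlier) are skipped rather than counted.

```python
import csv
from collections import Counter


def count_types(csv_lines, column="type"):
    """Count rows per value of `column` (column name is an assumption).

    `csv_lines` can be an open file or any iterable of CSV text lines
    whose first line is the header.
    """
    reader = csv.DictReader(csv_lines)
    counts = Counter()
    for row in reader:
        # Short rows get None for missing fields; treat that as empty.
        value = (row.get(column) or "").strip()
        if value:  # skip rows where the type is missing or blank
            counts[value] += 1
    return counts


def load_counts(path="fake.csv", column="type"):
    """Open a CSV file (hypothetical path) and count rows per type."""
    with open(path, newline="", encoding="utf-8") as f:
        return count_types(f, column)
```

Calling `load_counts()` returns a `Counter`, so `sorted(load_counts().items())` gives the types in alphabetical order, as in the table. Because malformed rows are skipped, the counts may not match a tally made with another tool that handles those rows differently.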



