Introduction

This report presents an exploratory analysis of the SwiftKey text prediction dataset. The analysis uses a 1000-line sample from each of the three sources: blogs, news, and Twitter.
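
The report does not state how the samples were drawn. As a minimal sketch, assuming a uniform random sample is wanted from a file too large to load comfortably, reservoir sampling is one option. The function name `sample_lines`, the seed, and the file name in the usage comment are hypothetical and do not come from the report.

```python
import random
from typing import Iterable, List


def sample_lines(lines: Iterable[str], k: int, seed: int = 42) -> List[str]:
    """Return up to k lines chosen uniformly at random (reservoir sampling)."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    reservoir: List[str] = []
    for i, line in enumerate(lines):
        if i < k:
            # Fill the reservoir with the first k lines.
            reservoir.append(line)
        else:
            # Keep the new line with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = line
    return reservoir


# Hypothetical usage; the path is an assumption:
# with open("blogs.txt", encoding="utf-8") as fh:
#     blogs_sample = sample_lines(fh, 1000)
```

The input is streamed once, so memory use stays proportional to the sample size rather than the file size.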

Summary Statistics

Blogs: 1000 lines
News: 1000 lines
Twitter: 1000 lines

For each sample, the size (the line count listed above) and the word count were computed.
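
The size and word-count step can be sketched as follows. This is an illustration rather than the report's own code: it assumes words are separated by whitespace, and the function name `summarize` is hypothetical.

```python
from typing import Dict, Iterable


def summarize(lines: Iterable[str]) -> Dict[str, int]:
    """Count lines, whitespace-separated words, and characters."""
    n_lines = n_words = n_chars = 0
    for line in lines:
        text = line.rstrip("\n")  # do not count the trailing newline
        n_lines += 1
        n_words += len(text.split())  # assumption: split on whitespace
        n_chars += len(text)
    return {"lines": n_lines, "words": n_words, "chars": n_chars}
```

Calling `summarize` once per sample yields one row of a summary table per source. A different tokenizer (for example, one that strips punctuation) would give somewhat different word counts.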

Visualization

A bar chart was created to compare the number of words in each dataset.
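
The plotting code is not included in the report. As a dependency-free stand-in for a graphical bar chart, the sketch below renders a plain-text bar chart from a mapping of dataset name to word count. The function name `text_bar_chart` and the default width are assumptions.

```python
from typing import Dict


def text_bar_chart(counts: Dict[str, int], width: int = 40) -> str:
    """Render counts as horizontal bars scaled to the largest value."""
    if not counts:
        return ""
    peak = max(counts.values())
    label_w = max(len(name) for name in counts)
    rows = []
    for name, value in counts.items():
        # Scale each bar relative to the largest count.
        bar_len = round(width * value / peak) if peak > 0 else 0
        rows.append(f"{name:<{label_w}} | {'#' * bar_len} {value}")
    return "\n".join(rows)
```

Passing the per-dataset word counts and printing the result gives one labeled bar per source, with the largest count filling the full width.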

A word cloud was also generated to visualize the most frequently occurring words.
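
A word cloud is driven by word frequencies, so the underlying counts can be sketched as below. The tokenization rule (lowercase runs of letters and apostrophes) and the function name `top_words` are assumptions; the report does not describe its preprocessing.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Assumption: a word is a run of lowercase letters or apostrophes.
WORD_RE = re.compile(r"[a-z']+")


def top_words(lines: Iterable[str], n: int = 10) -> List[Tuple[str, int]]:
    """Return the n most frequent words with their counts."""
    counts: Counter = Counter()
    for line in lines:
        counts.update(WORD_RE.findall(line.lower()))
    return counts.most_common(n)
```

The resulting (word, count) pairs are the kind of input a word-cloud renderer would size words by. Removing stop words first is a common extra step that this sketch leaves out.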

Conclusion

The exploratory data analysis provides an overview of the dataset and prepares the data for building a predictive text model.

Exploratory Data Analysis of the SwiftKey Text Prediction Dataset

Srilakshmi Darangula

2026-06-30
