Data Analysis And Data Modeling

Pavan Akula

RDBMS Perspective

Data Analysis

Data Analysis leads to Information.

  • Data Analysis is a set of tools and techniques.

  • Discovering insightful, interesting, and novel patterns.

  • Descriptive, understandable, and predictive models.

  • Pattern mining, clustering, classification, and forecasting.

Data Modeling

Data Modeling - art and science of collecting and managing data.

Data modeling is a set of tools and techniques to understand and analyze how to collect, update, and store data. Traditionally, data models used in practice fall into three categories.

Data Modeling

  • Data Modeling starts with the collection of data.

  • Represents how data should be structured and stored.

  • Accurate use of data types: numeric, categorical, dimensions.

  • Naming convention and reserved words.

  • Factors driving data: primary keys, foreign keys, and constraints.
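As a rough illustration of the last bullet, the sketch below uses Python's built-in sqlite3 module to show how primary keys, foreign keys, and CHECK constraints reject bad rows. The table and column names (respondent_group, response, pct) and the inserted values are hypothetical and are not taken from the survey.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical tables: names and columns are illustrative assumptions.
conn.executescript("""
CREATE TABLE respondent_group (
    group_id   INTEGER PRIMARY KEY,
    group_name TEXT NOT NULL UNIQUE
);
CREATE TABLE response (
    response_id INTEGER PRIMARY KEY,
    group_id    INTEGER NOT NULL REFERENCES respondent_group(group_id),
    pct         REAL NOT NULL CHECK (pct BETWEEN 0 AND 100)
);
""")

conn.execute("INSERT INTO respondent_group VALUES (1, 'Men')")
conn.execute("INSERT INTO response VALUES (1, 1, 55.0)")  # valid row


def try_insert(sql):
    """Run an INSERT and report whether a constraint rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# group_id 99 does not exist, so the foreign key rejects the row.
bad_fk = try_insert("INSERT INTO response VALUES (2, 99, 40.0)")
# 140 is outside 0..100, so the CHECK constraint rejects the row.
bad_check = try_insert("INSERT INTO response VALUES (3, 1, 140.0)")

print(bad_fk)
print(bad_check)
```

Only the valid row remains in the response table; the constraints keep the two invalid rows out without any application-side validation.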

Case Study

Pew Research Center and Smithsonian magazine conducted a national survey, "U.S. Views of Technology and the Future," which asked Americans about a wide range of potential scientific developments, from near-term advances like robotics and bioengineering to more “futuristic” possibilities like teleportation or space colonization. More information can be found at http://www.pewinternet.org/2014/04/17/us-views-of-technology-and-the-future/. The PDF file Feb_2014_Views_Future_Crosstab.pdf will be used to demonstrate the relationship between Data Analysis and Data Modeling.

Report

Scope and Requirements

  • This is a crosstab report.

  • Business Analyst needs partial data.

  • Data needs to be stored in the RDBMS database.

  • Showcase Star Schema and Snowflake Schema.
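Before a crosstab can be stored in an RDBMS, its wide layout usually has to be turned into one record per cell. The sketch below shows one possible way to do that while keeping only the columns an analyst asks for. The answer labels, group names, and numbers are invented placeholders, not figures from the Pew report, and unpivot is a hypothetical helper.

```python
# Hypothetical crosstab: each row is an answer, each column a demographic group.
# All labels and numbers are invented placeholders, not survey results.
crosstab = {
    "columns": ["Total", "Men", "Women"],
    "rows": {
        "Mostly better": [10, 6, 4],
        "Mostly worse": [5, 2, 3],
    },
}


def unpivot(table, wanted_columns):
    """Turn a wide crosstab into (answer, group, value) records.

    Only columns named in wanted_columns are kept, which models the
    requirement that the analyst needs partial data.
    """
    records = []
    for answer, values in table["rows"].items():
        for group, value in zip(table["columns"], values):
            if group in wanted_columns:
                records.append((answer, group, value))
    return records


long_rows = unpivot(crosstab, {"Men", "Women"})
for record in long_rows:
    print(record)
```

Each resulting tuple maps naturally onto one row of a fact table, with the answer and group columns later replaced by dimension keys.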

Star Schema
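A star schema for this kind of report could look roughly like the sketch below: one fact table surrounded by denormalized dimension tables. The table and column names (dim_question, dim_demographic, fact_response) are assumptions made for illustration and do not reproduce the schema from the original slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical star schema: every dimension is a single, denormalized table.
conn.executescript("""
CREATE TABLE dim_question (
    question_id   INTEGER PRIMARY KEY,
    question_text TEXT NOT NULL,
    topic         TEXT NOT NULL
);
CREATE TABLE dim_demographic (
    demographic_id INTEGER PRIMARY KEY,
    category       TEXT NOT NULL,  -- e.g. 'Sex'; repeated on every row of the category
    group_name     TEXT NOT NULL   -- e.g. 'Men'
);
CREATE TABLE fact_response (
    question_id    INTEGER NOT NULL REFERENCES dim_question(question_id),
    demographic_id INTEGER NOT NULL REFERENCES dim_demographic(demographic_id),
    answer         TEXT NOT NULL,
    pct            REAL NOT NULL,
    PRIMARY KEY (question_id, demographic_id, answer)
);
""")

# List the tables that were created.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(tables)
```

The fact table points directly at each dimension, so any report needs at most one join per dimension.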

Snowflake Schema

Snowflake Schema Data
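In a snowflake schema, a dimension is normalized further into sub-dimensions. The sketch below splits a demographic dimension into a category table and a group table. All names (dim_category, dim_group, dim_question, fact_response) are illustrative assumptions rather than the tables shown in the original slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical snowflake schema: the demographic dimension is normalized
# into dim_category -> dim_group, so category names are stored only once.
conn.executescript("""
CREATE TABLE dim_question (
    question_id   INTEGER PRIMARY KEY,
    question_text TEXT NOT NULL
);
CREATE TABLE dim_category (
    category_id   INTEGER PRIMARY KEY,
    category_name TEXT NOT NULL UNIQUE      -- e.g. 'Sex'
);
CREATE TABLE dim_group (
    group_id    INTEGER PRIMARY KEY,
    category_id INTEGER NOT NULL REFERENCES dim_category(category_id),
    group_name  TEXT NOT NULL               -- e.g. 'Men'
);
CREATE TABLE fact_response (
    question_id INTEGER NOT NULL REFERENCES dim_question(question_id),
    group_id    INTEGER NOT NULL REFERENCES dim_group(group_id),
    answer      TEXT NOT NULL,
    pct         REAL NOT NULL,
    PRIMARY KEY (question_id, group_id, answer)
);
""")

# List the tables that were created.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(tables)
```

The extra dim_category table removes repeated category text at the cost of one more join whenever a report needs the category name.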

Star Schema Vs. Snowflake Schema

Star Schema

  • Fact table uses more space.
  • Queries are less complex.
  • Suitable for data warehouses.

Snowflake Schema

  • Fact table uses less space.
  • Queries are more complex.
  • Suitable for data marts.

Star Schema - Storage

Star Schema - Queries
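A query against a star schema typically joins the fact table to each dimension once. The sketch below is a minimal, self-contained example; the tables, columns, and percentages are hypothetical placeholders, not data from the survey.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical star layout with invented placeholder values.
conn.executescript("""
CREATE TABLE dim_demographic (
    demographic_id INTEGER PRIMARY KEY,
    category       TEXT NOT NULL,
    group_name     TEXT NOT NULL
);
CREATE TABLE fact_response (
    demographic_id INTEGER NOT NULL REFERENCES dim_demographic(demographic_id),
    answer         TEXT NOT NULL,
    pct            REAL NOT NULL
);
INSERT INTO dim_demographic VALUES (1, 'Sex', 'Men'), (2, 'Sex', 'Women');
INSERT INTO fact_response VALUES
    (1, 'Yes', 60.0), (1, 'No', 40.0),
    (2, 'Yes', 50.0), (2, 'No', 50.0);
""")

# One join is enough: category and group name live in the same dimension table.
star_rows = conn.execute("""
    SELECT d.category, d.group_name, f.pct
    FROM fact_response AS f
    JOIN dim_demographic AS d ON d.demographic_id = f.demographic_id
    WHERE f.answer = 'Yes'
    ORDER BY d.group_name
""").fetchall()

for row in star_rows:
    print(row)
```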

Snowflake Schema - Queries
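The same kind of question asked of a snowflake schema needs an additional join for every level of normalization. The sketch below assumes a hypothetical dim_category and dim_group pair with invented placeholder values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical snowflake layout with invented placeholder values.
conn.executescript("""
CREATE TABLE dim_category (
    category_id   INTEGER PRIMARY KEY,
    category_name TEXT NOT NULL
);
CREATE TABLE dim_group (
    group_id    INTEGER PRIMARY KEY,
    category_id INTEGER NOT NULL REFERENCES dim_category(category_id),
    group_name  TEXT NOT NULL
);
CREATE TABLE fact_response (
    group_id INTEGER NOT NULL REFERENCES dim_group(group_id),
    answer   TEXT NOT NULL,
    pct      REAL NOT NULL
);
INSERT INTO dim_category VALUES (1, 'Sex');
INSERT INTO dim_group VALUES (1, 1, 'Men'), (2, 1, 'Women');
INSERT INTO fact_response VALUES
    (1, 'Yes', 60.0), (1, 'No', 40.0),
    (2, 'Yes', 50.0), (2, 'No', 50.0);
""")

# Two joins are needed: fact -> group -> category.
snowflake_rows = conn.execute("""
    SELECT c.category_name, g.group_name, f.pct
    FROM fact_response AS f
    JOIN dim_group    AS g ON g.group_id = f.group_id
    JOIN dim_category AS c ON c.category_id = g.category_id
    WHERE f.answer = 'Yes'
    ORDER BY g.group_name
""").fetchall()

for row in snowflake_rows:
    print(row)
```

The result has the same shape as a single-join query over a denormalized dimension; the difference is the longer join chain, which is what makes snowflake queries more complex.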