This document contains the training plan for GSSC. The plan is divided into four basic modules. The Business Translator module cuts across the other three and is treated as a necessary ingredient in each of them.
- Essentials
- Specialist - Modeling
- Specialist - Visualization
- Business Translator
Essentials Module
The Essentials Module is designed to provide a gentle introduction to the basics of data analysis using the R statistical programming language. By the end of this module you should be comfortable with:
- An overview of the Insurance Industry - Insurance as a process of Risk aggregation, The major branches of the Insurance industry, Various analytics problems that can be solved in the Insurance value chain, an appreciation for the complexity involved
- A broad understanding of QBE - A read-through of the 2016 AGBM and notes from the Chairman’s desk, an understanding of the priorities of QBE and its overall numbers (GWP, Loss and Expense ratios, Product lines, etc.)
- Immediate Client context - Understanding the role, KRAs and priorities of your immediate client, Developing empathy for their role
- Basic Statistics - Given a dataset - How to analyse it for sanity and consistency, Understanding the different data types and casting them to analytics-friendly formats, How to check for missing values and treat such cases, How to generate basic Descriptive and Inferential statistics from the data, How to generate preliminary insights, How to generate structured hypotheses and test them on the data, Generating a report and sharing it with clients in Markdown format as a complete Analysis booklet
- Modeling - How to convert a business problem stated in English to an analytics problem, How to logically break it down into manageable chunks, Generating an exhaustive list of hypotheses, Identifying the data sets and elements required, Evaluating the modeling choices that are available and deciding on what will work best in a given situation, Analytical Dataset preparation, Feature Engineering to enhance a dataset
- Basics of Programming - Understanding the OOP and Functional paradigms of programming and how to easily adapt to different programming languages, Complexity and Big O notation to understand the cost of queries and hence the need for Wrapper classes and functions
- SQL - Understanding RDBMS, Writing efficient queries and joins, Connecting to SQL from R
- Power BI - Generating a simple report - Connecting to a database, Data ingestion, Manipulating simple elements and formatting to create a report
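As a rough illustration of the "check for missing values and treat such cases" and "basic Descriptive statistics" items in the Basic Statistics topic above, here is a minimal sketch. It is written in Python using only the standard library purely for illustration (the module itself works in R). The `records` data, the column names, and the choice of median imputation are all assumptions made for the example, not part of the training plan.

```python
import statistics

# Hypothetical policy records (made up for this sketch); None marks a missing value.
records = [
    {"policy_id": 1, "premium": 1200.0, "claims": 0},
    {"policy_id": 2, "premium": None, "claims": 1},
    {"policy_id": 3, "premium": 950.0, "claims": None},
    {"policy_id": 4, "premium": 1430.0, "claims": 2},
    {"policy_id": 5, "premium": 1010.0, "claims": 0},
]


def missing_counts(rows):
    """Count missing (None) entries per column."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (value is None)
    return counts


def impute_median(rows, column):
    """Return copies of the rows with missing values in `column` replaced by the column median."""
    observed = [row[column] for row in rows if row[column] is not None]
    fill = statistics.median(observed)
    return [{**row, column: fill if row[column] is None else row[column]} for row in rows]


def describe(rows, column):
    """Basic descriptive statistics for the non-missing values of one column."""
    values = [row[column] for row in rows if row[column] is not None]
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


if __name__ == "__main__":
    print(missing_counts(records))
    cleaned = impute_median(records, "premium")
    print(describe(cleaned, "premium"))
```

Median imputation is only one of several possible treatments. Dropping incomplete rows or flagging them for follow-up with the client may be more appropriate, depending on why the values are missing.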
Specialist - Modeling Module
The Modeling module is designed to introduce Statistical modeling concepts. It provides examples and practice in the basics of Univariate and Bivariate models. Finally, there will be an application project - Cross-sell of Insurance products
- Setting up a Good Exploratory Data Analysis (EDA) - How to set up a good EDA, What conclusions should (and should not) be drawn from an EDA, How to present the results of an EDA
- Summary Statistics - Drawing conclusions from the EDA, Selecting variables (columns) for a model, Variable transformations and the situations in which they should be applied, Modality of the data and deciding on the number of models to build, Ensemble approach to modeling and when it is applicable
- Building a Model - Selecting an appropriate modeling technique from available options, How to build a first model, Judging the goodness of a model and what parameters to look at to make a determination, Iterating to improve accuracy, with tips and tricks (Bootstrapping and Boosting), Comparing two models to determine which is better, Interpreting the results of a model, Final model selection
- Model Deployment - How to convert a model output into Business recommendations, How to deploy your model in a production environment, How to generate further analysis leads out of a Model output, How to publish the results of a completed modeling exercise, An appreciation of the various ways to calculate and catalogue the impact created by mathematical models, How to monitor a model for degradation and decide when to update it
- Solution visualization - How to generate a coherent presentation out of your analyses, How to present them succinctly and impactfully, What analyses and project artifacts to create such that the project is properly documented
- (If time permits)
  - Basics of Machine Learning and AI
  - Supervised and Unsupervised Learning
  - Support Vector Machines
  - Random Forest
  - Basics of Artificial Neural Nets using TensorFlow (Google platform for Neural Nets)
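To make the "Building a Model" topic in this module more concrete, the sketch below builds a first univariate model and compares it with a baseline on held-out data. It is written in Python using only the standard library purely for illustration (the module itself works in R). The data points, the function names, and the choice of RMSE as the comparison metric are assumptions made for the example, not the prescribed method.

```python
import math

# Hypothetical (x, y) pairs, e.g. customer tenure vs. number of products held.
train = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.9), (5, 5.1), (6, 6.0)]
holdout = [(7, 7.1), (8, 7.8), (9, 9.2)]


def fit_mean(data):
    """Baseline model: always predict the training mean of y."""
    mean_y = sum(y for _, y in data) / len(data)
    return lambda x: mean_y


def fit_line(data):
    """Univariate least-squares line y = intercept + slope * x."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def rmse(model, data):
    """Root mean squared error of a fitted model on a list of (x, y) pairs."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in data) / len(data))


if __name__ == "__main__":
    baseline = fit_mean(train)
    line = fit_line(train)
    # Compare both models on data that was not used for fitting.
    print("baseline RMSE:", rmse(baseline, holdout))
    print("line RMSE:", rmse(line, holdout))
```

Scoring both models on a holdout set rather than on the training data is what makes the comparison meaningful. A model that only looks better on the data it was fitted to may simply be overfitting.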
Specialist - Visualization Module
- Thinking of a Dashboard in the Server-Client paradigm
  - Frontend as a User-friendly layer
    - User base and their usage characteristics
    - How to design drill-downs - When are they useful vs when do they confuse?
  - Backend as a data efficiency layer
    - Designing for required user load
    - Determining architecture based on usage
- Advanced Power BI
  - Connecting to databases
  - Creating self-updating dashboards
  - Advanced R visuals for Power BI
  - Using Maps
- R Shiny
  - Creating basic server-client code
  - Publishing a dashboard
  - Frequently used apps and the code for them
  - A few tips and tricks for publishing apps
  - How to customize interactions
  - How to customize appearance
- Model performance monitoring using a Dashboard
  - Dashboard build 1 - Power BI application
  - Dashboard build 2 - R Shiny application
- (If time permits)
  - D3 - Data-Driven Documents (JavaScript-based visualization)
  - Bokeh - Python-based visualization library
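For the "Model performance monitoring using a Dashboard" topic in this module, the sketch below shows one simple quantity such a dashboard could display: rolling accuracy, with an alert when it falls below a baseline by more than a tolerance. It is written in Python using only the standard library purely for illustration (the dashboards themselves are built in Power BI and R Shiny). The function names, the window size, and the thresholds are hypothetical choices for the example.

```python
def rolling_accuracy(outcomes, window):
    """Accuracy over each consecutive window of time-ordered outcomes.

    `outcomes` is a list of booleans: True where the model's prediction was correct.
    """
    return [
        sum(outcomes[end - window:end]) / window
        for end in range(window, len(outcomes) + 1)
    ]


def degradation_alerts(outcomes, window, baseline, tolerance):
    """Return (index, accuracy) pairs where rolling accuracy drops below baseline - tolerance.

    The index is the position of the last observation in the offending window.
    """
    alerts = []
    for offset, accuracy in enumerate(rolling_accuracy(outcomes, window)):
        if accuracy < baseline - tolerance:
            alerts.append((offset + window - 1, accuracy))
    return alerts


if __name__ == "__main__":
    # Hypothetical history: the model is right 8 times, then wrong 4 times in a row.
    history = [True] * 8 + [False] * 4
    print(rolling_accuracy(history, window=4))
    print(degradation_alerts(history, window=4, baseline=0.9, tolerance=0.2))
```

In practice the baseline would come from the accuracy measured when the model was validated, and the tolerance would be agreed with the business owner of the model.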
Business Translator series
This series will have hands-on practice sessions on how to create analyses and project artifacts.
- Understanding your organization
  - Importance of Client context
  - Importance of Problem context
- Problem Design
  - Hypothesis-driven vs Data-driven
  - Analytical design of a Business problem
  - Understanding required data sources
  - Data sources outside the firewall
- Solution Visualization
  - Data source mapping
  - Code flow and Problem flow diagrams
  - Creating Solution artifacts
  - Visualizing end product and its usability by end users
- Insight Generation - A helpful framework
- Storyboarding and Client presentation
- Estimating and Tracking Impact of a project
- Communication with stakeholders and setting expectations
- Creating project artifacts
- Creating an Analytics roadmap