Data Science Curriculum Pathway
Course Skills (by Course)
| CSC 201: Intro to Data Science | CSC 303: Data Science Foundations | CSC 464: Machine Learning at Scale | CSC 463: Artificial Intelligence | CSC 477: Visualizations |
|---|---|---|---|---|
| Identify data types, collection methods, licensing, and trusted data sources. | Master advanced R programming for efficient, maintainable code. | Explain ML concepts, applications, and challenges: why ML, how humans learn, and how machines learn. | Explain the hierarchical relationship among AI, ML, and DL (AI > ML > DL), with a primary focus on DL, while also introducing LLMs and RAG. | Explain the role and principles of data visualization in data science. |
| Acquire, clean, and preprocess data across multiple platforms. | Perform advanced data manipulation using dplyr, data.table, and the broader tidyverse ecosystem. | Design, select, train, and deploy a machine learning model, from data preparation to model maintenance, using data at scale (Big Data stack setup, storage & acquisition, batch/real-time/interactive analytics). | Define the components and structure of a neural network, and develop deep learning models using neural networks. | Conduct EDA to guide visualization choices. |
| Conduct exploratory data analysis to identify patterns and trends. | Build static visualizations using base R and ggplot2 to understand the grammar of graphics. | Master supervised learning methods such as linear models, kNN, support vector machines, decision trees, ensemble learning, random forests, and gradient-boosted trees. | Build CNN models. | Create static and interactive visualizations and dashboards using ggplot2, Matplotlib, Seaborn, Plotly, Tableau, Quarto, and Jupyter. |
| Create clear and informative visualizations. | Apply regression, ANOVA, logistic regression, time series analysis, and diagnostics; implement supervised (classification/regression) and unsupervised (clustering/dimensionality reduction) learning. | Apply advanced unsupervised learning: dimensionality reduction and feature extraction, k-means, anomaly and novelty detection, and an introduction to artificial neural networks. | Build Keras neural networks. | Apply design principles for clarity and accessibility, and create visualizations that incorporate statistical measures such as p-values and other key metrics to communicate insights effectively. |
| Survey, Introductory | Foundations | Master Machine Learning | Advanced DL, LLM, RL, RAG | Master Visualization |
Course Goals
| Course | Goal |
|---|---|
| CSC 201 | Introductory survey of data science concepts; build literacy in data acquisition, cleaning, and exploratory analysis. |
| CSC 303 | Strengthen core data science skills with advanced R; build the math concepts needed for higher-level ML/AI. |
| CSC 464 | Master machine learning for classification/regression; focus on evaluation, bias–variance, feature engineering, hyperparameter tuning, and deployment; practice ensembles, kernels, PCA, neural nets, and text/image tasks at scale using Big Data techniques. |
| CSC 463 | Apply advanced AI (deep learning, RL, LLMs, RAG) and evaluate ethical implications. |
| CSC 477 | Master visualization theory and tools; communicate through static, interactive, and dashboard formats. |
Prerequisites, Tools, and Math Requirements
| Course | Prerequisite(s) | Languages & Tools | Books | Math Level |
|---|---|---|---|---|
| CSC 201 | MAT 232 or MAT 325 | R, Excel (base R) | Wickham & Grolemund | Intro Prob & Stat |
| CSC 303 | CSC 201 | R (tidyverse) | DS4A; Hastie et al.; Wickham | Prob & Stat, Calc, Linear Alg |
| CSC 464 | CSC 303 | Python | HOML, 3rd ed. (Géron) | Prob & Stat, Calc, Linear Alg |
| CSC 463 | CSC 303 | R, Python (Keras) | DS4A; Deep Learning with R, 2nd ed. (Allaire); AI (Russell) | Prob & Stat, Calc, Linear Alg |
| CSC 477 | CSC 201 | R, Python, Tableau, Shiny, Quarto, Jupyter | Wickham (R); McKinney (Python); Ryan (Tableau) | Intro Prob & Stat |
Whole picture
Venn diagram
Course Recurrence
| Term | F25 | S26 | F26 | S27 | F27 | S28 | F28 | S29 |
|---|---|---|---|---|---|---|---|---|
| Odd/Even | 463\(^{1}\) | 477 | 463 | 477 | 463 | | | |
| Every Year | 201, 464\(^{2}\) | 303 | 201, 464 | 303 | 201, 464 | 303 | 201, 464 | 303 |
References
- Wickham, H., Çetinkaya-Rundel, M., & Grolemund, G. (2023). R for Data Science (2nd ed.). https://r4ds.hadley.nz/
- Grolemund, G. (2014). Hands-On Programming with R. https://rstudio-education.github.io/hopr/
- James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to Statistical Learning (2nd ed.). https://www.statlearning.com/
- Russell, S., & Norvig, P. (2021). Artificial Intelligence: A Modern Approach (4th ed.). https://aima.cs.berkeley.edu/
- Chollet, F., Kalinowski, T., & Allaire, J. J. (2022). Deep Learning with R (2nd ed.). Manning Publications. https://www.manning.com/books/deep-learning-with-r-second-edition
- Valderrama, E. F. (2024). Probability and Statistics with R. https://efvalder.github.io/RProbStatBook/