Ignasius Rabi Blolong
52250073
To understand the “why,” it helps to look at the three main goals you are trying to achieve:
Data wrangling: Real-world data is rarely “clean.” It is usually scattered across different formats, full of errors, or missing pieces. Programming lets you build automated pipelines that ingest, clean, and transform this data at a scale that would be impossible to handle manually.
Modeling: This is where the “science” happens. You use programming to implement mathematical algorithms, such as Linear Regression or Neural Networks, that can identify patterns. The purpose is to move from descriptive analysis (what happened?) to predictive analysis (what will happen next?).
Communication: A massive spreadsheet is of little use to a CEO. Programming allows you to create sophisticated visualizations and dashboards. By turning numbers into visual stories, you help non-technical stakeholders make better decisions.
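The move from descriptive to predictive analysis can be sketched with a straight line fitted by ordinary least squares. This is only an illustration: the monthly sales figures are invented, and `fit_line` is a hypothetical helper written for this example (in practice a library such as Scikit-learn would usually do the fitting).

```python
from statistics import mean


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: sales (in thousands) for months 1 to 6.
months = [1, 2, 3, 4, 5, 6]
sales = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]

# Descriptive analysis: what happened?
average_sales = mean(sales)

# Predictive analysis: what will happen next?
slope, intercept = fit_line(months, sales)
forecast_month_7 = slope * 7 + intercept

print(f"average so far: {average_sales:.1f}")
print(f"forecast for month 7: {forecast_month_7:.1f}")
```

The average only summarizes the past, while the fitted line extends the trend one month ahead.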
Reproducibility: You can share your code so others can run the exact same analysis and get the same results.
Scalability: The same script that analyzes 100 rows of data can often analyze 1,000,000 rows with minimal changes.
Flexibility: You aren’t limited by the “buttons” a software company decided to give you; if you can imagine a way to process data, you can code it.
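As a rough sketch of these points, the function below ingests CSV text, drops broken rows, and normalizes the rest. The column names `city` and `amount`, the sample rows, and the helper name `clean_rows` are all assumptions made for illustration. Because the function processes one row at a time, the same code runs unchanged on a handful of rows or on a much larger input.

```python
import csv
import io


def clean_rows(lines):
    """Ingest CSV lines, drop broken rows, and normalize the rest."""
    for row in csv.DictReader(lines):
        city = (row.get("city") or "").strip().title()
        raw_amount = (row.get("amount") or "").strip()
        try:
            amount = float(raw_amount)
        except ValueError:
            continue  # skip rows whose amount is missing or not a number
        if not city:
            continue  # skip rows with no city
        yield {"city": city, "amount": amount}


# Hypothetical messy input: stray spaces, a bad number, a missing city.
small_input = io.StringIO(
    "city,amount\n"
    " jakarta ,100\n"
    "BANDUNG,abc\n"
    ",50\n"
    "surabaya,75.5\n"
)
small_result = list(clean_rows(small_input))
print(small_result)

# The same function, unchanged, on a much larger generated input.
big_lines = ["city,amount"] + [f"city{i},{i}" for i in range(100_000)]
big_total = sum(row["amount"] for row in clean_rows(big_lines))
print(big_total)
```

Anyone who runs this script on the same input gets the same cleaned rows, which is the reproducibility benefit in miniature.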
Diving into Data Science Programming is like getting a VIP pass to the “behind the scenes” of how the modern world works. You aren’t just learning one thing; you’re learning a multidisciplinary toolkit. Here is a roadmap of what you typically learn in this field:
Variables, loops, functions, and logic.
Learning the “superpowers” of the language, like Pandas and NumPy for Python, which allow you to manipulate massive tables of data as easily as a small spreadsheet.
Filtering, sorting, and merging different datasets.
Deciding whether to delete “broken” data or fill it in using statistical methods.
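The choice between deleting and filling can be illustrated with a tiny example. The ages below are invented, `None` stands in for a missing value, and the median is just one of several possible fill strategies.

```python
from statistics import median

# Hypothetical ages; None marks a missing ("broken") value.
ages = [23, None, 31, 27, None, 45]

# Option 1: delete the entries with missing values.
dropped = [a for a in ages if a is not None]

# Option 2: fill each gap with the median of the observed values.
fill_value = median(dropped)
filled = [fill_value if a is None else a for a in ages]

print(dropped)
print(filled)
```

Deleting keeps only real observations but shrinks the dataset; filling keeps every row at the cost of introducing estimated values.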
Learning how to talk to databases to pull the specific information you need.
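A minimal sketch of talking to a database from Python, using the standard-library `sqlite3` module and a throwaway in-memory database. The `customers` and `orders` tables and their contents are hypothetical; a real company database would have its own schema and connection details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ani'), (2, 'Budi'), (3, 'Citra');
    INSERT INTO orders VALUES (1, 1, 250.0), (2, 2, 40.0), (3, 1, 120.0), (4, 3, 500.0);
    """
)

# Merge (JOIN) the two tables, filter (WHERE) large orders, and sort (ORDER BY).
rows = conn.execute(
    """
    SELECT c.name, o.amount
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    WHERE o.amount >= ?
    ORDER BY o.amount DESC
    """,
    (100.0,),
).fetchall()
conn.close()

print(rows)
```

The query asks only for the rows and columns needed, so the filtering happens in the database rather than in Python.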
Calculating means, medians, and variances to summarize the data, and correlations to see whether two variables are related.
Using libraries like Matplotlib, Seaborn, or ggplot2 to create charts that make patterns “pop” out visually.
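The summary statistics mentioned above can be computed with Python's standard `statistics` module. The study-hours and exam-score data are invented, and `pearson` is a hypothetical helper that spells out the correlation formula by hand.

```python
from math import sqrt
from statistics import mean, median, variance

# Hypothetical data: hours studied and exam score for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 70, 72]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    spread_x = sum((x - x_bar) ** 2 for x in xs)
    spread_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / sqrt(spread_x * spread_y)


mean_score = mean(scores)
median_score = median(scores)
score_variance = variance(scores)  # sample variance (divides by n - 1)
r = pearson(hours, scores)

print(mean_score, median_score, score_variance)
print(f"correlation between hours and scores: {r:.2f}")
```

A correlation close to 1 suggests the two variables rise together, though it does not by itself show that one causes the other.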
Supervised learning: Teaching the AI by showing it labeled examples (e.g., “Here are 1,000 emails labeled as spam or not spam; now classify the next one”).
Unsupervised learning: Letting the AI find its own groups in the data (e.g., “Group these 1,000 customers into 3 types based on their buying habits”).
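The difference between the two approaches can be sketched in plain Python. Everything here is a simplifying assumption: the single numeric feature, the toy data, and the helpers `train`, `predict`, and `kmeans_1d` are hypothetical, and real projects would normally rely on a library such as Scikit-learn. The first half uses a nearest-centroid classifier (learning from labels); the second half uses a one-dimensional k-means (finding groups without labels).

```python
from statistics import mean

# --- Supervised: learn from labeled examples (nearest-centroid classifier) ---
# Hypothetical feature: number of suspicious words in each email.
labeled = [(0, "ham"), (1, "ham"), (2, "ham"), (7, "spam"), (8, "spam"), (9, "spam")]


def train(examples):
    """Return the average feature value (centroid) for each label."""
    by_label = {}
    for value, label in examples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def predict(centroids, value):
    """Pick the label whose centroid is closest to the new value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


centroids = train(labeled)
prediction = predict(centroids, 6)
print(prediction)


# --- Unsupervised: no labels, the algorithm finds the groups (1-D k-means) ---
def kmeans_1d(values, k, iterations=10):
    """Group numbers into k clusters; assumes k >= 2 and len(values) >= k."""
    ordered = sorted(values)
    # Deterministic start: k centers spread evenly across the sorted data.
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(centers[i] - v))
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return groups


# Hypothetical monthly spending for nine customers.
spending = [12, 15, 11, 95, 102, 99, 48, 52, 50]
groups = kmeans_1d(spending, k=3)
print(groups)
```

In the first half the labels "ham" and "spam" are given up front; in the second half the three spending groups emerge from the numbers alone.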
You learn how to break a big, scary problem into small, logical steps. This is a skill that helps you even when you aren’t sitting at a computer. It’s about efficiency—finding the fastest way to get an answer without crashing the system.
Python: The most widely used language in the field. It is used for everything from data cleaning to Deep Learning.
SQL: Absolutely essential. You use it to pull data out of company databases. If you can’t use SQL, you can’t get the data to analyze it.
R: Still the gold standard for heavy statistical research and academic-grade data visualization.
Pandas: For data manipulation (think of it as “Excel on steroids”).
NumPy: For high-performance mathematical calculations.
Matplotlib and Seaborn: For creating charts and data visualizations.
Scikit-learn: The primary library for building Machine Learning models (for tasks such as prediction and classification).
Deep Learning frameworks (such as TensorFlow and PyTorch): These are used for Deep Learning (creating Neural Networks for things like image recognition or AI chatbots).
Distributed computing frameworks (such as Apache Spark): Used when your data is too big for one computer to handle (Big Data).
Hugging Face: The standard platform for using and deploying pre-trained AI models (like Large Language Models).
Business intelligence tools: These allow you to create interactive dashboards that executives can click through.
Streamlit: A newer tool that lets you turn a Python script into a web app in minutes.
Jupyter Notebook: The standard “playground” for data scientists to write code and see results instantly.
Git: For version control. This allows you to track changes in your code and collaborate with a team without overwriting each other’s work.
| Project Phase | Main Tools (R) | Specific Skill | Priority Status |
|---|---|---|---|
| | SQL, DBI, httr | Data Retrieval | Core Skill |
| | tidyverse (dplyr) | Data Wrangling | Core Skill |
| | ggplot2, stats | Statistical Insight | Advanced |
| | tidymodels, XGBoost | Predictive AI | Expert Path |
| | Quarto, Shiny | Automated Reporting | Production |
I am specializing in Financial AI Engineering because, in 2026, data science is no longer just about analysis; it is also about building AI agents that can autonomously manage trades, detect fraud, and mitigate risks in real time. My focus is on mastering Explainable AI (XAI) so that these autonomous systems remain transparent and compliant with financial regulations. By merging deep programming expertise with high-stakes financial strategy, I aim to become a key architect of the modern digital economy.