Financial AI Engineer Profile

Ignasius Rabi Blolong

52250073

What Is the Main Purpose of Our Study (Data Science Programming)?

To understand the “why,” it helps to look at the three main goals you are trying to achieve:

Automating Data Wrangling

Real-world data is rarely “clean.” It’s usually scattered across different formats, full of errors, or missing pieces. Programming allows you to create automated pipelines that ingest, clean, and transform this data at a scale that would be impossible to do manually.
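
As a rough illustration of such a pipeline, here is a minimal sketch using pandas. The CSV text, the column names, and the helper functions (ingest, clean, transform, run_pipeline) are all hypothetical and exist only for this example. A real pipeline would read from files or databases and would need cleaning rules suited to its own data.

```python
import io

import pandas as pd

# Hypothetical raw export: inconsistent casing, stray whitespace,
# one duplicated row, and one missing amount.
RAW_CSV = """customer,amount,date
 alice ,100.5,2026-01-03
BOB,,2026-01-04
 alice ,100.5,2026-01-03
Carol,87.25,2026-01-05
"""


def ingest(text: str) -> pd.DataFrame:
    """Read raw CSV text into a DataFrame."""
    return pd.read_csv(io.StringIO(text))


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Normalize names, drop exact duplicates, drop rows with no amount."""
    out = df.copy()
    out["customer"] = out["customer"].str.strip().str.title()
    out = out.drop_duplicates()
    out = out.dropna(subset=["amount"])
    return out.reset_index(drop=True)


def transform(df: pd.DataFrame) -> pd.DataFrame:
    """Parse the date column and derive the month from it."""
    out = df.copy()
    out["date"] = pd.to_datetime(out["date"])
    out["month"] = out["date"].dt.month
    return out


def run_pipeline(text: str) -> pd.DataFrame:
    """Chain the three stages so the whole process is one call."""
    return transform(clean(ingest(text)))


result = run_pipeline(RAW_CSV)
print(result)
```

Because each stage is a plain function, the same pipeline can be rerun on next month's export without repeating any manual steps.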

Building Predictive Models

This is where the “science” happens. You use programming to implement mathematical algorithms—like Linear Regression or Neural Networks—that can identify patterns. The purpose is to move from descriptive analysis (what happened?) to predictive analysis (what will happen next?).
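
To make the descriptive-versus-predictive distinction concrete, here is a small sketch of Linear Regression fitted with NumPy's least-squares polynomial fit. The ad-spend and sales numbers are invented for illustration, and the predict helper is a hypothetical name. A real model would need far more data and a proper evaluation step.

```python
import numpy as np

# Hypothetical data: monthly ad spend (x) and sales (y), both in thousands.
ad_spend = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
sales = np.array([3.1, 4.9, 7.2, 8.8, 11.0])

# Descriptive analysis: what happened?
average_sales = sales.mean()

# Predictive analysis: fit sales = slope * ad_spend + intercept
# by ordinary least squares (a degree-1 polynomial fit).
slope, intercept = np.polyfit(ad_spend, sales, deg=1)


def predict(x: float) -> float:
    """Estimate sales for an ad spend that is not in the data."""
    return slope * x + intercept


forecast = predict(6.0)
print(f"average so far: {average_sales:.2f}, forecast at 6.0: {forecast:.2f}")
```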

Communicating Complex Findings

A massive spreadsheet is useless to a CEO. Programming allows you to create sophisticated visualizations and dashboards. By turning numbers into visual stories, you help non-technical stakeholders make better decisions.
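
As one possible illustration, the sketch below turns four numbers into a bar chart with Matplotlib and highlights the best quarter. The revenue figures are made up, and the chart is written to an in-memory buffer only so the example runs without a display or a file path.

```python
import io

import matplotlib

matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt

# Hypothetical quarterly revenue figures, in millions.
quarters = ["Q1", "Q2", "Q3", "Q4"]
revenue = [1.2, 1.5, 1.1, 1.9]

fig, ax = plt.subplots(figsize=(6, 3))
bars = ax.bar(quarters, revenue, color="steelblue")

# Highlight the best quarter so the main message is visible at a glance.
best = revenue.index(max(revenue))
bars[best].set_color("darkorange")

ax.set_title("Revenue by quarter (hypothetical data)")
ax.set_ylabel("Revenue (millions)")

# Save to an in-memory PNG; pass a file path instead to write to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
png_bytes = buffer.getvalue()
```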

Why Programming Specifically?

Reproducibility:

You can share your code so others can run the exact same analysis and get the same results.
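
One common ingredient of reproducibility is fixing the random seed. The sketch below uses Python's standard random module. The bootstrap_mean function and its data are hypothetical, and identical results are only expected when the same code runs on the same Python version.

```python
import random


def bootstrap_mean(values: list[float], n_resamples: int, seed: int) -> float:
    """Average of resampled means; the seed fixes the random draws."""
    rng = random.Random(seed)  # local generator, no hidden global state
    means = []
    for _ in range(n_resamples):
        sample = rng.choices(values, k=len(values))
        means.append(sum(sample) / len(sample))
    return sum(means) / len(means)


data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
first_run = bootstrap_mean(data, n_resamples=200, seed=42)
second_run = bootstrap_mean(data, n_resamples=200, seed=42)

# Same code, same data, same seed: the two runs agree exactly.
print(first_run == second_run)
```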

Scalability:

The same script that analyzes 100 rows of data can often analyze 1,000,000 rows with minimal changes.
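
A small sketch of this idea, assuming NumPy and randomly generated amounts in place of real data: the hypothetical summarize function is written once and called unchanged on 100 rows and on 1,000,000 rows.

```python
import numpy as np


def summarize(amounts: np.ndarray) -> dict:
    """Return basic summary figures for any number of rows."""
    return {
        "rows": int(amounts.size),
        "total": float(amounts.sum()),
        "mean": float(amounts.mean()),
    }


rng = np.random.default_rng(0)  # fixed seed so reruns give the same numbers
small = rng.uniform(1.0, 100.0, size=100)
large = rng.uniform(1.0, 100.0, size=1_000_000)

small_summary = summarize(small)  # 100 rows
large_summary = summarize(large)  # 1,000,000 rows, same function
```

The word "often" in the claim matters: once the data no longer fits in memory, the code usually does need to change, for example by processing it in chunks.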

Flexibility:

You aren’t limited by the “buttons” a software company decided to give you; if you can imagine a way to process data, you can code it.

What Do We Learn About It?

Diving into Data Science Programming is like getting a VIP pass to the “behind the scenes” of how the modern world works. You aren’t just learning one thing; you’re learning a multidisciplinary toolkit. Here is the roadmap of what you typically learn in this study:

The Language & Syntax (The “How”)

Before you can analyze data, you have to speak the language. Usually, this means Python or R.

Foundations:

Variables, loops, functions, and logic.

Libraries:

Learning the “superpowers” of the language, like Pandas and NumPy for Python, which allow you to manipulate massive tables of data as easily as a small spreadsheet.
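
Here is a brief sketch of what these libraries look like in use. The trades table and its column names are invented for the example. Each operation is written once and applies to every row, whether the table has five rows or five million.

```python
import numpy as np
import pandas as pd

# Hypothetical trades table.
trades = pd.DataFrame(
    {
        "ticker": ["AAA", "BBB", "AAA", "CCC", "BBB"],
        "price": [10.0, 20.0, 12.0, 5.0, 22.0],
        "quantity": [100, 50, 200, 400, 10],
    }
)

# Vectorized arithmetic: one expression covers every row, no explicit loop.
trades["notional"] = trades["price"] * trades["quantity"]

# Pandas group-by: total notional per ticker.
per_ticker = trades.groupby("ticker")["notional"].sum()

# NumPy functions operate on the underlying array directly.
log_prices = np.log(trades["price"].to_numpy())

print(per_ticker)
```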

Data Manipulation (The “Clean-up”)

In the real world, data is messy, duplicated, and often wrong. You learn how to:

Wrangle Data:

Filter, sort, and merge different datasets.

Handle Missing Values:

Decide whether to delete “broken” data or fill it in using statistical methods.

SQL:

Query databases to pull the specific information you need.
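
The three skills above can be sketched together in one short example. An in-memory SQLite database stands in for a company database, and the table names, columns, and values are all hypothetical. Filling a gap with the median is only one of several reasonable choices, and dropping the row would be equally valid in some cases.

```python
import sqlite3

import pandas as pd

# SQL: pull only the rows and columns you need.
# An in-memory SQLite database stands in for a real one here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10, 250.0), (2, 11, None), (3, 10, 40.0), (4, 12, 90.0)],
)
orders = pd.read_sql_query(
    "SELECT order_id, customer_id, amount FROM orders "
    "WHERE customer_id IN (10, 11) ORDER BY order_id",
    conn,
)
conn.close()

# A second dataset that lives outside the database.
customers = pd.DataFrame({"customer_id": [10, 11], "name": ["Alice", "Bob"]})

# Handle missing values: fill the gap with the median of the known amounts.
median_amount = orders["amount"].median()
orders["amount"] = orders["amount"].fillna(median_amount)

# Wrangle: merge the two datasets, then filter and sort.
merged = orders.merge(customers, on="customer_id", how="left")
large_orders = merged[merged["amount"] >= 100].sort_values(
    "amount", ascending=False
)

print(large_orders)
```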

Exploratory Data Analysis (EDA)

This is the detective work. You learn how to use code to find hidden trends:

Statistics:

Calculating means, medians, variance, and correlations to see if two things are related.

Visualization:

Using libraries like Matplotlib, Seaborn, or ggplot2 to create charts that make patterns “pop” out visually.
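
For the statistics half of EDA, a minimal sketch with NumPy might look like this. The study-hours and exam-score data are invented for the example. A high correlation here only says the two columns move together, not that one causes the other.

```python
import numpy as np

# Hypothetical observations: hours studied and exam score for eight students.
hours = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
scores = np.array([52, 55, 61, 64, 70, 74, 79, 85], dtype=float)

mean_score = scores.mean()
median_score = np.median(scores)
variance_score = scores.var(ddof=1)  # sample variance (divides by n - 1)

# Pearson correlation: +1 is a perfect increasing linear relationship,
# 0 is no linear relationship, -1 is a perfect decreasing one.
correlation = np.corrcoef(hours, scores)[0, 1]

print(mean_score, median_score, variance_score, round(correlation, 3))
```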

Machine Learning (The “Brain”)

This is often the most exciting part. You learn how to “train” a computer to make decisions:

Supervised Learning:

Teaching the AI by showing it examples (e.g., “Here are 1,000 emails labeled as spam; now you find the next one”).

Unsupervised Learning:

Letting the AI find its own groups in the data (e.g., “Group these 1,000 customers into 3 types based on their buying habits”).
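
The difference between the two can be sketched with NumPy alone. This is a from-scratch illustration, not how a production model would be built: a nearest-centroid classifier stands in for supervised learning and a plain k-means loop stands in for unsupervised learning. The features, data, and function names are hypothetical, and in practice a library such as Scikit-Learn would normally be used instead.

```python
import numpy as np

# --- Supervised: labeled examples in, a rule for new examples out. ---
# Hypothetical features per email: [number of links, mentions of "free"].
X_train = np.array([[8, 5], [7, 6], [9, 4], [1, 0], [0, 1], [2, 0]], dtype=float)
y_train = np.array([1, 1, 1, 0, 0, 0])  # 1 = spam, 0 = not spam


def fit_centroids(X: np.ndarray, y: np.ndarray) -> dict:
    """Learn one average point (centroid) per label."""
    return {int(label): X[y == label].mean(axis=0) for label in np.unique(y)}


def predict(centroids: dict, x: np.ndarray) -> int:
    """Assign the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: np.linalg.norm(x - centroids[label]))


centroids = fit_centroids(X_train, y_train)
new_email_label = predict(centroids, np.array([6.0, 5.0]))


# --- Unsupervised: no labels; the algorithm proposes the groups. ---
def kmeans(X: np.ndarray, k: int, n_iter: int = 10) -> np.ndarray:
    """Plain k-means. Starts from the first k rows to stay deterministic;
    real implementations usually pick starting centers at random."""
    centers = X[:k].copy()
    for _ in range(n_iter):
        # Distance from every point to every center, shape (n_points, k).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a group is empty
                centers[j] = X[labels == j].mean(axis=0)
    return labels


# Hypothetical customers: [orders per month, average basket size].
customers = np.array(
    [[1, 10], [10, 50], [5, 200], [2, 12], [1, 11], [11, 52], [9, 49], [6, 210]],
    dtype=float,
)
groups = kmeans(customers, k=3)

print(new_email_label, groups.tolist())
```

Note that the supervised part needed y_train (the answers), while the k-means part received only the customer features and had to find the three groups on its own.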

Algorithmic Thinking

You learn how to break a big, scary problem into small, logical steps. This is a skill that helps you even when you aren’t sitting at a computer. It’s about efficiency—finding the fastest way to get an answer without crashing the system.
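
A small example of this kind of thinking, using only the Python standard library: the question "does this list of transaction IDs contain a duplicate?" can be answered in two ways that give the same result but scale very differently. The function names and sample IDs are hypothetical.

```python
def has_duplicate_slow(ids: list[int]) -> bool:
    """Compare every pair: roughly n * n / 2 comparisons."""
    for i in range(len(ids)):
        for j in range(i + 1, len(ids)):
            if ids[i] == ids[j]:
                return True
    return False


def has_duplicate_fast(ids: list[int]) -> bool:
    """Remember what has been seen: one pass with about n set lookups."""
    seen = set()
    for value in ids:
        if value in seen:
            return True
        seen.add(value)
    return False


unique_ids = [101, 102, 103, 104]
repeated_ids = [101, 102, 103, 102]

print(has_duplicate_slow(repeated_ids), has_duplicate_fast(repeated_ids))
```

On four IDs the difference is invisible. On a million IDs the first version needs on the order of hundreds of billions of comparisons, while the second needs about a million lookups.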

What Tools Do We Need to Master?

The “Must-Have” Programming Languages

These are your primary tools for writing logic and building models.

Python:

The undisputed king. It is used for everything from data cleaning to Deep Learning.

SQL (Structured Query Language):

Absolutely essential. You use it to pull data out of company databases. If you can’t use SQL, you can’t get the data to analyze it.

R:

Still the gold standard for heavy statistical research and academic-grade data visualization.

Core Python Libraries (The “Big Four”)

If you choose Python, you should know these four libraries well. They handle most of the day-to-day heavy lifting.

Pandas:

For data manipulation (think of it as “Excel on steroids”).

NumPy:

For high-performance mathematical calculations.

Matplotlib / Seaborn:

For creating charts and data visualizations.

Scikit-Learn:

The primary library for building Machine Learning models (like predictions and classifications).

Advanced AI & Big Data Tools

As you move toward “Expert” status, you will need tools that handle massive amounts of data or complex AI.

PyTorch or TensorFlow:

These are used for Deep Learning (creating Neural Networks for things like image recognition or AI chatbots).

Apache Spark (PySpark):

Used when your data is too big for one computer to handle (Big Data).

Hugging Face:

The standard platform for using and deploying pre-trained AI models (like Large Language Models).

Visualization & Business Intelligence (BI)

Sometimes, you need to show your results to people who don’t code.

Tableau or Power BI:

These allow you to create interactive dashboards that executives can click through.

Streamlit:

A newer tool that lets you turn a Python script into a web app in minutes.

Environment & Collaboration

Where do you actually write the code?

Jupyter Notebooks / Google Colab:

The standard “playground” for data scientists to write code and see results instantly.

Git & GitHub:

For version control. This allows you to track changes in your code and collaborate with a team without overwriting each other’s work.

Data to Insights: Workflow Summary

Project Phase	Main Tools (R)	Specific Skill	Priority Status
Ingestion	SQL, DBI, httr	Data Retrieval	Core Skill
Cleaning	tidyverse (dplyr)	Data Wrangling	Core Skill
Analysis	ggplot2, stats	Statistical Insight	Advanced
Modeling	tidymodels, XGBoost	Predictive AI	Expert Path
Delivery	Quarto, Shiny	Automated Reporting	Production

What Is Your Domain of Interest (Data Science)?

I am specializing in Financial AI Engineering because, in 2026, data science is no longer just about analysis. It is about building AI agents that can autonomously manage trades, detect fraud, and mitigate risks in real time. My focus is on mastering Explainable AI (XAI) to ensure these autonomous systems remain transparent and compliant with financial regulations. By merging deep programming expertise with high-stakes financial strategy, I aim to become a key architect of the modern digital economy.

Assignment Week 1

2026-03-02