Assignment Week 2 – Introduction to Data Science Programming
Data Science Programming | Week 2
1 ~ The Main Purpose Of Learning
The main purpose of studying Data Science Programming is to understand how to transform raw data into meaningful insights that support decision-making and innovation.
In this course, we learn how data is collected, cleaned, analyzed, and modeled using programming tools. The goal is not only to understand theory, but also to build practical skills in analyzing real-world problems using data.
Through programming, we can:
Process and manipulate data efficiently
Discover patterns and trends
Build predictive models
Communicate insights clearly through visualization
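As a rough illustration of the collect, clean, analyze, and model steps described above, here is a minimal sketch that uses only the Python standard library. The figures, the column names (`month`, `ad_spend`, `sales`), and the helper functions are invented for illustration and do not come from a real dataset.

```python
import csv
import io

# Hypothetical raw data: the row for month 3 has a missing sales value.
RAW_CSV = """month,ad_spend,sales
1,10,25
2,20,45
3,30,
4,40,85
5,50,105
"""

def load_rows(text):
    """Collect: read CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def clean_rows(rows):
    """Clean: drop rows with missing values and convert strings to numbers."""
    cleaned = []
    for row in rows:
        if all(value.strip() for value in row.values()):
            cleaned.append({key: float(value) for key, value in row.items()})
    return cleaned

def fit_line(xs, ys):
    """Model: ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

rows = clean_rows(load_rows(RAW_CSV))
ad_spend = [r["ad_spend"] for r in rows]
sales = [r["sales"] for r in rows]

# Analyze: a simple summary statistic.
average_sales = sum(sales) / len(sales)

# Model: estimate how sales change with advertising spend.
slope, intercept = fit_line(ad_spend, sales)

print(f"rows kept: {len(rows)}, average sales: {average_sales:.1f}")
print(f"sales ~ {slope:.2f} * ad_spend + {intercept:.2f}")
```

In practice, libraries such as pandas would replace the hand-written loading and cleaning steps; the plain-Python version is used here only to make each stage visible.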
2 ~ Why Do We Learn Data Science Programming
We learn Data Science because we live in a data-driven world. Almost every activity, including online shopping, social media, banking, healthcare, and education, produces massive amounts of data. Without proper analysis, this data provides little value.
Data has also become one of the most valuable assets in today’s digital world. Businesses, governments, and researchers rely on it to make strategic decisions.
By studying Data Science, we can:
1. Make Better Decisions
Organizations use Data Science to make data-driven decisions instead of relying on assumptions. For example:
Companies predict customer behavior.
Banks detect fraud.
Hospitals predict disease risks.
2. Build Intelligent Systems
Data Science allows us to create intelligent systems using Machine Learning and Artificial Intelligence. These systems can:
Recommend products (like Netflix or Shopee recommendations)
Detect spam emails
Predict future trends
3. Understand Patterns and Trends
Through Exploratory Data Analysis (EDA), we can identify:
Hidden relationships between variables
Seasonal trends
Market behavior patterns
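A minimal sketch of what such an exploratory step might look like with pandas, assuming pandas is installed. The monthly figures and the column names (`temperature`, `ice_cream_sales`) are invented for illustration only.

```python
import pandas as pd

# Hypothetical monthly data for a small shop (values invented for illustration).
df = pd.DataFrame({
    "month": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    "temperature": [30, 31, 32, 33, 33, 32, 31, 31, 32, 32, 31, 30],
    "ice_cream_sales": [200, 210, 230, 250, 255, 235, 215, 212, 228, 232, 214, 198],
})

# Relationship between variables: Pearson correlation coefficient.
correlation = df["temperature"].corr(df["ice_cream_sales"])

# Seasonal trend: average sales per quarter.
df["quarter"] = (df["month"] - 1) // 3 + 1
sales_by_quarter = df.groupby("quarter")["ice_cream_sales"].mean()

print(df[["temperature", "ice_cream_sales"]].describe())
print(f"correlation: {correlation:.2f}")
print(sales_by_quarter)
```

A strong correlation in a sketch like this only suggests a relationship worth investigating; it does not show that one variable causes the other.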
4. Career Opportunities
Data Science is one of the most in-demand skills globally. Roles include:
Data Analyst
Data Scientist
Machine Learning Engineer
Business Intelligence Analyst
5. Solve Real-World Problems
Data Science helps solve problems in:
- Healthcare (disease prediction)
Example:
Problem: Hospitals often struggle to predict which patients are at high risk of heart disease.
How Data Science Solves It: By analyzing patient data (age, blood pressure, cholesterol levels), Machine Learning models can predict the risk of heart disease early.
Result: Doctors can give early treatment and reduce mortality rates.
- Finance (risk modeling)
Example:
Problem: Banks face credit card fraud every day. Fraudulent transactions cause financial losses.
How Data Science Solves It: Using anomaly detection models, the system can identify unusual transaction patterns in real-time.
Result: Fraudulent transactions are detected immediately and blocked automatically.
- Business (sales forecasting)
Example:
Problem: A company does not know which products customers will buy next month.
How Data Science Solves It: By analyzing past sales data and customer behavior, predictive models can forecast future sales.
Result: The company can manage stock better and increase profit.
- Government (policy analysis)
Example:
Problem: Traffic congestion in big cities causes long travel times.
How Data Science Solves It: By analyzing traffic sensor data and historical patterns, systems can optimize traffic light timing.
Result: Reduced congestion and smoother transportation.
3 ~ Tools Used To Become An Expert
To become proficient in Data Science, we need to master several tools across different stages of the workflow.
1. PYTHON
Python is a beginner-friendly and powerful programming language widely used in AI, Machine Learning, automation, and big data. It has simple syntax and a huge ecosystem of libraries.
2. R
R is a programming language focused on statistical analysis and data visualization. It is widely used in academic research.
3. ANACONDA
Anaconda is a Python and R distribution specifically designed for Data Science, Machine Learning, and scientific computing.
4. SQL
SQL (Structured Query Language) is used to manage and retrieve data from relational databases.
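To give a feel for how SQL retrieves data, here is a small sketch that runs a query from Python with the standard-library `sqlite3` module and an in-memory database. The `sales` table, its columns, and its rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database; the table and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (product, region, amount) VALUES (?, ?, ?)",
    [
        ("laptop", "west", 1200.0),
        ("laptop", "east", 1100.0),
        ("phone", "west", 800.0),
        ("phone", "east", 750.0),
        ("tablet", "west", 400.0),
    ],
)

# Retrieve total sales per product, largest total first.
query = """
    SELECT product, SUM(amount) AS total
    FROM sales
    GROUP BY product
    ORDER BY total DESC
"""
totals = conn.execute(query).fetchall()
conn.close()

for product, total in totals:
    print(product, total)
```

SQLite is used here only because it ships with Python; the same `SELECT ... GROUP BY` pattern applies to other relational databases, with minor dialect differences.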
5. NUMPY
NumPy (Numerical Python) is a Python library used for numerical computing. It provides powerful tools to work with arrays, matrices, and mathematical operations efficiently.
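The following sketch shows a few of these array operations, assuming NumPy is installed. The prices and quantities are made-up numbers for three hypothetical products over two days.

```python
import numpy as np

# Hypothetical unit prices for three products.
prices = np.array([2.5, 4.0, 1.5])

# Hypothetical quantities sold: one row per day, one column per product.
quantities = np.array([
    [10, 5, 20],   # day 1
    [8, 7, 25],    # day 2
])

# Element-wise multiplication with broadcasting: revenue per product per day.
revenue = quantities * prices

# Aggregations along an axis.
daily_revenue = revenue.sum(axis=1)         # total revenue for each day
average_quantity = quantities.mean(axis=0)  # average quantity for each product

# A matrix-vector product gives the same daily totals in one step.
daily_revenue_matmul = quantities @ prices

print(revenue)
print(daily_revenue)
print(average_quantity)
```

Operating on whole arrays this way avoids explicit Python loops, which is the main reason NumPy is efficient for numerical work.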
6. PANDAS
Pandas is a Python library used for data manipulation and data analysis. It is one of the most important tools in Data Science because it helps us work with structured data such as tables (like Excel or CSV files).
4 ~ Domain of Interest (about Data Science)
Data Science is an interdisciplinary field that combines statistics, mathematics, programming, and domain knowledge to extract meaningful insights from data.
It involves collecting, cleaning, analyzing, and interpreting data to support decision-making and solve real-world problems.
In simple terms: Data Science turns raw data into useful information.
My domain of interest in Data Science is Business and Economic Analytics.
I am interested in this field because I previously studied Social Sciences (IPS) in high school, where I learned Economics. Through economics, I understood concepts such as supply and demand, market behavior, inflation, consumer behavior, and financial management.
By combining my economics background with Data Science, I can:
Analyze market trends using real data
Predict sales and business growth
Study consumer purchasing behavior
Evaluate financial performance using data-driven methods
Data Science enhances economic analysis by making it more accurate, measurable, and predictive. With programming and machine learning, economic theories can be tested using real datasets.
Therefore, my background in economics supports my interest in applying Data Science to business and financial decision-making.
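As one possible illustration of testing an economic theory against data, the sketch below estimates the price elasticity of demand with a log-log regression in NumPy. The data is synthetic and generated inside the script (the "true" elasticity of -1.5 is an assumption chosen for the example), so it only demonstrates the method, not a real market.

```python
import numpy as np

# Synthetic data, generated only for illustration: demand follows
# quantity = 1000 * price^(-1.5), with small random noise in log space.
rng = np.random.default_rng(seed=0)
TRUE_ELASTICITY = -1.5
prices = np.linspace(1.0, 10.0, 20)
noise = rng.normal(loc=0.0, scale=0.05, size=prices.size)
quantities = 1000.0 * prices ** TRUE_ELASTICITY * np.exp(noise)

# Log-log regression: log(quantity) = elasticity * log(price) + constant.
# np.polyfit returns coefficients from the highest degree to the lowest.
elasticity, constant = np.polyfit(np.log(prices), np.log(quantities), deg=1)

print(f"estimated price elasticity of demand: {elasticity:.2f}")
# The law of demand predicts a negative value; a positive estimate on real
# data would be evidence against it for that market.
```

With real sales records in place of the synthetic arrays, the same few lines would give a measurable check of the supply-and-demand concepts learned in economics.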