Assignment DS Programming Week 1

Angelique Kiyoshi Lakeisha B.U

NIM: 52250001

Data Science student at Institut Teknologi Sains Bandung

RPubs Statistics Data Science Programming Assignment Week 1 – Mr. Bakti Siregar, M.Sc., CDS

1 Question 1

What Is the Main Purpose of Our Study? (Data Science Programming)

The development of digital technology generates large amounts of data that continue to increase, making manual data processing no longer effective. A systematic and automated approach is needed so that the analysis process can run efficiently, consistently, and with minimal errors.

The main purpose of studying Data Science Programming is to develop computational thinking skills in designing structured and automated data processing systems, so that every analysis process is arranged in a logical, systematic, and reproducible flow.

More specifically, these objectives include:
→ Organizing data analysis steps in a sequential and structured flow, so that each process has a clear order and can be traced back.
→ Automating repetitive data processing tasks, such as data cleaning and transformation, to increase efficiency and minimize manual errors.
→ Understanding data representation in computer systems, including data types and structures, so that data manipulation is carried out according to its characteristics.
→ Serving as a foundation before entering advanced analysis and modeling stages, ensuring that statistical methods or machine learning are built on properly processed data.
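
As a rough illustration of these objectives, the Python sketch below chains small cleaning steps into one ordered, repeatable pipeline. The sample records and the function names (`strip_whitespace`, `drop_incomplete`, `cast_types`, `run_pipeline`) are hypothetical and exist only for this example; a real project would likely use a library such as Pandas instead.

```python
# Hypothetical raw records, as they might arrive from a manual export.
RAW_ROWS = [
    {"item": " Paper ", "cost": "12.50"},
    {"item": "Ink", "cost": ""},        # missing cost value
    {"item": "Toner ", "cost": "80"},
]


def strip_whitespace(rows):
    """Remove leading and trailing spaces from every text value."""
    return [{key: value.strip() for key, value in row.items()} for row in rows]


def drop_incomplete(rows):
    """Keep only the rows in which no field is empty."""
    return [row for row in rows if all(value != "" for value in row.values())]


def cast_types(rows):
    """Convert the cost from text to a number so it can be used in arithmetic."""
    return [{"item": row["item"], "cost": float(row["cost"])} for row in rows]


def run_pipeline(rows, steps):
    """Apply each step in order and log the row count so the flow is traceable."""
    for step in steps:
        rows = step(rows)
        print(f"{step.__name__}: {len(rows)} rows remaining")
    return rows


CLEAN_ROWS = run_pipeline(RAW_ROWS, [strip_whitespace, drop_incomplete, cast_types])
```

Because the steps are listed explicitly and run in a fixed order, the same input always produces the same output, which is what makes the process reproducible.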

2 Question 2

Why Do We Learn About It?

Learning Data Science Programming has become important because the ability to design structured and automated data processing systems is the foundation of modern analysis. Without programming skills, analytical concepts remain theoretical and are difficult to apply consistently in practice.

In reality, the data analysis process involves large-scale processing, repetitive steps, and the need for reproducible results. Programming enables this process to run efficiently, with minimal manual errors and with clear documentation. Thus, this skill bridges the gap between understanding analytical concepts and implementing them in practice.

In addition, mastery of programming helps produce information that is not only technically accurate but can also be presented clearly to support decision-making, including for non-technical parties. Therefore, studying Data Science Programming means building relevant and applicable competencies in an increasingly data-driven environment.

3 Question 3

What Tools Do We Need to Become Experts in It?

To become an expert in the field of Data Science, a combination of programming languages, development tools, analytical libraries, and integrated data management systems is required.

Programming Languages

R

Used for statistical analysis and quantitative data exploration.

Python

Used for data manipulation, statistical analysis, visualization, and machine learning model development.

Programming Tools

Anaconda

Used to manage environments and packages, ensuring that analytical projects remain structured and stable.

VS Code

Used to write and manage analytical scripts efficiently.

Jupyter Notebook

Used for interactive data exploration and documentation of analysis processes.

Analytical Libraries

Pandas

Used for manipulating and processing structured data in table form (DataFrame).

Matplotlib

Used to create data visualizations such as charts and plots for analytical insights.

NumPy

Used for numerical computation and efficient large-scale array operations.

Scikit-Learn

Used to build, train, and evaluate machine learning models.

Database Systems

SQL

A query language used to retrieve, filter, and manipulate data in relational databases.

MySQL

A database management system used to store and manage structured data using SQL.
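
To show what retrieving, filtering, and manipulating data with SQL can look like, here is a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in, because MySQL needs a running server; the SQL shown is basic enough that it should behave the same in MySQL, though dialects differ in other areas. The `expenses` table and its values are made up for illustration.

```python
import sqlite3

# In-memory database, used here only as a lightweight stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE expenses (department TEXT, month TEXT, amount REAL)")

# Hypothetical sample data.
conn.executemany(
    "INSERT INTO expenses VALUES (?, ?, ?)",
    [
        ("Logistics", "2024-01", 1200.0),
        ("Logistics", "2024-02", 1350.0),
        ("Marketing", "2024-01", 800.0),
        ("Marketing", "2024-02", 400.0),
    ],
)

# Retrieve rows, filter out small expenses, and aggregate per department.
totals = conn.execute(
    """
    SELECT department, SUM(amount) AS total
    FROM expenses
    WHERE amount >= ?
    GROUP BY department
    ORDER BY total DESC
    """,
    (500,),
).fetchall()
conn.close()

for department, total in totals:
    print(department, total)
```

The `?` placeholders pass values separately from the query text, which is the usual way to avoid building SQL strings by hand.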

4 Question 4

Describe Your Domain of Interest (Data Science)

I am interested in Operations Analytics, with a focus on Cost Forecasting and Decision Support Systems. This field is relevant because modern organizations are required to manage resources efficiently and control costs in a measurable way. Inaccuracies in cost planning can directly affect operational performance and financial stability, so cost planning requires a data-driven approach.

In the context of Data Science, Operations Analytics utilizes historical data and predictive models to estimate future costs and support more systematic decision-making. Cost forecasting helps to identify spending patterns and potential risks, while decision support systems transform analytical results into recommendations that can be used in strategic planning.

The connection of this interest to programming studies lies in the need for structured and automated data processing. The use of SQL allows for managing operational data on a large scale, while Python or R is used for analysis, modeling, and visualization. With a strong programming foundation, forecasting and decision recommendation processes can be carried out efficiently and consistently.
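
As a simple illustration of cost forecasting, the sketch below fits a straight-line trend to past monthly costs with ordinary least squares and extends it one month ahead. This is only one basic technique among many, and the cost figures and the function names `fit_linear_trend` and `forecast_next` are invented for this example.

```python
def fit_linear_trend(values):
    """Fit cost = intercept + slope * period by least squares (periods 1..n)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are needed to fit a trend")
    periods = range(1, n + 1)
    x_mean = sum(periods) / n
    y_mean = sum(values) / n
    sxx = sum((x - x_mean) ** 2 for x in periods)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(periods, values))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def forecast_next(values):
    """Extrapolate the fitted trend to the period right after the last one."""
    slope, intercept = fit_linear_trend(values)
    return intercept + slope * (len(values) + 1)


# Made-up monthly operating costs (for illustration only).
monthly_costs = [100.0, 104.0, 110.0, 113.0, 119.0, 124.0]

slope, intercept = fit_linear_trend(monthly_costs)
next_month = forecast_next(monthly_costs)
print(f"Estimated monthly increase: {slope:.2f}")
print(f"Forecast for month {len(monthly_costs) + 1}: {next_month:.2f}")
```

A decision support system could compare such a forecast against a budget limit and flag the months in which spending is expected to exceed it.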