By Morris Alexander Pangaribuan, student at Institut Teknologi Sains Bandung
1. What is the main purpose of our study?
The main focus is on applying programming logic to information extraction, predictive modeling, and machine learning in support of data-driven decision making.
2. Why do we learn about it?
Programming is crucial in data science for efficiently accessing, cleaning, analyzing, and visualizing large volumes of data. Languages like Python and R are used to manipulate data, build predictive models (machine learning), and automate repetitive tasks. Coding also trains the logical thinking and problem-solving skills needed to solve complex data problems.
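As a small illustration of the cleaning and analysis steps described above, here is a minimal sketch that uses only the Python standard library. The data, the column names (`city`, `temperature_c`), and the helper functions `load_clean_rows` and `mean_by_city` are all hypothetical, made up for this example. In practice a library such as Pandas would usually handle this kind of work.

```python
import csv
import io
import statistics

# Hypothetical raw data: inconsistent whitespace and one missing value.
RAW_CSV = """city,temperature_c
Bandung, 23.5
Jakarta,31.0
Surabaya,
Bandung,24.1
"""

def load_clean_rows(text):
    """Parse CSV text, strip whitespace, and drop rows with a missing temperature."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        value = (record["temperature_c"] or "").strip()
        if not value:
            continue  # skip rows where the measurement is missing
        rows.append({"city": record["city"].strip(), "temperature_c": float(value)})
    return rows

def mean_by_city(rows):
    """Group temperatures by city and average each group."""
    groups = {}
    for row in rows:
        groups.setdefault(row["city"], []).append(row["temperature_c"])
    return {city: statistics.mean(values) for city, values in groups.items()}

rows = load_clean_rows(RAW_CSV)
averages = mean_by_city(rows)
print(averages)
```

The row with the missing measurement is dropped before averaging, which is one possible cleaning choice among several (another would be to fill in an estimated value).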
3. What tools do we need to become an expert in it?
Here are the main data science tools, grouped by function:
- Programming Languages & Working Environments:
Python: A very widely used language with comprehensive libraries (Pandas for data manipulation, NumPy for numerical computation, Scikit-learn for machine learning).
R: Powerful for statistical analysis and graphing (ggplot2, dplyr).
Jupyter Notebook: An interactive environment for writing and running code and viewing visualizations in the same document.
SQL: The standard language for retrieving and managing data from relational databases.
- Data Visualization & BI (Business Intelligence):
Tableau: A widely used platform for interactive visualization and dashboards.
Power BI: Microsoft’s business intelligence and visualization platform.
Google Data Studio (now called Looker Studio): Web-based data visualization and reporting.
- Big Data & Distributed Processing:
Apache Hadoop: A framework for distributed storage and processing of large data sets across clusters of machines.
Apache Spark: An engine for fast, in-memory distributed computing on big data.
- Collaboration & More:
GitHub: A hosting platform for Git repositories, used for version control and code collaboration.
SAS: Commercial analytics software for large enterprises.
Excel: A spreadsheet tool for simple analysis.
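To make the SQL entry in the list above more concrete, the following sketch runs a small aggregation query from Python using the standard-library `sqlite3` module. The `sales` table, its columns, and the values are hypothetical and exist only for this example. SQLite stands in here for whatever relational database is actually used.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("West", 120.0), ("East", 80.0), ("West", 60.0), ("East", 40.0), ("North", 50.0)],
)

# Aggregate in SQL: total sales per region, largest first.
query = """
    SELECT region, SUM(amount) AS total
    FROM sales
    GROUP BY region
    ORDER BY total DESC
"""
totals = conn.execute(query).fetchall()
conn.close()

for region, total in totals:
    print(f"{region}: {total:.1f}")
```

The grouping and sorting happen inside the database, so Python only receives the summarized rows. This is the usual reason to push aggregation into SQL when the underlying table is large.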
4. What are your areas of interest and domain knowledge in data science?
- Primary Interest in Data Science:
Natural Language Processing (NLP): I am particularly interested in the ability to understand, analyze, and generate human language contextually.
Predictive Analytics & Machine Learning (ML): Using historical data to predict future trends and automate cognitive decisions.
Data Visualization & Communication: Transforming complex raw data into understandable, engaging, and actionable insights.
Data Automation (Data Pipeline): Speeding up data workflows, from data cleaning to modeling.
- Knowledge and Skills in Data Science:
Language Modeling & Deep Learning: I work with deep learning techniques and large language models to understand complex language contexts and patterns.
Data Preprocessing: Cleaning, restructuring, and transforming raw data.
Tools: SQL, Python libraries (Pandas, Scikit-learn), and integration with Google Cloud (BigQuery).
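The idea of predictive analytics mentioned above, using historical data to predict a future value, can be sketched with a simple least-squares line fit. This is only an illustration under stated assumptions: the monthly figures are invented, and `fit_line` and `predict` are hypothetical helper names. A real project would more likely use a library model such as Scikit-learn's `LinearRegression`.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Evaluate the fitted line at a new x value."""
    slope, intercept = model
    return slope * x + intercept

# Hypothetical history: month index and units sold.
months = [1, 2, 3, 4, 5]
units = [10.0, 12.0, 14.0, 16.0, 18.0]

model = fit_line(months, units)
forecast = predict(model, 6)  # extrapolate one month ahead
print(forecast)
```

A straight line is the simplest possible model and only makes sense when the trend is roughly linear. More complex patterns call for the machine learning methods listed earlier.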