DATA ANALYSIS

First RMarkdown Assignment by: Mandal, Mantilla, Masa, Rafael, Regoso


Why do we need to study and analyze data?

Data analysis is the process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making. It organizes, interprets, structures, and presents data as useful information that provides context. Analyzing past data in context helps a business, corporation, organization, or individual assess its prospects and plan for the future.

Data analysis helps one to arrive at some data-driven decisions. It allows one to make decisions scientifically because:

1. Data Analysis Lets You Understand the “Picture”

Data can be thought of as building blocks needed to create coherent models that allow one to imagine what’s going on in various parts of the organization, business, etc.

2. Data Analysis Enables Identification of Problems

Access to useful data, together with the ability to see it in context, makes it possible to identify issues and potential problems that need to be addressed.

3. Data Analysis Shows One’s Strengths

A good data analysis shows not only problems but also strengths, which can then be built into a company’s strategies for higher gains. What works well can also be applied to areas that are not performing as well.

4. Data Analysis Makes Future Approach Strategic

Substantial field data collection and analysis also enables you to identify where your precious resources and time are most needed. Data analysis is essential to understand what area needs to be prioritized to help you evolve and move forward.

In conclusion, the primary goal of data analysis is to identify trends or patterns in the data already available. Through the trends observed, one can understand why certain things are happening and estimate what is likely to happen in the future. Awareness of these possibilities guides our decisions when forming a plan of action to handle an issue or achieve a goal.

Data Analysis in Business

Data analysis is essential for any organization, corporation, or individual that wants to operate more efficiently, and this is especially true in today’s business world.

Data analysis is a crucial element in business because of the following:

1. Product Development

Data analysis offers both estimation and exploration capabilities. It allows one to understand the current state of a market or process and offers a solid base for forecasting future results. It helps companies comprehend the current business situation and see when a new product is needed to meet current market demands.

2. Better Targeting

Using data analysis, you can determine what forms of advertising reach your customers effectively and make an impact that will make them buy your products. Data enables you to understand what methods of advertising your product have the biggest impact on the target audience and at what scale you can adopt such advertising.

3. Knowing your Target Customers

Once you understand what products are suitable for what clients, you can determine the areas that you are going to focus on and for which customers. The trends in the market are also informative on consumer spending and tastes. When you have enough information on these vital things, you can direct your business to produce or distribute certain goods or services to fulfill your potential customer’s desires.

4. Efficiency in Operations

Data analytics helps find more viable ways to streamline operations or increase profit. It helps a business recognize possible issues early and act on them without a waiting period.

5. Cut Costs of Operation

With a good data analysis system, you can determine the sectors of your business that are using unnecessary finances and the areas that need more financing. Through this, you will have a clear idea of where you should cut costs and the technology you are going to use to reduce operational and production costs.

6. Helps Solve Problems

Data analysis helps the organization make informed decisions about running the business and provides information that can help the business avoid losses.

7. Industry Knowledge

Evaluating data also builds industry awareness. It can indicate where your company is headed in the near future and how strong the economy currently is. With that knowledge, you can act, and profit, before anyone else does.

8. Opportunities

The economy keeps changing in response to new developments, and generating profit is what enterprises most commonly pursue. Data analytics provides evaluated data that lets them look ahead of time and find more alternatives.


Policy Making

There has been a growing awareness that data “can reduce uncertainty about the best course of action” in policy design in recent decades. It can inform a better policy-making process and lead to more adequate, more efficient, and more effective public policies.

American cities “have started to use the ever-increasing amounts of data they collect to improve planning, offer better services and engage citizens.”

The city government of San Francisco has employed a data science approach to reduce frequent traffic collisions in the city.

  • Due to frequent traffic accidents, resulting in several deaths every year, and generally low safety in this regard, the government assigned the Department of Public Health and Department of Transportation to develop an adequate policy to address this issue.

  • They chose to develop potential solutions in a data-driven and technologically sophisticated way.

  • The first step was to establish a continuous mapping and visualization of traffic-related incidents across the city through the TransBase online software platform.

  • Second, based on the gathered data, the Vision Zero High Injury Network was developed to identify where the main problems occurred and to provide insights into what kind of policy actions the government should undertake. They found out that “just 12 percent of intersections result in 70% of major injuries.”

  • Finally, insights gained through this process were transformed into policy solutions, introducing ‘protected intersections,’ underway intersections, and protected bike lanes as some of the measures.
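A finding of the form "a small share of intersections accounts for most major injuries" can be reproduced by ranking locations by injury count and accumulating their share of the total. The sketch below is only an illustration: the function name `share_of_locations` and the injury counts are hypothetical and are not taken from TransBase or the Vision Zero High Injury Network.

```python
def share_of_locations(counts, target_share):
    """Smallest fraction of locations that accounts for at least
    `target_share` of all incidents, taking the worst locations first."""
    ordered = sorted(counts, reverse=True)
    total = sum(ordered)
    running = 0
    for rank, count in enumerate(ordered, start=1):
        running += count
        if running >= target_share * total:
            return rank / len(ordered)
    return 1.0


# Hypothetical major-injury counts for ten intersections (made-up numbers).
injuries = [45, 30, 5, 4, 4, 3, 3, 2, 2, 2]
fraction = share_of_locations(injuries, 0.70)
print(f"{fraction:.0%} of intersections account for at least 70% of injuries")
```

With real incident data, the same calculation would show how concentrated the problem is and which locations to prioritize.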

Day-to-day Decisions

With the technology that we have today, many people don’t realize how much data science goes on behind the scenes of the programs we use every day, and how much it helps with our decisions. When it comes to entertainment, data about what you watch, what you listen to, and what you read is saved and then analyzed by the program. This means that the next time you open that application, you see recommendations of shows or songs that fit your taste. Aside from entertainment, knowing whether it is going to rain or shine also affects what we do on a particular day. Many programs already use data science to predict the weather, which greatly helps people plan their activities ahead of time.

Scientific Research

One exemplary application of Data Science in scientific research is in Epidemiology. Researchers can utilize machine learning to detect health threats and improve diagnosis accuracy to impact patient outcomes positively.

Examples Include:

  • Using feature engineering and feature selection to identify biomarkers capable of distinguishing between diseases and to group samples with shared characteristics.

  • Applying regression models to examine the cause-and-effect relationship between disease risk factors.

  • Using random forests to make highly informative predictions for more targeted drug prescriptions.

  • Using convolutional neural networks (CNNs) for image analysis to detect diseases such as malaria.


Propose at least one data science topic that you want to pursue: Give a broad description of the topic, describe the availability of the data, state what kinds of statistical methods you think you will need, and say who would benefit from this study.

Topic: The Impact of the Covid-19 Vaccination Program on the Confirmed Cases of Covid-19 in Pasig City.

Description of the Topic: Pasig City started its Covid-19 Vaccination Program in March 2021, and the program has now been running for four months. Over this time frame, we would like to examine the impact of the Covid-19 vaccine rollout on the confirmed cases of the virus in Pasig. Specifically, we would like to know whether there is a significant difference, such as a significant decrease, in the confirmed cases in the city before and during the vaccination program. We would also like to examine closely the rate of decrease during those four months.

Availability of Data: Reports on the number of confirmed cases of the virus in Pasig and the doses of vaccine administered are made available to the public through the official Facebook page of the Pasig City Public Information Office.

Statistical method: The independent variable would be the number of Pasiguenos who were vaccinated, counting both partially and fully vaccinated residents. The dependent variable would be the number of confirmed cases of Covid-19 among the Pasiguenos. The linear relationship between the two variables would be examined through linear regression analysis; a chi-square test would apply only if both variables were first grouped into categories.

A t-test will also be used to examine whether the confirmed cases before and after the start of the Covid-19 vaccination program differ significantly.
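The two methods named above can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the study's analysis: every number is made up, the helper names `least_squares` and `welch_t` are hypothetical, and the t statistic uses Welch's unequal-variance form, which is one common choice among several.

```python
from math import sqrt
from statistics import mean, variance


def least_squares(x, y):
    """Slope and intercept of the ordinary least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def welch_t(a, b):
    """Welch's t statistic for the difference between two sample means."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Made-up data: cumulative vaccinated residents (thousands) and weekly cases.
vaccinated = [10, 20, 30, 40, 50]
cases = [500, 450, 400, 350, 300]
slope, intercept = least_squares(vaccinated, cases)

# Made-up weekly case counts before and after the program started.
before = [520, 480, 500, 510, 490]
after = [400, 380, 420, 390, 410]
t_statistic = welch_t(before, after)

print(f"cases = {intercept:.1f} + ({slope:.1f}) * vaccinated")
print(f"t = {t_statistic:.2f}")
```

A negative slope would suggest that cases fall as vaccinations rise, and a large t statistic would suggest a real difference between the two periods. Obtaining a p-value requires the t distribution, which a statistics package such as SciPy provides (`scipy.stats.ttest_ind` with `equal_var=False`). Weekly counts are also correlated over time, so these results should be read with caution.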

Significance of The Study: The study will benefit the public by showing the impact of the Covid-19 Vaccination Program on the number of confirmed cases in a particular city. If the Covid-19 vaccine rollout is shown to be effective in this study, this could encourage people who are reluctant to be vaccinated.


Topic: Surge of Cybercrime Amidst the Global Pandemic

Description of the topic: Cybercrime has risen to new levels as more people connect to the internet because of the pandemic. It has been a year and 4 months since the coronavirus was declared a global pandemic. Over this time frame, we would like to examine the impact of the global pandemic on cybercrime activity worldwide. Specifically, we would like to know whether there is a significant difference, such as a significant increase, in cybercrime activity before and during the global pandemic. We would also like to examine closely the rate of increase during that year and 4 months.

Availability of Data: The number of cybercrime cases is being made available to the public through the various articles and posts of the different agencies that tackle these specific problems.

Statistical method: Percentage and frequency distributions will be used to describe the surge of cybercrime amidst the pandemic. A t-test will be used to determine whether cybercrime activity before and during the pandemic differs significantly.
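A frequency and percentage distribution simply counts how often each category occurs and expresses each count as a share of the total. The sketch below assumes the cybercrime reports have already been coded into categories; the category names, the counts, and the function name `frequency_table` are hypothetical.

```python
from collections import Counter


def frequency_table(labels):
    """Map each category to (frequency, percentage of all observations)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: (n, 100 * n / total) for label, n in counts.most_common()}


# Made-up list of reported incident types.
incidents = ["phishing"] * 5 + ["online scam"] * 3 + ["identity theft"] * 2
table = frequency_table(incidents)

for label, (n, pct) in table.items():
    print(f"{label:<15} {n:>3} {pct:>6.1f}%")
```

Building one such table for the period before the pandemic and one for the period during it would make the surge visible before any significance test is run.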

Significance of The Study: The study will benefit the public by showing the impact of the global pandemic on the number of cybercrime activities. If the pandemic is shown to have a significant impact, this could help people be more alert and prepared to deal with such activities.


Topic: Factors Affecting Food Insecurity of Households of Alaminos, Laguna During the Pandemic

Description of the Topic: The COVID-19 pandemic continues to have an enormous impact on the country’s food insecurity status. Food insecurity is a result of the physical unavailability of food, people’s lack of social or economic access to adequate food, and/or inadequate food utilization, according to the Global Forum on Food Security of the United Nations Food and Agriculture Organization (FAO, 2012).

This study aims to measure the extent of food insecurity, and identify key socio-economic factors associated with this condition among households of Alaminos, Laguna during the pandemic. Specifically, the purposes of this study are to:

  • Present the respondents’ demographic profile

  • Determine whether there is a significant difference between the food insecurity status of males and females

  • Determine whether there is a significant relationship between food insecurity status and the socio-economic variables

  • Identify community needs or vulnerabilities based on the profile generated

  • Provide insightful information and data-driven recommendations

Availability of Data: Survey data will be collected from a random sample of households in Alaminos, Laguna. The food insecurity status of respondents will be determined using the Household Food Security Module of the U.S. Department of Agriculture. A context-specific categorization of household food insecurity status will be applied to distinguish the various levels of severity.

Statistical Methods: Percentage and frequency distributions will be used to describe the demographic profile of the respondents. A t-test will be used to test the difference between the food insecurity status of males and females. Correlation will be used to determine whether there is a significant relationship between food insecurity status and monthly income and age. The chi-square test of independence will be used to determine whether there is a significant relationship between food insecurity status and residence type, gender, work status, and educational attainment.
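The chi-square test of independence compares the observed counts in a cross-tabulation with the counts expected if the two variables were unrelated. The sketch below computes the test statistic and degrees of freedom for a hypothetical table of work status against food insecurity status; the counts and the function name `chi_square_independence` are made up for illustration.

```python
def chi_square_independence(observed):
    """Chi-square statistic and degrees of freedom for a contingency table
    given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof


# Made-up counts. Rows: employed, unemployed.
# Columns: food secure, food insecure.
observed = [[30, 10],
            [20, 40]]
chi2, dof = chi_square_independence(observed)
print(f"chi-square = {chi2:.2f}, degrees of freedom = {dof}")
```

With 1 degree of freedom, a statistic above 3.841 is significant at the 0.05 level, so a table like this one would indicate a relationship between work status and food insecurity status.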

Significance of the Study: This study will address the causes of food insecurity in the chosen municipality during the pandemic. Results gathered from this study will be utilized to create initiatives for groups greatly affected by food insecurity. Data-driven recommendations that are based on the community needs or vulnerabilities will be formulated.


Topic: Factors Affecting Covid-19 Vaccine Hesitancy in the Philippines

Description of the topic: Despite the Government’s effort to persuade the public to participate in its vaccination program against COVID-19, vaccine hesitancy remains a big challenge. The Philippine Government must consider social traumas as a factor in vaccine hesitancy.

Fear of adverse events, negative information on vaccines, and differing efficacy rates are the top three factors affecting some Filipinos’ decision on whether to get vaccinated against the coronavirus disease 2019 (Covid-19), according to Health Undersecretary Maria Rosario Vergeire. In an online media forum, she said the Department of Health conducted an online study with more than 43,000 respondents to learn why Filipinos are hesitant about Covid-19 vaccination.

Design: Mixed Method

Site: Social Media Platforms

Participants: 30 randomly selected people on Facebook and other social media apps

Availability of the data: Public surveys from late 2020 to early 2021 have shown that the number of Filipinos willing to get vaccinated has decreased, while the number of people unwilling to get vaccinated or uncertain about it has increased.

According to the Social Weather Stations’ latest survey, conducted from late April to May 2021, fear of side effects and questions about the safety and efficacy of vaccines were the top reasons for refusing to receive a COVID-19 vaccine.

Statistical method:

  • Correlational analysis

  • Descriptive statistics for the demographic profile of the respondents

Significance of the study: This study aims to highlight the factors affecting the decision of each individual to get vaccinated.


Topic: Intensity of Household Air Pollution Worldwide.

Description: Household air pollution is responsible for millions of deaths globally, as it is one of the major environmental contributors to ill health such as lung cancer, pneumonia, and ischaemic heart disease.

This study aims to measure the concentrations of fine particulate matter in both rural and urban areas of a country each year to find out whether air pollution is increasing and, if so, at what rate. This way, health organizations could take action to control air pollution in that country and keep the planet’s already worsening air quality from deteriorating further.

Availability of Data: The World Health Organization (WHO) has made its data on concentrations of fine particulate matter in each country available to the public. The data can be accessed through its official website, where any user can view them. The data set is updated every 2 years and currently covers 4300 human settlements in 108 countries.

Statistical Method: Descriptive statistics will be used to summarize and graph the household air pollution data set across all the countries.
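Descriptive statistics for one group of measurements can be computed with Python's built-in `statistics` module. The sketch below is an illustration only: the concentration values are made up rather than taken from the WHO data set, and the function name `describe` is hypothetical.

```python
from statistics import mean, median, stdev


def describe(values):
    """Basic descriptive statistics for a list of numeric measurements."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


# Made-up annual mean fine particulate concentrations (micrograms per cubic
# meter) for five settlements in one country.
concentrations = [10, 20, 30, 40, 50]
summary = describe(concentrations)

for name, value in summary.items():
    print(f"{name:<6} {value:.2f}")
```

Running the same summary for each year and each country would give the figures needed to chart whether concentrations are rising over time.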

Significance of the Study: This study provides additional helpful information for Health Organizations all around the world that are trying to control the air quality of their own country.


REFERENCES

ActiveWizards. 2019. “Top 8 Data Science Use Cases in Marketing.” https://www.kdnuggets.com/2019/11/top-8-data-science-use-cases-marketing.html
Cambridge-Spark. n.d. “The Role of Data Science in Research.” https://www.cambridgespark.com/case-studies/the-role-of-data-science-in-research.
Francis, Mercy. 2021. “The Importance of Data Analysis: Resagratia: Data Analytics.” https://resagratia.com/2020/06/the-importance-of-data-analysis/.
GetSmarter-Blog. 2020. “Why Is Data Analysis Important in Business? [VIDEO].” https://www.getsmarter.com/blog/career-advice/data-analysis-important-business/.
Google. n.d. “What Is Predictive Analytics?” https://cloud.google.com/learn/what-is-predictive-analytics.
Hinshelwood, Sandra. 2020. “5 Reasons Why Data Analysis Is Important for Every Business.” https://businesspartnermagazine.com/5-reasons-why-data-analysis-is-important-for-every-business/.
Jigsaw-Academy. 2020. “Importance of Data Analytics in 2021.” https://www.jigsawacademy.com/blogs/business-analytics/importance-of-data-analytics/.
Magnimind-Academy. 2019. “What Is the Role of Data Science in Everyday Life and Every Situation?” https://magnimindacademy.com/blog/what-is-the-role-of-data-science-in-everyday-life-and-every-situation.
Michigan-Tech. 2020. “Eight Ways Big Data Affects Your Personal Life.” https://onlinedegrees.mtu.edu/news/ways-big-data-affects-your-personal-life.
Nitika. 2020. “10 Reasons Why Data Analysis Is the Key Business Growth.” https://www.loginworks.com/blogs/10-reasons-why-data-analysis-is-important-for-every-business/.
Numanović, Amar. 2017. “Data Science: The Next Frontier for Data-Driven Policy Making?” https://medium.com/@numanovicamar/https-medium-com-numanovicamar-data-science-the-next-frontier-for-data-driven-policy-making-8abe98159748.
Rappler. 2021. “EXPLAINER: The Philippines’ Fight Vs Vaccine Hesitancy.” https://www.rappler.com/newsbreak/podcasts-videos/explainer-philippines-fight-vs-vaccine-hesitancy.