WHY SHOULD WE ANALYZE DATA?

Image from WT Grant Foundation

Tina

Vast amounts of data are generated and collected every day. Grus (2019) notes that data is collected by websites that track every click a user makes, pedometers that record heart rate and movement, smart gadgets that collect information about human behavior, and many more. Existing data may not look useful at first glance, but studying, analyzing, and drawing insights from it helps humans make sense of our changing lifestyle patterns; data analysis, and the human actions that follow, help us adapt to our ever-changing world. Analyzing data also allows humans to improve and optimize existing systems to respond to these modern changes and find effective solutions to contemporary issues. Ultimately, analyzing data leads our world to greater progress.

Aze

Poorly handled data costs the United States around USD 3.1 trillion each year (Cooper, 2019). Although the emergence of digital technology has saturated the world with data, most data remains unhelpful unless it is processed and analyzed. The wide-ranging utility of analyzed data has prompted businesses and organizations to adopt data science. One specific instance is the ability of data science to identify opportunities. Cooper (2019) elaborates that the exploratory nature of data scientists compels them to question existing assumptions and processes in order to come up with additional methods and analytical algorithms as they continually interact with the business’s analytics system.

Manu

The analysis of data is important because it allows us to make sense of events in our society. In learning data science, analysts are taught fundamental principles for gathering useful information from datasets (Provost & Fawcett, 2013). Data analysis breaks down large sets of numbers and information into manageable and comprehensible pieces in which trends and patterns can be observed. People can then use this knowledge to make informed, data-driven decisions from a different perspective. Because of this power, data analysis is prominent in many different fields encountered in daily life.

Justin

The world today boasts an abundance of information that not only grows continuously but is also easy to access. The challenge then lies in filtering and processing all this information into relevant and valuable statistics from which various inferences can be made. This process of handling data is invaluable in interpreting the information that is already available.

Ja

In research, data analysis gives professionals the ability to turn raw observations and information into conclusions. Historically, the study and analysis of data have provided the academe with evidence-based breakthroughs across all fields of study. Research and development continue to revolutionize the world we live in today, using data to strengthen their conceptual basis and translating findings into innovations that can further the efficiency and quality of human life.

Jaime

Studying and analyzing data, regardless of the field in which it is applied, helps predict trends in that field and streamlines the flow of information. As advancements continue to pile up, the information and knowledge needed to support them will be collected and analyzed much more efficiently, contributing to a much faster cycle of research and development.


DATA SCIENCE APPLICATIONS

CONTACT TRACING

Image from Geospatial World

In response to the global COVID-19 pandemic, contact tracing has proven to be an effective strategy for counteracting the quick spread of the virus. Thus, there has been an evident increase in the use of contact tracing apps in numerous establishments. These apps often rely on QR code scanning to automate manual contact tracing; Ateneo’s Blue Pass system is one example. Levy and Stewart (2020) note that digital contact tracing draws upon several fields, including data science. Indeed, one can observe data science being applied in digital contact tracing through the collection and storage of personal user data and health information. This collection of data, when analyzed, becomes valuable to national leaders and policy-makers who respond to the COVID-19 situation in the country.

SEARCH ENGINES

Image from Small Business Mentor

The utility of data science also extends to the mundane. Even before users finish typing a query in the search bar, Google predicts what they want to know and displays a dropdown of possible search queries based on data from users’ search histories and general search trends (Beam & Kohane, 2018). This feature eases the user experience because reading suggested queries is generally faster than typing, which streamlines internet browsing, now a day-to-day task (Brysbaert, 2019; Majaranta, MacKenzie, Aula & Räihä, 2006). The availability of suggestions also provides opportunities to browse what users might need but did not initially think of, ultimately expanding their choices.
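To make the idea of query suggestions concrete, the following is a minimal sketch in Python, not a description of how Google actually works. It assumes that suggestions are simply the most frequent past queries that begin with what the user has typed so far. The function name suggest and the small query log are hypothetical and invented for illustration only.

```python
from collections import Counter

def suggest(prefix, past_queries, limit=3):
    """Return the most frequent past queries that start with the typed prefix."""
    # Count how often each (normalized) query appears in the log
    counts = Counter(q.strip().lower() for q in past_queries)
    prefix = prefix.strip().lower()
    matches = [(q, n) for q, n in counts.items() if q.startswith(prefix)]
    # Sort by descending frequency, then alphabetically so ties are stable
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [q for q, _ in matches[:limit]]

# Hypothetical query log, invented for illustration only
query_log = [
    "weather today", "weather tomorrow", "weather today",
    "web design", "weather today", "weather tomorrow",
]

print(suggest("wea", query_log))  # ['weather today', 'weather tomorrow']
```

A real search engine would combine many more signals, such as personal history, location, and trending topics, but the basic step of ranking past queries against a typed prefix is the same.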

TARGETED ADVERTISING

Image from Global Reach

Advertisements are extremely prevalent in today’s digital age, with product placement showing up across many different media, most notably on sites like YouTube and Facebook. A lot of work goes on behind the scenes in the creation and distribution of these materials. Data science is heavily used in today’s advertising landscape, particularly in targeted advertising. Product companies are noted to invest a significant amount of resources in paying market research firms to collect information about their potential consumers (Iyer et al., 2004). With this knowledge, companies are able to make informed decisions on what types of advertisements would work best in promoting their products, as well as which groups of people would be most receptive to their promotional materials.

MEDICAL IMAGE ANALYSIS AND COMPUTER VISION

Image from Roboticsbiz

Another application of data science is medical image analysis and computer vision. Through machine learning and various other methods, inferences can be made from the information gathered from an image, which is beneficial in identifying diverse medical conditions such as the presence of tumors and artery stenosis.

ELECTIONS

Image from Florida Politics

In the modern world, data science has become a key player in the election process. Aside from the simulation polls conducted, demographic data has revolutionized the way candidates tailor their campaigns to populations according to their beliefs and principles. Factors such as social class, religion, and age can influence the type of politician that societies aspire to put in power.

WEATHER FORECASTING

Image from Analytics Steps

Data science is used in weather forecasting. Because the weather affects almost everyone, systems are built to predict it. Forecasts help people, businesses, and other entities prepare for the incoming weather. For example, weather predictions help farmers plan their harvest seasons.


DATA SCIENCE TOPICS

NATURAL LANGUAGE PROCESSING BASED QUICK ACCESS CHATBOT FOR UNIVERSITY CONCERNS

Image from FreshDesk

Description: “I ate dinner with a fork. I ate dinner with a friend.”

Chowdhary (2020) illustrates how the two preceding sentences share almost the same structure, differing in only one word. Nevertheless, most readers can tell that the difference in meaning goes beyond a one-word change. The first sentence describes how the speaker used a fork to facilitate the eating process, but the same cannot be said for the latter: the speaker meant that a friend was present during dinner, not that a friend was used as a tool for eating.

In a contemporary world that prizes timeliness and efficiency, relying on humans to analyze the abundant volume of natural language data becomes increasingly difficult, especially within given time limits (Manning, 1999). Humans readily sense nuances in language that machines cannot comprehend with basic programming.

Addressing the complexities of human language requires a specific application of data science called Natural Language Processing (NLP). NLP covers wide-ranging applications, but one that demonstrates feasibility and utility is a Messenger chatbot that addresses the university-related concerns and student life queries of Ateneo students. Because Messenger can function on Free Data, students can reach the chatbot without the need for mobile data or an internet connection. The application of NLP allows the chatbot to understand the nuances of students’ queries to an extent and generate an appropriate response.

Availability of the Data: Information about Ateneo de Manila University lies across its various websites, such as Ateneo’s official website and LS One. The NLP-based chatbot can also be supplied with relevant student life data from the ADMU Freedom Wall on Facebook and the ADMU subreddit.

Statistical Method/s Needed: Manning (2019) points out that NLP combines the swift computational power of machines with repositories of language data and probability theory to determine the common patterns in language use and decide on an appropriate response. The NLP-based chatbot can also fine-tune its performance, vocabulary, and responses by employing statistical models for concept and structure prediction based on neural network architectures (Cahn, 2017).

Beneficiaries: The NLP-based chatbot streamlines the information-seeking process for Ateneo students by uniting various datasets about the university and its student life into one entity that can be accessed with a quick Messenger chat and question. The chatbot also enables students, especially those in the online setting, who are experiencing either momentary or constant internet issues to get their basic concerns addressed.
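As a rough illustration of how a chatbot might match a student’s question to a stored answer, the sketch below compares word counts using cosine similarity. This is a deliberately simple stand-in and not the neural network approach described by Cahn (2017); it would not, for example, capture the fork and friend distinction discussed earlier. The function names, the similarity threshold, and the placeholder answers are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def answer(query, faq, threshold=0.2):
    """Return the answer whose stored question best matches the query."""
    query_vec = bag_of_words(query)
    best = max(faq, key=lambda q: cosine_similarity(query_vec, bag_of_words(q)))
    # Fall back to a default reply when nothing is similar enough
    if cosine_similarity(query_vec, bag_of_words(best)) < threshold:
        return "Sorry, I do not have an answer for that yet."
    return faq[best]

# Placeholder entries (hypothetical); a real bot would draw on the
# university sources named under Availability of the Data
faq = {
    "when does enrollment start": "ENROLLMENT_INFO",
    "where is the library located": "LIBRARY_INFO",
}

print(answer("When will enrollment start?", faq))  # ENROLLMENT_INFO
```

A production chatbot would replace the word-count vectors with learned language representations, but the overall flow of turning a query into numbers, scoring it against known questions, and returning the best match stays similar.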

COMPUTER VISION

Image from Ilija Mihajlovic

Description: According to Western Governors University (2021), computer vision is a method of enabling a machine to extract visual information from materials such as pictures and videos and make inferences and decisions based on this information with the help of machine learning.

Availability of the Data: The data used in computer vision consists mostly of pictures and videos, whose availability varies depending on the field and industry.

Statistical Method/s Needed: Various methods are used in the field of computer vision, mainly involving the use of machine learning as well as image analysis.

Beneficiaries: As computer vision encompasses a broad scope, a number of different industries would benefit from it. The medical industry would benefit from a more advanced method for gathering and analyzing information from real-world sources such as photos, which can be taken in real time. Computer vision would also be beneficial in security, with cameras able to detect potential threats more efficiently.

FAKE NEWS DETECTION

Image from BBC

Description: As much as technology can be maneuvered to spread and enhance the accessibility of information, the same powerful platforms can be weaponized to foster misinformation and disinformation. As a countermeasure, data science has been utilized to detect patterns of fraudulent behavior through linguistic cues such as “word patterns, syntax construction, readability features” (Kulkarni, 2021).

Availability of the Data: Relevant data for this study can be easily acquired through online repositories such as Nishit Patel’s Fake News Detection repository, accessible on GitHub.

Statistical Method/s Needed: The following methods can be used to study fake news detection: machine learning algorithms, deep learning, natural language processing techniques, and blockchain (Singh et al., 2020). Although there is a vast range of procedural resources to consult, Singh et al. (2020) narrowed down the best technologies for fake news detection: blockchain for ‘critical fake news detection’, and machine learning and natural language processing at the commercial level. Additionally, Wang et al. (2020) mentioned that the accuracy of these methods has improved through reinforcement learning techniques that filter both low- and high-quality samples, specifically citing deep learning as the focus of their study.

Beneficiaries: Media sites and the general public. Although this study does not have a direct beneficiary, journalists and their respective media sites can indirectly benefit from the detection of fake news, as it distinguishes the truth from rumors and false information. In the long run, battling misinformation wards off attacks against the credibility of journalists and correspondents, safeguarding the integrity of honorable news sites.
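The linguistic cues mentioned in the description, such as word patterns and readability features, can be turned into numbers before any machine learning model is applied. The sketch below shows one possible way to do this in Python. The specific features chosen, the function name linguistic_cues, and the sample sentence are assumptions made for illustration and are not taken from Kulkarni (2021) or Singh et al. (2020).

```python
import re

def linguistic_cues(text):
    """Compute a few simple surface features of a piece of text."""
    # Split on sentence-ending punctuation and drop empty fragments
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(len(words), 1)  # avoid division by zero on empty input
    return {
        "avg_sentence_length": len(words) / max(len(sentences), 1),
        "avg_word_length": sum(len(w) for w in words) / n_words,
        "exclamation_marks": text.count("!"),
        # Share of words written entirely in capital letters
        "all_caps_ratio": sum(1 for w in words if len(w) > 1 and w.isupper()) / n_words,
    }

# Invented sample headline, for illustration only
sample = "SHOCKING news! You will not believe this!"
print(linguistic_cues(sample))
```

On their own, these features do not decide whether an article is fake. In practice they would be computed for many labeled articles and passed to a classifier, such as the machine learning and deep learning methods cited above.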

DATA DRIVEN ASTRONOMY

Image from TMT International Observatory

Description: According to the Space Telescope Science Institute (2018), data-driven astronomy (DDA) is the production of astronomical knowledge built on pre-existing databases. It is somewhat similar to industrial data science in that the data sets are a byproduct of other investigations rather than collected with the experiment in mind.

Availability of the Data: The data come in astronomically (pun intended) large sets and are available from astronomy research-oriented databases such as the Sloan Digital Sky Survey and the Hubble Legacy Archive.

Statistical Method/s Needed: This field relies heavily on machine learning and algorithms, given the large data sets that need to be processed. Research and development often involve astronomical data awareness, a deep understanding of selection biases, and sophisticated multipoint statistics (STScI, 2018).

Beneficiaries: Astrophysicists and researchers in the field would primarily benefit from these studies. The accumulated data can help scientists better understand otherworldly phenomena and build on our current knowledge of physics and other sciences.


REFERENCES

Data Flair. (n.d.). Data Science for Weather Prediction - The Prerequisite to all Natural Disasters. Data Flair. Retrieved June 23, 2022, from https://data-flair.training/blogs/data-science-for-weather-prediction/


RoboticsBiz. (2021, February 4). Deep Learning In Medical Image Processing - Scope and Challenges. Retrieved June 24, 2022, from https://roboticsbiz.com/deep-learning-in-medical-image-processing-scope-and-challenges/


Estrella, J.A., Ong, E., Gelera, C.P., Villaruel, J.S., Quinzon, C.S., & Sánchez, M.C. (2019). Automated Text Summarization of Research Papers Regarding the Effectiveness of Various Treatment Plans for Leukemia.


Geospatial World. (n.d.). [Illustration of a contact tracing application]. Retrieved June 24, 2022, from https://www.geospatialworld.net/blogs/innovative-apps-support-philippines-fight-against-covid-19/


Global Reach. (2021, August 11). Targeted Advertising 101. Global Reach. Retrieved June 24, 2022, from https://www.globalreach.com/global-reach-media/blog/2021/08/11/targeted-advertising-101


Grus, J. (2019). Data Science from Scratch: First Principles with Python (2nd ed.). O’Reilly Media, Inc.


Hernandez, B. (2020, January 31). Tips on Writing a Music Analysis Essay. Making Music Magazine. Retrieved June 24, 2022, from https://makingmusicmag.com/music-analysis-essay/


Iyer, G., Soberman, D., & Villas-Boas, M. (2004, February). The Targeting of Advertising. Faculty & Research - INSEAD. Retrieved June 24, 2022, from https://flora.insead.edu/fichiersti_wp/inseadwp2004/2004-20.pdf


Levy, B., & Stewart, M. (2020). A Systematic Review of the Ethics and Efficacy of Digital Contact Tracing Applications. Harvard Data Science Review, p. 4. DOI: 10.13140/RG.2.2.23432.85766


Mihajlovic, I. (2019, April 26). Everything You Ever Wanted To Know About Computer Vision. Towards Data Science. Retrieved June 24, 2022, from https://towardsdatascience.com/everything-you-ever-wanted-to-know-about-computer-vision-heres-a-look-why-it-s-so-awesome-e8a58dfb641


Ogles, J. (2021). Survey says majority of Florida voters opposed to election law changes [Image]. Florida Politics. Retrieved June 23, 2022, from https://floridapolitics.com/archives/414872-survey-says-majority-of-florida-voters-opposed-to-election-law-changes/


Osborn, B. (2013). Subverting the Verse–Chorus Paradigm. Music Theory Spectrum, 35(1), 23–47. https://doi.org/10.1525/mts.2013.35.1.23


Patel, N. (2022). Fake News Detection [electronic resource: python source code and data sets]. Retrieved from https://github.com/nishitpatel01/Fake_News_Detection


Peek, J. (2018). Data Science in Astronomy, Three Ways. Space Telescope Science Institute. Retrieved June 23, 2022, from https://www.stsci.edu/contents/newsletters/2018-volume-35-issue-01/data-science-in-astronomy-three-ways#:~:text=Data%2DDriven%20Astronomy%20(DDA),of%20other%20processes%20or%20investigations.


Provost, F., & Fawcett, T. (2013). Data Science and its relationship to big data and data-driven decision making. Big Data, 1(1), 51–59. https://doi.org/10.1089/big.2013.1508


Dartmouth Department of Mathematics. (n.d.). Statistical Image Analysis. Retrieved June 23, 2022, from https://math.dartmouth.edu/~m70s20/ImageAnalysis.pdf


TMT International Observatory. (2022). Retrieved June 23, 2022, from https://www.tmt.org/page/observatory


Tyagi, N. (2019, October 17). Weather Forecasting: How Does Big Data Analytics Magnify it? Analytics Steps. https://www.analyticssteps.com/blogs/weather-forecasting-how-do-big-data-analytics-magnify-it


Upasana. (2022, April 5). Top 10 Data Science Applications. Edureka. Retrieved June 23, 2022, from https://www.edureka.co/blog/data-science-applications/


Weihs, C., Ligges, U., Mörchen, F., & Müllensiefen, D. (2007). Classification in Music Research. Advances in Data Analysis and Classification, 1(3), 255–291. https://doi.org/10.1007/s11634-007-0016-x