library(httr)
library(jsonlite)
library(tidyverse)
library(dplyr)
library(rmarkdown)
Today, we will look at APIs and how data travels around the web. An API (application programming interface) is an interface through which programs exchange data. Users see the information and interact with it on a user-friendly display, while the underlying processes stay hidden in the back end.
This exchange takes place between two computers: the client and the server. The client is the user’s computer, where the data is sent and arranged in a user-friendly interface. The server holds the information and the instructions for how the content is supposed to appear.
I chose the New York Times Most Popular articles API. First, let’s connect to the API and check whether the connection is active.
r<-GET("https://api.nytimes.com/svc/mostpopular/v2/shared/1/facebook.json?api-key=CMF7EZhpo021ld9wxFD6XMWP52HzeCxQ")
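One way to confirm that the connection is active is to inspect the status code of the response. The sketch below is only an illustration: `check_response()` is a hypothetical helper built on httr's `status_code()` and `http_error()`, and `NYT_API_KEY` is an assumed environment variable name for keeping the key out of the script.

```r
library(httr)

# Hypothetical helper: stop on a 4xx/5xx status, otherwise return the code.
check_response <- function(resp) {
  if (http_error(resp)) {
    stop("Request failed with status ", status_code(resp))
  }
  status_code(resp)
}

# Build the same Most Popular URL without hard-coding the key.
# NYT_API_KEY is an assumed environment variable name.
nyt_url <- modify_url(
  "https://api.nytimes.com/svc/mostpopular/v2/shared/1/facebook.json",
  query = list(`api-key` = Sys.getenv("NYT_API_KEY"))
)

# Usage (needs network access and a valid key):
# r <- GET(nyt_url)
# check_response(r)
```

A status in the 200 range means the server answered the request; anything from 400 upward makes the helper stop with an error instead of passing a failed response on to the parsing step.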
We successfully received a response code from the API. The content returned by the API is stored in the body of the response. To retrieve it, I converted the raw bytes in the content field to a character string and parsed the JSON into R objects.
r<-fromJSON(rawToChar(r$content))
glimpse(as_tibble(r[]))
## Rows: 20
## Columns: 4
## $ status <chr> "OK", "OK", "OK", "OK", "OK", "OK", "OK", "OK", "OK", "OK"~
## $ copyright <chr> "Copyright (c) 2022 The New York Times Company. All Right~
## $ num_results <int> 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20~
## $ results <df[,22]> <data.frame[20 x 22]>
The parsed body holds two kinds of items. The first is metadata supplied by the site owners: the response status, a copyright notice, and the number of results. The second is a nested data frame with the details of each article on the most popular list. I extracted the data frame and named it df.
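To see how a JSON body turns into this mix of scalars and a nested data frame, here is a small offline sketch. The JSON string is a made-up stand-in with the same overall shape as the API body, not real NYT data, and the names `body_raw`, `parsed`, and `toy_df` are illustrative.

```r
library(jsonlite)

# A tiny stand-in for the API body (made-up values, same overall shape).
body_raw <- charToRaw('{
  "status": "OK",
  "num_results": 2,
  "results": [
    {"title": "First article",  "byline": "By A. Writer"},
    {"title": "Second article", "byline": "By B. Writer"}
  ]
}')

# httr stores the body as raw bytes, so convert to text before parsing.
parsed <- fromJSON(rawToChar(body_raw))

# Scalar metadata stays as length-one vectors.
str(parsed$status)

# A JSON array of objects is simplified into a data frame.
toy_df <- parsed[["results"]]
class(toy_df)
```

Because the metadata items have length one and the results have one row per article, converting the whole list to a tibble recycles the metadata across every row, which is likely why status and num_results repeat in the glimpse output.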
df<-r[["results"]]
paged_table(df)
The contents of the GET response are now in a data table! There is a lot of content in this table, so let’s create a smaller one. I only need a few items: title, author, abstract, section, published date, and keywords. It will be called nyt_pop.
nyt_pop<-df%>%select(title,byline,abstract,nytdsection,published_date,adx_keywords)
paged_table(nyt_pop)
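The same column selection can be tried on a toy table without calling the API. This is a sketch under stated assumptions: `toy_df` and `toy_pop` are illustrative names with made-up rows, `any_of()` is used so that a column missing from the response does not cause an error, and the date conversion assumes dates arrive as "YYYY-MM-DD" strings.

```r
library(dplyr)

# Toy stand-in for df with a few of the columns used above (made-up rows).
toy_df <- data.frame(
  title = c("First article", "Second article"),
  byline = c("By A. Writer", "By B. Writer"),
  published_date = c("2022-03-01", "2022-03-02"),
  source = c("New York Times", "New York Times")
)

toy_pop <- toy_df %>%
  # any_of() keeps the listed columns that exist and skips the rest.
  select(any_of(c("title", "byline", "abstract", "published_date"))) %>%
  # Convert the date strings to Date so they sort and filter correctly.
  mutate(published_date = as.Date(published_date))

names(toy_pop)
```

Here "abstract" is absent from the toy table, so the result keeps only title, byline, and published_date.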
An API gives the client’s computer a way to retrieve information from the server and display it in its own visually pleasing way. The server is in contact with a large range of clients, so a GET request is a simple way to transmit only the requested content in a small packet.
There are some limitations in reproducing in R exactly how the results appear on the NYT’s site, but packages like httr and jsonlite offer ways to view the HTTP responses.