ETL

Author

Giancarlo Casanova

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.2     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(httr)
Warning: package 'httr' was built under R version 4.5.1
library(jsonlite)
Warning: package 'jsonlite' was built under R version 4.5.1

Attaching package: 'jsonlite'

The following object is masked from 'package:purrr':

    flatten
library(pins)
Warning: package 'pins' was built under R version 4.5.1

Attaching package: 'pins'

The following object is masked from 'package:httr':

    cache_info

Introduction

For this workflow, we’ll query the Posit Public Package Manager API using two custom R functions:

  • get_package_count() : Returns the total number of downloads for a given R package over the past N days (30 by default).

  • get_package_count_history() : Returns the number of downloads per day for a given R package, starting from a specified start date.
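As a rough illustration of how two functions like these might be structured, here is a minimal sketch. It is not the implementation used in this workflow. The helper names (build_count_url(), parse_total_count(), parse_count_history()), the query parameter names (name, days), and the JSON response shape (an array of records with date and count fields) are all assumptions made for illustration. The endpoint is passed in by the caller, and the real API may differ, so check the Package Manager API documentation before relying on any of it.

```r
library(httr)
library(jsonlite)

# Hypothetical helper: build the request URL for a download-count query.
# ASSUMPTION: the query parameter names `name` and `days` are placeholders,
# not the documented Package Manager API.
build_count_url <- function(endpoint, package, days = 30) {
  httr::modify_url(
    endpoint,
    query = list(name = package, days = as.character(days))
  )
}

# Hypothetical helper: total downloads from a JSON body.
# ASSUMPTION: the body is an array of records with `date` and `count` fields.
parse_total_count <- function(json_text) {
  records <- jsonlite::fromJSON(json_text)
  sum(records$count)
}

# Hypothetical helper: daily downloads on or after `start_date`,
# under the same assumed response shape.
parse_count_history <- function(json_text, start_date) {
  records <- jsonlite::fromJSON(json_text)
  records$date <- as.Date(records$date)
  records[records$date >= as.Date(start_date), , drop = FALSE]
}

# Sketch of a get_package_count()-style wrapper: request, check the HTTP
# status, then parse. The `endpoint` argument is an addition for this sketch.
get_package_count <- function(endpoint, package, days = 30) {
  response <- httr::GET(build_count_url(endpoint, package, days))
  httr::stop_for_status(response)
  body <- httr::content(response, as = "text", encoding = "UTF-8")
  parse_total_count(body)
}
```

Keeping URL construction and response parsing separate from the HTTP call means both pieces can be checked against a saved response without touching the network.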