The Federal Register Data

In this project we explored data from the Federal Register, the official journal of the Federal Government of the United States, which contains government agency rules, proposed rules, and public notices.

The specific XML file used as the data source is: https://www.govinfo.gov/bulkdata/FR/2022/08/FR-2022-08-12.xml

Federal Register data for other dates is also available in machine-readable format (i.e., XML) at: https://www.govinfo.gov/bulkdata/FR/

We also gained hands-on experience with packages such as xml2, a wrapper around the comprehensive libxml2 C library that makes it easier to work with XML and HTML in R.
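As a quick orientation to the three xml2 functions used throughout this report (`read_xml()`, `xml_find_all()`, and `xml_text()`), here is a minimal sketch. It runs on a small hypothetical XML string, not on the real Federal Register file; only the element names `CNTNTS`, `AGCY`, and `HD` are borrowed from the queries used in this report.

```r
library(xml2)

# Hypothetical miniature document (an assumption for illustration only)
doc <- read_xml(
  "<CNTNTS><AGCY><HD>Coast Guard</HD></AGCY><AGCY><HD>Federal Aviation Administration</HD></AGCY></CNTNTS>"
)

# xml_find_all() takes an XPath expression and returns a nodeset
headings <- xml_find_all(doc, "//AGCY/HD")

# xml_text() extracts the text content of each node as a character vector
agency_names <- xml_text(headings)
print(agency_names)
```

The same pattern (parse, select nodes with XPath, convert to text) is what each step of the analysis follows.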

Code Snippets

Load the libraries required for this assignment

library(xml2)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.6     ✓ dplyr   1.0.8
## ✓ tidyr   1.2.0     ✓ stringr 1.4.0
## ✓ readr   2.1.2     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(ggplot2)
library(dplyr)
library(kableExtra)
## 
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
## 
##     group_rows

Read the XML file and store the data

We used the xml2 package to read the XML file directly from the URL: https://www.govinfo.gov/bulkdata/FR/2022/08/FR-2022-08-12.xml

federal_data= read_xml('https://www.govinfo.gov/bulkdata/FR/2022/08/FR-2022-08-12.xml')
#xml_as_kable(federal_data)

**Number of unique agencies involved in this announcement (in CNTNTS)**

Unique_Agencies=unique(xml_find_all(federal_data, "//CNTNTS/AGCY"))
length(Unique_Agencies)
## [1] 47
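The query above selects AGCY elements, so the result is a count of nodes. If the goal is to count distinct agency names, one option is to compare the heading text instead. The sketch below shows the difference on a hypothetical fragment in which the same heading appears twice. It assumes each AGCY has an HD child holding the agency name, and it makes no claim about whether the real file contains such duplicates.

```r
library(xml2)

# Hypothetical fragment: the same agency heading appears twice
doc <- read_xml("<FEDREG><CNTNTS>
  <AGCY><HD>Coast Guard</HD></AGCY>
  <AGCY><HD>Coast Guard</HD></AGCY>
  <AGCY><HD>Securities and Exchange Commission</HD></AGCY>
</CNTNTS></FEDREG>")

agency_nodes <- xml_find_all(doc, "//CNTNTS/AGCY")
agency_names <- xml_text(xml_find_all(doc, "//CNTNTS/AGCY/HD"))

node_count <- length(agency_nodes)          # number of AGCY elements
name_count <- length(unique(agency_names))  # number of distinct names

print(c(nodes = node_count, names = name_count))
```

If both counts agree on the real data, the node count and the name count can be used interchangeably.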

**Bar plot of the announcement categories and their counts**

cat <- xml_text(xml_find_all(federal_data, "//CNTNTS/AGCY/CAT/HD"))
cat_df <- as.data.frame(cat)
cat_df <- cat_df %>% 
  group_by(cat) %>%
  tally()

ggplot(cat_df, aes(x = cat, y = n, fill = cat)) +
  geom_bar(stat = "identity", width = 0.5) +
  theme_bw() +
  ggtitle("Number of different categories and their count") +
  xlab("CATEGORY") +
  ylab("COUNT") +
  scale_fill_brewer(palette = "Dark2")

**Names of all agencies that published an announcement in the “PROPOSED RULES” category**

Proposed_Rules <-xml_find_all(federal_data, "//AGCY[CAT/HD/text() = 'PROPOSED RULES']")
#Proposed_Rules
xml_as_kable(Proposed_Rules)

Coast Guard Coast Guard PROPOSED RULES Drawbridge Operations: Bay St. Louis, Bay St. Louis, MS, 49793-49795 2022-17400 NOTICES Removal of Conditions of Entry on Vessels Arriving from Cote d’Ivoire, 49877 2022-17372 Request for Applications: National Chemical Transportation Safety Advisory Committee, 49876-49877 2022-17396

Environmental Protection Environmental Protection Agency PROPOSED RULES National Emission Standards for Hazardous Air Pollutants: Gasoline Distribution Technology Review and Standards of Performance for Bulk Gasoline Terminals Review, 49795-49796 2022-17282 Standards of Performance for Steel Plants: Electric Arc Furnaces Constructed After 10/21/74 and On or Before 8/17/83; Standards of Performance for Steel Plants: Electric Arc Furnaces and Argon-Oxygen Decarburization Constructed After 8/17/83, 49796 2022-17365 NOTICES Agency Information Collection Activities; Proposals, Submissions, and Approvals: Assessment of Environmental Performance Standards and Ecolabels for Federal Procurement (Reinstatement), 49826-49827 2022-17386 Establishing No-Discharge Zones under Clean Water Act Section 312, 49820-49821 2022-17385 National Emission Standards for Hazardous Air Pollutants for Inorganic Arsenic Emissions from Primary Copper Smelters (Renewal), 49828 2022-17395 Certain New Chemicals or Significant New Uses: Statements of Findings for March 2022 and April 2022, 49821-49822 2022-17393 Clean Air Act Grant: Hawaii Department of Health; Opportunity for Public Hearing, 49828-49830 2022-17481 Environmental Impact Statements; Availability, etc., 49827-49828 2022-17359 Meetings: Hazardous Waste Electronic Manifest (e-Manifest) System Advisory Board, 49830-49832 2022-17401 Pesticide Registration Maintenance Fee: Requests to Voluntarily Cancel Certain Pesticide Registrations, 49822-49826 2022-17310

Federal Aviation Federal Aviation Administration RULES Airspace Designations and Reporting Points: Coeur D’Alene—Pappy Boyington Field, ID; Correction, 49768 2022-16965 Sand Point, AK, 49769 2022-16948 St. Paul Island, AK, 49768-49769 2022-16947 Vicinity of Liberal, KS; Correction, 49767-49768 2022-17030 PROPOSED RULES Airspace Designations and Reporting Points: Bloomfield, IA, 49781-49783 2022-17051 Selma, AL, 49783-49784 2022-17129 Airworthiness Directives: Airbus Helicopters, 49773-49776 2022-16776 Bombardier, Inc., Airplanes, 49779-49781 2022-17122 De Havilland Aircraft of Canada Limited (Type Certificate Previously Held by Bombardier, Inc.) Airplanes, 49776-49779 2022-17120 NOTICES Airport Improvement Program Property Release: Spokane International Airport, Spokane, WA, 49911 2022-17326

Federal Energy Federal Energy Regulatory Commission PROPOSED RULES Duty of Candor, 49784-49793 2022-16608 NOTICES Application: Eagle Creek Schoolfield Hydro, LLC, City of Danville, 49818-49819 2022-17367 Combined Filings, 49819-49820 2022-17352 2022-17353 Environmental Assessments; Availability, etc.: Georgia Power Co., 49819 2022-17366

National Oceanic National Oceanic and Atmospheric Administration PROPOSED RULES Fisheries of the Northeastern United States: Amendment 22 to the Summer Flounder, Scup, and Black Sea Bass Fishery Management Plan, 49796-49798 2022-17315 NOTICES Meetings: New England Fishery Management Council, 49807-49809 2022-17387 2022-17388 2022-17389 Receipt of Application: Marine Mammals; File No. 26708, 49808-49809 2022-17368

Securities Securities and Exchange Commission PROPOSED RULES Exemption for Certain Exchange Members, 49930-49973 2022-16711 NOTICES Agency Information Collection Activities; Proposals, Submissions, and Approvals, 49893-49894 2022-17308 2022-17314 Self-Regulatory Organizations; Proposed Rule Changes: Cboe BYX Exchange, Inc., 49907-49910 2022-17319 MEMX, LLC, 49894-49907 2022-17320

**Text of the subject of the third rule**

subject_rule <- xml_text(xml_find_all(federal_data, "//RULES/RULE[3]/PREAMB/SUBJECT"))
#subject_rule
xml_as_kable(subject_rule)

Modification of Class E Airspace; Coeur D’Alene—Pappy Boyington Field, ID; Correction

**Text of the subjects of the notices made by the Coast Guard**

Notice_subject <- xml_find_all(federal_data, "//NOTICES/NOTICE//SUBAGY[text() = 'Coast Guard']/../SUBJECT")
xml_as_kable(Notice_subject)

National Chemical Transportation Safety Advisory Committee; Vacancies

Notification of the Removal of Conditions of Entry on Vessels Arriving From Cote d’Ivoire
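The Coast Guard query works in three moves: the predicate `[text() = 'Coast Guard']` keeps only matching SUBAGY nodes, `..` steps up to their parent, and `/SUBJECT` selects the subject under that parent. The sketch below reproduces this on a hypothetical fragment and compares it with an alternative that puts the predicate on the parent element. The placement of SUBAGY and SUBJECT directly under PREAMB, along with all agency names and subjects, is an assumption made for illustration.

```r
library(xml2)

# Hypothetical notices; structure and values are illustrative assumptions
doc <- read_xml("<NOTICES>
  <NOTICE><PREAMB><AGENCY>AGENCY A</AGENCY><SUBAGY>Coast Guard</SUBAGY><SUBJECT>Example subject A</SUBJECT></PREAMB></NOTICE>
  <NOTICE><PREAMB><AGENCY>AGENCY B</AGENCY><SUBAGY>Other Office</SUBAGY><SUBJECT>Example subject B</SUBJECT></PREAMB></NOTICE>
</NOTICES>")

# Style used in this report: filter the child, then climb to the parent
via_parent <- xml_text(
  xml_find_all(doc, "//NOTICE//SUBAGY[text() = 'Coast Guard']/../SUBJECT")
)

# Alternative: filter the parent by its child, then descend
via_predicate <- xml_text(
  xml_find_all(doc, "//NOTICE/PREAMB[SUBAGY = 'Coast Guard']/SUBJECT")
)

print(via_parent)
print(via_predicate)
```

Both expressions select the same node here. The second form avoids the `..` step, which some readers find easier to follow.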

**Bar plot of the top agencies (AGENCY, not SUBAGY) that made more than 5 notices**

top_agency <- xml_text(xml_find_all(federal_data, "//NOTICES/NOTICE/PREAMB/AGENCY/text()"))

top_agency_df <- as.data.frame(top_agency)


top_agency_df <- top_agency_df %>% 
  group_by(top_agency) %>%
  tally()

top_agency <- filter(top_agency_df, n > 5) 

ggplot(top_agency, aes(x = top_agency, y = n)) +
  geom_bar(stat = "identity", color = "black", fill = "red") +
  xlab("TOP AGENCIES") +
  ylab("COUNT OF NOTICES") +
  coord_flip() +
  theme_bw() +
  ggtitle("Agencies having more than 5 Notices")
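The count-then-filter step can also be written as a single pipeline with `dplyr::count()`, which is shorthand for `group_by()` followed by `tally()`. The sketch below uses hypothetical agency labels in place of the extracted AGENCY text; the labels and counts are invented for illustration and do not come from the Federal Register file.

```r
library(dplyr)

# Hypothetical agency labels standing in for the extracted AGENCY text
agency <- c(rep("AGENCY A", 7), rep("AGENCY B", 6), rep("AGENCY C", 2))

frequent <- tibble(agency = agency) %>%
  count(agency) %>%      # one row per agency, with its frequency in column n
  filter(n > 5) %>%      # keep agencies with more than 5 notices
  arrange(desc(n))       # most frequent first

print(frequent)
```

Keeping the counted table and the filtered table under different names also avoids reusing one variable name for both the raw text and the final data frame.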

Conclusions:

1. The largest number of announcements was made in the form of Notices.

2. The smallest number of announcements was made in the form of Rules.

3. The Department of Health and Human Services issued the largest number of notices.

4. The Department of Health and the Department of Commerce issued the same number of notices.