The Federal Register Data
This project gave us the opportunity to explore data from
the Federal Register, the official journal of the Federal
Government of the United States, which contains government agency rules,
proposed rules, and public notices.
The specific XML file read in this report is available at:
https://www.govinfo.gov/bulkdata/FR/2022/08/FR-2022-08-12.xml
Federal Register data in general is published in
machine-readable format (i.e., XML) at: https://www.govinfo.gov/bulkdata/FR/
We also gained hands-on experience with packages such as xml2, a
wrapper around the comprehensive libxml2 C library that makes it easier
to work with XML and HTML in R.
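As a minimal sketch of the xml2 workflow used in this report, the example below parses a small inline XML string and extracts text with XPath queries. The toy document and its agency names are hypothetical; only the element names (CNTNTS, AGCY, CAT, HD) mirror those queried later in the report.

```r
library(xml2)

# Hypothetical toy document; not taken from the Federal Register file
doc <- read_xml("<CNTNTS>
  <AGCY><HD>Agency A</HD><CAT><HD>NOTICES</HD></CAT></AGCY>
  <AGCY><HD>Agency B</HD><CAT><HD>RULES</HD></CAT><CAT><HD>NOTICES</HD></CAT></AGCY>
</CNTNTS>")

# xml_find_all() returns a nodeset; xml_text() extracts the text of each node
agency_names <- xml_text(xml_find_all(doc, "//AGCY/HD"))
categories <- xml_text(xml_find_all(doc, "//AGCY/CAT/HD"))

print(agency_names)
print(categories)
```

The same two calls, `xml_find_all()` followed by `xml_text()`, carry most of the analysis that follows.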
The Code Snippet
Loading the libraries required in this assignment
library(xml2)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5 ✓ purrr 0.3.4
## ✓ tibble 3.1.6 ✓ dplyr 1.0.8
## ✓ tidyr 1.2.0 ✓ stringr 1.4.0
## ✓ readr 2.1.2 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(ggplot2)
library(dplyr)
library(kableExtra)
##
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
##
## group_rows
Read the XML file and store the data
The xml2 package is used to read the XML file directly from the URL: https://www.govinfo.gov/bulkdata/FR/2022/08/FR-2022-08-12.xml
federal_data= read_xml('https://www.govinfo.gov/bulkdata/FR/2022/08/FR-2022-08-12.xml')
#xml_as_kable(federal_data)
** The number of unique agencies involved in this announcement, as listed
in CNTNTS**
Unique_Agencies=unique(xml_find_all(federal_data, "//CNTNTS/AGCY"))
length(Unique_Agencies)
## [1] 47
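Because `xml_find_all()` returns nodes rather than names, the value above most likely reflects distinct AGCY elements. If the same agency name could appear under more than one AGCY element, comparing the heading text is one way to count distinct names. The sketch below uses a hypothetical toy document and assumes the agency name sits in an HD child of AGCY; it has not been run against the real file.

```r
library(xml2)

# Hypothetical toy document in which "Agency A" is listed twice
toy <- read_xml("<CNTNTS>
  <AGCY><HD>Agency A</HD></AGCY>
  <AGCY><HD>Agency B</HD></AGCY>
  <AGCY><HD>Agency A</HD></AGCY>
</CNTNTS>")

# Counting nodes: every AGCY element is a separate node
n_nodes <- length(xml_find_all(toy, "//CNTNTS/AGCY"))

# Counting names: extract the heading text first, then de-duplicate it
agency_names <- xml_text(xml_find_all(toy, "//CNTNTS/AGCY/HD"))
n_names <- length(unique(agency_names))

print(c(nodes = n_nodes, names = n_names))
```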
** A bar plot showing the different categories of announcements and their
counts**
# Category heading of every CAT entry listed under an agency
cat <- xml_text(xml_find_all(federal_data, "//CNTNTS/AGCY/CAT/HD"))
cat_df <- as.data.frame(cat)
# Count how often each category occurs
cat_df <- cat_df %>%
  group_by(cat) %>%
  tally()
ggplot(cat_df, aes(x = cat, y = n, fill = cat)) +
  geom_bar(stat = "identity", width = 0.5) +
  theme_bw() +
  ggtitle("Number of different categories and their count") +
  xlab("CATEGORY") + ylab("COUNT") +
  scale_fill_brewer(palette = "Dark2")

** Names of all the agencies that made an announcement in the
category “PROPOSED RULES”**
Proposed_Rules <-xml_find_all(federal_data, "//AGCY[CAT/HD/text() = 'PROPOSED RULES']")
#Proposed_Rules
xml_as_kable(Proposed_Rules)
Coast Guard Coast Guard
  PROPOSED RULES
    Drawbridge Operations: Bay St. Louis, Bay St. Louis, MS, 49793-49795 2022-17400
  NOTICES
    Removal of Conditions of Entry on Vessels Arriving from Cote d’Ivoire, 49877 2022-17372
    Request for Applications: National Chemical Transportation Safety Advisory Committee, 49876-49877 2022-17396

Environmental Protection Environmental Protection Agency
  PROPOSED RULES
    National Emission Standards for Hazardous Air Pollutants: Gasoline Distribution Technology Review and Standards of Performance for Bulk Gasoline Terminals Review, 49795-49796 2022-17282
    Standards of Performance for Steel Plants: Electric Arc Furnaces Constructed After 10/21/74 and On or Before 8/17/83; Standards of Performance for Steel Plants: Electric Arc Furnaces and Argon-Oxygen Decarburization Constructed After 8/17/83, 49796 2022-17365
  NOTICES
    Agency Information Collection Activities; Proposals, Submissions, and Approvals: Assessment of Environmental Performance Standards and Ecolabels for Federal Procurement (Reinstatement), 49826-49827 2022-17386
    Establishing No-Discharge Zones under Clean Water Act Section 312, 49820-49821 2022-17385
    National Emission Standards for Hazardous Air Pollutants for Inorganic Arsenic Emissions from Primary Copper Smelters (Renewal), 49828 2022-17395
    Certain New Chemicals or Significant New Uses: Statements of Findings for March 2022 and April 2022, 49821-49822 2022-17393
    Clean Air Act Grant: Hawaii Department of Health; Opportunity for Public Hearing, 49828-49830 2022-17481
    Environmental Impact Statements; Availability, etc., 49827-49828 2022-17359
    Meetings: Hazardous Waste Electronic Manifest (e-Manifest) System Advisory Board, 49830-49832 2022-17401
    Pesticide Registration Maintenance Fee: Requests to Voluntarily Cancel Certain Pesticide Registrations, 49822-49826 2022-17310

Federal Aviation Federal Aviation Administration
  RULES
    Airspace Designations and Reporting Points: Coeur D’Alene—Pappy Boyington Field, ID; Correction, 49768 2022-16965
    Sand Point, AK, 49769 2022-16948
    St. Paul Island, AK, 49768-49769 2022-16947
    Vicinity of Liberal, KS; Correction, 49767-49768 2022-17030
  PROPOSED RULES
    Airspace Designations and Reporting Points: Bloomfield, IA, 49781-49783 2022-17051
    Selma, AL, 49783-49784 2022-17129
    Airworthiness Directives: Airbus Helicopters, 49773-49776 2022-16776
    Bombardier, Inc., Airplanes, 49779-49781 2022-17122
    De Havilland Aircraft of Canada Limited (Type Certificate Previously Held by Bombardier, Inc.) Airplanes, 49776-49779 2022-17120
  NOTICES
    Airport Improvement Program Property Release: Spokane International Airport, Spokane, WA, 49911 2022-17326

Federal Energy Federal Energy Regulatory Commission
  PROPOSED RULES
    Duty of Candor, 49784-49793 2022-16608
  NOTICES
    Application: Eagle Creek Schoolfield Hydro, LLC, City of Danville, 49818-49819 2022-17367
    Combined Filings, 49819-49820 2022-17352 2022-17353
    Environmental Assessments; Availability, etc.: Georgia Power Co., 49819 2022-17366

National Oceanic National Oceanic and Atmospheric Administration
  PROPOSED RULES
    Fisheries of the Northeastern United States: Amendment 22 to the Summer Flounder, Scup, and Black Sea Bass Fishery Management Plan, 49796-49798 2022-17315
  NOTICES
    Meetings: New England Fishery Management Council, 49807-49809 2022-17387 2022-17388 2022-17389
    Receipt of Application: Marine Mammals; File No. 26708, 49808-49809 2022-17368

Securities Securities and Exchange Commission
  PROPOSED RULES
    Exemption for Certain Exchange Members, 49930-49973 2022-16711
  NOTICES
    Agency Information Collection Activities; Proposals, Submissions, and Approvals, 49893-49894 2022-17308 2022-17314
    Self-Regulatory Organizations; Proposed Rule Changes: Cboe BYX Exchange, Inc., 49907-49910 2022-17319
    MEMX, LLC, 49894-49907 2022-17320
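The query above returns each matching AGCY node in full, so the output also lists that agency's RULES and NOTICES entries. If only the agency names are wanted, one option is to step from the filtered AGCY node to its own heading. The sketch below uses a hypothetical toy document and assumes the agency name is stored in an HD child of AGCY.

```r
library(xml2)

# Hypothetical toy document; agency names are placeholders
toy <- read_xml("<CNTNTS>
  <AGCY><HD>Agency A</HD><CAT><HD>PROPOSED RULES</HD></CAT><CAT><HD>NOTICES</HD></CAT></AGCY>
  <AGCY><HD>Agency B</HD><CAT><HD>NOTICES</HD></CAT></AGCY>
  <AGCY><HD>Agency C</HD><CAT><HD>RULES</HD></CAT><CAT><HD>PROPOSED RULES</HD></CAT></AGCY>
</CNTNTS>")

# Keep AGCY nodes that have a PROPOSED RULES category, then select only
# the direct HD child (the agency heading), not the category headings
proposed_names <- xml_text(
  xml_find_all(toy, "//AGCY[CAT/HD = 'PROPOSED RULES']/HD")
)

print(proposed_names)
```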
** Displays the text of the subject of the third
rule**
subject_rule <- xml_text(xml_find_all(federal_data, "//RULES/RULE[3]/PREAMB/SUBJECT"))
#subject_rule
xml_as_kable(subject_rule)
Modification of Class E Airspace; Coeur D’Alene—Pappy Boyington
Field, ID; Correction
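A note on the positional predicate: in XPath, `RULE[3]` is evaluated relative to each parent, so `//RULES/RULE[3]` selects the third RULE child of every RULES element. Wrapping the path in parentheses applies the index to the whole result instead. The two forms coincide when the document has a single RULES element, which is assumed but not verified for the file used here. The toy document below is hypothetical.

```r
library(xml2)

# Hypothetical toy document with two RULES blocks of three rules each
toy <- read_xml("<FEDREG>
  <RULES>
    <RULE><SUBJECT>A1</SUBJECT></RULE>
    <RULE><SUBJECT>A2</SUBJECT></RULE>
    <RULE><SUBJECT>A3</SUBJECT></RULE>
  </RULES>
  <RULES>
    <RULE><SUBJECT>B1</SUBJECT></RULE>
    <RULE><SUBJECT>B2</SUBJECT></RULE>
    <RULE><SUBJECT>B3</SUBJECT></RULE>
  </RULES>
</FEDREG>")

# Index applied per parent: third RULE child of each RULES element
per_parent <- xml_text(xml_find_all(toy, "//RULES/RULE[3]/SUBJECT"))

# Index applied to the whole result: third RULE in document order
overall <- xml_text(xml_find_all(toy, "(//RULES/RULE)[3]/SUBJECT"))

print(per_parent)
print(overall)
```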
** Displays the text of the subjects of the notices made by the Coast Guard**
Notice_subject <- xml_find_all(federal_data, "//NOTICES/NOTICE//SUBAGY[text() = 'Coast Guard']/../SUBJECT")
xml_as_kable(Notice_subject)
National Chemical Transportation Safety Advisory Committee;
Vacancies
Notification of the Removal of Conditions of Entry on
Vessels Arriving From Cote d’Ivoire
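The query above finds the SUBAGY node and then steps up to its parent with `..` to reach the sibling SUBJECT. An equivalent formulation filters the parent with a predicate. The sketch below compares the two on a hypothetical toy document; it assumes SUBAGY and SUBJECT are siblings under PREAMB, which is inferred from the paths used in this report rather than checked against a schema.

```r
library(xml2)

# Hypothetical toy document; department and subject names are placeholders
toy <- read_xml("<NOTICES>
  <NOTICE><PREAMB>
    <AGENCY>Department X</AGENCY><SUBAGY>Coast Guard</SUBAGY>
    <SUBJECT>Subject 1</SUBJECT>
  </PREAMB></NOTICE>
  <NOTICE><PREAMB>
    <AGENCY>Department Y</AGENCY><SUBAGY>Other Office</SUBAGY>
    <SUBJECT>Subject 2</SUBJECT>
  </PREAMB></NOTICE>
</NOTICES>")

# Form used in the report: match SUBAGY, go up one level, take SUBJECT
via_parent <- xml_text(
  xml_find_all(toy, "//NOTICE//SUBAGY[text() = 'Coast Guard']/../SUBJECT")
)

# Alternative: keep PREAMB nodes whose SUBAGY matches, then take SUBJECT
via_predicate <- xml_text(
  xml_find_all(toy, "//NOTICE/PREAMB[SUBAGY = 'Coast Guard']/SUBJECT")
)

print(via_parent)
print(via_predicate)
```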
** Displays a bar plot of the top agencies (AGENCY, not SUBAGY) that made more than
5 notices**
top_agency <- xml_text(xml_find_all(federal_data, "//NOTICES/NOTICE/PREAMB/AGENCY/text()"))
top_agency_df <- as.data.frame(top_agency)
top_agency_df <- top_agency_df %>%
  group_by(top_agency) %>%
  tally()
# Keep only agencies with more than 5 notices
top_agency <- filter(top_agency_df, n > 5)
ggplot(top_agency, aes(x = top_agency, y = n)) +
  geom_bar(stat = "identity", color = "black", fill = "red") +
  xlab("TOP AGENCIES") + ylab("COUNT OF NOTICES") +
  coord_flip() + theme_bw() +
  ggtitle("Agencies having more than 5 Notices")
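The counting and filtering step can be checked in isolation on made-up data. The sketch below uses hypothetical agency names and `dplyr::count()`, which is shorthand for `group_by()` followed by `tally()`. Note that `n > 5` is a strict comparison, so an agency with exactly 5 notices is excluded.

```r
library(dplyr)

# Hypothetical notice-level data: one row per notice
agencies <- c(rep("Agency A", 7), rep("Agency B", 5), rep("Agency C", 6))

top <- data.frame(agency = agencies, stringsAsFactors = FALSE) %>%
  count(agency) %>%      # one row per agency with its notice count n
  filter(n > 5) %>%      # strictly more than 5 notices
  arrange(desc(n))       # largest first

print(top)
```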

Thus, in conclusion:
1. The largest number of announcements was made in the form
of Notices.
2. The smallest number of announcements was made in the form of
Rules.
3. The Department of Health and Human Services has the largest
number of notices.
4. The Department of Health and the Department of Commerce issued
the same number of notices.