Assignment: working with JSON, HTML, XML, and Parquet in R

You have received the following data from CUNYMart, located at 123 Example Street, Anytown, USA.

This data will be used for inventory analysis at the retailer. You are required to prepare the data for analysis by formatting it in JSON, HTML, XML, and Parquet. Additionally, provide the pros and cons of each format.

You must include R code for generating the data in each format and importing it into R.

Load Libraries

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(readr)
library(jsonlite)
library(xtable)
library(xml2)
library(XML)
library(arrow)
## 
## Attaching package: 'arrow'
## The following object is masked from 'package:utils':
## 
##     timestamp

Load the Raw Data

raw_data <- "Category,Item Name,Item ID,Brand,Price,Variation ID,Variation Details
Electronics,Smartphone,101,TechBrand,699.99,101-A,Color: Black, Storage: 64GB
Electronics,Smartphone,101,TechBrand,699.99,101-B,Color: White, Storage: 128GB
Electronics,Laptop,102,CompuBrand,1099.99,102-A,Color: Silver, Storage: 256GB
Electronics,Laptop,102,CompuBrand,1099.99,102-B,Color: Space Gray, Storage: 512GB
Home Appliances,Refrigerator,201,HomeCool,899.99,201-A,Color: Stainless Steel, Capacity: 20 cu ft
Home Appliances,Refrigerator,201,HomeCool,899.99,201-B,Color: White, Capacity: 18 cu ft
Home Appliances,Washing Machine,202,CleanTech,499.99,202-A,Type: Front Load, Capacity: 4.5 cu ft
Home Appliances,Washing Machine,202,CleanTech,499.99,202-B,Type: Top Load, Capacity: 5.0 cu ft
Clothing,T-Shirt,301,FashionCo,19.99,301-A,Color: Blue, Size: S
Clothing,T-Shirt,301,FashionCo,19.99,301-B,Color: Red, Size: M
Clothing,T-Shirt,301,FashionCo,19.99,301-C,Color: Green, Size: L
Clothing,Jeans,302,DenimWorks,49.99,302-A,Color: Dark Blue, Size: 32
Clothing,Jeans,302,DenimWorks,49.99,302-B,Color: Light Blue, Size: 34
Books,Fiction Novel,401,-,14.99,401-A,Format: Hardcover, Language: English
Books,Fiction Novel,401,-,14.99,401-B,Format: Paperback, Language: Spanish
Books,Non-Fiction Guide,402,-,24.99,402-A,Format: eBook, Language: English
Books,Non-Fiction Guide,402,-,24.99,402-B,Format: Paperback, Language: French
Sports Equipment,Basketball,501,SportsGear,29.99,501-A,Size: Size 7, Color: Orange
Sports Equipment,Tennis Racket,502,RacketPro,89.99,502-A,Material: Graphite, Color: Black
Sports Equipment,Tennis Racket,502,RacketPro,89.99,502-B,Material: Aluminum, Color: Silver"

Create DataFrame

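# Each Variation Details value contains an unquoted comma, so the data rows
# have eight fields under seven headers; this is the likely source of the
# parsing warning below. The printed results show the extra field kept in
# the last column.
dataframe <- read_csv(raw_data)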
## Warning: One or more parsing issues, call `problems()` on your data frame for details,
## e.g.:
##   dat <- vroom(...)
##   problems(dat)
## Rows: 20 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (5): Category, Item Name, Brand, Variation ID, Variation Details
## dbl (2): Item ID, Price
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
print(dataframe)
## # A tibble: 20 × 7
##    Category         `Item Name`       `Item ID` Brand       Price `Variation ID`
##    <chr>            <chr>                 <dbl> <chr>       <dbl> <chr>         
##  1 Electronics      Smartphone              101 TechBrand   700.  101-A         
##  2 Electronics      Smartphone              101 TechBrand   700.  101-B         
##  3 Electronics      Laptop                  102 CompuBrand 1100.  102-A         
##  4 Electronics      Laptop                  102 CompuBrand 1100.  102-B         
##  5 Home Appliances  Refrigerator            201 HomeCool    900.  201-A         
##  6 Home Appliances  Refrigerator            201 HomeCool    900.  201-B         
##  7 Home Appliances  Washing Machine         202 CleanTech   500.  202-A         
##  8 Home Appliances  Washing Machine         202 CleanTech   500.  202-B         
##  9 Clothing         T-Shirt                 301 FashionCo    20.0 301-A         
## 10 Clothing         T-Shirt                 301 FashionCo    20.0 301-B         
## 11 Clothing         T-Shirt                 301 FashionCo    20.0 301-C         
## 12 Clothing         Jeans                   302 DenimWorks   50.0 302-A         
## 13 Clothing         Jeans                   302 DenimWorks   50.0 302-B         
## 14 Books            Fiction Novel           401 -            15.0 401-A         
## 15 Books            Fiction Novel           401 -            15.0 401-B         
## 16 Books            Non-Fiction Guide       402 -            25.0 402-A         
## 17 Books            Non-Fiction Guide       402 -            25.0 402-B         
## 18 Sports Equipment Basketball              501 SportsGear   30.0 501-A         
## 19 Sports Equipment Tennis Racket           502 RacketPro    90.0 502-A         
## 20 Sports Equipment Tennis Racket           502 RacketPro    90.0 502-B         
## # ℹ 1 more variable: `Variation Details` <chr>
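The parsing warning above most likely comes from the unquoted comma inside each Variation Details value. The following is a minimal base-R sketch of one way to parse such rows explicitly, assuming only the last column ever contains commas. The two-row `sample_csv` string and the helper name `parse_inventory` are illustrative and not part of the original assignment.

```r
# Illustrative two-row sample in the same layout as raw_data.
sample_csv <- "Category,Item Name,Item ID,Brand,Price,Variation ID,Variation Details
Electronics,Smartphone,101,TechBrand,699.99,101-A,Color: Black, Storage: 64GB
Books,Fiction Novel,401,-,14.99,401-A,Format: Hardcover, Language: English"

# Hypothetical helper: split each line on commas, keep the first six fields,
# and glue everything after them back together as the seventh field.
# Assumes every data row has at least n_cols fields.
parse_inventory <- function(text, n_cols = 7) {
  lines  <- strsplit(text, "\n", fixed = TRUE)[[1]]
  fields <- strsplit(lines, ",", fixed = TRUE)
  header <- fields[[1]]
  rows <- lapply(fields[-1], function(p) {
    c(p[seq_len(n_cols - 1)], paste(p[n_cols:length(p)], collapse = ","))
  })
  out <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
  names(out) <- header
  # Everything arrives as character, so restore the numeric columns by hand.
  out[["Item ID"]] <- as.integer(out[["Item ID"]])
  out[["Price"]]   <- as.numeric(out[["Price"]])
  out
}

parsed <- parse_inventory(sample_csv)
str(parsed)
```

Quoting the Variation Details values in the source text would be the more conventional fix, since any CSV reader could then parse the data without special handling.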

JSON

dataframe_json <- toJSON(dataframe, pretty = TRUE)
cat(dataframe_json)
## [
##   {
##     "Category": "Electronics",
##     "Item Name": "Smartphone",
##     "Item ID": 101,
##     "Brand": "TechBrand",
##     "Price": 699.99,
##     "Variation ID": "101-A",
##     "Variation Details": "Color: Black, Storage: 64GB"
##   },
##   {
##     "Category": "Electronics",
##     "Item Name": "Smartphone",
##     "Item ID": 101,
##     "Brand": "TechBrand",
##     "Price": 699.99,
##     "Variation ID": "101-B",
##     "Variation Details": "Color: White, Storage: 128GB"
##   },
##   {
##     "Category": "Electronics",
##     "Item Name": "Laptop",
##     "Item ID": 102,
##     "Brand": "CompuBrand",
##     "Price": 1099.99,
##     "Variation ID": "102-A",
##     "Variation Details": "Color: Silver, Storage: 256GB"
##   },
##   {
##     "Category": "Electronics",
##     "Item Name": "Laptop",
##     "Item ID": 102,
##     "Brand": "CompuBrand",
##     "Price": 1099.99,
##     "Variation ID": "102-B",
##     "Variation Details": "Color: Space Gray, Storage: 512GB"
##   },
##   {
##     "Category": "Home Appliances",
##     "Item Name": "Refrigerator",
##     "Item ID": 201,
##     "Brand": "HomeCool",
##     "Price": 899.99,
##     "Variation ID": "201-A",
##     "Variation Details": "Color: Stainless Steel, Capacity: 20 cu ft"
##   },
##   {
##     "Category": "Home Appliances",
##     "Item Name": "Refrigerator",
##     "Item ID": 201,
##     "Brand": "HomeCool",
##     "Price": 899.99,
##     "Variation ID": "201-B",
##     "Variation Details": "Color: White, Capacity: 18 cu ft"
##   },
##   {
##     "Category": "Home Appliances",
##     "Item Name": "Washing Machine",
##     "Item ID": 202,
##     "Brand": "CleanTech",
##     "Price": 499.99,
##     "Variation ID": "202-A",
##     "Variation Details": "Type: Front Load, Capacity: 4.5 cu ft"
##   },
##   {
##     "Category": "Home Appliances",
##     "Item Name": "Washing Machine",
##     "Item ID": 202,
##     "Brand": "CleanTech",
##     "Price": 499.99,
##     "Variation ID": "202-B",
##     "Variation Details": "Type: Top Load, Capacity: 5.0 cu ft"
##   },
##   {
##     "Category": "Clothing",
##     "Item Name": "T-Shirt",
##     "Item ID": 301,
##     "Brand": "FashionCo",
##     "Price": 19.99,
##     "Variation ID": "301-A",
##     "Variation Details": "Color: Blue, Size: S"
##   },
##   {
##     "Category": "Clothing",
##     "Item Name": "T-Shirt",
##     "Item ID": 301,
##     "Brand": "FashionCo",
##     "Price": 19.99,
##     "Variation ID": "301-B",
##     "Variation Details": "Color: Red, Size: M"
##   },
##   {
##     "Category": "Clothing",
##     "Item Name": "T-Shirt",
##     "Item ID": 301,
##     "Brand": "FashionCo",
##     "Price": 19.99,
##     "Variation ID": "301-C",
##     "Variation Details": "Color: Green, Size: L"
##   },
##   {
##     "Category": "Clothing",
##     "Item Name": "Jeans",
##     "Item ID": 302,
##     "Brand": "DenimWorks",
##     "Price": 49.99,
##     "Variation ID": "302-A",
##     "Variation Details": "Color: Dark Blue, Size: 32"
##   },
##   {
##     "Category": "Clothing",
##     "Item Name": "Jeans",
##     "Item ID": 302,
##     "Brand": "DenimWorks",
##     "Price": 49.99,
##     "Variation ID": "302-B",
##     "Variation Details": "Color: Light Blue, Size: 34"
##   },
##   {
##     "Category": "Books",
##     "Item Name": "Fiction Novel",
##     "Item ID": 401,
##     "Brand": "-",
##     "Price": 14.99,
##     "Variation ID": "401-A",
##     "Variation Details": "Format: Hardcover, Language: English"
##   },
##   {
##     "Category": "Books",
##     "Item Name": "Fiction Novel",
##     "Item ID": 401,
##     "Brand": "-",
##     "Price": 14.99,
##     "Variation ID": "401-B",
##     "Variation Details": "Format: Paperback, Language: Spanish"
##   },
##   {
##     "Category": "Books",
##     "Item Name": "Non-Fiction Guide",
##     "Item ID": 402,
##     "Brand": "-",
##     "Price": 24.99,
##     "Variation ID": "402-A",
##     "Variation Details": "Format: eBook, Language: English"
##   },
##   {
##     "Category": "Books",
##     "Item Name": "Non-Fiction Guide",
##     "Item ID": 402,
##     "Brand": "-",
##     "Price": 24.99,
##     "Variation ID": "402-B",
##     "Variation Details": "Format: Paperback, Language: French"
##   },
##   {
##     "Category": "Sports Equipment",
##     "Item Name": "Basketball",
##     "Item ID": 501,
##     "Brand": "SportsGear",
##     "Price": 29.99,
##     "Variation ID": "501-A",
##     "Variation Details": "Size: Size 7, Color: Orange"
##   },
##   {
##     "Category": "Sports Equipment",
##     "Item Name": "Tennis Racket",
##     "Item ID": 502,
##     "Brand": "RacketPro",
##     "Price": 89.99,
##     "Variation ID": "502-A",
##     "Variation Details": "Material: Graphite, Color: Black"
##   },
##   {
##     "Category": "Sports Equipment",
##     "Item Name": "Tennis Racket",
##     "Item ID": 502,
##     "Brand": "RacketPro",
##     "Price": 89.99,
##     "Variation ID": "502-B",
##     "Variation Details": "Material: Aluminum, Color: Silver"
##   }
## ]
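The JSON above is only printed to the console, so the import half of the task is not shown. A minimal sketch of a file round trip with jsonlite follows, assuming a small stand-in data frame named `inventory` and a temporary file path, both illustrative.

```r
library(jsonlite)

# Illustrative stand-in for the inventory data frame.
inventory <- data.frame(
  "Item Name" = c("Smartphone", "Laptop"),
  "Item ID"   = c(101, 102),
  "Price"     = c(699.99, 1099.99),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Export to a temporary file, then import it again.
json_path <- tempfile(fileext = ".json")
write_json(inventory, json_path, pretty = TRUE)
inventory_back <- fromJSON(json_path)

# JSON has a single number type, so column types are inferred on import
# and may differ from the original (for example integer instead of double).
str(inventory_back)
```

Column names with spaces survive the round trip because JSON keys are ordinary strings.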

HTML

dataframe_html <- print(xtable(dataframe), type = "html")
## <!-- html table generated in R 4.4.2 by xtable 1.8-4 package -->
## <!-- Wed Mar 26 21:13:03 2025 -->
## <table border=1>
## <tr> <th>  </th> <th> Category </th> <th> Item Name </th> <th> Item ID </th> <th> Brand </th> <th> Price </th> <th> Variation ID </th> <th> Variation Details </th>  </tr>
##   <tr> <td align="right"> 1 </td> <td> Electronics </td> <td> Smartphone </td> <td align="right"> 101.00 </td> <td> TechBrand </td> <td align="right"> 699.99 </td> <td> 101-A </td> <td> Color: Black, Storage: 64GB </td> </tr>
##   <tr> <td align="right"> 2 </td> <td> Electronics </td> <td> Smartphone </td> <td align="right"> 101.00 </td> <td> TechBrand </td> <td align="right"> 699.99 </td> <td> 101-B </td> <td> Color: White, Storage: 128GB </td> </tr>
##   <tr> <td align="right"> 3 </td> <td> Electronics </td> <td> Laptop </td> <td align="right"> 102.00 </td> <td> CompuBrand </td> <td align="right"> 1099.99 </td> <td> 102-A </td> <td> Color: Silver, Storage: 256GB </td> </tr>
##   <tr> <td align="right"> 4 </td> <td> Electronics </td> <td> Laptop </td> <td align="right"> 102.00 </td> <td> CompuBrand </td> <td align="right"> 1099.99 </td> <td> 102-B </td> <td> Color: Space Gray, Storage: 512GB </td> </tr>
##   <tr> <td align="right"> 5 </td> <td> Home Appliances </td> <td> Refrigerator </td> <td align="right"> 201.00 </td> <td> HomeCool </td> <td align="right"> 899.99 </td> <td> 201-A </td> <td> Color: Stainless Steel, Capacity: 20 cu ft </td> </tr>
##   <tr> <td align="right"> 6 </td> <td> Home Appliances </td> <td> Refrigerator </td> <td align="right"> 201.00 </td> <td> HomeCool </td> <td align="right"> 899.99 </td> <td> 201-B </td> <td> Color: White, Capacity: 18 cu ft </td> </tr>
##   <tr> <td align="right"> 7 </td> <td> Home Appliances </td> <td> Washing Machine </td> <td align="right"> 202.00 </td> <td> CleanTech </td> <td align="right"> 499.99 </td> <td> 202-A </td> <td> Type: Front Load, Capacity: 4.5 cu ft </td> </tr>
##   <tr> <td align="right"> 8 </td> <td> Home Appliances </td> <td> Washing Machine </td> <td align="right"> 202.00 </td> <td> CleanTech </td> <td align="right"> 499.99 </td> <td> 202-B </td> <td> Type: Top Load, Capacity: 5.0 cu ft </td> </tr>
##   <tr> <td align="right"> 9 </td> <td> Clothing </td> <td> T-Shirt </td> <td align="right"> 301.00 </td> <td> FashionCo </td> <td align="right"> 19.99 </td> <td> 301-A </td> <td> Color: Blue, Size: S </td> </tr>
##   <tr> <td align="right"> 10 </td> <td> Clothing </td> <td> T-Shirt </td> <td align="right"> 301.00 </td> <td> FashionCo </td> <td align="right"> 19.99 </td> <td> 301-B </td> <td> Color: Red, Size: M </td> </tr>
##   <tr> <td align="right"> 11 </td> <td> Clothing </td> <td> T-Shirt </td> <td align="right"> 301.00 </td> <td> FashionCo </td> <td align="right"> 19.99 </td> <td> 301-C </td> <td> Color: Green, Size: L </td> </tr>
##   <tr> <td align="right"> 12 </td> <td> Clothing </td> <td> Jeans </td> <td align="right"> 302.00 </td> <td> DenimWorks </td> <td align="right"> 49.99 </td> <td> 302-A </td> <td> Color: Dark Blue, Size: 32 </td> </tr>
##   <tr> <td align="right"> 13 </td> <td> Clothing </td> <td> Jeans </td> <td align="right"> 302.00 </td> <td> DenimWorks </td> <td align="right"> 49.99 </td> <td> 302-B </td> <td> Color: Light Blue, Size: 34 </td> </tr>
##   <tr> <td align="right"> 14 </td> <td> Books </td> <td> Fiction Novel </td> <td align="right"> 401.00 </td> <td> - </td> <td align="right"> 14.99 </td> <td> 401-A </td> <td> Format: Hardcover, Language: English </td> </tr>
##   <tr> <td align="right"> 15 </td> <td> Books </td> <td> Fiction Novel </td> <td align="right"> 401.00 </td> <td> - </td> <td align="right"> 14.99 </td> <td> 401-B </td> <td> Format: Paperback, Language: Spanish </td> </tr>
##   <tr> <td align="right"> 16 </td> <td> Books </td> <td> Non-Fiction Guide </td> <td align="right"> 402.00 </td> <td> - </td> <td align="right"> 24.99 </td> <td> 402-A </td> <td> Format: eBook, Language: English </td> </tr>
##   <tr> <td align="right"> 17 </td> <td> Books </td> <td> Non-Fiction Guide </td> <td align="right"> 402.00 </td> <td> - </td> <td align="right"> 24.99 </td> <td> 402-B </td> <td> Format: Paperback, Language: French </td> </tr>
##   <tr> <td align="right"> 18 </td> <td> Sports Equipment </td> <td> Basketball </td> <td align="right"> 501.00 </td> <td> SportsGear </td> <td align="right"> 29.99 </td> <td> 501-A </td> <td> Size: Size 7, Color: Orange </td> </tr>
##   <tr> <td align="right"> 19 </td> <td> Sports Equipment </td> <td> Tennis Racket </td> <td align="right"> 502.00 </td> <td> RacketPro </td> <td align="right"> 89.99 </td> <td> 502-A </td> <td> Material: Graphite, Color: Black </td> </tr>
##   <tr> <td align="right"> 20 </td> <td> Sports Equipment </td> <td> Tennis Racket </td> <td align="right"> 502.00 </td> <td> RacketPro </td> <td align="right"> 89.99 </td> <td> 502-B </td> <td> Material: Aluminum, Color: Silver </td> </tr>
##    </table>
write(dataframe_html, file = "cunyMart_inventory.html")

print("HTML file 'cunyMart_inventory.html' saved successfully.")
## [1] "HTML file 'cunyMart_inventory.html' saved successfully."
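The HTML file is written above but never read back. One possible way to import such a table is sketched below with xml2, assuming a small stand-in data frame named `inventory` and a temporary file path, both illustrative. It relies on the layout xtable produces: header cells in `<th>` and data cells in `<td>`.

```r
library(xtable)
library(xml2)

# Illustrative stand-in for the inventory data frame.
inventory <- data.frame(
  "Item Name" = c("Smartphone", "Laptop"),
  "Price"     = c(699.99, 1099.99),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Write the table straight to a file; include.rownames = FALSE drops the
# unnamed row-number column seen in the output above.
html_path <- tempfile(fileext = ".html")
print(xtable(inventory), type = "html", file = html_path,
      include.rownames = FALSE)

# Read it back: the first <tr> holds the headers, the rest hold the data.
doc    <- read_html(html_path)
rows   <- xml_find_all(doc, "//table//tr")
header <- xml_text(xml_find_all(rows[[1]], "./th"), trim = TRUE)
cells  <- lapply(seq_along(rows)[-1], function(i) {
  xml_text(xml_find_all(rows[[i]], "./td"), trim = TRUE)
})

inventory_back <- as.data.frame(do.call(rbind, cells), stringsAsFactors = FALSE)
names(inventory_back) <- header

# Every cell is text after import, so numeric columns are converted by hand.
inventory_back$Price <- as.numeric(inventory_back$Price)
str(inventory_back)
```

The manual type conversion at the end shows the main weakness of HTML as a data format: the table stores formatted text, not typed values.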

XML

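# Function to convert data frame to XML
# Caveats: column names are used verbatim as element names, so names that
# contain spaces (e.g. "Item Name") give output that is not well-formed XML.
# apply() also converts each row to character first, which pads numeric
# values to a common width (e.g. " 699.99").
dataframe_to_xml <- function(dataframe, root_name = "items") {
  root <- newXMLNode(root_name)
  
  apply(dataframe, 1, function(row) {
    item_node <- newXMLNode("item", parent = root)
    mapply(function(colname, value) {
      newXMLNode(colname, value, parent = item_node)
    }, names(row), row)
  })
  
  return(root)
}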

# Convert the data frame to XML
xml_data <- dataframe_to_xml(dataframe)

saveXML(xml_data, file = "cunyMart_inventory.xml")
## [1] "cunyMart_inventory.xml"
cat(saveXML(xml_data))
## <items>
##   <item>
##     <Category>Electronics</Category>
##     <Item Name>Smartphone</Item Name>
##     <Item ID>101</Item ID>
##     <Brand>TechBrand</Brand>
##     <Price> 699.99</Price>
##     <Variation ID>101-A</Variation ID>
##     <Variation Details>Color: Black, Storage: 64GB</Variation Details>
##   </item>
##   <item>
##     <Category>Electronics</Category>
##     <Item Name>Smartphone</Item Name>
##     <Item ID>101</Item ID>
##     <Brand>TechBrand</Brand>
##     <Price> 699.99</Price>
##     <Variation ID>101-B</Variation ID>
##     <Variation Details>Color: White, Storage: 128GB</Variation Details>
##   </item>
##   <item>
##     <Category>Electronics</Category>
##     <Item Name>Laptop</Item Name>
##     <Item ID>102</Item ID>
##     <Brand>CompuBrand</Brand>
##     <Price>1099.99</Price>
##     <Variation ID>102-A</Variation ID>
##     <Variation Details>Color: Silver, Storage: 256GB</Variation Details>
##   </item>
##   <item>
##     <Category>Electronics</Category>
##     <Item Name>Laptop</Item Name>
##     <Item ID>102</Item ID>
##     <Brand>CompuBrand</Brand>
##     <Price>1099.99</Price>
##     <Variation ID>102-B</Variation ID>
##     <Variation Details>Color: Space Gray, Storage: 512GB</Variation Details>
##   </item>
##   <item>
##     <Category>Home Appliances</Category>
##     <Item Name>Refrigerator</Item Name>
##     <Item ID>201</Item ID>
##     <Brand>HomeCool</Brand>
##     <Price> 899.99</Price>
##     <Variation ID>201-A</Variation ID>
##     <Variation Details>Color: Stainless Steel, Capacity: 20 cu ft</Variation Details>
##   </item>
##   <item>
##     <Category>Home Appliances</Category>
##     <Item Name>Refrigerator</Item Name>
##     <Item ID>201</Item ID>
##     <Brand>HomeCool</Brand>
##     <Price> 899.99</Price>
##     <Variation ID>201-B</Variation ID>
##     <Variation Details>Color: White, Capacity: 18 cu ft</Variation Details>
##   </item>
##   <item>
##     <Category>Home Appliances</Category>
##     <Item Name>Washing Machine</Item Name>
##     <Item ID>202</Item ID>
##     <Brand>CleanTech</Brand>
##     <Price> 499.99</Price>
##     <Variation ID>202-A</Variation ID>
##     <Variation Details>Type: Front Load, Capacity: 4.5 cu ft</Variation Details>
##   </item>
##   <item>
##     <Category>Home Appliances</Category>
##     <Item Name>Washing Machine</Item Name>
##     <Item ID>202</Item ID>
##     <Brand>CleanTech</Brand>
##     <Price> 499.99</Price>
##     <Variation ID>202-B</Variation ID>
##     <Variation Details>Type: Top Load, Capacity: 5.0 cu ft</Variation Details>
##   </item>
##   <item>
##     <Category>Clothing</Category>
##     <Item Name>T-Shirt</Item Name>
##     <Item ID>301</Item ID>
##     <Brand>FashionCo</Brand>
##     <Price>  19.99</Price>
##     <Variation ID>301-A</Variation ID>
##     <Variation Details>Color: Blue, Size: S</Variation Details>
##   </item>
##   <item>
##     <Category>Clothing</Category>
##     <Item Name>T-Shirt</Item Name>
##     <Item ID>301</Item ID>
##     <Brand>FashionCo</Brand>
##     <Price>  19.99</Price>
##     <Variation ID>301-B</Variation ID>
##     <Variation Details>Color: Red, Size: M</Variation Details>
##   </item>
##   <item>
##     <Category>Clothing</Category>
##     <Item Name>T-Shirt</Item Name>
##     <Item ID>301</Item ID>
##     <Brand>FashionCo</Brand>
##     <Price>  19.99</Price>
##     <Variation ID>301-C</Variation ID>
##     <Variation Details>Color: Green, Size: L</Variation Details>
##   </item>
##   <item>
##     <Category>Clothing</Category>
##     <Item Name>Jeans</Item Name>
##     <Item ID>302</Item ID>
##     <Brand>DenimWorks</Brand>
##     <Price>  49.99</Price>
##     <Variation ID>302-A</Variation ID>
##     <Variation Details>Color: Dark Blue, Size: 32</Variation Details>
##   </item>
##   <item>
##     <Category>Clothing</Category>
##     <Item Name>Jeans</Item Name>
##     <Item ID>302</Item ID>
##     <Brand>DenimWorks</Brand>
##     <Price>  49.99</Price>
##     <Variation ID>302-B</Variation ID>
##     <Variation Details>Color: Light Blue, Size: 34</Variation Details>
##   </item>
##   <item>
##     <Category>Books</Category>
##     <Item Name>Fiction Novel</Item Name>
##     <Item ID>401</Item ID>
##     <Brand>-</Brand>
##     <Price>  14.99</Price>
##     <Variation ID>401-A</Variation ID>
##     <Variation Details>Format: Hardcover, Language: English</Variation Details>
##   </item>
##   <item>
##     <Category>Books</Category>
##     <Item Name>Fiction Novel</Item Name>
##     <Item ID>401</Item ID>
##     <Brand>-</Brand>
##     <Price>  14.99</Price>
##     <Variation ID>401-B</Variation ID>
##     <Variation Details>Format: Paperback, Language: Spanish</Variation Details>
##   </item>
##   <item>
##     <Category>Books</Category>
##     <Item Name>Non-Fiction Guide</Item Name>
##     <Item ID>402</Item ID>
##     <Brand>-</Brand>
##     <Price>  24.99</Price>
##     <Variation ID>402-A</Variation ID>
##     <Variation Details>Format: eBook, Language: English</Variation Details>
##   </item>
##   <item>
##     <Category>Books</Category>
##     <Item Name>Non-Fiction Guide</Item Name>
##     <Item ID>402</Item ID>
##     <Brand>-</Brand>
##     <Price>  24.99</Price>
##     <Variation ID>402-B</Variation ID>
##     <Variation Details>Format: Paperback, Language: French</Variation Details>
##   </item>
##   <item>
##     <Category>Sports Equipment</Category>
##     <Item Name>Basketball</Item Name>
##     <Item ID>501</Item ID>
##     <Brand>SportsGear</Brand>
##     <Price>  29.99</Price>
##     <Variation ID>501-A</Variation ID>
##     <Variation Details>Size: Size 7, Color: Orange</Variation Details>
##   </item>
##   <item>
##     <Category>Sports Equipment</Category>
##     <Item Name>Tennis Racket</Item Name>
##     <Item ID>502</Item ID>
##     <Brand>RacketPro</Brand>
##     <Price>  89.99</Price>
##     <Variation ID>502-A</Variation ID>
##     <Variation Details>Material: Graphite, Color: Black</Variation Details>
##   </item>
##   <item>
##     <Category>Sports Equipment</Category>
##     <Item Name>Tennis Racket</Item Name>
##     <Item ID>502</Item ID>
##     <Brand>RacketPro</Brand>
##     <Price>  89.99</Price>
##     <Variation ID>502-B</Variation ID>
##     <Variation Details>Material: Aluminum, Color: Silver</Variation Details>
##   </item>
## </items>
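Element names such as `<Item Name>` in the output above contain spaces, which XML does not allow, so a parser would likely reject the file. The sketch below shows one possible alternative using xml2 instead of the XML package: it replaces unsupported characters in the column names, writes the file, and reads it back. The stand-in data frame `inventory` and the helper `xml_safe_names` are illustrative assumptions.

```r
library(xml2)

# Illustrative stand-in for the inventory data frame.
inventory <- data.frame(
  "Item Name" = c("Smartphone", "Laptop"),
  "Price"     = c(699.99, 1099.99),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace anything that is not a letter, digit, or
# underscore, so "Item Name" becomes "Item_Name". Assumes each name starts
# with a letter, as the column names here do.
xml_safe_names <- function(x) gsub("[^A-Za-z0-9_]", "_", x)

doc  <- xml_new_root("items")
tags <- xml_safe_names(names(inventory))
for (i in seq_len(nrow(inventory))) {
  item <- xml_add_child(doc, "item")
  for (j in seq_along(tags)) {
    # Converting one value at a time avoids the padding seen in <Price> above.
    xml_add_child(item, tags[j], as.character(inventory[[j]][i]))
  }
}

xml_path <- tempfile(fileext = ".xml")
write_xml(doc, xml_path)

# Import: one <item> per row, one child element per column.
items <- xml_find_all(read_xml(xml_path), "//item")
inventory_back <- data.frame(
  Item_Name = xml_text(xml_find_first(items, "./Item_Name")),
  Price     = as.numeric(xml_text(xml_find_first(items, "./Price"))),
  stringsAsFactors = FALSE
)
str(inventory_back)
```

Underscores are one convention among several; the point is only that the names must be valid XML before the file can be imported again.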

Parquet

# Write data to Parquet
write_parquet(dataframe, "cunyMart_inventory.parquet")

# To load the data back into R
loaded_data <- read_parquet("cunyMart_inventory.parquet")
print(loaded_data)
## # A tibble: 20 × 7
##    Category         `Item Name`       `Item ID` Brand       Price `Variation ID`
##  * <chr>            <chr>                 <dbl> <chr>       <dbl> <chr>         
##  1 Electronics      Smartphone              101 TechBrand   700.  101-A         
##  2 Electronics      Smartphone              101 TechBrand   700.  101-B         
##  3 Electronics      Laptop                  102 CompuBrand 1100.  102-A         
##  4 Electronics      Laptop                  102 CompuBrand 1100.  102-B         
##  5 Home Appliances  Refrigerator            201 HomeCool    900.  201-A         
##  6 Home Appliances  Refrigerator            201 HomeCool    900.  201-B         
##  7 Home Appliances  Washing Machine         202 CleanTech   500.  202-A         
##  8 Home Appliances  Washing Machine         202 CleanTech   500.  202-B         
##  9 Clothing         T-Shirt                 301 FashionCo    20.0 301-A         
## 10 Clothing         T-Shirt                 301 FashionCo    20.0 301-B         
## 11 Clothing         T-Shirt                 301 FashionCo    20.0 301-C         
## 12 Clothing         Jeans                   302 DenimWorks   50.0 302-A         
## 13 Clothing         Jeans                   302 DenimWorks   50.0 302-B         
## 14 Books            Fiction Novel           401 -            15.0 401-A         
## 15 Books            Fiction Novel           401 -            15.0 401-B         
## 16 Books            Non-Fiction Guide       402 -            25.0 402-A         
## 17 Books            Non-Fiction Guide       402 -            25.0 402-B         
## 18 Sports Equipment Basketball              501 SportsGear   30.0 501-A         
## 19 Sports Equipment Tennis Racket           502 RacketPro    90.0 502-A         
## 20 Sports Equipment Tennis Racket           502 RacketPro    90.0 502-B         
## # ℹ 1 more variable: `Variation Details` <chr>
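Because Parquet stores data by column, a reader can load only the columns it needs. A minimal sketch using the `col_select` argument of `read_parquet()` follows, assuming a small stand-in data frame named `inventory` and a temporary file path, both illustrative.

```r
library(arrow)

# Illustrative stand-in for the inventory data frame.
inventory <- data.frame(
  "Item Name" = c("Smartphone", "Laptop"),
  "Brand"     = c("TechBrand", "CompuBrand"),
  "Price"     = c(699.99, 1099.99),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

parquet_path <- tempfile(fileext = ".parquet")
write_parquet(inventory, parquet_path)

# Load two of the three columns; Brand is never read into memory.
prices <- read_parquet(parquet_path, col_select = c("Item Name", "Price"))

# Column types are stored in the file, so Price is numeric without re-parsing.
str(prices)
```

On a 20-row table the saving is negligible, but the same call applies unchanged to much larger files.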

Pros and Cons of Each Format:

JSON:

Pros: Human-readable, flexible structure (supports nested data), widely used for APIs and configuration.

Cons: Text-based and repeats every field name in every record (as in the output above), so files grow quickly with row count; not optimized for storage or read speed.

HTML:

Pros: Good for displaying data on web pages, easy to style with CSS, supported in web browsers.

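Cons: Not ideal for analysis. Values are stored as display text (Item ID is rendered as 101.00 above), so data must be extracted from the table markup and its types re-parsed; the markup also adds considerable size.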

XML:

Pros: Hierarchical structure, widely supported, good for data interchange.

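Cons: Verbose (every value is wrapped in an opening and a closing tag), generally more work to parse than JSON, and inefficient for large datasets. Element names cannot contain spaces, so column names such as Item Name must be renamed before export.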

Parquet:

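Pros: Columnar binary format that is efficient for large datasets: it supports compression and complex data types, stores column types with the data, and lets readers load only the columns they need.

Cons: Not human-readable; requires specialized libraries or tools to read (here, the arrow package).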