Assignment: working with JSON, HTML, XML, and Parquet in R

You have received the following data from CUNYMart, located at 123 Example Street, Anytown, USA.

This data will be used for inventory analysis at the retailer. You are required to prepare the data for analysis by formatting it in JSON, HTML, XML, and Parquet. Additionally, provide the pros and cons of each format. You must include R code for generating the data and importing it into R.

library(readr)
library(jsonlite)
library(xtable)
library(XML)
library(arrow)
## 
## Attaching package: 'arrow'
## The following object is masked from 'package:utils':
## 
##     timestamp

Load Raw Data

raw_data <- "
Category,Item Name,Item ID,Brand,Price,Variation ID,Variation Details
Electronics,Smartphone,101,TechBrand,699.99,101-A,Color: Black, Storage: 64GB
Electronics,Smartphone,101,TechBrand,699.99,101-B,Color: White, Storage: 128GB
Electronics,Laptop,102,CompuBrand,1099.99,102-A,Color: Silver, Storage: 256GB
Electronics,Laptop,102,CompuBrand,1099.99,102-B,Color: Space Gray, Storage: 512GB
Home Appliances,Refrigerator,201,HomeCool,899.99,201-A,Color: Stainless Steel, Capacity: 20 cu ft
Home Appliances,Refrigerator,201,HomeCool,899.99,201-B,Color: White, Capacity: 18 cu ft
Home Appliances,Washing Machine,202,CleanTech,499.99,202-A,Type: Front Load, Capacity: 4.5 cu ft
Home Appliances,Washing Machine,202,CleanTech,499.99,202-B,Type: Top Load, Capacity: 5.0 cu ft
Clothing,T-Shirt,301,FashionCo,19.99,301-A,Color: Blue, Size: S
Clothing,T-Shirt,301,FashionCo,19.99,301-B,Color: Red, Size: M
Clothing,T-Shirt,301,FashionCo,19.99,301-C,Color: Green, Size: L
Clothing,Jeans,302,DenimWorks,49.99,302-A,Color: Dark Blue, Size: 32
Clothing,Jeans,302,DenimWorks,49.99,302-B,Color: Light Blue, Size: 34
Books,Fiction Novel,401,-,14.99,401-A,Format: Hardcover, Language: English
Books,Fiction Novel,401,-,14.99,401-B,Format: Paperback, Language: Spanish
Books,Non-Fiction Guide,402,-,24.99,402-A,Format: eBook, Language: English
Books,Non-Fiction Guide,402,-,24.99,402-B,Format: Paperback, Language: French
Sports Equipment,Basketball,501,SportsGear,29.99,501-A,Size: Size 7, Color: Orange
Sports Equipment,Tennis Racket,502,RacketPro,89.99,502-A,Material: Graphite, Color: Black
Sports Equipment,Tennis Racket,502,RacketPro,89.99,502-B,Material: Aluminum, Color: Silver
"

Create DataFrame

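# Variation Details contains unquoted commas, so read_csv() warns and folds the extra fields into the last column.
df <- read_csv(raw_data)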
## Warning: One or more parsing issues, call `problems()` on your data frame for details,
## e.g.:
##   dat <- vroom(...)
##   problems(dat)
## Rows: 20 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (5): Category, Item Name, Brand, Variation ID, Variation Details
## dbl (2): Item ID, Price
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
print(df)
## # A tibble: 20 × 7
##    Category         `Item Name`       `Item ID` Brand       Price `Variation ID`
##    <chr>            <chr>                 <dbl> <chr>       <dbl> <chr>         
##  1 Electronics      Smartphone              101 TechBrand   700.  101-A         
##  2 Electronics      Smartphone              101 TechBrand   700.  101-B         
##  3 Electronics      Laptop                  102 CompuBrand 1100.  102-A         
##  4 Electronics      Laptop                  102 CompuBrand 1100.  102-B         
##  5 Home Appliances  Refrigerator            201 HomeCool    900.  201-A         
##  6 Home Appliances  Refrigerator            201 HomeCool    900.  201-B         
##  7 Home Appliances  Washing Machine         202 CleanTech   500.  202-A         
##  8 Home Appliances  Washing Machine         202 CleanTech   500.  202-B         
##  9 Clothing         T-Shirt                 301 FashionCo    20.0 301-A         
## 10 Clothing         T-Shirt                 301 FashionCo    20.0 301-B         
## 11 Clothing         T-Shirt                 301 FashionCo    20.0 301-C         
## 12 Clothing         Jeans                   302 DenimWorks   50.0 302-A         
## 13 Clothing         Jeans                   302 DenimWorks   50.0 302-B         
## 14 Books            Fiction Novel           401 -            15.0 401-A         
## 15 Books            Fiction Novel           401 -            15.0 401-B         
## 16 Books            Non-Fiction Guide       402 -            25.0 402-A         
## 17 Books            Non-Fiction Guide       402 -            25.0 402-B         
## 18 Sports Equipment Basketball              501 SportsGear   30.0 501-A         
## 19 Sports Equipment Tennis Racket           502 RacketPro    90.0 502-A         
## 20 Sports Equipment Tennis Racket           502 RacketPro    90.0 502-B         
## # ℹ 1 more variable: `Variation Details` <chr>
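The parsing warning reported by read_csv() above comes from the unquoted commas inside Variation Details. A minimal sketch of one way to avoid it, assuming the source text can be edited so that the last field is wrapped in double quotes; csv_quoted and df_quoted are illustrative names that do not appear in the original code, and only two of the rows are reproduced here:

```r
library(readr)

# Two rows from the inventory, with the last field quoted so its inner
# comma is not treated as a column separator (csv_quoted is a hypothetical name)
csv_quoted <- 'Category,Item Name,Item ID,Brand,Price,Variation ID,Variation Details
Electronics,Smartphone,101,TechBrand,699.99,101-A,"Color: Black, Storage: 64GB"
Electronics,Laptop,102,CompuBrand,1099.99,102-A,"Color: Silver, Storage: 256GB"
'

# I() marks the string as literal data rather than a file path
df_quoted <- read_csv(I(csv_quoted), show_col_types = FALSE)

# problems() lists any parsing issues; here it should be empty
problems(df_quoted)
```

With the field quoted, each row has exactly seven fields and the parser no longer has to guess where Variation Details begins.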

JSON

df_json <- toJSON(df, pretty = TRUE)
cat(df_json)
## [
##   {
##     "Category": "Electronics",
##     "Item Name": "Smartphone",
##     "Item ID": 101,
##     "Brand": "TechBrand",
##     "Price": 699.99,
##     "Variation ID": "101-A",
##     "Variation Details": "Color: Black, Storage: 64GB"
##   },
##   {
##     "Category": "Electronics",
##     "Item Name": "Smartphone",
##     "Item ID": 101,
##     "Brand": "TechBrand",
##     "Price": 699.99,
##     "Variation ID": "101-B",
##     "Variation Details": "Color: White, Storage: 128GB"
##   },
##   {
##     "Category": "Electronics",
##     "Item Name": "Laptop",
##     "Item ID": 102,
##     "Brand": "CompuBrand",
##     "Price": 1099.99,
##     "Variation ID": "102-A",
##     "Variation Details": "Color: Silver, Storage: 256GB"
##   },
##   {
##     "Category": "Electronics",
##     "Item Name": "Laptop",
##     "Item ID": 102,
##     "Brand": "CompuBrand",
##     "Price": 1099.99,
##     "Variation ID": "102-B",
##     "Variation Details": "Color: Space Gray, Storage: 512GB"
##   },
##   {
##     "Category": "Home Appliances",
##     "Item Name": "Refrigerator",
##     "Item ID": 201,
##     "Brand": "HomeCool",
##     "Price": 899.99,
##     "Variation ID": "201-A",
##     "Variation Details": "Color: Stainless Steel, Capacity: 20 cu ft"
##   },
##   {
##     "Category": "Home Appliances",
##     "Item Name": "Refrigerator",
##     "Item ID": 201,
##     "Brand": "HomeCool",
##     "Price": 899.99,
##     "Variation ID": "201-B",
##     "Variation Details": "Color: White, Capacity: 18 cu ft"
##   },
##   {
##     "Category": "Home Appliances",
##     "Item Name": "Washing Machine",
##     "Item ID": 202,
##     "Brand": "CleanTech",
##     "Price": 499.99,
##     "Variation ID": "202-A",
##     "Variation Details": "Type: Front Load, Capacity: 4.5 cu ft"
##   },
##   {
##     "Category": "Home Appliances",
##     "Item Name": "Washing Machine",
##     "Item ID": 202,
##     "Brand": "CleanTech",
##     "Price": 499.99,
##     "Variation ID": "202-B",
##     "Variation Details": "Type: Top Load, Capacity: 5.0 cu ft"
##   },
##   {
##     "Category": "Clothing",
##     "Item Name": "T-Shirt",
##     "Item ID": 301,
##     "Brand": "FashionCo",
##     "Price": 19.99,
##     "Variation ID": "301-A",
##     "Variation Details": "Color: Blue, Size: S"
##   },
##   {
##     "Category": "Clothing",
##     "Item Name": "T-Shirt",
##     "Item ID": 301,
##     "Brand": "FashionCo",
##     "Price": 19.99,
##     "Variation ID": "301-B",
##     "Variation Details": "Color: Red, Size: M"
##   },
##   {
##     "Category": "Clothing",
##     "Item Name": "T-Shirt",
##     "Item ID": 301,
##     "Brand": "FashionCo",
##     "Price": 19.99,
##     "Variation ID": "301-C",
##     "Variation Details": "Color: Green, Size: L"
##   },
##   {
##     "Category": "Clothing",
##     "Item Name": "Jeans",
##     "Item ID": 302,
##     "Brand": "DenimWorks",
##     "Price": 49.99,
##     "Variation ID": "302-A",
##     "Variation Details": "Color: Dark Blue, Size: 32"
##   },
##   {
##     "Category": "Clothing",
##     "Item Name": "Jeans",
##     "Item ID": 302,
##     "Brand": "DenimWorks",
##     "Price": 49.99,
##     "Variation ID": "302-B",
##     "Variation Details": "Color: Light Blue, Size: 34"
##   },
##   {
##     "Category": "Books",
##     "Item Name": "Fiction Novel",
##     "Item ID": 401,
##     "Brand": "-",
##     "Price": 14.99,
##     "Variation ID": "401-A",
##     "Variation Details": "Format: Hardcover, Language: English"
##   },
##   {
##     "Category": "Books",
##     "Item Name": "Fiction Novel",
##     "Item ID": 401,
##     "Brand": "-",
##     "Price": 14.99,
##     "Variation ID": "401-B",
##     "Variation Details": "Format: Paperback, Language: Spanish"
##   },
##   {
##     "Category": "Books",
##     "Item Name": "Non-Fiction Guide",
##     "Item ID": 402,
##     "Brand": "-",
##     "Price": 24.99,
##     "Variation ID": "402-A",
##     "Variation Details": "Format: eBook, Language: English"
##   },
##   {
##     "Category": "Books",
##     "Item Name": "Non-Fiction Guide",
##     "Item ID": 402,
##     "Brand": "-",
##     "Price": 24.99,
##     "Variation ID": "402-B",
##     "Variation Details": "Format: Paperback, Language: French"
##   },
##   {
##     "Category": "Sports Equipment",
##     "Item Name": "Basketball",
##     "Item ID": 501,
##     "Brand": "SportsGear",
##     "Price": 29.99,
##     "Variation ID": "501-A",
##     "Variation Details": "Size: Size 7, Color: Orange"
##   },
##   {
##     "Category": "Sports Equipment",
##     "Item Name": "Tennis Racket",
##     "Item ID": 502,
##     "Brand": "RacketPro",
##     "Price": 89.99,
##     "Variation ID": "502-A",
##     "Variation Details": "Material: Graphite, Color: Black"
##   },
##   {
##     "Category": "Sports Equipment",
##     "Item Name": "Tennis Racket",
##     "Item ID": 502,
##     "Brand": "RacketPro",
##     "Price": 89.99,
##     "Variation ID": "502-B",
##     "Variation Details": "Material: Aluminum, Color: Silver"
##   }
## ]
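The code above generates the JSON but does not read it back. A minimal sketch of the import step with jsonlite, assuming the JSON is first written to a temporary file; sample_df, json_path, and df_from_json are illustrative names, and sample_df is a two-row stand-in for df:

```r
library(jsonlite)

# Two-row stand-in for df (hypothetical name: sample_df)
sample_df <- data.frame(
  Category = c("Electronics", "Books"),
  `Item Name` = c("Smartphone", "Fiction Novel"),
  `Item ID` = c(101, 401),
  Price = c(699.99, 14.99),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Write the data frame as a JSON array of row objects
json_path <- tempfile(fileext = ".json")
write_json(sample_df, json_path, pretty = TRUE)

# Read it back; fromJSON() simplifies an array of objects to a data frame
df_from_json <- fromJSON(json_path)
str(df_from_json)
```

Column names with spaces survive the round trip because JSON keys are arbitrary strings. Numeric types may differ slightly after import (whole numbers can come back as integer rather than double).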

HTML

df_html <- print(xtable(df), type = 'html')
## <!-- html table generated in R 4.4.1 by xtable 1.8-4 package -->
## <!-- Mon Oct 28 02:47:49 2024 -->
## <table border=1>
## <tr> <th>  </th> <th> Category </th> <th> Item Name </th> <th> Item ID </th> <th> Brand </th> <th> Price </th> <th> Variation ID </th> <th> Variation Details </th>  </tr>
##   <tr> <td align="right"> 1 </td> <td> Electronics </td> <td> Smartphone </td> <td align="right"> 101.00 </td> <td> TechBrand </td> <td align="right"> 699.99 </td> <td> 101-A </td> <td> Color: Black, Storage: 64GB </td> </tr>
##   <tr> <td align="right"> 2 </td> <td> Electronics </td> <td> Smartphone </td> <td align="right"> 101.00 </td> <td> TechBrand </td> <td align="right"> 699.99 </td> <td> 101-B </td> <td> Color: White, Storage: 128GB </td> </tr>
##   <tr> <td align="right"> 3 </td> <td> Electronics </td> <td> Laptop </td> <td align="right"> 102.00 </td> <td> CompuBrand </td> <td align="right"> 1099.99 </td> <td> 102-A </td> <td> Color: Silver, Storage: 256GB </td> </tr>
##   <tr> <td align="right"> 4 </td> <td> Electronics </td> <td> Laptop </td> <td align="right"> 102.00 </td> <td> CompuBrand </td> <td align="right"> 1099.99 </td> <td> 102-B </td> <td> Color: Space Gray, Storage: 512GB </td> </tr>
##   <tr> <td align="right"> 5 </td> <td> Home Appliances </td> <td> Refrigerator </td> <td align="right"> 201.00 </td> <td> HomeCool </td> <td align="right"> 899.99 </td> <td> 201-A </td> <td> Color: Stainless Steel, Capacity: 20 cu ft </td> </tr>
##   <tr> <td align="right"> 6 </td> <td> Home Appliances </td> <td> Refrigerator </td> <td align="right"> 201.00 </td> <td> HomeCool </td> <td align="right"> 899.99 </td> <td> 201-B </td> <td> Color: White, Capacity: 18 cu ft </td> </tr>
##   <tr> <td align="right"> 7 </td> <td> Home Appliances </td> <td> Washing Machine </td> <td align="right"> 202.00 </td> <td> CleanTech </td> <td align="right"> 499.99 </td> <td> 202-A </td> <td> Type: Front Load, Capacity: 4.5 cu ft </td> </tr>
##   <tr> <td align="right"> 8 </td> <td> Home Appliances </td> <td> Washing Machine </td> <td align="right"> 202.00 </td> <td> CleanTech </td> <td align="right"> 499.99 </td> <td> 202-B </td> <td> Type: Top Load, Capacity: 5.0 cu ft </td> </tr>
##   <tr> <td align="right"> 9 </td> <td> Clothing </td> <td> T-Shirt </td> <td align="right"> 301.00 </td> <td> FashionCo </td> <td align="right"> 19.99 </td> <td> 301-A </td> <td> Color: Blue, Size: S </td> </tr>
##   <tr> <td align="right"> 10 </td> <td> Clothing </td> <td> T-Shirt </td> <td align="right"> 301.00 </td> <td> FashionCo </td> <td align="right"> 19.99 </td> <td> 301-B </td> <td> Color: Red, Size: M </td> </tr>
##   <tr> <td align="right"> 11 </td> <td> Clothing </td> <td> T-Shirt </td> <td align="right"> 301.00 </td> <td> FashionCo </td> <td align="right"> 19.99 </td> <td> 301-C </td> <td> Color: Green, Size: L </td> </tr>
##   <tr> <td align="right"> 12 </td> <td> Clothing </td> <td> Jeans </td> <td align="right"> 302.00 </td> <td> DenimWorks </td> <td align="right"> 49.99 </td> <td> 302-A </td> <td> Color: Dark Blue, Size: 32 </td> </tr>
##   <tr> <td align="right"> 13 </td> <td> Clothing </td> <td> Jeans </td> <td align="right"> 302.00 </td> <td> DenimWorks </td> <td align="right"> 49.99 </td> <td> 302-B </td> <td> Color: Light Blue, Size: 34 </td> </tr>
##   <tr> <td align="right"> 14 </td> <td> Books </td> <td> Fiction Novel </td> <td align="right"> 401.00 </td> <td> - </td> <td align="right"> 14.99 </td> <td> 401-A </td> <td> Format: Hardcover, Language: English </td> </tr>
##   <tr> <td align="right"> 15 </td> <td> Books </td> <td> Fiction Novel </td> <td align="right"> 401.00 </td> <td> - </td> <td align="right"> 14.99 </td> <td> 401-B </td> <td> Format: Paperback, Language: Spanish </td> </tr>
##   <tr> <td align="right"> 16 </td> <td> Books </td> <td> Non-Fiction Guide </td> <td align="right"> 402.00 </td> <td> - </td> <td align="right"> 24.99 </td> <td> 402-A </td> <td> Format: eBook, Language: English </td> </tr>
##   <tr> <td align="right"> 17 </td> <td> Books </td> <td> Non-Fiction Guide </td> <td align="right"> 402.00 </td> <td> - </td> <td align="right"> 24.99 </td> <td> 402-B </td> <td> Format: Paperback, Language: French </td> </tr>
##   <tr> <td align="right"> 18 </td> <td> Sports Equipment </td> <td> Basketball </td> <td align="right"> 501.00 </td> <td> SportsGear </td> <td align="right"> 29.99 </td> <td> 501-A </td> <td> Size: Size 7, Color: Orange </td> </tr>
##   <tr> <td align="right"> 19 </td> <td> Sports Equipment </td> <td> Tennis Racket </td> <td align="right"> 502.00 </td> <td> RacketPro </td> <td align="right"> 89.99 </td> <td> 502-A </td> <td> Material: Graphite, Color: Black </td> </tr>
##   <tr> <td align="right"> 20 </td> <td> Sports Equipment </td> <td> Tennis Racket </td> <td align="right"> 502.00 </td> <td> RacketPro </td> <td align="right"> 89.99 </td> <td> 502-B </td> <td> Material: Aluminum, Color: Silver </td> </tr>
##    </table>
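The HTML table above is only generated, not imported. A rough sketch of reading such a table back into R, assuming the table is written to a file and parsed with readHTMLTable() from the XML package already loaded in this report; sample_df, html_path, and df_from_html are illustrative names, and the row-name column is dropped on export to keep the round trip simple:

```r
library(xtable)
library(XML)

# Two-row stand-in for df (hypothetical name: sample_df)
sample_df <- data.frame(
  Category = c("Electronics", "Books"),
  `Item Name` = c("Smartphone", "Fiction Novel"),
  `Item ID` = c(101, 401),
  Price = c(699.99, 14.99),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Write the HTML table to a file, without row names or the header comment
html_path <- tempfile(fileext = ".html")
print(xtable(sample_df), type = "html", file = html_path,
      include.rownames = FALSE, comment = FALSE)

# Parse the first table in the file; every column comes back as character
df_from_html <- readHTMLTable(html_path, which = 1, header = TRUE,
                              stringsAsFactors = FALSE)

# Strip padding from header cells and restore the numeric column by hand
names(df_from_html) <- trimws(names(df_from_html))
df_from_html$Price <- as.numeric(df_from_html$Price)
```

HTML carries no type information, so numeric columns have to be converted manually after import.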

XML

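xml_doc <- newXMLDoc()
root <- newXMLNode("inventory", doc = xml_doc)
# Note: column names with spaces (e.g. "Item Name") are not valid XML element
# names. suppressWarnings() silences the warnings raised in the loop, and the
# document printed below is not well-formed XML.
suppressWarnings(
for (row in seq_len(nrow(df))) {
  # One <item> element per row, with one child element per column
  raw_node <- newXMLNode("item", parent = root)
  for (col in names(df)) {
    newXMLNode(col, df[row, col], parent = raw_node)
  }
}
)
df_xml <- saveXML(xml_doc)
cat(df_xml)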
## <?xml version="1.0"?>
## <inventory>
##   <item>
##     <Category>Electronics</Category>
##     <Item Name>Smartphone</Item Name>
##     <Item ID>101</Item ID>
##     <Brand>TechBrand</Brand>
##     <Price>699.99</Price>
##     <Variation ID>101-A</Variation ID>
##     <Variation Details>Color: Black, Storage: 64GB</Variation Details>
##   </item>
##   <item>
##     <Category>Electronics</Category>
##     <Item Name>Smartphone</Item Name>
##     <Item ID>101</Item ID>
##     <Brand>TechBrand</Brand>
##     <Price>699.99</Price>
##     <Variation ID>101-B</Variation ID>
##     <Variation Details>Color: White, Storage: 128GB</Variation Details>
##   </item>
##   <item>
##     <Category>Electronics</Category>
##     <Item Name>Laptop</Item Name>
##     <Item ID>102</Item ID>
##     <Brand>CompuBrand</Brand>
##     <Price>1099.99</Price>
##     <Variation ID>102-A</Variation ID>
##     <Variation Details>Color: Silver, Storage: 256GB</Variation Details>
##   </item>
##   <item>
##     <Category>Electronics</Category>
##     <Item Name>Laptop</Item Name>
##     <Item ID>102</Item ID>
##     <Brand>CompuBrand</Brand>
##     <Price>1099.99</Price>
##     <Variation ID>102-B</Variation ID>
##     <Variation Details>Color: Space Gray, Storage: 512GB</Variation Details>
##   </item>
##   <item>
##     <Category>Home Appliances</Category>
##     <Item Name>Refrigerator</Item Name>
##     <Item ID>201</Item ID>
##     <Brand>HomeCool</Brand>
##     <Price>899.99</Price>
##     <Variation ID>201-A</Variation ID>
##     <Variation Details>Color: Stainless Steel, Capacity: 20 cu ft</Variation Details>
##   </item>
##   <item>
##     <Category>Home Appliances</Category>
##     <Item Name>Refrigerator</Item Name>
##     <Item ID>201</Item ID>
##     <Brand>HomeCool</Brand>
##     <Price>899.99</Price>
##     <Variation ID>201-B</Variation ID>
##     <Variation Details>Color: White, Capacity: 18 cu ft</Variation Details>
##   </item>
##   <item>
##     <Category>Home Appliances</Category>
##     <Item Name>Washing Machine</Item Name>
##     <Item ID>202</Item ID>
##     <Brand>CleanTech</Brand>
##     <Price>499.99</Price>
##     <Variation ID>202-A</Variation ID>
##     <Variation Details>Type: Front Load, Capacity: 4.5 cu ft</Variation Details>
##   </item>
##   <item>
##     <Category>Home Appliances</Category>
##     <Item Name>Washing Machine</Item Name>
##     <Item ID>202</Item ID>
##     <Brand>CleanTech</Brand>
##     <Price>499.99</Price>
##     <Variation ID>202-B</Variation ID>
##     <Variation Details>Type: Top Load, Capacity: 5.0 cu ft</Variation Details>
##   </item>
##   <item>
##     <Category>Clothing</Category>
##     <Item Name>T-Shirt</Item Name>
##     <Item ID>301</Item ID>
##     <Brand>FashionCo</Brand>
##     <Price>19.99</Price>
##     <Variation ID>301-A</Variation ID>
##     <Variation Details>Color: Blue, Size: S</Variation Details>
##   </item>
##   <item>
##     <Category>Clothing</Category>
##     <Item Name>T-Shirt</Item Name>
##     <Item ID>301</Item ID>
##     <Brand>FashionCo</Brand>
##     <Price>19.99</Price>
##     <Variation ID>301-B</Variation ID>
##     <Variation Details>Color: Red, Size: M</Variation Details>
##   </item>
##   <item>
##     <Category>Clothing</Category>
##     <Item Name>T-Shirt</Item Name>
##     <Item ID>301</Item ID>
##     <Brand>FashionCo</Brand>
##     <Price>19.99</Price>
##     <Variation ID>301-C</Variation ID>
##     <Variation Details>Color: Green, Size: L</Variation Details>
##   </item>
##   <item>
##     <Category>Clothing</Category>
##     <Item Name>Jeans</Item Name>
##     <Item ID>302</Item ID>
##     <Brand>DenimWorks</Brand>
##     <Price>49.99</Price>
##     <Variation ID>302-A</Variation ID>
##     <Variation Details>Color: Dark Blue, Size: 32</Variation Details>
##   </item>
##   <item>
##     <Category>Clothing</Category>
##     <Item Name>Jeans</Item Name>
##     <Item ID>302</Item ID>
##     <Brand>DenimWorks</Brand>
##     <Price>49.99</Price>
##     <Variation ID>302-B</Variation ID>
##     <Variation Details>Color: Light Blue, Size: 34</Variation Details>
##   </item>
##   <item>
##     <Category>Books</Category>
##     <Item Name>Fiction Novel</Item Name>
##     <Item ID>401</Item ID>
##     <Brand>-</Brand>
##     <Price>14.99</Price>
##     <Variation ID>401-A</Variation ID>
##     <Variation Details>Format: Hardcover, Language: English</Variation Details>
##   </item>
##   <item>
##     <Category>Books</Category>
##     <Item Name>Fiction Novel</Item Name>
##     <Item ID>401</Item ID>
##     <Brand>-</Brand>
##     <Price>14.99</Price>
##     <Variation ID>401-B</Variation ID>
##     <Variation Details>Format: Paperback, Language: Spanish</Variation Details>
##   </item>
##   <item>
##     <Category>Books</Category>
##     <Item Name>Non-Fiction Guide</Item Name>
##     <Item ID>402</Item ID>
##     <Brand>-</Brand>
##     <Price>24.99</Price>
##     <Variation ID>402-A</Variation ID>
##     <Variation Details>Format: eBook, Language: English</Variation Details>
##   </item>
##   <item>
##     <Category>Books</Category>
##     <Item Name>Non-Fiction Guide</Item Name>
##     <Item ID>402</Item ID>
##     <Brand>-</Brand>
##     <Price>24.99</Price>
##     <Variation ID>402-B</Variation ID>
##     <Variation Details>Format: Paperback, Language: French</Variation Details>
##   </item>
##   <item>
##     <Category>Sports Equipment</Category>
##     <Item Name>Basketball</Item Name>
##     <Item ID>501</Item ID>
##     <Brand>SportsGear</Brand>
##     <Price>29.99</Price>
##     <Variation ID>501-A</Variation ID>
##     <Variation Details>Size: Size 7, Color: Orange</Variation Details>
##   </item>
##   <item>
##     <Category>Sports Equipment</Category>
##     <Item Name>Tennis Racket</Item Name>
##     <Item ID>502</Item ID>
##     <Brand>RacketPro</Brand>
##     <Price>89.99</Price>
##     <Variation ID>502-A</Variation ID>
##     <Variation Details>Material: Graphite, Color: Black</Variation Details>
##   </item>
##   <item>
##     <Category>Sports Equipment</Category>
##     <Item Name>Tennis Racket</Item Name>
##     <Item ID>502</Item ID>
##     <Brand>RacketPro</Brand>
##     <Price>89.99</Price>
##     <Variation ID>502-B</Variation ID>
##     <Variation Details>Material: Aluminum, Color: Silver</Variation Details>
##   </item>
## </inventory>
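Element names such as `<Item Name>` in the output above contain spaces, which XML does not allow, so a standard parser would likely reject that document. A minimal sketch of one workaround, assuming it is acceptable to replace spaces in column names with underscores before building the tree, followed by an import with xmlToDataFrame(); sample_df, xml_path, and df_from_xml are illustrative names:

```r
library(XML)

# Two-row stand-in for df (hypothetical name: sample_df)
sample_df <- data.frame(
  Category = c("Electronics", "Books"),
  `Item Name` = c("Smartphone", "Fiction Novel"),
  `Item ID` = c(101, 401),
  Price = c(699.99, 14.99),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

xml_doc <- newXMLDoc()
root <- newXMLNode("inventory", doc = xml_doc)
for (i in seq_len(nrow(sample_df))) {
  item <- newXMLNode("item", parent = root)
  for (col in names(sample_df)) {
    # Replace spaces so the tag is a valid XML element name
    tag <- gsub(" ", "_", col, fixed = TRUE)
    newXMLNode(tag, as.character(sample_df[[col]][i]), parent = item)
  }
}

# Save to a file, then read each <item> back as one data frame row
xml_path <- tempfile(fileext = ".xml")
saveXML(xml_doc, file = xml_path)
df_from_xml <- xmlToDataFrame(xml_path, stringsAsFactors = FALSE)

# All values are imported as text; convert numeric columns explicitly
df_from_xml$Price <- as.numeric(df_from_xml$Price)
```

The imported columns are named Item_Name and Item_ID; they can be renamed after import if the original names are needed.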

Parquet

write_parquet(df, "df.parquet")
df_parquet <- read_parquet("df.parquet")
print(df_parquet)
## # A tibble: 20 × 7
##    Category         `Item Name`       `Item ID` Brand       Price `Variation ID`
##  * <chr>            <chr>                 <dbl> <chr>       <dbl> <chr>         
##  1 Electronics      Smartphone              101 TechBrand   700.  101-A         
##  2 Electronics      Smartphone              101 TechBrand   700.  101-B         
##  3 Electronics      Laptop                  102 CompuBrand 1100.  102-A         
##  4 Electronics      Laptop                  102 CompuBrand 1100.  102-B         
##  5 Home Appliances  Refrigerator            201 HomeCool    900.  201-A         
##  6 Home Appliances  Refrigerator            201 HomeCool    900.  201-B         
##  7 Home Appliances  Washing Machine         202 CleanTech   500.  202-A         
##  8 Home Appliances  Washing Machine         202 CleanTech   500.  202-B         
##  9 Clothing         T-Shirt                 301 FashionCo    20.0 301-A         
## 10 Clothing         T-Shirt                 301 FashionCo    20.0 301-B         
## 11 Clothing         T-Shirt                 301 FashionCo    20.0 301-C         
## 12 Clothing         Jeans                   302 DenimWorks   50.0 302-A         
## 13 Clothing         Jeans                   302 DenimWorks   50.0 302-B         
## 14 Books            Fiction Novel           401 -            15.0 401-A         
## 15 Books            Fiction Novel           401 -            15.0 401-B         
## 16 Books            Non-Fiction Guide       402 -            25.0 402-A         
## 17 Books            Non-Fiction Guide       402 -            25.0 402-B         
## 18 Sports Equipment Basketball              501 SportsGear   30.0 501-A         
## 19 Sports Equipment Tennis Racket           502 RacketPro    90.0 502-A         
## 20 Sports Equipment Tennis Racket           502 RacketPro    90.0 502-B         
## # ℹ 1 more variable: `Variation Details` <chr>
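Because Parquet stores data column by column, a reader can load only the columns it needs. A short sketch of this using the col_select argument of read_parquet(), assuming a two-row stand-in for df; sample_df, parquet_path, and prices are illustrative names:

```r
library(arrow)

# Two-row stand-in for df (hypothetical name: sample_df)
sample_df <- data.frame(
  Category = c("Electronics", "Books"),
  `Item Name` = c("Smartphone", "Fiction Novel"),
  `Item ID` = c(101, 401),
  Price = c(699.99, 14.99),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

parquet_path <- tempfile(fileext = ".parquet")
write_parquet(sample_df, parquet_path)

# Read only two of the four columns from the file
prices <- read_parquet(parquet_path, col_select = c("Item Name", "Price"))
str(prices)
```

Unlike the text formats above, the column types are stored in the file, so Price comes back as a numeric column without any manual conversion.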

Conclusion

Pros of JSON: easy to read and write; widely supported by programming languages and databases; usually more compact than XML.

Cons of JSON: can be slow to parse for large datasets; limited data types (for example, no native date type); no built-in schema definition.

Pros of HTML: human-readable; supports rich media content; widely supported by browsers and tools.

Cons of HTML: not designed for structured data analysis; difficult to parse programmatically; large file sizes for tabular data.

Pros of XML: self-describing format with standardized schema languages; supports complex data structures; widely supported by databases and tools.

Cons of XML: verbose syntax leads to large file sizes; parsing can be slow for large datasets; less human-readable than simpler formats.

Pros of Parquet: fast read/write speeds; supports complex data types and nested structures; optimized for big data analytics.

Cons of Parquet: requires specialized libraries or tools to read and write; binary format, so not human-readable; steeper learning curve for beginners.

Key Points: JSON is good for flexible, semi-structured data; HTML is good for web-facing data presentation; XML is good for standardized, self-describing data; Parquet is good for large-scale analytics.