These data will be scraped from the website of the creators of CryptoPunks.
Additional historical data will include Ethereum/USD rates and recent digital art token (NFT) sales prices.
You should phrase your research question in a way that matches up with the scope of inference your dataset allows for.
How much (predictive) signal can we find amid the noisy, speculative confluence of cryptocurrency and art auctions?
What are the cases, and how many are there?
There are 10,000 CryptoPunks (NFTs, or non-fungible tokens), and we have a record of every sale price in the four years since they were given away for free. On May 13, Christie’s auction house will auction nine of the Punks, which will serve as our ultimate test cases.
Describe the method of data collection.
Scrape CryptoPunk data from LarvaLabs.com and download data from other sites about cryptocurrency rates and digital art sales trends.
What type of study is this (observational/experiment)?
Observational
What is the response variable? Is it quantitative or qualitative?
Sales price, a quantitative variable showing how many Ether the buyer paid for the Punk.
CryptoPunks have several qualitative attributes, such as hair type, hat type, pipe/cigarette, and type (alien/zombie/ape/female/male), each of which appeals to different buyers. What is likely more important, though, is the (quantitative) scarcity of each attribute. For example, there are only 9 aliens in the whole set of 10,000 Punks, one of which will be sold at the Christie’s auction on May 13; we expect it to fetch the highest price. Then there is the Ethereum price factor, which is also quantitative: the Punks are sold for Ether, whose price (in dollars) is at an all-time high amid a speculative frenzy in cryptocurrency. Finally, a token representing Beeple’s digital artwork sold for $69M in Ether last month at Christie’s, so we will also observe recent price trends in digital art tokens, a subset of NFTs.
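As a rough illustration of what "scarcity" could mean as a quantitative variable, the sketch below converts attribute counts into a share of the full set. The Alien count is the one quoted above, and the other counts are taken from the attribute table shown later in this document. The names `total_punks`, `attribute_counts`, and `scarcity` are hypothetical, and this is only one of several ways scarcity might be encoded.

```r
# Minimal sketch: turn attribute counts into a scarcity share.
# All object names here are illustrative, not part of the scraped data.
total_punks <- 10000

# Counts quoted in this document (9 aliens; Beanie/Choker/Tiara from the attribute table)
attribute_counts <- c(Alien = 9, Beanie = 44, Choker = 48, Tiara = 55)

scarcity <- data.frame(
  attribute = names(attribute_counts),
  n         = as.numeric(attribute_counts),
  share     = as.numeric(attribute_counts) / total_punks
)

# "One in N" is sometimes easier to read than a small proportion
scarcity$one_in <- round(1 / scarcity$share)

print(scarcity)
```

A smaller `share` means a rarer attribute, so a model could use `share` (or its logarithm) as a predictor of sale price.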
Provide summary statistics for each of the variables. Also include appropriate visualizations related to your research question (e.g. scatter plot, boxplots, etc). This step requires the use of R, hence a code chunk is provided below. Insert more code chunks as needed.
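library(tidyverse)  # str_pad(), read_csv(), ggplot()
library(glue)       # glue()
library(knitr)      # include_graphics()

# IDs of the 9 Punks auctioned at Christie's
christiesPunks <- as.character(c(2, 532, 58, 30, 635, 602, 768, 603, 757))

# Display one or more Punk images given their IDs
displayPunk <- function(punk) {
  # The image URL requires a 4-digit, zero-padded Punk ID
  punk <- str_pad(punk, 4, "left", "0")
  Url <- glue('https://www.larvalabs.com/public/images/cryptopunks/punk{punk}.png')
  include_graphics(Url)
}
# Show the first eight of the nine auctioned Punks
displayPunk(christiesPunks[1:8])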
punks <- read_csv('punkTypes.csv')
## Warning: Missing column names filled in: 'X1' [1]
##
## ── Column specification ────────────────────────────────────────────────────────
## cols(
## X1 = col_double(),
## type = col_character(),
## ID = col_double()
## )
ggplot(punks, aes(type)) +
  geom_bar()
glue("The average CryptoPunk ID is {mean(as.numeric(punks$ID))}.
The average ID in the Christie's auction is {mean(as.numeric(christiesPunks))}.")
## The average CryptoPunk ID is 4999.5.
## The average ID in the Christie's auction is 443.
For some reason, the auctioned Punks all come from the low end of the set: every auctioned ID is below 1,000, while the IDs in the full set average 4999.5. This will be interesting to explore.
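One way to gauge how unusual this is: if the nine auctioned Punks had been drawn at random from IDs 0 through 9999, how likely would it be for all nine to fall below 1,000? The sketch below uses the hypergeometric distribution for that calculation. The random-draw assumption is ours, purely as a baseline; the auction lot was presumably not chosen at random.

```r
# Minimal sketch: probability that 9 randomly drawn IDs all fall below 1000.
# Assumes IDs run from 0 to 9999 and that draws are without replacement.
christies_ids <- c(2, 532, 58, 30, 635, 602, 768, 603, 757)

n_total <- 10000   # all Punk IDs
n_low   <- 1000    # IDs 0..999
n_drawn <- length(christies_ids)

# Confirm the observation: every auctioned ID is below 1000
all_low <- all(christies_ids < n_low)

# dhyper(x, m, n, k): x "low" IDs drawn, m low IDs, n other IDs, k draws
p_all_low <- dhyper(n_drawn, n_low, n_total - n_low, n_drawn)

print(all_low)
print(p_all_low)
```

A probability this small suggests that the low IDs are no accident, which is consistent with the selection deserving a closer look.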
pt <- read_csv('punkTrades.csv')
##
## ── Column specification ────────────────────────────────────────────────────────
## cols(
## ID = col_character(),
## Type = col_character(),
## From = col_character(),
## To = col_character(),
## Amount = col_character(),
## Txn = col_character()
## )
head(pt)
## # A tibble: 6 x 6
##   ID    Type    From      To       Amount        Txn
##   <chr> <chr>   <chr>     <chr>    <chr>         <chr>
## 1 0000  Sold    0xf5099e  14715954 25Ξ ($2,822)  Nov 30, 2018
## 2 0000  Sold    0x00d7c9  10528156 1.60Ξ ($386)  Jul 07, 2017
## 3 0000  Sold    0xc352b5  55241    0.98Ξ ($320)  Jun 23, 2017
## 4 0000  Claimed <NA>      12800693 <NA>          Jun 23, 2017
## 5 0001  Sold    EliteCat… 0xcf6165 60Ξ ($36,305) Nov 30, 2020
## 6 0001  Sold    0xf5099e  GoWest23 31Ξ ($5,155)  Apr 06, 2019
glue("The average number of times each Punk has been sold, in the data I have so far, is
{round(sum(pt$Type == 'Sold') / sum(pt$Type == 'Claimed'), 2)}")
## The average number of times each Punk has been sold, in the data I have so far, is
## 0.85
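The `Amount` column is read in as text (for example `25Ξ ($2,822)`), so it needs to be split into numeric Ether and dollar values before it can serve as the response variable. The sketch below shows one possible way to do that with base R regular expressions, using the six `Amount` values from the table above. The helper name `parse_amount` is hypothetical, and the pattern assumes every non-missing value follows the `<ether>Ξ ($<dollars>)` layout seen in these rows.

```r
# Minimal sketch: split "25Ξ ($2,822)" into numeric Ether and USD columns.
# parse_amount() is an illustrative helper, not part of any package.
parse_amount <- function(amount) {
  # Leading number (before the Ξ symbol) is the Ether price
  eth_chr <- sub("^([0-9.,]+).*$", "\\1", amount)
  # Number inside "($...)" is the dollar value at the time of sale
  usd_chr <- sub("^.*\\(\\$([0-9.,]+)\\).*$", "\\1", amount)

  eth <- as.numeric(gsub(",", "", eth_chr))
  usd <- as.numeric(gsub(",", "", usd_chr))

  # Implied ETH/USD rate on the sale date; NA rows (e.g. "Claimed") stay NA
  data.frame(eth = eth, usd = usd, usd_per_eth = usd / eth)
}

# Amount values copied from the head(pt) output
amount <- c("25Ξ ($2,822)", "1.60Ξ ($386)", "0.98Ξ ($320)",
            NA, "60Ξ ($36,305)", "31Ξ ($5,155)")

trades <- parse_amount(amount)
print(trades)
```

The `usd_per_eth` column gives the exchange rate implied by each sale, which could be cross-checked against the separately downloaded Ethereum/USD history.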
pa <- read_csv('punkAttributes.csv')
##
## ── Column specification ────────────────────────────────────────────────────────
## cols(
## Attribute = col_character(),
## `Number of Punks` = col_double(),
## `For Sale` = col_double(),
## `Avg Sale 90 Days` = col_double(),
## `Cheapest Offered Now` = col_character()
## )
head(pa)
## # A tibble: 6 x 5
##   Attribute   `Number of Punks` `For Sale` `Avg Sale 90 Day… `Cheapest Offered …
##   <chr>                   <dbl>      <dbl>             <dbl> <chr>
## 1 Beanie                     44         12              94.7 199Ξ
## 2 Choker                     48          5              46.8 111Ξ
## 3 Pilot Helm…                54         10              89.8 160Ξ
## 4 Tiara                      55          8              71.0 119.99Ξ
## 5 Orange Side                68         12              73.0 79.99Ξ
## 6 Buck Teeth                 78         10              38.3 75Ξ
# Fix the factor levels to the file's row order so the bars are not sorted alphabetically
pa$Attribute <- factor(pa$Attribute, levels = pa$Attribute)
ggplot(pa, aes(Attribute, `Number of Punks`)) +
  geom_col() +
  coord_flip()
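To connect the attribute counts back to the research question, the sketch below checks whether rarer attributes tend to sell for more, using only the six rows printed by `head(pa)` above. The data frame `pa_head` and its column names are hypothetical stand-ins for the full `pa` table, and six rows are far too few to support a conclusion; this only illustrates the kind of check we have in mind.

```r
# Minimal sketch: rank correlation between attribute count and recent average sale price.
# Values are copied from the head(pa) output; pa_head is an illustrative name.
pa_head <- data.frame(
  attribute    = c("Beanie", "Choker", "Pilot Helm…", "Tiara", "Orange Side", "Buck Teeth"),
  n_punks      = c(44, 48, 54, 55, 68, 78),
  avg_sale_90d = c(94.7, 46.8, 89.8, 71.0, 73.0, 38.3)
)

# Spearman's rho uses ranks, so it is less sensitive to a few extreme prices
rho <- cor(pa_head$n_punks, pa_head$avg_sale_90d, method = "spearman")
print(rho)
```

For these six rows the coefficient is negative, which is the direction we would expect if scarcity raises prices, but the full attribute table is needed before reading anything into it.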