Required packages

library(readr)
library(dplyr)
library(tidyr)
library(outliers)
library(forecast)

Executive Summary

This report will focus on preparing a dataset containing gold and cryptocurrency price information for future price inference. First, the dataset will be created by merging two open-source datasets obtained from www.kaggle.com. The necessary data type conversions will be performed so that the datasets merge successfully. Afterwards, the structure of the new dataset will be examined, and any variables that require type conversion or level ordering will be converted. Once all variables are confirmed to be in the correct format, the whole dataset will be checked once more to make sure it is tidy. The data will be filtered to ensure the efficiency of future analysis. Furthermore, the dataset will be scanned for missing values and outliers. Any missing values and outliers detected will be replaced or excluded using an appropriate method. Finally, a distribution check will be performed on the tidy data. If the data are not normally distributed, a transformation will be applied to bring them as close to a normal distribution as possible.

Data

The two data sets are both open data sourced from www.kaggle.com.

Here are the links to the original data website and variable descriptions:

Cryptocurrency market information

The dataset covers the cryptocurrency market, with price information and relative price ratios.

Variables:

  • slug: name commonly used
  • symbol: the symbol representing the cryptocurrency
  • name: Name of the cryptocurrency
  • date: the day information was recorded
  • rank: cryptocurrency market rank from 1 to 2000
  • open: open price of the day ($USD)
  • high: highest price of the day ($USD)
  • low: lowest price of the day ($USD)
  • close: close price of the day ($USD)
  • volume: total volume traded in the day ($USD)
  • market: total market cap ($USD)
  • close_ratio: the daily close rate, min-maxed with the high and low values for the day, Close Ratio = (Close-Low)/(High-Low)
  • spread: the $USD difference between the high and low values for the day

Link: https://www.kaggle.com/jessevent/all-crypto-currencies
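
As a small worked example of the two derived variables described above, the sketch below computes spread and close_ratio for a single day. The prices are hypothetical values chosen for easy arithmetic; they are not taken from the dataset.

```r
# Hypothetical one-day prices in USD (illustrative values only)
high  <- 150
low   <- 130
close <- 145

# spread: difference between the day's high and low
spread <- high - low

# close_ratio: where the close sits between the low (0) and the high (1)
close_ratio <- (close - low) / (high - low)

print(c(spread = spread, close_ratio = close_ratio))
```

A close_ratio near 1 means the day closed near its high, and a value near 0 means it closed near its low. On a day where high equals low, the denominator is zero and the ratio is undefined.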

Gold price

This dataset is about the VAISHNAVI GOLD share price information and relative price ratios.

Variables:

  • Date: the day information was recorded
  • Open: open price of the day ($USD)
  • High: highest price of the day ($USD)
  • Low: lowest price of the day ($USD)
  • Close: close price of the day ($USD)
  • WAP: weighted average price ($USD)
  • No. of Shares: number of shares traded in the day
  • No. of Trades: number of trades that occurred in the day
  • Total Turnover: total trade volume
  • Deliverable Quantity: total number of shares that were marked for delivery on a day
  • % Deli. Qty to Traded Qty: ratio of deliverable quantity to traded quantity (%)
  • Spread H-L: the $USD difference between the high and low values for the day
  • Spread C-O: the $USD difference between the close and open values for the day

Link: https://www.kaggle.com/lakshmi25npathi/gold-price

After importing the two datasets, the two data frames are merged by date, retaining only the rows whose dates appear in both. To merge on equal dates, both “date” columns need to be of class Date. A class check shows that both “date” columns are factors. Therefore, both are converted to Date with the same format: Y-M-D.

“inner_join” is chosen for the merge because the resulting dataset is tidier and contains only the data relevant to further analysis.

# Import the two CSV files
cryptocurrency <- read.csv("cryptomarkets.csv", sep=",")
goldprice <- read.csv("Goldprice.csv", sep=",")

# Check the class of both date columns before merging
cryptocurrency$date %>% class
[1] "factor"
goldprice$date %>% class
[1] "factor"
# Gold dates are stored as day/month/year; crypto dates are already in Y-M-D order
goldprice$date <- as.Date(as.character(goldprice$date), format = "%d/%m/%Y")
cryptocurrency$date <- as.Date(cryptocurrency$date)
# Keep only the dates that appear in both data frames
crypto_gold <- inner_join(cryptocurrency, goldprice, by = "date")
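
To make the effect of the date conversion and the inner join easier to see, here is a minimal sketch on two toy data frames. All values are made up, and the sketch uses base R's merge() so that it runs without extra packages; with its default all = FALSE it keeps the same matched rows that inner_join() keeps.

```r
# Toy stand-ins for the two sources; all values are made up for illustration.
crypto <- data.frame(
  date  = c("2018-01-01", "2018-01-02", "2018-01-03"),  # year-month-day
  close = c(100, 110, 105),
  stringsAsFactors = TRUE   # mimic the factor columns seen in the report
)
gold <- data.frame(
  date  = c("02/01/2018", "03/01/2018", "04/01/2018"),  # day/month/year
  Close = c(3.9, 3.7, 3.6),
  stringsAsFactors = TRUE
)

# Convert both keys to Date, stating each source's format explicitly
crypto$date <- as.Date(as.character(crypto$date), format = "%Y-%m-%d")
gold$date   <- as.Date(as.character(gold$date),   format = "%d/%m/%Y")

# Inner join: only dates present in both tables survive
merged <- merge(crypto, gold, by = "date")
print(merged)
```

Only 2 and 3 January 2018 appear in both toy tables, so the merged result has two rows. Without the conversion, the text keys "2018-01-02" and "02/01/2018" would never match.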

Understand

The attributes in the data are checked by displaying the data structure.

Apart from the earlier conversion (from factor to date), the remaining attributes are in appropriate data types. “slug”, “symbol” and “name” are stored as factors because factors can use less memory than character vectors and carry levels, which help with ordering the data for further analysis. Also, there may be cryptocurrency names that contain both numbers and characters in the future.

Next, the data is ordered for better visualization.

First, check the levels in “symbol”. As shown in the data structure and level checks, there are 2005 levels in the dataset. It is impossible to label them all.

BTC, ETH and XRP are the three major cryptocurrencies in the market and hold the majority of the market cap. It is assumed that the analysis of gold price against cryptocurrency price will most likely be performed on these three cryptocurrencies. Thus, the dataset is filtered to contain only the price information of the top three cryptocurrencies.

Therefore, tidying and manipulation of the data is performed first, and level ordering is performed afterwards.

str(crypto_gold)
'data.frame':   137678 obs. of  25 variables:
 $ slug                      : Factor w/ 2071 levels "0chain","0x",..: 209 209 209 209 209 209 209 209 209 209 ...
 $ symbol                    : Factor w/ 2005 levels "$$$","$PAC","0XBTC",..: 275 275 275 275 275 275 275 275 275 275 ...
 $ name                      : Factor w/ 2071 levels "0chain","0x",..: 205 205 205 205 205 205 205 205 205 205 ...
 $ date                      : Date, format: "2013-04-29" "2013-04-30" ...
 $ ranknow                   : int  1 1 1 1 1 1 1 1 1 1 ...
 $ open                      : num  134 144 116 106 116 ...
 $ high                      : num  147 147 126 108 125 ...
 $ low                       : num  134 134.1 92.3 79.1 106.6 ...
 $ close                     : num  144.5 139 105.2 97.8 112.3 ...
 $ volume                    : num  0 0 0 0 0 0 0 0 0 0 ...
 $ market                    : num  1.60e+09 1.54e+09 1.17e+09 1.09e+09 1.25e+09 ...
 $ close_ratio               : num  0.781 0.384 0.388 0.642 0.314 ...
 $ spread                    : num  13.5 12.9 33.3 29 18 ...
 $ Open                      : num  3.9 3.71 3.71 3.66 3.56 3.49 3.43 3.37 3.31 3.25 ...
 $ High                      : num  3.9 3.71 3.71 3.66 3.56 3.49 3.43 3.37 3.31 3.25 ...
 $ Low                       : num  3.9 3.71 3.53 3.53 3.56 3.49 3.43 3.37 3.31 3.25 ...
 $ Close                     : num  3.9 3.71 3.59 3.63 3.56 3.49 3.43 3.37 3.31 3.25 ...
 $ WAP                       : num  3.9 3.71 3.57 3.63 3.56 ...
 $ No..of.Shares             : int  154304 8775 20063 14141 80819 205 5040 1175 10089 50 ...
 $ No..of.Trades             : int  25 11 46 79 11 2 10 14 83 1 ...
 $ Total.Turnover            : int  601785 32555 71612 51353 287715 715 17287 3959 33394 162 ...
 $ Deliverable.Quantity      : int  154304 8775 20063 14141 80819 205 5040 1175 10089 50 ...
 $ X..Deli..Qty.to.Traded.Qty: num  100 100 100 100 100 100 100 100 100 100 ...
 $ Spread.H.L                : num  0 0 0.18 0.13 0 0 0 0 0 0 ...
 $ Spread.C.O                : num  0 0 -0.12 -0.03 0 0 0 0 0 0 ...
crypto_gold$symbol %>% levels
   [1] "$$$"       "$PAC"      "0XBTC"     "1337"      "1ST"       "1WO"      
   [7] "2GIVE"     "2GO"       "300"       "42"        "611"       "808"      
  [13] "8BIT"      "AAA"       "AAC"       "ABBC"      "ABC"       "ABDT"     
  [19] "ABL"       "ABS"       "ABT"       "ABX"       "ABY"       "ABYSS"    
  [25] "AC"        "AC3"       "ACAT"      "ACC"       "ACDC"      "ACE"      
  [31] "ACED"      "ACES"      "ACM"       "ACOIN"     "ACP"       "ACRE"     
  [37] "ACT"       "ACTP"      "ADA"       "ADB"       "ADC"       "ADCN"     
  [43] "ADH"       "ADI"       "ADK"       "ADL"       "ADST"      "ADT"      
  [49] "ADX"       "ADZ"       "AE"        "AEC"       "AEG"       "AEON"     
  [55] "AGI"       "AGLT"      "AI"        "AIB"       "AID"       "AIDOC"    
  [61] "AION"      "AIT"       "AIX"       "AKA"       "ALC"       "ALI"      
  [67] "ALIS"      "ALL"       "ALT"       "ALTX"      "ALX"       "AMB"      
  [73] "AMLT"      "AMM"       "AMMO"      "AMN"       "AMO"       "AMP"      
  [79] "AMS"       "ANC"       "ANI"       "ANON"      "ANT"       "AOA"      
  [85] "AOG"       "APC"       "APH"       "APIS"      "APL"       "APOT"     
  [91] "APPC"      "APR"       "APX"       "ARB"       "ARC"       "ARCO"     
  [97] "ARCT"      "ARDR"      "AREPA"     "ARG"       "ARGUS"     "ARI"      
 [103] "ARION"     "ARK"       "ARN"       "ARO"       "ART"       "ARY"      
 [109] "ASA"       "ASAFE2"    "AST"       "AT"        "ATB"       "ATC"      
 [115] "ATCC"      "ATD"       "ATH"       "ATL"       "ATM"       "ATMI"     
 [121] "ATMOS"     "ATN"       "ATOM"      "ATP"       "ATS"       "ATX"      
 [127] "AU"        "AUC"       "AUR"       "AURA"      "AUTO"      "AUX"      
 [133] "AV"        "AVA"       "AVH"       "AVINOC"    "AVT"       "AXIOM"    
 [139] "AXPR"      "AZART"     "B@"        "B2B"       "B2X"       "BAAS"     
 [145] "BANCA"     "BANK"      "BAT"       "BAX"       "BAY"       "BBC"      
 [151] "BBK"       "BBN"       "BBO"       "BBP"       "BBR"       "BBS"      
 [157] "BC"        "BCA"       "BCAC"      "BCARD"     "BCD"       "BCDN"     
 [163] "BCDT"      "BCF"       "BCH"       "BCI"       "BCN"       "BCO"      
 [169] "BCPT"      "BCV"       "BCX"       "BCY"       "BCZERO"    "BDG"      
 [175] "BDL"       "BDT"       "BEAT"      "BEE"       "BEET"      "BELA"     
 [181] "BEN"       "BENJI"     "BENZ"      "BERN"      "BERRY"     "BET"      
 [187] "BETHER"    "BETR"      "BEZ"       "BFF"       "BFT"       "BGG"      
 [193] "BHPC"      "BIFI"      "BIGUP"     "BIO"       "BIR"       "BIRDS"    
 [199] "BIS"       "BIT"       "BITB"      "BITBTC"    "BITCF"     "BITCNY"   
 [205] "BITEUR"    "BITF"      "BITG"      "BITGOLD"   "BITS"      "BITSILVER"
 [211] "BITUSD"    "BITX"      "BIX"       "BKBT"      "BKX"       "BLACK"    
 [217] "BLAST"     "BLAZR"     "BLC"       "BLK"       "BLN"       "BLOC"     
 [223] "BLOCK"     "BLT"       "BLU"       "BLUE"      "BLZ"       "BMC"      
 [229] "BMH"       "BMX"       "BNB"       "BNC"       "BND"       "BNK"      
 [235] "BNN"       "BNT"       "BNTY"      "BOAT"      "BOB"       "BOC"      
 [241] "BOE"       "BOLI"      "BON"       "BOS"       "BOST"      "BOT"      
 [247] "BOUTS"     "BOX"       "BOXX"      "BPL"       "BPT"       "BQ"       
 [253] "BQT"       "BRAT"      "BRD"       "BRIA"      "BRIT"      "BRK"      
 [259] "BRM"       "BRO"       "BRX"       "BRZC"      "BSC"       "BSD"      
 [265] "BSM"       "BSN"       "BSTN"      "BSTY"      "BSV"       "BSX"      
 [271] "BTA"       "BTAD"      "BTB"       "BTBC"      "BTC"       "BTCM"     
 [277] "BTCN"      "BTCONE"    "BTCP"      "BTCRED"    "BTCS"      "BTCZ"     
 [283] "BTDX"      "BTG"       "BTK"       "BTM"       "BTN"       "BTNT"     
 [289] "BTO"       "BTPL"      "BTQ"       "BTR"       "BTRN"      "BTS"      
 [295] "BTT"       "BTW"       "BTWTY"     "BTX"       "BTXC"      "BU"       
 [301] "BUB"       "BUBO"      "BUMBA"     "BUN"       "BUNNY"     "BURST"    
 [307] "BUT"       "BUZZ"      "BWK"       "BWS"       "BWT"       "BWX"      
 [313] "BZ"        "BZL"       "BZNT"      "BZX"       "C2"        "C20"      
 [319] "C2C"       "C2P"       "C8"        "CAB"       "CAG"       "CAN"      
 [325] "CANDY"     "CANN"      "CAPP"      "CAR"       "CARAT"     "CARBON"   
 [331] "CARD"      "CARE"      "CAS"       "CASH"      "CAT"       "CATO"     
 [337] "CAZ"       "CBC"       "CBT"       "CBX"       "CCC"       "CCCX"     
 [343] "CCL"       "CCO"       "CCRB"      "CCT"       "CDC"       "CDM"      
 [349] "CDN"       "CDT"       "CDX"       "CEDEX"     "CEEK"      "CEFS"     
 [355] "CEL"       "CEN"       "CENNZ"     "CET"       "CF"        "CFC"      
 [361] "CFI"       "CFL"       "CFUN"      "CGEN"      "CHAT"      "CHE"      
 [367] "CHEESE"    "CHESS"     "CHEX"      "CHIPS"     "CHP"       "CHSB"     
 [373] "CHX"       "CIF"       "CIT"       "CIV"       "CJ"        "CJS"      
 [379] "CJT"       "CKUSD"     "CL"        "CLAM"      "CLN"       "CLO"      
 [385] "CLOAK"     "CLUB"      "CMCT"      "CMIT"      "CMM"       "CMPCO"    
 [391] "CMS"       "CMT"       "CND"       "CNET"      "CNN"       "CNNC"     
 [397] "CNO"       "CNT"       "CNX"       "COAL"      "COB"       "COBRA"    
 [403] "COFI"      "COIN"      "COLX"      "COMP"      "CONI"      "CONX"     
 [409] "COR"       "COSM"      "COSS"      "COTN"      "COU"       "COUPE"    
 [415] "COV"       "COVAL"     "CPAY"      "CPC"       "CPLO"      "CPN"      
 [421] "CPT"       "CPX"       "CPY"       "CRAVE"     "CRB"       "CRBT"     
 [427] "CRC"       "CRD"       "CRE"       "CREA"      "CRED"      "CREDO"    
 [433] "CREVA"     "CRM"       "CROAT"     "CROP"      "CRPT"      "CRW"      
 [439] "CRYP"      "CS"        "CSC"       "CSM"       "CSNO"      "CST"      
 [445] "CSTL"      "CTC"       "CTIC2"     "CTIC3"     "CTL"       "CTRT"     
 [451] "CTX"       "CTXC"      "CURE"      "CV"        "CVC"       "CVN"      
 [457] "CVT"       "CWV"       "CXO"       "CXT"       "CYFM"      "CYL"      
 [463] "CYMT"      "CZR"       "D"         "DAC"       "DACC"      "DACH"     
 [469] "DACS"      "DADI"      "DAG"       "DAGT"      "DAI"       "DALC"     
 [475] "DAN"       "DAPS"      "DAR"       "DART"      "DASC"      "DASH"     
 [481] "DAT"       "DATA"      "DATP"      "DATX"      "DAV"       "DAX"      
 [487] "DAXT"      "DAXX"      "DAY"       "DBC"       "DBET"      "DBIX"     
 [493] "DCC"       "DCN"       "DCR"       "DCT"       "DCY"       "DDD"      
 [499] "DDX"       "DEAL"      "DEB"       "DEC"       "DEEX"      "DELIZ"    
 [505] "DELTA"     "DEM"       "DENT"      "DERO"      "DEUS"      "DEV"      
 [511] "DEW"       "DEX"       "DFT"       "DGB"       "DGC"       "DGD"      
 [517] "DGS"       "DGTX"      "DGX"       "DICE"      "DIG"       "DIM"      
 [523] "DIME"      "DIN"       "DIT"       "DIVI"      "DIVX"      "DIX"      
 [529] "DKPC"      "DLC"       "DLT"       "DMB"       "DMC"       "DMD"      
 [535] "DML"       "DMT"       "DNA"       "DNT"       "DNZ"       "DOCK"     
 [541] "DOGE"      "DOLLAR"    "DOPE"      "DOR"       "DOT"       "DOV"      
 [547] "DOW"       "DP"        "DPN"       "DPY"       "DRG"       "DRGN"     
 [553] "DRM"       "DROP"      "DRPU"      "DRT"       "DRXNE"     "DSR"      
 [559] "DT"        "DTA"       "DTB"       "DTC"       "DTEM"      "DTH"      
 [565] "DTR"       "DTRC"      "DTX"       "DUO"       "DUTCH"     "DWS"      
 [571] "DX"        "DXT"       "DYN"       "EAG"       "EARTH"     "EBC"      
 [577] "EBET"      "EBST"      "EBTC"      "ECA"       "ECASH"     "ECC"      
 [583] "ECO"       "ECOB"      "ECOM"      "ECOREAL"   "ECT"       "EDG"      
 [589] "EDN"       "EDO"       "EDR"       "EDRC"      "EDS"       "EDT"      
 [595] "EDU"       "EFL"       "EFX"       "EFYT"      "EGC"       "EGCC"     
 [601] "EGEM"      "EGT"       "EGX"       "EJOY"      "EKO"       "EKT"      
 [607] "EL"        "ELA"       "ELE"       "ELEC"      "ELF"       "ELI"      
 [613] "ELITE"     "ELIX"      "ELLA"      "ELLI"      "ELS"       "ELTCOIN"  
 [619] "ELY"       "EMB"       "EMC"       "EMC2"      "EMD"       "EMPR"     
 [625] "ENG"       "ENGT"      "ENJ"       "ENT"       "ENTS"      "EOS"      
 [631] "EOSDAC"    "EPLUS"     "EPY"       "EQL"       "EQT"       "ERA"      
 [637] "ERC20"     "ERO"       "ERT"       "ERY"       "ESCE"      "ESCO"     
 [643] "ESN"       "ESP"       "ESS"       "EST"       "ESZ"       "ETA"      
 [649] "ETBS"      "ETC"       "ETG"       "ETH"       "ETHD"      "ETHM"     
 [655] "ETHO"      "ETHOS"     "ETI"       "ETK"       "ETN"       "ETP"      
 [661] "ETT"       "ETZ"       "EUC"       "EUNO"      "EURS"      "EVC"      
 [667] "EVE"       "EVI"       "EVIL"      "EVN"       "EVR"       "EVX"      
 [673] "EXC"       "EXCL"      "EXMR"      "EXP"       "EXRN"      "EXT"      
 [679] "EXY"       "EZT"       "EZW"       "F1C"       "FACE"      "FAIR"     
 [685] "FANS"      "FBN"       "FCT"       "FDX"       "FDZ"       "FGC"      
 [691] "FID"       "FIL"       "FJC"       "FKX"       "FLASH"     "FLAX"     
 [697] "FLDC"      "FLIK"      "FLIXX"     "FLM"       "FLO"       "FLOT"     
 [703] "FLP"       "FLT"       "FLUZ"      "FMF"       "FND"       "FNKOS"    
 [709] "FNTB"      "FOIN"      "FOOD"      "FOR"       "FORK"      "FOTA"     
 [715] "FOX"       "FOXT"      "FRC"       "FREC"      "FREE"      "FRGC"     
 [721] "FRN"       "FRST"      "FSBT"      "FSN"       "FST"       "FT"       
 [727] "FTC"       "FTI"       "FTM"       "FTO"       "FTT"       "FTX"      
 [733] "FTXT"      "FUEL"      "FUN"       "FUNDZ"     "FUZZ"      "FXT"      
 [739] "FYP"       "GAM"       "GAME"      "GAP"       "GARD"      "GARY"     
 [745] "GAS"       "GAT"       "GB"        "GBC"       "GBG"       "GBX"      
 [751] "GBYTE"     "GCC"       "GCN"       "GCR"       "GCS"       "GDC"      
 [757] "GEERT"     "GEM"       "GEN"       "GENE"      "GEO"       "GET"      
 [763] "GETX"      "GIC"       "GIN"       "GIO"       "GLA"       "GLD"      
 [769] "GLT"       "GMCN"      "GNO"       "GNR"       "GNT"       "GNX"      
 [775] "GO"        "GOD"       "GOLF"      "GOLOS"     "GOOD"      "GOSS"     
 [781] "GOT"       "GPKR"      "GRC"       "GRFT"      "GRID"      "GRIM"     
 [787] "GRLC"      "GRMD"      "GRPH"      "GRS"       "GRWI"      "GRX"      
 [793] "GSC"       "GSE"       "GSR"       "GST"       "GTC"       "GTM"      
 [799] "GTO"       "GUESS"     "GUP"       "GUSD"      "GVE"       "GVT"      
 [805] "GXS"       "GZE"       "GZRO"      "HAC"       "HAL"       "HALLO"    
 [811] "HAND"      "HAV"       "HAVY"      "HB"        "HBC"       "HBT"      
 [817] "HBZ"       "HC"        "HDAC"      "HEAT"      "HELP"      "HER"      
 [823] "HERO"      "HGT"       "HIRE"      "HIT"       "HKN"       "HLC"      
 [829] "HLM"       "HMC"       "HMQ"       "HNC"       "HNDC"      "HODL"     
 [835] "HOLD"      "HONEY"     "HORSE"     "HORUS"     "HOT"       "HPB"      
 [841] "HPC"       "HPY"       "HQT"       "HQX"       "HRC"       "HSC"      
 [847] "HSN"       "HST"       "HT"        "HTH"       "HTML"      "HUC"      
 [853] "HUM"       "HUR"       "HUSH"      "HUZU"      "HVCO"      "HVN"      
 [859] "HWC"       "HXX"       "HYB"       "HYC"       "HYDRO"     "HYP"      
 [865] "I0C"       "IBANK"     "IBTC"      "IC"        "ICN"       "ICNQ"     
 [871] "ICOB"      "ICON"      "ICOO"      "ICR"       "ICX"       "IDH"      
 [877] "IDOL"      "IDT"       "IDXM"      "IETH"      "IFC"       "IFLT"     
 [883] "IFOOD"     "IFP"       "IFT"       "IG"        "IGNIS"     "IHF"      
 [889] "IHT"       "IIC"       "ILC"       "IMP"       "IMS"       "IMT"      
 [895] "IMX"       "INB"       "INC"       "INCNT"     "INCO"      "INCX"     
 [901] "IND"       "INDI"      "INFX"      "ING"       "INK"       "INN"      
 [907] "INO"       "INS"       "INSN"      "INSTAR"    "INSUR"     "INT"      
 [913] "INV"       "INVE"      "INXT"      "IOC"       "IOG"       "ION"      
 [919] "IONC"      "IOP"       "IOST"      "IOTX"      "IOV"       "IPC"      
 [925] "IPL"       "IPSX"      "IQ"        "IQN"       "IQT"       "IRD"      
 [931] "IRL"       "ISR"       "ITC"       "ITI"       "ITL"       "ITT"      
 [937] "ITZ"       "IVY"       "IXC"       "IXE"       "IXT"       "J"        
 [943] "J8T"       "JC"        "JET"       "JEW"       "JIN"       "JIYO"     
 [949] "JIYOX"     "JNT"       "JOINT"     "JOT"       "JS"        "JSE"      
 [955] "KAN"       "KARMA"     "KB3"       "KBC"       "KBR"       "KCASH"    
 [961] "KCS"       "KED"       "KEK"       "KEY"       "KICK"      "KIN"      
 [967] "KIND"      "KLKS"      "KLN"       "KMD"       "KNC"       "KNDC"     
 [973] "KNOW"      "KNT"       "KOBO"      "KORE"      "KRB"       "KRL"      
 [979] "KRM"       "KRONE"     "KST"       "KUN"       "KURT"      "KWATT"    
 [985] "KWH"       "KXC"       "KZC"       "LA"        "LABH"      "LALA"     
 [991] "LANA"      "LATX"      "LBA"       "LBC"       "LBTC"      "LCC"      
 [997] "LCP"       "LCS"       "LDC"       "LDOGE"    
 [ reached getOption("max.print") -- omitted 1005 entries ]

Tidy & Manipulate Data I

The dataset is filtered to keep only the price information relating to BTC, ETH and XRP.

Then, the levels of “symbol” will be ordered as BTC - ETH - XRP.

Furthermore, the date column will be separated into year, month and day, so that price analysis over different time frames can be performed easily.

# Keep only the three cryptocurrencies of interest
crypto_gold_filtered <- crypto_gold %>% filter(symbol %in% c("BTC", "ETH", "XRP"))
# Re-create the factor so that unused levels are dropped and the order is BTC < ETH < XRP
crypto_gold_filtered$symbol <- factor(crypto_gold_filtered$symbol, levels = c("BTC", "ETH", "XRP"), ordered = TRUE)
crypto_gold_filtered$symbol %>% levels
[1] "BTC" "ETH" "XRP"
# Split the date into character columns Year, Month and Day
crypto_gold_tidy <- crypto_gold_filtered %>% separate(date, into = c("Year", "Month", "Day"), sep = "-")
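
One point worth knowing about the step above: separate() returns character columns unless convert = TRUE is set, and it drops the original date column by default. The following sketch shows a possible alternative, not the report's method, that keeps the date and adds integer Year, Month and Day columns. It uses base R only, and the data are toy values.

```r
# Toy data; values are illustrative only.
prices <- data.frame(
  symbol = c("ETH", "XRP", "BTC", "DOGE"),
  date   = as.Date(c("2018-01-31", "2018-02-01", "2018-02-01", "2018-02-01")),
  close  = c(1100, 1.1, 9100, 0.006)
)

# Filter to the three symbols and order the factor levels
top3 <- c("BTC", "ETH", "XRP")
kept <- prices[prices$symbol %in% top3, ]
kept$symbol <- factor(kept$symbol, levels = top3, ordered = TRUE)

# Add integer date components alongside the original date column
kept$Year  <- as.integer(format(kept$date, "%Y"))
kept$Month <- as.integer(format(kept$date, "%m"))
kept$Day   <- as.integer(format(kept$date, "%d"))

# Sorting by the ordered factor follows the level order BTC, ETH, XRP
kept <- kept[order(kept$symbol, kept$date), ]
print(kept)
```

Keeping the Date column makes later time-based operations (differences, plotting on a time axis) simpler, while the integer components still allow grouping by year or month.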

Tidy & Manipulate Data II

There is a variable named “spread” that holds the $USD difference between the high and low values of a cryptocurrency for the day.

It may be helpful to create a variable for the $USD difference between the close and open values of a cryptocurrency for the day.

The new variable will be named “spread C/O” and the original “spread” variable will be renamed “spread H/L”.

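# Add the close-minus-open spread as a new column
crypto_gold_tidy2 <- mutate(crypto_gold_tidy,
       "spread C/O" = close - open)
# Assign the result: rename() returns a new data frame rather than modifying in place
crypto_gold_tidy2 <- crypto_gold_tidy2 %>% rename("spread H/L" = spread)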

Scan I

is.na can be used to scan for NAs in the data. However, the dataset has too many observations for the full logical matrix to be checked by eye. A more efficient alternative is to count the total number of NAs in the data frame.

As there are 0 NAs, no action needs to be taken.

is.na(crypto_gold_tidy2)
         slug symbol  name  Year Month   Day ranknow  open  high   low close volume
   [1,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
   [2,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
   [3,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
   [4,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
   [5,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
   [6,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
   [7,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
   [8,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
   [9,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [10,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [11,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [12,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [13,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [14,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [15,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [16,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [17,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [18,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [19,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [20,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [21,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [22,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [23,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [24,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [25,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [26,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [27,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [28,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [29,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [30,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [31,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [32,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [33,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [34,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
  [35,] FALSE  FALSE FALSE FALSE FALSE FALSE   FALSE FALSE FALSE FALSE FALSE  FALSE
        market close_ratio spread  Open  High   Low Close   WAP No..of.Shares
   [1,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
   [2,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
   [3,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
   [4,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
   [5,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
   [6,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
   [7,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
   [8,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
   [9,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [10,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [11,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [12,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [13,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [14,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [15,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [16,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [17,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [18,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [19,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [20,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [21,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [22,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [23,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [24,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [25,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [26,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [27,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [28,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [29,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [30,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [31,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [32,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [33,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [34,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
  [35,]  FALSE       FALSE  FALSE FALSE FALSE FALSE FALSE FALSE         FALSE
        No..of.Trades Total.Turnover Deliverable.Quantity X..Deli..Qty.to.Traded.Qty
   [1,]         FALSE          FALSE                FALSE                      FALSE
   [2,]         FALSE          FALSE                FALSE                      FALSE
   [3,]         FALSE          FALSE                FALSE                      FALSE
   [4,]         FALSE          FALSE                FALSE                      FALSE
   [5,]         FALSE          FALSE                FALSE                      FALSE
   [6,]         FALSE          FALSE                FALSE                      FALSE
   [7,]         FALSE          FALSE                FALSE                      FALSE
   [8,]         FALSE          FALSE                FALSE                      FALSE
   [9,]         FALSE          FALSE                FALSE                      FALSE
  [10,]         FALSE          FALSE                FALSE                      FALSE
  [11,]         FALSE          FALSE                FALSE                      FALSE
  [12,]         FALSE          FALSE                FALSE                      FALSE
  [13,]         FALSE          FALSE                FALSE                      FALSE
  [14,]         FALSE          FALSE                FALSE                      FALSE
  [15,]         FALSE          FALSE                FALSE                      FALSE
  [16,]         FALSE          FALSE                FALSE                      FALSE
  [17,]         FALSE          FALSE                FALSE                      FALSE
  [18,]         FALSE          FALSE                FALSE                      FALSE
  [19,]         FALSE          FALSE                FALSE                      FALSE
  [20,]         FALSE          FALSE                FALSE                      FALSE
  [21,]         FALSE          FALSE                FALSE                      FALSE
  [22,]         FALSE          FALSE                FALSE                      FALSE
  [23,]         FALSE          FALSE                FALSE                      FALSE
  [24,]         FALSE          FALSE                FALSE                      FALSE
  [25,]         FALSE          FALSE                FALSE                      FALSE
  [26,]         FALSE          FALSE                FALSE                      FALSE
  [27,]         FALSE          FALSE                FALSE                      FALSE
  [28,]         FALSE          FALSE                FALSE                      FALSE
  [29,]         FALSE          FALSE                FALSE                      FALSE
  [30,]         FALSE          FALSE                FALSE                      FALSE
  [31,]         FALSE          FALSE                FALSE                      FALSE
  [32,]         FALSE          FALSE                FALSE                      FALSE
  [33,]         FALSE          FALSE                FALSE                      FALSE
  [34,]         FALSE          FALSE                FALSE                      FALSE
  [35,]         FALSE          FALSE                FALSE                      FALSE
        Spread.H.L Spread.C.O spread C/O
   [1,]      FALSE      FALSE      FALSE
   [2,]      FALSE      FALSE      FALSE
   [3,]      FALSE      FALSE      FALSE
   [4,]      FALSE      FALSE      FALSE
   [5,]      FALSE      FALSE      FALSE
   [6,]      FALSE      FALSE      FALSE
   [7,]      FALSE      FALSE      FALSE
   [8,]      FALSE      FALSE      FALSE
   [9,]      FALSE      FALSE      FALSE
  [10,]      FALSE      FALSE      FALSE
  [11,]      FALSE      FALSE      FALSE
  [12,]      FALSE      FALSE      FALSE
  [13,]      FALSE      FALSE      FALSE
  [14,]      FALSE      FALSE      FALSE
  [15,]      FALSE      FALSE      FALSE
  [16,]      FALSE      FALSE      FALSE
  [17,]      FALSE      FALSE      FALSE
  [18,]      FALSE      FALSE      FALSE
  [19,]      FALSE      FALSE      FALSE
  [20,]      FALSE      FALSE      FALSE
  [21,]      FALSE      FALSE      FALSE
  [22,]      FALSE      FALSE      FALSE
  [23,]      FALSE      FALSE      FALSE
  [24,]      FALSE      FALSE      FALSE
  [25,]      FALSE      FALSE      FALSE
  [26,]      FALSE      FALSE      FALSE
  [27,]      FALSE      FALSE      FALSE
  [28,]      FALSE      FALSE      FALSE
  [29,]      FALSE      FALSE      FALSE
  [30,]      FALSE      FALSE      FALSE
  [31,]      FALSE      FALSE      FALSE
  [32,]      FALSE      FALSE      FALSE
  [33,]      FALSE      FALSE      FALSE
  [34,]      FALSE      FALSE      FALSE
  [35,]      FALSE      FALSE      FALSE
 [ reached getOption("max.print") -- omitted 1793 rows ]
sum(is.na(crypto_gold_tidy2))
[1] 0
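
The full logical matrix printed by is.na() is hard to read for a dataset of this size. As a minimal sketch, the same check can be summarised per column and per row in base R. The data frame `toy` and its values are hypothetical stand-ins for crypto_gold_tidy2, not data from the report.

```r
# Toy data frame standing in for crypto_gold_tidy2 (hypothetical values)
toy <- data.frame(
  symbol = c("BTC", "ETH", "XRP", "BTC"),
  close  = c(100.5, NA, 0.3, 101.2),
  open   = c(99.0, 10.1, NA, NA)
)

# Total number of missing values in the whole data frame
total_na <- sum(is.na(toy))

# Missing values per column: easier to read than the full logical matrix
na_per_column <- colSums(is.na(toy))

# Row positions that contain at least one missing value
rows_with_na <- which(!complete.cases(toy))

print(total_na)
print(na_per_column)
print(rows_with_na)
```

When the total is 0, as it is for the report's data, the per-column counts are all 0 and no rows are flagged.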

Scan II

Boxplots of the close price by cryptocurrency show clear outliers in the BTC price.

Next, z-scores are used to identify the outliers: we find the locations of the observations whose absolute z-score is greater than 3.

Three outliers are identified. Because they are not the result of a data entry error or a data processing error, it might not be appropriate to exclude them directly. Instead, they are replaced by the nearest neighbours that are not outliers (capping): values lying beyond the 1.5 × IQR fences are set to the 5th or the 95th percentile of the BTC close price.

boxplot(crypto_gold_tidy2$close ~ crypto_gold_tidy2$symbol, main="price by cryptocurrency", ylab = "price", xlab = "cryptocurrency")


# Keep only the close price of BTC
crypto_gold_outlier <- crypto_gold_tidy2 %>% filter(symbol == "BTC") %>% dplyr::select(close)
# Convert the close price to z-scores with outliers::scores()
z.scores <- crypto_gold_outlier %>% scores(type = "z")
z.scores %>% summary()
     close         
 Min.   :-1.52668  
 1st Qu.:-0.79443  
 Median :-0.09933  
 Mean   : 0.00000  
 3rd Qu.: 0.72392  
 Max.   : 3.18060  
which( abs(z.scores) >3 )
[1] 148 151 793
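
The z-score rule used above can be sketched in base R. A z-score is the distance of a value from the mean in units of the standard deviation, which is, as far as the outliers package documentation describes it, what scores(type = "z") computes. The vector `prices` is hypothetical illustration data, not the BTC series.

```r
# Hypothetical price series: 29 ordinary values followed by one extreme value
prices <- c(rep(c(10, 11, 12), length.out = 29), 60)

# z-score: (value - mean) / sample standard deviation
z <- (prices - mean(prices)) / sd(prices)

# Positions whose absolute z-score is greater than 3
outlier_idx <- which(abs(z) > 3)

print(round(range(z), 2))
print(outlier_idx)  # position 30, the extreme value
```

Only the last value is flagged here; the ordinary values all have absolute z-scores well below 1.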
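# Cap (winsorise) values outside the 1.5 * IQR fences at the 5th / 95th percentile
cap <- function(x){
    quantiles <- quantile( x, c(.05, 0.25, 0.75, .95 ) )
    # Compute the IQR once, before any value is replaced
    iqr <- IQR(x)
    x[ x < quantiles[2] - 1.5*iqr ] <- quantiles[1]
    x[ x > quantiles[3] + 1.5*iqr ] <- quantiles[4]
    x
}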

close_capped <- crypto_gold_outlier$close %>% cap()
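
To make the effect of capping concrete, here is a small worked sketch on a hypothetical vector. The helper `cap_sketch` follows the same idea as cap() above but is written out separately so the block runs on its own; the input values are invented for illustration.

```r
# Stand-alone capping helper (same idea as cap() in the report)
cap_sketch <- function(x) {
  q <- quantile(x, c(0.05, 0.25, 0.75, 0.95))
  iqr <- IQR(x)
  lower_fence <- q[2] - 1.5 * iqr
  upper_fence <- q[3] + 1.5 * iqr
  x[x < lower_fence] <- q[1]  # low outliers  -> 5th percentile
  x[x > upper_fence] <- q[4]  # high outliers -> 95th percentile
  x
}

# Hypothetical data: the values 1 to 19 plus one extreme value
x <- c(1:19, 100)

capped <- cap_sketch(x)

# With R's default quantile type, Q1 = 5.75 and Q3 = 15.25, so the upper
# fence is 15.25 + 1.5 * 9.5 = 29.5; only the value 100 lies beyond it and
# it is replaced by the 95th percentile, 23.05
print(capped)
```

Note that capping acts on every value beyond the fences, so it may change more observations than the ones flagged by the z-score rule.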

Transform

Before any transformation, a histogram of the Bitcoin close price is created to visualise its distribution. The plot shows that the data is clearly right-skewed. For further inference on the price, it is preferable to bring the data closer to a normal distribution, so a Box-Cox transformation is applied. The resulting histogram appears left-skewed rather than normally distributed.

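Z-score standardisation is then applied to see whether there is any improvement. In the histogram of the standardised values, the plot looks relatively more normal than the Box-Cox result, with most of the data gathered around the mean of 0. Note that standardisation only recentres and rescales the data, so the shape of the distribution itself is the same as that of the capped close price.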

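# Distribution of the capped BTC close price before any transformation
hist(close_capped)

# Box-Cox transformation with lambda chosen automatically (forecast::BoxCox)
boxcox_close <- BoxCox(close_capped, lambda = "auto")
hist(boxcox_close)

# Z-score standardisation: centre on the mean and divide by the standard deviation
z_close <- scale(close_capped, center = TRUE, scale = TRUE)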
hist(z_close)
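
Histograms can be hard to compare by eye, so a numeric check may help. The sketch below is only an illustration under stated assumptions: `skewness` and `box_cox` are hypothetical helpers written in base R (not functions from forecast), the data is simulated log-normal data rather than the BTC series, and lambda is fixed at 0 (the log transform) instead of being chosen automatically.

```r
# Moment-based sample skewness (hypothetical helper, not from the report)
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Box-Cox transform for a fixed lambda; lambda = 0 is the log transform
box_cox <- function(x, lambda) {
  if (lambda == 0) log(x) else (x^lambda - 1) / lambda
}

set.seed(1)
# Simulated right-skewed, strictly positive data standing in for prices
x <- rlnorm(500, meanlog = 5, sdlog = 0.8)

skew_raw    <- skewness(x)
skew_z      <- skewness(as.numeric(scale(x)))    # linear rescaling only
skew_boxcox <- skewness(box_cox(x, lambda = 0))  # non-linear transform

print(c(raw = skew_raw, z_score = skew_z, box_cox = skew_boxcox))
```

In this sketch the skewness of the standardised values equals that of the raw values, because standardisation is a linear rescaling, while the Box-Cox (log) transform brings the skewness much closer to 0.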
