While massive data bring many statistical issues to the fore, including issues in exploratory data analysis and data visualization, there remains the core inferential need to assess the quality of estimators. Indeed, the uncertainty and biases in estimates based on large data can remain quite significant, as large datasets are often high dimensional, are frequently used to fit complex models with large numbers of parameters, and can have many potential sources of bias. [A Scalable Bootstrap for Massive Data]
Additionally, because the variability of an estimator on a subsample differs from its variability on the full dataset, subsampling-based procedures (such as subsampling and the \(m\) out of \(n\) bootstrap) must rescale their output. This rescaling requires knowledge and explicit use of the convergence rate of the estimator in question, so these methods are less automatic and less easily deployable than the bootstrap. [A Scalable Bootstrap for Massive Data]
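To make the rescaling concrete, here is a worked relation under an assumption that the text above does not state: the estimator converges at a known rate \(\tau_n\), with \(\tau_n = \sqrt{n}\) as the common case. The symbols \(\tau_n\), \(\widehat{SD}_b\) and \(\widehat{SD}_n\) are introduced here for illustration only.

\[
\widehat{SD}_n(\hat{\theta}) \approx \frac{\tau_b}{\tau_n}\,\widehat{SD}_b(\hat{\theta}),
\qquad\text{so for } \tau_n=\sqrt{n}:\quad
\widehat{SD}_n(\hat{\theta}) \approx \sqrt{\frac{b}{n}}\,\widehat{SD}_b(\hat{\theta}).
\]

Here \(\widehat{SD}_b\) is the variability measured on subsamples of size \(b\) and \(\widehat{SD}_n\) is the rescaled estimate for the full sample size \(n\). If the rate is unknown or misspecified, the correction factor is wrong, which is the sense in which these procedures are less automatic.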
Bootstrapping uses the sample data to estimate relevant characteristics of the population. The sampling distribution of a statistic is then constructed empirically by resampling from the sample. The resampling procedure is designed to parallel the process by which sample observations were drawn from the population. For example, if the data represent an independent random sample of size n (or a simple random sample of size n from a much larger population), then each bootstrap sample selects n observations with replacement from the original sample. The key bootstrap analogy is the following: The population is to the sample as the sample is to the bootstrap samples.
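The resampling scheme described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `bootstrap_sd`, the simulated Gaussian sample, and the choice of the mean as the statistic are all assumptions made for the example.

```python
import random
import statistics


def bootstrap_sd(data, estimator, num_resamples=1000, seed=0):
    """Estimate the standard deviation of `estimator` by the bootstrap."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(num_resamples):
        # Each bootstrap sample draws n observations with replacement
        # from the original sample, mirroring how the sample was drawn
        # from the population.
        resample = rng.choices(data, k=n)
        estimates.append(estimator(resample))
    # The spread of the bootstrap estimates approximates SD(theta_hat).
    return statistics.stdev(estimates)


# Hypothetical data: 200 draws from a normal distribution.
data_rng = random.Random(42)
sample = [data_rng.gauss(10.0, 2.0) for _ in range(200)]

sd_boot = bootstrap_sd(sample, statistics.fmean)
# For the mean, the textbook standard error s / sqrt(n) gives a sanity check.
sd_formula = statistics.stdev(sample) / len(sample) ** 0.5
print(f"bootstrap SD: {sd_boot:.4f}, formula SD: {sd_formula:.4f}")
```

For the sample mean the two numbers should be close. The bootstrap is useful because the same code works for statistics that have no closed-form standard error.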
For estimating \(SD(\hat{\theta})\):
Let \(\hat{F}\) denote the empirical probability distribution of the data (i.e., placing mass \(1/n\) at each of the \(n\) data points).
Select \(s\) subsets of size \(b\) from the full data (i.e., randomly sample a set of \(b\) indices \(\mathcal{I}_{j}=\left\{i_{1},\ldots,i_{b}\right\}\) from \(\left\{1,2,\ldots,n\right\}\) without replacement, and repeat \(s\) times).
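The text above gives only the subset-selection step. The sketch below fills in the remaining steps as they are usually described for the Bag of Little Bootstraps: within each subset, draw multinomial counts that sum to \(n\), evaluate the estimator on the weighted subset, take the standard deviation over resamples, and average over subsets. Treat it as an assumption-laden illustration: the names `blb_sd` and `weighted_mean`, the number of resamples `r`, and the choice \(b = n^{0.6}\) are hypothetical choices for this example.

```python
import random
import statistics


def weighted_mean(values, counts):
    """Mean of `values` where values[i] is repeated counts[i] times."""
    return sum(v * c for v, c in zip(values, counts)) / sum(counts)


def blb_sd(data, weighted_estimator, b, s=5, r=40, seed=0):
    """Sketch of a Bag of Little Bootstraps estimate of SD(theta_hat)."""
    rng = random.Random(seed)
    n = len(data)
    subset_sds = []
    for _ in range(s):
        # Sample b indices from {0, ..., n-1} without replacement.
        indices = rng.sample(range(n), b)
        subset = [data[i] for i in indices]
        estimates = []
        for _ in range(r):
            # Multinomial counts: n draws spread uniformly over the b points,
            # so each resample has nominal size n but only b distinct values.
            counts = [0] * b
            for _ in range(n):
                counts[rng.randrange(b)] += 1
            estimates.append(weighted_estimator(subset, counts))
        subset_sds.append(statistics.stdev(estimates))
    # Average the per-subset SD estimates.
    return statistics.fmean(subset_sds)


# Hypothetical data: 2000 draws from a normal distribution.
data_rng = random.Random(7)
data = [data_rng.gauss(0.0, 3.0) for _ in range(2000)]
b = int(len(data) ** 0.6)  # subset size, an illustrative choice

sd_blb = blb_sd(data, weighted_mean, b)
sd_ref = statistics.stdev(data) / len(data) ** 0.5
print(f"BLB SD: {sd_blb:.4f}, formula SD: {sd_ref:.4f}")
```

Because every resample has nominal size \(n\), no rescaling by a convergence rate is needed, while the estimator only ever touches \(b\) distinct points at a time.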
## [,1] [,2]
## [1,] 5.027871e-05 5.616129e-05
## [,1] [,2]
## [1,] 4.91208e-05 5.616129e-05
## [,1] [,2] [,3]
## [1,] 3.163569e-05 0.0002607607 0.00014168
## [,1] [,2] [,3]
## [1,] 3.365926e-05 0.000338991 0.0001443844
## [1] 1.135657e-06 1.055956e-05 4.589873e-06
## [,1] [,2]
## [1,] 1.098333e-07 1.460337e-07
## [,1] [,2]
## [1,] 1.087101e-07 1.467998e-07
## [,1] [,2] [,3]
## [1,] 3.163569e-05 0.0002607607 0.00014168
## [,1] [,2] [,3]
## [1,] 3.365926e-05 0.000338991 0.0001443844
## [1] 1.135657e-06 1.055956e-05 4.589873e-06
## [,1] [,2]
## [1,] 5.616129e-05 2.395781e-05
## [,1] [,2]
## [1,] 5.616129e-05 2.395781e-05
## [,1] [,2] [,3]
## [1,] 3.365926e-05 0.000338991 0.0001443844
## [,1] [,2] [,3]
## [1,] 1.706016e-05 0.0001358804 7.370805e-05
## [1] 1.135657e-06 1.055956e-05 4.589873e-06
## [,1] [,2]
## [1,] 1.460337e-07 2.378013e-08
## [,1] [,2]
## [1,] 1.467998e-07 2.365717e-08
## [,1] [,2] [,3]
## [1,] 3.365926e-05 0.000338991 0.0001443844
## [,1] [,2] [,3]
## [1,] 1.706016e-05 0.0001358804 7.370805e-05
## [1] 1.135657e-06 1.055956e-05 4.589873e-06
## [,1] [,2]
## [1,] 5.515175e-05 2.37764e-05
## [,1] [,2]
## [1,] 5.515175e-05 2.37764e-05
## [,1] [,2] [,3]
## [1,] 3.38527e-05 0.0003290636 0.0001440936
## [,1] [,2] [,3]
## [1,] 1.674922e-05 0.0001363287 6.886627e-05
## [1] 1.135657e-06 1.055956e-05 4.589873e-06
### Quantile 0.05
## [,1] [,2]
## [1,] 1.385951e-07 2.343538e-08
## [,1] [,2]
## [1,] 1.388482e-07 2.344412e-08
## [,1] [,2] [,3]
## [1,] 3.38527e-05 0.0003290636 0.0001440936
## [,1] [,2] [,3]
## [1,] 1.674922e-05 0.0001363287 6.886627e-05
## [1] 1.135657e-06 1.055956e-05 4.589873e-06
## [,1] [,2]
## [1,] 5.613607e-05 2.417312e-05
## [,1] [,2]
## [1,] 5.613607e-05 2.417312e-05
## [,1] [,2] [,3]
## [1,] 3.426225e-05 0.0003362309 0.0001430772
## [,1] [,2] [,3]
## [1,] 1.711931e-05 0.000136229 6.920184e-05
## [1] 1.135657e-06 1.055956e-05 4.589873e-06
## [,1] [,2]
## [1,] 1.446439e-07 2.32966e-08
## [,1] [,2]
## [1,] 1.447943e-07 2.326147e-08
## [,1] [,2] [,3]
## [1,] 3.426225e-05 0.0003362309 0.0001430772
## [,1] [,2] [,3]
## [1,] 1.711931e-05 0.000136229 6.920184e-05
## [1] 1.135657e-06 1.055956e-05 4.589873e-06