Kernel density estimation

Pawel Borowiec
November 24, 2014

Reproducible Pitch Presentation - 5-slide deck
for the “Developing Data Products” course

Data smoothing problem via KDE

For more details, see http://en.wikipedia.org/wiki/Kernel_density_estimation
For the live Shiny application, see https://mycourseraaccount.shinyapps.io/kde-app/

  • What is an unknown probability density function (and its values)?
  • How is it related to a histogram?
  • What is the best method to estimate it?
  • Which parameter is the most important?
  • In R, kernel density estimation is implemented by the 'density' function.
  • The 'quantmod' package is used to get daily EUR/PLN exchange rates from the beginning of 2014.
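As a minimal sketch of the points above, the following R code compares 'density' with a hand-written Gaussian kernel estimate. The data are simulated as a stand-in for the EUR/PLN rates (an assumption, not the data used in the deck), and 'kde_manual' is a hypothetical helper written only for this illustration.

```r
# Simulated stand-in for the EUR/PLN rates (an assumption, not the deck's data).
set.seed(1)
x <- rnorm(500, mean = 4.18, sd = 0.03)

# Kernel density estimate with the defaults (Gaussian kernel, bw = "nrd0").
d <- density(x)

# Hand-written Gaussian KDE evaluated on the same grid:
# f_hat(t) = 1 / (n * h) * sum_i K((t - x_i) / h), with K = dnorm.
kde_manual <- function(t, data, h) {
  sapply(t, function(ti) mean(dnorm((ti - data) / h)) / h)
}
f_manual <- kde_manual(d$x, x, d$bw)

# density() uses a binned FFT approximation, so the two curves
# should agree closely but not exactly.
rel_diff <- max(abs(d$y - f_manual)) / max(d$y)

# Like a histogram drawn with freq = FALSE, the estimate integrates to about 1.
area <- sum(d$y) * diff(d$x[1:2])

if (interactive()) {
  hist(x, freq = FALSE, breaks = 30, main = "Histogram and KDE")
  lines(d, lwd = 2)
}
```

The histogram is drawn with freq = FALSE so that its bars are on the same density scale as the estimated curve.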

Histogram with different bandwidth methods

All bandwidth selection algorithms except 'ucv' have only a small impact on the estimated density function values.

[Plot: histogram with density estimates for each bandwidth selection method]
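A comparison of this kind could be sketched as follows. The data are again simulated (an assumption), so the selected bandwidths will not match those obtained from the exchange rates; the selector names are the ones accepted by the 'bw' argument of 'density'.

```r
# Simulated stand-in data (an assumption, not the deck's EUR/PLN series).
set.seed(1)
x <- rnorm(500, mean = 4.18, sd = 0.03)

# Bandwidth selectors accepted by density(bw = ...).
selectors <- c("nrd0", "nrd", "ucv", "bcv", "SJ")

# Numeric bandwidth chosen by each selector.
bws <- sapply(selectors, function(s) density(x, bw = s)$bw)
print(signif(bws, 4))

if (interactive()) {
  hist(x, freq = FALSE, breaks = 30, main = "Bandwidth selectors")
  for (i in seq_along(selectors)) {
    lines(density(x, bw = selectors[i]), col = i)
  }
  legend("topright", legend = selectors, col = seq_along(selectors), lty = 1)
}
```

Printing the numeric bandwidths next to the plot makes it easier to see which selector, if any, departs from the rest on a given data set.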

Histogram with different kernel methods

Here we can see that different kernel functions give similar results.

[Plot: histogram with density estimates for each kernel function]
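One way to check this numerically is sketched below, with the bandwidth held fixed so that only the kernel changes. The data are simulated (an assumption), and the evaluation grid and the 'gaps' summary are choices made only for this illustration.

```r
# Simulated stand-in data (an assumption, not the deck's EUR/PLN series).
set.seed(1)
x <- rnorm(500, mean = 4.18, sd = 0.03)

# Kernels accepted by density(kernel = ...).
kernels <- c("gaussian", "epanechnikov", "rectangular", "triangular",
             "biweight", "cosine", "optcosine")

# Fix the bandwidth and the evaluation grid so only the kernel differs.
h <- bw.nrd0(x)
grid <- range(x) + c(-3, 3) * h
fits <- lapply(kernels, function(k) {
  density(x, bw = h, kernel = k, from = grid[1], to = grid[2])
})
names(fits) <- kernels

# Largest pointwise gap to the Gaussian estimate, relative to its peak.
ref <- fits$gaussian$y
gaps <- sapply(fits, function(f) max(abs(f$y - ref)) / max(ref))
print(round(gaps, 3))

if (interactive()) {
  hist(x, freq = FALSE, breaks = 30, main = "Kernel functions")
  for (i in seq_along(kernels)) lines(fits[[i]], col = i)
  legend("topright", legend = kernels, col = seq_along(kernels), lty = 1)
}
```

In 'density' the bandwidth is the standard deviation of the kernel, which is what makes the same value of h comparable across kernels.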

Histogram with different adjustments

Choosing the optimal kernel bandwidth is the most important decision: it affects the estimate far more than the choice of kernel.

[Plot: histogram with density estimates for each bandwidth adjustment]
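The effect of the adjustment can be sketched with the 'adjust' argument of 'density', which multiplies the selected bandwidth. The data are simulated and the adjustment values are arbitrary (both are assumptions, not taken from the deck).

```r
# Simulated stand-in data (an assumption, not the deck's EUR/PLN series).
set.seed(1)
x <- rnorm(500, mean = 4.18, sd = 0.03)

# Arbitrary multipliers for the default bandwidth (illustrative values).
adjustments <- c(0.25, 0.5, 1, 2, 4)

fits <- lapply(adjustments, function(a) density(x, adjust = a))

# The bandwidth actually used is adjust * (default "nrd0" bandwidth).
bws <- sapply(fits, function(f) f$bw)
print(data.frame(adjust = adjustments, bw = signif(bws, 4)))

if (interactive()) {
  hist(x, freq = FALSE, breaks = 30, main = "Bandwidth adjustments")
  for (i in seq_along(fits)) lines(fits[[i]], col = i)
  legend("topright", legend = paste("adjust =", adjustments),
         col = seq_along(fits), lty = 1)
}
```

Small multipliers tend to give a wiggly curve that follows individual observations, while large ones smooth away real structure.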