library(ggplot2)
data("diamonds")
str(diamonds)

Histogram

qplot(price, data = diamonds)

The function prints a message (not an error) that it is using bins = 30, so the binwidth defaults to range/30, where range is the difference between the maximum and minimum values of the data.

range(diamonds$price)
[1]   326 18823

The range is 18823 - 326 = $18,497.

qplot(price, data= diamonds, binwidth = 18497/30)

The histogram is almost identical to the previous one! If you typed 18497/30 at the command line you would get the result 616.5667. This means that the height of each bin tells you how many diamonds have a price between x and x+617, where x is the left edge of the bin.
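
The arithmetic behind that binwidth can be checked directly. This is only a small sketch; the variable names price_range and default_binwidth are made up for illustration and do not come from the lesson.

```r
library(ggplot2)
data("diamonds")

# Spread between the largest and smallest price (18823 - 326).
price_range <- diff(range(diamonds$price))

# Width of each bin when that spread is divided into 30 equal parts.
default_binwidth <- price_range / 30

price_range       # 18497
default_binwidth  # about 616.5667
```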

qplot(price,data=diamonds,binwidth=18497/30,fill=cut)

This shows how the counts within each price grouping (bin) are distributed among the different cuts of diamonds.

Building a density graph

qplot(price,data=diamonds, geom= "density")
qplot(price,data=diamonds, geom= "density", color = cut)

Graph transformation

qplot(carat, price, data = diamonds,  color = cut)
qplot(carat, price, data = diamonds,  color = cut)+ geom_smooth(method= "lm")
qplot(carat, price, data = diamonds,  color = cut, facets =.~cut )+ geom_smooth(method= "lm")
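
For comparison, the last qplot() call can also be written with ggplot() and explicit layers. qplot() has been deprecated since ggplot2 3.4.0, so this form may be the safer one on recent versions. Treat it as a rough equivalent under that assumption, not as the lesson's own code; the name p is illustrative.

```r
library(ggplot2)
data("diamonds")

# Scatterplot of price against carat, colored by cut,
# with a linear fit per cut and one facet column per cut.
p <- ggplot(diamonds, aes(carat, price, color = cut)) +
  geom_point() +
  geom_smooth(method = "lm") +
  facet_grid(. ~ cut)

p
```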

Working more with ggplot

g<-ggplot(data=diamonds, aes(depth, price))

Relationship between the two variables across the entire dataset

g+geom_point(alpha=1/3)

Suppose we want to see whether this relationship is affected by cut or carat. We know cut is a factor with 5 levels (Fair, Good, Very Good, Premium, and Ideal). But carat is numeric and not a discrete factor. Can we do this? R has a handy function, cut(), which allows you to divide your data into sets and label each entry as belonging to one of the sets, in effect creating a new factor. First, we’ll have to decide where to cut the data.

cutpoints<-quantile(diamonds$carat,seq(0,1,length=4),na.rm=TRUE)

The result is a vector of length 4 (which is why length was set equal to 4): the quantiles at 0, 1/3, 2/3, and 1, which mark the edges of three intervals.
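
To see what the cutpoints look like, the probabilities and the resulting quantiles can be printed separately. This is a sketch for inspection only; probs is an illustrative name, and length.out is the full name of the seq() argument that length = 4 partially matches.

```r
library(ggplot2)
data("diamonds")

# Four evenly spaced probabilities: 0, 1/3, 2/3, 1.
probs <- seq(0, 1, length.out = 4)

# Carat values at those probabilities; these become the interval edges.
cutpoints <- quantile(diamonds$carat, probs, na.rm = TRUE)

cutpoints
length(cutpoints) - 1  # number of intervals the edges define
```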

Assigning the result as a new column

diamonds$car2<-cut(diamonds$carat, cutpoints)
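
One detail worth checking here: by default cut() builds intervals that are open on the left, so a value exactly equal to the smallest cutpoint falls into no interval and becomes NA. Because the smallest cutpoint is the minimum of carat, some rows end up with NA in the new factor. The sketch below compares the default with include.lowest = TRUE, which is an alternative and not what the notes use; the names car2_default and car2_closed are illustrative.

```r
library(ggplot2)
data("diamonds")

cutpoints <- quantile(diamonds$carat, seq(0, 1, length = 4), na.rm = TRUE)

# Default: intervals are (a, b], so the minimum carat is left out.
car2_default <- cut(diamonds$carat, cutpoints)

# Alternative: close the first interval on the left as well.
car2_closed <- cut(diamonds$carat, cutpoints, include.lowest = TRUE)

sum(is.na(car2_default))  # rows with no interval under the default
sum(is.na(car2_closed))   # rows with no interval once the minimum is included
```

With the default, facet_grid() draws an extra NA column for those rows; whether to keep or drop them is a judgment call.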

Plotting by facets

g+geom_point(alpha=1/3)+facet_grid(cut ~ car2 )

Adding a smoothing line

g+geom_point(alpha=1/3)+facet_grid(cut~car2)+geom_smooth(method="lm", size = 1, color = "pink")

Getting a boxplot with ggplot

ggplot(diamonds,aes(carat,price))+geom_boxplot()+facet_grid(.~cut)