State region Date Home.Value Structure.Cost Land.Value Land.Share..Pct.
<char> <char> <int> <int> <int> <int> <num>
1: AK West 20101 224952 160599 64352 28.6
2: AK West 20102 225511 160252 65259 28.9
3: AK West 20093 225820 163791 62029 27.5
4: AK West 20094 224994 161787 63207 28.1
5: AK West 20074 234590 155400 79190 33.8
6: AK West 20081 233714 157458 76256 32.6
Home.Price.Index Land.Price.Index
<num> <num>
1: 1.481 1.552
2: 1.484 1.576
3: 1.486 1.494
4: 1.481 1.524
5: 1.544 1.885
6: 1.538 1.817
Aesthetic Mapping
An aesthetic is an attribute of a geometric object, specified using the aes() command. An aesthetic mapping is a mapping between a column of the data.table and a scaled aesthetic of the geoms: the aesthetic of each data point is scaled according to the value in that column. Aesthetic mappings are controlled using scales.
Example scatter plot aesthetics
# scale scatterplot data point colour by Home.Value...
myPlot + geom_point(aes(color = Home.Value))
An aesthetic mapping is distinct from a raw geom aesthetic, which does not vary over data points.
# select a single colour for all data points...
myPlot + geom_point(color = "blue")
Alternatively, we can specify the data and aesthetic mapping in the ggplot() command, which forwards them to the associated geom_ calls.
ggplot(data = housing[Date == 20011],
       mapping = aes(y = Structure.Cost, x = log(Land.Value), color = Home.Value)) +
  geom_point()  # uses mapping from parent ggplot command
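The difference between a mapped and a fixed aesthetic can be checked without drawing anything. The following is a minimal sketch on a made-up data frame (`toy` and its columns are assumptions for illustration, not part of the housing data); it uses `layer_data()` to read back the colours ggplot2 computes for each point.

```r
library(ggplot2)

# Hypothetical toy data standing in for the housing table
toy <- data.frame(x = 1:4, y = c(2, 4, 3, 5), value = c(10, 20, 30, 40))

# Mapped aesthetic: colour varies with the `value` column
p_mapped <- ggplot(toy, aes(x = x, y = y)) +
  geom_point(aes(colour = value))

# Fixed aesthetic: every point gets the same colour
p_fixed <- ggplot(toy, aes(x = x, y = y)) +
  geom_point(colour = "blue")

# Mapping given in ggplot() is inherited by the geom
p_inherited <- ggplot(toy, aes(x = x, y = y, colour = value)) +
  geom_point()

# Read back the colour assigned to each point in each plot
mapped_colours    <- layer_data(p_mapped)$colour
fixed_colours     <- layer_data(p_fixed)$colour
inherited_colours <- layer_data(p_inherited)$colour
```

Printing any of the plot objects (for example `print(p_mapped)`) draws it; the mapped and inherited versions should look the same.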
Geometric Object
A visual artifact in a plot, such as a line, text box, or error bar.
Available geometric objects…
help.search("geom_", package = "ggplot2")

# Basic Geoms
geom_point()  # Scatter plot, points
geom_line()  # Line plot, lines connecting points
geom_path()  # Similar to geom_line, but points are connected in order
geom_bar()  # Bar plot, height of the bar is proportional to a value
geom_col()  # Bar plot, height represents the value directly
geom_histogram()  # Histogram, distribution of a single variable
geom_area()  # Area plot, shaded region under a line
geom_tile()  # Rectangles of arbitrary size, useful for heatmaps
geom_raster()  # Rectangles for regularly spaced data, faster than geom_tile

# Smooth Geoms
geom_smooth()  # Add a smoothed conditional mean (e.g., LOESS, linear models)

# Textual Geoms
geom_text()  # Text annotations
geom_label()  # Label annotations with a background
geom_text_repel()  # Avoid overlapping text labels (from ggrepel package)
geom_label_repel()  # Avoid overlapping label boxes (from ggrepel package)

# Ribbon Geoms
geom_ribbon()  # Shaded area between two lines (useful for confidence intervals)

# Boxplot Geoms
geom_boxplot()  # Boxplot for visualizing distributions
geom_violin()  # Violin plot, shows distribution with density estimates

# Error Bars and Range Geoms
geom_errorbar()  # Error bars for continuous variables
geom_errorbarh()  # Horizontal error bars
geom_linerange()  # Line ranges, often used for confidence intervals
geom_crossbar()  # Horizontal bar with error bars above and below
geom_pointrange()  # Point with vertical error bar

# Jittered Points
geom_jitter()  # Scatter plot with jitter to reduce overplotting

# Density Plots
geom_density()  # Kernel density estimate plot
geom_density_2d()  # 2D density estimate, contours
geom_density_2d_filled()  # 2D density estimate, filled contours

# Function Geoms
geom_function()  # Plot mathematical functions

# Rug Geoms
geom_rug()  # Marginal distribution plots on x and y axis

# Dot Plot
geom_dotplot()  # Dot plot, count of points in each bin

# Step Geoms
geom_step()  # Step function plot

# Hexbin Plot
geom_hex()  # Hexagonal binning for 2D data

# Contour Plots
geom_contour()  # Contour plot, for 3D surface projections
geom_contour_filled()  # Filled contour plot

# Polygon Geoms
geom_polygon()  # Polygon shapes, useful for maps

# Segment and Arrow Geoms
geom_segment()  # Line segments between specified points
geom_curve()  # Curved lines between specified points
geom_abline()  # Lines with specified slope and intercept (y = mx + b)
geom_hline()  # Horizontal lines
geom_vline()  # Vertical lines

# Spoke Geoms
geom_spoke()  # Line segments defined by angles and lengths

# Quantile Plot
geom_quantile()  # Quantile regression

# Interactive Geoms (from ggiraph package)
geom_point_interactive()  # Interactive points
geom_bar_interactive()  # Interactive bars
geom_tile_interactive()  # Interactive tiles

# Map Geoms
geom_map()  # Map data
geom_sf()  # Simple features (for maps using sf objects)
geom_sf_label()  # Label points in sf plots
geom_sf_text()  # Text annotations in sf plots

# Error Handling Geoms
geom_blank()  # Useful for creating empty plots or dummy layers

# Custom and Unusual Geoms
geom_freqpoly()  # Frequency polygon, like a line histogram
geom_hex()  # Hexagonal binning, useful for overplotting in 2D
geom_quantile()  # Quantile regression
geom_rug()  # Adds tick marks at the margins
geom_treemap()  # Treemap, requires treemapify package
geom_qq()  # Quantile-Quantile plot
geom_qq_line()  # QQ plot line

# Ribbon Plot (shaded regions)
geom_ribbon()  # Shaded area between two lines
geom_violin()  # Violin plot, showing distribution
Statistical Transformation
Data is typically processed before it is plotted; e.g. a boxplot calculates the median and inter-quartile range, while a smoother calculates an interpolated set of values.
Each geometric object has a default stat_ that may be overridden. Avoid doing so: generally you should instead use a different geom whose default is the statistical transformation you want.
Geometric Object  Statistical Transform  Statistical Transform Identifier  Description
geom_bar          stat_count()           "count"                           counts the number of cases at each x position
geom_col          stat_identity()        "identity"                        leaves the data as is
geom_freqpoly     stat_bin()             "bin"                             display the counts with lines (histogram line)
geom_histogram    stat_bin()             "bin"                             display the counts with bars (histogram bars)
geom_smooth       stat_smooth()          "smooth"                          aids the eye in seeing patterns
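To make the first two pairings concrete, here is a minimal sketch on made-up data (the `obs` and `totals` data frames are assumptions for illustration). It compares what `stat_count()` computes for `geom_bar()` with what `stat_identity()` passes through for `geom_col()`, again reading the results back with `layer_data()`.

```r
library(ggplot2)

# Hypothetical toy data: one row per observation
obs <- data.frame(kind = c("a", "b", "a", "c", "a", "b"))

# geom_bar() uses stat_count(): bar heights are computed from the rows
p_count <- ggplot(obs, aes(x = kind)) + geom_bar()
computed <- layer_data(p_count)

# geom_col() uses stat_identity(): bar heights are taken from a column
totals <- as.data.frame(table(kind = obs$kind))  # columns: kind, Freq
p_identity <- ggplot(totals, aes(x = kind, y = Freq)) + geom_col()
as_is <- layer_data(p_identity)
```

Both plots show the same bars; the difference is whether the counting is done by ggplot2 or before the data reaches it.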
Stat commands
# Basic Stats
stat_identity()  # Leaves data unchanged, useful when raw data should be plotted
stat_count()  # Counts the number of occurrences of each x value (used in bar plots)
stat_bin()  # Bins data along the x-axis (used in histograms)
stat_smooth()  # Computes a smoothed conditional mean (e.g., LOESS or regression lines)
stat_boxplot()  # Computes boxplot statistics
stat_summary()  # Summarizes y values at unique x positions (e.g., mean, median)
stat_summary_bin()  # Summarizes y values at binned x positions

# Density Stats
stat_density()  # Kernel density estimate
stat_density_2d()  # 2D kernel density estimate, computes contour lines
stat_density_2d_filled()  # 2D kernel density estimate with filled contours

# Bin Stats
stat_bin_2d()  # Bins two-dimensional data into rectangles (rectangular heatmaps)
stat_binhex()  # Hexagonal binning of 2D data

# Function Stats
stat_function()  # Computes a function on a range of x values

# Contour Stats
stat_contour()  # Computes contour lines for 3D data
stat_contour_filled()  # Computes filled contour regions

# Summary Stats
stat_summary_hex()  # Applies a summary function to binned hexagonal data
stat_summary_2d()  # Applies a summary function to 2D binned data

# Ellipse Stats
stat_ellipse()  # Computes confidence ellipses (useful in scatter plots)

# ECDF Stats
stat_ecdf()  # Empirical cumulative distribution function

# Rug Stats
stat_rug()  # Not an exported ggplot2 stat; marginal rugs are drawn with geom_rug()

# Quantile Stats
stat_quantile()  # Quantile regression

# Binomial and QQ Stats
stat_qq()  # Quantile-quantile plot, comparing two distributions
stat_qq_line()  # Adds a QQ line (a reference line in a QQ plot)

# Smooth Stats
stat_smooth()  # Adds a smoothed conditional mean (e.g., LOESS, regression)

# Summary Stats with Custom Functions
stat_summary()  # Provides summaries for y values (mean, median, etc.)
stat_summary_bin()  # Summarizes y values in binned x positions
stat_summary_hex()  # Summarizes y values in hexagonal binned positions

# Unique to `geom_sf()`
stat_sf()  # Simple features geometry computations for maps
stat_sf_coordinates()  # Extracts coordinates from simple feature geometries

# Identity Stats
stat_identity()  # No statistical transformation, data is plotted as is

# Custom and Miscellaneous Stats
stat_boxplot()  # Computes boxplot statistics
stat_function()  # Adds mathematical functions
stat_summary()  # Summarizes y values (mean, median, etc.)
stat_unique()  # Filters out duplicated points (keeps unique points)
stat_sf()  # Simple feature (sf) data for spatial plotting

# Deprecated or Less Common Stats
stat_bindot()  # Deprecated in favor of stat_bin for dot plots
Scale
Scalable aesthetics include position, colour, fill, transparency, size, shape, and line type. Scales are modified using the family of functions scale_<aesthetic>_<type>.
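As a minimal sketch of how a scale controls a mapped aesthetic, the example below uses a made-up data frame (`toy` and its `group` column are assumptions for illustration). It attaches a manual colour scale and an explicit y-axis scale, then reads back the colours assigned to each point.

```r
library(ggplot2)

# Hypothetical toy data with a discrete grouping column
toy <- data.frame(x = 1:3, y = c(1, 3, 2), group = c("a", "b", "a"))

p_scaled <- ggplot(toy, aes(x = x, y = y, colour = group)) +
  geom_point() +
  # the colour scale decides which colour each group value becomes
  scale_colour_manual(values = c(a = "red", b = "blue")) +
  # the y position scale decides the range shown on the axis
  scale_y_continuous(limits = c(0, 4))

point_colours <- layer_data(p_scaled)$colour
```

Swapping `scale_colour_manual()` for another colour scale changes the colours without touching the `aes()` mapping.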
Scale commands
# Continuous scales
scale_x_continuous()  # Adjust x-axis for continuous data
scale_y_continuous()  # Adjust y-axis for continuous data

# Discrete scales
scale_x_discrete()  # Adjust x-axis for discrete data
scale_y_discrete()  # Adjust y-axis for discrete data

# Date/time scales
scale_x_date()  # Adjust x-axis for date data
scale_y_date()  # Adjust y-axis for date data
scale_x_datetime()  # Adjust x-axis for datetime data
scale_y_datetime()  # Adjust y-axis for datetime data
scale_x_time()  # Adjust x-axis for time data
scale_y_time()  # Adjust y-axis for time data

# Logarithmic scales
scale_x_log10()  # Logarithmic transformation on x-axis
scale_y_log10()  # Logarithmic transformation on y-axis

# Reverse scales
scale_x_reverse()  # Reverse x-axis
scale_y_reverse()  # Reverse y-axis

# Manual scales
scale_fill_manual()  # Manually specify fill colors
scale_color_manual()  # Manually specify line/point colors
scale_shape_manual()  # Manually specify shape of points
scale_linetype_manual()  # Manually specify line types
scale_size_manual()  # Manually specify size of points

# Gradient scales
scale_fill_gradient()  # Continuous color gradient for fill
scale_fill_gradient2()  # Diverging color gradient for fill
scale_fill_gradientn()  # Multiple color gradient for fill
scale_color_gradient()  # Continuous color gradient for lines/points
scale_color_gradient2()  # Diverging color gradient for lines/points
scale_color_gradientn()  # Multiple color gradient for lines/points

# Brewer scales (for discrete palettes)
scale_fill_brewer()  # Color Brewer palettes for fill
scale_color_brewer()  # Color Brewer palettes for lines/points

# Viridis scales (perceptually uniform)
scale_fill_viridis_d()  # Viridis color palette for discrete fill
scale_color_viridis_d()  # Viridis color palette for discrete lines/points
scale_fill_viridis_c()  # Viridis color palette for continuous fill
scale_color_viridis_c()  # Viridis color palette for continuous lines/points

# Alpha transparency scales
scale_alpha()  # Adjust alpha (transparency)
scale_alpha_continuous()  # Continuous alpha scale
scale_alpha_discrete()  # Discrete alpha scale

# Identity scales (raw values)
scale_fill_identity()  # Use raw values for fill
scale_color_identity()  # Use raw values for color
scale_size_identity()  # Use raw values for size
scale_shape_identity()  # Use raw values for shape
scale_linetype_identity()  # Use raw values for line type

# Other aesthetics
scale_size()  # Adjust size of points/lines
scale_shape()  # Adjust shape of points
scale_linetype()  # Adjust line types
scale_edge_color()  # For geom_edges, adjusts edge color
scale_edge_size()  # For geom_edges, adjusts edge size
scale_edge_linetype()  # For geom_edges, adjusts edge linetype
Coordinate System
Use a coord command to choose the coordinate system and set the axis limits (like framing a scene).
xlim() and ylim(), by contrast, set data limits: they remove out-of-range points (such as outliers) from the data, which can create visible gaps in the plot.
p <- p + xlim(9, 12)                        # Truncate data &
p <- p + ylim(75000 + 5000, 200000 - 5000)  # produce NAs or gaps
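The distinction between zooming and truncating can be checked on a small made-up data frame (`toy` is an assumption for illustration). This sketch counts how many points still have a usable x position after `coord_cartesian()` versus after `xlim()`.

```r
library(ggplot2)

# Hypothetical toy data: five points, two of them outside [2, 4]
toy <- data.frame(x = 1:5, y = c(2, 4, 6, 8, 10))
base <- ggplot(toy, aes(x = x, y = y)) + geom_point()

# Zoom: all five points are kept, only the visible window changes
p_zoom <- base + coord_cartesian(xlim = c(2, 4))

# Data limits: points outside [2, 4] are treated as missing
p_limit <- base + xlim(2, 4)

# Count the points that still have a non-missing x position
kept_zoom  <- sum(!is.na(layer_data(p_zoom)$x))
kept_limit <- sum(!is.na(suppressWarnings(layer_data(p_limit))$x))
```

This matters most when a layer computes a statistic (a smoother or boxplot, say): with data limits the statistic is computed on the truncated data, whereas with `coord_cartesian()` it is computed on all of it.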
Coord commands
# Cartesian Coordinates
coord_cartesian()  # Default Cartesian coordinate system, allows zooming in on plot
coord_fixed()  # Cartesian coordinates with a fixed aspect ratio between x and y axes
coord_equal()  # Alias for coord_fixed(), ensuring equal scaling for x and y

# Flipping Coordinates
coord_flip()  # Flips the x and y axes, useful for horizontal bar plots

# Polar Coordinates
coord_polar()  # Polar coordinates, useful for creating pie charts and circular plots

# Map Projections (for Geospatial Data)
coord_map()  # Projects coordinates onto a 2D plane with an approximation of map-like projections
coord_quickmap()  # Similar to coord_map, but faster and less precise (good for quick geospatial plots)
coord_sf()  # Handles simple feature (sf) objects for map projections in ggplot2

# Transformed Coordinates
coord_trans()  # Apply a transformation to the x and y axes (e.g., log or sqrt transformations)
Position Adjustment
Faceting
Lattice-style graphics (one panel per group) are produced by adding a facet, e.g. + facet_wrap(~State, ncol = 10)
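A minimal faceting sketch, assuming a made-up data frame with a `state` column (the data and column names are illustrative, not the housing table): each distinct value of the faceting variable gets its own panel.

```r
library(ggplot2)

# Hypothetical toy data: three states, three quarters each
toy <- data.frame(
  state   = rep(c("AK", "AL", "AR"), each = 3),
  quarter = rep(1:3, times = 3),
  value   = c(1, 2, 3, 2, 3, 4, 3, 4, 5)
)

# One panel per state, laid out in two columns
p_facets <- ggplot(toy, aes(x = quarter, y = value)) +
  geom_line() +
  facet_wrap(~state, ncol = 2)

# Each row of the built layer data records which panel it belongs to
n_panels <- length(unique(layer_data(p_facets)$PANEL))
```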
Themes
Themes set the general, non-data appearance of a plot (background, grid lines, fonts, legend placement).
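As a minimal sketch (the `toy` data is an assumption for illustration), a complete theme such as `theme_bw()` can be combined with `theme()` to adjust individual elements, and the result added to a plot like any other layer.

```r
library(ggplot2)

# Start from a complete theme, then override one setting
my_theme <- theme_bw() + theme(legend.position = "bottom")

# Hypothetical toy data to attach the theme to
toy <- data.frame(x = 1:3, y = c(1, 3, 2), group = c("a", "b", "a"))

p_themed <- ggplot(toy, aes(x = x, y = y, colour = group)) +
  geom_point() +
  my_theme

# The override is stored on the theme object
legend_pos <- my_theme$legend.position
```

`theme_set(my_theme)` would instead make it the default for all later plots in the session.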
Theme commands
# Base Theme Functions
theme_gray()  # Default theme with a gray background and white grid lines
theme_bw()  # A white background theme with gray grid lines and a black panel border
theme_minimal()  # A minimalistic theme with no background annotations
theme_light()  # A white background theme with light gray lines and axes
theme_dark()  # A theme with a dark gray background
theme_classic()  # A classic theme with axis lines, no gridlines and a white background
theme_void()  # A completely empty theme (no axes, gridlines, or text)
theme_test()  # A theme for visual unit tests

# Specialized Themes
theme_linedraw()  # Black and white theme with no fill colors
theme_eco()  # Environmentally-friendly theme with light colors (from ggthemes)
theme_economist()  # A theme inspired by The Economist style (requires ggthemes)
theme_few()  # A minimal theme based on Stephen Few's principles (from ggthemes)
theme_grey()  # Alias for theme_gray (default theme)
theme_tufte()  # Minimalistic theme following Edward Tufte's design principles (from ggthemes)
theme_map()  # A map-friendly theme (useful for choropleths)
theme_solarized()  # Solarized light and dark theme options (from ggthemes)
theme_wsj()  # Wall Street Journal inspired theme (from ggthemes)
theme_excel()  # Excel-style themes (from ggthemes)

# Customization Helpers
theme()  # Customizable theme that lets you adjust individual plot elements
theme_set()  # Set the default theme for future plots
theme_update()  # Update the current theme with new settings
theme_replace()  # Replace the existing theme with new settings

# Theme Element Functions (for customizing specific elements)
element_line()  # Customize line elements (e.g., axis lines)
element_rect()  # Customize rectangular elements (e.g., plot background)
element_text()  # Customize text elements (e.g., axis titles, labels)
Examples
Scatterplot
Example: Linear Trendline
Let’s add a trend line…
housing2 <- housing[Date == 20011]
# fit Structure.Cost against log(Land.Value) and store the fitted values
fitted.cost <- predict(lm(housing2[, Structure.Cost] ~ log(housing2[, Land.Value])))
housing2[, pred.structure.cost := fitted.cost]
p <- ggplot(data = housing2,
            mapping = aes(y = Structure.Cost, x = log(Land.Value))) +
  coord_cartesian(xlim = c(9, 12), ylim = c(75000, 200000)) +
  xlim(9, 12) +                        # Truncate data &
  ylim(75000 + 5000, 200000 - 5000)    # produce NAs or gaps
p <- p + geom_point(aes(color = Home.Value))
p <- p + geom_line(aes(y = pred.structure.cost))
p <- p + theme_bw(base_family = "times")
print(p)
Warning: Removed 18 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 17 rows containing missing values or values outside the scale range
(`geom_line()`).
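As an alternative to computing the predictions by hand, `geom_smooth(method = "lm")` fits the same kind of linear model inside the plot. The sketch below uses simulated data rather than the housing table (the `toy` data frame, seed, and coefficients are assumptions for illustration) and compares the line drawn by `geom_smooth()` with a separate `lm()` fit.

```r
library(ggplot2)

# Hypothetical simulated data: a noisy straight line
set.seed(1)
toy <- data.frame(x = seq(1, 10, length.out = 20))
toy$y <- 3 + 2 * toy$x + rnorm(20, sd = 0.5)

# Let ggplot2 fit and draw the linear trend (layer 2)
p_trend <- ggplot(toy, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Fit the same model by hand and predict at the x positions ggplot2 used
fit <- lm(y ~ x, data = toy)
smooth_line <- layer_data(p_trend, 2)
manual_fit <- predict(fit, newdata = data.frame(x = smooth_line$x))
```

The two approaches give the same line; the manual route is still useful when the fitted values are needed elsewhere in the analysis.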
Example: Smoother
if (!require("ggrepel")) {
  install.packages("ggrepel")  # Install if not already installed
}
totHomeValPerState <- housing[, .(Total.Home.Value = sum(Home.Value)), by = State]
p <- ggplot(data = totHomeValPerState,  # Pre-summarised data
            mapping = aes(x = State, y = Total.Home.Value)) +
  geom_bar(stat = "identity")  # the default "count" (or "bin") wouldn't work with a y aesthetic
# could have instead used + geom_col(), which defaults stat to "identity"
print(p)