Creating a ggplot
First, you will need to install the package ggplot2 on your machine, then load the package with the usual
The starting point for creating a plot is to use the
ggplot function with the following basic structure:
plot1 <- ggplot(data, aes(x variable, y variable)) + geom_graph.type() print(plot1)
data is a data frame that holds the variables that you would like to plot.
aes specifies which variables to plot. This code commonly causes confusion when creating ggplots. While
aes stands for aesthetics, in ggplot it does not relate to the visual look of the graph but rather what data you want to see in the graph. It specifies what the graph presents rather than how it is presented.
+ geom_graph.type specifies what sort of plot you want to make. ggplot will not work unless you have this added on. You don’t actually type ‘graph.type()’, but choose one of the types of graph.
Commonly used examples include:
- scatterplot –
- boxplot –
- histogram –
- bar plot –
Ensure there is a open and closed bracket after the
geom code. This tells R to make the graph with the basic standard formatting for this graph type.
The type of graph you want to make has to match the classes of the inputs. For example, a scatterplot would require both variables to be numeric. Or a boxplot would require the x variable to be a factor and the y variable to be numeric. You can check the class of any variable with the
class function, or all variables in a data frame with the
A helpful practise when making ggplots is to assign the plot you’ve made to an object (e.g., plot1 in the code above) and then ‘print’ it separately. As your ggplot becomes more complicated this will make it much easier.
In these examples, let’s use a data set that is already in R with the length and width of floral parts for three species of iris. First, load the data set:
To contrast petal length across the three species of iris with a box plot, we would use:
plot1 <- ggplot(iris, aes(Species, Petal.Length)) + geom_boxplot() print(plot1)
aes included the x then the y variables which you wanted to plot, and that
+ geom_boxplot() specified a boxplot.
To plot a frequency histogram of petal length, we would use:
plot2 <- ggplot(iris, aes(Petal.Length)) + geom_histogram() print(plot2)
To use a scatter plot the visualise the relationship between petal length and sepal length, we would use:
plot3 <- ggplot(iris, aes(Sepal.Length, Petal.Length)) + geom_point() print(plot3)
These points could then be coloured according to the levels of a categorical variable. To do this, add
colour="categorical.variable" in the
aes brackets. To colour by species, we would use:
plot4 <- ggplot(iris, aes(Sepal.Length, Petal.Length, colour=Species)) + geom_point() print(plot4)
Adding a basic theme
The overall look of the ggplot can be altered with different set themes. This can be done by adding
+ theme_bw() or
+ theme_classic() to the end of your line of code. Like
geom(), ensure there is open and closed brackets after the theme name.
plot5 <- ggplot(iris, aes(Sepal.Length, Petal.Length, colour=Species)) + geom_point() + theme_bw() print(plot5)
plot6<-ggplot(iris, aes(Sepal.Length, Petal.Length, colour=Species)) + geom_point() + theme_classic() print(plot6)
To further customise the aesthetics of the graph, including colour and formatting, see our other ggplot help pages:
Help on all the ggplot functions can be found at the The master ggplot help site.
A useful cheat sheet on commonly used functions can be downloaded here.
Chang, W (2012) R Graphics cookbook. O’Reilly Media. – a guide to ggplot with quite a bit of help online here
Author: Fiona Robinson
##  "Mon May 23 10:23:47 2016"