Beginner's guide to R: Painless data visualization

Part 4 of our hands-on guide covers simple graphics, bar graphs and more complex charts.

1 2 3 4 5 6 7 8 Page 4
Page 4 of 8

Add main="Graph of demand" if you want a main headline on your graph:

barplot(BOD$demand, main="Graph of demand")

   Bar plot
Bar chart with R's bar plot() function.

To label the bars on the x axis, use the names.arg argument and set it to the column you want to use for labels:

barplot(BOD$demand, main="Graph of demand", names.arg = BOD$Time)

Sometimes you'd like to graph the counts of a particular variable but you've got just raw data, not a table of frequencies. R's table() function is a quick way to generate counts for each factor in your data.

The R Graphics Cookbook uses an example of a bar graph for the number of 4-, 6- and 8-cylinder vehicles in the mtcars data set. Cylinders are listed in the cyl column, which you can access in R using mtcars$cyl.

Here's code to get the count of how many entries there are by cylinder with the table() function; it stores results in a variable called cylcount:

cylcount <- table(mtcars$cyl)

Bar plot
Creating a bar plot.

That creates a table called cylcount containing:

4 6 8

11 7 14

Now you can create a bar graph of the cylinder count:

barplot(cylcount)

ggplot2's qplot() quick plotting function can also create bar graphs:

qplot(mtcars$cyl)

Blank variables
What happens to your bar chart when you don't instruct R not to plot continuous variables.

However, this defaults to an assumption that 4, 6 and 8 are part of a variable set that could run from 4 through 8, so it shows blank entries for 5 and 7.

To treat cylinders as distinct groups -- that is, you've got a group with 4 cylinders, a group with 6 and a group with 8, not the possibility of entries anywhere between 4 and 8 -- you want cylinders to be treated as a statistical factor:

qplot(factor(mtcars$cyl))

To create a bar graph with the more robust ggplot() function, you can use syntax such as:

ggplot(mtcars, aes(factor(cyl))) + geom_bar()

Histograms

Histograms work pretty much the same, except you want to specify how many buckets or bins you want your data to be separated into. For base R graphics, use:

hist(mydata$columnName, breaks = n)

where columnName is the name of your column in a mydata dataframe that you want to visualize, and n is the number of bins you want.

1 2 3 4 5 6 7 8 Page 4
Page 4 of 8
Download: EMM vendor comparison chart 2019
  
Shop Tech Products at Amazon