Beginner's guide to R: Painless data visualization

Part 4 of our hands-on guide covers simple graphics, bar graphs and more complex charts.

Page 7 of 8

But what if you want the scores 80 and above to be blue and the lower scores to be red? To do this, create a vector of colors of the same length and in the same order as your data, adding a color to the vector based on the data. In other words, since the first test score is 96, the first color in your color vector should be blue; since the second score is 71, the second color in your color vector should be red; and so on.

Of course, you don't want to create that color vector manually! Here's a statement that will do so:

testcolors <- ifelse(testscores >= 80, "blue", "red")

If you've got any programming experience, you might guess that this statement loops through the testscores data and runs a conditional on each entry: 'If this entry in testscores is greater than or equal to 80, add "blue" to the testcolors vector; otherwise add "red" to the testcolors vector.' The result is a testcolors vector with one color for each score.
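To make that description concrete, here is a rough sketch that writes the same logic as an explicit loop and checks that it matches the one-line ifelse() version. It uses the testscores values given in this guide; the name testcolors_loop is made up for this illustration only.

```r
testscores <- c(96, 71, 85, 92, 82, 78, 72, 81, 68, 61, 78, 86, 90)

# The one-line version from the article
testcolors <- ifelse(testscores >= 80, "blue", "red")

# The same logic spelled out as a loop
# (testcolors_loop is a hypothetical name used only in this sketch)
testcolors_loop <- character(length(testscores))
for (i in seq_along(testscores)) {
  if (testscores[i] >= 80) {
    testcolors_loop[i] <- "blue"
  } else {
    testcolors_loop[i] <- "red"
  }
}

# Both approaches should produce the same vector of colors
identical(testcolors, testcolors_loop)
```

In practice you would normally stick with ifelse(), since it handles the whole vector in one statement.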

Now that you've got a vector of colors matched to your vector of scores, just pass the testcolors vector as the col argument:

barplot(testscores, col=testcolors)

Note that the name of a color must be in quotation marks, but the name of a variable that holds a vector of colors should not be.
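One way to see the difference, sketched here under the assumption that the scores are the ones used in this guide: without quotes, R looks up the variable and finds the color names stored in it; with quotes, R sees only the literal text "testcolors", which is not a color. The check uses col2rgb() from the grDevices package that ships with R, and quoted_is_color is a name invented for this example.

```r
testscores <- c(96, 71, 85, 92, 82, 78, 72, 81, 68, 61, 78, 86, 90)
testcolors <- ifelse(testscores >= 80, "blue", "red")

# Unquoted: R looks up the variable and returns the color names it holds
head(testcolors, 3)

# Quoted: "testcolors" is just a string, and it is not a valid color name.
# col2rgb() raises an error for an unknown color name, so tryCatch() is used
# to record the failure instead of stopping the script.
quoted_is_color <- tryCatch({
  col2rgb("testcolors")
  TRUE
}, error = function(e) FALSE)

quoted_is_color
```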

Add a graph headline:

barplot(testscores, col=testcolors, main="Test scores")

And have the y axis go from 0 to 100:

barplot(testscores, col=testcolors, main="Test scores", ylim=c(0,100))

Then use las=1 to make the axis labels horizontal instead of rotated 90 degrees:

barplot(testscores, col=testcolors, main="Test scores", ylim=c(0,100), las=1)

And you've got a color-coded bar graph.

A color-coded bar graph.

By the way, if you wanted the scores sorted from highest to lowest, you could have set your original testscores variable to:

testscores <- sort(c(96, 71, 85, 92, 82, 78, 72, 81, 68, 61, 78, 86, 90), decreasing = TRUE)

The sort() function defaults to ascending sort; for descending sort you need the additional argument: decreasing = TRUE.
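As a quick illustration of the two sort orders, using the same scores (the comments show the values each call returns):

```r
testscores <- c(96, 71, 85, 92, 82, 78, 72, 81, 68, 61, 78, 86, 90)

# Default: ascending order
sort(testscores)
# 61 68 71 72 78 78 81 82 85 86 90 92 96

# Descending order
sort(testscores, decreasing = TRUE)
# 96 92 90 86 85 82 81 78 78 72 71 68 61

# sort() returns a new vector; testscores itself is unchanged
testscores[1]
# 96
```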

If the single statement that combines c() and sort() is starting to seem unwieldy to you as a beginner, break it into two lines for easier reading, and perhaps also store the sorted version in a new variable:

testscores <- c(96, 71, 85, 92, 82, 78, 72, 81, 68, 61, 78, 86, 90)

testscores_sorted <- sort(testscores, decreasing = TRUE)
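One thing to watch if you plot the sorted version: the color vector has to be rebuilt from the sorted scores, or the colors will no longer line up with the bars. A minimal sketch, where testcolors_sorted is a name chosen for this example rather than one used elsewhere in this guide:

```r
testscores <- c(96, 71, 85, 92, 82, 78, 72, 81, 68, 61, 78, 86, 90)
testscores_sorted <- sort(testscores, decreasing = TRUE)

# Rebuild the colors from the sorted scores so each bar gets the right color
# (testcolors_sorted is a hypothetical name for this sketch)
testcolors_sorted <- ifelse(testscores_sorted >= 80, "blue", "red")

# Same plot options as before, applied to the sorted data
barplot(testscores_sorted, col = testcolors_sorted, main = "Test scores",
        ylim = c(0, 100), las = 1)
```

With the scores in descending order, all of the blue bars (80 and above) appear first, followed by the red ones.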
