
# How to make a scatterplot in RStudio
Scatterplots are one of the most common types of data visualizations you will encounter as a biologist. They present the relationship between two continuous variables. We might take them for granted because of their simplicity, but we shouldn’t assume that the apparent ease with which we read these figures comes for free. They are a powerful tool, but one that I believe merits a bit more attention. Check out this really cool article from the New Yorker about ‘When graphs are a matter of life and death’ for more history on the subject.

All through my grad school years and beyond, I’ve repeatedly come across scatterplots that almost defeat the purpose of helping us easily understand the relationship between two variables. Here’s a typical example of the type of plot I’ve seen one too many times:

There are several issues here, but without elaborating, here are the same data after a few visual tweaks:

In other words, while the data may be accurate, the actual visual design of scatterplots is often overlooked and neglected. Unlike a statistical test, the goal of a data visualization is subjective: to help a viewer understand a particular relationship or story. For that reason, it is important that we take a subjective, and dare I say aesthetic, approach towards ensuring scatterplots (and all other plot types, really) are visually appealing and easy to understand at a quick glance.

I have a hunch that the main reason plots such as the first one above are so common is simply not knowing how to easily customize plots in R. Unfortunately, even ggplot2, which is commended for the ease with which one can make good quality visualizations, is not so pretty right out of the box. Here is a simple tutorial on how to re-create the nice version of the plot above using only ‘base’ R graphics. The key is just to include a few additional parameters and functions. In the future I may update this post with how to do this using ggplot2.

First, let’s load the data.
