Chapter 3 Packages

As you likely know by now, R is an open source programming language. The most obvious benefit for you as a student is that R is free, which is a huge benefit to professional statisticians as well. In addition, R’s open source design means that anyone can contribute to the R project and develop add-ons and extensions called packages.

These packages are also free and usually pretty easy to access. CRAN, the Comprehensive R Archive Network, is maintained by the CRAN team (a group of volunteers that overlaps with the R Core Team) and houses thousands of contributed packages. Other places you might find packages include sites like Bioconductor or GitHub. Packages on CRAN go through many more checks than packages hosted on GitHub and are generally considered more stable.

Though most of the methods you will learn in an introductory statistics class are available in R out-of-the-box, you may need to use extension packages from time to time. Packages I have my students use include:

  • readxl for importing data from Microsoft Excel files
  • rmarkdown for report writing
  • ggplot2 for graphics
  • epitools for functions to compute risk ratios and odds ratios

There are also packages that include datasets you may need to use.

Installing and Loading Packages

Getting packages ready to use is a two-step process. The steps are not difficult, but it is important to remember when you do and do not need to repeat them. This process is summarized in Figure 3.1.

Figure 3.1: The process for accessing a package includes downloading it from the internet and loading it into your R session.

First, you will need to download the package from the internet. In R terminology, this is called “installing” the package. If the package you want is on CRAN, you can use the function install.packages for this. For example, to install readxl, run

install.packages('readxl')

Notice that the package name is in quotation marks.¹

Installing a package is something you only have to do once on your computer. Once the package is downloaded, you have it. If you use another computer or need to completely uninstall R for some reason, you will need to install the package again. Otherwise, this is a one-time process.²
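Because installing only needs to happen once, it can be handy to check whether a package is already on your computer before downloading it again. The following is a minimal sketch of that idea, not something the course requires. The helper name install_if_missing is my own invention for illustration; requireNamespace and install.packages are standard R functions.

```r
# Hypothetical helper (not part of R itself): install a CRAN package
# only when it is not already available on this computer.
install_if_missing <- function(pkg) {
  # requireNamespace() returns TRUE if the package can be found and loaded
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))  # already installed, nothing to download
  }
  install.packages(pkg)       # downloads from CRAN; needs internet access
  invisible(TRUE)
}

# 'stats' ships with R, so this call does not download anything
needed_install <- install_if_missing('stats')
```

To use the same check for readxl, you would call install_if_missing('readxl'), which downloads the package only if it is missing.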

Once you have installed the package, it is on your computer, but R does not automatically load every installed package into your R session. This is primarily to save time and memory, but there are other reasons as well, such as multiple packages having functions with the same name. In order to load the package into your R session, use the library function:

library(readxl)

Notice that here the package name is not in quotation marks. Now all the functionality and data in the package are available to you. If you want to know what functions or datasets a package includes, you can use the help function. For example, to see what is in readxl, run

help(package='readxl')
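If you would rather see a package's contents in the console than on a help page, the sketch below shows one way to do it. It uses the stats and datasets packages, which come with R and are loaded by default, so it runs without installing anything. The same calls should work for a package like readxl once it has been loaded with library. The variable names are only for illustration.

```r
# Names of the objects that the attached 'stats' package makes available
stats_objects <- ls('package:stats')
head(stats_objects)

# Check whether a particular function is among them
has_median <- 'median' %in% stats_objects

# Names of the datasets bundled with the 'datasets' package
dataset_info <- data(package = 'datasets')
dataset_names <- dataset_info$results[, 'Item']
has_mtcars <- 'mtcars' %in% dataset_names
```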

Unlike installing, loading packages needs to be done every time you start a new R session. With every session, you start with a minimal number of packages. Any packages you need that you had to download from the internet (and some others that come pre-installed) will need to be loaded into your new session.
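To check which packages are loaded in your current session, you can use the search function. The short sketch below also shows the :: operator, which this chapter does not otherwise cover. It calls a single function from an installed package without loading the whole package, and it is one way to be explicit when two packages have functions with the same name. The example sticks to the stats package, which comes with R.

```r
# search() lists the packages (and other environments) attached to this session
attached <- search()
stats_attached <- 'package:stats' %in% attached

# package::function calls one function without attaching the whole package
middle <- stats::median(c(1, 3, 5))
```

With readxl installed, the same pattern would be readxl::read_excel followed by the usual arguments in parentheses.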

As with most topics in this book, there is plenty more to say about using, managing, and creating packages that is beyond the scope of this guide. More resources are available in Additional R Resources.


  1. You can use single quotation marks, as done here, or double quotation marks. You just need to use the same kind to start and end the quotation.

  2. You can also update packages this way, but for the purposes of a single course this is not very relevant.