<span class="header1">Welcome</span>
Welcome to this interactive tutorial. This tutorial acts as a short introduction to R and RStudio.
In today's workshop you will learn some basics around using RStudio.
<strong>To start us off, what is your name?</strong>
<<textbox "$name" "">>
Now, let's get started.
[[Proceed|Intro1]]
----
<span class="header2">Table of Contents</span>
The <strong>Table of Contents</strong> below is intended to help you navigate back to a point in the tutorial if you need to stop halfway through. You can click the 'back to start' <strong>'<<'</strong> link at the bottom of each page to return to this page.
<span class="header2">Preliminary Steps</span>
[[Preliminaries: R and RStudio|Intro1]]
[[Quick Overview|Intro2]]
[[R is a calculator|Intro3]]
[[R is a scientific calculator|Intro5]]
[[R allows you to bind numbers into datasets|Intro7]]
[[The standard way to hold data in R is a 'dataframe'|Intro9]]
[[Selecting columns out of a dataframe|Intro11]]
[[Basic statistical tests and figures|Intro12]]
[[Finish|Intro14]]
<img src="http://cpjohnstone.com/wp-content/uploads/2018/09/R_logo.png" alt="past" width="10%" height="auto"/> <img src="http://cpjohnstone.com/wp-content/uploads/2018/09/R_studio_logo.png" alt="past" width="10%" height="auto"/>
<span class="header1">Starting Off</span>
Hi, $name. Before we get started, it's worth explaining what R and RStudio are and why we are using them.
R is a language for statistical computing, based on an earlier language called 'S'. It was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand. The first version was released in 1995, and a stable version (1.0.0) followed in 2000.
<span class="header2">Why use R?</span>
We have a lot of good reasons for preferring R to other statistical programs like SPSS or GraphPad Prism.
* R is free
* R is extraordinarily stable and can handle huge datasets. I've personally opened the entire US census data in R (all of it, for all censuses in one multi-gigabyte text file). No other program I tried was even remotely capable of this.
* R has a huge amount of support from generous-minded people. There is a veritable army of stats monkeys crawling all over it fixing bugs.
* Code can be saved and rerun easily. If someone questions the statistics in a ten-year-old paper, you can just go back and check your old R code. If you used a point-and-click program, there is no way you'll be able to easily reconstruct what you did ten years ago.
<span class="header2">But R has its drawbacks</span>
* On the other hand R has a steep learning curve. Because it is a programming language, you need to learn the syntax of the language.
* The error messages are often esoteric and incomprehensible, especially if you are new to R. This can leave you feeling very lost when something goes wrong.
* Unfortunately, if you are already familiar with languages like Python or Ruby, this won't help you much. R feels like exactly what you would expect to get if two stats experts with no in-depth knowledge of programming languages decided to make up a computing language.
That said, R isn't too bad once you get used to it. It can seem scary at first, but before long you'll find that it starts to make (at least) some sort of sense.
<span class="header2">RStudio</span>
In this tutorial we will be using RStudio. R is a fully independent program and doesn't require RStudio, but it is hard to use on its own.
RStudio is a 'wrap-around' for R. It adds a whole lot of functionality that improves our experience of using R. Luckily for us, RStudio is also free for personal use.
If you haven't already, you need to <a href="https://www.r-project.org/" target="_blank">download and install R first</a>, and then <a href="https://www.rstudio.com/products/rstudio/download/" target="_blank">download and install RStudio</a> before proceeding.
If you prefer, you can use the <a href="https://rstudio.cloud/" target="_blank">RStudio Cloud Service</a> instead (although be aware this may be moving to a paid-for-use model).
The main reason why I prefer to use a downloaded version of RStudio is that I can work offline (i.e. in a remote location, or on fieldwork) and the downloaded app is (generally) a bit less buggy than the online version.
Let's start with a quick overview of RStudio.
[[Next|Intro2]]
----
[[<|start]]
[[<<|start]]<span class="header2">Quick Overview</span>
RStudio is divided into four window spaces.
* The top-left is where code is written (and saved)
* The bottom-left is where code is executed (if you are using a standard R script)
* The top-right shows objects, datasets and other things in your working space. This is useful, because if you import something (like a dataset) and can't remember what you named it, you can check here.
* The bottom right is where the help menu and graphs appear.
<span class="header2">R Script versus R Notebook</span>
You can choose to use either an R Script or an R Notebook to write and save code throughout this tutorial. I prefer R Scripts because I find them more stable and less prone to crash. However, R Notebooks make collaborating on code in a team easier. It's just a matter of preference.
* To create an R Script select <strong>File</strong>, <strong>New File</strong>, <strong>R Script</strong>.
* To create an R Notebook select <strong>File</strong>, <strong>New File</strong>, <strong>R Notebook</strong>.
Here's what you should see if you create a <strong>Notebook</strong>.
<img src="http://cpjohnstone.com/wp-content/uploads/2018/09/R02.png" alt="past" width="100%" height="auto"/>
Here's what you should see if you create a <strong>Script</strong>.
<img src="http://cpjohnstone.com/wp-content/uploads/2018/09/R01.png" alt="past" width="100%" height="auto"/>
<span class="header2">Step 1: Create a Script or Notebook</span>
As per your preference, create either a Script or Notebook.
<span class="header2">Step 2: Write a header</span>
You can use hashtags in R to add notes or headings that help keep your code clear. Anything after one (or more) hashtags won't be read as 'code' by R. It's just a note for you. Type the following into your Script or Notebook.
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #008000;">### R INTRO ###</span></span></strong><br /><strong><span style="font-family: 'Courier New', Courier;"><span style="color: #008000;"># This is a basic intro to R</span></span></strong>
Note that this block of text is shown in green here. I will be colour coding text to make it a bit easier for you to follow. Notes will always be in green.
<span class="header2">Step 3: Save your file</span>
So far, your file doesn't have any code in it, but that's fine. Let's save a copy. You can select <strong>File</strong>, <strong>Save</strong>. Alternatively, you can click on the little blue floppy disc save icon just above your Script or Notebook.
You can call this file anything you want. It could be <strong>$name R file</strong> or <strong>I_hate_R</strong> or anything you like. Just be sure to end the name with the correct extension.
* If you are using a script add .R to the end of the file like this: this_is_my_R_file.R
* If you are using a notebook add .Rmd to the end of the file like this: this_is_my_R_file.Rmd
Excellent. Once you have done that, you have a file to save your code in. We can start working through some of the basics of R scripting.
[[Next|Intro3]]
----
[[<|Intro1]]
[[<<|start]]<span class="header1">R is a calculator</span>
It's easy to get carried away with all the fancy things R can do, and forget that at its heart, R is a straightforward scientific calculator.
Type the following into your script or notebook and run it.
<span style="font-family: 'Courier New', Courier;"><strong><span style="color: #0000ff;">3 <span style="color: #000000;">+ </span>8</span></strong></span>
You can run a line of code by placing your cursor on it and then clicking the 'Run' button. Alternatively, you can highlight the code and use <strong>command-return</strong> (Mac) or <strong>Control-Enter</strong> (Windows).
Also, if you are using a Notebook, remember to create new chunks from time to time to keep your code separated into sensible portions. If you are using a Script, adding headings and notes using hashtags as you go is sensible.
Once you have run the above 'code' you should see the answer (11) appear either in the bottom-left window (Script) or under your chunk (Notebook).
The mathematical operations are all what you would expect. Try these.
<span style="font-family: 'Courier New', Courier;"><strong><span style="color: #0000ff;">11 <span style="color: #000000;">/ </span>2</span></strong></span>
<span style="font-family: 'Courier New', Courier;"><strong><span style="color: #0000ff;">20 <span style="color: #000000;">-</span> 4</span></strong></span>
<span style="font-family: 'Courier New', Courier;"><strong><span style="color: #0000ff;">13 <span style="color: #000000;">*</span> 7</span></strong></span>
<span style="font-family: 'Courier New', Courier;"><strong><span style="color: #0000ff;"><span style="color: #000000;">(</span></span><span style="color: #0000ff;">11 <span style="color: #000000;">/ </span>2<span style="color: #000000;">)*</span>5</span></strong></span>
You can also allocate numbers (or other things) to placeholders. The way we do this is with an assignment arrow. You'll notice now that we are using blue and black colour coding too. <strong>Blue</strong> will be used for anything that you can change or alter yourself, such as a number or the name of a variable or a dataset. <strong>Black</strong> will be used for basic syntactical things, like commas, multipliers, brackets or equal signs.
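As a rough sketch of how the arrow works (the names a, b and result, and the numbers, are made up purely for illustration):

```r
# Store two numbers in placeholders (the names 'a' and 'b' are arbitrary)
a <- 10
b <- 4

# Typing a name on its own prints what it holds
a

# Placeholders can be used anywhere you would use the number itself
a - b

# The answer to a calculation can be stored in a placeholder too
result <- a * b
result
```

Once a placeholder exists it will also show up in the top-right window of RStudio.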
Now try this:
<span style="font-family: 'Courier New', Courier;"><strong><span style="color: #0000ff;">x <span style="color: #000000;"><- </span>3</span></strong></span>
<span style="font-family: 'Courier New', Courier;"><strong><span style="color: #0000ff;">y <span style="color: #000000;"><- <span style="color: #0000ff;">1.2</span></span></span></strong></span>
<span style="font-family: 'Courier New', Courier;"><strong><span style="color: #0000ff;"><span style="color: #000000;">((</span></span><span style="color: #0000ff;">6 <span style="color: #000000;">/ </span>x<span style="color: #000000;">)*</span>5<span style="color: #000000;">)^</span>y</span></strong></span>
<strong>What was the answer? Write this to one decimal place.</strong>
<<textbox "$q1" "">>
[[Check your answer|Intro4]]
----
[[<|Intro2]]
[[<<|start]]
<<if $name is "cheat">>
<span class="cheat"> 15.8 </span>
<</if>>
<span class="header2">Your answer was $q1</span>
<<if $q1 is "15.8">>Correct! Great work, $name! [[proceed|Intro5]]
<<else>>Hmm, that doesn't look right. Maybe [[try again|Intro3]]
<</if>>
----
[[<|Intro3]]
[[<<|start]]<span class="header1">R is a scientific calculator</span>
You can use R to apply scientific functions to numbers, such as square roots or logs. Try these:
<span style="font-family: 'Courier New', Courier;"><strong><span style="color: #0000ff;"><span style="color: #ff6600;">sqrt</span><span style="color: #000000;">(</span>31<span style="color: #000000;">)</span></span></strong></span>
<span style="font-family: 'Courier New', Courier;"><strong><span style="color: #0000ff;"><span style="color: #ff6600;">log</span><span style="color: #000000;">(<span style="color: #0000ff;">1.9</span>)</span></span></strong></span>
Note that I have coloured these commands in orange. Both <strong>sqrt</strong> and <strong>log</strong> are 'functions' in R. Functions sit outside a bracket and apply to everything inside the bracket. Most of the statistical tools in R that you will use are functions of some sort or another.
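One thing worth knowing is that log in R gives the natural log (base e) unless you tell it otherwise. Here's a quick sketch, with numbers picked just for illustration:

```r
log(100)             # natural log (base e), about 4.61
log10(100)           # base-10 log, gives 2
log(100, base = 10)  # the same base-10 log, written using the 'base' argument
log(exp(1))          # exp(1) is e, so the natural log of it is 1
```

If a method section or textbook just says 'log', it is worth checking which base is meant before you copy the calculation.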
We can 'nest' functions inside brackets if we wanted to. For example, an arcsine square root transformation is a standard transformation for proportions. We could write it like this:
<span style="font-family: 'Courier New', Courier;"><strong><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #ff6600;">asin</span>(</span><span style="color: #000000;"><span style="color: #0000ff;"><span style="color: #ff6600;">sqrt</span><span style="color: #000000;">(</span></span><span style="color: #000000;"><span style="color: #0000ff;">0.8</span>)</span>)</span></span></strong></span>
However, it is really easy to get confused with this sort of 'nested' coding. You can start to lose track of brackets. It is better to use the assignment arrow to drop a number (or anything else) into an object, and then work with the object. This is a better way to write the exact same operation as above.
<span style="font-family: 'Courier New', Courier;"><strong><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #0000ff;">x</span> <- </span><span style="color: #000000;"><span style="color: #ff6600;">sqrt</span>(<span style="color: #0000ff;">0.8</span>)</span></span></strong></span>
<span style="font-family: 'Courier New', Courier;"><strong><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #ff6600;">asin</span></span><span style="color: #0000ff;"><span style="color: #000000;">(</span></span><span style="color: #000000;"><span style="color: #0000ff;">x<span style="color: #000000;">)</span></span></span></span></strong></span>
Here's a good example of why nested code can be confusing. The following code takes the reciprocal of 5 (that is, 1/5), takes the log of this, squares the answer and then rounds the result to two decimal places. It is confusing to read, even with colour coding.
<span style="font-family: 'Courier New', Courier;"><strong><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #ff6600;">round<span style="color: #000000;">(</span></span>((<span style="color: #ff6600;">log</span></span><span style="color: #000000;">(<span style="color: #0000ff;">1</span>/<span style="color: #0000ff;">5</span>))^<span style="color: #0000ff;">2</span>),<span style="color: #0000ff;">2</span>)</span></span></strong></span>
This second bit of code does the same thing, but uses assignment arrows to keep things simpler. It takes longer to write out, but it is much easier to follow step-by-step. Have a go at entering both of these segments of code into your Script or Notebook and run them. Check that they give the same answer.
<span style="font-family: 'Courier New', Courier;"><strong><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #0000ff;">x</span> <- <span style="color: #0000ff;">1</span>/</span>5</span></strong></span>
<span style="font-family: 'Courier New', Courier;"><strong><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #0000ff;">y</span> <- <span style="color: #ff6600;">log</span>(<span style="color: #0000ff;">x</span>)</span></span></strong></span>
<span style="font-family: 'Courier New', Courier;"><strong><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #0000ff;">z</span> <- <span style="color: #0000ff;">y</span>^<span style="color: #0000ff;">2</span></span></span></strong></span>
<span style="font-family: 'Courier New', Courier;"><strong><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #ff6600;">round</span>(<span style="color: #0000ff;">z</span>,<span style="color: #0000ff;">2</span>)</span></span></strong></span>
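If you'd rather have R do the comparison for you, one option is to store each answer in an object and ask whether they are identical. This is only a sketch, and the object names (nested, step1, step2, step3 and stepwise) are made up for illustration.

```r
# The nested version, stored in an object
nested <- round((log(1/5))^2, 2)

# The step-by-step version, one operation per line
step1 <- 1/5
step2 <- log(step1)
step3 <- step2^2
stepwise <- round(step3, 2)

# TRUE means both approaches gave exactly the same number
identical(nested, stepwise)
```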
<strong>What was the square root of 1085? Write the answer rounded to two decimal places.</strong>
<<textbox "$q2" "">>
[[Check your answer|Intro6]]
----
[[<|Intro4]]
[[<<|start]]
<<if $name is "cheat">>
<span class="cheat"> 32.94 </span>
<</if>>
<span class="header2">Your answer was $q2</span>
<<if $q2 is "32.94">>Correct! Great work, $name! [[proceed|Intro7]]
<<else>>That doesn't look quite right. Remember to round correctly. Best to [[try again|Intro5]]
<</if>>
----
[[<|Intro5]]
[[<<|start]]<span class="header1">R allows you to bind numbers into datasets</span>
Although you will typically be importing data into R, you can just enter numbers and bind them into a dataset using the 'combine' function, <strong>c</strong>.
Let's imagine we had species richness counts for macroinvertebrates at eight locations in a stream (i.e. a count of the number of species caught at each site). We might want to bind these into a single dataset.
Try this:
<span style="font-family: 'Courier New', Courier;"><strong><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #0000ff;">invert.species</span></span><span style="color: #000000;"> <- <span style="color: #ff6600;">c</span>(<span style="color: #0000ff;">4</span>, <span style="color: #0000ff;">5</span>, <span style="color: #0000ff;">0</span>, <span style="color: #0000ff;">0</span>, <span style="color: #0000ff;">3</span>, <span style="color: #0000ff;">3</span>, <span style="color: #0000ff;">2</span>, <span style="color: #0000ff;">1</span>)</span></span></strong></span>
<span style="font-family: 'Courier New', Courier; color: #0000ff;"><strong>invert.species</strong></span>
<span class="header2">A note on the 'namespace'</span>
Programming languages like R have a 'namespace'. This is a list of all the names of things like functions. <strong>sqrt</strong> and <strong>log</strong> are part of the namespace. Unlike most languages, R doesn't have a very extensive protected namespace. Most languages won't let you just write over an important function. R will mostly let you do whatever you want.
For example, if you try to write over the ANOVA function, <strong>aov</strong>, most languages would stop you, or at least give you a warning. R just shrugs and says 'I guess you know what you're doing'. This means that it is possible to cause all sorts of problems by writing over something you didn't want to. Some rules to help:
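* Generally speaking, single letters like a, b, x, y, z are safe to use (c and t are also the names of built-in functions, so they are best left alone)
* Few built-in functions include underscores, although quite a few include full stops (<strong>as.data.frame</strong>, for example). Giving an object a descriptive name of your own, like <strong>lizards_aov</strong> or <strong>lizards.aov</strong>, will usually help you avoid any problems.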
* Just remember that you can always just quit and restart the R session if something goes drastically wrong. Save your script before you quit out, then start R up again. It will return to the default settings.
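Here is a small sketch of what 'writing over' a function looks like, along with a gentler fix than restarting. It deliberately breaks sqrt, so treat it as a demonstration only. The rm function and the base:: prefix are standard R; the broken replacement function is made up for illustration.

```r
# Deliberately write over sqrt with a useless function (demonstration only)
sqrt <- function(x) "oops"
sqrt(16)          # now returns "oops" instead of 4

# The original is still there if you name the package it lives in
base::sqrt(16)    # 4

# Remove our version from the working space and the original takes over again
rm(sqrt)
sqrt(16)          # 4
```

The same trick works for most accidental overwrites: remove your own object with rm and R falls back to the built-in one.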
<span class="header2">Applying functions to a dataset</span>
What do you think happens if you apply a square root to our species richness counts? Try it and see.
<span style="font-family: 'Courier New', Courier; color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #ff6600;">sqrt</span>(</span></span></span>invert.species<span style="color: #000000;">)</span></strong></span>
<span class="header2">Basic summary statistics</span>
Here is a list of some useful and basic summary functions. You'll notice that Standard Error (which we use a lot in Biology) does not have its own basic function. You have to calculate it as the standard deviation divided by the square root of the number of observations. The stats people who run R don't much like Standard Error. The reason for this is that the mean plus or minus one SE is only about a 68% confidence interval for the mean (assuming roughly normal data and a reasonable sample size). That is, we can only be about 68% confident that the true mean lies within one SE of our sample mean. That's not very high. We like SE in Biology because biological data is often messy and using 95% or 99% confidence intervals can make our graphs look depressingly uncertain with huge error bars.
<span style="font-family: 'Courier New', Courier; color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #ff6600;">mean</span>(</span></span></span>x<span style="color: #000000;">) <span style="font-family: 'Courier New', Courier;"><span style="color: #008000;"># Mean</span></span></span></strong></span>
<span style="font-family: 'Courier New', Courier; color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #ff6600;">median</span>(</span></span></span>x<span style="color: #000000;">) <span style="font-family: 'Courier New', Courier;"><span style="color: #008000;"># Median</span></span></span></strong></span>
<span style="font-family: 'Courier New', Courier; color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #ff6600;">max</span>(</span></span></span>x<span style="color: #000000;">) <span style="font-family: 'Courier New', Courier;"><span style="color: #008000;"># Highest number</span></span></span></strong></span>
<span style="font-family: 'Courier New', Courier; color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #ff6600;">min</span>(</span></span></span>x<span style="color: #000000;">) <span style="font-family: 'Courier New', Courier;"><span style="color: #008000;"># Lowest number</span></span></span></strong></span>
<span style="font-family: 'Courier New', Courier; color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #ff6600;">length</span>(</span></span></span>x<span style="color: #000000;">) <span style="font-family: 'Courier New', Courier;"><span style="color: #008000;"># Number of observations</span></span></span></strong></span>
<span style="font-family: 'Courier New', Courier; color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #ff6600;">var</span>(</span></span></span>x<span style="color: #000000;">) <span style="font-family: 'Courier New', Courier;"><span style="color: #008000;"># Variance</span></span></span></strong></span>
<span style="font-family: 'Courier New', Courier; color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #ff6600;">sd</span>(</span></span></span>x<span style="color: #000000;">) <span style="font-family: 'Courier New', Courier;"><span style="color: #008000;"># Standard Deviation</span></span></span></strong></span>
<span style="font-family: 'Courier New', Courier; color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #ff6600;">sd</span>(</span></span></span>x<span style="color: #000000;">)/<span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="color: #ff6600;">sqrt</span><span style="color: #000000;">(<span style="color: #ff6600;">length</span>(</span></span></span><span style="color: #0000ff;">x</span>)) <span style="font-family: 'Courier New', Courier;"><span style="color: #008000;"># Standard Error</span></span></span></strong></span>
<span style="font-family: 'Courier New', Courier; color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #ff6600;">summary</span>(</span></span></span>x<span style="color: #000000;">) <span style="font-family: 'Courier New', Courier;"><span style="color: #008000;"># Mean, ranges and quartiles (when applied to a set of numbers)</span></span></span></strong></span>
<span style="font-family: 'Courier New', Courier; color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #ff6600;">str</span>(</span></span></span>x<span style="color: #000000;">) <span style="font-family: 'Courier New', Courier;"><span style="color: #008000;"># The 'structure' of an object. Very useful for complex datasets.</span></span></span></strong></span>
<span style="font-family: 'Courier New', Courier; color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #ff6600;">is.numeric</span>(</span></span></span>x<span style="color: #000000;">) <span style="font-family: 'Courier New', Courier;"><span style="color: #008000;"># Asks, is something numeric? (true or false)</span></span></span></strong></span>
<span style="font-family: 'Courier New', Courier; color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #ff6600;">is.factor</span>(</span></span></span>x<span style="color: #000000;">) <span style="font-family: 'Courier New', Courier;"><span style="color: #008000;"># Asks, is something categorical? (true or false) In R categorical data (names, words etc) are called 'factors'.</span></span></span></strong></span>
<span style="font-family: 'Courier New', Courier; color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #ff6600;">boxplot</span>(</span></span></span>x<span style="color: #000000;">) <span style="font-family: 'Courier New', Courier;"><span style="color: #008000;"># Generates a quick boxplot of a single set of numbers.</span></span></span></strong></span>
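Since there is no built-in function for standard error, one option is to write a small function of your own. This is just a sketch: the function name standard_error and the numbers in example.counts are made up for illustration.

```r
# A home-made standard error function (the name is our own choice)
standard_error <- function(x) {
  sd(x) / sqrt(length(x))   # standard deviation / square root of n
}

# Some made-up numbers to try it on
example.counts <- c(2, 4, 4, 4, 5, 5, 7, 9)

mean(example.counts)            # 5
standard_error(example.counts)  # about 0.76
```

Note that this simple version gives NA if any of the numbers are missing, because sd does the same.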
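Have a go at the functions listed above using <strong>invert.species</strong> in place of x, and then answer the following. Round your answers by hand, with halves rounding up. R's own <strong>round</strong> function rounds an exact half to the even digit, so round(0.25, 1) gives 0.2 rather than 0.3.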
<strong>What was the mean species count per site? Write the answer rounded to one decimal place.</strong>
<<textbox "$q3" "">>
<strong>What was the variance of the species counts? Write the answer rounded to one decimal place.</strong>
<<textbox "$q4" "">>
<strong>What was the standard error of the species counts? Write the answer rounded to one decimal place.</strong>
<<textbox "$q5" "">>
[[Check your answer|Intro8]]
----
[[<|Intro6]]
[[<<|start]]
<<if $name is "cheat">>
<span class="cheat"> 2.3, 3.4, 0.6</span>
<</if>><span class="header2">Your answer was $q3</span>
<<if $q3 is "2.3">>Correct! Great work, $name!
<<else>>Hmm, that doesn't look right.
<</if>>
<span class="header2">Your answer was $q4</span>
<<if $q4 is "3.4">>Correct! Great work, $name!
<<else>>Hmm, that doesn't look right.
<</if>>
<span class="header2">Your answer was $q5</span>
<<if $q5 is "0.6">>Correct! Great work, $name!
<<else>>Hmm, that doesn't look right.
<</if>>
<<if $q3 is "2.3" and $q4 is "3.4" and $q5 is "0.6">>All correct! Great work, $name! [[proceed|Intro9]]
<<else>>You might need to [[try again|Intro7]]. Remember to round correctly.
<</if>>
----
[[<|Intro7]]
[[<<|start]]<span class="header1">The standard way to hold data in R is a 'dataframe'</span>
We have created a basic dataset, but it is currently just a list of numbers. In R such a list is called a <strong>vector</strong>. Sometimes we do want to work with vectors, but more typically, we need data to be in the form of a <strong>data frame</strong>.
Here's our set of numbers again.
<span style="font-family: 'Courier New', Courier;"><strong><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #0000ff;">invert.species</span></span><span style="color: #000000;"> <- <span style="color: #ff6600;">c</span>(<span style="color: #0000ff;">4</span>, <span style="color: #0000ff;">5</span>, <span style="color: #0000ff;">0</span>, <span style="color: #0000ff;">0</span>, <span style="color: #0000ff;">3</span>, <span style="color: #0000ff;">3</span>, <span style="color: #0000ff;">2</span>, <span style="color: #0000ff;">1</span>)</span></span></strong></span>
<span style="font-family: 'Courier New', Courier; color: #0000ff;"><strong>invert.species</strong></span>
Now try this.
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="font-family: 'Courier New', Courier; color: #0000ff;"><span style="color: #000000;"><span style="color: #008000;"># Use the 'as.data.frame' function to change our first list into a data frame.</span></span></span></span></span></strong>
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;">invert.survey</span> <- <span style="color: #ff6600;">as.data.frame</span>(<span style="color: #0000ff;">invert.species</span>)</span></strong><br /><span style="color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;">invert.species <span style="font-family: 'Courier New', Courier; color: #0000ff;"><span style="color: #000000;"><span style="color: #008000;"># Look at the old list </span></span></span></span></strong></span><br /><span style="color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;">invert.survey <span style="font-family: 'Courier New', Courier; color: #0000ff;"><span style="color: #000000;"><span style="color: #008000;"># Look at the new data frame</span></span></span></span></strong></span>
Note how the layout of the numbers has changed. It has converted to a column, more like how we would enter data into a spreadsheet in a program like Excel or Numbers.
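If you want to confirm what kind of object you are looking at, R can tell you. Here's a short sketch using a made-up vector (the names site.counts and site.survey are for illustration only):

```r
# A short made-up vector, and a data frame built from it
site.counts <- c(2, 0, 5)
site.survey <- as.data.frame(site.counts)

is.data.frame(site.counts)   # FALSE: still a plain vector
is.data.frame(site.survey)   # TRUE
names(site.survey)           # "site.counts": the column takes the old object's name
nrow(site.survey)            # 3 rows, one per number
```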
Typically, you would:
* Do your data entry in Excel (or similar)
* Save as a <strong>csv</strong> (comma-separated values) file
* Import the csv file into RStudio
And if you import it correctly it will arrive as a data frame. That is, it will be arranged in a set of columns with headings for each column.
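As a sketch of that import step, the code below writes a tiny made-up csv file to a temporary location and then reads it back in with read.csv. The column names and numbers are invented. With your own data you would skip the first half and give read.csv the path to your own file instead.

```r
# Make a tiny made-up data frame and save it as a csv in a temporary file
example.data <- data.frame(site = c("A", "B", "C"), count = c(4, 0, 7))
csv.path <- tempfile(fileext = ".csv")
write.csv(example.data, csv.path, row.names = FALSE)

# Import the csv; read.csv hands back a data frame with the column headings intact
imported <- read.csv(csv.path)
is.data.frame(imported)   # TRUE
names(imported)           # "site" "count"
str(imported)
```

RStudio also has an import option in its menus, but writing the import as code means it gets saved along with everything else in your file.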
<span class="header2">Making a 'data frame' in R</span>
We're going to have a go at making a data frame in R. I think there are two good reasons to do this:
* You will get a clearer idea of how a data frame is made up and arranged
* You will see that making data frames in R is fiddly and annoying. This second point is useful only in that it will reinforce that the better way to do things is to undertake your data entry in a dedicated program such as Excel.
Let's imagine we have species richness counts from two streams, Scotchman's Creek and Salt Creek. We collected the samples over a number of days and recorded the water temperature and whether it was sunny or cloudy on the day too.
The first thing we will do is create a new set of numbers representing species richnesses. There are twenty in total. Ten come from Scotchman's Creek. Ten come from Salt Creek.
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="font-family: 'Courier New', Courier; color: #0000ff;"><span style="color: #000000;"><span style="color: #008000;"># Create a larger invert.species dataset. We will have 10 observations from Scotchman's Creek and 10 from Salt Creek</span></span></span></span></span></strong>
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;">invert.species</span> <- <span style="color: #ff6600;">c</span>(<span style="color: #0000ff;">4</span>, <span style="color: #0000ff;">5</span>, <span style="color: #0000ff;">0</span>, <span style="color: #0000ff;">0</span>, <span style="color: #0000ff;">3</span>, <span style="color: #0000ff;">3</span>, <span style="color: #0000ff;">2</span>, <span style="color: #0000ff;">1</span>, <span style="color: #0000ff;">4</span>, <span style="color: #0000ff;">7</span>, <span style="color: #0000ff;">5</span>, <span style="color: #0000ff;">3</span>, <span style="color: #0000ff;">4</span>, <span style="color: #0000ff;">0</span>, <span style="color: #0000ff;">1</span>, <span style="color: #0000ff;">4</span>, <span style="color: #0000ff;">7</span>, <span style="color: #0000ff;">7</span>, <span style="color: #0000ff;">0</span>, <span style="color: #0000ff;">3</span>)</span></strong>
Now create a list of creek names. These have to align exactly with the species counts above. The first number corresponds to the first creek name.
Notice also that when you do this, the new list will appear in your working space in the top-right window.
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="font-family: 'Courier New', Courier; color: #0000ff;"><span style="color: #000000;"><span style="color: #008000;"># Create a list of stream names. The first ten samples are from Scotchmans. The second ten are from Salt Creek.</span></span></span></span></span></strong><br /><strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;">stream</span> <- <span style="color: #ff6600;">c</span>("<span style="color: #0000ff;">scotchmans</span>", "<span style="color: #0000ff;">scotchmans</span>", "<span style="color: #0000ff;">scotchmans</span>", "<span style="color: #0000ff;">scotchmans</span>", "<span style="color: #0000ff;">scotchmans</span>", "<span style="color: #0000ff;">scotchmans</span>", "<span style="color: #0000ff;">scotchmans</span>", "<span style="color: #0000ff;">scotchmans</span>", "<span style="color: #0000ff;">scotchmans</span>", "<span style="color: #0000ff;">scotchmans</span>", "<span style="color: #0000ff;">salt</span>", "<span style="color: #0000ff;">salt</span>", "<span style="color: #0000ff;">salt</span>", "<span style="color: #0000ff;">salt</span>", "<span style="color: #0000ff;">salt</span>", "<span style="color: #0000ff;">salt</span>", "<span style="color: #0000ff;">salt</span>", "<span style="color: #0000ff;">salt</span>", "<span style="color: #0000ff;">salt</span>", "<span style="color: #0000ff;">salt</span>")</span></strong>
Now create a list of water temperatures.
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="font-family: 'Courier New', Courier; color: #0000ff;"><span style="color: #000000;"><span style="color: #008000;"># Create a list of temperatures.</span></span></span></span></span></strong><br /><strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;">temp</span> <- <span style="color: #ff6600;">c</span>(<span style="color: #0000ff;">10.2</span>, <span style="color: #0000ff;">11.3</span>, <span style="color: #0000ff;">8.4</span>, <span style="color: #0000ff;">7.9</span>, <span style="color: #0000ff;">10.5</span>, <span style="color: #0000ff;">9.8</span>, <span style="color: #0000ff;">9.5</span>, <span style="color: #0000ff;">10.1</span>, <span style="color: #0000ff;">12.5</span>, <span style="color: #0000ff;">13.1</span>, <span style="color: #0000ff;">9.8</span>, <span style="color: #0000ff;">9.5</span>, <span style="color: #0000ff;">9.5</span>, <span style="color: #0000ff;">7.1</span>, <span style="color: #0000ff;">7.5</span>, <span style="color: #0000ff;">10.4</span>, <span style="color: #0000ff;">11.1</span>, <span style="color: #0000ff;">11.3</span>, <span style="color: #0000ff;">6.5</span>, <span style="color: #0000ff;">10.2</span>)</span></strong>
Now create a list of weather conditions that were recorded on the day of sampling.
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="font-family: 'Courier New', Courier; color: #0000ff;"><span style="color: #000000;"><span style="color: #008000;"># Create a list of weather conditions.</span></span></span></span></span></strong><br /><strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;">weather</span> <- <span style="color: #ff6600;">c</span>("<span style="color: #0000ff;">sunny</span>", "<span style="color: #0000ff;">sunny</span>", "<span style="color: #0000ff;">cloudy</span>", "<span style="color: #0000ff;">cloudy</span>", "<span style="color: #0000ff;">sunny</span>", "<span style="color: #0000ff;">cloudy</span>", "<span style="color: #0000ff;">cloudy</span>", "<span style="color: #0000ff;">sunny</span>", "<span style="color: #0000ff;">sunny</span>", "<span style="color: #0000ff;">sunny</span>", "<span style="color: #0000ff;">cloudy</span>", "<span style="color: #0000ff;">cloudy</span>", "<span style="color: #0000ff;">cloudy</span>", "<span style="color: #0000ff;">cloudy</span>", "<span style="color: #0000ff;">cloudy</span>", "<span style="color: #0000ff;">sunny</span>", "<span style="color: #0000ff;">sunny</span>", "<span style="color: #0000ff;">sunny</span>", "<span style="color: #0000ff;">cloudy</span>", "<span style="color: #0000ff;">sunny</span>")</span></strong>
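Because every vector has to line up position by position, it can be worth checking that they all hold the same number of values before going any further. The sketch below is an optional extra rather than one of the workshop steps. It copies two of the vectors from above so that it runs on its own, and it uses <strong>rep</strong> as a shorter way to build a repetitive vector like the stream names. The names 'stream.short' and 'vector.lengths' are made up for this example.

```r
# Species counts and temperatures, copied from the steps above
invert.species <- c(4, 5, 0, 0, 3, 3, 2, 1, 4, 7, 5, 3, 4, 0, 1, 4, 7, 7, 0, 3)
temp <- c(10.2, 11.3, 8.4, 7.9, 10.5, 9.8, 9.5, 10.1, 12.5, 13.1,
          9.8, 9.5, 9.5, 7.1, 7.5, 10.4, 11.1, 11.3, 6.5, 10.2)

# rep() repeats each name ten times, so we do not have to type them all out
stream.short <- rep(c("scotchmans", "salt"), each = 10)

# Count how many values each vector holds. They should all be 20.
vector.lengths <- c(length(invert.species), length(stream.short), length(temp))
vector.lengths
```

If one of the numbers is not 20, a value has probably been missed or typed twice.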
At this stage, we have a set of lists (vectors) but no data frame. There are quicker ways to create a data frame using libraries like 'dplyr', but I think it is a bit easier to understand if we just do this one step at a time using standard coding.
Here, we take 'invert.species' and use 'as.data.frame' to turn it into a data frame. Note that we are dropping it into a new object that we are calling 'invert.survey'.
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="font-family: 'Courier New', Courier; color: #0000ff;"><span style="color: #000000;"><span style="color: #008000;"># Use the 'as.data.frame' function to change our first list into a data frame.</span></span></span></span></span></strong>
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;">invert.survey</span> <- <span style="color: #ff6600;">as.data.frame</span>(<span style="color: #0000ff;">invert.species</span>)</span></strong><br /><span style="color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;">invert.species <span style="font-family: 'Courier New', Courier; color: #0000ff;"><span style="color: #000000;"><span style="color: #008000;"># Look at the old list </span></span></span></span></strong></span><br /><span style="color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;">invert.survey <span style="font-family: 'Courier New', Courier; color: #0000ff;"><span style="color: #000000;"><span style="color: #008000;"># Look at the new data frame</span></span></span></span></strong></span>
When you look at <strong>invert.survey</strong>, you should see two things have changed:
* Now the numbers are arranged vertically, like a standard spreadsheet. This arrangement is what we would call 'long' or 'tall' format.
* The data frame is called <strong>invert.survey</strong> but inside the data frame is a column named <strong>invert.species</strong>.
Now we can add the other vectors to this new data frame. We do this by using the assignment arrow to take the vector and drop it into a new column. Here, we are naming the columns using the dollar sign symbol. The dollar sign (<strong>$</strong>) means 'inside of' or 'look inside'. So you can read <strong>invert.survey$stream</strong> as:
* Look inside 'invert.survey' for a column called 'stream'.
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;">invert.survey</span>$<span style="color: #0000ff;">stream</span> <- <span style="color: #0000ff;">stream <span style="font-family: 'Courier New', Courier; color: #0000ff;"><span style="color: #000000;"><span style="color: #008000;"># Add stream name</span></span></span></span></span></strong><br /><span style="color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;">invert.survey</span></strong></span>
Do the same for the other two variables, temperature and weather.
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;">invert.survey</span>$<span style="color: #0000ff;">temp</span> <- <span style="color: #0000ff;">temp <span style="font-family: 'Courier New', Courier; color: #0000ff;"><span style="color: #000000;"><span style="color: #008000;"># Add temperature</span></span></span></span></span></strong><br /><span style="color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;">invert.survey</span></strong></span>
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;">invert.survey</span>$<span style="color: #0000ff;">weather</span> <- <span style="color: #0000ff;">weather <span style="font-family: 'Courier New', Courier; color: #0000ff;"><span style="color: #000000;"><span style="color: #008000;"># Add weather</span></span></span></span></span></strong><br /><span style="color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;">invert.survey</span></strong></span>
Now, you should have a data frame that has four columns. This would have been a lot easier to do using a data entry app like Excel, but this way you get to see us building a data frame up from the ground.
Columns inside data frames can be numbers or factors just like vectors can. Or they can be other classes of things entirely, such as 'characters'. We prefer our variables to be either numbers or factors. Use <strong>is.factor</strong> and <strong>is.numeric</strong> to check what classes the four data columns are.
Here's the first one to start you off.
<span style="font-family: 'Courier New', Courier; color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #ff6600;">is.numeric</span>(<span style="color: #0000ff;">invert.survey</span>$<span style="color: #0000ff;">invert.species</span></span></span></span><span style="color: #000000;">)</span></strong></span>
<span style="font-family: 'Courier New', Courier; color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;"><span style="color: #000000;"><span style="color: #ff6600;">is.factor</span>(<span style="color: #0000ff;">invert.survey</span>$<span style="color: #0000ff;">invert.species</span></span></span></span><span style="color: #000000;">)</span></strong></span>
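One thing to watch for: text vectors added with the dollar sign stay as characters, so <strong>is.factor</strong> is likely to return FALSE for the text columns. As a rough sketch (using a small made-up data frame called 'toy', not the survey data), this is one way you could check a column's class and convert a text column with <strong>factor</strong>:

```r
# A small made-up data frame (hypothetical, not the survey data)
toy <- data.frame(count = c(2, 5, 1), site = c("a", "b", "a"))

class(toy$count)   # the class of the count column
class(toy$site)    # "character" in R 4.0.0 or later, "factor" in older versions

# Convert the text column into a factor
toy$site <- factor(toy$site)
is.factor(toy$site)
levels(toy$site)   # the distinct groups, in alphabetical order
```

The same pattern should work on the survey columns, but try the checks yourself first.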
<strong>Species richness (invert.species) is a number:</strong>
<<radiobutton "$q6" "true">> true
<<radiobutton "$q6" "false">> false
[[Check your answers|Intro10]]
----
[[<|Intro8]]
[[<<|start]]
<<if $name is "cheat">>
<span class="cheat">
true
</span>
<</if>><span class="header2">Answers</span>
<strong> Species richness (invert.species) is a number: </strong>
<<if $q6 is "true">>Correct! Great work, $name.
The way we constructed a dataframe was bit-by-bit to try and give you a sense of how they come together. It is actually possible to do it in one step. Try this and see if it works...
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;">invert.survey.2</span> <- <span style="color: #ff6600;">data.frame</span>(<span style="color: #0000ff;">invert.species</span>, <span style="color: #0000ff;">stream</span>, <span style="color: #0000ff;">temp</span>, <span style="color: #0000ff;">weather</span>)</span></strong>
<span style="color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;">invert.survey</span></strong></span><br /><span style="color: #0000ff;"><strong><span style="font-family: 'Courier New', Courier;">invert.survey.2</span></strong></span>
You should get two dataframes that look very similar.
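If you would rather not compare the two by eye, you can ask R to do it. This is a minimal sketch, assuming a cut-down version of the data with only two of the four columns so that it stays short. The names 'survey.a' and 'survey.b' are made up for the example.

```r
# Rebuild a two-column version of the data so this snippet runs on its own
invert.species <- c(4, 5, 0, 0, 3, 3, 2, 1, 4, 7, 5, 3, 4, 0, 1, 4, 7, 7, 0, 3)
stream <- rep(c("scotchmans", "salt"), each = 10)

# Step-by-step version
survey.a <- as.data.frame(invert.species)
survey.a$stream <- stream

# One-step version
survey.b <- data.frame(invert.species, stream)

names(survey.a)   # column names of the first data frame
names(survey.b)   # column names of the second data frame
dim(survey.a)     # number of rows, then number of columns
all.equal(survey.a, survey.b)   # TRUE, or a description of any differences
```

In older versions of R (before 4.0.0) the one-step version turns text into factors, so <strong>all.equal</strong> may report a difference in the 'stream' column.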
[[Next|Intro11]]
<<else>>That doesn't look correct. Maybe [[try again|Intro9]].
<</if>>
----
[[<|Intro8]]
[[<<|start]]<span class="header1">Selecting columns out of a dataframe</span>
Just as you can build dataframes, you can select data out of a dataframe too. You can do this either by selecting out columns by their position (i.e. by number) or by their name (i.e. the heading).
Try these:
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;">invert.survey</span>[<span style="color: #0000ff;">4</span>] <span style="color: #008000;"># The fourth element.</span></span></strong>
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;">invert.survey</span>[<span style="color: #0000ff;">-4</span>] <span style="color: #008000;"># All but the fourth.</span></span></strong>
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;">invert.survey</span>[<span style="color: #0000ff;">2</span>:<span style="color: #0000ff;">4</span>] <span style="color: #008000;"># Elements two to four.</span></span></strong>
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;">invert.survey</span>[-(<span style="color: #0000ff;">3</span>:<span style="color: #0000ff;">4</span>)] <span style="color: #008000;"># All elements except three to four.</span></span></strong>
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #0000ff;">invert.survey</span>[<span style="color: #ff6600;">c</span>(<span style="color: #0000ff;">1</span>, <span style="color: #0000ff;">3</span>)] <span style="color: #008000;"># Elements one and three.</span></span></strong>
<strong><span style="color: #0000ff; font-family: 'Courier New', Courier;">invert.survey<span style="color: #000000;">['</span>weather<span style="color: #000000;">']</span> <span style="color: #008000;"># The element named 'weather'</span></span></strong>
<strong><span style="color: #0000ff; font-family: 'Courier New', Courier;">invert.survey<span style="color: #000000;">[<span style="color: #ff6600;">c</span>('</span>stream','weather<span style="color: #000000;">')]</span> <span style="color: #008000;"># The elements named 'stream' and 'weather'</span></span></strong>
You can also separate out columns of data and drop them into a new object, like so:
<strong><span style="color: #0000ff; font-family: 'Courier New', Courier;">stream_and_weather_only<span style="color: #000000;"> <-</span> invert.survey<span style="color: #000000;">[<span style="color: #ff6600;">c</span>('</span>stream','weather<span style="color: #000000;">')]</span> <span style="color: #008000;"># Take the elements named 'stream' and 'weather' and drop them into a new object called 'stream_and_weather_only'</span></span></strong>
<strong><span style="color: #0000ff; font-family: 'Courier New', Courier;">stream_and_weather_only</span></strong>
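It may help to know that square brackets and the dollar sign hand back different things. Square brackets return a smaller data frame, whereas the dollar sign pulls the column out as a plain vector. Here is a small sketch using a made-up data frame called 'toy' (the object names are just for illustration):

```r
# A small made-up data frame (hypothetical, not the survey data)
toy <- data.frame(count = c(2, 5, 1), site = c("a", "b", "a"))

by.bracket <- toy["count"]   # still a data frame, with one column
by.dollar <- toy$count       # a plain vector of numbers

is.data.frame(by.bracket)
is.data.frame(by.dollar)
is.numeric(by.dollar)
```

This matters because some functions expect a vector of numbers rather than a one-column data frame.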
[[Proceed|Intro12]]
----
[[<|Intro10]]
[[<<|start]]<span class="header1">Basic statistical tests and figures</span>
Most basic (core) statistical tests and figures in R follow the same syntax. Here is the basic structure:
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #ff6600;">function</span>(<span style="color: #0000ff;">response</span> ~ <span style="color: #0000ff;">predictor</span>, data = <span style="color: #0000ff;">your.data.frame</span>)</span></strong>
<span class="header2">Basic boxplot</span>
Here's the code to create a basic boxplot. To make this suitable for a report you would need to add extra code or use a package like <strong>ggplot2</strong>, but if you just want to eyeball data, this is a useful option.
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #ff6600;">boxplot</span>(<span style="color: #0000ff;">invert.species</span> ~ <span style="color: #0000ff;">stream</span>, data = <span style="color: #0000ff;">invert.survey</span>)</span></strong>
You can change the colour of the boxes if you like:
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #ff6600;">boxplot</span>(<span style="color: #0000ff;">invert.species</span> ~ <span style="color: #0000ff;">stream</span>, data = <span style="color: #0000ff;">invert.survey</span>, col = <span style="color: #ff6600;">c</span>("<span style="color: #0000ff;">hotpink</span>","<span style="color: #0000ff;">gold3</span>"))</span></strong>
You can try other colours. R has its own list of built-in colour names. Many of them are the same as the colour names supported by CSS, but the two lists are not identical: 'gold3', for example, is an R colour but not a CSS one. You can even just have a go at guessing some colours. Does 'navy' exist? What about 'darkorange' or 'forestgreen'? What other names can you find? Note that you can always just search online for 'R colour names' if you want to see a list.
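To check whether a guess is a real R colour name, you can test it against R's built-in list, which the <strong>colors</strong> function returns. A minimal sketch, using only base R (the object name 'candidates' is made up):

```r
# Colour names we want to check
candidates <- c("navy", "darkorange", "forestgreen", "hotpink", "gold3")

# TRUE means R recognises the name
candidates %in% colors()

# How many names R knows, and the first few of them
length(colors())
head(colors(), 10)
```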
<span class="header2">Basic scatterplot</span>
There is no 'scatterplot' function in base R. Instead you just use 'plot' and R will default to the most suitable plot.
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #ff6600;">plot</span>(<span style="color: #0000ff;">invert.species</span> ~ <span style="color: #0000ff;">temp</span>, data = <span style="color: #0000ff;">invert.survey</span>)</span></strong>
To add a line of best fit, you need to use the <strong>abline</strong> function. Try running these two lines together.
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #ff6600;">plot</span>(<span style="color: #0000ff;">invert.species</span> ~ <span style="color: #0000ff;">temp</span>, data = <span style="color: #0000ff;">invert.survey</span>)</span></strong><br /><strong><span style="font-family: 'Courier New', Courier;"> <span style="color: #ff6600;">abline</span>(<span style="color: #ff6600;">lm</span>(<span style="color: #0000ff;">invert.species</span> ~ <span style="color: #0000ff;">temp</span>, data = <span style="color: #0000ff;">invert.survey</span>))</span></strong>
You can modify scatterplots just as you can modify boxplots. Here are a few things you can change (there is a whole lot more):
* <strong>lwd</strong> = line width
* <strong>pch</strong> = point character (the points)
* <strong>col</strong> = colour
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #ff6600;">plot</span>(<span style="color: #0000ff;">invert.species</span> ~ <span style="color: #0000ff;">temp</span>, data = <span style="color: #0000ff;">invert.survey</span>, pch = <span style="color: #0000ff;">20</span>)
<span style="color: #ff6600;">abline</span>(<span style="color: #ff6600;">lm</span>(<span style="color: #0000ff;">invert.species</span> ~ <span style="color: #0000ff;">temp</span>, data = <span style="color: #0000ff;">invert.survey</span>), lwd = <span style="color: #0000ff;">2</span>, col = "<span style="color: #0000ff;">red</span>")</span></strong>
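The line that <strong>abline</strong> draws comes from the intercept and slope of the fitted model. If you are curious about those two numbers, one way to see them is to store the model and pass it to <strong>coef</strong>. This sketch rebuilds a two-column version of the data so that it runs on its own, and the name 'fit' is made up for the example.

```r
# Rebuild the two columns we need so this snippet runs on its own
invert.species <- c(4, 5, 0, 0, 3, 3, 2, 1, 4, 7, 5, 3, 4, 0, 1, 4, 7, 7, 0, 3)
temp <- c(10.2, 11.3, 8.4, 7.9, 10.5, 9.8, 9.5, 10.1, 12.5, 13.1,
          9.8, 9.5, 9.5, 7.1, 7.5, 10.4, 11.1, 11.3, 6.5, 10.2)
invert.survey <- data.frame(invert.species, temp)

# Store the fitted model instead of passing it straight to abline()
fit <- lm(invert.species ~ temp, data = invert.survey)

# The intercept and the slope of the line of best fit
coef(fit)
```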
<span class="header2">t-test</span>
A t-test is a statistical test of whether two means are different. By convention, if P < 0.05, then the two <strong>means</strong> are taken to be significantly different.
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #ff6600;">t.test</span>(<span style="color: #0000ff;">invert.species</span> ~ <span style="color: #0000ff;">stream</span>, data = <span style="color: #0000ff;">invert.survey</span>)</span></strong>
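The printed output contains the P-value, but you can also pull it out directly, which is handy when you need to round it. This is a minimal sketch that rebuilds two columns of the data so that it runs on its own. The name 'result' is made up for the example.

```r
# Rebuild the two columns we need so this snippet runs on its own
invert.species <- c(4, 5, 0, 0, 3, 3, 2, 1, 4, 7, 5, 3, 4, 0, 1, 4, 7, 7, 0, 3)
stream <- rep(c("scotchmans", "salt"), each = 10)
invert.survey <- data.frame(invert.species, stream)

# Store the test result in an object instead of just printing it
result <- t.test(invert.species ~ stream, data = invert.survey)

result$p.value            # the P-value on its own
round(result$p.value, 3)  # rounded to three decimal places
```

The same approach should work for other tests that print a P-value, such as <strong>wilcox.test</strong>.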
<span class="header2">Mann-Whitney U test</span>
A Mann-Whitney U test (also called a Wilcoxon rank-sum test, which is why the R function is <strong>wilcox.test</strong>) is a <strong>non-parametric</strong> alternative to a t-test. It is (sort of) testing for a difference in <strong>medians</strong> rather than means. A t-test is only valid if its <strong>assumptions</strong> are met. In R, the default t-test is a Welch's unequal variance t-test. It has two assumptions:
* Observations must be independent
* Data must be normally distributed
A Mann-Whitney U test still requires independent observations, but the data doesn't need to be normally distributed.
Try this:
<strong><span style="font-family: 'Courier New', Courier;"><span style="color: #ff6600;">wilcox.test</span>(<span style="color: #0000ff;">invert.species</span> ~ <span style="color: #0000ff;">stream</span>, data = <span style="color: #0000ff;">invert.survey</span>)</span></strong>
The first thing to note is that you will get a warning (not an error, so the test still runs) like so:
<p><span style="color: #ff0000;"><strong><span style="font-family: 'Courier New', Courier;">Warning message: In wilcox.test.default(x = c(5, 3, 4, 0, 1, 4, 7, 7, 0, 3), y = c(4, : cannot compute exact p-value with ties</span></strong></span></p>
Because a Mann-Whitney U test works by comparing ranks of numbers, it can't generate an exact P value if there are 'ties' in the data. That's usually fine. It would only be a concern if the P value was very close to 0.05, in which case we couldn't be certain if it was significant or not. Because our result is strongly non-significant (P = 0.673), any imprecision is negligible.
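If the warning bothers you, one option (an addition here, not one of the workshop steps) is to tell R up front that you are not expecting an exact P-value by setting <strong>exact = FALSE</strong>. When there are ties, R falls back on the same approximation anyway, so the P-value should not change. The sketch below rebuilds two columns of the data so that it runs on its own, and the names 'default.p' and 'approx.p' are made up for the example.

```r
# Rebuild the two columns we need so this snippet runs on its own
invert.species <- c(4, 5, 0, 0, 3, 3, 2, 1, 4, 7, 5, 3, 4, 0, 1, 4, 7, 7, 0, 3)
stream <- rep(c("scotchmans", "salt"), each = 10)
invert.survey <- data.frame(invert.species, stream)

# Default call: gives the warning about ties (hidden here with suppressWarnings)
default.p <- suppressWarnings(
  wilcox.test(invert.species ~ stream, data = invert.survey)
)$p.value

# Asking for the approximate P-value directly: no warning
approx.p <- wilcox.test(invert.species ~ stream, data = invert.survey,
                        exact = FALSE)$p.value

# Compare the two P-values
all.equal(default.p, approx.p)
```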
Given that non-parametric tests have fewer assumptions than parametric tests, you might be wondering why we don't use non-parametric tests all the time. Non-parametric tests tend to be a bit more limited in terms of the data they can accept, and they tend to inflate (or increase) 'Type II error', which is your chance of getting a false negative.
Because researchers are always looking for significance, there is a tendency to start with parametric tests (making it more likely that you will obtain a significant result) and only move to a non-parametric test if you have no other option. This is actually a little bit dodgy, because we shouldn't be making decisions that increase our chance of obtaining significance.
Anyway. Great work so far, $name. Now try answering these questions:
<strong>What was the P-value for a t-test (t.test) of 'invert.species' as a function of 'weather'? Write the answer rounded to three decimal places.</strong>
<<textbox "$q9" "">>
<strong>What was the P-value for a Mann-Whitney U (wilcox.test) of 'invert.species' as a function of 'weather'? Write the answer rounded to three decimal places.</strong>
<<textbox "$q10" "">>
[[Check your answer|Intro13]]
----
[[<|Intro11]]
[[<<|start]]
<<if $name is "cheat">>
<span class="cheat"> 0.006, 0.011</span>
<</if>><span class="header2">Your answer was $q9</span>
<<if $q9 is "0.006">>Correct! Great work, $name!
<<else>>Hmm, that doesn't look right.
<</if>>
<span class="header2">Your answer was $q10</span>
<<if $q10 is "0.011">>Correct! Great work, $name!
<<else>>Hmm, that doesn't look right.
<</if>>
<<if $q9 is "0.006" and $q10 is "0.011">>All correct! Really great work, $name! You're getting the hang of this now!
Note how the P-value for the non-parametric <strong>wilcox.test</strong> is <i>higher</i> than the P-value for the parametric <strong>t-test</strong>. Both are significant, but the result for the non-parametric test is <i>closer to being non-significant</i>. This is the reason why researchers (perhaps a bit sneakily) prefer parametric tests.
[[proceed|Intro14]]
<<else>>You might need to [[try again|Intro12]]. Remember to round correctly.
<</if>>
----
[[<|Intro12]]
[[<<|start]]<span class="header1">Finish</span>
Great work, $name. You've reached the end of this short introductory tutorial. A lot of the information in the tutorial is presented in a quick reference PDF that you can access via <a href="https://rstudio.com/wp-content/uploads/2016/10/r-cheat-sheet-3.pdf" target="_blank">this link</a>.
It's a very useful little PDF. Well worth keeping a copy of.
Now, you can either [[go back to the start|start]] or move on to your next interactive lab.