This is a worked set of answers to the ggplot course.
First we are going to load the tidyverse package, which attaches ggplot2 along with the other core tidyverse packages.
library(tidyverse)
## -- Attaching packages ---------------------------------------------------------------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.2.1 v purrr 0.3.3
## v tibble 2.1.3 v dplyr 0.8.3
## v tidyr 1.0.2 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.4.0
## -- Conflicts ------------------------------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
We’ll plot the data in the weight_chart.txt file. Let’s load it and take a look first.
read_tsv("weight_chart.txt") -> weight
## Parsed with column specification:
## cols(
## Age = col_double(),
## Weight = col_double()
## )
weight
We’ll start with a simple plot, just setting the minimum aesthetics.
weight %>%
  ggplot(aes(x=Age, y=Weight)) +
  geom_point()
Now we can customise this a bit by adding fixed aesthetics to the geom_point() function.
weight %>%
  ggplot(aes(x=Age, y=Weight)) +
  geom_point(size=3, colour="blue2")
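The difference between a fixed aesthetic and a mapped one can be sketched as follows. This is a minimal illustration on made-up data: the `toy` data frame and its `Group` column are assumptions and are not part of weight_chart.txt.

```r
library(ggplot2)

# Hypothetical toy data, not the course's weight_chart.txt
toy <- data.frame(
  Age    = c(0, 1, 2, 3),
  Weight = c(3.5, 4.4, 5.3, 6.0),
  Group  = c("a", "a", "b", "b")
)

# Fixed aesthetic: set outside aes(), so every point gets the same colour
fixed_plot <- ggplot(toy, aes(x = Age, y = Weight)) +
  geom_point(size = 3, colour = "blue2")

# Mapped aesthetic: set inside aes(), so colour varies with the Group column
mapped_plot <- ggplot(toy, aes(x = Age, y = Weight, colour = Group)) +
  geom_point(size = 3)

# layer_data() returns the values ggplot2 computed for the first layer
fixed_colours  <- unique(layer_data(fixed_plot)$colour)
mapped_colours <- unique(layer_data(mapped_plot)$colour)
```

In the fixed version every point shares one colour, while the mapped version gets one colour per value of `Group` and a legend to go with it.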
Now repeat but with a different geometry.
weight %>%
  ggplot(aes(x=Age, y=Weight)) +
  geom_line()
Finally, combine the two geometries.
weight %>%
  ggplot(aes(x=Age, y=Weight)) +
  geom_line() +
  geom_point(size=3, colour="blue2")
Now let’s look at the chromosome_position_data.txt file.
read_tsv("chromosome_position_data.txt") -> chr.data
## Parsed with column specification:
## cols(
## Position = col_double(),
## Mut1 = col_double(),
## Mut2 = col_double(),
## WT = col_double()
## )
head(chr.data)
We have the data in three separate columns at the moment, so we need to use pivot_longer to put them into a single column.
chr.data %>%
  pivot_longer(cols=-Position, names_to = "sample", values_to = "value") -> chr.data
head(chr.data)
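To see what the reshaping does, here is a small sketch on made-up values. The `toy.wide` tibble and its numbers are assumptions standing in for chromosome_position_data.txt; only the column names follow the real file.

```r
library(tidyverse)

# Hypothetical toy values standing in for chromosome_position_data.txt
toy.wide <- tibble(
  Position = c(1, 2),
  Mut1     = c(10, 11),
  Mut2     = c(20, 21),
  WT       = c(30, 31)
)

# Every column except Position is stacked into a sample/value pair
toy.wide %>%
  pivot_longer(cols = -Position, names_to = "sample", values_to = "value") -> toy.long

toy.long
```

Two rows with three measurement columns become six rows, one for each combination of Position and sample, which is the shape ggplot needs to map sample to colour.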
Now we can plot a line graph of position vs value for each of the samples. We’ll use colour to distinguish the lines for each sample.
chr.data %>%
  ggplot(aes(x=Position, y=value, colour=sample)) +
  geom_line(size=1)
Finally, we’re going to plot genome size against the number of chromosomes, coloured by domain, using our genomes data.
read_csv("genomes.csv") -> genomes
## Parsed with column specification:
## cols(
## Organism = col_character(),
## Groups = col_character(),
## Size = col_double(),
## Chromosomes = col_double(),
## Organelles = col_double(),
## Plasmids = col_double(),
## Assemblies = col_double()
## )
head(genomes)
To get at the Domain we’ll need to split apart the Groups field.
genomes %>%
  separate(col=Groups, into=c("Domain","Kingdom","Class"), sep=";") -> genomes
head(genomes)
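The splitting step can be tried on its own with a small sketch. The `toy.genomes` tibble is hypothetical: its organism names and Groups strings are made up to mimic a semicolon-delimited field and are not taken from genomes.csv.

```r
library(tidyverse)

# Hypothetical rows; the Groups strings are invented for illustration
toy.genomes <- tibble(
  Organism = c("organism A", "organism B"),
  Groups   = c("Eukaryota;Animals;Mammals",
               "Bacteria;Proteobacteria;Gammaproteobacteria")
)

# Split Groups at each ";" into three new columns, replacing the original
toy.genomes %>%
  separate(col = Groups, into = c("Domain", "Kingdom", "Class"), sep = ";") -> toy.split

toy.split
```

By default separate() drops the original Groups column and puts the three new columns in its place, so Domain is then available as an ordinary column to map to colour.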
Now we can draw the plot.
genomes %>%
  ggplot(aes(x=log10(Size), y=Chromosomes, colour=Domain)) +
  geom_point()