-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy path10_nmds-bird-detections.Rmd
286 lines (221 loc) · 15.5 KB
/
10_nmds-bird-detections.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
---
editor_options:
chunk_output_type: console
---
# Non-metric multidimensional scaling of bird species detections
Here, we are interested not only in comparing univariate descriptors of communities, like diversity, but also in how the constituent species — or the composition — changes from one community to the next.
One tool to do this is non-metric multidimensional scaling, or NMDS. The goal of NMDS is to collapse information from multiple dimensions (e.g, from multiple communities, sites, etc.) into just a few, so that they can be visualized and interpreted. Unlike other ordination techniques that rely on (primarily Euclidean) distances, such as Principal Component Analysis, NMDS uses rank orders, and thus is an extremely flexible technique that can accommodate a variety of different kinds of data (The text above was copied from the link below).
NMDS does not use the absolute abundances of species in communities, but rather their rank orders. The use of ranks omits some of the issues associated with using absolute distance (e.g., sensitivity to transformation), and as a result is much more flexible technique that accepts a variety of types of data. (It’s also where the “non-metric” part of the name comes from).
A wonderful tutorial is presented in this link: https://jonlefcheck.net/2012/10/24/nmds-tutorial-in-r/
## Install required libraries
```{r}
library(tidyverse)
library(dplyr)
library(stringr)
library(vegan)
library(ggplot2)
library(scico)
library(psych)
library(ecodist)
library(RColorBrewer)
library(ggforce)
library(ggalt)
library(patchwork)
library(sjPlot)
# Source any custom/other internal functions necessary for analysis
source("code/01_internal-functions.R")
```
## Load the necessary data for NMDS calculations
```{r}
# We load the subset data
datSubset <- read.csv("results/datSubset.csv")
# Load species-trait data to essentially check for associations by habitat type
trait_dat <- read.csv("data/species-trait-dat.csv")
# Site-summary (Number of detections across all sites)
datSummary <- datSubset %>%
group_by(Site, Restoration.type) %>%
transform() %>% replace(is.na(.), 0) %>%
summarise_at(.vars = vars(c("IP":"HSWP")),.funs = sum)
# richness across sites (converting detections to 1)
richness <- datSummary %>%
mutate_at(vars(c("IP":"HSWP")),~ replace(., . > 0, 1)) %>%
rowwise() %>%
mutate(richness = sum(c_across(IP:HSWP))) %>%
dplyr::select(Site, Restoration.type, richness)
```
## Preparing dataframe of traits and species to be used for NMDS later on
```{r}
# Calculate the overall number of detections for each site. Here, we include dates, since each visit can explain the extrapolation of species richness when jackknife estimates are extracted.
nDetections_site_date <- datSubset %>%
group_by(Site, Restoration.type, Date) %>%
transform() %>% replace(is.na(.), 0) %>%
summarise_at(.vars = vars(c("IP":"HSWP")),.funs = sum)
# Combine the nDetections and trait based data to obtain a dataframe for further analysis
nDetections_trait <- nDetections_site_date %>%
pivot_longer(cols=IP:HSWP, names_to="Species_Code", values_to="count") %>%
left_join(.,trait_dat, by=c("Species_Code"="species_annotation_codes")) %>%
mutate(forRichness = case_when(count>0 ~ 1,count==0 ~ 0)) %>%
rename(., nDetections = count)
```
## Preparing a dataframe of detections to run ordinations
```{r}
# We will prepare a community matrix data that can be used to run dissimilarity indices
nmdsDat <- nDetections_trait %>%
dplyr::select(Site, Species_Code, nDetections, Restoration.type) %>%
group_by(Species_Code, Site, Restoration.type) %>%
summarise (totDetections = sum(nDetections)) %>%
arrange(Restoration.type) %>%
pivot_wider (names_from = Species_Code, values_from = totDetections, values_fill = list(totDetections=0))
# Convert to matrix form
nmdsDatMatrix <- as.matrix(nmdsDat[, 3:ncol(nmdsDat)])
```
## Bray-Curtis dissimilarity index
Run a bray-curtis dissimilarity index and identify least stressed configuration for the ordinations. Bray-curtis is a statistic used to quantify the compositional dissimilarity between two different sites, based on counts at each site. Ecologists use the Bray-Curtis dissimilarity calculation, which has a number of ideal properties:
1. It is invariant to changes in units.
2. It is unaffected by additions/removals of species that are not present in two communities.
3. It is unaffected by the addition of a new community.
4. It can recognize differences in total abundances when relative abundances are the same.
Please note that this link provides more information on NMDS: http://strata.uga.edu/8370/lecturenotes/multidimensionalScaling.html
```{r}
# Run a bray-curtis dissimilarity index and use metaMDS function from vegan to run ordinations
disBrayCurtis <- vegdist(nmdsDatMatrix, method = "bray")
nmdsBrayCurtis <- vegdist (nmdsDatMatrix, method = "bray") %>%
metaMDS (nmdsBrayCurtis, k=6)
# extract nmds scores
nmdsScores <- as.tibble(scores(nmdsBrayCurtis, display = "site"))
# Write the scores to a separate .csv
write.csv(nmdsScores, "data/nmdsBrayCurtis-bird-detections.csv", row.names = F)
# With the above analysis, we note the stress is 0.001537865. However, if stress is high, we should reposition the points in 2 dimensions in the direction of decreasing stress, and repeat until stress is below some threshold.**A good rule of thumb: stress < 0.05 provides an excellent representation in reduced dimensions, < 0.1 is great, < 0.2 is good/ok, and stress < 0.3 provides a poor representation.** To reiterate: high stress is bad, low stress is good!
```
## Plotting the NMDS scores
```{r}
# First let's add the treatment type back to the nmds scores
nmdsScores$Restoration.type <- nmdsDat$Restoration.type
# Add a custom set of colors
mycolors <- c(brewer.pal(name="Dark2", n = 3), brewer.pal(name="Paired", n = 3))
# reordering factors for plotting
nmdsScores$Restoration.type <- factor(nmdsScores$Restoration.type, levels = c("Benchmark", "Active", "Passive"))
fig_nmds <- ggplot(data=nmdsScores) +
stat_ellipse(aes(x=NMDS1,y=NMDS2,colour=Restoration.type),level = 0.50) +
geom_point(aes(x=NMDS1,y=NMDS2,shape=Restoration.type,colour=Restoration.type),size=5) +
theme_bw() +
scale_x_continuous(name="NMDS 1") +
scale_y_continuous(name="NMDS 2") +
scale_shape_manual("Treatment type",values= 1:length(unique(nmdsScores$Restoration.type)), labels=c("BM","AR","NR"))+
scale_color_manual("Treatment type",values=mycolors, labels=c("BM","AR","NR"))+
theme(axis.title = element_text(family="Century Gothic",
size = 14, face = "bold"),
axis.text = element_text(family="Century Gothic",
size = 12),
legend.title = element_text(family="Century Gothic",
size = 14, face = "bold"),
legend.key.size = unit(1,"cm"),
legend.text = element_text(family="Century Gothic",size = 14)) +
annotate(geom = "text", x = max(nmdsScores$NMDS1), y = max(nmdsScores$NMDS2), label = "Stress = 0.001", hjust = 1, family = "Century Gothic", size=5)
ggsave(fig_nmds, filename = "figs/fig_nmds_birdDetections.png", width=12, height=7,device = png(), units="in", dpi = 300); dev.off()
```
![AR sites are transitioning toward BM sites](figs/fig_nmds_birdDetections.png)
## Main text Figure 5
```{r}
# reload previous figures to create part A and B for the main text Fig. 4
library(patchwork)
fig_prop_ordinations <-
wrap_plots(fig_propRF, fig_propOC, fig_nmds,
design = 'AB
CC'
) +
plot_annotation(
tag_levels = "a",
tag_prefix = "(",
tag_suffix = ")"
)
# Expand the width to avoid compression
ggsave(fig_prop_ordinations, filename = "figs/fig05.png", width=15, height=12,device = png(), units="in", dpi = 300); dev.off()
```
![**Proportion of acoustic detections of rainforest and open-country bird species and NMDS ordination results of bird species detections**
(a) We observed significant differences in the proportion of rainforest bird species detections across each of the three treatment types (Tukey HSD test, P < 0.05). BM sites had the highest proportion of rainforest bird species detections (mean ± SD: 0.97 ± 0.04), followed by AR sites (mean ± SD: 0.81 ± 0.12), and NR sites (mean ± SD: 0.71 ± 0.17). (b) We observed significant differences in the proportion of open-country bird species detections across each of the three treatment type (Tukey HSD test, P < 0.05). NR sites had the highest proportion of open-country bird species detections (mean ± SD: 0.28 ± 0.17), followed by AR sites (mean ± SD: 0.18 ± 0.12), and BM sites (mean ± SD: 0.02 ± 0.04). (b) The ordination analysis of bird detections data (stress = 0.001) revealed distinct clusters of BM sites, but relatively loose clusters for AR and NR sites. However, AR sites occupied an intermediate position between BM and NR sites, indicating a direction of change in bird community composition toward BM sites. In the above figure, BM = undisturbed benchmark rainforest sites, AR = Actively restored forest sites and NR = Naturally regenerating forest sites.](figs/fig05.png)
## Testing multivariate homogeneity of group dispersions
One measure of multivariate dispersion (variance) for a group of samples is to calculate the average distance of group members to the group centroid or spatial median in multivariate space. To test if the dispersions (variances) of one or more groups are different, the distances of group members to the group centroid are subject to ANOVA.
Betadisper tests whether two or more groups (for example, restored and unrestored sites) are homogeneously dispersed in relation to their species in studied samples. This test can be done to see if one group has more compositional variance than another. Moreover, homogeneity of dispersion among groups is very advisable to have if you want to test if two or more groups have different compositions, which is tested by adonis.
```{r}
nmdsVariance <- betadisper(disBrayCurtis, group = nmdsDat$Restoration.type)
nmdsVariance
anova(nmdsVariance)
permutest(nmdsVariance, pairwise = TRUE, permutations = 999)
TukeyHSD(nmdsVariance)
# These results suggest that there is no difference in within-group variance between one group and another
```
## Visualizing the multivariate homogeneity of group dispersions
The below lines of code have been adapted from: https://chrischizinski.github.io/rstats/adonis/
```{r}
# extract the centroids and the site points in multivariate space.
centroids <-data.frame(grps=rownames(nmdsVariance$centroids),
data.frame(nmdsVariance$centroids))
vectors <- data.frame(group=nmdsVariance$group,
data.frame(nmdsVariance$vectors))
# to create the lines from the centroids to each point we will put it in a format that ggplot can handle
seg.data<-cbind(vectors[,1:3],centroids[rep(1:nrow(centroids),as.data.frame(table(vectors$group))$Freq),2:3])
names(seg.data)<-c("group","v.PCoA1","v.PCoA2","PCoA1","PCoA2")
# create the convex hulls of the outermost points
grp1.hull <- seg.data[seg.data$group=="Active",1:3][chull(seg.data[seg.data$group=="Active",2:3]),]
grp2.hull <- seg.data[seg.data$group=="Benchmark",1:3][chull(seg.data[seg.data$group=="Benchmark",2:3]),]
grp3.hull <- seg.data[seg.data$group=="Passive",1:3][chull(seg.data[seg.data$group=="Passive",2:3]),]
all.hull <- rbind(grp1.hull,grp2.hull,grp3.hull)
# plot the panel and convex hulls
fig_hull <- ggplot() +
geom_polygon(data= all.hull,aes(x=v.PCoA1,y=v.PCoA2),colour="black",alpha=0,linetype="dashed") +
geom_segment(data=seg.data,aes(x=v.PCoA1,xend=PCoA1,y=v.PCoA2,yend=PCoA2),alpha=0.30) +
geom_point(data=centroids[,1:3], aes(x=PCoA1,y=PCoA2,shape=grps),size=4,colour="red") +
geom_point(data=seg.data, aes(x=v.PCoA1,y=v.PCoA2,shape=group),size=2) +
labs(title="All",x="",y="") +
#coord_cartesian(xlim = c(-0.2,0.2), ylim = c(-0.25,0.2)) +
theme_bw() +
theme(legend.position="none")
```
## Testing compositional dissimilarity between groups
We will do this by using the vegan::adonis() function which allows you to run permutational multivariate analysis of variance using distance matrices. In the above figure, the NMDS confidence ellipses suggest that there is a significant difference between benchmark and passive-active sites, but no difference between active and passive sites. Adonis works by first finding the centroids for each group and then calculates the squared deviations of each of site to that centroid. Then significance tests are performed using F-tests based on sequential sums of squares from permutations of the raw data. Please note that adonis analyzes and partitions sums of squares using distance matrices. It can be thought of as an ANOVA using distance matrices (analogous to MANOVA - multivariate analysis of variance). Therefore, it is used to test if two or more groups have similar compositions.
```{r}
# We will use the NMDS scores for axis 1 and axis 2 to test for compositional dissimilarity
groups <- nmdsScores$Restoration.type
adonisNMDS <- adonis2(nmdsDatMatrix ~ groups, method="bray",perm=999)
adonisNMDS
# The results suggest that there are significant compositional differences between groups.
```
## Separate analysis: Instead of using bird detections, using richness to run nmds ordinations
```{r}
nmdsDatRichness <- datSummary %>%
mutate_at(vars(c("IP":"HSWP")),~ replace(., . > 0, 1))
# Convert to matrix form
nmdsDatMatrixRichness <- as.matrix(nmdsDatRichness[, 3:ncol(nmdsDatRichness)])
# Run a bray-curtis dissimilarity index and use metaMDS function from vegan to run ordinations
disBrayCurtisRichness <- vegdist(nmdsDatMatrixRichness, method = "bray")
nmdsBrayCurtisRichness <- vegdist(nmdsDatMatrixRichness, method = "bray") %>% metaMDS (nmdsBrayCurtisRichness, k=6)
# extract nmds scores
nmdsScoresRichness <- as.tibble(scores(nmdsBrayCurtisRichness))
# plot the data
# First let's add the treatment type back to the nmds scores
nmdsScoresRichness$Restoration.type <- nmdsDatRichness$Restoration.type
# Add a custom set of colors
mycolors <- c(brewer.pal(name="Dark2", n = 3), brewer.pal(name="Paired", n = 3))
# reordering factors for plotting
nmdsScoresRichness$Restoration.type <- factor(nmdsScoresRichness$Restoration.type, levels = c("Benchmark", "Active", "Passive"))
fig_nmds_richness <- ggplot(data=nmdsScoresRichness) +
stat_ellipse(aes(x=NMDS1,y=NMDS2,colour=Restoration.type),level = 0.50) +
geom_point(aes(x=NMDS1,y=NMDS2,shape=Restoration.type,colour=Restoration.type),size=4) +
theme_bw() +
scale_x_continuous(name="NMDS 1") +
scale_y_continuous(name="NMDS 2") +
scale_shape_manual("Treatment type",values= 1:length(unique(nmdsScoresRichness$Restoration.type)), labels=c("BM","AR","NR"))+
scale_color_manual("Treatment type",values=mycolors, labels=c("BM","AR","NR"))+
theme(axis.title = element_text(family="Century Gothic",
size = 14, face = "bold"),
axis.text = element_text(family="Century Gothic",
size = 12),
legend.title = element_text(family="Century Gothic",
size = 14, face = "bold"),
legend.key.size = unit(1,"cm"),
legend.text = element_text(family="Century Gothic",size = 14)) +
annotate(geom = "text", x = max(nmdsScoresRichness$NMDS1), y = max(nmdsScoresRichness$NMDS2), label = "Stress = 0.1", hjust = 1, family = "Century Gothic", size=5)
ggsave(fig_nmds_richness, filename = "figs/fig_nmds_birdRichness.png", width=12, height=7,device = png(), units="in", dpi = 300); dev.off()
```
![Similar results as that of bird species detections were observed.](figs/fig_nmds_birdRichness.png)