---
title: "Project #3: Escapades in Market Risk"
author:
- Nicolas Schoonmaker
- Guillermo Delgado
- Katie Guillen
- Leanne Harper
date: "`r format(Sys.time(), '%m/%d/%Y')`"
output: flexdashboard::flex_dashboard
subtitle: 'Lastname-Firstname_Project3'
---
```{r, include=FALSE}
# ----------------------- Setup section -------------------------
```
```{r Packages and Installers, echo=FALSE}
# Install RTools
# https://cran.rstudio.com/bin/windows/Rtools/
#
# Restart R Studio
#
# Install packages
#install.packages("dplyr")
#install.packages("rstudioapi")
#install.packages("tinytex")
#install.packages("magick")
#install.packages("plotly")
#install.packages("xts")
#install.packages("ggplot2")
#install.packages("moments")
#install.packages("matrixStats")
#install.packages("quantreg")
#install.packages("flexdashboard")
#install.packages("QRM")
#install.packages("qrmdata")
```
```{r Library Includes, echo=FALSE}
# The list of libraries to include
library(stats)
library(dplyr)
library(rstudioapi)
library(flexdashboard)
```
```{r Get current directory, echo=FALSE}
# Get the directory so we can run this from anywhere
# Get the script directory from R when running in R
if(rstudioapi::isAvailable())
{
#print("Running in RStudio")
script.path <- rstudioapi::getActiveDocumentContext()$path
#(script.path)
script.dir <- dirname(script.path)
}
if(!exists("script.dir"))
{
#print("Running from command line")
script.dir <- getSrcDirectory(function(x) {x})
}
#(script.dir)
```
```{r Working Directory and Data setup, echo=FALSE}
# Set my working directory
# There is a "data" folder here with the files and the script
setwd(script.dir)
# Double check the working directory
#getwd()
# Error check to ensure the working directory is set up and the data
# directory exists inside it. It's required by this file.
working_dir <- getwd()
full_path <- paste(working_dir,"/data", sep = "")
data_dir_exists <- dir.exists(full_path)
if(data_dir_exists == FALSE) {
stop("Data directory does not exist. Make sure the working directory
is set using setwd() and the data folder exists in it.")
}# else {
#print("Working directory and data set up correctly")
#}
```
```{r, include=FALSE}
# ----------------------- Project code section -------------------------
```
```{r, echo=FALSE }
library(ggplot2)
library(flexdashboard)
library(shiny)
library(QRM)
library(qrmdata)
library(xts)
library(zoo)
library(psych)
rm(list = ls())
# PAGE: Exploratory Analysis
data <- na.omit(read.csv("data/metaldata.csv", header = TRUE))
prices <- data
# Compute log differences percent using as.matrix to force numeric type
data.r <- diff(log(as.matrix(data[, -1]))) * 100
# Create size and direction
size <- na.omit(abs(data.r)) # size is indicator of volatility
#head(size)
colnames(size) <- paste(colnames(size),".size", sep = "") # Teetor
direction <- ifelse(data.r > 0, 1, ifelse(data.r < 0, -1, 0)) # another indicator of volatility
colnames(direction) <- paste(colnames(direction),".dir", sep = "")
# Convert into a time series object:
# 1. Split into date and rates
dates <- as.Date(data$DATE[-1], "%m/%d/%Y")
dates.chr <- as.character(data$DATE[-1])
#str(dates.chr)
values <- cbind(data.r, size, direction)
# for dplyr pivoting and ggplot2 need a data frame also known as "tidy data"
data.df <- data.frame(dates = dates, returns = data.r, size = size, direction = direction)
data.df.nd <- data.frame(dates = dates.chr, returns = data.r, size = size, direction = direction, stringsAsFactors = FALSE)
#non-coerced dates for subsetting on non-date columns
# 2. Make an xts object with row names equal to the dates
data.xts <- na.omit(as.xts(values, dates)) #order.by=as.Date(dates, "%d/%m/%Y")))
#str(data.xts)
data.zr <- as.zooreg(data.xts)
returns <- data.xts # watch for this data below!
# PAGE: Market risk
corr_rolling <- function(x) {
dim <- ncol(x)
corr_r <- cor(x)[lower.tri(diag(dim), diag = FALSE)]
return(corr_r)
}
vol_rolling <- function(x){
library(matrixStats)
vol_r <- colSds(x)
return(vol_r)
}
ALL.r <- data.xts[, 1:3]
window <- 90 #reactive({input$window})
corr_r <- rollapply(ALL.r, width = window, corr_rolling, align = "right", by.column = FALSE)
colnames(corr_r) <- c("nickel.copper", "nickel.aluminium", "copper.aluminium")
vol_r <- rollapply(ALL.r, width = window, vol_rolling, align = "right", by.column = FALSE)
colnames(vol_r) <- c("nickel.vol", "copper.vol", "aluminium.vol")
year <- format(index(corr_r), "%Y")
r_corr_vol <- merge(ALL.r, corr_r, vol_r, year)
# Market dependencies
#library(matrixStats)
R.corr <- apply.monthly(as.xts(ALL.r), FUN = cor)
R.vols <- apply.monthly(ALL.r, FUN = colSds) # from MatrixStats
# Form correlation matrix for one month
R.corr.1 <- matrix(R.corr[20,], nrow = 3, ncol = 3, byrow = FALSE)
rownames(R.corr.1) <- colnames(ALL.r[,1:3])
colnames(R.corr.1) <- rownames(R.corr.1)
#R.corr.1
R.corr <- R.corr[, c(2, 3, 6)]
colnames(R.corr) <- c("nickel.copper", "nickel.aluminium", "copper.aluminium")
colnames(R.vols) <- c("nickel.vols", "copper.vols", "aluminium.vols")
R.corr.vols <- na.omit(merge(R.corr, R.vols))
R.corr.vols.logs <- na.omit(log(R.corr.vols))
year <- format(index(R.corr.vols), "%Y")
R.corr.vols.y <- data.frame(nickel.correlation = R.corr.vols[,1], copper.volatility = R.corr.vols[,5], year = year)
nickel.vols <- as.numeric(R.corr.vols[,"nickel.vols"])
copper.vols <- as.numeric(R.corr.vols[,"copper.vols"])
aluminium.vols <- as.numeric(R.corr.vols[,"aluminium.vols"])
```
```{r, include=FALSE}
# ----------------------- Background section -------------------------
```
Background {data-orientation=columns}
=====================================
Column {data-width=650}
-------------------------------------
### Problem
A freight forwarder with a fleet of bulk carriers wants to optimize their portfolio in metals markets with entry into the nickel business and use of the tramp trade. Tramp ships are the company's "swing" option without any fixed charter or other constraint. They allow the company flexibility in managing several aspects of freight uncertainty. The call for tramp transportation is a "derived demand" based on the value of the cargoes. This value varies widely in the spot markets. The company allocates \$250 million to manage receivables. The company wants us to:
1. Retrieve and begin to analyze data about potential commodities for diversification,
2. Compare potential commodities with existing commodities in conventional metal spot markets,
3. Begin to generate economic scenarios based on events that may, or may not, materialize in these commodity markets, and
4. Mitigate risk by diversifying its cargo loads; the resulting risk measure determines the amount of capital the company needs to maintain its portfolio of services.
### Additional details
1. Product: Metals commodities and freight charters
2. Metal, Company, and Geography:
a. Nickel: MMC Norilsk, Russia
b. Copper: Codelco, Chile and MMC Norilsk, Russia
c. Aluminium: Vale, Brazil and Rio Tinto Alcan, Australia
3. Customers: Ship Owners, manufacturers, traders
4. All metals are traded on the London Metal Exchange
Using RMD from: https://wgfoote.github.io/fin-alytics/HTML/PR03_market-risk.html
Our code is on github at: https://github.com/nschoonm/FIN654-Project3
### Key business questions
1. How would the performance of these commodities affect the size and timing of shipping arrangements?
2. How would the value of new shipping arrangements affect the value of our business with our current customers?
3. How would we manage the allocation of existing resources given we have just landed in this new market?
Column {data-width=350}
-------------------------------------
### Getting a response: More detailed questions
These detailed questions are answered in part by the tables, graphs and models developed - add commentary as needed to explain the outputs
1. What is the decision the freight-forwarder must make? List key business questions and data needed to help answer these questions and support the freight-forwarder's decision. Retrieve data and build financial market detail into the data story behind the questions.
2. Develop the stylized facts of the markets the freight-forwarder faces. Include level, returns, and size time series plots. Calculate and display in a table the summary statistics, including quantiles, of each of these series. Use autocorrelation, partial autocorrelation, and cross-correlation functions to understand some of the persistence of returns, including leverage and volatility clustering effects. Use quantile regressions to develop the distribution of sensitivity of each market to spill-over effects from other markets. Interpret these stylized "facts" in terms of the business decision the freight-forwarder makes.
3. How much capital would the freight-forwarder need? Determine various measures of risk in the tail of each metal's distribution. Then specify a loss function to develop the portfolio risk and determine the risk capital the freight-forwarder might need. Confidence intervals might be used to create a risk management plan with varying tail experience thresholds.
```{r include=FALSE}
# --------------------- Data Exploration section -----------------------
```
Data Exploration {data-orientation=rows}
=====================================
Row {data-height=600}
-------------------------------------
### Metals Market Percent Changes
```{r}
title.chg <- "Metals Market Percent Changes"
autoplot.zoo(data.xts[,1:3]) + ggtitle(title.chg) + ylim(-5, 5)
```
### Metals Market Percent Changes - Size
```{r}
title.chg <- "Metals Market Percent Changes - Size"
autoplot.zoo(data.xts[,4:6]) + ggtitle(title.chg) + ylim(-5, 5)
```
### Data
Descriptive Statistics
```{r}
# Min / 1st Qu. / Median / Mean / 3rd Qu. / Max of the nickel/copper/aluminium returns
summary(data.r, digits=3)
```
```{r}
# Min / 1st Qu. / Median / Mean / 3rd Qu. / Max of the nickel/copper/aluminium sizes
summary(size, digits=3)
```
<hr>
Data Moments
```{r}
# Output the data moments for the size and return series:
# mean, median, std_dev, IQR, skewness, kurtosis
#
# Load the data_moments() function
# data_moments function
# INPUTS: r vector
# OUTPUTS: list of scalars (mean, sd, median, skewness, kurtosis)
data_moments <- function(data){
library(moments)
library(matrixStats)
mean.r <- colMeans(data)
median.r <- colMedians(data)
sd.r <- colSds(data)
IQR.r <- colIQRs(data)
skewness.r <- skewness(data)
kurtosis.r <- kurtosis(data)
result <- data.frame(mean = mean.r, median = median.r, std_dev = sd.r, IQR = IQR.r, skewness = skewness.r, kurtosis = kurtosis.r)
return(result)
}
# Run data_moments()
answer <- data_moments(data.xts[, 4:6])
# Build pretty table
answer <- round(answer, 3)
knitr::kable(answer)
answer <- data_moments(data.xts[, 1:3])
# Build pretty table
answer <- round(answer, 3)
knitr::kable(answer)
```
<hr>
Correlation
```{r}
R.corr.1
```
Row {data-height=400}
-------------------------------------
### Data Insights
The time series data run from January 11, 2012 to March 16, 2017. The graph shows the volatility of each of the metals over time within a range of about +5% to -5%.
The time series plot shows lots of clustering and large spikes in all 3 metals. When comparing all 3 metals, we see the most clustering and the heaviest tails in nickel, indicating that nickel has higher volatility than the other metals. The descriptive statistics show that nickel has much bigger gains compared to losses and averages higher returns than both copper and aluminium. The data moments show that the size of the returns is highly skewed. The kurtosis levels indicate a leptokurtic distribution, an indicator of a higher probability of extreme outliers.
There are peaks at the end of the second quarter and the beginning of the third quarter in the time periods examined, which leads us to believe that these peaks are seasonal in nature, perhaps due to the increase in construction projects during these periods.
The global economic events of 2008 and 2009 had a lasting impact on the performance of the nickel and copper markets. The European debt crisis in 2012 created instability and very high volatility in the copper market. The slow and weak recovery after the great recession continued to translate into unstable pricing and created challenges for risk management in metals.
All metals show negative first-quartile values, indicating returns below the average:

|         | nickel  | copper  | aluminium |
|---------|---------|---------|-----------|
| 1st Qu. | -0.9997 | -0.6343 | -0.5038   |

The third-quartile values have almost the same magnitude in the opposite direction, above the average:

|         | nickel | copper | aluminium |
|---------|--------|--------|-----------|
| 3rd Qu. | 1.0481 | 0.7165 | 0.5634    |

This seems to indicate that the fluctuation between positive and negative returns for each metal was about the same.
Looking at the means, all three metals spent most of their time in the positive range, with nickel having the highest value, followed by copper, then aluminium:

|      | nickel | copper | aluminium |
|------|--------|--------|-----------|
| Mean | 0.0489 | 0.0200 | 0.0149    |
The magnitude of the correlation between nickel/copper and nickel/aluminium is about the same (~0.1). Nickel/copper has a negative correlation (-0.114838413), while nickel/aluminium has a positive correlation (0.101273327). Copper/aluminium has a very weak correlation (-0.003052345). Nickel and aluminium are both good conductors and rate similarly for wire usage, which could explain their positive correlation, while copper, being the best conductor, would result in product substitution rather than similar use. This could be the cause of the negative correlation between copper and nickel.
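
The correlation figures quoted above come from a single month's snapshot (`R.corr.1`). As a cross-check, here is a minimal sketch (not evaluated in the dashboard) that computes the full-sample return correlations, assuming the `ALL.r` series created in the setup chunk is available:

```{r, eval=FALSE}
# Full-sample correlation of daily returns, as a cross-check on the
# single-month snapshot R.corr.1 (columns: nickel, copper, aluminium)
round(cor(coredata(ALL.r)), 3)
```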
```{r include=FALSE}
# --------------------- Volatility section -----------------------
```
Volatility {data-orientation=rows}
=====================================
Row {data-height=300}
-------------------------------------
### Returns % change - ACF
```{r}
acf(coredata(data.xts[,1:3])) # returns
```
### Size Returns % change - ACF
```{r}
acf(coredata(data.xts[,4:6])) # sizes
```
Row {data-height=300}
-------------------------------------
### Returns % change - CCF (nickel vs copper)
```{r CCF code}
# build function to repeat these routines
run_ccf <- function(one, two, main = title.chg, lag = 20, color = "red"){
# one and two are equal length series
# main is title
# lag is number of lags in cross-correlation
# color is color of dashed confidence interval bounds
stopifnot(length(one) == length(two))
one <- ts(one)
two <- ts(two)
main <- main
lag <- lag
color <- color
ccf(one, two, main = main, lag.max = lag, xlab = "", ylab = "", ci.col = color)
#end run_ccf
}
```
```{r}
# Cross-correlation of nickel and copper returns
one <- ts(data.zr[,1]) # nickel
two <- ts(data.zr[,2]) # copper
title <- "nickel-copper"
run_ccf(one, two, main = title, lag = 20, color = "red")
```
### Size Returns % change - CCF (nickel vs copper)
```{r}
# Cross-correlation of nickel and copper sizes (volatility)
one <- data.zr[,4] # nickel
two <- data.zr[,5] # copper
title <- "Nickel-Copper: volatility"
run_ccf(one, two, main = title, lag = 20, color = "red")
```
Row {data-height=300}
-------------------------------------
### Data Insights
The corresponding time series suggest the largest fluctuations in nickel, followed by copper, then aluminium. Since the VaR is the largest likely loss for a given portfolio at a chosen confidence level, and nickel has the greater ES, nickel represents a considerably riskier investment.
There are high-stress points noted in our research that are indirectly correlated with the data in the time series. The European Union financial crisis with Greece, as well as the potential impact of Brexit over the 2013-2016 period, may account for some of the fluctuations in the metals we examined. There may also be a diminishing ripple effect from the 2008 recession experienced across global markets during the periods examined.
The overall loss distribution exhibits an exponential tail. Without further evidence that the overall distribution is normal, this suggests using ES rather than VaR. The GPD also provides better accuracy by relying less on the central mode of the distribution; since the calculation focuses directly on the exceedances, the results better fit that data subset.
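
To put the observation that nickel fluctuates the most on a numeric footing, here is a minimal sketch (not evaluated in the dashboard) that summarizes the 90-day rolling volatilities already computed in the setup chunk as `vol_r`:

```{r, eval=FALSE}
# Average and spread of the 90-day rolling volatilities per metal;
# nickel is expected to show the highest values
colMeans(na.omit(coredata(vol_r)))
summary(na.omit(coredata(vol_r)), digits = 3)
```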
```{r include=FALSE}
# --------------------- Correlation Sensitivity section -----------------------
```
Correlation Sensitivity {data-orientation=rows}
=====================================
```{r Correlation code}
run_correlation_sensitivity <- function(corrAndVols, marketRisk, standardDeviation, correlationName, volatilityName){
library(quantreg)
taus <- seq(.05,.95,.05) # Roger Koenker UI Bob Hogg and Allen Craig
new_formula <- reformulate(response = marketRisk, termlabels = standardDeviation)
fit.rq.a <- rq(new_formula, tau = taus, data = corrAndVols)
fit.lm.a <- lm(new_formula, data = corrAndVols)
  plot(summary(fit.rq.a), parm = standardDeviation, main = paste(correlationName, "correlation sensitivity to", volatilityName, "volatility", sep=" "))
#(ni.cu.summary <- summary(fit.rq.a, se = "boot"))
}
```
Row {data-height=700}
-------------------------------------
### nickel-copper correlation sensitivity to copper volatility
```{r}
# Add graphs for nickel-copper correlation sensitivity to copper volatility
run_correlation_sensitivity(R.corr.vols, "nickel.copper", "copper.vols", "nickel-copper", "copper")
```
### copper-aluminum correlation sensitivity to aluminum volatility
```{r}
# Add graphs for copper-aluminum correlation sensitivity to aluminum volatility
run_correlation_sensitivity(R.corr.vols, "copper.aluminium", "aluminium.vols", "copper-aluminium", "aluminium")
```
### nickel-aluminum correlation sensitivity to aluminum volatility
```{r}
# Add graphs for nickel-aluminum correlation sensitivity to aluminum volatility
run_correlation_sensitivity(R.corr.vols, "nickel.aluminium", "aluminium.vols", "nickel-aluminium", "aluminium")
```
Row {data-height=300}
-------------------------------------
### Data Insights
The nickel-aluminium correlation sensitivity to aluminium volatility is the only combination that fits within our limits. The nickel-copper correlation sensitivity to copper volatility shows the highest positive spikes.
As the graphs above indicate, both the nickel-copper correlation and the copper-aluminium correlation have significant valleys outside their lower control limits and minor peaks just above their upper control limits. This shows high volatility for these two pairs; whenever there are changes in the market, these appear to be the most affected.
The nickel-aluminium correlation, in contrast, appears to be the most stable, staying within the bounds of both limits.
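
The panels above plot the quantile-regression coefficients across the taus. The underlying numbers can also be inspected directly; the sketch below (not evaluated) mirrors the calls inside `run_correlation_sensitivity()` for the nickel-copper pair and adds the bootstrap standard errors hinted at in that function's commented-out line:

```{r, eval=FALSE}
library(quantreg)
taus <- seq(0.05, 0.95, 0.05)
# Same specification as the nickel-copper panel above
fit.rq <- rq(nickel.copper ~ copper.vols, tau = taus, data = R.corr.vols)
# Coefficient estimates with bootstrap standard errors, one block per tau
summary(fit.rq, se = "boot")
```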
```{r include=FALSE}
# --------------------- Expected Shortfall section -----------------------
```
Expected Shortfall {data-orientation=rows}
=====================================
```{r Shortfall code}
# elementReturns = returns[,1]
expected_shortfall <- function(elementReturns)
{
returns1 <- elementReturns
colnames(returns1) <- "Returns" #kluge to coerce column name for df
returns1.df <- data.frame(Returns = returns1[,1], Distribution = rep("Historical", each = length(returns1)))
alpha <- 0.95 # reactive({ifelse(input$alpha.q>1,0.99,ifelse(input$alpha.q<0,0.001,input$alpha.q))})
# Value at Risk
VaR.hist <- quantile(returns1,alpha)
VaR.text <- paste("Value at Risk =", round(VaR.hist, 2))
# Determine the max y value of the density plot.
# This will be used to place the text above the plot
VaR.y <- max(density(returns1.df$Returns)$y)
# Expected Shortfall
ES.hist <- median(returns1[returns1 > VaR.hist])
ES.text <- paste("Expected Shortfall =", round(ES.hist, 2))
p <- ggplot(returns1.df, aes(x = Returns, fill = Distribution)) + geom_density(alpha = 0.5) +
geom_vline(aes(xintercept = VaR.hist), linetype = "dashed", size = 1, color = "firebrick1") +
geom_vline(aes(xintercept = ES.hist), size = 1, color = "firebrick1") +
annotate("text", x = 1+ VaR.hist, y = VaR.y*1.05, label = VaR.text) +
annotate("text", x = 1+ ES.hist, y = VaR.y*1.1, label = ES.text) + scale_fill_manual( values = "dodgerblue4")
p
}
```
Row {data-height=700}
-------------------------------------
### Nickel Shortfall
```{r}
# Add graphs for Nickel Shortfall
expected_shortfall(returns[,1])
```
### Copper Shortfall
```{r}
# Add graphs for Copper Shortfall
expected_shortfall(returns[,2])
```
### Aluminum Shortfall
```{r}
# Add graphs for Aluminum Shortfall
expected_shortfall(returns[,3])
```
Row {data-height=300}
-------------------------------------
### Data Insights
| Metal     | VaR (95%) | ES   |
|-----------|-----------|------|
| Nickel    | 2.73      | 3.50 |
| Copper    | 1.87      | 2.27 |
| Aluminium | 1.96      | 2.58 |

The gap between the value at risk and the expected shortfall is smallest for copper, and is close to that of aluminium. Nickel has the greatest value at risk and hence the greatest expected shortfall. We can see visually on the nickel graph that the curve is less peaked and therefore has longer tails than both copper and aluminium.
Entering the nickel market will be our riskiest investment and will therefore require more capital than is required for our existing customers.
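
The `expected_shortfall()` helper above only plots; the headline figures quoted in the table can be reproduced numerically with the same definitions. A minimal sketch (not evaluated), assuming the `returns` xts built in the setup chunk:

```{r, eval=FALSE}
# Reproduce the quoted VaR/ES per metal: 95th-percentile VaR and the
# median of the returns beyond it as ES, as in expected_shortfall()
alpha <- 0.95
var_es <- t(sapply(1:3, function(i) {
  r <- as.numeric(returns[, i])
  VaR <- quantile(r, alpha, names = FALSE)
  ES <- median(r[r > VaR])
  c(VaR = VaR, ES = ES)
}))
rownames(var_es) <- colnames(returns)[1:3]
round(var_es, 2)
```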
```{r include=FALSE}
# --------------------- Losses section -----------------------
```
Losses {data-orientation=rows}
=====================================
Row {data-height=700}
-------------------------------------
### Losses Graph
```{r Losses code}
# Now for Loss Analysis
# Get last prices
price.last <- as.numeric(tail(data[, -1], n=1))
# Specify the positions
position.rf <- c(1/3, 1/3, 1/3)
# And compute the position weights
w <- position.rf * price.last
# Fan these the length and breadth of the risk factor series
weights.rf <- matrix(w, nrow=nrow(data.r), ncol=ncol(data.r), byrow=TRUE)
#head(rowSums((exp(data.r/100)-1)*weights.rf), n=3)
# We need to compute exp(x) - 1 for very small x: expm1 accomplishes this
#head(rowSums((exp(data.r/100)-1)*weights.rf), n=4)
loss.rf <- -rowSums(expm1(data.r/100) * weights.rf)
loss.rf.df <- data.frame(Loss = loss.rf, Distribution = rep("Historical", each = length(loss.rf)))
# Simple Value at Risk and Expected Shortfall
alpha.tolerance <- .95
VaR.hist <- quantile(loss.rf, probs=alpha.tolerance, names=FALSE)
# Just as simple Expected shortfall
ES.hist <- median(loss.rf[loss.rf > VaR.hist])
VaR.text <- paste("Value at Risk =\n", round(VaR.hist, 2)) # ="VaR"&c12
ES.text <- paste("Expected Shortfall \n=", round(ES.hist, 2))
title.text <- paste(round(alpha.tolerance*100, 0), "% Loss Limits")
# using histogram bars instead of the smooth density
p <- ggplot(loss.rf.df, aes(x = Loss, fill = Distribution)) +
  geom_histogram(alpha = 0.8) +
  geom_vline(aes(xintercept = VaR.hist), linetype = "dashed", size = 1, color = "blue") +
  geom_vline(aes(xintercept = ES.hist), size = 1, color = "blue") +
  annotate("text", x = VaR.hist, y = 40, label = VaR.text) +
  annotate("text", x = ES.hist, y = 20, label = ES.text) +
  xlim(0, 500) +
  ggtitle(title.text)
p
```
Row {data-height=300}
-------------------------------------
### Data Insights
Based on the historical distribution, we are 95% confident that our equally weighted portfolio will not have losses in excess of 194.46.
However, we should reserve 244.99 in case of an extreme market loss. It is always wise to look at extreme tail risk measures in case an extreme event affects the markets for our commodities.
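
The Background page suggests varying the tail tolerance to build a risk management plan. Here is a minimal sketch (not evaluated) that recomputes the historical VaR and ES of the portfolio loss series at a few tolerance levels, using the same definitions as the chunk above:

```{r, eval=FALSE}
# Historical VaR and ES of the equally weighted portfolio loss
# at several tolerance levels
tolerances <- c(0.90, 0.95, 0.99)
sapply(tolerances, function(a) {
  VaR <- quantile(loss.rf, probs = a, names = FALSE)
  ES <- median(loss.rf[loss.rf > VaR])
  c(alpha = a, VaR = round(VaR, 2), ES = round(ES, 2))
})
```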
```{r include=FALSE}
# --------------------- Extremes section -----------------------
```
Extremes {data-orientation=rows}
=====================================
```{r Extremes code}
# mean excess plot to determine thresholds for extreme event management
data <- as.vector(loss.rf) # data is purely numeric
umin <- min(data) # threshold u min
umax <- max(data) - 0.1 # threshold u max
nint <- 100 # grid length to generate mean excess plot
grid.0 <- numeric(nint) # grid store
e <- grid.0 # store mean exceedances e
upper <- grid.0 # store upper confidence interval
lower <- grid.0 # store lower confidence interval
u <- seq(umin, umax, length = nint) # threshold u grid
alpha <- 0.95 # confidence level
for (i in 1:nint) {
data <- data[data > u[i]] # subset data above thresholds
e[i] <- mean(data - u[i]) # calculate mean excess of threshold
sdev <- sqrt(var(data)) # standard deviation
n <- length(data) # sample size of subsetted data above thresholds
upper[i] <- e[i] + (qnorm((1 + alpha)/2) * sdev)/sqrt(n) # upper confidence interval
lower[i] <- e[i] - (qnorm((1 + alpha)/2) * sdev)/sqrt(n) # lower confidence interval
}
mep.df <- data.frame(threshold = u, threshold.exceedances = e, lower = lower, upper = upper)
# Voila the plot => you may need to tweak these limits!
p <- ggplot(mep.df, aes( x= threshold, y = threshold.exceedances)) + geom_line() + geom_line(aes(x = threshold, y = lower), colour = "red") + geom_line(aes(x = threshold, y = upper), colour = "red") + annotate("text", x = 400, y = 200, label = "upper 95%") + annotate("text", x = 200, y = 0, label = "lower 5%")
#
# GPD to describe and analyze the extremes
#
#library(QRM)
alpha.tolerance <- 0.95
u <- quantile(loss.rf, alpha.tolerance, names = FALSE)
loss.excess <- loss.rf[loss.rf > u] # losses above the scalar threshold u
fit <- fit.GPD(loss.rf, threshold = u) # Fit GPD to the excesses
xi.hat <- fit$par.ests[["xi"]] # fitted xi
beta.hat <- fit$par.ests[["beta"]] # fitted beta
data <- loss.rf
n.relative.excess <- length(loss.excess) / length(loss.rf) # = N_u/n
VaR.gpd <- u + (beta.hat/xi.hat)*(((1-alpha.tolerance) / n.relative.excess)^(-xi.hat)-1)
ES.gpd <- (VaR.gpd + beta.hat-xi.hat*u) / (1-xi.hat)
```
Row {data-height=700}
-------------------------------------
### GPD Loss Limits
```{r GPD Loss Limits Graph}
# Plot away
VaRgpd.text <- paste("GPD: Value at Risk =", round(VaR.gpd, 2))
ESgpd.text <- paste("Expected Shortfall =", round(ES.gpd, 2))
title.text <- paste(VaRgpd.text, ESgpd.text, sep = " ")
title.text = paste0('GPD: ', round(alpha.tolerance * 100, 0), "% Loss Limits")
ggplot(loss.rf.df, aes(x = Loss, fill = Distribution)) +
geom_density(alpha = 0.2) +
geom_vline(aes(xintercept = VaR.gpd), colour = "blue", linetype = "dashed", size = 0.8) +
geom_vline(aes(xintercept = ES.gpd), colour = "blue", size = 0.8) +
annotate("text", x = 300, y = 0.005, label = ESgpd.text, colour = "blue") +
xlim(0,500) +
ggtitle(title.text)
```
### Estimated tail probabilities: Risk Measure = Value at Risk ("VaR")
```{r Value at Risk}
# Add graphs for Estimated tail probabilities for confidence interval x
showRM(fit, alpha = 0.99, RM = "VaR", method = "BFGS")
```
### Estimated tail probabilities: Risk Measure = Expected Shortfall ("ES")
```{r Expected Shortfall}
# Add graphs for Estimated tail probabilities for confidence interval y
showRM(fit, alpha = 0.99, RM = "ES", method = "BFGS")
```
Row {data-height=300}
-------------------------------------
### Data Insights
Here we use the Generalized Pareto Distribution (GPD) to measure our risks in the case of extreme market stress. We should definitely consider these extreme risks because Russia, through MMC Norilsk, is one of the world's largest nickel suppliers. If there is any instability in our relationship with Russia, such as an imposed tariff, we could suffer extreme losses and must have the reserves to carry our company through this market stress event.
In this case of extreme market stress we need at least 305.99, but should really hold a reserve of 386.06 to make sure we can survive the extreme event. It appears that if we want an equal investment in all 3 metals, we would need more capital to invest in our receivables than the allotted \$250 million.
Using the GPD, the expected shortfall comes to 450.9, more than double the original 244.99. This significant difference could be detrimental to the investment should the risk occur. Lastly, the "Estimated tail probabilities" panels indicate that the GPD yields a wider confidence band for ES than for VaR.
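
For reference, the GPD-based risk measures computed in the Extremes code chunk follow the standard peaks-over-threshold formulas, with threshold $u$, fitted shape $\xi$ and scale $\beta$, and exceedance ratio $N_u/n$:

$$
\mathrm{VaR}_\alpha = u + \frac{\beta}{\xi}\left[\left(\frac{1-\alpha}{N_u/n}\right)^{-\xi} - 1\right],
\qquad
\mathrm{ES}_\alpha = \frac{\mathrm{VaR}_\alpha + \beta - \xi u}{1-\xi}.
$$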
```{r include=FALSE}
# --------------------- Conclusion section -----------------------
```
Conclusion {data-orientation=rows}
=====================================
Row {data-height=400}
-------------------------------------
### Skills and Tools
The following methods & packages in R were used to explore the data:
1. flexdashboard
+ Create interactive dashboards using rmarkdown.
2. ggplot2
+ Create elegant data visualisations using the grammar of graphics package. Used to visualize the size and direction of the exchange rate percentages.
3. shiny
+ Used for interactive pages.
4. QRM
+ Provides functions for quantitative risk management.
5. qrmdata
+ Various data sets (stocks, stock indices, constituent data, FX, zero-coupon bond yield curves, volatility, commodities) for Quantitative Risk Management practice.
6. xts
+ eXtensible Time Series package. Used to create a time series of objects with the ability to add custom attributes at any time.
7. zoo
+ S3 infrastructure for regular and irregular time series package. Used to transform our data into a structure that we can use to interface with other time series data and packages.
8. psych
+ A package for personality, psychometric, and psychological research.
9. matrixStats
+ Functions that apply to rows and columns of matrices (and vectors) package.
10. quantreg
+ Quantile regression package. Used to do a regression analysis on the 95th percent percentile.
The specific functions used were:
1. na.omit
+ Returns the object with incomplete cases removed.
2. read.csv
+ Reads a file in table format and creates a data frame from it, with cases corresponding to lines and variables to fields in the file. read.csv and read.csv2 are identical to read.table except for the defaults. They are intended for reading ‘comma separated value’ files (‘.csv’).
3. diff
+ Returns suitably lagged and iterated differences.
4. log
+ Computes logarithms, by default natural logarithms.
5. as.matrix
+ Attempts to turn its argument into a matrix.
6. abs
+ Computes the absolute value of x.
7. paste
+ Concatenate vectors after converting to character.
8. colnames
+ Retrieve or set the row or column names of a matrix-like object.
9. as.Date
+ Convert an object to a date or date-time.
10. as.character
+ Create or test for objects of type "character".
11. cbind
+ Take a sequence of vector, matrix or data-frame arguments and combine them by columns.
12. data.frame
+ Creates data frames, tightly coupled collections of variables which share many of the properties of matrices and of lists, used as the fundamental data structure by most of R's modeling software.
13. as.zooreg
+ Convert an object to a zooreg.
14. ncol
+ Return the number of columns present in x.
15. cor
+ Compute the correlation of x and y if these are vectors, or the correlation matrix of the columns of a matrix.
16. colSds
+ Standard deviation estimates for each column in a matrix.
17. rollapply
+ A generic function for applying a function to rolling margins of an array.
18. format
+ Format an R object for pretty printing.
19. index
+ Generic functions for extracting the index of an object and replacing it.
20. merge
+ Merge two data frames by common columns or row names, or do other versions of database join operations.
21. apply.monthly
+ Apply a specified function to each distinct period in a given time series object.
22. as.xts
+ Conversion functions to coerce data objects of arbitrary classes to class xts and back, without losing any attributes of the original format.
23. matrix
+ Creates a matrix from the given set of values.
24. rownames
+ Retrieve or set the row names of a matrix-like object.
25. as.numeric
+ Creates or coerces objects of type "numeric".
26. autoplot.zoo
+ Takes a zoo object and returns a ggplot2 object.
27. summary
+ Generic function used to produce result summaries of the results of various model fitting functions.
28. colMeans
+ Calculates the mean for each column in a matrix.
29. colMedians
+ Calculates the median for each column in a matrix.
30. colSds
+ Standard deviation estimates for each column in a matrix.
31. colIQRs
+ Estimates of the interquartile range for each column in a matrix.
32. skewness
+ This function computes skewness of given data. In statistics, skewness is a measure of the asymmetry of the probability distribution of a random variable about its mean. In other words, skewness tells you the amount and direction of skew (departure from horizontal symmetry). The skewness value can be positive or negative, or even undefined.
33. kurtosis
+ This function computes the estimator of Pearson's measure of kurtosis. Like skewness, kurtosis is a statistical measure that is used to describe the distribution. Whereas skewness differentiates extreme values in one versus the other tail, kurtosis measures extreme values in either tail. Distributions with large kurtosis exhibit tail data exceeding the tails of the normal distribution (e.g., five or more standard deviations from the mean). Distributions with low kurtosis exhibit tail data that are generally less extreme than the tails of the normal distribution.
34. knitr::kable
+ This is a very simple table generator. It is simple by design. It is not intended to replace any other R packages for making tables.
35. round
+ Round rounds the values in its first argument to the specified number of decimal places (default 0).
36. coredata
+ Generic functions for extracting the core data contained in a (more complex) object and replacing it.
37. ts
+ Used to create time-series objects.
38. ccf
+ Computes the cross-correlation or cross-covariance of two univariate series.
39. seq
+ Generate regular sequences. seq is a standard generic with a default method.
40. reformulate
+ Returns a formula based on the string input.
41. rq
+ Quantile regression.
42. lm
+ Used to fit linear models. It can be used to carry out regression, single stratum analysis of variance and analysis of covariance.
43. plot
+ Generic function for plotting of R objects.
44. quantile
+ The generic function quantile produces sample quantiles corresponding to the given probabilities. The smallest observation corresponds to a probability of 0 and the largest to a probability of 1.
45. max
+ Return the maximum of all the values present in their arguments.
46. density
+ The (S3) generic function density computes kernel density estimates.
47. median
+ Compute the sample median.
48. ggplot
+ Initializes a ggplot object. It can be used to declare the input data frame for a graphic and to specify the set of plot aesthetics intended to be common throughout all subsequent layers unless specifically overridden.
49. aes
+ Aesthetic mappings describe how variables in the data are mapped to visual properties (aesthetics) of geoms.
50. geom_density
+ Computes and draws kernel density estimate, which is a smoothed version of the histogram.
51. geom_vline
+ These geoms add reference lines (sometimes called rules) to a plot, either horizontal, vertical, or diagonal (specified by slope and intercept). These are useful for annotating plots.
52. annotate
+ This function adds geoms to a plot, but unlike typical a geom function, the properties of the geoms are not mapped from variables of a data frame, but are instead passed in as vectors. This is useful for adding small annotations (such as text labels) or if you have your data in vectors, and for some reason don't want to put them in a data frame.
53. rowSums
+ Form row sums and means for numeric arrays (or data frames).
54. geom_histogram
+ Visualise the distribution of a single continuous variable by dividing the x axis into bins and counting the number of observations in each bin. Histograms (geom_histogram()) display the counts with bars.
55. as.vector
+ Attempts to coerce its argument into a vector of mode.
56. min
+ Returns the (regular or parallel) minima of the input values.
57. sqrt
+ Computes the (principal) square root of x, √{x}.
58. length
+ Get or set the length of vectors (including lists) and factors, and of any other R object for which a method has been defined.
59. fit.GPD
+ Functions for fitting, analysing and risk measures according to POT/GPD.
60. showRM
+ Computes and plots confidence intervals for a risk measure (VaR or ES) from a fitted GPD model.
### Data Insights
Data insights provided in each of the sections:
+ Data Exploration
+ Volatility
+ Correlation Sensitivity
+ Expected Shortfall
+ Losses
+ Extremes
Row {data-height=600}
-------------------------------------
### Business Remarks
1. How would the performance of these commodities affect the size and timing of shipping arrangements?
Performance can be measured by defining a VaR, the maximum expected loss at a given confidence level. The associated ES is the average of all losses that exceed the VaR. Comparing historical ES against an accepted risk level helps ensure the right volume of commodities is carried.
When the price of a commodity falls because of a decrease in demand, the size of the shipment decreases, leading to an increase in shipping time. On the other hand, if the price of a commodity falls because of an increase in supply, we would see an increase in the size of the shipment and a decrease in shipping time.
In the opposite scenario, commodity prices may rise because of an increase in demand, which would in turn increase the size of the shipment. Since the ship is filled with a larger cargo, the shipment would be expedited. Whereas, if the commodity price is rising because of a decrease in supply, we would have a smaller shipment and a longer shipping time.
Specifically, an abundance of non-distributed commodities may overwhelm transportation capacity. Likewise, not maximizing the commodities for a given shipping location corresponds to under-utilizing shipping resources. Performance can be measured by overall net sales, or net sales relative to shipped goods.
Since we already have bulk carriers and will be using tramp ships as our flexible option, we may be able to fill the gap when commodities are performing well. Employing these tramp ships could help mitigate the risk of smaller and slower shipments to our existing customers.
2. How would the value of new shipping arrangements affect the value of our business with our current customers?
The value of the new shipping arrangements will affect our existing customers through our decision on the allocation of our \$250 million in receivables. As we enter the nickel market, we may want to see how well our existing customers respond to the new commodity. We must keep in mind that our existing customer/commodity portfolio is what we have based our projected earnings on in the past.
New shipping arrangements could pivot companies away from current customers toward the tramp trade if those shipments are more lucrative. This would hurt the current supply chain relationships.
3. How would we manage the allocation of existing resources given we have just landed in this new market?
Given the value at risk for each individual return, we are taking the highest risk with nickel and will need to mitigate that risk with at least an equal investment in our existing customer markets. From a quick look at an equally weighted portfolio, we can see that we have enough capital, along with collateral, to commit at least 1/3 of our existing portfolio to the new market. It is wise, however, to consider existing customers when entering a new market.
Understanding the price sensitivity of the metals market is an important requirement for accurate asset allocation. Information such as VaR and ES allows resources and budget to be properly distributed, with the intention of improving processes without negative side effects to the existing business models.
Sources:
https://articles2.marketrealist.com/2014/01/commodity-prices-price-arbitrage-affect-dry-bulk-shippers/#
https://www.thebalance.com/the-biggest-cobalt-producers-2339726