---
title: 'Final'
author:
- Group 3
- Nicolas Schoonmaker
- Guillermo Delgado
- Katie Guillen
- Leanne Harper
date: "`r format(Sys.time(), '%m/%d/%Y')`"
output: 
  flexdashboard::flex_dashboard:
    orientation: columns
    theme: readable
    highlight: tango
    logo: "data/corn_wheat_soybean.jpg"
runtime: shiny
---

Background
=======================================================================

Column
-----------------------------------------------------------------------

### Purpose, Process, Product

We represent a wholesale distributor. We will compute optimal holdings of risky and risk-free assets using the Markowitz mean-variance model, then build a simple financial web application. With this tool we can also explore the impact of extremes in the distributions of financial returns on portfolio results. 

### Problem

We are a wholesale distributor of soybean and wheat; we are considering adding corn to our portfolio of commodities.  We currently distribute our product to food processors, animal feed producers, exporters, and other buyers.  We feel that many of our current customers will be attracted to this added commodity as it is used in many of their end products.  We will use futures contracts as leverage to help mitigate risk.  Futures contracts will give the company flexibility in managing price movements, allowing us to take long or short positions depending on future market expectations.  The company has allocated $250 million to purchase commodities. The company wants us to:

1.	Retrieve and begin to analyze data about potential commodities to diversify into
2.	Compare potential commodities with existing commodities in conventional agricultural spot markets
3.	Begin to generate economic scenarios based on events that may, or may not, materialize in the commodities
4.	Mitigate risk by using futures contracts, which will require closely monitoring the agricultural spot market and continued speculation on future price movements

### Data and analysis to inform the decision

Data analysis highlights:

- Spot market prices of Corn, Soybean, and Wheat
- Corn and Soybean: correlation
- Corn and Wheat: correlation
- Soybean and Wheat: correlation
- Corn and Soybean: correlation sensitivity to Soybean volatility
- All together: correlations and volatilities among these indicators
- Rolling correlations to visualize how correlation changes over time
- Value at Risk (VaR) represents our maximum expected loss given a certain confidence level and Expected Shortfall represents the expected loss in extreme cases, when the loss exceeds VaR
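
As a minimal sketch of these two risk measures, the snippet below uses simulated heavy-tailed losses rather than the dashboard's data. The names `loss`, `var_hist`, and `es_hist` are hypothetical and only serve the illustration.

```r
# Minimal sketch: historical VaR and ES on simulated daily losses (in percent).
set.seed(42)                 # reproducible illustration only
loss <- rt(1000, df = 4)     # hypothetical heavy-tailed losses, not real data
alpha <- 0.95                # confidence level

# Value at Risk: the alpha-quantile of the loss distribution
var_hist <- quantile(loss, probs = alpha, names = FALSE)
# Expected Shortfall: the average loss on the days when the loss exceeds VaR
es_hist <- mean(loss[loss > var_hist])

c(VaR = var_hist, ES = es_hist)
```

Because ES averages only the losses beyond VaR, it is never smaller than VaR at the same confidence level.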


Column
-----------------------------------------------------------------------

### Method

Identify the optimal combination of Corn, Soybean, and Wheat to trade

1.	Product: Agricultural commodities and futures contracts
2.	Commodity, Company, and Geography:
    a. Corn: West Bend, Iowa, United States
    b. Soybean: Mato Grosso, Brazil
    c. Wheat: Beijing, China
3.	Customers: food processors, animal feed producers, exporters, and other buyers
4.  All commodities traded on the Chicago Board of Trade commodity exchanges

### Stylized facts of the Agriculture market

The Chicago Board of Trade (CBOT) is one of the major and best-known commodity exchanges trading agricultural commodities such as corn, soybeans, wheat, oats, and rice.  Today, CBOT is part of the Chicago Mercantile Exchange (CME) Group.

- Volatility is rarely constant and often has a structure (mean reversion) that depends on the past.  In other words, over time volatility tends to move back to its average historical level: it typically rises after it falls too low and falls after rising too high.
- Supply and demand dynamics are the main reason commodity prices change.  For instance, extreme weather events or an abundance of crop planting will influence supply and demand and drive price to extremes.  As we have learned, extreme events are likely to happen with other extreme events.
- For wheat, negative returns close to the mean are more likely, but there are fewer outliers.  For both corn and soybean, we see the opposite: positive returns close to the mean are more likely, but there are more extreme outliers, indicating a greater chance of extremely large deviations from the expected return.

### Key business questions

The following are the key business questions that will be answered in the conclusion section.

1.	How would we manage the allocation of existing resources given we have just landed in this new market?
2.	How does this decision impact the business?
3.	What are your recommendations? 


Approach
-----------------------------------------------------------------------

### Getting to a response: more detailed questions

1. What is the decision the wholesale distributor must make? List key business questions and data needed to help answer these questions and support the wholesale distributor's decision.

2. Develop a model to optimize the holdings of each of the three commodities: corn, soybean, and wheat.

3. Run two scenarios: with and without short sales of the commodities. 

4. Interpret results for the wholesale distributor, including tangency portfolio, amount of cash and equivalents in the portfolio allocation, minimum risk portfolio and the risk and return characteristics of each commodity.
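
The tangency and minimum risk portfolios named above can be sketched in closed form for the scenario where short sales are allowed. This is an illustration under stated assumptions, not the dashboard's optimization: the mean returns `mu` and covariance matrix `sigma` are hypothetical numbers, and the helper `sharpe` is introduced only for this example. The no-short-sales scenario needs a constrained optimizer and is not covered here.

```r
# Minimal sketch: tangency and minimum-variance weights, short sales allowed.
# All inputs are hypothetical, not estimates from the commodity data.
mu <- c(Corn = 0.05, Soybean = 0.03, Wheat = 0.04)   # assumed mean daily returns (%)
sigma <- matrix(c(2.0, 0.8, 0.9,
                  0.8, 1.5, 0.6,
                  0.9, 0.6, 2.5),
                nrow = 3, dimnames = list(names(mu), names(mu)))  # assumed covariance
rf <- 0.0004 / 253 * 100   # daily risk-free rate, expressed in percent like mu

# Tangency portfolio: weights proportional to solve(sigma) %*% (mu - rf)
z <- solve(sigma, mu - rf)
w_tan <- z / sum(z)

# Minimum-variance portfolio: weights proportional to solve(sigma) %*% 1
m <- solve(sigma, rep(1, length(mu)))
w_min <- m / sum(m)

# Sharpe ratio of a fully invested portfolio with weights w
sharpe <- function(w) sum(w * (mu - rf)) / sqrt(drop(t(w) %*% sigma %*% w))

round(rbind(tangency = w_tan, min_variance = w_min), 3)
c(tangency = sharpe(w_tan), min_variance = sharpe(w_min))
```

Both weight vectors sum to one. The tangency portfolio has the highest Sharpe ratio among fully invested portfolios, so any cash holding comes from mixing it with the risk-free asset.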


```{r Get current directory}
# Get the directory so we can run this from anywhere
# Get the script directory from R when running in R
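script.dir <- NULL
if(requireNamespace("rstudioapi", quietly = TRUE) && rstudioapi::isAvailable())
{
  script.path <- rstudioapi::getActiveDocumentContext()$path
  # The path is empty for unsaved documents, so only use it when it is set
  if(nzchar(script.path)) script.dir <- dirname(script.path)
}
# When knitting or running under Shiny the RStudio API is unavailable;
# fall back to the current working directory (knitr uses the document's folder)
if(is.null(script.dir))
{
  script.dir <- getwd()
}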
```

```{r Working Directory and Data setup}
# Set my working directory
# There is a "data" folder here with the files and the script
setwd(script.dir)
# Double check the working directory
getwd()
# Error check to ensure the working directory is set up and the data
# directory exists inside it.  It's required for this file
if(!dir.exists(file.path(getwd(), "data"))) {
  stop("Data directory does not exist. Make sure the working directory ",
       "is set using setwd() and the data folder exists in it.")
}
```

```{r Setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE)
#install.packages("blogdown")
#library(blogdown)
#rmarkdown::render_site()

library(ggplot2)
library(flexdashboard)
library(shiny)
library(QRM)
library(qrmdata)
library(xts)
library(zoo)
library(plotly)
#library(ggfortify)
library(psych)

rm(list = ls())
# PAGE: Exploratory Analysis
data <- na.omit(read.csv("data/ingredients.csv", header = TRUE))
#data$DATE <- as.Date(data$DATE, "%m/%d/%Y")
#data <- data[order(data$DATE),]
# Compute log differences percent using as.matrix to force numeric type
data.r <- diff(log(as.matrix(data[, -1]))) * 100
# Create size and direction
size <- na.omit(abs(data.r)) # size is indicator of volatility
#head(size)
colnames(size) <- paste(colnames(size),".size", sep = "") # Teetor
direction <- ifelse(data.r > 0, 1, ifelse(data.r < 0, -1, 0)) # sign of each return: +1 up, -1 down, 0 unchanged
colnames(direction) <- paste(colnames(direction),".dir", sep = "")
# Convert into a time series object: 
# 1. Split into date and rates
dates <- as.Date(data$DATE[-1], "%m/%d/%Y")
dates.chr <- as.character(data$DATE[-1])
str(dates.chr)
values <- cbind(data.r, size, direction)
# for dplyr pivoting and ggplot2 need a data frame also known as "tidy data"
data.df <- data.frame(dates = dates, returns = data.r, size = size, direction = direction)
data.df.nd <- data.frame(dates = dates.chr, returns = data.r, size = size, direction = direction, stringsAsFactors = FALSE) 
#non-coerced dates for subsetting on non-date columns
# 2. Make an xts object with row names equal to the dates
data.xts <- na.omit(as.xts(values, dates)) #order.by=as.Date(dates, "%d/%m/%Y")))
#str(data.xts)
data.zr <- as.zooreg(data.xts)
returns <- data.xts

# PAGE: Market risk 
corr.rolling <- function(x) {	
  dim <- ncol(x)	
  corr.r <- cor(x)[lower.tri(diag(dim), diag = FALSE)]	
  return(corr.r)	
}

ALL.r <- data.xts[, 1:3]
window <- 90 #reactive({input$window})
corr.returns <- rollapply(ALL.r, width = window, corr.rolling, align = "right", by.column = FALSE)
corr.returns.df <- data.frame(Date = index(corr.returns), Corn.Soybean = corr.returns[,1], Corn.Wheat = corr.returns[,2], Soybean.Wheat = corr.returns[,3])

# Market dependencies
library(matrixStats)
R.corr <- apply.monthly(as.xts(ALL.r), FUN = cor)
R.vols <- apply.monthly(ALL.r, FUN = colSds) # from MatrixStats	
# Form correlation matrix for one month 	
R.corr.1 <- matrix(R.corr[20,], nrow = 3, ncol = 3, byrow = FALSE)	
rownames(R.corr.1) <- colnames(ALL.r[,1:3])	
colnames(R.corr.1) <- rownames(R.corr.1)	
R.corr <- R.corr[, c(2, 3, 6)]
colnames(R.corr) <- c("Corn & Soybean", "Corn & Wheat", "Soybean & Wheat")
colnames(R.vols) <- c("Corn.vols", "Soybean.vols", "Wheat.vols")	
R.corr.vols <- na.omit(merge(R.corr, R.vols))
Corn.vols <- as.numeric(R.corr.vols[,"Corn.vols"])	
Soybean.vols <- as.numeric(R.corr.vols[,"Soybean.vols"])	
Wheat.vols <- as.numeric(R.corr.vols[,"Wheat.vols"])
library(quantreg)
# hist(rho.fisher[, 1])
Corn.corrs <- R.corr.vols[,1]
#hist(Corn.corrs)
taus <- seq(.05,.95,.05)	# Roger Koenker UI Bob Hogg and Allen Craig
fit.rq.Corn.Soybean <- rq(Corn.corrs ~ Soybean.vols, tau = taus)	
fit.lm.Corn.Soybean <- lm(Corn.corrs ~ Soybean.vols)	
#' Some test statements	
#summary(fit.rq.Corn.Soybean, se = "boot")
#'
#summary(fit.lm.Corn.Soybean, se = "boot")
#plot(summary(fit.rq.Corn.Soybean), parm = "Soybean.vols", main = "Corn-Soybean correlation sensitivity to Soybean volatility") #, ylim = c(-0.1 , 0.1))	
```


Data
=======================================================================

Column
-----------------------------------------------------------------------

### Data Definitions

- *Corn*: daily Corn prices (\$ per bushel)
- *Soybean*: daily Soybean prices (\$ per bushel)
- *Wheat*: daily Wheat prices (\$ per bushel)


### Historical data 2015-2020

Notes about the historical data for Corn, Soybean and Wheat.

- Corn has experienced a number of spikes in both price and the magnitude of percentage changes. In our data set, corn experienced its lowest return at the end of June 2015 and its highest return in August 2019. 
- Soybean is less volatile in terms of both price and the magnitude of percentage changes.
- Wheat experienced the most volatility clustering of all the grains.  We see that small changes are followed by small changes and large changes tend to be followed by large changes.

```{r Data summary}
summary(data.r)
```

### Correlation 

```{r pairs.panels}
pairs.panels(data.r)
```

### Data and markets

The following lists data sources and markets for trading. 

- https://www.cmegroup.com/trading/agricultural/ for daily trading data
- https://www.macrotrends.net/2534/wheat-prices-historical-chart-data for historical wheat data
- https://www.macrotrends.net/2532/corn-prices-historical-chart-data for historical corn data
- https://www.macrotrends.net/2531/soybean-prices-historical-chart-data for historical soybean data
- GitHub and Stack Overflow for troubleshooting

Column
-----------------------------------------------------------------------

### Commodity Price Percent Changes
```{r Price Changes}
renderPlotly({
  library(ggplot2)
  #library(ggfortify)
  library(plotly)
  #title.chg1 <- "Ingrediant Price Percent Changes"
  #title.chg2 <- "Size of Ingrediants Price Percent Changes"
  p <- autoplot.zoo(data.xts[,1:3]) # + ggtitle(title.chg1) #+ ylim(-5, 5)
  ggplotly(p)
})
```

### Size of Commodity Price Percent Changes

```{r Size of Price Changes}
renderPlotly({
  #title.chg1 <- "Ingrediants Price Percent Changes"
  #title.chg2 <- "Size of Ingrediants Price Percent Changes"
  p <- autoplot.zoo(abs(data.xts[,1:3])) # + ggtitle(title.chg2) #+ ylim(-5, 5)
  ggplotly(p)
  })
```


Exploratory
=======================================================================

Inputs {.sidebar}
-----------------------------------------------------------------------
A quantile divides the returns distribution into two groups. For example, 75\% of all returns may fall below a return value of 10\%. The distribution is thus divided into returns above 10\% and below 10\% at the 75\% quantile.

Drag the slider to the right to measure the risk of returns at the desired quantile level. The minimum risk quantile is 75\%. The maximum risk quantile is 99\%.


```{r Exploratory}
sliderInput("alphaq", label = "Risk Measure quantiles (%):",
            min = 0.75, max = 0.99, value = 0.95, step = 0.01)
```

Column {data-width=200}
-----------------------------------------------------------------------

### Corn Value at Risk

```{r Corn Value at Risk}
#threshold <- reactive({input$threshold.q}) #BE SURE that {} included 
renderValueBox({
  new_alpha1 <- input$alphaq
  alpha1 <- ifelse(new_alpha1>1,0.99,ifelse(new_alpha1<0,0.001,new_alpha1))
  returns1 <- returns[,1]
  colnames(returns1) <- "Returns" #kluge to coerce column name for df below
  q <- quantile(returns1,alpha1)
  VaR.hist <- q
  valueBox(round(VaR.hist, 2),
           icon = "fa-mortar-pestle", color = "light-blue")
})
```

### Soybean Value at Risk

```{r Soybean Value at Risk}
#threshold <- reactive({input$threshold.q}) #BE SURE that {} included 
renderValueBox({
  new_alpha1 <- input$alphaq
  alpha1 <- ifelse(new_alpha1>1,0.99,ifelse(new_alpha1<0,0.001,new_alpha1))
  returns1 <- returns[,2]
  colnames(returns1) <- "Returns" #kluge to coerce column name for df below
  q <- quantile(returns1,alpha1)
  VaR.hist <- q
  valueBox(round(VaR.hist, 2),
           icon = "fa-mortar-pestle", color = "light-blue")
})
```

### Wheat Value at Risk

```{r Wheat Value at Risk}
#threshold <- reactive({input$threshold.q}) #BE SURE that {} included 
renderValueBox({
  new_alpha1 <- input$alphaq
  alpha1 <- ifelse(new_alpha1>1,0.99,ifelse(new_alpha1<0,0.001,new_alpha1))
  returns1 <- returns[,3]
  colnames(returns1) <- "Returns" #kluge to coerce column name for df below
  q <- quantile(returns1,alpha1)
  VaR.hist <- q
  # fa-cube, fa-cubes, fa-leaf, fa-mortar-pestle, fa-seedling
  valueBox(round(VaR.hist, 2),
           icon = "fa-mortar-pestle", color = "light-blue")
})
```

Column {.tabset .tabset-fade}
-----------------------------------------------------------------------

### Corn Returns Distribution

```{r Corn Returns}
renderPlotly({
  returns1 <- returns[,1]
  colnames(returns1) <- "Returns" #kluge to coerce column name for df
  returns1.df <- data.frame(Returns = returns1[,1], 
                            Distribution = rep("Historical", each = length(returns1)))
  
  new_alpha1 <- input$alphaq
  alpha1 <- ifelse(new_alpha1>1,0.99,ifelse(new_alpha1<0,0.001,new_alpha1))
  
  # Value at Risk
  VaR1.hist <- quantile(returns1,alpha1)
  VaR1.text <- paste("Value at Risk =", round(VaR1.hist, 2))
  
  # Determine the max y value of the density plot.
  # This will be used to place the text above the plot
  VaR1.y <- max(density(returns1.df$Returns)$y)
  
  # Expected Shortfall: mean of the returns beyond the VaR quantile
  ES1.hist <- mean(returns1[returns1 > VaR1.hist])
  ES1.text <- paste("Expected Shortfall =", round(ES1.hist, 2))
  
  p1 <- ggplot(returns1.df, aes(x = Returns, fill = Distribution)) + 
    geom_density(alpha = 0.5) + 
    geom_vline(aes(xintercept = VaR1.hist), linetype = "dashed", size = 1, color = "firebrick1") + 
    geom_vline(aes(xintercept = ES1.hist), size = 1, color = "firebrick1") +
    annotate("text", x = 1+ VaR1.hist, y = VaR1.y*1.05, label = VaR1.text) +
    annotate("text", x = 0.5+ ES1.hist, y = VaR1.y*1.1, label = ES1.text) +
    scale_fill_manual(values = "dodgerblue4")
  p1	##ggplotly(p)
  })
```

###  Soybean Returns Distribution

```{r Soybean Returns}
renderPlotly({
  returns2 <- returns[,2]
  colnames(returns2) <- "Returns" #kluge to coerce column name for df
  returns2.df <- data.frame(Returns = returns2[,1], 
                            Distribution = rep("Historical", each = length(returns2)))
  
  alpha2 <- ifelse(input$alphaq>1,0.99,ifelse(input$alphaq<0,0.001,input$alphaq))
  
  # Value at Risk
  VaR2.hist <- quantile(returns2,alpha2)
  VaR2.text <- paste("Value at Risk =", round(VaR2.hist, 2))
  
  # Determine the max y value of the density plot.
  # This will be used to place the text above the plot
  VaR2.y <- max(density(returns2.df$Returns)$y)
  
  # Expected Shortfall: mean of the returns beyond the VaR quantile
  ES2.hist <- mean(returns2[returns2 > VaR2.hist])
  ES2.text <- paste("Expected Shortfall =", round(ES2.hist, 2))
  
  p2 <- ggplot(returns2.df, aes(x = Returns, fill = Distribution)) + 
    geom_density(alpha = 0.5) + 
    geom_vline(aes(xintercept = VaR2.hist), linetype = "dashed", size = 1, color = "firebrick1") + 
    geom_vline(aes(xintercept = ES2.hist), size = 1, color = "firebrick1") +
    annotate("text", x = 1+VaR2.hist, y = VaR2.y*1.05, label = VaR2.text) +
    annotate("text", x = 0.5+ES2.hist, y = VaR2.y*1.1, label = ES2.text) + 
    scale_fill_manual(values = "dodgerblue4")
  p2	##ggplotly(p)
})
```

### Wheat Returns Distribution

```{r Wheat Returns}
renderPlotly({
  returns3 <- returns[,3]
  colnames(returns3) <- "Returns" #kluge to coerce column name for df
  returns3.df <- data.frame(Returns = returns3[,1], 
                            Distribution = rep("Historical", each = length(returns3)))
  
  alpha3 <- ifelse(input$alphaq>1,0.99,ifelse(input$alphaq<0,0.001,input$alphaq))
  
  # Value at Risk
  VaR3.hist <- quantile(returns3,alpha3)
  VaR3.text <- paste("Value at Risk =", round(VaR3.hist, 2))
  
  # Determine the max y value of the density plot.
  # This will be used to place the text above the plot
  VaR3.y <- max(density(returns3.df$Returns)$y)
  
  # Expected Shortfall: mean of the returns beyond the VaR quantile
  ES3.hist <- mean(returns3[returns3 > VaR3.hist])
  ES3.text <- paste("Expected Shortfall =", round(ES3.hist, 2))
  
  p3 <- ggplot(returns3.df, aes(x = Returns, fill = Distribution)) + 
    geom_density(alpha = 0.5) + 
    geom_vline(aes(xintercept = VaR3.hist), linetype = "dashed", size = 1, color = "firebrick1") + 
    geom_vline(aes(xintercept = ES3.hist), size = 1, color = "firebrick1") +
    annotate("text", x = 1+VaR3.hist, y = VaR3.y*1.05, label = VaR3.text) +
    annotate("text", x = 0.5+ES3.hist, y = VaR3.y*1.1, label = ES3.text) +
    scale_fill_manual(values = "dodgerblue4")
  p3	##ggplotly(p)
})

```

### ACF

```{r ACF}
require(graphics)
renderPlot({
  returnz.value <- acf(coredata(data.xts[,1:3])) # returns
  plot(returnz.value)
})
```

```{r Correlation}
require(graphics)
renderPlot({
  returnz.size <- acf(coredata(data.xts[,4:6])) # sizes (columns 4 to 6 hold the three size series)
  plot(returnz.size)
})
```

### Statistics

```{r Statistics}
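## data_moments function
## INPUTS: numeric matrix (or xts) of returns, one column per commodity
## OUTPUTS: data frame with one row per column (mean, median, std_dev, IQR, skewness, kurtosis)
data_moments <- function(data){
  library(moments)
  library(matrixStats)
  data <- as.matrix(data) # matrixStats functions expect a plain matrix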
  mean.r <- colMeans(data)
  median.r <- colMedians(data)
  sd.r <- colSds(data)
  IQR.r <- colIQRs(data)
  skewness.r <- skewness(data)
  kurtosis.r <- kurtosis(data)
  result <- data.frame(mean = mean.r, 
                       median = median.r, 
                       std_dev = sd.r, 
                       IQR = IQR.r, 
                       skewness = skewness.r, 
                       kurtosis = kurtosis.r)
  return(result)
}
# Run data_moments()
answer <- data_moments(data.xts[, 1:3])
# Build pretty table
answer <- round(answer, 4)
knitr::kable(answer)
```


Market Risk
=======================================================================

Column
-----------------------------------------------------------------------

### Corn, Soybean, Wheat Observations
Relationship and correlation observations.

- This distribution demonstrates volatility clustering which allows us to observe patterns in market behavior and persistent correlations in commodity price movements.
- The scatterplot illustrates the outliers of each of the three ingredients.  Wheat appears to be more concentrated around the mean but also experienced its share of volatility.  This has a direct impact on risk measures associated with portfolio performance.
- Soybean is also closely grouped around its mean.
- Corn experienced the most volatility with points appearing the furthest from the mean.
- The correlation of returns in this commodities market was expected since these ingredients are used in similar applications or in the creation of bread and other foods.   
- Corn is also used in a variety of other applications including animal feed, ethanol, and bio-based plastics. 
- The current COVID pandemic is creating increased uncertainty and these commodities are not immune to its impact. Corn futures have shed nearly 10%, wheat futures have fallen nearly 2%, and soybean futures have dropped over 4% (https://www.wsj.com/, 03/21/2020).  With increased volatility and uncertainty in the market, we can expect further decline and instability in prices.  We are beginning to see volumetric risk, as the labor market for these commodities is being affected by the pandemic, which could impact supply in the long run.


### Corn, Soybean, Wheat relationships

```{r Relationships}
#library(psych)
pairs.panels(corr.returns.df)
```

### Getting practical
Useful information about market risk.

- Using price returns we can compute loss.  Increased volatility and uncertainty increase the risk of shortfalls in the portfolio.

- Weights for each are defined as the value of the positions in each risk factor. 

- We can compute this as the notional (in bushels equivalent for this market) times the last observed price.

- Losses for a bushel equivalent of the three commodities are computed back into the sample relative to the most recently observed prices. Historical prices demonstrate the strong correlation of these commodities, with the exception of the recent COVID impact on the market.
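
The loss calculation described in this list can be sketched on a toy example. The last prices and log returns below are hypothetical values chosen for illustration, and the names `price_last`, `position`, and `r` are not taken from the dashboard code.

```r
# Minimal sketch: turning log returns (in percent) into dollar losses.
price_last <- c(Corn = 3.50, Soybean = 8.60, Wheat = 5.40)  # hypothetical $ per bushel
position   <- c(1/3, 1/3, 1/3)     # notional in bushel equivalents
w <- position * price_last          # weights = value of each position

# Two hypothetical days of log returns in percent (columns match price_last)
r <- rbind(c( 1.0, -0.5,  0.2),
           c(-2.0,  0.3, -1.1))

# Repeat the weights on every row so they line up with the returns
weights <- matrix(w, nrow = nrow(r), ncol = ncol(r), byrow = TRUE)

# expm1(x) is exp(x) - 1, accurate for small x; the minus sign makes
# a price drop show up as a positive loss
loss <- -rowSums(expm1(r / 100) * weights)
loss
```

A positive entry is a loss and a negative entry is a gain, which is the sign convention the VaR and ES quantiles rely on.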


Column {.tabset }
-----------------------------------------------------------------------

### Corn and Soybean (90 day rolling correlation)

```{r Corn and Soybean Rolling Correlation}
renderPlotly({
  p <- ggplot(corr.returns.df, aes(x = Date, y = Corn.Soybean)) + geom_line()
  p	##ggplotly(p)
})
```

### Corn and Wheat (90 day rolling correlation)

```{r Corn and Wheat Rolling Correlation}
renderPlotly({
  p <- ggplot(corr.returns.df, aes(x = Date, y = Corn.Wheat)) + geom_line()
  p	##ggplotly(p)
})
```

### Soybean and Wheat (90 day rolling correlation)

```{r Soybean and Wheat Rolling Correlation}
renderPlotly({
  p <- ggplot(corr.returns.df, aes(x = Date, y = Soybean.Wheat)) + geom_line()
  p	##ggplotly(p)
})
```

### 30 day within-sample correlations and volatilities

```{r Correlations and Vols}
plot.zoo(R.corr.vols, main= "Monthly Correlations and Volatilities")
```

### Corn - Soybean Dependency

```{r Corn and Soybean Dependency}
renderPlot({ 
  plot(summary(fit.rq.Corn.Soybean), 
       parm = "Soybean.vols", 
       main = "Corn-Soybean correlation sensitivity to Soybean volatility")
})
```

- Assume that the loss density $\Large f_L$ is strictly positive so that the distribution function of loss possesses a differentiable inverse, and change variables so that $\Large v = q_u(L) = F_L^{-1}(u)$, the quantile function of the loss distribution. Then 

$$
\Large
\frac{dv}{du} = \frac{1}{f_L(v)}
$$
and we can compute
$$
\Large
\frac{\partial r_{ES}^{\alpha}}{\partial \lambda_i}(1) = \frac{1}{1-\alpha}\int_{\alpha}^{1}E(L_i \, | \, L = q_u(L))\,du = \frac{1}{1-\alpha}\int_{q_{\alpha}(L)}^{\infty}E(L_i \, | \, L=v)f_L(v)\,dv = E(L_i \, | \, L \geq q_{\alpha}(L))
$$
- (Finally) we have the expected shortfall contribution of a line of business $\Large i$ as
$$
\Large
C_i^{ES} = E(L_i \, | \, L \geq VaR_{\alpha}(L))
$$
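
The contribution formula can be checked empirically. The following is a minimal sketch on simulated per-commodity losses (hypothetical data, with names such as `L_i` and `C_ES` chosen for this example): averaging each component loss over the days when the total loss reaches VaR gives contributions that add up to the portfolio expected shortfall.

```r
# Minimal sketch: empirical expected shortfall contributions.
set.seed(1)   # reproducible illustration only
# Hypothetical losses by commodity (rows = days), not the dashboard's data
L_i <- matrix(rnorm(3000), ncol = 3,
              dimnames = list(NULL, c("Corn", "Soybean", "Wheat")))
L <- rowSums(L_i)                       # total portfolio loss per day
alpha <- 0.95

VaR <- quantile(L, probs = alpha, names = FALSE)
tail_days <- L >= VaR                   # days on which the total loss is at least VaR

# C_i^ES = E(L_i | L >= VaR): average component loss on the tail days
C_ES <- colMeans(L_i[tail_days, , drop = FALSE])
ES <- mean(L[tail_days])                # portfolio expected shortfall

C_ES
c(sum_of_contributions = sum(C_ES), ES = ES)
```

The equality of the two numbers in the last line follows from the linearity of the conditional mean, which is what makes this allocation additive across lines of business.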


### Empirical loss

```{r Tolerance Input}
sliderInput("losstol", label = "Loss Tolerance (%):",
            min = 0.50, max = 0.99, value = 0.95, step = 0.01)
```

```{r Loss Analysis}
## Now for Loss Analysis
# Get last prices

price.last <- as.numeric(tail(data[, -1], n=1))
# Specify the positions
position.rf <- c(1/3, 1/3, 1/3)
# And compute the position weights
w <- position.rf * price.last
# Fan these across the length and breadth of the risk factor series
weights.rf <- matrix(w, nrow=nrow(data.r), ncol=ncol(data.r), byrow=TRUE)
#head(rowSums((exp(data.r/100)-1)*weights.rf), n=3)
## We need to compute exp(x) - 1 for very small x: expm1 accomplishes this
#head(rowSums((exp(data.r/100)-1)*weights.rf), n=4)
loss.rf <- -rowSums(expm1(data.r/100) * weights.rf)
loss.rf.df <- data.frame(Loss = loss.rf, Distribution = rep("Historical", each = length(loss.rf)))
## Simple Value at Risk and Expected Shortfall
renderPlotly({
  alpha.tolerance <- input$losstol
  #alpha.tolerance <- .95
  VaR.hist <- quantile(loss.rf, probs=alpha.tolerance, names=FALSE)
  ## Just as simple Expected Shortfall: mean of the losses beyond VaR
  ES.hist <- mean(loss.rf[loss.rf > VaR.hist])
  VaR.text <- paste("Value at Risk =\n", round(VaR.hist, 4)) # ="VaR"&c12
  ES.text <- paste("Expected Shortfall \n=", round(ES.hist, 4))
  title.text <- paste(round(alpha.tolerance*100, 0), "% Loss Limits")
  p <- ggplot(loss.rf.df, aes(x = Loss, fill = Distribution)) +
    geom_histogram(alpha = 0.8, bins=30) +
    geom_vline(aes(xintercept = VaR.hist), linetype = "dashed", size = 1, color = "blue") +
    geom_vline(aes(xintercept = ES.hist), size = 1, color = "blue") +
    annotate("text", x = VaR.hist, y = 200, label = VaR.text) +
    annotate("text", x = ES.hist, y = 100, label = ES.text) +
    xlim(-0.5, 0.5) + ggtitle(title.text)
  p	##ggplotly(p)
})
```

Extremes
=======================================================================
Column {data-width=200}
-----------------------------------------------------------------------
### Let's go to extremes

- All along we have been stylizing financial returns, commodities and exchange rates, as skewed and with thick tails.
- We next go on to an extreme tail distribution called the Generalized Pareto Distribution (GPD). 
- For very high thresholds, GPD not only well describes behavior in excess of the threshold, but the mean excess over the threshold is linear in the threshold. 
- From this we get more intuition around the use of expected shortfall as a coherent risk measure. 
- In recent years we well exceeded all Gaussian and Student's t thresholds.

For a random variate $\Large x$, this distribution is defined for shape parameter $\Large \xi > 0$ and scale parameter $\Large \beta > 0$ as:

$$
\Large
g(x; \xi > 0) = 1- (1 + x \xi/\beta)^{-1/\xi}
$$


and when the shape parameter $\Large \xi = 0$, the GPD becomes the exponential distribution dependent only on the scale parameter $\beta$:

$$
\Large
g(x; \xi = 0) = 1 - \exp(-x/\beta).
$$
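
The two cases can be checked numerically. This is a minimal sketch with a hypothetical helper `pgpd_sketch` written for the illustration (it is not the `QRM` implementation): as the shape parameter shrinks toward zero, the general formula approaches the exponential form.

```r
# Minimal sketch: GPD distribution function for xi > 0 and for xi = 0.
pgpd_sketch <- function(x, xi, beta) {
  if (xi == 0) {
    1 - exp(-x / beta)                    # exponential case
  } else {
    1 - (1 + xi * x / beta)^(-1 / xi)     # heavy-tailed case, xi > 0
  }
}

x <- c(0.5, 1, 2)
rbind(xi_near_zero = pgpd_sketch(x, xi = 1e-6, beta = 1.5),
      xi_zero      = pgpd_sketch(x, xi = 0,    beta = 1.5),
      xi_half      = pgpd_sketch(x, xi = 0.5,  beta = 1.5))
```

For a larger shape parameter the distribution function rises more slowly at high values of the excess, which is the thick tail described earlier.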

Now for one reason for GPD's notoriety...

- If $\Large u$ is an upper (very high) threshold, then the mean excess function for the GPD (for $\Large \xi < 1$) is

$$
\Large
e(u) = \frac{\beta + \xi u}{1 - \xi}.
$$

- This simple measure is _linear_ in thresholds. 
- It will allow us to visualize where rare events begin (see McNeil, Frey, and Embrechts (2015, chapter 5)). 
- We often exploit this property when we look at operational loss data.
- Here is a mean excess loss plot for the `loss.rf` data. If there is a straight-line relationship after a threshold, then we have some evidence for the existence of a GPD for the tail.
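
To see the linearity property in the simplest case, the sketch below computes the empirical mean excess on simulated exponential losses, which correspond to a GPD with shape parameter zero. For that case the formula above reduces to a constant equal to the scale parameter. The helper `mean_excess` and the simulated data are assumptions made for this illustration.

```r
# Minimal sketch: empirical mean excess over a set of thresholds.
mean_excess <- function(x, u) {
  # For each threshold, average the amount by which the data exceed it
  sapply(u, function(ui) mean(x[x > ui] - ui))
}

set.seed(7)                          # reproducible illustration only
beta <- 2
x <- rexp(5000, rate = 1 / beta)     # exponential losses: GPD with xi = 0
u <- quantile(x, probs = c(0.5, 0.75, 0.9), names = FALSE)

data.frame(threshold = u, mean_excess = mean_excess(x, u))
```

The estimates should stay close to the scale parameter at every threshold. With a positive shape parameter they would instead rise along a straight line, which is the pattern to look for in the plot of the `loss.rf` data.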

Column {data-width=400}
-----------------------------------------------------------------------
### Mean excess loss

```{r Confidence level}
sliderInput("confidencelevel", label = "Confidence level (%):",
            min = 0.50, max = 0.99, value = 0.95, step = 0.01)
```

```{r Extreme}
# mean excess plot to determine thresholds for extreme event management

renderPlotly({
  data <- as.vector(loss.rf) # data is purely numeric
  umin <-  min(data) + 0.3         # threshold u min
  umax <-  max(data) - 0.1   # threshold u max
  nint <- 100                # grid length to generate mean excess plot
  grid.0 <- numeric(nint)    # grid store
  e <- grid.0                # store mean exceedances e
  upper <- grid.0            # store upper confidence interval
  lower <- grid.0            # store lower confidence interval
  u <- seq(umin, umax, length = nint) # threshold u grid

  #alpha <- 0.95                  # confidence level
  alpha.confidence <- input$confidencelevel
  for (i in 1:nint) {
    data <- data[data > u[i]]  # subset data above thresholds
    e[i] <- mean(data - u[i])  # calculate mean excess of threshold
    sdev <- sqrt(var(data))    # standard deviation
    n <- length(data)          # sample size of subsetted data above thresholds
    upper[i] <- e[i] + (qnorm((1 + alpha.confidence)/2) * sdev)/sqrt(n) # upper confidence interval
    lower[i] <- e[i] - (qnorm((1 + alpha.confidence)/2) * sdev)/sqrt(n) # lower confidence interval
  }
  mep.df <- data.frame(threshold = u, threshold.exceedances = e, lower = lower, upper = upper)
  upper_text <- paste("upper ", round(alpha.confidence*100,0), "%")
  lower_text <- paste("lower ", round((1 - alpha.confidence)*100,0), "%")
  # Voila the plot => you may need to tweak these limits!
  p <- ggplot(mep.df, aes( x= threshold, y = threshold.exceedances)) +
    geom_line() +
    geom_line(aes(x = threshold, y = lower), colour = "red") +
    geom_line(aes(x = threshold, y = upper), colour = "red") +
    annotate("text", x = 0.2, y = 0.2, label = upper_text) +
    annotate("text", x = 0.2, y = -0.1, label = lower_text)
  p2 <- ggplotly(p) %>% layout(height = 300)
  p2
})
```

### GPD fits and starts

```{r GPD}
#library(QRM)
# Plot away
renderPlotly({
  library(QRM)
  #alpha.tolerance <- 0.95
  alpha.tolerance <- input$confidencelevel
  u <- quantile(loss.rf, alpha.tolerance , names=FALSE)
  loss.excess <- loss.rf[loss.rf > u] - u # excesses over the threshold (local to this plot)
  fit <- fit.GPD(loss.rf, threshold=u) # Fit GPD to the excesses
  xi.hat <- fit$par.ests[["xi"]] # fitted xi
  beta.hat <- fit$par.ests[["beta"]] # fitted beta
  n.relative.excess <- length(loss.excess) / length(loss.rf) # = N_u/n
  VaR.gpd <- u + (beta.hat/xi.hat)*(((1-alpha.tolerance) / n.relative.excess)^(-xi.hat)-1) 
  ES.gpd <- (VaR.gpd + beta.hat-xi.hat*u) / (1-xi.hat)

  VaRgpd.text <- paste("GPD: Value at Risk =", round(VaR.gpd, 2))
  ESgpd.text <- paste("Expected Shortfall =", round(ES.gpd, 2))
  title.text <- paste(VaRgpd.text, ESgpd.text, sep = " ")
  loss.plot <- ggplot(loss.rf.df, aes(x = Loss, fill = Distribution)) +
    geom_density(alpha = 0.2)
  loss.plot <- loss.plot + 
    geom_vline(aes(xintercept = VaR.gpd), colour = "blue", linetype = "dashed", size = 0.8)
  loss.plot <- loss.plot + 
    geom_vline(aes(xintercept = ES.gpd), colour = "blue", size = 0.8) 
  loss.plot <- loss.plot + xlim(0,0.5) + ggtitle(title.text)
  loss.plot <- loss.plot + annotate("text", x = VaR.gpd, y = 10, label = VaRgpd.text)
  loss.plot <- loss.plot + annotate("text", x = ES.gpd, y = 5, label = ESgpd.text)
  loss.plot	##ggplotly(loss.plot)
})
```

Column {data-width=400}
-----------------------------------------------------------------------
### Confidence and risk measures
Threshold at the 95% loss quantile; risk measures estimated at the 99% level
```{r BFGS}
alpha.tolerance <- 0.95
u <- quantile(loss.rf, alpha.tolerance , names=FALSE)
loss.excess <- loss.rf[loss.rf > u] - u # excesses over the threshold
fit <- fit.GPD(loss.rf, threshold=u) # Fit GPD to the excesses
xi.hat <- fit$par.ests[["xi"]] # fitted xi
beta.hat <- fit$par.ests[["beta"]] # fitted beta
showRM(fit, alpha = 0.99, RM = "ES", method = "BFGS")
# Add graphs for Estimated tail probabilities for confidence interval x
showRM(fit, alpha = 0.99, RM = "VaR", method = "BFGS")
```

### How good a fit to the data have we found?
```{r Data fit}
gpd.density <- pGPD(loss.excess - u, xi = xi.hat, beta = beta.hat) # GPD CDF evaluated at the excesses over the threshold u
gpd.density.df <- data.frame(Density = gpd.density, Distribution = rep("GPD", each = length(gpd.density))) ## This should be U[0,1]
ggplot(gpd.density.df, aes(x = Density, fill = Distribution)) + geom_histogram()
```
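
The uniformity check rests on the probability integral transform: if the excesses over the threshold really follow the fitted GPD, passing them through the GPD distribution function should give values that look like draws from U[0,1]. The sketch below illustrates this with simulated data in base R. The helpers `pgpd_sketch` and `qgpd_sketch` are hypothetical stand-ins written for this example (they assume `xi > 0`), and all parameter values are made up.

```r
# Hypothetical base-R helpers for the GPD with shape xi > 0 and scale beta
pgpd_sketch <- function(x, xi, beta) 1 - (1 + xi * x / beta)^(-1 / xi)   # distribution function
qgpd_sketch <- function(p, xi, beta) (beta / xi) * ((1 - p)^(-xi) - 1)   # quantile function

set.seed(42)
xi <- 0.2; beta <- 0.05; u <- 0.10          # made-up parameters and threshold
p_draws <- runif(500)
excess <- qgpd_sketch(p_draws, xi, beta)    # simulated excesses over u
exceedance <- u + excess                    # the raw losses above u

pit_excess <- pgpd_sketch(excess, xi, beta)      # spread over [0, 1]
pit_raw <- pgpd_sketch(exceedance, xi, beta)     # bunched near 1: the threshold was not removed
```

The second transform shows why the threshold has to be subtracted first: applied to raw exceedances, every value lands above `pgpd_sketch(u, xi, beta)` and the histogram cannot look uniform.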

Optimization
==========================================================
Column {.tabset}
----------------------------------------------------------

### If that wasn't enough...

Stylized market facts indicate

- Allocation across the various components of loss drivers requires both body and tail considerations
- Pessimistic risk measurement requires some sort of distortion measure to assess the probability of good and bad news

So that ...

- Bassett et al. (2004) show that the mean-expected shortfall efficient portfolio problem is equivalent to a quantile regression with linear constraints.
- Enlarge the scope of expected utility from monetary outcomes and their probabilities to include an assessment (distortion) of probability.
- Choquet integrals build on Lebesgue measures by inflating or deflating the probabilities by the rank order of the outcomes.
- Expected shortfall is an example of a Choquet, rank-ordered, criterion.

**The risk-free rate was**

```{r Optimization}
mu.free <- 0.0004/253 ## input value of daily risk-free interest rate
#
```

- Obtained from https://www.treasury.gov/resource-center/data-chart-center/interest-rates/pages/textview.aspx?data=yield
- 1 month Daily Treasury Yield Curve Rate is 0.04% on 3/20/20
- Using 253 (2020 is a leap year) as the number of trading days
- Risk free rate is calculated as (0.04% / 253) = `r mu.free` or `r format(mu.free, scientific=FALSE)`


### Pessimism reigns

- A risk measure $\Large \rho$ is pessimistic if, for some probability measure $\Large \phi$ on $\Large [0,1]$, 
$$
\Large
\rho(L) = \int_0^1 \rho_{u}(L) \phi(u) du.
$$

- For expected shortfall, $\Large \phi(u) = (1-\alpha)^{-1}I_{(u\geq\alpha)}$: equal weight is placed on all quantiles beyond the $\alpha$-quantile.
- Suppose we have a loss portfolio with position weights $\Large \pi$ and losses $X$ so that total loss is $\Large L = X\,'\pi$ with mean loss $\Large \mu(L)$. Let's choose loss weights to minimize
$$
\Large
\min_{\pi}\,[\rho_{\alpha}(L) - \lambda \mu(L)] \,\, \mathrm{s.t.}\, \mu(L)=\mu_0, \,\, 1^T\pi = 1
$$

where the weights add up to 1 and we try to achieve a target return $\Large \mu_0$.
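
The expected-shortfall weighting above puts equal weight on every quantile beyond the $\alpha$-quantile, so its sample analogue is simply the average of the losses at or beyond the empirical $\alpha$-quantile. A minimal sketch, assuming a plain vector of losses (the helper name `empirical_es` and the toy data are hypothetical):

```r
# Hypothetical helper: empirical VaR and expected shortfall of a loss sample
empirical_es <- function(loss, alpha) {
  var_alpha <- quantile(loss, alpha, names = FALSE)
  # equal weight on every loss at or beyond the alpha-quantile
  es_alpha <- mean(loss[loss >= var_alpha])
  c(VaR = var_alpha, ES = es_alpha)
}

toy_loss <- 1:100                       # made-up losses
risk_95 <- empirical_es(toy_loss, alpha = 0.95)
```

By construction the expected shortfall can never be smaller than the corresponding value at risk.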

- Taking this formulation to a sample version for $\Large n$ observations of losses, we get
$$
\Large
\min_{\beta, \xi}\sum_{k=1}^m \, \sum_{i=1}^n \, \nu_k \, \rho_{\alpha_k}\left(X_{i1}-\sum_{j=2}^p (x_{i1}-x_{ij})\beta_{j}-\xi_k\right)
$$

$$
\Large
\mathrm{s.t.}\, \bar{X}\,'\pi(\beta) = \mu_0
$$

- In this approach, there are $\Large m$ weights $\Large \nu$ that pull together $\Large m$ different sets of portfolio weightings. The $\Large \xi$ terms represent $\Large m$ different intercepts, one for each $\Large \nu_k$ weight. 
- There are $\Large p$ assets or loss categories here. We use the first asset, $\Large j = 1$, as the "numeraire" or benchmark asset. We measure returns on assets 2 to $\Large p$ relative to the first asset. The weights for assets 2 to $\Large p$ are the regression coefficients $\Large \beta$. The weight for the first asset uses the adding-up constraint so that

$$
\Large
\pi_1 = 1 - \sum_{j=2}^p \pi_j
$$
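
The adding-up constraint is what lets the regression on numeraire-relative returns stand in for the portfolio return. A small sketch with simulated data shows the identity; the returns and the slope values are made up for illustration.

```r
set.seed(1)
x <- matrix(rnorm(200 * 3, sd = 0.01), ncol = 3)   # made-up returns for 3 assets
beta <- c(0.25, 0.35)                               # made-up slopes: weights of assets 2 and 3

pi_hat <- c(1 - sum(beta), beta)                    # adding-up constraint fixes the weight of asset 1
X <- x[, 1] - x[, -1]                               # numeraire return minus each other asset's return

port_direct <- drop(x %*% pi_hat)                   # portfolio return from the weights
port_regress <- x[, 1] - drop(X %*% beta)           # same return written as a regression fit
```

Both expressions give the same series, which is why the slopes can be read directly as portfolio weights.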


The corresponding Markowitz (1952) approach is
$$
\Large
\min_{\beta, \xi} \, \sum_{i=1}^n \left(X_{i1}-\sum_{j=2}^p (x_{i1}-x_{ij})\beta_{j}-\xi\right)^2
$$
subject to the constraint

$$
\Large
\bar{X}\,'\pi = \mu_0
$$
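
To see why a least-squares regression reproduces a Markowitz portfolio, the sketch below drops the mean-return constraint (an assumption made only to keep the example short) and compares the regression weights with the closed-form global minimum-variance weights. The data are simulated and all names are illustrative.

```r
set.seed(2)
z <- matrix(rnorm(500 * 3), ncol = 3)
# made-up correlated returns for 3 assets
x <- cbind(z[, 1], 0.5 * z[, 1] + z[, 2], 0.3 * z[, 1] + z[, 3]) * 0.01

y <- x[, 1]                                  # numeraire asset
X <- x[, 1] - x[, -1]                        # numeraire-relative returns
beta_ols <- coef(lm(y ~ X))[-1]              # slopes; the intercept plays the role of xi
pi_ols <- unname(c(1 - sum(beta_ols), beta_ols))

S <- cov(x)
pi_gmv <- solve(S, rep(1, 3))                # closed form: S^{-1} 1 / (1' S^{-1} 1)
pi_gmv <- pi_gmv / sum(pi_gmv)
```

The residual sum of squares of this regression is proportional to the sample variance of the portfolio return, so minimizing one minimizes the other.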

- We model distortions using weighted quantiles.
- The Choquet criterion ends up using a weighted average of quantile allocations across assessed probabilities to express preferences.
- Minimize a weighted sum of quantile regression objective functions using the specified $\alpha$ quantiles. 
- The model permits distinct intercept parameters at each of the specified quantiles (taus), but the slope parameters are constrained to be the same for all $\Large \alpha$s. 
- This estimator was originally suggested to Roger Koenker by Bob Hogg in one of his famous blue book notes of 1979. 
- The algorithm used to solve the resulting linear programming problems is either the Frisch Newton algorithm described in Portnoy and Koenker (1997), or the closely related algorithm described in Koenker and Ng (2002) that handles linear inequality constraints. 
- Linear inequality constraints can be imposed.
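
The objective described in these bullets can be sketched in a few lines of base R. The example leaves out the slope parameters and the linear constraints (a simplification made for this illustration), so each intercept simply settles at the corresponding sample quantile. The helper names and the toy data are hypothetical.

```r
# Check (pinball) loss used by quantile regression
check_loss <- function(u, tau) u * (tau - (u < 0))

# Hypothetical helper: weighted sum of check-loss objectives with one
# intercept per quantile level (slopes omitted for brevity)
weighted_objective <- function(xi, y, taus, w) {
  sum(vapply(seq_along(taus),
             function(k) w[k] * sum(check_loss(y - xi[k], taus[k])),
             numeric(1)))
}

y <- as.numeric(1:99)                   # made-up outcomes
taus <- c(0.1, 0.3)                     # quantile levels
w <- c(0.3, 0.7)                        # distortion weights

# Without slopes, each intercept can be minimized on its own over the sample points
xi_hat <- vapply(taus, function(tau) {
  obj <- vapply(y, function(cand) sum(check_loss(y - cand, tau)), numeric(1))
  y[which.min(obj)]
}, numeric(1))
```

In the full estimator the slopes are shared across quantile levels, so the intercepts can no longer be found separately and the problem is solved as one linear program.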

```{r Quantiles, mysize=TRUE, size='\\footnotesize', echo = TRUE}
library(quantreg)
x <- data.r/100
n <- nrow(x)
p <- ncol(x)
alpha <-  c(0.1, 0.3) # quantiles
w <-  c(0.3, 0.7) # distortion weights
lambda <- 100 # multiplier that scales the target mean-return constraint (R, r)
m <- length(alpha)
if (length(w) != m) stop("length of w doesn't match length of alpha") # error handling
xbar <- apply(x, 2, mean)
mu.0 <-  mean(xbar)
y <- x[, 1] #set numeraire
r <- c(lambda * (xbar[1] - mu.0), -lambda * (xbar[1] - mu.0))
X <- x[, 1] - x[, -1]
R <- rbind(lambda * (xbar[1] - xbar[-1]), -lambda * (xbar[1] - xbar[-1]))
R <- cbind(matrix(0, nrow(R), m), R)
f <- rq.fit.hogg(X, y, taus = alpha, weights = w, R = R, r = r)
fit <- f$coefficients
# transform regression coeff to portfolio weights
pihat <- c(1 - sum(fit[-(1:m)]), fit[-(1:m)]) 
x <- as.matrix(x)
yhat <- x %*% pihat # predicted 
etahat <- quantile(yhat, alpha)
muhat <- mean(yhat)
qrisk <- 0
for (i in 1:length(alpha))
{
  qrisk <- qrisk + w[i] * sum(yhat[yhat < etahat[i]])/(n * alpha[i])
}
qrisk
pihat
```

Trying a different distortion

```{r Returns, mysize=TRUE, size='\\footnotesize', echo = TRUE}
library(quantreg)
#library(dplyr) # use data.df now
alpha <- 0.95
u <- quantile(data.df$returns.Corn, alpha )
x <- data.df.nd[data.df.nd$returns.Corn < u, 2:4]/100
n <- nrow(x)
p <- ncol(x)
alpha <-  c(0.01, 0.1) # quantiles at lower (negative) tail
w <-  c(0.95, 0.05) # distortion weights
lambda <- 100 # multiplier that scales the target mean-return constraint (R, r)
m <- length(alpha) #alpha and w length must be the same
xbar <- apply(x, 2, mean)
mu.0 <-  mean(xbar)
y <- x[, 1] #set numeraire
r <- c(lambda * (xbar[1] - mu.0), -lambda * (xbar[1] - mu.0))
X <- x[, 1] - x[, -1] # set up design matrix of adjusted all but numeraire returns
R <- rbind(lambda * (xbar[1] - xbar[-1]), -lambda * (xbar[1] - xbar[-1])) # constraints
R <- cbind(matrix(0, nrow(R), m), R) #augmented constraints
f <- rq.fit.hogg(X, y, taus = alpha, weights = w, R = R, r = r) #Bob Hogg estimator
fit <- f$coefficients
# transform regression coeff to portfolio weights
pihat <- c(1 - sum(fit[-(1:m)]), fit[-(1:m)]) 
x <- as.matrix(x)
yhat <- x %*% pihat # predicted 
(etahat <- quantile(yhat, alpha))
(muhat <- mean(yhat))
qrisk <- 0
for (i in 1:length(alpha))
{
  qrisk <- qrisk + w[i] * sum(yhat[yhat < etahat[i]])/(n * alpha[i])
}
qrisk
pihat
```


### Extreme frontier finance

Weights
```{r Extreme Frontier}
mu.0 <- xbar
mu.P <- seq(-.0005, 0.0015, length = 100) ## grid of 100 possible target portfolio returns
qrisk.P <-  mu.P ## set up storage for quantile risks of portfolio returns
weights <-  matrix(0, nrow=length(mu.P), ncol = ncol(data.r)) ## storage for portfolio weights, one row per target return
colnames(weights) <- names(data.r)
for (i in 1:length(mu.P))
{
  mu.0 <-  mu.P[i]  ## target returns
  result <- qrisk(x, mu = mu.0)
  qrisk.P[i] <- -result$qrisk # convert to loss risk already weighted across alphas
  weights[i,] <-  result$pihat
}
qrisk.mu.df <- data.frame(qrisk.P = qrisk.P, mu.P = mu.P )
mu.P <- qrisk.mu.df$mu.P
##mu.free <-  0.00011 ## input value of risk-free interest rate
sharpe <- ( mu.P-mu.free)/qrisk.P ## compute Sharpe's ratios
ind <-  (sharpe == max(sharpe)) ## Find maximum Sharpe's ratio
ind2 <-  (qrisk.P == min(qrisk.P)) ## find the minimum quantile-risk portfolio
ind3 <-  (mu.P > mu.P[ind2]) ## find the efficient frontier (blue)
col.P <- ifelse(mu.P > mu.P[ind2], "blue", "grey")
weights.extr <- weights[ind,] # for use in calculating tangency risk measures
qrisk.mu.df$col.P <- col.P
eff_slope <- ((mu.P[ind]-mu.free) / qrisk.P[ind])
(weights.extr)
renderPlotly({
  p <- ggplot(qrisk.mu.df, aes(x = qrisk.P, y = mu.P, group = 1))
  p <- p + geom_line(aes(colour= col.P, group = col.P))
  p <- p + scale_colour_identity()  
  p <- p + geom_point(aes(x = 0, y = mu.free), colour = "red")
  options(digits=3)
  p <- p + geom_abline(intercept = mu.free, 
                       slope = eff_slope,
                       colour = "red")
  p <- p + geom_point(aes(x = qrisk.P[ind], y = mu.P[ind])) 
  p <- p + geom_point(aes(x = qrisk.P[ind2], y = mu.P[ind2])) 
  p2 <- ggplotly(p) %>% layout(height = 600, width = 600)
  p2	### ggplotly(p)
})
```
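
The frontier bookkeeping in the chunk above (tangency point, minimum-risk point, efficient segment) can be illustrated on a tiny grid. This is a sketch under stated assumptions: the helper `frontier_points` is hypothetical, the return and risk numbers are made up, and `which.max()` / `which.min()` are used in place of the logical comparisons so that exactly one index is returned even when values tie.

```r
# Hypothetical helper: locate the tangency and minimum-risk points on a frontier
frontier_points <- function(mu, risk, rf) {
  ratio <- (mu - rf) / risk             # reward per unit of risk (Sharpe-type ratio)
  i_tan <- which.max(ratio)             # steepest line from the risk-free point
  i_min <- which.min(risk)              # least risky portfolio
  list(tangency = i_tan, min_risk = i_min,
       efficient = mu > mu[i_min],      # points above the minimum-risk return
       slope = ratio[i_tan])
}

mu <- c(0.00020, 0.00045, 0.00070, 0.00095, 0.00120)   # made-up target returns
risk <- c(0.012, 0.010, 0.011, 0.014, 0.020)           # made-up risk values
fp <- frontier_points(mu, risk, rf = 0.0004 / 253)
```

The `slope` element is the slope of the red line drawn from the risk-free point through the tangency portfolio.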

### Extreme portfolio risk measures

```{r Extreme Portfolio}
# price.last <- as.numeric(tail(data[, -1], n=1))
# Specify the positions
position.rf <- weights.extr #c(1/3, 1/3, 1/3)
# And compute the position weights
w <- position.rf * price.last
# Fan these across the length and breadth of the risk factor series
weights.rf <- matrix(w, nrow=nrow(data.r), ncol=ncol(data.r), byrow=TRUE)
#head(rowSums((exp(data.r/100)-1)*weights.rf), n=3)
## We need to compute exp(x) - 1 for very small x: expm1 accomplishes this
#head(rowSums((exp(data.r/100)-1)*weights.rf), n=4)
loss.rf <- -rowSums(expm1(data.r/100) * weights.rf)
alpha.tolerance <- 0.90
u <- quantile(loss.rf, alpha.tolerance, names = FALSE)
fit.extr <- fit.GPD(loss.rf, threshold = u)
renderPlot({
  showRM(fit.extr, alpha = alpha.tolerance, RM = "ES", method = "BFGS")
})
```
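
The loss series fed into the GPD fit is built by converting returns to simple returns and weighting them by position value. A minimal sketch of that step, assuming the returns are log returns in percent (which is what dividing by 100 and applying `expm1()` suggests) and using made-up numbers:

```r
# made-up daily log returns in percent: 2 days, 3 assets
r_pct <- matrix(c(1.0, -2.0, 0.5,
                  0.0,  0.0, 0.0), nrow = 2, byrow = TRUE)
position_value <- c(100, 200, 300)      # made-up position values

# expm1(x) computes exp(x) - 1 accurately for small x
simple_ret <- expm1(r_pct / 100)
# multiply each column by its position value, then sum across assets;
# the sign flip turns gains into negative losses
loss <- -rowSums(sweep(simple_ret, 2, position_value, `*`))
```

`sweep()` does the same job here as building a full matrix of repeated weights, without having to match its dimensions by hand.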

### Portfolio Analytics: the Markowitz model: default

```{r Portfolio Analytics}
library(quadprog)

R <- returns[,1:3]/100
quantile_R <- quantile(R[,1], 0.95)
#R <- subset(R, Corn > quantile_R, select = Corn:Wheat)
names.R <- colnames(R)
mean.R <-  apply(R,2,mean)
cov.R <-  cov(R)
sd.R <-  sqrt(diag(cov.R)) ## remember these are in daily percentages
#library(quadprog)
Amat <-  cbind(rep(1,3),mean.R)  ## set the equality constraints matrix
mu.P <- seq(min(mean.R), max(mean.R), length = 300)  ## set of 300 possible target portfolio returns
#mu.P <- seq(0.5*quantile_R, max(R), length = 100)  ## set of 300 possible target portfolio returns
sigma.P <-  mu.P ## set up storage for std dev's of portfolio returns
weights <-  matrix(0, nrow=300, ncol = ncol(R)) ## storage for portfolio weights
colnames(weights) <- names.R
for (i in 1:length(mu.P))
{
  bvec <- c(1,mu.P[i])  ## constraint vector
  result <- solve.QP(Dmat=2*cov.R,dvec=rep(0,3),Amat=Amat,bvec=bvec,meq=2)
  sigma.P[i] <- sqrt(result$value)
  weights[i,] <- result$solution
}
sigma.mu.df <- data.frame(sigma.P = sigma.P, mu.P = mu.P )
##mu.free <- .00011 ## input value of daily risk-free interest rate
sharpe <- ( mu.P-mu.free)/sigma.P ## compute Sharpe's ratios
ind <-  (sharpe == max(sharpe)) ## Find maximum Sharpe's ratio
ind2 <-  (sigma.P == min(sigma.P)) ## find the minimum variance portfolio
ind3 <-  (mu.P > mu.P[ind2]) ## finally the efficient frontier
col.P <- ifelse(mu.P > mu.P[ind2], "blue", "grey")
sigma.mu.df$col.P <- col.P
mark_default_slop <- ((mu.P[ind]-mu.free)/sigma.P[ind])
(mark_default_slop)
renderPlotly({
  p <- ggplot(sigma.mu.df, aes(x = sigma.P, y = mu.P, group = 1)) +
    geom_line(aes(colour=col.P, group = col.P)) +
    scale_colour_identity() + xlim(0, max(sd.R*1.5)) + ylim(0, max(mean.R)*1.5)
  p <- p + geom_point(aes(x = 0, y = mu.free), colour = "red")
  options(digits=4)
  p <- p + geom_abline(intercept = mu.free, slope = mark_default_slop, colour = "red")
  p <- p + geom_point(aes(x = sigma.P[ind], y = mu.P[ind], pch="*")) 
  p <- p + geom_point(aes(x = sigma.P[ind2], y = mu.P[ind2], pch="-")) ## show min var portfolio
  p <- p + annotate("text", x = sd.R[1], y = 0.00005+mean.R[1], label = names.R[1]) + # Corn
    annotate("text", x = sd.R[2]-0.01, y = 0.00001+mean.R[2], label = names.R[2]) + # Soybean
    annotate("text", x = sd.R[3]-0.005, y = -0.00005+mean.R[3], label = names.R[3]) # Wheat
  p2 <- ggplotly(p) %>% layout(height = 600, width = 800)
  p2	### ggplotly(p)
})
```
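
With only the two equality constraints (weights sum to one, portfolio mean equals the target), the quadratic program has a closed-form solution through its KKT linear system. The sketch below uses that system in base R instead of `solve.QP()`, purely to show what is being solved. The helper `min_var_weights` is hypothetical and the covariance matrix and mean returns are made up.

```r
# Hypothetical helper: minimum-variance weights for a target mean return,
# short positions allowed, solved from the KKT linear system
min_var_weights <- function(Sigma, mean_ret, target) {
  k <- length(mean_ret)
  A <- cbind(rep(1, k), mean_ret)                 # same column layout as Amat
  kkt <- rbind(cbind(2 * Sigma, A),
               cbind(t(A), matrix(0, 2, 2)))
  rhs <- c(rep(0, k), 1, target)                  # stationarity rows, then the two constraints
  w <- solve(kkt, rhs)[seq_len(k)]                # drop the two multipliers
  list(weights = w, sd = sqrt(drop(t(w) %*% Sigma %*% w)))
}

Sigma <- matrix(c(4, 1, 1,
                  1, 3, 1,
                  1, 1, 2), nrow = 3)             # made-up covariance matrix
mean_ret <- c(0.05, 0.03, 0.08)                   # made-up mean returns
sol <- min_var_weights(Sigma, mean_ret, target = 0.06)
```

Repeating the call over a grid of targets traces out the same kind of frontier as the loop above, as long as no inequality constraints are needed.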

### Portfolio Analytics: the Markowitz model: no short

```{r Markowitz}
library(quadprog)

R <- returns[,1:3]/100
quantile_R <- quantile(R[,1], 0.95)
#R <- subset(R, Corn > quantile_R, select = Corn:Wheat)
names.R <- colnames(R)
mean.R <-  apply(R,2,mean)
cov.R <-  cov(R)
sd.R <-  sqrt(diag(cov.R)) ## remember these are in daily percentages
#library(quadprog)
Amat <-  cbind(rep(1,3),mean.R,diag(1,3))  ## equality constraints (budget, target mean) followed by the no-short-sale inequality constraints
mu.P <- seq(min(mean.R), max(mean.R), length = 300)  ## set of 300 possible target portfolio returns
#mu.P <- seq(0.5*quantile_R, max(R), length = 100)  ## set of 300 possible target portfolio returns
sigma.P <-  mu.P ## set up storage for std dev's of portfolio returns
weights.x <-  matrix(0, nrow=300, ncol = ncol(R)) ## storage for portfolio weights.x
colnames(weights.x) <- names.R
for (i in 1:length(mu.P))
{
  bvec <- c(1,mu.P[i],rep(0,3)) ## no short sales
  result <- solve.QP(Dmat=2*cov.R,dvec=rep(0,3),Amat=Amat,bvec=bvec,meq=2)
  sigma.P[i] <- sqrt(result$value)
  weights.x[i,] <- result$solution
}
sigma.mu.df <- data.frame(sigma.P = sigma.P, mu.P = mu.P )
##mu.free <- .00011 ## input value of daily risk-free interest rate
sharpe <- ( mu.P-mu.free)/sigma.P ## compute Sharpe's ratios
inx <-  (sharpe == max(sharpe)) ## Find maximum Sharpe's ratio
inx2 <-  (sigma.P == min(sigma.P)) ## find the minimum variance portfolio
indx3 <-  (mu.P > mu.P[inx2]) ## finally the efficient frontier
col.P <- ifelse(mu.P > mu.P[inx2], "blue", "grey")
sigma.mu.df$col.P <- col.P
mark_no_short_slope <- ((mu.P[inx]-mu.free)/sigma.P[inx])
(mark_no_short_slope)
renderPlotly({
  p <- ggplot(sigma.mu.df, aes(x = sigma.P, y = mu.P, group = 1)) +
    geom_line(aes(colour=col.P, group = col.P)) +
    scale_colour_identity() + xlim(0, max(sd.R*1.5)) + ylim(0, max(mean.R)*1.5)
  p <- p + geom_point(aes(x = 0, y = mu.free), colour = "red")
  options(digits=4)
  p <- p + geom_abline(intercept = mu.free, slope = mark_no_short_slope, colour = "red")
  p <- p + geom_point(aes(x = sigma.P[inx], y = mu.P[inx], pch="*")) 
  p <- p + geom_point(aes(x = sigma.P[inx2], y = mu.P[inx2], pch="-")) ## show min var portfolio
  p <- p + annotate("text", x = sd.R[1], y = 0.00005+mean.R[1], label = names.R[1]) + # Corn
    annotate("text", x = sd.R[2]-0.01, y = 0.00001+mean.R[2], label = names.R[2]) + # Soybean
    annotate("text", x = sd.R[3]-0.005, y = -0.00005+mean.R[3], label = names.R[3]) # Wheat
  p2 <- ggplotly(p) %>% layout(height = 600, width = 800)
  p2	### ggplotly(p)
})
```
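
`solve.QP()` reads its constraints as `t(Amat) %*% w >= bvec`, treating the first `meq` of them as equalities. The sketch below checks candidate weight vectors against that layout in base R. The helper `is_feasible`, the mean returns and the candidate weights are all hypothetical and serve only to show how the no-short-sale rows work.

```r
# Hypothetical helper: test a weight vector against constraints laid out the
# way quadprog::solve.QP expects them
is_feasible <- function(w, Amat, bvec, meq, tol = 1e-10) {
  lhs <- drop(t(Amat) %*% w)
  eq <- seq_len(meq)
  ineq <- setdiff(seq_along(lhs), eq)
  all(abs(lhs[eq] - bvec[eq]) <= tol) && all(lhs[ineq] >= bvec[ineq] - tol)
}

mean_ret <- c(0.05, 0.03, 0.08)                   # made-up mean returns
Amat <- cbind(rep(1, 3), mean_ret, diag(1, 3))    # budget, target mean, then w_j >= 0
target <- sum(c(0.2, 0.3, 0.5) * mean_ret)
bvec <- c(1, target, rep(0, 3))

ok_long <- is_feasible(c(0.2, 0.3, 0.5), Amat, bvec, meq = 2)    # long-only candidate
ok_short <- is_feasible(c(1.2, -0.3, 0.1), Amat, bvec, meq = 2)  # same budget and mean, one short position
```

Both candidates satisfy the two equalities, so only the extra identity-matrix rows separate the no-short problem from the default one.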


Conclusion
=======================================================================

Column 
-------------------------------------------------------------------------

### Skills and Tools

1. Packages: ggplot2, scales, quadprog, quantreg, shiny, flexdashboard, qrmdata, xts, matrixStats, zoo, QRM, plotly, and psych

Skills used for data exploration and analytics are the following -

1. head() to return the first part of a vector, matrix, table, data frame or function.
2. tail() to return the last part of a vector, matrix, table, data frame or function.
3. summary() to produce result summaries of the results of various model fitting functions.
4. format() to format an R object for pretty printing.
5. diff() to return suitably lagged and iterated differences.
6. ifelse() to return a value with the same shape as test which is filled with elements selected from either yes or no depending on whether the element of test is TRUE or FALSE.
7. function() to create and store a function for data analysis.
8. round() to round the values in its first argument to the specified number of decimal places (default 0).
9. mean() is a generic function for the (trimmed) arithmetic mean.
10. ts() to create time-series objects.
11. rollapply() to apply a function to rolling margins of an array.
12. merge() to merge two data frames by common columns or row names, or do other versions of database join operations.
13. seq() to generate regular sequences.
14. rq() to perform a quantile regression on a design matrix, x, of explanatory variables and a vector, y, of responses.
15. lm() can be used to carry out regression, single stratum analysis of variance and analysis of covariance.
16. plot() to draw a scatterplot or other graphical display of an R object.
17. split() to divide the data in the vector x into the groups defined by f. 
18. lapply() to return a list of the same length as X, each element of which is the result of applying FUN to the corresponding element of X.
19. image_animate() manipulate or combine multiple frames of an image. 
20. ncol() returns the number of columns.
21. cor() computes the correlation of x and y.
22. apply.monthly() applies a specified function to each distinct period in a given time series object.
23. quantile() produces sample quantiles corresponding to the given probabilities.
24. max() returns the maximum of all the values present in its arguments.

Skills used in packaging and expressing data as graphs and tables are the following -

1. ggplot() (from the ggplot2 package) to generate graphs from data frames.
2. kable() to create a data table.
3. table() returns a contingency table.
4. xtable() to create LaTeX-formatted tables.
5. ggplotly() to convert a ggplot2::ggplot() object to a plotly object.
6. autoplot.zoo() takes a zoo object and converts it into a data frame (intended for ggplot2)
7. ggtitle() to format chart titles.
8. ylim() to set the limits of the y axis.
9. coredata() to extract the core data contained in a (more complex) object and replacing it.
10. acf() to compute (and by default plots) estimates of the autocovariance or autocorrelation function.
11. pacf() is the function used for the partial autocorrelations.


Column 
-------------------------------------------------------------------------

### Portfolio Weights

For the working capital accounts:

No constraints:

```{r echo = FALSE}
library(scales)
money_to_spend <- 250000000
spend_format <- format(money_to_spend, big.mark=",", scientific=FALSE)
weights[ind,]
name <- colnames(weights)
posn <- ifelse((weights[ind,]<0), "go short (sell)", "go long, (buy)")
value <- percent(abs(weights[ind,]))
weights[ind,]*money_to_spend
buy_sell_corn <- ifelse(weights[ind,1] > 0, "Buy", "Short (Sell)")
buy_sell_soybean <- ifelse(weights[ind,2] > 0, "Buy", "Short (Sell)")
buy_sell_wheat <- ifelse(weights[ind,3] > 0, "Buy", "Short (Sell)")
```

* `r spend_format` denominated in dollars
* `r buy_sell_corn` `r dollar(abs(weights[ind,1]*money_to_spend))` in Corn
* `r buy_sell_soybean` `r dollar(abs(weights[ind,2]*money_to_spend))` in Soybean
* `r buy_sell_wheat` `r dollar(abs(weights[ind,3]*money_to_spend))` in Wheat
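
The conversion from portfolio weights to dollar positions used in these bullets can be sketched as a small helper. The function name `allocate` and the weights below are hypothetical and are not the fitted tangency weights; only the 250 million budget comes from the problem statement.

```r
# Hypothetical helper: dollar positions implied by portfolio weights
allocate <- function(weights, budget) {
  dollars <- weights * budget
  data.frame(asset = names(weights),
             action = unname(ifelse(weights > 0, "Buy", "Short (Sell)")),
             amount = unname(abs(dollars)),
             stringsAsFactors = FALSE)
}

toy_weights <- c(Corn = 0.6, Soybean = -0.1, Wheat = 0.5)   # made-up weights summing to 1
alloc <- allocate(toy_weights, budget = 250e6)
net_position <- sum(toy_weights * 250e6)                    # long positions less short positions
```

Because the weights sum to one, the long positions less the short positions always net out to the full budget, even when individual amounts exceed what a long-only allocation would allow.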


With constraints:

Constraint: Minimum Variance Portfolio
```{r echo = FALSE}
library(scales)
weights[ind2,]
name <- colnames(weights)
posn <- ifelse((weights[ind2,]<0), "go short (sell)", "go long, (buy)")
value <- percent(abs(weights[ind2,]))
weights[ind2,]*money_to_spend
buy_sell_corn2 <- ifelse(weights[ind2,1] > 0, "Buy", "Short (Sell)")
buy_sell_soybean2 <- ifelse(weights[ind2,2] > 0, "Buy", "Short (Sell)")
buy_sell_wheat2 <- ifelse(weights[ind2,3] > 0, "Buy", "Short (Sell)")
```

* `r spend_format` denominated in dollars
* `r buy_sell_corn2` `r dollar(abs(weights[ind2,1]*money_to_spend))` in Corn
* `r buy_sell_soybean2` `r dollar(abs(weights[ind2,2]*money_to_spend))` in Soybean
* `r buy_sell_wheat2` `r dollar(abs(weights[ind2,3]*money_to_spend))` in Wheat

Constraint: No Short Portfolio

```{r echo = FALSE}
library(scales)
weights.x[inx,]
name <- colnames(weights.x)
posn <- ifelse((weights.x[inx,]<0), "go short (sell)", "go long, (buy)")
value <- percent(abs(weights.x[inx,]))
weights.x[inx,]*money_to_spend
buy_sell_corn3 <- ifelse(weights.x[inx,1] > 0, "Buy", "Short (Sell)")
buy_sell_soybean3 <- ifelse(weights.x[inx,2] > 0, "Buy", "Short (Sell)")
buy_sell_wheat3 <- ifelse(weights.x[inx,3] > 0, "Buy", "Short (Sell)")
```

* `r spend_format` denominated in dollars
* `r buy_sell_corn3` `r dollar(abs(weights.x[inx,1]*money_to_spend))` in Corn
* `r buy_sell_soybean3` `r dollar(abs(weights.x[inx,2]*money_to_spend))` in Soybean
* `r buy_sell_wheat3` `r dollar(abs(weights.x[inx,3]*money_to_spend))` in Wheat

The weights for the Markowitz tangency [default] portfolio ("*") are

```{r echo = FALSE}
library(scales)
weights[ind,]
name <- colnames(weights)
posn <- ifelse((weights[ind,]<0), "go short (sell)", "go long, (buy)")
value <- percent(abs(weights[ind,]))
```

The weights for the Markowitz minimum variance portfolio ("-") are

```{r echo = FALSE}
library(scales)
weights[ind2,]
name <- colnames(weights)
posn <- ifelse((weights[ind2,]<0), "go short (sell)", "go long, (buy)")
value <- percent(abs(weights[ind2,]))
```

The weights for the Markowitz tangency [no short] portfolio ("*" on the no-short frontier) are

```{r echo = FALSE}
library(scales)
weights.x[inx,]
name <- colnames(weights.x)
posn <- ifelse((weights.x[inx,]<0), "go short (sell)", "go long, (buy)")
value <- percent(abs(weights.x[inx,]))
```


Column 
-------------------------------------------------------------------------

### Business Remarks

- The working capital account of \$250 million should be allocated as follows: `r buy_sell_corn` `r dollar(abs(weights[ind,1]*money_to_spend))` in Corn, `r buy_sell_soybean` `r dollar(abs(weights[ind,2]*money_to_spend))` in Soybean, and `r buy_sell_wheat` `r dollar(abs(weights[ind,3]*money_to_spend))` in Wheat.

- There is a strong correlation (0.65) between Corn and Wheat.  We see moderate correlation (0.58) between Corn and Soybeans.  There is only a weak correlation (0.39) between Soybeans and Wheat.

- Corn is used not only for human consumption but also to feed livestock.  In addition, higher energy prices have led to corn being used for ethanol production.
    
- Wheat is used in animal feed and in the production of flour for breads, pastas and more.  

- Soybeans are the most popular oilseed product with an almost limitless range of uses, from food to industrial products.  Their relatively stable price is testament to this.  

- With grain, as with most tangible commodities, supply and demand will determine the price.  All of these commodities are contracted on a 5,000-bushel basis.  We can use futures contracts on all these commodities to mitigate risk by taking a long or short position in these contracts.  

### Recommendations

Why did you choose those weights (which model)?

- We chose the default tangency portfolio with no constraints.  We feel that putting constraints on our portfolio limits the transfer of assets into optimal portfolio positions, decreasing the value of the added commodity.  
  
How does your decision impact the business?

- We will have to enter the futures contracts market as a leverage tool, allowing us to take long and/or short positions in the market. Because futures contracts are traded at the Chicago Board of Trade (CBOT), they offer more financial leverage than trading the commodities themselves.  Trading futures contracts is done with a performance margin; therefore, it requires considerably less capital than the physical market.  Currently, as a wholesale distributor, we operate on the spot price, which is the current cash price for the physical commodity in the market.  The futures contracts are derivative contracts for delivery at a future date.  The difference between spot and futures prices in the market is referred to as the basis.  The basis is a crucial concept in managing our portfolio, because this relationship between cash and futures prices affects the value of the contracts used in hedging.  The basis can be used to gauge the profitability of delivery of cash or the actual commodity and can be used to search for arbitrage opportunities. 

What are your recommendations?

- We will need to closely monitor current and future market expectations in order to effectively use the futures contracts in our portfolio.  It is also crucial to benchmark our portfolio.  Benchmarking allows us to gauge risk-tolerance and expectations for return.  Even more important, benchmarking provides a basis for comparison of our portfolio performance with the rest of the market.  For commodities, we found that the S&P GSCI Total Return Index is considered a good benchmark.  It holds all futures contracts for commodities such as soybean, wheat, corn, etc.  The S&P GSCI index is considered more representative of the commodity market compared to similar indexes.  We will need to consider rebalancing our portfolio based on these benchmarks and market expectations as we continue to update and analyze the data.  We also should consider investing in a commodity that shows negative correlation to our current commodities to use as a hedging tool.


References
=======================================================================

### REFERENCES

Artzner, P., F. Delbaen, J.-M. Eber, and D. Heath (1999), Coherent measures of risk, Mathematical Finance, 9:203-228.

Bassett, G., R. Koenker, G. Kordas (2004), Pessimistic Portfolio Allocation and Choquet Expected Utility, Journal of Financial Econometrics, 2, 477-492.

Choquet, G. (1953), Theory of Capacities, Annales de l'Institut Fourier 5, pages 131-295.

Cox, D. (1962), Comment on L. J. Savage's Lecture "Subjective Probability and Statistical Practice", in The Foundations of Statistical Inference (ed. by G. Barnard and D. Cox), London: Methuen.

Koenker, R. and Ng, P (2003), Inequality Constrained Quantile Regression, preprint.

Koenker, Roger (2005), Quantile Regression (Econometric Society Monographs), Cambridge University Press.

Koenker, R. (1984), A note on L-estimates for linear models, Stat. and Prob Letters, 2, 323-5.

Markowitz, Harry (1952), Portfolio Selection, The Journal of Finance, Vol. 7, No. 1, pp. 77-91.

McNeil, Alexander J., Rüdiger Frey, and Paul Embrechts (2015), Quantitative Risk Management: Concepts, Techniques and Tools. Revised Edition. Princeton: Princeton University Press.

Polbennikov, S. and B. Melenberg (2005), Mean-Coherent Risk and Mean-Variance Approaches in Portfolio Selection: an Empirical Comparison, Discussion Papers 2005-013, Tilburg University.

Portnoy, S. and Koenker, R. (1997), The Gaussian Hare and the Laplacean Tortoise: Computability of Squared-error vs Absolute Error Estimators, (with discussion). Statistical Science, (1997) 12, 279-300.

Rockafellar, R. T. and S. Uryasev (2000), Optimization of conditional value-at-risk, The Journal of Risk, 2:21-41.

Ruppert, David and David S. Matteson (2014), Statistics and Data Analysis for Financial Engineering with R Examples, Second Edition. New York: Springer. 

Schmeidler, D. (1989), Subjective Probability and Expected Utility Without Additivity, Econometrica, 57, 571-587.

Sharpe, William F. (1966), Mutual Fund Performance, Journal of Business, January 1966, 119-138.

Tasche, D. (2000), Conditional expectation as a quantile derivative, Preprint, TU-Munchen. (Available from arXiv math/0104190.)

Tversky, A and Kahneman, D. (1992), Advances in Prospect Theory: Cumulative representation of uncertainty. Journal of Risk and Uncertainty, 5(4): 297-323.

von Neumann, J. and Morgenstern, O. (1944), Theory of games and economic behaviour, Princeton University Press. 

Zou, Hui and Ming Yuan (2008), Composite quantile regression and the Oracle model selection theory, Annals of Statistics, 36, 1108-1126.

Jacob Bunge, Kirk Maltais and Jesse Newman (2020) Coronavirus Hits Already Frail U.S. Farm Economy (https://www.wsj.com/articles/coronavirus-hits-already-frail-u-s-farm-economy-11584783001)

http://www.greenwichai.com/index.php/hf-essentials/measure-of-risk

https://www.investopedia.com/terms/c/conditional_value_at_risk.asp

https://www.risk.net/risk-magazine/technical-paper/1506669/var-versus-expected-shortfall

https://www.risk.net/definition/value-at-risk-var

https://www.investopedia.com/ask/answers/041615/what-does-value-risk-var-say-about-tail-loss-distribution.asp

https://www.plindia.com/blog/risks-associated-with-agricultural-commodities-trading/

http://www.arborinvestmentplanner.com/portfolio-rebalancing-portfolio-weighting-strategies/

http://news.morningstar.com/classroom2/course.asp?docId=145385&page=6&CN=COM

https://books.google.com/books?id=zun4Xl9tcMMC&pg=PA171&lpg=PA171&dq=risk+and+return+characteristics+of+corn+soybean+wheat&source=bl&ots=JKF1z5o87h&sig=ACfU3U0WWpcY1zO1jkrj28TdjltgJGeXsQ&hl=en&sa=X&ved=2ahUKEwj74pu5zKfoAhXZVs0KHSB4AOkQ6AEwCHoECAgQAQ#v=onepage&q=risk%20and%20return%20characteristics%20of%20corn%20soybean%20wheat&f=false