Fig5_supervisedExploration.Rmd

---
title: "Figure 2 cell type deconvolution"
output: html_document
author: Sara Gosline
date: "2023-05-05"
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)

library(ggplot2)
library(ggfortify)
library(cowplot)
#library(leapR)
library(dplyr)

source('spleenDataFormatting.R')
library(spammer)
source('spatialProtUtils.R')

```

The assignment of pulp type from the basic data was not robust. Here we will use the cell type signatures from the sorted data to label the voxels.


## Cell type signatures

Here we get the differential expression from the sorted cells.

```{r cell type signatures}


pumap<-scater::runPCA(spat.prot)

phumap<-spat.phos%>%
  scater::runPCA()

fullmap<-scater::runPCA(global.sorted)

fullmap<-spatialDiffEx(fullmap)

full<- fullmap%>%
  rowData(.)%>%
  as.data.frame()%>%
  dplyr::select(featureID='X',
                logFC='pulpAnnotation.limma.logFC',
                adj.P.Val='pulpAnnotation.limma.adj.P.Val',
                AveExpr='pulpAnnotation.limma.AveExpr')

upsig<-full%>%
  subset(adj.P.Val<0.01)%>%
  subset(logFC>1)%>%
  subset(AveExpr>1)


downsig<-full%>%
  subset(adj.P.Val<0.01)%>%
  subset(logFC<(-1))%>%
  subset(AveExpr>1)

pumap<-calcSigScore(pumap,rownames(downsig),'RedPulp')%>%
  calcSigScore(rownames(upsig),'WhitePulp')

##now project same scores to phosphosites

colData(phumap)[['RedPulp']]<-colData(pumap)$RedPulp

colData(phumap)[['WhitePulp']]<-colData(pumap)$WhitePulp

newVals<-colData(pumap)%>%
  as.data.frame()%>%
  mutate(isRed=RedPulp>0.5,isWhite=WhitePulp>0.5)%>%
  mutate(pulp=ifelse(isRed,'red',ifelse(isWhite,'white','None')))

colData(pumap)[['pulp']]<-newVals$pulp
colData(phumap)[['pulp']]<-newVals$pulp


```
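
The filtering and labelling logic above can be illustrated on a toy table. This is a minimal sketch in base R, assuming a limma-style table with `logFC`, `adj.P.Val` and `AveExpr` columns. The feature names, values, and the `toy*` objects are made up for illustration, and `calcSigScore` is not reproduced here.

```r
# Toy limma-style results (hypothetical values, not from the spleen data)
toyDiffEx <- data.frame(
  featureID = c('P1', 'P2', 'P3', 'P4'),
  logFC     = c(2.1, -1.8, 0.4, 1.5),
  adj.P.Val = c(0.001, 0.002, 0.5, 0.2),
  AveExpr   = c(3, 2, 4, 0.5)
)

# Same cutoffs as the chunk above: adj.P.Val < 0.01, |logFC| > 1, AveExpr > 1
toyUp   <- subset(toyDiffEx, adj.P.Val < 0.01 & logFC > 1 & AveExpr > 1)
toyDown <- subset(toyDiffEx, adj.P.Val < 0.01 & logFC < -1 & AveExpr > 1)

# Hypothetical per-voxel signature scores
toyScores <- data.frame(
  Voxel     = c('v1', 'v2', 'v3', 'v4'),
  RedPulp   = c(0.9, 0.1, 0.7, 0.2),
  WhitePulp = c(0.2, 0.8, 0.6, 0.3)
)

# Nested ifelse: red is tested first, so a voxel scoring above 0.5 on
# both signatures (v3 here) is labelled 'red', not 'white'
toyScores$pulp <- ifelse(toyScores$RedPulp > 0.5, 'red',
                         ifelse(toyScores$WhitePulp > 0.5, 'white', 'None'))
```

Note that the order of the tests matters: with this labelling, ambiguous voxels are always assigned to red pulp.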

## Plot deconvolution scores

How do the deconvolution scores align with the pulp signature scores?

```{r plot deconv}

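# keep voxel names as row names so pheatmap can match annotations to columns
annotes<-colData(pumap)|>
  as.data.frame()|>
  dplyr::select(RedPulp,WhitePulp)

# NOTE: `deconv` is not defined in this file; it is assumed to be created by one of the sourced scripts
pheatmap::pheatmap(deconv,annotation_col = annotes)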

```
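
`pheatmap` matches the rows of `annotation_col` to the heatmap columns by name, so the row names of the annotation table need to agree with the column names of the score matrix. A minimal base R sketch of that check follows, where `toyMat` is a hypothetical stand-in for `deconv` and all values are invented:

```r
# Hypothetical score matrix: rows are signatures, columns are voxels
toyMat <- matrix(c(0.1, 0.9, 0.5, 0.4, 0.8, 0.2), nrow = 2,
                 dimnames = list(c('sigA', 'sigB'), c('v1', 'v2', 'v3')))

# Hypothetical annotation table, deliberately in a different voxel order
toyAnnotes <- data.frame(RedPulp   = c(0.7, 0.9, 0.1),
                         WhitePulp = c(0.6, 0.2, 0.8),
                         row.names = c('v3', 'v1', 'v2'))

# Voxels in the matrix that have no annotation row
missingVoxels <- setdiff(colnames(toyMat), rownames(toyAnnotes))

# Reorder the annotation rows to follow the matrix columns
toyAnnotes <- toyAnnotes[colnames(toyMat), , drop = FALSE]
```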

## PTPN6 activity
What is this protein doing with phosphosites? PTPN6 is a tyrosine phosphatase, so it _removes_ phosphate groups from phosphorylated tyrosine residues. We can therefore look for down-regulated pY sites in the data. However, only 2 pY sites exist:

```{r pY sites, echo=FALSE}
library(ggplot2)

pySites<- rownames(rowData(phumap)[grep('-Y',rowData(phumap)$Phosphosite),])

res<-expToLongForm(phumap,'Phosphosite')|>
  left_join(tibble::rownames_to_column(as.data.frame(colData(phumap)),'Voxel'))

ps<-res|>
  subset(Phosphosite%in%pySites)|>
  ggplot(aes(x=Phosphosite,y=LogRatio,fill=pulp))+geom_boxplot()

ps

ggsave('phosphoTyrosine.pdf',ps)
```
We can't find any PTPN6 sites in the data.
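
The pY selection above relies on pattern matching against the site names. A minimal base R sketch of that idea follows, assuming site names of the form `GENE-<residue><position>`. The `toySites` vector is illustrative and is not taken from the dataset.

```r
# Illustrative site names (assumed format: GENE-<residue><position>)
toySites <- c('CD44-S697', 'LCK-Y394', 'STAT3-Y705', 'MAPK1-T185', 'SYK-S297')

# Requiring digits after the residue letter is stricter than a bare '-Y'
isTyr <- grepl('-Y[0-9]+', toySites)
toyPY <- toySites[isTyr]

# Extract the residue letter of each site to tabulate S/T/Y counts
toyResidues <- sub('^.*-([STY])[0-9]+.*$', '\\1', toySites)
toyCounts <- table(toyResidues)
```

Tabulating residue types this way gives a quick check of how few tyrosine sites are present relative to serine and threonine sites.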

## CD immune proteins

Now we can look at individual proteins as well. Those from the sorted data include:

- White pulp: CD44, CD1C, CD19, CD48, CD22, CD37, CD20 (MS4A1), CD74
- Red pulp: CD163, CD81, CD47, CD9, CD36, CD63, CD38, CD8A, CD4

```{r calculate CD scores}
wp<-c('CD44', 'CD1C', 'CD19', 'CD48', 'CD22', 'CD37', 'MS4A1', 'CD74')
rp<-c('CD163','CD81', 'CD47', 'CD9', 'CD36', 'CD63', 'CD38', 'CD8A','CD4')
op<-c('CD36','CD59')


resProt<-expToLongForm(pumap,'Protein')|>
  left_join(tibble::rownames_to_column(as.data.frame(colData(pumap)),'Voxel'))


resPhos<-expToLongForm(phumap,'Protein')|>
  left_join(tibble::rownames_to_column(as.data.frame(colData(phumap)),'Voxel'))

allmarkers<-resProt|>
  subset(Protein%in%c(wp,rp,op))|>
  ggplot(aes(x=Protein,y=LogRatio,fill=pulp))+geom_boxplot()+ggtitle('Immune markers')

ggsave('allImmMarkers.pdf',allmarkers)

##CD44 is also phosphorylated
cdp<-plotFeatureGrid(phumap,'CD44_S697','CD44_S697')

cd<-plotFeatureGrid(pumap,'CD44','CD44')

dd<-plotFeatureGrid(adj.phumap,'CD44_S697','CD44_S697')

p<-cowplot::plot_grid(cdp,cd,dd,ncol=1)

p

ggsave('cd44.pdf',p,width=12,height=20)
```

Interestingly, there is evidence of CD44 activating PD1 signaling, although the evidence here is for PDL1:
https://aacrjournals.org/cancerres/article/80/3/444/646081/CD44-Promotes-PD-L1-Expression-and-Its-Tumor
  
PD1 expression keeps CD44 active, regulation in check
https://www.sciencedirect.com/science/article/pii/S2211124720308081
  
Can we find evidence of this link in co-expression datasets?

## CD44 activation

```{r cd44 activation}
library(leapR)
data('kinasesubstrates')##our latest db

kins<-names(which(apply(kinasesubstrates$matrix,1,function(x) "CD44-S697"%in%x)))
resProt|>
  subset(Protein%in%kins)|>
  ggplot(aes(x=Protein,y=LogRatio,fill=pulp))+geom_boxplot()+ggtitle('CD44 Kinases')

psubs<-kinasesubstrates$matrix['PRKACA',]|>
  stringr::str_replace('-','_')

p_sub<-res|>
  subset(Phosphosite%in%psubs)|>
  ggplot(aes(x=Phosphosite,y=LogRatio,fill=pulp))+geom_boxplot()+ggtitle('PRKACA Substrates')+coord_flip()
p_sub
ggsave('prkaca_sub.pdf',p_sub,height=10)
```
Only one of the 5 kinases that we know of (PRKACA, CAMK2A, ROCK2, TGFBR2) is expressed, and it is higher in red pulp.
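
The kinase lookup used above can be sketched on a toy matrix. This assumes the layout implied by the chunk, with one row per kinase and the cells of each row holding that kinase's substrate sites, padded with empty strings. The kinase and substrate names in `toyKS` are hypothetical and do not come from the `leapR` database.

```r
# Toy kinase-substrate matrix: one row per kinase, substrates across columns
toyKS <- matrix(c('CD44-S697', 'SUB1-S10',  '',
                  'SUB2-T5',   'SUB3-S8',   'SUB4-Y1',
                  'SUB5-S2',   'CD44-S697', ''),
                nrow = 3, byrow = TRUE,
                dimnames = list(c('KIN_A', 'KIN_B', 'KIN_C'), NULL))

# Kinases whose row contains the site of interest
toyKins <- names(which(apply(toyKS, 1, function(x) 'CD44-S697' %in% x)))

# Substrates of one kinase, converted from 'GENE-SITE' to 'GENE_SITE'
# and with the empty padding cells dropped
toySubs <- sub('-', '_', toyKS['KIN_A', ], fixed = TRUE)
toySubs <- toySubs[toySubs != '']
```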