DESeqDataSetFromMatrix() doesn’t like tibbles? #448

Open
cansavvy opened this issue Mar 26, 2021 · 1 comment

Comments

@cansavvy
Contributor

Three or four participants ran into this problem at the March 2021 workshops.

They had a tibble and tried to turn it into a `DESeqDataSet`. The error returned is something about `row.names()`, but calling `as.data.frame()` on the tibble first solves the problem. I don't know that this requires a change in instruction, but it was a common problem so I figured we should have it written down.
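
For the record, here is a minimal sketch of the workaround on made-up data. The object names (`counts_tbl`, `counts_mat`, `sample_info`) and the counts themselves are assumptions for illustration, not taken from the workshop materials. It converts the tibble with `as.data.frame()`, moves the gene identifiers to row names, and passes a plain matrix to `DESeqDataSetFromMatrix()`.

```r
library(DESeq2)

# Made-up counts stored as a tibble, with gene IDs in a regular column
# (hypothetical data, for illustration only)
counts_tbl <- tibble::tibble(
  gene_id  = c("gene_a", "gene_b", "gene_c"),
  sample_1 = c(10, 0, 25),
  sample_2 = c(12, 3, 30)
)

# Tibbles do not carry row names, so convert to a base data frame first,
# then move gene_id into the row names and convert to a matrix
counts_df  <- as.data.frame(counts_tbl)
counts_mat <- as.matrix(tibble::column_to_rownames(counts_df, "gene_id"))

# Sample information as a base data frame whose row names match
# the matrix column names
sample_info <- data.frame(
  sample_id = colnames(counts_mat),
  row.names = colnames(counts_mat)
)

ddset <- DESeqDataSetFromMatrix(
  countData = counts_mat,
  colData   = sample_info,
  design    = ~ 1 # no experimental design in this toy example
)
```

Going all the way to a matrix, rather than stopping at the data frame, keeps the input in the form the function name asks for.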

@jashapiro
Member

I would lean toward changing to a matrix, since that is what the function explicitly asks for. This is also what we model in the exercise notebook:

```{r convert_round, live = TRUE}
rnaseq_mat <- rnaseq_exp %>%
  # move gene_id to the rownames
  tibble::column_to_rownames("gene_id") %>%
  # convert to a matrix and round
  as.matrix() %>%
  round()
```
### Variance Stabilizing Transformation
Raw counts are not usually suitable for the algorithms we use for clustering and heatmap display, so we will use the `vst()` function from the `DESeq2` package to transform our data.
Since we are starting from a matrix, not a `SummarizedExperiment` as we did previously, we will need to provide the sample information ourselves.
Just to be sure nothing is out of order, we will check that the identifiers for the sample information stored in `histologies_df` match the column names of our matrix.
```{r check-order}
all.equal(histologies_df$Kids_First_Biospecimen_ID,
          colnames(rnaseq_mat))
```
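
One caveat about the check above: `all.equal()` returns `TRUE` on a match but a character description of the differences otherwise, so it never stops a script by itself. A stricter variant, sketched here with base R only and hypothetical sample IDs and object names, uses `identical()` inside `stopifnot()` and reorders the sample information with `match()` if the order differs.

```r
# Hypothetical sample information, deliberately out of order
sample_info <- data.frame(
  Kids_First_Biospecimen_ID = c("BS_C", "BS_A", "BS_B"),
  stringsAsFactors = FALSE
)

# Hypothetical count matrix with samples as columns
counts <- matrix(
  1:6,
  nrow = 2,
  dimnames = list(c("gene_1", "gene_2"), c("BS_A", "BS_B", "BS_C"))
)

# Check whether the IDs already line up with the matrix columns
ids_match <- identical(sample_info$Kids_First_Biospecimen_ID, colnames(counts))

# If not, reorder the sample information to follow the matrix columns
if (!ids_match) {
  new_order   <- match(colnames(counts), sample_info$Kids_First_Biospecimen_ID)
  sample_info <- sample_info[new_order, , drop = FALSE]
}

# Stop with an error if the identifiers still do not match exactly
stopifnot(identical(sample_info$Kids_First_Biospecimen_ID, colnames(counts)))
```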
Now we can make our matrix into a `DESeq2` dataset, adding on the sample information from `histologies_df`.
Unlike when we were performing differential expression analysis, we won't provide an experimental design at this stage.
```{r make-DESEq}
ddset <- DESeqDataSetFromMatrix(rnaseq_mat,
                                colData = histologies_df,
                                design = ~ 1) # don't store an experimental design
```
