fix and regenerate vignette
The newly strict tibble revealed that I was using `metadata(m)$journal` to
access `metadata(m)$journaltitle`.

Also, hide the progress report from `simplify_state` (invoked by
`write_mallet_model`).
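
For context, the first fix comes down to how `$` resolves column names. The following is a minimal sketch, not code from the vignette: it assumes the `tibble` package is installed, and the column values are placeholders rather than the journals in the corpus.

```r
# Illustrative data only: the values are placeholders, not the real corpus.
df <- data.frame(
  journaltitle = c("Journal A", "Journal B"),
  stringsAsFactors = FALSE
)

# A base data frame lets `$` partially match column names, so the
# shortened name silently returns the journaltitle column.
base_partial <- df$journal

# A tibble requires the exact column name. Depending on the tibble
# version, the shortened name yields NULL with a warning or an error,
# so both cases are folded into NULL here.
if (requireNamespace("tibble", quietly = TRUE)) {
  tb <- tibble::as_tibble(df)
  tibble_partial <- tryCatch(
    suppressWarnings(tb$journal),
    error = function(e) NULL
  )
  tibble_exact <- tb$journaltitle
}
```

Spelling out the full column name, as this commit does, behaves the same way under both classes.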
agoldst committed Jul 23, 2016
1 parent 261fb5d commit 6ef6b07
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions vignettes/introduction.Rmd
@@ -177,7 +177,7 @@ The metadata supplied here as a parameter to `train_model` is not used in modeli

Though this `r n_docs(m)`-corpus needs only minutes to model, it often takes hours or more to produce a topic model of even a moderately-sized corpus. You are likely to want to save the results. It is most convenient, I have found, to save both the richest possible MALLET outputs and user-friendlier transformations: many analyses need only the estimated document-topic and topic-word matrices, for example. For this reason, the default `write_mallet_model` function takes the results of `train_model` and outputs a directory of files.

-```{r message=F}
+```{r message=F, results="hide"}
write_mallet_model(m, "modeling_results")
```

@@ -272,7 +272,7 @@ This is a "long" data frame suitable for plotting, which we turn to shortly. But
To make this more general operation a little easier, I have supplied generalized aggregator functions `sum_row_groups` and `sum_col_groups` which take a matrix and a grouping factor. As a simple example, suppose we wanted to tabulate the way topics are split up between the two journals in our corpus:

```{r}
-journal <- factor(metadata(m)$journal)
+journal <- factor(metadata(m)$journaltitle)
doc_topics(m) %>%
sum_row_groups(journal) %>%
normalize_cols()