no object in values #1

Open
MIZUNOYUSUKE opened this issue Jun 1, 2017 · 3 comments

Comments

@MIZUNOYUSUKE

MIZUNOYUSUKE commented Jun 1, 2017

I was building tables by following @koheiw's tutorial on GitHub (https://github.com/koheiw/IJTA/blob/master/documents/corpus.md) to learn how to work with a corpus.
The other steps completed successfully, but I ran into a problem when I tried to count the number of documents in the subset below. The object named "data_corpus_asahi_2016" does exist.

corp_morning <- corpus_subset(data_corpus_asahi_2016, edition == '朝刊') # select only the morning edition
ndoc(corp_morning)

To troubleshoot, I confirmed that both the machine's default encoding and RStudio's were set to UTF-8, and then ran the commands below.

require(quanteda) # load the package
> txt <- readLines("data/asahi_head.txt")
> setwd('C:\\Users\\mizuno yusuke\\Downloads\\IJTA-master\\IJTA-master')
> load('data/data_corpus_asahi_2016.RData') # load the R object
> table(docvars(corp, 'month'))

My environment is as follows.

R version 3.4.0 (2017-04-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=Japanese_Japan.932  LC_CTYPE=Japanese_Japan.932    LC_MONETARY=Japanese_Japan.932
[4] LC_NUMERIC=C                   LC_TIME=Japanese_Japan.932    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] quanteda_0.9.9-50

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.11        lattice_0.20-35     digest_0.6.12       withr_1.0.2         plyr_1.8.4         
 [6] grid_3.4.0          gtable_0.2.0        scales_0.4.1        RcppParallel_4.3.20 ggplot2_2.2.1      
[11] rlang_0.1.1         stringi_1.1.5       lazyeval_0.2.0      data.table_1.10.4   Matrix_1.2-9       
[16] fastmatch_1.1-0     devtools_1.13.1     tools_3.4.0         munsell_0.4.3       compiler_3.4.0     
[21] colorspace_1.3-2    memoise_1.1.0       tibble_1.3.1   
@koheiw
Owner

koheiw commented Jun 3, 2017

@MIZUNOYUSUKE Thank you for the report. Could you also post the error message?

@MIZUNOYUSUKE
Author

No error message in red was shown, but running the code produced the following output.

> corp_morning <- corpus_subset(data_corpus_asahi_2016, edition == '朝刊') # select only the morning edition
> ndoc(corp_morning)
[1] 0
> table(weekdays(docvars(corp_morning, 'date')))
< table of extent 0 >

@koheiw
Owner

koheiw commented Jun 12, 2017

It is possible that the input on the R console is not in UTF-8. As a test, please type the commands below yourself (do not copy and paste them) and run them.

Encoding('朝刊')
Encoding('あいうえお')
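
As a complementary check, the following is a minimal sketch (not part of the original tutorial) that builds the same string from Unicode escapes, so the result does not depend on how the console encodes typed Japanese text. The variable name `morning` is hypothetical, and the commented-out `corpus_subset()` line assumes the `edition` values in the corpus are stored as UTF-8; whether this resolves the empty subset is untested here.

```r
# Build the two characters of '朝刊' from their Unicode code points
# (U+671D and U+520A) instead of typing them on the console.
morning <- "\u671d\u520a"

# Inspect how R has marked the string and what it contains.
enc <- Encoding(morning)                  # declared encoding of the string
n_chars <- nchar(morning, type = "chars") # number of characters
code_points <- utf8ToInt(morning)         # integer code points

print(enc)
print(n_chars)
print(code_points)

# Assumption: if typed input is the problem, using `morning` in the
# comparison may behave differently from the typed literal.
# corp_morning <- corpus_subset(data_corpus_asahi_2016, edition == morning)
# ndoc(corp_morning)
```

If `Encoding('朝刊')` typed on the console reports something other than "UTF-8" while the escaped version does not, that would be consistent with the console-input explanation above.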
