Mystery Text Discussion: dc.txt #108

Open
ebeshero opened this issue Oct 25, 2024 · 5 comments

Comments

@ebeshero
Contributor

Post your screenshots and discuss your findings about dc.txt here!

@EvLar64
Contributor

EvLar64 commented Oct 28, 2024

After putting the document into AntConc/Voyant, one of the first things I did was run it for Ngrams of length 3. It didn't take me too long to figure out the source after that, because pretty close to the top of the Ngram list was the phrase "dr van helsing," making me figure it's something with Dracula and probably the original novel.

I also checked Ngrams of 5 and found a lot of mentions of diaries and journals, and without having read the book myself, it makes me think that it might play out largely (or at least partially) through the perspective of fictional characters' writings. After that, I looked at the word collage. Probably because I figured it was Dracula and a horror story, I was expecting to see a lot of ominous and dreary words, like "blood" or "fear," close to the top. In reality, its biggest words are pretty mundane ("time," "said," "know"). Again, having not read the book, I feel like this backs my conclusion that it's mostly diary entries, written by people who are clinically trying to recall life events rather than tell a story to an audience.
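For anyone who wants to check the "mundane top words" observation outside Voyant, here is a minimal Python sketch of a word-frequency count with stopword filtering. It is not Voyant's actual implementation: the `top_words` helper, the tiny `STOPWORDS` set, and the sample sentence are all assumptions made for illustration, and Voyant's real stopword list is much longer.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "and", "of", "to", "a", "i", "in", "that", "was", "it", "he", "his"}

def top_words(text, k=5, stopwords=STOPWORDS):
    """Return the k most frequent lowercase words that are not stopwords."""
    # Lowercase, then keep runs of letters only (punctuation and digits are dropped).
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in stopwords)
    return counts.most_common(k)

# Made-up sample text; to try it on the real file, read dc.txt into a string instead.
sample = "I said that I know the time. He said it was time to go, and I know it."
print(top_words(sample, k=3))
```

Even with function words removed, everyday verbs like "said" and "know" tend to stay on top in narrative prose, which is why the n-gram lists are usually more revealing than the single-word list.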

dc screenshot 1 dc screenshot 2

@meganlitz28

I loaded this document into Voyant to see if I could figure out where the text was from. However, the two main words were "said" and "know," so I switched over to AntConc to learn more about the text. In AntConc, setting the N-Gram size to 4 gave me the phrase "dr seward's diary." When I started reading the text, I didn't know what it was until I got to the part where they mentioned Dracula. I was 50% sure it was Dracula, so I looked up "dr seward's diary" and confirmed I was correct. I've not read Dracula before, which is why I was not sure about my answer.

Screenshot 2024-10-29 at 9 08 17 AM Screenshot 2024-10-29 at 10 28 01 AM

@NathanH1611

When I first put this mystery text into Voyant Tools, nothing immediately jumped out at me, as the most used words were "said," "shall," "know," "time," and "come." These words are not exactly unique, so I had to load up AntConc to check the n-grams. I started with N-grams of 3, which is where I saw word combinations such as "Dr. Van Helsing," "Dr. Seward," and "the Count." From this, I upped the N-grams to 4, at which point "Dr. Seward's Diary" was revealed to be the most frequent, at 39 occurrences. Then, with a quick Google search as to what Dr. Seward's Diary was, I determined that the mystery text must be Dracula.
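The step from 3-grams to 4-grams described above can be sketched in Python as follows. This is only an approximation of what AntConc reports, not its actual code: the `ngram_counts` helper and the sample text are assumptions for illustration. The tokenizer lowercases and keeps letter runs only, which is why "Seward's" turns into the two tokens "seward" and "s", matching the "dr seward s diary" form seen in the AntConc output.

```python
import re
from collections import Counter

def ngram_counts(text, n):
    """Count every run of n consecutive lowercase word tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    # zip over n staggered copies of the token list to form sliding windows.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(g) for g in grams)

# Made-up sample text; replace with the contents of dc.txt to try it on the real file.
sample = ("Dr. Seward's Diary. I saw Dr. Van Helsing today. "
          "Dr. Seward's Diary. Dr. Van Helsing spoke of the Count.")

for n in (3, 4):
    print(n, ngram_counts(sample, n).most_common(2))
```

Repeated chapter headings are the kind of phrase that rises to the top at n = 4, because they recur word for word while ordinary sentences rarely do.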

Screenshot 2024-10-29 234342
Screenshot 2024-10-29 234154
Screenshot 2024-10-29 234937

@afish2003

I used AntConc for all my work on this assignment, as I was having issues getting Voyant to work. My chosen document was dc.txt.

My discoveries are as follows:

First, I checked how my results changed with varying Ngram sizes. I found that for a count to be above 5, the Ngram size had to be 5 or lower. I also found that for double-digit counts, the size has to be 4 or lower.
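The relationship between Ngram size and top count can be checked with a short Python sketch. The highest count can never go up as the size grows, because every occurrence of a longer phrase also contains an occurrence of its shorter prefix. The `max_count_by_size` helper and the sample text below are assumptions for illustration, and the tokenization may differ from AntConc's, so exact numbers on dc.txt could vary.

```python
import re
from collections import Counter

def max_count_by_size(text, sizes):
    """Map each n-gram size to the count of its most frequent n-gram."""
    tokens = re.findall(r"[a-z]+", text.lower())
    result = {}
    for n in sizes:
        counts = Counter(
            tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
        # default=0 covers the case where the text is shorter than n tokens.
        result[n] = max(counts.values(), default=0)
    return result

# Made-up sample text; replace with the contents of dc.txt to try it on the real file.
sample = "I did not know what to do. I did not know what to say. I did not go."
print(max_count_by_size(sample, range(1, 8)))
```

Plotting or scanning this table makes it easy to see at which size the counts drop below a threshold such as 5 or 10.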

Some of the most repeated phrases were "I did not know what to" (7 times) and "dr seward s diary" (39 times).

Screenshot 2024-10-30 at 12 34 19 PM Screenshot 2024-10-30 at 12 33 54 PM

For the segment on choosing clusters to explore, I could not access the tutorial, as the link does not work for me. I tried my best to figure out what I was supposed to do. I searched in the cluster section of AntConc for "Dr. Seward's Diary," and it displayed all the contexts in which that string appeared.
Screenshot 2024-10-30 at 12 35 53 PM
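A search that shows every context in which a phrase appears is often called a keyword-in-context (KWIC) view. Here is a minimal Python sketch of that idea, under stated assumptions: the `kwic` helper, its `width` parameter, and the sample text are hypothetical, and this is not how AntConc itself is implemented.

```python
import re

def kwic(text, phrase, width=3):
    """Return (left context, match, right context) for each hit of phrase."""
    tokens = re.findall(r"[a-z]+", text.lower())
    target = re.findall(r"[a-z]+", phrase.lower())
    n = len(target)
    hits = []
    if n == 0:
        return hits
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == target:
            # Take up to `width` tokens on each side of the match.
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + n:i + n + width])
            hits.append((left, " ".join(target), right))
    return hits

# Made-up sample text; replace with the contents of dc.txt to try it on the real file.
sample = ("Letter ends here. Dr. Seward's Diary. 25 May. Later that night. "
          "Dr. Seward's Diary, continued.")
for left, match, right in kwic(sample, "Dr. Seward's Diary"):
    print(f"{left:>20} | {match} | {right}")
```

Reading the left and right columns together shows whether a frequent phrase is a heading, a line of dialogue, or part of the narration.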

My biggest questions are about how to go about using the software, as I'm struggling to remember from class and cannot access the demo. Hopefully, once I get access to the demo, my understanding of this topic will clear up.

@ashlynnallgeier

When I uploaded the text into Voyant, it came up that the most frequent words were "said" and "shall". When I put it into AntConc, I first tested it with n-grams of 2, which brought up things like "of the", "in the", and "to the". I then moved on to n-grams of 3, which were still pretty generic. So, when I put in n-grams of 4, the most frequent phrase was "dr seward s diary". Along with that, another frequent cluster of words was "jonathan harker s journal". This was enough context for me to Google these phrases and find out that the text is Dracula.

Screenshot 2024-10-30 at 12 35 02 PM Screenshot 2024-10-30 at 12 23 23 PM Screenshot 2024-10-30 at 12 22 03 PM


6 participants