vvvictoria edited this page Oct 16, 2014 · 20 revisions

Guidelines

  1. Post a link to documentation of your assignment. This could be a version of your work that runs online or a blog post with written / visual description of your work.
  2. Place a copy of your code in our class's shared Dropbox folder. (Look for the invitation in your email.) This is so that we can run or debug it in class if necessary.

Assignment

  • Experiment with text analysis. Here are some ideas if you are feeling stuck.
  • Visualize the results of a concordance using canvas (or some other means).
  • Expand the information the concordance holds so that it keeps track of word positions (i.e. not only how many times each word appears in the source text, but also where it appears each time).
  • Apply some of the ideas specific to spam filtering to the Bayesian classification example.
  • In his book The Secret Life of Pronouns, James W. Pennebaker describes his research into how the frequency of words that have little to no meaning on their own (I, you, they, a, an, the, etc.) is a window into the emotional state or personality of an author or speaker. For example, heavy use of the pronoun “I” is an indicator of “depression, stress or insecurity”. Create a sketch that analyzes the use of pronouns. For more, visit analyzewords.com.
  • Use the ideas to find similarities between people. For example, if you look at all the e-mails on the ITP student list, can you determine who is similar? Consider using properties in addition to word count, such as time of e-mails, length of e-mails, etc.
  • Add a link below.
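
If you want a starting point for the concordance and pronoun ideas above, here is a minimal sketch in Python. It assumes a simple regex tokenizer, and the names `tokenize`, `build_concordance`, `pronoun_rate` and the `PRONOUNS` list are hypothetical, made up for illustration rather than taken from the class examples or from Pennebaker's book. Treat it as one possible approach, not the expected solution.

```python
import re
from collections import defaultdict

# Hypothetical pronoun list for illustration only; extend or replace as needed.
PRONOUNS = {"i", "me", "my", "you", "your", "he", "she", "it", "we", "they"}


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def build_concordance(text):
    """Map each word to the list of token indices where it occurs.

    The number of occurrences is simply the length of each list, so this
    structure holds both the counts and the positions.
    """
    positions = defaultdict(list)
    for index, word in enumerate(tokenize(text)):
        positions[word].append(index)
    return dict(positions)


def pronoun_rate(text):
    """Return the fraction of tokens that appear in PRONOUNS (0.0 for empty text)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(1 for token in tokens if token in PRONOUNS) / len(tokens)


sample = "I think I saw the cat. The cat saw me."
concordance = build_concordance(sample)
print(concordance["cat"])       # token positions of "cat"
print(len(concordance["i"]))    # how many times "i" appears
print(pronoun_rate(sample))     # share of tokens that are pronouns
```

Storing positions as lists keeps the word count available for free and gives you something to plot, for example word position on one axis of a canvas drawing.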

Questions

If you are struggling with something, write about it below and it will help me organize material for class. (Name optional.)

  • your question here

Add Your Homework
