Code update to Python 3 #335

Open: wants to merge 35 commits into base: master
3cc2ad0
Adding first readme
rohankhairnar Feb 3, 2018
4c8cdff
Adding readme
Feb 3, 2018
357f703
Adding readme
Feb 3, 2018
2a7be5d
Adding readme
Feb 3, 2018
e5e4f4c
Python 3
rohankhairnar Feb 5, 2018
6b27de5
Python 3
rohankhairnar Feb 5, 2018
0269553
Python 3
rohankhairnar Feb 5, 2018
673a988
Python 3 work in progress
rohankhairnar Feb 5, 2018
0f35dcd
Python 3 work in progress
rohankhairnar Feb 5, 2018
608aaca
NLTK Iterator
rohankhairnar Apr 29, 2018
a09efac
Dropped clean_html()
rohankhairnar Apr 29, 2018
66029eb
Python 3
rohankhairnar Apr 29, 2018
395d7d3
Centralities calculations
rohankhairnar Apr 29, 2018
00956c8
Python 3
rohankhairnar Apr 29, 2018
2867a7d
Python 3
rohankhairnar Apr 29, 2018
daa68c2
Check hRecipe block
rohankhairnar Apr 29, 2018
dd78762
Python 3
rohankhairnar Apr 29, 2018
8316b70
Python 3
rohankhairnar Apr 29, 2018
e0739bd
Python 3 WIP
rohankhairnar Apr 29, 2018
3beed09
Python 3 WIP - Histogram plots
rohankhairnar Apr 29, 2018
fbc4323
Python 3 WIP - Trial run required
rohankhairnar Apr 29, 2018
5b6be32
Python 3 WIP
rohankhairnar Apr 30, 2018
e1520b9
Python 3 - Trial Run required
rohankhairnar Apr 30, 2018
63a165a
Python 3 - Trial run required
rohankhairnar Apr 30, 2018
dd9addc
sys.stderr updated as per Python 3
rohankhairnar Apr 30, 2018
18ce4b0
Python 3
rohankhairnar Apr 30, 2018
6185d9d
Python 3 - Cleaned O/Ps
rohankhairnar Apr 30, 2018
af0cc0d
Python 3 - Cleaned Outputs
rohankhairnar Apr 30, 2018
42dc942
Python 3 - Histogram WIP
rohankhairnar Apr 30, 2018
0c63a6e
Cleaned
rohankhairnar Apr 30, 2018
724203a
Updated to Python 3
rohankhairnar Apr 30, 2018
fa6c7f9
Updated to Python 3
rohankhairnar Apr 30, 2018
4ce6f6d
LinkedIn mining discontinued
rohankhairnar Apr 30, 2018
4618e3f
Updated to Python 3 - Cleaned
rohankhairnar May 1, 2018
b8bc85f
Code updated to Python 3
rohankhairnar May 2, 2018
File renamed without changes.
File renamed without changes.
@@ -0,0 +1,94 @@
Mining the Social Web (2nd Edition)
=================================

## Summary

_Mining the Social Web, 2nd Edition_ is available through O'Reilly Media, Amazon, and other fine book retailers. [Purchasing the ebook directly from O'Reilly](http://bit.ly/135dHfs) offers a number of great benefits, including a variety of digital formats and continual updates to the text of the book for life! Better yet, if you choose to use O'Reilly's Dropbox or Google Drive synchronization, your ebooks will automatically update every time there's an update. In other words, you'll always have the latest version of the book if you purchase the ebook through O'Reilly, which is why it's the recommended option in comparison to a paper copy or other electronic version. (If you prefer a [paperback or Kindle version from Amazon](http://amzn.to/GPd59m), that's a fine option as well.)

There's an incredible turn-key virtual machine experience for this second edition of the book that provides you with a powerful social web mining toolbox. This toolbox provides the ability to explore and run all of the source code in a hassle-free manner. All that you have to do is [follow a few simple steps](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/_Appendix A - Virtual Machine Experience.html) to get the virtual machine installed, and you'll be running the example code in as little as 20-30 minutes. (And by the way, most of that time is waiting for files to download.)

This [short screencast](https://vimeo.com/72383764) demonstrates the steps involved in installing the virtual machine, which installs every single dependency for you automatically and saves you a lot of time. Even sophisticated power users tend to prefer it over their own environments.

If you experience any problems at all with installation of the virtual machine, file an issue here on GitHub. Be sure to also follow [@SocialWebMining](http://twitter.com/socialwebmining) on Twitter and like http://facebook.com/MiningTheSocialWeb on Facebook.

Be sure to also visit http://MiningTheSocialWeb.com for additional content, news, and updates about the book and code in this GitHub repository.

## Preview the Full-Text of Chapter 1 (Mining Twitter)

Chapter 1 of the book provides a gentle introduction to hacking on Twitter data. It's available in a variety of convenient formats:

* A free [PDF download](http://bit.ly/135dHfs)
* An [online ebook](http://bit.ly/1an184a) excerpt
* An [IPython Notebook (ipynb) file](http://bit.ly/1aIXjFf) (checked into this repository)

Choose one, or choose them all. There's no better way to get started than following along with the opening chapter.

## Preview the IPython Notebooks

This edition of _Mining the Social Web_ extensively uses [IPython Notebook](http://ipython.org/notebook.html) to facilitate the learning and development process. If you're interested in what the example code for any particular chapter does, the best way to preview it is with the links below. When you're ready to develop, pull the source for this GitHub repository and follow the instructions for installing the virtual machine to get started.

A bit.ly bundle of all of these links is also available: http://bit.ly/mtsw2e-ipynb

* [Chapter 0 - Preface](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/Chapter 0 - Preface.html)
* [Chapter 1 - Mining Twitter: Exploring Trending Topics, Discovering What People Are Talking About, and More](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/Chapter 1 - Mining Twitter.html)
* [Chapter 2 - Mining Facebook: Analyzing Fan Pages, Examining Friendships, and More](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/Chapter 2 - Mining Facebook.html)
* [Chapter 3 - Mining LinkedIn: Faceting Job Titles, Clustering Colleagues, and More](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/Chapter 3 - Mining LinkedIn.html)
* [Chapter 4 - Mining Google+: Computing Document Similarity, Extracting Collocations, and More](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/Chapter 4 - Mining Google+.html)
* [Chapter 5 - Mining Web Pages: Using Natural Language Processing to Understand Human Language, Summarize Blog Posts and More](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/Chapter 5 - Mining Web Pages.html)
* [Chapter 6 - Mining Mailboxes: Analyzing Who's Talking To Whom About What, How Often, and More](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/Chapter 6 - Mining Mailboxes.html)
* [Chapter 7 - Mining GitHub: Inspecting Software Collaboration Habits, Building Interest Graphs, and More](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/Chapter 7 - Mining GitHub.html)
* [Chapter 8 - Mining the Semantically Marked-Up Web: Extracting Microformats, Inferencing Over RDF, and More](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/Chapter 8 - Mining the Semantically Marked-Up Web.html)
* [Chapter 9 - Twitter Cookbook](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/Chapter 9 - Twitter Cookbook.html)
* [Appendix A - Virtual Machine Experience](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/_Appendix A - Virtual Machine Experience.html)
* [Appendix B - OAuth Primer](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/_Appendix B - OAuth Primer.html)
* [Appendix C - Python & IPython Notebook Tips](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/_Appendix C - Python & IPython Notebook Tips.html)

## Blog & Screencasts

Be sure to bookmark the [Mining the Social Web Vimeo Channel](https://vimeo.com/channels/MiningTheSocialWeb) to stay up to date with short instructional videos that demonstrate how to use the tools in this repository. More screencasts are being added all the time, so check back often -- or better yet, subscribe to the channel.

<p align="center">
<a href="https://vimeo.com/channels/MiningTheSocialWeb" target="_blank"><img src="https://github.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/raw/master/images/screencast-installing-vm.png"
alt="Installing the Virtual Machine" width="50%" height="50%" border="10" /></a><br />
<em>A ~3 minute screencast on installing a powerful toolbox for social web mining.<br />
View a collection of all available screencasts at http://bit.ly/mtsw2e-screencasts</em>
</p>

You might also benefit from the content that is being regularly added to the companion blog at http://MiningTheSocialWeb.com

## The _Mining the Social Web_ Virtual Machine

_You may enjoy [this short screencast](https://vimeo.com/72383764) that demonstrates the step-by-step instructions involved in installing the book's virtual machine._

The code for _Mining the Social Web_ is organized by chapter in an [IPython Notebook](http://ipython.org/notebook.html) format so that you can follow along with the examples interactively. Unfortunately, some of the Python dependencies for the example code can be a little bit tricky to install and configure, so a completely turn-key virtual machine is provided to make your reading experience as simple and enjoyable as possible. Even if you are a seasoned developer, you may still find some value in using this virtual machine to get started and save yourself some time. The virtual machine is powered by [Vagrant](http://vagrantup.com/), an amazing development tool that you'll probably want to know about and that arguably makes working with virtualization even easier than a native [VirtualBox](http://www.virtualbox.org/) or VMware image.

## Quick Start Guide

The recommended way of getting started with the example code is by taking advantage of the Vagrant-powered virtual machine as illustrated in [this short screencast](https://www.youtube.com/watch?v=BTyKPMfi_JQ). After all, you're more interested in following along and learning from the examples than installing and managing all of the system dependencies just to get to that point, right?

[Appendix A - Virtual Machine Experience](https://rawgithub.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/master/ipynb/html/_Appendix A - Virtual Machine Experience.html) provides clear step-by-step instructions for installing the virtual machine and is intended to serve as a quick start guide.


## The _Mining the Social Web_ Wiki

This project takes advantage of its GitHub repository's [wiki](https://github.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/wiki) to act as a point of collaboration for consumers of the source code. Feel free to use the wiki however you'd like to share your experiences, and create additional pages as needed to curate additional information.

One of the more important wiki pages that you may want to bookmark is the [Advisories](https://github.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/wiki/Advisories) page, which is an archive of notes about particularly disruptive commits or other changes that may affect you.

Another page of interest is a listing of all [100+ numbered examples](https://github.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition/wiki/Numbered-Examples) from the book that conveniently hyperlink to read-only versions of the IPython Notebooks.

## "Premium Support"

The source code in this repository is free for your use however you'd like. If, however, you'd like to complete a more rigorous study of social web mining, much like you would experience by following along with a textbook in a classroom, you should consider picking up a copy of [Mining the Social Web](http://bit.ly/135dHfs) and following along. _Think of the book as offering a form of "premium support" for this open source project._

The publisher's description of the book follows for your convenience:

How can you tap into the wealth of social web data to discover who’s making connections with whom, what they’re talking about, and where they’re located? With this expanded and thoroughly revised edition, you’ll learn how to acquire, analyze, and summarize data from all corners of the social web including Facebook, Twitter, LinkedIn, Google+, GitHub, email, websites, and blogs.

* Employ IPython Notebook, the Natural Language Toolkit, NetworkX, and other scientific computing tools to mine popular social web sites
* Apply advanced text-mining techniques, such as clustering and TF-IDF, to extract meaning from human language data
* Bootstrap interest graphs from GitHub by discovering affinities among people, programming languages, and coding projects
* Build interactive visualizations with D3.js, a state-of-the-art HTML5 and JavaScript toolkit
* Take advantage of more than two-dozen Twitter recipes presented in O’Reilly’s popular and well-known cookbook format

The example code for this data science book is maintained in a public GitHub repository and is designed to be especially accessible through a turn-key virtual machine that facilitates interactive learning with an easy-to-use collection of IPython Notebooks.
File renamed without changes.