Richard Wen
rrwen.dev@gmail.com
Command line tool for extracting Twitter data to MongoDB databases
- Install Node.js
- Install twitter2mongodb-cli via npm
npm install -g twitter2mongodb-cli
For the latest developer version, see Developer Install.
Get help:
twitter2mongodb --help
Open documentation in web browser:
twitter2mongodb doc twitter2mongodb
twitter2mongodb doc twitter
twitter2mongodb doc mongodb
See twitter2mongodb for programmatic usage.
An environment file .env is used to store Twitter API credentials and MongoDB details.
Step 1. Set the default config for the .env file:
- Every twitter2mongodb command will now use the designated .env file
twitter2mongodb config set env path/to/.env
Step 2. Set Twitter API credentials
twitter2mongodb env set TWITTER_CONSUMER_KEY ***
twitter2mongodb env set TWITTER_CONSUMER_SECRET ***
twitter2mongodb env set TWITTER_ACCESS_TOKEN_KEY ***
twitter2mongodb env set TWITTER_ACCESS_TOKEN_SECRET ***
Step 3. Set MongoDB connection
twitter2mongodb env set MONGODB_CONNECTION mongodb://localhost:27017
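After the three steps above, the .env file should hold five keys. As a rough sanity check, the sketch below parses simple KEY=VALUE text and reports which of those keys are missing. The helper names (`parseEnv`, `missingKeys`) and the sample contents are hypothetical, and the parser is deliberately simpler than a real env-file parser (no quoting, no multi-line values).

```javascript
// Keys taken from the setup steps above.
const REQUIRED_KEYS = [
  "TWITTER_CONSUMER_KEY",
  "TWITTER_CONSUMER_SECRET",
  "TWITTER_ACCESS_TOKEN_KEY",
  "TWITTER_ACCESS_TOKEN_SECRET",
  "MONGODB_CONNECTION",
];

// Parse simple KEY=VALUE lines; blank lines and lines starting with # are skipped.
// Assumption: no quoting or multi-line values, which real env parsers may support.
function parseEnv(text) {
  const vars = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue;
    vars[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return vars;
}

// Return the required keys that are absent or empty.
function missingKeys(vars) {
  return REQUIRED_KEYS.filter((key) => !vars[key]);
}

// Hypothetical file contents with placeholder values (one key left out on purpose).
const sample = [
  "TWITTER_CONSUMER_KEY=***",
  "TWITTER_CONSUMER_SECRET=***",
  "TWITTER_ACCESS_TOKEN_KEY=***",
  "MONGODB_CONNECTION=mongodb://localhost:27017",
].join("\n");

console.log(missingKeys(parseEnv(sample)));
```

To check a real file, the text could be read from disk and passed to `parseEnv` in place of `sample`.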
The REST API obtains Twitter data in batches using search queries.
Step 1. Set up default Twitter options:
- Set Twitter REST method (one of get, post, delete or stream)
- Set Twitter path
- Set Twitter parameters for path
twitter2mongodb config set twitter.method get
twitter2mongodb config set twitter.path search/tweets
twitter2mongodb config set twitter.params "{\"q\":\"twitter\"}"
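Hand-escaping the JSON for twitter.params is easy to get wrong. One possible approach, sketched below, is to build the JSON from a plain object and then wrap it in double quotes for the shell. The helper names are hypothetical, and the quoting assumes the values contain no `$` or backticks, which a POSIX shell would still expand inside double quotes.

```javascript
// Build the JSON text for twitter.params from a plain object.
function toParamsJson(params) {
  return JSON.stringify(params);
}

// Wrap a JSON string in double quotes for a shell, escaping backslashes
// and inner double quotes. Assumption: no $ or backticks in the values.
function quoteForShell(json) {
  return '"' + json.replace(/\\/g, "\\\\").replace(/"/g, '\\"') + '"';
}

const json = toParamsJson({ q: "twitter" });
console.log(json);
console.log("twitter2mongodb config set twitter.params " + quoteForShell(json));
```

For the object `{ q: "twitter" }` the second line printed is the same command as the one shown above.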
Step 2. Set up default MongoDB options:
- Set database to store extracted Twitter data
- Set collection to store extracted Twitter data
- Set MongoDB query method for extracted Twitter data
- Set jsonata filter before inserting
twitter2mongodb config set mongodb.database twitter2mongodb_database
twitter2mongodb config set mongodb.collection twitter_data
twitter2mongodb config set mongodb.method insertMany
twitter2mongodb config set jsonata statuses
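The jsonata filter `statuses` selects the top-level `statuses` field of the search/tweets response, so that the array of tweets, rather than the whole response object, is what gets inserted. The sketch below shows the equivalent selection in plain JavaScript on a trimmed, hypothetical response; it does not use the jsonata library itself, and real responses carry many more fields.

```javascript
// Trimmed, hypothetical search/tweets response.
const response = {
  statuses: [
    { id_str: "1", text: "first tweet" },
    { id_str: "2", text: "second tweet" },
  ],
  search_metadata: { count: 2 },
};

// Plain-JavaScript equivalent of the jsonata expression `statuses`:
// select the top-level "statuses" field.
function selectStatuses(res) {
  return res.statuses;
}

// An array of tweet objects, a shape that suits insertMany.
const documents = selectStatuses(response);
console.log(documents.length);
```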
Step 3. Extract Twitter data into MongoDB collection given setup options:
twitter2mongodb > log.csv
The Stream API obtains Twitter data in real-time using tracking filters.
Step 1. Set up default Twitter options:
- Set Twitter stream method
- Set Twitter path
- Set Twitter stream parameters
twitter2mongodb config set twitter.method stream
twitter2mongodb config set twitter.path statuses/filter
twitter2mongodb config set twitter.params "{\"track\":\"twitter\"}"
Step 2. Set up default MongoDB options:
- Set database to store streamed Twitter data
- Set collection to store streamed Twitter data
- Set MongoDB query method for streamed Twitter data
twitter2mongodb config set mongodb.database twitter2mongodb_database
twitter2mongodb config set mongodb.collection twitter_data
twitter2mongodb config set mongodb.method insertOne
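The REST setup uses insertMany while the stream setup uses insertOne. A plausible reading, offered here as an assumption rather than a description of the tool's internals, is that a search response carries a batch of tweets while a stream delivers tweets one at a time. The sketch below illustrates the difference with a hypothetical in-memory stand-in for a collection; the real MongoDB driver methods are asynchronous and return result objects.

```javascript
// Hypothetical in-memory stand-in for a MongoDB collection, exposing only
// the two method names used in this document.
function makeFakeCollection() {
  const docs = [];
  return {
    docs,
    insertOne(doc) { docs.push(doc); },
    insertMany(list) { docs.push(...list); },
  };
}

const collection = makeFakeCollection();

// REST: one response carries a batch of tweets, so one insertMany call stores them all.
const batch = [{ text: "a" }, { text: "b" }];
collection.insertMany(batch);

// Stream: tweets arrive one at a time, so each is stored with its own insertOne call.
const streamed = [{ text: "c" }, { text: "d" }, { text: "e" }];
for (const tweet of streamed) {
  collection.insertOne(tweet);
}

console.log(collection.docs.length);
```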
Step 3a. Stream Twitter data into MongoDB collection given setup options:
twitter2mongodb > log.csv
Step 3b. Stream Twitter data into a MongoDB collection as a service:
- Save a node runnable script of the current options
- Install pm2 (npm install pm2 -g)
- Use pm2 to run the saved script as a service
twitter2mongodb save path/to/script.js
pm2 start path/to/script.js
pm2 save
The logs are in the following Comma-Separated Values (CSV) format:
- time_iso8601: Time and date in ISO 8601 format
- status: Status of the log
- message: Relevant messages
- json: JSON object containing relevant debugging information
time_iso8601 | status | message | json |
---|---|---|---|
... | ... | ... | ... |
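A saved log.csv can be read back into objects keyed by the four columns above. The sketch below is a minimal line parser, assuming standard CSV quoting (fields wrapped in double quotes, inner quotes doubled) and no line breaks inside fields; the sample line, including its status value, is hypothetical and not taken from real output.

```javascript
// Column names from the log format described above.
const COLUMNS = ["time_iso8601", "status", "message", "json"];

// Split one CSV line into fields, honoring double-quoted fields and "" escapes.
function splitCsvLine(line) {
  const fields = [];
  let field = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        field += '"'; // doubled quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        field += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ",") {
      fields.push(field);
      field = "";
    } else {
      field += ch;
    }
  }
  fields.push(field);
  return fields;
}

// Map the fields of one log line onto the column names.
function parseLogLine(line) {
  const values = splitCsvLine(line);
  const entry = {};
  COLUMNS.forEach((name, i) => { entry[name] = values[i]; });
  return entry;
}

// Hypothetical log line for illustration only.
const sampleLine = '2018-01-01T00:00:00.000Z,success,"inserted, ok","{""n"":1}"';
const entry = parseLogLine(sampleLine);
console.log(entry.status);
console.log(JSON.parse(entry.json).n);
```

If the real log quotes fields differently, only `splitCsvLine` would need to change.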
- Reports for issues and suggestions can be made using the issue submission interface.
- Code contributions are submitted via pull requests
See CONTRIBUTING.md for more details.
Install the latest developer version with npm from GitHub:
npm install git+https://github.com/rrwen/twitter2mongodb-cli
Install from git cloned source:
- Ensure git is installed
- Clone into current path
- Install via npm
git clone https://github.com/rrwen/twitter2mongodb-cli
cd twitter2mongodb-cli
npm install
- Clone into current path
git clone https://github.com/rrwen/twitter2mongodb-cli
- Enter into folder
cd twitter2mongodb-cli
- Ensure devDependencies are installed and available
- Run tests with a .env file (see tests/README.md)
- Results are saved to tests/log with each file corresponding to a version tested
npm install
npm test
- Ensure git is installed
- Inside the twitter2mongodb-cli folder, add all files and commit changes
- Push to GitHub
git add .
git commit -a -m "Generic update"
git push
- Update the version in package.json
- Run tests and check for OK status
- Log in to npm
- Publish to npm
npm test
npm login
npm publish
The module twitter2mongodb-cli uses the following npm packages for its implementation:
npm | Purpose |
---|---|
path | Handle file and directory paths |
fs | Read and write config file |
envfile | Parse and write env files |
dotenv | Load environment variables from a file |
yargs | Command line builder and parser |
yargs-command-config | Command for managing config files |
yargs-command-env | Command for managing env files |
twitter2mongodb | Extracts Twitter data to MongoDB |
opn | Open online browser documentation |
mongodb | Send queries to MongoDB database |
parse-mongo-url | Parse MongoDB urls |
path <-- Handle file and dir paths
|
fs <-- Read and write config file
|
envfile <-- parse and write env file
|
dotenv <-- load env file
|
yargs
|--- yargs-command-config <-- manage config
|--- yargs-command-env <-- manage env
|--- twitter2mongodb <-- default command
|--- opn <-- doc
|--- mongodb <-- query
|--- parse-mongo-url <-- parse MongoDB url for info