twitter2mongodb-cli v1.1.5
twitter2mongodb-cli
Richard Wen
rrwen.dev@gmail.com
Command line tool for extracting Twitter data to MongoDB databases
Install
- Install Node.js
- Install twitter2mongodb-cli via npm
npm install -g twitter2mongodb-cli
For the latest developer version, see Developer Install.
Usage
Get help:
twitter2mongodb --help
Open documentation in a web browser:
twitter2mongodb doc twitter2mongodb
twitter2mongodb doc twitter
twitter2mongodb doc mongodb
See twitter2mongodb for programmatic usage.
Environment File
An environment file .env is used to store Twitter API credentials and MongoDB details.
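As a rough illustration of what such a file holds, the sketch below checks that an env file defines the five variables set in the steps of this section. It assumes the file consists of plain KEY=VALUE lines; `parseEnv`, `missingKeys` and `REQUIRED_KEYS` are hypothetical names used only for this example and are not part of twitter2mongodb-cli.

```javascript
// Hypothetical helper: parse simple KEY=VALUE lines (no quoting or multiline support).
function parseEnv(text) {
  const vars = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue; // skip blanks and comments
    const eq = line.indexOf('=');
    if (eq === -1) continue; // ignore lines without an assignment
    vars[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return vars;
}

// Variable names taken from the steps in this section.
const REQUIRED_KEYS = [
  'TWITTER_CONSUMER_KEY',
  'TWITTER_CONSUMER_SECRET',
  'TWITTER_ACCESS_TOKEN_KEY',
  'TWITTER_ACCESS_TOKEN_SECRET',
  'MONGODB_CONNECTION'
];

// Return the required keys that are absent or empty.
function missingKeys(vars) {
  return REQUIRED_KEYS.filter((key) => !vars[key]);
}

// Example with placeholder values (not real credentials).
const sampleEnv = [
  'TWITTER_CONSUMER_KEY=***',
  'TWITTER_CONSUMER_SECRET=***',
  'MONGODB_CONNECTION=mongodb://localhost:27017'
].join('\n');

console.log(missingKeys(parseEnv(sampleEnv)));
// → [ 'TWITTER_ACCESS_TOKEN_KEY', 'TWITTER_ACCESS_TOKEN_SECRET' ]
```

A check like this can be run before starting a long extraction so that a missing credential is caught early.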
Step 1. Set the default config for the .env file:
- Every twitter2mongodb command will now use the designated .env file
twitter2mongodb config set env path/to/.env
Step 2. Set Twitter API credentials:
twitter2mongodb env set TWITTER_CONSUMER_KEY ***
twitter2mongodb env set TWITTER_CONSUMER_SECRET ***
twitter2mongodb env set TWITTER_ACCESS_TOKEN_KEY ***
twitter2mongodb env set TWITTER_ACCESS_TOKEN_SECRET ***
Step 3. Set MongoDB connection:
twitter2mongodb env set MONGODB_CONNECTION mongodb://localhost:27017
REST API
The REST API obtains Twitter data in batches using search queries.
Step 1. Set up default Twitter options:
- Set Twitter REST method (one of get, post, delete or stream)
- Set Twitter path
- Set Twitter parameters for path
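The parameters are passed as a JSON string, and escaping the inner quotes by hand is easy to get wrong. A minimal sketch of generating the quoted argument from a plain object is shown here; `quoteForShell` is a hypothetical helper that only escapes double quotes and backslashes, so it is not a general-purpose shell quoting routine.

```javascript
// Build the JSON text for twitter.params from a plain object.
const params = { q: 'twitter' };
const json = JSON.stringify(params);

// Hypothetical helper: wrap in double quotes, backslash-escaping " and \ only.
// Values containing $ or backticks would need further escaping in POSIX shells.
function quoteForShell(text) {
  return '"' + text.replace(/(["\\])/g, '\\$1') + '"';
}

console.log('twitter2mongodb config set twitter.params ' + quoteForShell(json));
// → twitter2mongodb config set twitter.params "{\"q\":\"twitter\"}"
```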
twitter2mongodb config set twitter.method get
twitter2mongodb config set twitter.path search/tweets
twitter2mongodb config set twitter.params "{\"q\":\"twitter\"}"
Step 2. Set up default MongoDB options:
- Set database to store extracted Twitter data
- Set collection to store extracted Twitter data
- Set MongoDB query method for extracted Twitter data
- Set JSONata filter to apply before inserting
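To see why the filter matters with insertMany: a search/tweets response wraps the tweets in a `statuses` array next to a `search_metadata` object, and the JSONata expression `statuses` selects only that array. The sketch below mimics the expression with plain property access rather than calling the jsonata package, and the response object is made-up sample data.

```javascript
// Made-up sample shaped like a search/tweets response: tweets sit under
// `statuses`, next to a `search_metadata` object.
const response = {
  statuses: [
    { id_str: '1', text: 'first example tweet' },
    { id_str: '2', text: 'second example tweet' }
  ],
  search_metadata: { count: 2 }
};

// Plain-JavaScript equivalent of the JSONata expression `statuses`:
// keep only the array of tweets, one document per tweet.
const documents = response.statuses;

console.log(Array.isArray(documents), documents.length);
// → true 2
```

The resulting array is the kind of input a bulk insert method such as insertMany expects.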
twitter2mongodb config set mongodb.database twitter2mongodb_database
twitter2mongodb config set mongodb.collection twitter_data
twitter2mongodb config set mongodb.method insertMany
twitter2mongodb config set jsonata statuses
Step 3. Extract Twitter data into a MongoDB collection given the setup options:
twitter2mongodb > log.csv
Stream API
The Stream API obtains Twitter data in real-time using tracking filters.
Step 1. Set up default Twitter options:
- Set Twitter stream method
- Set Twitter path
- Set Twitter stream parameters
twitter2mongodb config set twitter.method stream
twitter2mongodb config set twitter.path statuses/filter
twitter2mongodb config set twitter.params "{\"track\":\"twitter\"}"
Step 2. Set up default MongoDB options:
- Set database to store streamed Twitter data
- Set collection to store streamed Twitter data
- Set MongoDB query method for streamed Twitter data
twitter2mongodb config set mongodb.database twitter2mongodb_database
twitter2mongodb config set mongodb.collection twitter_data
twitter2mongodb config set mongodb.method insertOne
Step 3a. Stream Twitter data into a MongoDB collection given the setup options:
twitter2mongodb > log.csv
Step 3b. Stream Twitter data into a MongoDB collection as a service:
- Save a node runnable script of the current options
- Install pm2 (npm install pm2 -g)
- Use pm2 to run the saved script as a service
twitter2mongodb save path/to/script.js
pm2 start path/to/script.js
pm2 save
Logs
The logs are in the following Comma-Separated Values (CSV) format:
- time_iso8601: Time and date in ISO 8601 format
- status: Status of the log
- message: Relevant messages
- json: JSON object containing relevant debugging information
| time_iso8601 | status | message | json |
|---|---|---|---|
| ... | ... | ... | ... |
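A saved log.csv can be read back into objects keyed by these four columns. The sketch below assumes the CSV uses conventional double-quote quoting, with a literal quote written as two quotes; `parseCsvLine` and `parseLogLine` are hypothetical helpers, and the sample line (including its status value and JSON content) is invented for illustration.

```javascript
// Hypothetical helper: split one CSV record into fields, honouring
// double-quoted fields in which a literal quote is written as "".
function parseCsvLine(line) {
  const fields = [];
  let field = '';
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        field += '"'; // escaped quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        field += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ',') {
      fields.push(field);
      field = '';
    } else {
      field += ch;
    }
  }
  fields.push(field);
  return fields;
}

// Column names from the log format described above.
const COLUMNS = ['time_iso8601', 'status', 'message', 'json'];

// Map one CSV record onto the four named columns.
function parseLogLine(line) {
  const values = parseCsvLine(line);
  const entry = {};
  COLUMNS.forEach((name, i) => { entry[name] = values[i]; });
  return entry;
}

// Made-up log line for illustration only.
const sampleLine = '2018-01-01T00:00:00.000Z,success,"example message","{""inserted"":1}"';
const entry = parseLogLine(sampleLine);
console.log(entry.status, JSON.parse(entry.json).inserted);
// → success 1
```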
Contributions
- Reports for issues and suggestions can be made using the issue submission interface.
- Code contributions are submitted via pull requests
See CONTRIBUTING.md for more details.
Developer Notes
Developer Install
Install the latest developer version with npm from GitHub:
npm install git+https://github.com/rrwen/twitter2mongodb-cli
Install from git cloned source:
- Ensure git is installed
- Clone into current path
- Install via npm
git clone https://github.com/rrwen/twitter2mongodb-cli
cd twitter2mongodb-cli
npm install
Tests
- Clone into current path: git clone https://github.com/rrwen/twitter2mongodb-cli
- Enter into folder: cd twitter2mongodb-cli
- Ensure devDependencies are installed and available
- Run tests with a .env file (see tests/README.md)
- Results are saved to tests/log with each file corresponding to a version tested
npm install
npm test
Upload to GitHub
- Ensure git is installed
- Inside the twitter2mongodb-cli folder, add all files and commit changes
- Push to GitHub
git add .
git commit -a -m "Generic update"
git push
Upload to npm
- Update the version in package.json
- Run tests and check for OK status
- Login to npm
- Publish to npm
npm test
npm login
npm publish
Implementation
The module twitter2mongodb-cli uses the following Node.js core modules (path, fs) and npm packages for its implementation:
| npm | Purpose |
|---|---|
| path | Handle file and directory paths |
| fs | Read and write config file |
| envfile | Parse and write env files |
| dotenv | Load environmental variables from a file |
| yargs | Command line builder and parser |
| yargs-command-config | Command for managing config files |
| yargs-command-env | Command for managing env files |
| twitter2mongodb | Extracts Twitter data to MongoDB |
| opn | Open online browser documentation |
| mongodb | Send queries to MongoDB database |
| parse-mongo-url | Parse MongoDB urls |
path <-- Handle file and dir paths
|
fs <-- Read and write config file
|
envfile <-- parse and write env file
|
dotenv <-- load env file
|
yargs
|--- yargs-command-config <-- manage config
|--- yargs-command-env <-- manage env
|--- twitter2mongodb <-- default command
|--- opn <-- doc
|--- mongodb <-- query
|--- parse-mongo-url <-- parse MongoDB url for info