1.0.0 • Published 3 years ago

twitter-data-indexer v1.0.0

License
MIT

Skynet Twitter Data Indexer

Transforms data from Twitter and stores it in S3 and DynamoDB.

Prerequisite

The system must have yarn installed.

Also export the following environment variables:

export SKYNET_AWS_ACCESS_KEY_ID="key"
export SKYNET_AWS_SECRET_ACCESS_KEY="access-key"
export SKYNET_AWS_REGION="us-east-1"
export SKYNET_ENVIRONMENT=dev        # prd for production
export SKYNET_TWITTER_CONSUMER_KEY="consumer-key"
export SKYNET_TWITTER_CONSUMER_SECRET="consumer-secret"
export SKYNET_TWITTER_ACCESS_TOKEN_KEY="access-token-key"
export SKYNET_TWITTER_ACCESS_TOKEN_SECRET="access-token-secret"
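
A missing variable is easy to overlook with eight of them to export. The sketch below checks that each variable named above is set and non-empty. The helper name `check_skynet_env` is hypothetical and not part of this package.

```shell
# Hypothetical helper (not part of twitter-data-indexer): reports every
# required variable that is unset or empty, and returns non-zero if any is.
check_skynet_env() {
  missing=0
  for name in \
    SKYNET_AWS_ACCESS_KEY_ID \
    SKYNET_AWS_SECRET_ACCESS_KEY \
    SKYNET_AWS_REGION \
    SKYNET_ENVIRONMENT \
    SKYNET_TWITTER_CONSUMER_KEY \
    SKYNET_TWITTER_CONSUMER_SECRET \
    SKYNET_TWITTER_ACCESS_TOKEN_KEY \
    SKYNET_TWITTER_ACCESS_TOKEN_SECRET
  do
    # Look up the variable by name without relying on bash-only syntax.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_skynet_env || exit 1` before starting the indexer would stop early with the list of missing names.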

List of Tables

skynet-<env>-twitter-metrics
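
As a small illustration of the naming pattern, the sketch below builds the table name for a given environment. It assumes that `<env>` is the value of SKYNET_ENVIRONMENT (dev or prd), which this README implies but does not state. The helper `metrics_table_name` is hypothetical.

```shell
# Hypothetical helper: prints the metrics table name for an environment.
# Assumption: <env> in the table name is the SKYNET_ENVIRONMENT value.
metrics_table_name() {
  env_name="${1:-${SKYNET_ENVIRONMENT:-}}"
  if [ -z "$env_name" ]; then
    echo "environment is required (dev or prd)" >&2
    return 1
  fi
  printf 'skynet-%s-twitter-metrics\n' "$env_name"
}
```

With the AWS CLI installed and configured, `aws dynamodb describe-table --table-name "$(metrics_table_name prd)"` is one way to confirm the table exists.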

List of Supported Metrics

followersCount(twitterId=dypfinance)
tweetsCount(twitterId=dypfinance)
retweetsCount(twitterId=dypfinance)
favoritesCount(twitterId=dypfinance)

Start

# delta mode, continuously pull and process the latest data
yarn node bin/indexer --mode delta --projectId CertiK

# rebuild mode, pull as much data as possible
yarn node bin/indexer --mode rebuild --projectId CertiK

# reset mode, delete all address records
yarn node bin/indexer --mode reset --projectId CertiK
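
Since reset mode deletes records, a typo in the mode or a missing project ID is worth catching before the indexer starts. The wrapper below is a minimal sketch of such a guard. `run_indexer` and the `DRY_RUN` variable are hypothetical names, and the accepted modes are simply the ones that appear in this README.

```shell
# Hypothetical wrapper: validates the arguments, then runs the indexer.
# Set DRY_RUN=1 to print the command instead of executing it.
run_indexer() {
  mode="${1:-}"
  project_id="${2:-}"
  case "$mode" in
    delta|rebuild|reset|once) ;;
    *) echo "unknown mode: $mode" >&2; return 2 ;;
  esac
  if [ -z "$project_id" ]; then
    echo "projectId is required" >&2
    return 2
  fi
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "yarn node bin/indexer --mode $mode --projectId $project_id"
  else
    yarn node bin/indexer --mode "$mode" --projectId "$project_id"
  fi
}
```

For example, `run_indexer rebuild CertiK` would run the rebuild command shown above, while `run_indexer rebuld CertiK` would fail before anything is started.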

Run nomad jobs

Nomad: https://nomad.certik-skynet.com/ui/jobs

Run Twitter Rebuild Jobs

  1. Make sure the environment variables are set
SKYNET_AWS_ACCESS_KEY_ID="<todo-use-real-key>"
SKYNET_AWS_SECRET_ACCESS_KEY="<todo-use-real-access-key>"
SKYNET_AWS_REGION="us-east-1"
SKYNET_ENVIRONMENT="prd"
SKYNET_TWITTER_CONSUMER_KEY="<todo-use-real-consumer-key>"
SKYNET_TWITTER_CONSUMER_SECRET="<todo-use-real-consumer-secret>"
SKYNET_TWITTER_ACCESS_TOKEN_KEY="<todo-use-real-access-token-key>"
SKYNET_TWITTER_ACCESS_TOKEN_SECRET="<todo-use-real-access-token-secret>"
  2. Run jobs with project IDs
yarn node bin/indexer --mode rebuild --projectId dypfinance
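
To rebuild several projects in sequence, the command above can be wrapped in a loop. The sketch below reads one project ID per line from standard input and skips blank lines. `rebuild_all` and the `RUNNER` override are hypothetical names, the latter added only so the loop can be tried without starting real jobs.

```shell
# Hypothetical helper: runs the rebuild command once per project ID on stdin.
# RUNNER defaults to "yarn node"; set RUNNER=echo to print instead of run.
rebuild_all() {
  while IFS= read -r project_id || [ -n "$project_id" ]; do
    # Skip blank lines so an empty ID is never passed to the indexer.
    [ -n "$project_id" ] || continue
    # Redirect stdin so the child process cannot consume the ID list.
    ${RUNNER:-yarn node} bin/indexer --mode rebuild --projectId "$project_id" </dev/null
  done
}
```

For example, `printf 'CertiK\naave\n' | rebuild_all` would rebuild the two projects one after the other.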

Run Twitter Delta Jobs

  1. Make sure the environment variables are set
SKYNET_AWS_ACCESS_KEY_ID="<todo-use-real-key>"
SKYNET_AWS_SECRET_ACCESS_KEY="<todo-use-real-access-key>"
SKYNET_AWS_REGION="us-east-1"
SKYNET_ENVIRONMENT="prd"
SKYNET_TWITTER_CONSUMER_KEY="<todo-use-real-consumer-key>"
SKYNET_TWITTER_CONSUMER_SECRET="<todo-use-real-consumer-secret>"
SKYNET_TWITTER_ACCESS_TOKEN_KEY="<todo-use-real-access-token-key>"
SKYNET_TWITTER_ACCESS_TOKEN_SECRET="<todo-use-real-access-token-secret>"
  2. Prepare a project-id.txt file with one projectId per line
CertiK
aave
dypfinance
...
  3. Run jobs with project IDs
# run the delta job once, useful for a test run and for checking the results
yarn node bin/indexer --mode once --projectId dypfinance

# run one delta job
yarn node bin/deployer --projectId dypfinance

# run one rebuild job
yarn node bin/indexer --mode rebuild --projectId dypfinance

# run many
cat project-id.txt | xargs -n 1 yarn node bin/deployer --projectId
  4. Delete delta twitter jobs
nomad status | grep -v ID | grep twitter-data-indexer | awk '{print $1}' | xargs -n 1 nomad job stop -purge
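
Because `-purge` removes the jobs entirely, it can help to preview which jobs the pipeline above would select before stopping them. The sketch below reuses the same filters without the final `xargs` step. The function name `list_twitter_jobs` is hypothetical.

```shell
# Hypothetical helper: reads `nomad status` output on stdin and prints the
# job IDs that the delete command above would act on.
list_twitter_jobs() {
  grep -v ID | grep twitter-data-indexer | awk '{print $1}'
}
```

Running `nomad status | list_twitter_jobs` prints the matching job IDs without stopping anything. As in the original command, `grep -v ID` drops every line containing "ID", not only the header row.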