1.0.0 • Published 3 years ago
twitter-data-indexer v1.0.0
Skynet Twitter Data Indexer
Pulls data from Twitter, transforms it, and stores it in S3 and DynamoDB.
Prerequisites
The system must have yarn installed.
Export the following environment variables before running the indexer:
export SKYNET_AWS_ACCESS_KEY_ID="key"
export SKYNET_AWS_SECRET_ACCESS_KEY="access-key"
export SKYNET_AWS_REGION="us-east-1"
export SKYNET_ENVIRONMENT=dev # prd for production
export SKYNET_TWITTER_CONSUMER_KEY="<todo-use-real-consumer-key>"
export SKYNET_TWITTER_CONSUMER_SECRET="<todo-use-real-consumer-secret>"
export SKYNET_TWITTER_ACCESS_TOKEN_KEY="<todo-use-real-access-token-key>"
export SKYNET_TWITTER_ACCESS_TOKEN_SECRET="<todo-use-real-access-token-secret>"
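A missing variable is easy to overlook with eight of them in play. The following is a minimal sketch of a pre-flight check, not part of the package: `check_skynet_env` and `SKYNET_REQUIRED_VARS` are hypothetical names, and the variable list is simply copied from the exports above.

```shell
# Variable names copied from the export list above.
SKYNET_REQUIRED_VARS="SKYNET_AWS_ACCESS_KEY_ID SKYNET_AWS_SECRET_ACCESS_KEY SKYNET_AWS_REGION SKYNET_ENVIRONMENT SKYNET_TWITTER_CONSUMER_KEY SKYNET_TWITTER_CONSUMER_SECRET SKYNET_TWITTER_ACCESS_TOKEN_KEY SKYNET_TWITTER_ACCESS_TOKEN_SECRET"

# check_skynet_env (hypothetical helper, not shipped with the indexer):
# reports every required variable that is unset or empty on stderr and
# returns non-zero if at least one is missing.
check_skynet_env() {
  status=0
  for name in $SKYNET_REQUIRED_VARS; do
    # Indirect lookup of the variable called $name; the names come from
    # the fixed list above, so eval sees no untrusted input.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      status=1
    fi
  done
  return $status
}
```

One way to use it is to guard a run, for example `check_skynet_env && yarn node bin/indexer --mode delta --projectId CertiK`.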
List of Tables
skynet-<env>-twitter-metrics
List of Supported Metrics
followersCount(twitterId=dypfinance)
tweetsCount(twitterId=dypfinance)
retweetsCount(twitterId=dypfinance)
favoritesCount(twitterId=dypfinance)
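To make the naming patterns above concrete, here is a small sketch that fills in the `<env>` slot of the table name and prints the four metric names for a given Twitter ID. The helper names are hypothetical, and whether the indexer stores metric keys in exactly this textual form is an assumption; the sketch only mirrors the notation used in the lists above.

```shell
# twitter_table_name ENV (hypothetical helper):
# fills the <env> slot of skynet-<env>-twitter-metrics.
twitter_table_name() {
  printf 'skynet-%s-twitter-metrics\n' "$1"
}

# twitter_metric_keys TWITTER_ID (hypothetical helper):
# prints the four supported metrics in the notation used by this README.
twitter_metric_keys() {
  for metric in followersCount tweetsCount retweetsCount favoritesCount; do
    printf '%s(twitterId=%s)\n' "$metric" "$1"
  done
}
```

For example, `twitter_table_name "$SKYNET_ENVIRONMENT"` yields the table name for whichever environment is currently exported.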
Start
# delta mode, continuously pull the latest data and process it
yarn node bin/indexer --mode delta --projectId CertiK
# rebuild mode, pull as much data as possible
yarn node bin/indexer --mode rebuild --projectId CertiK
# reset mode, delete all address records
yarn node bin/indexer --mode reset --projectId CertiK
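Because `reset` deletes records, a typo in `--mode` is worth catching before the indexer starts. The wrapper below is a sketch under stated assumptions: `is_indexer_mode` and `run_indexer` are hypothetical helpers, and the accepted modes are only the ones this README mentions (`delta`, `rebuild`, `reset`, plus `once`, which appears in the delta job instructions).

```shell
# is_indexer_mode MODE (hypothetical helper):
# succeeds only for modes that appear in this README.
is_indexer_mode() {
  case "$1" in
    delta|rebuild|reset|once) return 0 ;;
    *) return 1 ;;
  esac
}

# run_indexer MODE PROJECT_ID (hypothetical wrapper):
# validates both arguments, then runs the same command shown above.
run_indexer() {
  if ! is_indexer_mode "${1:-}"; then
    echo "unknown mode: ${1:-}" >&2
    return 2
  fi
  if [ -z "${2:-}" ]; then
    echo "usage: run_indexer MODE PROJECT_ID" >&2
    return 2
  fi
  yarn node bin/indexer --mode "$1" --projectId "$2"
}
```

With this in place, `run_indexer delta CertiK` behaves like the first command above, while `run_indexer rest CertiK` fails fast instead of reaching the indexer.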
Run nomad jobs
Nomad: https://nomad.certik-skynet.com/ui/jobs
Run Twitter Rebuild Jobs
- Make sure the environment variables are set
SKYNET_AWS_ACCESS_KEY_ID="<todo-use-real-key>"
SKYNET_AWS_SECRET_ACCESS_KEY="<todo-use-real-access-key>"
SKYNET_AWS_REGION="us-east-1"
SKYNET_ENVIRONMENT="prd"
SKYNET_TWITTER_CONSUMER_KEY="<todo-use-real-consumer-key>"
SKYNET_TWITTER_CONSUMER_SECRET="<todo-use-real-consumer-secret>"
SKYNET_TWITTER_ACCESS_TOKEN_KEY="<todo-use-real-access-token-key>"
SKYNET_TWITTER_ACCESS_TOKEN_SECRET="<todo-use-real-access-token-secret>"
- Run jobs with project IDs
yarn node bin/indexer --mode rebuild --projectId dypfinance
Run Twitter Delta Jobs
- Make sure the environment variables are set
SKYNET_AWS_ACCESS_KEY_ID="<todo-use-real-key>"
SKYNET_AWS_SECRET_ACCESS_KEY="<todo-use-real-access-key>"
SKYNET_AWS_REGION="us-east-1"
SKYNET_ENVIRONMENT="prd"
SKYNET_TWITTER_CONSUMER_KEY="<todo-use-real-consumer-key>"
SKYNET_TWITTER_CONSUMER_SECRET="<todo-use-real-consumer-secret>"
SKYNET_TWITTER_ACCESS_TOKEN_KEY="<todo-use-real-access-token-key>"
SKYNET_TWITTER_ACCESS_TOKEN_SECRET="<todo-use-real-access-token-secret>"
- Prepare a project-id.txt file with one projectId per line
CertiK
aave
dypfinance
...
- Run jobs with project IDs
# run the delta job once, useful for a test run and for checking the results
yarn node bin/indexer --mode once --projectId dypfinance
# run one delta job
yarn node bin/deployer --projectId dypfinance
# run one rebuild job
yarn node bin/indexer --mode rebuild --projectId dypfinance
# run many delta jobs, one per line of project-id.txt
cat project-id.txt | xargs -n 1 yarn node bin/deployer --projectId
- Delete delta Twitter jobs
nomad status | grep -v ID | grep twitter-data-indexer | awk '{print $1}' | xargs -n 1 nomad job stop -purge
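Since `nomad job stop -purge` is destructive, it can help to preview which jobs the filter selects before stopping anything. The sketch below factors the filtering half of the pipeline above into a function; `twitter_job_ids` is a hypothetical name, and the filter itself is unchanged from the command above.

```shell
# twitter_job_ids (hypothetical helper): reads `nomad status` output on
# stdin and prints the first column of every row that mentions
# twitter-data-indexer. As in the original pipeline, `grep -v ID` drops the
# header row, and would also drop any row that happens to contain "ID".
twitter_job_ids() {
  grep -v ID | grep twitter-data-indexer | awk '{print $1}'
}
```

Run `nomad status | twitter_job_ids` to review the list, then append `| xargs -n 1 nomad job stop -purge` once the selection looks right.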