gigapr-data-csv

This project calls the GigaPR APIs to retrieve data and serializes the response to a CSV file.

The project is implemented in Node.js and runs on Google Cloud Functions.

Description

This cloud function performs 4 steps:

  • it calls gigapr-data-api and downloads the detailed data of each application
  • it transforms the data into CSV format (saved to a local temporary file)
  • it uploads the file to Google Drive
  • it publishes a Google Pub/Sub event to notify that it's done

The detail data for all applications is retrieved with a single API call to the endpoint /allapplications/*/detail.
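
For illustration, a minimal sketch of that call (node-fetch and the response shape are assumptions, not the repo's actual code):

import fetch from 'node-fetch';

// CREDEM_API_URL is configured in .env (see "Setup your local env file" below)
const baseUrl = process.env.CREDEM_API_URL;

async function fetchAllApplicationDetails(): Promise<unknown[]> {
  // one call returns the detail data of every application
  const response = await fetch(`${baseUrl}/allapplications/*/detail`);
  if (!response.ok) {
    throw new Error(`gigapr-data-api returned ${response.status}`);
  }
  return (await response.json()) as unknown[];
}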

The mapping between the JSON data and the CSV columns is specified in the file src/mapper/acmColumnMap.
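
As an illustration of the idea (the real file may differ), such a map can pair each CSV column header with the JSON field it is read from:

// Hypothetical shape, for illustration only — see src/mapper/acmColumnMap for the real mapping
interface ColumnMapping {
  csvHeader: string;              // column name written to the CSV header row
  extract: (app: any) => string;  // how to pull the value out of the JSON record
}

const exampleColumnMap: ColumnMapping[] = [
  { csvHeader: 'Application ID', extract: app => String(app.id) },
  { csvHeader: 'Name',           extract: app => String(app.name) },
];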

Prerequisites

npm installed.

  • Note: The default shell for npm must be bash. To set this up, run npm config edit and set something like script-shell=C:\Program Files\Git\bin\bash.exe.

gcloud is needed when you want to deploy to Google Cloud. You can develop, test, and run locally without it. If you use the standard pipeline and do not deploy from your own machine, you can avoid having gcloud installed locally.

Getting Started

Once the prerequisites are in place, this is the shortest way to get started:

  1. clone the project to your local PC
  2. install node dependencies using npm i

At this point you should be able to launch a local test with npm test. All tests should pass 👍👍👍!

Building

Google Cloud Functions by default supports JavaScript on Node 10. To support the latest JS features and provide more type safety, this project is built with TypeScript.

The TypeScript compiler tsc transpiles the code to JavaScript and puts it into the target folder.

From the target folder you can either run the code locally or deploy it to Google Cloud.

To transpile the code run:

npm run clean
npm run build

Running Locally

In order to run locally, you need to:

  • set up your local .env file (see below)
  • set up Google authentication (see below)

If you run from inside Credem's LAN you also need to set up the proxy (see "Oh, my proxy" below).

Setup your local env file

The setup of your local environment is done through the .env file. This is a file where you can specify environment variables that will be loaded into Node.js when running locally.

You can find a .env.example in the git repo. Copy it to .env and adjust it to your needs. The file .env will be ignored when committing to git.
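
For context, loading the file typically takes one call at startup; a minimal sketch, assuming the dotenv package is what wires .env into process.env:

import * as dotenv from 'dotenv';

dotenv.config(); // reads .env from the working directory into process.env

const apiUrl = process.env.CREDEM_API_URL; // now available like any other env variable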

Required variables are:

# this is the path to the file with the credentials of the service account used for
# local runs
GOOGLE_APPLICATION_CREDENTIALS="credentials/privatekey.json"

# this is the typescript function that will be invoked when the function is deployed to
# Google with an http trigger (or when run locally with `npm run start_local_http`)
DEPLOY_HTTP_ENTRY_POINT='mainHttp'

# this is the typescript function that will be invoked when the function is deployed to
# Google with a Pub/Sub trigger (or when run locally with `npm run start_local_pubsub`)
DEPLOY_PUBSUB_ENTRY_POINT='mainPubsub'

# this is the url from which data are retrieved
CREDEM_API_URL="https://europe-west3-gigapr-tst.cloudfunctions.net/gigapr-data-api-test"

# this is the id of the csv file on google drive
CREDEM_DRIVE_FILE_ID=1lSFQy6Uiy6NmpDlwvP5kIco9BY4r9BgO

# this is the local path where the file will be temporarily written before
# uploading to Google drive
CREDEM_LOCAL_FILE_PATH=tmp/tempfile.csv

# The following are the parameters used to identify the Pub/Sub topic where
# the function will publish an event when it's finished.
# The Pub/Sub topic will be: projects/$CREDEM_PROJECT_ID/topics/$CREDEM_PUBSUB_TOPIC

CREDEM_PROJECT_ID="gigapr-tst"
CREDEM_PUBSUB_TOPIC="gigapr-bus-events"
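
As a sketch of step 4 of the pipeline, publishing to that topic could look like this (assuming a recent @google-cloud/pubsub client; the attribute value is illustrative, not the repo's real one):

import { PubSub } from '@google-cloud/pubsub';

async function publishDoneEvent(): Promise<void> {
  const pubsub = new PubSub({ projectId: process.env.CREDEM_PROJECT_ID });
  // topic() accepts the short name; the full path is built from the project id
  const topic = pubsub.topic(process.env.CREDEM_PUBSUB_TOPIC as string);
  await topic.publishMessage({
    data: Buffer.from(''),
    attributes: { eventType: 'CsvSerializationEnded' }, // illustrative attribute
  });
}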

A Google service-account credential file is required. See "Setup Google authentication" below for instructions to set it up.
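
That same credential is what authorizes the Drive upload of step 3. A hedged sketch, assuming the googleapis package (not necessarily the repo's real code):

import * as fs from 'fs';
import { google } from 'googleapis';

async function uploadCsvToDrive(auth: any): Promise<void> {
  const drive = google.drive({ version: 'v3', auth });
  // overwrite the content of the existing Drive file identified by CREDEM_DRIVE_FILE_ID
  await drive.files.update({
    fileId: process.env.CREDEM_DRIVE_FILE_ID,
    media: {
      mimeType: 'text/csv',
      body: fs.createReadStream(process.env.CREDEM_LOCAL_FILE_PATH as string),
    },
  });
}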

Setup Google authentication

The first step is creating a Google Service Account (a technical account on GCP) and giving your local Node.js the credentials to log in with that account:

  • create a Google service account
  • create a key in json format
  • download the json and save it here under ./credentials folder (it will be ignored by Git)
  • set the environment variable (possibly using .env)
    • GOOGLE_APPLICATION_CREDENTIALS

The second step is to ensure that the service account is authorized to access the target application data API. That is part of the configuration of that API.
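
With GOOGLE_APPLICATION_CREDENTIALS set, the Google client libraries pick up the key file automatically through Application Default Credentials. A minimal sketch, assuming google-auth-library:

import { GoogleAuth } from 'google-auth-library';

async function getAuthorizedClient() {
  const auth = new GoogleAuth({
    scopes: ['https://www.googleapis.com/auth/drive'], // scope assumed for the Drive upload
  });
  // getClient() reads the json key pointed to by GOOGLE_APPLICATION_CREDENTIALS
  return auth.getClient();
}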

Running locally with HTTP trigger

To run the function locally as an HTTP-triggered function use:

npm run clean
npm run build
npm run start_local_http

At this point you can invoke the function with a GET call to http://localhost:8082/serialize
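
For orientation, the HTTP entry point named by DEPLOY_HTTP_ENTRY_POINT roughly has this shape (a sketch only; runPipeline is a hypothetical helper, not the repo's real function):

import type { Request, Response } from 'express';

// Cloud Functions HTTP triggers pass Express-style request/response objects
export async function mainHttp(req: Request, res: Response): Promise<void> {
  try {
    await runPipeline(); // hypothetical: fetch data → write CSV → upload → publish event
    res.status(200).send('CSV serialized');
  } catch (err) {
    res.status(500).send(String(err));
  }
}

async function runPipeline(): Promise<void> {
  /* the four steps described in the Description section */
}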

Running locally with Pub/Sub trigger

To run locally as a function triggered by Pub/Sub use:

npm run clean
npm run build
npm run start_local_pubsub

At this point you can invoke the function with a POST call to http://localhost:8082/serialize. The body of the POST call must contain properly structured event data.

You can use ./scripts/simulatePubsubEvent.sh, which in turn uses the event specified in ./data/mockPubsubEvent.json:

{
  "@type": "type.googleapis.com/google.pubsub.v1.PubsubMessage",
  "attributes": {
    "eventType": "RequestProcessingEnded",
    "user": "testuser@test.credem.it"
  },
  "data": ""
}
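
For orientation, the Pub/Sub entry point named by DEPLOY_PUBSUB_ENTRY_POINT roughly receives that message like this (a sketch; the types and helper are assumptions):

interface PubsubMessage {
  data: string;                          // base64-encoded payload ("" in the mock event)
  attributes: { [key: string]: string }; // e.g. eventType and user, as shown above
}

export async function mainPubsub(message: PubsubMessage): Promise<void> {
  const payload = Buffer.from(message.data || '', 'base64').toString();
  console.log(`triggered by ${message.attributes.eventType}, payload: "${payload}"`);
  await runPipeline(); // hypothetical helper, same as in the HTTP sketch above
}

async function runPipeline(): Promise<void> {
  /* the four steps described in the Description section */
}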

Oh, my proxy

When running inside Credem's network you may encounter several obstacles dealing with the proxy.

As of Feb 2020 the following works for me:

  • For the googleapis library to work correctly you should have HTTP_PROXY and HTTPS_PROXY set to http://proxyre02.group.credem.net:8080 (possibly using .env)
  • You should point Internet Explorer or another browser to the same proxy and authenticate.
  • The authentication will last for 15 minutes for the whole machine, so be sure to visit a new page in IE every 15 minutes.

Testing

Testing is done with jest and ts-jest.

You can also run tests continuously within vscode using the Jest extension by Orta.

Unit tests do not use environment variables and they do not require an .env file.
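
A hedged example of what such a unit test can look like (mapRecordToCsvRow is a hypothetical export, not necessarily what src/mapper/acmColumnMap really provides):

// exampleMapper.test.ts — illustrative only
import { mapRecordToCsvRow } from '../src/mapper/acmColumnMap';

describe('acmColumnMap', () => {
  it('maps a JSON record to a CSV row', () => {
    const row = mapRecordToCsvRow({ id: '42', name: 'demo-app' });
    expect(row).toContain('demo-app');
  });
});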

See README-TESTING-STRATEGY.md for other test-related information.

Deploying to GCP

Note: deploying from your local machine to Google Cloud is deprecated. Please use the official Azure DevOps pipeline instead.

Prerequisites for deployments

As a prerequisite you need to have gcloud set up, logged in, and working. You must be logged in with an account that is authorized to deploy a cloud function to the target project.

Deploy configuration

Deployment is configured through parameters set in your .env file:

# The following are the variables used to deploy on GCP
TODO

As a minimum, these values define the target of your deploy.

Executing the deploy

Note: if deploying for the first time Google will ask you whether to allow unauthenticated calls. Answer N!

Read data from Spreadsheet

To use a Google Spreadsheet as the source of data instead, use:

npm run clean
npm run build
npm run deploy_with_spreadsheet_data

This requires the following to be set properly in your .env file:

# The following are the variables used to deploy on GCP
...
DEPLOY_SPREADSHEET_ID=1hNDt9vt7JdPH6AxzQhHHU9iyJ0OqGb4ZVgGeO31CiHc
DEPLOY_SHEET_NAME=Sheet1
...

Read data from local CSV file

To use a CSV file as the source of data instead, use:

npm run clean
npm run build

Then be sure to copy your CSV file inside the target folder and only after that run:

npm run deploy_with_csv_data

This requires the following to be set properly in your .env file:

# The following are the variables used to deploy on GCP
...
DEPLOY_CSV_FILEPATH=./data/MegaDownload_ElencoApplicazioniAll.txt
...

In this case, you need to have your file saved as ./target/data/MegaDownload_ElencoApplicazioniAll.txt on your local machine before deploying.

There is a little helper script to copy the Mega download file directly into your target folder when running from a Credem PC. In that case you may do:

npm run copy_csv_file
npm run deploy_with_csv_data

Environment variables in GCP

This .env file will not be available in the deployed environment. You are expected to set the corresponding env variables via configuration of the target environment.

Google cloud functions allow you to set environment variables at deploy time by the command:

gcloud functions deploy ... --set-env-vars VAR1=VALUE1,VAR2=VALUE2,...

To simplify your life, I already prepared 3 scripts to deploy with different configurations:

TODO

How to allow the API gateway to call the function

When deploying for the first time you want to allow the API Gateway service account to call this function. That's easy, just run:

npm run enableapicinvoker

This will enable the service account specified in your .env file as an invoker of the cloud function:

# The following are the variables used to deploy on GCP
...
DEPLOY_INVOKER_SERVICE_ACCOUNT=apiman@gigapr.iam.gserviceaccount.com

Troubleshooting

My tests are failing

  • check that your .env file is set properly for:

... TODO ...

When running locally the data are different from my spreadsheet

  • check CREDEM_USE_LOCAL_CSV_DATA in your .env file: if set to YES, the data will be retrieved from a local CSV file instead of from the Google Spreadsheet