1.0.2 • Published 4 years ago

cloud_etl v1.0.2

Weekly downloads
1
License
ISC
Repository
-
Last release
4 years ago

MS SQl Server to Google Bigquery

Descripttions

Source code base on cli to read csv data (exported auto by SQLCMD) and passing it to google bigquery

Usage

Install nodejs on server. Link: https://nodejs.org/en/ Run follow command:

CMD1: load table to google bigquery

 node --max-old-space-size=8192 app.js  ggc_process_cdc <datasetid> <tableid>  <sourcefile> <schemafile> <isCreated>
  • isCreated = true process will create table in google bigquery.
  • isCreated = false process will filter csv to another csv then push new csv to google cloud storage then load data to google big query.

CMD2: load cdc table to google big query

node --max-old-space-size=8192 app.js ggc_process_cdc <datasetid> <tableid>  <sourcefile> <schemafile> <isCreated>
  • isCreated = true process will create table in google bigquery inside dataset TEMP.
  • isCreated = false process will filter csv to another csv then push new csv to google cloud storage then load data to google bigquery dataset TEMP. After that run merged script to merged change from temp table to master table.

CMD3: convert schema format from mssql to google big query

node --max-old-space-size=8192 app.js convert_mssql_schema <source> <target>
  • Convert script schema auto export from mssql server to txt schema format google big query.

Notes: : Dataset name in google bigquery. : Table name will be create in google bigquery. This name = tablename in MS SQL SERVER. : Full path file of csv file. : full path of schema file.

Excample

 node --max-old-space-size=8192  app.js ggc_process CONTRACTS OtoMakers /OtoMakers.txt true

MS SQL

Command to export csv file:

 sqlcmd -S . -U <user> -P <password> -d <database> -Q "select * from OtoMakers" -o E:\Temp\OtoMakers.csv -s"#" -W  -u

SCHEMA Template

Map schema config:

MSSQL - GGBQ:
nvarchar - string
boolean - integer
int - integer
long - integer
datetime - NUMERIC
decimal - Timestamp

RawContract Schema txt example:

Id:integer,AffiliateOrganizationId:integer,CertificationNumber:string,CompanyId:integer,ContractType:integer,Created:integer,CreatedBy:integer,CreatedDate:Timestamp,HoldTimeExpire:Timestamp,HolderId:integer,MissingInfomationText:string,MissingInfomationType:integer,SellerName:string,SourceOfDataType:integer,CustomerPhone:string,Cost:NUMERIC,TransactionCode:string,TransactionStatus:integer,RevenueRequestStatus:integer,RevenueRequestCode:string,CertificationRequestStatus:integer,CertificationRequestCode:string,EffectiveDate:Timestamp,ExpiredDate:Timestamp,Flags:integer,TransactionDate:Timestamp,RevenueFailsCount:integer,CertificationFailsCount:integer,IsBlock:integer,InsuranceProductId:integer,InsuranceProductCategoryId:integer,InsuranceProductCategoryCode:string,IsDeleted:integer,Status:integer

CMD get schema: mssql-scripter -S <hostIp>,1633 -d <database> -U <user> -P <password> --include-objects InsurancePrograms > ./InsurancePrograms.sql

Step by Step

  1. Export csv to sourcefile path
  2. Create schema file with name = TableName exported or run cli convert_mssql_schema
  3. Copy schema file to schema folder
  4. Run cmd: node --max-old-space-size=8192 app.js ggc_process

POSTGRESQL

  1. Must install postgres cli
  2. cli export schema: PGPASSWORD= pg_dump -d organization -U postgres -h -p -s -t unit -f /ORGANIZATION.Unit.sql
  3. cli run convert to GBQ schema: node --max-old-space-size=8192 app.js convert_postgres_schema /ORGANIZATION.Unit.sql /ORGANIZATION.Unit.txt

    License Information

    This project is licensed according to VSIT

Copyright (c) 2019. VSIT (khanhh@vsit.com.vn)

Licensed under the VSIT, Version 1.0 (the "License");
you may not use this file except in compliance with the License.