2.0.1 • Published 8 months ago

@elbwalker/destination-node-bigquery v2.0.1

Weekly downloads
-
License
MIT
Repository
github
Last release
8 months ago

BigQuery destination for walkerOS

Why You Need this: Streamline your data collection and analysis by directly sending walkerOS events to Google BigQuery. This destination makes it easy to integrate your data pipeline with one of the most powerful data warehouses available.

Usage

The BigQuery destination allows you to send server-side walkerOS events to Google BigQuery. It handles the data transformation and ensures that your events are correctly formatted.

Basic example

Follow the setup steps first.

Install the package

npm i @elbwalker/destination-node-bigquery

Add and configure the BigQuery destination:

import { destinationBigQuery } from '@elbwalker/destination-node-bigquery';

elb('walker destination', destinationBigQuery, {
  custom: {
    projectId: 'PR0J3CT1D', // Required
    // client: BigQuery; // A BigQuery instance from @google-cloud/bigquery
    // datasetId: string; // 'walkerOS' as default
    // tableId: string; // 'events' as default
    // location: string; // 'EU' as default
    // bigquery?: BigQueryOptions; // BigQueryOptions from @google-cloud/bigquery
  },
});

Learn more how to authenticate with a service account key file using the custom bigquery options.

Setup

The destination requires an existing dataset and table to ingest data into. Replace PR0J3CT1D.walkerOS.events with your actual project ID, dataset and table names. Adjust the options if necessary, and run the query to create it.

CREATE TABLE `PR0J3CT1D.walkerOS.events` (
  timestamp TIMESTAMP NOT NULL,
  event STRING NOT NULL,
  data JSON,
  context JSON,
  custom JSON,
  globals JSON,
  user JSON,
  nested JSON,
  consent JSON,
  id STRING,
  trigger STRING,
  entity STRING,
  action STRING,
  timing NUMERIC,
  `group` STRING,
  count NUMERIC,
  version JSON,
  source JSON,
  createdAt TIMESTAMP NOT NULL
)
PARTITION BY DATE(timestamp)
OPTIONS (
  description="walkerOS raw events",
  partition_expiration_days=365, -- Automatically delete data older than 1 year
  require_partition_filter=true -- Enforce the use of partition filter in queries
);

Note: If you also need to create a new dataset, consider to actively enable physical storage billing model to eventually reduce your BigQuery costs. Based on your events a compression factor of 6 is possible, but may result in higher querying costs.

Permissions

When using Service Accounts (SAs) for Google Cloud BigQuery, it's recommended to follow the principle of least privilege. Never grant more permissions than what it needs to perform its intended functions.

Assign explicit permissions directly to datasets within BigQuery (using the share option). This ensures that the service account only has access to what is necessary for operation.

For more detailed information, refer to the official Google Cloud IAM documentation.

Who this package is for

This destination is ideal for data engineers and analysts who are already using Google BigQuery or plan to integrate it into their data stack. It's also useful for companies looking to centralize their data collection and analysis efforts.

Dependencies

Before using the BigQuery destination, ensure you have:

  • walkerOS node client
  • Google Cloud Platform account
  • BigQuery dataset and table
  • GCP service account with permissions to write to the events table
2.1.0-next.0

8 months ago

2.0.2-next.0

9 months ago

1.1.1

11 months ago

1.1.0

1 year ago

1.0.4-next.1

1 year ago

1.0.4-next.0

1 year ago

1.1.2

11 months ago

1.1.0-next.2

1 year ago

2.0.0-next.0

11 months ago

2.0.1

10 months ago

2.0.0

11 months ago

1.0.3

1 year ago

1.0.2

1 year ago

1.0.2-next.0

1 year ago

1.0.1

1 year ago

1.0.1-next.0

1 year ago

1.0.0

1 year ago

0.0.1

2 years ago