2.5.2 • Published 2 months ago

target-clickhouse v2.5.2

Target Clickhouse

A Singer target for ClickHouse, for use with Singer streams generated by Singer taps. It is written in Node.js using singer-node.

Usage

Install

As npm package on host

npm install -g target-clickhouse

Docker image

docker pull ghcr.io/biron-bi/target-clickhouse

Run

  1. Create a config file config.json with connection information and ingestion parameters.

    {
      "host": "localhost",
      "port": 8123,
      "database": "destination_database",
      "username": "user",
      "password": "averysecurepassword"
    }
  2. Run target-clickhouse against a Singer tap.

In the following examples:

  • The state emitted by the target is appended to a state.jsonl file

  • The file current_state.json contains the last line of state.jsonl

  • The file config.json contains the ClickHouse connection information

Npm package:

<tap-anything> --state current_state.json | target-clickhouse --config config.json >> state.jsonl

Docker:

In this example, the container reads the config file from a /config directory, mounted read-only from the current working directory.

<tap-anything> --state current_state.json | docker run --rm -i -a STDIN -a STDOUT -a STDERR -v "$(pwd):/config:ro" ghcr.io/biron-bi/target-clickhouse --config /config/config.json >> state.jsonl
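Between two runs, current_state.json has to be refreshed with the last line of state.jsonl. The following is a minimal Node.js sketch of that step, not part of target-clickhouse itself. The function names lastStateLine and refreshCurrentState are hypothetical, and the sketch assumes every non-empty line of state.jsonl is a complete JSON state.

```javascript
// Sketch: refresh current_state.json from the last state line in state.jsonl.
// This helper is illustrative and is not shipped with target-clickhouse.
const fs = require("node:fs");

// Returns the last non-empty line of a JSON Lines file, or null if there is none.
function lastStateLine(jsonlPath) {
  if (!fs.existsSync(jsonlPath)) return null;
  const lines = fs
    .readFileSync(jsonlPath, "utf8")
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  return lines.length > 0 ? lines[lines.length - 1] : null;
}

// Writes the latest state to statePath. Returns true if a state was written.
function refreshCurrentState(jsonlPath, statePath) {
  const last = lastStateLine(jsonlPath);
  if (last === null) return false; // nothing to resume from yet
  JSON.parse(last); // throws if the last line is not valid JSON
  fs.writeFileSync(statePath, last + "\n");
  return true;
}

// Usage, run between two syncs:
// refreshCurrentState("state.jsonl", "current_state.json");
```

Validating the line with JSON.parse before writing avoids handing a truncated state to the tap if a previous run was interrupted mid-write.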

Config.json

The following fields can be specified in the config file.

Mandatory fields

  • host
  • port
  • username
  • password
  • database

Optional fields

  • logging_level: Defaults to "INFO"
  • subtable_separator: Defaults to "__"
  • translate_values: Whether fields are parsed a second time to convert specific values, e.g. True accepted as true. Defaults to false
  • batch_size: Number of records to read before sending them to ClickHouse. Defaults to 100
  • finalize_concurrency: Number of stream ingestion finalizations that run concurrently. Defaults to 3
  • extra_active_tables: List of tables that are considered active even if not present in the ACTIVE_STREAMS message. Defaults to []
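The field lists above can be turned into a small pre-flight check that runs before the target starts. This is a sketch under assumptions, not the validation target-clickhouse performs internally: the names MANDATORY_FIELDS, OPTIONAL_DEFAULTS, missingMandatoryFields and withDocumentedDefaults are hypothetical, and the default values are copied from the list above.

```javascript
// Sketch of a pre-flight check for config.json (not target-clickhouse's own code).
const MANDATORY_FIELDS = ["host", "port", "username", "password", "database"];

// Defaults as documented above for the optional fields.
const OPTIONAL_DEFAULTS = {
  logging_level: "INFO",
  subtable_separator: "__",
  translate_values: false,
  batch_size: 100,
  finalize_concurrency: 3,
  extra_active_tables: [],
};

// Returns the names of mandatory fields that are missing from the config.
function missingMandatoryFields(config) {
  return MANDATORY_FIELDS.filter(
    (field) => config[field] === undefined || config[field] === null
  );
}

// Returns a copy of the config with the documented defaults filled in.
// Throws if a mandatory field is missing.
function withDocumentedDefaults(config) {
  const missing = missingMandatoryFields(config);
  if (missing.length > 0) {
    throw new Error(`Missing mandatory config fields: ${missing.join(", ")}`);
  }
  return { ...OPTIONAL_DEFAULTS, ...config };
}

const example = withDocumentedDefaults({
  host: "localhost",
  port: 8123,
  database: "destination_database",
  username: "user",
  password: "averysecurepassword",
  batch_size: 500, // overrides the documented default of 100
});
console.log(example.batch_size, example.subtable_separator); // prints: 500 __
```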

Singer specification extension

Several features are supported that are not part of the standard Singer specification:

  • Update schemas: Pass the repeatable CLI option --update-streams <stream> to specify the streams whose tables (root and children) should be recreated.
  • Clean first: Specify clean_first: true in SCHEMA messages to wipe the table content before each ingestion.
  • Cleaning column: Specify cleaning_column: "<column_name>" in SCHEMA messages to wipe the table content that matches the column values seen during ingestion. For instance, if the column "date" is specified as the cleaning column and the value "2022-01-01" is encountered in a record, all rows with the value "2022-01-01" are replaced with those contained in the stream.
  • All key properties: Specify all_key_properties: {props: [], children: {}} in SCHEMA messages to specify primary keys for all children of a root table. This allows children to create a foreign key to their parent (with the format _parent_<column>).
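As an illustration of the SCHEMA extensions above, here is a sketch of how a tap written in Node.js might emit such a message. Several points are assumptions rather than documented behavior: the extension fields are placed at the top level of the SCHEMA message, the children of all_key_properties are assumed to be keyed by child name and to reuse the {props, children} shape, and the stream, column and child names (orders, date, items, sku) as well as the schemaMessage helper are hypothetical.

```javascript
// Sketch: building a Singer SCHEMA message that carries the extension fields.
// Stream, column, and child names below are hypothetical.
function schemaMessage(stream, schema, keyProperties, extensions = {}) {
  return {
    type: "SCHEMA",
    stream,
    schema,
    key_properties: keyProperties,
    ...extensions, // assumption: extensions sit at the top level of the message
  };
}

const ordersSchema = schemaMessage(
  "orders",
  {
    type: "object",
    properties: {
      id: { type: "integer" },
      date: { type: "string", format: "date" },
      items: {
        type: "array",
        items: { type: "object", properties: { sku: { type: "string" } } },
      },
    },
  },
  ["id"],
  {
    // Rows whose "date" matches a value seen in the stream get replaced.
    cleaning_column: "date",
    all_key_properties: {
      props: ["id"],
      // Assumption: children are keyed by child name and reuse the same shape.
      children: { items: { props: ["sku"], children: {} } },
    },
  }
);

// Singer messages are written as one JSON object per line on stdout.
console.log(JSON.stringify(ordersSchema));
```

Replacing cleaning_column with clean_first: true in the extensions object would request a full wipe of the table before ingestion instead of a per-value replacement.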

Sponsorship

Target Clickhouse is written and maintained by Biron https://birondata.com/

Acknowledgements

Special thanks to the people who built

License

Distributed under the AGPLv3
