0.1.1 • Published 4 years ago

elasticsearch-migration v0.1.1

Weekly downloads
1
License
MIT
Repository
github
Last release
4 years ago

Elasticsearch Updater

Elasticsearch Updater takes care of automatically updating index mappings, running data transformation scripts, and re-targeting aliases to new indices when a schema change needs to be made. It assumes that all mutable logical indices are actually represented by aliases pointing to some concrete backing index, and that there is a one-to-one relationship between aliases and active backing indices. Updates are performed on an alias by re-indexing its current backing index, leaving the originaindex behind as a back-up, and then re-targeting the alias at the new active index.

Elasticsearch Cloner

Elasticsearch Cloner takes care of preparing a target index with the appropriate mappings and index settings, re-indexing to copy data from a remote instance alias, and setting up the alias for the cloned index. It requires that your target instance already have the source whitelisted.

Usage

  • esmigrate init [endpoint] -- Generate a migration file that will produce a database from scratch implementing the same schema as the target instance running at endpoint, and set the schema version on endpoint if it has not been set before. Note that this is a global command, which acts on all aliases in the target remote instance.

  • esmigrate clone [source] [target] [alias] [size? = Infinity] -- Copy documents from the given alias in the source instance to a backing index with the same name as the source backing index in the target instance, and set up alias to point to that index in the target instance. If provided, the optional size argument limits the maximum number of documents to be cloned.

  • esmigrate validate [path? = './migrations'] -- Check that migration files in the specified folder are all valid.

  • esmigrate project [endpoint] [project] -- Migrate the Elasticsearch instance at endpoint to the latest schema version, using a project-specific schema-version index.

  • esmigrate project [endpoint] [project] YYYY-MM-DD(.V) -- Migrate the Elasticsearch instance at endpoint to the schema as of the given date (with an optional sub-version if the schema was updated multiple time in a single day), using a project-specific schema-version index.

  • esmigrate [endpoint] -- Migrate the Elasticsearch instance at endpoint to the latest schema version, using a global schema-version index (with document _id '1').

  • esmigrate [endpoint] YYYY-MM-DD(.V) -- Migrate the Elasticsearch instance at endpoint to the schema as of the given date (with an optional sub-version if the schema was updated multiple time in a single day), using a global schema-version index (with document _id '1').

Note that the tool does not do any checking to ensure that project-specific migrations only touch project-specific indices/aliases. You must ensure that different projects using the same instance avoid stepping on each other's toes manually. To that end, it is recommended that you not mix project-specific and global migrations in the same instance.

Config format

Migration files are specified in a /migrations directory, with the naming scheme YYYY-MM-DD(.V).{human-readable-name}.esmigration.

At the top level, migration files are divided into "ups" and "downs"--commands for advancing or rewinding the schema version. Each section is introduced with as #UPS: or #DOWNS: line, respectively. Within each section, the updates to apply to each alias are introduced with an #alias: {alias-name} line. Within each #alias section, one can specify any or all of:

  • a new mapping
  • a migration script
  • new settings
  • a new name

#alias sections with mappings included are treated as upserts--if the alias does not currently exist, a new index and associated alias will be created; if it does, the existing alias will have its backing index updated.

Mapping updates are specified beginning with a #mappings: tag. The content of the #mappings section is JSON corresponding to the contents of the mappings field in the Elasticsearch API request to create an index. For updates to existing logical indexes, though, it is not necessary to re-specify the complete mapping in every migration; instead, you can just include the fields that are to be added or changed. Including a field with a value of null will be interpreted as a command to delete that field from the pre-existing mappings. Mappings are required when creating a new index from scratch rather than updating an existing index.

Data-transform scripts are specified with a #script({lang}) tag. If no language is specified, it defaults to painless. The content of a #script section is just the literal source code for the tranform script.

Settings updates are specified beginning with a #settings: tag. The content of the #settings section is JSON corresponding to the contents of the settings field in the Elasticsearch API request to create an index. For updates to existing logical indexes, though, it is not necessary to re-specify the complete settings in every migration; instead, you can just include the fields that are to be added or changed. Including a field with a value of null will be interpreted as a command to delete that field from the pre-existing settings. If no settings are specified when creating a new index, Elasticsearch defaults will be used.

New names for an alias are specified beginning with a #rename: {alias-name} tag. This section has no further content.

#delete: {alias-name} lines indicate that the named alias should be removed during the given migration. These can occur anywhere in an #UPS or #DOWNS section (though they will be most useful in the DOWNS section), and override any #script or #mapping commands that might also be associated with that alias.