Our Favorite Elasticsearch Features: Part 3 - Index Templates
Complete example source for this article can be found here.
What are index templates?
By now, we hope you see the value in configuring explicit mappings, and in using index aliases. You may have noticed that there is a good amount of book-keeping required to make sure all the indices in your aliases are consistently defined, and that an alias includes all the relevant indices. You have to develop a system to manage this.
The final feature we want to cover, index templates, is an important and useful part of such a system. Index templates allow you to configure mappings and aliases for indices you haven't even created yet.
Index templates allow you to specify index settings for any new index whose name matches the template's pattern (typically one or more names containing wildcards). You can use these to configure your Elasticsearch system so that new indices get the correct mappings without anyone manually setting the mapping definition, and so that each index is automatically available through an alias as soon as it is created.
In fact, any index setting can be managed through index templates, including index aliases. Other useful index options to consider adding to your index templates are shard counts, replica counts, and refresh intervals.
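As a rough illustration of the ideas above, the sketch below models a template as a plain Python dictionary and checks locally which index names its pattern would cover. The setting values and the `matches` and `template_applies` helpers are illustrative assumptions, not taken from the example source. The dictionary follows the shape of the composable template API (Elasticsearch 7.8 and later), and the matcher only approximates how Elasticsearch treats the `*` wildcard.

```python
import re

# Hypothetical template fragment; the values are illustrative only.
template = {
    "index_patterns": ["visitor_logs*"],
    "template": {
        "settings": {
            "number_of_shards": 1,
            "number_of_replicas": 1,
            "refresh_interval": "30s",
        },
    },
}


def matches(pattern: str, index_name: str) -> bool:
    """Approximate wildcard matching: '*' stands for any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, index_name) is not None


def template_applies(template: dict, index_name: str) -> bool:
    """Return True if any of the template's patterns matches the index name."""
    return any(matches(p, index_name) for p in template["index_patterns"])
```

Under these assumptions, `template_applies(template, "visitor_logs_2018_01")` is true while an unrelated name such as `purchase_logs` is not covered, so only the former would pick up the shard, replica, and refresh settings at creation time.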
Creating an index template
Index templates are easy to demonstrate. These examples create an identical setup for the visitor log indices and alias shown earlier. Note, as with many Elasticsearch index options, changing the settings doesn't retroactively affect indexed data; the following examples assume no visitor log indices exist yet.
First, we create a template that matches any index whose name fits the pattern "visitor_logs*", and configure those indices with a mapping and an index alias.
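A minimal sketch of what such a template request might look like, built in Python without sending it anywhere. It assumes the composable template endpoint (`PUT _index_template/<name>`, Elasticsearch 7.8 and later); older clusters use `PUT _template/<name>` with a slightly different body. The template name, the alias name, and the three mapped fields are hypothetical stand-ins, since the visitor log mapping from the earlier article is not reproduced here.

```python
import json

TEMPLATE_NAME = "visitor_logs_template"  # hypothetical template name
ALIAS_NAME = "all_visitor_logs"          # hypothetical alias name


def build_template_request():
    """Return (method, path, json_body) for creating the index template."""
    body = {
        "index_patterns": ["visitor_logs*"],
        "template": {
            # Hypothetical fields standing in for the real visitor log mapping.
            "mappings": {
                "properties": {
                    "timestamp": {"type": "date"},
                    "ip": {"type": "ip"},
                    "url": {"type": "keyword"},
                }
            },
            # Every index created from this template joins the alias.
            "aliases": {ALIAS_NAME: {}},
        },
    }
    return "PUT", f"/_index_template/{TEMPLATE_NAME}", json.dumps(body)
```

The returned tuple can be handed to whichever HTTP client you prefer. Keeping request construction separate from sending makes the template body easy to review and test.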
Other than this template, nothing has (yet) been created.
Now, simply indexing data into a target index will create the index if it doesn't exist, before adding the data.
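The indexing step can be sketched as two document requests aimed at two indices that do not exist yet. The index names and document fields below are hypothetical; the only thing that matters is that both names match the "visitor_logs*" pattern, so the template is applied when Elasticsearch creates each index on first write.

```python
import json


def build_index_request(index_name: str, document: dict):
    """Return (method, path, json_body) for indexing one document."""
    return "POST", f"/{index_name}/_doc", json.dumps(document)


# Hypothetical documents and index names, one per month.
index_requests = [
    build_index_request(
        "visitor_logs_2018_01",
        {"timestamp": "2018-01-15T10:00:00Z", "ip": "10.0.0.1", "url": "/home"},
    ),
    build_index_request(
        "visitor_logs_2018_02",
        {"timestamp": "2018-02-03T08:30:00Z", "ip": "10.0.0.2", "url": "/about"},
    ),
]
```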
Query the indices using the index alias:
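A sketch of the alias query, again built but not sent. The alias name is a hypothetical placeholder for whatever alias the template defines, and `indices_in_hits` is an illustrative helper that reads the standard `hits.hits[]._index` field of a search response to show which indices contributed results.

```python
import json

ALIAS_NAME = "all_visitor_logs"  # hypothetical alias name


def build_search_request(alias: str = ALIAS_NAME):
    """Return (method, path, json_body) for a match-all search on the alias."""
    body = {"query": {"match_all": {}}}
    return "GET", f"/{alias}/_search", json.dumps(body)


def indices_in_hits(response: dict) -> list:
    """List the distinct indices that produced hits in a search response."""
    return sorted({hit["_index"] for hit in response["hits"]["hits"]})
```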
As expected, we see two hits, one from each index created when the documents themselves were indexed.
Looking in more detail at what got set up during indexing, we see that both the indices themselves and the index aliases were configured automatically.
...and the aliases:
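Both inspections can be sketched as simple GET requests: one for the index definitions matching the pattern, and one for the alias. The alias name is hypothetical, and `indices_behind_alias` is an illustrative helper that assumes the usual response shape of the get-alias API, where each index name maps to an `aliases` object.

```python
ALIAS_NAME = "all_visitor_logs"  # hypothetical alias name
INDEX_PATTERN = "visitor_logs*"


def build_inspection_requests():
    """Return (method, path) pairs for inspecting the indices and the alias."""
    return [
        ("GET", f"/{INDEX_PATTERN}"),       # settings, mappings, and aliases per index
        ("GET", f"/_alias/{ALIAS_NAME}"),   # which indices the alias points at
    ]


def indices_behind_alias(alias_response: dict, alias: str = ALIAS_NAME) -> list:
    """List the indices that carry the alias in a get-alias response."""
    return sorted(
        name
        for name, info in alias_response.items()
        if alias in info.get("aliases", {})
    )
```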
There isn't anything surprising in either the resulting index definition or the alias setup. Index templates are unobtrusive, yet incredibly useful.
The cost of abstraction
If you realize you need to make a change to the index settings, changing an index template is easy, but a template is only applied when an index is created. You will likely also want to change the settings on all the existing indices, and you may need to reindex the data in those indices to pick up the change.
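One way such a migration might be sketched, assuming the change is a new mapped field. The field name and index names are hypothetical. The mapping update targets the existing indices through the wildcard pattern, and the reindex copies one index into a new index whose name still matches "visitor_logs*", so the updated template applies when the destination is created.

```python
import json


def build_mapping_update(index_pattern: str, field_name: str, field_mapping: dict):
    """Return (method, path, json_body) adding one field to existing indices."""
    body = {"properties": {field_name: field_mapping}}
    return "PUT", f"/{index_pattern}/_mapping", json.dumps(body)


def build_reindex(source_index: str, dest_index: str):
    """Return (method, path, json_body) copying documents into a new index."""
    body = {"source": {"index": source_index}, "dest": {"index": dest_index}}
    return "POST", "/_reindex", json.dumps(body)


# Hypothetical migration: add a "referrer" field, then rebuild one index.
migration = [
    build_mapping_update("visitor_logs*", "referrer", {"type": "keyword"}),
    build_reindex("visitor_logs_2018_01", "visitor_logs_2018_01_v2"),
]
```

A mapping update like this can add new fields to an existing index, but changing the type of an existing field generally requires the reindex step.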
Mapping and index settings changes are necessary and do require some effort, and index templates add a little more work to those migrations. However, the benefits clearly outweigh the costs. Reindexing is generally a carefully planned operation, so where possible we apply several logically discrete changes to our index or template structure at the same time.
Summary of Our Favorite Elasticsearch Features
Elasticsearch is a fantastic tool. The features outlined in this and the previous two articles are basic and not necessarily obvious, yet very useful for production Elasticsearch clusters. Of course, there are many other Elasticsearch features we rely on every day for maintenance, ingest, and querying. These three articles cover features that may not be as flashy as the latest (now open) X-Pack offerings, but that we find critical to our Elasticsearch operations.
Explicit mappings help us keep our database structure consistent, and give us the full power of Elasticsearch's rich data types and more complex mapping and analysis options.
Index aliases are essential for minimizing the impact of necessary reindexing maintenance operations. These operations are unavoidable and tend to accumulate over time, and index aliases allow you to keep your data online whilst performing the changes.
Finally, index templates are an invaluable organizational tool that prevents us from forgetting to use the "right" settings when creating new indices.
Ian Truslove is a co-founder of Cambium Consulting. He specializes in building large-scale resilient data processing systems using tools like Clojure and Elasticsearch. When not hunched over an Emacs terminal, you might find him on a bike in the wilds of Colorado.