1 Data Management
1.1 Task: Define an index that satisfies a given set of requirements
Example 1: Creating an Index for a Blogging Platform
Requirements
- The platform hosts articles, each with text content, a publication date, author details, and tags.
- Articles need to be searchable by `content`, `title`, and `tags`.
- The application requires fast search responses and efficient storage.
- The application should handle date-based queries efficiently.
- Author details are nested objects that include the author's name and email.
Steps
Open the Kibana Console or use a REST client.
Define Mappings:
- Content and Title: Use the `text` data type.
- Publication Date: Use the `date` data type.
- Tags: Use the `keyword` data type for exact matching.
- Author: Use a `nested` object to keep author details searchable and well-structured.
Create the index
```json
PUT blog_articles
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "content": { "type": "text" },
      "publication_date": { "type": "date" },
      "author": {
        "type": "nested",
        "properties": {
          "name": { "type": "text" },
          "email": { "type": "keyword" }
        }
      },
      "tags": { "type": "keyword" }
    }
  }
}
```

Note that `tags` is mapped as `keyword`, matching the exact-match requirement above.

Or insert the settings and mappings separately:

```json
PUT blog_articles
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}
```

And:

```json
PUT blog_articles/_mapping
{
  "properties": {
    "title": { "type": "text" },
    "content": { "type": "text" },
    "publication_date": { "type": "date" },
    "author": {
      "type": "nested",
      "properties": {
        "name": { "type": "text" },
        "email": { "type": "keyword" }
      }
    },
    "tags": { "type": "keyword" }
  }
}
```
Test
Verify the index.

```json
GET /_cat/indices
```

Verify the mappings.

```json
GET /blog_articles/_mapping
```

Index and search for a document.

```json
# Index
POST /blog_articles/_doc
{
  "title": "My First Blog Post",
  "content": "What an interesting way to go...",
  "publication_date": "2024-05-15",
  "tags": "superb",
  "author": { "name": "John Doe", "email": "john@doe.com" }
}

# Search like this
GET /blog_articles/_search

# Or search like this
GET /blog_articles/_search?q=tags:superb

# Or search like this
GET blog_articles/_search
{
  "query": {
    "query_string": { "default_field": "tags", "query": "superb" }
  }
}

# Or search like this
GET blog_articles/_search
{
  "query": {
    "nested": {
      "path": "author",
      "query": { "match": { "author.name": "john" } }
    }
  }
}
```

Considerations
- Shards and Replicas: Adjust these settings based on expected data volume and query load.
- Nested Objects: These are crucial for maintaining the structure and searchability of complex data like author details.
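
The shard count is fixed once the index exists (short of a shrink/split or reindex), but replicas can be changed at any time. A minimal sketch of raising the replica count later (the value 2 is just an illustration):

```json
PUT /blog_articles/_settings
{
  "index": { "number_of_replicas": 2 }
}
```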
Clean-up (optional)
In the console execute the following
```json
DELETE blog_articles
```
Documentation
Example 2: Creating an Index for Log Data
Requirements
- Store log data with a timestamp field
Steps
Open the Kibana Console or use a REST client.
Create the index
```json
PUT /log_data
{
  "settings": {
    "number_of_shards": 3
  },
  "mappings": {
    "properties": {
      "@timestamp": { "type": "date" },
      "log_source": { "type": "keyword" },
      "message": { "type": "text" }
    }
  }
}
```
Test
Verify the index creation
```json
GET /log_data
```

Or:

```json
GET /_cat/indices
```

Verify the field mapping.

```json
GET /log_data/_mapping
```

Index and search for a sample document.

Index:

```json
PUT /log_data/_doc/1
{
  "@timestamp": "2023-05-16T12:34:56Z",
  "log_source": "web_server",
  "message": "HTTP request received"
}
```

Search:

```json
GET /log_data/_search
```

The response should show the indexed document.
Considerations
- In `settings`, `number_of_replicas` does not appear because its default of 1 is sufficient. The `number_of_shards` value should be set above 1 depending on the requirements for a log index. A `settings` block is not required for the index to be created.
- The `@timestamp` field is mapped as a `date` type for time-based data management.
- The `log_source` field is mapped as a `keyword` type so its values are treated as exact terms, which supports filtering, aggregations, and custom routing based on its value.
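
The `date` mapping on `@timestamp` is what makes time-window searches efficient. A minimal sketch of a range query over a single day:

```json
GET /log_data/_search
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "2023-05-16T00:00:00Z",
        "lte": "2023-05-16T23:59:59Z"
      }
    }
  }
}
```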
Clean-up (optional)
In the console execute the following
```json
DELETE log_data
```
Documentation
Example 3: Creating an index for e-commerce product data with daily updates
Requirements
- Store product information including `name`, `description`, `category`, `price`, and `stock_level`.
- Allow filtering and searching based on product `name`, `category`, and `price` range.
- Enable aggregations to calculate average price per category.
Steps
Open the Kibana Console or use a REST client.
Define mappings:
- Use the `text` data type for `name` and `description` to allow full-text search.
- Use the `keyword` data type for `category` to enable filtering by exact terms.
- Use the `integer` data type for `price` to allow for range queries and aggregations.
- Use the `integer` data type for `stock_level` for inventory management.
Create the index
```json
PUT products
{
  "mappings": {
    "properties": {
      "name": { "type": "text" },
      "description": { "type": "text" },
      "category": { "type": "keyword" },
      "price": { "type": "integer" },
      "stock_level": { "type": "integer" }
    }
  }
}
```
Configure analyzers (optional):
You can define custom analyzers for `name` and `description` to handle special characters or stemming based on your needs. Notice two things:

- How `custom_analyzer` refers to the `filter` and `tokenizer` (both of which are optional).
- The fields that will use `custom_analyzer`, `name` and `description`, reference it via their `analyzer` setting.

```json
PUT /products
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "custom_tokenizer": { "type": "standard" }
      },
      "filter": {
        "custom_stemmer": { "type": "stemmer", "name": "english" },
        "custom_stop": { "type": "stop", "stopwords": "_english_" }
      },
      "analyzer": {
        "custom_analyzer": {
          "type": "custom",
          "tokenizer": "custom_tokenizer",
          "filter": ["lowercase", "custom_stop", "custom_stemmer"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": { "type": "text", "analyzer": "custom_analyzer" },
      "description": { "type": "text", "analyzer": "custom_analyzer" },
      "category": { "type": "keyword" },
      "price": { "type": "integer" },
      "stock_level": { "type": "integer" }
    }
  }
}
```
Test
Verify the index creation
```json
GET products
```

Or:

```json
GET /_cat/indices
```

Verify the field mapping.

```json
GET /products/_mapping
```

Index and search some sample product data.

Index some products:

```json
POST /products/_bulk
{ "index": { "_index": "products", "_id": "1" } }
{ "name": "Wireless Bluetooth Headphones", "description": "High-quality wireless Bluetooth headphones with noise-cancellation and long battery life.", "category": "electronics", "price": 99, "stock_level": 250 }
{ "index": { "_index": "products", "_id": "2" } }
{ "name": "Stainless Steel Water Bottle", "description": "Durable stainless steel water bottle, keeps drinks cold for 24 hours and hot for 12 hours.", "category": "home", "price": 25, "stock_level": 500 }
{ "index": { "_index": "products", "_id": "3" } }
{ "name": "Smartphone", "description": "Latest model smartphone with high-resolution display and fast processor.", "category": "electronics", "price": 699, "stock_level": 150 }
{ "index": { "_index": "products", "_id": "4" } }
{ "name": "LED Desk Lamp", "description": "Energy-efficient LED desk lamp with adjustable brightness and flexible neck.", "category": "home", "price": 45, "stock_level": 300 }
{ "index": { "_index": "products", "_id": "5" } }
{ "name": "4K Ultra HD TV", "description": "55-inch 4K Ultra HD TV with HDR support and smart features.", "category": "electronics", "price": 499, "stock_level": 200 }
{ "index": { "_index": "products", "_id": "6" } }
{ "name": "Vacuum Cleaner", "description": "High-suction vacuum cleaner with multiple attachments for versatile cleaning.", "category": "home", "price": 120, "stock_level": 100 }
```

Search:

```json
GET /products/_search?q=name:desk
```
Use aggregations to calculate the average price per category.
```json
POST /products/_search
{
  "size": 0,
  "aggs": {
    "average_price_per_category": {
      "terms": { "field": "category" },
      "aggs": {
        "average_price": {
          "avg": { "field": "price" }
        }
      }
    }
  }
}
```
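
The price-range requirement can be exercised the same way. A minimal sketch of a range filter (the bounds are arbitrary):

```json
GET /products/_search
{
  "query": {
    "range": {
      "price": { "gte": 50, "lte": 500 }
    }
  }
}
```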
Considerations
- Using the appropriate data types ensures efficient storage and querying capabilities.
- Text fields allow full-text search, while keyword fields enable filtering by exact terms.
Clean-up (optional)
In the console execute the following
```json
DELETE products
```
Documentation
1.2 Task: Define and use an index template for a given pattern that satisfies a given set of requirements
Example 1: Creating an index template for a user profile data
Requirements
- Create an index template named `user_profile_template`.
- The template should apply to indices starting with `user_profile-`.
- The template should have two shards and one replica.
- The template should have a mapping for the `name` field as a `text` data type with an analyzer of `standard`.
- The template should have a mapping for the `age` field as an `integer` data type.
Steps
Open the Kibana Console or use a REST client.
Create the index template
```json
PUT /_index_template/user_profile_template
{
  "index_patterns": ["user_profile-*"],
  "template": {
    "settings": {
      "number_of_shards": 2,
      "number_of_replicas": 1
    },
    "mappings": {
      "properties": {
        "name": { "type": "text", "analyzer": "standard" },
        "age": { "type": "integer" }
      }
    }
  }
}
```
Test
Verify the index template was created
```json
GET _index_template/user_profile_template
```

Create an index named `user_profile-2024` using the REST API (note the hyphen, so the name matches the `user_profile-*` pattern):

```json
PUT /user_profile-2024
```

Verify that the index was created with the expected settings and mappings:

```json
GET /user_profile-2024/_settings
GET /user_profile-2024/_mapping
```
Considerations
- Two shards are chosen to allow for parallel processing and improved search performance.
- One replica is chosen for simplicity and development purposes; in a production environment, this would depend on the expected data volume and search traffic.
- The `standard` analyzer is chosen for the `name` field to enable standard text analysis.
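
To preview which settings and mappings a template would apply to a given index name, without actually creating the index, you can try the simulate index API (available in recent Elasticsearch versions):

```json
POST /_index_template/_simulate_index/user_profile-2024
```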
Clean-up (optional)
In the console execute the following
```json
DELETE /user_profile-2024
DELETE /_index_template/user_profile_template
```
Documentation
Example 2: Creating a monthly product index template
Requirements
- Index name pattern: `products-*`
- Index settings:
  - Number of shards: 3
  - Number of replicas: 2
- Mapping:
  - Field `name` should be of type `text`
  - Field `description` should be of type `text`
  - Field `price` should be of type `float`
  - Field `category` should be of type `keyword`
Steps
Open the Kibana Console or use a REST client.
Create the index template
```json
PUT _index_template/monthly_products
{
  "index_patterns": ["products-*"],
  "template": {
    "settings": {
      "number_of_shards": 3,
      "number_of_replicas": 2
    },
    "mappings": {
      "properties": {
        "name": { "type": "text" },
        "description": { "type": "text" },
        "price": { "type": "float" },
        "category": { "type": "keyword" }
      }
    }
  }
}
```

Note that this uses the composable `_index_template` API (the legacy `_template` endpoint is deprecated and would not show up in the verification below).
Test
Verify the index template was created
```json
GET _index_template/monthly_products
```

Create a new index matching the pattern (e.g., products-202305):

```json
PUT products-202305
```

Verify that the index was created with the expected settings and mappings:

```json
GET /products-202305/_settings
GET /products-202305/_mapping
```

Index a sample document and verify that the mapping is applied correctly:

Index:

```json
POST products-202305/_doc
{
  "name": "Product A",
  "description": "This is a sample product",
  "price": 19.99,
  "category": "Electronics"
}
```

Search:

```json
GET products-202305/_search
```
The response should show the correct mapping for the fields specified in the index template.
Considerations
- The `index_patterns` field specifies the pattern for index names to which this template should be applied.
- The `number_of_shards` and `number_of_replicas` settings are chosen based on the expected data volume and high availability requirements.
- The `text` type is used for the `name` and `description` fields to enable full-text search and analysis.
- The `float` type is used for the `price` field to support decimal values.
- The `keyword` type is used for the `category` field to prevent analysis and treat the values as exact matches.
Clean-up (optional)
In the console execute the following
```json
DELETE products-202305
DELETE _index_template/monthly_products
```
Documentation
Example 3: Creating an index template for log indices
Requirements
- The template should apply to any index starting with `logs-`.
- The template must define settings for three primary shards and one replica.
- The template should include mappings for fields `@timestamp`, `log_level`, and `message`.
Steps
Open the Kibana Console or use a REST client.
Create the index template
```json
PUT /_index_template/logs_template
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "index": {
        "number_of_shards": 3,
        "number_of_replicas": 1
      }
    },
    "mappings": {
      "properties": {
        "@timestamp": { "type": "date" },
        "log_level": { "type": "keyword" },
        "message": { "type": "text" }
      }
    }
  }
}
```
Test
Verify the index template was created
```json
GET _index_template/logs_template
```

Create a new index matching the pattern (e.g., logs-202405).

```json
PUT logs-202405
```

Verify that the index was created with the expected settings and mappings.

```json
GET /logs-202405/_settings
GET /logs-202405/_mapping
```

Index a sample document and verify that the mapping is applied correctly:

Index:

```json
POST logs-202405/_doc
{
  "@timestamp": "2024-05-16T12:34:56Z",
  "log_level": "ERROR",
  "message": "Help!"
}
```

Search:

```json
GET logs-202405/_search
```
The response should show the correct mapping for the fields specified in the index template.
Considerations
- Index Patterns: The template applies to any index starting with `logs-`, ensuring consistency across similar indices.
- Number of Shards: Three shards provide a balance between performance and resource utilization.
- Replicas: A single replica ensures high availability and fault tolerance.
- Mappings: Predefined mappings ensure that the fields are properly indexed and can be efficiently queried.
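
As a quick illustration of why these mappings matter, a typical log search combines an exact `term` match on `log_level` with a `range` filter on `@timestamp`. A minimal sketch:

```json
GET logs-202405/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "log_level": "ERROR" } },
        { "range": { "@timestamp": { "gte": "now-1d" } } }
      ]
    }
  }
}
```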
Clean-up (optional)
In the console execute the following
```json
DELETE logs-202405
DELETE _index_template/logs_template
```
Documentation
1.3 Task: Define and use a dynamic template that satisfies a given set of requirements
FYI: The difference between index templates and dynamic templates is:
An index template is a way to define settings, mappings, and other configurations that should be applied automatically to new indices when they are created. A dynamic template is part of the mapping definition within an index template or index mapping that allows Elasticsearch to dynamically infer the mapping of fields based on field names, data patterns, or the data type detected.
There is one example per matching method: field name patterns, data types, and data patterns. They all use an explicit dynamic template, but Example 1 also shows the use of a dynamic template embedded in the index definition.
Example 1: Create a Dynamic Template for Logging Using Field Name Patterns
Requirements
- Apply a specific text analysis to all fields that end with `_log`.
- Use a `keyword` type for all fields that start with `status_`.
- Default to `text` with a standard analyzer for other string fields.
- Define a custom `log_analyzer` for `_log` fields.
Steps
Open the Kibana Console or use a REST client.
Define the dynamic template
As part of the index definition:

```json
PUT /logs_index
{
  "mappings": {
    "dynamic_templates": [
      {
        "log_fields": {
          "match": "*_log",
          "mapping": { "type": "text", "analyzer": "log_analyzer" }
        }
      },
      {
        "status_fields": {
          "match": "status_*",
          "mapping": { "type": "keyword" }
        }
      },
      {
        "default_string": {
          "match_mapping_type": "string",
          "mapping": { "type": "text", "analyzer": "standard" }
        }
      }
    ]
  },
  "settings": {
    "analysis": {
      "analyzer": {
        "log_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "stop"]
        }
      }
    }
  }
}
```

Or as a standalone definition to be added to indices as needed via `index_patterns`:

```json
PUT /_index_template/logs_dyn_template
{
  "index_patterns": ["logs_*"],
  "template": {
    "mappings": {
      "dynamic_templates": [
        {
          "log_fields": {
            "match": "*_log",
            "mapping": { "type": "text", "analyzer": "log_analyzer" }
          }
        },
        {
          "status_fields": {
            "match": "status_*",
            "mapping": { "type": "keyword" }
          }
        },
        {
          "default_string": {
            "match_mapping_type": "string",
            "mapping": { "type": "text", "analyzer": "standard" }
          }
        }
      ]
    },
    "settings": {
      "analysis": {
        "analyzer": {
          "log_analyzer": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": ["lowercase", "stop"]
          }
        }
      }
    }
  }
}
```
Test
Verify the dynamic template was created
- If you used the embedded version:

```json
GET /logs_index/_mapping
```

- If you used the standalone version:

```json
GET /_index_template/logs_dyn_template
```

Create a new index matching the pattern `logs_*` (optional if you used the embedded version, which already created `logs_index`):

```json
PUT logs_index
```

Verify that the created index has the expected settings and mappings:

- Ensure `error_log` is of type `text` with `log_analyzer`
- Ensure `status_code` is of type `keyword`
- Ensure `message` is of type `text` with the `standard` analyzer

```json
GET /logs_index/_mapping
```

Index a sample document and verify that the mapping is applied correctly:

```json
POST /logs_index/_doc/1
{
  "error_log": "This is an error log message.",
  "status_code": "200",
  "message": "Regular log message."
}
```

Perform Searches:

- Search within `error_log` and verify the custom analyzer is applied:

```json
GET /logs_index/_search
{
  "query": {
    "match": { "error_log": "error" }
  }
}
```

- Check if `status_code` is searchable as a keyword:

```json
GET /logs_index/_search
{
  "query": {
    "term": { "status_code": "200" }
  }
}
```
Considerations
- The custom analyzer `log_analyzer` is used to provide specific tokenization and filtering for log fields.
- The `keyword` type for `status_*` fields ensures they are treated as exact values, useful for status codes.
- The `default_string` template ensures other string fields are analyzed with the `standard` analyzer, providing a balanced default.
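
To see how `log_analyzer` tokenizes input (lowercasing and removing stopwords), you can run the analyze API against the index. A minimal sketch:

```json
GET /logs_index/_analyze
{
  "analyzer": "log_analyzer",
  "text": "The ERROR was logged this morning"
}
```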
Clean-up (optional)
Delete the index
```json
DELETE logs_index
```

Delete the dynamic template.

```json
DELETE /_index_template/logs_dyn_template
```
Documentation
Example 2: Create Dynamic Template for Data Types
Requirements
- All string fields should be treated as `text` with a `standard` analyzer.
- All `long` fields should be treated as `integer`.
- All `date` fields should use a specific date format.
Steps
Open the Kibana Console or use a REST client.
Define the Dynamic Template
```json
PUT /_index_template/data_type_template
{
  "index_patterns": ["data_type_*"],
  "template": {
    "mappings": {
      "dynamic_templates": [
        {
          "strings_as_text": {
            "match_mapping_type": "string",
            "mapping": { "type": "text", "analyzer": "standard" }
          }
        },
        {
          "longs_as_integer": {
            "match_mapping_type": "long",
            "mapping": { "type": "integer" }
          }
        },
        {
          "dates_with_format": {
            "match_mapping_type": "date",
            "mapping": {
              "type": "date",
              "format": "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"
            }
          }
        }
      ]
    }
  }
}
```
Test
Verify the dynamic template was created
```json
GET /_index_template/data_type_template
```

Create a new index matching the pattern.

```json
PUT data_type_202405
```

Insert sample documents so that the dynamic template has fields to map.

```json
POST /data_type_202405/_bulk
{ "index": { "_index": "data_type_202405", "_id": "1" } }
{ "name": "Wireless Bluetooth Headphones", "release_date": "2024-05-28T14:35:00.000Z", "price": 99 }
{ "index": { "_index": "data_type_202405", "_id": "2" } }
{ "description": "Durable stainless steel water bottle", "launch_date": "2024-05-28T15:00:00.000Z", "quantity": 500 }
```

Check the field types (the mapping is populated once documents have been indexed):

- Verify that all string fields are mapped as `text` with the `standard` analyzer.
- Verify that all long fields are mapped as `integer`.
- Verify that all `date` fields are mapped with the correct format.

```json
GET /data_type_202405/_mapping
```

Perform Searches:

- Search `launch_date`:

```json
GET /data_type_202405/_search
{
  "query": {
    "query_string": { "query": "launch_date:\"2024-05-28T15:00:00.000Z\"" }
  }
}
```

- Check if `price` is searchable as a value:

```json
GET /data_type_202405/_search
{
  "query": {
    "query_string": { "query": "price: 99" }
  }
}
```
Considerations
- Dynamic Templates: Using dynamic templates based on data types allows for flexible and consistent field mappings without needing to know the exact field names in advance.
- Data Types: Matching on data types (string, long, date) ensures that fields are mapped appropriately based on their content.
- Date Format: Specifying the date format ensures that date fields are parsed correctly, avoiding potential issues with date-time representation.
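
Because `long` values are remapped to `integer`, numeric fields such as `price` support range queries and aggregations as usual. A minimal sketch:

```json
GET /data_type_202405/_search
{
  "query": {
    "range": { "price": { "lte": 100 } }
  }
}
```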
Clean-up (optional)
Delete the index
```json
DELETE data_type_202405
```

Delete the dynamic template.

```json
DELETE /_index_template/data_type_template
```
Documentation
Example 3: Create a Dynamic Template for Logging Data for Data Patterns
Requirements
- Automatically map fields that end with `_ip` as the `ip` type.
- Map fields that start with `timestamp_` as the `date` type.
- Map any field containing the word `keyword` as a `keyword` type.
- Use a custom analyzer for fields ending with `_text`.
Steps
Open the Kibana Console or use a REST client.
Create the dynamic template
```json
PUT /_index_template/logs_template
{
  "index_patterns": ["logs*"],
  "template": {
    "settings": {
      "analysis": {
        "analyzer": {
          "custom_analyzer": {
            "type": "standard",
            "stopwords": "_english_"
          }
        }
      }
    },
    "mappings": {
      "dynamic_templates": [
        {
          "ip_fields": {
            "match": "*_ip",
            "mapping": { "type": "ip" }
          }
        },
        {
          "date_fields": {
            "match": "timestamp_*",
            "mapping": { "type": "date" }
          }
        },
        {
          "keyword_fields": {
            "match": "*keyword*",
            "mapping": { "type": "keyword" }
          }
        },
        {
          "text_fields": {
            "match": "*_text",
            "mapping": { "type": "text", "analyzer": "custom_analyzer" }
          }
        }
      ]
    }
  }
}
```
Test
Verify the dynamic template was created
```json
GET /_index_template/logs_template
```

Create a new index matching the pattern.

```json
PUT logs_202405
```

Insert sample documents so that the dynamic template has fields to map.

```json
POST /logs_202405/_bulk
{ "index": { "_id": "1" } }
{ "source_ip": "192.168.1.1", "timestamp_event": "2024-05-28T12:00:00Z", "user_keyword": "elastic", "description_text": "This is a log entry." }
{ "index": { "_id": "2" } }
{ "destination_ip": "10.0.0.1", "timestamp_access": "2024-05-28T12:05:00Z", "log_keyword": "search", "details_text": "Another log entry." }
```

Check the field types:

- Verify that all `_ip` fields are mapped as `ip`
- Verify that all `timestamp_` fields are mapped as `date`
- Verify that all fields that contain the string `keyword` are mapped as `keyword`

```json
GET /logs_202405/_mapping
```

Perform Searches:

- Search `source_ip`:

```json
GET /logs_202405/_search
{
  "query": {
    "query_string": { "query": "source_ip:\"192.168.1.1\"" }
  }
}
```

- Check if `timestamp_event` is searchable as a date:

```json
GET /logs_202405/_search
{
  "query": {
    "query_string": { "query": "timestamp_event:\"2024-05-28T12:00:00Z\"" }
  }
}
```
Considerations
- The use of patterns in the dynamic template ensures that newly added fields matching the criteria are automatically mapped without the need for manual intervention.
- Custom analyzer configuration is critical for ensuring text fields are processed correctly, enhancing search capabilities.
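
One practical benefit of the `ip` type is that term queries accept CIDR notation, so whole subnets can be matched. A minimal sketch (the subnet is arbitrary):

```json
GET /logs_202405/_search
{
  "query": {
    "term": { "source_ip": "192.168.0.0/16" }
  }
}
```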
Clean-up (optional)
Delete the index
```json
DELETE logs_202405
```

Delete the dynamic template.

```json
DELETE /_index_template/logs_template
```
Documentation
1.4 Task: Define an Index Lifecycle Management policy for a timeseries index
Example 1: Creating an ILM policy for log data indices
Requirements
- Indices are prefixed with `logstash-`.
- Indices should be rolled over daily (create a new index every day).
- Old indices should be deleted after 30 days.
Steps using the Elastic/Kibana UI
Open the hamburger menu and click on Management > Data > Life Cycle Policies.
Press + Create New Policy.
Enter the following:
- Policy name: logstash-example-policy.
- Hot phase:
- Change Keep Data in the Phase Forever (the infinity icon) to Delete Data After This Phase (the trashcan icon).
- Click Advanced Settings.
- Unselect Use Recommended Defaults.
- Set Maximum Age to 1 day.
- Delete phase:
- Move data into phase when: 30 days old.
Press Save Policy.
Open the Kibana Console or use a REST client.
Create an index template that will match on indices that match the pattern `logstash-*`.

```json
PUT /_index_template/ilm_logstash_index_template
{
  "index_patterns": ["logstash-*"]
}
```

Return to the Management > Data > Life Cycle Policies page.
Press the plus sign (+) to the right of logstash-example-policy.
- The Add Policy “logstash-example-policy” to index template dialog opens.
- Click on the Index Template input field and type the first few letters of the index template created above.
- Select the template created above (ilm_logstash_index_template).
- Press Add Policy.
Open the Kibana Console or use a REST client.
List `ilm_logstash_index_template`. Notice the ILM policy is now part of the index template.

```json
GET /_index_template/ilm_logstash_index_template
```

Output from the `GET`:

```json
{
  "index_templates": [
    {
      "name": "ilm_logstash_index_template",
      "index_template": {
        "index_patterns": ["logstash-*"],
        "template": {
          "settings": {
            "index": {
              "lifecycle": { "name": "logstash-example-policy" }
            }
          }
        },
        "composed_of": []
      }
    }
  ]
}
```

Create an index.
```json
PUT logstash-2024.05.16
```

Verify the policy is there.

```json
GET logstash-2024.05.16
```

The output should look something like this:

```json
{
  "logstash-2024.05.16": {
    "aliases": {},
    "mappings": {},
    "settings": {
      "index": {
        "lifecycle": { "name": "logstash-example-policy" },
        "routing": {
          "allocation": {
            "include": { "_tier_preference": "data_content" }
          }
        },
        "number_of_shards": "1",
        "provided_name": "logstash-2024.05.16",
        "creation_date": "1717024100387",
        "priority": "100",
        "number_of_replicas": "1",
        "uuid": "mslAKuZGTpSDdFr4hSpAAA",
        "version": { "created": "8503000" }
      }
    }
  }
}
```
Steps Using the REST API (which I would not recommend)
Open the Kibana Console or use a REST client.
Create the ILM policy.
```json
PUT _ilm/policy/logstash-example-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_age": "1d" }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
```

Create an index template that includes the above policy. The two fields within `settings` are required.

```json
PUT /_index_template/ilm_logstash_index_template
{
  "index_patterns": ["logstash-*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "logstash-example-policy",
      "index.lifecycle.rollover_alias": "logstash"
    }
  }
}
```
Test
Verify the ILM policy exists in Kibana under Management > Data > Index Lifecycle Policies.
Verify the Index Lifecycle Management policy exists.

```json
GET /_ilm/policy/logstash-example-policy
```

Verify the policy is referenced in the index template.

```json
GET /_index_template/ilm_logstash_index_template
```

Create a new index that matches the pattern `logstash-*`.

```json
PUT /logstash-index
```

Verify the index has the policy in its definition.

```json
GET /logstash-index
```
Considerations
- The index template attaches the ILM policy and the rollover alias to every matching index.
- The rollover action creates a new index when the max_age is reached.
- The delete phase removes indices older than 30 days.
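
To see which phase an index is currently in, and why, the ILM explain API is useful. A minimal sketch:

```json
GET logstash-index/_ilm/explain
```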
Clean-up (optional)
Delete the index.
```json
DELETE logstash-index
```

Delete the index template.

```json
DELETE /_index_template/ilm_logstash_index_template
```

Delete the policy.

```json
DELETE /_ilm/policy/logstash-example-policy
```
Documentation
Example 2: Creating an ILM policy for logs indices retention for 7, 30 and 90 days
Requirements
- The policy should be named `logs-policy`.
- It should have a hot phase with a duration of 7 days.
- It should have a warm phase with a duration of 30 days.
- It should have a cold phase with a duration of 90 days.
- It should have a delete phase.
- The policy should be assigned to indices matching the pattern `ilm_logs_*`.
Steps using the Elastic/Kibana UI
Open the hamburger menu and click on Management > Data > Life Cycle Policies.
Press + Create New Policy.
Enter the following:
- Policy name: logs-policy.
- Hot phase:
- Press the garbage can icon to the right to delete data after this phase.
- Warm phase:
- Move data into phase when: 7 days old.
- Leave Delete data after this phase.
- Cold phase:
- Move data into phase when: 30 days old.
- Leave Delete data after this phase.
- Delete phase:
- Move data into phase when: 90 days old.
Press Save Policy.
Open the Kibana Console or use a REST client.
Create an index template that will match on indices that match the pattern `ilm_logs_*`.

```json
PUT /_index_template/ilm_logs_index_template
{
  "index_patterns": ["ilm_logs_*"]
}
```

Return to the Management > Data > Life Cycle Policies page.
Press the plus sign (+) to the right of logs-policy.
The Add Policy “logs-policy” to index template dialog opens.
Click on the Index Template input field and type the first few letters of the index template created above.
Select the template created above (`ilm_logs_index_template`).

Press Add Policy.
Open the Kibana Console or use a REST client.
List `ilm_logs_index_template`. Notice the ILM policy is now part of the index template.

```json
GET /_index_template/ilm_logs_index_template
```

Output from the `GET` (look for the settings/index/lifecycle node):

```json
{
  "index_templates": [
    {
      "name": "ilm_logs_index_template",
      "index_template": {
        "index_patterns": ["ilm_logs_*"],
        "template": {
          "settings": {
            "index": {
              "lifecycle": { "name": "logs-policy" }
            }
          }
        },
        "composed_of": []
      }
    }
  ]
}
```

List `logs-policy`.

```json
GET _ilm/policy/logs-policy
```

In the `in_use_by` node you will see:

```json
"in_use_by": {
  "indices": [],
  "data_streams": [],
  "composable_templates": [ "ilm_logs_index_template" ]
}
```
Steps Using the REST API (which I would not recommend)
Open the Kibana Console or use a REST client.
Create the ILM policy.
```json
PUT _ilm/policy/logs-policy
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "set_priority": { "priority": 100 }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "set_priority": { "priority": 50 }
        }
      },
      "cold": {
        "min_age": "30d",
        "actions": {
          "set_priority": { "priority": 0 }
        }
      },
      "delete": {
        "min_age": "90d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
```

Assign the policy to the indices matching the pattern `ilm_logs_*`.

```json
PUT /_index_template/ilm_logs_index_template
{
  "index_patterns": ["ilm_logs_*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "logs-policy",
      "index.lifecycle.rollover_alias": "logs"
    }
  }
}
```
Test
Verify the ILM policy exists in Kibana under Management > Data > Index Lifecycle Policies.
Verify the Index Lifecycle Management policy exists.

```json
GET /_ilm/policy/logs-policy
```

Verify the policy is referenced in the index template.

```json
GET /_index_template/ilm_logs_index_template
```

Create a new index that matches the pattern `ilm_logs_*`.

```json
PUT /ilm_logs_index
```

Verify the index has the policy in its definition.

```json
GET /ilm_logs_index
```
Considerations
- The ILM policy will manage the indices matching the pattern `ilm_logs_*`.
- The hot phase keeps the data at high priority for the first 7 days.
- The warm phase keeps the data at medium priority until it is 30 days old.
- The cold phase keeps the data at low priority until it is 90 days old, after which it is deleted.
- The ILM policy automatically moves the indices through the phases based on their age.
- The policy can be adjusted based on the needs of the application and the data.
Clean-up (optional)
Delete the index.
```json
DELETE ilm_logs_index
```

Delete the index template.

```json
DELETE _index_template/ilm_logs_index_template
```

Delete the policy.

```json
DELETE _ilm/policy/logs-policy
```
Documentation
Example 3: Creating an ILM policy for sensor data collected every hour, with daily rollover and retention for one month
Requirements
- Create a new index every day for sensor data (e.g., `sensor_data-{date}`).
- Automatically roll over to a new index when the current one reaches a specific size.
- Delete rolled-over indices after one month.
Steps using the Elastic/Kibana UI
Open the hamburger menu and click on Management > Data > Life Cycle Policies.
Press + Create New Policy.
Enter the following:
Policy name: sensor-data-policy
Hot phase:
- Change Keep Data in the Phase Forever (the infinity icon) to Delete Data After This Phase (the trashcan icon).
- Click Advanced Settings.
- Unselect Use Recommended Defaults.
- Set Maximum Age to 1 day.
- Set Maximum Index Size to 10 gigabytes.
Delete phase:
- Move data into phase when: 30 days old.
Press Save Policy.
Open the Kibana Console or use a REST client.
Create an index template that will match on indices that match the pattern `sensor_data-*`.

```json
PUT /_index_template/sensor_data_index_template
{
  "index_patterns": ["sensor_data-*"]
}
```

Return to the Management > Data > Life Cycle Policies page.
Press the plus sign (+) to the right of sensor-data-policy.
The Add Policy “sensor-data-policy” to index template dialog opens.
- Click on the Index Template input field and type the first few letters of the index template created above.
- Select the template created above (sensor_data_index_template).
- Press Add Policy.
Open the Kibana Console or use a REST client.
List sensor_data_index_template. Notice the ILM policy is now part of the index template.
```json
GET /_index_template/sensor_data_index_template
```

Output from the `GET`:

```json
{
  "index_templates": [
    {
      "name": "sensor_data_index_template",
      "index_template": {
        "index_patterns": ["sensor_data-*"],
        "template": {
          "settings": {
            "index": {
              "lifecycle": { "name": "sensor-data-policy" }
            }
          }
        },
        "composed_of": []
      }
    }
  ]
}
```

List sensor-data-policy.

```json
GET /_ilm/policy/sensor-data-policy
```

In the `in_use_by` node you will see:

```json
"in_use_by": {
  "indices": [],
  "data_streams": [],
  "composable_templates": [ "sensor_data_index_template" ]
}
```
OR
Steps Using the REST API (which I would not recommend)
Open the Kibana Console or use a REST client.
Define the ILM policy.
```json
PUT _ilm/policy/sensor-data-policy
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "rollover": {
            "max_age": "1d",
            "max_size": "10gb"
          }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
```

Assign the policy to the indices matching the pattern `sensor_data-*`.

```json
PUT /_index_template/sensor_data_index_template
{
  "index_patterns": ["sensor_data-*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "sensor-data-policy",
      "index.lifecycle.rollover_alias": "sensor"
    }
  }
}
```
Test
Verify the ILM policy exists in Kibana under Management > Data > Index Lifecycle Policies.
Verify the Index Lifecycle Management policy exists.

```json
GET /_ilm/policy/sensor-data-policy
```

Verify the policy is referenced in the index template.

```json
GET /_index_template/sensor_data_index_template
```

Create a new index that matches the pattern `sensor_data-*`.

```json
PUT /sensor_data-20240516
```

Verify the index has the policy in its definition.

```json
GET /sensor_data-20240516
```
Considerations
- The hot phase age and size thresholds determine the frequency of rollovers.
- The delete phase retention period defines how long rolled-over data is stored.
Clean-up (optional)
Delete the index.
```json
DELETE sensor_data-20240516
```

Delete the index template.

```json
DELETE /_index_template/sensor_data_index_template
```

Delete the policy.

```json
DELETE /_ilm/policy/sensor-data-policy
```
Documentation
1.5 Task: Define an index template that creates a new data stream
Data streams in Elasticsearch are used for managing time-series data such as logs, metrics, and events. They can handle large volumes of time-series data in an efficient and scalable manner.
An interesting aspect is that the creation of the data stream is pretty trivial. It normally looks like this in the index template that contains it:
```json
...
"data_stream" : {}
...
```
Yep, that’s it. The defaults take care of most circumstances.
Also, the data_stream must be declared in an index template because a data stream needs backing indices, and the template defines how those backing indices are created. The first write that targets a name matching the pattern in index_patterns creates the data stream, and the stream name then acts like an alias over the actual backing indices created for it.
Example 1: Creating an index template for continuously flowing application logs
Requirements
- Create a new data stream named `app_logs` to store application logs.
- Automatically create new backing indices within the data stream as needed.
Steps
Open the Kibana Console or use a REST client.
Define the index template that will be used by the data stream to create new backing indices.
```json
PUT _index_template/app_logs_index_template
{
  "index_patterns": ["app_logs*"],
  "data_stream": {}
}
```
Test
Verify the index template creation.
```json
GET _index_template/app_logs_index_template
```

Confirm there are no indices named `app_logs*`.

```json
GET /_cat/indices
```

Mock sending streaming data by just pushing a few documents to the stream. When sending documents using `_bulk`, they must use `create` instead of `index`. In addition, the documents must have a `@timestamp` field.

```json
POST app_logs/_bulk
{ "create":{} }
{ "@timestamp": "2099-05-06T16:21:15.000Z", "message": "192.0.2.42 - - [06/May/2099:16:21:15 +0000] \"GET /images/bg.jpg HTTP/1.0\" 200 24736" }
{ "create":{} }
{ "@timestamp": "2099-05-06T16:25:42.000Z", "message": "192.0.2.255 - - [06/May/2099:16:25:42 +0000] \"GET /favicon.ico HTTP/1.0\" 200 3638" }
```

The response will list the name of the automatically created index, which will look something like this:

```json
{
  "errors": false,
  "took": 8,
  "items": [
    {
      "create": {
        "_index": ".ds-app_logs-2099.05.06-000001",
        "_id": "OOazyo8BAvAOn4WaAfdD",
        "_version": 1,
        "result": "created",
        "_shards": { "total": 2, "successful": 1, "failed": 0 },
        "_seq_no": 2,
        "_primary_term": 1,
        "status": 201
      }
    },
    {
      "create": {
        "_index": ".ds-app_logs-2099.05.06-000001",
        "_id": "Oeazyo8BAvAOn4WaAfdD",
        "_version": 1,
        "result": "created",
        "_shards": { "total": 2, "successful": 1, "failed": 0 },
        "_seq_no": 3,
        "_primary_term": 1,
        "status": 201
      }
    }
  ]
}
```

Notice the name of the index is `.ds-app_logs-2099.05.06-000001` (it will probably be slightly different for you).

Run:

```json
GET /_cat/indices
```
You will see the new index listed. This is the backing index created by the data stream.
Check for the app_logs data stream under Management > Data > Index Management > Data Streams.
Verify that the documents were indexed.
```json
GET app_logs/_search
```

Notice in the results that `_index` has a different name than `app_logs`.

You can also run the following (using the backing index name your cluster created).

```json
GET .ds-app_logs-2024.07.25-000001/_search
```
Considerations
- Data streams provide a more efficient way to handle continuously flowing data compared to daily indices. They are created implicitly through the use of index templates, and when streaming data with the _bulk API you must use `create` actions rather than `index`.
- New backing indices are automatically created within the data stream as needed.
- Lifecycle management policies can be applied to data streams for automatic deletion of older backing indices.
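
The data stream itself, including its current backing indices and generation, can be inspected directly. A minimal sketch:

```json
GET /_data_stream/app_logs
```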
Clean-up (optional)
Delete the data stream (deleting the data stream will also delete the backing index).
```json
DELETE /_data_stream/app_logs
```

Delete the index template.

```json
DELETE _index_template/app_logs_index_template
```
Documentation
Example 2: Creating an index template for continuously flowing application logs with defined fields
Requirements
- The template should apply to any index matching the pattern `logs*`.
- The template must create a data stream.
- The template should define settings for two primary shards and one replica.
- The template should include mappings for fields `@timestamp`, `log_level`, and `message`.
Steps
Open the Kibana Console or use a REST client.
Create the index template
```json
PUT _index_template/log_application_index_template
{
  "index_patterns": ["logs*"],
  "data_stream": {},
  "template": {
    "settings": {
      "number_of_shards": 2,
      "number_of_replicas": 1
    },
    "mappings": {
      "properties": {
        "@timestamp": { "type": "date" },
        "log_level": { "type": "keyword" },
        "message": { "type": "text" }
      }
    }
  }
}
```
Test
Verify the index template creation
```json
GET _index_template/log_application_index_template
```

Confirm there are no indices named `logs*`.

```json
GET /_cat/indices
```

Index documents into the data stream.

```json
POST /logs/_doc
{
  "@timestamp": "2024-05-16T12:34:56",
  "log_level": "info",
  "message": "Test log message"
}
```

This will return a result with the name of the backing index.

```json
{
  "_index": ".ds-logs-2024.05.16-000001",   // yours will be different
  "_id": "PObWyo8BAvAOn4WaC_de",
  "_version": 1,
  "result": "created",
  "_shards": { "total": 2, "successful": 1, "failed": 0 },
  "_seq_no": 0,
  "_primary_term": 1
}
```

Run:

```json
GET /_cat/indices
```

The index will be listed.

Confirm the configuration of the backing index matches the index template (your backing index name will be different).

```json
GET .ds-logs-2024.05.16-000001
```

Run a search for the document that was indexed.

```json
GET .ds-logs-2024.05.16-000001/_search
```
Considerations
- Data streams provide a more efficient way to handle continuously flowing data compared to daily indices. They are created implicitly through the use of index templates, and when streaming data with the _bulk API you must use `create` actions rather than `index`.
- New backing indices are automatically created within the data stream as needed.
- Lifecycle management policies can be applied to data streams for automatic deletion of older backing indices (not shown but there is an example at Set Up a Data Stream).
Clean-up (optional)
Delete the data stream (deleting the data stream will also delete the backing index)
```json
DELETE _data_stream/logs
```

Delete the index template.

```json
DELETE _index_template/log_application_index_template
```
Documentation
Example 3: Creating a metrics data stream for application performance monitoring
Requirements
- Create an index template named `metrics_template`.
- The template should create a new data stream for indices named `metrics-{suffix}`.
- The template should have one shard and one replica.
- The template should have a mapping for the `metric` field as a `keyword` data type.
- The template should have a mapping for the `value` field as a `float` data type.
Steps
Open the Kibana Console or use a REST client.
Create the index template.
```json
PUT _index_template/metrics_template
{
  "index_patterns": ["metrics-*"],
  "data_stream": {},
  "template": {
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 1
    },
    "mappings": {
      "properties": {
        "metric": { "type": "keyword" },
        "value": { "type": "float" }
      }
    }
  }
}
```
Test
Verify the index template creation.
```json
GET _index_template/metrics_template
```

Confirm there are no indices named `metrics-*`.

```json
GET /_cat/indices
```

Index documents into the data stream.

```json
POST /metrics-ds/_doc
{
  "@timestamp": "2024-05-16T12:34:56",
  "metric": "cpu",
  "value": 0.5
}
```

Notice the use of the `@timestamp` field. That is required for any documents going into a data stream.

This will return a result with the name of the backing index.

```json
{
  "_index": ".ds-metrics-ds-2024.05.16-000001",   // yours will be different
  "_id": "P-YFy48BAvAOn4WaUvef",
  "_version": 1,
  "result": "created",
  "_shards": { "total": 2, "successful": 1, "failed": 0 },
  "_seq_no": 1,
  "_primary_term": 1
}
```

Run:

```json
GET /_cat/indices
```

The index will be listed.

Confirm the configuration of the backing index matches the index template (your backing index name will be different).

```json
GET .ds-metrics-ds-2024.05.16-000001
```

Run a search for the document that was indexed.

```json
GET .ds-metrics-ds-2024.05.16-000001/_search
```
Considerations
- The `keyword` data type is chosen for the `metric` field to enable exact matching and filtering.
- The `float` data type is chosen for the `value` field to enable precise numerical calculations.
- One shard and one replica are chosen for simplicity and development purposes; in a production environment, this would depend on the expected data volume and search traffic.
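
Since `metric` is a `keyword` and `value` is a `float`, per-metric statistics are straightforward. A minimal sketch that averages `value` per metric name:

```json
GET metrics-ds/_search
{
  "size": 0,
  "aggs": {
    "by_metric": {
      "terms": { "field": "metric" },
      "aggs": {
        "avg_value": { "avg": { "field": "value" } }
      }
    }
  }
}
```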
Clean-up (optional)
Delete the data stream (deleting the data stream will also delete the backing index).
```json
DELETE _data_stream/metrics-ds
```

Delete the index template.

```json
DELETE _index_template/metrics_template
```
Documentation
Example 4: Defining a Data Stream with Specific Lifecycle Policies
Requirements
- Create an index template named `logs_index_template`.
- Create a data stream named `logs_my_app_production`.
- Configure the data lifecycle:
  - Data is hot for 3 minutes.
  - Data rolls to warm immediately after 3 minutes.
  - Data is warm for 5 minutes.
  - Data rolls to cold after 5 minutes.
  - Data is deleted 10 minutes after rolling to cold.
Steps
Create the Index Template:

- Define an index template named `logs_index_template` that matches the data stream `logs_my_app_production`.

```json
PUT _index_template/logs_index_template
{
  "index_patterns": ["logs_my_app_production*"],
  "data_stream": {}
}
```
Create the ILM Policy using the Elastic/Kibana UI
Open the hamburger menu and click on Management > Data > Index Life Cycle Policies.
Press + Create New Policy.
Enter the following:
- Policy name: logs-policy
- Hot phase:
- Advanced Settings > Use Recommended Defaults (disable) > Maximum Age: 3 minutes
- Warm phase (enable):
  - Move data into phase when: 3 minutes old.
  - Leave Delete data after this phase.
- Cold phase:
  - Move data into phase when: 8 minutes old (3 minutes in hot plus 5 minutes in warm).
  - Leave Delete data after this phase.
- Delete phase:
  - Move data into phase when: 18 minutes old (10 minutes after moving to cold).
Press Save Policy.
Management > Data > Index Life Cycle Policies > [plus sign]
Add Policy “logs-policy” to Index Template > Index Template: logs_index_template > Add Policy
OR
- Create the ILM Policy:
  - Define an Index Lifecycle Management (ILM) policy named `logs-policy` to manage the data lifecycle (matching the name used in the UI steps and the clean-up below).

```json
PUT _ilm/policy/logs-policy
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "rollover": { "max_age": "3m" }
        }
      },
      "warm": {
        "min_age": "3m",
        "actions": {
          "set_priority": { "priority": 50 }
        }
      },
      "cold": {
        "min_age": "8m",
        "actions": {
          "set_priority": { "priority": 0 }
        }
      },
      "delete": {
        "min_age": "18m",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
```
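
On the REST path the policy still has to be attached to the index template (the UI path does this for you). Following the pattern from the earlier examples, one way is to re-create the template with the lifecycle setting included:

```json
PUT _index_template/logs_index_template
{
  "index_patterns": ["logs_my_app_production*"],
  "data_stream": {},
  "template": {
    "settings": {
      "index.lifecycle.name": "logs-policy"
    }
  }
}
```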
- Create the Data Stream:
  - Create the data stream explicitly (indexing a first document into a matching name would also create it automatically):

```json
PUT /_data_stream/logs_my_app_production
```
Test
- Index Sample Data:
  - Index some sample documents into the data stream to ensure it is working correctly.

```json
POST /logs_my_app_production/_doc
{
  "message": "This is a test log entry",
  "@timestamp": "2024-07-10T23:00:00Z"
}
```

- Verify ILM Status:
  - Check the lifecycle status of the data stream's backing indices to ensure the policy is being applied correctly (the ILM explain API is addressed to the index or data stream, not the policy).

```json
GET logs_my_app_production/_ilm/explain
```

- Monitor Data Lifecycle:
  - Monitor the data stream to ensure that documents transition through the hot, warm, cold, and delete phases as expected.
Considerations
- The `rollover` action in the hot phase ensures that the index rolls over after 3 minutes.
- The `set_priority` action in the warm and cold phases helps manage resource allocation.
- The `delete` action in the delete phase ensures that data is deleted 10 minutes after rolling to cold.
Clean-up (Optional)
Delete the data stream, the index template, and the policy to clean up the resources.

```json
DELETE /_data_stream/logs_my_app_production
DELETE /_index_template/logs_index_template
DELETE /_ilm/policy/logs-policy
```