2 Searching Data

2.1 Task: Write and execute a search query for terms and/or phrases in one or more fields of an index

The following section will have only one full example, but will show variations of term and phrase queries. Also, bear in mind that when they say term they may not mean the Elasticsearch use of the word, but rather the generic search use of the word. There are a lot of ways to execute a search in Elasticsearch. Don’t get bogged down; focus on term and phrase searches for this section of the example.

Example 1: Write and execute a basic term and phrase search

Requirements

Create an index
Index some documents
Execute a term query
Execute a phrase query

Steps

Open the Kibana Console or use a REST client.

Index some documents which will create an index at the same time. The Elastic Console doesn’t like properly formatted documents when calling _bulk so they need to be tightly packed.

POST /example_index/_bulk
{ "index": {} }
{ "title": "The quick brown fox", "text": "The quick brown fox jumps over the lazy dog." }
{ "index": {} }
{ "title": "Fast and curious", "text": "A fast and curious fox was seen leaping over a lazy dog." }
{ "index": {} }
{ "title": "A fox in action", "text": "In a remarkable display of agility, a quick fox effortlessly jumped over a dog." }
{ "index": {} }
{ "title": "Wildlife wonders", "text": "Observers were amazed as the quick brown fox jumped over the lazy dog." }
{ "index": {} }
{ "title": "Fox tales", "text": "The tale of the quick fox that jumped over the lazy dog has become a legend." }

Execute a term query

Use the GET method to search for documents using 3 different term queries (there are 10 different ways currently. Refer to the Term-level Queries documentation for the full list).

GET example_index/_search
{
  "query": {
    "term": {
      "title": {
        "value": "quick"
      }
    }
  }
}

GET example_index/_search
{
  "query": {
    "terms": {
      "text": ["display", "amazed"]
    }
  }
}

Execute a phrase query

returns 2 docs

GET /example_index/_search
{
  "query": {
    "match_phrase": {
      "text": "quick brown fox"
    }
  }
}

returns 1 doc

GET /example_index/_search
{
  "query": {
    "match_phrase_prefix": {
      "text": "fast and curi"
    }
  }
}

returns 1 doc

GET /example_index/_search
{
  "query": {
    "query_string": {
      "default_field": "text",
      "query": "\"fox jumps\""
    }
  }
}

Considerations

The default standard analyzer (lowercasing, whitespace tokenization, basic normalization) is used.
The term query is used for exact matches and is not analyzed, meaning it matches the exact term in the inverted index.
The match_phrase query analyzes the input text and matches it as a phrase, making it useful for finding exact sequences of terms.

Test

Verify the various queries return the proper results.

Clean-up (optional)

Delete the example index
```
DELETE example_index
```

Documentation

Example 2: Boosting Document Score When an Additional Field Matches

Requirements

Perform a search for beverage OR bar
Boost the score of documents if the value snack exists in the tags field.

Steps

Index Sample Documents Using _bulk Endpoint:

Index documents with fields such as name, description, and tags.

POST /products/_bulk
{ "index": { "_id": "1" } }
{ "name": "Yoo-hoo Beverage", "description": "A delicious, chocolate-flavored drink.", "tags": ["beverage", "chocolate"] }
{ "index": { "_id": "2" } }
{ "name": "Apple iPhone 12", "description": "The latest iPhone model with advanced features.", "tags": ["electronics", "smartphone"] }
{ "index": { "_id": "3" } }
{ "name": "Choco-Lite Bar", "description": "A light and crispy chocolate snack bar.", "tags": ["snack", "chocolate"] }
{ "index": { "_id": "4" } }
{ "name": "Samsung Galaxy S21", "description": "A powerful smartphone with an impressive camera.", "tags": ["electronics", "smartphone"] }
{ "index": { "_id": "5" } }
{ "name": "Nike Air Max 270", "description": "Comfortable and stylish sneakers.", "tags": ["footwear", "sportswear"] }

Perform the query_string Query with Boosting:

Use a query_string query to create an OR condition within the query.
Use a function_score query to boost the score of documents where the tags field contains a specific value (e.g., "chocolate").

GET /products/_search
{
  "query": {
    "function_score": {
      "query": {
        "query_string": {
          "query": "beverage OR bar"
        }
      },
      "functions": [
        {
          "filter": {
            "term": { "tags": "snack" }
          },
          "weight": 2
        }
      ],
      "boost_mode": "multiply"
    }
  }
}

Test

Run the above search query.
Run the following query (which is missing the filter function)

GET /products/_search
{
  "query": {
    "query_string": {
      "query": "beverage OR bar"
    }
  }
}

Check the boosted output to ensure that documents containing "snack" in the tags field have a higher score, and that documents are matched based on the OR condition in the query_string.

Considerations

The query_string query allows you to use a query syntax that includes operators such as OR, AND, and NOT to combine different search criteria.
The function_score query is used to boost the score of documents based on specific conditions—in this case, whether the tags field contains the value "snack".
The weight parameter in the function_score query determines the amount by which the score is boosted, and the boost_mode of "multiply" multiplies the original score by the boost value.

Clean-up (optional)

Delete the example index
```
DELETE products
```

Documentation

2.2 Task: Write and execute a search query that is a Boolean combination of multiple queries and filters

Example 1: Creating a Boolean search for documents in a book index

Requirements

Search for documents with a term in the “title”, “description”, and “category” field

Steps

Open the Kibana Console or use a REST client.

Index some documents which will create an index at the same time. The Elastic Console doesn’t like properly formatted documents when calling _bulk so they need to be tightly packed.

POST /books/_bulk
{ "index": { "_id": "1" } }
{ "title": "To Kill a Mockingbird", "description": "A novel about the serious issues of rape and racial inequality.", "category": "Fiction" }
{ "index": { "_id": "2" } }
{ "title": "1984", "description": "A novel that delves into the dangers of totalitarianism.", "category": "Dystopian" }
{ "index": { "_id": "3" } }
{ "title": "The Great Gatsby", "description": "A critique of the American Dream.", "category": "Fiction" }
{ "index": { "_id": "4" } }
{ "title": "Moby Dick", "description": "The quest of Ahab to exact revenge on the whale Moby Dick.", "category": "Adventure" }
{ "index": { "_id": "5" } }
{ "title": "Pride and Prejudice", "description": "A romantic novel that also critiques the British landed gentry at the end of the 18th century.", "category": "Romance" }

Create a boolean search query. The order in which the various clauses are added don’t matter to the final result.
```
GET books/_search
{
  "query": {
    "bool": {}
  }
}
```

Add a must query for the description field. This will return 4 documents.

GET books/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "terms": {
            "description": [
              "novel",
              "dream",
              "critique"
            ]
          }
        }
      ]
    }
  }
}

Add a filter query for the category field. This will return 2 documents.

GET books/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "terms": {
            "description": [
              "novel",
              "dream",
              "critique"
            ]
          }
        }
      ],
      "filter": [
        {
          "term": {
            "category": "fiction"
          }
        }
      ]
    }
  }
}

Add a must_not filter for the title field. This will return 1 document.

GET books/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "terms": {
            "description": [
              "novel",
              "dream",
              "critique"
            ]
          }
        }
      ],
      "filter": [
        {
          "term": {
            "category": "fiction"
          }
        }
      ],
      "must_not": [
        {
          "term": {
            "title": {
              "value": "gatsby"
            }
          }
        }
      ]
    }
  }
}

Considerations

The bool query allows for combining multiple queries and filters with Boolean logic.
The must, must_not, and filter clauses ensure that all searches and filters must match for a document to be returned.

Test

Verify that the search query returns documents with the term “novel”, “dream”, and “critique” in the description field. Why are there no documents with the term “critique”?

Clean-up (optional)

Delete the index
```
DELETE books
```

Documentation

Example 2: Creating a Boolean search for finding products within a specific price range and excluding discontinued items

Requirements

Find all documents where the name field exists (name: \*) and the price field falls within a specified range.
Additionally, filter out any documents where the discontinued field is set to true.

Steps

Open the Kibana Console or use a REST client.

Index some documents which will create an index at the same time. The Elastic Console doesn’t like properly formatted documents when calling _bulk so they need to be tightly packed.

POST /products/_bulk
{"index":{"_id":1}}
{"name":"Coffee Maker","price":49.99,"discontinued":false}
{"index":{"_id":2}}
{"name":"Gaming Laptop","price":1299.99,"discontinued":false}
{"index":{"_id":3}}
{"name":"Wireless Headphones","price":79.99,"discontinued":true}
{"index":{"_id":4}}
{"name":"Smartwatch","price":249.99,"discontinued":false}

Construct the first search query (the name field exists and the price field falls within a specified range)

GET products/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "exists": {
            "field": "name"
          }
        },
        {
          "range": {
            "price": {
              "gte": 70,
              "lte": 500
            }
          }
        }
      ]
    }
  }
}

Construct the second search query (same as above, but check if discontinued is set to true)

GET products/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "exists": {
            "field": "name"
          }
        },
        {
          "range": {
            "price": {
              "gte": 70,
              "lte": 500
            }
          }
        }
      ],
      "must_not": [
        {
          "term": {
            "discontinued": {
              "value": "true"
            }
          }
        }
      ]
    }
  }
}

Explanation

Similar to the previous example, the bool query combines multiple conditions.
The must clause specifies documents that must match all conditions within it.
The range query ensures the price field is between $70 (inclusive) and $500 (inclusive).
The must_not clause excludes documents that match the specified criteria.
The term query filters out documents where discontinued is set to true.

Test

Run the search query and verify the results only include documents for products with:
- A price between $70 and $500 (inclusive).
- discontinued set to true (not discontinued).

This should return a single document with an ID of 4 (Smartwatch) based on the sample data.

Considerations

The chosen price range (gte: 70, lte: 500) can be adjusted based on your specific needs.
You can modify the match query for name to use more specific criteria if needed.

Clean-up (optional)

Delete the index
```
DELETE products
```

Documentation

Example 3: Creating a Boolean search for e-commerce products

Requirements

Search for products that belong to the “Electronics” category.
The product name should contain the term “phone”.
Exclude products with a price greater than 500.

Steps

Open the Kibana Console or use a REST client.

Create an index.

PUT products
{
  "mappings": {
    "properties": {
      "name" : {
        "type": "text"
      },
      "category" : {
        "type": "text"
      },
      "price" : {
        "type": "float"
      }
    }
  }
}

Index some documents which will create an index at the same time. The Elastic Console doesn’t like properly formatted documents when calling _bulk so they need to be tightly packed.

POST /products/_bulk
{"index": { "_id": 1 } }
{ "name": "Smartphone X", "category": "Electronics", "price": 399.99 }
{"index": { "_id": 2 } }
{ "name": "Laptop Y", "category": "Electronics", "price": 799.99 }
{"index": { "_id": 3 } }
{ "name": "Headphones Z", "category": "Electronics", "price": 99.99 }
{"index": { "_id": 4 } }
{ "name": "Gaming Console", "category": "Electronics", "price": 299.99 }

Create a term query that only matches the category “electronics”. This returns all 4 documents.

GET products/_search
{
  "query": {
    "term": {
      "category": {
        "value": "electronics"
      }
    }
  }
}

Create another query using wildcard to return docs that includes “phone”. This returns only 2 documents.

GET products/_search
{
  "query": {
    "wildcard": {
      "name": {
        "value": "*phone*"
      }
    }
  }
}

Create another query using range that returns docs with any price less than $500. This returns 3 documents.

GET products/_search
{
  "query": {
    "range": {
      "price": {
        "lt": 500
      }
    }
  }
}

Combine the above into one bool query with a single must that contains the three queries. This will return the 2 matching documents.

GET products/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "category": {
              "value": "electronics"
            }
          }
        },
        {
          "wildcard": {
            "name": {
              "value": "*phone*"
            }
          }
        },
        {
          "range": {
            "price": {
              "lt": 500
            }
          }
        }
      ]
    }
  }
}

Test

The search results should include the following documents:
- Smartphone X
- Headphones Z

Considerations

The term query is used for matches on the category field.
The wildcard query is used for matches on the name field.
The range query is used to filter out documents based on price.
The bool.must query combines these conditions using the specified occurrence types.

Clean-up (optional)

Delete the index
```
DELETE products
```

Documentation

Example 4: Creating a Boolean search for e-commerce products

Requirements

Create an index named “products”.
Create at least 4 documents with varying categories, prices, ratings, and brands.
Create a boolean query
- Use the must:
  - return just electronics
  - products more than $500
- Use must_not:
  - rating less than 4
- Use filter:
  - only Apple products

Steps

Open the Kibana Console or use a REST client.

Create the “products” index

PUT products
{
  "mappings": {
    "properties": {
      "brand": {
        "type": "text"
      },
      "category": {
        "type": "keyword"
      },
      "name": {
        "type": "text"
      },
      "price": {
        "type": "long"
      },
      "rating": {
        "type": "float"
      }
    }
  }
}

Add some sample documents using the _bulk endpoint.

POST /products/_bulk
{"index":{"_id":1}}
{"name":"Laptop","category":"Electronics","price":1200,"rating":4.5,"brand":"Apple"}
{"index":{"_id":2}}
{"name":"Smartphone","category":"Electronics","price":800,"rating":4.2,"brand":"Samsung"}
{"index":{"_id":3}}
{"name":"Sofa","category":"Furniture","price":1000,"rating":3.8,"brand":"IKEA"}
{"index":{"_id":4}}
{"name":"Headphones","category":"Electronics","price":150,"rating":2.5,"brand":"Sony"}
{"index":{"_id":5}}
{"name":"Dining Table","category":"Furniture","price":600,"rating":4.1,"brand":"Ashley"}

Create a term query that only matches the category “electronics”. This returns 3 documents.

GET products/_search
{
  "query": {
    "term": {
      "category": {
        "value": "electronics"
      }
    }
  }
}

Create a range query to return products whose price is greater than $500. This should return 4 documents (why?).

GET products/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 500
      }
    }
  }
}

Create another range query to return products with a rating less than 4. This will return 2 documents.

GET products/_search
{
  "query": {
    "range": {
      "rating": {
        "lt": 4
      }
    }
  }
}

Create another term query to return only Apple branded products. This will return 2 documents.

GET products/_search
{
  "query": {
    "term": {
      "brand": {
        "value": "apple"
      }
    }
  }
}

Assemble the bool query by placing each query in their appropriate must, must_not and filter node.

GET products/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "category": {
              "value": "electronics"
            }
          }
        },
        {
          "range": {
            "price": {
              "gte": 500
            }
          }
        }
      ],
      "must_not": [
        {
          "range": {
            "rating": {
              "lt": 4
            }
          }
        }
      ],
      "filter": [
        {
          "term": {
            "brand": {
              "value": "apple"
            }
          }
        }
      ]
    }
  }
}

Test

Check the response from the search query to ensure that it returns the expected documents
- products in the “Electronics” category
- a price greater than $500
- excluding products with a rating less than 4
- from the brand “Apple”

Considerations

The filter clause is used to include only documents with the brand “Apple”.

Clean-up (optional)

Delete the index
```
DELETE products
```

Documentation

2.3 Task: Create an asynchronous search

Asynchronous search uses the same parameters as regular search with a few extra features listed here. For example, in the solution below the documentation for the size option is here. There is only one example here as you can look up the other options as needed during the exam.

Example 1: Executing an asynchronous search on a large log index

Requirements

An Elasticsearch index named “logs” with a large number of documents (e.g., millions of log entries).
Perform a search on the “logs” index that may take a long time to complete due to the size of the index.
Retrieve the search results asynchronously without blocking the client.

Steps

Open the Kibana Console or use a REST client.
If you were submitting a normal/synchronous search to an index called logs your request would look something like this:
```
POST /logs/_search
{
  "query": {
    "match_all": {}
  },
  "size": 10000
}
```

To turn your request into an asynchronous search request turn _search to _async_search

POST /logs/_async_search
{
  "query": {
    "match_all": {}
  },
  "size": 10000
}

This request will return an id and a response object containing partial results if available.

Check the status of the asynchronous search using the id.
```
GET /_async_search/status/{id}
```
Retrieve the search results using the id.
```
GET /_async_search/{id}
```

Test

Index a large number of sample log documents or use an index with a large number of documents.
Execute the asynchronous search request and store the returned id.
Periodically check the status of the search using the id and the /_async_search/status/{id} endpoint.
```
GET /_async_search/status/{id}
```
Once the search is complete, retrieve the final results using the id and the /_async_search/{id} endpoint.

GET /_async_search/{id}

Considerations

The _async_search endpoint is used to submit an asynchronous search request.
The id returned by the initial request is used to check the status and retrieve the final results.
Asynchronous search is useful for long-running searches on large datasets, as it doesn’t block the client while the search is being processed.

Clean-up (optional)

If you created an index (for example, logs) for this example you might want to delete it.
```
DELETE logs
```

Documentation

2.4 Task: Write and execute metric and bucket aggregations

Example 1: Creating Metric and Bucket Aggregations for Product Prices

Requirements

Create an index called product_prices.
Index at least four documents using the _bulk endpoint.
Execute metric and bucket aggregations in a single
- bucket the category field
- calculate the average price per bucket
- find the maximum price per bucket
- find the minimum price per bucket

Steps

Open the Kibana Console or use a REST client.

Ensure you have access to Kibana or any REST client to execute the following requests.

Create an index with the following schema (needed for the aggregations to work properly).

PUT product_prices
{
  "mappings": {
    "properties": {
      "product": {
        "type": "text"
      },
      "category": {
        "type": "keyword"
      },
      "price": {
        "type": "double"
      }
    }
  }
}

Index documents.

POST /product_prices/_bulk
{ "index": { "_id": "1" } }
{ "product": "Elasticsearch Guide", "category": "Books", "price": 29.99 }
{ "index": { "_id": "2" } }
{ "product": "Advanced Elasticsearch", "category": "Books", "price": 39.99 }
{ "index": { "_id": "3" } }
{ "product": "Elasticsearch T-shirt", "category": "Apparel", "price": 19.99 }
{ "index": { "_id": "4" } }
{ "product": "Elasticsearch Mug", "category": "Apparel", "price": 12.99 }

Execute a simple aggregation (should return 2 buckets).

GET product_prices/_search
{
  "size": 0,
  "aggs": {
    "category_buckets": {
      "terms": {
        "field": "category"
      }
    }
  }
}

Add and execute a single sub-aggregation to determine the average price per category (bucket).

GET product_prices/_search
{
  "size": 0,
  "aggs": {
    "category_buckets": {
      "terms": {
        "field": "category"
      },
      "aggs": {
        "avg_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}

Add min and max sub-aggregations and execute the query.

GET product_prices/_search
{
  "size": 0,
  "aggs": {
    "category_buckets": {
      "terms": {
        "field": "category"
      },
      "aggs": {
        "avg_price": {
          "avg": {
            "field": "price"
          }
        },
        "min_price" : {
          "min": {
            "field": "price"
          }
        },
        "max_price": {
          "max": {
            "field": "price"
          }
        }
      }
    }
  }
}

Test

Verify the index creation.
```
GET /product_prices
```
Verify the documents have been indexed.
```
GET /product_prices/_search
```

Execute the aggregation query and verify the results.

{
  ...
  "aggregations": {
    "category_buckets": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "Apparel",
          "doc_count": 2,
          "avg_price": {
            "value": 16.49
          },
          "min_price": {
            "value": 12.99
          },
          "max_price": {
            "value": 19.99
          }
        },
        {
          "key": "Books",
          "doc_count": 2,
          "avg_price": {
            "value": 34.99
          },
          "min_price": {
            "value": 29.99
          },
          "max_price": {
            "value": 39.99
          }
        }
      ]
    }
  }
}

Considerations

The category field must be of type keyword.
The terms aggregation creates buckets for each unique category.
The avg, min, and max sub-aggregations calculate the average, minimum, and maximum prices within each category bucket.
Setting size to 0 ensures that only aggregation results are returned, not individual documents.

Clean-up (optional)

Delete the index.
```
DELETE product_prices
```

Documentation

Example 2: Creating Metric and Bucket Aggregations for Website Traffic

Requirements

Create a new index with four documents representing website traffic data.
Aggregate the following:
- Group traffic by country.
- Calculate the total page views.
- Calculate the average page views per country.

Steps

Open the Kibana Console or use a REST client.

Create a new index.

PUT traffic
{
  "mappings": {
    "properties": {
      "country": {
        "type": "keyword"
      },
      "page_views": {
        "type": "long"
      }
    }
  }
}

Add four documents representing website traffic data.

POST /traffic/_bulk
{"index":{}}
{"country":"USA","page_views":100}
{"index":{}}
{"country":"USA","page_views":200}
{"index":{}}
{"country":"Canada","page_views":50}
{"index":{}}
{"country":"Canada","page_views":75}

Execute the bucket aggregation for country (should return 2 buckets).

GET traffic/_search
{
  "size": 0,
  "aggs": {
    "country_bucket": {
      "terms": {
        "field": "country"
      }
    }
  }
}

Add the sum aggregation for total page_views (should return 1 aggregation).

GET traffic/_search
{
  "size": 0,
  "aggs": {
    "country_bucket": {
      "terms": {
        "field": "country"
      }
    },
    "total_page_views": {
      "sum": {
        "field": "page_views"
      }
    }
  }
}

Add a sub-aggregation for average page_views per country (should appear in 2 buckets).

GET traffic/_search
{
  "size": 0,
  "aggs": {
    "country_bucket": {
      "terms": {
        "field": "country"
      },
      "aggs": {
        "avg_page_views": {
          "avg": {
            "field": "page_views"
          }
        }
      }
    },
    "total_page_views": {
      "sum": {
        "field": "page_views"
      }
    }
  }
}

Test

Verify the index creation.
```
GET /traffic
```
Verify the documents have been indexed.
```
GET /traffic/_search
```

Verify that the total page views are calculated correctly (should be 425).

GET /traffic/_search
{
  "aggs": {
    "total_page_views": {
      "sum": {
        "field": "page_views"
      }
    }
  }
}

Verify that the traffic is grouped correctly by country and average page views are calculated.

GET /traffic/_search
{
  "aggs": {
    "traffic_by_country": {
      "terms": {
        "field": "country"
      },
      "aggs": {
        "avg_page_views": {
          "avg": {
            "field": "page_views"
          }
        }
      }
    }
  }
}

Response:

{
  ...
  "aggregations": {
    "country_bucket": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "Canada",
          "doc_count": 2,
          "avg_page_views": {
            "value": 62.5
          }
        },
        {
          "key": "USA",
          "doc_count": 2,
          "avg_page_views": {
            "value": 150
          }
        }
      ]
    },
    "total_page_views": {
      "value": 425
    }
  }
}

Considerations

The country field must be of type keyword.
The terms bucket aggregation is used to group traffic by country.
The sum metric aggregation is used to calculate the total page views.
The avg metric aggregation is used to calculate the average page views per country.

Clean-up (optional)

Delete the index.
```
DELETE traffic
```

Documentation

Example 3: Creating Metric and Bucket Aggregations for Analyzing Employee Salaries

Requirements

An Elasticsearch index named employees with documents containing fields name, department, position, salary, hire_date.
Calculate the average salary across all employees.
Group the employees by department
Calculate the maximum salary for each department.

Steps

Open the Kibana Console or use a REST client.

Create an index with the proper mapping for the department as we want to bucket by it.

PUT employees
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text"
      },
      "department": {
        "type": "keyword"
      },
      "position": {
        "type": "text"
      },
      "salary": {
        "type": "integer"
      },
      "hire_date": {
        "type": "date"
      }
    }
  }
}

Index sample employee documents using the /_bulk endpoint.

POST /employees/_bulk
{"index":{"_id":1}}
{"name":"John Doe", "department":"Engineering", "position":"Software Engineer", "salary":80000, "hire_date":"2018-01-15"}
{"index":{"_id":2}}
{"name":"Jane Smith", "department":"Engineering", "position":"DevOps Engineer", "salary":75000, "hire_date":"2020-03-01"}
{"index":{"_id":3}}
{"name":"Bob Johnson", "department":"Sales", "position":"Sales Manager", "salary":90000, "hire_date":"2016-06-01"}
{"index":{"_id":4}}
{"name":"Alice Williams", "department":"Sales", "position":"Sales Representative", "salary":65000, "hire_date":"2019-09-15"}

Calculate the average salary of all employees

GET employees/_search
{
  "size": 0,
  "aggs": {
    "avg_salary_all_emps": {
      "avg": {
        "field": "salary"
      }
    }
  }
}

Add grouping the employees by department

GET employees/_search
{
  "size": 0,
  "aggs": {
    "avg_salary_all_emps": {
      "avg": {
        "field": "salary"
      }
    },
    "employees_by_department" : {
      "terms": {
        "field": "department"
      }
    }
  }
}

Add calculating the highest salary of all employees by department

GET employees/_search
{
  "size": 0,
  "aggs": {
    "avg_salary_all_emps": {
      "avg": {
        "field": "salary"
      }
    },
    "employees_by_department": {
      "terms": {
        "field": "department"
      },
      "aggs": {
        "max_salary_by_department": {
          "max": {
            "field": "salary"
          }
        }
      }
    }
  }
}

Test

Verify the index creation.
```
GET /employees
```
Verify the documents have been indexed.
```
GET /employees/_search
```

Execute the aggregation query, and it should return the following:

{
  ...
  "aggregations": {
    "avg_salary_all_emps": {
      "value": 77500
    },
    "employees_by_department": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "Engineering",
          "doc_count": 2,
          "max_salary_by_department": {
            "value": 80000
          }
        },
        {
          "key": "Sales",
          "doc_count": 2,
          "max_salary_by_department": {
            "value": 90000
          }
        }
      ]
    }
  }
}

Considerations

The department field must be of type keyword.
The size parameter is set to 0 to exclude hit documents from the response.
The avg_salary_all_emps metric aggregation calculates the average of the salary field across all documents.
The employees_by_department bucket aggregation groups the documents by the department field.
The max_salary_by_department sub-aggregation calculates the maximum value of the salary field for each department.

Clean-up (optional)

Delete the index.
```
DELETE employees
```

Documentation

2.5 Task: Write and execute aggregations that contain subaggregations

Example 1: Creating aggregations and sub-aggregations for Product Categories and Prices

Requirements

Create aggregations
- by category
- sub-aggregation of average price by category
  - price ranges: $0 to $20, $20-$40, $40 and up

Steps

Open the Kibana Console or use a REST client.

Create an index.

PUT /product_index
{
  "mappings": {
    "properties": {
      "product": {
        "type": "text"
      },
      "category": {
        "type": "keyword"
      },
      "price": {
        "type": "double"
      }
    }
  }
}

Index some sample documents.

POST /product_index/_bulk
{ "index": { "_id": "1" } }
{ "product": "Elasticsearch Guide", "category": "Books", "price": 29.99 }
{ "index": { "_id": "2" } }
{ "product": "Advanced Elasticsearch", "category": "Books", "price": 39.99 }
{ "index": { "_id": "3" } }
{ "product": "Elasticsearch T-shirt", "category": "Apparel", "price": 19.99 }
{ "index": { "_id": "4" } }
{ "product": "Elasticsearch Mug", "category": "Apparel", "price": 12.99 }

Create an aggregation by category.

GET product_index/_search
{
  "size": 0,
  "aggs": {
    "category_buckets": {
      "terms": {
        "field": "category"
      }
    }
  }
}

Create a sub-aggregations of average price.

GET product_index/_search
{
  "size": 0,
  "aggs": {
    "category_buckets": {
      "terms": {
        "field": "category"
      },
      "aggs": {
        "average_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}

Create a sub-aggregations of price ranges ($0-$20, $10-$40, $40 and up).

GET product_index/_search
{
  "size": 0,
  "aggs": {
    "category_buckets": {
      "terms": {
        "field": "category"
      },
      "aggs": {
        "average_price": {
          "avg": {
            "field": "price"
          }
        },
        "price_ranges" : {
          "range": {
            "field": "price",
            "ranges": [
              {
                "to": 20
              },
              {
                "from": 20,
                "to": 40
              },
              {
                "from": 40
              }
            ]
          }
        }
      }
    }
  }
}

Test

Verify the index creation and mappings.
```
GET /product_index
```
Verify the test documents are in the index.
```
GET /product_index/_search
```

Execute the aggregation query and confirm the results.

{
  ...
  "aggregations": {
    "category_buckets": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "Apparel",
          "doc_count": 2,
          "average_price": {
            "value": 16.49
          },
          "price_ranges": {
            "buckets": [
              {
                "key": "*-20.0",
                "to": 20,
                "doc_count": 2
              },
              {
                "key": "20.0-40.0",
                "from": 20,
                "to": 40,
                "doc_count": 0
              },
              {
                "key": "40.0-*",
                "from": 40,
                "doc_count": 0
              }
            ]
          }
        },
        {
          "key": "Books",
          "doc_count": 2,
          "average_price": {
            "value": 34.99
          },
          "price_ranges": {
            "buckets": [
              {
                "key": "*-20.0",
                "to": 20,
                "doc_count": 0
              },
              {
                "key": "20.0-40.0",
                "from": 20,
                "to": 40,
                "doc_count": 2
              },
              {
                "key": "40.0-*",
                "from": 40,
                "doc_count": 0
              }
            ]
          }
        }
      ]
    }
  }
}

Considerations

Setting size: 0 ensures the search doesn’t return any documents, focusing solely on the aggregations.
The category field must be of type keyword.
The terms aggregation creates buckets for each unique category.
The avg sub-aggregation calculates the average price within each category bucket.
The range sub-aggregation divides the prices into specified ranges within each category bucket.

Clean-up (optional)

Delete the index.
```
DELETE product_index
```

Documentation

Example 2: Creating aggregations and sub-aggregations for Employee Data Analysis

Requirements

Use the terms aggregation to group employees by department.
Use the avg sub-aggregation to calculate the average salary per department.
Use the filters sub-aggregation to group employees by job_title.

Steps

Open the Kibana Console or use a REST client.

Create a new index called employees.

PUT employees
{
  "mappings": {
    "properties": {
      "department": {
        "type": "keyword"
      },
      "salary": {
        "type": "integer"
      },
      "job_title": {
        "type": "keyword"
      }
    }
  }
}

Insert four documents representing employee data.

POST /employees/_bulk
{"index":{}}
{"department":"Sales","salary":100000,"job_title":"Manager"}
{"index":{}}
{"department":"Sales","salary":80000,"job_title":"Representative"}
{"index":{}}
{"department":"Marketing","salary":120000,"job_title":"Manager"}
{"index":{}}
{"department":"Marketing","salary":90000,"job_title":"Coordinator"}

Execute an aggregation by department.

GET employees/_search
{
  "size": 0,
  "aggs": {
    "employees_by_department": {
      "terms": {
        "field": "department"
      }
    }
  }
}

Add the sub-aggregations for average salary by department.

GET employees/_search
{
  "size": 0,
  "aggs": {
    "employees_by_department": {
      "terms": {
        "field": "department"
      },
      "aggs": {
        "avg_salary_by_department": {
          "avg": {
            "field": "salary"
          }
        }
      }
    }
  }
}

Add a filters sub-aggregation for each job_title.

GET employees/_search
{
  "size": 0,
  "aggs": {
    "employees_by_department": {
      "terms": {
        "field": "department"
      },
      "aggs": {
        "avg_salary_by_department": {
          "avg": {
            "field": "salary"
          }
        },
        "employees_by_title": {
          "filters": {
            "filters": {
              "Managers": {
                "term": {
                  "job_title": "Manager"
                }
              },
              "Representative" : {
                "term": {
                  "job_title": "Representative"
                }
              },
              "Coordinator" : {
                "term": {
                  "job_title": "Coordinator"
                }
              }
            }
          }
        }
      }
    }
  }
}

Test

Verify the index creation and mappings.
```
GET /employees
```
Verify the test documents are in the index.
```
GET /employees/_search
```

Verify that the employees are grouped correctly by department and job title and that the average salary is calculated correctly for each department.

{
  ...
  "aggregations": {
    "employees_by_department": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "Marketing",
          "doc_count": 2,
          "avg_salary_by_department": {
            "value": 105000
          },
          "employees_by_title": {
            "buckets": {
              "Coordinator": {
                "doc_count": 1
              },
              "Managers": {
                "doc_count": 1
              },
              "Representative": {
                "doc_count": 0
              }
            }
          }
        },
        {
          "key": "Sales",
          "doc_count": 2,
          "avg_salary_by_department": {
            "value": 90000
          },
          "employees_by_title": {
            "buckets": {
              "Coordinator": {
                "doc_count": 0
              },
              "Managers": {
                "doc_count": 1
              },
              "Representative": {
                "doc_count": 1
              }
            }
          }
        }
      ]
    }
  }
}

Considerations

The department field must be of type keyword.
Setting size to 0 ensures the search doesn’t return any documents, focusing solely on the aggregations.
The terms aggregation is used to group employees by department.
The avg sub-aggregation is used to calculate the average salary per department.
The filters sub-aggregation is used to group employees by job_title.

Clean-up (optional)

Delete the index.
```
DELETE employees
```

Documentation

Example 3: Creating aggregations and sub-aggregations for application logs by Hour and Log Level

Requirements

Analyze application logs stored in an Elasticsearch index named app-logs.
Use a date_histogram aggregation to group logs by the hour.
Within each hour bucket, create a sub-aggregation to group logs by their severity level (log_level).

Steps

Open the Kibana Console or use a REST client.

Create a new index called app-logs.

PUT app-logs
{
  "mappings": {
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "log_level": {
        "type": "keyword"
      },
      "message": {
        "type": "text"
      }
    }
  }
}

Insert sample data.

POST /app-logs/_bulk
{"index":{},"_id":"1"}
{"@timestamp":"2024-05-24T10:30:00","log_level":"INFO","message":"Application started successfully."}
{"index":{},"_id":"2"}
{"@timestamp":"2024-05-24T11:15:00","log_level":"WARNING","message":"Potential memory leak detected."}
{"index":{},"_id":"3"}
{"@timestamp":"2024-05-24T12:00:00","log_level":"ERROR","message":"Database connection failed."}
{"index":{},"_id":"4"}
{"@timestamp":"2024-05-24T10:45:00","log_level":"DEBUG","message":"Processing user request."}

Use a date_histogram aggregation to group logs by the hour.

GET app-logs/_search
{
  "size": 0,
  "aggs": {
    "logs_by_the_hour": {
      "date_histogram": {
        "field": "@timestamp",
        "fixed_interval": "1h"
      }
    }
  }
}

Within each hour bucket, create a sub-aggregation to group logs by their severity level (log_level).

GET app-logs/_search
{
  "size": 0,
  "aggs": {
    "logs_by_the_hour": {
      "date_histogram": {
        "field": "@timestamp",
        "fixed_interval": "1h"
      },
      "aggs": {
        "log_severity": {
          "terms": {
            "field": "log_level"
          }
        }
      }
    }
  }
}

Test

Verify the index creation and mappings.
```
GET /app-logs
```
Verify the test documents are in the index.
```
GET /app-logs/_search
```

Run the search query and examine the response.

{
  ...
  "aggregations": {
    "logs_by_the_hour": {
      "buckets": [
        {
          "key_as_string": "2024-05-24T10:00:00.000Z",
          "key": 1716544800000,
          "doc_count": 2,
          "log_severity": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "DEBUG",
                "doc_count": 1
              },
              {
                "key": "INFO",
                "doc_count": 1
              }
            ]
          }
        },
        {
          "key_as_string": "2024-05-24T11:00:00.000Z",
          "key": 1716548400000,
          "doc_count": 1,
          "log_severity": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "WARNING",
                "doc_count": 1
              }
            ]
          }
        },
        {
          "key_as_string": "2024-05-24T12:00:00.000Z",
          "key": 1716552000000,
          "doc_count": 1,
          "log_severity": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "ERROR",
                "doc_count": 1
              }
            ]
          }
        }
      ]
    }
  }
}

Considerations

Setting size to 0 ensures the search doesn’t return any documents, focusing solely on the aggregations.
The date_histogram aggregation groups documents based on the @timestamp field with an interval of one hour.
The nested terms aggregation within the logs_by_hour aggregation counts the occurrences of each unique log_level within each hour bucket.

Clean-up (optional)

Delete the index.
```
DELETE app-logs
```

Documentation

Example 4: Finding the Stock with the Highest Daily Volume of the Month

This is taken from a webinar by Elastic to show a sample question and answer to the Certified Engineer Exam. Their answer was wrong and didn’t need aggregations.

Requirements

Create a query to find the stock with the highest daily volume for the current month.

Steps

Open the Kibana Console or use a REST client.

Index sample data:

Use the _bulk endpoint to index sample stock data.

Ensure the data includes fields for stock_name, date, and volume.

POST _bulk
{ "index": { "_index": "stocks", "_id": "1" } }
{ "stock_name": "AAPL", "date": "2024-07-01", "volume": 1000000 }
{ "index": { "_index": "stocks", "_id": "2" } }
{ "stock_name": "AAPL", "date": "2024-07-02", "volume": 1500000 }
{ "index": { "_index": "stocks", "_id": "3" } }
{ "stock_name": "GOOGL", "date": "2024-07-01", "volume": 2000000 }
{ "index": { "_index": "stocks", "_id": "4" } }
{ "stock_name": "GOOGL", "date": "2024-07-02", "volume": 2500000 }
{ "index": { "_index": "stocks", "_id": "5" } }
{ "stock_name": "MSFT", "date": "2024-07-01", "volume": 3000000 }
{ "index": { "_index": "stocks", "_id": "6" } }
{ "stock_name": "MSFT", "date": "2024-07-02", "volume": 3500000 }
{ "index": { "_index": "stocks", "_id": "7" } }
{ "stock_name": "TSLA", "date": "2024-07-01", "volume": 4000000 }
{ "index": { "_index": "stocks", "_id": "8" } }
{ "stock_name": "TSLA", "date": "2024-07-02", "volume": 4500000 }
{ "index": { "_index": "stocks", "_id": "9" } }
{ "stock_name": "AMZN", "date": "2024-07-01", "volume": 5000000 }
{ "index": { "_index": "stocks", "_id": "10" } }
{ "stock_name": "AMZN", "date": "2024-07-02", "volume": 5500000 }

Create the query. The stocks in the index are all from July, but you want just the stocks for the latest month. Update the above dates so the query will work for you.

  GET stocks/_search
  {
    "size": 1, 
    "query": {
      "range": {
        "date": {
          "gte": "now/M",
          "lte": "now"
        }
      }
    }
  }

The results of the query should be all the stocks from a given month. Now sort those stocks by their volume and display the top pick.

GET stocks/_search
{
  "size": 1, 
  "query": {
    "range": {
      "date": {
        "gte": "now/M",
        "lte": "now"
      }
    }
  },
  "sort": [
    {
      "volume": {
        "order": "desc"
      }
    }
  ]
}

Test

Verify the index creation and mappings.
```
GET /stocks
```
Verify the test documents are in the index.
```
GET /stocks/_search
```

Run the query and confirm that the stock with the highest daily volume of the month is displayed.

{
  ...
    "hits": [
      {
        "_index": "stocks",
        "_id": "10",
        "_score": null,
        "_source": {
          "stock_name": "AMZN",
          "date": "2024-07-02",
          "volume": 5500000
        },
        "sort": [
          5500000
        ]
      }
    ]
  }
}

Considerations

The range clause returned the stocks for the current month
The sort clause brought the highest volume of any stock to the top and size of 1 displayed that one record

Clean-up (Optional)

Delete the stocks index to clean up the data:
```
DELETE /stocks
```

Documentation

Example 5: Aggregating Sales Data by Month with Sub-Aggregation of Total Sales Value

Requirements

Aggregate e-commerce sales data by month, creating at least 12 date buckets.
Perform a sub-aggregation to calculate the total sales value within each month.

Steps

Index Sample Sales Documents Using _bulk Endpoint:

POST /sales_data/_bulk
{ "index": { "_id": "1" } }
{ "order_date": "2023-01-15", "product": "Yoo-hoo Beverage", "quantity": 10, "price": 1.99 }
{ "index": { "_id": "2" } }
{ "order_date": "2023-02-20", "product": "Apple iPhone 12", "quantity": 1, "price": 799.99 }
{ "index": { "_id": "3" } }
{ "order_date": "2023-03-05", "product": "Choco-Lite Bar", "quantity": 25, "price": 0.99 }
{ "index": { "_id": "4" } }
{ "order_date": "2023-04-10", "product": "Nike Air Max 270", "quantity": 3, "price": 150.00 }
{ "index": { "_id": "5" } }
{ "order_date": "2023-05-18", "product": "Samsung Galaxy S21", "quantity": 2, "price": 699.99 }
{ "index": { "_id": "6" } }
{ "order_date": "2023-06-22", "product": "Yoo-hoo Beverage", "quantity": 15, "price": 1.99 }
{ "index": { "_id": "7" } }
{ "order_date": "2023-07-03", "product": "Choco-Lite Bar", "quantity": 30, "price": 0.99 }
{ "index": { "_id": "8" } }
{ "order_date": "2023-08-25", "product": "Apple iPhone 12", "quantity": 1, "price": 799.99 }
{ "index": { "_id": "9" } }
{ "order_date": "2023-09-10", "product": "Nike Air Max 270", "quantity": 4, "price": 150.00 }
{ "index": { "_id": "10" } }
{ "order_date": "2023-10-15", "product": "Samsung Galaxy S21", "quantity": 1, "price": 699.99 }
{ "index": { "_id": "11" } }
{ "order_date": "2023-11-20", "product": "Yoo-hoo Beverage", "quantity": 20, "price": 1.99 }
{ "index": { "_id": "12" } }
{ "order_date": "2023-12-30", "product": "Choco-Lite Bar", "quantity": 50, "price": 0.99 }

Bucket the order_date using a Date Histogram Aggregation with Sub-Aggregation:

Use a date_histogram to create monthly buckets and a sum sub-aggregation to calculate total sales within each month.

GET /sales_data/_search
{
  "size": 0,
  "aggs": {
    "sales_over_time": {
      "date_histogram": {
        "field": "order_date",
        "calendar_interval": "month",
        "format": "yyyy-MM"
      },
      "aggs": {
        "total_sales": {
          "sum": {
            "field": "total_value"
          }
        }
      }
    }
  }
}

Calculate the Total Value:
- Before running the above aggregation, ensure that each document includes a total_value field. You could either compute it on the client side or dynamically compute it using an ingest pipeline or a script during the aggregation process.
  
  For simplicity, let’s assume the total_value is calculated as quantity * price:
```
POST /sales_data/_update_by_query
{
  "script": {
    "source": "ctx._source.total_value = ctx._source.quantity * ctx._source.price"
  },
  "query": {
    "match_all": {}
  }
}
```

Test

Run the above GET /sales_data/_search query.
Check the output to see 12 date buckets, one for each month, with the total_sales value for each bucket.

Considerations

The date_histogram aggregation is ideal for grouping records by time intervals such as months, weeks, or days.
The sum sub-aggregation allows you to calculate the total value of sales within each date bucket.
Ensure that the total_value field is correctly calculated, as this impacts the accuracy of the sub-aggregation.

Clean-up (Optional)

Delete the stocks index to clean up the data:
```
DELETE /sales_data
```

Documentation

2.6 Task: Write and execute a query that searches across multiple clusters

If you are running your instance of Elasticsearch locally, and need to create an additional cluster so that you can run these examples, go to the Appendix: Adding a Cluster to your Elasticsearch Instance for information on how to set up an additional single-node cluster.

Example 1: Creating search queries for Products in Multiple Clusters

Requirements

Set up two single-node clusters on localhost or Elastic Cloud.
Create an index in each cluster.
Index at least four documents in each cluster using the _bulk endpoint.
Configure cross-cluster search.
Execute a cross-cluster search query.

Steps

Open the Kibana Console or use a REST client.
Set up multiple clusters on localhost.

Assume you have two clusters, es01 and es02 and they have been set up as directed in the Appendix.

In the local cluster, configure communication between the clusters by updating the local cluster settings.

PUT /_cluster/settings
{
  "persistent": {
    "cluster": {
      "remote": {
        "es01": {
          "seeds": [
            "es01:9300"
          ],
          "skip_unavailable": true
        },
        "es02": {
          "seeds": [
            "es02:9300"
          ],
          "skip_unavailable": false
        }
      }
    }
  }
}

Create a product index in each cluster.

From the Kibana Console (es01)

PUT /products
{
  "mappings": {
    "properties": {
      "product": {
        "type": "text"
      },
      "category": {
        "type": "keyword"
      },
      "price": {
        "type": "double"
      }
    }
  }
}

From the command line (es02).

curl -u elastic:[your password here] -X PUT "http://localhost:9201/products?pretty" -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "properties": {
      "product": {
        "type": "text"
      },
      "category": {
        "type": "keyword"
      },
      "price": {
        "type": "double"
      }
    }
  }
}'

Index product documents into each cluster.

For es01:

POST /products/_bulk
{ "index": { "_id": "1" } }
{ "product": "Elasticsearch Guide", "category": "Books", "price": 29.99 }
{ "index": { "_id": "2" } }
{ "product": "Advanced Elasticsearch", "category": "Books", "price": 39.99 }
{ "index": { "_id": "3" } }
{ "product": "Elasticsearch T-shirt", "category": "Apparel", "price": 19.99 }
{ "index": { "_id": "4" } }
{ "product": "Elasticsearch Mug", "category": "Apparel", "price": 12.99 }

For es02 through the command line (note that the final single quote is on a line by itself):

curl -u elastic:[your password here] -X POST "http://localhost:9201/products/_bulk?pretty" -H 'Content-Type: application/json' -d'
{ "index": { "_id": "5" } }
{ "product": "Elasticsearch Stickers", "category": "Accessories", "price": 4.99 }
{ "index": { "_id": "6" } }
{ "product": "Elasticsearch Notebook", "category": "Stationery", "price": 7.99 }
{ "index": { "_id": "7" } }
{ "product": "Elasticsearch Pen", "category": "Stationery", "price": 3.49 }
{ "index": { "_id": "8" } }
{ "product": "Elasticsearch Hoodie", "category": "Apparel", "price": 45.99 }
'

Configure Cross-Cluster Search (CCS).

In the local cluster, ensure the remote cluster is configured by checking the settings:
```
GET /_cluster/settings?include_defaults=true&filter_path=defaults.cluster.remote
```

Execute a Cross-Cluster Search query.

GET /products,es02:products/_search
{
  "query": {
    "match": {
      "product": "Elasticsearch"
    }
  }
}

Test

Verify the index creation.

GET /products

From the command line execute:

curl -u elastic:[your password here] -X GET "http://localhost:9201/products?pretty"

Verify that the documents have been indexed.

GET /products/_search
GET /es02:products/_search

Ensure the remote cluster is correctly configured and visible from the local cluster.
```
GET /_remote/info
```

Execute a Cross-Cluster Search query.

GET /products,es02:products/_search
{
  "query": {
    "match": {
      "product": "Elasticsearch"
    }
  }
}

Considerations

Cross-cluster search is useful for querying data across multiple Elasticsearch clusters, providing a unified search experience.
Ensure the remote cluster settings are correctly configured in the cluster settings.
Properly handle the index names to avoid conflicts and ensure clear distinction between clusters.

Clean-up (optional)

Delete the es01 index.
```
DELETE products
```

Delete the es02 index from the command line.

curl -u elastic:[your password here] -X DELETE "http://localhost:9201/products?pretty"

Documentation

2.7 Task: Write and execute a search that utilizes a runtime field

Example 1: Creating search queries for products with a runtime field for discounted prices

Requirements

Create an index.
Index four documents.
Define a runtime field.
Execute a search query that creates a query-time runtime field with a 10% discount

Steps

Open the Kibana Console or use a REST client.
Create an index.

PUT /product_index
{
  "mappings": {
    "properties": {
      "product": {
        "type": "text"
      },
      "price": {
        "type": "double"
      },
      "category": {
        "type": "keyword"
      }
    }
  }
}

Index some documents.

POST /product_index/_bulk
{ "index": { "_id": "1" } }
{ "product": "Elasticsearch Guide", "price": 29.99, "category": "Books" }
{ "index": { "_id": "2" } }
{ "product": "Advanced Elasticsearch", "price": 39.99, "category": "Books" }
{ "index": { "_id": "3" } }
{ "product": "Elasticsearch T-shirt", "price": 19.99, "category": "Apparel" }
{ "index": { "_id": "4" } }
{ "product": "Elasticsearch Mug", "price": 12.99, "category": "Apparel" }

Define a query-time runtime field to return a discounted price.

GET product_index/_search
{
  "query": {
    "match_all": {}
  },
  "fields": [
    "product", "price", "discounted_price"
  ], 
  "runtime_mappings": {
    "discounted_price": {
      "type": "double",
      "script": {
        "source": "emit(doc['price'].value * 0.9)"
      }
    }
  }
}

Test

Verify the creation of the index and its mappings.
```
GET /product_index
```
Verify the indexed documents.
```
GET /product_index/_search
```

Execute the query and confirm the discounted_price.

{
  ...
    "hits": [
      {
        ...
        "fields": {
          "product": [
            "Elasticsearch Guide"
          ],
          "price": [
            29.99
          ],
          "discounted_price": [
            26.991
          ]
        }
      },
      {
        ...
        "fields": {
          "product": [
            "Advanced Elasticsearch"
          ],
          "price": [
            39.99
          ],
          "discounted_price": [
            35.991
          ]
        }
      },
      {
        ...
        "fields": {
          "product": [
            "Elasticsearch T-shirt"
          ],
          "price": [
            19.99
          ],
          "discounted_price": [
            17.991
          ]
        }
      },
      {
        ...
        "fields": {
          "product": [
            "Elasticsearch Mug"
          ],
          "price": [
            12.99
          ],
          "discounted_price": [
            11.691
          ]
        }
      }
    ]
  }
}

Considerations

Runtime fields allow for dynamic calculation of field values at search time, useful for complex calculations or when the field values are not stored.
The script in the runtime field calculates the discounted price by applying a 10% discount to the price field.

Clean-up (optional)

Delete the index.
```
DELETE product_index
```

Documentation

Example 2: Creating search queries for employees with a calculated total salary

In this example, the runtime field is defined as part of the index that executes code when documents are indexed. The salary field is read at index time to create a new value for the runtime field total_salary.

Requirements

An index (employees) with documents containing employee information (name, department, salary)and a runtime field (total_salary) to calculate the total salary of each employee.
A search query to retrieve employees with a total salary above $65,000.

Steps

Open the Kibana Console or use a REST client.

Create the employees index with a mapping for the runtime field.

PUT employees
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text"
      },
      "department": {
        "type": "text"
      },
      "salary": {
        "type": "integer"
      },
      "total_salary": {
        "type": "long",
        "script": {
          "source": "emit(doc['salary'].value * 12)"
        }
      }
    }
  }
}

Index some documents that contain a monthly salary.

POST /employees/_bulk
{ "index": { "_id": "1" } }
{ "name": "John Doe", "department": "Sales", "salary": 4000 }
{ "index": { "_id": "2" } }
{ "name": "Jane Smith", "department": "Marketing", "salary": 6000 }
{ "index": { "_id": "3" } }
{ "name": "Bob Johnson", "department": "IT", "salary": 7000 }
{ "index": { "_id": "4" } }
{ "name": "Alice Brown", "department": "HR", "salary": 5000 }

Execute a search query with a runtime field.

GET employees/_search
{
  "query": {
    "range": {
      "total_salary": {
        "gte": 65000
      }
    }
  },
  "fields": [
    "total_salary"
  ]
}

Test

Verify the creation of the index and its mappings.
```
GET /employees
```
Verify the indexed documents.
```
GET /employees/_search
```

Execute the query and verify the search results contain only employees with a total salary above 65000.

{
  ...
    "hits": [
      {
        "_index": "employees",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "Jane Smith",
          "department": "Marketing",
          "salary": 6000
        },
        "fields": {
          "total_salary": [
            72000
          ]
        }
      },
      {
        "_index": "employees",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "Bob Johnson",
          "department": "IT",
          "salary": 7000
        },
        "fields": {
          "total_salary": [
            84000
          ]
        }
      }
    ]
  }
}

Considerations

Runtime fields are calculated on the fly and can be used in search queries, aggregations, and sorting.
The script used in the runtime field calculates the total salary by multiplying the monthly salary by 12 months.

Clean-up (optional)

Delete the index.
```
DELETE employees
```

Documentation

Example 3: Creating search queries with a runtime field for restaurant data

Requirements

Create a search query for restaurants in New York City.
Include the restaurant’s name, cuisine, and a calculated rating_score in the search results.
- the rating_score is calculated by taking the square root of the product of the review_score and number_of_reviews.

Steps

Open the Kibana Console or use a REST client.

Create a restaurant index.

PUT restaurants
{
  "mappings": {
    "properties": {
      "city": {
        "type": "keyword"
      },
      "cuisine": {
        "type": "text"
      },
      "name": {
        "type": "text"
      },
      "number_of_reviews": {
        "type": "long"
      },
      "review_score": {
        "type": "float"
      },
      "state": {
        "type": "keyword"
      }
    }
  }
}

Index some sample restaurant documents.

POST /restaurants/_bulk
{ "index": { "_id": 1 } }
{ "name": "Tasty Bites", "city": "New York", "state": "NY", "cuisine": "Italian", "review_score": 4.5, "number_of_reviews": 200 }
{ "index": { "_id": 2 } }
{ "name": "Spicy Palace", "city": "Los Angeles", "state": "CA", "cuisine": "Indian", "review_score": 4.2, "number_of_reviews": 150 }
{ "index": { "_id": 3 } }
{ "name": "Sushi Spot", "city": "San Francisco", "state": "CA", "cuisine": "Japanese", "review_score": 4.7, "number_of_reviews": 300 }
{ "index": { "_id": 4 } }
{ "name": "Burger Joint", "city": "Chicago", "state": "IL", "cuisine": "American", "review_score": 3.8, "number_of_reviews": 100 }

Create a query to return restaurants based from New York City.

GET restaurants/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "city": {
              "value": "New York"
            }
          }
        },
        {
          "term": {
            "state": {
              "value": "NY"
            }
          }
        }
      ]
    }
  }
}

Define a runtime field named weighted_rating to calculate a weighted rating score for New York restaurants.

GET restaurants/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "city": {
              "value": "New York"
            }
          }
        },
        {
          "term": {
            "state": {
              "value": "NY"
            }
          }
        }
      ]
    }
  }, 
  "runtime_mappings": {
    "rating_score": {
      "type": "double",
      "script": {
        "source": "emit(Math.sqrt(doc['review_score'].value * doc['number_of_reviews'].value))"
      }
    }
  },
  "fields": [
    "rating_score"
  ]
}

Test

Verify the creation of the index and its mappings.
```
GET /restaurants
```
Verify the indexed documents.
```
GET /restaurants/_search
```

Execute the query and verify the restaurant name, cuisine type, and the calculated weighted rating score for restaurants located in New York, NY.

{
  ...
    "hits": [
      {
        "_index": "restaurants",
        "_id": "1",
        "_score": 2.4079456,
        "_source": {
          "name": "Tasty Bites",
          "city": "New York",
          "state": "NY",
          "cuisine": "Italian",
          "review_score": 4.5,
          "number_of_reviews": 200
        },
        "fields": {
          "rating_score": [
            30
          ]
        }
      }
    ]
  }
}

Considerations

The runtime_mappings section defines a new field weighted_rating that calculates a weighted rating score based on the review_score and number_of_reviews fields.
The query section uses the term query to search for restaurants in New York, NY.
The fields section specifies the fields to include in the search results (in this case, the runtime field weighted_rating).

Clean-up (optional)

Delete the index.
```
DELETE restaurants
```

Documentation

Runtime Fields in the Search Request