2 Searching Data
2.1 Task: Write and execute a search query for terms and/or phrases in one or more fields of an index
The following section has only one full example, but it shows variations of term and phrase queries. Also, bear in mind that when the exam objectives say term, they may not mean the Elasticsearch term query, but rather the generic search use of the word. There are a lot of ways to execute a search in Elasticsearch. Don't get bogged down; focus on term and phrase searches for this section.
Example 1: Write and execute a basic term and phrase search
Requirements
- Create an index
- Index some documents
- Execute a term query
- Execute a phrase query
Steps
Open the Kibana Console or use a REST client.
Index some documents, which will create the index at the same time. The Kibana Console doesn't accept pretty-printed documents when calling _bulk, so they need to be tightly packed (one action and one document per line).

POST /example_index/_bulk
{ "index": {} }
{ "title": "The quick brown fox", "text": "The quick brown fox jumps over the lazy dog." }
{ "index": {} }
{ "title": "Fast and curious", "text": "A fast and curious fox was seen leaping over a lazy dog." }
{ "index": {} }
{ "title": "A fox in action", "text": "In a remarkable display of agility, a quick fox effortlessly jumped over a dog." }
{ "index": {} }
{ "title": "Wildlife wonders", "text": "Observers were amazed as the quick brown fox jumped over the lazy dog." }
{ "index": {} }
{ "title": "Fox tales", "text": "The tale of the quick fox that jumped over the lazy dog has become a legend." }

Execute a term query. Use the GET method to search for documents using a couple of different term-level queries (there are about 10 different kinds currently; refer to the Term-level queries documentation for the full list).

GET example_index/_search { "query": { "term": { "title": { "value": "quick" } } } }

GET example_index/_search { "query": { "terms": { "text": ["display", "amazed"] } } }
Execute a phrase query.

This match_phrase query returns 2 docs:

GET /example_index/_search { "query": { "match_phrase": { "text": "quick brown fox" } } }

This match_phrase_prefix query returns 1 doc:

GET /example_index/_search { "query": { "match_phrase_prefix": { "text": "fast and curi" } } }

This query_string query with a quoted phrase returns 1 doc:

GET /example_index/_search { "query": { "query_string": { "default_field": "text", "query": "\"fox jumps\"" } } }
Considerations
- The default standard analyzer is used (grammar-based tokenization plus lowercasing).
- The term query is used for exact matches; its input is not analyzed, meaning it must match a term in the inverted index exactly.
- The match_phrase query analyzes the input text and matches it as a phrase, making it useful for finding exact sequences of terms.
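The distinction between term and phrase matching can be sketched in plain Python (no cluster needed). This is only a rough stand-in: the analyze function approximates the standard analyzer's lowercasing and tokenization, nothing more.

```python
# The five sample "text" field values from the bulk request above.
docs = [
    "The quick brown fox jumps over the lazy dog.",
    "A fast and curious fox was seen leaping over a lazy dog.",
    "In a remarkable display of agility, a quick fox effortlessly jumped over a dog.",
    "Observers were amazed as the quick brown fox jumped over the lazy dog.",
    "The tale of the quick fox that jumped over the lazy dog has become a legend.",
]

def analyze(text):
    # Rough stand-in for the standard analyzer: lowercase, split, strip punctuation.
    return [t.strip(".,") for t in text.lower().split()]

def term_match(doc, term):
    # A term query matches when the exact term appears in the doc's token list.
    return term in analyze(doc)

def phrase_match(doc, phrase):
    # A match_phrase query matches when the analyzed tokens occur consecutively.
    tokens, wanted = analyze(doc), analyze(phrase)
    return any(tokens[i:i + len(wanted)] == wanted for i in range(len(tokens)))

print(sum(term_match(d, "quick") for d in docs))              # 4 docs contain the term
print(sum(phrase_match(d, "quick brown fox") for d in docs))  # 2 docs contain the phrase
```

Note how "Fox tales" contains both "quick" and "fox" but not the consecutive phrase, so it matches the term query and not the phrase query.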
Test
- Verify the various queries return the proper results.
Clean-up (optional)
Delete the example index
DELETE example_index
Documentation
Example 2: Boosting Document Score When an Additional Field Matches
Requirements
- Perform a search for beverage OR bar.
- Boost the score of documents if the value snack exists in the tags field.
Steps
Index sample documents using the _bulk endpoint, with fields such as name, description, and tags.

POST /products/_bulk
{ "index": { "_id": "1" } }
{ "name": "Yoo-hoo Beverage", "description": "A delicious, chocolate-flavored drink.", "tags": ["beverage", "chocolate"] }
{ "index": { "_id": "2" } }
{ "name": "Apple iPhone 12", "description": "The latest iPhone model with advanced features.", "tags": ["electronics", "smartphone"] }
{ "index": { "_id": "3" } }
{ "name": "Choco-Lite Bar", "description": "A light and crispy chocolate snack bar.", "tags": ["snack", "chocolate"] }
{ "index": { "_id": "4" } }
{ "name": "Samsung Galaxy S21", "description": "A powerful smartphone with an impressive camera.", "tags": ["electronics", "smartphone"] }
{ "index": { "_id": "5" } }
{ "name": "Nike Air Max 270", "description": "Comfortable and stylish sneakers.", "tags": ["footwear", "sportswear"] }

Perform the query_string query with boosting:
- Use a query_string query to create an OR condition within the query.
- Use a function_score query to boost the score of documents where the tags field contains a specific value (in this case, "snack").

GET /products/_search { "query": { "function_score": { "query": { "query_string": { "query": "beverage OR bar" } }, "functions": [ { "filter": { "term": { "tags": "snack" } }, "weight": 2 } ], "boost_mode": "multiply" } } }
Test
- Run the above search query.
- Run the following query, which is missing the boosting function_score wrapper:

GET /products/_search
{
  "query": {
    "query_string": {
      "query": "beverage OR bar"
    }
  }
}

- Compare the two responses to ensure that documents containing "snack" in the tags field have a higher score, and that documents are matched based on the OR condition in the query_string.
Considerations
- The query_string query allows you to use a query syntax that includes operators such as OR, AND, and NOT to combine different search criteria.
- The function_score query is used to boost the score of documents based on specific conditions; in this case, whether the tags field contains the value "snack".
- The weight parameter in the function_score query determines the amount by which the score is boosted, and the boost_mode of "multiply" multiplies the original score by the boost value.
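The effect of the filter-plus-weight function with boost_mode "multiply" can be sketched in plain Python. This is not the Elasticsearch implementation; the base scores below are made-up values standing in for BM25 relevance scores.

```python
# Two of the sample docs, with illustrative (made-up) base relevance scores.
hits = [
    {"name": "Yoo-hoo Beverage", "tags": ["beverage", "chocolate"], "score": 1.4},
    {"name": "Choco-Lite Bar", "tags": ["snack", "chocolate"], "score": 1.1},
]

def boosted(hit, weight=2):
    # filter: { "term": { "tags": "snack" } } with boost_mode "multiply":
    # docs matching the filter get their score multiplied by the weight.
    return hit["score"] * weight if "snack" in hit["tags"] else hit["score"]

for h in hits:
    h["score"] = boosted(h)
hits.sort(key=lambda h: h["score"], reverse=True)
print([h["name"] for h in hits])  # the snack bar now outranks the beverage
```

Even though the beverage scored higher on the text match alone, the boost reorders the results, which is exactly what the Test section asks you to verify.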
Clean-up (optional)
Delete the example index
DELETE products
Documentation
2.2 Task: Write and execute a search query that is a Boolean combination of multiple queries and filters
Example 1: Creating a Boolean search for documents in a book index
Requirements
- Search for documents with terms in the "title", "description", and "category" fields
Steps
Open the Kibana Console or use a REST client.
Index some documents, which will create the index at the same time. The Kibana Console doesn't accept pretty-printed documents when calling _bulk, so they need to be tightly packed.

POST /books/_bulk
{ "index": { "_id": "1" } }
{ "title": "To Kill a Mockingbird", "description": "A novel about the serious issues of rape and racial inequality.", "category": "Fiction" }
{ "index": { "_id": "2" } }
{ "title": "1984", "description": "A novel that delves into the dangers of totalitarianism.", "category": "Dystopian" }
{ "index": { "_id": "3" } }
{ "title": "The Great Gatsby", "description": "A critique of the American Dream.", "category": "Fiction" }
{ "index": { "_id": "4" } }
{ "title": "Moby Dick", "description": "The quest of Ahab to exact revenge on the whale Moby Dick.", "category": "Adventure" }
{ "index": { "_id": "5" } }
{ "title": "Pride and Prejudice", "description": "A romantic novel that also critiques the British landed gentry at the end of the 18th century.", "category": "Romance" }

Create a boolean search query. The order in which the various clauses are added doesn't matter to the final result.

GET books/_search { "query": { "bool": {} } }

Add a must query for the description field. This will return 4 documents.

GET books/_search { "query": { "bool": { "must": [ { "terms": { "description": [ "novel", "dream", "critique" ] } } ] } } }

Add a filter query for the category field. This will return 2 documents.

GET books/_search { "query": { "bool": { "must": [ { "terms": { "description": [ "novel", "dream", "critique" ] } } ], "filter": [ { "term": { "category": "fiction" } } ] } } }

Add a must_not clause for the title field. This will return 1 document.

GET books/_search { "query": { "bool": { "must": [ { "terms": { "description": [ "novel", "dream", "critique" ] } } ], "filter": [ { "term": { "category": "fiction" } } ], "must_not": [ { "term": { "title": { "value": "gatsby" } } } ] } } }
Considerations
- The bool query allows for combining multiple queries and filters with Boolean logic.
- The must and filter clauses require every query they contain to match; the must_not clause excludes any document that matches its queries.
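The matching logic (ignoring scoring) can be mirrored in plain Python as a sanity check. This sketch reproduces the sample books and a crude stand-in for the standard analyzer:

```python
# title -> (description, category), from the bulk request above.
books = {
    "To Kill a Mockingbird": ("A novel about the serious issues of rape and racial inequality.", "Fiction"),
    "1984": ("A novel that delves into the dangers of totalitarianism.", "Dystopian"),
    "The Great Gatsby": ("A critique of the American Dream.", "Fiction"),
    "Moby Dick": ("The quest of Ahab to exact revenge on the whale Moby Dick.", "Adventure"),
    "Pride and Prejudice": ("A romantic novel that also critiques the British landed gentry at the end of the 18th century.", "Romance"),
}

def tokens(text):
    # Crude stand-in for the standard analyzer (lowercase, strip punctuation).
    return {t.strip(".,") for t in text.lower().split()}

def bool_match(title, description, category):
    must = bool(tokens(description) & {"novel", "dream", "critique"})  # terms query
    filt = "fiction" in tokens(category)                               # filter clause
    veto = "gatsby" in tokens(title)                                   # must_not clause
    return must and filt and not veto

matched = [t for t, (d, c) in books.items() if bool_match(t, d, c)]
print(matched)  # only one book survives all three clauses
```

Four books pass the must clause, two of those are Fiction, and the must_not clause vetoes The Great Gatsby, leaving To Kill a Mockingbird.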
Test
- Verify that the search query returns documents with the term "novel", "dream", or "critique" in the description field. Why doesn't "Pride and Prejudice" match on the term "critique"? (Its description contains "critiques", and the standard analyzer does not stem, so "critique" and "critiques" are different terms.)
Clean-up (optional)
Delete the index
DELETE books
Documentation
Example 2: Creating a Boolean search for finding products within a specific price range and excluding discontinued items
Requirements
- Find all documents where the name field exists (name:*) and the price field falls within a specified range.
- Additionally, filter out any documents where the discontinued field is set to true.
Steps
Open the Kibana Console or use a REST client.
Index some documents, which will create the index at the same time. The Kibana Console doesn't accept pretty-printed documents when calling _bulk, so they need to be tightly packed.

POST /products/_bulk
{"index":{"_id":1}}
{"name":"Coffee Maker","price":49.99,"discontinued":false}
{"index":{"_id":2}}
{"name":"Gaming Laptop","price":1299.99,"discontinued":false}
{"index":{"_id":3}}
{"name":"Wireless Headphones","price":79.99,"discontinued":true}
{"index":{"_id":4}}
{"name":"Smartwatch","price":249.99,"discontinued":false}

Construct the first search query (the name field exists and the price field falls within a specified range).

GET products/_search { "query": { "bool": { "must": [ { "exists": { "field": "name" } }, { "range": { "price": { "gte": 70, "lte": 500 } } } ] } } }

Construct the second search query (same as above, but also exclude documents where discontinued is set to true).

GET products/_search { "query": { "bool": { "must": [ { "exists": { "field": "name" } }, { "range": { "price": { "gte": 70, "lte": 500 } } } ], "must_not": [ { "term": { "discontinued": { "value": "true" } } } ] } } }
Explanation
- Similar to the previous example, the bool query combines multiple conditions.
- The must clause specifies documents that must match all conditions within it.
- The range query ensures the price field is between $70 (inclusive) and $500 (inclusive).
- The must_not clause excludes documents that match the specified criteria.
- The term query filters out documents where discontinued is set to true.
Test
- Run the search query and verify the results only include documents for products with:
  - A price between $70 and $500 (inclusive).
  - discontinued set to false (not discontinued).
This should return a single document with an ID of 4 (Smartwatch) based on the sample data.
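The exists/range/must_not logic can be checked against the sample data in plain Python (a sketch, no cluster involved):

```python
# The four sample products from the bulk request above.
products = [
    {"id": 1, "name": "Coffee Maker", "price": 49.99, "discontinued": False},
    {"id": 2, "name": "Gaming Laptop", "price": 1299.99, "discontinued": False},
    {"id": 3, "name": "Wireless Headphones", "price": 79.99, "discontinued": True},
    {"id": 4, "name": "Smartwatch", "price": 249.99, "discontinued": False},
]

def matches(doc):
    exists = doc.get("name") is not None        # exists query on "name"
    in_range = 70 <= doc["price"] <= 500        # range query, gte/lte both inclusive
    keep = not doc["discontinued"]              # must_not term query on "discontinued"
    return exists and in_range and keep

hits = [d["id"] for d in products if matches(d)]
print(hits)  # only the Smartwatch survives
```

The Coffee Maker fails the range, the Gaming Laptop is too expensive, and the Wireless Headphones are discontinued, leaving document 4.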
Considerations
- The chosen price range (gte: 70, lte: 500) can be adjusted based on your specific needs.
- You can replace the exists query on name with more specific criteria (for example, a match query) if needed.
Clean-up (optional)
Delete the index
DELETE products
Documentation
Example 3: Creating a Boolean search for e-commerce products
Requirements
- Search for products that belong to the “Electronics” category.
- The product name should contain the term “phone”.
- Exclude products with a price greater than 500.
Steps
Open the Kibana Console or use a REST client.
Create an index.

PUT products { "mappings": { "properties": { "name" : { "type": "text" }, "category" : { "type": "text" }, "price" : { "type": "float" } } } }

Index some documents. The Kibana Console doesn't accept pretty-printed documents when calling _bulk, so they need to be tightly packed.

POST /products/_bulk
{"index": { "_id": 1 } }
{ "name": "Smartphone X", "category": "Electronics", "price": 399.99 }
{"index": { "_id": 2 } }
{ "name": "Laptop Y", "category": "Electronics", "price": 799.99 }
{"index": { "_id": 3 } }
{ "name": "Headphones Z", "category": "Electronics", "price": 99.99 }
{"index": { "_id": 4 } }
{ "name": "Gaming Console", "category": "Electronics", "price": 299.99 }

Create a term query that only matches the category "electronics" (category is a text field here, so the indexed value "Electronics" is lowercased at index time). This returns all 4 documents.

GET products/_search { "query": { "term": { "category": { "value": "electronics" } } } }

Create another query using wildcard to return docs whose name includes "phone". This returns only 2 documents.

GET products/_search { "query": { "wildcard": { "name": { "value": "*phone*" } } } }

Create another query using range that returns docs with any price less than $500. This returns 3 documents.

GET products/_search { "query": { "range": { "price": { "lt": 500 } } } }

Combine the above into one bool query with a single must that contains the three queries. This will return the 2 matching documents.

GET products/_search { "query": { "bool": { "must": [ { "term": { "category": { "value": "electronics" } } }, { "wildcard": { "name": { "value": "*phone*" } } }, { "range": { "price": { "lt": 500 } } } ] } } }
Test
- The search results should include the following documents:
- Smartphone X
- Headphones Z
Considerations
- The term query is used for matches on the category field.
- The wildcard query is used for matches on the name field.
- The range query is used to filter out documents based on price.
- The bool.must query combines these conditions using the specified occurrence types.
Clean-up (optional)
Delete the index
DELETE products
Documentation
Example 4: Creating a Boolean search for e-commerce products
Requirements
- Create an index named “products”.
- Create at least 4 documents with varying categories, prices, ratings, and brands.
- Create a boolean query:
  - Use must to return just electronics products priced at more than $500.
  - Use must_not to exclude products with a rating less than 4.
  - Use filter to keep only Apple products.
Steps
Open the Kibana Console or use a REST client.
Create the "products" index.

PUT products { "mappings": { "properties": { "brand": { "type": "text" }, "category": { "type": "keyword" }, "name": { "type": "text" }, "price": { "type": "long" }, "rating": { "type": "float" } } } }

Add some sample documents using the _bulk endpoint.

POST /products/_bulk
{"index":{"_id":1}}
{"name":"Laptop","category":"Electronics","price":1200,"rating":4.5,"brand":"Apple"}
{"index":{"_id":2}}
{"name":"Smartphone","category":"Electronics","price":800,"rating":4.2,"brand":"Samsung"}
{"index":{"_id":3}}
{"name":"Sofa","category":"Furniture","price":1000,"rating":3.8,"brand":"IKEA"}
{"index":{"_id":4}}
{"name":"Headphones","category":"Electronics","price":150,"rating":2.5,"brand":"Sony"}
{"index":{"_id":5}}
{"name":"Dining Table","category":"Furniture","price":600,"rating":4.1,"brand":"Ashley"}

Create a term query that only matches the category "Electronics". Unlike the previous example, category is mapped as keyword, so the value is not analyzed and must match "Electronics" exactly, including case. This returns 3 documents.

GET products/_search { "query": { "term": { "category": { "value": "Electronics" } } } }

Create a range query to return products whose price is greater than $500. This should return 4 documents (why?).

GET products/_search { "query": { "range": { "price": { "gte": 500 } } } }

Create another range query to return products with a rating less than 4. This will return 2 documents.

GET products/_search { "query": { "range": { "rating": { "lt": 4 } } } }

Create another term query to return only Apple branded products. Since brand is a text field, the lowercase value "apple" matches the analyzed token of "Apple". This will return 1 document.

GET products/_search { "query": { "term": { "brand": { "value": "apple" } } } }

Assemble the bool query by placing each query in its appropriate must, must_not, and filter node. This will return 1 document (the Apple Laptop).

GET products/_search { "query": { "bool": { "must": [ { "term": { "category": { "value": "Electronics" } } }, { "range": { "price": { "gte": 500 } } } ], "must_not": [ { "range": { "rating": { "lt": 4 } } } ], "filter": [ { "term": { "brand": { "value": "apple" } } } ] } } }
Test
- Check the response from the search query to ensure that it returns the expected documents
- products in the “Electronics” category
- a price greater than $500
- excluding products with a rating less than 4
- from the brand “Apple”
Considerations
- The filter clause is used to include only documents with the brand “Apple”.
Clean-up (optional)
Delete the index
DELETE products
Documentation
2.3 Task: Create an asynchronous search
Asynchronous search uses the same parameters as a regular search, plus a few async-specific options; these, along with options like size used in the solution below, are covered in the Async search documentation. There is only one example here, as you can look up the other options as needed during the exam.
Example 1: Executing an asynchronous search on a large log index
Requirements
- An Elasticsearch index named “logs” with a large number of documents (e.g., millions of log entries).
- Perform a search on the “logs” index that may take a long time to complete due to the size of the index.
- Retrieve the search results asynchronously without blocking the client.
Steps
Open the Kibana Console or use a REST client.
If you were submitting a normal/synchronous search to an index called logs, your request would look something like this:

POST /logs/_search { "query": { "match_all": {} }, "size": 10000 }

To turn your request into an asynchronous search request, change _search to _async_search:

POST /logs/_async_search { "query": { "match_all": {} }, "size": 10000 }

This request will return an id and a response object containing partial results if available.

Check the status of the asynchronous search using the id.

GET /_async_search/status/{id}

Retrieve the search results using the id.

GET /_async_search/{id}
Test
Index a large number of sample log documents or use an index with a large number of documents.

Execute the asynchronous search request and store the returned id.

Periodically check the status of the search using the id and the /_async_search/status/{id} endpoint.

GET /_async_search/status/{id}

Once the search is complete, retrieve the final results using the id and the /_async_search/{id} endpoint.

GET /_async_search/{id}

Considerations
- The _async_search endpoint is used to submit an asynchronous search request.
- The id returned by the initial request is used to check the status and retrieve the final results.
- Asynchronous search is useful for long-running searches on large datasets, as it doesn’t block the client while the search is being processed.
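The client-side polling loop implied by the steps above can be sketched generically in Python. This is a pattern sketch, not a client library: check_status stands in for an HTTP call to GET /_async_search/status/{id}, and the fake responses simulate a search that finishes on the third poll.

```python
import time

def wait_for_search(check_status, poll_interval=0.01, timeout=5.0):
    """Poll check_status() until is_running is false, then return the final status."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = check_status()          # stand-in for GET /_async_search/status/{id}
        if not status["is_running"]:
            return status
        time.sleep(poll_interval)        # back off between polls instead of busy-waiting
    raise TimeoutError("async search did not complete in time")

# Fake status endpoint: reports "running" twice, then complete (id is illustrative).
responses = iter([
    {"is_running": True},
    {"is_running": True},
    {"is_running": False, "id": "abc123"},
])
final = wait_for_search(lambda: next(responses))
print(final["id"])
```

Once the status reports the search is no longer running, the stored id would be used to fetch the full results from GET /_async_search/{id}.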
Clean-up (optional)
If you created an index (for example, logs) for this example, you might want to delete it.

DELETE logs
Documentation
2.4 Task: Write and execute metric and bucket aggregations
Example 1: Creating Metric and Bucket Aggregations for Product Prices
Requirements
- Create an index called product_prices.
- Index at least four documents using the _bulk endpoint.
- Execute metric and bucket aggregations in a single request:
  - bucket the category field
  - calculate the average price per bucket
  - find the maximum price per bucket
  - find the minimum price per bucket
Steps
Open the Kibana Console or use a REST client.

Create an index with the following schema (needed for the aggregations to work properly).

PUT product_prices { "mappings": { "properties": { "product": { "type": "text" }, "category": { "type": "keyword" }, "price": { "type": "double" } } } }

Index documents.

POST /product_prices/_bulk
{ "index": { "_id": "1" } }
{ "product": "Elasticsearch Guide", "category": "Books", "price": 29.99 }
{ "index": { "_id": "2" } }
{ "product": "Advanced Elasticsearch", "category": "Books", "price": 39.99 }
{ "index": { "_id": "3" } }
{ "product": "Elasticsearch T-shirt", "category": "Apparel", "price": 19.99 }
{ "index": { "_id": "4" } }
{ "product": "Elasticsearch Mug", "category": "Apparel", "price": 12.99 }

Execute a simple aggregation (should return 2 buckets).

GET product_prices/_search { "size": 0, "aggs": { "category_buckets": { "terms": { "field": "category" } } } }

Add and execute a single sub-aggregation to determine the average price per category (bucket).

GET product_prices/_search { "size": 0, "aggs": { "category_buckets": { "terms": { "field": "category" }, "aggs": { "avg_price": { "avg": { "field": "price" } } } } } }

Add min and max sub-aggregations and execute the query.

GET product_prices/_search { "size": 0, "aggs": { "category_buckets": { "terms": { "field": "category" }, "aggs": { "avg_price": { "avg": { "field": "price" } }, "min_price" : { "min": { "field": "price" } }, "max_price": { "max": { "field": "price" } } } } } }
Test
Verify the index creation.

GET /product_prices

Verify the documents have been indexed.

GET /product_prices/_search

Execute the aggregation query and verify the results.

{ ... "aggregations": { "category_buckets": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [ { "key": "Apparel", "doc_count": 2, "avg_price": { "value": 16.49 }, "min_price": { "value": 12.99 }, "max_price": { "value": 19.99 } }, { "key": "Books", "doc_count": 2, "avg_price": { "value": 34.99 }, "min_price": { "value": 29.99 }, "max_price": { "value": 39.99 } } ] } } }
Considerations
- The category field must be of type keyword.
- The terms aggregation creates buckets for each unique category.
- The avg, min, and max sub-aggregations calculate the average, minimum, and maximum prices within each category bucket.
- Setting size to 0 ensures that only aggregation results are returned, not individual documents.
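What the terms bucket plus avg/min/max sub-aggregations compute can be reproduced in a few lines of plain Python over the sample data, which is a handy way to sanity-check the expected response:

```python
from collections import defaultdict

# (category, price) pairs from the bulk request above.
docs = [("Books", 29.99), ("Books", 39.99), ("Apparel", 19.99), ("Apparel", 12.99)]

buckets = defaultdict(list)
for category, price in docs:        # terms aggregation on "category"
    buckets[category].append(price)

stats = {
    cat: {
        "avg_price": round(sum(prices) / len(prices), 2),  # avg sub-aggregation
        "min_price": min(prices),                          # min sub-aggregation
        "max_price": max(prices),                          # max sub-aggregation
    }
    for cat, prices in buckets.items()
}
print(stats)
```

The numbers should line up with the expected response in the Test section: Books averages 34.99 and Apparel averages 16.49.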
Clean-up (optional)
Delete the index.
DELETE product_prices
Documentation
Example 2: Creating Metric and Bucket Aggregations for Website Traffic
Requirements
- Create a new index with four documents representing website traffic data.
- Aggregate the following:
- Group traffic by country.
- Calculate the total page views.
- Calculate the average page views per country.
Steps
Open the Kibana Console or use a REST client.
Create a new index.

PUT traffic { "mappings": { "properties": { "country": { "type": "keyword" }, "page_views": { "type": "long" } } } }

Add four documents representing website traffic data.

POST /traffic/_bulk
{"index":{}}
{"country":"USA","page_views":100}
{"index":{}}
{"country":"USA","page_views":200}
{"index":{}}
{"country":"Canada","page_views":50}
{"index":{}}
{"country":"Canada","page_views":75}

Execute the bucket aggregation for country (should return 2 buckets).

GET traffic/_search { "size": 0, "aggs": { "country_bucket": { "terms": { "field": "country" } } } }

Add the sum aggregation for total page_views (a single value across all documents).

GET traffic/_search { "size": 0, "aggs": { "country_bucket": { "terms": { "field": "country" } }, "total_page_views": { "sum": { "field": "page_views" } } } }

Add a sub-aggregation for average page_views per country (should appear in each of the 2 buckets).

GET traffic/_search { "size": 0, "aggs": { "country_bucket": { "terms": { "field": "country" }, "aggs": { "avg_page_views": { "avg": { "field": "page_views" } } } }, "total_page_views": { "sum": { "field": "page_views" } } } }
Test
Verify the index creation.

GET /traffic

Verify the documents have been indexed.

GET /traffic/_search

Verify that the total page views are calculated correctly (should be 425).

GET /traffic/_search { "aggs": { "total_page_views": { "sum": { "field": "page_views" } } } }

Verify that the traffic is grouped correctly by country and that the average page views are calculated, using the final query from the Steps section.

GET /traffic/_search { "size": 0, "aggs": { "country_bucket": { "terms": { "field": "country" }, "aggs": { "avg_page_views": { "avg": { "field": "page_views" } } } }, "total_page_views": { "sum": { "field": "page_views" } } } }

Response:

{ ... "aggregations": { "country_bucket": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [ { "key": "Canada", "doc_count": 2, "avg_page_views": { "value": 62.5 } }, { "key": "USA", "doc_count": 2, "avg_page_views": { "value": 150 } } ] }, "total_page_views": { "value": 425 } } }
Considerations
- The country field must be of type keyword.
- The terms bucket aggregation is used to group traffic by country.
- The sum metric aggregation is used to calculate the total page views.
- The avg metric aggregation is used to calculate the average page views per country.
Clean-up (optional)
Delete the index.
DELETE traffic
Documentation
Example 3: Creating Metric and Bucket Aggregations for Analyzing Employee Salaries
Requirements
- An Elasticsearch index named employees with documents containing the fields name, department, position, salary, and hire_date.
- Calculate the average salary across all employees.
- Group the employees by department.
- Calculate the maximum salary for each department.
Steps
Open the Kibana Console or use a REST client.
Create an index with the proper mapping for department, as we want to bucket by it.

PUT employees { "mappings": { "properties": { "name": { "type": "text" }, "department": { "type": "keyword" }, "position": { "type": "text" }, "salary": { "type": "integer" }, "hire_date": { "type": "date" } } } }

Index sample employee documents using the _bulk endpoint.

POST /employees/_bulk
{"index":{"_id":1}}
{"name":"John Doe", "department":"Engineering", "position":"Software Engineer", "salary":80000, "hire_date":"2018-01-15"}
{"index":{"_id":2}}
{"name":"Jane Smith", "department":"Engineering", "position":"DevOps Engineer", "salary":75000, "hire_date":"2020-03-01"}
{"index":{"_id":3}}
{"name":"Bob Johnson", "department":"Sales", "position":"Sales Manager", "salary":90000, "hire_date":"2016-06-01"}
{"index":{"_id":4}}
{"name":"Alice Williams", "department":"Sales", "position":"Sales Representative", "salary":65000, "hire_date":"2019-09-15"}

Calculate the average salary of all employees.

GET employees/_search { "size": 0, "aggs": { "avg_salary_all_emps": { "avg": { "field": "salary" } } } }

Add a grouping of the employees by department.

GET employees/_search { "size": 0, "aggs": { "avg_salary_all_emps": { "avg": { "field": "salary" } }, "employees_by_department" : { "terms": { "field": "department" } } } }

Add a calculation of the highest salary per department.

GET employees/_search { "size": 0, "aggs": { "avg_salary_all_emps": { "avg": { "field": "salary" } }, "employees_by_department": { "terms": { "field": "department" }, "aggs": { "max_salary_by_department": { "max": { "field": "salary" } } } } } }
Test
Verify the index creation.

GET /employees

Verify the documents have been indexed.

GET /employees/_search

Execute the aggregation query, and it should return the following:

{ ... "aggregations": { "avg_salary_all_emps": { "value": 77500 }, "employees_by_department": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [ { "key": "Engineering", "doc_count": 2, "max_salary_by_department": { "value": 80000 } }, { "key": "Sales", "doc_count": 2, "max_salary_by_department": { "value": 90000 } } ] } } }
Considerations
- The department field must be of type keyword.
- The size parameter is set to 0 to exclude hit documents from the response.
- The avg_salary_all_emps metric aggregation calculates the average of the salary field across all documents.
- The employees_by_department bucket aggregation groups the documents by the department field.
- The max_salary_by_department sub-aggregation calculates the maximum value of the salary field for each department.
Clean-up (optional)
Delete the index.
DELETE employees
Documentation
2.5 Task: Write and execute aggregations that contain subaggregations
Example 1: Creating aggregations and sub-aggregations for Product Categories and Prices
Requirements
- Create aggregations:
  - by category
  - sub-aggregation of average price by category
  - price ranges: $0 to $20, $20 to $40, $40 and up
Steps
Open the Kibana Console or use a REST client.
Create an index.

PUT /product_index { "mappings": { "properties": { "product": { "type": "text" }, "category": { "type": "keyword" }, "price": { "type": "double" } } } }

Index some sample documents.

POST /product_index/_bulk
{ "index": { "_id": "1" } }
{ "product": "Elasticsearch Guide", "category": "Books", "price": 29.99 }
{ "index": { "_id": "2" } }
{ "product": "Advanced Elasticsearch", "category": "Books", "price": 39.99 }
{ "index": { "_id": "3" } }
{ "product": "Elasticsearch T-shirt", "category": "Apparel", "price": 19.99 }
{ "index": { "_id": "4" } }
{ "product": "Elasticsearch Mug", "category": "Apparel", "price": 12.99 }

Create an aggregation by category.

GET product_index/_search { "size": 0, "aggs": { "category_buckets": { "terms": { "field": "category" } } } }

Create a sub-aggregation of average price.

GET product_index/_search { "size": 0, "aggs": { "category_buckets": { "terms": { "field": "category" }, "aggs": { "average_price": { "avg": { "field": "price" } } } } } }

Create a sub-aggregation of price ranges ($0-$20, $20-$40, $40 and up).

GET product_index/_search { "size": 0, "aggs": { "category_buckets": { "terms": { "field": "category" }, "aggs": { "average_price": { "avg": { "field": "price" } }, "price_ranges" : { "range": { "field": "price", "ranges": [ { "to": 20 }, { "from": 20, "to": 40 }, { "from": 40 } ] } } } } } }
Test
Verify the index creation and mappings.

GET /product_index

Verify the test documents are in the index.

GET /product_index/_search

Execute the aggregation query and confirm the results.

{ ... "aggregations": { "category_buckets": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [ { "key": "Apparel", "doc_count": 2, "average_price": { "value": 16.49 }, "price_ranges": { "buckets": [ { "key": "*-20.0", "to": 20, "doc_count": 2 }, { "key": "20.0-40.0", "from": 20, "to": 40, "doc_count": 0 }, { "key": "40.0-*", "from": 40, "doc_count": 0 } ] } }, { "key": "Books", "doc_count": 2, "average_price": { "value": 34.99 }, "price_ranges": { "buckets": [ { "key": "*-20.0", "to": 20, "doc_count": 0 }, { "key": "20.0-40.0", "from": 20, "to": 40, "doc_count": 2 }, { "key": "40.0-*", "from": 40, "doc_count": 0 } ] } } ] } } }
Considerations
- Setting size to 0 ensures the search doesn't return any documents, focusing solely on the aggregations.
- The category field must be of type keyword.
- The terms aggregation creates buckets for each unique category.
- The avg sub-aggregation calculates the average price within each category bucket.
- The range sub-aggregation divides the prices into specified ranges within each category bucket.
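The range sub-aggregation's bucketing rule ("from" is inclusive, "to" is exclusive) can be sketched in plain Python over the sample prices:

```python
# The three ranges from the query: [*, 20), [20, 40), [40, *); None means unbounded.
ranges = [(None, 20), (20, 40), (40, None)]
prices = {"Books": [29.99, 39.99], "Apparel": [19.99, 12.99]}

def bucket_counts(values):
    # Count how many values fall into each range; "from" inclusive, "to" exclusive.
    return [
        sum((lo is None or v >= lo) and (hi is None or v < hi) for v in values)
        for lo, hi in ranges
    ]

result = {cat: bucket_counts(v) for cat, v in prices.items()}
print(result)  # per-category doc counts for the three price ranges
```

Both Apparel prices land below $20 and both Books prices land in the $20-$40 bucket, matching the doc_count values in the expected response above.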
Clean-up (optional)
Delete the index.
DELETE product_index
Documentation
Example 2: Creating aggregations and sub-aggregations for Employee Data Analysis
Requirements
- Use the terms aggregation to group employees by department.
- Use the avg sub-aggregation to calculate the average salary per department.
- Use the filters sub-aggregation to group employees by job_title.
Steps
Open the Kibana Console or use a REST client.
Create a new index called employees.

PUT employees { "mappings": { "properties": { "department": { "type": "keyword" }, "salary": { "type": "integer" }, "job_title": { "type": "keyword" } } } }

Insert four documents representing employee data.

POST /employees/_bulk
{"index":{}}
{"department":"Sales","salary":100000,"job_title":"Manager"}
{"index":{}}
{"department":"Sales","salary":80000,"job_title":"Representative"}
{"index":{}}
{"department":"Marketing","salary":120000,"job_title":"Manager"}
{"index":{}}
{"department":"Marketing","salary":90000,"job_title":"Coordinator"}

Execute an aggregation by department.

GET employees/_search { "size": 0, "aggs": { "employees_by_department": { "terms": { "field": "department" } } } }

Add the sub-aggregation for average salary by department.

GET employees/_search { "size": 0, "aggs": { "employees_by_department": { "terms": { "field": "department" }, "aggs": { "avg_salary_by_department": { "avg": { "field": "salary" } } } } } }

Add a filters sub-aggregation for each job_title.

GET employees/_search { "size": 0, "aggs": { "employees_by_department": { "terms": { "field": "department" }, "aggs": { "avg_salary_by_department": { "avg": { "field": "salary" } }, "employees_by_title": { "filters": { "filters": { "Managers": { "term": { "job_title": "Manager" } }, "Representative" : { "term": { "job_title": "Representative" } }, "Coordinator" : { "term": { "job_title": "Coordinator" } } } } } } } } }
Test
Verify the index creation and mappings.

GET /employees

Verify the test documents are in the index.

GET /employees/_search

Verify that the employees are grouped correctly by department and job title, and that the average salary is calculated correctly for each department.

{ ... "aggregations": { "employees_by_department": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [ { "key": "Marketing", "doc_count": 2, "avg_salary_by_department": { "value": 105000 }, "employees_by_title": { "buckets": { "Coordinator": { "doc_count": 1 }, "Managers": { "doc_count": 1 }, "Representative": { "doc_count": 0 } } } }, { "key": "Sales", "doc_count": 2, "avg_salary_by_department": { "value": 90000 }, "employees_by_title": { "buckets": { "Coordinator": { "doc_count": 0 }, "Managers": { "doc_count": 1 }, "Representative": { "doc_count": 1 } } } } ] } } }
Considerations
- The `department` field must be of type `keyword`.
- Setting `size` to 0 ensures the search doesn’t return any documents, focusing solely on the aggregations.
- The `terms` aggregation is used to group employees by department.
- The `avg` sub-aggregation is used to calculate the average salary per department.
- The `filters` sub-aggregation is used to group employees by `job_title`.
Clean-up (optional)
Delete the index.
DELETE employees
Documentation
Example 3: Creating aggregations and sub-aggregations for application logs by Hour and Log Level
Requirements
- Analyze application logs stored in an Elasticsearch index named `app-logs`.
- Use a `date_histogram` aggregation to group logs by the hour.
- Within each hour bucket, create a sub-aggregation to group logs by their severity level (`log_level`).
Steps
Open the Kibana Console or use a REST client.
Create a new index called app-logs.
PUT app-logs { "mappings": { "properties": { "@timestamp": { "type": "date" }, "log_level": { "type": "keyword" }, "message": { "type": "text" } } } }

Insert sample data.
POST /app-logs/_bulk
{"index":{"_id":"1"}}
{"@timestamp":"2024-05-24T10:30:00","log_level":"INFO","message":"Application started successfully."}
{"index":{"_id":"2"}}
{"@timestamp":"2024-05-24T11:15:00","log_level":"WARNING","message":"Potential memory leak detected."}
{"index":{"_id":"3"}}
{"@timestamp":"2024-05-24T12:00:00","log_level":"ERROR","message":"Database connection failed."}
{"index":{"_id":"4"}}
{"@timestamp":"2024-05-24T10:45:00","log_level":"DEBUG","message":"Processing user request."}

Use a `date_histogram` aggregation to group logs by the hour.

GET app-logs/_search { "size": 0, "aggs": { "logs_by_the_hour": { "date_histogram": { "field": "@timestamp", "fixed_interval": "1h" } } } }

Within each hour bucket, create a sub-aggregation to group logs by their severity level (`log_level`).
GET app-logs/_search { "size": 0, "aggs": { "logs_by_the_hour": { "date_histogram": { "field": "@timestamp", "fixed_interval": "1h" }, "aggs": { "log_severity": { "terms": { "field": "log_level" } } } } } }
Test
Verify the index creation and mappings.
GET /app-logs

Verify the test documents are in the index.

GET /app-logs/_search

Run the search query and examine the response.
{ ... "aggregations": { "logs_by_the_hour": { "buckets": [ { "key_as_string": "2024-05-24T10:00:00.000Z", "key": 1716544800000, "doc_count": 2, "log_severity": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [ { "key": "DEBUG", "doc_count": 1 }, { "key": "INFO", "doc_count": 1 } ] } }, { "key_as_string": "2024-05-24T11:00:00.000Z", "key": 1716548400000, "doc_count": 1, "log_severity": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [ { "key": "WARNING", "doc_count": 1 } ] } }, { "key_as_string": "2024-05-24T12:00:00.000Z", "key": 1716552000000, "doc_count": 1, "log_severity": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [ { "key": "ERROR", "doc_count": 1 } ] } } ] } } }
Considerations
- Setting `size` to 0 ensures the search doesn’t return any documents, focusing solely on the aggregations.
- The `date_histogram` aggregation groups documents based on the `@timestamp` field with an interval of one hour.
- The nested `terms` aggregation within the `logs_by_the_hour` aggregation counts the occurrences of each unique `log_level` within each hour bucket.
Clean-up (optional)
Delete the index.
DELETE app-logs
Documentation
Example 4: Finding the Stock with the Highest Daily Volume of the Month
This example is taken from an Elastic webinar presenting a sample question and answer for the Certified Engineer exam. The answer given in the webinar was wrong; as shown below, the problem doesn’t need aggregations at all.
Requirements
- Create a query to find the stock with the highest daily volume for the current month.
Steps
Open the Kibana Console or use a REST client.
Index sample data:
Use the `_bulk` endpoint to index sample stock data. Ensure the data includes fields for `stock_name`, `date`, and `volume`.

POST _bulk
{ "index": { "_index": "stocks", "_id": "1" } }
{ "stock_name": "AAPL", "date": "2024-07-01", "volume": 1000000 }
{ "index": { "_index": "stocks", "_id": "2" } }
{ "stock_name": "AAPL", "date": "2024-07-02", "volume": 1500000 }
{ "index": { "_index": "stocks", "_id": "3" } }
{ "stock_name": "GOOGL", "date": "2024-07-01", "volume": 2000000 }
{ "index": { "_index": "stocks", "_id": "4" } }
{ "stock_name": "GOOGL", "date": "2024-07-02", "volume": 2500000 }
{ "index": { "_index": "stocks", "_id": "5" } }
{ "stock_name": "MSFT", "date": "2024-07-01", "volume": 3000000 }
{ "index": { "_index": "stocks", "_id": "6" } }
{ "stock_name": "MSFT", "date": "2024-07-02", "volume": 3500000 }
{ "index": { "_index": "stocks", "_id": "7" } }
{ "stock_name": "TSLA", "date": "2024-07-01", "volume": 4000000 }
{ "index": { "_index": "stocks", "_id": "8" } }
{ "stock_name": "TSLA", "date": "2024-07-02", "volume": 4500000 }
{ "index": { "_index": "stocks", "_id": "9" } }
{ "stock_name": "AMZN", "date": "2024-07-01", "volume": 5000000 }
{ "index": { "_index": "stocks", "_id": "10" } }
{ "stock_name": "AMZN", "date": "2024-07-02", "volume": 5500000 }
Create the query. The stocks in the index are all from July, but you want just the stocks for the latest month. Update the above dates so the query will work for you.

GET stocks/_search { "query": { "range": { "date": { "gte": "now/M", "lte": "now" } } } }

The results of the query should be all the stocks from the current month. Now sort those stocks by their volume and display the top pick; a `size` of 1 keeps only the top hit.

GET stocks/_search { "size": 1, "query": { "range": { "date": { "gte": "now/M", "lte": "now" } } }, "sort": [ { "volume": { "order": "desc" } } ] }
Test
Verify the index creation and mappings.
GET /stocks

Verify the test documents are in the index.

GET /stocks/_search

Run the query and confirm that the stock with the highest daily volume of the month is displayed.
{ ... "hits": [ { "_index": "stocks", "_id": "10", "_score": null, "_source": { "stock_name": "AMZN", "date": "2024-07-02", "volume": 5500000 }, "sort": [ 5500000 ] } ] } }
Considerations
- The `range` clause returned the stocks for the current month.
- The `sort` clause brought the highest volume of any stock to the top, and a `size` of 1 displayed that one record.
Clean-up (Optional)
Delete the `stocks` index to clean up the data:

DELETE /stocks
Documentation
Example 5: Aggregating Sales Data by Month with Sub-Aggregation of Total Sales Value
Requirements
- Aggregate e-commerce sales data by month, creating at least 12 date buckets.
- Perform a sub-aggregation to calculate the total sales value within each month.
Steps
Index Sample Sales Documents Using the `_bulk` Endpoint:

POST /sales_data/_bulk
{ "index": { "_id": "1" } }
{ "order_date": "2023-01-15", "product": "Yoo-hoo Beverage", "quantity": 10, "price": 1.99 }
{ "index": { "_id": "2" } }
{ "order_date": "2023-02-20", "product": "Apple iPhone 12", "quantity": 1, "price": 799.99 }
{ "index": { "_id": "3" } }
{ "order_date": "2023-03-05", "product": "Choco-Lite Bar", "quantity": 25, "price": 0.99 }
{ "index": { "_id": "4" } }
{ "order_date": "2023-04-10", "product": "Nike Air Max 270", "quantity": 3, "price": 150.00 }
{ "index": { "_id": "5" } }
{ "order_date": "2023-05-18", "product": "Samsung Galaxy S21", "quantity": 2, "price": 699.99 }
{ "index": { "_id": "6" } }
{ "order_date": "2023-06-22", "product": "Yoo-hoo Beverage", "quantity": 15, "price": 1.99 }
{ "index": { "_id": "7" } }
{ "order_date": "2023-07-03", "product": "Choco-Lite Bar", "quantity": 30, "price": 0.99 }
{ "index": { "_id": "8" } }
{ "order_date": "2023-08-25", "product": "Apple iPhone 12", "quantity": 1, "price": 799.99 }
{ "index": { "_id": "9" } }
{ "order_date": "2023-09-10", "product": "Nike Air Max 270", "quantity": 4, "price": 150.00 }
{ "index": { "_id": "10" } }
{ "order_date": "2023-10-15", "product": "Samsung Galaxy S21", "quantity": 1, "price": 699.99 }
{ "index": { "_id": "11" } }
{ "order_date": "2023-11-20", "product": "Yoo-hoo Beverage", "quantity": 20, "price": 1.99 }
{ "index": { "_id": "12" } }
{ "order_date": "2023-12-30", "product": "Choco-Lite Bar", "quantity": 50, "price": 0.99 }

Bucket the `order_date` Using a Date Histogram Aggregation with Sub-Aggregation:

Use a `date_histogram` to create monthly buckets and a `sum` sub-aggregation to calculate total sales within each month.

GET /sales_data/_search { "size": 0, "aggs": { "sales_over_time": { "date_histogram": { "field": "order_date", "calendar_interval": "month", "format": "yyyy-MM" }, "aggs": { "total_sales": { "sum": { "field": "total_value" } } } } } }
Calculate the Total Value:
Before running the above aggregation, ensure that each document includes a `total_value` field. You could either compute it on the client side, or compute it dynamically using an ingest pipeline or a script.

For simplicity, let’s assume the `total_value` is calculated as `quantity * price`:

POST /sales_data/_update_by_query { "script": { "source": "ctx._source.total_value = ctx._source.quantity * ctx._source.price" }, "query": { "match_all": {} } }
Test
- Run the above `GET /sales_data/_search` query.
- Check the output to see 12 date buckets, one for each month, with the `total_sales` value for each bucket.
Considerations
- The `date_histogram` aggregation is ideal for grouping records by time intervals such as months, weeks, or days.
- The `sum` sub-aggregation allows you to calculate the total value of sales within each date bucket.
- Ensure that the `total_value` field is correctly calculated, as this impacts the accuracy of the sub-aggregation.
Clean-up (Optional)
Delete the `sales_data` index to clean up the data:

DELETE /sales_data
Documentation
2.6 Task: Write and execute a query that searches across multiple clusters
If you are running your instance of Elasticsearch locally, and need to create an additional cluster so that you can run these examples, go to the Appendix: Adding a Cluster to your Elasticsearch Instance for information on how to set up an additional single-node cluster.
Example 1: Creating search queries for Products in Multiple Clusters
Requirements
- Set up two single-node clusters on localhost or Elastic Cloud.
- Create an index in each cluster.
- Index at least four documents in each cluster using the _bulk endpoint.
- Configure cross-cluster search.
- Execute a cross-cluster search query.
Steps
Open the Kibana Console or use a REST client.
Set up multiple clusters on localhost.
Assume you have two clusters, es01 and es02, set up as directed in the Appendix.
In the local cluster, configure communication between the clusters by updating the local cluster settings.
PUT /_cluster/settings { "persistent": { "cluster": { "remote": { "es01": { "seeds": [ "es01:9300" ], "skip_unavailable": true }, "es02": { "seeds": [ "es02:9300" ], "skip_unavailable": false } } } } }
- Create a product index in each cluster.
From the Kibana Console (es01)
PUT /products { "mappings": { "properties": { "product": { "type": "text" }, "category": { "type": "keyword" }, "price": { "type": "double" } } } }

From the command line (es02).
curl -u elastic:[your password here] -X PUT "http://localhost:9201/products?pretty" -H 'Content-Type: application/json' -d' { "mappings": { "properties": { "product": { "type": "text" }, "category": { "type": "keyword" }, "price": { "type": "double" } } } }'
- Index product documents into each cluster.
For es01:
POST /products/_bulk
{ "index": { "_id": "1" } }
{ "product": "Elasticsearch Guide", "category": "Books", "price": 29.99 }
{ "index": { "_id": "2" } }
{ "product": "Advanced Elasticsearch", "category": "Books", "price": 39.99 }
{ "index": { "_id": "3" } }
{ "product": "Elasticsearch T-shirt", "category": "Apparel", "price": 19.99 }
{ "index": { "_id": "4" } }
{ "product": "Elasticsearch Mug", "category": "Apparel", "price": 12.99 }

For es02 through the command line (note that the final single quote is on a line by itself):
curl -u elastic:[your password here] -X POST "http://localhost:9201/products/_bulk?pretty" -H 'Content-Type: application/json' -d'
{ "index": { "_id": "5" } }
{ "product": "Elasticsearch Stickers", "category": "Accessories", "price": 4.99 }
{ "index": { "_id": "6" } }
{ "product": "Elasticsearch Notebook", "category": "Stationery", "price": 7.99 }
{ "index": { "_id": "7" } }
{ "product": "Elasticsearch Pen", "category": "Stationery", "price": 3.49 }
{ "index": { "_id": "8" } }
{ "product": "Elasticsearch Hoodie", "category": "Apparel", "price": 45.99 }
'
- Configure Cross-Cluster Search (CCS).
In the local cluster, ensure the remote cluster is configured by checking the settings:
GET /_cluster/settings?include_defaults=true&filter_path=defaults.cluster.remote
Execute a Cross-Cluster Search query.
GET /products,es02:products/_search { "query": { "match": { "product": "Elasticsearch" } } }
Test
Verify the index creation.
GET /products

From the command line execute:

curl -u elastic:[your password here] -X GET "http://localhost:9201/products?pretty"

Verify that the documents have been indexed.

GET /products/_search
GET /es02:products/_search

Ensure the remote cluster is correctly configured and visible from the local cluster.

GET /_remote/info

Execute a Cross-Cluster Search query.
GET /products,es02:products/_search { "query": { "match": { "product": "Elasticsearch" } } }
Considerations
- Cross-cluster search is useful for querying data across multiple Elasticsearch clusters, providing a unified search experience.
- Ensure the remote cluster settings are correctly configured in the cluster settings.
- Properly handle the index names to avoid conflicts and ensure clear distinction between clusters.
Clean-up (optional)
Delete the es01 index.
DELETE products

Delete the es02 index from the command line.
curl -u elastic:[your password here] -X DELETE "http://localhost:9201/products?pretty"
Documentation
2.7 Task: Write and execute a search that utilizes a runtime field
Example 1: Creating search queries for products with a runtime field for discounted prices
Requirements
- Create an index.
- Index four documents.
- Define a runtime field.
- Execute a search query that creates a query-time runtime field with a 10% discount.
Steps
Open the Kibana Console or use a REST client.
Create an index.
PUT /product_index
{
"mappings": {
"properties": {
"product": {
"type": "text"
},
"price": {
"type": "double"
},
"category": {
"type": "keyword"
}
}
}
}

- Index some documents.
POST /product_index/_bulk
{ "index": { "_id": "1" } }
{ "product": "Elasticsearch Guide", "price": 29.99, "category": "Books" }
{ "index": { "_id": "2" } }
{ "product": "Advanced Elasticsearch", "price": 39.99, "category": "Books" }
{ "index": { "_id": "3" } }
{ "product": "Elasticsearch T-shirt", "price": 19.99, "category": "Apparel" }
{ "index": { "_id": "4" } }
{ "product": "Elasticsearch Mug", "price": 12.99, "category": "Apparel" }Define a query-time runtime field to return a discounted price.
GET product_index/_search { "query": { "match_all": {} }, "fields": [ "product", "price", "discounted_price" ], "runtime_mappings": { "discounted_price": { "type": "double", "script": { "source": "emit(doc['price'].value * 0.9)" } } } }
Test
Verify the creation of the index and its mappings.
GET /product_index

Verify the indexed documents.

GET /product_index/_search

Execute the query and confirm the `discounted_price`.

{ ... "hits": [ { ... "fields": { "product": [ "Elasticsearch Guide" ], "price": [ 29.99 ], "discounted_price": [ 26.991 ] } }, { ... "fields": { "product": [ "Advanced Elasticsearch" ], "price": [ 39.99 ], "discounted_price": [ 35.991 ] } }, { ... "fields": { "product": [ "Elasticsearch T-shirt" ], "price": [ 19.99 ], "discounted_price": [ 17.991 ] } }, { ... "fields": { "product": [ "Elasticsearch Mug" ], "price": [ 12.99 ], "discounted_price": [ 11.691 ] } } ] } }
Considerations
- Runtime fields allow for dynamic calculation of field values at search time, useful for complex calculations or when the field values are not stored.
- The script in the runtime field calculates the discounted price by applying a 10% discount to the price field.
Clean-up (optional)
Delete the index.
DELETE product_index
Documentation
Example 2: Creating search queries for employees with a calculated total salary
In this example, the runtime field is defined in the index mapping itself rather than in the search request, so its script runs when documents are indexed: the salary field is read at index time to compute and index the value of the total_salary field.
Requirements
- An index (`employees`) with documents containing employee information (`name`, `department`, `salary`) and a runtime field (`total_salary`) to calculate the total salary of each employee.
- A search query to retrieve employees with a total salary above $65,000.
Steps
Open the Kibana Console or use a REST client.
Create the employees index with a mapping for the runtime field.
PUT employees { "mappings": { "properties": { "name": { "type": "text" }, "department": { "type": "text" }, "salary": { "type": "integer" }, "total_salary": { "type": "long", "script": { "source": "emit(doc['salary'].value * 12)" } } } } }

Index some documents that contain a monthly salary.

POST /employees/_bulk
{ "index": { "_id": "1" } }
{ "name": "John Doe", "department": "Sales", "salary": 4000 }
{ "index": { "_id": "2" } }
{ "name": "Jane Smith", "department": "Marketing", "salary": 6000 }
{ "index": { "_id": "3" } }
{ "name": "Bob Johnson", "department": "IT", "salary": 7000 }
{ "index": { "_id": "4" } }
{ "name": "Alice Brown", "department": "HR", "salary": 5000 }

Execute a search query with a runtime field.
GET employees/_search { "query": { "range": { "total_salary": { "gte": 65000 } } }, "fields": [ "total_salary" ] }
Test
Verify the creation of the index and its mappings.
GET /employees

Verify the indexed documents.

GET /employees/_search

Execute the query and verify the search results contain only employees with a total salary above 65000.
{ ... "hits": [ { "_index": "employees", "_id": "2", "_score": 1, "_source": { "name": "Jane Smith", "department": "Marketing", "salary": 6000 }, "fields": { "total_salary": [ 72000 ] } }, { "_index": "employees", "_id": "3", "_score": 1, "_source": { "name": "Bob Johnson", "department": "IT", "salary": 7000 }, "fields": { "total_salary": [ 84000 ] } } ] } }
Considerations
- Runtime fields are calculated on the fly and can be used in search queries, aggregations, and sorting.
- The script used in the runtime field calculates the total salary by multiplying the monthly salary by 12 months.
Clean-up (optional)
Delete the index.
DELETE employees
Documentation
Example 3: Creating search queries with a runtime field for restaurant data
Requirements
- Create a search query for restaurants in New York City.
- Include the restaurant’s `name`, `cuisine`, and a calculated `rating_score` in the search results.
  - The `rating_score` is calculated by taking the square root of the product of the `review_score` and `number_of_reviews`.
Steps
Open the Kibana Console or use a REST client.
Create a `restaurants` index.

PUT restaurants { "mappings": { "properties": { "city": { "type": "keyword" }, "cuisine": { "type": "text" }, "name": { "type": "text" }, "number_of_reviews": { "type": "long" }, "review_score": { "type": "float" }, "state": { "type": "keyword" } } } }

Index some sample restaurant documents.

POST /restaurants/_bulk
{ "index": { "_id": 1 } }
{ "name": "Tasty Bites", "city": "New York", "state": "NY", "cuisine": "Italian", "review_score": 4.5, "number_of_reviews": 200 }
{ "index": { "_id": 2 } }
{ "name": "Spicy Palace", "city": "Los Angeles", "state": "CA", "cuisine": "Indian", "review_score": 4.2, "number_of_reviews": 150 }
{ "index": { "_id": 3 } }
{ "name": "Sushi Spot", "city": "San Francisco", "state": "CA", "cuisine": "Japanese", "review_score": 4.7, "number_of_reviews": 300 }
{ "index": { "_id": 4 } }
{ "name": "Burger Joint", "city": "Chicago", "state": "IL", "cuisine": "American", "review_score": 3.8, "number_of_reviews": 100 }

Create a query to return restaurants in New York City.

GET restaurants/_search { "query": { "bool": { "must": [ { "term": { "city": { "value": "New York" } } }, { "term": { "state": { "value": "NY" } } } ] } } }

Define a runtime field named `rating_score` to calculate a weighted rating score for New York restaurants.

GET restaurants/_search { "query": { "bool": { "must": [ { "term": { "city": { "value": "New York" } } }, { "term": { "state": { "value": "NY" } } } ] } }, "runtime_mappings": { "rating_score": { "type": "double", "script": { "source": "emit(Math.sqrt(doc['review_score'].value * doc['number_of_reviews'].value))" } } }, "fields": [ "rating_score" ] }
Test
Verify the creation of the index and its mappings.
GET /restaurants

Verify the indexed documents.

GET /restaurants/_search

Execute the query and verify the restaurant name, cuisine type, and the calculated weighted rating score for restaurants located in New York, NY.
{ ... "hits": [ { "_index": "restaurants", "_id": "1", "_score": 2.4079456, "_source": { "name": "Tasty Bites", "city": "New York", "state": "NY", "cuisine": "Italian", "review_score": 4.5, "number_of_reviews": 200 }, "fields": { "rating_score": [ 30 ] } } ] } }
Considerations
- The `runtime_mappings` section defines a new field `rating_score` that calculates a weighted rating score based on the `review_score` and `number_of_reviews` fields.
- The `query` section uses `term` queries to search for restaurants in New York, NY.
- The `fields` section specifies the fields to include in the search results (in this case, the runtime field `rating_score`).
Clean-up (optional)
Delete the index.
DELETE restaurants