OpenSearch (Elasticsearch) Terms Aggregation: Sorting and Pagination Limitations

Bevis Lin
Jun 8, 2024

TL;DR

The terms aggregation does not support pagination. A bucket_sort sub-aggregation can sort and slice the buckets, but it only sees the buckets that the terms aggregation returns, so it is limited by the terms size parameter.

Prepare Data

  1. Run an OpenSearch container [ref]
  2. Generate the flight sample data
  3. Check the schema with the _mapping API
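
The schema check in step 3 matters because a terms aggregation needs a keyword-style field and the min sub-aggregation needs a numeric or date field. The sketch below shows one way to read field types out of a _mapping response. The index name and the trimmed sample response are assumptions for illustration, not output captured from a real cluster.

```python
# Assumed name of the flight sample index; adjust to whatever your cluster uses.
INDEX = "opensearch_dashboards_sample_data_flights"

# Hypothetical, trimmed-down _mapping response; a real one has many more fields.
sample_mapping = {
    INDEX: {
        "mappings": {
            "properties": {
                "OriginCityName": {"type": "keyword"},
                "timestamp": {"type": "date"},
            }
        }
    }
}


def field_types(mapping_response, index_name):
    """Map each top-level field name to its declared type."""
    properties = mapping_response[index_name]["mappings"]["properties"]
    # Object fields carry nested "properties" instead of a "type" key.
    return {name: spec.get("type", "object") for name, spec in properties.items()}


print(field_types(sample_mapping, INDEX))
```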

A Scenario for Terms Aggregation

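We want to count the flights for each origin city, find each city's earliest departure time (min_flight_time), and sort the cities by min_flight_time.

In the request below, the terms aggregation groups the flights by OriginCityName, and each bucket's doc_count is the number of flights. The nested min sub-aggregation computes min_flight_time, which the order clause refers to by name. Because size is not set on terms, only the default 10 buckets are returned.

{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "group_by_origin_city": {
      "terms": {
        "field": "OriginCityName",
        "order": {
          "min_flight_time": "asc"
        }
      },
      "aggs": {
        "min_flight_time": {
          "min": {
            "field": "timestamp"
          }
        }
      }
    }
  }
}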
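
To make the shape of the result concrete, here is a minimal sketch that flattens the buckets of such a response into rows. The helper name bucket_rows and the sample response are hypothetical; the city names, counts, and timestamps are made up and only mirror the structure a terms aggregation with a min sub-aggregation returns.

```python
def bucket_rows(response, agg_name="group_by_origin_city"):
    """Flatten terms buckets into (city, flights, min_flight_time) rows."""
    rows = []
    for bucket in response["aggregations"][agg_name]["buckets"]:
        rows.append({
            "city": bucket["key"],                # the grouped term
            "flights": bucket["doc_count"],       # number of documents in the bucket
            "min_flight_time": bucket["min_flight_time"].get("value_as_string"),
        })
    return rows


# Made-up response fragment, for illustration only.
sample_response = {
    "aggregations": {
        "group_by_origin_city": {
            "buckets": [
                {"key": "City A", "doc_count": 3,
                 "min_flight_time": {"value": 1000.0,
                                     "value_as_string": "1970-01-01T00:00:01.000Z"}},
                {"key": "City B", "doc_count": 8,
                 "min_flight_time": {"value": 2000.0,
                                     "value_as_string": "1970-01-01T00:00:02.000Z"}},
            ]
        }
    }
}

for row in bucket_rows(sample_response):
    print(row)
```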

When I need pagination on top of sorting, I have to set the terms size to a very large integer so that every bucket is retrieved and sorted correctly. A bucket_sort sub-aggregation (custom_paginate below) then slices the requested page out of those buckets with from and size. With a smaller terms size, cities beyond that limit would be dropped before bucket_sort ever sees them.

{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "group_by_origin_city": {
      "terms": {
        "field": "OriginCityName",
        "size": 1000000,
        "order": {
          "min_flight_time": "desc"
        }
      },
      "aggs": {
        "min_flight_time": {
          "min": {
            "field": "timestamp"
          }
        },
        "custom_paginate": {
          "bucket_sort": {
            "from": 0,
            "size": 5
          }
        }
      }
    }
  }
}
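
As a rough sketch, the request body above can be generated per page by computing from as (page - 1) * page_size. The function name build_page_query and its parameters are hypothetical; the aggregation names and fields are the ones used in this post.

```python
import json


def build_page_query(page, page_size, terms_size=1000000, order="desc"):
    """Build the terms + bucket_sort request body for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return {
        "size": 0,
        "query": {"match_all": {}},
        "aggs": {
            "group_by_origin_city": {
                "terms": {
                    "field": "OriginCityName",
                    "size": terms_size,  # must cover every bucket, or pages are wrong
                    "order": {"min_flight_time": order},
                },
                "aggs": {
                    "min_flight_time": {"min": {"field": "timestamp"}},
                    "custom_paginate": {
                        "bucket_sort": {
                            "from": (page - 1) * page_size,
                            "size": page_size,
                        }
                    },
                },
            }
        },
    }


# Third page of 5 buckets: bucket_sort skips the first 10 buckets.
print(json.dumps(build_page_query(3, 5), indent=2))
```

Note that every page request still makes the cluster compute all the buckets; only the response is trimmed.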

Potential Issue

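Requesting a very large size forces the cluster to build and return a very large number of buckets, which may cause memory pressure and degrade performance. Lowering size to avoid that is not safe either: each shard then returns only its own top buckets, so sorting by a sub-aggregation can produce document count errors.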
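
The document count problem can be illustrated with a toy model. The sketch below is a deliberately simplified simulation, not how OpenSearch is implemented: each "shard" is a plain dict with made-up data, returns only its first shard_size buckets ordered by minimum time, and a coordinator merges whatever it receives.

```python
def shard_top_buckets(shard, shard_size):
    """Return the shard's first `shard_size` buckets, ordered by min_time ascending."""
    ordered = sorted(shard.items(), key=lambda kv: (kv[1]["min_time"], kv[0]))
    return dict(ordered[:shard_size])


def merge(shard_results):
    """Combine per-shard buckets: sum doc_count, keep the smallest min_time."""
    merged = {}
    for result in shard_results:
        for city, bucket in result.items():
            current = merged.setdefault(
                city, {"doc_count": 0, "min_time": bucket["min_time"]}
            )
            current["doc_count"] += bucket["doc_count"]
            current["min_time"] = min(current["min_time"], bucket["min_time"])
    return merged


# Made-up data: both cities really have 9 flights across the two shards.
shards = [
    {"Rome": {"doc_count": 5, "min_time": 1}, "Oslo": {"doc_count": 7, "min_time": 9}},
    {"Rome": {"doc_count": 4, "min_time": 8}, "Oslo": {"doc_count": 2, "min_time": 3}},
]

truncated = merge([shard_top_buckets(s, 1) for s in shards])
complete = merge([shard_top_buckets(s, 2) for s in shards])
print("shard_size=1:", truncated)  # counts are too low for both cities
print("shard_size=2:", complete)   # counts are correct
```

In this model the counts are only exact once each shard returns all of its buckets, which is what the very large size is buying.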

Solutions

  1. Address it in the business logic.
    Maintain an additional index that records the earliest departure time and the total number of flights for each origin city, so that a regular sorted search can page through it.
  2. Address it in the implementation.
    Apply filters, such as a time interval or other specific conditions, to reduce the number of documents and therefore the number of buckets.
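
Both options can be sketched as request bodies. Everything specific here is an assumption: the summary index is hypothetical (one document per city with a min_flight_time field), as are the function names, the date bounds, and the smaller terms size. Keep in mind that regular from/size paging has its own cap (index.max_result_window, 10,000 by default).

```python
def summary_index_page(page, page_size):
    """Option 1: page through a hypothetical per-city summary index with a plain search."""
    return {
        "from": (page - 1) * page_size,
        "size": page_size,
        "query": {"match_all": {}},
        "sort": [{"min_flight_time": "asc"}],  # assumed field on the summary documents
    }


def time_filtered_query(gte, lt, terms_size):
    """Option 2: restrict documents with a range filter before aggregating."""
    return {
        "size": 0,
        "query": {"range": {"timestamp": {"gte": gte, "lt": lt}}},
        "aggs": {
            "group_by_origin_city": {
                "terms": {
                    "field": "OriginCityName",
                    "size": terms_size,
                    "order": {"min_flight_time": "asc"},
                },
                "aggs": {"min_flight_time": {"min": {"field": "timestamp"}}},
            }
        },
    }


print(summary_index_page(2, 5))
# Example bounds only; pick an interval that matches your data.
print(time_filtered_query("2024-06-01", "2024-06-08", 1000))
```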

NOTE/Others

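  • Why not use composite aggregation? It does paginate (through its after key), but its buckets are ordered by the source values only, so it cannot sort by the result of a sub-aggregation such as min_flight_time.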
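
For comparison, a composite request looks roughly like the sketch below. The function name composite_page and the aggregation and source names (cities, city) are hypothetical. The only order setting sits on the source, so pages come back ordered by city name, and min_flight_time is merely reported per bucket.

```python
def composite_page(page_size, after_key=None):
    """Build a composite aggregation request; pass the previous response's after_key to continue."""
    composite = {
        "size": page_size,
        "sources": [
            # Ordering is defined per source value, not by a sub-aggregation.
            {"city": {"terms": {"field": "OriginCityName", "order": "asc"}}}
        ],
    }
    if after_key is not None:
        composite["after"] = after_key
    return {
        "size": 0,
        "aggs": {
            "cities": {
                "composite": composite,
                "aggs": {"min_flight_time": {"min": {"field": "timestamp"}}},
            }
        },
    }


first_page = composite_page(5)
# "Some City" stands in for the after_key value returned with the previous page.
next_page = composite_page(5, after_key={"city": "Some City"})
print(first_page)
print(next_page)
```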

reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html
