Many web interfaces let a user effortlessly page through large sets of data. Implementing database queries that fetch these pages is also effortless for the programmer, usually requiring an OFFSET and LIMIT in the case of SQL and a FROM and SIZE in the case of Elasticsearch. Although this method is easy on the user and programmer, pagination queries of this type have a high hidden cost for SQL, Elasticsearch and other database engines.

At Salsify, we encountered this problem when implementing a feature to allow a user to step through records in a large, heavily filtered and sorted set. We had to implement an efficient pagination solution that would work in both our SQL and ES datastores. In this post we’ll look at an interesting technique to make pagination efficient for the database with large datasets. Specifically, we’ll look at implementation in SQL as well as how to translate this method to Elasticsearch.

The status quo

The Good part

First, let’s look at the status quo and assume a page_size = 10. To do pagination in SQL, we get the first page by requesting the first 10 records with LIMIT 10. For the next page, we introduce OFFSET by scanning past the first 10 records with OFFSET 10 and then performing the same LIMIT 10 request as in the first query. For the n'th page, we compute the OFFSET by taking the product page_number * page_size and then fetching the next page_size records using LIMIT page_number.

The OFFSET method has a couple major benefits. First, this method is straightforward to implement. All you need to know is the page number and page size and you can produce a simple query to fetch the page. Furthermore, this method is in such broad usage that there are countless established UX practices that make this form of pagination immediately intelligible to the user.

Here we see the same concept using an Elasticsearch query:

The FROM key translates directly to OFFSET and the SIZE key translates directly to LIMIT.

When the offset method gets weird

Imagine you are using a Twitter-like app that shows tweets and autoloads the next page of tweets using the OFFSET method when you get to the bottom of a page. What could go wrong? Let’s say somebody you follow spouts off 10 angry tweets in the time you are reading the first page of results. Here is graphic showing the state of the table when your two queries execute:

Screen_Shot_2016-04-14_at_11.22.53_AM.png

When you scroll down and the next page loads, a request for the next page returns the same results as the previous request! The 10 new tweets have invalidated where you were in the set. In effect, you had been moved forward a page in the set without realizing it!

A similar problem occurs with deleting records, with the major difference being that you can be moved back in the paginated set in the case of deletion. Updating a record such that its sort key for a given ORDER BY parameter changes results in a similar unexpected pagination experience. A good rule of thumb is that any time the paginated set changes, your current page number has been invalidated and should be recomputed. If you don't fix the page number (and it's not clear how you would do that), the user will most likely see slightly strange results. Maybe you’ll accidentally jump forward or backward by a record or two when the paginated set isn't undergoing much churn.

When the offset method deteriorates

While the above issues demonstrate some undesirable behaviors that can occur with the OFFSET method, one of the biggest limitations occurs when working with really large sets. When you request page_number = 1,000,000 with page_size = 10, you expect to receive back 10 results. This means a fetch of 10, however, how many records must be scanned before the fetch? The answer may surprise you.

The way offset works is that you scan the page_number * page_size records and then begin fetching page_size records. Therefore, in our example of requesting page 1,000,000, we actually need to scan 10,000,000 records before fetching 10. In other words, OFFSET pagination is a linear time algorithm. Here is what it looks like assuming we have sorted by some column indexed with a btree:

Screen_Shot_2016-04-14_at_1.34.28_PM-1.png

Note that above we had to start at the far left of the tree and then perform a sequential access starting there. This is fine for use cases where you know you will not need high page numbers and your data isn't modified often, however, if either of those conditions don't hold, the OFFSET method of pagination may not be for you.

An Alternative: Keyset Pagination

What if we have the first record on the current page and want to be able to get the next page? We should be able to jump right to our current page in the index and then do a lot less scanning before fetching results for the next or previous page.

A simple implementation

Let's call this first record from the current page the query point. Let's assume we have a single deterministic sort for a btree indexed column. Now our task for getting the next page would look something like this:

Screen_Shot_2016-04-14_at_1.27.01_PM-1.png

In the above image we do a logarithmic time search to find the query point in the current page. We then do a linear time sequential access to scan past the rest of the current page and then linear time sequential access to fetch the items for the next page. In terms of time complexity, we have gone from O(page_number * page_size + page_size) down to O(log(table_size) + page_size).

In order to go to the previous page instead of the next page, our task is pretty similar. The key difference is that we first need to flip the sort. Next we sequentially fetch the next page_size records. Note that we don't need to scan past records from the current page when getting the previous page, because the first record on the current page immediately precedes the query point.

A complete implementation

In the above demonstration we made a significant simplification: we assumed a single deterministic sort. When we explore complete implementations below, we'll see that depending on the datastore, handling multiple sort adds significant complexity.

Composite Values

If you find yourself using a newer version of PostgreSQL, then handling multiple sort is pretty straightforward:

In the first query, we have a single sort example where we are ordering by a column a and therefore we constrain our result set using the value for query_point.a, LIMIT page_size and OFFSET page_size - 1 to jump past the rest of the current page.

In the second example, we have a general solution for multiple sort using composite values. Note that the order of columns listed in the constraint is the same as the defined order of precedence in the ORDER BY clause. If you are lucky enough to be using a database that supports composite values, then your work is done! However, if you are either curious or working with a database that doesn't support composite values, then we have to recreate this behavior. So what exactly is that composite value comparison doing?

Keyset pagination without Composite Values

The most straightforward way to understand how a query point works with multisort is to imagine that you are building up the set that satisfies the multisort constraint by unioning all the sets that we know should be contained in the set. In the case of getting the next page of results for a bunch of ascending sorts, the logic goes something like this:

We know that everything greater than the primary sort belongs in the results set.
What if there is something that ties on the primary sort? We know that every record that ties on the primary sort field but is greater than the query point on the secondary sort belongs in the set.
What if something ties on the primary and secondary sort? ...

We can rewrite the above set union operation as a SQL constraint letting (S1 .. SN) be an ordered list of columns for ascending sorts. The resulting constraint would be:

Note that each of the greater/less than comparators in the above constraint would be reversed in the case the sort was descending. Also, all comparators would be flipped in the case where you are looking for the previous page instead of the next page. Finally, don't forget to ensure determinism of your sort! This may involve adding a tiebreaker to the sort such as the primary key column.

The ES version of the above constraint would be as follows:

Although the SQL version tends to be easier to read and understand, the above ES query is conceptually the same.

One last note on the ES query is that you will need to add a default sort mode to reduce the sort key for records with multivalue properties down to a consistent sort key regardless of the sort direction. Otherwise, you may find yourself with unexpected user experience. One particularly interesting case that demonstrates this need is the adversarial case where record A has both the highest and lowest value for a particular multivalue field and record B has both the second highest and second lowest value for the same field:

Name	Values	min(values)	max(values)
A	1,10	1	10
B	2,9	2	9

Given the above data, an ascending sort (which results in a sort mode: min reduction) will use the sort key 1 for the record A and sort key 2 for record B, thus returning record A first. A descending sort (which results in a sort mode: max reduction) will use the sort key 10 for record A and sort key 9 for record B, also returning record A first! Therefore, be sure to set a consistent default sort mode when you are expecting reversing a sort to result in reversing the records in the set.

Limitations

So far we've seen how keyset pagination can solve some of the unexpected behaviors and inefficiencies that occur when using the OFFSET method. However, there are also some limitations worth exploring. Most important among these limitations is the complexity of keyset pagination. There are many libraries that help with this method (see below), however, there is certainly still increased complexity with this method.

A second important limitation regards how the user sees the pagination. If you are working with an infinite scroll (eg. Twitter, chat apps, etc.) then keyset pagination may be just what you are looking for. However, if you're dealing with a user interface that emphasizes pages and page size selection, the keyset method could prove highly problematic.

Lastly, keyset pagination offers no deep pagination support out of the box. If you need to present page 10 but don't know anything about page 9 or 11, you're out of luck! If you are interested in ways to make keyset pagination handle deep page numbers, consider the keyset pagination with estimated bookmarks method described here.

At this point we've seen the pros and cons of classical pagination with OFFSET and an alternative method called keyset pagination that works well with many user interfaces. Now it's your turn to decide what the right decision for your organization is. You may want to check out this article for a great list of libraries that expose keyset pagination functionality inside your project. At Salsify, we used the order_query gem to enable this functionality in ruby. Please leave a comment if you have used keyset pagination in ES or SQL for any of your projects and tell us how it worked!

Salsify Engineering Blog

Salsify Engineering Blog

Efficient Pagination in SQL and ElasticSearch