Boto3 is Amazon’s officially supported AWS SDK for Python.
It’s the de facto way to interact with AWS via Python.
If you’ve used Boto3 to query AWS resources, you may have run into limits on how many
resources a single call to a given AWS API will return (generally 50 or 100 results,
although S3 will return up to 1000). The AWS APIs return “pages” of results. To retrieve more than one “page” of results, you need a paginator, which issues multiple API requests on your behalf.
Introduction
Boto3 provides Paginators to automatically issue multiple API requests to retrieve all the results
(e.g. on an API call to EC2.DescribeInstances). Paginators are straightforward to use,
but not all Boto3 services provide paginator support. For those services you’ll need to
write your own paginator in Python.
In this post, I’ll show you how to retrieve all query results for Boto3 services
which provide Pagination support, and I’ll show you how to write a custom paginator
for services which don’t provide built-in pagination support.
Built-in Paginators
Most services in the Boto3 SDK provide Paginators. See S3 Paginators for example.
Once you determine you need to paginate your results, you’ll need to call the get_paginator() method.
How do I know I need a Paginator?
If you suspect you aren’t getting all the results from your Boto3 API call, there are a couple of ways to check.
You can look in the AWS console (e.g. number of Running Instances), or run a query via the aws command-line interface.
Here’s an example of querying an S3 bucket via the AWS command-line. Boto3 will return the first 1000 S3 objects from the bucket, but since there are a total of 1002 objects, you’ll need to paginate.
Counting results using the AWS CLI
$ aws s3 ls my-example-bucket|wc -l
-> 1002
Here’s a boto3 example which, by default, will return the first 1000 objects from a given S3 bucket.
Determining if the results are truncated
import boto3

# Use the default profile
s3 = boto3.client('s3')
resp = s3.list_objects_v2(Bucket='my-example-bucket')

# KeyCount is the number of keys in this response; MaxKeys is the cap per response
print('list_objects_v2 returned {}/{} files.'.format(resp['KeyCount'], resp['MaxKeys']))
if resp['IsTruncated']:
    print('There are more files available. You will need to paginate the results.')

# Output:
# list_objects_v2 returned 1000/1000 files.
# There are more files available. You will need to paginate the results.
The S3 response dictionary provides some helpful properties, like IsTruncated, KeyCount, and MaxKeys, which tell you if the results were truncated. If resp['IsTruncated'] is True, you know you’ll need to use a Paginator to return all the results.
Using Boto3’s Built-in Paginators
The Boto3 documentation provides a good overview of how to use the built-in paginators, so I won’t repeat it here.
If a given service has Paginators built-in, they are documented in the Paginators
section of the service docs, e.g. AutoScaling, and EC2.
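For quick reference, here is a minimal sketch of the built-in paginator pattern for S3. The helper name list_all_keys is hypothetical and only for illustration; get_paginator(), paginate(), and the Contents and Key response fields belong to the Boto3 S3 client. The client is passed in as an argument, so the helper itself does not create one.

```python
def list_all_keys(s3_client, bucket):
    """Collect every object key in a bucket using the built-in paginator."""
    paginator = s3_client.get_paginator('list_objects_v2')
    keys = []
    # paginate() issues as many list_objects_v2 requests as needed and
    # yields one response dictionary ("page") per request.
    for page in paginator.paginate(Bucket=bucket):
        # 'Contents' is missing from a page that holds no objects
        for obj in page.get('Contents', []):
            keys.append(obj['Key'])
    return keys


# Hypothetical usage (needs `import boto3` and configured AWS credentials):
# keys = list_all_keys(boto3.client('s3'), 'my-example-bucket')
# print(len(keys))
```

Called against a bucket holding more than 1000 objects, a helper like this would be expected to return every key rather than only the first page.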
Determine if a method can be paginated
You can also verify if the boto3 service provides Paginators via the client.can_paginate() method.
import boto3
s3 = boto3.client('s3')
print(s3.can_paginate('list_objects_v2')) # => True
So, that’s it for built-in paginators. In this section I showed you how to determine
if your API results are being truncated, pointed you to Boto3’s excellent documentation
on Paginators, and showed you how to use the can_paginate() method to verify if a given
service method supports pagination.
If the Boto3 service you are using provides paginators, you should use them.
They are tested and well documented. In the next section, I’ll show you how to
write your own paginator.
How to Write Your Own Paginator
Some Boto3 services, such as AWS Config at the time of writing, don’t provide paginators.
For these services, you will have to write your own
paginator code in Python to retrieve all the query results. In this section, I’ll show you how to write your own paginator.
You Might Need To Write Your Own Paginator If…
Some Boto3 SDK services aren’t as built-out as S3 or EC2. For example, at the time of writing the AWS Config service doesn’t provide paginators. The first clue is that the Boto3 AWS ConfigService docs don’t have a “Paginators” section.
The can_paginate Method
You can also ask the individual service client’s can_paginate method if it supports paginating. For example, here’s how to do that for the AWS config client. In the example below, we determine that the config service doesn’t support paginating for the get_compliance_details_by_config_rule method.
import boto3

config = boto3.client('config')

# Ask the client whether this operation has a built-in paginator
can_paginate = config.can_paginate('get_compliance_details_by_config_rule')
if not can_paginate:
    print('There is no built-in paginator for that method')

# Output:
# There is no built-in paginator for that method
Operation Not Pageable Error
If you try to paginate a method without a built-in paginator, you will get an
error similar to this:
config.get_paginator('get_compliance_details_by_config_rule')
.../python2.7/site-packages/botocore/client.pyc in get_paginator(self, operation_name)
591 if not self.can_paginate(operation_name):
--> 592 raise OperationNotPageableError(operation_name=operation_name)
593 else:
594 actual_operation_name = self._PY_TO_OP_NAME[operation_name]
"OperationNotPageableError: Operation cannot be paginated: get_compliance_details_by_config_rule"
If you get an error like this, it’s time to roll up your sleeves and write your
own paginator.
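One way to avoid hitting this error is to check can_paginate() before asking for a paginator. The sketch below is just one possible way to structure that check. The helper name get_paginator_or_none is hypothetical; only can_paginate() and get_paginator() are Boto3 client methods.

```python
def get_paginator_or_none(client, operation_name):
    """Return a built-in paginator for the operation, or None if there isn't one."""
    if client.can_paginate(operation_name):
        return client.get_paginator(operation_name)
    # No built-in paginator: the caller needs to paginate by hand.
    return None


# Hypothetical usage (needs `import boto3` and configured AWS credentials):
# config = boto3.client('config')
# paginator = get_paginator_or_none(config, 'get_compliance_details_by_config_rule')
# if paginator is None:
#     print('Falling back to a hand-written paginator')
```

Returning None instead of raising keeps the decision about a fallback with the caller.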
Writing a Paginator
Writing a paginator is fairly straightforward. When you call the AWS service API,
it returns up to the maximum number of results for one page, plus a token, next_token (a long opaque string),
if there are more results.
Approach
To create a paginator for this, you make calls to the service API in a loop until next_token is empty, collecting the results from each loop iteration in a list. At the end of the loop, you will have all the results in the list.
In the example code below, I’m calling the AWS Config service to get a list of resources (e.g. EC2 instances) which are not compliant with the required-tags Config rule.
As you read the example code below, it might help to read the Boto3 SDK docs for the get_compliance_details_by_config_rule method, especially the “Response Syntax” section.
Example Paginator
import boto3


def get_resources_from(compliance_details):
    """Return the resources in one page of results, plus the token for the next page."""
    results = compliance_details['EvaluationResults']
    resources = [result['EvaluationResultIdentifier']['EvaluationResultQualifier'] for result in results]
    # NextToken is absent from the response when there are no more pages
    next_token = compliance_details.get('NextToken', None)
    return resources, next_token


def main():
    config = boto3.client('config')
    next_token = ''  # variable to hold the pagination token
    resources = []   # list for the entire resource collection

    # Call the `get_compliance_details_by_config_rule` method in a loop
    # until we have all the results from the AWS Config service API.
    while next_token is not None:
        compliance_details = config.get_compliance_details_by_config_rule(
            ConfigRuleName='required-tags',
            ComplianceTypes=['NON_COMPLIANT'],
            Limit=100,
            NextToken=next_token
        )
        current_batch, next_token = get_resources_from(compliance_details)
        resources += current_batch

    print(resources)


if __name__ == "__main__":
    main()
Example Paginator – main() Method
In the example above, the main() method creates the config client and initializes the next_token variable. The resources list will hold the final result set.
The while loop is the heart of the paginating code. In each loop iteration, we call the get_compliance_details_by_config_rule method, passing next_token as a parameter. Again, next_token is a long opaque string returned by the given AWS service API method. It’s our “claim check” for the next set of results.
Next, we extract the current_batch of AWS resources and the next_token string from the compliance_details dictionary returned by our API call. When the response contains no NextToken, next_token becomes None and the loop ends.
Example Paginator – get_resources_from() Helper Method
get_resources_from(compliance_details) is an extracted helper method for parsing the compliance_details dictionary. It returns our current batch (up to 100 results) of resources and our next_token “claim check” so we can get the next page of results from config.get_compliance_details_by_config_rule().
I hope the example is helpful in writing your own custom paginator.
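The same loop can also be packaged as a reusable generator. This is a sketch and not part of the example above. The name paginate_by_next_token is hypothetical, and it assumes the service method accepts a NextToken parameter and returns a NextToken key while more results are available, as get_compliance_details_by_config_rule does. Other services use different names for the token (S3’s list_objects_v2 uses ContinuationToken, for instance), so check the method’s request and response syntax first.

```python
def paginate_by_next_token(api_method, **kwargs):
    """Yield each page from an API method that paginates with NextToken."""
    next_token = None
    while True:
        params = dict(kwargs)
        # Only send NextToken once the service has handed us one
        if next_token:
            params['NextToken'] = next_token
        page = api_method(**params)
        yield page
        # A missing (or empty) NextToken means this was the last page
        next_token = page.get('NextToken')
        if not next_token:
            break


# Hypothetical usage (needs `import boto3` and configured AWS credentials):
# config = boto3.client('config')
# for page in paginate_by_next_token(
#         config.get_compliance_details_by_config_rule,
#         ConfigRuleName='required-tags',
#         ComplianceTypes=['NON_COMPLIANT'],
#         Limit=100):
#     print(len(page['EvaluationResults']))
```

Yielding pages one at a time lets the caller start processing results before every request has completed. This version also leaves NextToken out of the first request rather than sending an empty string, which is a design choice of the sketch.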
In this section on writing your own paginators I showed you a Boto3 documentation
example of a service without built-in paginator support. I discussed the can_paginate
method and showed you the error you get if you call get_paginator() on a method which doesn’t
support pagination. Finally, I discussed an approach for writing a custom paginator
in Python and showed a concrete example of a custom paginator which passes the NextToken
“claim check” string to fetch the next page of results.
Summary
In this post, I covered paginating AWS API responses with the Boto3 SDK.
Like most APIs (Twitter, GitHub, Atlassian, etc.),
AWS paginates API responses over a set limit, generally 50 or 100 resources.
Knowing how to paginate results is crucial when dealing with large AWS accounts which
may contain thousands of resources.
I hope this post has taught you a bit about paginators and how to get all
your results from the AWS APIs.
About the Author
Doug is a Sr. DevOps engineer at 1Strategy, an AWS Consulting Partner specializing in Amazon Web Services (AWS). He has 23 years of experience in IT, working at Microsoft, Washington Mutual Bank, and Nordstrom in diverse roles including tester, Windows Server engineer, developer, and Chef engineer, helping app and platform teams manage thousands of servers via automation.