Boto3 is the official AWS SDK for Python. It lets you write Python scripts that create, configure, and manage AWS resources—from S3 buckets to EC2 instances to DynamoDB tables—without ever leaving your terminal. If you work with AWS and you write Python, Boto3 is the tool you reach for first.
AWS services expose their functionality through HTTP APIs. Boto3 wraps those APIs so you can call them with clean, idiomatic Python instead of constructing raw HTTP requests yourself. Under the hood, Boto3 depends on Botocore, a lower-level package that handles the actual signing, serialization, and transport. You rarely interact with Botocore directly, but understanding that it exists helps explain why the two packages appear together in dependency lists.
What Is Boto3 and How It Works
Boto3 has been the go-to AWS SDK for Python since it reached general availability in 2015. It currently ships as version 1.42.x and requires Python 3.10 or later. The SDK covers virtually every AWS service—S3, EC2, Lambda, DynamoDB, IAM, SQS, SNS, CloudFormation, and hundreds more—keeping pace with AWS service releases through regular updates to its bundled API definitions.
One of the more elegant aspects of Boto3's design is that the API definitions live in JSON files, not Python code. When you instantiate a client or resource, Boto3 reads the relevant JSON at runtime and builds the interface dynamically. This means AWS can ship support for a new service feature by updating a JSON file rather than rewriting Python, which is why Boto3 releases tend to arrive quickly after AWS announces new capabilities.
The dynamic generation approach does have one practical downside: IDE autocompletion does not always work out of the box for Boto3 clients and resources. Tools like boto3-stubs on PyPI generate type stubs that restore autocomplete in editors like VS Code and PyCharm.
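For example, you can install the stub package with extras for just the services you use (the service extras listed here are illustrative):

```shell
# Install type stubs for selected services; extras are per-service
pip install 'boto3-stubs[s3,ec2,dynamodb]'
```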
Installation and Credential Setup
Installing Boto3 is straightforward. Work inside a virtual environment to keep your project dependencies isolated.
# Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install Boto3
pip install boto3
Boto3 also ships with an optional AWS Common Runtime (CRT) extension that enables additional features like higher-throughput S3 transfers. To include it:
pip install boto3[crt]
Before Boto3 can make any API calls, it needs credentials. The SDK looks for credentials in a specific order: environment variables, the shared credentials file, IAM role credentials from the compute environment (the instance metadata endpoint on EC2, the container credential endpoint on ECS), and a few other sources. The simplest setup for local development is to configure the AWS CLI, which writes to ~/.aws/credentials automatically.
# Run this in your terminal (requires AWS CLI installed)
aws configure
You will be prompted for your AWS Access Key ID, Secret Access Key, default region, and output format. Once that is done, Boto3 picks up those credentials without any additional code on your part.
When running Python scripts on AWS services like EC2, Lambda, or ECS, avoid hardcoding credentials entirely. Attach an IAM role to the resource instead. Boto3 automatically retrieves short-lived credentials from the execution environment (the instance metadata endpoint on EC2, environment variables on Lambda), which is far more secure than embedding static keys in code.
Clients vs Resources: Choosing Your Interface
Boto3 offers two distinct ways to interact with AWS services: the client interface and the resource interface. Understanding the difference is one of the first things worth getting right, because it affects both the code you write and how you handle responses.
The Client Interface
A client maps directly to the underlying AWS REST API. Method names follow the service API closely, and responses come back as Python dictionaries. Clients support every operation that the AWS service exposes, and they are thread-safe, making them suitable for concurrent applications.
import boto3
# Create an S3 client
s3 = boto3.client('s3', region_name='us-east-1')
# List all buckets - response is a dictionary
response = s3.list_buckets()
for bucket in response['Buckets']:
    print(bucket['Name'])
The dictionary-based response is precise and complete, but it requires you to know the exact key names to pull the data you need. For simple calls this is fine. When pagination enters the picture, it becomes more involved.
The Resource Interface
A resource provides a higher-level, object-oriented view of AWS services. Instead of working with raw dictionaries, you work with Python objects that have attributes and methods. Resources handle pagination automatically through collections, and lazy loading means data is only fetched when you access it.
import boto3
# Create an S3 resource
s3 = boto3.resource('s3')
# List all buckets - returns Bucket objects
for bucket in s3.buckets.all():
    print(bucket.name)
# Work with a specific bucket and its objects
bucket = s3.Bucket('my-example-bucket')
for obj in bucket.objects.all():
    print(f"{obj.key} ({obj.size} bytes)")
The resource interface covers a subset of AWS services: S3, EC2, DynamoDB, SQS, SNS, IAM, CloudFormation, CloudWatch, and Glacier are the primary ones. For any service not on that list, or for operations that the resource layer does not expose, you use the client interface. The good news is that you can access the underlying client from within a resource at any time through its meta.client attribute, so you are never locked into one or the other.
import boto3
s3_resource = boto3.resource('s3')
# Access the raw client from a resource when you need lower-level control
s3_client = s3_resource.meta.client
In practice, many codebases use both interfaces together. Resources for the day-to-day object manipulation where they are available, and clients for the operations resources do not cover or for services without a resource layer at all.
Paginators and Waiters
Two Boto3 features that are easy to overlook early on—but become essential quickly—are paginators and waiters.
Paginators
When an AWS API call could return a large number of results, the service breaks the response into pages. Without pagination support, your code only sees the first page and silently misses everything else. Paginators handle this for you automatically.
import boto3
s3 = boto3.client('s3')
# Without a paginator, you only get the first 1000 objects
# With a paginator, all objects are returned across as many pages as needed
paginator = s3.get_paginator('list_objects_v2')
all_objects = []
for page in paginator.paginate(Bucket='my-example-bucket'):
    if 'Contents' in page:
        all_objects.extend(page['Contents'])
print(f"Total objects found: {len(all_objects)}")
You can narrow the results further by passing filter parameters directly to paginate(). For S3, that includes Prefix and Delimiter. Other services have their own filter options documented in the Boto3 reference.
Waiters
Many AWS operations are asynchronous. You send a request to start an EC2 instance or create a DynamoDB table, and AWS begins the process—but the resource is not immediately ready. Waiters let you block execution until a resource reaches a desired state, polling in the background so your code does not have to manage a polling loop manually.
import boto3
ec2 = boto3.client('ec2')
# Start an instance
ec2.start_instances(InstanceIds=['i-0abc123def456'])
# Wait until the instance is fully running before continuing
waiter = ec2.get_waiter('instance_running')
waiter.wait(InstanceIds=['i-0abc123def456'])
print("Instance is now running")
Each service exposes its own set of waiters. You can list them with client.waiter_names. Common ones include bucket_exists and object_exists for S3, instance_running and instance_stopped for EC2, and table_exists for DynamoDB. Waiters poll at a defined interval and raise an exception if the target state is not reached within a maximum number of attempts.
You can mix the resource and client interfaces when using waiters. Start an instance using the resource, then pull the client from resource.meta.client and use that client's waiter. This lets you keep the cleaner resource syntax for the action while still accessing the full waiter library from the client.
Practical Examples: S3, EC2, and DynamoDB
The concepts covered so far come together quickly once you see them applied to real services. Here are three common use cases.
Uploading and Downloading Files with S3
Boto3 includes high-level transfer methods on the S3 resource that handle multipart uploads automatically for large files. You do not need to manage the multipart API yourself.
import boto3
s3 = boto3.resource('s3')
bucket = s3.Bucket('my-example-bucket')
# Upload a local file to S3
bucket.upload_file(
    Filename='report.csv',
    Key='data/2026/report.csv'
)
# Download a file from S3 to a local path
bucket.download_file(
    Key='data/2026/report.csv',
    Filename='downloaded_report.csv'
)
print("Transfer complete")
Listing and Filtering EC2 Instances
The EC2 resource makes it easy to filter running instances using the same filter syntax the AWS CLI accepts.
import boto3
ec2 = boto3.resource('ec2')
# Get all running instances with a specific tag
running_instances = ec2.instances.filter(
    Filters=[
        {'Name': 'instance-state-name', 'Values': ['running']},
        {'Name': 'tag:Environment', 'Values': ['production']}
    ]
)
for instance in running_instances:
    print(f"ID: {instance.id} Type: {instance.instance_type} "
          f"IP: {instance.public_ip_address}")
Reading and Writing Items in DynamoDB
DynamoDB's resource interface gives you a clean Table object with straightforward put_item, get_item, and query methods. The SDK handles marshaling Python types to DynamoDB attribute types automatically.
import boto3
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('Users')
# Write an item
table.put_item(Item={
    'UserId': 'user-001',
    'Name': 'Alex Rivera',
    'Email': 'alex@example.com',
    'Active': True
})
# Read an item back by its primary key
response = table.get_item(Key={'UserId': 'user-001'})
user = response.get('Item')
if user:
    print(f"Name: {user['Name']}, Email: {user['Email']}")
DynamoDB's query and scan operations can also return paginated results when a table is large. Use a paginator or check for a LastEvaluatedKey in the response to ensure you retrieve all matching items, not just the first page.
Key Takeaways
- Boto3 requires Python 3.10 or later as of the current version. Set up credentials through aws configure or IAM roles before writing any SDK code.
- Use the client interface for full API coverage and the resource interface for cleaner, object-oriented code with automatic pagination. Both can be used together in the same script through resource.meta.client.
- Always use paginators for list operations. AWS APIs cap the number of results per response, and skipping pagination means your code silently misses data beyond the first page.
- Waiters handle asynchronous operations cleanly. They eliminate manual polling loops and raise exceptions automatically when an operation fails or times out.
- Avoid hardcoding credentials in code. Use IAM roles for resources running on AWS, and the shared credentials file or environment variables for local development.
Boto3 is the foundation for almost any Python-based AWS automation task. Once you are comfortable with the client and resource patterns, paginators, and waiters covered here, you have the core knowledge needed to work with any of the hundreds of services the SDK supports. The official Boto3 documentation is well-maintained and worth bookmarking—each service page lists every available method, paginator, and waiter along with code examples.