How to Export Object Metadata from AWS S3
S3 stores billions of objects for millions of organizations. Here are three ways to get your bucket metadata into a spreadsheet, from a single CLI command to automated inventory reports.
AWS provides S3 Inventory, a built-in feature that automatically generates CSV or Parquet reports of all objects in a bucket with their metadata. For one-time or ad-hoc exports, the AWS CLI gives you instant results. For custom exports with full programmatic control, boto3 (Python) is the standard tool. All three methods require appropriate IAM permissions (at minimum s3:ListBucket to list objects and their metadata).
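For reference, a minimal IAM policy covering all three methods might look like the sketch below. The bucket name is a placeholder, and the second statement is only needed if you plan to pull per-object tags or metadata with HEAD/tagging calls:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListSourceBucket",
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::your-bucket-name"
    },
    {
      "Sid": "ReadObjectMetadata",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:GetObjectTagging"],
      "Resource": "arn:aws:s3:::your-bucket-name/*"
    }
  ]
}
```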
AWS CLI
The AWS CLI is pre-installed on most developer machines and in all AWS environments. A single command lists every object in a bucket with its key metadata. Pipe the output through jq or a short Python script to get a CSV.
Configure your credentials first with aws configure. You need s3:ListBucket permission on the target bucket.

# List all objects in a bucket as JSON
aws s3api list-objects-v2 \
--bucket your-bucket-name \
--output json > s3_objects.json
# Convert to CSV
python3 -c "
import json, csv, sys
with open('s3_objects.json') as f:
    data = json.load(f)
w = csv.writer(sys.stdout)
w.writerow(['Key','Size','LastModified','ETag','StorageClass'])
for obj in data.get('Contents', []):
    w.writerow([
        obj.get('Key',''),
        obj.get('Size',''),
        obj.get('LastModified',''),
        obj.get('ETag','').strip('\"'),
        obj.get('StorageClass','STANDARD')
    ])
" > s3_metadata.csv
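Once the CSV exists, a quick sanity check helps catch truncated exports. A minimal sketch (the file name matches the command above; the summary fields are my own choice, not part of any AWS tool):

```python
import csv
from collections import Counter

def summarize(csv_path):
    """Return object count, total bytes, and a per-storage-class tally
    for a CSV with the Key/Size/LastModified/ETag/StorageClass header."""
    total_bytes = 0
    classes = Counter()
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            total_bytes += int(row["Size"] or 0)
            classes[row["StorageClass"]] += 1
    return sum(classes.values()), total_bytes, classes

# Example usage:
# count, total, classes = summarize("s3_metadata.csv")
# print(f"{count} objects, {total} bytes")
```

If the object count does not match what the console shows for the bucket, the listing was likely interrupted mid-pagination.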
# Quick alternative: list just keys and sizes
aws s3 ls s3://your-bucket-name --recursive --human-readable > s3_listing.txt

Tip: add --prefix "folder/subfolder/" to the list-objects-v2 command to limit the export to objects under a specific path. This is much faster than listing the entire bucket if you only need a subset of objects.

S3 Inventory (Built-in)
S3 Inventory is AWS's built-in solution for large-scale object metadata exports. Once configured, it automatically delivers daily or weekly CSV (or Parquet) reports to a destination bucket. It's the best option for ongoing metadata monitoring and works efficiently on buckets with millions of objects.
To set it up in the console, open the bucket's Management tab, create an inventory configuration, choose a destination bucket and prefix (e.g., inventory/), select CSV as the output format, and choose Daily or Weekly frequency.

# Create inventory configuration via CLI
aws s3api put-bucket-inventory-configuration \
--bucket your-source-bucket \
--id metadata-inventory \
--inventory-configuration '{
"Destination": {
"S3BucketDestination": {
"Bucket": "arn:aws:s3:::your-destination-bucket",
"Format": "CSV",
"Prefix": "inventory"
}
},
"IsEnabled": true,
"Id": "metadata-inventory",
"IncludedObjectVersions": "Current",
"OptionalFields": [
"Size", "LastModifiedDate", "StorageClass",
"ETag", "IsMultipartUploaded", "EncryptionStatus"
],
"Schedule": { "Frequency": "Daily" }
}'

Python + boto3
The boto3 SDK gives you the most flexibility. You can list objects with their standard metadata, then make individual HEAD requests to pull custom metadata (user-defined headers) or tag sets for each object. This is the best approach when you need fields beyond what the list-objects API returns.
Install the SDK with pip install boto3 and make sure your AWS credentials are configured (~/.aws/credentials or environment variables).

import boto3
import csv
s3 = boto3.client("s3")
BUCKET = "your-bucket-name"
def list_all_objects(bucket, prefix=""):
    """List all objects with pagination."""
    paginator = s3.get_paginator("list_objects_v2")
    objects = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        objects.extend(page.get("Contents", []))
    return objects

objects = list_all_objects(BUCKET)

with open("s3_metadata.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow([
        "Key", "Size (bytes)", "Last Modified",
        "ETag", "Storage Class"
    ])
    for obj in objects:
        writer.writerow([
            obj["Key"],
            obj["Size"],
            obj["LastModified"].isoformat(),
            obj["ETag"].strip('"'),
            obj.get("StorageClass", "STANDARD"),
        ])

print(f"Exported {len(objects)} objects to s3_metadata.csv")
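When you do need per-object fields, the one-HEAD-per-object calls can be parallelised with a thread pool to cut wall-clock time. A sketch, assuming the s3 client, BUCKET, and objects list defined above; head_row is a hypothetical helper I'm introducing to flatten a head_object response into CSV columns:

```python
from concurrent.futures import ThreadPoolExecutor

def head_row(key, head):
    """Flatten a head_object response dict into one CSV row.

    Custom metadata (x-amz-meta-* headers) arrives in head["Metadata"]
    with the x-amz-meta- prefix already stripped by the SDK.
    """
    meta = head.get("Metadata", {})
    return [
        key,
        head.get("ContentType", ""),
        ";".join(f"{k}={v}" for k, v in sorted(meta.items())),
    ]

def fetch_rows(keys, max_workers=16):
    """HEAD each key concurrently; results come back in the order of keys.

    Assumes the s3 client and BUCKET defined in the export script above.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        heads = pool.map(lambda k: s3.head_object(Bucket=BUCKET, Key=k), keys)
        return [head_row(k, h) for k, h in zip(keys, heads)]
```

Keep max_workers modest; each HEAD is still a billed request, and this only postpones, not avoids, the per-object cost discussed below.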
# To include custom metadata and tags (slower, 1 API call per object):
# for obj in objects:
#     head = s3.head_object(Bucket=BUCKET, Key=obj["Key"])
#     custom_meta = head.get("Metadata", {})
#     content_type = head.get("ContentType", "")
#     tags_resp = s3.get_object_tagging(Bucket=BUCKET, Key=obj["Key"])
#     tags = {t["Key"]: t["Value"] for t in tags_resp["TagSet"]}

Note that the list-objects API does not return Content-Type, custom metadata (x-amz-meta-* headers), or S3 object tags. Getting those requires a HEAD request per object, which is slow and costly on large buckets. For 100,000+ objects, use S3 Inventory plus S3 Batch Operations instead.

What metadata fields can you export?
| Field | AWS CLI | S3 Inventory | boto3 |
|---|---|---|---|
| Object key (path) | ✓ | ✓ | ✓ |
| Object size | ✓ | ✓ | ✓ |
| Last modified date | ✓ | ✓ | ✓ |
| ETag (content hash) | ✓ | ✓ | ✓ |
| Storage class | ✓ | ✓ | ✓ |
| Content type | HEAD only | ✕ | HEAD only |
| Custom metadata headers | HEAD only | ✕ | HEAD only |
| Object tags | Separate call | ✕ | Separate call |
| Encryption status | HEAD only | ✓ | HEAD only |
| Replication status | ✕ | ✓ | ✕ |
| Is multipart upload | ✕ | ✓ | ✕ |
| Object version ID | With versioning | ✓ | With versioning |
| Object lock status | ✕ | ✓ | ✕ |
| Bucket name | ✓ | ✓ | ✓ |
- Custom metadata requires per-object API calls: The list-objects API only returns key, size, last modified, ETag, and storage class. Content-Type, custom metadata headers, and tags require a HEAD or GET request per object.
- S3 Inventory has a 48-hour delay: The first inventory report can take up to 48 hours. It's designed for ongoing monitoring, not instant one-time exports.
- Folder concepts are virtual: S3 does not have real folders. "Folders" are just common prefixes in object keys. The export will list every object with its full key path, not a hierarchical folder structure.
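If you adopt S3 Inventory for a large bucket, each delivery consists of a manifest.json plus one or more gzipped CSV chunks. A minimal sketch of pulling the chunk keys out of a parsed manifest (field names follow the published manifest layout; bucket names and the checksum are placeholders):

```python
import json

def chunk_keys(manifest):
    """Return the keys of the CSV data files listed in an inventory manifest."""
    return [entry["key"] for entry in manifest["files"]]

# Abridged example of a parsed manifest.json:
sample = {
    "sourceBucket": "your-source-bucket",
    "destinationBucket": "arn:aws:s3:::your-destination-bucket",
    "fileFormat": "CSV",
    "files": [
        {"key": "inventory/data/part-000.csv.gz", "size": 2048, "MD5checksum": "..."}
    ],
}
print(chunk_keys(sample))
```

From there, download each chunk with your S3 client of choice and decompress with gzip before concatenating.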
You have your metadata export.
Now score it.
Upload your CSV or Excel file to MQS and get a structural metadata health score out of 100 with dimension breakdowns and actionable diagnostics.