Cloud Security

2025 S3 Audit: 3 Ultimate Scripts to Find Open Data

Prepare for your 2025 S3 audit. Discover our top 3 ultimate scripts (Python, AWS CLI) to find and remediate open S3 buckets and prevent data leaks.


Ethan Carter

Cloud security architect specializing in AWS and automated compliance solutions for enterprises.

7 min read

Introduction: The Unseen Risk in Your Cloud

In the expansive universe of cloud storage, Amazon S3 stands as a titan. It’s scalable, durable, and incredibly versatile. However, this versatility is a double-edged sword. A single misconfiguration, a forgotten public access setting, or an overly permissive bucket policy can expose terabytes of sensitive data to the entire internet. As we head into 2025, the threat isn't diminishing; it's evolving. Manual checks are no longer feasible in environments with hundreds or thousands of S3 buckets. Automation is the only viable path forward.

This guide is built for the proactive DevOps engineer, the vigilant SecOps analyst, and the forward-thinking Cloud Architect. We'll cut through the noise and provide you with three powerful, actionable scripts to automate your S3 open data discovery process. Let's transform your S3 security from a reactive chore into a proactive, automated stronghold.

Why a 2025 S3 Audit is Non-Negotiable

The days of setting and forgetting S3 bucket permissions are long gone. The digital landscape of 2025 demands constant vigilance for several critical reasons:

  • Evolving Threat Landscape: Attackers continuously run automated scans for publicly accessible S3 buckets. What was secure yesterday might be a target today due to new exploit techniques.
  • Complex Environments: As organizations scale, so does the complexity of their AWS environments. Multiple teams, automated deployments, and third-party integrations can inadvertently create security gaps.
  • Strict Compliance Mandates: Regulations like GDPR, CCPA, and HIPAA impose severe penalties for data breaches. A proven, regular audit trail is a cornerstone of compliance.
  • Reputational and Financial Cost: A public data breach can lead to millions in fines, recovery costs, and irreparable damage to your brand's reputation. An ounce of prevention, in this case, is worth a ton of cure.

The 3 Ultimate Scripts for S3 Open Data Discovery

Here are three distinct approaches to finding publicly exposed S3 buckets, ranging from a quick command-line check to a comprehensive Python script. Each has its place in a modern security toolkit.

Script 1: The Boto3 Python Scanner (Comprehensive & Customizable)

For a thorough and customizable audit, nothing beats AWS's own SDK for Python, Boto3. This script iterates through all your buckets and performs detailed checks on both Access Control Lists (ACLs) and bucket policies.

What it Checks:

  • Public ACLs: Identifies permissions granted to "AllUsers" or "AuthenticatedUsers" groups.
  • Bucket Policies: Parses the JSON policy to find statements with a broad `"Principal": "*"` or `"Principal": {"AWS": "*"}` that grant public access.
import boto3
import json

def audit_s3_buckets():
    """Audits all S3 buckets for public access via ACLs and policies."""
    s3_client = boto3.client('s3')
    public_buckets = []

    try:
        response = s3_client.list_buckets()
        buckets = response.get('Buckets', [])
        print(f"Found {len(buckets)} buckets. Analyzing...")

        for bucket in buckets:
            bucket_name = bucket['Name']
            is_public = False

            # 1. Check Bucket ACLs
            try:
                acl = s3_client.get_bucket_acl(Bucket=bucket_name)
                for grant in acl['Grants']:
                    grantee_uri = grant.get('Grantee', {}).get('URI', '')
                    # Flag grants to AllUsers (anyone on the internet) or AuthenticatedUsers (any AWS account)
                    if grantee_uri in ('http://acs.amazonaws.com/groups/global/AllUsers',
                                       'http://acs.amazonaws.com/groups/global/AuthenticatedUsers'):
                        public_buckets.append({'Bucket': bucket_name,
                                               'Reason': f"Public ACL ({grantee_uri.rsplit('/', 1)[-1]})"})
                        is_public = True
                        break
            except Exception as e:
                print(f"Could not get ACL for {bucket_name}: {e}")

            if is_public: continue

            # 2. Check Bucket Policy
            try:
                policy_str = s3_client.get_bucket_policy(Bucket=bucket_name)['Policy']
                policy_doc = json.loads(policy_str)
                for statement in policy_doc.get('Statement', []):
                    principal = statement.get('Principal')
                    # Catch both the bare wildcard and the {"AWS": "*"} form
                    if statement.get('Effect') == 'Allow' and principal in ('*', {'AWS': '*'}):
                        public_buckets.append({'Bucket': bucket_name, 'Reason': 'Public Policy (Principal: *)'})
                        break
            except s3_client.exceptions.NoSuchBucketPolicy:
                pass # No policy is not a finding
            except Exception as e:
                print(f"Could not get/parse policy for {bucket_name}: {e}")

    except Exception as e:
        print(f"An error occurred listing buckets: {e}")

    return public_buckets

if __name__ == "__main__":
    open_buckets = audit_s3_buckets()
    if open_buckets:
        print("\n--- [!] WARNING: Found Publicly Accessible S3 Buckets ---")
        for item in open_buckets:
            print(f"- Bucket: {item['Bucket']}, Reason: {item['Reason']}")
    else:
        print("\n--- [+] SUCCESS: No publicly accessible buckets found. ---")

Pros: Extremely powerful, can be extended to check for specific actions (e.g., `s3:GetObject`), and integrates seamlessly into larger Python-based security automation frameworks.
Cons: Requires a Python environment with Boto3 installed. Can be slow on accounts with thousands of buckets unless parallelized.
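
If scan time becomes an issue on large accounts, the per-bucket work parallelizes well. Below is a minimal sketch (not drop-in production code) that moves the ACL and policy checks from Script 1 into a helper and fans them out with a thread pool; the `max_workers` value of 10 is just a starting point.

# Sketch: parallel variant of Script 1. Boto3 clients are thread-safe,
# so a single client can be shared across worker threads.
import json
from concurrent.futures import ThreadPoolExecutor, as_completed

import boto3

def check_bucket_public(s3_client, bucket_name):
    """Return a finding dict if the bucket looks public, else None."""
    try:
        for grant in s3_client.get_bucket_acl(Bucket=bucket_name)['Grants']:
            uri = grant.get('Grantee', {}).get('URI', '')
            if uri.endswith('/AllUsers') or uri.endswith('/AuthenticatedUsers'):
                return {'Bucket': bucket_name, 'Reason': 'Public ACL'}
    except Exception as e:
        print(f"Could not get ACL for {bucket_name}: {e}")

    try:
        policy_doc = json.loads(s3_client.get_bucket_policy(Bucket=bucket_name)['Policy'])
        for statement in policy_doc.get('Statement', []):
            principal = statement.get('Principal')
            if statement.get('Effect') == 'Allow' and principal in ('*', {'AWS': '*'}):
                return {'Bucket': bucket_name, 'Reason': 'Public Policy (Principal: *)'}
    except s3_client.exceptions.NoSuchBucketPolicy:
        pass  # No policy is not a finding
    except Exception as e:
        print(f"Could not get/parse policy for {bucket_name}: {e}")
    return None

def audit_s3_buckets_parallel(max_workers=10):
    s3_client = boto3.client('s3')
    names = [b['Name'] for b in s3_client.list_buckets().get('Buckets', [])]
    findings = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(check_bucket_public, s3_client, name) for name in names]
        for future in as_completed(futures):
            result = future.result()
            if result:
                findings.append(result)
    return findings

Because the bottleneck is the per-bucket API round-trips rather than CPU, even a modest thread pool cuts wall-clock time significantly.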

Script 2: The AWS CLI & JQ Power Combo (Quick & Effective)

Sometimes you need a fast answer right from your terminal. This one-liner pairs the AWS CLI's built-in JMESPath filtering (`--query`) with `xargs` and `grep` to quickly flag buckets that carry the global "AllUsers" grant; `jq`, a command-line JSON processor, is the usual drop-in if you prefer to do the filtering outside the CLI.

What it Checks:

  • Public ACLs: Focuses specifically on the `AllUsers` grantee URI.
aws s3api list-buckets --query 'Buckets[].Name' --output text | tr '\t' '\n' | xargs -I {} sh -c 'aws s3api get-bucket-acl --bucket {} --query "Grants[?Grantee.URI==\`http://acs.amazonaws.com/groups/global/AllUsers\`].Permission" --output text 2>/dev/null | grep -q . && echo "[!] Public Bucket Found: {}"'

How it works:
1. `aws s3api list-buckets ...`: Lists all bucket names.
2. `tr '\t' '\n'`: Splits the tab-separated name list into one bucket per line so `xargs` handles each bucket individually.
3. `xargs -I {} sh -c '...'`: For each bucket name (`{}`), executes a sub-shell command.
4. `aws s3api get-bucket-acl ...`: Gets the ACL for the bucket and filters (`--query`) for any grant where the Grantee URI matches the global `AllUsers` group; errors are discarded to `/dev/null`.
5. `grep -q . && echo ...`: If the previous command produces any output (meaning a public grant was found), it prints a warning with the bucket name.

Pros: Very fast for quick checks. Requires only the AWS CLI and standard shell tools (`xargs`, `tr`, `grep`), which are already on most cloud engineers' machines; `jq` slots in easily if you prefer it for the JSON filtering.
Cons: Doesn't check bucket policies, which is a major blind spot. Error handling is non-existent; a bucket that denies you `GetBucketAcl` will fail silently.
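
If the policy blind spot bothers you but you still want to stay in the shell, `aws s3api get-bucket-policy-status` returns S3's own verdict on whether a bucket's policy is public. A rough companion one-liner in the same style (buckets with no policy at all simply error out into `/dev/null`):

aws s3api list-buckets --query 'Buckets[].Name' --output text | tr '\t' '\n' | xargs -I {} sh -c 'aws s3api get-bucket-policy-status --bucket {} --query "PolicyStatus.IsPublic" --output text 2>/dev/null | grep -q True && echo "[!] Public Policy Found: {}"'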

Script 3: The Framework-Inspired Prowler-Style Module (Holistic & Deep)

Professional security tools like Prowler operate on a modular basis. This script emulates that approach, creating a more structured and detailed analysis that checks for multiple conditions beyond simple public access. It provides a status (`PASS`/`FAIL`) and more context.

What it Checks:

  • Public ACLs and policies (like Script 1).
  • Whether S3 Block Public Access is disabled at the account or bucket level.
  • Returns structured results.
import boto3
import json

class S3AuditModule:
    def __init__(self, region='us-east-1'):
        self.s3_client = boto3.client('s3', region_name=region)
        self.s3_control_client = boto3.client('s3control', region_name=region)
        self.account_id = boto3.client('sts').get_caller_identity()['Account']

    def check_bucket(self, bucket_name):
        findings = []

        # Check 1: Public Access Block Configuration
        try:
            config = self.s3_control_client.get_public_access_block(AccountId=self.account_id)['PublicAccessBlockConfiguration']
            if not all(config.values()):
                findings.append({'Status': 'FAIL', 'Check': 'Account-Level Public Access Block', 'Message': 'Not all block settings are enabled account-wide.'})
        except Exception:
            # Bucket-level check if account-level fails
            try:
                config = self.s3_client.get_public_access_block(Bucket=bucket_name)['PublicAccessBlockConfiguration']
                if not all(config.values()):
                    findings.append({'Status': 'FAIL', 'Check': 'Bucket Public Access Block', 'Message': f'Block Public Access is not fully enabled on bucket {bucket_name}.'})
            except Exception:
                findings.append({'Status': 'FAIL', 'Check': 'Bucket Public Access Block', 'Message': f'Could not retrieve Block Public Access for {bucket_name}. Assumed not configured.'})

        # Check 2 & 3: ACLs and Policies (condensed for brevity)
        # ... [logic from Script 1 can be inserted here to check ACLs and Policies] ...
        # For this example, we'll just simulate a finding:
        # if is_public_by_acl_or_policy(bucket_name):
        #     findings.append({'Status': 'FAIL', 'Check': 'ACL/Policy Analysis', 'Message': 'Bucket is public via ACL or Policy.'})

        if not findings:
            return [{'Status': 'PASS', 'Check': 'Overall Security', 'Message': f'Bucket {bucket_name} passed all checks.'}]

        return findings

    def run_audit(self):
        all_results = {}
        for bucket in self.s3_client.list_buckets()['Buckets']:
            bucket_name = bucket['Name']
            all_results[bucket_name] = self.check_bucket(bucket_name)
        return all_results

if __name__ == "__main__":
    auditor = S3AuditModule()
    results = auditor.run_audit()
    for bucket, findings in results.items():
        print(f"\n--- Audit Results for: {bucket} ---")
        for find in findings:
            print(f"  [{find['Status']}] {find['Check']}: {find['Message']}")

Pros: Provides a more holistic security picture. The structured output is ideal for feeding into reporting systems or ticketing queues. Mimics the approach of professional-grade tools.
Cons: Significantly more complex. The learning curve is steeper, and it requires careful management of permissions (e.g., `s3control:GetPublicAccessBlock`).
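
To make use of that structured output, the findings serialize cleanly. Here is a short sketch of what the hand-off to a reporting pipeline could look like, assuming it runs in the same file as the `S3AuditModule` class above (the `s3_audit_report.json` filename is just a placeholder):

# Sketch: persist the structured findings for a reporting or ticketing pipeline.
import json
from datetime import datetime, timezone

auditor = S3AuditModule()
report = {
    'generated_at': datetime.now(timezone.utc).isoformat(),
    'results': auditor.run_audit(),
}

with open('s3_audit_report.json', 'w') as fh:
    json.dump(report, fh, indent=2)

failures = [(bucket, f) for bucket, findings in report['results'].items()
            for f in findings if f['Status'] == 'FAIL']
print(f"{len(failures)} failing checks written to s3_audit_report.json")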

Comparison of S3 Auditing Scripts

Script Feature Comparison
Feature          | Boto3 Python Script        | AWS CLI & JQ Combo       | Framework-Inspired Script
Ease of Use      | Moderate                   | Easy (for CLI users)     | Complex
Speed            | Moderate                   | Fastest                  | Slowest (most thorough)
Customizability  | Very High                  | Low                      | High
Detail Level     | High (ACLs & Policies)     | Low (ACLs only)          | Very High (PAB, Policies, etc.)
Dependencies     | Python, Boto3              | AWS CLI, JQ/Grep         | Python, Boto3
Best For         | Scheduled, in-depth audits | Quick, on-the-fly checks | Comprehensive security frameworks

Beyond the Scripts: Best Practices for S3 Security in 2025

Scripts are powerful for discovery, but a robust S3 security strategy is multi-layered. Complement your automated checks with these essential AWS services and practices:

  • Enable S3 Block Public Access: This should be your first line of defense. Enable it at the account level to provide a safety net that prevents any bucket from accidentally becoming public (a one-line CLI sketch follows this list).
  • Use AWS Macie: For buckets that must contain sensitive data, Macie uses machine learning to discover, classify, and protect it. It can identify PII, financial data, and credentials stored in your S3 objects.
  • Leverage IAM Access Analyzer for S3: This service continuously monitors your S3 bucket policies to provide formal, provable findings for any policy that grants external or public access. It's a free, built-in version of what our scripts do.
  • Implement Least Privilege: Regularly review and tighten IAM policies and S3 bucket policies. Ensure that users and services have only the permissions they absolutely need to perform their functions.
  • Activate S3 Storage Lens: Gain organization-wide visibility into your object storage usage and activity. Its dashboards can help you spot cost optimization opportunities and identify security best practices that are not being followed.
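
As promised above, turning on the account-wide Block Public Access safety net is a single `s3control` call. A sketch, with `123456789012` standing in for your own account ID:

aws s3control put-public-access-block \
    --account-id 123456789012 \
    --public-access-block-configuration BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true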

Conclusion: Automate Your Way to a Secure Cloud

Relying on manual S3 audits in 2025 is an invitation for a data breach. The scale and complexity of modern cloud infrastructure demand an automated, continuous approach to security. The three scripts provided offer a spectrum of solutions—from the quick CLI one-liner for ad-hoc checks to the comprehensive Boto3 script for deep analysis and the framework-inspired module for building a robust security program.

Use these scripts as a starting point. Integrate them into your CI/CD pipelines, run them as scheduled Lambda functions, or incorporate them into your central security monitoring platform. By making S3 data discovery a consistent, automated process, you can stay ahead of threats and ensure your data remains exactly where it should be: secure.
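
If the scheduled-Lambda route appeals to you, the wrapper can be tiny. A minimal handler sketch, assuming Script 1 is bundled in the deployment package as `s3_audit.py` (a hypothetical module name) and the execution role has `s3:ListAllMyBuckets`, `s3:GetBucketAcl`, and `s3:GetBucketPolicy`:

# Sketch: Lambda handler wrapping the Boto3 scanner from Script 1.
# s3_audit is a hypothetical module name for the bundled Script 1 code.
import json

from s3_audit import audit_s3_buckets

def lambda_handler(event, context):
    findings = audit_s3_buckets()
    # Findings land in CloudWatch Logs; a real deployment might publish to SNS or open a ticket instead.
    print(json.dumps({'public_bucket_count': len(findings), 'findings': findings}))
    return {'statusCode': 200, 'body': json.dumps(findings)}

Trigger it with an EventBridge scheduled rule (for example, a rate(1 day) expression) and you have a daily, hands-off audit.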