TrailTool: CloudTrail for AI Agents

Running security for AWS-centric companies means getting down and dirty with CloudTrail. Not only will you crawl the logs with SIEMs to “find the baddies” via IoCs; as a proactive, engineering-focused security team, you’ll also rely on them to implement access control, validate changes, and debug problems.

In the agentic AI era™ these tasks can be carried out by Claude et al., but CloudTrail logs are hard to synthesize. Answering “did contractor@company.com update this S3 bucket in the last 30 days?” could mean sifting through terabytes of logs, toiling with custom queries, and correlating role assumptions by hand. You can connect agents with MCP tooling and build skills to standardize query patterns, but it’s a fair amount of non-trivial configuration and undifferentiated heavy lifting. Every query (especially loops where the agent needs to learn/fix the syntax) wastes time and tokens while bloating the context window.

Enter TrailTool. The big idea is to process (Lambda) and cache (DynamoDB) CloudTrail events based on access patterns, grouping them into entities (People, Sessions, Roles, Services, Resources). When you want to ask common questions (what has this role accessed?, who accessed this resource?, etc.) you get quick, trustworthy answers (trailtool roles detail <RoleName>). TrailTool is open source: deploy the Ingestor Lambda via SAM, and your agent (or you) can query with a CLI that works with standard AWS credentials.
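To make the caching idea concrete, here is a minimal in-memory sketch in Python of indexing events by entity at ingest time, so that later questions become lookups instead of log scans. The function name index_events, the key tuples, and the sample record are illustrative assumptions, not TrailTool's actual schema or code; only the record fields (eventSource, eventName, userIdentity, resources) come from the CloudTrail record format.

```python
from collections import defaultdict


def index_events(records):
    """Group CloudTrail records under entity keys such as ("role", arn)."""
    index = defaultdict(list)
    for rec in records:
        action = f'{rec["eventSource"]}:{rec["eventName"]}'
        issuer = (
            rec.get("userIdentity", {})
            .get("sessionContext", {})
            .get("sessionIssuer", {})
        )
        # Assumed-role sessions name the role they came from in sessionIssuer.
        if issuer.get("type") == "Role":
            index[("role", issuer["arn"])].append(action)
        index[("service", rec["eventSource"])].append(action)
        for res in rec.get("resources") or []:
            if "ARN" in res:
                index[("resource", res["ARN"])].append(action)
    return index


# Hypothetical sample record, trimmed to the fields used above.
sample = [
    {
        "eventSource": "s3.amazonaws.com",
        "eventName": "PutBucketPolicy",
        "userIdentity": {
            "sessionContext": {
                "sessionIssuer": {
                    "type": "Role",
                    "arn": "arn:aws:iam::111122223333:role/SandboxPowerUser",
                }
            }
        },
        "resources": [{"ARN": "arn:aws:s3:::example-bucket"}],
    }
]

index = index_events(sample)
print(index[("resource", "arn:aws:s3:::example-bucket")])
```

With an index like this, “who accessed this resource?” is a single key lookup, which is the property that keeps agent queries short.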

Here are four workflows I’ve been running with it. For each one, there’s a prompt I gave Claude Code and the resulting session transcript.

Detecting “ClickOps” modifications

Ah, ClickOps, the primordial ooze from which fully realized cloud services emerge. Who amongst us hasn’t felt the rush, the thrill, of building software with only a faint understanding of the resources being created using a wizard UI and a prayer? It’s the original way to vibe software development.

If you do cloud security, you know that ClickOps resources bypass some kinda important security mechanisms like “change control” and “cloud hardening standards.” They may represent a drift from Infrastructure as Code that needs to be rectified. Or they may represent an opportunity to nudge someone onto an IaC pattern. At the very least, it’s an opportunity to review the resource for security best practices.

Prompt:

Use trailtool to identify resources created or modified via ClickOps over the last 30 days, import them into Terraform state, and create the relevant Terraform configuration.
Session transcript (view gist)

Detecting ClickOps in CloudTrail means filtering for traffic that comes from a web browser user agent, finding the mutating actions, pulling out the resource names, and sifting everything down to a list that can be sliced and diced by who, what service, and what region. TrailTool has already done that work at ingest time, so the agent skips straight to reasoning about the results.
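For a sense of what that filtering involves on raw records, here is a rough Python sketch. It is not TrailTool's implementation: the BROWSER_HINTS heuristic, the function names, and the sample records are assumptions for illustration. The fields it reads (readOnly, userAgent, sessionCredentialFromConsole, eventSource, awsRegion, userIdentity.arn) are standard CloudTrail record fields.

```python
from collections import Counter

# Assumed heuristic: substrings that suggest a console/browser user agent.
BROWSER_HINTS = ("Mozilla/", "console.amazonaws.com")


def is_clickops(rec):
    """True for a mutating call that appears to come from the web console."""
    if rec.get("readOnly", False):
        return False  # read-only calls do not change resources
    flag = str(rec.get("sessionCredentialFromConsole", "")).lower()
    user_agent = rec.get("userAgent", "")
    return flag == "true" or any(hint in user_agent for hint in BROWSER_HINTS)


def summarize_clickops(records):
    """Count ClickOps changes by (principal ARN, service, region)."""
    counts = Counter()
    for rec in filter(is_clickops, records):
        who = rec.get("userIdentity", {}).get("arn", "unknown")
        counts[(who, rec["eventSource"], rec["awsRegion"])] += 1
    return counts


# Hypothetical sample records, trimmed to the fields used above.
ARN = "arn:aws:sts::111122223333:assumed-role/SandboxPowerUser/alex"
records = [
    {"eventSource": "s3.amazonaws.com", "eventName": "CreateBucket",
     "awsRegion": "us-east-1", "readOnly": False,
     "userAgent": "Mozilla/5.0 (Macintosh)",
     "sessionCredentialFromConsole": "true", "userIdentity": {"arn": ARN}},
    {"eventSource": "s3.amazonaws.com", "eventName": "CreateBucket",
     "awsRegion": "us-east-1", "readOnly": False,
     "userAgent": "Terraform/1.7.0", "userIdentity": {"arn": ARN}},
    {"eventSource": "ec2.amazonaws.com", "eventName": "DescribeInstances",
     "awsRegion": "us-east-1", "readOnly": True,
     "userAgent": "Mozilla/5.0 (Macintosh)",
     "sessionCredentialFromConsole": "true", "userIdentity": {"arn": ARN}},
]

clickops = summarize_clickops(records)
print(clickops)
```

Only the first record counts: the second comes from an IaC tool and the third is read-only. Running this kind of pass over terabytes of logs on every question is exactly the cost that doing it once at ingest avoids.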

This prompt assumes you’re in catch-up mode and people are spinning stuff up without IaC. Rather than nagging them, you can fire something like this off to clean up after them. You could run a longer-lived agent looking for ClickOps in real time, nagging folks over Slack. Of course, the long-term fix is IAM policies strict enough to prevent it in the first place, which brings us to the next case.

Defining least-privilege IAM policies for roles

Least privilege for IAM is definitely a journey, not a destination. Especially when you consider humans, with all of their non-deterministic behavior and their “I’m an admin, get me out of here” entitlement. We all agree that things should be locked down, until we can’t do something we need for our job.

Often permissions start with the block of stone known as AdministratorAccess, which is then whittled by IAM artisans into an artful figure of “enough removed to satisfy security, enough retained to avoid complaints.” Like Michelangelo, who (apocryphally) stated that the creation of his masterpiece was simple: “I just removed everything that is not David.”

How do we know what to remove? CloudTrail is a pretty good way to figure it out. Generating least-privilege policy from CloudTrail logs is non-trivial, but tools like IAM Access Analyzer and iamlive have mapped out this path. TrailTool’s session-level analysis maps log lines into coherent narratives about what a user did over the course of an authenticated login, and uses iamlive mappings to translate that into IAM policy actions.
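As a rough illustration of the translation step, the Python sketch below turns a session's successful events into a policy document. The naive mapping here (service prefix taken from eventSource, action taken from eventName) is an assumption made for brevity and is wrong for a number of APIs: for example, S3 ListObjects calls are authorized by s3:ListBucket, and CloudWatch events come from monitoring.amazonaws.com while the IAM prefix is cloudwatch. That gap is why curated mappings like iamlive's matter. The function names and sample events are hypothetical.

```python
import json


def naive_action(rec):
    """Naively map a record to an IAM action name, e.g. s3:GetObject."""
    service = rec["eventSource"].split(".")[0]
    return f'{service}:{rec["eventName"]}'


def policy_from_events(records):
    """Build an Allow policy from the calls that succeeded in a session."""
    actions = sorted({naive_action(r) for r in records if "errorCode" not in r})
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Resource "*" is a placeholder; a real policy would scope this.
            {"Effect": "Allow", "Action": actions, "Resource": "*"}
        ],
    }


# Hypothetical session events, trimmed to the fields used above.
session_events = [
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject"},
    {"eventSource": "ec2.amazonaws.com", "eventName": "DescribeInstances"},
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject"},
    {"eventSource": "s3.amazonaws.com", "eventName": "PutObject",
     "errorCode": "AccessDenied"},
]

policy = policy_from_events(session_events)
print(json.dumps(policy, indent=2))
```

Diffing a generated policy like this against what the role currently allows gives the list of candidate permissions to remove.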

Prompt:

Use trailtool to remove unused permissions from IAM policies for SandboxPowerUser role.
Session transcript (view gist)

Use cases come and go, and access granted before may no longer be needed. You can run this as a recurring workflow: generate a policy, create a PR, review, deploy, repeat. Of course, this leads to what happens when you over-tighten, or when there are new use cases.

Responding to AccessDenied errors

Tightening down IAM policies is half the battle. The other half is knowing what to do when a role starts throwing AccessDenied errors. In my experience, they’re often a feedback signal: someone tried to do something legitimate and got blocked. Rather than having them file a ticket or ping you in Slack, use an agent to automatically identify the errors and draft the fix.

Prompt:

Use trailtool to add permissions for events that received "AccessDenied" errors with IAM policies associated with SandboxPowerUser role.
Session transcript (view gist)

The idea here is that least privilege needs to evolve, and someone bumping up against a permissions error is a signal that policies may need to be loosened. A human in the loop is a good idea here, as with all permission changes. But an agent that plumbs together all the implicit data into a PR that’s ready to merge shortens the loop from “hey I tried to do this thing and I couldn’t” to “fixed, try again,” which means less back-and-forth and faster forward progress for developers.
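The raw-log version of “identify the errors” looks roughly like the Python sketch below. It is an illustration under assumptions, not TrailTool's code: the DENIED_CODES set is a non-exhaustive guess at the error codes worth matching, the action naming is a naive eventSource/eventName join that does not hold for every API, and the function name and sample records are hypothetical. The errorCode and sessionIssuer.userName fields are standard CloudTrail record fields.

```python
from collections import Counter

# Assumed, non-exhaustive set of authorization error codes.
DENIED_CODES = {"AccessDenied", "Client.UnauthorizedOperation"}


def denied_actions(records, role_name):
    """Count denied calls, keyed by naive action name, for one role."""
    counts = Counter()
    for rec in records:
        if rec.get("errorCode") not in DENIED_CODES:
            continue
        issuer = (
            rec.get("userIdentity", {})
            .get("sessionContext", {})
            .get("sessionIssuer", {})
        )
        # For assumed-role sessions, sessionIssuer.userName is the role name.
        if issuer.get("userName") != role_name:
            continue
        service = rec["eventSource"].split(".")[0]
        counts[f'{service}:{rec["eventName"]}'] += 1
    return counts


def role_identity(name):
    """Build a minimal userIdentity block for the hypothetical samples."""
    return {"sessionContext": {"sessionIssuer": {"type": "Role", "userName": name}}}


records = [
    {"eventSource": "s3.amazonaws.com", "eventName": "PutObject",
     "errorCode": "AccessDenied", "userIdentity": role_identity("SandboxPowerUser")},
    {"eventSource": "s3.amazonaws.com", "eventName": "PutObject",
     "errorCode": "AccessDenied", "userIdentity": role_identity("SandboxPowerUser")},
    {"eventSource": "ec2.amazonaws.com", "eventName": "RunInstances",
     "errorCode": "Client.UnauthorizedOperation",
     "userIdentity": role_identity("OtherRole")},
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
     "userIdentity": role_identity("SandboxPowerUser")},
]

denied = denied_actions(records, "SandboxPowerUser")
print(denied)
```

The counts give a reviewer a quick sense of which denials are recurring friction and which are one-off noise before anything is added to a policy.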

Validating emergency break-glass access justifications

One common pattern for implementing IAM access is the break-glass case. If there’s an incident or high-priority operation, an operator can ask for an exception, usually accompanied by a justification. The approver typically uses this justification as context for their decision, but:

  • The justification may be extremely brief
  • The operator may end up doing something different after they receive the access

Prompt:

Use trailtool to investigate the session associated with the user alex@engseclabs.com assuming BreakGlassEmergency account around 8AM March 18, 2026 to see how it aligns with this justification: I was investigating an incident and needed to access SSM session for production instance.
Session transcript (view gist)

This is where session-level analysis is useful. TrailTool lets you summarize activity at the session level, without manually correlating role assumptions and API calls across raw log files. By comparing that summary with the stated justification, an agent can highlight discrepancies that might represent unwanted system changes or even attacks. It eliminates one of the foibles of access approvals, closing the loop to ensure people do what they say they will.
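Here is a crude Python sketch of that comparison. It groups records into sessions by the temporary access key in userIdentity.accessKeyId (a standard CloudTrail field), then flags actions outside the services a justification implies. The function names, the sample key, the sample events, and the idea of reducing a justification to a set of expected service prefixes are all assumptions for illustration; in practice the agent makes a far more nuanced judgment than a prefix check.

```python
from collections import defaultdict


def sessions_by_key(records):
    """Group naive action names by the access key that made each call."""
    sessions = defaultdict(list)
    for rec in records:
        key = rec.get("userIdentity", {}).get("accessKeyId")
        if key:
            service = rec["eventSource"].split(".")[0]
            sessions[key].append(f'{service}:{rec["eventName"]}')
    return dict(sessions)


def out_of_scope(actions, expected_services):
    """Return actions whose service prefix is not in the expected set."""
    return [a for a in actions if a.split(":", 1)[0] not in expected_services]


# Hypothetical break-glass session, trimmed to the fields used above.
KEY = "ASIAEXAMPLEKEY"
records = [
    {"eventSource": "ssm.amazonaws.com", "eventName": "StartSession",
     "userIdentity": {"accessKeyId": KEY}},
    {"eventSource": "iam.amazonaws.com", "eventName": "CreateAccessKey",
     "userIdentity": {"accessKeyId": KEY}},
]

sessions = sessions_by_key(records)
# The justification mentions an SSM session, so only "ssm" is expected.
flagged = out_of_scope(sessions[KEY], {"ssm"})
print(flagged)
```

Here the SSM session matches the stated reason, while the IAM access key creation is the kind of discrepancy worth a follow-up question.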

Check out TrailTool

TrailTool is open source, so you can deploy it to your own account and start querying with the CLI. Or, you can check out trailtool.io for a more full-featured hosted version.

If you’re building AI-driven security workflows and CloudTrail analytics are slowing you down, let’s talk. Connect on LinkedIn or Mastodon.

About the author

Alex Smolen is a security engineer and the founder of EngSecLabs, a security consulting practice focused on practical security programs for growing companies. He works directly with engineering and product teams on security architecture, AI security, and compliance.

If you're working through a security problem, get in touch.
