Running Hermes on AWS Lambda: Serverless AI Agents

Porting Hermes to AWS Lambda with DynamoDB chat history, S3 skills, EventBridge cron, and Telegram webhooks—the same agent loop without an always-on server.

Rob Helmer


5/17/2026 · 7 min read


Architecture diagram: Telegram and EventBridge invoke AWS Lambda; DynamoDB, S3, AgentCore Memory, and DeepSeek API persist and power the Hermes agent.

In How I Am Running Sites in 2026 and Agent-Driven Engineering, I wrote about shifting from “deploy on green” to continuous agency—agents that reason about state, not just run scripts. That workflow still assumes a machine somewhere is always on, or at least that durable state lives on local disk under ~/.hermes.

I’ve been experimenting with the opposite shape: wake-on-demand agents on AWS Lambda, driven by Telegram and scheduled jobs, with every piece of persistence moved into managed services. AWS also ships a fuller AgentCore reference deployment for Hermes—I compare that below. I’m porting my Lambda work into Hermes Agent on a branch that is not published yet; this post is the architecture story, not a deploy guide.

At a glance

  • What this is: The same Hermes tool loop and session semantics, packaged as an ARM64 Lambda container that only runs when Telegram webhooks or EventBridge Scheduler fire.
  • What changed: Chat history, skills, cron, and long-term memory no longer assume a writable home directory on the host—they use DynamoDB, S3, EventBridge Scheduler, and Bedrock AgentCore Memory instead.
  • What it is not: A claim that Lambda is the only valid runtime. EC2, ECS, or a normal long-running Hermes process would work; Lambda is the packaging I chose for zero idle cost and operational simplicity.

The problem: Hermes without ~/.hermes

Classic Hermes is built around a long-lived process and local filesystem layout: sessions, skills, cron jobs.json, and optional memory files all live under HERMES_HOME. That model breaks down on Lambda for three reasons:

  1. Ephemeral compute — Only /tmp is writable; anything that must survive a cold start has to live elsewhere.
  2. Concurrent invocations — Two Telegram messages or a cron tick overlapping the same chat must not corrupt session state on disk.
  3. No in-process cron daemon — Scheduled work has to be external (EventBridge) and re-enter the agent via a trusted invoke payload.

The fix is pluggable backends (agent/session_backend.py, agent/skills_storage.py) and a Lambda-specific runtime layer (agent/lambda_runtime.py, hermes_lambda.py) that seeds config into /tmp/hermes, disables toolsets that need a shell or browser, and wires cron to EventBridge when HERMES_SCHEDULER_ROLE_ARN is set.
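A minimal sketch of what a pluggable session backend could look like. The `SessionBackend` interface, `InMemoryBackend`, `BACKENDS`, and `make_backend` are hypothetical names for illustration and are not taken from agent/session_backend.py; only the HERMES_SESSION_BACKEND variable comes from this post.

```python
from __future__ import annotations

import os
from typing import Mapping, Optional, Protocol


class SessionBackend(Protocol):
    """Hypothetical interface; the real session backend may look different."""

    def load(self, session_key: str) -> list[dict]: ...

    def append(self, session_key: str, message: dict) -> None: ...


class InMemoryBackend:
    """Stand-in for local storage so the sketch runs without AWS access."""

    def __init__(self) -> None:
        self._sessions: dict[str, list[dict]] = {}

    def load(self, session_key: str) -> list[dict]:
        # Return a copy so callers cannot mutate stored history by accident.
        return list(self._sessions.get(session_key, []))

    def append(self, session_key: str, message: dict) -> None:
        self._sessions.setdefault(session_key, []).append(message)


# A DynamoDB-backed class would be registered here under "dynamodb".
BACKENDS = {"local": InMemoryBackend}


def make_backend(env: Optional[Mapping[str, str]] = None) -> SessionBackend:
    """Pick a backend by name from HERMES_SESSION_BACKEND (default: local)."""
    env = os.environ if env is None else env
    name = env.get("HERMES_SESSION_BACKEND", "local")
    if name not in BACKENDS:
        raise ValueError(f"unknown session backend: {name!r}")
    return BACKENDS[name]()
```

The point of the shape is that the agent loop only ever sees `load` and `append`, so swapping disk for DynamoDB does not touch the loop itself.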

Architecture

%%{init: {
  "theme": "base",
  "themeVariables": {
    "darkMode": true,
    "background": "#0d1117",
    "mainBkg": "#161b22",
    "secondBkg": "#21262d",
    "tertiaryBkg": "#30363d",
    "primaryColor": "#1f3d5c",
    "primaryTextColor": "#e6edf3",
    "primaryBorderColor": "#58a6ff",
    "secondaryColor": "#2d2440",
    "secondaryTextColor": "#e6edf3",
    "secondaryBorderColor": "#a371f7",
    "tertiaryColor": "#1a3328",
    "tertiaryTextColor": "#e6edf3",
    "tertiaryBorderColor": "#3fb950",
    "lineColor": "#8b949e",
    "textColor": "#e6edf3",
    "fontFamily": "ui-sans-serif, system-ui, sans-serif",
    "fontSize": "14px",
    "clusterBkg": "#161b22",
    "clusterBorder": "#484f58",
    "titleColor": "#d4a72c",
    "edgeLabelBackground": "#21262d",
    "nodeTextColor": "#e6edf3"
  },
  "flowchart": {
    "curve": "basis",
    "padding": 40,
    "htmlLabels": false,
    "nodeSpacing": 60,
    "rankSpacing": 70,
    "useMaxWidth": true
  }
}}%%
flowchart TB
  subgraph user ["👤 You"]
    TG["Telegram"]
  end

  subgraph aws ["☁️ AWS"]
    Lambda["⚡ Lambda · Hermes agent"]
    EB["EventBridge Scheduler · cron"]

    subgraph persist ["💾 What persists"]
      DDB["DynamoDB · chat history"]
      ACM["AgentCore Memory · long-term"]
      S3["S3 · skills files"]
      KB["Knowledge Base · skill search"]
    end
  end

  subgraph external ["🌐 Outside AWS"]
    DS["DeepSeek API · V4 Flash"]
  end

  TG <-->|webhook| Lambda
  EB -->|scheduled invoke| Lambda
  Lambda <-->|read / write turns| DDB
  Lambda <-->|recall & learn| ACM
  Lambda <-->|skills_list / skill_manage| S3
  S3 -.->|sync on write| KB
  Lambda --> DS

  classDef userNode fill:#3d2f00,stroke:#d4a72c,color:#fff,stroke-width:2px
  classDef computeNode fill:#0d2847,stroke:#58a6ff,color:#fff,stroke-width:2px
  classDef storageNode fill:#21262d,stroke:#8b949e,color:#e6edf3
  classDef memoryNode fill:#2d2440,stroke:#a371f7,color:#fff
  classDef scheduleNode fill:#3d2208,stroke:#f0883e,color:#fff
  classDef externalNode fill:#1a3328,stroke:#3fb950,color:#fff

  class TG userNode
  class Lambda computeNode
  class DDB,S3,KB storageNode
  class ACM memoryNode
  class EB scheduleNode
  class DS externalNode

| Hermes need | AWS service | Notes |
|---|---|---|
| Multi-turn chat | DynamoDB | Single-table design: SESSION#{key} with META and MSG# rows |
| Long-term memory | Bedrock AgentCore Memory | Optional; setup-agentcore-memory.sh provisions the resource |
| Skills read/write | S3 | Same SKILL.md layout as local skills; optional Knowledge Base sync in progress |
| Cron / reminders | EventBridge Scheduler | Hermes cronjob tool creates schedules that invoke Lambda with {"source": "hermes.cron", "job": {...}} |
| Runtime | Lambda + ECR | ARM64 container image (build and push to ECR) |
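To make the single-table layout concrete, here is a sketch of how keys and a message write could be built. The attribute names `PK` and `SK`, the zero-padded sequence suffix after `MSG#`, and the helper names are all assumptions; the post only specifies the `SESSION#{key}` partition and the `META` / `MSG#` row types. The condition expression is one possible guard against two overlapping invocations writing the same message row.

```python
from __future__ import annotations


def session_pk(session_key: str) -> str:
    """Partition key shared by every row of one chat session."""
    return f"SESSION#{session_key}"


def message_sk(seq: int) -> str:
    """Sort key for a message row (assumed format).

    Zero-padding keeps lexical sort order equal to numeric order,
    so a Query on the partition returns turns in sequence.
    """
    return f"MSG#{seq:010d}"


def put_message_request(
    table: str, session_key: str, seq: int, role: str, content: str
) -> dict:
    """Build keyword arguments for the low-level DynamoDB PutItem call."""
    return {
        "TableName": table,
        "Item": {
            "PK": {"S": session_pk(session_key)},
            "SK": {"S": message_sk(seq)},
            "role": {"S": role},
            "content": {"S": content},
        },
        # Fail instead of overwriting if this sequence number already exists.
        "ConditionExpression": "attribute_not_exists(SK)",
    }
```

With boto3 installed and credentials configured, the result can be passed as `boto3.client("dynamodb").put_item(**request)`; a losing concurrent writer then gets a conditional check failure and can retry with the next sequence number.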

Inference today: DeepSeek V4 Flash via API key, not Bedrock. The setup can also switch to Bedrock models via HERMES_PROVIDER=bedrock when you want everything inside AWS.

What works end-to-end

What works in my setup today:

| Capability | How |
|---|---|
| Telegram bot | Function URL webhook; TELEGRAM_WEBHOOK_SECRET required on HTTP; TELEGRAM_ALLOWED_USERS allowlist |
| Multi-turn chat | HERMES_SESSION_BACKEND=dynamodb + hermes-sessions table |
| Cron / reminders | EventBridge Scheduler + cronjob tool (no CLI workarounds; in-process cron disabled on Lambda) |
| Skills | HERMES_SKILLS_S3_BUCKET / prefix; skills toolset re-enabled when bucket is set |
| Long-term memory | AgentCore Memory plugin under hermes-aws-plugins/bedrock-agentcore/ when HERMES_AGENTCORE_MEMORY_ID is set |
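The two entry points in the table (Function URL webhook and scheduled invoke) can be told apart by event shape. The sketch below is an illustration under assumptions: `classify_event` and `is_authorized` are hypothetical helpers, not functions from hermes_lambda.py. The header name is the one Telegram sends when a webhook is registered with a `secret_token`, and Function URL events deliver header names in lowercase.

```python
from __future__ import annotations

import hmac

# Telegram echoes the webhook's secret_token in this header on every update.
SECRET_HEADER = "x-telegram-bot-api-secret-token"


def classify_event(event: dict) -> str:
    """Return "cron", "http", or "unknown" based on the invoke payload."""
    if event.get("source") == "hermes.cron":
        return "cron"
    if "headers" in event:
        return "http"
    return "unknown"


def is_authorized(
    event: dict, secret: str, allowed_users: set[int], user_id: int
) -> bool:
    """Check the shared secret first, then the user allowlist."""
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
    presented = headers.get(SECRET_HEADER, "")
    # An empty configured secret must never authorize anything.
    if not secret:
        return False
    # Constant-time comparison avoids leaking the secret through timing.
    if not hmac.compare_digest(presented.encode(), secret.encode()):
        return False
    return user_id in allowed_users
```

Checking the secret before parsing the body keeps unauthenticated requests cheap, which matters when every invocation is billed.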

Recent fixes worth calling out: one EventBridge fire per cron tick (no scheduler retries stacking work), sanitized AgentCore session IDs for search, and webhook secret enforcement on all HTTP paths so only Telegram (or tests with the shared secret) can drive the function.
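One plausible way to get a single fire per tick is to set the target's retry policy to zero attempts when the schedule is created. The sketch below only builds the request; `build_schedule_request` and its parameter names are hypothetical, and the post does not say this is exactly how the cronjob tool does it. The payload shape matches the `{"source": "hermes.cron", "job": {...}}` invoke described above.

```python
from __future__ import annotations

import json


def build_schedule_request(
    name: str, schedule_expr: str, lambda_arn: str, role_arn: str, job: dict
) -> dict:
    """Build keyword arguments for EventBridge Scheduler's CreateSchedule."""
    return {
        "Name": name,
        # For example "cron(0 9 * * ? *)" or "rate(1 hour)".
        "ScheduleExpression": schedule_expr,
        # Fire at the scheduled time rather than within a flexible window.
        "FlexibleTimeWindow": {"Mode": "OFF"},
        "Target": {
            "Arn": lambda_arn,
            # The role Scheduler assumes to invoke the function.
            "RoleArn": role_arn,
            "Input": json.dumps({"source": "hermes.cron", "job": job}),
            # No scheduler-side retries, so a slow run cannot stack up work.
            "RetryPolicy": {"MaximumRetryAttempts": 0},
        },
    }
```

With boto3, the result can be passed as `boto3.client("scheduler").create_schedule(**request)`, using the role from HERMES_SCHEDULER_ROLE_ARN as `role_arn`.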

Lambda constraints: what we turned off

Not every Hermes toolset belongs in an environment with a 15-minute execution limit and a filesystem that is read-only outside /tmp. lambda_runtime.py disables the browser, terminal, code execution, delegation, and local memory toolsets (the last is replaced by AgentCore when configured), among others. The agent still gets core reasoning, skills (from S3), and cron: enough for a personal Telegram assistant without pretending Lambda is a dev workstation.
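The filtering idea can be sketched in a few lines. The toolset names in `LAMBDA_DISABLED` and the `effective_toolsets` helper are illustrative assumptions, not the actual contents of lambda_runtime.py; `AWS_LAMBDA_FUNCTION_NAME` is an environment variable the Lambda runtime sets, and the skills rule follows the HERMES_SKILLS_S3_BUCKET behavior described earlier.

```python
from __future__ import annotations

from typing import Mapping

# Illustrative names only; the real toolset identifiers may differ.
LAMBDA_DISABLED = frozenset(
    {"browser", "terminal", "code_execution", "delegation", "memory"}
)


def effective_toolsets(requested: list[str], env: Mapping[str, str]) -> list[str]:
    """Drop toolsets that cannot work on Lambda, preserving request order."""
    if "AWS_LAMBDA_FUNCTION_NAME" not in env:
        return list(requested)  # not on Lambda: leave everything enabled

    disabled = set(LAMBDA_DISABLED)
    # Skills need somewhere durable to live; without a bucket they stay off.
    if not env.get("HERMES_SKILLS_S3_BUCKET"):
        disabled.add("skills")
    return [name for name in requested if name not in disabled]
```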

Config and bundled plugins are copied from the container image into /tmp/hermes on cold start so load_config() and plugin discovery behave like a normal install without writing under /var/task.
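A sketch of the cold-start seeding step, assuming a marker file is used to skip the copy on warm starts. `seed_hermes_home` and the `.seeded` marker are hypothetical; the post only says that config and plugins are copied from the image into /tmp/hermes.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def seed_hermes_home(image_dir: Path, home: Path) -> bool:
    """Copy bundled config into a writable home directory.

    Returns True when a copy happened (cold start) and False when the
    directory was already seeded (warm start in the same environment).
    """
    marker = home / ".seeded"
    if marker.exists():
        return False
    # copytree creates `home` and any missing parents.
    shutil.copytree(image_dir, home, dirs_exist_ok=True)
    marker.touch()
    return True
```

On Lambda this would be called once near the top of the handler module, for example `seed_hermes_home(Path("/var/task/bundled"), Path("/tmp/hermes"))`, where the source path is a placeholder for wherever the image keeps its defaults.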

Why Lambda (and when it is not the answer)

I picked Lambda because my use case is sparse: a few Telegram threads and occasional scheduled reminders. Pay-per-invoke beats a small EC2 instance sitting idle. If you need persistent WebSockets, heavy local tooling, or sub-second interactive latency at high volume, run Hermes on ECS or a VM and keep the same DynamoDB/S3/EventBridge backends—this is really “cloud-shaped Hermes,” with Lambda as one entrypoint.

That aligns with how I think about continuous agency: the agent loop is portable; the hosting contract is what you swap. Specs and tools stay; only persistence and scheduling move to AWS APIs.

While I was building this, AWS published sample-host-hermesagent-on-amazon-bedrock-agentcore—a full reference for running Hermes on Amazon Bedrock AgentCore Runtime (Firecracker microVMs per session), with CDK stacks, a router Lambda, multi-channel webhooks, and Claude on Bedrock via SigV4. It’s the right starting point if you want a production-shaped, AWS-native deployment with guardrails, budgets, and several chat platforms out of the box.

My work is complementary, not a fork of that sample. Same problem—Hermes without a persistent ~/.hermes on disk—but a different tradeoff curve.

| | AWS AgentCore sample | This Lambda port (in progress) |
|---|---|---|
| Agent runtime | AgentCore Runtime (microVM per session, auto-scaling) | Lambda container (one invocation per webhook/cron tick) |
| Inference | Bedrock Claude (native; Anthropic → AnthropicBedrock monkey-patch) | DeepSeek V4 Flash via API today; optional Bedrock switch |
| Channels | Telegram, Slack, Discord, Feishu (webhook); WeChat via optional ECS gateway | Telegram only so far |
| Chat history | S3 workspace + SQLite sync in the bridge | DynamoDB session backend |
| Long-term memory | S3-backed workspace | Bedrock AgentCore Memory (plugin), not full Runtime |
| Skills | S3 workspace | S3 prefix + optional Knowledge Base |
| Cron | Dedicated cron Lambda + stacks | EventBridge Scheduler + Hermes cronjob tool |
| Deploy / ops | Nine CDK stacks, phased deploy.sh, observability & token budgets | Shell setup scripts, personal scale, minimal idle cost |
| Hermes integration | Runs upstream Hermes in AgentCore with a bridge contract | Changes in Hermes itself (lambda_runtime, session backends, hermes_lambda.py) |

When the AWS sample fits better: you want multi-channel bots, team use, Bedrock-only inference, and per-user isolation on AgentCore, and you’re fine with VPC/NAT-level infrastructure and a documented ballpark of ~$220–770/month for ~10 active users (per their README).

When this Lambda shape fits better: you want a sparse personal assistant (few Telegram threads, occasional reminders) with the smallest possible idle bill, and you’re already comfortable wiring DynamoDB/S3/EventBridge yourself. You give up AgentCore Runtime’s session microVMs and the sample’s breadth of channels; you gain a simpler wake-on-demand unit of compute and tighter coupling to Hermes’s own tool loop on Lambda.

Both approaches reuse AgentCore ideas—the sample centers Runtime; I’m using AgentCore Memory (and optional KB) while keeping the agent on Lambda. If AWS’s layout becomes the long-term default, the DynamoDB/S3/EventBridge pieces here still look like a reasonable “lightweight lane” or a stepping stone before adopting the full CDK deployment.

Status and next steps

| Area | State |
|---|---|
| Lambda + Telegram + DynamoDB + S3 + EventBridge | Working |
| DeepSeek V4 Flash | Working |
| Bedrock AgentCore Memory | Provision via setup-agentcore-memory.sh, set env, redeploy |
| Bedrock Knowledge Base (semantic skill search) | In progress |

If you are running Hermes today on a home server, the interesting migration path is incremental: stand up DynamoDB sessions first, then S3 skills, then external cron—before you ever cut over to Lambda.