Agentic AI Core Blocks

Production AI agents are built on several layers — from the models that reason and act to the guardrails that keep them safe. This guide provides an essential toolkit for building reliable agents.

Architecture Overview

The Full Stack

11 layers from foundation to interface.

Level 10: Visual & Low-Code Builders (3 tools)

Level 9: Governance, Safety & AI Ops (7 tools)

Level 8: Protocols & Coordination (4 tools)

Level 7: Memory, Retrieval & Vector Stores (10 tools)

Level 6: Tools, Data Connectors & RAG (6 tools)

Level 5: Reasoning & Agent Design (6 tools)

Level 4: Orchestration & Multi-Agent (7 tools)

Level 3: Core LLM / Foundation Models (9 tools)

Level 2: Evaluation & Observability (10 tools)

Level 1: Infrastructure & Deployment (6 tools)

Level 0: Foundation (6 tools)

Level 10

Visual & Low-Code Builders

Rapidly prototype agent workflows, tools, and integrations with minimal code; good for internal automation and early MVPs.

Microsoft Copilot Studio

Low-code platform for building AI agents and copilots.

Enterprise agents with M365, Dynamics, and Azure AI integration.

Amazon Bedrock Agents

Visual agent builder within Amazon Bedrock.

Define tools, knowledge bases, and guardrails via AWS console.

Vertex AI Agent Builder

Google Cloud's visual builder for conversational agents.

Build agents with Gemini, Google Search, and enterprise data.

Level 9

Governance, Safety & AI Ops

Prevent unsafe outputs/actions, reduce privacy risk, provide auditability, and support compliance and production controls.

Guardrails AI

Guardrails/validators around LLM outputs (schemas, checks).

Ensure structured output, block unsafe content, enforce formats.

Llama Guard

Meta's open-source LLM safety classifier.

Classify and filter unsafe inputs/outputs in enterprise pipelines.

NeMo Guardrails

NVIDIA's programmable safety rails for LLM outputs.

Enforce topic restrictions, tone policies, and safety contracts.

Level 8

Protocols & Coordination

Standardize how agents connect to tools/data and coordinate multi-step work across services.

MCP (Model Context Protocol)

Open standard for connecting AI apps to tools/data sources.

Expose internal systems to agents safely; reuse connectors across apps.

MCP Server Ecosystem

Growing catalog of pre-built MCP servers.

Quickly add tool capabilities to agents without custom integration code.

Pydantic

Python data validation + typed models.

Define tool I/O schemas; validate agent outputs before executing actions.
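
As a rough illustration of validating agent output before executing an action, the sketch below parses a model's tool-call arguments into a typed object and rejects anything malformed. It uses only the standard library as a stand-in; with Pydantic the manual checks would be replaced by a model class. The `RefundRequest` schema, its field names, and the 50,000-cent limit are hypothetical and not taken from any real system.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class RefundRequest:
    """Hypothetical input schema for a 'refund' tool (illustrative only)."""
    order_id: str
    amount_cents: int

    def __post_init__(self):
        # Reject wrong types and out-of-range values before any side effect runs.
        if not isinstance(self.order_id, str) or not self.order_id:
            raise ValueError("order_id must be a non-empty string")
        if isinstance(self.amount_cents, bool) or not isinstance(self.amount_cents, int):
            raise ValueError("amount_cents must be an integer")
        if not 0 < self.amount_cents <= 50_000:
            raise ValueError("amount_cents must be between 1 and 50000")


def parse_tool_call(raw_json: str) -> RefundRequest:
    """Turn raw model output into a validated request, or raise ValueError."""
    try:
        payload = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(payload, dict):
        raise ValueError("tool arguments must be a JSON object")
    expected = {"order_id", "amount_cents"}
    if set(payload) != expected:
        raise ValueError(f"expected exactly the fields {sorted(expected)}")
    return RefundRequest(**payload)
```

The action itself would run only after `parse_tool_call` returns; a `ValueError` can be fed back to the model as a correction prompt instead of being executed.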

Level 7

Memory, Retrieval & Vector Stores

Give agents long-term context, personalized state, and retrieval over private knowledge.

Zep

Memory layer for chat history + embeddings + recall.

Persist and retrieve user context across sessions.

Pinecone

Fully managed vector database with no infrastructure to operate and automatic scaling.

Production semantic search and RAG at scale.

Cohere Embed

Embedding models optimized for retrieval.

Multilingual embeddings; enterprise RAG.
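
To make the retrieval step in this layer concrete, here is a minimal sketch of semantic search as a vector store performs it conceptually: rank stored vectors by cosine similarity to a query vector and return the top matches. The in-memory `index` dict and its three-dimensional vectors are made up for illustration; a real deployment would use an embedding model and one of the stores above, typically with approximate rather than exhaustive search.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, index, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    scored = [(cosine(query, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Toy index: document id -> embedding (hypothetical values).
index = {
    "fees": [1.0, 0.0, 0.0],
    "limits": [0.9, 0.1, 0.0],
    "holidays": [0.0, 0.0, 1.0],
}
```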

Level 6

Tools, Data Connectors & RAG

Give agents reliable access to data and actions via well-defined tools (APIs, DB queries, search, code execution).

LangChain

Popular framework for tools, chains, agents, and RAG.

Build tool-calling agents; integrate retrievers and prompts.

LlamaIndex

Data framework focused on RAG ingestion/retrieval.

Document ingestion, indexing strategies.

Unstructured

Document parsing/extraction.

Convert PDFs/HTML/docs into clean chunks for RAG.
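
The chunking step mentioned above can be sketched as a sliding window over words with some overlap, so that a sentence split at a boundary still appears whole in at least one chunk. This is a simplified, assumed approach written with the standard library; it is not how Unstructured itself partitions documents, and the `size` and `overlap` defaults are arbitrary.

```python
def chunk_words(text, size=50, overlap=10):
    """Split text into chunks of up to `size` words, each sharing `overlap` words with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```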

Level 5

Reasoning & Agent Design

Make agent behavior reliable: planning, structured output, tool selection, and controllable reasoning patterns.

Claude Agent SDK

Anthropic's SDK for building agents with Claude.

Production agents with tool use and safety controls.

Google ADK

Agent development kit for Google models/services.

Build agents with Google ecosystem and Gemini.

PydanticAI

Python agent framework with typed I/O.

Schema-first tool outputs; safer action execution.

Level 4

Orchestration & Multi-Agent

Control how agents run: graphs, routing, delegation, retries, tool policies, and multi-agent collaboration.

LangGraph

Graph/state-machine orchestration for LangChain.

Deterministic flows with loops, checkpoints, branching.
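
The idea of a deterministic flow with loops, checkpoints, and branching can be illustrated without any framework. The sketch below is a plain-Python state machine, not LangGraph's API: each node is a function that updates shared state and returns the name of the next node. The `draft` and `review` nodes and the two-attempt approval rule are invented for the example.

```python
def draft(state):
    """Produce a new draft and hand it to review."""
    state["attempts"] += 1
    state["draft"] = f"draft v{state['attempts']}"
    return "review"


def review(state):
    """Branch: accept on the second attempt, otherwise loop back to drafting."""
    return "done" if state["attempts"] >= 2 else "draft"


NODES = {"draft": draft, "review": review}


def run(state, start="draft", max_steps=10):
    """Walk the graph until 'done', recording a state snapshot after every node."""
    checkpoints = []
    node = start
    for _ in range(max_steps):
        if node == "done":
            return state, checkpoints
        next_node = NODES[node](state)
        checkpoints.append((node, dict(state)))  # snapshot for resume or debugging
        node = next_node
    raise RuntimeError("step limit reached without finishing")
```

The `max_steps` bound is the important design choice here: a loop in an agent graph should always have a hard ceiling.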

Semantic Kernel

Microsoft's SDK for building agents that call plugins; connects to Azure, M365, and custom APIs.

Enterprise app integration where agents need access to Microsoft ecosystem services.

AutoGen

Microsoft framework for multi-agent conversations — agents discuss, delegate, and solve tasks together.

Build teams of specialized agents that collaborate through structured dialogue.

Level 3

Core LLM / Foundation Models

The 'brains' that generate text, plans, tool calls, and structured outputs.

Claude Sonnet 4.6

Anthropic's primary reasoning model with extended thinking built-in.

High-quality assistant, tool-calling agents, complex analysis.

GPT-5.4

OpenAI's most capable model — advanced reasoning, coding, and native computer use; available via Azure OpenAI for regulated environments.

Complex reasoning, agentic coding, and enterprise-scale content generation.

Gemini 3.1 Pro

Google's state-of-the-art multimodal model; top performance across benchmarks.

Multimodal assistants, long-document analysis, Google ecosystem.

Level 2

Evaluation & Observability

Know what your agent did, why it did it, how much it cost, and whether quality is improving or regressing.

LangSmith

Tracing and evals platform — tightly integrated with the LangChain ecosystem.

Debug agent runs; compare prompt variants; track quality over time.

Datadog LLM Observability

LLM monitoring built into Datadog — unified with your existing APM and infrastructure metrics.

Enterprise LLM tracing for teams already on Datadog.

Arize

Production ML/LLM monitoring with drift detection and performance dashboards.

Track model quality over time; catch degradation before users notice.

Level 1

Infrastructure & Deployment

Run models and agents reliably at scale: compute, networking, gateways, and managed AI services.

Kubernetes / Helm

Container orchestration + package management.

Deploy and scale agent services reliably in enterprise environments.

Amazon Bedrock

Managed model access layer on AWS.

Enterprise deployments with AWS governance.

Vertex AI

Google Cloud AI platform.

Gemini + ML pipelines on GCP.

Level 0

Foundation

The base languages and ML runtime components that everything else builds on.

Python

Primary language for agent + ML ecosystems.

Fast iteration on agents, tools, RAG, evals.

PyTorch

Dominant deep learning framework.

Fine-tuning, inference, custom model research.

Hugging Face

Model/dataset hub + libraries.

Access open models, tokenizers, embeddings.

Getting Started

Recommended Starter Kit

A pragmatic starting point for teams shipping their first agent.

Model

Claude

Framework

LangChain or PydanticAI

Orchestration

LangGraph

Vector DB

pgvector or Pinecone

Embeddings

Cohere Embed

Observability

LangSmith or Datadog LLM Observability

Durable Execution

Temporal or Apache Airflow

Guardrails

Guardrails AI

Telemetry

OpenTelemetry

Document Processing

Unstructured

Memory

Zep or pgvector

Reranking

Cohere Rerank

Scale up from here as complexity grows.

Commonly Missing

Critical Production Layers

Capabilities that often don't show up on vendor maps but are essential for reliable agents.

Durable Execution

Keep agent workflows running through crashes, restarts, and deployments.
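
One simple way to picture durable execution is a workflow that records each completed step and skips it on restart. The sketch below checkpoints to a local JSON file; that is an assumption made for brevity, whereas engines such as Temporal persist workflow history on a server. The function and step names are hypothetical.

```python
import json
import os


def run_workflow(steps, checkpoint_path):
    """Run (name, function) steps in order, skipping any step already recorded on disk."""
    done = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)
    for name, fn in steps:
        if name in done:
            continue  # finished before the crash or restart
        done[name] = fn()
        # Write to a temp file and rename, so a crash never leaves a half-written checkpoint.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(done, f)
        os.replace(tmp_path, checkpoint_path)
    return done
```

Step results must be JSON-serializable in this sketch, and steps with side effects should be idempotent, because a crash between a step finishing and its checkpoint being written causes that step to run again.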

Tool Permissions

Control what each agent can do — separate read from write access, require approval for risky actions.
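
A permission check of this kind might look like the following sketch, which separates read from write access and holds write actions until someone approves them. The tool names, agent names, and the three-way result are all hypothetical; a production system would load grants from a policy store rather than module-level dicts.

```python
from enum import Enum


class Access(Enum):
    READ = "read"
    WRITE = "write"


# Hypothetical registry: which access level each tool requires.
TOOLS = {"get_balance": Access.READ, "send_payment": Access.WRITE}

# Hypothetical grants: which access levels each agent holds.
GRANTS = {
    "support_agent": {Access.READ},
    "ops_agent": {Access.READ, Access.WRITE},
}


def authorize(agent, tool, approved_by=None):
    """Return 'allow', 'deny', or 'needs_approval' for an agent's tool call."""
    access = TOOLS.get(tool)
    if access is None or access not in GRANTS.get(agent, set()):
        return "deny"  # unknown tools and missing grants are denied by default
    if access is Access.WRITE and approved_by is None:
        return "needs_approval"  # write actions wait for a human approver
    return "allow"
```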

RAG Data Lifecycle

Keep your knowledge base accurate, fresh, and auditable as source data changes.

Telemetry Substrate

Connect the dots when an agent fails — trace from LLM call to tool to infrastructure.

Security & Regulatory Controls

Protect sensitive data, control who can do what, and meet regulatory requirements.

Cost Controls

Prevent LLM costs from spiraling — set budgets, cache responses, and route to cheaper models when possible.
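
The three levers named here (a budget, a response cache, and routing to a cheaper model) can be combined in a thin wrapper around the model call. The sketch below is one assumed design: the prices, the model names `small` and `large`, and the word-count routing rule are placeholders, and `call_model` is any function the caller supplies.

```python
# Hypothetical flat price per call, in micro-USD (integers avoid float rounding).
PRICES = {"small": 1_000, "large": 10_000}


class CostGuard:
    """Wrap a model-calling function with a spending cap, a cache, and simple routing."""

    def __init__(self, budget_micro_usd, call_model):
        self.remaining = budget_micro_usd
        self.call_model = call_model  # callable(model_name, prompt) -> str
        self.cache = {}

    def complete(self, prompt):
        if prompt in self.cache:
            return self.cache[prompt]  # cached answers cost nothing
        # Route short prompts to the cheaper model (placeholder heuristic).
        model = "large" if len(prompt.split()) > 200 else "small"
        cost = PRICES[model]
        if cost > self.remaining:
            raise RuntimeError("LLM budget exhausted")
        self.remaining -= cost
        answer = self.call_model(model, prompt)
        self.cache[prompt] = answer
        return answer
```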

Graceful Degradation

Keep agents working when APIs go down — fall back to alternative models and retry intelligently.
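
Falling back to alternative models with retries can be sketched as a loop over an ordered list of providers, retrying each with exponential backoff before moving on. The provider list, retry count, and delays below are illustrative assumptions; real code would usually retry only on transient errors such as timeouts or rate limits rather than on every exception.

```python
import time


def call_with_fallback(prompt, providers, retries=2, base_delay=0.5, sleep=time.sleep):
    """Try each (name, callable) provider in order; return (name, answer) from the first success."""
    errors = []
    for name, call in providers:
        for attempt in range(retries + 1):
            try:
                return name, call(prompt)
            except Exception as exc:  # simplification: treats every failure as retryable
                errors.append(f"{name} attempt {attempt + 1}: {exc}")
                if attempt < retries:
                    sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Passing `sleep` as a parameter keeps the function testable without real waiting.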

Testing & Simulation

Test agents despite non-deterministic outputs — mock tools, replay traces, and stress-test with adversarial prompts.
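
Replaying a recorded trace is one way to make an agent test deterministic: the tools are replaced by stubs that return exactly what was recorded and fail if the agent calls them differently. The toy `run_agent` function, the `get_fx_rate` tool, and the trace format below are all hypothetical.

```python
def run_agent(question, tools):
    """Toy agent under test: looks up one rate and formats an answer."""
    rate = tools["get_fx_rate"]("EUR", "USD")
    return f"1 EUR = {rate:.2f} USD"


def replay_tools(trace):
    """Build stub tools that replay recorded calls in order and check the arguments."""
    calls = iter(trace)

    def make(name):
        def tool(*args):
            recorded = next(calls)
            # A mismatch means the agent's behavior drifted from the recorded run.
            assert recorded["tool"] == name and recorded["args"] == list(args)
            return recorded["result"]
        return tool

    return {name: make(name) for name in {call["tool"] for call in trace}}


# A recorded trace (hypothetical values).
trace = [{"tool": "get_fx_rate", "args": ["EUR", "USD"], "result": 1.0842}]
```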

Rate Limiting & Queuing

Prevent agents from generating unlimited requests — enforce limits and queue work to smooth load.
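
Enforcing a limit while queuing the overflow is commonly done with a token bucket. The sketch below is a minimal single-threaded version with the clock passed in explicitly so its behavior is easy to follow; the class and method names are invented for this example, and a production limiter would also need locking and a bound on queue length.

```python
from collections import deque


class TokenBucket:
    """Release queued requests at up to `rate_per_sec`, allowing bursts of `capacity`."""

    def __init__(self, rate_per_sec, capacity, now=0.0):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.last = now
        self.queue = deque()

    def submit(self, request, now):
        """Queue a request, then return every request allowed to run at time `now`."""
        self.queue.append(request)
        return self.drain(now)

    def drain(self, now):
        """Refill tokens for the elapsed time and release as many queued requests as they cover."""
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        released = []
        while self.queue and self.tokens >= 1:
            self.tokens -= 1
            released.append(self.queue.popleft())
        return released
```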

Model Risk Management

Treat every AI agent as a model subject to validation, monitoring, and regulatory oversight.

Human-in-the-Loop

Require human sign-off for high-stakes decisions — define who approves what and when to escalate.

Deployment & Operations

Where It Runs

Cloud-native service mappings across AWS, Azure, and GCP.

End-to-end agentic AI on Amazon Web Services

Visual & Low-Code Builders

Amazon Bedrock Agents

Visual agent builder with tools, knowledge bases, and guardrails.

Governance & Safety

Amazon Bedrock Guardrails

Content filtering, PII redaction, topic denial, and grounding checks.

Memory, Retrieval & Vector

Amazon Bedrock Knowledge Bases

Managed RAG with automatic chunking, embedding, and retrieval.

Amazon OpenSearch Serverless

Vector search engine for semantic retrieval at scale.

Amazon RDS (pgvector)

Managed PostgreSQL with vector search extension.

Amazon ElastiCache (Redis)

Managed Redis with vector search capabilities.

All services listed are cloud-native managed offerings. Open-source tools can also be self-hosted on any cloud, for example on virtual machines.

Production Deployments

AI in Financial Services

Real-world AI deployments by leading financial institutions.

BNY

Eliza 2.0 Platform

Enterprise-scale agentic AI platform with 20,000+ empowered builders and 140+ digital employees deployed across global operations — one of the first banks to transition from experimental AI to a full-scale agentic operating model.

125+ live use cases

DBS Bank

ADA + ALAN AI Platform

Named World's Best AI Bank by Global Finance (2025), DBS generated S$1 billion in measurable economic value from AI — with 1,500+ models across 370+ use cases and a GenAI assistant processing 250,000 queries monthly.

S$1B economic value from AI in 2025

Deutsche Bank

DB Lumina

AI-powered financial research agent built on Google Gemini, live since September 2024 with ~5,000 users across investment banking — automating data analysis and delivering real-time insights while maintaining strict data privacy for regulated environments.

~5,000 users in production since September 2024

Morgan Stanley

AskResearchGPT + DevGen.AI

AI-powered research assistant giving one-click access to 70,000+ proprietary reports, plus DevGen.AI — an internal tool that processed 9 million lines of legacy code and saved 280,000 developer hours.

98% advisor team adoption

Wells Fargo

Fargo AI Assistant + Tachyon Platform

AI-powered virtual assistant that handled 245 million customer interactions in 2024 — an 11.5x increase from the prior year — with zero PII exposure to the LLM layer, built on the proprietary Tachyon multi-model platform.

245 million interactions in 2024 (up from 21.3M in 2023)

Citi

Citi Stylus + Devin AI Agent

Agentic AI deployed across 40,000 developers using Cognition's Devin agent for automated patching, upgrades, and code rewriting — part of a broader AI push that saw 6.5 million prompts in the first half of 2025.

40,000 developers using AI coding assistants

JPMorgan Chase

LLM Suite

Proprietary generative AI platform providing secure access to large language models for writing, summarization, and idea generation. JPMorgan Chase has ranked #1 for AI capabilities in the Evident AI Index since the index launched in 2023.

200,000+ employees with access

Art of the Possible

What Becomes Achievable

These aspirational use cases demonstrate why a comprehensive agentic framework matters. Each requires orchestration, memory, guardrails, and observability working in concert.

SwarmAlpha

From data to thesis in hours

Vigil

Continuous vigilance, instant response

DealForge

Thousands of documents, one picture

RegShift

From regulatory text to action

AdvisorMind

Every client, a dedicated AI partner

ExecStream

Research to execution, end-to-end

UnderwriteIQ

Fair, fast, fully explainable

RouteX

Optimal routing for every transaction

SentinelNet

Catch what rules alone can't see