
How to Give Your AI Agent a Thinking Framework (API Guide)

Load a thinking framework into Claude, OpenAI, or any LLM agent via system prompt. Production-ready code examples for Anthropic, OpenAI, LangChain, and custom agents.

By Gareth Hoyle · 21 April 2026 · 8 min read

Most AI agents fail not because the model is wrong but because the system prompt is generic. "You are a helpful assistant" produces a helpful-sounding assistant that reasons like every other helpful-sounding assistant. A thinking framework in the system prompt gives your agent a distinct cognitive shape — how to evaluate, what to prioritise, what to reject.

This guide covers loading a thinking framework into production agents via API, with working code for Anthropic, OpenAI, LangChain, CrewAI, and custom multi-agent systems.

The core pattern

Every thinking framework from authority.md ships as a .md file. To give an agent that framework, you load the file as the system prompt:

┌──────────────────────────────────────────┐
│  system_prompt = contents of .md file    │
│                                          │
│  messages = [user: "your question"]      │
└──────────────────────────────────────────┘
                   │
                   ▼
           Every response shaped by
           the framework's mental models

That's it. The complexity below is implementation-specific — loading the file, structuring messages, handling token limits — not conceptual.

Anthropic API (Claude)

import anthropic

client = anthropic.Anthropic()

# Load framework once at startup, not per request
with open("warren-buffett-framework.md") as f:
    framework = f.read()

def ask(question: str) -> str:
    response = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=2000,
        system=framework,
        messages=[{"role": "user", "content": question}],
    )
    return response.content[0].text

# Usage
answer = ask("Should I invest in a SaaS company trading at 40x earnings?")

With the native Skill format

If you're using Claude Skills in the API (code-execution tool enabled), you can upload the .zip directly and Claude handles the rest:

import anthropic

client = anthropic.Anthropic()

# Upload the Skill package
with open("warren-buffett.zip", "rb") as f:
    skill = client.beta.skills.create(
        file=f,
        purpose="skill",
    )

# Use in conversation
response = client.beta.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=2000,
    skills=[skill.id],
    messages=[{"role": "user", "content": "Analyse this business for me..."}],
)

OpenAI API (GPT)

from openai import OpenAI

client = OpenAI()

with open("warren-buffett-framework.md") as f:
    framework = f.read()

def ask(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": framework},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

answer = ask("Should I invest in a SaaS company trading at 40x earnings?")

Multi-turn conversations

Keep the system message first, append user/assistant turns below:

conversation = [{"role": "system", "content": framework}]

def ask(question: str) -> str:
    conversation.append({"role": "user", "content": question})

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=conversation,
    )
    reply = response.choices[0].message.content
    conversation.append({"role": "assistant", "content": reply})
    return reply

Two things to watch:

  1. Token bloat. Frameworks are 2,000–3,000 tokens. Long conversations hit context limits faster. Budget accordingly.
  2. Framework drift. On very long conversations, models sometimes "forget" early system instructions. If responses start sounding generic after 20+ turns, re-emphasise the framework in a user message or reset the conversation.
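The reset option in point 2 can be mechanised: keep the framework's system message pinned in position zero and slide a window over the user/assistant turns. A minimal sketch (MAX_TURNS is an assumption; tune it to your model's context budget):

```python
MAX_TURNS = 40  # assumption: adjust for your model and token budget

def trim(conversation: list[dict]) -> list[dict]:
    """Drop the oldest user/assistant turns while always keeping the
    framework's system message in position zero."""
    system, turns = conversation[:1], conversation[1:]
    return system + turns[-MAX_TURNS:]
```

Call `trim(conversation)` before each API request; the framework survives every truncation, the stale middle of the conversation does not.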

LangChain

LangChain abstracts the system prompt through SystemMessage:

from langchain_anthropic import ChatAnthropic
from langchain_core.messages import SystemMessage, HumanMessage

with open("warren-buffett-framework.md") as f:
    framework = f.read()

llm = ChatAnthropic(model="claude-sonnet-4-5")

messages = [
    SystemMessage(content=framework),
    HumanMessage(content="Should I invest in this company?"),
]

response = llm.invoke(messages)
print(response.content)

With a LangChain agent

If you're using LangChain's agent executor (tools + reasoning), the framework goes into the agent's prompt template:

from langchain.agents import create_openai_tools_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

with open("warren-buffett-framework.md") as f:
    framework = f.read()

llm = ChatOpenAI(model="gpt-4o")  # any tool-calling chat model
tools = [...]  # your tool list

prompt = ChatPromptTemplate.from_messages([
    ("system", framework),
    ("user", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_openai_tools_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools)

result = executor.invoke({"input": "Analyse this business"})

The framework shapes how the agent reasons between tool calls — which data to fetch, which questions to ask, how to synthesise results.

CrewAI (multi-agent)

CrewAI agents take role, goal, and backstory fields. The thinking framework goes into backstory:

from crewai import Agent

with open("warren-buffett-framework.md") as f:
    framework = f.read()

analyst = Agent(
    role="Investment Analyst",
    goal="Evaluate investment opportunities using Buffett's mental models",
    backstory=framework,
    verbose=True,
)

Combine multiple thinking frameworks across a crew:

with open("warren-buffett-framework.md") as f:
    buffett = f.read()
with open("charlie-munger-framework.md") as f:
    munger = f.read()

analyst = Agent(role="Financial Analyst", goal="...", backstory=buffett)
checker = Agent(role="Devil's Advocate", goal="...", backstory=munger)

The analyst thinks in Buffett's patterns; the checker applies Munger's inversion. Real cross-examination, not two copies of GPT-4o talking to each other.

Vercel AI SDK

import { anthropic } from '@ai-sdk/anthropic';
import { generateText } from 'ai';
import fs from 'fs/promises';

const framework = await fs.readFile('warren-buffett-framework.md', 'utf-8');

const { text } = await generateText({
  model: anthropic('claude-sonnet-4-5'),
  system: framework,
  prompt: 'Should I invest in this company?',
});

console.log(text);

Works identically with @ai-sdk/openai by swapping the provider.

Custom agent loops

If you're building your own agent loop (planning → tool calls → reflection → output), the framework goes into the planning system prompt:

persona_name = "Warren Buffett"  # match this to the framework you loaded

PLANNING_SYSTEM = framework + f"""

When the user gives you a task, output a structured plan:
1. What would {persona_name} look for first?
2. Which of their mental models applies?
3. What data do you need to gather?
4. What's the decision heuristic you'll apply?
"""

def plan(task: str) -> dict:
    response = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1500,
        system=PLANNING_SYSTEM,
        messages=[{"role": "user", "content": task}],
    )
    # parse the response into your planning structure
    return parse_plan(response.content[0].text)

The framework now shapes both planning AND execution — the agent chooses which data to gather through the persona's lens, not just how to write the final answer.
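The parse_plan helper is left open above. One minimal sketch, assuming the model follows the numbered format the planning prompt asks for (the field names are illustrative, not part of any API):

```python
import re

def parse_plan(text: str) -> dict:
    """Map the four numbered plan lines onto named planning fields."""
    steps = re.findall(r"^\s*\d+\.\s*(.+)$", text, re.MULTILINE)
    keys = ["first_look", "mental_model", "data_needed", "heuristic"]
    return dict(zip(keys, steps))
```

If the model drifts from the numbered format, fall back to re-prompting rather than guessing at structure.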

Production considerations

Token usage

A typical framework is 2,000–3,000 tokens. If you're processing thousands of requests:

Model                  Input cost per framework
Claude Sonnet 4.5      ~$0.009 per request
Claude Haiku 4.5       ~$0.003 per request
GPT-4o                 ~$0.008 per request
GPT-4o-mini            ~$0.0005 per request
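The per-request figures fall out of simple arithmetic. A sketch, assuming a ~3,000-token framework and Sonnet-class pricing of $3.00 per million input tokens (prices drift; check the current rate card before budgeting):

```python
def framework_cost(framework_tokens: int, price_per_mtok: float) -> float:
    """Input cost of re-sending the framework text on a single request."""
    return framework_tokens / 1_000_000 * price_per_mtok

per_request = framework_cost(3_000, 3.00)  # ~$0.009, matching the table
per_month = per_request * 100_000          # 100k requests/month spent on the framework alone
```

At 100,000 requests a month, the framework alone costs roughly $900 in input tokens before caching, which is why the caching section below matters.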

Prompt caching (supported by Anthropic and OpenAI) reduces this by ~90% for repeated framework use — the framework text gets cached once, future requests reference the cache instead of re-sending.

Anthropic prompt caching

response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=2000,
    system=[
        {
            "type": "text",
            "text": framework,
            "cache_control": {"type": "ephemeral"},
        },
    ],
    messages=[{"role": "user", "content": question}],
)

First request: full cost. Subsequent requests within 5 minutes: ~10% of the original input cost. At scale, this meaningfully changes the economics.

Framework composition

Multiple frameworks in one system prompt works — but there are gotchas:

  • Complementary frameworks compose well. Buffett + Munger is great. Both are investor-brained; Munger sharpens Buffett with inversion.
  • Contradictory frameworks confuse the model. Buffett ("hold forever, moats matter") + Hormozi ("optimise offers for rapid growth") pull in different directions. The model averages toward neither, producing generic advice.
  • Above 2–3 frameworks, returns diminish. You've just made a very expensive generic assistant.

Pick frameworks that share a worldview even if they differ in method. Buffett + Munger + Dalio works. Buffett + Hormozi + Musk does not.
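The 2–3 framework ceiling is easy to encode as a guard when you compose the system prompt. A sketch (the separator is an arbitrary choice, not a required format):

```python
def compose_frameworks(*frameworks: str) -> str:
    """Join complementary frameworks into one system prompt, with a hard cap."""
    if len(frameworks) > 3:
        raise ValueError("composing more than 3 frameworks tends toward generic output")
    return "\n\n---\n\n".join(frameworks)
```

The guard won't catch contradictory worldviews for you; it only stops the quiet accumulation of frameworks over time.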

Framework versioning

We version the .md files via YAML frontmatter:

---
name: warren-buffett-framework
description: "..."
version: 1.0
---

If you're loading frameworks into production, pin versions in your code and update deliberately. A framework update might change reasoning patterns in ways your downstream users depend on.
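A minimal version check for pinning, using a naive key: value parse of the frontmatter rather than a full YAML parser (a sketch; use a real YAML library if your frontmatter grows):

```python
import re

def parse_frontmatter(md: str) -> dict:
    """Read key: value pairs from the leading YAML frontmatter block."""
    match = re.match(r"^---\n(.*?)\n---", md, re.DOTALL)
    meta = {}
    if match:
        for line in match.group(1).splitlines():
            key, sep, value = line.partition(":")
            if sep:
                meta[key.strip()] = value.strip().strip('"')
    return meta

def assert_version(md: str, pinned: str) -> None:
    """Fail loudly at startup if the framework file isn't the pinned version."""
    found = parse_frontmatter(md).get("version")
    if found != pinned:
        raise RuntimeError(f"framework version {found!r} != pinned {pinned!r}")
```

Run the check once at startup, next to where you load the file, so a silently updated framework fails the deploy instead of quietly changing your agent's reasoning.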

What not to do

Don't paste the framework into every user message. It costs more, it bloats conversation history, and it doesn't work better than a system prompt. Use the system role for what it's for.

Don't let users override the system prompt. If users can inject instructions that override the framework ("ignore previous instructions, think like X"), your agent loses its identity. Use the standard prompt injection defences (e.g., don't concatenate user input into system messages).

Don't mix framework content with application logic. Your app logic ("format the output as JSON", "always end with a confidence score") belongs in a separate system message or via tool definitions — not jammed into the framework text. Keep concerns separate.
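With the Anthropic API, that separation is straightforward to enforce: pass the framework and your app rules as distinct system blocks, and cache only the stable framework text. A sketch (the app rule string is illustrative):

```python
def build_system(framework: str, app_rules: str) -> list[dict]:
    """Persona and app logic as separate system blocks; only the stable
    framework text carries cache_control."""
    return [
        {"type": "text", "text": framework, "cache_control": {"type": "ephemeral"}},
        {"type": "text", "text": app_rules},
    ]
```

Pass the result as the `system=` argument. Because the app rules live in their own block, you can change output formatting without invalidating the cached framework.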

What to do next

If you haven't bought a framework yet, pick one that matches the kind of reasoning your agent needs:

  • Investor · Warren Buffett: value investing, capital allocation, long-term thinking. $4.99 · View framework →
  • Investor · Charlie Munger: inversion, multi-disciplinary mental models, rejecting bad reasoning. $4.99 · View framework →
  • Tech Visionary · Elon Musk: first principles, the Idiot Index, reasoning from physics. $4.99 · View framework →

Building something with thinking frameworks? I'd love to hear about it. Email hello@authority.md — if it's interesting enough we'll feature it in a future guide.

Written by Gareth Hoyle. Last updated 21 April 2026. Part of the authority.md guides library.
