prompt-x

MANIFESTO

A new angle to context engineering

Context engineering has two problems: infrastructure and craft. The industry is obsessed with the first and ignores the second. This essay is about the second: the principles of communication between humans and machines.

Mariano Morera, founder · May 2026

01

Context

A common and persistent problem

Here is a scene I've watched play out dozens of times, across different teams with minor variations.

An AI engineer opens a 600-word system prompt from a Python file sent by a colleague, pastes it directly into Claude to reproduce a problem, then manually deletes the Markdown headers written for Gemini ADK and rewrites the prompt with XML tags to match Claude's conventions.

On the other side, a PM working on a book application writes a mix of functional and technical instructions to test a core workflow after a feature ships. They figure it out late at night in Slack and Notion, then run it in Claude Code through a few iterations until it “works”.

In both cases, there's no evaluation dataset, no validation criteria, and no version control: just a once-helpful artifact buried in the daily ocean of text.

Codebases are a prompt's buried inbox, but they're not the only one: Notion docs, Slack threads, and Claude conversations all accumulate prompts that worked once or twice and then disappeared.

Prompts alone are no longer sufficient to power AI at scale; we all know that. Static prompts still produce vague outputs and waste tokens. And with no control over the 3,000+ prompts the average professional writes each year, where does that leave us on the AI adoption curve?

We can fix that with structured prompt templates, file management, and multi-platform compilation, generation and evaluation engines.

What does context actually mean?

The term “context engineering” was popularised in mid-2025 by people like Andrej Karpathy, who described it as “the delicate art and science of filling the context window with just the right information for the next step”, alongside Tobi Lütke and others. Gartner even formalised it as an enterprise discipline.

Most data and AI leaders agreed that prompt engineering alone was no longer sufficient to power AI at scale, and the majority of data teams planned to invest in context engineering capabilities the next year.

Context engineering is the discipline of designing the full informational environment an LLM receives. Not just the prompt, but its meta structure; retrieved knowledge; tool outputs; conversation history; memory state; system instructions; output schemas; infrastructure — everything the model sees before it generates the next token.

But most people are talking about context engineering infrastructure while almost nobody seems to be talking about the part where a human sits down and writes the thing.

I envision it as a mix of prompt and context engineering: the practice of building knowledge instructions and files that improve communication between humans and machines.

02

Why is prompt engineering still broken?

Most teams have the tools to run AI, but not the craft to author reusable, testable instructions and context. That gap slows AI adoption, production velocity and output quality.

The adoption bias

According to a mid-2025 Gallup survey, 49% of U.S. workers report never using AI in their role, and that is in one of the world's largest AI markets. Microsoft's 2025 Global AI Adoption report found that usage concentrates heavily in technology, finance, and professional services, with entire industries sitting at the starting line. The people who have never touched an LLM are not a minority.

I like how the following image, posted on X by Damian Player in February 2026, captures where we might really be with AI, compared with what much of the LinkedIn feed and many keynotes claim:

AI adoption — each dot is ~3.2 million people, Feb 2026

Most of the time, when we use LLMs, we write natural-language instructions and delegate the hard parts to the model: role definition, tone, or task framing. You can even meta-prompt an LLM to produce more polished, advanced instructions for multiple tasks with attached context, even structured ones if you work in tools like Claude Cowork or Claude Code.

Accounting for AI usage will be key in the coming years. AI is expected to take on part of our work, while we remain responsible for the human-crafted criteria and authored instructions that make collaborative tasks work.

The authoring gap

Remember the AI engineer and the PM?

Six months later, they need to do a similar task and start going crazy because that “perfect prompt” has vanished into a black hole.

You have to rewrite all the descriptions, backend schemas, and endpoint definitions. Sound familiar? You then spend hours on it because you didn't document, author, and store those instructions properly.

Imagine this at scale.

Notice the asymmetry? The runtime half of context engineering has real, VC-backed infrastructure that is built to last. We have vector databases, retrieval pipelines, agent frameworks, tool orchestration, observability stacks, context compression techniques, and much more.

Engineers are building serious systems for the retrieved, orchestrated, dynamically assembled parts of the context window. The authored portion has no equivalent: the craft of writing the structured input has no workspace, no IDE.

Thus, prompt authoring and management sit upstream of everything in the AI productivity process.

Prompts are like code. They deserve the same discipline from the beginning: structure, version control, testing, refinement, reusability, and tools that compile cleanly to where they need to run.

The reframe is this: prompt authoring is not just vibe-writing.
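To make “prompts are like code” a little more concrete, here is a minimal sketch of two habits borrowed from software: a content fingerprint for change tracking and a small required-field check that can run as a test. The function names `fingerprint` and `check_required`, the field names, and the sample text are all hypothetical; this is not how prompt-x or any particular tool implements versioning.

```python
import hashlib
import json


def fingerprint(fields: dict[str, str]) -> str:
    """Return a short, stable hash of the prompt content for change tracking."""
    # Sorting keys makes the hash independent of field order.
    canonical = json.dumps(fields, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def check_required(fields: dict[str, str], required=("role", "task")) -> list[str]:
    """Return the required field names that are missing or blank."""
    return [name for name in required if not fields.get(name, "").strip()]


# Hypothetical prompt versions, for illustration only.
v1 = {"role": "You are a QA engineer.", "task": "Test the core workflow."}
v2 = {**v1, "task": "Test the core workflow after the feature ships."}

print(check_required(v1))                   # no missing fields in v1
print(fingerprint(v1) != fingerprint(v2))   # the edit changes the fingerprint
```

Storing the fingerprint next to each saved prompt is one simple way to notice when two copies that should be identical have drifted apart.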

03

What does structured prompt engineering look like?

Most teams don't need more AI infrastructure — they need better authored inputs: reusable, testable instructions and context that compile cleanly across tools and workflows.

The prompt anatomy

Prompts are instructions. Instructions are knowledge blocks.

In this fast-paced world, where new ideas, updates, and information flow constantly, we still have to sit down and choose the right words to communicate properly.

You're about to type into your favourite LLM, but you haven't clicked send yet. Now try breaking that natural-language message into smaller instruction blocks. Follow the sample below, from Anthropic's prompt engineering best practices, to do the exercise.

For this example we are using a PM writing a spec for a book app:

prompt-x — plain text input, book app example

Restructuring a message into specific blocks — each filled with tailored instructions — not only improves LLM outputs, but also makes token usage more efficient, especially when compilation targets multiple LLMs in API workflows.

The industry-standard answer to “what are the components of a prompt?” has been stable for a couple of years. Google, OpenAI, Anthropic, and others converge on roughly six pieces: a role (who the AI is), context (background knowledge), a task (what to do), examples (demonstrations), an output format (how to structure the response), and constraints (boundaries and prohibitions). If you've written a system prompt for a production LLM app, you've likely used all six — plus a few more.
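As a rough sketch, the six components listed above can be modelled as a small data structure. The class name `PromptSpec`, its field names, and the sample values are illustrative assumptions, not the schema of any vendor or of prompt-x.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class PromptSpec:
    """Hypothetical container for the six commonly cited prompt components."""

    role: str            # who the AI is
    context: str         # background knowledge
    task: str            # what to do
    output_format: str   # how to structure the response
    examples: list[str] = field(default_factory=list)     # demonstrations
    constraints: list[str] = field(default_factory=list)  # boundaries and prohibitions


# Illustrative values only, loosely based on the book app scenario.
spec = PromptSpec(
    role="You are a senior product manager.",
    context="We are building a book app.",
    task="Write a spec for the core workflow.",
    output_format="A numbered list of requirements.",
    constraints=["Do not propose new features."],
)

# The spec is plain data, so it can be serialised, diffed, and stored.
print(list(asdict(spec)))
```

Keeping the components as named fields rather than one long string is what later makes it possible to validate, version, and re-encode them independently.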

At prompt-x, we defined 9 blocks in a flexible structure that supports adding more fields if needed, enriched with a variable system so we can prompt at scale.
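The essay does not describe how the prompt-x variable system works, so the following is only a generic sketch of the idea using Python's standard `string.Template`. The `$artifact` and `$product` placeholders and the `fill` helper are hypothetical.

```python
from string import Template

# Hypothetical task block with two variables.
# The $name syntax is Python's string.Template, not necessarily prompt-x's.
task_block = Template("Write a $artifact for the $product team.")


def fill(block: Template, **variables: str) -> str:
    """Substitute variables; raises KeyError if a placeholder is left unfilled."""
    return block.substitute(**variables)


print(fill(task_block, artifact="feature spec", product="book app"))
```

Using `substitute` rather than `safe_substitute` is a deliberate choice here: a missing variable fails loudly instead of shipping a prompt with a raw placeholder in it.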

Same content, different encoding

The target platforms we use day to day encode structured instructions differently:

Claude: XML tags. Each field becomes its own XML element, natively parsed by the model.

GPT: Markdown headers. Concise formatting with explicit anti-pattern lists under each section.

Gemini: Uppercase labels. Critical restrictions stated clearly after each uppercase field label.

Lovable: Natural prose. Fields recombine into paragraphs that read like a product owner briefing an engineering team.

The content of the role field does not change across these targets. Only the encoding does.

Yet the dominant workflow in 2026 is still to write the prompt once, then manually reformat it for each platform. People try to remember platform-specific rules, lose information in each pass, and accept that multi-platform versions drift from the original within two weeks because nobody has time to reconcile them.

If the content is stable and the encoding is platform-specific, then encoding is a compilation step.

Authored context should be written once, in a canonical structured form, then compiled for each platform at the point of use. A prompt compilation engine can generate XML tags for Claude, Markdown headers for GPT, uppercase labels for Gemini, prose for Lovable, plus any platform-specific instructions.
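The compilation idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the prompt-x engine: the field names, the `compile_prompt` function, and the exact output layouts are hypothetical, and the Lovable prose target is left out because it needs more than mechanical re-encoding.

```python
# Hypothetical canonical prompt: field name -> content.
prompt = {
    "role": "You are a senior product manager.",
    "task": "Write a spec for the book app.",
}


def to_claude(fields: dict[str, str]) -> str:
    """Encode each field as its own XML element."""
    return "\n".join(f"<{name}>{text}</{name}>" for name, text in fields.items())


def to_gpt(fields: dict[str, str]) -> str:
    """Encode each field as a Markdown section under a header."""
    return "\n\n".join(f"## {name.title()}\n{text}" for name, text in fields.items())


def to_gemini(fields: dict[str, str]) -> str:
    """Encode each field as an uppercase label followed by its content."""
    return "\n\n".join(f"{name.upper()}: {text}" for name, text in fields.items())


COMPILERS = {"claude": to_claude, "gpt": to_gpt, "gemini": to_gemini}


def compile_prompt(fields: dict[str, str], target: str) -> str:
    """Compile the canonical fields for one target; raises KeyError if unknown."""
    return COMPILERS[target](fields)


for target in COMPILERS:
    print(f"--- {target} ---")
    print(compile_prompt(prompt, target))
```

The content dictionary is never edited per platform; only the encoder changes, which is what keeps the compiled versions from drifting apart.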

04

What this looks like in practice

Let me ground this in a concrete example.

Imagine again the PM with prompt super-powers:

prompt-x — structured 9-field output

This is one canonical authored artifact: an engineered prompt with variables for expanded context and references to specific tools or local folders and files.

Now it's time for the compilation step.

As reviewed earlier, in Claude each field becomes an XML element. In GPT each field becomes a Markdown section under a header. In Gemini, each field becomes an uppercase label followed by its content. In Lovable, the same fields recombine into natural prose — the role becomes the opening paragraph of a development brief, and the tools become named references to existing components or assets.

Four different outputs, one source: that's what a prompt engineering platform looks like when it takes authoring and compilation seriously.

prompt-x — CLEAR evaluation score

Notice what isn't in this picture: we're not talking about retrieval, vector stores that hold data history, context compression, or memory strategies — no infrastructure at all. Those are real and necessary, but they're a different conversation.

Good authoring, structure, and compilation don't replace good retrieval; they make retrieval work by giving the model a clear frame to interpret what it pulls in, helping it digest context inputs more efficiently, and increasing the odds of a high-quality output.

The two layers are complementary, and the industry has built the first. Now it needs the second.

Footnotes and sources

This is an original thesis by Mariano, developed with Claude to create the internal knowledge and resources behind this essay, along with the following key resources. Use them to expand the research:

  1. Prompt Engineering Guide – Nextra
  2. Prompt engineering overview (Anthropic)
  3. Prompt Engineering for AI Guide (Google)
  4. What Is Prompt Engineering? | IBM
  5. Context engineering: Why it's Replacing Prompt Engineering for Enterprise AI Success
  6. Context Engineering vs Prompt Engineering for AI Agents
  7. anthropics/prompt-eng-interactive-tutorial
  8. openai/evals
  9. langchain-ai/context_engineering
  10. Andrej Karpathy (@karpathy) on X
  11. tobi lutke (@tobi) on X
