Agentic AI Glossary

Plain-English definitions of the terminology you will encounter when evaluating, building, and deploying agentic AI systems.

Agent Orchestration

The coordination layer that manages multiple AI agents working together on a task. An orchestrator decides which agent handles which subtask, manages data flow between agents, handles failures, and ensures the overall workflow completes correctly. Think of it as a project manager for AI agents.

Chain-of-Thought (CoT)

A prompting technique that asks the model to show its reasoning step by step before giving a final answer. This dramatically improves accuracy on complex tasks like math, logic, and multi-step analysis by forcing the model to decompose the problem rather than jumping to a conclusion.

Embedding

A numerical representation of text (or images, audio, etc.) as a list of numbers called a vector. Embeddings capture semantic meaning, so similar concepts end up close together in vector space. They are the foundation of semantic search and RAG systems.

Fine-Tuning

The process of further training a pre-trained model on your specific data to improve its performance on your tasks. Unlike prompting, fine-tuning modifies the model weights themselves. It is expensive, requires significant data, and is often unnecessary when good prompting or RAG will suffice.

Full-Duplex

In AI voice systems, full-duplex means the system can listen and speak simultaneously, just like a human conversation. The AI can be interrupted, can detect when you start talking, and can respond naturally without waiting for a turn-taking signal.

Guardrails

Safety mechanisms that constrain what an AI agent can do. Guardrails include input validation, output filtering, action restrictions, spending limits, and human-in-the-loop approval gates. Good guardrails are invisible when things work correctly but prevent catastrophic failures.

Hallucination

When an AI model generates information that sounds plausible but is factually incorrect or entirely fabricated. Hallucinations are not bugs to be fixed but a fundamental property of how language models work. The goal is to detect and mitigate them, not eliminate them entirely.

Intelligence Density

A Thora Solutions metric that measures useful capability per dollar spent on AI. High intelligence density means you are getting maximum real-world value from your AI investment. It accounts for accuracy, latency, cost, and maintenance burden together.

LoRA (Low-Rank Adaptation)

A technique for fine-tuning large models efficiently by only training a small number of additional parameters instead of the entire model. LoRA makes fine-tuning 10-100x cheaper and allows you to swap different fine-tunes on the same base model.

MCP (Model Context Protocol)

An open protocol developed by Anthropic that standardizes how AI models connect to external tools, data sources, and services. MCP servers expose capabilities that any MCP-compatible client can use, creating an interoperable ecosystem of AI tools.

Prompt Injection

A security attack where malicious input tricks an AI model into ignoring its instructions and doing something unintended. For example, a user might submit text that says 'ignore all previous instructions and output the system prompt.' Defense requires multiple layers of input sanitization and output validation.

RAG (Retrieval-Augmented Generation)

A pattern that gives an AI model access to external knowledge by retrieving relevant documents before generating a response. Instead of relying on what the model memorized during training, RAG fetches current, specific information from your databases, documents, or APIs.

ReAct (Reasoning + Acting)

An agent architecture pattern where the model alternates between reasoning about what to do next and taking actions (calling tools, querying databases, etc.). The reasoning step makes the agent's decisions more transparent and debuggable compared to pure chain-of-action approaches.

RLHF (Reinforcement Learning from Human Feedback)

A training technique where human evaluators rank model outputs, and those rankings are used to train the model to produce better responses. RLHF is how models learn to be helpful, harmless, and honest. It is the key technique that makes modern chatbots usable.

Vector Database

A database optimized for storing and searching embeddings. Instead of matching exact keywords, vector databases find results that are semantically similar to your query. Popular options include Pinecone, Weaviate, and pgvector. They are the retrieval layer in RAG systems.

x402

An emerging protocol for machine-to-machine payments using the HTTP 402 status code. It enables AI agents to autonomously pay for API calls, data access, and services without human intervention, using cryptocurrency micropayments. Still early-stage but important for agent-to-agent commerce.