Prompt Techniques

Each technique below is summarized along five dimensions: description, mechanism, benefits, limitations, and typical use cases.
Zero-shot Prompting

Description: The LLM is directly prompted to perform a task without any task-specific examples, relying on its pre-trained knowledge to infer an appropriate response.
Mechanism: The LLM leverages its extensive pre-trained knowledge base to understand the given prompt and generate a relevant response without seeing any task-specific examples. It is direct task instruction.
Benefits: Simplicity; works out of the box for many tasks; ideal for quick queries.
Limitations: Can struggle with complex or novel tasks; may produce less accurate or inconsistent results compared to methods that provide examples.
Use Cases: Question answering, summarization, simple text generation, language translation.
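
A minimal sketch of a zero-shot prompt; the `llm()` helper is a hypothetical stand-in for any chat-completion API call, and the prompt wording is illustrative:

```python
def llm(prompt: str) -> str: ...  # hypothetical: wire to any chat-completion API

# Zero-shot: the task instruction alone, with no examples.
prompt = (
    "Classify the sentiment of the following review as positive or negative.\n"
    "Review: The battery died after two days.\n"
    "Sentiment:"
)
answer = llm(prompt)
```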

Few-shot Prompting

Description: The LLM is given a small number of input-output examples (typically 1-5) of a task within the prompt to guide its understanding and generation of the desired output.
Mechanism: The LLM observes the provided examples, learns the pattern and desired behavior, and applies this learned context to a new, unseen input of the same task. This is in-context learning.
Benefits: Significantly improves performance on many tasks, especially those with specific formats or styles; reduces the need for extensive fine-tuning; adapts to new tasks quickly.
Limitations: Performance is highly dependent on the quality and diversity of the provided examples; prompt length can become an issue with too many examples; still less robust than fine-tuning for highly specialized tasks.
Use Cases: Sentiment analysis, classification, question answering with specific formats, code generation, creative writing.
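
A sketch of how few-shot examples are assembled into a single prompt, again assuming a hypothetical `llm()` wrapper; the examples themselves are illustrative:

```python
def llm(prompt: str) -> str: ...  # hypothetical chat-completion wrapper

examples = [
    ("I loved this film!", "positive"),
    ("Terrible service, never again.", "negative"),
    ("The plot was dull but the acting was great.", "mixed"),
]
new_input = "Shipping was fast and the product works perfectly."

# Few-shot: demonstrate the input -> output pattern, then append the new input.
prompt = "\n".join(f"Text: {t}\nLabel: {l}" for t, l in examples)
prompt += f"\nText: {new_input}\nLabel:"
answer = llm(prompt)
```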

Chain-of-Thought (CoT) Prompting

Description: Encourages LLMs to break a complex problem into a series of intermediate reasoning steps before arriving at a final answer, mimicking human-like thinking.
Mechanism: The prompt explicitly instructs the LLM to "think step by step" or provides examples of step-by-step reasoning. The LLM generates these intermediate thoughts, which then guide its subsequent reasoning and final answer.
Benefits: Improves accuracy on complex reasoning tasks (e.g., math word problems, commonsense reasoning); provides transparency into the model's thought process; lets the model allocate more compute to problems requiring more reasoning.
Limitations: Slower due to the extra reasoning steps; works best with larger models; the generated reasoning may not reflect the model's true internal computation; can overcomplicate simple questions.
Use Cases: Mathematical reasoning, symbolic manipulation, complex question answering, commonsense reasoning, programming.
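
A minimal zero-shot-CoT sketch: the trigger phrase "Let's think step by step" is the only change from a plain prompt (hypothetical `llm()` wrapper as before):

```python
def llm(prompt: str) -> str: ...  # hypothetical chat-completion wrapper

question = ("A cafeteria had 23 apples. It used 20 for lunch and bought 6 more. "
            "How many apples does it have?")

# The trigger phrase elicits intermediate reasoning before the final answer.
prompt = f"Q: {question}\nA: Let's think step by step."
reasoning_and_answer = llm(prompt)
```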

Meta Prompting

Description: The LLM is prompted to generate or refine prompts to better perform a given task; the LLM is used to optimize its own instructions.
Mechanism: The user asks the LLM to create or modify a prompt for a specific goal. Through iterative dialogue, the LLM refines the prompt based on user feedback, aiming to produce a more effective instruction for itself or another LLM.
Benefits: Improves prompt quality and effectiveness; reduces manual prompt-engineering effort; allows dynamic prompt adaptation; can save time in finding optimal prompts.
Limitations: Requires a good initial understanding of the desired outcome; can be computationally intensive if many iterations are needed; the quality of meta-prompts depends on the LLM's ability to self-critique.
Use Cases: Prompt optimization, creating tailored instructions for specific tasks, refining conversational flows, content generation for specific tones or styles.
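
A rough sketch of one meta-prompting round, assuming the same hypothetical `llm()` helper; the task and wording are illustrative:

```python
def llm(prompt: str) -> str: ...  # hypothetical chat-completion wrapper

task = "summarize customer support tickets into one sentence with a priority tag"

# Step 1: ask the model to write a better prompt for the task.
meta_prompt = (
    f"Write a clear, detailed prompt that instructs an LLM to {task}. "
    "Include output format requirements. Return only the prompt."
)
improved_prompt = llm(meta_prompt)

# Step 2: use the generated prompt on real input.
result = llm(improved_prompt +
             "\n\nTicket: My invoice total is wrong and I was charged twice.")
```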

Self-Consistency

Description: Improves the reliability and accuracy of LLM outputs by generating multiple diverse reasoning paths or responses to a prompt and selecting the most consistent or most frequent answer among them.
Mechanism: The LLM is prompted multiple times, often with CoT, to generate diverse outputs. For quantitative answers, a majority vote selects the most common one; for qualitative answers, the LLM itself or an external evaluator determines the most coherent response.
Benefits: Improves accuracy, especially on tasks requiring reasoning or interpretation (e.g., math, code generation); useful for free-form text generation where exact answers are not expected; reduces hallucination.
Limitations: Computationally more expensive due to multiple generations; may not converge to a single "best" answer for highly subjective tasks; requires a mechanism to evaluate consistency.
Use Cases: Mathematical problem solving, code generation, summarization, open-ended question answering, creative text generation.
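
A sketch of majority-vote self-consistency, assuming a hypothetical `llm()` wrapper that samples with temperature > 0 and a model that ends its output with "Answer: <value>":

```python
from collections import Counter

def llm(prompt: str) -> str: ...  # hypothetical wrapper; sample with temperature > 0

def final_answer(output: str) -> str:
    # Assumes the model ends its reasoning with "Answer: <value>".
    return output.rsplit("Answer:", 1)[-1].strip()

prompt = "Q: If 3 pens cost $4.50, how much do 7 pens cost?\nA: Let's think step by step."
samples = [final_answer(llm(prompt)) for _ in range(10)]  # 10 diverse reasoning paths
answer, votes = Counter(samples).most_common(1)[0]        # majority vote
```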

Generate Knowledge Prompting

Description: The LLM is prompted to first generate relevant knowledge or facts pertinent to a query, then use that generated knowledge to formulate its final answer.
Mechanism: The prompt is structured in two stages: first, an instruction to generate background information, facts, or concepts related to the query; second, an instruction to answer the original query using the newly generated knowledge.
Benefits: Gives the LLM a "self-curated" knowledge base for the specific query, potentially leading to more accurate and grounded responses; reduces reliance on the model's internal parameters alone for factual accuracy; can mitigate hallucination.
Limitations: Adds an extra step, increasing latency; the quality of the final answer depends on the accuracy and relevance of the generated knowledge; the model can still generate incorrect knowledge.
Use Cases: Factual question answering, research assistance, content creation requiring specific factual grounding.
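
The two-stage structure in sketch form (hypothetical `llm()` wrapper; the question is illustrative):

```python
def llm(prompt: str) -> str: ...  # hypothetical chat-completion wrapper

question = "Do penguins live at the North Pole?"

# Stage 1: generate background facts relevant to the question.
knowledge = llm(f"List key facts relevant to answering: {question}")

# Stage 2: answer using the generated knowledge as context.
answer = llm(f"Facts:\n{knowledge}\n\nUsing the facts above, answer: {question}")
```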

Prompt Chaining

Description: Breaks a complex task into a series of smaller, sequential prompts, where the output of one prompt serves as input or context for the next.
Mechanism: A sequence of prompts is defined, each performing a specific sub-task. The LLM processes the first prompt, its output is captured and concatenated with the next prompt in the chain, and so on until the final desired output is produced.
Benefits: Handles multi-step complex tasks; ensures coherence and consistency across generated text; gives more control over the generation process; can be used to build sophisticated conversational agents.
Limitations: Requires careful design of each prompt in the chain; errors in earlier prompts propagate; debugging can be challenging; can be resource-intensive for very long chains.
Use Cases: Multi-turn conversations, data extraction pipelines, structured content generation (e.g., reports, articles with specific sections), complex problem-solving workflows.
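
A three-step chain in sketch form, where each output is spliced into the next prompt (hypothetical `llm()` wrapper; the pipeline itself is illustrative):

```python
def llm(prompt: str) -> str: ...  # hypothetical chat-completion wrapper

document = "..."  # source text to process

# Each sub-task gets its own prompt; each output feeds the next one.
quotes = llm(f"Extract all quotes about pricing from this document:\n{document}")
draft = llm(f"Using only these quotes, draft a one-paragraph pricing summary:\n{quotes}")
final = llm(f"Edit this paragraph for a non-technical audience:\n{draft}")
```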

Tree of Thoughts (ToT)

Description: An advanced reasoning framework that generalizes Chain-of-Thought by exploring multiple reasoning paths in a tree structure, allowing backtracking and exploration of alternative solutions.
Mechanism: Instead of a single linear chain, the LLM generates multiple "thoughts" (intermediate reasoning steps) at each stage. These thoughts form branches; the system evaluates their viability, prunes unpromising paths, and backtracks to explore others, often using a search algorithm such as breadth-first or depth-first search.
Benefits: Significantly improves performance on tasks requiring deeper, strategic thinking and decision-making; allows more robust exploration of problem spaces; can mitigate early errors by exploring alternative paths.
Limitations: Computationally intensive due to the exploration of multiple paths; complex to implement and tune; requires careful design of evaluation functions for pruning and path selection.
Use Cases: Planning, complex problem solving (e.g., game playing, logistics), creative generation with constraints, code debugging.
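
A heavily simplified breadth-first ToT loop; the `propose`/`score` prompts, beam width, and depth are illustrative assumptions, not a reference implementation:

```python
def llm(prompt: str) -> str: ...  # hypothetical chat-completion wrapper

def propose(state: str, k: int = 3) -> list[str]:
    # Branch: ask for k candidate next steps from the current partial solution.
    out = llm(f"Problem state:\n{state}\nPropose {k} distinct next steps, one per line.")
    return [state + "\n" + line for line in out.splitlines() if line.strip()][:k]

def score(state: str) -> float:
    # Evaluate: have the model rate how promising a partial solution looks.
    return float(llm(f"Rate from 0 to 10 how promising this partial solution is. "
                     f"Reply with a number only.\n{state}"))

frontier = ["Goal: combine 4, 9, 10, 13 with + - * / to make 24."]
for _ in range(3):  # search depth
    candidates = [c for s in frontier for c in propose(s)]
    frontier = sorted(candidates, key=score, reverse=True)[:2]  # keep best 2, prune the rest
```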

Retrieval Augmented Generation (RAG)

Description: A framework that combines the generative capabilities of LLMs with external information-retrieval systems (e.g., databases, search engines) to ground responses in factual, up-to-date knowledge.
Mechanism: When a query is received, a retrieval component searches an external knowledge base for relevant documents or information. The retrieved information is provided as context to the LLM, which uses both its internal knowledge and the external data to generate a more accurate, informed response.
Benefits: Addresses LLM limitations such as hallucination and outdated information; provides access to fresh, domain-specific, or proprietary data; enhances factual accuracy and trustworthiness; reduces the need for constant model retraining.
Limitations: Requires an efficient, relevant retrieval system; the knowledge base can be complex to set up and maintain; irrelevant or low-quality retrieved information introduces noise; the retrieval step adds latency.
Use Cases: Factual question answering, enterprise chatbots, knowledge management, summarization of specific documents, legal research.
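
A minimal RAG sketch; both `llm()` and the `search()` retriever are hypothetical placeholders for a real model call and document store:

```python
def llm(prompt: str) -> str: ...                   # hypothetical chat-completion wrapper
def search(query: str, k: int) -> list[str]: ...   # hypothetical retriever over a document store

question = "What changed in our refund policy in 2024?"
passages = search(question, k=3)  # retrieval step

# Ground the generation in the retrieved context.
context = "\n---\n".join(passages)
answer = llm(
    f"Context:\n{context}\n\n"
    "Answer the question using only the context above; "
    f"if it is insufficient, say so.\nQuestion: {question}"
)
```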

Automatic Reasoning and Tool-use (ART)

Description: A framework that enables LLMs to automatically decompose problems, reason through the steps, and dynamically use external tools (e.g., calculators, search engines, code interpreters) to solve complex tasks.
Mechanism: ART maintains a task library of example problems with step-by-step solutions and a tool library. Given a new problem, it retrieves similar examples, guides the LLM to generate a step-by-step solution, and automatically pauses for tool use, coordinating between the LLM's reasoning and the external tools.
Benefits: Extends LLM capabilities beyond pure language generation; improves accuracy on multi-step reasoning problems; reduces manual prompt engineering for tool integration; increases problem-solving flexibility.
Limitations: Can suffer from cascading errors if early steps are incorrect; performance is limited by the quality of generated code or tool usage; task selection from the library is not always ideal; integrating many tools can be complex.
Use Cases: Scientific problem solving, complex data analysis, interactive agents, technical support, educational tutoring systems.
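
A toy sketch of the ART pattern under strong assumptions: a one-entry task library, a single calculator tool, and a model that imitates the demonstrated Tool[...] format:

```python
def llm(prompt: str) -> str: ...  # hypothetical chat-completion wrapper

TOOLS = {"calculator": lambda expr: str(eval(expr))}  # toy one-tool library

# Task library: a worked example showing the step / tool-call format.
TASK_LIBRARY = (
    "Task: What is 12% of 250?\n"
    "Step: multiply 0.12 by 250\n"
    "Tool[calculator]: 0.12 * 250\n"
    "Observation: 30.0\n"
    "Answer: 30.0"
)

task = "Task: A shirt costs $40 after a 20% discount. What was the original price?"
output = llm(TASK_LIBRARY + "\n\n" + task)

# Pause at each tool call, execute it, and record the observation.
for line in output.splitlines():
    if line.startswith("Tool[calculator]:"):
        observation = TOOLS["calculator"](line.split(":", 1)[1])
```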

Automatic Prompt Engineer (APE)

Description: An approach in which an AI system autonomously generates, optimizes, and selects prompts to achieve desired LLM outputs, significantly reducing manual effort.
Mechanism: APE typically receives input-output pairs as examples. It then uses techniques such as reinforcement learning, gradient-based optimization, or meta-prompting to generate and evaluate candidate prompts, iterating and refining based on how well the resulting responses match the expected outputs.
Benefits: Dramatically reduces the time and effort required for prompt engineering; can discover highly effective prompts that humans might miss; improves the consistency and quality of LLM outputs across applications.
Limitations: Optimization can be computationally expensive; requires significant labeled data; the interpretability of automatically generated prompts can be low.
Use Cases: Optimizing prompts for chatbots, content generation, data extraction, tuning LLM behavior for specific tasks.
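
A bare-bones APE-style loop, assuming a hypothetical `llm()` wrapper: propose candidate instructions from demonstrations, then keep the one with the best execution accuracy on those demonstrations:

```python
def llm(prompt: str) -> str: ...  # hypothetical chat-completion wrapper

demos = [("cat", "cats"), ("box", "boxes"), ("city", "cities")]
demo_text = "\n".join(f"{x} -> {y}" for x, y in demos)

# Propose candidate instructions from the demonstrations.
candidates = llm(
    f"Here are input-output pairs:\n{demo_text}\n"
    "Write 5 different instructions that would map each input to its output, one per line."
).splitlines()

# Score each candidate by execution accuracy on the demos and keep the best.
def accuracy(instruction: str) -> float:
    outputs = [llm(f"{instruction}\nInput: {x}\nOutput:").strip() for x, _ in demos]
    return sum(o == y for o, (_, y) in zip(outputs, demos)) / len(demos)

best_prompt = max(candidates, key=accuracy)
```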

Active-Prompt

Description: Improves Chain-of-Thought (CoT) prompting by concentrating human annotation on the exemplars where the model is most uncertain.
Mechanism: Instead of uniformly sampling examples, Active-Prompt identifies the instances where the LLM is most uncertain about its reasoning path. These uncertain instances are prioritized for human annotation to create high-quality CoT exemplars, which are then used to improve the LLM's performance.
Benefits: Focuses human annotation effort on the most impactful examples; can achieve better performance with less labeling work; improves the robustness of CoT reasoning.
Limitations: Requires a mechanism to quantify LLM uncertainty; involves a human in the loop for selective annotation; more complex to implement than standard CoT.
Use Cases: Improving CoT for specific domains or challenging tasks where the model frequently errs; active-learning scenarios for prompt engineering.
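
A sketch of the uncertainty-ranking step only, using answer disagreement across samples as the uncertainty proxy (one of several possible metrics); `llm()` is hypothetical and the question pool is illustrative:

```python
def llm(prompt: str) -> str: ...  # hypothetical wrapper; sample with temperature > 0

def disagreement(question: str, k: int = 8) -> float:
    # Uncertainty proxy: how many distinct final answers appear across k samples.
    outputs = [llm(f"Q: {question}\nA: Let's think step by step.") for _ in range(k)]
    finals = {o.rsplit("Answer:", 1)[-1].strip() for o in outputs}
    return len(finals) / k

pool = ["question 1 ...", "question 2 ...", "question 3 ..."]  # unlabeled questions
# Route the most-uncertain questions to human annotators for CoT exemplars.
to_annotate = sorted(pool, key=disagreement, reverse=True)[:2]
```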

Directional Stimulus Prompting (DSP)

Description: A framework for guiding black-box LLMs toward specific desired outputs by introducing a subtle, instance-specific "directional stimulus" (hints or cues) generated by a smaller, tunable policy model.
Mechanism: A small policy model (e.g., a fine-tuned T5) generates an auxiliary stimulus prompt for each input. This stimulus acts as a hint, subtly steering the larger black-box LLM (e.g., GPT-3, GPT-4) toward a desired outcome without adjusting the LLM's parameters. The policy model is optimized via supervised fine-tuning and reinforcement learning.
Benefits: Provides fine-grained, query-specific guidance for black-box LLMs; bypasses the need for direct LLM fine-tuning; allows more controlled output generation (e.g., including specific keywords); can enhance reasoning accuracy.
Limitations: Relies on the effectiveness of the smaller policy model; can still steer the LLM toward biased or harmful content if the policy model is optimized incorrectly; adds an extra layer of complexity.
Use Cases: Summarization with specific keyword requirements, dialogue response generation, controlled text generation, guiding LLMs toward specific stylistic outputs.
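
A sketch of inference-time DSP only, with both the black-box `llm()` and the small `policy_model()` as hypothetical stubs (training the policy model is out of scope here):

```python
def llm(prompt: str) -> str: ...          # hypothetical black-box LLM (e.g., an API model)
def policy_model(text: str) -> str: ...   # hypothetical small tuned model emitting hints

article = "..."  # document to summarize

# The stimulus is an instance-specific hint, e.g. keywords the summary should cover.
stimulus = policy_model(article)  # e.g. "keywords: merger, antitrust, Q3 revenue"
summary = llm(f"Article:\n{article}\n\nHint: {stimulus}\n\nWrite a two-sentence summary.")
```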

Program-Aided Language Models (PAL)

Description: Combines LLMs with external interpreters (e.g., Python) to improve reasoning, particularly on mathematical, logical, and algorithmic problems.
Mechanism: The LLM interprets the natural-language prompt and, as its reasoning step, generates a program (e.g., Python code) encapsulating the logic required to solve the problem. The program is executed by an external interpreter, and the result is used to formulate the final answer.
Benefits: Significantly improves accuracy on tasks requiring precise computation or symbolic manipulation; offloads complex calculations to reliable external tools; lets LLMs leverage programming logic beyond their internal neural-network capabilities.
Limitations: Requires the LLM to have sufficient coding ability; introduces a dependency on external interpreters; debugging errors in generated code can be challenging; execution time includes both code generation and interpretation.
Use Cases: Mathematical problem solving, data processing, logical puzzles, algorithmic tasks, generating and executing SQL queries.
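
A minimal PAL sketch (hypothetical `llm()` wrapper); note that executing model-generated code should be sandboxed in practice:

```python
def llm(prompt: str) -> str: ...  # hypothetical chat-completion wrapper

question = ("Roger has 5 tennis balls. He buys 2 cans of 3 balls each. "
            "How many balls does he have now?")
code = llm(
    "Write Python that computes the answer to the question below and stores it "
    f"in a variable named `answer`. Return only code.\nQ: {question}"
)

# Offload the arithmetic to the interpreter instead of the model.
namespace: dict = {}
exec(code, namespace)       # caution: sandbox model-generated code in real systems
print(namespace["answer"])  # expected: 11
```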

ReAct

Description: A prompting strategy that interleaves Thought (reasoning) and Action (tool use) steps, allowing LLMs to dynamically reason and interact with external environments to gather information or perform operations.
Mechanism: The LLM generates a Thought explaining its reasoning, then an Action specifying a tool to use (e.g., search engine, calculator) and its arguments. The tool's output is observed, and the LLM generates another Thought and Action based on the new information, iterating until the task is complete.
Benefits: Enables multi-step tasks requiring external knowledge or computation; provides a transparent trace of reasoning and actions; allows dynamic planning and adaptation to environmental feedback; reduces hallucination by grounding responses in real-world data.
Limitations: Designing effective action spaces and tool integrations is complex; performance depends heavily on tool quality and the LLM's ability to interpret tool outputs; potential for infinite loops or incorrect action sequences.
Use Cases: Complex question answering requiring external data, interactive problem solving, web browsing, task automation, scientific experimentation.
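
A compressed ReAct loop with a single hypothetical `web_search` tool; the Thought/Action/Observation format and the parsing are illustrative:

```python
def llm(prompt: str) -> str: ...         # hypothetical chat-completion wrapper
def web_search(query: str) -> str: ...   # hypothetical search tool

question = "Which film won Best Picture the year the first iPhone was released?"
transcript = f"Question: {question}\n"

for _ in range(5):  # cap iterations to avoid runaway action sequences
    step = llm(transcript +
               "Continue with 'Thought: ...' then either 'Action: Search[query]' "
               "or 'Action: Finish[answer]'.")
    transcript += step + "\n"
    if "Finish[" in step:
        break
    if "Search[" in step:
        query = step.split("Search[", 1)[1].split("]", 1)[0]
        transcript += f"Observation: {web_search(query)}\n"  # feed the tool output back
```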

Reflexion

Description: A general prompting strategy in which LLMs analyze their own outputs, behaviors, knowledge, or reasoning to identify errors and iteratively improve their performance.
Mechanism: The LLM generates an initial output. A "reflection" module (another LLM or a set of rules) critically assesses this output, providing feedback, critique, or recommendations. The original LLM uses this feedback to produce a refined output, repeating until a satisfactory result is reached.
Benefits: Enables self-correction and continuous improvement without human intervention; enhances the robustness and reliability of outputs; provides a mechanism for learning from past mistakes; useful for tasks requiring high accuracy.
Limitations: Computationally intensive due to iterative refinement; the quality of reflection depends heavily on the reflection mechanism; potential for "reflection loops" if the model struggles to self-correct; may not converge to the optimal solution.
Use Cases: Code debugging and improvement, content refinement, factual correction, complex task completion where iterative improvement helps.
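
A sketch of a bounded generate-critique-revise loop, assuming a hypothetical `llm()` wrapper and using the text "DONE" as an illustrative stop signal:

```python
def llm(prompt: str) -> str: ...  # hypothetical chat-completion wrapper

task = "Write a Python function that checks whether a string is a palindrome."
attempt = llm(task)

for _ in range(3):  # bounded refinement loop
    critique = llm(f"Task: {task}\nAttempt:\n{attempt}\n"
                   "List concrete errors or weaknesses, or reply DONE if there are none.")
    if critique.strip() == "DONE":
        break
    attempt = llm(f"Task: {task}\nAttempt:\n{attempt}\nCritique:\n{critique}\n"
                  "Rewrite the attempt, addressing every point in the critique.")
```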

Multimodal CoT

Description: Extends Chain-of-Thought reasoning to multiple modalities, typically language (text) and vision (images), allowing LLMs to reason over information presented in different forms.
Mechanism: In a two-stage framework, the model first generates intermediate reasoning chains (rationales) from both text and image inputs, then uses these multimodal rationales to infer the final answer.
Benefits: Enables LLMs to solve problems requiring the integration of information from diverse modalities; improves performance on multimodal reasoning tasks; can mitigate hallucination by grounding reasoning in visual evidence.
Limitations: Requires multimodal LLMs capable of processing and integrating different data types; aligning information across modalities is complex; rationale quality depends on how well the multimodal inputs are integrated.
Use Cases: Visual question answering (VQA), science questions with diagrams, image captioning with reasoning, medical diagnosis from reports and scans.
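
A sketch of the two-stage flow, with `mm_llm()` as a hypothetical vision-language wrapper that accepts text plus an image path; the example question and file are illustrative:

```python
def mm_llm(prompt: str, image_path: str) -> str: ...  # hypothetical vision-language wrapper

image = "circuit_diagram.png"
question = "Will the bulb light up when the switch is closed?"

# Stage 1: generate a rationale grounded in both the image and the text.
rationale = mm_llm(f"Question: {question}\n"
                   "Describe the relevant visual evidence step by step.", image)

# Stage 2: infer the final answer from the multimodal rationale.
answer = mm_llm(f"Question: {question}\nRationale: {rationale}\nAnswer:", image)
```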

Graph Prompting

Description: Prompts LLMs by presenting knowledge as a structured graph (e.g., a knowledge graph), allowing the LLM to leverage relational information.
Mechanism: The knowledge is preprocessed and presented to the LLM as a structured graph, or a description derived from it, highlighting entities and their relationships. The prompt guides the LLM to reason over this graph structure to answer queries or generate text.
Benefits: Leverages the rich relational information in knowledge graphs; improves factual consistency and reasoning over structured data; enhances understanding of complex relationships between entities; allows more precise, grounded responses.
Limitations: Requires an existing or constructed knowledge graph; converting information into a graph structure can be complex; the LLM must be adept at interpreting graph representations.
Use Cases: Question answering over knowledge graphs, fact extraction, entity linking, semantic search, reasoning in complex domains (e.g., biomedical, legal).
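
A sketch of serializing knowledge-graph triples into a prompt (hypothetical `llm()` wrapper; the triples and serialization format are illustrative):

```python
def llm(prompt: str) -> str: ...  # hypothetical chat-completion wrapper

# Knowledge-graph triples: (subject, relation, object).
triples = [
    ("aspirin", "inhibits", "COX-1"),
    ("COX-1", "produces", "thromboxane A2"),
    ("thromboxane A2", "promotes", "platelet aggregation"),
]

# Serialize the graph so the model can reason over entities and relations.
graph_text = "\n".join(f"({s}) -[{r}]-> ({o})" for s, r, o in triples)
answer = llm(f"Knowledge graph:\n{graph_text}\n\n"
             "Using only the graph above, explain why aspirin reduces blood clotting.")
```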