Patching

Patching adds structured output features to LLM client libraries. This page explains how it works. For most users, from_provider is simpler than manual patching.

Recommended Approach

Use from_provider instead of manual patching. It works the same way across all providers. See the Migration Guide if you're using older patching patterns.

What is Patching?

Patching adds new features to LLM client objects without changing their original code. When Instructor patches a client, it adds:

  • New parameters: response_model, max_retries, and context to completion methods
  • Validation: Checks responses against Pydantic models
  • Retry logic: Retries when validation fails
  • Compatibility: The patched client still works with all original methods

How Patching Works

When Instructor patches a client, it:

  1. Wraps the completion method: Intercepts calls to create() or chat.completions.create()
  2. Converts schemas: Changes Pydantic models into provider-specific formats (JSON schema, tool definitions, etc.)
  3. Validates responses: Checks LLM outputs against your Pydantic model
  4. Handles retries: Retries with validation feedback if needed
  5. Returns typed objects: Converts validated JSON into Pydantic model instances
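
Conceptually, the wrapper behaves like the sketch below. This is illustrative only, not Instructor's actual implementation; tool_schema_for and extract_json are hypothetical helpers standing in for the real schema-conversion and response-parsing steps.

from pydantic import ValidationError


def patched_create(original_create, response_model, max_retries=0, **kwargs):
    # Illustrative sketch only -- not Instructor's real source.
    # tool_schema_for() and extract_json() are hypothetical helpers.
    messages = list(kwargs.pop("messages"))
    last_error = None
    for _ in range(max_retries + 1):
        raw = original_create(
            messages=messages,
            tools=[tool_schema_for(response_model)],  # Pydantic -> provider schema
            **kwargs,
        )
        try:
            # Validate the LLM output against the Pydantic model
            return response_model.model_validate_json(extract_json(raw))
        except ValidationError as exc:
            last_error = exc
            # Feed the validation errors back so the model can correct itself
            messages.append(
                {"role": "user", "content": f"Validation failed, fix these errors: {exc}"}
            )
    raise last_error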

Patching Modes

Different providers support different modes for structured extraction. Instructor automatically selects the best mode for each provider, but you can override it:

Tool Calling (TOOLS)

Uses the provider's function/tool calling API. This is the default for OpenAI and most other providers.

Supported by: OpenAI, Anthropic (ANTHROPIC_TOOLS), Google (GENAI_TOOLS), Ollama (for supported models)

JSON Mode

Instructs the model to return JSON directly. Works with most providers.

Supported by: OpenAI, Anthropic, Google, Ollama, and most providers

Markdown JSON (MD_JSON)

Asks the model to return JSON wrapped in a markdown code block. Use it only for providers that require it, such as Databricks.

Supported by: Databricks, some vision models

Default Modes by Provider

Each provider uses a recommended default mode:

  • OpenAI: Mode.TOOLS (function calling)
  • Anthropic: Mode.TOOLS (tool use)
  • Google: Mode.TOOLS (function calling)
  • Ollama: Mode.TOOLS (if model supports it) or Mode.JSON
  • Others: Provider-specific defaults

When using from_provider, these defaults are applied automatically. You can override them with the mode parameter.
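
For example, you can force JSON mode for a provider whose default is tool calling (assuming the target model supports JSON output):

import instructor

# Override the provider's default TOOLS mode with JSON mode
client = instructor.from_provider(
    "openai/gpt-4o-mini",
    mode=instructor.Mode.JSON,
)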

Manual Patching (Advanced)

If you need to patch a client manually (not recommended for most users):

import openai
import instructor
from pydantic import BaseModel


class YourModel(BaseModel):
    message: str


# Create the base client
openai_client = openai.OpenAI()

# Patch it manually
client = instructor.patch(openai_client, mode=instructor.Mode.TOOLS)

# Now use it
response = client.chat.completions.create(
    response_model=YourModel,
    messages=[{"role": "user", "content": "Say hello"}],
)
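
# Access the validated, typed result
print(response.message)  # response is a YourModel instance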

However, using from_provider is simpler and recommended:

import instructor
from pydantic import BaseModel


# Simpler approach
class YourModel(BaseModel):
    message: str


client = instructor.from_provider("openai/gpt-4o-mini")
response = client.create(
    response_model=YourModel,
    messages=[{"role": "user", "content": "Say hello"}],
)

What Gets Patched?

Instructor adds these features to patched clients:

New Parameters

  • response_model: A Pydantic model or type that defines the expected output structure
  • max_retries: Number of retry attempts if validation fails (default: 0)
  • context: Additional context for validation hooks
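
A short sketch showing all three together; the model name, messages, and the banned_name context key are placeholders:

import instructor
from pydantic import BaseModel, field_validator


class User(BaseModel):
    name: str

    @field_validator("name")
    @classmethod
    def not_banned(cls, v, info):
        # Whatever you pass as context= is available here as info.context
        if info.context and v == info.context.get("banned_name"):
            raise ValueError(f"{v} is not allowed")
        return v


client = instructor.from_provider("openai/gpt-4o-mini")
user = client.create(
    response_model=User,
    max_retries=2,  # retry up to 2 times with validation feedback
    context={"banned_name": "Anonymous"},  # passed through to validators
    messages=[{"role": "user", "content": "Extract the user: John, age 30"}],
)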

Enhanced Methods

The patched client's create() method:

  • Accepts the response_model parameter
  • Validates responses automatically
  • Retries on validation failures
  • Returns typed Pydantic objects instead of raw responses
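
In practice that means the return value is already a model instance rather than a raw completion, e.g. (reusing YourModel from the examples above):

response = client.chat.completions.create(
    response_model=YourModel,
    messages=[{"role": "user", "content": "Say hello"}],
)
assert isinstance(response, YourModel)  # a typed Pydantic object, not raw JSON
print(response.message)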

Provider-Specific Considerations

OpenAI

  • Default mode: TOOLS (function calling)
  • Supports streaming with structured outputs
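
For example, you can stream a progressively filled model with create_partial (a sketch, reusing YourModel from the examples above):

client = instructor.from_provider("openai/gpt-4o-mini")
for partial in client.create_partial(
    response_model=YourModel,
    messages=[{"role": "user", "content": "Say hello"}],
):
    print(partial)  # progressively more complete YourModel instances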

Anthropic

  • Default mode: ANTHROPIC_TOOLS (tool use)
  • Uses Claude's native tool calling API

Google Gemini

  • Default mode: GENAI_TOOLS (function calling)
  • Requires the jsonref package for tool calling
  • Some limitations with strict validation and enums

Ollama (Local Models)

  • Default mode: TOOLS (if model supports it) or JSON
  • Models like llama3.1, llama3.2, and mistral-nemo support tool calling
  • Older models fall back to JSON mode
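
One common setup is to point an OpenAI client at Ollama's OpenAI-compatible endpoint and patch it (a sketch; adjust the base_url and model for your install):

import openai
import instructor

client = instructor.from_openai(
    openai.OpenAI(
        base_url="http://localhost:11434/v1",  # Ollama's default local address
        api_key="ollama",  # required by the client library, ignored by Ollama
    ),
    mode=instructor.Mode.JSON,  # JSON mode for models without tool support
)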

When to Use Manual Patching

Manual patching is rarely needed. Use it only if:

  1. You need fine-grained control over the patching process
  2. You're working with a custom client implementation
  3. You're debugging patching behavior

For 99% of use cases, from_provider is the better choice.