Introduction
Standard large language models respond to user queries by generating plain text. This works well for many applications like chatbots, but if you want to programmatically access details in the response, plain text is hard to work with. Some models can respond with structured JSON instead, making it easy to work with data from the LLM's output directly in your application code. If you're using a supported model, you can enable structured responses by providing your desired schema details to the response_format key of the Chat Completions API.
Supported models
The following models currently support JSON mode:
- meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo (32K context)
- meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo
- meta-llama/Llama-3.2-3B-Instruct-Turbo
- meta-llama/Llama-3.3-70B-Instruct-Turbo
- meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
- meta-llama/Llama-4-Scout-17B-16E-Instruct
- deepseek-ai/DeepSeek-V3
- Qwen/Qwen3-235B-A22B-fp8-tput
- Qwen/Qwen2.5-VL-72B-Instruct
Basic example
Let's look at a simple example, where we pass a transcript of a voice note to a model and ask it to summarize it. We want the summary to have a specific structure, which we describe with a JSON schema and pass to the response_format key.
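One common way to produce that JSON schema is with a Pydantic model, whose model_json_schema() method generates it for us. Here is a minimal sketch; the field names are illustrative assumptions, not something the API prescribes:

```python
# A sketch of the summary structure as a Pydantic model.
# Field names (title, summary, action_items) are illustrative assumptions.
from pydantic import BaseModel, Field

class VoiceNoteSummary(BaseModel):
    title: str = Field(description="A short title for the voice note")
    summary: str = Field(description="A one-sentence summary of the voice note")
    action_items: list[str] = Field(description="Action items mentioned in the note")

# model_json_schema() produces the JSON Schema dict we pass to the API.
print(VoiceNoteSummary.model_json_schema())
```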
Finally (and this is important), we need to make sure to instruct our model to only respond in JSON format. This ensures it will actually use the schema we provide when generating its response.
Important: You must always instruct your model to only respond in JSON format, either in the system prompt or a user message, in addition to passing your schema to the response_format key.
Let's see what this looks like:
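Below is a sketch of the full request, assuming the Together Python SDK and the illustrative VoiceNoteSummary model from above (repeated here so the snippet runs on its own). The response_format payload follows Together's JSON-mode shape of a type plus a schema; verify the exact shape against the current API reference. The transcript string is a stand-in:

```python
import json

from pydantic import BaseModel, Field
from together import Together

# Repeated from the previous snippet; field names are illustrative.
class VoiceNoteSummary(BaseModel):
    title: str = Field(description="A short title for the voice note")
    summary: str = Field(description="A one-sentence summary of the voice note")
    action_items: list[str] = Field(description="Action items mentioned in the note")

client = Together()  # reads TOGETHER_API_KEY from the environment

transcript = "..."  # stand-in for the real voice note transcript

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    messages=[
        # The explicit "only answer in JSON" instruction is required
        # in addition to the schema passed below.
        {"role": "system", "content": "Summarize the voice note. Only answer in JSON."},
        {"role": "user", "content": transcript},
    ],
    response_format={
        "type": "json_object",
        "schema": VoiceNoteSummary.model_json_schema(),
    },
)

# The model replies with a JSON string that matches the schema.
summary = json.loads(response.choices[0].message.content)
print(summary["title"], summary["action_items"])
```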
Vision model example
Let's look at another example, this time using a vision model. We want our LLM to extract text from a screenshot of a Trello board.
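The screenshot itself isn't reproduced here, so the image URL below is a placeholder. Here is a sketch of the request, assuming the same SDK, an illustrative board schema, and the OpenAI-style image_url message format that vision models accept:

```python
import json

from pydantic import BaseModel, Field
from together import Together

# Illustrative schema for the extracted board text; adjust to your needs.
class TrelloList(BaseModel):
    name: str = Field(description="The title of the list")
    cards: list[str] = Field(description="The card titles in the list")

class TrelloBoard(BaseModel):
    lists: list[TrelloList] = Field(description="All lists visible on the board")

client = Together()

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-VL-72B-Instruct",  # a vision model from the supported list
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract the text from this Trello board. Only answer in JSON."},
                # Placeholder URL; substitute your actual screenshot.
                {"type": "image_url", "image_url": {"url": "https://example.com/trello-board.png"}},
            ],
        },
    ],
    response_format={
        "type": "json_object",
        "schema": TrelloBoard.model_json_schema(),
    },
)

board = json.loads(response.choices[0].message.content)
print(json.dumps(board, indent=2))
```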
Try out your code in the Together Playground
You can try out JSON Mode in the Together Playground to test variations on your schema and prompt.