Log LLM calls

This guide will cover how to log LLM calls to LangSmith when you are using a custom model or a custom input/output format. To make the most of LangSmith’s LLM trace processing, you should log your LLM traces in one of the specified formats. LangSmith offers the following benefits for LLM traces:

Rich, structured rendering of message lists
Token and cost tracking per LLM call, per trace and across traces over time

If you don’t log your LLM traces in the suggested formats, you will still be able to log the data to LangSmith, but it may not be processed or rendered in expected ways. If you are using LangChain OSS to call language models or LangSmith wrappers (OpenAI, Anthropic), these approaches will automatically log traces in the correct format.

The examples on this page use the traceable decorator/wrapper to log the model run (which is the recommended approach for Python and JS/TS). However, the same idea applies if you are using the RunTree or API directly.

Messages Format

When tracing a custom model or a custom input/output format, the logged inputs and outputs must follow one of three formats: the LangChain format, the OpenAI Chat Completions format, or the Anthropic Messages format. For more details on the latter two, refer to the OpenAI Chat Completions or Anthropic Messages documentation. The LangChain format is:

messages (array, required): A list of messages containing the content of the conversation.

role (string, required): Identifies the message type. One of: system | reasoning | user | assistant | tool
content (array, required): Content of the message. List of typed dictionaries.

Content options

type (string, required)

Content option: text
type (literal('text'), required)
text (string, required): Text content.
annotations (object[]): List of annotations for the text.
extras (object): Additional provider-specific data.

Content option: reasoning
type (literal('reasoning'), required)
text (string, required): Text content.
extras (object): Additional provider-specific data.

Content option: image
type (literal('image'), required)
url (string): URL pointing to the image location.
base64 (string, required): Base64-encoded image data.
(string): Reference ID to an externally stored image (e.g., in a provider’s file system or in a bucket).
mime_type (string): Image MIME type (e.g., image/jpeg, image/png).

Content option: file (e.g., PDFs)
type (literal('file'), required)
url (string): URL pointing to the file.
base64 (string, required): Base64-encoded file data.
(string): Reference ID to an externally stored file (e.g., in a provider’s file system or in a bucket).
mime_type (string): File MIME type (e.g., application/pdf).

Content option: audio
type (literal('audio'), required)
url (string): URL pointing to the audio file.
base64 (string, required): Base64-encoded audio data.
(string): Reference ID to an externally stored audio file (e.g., in a provider’s file system or in a bucket).
mime_type (string): Audio MIME type (e.g., audio/mpeg, audio/wav).

Content option: video
type (literal('video'), required)
url (string): URL pointing to the video file.
base64 (string, required): Base64-encoded video data.
(string): Reference ID to an externally stored video file (e.g., in a provider’s file system or in a bucket).
mime_type (string): Video MIME type (e.g., video/mp4, video/webm).

Content option: tool_call
type (literal('tool_call'), required)
name (string)
args (object, required): Arguments to pass to the tool.
(string): Unique identifier for this tool call.

Content option: server_tool_call
type (literal('server_tool_call'), required)
(string, required): Unique identifier for this tool call.
name (string, required): The name of the tool to be called.
args (object, required): Arguments to pass to the tool.

Content option: server_tool_result
type (literal('server_tool_result'), required)
tool_call_id (string, required): Identifier of the corresponding server tool call.
(string): Unique identifier for this tool call.
status (string, required): Execution status of the server-side tool. One of: success | error.
output: Output of the executed tool.

tool_call_id (string): Must match the id of a prior assistant message’s tool_calls[i] entry. Only valid when role is tool.
usage_metadata (object): Use this field to send token counts and/or costs with your model’s output. See this guide for more details.

Examples

inputs = {
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Hi, can you tell me the capital of France?"
        }
      ]
    }
  ]
}

outputs = {
  "messages": [
    {
      "role": "assistant",
      "content": [
        {
          "type": "text",
          "text": "The capital of France is Paris."
        },
        {
          "type": "reasoning",
          "text": "The user is asking about..."
        }
      ]
    }
  ]
}
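When most of your messages are plain text, a small helper can build this structure for you. The following is a minimal sketch rather than part of the LangSmith SDK: the helper names text_message and to_langsmith_messages are hypothetical, and it only covers messages with a single text content block.

```python
from typing import Any


def text_message(role: str, text: str) -> dict[str, Any]:
    """Build one message whose content is a single text block."""
    return {"role": role, "content": [{"type": "text", "text": text}]}


def to_langsmith_messages(turns: list[tuple[str, str]]) -> dict[str, Any]:
    """Wrap (role, text) pairs in the {"messages": [...]} envelope."""
    return {"messages": [text_message(role, text) for role, text in turns]}


# Reproduces the inputs example above from a (role, text) pair.
example_inputs = to_langsmith_messages(
    [("user", "Hi, can you tell me the capital of France?")]
)
```

Messages with several content blocks (for example text plus reasoning, as in the outputs example) would still need to be written out by hand or with a richer helper.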

Converting custom I/O formats into LangSmith compatible formats

If you’re using a custom input or output format, you can convert it to a LangSmith compatible format with the process_inputs/processInputs and process_outputs/processOutputs arguments of the @traceable decorator (Python) or the traceable function (TS). Each argument accepts a function that transforms the inputs or outputs of a specific trace before they are logged to LangSmith: the function receives the trace’s inputs or outputs and returns a new dictionary with the processed data. Here’s a boilerplate example of how to use process_inputs and process_outputs to convert a custom I/O format into a LangSmith compatible format:

from typing import Any

from langsmith import traceable
from pydantic import BaseModel


class OriginalInputs(BaseModel):
    """Your app's custom request shape."""

class OriginalOutputs(BaseModel):
    """Your app's custom response shape."""

class LangSmithInputs(BaseModel):
    """The input format LangSmith expects."""

class LangSmithOutputs(BaseModel):
    """The output format LangSmith expects."""

def process_inputs(inputs: dict) -> dict:
    """Dict -> OriginalInputs -> LangSmithInputs -> dict"""

def process_outputs(output: Any) -> dict:
    """OriginalOutputs -> LangSmithOutputs -> dict"""


@traceable(run_type="llm", process_inputs=process_inputs, process_outputs=process_outputs)
def chat_model(inputs: dict) -> dict:
    """
    Your app's model call. Keeps your custom I/O shape.
    The decorator calls process_* to log a LangSmith-compatible format.
    """
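To make the skeleton concrete, here is one way the two conversion functions might be filled in. This is a sketch under stated assumptions, not the only valid mapping: the custom request shape ({"system", "prompt"}) and response shape ({"answer"}) are hypothetical, and the request fields are assumed to arrive as top-level keys of the dictionary passed to process_inputs (as they would for a traced function whose arguments are named prompt and system). The functions are plain Python, so you can check them without sending anything to LangSmith.

```python
from typing import Any


def _text_block(text: str) -> dict[str, str]:
    """Build a single text content block."""
    return {"type": "text", "text": text}


def process_inputs(inputs: dict) -> dict:
    """Map the hypothetical {"system", "prompt"} request onto a messages list."""
    messages = []
    if inputs.get("system"):
        messages.append({"role": "system", "content": [_text_block(inputs["system"])]})
    messages.append({"role": "user", "content": [_text_block(inputs["prompt"])]})
    return {"messages": messages}


def process_outputs(output: Any) -> dict:
    """Map the hypothetical {"answer"} response onto one assistant message."""
    return {
        "messages": [
            {"role": "assistant", "content": [_text_block(output["answer"])]}
        ]
    }
```

These functions can then be passed as process_inputs and process_outputs to @traceable, as in the skeleton above. Your traced function keeps returning its own shape; only the logged data is converted.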

Identifying a custom model in traces

When using a custom model, it is recommended to also provide the following metadata fields to identify the model when viewing traces and when filtering.

ls_provider: The provider of the model, e.g. “openai”, “anthropic”, etc.
ls_model_name: The name of the model, e.g. “gpt-4o-mini”, “claude-3-opus-20240229”, etc.

from langsmith import traceable

inputs = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "I'd like to book a table for two."},
]
output = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "Sure, what time would you like to book the table for?"
            }
        }
    ]
}

@traceable(
    run_type="llm",
    metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"}
)
def chat_model(messages: list):
    return output

chat_model(inputs)

This code will log the following trace:

[Image: LangSmith UI showing an LLM call trace called ChatOpenAI with a system and human input followed by an AI Output.]

If you implement a custom streaming chat_model, you can “reduce” the outputs into the same format as the non-streaming version. This is currently only supported in Python.

def _reduce_chunks(chunks: list):
    all_text = "".join([chunk["choices"][0]["message"]["content"] for chunk in chunks])
    return {"choices": [{"message": {"content": all_text, "role": "assistant"}}]}

@traceable(
    run_type="llm",
    reduce_fn=_reduce_chunks,
    metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"}
)
def my_streaming_chat_model(messages: list):
    for chunk in ["Hello, " + messages[1]["content"]]:
        yield {
            "choices": [
                {
                    "message": {
                        "content": chunk,
                        "role": "assistant",
                    }
                }
            ]
        }

list(
    my_streaming_chat_model(
        [
            {"role": "system", "content": "You are a helpful assistant. Please greet the user."},
            {"role": "user", "content": "polly the parrot"},
        ],
    )
)

If ls_model_name is not present in extra.metadata, LangSmith may fall back to other fields to determine the model when estimating token counts. The fields are checked in the following order of precedence:

metadata.ls_model_name
inputs.model
inputs.model_name
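The precedence order above can be read as a simple first-match lookup. The sketch below only illustrates that reading and is not LangSmith's actual implementation: the function name resolve_model_name and the plain-dictionary run shape (with extra.metadata and inputs keys) are assumptions made for the example.

```python
from typing import Any, Optional


def resolve_model_name(run: dict[str, Any]) -> Optional[str]:
    """Return the first model name found, checking fields in precedence order."""
    metadata = run.get("extra", {}).get("metadata", {})
    inputs = run.get("inputs", {})
    candidates = (
        metadata.get("ls_model_name"),  # 1. metadata.ls_model_name
        inputs.get("model"),            # 2. inputs.model
        inputs.get("model_name"),       # 3. inputs.model_name
    )
    # Pick the first candidate that is present and non-empty.
    return next((name for name in candidates if name), None)
```

Setting ls_model_name explicitly in the metadata avoids relying on the fallbacks at all.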

To learn more about how to use the metadata fields, refer to the Add metadata and tags guide.

Provide token and cost information

By default, LangSmith uses tiktoken to count tokens, utilizing a best guess at the model’s tokenizer based on the ls_model_name provided. It also calculates costs automatically by using the model pricing table. To learn how LangSmith calculates token-based costs, see this guide. However, many models already include exact token counts as part of the response. If you have this information, you can override the default token calculation in LangSmith in one of two ways:

Extract usage within your traced function and set a usage_metadata field on the run’s metadata.
Return a usage_metadata field in your traced function outputs.

In both cases, the usage metadata you send should contain a subset of the following LangSmith-recognized fields:

You cannot set any fields other than the ones listed below. You do not need to include all fields.

from typing import TypedDict


class UsageMetadata(TypedDict, total=False):
    input_tokens: int
    """The number of tokens used for the prompt."""
    output_tokens: int
    """The number of tokens generated as output."""
    total_tokens: int
    """The total number of tokens used."""
    input_token_details: dict[str, float]
    """The details of the input tokens."""
    output_token_details: dict[str, float]
    """The details of the output tokens."""
    input_cost: float
    """The cost of the input tokens."""
    output_cost: float
    """The cost of the output tokens."""
    total_cost: float
    """The total cost of the tokens."""
    input_cost_details: dict[str, float]
    """The cost details of the input tokens."""
    output_cost_details: dict[str, float]
    """The cost details of the output tokens."""

Note that the usage data can also include cost information, in case you do not want to rely on LangSmith’s token-based cost formula. This is useful for models with pricing that is not linear by token type.
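As an illustration of non-linear pricing, the sketch below computes costs for a model whose input price rises above a token threshold, then packages them using the usage metadata fields listed above. All prices, the threshold, and the function names are hypothetical values chosen for the example, not real provider rates.

```python
def tiered_input_cost(
    tokens: int,
    threshold: int = 128_000,   # hypothetical tier boundary
    low_rate: float = 1.0e-6,   # hypothetical price per token up to the threshold
    high_rate: float = 2.0e-6,  # hypothetical price per token beyond it
) -> float:
    """Charge low_rate up to the threshold and high_rate for the remainder."""
    base = min(tokens, threshold) * low_rate
    overflow = max(tokens - threshold, 0) * high_rate
    return base + overflow


def build_usage_metadata(
    input_tokens: int,
    output_tokens: int,
    output_rate: float = 5.0e-6,  # hypothetical flat output price per token
) -> dict:
    """Build a usage_metadata dict with explicit token counts and costs."""
    input_cost = tiered_input_cost(input_tokens)
    output_cost = output_tokens * output_rate
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": input_tokens + output_tokens,
        "input_cost": input_cost,
        "output_cost": output_cost,
        "total_cost": input_cost + output_cost,
    }


usage = build_usage_metadata(input_tokens=200_000, output_tokens=1_000)
```

The resulting dictionary uses only recognized field names, so it can be sent through either of the two approaches described in this section.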

Setting run metadata

You can modify the current run’s metadata with usage information within your traced function. The advantage of this approach is that you do not need to change your traced function’s runtime outputs. Here’s an example:

Requires langsmith>=0.3.43 (Python) and langsmith>=0.3.30 (JS/TS).

from langsmith import traceable, get_current_run_tree

inputs = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "I'd like to book a table for two."},
]

@traceable(
    run_type="llm",
    metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"}
)
def chat_model(messages: list):
    llm_output = {
        "choices": [
            {
                "message": {
                    "role": "assistant",
                    "content": "Sure, what time would you like to book the table for?"
                }
            }
        ],
        "usage_metadata": {
            "input_tokens": 27,
            "output_tokens": 13,
            "total_tokens": 40,
            "input_token_details": {"cache_read": 10},
            # If you wanted to specify costs:
            # "input_cost": 1.1e-6,
            # "input_cost_details": {"cache_read": 2.3e-7},
            # "output_cost": 5.0e-6,
        },
    }
    run = get_current_run_tree()
    run.set(usage_metadata=llm_output["usage_metadata"])
    return llm_output["choices"][0]["message"]

chat_model(inputs)

Setting run outputs

You can add a usage_metadata key to the function’s response to set manual token counts and costs.

from langsmith import traceable

inputs = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "I'd like to book a table for two."},
]
output = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "Sure, what time would you like to book the table for?"
            }
        }
    ],
    "usage_metadata": {
        "input_tokens": 27,
        "output_tokens": 13,
        "total_tokens": 40,
        "input_token_details": {"cache_read": 10},
        # If you wanted to specify costs:
        # "input_cost": 1.1e-6,
        # "input_cost_details": {"cache_read": 2.3e-7},
        # "output_cost": 5.0e-6,
    },
}

@traceable(
    run_type="llm",
    metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"}
)
def chat_model(messages: list):
    return output

chat_model(inputs)

Time-to-first-token

If you are using traceable or one of our SDK wrappers, LangSmith will automatically populate time-to-first-token for streaming LLM runs. However, if you are using the RunTree API directly, you will need to add a new_token event to the run tree in order to properly populate time-to-first-token. Here’s an example:

from langsmith.run_trees import RunTree

run_tree = RunTree(
    name="CustomChatModel",
    run_type="llm",
    inputs={ ... }
)
run_tree.post()
llm_stream = ...
first_token = None
for token in llm_stream:
    if first_token is None:
        first_token = token
        # Record the first token's arrival so LangSmith can populate time-to-first-token.
        run_tree.add_event({"name": "new_token"})
run_tree.end(outputs={ ... })
run_tree.patch()
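If you stream from several places, the "fire once on the first token" logic can be factored into a small generator wrapper. This is a minimal sketch in plain Python: mark_first_token is a hypothetical helper, not part of the LangSmith SDK, and the list of events here stands in for a run tree.

```python
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def mark_first_token(
    stream: Iterable[T], on_first_token: Callable[[], None]
) -> Iterator[T]:
    """Yield items from stream, calling on_first_token once before the first item."""
    seen_first = False
    for token in stream:
        if not seen_first:
            seen_first = True
            on_first_token()
        yield token


# Stand-in for a run tree: collect the events that would be added.
events: list[dict] = []
tokens = list(
    mark_first_token(
        iter(["Hel", "lo"]),
        lambda: events.append({"name": "new_token"}),
    )
)
```

With a real run tree, the callback could be lambda: run_tree.add_event({"name": "new_token"}), matching the example above.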
