Run (span) data format
LangSmith stores and processes trace data in a simple format that is easy to export and import.
Many of these fields are optional or not important to know about, but they are included for completeness. The bolded fields are the most important ones to know about.
Field Name | Type | Description |
---|---|---|
id | UUID | Unique identifier for the span. |
name | string | The name associated with the run. |
inputs | object | A map or set of inputs provided to the run. |
run_type | string | Type of run, e.g., "llm", "chain", "tool". |
start_time | datetime | Start time of the run. |
end_time | datetime | End time of the run. |
extra | object | Any extra information about the run. |
error | string | Error message if the run encountered an error. |
outputs | object | A map or set of outputs generated by the run. |
events | array of objects | A list of event objects associated with the run. This is relevant for runs executed with streaming. |
tags | array of strings | Tags or labels associated with the run. |
trace_id | UUID | Unique identifier for the trace the run is a part of. This is also the id field of the root run of the trace. |
dotted_order | string | Hierarchical ordering string. Format: <run_start_time>Z<run_uuid>.<child_run_start_time>Z<child_run_uuid>... |
status | string | Current status of the run execution, e.g., "error", "pending", "success". |
child_run_ids | array of UUIDs | List of IDs for all child runs. |
direct_child_run_ids | array of UUIDs | List of IDs for direct children of this run. |
parent_run_ids | array of UUIDs | List of IDs for all parent runs. |
feedback_stats | object | Aggregations of feedback statistics for this run. |
reference_example_id | UUID | ID of a reference example associated with the run. This is usually only present for evaluation runs. |
total_tokens | integer | Total number of tokens processed by the run. |
prompt_tokens | integer | Number of tokens in the prompt of the run. |
completion_tokens | integer | Number of tokens in the completion of the run. |
total_cost | string | Total cost associated with processing the run. |
prompt_cost | string | Cost associated with the prompt part of the run. |
completion_cost | string | Cost associated with the completion of the run. |
first_token_time | datetime | Time when the first token was generated. |
session_id | string | Session identifier for the run. |
in_dataset | boolean | Indicates whether the run is included in a dataset. |
parent_run_id | UUID | Unique identifier of the parent run. |
execution_order (deprecated) | integer | The order in which this run was executed within the trace. |
serialized | object | Serialized state of the object executing the run if applicable. |
manifest_id (deprecated) | UUID | Identifier for a manifest associated with the span. |
manifest_s3_id | UUID | S3 identifier for the manifest. |
inputs_s3_urls | object | S3 URLs for the inputs. |
outputs_s3_urls | object | S3 URLs for the outputs. |
price_model_id | UUID | Identifier for the pricing model applied to the run. |
app_path | string | Application (UI) path for this run. |
last_queued_at | datetime | Last time the span was queued. |
share_token | string | Token for sharing access to the run's data. |
Here is an example of a JSON representation of a run in the above format:
{
  "id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
  "name": "string",
  "inputs": {},
  "run_type": "llm",
  "start_time": "2024-04-29T00:49:12.090000",
  "end_time": "2024-04-29T00:49:12.459000",
  "extra": {},
  "error": "string",
  "execution_order": 1,
  "serialized": {},
  "outputs": {},
  "parent_run_id": "f8faf8c1-9778-49a4-9004-628cdb0047e5",
  "manifest_id": "82825e8e-31fc-47d5-83ce-cd926068341e",
  "manifest_s3_id": "0454f93b-7eb6-4b9d-a203-f1261e686840",
  "events": [{}],
  "tags": ["foo"],
  "inputs_s3_urls": {},
  "outputs_s3_urls": {},
  "trace_id": "df570c03-5a03-4cea-8df0-c162d05127ac",
  "dotted_order": "20240429T004912090000Z497f6eca-6276-4993-bfeb-53cbbbba6f08",
  "status": "string",
  "child_run_ids": ["497f6eca-6276-4993-bfeb-53cbbbba6f08"],
  "direct_child_run_ids": ["497f6eca-6276-4993-bfeb-53cbbbba6f08"],
  "parent_run_ids": ["f8faf8c1-9778-49a4-9004-628cdb0047e5"],
  "feedback_stats": {
    "correctness": {
      "n": 1,
      "avg": 1.0
    }
  },
  "reference_example_id": "9fb06aaa-105f-4c87-845f-47d62ffd7ee6",
  "total_tokens": 0,
  "prompt_tokens": 0,
  "completion_tokens": 0,
  "total_cost": "string",
  "prompt_cost": "string",
  "completion_cost": "string",
  "price_model_id": "0b5d9575-bec3-4256-b43a-05893b8b8440",
  "first_token_time": null,
  "session_id": "1ffd059c-17ea-40a8-8aef-70fd0307db82",
  "app_path": "string",
  "last_queued_at": null,
  "in_dataset": true,
  "share_token": "d0430ac3-04a1-4e32-a7ea-57776ad22c1c"
}
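As a rough illustration of working with an exported run, the sketch below loads a trimmed copy of the example and derives two values from it: the run's latency, and the dotted_order string rebuilt from start_time and id. The helper names run_latency_seconds and root_dotted_order are hypothetical and not part of any LangSmith API, and the compact timestamp layout is an assumption inferred from the example values above.

```python
import json
from datetime import datetime

# Trimmed copy of the example run above; only the fields used here are kept.
run_json = """
{
  "id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
  "run_type": "llm",
  "start_time": "2024-04-29T00:49:12.090000",
  "end_time": "2024-04-29T00:49:12.459000",
  "dotted_order": "20240429T004912090000Z497f6eca-6276-4993-bfeb-53cbbbba6f08"
}
"""


def run_latency_seconds(run: dict) -> float:
    """Return end_time minus start_time in seconds (hypothetical helper)."""
    start = datetime.fromisoformat(run["start_time"])
    end = datetime.fromisoformat(run["end_time"])
    return (end - start).total_seconds()


def root_dotted_order(run: dict) -> str:
    """Rebuild the dotted_order of a single-segment (root) run.

    Assumes the timestamp layout seen in the example:
    YYYYMMDDTHHMMSS followed by six microsecond digits.
    """
    start = datetime.fromisoformat(run["start_time"])
    return start.strftime("%Y%m%dT%H%M%S%f") + "Z" + run["id"]


run = json.loads(run_json)
latency = run_latency_seconds(run)
print(f"latency: {latency:.3f}s")
print("dotted_order matches:", root_dotted_order(run) == run["dotted_order"])
```

Timestamps in the example carry no timezone offset, so the sketch treats them as naive datetimes and only ever subtracts one from another.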
What is dotted_order?
A run's dotted order is a sortable key that fully specifies its location within the tracing hierarchy.
Take the following example:
import langsmith as ls

# Three nested traced functions: parent -> child -> grandchild.
@ls.traceable
def grandchild():
    print("grandchild")

@ls.traceable
def child():
    grandchild()

@ls.traceable
def parent():
    child()
If you print out the IDs at each stage, you may get the following:
parent
run_id=0e01bf50-474d-4536-810f-67d3ee7ea3e7
trace_id=0e01bf50-474d-4536-810f-67d3ee7ea3e7
parent_run_id=null
dotted_order=20240919T171648521691Z0e01bf50-474d-4536-810f-67d3ee7ea3e7
child
run_id=a8024e23-5b82-47fd-970e-f6a5ba3f5097
trace_id=0e01bf50-474d-4536-810f-67d3ee7ea3e7
parent_run_id=0e01bf50-474d-4536-810f-67d3ee7ea3e7
dotted_order=20240919T171648521691Z0e01bf50-474d-4536-810f-67d3ee7ea3e7.20240919T171648523407Za8024e23-5b82-47fd-970e-f6a5ba3f5097
grandchild
run_id=0ec6b845-18b9-4aa1-8f1b-6ba3f9fdefd6
trace_id=0e01bf50-474d-4536-810f-67d3ee7ea3e7
parent_run_id=a8024e23-5b82-47fd-970e-f6a5ba3f5097
dotted_order=20240919T171648521691Z0e01bf50-474d-4536-810f-67d3ee7ea3e7.20240919T171648523407Za8024e23-5b82-47fd-970e-f6a5ba3f5097.20240919T171648523563Z0ec6b845-18b9-4aa1-8f1b-6ba3f9fdefd6
Note a few invariants:
- The "id" is equal to the last 36 characters of the dotted order (the suffix after the final "Z"). See 0ec6b845-18b9-4aa1-8f1b-6ba3f9fdefd6 in the grandchild, for example.
- The "trace_id" is equal to the first UUID in the dotted order (i.e., dotted_order.split('.')[0].split('Z')[1]).
- If "parent_run_id" exists, it is the penultimate UUID in the dotted order. See a8024e23-5b82-47fd-970e-f6a5ba3f5097 in the grandchild, for an example.
- If you split the dotted_order on the dots, each segment is formatted as <run_start_time>Z<run_id>.
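The invariants above can be checked mechanically. The following is a minimal sketch that uses the dotted orders from the example; parse_dotted_order and ids_from_dotted_order are hypothetical helper names, not LangSmith SDK functions. It also shows why the key is sortable: each run's dotted order starts with its parent's, so a plain string sort places ancestors before their descendants.

```python
from typing import Dict, List, Optional, Tuple

# Dotted orders copied from the parent/child/grandchild example above.
PARENT = "20240919T171648521691Z0e01bf50-474d-4536-810f-67d3ee7ea3e7"
CHILD = PARENT + ".20240919T171648523407Za8024e23-5b82-47fd-970e-f6a5ba3f5097"
GRANDCHILD = CHILD + ".20240919T171648523563Z0ec6b845-18b9-4aa1-8f1b-6ba3f9fdefd6"


def parse_dotted_order(dotted_order: str) -> List[Tuple[str, str]]:
    """Split a dotted order into (start_time, run_id) pairs, root first."""
    pairs = []
    for segment in dotted_order.split("."):
        # UUIDs contain only hex digits and hyphens, so the single "Z"
        # in each segment is the separator between timestamp and run ID.
        start_time, run_id = segment.split("Z")
        pairs.append((start_time, run_id))
    return pairs


def ids_from_dotted_order(dotted_order: str) -> Dict[str, Optional[str]]:
    """Recover id, trace_id, and parent_run_id from a dotted order."""
    run_ids = [run_id for _, run_id in parse_dotted_order(dotted_order)]
    return {
        "id": run_ids[-1],        # last UUID
        "trace_id": run_ids[0],   # first UUID (the root run)
        "parent_run_id": run_ids[-2] if len(run_ids) > 1 else None,
    }


ids = ids_from_dotted_order(GRANDCHILD)
print(ids)

# A lexicographic sort restores the hierarchy order.
ordered = sorted([GRANDCHILD, PARENT, CHILD])
print(ordered == [PARENT, CHILD, GRANDCHILD])
```

For a root run the dotted order has a single segment, so the sketch reports parent_run_id as None, matching parent_run_id=null in the parent's output above.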