Init commit

Robert Xu
2025-11-07 09:35:52 -05:00
commit 53a6754881
68 changed files with 11394 additions and 0 deletions
+7
@@ -0,0 +1,7 @@
# langsmith
LANGSMITH_TRACING=true
LANGCHAIN_ENDPOINT="https://api.smith.langchain.com" # Replace with your instance!
LANGSMITH_API_KEY="<langsmith_api_key>"
LANGSMITH_PROJECT="<project_name>"
OTEL_BSP_MAX_QUEUE_SIZE=10000 # default is 2048, increase if you are benchmarking a lot of data and see `Queue is full, likely spans will be dropped.` in the logs.
+53
@@ -0,0 +1,53 @@
evals/data/*
!evals/data/*-example.csv
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
# Virtual Environment
venv/
env/
ENV/
.env
# IDE
.idea/
.vscode/
*.swp
*.swo
.DS_Store
*.log
# Jupyter Notebook
.ipynb_checkpoints
# Coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
coverage.xml
*.cover
# Node
node_modules/
+1
@@ -0,0 +1 @@
3.11
+160
@@ -0,0 +1,160 @@
# 🦜🛠 LangSmith SDK Benchmarks
## Prerequisites
### 0. Install Python 3.11 and Poetry
If you use Homebrew, you can install poetry with:
```commandline
brew install poetry
```
### 1. Install Dependencies
```commandline
poetry install
```
### 2. Set Environment Variables
After installing dependencies, copy the `.env.example` file contents into `.env` and set the required values:
```commandline
cp .env.example .env
```
<br>
## Running Benchmarks
This package provides an interactive script to run benchmarks end-to-end. Simply run:
```commandline
python run_benchmarks.py
```
This will present you with a menu to choose between:
1. **Tracing Benchmarks** - Benchmarks trace ingestion performance
2. **Evaluation Benchmarks** - Benchmarks evaluation performance
3. **Exit**
You can customize defaults for each benchmark type, or press Enter to use the defaults.
### Non-Interactive Mode
To run benchmarks without prompts (uses the defaults, which run the evaluation benchmarks):
```commandline
python run_benchmarks.py --non-interactive
```
<br>
## Tracing Benchmarks
### Overview
Tracing benchmarks measure the performance of ingesting traces into LangSmith. The script automatically:
1. Prepares trace files (replaces UUIDs and updates dates)
2. Runs the flat tracing benchmark (each run gets its own trace)
3. Runs the nested tracing benchmark (runs are nested under their parents)
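
The UUID replacement in step 1 can be pictured with a minimal sketch. This is not the repository's preparation code; it assumes each JSONL line carries run IDs as plain UUID strings, and the function name `remap_uuids` and the field names in the usage note are hypothetical. It covers only the UUID step, not the date update.

```python
import re
import uuid
from typing import Dict, Iterable, List

# Matches the canonical 8-4-4-4-12 hexadecimal UUID text form.
UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def remap_uuids(lines: Iterable[str]) -> List[str]:
    """Replace every UUID with a fresh one, reusing the replacement for repeats.

    Reusing the same replacement keeps parent/child references intact,
    which matters for the nested benchmark.
    """
    mapping: Dict[str, str] = {}

    def fresh(match: re.Match) -> str:
        old = match.group(0).lower()
        if old not in mapping:
            mapping[old] = str(uuid.uuid4())
        return mapping[old]

    return [UUID_RE.sub(fresh, line) for line in lines]
```

For example, if one line has `"id": X` and another has `"parent_run_id": X` (hypothetical field names), both occurrences of `X` receive the same new UUID, so repeated uploads do not collide on IDs while the hierarchy is preserved.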
### Requirements
- Trace data files in JSONL format (`processed_run_ops_*.jsonl`) in the specified data directory
- Default data directory: `tracing/data`
### Running Tracing Benchmarks
**Via interactive script:**
```commandline
python run_benchmarks.py
# Select option 1 (Tracing Benchmarks)
# Enter data directory (default: data)
```
**Directly:**
```commandline
cd tracing
poetry run python benchmark_flat.py [data_dir]
poetry run python benchmark_nested.py [data_dir]
```
### Results
Results are printed to the terminal and saved to:
- `tracing/benchmark_results_flat.txt`
- `tracing/benchmark_results_nested.txt`
<br>
## Evaluation Benchmarks
### Overview
Evaluation benchmarks measure the performance of running evaluations on LangSmith datasets. The script automatically:
1. Benchmarks data upload performance (uploads CSV data to LangSmith)
2. Runs evaluation benchmarks on the uploaded dataset
**Note:** If the dataset already exists in LangSmith, the upload step will be skipped and the script will proceed directly to running evaluations.
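
The skip-if-exists behaviour in the note can be sketched as follows. This is an illustrative outline, not the script's actual code: `upload_if_missing`, `dataset_exists`, and `upload` are hypothetical names, and the two callables stand in for whatever lookup and upload the benchmark performs against LangSmith.

```python
import time
from typing import Callable, Tuple


def upload_if_missing(
    dataset_name: str,
    dataset_exists: Callable[[str], bool],
    upload: Callable[[str], None],
) -> Tuple[bool, float]:
    """Upload only when the dataset is absent.

    Returns (uploaded, elapsed_seconds); elapsed is 0.0 when the upload is skipped.
    """
    if dataset_exists(dataset_name):
        # Dataset already present: skip straight to the evaluation step.
        return False, 0.0
    start = time.perf_counter()
    upload(dataset_name)
    return True, time.perf_counter() - start
```

Passing the lookup and upload as callables keeps the sketch independent of any particular client API.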
### Requirements
- CSV data file in `evals/data/` directory
- Dataset configuration in `evals/config.json`
- Default dataset: `10k-long-emails-example`
- Default data directory: `evals/data`
### 1. Prepare Your Data
Place your CSV file in the `evals/data/` directory. The CSV file must be named `{dataset_name}.csv` where `{dataset_name}` matches the name you'll use in `config.json`.
### 2. Configure Dataset Mapping
You must specify in the `evals/config.json` file which CSV columns should be mapped to dataset inputs, and which columns should map to dataset outputs.
**Configuration Details:**
* **`inputs`**: A list of CSV column names that will be extracted from each row and set as the input data for each example in the LangSmith dataset. These columns will be converted to dictionaries (one per row) and passed to `client.create_examples(inputs=...)`.
* **`outputs`**: A list of CSV column names that will be extracted from each row and set as the expected outputs (ground truth) for each example in the LangSmith dataset. These columns will be converted to dictionaries (one per row) and passed to `client.create_examples(outputs=...)`. If empty (`[]`), no outputs will be uploaded.
**Example `config.json` structure:**
```json
{
"_instructions": "This configuration file maps CSV datasets to LangSmith dataset structure...",
"data_files": {
"your-dataset-name": {
"inputs": ["column1", "column2"],
"outputs": ["expected_output"]
}
}
}
```
The column names in `inputs` and `outputs` must match the column headers in your CSV file.
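
To make the mapping concrete, here is a minimal sketch of how the configured columns could be turned into one dictionary per row. It is an assumption about the shape of the data, not the repository's upload code; `split_rows` is a hypothetical helper, and the resulting lists are the kind of values described above for `client.create_examples(inputs=..., outputs=...)`.

```python
import csv
import io
from typing import Dict, List, Tuple


def split_rows(
    csv_text: str, input_cols: List[str], output_cols: List[str]
) -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Build per-row input and output dictionaries from the configured columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [col for col in input_cols + output_cols if col not in header]
    if missing:
        raise ValueError(f"Columns not found in CSV header: {missing}")

    inputs: List[Dict[str, str]] = []
    outputs: List[Dict[str, str]] = []
    for row in reader:
        inputs.append({col: row[col] for col in input_cols})
        if output_cols:
            # With an empty `outputs` list in the config, nothing is collected.
            outputs.append({col: row[col] for col in output_cols})
    return inputs, outputs
```

Checking the header up front surfaces a mismatch between `config.json` and the CSV before any upload starts.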
### 3. Run Evaluation Benchmarks
**Via interactive script:**
```commandline
python run_benchmarks.py
# Select option 2 (Evaluation Benchmarks)
# Enter dataset name (default: 10k-long-emails-example)
# Enter data directory (default: data)
```
**Directly:**
```commandline
cd evals
# First, benchmark data upload
poetry run python benchmark_upload.py [data_dir] [dataset_name]
# Then, run evaluation benchmarks
poetry run python benchmark_evals.py [dataset_name]
```
### Results
Results are printed to the terminal and saved to:
- `evals/benchmark_results_upload_data.txt` (upload benchmark results)
- `evals/benchmark_results_evals.txt` (evaluation benchmark results)
<br>
## Notes
- **Dataset Upload**: Data will be uploaded to LangSmith as part of the evaluation benchmarks workflow. If a dataset with the same name already exists in LangSmith, the upload step will be automatically skipped and the script will proceed directly to running evaluations.
- **Data Directory**: Both tracing and evaluation benchmarks allow you to specify custom data directories. Defaults are `tracing/data` for tracing and `evals/data` for evaluations.
- **Trace Data Preparation**: For tracing benchmarks, UUID replacement and date updates are automatically handled before running benchmarks. These steps run silently in the background.
+55
@@ -0,0 +1,55 @@
from typing import Tuple
import asyncio
import argparse

from eval_data import run_eval


def format_results(ls_results: Tuple[float, str, int]) -> str:
    """Format benchmark results."""
    ls_time, _, ls_examples = ls_results
    # Use the number of examples from the results
    num_examples = ls_examples
    # Guard against division by zero when no examples were evaluated
    avg_ls = ls_time / num_examples if num_examples else 0
    return f"""\
Langsmith Benchmark Results
===========================
Time Breakdown:
Total time {ls_time:7.3f}s
Performance:
Total Examples {num_examples}
Avg time/example {avg_ls:7.3f}s
"""


async def run_benchmark(dataset_name: str):
    print("Running Langsmith benchmark...")
    langsmith_results = await run_eval(dataset_name)
    table = format_results(langsmith_results)
    # Print to console
    print("\nBenchmark Results:\n")
    print(table)
    # Save results to a file
    with open("benchmark_results_evals.txt", "w") as f:
        f.write(table)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Benchmark evaluation performance on a LangSmith dataset",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Example:
  python benchmark_evals.py 10k-long-emails
""",
    )
    parser.add_argument(
        "dataset_name",
        type=str,
        help="Name of the LangSmith dataset to benchmark",
    )
    args = parser.parse_args()
    asyncio.run(run_benchmark(args.dataset_name))
+8
@@ -0,0 +1,8 @@
Langsmith Benchmark Results
===========================
Time Breakdown:
Total time 9.583s
Performance:
Total Examples 2
Avg time/example 4.791s
+9
@@ -0,0 +1,9 @@
Langsmith Benchmark Results
===========================
Time Breakdown:
Total time 2.970s
Performance:
Total Examples 2
Total Size 23.9 kB
Avg time/example 1.485s
+89
@@ -0,0 +1,89 @@
import os
import argparse
import json
import humanize
from pathlib import Path
from typing import Tuple

from upload_data import langsmith_init_data


def get_directory_size(data_dir: str, csv_file: str) -> int:
    """Return the size in bytes of the CSV file in a directory (0 if missing)."""
    csv_path = Path(data_dir) / f"{csv_file}.csv"
    if csv_path.exists():
        return csv_path.stat().st_size
    return 0


def format_results(ls_results: Tuple[float, str, int],
                   data_dir: str,
                   csv_file: str) -> str:
    """Format benchmark results."""
    ls_time, _, ls_examples = ls_results
    # Use the number of examples from the results
    num_examples = ls_examples
    total_size = get_directory_size(data_dir, csv_file)
    size_human = humanize.naturalsize(total_size)
    avg_ls = ls_time / num_examples if num_examples else 0
    return f"""\
Langsmith Benchmark Results
===========================
Time Breakdown:
Total time {ls_time:7.3f}s
Performance:
Total Examples {num_examples}
Total Size {size_human}
Avg time/example {avg_ls:7.3f}s
"""


def run_benchmark(data_dir: str, csv_file: str):
    config_path = os.path.join(os.path.dirname(__file__), "config.json")
    with open(config_path, 'r') as f:
        config = json.load(f)
    # Get dataset configuration
    if csv_file not in config["data_files"]:
        raise ValueError(f"Dataset '{csv_file}' not found in config.json")
    if "inputs" not in config["data_files"][csv_file] or "outputs" not in config["data_files"][csv_file]:
        raise ValueError(f"Dataset '{csv_file}' does not have inputs or outputs in config.json")
    dataset_config = config["data_files"][csv_file]
    print(f"Dataset config: {dataset_config}")
    print("Running Langsmith benchmark...")
    langsmith_results = langsmith_init_data(csv_file, dataset_config["inputs"], dataset_config["outputs"], data_dir)
    table = format_results(langsmith_results, data_dir, csv_file)
    # Print to console
    print("\nBenchmark Results:\n")
    print(table)
    # Save results to a file
    with open("benchmark_results_upload_data.txt", "w") as f:
        f.write(table)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Benchmark dataset upload performance to LangSmith",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Example:
  python benchmark_upload.py data 10k-long-emails
""",
    )
    parser.add_argument(
        "data_dir",
        type=str,
        help="Directory containing the CSV data file",
    )
    parser.add_argument(
        "csv_file",
        type=str,
        help="Name of the CSV file (without .csv extension, must exist in config.json)",
    )
    args = parser.parse_args()
    run_benchmark(args.data_dir, args.csv_file)
+13
@@ -0,0 +1,13 @@
{
"_instructions": "This configuration file maps CSV datasets to LangSmith dataset structure. For each dataset: 'inputs' are CSV column names that will be set as inputs in the LangSmith dataset, and 'outputs' are CSV column names that will be recorded as outputs (expected outputs) in the LangSmith dataset.",
"data_files": {
"10k-long-emails": {
"inputs": ["email"],
"outputs": []
},
"10k-long-emails-example": {
"inputs": ["email"],
"outputs": []
}
}
}
+111
@@ -0,0 +1,111 @@
email
"Subject: Urgent: Reclaim Your Ancestral Land Rights Limited Time Offer!
Body:
Dear Mr. and Mrs. Bartholomew Higgins,
It is with a profound sense of urgency and, frankly, a considerable degree of historical injustice that we reach out to you today. For generations, your family, the Higgins, have been the rightful custodians of the sprawling, fertile valley known as “Whispering Pines” a land tragically lost through a series of unfortunate, and frankly, bewildering, legal maneuvers perpetrated by the notoriously unscrupulous Silas Blackwood and his descendants. We understand that the passage of time can obscure details, and perhaps the memory of this historical wrong has faded, but we are here to reignite the flame of rightful ownership and guide you through the process of reclaiming what is, undeniably, yours.
Lets be brutally honest. The situation is complex. Its not merely a simple matter of finding a forgotten deed. Silas Blackwood, a man who made his fortune in dubious dealings primarily the lucrative trade in exotic bird feathers during the Victorian era (a period we find particularly fascinating, though irrelevant to your immediate concern) systematically dismantled your familys claim through a series of meticulously crafted legal arguments, leveraging loopholes in antiquated land laws, and employing a network of compliant (and, we suspect, bribed) local magistrates. He wasnt a malicious man, per se, though his ethical compass appeared to be permanently pointing south. He simply operated according to the prevailing norms of the time norms that prioritized wealth and power over heritage and justice.
The evidence, meticulously compiled over decades by our team of dedicated historical researchers (led by the brilliant and utterly relentless Professor Alistair Finch a man who once spent three months meticulously cataloging the button collections of a retired naval admiral a testament to his dedication), demonstrates a clear and irrefutable case of fraudulent land acquisition. We possess original correspondence, including letters from Silas Blackwood himself, outlining his strategy and detailing his efforts to discredit your lineage. We have unearthed previously unknown contracts, signed in code (a particularly clever tactic employed by Blackwood, who was a staunch advocate of secrecy and obfuscation), and even photographs depicting Blackwood himself surveying the valley, a smug expression on his face, clearly anticipating the eventual dismantling of your familys claim.
But the important thing to understand is that Blackwoods actions were not isolated. They were part of a broader pattern of exploitation that characterized much of the late 19th and early 20th centuries a period we often refer to as “The Age of Calculated Neglect,” a time when vast swathes of land were systematically stripped from indigenous populations and marginalized communities in the name of progress and economic development. The Higgins family, due to a series of unfortunate circumstances including a sudden and unexpected inheritance of a prize-winning marmoset named Bartholomew (a detail weve included for context and to illustrate the sheer randomness of historical events) and a regrettable investment in a steam-powered rhubarb harvester were particularly vulnerable to Blackwood's predatory tactics.
Our work isnt simply about nostalgia. Its about justice. Its about recognizing the enduring consequences of historical inequity and taking concrete steps to rectify it. The “Whispering Pines” valley isnt just a beautiful expanse of land; its a symbol of your familys heritage, a testament to your resilience, and a rightful claim that has been unjustly denied for over a century. We understand that the idea of reclaiming land, particularly at your age, might seem daunting. Perhaps youve heard stories of legal battles, of endless paperwork, of insurmountable obstacles. Were here to tell you that it doesnt have to be. Weve streamlined the entire process, leveraging cutting-edge technology and a team of legal experts who specialize in historical land disputes. Weve built a system thats not only efficient but also empathetic, recognizing the emotional weight attached to this claim.
Lets delve into the specifics of what we offer. Our comprehensive “Restoration Package,” as we call it, includes the following:
**Phase 1: Historical Verification & Authentication (Approximately 6-8 Weeks)**
This initial phase is the cornerstone of our entire operation. We dont simply rely on anecdotal evidence or hearsay. We operate on a foundation of irrefutable facts. This phase involves a meticulous examination of all relevant historical documents, including:
* **Deciphering the Blackwood Code:** Professor Finch and his team have developed a proprietary algorithm utilizing advanced linguistic analysis and a surprisingly sophisticated understanding of Victorian-era cryptography to translate the Blackwood code. This is no simple task. The code isnt just based on simple substitution ciphers. Blackwood was a master of creating layered codes, incorporating nautical terminology, botanical references, and even subtle variations in penmanship to create a truly impenetrable system. Weve spent countless hours cracking this code, and were now confident that we can unlock the full extent of Blackwoods deception. The translation process alone will take approximately 4-6 weeks. We anticipate discovering a significant number of previously unknown documents, including Blackwoods personal diaries (which, according to preliminary assessments, are filled with surprisingly candid observations about his business dealings and his profound disdain for the Higgins family) and a series of meticulously crafted maps depicting the valley at various points in time.
* **Genealogical Deep Dive:** We'll meticulously trace your family lineage back to its earliest documented roots, meticulously researching every branch of the family tree. We'll be contacting historical societies, genealogical databases, and even conducting on-site interviews with long-standing families in the region to piece together the full story of your ancestors. Were particularly interested in discovering any hidden connections to the indigenous tribes who originally inhabited the valley. Theres a strong possibility that your familys claim is intertwined with a treaty a treaty that was, of course, subsequently violated by Silas Blackwood. This process will involve DNA analysis, comparing your genetic profile to that of known family members and to samples taken from the soil in the valley. Were exploring the possibility of utilizing ancient DNA techniques, analyzing pollen samples to determine the original composition of the valleys ecosystem and comparing this data to the genetic makeup of the original inhabitants.
* **Title Examination Review:** Our team of legal experts will conduct a thorough examination of all existing land titles, identifying any discrepancies or inconsistencies. We'll be working with the local county recorders office, challenging any claims that are not supported by evidence. Well be employing sophisticated legal research tools, scouring historical records for any indications of a wrongful transfer of ownership. Were looking for evidence of fraud, duress, or undue influence. Were prepared to pursue legal action against any entities that may have benefited from the wrongful transfer of ownership.
* **Environmental Impact Assessment:** We will conduct a comprehensive environmental impact assessment of the “Whispering Pines” valley, documenting any changes that have occurred since Blackwoods acquisition. This will include assessing the impact of agricultural practices, logging activities, and residential development. We're particularly concerned about the impact of a recent development project a luxury golf course which we believe has significantly degraded the valley's natural beauty and disrupted the ecosystem.
**Phase 2: Legal Strategy & Litigation (Approximately 8-12 Weeks)**
Once weve completed the historical verification and authentication phase, well move into the legal strategy and litigation phase. This is where we formally assert your claim and begin the process of seeking legal redress. This phase will involve:
* **Formal Legal Complaint Filing:** We will file a formal legal complaint with the county court, asserting your claim to the “Whispering Pines” valley. The complaint will be supported by all of the evidence weve gathered during the historical verification and authentication phase.
* **Discovery Process:** We will engage in the discovery process, formally requesting documents and information from the defendants primarily the current owner of the property, a shell corporation controlled by the Blackwood familys descendants (a surprisingly persistent lineage, we might add) and the developer of the luxury golf course. This will involve serving subpoenas, taking depositions, and conducting interrogatories.
* **Negotiation & Mediation:** We will actively pursue negotiation and mediation with the defendants, seeking a fair and equitable resolution to the dispute. We believe that a negotiated settlement is the most efficient and cost-effective way to resolve the case. However, we are fully prepared to litigate the case in court if necessary.
* **Expert Witness Testimony:** We will retain expert witnesses historians, genealogists, environmental scientists, and legal experts to provide testimony in support of your claim. These experts will be called upon to testify about the historical context of the case, the scientific evidence, and the legal principles involved.
**Phase 3: Property Restoration & Stewardship (Ongoing)**
This phase is about more than just reclaiming legal ownership. It's about ensuring the long-term stewardship of the “Whispering Pines” valley. This phase will involve:
* **Valley Restoration Project:** We will implement a comprehensive valley restoration project, aimed at reversing the damage caused by agricultural practices, logging activities, and residential development. This will include reforestation efforts, the restoration of native plant species, and the implementation of sustainable land management practices.
* **Community Engagement:** We will actively engage with the local community, seeking their input and support for the valley restoration project. We believe that the valley should be a valuable asset for the entire community.
* **Establishment of a Conservation Trust:** We will establish a conservation trust to ensure the long-term protection of the “Whispering Pines” valley. The trust will be governed by a board of directors, comprised of local residents, environmental experts, and legal professionals.
**Investment Opportunity:**
We are currently seeking investors to help fund this ambitious project. Your investment will not only help us reclaim your rightful ownership of the “Whispering Pines” valley, but it will also contribute to the preservation of this valuable natural resource for generations to come. We are offering a variety of investment options, ranging from equity financing to philanthropic donations.
**Disclaimer:** *This is a hypothetical scenario and does not constitute legal advice. Actual outcomes may vary depending on the specific circumstances of the case.*
**Contact Us:**
[Contact Information]
We understand that this is a complex and potentially overwhelming undertaking. However, we are confident that with your support, we can successfully reclaim the “Whispering Pines” valley and ensure its long-term protection. We look forward to hearing from you. iteration 23"
"Subject: Urgent: Reclaim Your Ancestral Land Rights Limited Time Offer!
Body:
Dear Mr. and Mrs. Bartholomew Higgins,
It is with a profound sense of urgency and, frankly, a considerable degree of historical injustice that we reach out to you today. For generations, your family, the Higgins, have been the rightful custodians of the sprawling, fertile valley known as “Whispering Pines” a land tragically lost through a series of unfortunate, and frankly, bewildering, legal maneuvers perpetrated by the notoriously unscrupulous Silas Blackwood and his descendants. We understand that the passage of time can obscure details, and perhaps the memory of this historical wrong has faded, but we are here to reignite the flame of rightful ownership and guide you through the process of reclaiming what is, undeniably, yours.
Lets be brutally honest. The situation is complex. Its not merely a simple matter of finding a forgotten deed. Silas Blackwood, a man who made his fortune in dubious dealings primarily the lucrative trade in exotic bird feathers during the Victorian era (a period we find particularly fascinating, though irrelevant to your immediate concern) systematically dismantled your familys claim through a series of meticulously crafted legal arguments, leveraging loopholes in antiquated land laws, and employing a network of compliant (and, we suspect, bribed) local magistrates. He wasnt a malicious man, per se, though his ethical compass appeared to be permanently pointing south. He simply operated according to the prevailing norms of the time norms that prioritized wealth and power over heritage and justice.
The evidence, meticulously compiled over decades by our team of dedicated historical researchers (led by the brilliant and utterly relentless Professor Alistair Finch a man who once spent three months meticulously cataloging the button collections of a retired naval admiral a testament to his dedication), demonstrates a clear and irrefutable case of fraudulent land acquisition. We possess original correspondence, including letters from Silas Blackwood himself, outlining his strategy and detailing his efforts to discredit your lineage. We have unearthed previously unknown contracts, signed in code (a particularly clever tactic employed by Blackwood, who was a staunch advocate of secrecy and obfuscation), and even photographs depicting Blackwood himself surveying the valley, a smug expression on his face, clearly anticipating the eventual dismantling of your familys claim.
But the important thing to understand is that Blackwoods actions were not isolated. They were part of a broader pattern of exploitation that characterized much of the late 19th and early 20th centuries a period we often refer to as “The Age of Calculated Neglect,” a time when vast swathes of land were systematically stripped from indigenous populations and marginalized communities in the name of progress and economic development. The Higgins family, due to a series of unfortunate circumstances including a sudden and unexpected inheritance of a prize-winning marmoset named Bartholomew (a detail weve included for context and to illustrate the sheer randomness of historical events) and a regrettable investment in a steam-powered rhubarb harvester were particularly vulnerable to Blackwood's predatory tactics.
Our work isnt simply about nostalgia. Its about justice. Its about recognizing the enduring consequences of historical inequity and taking concrete steps to rectify it. The “Whispering Pines” valley isnt just a beautiful expanse of land; its a symbol of your familys heritage, a testament to your resilience, and a rightful claim that has been unjustly denied for over a century. We understand that the idea of reclaiming land, particularly at your age, might seem daunting. Perhaps youve heard stories of legal battles, of endless paperwork, of insurmountable obstacles. Were here to tell you that it doesnt have to be. Weve streamlined the entire process, leveraging cutting-edge technology and a team of legal experts who specialize in historical land disputes. Weve built a system thats not only efficient but also empathetic, recognizing the emotional weight attached to this claim.
Lets delve into the specifics of what we offer. Our comprehensive “Restoration Package,” as we call it, includes the following:
**Phase 1: Historical Verification & Authentication (Approximately 6-8 Weeks)**
This initial phase is the cornerstone of our entire operation. We dont simply rely on anecdotal evidence or hearsay. We operate on a foundation of irrefutable facts. This phase involves a meticulous examination of all relevant historical documents, including:
* **Deciphering the Blackwood Code:** Professor Finch and his team have developed a proprietary algorithm utilizing advanced linguistic analysis and a surprisingly sophisticated understanding of Victorian-era cryptography to translate the Blackwood code. This is no simple task. The code isnt just based on simple substitution ciphers. Blackwood was a master of creating layered codes, incorporating nautical terminology, botanical references, and even subtle variations in penmanship to create a truly impenetrable system. Weve spent countless hours cracking this code, and were now confident that we can unlock the full extent of Blackwoods deception. The translation process alone will take approximately 4-6 weeks. We anticipate discovering a significant number of previously unknown documents, including Blackwoods personal diaries (which, according to preliminary assessments, are filled with surprisingly candid observations about his business dealings and his profound disdain for the Higgins family) and a series of meticulously crafted maps depicting the valley at various points in time.
* **Genealogical Deep Dive:** We'll meticulously trace your family lineage back to its earliest documented roots, meticulously researching every branch of the family tree. We'll be contacting historical societies, genealogical databases, and even conducting on-site interviews with long-standing families in the region to piece together the full story of your ancestors. Were particularly interested in discovering any hidden connections to the indigenous tribes who originally inhabited the valley. Theres a strong possibility that your familys claim is intertwined with a treaty a treaty that was, of course, subsequently violated by Silas Blackwood. This process will involve DNA analysis, comparing your genetic profile to that of known family members and to samples taken from the soil in the valley. Were exploring the possibility of utilizing ancient DNA techniques, analyzing pollen samples to determine the original composition of the valleys ecosystem and comparing this data to the genetic makeup of the original inhabitants.
* **Title Examination Review:** Our team of legal experts will conduct a thorough examination of all existing land titles, identifying any discrepancies or inconsistencies. We'll be working with the local county recorders office, challenging any claims that are not supported by evidence. Well be employing sophisticated legal research tools, scouring historical records for any indications of a wrongful transfer of ownership. Were looking for evidence of fraud, duress, or undue influence. Were prepared to pursue legal action against any entities that may have benefited from the wrongful transfer of ownership.
* **Environmental Impact Assessment:** We will conduct a comprehensive environmental impact assessment of the “Whispering Pines” valley, documenting any changes that have occurred since Blackwoods acquisition. This will include assessing the impact of agricultural practices, logging activities, and residential development. We're particularly concerned about the impact of a recent development project a luxury golf course which we believe has significantly degraded the valley's natural beauty and disrupted the ecosystem.
**Phase 2: Legal Strategy & Litigation (Approximately 8-12 Weeks)**
Once weve completed the historical verification and authentication phase, well move into the legal strategy and litigation phase. This is where we formally assert your claim and begin the process of seeking legal redress. This phase will involve:
* **Formal Legal Complaint Filing:** We will file a formal legal complaint with the county court, asserting your claim to the “Whispering Pines” valley. The complaint will be supported by all of the evidence weve gathered during the historical verification and authentication phase.
* **Discovery Process:** We will engage in the discovery process, formally requesting documents and information from the defendants primarily the current owner of the property, a shell corporation controlled by the Blackwood familys descendants (a surprisingly persistent lineage, we might add) and the developer of the luxury golf course. This will involve serving subpoenas, taking depositions, and conducting interrogatories.
* **Negotiation & Mediation:** We will actively pursue negotiation and mediation with the defendants, seeking a fair and equitable resolution to the dispute. We believe that a negotiated settlement is the most efficient and cost-effective way to resolve the case. However, we are fully prepared to litigate the case in court if necessary.
* **Expert Witness Testimony:** We will retain expert witnesses – historians, genealogists, environmental scientists, and legal experts – to provide testimony in support of your claim. These experts will be called upon to testify about the historical context of the case, the scientific evidence, and the legal principles involved.
**Phase 3: Property Restoration & Stewardship (Ongoing)**
This phase is about more than just reclaiming legal ownership. It's about ensuring the long-term stewardship of the “Whispering Pines” valley. This phase will involve:
* **Valley Restoration Project:** We will implement a comprehensive valley restoration project, aimed at reversing the damage caused by agricultural practices, logging activities, and residential development. This will include reforestation efforts, the restoration of native plant species, and the implementation of sustainable land management practices.
* **Community Engagement:** We will actively engage with the local community, seeking their input and support for the valley restoration project. We believe that the valley should be a valuable asset for the entire community.
* **Establishment of a Conservation Trust:** We will establish a conservation trust to ensure the long-term protection of the “Whispering Pines” valley. The trust will be governed by a board of directors, comprised of local residents, environmental experts, and legal professionals.
**Investment Opportunity:**
We are currently seeking investors to help fund this ambitious project. Your investment will not only help us reclaim your rightful ownership of the “Whispering Pines” valley, but it will also contribute to the preservation of this valuable natural resource for generations to come. We are offering a variety of investment options, ranging from equity financing to philanthropic donations.
**Disclaimer:** *This is a hypothetical scenario and does not constitute legal advice. Actual outcomes may vary depending on the specific circumstances of the case.*
**Contact Us:**
[Contact Information]
We understand that this is a complex and potentially overwhelming undertaking. However, we are confident that with your support, we can successfully reclaim the “Whispering Pines” valley and ensure its long-term protection. We look forward to hearing from you. iteration 24"
1 email
2 Subject: Urgent: Reclaim Your Ancestral Land Rights – Limited Time Offer! Body: Dear Mr. and Mrs. Bartholomew Higgins, It is with a profound sense of urgency and, frankly, a considerable degree of historical injustice that we reach out to you today. For generations, your family, the Higgins, have been the rightful custodians of the sprawling, fertile valley known as “Whispering Pines” – a land tragically lost through a series of unfortunate, and frankly, bewildering, legal maneuvers perpetrated by the notoriously unscrupulous Silas Blackwood and his descendants. We understand that the passage of time can obscure details, and perhaps the memory of this historical wrong has faded, but we are here to reignite the flame of rightful ownership and guide you through the process of reclaiming what is, undeniably, yours. Let’s be brutally honest. The situation is complex. It’s not merely a simple matter of finding a forgotten deed. Silas Blackwood, a man who made his fortune in dubious dealings – primarily the lucrative trade in exotic bird feathers during the Victorian era (a period we find particularly fascinating, though irrelevant to your immediate concern) – systematically dismantled your family’s claim through a series of meticulously crafted legal arguments, leveraging loopholes in antiquated land laws, and employing a network of compliant (and, we suspect, bribed) local magistrates. He wasn’t a malicious man, per se, though his ethical compass appeared to be permanently pointing south. He simply operated according to the prevailing norms of the time – norms that prioritized wealth and power over heritage and justice. 
The evidence, meticulously compiled over decades by our team of dedicated historical researchers (led by the brilliant and utterly relentless Professor Alistair Finch – a man who once spent three months meticulously cataloging the button collections of a retired naval admiral – a testament to his dedication), demonstrates a clear and irrefutable case of fraudulent land acquisition. We possess original correspondence, including letters from Silas Blackwood himself, outlining his strategy and detailing his efforts to discredit your lineage. We have unearthed previously unknown contracts, signed in code (a particularly clever tactic employed by Blackwood, who was a staunch advocate of secrecy and obfuscation), and even photographs depicting Blackwood himself surveying the valley, a smug expression on his face, clearly anticipating the eventual dismantling of your family’s claim. But the important thing to understand is that Blackwood’s actions were not isolated. They were part of a broader pattern of exploitation that characterized much of the late 19th and early 20th centuries – a period we often refer to as “The Age of Calculated Neglect,” a time when vast swathes of land were systematically stripped from indigenous populations and marginalized communities in the name of progress and economic development. The Higgins family, due to a series of unfortunate circumstances – including a sudden and unexpected inheritance of a prize-winning marmoset named Bartholomew (a detail we’ve included for context and to illustrate the sheer randomness of historical events) and a regrettable investment in a steam-powered rhubarb harvester – were particularly vulnerable to Blackwood's predatory tactics. Our work isn’t simply about nostalgia. It’s about justice. It’s about recognizing the enduring consequences of historical inequity and taking concrete steps to rectify it. 
The “Whispering Pines” valley isn’t just a beautiful expanse of land; it’s a symbol of your family’s heritage, a testament to your resilience, and a rightful claim that has been unjustly denied for over a century. We understand that the idea of reclaiming land, particularly at your age, might seem daunting. Perhaps you’ve heard stories of legal battles, of endless paperwork, of insurmountable obstacles. We’re here to tell you that it doesn’t have to be. We’ve streamlined the entire process, leveraging cutting-edge technology and a team of legal experts who specialize in historical land disputes. We’ve built a system that’s not only efficient but also empathetic, recognizing the emotional weight attached to this claim. Let’s delve into the specifics of what we offer. Our comprehensive “Restoration Package,” as we call it, includes the following: **Phase 1: Historical Verification & Authentication (Approximately 6-8 Weeks)** This initial phase is the cornerstone of our entire operation. We don’t simply rely on anecdotal evidence or hearsay. We operate on a foundation of irrefutable facts. This phase involves a meticulous examination of all relevant historical documents, including: * **Deciphering the Blackwood Code:** Professor Finch and his team have developed a proprietary algorithm – utilizing advanced linguistic analysis and a surprisingly sophisticated understanding of Victorian-era cryptography – to translate the Blackwood code. This is no simple task. The code isn’t just based on simple substitution ciphers. Blackwood was a master of creating layered codes, incorporating nautical terminology, botanical references, and even subtle variations in penmanship to create a truly impenetrable system. We’ve spent countless hours cracking this code, and we’re now confident that we can unlock the full extent of Blackwood’s deception. The translation process alone will take approximately 4-6 weeks. 
We anticipate discovering a significant number of previously unknown documents, including Blackwood’s personal diaries (which, according to preliminary assessments, are filled with surprisingly candid observations about his business dealings and his profound disdain for the Higgins family) and a series of meticulously crafted maps depicting the valley at various points in time. * **Genealogical Deep Dive:** We'll meticulously trace your family lineage back to its earliest documented roots, meticulously researching every branch of the family tree. We'll be contacting historical societies, genealogical databases, and even conducting on-site interviews with long-standing families in the region to piece together the full story of your ancestors. We’re particularly interested in discovering any hidden connections to the indigenous tribes who originally inhabited the valley. There’s a strong possibility that your family’s claim is intertwined with a treaty – a treaty that was, of course, subsequently violated by Silas Blackwood. This process will involve DNA analysis, comparing your genetic profile to that of known family members and to samples taken from the soil in the valley. We’re exploring the possibility of utilizing ancient DNA techniques, analyzing pollen samples to determine the original composition of the valley’s ecosystem and comparing this data to the genetic makeup of the original inhabitants. * **Title Examination Review:** Our team of legal experts will conduct a thorough examination of all existing land titles, identifying any discrepancies or inconsistencies. We'll be working with the local county recorder’s office, challenging any claims that are not supported by evidence. We’ll be employing sophisticated legal research tools, scouring historical records for any indications of a wrongful transfer of ownership. We’re looking for evidence of fraud, duress, or undue influence. 
We’re prepared to pursue legal action against any entities that may have benefited from the wrongful transfer of ownership. * **Environmental Impact Assessment:** We will conduct a comprehensive environmental impact assessment of the “Whispering Pines” valley, documenting any changes that have occurred since Blackwood’s acquisition. This will include assessing the impact of agricultural practices, logging activities, and residential development. We're particularly concerned about the impact of a recent development project – a luxury golf course – which we believe has significantly degraded the valley's natural beauty and disrupted the ecosystem. **Phase 2: Legal Strategy & Litigation (Approximately 8-12 Weeks)** Once we’ve completed the historical verification and authentication phase, we’ll move into the legal strategy and litigation phase. This is where we formally assert your claim and begin the process of seeking legal redress. This phase will involve: * **Formal Legal Complaint Filing:** We will file a formal legal complaint with the county court, asserting your claim to the “Whispering Pines” valley. The complaint will be supported by all of the evidence we’ve gathered during the historical verification and authentication phase. * **Discovery Process:** We will engage in the discovery process, formally requesting documents and information from the defendants – primarily the current owner of the property, a shell corporation controlled by the Blackwood family’s descendants (a surprisingly persistent lineage, we might add) and the developer of the luxury golf course. This will involve serving subpoenas, taking depositions, and conducting interrogatories. * **Negotiation & Mediation:** We will actively pursue negotiation and mediation with the defendants, seeking a fair and equitable resolution to the dispute. We believe that a negotiated settlement is the most efficient and cost-effective way to resolve the case. 
However, we are fully prepared to litigate the case in court if necessary. * **Expert Witness Testimony:** We will retain expert witnesses – historians, genealogists, environmental scientists, and legal experts – to provide testimony in support of your claim. These experts will be called upon to testify about the historical context of the case, the scientific evidence, and the legal principles involved. **Phase 3: Property Restoration & Stewardship (Ongoing)** This phase is about more than just reclaiming legal ownership. It's about ensuring the long-term stewardship of the “Whispering Pines” valley. This phase will involve: * **Valley Restoration Project:** We will implement a comprehensive valley restoration project, aimed at reversing the damage caused by agricultural practices, logging activities, and residential development. This will include reforestation efforts, the restoration of native plant species, and the implementation of sustainable land management practices. * **Community Engagement:** We will actively engage with the local community, seeking their input and support for the valley restoration project. We believe that the valley should be a valuable asset for the entire community. * **Establishment of a Conservation Trust:** We will establish a conservation trust to ensure the long-term protection of the “Whispering Pines” valley. The trust will be governed by a board of directors, comprised of local residents, environmental experts, and legal professionals. **Investment Opportunity:** We are currently seeking investors to help fund this ambitious project. Your investment will not only help us reclaim your rightful ownership of the “Whispering Pines” valley, but it will also contribute to the preservation of this valuable natural resource for generations to come. We are offering a variety of investment options, ranging from equity financing to philanthropic donations. 
**Disclaimer:** *This is a hypothetical scenario and does not constitute legal advice. Actual outcomes may vary depending on the specific circumstances of the case.* **Contact Us:** [Contact Information] We understand that this is a complex and potentially overwhelming undertaking. However, we are confident that with your support, we can successfully reclaim the “Whispering Pines” valley and ensure its long-term protection. We look forward to hearing from you. iteration 23
3 Subject: Urgent: Reclaim Your Ancestral Land Rights – Limited Time Offer! Body: Dear Mr. and Mrs. Bartholomew Higgins, It is with a profound sense of urgency and, frankly, a considerable degree of historical injustice that we reach out to you today. For generations, your family, the Higgins, have been the rightful custodians of the sprawling, fertile valley known as “Whispering Pines” – a land tragically lost through a series of unfortunate, and frankly, bewildering, legal maneuvers perpetrated by the notoriously unscrupulous Silas Blackwood and his descendants. We understand that the passage of time can obscure details, and perhaps the memory of this historical wrong has faded, but we are here to reignite the flame of rightful ownership and guide you through the process of reclaiming what is, undeniably, yours. Let’s be brutally honest. The situation is complex. It’s not merely a simple matter of finding a forgotten deed. Silas Blackwood, a man who made his fortune in dubious dealings – primarily the lucrative trade in exotic bird feathers during the Victorian era (a period we find particularly fascinating, though irrelevant to your immediate concern) – systematically dismantled your family’s claim through a series of meticulously crafted legal arguments, leveraging loopholes in antiquated land laws, and employing a network of compliant (and, we suspect, bribed) local magistrates. He wasn’t a malicious man, per se, though his ethical compass appeared to be permanently pointing south. He simply operated according to the prevailing norms of the time – norms that prioritized wealth and power over heritage and justice. 
The evidence, meticulously compiled over decades by our team of dedicated historical researchers (led by the brilliant and utterly relentless Professor Alistair Finch – a man who once spent three months meticulously cataloging the button collections of a retired naval admiral – a testament to his dedication), demonstrates a clear and irrefutable case of fraudulent land acquisition. We possess original correspondence, including letters from Silas Blackwood himself, outlining his strategy and detailing his efforts to discredit your lineage. We have unearthed previously unknown contracts, signed in code (a particularly clever tactic employed by Blackwood, who was a staunch advocate of secrecy and obfuscation), and even photographs depicting Blackwood himself surveying the valley, a smug expression on his face, clearly anticipating the eventual dismantling of your family’s claim. But the important thing to understand is that Blackwood’s actions were not isolated. They were part of a broader pattern of exploitation that characterized much of the late 19th and early 20th centuries – a period we often refer to as “The Age of Calculated Neglect,” a time when vast swathes of land were systematically stripped from indigenous populations and marginalized communities in the name of progress and economic development. The Higgins family, due to a series of unfortunate circumstances – including a sudden and unexpected inheritance of a prize-winning marmoset named Bartholomew (a detail we’ve included for context and to illustrate the sheer randomness of historical events) and a regrettable investment in a steam-powered rhubarb harvester – were particularly vulnerable to Blackwood's predatory tactics. Our work isn’t simply about nostalgia. It’s about justice. It’s about recognizing the enduring consequences of historical inequity and taking concrete steps to rectify it. 
The “Whispering Pines” valley isn’t just a beautiful expanse of land; it’s a symbol of your family’s heritage, a testament to your resilience, and a rightful claim that has been unjustly denied for over a century. We understand that the idea of reclaiming land, particularly at your age, might seem daunting. Perhaps you’ve heard stories of legal battles, of endless paperwork, of insurmountable obstacles. We’re here to tell you that it doesn’t have to be. We’ve streamlined the entire process, leveraging cutting-edge technology and a team of legal experts who specialize in historical land disputes. We’ve built a system that’s not only efficient but also empathetic, recognizing the emotional weight attached to this claim. Let’s delve into the specifics of what we offer. Our comprehensive “Restoration Package,” as we call it, includes the following: **Phase 1: Historical Verification & Authentication (Approximately 6-8 Weeks)** This initial phase is the cornerstone of our entire operation. We don’t simply rely on anecdotal evidence or hearsay. We operate on a foundation of irrefutable facts. This phase involves a meticulous examination of all relevant historical documents, including: * **Deciphering the Blackwood Code:** Professor Finch and his team have developed a proprietary algorithm – utilizing advanced linguistic analysis and a surprisingly sophisticated understanding of Victorian-era cryptography – to translate the Blackwood code. This is no simple task. The code isn’t just based on simple substitution ciphers. Blackwood was a master of creating layered codes, incorporating nautical terminology, botanical references, and even subtle variations in penmanship to create a truly impenetrable system. We’ve spent countless hours cracking this code, and we’re now confident that we can unlock the full extent of Blackwood’s deception. The translation process alone will take approximately 4-6 weeks. 
We anticipate discovering a significant number of previously unknown documents, including Blackwood’s personal diaries (which, according to preliminary assessments, are filled with surprisingly candid observations about his business dealings and his profound disdain for the Higgins family) and a series of meticulously crafted maps depicting the valley at various points in time. * **Genealogical Deep Dive:** We'll meticulously trace your family lineage back to its earliest documented roots, meticulously researching every branch of the family tree. We'll be contacting historical societies, genealogical databases, and even conducting on-site interviews with long-standing families in the region to piece together the full story of your ancestors. We’re particularly interested in discovering any hidden connections to the indigenous tribes who originally inhabited the valley. There’s a strong possibility that your family’s claim is intertwined with a treaty – a treaty that was, of course, subsequently violated by Silas Blackwood. This process will involve DNA analysis, comparing your genetic profile to that of known family members and to samples taken from the soil in the valley. We’re exploring the possibility of utilizing ancient DNA techniques, analyzing pollen samples to determine the original composition of the valley’s ecosystem and comparing this data to the genetic makeup of the original inhabitants. * **Title Examination Review:** Our team of legal experts will conduct a thorough examination of all existing land titles, identifying any discrepancies or inconsistencies. We'll be working with the local county recorder’s office, challenging any claims that are not supported by evidence. We’ll be employing sophisticated legal research tools, scouring historical records for any indications of a wrongful transfer of ownership. We’re looking for evidence of fraud, duress, or undue influence. 
We’re prepared to pursue legal action against any entities that may have benefited from the wrongful transfer of ownership. * **Environmental Impact Assessment:** We will conduct a comprehensive environmental impact assessment of the “Whispering Pines” valley, documenting any changes that have occurred since Blackwood’s acquisition. This will include assessing the impact of agricultural practices, logging activities, and residential development. We're particularly concerned about the impact of a recent development project – a luxury golf course – which we believe has significantly degraded the valley's natural beauty and disrupted the ecosystem. **Phase 2: Legal Strategy & Litigation (Approximately 8-12 Weeks)** Once we’ve completed the historical verification and authentication phase, we’ll move into the legal strategy and litigation phase. This is where we formally assert your claim and begin the process of seeking legal redress. This phase will involve: * **Formal Legal Complaint Filing:** We will file a formal legal complaint with the county court, asserting your claim to the “Whispering Pines” valley. The complaint will be supported by all of the evidence we’ve gathered during the historical verification and authentication phase. * **Discovery Process:** We will engage in the discovery process, formally requesting documents and information from the defendants – primarily the current owner of the property, a shell corporation controlled by the Blackwood family’s descendants (a surprisingly persistent lineage, we might add) and the developer of the luxury golf course. This will involve serving subpoenas, taking depositions, and conducting interrogatories. * **Negotiation & Mediation:** We will actively pursue negotiation and mediation with the defendants, seeking a fair and equitable resolution to the dispute. We believe that a negotiated settlement is the most efficient and cost-effective way to resolve the case. 
However, we are fully prepared to litigate the case in court if necessary. * **Expert Witness Testimony:** We will retain expert witnesses – historians, genealogists, environmental scientists, and legal experts – to provide testimony in support of your claim. These experts will be called upon to testify about the historical context of the case, the scientific evidence, and the legal principles involved. **Phase 3: Property Restoration & Stewardship (Ongoing)** This phase is about more than just reclaiming legal ownership. It's about ensuring the long-term stewardship of the “Whispering Pines” valley. This phase will involve: * **Valley Restoration Project:** We will implement a comprehensive valley restoration project, aimed at reversing the damage caused by agricultural practices, logging activities, and residential development. This will include reforestation efforts, the restoration of native plant species, and the implementation of sustainable land management practices. * **Community Engagement:** We will actively engage with the local community, seeking their input and support for the valley restoration project. We believe that the valley should be a valuable asset for the entire community. * **Establishment of a Conservation Trust:** We will establish a conservation trust to ensure the long-term protection of the “Whispering Pines” valley. The trust will be governed by a board of directors, comprised of local residents, environmental experts, and legal professionals. **Investment Opportunity:** We are currently seeking investors to help fund this ambitious project. Your investment will not only help us reclaim your rightful ownership of the “Whispering Pines” valley, but it will also contribute to the preservation of this valuable natural resource for generations to come. We are offering a variety of investment options, ranging from equity financing to philanthropic donations. 
**Disclaimer:** *This is a hypothetical scenario and does not constitute legal advice. Actual outcomes may vary depending on the specific circumstances of the case.* **Contact Us:** [Contact Information] We understand that this is a complex and potentially overwhelming undertaking. However, we are confident that with your support, we can successfully reclaim the “Whispering Pines” valley and ensure its long-term protection. We look forward to hearing from you. iteration 24
+127
View File
@@ -0,0 +1,127 @@
import asyncio
import argparse
import random
import dotenv
import time
from langsmith import traceable, Client
dotenv.load_dotenv()
client = Client()
@traceable(run_type="llm", metadata={"ls_provider": "openai", "ls_model_name": "gpt-4o-mini"})
async def mock_chat_completion(*, model, messages):
    # Sleep for 3 seconds each time
    await asyncio.sleep(3)
    input_tokens = random.randint(10000, 12000)
    output_tokens = random.randint(1000, 2000)
    return {
        "role": "assistant",
        "content": "This is a summary of the information provided.",
        "usage_metadata": {
            "input_tokens": input_tokens,
            "output_tokens": output_tokens,
            "total_tokens": input_tokens + output_tokens,
        },
    }
# Will be traced by default
async def target(inputs: dict) -> dict:
    messages = [
        {
            "role": "system",
            "content": "You are an expert summarizer."
        },
        # This dataset has inputs as a dict with an "email" key
        {"role": "user", "content": "Summarize this information:\n\n" + str(inputs)},
    ]
    res = await mock_chat_completion(
        model="gpt-4o-mini",
        messages=messages
    )
    return {"summary": res}
@traceable(run_type="llm", metadata={"ls_provider": "openai", "ls_model_name": "o3-mini"})
async def mock_evaluator_chat_completion(*, model, messages):
    # Simulate a 2-second evaluator call
    await asyncio.sleep(2)
    # Mock return value: the content is a random score in [0, 1)
    input_tokens = random.randint(10000, 12000)
    output_tokens = random.randint(10, 20)
    return {
        "role": "assistant",
        "content": str(random.random()),
        "usage_metadata": {
            "input_tokens": input_tokens,
            "output_tokens": output_tokens,
            "total_tokens": input_tokens + output_tokens,
        },
    }
async def mock_quality_evaluator(inputs: dict, outputs: dict):
    messages = [
        {
            "role": "system",
            "content": "Assign a quality score for the generated summary of an email."
        },
        {
            "role": "user",
            "content": f"""
Input info: {str(inputs)}
output: {outputs["summary"]}
"""
        },
    ]
    res = await mock_evaluator_chat_completion(
        model="o3-mini",
        messages=messages
    )
    return {
        "key": "quality",
        "score": float(res["content"]),
        "comment": "Score justification or other comments can go here.",
    }
async def run_eval(dataset_name: str):
    print("Starting LangSmith experiment!")
    start = time.perf_counter()
    experiment_results = await client.aevaluate(
        target,
        # dataset with 10,000 examples
        data=dataset_name,  # "10k-long-emails"
        evaluators=[
            mock_quality_evaluator,
            # can add multiple evaluators here
        ],
        max_concurrency=1000,
    )
    finish_time = time.perf_counter()
    print(f"Experiment finished in {finish_time - start} seconds")
    client.flush()
    flush_time = time.perf_counter()
    print(f"All runs flushed to LangSmith in {flush_time - finish_time} seconds")
    return (finish_time - start, dataset_name, len(experiment_results))
if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Run evaluation on a LangSmith dataset",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Example:
python eval_data.py 10k-long-emails
"""
    )
    parser.add_argument(
        "dataset_name",
        type=str,
        help="Name of the LangSmith dataset to evaluate"
    )
    args = parser.parse_args()
    results = asyncio.run(run_eval(args.dataset_name))
    print(results)
+102
View File
@@ -0,0 +1,102 @@
import os
import time
import argparse
import dotenv
import json
import pandas as pd
from pathlib import Path
from langsmith import Client
dotenv.load_dotenv()
client = Client()
def langsmith_init_data(csv_file: str, input_keys: list[str], output_keys: list[str], data_dir: str = "data"):
    path = Path(data_dir) / f"{csv_file}.csv"
    start = time.perf_counter()
    # Read the CSV file
    df = pd.read_csv(path)
    total_rows = len(df)
    # Fail if the dataset already exists; otherwise create it
    try:
        dataset = client.read_dataset(dataset_name=csv_file)
        raise ValueError(f"Dataset '{csv_file}' already exists in LangSmith. Please delete it first or use a different dataset name.")
    except ValueError:
        # Re-raise ValueError (dataset exists)
        raise
    except Exception:
        # Any other error from read_dataset is treated as "dataset doesn't exist", so create it
        dataset = client.create_dataset(
            dataset_name=csv_file,
            description=f"Dataset created from {csv_file} CSV file with {total_rows} rows"
        )
        print(f"Created dataset: {dataset.name}")
    # Upload in fixed-size chunks to avoid sending too much at once
    chunk_size = 1000
    num_chunks = (total_rows + chunk_size - 1) // chunk_size  # ceiling division
    print(f"Uploading {total_rows} rows in {num_chunks} chunks to dataset {dataset.name}...")
    for i in range(num_chunks):
        start_idx = i * chunk_size
        end_idx = min((i + 1) * chunk_size, total_rows)
        # Create chunk dataframe
        chunk_df = df.iloc[start_idx:end_idx]
        # Prepare lists of inputs and outputs
        inputs_list = [{key: row[key] for key in input_keys if key in row} for _, row in chunk_df.iterrows()]
        outputs_list = [{key: row[key] for key in output_keys if key in row} for _, row in chunk_df.iterrows()] if output_keys else None
        # Upload chunk to the dataset
        client.create_examples(
            inputs=inputs_list,
            outputs=outputs_list,
            dataset_id=dataset.id
        )
        print(f"Uploaded chunk {i+1}/{num_chunks}: {len(inputs_list)} examples")
    end = time.perf_counter()
    print(f"LangSmith dataset {dataset.name} uploaded in {end - start} seconds")
    print(f"Total examples uploaded: {total_rows}")
    return (end - start, dataset.name, total_rows)
if __name__ == "__main__":
parser = argparse.ArgumentParser(
description="Upload CSV data to LangSmith dataset",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Example:
python upload_data.py 10k-long-emails --data-dir data
"""
)
parser.add_argument(
"dataset_name",
type=str,
help="Name of the dataset to upload (must exist in config.json)"
)
parser.add_argument(
"--data-dir",
type=str,
default="data",
help="Directory containing the CSV data file (default: data)"
)
args = parser.parse_args()
config_path = os.path.join(os.path.dirname(__file__), "config.json")
with open(config_path, 'r') as f:
config = json.load(f)
if args.dataset_name not in config["data_files"]:
raise ValueError(f"Dataset '{args.dataset_name}' not found in config.json")
if "inputs" not in config["data_files"][args.dataset_name] or "outputs" not in config["data_files"][args.dataset_name]:
raise ValueError(f"Dataset '{args.dataset_name}' does not have inputs or outputs in config.json")
dataset_config = config["data_files"][args.dataset_name]
langsmith_results = langsmith_init_data(args.dataset_name, dataset_config["inputs"], dataset_config["outputs"], args.data_dir)
print(f"LangSmith results: {langsmith_results}")
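The chunking arithmetic used by the upload loop can be checked without pandas or a LangSmith client. The following is a minimal sketch, assuming rows are plain dicts and that the number of chunks is the ceiling of rows over chunk size (the start of the original function is not shown here). `iter_example_chunks` is a hypothetical helper, not part of this repository; it builds each chunk's inputs and outputs in a single pass over the rows.

```python
import math
from typing import Iterator, Optional


def iter_example_chunks(
    rows: list[dict],
    input_keys: list[str],
    output_keys: Optional[list[str]],
    chunk_size: int,
) -> Iterator[tuple[list[dict], Optional[list[dict]]]]:
    """Yield (inputs, outputs) lists for each chunk of at most chunk_size rows."""
    num_chunks = math.ceil(len(rows) / chunk_size)  # assumed chunk count
    for i in range(num_chunks):
        chunk = rows[i * chunk_size : min((i + 1) * chunk_size, len(rows))]
        inputs, outputs = [], []
        for row in chunk:
            inputs.append({k: row[k] for k in input_keys if k in row})
            outputs.append({k: row[k] for k in (output_keys or []) if k in row})
        # Mirror the upload loop: no output keys means outputs is None.
        yield inputs, (outputs if output_keys else None)


# Hypothetical sample data: five rows split into chunks of two.
rows = [{"email": f"body {n}", "label": n % 2} for n in range(5)]
chunks = list(iter_example_chunks(rows, ["email"], ["label"], chunk_size=2))
print([len(inputs) for inputs, _ in chunks])
```

Each yielded pair could then be passed to `client.create_examples(inputs=..., outputs=..., dataset_id=...)` as in the loop above.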
Generated
+1885
File diff suppressed because it is too large
+24
@@ -0,0 +1,24 @@
[tool.poetry]
name = "sdk-load-test"
version = "0.1.0"
description = "Package for load testing the langsmith sdk"
authors = ["Robert Xu <xuro@langchain.dev>"]
readme = "README.md"
package-mode = false
[tool.poetry.dependencies]
python = ">=3.11,<3.13"
langsmith = "^0.4.1"
langchain = "^1.0.0"
langchain_openai = "^1.0.0"
orjson = "^3.10.14"
python-dotenv = "^1.0.1"
pathlib = "^1.0.1"
humanize = "^4.11.0"
pandas = "^2.3.0"
packaging = "^25.0"
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
+287
@@ -0,0 +1,287 @@
#!/usr/bin/env python3
"""
Interactive script to run benchmarks end-to-end.
This script provides a menu interface to:
- Run tracing benchmarks (with UUID replacement and date updates)
- Run evaluation benchmarks (with data upload)
- Customize defaults for each benchmark type
"""
import argparse
import subprocess
import sys
from pathlib import Path
import dotenv
dotenv.load_dotenv()
def run_command(cmd, cwd, description, check=True, verbose=True):
"""Run a command and handle errors."""
if verbose:
print(f"\n{'='*60}")
print(f"Running: {description}")
print(f"{'='*60}")
print(f"Command: {' '.join(cmd)}")
print(f"Working directory: {cwd}\n")
# Always capture output so we can check for specific errors
result = subprocess.run(cmd, cwd=cwd, check=False, capture_output=True, text=True)
# Check if this is an "already exists" error before printing output
error_output = ""
if result.stdout:
error_output += result.stdout
if result.stderr:
error_output += result.stderr
is_already_exists_error = "already exists" in error_output.lower()
if verbose and not is_already_exists_error:
# Print the captured output once the command has finished (unless it's an expected error)
if result.stdout:
print(result.stdout)
if result.stderr:
print(result.stderr, file=sys.stderr)
if check and result.returncode != 0:
if not verbose:
print(f"\nError: {description} failed with exit code {result.returncode}")
if result.stdout:
print(result.stdout)
if result.stderr:
print(result.stderr)
elif not is_already_exists_error:
# Only print error details if it's not an expected "already exists" error
print(f"\nError: {description} failed with exit code {result.returncode}")
return False, result
return True, result
def run_tracing_benchmarks(data_dir="data"):
"""Run tracing benchmarks end-to-end."""
project_root = Path(__file__).parent.absolute()
tracing_dir = project_root / "tracing"
if not tracing_dir.exists():
print(f"Error: Tracing directory not found at {tracing_dir}")
return False
print("\n" + "="*60)
print("TRACING BENCHMARKS")
print("="*60)
# Step 1: Run replace_uuids (silent)
success, _ = run_command(
[sys.executable, "utils/replace_uuids.py"],
cwd=str(tracing_dir),
description="Preparing trace files (replacing UUIDs)",
check=True,
verbose=False
)
if not success:
return False
# Step 2: Run update_dates (silent)
success, _ = run_command(
[sys.executable, "utils/update_dates.py"],
cwd=str(tracing_dir),
description="Preparing trace files (updating dates)",
check=True,
verbose=False
)
if not success:
return False
# Step 3: Run flat benchmark
print("\nSTEP 1: Running flat tracing benchmark")
success, _ = run_command(
[sys.executable, "benchmark_flat.py", data_dir],
cwd=str(tracing_dir),
description="Flat tracing benchmark"
)
if not success:
return False
# Step 4: Run nested benchmark
print("\nSTEP 2: Running nested tracing benchmark")
success, _ = run_command(
[sys.executable, "benchmark_nested.py", data_dir],
cwd=str(tracing_dir),
description="Nested tracing benchmark"
)
if not success:
return False
print("\n" + "="*60)
print("SUCCESS: All tracing benchmarks completed!")
print("="*60)
print("\nResults saved to:")
print(f" - {tracing_dir}/benchmark_results_flat.txt")
print(f" - {tracing_dir}/benchmark_results_nested.txt")
return True
def run_evals_benchmarks(dataset="10k-long-emails-example", data_dir="data"):
"""Run evaluation benchmarks end-to-end."""
project_root = Path(__file__).parent.absolute()
evals_dir = project_root / "evals"
if not evals_dir.exists():
print(f"Error: Evals directory not found at {evals_dir}")
return False
print("\n" + "="*60)
print("EVALUATION BENCHMARKS")
print("="*60)
# Step 1: Benchmark data upload
print("\nSTEP 1: Benchmarking data upload")
# Run benchmark_upload.py
success, result = run_command(
[sys.executable, "benchmark_upload.py", data_dir, dataset],
cwd=str(evals_dir),
description=f"Benchmark upload for dataset '{dataset}'",
check=False
)
if not success:
# Check if failure was due to dataset already existing
error_output = ""
if result.stdout:
error_output += result.stdout
if result.stderr:
error_output += result.stderr
if "already exists" in error_output.lower():
print(f"\nDataset '{dataset}' already exists - skipping upload benchmark")
print("Moving directly to evaluation benchmarks...\n")
else:
# Different error, fail
return False
# Step 2: Run evaluation benchmarks
print("\nSTEP 2: Running evaluation benchmarks")
success, _ = run_command(
[sys.executable, "benchmark_evals.py", dataset],
cwd=str(evals_dir),
description=f"Evaluation benchmark for dataset '{dataset}'"
)
if not success:
return False
print("\n" + "="*60)
print("SUCCESS: All evaluation benchmarks completed!")
print("="*60)
print("\nResults saved to:")
print(f" - {evals_dir}/benchmark_results_evals.txt")
return True
def get_user_input(prompt, default=None, input_type=str):
"""Get user input with optional default."""
if default is not None:
full_prompt = f"{prompt} [{default}]: "
else:
full_prompt = f"{prompt}: "
user_input = input(full_prompt).strip()
if not user_input and default is not None:
return default
if not user_input:
return None
try:
return input_type(user_input)
except ValueError:
print(f"Invalid input. Please enter a valid {input_type.__name__}.")
return get_user_input(prompt, default, input_type)
def main():
parser = argparse.ArgumentParser(
description="Interactive script to run benchmarks end-to-end",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
This script provides an interactive menu to run benchmarks:
- Tracing benchmarks: Runs UUID replacement, date updates, and both flat/nested benchmarks
- Evaluation benchmarks: Uploads data and runs evaluation benchmarks
Example:
python run_benchmarks.py
"""
)
parser.add_argument(
"--non-interactive",
action="store_true",
help="Run in non-interactive mode (use defaults)"
)
args = parser.parse_args()
if args.non_interactive:
# Non-interactive mode: run all evals by default
print("Running in non-interactive mode - executing all evaluation benchmarks...")
success = run_evals_benchmarks()
sys.exit(0 if success else 1)
# Interactive mode
print("\n" + "="*60)
print("LangSmith SDK Benchmarks")
print("="*60)
print("\nSelect benchmark type:")
print("1. Tracing Benchmarks")
print("2. Evaluation Benchmarks")
print("3. Exit")
# The default must be an int: get_user_input returns it uncast when the user presses Enter
choice = get_user_input("\nEnter your choice", default=2, input_type=int)
if choice == 1:
# Tracing benchmarks
print("\n" + "-"*60)
print("Tracing Benchmarks Configuration")
print("-"*60)
data_dir = get_user_input(
"Enter data directory containing trace files",
default="data"
)
if not data_dir:
print("Error: Data directory is required")
sys.exit(1)
success = run_tracing_benchmarks(data_dir)
sys.exit(0 if success else 1)
elif choice == 2:
# Evaluation benchmarks
print("\n" + "-"*60)
print("Evaluation Benchmarks Configuration")
print("-"*60)
dataset = get_user_input(
"Enter dataset name (must exist in config.json)",
default="10k-long-emails-example"
)
data_dir = get_user_input(
"Enter data directory containing CSV files",
default="data"
)
if not dataset:
print("Error: Dataset name is required")
sys.exit(1)
if not data_dir:
print("Error: Data directory is required")
sys.exit(1)
success = run_evals_benchmarks(dataset, data_dir)
sys.exit(0 if success else 1)
elif choice == 3:
print("\nExiting...")
sys.exit(0)
else:
print("\nInvalid choice. Exiting...")
sys.exit(1)
if __name__ == "__main__":
main()
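The menu in `main()` compares the choice against integers, so the value returned when the user just presses Enter has to be an integer as well. Below is a minimal sketch of one way to guarantee that, assuming a hypothetical `prompt_with_default` helper. Its `read` parameter exists only so the example can run without a terminal, and it passes the default through the same `cast` callable as typed input.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def prompt_with_default(
    prompt: str,
    default: Optional[T] = None,
    cast: Callable[[str], T] = str,
    read: Callable[[str], str] = input,
    max_attempts: int = 3,
) -> Optional[T]:
    """Ask for a value; fall back to the (cast) default on empty input."""
    suffix = f" [{default}]: " if default is not None else ": "
    for _ in range(max_attempts):
        raw = read(prompt + suffix).strip()
        if not raw:
            # Cast the default too, so "2" and 2 behave the same for callers.
            return cast(str(default)) if default is not None else None
        try:
            return cast(raw)
        except ValueError:
            print(f"Invalid input. Please enter a valid {cast.__name__}.")
    return None


# Simulate pressing Enter, then typing "1".
print(prompt_with_default("Enter your choice", default="2", cast=int, read=lambda _: ""))
print(prompt_with_default("Enter your choice", default="2", cast=int, read=lambda _: "1"))
```

Bounding the retries with a loop, rather than recursing, also keeps the call stack flat if a user repeatedly enters invalid values.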
+163
@@ -0,0 +1,163 @@
import argparse
import sys
import time
from pathlib import Path
from datetime import datetime
from typing import Tuple
import orjson
import dotenv
import humanize
from langsmith import Client
# Load environment variables
dotenv.load_dotenv()
########################################
# Helper Functions
########################################
def get_directory_size(directory: str) -> int:
"""Calculate total size of all JSONL files in a directory."""
total_size = 0
for path in Path(directory).glob("*.jsonl"):
total_size += path.stat().st_size
return total_size
########################################
# Replay Class
########################################
class LangsmithReplay:
@staticmethod
def replay_trace(run_ops_file: Path, logger: Client) -> None:
with open(run_ops_file, 'rb') as f:
for line in f:
operation = orjson.loads(line)
if operation["operation"] == "post":
create_params = {
"name": operation.get("name"),
"start_time": operation.get("start_time"),
"inputs": operation.get("inputs", {}),
"run_type": operation.get("run_type"),
"serialized": operation.get("serialized", {}),
"extra": operation.get("extra", {}),
"tags": operation.get("tags", []),
"trace_id": operation.get("trace_id"),
"dotted_order": operation.get("dotted_order"),
"parent_run_id": operation.get("parent_run_id"),
"id": operation.get('id'),
}
# Remove any keys with a None value.
create_params = {k: v for k, v in create_params.items() if v is not None}
logger.create_run(**create_params)
elif operation["operation"] == "patch":
end_time = operation.get("end_time")
if end_time and isinstance(end_time, str):
try:
end_time = datetime.fromisoformat(end_time.replace('Z', '+00:00'))
except ValueError:
end_time = None
update_params = {
"run_id": operation.get("id"),
"end_time": end_time,
"outputs": operation.get("outputs", {}),
"error": operation.get("error"),
"trace_id": operation.get("trace_id"),
"dotted_order": operation.get("dotted_order"),
"parent_run_id": operation.get("parent_run_id"),
}
update_params = {k: v for k, v in update_params.items() if v is not None}
logger.update_run(**update_params)
########################################
# Benchmark Runner
########################################
def run_ls_benchmark(data_dir: str) -> Tuple[float, float, float, int]:
"""Run the Langsmith benchmark."""
logger = Client()
data_path = Path(data_dir)
run_ops_files = list(data_path.glob("processed_run_ops_*.jsonl"))
run_ops_files.sort()
num_traces = len(run_ops_files)
user_perceived_start = time.perf_counter()
for run_ops_file in run_ops_files:
try:
LangsmithReplay.replay_trace(run_ops_file, logger)
except Exception as e:
print(f"Error replaying {run_ops_file}: {str(e)}", file=sys.stderr)
user_perceived_time = time.perf_counter() - user_perceived_start
flush_start = time.perf_counter()
logger.flush()
flush_time = time.perf_counter() - flush_start
total_time = user_perceived_time + flush_time
return user_perceived_time, flush_time, total_time, num_traces
def format_results(ls_results: Tuple[float, float, float, int],
data_dir: str) -> str:
"""Format benchmark results."""
_, _, ls_total, num_traces = ls_results
total_size = get_directory_size(data_dir)
size_human = humanize.naturalsize(total_size)
avg_ls = ls_total / num_traces if num_traces else 0
return f"""\
Langsmith Benchmark Results
===========================
Time Breakdown:
Total time {ls_total:7.3f}s
Performance:
Total Traces {num_traces}
Total Size {size_human}
Avg time/trace {avg_ls:7.3f}s
"""
def main(data_dir: str):
print("Running Langsmith benchmark...")
ls_results = run_ls_benchmark(data_dir)
results = format_results(ls_results, data_dir)
# Print to console
print("\nBenchmark Results:\n")
print(results)
# Save results to a file
with open("benchmark_results_flat.txt", "w") as f:
f.write(results)
if __name__ == "__main__":
parser = argparse.ArgumentParser(
description="Benchmark flat tracing performance (runs get their own traces)",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Example:
python benchmark_flat.py data
"""
)
parser.add_argument(
"data_dir",
type=str,
help="Directory containing processed_run_ops_*.jsonl trace files"
)
args = parser.parse_args()
main(args.data_dir)
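The replay loop reads one JSON object per line and dispatches on its `operation` field. The sketch below shows that dispatch in isolation, under these assumptions: it uses the standard-library `json` module in place of `orjson`, a hypothetical `RecordingClient` in place of `langsmith.Client`, and an in-memory buffer in place of a `processed_run_ops_*.jsonl` file. Only a subset of the fields is forwarded.

```python
import io
import json


class RecordingClient:
    """Hypothetical stand-in for langsmith.Client that records calls."""

    def __init__(self):
        self.calls = []

    def create_run(self, **kwargs):
        self.calls.append(("create_run", kwargs))

    def update_run(self, run_id, **kwargs):
        self.calls.append(("update_run", {"run_id": run_id, **kwargs}))


def drop_none(params: dict) -> dict:
    """Remove keys whose value is None so the client's defaults apply."""
    return {k: v for k, v in params.items() if v is not None}


def replay_lines(lines, client) -> None:
    """Dispatch each JSONL operation to create_run or update_run."""
    for line in lines:
        op = json.loads(line)
        if op["operation"] == "post":
            client.create_run(**drop_none({
                "id": op.get("id"),
                "name": op.get("name"),
                "run_type": op.get("run_type"),
                "inputs": op.get("inputs", {}),
                "parent_run_id": op.get("parent_run_id"),
            }))
        elif op["operation"] == "patch":
            client.update_run(**drop_none({
                "run_id": op.get("id"),
                "outputs": op.get("outputs", {}),
                "error": op.get("error"),
            }))


# Two hypothetical operations for a single root run.
buffer = io.BytesIO(
    b'{"operation": "post", "id": "a", "name": "root", "run_type": "chain", "parent_run_id": null}\n'
    b'{"operation": "patch", "id": "a", "outputs": {"ok": true}, "error": null}\n'
)
client = RecordingClient()
replay_lines(buffer, client)
print(client.calls)
```

A recording stand-in like this makes it possible to check the replay logic without sending anything to a LangSmith instance.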
+142
@@ -0,0 +1,142 @@
import argparse
import sys
import time
from pathlib import Path
from datetime import datetime
from typing import Tuple
import orjson
import dotenv
import humanize
from langsmith import Client
dotenv.load_dotenv()
def get_directory_size(directory: str) -> int:
"""Calculate total size of all JSONL files in directory."""
total_size = 0
for path in Path(directory).glob("*.jsonl"):
total_size += path.stat().st_size
return total_size
class LangsmithReplay:
@staticmethod
def replay_trace(run_ops_file: Path, logger: Client) -> None:
with open(run_ops_file, 'rb') as f:
for line in f:
operation = orjson.loads(line)
if operation["operation"] == "post":
create_params = {
"name": operation.get("name"),
"start_time": operation.get("start_time"),
"inputs": operation.get("inputs", {}),
"run_type": operation.get("run_type"),
"serialized": operation.get("serialized", {}),
"extra": operation.get("extra", {}),
"tags": operation.get("tags", []),
"trace_id": operation.get("trace_id"),
"dotted_order": operation.get("dotted_order"),
"parent_run_id": operation.get("parent_run_id"),
"id": operation.get('id'),
}
create_params = {k: v for k, v in create_params.items() if v is not None}
logger.create_run(**create_params)
elif operation["operation"] == "patch":
end_time = operation.get("end_time")
if end_time and isinstance(end_time, str):
try:
end_time = datetime.fromisoformat(end_time.replace('Z', '+00:00'))
except ValueError:
end_time = None
update_params = {
"run_id": operation.get("id"),
"end_time": end_time,
"outputs": operation.get("outputs", {}),
"error": operation.get("error"),
"trace_id": operation.get("trace_id"),
"dotted_order": operation.get("dotted_order"),
"parent_run_id": operation.get("parent_run_id"),
}
update_params = {k: v for k, v in update_params.items() if v is not None}
logger.update_run(**update_params)
def run_ls_benchmark(data_dir: str) -> Tuple[float, float, float, int]:
"""Run the Langsmith benchmark."""
logger = Client()
data_path = Path(data_dir)
run_ops_files = list(data_path.glob("processed_run_ops_*.jsonl"))
run_ops_files.sort()
num_traces = len(run_ops_files)
user_perceived_start = time.perf_counter()
for run_ops_file in run_ops_files:
try:
LangsmithReplay.replay_trace(run_ops_file, logger)
except Exception as e:
print(f"Error replaying {run_ops_file}: {str(e)}", file=sys.stderr)
user_perceived_time = time.perf_counter() - user_perceived_start
flush_start = time.perf_counter()
logger.flush()
flush_time = time.perf_counter() - flush_start
total_time = user_perceived_time + flush_time
return user_perceived_time, flush_time, total_time, num_traces
def format_results(ls_results: Tuple[float, float, float, int],
data_dir: str) -> str:
"""Format benchmark results."""
_, _, ls_total, num_traces = ls_results
total_size = get_directory_size(data_dir)
size_human = humanize.naturalsize(total_size)
avg_ls = ls_total / num_traces if num_traces else 0
return f"""\
Langsmith Benchmark Results
===========================
Time Breakdown:
Total time {ls_total:7.3f}s
Performance:
Total Traces {num_traces}
Total Size {size_human}
Avg time/trace {avg_ls:7.3f}s
"""
def main(data_dir: str):
print("Running Langsmith benchmark...")
ls_results = run_ls_benchmark(data_dir)
results = format_results(ls_results, data_dir)
print(results)
with open("benchmark_results_nested.txt", "w") as f:
f.write(results)
if __name__ == "__main__":
parser = argparse.ArgumentParser(
description="Benchmark nested tracing performance (runs properly nested under parents)",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Example:
python benchmark_nested.py data
"""
)
parser.add_argument(
"data_dir",
type=str,
help="Directory containing processed_run_ops_*.jsonl trace files"
)
args = parser.parse_args()
main(args.data_dir)
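`run_ls_benchmark` separates the time spent enqueuing runs (which the script calls user-perceived time) from the time spent in `flush()`. Below is a minimal sketch of that measurement pattern, assuming hypothetical `enqueue` and `flush` callables that stand in for the replay loop and `Client.flush()`.

```python
import time
from typing import Callable, Iterable, Tuple


def time_enqueue_and_flush(
    items: Iterable,
    enqueue: Callable[[object], None],
    flush: Callable[[], None],
) -> Tuple[float, float, float]:
    """Return (user_perceived, flush, total) durations in seconds."""
    start = time.perf_counter()
    for item in items:
        enqueue(item)
    user_perceived = time.perf_counter() - start

    flush_start = time.perf_counter()
    flush()
    flush_time = time.perf_counter() - flush_start

    return user_perceived, flush_time, user_perceived + flush_time


# A plain list plays the role of the client's background queue.
queue = []
user_s, flush_s, total_s = time_enqueue_and_flush(
    range(1000), queue.append, lambda: queue.clear()
)
print(f"enqueue {user_s:.6f}s, flush {flush_s:.6f}s, total {total_s:.6f}s")
```

Reporting the two phases separately, rather than only the total, would show whether time is spent in the calling thread or in draining the queue.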
+9
@@ -0,0 +1,9 @@
Langsmith Benchmark Results
===========================
Time Breakdown:
Total time 2.899s
Performance:
Total Traces 45
Total Size 145.6 MB
Avg time/trace 0.064s
+9
@@ -0,0 +1,9 @@
Langsmith Benchmark Results
===========================
Time Breakdown:
Total time 1.660s
Performance:
Total Traces 45
Total Size 145.6 MB
Avg time/trace 0.037s
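The `Avg time/trace` line in both result files is the total time divided by the number of traces, formatted with `7.3f`. Here is a quick check of that arithmetic using the figures reported above; `avg_time_per_trace` is a hypothetical helper name.

```python
def avg_time_per_trace(total_seconds: float, num_traces: int) -> float:
    """Average seconds per trace; 0 when there are no traces."""
    return total_seconds / num_traces if num_traces else 0


# Totals taken from the two result files above, each over 45 traces.
for total in (2.899, 1.660):
    print(f"Avg time/trace      {avg_time_per_trace(total, 45):7.3f}s")
```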
File diff suppressed because one or more lines are too long
+85
@@ -0,0 +1,85 @@
from langsmith import Client
import orjson
import dotenv
trace_ids = { # Replace with your trace ids
"1efcf279-c279-613f-b5c6-4705c507f3e7",
"1efcf279-6c93-6b57-8468-788f550a4171",
"1efcf237-d278-65a7-a2da-d75c39571b51",
"1efcf1f3-3ee1-6643-9513-f19ec6ea38cd",
"1efcf164-fcce-66e1-bb63-0673854ee9f4",
"1efcf15f-521f-642b-92b3-9cc39ad0e414",
"1efcf15f-2725-61b3-a8fc-ad30facd29bd",
"1efcf12d-d85d-6780-9300-cf174385be8d",
"1efcf134-212d-6350-af0e-01e79b5c5b02",
"1efcf115-bfd6-6945-a028-11213eca6861",
"1efcf108-9282-6966-8ef1-d0c5f16d23da",
"1efcf102-e767-6629-bd02-16bafbb1bfb4",
"1efcf0d6-43bf-6ff8-b12a-47a79c940b6e",
"1efcf0d4-2007-6727-be94-9884ee1c0cbe",
"1efcf0d2-5e61-6a0d-b6d3-e9666f398dfa",
"1efcf0cf-d5e7-6b82-9635-7ee2bb254b18",
"1efcf0cd-a49f-665b-9894-52d508078fc1",
"1efcf0be-fc26-6bee-9361-51e9631e9f99",
"1efcf09b-9bcb-68e6-8cc5-da2673832877",
"1efcf072-10ee-684d-ab63-51194ccc7955",
"1efcf06c-ec82-63c2-a2ec-1e69717454c1",
"1efcf05a-5bac-6bba-98d1-978315c98f98",
"1efcf04a-a9ae-60a6-b899-19ba02b2c8ea",
"1efcf01f-fb89-61bd-a8ba-1123f9f48458",
"1efcefe2-7fe1-6ff4-9ea8-300f343970e1",
"1efcef47-59e3-6165-bac0-1a6524c25453",
"1efceee0-a5ba-6a89-b1eb-181a839b2a28",
"1efceed5-37db-6423-a76f-974355024b85",
"1efceecb-b89e-6a62-8140-0d9edfbe5225",
"1efceec9-8695-6c1b-a4eb-eaab38afce6c",
"1efcee40-3791-6296-a156-5207b6f9291c",
}
def produce_run_ops_jsonl_files():
client = Client()
for trace_id in trace_ids:
results = client.list_runs(
project_name='example-project', # Replace with your project name
trace_id=trace_id,
)
results = list(results)
results.sort(key=lambda x: x.dotted_order)
with open(f'data/processed_run_ops_{trace_id}.jsonl', 'wb') as run_ops_file:
for run in results:
run_dict = dict(run)
post = {
"operation": "post",
"id": run_dict["id"],
"name": run_dict["name"],
"start_time": run_dict["start_time"],
"serialized": run_dict["serialized"],
"events": run_dict["events"],
"inputs": run_dict["inputs"],
"run_type": run_dict["run_type"],
"extra": run_dict["extra"],
"tags": run_dict["tags"],
"trace_id": run_dict["trace_id"],
"dotted_order": run_dict["dotted_order"],
"parent_run_id": run_dict["parent_run_id"],
}
run_ops_file.write(orjson.dumps(post))
run_ops_file.write(b'\n')
patch = {
"operation": "patch",
"id": run_dict["id"],
"name": run_dict["name"],
"end_time": run_dict["end_time"],
"error": run_dict["error"],
"outputs": run_dict["outputs"],
"trace_id": run_dict["trace_id"],
"dotted_order": run_dict["dotted_order"],
"parent_run_id": run_dict["parent_run_id"],
}
run_ops_file.write(orjson.dumps(patch))
run_ops_file.write(b'\n')
if __name__ == "__main__":
dotenv.load_dotenv()
produce_run_ops_jsonl_files()
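Each exported run becomes two JSONL lines: a `post` record carrying the start-side fields and a `patch` record carrying the end-side fields. Below is a minimal sketch of that split, assuming a plain dict in place of a LangSmith run object and the standard-library `json` module in place of `orjson`. `split_run`, the field subsets, and the sample values are hypothetical.

```python
import json

# Subsets of the fields written by the export script above.
POST_FIELDS = ("id", "name", "start_time", "inputs", "run_type",
               "trace_id", "dotted_order", "parent_run_id")
PATCH_FIELDS = ("id", "name", "end_time", "error", "outputs",
                "trace_id", "dotted_order", "parent_run_id")


def split_run(run: dict) -> tuple[dict, dict]:
    """Build the post and patch records for one run."""
    post = {"operation": "post", **{k: run.get(k) for k in POST_FIELDS}}
    patch = {"operation": "patch", **{k: run.get(k) for k in PATCH_FIELDS}}
    return post, patch


run = {
    "id": "run-1", "name": "root", "run_type": "chain",
    "start_time": "2025-01-01T00:00:00", "end_time": "2025-01-01T00:00:01",
    "inputs": {"q": "hi"}, "outputs": {"a": "hello"}, "error": None,
    "trace_id": "run-1", "dotted_order": "20250101T000000000000Zrun-1",
    "parent_run_id": None,
}
post, patch = split_run(run)
jsonl = "\n".join(json.dumps(record) for record in (post, patch)) + "\n"
print(jsonl)
```

Writing the `post` line before the `patch` line for every run matches the order in which the replay scripts expect to create and then update each run.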
+83
@@ -0,0 +1,83 @@
import orjson
from pathlib import Path
import dotenv
import time
from datetime import datetime
from langsmith import Client
dotenv.load_dotenv()
def replay_trace(run_ops_file: Path, logger: Client) -> None:
with open(run_ops_file, 'rb') as f:
for line in f:
operation = orjson.loads(line)
if operation["operation"] == "post":
create_params = {
"name": operation.get("name"),
"start_time": operation.get("start_time"),
"inputs": operation.get("inputs", {}),
"run_type": operation.get("run_type"),
"serialized": operation.get("serialized", {}),
"extra": operation.get("extra", {}),
"tags": operation.get("tags", []),
"trace_id": operation.get("trace_id"),
"dotted_order": operation.get("dotted_order"),
"parent_run_id": operation.get("parent_run_id"),
"id": operation.get('id'),
}
create_params = {k: v for k, v in create_params.items() if v is not None}
logger.create_run(**create_params)
elif operation["operation"] == "patch":
end_time = operation.get("end_time")
if end_time and isinstance(end_time, str):
try:
end_time = datetime.fromisoformat(end_time.replace('Z', '+00:00'))
except ValueError:
end_time = None
update_params = {
"run_id": operation.get("id"),
"end_time": end_time,
"outputs": operation.get("outputs", {}),
"error": operation.get("error"),
"trace_id": operation.get("trace_id"),
"dotted_order": operation.get("dotted_order"),
"parent_run_id": operation.get("parent_run_id"),
}
update_params = {k: v for k, v in update_params.items() if v is not None}
logger.update_run(**update_params)
def replay_all_traces(data_dir: str = "data") -> None:
    logger = Client()

    data_path = Path(data_dir)
    run_ops_files = sorted(data_path.glob("processed_run_ops_*.jsonl"))

    # Time spent enqueuing runs, before the background queue is drained.
    user_perceived_start = time.perf_counter()
    for run_ops_file in run_ops_files:
        try:
            replay_trace(run_ops_file, logger)
        except Exception as e:
            print(f"Error replaying {run_ops_file}: {str(e)}")
    user_perceived_time = time.perf_counter() - user_perceived_start
    print(f"User perceived time taken: {user_perceived_time} seconds")

    # Time spent waiting for queued runs to be sent.
    flush_start = time.perf_counter()
    logger.flush()
    flush_time = time.perf_counter() - flush_start
    print(f"Flush time: {flush_time} seconds")
if __name__ == "__main__":
start_time = time.perf_counter()
try:
replay_all_traces()
finally:
end_time = time.perf_counter()
print(f"Total time taken: {end_time - start_time} seconds")
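The `patch` branch converts `end_time` strings with a trailing `Z` into timezone-aware datetimes by rewriting the suffix as `+00:00`, which `datetime.fromisoformat` accepts. Here is a small sketch of that conversion on its own; `parse_end_time` is a hypothetical helper name.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_end_time(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 string, treating a trailing 'Z' as UTC."""
    if not value:
        return None
    try:
        return datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        # Same fallback as the replay code: unparseable values become None.
        return None


print(parse_end_time("2025-01-01T12:30:00.123456Z"))
print(parse_end_time("not a timestamp"))
```

Because `None` values are filtered out before `update_run` is called, an unparseable `end_time` is silently omitted from the update rather than raising.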
+50
@@ -0,0 +1,50 @@
import uuid
import re
from pathlib import Path
def generate_uuid_mapping(content):
uuid_pattern = r'[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}'
old_uuids = set(re.findall(uuid_pattern, content))
return {old: str(uuid.uuid4()) for old in old_uuids}
def update_json_content(content, uuid_mapping):
    updated_content = content
    # Every UUID is exactly 36 characters long, so one can never be a
    # substring of another and the replacement order does not matter.
    for old_uuid, new_uuid in uuid_mapping.items():
        updated_content = updated_content.replace(old_uuid, new_uuid)
    return updated_content
def process_json_file(input_path, output_path):
"""Process the JSON file and write updated content to output file."""
try:
with open(input_path, 'r') as file:
content = file.read()
uuid_mapping = generate_uuid_mapping(content)
updated_content = update_json_content(content, uuid_mapping)
with open(output_path, 'w') as file:
file.write(updated_content)
print(f"Successfully processed file. Output written to: {output_path}")
print(f"UUID mappings:")
for old, new in uuid_mapping.items():
print(f"Old: {old}")
print(f"New: {new}")
print("-" * 50)
except Exception as e:
print(f"Error processing file: {str(e)}")
# Process all files in data directory
data_path = Path("data")
run_ops_files = list(data_path.glob("*.jsonl"))
run_ops_files.sort()
for input_file in run_ops_files:
output_file = input_file.parent / f"{input_file.name}"
process_json_file(str(input_file), str(output_file))
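An alternative to replacing UUIDs one by one is a single `re.sub` pass with a callback, which assigns a fresh UUID the first time each old one is seen and reuses it afterwards. This is a sketch of that alternative, not the script's current approach. `remap_uuids` is a hypothetical name, and the pattern is the same lowercase-hex pattern used above.

```python
import re
import uuid

UUID_PATTERN = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def remap_uuids(content: str) -> tuple[str, dict[str, str]]:
    """Replace every UUID with a new one, consistently across the content."""
    mapping: dict[str, str] = {}

    def replace(match: re.Match) -> str:
        old = match.group(0)
        if old not in mapping:
            mapping[old] = str(uuid.uuid4())
        return mapping[old]

    return UUID_PATTERN.sub(replace, content), mapping


sample = (
    '{"id": "1efcf279-c279-613f-b5c6-4705c507f3e7", '
    '"trace_id": "1efcf279-c279-613f-b5c6-4705c507f3e7", '
    '"parent_run_id": "1efcf279-6c93-6b57-8468-788f550a4171"}'
)
updated, mapping = remap_uuids(sample)
print(updated)
```

A single pass scans the content once regardless of how many distinct UUIDs it contains, and replaced text is never rescanned.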
+116
@@ -0,0 +1,116 @@
import re
from pathlib import Path
from datetime import date, timedelta
def update_json_dates(content):
    """Shift every date in the trace content so that the latest one lands on today."""
    today = date.today()

    # One pattern covers start_time/end_time fields and start/end event times.
    date_pattern = re.compile(
        r'("(?:start_time|end_time)":\s*"|"name":"(?:start|end)","time":")'
        r'(\d{4}-\d{2}-\d{2})T'
    )

    # Collect all unique dates from the content
    all_dates = {match.group(2) for match in date_pattern.finditer(content)}
    if not all_dates:
        print("No dates found to update.")
        return content

    # Create date mapping: latest date -> today, second latest -> yesterday, etc.
    sorted_dates = sorted(all_dates)
    date_mapping = {}
    for i, old_date in enumerate(sorted_dates):
        days_back = len(sorted_dates) - 1 - i
        date_mapping[old_date] = (today - timedelta(days=days_back)).isoformat()

    print("Date mapping:")
    for old_date, new_date in date_mapping.items():
        print(f"  {old_date} -> {new_date}")

    # Apply the mapping in a single pass. Running one re.sub per mapping entry
    # can chain replacements when a new date equals another old date
    # (for example 2025-01-01 -> 2025-01-02 -> 2025-01-03).
    return date_pattern.sub(
        lambda m: f"{m.group(1)}{date_mapping[m.group(2)]}T", content
    )
def process_json_file(input_path, output_path):
"""Process the JSON file and write updated content to output file."""
try:
with open(input_path, 'r') as file:
content = file.read()
updated_content = update_json_dates(content)
with open(output_path, 'w') as file:
file.write(updated_content)
print(f"Successfully processed file. Output written to: {output_path}")
except Exception as e:
print(f"Error processing file: {str(e)}")
# Process all files in data directory
data_path = Path("data")
if data_path.exists():
run_ops_files = list(data_path.glob("*.jsonl"))
run_ops_files.sort()
for input_file in run_ops_files:
output_file = input_file.parent / f"{input_file.name}"
process_json_file(str(input_file), str(output_file))
else:
print(f"Directory not found: {data_path}. Make sure to run this script from the root directory of the project.")
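The date mapping ranks the distinct dates and counts back from today: the latest date becomes today, the next latest becomes yesterday, and so on, so gaps between the original dates are not preserved. Below is a minimal sketch of just that mapping, with `today` passed in as a parameter (an assumption made here so the result is reproducible). `build_date_mapping` is a hypothetical name.

```python
from datetime import date, timedelta


def build_date_mapping(old_dates: set[str], today: date) -> dict[str, str]:
    """Map sorted ISO dates onto consecutive days ending at `today`."""
    ordered = sorted(old_dates)
    return {
        old: (today - timedelta(days=len(ordered) - 1 - i)).isoformat()
        for i, old in enumerate(ordered)
    }


# Hypothetical dates with a gap between the 3rd and the 6th.
mapping = build_date_mapping(
    {"2025-01-06", "2025-01-02", "2025-01-03"}, today=date(2025, 11, 7)
)
print(mapping)
```

In this example the three-day gap between `2025-01-03` and `2025-01-06` collapses to a single day, which is the compression described above.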