mirror of
https://github.com/run-llama/semtools.git
synced 2026-06-30 21:38:00 -04:00
docs: update docs to unified interface
@@ -2,12 +2,12 @@
|
||||
|
||||
> Semantic search and document parsing tools for the command line
|
||||
|
||||
A collection of high-performance CLI tools for document processing and semantic search, built with Rust for speed and reliability.
|
||||
A high-performance CLI tool for document processing and semantic search, built with Rust for speed and reliability.
|
||||
|
||||
- **`parse`** - Parse documents (PDF, DOCX, etc.) using, by default, the LlamaParse API into markdown format
|
||||
- **`search`** - Local semantic keyword search using multilingual embeddings with cosine similarity matching and per-line context matching
|
||||
- **`ask`** - AI agent with search and read tools for answering questions over document collections (defaults to OpenAI, but see the [config section](#configuration) to learn more about connecting to any OpenAI-Compatible API)
|
||||
- **`workspace`** - Workspace management for accelerating search over large collections
|
||||
- **`semtools parse`** - Parse documents (PDF, DOCX, etc.) using, by default, the LlamaParse API into markdown format
|
||||
- **`semtools search`** - Local semantic keyword search using multilingual embeddings with cosine similarity matching and per-line context matching
|
||||
- **`semtools ask`** - AI agent with search and read tools for answering questions over document collections (defaults to OpenAI, but see the [config section](#configuration) to learn more about connecting to any OpenAI-Compatible API)
|
||||
- **`semtools workspace`** - Workspace management for accelerating search over large collections
|
||||
|
||||
**NOTE:** By default, `parse` uses LlamaParse as a backend. Get your API key today for free at [https://cloud.llamaindex.ai](https://cloud.llamaindex.ai). `search` and `workspace` remain local-only. `ask` requires an OpenAI API key.
|
||||
|
||||
@@ -25,7 +25,8 @@ A collection of high-performance CLI tools for document processing and semantic
|
||||
|
||||
Prerequisites:
|
||||
|
||||
- For the `parse` tool: LlamaIndex Cloud API key
|
||||
- For the `parse` subcommand: LlamaIndex Cloud API key
|
||||
- For the `ask` subcommand: OpenAI API key
|
||||
|
||||
Install:
|
||||
|
||||
@@ -53,41 +54,41 @@ Basic Usage:
|
||||
|
||||
```bash
|
||||
# Parse some files
|
||||
parse my_dir/*.pdf
|
||||
semtools parse my_dir/*.pdf
|
||||
|
||||
# Search some (text-based) files
|
||||
search "some keywords" *.txt --max-distance 0.3 --n-lines 5
|
||||
semtools search "some keywords" *.txt --max-distance 0.3 --n-lines 5
|
||||
|
||||
# Ask questions about your documents using an AI agent
|
||||
ask "What are the main findings?" papers/*.txt
|
||||
semtools ask "What are the main findings?" papers/*.txt
|
||||
|
||||
# Combine parsing and search
|
||||
parse my_docs/*.pdf | xargs search "API endpoints"
|
||||
semtools parse my_docs/*.pdf | xargs semtools search "API endpoints"
|
||||
|
||||
# Ask a question to a set of files
|
||||
ask "Some question?" *.txt
|
||||
semtools ask "Some question?" *.txt
|
||||
|
||||
# Combine parsing with the ask agent
|
||||
parse research_papers/*.pdf | xargs ask "Summarize the key methodologies"
|
||||
semtools parse research_papers/*.pdf | xargs semtools ask "Summarize the key methodologies"
|
||||
|
||||
# Ask based on stdin content
|
||||
cat README.md | ask "How do I install SemTools?"
|
||||
cat README.md | semtools ask "How do I install SemTools?"
|
||||
```
|
||||
|
||||
Advanced Usage:
|
||||
|
||||
```bash
|
||||
# Combine with grep for exact-match pre-filtering and distance thresholding
|
||||
parse *.pdf | xargs cat | grep -i "error" | search "network error" --max-distance 0.3
|
||||
semtools parse *.pdf | xargs cat | grep -i "error" | semtools search "network error" --max-distance 0.3
|
||||
|
||||
# Pipeline with content search (note the 'xargs' on search to search files instead of stdin)
|
||||
find . -name "*.md" | xargs parse | xargs search "installation"
|
||||
find . -name "*.md" | xargs semtools parse | xargs semtools search "installation"
|
||||
|
||||
# Combine with grep for filtering (grep could be before or after parse/search!)
|
||||
parse docs/*.pdf | xargs search "API" | grep -A5 "authentication"
|
||||
semtools parse docs/*.pdf | xargs semtools search "API" | grep -A5 "authentication"
|
||||
|
||||
# Save search results from stdin search
|
||||
parse report.pdf | xargs cat | search "summary" > results.txt
|
||||
semtools parse report.pdf | xargs cat | semtools search "summary" > results.txt
|
||||
```
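The pipelines above hand file paths from `semtools parse` to `xargs`, which by default splits its input on any whitespace, so a path containing a space would be broken into several arguments. The following is a minimal sketch of one workaround, assuming `semtools parse` prints one path per line (as the sample output in this document suggests) and that your `xargs` supports `-0` (GNU and BSD versions do). The helper name `with_paths` is hypothetical and not part of `semtools`.

```bash
# with_paths (hypothetical helper): read newline-delimited paths on stdin and
# append them as arguments to the given command, keeping spaces in paths intact.
with_paths() {
  tr '\n' '\0' | xargs -0 "$@"
}

# Only run the pipeline when semtools is actually installed.
if command -v semtools >/dev/null 2>&1; then
  semtools parse my_docs/*.pdf | with_paths semtools search "API endpoints"
fi
```

Because the paths are appended after the fixed arguments, the query stays in the first positional slot that `semtools search` expects.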
|
||||
|
||||
Using Workspaces:
|
||||
@@ -95,7 +96,7 @@ Using Workspaces:
|
||||
```bash
|
||||
# Create or select a workspace
|
||||
# Workspaces are stored in ~/.semtools/workspaces/
|
||||
workspace use my-workspace
|
||||
semtools workspace use my-workspace
|
||||
> Workspace 'my-workspace' configured.
|
||||
> To activate it, run:
|
||||
> export SEMTOOLS_WORKSPACE=my-workspace
|
||||
@@ -107,17 +108,17 @@ export SEMTOOLS_WORKSPACE=my-workspace
|
||||
|
||||
# All search commands will now use the workspace for caching embeddings
|
||||
# The initial command is used to initialize the workspace
|
||||
search "some keywords" ./some_large_dir/*.txt --n-lines 5 --top-k 10
|
||||
semtools search "some keywords" ./some_large_dir/*.txt --n-lines 5 --top-k 10
|
||||
|
||||
# If documents change, they are automatically re-embedded and cached
|
||||
echo "some new content" > ./some_large_dir/some_file.txt
|
||||
search "some keywords" ./some_large_dir/*.txt --n-lines 5 --top-k 10
|
||||
semtools search "some keywords" ./some_large_dir/*.txt --n-lines 5 --top-k 10
|
||||
|
||||
# If documents are removed, you can run prune to clean up stale files
|
||||
workspace prune
|
||||
semtools workspace prune
|
||||
|
||||
# You can see the stats of a workspace at any time
|
||||
workspace status
|
||||
semtools workspace status
|
||||
> Active workspace: arxiv
|
||||
> Root: /Users/loganmarkewich/.semtools/workspaces/arxiv
|
||||
> Documents: 3000
|
||||
@@ -127,10 +128,10 @@ workspace status
|
||||
## CLI Help
|
||||
|
||||
```bash
|
||||
$ parse --help
|
||||
$ semtools parse --help
|
||||
A CLI tool for parsing documents using various backends
|
||||
|
||||
Usage: parse [OPTIONS] <FILES>...
|
||||
Usage: semtools parse [OPTIONS] <FILES>...
|
||||
|
||||
Arguments:
|
||||
<FILES>... Files to parse
|
||||
@@ -140,33 +141,32 @@ Options:
|
||||
-b, --backend <BACKEND> The backend type to use for parsing. Defaults to `llama-parse` [default: llama-parse]
|
||||
-v, --verbose Verbose output while parsing
|
||||
-h, --help Print help
|
||||
-V, --version Print version
|
||||
```
|
||||
|
||||
```bash
|
||||
$ search --help
|
||||
$ semtools search --help
|
||||
A CLI tool for fast semantic keyword search
|
||||
|
||||
Usage: search [OPTIONS] <QUERY> [FILES]...
|
||||
Usage: semtools search [OPTIONS] <QUERY> [FILES]...
|
||||
|
||||
Arguments:
|
||||
<QUERY> Query to search for (positional argument)
|
||||
[FILES]... Files or directories to search
|
||||
[FILES]... Files to search, optional if using stdin
|
||||
|
||||
Options:
|
||||
-n, --n-lines <N_LINES> How many lines before/after to return as context [default: 3]
|
||||
--top-k <TOP_K> The top-k files or texts to return (ignored if max_distance is set) [default: 3]
|
||||
-m, --max-distance <MAX_DISTANCE> Return all results with distance below this threshold (0.0+)
|
||||
-i, --ignore-case Perform case-insensitive search (default is false)
|
||||
-j, --json Output results in JSON format
|
||||
-h, --help Print help
|
||||
-V, --version Print version
|
||||
```
|
||||
|
||||
```bash
|
||||
$ workspace --help
|
||||
$ semtools workspace --help
|
||||
Manage semtools workspaces
|
||||
|
||||
Usage: workspace <COMMAND>
|
||||
Usage: semtools workspace [OPTIONS] <COMMAND>
|
||||
|
||||
Commands:
|
||||
use Use or create a workspace (prints export command to run)
|
||||
@@ -175,15 +175,15 @@ Commands:
|
||||
help Print this message or the help of the given subcommand(s)
|
||||
|
||||
Options:
|
||||
-h, --help Print help
|
||||
-V, --version Print version
|
||||
-j, --json Output results in JSON format
|
||||
-h, --help Print help
|
||||
```
|
||||
|
||||
```bash
|
||||
$ ask --help
|
||||
A CLI tool for fast semantic keyword search
|
||||
$ semtools ask --help
|
||||
A CLI tool for document-based question-answering
|
||||
|
||||
Usage: ask [OPTIONS] <QUERY> [FILES]...
|
||||
Usage: semtools ask [OPTIONS] <QUERY> [FILES]...
|
||||
|
||||
Arguments:
|
||||
<QUERY> Query to prompt the agent with
|
||||
@@ -194,8 +194,9 @@ Options:
|
||||
--api-key <API_KEY> OpenAI API key (overrides config file and env var)
|
||||
--base-url <BASE_URL> OpenAI base URL (overrides config file)
|
||||
-m, --model <MODEL> Model to use for the agent (overrides config file)
|
||||
--api-mode <API_MODE> API mode to use: 'chat' or 'responses' (overrides config file)
|
||||
-j, --json Output results in JSON or text format
|
||||
-h, --help Print help
|
||||
-V, --version Print version
|
||||
```
|
||||
|
||||
## Configuration
|
||||
@@ -260,11 +261,11 @@ Configuration values are resolved in the following priority order (highest to lo
|
||||
|
||||
This allows you to set common defaults in the config file while overriding them on a per-command basis when needed.
|
||||
|
||||
### Tool-Specific Configuration
|
||||
### Subcommand-Specific Configuration
|
||||
|
||||
#### Parse Tool
|
||||
#### Parse Subcommand
|
||||
|
||||
The `parse` tool requires a LlamaParse API key. Get your free API key at [https://cloud.llamaindex.ai](https://cloud.llamaindex.ai).
|
||||
The `parse` subcommand requires a LlamaParse API key. Get your free API key at [https://cloud.llamaindex.ai](https://cloud.llamaindex.ai).
|
||||
|
||||
Configuration options:
|
||||
- `api_key`: Your LlamaParse API key
|
||||
@@ -273,9 +274,9 @@ Configuration options:
|
||||
- `parse_kwargs`: Additional parsing parameters
|
||||
- `check_interval`, `max_timeout`, `max_retries`, `retry_delay_ms`, `backoff_multiplier`: Retry and timeout settings
|
||||
|
||||
#### Ask Tool
|
||||
#### Ask Subcommand
|
||||
|
||||
The `ask` tool requires an OpenAI API key for the agent's LLM.
|
||||
The `ask` subcommand requires an OpenAI API key for the agent's LLM.
|
||||
|
||||
Configuration options:
|
||||
- `api_key`: Your OpenAI API key
|
||||
@@ -285,7 +286,7 @@ Configuration options:
|
||||
|
||||
You can also override these per-command:
|
||||
```bash
|
||||
ask "What is this about?" docs/*.txt --model gpt-4o --api-key sk-...
|
||||
semtools ask "What is this about?" docs/*.txt --model gpt-4o --api-key sk-...
|
||||
```
|
||||
|
||||
## Agent Use Case Examples
|
||||
|
||||
+45
-45
@@ -1,58 +1,58 @@
|
||||
# Augmented CLI Tooling
|
||||
|
||||
If executing bash commands, you have three very helpful utilities installed
|
||||
If executing bash commands, you have three very helpful subcommands installed within `semtools`:
|
||||
- `parse` -- converts any non-grep-able format into markdown and, for every input file, writes the path of the converted markdown file to stdout
|
||||
- `search` -- performs a search using static embeddings on either stdin or a list of files (very similar to grep). Works best with keyword based search queries. Only works with text-based files so it may require the `parse` tool to help preprocess into markdown.
|
||||
- `workspace` - Workspace management for accelerating search over large collections
|
||||
|
||||
Combined with other CLI commands, these subcommands let you search large numbers of files efficiently while handling various document formats. Both `parse` and `search` can scale to hundreds of thousands of documents.
|
||||
|
||||
## Parse CLI Help
|
||||
## Parse Subcommand Help
|
||||
|
||||
```bash
|
||||
parse --help
|
||||
$ semtools parse --help
|
||||
A CLI tool for parsing documents using various backends
|
||||
|
||||
Usage: parse [OPTIONS] <FILES>...
|
||||
Usage: semtools parse [OPTIONS] <FILES>...
|
||||
|
||||
Arguments:
|
||||
<FILES>... Files to parse
|
||||
|
||||
Options:
|
||||
-c, --parse-config <PARSE_CONFIG> Path to the config file. Defaults to ~/.parse_config.json
|
||||
-b, --backend <BACKEND> The backend type to use for parsing. Defaults to `llama-parse` [default: llama-parse]
|
||||
-h, --help Print help
|
||||
-V, --version Print version
|
||||
-c, --config <CONFIG> Path to the config file. Defaults to ~/.semtools_config.json
|
||||
-b, --backend <BACKEND> The backend type to use for parsing. Defaults to `llama-parse` [default: llama-parse]
|
||||
-v, --verbose Verbose output while parsing
|
||||
-h, --help Print help
|
||||
```
|
||||
|
||||
## Search CLI Help
|
||||
## Search Subcommand Help
|
||||
|
||||
```bash
|
||||
search --help
|
||||
$ semtools search --help
|
||||
A CLI tool for fast semantic keyword search
|
||||
|
||||
Usage: search [OPTIONS] <QUERY> [FILES]...
|
||||
Usage: semtools search [OPTIONS] <QUERY> [FILES]...
|
||||
|
||||
Arguments:
|
||||
<QUERY> Query to search for (positional argument)
|
||||
[FILES]... Files or directories to search
|
||||
[FILES]... Files to search, optional if using stdin
|
||||
|
||||
Options:
|
||||
-n, --n-lines <N_LINES> How many lines before/after to return as context [default: 3]
|
||||
--top-k <TOP_K> The top-k files or texts to return (ignored if max_distance is set) [default: 3]
|
||||
-m, --max-distance <MAX_DISTANCE> Return all results with distance below this threshold (0.0+)
|
||||
-i, --ignore-case Perform case-insensitive search (default is false)
|
||||
-j, --json Output results in JSON format
|
||||
-h, --help Print help
|
||||
-V, --version Print version
|
||||
```
|
||||
|
||||
## Workspaces CLI Help
|
||||
## Workspaces Subcommand Help
|
||||
|
||||
```bash
|
||||
workspace --help
|
||||
$ semtools workspace --help
|
||||
Manage semtools workspaces
|
||||
|
||||
Usage: workspace <COMMAND>
|
||||
Usage: semtools workspace [OPTIONS] <COMMAND>
|
||||
|
||||
Commands:
|
||||
use Use or create a workspace (prints export command to run)
|
||||
@@ -61,55 +61,55 @@ Commands:
|
||||
help Print this message or the help of the given subcommand(s)
|
||||
|
||||
Options:
|
||||
-h, --help Print help
|
||||
-V, --version Print version
|
||||
-j, --json Output results in JSON format
|
||||
-h, --help Print help
|
||||
```
|
||||
|
||||
|
||||
## Common Usage Patterns
|
||||
|
||||
The examples below use the `semtools` subcommands in place of the former standalone commands:
|
||||
|
||||
### Using Parse and Search
|
||||
|
||||
```bash
|
||||
# Parse a PDF and search for specific content
|
||||
parse document.pdf | xargs cat | search "error handling"
|
||||
semtools parse document.pdf | xargs cat | semtools search "error handling"
|
||||
|
||||
# Search within many files after parsing
|
||||
parse my_docs/*.pdf | xargs search "API endpoints"
|
||||
semtools parse my_docs/*.pdf | xargs semtools search "API endpoints"
|
||||
|
||||
# Search with custom context lines and a distance threshold
|
||||
search "machine learning" *.txt --n-lines 5 --max-distance 0.3
|
||||
semtools search "machine learning" *.txt --n-lines 5 --max-distance 0.3
|
||||
|
||||
# Search from stdin
|
||||
echo "some text content" | search "content"
|
||||
echo "some text content" | semtools search "content"
|
||||
|
||||
# Parse multiple documents
|
||||
parse report.pdf data.xlsx presentation.pptx
|
||||
semtools parse report.pdf data.xlsx presentation.pptx
|
||||
|
||||
# Chain parsing with semantic search
|
||||
parse *.pdf | xargs search "financial projections" --n-lines 3
|
||||
semtools parse *.pdf | xargs semtools search "financial projections" --n-lines 3
|
||||
|
||||
# Search with distance threshold (lower = more similar)
|
||||
parse document.pdf | xargs cat | search "revenue" --max-distance 0.2
|
||||
semtools parse document.pdf | xargs cat | semtools search "revenue" --max-distance 0.2
|
||||
|
||||
# Search multiple files directly
|
||||
search "error handling" src/*.rs --top-k 5
|
||||
semtools search "error handling" src/*.rs --top-k 5
|
||||
|
||||
# Combine with grep for exact-match pre-filtering and distance thresholding
|
||||
parse *.pdf | xargs cat | grep -i "error" | search "network error" --max-distance 0.3
|
||||
semtools parse *.pdf | xargs cat | grep -i "error" | semtools search "network error" --max-distance 0.3
|
||||
|
||||
# Pipeline with content search (note the 'xargs' on search to search files instead of stdin)
|
||||
find . -name "*.md" | xargs parse | xargs search "installation"
|
||||
find . -name "*.md" | xargs semtools parse | xargs semtools search "installation"
|
||||
```
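When the output of `semtools search` is long, it can help to look at the result headers first. The sketch below keeps only header lines, assuming they follow the `<source>:<start>::<end> (<distance>)` shape seen in sample output such as `<stdin>:447094::447105 (0.16900031210055844)`. The function name `result_headers` is hypothetical, and the pattern may need adjusting if the output format differs.

```bash
# result_headers (hypothetical helper): keep only result header lines of the
# assumed form "<source>:<start>::<end> (<distance>)" and drop context lines.
result_headers() {
  grep -E '^.+:[0-9]+::[0-9]+ \([0-9.eE+-]+\)$'
}

# Only run the search when semtools is actually installed.
if command -v semtools >/dev/null 2>&1; then
  semtools search "error handling" src/*.rs --top-k 5 | result_headers
fi
```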
|
||||
|
||||
|
||||
|
||||
### Using with Workspaces
|
||||
|
||||
```bash
|
||||
# Create or select a workspace
|
||||
# Workspaces are stored in ~/.semtools/workspaces/
|
||||
workspace use my-workspace
|
||||
semtools workspace use my-workspace
|
||||
> Workspace 'my-workspace' configured.
|
||||
> To activate it, run:
|
||||
> export SEMTOOLS_WORKSPACE=my-workspace
|
||||
@@ -121,44 +121,44 @@ export SEMTOOLS_WORKSPACE=my-workspace
|
||||
|
||||
# All search commands will now use the workspace for caching embeddings
|
||||
# The initial command is used to initialize the workspace
|
||||
search "some keywords" ./some_large_dir/*.txt --n-lines 5 --top-k 10
|
||||
semtools search "some keywords" ./some_large_dir/*.txt --n-lines 5 --top-k 10
|
||||
|
||||
# If documents change, they are automatically re-embedded and cached
|
||||
echo "some new content" > ./some_large_dir/some_file.txt
|
||||
search "some keywords" ./some_large_dir/*.txt --n-lines 5 --top-k 10
|
||||
|
||||
semtools search "some keywords" ./some_large_dir/*.txt --n-lines 5 --top-k 10
|
||||
|
||||
# A workspace example if you are using with parse
|
||||
# create a workspace
|
||||
workspace use my-workspace2
|
||||
semtools workspace use my-workspace2
|
||||
export SEMTOOLS_WORKSPACE=my-workspace2
|
||||
|
||||
# parse files, and then search over the parsed files, and cache the file embeddings
|
||||
parse *.pdf | xargs search "financial projections" --n-lines 3
|
||||
semtools parse *.pdf | xargs semtools search "financial projections" --n-lines 3
|
||||
|
||||
# if you run the command with a different query (see option a and b), over the same set of files, then search will operate
|
||||
# over the cached file embeddings
|
||||
# option a - parse won't rerun since files already cached
|
||||
parse *.pdf | xargs search "balance sheet" --n-lines 3
|
||||
semtools parse *.pdf | xargs semtools search "balance sheet" --n-lines 3
|
||||
|
||||
# option b - run search directly over the parse cache
|
||||
xargs search "balance sheet" /Users/jerryliu/.parse/*.pdf.md --n-lines 3
|
||||
semtools search "balance sheet" /Users/jerryliu/.parse/*.pdf.md --n-lines 3
|
||||
|
||||
# If documents are removed, you can run prune to clean up stale files
|
||||
workspace prune
|
||||
semtools workspace prune
|
||||
|
||||
# You can see the stats of a workspace at any time
|
||||
workspace status
|
||||
semtools workspace status
|
||||
> Active workspace: arxiv
|
||||
> Root: /Users/loganmarkewich/.semtools/workspaces/arxiv
|
||||
> Documents: 3000
|
||||
> Index: Yes (IVF_PQ)
|
||||
|
||||
```
|
||||
|
||||
|
||||
## Tips for using these tools
|
||||
|
||||
- If you have run / plan on running repeated `search` queries over the same file or set of files, you SHOULD create a workspace (`workspace use`) before running parse/search commands - otherwise you will be re-embedding the same document collections from scratch every time. Make sure the environment variable is set before downstream commands.
|
||||
- Before you create a workspace, you can check current workspace through `workspace status` which will also give the directory where all workspaces are stored.
|
||||
- You can choose to add a new workspace or prune an existing one if you are changing to a different collection of files. You can check the status through `workspace status`.
|
||||
- If you have run / plan on running repeated `search` queries over the same file or set of files, you SHOULD create a workspace (`semtools workspace use`) before running parse/search commands - otherwise you will be re-embedding the same document collections from scratch every time. Make sure the environment variable is set before downstream commands.
|
||||
- Before you create a workspace, you can check the current workspace with `semtools workspace status`, which also reports the directory where all workspaces are stored.
|
||||
- You can choose to add a new workspace or prune an existing one if you are changing to a different collection of files. You can check the status through `semtools workspace status`.
|
||||
- `parse` will always output the paths of parsed files to stdout. These parsed files are the markdown versions of the original files (for example, a PDF or DOCX file parsed into markdown).
|
||||
- ALWAYS call `parse` first when interacting with PDF (or similar) formats so that you can get the paths to the markdown versions of those files
|
||||
- `search` only works with text-based files (like markdown). It's a common pattern to first call `parse` and either feed files into `search` or cat files and search from stdin
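Because the workspace is only picked up when the environment variable is set, a small guard can catch a forgotten `export`. This is a sketch, not part of `semtools`: the function name `require_workspace` is hypothetical, and it only checks that `SEMTOOLS_WORKSPACE` (the variable printed by `semtools workspace use`) is non-empty.

```bash
# require_workspace (hypothetical helper): fail with a message on stderr
# when no workspace has been activated in the current shell.
require_workspace() {
  if [ -z "${SEMTOOLS_WORKSPACE:-}" ]; then
    echo "SEMTOOLS_WORKSPACE is not set; run 'semtools workspace use <name>' and export it first." >&2
    return 1
  fi
}

# Only search when semtools is installed and a workspace is active.
if command -v semtools >/dev/null 2>&1 && require_workspace; then
  semtools search "some keywords" ./some_large_dir/*.txt --n-lines 5 --top-k 10
fi
```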
|
||||
|
||||
@@ -42,10 +42,10 @@ In this example, I have a folder of 900+ papers (as PDFs) from an AI conference.
|
||||
First, I parse the PDFs (which caches them to disk at `~/.parse`):
|
||||
|
||||
```bash
|
||||
parse ./papers
|
||||
semtools parse ./papers
|
||||
```
|
||||
|
||||
Next, I invoke `claude` and ask it to summarize its takes on aspects of the papers. Using `parse`, `search`, and other common CLI commands, Claude is able to gather specific information from all of the papers.
|
||||
Next, I invoke `claude` and ask it to summarize its takes on aspects of the papers. Using `semtools parse`, `semtools search`, and other common CLI commands, Claude is able to gather specific information from all of the papers.
|
||||
|
||||
Below is the full trace from Claude Code, which demonstrates its ability to use `semtools` subcommands like `parse` and `search` to gather information from your files.
|
||||
|
||||
@@ -82,7 +82,7 @@ Below is the full trace from Claude-Code, which demonstrates it's ability to use
|
||||
⏺ I can see there are 910 ACL 2023 papers in PDF format. Now I'll parse all these PDFs to make them searchable, then look for papers
|
||||
discussing LLMs and evaluations.
|
||||
|
||||
⏺ Bash(parse *.pdf)
|
||||
⏺ Bash(semtools parse *.pdf)
|
||||
⎿ /Users/loganmarkewich/.parse/2023.acl-long.0.pdf.md
|
||||
/Users/loganmarkewich/.parse/2023.acl-long.1.pdf.md
|
||||
/Users/loganmarkewich/.parse/2023.acl-long.10.pdf.md
|
||||
@@ -96,7 +96,7 @@ Below is the full trace from Claude-Code, which demonstrates it's ability to use
|
||||
|
||||
⏺ Great! Now I have parsed all the PDFs. Let me search for papers related to LLMs and evaluations using the search tool.
|
||||
|
||||
⏺ Bash(parse *.pdf | xargs cat | search "large language model, evaluation, assessment, benchmark" --n-lines 5 --max-distance 0.4)
|
||||
⏺ Bash(semtools parse *.pdf | xargs cat | semtools search "large language model, evaluation, assessment, benchmark" --n-lines 5 --max-distance 0.4)
|
||||
⎿ <stdin>:447094::447105 (0.16900031210055844)
|
||||
447095: for Computational Linguistics.
|
||||
447096:
|
||||
@@ -104,7 +104,7 @@ Below is the full trace from Claude-Code, which demonstrates it's ability to use
|
||||
|
||||
⏺ Now let me search for more specific terms to get a broader view of LLM evaluation research:
|
||||
|
||||
⏺ Bash(parse *.pdf | xargs cat | search "GPT, BERT, language model evaluation, performance evaluation, model comparison" --n-lines 5
|
||||
⏺ Bash(semtools parse *.pdf | xargs cat | semtools search "GPT, BERT, language model evaluation, performance evaluation, model comparison" --n-lines 5
|
||||
--max-distance 0.35)
|
||||
⎿ <stdin>:244769::244780 (0.2719643874147169)
|
||||
244770: <td>
|
||||
@@ -113,7 +113,7 @@ Below is the full trace from Claude-Code, which demonstrates it's ability to use
|
||||
|
||||
⏺ Now let me search for specific types of evaluation studies and findings:
|
||||
|
||||
⏺ Bash(parse *.pdf | xargs cat | search "robustness, bias, fairness, hallucination, reliability" --n-lines 4 --max-distance 0.3)
|
||||
⏺ Bash(semtools parse *.pdf | xargs cat | semtools search "robustness, bias, fairness, hallucination, reliability" --n-lines 4 --max-distance 0.3)
|
||||
⎿ Using cached result for: 2023.acl-long.0.pdf
|
||||
Using cached result for: 2023.acl-long.1.pdf
|
||||
Using cached result for: 2023.acl-long.10.pdf
|
||||
@@ -129,7 +129,7 @@ Below is the full trace from Claude-Code, which demonstrates it's ability to use
|
||||
|
||||
⏺ Let me search for paper titles and abstracts to get a better understanding of the specific papers discussing LLM evaluation:
|
||||
|
||||
⏺ Bash(parse *.pdf | xargs cat | search "Abstract" --n-lines 10 | grep -A 10 -B 2 -i "large language model\|evaluation\|benchmark\|GPT\|BERT")
|
||||
⏺ Bash(semtools parse *.pdf | xargs cat | semtools search "Abstract" --n-lines 10 | grep -A 10 -B 2 -i "large language model\|evaluation\|benchmark\|GPT\|BERT")
|
||||
⎿ <stdin>:94456::94477 (0.00000000009809730805202874)
|
||||
94457: Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought
|
||||
94458: Reasoning by Large Language Models
|
||||
@@ -203,5 +203,5 @@ This run cost $0.68 and took about 4 minutes. Not bad for researching across 900
|
||||
## Tips
|
||||
|
||||
- Your `CLAUDE.md` / `AGENTS.md` / etc. is critical to ensure your agent knows about `semtools`. Giving it good examples of usage (maybe from this guide) is a good idea.
|
||||
- Parsing files can take a while. I recommend using `parse` ahead of time if you intend to give your agent a lot of files to work with that need to be parsed. Results will be cached to disk at `~/.parse`.
|
||||
- Parsing files can take a while. I recommend using `semtools parse` ahead of time if you intend to give your agent a lot of files to work with that need to be parsed. Results will be cached to disk at `~/.parse`.
|
||||
- Giving your agent any context on the files it has access to will also help it understand how to help you with them.
|
||||
|
||||
@@ -59,7 +59,7 @@ Here's an example where I prompt it to interact with a directory of 900+ PDF fil
|
||||
First, I parse the PDFs (which caches them to disk at `~/.parse`):
|
||||
|
||||
```bash
|
||||
parse ./papers
|
||||
semtools parse ./papers
|
||||
```
|
||||
|
||||
Then I can write a script to call an agent to interact with the files and search for information:
|
||||
@@ -109,13 +109,13 @@ Which outputs:
|
||||
$ python ./agent.py
|
||||
Calling tool execute_bash({'command': 'find . -name "*.pdf" | head -10'})
|
||||
Calling tool execute_bash({'command': 'find . -name "*.pdf" | wc -l'})
|
||||
Calling tool execute_bash({'command': 'find . -name "*.pdf" | xargs parse'})
|
||||
Calling tool execute_bash({'command': 'find /Users/loganmarkewich/.parse -name "*.md" | xargs search "large language model, LLM, evaluation, benchmark" --top-k 10 --n-lines 5'})
|
||||
Calling tool execute_bash({'command': 'find /Users/loganmarkewich/.parse -name "*.md" | xargs search "GPT, ChatGPT, instruction tuning, evaluation benchmark, model evaluation" --top-k 15 --n-lines 4'})
|
||||
Calling tool execute_bash({'command': 'find /Users/loganmarkewich/.parse -name "*.md" | xargs search "human evaluation, automatic evaluation, BLEU, ROUGE, BERTScore, evaluation metric" --top-k 10 --n-lines 4'})
|
||||
Calling tool execute_bash({'command': 'find /Users/loganmarkewich/.parse -name "*.md" | xargs search "scaling laws, emergent abilities, few-shot learning, in-context learning, prompt engineering" --top-k 10 --n-lines 4'})
|
||||
Calling tool execute_bash({'command': 'find /Users/loganmarkewich/.parse -name "*.md" | xargs search "title, abstract" --top-k 5 --n-lines 10 | head -50'})
|
||||
Calling tool execute_bash({'command': 'find /Users/loganmarkewich/.parse -name "*.md" | xargs search "# " --top-k 20 --n-lines 2 | grep -E "(LLM|evaluation|benchmark|GPT|language model)" | head -20'})
|
||||
Calling tool execute_bash({'command': 'find . -name "*.pdf" | xargs semtools parse'})
|
||||
Calling tool execute_bash({'command': 'find /Users/loganmarkewich/.parse -name "*.md" | xargs semtools search "large language model, LLM, evaluation, benchmark" --top-k 10 --n-lines 5'})
|
||||
Calling tool execute_bash({'command': 'find /Users/loganmarkewich/.parse -name "*.md" | xargs semtools search "GPT, ChatGPT, instruction tuning, evaluation benchmark, model evaluation" --top-k 15 --n-lines 4'})
|
||||
Calling tool execute_bash({'command': 'find /Users/loganmarkewich/.parse -name "*.md" | xargs semtools search "human evaluation, automatic evaluation, BLEU, ROUGE, BERTScore, evaluation metric" --top-k 10 --n-lines 4'})
|
||||
Calling tool execute_bash({'command': 'find /Users/loganmarkewich/.parse -name "*.md" | xargs semtools search "scaling laws, emergent abilities, few-shot learning, in-context learning, prompt engineering" --top-k 10 --n-lines 4'})
|
||||
Calling tool execute_bash({'command': 'find /Users/loganmarkewich/.parse -name "*.md" | xargs semtools search "title, abstract" --top-k 5 --n-lines 10 | head -50'})
|
||||
Calling tool execute_bash({'command': 'find /Users/loganmarkewich/.parse -name "*.md" | xargs semtools search "# " --top-k 20 --n-lines 2 | grep -E "(LLM|evaluation|benchmark|GPT|language model)" | head -20'})
|
||||
Calling tool execute_bash({'command': 'find /Users/loganmarkewich/.parse -name "*.md" | head -5 | xargs -I {} sh -c \'echo "=== {} ===" && head -20 "{}"\''})
|
||||
Based on my analysis of the ACL 2023 papers, I can now provide a comprehensive summary of the key findings about LLMs and evaluations:
|
||||
|
||||
@@ -182,4 +182,3 @@ The ACL 2023 papers collectively suggest that:
|
||||
|
||||
The research community is moving toward more nuanced, multi-dimensional evaluation approaches that better capture the complexities of language understanding and generation in real-world applications.
|
||||
```
|
||||
|
||||
|
||||