Compare commits

14 Commits

Author SHA1 Message Date
Logan Markewich 59cf8d4165 remove llama-extract 2025-02-05 12:54:43 -06:00
Logan Markewich 684758b770 remove report pdf 2025-02-04 21:21:02 -06:00
Logan Markewich c37325b496 make e2e work 2025-02-04 21:09:29 -06:00
Logan Markewich 2c4d5f5bb8 small nit 2025-02-04 15:37:32 -06:00
Logan Markewich bf9a26b4ad imports 2025-02-03 22:47:13 -06:00
Logan Markewich c98239cb7a readme organize 2025-02-03 10:04:21 -06:00
Logan Markewich ec9b7331d6 except in teardown 2025-01-31 20:37:22 -06:00
Logan Markewich f4b3f10c95 tests 2025-01-31 19:00:38 -06:00
Logan Markewich 02cb622d94 tests 2025-01-31 09:01:18 -06:00
Logan Markewich f79263e25e tests 2025-01-31 08:51:02 -06:00
Logan Markewich 9b5cc20d7f types 2025-01-31 08:42:57 -06:00
Logan Markewich c1f37bba2a types 2025-01-31 08:39:10 -06:00
Logan Markewich 13cf7dbb15 wip 2025-01-30 17:33:21 -06:00
Logan Markewich 04312cc066 wip 2025-01-30 17:32:02 -06:00
116 changed files with 5762 additions and 970 deletions
+1 -1
@@ -45,4 +45,4 @@ jobs:
- name: Test import
shell: bash
working-directory: ${{ vars.RUNNER_TEMP }}
run: python -c "import llama_parse"
run: python -c "import llama_cloud_services"
+17 -1
@@ -23,16 +23,31 @@ jobs:
uses: actions/setup-python@v4
with:
python-version: ${{ env.PYTHON_VERSION }}
- name: Install Poetry
uses: snok/install-poetry@v1
with:
version: ${{ env.POETRY_VERSION }}
- name: Install deps
shell: bash
run: pip install -e .
- name: Build and publish to pypi
- name: Build and publish llama-cloud-services
uses: JRubics/poetry-publish@v2.1
with:
poetry_version: ${{ env.POETRY_VERSION }}
python_version: ${{ env.PYTHON_VERSION }}
working_directory: "llama_cloud_services"
pypi_token: ${{ secrets.LLAMA_PARSE_PYPI_TOKEN }}
poetry_install_options: "--without dev"
- name: Build and publish llama-parse
uses: JRubics/poetry-publish@v2.1
with:
poetry_version: ${{ env.POETRY_VERSION }}
python_version: ${{ env.PYTHON_VERSION }}
working_directory: "llama_parse"
pypi_token: ${{ secrets.LLAMA_PARSE_PYPI_TOKEN }}
poetry_install_options: "--without dev"
@@ -52,6 +67,7 @@ jobs:
export PKG=$(ls dist/ | grep tar)
set -- $PKG
echo "name=$1" >> $GITHUB_ENV
- name: Upload Release Asset (sdist) to GitHub
id: upload-release-asset
uses: actions/upload-release-asset@v1
+2 -1
@@ -33,6 +33,7 @@ repos:
rev: v1.0.1
hooks:
- id: mypy
exclude: ^tests/
additional_dependencies:
[
"types-requests",
@@ -46,7 +47,7 @@ repos:
[
--disallow-untyped-defs,
--ignore-missing-imports,
--python-version=3.8,
--python-version=3.10,
]
- repo: https://github.com/adamchainz/blacken-docs
rev: 1.16.0
+22 -137
@@ -1,158 +1,45 @@
# LlamaParse
[![PyPI - Downloads](https://img.shields.io/pypi/dm/llama-parse)](https://pypi.org/project/llama-parse/)
[![GitHub contributors](https://img.shields.io/github/contributors/run-llama/llama_parse)](https://github.com/run-llama/llama_parse/graphs/contributors)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/llama-cloud-services)](https://pypi.org/project/llama-cloud-services/)
[![GitHub contributors](https://img.shields.io/github/contributors/run-llama/llama_cloud_services)](https://github.com/run-llama/llama_cloud_services/graphs/contributors)
[![Discord](https://img.shields.io/discord/1059199217496772688)](https://discord.gg/dGcwcsnxhU)
LlamaParse is a **GenAI-native document parser** that can parse complex document data for any downstream LLM use case (RAG, agents).
# Llama Cloud Services
It is really good at the following:
This repository contains the code for hand-written SDKs and clients for interacting with LlamaCloud.
-**Broad file type support**: Parsing a variety of unstructured file types (.pdf, .pptx, .docx, .xlsx, .html) with text, tables, visual elements, weird layouts, and more.
-**Table recognition**: Parsing embedded tables accurately into text and semi-structured representations.
-**Multimodal parsing and chunking**: Extracting visual elements (images/diagrams) into structured formats and return image chunks using the latest multimodal models.
-**Custom parsing**: Input custom prompt instructions to customize the output the way you want it.
This includes:
LlamaParse directly integrates with [LlamaIndex](https://github.com/run-llama/llama_index).
The free plan is up to 1000 pages a day. Paid plan is free 7k pages per week + 0.3c per additional page by default. There is a sandbox available to test the API [**https://cloud.llamaindex.ai/parse ↗**](https://cloud.llamaindex.ai/parse).
Read below for some quickstart information, or see the [full documentation](https://docs.cloud.llamaindex.ai/).
If you're a company interested in enterprise RAG solutions, and/or high volume/on-prem usage of LlamaParse, come [talk to us](https://www.llamaindex.ai/contact).
- [LlamaParse](./parse.md) - A GenAI-native document parser that can parse complex document data for any downstream LLM use case (Agents, RAG, data processing, etc.).
- [LlamaReport (beta/invite-only)](./report.md) - A prebuilt agentic report builder that can be used to build reports from a variety of data sources.
- [LlamaExtract (coming soon!)]() - A prebuilt agentic data extractor that can be used to transform data into a structured JSON representation.
## Getting Started
First, login and get an api-key from [**https://cloud.llamaindex.ai/api-key ↗**](https://cloud.llamaindex.ai/api-key).
Then, make sure you have the latest LlamaIndex version installed.
**NOTE:** If you are upgrading from v0.9.X, we recommend following our [migration guide](https://pretty-sodium-5e0.notion.site/v0-10-0-Migration-Guide-6ede431dcb8841b09ea171e7f133bd77), as well as uninstalling your previous version first.
```
pip uninstall llama-index # run this if upgrading from v0.9.x or older
pip install -U llama-index --upgrade --no-cache-dir --force-reinstall
```
Lastly, install the package:
`pip install llama-parse`
Now you can parse your first PDF file using the command line interface. Use the command `llama-parse [file_paths]`. See the help text with `llama-parse --help`.
Install the package:
```bash
export LLAMA_CLOUD_API_KEY='llx-...'
# output as text
llama-parse my_file.pdf --result-type text --output-file output.txt
# output as markdown
llama-parse my_file.pdf --result-type markdown --output-file output.md
# output as raw json
llama-parse my_file.pdf --output-raw-json --output-file output.json
pip install llama-cloud-services
```
You can also create simple scripts:
Then, get your API key from [LlamaCloud](https://cloud.llamaindex.ai/).
Then, you can use the services in your code:
```python
import nest_asyncio
from llama_cloud_services import LlamaParse, LlamaReport
nest_asyncio.apply()
from llama_parse import LlamaParse
parser = LlamaParse(
api_key="llx-...", # can also be set in your env as LLAMA_CLOUD_API_KEY
result_type="markdown", # "markdown" and "text" are available
num_workers=4, # if multiple files passed, split in `num_workers` API calls
verbose=True,
language="en", # Optionally you can define a language, default=en
)
# sync
documents = parser.load_data("./my_file.pdf")
# sync batch
documents = parser.load_data(["./my_file1.pdf", "./my_file2.pdf"])
# async
documents = await parser.aload_data("./my_file.pdf")
# async batch
documents = await parser.aload_data(["./my_file1.pdf", "./my_file2.pdf"])
parser = LlamaParse(api_key="YOUR_API_KEY")
report = LlamaReport(api_key="YOUR_API_KEY")
```
## Using with file object
See the quickstart guides for each service for more information:
You can parse a file object directly:
```python
import nest_asyncio
nest_asyncio.apply()
from llama_parse import LlamaParse
parser = LlamaParse(
api_key="llx-...", # can also be set in your env as LLAMA_CLOUD_API_KEY
result_type="markdown", # "markdown" and "text" are available
num_workers=4, # if multiple files passed, split in `num_workers` API calls
verbose=True,
language="en", # Optionally you can define a language, default=en
)
file_name = "my_file1.pdf"
extra_info = {"file_name": file_name}
with open(f"./{file_name}", "rb") as f:
# must provide extra_info with file_name key with passing file object
documents = parser.load_data(f, extra_info=extra_info)
# you can also pass file bytes directly
with open(f"./{file_name}", "rb") as f:
file_bytes = f.read()
# must provide extra_info with file_name key with passing file bytes
documents = parser.load_data(file_bytes, extra_info=extra_info)
```
## Using with `SimpleDirectoryReader`
You can also integrate the parser as the default PDF loader in `SimpleDirectoryReader`:
```python
import nest_asyncio
nest_asyncio.apply()
from llama_parse import LlamaParse
from llama_index.core import SimpleDirectoryReader
parser = LlamaParse(
api_key="llx-...", # can also be set in your env as LLAMA_CLOUD_API_KEY
result_type="markdown", # "markdown" and "text" are available
verbose=True,
)
file_extractor = {".pdf": parser}
documents = SimpleDirectoryReader(
"./data", file_extractor=file_extractor
).load_data()
```
Full documentation for `SimpleDirectoryReader` can be found on the [LlamaIndex Documentation](https://docs.llamaindex.ai/en/stable/module_guides/loading/simpledirectoryreader.html).
## Examples
Several end-to-end indexing examples can be found in the examples folder
- [Getting Started](examples/demo_basic.ipynb)
- [Advanced RAG Example](examples/demo_advanced.ipynb)
- [Raw API Usage](examples/demo_api.ipynb)
- [LlamaParse](./parse.md)
- [LlamaReport (beta/invite-only)](./report.md)
- [LlamaExtract (coming soon!)]()
## Documentation
[https://docs.cloud.llamaindex.ai/](https://docs.cloud.llamaindex.ai/)
You can see complete SDK and API documentation for each service on [our official docs](https://docs.cloud.llamaindex.ai/).
## Terms of Service
@@ -160,6 +47,4 @@ See the [Terms of Service Here](./TOS.pdf).
## Get in Touch (LlamaCloud)
LlamaParse is part of LlamaCloud, our e2e enterprise RAG platform that provides out-of-the-box, production-ready connectors, indexing, and retrieval over your complex data sources. We offer SaaS and VPC options.
LlamaCloud is currently available via waitlist (join by [creating an account](https://cloud.llamaindex.ai/)). If you're interested in state-of-the-art quality and in centralizing your RAG efforts, come [get in touch with us](https://www.llamaindex.ai/contact).
You can get in touch with us by following our [contact link](https://www.llamaindex.ai/contact).
@@ -53,7 +53,7 @@
"source": [
"!pip install llama-index\n",
"!pip install llama-index-core\n",
"!pip install llama-parse"
"!pip install llama-cloud-services"
]
},
{
@@ -190,7 +190,7 @@
"metadata": {},
"outputs": [],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"parser = LlamaParse(result_type=\"markdown\")"
]

@@ -22,7 +22,7 @@
"metadata": {},
"outputs": [],
"source": [
"!pip install llama-parse llama-index llama-index-postprocessor-sbert-rerank"
"!pip install llama-cloud-services llama-index llama-index-postprocessor-sbert-rerank"
]
},
{
@@ -82,7 +82,7 @@
"metadata": {},
"outputs": [],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"parser = LlamaParse(\n",
" result_type=\"markdown\",\n",
@@ -81,7 +81,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"docs = LlamaParse(result_type=\"text\").load_data(\"./caltrain_schedule_weekend.pdf\")"
]
@@ -26,7 +26,7 @@
"!pip install llama-index-embeddings-openai\n",
"!pip install llama-index-postprocessor-flag-embedding-reranker\n",
"!pip install git+https://github.com/FlagOpen/FlagEmbedding.git\n",
"!pip install llama-parse"
"!pip install llama-cloud-services"
]
},
{
@@ -108,7 +108,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"documents = LlamaParse(result_type=\"markdown\").load_data(\"./apple_2021_10k.pdf\")"
]
@@ -22,7 +22,7 @@
"%pip install llama-index-embeddings-openai\n",
"%pip install llama-index-postprocessor-flag-embedding-reranker\n",
"%pip install git+https://github.com/FlagOpen/FlagEmbedding.git\n",
"%pip install llama-parse\n",
"%pip install llama-cloud-services\n",
"%pip install llama-index-vector-stores-astra-db"
]
},
@@ -107,7 +107,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"documents = LlamaParse(result_type=\"markdown\").load_data(\"./uber_10q_march_2022.pdf\")"
]
@@ -176,7 +176,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"documents = LlamaParse(result_type=\"markdown\").load_data(\"./uber_10q_march_2022.pdf\")"
]
@@ -130,7 +130,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"documents = LlamaParse(result_type=\"text\").load_data(file_path)"
]
@@ -73,7 +73,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"documents = LlamaParse(result_type=\"text\").load_data(\"./attention.pdf\")"
]
@@ -120,7 +120,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"documents = LlamaParse(result_type=\"markdown\").load_data(\"./attention.pdf\")"
]
@@ -142,7 +142,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"documents = LlamaParse(result_type=\"text\").load_data(file_path)"
]
@@ -21,7 +21,7 @@
"outputs": [],
"source": [
"%pip install llama-index\n",
"%pip install llama-parse"
"%pip install llama-cloud-services"
]
},
{
@@ -41,7 +41,7 @@
"\n",
"nest_asyncio.apply()\n",
"\n",
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"api_key = \"llx-\" # get from cloud.llamaindex.ai"
]
@@ -32,7 +32,7 @@
"outputs": [],
"source": [
"!pip install llama-index-core\n",
"!pip install llama-parse"
"!pip install llama-cloud-services"
]
},
{
@@ -119,7 +119,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"parser = LlamaParse(extract_charts=True, invalidate_cache=True)\n",
"json_objs = parser.get_json_result(\"./agentless.pdf\")"
@@ -116,7 +116,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"documents = LlamaParse(result_type=\"markdown\").load_data(\"./policy.pdf\")"
]
@@ -35,7 +35,7 @@
"!pip install llama-index-core\n",
"!pip install llama-index-llms-anthropic llama-index-multi-modal-llms-anthropic\n",
"!pip install llama-index-embeddings-huggingface\n",
"!pip install llama-parse"
"!pip install llama-cloud-services"
]
},
{
@@ -129,7 +129,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"parser = LlamaParse(verbose=True)\n",
"json_objs = parser.get_json_result(\"./uber_10q_march_2022.pdf\")\n",
@@ -37,7 +37,7 @@
"%pip install llama-index-core\n",
"%pip install llama-index-llms-anthropic llama-index-multi-modal-llms-anthropic\n",
"%pip install llama-index-embeddings-huggingface\n",
"%pip install llama-parse"
"%pip install llama-cloud-services"
]
},
{
@@ -110,7 +110,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"parser = LlamaParse(verbose=True)\n",
"json_objs = parser.get_json_result(\"./uber_10q_march_2022.pdf\")\n",
@@ -32,7 +32,7 @@
"outputs": [],
"source": [
"!pip install llama-index-core\n",
"!pip install llama-parse"
"!pip install llama-cloud-services"
]
},
{
@@ -125,7 +125,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"parser = LlamaParse(verbose=True, premium_mode=True)\n",
"json_objs = parser.get_json_result(\"./san_francisco_budget_2023.pdf\")"
@@ -77,7 +77,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"parser = LlamaParse(result_type=\"text\", language=\"fr\")\n",
"documents = parser.load_data(\"./treasury_report.pdf\")"
@@ -250,7 +250,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"parser = LlamaParse(result_type=\"text\", language=\"ch_sim\")\n",
"documents = parser.load_data(\"./chinese_pdf.pdf\")"
@@ -404,7 +404,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"base_parser = LlamaParse(result_type=\"text\", language=\"en\")\n",
"base_documents = parser.load_data(\"./chinese_pdf2.pdf\")"
@@ -69,7 +69,7 @@
"import pymongo\n",
"\n",
"from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch\n",
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"from llama_index.embeddings.openai import OpenAIEmbedding\n",
"from llama_index.core import VectorStoreIndex, StorageContext\n",
"from llama_index.core.node_parser import SimpleNodeParser"
@@ -114,7 +114,7 @@
}
],
"source": [
"%pip install llama-parse"
"%pip install llama-cloud-services"
]
},
{
@@ -169,7 +169,7 @@
"metadata": {},
"outputs": [],
"source": [
"from llama_parse import LlamaParse"
"from llama_cloud_services import LlamaParse"
]
},
{
@@ -35,7 +35,7 @@
"metadata": {},
"outputs": [],
"source": [
"!pip install llama-parse"
"!pip install llama-cloud-services"
]
},
{
@@ -160,7 +160,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"parser = LlamaParse(\n",
" result_type=\"markdown\",\n",
@@ -199,7 +199,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"parser_gpt4o = LlamaParse(\n",
" result_type=\"markdown\",\n",
@@ -31,7 +31,7 @@
"metadata": {},
"outputs": [],
"source": [
"!pip install llama-parse"
"!pip install llama-cloud-services"
]
},
{
@@ -117,7 +117,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"parser = LlamaParse(target_pages=\"0,1,2\", result_type=\"markdown\")\n",
"\n",
@@ -34,7 +34,7 @@
"%pip install llama-index-question-gen-openai\n",
"%pip install llama-index-postprocessor-flag-embedding-reranker\n",
"%pip install git+https://github.com/FlagOpen/FlagEmbedding.git\n",
"%pip install llama-parse"
"%pip install llama-cloud-services"
]
},
{
@@ -109,7 +109,7 @@
"metadata": {},
"outputs": [],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"docs_2021 = LlamaParse(result_type=\"markdown\").load_data(\"./apple_2021_10k.pdf\")\n",
"docs_2020 = LlamaParse(result_type=\"markdown\").load_data(\"./apple_2020_10k.pdf\")"
@@ -31,7 +31,7 @@
"outputs": [],
"source": [
"%pip install llama-index\n",
"%pip install llama-parse"
"%pip install llama-cloud-services"
]
},
{
@@ -53,7 +53,7 @@
"metadata": {},
"outputs": [],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"# api_key = \"llx-\" # get from cloud.llamaindex.ai"
]
@@ -37,7 +37,7 @@
"outputs": [],
"source": [
"# !pip install llama-index\n",
"# !pip install llama-parse"
"# !pip install llama-cloud-services"
]
},
{
@@ -59,7 +59,7 @@
"from llama_index.core import VectorStoreIndex\n",
"from IPython.display import Image, Markdown\n",
"\n",
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"from llama_index.core.node_parser import MarkdownElementNodeParser"
]

@@ -33,7 +33,7 @@
"!pip install llama-index-postprocessor-flag-embedding-reranker\n",
"!pip install git+https://github.com/FlagOpen/FlagEmbedding.git\n",
"!pip install llama-index-graph-stores-neo4j\n",
"!pip install llama-parse"
"!pip install llama-cloud-services"
]
},
{
@@ -125,7 +125,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"docs = LlamaParse(result_type=\"text\").load_data(\"./data/budget_2023.pdf\")"
]

@@ -141,7 +141,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"parser = LlamaParse(\n",
" result_type=\"markdown\",\n",
@@ -205,7 +205,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"parser_gpt4o = LlamaParse(\n",
" result_type=\"markdown\",\n",
@@ -118,7 +118,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"parser = LlamaParse(\n",
" result_type=\"markdown\",\n",
@@ -181,7 +181,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"parser_gpt4o = LlamaParse(\n",
" result_type=\"markdown\",\n",
@@ -99,7 +99,7 @@
"metadata": {},
"outputs": [],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"parser = LlamaParse(\n",
" result_type=\"markdown\",\n",
@@ -102,7 +102,7 @@
"metadata": {},
"outputs": [],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"parser = LlamaParse(\n",
" result_type=\"markdown\",\n",

@@ -169,7 +169,7 @@
"metadata": {},
"outputs": [],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"\n",
"parser = LlamaParse(\n",

@@ -153,7 +153,7 @@
"metadata": {},
"outputs": [],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"\n",
"parser_text = LlamaParse(result_type=\"text\")\n",

@@ -143,7 +143,7 @@
"metadata": {},
"outputs": [],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"parser = LlamaParse(\n",
" result_type=\"markdown\",\n",
@@ -172,7 +172,7 @@
"metadata": {},
"outputs": [],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"parser = LlamaParse(\n",
" result_type=\"markdown\",\n",

@@ -104,7 +104,7 @@
"metadata": {},
"outputs": [],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"parser = LlamaParse(\n",
" result_type=\"markdown\",\n",
@@ -25,7 +25,7 @@
"\n",
"nest_asyncio.apply()\n",
"\n",
"from llama_parse import LlamaParse"
"from llama_cloud_services import LlamaParse"
]
},
{
@@ -27,7 +27,7 @@
"outputs": [],
"source": [
"%pip install llama-index\n",
"%pip install llama-parse\n",
"%pip install llama-cloud-services\n",
"%pip install torch transformers python-pptx Pillow"
]
},
@@ -85,7 +85,7 @@
"metadata": {},
"outputs": [],
"source": [
"from llama_parse import LlamaParse"
"from llama_cloud_services import LlamaParse"
]
},
{

@@ -45,7 +45,7 @@
"metadata": {},
"outputs": [],
"source": [
"!pip install llama-parse"
"!pip install llama-cloud-services"
]
},
{
@@ -100,7 +100,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"vanilaParsing = LlamaParse(result_type=\"markdown\").load_data(\"./mcdonalds_receipt.png\")"
]

@@ -22,7 +22,7 @@
"!pip install llama-index\n",
"!pip install llama-index-core\n",
"!pip install llama-index-embeddings-openai llama-index-llms-openai\n",
"!pip install llama-parse"
"!pip install llama-cloud-services"
]
},
{
@@ -111,7 +111,7 @@
}
],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"file_path = \"data/nova_technical_report.pdf\"\n",
"\n",

@@ -95,7 +95,7 @@
"metadata": {},
"outputs": [],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"# use our multimodal models for extractions\n",
"parser = LlamaParse(\n",

@@ -107,7 +107,7 @@
"metadata": {},
"outputs": [],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_cloud_services import LlamaParse\n",
"\n",
"parser_gpt4o = LlamaParse(\n",
" result_type=\"markdown\",\n",
+762
@@ -0,0 +1,762 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Report Generation with LlamaReport\n",
"\n",
"In this notebook, we'll walk through the basic process of generating a report with LlamaReport, and highlight some of the key features of the library.\n",
"\n",
"TLDR:\n",
"1. Download source data to use as knowledge base for the report\n",
"2. Kick off report generation with a template\n",
"3. Get the plan and review/accept/reject suggestions\n",
"4. Get the final report\n",
"5. Review/accept/reject suggestions to edit the final report\n",
"6. Print the final report"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install llama-cloud-services"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Download Source Data\n",
"\n",
"Here, we download the `Attention is All You Need` paper as a PDF.\n",
"\n",
"LlamaReport currently supports up to 5 files as input, and essentially any file type that can be parsed by LlamaParse.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!wget \"https://arxiv.org/pdf/1706.03762.pdf\" -O \"./attention.pdf\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Kick off Report Generation\n",
"\n",
"Here, we kick off report generation with a template.\n",
"\n",
"The template can either be a string or a file path, but here we'll use a string.\n",
"\n",
"In our experiments, anything works as a template, but some general guidelines:\n",
"\n",
"- Use markdown formatting + instructions in each section to guide the report generation\n",
"- If using an existing file as a template, provide extra instructions to guide the report generation\n",
"\n",
"**NOTE:** Since we are in a notebook, we will use async functions and `await` throughout. Synchronous methods that work without `await` are available by just removing the `a` from the method name and removing the `await` keyword."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from llama_cloud_services import LlamaReport\n",
"\n",
"llama_report = LlamaReport(\n",
" api_key=\"llx-...\",\n",
")\n",
"\n",
"report_client = await llama_report.acreate_report(\n",
" name=\"my_cool_report_on_attention\",\n",
" # can pass in file paths or bytes\n",
" input_files=[\"./attention.pdf\"],\n",
" template_text=\"\"\"\\\n",
"# [Some title]\\n\\n\n",
"## TLDR\\n\n",
"A quick summary of the paper.\\n\\n\n",
"## Details\\n\n",
"More details about the paper, possibly more than one section here.\\n\n",
"\"\"\",\n",
" # optional additional instructions for the report generation\n",
" # template_instructions=None,\n",
" # optional file path to an existing template instead of template_text\n",
" # template_file=None,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The returned `ReportClient` object is used to interact with the report generation process for this specific report."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Report(id=0a394b33-1a3e-463c-b5cb-7ff8ab827d0a, name=my_cool_report_on_attention)\n"
]
}
],
"source": [
"print(report_client)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Get the plan\n",
"\n",
"The first phases of report generation involve ingesting the source data and generating a plan.\n",
"\n",
"The plan is a list of instructions for the report generation, and can be reviewed/accepted/rejected by the user.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"plan = await report_client.await_for_plan(\n",
" timeout=10000,\n",
" poll_interval=10,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"# {title}\n",
"[ReportQuery(field='title', prompt='Generate a clear and concise title for this paper about the Transformer model and attention mechanisms', context='The paper discusses the Transformer architecture for sequence transduction using attention mechanisms, focusing on machine translation applications')]\n",
"==================\n",
"## TLDR\n",
"\n",
"{tldr_content}\n",
"[ReportQuery(field='tldr_content', prompt='Write a brief, clear summary of the key points about the Transformer model', context='Focus on the main innovations: attention mechanisms, efficiency improvements, and state-of-the-art results in machine translation')]\n",
"==================\n",
"## Details\n",
"\n",
"{details_content}\n",
"[ReportQuery(field='details_content', prompt='Provide detailed information about the Transformer model architecture and its applications', context='Include information about:\\n- The attention mechanism implementation\\n- Advantages over recurrent and convolutional models\\n- Performance in machine translation tasks\\n- Training efficiency improvements')]\n",
"==================\n"
]
}
],
"source": [
"for plan_block in plan.blocks:\n",
" print(plan_block.block.template)\n",
" print(plan_block.queries)\n",
" print(\"==================\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With the plan, we can either use it to kick off generation of the final report, or we can edit the plan and adjust it as needed.\n",
"\n",
"While we could manually edit the objects here and use `await report_client.aupdate_plan(action=\"edit\", updated_plan=plan)`, we can also use `LlamaReport` to agentically edit the plan."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"suggestions = await report_client.asuggest_edits(\n",
" \"Can you split the details section into two sections?\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Justification for change: \n",
"I'll help you break down the details section into two distinct parts - one focusing on the architecture and another on the practical applications and performance. This will make the content more organized and easier to follow. The original block at index 2 will be replaced with these two new sections.\n",
"\n",
"Proposed changes:\n",
"\n",
"## Architecture Details\n",
"\n",
"{architecture_content}\n",
"\n",
"[ReportQuery(field='architecture_content', prompt='Describe the technical details of the Transformer model architecture', context='Focus on:\\n- Core components of the Transformer architecture\\n- Self-attention mechanism implementation\\n- Multi-head attention details\\n- Position encoding approach\\n- Feed-forward network structure')]\n",
"==================\n",
"\n",
"## Performance and Applications\n",
"\n",
"{applications_content}\n",
"\n",
"[ReportQuery(field='applications_content', prompt='Explain the practical applications and performance advantages of the Transformer model', context='Cover:\\n- Comparison with RNN and CNN models\\n- Machine translation results and benchmarks\\n- Training efficiency improvements\\n- Real-world applications and use cases\\n- Scalability benefits')]\n",
"==================\n"
]
}
],
"source": [
"for suggestion in suggestions:\n",
" print(\"Justification for change:\", suggestion.justification)\n",
" print(\"Proposed changes:\")\n",
" for plan_block in suggestion.blocks:\n",
" print(plan_block.block.template)\n",
" print(plan_block.queries)\n",
" print(\"==================\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This looks pretty good! We can also use the client to automatically accept and apply, or reject, these suggestions.\n",
"\n",
"This will (locally) keep track of the history of changes, so that future suggestions can be based on the previous changes."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"for suggestion in suggestions:\n",
" await report_client.aaccept_edit(suggestion)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"What effect did that have on the tracked local history? Let's see!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[EditAction(block_idx=2, old_content='## Details\\n\\n{details_content}\\n\\nField: details_content, Prompt: Provide detailed information about the Transformer model architecture and its applications, Context: Include information about:\\n- The attention mechanism implementation\\n- Advantages over recurrent and convolutional models\\n- Performance in machine translation tasks\\n- Training efficiency improvements\\nDepends on: none', new_content='\\n## Architecture Details\\n\\n{architecture_content}\\n\\n\\nField: architecture_content, Prompt: Describe the technical details of the Transformer model architecture, Context: Focus on:\\n- Core components of the Transformer architecture\\n- Self-attention mechanism implementation\\n- Multi-head attention details\\n- Position encoding approach\\n- Feed-forward network structure\\nDepends on: none', action='approved', timestamp=datetime.datetime(2025, 2, 4, 20, 59, 55, 773558)),\n",
" EditAction(block_idx=3, old_content='[No old content]', new_content='\\n## Performance and Applications\\n\\n{applications_content}\\n\\n\\nField: applications_content, Prompt: Explain the practical applications and performance advantages of the Transformer model, Context: Cover:\\n- Comparison with RNN and CNN models\\n- Machine translation results and benchmarks\\n- Training efficiency improvements\\n- Real-world applications and use cases\\n- Scalability benefits\\nDepends on: previous', action='approved', timestamp=datetime.datetime(2025, 2, 4, 20, 59, 55, 773687))]"
]
},
"execution_count": null,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"report_client.edit_history"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Message(role=<MessageRole.USER: 'user'>, content='Can you split the details section into two sections?', timestamp=datetime.datetime(2025, 2, 4, 20, 59, 47, 754848)),\n",
" Message(role=<MessageRole.ASSISTANT: 'assistant'>, content=\"\\nI'll help you break down the details section into two distinct parts - one focusing on the architecture and another on the practical applications and performance. This will make the content more organized and easier to follow. The original block at index 2 will be replaced with these two new sections.\\n\", timestamp=datetime.datetime(2025, 2, 4, 20, 59, 55, 482070))]"
]
},
"execution_count": null,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"report_client.chat_history"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Both histories are used as context for future suggestions! You can always clear them, or provide your own history."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# report_client.suggest_edits(\"....\", chat_history=[{\"role\": \"user\", \"content\": \"...\"}, ...])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Get the final report\n",
"\n",
"Now that we have a plan, we can kick off generation of the final report."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# kicks off report generation\n",
"await report_client.aupdate_plan(action=\"approve\")\n",
"\n",
"# waits for report generation to complete\n",
"report = await report_client.await_completion(\n",
" timeout=10000,\n",
" poll_interval=10,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"# Attention Is All You Need: A Pure Attention-Based Architecture for Neural Machine Translation\n",
"\n",
"## TLDR\n",
"\n",
"The Transformer introduced a revolutionary architecture that relies entirely on attention mechanisms, eliminating the need for recurrence or convolution in sequence processing. Its key innovations include multi-head self-attention for parallel processing of input sequences, scaled dot-product attention for efficient computation, and positional encodings for sequence order awareness. The model achieved breakthrough results in machine translation (28.4 BLEU on English-to-German, 41.8 BLEU on English-to-French) while requiring significantly less training time than previous approaches, training in 3.5 days on 8 GPUs. This architecture demonstrated that attention mechanisms alone are sufficient for state-of-the-art sequence modeling, setting a new direction for natural language processing.\n",
"\n",
"\n",
"## Architecture Details\n",
"\n",
"The Transformer architecture represents a groundbreaking approach to sequence processing, built entirely on attention mechanisms without recurrence or convolution. Here are its key technical details:\n",
"\n",
"Core Components:\n",
"- Encoder-decoder architecture with stacked self-attention and point-wise feed-forward layers\n",
"- Each layer contains two main sub-layers: multi-head self-attention mechanism and position-wise feed-forward network\n",
"- Layer normalization and residual connections between sub-layers\n",
"- No recurrent or convolutional elements, enabling parallel processing\n",
"\n",
"Self-Attention Mechanism:\n",
"- Processes relationships between all positions in a sequence simultaneously\n",
"- Computes attention weights using queries, keys, and values derived from input representations\n",
"- Implements scaled dot-product attention to prevent gradient issues with large input dimensions\n",
"- Allows direct modeling of dependencies regardless of positional distance\n",
"- Uses masking in decoder to prevent leftward information flow and maintain auto-regressive property\n",
"\n",
"Multi-Head Attention:\n",
"- Employs multiple attention heads operating in parallel\n",
"- Each head processes information in different representation subspaces\n",
"- Three types of attention applications:\n",
" 1. Encoder self-attention (all positions attend to each other)\n",
" 2. Decoder self-attention (each position attends to previous positions)\n",
" 3. Encoder-decoder attention (decoder queries attend to encoder outputs)\n",
"- Counteracts reduced resolution from attention averaging through parallel processing\n",
"\n",
"Position-wise Feed-Forward Network:\n",
"- Applied identically to each position separately\n",
"- Consists of two linear transformations with ReLU activation\n",
"- Structure: FFN(x) = max(0, xW1 + b1)W2 + b2\n",
"- Input and output dimensionality: dmodel = 512\n",
"- Inner-layer dimensionality: dff = 2048\n",
"- Parameters vary between layers but remain constant across positions\n",
"\n",
"Position Encoding:\n",
"- Adds positional information to input embeddings\n",
"- Enables the model to consider sequential order without recurrence\n",
"- Implements sinusoidal position encodings to allow model to attend to relative positions\n",
"- Maintains constant number of operations between any two positions, unlike convolutional approaches\n",
"- Allows effective modeling of both local and long-range dependencies\n",
"\n",
"\n",
"\n",
"## Performance and Applications\n",
"\n",
"The Transformer model demonstrates significant performance advantages and practical applications across multiple domains:\n",
"\n",
"Performance Advantages over RNN/CNN Models:\n",
"- Eliminates sequential computation constraints present in RNNs, enabling superior parallelization\n",
"- Reduces operations needed for relating distant positions to a constant number, compared to linear/logarithmic scaling in CNNs\n",
"- Processes all input and output positions simultaneously through self-attention mechanisms\n",
"- Achieves state-of-the-art results while requiring significantly less computational resources\n",
"\n",
"Machine Translation Benchmarks:\n",
"- WMT 2014 English-to-German: 28.4 BLEU score, exceeding previous best results by over 2 BLEU points\n",
"- WMT 2014 English-to-French: 41.8 BLEU score (single-model state-of-the-art)\n",
"- Surpasses performance of existing model ensembles in translation tasks\n",
"\n",
"Training Efficiency:\n",
"- Requires only 3.5 days of training on eight GPUs for state-of-the-art performance\n",
"- Achieves superior results at \"a small fraction of the training costs\" compared to previous models\n",
"- Enables significantly faster training through parallel processing of input/output sequences\n",
"- Can reach production-quality performance in as little as twelve hours on modern GPU hardware\n",
"\n",
"Real-world Applications:\n",
"- Machine translation systems\n",
"- Natural language understanding tasks\n",
"- Reading comprehension\n",
"- Abstractive summarization\n",
"- Text entailment analysis\n",
"- Constituency parsing (achieving 92.7 F1 score in semi-supervised settings)\n",
"- Adaptable to both large and limited training data scenarios\n",
"\n",
"Scalability Benefits:\n",
"- Highly parallelizable architecture enables efficient scaling across multiple GPUs\n",
"- Constant computational complexity for relating any input/output positions\n",
"- Effective handling of long-range dependencies in sequences\n",
"- Maintains performance quality while scaling to larger datasets and model sizes\n",
"- Generalizes well across different tasks and domains without architectural changes\n",
"- Supports efficient inference and deployment in production environments\n",
"\n"
]
}
],
"source": [
"report_text = \"\\n\\n\".join([block.template for block in report.blocks])\n",
"print(report_text)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Edit the final report\n",
"\n",
"Now that we have a report, we can edit it.\n",
"\n",
"We can use the `asuggest_edits` method to get suggestions for edits, and then use the `aaccept_edit`/`areject_edit` methods to apply them.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Justification for change: \n",
"I'd suggest changing \"TLDR\" to \"Executive Summary\" which is more appropriate for a professional or academic report. This term is widely used in formal documents and better reflects the nature of this concise overview section while maintaining the same function of providing a quick summary of the key points.\n",
"\n",
"Proposed changes:\n",
"## Executive Summary\n",
"\n",
"The Transformer introduced a revolutionary architecture that relies entirely on attention mechanisms, eliminating the need for recurrence or convolution in sequence processing. Its key innovations include multi-head self-attention for parallel processing of input sequences, scaled dot-product attention for efficient computation, and positional encodings for sequence order awareness. The model achieved breakthrough results in machine translation (28.4 BLEU on English-to-German, 41.8 BLEU on English-to-French) while requiring significantly less training time than previous approaches, training in 3.5 days on 8 GPUs. This architecture demonstrated that attention mechanisms alone are sufficient for state-of-the-art sequence modeling, setting a new direction for natural language processing.\n",
"==================\n"
]
}
],
"source": [
"suggestions = await report_client.asuggest_edits(\n",
" \"Can you change the TLDR header to something more professional?\"\n",
")\n",
"for suggestion in suggestions:\n",
" print(\"Justification for change:\", suggestion.justification)\n",
" print(\"Proposed changes:\")\n",
" for block in suggestion.blocks:\n",
" print(block.template)\n",
" print(\"==================\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Changing to \"Executive Summary\" sounds reasonable, let's accept that!\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"for suggestion in suggestions:\n",
" await report_client.aaccept_edit(suggestion)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Print the final report\n",
"\n",
"Now that the edit has been accepted, we can fetch the updated report and print it."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"# Attention Is All You Need: A Pure Attention-Based Architecture for Neural Machine Translation\n",
"\n",
"## Executive Summary\n",
"\n",
"The Transformer introduced a revolutionary architecture that relies entirely on attention mechanisms, eliminating the need for recurrence or convolution in sequence processing. Its key innovations include multi-head self-attention for parallel processing of input sequences, scaled dot-product attention for efficient computation, and positional encodings for sequence order awareness. The model achieved breakthrough results in machine translation (28.4 BLEU on English-to-German, 41.8 BLEU on English-to-French) while requiring significantly less training time than previous approaches, training in 3.5 days on 8 GPUs. This architecture demonstrated that attention mechanisms alone are sufficient for state-of-the-art sequence modeling, setting a new direction for natural language processing.\n",
"\n",
"\n",
"## Architecture Details\n",
"\n",
"The Transformer architecture represents a groundbreaking approach to sequence processing, built entirely on attention mechanisms without recurrence or convolution. Here are its key technical details:\n",
"\n",
"Core Components:\n",
"- Encoder-decoder architecture with stacked self-attention and point-wise feed-forward layers\n",
"- Each layer contains two main sub-layers: multi-head self-attention mechanism and position-wise feed-forward network\n",
"- Layer normalization and residual connections between sub-layers\n",
"- No recurrent or convolutional elements, enabling parallel processing\n",
"\n",
"Self-Attention Mechanism:\n",
"- Processes relationships between all positions in a sequence simultaneously\n",
"- Computes attention weights using queries, keys, and values derived from input representations\n",
"- Implements scaled dot-product attention to prevent gradient issues with large input dimensions\n",
"- Allows direct modeling of dependencies regardless of positional distance\n",
"- Uses masking in decoder to prevent leftward information flow and maintain auto-regressive property\n",
"\n",
"Multi-Head Attention:\n",
"- Employs multiple attention heads operating in parallel\n",
"- Each head processes information in different representation subspaces\n",
"- Three types of attention applications:\n",
" 1. Encoder self-attention (all positions attend to each other)\n",
" 2. Decoder self-attention (each position attends to previous positions)\n",
" 3. Encoder-decoder attention (decoder queries attend to encoder outputs)\n",
"- Counteracts reduced resolution from attention averaging through parallel processing\n",
"\n",
"Position-wise Feed-Forward Network:\n",
"- Applied identically to each position separately\n",
"- Consists of two linear transformations with ReLU activation\n",
"- Structure: FFN(x) = max(0, xW1 + b1)W2 + b2\n",
"- Input and output dimensionality: dmodel = 512\n",
"- Inner-layer dimensionality: dff = 2048\n",
"- Parameters vary between layers but remain constant across positions\n",
"\n",
"Position Encoding:\n",
"- Adds positional information to input embeddings\n",
"- Enables the model to consider sequential order without recurrence\n",
"- Implements sinusoidal position encodings to allow model to attend to relative positions\n",
"- Maintains constant number of operations between any two positions, unlike convolutional approaches\n",
"- Allows effective modeling of both local and long-range dependencies\n",
"\n",
"\n",
"\n",
"## Performance and Applications\n",
"\n",
"The Transformer model demonstrates significant performance advantages and practical applications across multiple domains:\n",
"\n",
"Performance Advantages over RNN/CNN Models:\n",
"- Eliminates sequential computation constraints present in RNNs, enabling superior parallelization\n",
"- Reduces operations needed for relating distant positions to a constant number, compared to linear/logarithmic scaling in CNNs\n",
"- Processes all input and output positions simultaneously through self-attention mechanisms\n",
"- Achieves state-of-the-art results while requiring significantly less computational resources\n",
"\n",
"Machine Translation Benchmarks:\n",
"- WMT 2014 English-to-German: 28.4 BLEU score, exceeding previous best results by over 2 BLEU points\n",
"- WMT 2014 English-to-French: 41.8 BLEU score (single-model state-of-the-art)\n",
"- Surpasses performance of existing model ensembles in translation tasks\n",
"\n",
"Training Efficiency:\n",
"- Requires only 3.5 days of training on eight GPUs for state-of-the-art performance\n",
"- Achieves superior results at \"a small fraction of the training costs\" compared to previous models\n",
"- Enables significantly faster training through parallel processing of input/output sequences\n",
"- Can reach production-quality performance in as little as twelve hours on modern GPU hardware\n",
"\n",
"Real-world Applications:\n",
"- Machine translation systems\n",
"- Natural language understanding tasks\n",
"- Reading comprehension\n",
"- Abstractive summarization\n",
"- Text entailment analysis\n",
"- Constituency parsing (achieving 92.7 F1 score in semi-supervised settings)\n",
"- Adaptable to both large and limited training data scenarios\n",
"\n",
"Scalability Benefits:\n",
"- Highly parallelizable architecture enables efficient scaling across multiple GPUs\n",
"- Constant computational complexity for relating any input/output positions\n",
"- Effective handling of long-range dependencies in sequences\n",
"- Maintains performance quality while scaling to larger datasets and model sizes\n",
"- Generalizes well across different tasks and domains without architectural changes\n",
"- Supports efficient inference and deployment in production environments\n",
"\n"
]
}
],
"source": [
"report_response = await report_client.aget()\n",
"report_text = \"\\n\\n\".join([block.template for block in report_response.report.blocks])\n",
"print(report_text)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also see the sources for each block!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.99687636\n",
"# Abstract\n",
"\n",
"The dominant sequence transduction models are based on complex recurrent or convolutiona\n",
"==================\n",
"0.99591404\n",
"# 2 Background\n",
"\n",
"The goal of reducing sequential computation also forms the foundation of the Extende\n",
"==================\n",
"0.9951325\n",
"# 1 Introduction\n",
"\n",
"Recurrent neural networks, long short-term memory [13] and gated recurrent [7] neu\n",
"==================\n",
"0.99442345\n",
"# 7 Conclusion\n",
"\n",
"In this work, we presented the Transformer, the first sequence transduction model ba\n",
"==================\n",
"0.9967649\n",
"# 3.2.3 Applications of Attention in our Model\n",
"\n",
"The Transformer uses multi-head attention in three d\n",
"==================\n",
"0.99533635\n",
"# 2 Background\n",
"\n",
"The goal of reducing sequential computation also forms the foundation of the Extende\n",
"==================\n",
"0.9935868\n",
"# Abstract\n",
"\n",
"The dominant sequence transduction models are based on complex recurrent or convolutiona\n",
"==================\n",
"0.98780584\n",
"# Outputs\n",
"\n",
"(shifted right)\n",
"\n",
"Figure 1: The Transformer - model architecture.\n",
"\n",
"The Transformer follows\n",
"==================\n",
"0.9205043\n",
"# 3.3 Position-wise Feed-Forward Networks\n",
"\n",
"In addition to attention sub-layers, each of the layers i\n",
"==================\n",
"0.79581684\n",
"# 1 Introduction\n",
"\n",
"Recurrent neural networks, long short-term memory [13] and gated recurrent [7] neu\n",
"==================\n",
"0.9946774\n",
"# Abstract\n",
"\n",
"The dominant sequence transduction models are based on complex recurrent or convolutiona\n",
"==================\n",
"0.97079873\n",
"# 7 Conclusion\n",
"\n",
"In this work, we presented the Transformer, the first sequence transduction model ba\n",
"==================\n",
"0.9535353\n",
"# 6.3 English Constituency Parsing\n",
"\n",
"To evaluate if the Transformer can generalize to other tasks we \n",
"==================\n",
"0.9514138\n",
"# 2 Background\n",
"\n",
"The goal of reducing sequential computation also forms the foundation of the Extende\n",
"==================\n",
"0.9790758\n",
"# 1 Introduction\n",
"\n",
"Recurrent neural networks, long short-term memory [13] and gated recurrent [7] neu\n",
"==================\n",
"0.92262185\n",
"# Outputs\n",
"\n",
"(shifted right)\n",
"\n",
"Figure 1: The Transformer - model architecture.\n",
"\n",
"The Transformer follows\n",
"==================\n"
]
}
],
"source": [
"for block in report_response.report.blocks:\n",
" # Each block has a list of sources, which are the nodes that were used to generate the block\n",
" for source in block.sources:\n",
" print(source.score)\n",
" print(source.node.text[:100])\n",
" print(\"==================\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "llama-parse-aNC435Vv-py3.10",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
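The notebook above passes `timeout` and `poll_interval` to `await_for_plan` and `await_completion`. The block below is a minimal sketch of what such a polling loop might look like; it is not the library's implementation. `poll_until` and `make_fake_check` are hypothetical names introduced only for illustration, and the default values simply mirror `DEFAULT_TIMEOUT = 600` and `DEFAULT_POLL_INTERVAL = 5` from `report.py`.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def poll_until(
    check: Callable[[], Awaitable[Optional[T]]],
    timeout: float = 600,
    poll_interval: float = 5,
) -> T:
    """Call `check` repeatedly until it returns a non-None value.

    Hypothetical helper: raises TimeoutError once `timeout` seconds have
    elapsed without a result.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = await check()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"No result after {timeout} seconds")
        await asyncio.sleep(poll_interval)


def make_fake_check(ready_after: int) -> Callable[[], Awaitable[Optional[str]]]:
    """Build a stand-in for a remote status call (assumption, not a real API).

    The returned coroutine function yields None until it has been called
    `ready_after` times, then yields a string.
    """
    calls = {"n": 0}

    async def check() -> Optional[str]:
        calls["n"] += 1
        return "plan ready" if calls["n"] >= ready_after else None

    return check


if __name__ == "__main__":
    # The fake check succeeds on its third call.
    print(asyncio.run(poll_until(make_fake_check(3), timeout=1, poll_interval=0.01)))
```

A shorter `poll_interval` returns sooner after the work finishes at the cost of more requests; the `timeout` bounds the total wait regardless of the interval.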
+8
View File
@@ -0,0 +1,8 @@
from llama_cloud_services.parse import LlamaParse
from llama_cloud_services.report import ReportClient, LlamaReport

__all__ = [
    "LlamaParse",
    "ReportClient",
    "LlamaReport",
]
+3
View File
@@ -0,0 +1,3 @@
from llama_cloud_services.parse.base import LlamaParse, ResultType
__all__ = ["LlamaParse", "ResultType"]
@@ -18,7 +18,7 @@ from llama_index.core.readers.base import BasePydanticReader
from llama_index.core.readers.file.base import get_default_fs
from llama_index.core.schema import Document
from llama_parse.utils import (
from llama_cloud_services.parse.utils import (
SUPPORTED_FILE_TYPES,
ResultType,
nest_asyncio_err,
@@ -5,7 +5,7 @@ from pathlib import Path
from pydantic.fields import FieldInfo
from typing import Any, Callable, List
from llama_parse.base import LlamaParse
from llama_cloud_services.parse.base import LlamaParse
def pydantic_field_to_click_option(name: str, field: FieldInfo) -> click.Option:
+4
View File
@@ -0,0 +1,4 @@
from llama_cloud_services.report.report import ReportClient
from llama_cloud_services.report.base import LlamaReport
__all__ = ["ReportClient", "LlamaReport"]
+269
View File
@@ -0,0 +1,269 @@
import asyncio
import httpx
import os
import io
from concurrent.futures import ThreadPoolExecutor
from typing import Optional, List, Union, Any, Coroutine, TypeVar
from urllib.parse import urljoin

from llama_cloud.types import ReportMetadata
from llama_cloud_services.report.report import ReportClient

T = TypeVar("T")


class LlamaReport:
    """Client for managing reports and general report operations."""

    def __init__(
        self,
        api_key: Optional[str] = None,
        project_id: Optional[str] = None,
        organization_id: Optional[str] = None,
        base_url: Optional[str] = None,
        timeout: Optional[int] = None,
        async_httpx_client: Optional[httpx.AsyncClient] = None,
    ):
        self.api_key = api_key or os.getenv("LLAMA_CLOUD_API_KEY", None)
        if not self.api_key:
            raise ValueError("No API key provided.")

        self.base_url = base_url or os.getenv(
            "LLAMA_CLOUD_BASE_URL", "https://api.cloud.llamaindex.ai"
        )
        self.timeout = timeout or 60

        # Initialize HTTP clients
        self._aclient = async_httpx_client or httpx.AsyncClient(timeout=self.timeout)

        # Set auth headers
        self.headers = {
            "Authorization": f"Bearer {self.api_key}",
        }

        self.organization_id = organization_id
        self.project_id = project_id

        self._client_params = {
            "timeout": self._aclient.timeout,
            "headers": self._aclient.headers,
            "base_url": self._aclient.base_url,
            "auth": self._aclient.auth,
            "event_hooks": self._aclient.event_hooks,
            "cookies": self._aclient.cookies,
            "max_redirects": self._aclient.max_redirects,
            "params": self._aclient.params,
            "trust_env": self._aclient.trust_env,
        }

        self._thread_pool = ThreadPoolExecutor(
            max_workers=min(10, (os.cpu_count() or 1) + 4)
        )
    @property
    def aclient(self) -> httpx.AsyncClient:
        if self._aclient is None:
            self._aclient = httpx.AsyncClient(**self._client_params)
        return self._aclient

    def _run_sync(self, coro: Coroutine[Any, Any, T]) -> T:
        """Run coroutine in a separate thread to avoid event loop issues"""
        # force a new client for this thread/event loop
        original_client = self._aclient
        self._aclient = None

        def run_coro() -> T:
            async def wrapped_coro() -> T:
                return await coro

            return asyncio.run(wrapped_coro())

        try:
            result = self._thread_pool.submit(run_coro).result()
        finally:
            # restore the original client, even if the coroutine raised
            self._aclient = original_client

        return result
    async def _get_default_project(self) -> str:
        response = await self.aclient.get(
            urljoin(str(self.base_url), "/api/v1/projects"), headers=self.headers
        )
        response.raise_for_status()
        projects = response.json()
        default_project = [p for p in projects if p.get("is_default")]
        if not default_project:
            # avoid an opaque IndexError when no project is flagged as default
            raise ValueError("No default project found; pass project_id explicitly.")
        return default_project[0]["id"]

    async def _build_url(
        self, endpoint: str, extra_params: Optional[List[str]] = None
    ) -> str:
        """Helper method to build URLs with common query parameters."""
        url = urljoin(str(self.base_url), endpoint)

        if not self.project_id:
            self.project_id = await self._get_default_project()

        query_params = []
        if self.organization_id:
            query_params.append(f"organization_id={self.organization_id}")
        if self.project_id:
            query_params.append(f"project_id={self.project_id}")
        if extra_params:
            query_params.extend([p for p in extra_params if p is not None])

        if query_params:
            url += "?" + "&".join(query_params)

        return url
    async def acreate_report(
        self,
        name: str,
        template_instructions: Optional[str] = None,
        template_text: Optional[str] = None,
        template_file: Optional[Union[str, tuple[str, bytes]]] = None,
        input_files: Optional[List[Union[str, tuple[str, bytes]]]] = None,
        existing_retriever_id: Optional[str] = None,
    ) -> ReportClient:
        """Create a new report asynchronously."""
        url = await self._build_url("/api/v1/reports/")
        open_files: List[io.BufferedReader] = []

        data = {"name": name}
        if template_instructions:
            data["template_instructions"] = template_instructions
        if template_text:
            data["template_text"] = template_text
        if existing_retriever_id:
            data["existing_retriever_id"] = str(existing_retriever_id)

        files: List[tuple[str, io.BufferedReader | bytes]] = []
        if template_file:
            if isinstance(template_file, str):
                open_files.append(open(template_file, "rb"))
                files.append(("template_file", open_files[-1]))
            else:
                files.append(("template_file", template_file[1]))

        if input_files:
            for f in input_files:
                if isinstance(f, str):
                    open_files.append(open(f, "rb"))
                    files.append(("files", open_files[-1]))
                else:
                    files.append(("files", f[1]))

        try:
            # the request lives inside the try block so that opened files are
            # closed even when the POST itself fails
            response = await self.aclient.post(
                url, headers=self.headers, data=data, files=files
            )
            response.raise_for_status()
            report_id = response.json()["id"]
            return ReportClient(report_id, name, self)
        except httpx.HTTPStatusError as e:
            raise ValueError(
                f"Failed to create report: {e.response.text}\nError Code: {e.response.status_code}"
            ) from e
        finally:
            for open_file in open_files:
                open_file.close()
    def create_report(
        self,
        name: str,
        template_instructions: Optional[str] = None,
        template_text: Optional[str] = None,
        template_file: Optional[Union[str, tuple[str, bytes]]] = None,
        input_files: Optional[List[Union[str, tuple[str, bytes]]]] = None,
        existing_retriever_id: Optional[str] = None,
    ) -> ReportClient:
        """Create a new report."""
        return self._run_sync(
            self.acreate_report(
                name=name,
                template_instructions=template_instructions,
                template_text=template_text,
                template_file=template_file,
                input_files=input_files,
                existing_retriever_id=existing_retriever_id,
            )
        )

    async def alist_reports(
        self, state: Optional[str] = None, limit: int = 100, offset: int = 0
    ) -> List[ReportClient]:
        """List all reports asynchronously."""
        params = []
        if state:
            params.append(f"state={state}")
        if limit:
            params.append(f"limit={limit}")
        if offset:
            params.append(f"offset={offset}")

        url = await self._build_url(
            "/api/v1/reports/list",
            extra_params=params,
        )

        response = await self.aclient.get(url, headers=self.headers)
        response.raise_for_status()
        data = response.json()

        return [
            ReportClient(r["report_id"], r["name"], self)
            for r in data["report_responses"]
        ]

    def list_reports(
        self, state: Optional[str] = None, limit: int = 100, offset: int = 0
    ) -> List[ReportClient]:
        """Synchronous wrapper for listing reports."""
        return self._run_sync(self.alist_reports(state, limit, offset))

    async def aget_report(self, report_id: str) -> ReportClient:
        """Get a Report instance for working with a specific report."""
        url = await self._build_url(f"/api/v1/reports/{report_id}")

        response = await self.aclient.get(url, headers=self.headers)
        response.raise_for_status()
        data = response.json()

        return ReportClient(data["report_id"], data["name"], self)

    def get_report(self, report_id: str) -> ReportClient:
        """Synchronous wrapper for getting a report."""
        return self._run_sync(self.aget_report(report_id))
    async def aget_report_metadata(self, report_id: str) -> ReportMetadata:
        """Get metadata for a specific report asynchronously.

        Returns:
            ReportMetadata containing:
                - id: Report ID
                - name: Report name
                - state: Current report state
                - report_metadata: Additional metadata
                - template_file: Name of template file if used
                - template_instructions: Template instructions if provided
                - input_files: List of input file names
        """
        url = await self._build_url(f"/api/v1/reports/{report_id}/metadata")

        response = await self.aclient.get(url, headers=self.headers)
        response.raise_for_status()
        return ReportMetadata(**response.json())

    def get_report_metadata(self, report_id: str) -> ReportMetadata:
        """Synchronous wrapper for getting report metadata."""
        return self._run_sync(self.aget_report_metadata(report_id))
async def adelete_report(self, report_id: str) -> None:
"""Delete a specific report asynchronously."""
url = await self._build_url(f"/api/v1/reports/{report_id}")
response = await self.aclient.delete(url, headers=self.headers)
response.raise_for_status()
def delete_report(self, report_id: str) -> None:
"""Synchronous wrapper for deleting a report."""
return self._run_sync(self.adelete_report(report_id))
+527
@@ -0,0 +1,527 @@
import asyncio
import httpx
import time
from typing import Optional, List, Literal, Union, TYPE_CHECKING
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from llama_cloud.types import (
ReportEventItemEventData_Progress,
ReportMetadata,
EditSuggestion,
ReportResponse,
ReportPlan,
ReportBlock,
ReportPlanBlock,
Report,
)
if TYPE_CHECKING:
from llama_cloud_services.report.base import LlamaReport
class MessageRole(str, Enum):
USER = "user"
ASSISTANT = "assistant"
@dataclass
class Message:
role: MessageRole
content: str
timestamp: datetime
@dataclass
class EditAction:
block_idx: int
old_content: str
new_content: Optional[str]
action: Literal["approved", "rejected"]
timestamp: datetime
DEFAULT_POLL_INTERVAL = 5
DEFAULT_TIMEOUT = 600
class ReportClient:
"""Client for operations on a specific report."""
def __init__(self, report_id: str, name: str, parent_client: "LlamaReport"):
self.report_id = report_id
self.name = name
self._client = parent_client
self._headers = parent_client.headers
self._run_sync = parent_client._run_sync
self._build_url = parent_client._build_url
self.chat_history: List[Message] = []
self.edit_history: List[EditAction] = []
@property
def aclient(self) -> httpx.AsyncClient:
return self._client.aclient
def __str__(self) -> str:
return f"Report(id={self.report_id}, name={self.name})"
def __repr__(self) -> str:
return f"Report(id={self.report_id}, name={self.name})"
def _get_block_content(self, block: Union[ReportBlock, ReportPlanBlock]) -> str:
if isinstance(block, ReportBlock):
return block.template
elif isinstance(block, ReportPlanBlock):
return block.block.template
else:
raise ValueError(f"Invalid block type: {type(block)}")
def _get_block_idx(self, block: Union[ReportBlock, ReportPlanBlock]) -> int:
if isinstance(block, ReportBlock):
return block.idx
elif isinstance(block, ReportPlanBlock):
return block.block.idx
else:
raise ValueError(f"Invalid block type: {type(block)}")
async def aget(self, version: Optional[int] = None) -> ReportResponse:
"""Get this report's details asynchronously."""
extra_params = []
if version is not None:
extra_params.append(f"version={version}")
url = await self._build_url(f"/api/v1/reports/{self.report_id}", extra_params)
response = await self.aclient.get(url, headers=self._headers)
response.raise_for_status()
return ReportResponse(**response.json())
def get(self, version: Optional[int] = None) -> ReportResponse:
"""Synchronous wrapper for getting this report's details."""
return self._run_sync(self.aget(version))
async def aupdate_report(self, updated_report: Report) -> ReportResponse:
"""Update this report's content asynchronously."""
url = await self._build_url(f"/api/v1/reports/{self.report_id}")
response = await self.aclient.patch(
url, headers=self._headers, json={"content": updated_report.dict()}
)
response.raise_for_status()
return ReportResponse(**response.json())
def update_report(self, updated_report: Report) -> ReportResponse:
"""Synchronous wrapper for updating this report's content."""
return self._run_sync(self.aupdate_report(updated_report))
    async def aupdate_plan(
        self,
        action: Literal["approve", "reject", "edit"],
        updated_plan: Optional[ReportPlan] = None,
    ) -> ReportResponse:
        """Update this report's plan asynchronously."""
        if action == "edit" and updated_plan is None:
            raise ValueError("updated_plan is required when action is 'edit'")
        url = await self._build_url(
            f"/api/v1/reports/{self.report_id}/plan", [f"action={action}"]
        )
        data = None
        if updated_plan is not None:
            # generated_at is dropped from the request payload
            plan_dict = updated_plan.dict()
            plan_dict.pop("generated_at", None)
            data = plan_dict
        response = await self.aclient.patch(url, headers=self._headers, json=data)
        response.raise_for_status()
        return ReportResponse(**response.json())
def update_plan(
self,
action: Literal["approve", "reject", "edit"],
updated_plan: Optional[ReportPlan] = None,
) -> ReportResponse:
"""Synchronous wrapper for updating this report's plan."""
return self._run_sync(self.aupdate_plan(action, updated_plan))
    async def asuggest_edits(
        self,
        user_query: str,
        auto_history: bool = True,
        chat_history: Optional[List[dict]] = None,
    ) -> List[EditSuggestion]:
        """Get AI suggestions for edits to this report asynchronously.

        Args:
            user_query: The user's request/question about what to edit
            auto_history:
                Whether to send the locally tracked chat history (including
                summaries of accepted and rejected edits) with the request.
                Ignored when chat_history is provided.
            chat_history:
                An explicit list of chat messages to send instead of the tracked history.
                Each message is a dictionary with "role" and "content" keys.
        """
        # Add user message to history (always tracked, regardless of auto_history)
        self.chat_history.append(
            Message(role=MessageRole.USER, content=user_query, timestamp=datetime.now())
        )
        # Format chat history with edit summaries
        chat_history_dicts = []
        for msg in self.chat_history[:-1]:  # Exclude current message
            content = msg.content
            if msg.role == MessageRole.USER:
                # Add edit summary for user messages
                edit_summary = self._get_edit_summary_after_message(msg.timestamp)
                if edit_summary:
                    content = f"{content}\n\nActions taken:\n{edit_summary}"
            chat_history_dicts.append({"role": msg.role.value, "content": content})
        # Decide which chat history to send: an explicit chat_history takes
        # precedence, then the tracked history if auto_history is set
        if chat_history:
            chat_history_dicts = chat_history
        elif not auto_history:
            chat_history_dicts = []
        # Make the API call
        url = await self._build_url(f"/api/v1/reports/{self.report_id}/suggest_edits")
        data = {"user_query": user_query, "chat_history": chat_history_dicts}
        response = await self.aclient.post(url, headers=self._headers, json=data)
        response.raise_for_status()
        suggestions = response.json()
        suggestions = [EditSuggestion(**suggestion) for suggestion in suggestions]
        # Add assistant responses to history
        for suggestion in suggestions:
            self.chat_history.append(
                Message(
                    role=MessageRole.ASSISTANT,
                    content=suggestion.justification,
                    timestamp=datetime.now(),
                )
            )
        return suggestions
def suggest_edits(
self,
user_query: str,
auto_history: bool = True,
chat_history: Optional[List[dict]] = None,
) -> List[EditSuggestion]:
"""Synchronous wrapper for getting edit suggestions."""
return self._run_sync(
self.asuggest_edits(user_query, auto_history, chat_history)
)
    async def await_completion(
        self, timeout: int = DEFAULT_TIMEOUT, poll_interval: int = DEFAULT_POLL_INTERVAL
    ) -> Report:
        """Wait for this report to complete processing."""
        start_time = time.time()
        while True:
            report_response = await self.aget()
            status = report_response.status
            if status == "completed":
                return report_response.report
            elif status == "error":
                events = await self.aget_events()
                # Guard against an empty event list so the real error is not masked
                last_msg = events[-1].msg if events else "no progress events recorded"
                raise ValueError(f"Report entered error state: {last_msg}")
            elif time.time() - start_time > timeout:
                raise TimeoutError(f"Report did not complete within {timeout} seconds")
            await asyncio.sleep(poll_interval)
def wait_for_completion(
self, timeout: int = DEFAULT_TIMEOUT, poll_interval: int = DEFAULT_POLL_INTERVAL
) -> Report:
"""Synchronous wrapper for awaiting report completion."""
return self._run_sync(self.await_completion(timeout, poll_interval))
    async def await_for_plan(
        self, timeout: int = DEFAULT_TIMEOUT, poll_interval: int = DEFAULT_POLL_INTERVAL
    ) -> ReportPlan:
        """Wait for this report's plan to be ready for review."""
        start_time = time.time()
        while True:
            report_metadata = await self.aget_metadata()
            state = report_metadata.state
            if state == "waiting_approval":
                report_response = await self.aget()
                return report_response.plan
            elif state == "error":
                events = await self.aget_events()
                # Guard against an empty event list so the real error is not masked
                last_msg = events[-1].msg if events else "no progress events recorded"
                raise ValueError(f"Report entered error state: {last_msg}")
            elif time.time() - start_time > timeout:
                raise TimeoutError(f"Plan was not ready within {timeout} seconds")
            await asyncio.sleep(poll_interval)
def wait_for_plan(
self, timeout: int = DEFAULT_TIMEOUT, poll_interval: int = DEFAULT_POLL_INTERVAL
) -> ReportPlan:
"""Synchronous wrapper for awaiting plan readiness."""
return self._run_sync(self.await_for_plan(timeout, poll_interval))
async def aget_metadata(self) -> ReportMetadata:
"""Get this report's metadata asynchronously."""
return await self._client.aget_report_metadata(self.report_id)
def get_metadata(self) -> ReportMetadata:
"""Synchronous wrapper for getting this report's metadata."""
return self._run_sync(self.aget_metadata())
async def adelete(self) -> None:
"""Delete this report asynchronously."""
return await self._client.adelete_report(self.report_id)
def delete(self) -> None:
"""Synchronous wrapper for deleting this report."""
return self._run_sync(self.adelete())
async def aaccept_edit(self, suggestion: EditSuggestion) -> None:
"""Accept a suggested edit.
Args:
suggestion: The EditSuggestion to accept, typically from suggest_edits()
"""
if len(suggestion.blocks) == 0:
return
# Determine if we're editing a plan or report based on first block type
is_plan_edit = isinstance(suggestion.blocks[0], ReportPlanBlock)
# Get current content
report_response = await self.aget()
current_blocks = (
report_response.plan.blocks
if is_plan_edit
else report_response.report.blocks
)
# Track the edit
new_blocks = []
for edit_block in suggestion.blocks:
# Find matching block in current content
old_block = next(
(
b
for b in current_blocks
if self._get_block_idx(b) == self._get_block_idx(edit_block)
),
None,
)
old_content = (
self._get_block_content(old_block) if old_block else "[No old content]"
)
new_content = self._get_block_content(edit_block)
if is_plan_edit:
new_queries_str = "\n".join(
[
f"Field: {q.field}, Prompt: {q.prompt}, Context: {q.context}"
for q in edit_block.queries
]
)
new_dependency_str = (
f"Depends on: {edit_block.dependency}"
if edit_block.dependency
else ""
)
new_content += f"\n\n{new_queries_str}\n{new_dependency_str}"
if old_block:
old_queries_str = "\n".join(
[
f"Field: {q.field}, Prompt: {q.prompt}, Context: {q.context}"
for q in old_block.queries
]
)
old_dependency_str = (
f"Depends on: {old_block.dependency}"
if old_block.dependency
else ""
)
old_content += f"\n\n{old_queries_str}\n{old_dependency_str}"
self.edit_history.append(
EditAction(
block_idx=self._get_block_idx(edit_block),
old_content=old_content,
new_content=new_content,
action="approved",
timestamp=datetime.now(),
)
)
# Create updated block
if is_plan_edit:
new_blocks.append(
ReportPlanBlock(
block=ReportBlock(
idx=edit_block.block.idx,
template=self._get_block_content(edit_block),
sources=edit_block.block.sources,
),
queries=edit_block.queries,
dependency=edit_block.dependency,
)
)
else:
new_blocks.append(
ReportBlock(
idx=edit_block.idx,
template=self._get_block_content(edit_block),
sources=edit_block.sources,
)
)
if new_blocks:
if is_plan_edit:
# Update plan in place
plan = report_response.plan
# Replace edited blocks and add new ones
for new_block in new_blocks:
block_idx = self._get_block_idx(new_block)
existing_block_idx = next(
(
i
for i, b in enumerate(plan.blocks)
if b.block.idx == block_idx
),
None,
)
if existing_block_idx is not None:
# Replace existing block
plan.blocks[existing_block_idx] = new_block
else:
# Add new block to end
plan.blocks.append(new_block)
await self.aupdate_plan("edit", plan)
else:
# Update report in place
report = report_response.report
# Replace edited blocks and add new ones
for new_block in new_blocks:
block_idx = self._get_block_idx(new_block)
existing_block_idx = next(
(i for i, b in enumerate(report.blocks) if b.idx == block_idx),
None,
)
if existing_block_idx is not None:
# Replace existing block
report.blocks[existing_block_idx] = new_block
else:
# Add new block to end
report.blocks.append(new_block)
await self.aupdate_report(report)
def accept_edit(self, suggestion: EditSuggestion) -> None:
"""Synchronous wrapper for accepting an edit."""
return self._run_sync(self.aaccept_edit(suggestion))
async def areject_edit(self, suggestion: EditSuggestion) -> None:
"""Reject a suggested edit.
Args:
suggestion: The EditSuggestion to reject, typically from suggest_edits()
"""
# Track the rejections
for edit_block in suggestion.blocks:
self.edit_history.append(
EditAction(
block_idx=self._get_block_idx(edit_block),
old_content=self._get_block_content(edit_block),
new_content=None,
action="rejected",
timestamp=datetime.now(),
)
)
def reject_edit(self, suggestion: EditSuggestion) -> None:
"""Synchronous wrapper for rejecting an edit."""
return self._run_sync(self.areject_edit(suggestion))
def _get_edit_summary_after_message(
self, message_timestamp: datetime
) -> Optional[str]:
"""Get a summary of edits that occurred after a specific message."""
relevant_edits = [
edit for edit in self.edit_history if edit.timestamp > message_timestamp
]
if not relevant_edits:
return None
approved = [edit for edit in relevant_edits if edit.action == "approved"]
rejected = [edit for edit in relevant_edits if edit.action == "rejected"]
summary = []
if approved:
summary.append("Approved edits:")
for edit in approved:
summary.append(
f'Block {edit.block_idx}: "{edit.old_content}" -> "{edit.new_content}"'
)
if rejected:
if approved: # Add spacing if we had approved edits
summary.append("")
summary.append("Rejected edits:")
for edit in rejected:
summary.append(f'Block {edit.block_idx}: "{edit.old_content}"')
return "\n".join(summary)
    async def aget_events(
        self, last_sequence: Optional[int] = None
    ) -> List[ReportEventItemEventData_Progress]:
        """Get the progress events for this report asynchronously.

        Events of any other type are skipped.

        Args:
            last_sequence: If provided, only get events after this sequence number
        Returns:
            List of ReportEventItemEventData_Progress objects
        """
        extra_params = []
        if last_sequence is not None:
            extra_params.append(f"last_sequence={last_sequence}")
        url = await self._build_url(
            f"/api/v1/reports/{self.report_id}/events", extra_params
        )
        response = await self.aclient.get(url, headers=self._headers)
        response.raise_for_status()
        progress_events = []
        for event in response.json():
            if event["event_type"] == "progress":
                progress_events.append(
                    ReportEventItemEventData_Progress(**event["event_data"])
                )
        return progress_events
def get_events(
self, last_sequence: Optional[int] = None
) -> List[ReportEventItemEventData_Progress]:
"""Synchronous wrapper for getting report events."""
return self._run_sync(self.aget_events(last_sequence))
+165
@@ -0,0 +1,165 @@
# LlamaParse
[![PyPI - Downloads](https://img.shields.io/pypi/dm/llama-parse)](https://pypi.org/project/llama-parse/)
[![GitHub contributors](https://img.shields.io/github/contributors/run-llama/llama_parse)](https://github.com/run-llama/llama_parse/graphs/contributors)
[![Discord](https://img.shields.io/discord/1059199217496772688)](https://discord.gg/dGcwcsnxhU)
LlamaParse is a **GenAI-native document parser** that can parse complex document data for any downstream LLM use case (RAG, agents).
It is really good at the following:
- **Broad file type support**: Parsing a variety of unstructured file types (.pdf, .pptx, .docx, .xlsx, .html) with text, tables, visual elements, weird layouts, and more.
- **Table recognition**: Parsing embedded tables accurately into text and semi-structured representations.
- **Multimodal parsing and chunking**: Extracting visual elements (images/diagrams) into structured formats and returning image chunks using the latest multimodal models.
- **Custom parsing**: Accepting custom prompt instructions to customize the output the way you want it.
LlamaParse directly integrates with [LlamaIndex](https://github.com/run-llama/llama_index).
The free plan covers up to 1000 pages a day. The paid plan includes 7k free pages per week, plus 0.3c per additional page by default. A sandbox is available to test the API: [**https://cloud.llamaindex.ai/parse ↗**](https://cloud.llamaindex.ai/parse).
Read below for some quickstart information, or see the [full documentation](https://docs.cloud.llamaindex.ai/).
If you're a company interested in enterprise RAG solutions, and/or high volume/on-prem usage of LlamaParse, come [talk to us](https://www.llamaindex.ai/contact).
## Getting Started
First, log in and get an API key from [**https://cloud.llamaindex.ai/api-key ↗**](https://cloud.llamaindex.ai/api-key).
Then, make sure you have the latest LlamaIndex version installed.
**NOTE:** If you are upgrading from v0.9.X, we recommend following our [migration guide](https://pretty-sodium-5e0.notion.site/v0-10-0-Migration-Guide-6ede431dcb8841b09ea171e7f133bd77), as well as uninstalling your previous version first.
```
pip uninstall llama-index # run this if upgrading from v0.9.x or older
pip install -U llama-index --upgrade --no-cache-dir --force-reinstall
```
Lastly, install the package:
`pip install llama-parse`
Now you can parse your first PDF file using the command line interface. Use the command `llama-parse [file_paths]`. See the help text with `llama-parse --help`.
```bash
export LLAMA_CLOUD_API_KEY='llx-...'
# output as text
llama-parse my_file.pdf --result-type text --output-file output.txt
# output as markdown
llama-parse my_file.pdf --result-type markdown --output-file output.md
# output as raw json
llama-parse my_file.pdf --output-raw-json --output-file output.json
```
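If you want to drive the same commands from a script, the argument list can be assembled in Python. The sketch below is only illustrative: `build_parse_command` is a hypothetical helper, not part of the package, and it uses only the flags shown in the commands above.

```python
from typing import List, Optional


def build_parse_command(
    file_path: str,
    output_file: str,
    result_type: Optional[str] = None,
    raw_json: bool = False,
) -> List[str]:
    """Build an argument list for the `llama-parse` CLI (hypothetical helper)."""
    cmd = ["llama-parse", file_path]
    if raw_json:
        # mirrors: llama-parse my_file.pdf --output-raw-json --output-file ...
        cmd.append("--output-raw-json")
    elif result_type is not None:
        # mirrors: llama-parse my_file.pdf --result-type markdown --output-file ...
        cmd += ["--result-type", result_type]
    cmd += ["--output-file", output_file]
    return cmd


if __name__ == "__main__":
    print(build_parse_command("my_file.pdf", "output.md", result_type="markdown"))
    # To actually run it, pass the list to subprocess.run(cmd, check=True)
    # with LLAMA_CLOUD_API_KEY set in the environment.
```

Building a list rather than a shell string avoids quoting problems with file names that contain spaces.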
You can also create simple scripts:
```python
import nest_asyncio
nest_asyncio.apply()
from llama_parse import LlamaParse
parser = LlamaParse(
    api_key="llx-...",  # can also be set in your env as LLAMA_CLOUD_API_KEY
    result_type="markdown",  # "markdown" and "text" are available
    num_workers=4,  # if multiple files passed, split in `num_workers` API calls
    verbose=True,
    language="en",  # Optionally you can define a language, default=en
)
# sync
documents = parser.load_data("./my_file.pdf")
# sync batch
documents = parser.load_data(["./my_file1.pdf", "./my_file2.pdf"])
# async
documents = await parser.aload_data("./my_file.pdf")
# async batch
documents = await parser.aload_data(["./my_file1.pdf", "./my_file2.pdf"])
```
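The `num_workers` comment above describes splitting a batch across several concurrent API calls. As a rough illustration of that idea (not LlamaParse's actual implementation), the sketch below bounds concurrency with an `asyncio.Semaphore`. `fake_parse` and `parse_all` are hypothetical stand-ins that make no network requests.

```python
import asyncio
from typing import List


async def fake_parse(path: str) -> str:
    """Stand-in for a real parse call (hypothetical; no network access)."""
    await asyncio.sleep(0.01)
    return f"parsed:{path}"


async def parse_all(paths: List[str], num_workers: int = 4) -> List[str]:
    """Parse every path with at most `num_workers` calls in flight."""
    semaphore = asyncio.Semaphore(num_workers)

    async def worker(path: str) -> str:
        async with semaphore:
            return await fake_parse(path)

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(worker(p) for p in paths))


if __name__ == "__main__":
    results = asyncio.run(parse_all(["a.pdf", "b.pdf", "c.pdf"], num_workers=2))
    print(results)
```

In a plain script, `asyncio.run` is enough to drive async code. `nest_asyncio` is mainly useful where an event loop is already running, such as in a notebook.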
## Using with file object
You can parse a file object directly:
```python
import nest_asyncio
nest_asyncio.apply()
from llama_parse import LlamaParse
parser = LlamaParse(
    api_key="llx-...",  # can also be set in your env as LLAMA_CLOUD_API_KEY
    result_type="markdown",  # "markdown" and "text" are available
    num_workers=4,  # if multiple files passed, split in `num_workers` API calls
    verbose=True,
    language="en",  # Optionally you can define a language, default=en
)
file_name = "my_file1.pdf"
extra_info = {"file_name": file_name}
with open(f"./{file_name}", "rb") as f:
    # extra_info with a file_name key is required when passing a file object
    documents = parser.load_data(f, extra_info=extra_info)
# you can also pass file bytes directly
with open(f"./{file_name}", "rb") as f:
    file_bytes = f.read()
    # extra_info with a file_name key is required when passing file bytes
    documents = parser.load_data(file_bytes, extra_info=extra_info)
```
## Using with `SimpleDirectoryReader`
You can also integrate the parser as the default PDF loader in `SimpleDirectoryReader`:
```python
import nest_asyncio
nest_asyncio.apply()
from llama_parse import LlamaParse
from llama_index.core import SimpleDirectoryReader
parser = LlamaParse(
    api_key="llx-...",  # can also be set in your env as LLAMA_CLOUD_API_KEY
    result_type="markdown",  # "markdown" and "text" are available
    verbose=True,
)
file_extractor = {".pdf": parser}
documents = SimpleDirectoryReader(
    "./data", file_extractor=file_extractor
).load_data()
```
Full documentation for `SimpleDirectoryReader` can be found on the [LlamaIndex Documentation](https://docs.llamaindex.ai/en/stable/module_guides/loading/simpledirectoryreader.html).
## Examples
Several end-to-end indexing examples can be found in the examples folder:
- [Getting Started](examples/demo_basic.ipynb)
- [Advanced RAG Example](examples/demo_advanced.ipynb)
- [Raw API Usage](examples/demo_api.ipynb)
## Documentation
[https://docs.cloud.llamaindex.ai/](https://docs.cloud.llamaindex.ai/)
## Terms of Service
See the [Terms of Service Here](./TOS.pdf).
## Get in Touch (LlamaCloud)
LlamaParse is part of LlamaCloud, our e2e enterprise RAG platform that provides out-of-the-box, production-ready connectors, indexing, and retrieval over your complex data sources. We offer SaaS and VPC options.
LlamaCloud is currently available via waitlist (join by [creating an account](https://cloud.llamaindex.ai/)). If you're interested in state-of-the-art quality and in centralizing your RAG efforts, come [get in touch with us](https://www.llamaindex.ai/contact).
-3
@@ -1,3 +0,0 @@
from llama_parse.base import LlamaParse, ResultType
__all__ = ["LlamaParse", "ResultType"]
+3
@@ -0,0 +1,3 @@
from llama_cloud_services.parse import LlamaParse, ResultType
__all__ = ["LlamaParse", "ResultType"]
+19
@@ -0,0 +1,19 @@
from llama_cloud_services.parse.base import (
LlamaParse,
ResultType,
FileInput,
_DEFAULT_SEPARATOR,
JOB_RESULT_URL,
JOB_STATUS_ROUTE,
JOB_UPLOAD_ROUTE,
)
__all__ = [
"LlamaParse",
"ResultType",
"FileInput",
"_DEFAULT_SEPARATOR",
"JOB_RESULT_URL",
"JOB_STATUS_ROUTE",
"JOB_UPLOAD_ROUTE",
]
+4
@@ -0,0 +1,4 @@
from llama_cloud_services.parse.cli.main import parse
if __name__ == "__main__":
parse()
+11
@@ -0,0 +1,11 @@
from llama_cloud_services.parse.utils import (
SUPPORTED_FILE_TYPES,
Language,
ResultType,
)
__all__ = [
"SUPPORTED_FILE_TYPES",
"Language",
"ResultType",
]
+24
@@ -0,0 +1,24 @@
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
[tool.poetry]
name = "llama-parse"
version = "0.5.21"
description = "Parse files into RAG-Optimized formats."
authors = ["Logan Markewich <logan@llamaindex.ai>"]
license = "MIT"
readme = "README.md"
packages = [{include = "llama_parse"}]
[tool.poetry.dependencies]
python = ">=3.9,<4.0"
llama-cloud-services = "*"
[tool.poetry.group.dev.dependencies]
pytest = "^8.0.0"
pytest-asyncio = "*"
ipykernel = "^6.29.0"
[tool.poetry.scripts]
llama-parse = "llama_parse.cli.main:parse"
+142
@@ -0,0 +1,142 @@
# LlamaParse
LlamaParse is a **GenAI-native document parser** that can parse complex document data for any downstream LLM use case (RAG, agents).
It is really good at the following:
- **Broad file type support**: Parsing a variety of unstructured file types (.pdf, .pptx, .docx, .xlsx, .html) with text, tables, visual elements, weird layouts, and more.
- **Table recognition**: Parsing embedded tables accurately into text and semi-structured representations.
- **Multimodal parsing and chunking**: Extracting visual elements (images/diagrams) into structured formats and returning image chunks using the latest multimodal models.
- **Custom parsing**: Accepting custom prompt instructions to customize the output the way you want it.
LlamaParse directly integrates with [LlamaIndex](https://github.com/run-llama/llama_index).
The free plan covers up to 1000 pages a day. The paid plan includes 7k free pages per week, plus 0.3c per additional page by default. A sandbox is available to test the API: [**https://cloud.llamaindex.ai/parse ↗**](https://cloud.llamaindex.ai/parse).
Read below for some quickstart information, or see the [full documentation](https://docs.cloud.llamaindex.ai/).
If you're a company interested in enterprise RAG solutions, and/or high volume/on-prem usage of LlamaParse, come [talk to us](https://www.llamaindex.ai/contact).
## Getting Started
First, log in and get an API key from [**https://cloud.llamaindex.ai/api-key ↗**](https://cloud.llamaindex.ai/api-key).
Then, install the package:
`pip install llama-cloud-services`
Now you can parse your first PDF file using the command line interface. Use the command `llama-parse [file_paths]`. See the help text with `llama-parse --help`.
```bash
export LLAMA_CLOUD_API_KEY='llx-...'
# output as text
llama-parse my_file.pdf --result-type text --output-file output.txt
# output as markdown
llama-parse my_file.pdf --result-type markdown --output-file output.md
# output as raw json
llama-parse my_file.pdf --output-raw-json --output-file output.json
```
You can also create simple scripts:
```python
import nest_asyncio
nest_asyncio.apply()
from llama_cloud_services import LlamaParse
parser = LlamaParse(
    api_key="llx-...",  # can also be set in your env as LLAMA_CLOUD_API_KEY
    result_type="markdown",  # "markdown" and "text" are available
    num_workers=4,  # if multiple files passed, split in `num_workers` API calls
    verbose=True,
    language="en",  # Optionally you can define a language, default=en
)
# sync
documents = parser.load_data("./my_file.pdf")
# sync batch
documents = parser.load_data(["./my_file1.pdf", "./my_file2.pdf"])
# async
documents = await parser.aload_data("./my_file.pdf")
# async batch
documents = await parser.aload_data(["./my_file1.pdf", "./my_file2.pdf"])
```
## Using with file object
You can parse a file object directly:
```python
import nest_asyncio
nest_asyncio.apply()
from llama_cloud_services import LlamaParse
parser = LlamaParse(
    api_key="llx-...",  # can also be set in your env as LLAMA_CLOUD_API_KEY
    result_type="markdown",  # "markdown" and "text" are available
    num_workers=4,  # if multiple files passed, split in `num_workers` API calls
    verbose=True,
    language="en",  # Optionally you can define a language, default=en
)
file_name = "my_file1.pdf"
extra_info = {"file_name": file_name}
with open(f"./{file_name}", "rb") as f:
    # extra_info with a file_name key is required when passing a file object
    documents = parser.load_data(f, extra_info=extra_info)
# you can also pass file bytes directly
with open(f"./{file_name}", "rb") as f:
    file_bytes = f.read()
    # extra_info with a file_name key is required when passing file bytes
    documents = parser.load_data(file_bytes, extra_info=extra_info)
```
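Because `extra_info` must carry a `file_name` key whenever bytes are passed, it can help to derive both from the path in one place. The helper below is a minimal sketch using only the standard library. `load_bytes_with_extra_info` is a hypothetical name, and the call to `parser.load_data` is left as a comment so the snippet runs without an API key.

```python
import tempfile
from pathlib import Path
from typing import Dict, Tuple


def load_bytes_with_extra_info(path: str) -> Tuple[bytes, Dict[str, str]]:
    """Read a file and build the matching `extra_info` dict (hypothetical helper)."""
    file_path = Path(path)
    # file_name is taken from the last path component, e.g. "my_file1.pdf"
    extra_info = {"file_name": file_path.name}
    return file_path.read_bytes(), extra_info


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "my_file1.pdf"
        sample.write_bytes(b"%PDF-1.4 example")
        file_bytes, extra_info = load_bytes_with_extra_info(str(sample))
        print(len(file_bytes), extra_info)
        # documents = parser.load_data(file_bytes, extra_info=extra_info)
```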
## Using with `SimpleDirectoryReader`
You can also integrate the parser as the default PDF loader in `SimpleDirectoryReader`:
```python
import nest_asyncio
nest_asyncio.apply()
from llama_cloud_services import LlamaParse
from llama_index.core import SimpleDirectoryReader
parser = LlamaParse(
    api_key="llx-...",  # can also be set in your env as LLAMA_CLOUD_API_KEY
    result_type="markdown",  # "markdown" and "text" are available
    verbose=True,
)
file_extractor = {".pdf": parser}
documents = SimpleDirectoryReader(
    "./data", file_extractor=file_extractor
).load_data()
```
Full documentation for `SimpleDirectoryReader` can be found on the [LlamaIndex Documentation](https://docs.llamaindex.ai/en/stable/module_guides/loading/simpledirectoryreader.html).
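The `file_extractor` mapping above covers only `.pdf`. If you want the same parser applied to the other file types named in the feature list, one way is to build the mapping from a list of extensions. This is a sketch under assumptions: `build_file_extractor` and `PARSED_EXTENSIONS` are hypothetical names, and the `parser` argument is whatever `LlamaParse` instance you created.

```python
from typing import Any, Dict, Iterable

# Extensions named in the feature list above; adjust to your own data.
PARSED_EXTENSIONS = (".pdf", ".pptx", ".docx", ".xlsx", ".html")


def build_file_extractor(
    parser: Any, extensions: Iterable[str] = PARSED_EXTENSIONS
) -> Dict[str, Any]:
    """Map each extension (with a leading dot, lowercased) to the same parser."""
    normalized = (ext if ext.startswith(".") else f".{ext}" for ext in extensions)
    return {ext.lower(): parser for ext in normalized}


if __name__ == "__main__":
    placeholder_parser = object()  # stands in for a LlamaParse instance
    print(sorted(build_file_extractor(placeholder_parser)))
    # SimpleDirectoryReader("./data", file_extractor=build_file_extractor(parser))
```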
## Examples
Several end-to-end indexing examples can be found in the examples folder:
- [Getting Started](examples/parse/demo_basic.ipynb)
- [Advanced RAG Example](examples/parse/demo_advanced.ipynb)
- [Raw API Usage](examples/parse/demo_api.ipynb)
## Documentation
[https://docs.cloud.llamaindex.ai/](https://docs.cloud.llamaindex.ai/)
Generated
+2289 -723
File diff suppressed because it is too large
+19 -6
@@ -2,25 +2,38 @@
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
[tool.mypy]
files = ["llama_cloud_services"]
python_version = "3.10"
[tool.poetry]
-name = "llama-parse"
-version = "0.5.20"
-description = "Parse files into RAG-Optimized formats."
-authors = ["Logan Markewich <logan@llamaindex.ai>"]
+name = "llama-cloud-services"
+version = "0.1.0"
+description = "Tailored SDK clients for LlamaCloud services."
+authors = ["Logan Markewich <logan@runllama.ai>"]
license = "MIT"
readme = "README.md"
-packages = [{include = "llama_parse"}]
+packages = [{include = "llama_cloud_services"}]
[tool.poetry.dependencies]
python = ">=3.9,<4.0"
llama-index-core = ">=0.11.0"
llama-cloud = "^0.1.11"
pydantic = "!=2.10"
click = "^8.1.7"
python-dotenv = "^1.0.1"
eval-type-backport = {python = "<3.10", version = "^0.2.0"}
[tool.poetry.group.dev.dependencies]
pytest = "^8.0.0"
pytest-asyncio = "*"
ipykernel = "^6.29.0"
pre-commit = "3.2.0"
autoevals = "^0.0.114"
deepdiff = "^8.1.1"
ipython = "^8.12.3"
jupyter = "^1.1.1"
mypy = "^1.14.1"
[tool.poetry.scripts]
-llama-parse = "llama_parse.cli.main:parse"
+llama-parse = "llama_cloud_services.parse.cli.main:parse"

Some files were not shown because too many files have changed in this diff