fix: add client-side resilience for S3 transaction race condition

Address the S3 race condition (LI-5775) where the backend DB may commit a SUCCESS status before the S3 data write fully propagates, causing transient 500 errors or null data when fetching extraction runs.

Changes:
- Add _get_run_with_data_check() wrapper in _wait_for_job_result that detects SUCCESS runs with null data and retries with exponential backoff (2s, 4s, 8s) to allow S3 writes to propagate
- Increase run_retry_attempts from 3 to 5 and run_max_wait from 4/20 to 30 for more resilient handling of transient 500 errors from the backend's S3 read failures
- Add warning messages when the race condition pattern is detected to aid debugging

Co-authored-by: George He <georgewho96@gmail.com>
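
The backoff described in this commit message can be sketched in isolation. This is a minimal illustration, not the SDK's code: `FakeRun`, `backoff_delays`, and `fetch_with_data_check` are hypothetical names, and the `sleep` parameter is an assumption added so the logic can be exercised without real waiting.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, List, Optional


@dataclass
class FakeRun:
    """Hypothetical stand-in for ExtractRun; only the two fields used here."""

    status: str
    data: Optional[Any]


def backoff_delays(retries: int = 3, initial_delay: float = 2.0) -> List[float]:
    """Delay before each retry: initial_delay * 2**attempt."""
    return [initial_delay * (2**attempt) for attempt in range(retries)]


async def fetch_with_data_check(
    fetch: Callable[[], Awaitable[FakeRun]],
    retries: int = 3,
    initial_delay: float = 2.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> FakeRun:
    """Re-fetch while the run reports SUCCESS but carries no data."""
    for delay in backoff_delays(retries, initial_delay):
        run = await fetch()
        # Anything other than "SUCCESS with no data" is returned as-is.
        if run.data is not None or run.status != "SUCCESS":
            return run
        await sleep(delay)
    # Final attempt; the caller sees whatever the backend returns.
    return await fetch()
```

With the defaults, the delays are 2, 4, and 8 seconds, so this check adds at most 14 seconds of waiting before the final fetch.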
More robust extract tests with pytest xdist (#1117)
2026-07-01 21:44:37 -04:00 · 2026-02-17 21:33:10 +00:00 · 2026-02-16 16:16:15 -08:00 · 2026-02-14 21:03:05 -06:00 · 2026-02-13 15:29:09 -08:00 · 2026-02-13 15:20:52 -08:00
14 changed files with 164 additions and 108 deletions
@@ -1,8 +1,8 @@
-name: Hourly Extract E2E Tests
+name: Extract E2E Tests (every 4 hours)

 on:
  schedule:
-    - cron: "18 * * * *"
+    - cron: "0 */4 * * *"
  workflow_dispatch:
    # Allows manual triggering
    inputs:
@@ -29,7 +29,7 @@ env:

 jobs:
  extract-e2e:
-    name: "Hourly Extract E2E Tests (${{ matrix.environment }})"
+    name: "Extract E2E Tests (${{ matrix.environment }})"
    runs-on: ubuntu-latest
    timeout-minutes: 30
    concurrency:
@@ -149,7 +149,7 @@ jobs:
      - name: Post to Extract Slack channel
        id: slack
        if: (failure() || cancelled()) && steps.runtime.outputs.notify_slack == 'true'
-        uses: slackapi/slack-github-action@v1.27.0
+        uses: slackapi/slack-github-action@v2.1.1
        with:
          channel-id: ${{ env.SLACK_CHANNEL_ID }}
          slack-message: |
@@ -1,5 +1,17 @@
 # llama-cloud-services-py

+## 0.6.94
+
+### Patch Changes
+
+- 232c55b: Include xlsx files in extract input
+
+## 0.6.93
+
+### Patch Changes
+
+- da1916c: Add more warnings
+
 ## 0.6.92

 ### Patch Changes
@@ -4,6 +4,16 @@

 # Llama Cloud Services

+> **⚠️ DEPRECATION NOTICE**
+>
+> This repository and its packages are deprecated and will be maintained until **May 1, 2026**.
+>
+> **Please migrate to the new packages:**
+> - **Python**: `pip install llama-cloud>=1.0` ([GitHub](https://github.com/run-llama/llama-cloud-py))
+> - **TypeScript**: `npm install @llamaindex/llama-cloud` ([GitHub](https://github.com/run-llama/llama-cloud-ts))
+>
+> The new packages provide the same functionality with improved performance, better support, and active development.
+
 This repository contains the code for hand-written SDKs and clients for interacting with LlamaCloud.

 This includes:
@@ -121,11 +121,19 @@ async def _wait_for_job_result(
    job_retry_attempts: int = 5,
    job_max_wait: float = 60,
    job_jitter: float = 5,
-    run_retry_attempts: int = 3,
-    run_max_wait: float = 20,
+    run_retry_attempts: int = 5,
+    run_max_wait: float = 30,
    run_jitter: float = 3,
+    data_availability_retries: int = 3,
+    data_availability_initial_delay: float = 2.0,
 ) -> Optional[ExtractRun]:
-    """Wait for and return the results of an extraction job."""
+    """Wait for and return the results of an extraction job.
+
+    Includes resilience against a known S3 race condition on the backend where
+    the database may be updated with a SUCCESS status before the S3 data write
+    fully propagates. This can cause transient 500 errors or runs with null data.
+    The retry parameters and data availability checks mitigate this.
+    """

    @_async_retry(
        max_attempts=job_retry_attempts, max_wait=job_max_wait, jitter=job_jitter
@@ -143,6 +151,28 @@ async def _wait_for_job_result(
            organization_id=organization_id,
        )

+    async def _get_run_with_data_check() -> ExtractRun:
+        """Fetch the extraction run, retrying if data is missing due to S3 race condition.
+
+        When the backend has status SUCCESS but the S3 data write hasn't propagated,
+        the run may come back with data=None or the API may return a 500 error.
+        The 500 case is handled by the @_async_retry on _get_run(). This wrapper
+        handles the data=None case by retrying with exponential backoff.
+        """
+        for attempt in range(data_availability_retries):
+            run = await _get_run()
+            if run.data is not None or run.status != StatusEnum.SUCCESS:
+                return run
+            delay = data_availability_initial_delay * (2**attempt)
+            warnings.warn(
+                f"Extraction run for job {job_id} has status SUCCESS but data is "
+                f"not yet available (possible S3 race condition). "
+                f"Retrying in {delay:.1f}s "
+                f"(attempt {attempt + 1}/{data_availability_retries})..."
+            )
+            await asyncio.sleep(delay)
+        return await _get_run()
+
    start = time.perf_counter()
    poll_count = 0

@@ -152,7 +182,7 @@ async def _wait_for_job_result(
        job = await _get_job()

        if job.status == StatusEnum.SUCCESS:
-            return await _get_run()
+            return await _get_run_with_data_check()
        elif job.status == StatusEnum.PENDING:
            end = time.perf_counter()
            if end - start > max_timeout:
@@ -323,8 +353,8 @@ class ExtractionAgent:
            job_retry_attempts=5,
            job_max_wait=60,
            job_jitter=5,
-            run_retry_attempts=3,
-            run_max_wait=20,
+            run_retry_attempts=5,
+            run_max_wait=30,
            run_jitter=3,
        )

@@ -783,8 +813,8 @@ class LlamaExtract(BaseComponent):
            job_retry_attempts=3,
            job_max_wait=4,
            job_jitter=5,
-            run_retry_attempts=3,
-            run_max_wait=4,
+            run_retry_attempts=5,
+            run_max_wait=30,
            run_jitter=3,
        )

@@ -806,6 +836,7 @@ class LlamaExtract(BaseComponent):
            # Document files
            ".pdf": "application/pdf",
            ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
+            ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
            # Image files
            ".png": "image/png",
            ".jpg": "image/jpeg",
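
As a rough illustration of how an extension-to-MIME mapping like the one in this hunk might be consulted, the sketch below copies the entries visible in the diff into a standalone dictionary. `SUPPORTED_MIME_TYPES` and `lookup_mime_type` are hypothetical names, and lowercasing the suffix is an assumption; the SDK's actual lookup is not shown in this diff.

```python
from pathlib import Path
from typing import Optional

# Subset of the mapping shown in the hunk above, including the new .xlsx entry.
SUPPORTED_MIME_TYPES = {
    ".pdf": "application/pdf",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".png": "image/png",
    ".jpg": "image/jpeg",
}


def lookup_mime_type(filename: str) -> Optional[str]:
    """Return the MIME type for a listed extension, or None if it is not listed."""
    # Assumption: extensions are matched case-insensitively.
    return SUPPORTED_MIME_TYPES.get(Path(filename).suffix.lower())
```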
@@ -1,5 +1,21 @@
 # llama_parse

+## 0.6.94
+
+### Patch Changes
+
+- 232c55b: Include xlsx files in extract input
+- Updated dependencies [232c55b]
+  - llama-cloud-services-py@0.6.94
+
+## 0.6.93
+
+### Patch Changes
+
+- da1916c: Add more warnings
+- Updated dependencies [da1916c]
+  - llama-cloud-services-py@0.6.93
+
 ## 0.6.92

 ### Patch Changes
@@ -1,5 +1,15 @@
 # LlamaParse

+> **⚠️ DEPRECATION NOTICE**
+>
+> This repository and its packages are deprecated and will be maintained until **May 1, 2026**.
+>
+> **Please migrate to the new packages:**
+> - **Python**: `pip install llama-cloud>=1.0` ([GitHub](https://github.com/run-llama/llama-cloud-py))
+> - **TypeScript**: `npm install @llamaindex/llama-cloud` ([GitHub](https://github.com/run-llama/llama-cloud-ts))
+>
+> The new packages provide the same functionality with improved performance, better support, and active development.
+
 [![PyPI - Downloads](https://img.shields.io/pypi/dm/llama-parse)](https://pypi.org/project/llama-parse/)
 [![GitHub contributors](https://img.shields.io/github/contributors/run-llama/llama_parse)](https://github.com/run-llama/llama_parse/graphs/contributors)
 [![Discord](https://img.shields.io/discord/1059199217496772688)](https://discord.gg/dGcwcsnxhU)
@@ -1,8 +1,18 @@
-from llama_cloud_services.parse import (
+import warnings
+from llama_cloud_services.parse import (  # type: ignore[attr-defined]
    LlamaParse,
    ResultType,
    ParsingMode,
    FailedPageMode,
 )

+warnings.warn(
+    "The 'llama-parse' package is deprecated and will no longer receive updates. "
+    "Please migrate to the new unified SDK. "
+    "See https://developers.llamaindex.ai/python/cloud/llamaparse/getting_started/ "
+    "and https://github.com/run-llama/llama-cloud-py/blob/main/README.md for migration instructions.",
+    DeprecationWarning,
+    stacklevel=2,
+)
+
 __all__ = ["LlamaParse", "ResultType", "ParsingMode", "FailedPageMode"]
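
Python's default warning filters generally hide `DeprecationWarning` unless it is attributed to code in `__main__`, so callers may not see the notice added above. The sketch below shows one way a test could record such a warning. `emit_deprecation` is a hypothetical stand-in for the module-level `warnings.warn` call and uses a shortened message; it does not import `llama_parse`.

```python
import warnings
from typing import List


def emit_deprecation() -> None:
    """Hypothetical stand-in for the module-level warning in llama_parse."""
    warnings.warn(
        "The 'llama-parse' package is deprecated and will no longer receive updates.",
        DeprecationWarning,
        stacklevel=2,
    )


def collect_deprecations() -> List[warnings.WarningMessage]:
    """Record DeprecationWarnings raised while calling emit_deprecation()."""
    with warnings.catch_warnings(record=True) as caught:
        # Override the default filters so the warning is always recorded.
        warnings.simplefilter("always")
        emit_deprecation()
    return [w for w in caught if issubclass(w.category, DeprecationWarning)]
```

Outside of tests, running Python with `-W default` or setting `PYTHONWARNINGS=default` has a similar effect of surfacing the warning.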
@@ -1,6 +1,6 @@
 {
  "name": "llama_parse",
-  "version": "0.6.92",
+  "version": "0.6.94",
  "description": "",
  "main": "index.js",
  "private": false,
@@ -11,13 +11,13 @@ dev = [

 [project]
 name = "llama-parse"
-version = "0.6.92"
+version = "0.6.94"
 description = "Parse files into RAG-Optimized formats."
 authors = [{name = "Logan Markewich", email = "logan@llamaindex.ai"}]
 requires-python = ">=3.9,<4.0"
 readme = "README.md"
 license = "MIT"
-dependencies = ["llama-cloud-services>=0.6.92"]
+dependencies = ["llama-cloud-services>=0.6.94"]

 [project.scripts]
 llama-parse = "llama_parse.cli.main:parse"
@@ -1,6 +1,6 @@
 {
  "name": "llama-cloud-services-py",
-  "version": "0.6.92",
+  "version": "0.6.94",
  "private": false,
  "license": "MIT",
  "scripts": {},
@@ -23,7 +23,7 @@ dev = [

 [project]
 name = "llama-cloud-services"
-version = "0.6.92"
+version = "0.6.94"
 description = "Tailored SDK clients for LlamaCloud services."
 authors = [{name = "Logan Markewich", email = "logan@runllama.ai"}]
 requires-python = ">=3.9,<4.0"
@@ -1,5 +1,4 @@
-import os
-from typing import Any, Dict, List, Optional, Union
+from typing import Any, Dict, Optional, Union

 from llama_cloud.core.api_error import ApiError
 from llama_cloud.types import ExtractConfig
@@ -13,9 +12,6 @@ from tenacity import (

 from llama_cloud_services.extract import ExtractionAgent, LlamaExtract

-# Global storage for agents to cleanup
-_TEST_AGENTS_TO_CLEANUP: List[str] = []
-

 def _is_rate_limit_error(exception: BaseException) -> bool:
    """Check if the exception is a rate limit error (429)."""
@@ -42,38 +38,3 @@ def pytest_configure(config):
    """Register custom markers for extract tests."""
    config.addinivalue_line("markers", "agent_name: custom agent name for test")
    config.addinivalue_line("markers", "agent_schema: custom agent schema for test")
-
-
-def pytest_sessionfinish(session, exitstatus):
-    """Hook that runs after all tests complete - cleanup agents here"""
-    print(
-        f"pytest_sessionfinish hook called! Agents to cleanup: {_TEST_AGENTS_TO_CLEANUP}"
-    )
-
-    if _TEST_AGENTS_TO_CLEANUP:
-        print("Creating cleanup client...")
-        # Create a fresh client just for cleanup
-        cleanup_client = LlamaExtract(
-            api_key=os.getenv("LLAMA_CLOUD_API_KEY"),
-            base_url=os.getenv("LLAMA_CLOUD_BASE_URL"),
-            project_id=os.getenv("LLAMA_CLOUD_PROJECT_ID"),
-            verbose=True,
-        )
-
-        for agent_id in _TEST_AGENTS_TO_CLEANUP:
-            try:
-                print(f"Deleting agent {agent_id}...")
-                cleanup_client.delete_agent(agent_id)
-                print(f"Cleaned up agent {agent_id}")
-            except Exception as e:
-                print(f"Warning: Failed to delete agent {agent_id}: {e}")
-
-        _TEST_AGENTS_TO_CLEANUP.clear()
-        print("Agent cleanup completed")
-    else:
-        print("No agents to cleanup")
-
-
-def register_agent_for_cleanup(agent_id: str):
-    """Register an agent ID for cleanup at the end of the test session"""
-    _TEST_AGENTS_TO_CLEANUP.append(agent_id)
@@ -1,4 +1,6 @@
 import os
+import shutil
+import uuid
 import pytest
 from pathlib import Path
 from pydantic import BaseModel
@@ -6,7 +8,7 @@ from pydantic import BaseModel
 from llama_cloud_services.extract import LlamaExtract, ExtractionAgent, SourceText
 from llama_cloud.types import ExtractConfig, ExtractMode, ExtractRun
 from tests.extract.util import load_test_dotenv
-from .conftest import register_agent_for_cleanup, create_agent_with_retry
+from .conftest import create_agent_with_retry

 load_test_dotenv()

@@ -59,17 +61,27 @@ def test_schema_dict():


 @pytest.fixture
-def test_agent(llama_extract, test_agent_name, test_schema_dict, request):
-    """Creates a test agent and collects it for cleanup at the end of all tests"""
-    test_id = request.node.nodeid
-    test_hash = hex(hash(test_id))[-8:]
-    base_name = test_agent_name
+def unique_test_pdf(tmp_path):
+    """Copy test PDF to a unique path to avoid file deduplication across parallel tests.

+    Uses a UUID in the filename so that external_file_id is unique regardless of
+    whether the full path or just the filename is sent to the backend.
+    """
+    unique_name = f"{TEST_PDF.stem}-{uuid.uuid4().hex[:8]}{TEST_PDF.suffix}"
+    unique_pdf = tmp_path / unique_name
+    shutil.copy2(TEST_PDF, unique_pdf)
+    return unique_pdf
+
+
+@pytest.fixture
+def test_agent(llama_extract, test_agent_name, test_schema_dict, request):
+    """Creates a test agent with a unique name and cleans it up after the test."""
+    unique_id = uuid.uuid4().hex[:8]
    base_name = next(
        (marker.args[0] for marker in request.node.iter_markers("agent_name")),
-        base_name,
+        test_agent_name,
    )
-    name = f"{base_name}_{test_hash}"
+    name = f"{base_name}_{unique_id}"

    schema = next(
        (
@@ -79,25 +91,20 @@ def test_agent(llama_extract, test_agent_name, test_schema_dict, request):
        test_schema_dict,
    )

-    # Cleanup existing agent
-    try:
-        for agent in llama_extract.list_agents():
-            if agent.name == name:
-                llama_extract.delete_agent(agent.id)
-    except Exception as e:
-        print(f"Warning: Failed to cleanup existing agent: {e}")
-
    # Use config with cache invalidation to ensure fresh results in tests
    config = ExtractConfig(invalidate_cache=True)
    agent = create_agent_with_retry(
        llama_extract, name=name, data_schema=schema, config=config
    )

-    # Add agent to cleanup list via conftest helper
-    register_agent_for_cleanup(agent.id)
-
    yield agent

+    # Inline cleanup -- each worker cleans up its own agents
+    try:
+        llama_extract.delete_agent(agent.id)
+    except Exception as e:
+        print(f"Warning: Failed to cleanup agent {agent.id}: {e}")
+

 class TestLlamaExtract:
    def test_init_without_api_key(self):
@@ -138,34 +145,38 @@ class TestLlamaExtract:

 class TestExtractionAgent:
    @pytest.mark.asyncio
-    async def test_extract_single_file(self, test_agent):
-        result = await test_agent.aextract(TEST_PDF)
+    async def test_extract_single_file(self, test_agent, unique_test_pdf):
+        result = await test_agent.aextract(unique_test_pdf)
        assert result.status == "SUCCESS"
        assert result.data is not None
        assert isinstance(result.data, dict)
        assert "title" in result.data
        assert "summary" in result.data

-    def test_sync_extract_single_file(self, test_agent):
-        result = test_agent.extract(TEST_PDF)
+    def test_sync_extract_single_file(self, test_agent, unique_test_pdf):
+        result = test_agent.extract(unique_test_pdf)
        assert result.status == "SUCCESS"
        assert result.data is not None
        assert isinstance(result.data, dict)
        assert "title" in result.data
        assert "summary" in result.data

-    def test_extract_file_from_buffered_io(self, test_agent):
-        result = test_agent.extract(SourceText(file=open(TEST_PDF, "rb")))
+    def test_extract_file_from_buffered_io(self, test_agent, unique_test_pdf):
+        result = test_agent.extract(
+            SourceText(file=open(unique_test_pdf, "rb"), filename=unique_test_pdf.name)
+        )
        assert result.status == "SUCCESS"
        assert result.data is not None
        assert isinstance(result.data, dict)
        assert "title" in result.data
        assert "summary" in result.data

-    def test_extract_file_from_bytes(self, test_agent):
-        with open(TEST_PDF, "rb") as f:
+    def test_extract_file_from_bytes(self, test_agent, unique_test_pdf):
+        with open(unique_test_pdf, "rb") as f:
            file_bytes = f.read()
-        result = test_agent.extract(SourceText(file=file_bytes, filename=TEST_PDF.name))
+        result = test_agent.extract(
+            SourceText(file=file_bytes, filename=unique_test_pdf.name)
+        )
        assert result.status == "SUCCESS"
        assert result.data is not None
        assert isinstance(result.data, dict)
@@ -181,7 +192,10 @@ class TestExtractionAgent:
        weight for 8 to 13 km (5–8 miles).[3] The name llama (also historically spelled
        "glama") was adopted by European settlers from native Peruvians.
        """
-        result = test_agent.extract(SourceText(text_content=TEST_TEXT))
+        unique_name = f"text-{uuid.uuid4().hex[:8]}.txt"
+        result = test_agent.extract(
+            SourceText(text_content=TEST_TEXT, filename=unique_name)
+        )
        assert result.status == "SUCCESS"
        assert result.data is not None
        assert isinstance(result.data, dict)
@@ -189,8 +203,8 @@ class TestExtractionAgent:
        assert "summary" in result.data

    @pytest.mark.asyncio
-    async def test_extract_multiple_files(self, test_agent):
-        files = [TEST_PDF, TEST_PDF]  # Using same file twice for testing
+    async def test_extract_multiple_files(self, test_agent, unique_test_pdf):
+        files = [unique_test_pdf, unique_test_pdf]  # Using same file twice for testing
        response = await test_agent.aextract(files)

        assert len(response) == 2
@@ -219,15 +233,15 @@ class TestExtractionAgent:
        updated_agent = llama_extract.get_agent(name=test_agent.name)
        assert "new_field" in updated_agent.data_schema["properties"]

-    def test_list_extraction_runs(self, test_agent: ExtractionAgent):
+    def test_list_extraction_runs(self, test_agent: ExtractionAgent, unique_test_pdf):
        assert test_agent.list_extraction_runs().total == 0
-        test_agent.extract(TEST_PDF)
+        test_agent.extract(unique_test_pdf)
        runs = test_agent.list_extraction_runs()
        assert runs.total > 0

-    def test_delete_extraction_run(self, test_agent: ExtractionAgent):
+    def test_delete_extraction_run(self, test_agent: ExtractionAgent, unique_test_pdf):
        assert test_agent.list_extraction_runs().total == 0
-        run: ExtractRun = test_agent.extract(TEST_PDF)
+        run: ExtractRun = test_agent.extract(unique_test_pdf)
        test_agent.delete_extraction_run(run.id)
        runs = test_agent.list_extraction_runs()
        assert runs.total == 0
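
The `unique_test_pdf` fixture in the hunks above relies on giving each test its own copy of the input file under a random name. The same idea can be sketched outside pytest as follows. `make_unique_copy` is a hypothetical helper, not part of the test suite, and the destination directory is supplied by the caller rather than by `tmp_path`.

```python
import shutil
import uuid
from pathlib import Path


def make_unique_copy(source: Path, dest_dir: Path) -> Path:
    """Copy source into dest_dir under a name with a random 8-hex-char suffix."""
    unique_name = f"{source.stem}-{uuid.uuid4().hex[:8]}{source.suffix}"
    target = dest_dir / unique_name
    # copy2 preserves file metadata along with the contents.
    shutil.copy2(source, target)
    return target
```

Because the suffix comes from `uuid.uuid4()`, two parallel workers copying the same source file should end up with different filenames while the file contents stay identical.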
@@ -10,7 +10,7 @@ import uuid
 from llama_cloud.types import ExtractConfig, ExtractMode
 from deepdiff import DeepDiff
 from tests.extract.util import json_subset_match_score, load_test_dotenv
-from .conftest import register_agent_for_cleanup, create_agent_with_retry
+from .conftest import create_agent_with_retry

 load_test_dotenv()

@@ -109,32 +109,24 @@ def extractor():
 @pytest.fixture
 def extraction_agent(test_case: ExtractionTestCase, extractor: LlamaExtract):
    """Fixture to create and cleanup extraction agent for each test."""
-    # Create unique name with random UUID (important for CI to avoid conflicts)
    unique_id = uuid.uuid4().hex[:8]
    agent_name = f"{test_case.name}_{unique_id}"

    with open(test_case.schema_path, "r") as f:
        schema = json.load(f)

-    # Clean up any existing agents with this name
-    try:
-        agents = extractor.list_agents()
-        for agent in agents:
-            if agent.name == agent_name:
-                extractor.delete_agent(agent.id)
-    except Exception as e:
-        print(f"Warning: Failed to cleanup existing agent: {str(e)}")
-
-    # Create new agent with retry logic for rate limiting
    agent = create_agent_with_retry(
        extractor, name=agent_name, data_schema=schema, config=test_case.config
    )

-    # Register agent for cleanup at the end of the test session
-    register_agent_for_cleanup(agent.id)
-
    yield agent

+    # Inline cleanup -- each worker cleans up its own agents
+    try:
+        extractor.delete_agent(agent.id)
+    except Exception as e:
+        print(f"Warning: Failed to cleanup agent {agent.id}: {e}")
+

 @pytest.mark.skipif(
    os.environ.get("LLAMA_CLOUD_API_KEY", "") == "",
Author	SHA1	Message	Date
Cursor Agent	a87743df0f	fix: add client-side resilience for S3 transaction race condition Address the S3 race condition (LI-5775) where the backend DB may commit a SUCCESS status before the S3 data write fully propagates, causing transient 500 errors or null data when fetching extraction runs. Changes: - Add _get_run_with_data_check() wrapper in _wait_for_job_result that detects SUCCESS runs with null data and retries with exponential backoff (2s, 4s, 8s) to allow S3 writes to propagate - Increase run_retry_attempts from 3 to 5 and run_max_wait from 4/20 to 30 for more resilient handling of transient 500 errors from the backend's S3 read failures - Add warning messages when the race condition pattern is detected to aid debugging Co-authored-by: George He <georgewho96@gmail.com>	2026-02-17 21:33:10 +00:00
Neeraj Pradhan	5ea758b853	More robust extract tests with pytest xdist (#1117)	2026-02-16 16:16:15 -08:00
dependabot[bot]	208b6f2fa5	build(deps): bump slackapi/slack-github-action from 1.27.0 to 2.1.1 (#1092) Bumps [slackapi/slack-github-action](https://github.com/slackapi/slack-github-action) from 1.27.0 to 2.1.1. - [Release notes](https://github.com/slackapi/slack-github-action/releases) - [Commits](https://github.com/slackapi/slack-github-action/compare/v1.27.0...v2.1.1) --- updated-dependencies: - dependency-name: slackapi/slack-github-action dependency-version: 2.1.1 dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-02-14 21:03:05 -06:00
github-actions[bot]	e1b9143f79	chore: version packages (#1116) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2026-02-13 15:29:09 -08:00
Neeraj Pradhan	232c55bd6a	Bump up patch version (#1115)	2026-02-13 15:20:52 -08:00
Neeraj Pradhan	ab6f2f8da5	Allows xlsx files in the sdk for extract (#1114)	2026-02-13 14:44:25 -08:00
github-actions[bot]	66c2639ec8	chore: version packages (#1112)	2026-02-11 15:18:43 -06:00
Logan	da1916c69f	more loudly deprecate ancient llama-parse package (#1111)	2026-02-11 15:16:01 -06:00
Neeraj Pradhan	345e272573	Lower frequency for e2e tests (#1110)	2026-02-11 09:07:15 -08:00