luisquintanilla c05354b2a4 Add samples README
2024-07-30 17:56:00 +01:00

ParseDocuments Sample

This sample shows the basics of parsing documents in .NET with the LlamaParse .NET client SDK from a console application.

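Prerequisites

An API key, exposed through the LLAMACLOUD_API_KEY environment variable (step 1 of the guide reads it from there).
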
Guide

  1. Configure your client

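    // Read the API key from the environment; this is null when the variable is unset.
    var apiKey = Environment.GetEnvironmentVariable("LLAMACLOUD_API_KEY");
    
    var parseConfig = new Configuration()
    {
        // Falls back to an empty key when the variable is missing.
        ApiKey = apiKey ?? string.Empty
    };
    
    var llamaParseClient = new LlamaParseClient(new HttpClient(), parseConfig);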
    
  2. Use the client to parse your documents. In this case, we're using an InMemoryFile, which wraps the document's bytes (a byte[]) together with a file name. The document here is the paper "Attention Is All You Need". To keep further processing simple, we request the results in JSON format.

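    // documentData is a byte[] holding the PDF contents (for example, read with File.ReadAllBytesAsync).
    var document = new InMemoryFile(documentData, "attention-is-all-you-need.pdf");
    // Not awaited here: the results are consumed with await foreach in the next step.
    var parsedDocs = llamaParseClient.LoadDataRawAsync(document, ResultType.Json);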
    
  3. Extract the parsed results and post-process them. In this case, the code deserializes each JSON result and prints its pages to the console.

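    // Requires: using System.Text.Json;
    // Create the options once and reuse them for every document.
    var serializerOptions = new JsonSerializerOptions
    {
        PropertyNameCaseInsensitive = true
    };
    
    await foreach (var parsedDoc in parsedDocs)
    {
        var result = JsonSerializer.Deserialize<ParseResult>(parsedDoc.Result, serializerOptions);
    
        // Deserialize can return null (for example, for a JSON null payload); skip those results.
        if (result?.Pages is null)
        {
            continue;
        }
    
        foreach (var page in result.Pages)
        {
            Console.WriteLine($"Page {page.Page}");
            Console.WriteLine("-------------------");
            Console.WriteLine(page.Text);
            Console.WriteLine("-------------------");
        }
    }
    
    // Shape of the JSON result: a list of pages, each with a page number and its text.
    public record ParseResult(PageContent[] Pages);
    public record PageContent(int Page, string Text);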
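
The post-processing in step 3 can be tried on its own, without calling the service. The following is a minimal sketch that assumes the JSON result has a top-level `pages` array whose items carry `page` and `text` fields, which is what the `ParseResult` and `PageContent` records imply. The `sampleJson` payload is made up for illustration and is not real LlamaParse output.

```csharp
using System;
using System.Text.Json;

// Hypothetical payload shaped to match the ParseResult/PageContent records;
// it was not captured from the service.
var sampleJson =
    "{\"pages\":[{\"page\":1,\"text\":\"First page\"},{\"page\":2,\"text\":\"Second page\"}]}";

var serializerOptions = new JsonSerializerOptions
{
    // Lets lowercase JSON names such as "pages" bind to the Pages property.
    PropertyNameCaseInsensitive = true
};

var result = JsonSerializer.Deserialize<ParseResult>(sampleJson, serializerOptions);

// Fall back to an empty array so a null result prints nothing instead of throwing.
foreach (var page in result?.Pages ?? Array.Empty<PageContent>())
{
    Console.WriteLine($"Page {page.Page}: {page.Text}");
}

// In a top-level program, type declarations must come after the statements.
public record ParseResult(PageContent[] Pages);
public record PageContent(int Page, string Text);
```

Case-insensitive matching matters here because the records use PascalCase names while the assumed payload uses lowercase ones. Without it, `Pages` would not bind to `pages`.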