mirror of
https://github.com/run-llama/llamaindex.net.git
synced 2026-07-01 20:36:58 -04:00
ParseDocuments Sample
This sample shows the basics you need to get started parsing documents in .NET using the LlamaParse .NET client SDK inside a console application.
Prerequisites
Guide
-
Configure your client
    var apiKey = Environment.GetEnvironmentVariable("LLAMACLOUD_API_KEY");
    var parseConfig = new Configuration() { ApiKey = apiKey ?? string.Empty };
    var llamaParseClient = new LlamaParseClient(new HttpClient(), parseConfig);
-
Use the client to parse your documents. In this case, we're using an InMemoryFile, which holds the document data (a byte[]) from the paper "Attention Is All You Need". For simplicity and to ease further processing, we've opted to get the results in JSON format.
    var document = new InMemoryFile(documentData, "attention-is-all-you-need.pdf");
    var parsedDocs = llamaParseClient.LoadDataRawAsync(document, ResultType.Json);
-
Extract parsed results and post-process. In this case, the code just takes the paginated results and prints them out to the console.
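    // Requires "using System.Text.Json;" at the top of the file.
    // Create the options once and reuse them for every document.
    var serializerOptions = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };

    await foreach (var parsedDoc in parsedDocs)
    {
        var result = JsonSerializer.Deserialize<ParseResult>(parsedDoc.Result, serializerOptions);

        // Deserialize returns null when the payload is the JSON literal null.
        if (result?.Pages is null) continue;

        foreach (var page in result.Pages)
        {
            Console.WriteLine($"Page {page.Page}");
            Console.WriteLine("-------------------");
            Console.WriteLine(page.Text);
            Console.WriteLine("-------------------");
        }
    }

    // Minimal shapes for the parts of the JSON result this sample uses.
    public record ParseResult(PageContent[] Pages);
    public record PageContent(int Page, string Text);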
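To try the deserialization step without an API key, the sketch below runs the same ParseResult and PageContent records against a hand-written JSON string. The payload is a hypothetical stand-in: its field names (pages, page, text, md) are assumptions chosen to match the records above, and a real LlamaParse response may carry more fields or a different layout. It only illustrates how PropertyNameCaseInsensitive lets lower-case JSON names bind to the PascalCase record properties, and that fields with no matching property are ignored by default.

```csharp
using System;
using System.Text.Json;

// Hypothetical payload for illustration only; not captured from the real service.
const string sampleJson = @"{
  ""pages"": [
    { ""page"": 1, ""text"": ""First page text"", ""md"": ""# First page"" },
    { ""page"": 2, ""text"": ""Second page text"" }
  ]
}";

// Case-insensitive matching binds "pages" -> Pages, "page" -> Page, "text" -> Text.
var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
var result = JsonSerializer.Deserialize<ParseResult>(sampleJson, options);

// Fall back to an empty array if the payload was null or had no pages.
foreach (var page in result?.Pages ?? Array.Empty<PageContent>())
{
    Console.WriteLine($"Page {page.Page}: {page.Text}");
}

// Same record shapes as in the sample; the extra "md" field is skipped during binding.
public record ParseResult(PageContent[] Pages);
public record PageContent(int Page, string Text);
```

If you need more of the result than the page number and text, adding a property to PageContent is enough, provided its name matches the JSON field.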