DataFusion Examples
This crate includes several examples of how to use various DataFusion APIs to help you get started.
Prerequisites:
Run `git submodule update --init` to initialize the test files.
Running Examples
To run the examples, use the cargo run command, such as:
git clone https://github.com/apache/datafusion
cd datafusion
# Download test data
git submodule update --init
# Run the `csv_sql` example:
# ... use the equivalent for other examples
cargo run --example csv_sql
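As a rough sketch of what "the equivalent for other examples" means: the example name passed to `cargo run --example` is the source file name without its `.rs` extension. The helper `example_cmd` below is hypothetical (it is not part of DataFusion) and only prints the command rather than running it. Some examples may need additional setup beyond this.

```shell
# Hypothetical helper: map an example source file name to the
# cargo command that would run it (strips a trailing ".rs").
example_cmd() {
  printf 'cargo run --example %s\n' "${1%.rs}"
}

# Print the commands for a few of the examples listed in this README.
for file in csv_sql.rs dataframe.rs simple_udf.rs; do
  example_cmd "$file"
done
```

To run a command instead of printing it, copy the printed line into your shell from the repository checkout.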
Single Process
- `advanced_udaf.rs`: Define and invoke a more complicated User Defined Aggregate Function (UDAF)
- `advanced_udf.rs`: Define and invoke a more complicated User Defined Scalar Function (UDF)
- `advanced_udwf.rs`: Define and invoke a more complicated User Defined Window Function (UDWF)
- `avro_sql.rs`: Build and run a query plan from a SQL statement against a local AVRO file
- `catalog.rs`: Register the table into a custom catalog
- `csv_sql.rs`: Build and run a query plan from a SQL statement against a local CSV file
- `csv_sql_streaming.rs`: Build and run a streaming query plan from a SQL statement against a local CSV file
- `custom_datasource.rs`: Run queries against a custom datasource (`TableProvider`)
- `dataframe-to-s3.rs`: Run a query using a DataFrame against a Parquet file from S3 and write the results back to S3
- `dataframe.rs`: Run a query using a DataFrame against a local Parquet file
- `dataframe_in_memory.rs`: Run a query using a DataFrame against data in memory
- `dataframe_output.rs`: Examples of methods which write data out from a DataFrame
- `deserialize_to_struct.rs`: Convert query results into Rust structs using serde
- `expr_api.rs`: Create, execute, simplify and analyze `Expr`s
- `flight_sql_server.rs`: Run DataFusion as a standalone process and execute SQL queries from JDBC clients
- `function_factory.rs`: Register a `CREATE FUNCTION` handler to implement SQL macros
- `make_date.rs`: Examples of using the `make_date` function
- `memtable.rs`: Create and query data in memory using SQL and `RecordBatch`es
- `parquet_sql.rs`: Build and run a query plan from a SQL statement against a local Parquet file
- `parquet_sql_multiple_files.rs`: Build and run a query plan from a SQL statement against multiple local Parquet files
- `pruning.rs`: Use pruning to rule out files based on statistics
- `query-aws-s3.rs`: Configure `object_store` and run a query against files stored in AWS S3
- `query-http-csv.rs`: Configure `object_store` and run a query against files via HTTP
- `regexp.rs`: Examples of using regular expression functions
- `rewrite_expr.rs`: Define and invoke a custom Query Optimizer pass
- `simple_udaf.rs`: Define and invoke a User Defined Aggregate Function (UDAF)
- `simple_udf.rs`: Define and invoke a User Defined Scalar Function (UDF)
- `simple_udwf.rs`: Define and invoke a User Defined Window Function (UDWF)
- `sql_dialect.rs`: Example of implementing a custom SQL dialect on top of `DFParser`
- `to_char.rs`: Examples of using the `to_char` function
- `to_timestamp.rs`: Examples of using `to_timestamp` functions
Distributed
- `flight_client.rs` and `flight_server.rs`: Run DataFusion as a standalone process and execute SQL queries from a client using the Flight protocol.