mirror of
https://github.com/langchain-ai/datafusion.git
synced 2026-06-30 21:27:59 -04:00
chore: Update READMEs of crates to be more consistent (#17691)
* chore: Update READMEs of crates to be more consistent
* Add some more Apache project links
* Minor formatting
* Formatting
* Update datafusion/pruning/README.md

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* suggestion
* formatting
* formatting

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
@@ -19,12 +19,15 @@

<!-- Note this file is included in the crates.io page as well https://crates.io/crates/datafusion-cli -->

# DataFusion Command-line Interface
# Apache DataFusion Command-line Interface

[DataFusion](https://datafusion.apache.org/) is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.

DataFusion CLI (`datafusion-cli`) is a small command line utility that runs SQL queries using the DataFusion engine.

[apache arrow]: https://arrow.apache.org/
[apache datafusion]: https://datafusion.apache.org/

# Frequently Asked Questions

## Where can I find more information?

@@ -18,11 +18,11 @@
[package]
name = "datafusion-catalog-listing"
description = "datafusion-catalog-listing"
readme = "README.md"
authors.workspace = true
edition.workspace = true
homepage.workspace = true
license.workspace = true
readme.workspace = true
repository.workspace = true
rust-version.workspace = true
version.workspace = true

@@ -17,9 +17,9 @@
under the License.
-->

# DataFusion catalog-listing
# Apache DataFusion Catalog Listing

[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.

This crate is a submodule of DataFusion with [ListingTable], an implementation
of [TableProvider] based on files in a directory (either locally or on remote
@@ -29,8 +29,8 @@ Most projects should use the [`datafusion`] crate directly, which re-exports
this module. If you are already using the [`datafusion`] crate, there is no
reason to use this crate directly in your project as well.

[df]: https://crates.io/crates/datafusion
[df]: https://crates.io/crates/datafusion
[apache arrow]: https://arrow.apache.org/
[apache datafusion]: https://datafusion.apache.org/
[listingtable]: https://docs.rs/datafusion/latest/datafusion/datasource/listing/struct.ListingTable.html
[tableprovider]: https://docs.rs/datafusion/latest/datafusion/datasource/trait.TableProvider.html
[`datafusion`]: https://crates.io/crates/datafusion

@@ -18,11 +18,11 @@
[package]
name = "datafusion-catalog"
description = "datafusion-catalog"
readme = "README.md"
authors.workspace = true
edition.workspace = true
homepage.workspace = true
license.workspace = true
readme.workspace = true
repository.workspace = true
rust-version.workspace = true
version.workspace = true

@@ -17,9 +17,9 @@
under the License.
-->

# DataFusion Catalog
# Apache DataFusion Catalog

[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.

This crate is a submodule of DataFusion that provides catalog management functionality, including catalogs, schemas, and tables.

@@ -27,5 +27,6 @@ Most projects should use the [`datafusion`] crate directly, which re-exports
this module. If you are already using the [`datafusion`] crate, there is no
reason to use this crate directly in your project as well.

[df]: https://crates.io/crates/datafusion
[apache arrow]: https://arrow.apache.org/
[apache datafusion]: https://datafusion.apache.org/
[`datafusion`]: https://crates.io/crates/datafusion

@@ -17,9 +17,9 @@
under the License.
-->

# DataFusion Common Runtime
# Apache DataFusion Common Runtime

[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.

This crate is a submodule of DataFusion that provides common utilities.

@@ -27,5 +27,6 @@ Most projects should use the [`datafusion`] crate directly, which re-exports
this module. If you are already using the [`datafusion`] crate, there is no
reason to use this crate directly in your project as well.

[df]: https://crates.io/crates/datafusion
[apache arrow]: https://arrow.apache.org/
[apache datafusion]: https://datafusion.apache.org/
[`datafusion`]: https://crates.io/crates/datafusion

@@ -17,9 +17,9 @@
under the License.
-->

# DataFusion Common
# Apache DataFusion Common

[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.

This crate is a submodule of DataFusion that provides common data types and utilities.

@@ -27,5 +27,6 @@ Most projects should use the [`datafusion`] crate directly, which re-exports
this module. If you are already using the [`datafusion`] crate, there is no
reason to use this crate directly in your project as well.

[df]: https://crates.io/crates/datafusion
[apache arrow]: https://arrow.apache.org/
[apache datafusion]: https://datafusion.apache.org/
[`datafusion`]: https://crates.io/crates/datafusion

@@ -17,15 +17,12 @@
under the License.
-->

# DataFusion Core
<!--
Note the main crates.io landing page https://crates.io/crates/datafusion
uses the workspace README.md file, not this file
-->

DataFusion is an extensible query execution framework, written in Rust,
that uses Apache Arrow as its in-memory format.
# Apache DataFusion Core

This crate contains the main entry points and high level DataFusion APIs such as
`SessionContext`, `DataFrame` and `ListingTable`.

For more information, please see:

- [DataFusion Website](https://datafusion.apache.org)
- [DataFusion API Docs](https://docs.rs/datafusion/latest/datafusion/)

@@ -18,11 +18,11 @@
[package]
name = "datafusion-datasource-avro"
description = "datafusion-datasource-avro"
readme = "README.md"
authors.workspace = true
edition.workspace = true
homepage.workspace = true
license.workspace = true
readme.workspace = true
repository.workspace = true
rust-version.workspace = true
version.workspace = true

@@ -17,15 +17,17 @@
under the License.
-->

# DataFusion datasource
# Apache DataFusion Avro DataSource

[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.

This crate is a submodule of DataFusion that defines a Avro based file source.
This crate is a submodule of DataFusion that defines an [Apache Avro] based file source.

Most projects should use the [`datafusion`] crate directly, which re-exports
this module. If you are already using the [`datafusion`] crate, there is no
reason to use this crate directly in your project as well.

[df]: https://crates.io/crates/datafusion
[apache arrow]: https://arrow.apache.org/
[apache datafusion]: https://datafusion.apache.org/
[apache avro]: https://avro.apache.org/
[`datafusion`]: https://crates.io/crates/datafusion

@@ -18,11 +18,11 @@
[package]
name = "datafusion-datasource-csv"
description = "datafusion-datasource-csv"
readme = "README.md"
authors.workspace = true
edition.workspace = true
homepage.workspace = true
license.workspace = true
readme.workspace = true
repository.workspace = true
rust-version.workspace = true
version.workspace = true

@@ -17,9 +17,9 @@
under the License.
-->

# DataFusion datasource
# Apache DataFusion CSV DataSource

[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.

This crate is a submodule of DataFusion that defines a CSV based file source.

@@ -27,5 +27,6 @@ Most projects should use the [`datafusion`] crate directly, which re-exports
this module. If you are already using the [`datafusion`] crate, there is no
reason to use this crate directly in your project as well.

[df]: https://crates.io/crates/datafusion
[apache arrow]: https://arrow.apache.org/
[apache datafusion]: https://datafusion.apache.org/
[`datafusion`]: https://crates.io/crates/datafusion

@@ -18,11 +18,11 @@
[package]
name = "datafusion-datasource-json"
description = "datafusion-datasource-json"
readme = "README.md"
authors.workspace = true
edition.workspace = true
homepage.workspace = true
license.workspace = true
readme.workspace = true
repository.workspace = true
rust-version.workspace = true
version.workspace = true

@@ -17,9 +17,9 @@
under the License.
-->

# DataFusion datasource
# Apache DataFusion JSON DataSource

[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.

This crate is a submodule of DataFusion that defines a JSON based file source.

@@ -27,5 +27,6 @@ Most projects should use the [`datafusion`] crate directly, which re-exports
this module. If you are already using the [`datafusion`] crate, there is no
reason to use this crate directly in your project as well.

[df]: https://crates.io/crates/datafusion
[apache arrow]: https://arrow.apache.org/
[apache datafusion]: https://datafusion.apache.org/
[`datafusion`]: https://crates.io/crates/datafusion

@@ -18,11 +18,11 @@
[package]
name = "datafusion-datasource-parquet"
description = "datafusion-datasource-parquet"
readme = "README.md"
authors.workspace = true
edition.workspace = true
homepage.workspace = true
license.workspace = true
readme.workspace = true
repository.workspace = true
rust-version.workspace = true
version.workspace = true

@@ -17,15 +17,17 @@
under the License.
-->

# DataFusion datasource
# Apache DataFusion Parquet DataSource

[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.

This crate is a submodule of DataFusion that defines a Parquet based file source.
This crate is a submodule of DataFusion that defines an [Apache Parquet] based file source.

Most projects should use the [`datafusion`] crate directly, which re-exports
this module. If you are already using the [`datafusion`] crate, there is no
reason to use this crate directly in your project as well.

[df]: https://crates.io/crates/datafusion
[apache arrow]: https://arrow.apache.org/
[apache datafusion]: https://datafusion.apache.org/
[apache parquet]: https://parquet.apache.org/
[`datafusion`]: https://crates.io/crates/datafusion

@@ -18,11 +18,11 @@
[package]
name = "datafusion-datasource"
description = "datafusion-datasource"
readme = "README.md"
authors.workspace = true
edition.workspace = true
homepage.workspace = true
license.workspace = true
readme.workspace = true
repository.workspace = true
rust-version.workspace = true
version.workspace = true

@@ -17,9 +17,9 @@
under the License.
-->

# DataFusion datasource
# Apache DataFusion DataSource

[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.

This crate is a submodule of DataFusion that defines common DataSource related components like FileScanConfig, FileCompression etc.

@@ -27,5 +27,6 @@ Most projects should use the [`datafusion`] crate directly, which re-exports
this module. If you are already using the [`datafusion`] crate, there is no
reason to use this crate directly in your project as well.

[df]: https://crates.io/crates/datafusion
[apache arrow]: https://arrow.apache.org/
[apache datafusion]: https://datafusion.apache.org/
[`datafusion`]: https://crates.io/crates/datafusion

@@ -19,6 +19,7 @@
name = "datafusion-doc"
description = "Documentation module for DataFusion query engine"
keywords = ["datafusion", "query", "sql"]
readme = "README.md"
version = { workspace = true }
edition = { workspace = true }
homepage = { workspace = true }

@@ -17,9 +17,9 @@
under the License.
-->

# DataFusion Execution
# Apache DataFusion Documentation

[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.

This crate is a submodule of DataFusion that provides structures and macros
for documenting user defined functions.
@@ -28,5 +28,6 @@ Most projects should use the [`datafusion`] crate directly, which re-exports
this module. If you are already using the [`datafusion`] crate, there is no
reason to use this crate directly in your project as well.

[df]: https://crates.io/crates/datafusion
[apache arrow]: https://arrow.apache.org/
[apache datafusion]: https://datafusion.apache.org/
[`datafusion`]: https://crates.io/crates/datafusion

@@ -17,9 +17,9 @@
under the License.
-->

# DataFusion Execution
# Apache DataFusion Execution

[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.

This crate is a submodule of DataFusion that provides execution runtime such as the memory pools and disk manager.

@@ -27,5 +27,6 @@ Most projects should use the [`datafusion`] crate directly, which re-exports
this module. If you are already using the [`datafusion`] crate, there is no
reason to use this crate directly in your project as well.

[df]: https://crates.io/crates/datafusion
[apache arrow]: https://arrow.apache.org/
[apache datafusion]: https://datafusion.apache.org/
[`datafusion`]: https://crates.io/crates/datafusion

@@ -19,6 +19,7 @@
name = "datafusion-expr-common"
description = "Logical plan and expression representation for DataFusion query engine"
keywords = ["datafusion", "logical", "plan", "expressions"]
readme = "README.md"
version = { workspace = true }
edition = { workspace = true }
homepage = { workspace = true }

@@ -17,9 +17,9 @@
under the License.
-->

# DataFusion Logical Plan and Expressions
# Apache DataFusion Common Logical Plan and Expressions

[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.

This crate is a submodule of DataFusion that provides common logical expressions

@@ -27,5 +27,6 @@ Most projects should use the [`datafusion`] crate directly, which re-exports
this module. If you are already using the [`datafusion`] crate, there is no
reason to use this crate directly in your project as well.

[df]: https://crates.io/crates/datafusion
[apache arrow]: https://arrow.apache.org/
[apache datafusion]: https://datafusion.apache.org/
[`datafusion`]: https://crates.io/crates/datafusion

@@ -17,9 +17,9 @@
under the License.
-->

# DataFusion Logical Plan and Expressions
# Apache DataFusion Logical Plan and Expressions

[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.

This crate is a submodule of DataFusion that provides data types and utilities for logical plans and expressions.

@@ -27,5 +27,6 @@ Most projects should use the [`datafusion`] crate directly, which re-exports
this module. If you are already using the [`datafusion`] crate, there is no
reason to use this crate directly in your project as well.

[df]: https://crates.io/crates/datafusion
[apache arrow]: https://arrow.apache.org/
[apache datafusion]: https://datafusion.apache.org/
[`datafusion`]: https://crates.io/crates/datafusion

+11
-11
@@ -17,10 +17,10 @@
under the License.
-->

# `datafusion-ffi`: Apache DataFusion Foreign Function Interface
# Apache DataFusion Foreign Function Interface

This crate contains code to allow interoperability of Apache [DataFusion] with
functions from other libraries and/or [DataFusion] versions using a stable
This crate contains code to allow interoperability of [Apache DataFusion] with
functions from other libraries and/or DataFusion versions using a stable
interface.

One of the limitations of the Rust programming language is that there is no
@@ -28,10 +28,10 @@ stable [Rust ABI] (Application Binary Interface). If a library is compiled with
one version of the Rust compiler and you attempt to use that library with a
program compiled by a different Rust compiler, there is no guarantee that you
can access the data structures. In order to share code between libraries loaded
at runtime, you need to use Rust's [FFI](Foreign Function Interface (FFI)).
at runtime, you need to use Rust's [FFI] (Foreign Function Interface (FFI)).

The purpose of this crate is to define interfaces between [DataFusion] libraries
that will remain stable across different versions of [DataFusion]. This allows
The purpose of this crate is to define interfaces between DataFusion libraries
that will remain stable across different versions of DataFusion. This allows
users to write libraries that can interface between each other at runtime rather
than require compiling all of the code into a single executable.

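As a rough illustration of the stable-interface idea described in the README text above, the sketch below shows the general shape such an interface tends to take in Rust: a `#[repr(C)]` struct that carries `extern "C"` function pointers plus an opaque data pointer, so that neither side depends on the unstable Rust ABI. Every name here (`FfiRowCounter`, `InMemoryTable`, `num_rows`) is hypothetical and chosen only for this example; none of them are types from `datafusion-ffi`, whose real structs are described in its API docs.

```rust
use std::os::raw::c_void;

/// Hypothetical FFI-safe handle (not a `datafusion-ffi` type).
/// `#[repr(C)]` pins the field layout so both sides of the boundary agree on
/// it, regardless of which Rust compiler version built each library.
#[repr(C)]
pub struct FfiRowCounter {
    /// Callback using the C calling convention instead of the Rust ABI.
    pub row_count: unsafe extern "C" fn(this: *const FfiRowCounter) -> u64,
    /// Frees `private_data`; invoked once from `Drop`.
    pub release: unsafe extern "C" fn(this: *mut FfiRowCounter),
    /// Opaque pointer to state owned by the library that created the handle.
    pub private_data: *mut c_void,
}

/// Provider-side state that never crosses the boundary by value.
struct InMemoryTable {
    rows: Vec<i64>,
}

unsafe extern "C" fn row_count_impl(this: *const FfiRowCounter) -> u64 {
    // Recover the provider's own type from the opaque pointer.
    let table = unsafe { &*((*this).private_data as *const InMemoryTable) };
    table.rows.len() as u64
}

unsafe extern "C" fn release_impl(this: *mut FfiRowCounter) {
    unsafe {
        let data = (*this).private_data as *mut InMemoryTable;
        if !data.is_null() {
            // Rebuild the Box so the allocation is freed by its creator.
            drop(Box::from_raw(data));
            (*this).private_data = std::ptr::null_mut();
        }
    }
}

impl FfiRowCounter {
    /// Provider side: wrap native state behind the C-compatible handle.
    pub fn new(rows: Vec<i64>) -> Self {
        let table = Box::new(InMemoryTable { rows });
        FfiRowCounter {
            row_count: row_count_impl,
            release: release_impl,
            private_data: Box::into_raw(table) as *mut c_void,
        }
    }

    /// Consumer side: only the function pointer is called, so the consumer
    /// never needs to know the layout of `InMemoryTable`.
    pub fn num_rows(&self) -> u64 {
        unsafe { (self.row_count)(self) }
    }
}

impl Drop for FfiRowCounter {
    fn drop(&mut self) {
        unsafe { (self.release)(self) }
    }
}
```

A consumer would call `FfiRowCounter::new(vec![1, 2, 3]).num_rows()` and interact only with the C-compatible surface. In this sketch both sides live in one program; the design choice only pays off when the handle is produced by a separately compiled library loaded at runtime.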
@@ -46,7 +46,7 @@ See [API Docs] for details and examples.
Two use cases have been identified for this crate, but they are not intended to
be all inclusive.

1. `datafusion-python` which will use the FFI to provide external services such
1. [`datafusion-python`] which will use the FFI to provide external services such
as a `TableProvider` without needing to re-export the entire `datafusion-python`
code base. With `datafusion-ffi` these packages do not need `datafusion-python`
as a dependency at all.
@@ -68,8 +68,8 @@ stable interfaces that closely mirror the Rust native approach. To learn more
about this approach see the [abi_stable] and [async-ffi] crates.

If you have a library in another language that you wish to interface to
[DataFusion] the recommendation is to create a Rust wrapper crate to interface
with your library and then to connect it to [DataFusion] using this crate.
DataFusion the recommendation is to create a Rust wrapper crate to interface
with your library and then to connect it to DataFusion using this crate.
Alternatively, you could use [bindgen] to interface directly to the [FFI] provided
by this crate, but that is currently not supported.

@@ -101,12 +101,12 @@ In this crate we have a variety of structs which closely mimic the behavior of
their internal counterparts. To see detailed notes about how to use them, see
the example in `FFI_TableProvider`.

[datafusion]: https://datafusion.apache.org
[apache datafusion]: https://datafusion.apache.org/
[api docs]: http://docs.rs/datafusion-ffi/latest
[rust abi]: https://doc.rust-lang.org/reference/abi.html
[ffi]: https://doc.rust-lang.org/nomicon/ffi.html
[abi_stable]: https://crates.io/crates/abi_stable
[async-ffi]: https://crates.io/crates/async-ffi
[bindgen]: https://crates.io/crates/bindgen
[datafusion-python]: https://datafusion.apache.org/python/
[`datafusion-python`]: https://datafusion.apache.org/python/
[datafusion-contrib]: https://github.com/datafusion-contrib

@@ -19,6 +19,7 @@
name = "datafusion-functions-aggregate-common"
description = "Utility functions for implementing aggregate functions for the DataFusion query engine"
keywords = ["datafusion", "logical", "plan", "expressions"]
readme = "README.md"
version = { workspace = true }
edition = { workspace = true }
homepage = { workspace = true }

@@ -17,9 +17,9 @@
under the License.
-->

# DataFusion Aggregate Function Library
# Apache DataFusion Aggregate Function Common Library

[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.

This crate contains common functionality for implementation aggregate and window functions.

@@ -27,5 +27,6 @@ Most projects should use the [`datafusion`] crate directly, which re-exports
this module. If you are already using the [`datafusion`] crate, there is no
reason to use this crate directly in your project as well.

[df]: https://crates.io/crates/datafusion
[apache arrow]: https://arrow.apache.org/
[apache datafusion]: https://datafusion.apache.org/
[`datafusion`]: https://crates.io/crates/datafusion

@@ -17,9 +17,9 @@
under the License.
-->

# DataFusion Aggregate Function Library
# Apache DataFusion Aggregate Function Library

[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.

This crate contains implementations of aggregate functions.

@@ -27,5 +27,6 @@ Most projects should use the [`datafusion`] crate directly, which re-exports
this module. If you are already using the [`datafusion`] crate, there is no
reason to use this crate directly in your project as well.

[df]: https://crates.io/crates/datafusion
[apache arrow]: https://arrow.apache.org/
[apache datafusion]: https://datafusion.apache.org/
[`datafusion`]: https://crates.io/crates/datafusion

@@ -17,16 +17,18 @@
under the License.
-->

# DataFusion Nested Type Function Library
# Apache DataFusion Nested Type Function Library

[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.

This crate contains functions for working with arrays, maps and structs, such as `array_append` that work with
`ListArray`, `LargeListArray` and `FixedListArray` types from the `arrow` crate.
`ListArray`, `LargeListArray` and `FixedListArray` types from the [`arrow`] crate.

Most projects should use the [`datafusion`] crate directly, which re-exports
this module. If you are already using the [`datafusion`] crate, there is no
reason to use this crate directly in your project as well.

[df]: https://crates.io/crates/datafusion
[apache arrow]: https://arrow.apache.org/
[apache datafusion]: https://datafusion.apache.org/
[`arrow`]: https://crates.io/crates/arrow
[`datafusion`]: https://crates.io/crates/datafusion

@@ -17,9 +17,9 @@
under the License.
-->

# DataFusion Table Function Library
# Apache DataFusion Table Function Library

[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.

This crate contains table functions that can be used in DataFusion queries.

@@ -27,5 +27,6 @@ Most projects should use the [`datafusion`] crate directly, which re-exports
this module. If you are already using the [`datafusion`] crate, there is no
reason to use this crate directly in your project as well.

[df]: https://crates.io/crates/datafusion
[apache arrow]: https://arrow.apache.org/
[apache datafusion]: https://datafusion.apache.org/
[`datafusion`]: https://crates.io/crates/datafusion

@@ -17,9 +17,9 @@
under the License.
-->

# DataFusion Window Function Common Library
# Apache DataFusion Window Function Common Library

[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.

This crate contains common functions for implementing window functions.

@@ -27,5 +27,6 @@ Most projects should use the [`datafusion`] crate directly, which re-exports
this module. If you are already using the [`datafusion`] crate, there is no
reason to use this crate directly in your project as well.

[df]: https://crates.io/crates/datafusion
[apache arrow]: https://arrow.apache.org/
[apache datafusion]: https://datafusion.apache.org/
[`datafusion`]: https://crates.io/crates/datafusion

@@ -17,9 +17,9 @@
under the License.
-->

# DataFusion Window Function Library
# Apache DataFusion Window Function Library

[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.

This crate contains window function definitions.

@@ -27,5 +27,6 @@ Most projects should use the [`datafusion`] crate directly, which re-exports
this module. If you are already using the [`datafusion`] crate, there is no
reason to use this crate directly in your project as well.

[df]: https://crates.io/crates/datafusion
[apache arrow]: https://arrow.apache.org/
[apache datafusion]: https://datafusion.apache.org/
[`datafusion`]: https://crates.io/crates/datafusion

@@ -17,9 +17,9 @@
under the License.
-->

# DataFusion Function Library
# Apache DataFusion Function Library

[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.

This crate contains packages of function that can be used to customize the
functionality of DataFusion.
@@ -28,5 +28,6 @@ Most projects should use the [`datafusion`] crate directly, which re-exports
this module. If you are already using the [`datafusion`] crate, there is no
reason to use this crate directly in your project as well.

[df]: https://crates.io/crates/datafusion
[apache arrow]: https://arrow.apache.org/
[apache datafusion]: https://datafusion.apache.org/
[`datafusion`]: https://crates.io/crates/datafusion

@@ -19,6 +19,7 @@
name = "datafusion-macros"
description = "Procedural macros for DataFusion query engine"
keywords = ["datafusion", "query", "sql"]
readme = "README.md"
version = { workspace = true }
edition = { workspace = true }
homepage = { workspace = true }

@@ -17,15 +17,14 @@
under the License.
-->

# DataFusion Window Function Common Library
# Apache DataFusion Macros

[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.

This crate contains common macros used in DataFusion

Most projects should use the [`datafusion`] crate directly, which re-exports
this module. If you are already using the [`datafusion`] crate, there is no
reason to use this crate directly in your project as well.
Most projects should use the [`datafusion`] crate directly.

[df]: https://crates.io/crates/datafusion
[apache arrow]: https://arrow.apache.org/
[apache datafusion]: https://datafusion.apache.org/
[`datafusion`]: https://crates.io/crates/datafusion

@@ -17,7 +17,9 @@
|
||||
under the License.
|
||||
-->
|
||||
|
||||
[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
|
||||
# Apache DataFusion Optimizer
|
||||
|
||||
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.
|
||||
|
||||
This crate contains the DataFusion logical optimizer.
|
||||
Please see [Query Optimizer] in the Library User Guide for more information.
|
||||
@@ -26,6 +28,7 @@ Most projects should use the [`datafusion`] crate directly, which re-exports
|
||||
this module. If you are already using the [`datafusion`] crate, there is no
|
||||
reason to use this crate directly in your project as well.
|
||||
|
||||
[df]: https://crates.io/crates/datafusion
|
||||
[apache arrow]: https://arrow.apache.org/
|
||||
[apache datafusion]: https://datafusion.apache.org/
|
||||
[`datafusion`]: https://crates.io/crates/datafusion
|
||||
[query optimizer]: https://datafusion.apache.org/library-user-guide/query-optimizer.html
|
||||
|
||||
@@ -1,4 +1,25 @@
|
||||
# DataFusion Physical Expression Adapter
|
||||
<!---
|
||||
Licensed to the Apache Software Foundation (ASF) under one
|
||||
or more contributor license agreements. See the NOTICE file
|
||||
distributed with this work for additional information
|
||||
regarding copyright ownership. The ASF licenses this file
|
||||
to you under the Apache License, Version 2.0 (the
|
||||
"License"); you may not use this file except in compliance
|
||||
with the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing,
|
||||
software distributed under the License is distributed on an
|
||||
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||
KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations
|
||||
under the License.
|
||||
-->
|
||||
|
||||
# Apache DataFusion Physical Expression Adapter
|
||||
|
||||
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.
|
||||
|
||||
This crate provides utilities for adapting physical expressions to different schemas in DataFusion.
|
||||
|
||||
@@ -6,3 +27,12 @@ It handles schema differences in file scans by rewriting expressions to match th
|
||||
including type casting, missing columns, and partition values.
|
||||
|
||||
For detailed documentation, see the [`PhysicalExprAdapter`] trait documentation.
|
||||
|
||||
Most projects should use the [`datafusion`] crate directly, which re-exports
|
||||
this module. If you are already using the [`datafusion`] crate, there is no
|
||||
reason to use this crate directly in your project as well.
|
||||
|
||||
[apache arrow]: https://arrow.apache.org/
|
||||
[apache datafusion]: https://datafusion.apache.org/
|
||||
[`datafusion`]: https://crates.io/crates/datafusion
|
||||
[`physicalexpradapter`]: https://docs.rs/datafusion/latest/datafusion/physical_expr_adapter/trait.PhysicalExprAdapter.html
|
||||
|
||||
@@ -17,16 +17,19 @@
|
||||
under the License.
|
||||
-->
|
||||
|
||||
# DataFusion Core Physical Expressions
|
||||
# Apache DataFusion Core Physical Expressions
|
||||
|
||||
[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
|
||||
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.
|
||||
|
||||
This crate is a submodule of DataFusion that provides shared APIs for implementing
|
||||
physical expressions such as `PhysicalExpr` and `PhysicalSortExpr`.
|
||||
physical expressions such as [`PhysicalExpr`] and [`PhysicalSortExpr`].
|
||||
|
||||
Most projects should use the [`datafusion`] crate directly, which re-exports
|
||||
this module. If you are already using the [`datafusion`] crate, there is no
|
||||
reason to use this crate directly in your project as well.
|
||||
|
||||
[df]: https://crates.io/crates/datafusion
|
||||
[apache arrow]: https://arrow.apache.org/
|
||||
[apache datafusion]: https://datafusion.apache.org/
|
||||
[`datafusion`]: https://crates.io/crates/datafusion
|
||||
[`physicalexpr`]: https://docs.rs/datafusion/latest/datafusion/physical_expr/trait.PhysicalExpr.html
|
||||
[`physicalsortexpr`]: https://docs.rs/datafusion/latest/datafusion/physical_expr/struct.PhysicalSortExpr.html
|
||||
|
||||
@@ -17,9 +17,9 @@
|
||||
under the License.
|
||||
-->
|
||||
|
||||
# DataFusion Physical Expressions
|
||||
# Apache DataFusion Physical Expressions
|
||||
|
||||
[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
|
||||
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.
|
||||
|
||||
This crate is a submodule of DataFusion that provides data types and utilities for physical expressions.
|
||||
|
||||
@@ -27,5 +27,6 @@ Most projects should use the [`datafusion`] crate directly, which re-exports
|
||||
this module. If you are already using the [`datafusion`] crate, there is no
|
||||
reason to use this crate directly in your project as well.
|
||||
|
||||
[df]: https://crates.io/crates/datafusion
|
||||
[apache arrow]: https://arrow.apache.org/
|
||||
[apache datafusion]: https://datafusion.apache.org/
|
||||
[`datafusion`]: https://crates.io/crates/datafusion
|
||||
|
||||
@@ -17,10 +17,9 @@
|
||||
under the License.
|
||||
-->
|
||||
|
||||
# DataFusion Physical Optimizer
|
||||
# Apache DataFusion Physical Optimizer
|
||||
|
||||
DataFusion is an extensible query execution framework, written in Rust,
|
||||
that uses Apache Arrow as its in-memory format.
|
||||
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.
|
||||
|
||||
This crate contains the physical optimizer for DataFusion.
|
||||
|
||||
@@ -28,5 +27,6 @@ Most projects should use the [`datafusion`] crate directly, which re-exports
|
||||
this module. If you are already using the [`datafusion`] crate, there is no
|
||||
reason to use this crate directly in your project as well.
|
||||
|
||||
[df]: https://crates.io/crates/datafusion
|
||||
[apache arrow]: https://arrow.apache.org/
|
||||
[apache datafusion]: https://datafusion.apache.org/
|
||||
[`datafusion`]: https://crates.io/crates/datafusion
|
||||
|
||||
@@ -17,9 +17,9 @@
|
||||
under the License.
|
||||
-->
|
||||
|
||||
# DataFusion Physical Plan
|
||||
# Apache DataFusion Physical Plan
|
||||
|
||||
[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
|
||||
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.
|
||||
|
||||
This crate is a submodule of DataFusion that contains the `ExecutionPlan` trait and the various implementations of that
|
||||
trait for built in operators such as filters, projections, joins, aggregations, etc.
|
||||
@@ -28,5 +28,6 @@ Most projects should use the [`datafusion`] crate directly, which re-exports
|
||||
this module. If you are already using the [`datafusion`] crate, there is no
|
||||
reason to use this crate directly in your project as well.
|
||||
|
||||
[df]: https://crates.io/crates/datafusion
|
||||
[apache arrow]: https://arrow.apache.org/
|
||||
[apache datafusion]: https://datafusion.apache.org/
|
||||
[`datafusion`]: https://crates.io/crates/datafusion
|
||||
|
||||
@@ -19,9 +19,9 @@
|
||||
name = "datafusion-proto-common"
|
||||
description = "Protobuf serialization of DataFusion common types"
|
||||
keywords = ["arrow", "query", "sql"]
|
||||
readme = "README.md"
|
||||
version = { workspace = true }
|
||||
edition = { workspace = true }
|
||||
readme = { workspace = true }
|
||||
homepage = { workspace = true }
|
||||
repository = { workspace = true }
|
||||
license = { workspace = true }
|
||||
|
||||
@@ -17,17 +17,21 @@
|
||||
under the License.
|
||||
-->
|
||||
|
||||
# `datafusion-proto-common`: Apache DataFusion Protobuf Serialization / Deserialization
|
||||
# Apache DataFusion Protobuf Common Serialization / Deserialization
|
||||
|
||||
This crate contains code to convert Apache [DataFusion] primitive types to and from
|
||||
bytes, which can be useful for sending data over the network.
|
||||
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.
|
||||
|
||||
This crate contains code to convert DataFusion primitive types to and from
|
||||
bytes using [Protocol Buffers], which can be useful for sending data over the network.
|
||||
|
||||
See [API Docs] for details and examples.
|
||||
|
||||
Most projects should use the [`datafusion-proto`] crate directly, which re-exports
|
||||
this module. If you are already using the [`datafusion-protp`] crate, there is no
|
||||
this module. If you are already using the [`datafusion-proto`] crate, there is no
|
||||
reason to use this crate directly in your project as well.
|
||||
|
||||
[apache arrow]: https://arrow.apache.org/
|
||||
[apache datafusion]: https://datafusion.apache.org/
|
||||
[protocol buffers]: https://protobuf.dev/
|
||||
[`datafusion-proto`]: https://crates.io/crates/datafusion-proto
|
||||
[datafusion]: https://datafusion.apache.org
|
||||
[api docs]: http://docs.rs/datafusion-proto/latest
|
||||
|
||||
@@ -17,13 +17,17 @@
|
||||
under the License.
|
||||
-->
|
||||
|
||||
# `datafusion-proto`: Apache DataFusion Protobuf Serialization / Deserialization
|
||||
# Apache DataFusion Protobuf Serialization / Deserialization
|
||||
|
||||
This crate contains code to convert [Apache DataFusion] plans to and from
|
||||
bytes, which can be useful for sending plans over the network, for example
|
||||
when building a distributed query engine.
|
||||
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.
|
||||
|
||||
This crate contains code to convert DataFusion plans to and from bytes using [Protocol Buffers],
|
||||
which can be useful for sending plans over the network, for example when building a distributed
|
||||
query engine.
|
||||
|
||||
See [API Docs] for details and examples.
|
||||
|
||||
[apache datafusion]: https://datafusion.apache.org
|
||||
[apache arrow]: https://arrow.apache.org/
|
||||
[apache datafusion]: https://datafusion.apache.org/
|
||||
[protocol buffers]: https://protobuf.dev/
|
||||
[api docs]: http://docs.rs/datafusion-proto/latest
|
||||
|
||||
@@ -1,6 +1,7 @@
|
||||
[package]
|
||||
name = "datafusion-pruning"
|
||||
description = "DataFusion Pruning Logic"
|
||||
readme = "README.md"
|
||||
version = { workspace = true }
|
||||
edition = { workspace = true }
|
||||
homepage = { workspace = true }
|
||||
|
||||
@@ -0,0 +1,34 @@
|
||||
<!---
|
||||
Licensed to the Apache Software Foundation (ASF) under one
|
||||
or more contributor license agreements. See the NOTICE file
|
||||
distributed with this work for additional information
|
||||
regarding copyright ownership. The ASF licenses this file
|
||||
to you under the Apache License, Version 2.0 (the
|
||||
"License"); you may not use this file except in compliance
|
||||
with the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing,
|
||||
software distributed under the License is distributed on an
|
||||
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||
KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations
|
||||
under the License.
|
||||
-->
|
||||
|
||||
# Apache DataFusion Pruning Logic
|
||||
|
||||
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.
|
||||
|
||||
This crate is a submodule of DataFusion that contains pruning logic, to analyze filter expressions with
|
||||
statistics such as min/max values and null counts, proving files / large subsections of files can be skipped
|
||||
without reading the actual data.
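
The idea of pruning with statistics can be sketched in plain Rust. This is a minimal illustration only, not the crate's API: the `ColumnStats` struct and the `may_match_gt` / `may_match_eq` helpers are hypothetical names invented here, and a single integer column per file is assumed. Each helper returns `false` only when the statistics prove that no row can satisfy the predicate, so the file can be skipped safely.

```rust
/// Hypothetical per-file statistics for one integer column
/// (an assumption for illustration, not a DataFusion type).
#[derive(Debug, Clone, Copy)]
struct ColumnStats {
    min: i64,
    max: i64,
    null_count: u64,
    row_count: u64,
}

/// Returns false only if `col > threshold` provably matches no row.
fn may_match_gt(stats: &ColumnStats, threshold: i64) -> bool {
    // An empty file or an all-null column can never satisfy a comparison.
    if stats.row_count == 0 || stats.null_count == stats.row_count {
        return false;
    }
    // If even the largest value is not above the threshold, skip the file.
    stats.max > threshold
}

/// Returns false only if `col = value` provably matches no row.
fn may_match_eq(stats: &ColumnStats, value: i64) -> bool {
    if stats.row_count == 0 || stats.null_count == stats.row_count {
        return false;
    }
    // The value must lie inside the [min, max] range to be present.
    stats.min <= value && value <= stats.max
}

fn main() {
    let files = [
        ("a.parquet", ColumnStats { min: 0, max: 9, null_count: 0, row_count: 100 }),
        ("b.parquet", ColumnStats { min: 10, max: 99, null_count: 5, row_count: 100 }),
        ("c.parquet", ColumnStats { min: 0, max: 0, null_count: 100, row_count: 100 }),
    ];

    // Keep only the files that might contain rows with `col > 50`.
    let to_scan: Vec<&str> = files
        .iter()
        .filter(|(_, stats)| may_match_gt(stats, 50))
        .map(|(name, _)| *name)
        .collect();
    println!("files to scan for col > 50: {:?}", to_scan);

    // Keep only the files that might contain rows with `col = 5`.
    let eq_scan: Vec<&str> = files
        .iter()
        .filter(|(_, stats)| may_match_eq(stats, 5))
        .map(|(name, _)| *name)
        .collect();
    println!("files to scan for col = 5: {:?}", eq_scan);
}
```

Note that the check is conservative: a `true` result means "might match", so the file still has to be read, while a `false` result is a proof that it can be skipped.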
|
||||
|
||||
Most projects should use the [`datafusion`] crate directly, which re-exports
|
||||
this module. If you are already using the [`datafusion`] crate, there is no
|
||||
reason to use this crate directly in your project as well.
|
||||
|
||||
[apache arrow]: https://arrow.apache.org/
|
||||
[apache datafusion]: https://datafusion.apache.org/
|
||||
[`datafusion`]: https://crates.io/crates/datafusion
|
||||
@@ -18,11 +18,11 @@
|
||||
[package]
|
||||
name = "datafusion-session"
|
||||
description = "datafusion-session"
|
||||
readme = "README.md"
|
||||
authors.workspace = true
|
||||
edition.workspace = true
|
||||
homepage.workspace = true
|
||||
license.workspace = true
|
||||
readme.workspace = true
|
||||
repository.workspace = true
|
||||
rust-version.workspace = true
|
||||
version.workspace = true
|
||||
|
||||
@@ -17,9 +17,9 @@
|
||||
under the License.
|
||||
-->
|
||||
|
||||
# DataFusion Session
|
||||
# Apache DataFusion Session
|
||||
|
||||
[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
|
||||
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.
|
||||
|
||||
This crate provides **session-related abstractions** used in the DataFusion query engine. A _session_ represents the runtime context for query execution, including configuration, runtime environment, function registry, and planning.
|
||||
|
||||
@@ -27,5 +27,6 @@ Most projects should use the [`datafusion`] crate directly, which re-exports
|
||||
this module. If you are already using the [`datafusion`] crate, there is no
|
||||
reason to use this crate directly in your project as well.
|
||||
|
||||
[df]: https://crates.io/crates/datafusion
|
||||
[apache arrow]: https://arrow.apache.org/
|
||||
[apache datafusion]: https://datafusion.apache.org/
|
||||
[`datafusion`]: https://crates.io/crates/datafusion
|
||||
|
||||
@@ -17,9 +17,15 @@ specific language governing permissions and limitations
|
||||
under the License.
|
||||
-->
|
||||
|
||||
# datafusion-spark: Spark-compatible Expressions
|
||||
# Apache DataFusion Spark-compatible Expressions
|
||||
|
||||
This crate provides Apache Spark-compatible expressions for use with DataFusion.
|
||||
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.
|
||||
|
||||
This crate is a submodule of DataFusion that provides [Apache Spark] compatible expressions for use with DataFusion.
|
||||
|
||||
[apache arrow]: https://arrow.apache.org/
|
||||
[apache datafusion]: https://datafusion.apache.org/
|
||||
[apache spark]: https://spark.apache.org/
|
||||
|
||||
## Testing Guide
|
||||
|
||||
@@ -29,12 +35,15 @@ or `coerce_types`) is not applied.
|
||||
Therefore, direct invocation tests should only be used to verify that the function is correctly implemented.
|
||||
|
||||
Please be sure to add additional tests beyond direct invocation.
|
||||
For more detailed testing guidelines, refer to
|
||||
the [Spark SQLLogicTest README](../sqllogictest/test_files/spark/README.md).
|
||||
For more detailed testing guidelines, refer to the [Spark SQLLogicTest README].
|
||||
|
||||
## Implementation References
|
||||
|
||||
When implementing Spark-compatible functions, you can check if there are existing implementations in
|
||||
the [Sail](https://github.com/lakehq/sail) or [Comet](https://github.com/apache/datafusion-comet) projects first.
|
||||
the [Sail] or [Comet] projects first.
|
||||
If you do port functionality from these sources, make sure to port over the corresponding tests too, to ensure
|
||||
correctness and compatibility.
|
||||
|
||||
[spark sqllogictest readme]: ../sqllogictest/test_files/spark/README.md
|
||||
[sail]: https://github.com/lakehq/sail
|
||||
[comet]: https://github.com/apache/datafusion-comet
|
||||
|
||||
@@ -17,10 +17,10 @@
|
||||
under the License.
|
||||
-->
|
||||
|
||||
# DataFusion SQL Query Planner
|
||||
# Apache DataFusion SQL Query Planner
|
||||
|
||||
This crate provides a general purpose SQL query planner that can parse SQL and translate queries into logical
|
||||
plans. Although this crate is used by the [DataFusion][df] query engine, it was designed to be easily usable from any
|
||||
plans. Although this crate is used by the [Apache DataFusion] query engine, it was designed to be easily usable from any
|
||||
project that requires a SQL query planner and does not make any assumptions about how the resulting logical plan
|
||||
will be translated to a physical plan. For example, there is no concept of row-based versus columnar execution in the
|
||||
logical plan.
|
||||
@@ -29,12 +29,12 @@ Note that the [`datafusion`] crate re-exports this module. If you are already
|
||||
using the [`datafusion`] crate in your project, there is no reason to use this
|
||||
crate directly in your project as well.
|
||||
|
||||
[df]: https://crates.io/crates/datafusion
|
||||
[apache datafusion]: https://datafusion.apache.org/
|
||||
[`datafusion`]: https://crates.io/crates/datafusion
|
||||
|
||||
## Example Usage
|
||||
|
||||
See the [examples](examples) directory for fully working examples.
|
||||
See the [examples] directory for fully working examples.
|
||||
|
||||
Here is an example of producing a logical plan from a SQL string.
|
||||
|
||||
@@ -69,8 +69,8 @@ fn main() {
|
||||
```
|
||||
|
||||
This is the logical plan that is produced from this example. Note that this is an **unoptimized**
|
||||
logical plan. The [datafusion-optimizer](https://crates.io/crates/datafusion-optimizer) crate provides a query
|
||||
optimizer that can be applied to plans produced by this crate.
|
||||
logical plan. The [datafusion-optimizer] crate provides a query optimizer that can be applied to
|
||||
plans produced by this crate.
|
||||
|
||||
```
|
||||
Sort: state_tax DESC NULLS FIRST
|
||||
@@ -87,4 +87,5 @@ Sort: state_tax DESC NULLS FIRST
|
||||
TableScan: orders
|
||||
```
|
||||
|
||||
[df]: https://crates.io/crates/datafusion
|
||||
[examples]: examples
|
||||
[datafusion-optimizer]: https://crates.io/crates/datafusion-optimizer
|
||||
|
||||
@@ -17,23 +17,29 @@
|
||||
under the License.
|
||||
-->
|
||||
|
||||
# DataFusion sqllogictest
|
||||
# Apache DataFusion sqllogictest
|
||||
|
||||
[DataFusion][df] is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
|
||||
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.
|
||||
|
||||
This crate is a submodule of DataFusion that contains an implementation of [sqllogictest](https://www.sqlite.org/sqllogictest/doc/trunk/about.wiki).
|
||||
This crate is a submodule of DataFusion that contains an implementation of [sqllogictest].
|
||||
|
||||
[df]: https://crates.io/crates/datafusion
|
||||
[apache arrow]: https://arrow.apache.org/
|
||||
[apache datafusion]: https://datafusion.apache.org/
|
||||
[sqllogictest]: https://www.sqlite.org/sqllogictest/doc/trunk/about.wiki
|
||||
|
||||
## Overview
|
||||
|
||||
This crate uses [sqllogictest-rs](https://github.com/risinglightdb/sqllogictest-rs) to parse and run `.slt` files in the
|
||||
[`test_files`](test_files) directory of this crate or the [`data/sqlite`](https://github.com/apache/datafusion-testing/tree/main/data/sqlite)
|
||||
directory of the [datafusion-testing](https://github.com/apache/datafusion-testing) crate.
|
||||
This crate uses [sqllogictest-rs] to parse and run `.slt` files in the [`test_files`] directory of
|
||||
this crate or the [`data/sqlite`] directory of the [datafusion-testing] repository.
|
||||
|
||||
[sqllogictest-rs]: https://github.com/risinglightdb/sqllogictest-rs
|
||||
[`test_files`]: test_files
|
||||
[`data/sqlite`]: https://github.com/apache/datafusion-testing/tree/main/data/sqlite
|
||||
[datafusion-testing]: https://github.com/apache/datafusion-testing
|
||||
|
||||
## Testing setup
|
||||
|
||||
1. `rustup update stable` DataFusion uses the latest stable release of rust
|
||||
1. `rustup update stable` DataFusion uses the latest stable release of Rust
|
||||
2. `git submodule init`
|
||||
3. `git submodule update --init --remote --recursive`
|
||||
|
||||
|
||||
@@ -19,9 +19,12 @@
|
||||
|
||||
# Apache DataFusion Substrait
|
||||
|
||||
This crate contains a [Substrait] producer and consumer for [Apache DataFusion]
|
||||
[Apache DataFusion] is an extensible query execution framework, written in Rust, that uses [Apache Arrow] as its in-memory format.
|
||||
|
||||
This crate is a submodule of DataFusion that provides a [Substrait] producer and consumer for DataFusion
|
||||
plans. See [API Docs] for details and examples.
|
||||
|
||||
[apache arrow]: https://arrow.apache.org/
|
||||
[apache datafusion]: https://datafusion.apache.org/
|
||||
[substrait]: https://substrait.io
|
||||
[apache datafusion]: https://datafusion.apache.org
|
||||
[api docs]: https://docs.rs/datafusion-substrait/latest
|
||||
|
||||