## Which issue does this PR close?

- Closes #20530

## Rationale for this change

The previous implementation of `array_position` used `compare_element_to_list` for every input row. When the needle is a scalar (quite common), we can do much better by searching over the entire flat haystack values array with a single call to `arrow_ord::cmp::not_distinct`. We can then iterate over the resulting set bits to determine per-row results.

This is ~5-10x faster than the previous implementation for typical inputs.

## What changes are included in this PR?

* Implement new fast path for `array_position` with scalar needle
* Improve docs for `array_position`
* Don't use `internal_err` to report a user-visible error

## Are these changes tested?

Yes, and benchmarked. Additional tests added in a separate PR (#20531)

## Are there any user-facing changes?

No.
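For readers unfamiliar with the approach, the scalar-needle fast path described in the rationale can be sketched roughly as follows. This is not the code from the PR: it uses plain slices instead of Arrow arrays, the function name `array_position_scalar` and the `values`/`offsets` layout are hypothetical stand-ins for a list array, and the 1-based result and the "null matches null" behaviour are assumptions modelled on `not_distinct` semantics.

```rust
/// Illustrative sketch only (hypothetical helper, not the PR's implementation).
/// `values` is the flat haystack; row `i` spans `values[offsets[i]..offsets[i + 1]]`.
/// Returns the 1-based position of the first match per row, or `None`.
fn array_position_scalar(
    values: &[Option<i64>],
    offsets: &[usize],
    needle: Option<i64>,
) -> Vec<Option<u64>> {
    // Step 1: a single comparison pass over the entire flat values array.
    // `Option`'s `==` treats `None == None` as equal, which stands in for
    // the "not distinct" comparison here.
    let matches: Vec<bool> = values.iter().map(|v| *v == needle).collect();

    let num_rows = offsets.len().saturating_sub(1);
    let mut result: Vec<Option<u64>> = vec![None; num_rows];

    // Step 2: visit only the matching positions, advancing a row cursor so
    // that each flat index is mapped back to the row that contains it.
    let mut row = 0;
    for (idx, &is_match) in matches.iter().enumerate() {
        if !is_match {
            continue;
        }
        // Skip rows (including empty ones) that end at or before `idx`.
        while row < num_rows && idx >= offsets[row + 1] {
            row += 1;
        }
        if row == num_rows {
            break; // match lies past the last row
        }
        if idx < offsets[row] {
            continue; // match lies before the first row (e.g. a sliced array)
        }
        // Only the first match in each row is recorded.
        if result[row].is_none() {
            result[row] = Some((idx - offsets[row] + 1) as u64);
        }
    }
    result
}
```

The point of the design is that the comparison runs once over contiguous data rather than once per row, and the per-row bookkeeping only touches positions that actually matched.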
DataFusion Documentation
This folder contains the source content of the User Guide and Contributor Guide. These are both published to https://datafusion.apache.org/ as part of the release process.
Dependencies
Install build dependencies and build the documentation using uv:
uv sync
uv run bash build.sh
The docs build regenerates the workspace dependency graph via
docs/scripts/generate_dependency_graph.sh, so ensure cargo, cargo-depgraph
(cargo install cargo-depgraph --version ^1.6 --locked), and Graphviz dot
(brew install graphviz or sudo apt-get install -y graphviz) are available.
Build & Preview
Run the provided script to build the HTML pages.
# If using venv, ensure you have activated it
./build.sh
The HTML will be generated into a build directory. Open build/html/index.html
in your preferred browser, for example:
# On macOS
open build/html/index.html
# On Linux with Firefox
firefox build/html/index.html
Making Changes
To make changes to the docs, simply make a Pull Request with your proposed changes as normal. When the PR is merged the docs will be automatically updated.
Release Process
This documentation is hosted at https://datafusion.apache.org/
When a PR is merged to the main branch of the DataFusion
repository, a GitHub workflow runs which:
- Builds the HTML content
- Pushes the HTML content to the asf-site branch in this repository.
The Apache Software Foundation provides https://datafusion.apache.org/, which serves content based on the configuration in .asf.yaml. That file specifies the target as https://datafusion.apache.org/.