datafusion

mirror of https://github.com/langchain-ai/datafusion.git synced 2026-07-01 21:24:06 -04:00

Author	SHA1	Message	Date
Lía Adriana	e937cadbcc	[fix] Add type coercion from NULL to Interval to make date_bin more postgres compatible (#20499) ## Which issue does this PR close? - Closes https://github.com/apache/datafusion/issues/20502 ## Rationale for this change The following query fails with a planning error: `SELECT date_bin(NULL, TIMESTAMP '2023-01-01 12:30:00', TIMESTAMP '2023-01-01 12:00:00')` `Error: Error during planning: Failed to coerce arguments to satisfy a call to 'date_bin' function: coercion from Null, Timestamp(ns), Timestamp(ns) to the signature OneOf([....])` ## What changes are included in this PR? Fix `date_bin(NULL, ...)` to return `NULL` instead of a planning error by allowing Null to coerce to Interval. ## Are these changes tested? I added a sqllogictest case to verify the query executes and returns `NULL`. ## Are there any user-facing changes? Yes, previously `date_bin(NULL, ...)` returned a planning error. It now returns `NULL`.	2026-02-25 08:02:30 +00:00
kosiew	d75fcb83e3	Fix physical expr adapter to resolve physical fields by name, not column index (#20485 ) ## Which issue does this PR close? * [Comment](https://github.com/apache/datafusion/pull/20202#discussion_r2804840366) on #20202 ## Rationale for this change When adapting physical expressions across differing logical/physical schemas, relying on `Column::index()` can be incorrect if the physical schema column ordering differs from the logical plan (or if a `Column` is constructed with an index that doesn’t match the current physical schema). This can lead to looking up the wrong physical field, causing incorrect casts, type mismatches, or runtime failures. This change ensures the adapter always resolves the physical field using the column name against the physical file schema, making expression rewriting robust to schema reordering and avoiding subtle bugs where an index points at an unrelated column. ## What changes are included in this PR? * Updated `create_cast_column_expr` to resolve the physical field via `physical_file_schema.index_of(column.name())` instead of `column.index()`. * Added a regression test that deliberately supplies a mismatched `Column` index and asserts the rewriter still selects the correct physical field by name and produces the expected `CastColumnExpr`. ## Are these changes tested? Yes. * Added `test_create_cast_column_expr_uses_name_lookup_not_column_index` which covers the scenario where physical and logical schemas have different column orders and the provided `Column` index is incorrect. ## Are there any user-facing changes? No direct user-facing changes. This is an internal correctness fix that improves robustness of physical expression adaptation when schema ordering differs between logical and physical plans. <!-- If there are any breaking changes to public APIs, please add the `api change` label. -->	2026-02-25 07:52:59 +00:00
Haresh Khanna	2347306943	[Minor] Fix error messages for `shrink` and `try_shrink` (#20422) ## Which issue does this PR close? - Closes #. ## Rationale for this change In the following code, when we load `prev` again to construct the error message, the value we get may differ from the value that failed `checked_sub` in the first place and took us out of the `fetch_update` CAS loop. Instead, the error message should use the `prev` value that `fetch_update` returned. ```rust pub fn try_shrink(&self, capacity: usize) -> Result<usize> { let prev = self .size .fetch_update( atomic::Ordering::Relaxed, atomic::Ordering::Relaxed, |prev| prev.checked_sub(capacity), ) .map_err(|_| { let prev = self.size.load(atomic::Ordering::Relaxed); internal_datafusion_err!( "Cannot free the capacity {capacity} out of allocated size {prev}" ) })?; self.registration.pool.shrink(self, capacity); Ok(prev - capacity) } ``` ## What changes are included in this PR? ## Are these changes tested? Yes, with existing tests. ## Are there any user-facing changes? No	2026-02-25 01:47:39 +00:00
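The `fetch_update` point above can be illustrated with a minimal standalone sketch that uses only the standard library. `Reservation` and its `String` error type are hypothetical stand-ins, not DataFusion's actual reservation type or error macro. The closure passed to `map_err` receives the value that actually failed `checked_sub`, so no second load is needed.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for a memory reservation; not DataFusion's type.
pub struct Reservation {
    size: AtomicUsize,
}

impl Reservation {
    pub fn new(size: usize) -> Self {
        Self {
            size: AtomicUsize::new(size),
        }
    }

    /// Shrinks the reservation by `capacity` and returns the new size.
    pub fn try_shrink(&self, capacity: usize) -> Result<usize, String> {
        let prev = self
            .size
            .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |prev| {
                prev.checked_sub(capacity)
            })
            // On failure, `fetch_update` hands back the value it observed,
            // so the message reports exactly the size that was too small.
            .map_err(|prev| {
                format!("Cannot free the capacity {capacity} out of allocated size {prev}")
            })?;
        Ok(prev - capacity)
    }
}
```

Reloading the atomic inside the error path could report a size that a concurrent thread changed after the failed subtraction, which is the mismatch the commit describes.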
Albert Skalt	387e20cc58	Improve `HashJoinExecBuilder` to save state from previous fields (#20276) ## Which issue does this PR close? Closes #20270 Prior to this patch, a `HashJoinExecBuilder` constructed from an existing node reset some fields of the node, e.g. dynamic filters and metrics. This significantly reduced the builder's scope of use. ## What changes are included in this PR? This patch improves the implementation. A builder created from an existing node now preserves all fields unless they have been explicitly updated. The builder also now tracks a flag indicating whether it must recompute plan properties. Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2026-02-24 22:15:34 +00:00
Neil Conway	585bbf35d3	perf: Optimize `array_has_any()` with scalar arg (#20385) ## Which issue does this PR close? - Closes #20384. - See #18181 for related context. ## Rationale for this change When `array_has_any` is passed a scalar for either of its arguments, we can use a much faster algorithm: rather than doing O(NM) comparisons for each row of the columnar arg, we can build a hash table on the scalar argument and probe it instead. ## What changes are included in this PR? * Add benchmark to cover the one-scalar-arg case * Implement optimization as described above Note that we fall back to a linear scan when the scalar arg is at or below a threshold (<= 8 elements), because benchmarks suggested probing a HashSet is not profitable for very small arrays. ## Are these changes tested? Yes. Tests pass and benchmarked. ## Are there any user-facing changes? No. --------- Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com> Co-authored-by: Jeffrey Vo <jeffrey.vo.australia@gmail.com>	2026-02-24 20:59:08 +00:00
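The hash-probe idea described above can be sketched with plain Rust collections. This is an illustration under simplifying assumptions, not DataFusion's implementation: rows are modelled as `Vec<i64>` rather than Arrow list arrays, nulls are ignored, and `has_any_with_scalar` and `LINEAR_SCAN_THRESHOLD` are hypothetical names (the value 8 mirrors the threshold mentioned in the commit message).

```rust
use std::collections::HashSet;

/// Below or at this size a linear scan is assumed to beat hashing.
/// Hypothetical constant; 8 mirrors the threshold in the commit message.
const LINEAR_SCAN_THRESHOLD: usize = 8;

/// For each row, reports whether it shares any element with `scalar`.
/// Hypothetical helper operating on plain vectors instead of Arrow arrays.
pub fn has_any_with_scalar(rows: &[Vec<i64>], scalar: &[i64]) -> Vec<bool> {
    if scalar.len() <= LINEAR_SCAN_THRESHOLD {
        // Small scalar argument: compare directly, no hash table.
        return rows
            .iter()
            .map(|row| row.iter().any(|v| scalar.contains(v)))
            .collect();
    }
    // Larger scalar argument: build the set once, then probe it per element.
    let lookup: HashSet<i64> = scalar.iter().copied().collect();
    rows.iter()
        .map(|row| row.iter().any(|v| lookup.contains(v)))
        .collect()
}
```

Building the set once amortizes its cost over every row, so each row costs roughly one probe per element instead of a scan of the whole scalar argument.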
Albert Skalt	34dad2ccee	Cache `PlanProperties`, add fast-path for `with_new_children` (#19792 ) - closes https://github.com/apache/datafusion/issues/19796 This patch aims to implement a fast-path for the ExecutionPlan::with_new_children function for some plans, moving closer to a physical plan re-use implementation and improving planning performance. If the passed children properties are the same as in self, we do not actually recompute self's properties (which could be costly if projection mapping is required). Instead, we just replace the children and re-use self's properties as-is. To be able to compare two different properties -- ExecutionPlan::properties(...) signature is modified and now returns `&Arc<PlanProperties>`. If `children` properties are the same in `with_new_children` -- we clone our properties arc and then a parent plan will consider our properties as unchanged, doing the same. - Return `&Arc<PlanProperties>` from `ExecutionPlan::properties(...)` instead of a reference. - Implement `with_new_children` fast-path if there is no children properties changes for all major plans. Note: currently, `reset_plan_states` does not allow to re-use plan in general: it is not supported for dynamic filters and recursive queries features, as in this case state reset should update pointers in the children plans. --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2026-02-24 20:58:06 +00:00
Ganesh Patil	b8cebdde2a	Fix incorrect regex pattern in regex_replace_posix_groups (#19827) The `regex_replace_posix_groups` method was using the pattern `(\d*)` to match POSIX capture group references like `\1`. However, `*` matches zero or more digits, which caused a lone backslash `\` to incorrectly become `${}`. Changed to `(\d+)`, which requires at least one digit, fixing the issue. Added unit tests to validate correct behavior. - Fixes #19766 --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2026-02-24 20:26:06 +00:00
Adam Gutglick	e80694e369	Remove recursive const check in `simplify_const_expr` (#20234) ## Which issue does this PR close? - Closes #20134. ## Rationale for this change The check for simplifying const expressions was recursive and expensive, repeatedly re-checking each expression's children. I tried other approaches, such as pre-computing the result for all expressions outside of the loop and using that cache during the traversal, but found that they yielded only a 5-8% improvement while adding complexity. This approach simplifies the code and appears more performant in my benchmarks (change is relative to the current main branch): ``` tpc-ds/q76/cs/16 time: [27.112 µs 27.159 µs 27.214 µs] change: [−13.533% −13.167% −12.801%] (p = 0.00 < 0.05) Performance has improved. Found 7 outliers among 100 measurements (7.00%) 1 (1.00%) low mild 4 (4.00%) high mild 2 (2.00%) high severe tpc-ds/q76/ws/16 time: [26.175 µs 26.280 µs 26.394 µs] change: [−14.312% −13.833% −13.346%] (p = 0.00 < 0.05) Performance has improved. Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) low mild tpc-ds/q76/cs/128 time: [195.79 µs 196.17 µs 196.56 µs] change: [−14.362% −14.080% −13.816%] (p = 0.00 < 0.05) Performance has improved. Found 5 outliers among 100 measurements (5.00%) 1 (1.00%) low severe 1 (1.00%) low mild 3 (3.00%) high mild tpc-ds/q76/ws/128 time: [197.08 µs 197.61 µs 198.23 µs] change: [−13.531% −13.142% −12.737%] (p = 0.00 < 0.05) Performance has improved. Found 3 outliers among 100 measurements (3.00%) 1 (1.00%) low mild 2 (2.00%) high mild ``` ## What changes are included in this PR? 1. `simplify_const_expr` now only checks itself and whether all of its children are literals, because it assumes simplification proceeds bottom-up. 2. Removes some code from the public API, see the last section for the full details. ## Are these changes tested? Existing test suite ## Are there any user-facing changes? 
I suggest removing some of the physical expression simplification code from the public API, which I believe reduces the maintenance burden here. These changes also help remove code like the distinct `simplify_const_expr` and `simplify_const_expr_with_dummy`. 1. Makes all `datafusion-physical-expr::simplifier` sub-modules (`not` and `const_evaluator`) private, including their key functions. They are not used externally, and being able to change their behavior seems more valuable long term. The simplifier is also not currently an extension point as far as I can tell, so there's no value in providing atomic building blocks like them for now. 2. Removes `has_column_references` completely; it's trivial to re-implement and isn't used anywhere in the codebase. --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2026-02-24 19:49:37 +00:00
Andrew Lamb	fdd36d0d21	Update comments on OptimizerRule about function name matching (#20346 ) ## Which issue does this PR close? - Related to https://github.com/apache/datafusion/pull/20180 ## Rationale for this change I gave feedback to @devanshu0987 https://github.com/apache/datafusion/pull/20180/changes#r2800720037 that it was not a good idea to check for function names in optimizer rules, but then I realized that the rationale for this is not written down anywhere. ## What changes are included in this PR? Document why checking for function names in optimizer rules is not good and offer alternatives ## Are these changes tested? By CI ## Are there any user-facing changes? Just docs, no functional changes	2026-02-24 19:46:32 +00:00
Raz Luvaton	b16ad9badc	fix: SortMergeJoin don't wait for all input before emitting (#20482) ## Which issue does this PR close? N/A ## Rationale for this change While experimenting with local tests and debugging a memory issue, I noticed that `SortMergeJoinStream` waits for all input before it starts emitting. That shouldn't be necessary, since we can emit early once we have enough data. It also causes huge memory pressure. ## What changes are included in this PR? Trying to fix the issue, not sure yet ## Are these changes tested? Yes ## Are there any user-facing changes? ----- ## TODO: - [x] update docs - [x] finish fix	2026-02-24 19:12:42 +00:00
Neil Conway	db5197b742	chore: Replace `matches!` on fieldless enums with `==` (#20525 ) ## Which issue does this PR close? N/A ## Rationale for this change When comparing a value with a field-less enum that implements `PartialEq`, `==` is simpler and more readable than `matches!`. ## What changes are included in this PR? ## Are these changes tested? Yes. ## Are there any user-facing changes? No.	2026-02-24 15:48:06 +00:00
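A small sketch of the two equivalent forms compared in the entry above. `JoinSide` here is a hypothetical field-less enum defined only for illustration; it is not taken from the changed code.

```rust
/// Hypothetical field-less enum used only for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JoinSide {
    Left,
    Right,
}

/// Pattern-matching form: works even without `PartialEq`.
pub fn is_left_with_matches(side: JoinSide) -> bool {
    matches!(side, JoinSide::Left)
}

/// Equality form: requires `PartialEq` on the enum, reads like a plain comparison.
pub fn is_left_with_eq(side: JoinSide) -> bool {
    side == JoinSide::Left
}
```

`matches!` remains the right tool when a variant carries data or when several variants need to be matched in one pattern.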
dependabot[bot]	932418b20c	chore(deps): bump strum_macros from 0.27.2 to 0.28.0 (#20521 ) Bumps [strum_macros](https://github.com/Peternator7/strum) from 0.27.2 to 0.28.0. <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/Peternator7/strum/blob/master/CHANGELOG.md">strum_macros's changelog</a>.</em></p> <blockquote> <h2>0.28.0</h2> <ul> <li> <p><a href="https://redirect.github.com/Peternator7/strum/pull/461">#461</a>: Allow any kind of passthrough attributes on <code>EnumDiscriminants</code>.</p> <ul> <li>Previously only list-style attributes (e.g. <code>#[strum_discriminants(derive(...))]</code>) were supported. Now path-only (e.g. <code>#[strum_discriminants(non_exhaustive)]</code>) and name/value (e.g. <code>#[strum_discriminants(doc = "foo")]</code>) attributes are also supported.</li> </ul> </li> <li> <p><a href="https://redirect.github.com/Peternator7/strum/pull/462">#462</a>: Add missing <code>#[automatically_derived]</code> to generated impls not covered by <a href="https://redirect.github.com/Peternator7/strum/pull/444">#444</a>.</p> </li> <li> <p><a href="https://redirect.github.com/Peternator7/strum/pull/466">#466</a>: Bump MSRV to 1.71, required to keep up with updated <code>syn</code> and <code>windows-sys</code> dependencies. 
This is a breaking change if you're on an old version of rust.</p> </li> <li> <p><a href="https://redirect.github.com/Peternator7/strum/pull/469">#469</a>: Use absolute paths in generated proc macro code to avoid potential name conflicts.</p> </li> <li> <p><a href="https://redirect.github.com/Peternator7/strum/pull/465">#465</a>: Upgrade <code>phf</code> dependency to v0.13.</p> </li> <li> <p><a href="https://redirect.github.com/Peternator7/strum/pull/473">#473</a>: Fix <code>cargo fmt</code> / <code>clippy</code> issues and add GitHub Actions CI.</p> </li> <li> <p><a href="https://redirect.github.com/Peternator7/strum/pull/477">#477</a>: <code>strum::ParseError</code> now implements <code>core::fmt::Display</code> instead <code>std::fmt::Display</code> to make it <code>#[no_std]</code> compatible. Note the <code>Error</code> trait wasn't available in core until <code>1.81</code> so <code>strum::ParseError</code> still only implements that in std.</p> </li> <li> <p><a href="https://redirect.github.com/Peternator7/strum/pull/476">#476</a>: <strong>Breaking Change</strong> - <code>EnumString</code> now implements <code>From<&str></code> (infallible) instead of <code>TryFrom<&str></code> when the enum has a <code>#[strum(default)]</code> variant. This more accurately reflects that parsing cannot fail in that case. 
If you need the old <code>TryFrom</code> behavior, you can opt back in using <code>parse_error_ty</code> and <code>parse_error_fn</code>:</p> <pre lang="rust"><code>#[derive(EnumString)] #[strum(parse_error_ty = strum::ParseError, parse_error_fn = make_error)] pub enum Color { Red, #[strum(default)] Other(String), } <p>fn make_error(x: &str) -> strum::ParseError { strum::ParseError::VariantNotFound } </code></pre></p> </li> <li> <p><a href="https://redirect.github.com/Peternator7/strum/pull/431">#431</a>: Fix bug where <code>EnumString</code> ignored the <code>parse_err_ty</code> attribute when the enum had a <code>#[strum(default)]</code> variant.</p> </li> <li> <p><a href="https://redirect.github.com/Peternator7/strum/pull/474">#474</a>: EnumDiscriminants will now copy <code>default</code> over from the original enum to the Discriminant enum.</p> <pre lang="rust"><code>#[derive(Debug, Default, EnumDiscriminants)] #[strum_discriminants(derive(Default))] // <- Remove this in 0.28. enum MyEnum { #[default] // <- Will be the #[default] on the MyEnumDiscriminant #[strum_discriminants(default)] // <- Remove this in 0.28 Variant0, Variant1 { a: NonDefault }, } </code></pre> </li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... 
(truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/Peternator7/strum/commit/7376771128834d28bb9beba5c39846cba62e71ec"><code>7376771</code></a> Peternator7/0.28 (<a href="https://redirect.github.com/Peternator7/strum/issues/475">#475</a>)</li> <li><a href="https://github.com/Peternator7/strum/commit/26e63cd964a2e364331a5dd977d589bb9f649d8c"><code>26e63cd</code></a> Display exists in core (<a href="https://redirect.github.com/Peternator7/strum/issues/477">#477</a>)</li> <li><a href="https://github.com/Peternator7/strum/commit/9334c728eedaa8a992d1388a8f4564bbccad1934"><code>9334c72</code></a> Make TryFrom and FromStr infallible if there's a default (<a href="https://redirect.github.com/Peternator7/strum/issues/476">#476</a>)</li> <li><a href="https://github.com/Peternator7/strum/commit/0ccbbf823c16e827afc263182cd55e99e3b2a52e"><code>0ccbbf8</code></a> Honor parse_err_ty attribute when the enum has a default variant (<a href="https://redirect.github.com/Peternator7/strum/issues/431">#431</a>)</li> <li><a href="https://github.com/Peternator7/strum/commit/2c9e5a9259189ce8397f2f4967060240c6bafd74"><code>2c9e5a9</code></a> Automatically add Default implementation to EnumDiscriminant if it exists on ...</li> <li><a href="https://github.com/Peternator7/strum/commit/e241243e48359b8b811b8eaccdcfa1ae87138e0d"><code>e241243</code></a> Fix existing cargo fmt + clippy issues and add GH actions (<a href="https://redirect.github.com/Peternator7/strum/issues/473">#473</a>)</li> <li><a href="https://github.com/Peternator7/strum/commit/639b67fefd20eaead1c5d2ea794e9afe70a00312"><code>639b67f</code></a> feat: allow any kind of passthrough attributes on <code>EnumDiscriminants</code> (<a href="https://redirect.github.com/Peternator7/strum/issues/461">#461</a>)</li> <li><a href="https://github.com/Peternator7/strum/commit/0ea1e2d0fd1460e7492ea32e6b460394d9199ff8"><code>0ea1e2d</code></a> docs: Fix typo (<a 
href="https://redirect.github.com/Peternator7/strum/issues/463">#463</a>)</li> <li><a href="https://github.com/Peternator7/strum/commit/36c051b91086b37d531c63ccf5a49266832a846d"><code>36c051b</code></a> Upgrade <code>phf</code> to v0.13 (<a href="https://redirect.github.com/Peternator7/strum/issues/465">#465</a>)</li> <li><a href="https://github.com/Peternator7/strum/commit/9328b38617dc6f4a3bc5fdac03883d3fc766cf34"><code>9328b38</code></a> Use absolute paths in proc macro (<a href="https://redirect.github.com/Peternator7/strum/issues/469">#469</a>)</li> <li>Additional commits viewable in <a href="https://github.com/Peternator7/strum/compare/v0.27.2...v0.28.0">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=strum_macros&package-manager=cargo&previous-version=0.27.2&new-version=0.28.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-02-24 15:47:50 +00:00
Neil Conway	e71e7a39bf	chore: Cleanup code to use `repeat_n` in a few places (#20527) ## Which issue does this PR close? N/A ## Rationale for this change Using `repeat_n` is more readable and slightly faster than `(0..n).map(|_| ...)`. ## What changes are included in this PR? ## Are these changes tested? Yes. ## Are there any user-facing changes? No.	2026-02-24 15:46:58 +00:00
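For comparison, the two styles named in the entry above might look like this. The helper names are hypothetical, and the sketch assumes a toolchain where `std::iter::repeat_n` is stable (Rust 1.82 or later).

```rust
use std::iter::repeat_n;

/// Builds `n` copies of a value by mapping over a range and ignoring the index.
pub fn placeholders_with_map(n: usize) -> Vec<String> {
    (0..n).map(|_| "?".to_string()).collect()
}

/// Builds the same vector with `repeat_n`, which clones the value as needed.
pub fn placeholders_with_repeat_n(n: usize) -> Vec<String> {
    repeat_n("?".to_string(), n).collect()
}
```

Both functions return the same result; the second states the intent ("repeat this value `n` times") directly.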
Adrian Garcia Badaracco	670dbf481c	fix: prevent duplicate alias collision with user-provided __datafusion_extracted names (#20432) ## Summary - Fixes a bug where the optimizer's `AliasGenerator` could produce alias names that collide with `__datafusion_extracted_N` aliases, causing a "Schema contains duplicate unqualified field name" error - I don't expect users themselves to create these aliases, but if you run the optimizers twice (with different `AliasGenerator` instances) you'll hit this. - Adds `AliasGenerator::update_min_id()` to advance the counter past existing aliases - Scans each plan node's expressions during `ExtractLeafExpressions` traversal to seed the generator before any extraction occurs - Switches to controlling the traversal, which also means the config-based short circuit more clearly skips the entire rule. Closes https://github.com/apache/datafusion/issues/20430 ## Test plan - [x] Unit test: `test_user_provided_extracted_alias_no_collision` in `extract_leaf_expressions` - [x] SLT regression test in `projection_pushdown.slt` with explicit `__datafusion_extracted_2` alias 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-24 15:02:59 +00:00
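The counter-seeding idea in the entry above can be sketched as follows. This is a simplified model, not DataFusion's `AliasGenerator`: `AliasCounter`, `next_alias`, and `extracted_id` are hypothetical names, and the real `update_min_id` signature and semantics may differ.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical, simplified alias counter (not DataFusion's `AliasGenerator`).
pub struct AliasCounter {
    next_id: AtomicUsize,
}

impl AliasCounter {
    pub fn new() -> Self {
        Self {
            next_id: AtomicUsize::new(1),
        }
    }

    /// Ensures the next generated id is at least `min_id`; never moves backwards.
    pub fn update_min_id(&self, min_id: usize) {
        self.next_id.fetch_max(min_id, Ordering::Relaxed);
    }

    /// Returns `{prefix}_{id}` and advances the counter.
    pub fn next_alias(&self, prefix: &str) -> String {
        let id = self.next_id.fetch_add(1, Ordering::Relaxed);
        format!("{prefix}_{id}")
    }
}

/// Parses `N` out of an existing alias such as `__datafusion_extracted_N`.
pub fn extracted_id(alias: &str) -> Option<usize> {
    alias.strip_prefix("__datafusion_extracted_")?.parse().ok()
}
```

Seeding the counter with `id + 1` for every alias already present in the plan means a fresh generator cannot hand out a name that is already in use.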
mishop-15	17d770d6e5	fix: handle out of range errors in DATE_BIN instead of panicking (#20221 ) ## Which issue does this PR close? Closes #20219 ## Rationale for this change The DATE_BIN function was panicking when datetime operations went out of range instead of returning proper errors. The two specific cases were: 1. Month subtraction going out of range causing `DateTime - Months` panic 2. `timestamp_nanos_opt()` returning None and then unwrapping ## What changes are included in this PR? - Changed `date_bin_months_interval` and `to_utc_date_time` to return `Result` instead of panicking - Replaced `origin_date - Months` and `origin_date + Months` with `checked_sub_months` and `checked_add_months` - Replaced `.unwrap()` calls with proper `match` statements and error handling - Updated all callers throughout the file to handle `Result` types ## Are these changes tested? Tested manually with the exact queries from the issue that were panicking: ```sql select DATE_BIN('1637426858', TO_TIMESTAMP_MILLIS(1040292460), TIMESTAMP '1984-01-07 00:00:00'); select DATE_BIN('1637426858', TO_TIMESTAMP_MILLIS(-1040292460), TIMESTAMP '1984-01-07 00:00:00'); ``` Both queries now return NULL instead of panicking. All existing unit tests pass. ## Are there any user-facing changes? Yes - queries with DATE_BIN that would previously panic now return NULL when datetime operations go out of range.	2026-02-24 13:55:32 +00:00
Neil Conway	9c85ac608f	perf: Fix quadratic behavior of `to_array_of_size` (#20459) ## Which issue does this PR close? - Closes #20458. - Closes #18159. ## Rationale for this change When `to_array_of_size(n)` was called on a `List`-like object containing a `StringViewArray` with `b` data buffers, the previous implementation returned a list containing a `StringViewArray` with `n*b` buffers, which results in catastrophically bad performance if `b` grows even somewhat large. This issue was previously noticed causing poor nested loop join performance. #18161 adjusted the NLJ code to avoid calling `to_array_of_size` for this reason, but didn't attempt to fix the underlying issue in `to_array_of_size`. This PR doesn't attempt to revert the change to the NLJ code: the special-case code added in #18161 is still slightly faster than `to_array_of_size` after this optimization. It might be possible to address that in a future PR. ## What changes are included in this PR? * Instead of using `repeat_n` + `concat` to merge together `n` copies of the `StringViewArray`, we use `take`, which preserves the same number of buffers as the input `StringViewArray`. * Add a new benchmark for this situation * Add more unit tests for `to_array_of_size` ## Are these changes tested? Yes and benchmarked. ## Are there any user-facing changes? No. ## AI usage Iterated on the problem with Claude Code; I understand the problem and the solution.	2026-02-24 13:53:10 +00:00
Tim Saucer	a9c090141d	Add support for FFI config extensions (#19469 ) ## Which issue does this PR close? This addresses part of https://github.com/apache/datafusion/issues/17035 This is also a blocker for https://github.com/apache/datafusion/issues/20450 ## Rationale for this change Currently we cannot support user defined configuration extensions via FFI. This is because much of the infrastructure on how to add and extract custom extensions relies on knowing concrete types of the extensions. This is not supported in FFI. This PR adds an implementation of configuration extensions that can be used across a FFI boundary. ## What changes are included in this PR? - Implement `FFI_ExtensionOptions`. - Update `ConfigOptions` to check if a `datafusion_ffi` namespace exists when setting values - Add unit test ## Are these changes tested? Unit test added. Also tested against `datafusion-python` locally. With this code I have the following test that passes. I have created a simple python exposed `MyConfig`: ```python from datafusion import SessionConfig from datafusion_ffi_example import MyConfig def test_catalog_provider(): config = MyConfig() config = SessionConfig().with_extension(config) config.set("my_config.baz_count", "42") ``` ## Are there any user-facing changes? New addition only.	2026-02-24 13:18:02 +00:00
kosiew	4a41587bdf	Make `custom_file_casts` example schema nullable to allow null `id` values during casting (#20486 ) ## Which issue does this PR close? * [Comment](https://github.com/apache/datafusion/pull/20202#discussion_r2804841561) on #20202 --- ## Rationale for this change The `custom_file_casts` example defines a logical/table schema that uses `id: Int32` as the target type. In practice, casting and projection paths in DataFusion can produce nulls (e.g. failed casts, missing values, or intermediate expressions), and examples should avoid implying that nulls are impossible when demonstrating casting behavior. Marking the `id` field as nullable makes the example more realistic and prevents confusion when users follow or adapt the example to scenarios where nulls may appear. --- ## What changes are included in this PR? * Update the logical/table schema in `custom_file_casts.rs` to define `id` as nullable (`Field::new("id", DataType::Int32, true)`). * Adjust the inline comment to reflect the nullable schema. --- ## Are these changes tested? No new tests were added. This is a documentation/example-only change that updates a schema definition and comment. The example continues to compile and can be exercised by running the `custom_file_casts` example as before. --- ## Are there any user-facing changes? Yes (example behavior/expectations): * The `custom_file_casts` example now documents `id` as nullable, aligning the example schema with situations where cast/projection may yield null values. * No public APIs are changed and no breaking behavior is introduced.	2026-02-24 12:24:42 +00:00
Tim-53	0dfa542201	fix: HashJoin panic with dictionary-encoded columns in multi-key joins (#20441) ## Which issue does this PR close? - Closes #20437 ## Rationale for this change `flatten_dictionary_array` returned only the unique values rather than the fully expanded array when called on a `DictionaryArray`. When building a `StructArray`, this caused a length mismatch panic. ## What changes are included in this PR? Replaced `array.values()` with `arrow::compute::cast(array, value_type)` in `flatten_dictionary_array`, which properly expands the dictionary into a full-length array matching the row count. ## Are these changes tested? Yes, both a new unit test and a regression test were added. ## Are there any user-facing changes? Nope --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2026-02-24 12:10:15 +00:00
dependabot[bot]	6c793694e9	chore(deps): bump the all-other-cargo-deps group with 2 updates (#20519 ) Bumps the all-other-cargo-deps group with 2 updates: [chrono](https://github.com/chronotope/chrono) and [wasm-bindgen-test](https://github.com/wasm-bindgen/wasm-bindgen). Updates `chrono` from 0.4.43 to 0.4.44 <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/chronotope/chrono/releases">chrono's releases</a>.</em></p> <blockquote> <h2>0.4.44</h2> <h2>What's Changed</h2> <ul> <li>docs: match MSRV with <code>Cargo.toml</code> contents by <a href="https://github.com/coryan"><code>@coryan</code></a> in <a href="https://redirect.github.com/chronotope/chrono/pull/1772">chronotope/chrono#1772</a></li> <li>Add track_caller to non-deprecated functions by <a href="https://github.com/svix-jplatte"><code>@svix-jplatte</code></a> in <a href="https://redirect.github.com/chronotope/chrono/pull/1774">chronotope/chrono#1774</a></li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/chronotope/chrono/commit/c14b4599d07ef36ffa1f8a531fb0bc7eb3b42464"><code>c14b459</code></a> Bump version to 0.4.44</li> <li><a href="https://github.com/chronotope/chrono/commit/ea832c5090369eefa2cb6a47d643e2f7ade7ffa7"><code>ea832c5</code></a> Add track_caller to non-deprecated functions</li> <li><a href="https://github.com/chronotope/chrono/commit/cfae889a3a23507acf49b605794abba17effd2d7"><code>cfae889</code></a> Fix panic message in to_rfc2822</li> <li><a href="https://github.com/chronotope/chrono/commit/f8900b5a44228a7f6282c65e8c407d3ecb6dcb7b"><code>f8900b5</code></a> docs: match MSRV with <code>Cargo.toml</code> contents</li> <li>See full diff in <a href="https://github.com/chronotope/chrono/compare/v0.4.43...v0.4.44">compare view</a></li> </ul> </details> <br /> Updates `wasm-bindgen-test` from 0.3.61 to 0.3.62 <details> <summary>Commits</summary> <ul> <li>See full diff in <a 
href="https://github.com/wasm-bindgen/wasm-bindgen/commits">compare view</a></li> </ul> </details> <br /> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-02-24 11:24:24 +00:00
dependabot[bot]	4c0a6531ca	chore(deps): bump taiki-e/install-action from 2.68.6 to 2.68.8 (#20518 ) Bumps [taiki-e/install-action](https://github.com/taiki-e/install-action) from 2.68.6 to 2.68.8. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/taiki-e/install-action/releases">taiki-e/install-action's releases</a>.</em></p> <blockquote> <h2>2.68.8</h2> <ul> <li> <p>Update <code>cargo-nextest@latest</code> to 0.9.129.</p> </li> <li> <p>Update <code>mise@latest</code> to 2026.2.19.</p> </li> <li> <p>Update <code>tombi@latest</code> to 0.7.32.</p> </li> </ul> <h2>2.68.7</h2> <ul> <li> <p>Update <code>mise@latest</code> to 2026.2.18.</p> </li> <li> <p>Update <code>wasm-bindgen@latest</code> to 0.2.111.</p> </li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/taiki-e/install-action/blob/main/CHANGELOG.md">taiki-e/install-action's changelog</a>.</em></p> <blockquote> <h1>Changelog</h1> <p>All notable changes to this project will be documented in this file.</p> <p>This project adheres to <a href="https://semver.org">Semantic Versioning</a>.</p> <!-- raw HTML omitted --> <h2>[Unreleased]</h2> <ul> <li> <p>Update <code>wasm-bindgen@latest</code> to 0.2.112.</p> </li> <li> <p>Update <code>uv@latest</code> to 0.10.5.</p> </li> </ul> <h2>[2.68.8] - 2026-02-23</h2> <ul> <li> <p>Update <code>cargo-nextest@latest</code> to 0.9.129.</p> </li> <li> <p>Update <code>mise@latest</code> to 2026.2.19.</p> </li> <li> <p>Update <code>tombi@latest</code> to 0.7.32.</p> </li> </ul> <h2>[2.68.7] - 2026-02-22</h2> <ul> <li> <p>Update <code>mise@latest</code> to 2026.2.18.</p> </li> <li> <p>Update <code>wasm-bindgen@latest</code> to 0.2.111.</p> </li> </ul> <h2>[2.68.6] - 2026-02-21</h2> <ul> <li>Update <code>wasm-bindgen@latest</code> to 0.2.110.</li> </ul> <h2>[2.68.5] - 2026-02-20</h2> <ul> <li>Update <code>wasm-bindgen@latest</code> to 0.2.109.</li> </ul> <h2>[2.68.4] - 
2026-02-20</h2> <ul> <li>Update <code>cargo-nextest@latest</code> to 0.9.128.</li> </ul> <h2>[2.68.3] - 2026-02-19</h2> <ul> <li> <p>Update <code>mise@latest</code> to 2026.2.17.</p> </li> <li> <p>Update <code>cargo-tarpaulin@latest</code> to 0.35.2.</p> </li> <li> <p>Update <code>syft@latest</code> to 1.42.1.</p> </li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/taiki-e/install-action/commit/cfdb446e391c69574ebc316dfb7d7849ec12b940"><code>cfdb446</code></a> Release 2.68.8</li> <li><a href="https://github.com/taiki-e/install-action/commit/350f13bd74589d52195d1aed8e04b35b616a9c49"><code>350f13b</code></a> Update <code>cargo-nextest@latest</code> to 0.9.129</li> <li><a href="https://github.com/taiki-e/install-action/commit/8ba6eccac43cdb9aa5b83627b897b416b206be2a"><code>8ba6ecc</code></a> Update <code>mise@latest</code> to 2026.2.19</li> <li><a href="https://github.com/taiki-e/install-action/commit/cf805946ef1da29d90b652c870b7a19aae44b0f5"><code>cf80594</code></a> Update <code>tombi@latest</code> to 0.7.32</li> <li><a href="https://github.com/taiki-e/install-action/commit/f92912fad184299a31e22ad070a5059fd07d4f59"><code>f92912f</code></a> Release 2.68.7</li> <li><a href="https://github.com/taiki-e/install-action/commit/4970026aba514ced4229209c822802e1bff68b3e"><code>4970026</code></a> Update <code>mise@latest</code> to 2026.2.18</li> <li><a href="https://github.com/taiki-e/install-action/commit/6043f02f023f20fde8f9436e0d500ee1391fab70"><code>6043f02</code></a> Update <code>wasm-bindgen@latest</code> to 0.2.111</li> <li>See full diff in <a href="https://github.com/taiki-e/install-action/compare/470679bc3a1580072dac4e67535d1aa3a3dcdf51...cfdb446e391c69574ebc316dfb7d7849ec12b940">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility 
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=taiki-e/install-action&package-manager=github_actions&previous-version=2.68.6&new-version=2.68.8)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-02-24 11:24:23 +00:00
dependabot[bot]	3aa34b33f5	chore(deps): bump strum from 0.27.2 to 0.28.0 (#20520 ) Bumps [strum](https://github.com/Peternator7/strum) from 0.27.2 to 0.28.0. <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/Peternator7/strum/blob/master/CHANGELOG.md">strum's changelog</a>.</em></p> <blockquote> <h2>0.28.0</h2> <ul> <li> <p><a href="https://redirect.github.com/Peternator7/strum/pull/461">#461</a>: Allow any kind of passthrough attributes on <code>EnumDiscriminants</code>.</p> <ul> <li>Previously only list-style attributes (e.g. <code>#[strum_discriminants(derive(...))]</code>) were supported. Now path-only (e.g. <code>#[strum_discriminants(non_exhaustive)]</code>) and name/value (e.g. <code>#[strum_discriminants(doc = "foo")]</code>) attributes are also supported.</li> </ul> </li> <li> <p><a href="https://redirect.github.com/Peternator7/strum/pull/462">#462</a>: Add missing <code>#[automatically_derived]</code> to generated impls not covered by <a href="https://redirect.github.com/Peternator7/strum/pull/444">#444</a>.</p> </li> <li> <p><a href="https://redirect.github.com/Peternator7/strum/pull/466">#466</a>: Bump MSRV to 1.71, required to keep up with updated <code>syn</code> and <code>windows-sys</code> dependencies. 
This is a breaking change if you're on an old version of rust.</p> </li> <li> <p><a href="https://redirect.github.com/Peternator7/strum/pull/469">#469</a>: Use absolute paths in generated proc macro code to avoid potential name conflicts.</p> </li> <li> <p><a href="https://redirect.github.com/Peternator7/strum/pull/465">#465</a>: Upgrade <code>phf</code> dependency to v0.13.</p> </li> <li> <p><a href="https://redirect.github.com/Peternator7/strum/pull/473">#473</a>: Fix <code>cargo fmt</code> / <code>clippy</code> issues and add GitHub Actions CI.</p> </li> <li> <p><a href="https://redirect.github.com/Peternator7/strum/pull/477">#477</a>: <code>strum::ParseError</code> now implements <code>core::fmt::Display</code> instead <code>std::fmt::Display</code> to make it <code>#[no_std]</code> compatible. Note the <code>Error</code> trait wasn't available in core until <code>1.81</code> so <code>strum::ParseError</code> still only implements that in std.</p> </li> <li> <p><a href="https://redirect.github.com/Peternator7/strum/pull/476">#476</a>: <strong>Breaking Change</strong> - <code>EnumString</code> now implements <code>From<&str></code> (infallible) instead of <code>TryFrom<&str></code> when the enum has a <code>#[strum(default)]</code> variant. This more accurately reflects that parsing cannot fail in that case. 
If you need the old <code>TryFrom</code> behavior, you can opt back in using <code>parse_error_ty</code> and <code>parse_error_fn</code>:</p> <pre lang="rust"><code>#[derive(EnumString)] #[strum(parse_error_ty = strum::ParseError, parse_error_fn = make_error)] pub enum Color { Red, #[strum(default)] Other(String), } <p>fn make_error(x: &str) -> strum::ParseError { strum::ParseError::VariantNotFound } </code></pre></p> </li> <li> <p><a href="https://redirect.github.com/Peternator7/strum/pull/431">#431</a>: Fix bug where <code>EnumString</code> ignored the <code>parse_err_ty</code> attribute when the enum had a <code>#[strum(default)]</code> variant.</p> </li> <li> <p><a href="https://redirect.github.com/Peternator7/strum/pull/474">#474</a>: EnumDiscriminants will now copy <code>default</code> over from the original enum to the Discriminant enum.</p> <pre lang="rust"><code>#[derive(Debug, Default, EnumDiscriminants)] #[strum_discriminants(derive(Default))] // <- Remove this in 0.28. enum MyEnum { #[default] // <- Will be the #[default] on the MyEnumDiscriminant #[strum_discriminants(default)] // <- Remove this in 0.28 Variant0, Variant1 { a: NonDefault }, } </code></pre> </li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... 
(truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/Peternator7/strum/commit/7376771128834d28bb9beba5c39846cba62e71ec"><code>7376771</code></a> Peternator7/0.28 (<a href="https://redirect.github.com/Peternator7/strum/issues/475">#475</a>)</li> <li><a href="https://github.com/Peternator7/strum/commit/26e63cd964a2e364331a5dd977d589bb9f649d8c"><code>26e63cd</code></a> Display exists in core (<a href="https://redirect.github.com/Peternator7/strum/issues/477">#477</a>)</li> <li><a href="https://github.com/Peternator7/strum/commit/9334c728eedaa8a992d1388a8f4564bbccad1934"><code>9334c72</code></a> Make TryFrom and FromStr infallible if there's a default (<a href="https://redirect.github.com/Peternator7/strum/issues/476">#476</a>)</li> <li><a href="https://github.com/Peternator7/strum/commit/0ccbbf823c16e827afc263182cd55e99e3b2a52e"><code>0ccbbf8</code></a> Honor parse_err_ty attribute when the enum has a default variant (<a href="https://redirect.github.com/Peternator7/strum/issues/431">#431</a>)</li> <li><a href="https://github.com/Peternator7/strum/commit/2c9e5a9259189ce8397f2f4967060240c6bafd74"><code>2c9e5a9</code></a> Automatically add Default implementation to EnumDiscriminant if it exists on ...</li> <li><a href="https://github.com/Peternator7/strum/commit/e241243e48359b8b811b8eaccdcfa1ae87138e0d"><code>e241243</code></a> Fix existing cargo fmt + clippy issues and add GH actions (<a href="https://redirect.github.com/Peternator7/strum/issues/473">#473</a>)</li> <li><a href="https://github.com/Peternator7/strum/commit/639b67fefd20eaead1c5d2ea794e9afe70a00312"><code>639b67f</code></a> feat: allow any kind of passthrough attributes on <code>EnumDiscriminants</code> (<a href="https://redirect.github.com/Peternator7/strum/issues/461">#461</a>)</li> <li><a href="https://github.com/Peternator7/strum/commit/0ea1e2d0fd1460e7492ea32e6b460394d9199ff8"><code>0ea1e2d</code></a> docs: Fix typo (<a 
href="https://redirect.github.com/Peternator7/strum/issues/463">#463</a>)</li> <li><a href="https://github.com/Peternator7/strum/commit/36c051b91086b37d531c63ccf5a49266832a846d"><code>36c051b</code></a> Upgrade <code>phf</code> to v0.13 (<a href="https://redirect.github.com/Peternator7/strum/issues/465">#465</a>)</li> <li><a href="https://github.com/Peternator7/strum/commit/9328b38617dc6f4a3bc5fdac03883d3fc766cf34"><code>9328b38</code></a> Use absolute paths in proc macro (<a href="https://redirect.github.com/Peternator7/strum/issues/469">#469</a>)</li> <li>Additional commits viewable in <a href="https://github.com/Peternator7/strum/compare/v0.27.2...v0.28.0">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=strum&package-manager=cargo&previous-version=0.27.2&new-version=0.28.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-02-24 11:23:48 +00:00
Dmitrii Blaginin	11ef486e6c	Runs-on for extended CI checks (#20511 ) Part of https://github.com/apache/datafusion/issues/20052 ## Which issue does this PR close? Example run: https://github.com/apache/datafusion/actions/runs/22325922758 This reduced the run time from 3h to 1h. That is still a lot (on my Mac it runs in 5m!), but it's a start. --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-24 10:34:49 +00:00
Xander	d59cdfe999	Fix name tracker (#19856 ) ## Which issue does this PR close? - Closes #17508 ## Rationale for this change The previous implementation used UUID-based aliasing as a workaround to prevent duplicate names for literals in Substrait plans. This approach had several drawbacks: - Non-deterministic plan names that made testing difficult (requiring UUID regex filters) - Only addressed literal naming conflicts, not the broader issue of name deduplication - Added unnecessary dependency on the `uuid` crate - Didn't properly handle cases where the same qualified name could appear with different schema representations ## What changes are included in this PR? 1. Enhanced NameTracker: Refactored to detect two types of conflicts: - Duplicate schema names: Tracked via schema_name() to prevent validate_unique_names failures (e.g., two Utf8(NULL) literals) - Ambiguous references: Tracked via qualified_name() to prevent DFSchema::check_names failures when a qualified field (e.g., left.Utf8(NULL)) and unqualified field (e.g., Utf8(NULL)) share the same column name 2. Removed UUID dependency: Eliminated the `uuid` crate from `datafusion/substrait` 3. Removed literal-specific aliasing: The UUID-based workaround in `project_rel.rs` is no longer needed as the improved NameTracker handles all naming conflicts consistently 4. Deterministic naming: Name conflicts now use predictable `__temp__N` suffixes instead of random UUIDs Note: This doesn't fully fix all the issues in #17508 which allow some special casing of `CAST` which are not included here. ## Are these changes tested? Yes: - Updated snapshot tests to reflect the new deterministic naming (e.g., `Utf8("people")__temp__0` instead of UUID-based names) - Modified some roundtrip tests to verify semantic equivalence (schema matching and execution) rather than exact string matching, which is more robust - All existing integration tests pass with the new naming scheme ## Are there any user-facing changes? Minimal. 
The generated plan names are now deterministic and more readable (using `__temp__N` suffixes instead of UUIDs), but this is primarily an internal representation change. The functional behavior and query results remain unchanged.	2026-02-24 08:15:59 +00:00
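
The deterministic `__temp__N` suffixing described in the name tracker commit can be sketched as follows. This is a simplified, hypothetical illustration and not the Substrait consumer's implementation: it deduplicates plain strings only, whereas the commit describes tracking both `schema_name()` and `qualified_name()`. The type `NameTracker` here and its method `unique_name` are assumptions made for the example.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified name tracker that deduplicates plain strings.
#[derive(Default)]
struct NameTracker {
    seen: HashSet<String>,
}

impl NameTracker {
    /// Returns `name` unchanged if it has not been seen, otherwise the first
    /// unused `name__temp__N` candidate, counting N up from 0.
    fn unique_name(&mut self, name: &str) -> String {
        if self.seen.insert(name.to_string()) {
            return name.to_string();
        }
        let mut n = 0;
        loop {
            let candidate = format!("{name}__temp__{n}");
            if self.seen.insert(candidate.clone()) {
                return candidate;
            }
            n += 1;
        }
    }
}
```

Because the suffix depends only on the order in which names are registered, the same plan always produces the same names, which is what makes snapshot tests stable without UUID filters.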
Neil Conway	b6d46a6382	perf: Optimize `initcap()` (#20352 ) ## Which issue does this PR close? - Closes #20351. ## Rationale for this change When all values in a `Utf8`/`LargeUtf8` array are ASCII, we can skip using `GenericStringBuilder` and instead process the entire input buffer in a single pass using byte-level operations. This also avoids recomputing the offsets and nulls arrays. A similar optimization is already used for lower() and upper(). Along the way, optimize `initcap_string()` for ASCII-only inputs. It already had an ASCII-only fastpath but there was room for further optimization, by iterating over bytes rather than characters. ## What changes are included in this PR? * Cleanup benchmarks: we ran the scalar benchmark for different array sizes, despite the fact that it is invariant to the array size * Add benchmark for different string lengths * Add benchmark for Unicode array input * Optimize for ASCII-only inputs as described above * Add test case for ASCII-only input that is a sliced array * Add test case variants for `LargeStringArray` ## Are these changes tested? Yes, plus an additional test added. ## Are there any user-facing changes? No.	2026-02-24 06:11:08 +00:00
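
The byte-level idea behind the `initcap()` fast path can be sketched with the standard library alone. This is an illustration under assumptions, not the code from the PR: `initcap_ascii` is a hypothetical helper, it handles one string rather than a whole Arrow buffer, and it assumes words are delimited by non-alphanumeric characters.

```rust
/// Byte-level initcap for ASCII-only input. Returns `None` for non-ASCII
/// input, where a caller would fall back to a char-based path.
fn initcap_ascii(input: &str) -> Option<String> {
    if !input.is_ascii() {
        return None;
    }
    let mut out = Vec::with_capacity(input.len());
    let mut start_of_word = true;
    for &b in input.as_bytes() {
        if b.is_ascii_alphanumeric() {
            // First alphanumeric byte of a word is uppercased, the rest lowercased.
            out.push(if start_of_word {
                b.to_ascii_uppercase()
            } else {
                b.to_ascii_lowercase()
            });
            start_of_word = false;
        } else {
            // Any other byte is copied as-is and starts a new word.
            out.push(b);
            start_of_word = true;
        }
    }
    // ASCII case mapping keeps every byte in the ASCII range, so this cannot fail.
    Some(String::from_utf8(out).expect("ASCII bytes are valid UTF-8"))
}
```

Since ASCII case mapping never changes a string's byte length, the output has exactly the same length as the input, which is why the commit can reuse the existing offsets and nulls instead of rebuilding them.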
Dmitrii Blaginin	7602913b0f	Switch to the latest Mac OS (#20510 )	2026-02-23 22:57:49 +00:00
Andrew Lamb	b9328b9734	Upgrade to sqlparser 0.61.0 (#20177 ) DRAFT until SQL parser is released ## Which issue does this PR close? - part of https://github.com/apache/datafusion-sqlparser-rs/issues/2117 ## Rationale for this change Keep up to date with dependencies I think @Samyak2 specifically would like access to the `:` field syntax ## What changes are included in this PR? 1. Update to 0.61.0 2. Update APIs ## Are these changes tested? Yes by existing tests ## Are there any user-facing changes? New dependency --------- Co-authored-by: Jeffrey Vo <jeffrey.vo.australia@gmail.com>	2026-02-23 18:49:08 +00:00
Neil Conway	d303f5817f	chore: Add end-to-end benchmark for array_agg, code cleanup (#20496 ) ## Which issue does this PR close? - Prep work for #20465 ## Rationale for this change - Add three queries to measure the end-to-end performance of `array_agg()`, as prep work for optimizing its performance. ## What changes are included in this PR? This PR also cleans up the `data_utils` benchmark code: - Seed the RNG once and use it for all data generation. The previous code seeded an RNG but only used it for some data, and also used the same seed for every batch, which led to repeated data (... I assume this was not the intent?) - The previous code made `u64_wide` a nullable field, but passed `9.0` for the `value_density` when generating data, which meant that no NULL values would ever be generated. Switch to making `u64_wide` non-nullable. - Fix up comments, remove a clippy suppression, various other cleanups. ## Are these changes tested? Yes. ## Are there any user-facing changes? No.	2026-02-23 18:26:16 +00:00
Oleks V	df8f818b29	chore: Avoid build fails on MinIO rate limits (#20472 ) ## Which issue does this PR close? - Closes #. ## Rationale for this change CI sometimes failed because of Docker pull rate limits. ``` thread 'test_s3_url_fallback' (11052) panicked at datafusion-cli/tests/cli_integration.rs:116:13: Failed to start MinIO container. Ensure Docker is running and accessible: failed to pull the image 'minio/minio:RELEASE.2025-02-28T09-55-16Z', error: Docker responded with status code 500: toomanyrequests: You have reached your unauthenticated pull rate limit. https://www.docker.com/increase-rate-limit stack backtrace: ``` Example https://github.com/apache/datafusion/actions/runs/22262073722/job/64401977127 ## What changes are included in this PR? Ignore the tests only when the rate limit is hit. ## Are these changes tested? ## Are there any user-facing changes?	2026-02-23 16:09:01 +00:00
Oleks V	ed0323a2bb	feat: support `arrays_zip` function (#20440 ) ## Which issue does this PR close? - Closes #. ## Rationale for this change Summary - Adds a new arrays_zip scalar function that combines multiple arrays into a single array of structs, where each struct field corresponds to an input array - Shorter arrays within a row are padded with NULLs to match the longest array's length - Compatible with Spark's arrays_zip behavior ## What changes are included in this PR? ``` arrays_zip takes N list arrays and produces a List<Struct<c0, c1, ..., cN>> where each struct contains the elements at the same index from each input array. 
> SELECT arrays_zip([1, 2, 3], ['a', 'b', 'c']); [{c0: 1, c1: a}, {c0: 2, c1: b}, {c0: 3, c1: c}] > SELECT arrays_zip([1, 2], [3, 4, 5]); [{c0: 1, c1: 3}, {c0: 2, c1: 4}, {c0: NULL, c1: 5}] Implementation details: - Implemented in set_ops.rs following existing array function patterns - Uses MutableArrayData builders per column with row-by-row processing for efficient memory handling - For each row, computes the max array length, copies values from each input array, and pads shorter arrays with NULLs - Supports variadic arguments (2 or more arrays) - Handles NULL list entries, NULL elements, empty arrays, mixed types, and Null-typed arguments - Registered as arrays_zip with alias list_zip - Uses Signature::variadic_any with Volatility::Immutable ``` ## Are these changes tested? ## Are there any user-facing changes?	2026-02-23 16:08:36 +00:00
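
The padding rule for `arrays_zip` (take the longest input's length, fill missing positions with NULL) can be sketched for a single row and two inputs. This is a stdlib-only illustration, not the Arrow-based implementation: `arrays_zip2` is a hypothetical helper, tuples stand in for the `{c0, c1}` structs, and `None` stands in for SQL NULL.

```rust
/// Zips two lists element-wise, padding the shorter one with `None`
/// (standing in for SQL NULL) up to the longer list's length.
fn arrays_zip2<A: Clone, B: Clone>(a: &[A], b: &[B]) -> Vec<(Option<A>, Option<B>)> {
    let len = a.len().max(b.len());
    (0..len)
        // `get` returns `None` past the end of the shorter list, which is
        // exactly the padding behaviour described in the commit.
        .map(|i| (a.get(i).cloned(), b.get(i).cloned()))
        .collect()
}
```

Calling `arrays_zip2(&[1, 2], &[3, 4, 5])` mirrors the second SQL example in the commit message: the third tuple has `None` in the first position.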
Andrew Lamb	89a8576171	docs: Document that adding new optimizer rules are expensive (#20348 ) ## Which issue does this PR close? - Similarly to https://github.com/apache/datafusion/pull/20346 ## Rationale for this change As part of PR reviews, it seems that it is not obvious to some contributors that there is a non-trivial cost to adding new optimizer rules. Let's add that knowledge into the codebase as comments, so it may be less of a surprise. ## What changes are included in this PR? Add comments ## Are these changes tested? N/A ## Are there any user-facing changes? No, this is entirely internal comments only --------- Co-authored-by: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com>	2026-02-23 11:21:17 +00:00
Dmitrii Blaginin	60457d0b0a	Runs-on for more actions (#20274 ) Follow up on https://github.com/apache/datafusion/pull/20107: switch more actions to the new flow

| Job | OLD | NEW | Delta |
|---|---|---|---|
| linux build test (from #20107) | 3m 55s | 1m 46s | -2m 09s (55% faster) |
| cargo test (amd64) (from #20107) | 11m 34s | 3m 13s | -8m 21s (72% faster) |
| cargo check datafusion features | 11m 18s | 6m 21s | -4m 57s (44% faster) |
| cargo examples (amd64) | 9m 13s | 4m 35s | -4m 38s (50% faster) |
| verify benchmark results (amd64) | 11m 48s | 4m 22s | -7m 26s (63% faster) |
| cargo check datafusion-substrait features | 10m 20s | 3m 56s | -6m 24s (62% faster) |
| cargo check datafusion-proto features | 4m 48s | 2m 25s | -2m 23s (50% faster) |
| cargo test datafusion-cli (amd64) | 5m 42s | 1m 58s | -3m 44s (65% faster) |
| cargo test doc (amd64) | 8m 07s | 3m 16s | -4m 51s (60% faster) |
| cargo doc | 5m 10s | 1m 56s | -3m 14s (63% faster) |
| Run sqllogictest with Postgres runner | 6m 06s | 2m 46s | -3m 20s (55% faster) |
| Run sqllogictest in Substrait round-trip mode | 6m 42s | 2m 38s | -4m 04s (61% faster) |
| clippy | 6m 01s | 2m 10s | -3m 51s (64% faster) |
| check configs.md and \*\*\*\_functions.md is up-to-date | 6m 54s | 2m 12s | -4m 42s (68% faster) |

	2026-02-23 10:04:37 +00:00
Filippo	7815732f0f	feat(memory-tracking): implement arrow_buffer::MemoryPool for MemoryPool (#18928 ) ## Which issue does this PR close? - Closes #18926 ## Rationale for this change Related to #16841. The ability to correctly account for memory usage of arrow buffers in execution nodes is crucial to maximise resource usage while preventing OOMs. ## What changes are included in this PR? - An implementation of arrow_buffer::MemoryPool for DataFusion's MemoryPool under the `arrow_buffer_pool` feature-flag ## Are these changes tested? Yes! ## Are there any user-facing changes? Introduced new API.	2026-02-23 06:15:09 +00:00
Andy Grove	9660c98743	perf: Use zero-copy slice instead of take kernel in sort merge join (#20463 ) ## Summary Follows on from https://github.com/apache/datafusion/pull/20464 which adds new criterion benchmarks. - When the join indices form a contiguous ascending range (e.g. `[3,4,5,6]`), replace the O(n) Arrow `take` kernel with O(1) `RecordBatch::slice` (zero-copy pointer arithmetic) - Applies to both the streamed (left) and buffered (right) sides of the sort merge join ## Rationale In SMJ, the streamed side cursor advances sequentially, so its indices are almost always contiguous. The buffered side is scanned sequentially within each key group, so its indices are also contiguous for 1:1 and 1:few joins. The `take` kernel allocates new arrays and copies data even when a simple slice would suffice. ## Benchmark Results Criterion micro-benchmark (100K rows, pre-sorted, no sort/scan overhead):

| Benchmark | Baseline | Optimized | Improvement |
|-----------|----------|-----------|-------------|
| inner_1to1 (unique keys) | 5.11 ms | 3.88 ms | -24% |
| inner_1to10 (10K keys) | 17.64 ms | 16.29 ms | -8% |
| left_1to1_unmatched (5% unmatched) | 4.80 ms | 3.87 ms | -19% |
| left_semi_1to10 (10K keys) | 3.65 ms | 3.11 ms | -15% |
| left_anti_partial (partial match) | 3.58 ms | 3.43 ms | -4% |

All improvements are statistically significant (p < 0.05). TPC-H SF1 with SMJ forced (`prefer_hash_join=false`) shows no regressions across all 22 queries, with modest end-to-end improvements on join-heavy queries (Q3 -7%, Q19 -5%, Q21 -2%). ## Implementation - `is_contiguous_range()`: checks if a `UInt64Array` is a contiguous ascending range. Uses quick endpoint rejection then verifies every element sequentially. - `freeze_streamed()`: uses `slice` instead of `take` for streamed (left) columns when indices are contiguous. - `fetch_right_columns_from_batch_by_idxs()`: uses `slice` instead of `take` for buffered (right) columns when indices are contiguous. When indices are not contiguous (e.g. repeated indices in many-to-many joins), falls back to the existing `take` path. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 14:43:55 +00:00
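
The contiguity check and the slice-versus-take decision from the sort merge join commit can be sketched on plain slices. This is a stdlib-only illustration under assumptions, not the code from the PR: `contiguous_range` and `select` are hypothetical names, a borrowed sub-slice stands in for `RecordBatch::slice`, and a per-index copy stands in for the `take` kernel.

```rust
use std::borrow::Cow;

/// Returns `(start, len)` if `indices` is a non-empty run of consecutive
/// ascending values, otherwise `None`.
fn contiguous_range(indices: &[u64]) -> Option<(u64, usize)> {
    let first = *indices.first()?;
    let last = *indices.last()?;
    // Quick endpoint rejection: a contiguous run of n values spans exactly n - 1.
    if last.checked_sub(first)? != (indices.len() - 1) as u64 {
        return None;
    }
    // Endpoints can match by accident (e.g. [3, 5, 4, 6]), so verify each step.
    let contiguous = indices
        .windows(2)
        .all(|w| w[0].checked_add(1) == Some(w[1]));
    contiguous.then_some((first, indices.len()))
}

/// Borrows a sub-slice when the indices are contiguous (no copy), and falls
/// back to copying element by element otherwise.
fn select<'a, T: Clone>(data: &'a [T], indices: &[u64]) -> Cow<'a, [T]> {
    match contiguous_range(indices) {
        Some((start, len)) => {
            let start = start as usize;
            Cow::Borrowed(&data[start..start + len])
        }
        None => Cow::Owned(
            indices
                .iter()
                .map(|&i| data[i as usize].clone())
                .collect::<Vec<T>>(),
        ),
    }
}
```

Repeated indices such as `[0, 0, 4]` fail the endpoint test or the step-by-step check, so `select` returns an owned copy for them, matching the fallback described for many-to-many joins.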
Zhang Xiaofeng	bfc012e638	bench: Add IN list benchmarks for non-constant list expressions (#20444 ) ## Which issue does this PR close? - Relates to #20427 . ## Rationale for this change The existing `in_list` benchmarks only cover the static filter path (constant literal lists), which uses HashSet lookup. There are no benchmarks for the dynamic evaluation path, triggered when the IN list contains non-constant expressions such as column references (e.g., `a IN (b, c, d)`). Adding these benchmarks establishes a baseline for measuring the impact of upcoming optimizations to the dynamic path (see #20428). ## What changes are included in this PR? Add criterion benchmarks for the dynamic IN list evaluation path: - `bench_dynamic_int32`: Int32 column references, list sizes [3, 8, 28] × match rates [0%, 50%, 100%] × null rates [0%, 20%] - `bench_dynamic_utf8`: Utf8 column references, list sizes [3, 8, 28] × match rates [0%, 50%, 100%] ## Are these changes tested? Yes. The benchmarks compile and run correctly. No implementation code is changed. ## Are there any user-facing changes?	2026-02-22 07:40:02 +00:00
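The difference between the two IN list paths named in bfc012e638 can be sketched as follows. This is an illustration under stated assumptions, not DataFusion's implementation: `in_list_static` and `in_list_dynamic` are hypothetical names, columns are plain `&[i32]` slices, and nulls are ignored.

```rust
use std::collections::HashSet;

/// Static path: the list is a set of constants, so it can be hashed once
/// and every row becomes a single set lookup.
fn in_list_static(values: &[i32], list: &HashSet<i32>) -> Vec<bool> {
    values.iter().map(|v| list.contains(v)).collect()
}

/// Dynamic path: each list entry is itself a column (as in `a IN (b, c, d)`),
/// so the candidates differ per row and every row is compared against them.
fn in_list_dynamic(a: &[i32], list_columns: &[&[i32]]) -> Vec<bool> {
    (0..a.len())
        .map(|row| list_columns.iter().any(|col| col[row] == a[row]))
        .collect()
}
```

The dynamic version does work proportional to rows × list size, which is why list size is one of the benchmark dimensions.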
Daniël Heres	c1ad8636a0	[Minor] Use buffer_unordered (#20462 ) ## Which issue does this PR close? - Closes #. ## Rationale for this change `buffer_unordered` should be slightly better here - as we sort by the paths anyway (perhaps we can reduce the default concurrency). Also remove some unnecessary allocations. ## What changes are included in this PR? ## Are these changes tested? ## Are there any user-facing changes?	2026-02-22 07:38:53 +00:00
Kumar Ujjawal	f488a9071b	perf: Optimize scalar fast path for `regexp_like` and reject `g` inside combined flags like `ig` (#20354 ) ## Which issue does this PR close? - Part of https://github.com/apache/datafusion-comet/issues/2986 ## Rationale for this change `regexp_like` was converting scalar inputs into single‑element arrays, adding avoidable overhead for constant folding and scalar‑only evaluations. ## What changes are included in this PR? - Add a scalar fast path in `RegexpLikeFunc::invoke_with_args` that evaluates `regexp_like` directly for scalar inputs - Add benchmark - Fix `regexp_like` to reject the global flag even when provided in combined flags (e.g., `ig`) across scalar and array+scalar execution paths; add tests for both branches. \| Type \| Before \| After \| Speedup \| \|------\|--------\|-------\|---------\| \| regexp_like_scalar_utf8 \| 12.092 µs \| 10.943 µs \| 1.10x \| ## Are these changes tested? Yes ## Are there any user-facing changes? No --------- Co-authored-by: Jeffrey Vo <jeffrey.vo.australia@gmail.com>	2026-02-22 01:03:52 +00:00
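The flag check described in f488a9071b can be sketched as below. This is a hedged illustration: `validate_flags` and its error text are hypothetical and not taken from the PR. The point is that the check scans every character, so `g` is rejected whether it appears alone or inside a combination such as `ig`.

```rust
/// Rejects any flag string containing the global flag `g`, whether it is
/// passed alone ("g") or combined with other flags ("ig", "gi", ...).
/// Hypothetical helper; the name and error message are assumptions.
fn validate_flags(flags: &str) -> Result<(), String> {
    if flags.contains('g') {
        return Err(format!(
            "regexp_like does not support the global flag, got {flags:?}"
        ));
    }
    Ok(())
}
```

A check that only compared the whole string against `"g"` would let `"ig"` through, which is the kind of gap the commit message describes closing.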
dependabot[bot]	cfdd7c180c	chore(deps): bump testcontainers-modules from 0.14.0 to 0.15.0 (#20471 ) Bumps [testcontainers-modules](https://github.com/testcontainers/testcontainers-rs-modules-community) from 0.14.0 to 0.15.0. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/testcontainers/testcontainers-rs-modules-community/releases">testcontainers-modules's releases</a>.</em></p> <blockquote> <h2>v0.15.0</h2> <h3>Documentation</h3> <ul> <li>Complete doc string for mongodb usage (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/375">#375</a>)</li> <li>Complete doc comments for confluents kafka image (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/376">#376</a>)</li> <li>Complete doc-comment for dynamodb (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/378">#378</a>)</li> <li>Complete doc comments for confluents ElasticMQ image (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/379">#379</a>)</li> <li>Complete doc comments for nats' images (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/383">#383</a>)</li> <li>Complete doc comments for k3s images (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/381">#381</a>)</li> <li>Complete doc comments for elasticsearch image (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/380">#380</a>)</li> <li>Complete doc comments for the parity image (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/384">#384</a>)</li> <li>Complete doc comments for orientdb images (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/382">#382</a>)</li> <li>Complete doc comment for minio (<a 
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/377">#377</a>)</li> <li>Complete doc comments for the google_cloud_sdk_emulators image (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/385">#385</a>)</li> <li>Add a docstring for the last missing function <code>Consul::with_local_config</code> (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/386">#386</a>)</li> </ul> <h3>Features</h3> <ul> <li>[<strong>breaking</strong>] Update testcontainers to 0.25.0 (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/388">#388</a>)</li> </ul> <h3>Miscellaneous Tasks</h3> <ul> <li>Update redis requirement from 0.29.0 to 0.32.2 (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/362">#362</a>)</li> <li>Update async-nats requirement from 0.41.0 to 0.42.0 (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/360">#360</a>)</li> <li>Update lapin requirement from 2.3.1 to 3.0.0 (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/359">#359</a>)</li> <li>Update arrow-flight requirement from 55.1.0 to 56.0.0 (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/374">#374</a>)</li> <li>Update rdkafka requirement from 0.37.0 to 0.38.0 (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/365">#365</a>)</li> <li>Update meilisearch-sdk requirement from 0.28.0 to 0.29.1 (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/370">#370</a>)</li> <li>Update azure_core to 0.27.0 (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/390">#390</a>)</li> </ul> <!-- raw HTML omitted --> </blockquote> </details> <details> <summary>Changelog</summary> 
<p><em>Sourced from <a href="https://github.com/testcontainers/testcontainers-rs-modules-community/blob/main/CHANGELOG.md">testcontainers-modules's changelog</a>.</em></p> <blockquote> <h2>[0.15.0] - 2026-02-21</h2> <h3>Bug Fixes</h3> <ul> <li>Ready condition in ClickHouse (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/441">#441</a>)</li> </ul> <h3>Features</h3> <ul> <li>Add RustFS module (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/444">#444</a>)</li> <li>[<strong>breaking</strong>] Update testcontainers to <code>0.27</code> (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/445">#445</a>)</li> </ul> <h3>Miscellaneous Tasks</h3> <ul> <li>Expose compile feature to pass through testcontainers/ring or aws-lc-rs (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/442">#442</a>)</li> </ul> <!-- raw HTML omitted --> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/testcontainers/testcontainers-rs-modules-community/commit/8840e4ddfb59326fa4838a94fbeaee99999eb99c"><code>8840e4d</code></a> chore: release v0.15.0 (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/issues/446">#446</a>)</li> <li><a href="https://github.com/testcontainers/testcontainers-rs-modules-community/commit/59cc33f008bfa10e2cb6aef04413e3a807eecb61"><code>59cc33f</code></a> feat!: update testcontainers to <code>0.27</code> (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/issues/445">#445</a>)</li> <li><a href="https://github.com/testcontainers/testcontainers-rs-modules-community/commit/b0d7a17be741e28bc5a0fc39952992125539e653"><code>b0d7a17</code></a> feat: add RustFS module (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/issues/444">#444</a>)</li> <li><a 
href="https://github.com/testcontainers/testcontainers-rs-modules-community/commit/893ea7f4bc9a9434e5f918ea585dd92f97860bce"><code>893ea7f</code></a> chore(deps): expose compile feature to pass through testcontainers/ring or aw...</li> <li><a href="https://github.com/testcontainers/testcontainers-rs-modules-community/commit/331abcc6e61d9d76e5f8e6ec91566ce874d8fc32"><code>331abcc</code></a> fix: ready condition in ClickHouse (<a href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/issues/441">#441</a>)</li> <li>See full diff in <a href="https://github.com/testcontainers/testcontainers-rs-modules-community/compare/v0.14.0...v0.15.0">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=testcontainers-modules&package-manager=cargo&previous-version=0.14.0&new-version=0.15.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-02-21 22:49:05 +00:00
dependabot[bot]	043f908b60	chore(deps): bump the all-other-cargo-deps group with 6 updates (#20470 ) Bumps the all-other-cargo-deps group with 6 updates: \| Package \| From \| To \| \| --- \| --- \| --- \| \| [async-compression](https://github.com/Nullus157/async-compression) \| `0.4.39` \| `0.4.40` \| \| [clap](https://github.com/clap-rs/clap) \| `4.5.59` \| `4.5.60` \| \| [wasm-bindgen-test](https://github.com/wasm-bindgen/wasm-bindgen) \| `0.3.58` \| `0.3.61` \| \| [aws-credential-types](https://github.com/smithy-lang/smithy-rs) \| `1.2.12` \| `1.2.13` \| \| [tonic](https://github.com/hyperium/tonic) \| `0.14.4` \| `0.14.5` \| \| [syn](https://github.com/dtolnay/syn) \| `2.0.116` \| `2.0.117` \| Updates `async-compression` from 0.4.39 to 0.4.40 <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/Nullus157/async-compression/commit/9d848a02f13f3a56542e4123be8947a8da06097e"><code>9d848a0</code></a> chore: release (<a href="https://redirect.github.com/Nullus157/async-compression/issues/452">#452</a>)</li> <li><a href="https://github.com/Nullus157/async-compression/commit/9df508b037dafb9a2d80bfd60fcd6679891abef1"><code>9df508b</code></a> Fix update of bytes read in the encoder state machine. (<a href="https://redirect.github.com/Nullus157/async-compression/issues/456">#456</a>)</li> <li><a href="https://github.com/Nullus157/async-compression/commit/0370b470db4dbe8f92a178320438e3094495a99a"><code>0370b47</code></a> Stop consuming input on errors in codecs. 
(<a href="https://redirect.github.com/Nullus157/async-compression/issues/451">#451</a>)</li> <li><a href="https://github.com/Nullus157/async-compression/commit/9a4b0961f988cdc2b70dae0f4310046c7fedc307"><code>9a4b096</code></a> chore(deps): update rand requirement from 0.9 to 0.10 (<a href="https://redirect.github.com/Nullus157/async-compression/issues/449">#449</a>)</li> <li>See full diff in <a href="https://github.com/Nullus157/async-compression/compare/async-compression-v0.4.39...async-compression-v0.4.40">compare view</a></li> </ul> </details> <br /> Updates `clap` from 4.5.59 to 4.5.60 <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/clap-rs/clap/releases">clap's releases</a>.</em></p> <blockquote> <h2>v4.5.60</h2> <h2>[4.5.60] - 2026-02-19</h2> <h3>Fixes</h3> <ul> <li><em>(help)</em> Quote empty default values, possible values</li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/clap-rs/clap/blob/master/CHANGELOG.md">clap's changelog</a>.</em></p> <blockquote> <h2>[4.5.60] - 2026-02-19</h2> <h3>Fixes</h3> <ul> <li><em>(help)</em> Quote empty default values, possible values</li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/clap-rs/clap/commit/33d24d844b11c0e926ae132e1af338ff070bdf4a"><code>33d24d8</code></a> chore: Release</li> <li><a href="https://github.com/clap-rs/clap/commit/9332409f4a6c1d5c22064e839ec8e9bc040f3be7"><code>9332409</code></a> docs: Update changelog</li> <li><a href="https://github.com/clap-rs/clap/commit/b7adce5a17089596eecb2af6985e6503f2ffcd38"><code>b7adce5</code></a> Merge pull request <a href="https://redirect.github.com/clap-rs/clap/issues/6166">#6166</a> from fabalchemy/fix-dynamic-powershell-completion</li> <li><a href="https://github.com/clap-rs/clap/commit/009bba44ec3d182028ec3a72f5b6f3e507827768"><code>009bba4</code></a> fix(clap_complete): Improve powershell 
registration</li> <li><a href="https://github.com/clap-rs/clap/commit/d89d57dfb4bdd18930a40c6d7f4fadb23ee9c5b3"><code>d89d57d</code></a> chore: Release</li> <li><a href="https://github.com/clap-rs/clap/commit/f18b67ec3d4ce6ac1acf115adaab2f16ab2ed3c7"><code>f18b67e</code></a> docs: Update changelog</li> <li><a href="https://github.com/clap-rs/clap/commit/9d218eb418526143c9110f734f78a608b8cf6440"><code>9d218eb</code></a> Merge pull request <a href="https://redirect.github.com/clap-rs/clap/issues/6165">#6165</a> from epage/shirt</li> <li><a href="https://github.com/clap-rs/clap/commit/126440ca846613671e1dac98198b2ceb17dab2b0"><code>126440c</code></a> fix(help): Correctly calculate padding for short-only args</li> <li><a href="https://github.com/clap-rs/clap/commit/9e3c05ef3800a3e638b8224a7881a81517a4f4db"><code>9e3c05e</code></a> test(help): Show panic with short, valueless arg</li> <li><a href="https://github.com/clap-rs/clap/commit/c9898d0fece98d8520d3dd954cf457b685b3308f"><code>c9898d0</code></a> test(help): Verify short with value</li> <li>Additional commits viewable in <a href="https://github.com/clap-rs/clap/compare/clap_complete-v4.5.59...clap_complete-v4.5.60">compare view</a></li> </ul> </details> <br /> Updates `wasm-bindgen-test` from 0.3.58 to 0.3.61 <details> <summary>Commits</summary> <ul> <li>See full diff in <a href="https://github.com/wasm-bindgen/wasm-bindgen/commits">compare view</a></li> </ul> </details> <br /> Updates `aws-credential-types` from 1.2.12 to 1.2.13 <details> <summary>Commits</summary> <ul> <li>See full diff in <a href="https://github.com/smithy-lang/smithy-rs/commits">compare view</a></li> </ul> </details> <br /> Updates `tonic` from 0.14.4 to 0.14.5 <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/hyperium/tonic/releases">tonic's releases</a>.</em></p> <blockquote> <h2>v0.14.5</h2> <h2>What's Changed</h2> <ul> <li>Add max connections setting</li> </ul> <p><strong>Full Changelog</strong>: <a 
href="https://github.com/hyperium/tonic/compare/v0.14.4...v0.14.5">https://github.com/hyperium/tonic/compare/v0.14.4...v0.14.5</a></p> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/hyperium/tonic/commit/3f7caf3171393734ef19e12d010bd9c945c9e242"><code>3f7caf3</code></a> chore: prepare v0.14.5 release (<a href="https://redirect.github.com/hyperium/tonic/issues/2516">#2516</a>)</li> <li><a href="https://github.com/hyperium/tonic/commit/3f56644955162b344ce4a2641823776574ae98e4"><code>3f56644</code></a> grpc(chore): add missing copyright notices (<a href="https://redirect.github.com/hyperium/tonic/issues/2513">#2513</a>)</li> <li><a href="https://github.com/hyperium/tonic/commit/1769c91a96f054416e0d11c84fcc26284262dda2"><code>1769c91</code></a> feat(xds): implement xDS subscription worker (<a href="https://redirect.github.com/hyperium/tonic/issues/2478">#2478</a>)</li> <li><a href="https://github.com/hyperium/tonic/commit/56f8c6db4718c32e8cb1732438b87c85a3a8c1f6"><code>56f8c6d</code></a> feat(grpc): Add TCP listener API in the Runtime trait + tests for server cred...</li> <li><a href="https://github.com/hyperium/tonic/commit/149f3668f0514bd79f12524778ca76eb6341a3f5"><code>149f366</code></a> feat(grpc) Add channel credentials API + Insecure credentials (<a href="https://redirect.github.com/hyperium/tonic/issues/2495">#2495</a>)</li> <li>See full diff in <a href="https://github.com/hyperium/tonic/compare/v0.14.4...v0.14.5">compare view</a></li> </ul> </details> <br /> Updates `syn` from 2.0.116 to 2.0.117 <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/dtolnay/syn/releases">syn's releases</a>.</em></p> <blockquote> <h2>2.0.117</h2> <ul> <li>Fix parsing of <code>self::</code> pattern in first function argument (<a href="https://redirect.github.com/dtolnay/syn/issues/1970">#1970</a>)</li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a 
href="https://github.com/dtolnay/syn/commit/7bcb37cdb3399977658c8b52d2441d37e42e48f2"><code>7bcb37c</code></a> Release 2.0.117</li> <li><a href="https://github.com/dtolnay/syn/commit/9c6e7d3b8df7b30909d60395f88a6ca07688e1c1"><code>9c6e7d3</code></a> Merge pull request <a href="https://redirect.github.com/dtolnay/syn/issues/1970">#1970</a> from dtolnay/receiver</li> <li><a href="https://github.com/dtolnay/syn/commit/019a84847eded0cdb1f7856e0752ba618155cfc9"><code>019a848</code></a> Fix self:: pattern in first function argument</li> <li><a href="https://github.com/dtolnay/syn/commit/23f54f3cf61ddedd5daea4f347eca2d4b84c8abb"><code>23f54f3</code></a> Update test suite to nightly-2026-02-18</li> <li><a href="https://github.com/dtolnay/syn/commit/b99b9a627c46580343398472e7b08a131357a994"><code>b99b9a6</code></a> Unpin CI miri toolchain</li> <li>See full diff in <a href="https://github.com/dtolnay/syn/compare/2.0.116...2.0.117">compare view</a></li> </ul> </details> <br /> Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-02-21 22:48:36 +00:00
dependabot[bot]	626bc01b04	chore(deps): bump astral-sh/setup-uv from 6.1.0 to 7.3.0 (#20468 ) Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from 6.1.0 to 7.3.0. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/astral-sh/setup-uv/releases">astral-sh/setup-uv's releases</a>.</em></p> <blockquote> <h2>v7.3.0 🌈 New features and bug fixes for activate-environment</h2> <h2>Changes</h2> <p>This release contains a few bug fixes and a new feature for the activate-environment functionality.</p> <h2>🐛 Bug fixes</h2> <ul> <li>fix: warn instead of error when no python to cache <a href="https://github.com/eifinger"><code>@eifinger</code></a> (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/762">#762</a>)</li> <li>fix: use --clear to create venv <a href="https://github.com/eifinger"><code>@eifinger</code></a> (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/761">#761</a>)</li> </ul> <h2>🚀 Enhancements</h2> <ul> <li>feat: add venv-path input for activate-environment <a href="https://github.com/eifinger"><code>@eifinger</code></a> (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/746">#746</a>)</li> </ul> <h2>🧰 Maintenance</h2> <ul> <li>chore: update known checksums for 0.10.0 @<a href="https://github.com/apps/github-actions">github-actions[bot]</a> (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/759">#759</a>)</li> <li>refactor: tilde-expansion tests as unittests and no self-hosted tests <a href="https://github.com/eifinger"><code>@eifinger</code></a> (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/760">#760</a>)</li> <li>chore: update known checksums for 0.9.30 @<a href="https://github.com/apps/github-actions">github-actions[bot]</a> (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/756">#756</a>)</li> <li>chore: update known checksums for 0.9.29 @<a href="https://github.com/apps/github-actions">github-actions[bot]</a> (<a 
href="https://redirect.github.com/astral-sh/setup-uv/issues/748">#748</a>)</li> </ul> <h2>📚 Documentation</h2> <ul> <li>Fix punctuation <a href="https://github.com/pm-dev563"><code>@pm-dev563</code></a> (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/747">#747</a>)</li> </ul> <h2>⬆️ Dependency updates</h2> <ul> <li>Bump typesafegithub/github-actions-typing from 2.2.1 to 2.2.2 @<a href="https://github.com/apps/dependabot">dependabot[bot]</a> (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/753">#753</a>)</li> <li>Bump peter-evans/create-pull-request from 8.0.0 to 8.1.0 @<a href="https://github.com/apps/dependabot">dependabot[bot]</a> (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/751">#751</a>)</li> <li>Bump actions/checkout from 6.0.1 to 6.0.2 @<a href="https://github.com/apps/dependabot">dependabot[bot]</a> (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/740">#740</a>)</li> <li>Bump release-drafter/release-drafter from 6.1.0 to 6.2.0 @<a href="https://github.com/apps/dependabot">dependabot[bot]</a> (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/743">#743</a>)</li> <li>Bump eifinger/actionlint-action from 1.9.3 to 1.10.0 @<a href="https://github.com/apps/dependabot">dependabot[bot]</a> (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/731">#731</a>)</li> <li>Bump actions/setup-node from 6.1.0 to 6.2.0 @<a href="https://github.com/apps/dependabot">dependabot[bot]</a> (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/738">#738</a>)</li> </ul> <h2>v7.2.0 🌈 add outputs python-version and python-cache-hit</h2> <h2>Changes</h2> <p>Among some minor typo fixes and quality of life features for developers of actions the main feature of this release are new outputs:</p> <ul> <li><strong>python-version:</strong> The Python version that was set (same content as existing <code>UV_PYTHON</code>)</li> <li><strong>python-cache-hit:</strong> A boolean value to indicate the 
Python cache entry was found</li> </ul> <p>While implementing this it became clear, that it is easier to handle the Python binaries in a separate cache entry. The added benefit for users is that the "normal" cache containing the dependencies can be used in all runs no matter if these cache the Python binaries or not.</p> <blockquote> <p>[!NOTE]<br /> This release will invalidate caches that contain the Python binaries. This happens a single time.</p> </blockquote> <h2>🐛 Bug fixes</h2> <ul> <li>chore: remove stray space from UV_PYTHON_INSTALL_DIR message <a href="https://github.com/akx"><code>@akx</code></a> (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/720">#720</a>)</li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/astral-sh/setup-uv/commit/eac588ad8def6316056a12d4907a9d4d84ff7a3b"><code>eac588a</code></a> Bump typesafegithub/github-actions-typing from 2.2.1 to 2.2.2 (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/753">#753</a>)</li> <li><a href="https://github.com/astral-sh/setup-uv/commit/a97c6cbe9c11a3fc620e0f506b2967ef4fe74ebb"><code>a97c6cb</code></a> Bump peter-evans/create-pull-request from 8.0.0 to 8.1.0 (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/751">#751</a>)</li> <li><a href="https://github.com/astral-sh/setup-uv/commit/02182fa02a198f2423c87ba9a41982b2efbaa3ef"><code>02182fa</code></a> fix: warn instead of error when no python to cache (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/762">#762</a>)</li> <li><a href="https://github.com/astral-sh/setup-uv/commit/a3b3eaea92d7cf978795e7ae0a996f861347b70b"><code>a3b3eae</code></a> chore: update known checksums for 0.10.0 (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/759">#759</a>)</li> <li><a href="https://github.com/astral-sh/setup-uv/commit/78cebeceac116b9740b3fb83de1d99c68aa4ced9"><code>78cebec</code></a> fix: 
use --clear to create venv (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/761">#761</a>)</li> <li><a href="https://github.com/astral-sh/setup-uv/commit/b6b8e2cd6a1bad11205c4c74af16307cdbecd194"><code>b6b8e2c</code></a> refactor: tilde-expansion tests as unittests and no self-hosted tests (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/760">#760</a>)</li> <li><a href="https://github.com/astral-sh/setup-uv/commit/e31bec8546a22248f075a182e7e60c534bffa057"><code>e31bec8</code></a> chore: update known checksums for 0.9.30 (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/756">#756</a>)</li> <li><a href="https://github.com/astral-sh/setup-uv/commit/db2b65ebaeba7fdae1dfc2a646812fa8ebccefe2"><code>db2b65e</code></a> Bump actions/checkout from 6.0.1 to 6.0.2 (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/740">#740</a>)</li> <li><a href="https://github.com/astral-sh/setup-uv/commit/3511ff7054b4bdbf897f4410d573261859a8eeb2"><code>3511ff7</code></a> feat: add venv-path input for activate-environment (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/746">#746</a>)</li> <li><a href="https://github.com/astral-sh/setup-uv/commit/99b0f0474b8c709992d2d82e9cfa8745d4715d14"><code>99b0f04</code></a> Fix punctuation (<a href="https://redirect.github.com/astral-sh/setup-uv/issues/747">#747</a>)</li> <li>Additional commits viewable in <a href="https://github.com/astral-sh/setup-uv/compare/f0ec1fc3b38f5e7cd731bb6ce540c5af426746bb...eac588ad8def6316056a12d4907a9d4d84ff7a3b">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=astral-sh/setup-uv&package-manager=github_actions&previous-version=6.1.0&new-version=7.3.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as 
you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-02-21 22:40:06 +00:00
dependabot[bot]	d2c5666f5a	chore(deps): bump taiki-e/install-action from 2.68.0 to 2.68.6 (#20467 ) Bumps [taiki-e/install-action](https://github.com/taiki-e/install-action) from 2.68.0 to 2.68.6. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/taiki-e/install-action/releases">taiki-e/install-action's releases</a>.</em></p> <blockquote> <h2>2.68.6</h2> <ul> <li>Update <code>wasm-bindgen@latest</code> to 0.2.110.</li> </ul> <h2>2.68.5</h2> <ul> <li>Update <code>wasm-bindgen@latest</code> to 0.2.109.</li> </ul> <h2>2.68.4</h2> <ul> <li>Update <code>cargo-nextest@latest</code> to 0.9.128.</li> </ul> <h2>2.68.3</h2> <ul> <li> <p>Update <code>mise@latest</code> to 2026.2.17.</p> </li> <li> <p>Update <code>cargo-tarpaulin@latest</code> to 0.35.2.</p> </li> <li> <p>Update <code>syft@latest</code> to 1.42.1.</p> </li> </ul> <h2>2.68.2</h2> <ul> <li> <p>Update <code>uv@latest</code> to 0.10.4.</p> </li> <li> <p>Update <code>tombi@latest</code> to 0.7.31.</p> </li> <li> <p>Update <code>rclone@latest</code> to 1.73.1.</p> </li> </ul> <h2>2.68.1</h2> <ul> <li> <p>Update <code>mise@latest</code> to 2026.2.15.</p> </li> <li> <p>Update <code>tombi@latest</code> to 0.7.30.</p> </li> <li> <p>Update <code>knope@latest</code> to 0.22.3.</p> </li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/taiki-e/install-action/blob/main/CHANGELOG.md">taiki-e/install-action's changelog</a>.</em></p> <blockquote> <h1>Changelog</h1> <p>All notable changes to this project will be documented in this file.</p> <p>This project adheres to <a href="https://semver.org">Semantic Versioning</a>.</p> <!-- raw HTML omitted --> <h2>[Unreleased]</h2> <ul> <li>Update <code>wasm-bindgen@latest</code> to 0.2.111.</li> </ul> <h2>[2.68.6] - 2026-02-21</h2> <ul> <li>Update <code>wasm-bindgen@latest</code> to 0.2.110.</li> </ul> <h2>[2.68.5] - 2026-02-20</h2> <ul> <li>Update 
<code>wasm-bindgen@latest</code> to 0.2.109.</li> </ul> <h2>[2.68.4] - 2026-02-20</h2> <ul> <li>Update <code>cargo-nextest@latest</code> to 0.9.128.</li> </ul> <h2>[2.68.3] - 2026-02-19</h2> <ul> <li> <p>Update <code>mise@latest</code> to 2026.2.17.</p> </li> <li> <p>Update <code>cargo-tarpaulin@latest</code> to 0.35.2.</p> </li> <li> <p>Update <code>syft@latest</code> to 1.42.1.</p> </li> </ul> <h2>[2.68.2] - 2026-02-18</h2> <ul> <li> <p>Update <code>uv@latest</code> to 0.10.4.</p> </li> <li> <p>Update <code>tombi@latest</code> to 0.7.31.</p> </li> <li> <p>Update <code>rclone@latest</code> to 1.73.1.</p> </li> </ul> <h2>[2.68.1] - 2026-02-17</h2> <ul> <li> <p>Update <code>mise@latest</code> to 2026.2.15.</p> </li> <li> <p>Update <code>tombi@latest</code> to 0.7.30.</p> </li> <li> <p>Update <code>knope@latest</code> to 0.22.3.</p> </li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/taiki-e/install-action/commit/470679bc3a1580072dac4e67535d1aa3a3dcdf51"><code>470679b</code></a> Release 2.68.6</li> <li><a href="https://github.com/taiki-e/install-action/commit/6d8a751fa8ca34ab6f9c3fd87eea05661fa2196d"><code>6d8a751</code></a> Update <code>wasm-bindgen@latest</code> to 0.2.110</li> <li><a href="https://github.com/taiki-e/install-action/commit/71b48393496777ee11188c07a34d48b048a985cd"><code>71b4839</code></a> Release 2.68.5</li> <li><a href="https://github.com/taiki-e/install-action/commit/4ca0169380867518b6c0cb49cb63c9646ac66e21"><code>4ca0169</code></a> Update <code>wasm-bindgen@latest</code> to 0.2.109</li> <li><a href="https://github.com/taiki-e/install-action/commit/2723513a70062521fb56e5df87a04967751efd2f"><code>2723513</code></a> Release 2.68.4</li> <li><a href="https://github.com/taiki-e/install-action/commit/564854d94ec8d55b29e46a990a0bb8a1edc78e71"><code>564854d</code></a> Update <code>cargo-nextest@latest</code> to 0.9.128</li> <li><a 
href="https://github.com/taiki-e/install-action/commit/1cf3de8de323df92fe08c793e53eaef58799aec4"><code>1cf3de8</code></a> Release 2.68.3</li> <li><a href="https://github.com/taiki-e/install-action/commit/ef14f86a60d221f1fe25998845372fdf90cdd7d4"><code>ef14f86</code></a> Update changelog</li> <li><a href="https://github.com/taiki-e/install-action/commit/d7329c5811e2d509a381c912e9bd5b235cec5fdf"><code>d7329c5</code></a> Update <code>mise@latest</code> to 2026.2.17</li> <li><a href="https://github.com/taiki-e/install-action/commit/bc11002a6517dd702174597bd0a8e6350d2a7211"><code>bc11002</code></a> Update <code>cargo-tarpaulin@latest</code> to 0.35.2</li> <li>Additional commits viewable in <a href="https://github.com/taiki-e/install-action/compare/f8d25fb8a2df08dcd3cead89780d572767b8655f...470679bc3a1580072dac4e67535d1aa3a3dcdf51">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=taiki-e/install-action&package-manager=github_actions&previous-version=2.68.0&new-version=2.68.6)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-02-21 22:38:21 +00:00
Oleks V	d03601547a	chore: group minor dependencies into single PR (#20457 ) ## Which issue does this PR close? - Closes #. ## Rationale for this change - Reduce Dependabot PR noise without reducing coverage: grouping most minor and patch Cargo updates into a single PR keeps routine churn manageable while still ensuring updates are applied regularly. - Keep riskier updates isolated: major version bumps can include breaking changes, so we intentionally do not group major updates. This preserves one PR per crate for majors, simplifying review, CI triage, and rollback. - Preserve existing special handling for Arrow/Parquet: - Arrow/Parquet updates are higher impact and often coordinated, so we keep their minor/patch updates grouped together for consistency. - Arrow/Parquet major bumps are handled manually (and ignored by Dependabot) to avoid surprise large-scale breakage. - Ensure `object_store` and `sqlparser` remain easy to diagnose: these dependencies can have outsized downstream impact in DataFusion. Excluding them from the catch-all group ensures their updates land as individual PRs, making it easier to attribute regressions and bisect failures. - Maintain targeted grouping where it’s beneficial: protocol-related crates (`prost`, `pbjson`) are commonly updated together, so grouping their minor/patch updates reduces churn while keeping changes cohesive. ## What changes are included in this PR? ## Are these changes tested? ## Are there any user-facing changes?	2026-02-21 22:14:22 +00:00
Andy Grove	42dd4279de	bench: Add criterion benchmark for sort merge join (#20464 ) ## Summary - Adds a criterion micro-benchmark for SortMergeJoinExec that measures join kernel performance in isolation - Pre-sorted RecordBatches are fed directly into the join operator, avoiding sort/scan overhead - Data is constructed once and reused across iterations; only the `TestMemoryExec` wrapper is recreated per iteration ## Benchmarks Five scenarios covering the most common SMJ patterns: \| Benchmark \| Join Type \| Key Pattern \| \|-----------\|-----------\|-------------\| \| `inner_1to1` \| Inner \| 100K unique keys per side \| \| `inner_1to10` \| Inner \| 10K keys, ~10 rows per key \| \| `left_1to1_unmatched` \| Left \| ~5% unmatched on left side \| \| `left_semi_1to10` \| Left Semi \| 10K keys \| \| `left_anti_partial` \| Left Anti \| Partial key overlap \| ## Usage ```bash cargo bench -p datafusion-physical-plan --features test_utils --bench sort_merge_join ``` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-21 18:32:33 +00:00
Paul J. Davis	0d63ced04a	Implement FFI table provider factory (#20326 ) > ## Which issue does this PR close? > * Closes [expose TableProviderFactory via FFI #17942](https://github.com/apache/datafusion/issues/17942) > This PR is re-opening PR #17994 and updating it to match the current FFI approach (i.e., I made it look like the FFI_TableProvider in various places). > ## Rationale for this change > Expose `TableProviderFactory` via FFI to enable external languages (e.g., Python) to implement custom table provider factories and extend DataFusion with new data source types. > > ## What changes are included in this PR? > * Added `datafusion/ffi/src/table_provider_factory.rs` with: > > * `FFI_TableProviderFactory`: Stable C ABI struct with function pointers for `create`, `clone`, `release`, and `version` > * `ForeignTableProviderFactory`: Wrapper implementing `TableProviderFactory` trait > > ## Are these changes tested? > Yes > I've also added the integration tests as requested in the original PR. > ## Are there any user-facing changes? > Yes - new FFI API that enables custom `TableProviderFactory` implementations in foreign languages. This is an additive change with no breaking changes to existing APIs. Also, I'd like to thank @Weijun-H for the initial version of this PR as it simplified getting up to speed on the serialization logic that I hadn't encountered yet. --------- Co-authored-by: Weijun-H <huangweijun1001@gmail.com>	2026-02-21 12:36:30 +00:00
Liang-Chi Hsieh	1736fd2a40	refactor: Extract sort-merge join filter logic into separate module (#19614 ) Refactored the sort-merge join implementation to improve code organization by extracting all filter-related logic into a dedicated filter.rs module. Changes: - Created new filter.rs module (~576 lines) containing: - Filter metadata tracking (FilterMetadata struct) - Deferred filtering decision logic (needs_deferred_filtering) - Filter mask correction for different join types (get_corrected_filter_mask) - Filter application with null-joined row handling (filter_record_batch_by_join_type) - Helper functions for filter column extraction and batch filtering - Updated stream.rs: - Removed ~450 lines of filter-specific code - Now delegates to filter module functions - Simplified main join logic to focus on stream processing - Updated tests.rs: - Updated imports to use new filter module - Changed test code to use FilterMetadata struct - All 47 sort-merge join tests passing 🤖 Generated with [Claude Code](https://claude.com/claude-code) ## Which issue does this PR close? - Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these changes tested? ## Are there any user-facing changes? Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-21 02:10:36 +00:00
Kazantsev Maksim	fc98d5c282	feat: Implement Spark `bitmap_bucket_number` function (#20288 ) ## Which issue does this PR close? N/A ## Rationale for this change Add new function: https://spark.apache.org/docs/latest/api/sql/index.html#bitmap_bucket_number ## What changes are included in this PR? - Implementation - Unit Tests - SLT tests ## Are these changes tested? Yes, tests added as part of this PR. ## Are there any user-facing changes? No, this is a new function. --------- Co-authored-by: Kazantsev Maksim <mn.kazantsev@gmail.com>	2026-02-21 02:08:44 +00:00
Neil Conway	7f99947390	chore: Cleanup "!is_valid(i)" -> "is_null(i)" (#20453 ) ## Which issue does this PR close? N/A ## Rationale for this change This makes the code easier to read; per suggestion from @Jefffrey in code review for a different change. ## What changes are included in this PR? ## Are these changes tested? Yes. ## Are there any user-facing changes? No.	2026-02-21 02:04:39 +00:00
Eren Avsarogullari	a936d0de95	test: Extend Spark Array functions: `array_repeat`, `shuffle` and `slice` test coverage (#20420 ) ## Which issue does this PR close? - Closes #20419. ## Rationale for this change This PR adds new positive test cases for the `datafusion-spark` array functions `array_repeat`, `shuffle`, and `slice`, covering the following use cases: ``` - nested function execution, - different datatypes such as timestamp, - casting before function execution ``` It also makes a minor addition to the contributor-guide testing documentation. ## What changes are included in this PR? New positive test cases for the `datafusion-spark` array functions `array_repeat`, `shuffle`, and `slice`. ## Are these changes tested? Yes, this PR adds new positive test cases. ## Are there any user-facing changes? No	2026-02-20 18:39:37 +00:00
Yu-Chuan Hung	0f7a405b8c	feat: support Spark-compatible `json_tuple` function (#20412 ) ## Which issue does this PR close? - Part of #15914 - Related comet issue: https://github.com/apache/datafusion-comet/issues/3160 ## Rationale for this change - Apache Spark's `json_tuple` extracts top-level fields from a JSON string. - This function is used in Spark SQL and needed for DataFusion-Comet compatibility. - Reference: https://spark.apache.org/docs/latest/api/sql/index.html#json_tuple ## What changes are included in this PR? - Add Spark-compatible `json_tuple` function in `datafusion-spark` crate - Function signature: `json_tuple(json_string, key1, key2, ...) -> Struct<c0: Utf8, c1: Utf8, ...>` - `json_string`: The JSON string to extract fields from - `key1, key2, ...`: Top-level field names to extract - Returns a Struct because DataFusion ScalarUDFs return one value per row; caller (Comet) destructures the fields ### Examples ```sql SELECT json_tuple('{"f1":"value1","f2":"value2","f3":3}', 'f1', 'f2', 'f3'); -- {c0: value1, c1: value2, c2: 3} SELECT json_tuple('{"f1":"value1"}', 'f1', 'f2'); -- {c0: value1, c1: NULL} SELECT json_tuple(NULL, 'f1'); -- NULL ``` ## Are these changes tested? - Unit tests: return_field_from_args shape validation and too-few-args error - sqllogictest: test_files/spark/json/json_tuple.slt, test cases derived from Spark JsonExpressionsSuite ## Are there any user-facing changes? Yes.	2026-02-20 18:38:32 +00:00
Adrian Garcia Badaracco	1ee782f783	Migrate Python usage to uv workspace (#20414 ) I was having trouble getting benchmarks to gen data. ## Summary - Replace three independent `requirements.txt` files with a uv workspace (`benchmarks`, `dev`, `docs` projects) - Single `uv.lock` lockfile for reproducible dependency resolution - Simplify `bench.sh` by removing all ad-hoc venv/pip logic in favor of `uv run` ## Test plan - [ ] `uv sync` resolves all deps from repo root - [ ] `uv run --project benchmarks python3 benchmarks/compare.py` works - [ ] `uv run --project docs sphinx-build docs/source docs/build` builds docs - [ ] Run a benchmark from `bench.sh` that uses Python (e.g., h2o data gen or compare flow) 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-20 16:29:56 +00:00

12745 Commits