Commit Graph

12745 Commits

Author SHA1 Message Date
Lía Adriana e937cadbcc [fix] Add type coercion from NULL to Interval to make date_bin more postgres compatible (#20499)
## Which issue does this PR close?

- Closes https://github.com/apache/datafusion/issues/20502

## Rationale for this change

The following query fails at planning time:

`SELECT date_bin(NULL, TIMESTAMP '2023-01-01 12:30:00', TIMESTAMP '2023-01-01 12:00:00')`

`Error: Error during planning: Failed to coerce arguments to satisfy a call to 'date_bin' function: coercion from Null, Timestamp(ns), Timestamp(ns) to the signature OneOf([....])`

## What changes are included in this PR?

Fix `date_bin(NULL, ...)` to return `NULL` instead of a planning error
by allowing `Null` to coerce to `Interval`.

## Are these changes tested?

I added a sqllogictest case to verify the query executes and returns
`NULL`.

## Are there any user-facing changes?

Yes. Previously, `date_bin(NULL, ...)` returned a planning error. It now
returns `NULL`.
2026-02-25 08:02:30 +00:00
kosiew d75fcb83e3 Fix physical expr adapter to resolve physical fields by name, not column index (#20485)
## Which issue does this PR close?

* [Comment](https://github.com/apache/datafusion/pull/20202#discussion_r2804840366) on #20202

## Rationale for this change

When adapting physical expressions across differing logical/physical
schemas, relying on `Column::index()` can be incorrect if the physical
schema column ordering differs from the logical plan (or if a `Column`
is constructed with an index that doesn’t match the current physical
schema). This can lead to looking up the wrong physical field, causing
incorrect casts, type mismatches, or runtime failures.

This change ensures the adapter always resolves the physical field using
the column **name** against the physical file schema, making expression
rewriting robust to schema reordering and avoiding subtle bugs where an
index points at an unrelated column.

## What changes are included in this PR?

* Updated `create_cast_column_expr` to resolve the physical field via
`physical_file_schema.index_of(column.name())` instead of
`column.index()`.
* Added a regression test that deliberately supplies a mismatched
`Column` index and asserts the rewriter still selects the correct
physical field by name and produces the expected `CastColumnExpr`.
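
The difference between index-based and name-based lookup can be illustrated with a minimal sketch. The `Schema` and `Column` types below are hypothetical simplifications, not DataFusion's or Arrow's actual types; only the idea of resolving by `index_of(name)` comes from the description above.

```rust
/// Hypothetical, simplified schema: an ordered list of (name, type) pairs.
struct Schema {
    fields: Vec<(&'static str, &'static str)>,
}

impl Schema {
    /// Returns the position of the field called `name`, if any.
    fn index_of(&self, name: &str) -> Option<usize> {
        self.fields.iter().position(|(n, _)| *n == name)
    }
}

/// Hypothetical column reference carrying a name and a possibly stale index.
struct Column {
    name: &'static str,
    index: usize,
}

/// Resolves the physical type by name, ignoring `col.index`.
fn physical_type_by_name(schema: &Schema, col: &Column) -> Option<&'static str> {
    schema.index_of(col.name).map(|i| schema.fields[i].1)
}

fn main() {
    // The physical file stores the columns in a different order than the logical plan.
    let physical = Schema { fields: vec![("b", "Utf8"), ("a", "Int32")] };
    // The index was assigned against the logical schema, where "a" came first.
    let col = Column { name: "a", index: 0 };

    // Index-based lookup lands on the unrelated column "b".
    assert_eq!(physical.fields[col.index].1, "Utf8");
    // Name-based lookup finds the intended field.
    assert_eq!(physical_type_by_name(&physical, &col), Some("Int32"));
}
```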

## Are these changes tested?

Yes.

* Added `test_create_cast_column_expr_uses_name_lookup_not_column_index`
which covers the scenario where physical and logical schemas have
different column orders and the provided `Column` index is incorrect.

## Are there any user-facing changes?

No direct user-facing changes.

This is an internal correctness fix that improves robustness of physical
expression adaptation when schema ordering differs between logical and
physical plans.

2026-02-25 07:52:59 +00:00
Haresh Khanna 2347306943 [Minor] Fix error messages for shrink and try_shrink (#20422)
## Which issue does this PR close?

- Closes #.

## Rationale for this change

In the following code, when we load `size` again to construct the error
message, the value we get may differ from the value that failed
`checked_sub` and caused the `fetch_update` CAS loop to exit. Instead,
the error message should use the previous value that `fetch_update`
returned.

```rust
pub fn try_shrink(&self, capacity: usize) -> Result<usize> {
    let prev = self
        .size
        .fetch_update(
            atomic::Ordering::Relaxed,
            atomic::Ordering::Relaxed,
            |prev| prev.checked_sub(capacity),
        )
        .map_err(|_| {
            let prev = self.size.load(atomic::Ordering::Relaxed);
            internal_datafusion_err!(
                "Cannot free the capacity {capacity} out of allocated size {prev}"
            )
        })?;

    self.registration.pool.shrink(self, capacity);
    Ok(prev - capacity)
}
```
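
A minimal sketch of the corrected pattern follows, under simplifying assumptions: `Reservation` is a hypothetical stand-in that models only the atomic counter, and errors are plain `String`s rather than DataFusion errors. It relies on the fact that `fetch_update` returns `Err(previous_value)` when the closure returns `None`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for a memory reservation; only the counter is modeled.
struct Reservation {
    size: AtomicUsize,
}

impl Reservation {
    /// Shrinks the reservation by `capacity` and returns the new size.
    fn try_shrink(&self, capacity: usize) -> Result<usize, String> {
        let prev = self
            .size
            .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |prev| {
                prev.checked_sub(capacity)
            })
            // `Err(prev)` carries the exact value the closure rejected,
            // so no second (racy) load is needed for the message.
            .map_err(|prev| {
                format!("Cannot free the capacity {capacity} out of allocated size {prev}")
            })?;
        Ok(prev - capacity)
    }
}

fn main() {
    let r = Reservation { size: AtomicUsize::new(10) };
    assert_eq!(r.try_shrink(4), Ok(6));
    // Shrinking by more than is allocated reports the size seen by the failed update.
    let err = r.try_shrink(100).unwrap_err();
    assert_eq!(err, "Cannot free the capacity 100 out of allocated size 6");
}
```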

## What changes are included in this PR?

## Are these changes tested?

Yes, with existing tests.

## Are there any user-facing changes?

No
2026-02-25 01:47:39 +00:00
Albert Skalt 387e20cc58 Improve HashJoinExecBuilder to save state from previous fields (#20276)
## Which issue does this PR close?

Closes #20270

Prior to this patch, a `HashJoinExecBuilder` constructed from an existing
node reset some fields of the node, e.g. dynamic filters and metrics.
This significantly reduced the usage scope of the builder.

## What changes are included in this PR?

This patch improves the implementation. A builder created from an
existing node now preserves all fields that have not been explicitly
updated. The builder also tracks a flag indicating whether it must
recompute plan properties.

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2026-02-24 22:15:34 +00:00
Neil Conway 585bbf35d3 perf: Optimize array_has_any() with scalar arg (#20385)
## Which issue does this PR close?

- Closes #20384.
- See #18181 for related context.

## Rationale for this change

When `array_has_any` is passed a scalar for either of its arguments, we
can use a much faster algorithm: rather than doing O(N*M) comparisons
for each row of the columnar arg, we can build a hash table on the
scalar argument and probe it instead.

## What changes are included in this PR?

* Add benchmark to cover the one-scalar-arg case
* Implement optimization as described above

Note that we fall back to a linear scan when the scalar arg is smaller
than a threshold (<= 8 elements), because benchmarks suggested probing a
HashSet is not profitable for very small arrays.
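
The idea can be sketched as follows. This is an illustrative simplification, not the code from the PR: rows are plain `Vec<i64>` instead of Arrow list arrays, NULL handling is ignored, and the names `has_any_with_scalar` and `LINEAR_SCAN_THRESHOLD` are made up for the example.

```rust
use std::collections::HashSet;

/// Assumed cutoff mirroring the "<= 8 elements" threshold mentioned above.
const LINEAR_SCAN_THRESHOLD: usize = 8;

/// For each row, reports whether it shares at least one element with `scalar`.
fn has_any_with_scalar(rows: &[Vec<i64>], scalar: &[i64]) -> Vec<bool> {
    if scalar.len() <= LINEAR_SCAN_THRESHOLD {
        // Small scalar: a nested linear scan avoids the hashing overhead.
        return rows
            .iter()
            .map(|row| row.iter().any(|v| scalar.contains(v)))
            .collect();
    }
    // Larger scalar: build the hash set once, then probe it for every element.
    let set: HashSet<i64> = scalar.iter().copied().collect();
    rows.iter()
        .map(|row| row.iter().any(|v| set.contains(v)))
        .collect()
}

fn main() {
    let rows = vec![vec![1, 2], vec![50, 60], vec![]];
    let small = [2, 3];
    let large: Vec<i64> = (40..60).collect();
    assert_eq!(has_any_with_scalar(&rows, &small), vec![true, false, false]);
    assert_eq!(has_any_with_scalar(&rows, &large), vec![false, true, false]);
}
```

Both branches return the same answers; only the cost per probed element differs.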

## Are these changes tested?

Yes. Tests pass and benchmarked.

## Are there any user-facing changes?

No.

---------

Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>
Co-authored-by: Jeffrey Vo <jeffrey.vo.australia@gmail.com>
2026-02-24 20:59:08 +00:00
Albert Skalt 34dad2ccee Cache PlanProperties, add fast-path for with_new_children (#19792)
- closes https://github.com/apache/datafusion/issues/19796

This patch aims to implement a fast-path for the
ExecutionPlan::with_new_children function for some plans, moving closer
to a physical plan re-use implementation and improving planning
performance. If the passed children properties are the same as in self,
we do not actually recompute self's properties (which could be costly if
projection mapping is required). Instead, we just replace the children
and re-use self's properties as-is.

To be able to compare two different properties --
ExecutionPlan::properties(...) signature is modified and now returns
`&Arc<PlanProperties>`. If `children` properties are the same in
`with_new_children` -- we clone our properties arc and then a parent
plan will consider our properties as unchanged, doing the same.
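
A rough sketch of such a fast path is shown below. The `Node` and `PlanProperties` types are hypothetical single-child stand-ins, and comparing the `Arc`s with `Arc::ptr_eq` is an assumption about how "the same properties" could be detected cheaply; the actual implementation may differ.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for cached plan properties.
#[derive(Debug)]
struct PlanProperties {
    partitions: usize,
}

/// Hypothetical single-child plan node.
struct Node {
    child: Option<Arc<Node>>,
    props: Arc<PlanProperties>,
}

impl Node {
    fn leaf(partitions: usize) -> Arc<Node> {
        Arc::new(Node { child: None, props: Arc::new(PlanProperties { partitions }) })
    }

    fn properties(&self) -> &Arc<PlanProperties> {
        &self.props
    }

    /// Replaces the child, re-using our properties when the child's are unchanged.
    fn with_new_child(&self, new_child: Arc<Node>) -> Node {
        let unchanged = self
            .child
            .as_ref()
            .is_some_and(|old| Arc::ptr_eq(old.properties(), new_child.properties()));
        let props = if unchanged {
            // Fast path: share the existing allocation, so our parent sees the same pointer.
            Arc::clone(&self.props)
        } else {
            // Slow path: recompute (trivially here) from the new child.
            Arc::new(PlanProperties { partitions: new_child.props.partitions })
        };
        Node { child: Some(new_child), props }
    }
}

fn main() {
    let leaf = Node::leaf(4);
    let parent = Node {
        child: Some(Arc::clone(&leaf)),
        props: Arc::new(PlanProperties { partitions: 4 }),
    };

    // A different child node that shares the old child's properties.
    let same_props = Arc::new(Node { child: None, props: Arc::clone(leaf.properties()) });
    let fast = parent.with_new_child(same_props);
    assert!(Arc::ptr_eq(fast.properties(), parent.properties()));

    // A child with freshly computed properties forces recomputation.
    let slow = parent.with_new_child(Node::leaf(8));
    assert!(!Arc::ptr_eq(slow.properties(), parent.properties()));
    assert_eq!(slow.properties().partitions, 8);
}
```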

- Return `&Arc<PlanProperties>` from `ExecutionPlan::properties(...)`
instead of a plain reference.
- Implement a `with_new_children` fast-path for all major plans, taken
when the children's properties have not changed.

Note: currently, `reset_plan_states` does not allow re-using a plan in
general: it is not supported for the dynamic filters and recursive
queries features, as in this case the state reset should update pointers
in the children plans.

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2026-02-24 20:58:06 +00:00
Ganesh Patil b8cebdde2a Fix incorrect regex pattern in regex_replace_posix_groups (#19827)
The `regex_replace_posix_groups` method was using the pattern `(\d*)` to
match POSIX capture group references like `\1`. However, `*` matches zero
or more digits, which caused a lone backslash `\` to incorrectly become
`${}`.

Changed to `(\d+)` which requires at least one digit, fixing the issue.
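
The effect of requiring at least one digit can be shown with a std-only sketch. This is a hand-written scanner standing in for the regex-based replacement (it does not use the `regex` crate), and the function name `replace_posix_groups` is chosen for illustration only.

```rust
/// Rewrites POSIX-style group references (`\1`) into `${1}` form.
/// A backslash is rewritten only when followed by one or more digits,
/// which corresponds to matching `(\d+)` rather than `(\d*)`.
fn replace_posix_groups(replacement: &str) -> String {
    let mut out = String::with_capacity(replacement.len());
    let mut chars = replacement.chars().peekable();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        // Collect the digits that directly follow the backslash.
        let mut digits = String::new();
        while let Some(d) = chars.peek().copied().filter(|d| d.is_ascii_digit()) {
            digits.push(d);
            chars.next();
        }
        if digits.is_empty() {
            // Lone backslash: keep it untouched instead of emitting `${}`.
            out.push('\\');
        } else {
            out.push_str("${");
            out.push_str(&digits);
            out.push('}');
        }
    }
    out
}

fn main() {
    assert_eq!(replace_posix_groups(r"\1-\2"), "${1}-${2}");
    assert_eq!(replace_posix_groups(r"a\b"), r"a\b");
    assert_eq!(replace_posix_groups("\\"), "\\");
}
```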

Added unit tests to validate correct behavior.

- Fixes #19766

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2026-02-24 20:26:06 +00:00
Adam Gutglick e80694e369 Remove recursive const check in simplify_const_expr (#20234)
## Which issue does this PR close?

- Closes #20134 .

## Rationale for this change

The check for simplifying const expressions was expensive: it repeatedly
checked the expression's children recursively.

I tried other approaches, such as pre-computing the result for all
expressions outside of the loop and using that cache during the
traversal, but found that this yielded only a 5-8% improvement while
adding complexity. This approach simplifies the code and seems to be
more performant in my benchmarks (change is relative to the current
main branch):
```
tpc-ds/q76/cs/16        time:   [27.112 µs 27.159 µs 27.214 µs]
                        change: [−13.533% −13.167% −12.801%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  4 (4.00%) high mild
  2 (2.00%) high severe

tpc-ds/q76/ws/16        time:   [26.175 µs 26.280 µs 26.394 µs]
                        change: [−14.312% −13.833% −13.346%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) low mild

tpc-ds/q76/cs/128       time:   [195.79 µs 196.17 µs 196.56 µs]
                        change: [−14.362% −14.080% −13.816%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  3 (3.00%) high mild

tpc-ds/q76/ws/128       time:   [197.08 µs 197.61 µs 198.23 µs]
                        change: [−13.531% −13.142% −12.737%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
```

## What changes are included in this PR?

1. `simplify_const_expr` now only checks itself and whether all of its
children are literals, because it assumes the order of simplification is
bottom-up.
2. Removes some code from the public API, see the last section for the
full details.
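
Why a shallow check suffices under bottom-up traversal can be seen in a toy example. The `Expr` enum and `simplify` function below are hypothetical and far simpler than DataFusion's physical expressions; they only illustrate that once children are simplified first, a node needs to inspect just its immediate children.

```rust
/// Toy expression tree (hypothetical, for illustration only).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(i64),
    Column(String),
    Add(Box<Expr>, Box<Expr>),
}

/// Simplifies bottom-up: children first, then a shallow literal check on this node.
fn simplify(expr: Expr) -> Expr {
    match expr {
        Expr::Add(l, r) => {
            let (l, r) = (simplify(*l), simplify(*r));
            // Any constant subtree has already collapsed into a literal,
            // so there is no need to walk the children recursively here.
            if let (Expr::Literal(a), Expr::Literal(b)) = (&l, &r) {
                return Expr::Literal(a + b);
            }
            Expr::Add(Box::new(l), Box::new(r))
        }
        other => other,
    }
}

fn main() {
    let lit = |v| Box::new(Expr::Literal(v));
    // (1 + 2) + 3 folds completely.
    let all_const = Expr::Add(Box::new(Expr::Add(lit(1), lit(2))), lit(3));
    assert_eq!(simplify(all_const), Expr::Literal(6));

    // a + (1 + 2) folds only the constant subtree.
    let mixed = Expr::Add(
        Box::new(Expr::Column("a".into())),
        Box::new(Expr::Add(lit(1), lit(2))),
    );
    assert_eq!(
        simplify(mixed),
        Expr::Add(Box::new(Expr::Column("a".into())), lit(3))
    );
}
```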

## Are these changes tested?

Existing test suite

## Are there any user-facing changes?

I suggest removing some of the physical expression simplification code
from the public API, which I believe reduces the maintenance burden
here. These changes also help remove code like the distinct
`simplify_const_expr` and `simplify_const_expr_with_dummy`.

1. Makes all `datafusion-physical-expr::simplifier` sub-modules (`not`
and `const_evaluator`) private, including their key functions. They are
not used externally, and being able to change their behavior seems more
valuable long term. The simplifier is also not currently an extension
point as far as I can tell, so there's no value in providing atomic
building blocks like them for now.
2. Removes `has_column_references` completely; it's trivial to
re-implement and isn't used anywhere in the codebase.

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2026-02-24 19:49:37 +00:00
Andrew Lamb fdd36d0d21 Update comments on OptimizerRule about function name matching (#20346)
## Which issue does this PR close?

- Related to  https://github.com/apache/datafusion/pull/20180



## Rationale for this change

I gave feedback to @devanshu0987
https://github.com/apache/datafusion/pull/20180/changes#r2800720037 that
it was not a good idea to check for function names in optimizer rules,
but then I realized that the rationale for this is not written down
anywhere.

## What changes are included in this PR?

Document why checking for function names in optimizer rules is not good
and offer alternatives

## Are these changes tested?

By CI

## Are there any user-facing changes?

Just docs, no functional changes
2026-02-24 19:46:32 +00:00
Raz Luvaton b16ad9badc fix: SortMergeJoin don't wait for all input before emitting (#20482)
## Which issue does this PR close?

N/A

## Rationale for this change

While experimenting with local tests and debugging a memory issue, I
noticed that `SortMergeJoinStream` waits for all input before it starts
emitting, which shouldn't be the case: we can emit early when we have
enough data.

This also causes huge memory pressure.

## What changes are included in this PR?

Trying to fix the issue, not sure yet

## Are these changes tested?

Yes

## Are there any user-facing changes?


-----


## TODO:
- [x] update docs
- [x] finish fix
2026-02-24 19:12:42 +00:00
Neil Conway db5197b742 chore: Replace matches! on fieldless enums with == (#20525)
## Which issue does this PR close?

N/A

## Rationale for this change

When comparing a value with a field-less enum that implements
`PartialEq`, `==` is simpler and more readable than `matches!`.
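
For illustration, the two forms side by side on a made-up field-less enum (`Side` is hypothetical, not a type from the codebase):

```rust
/// Hypothetical field-less enum that derives `PartialEq`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Side {
    Left,
    Right,
}

/// Pattern-matching form.
fn is_left_with_matches(side: Side) -> bool {
    matches!(side, Side::Left)
}

/// Equality form; possible because `Side` implements `PartialEq`.
fn is_left_with_eq(side: Side) -> bool {
    side == Side::Left
}

fn main() {
    for side in [Side::Left, Side::Right] {
        // Both forms always agree for a field-less enum.
        assert_eq!(is_left_with_matches(side), is_left_with_eq(side));
    }
    assert!(is_left_with_eq(Side::Left));
    assert!(!is_left_with_eq(Side::Right));
}
```

`matches!` remains the right tool when a variant carries data or when several variants are matched with `|`.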

## What changes are included in this PR?

## Are these changes tested?

Yes.

## Are there any user-facing changes?

No.
2026-02-24 15:48:06 +00:00
dependabot[bot] 932418b20c chore(deps): bump strum_macros from 0.27.2 to 0.28.0 (#20521)
Bumps [strum_macros](https://github.com/Peternator7/strum) from 0.27.2
to 0.28.0.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/Peternator7/strum/blob/master/CHANGELOG.md">strum_macros's
changelog</a>.</em></p>
<blockquote>
<h2>0.28.0</h2>
<ul>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/461">#461</a>:
Allow any kind of passthrough attributes on
<code>EnumDiscriminants</code>.</p>
<ul>
<li>Previously only list-style attributes (e.g.
<code>#[strum_discriminants(derive(...))]</code>) were supported. Now
path-only
(e.g. <code>#[strum_discriminants(non_exhaustive)]</code>) and
name/value (e.g. <code>#[strum_discriminants(doc =
&quot;foo&quot;)]</code>)
attributes are also supported.</li>
</ul>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/462">#462</a>:
Add missing <code>#[automatically_derived]</code> to generated impls not
covered by <a
href="https://redirect.github.com/Peternator7/strum/pull/444">#444</a>.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/466">#466</a>:
Bump MSRV to 1.71, required to keep up with updated <code>syn</code> and
<code>windows-sys</code> dependencies. This is a breaking change if
you're on an old version of rust.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/469">#469</a>:
Use absolute paths in generated proc macro code to avoid
potential name conflicts.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/465">#465</a>:
Upgrade <code>phf</code> dependency to v0.13.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/473">#473</a>:
Fix <code>cargo fmt</code> / <code>clippy</code> issues and add GitHub
Actions CI.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/477">#477</a>:
<code>strum::ParseError</code> now implements
<code>core::fmt::Display</code> instead
<code>std::fmt::Display</code> to make it <code>#[no_std]</code>
compatible. Note the <code>Error</code> trait wasn't available in core
until <code>1.81</code>
so <code>strum::ParseError</code> still only implements that in std.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/476">#476</a>:
<strong>Breaking Change</strong> - <code>EnumString</code> now
implements <code>From&lt;&amp;str&gt;</code>
(infallible) instead of <code>TryFrom&lt;&amp;str&gt;</code> when the
enum has a <code>#[strum(default)]</code> variant. This more accurately
reflects that parsing cannot fail in that case. If you need the old
<code>TryFrom</code> behavior, you can opt back in using
<code>parse_error_ty</code> and <code>parse_error_fn</code>:</p>
<pre lang="rust"><code>#[derive(EnumString)]
#[strum(parse_error_ty = strum::ParseError, parse_error_fn =
make_error)]
pub enum Color {
    Red,
    #[strum(default)]
    Other(String),
}
fn make_error(x: &amp;str) -&gt; strum::ParseError {
    strum::ParseError::VariantNotFound
}
</code></pre>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/431">#431</a>:
Fix bug where <code>EnumString</code> ignored the
<code>parse_err_ty</code>
attribute when the enum had a <code>#[strum(default)]</code>
variant.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/474">#474</a>:
EnumDiscriminants will now copy <code>default</code> over from the
original enum to the Discriminant enum.</p>
<pre lang="rust"><code>#[derive(Debug, Default, EnumDiscriminants)]
#[strum_discriminants(derive(Default))] // &lt;- Remove this in 0.28.
enum MyEnum {
    #[default] // &lt;- Will be the #[default] on the MyEnumDiscriminant
    #[strum_discriminants(default)] // &lt;- Remove this in 0.28
    Variant0,
    Variant1 { a: NonDefault },
}
</code></pre>
</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/Peternator7/strum/commit/7376771128834d28bb9beba5c39846cba62e71ec"><code>7376771</code></a>
Peternator7/0.28 (<a
href="https://redirect.github.com/Peternator7/strum/issues/475">#475</a>)</li>
<li><a
href="https://github.com/Peternator7/strum/commit/26e63cd964a2e364331a5dd977d589bb9f649d8c"><code>26e63cd</code></a>
Display exists in core (<a
href="https://redirect.github.com/Peternator7/strum/issues/477">#477</a>)</li>
<li><a
href="https://github.com/Peternator7/strum/commit/9334c728eedaa8a992d1388a8f4564bbccad1934"><code>9334c72</code></a>
Make TryFrom and FromStr infallible if there's a default (<a
href="https://redirect.github.com/Peternator7/strum/issues/476">#476</a>)</li>
<li><a
href="https://github.com/Peternator7/strum/commit/0ccbbf823c16e827afc263182cd55e99e3b2a52e"><code>0ccbbf8</code></a>
Honor parse_err_ty attribute when the enum has a default variant (<a
href="https://redirect.github.com/Peternator7/strum/issues/431">#431</a>)</li>
<li><a
href="https://github.com/Peternator7/strum/commit/2c9e5a9259189ce8397f2f4967060240c6bafd74"><code>2c9e5a9</code></a>
Automatically add Default implementation to EnumDiscriminant if it
exists on ...</li>
<li><a
href="https://github.com/Peternator7/strum/commit/e241243e48359b8b811b8eaccdcfa1ae87138e0d"><code>e241243</code></a>
Fix existing cargo fmt + clippy issues and add GH actions (<a
href="https://redirect.github.com/Peternator7/strum/issues/473">#473</a>)</li>
<li><a
href="https://github.com/Peternator7/strum/commit/639b67fefd20eaead1c5d2ea794e9afe70a00312"><code>639b67f</code></a>
feat: allow any kind of passthrough attributes on
<code>EnumDiscriminants</code> (<a
href="https://redirect.github.com/Peternator7/strum/issues/461">#461</a>)</li>
<li><a
href="https://github.com/Peternator7/strum/commit/0ea1e2d0fd1460e7492ea32e6b460394d9199ff8"><code>0ea1e2d</code></a>
docs: Fix typo (<a
href="https://redirect.github.com/Peternator7/strum/issues/463">#463</a>)</li>
<li><a
href="https://github.com/Peternator7/strum/commit/36c051b91086b37d531c63ccf5a49266832a846d"><code>36c051b</code></a>
Upgrade <code>phf</code> to v0.13 (<a
href="https://redirect.github.com/Peternator7/strum/issues/465">#465</a>)</li>
<li><a
href="https://github.com/Peternator7/strum/commit/9328b38617dc6f4a3bc5fdac03883d3fc766cf34"><code>9328b38</code></a>
Use absolute paths in proc macro (<a
href="https://redirect.github.com/Peternator7/strum/issues/469">#469</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/Peternator7/strum/compare/v0.27.2...v0.28.0">compare
view</a></li>
</ul>
</details>
<br />



Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-24 15:47:50 +00:00
Neil Conway e71e7a39bf chore: Cleanup code to use repeat_n in a few places (#20527)
## Which issue does this PR close?

N/A

## Rationale for this change

Using `repeat_n` is more readable and slightly faster than
`(0..n).map(|_| ...)`.
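
A small before/after sketch, assuming a toolchain where `std::iter::repeat_n` is available (it was stabilized in Rust 1.82); the helper names are invented for the example:

```rust
use std::iter::repeat_n;

/// Older style: map over a range and ignore the index.
fn padding_with_map(n: usize) -> Vec<String> {
    (0..n).map(|_| String::from("-")).collect()
}

/// With `repeat_n`: the value is cloned for all but the last item, which is moved.
fn padding_with_repeat_n(n: usize) -> Vec<String> {
    repeat_n(String::from("-"), n).collect()
}

fn main() {
    assert_eq!(padding_with_map(3), padding_with_repeat_n(3));
    assert_eq!(padding_with_repeat_n(3), vec!["-", "-", "-"]);
    assert!(padding_with_repeat_n(0).is_empty());
}
```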

## What changes are included in this PR?

## Are these changes tested?

Yes.

## Are there any user-facing changes?

No.
2026-02-24 15:46:58 +00:00
Adrian Garcia Badaracco 670dbf481c fix: prevent duplicate alias collision with user-provided __datafusion_extracted names (#20432)
## Summary
- Fixes a bug where the optimizer's `AliasGenerator` could produce alias
names that collide with `__datafusion_extracted_N` aliases, causing a
"Schema contains duplicate unqualified field name" error
- I don't expect users themselves to create these aliases, but if you
run the optimizers twice (with different `AliasGenerator` instances)
you'll hit this.
- Adds `AliasGenerator::update_min_id()` to advance the counter past
existing aliases
- Scans each plan node's expressions during `ExtractLeafExpressions`
traversal to seed the generator before any extraction occurs
- Switches to controlling the traversal which also means the
config-based short circuit more clearly skips the entire rule.
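
The seeding idea can be sketched with a toy generator. Everything below is an assumption made for illustration: the struct layout, the use of `fetch_max`, the `extracted_id` and `seed` helpers, and the choice to advance to `id + 1` are not taken from the PR, which only states that `update_min_id()` advances the counter past existing aliases.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

const PREFIX: &str = "__datafusion_extracted";

/// Toy alias generator (hypothetical layout).
struct AliasGenerator {
    next_id: AtomicUsize,
}

impl AliasGenerator {
    fn new() -> Self {
        Self { next_id: AtomicUsize::new(1) }
    }

    /// Ensures the next generated id is at least `min_id`; never moves backwards.
    fn update_min_id(&self, min_id: usize) {
        self.next_id.fetch_max(min_id, Ordering::Relaxed);
    }

    fn next(&self, prefix: &str) -> String {
        let id = self.next_id.fetch_add(1, Ordering::Relaxed);
        format!("{prefix}_{id}")
    }
}

/// Parses `N` out of `__datafusion_extracted_N`; returns `None` for other names.
fn extracted_id(alias: &str) -> Option<usize> {
    alias.strip_prefix(PREFIX)?.strip_prefix('_')?.parse().ok()
}

/// Seeds the generator so that it skips every id already present in the plan.
fn seed(generator: &AliasGenerator, existing_aliases: &[&str]) {
    for id in existing_aliases.iter().filter_map(|a| extracted_id(a)) {
        generator.update_min_id(id + 1);
    }
}

fn main() {
    let generator = AliasGenerator::new();
    // Aliases left behind by an earlier optimizer run (or written by a user).
    seed(&generator, &["__datafusion_extracted_2", "total", "__datafusion_extracted_1"]);
    // Without seeding this would have been `__datafusion_extracted_1`, a duplicate.
    assert_eq!(generator.next(PREFIX), "__datafusion_extracted_3");
    assert_eq!(generator.next(PREFIX), "__datafusion_extracted_4");
}
```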

Closes https://github.com/apache/datafusion/issues/20430

## Test plan
- [x] Unit test: `test_user_provided_extracted_alias_no_collision` in
`extract_leaf_expressions`
- [x] SLT regression test in `projection_pushdown.slt` with explicit
`__datafusion_extracted_2` alias

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 15:02:59 +00:00
mishop-15 17d770d6e5 fix: handle out of range errors in DATE_BIN instead of panicking (#20221)
## Which issue does this PR close?

Closes #20219

## Rationale for this change

The DATE_BIN function was panicking when datetime operations went out of
range instead of returning proper errors. The two specific cases were:
1. Month subtraction going out of range causing `DateTime - Months`
panic
2. `timestamp_nanos_opt()` returning None and then unwrapping

## What changes are included in this PR?

- Changed `date_bin_months_interval` and `to_utc_date_time` to return
`Result` instead of panicking
- Replaced `origin_date - Months` and `origin_date + Months` with
`checked_sub_months` and `checked_add_months`
- Replaced `.unwrap()` calls with proper `match` statements and error
handling
- Updated all callers throughout the file to handle `Result` types
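
The general pattern of replacing panicking arithmetic with checked variants that surface a `Result` can be sketched with std integers alone. This is not the PR's code: the real change uses chrono's `checked_sub_months` / `checked_add_months`, whereas the hypothetical `bin_nanos` below works on raw nanosecond values and uses `String` errors.

```rust
/// Floors `source` to a multiple of `stride` counted from `origin` (all in nanoseconds).
/// Every step that could overflow returns an error instead of panicking.
fn bin_nanos(stride: i64, source: i64, origin: i64) -> Result<i64, String> {
    if stride <= 0 {
        return Err(format!("stride must be positive, got {stride}"));
    }
    let out_of_range = || String::from("timestamp arithmetic out of range");

    // `source - origin` can overflow for values at opposite ends of the range.
    let diff = source.checked_sub(origin).ok_or_else(out_of_range)?;
    // Distance from the start of the bin; always in `0..stride`.
    let offset = diff.rem_euclid(stride);
    let floored = diff.checked_sub(offset).ok_or_else(out_of_range)?;
    origin.checked_add(floored).ok_or_else(out_of_range)
}

fn main() {
    assert_eq!(bin_nanos(10, 27, 0), Ok(20));
    // Values before the origin are floored towards negative infinity.
    assert_eq!(bin_nanos(10, -3, 0), Ok(-10));
    // Out-of-range input yields an error rather than a panic.
    assert!(bin_nanos(10, i64::MAX, i64::MIN).is_err());
}
```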

## Are these changes tested?

Tested manually with the exact queries from the issue that were
panicking:
```sql
select DATE_BIN('1637426858', TO_TIMESTAMP_MILLIS(1040292460), TIMESTAMP '1984-01-07 00:00:00');
select DATE_BIN('1637426858', TO_TIMESTAMP_MILLIS(-1040292460), TIMESTAMP '1984-01-07 00:00:00');
```

Both queries now return NULL instead of panicking. All existing unit
tests pass.

## Are there any user-facing changes?

Yes - queries with DATE_BIN that would previously panic now return NULL
when datetime operations go out of range.
2026-02-24 13:55:32 +00:00
Neil Conway 9c85ac608f perf: Fix quadratic behavior of to_array_of_size (#20459)
## Which issue does this PR close?

- Closes #20458.
- Closes #18159.

## Rationale for this change

When `to_array_of_size(n)` was called on a `List`-like object containing
a `StringViewArray` with `b` data buffers, the previous implementation
returned a list containing a `StringViewArray` with `n*b` buffers, which
results in catastrophically bad performance if `b` grows even somewhat
large.

This issue was previously noticed causing poor nested loop join
performance. #18161 adjusted the NLJ code to avoid calling
`to_array_of_size` for this reason, but didn't attempt to fix the
underlying issue in `to_array_of_size`. This PR doesn't attempt to
revert the change to the NLJ code: the special-case code added in #18161
is still slightly faster than `to_array_of_size` after this
optimization. It might be possible to address that in a future PR.

## What changes are included in this PR?
* Instead of using `repeat_n` + `concat` to merge together `n` copies of
the `StringViewArray`, we instead use `take`, which preserves the same
number of buffers as the input `StringViewArray`.
* Add a new benchmark for this situation
* Add more unit tests for `to_array_of_size`
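
The buffer-count difference between the two strategies can be modeled with a toy view array. The `ViewArray` type below is a hypothetical simplification (views are `(buffer index, offset, length)` triples over shared string buffers) and does not use the Arrow API; it only mirrors the behavior described above.

```rust
use std::sync::Arc;

/// Toy view array: each view points into one of the shared buffers.
#[derive(Clone)]
struct ViewArray {
    buffers: Vec<Arc<str>>,
    views: Vec<(usize, usize, usize)>, // (buffer index, offset, length)
}

impl ViewArray {
    fn value(&self, i: usize) -> &str {
        let (b, off, len) = self.views[i];
        &self.buffers[b][off..off + len]
    }

    /// Concatenation keeps every input's buffers, so buffer counts add up.
    fn concat(parts: &[ViewArray]) -> ViewArray {
        let mut out = ViewArray { buffers: vec![], views: vec![] };
        for p in parts {
            let base = out.buffers.len();
            out.buffers.extend(p.buffers.iter().cloned());
            out.views.extend(p.views.iter().map(|&(b, o, l)| (b + base, o, l)));
        }
        out
    }

    /// `take` selects views by index and shares the original buffers unchanged.
    fn take(&self, indices: &[usize]) -> ViewArray {
        ViewArray {
            buffers: self.buffers.clone(),
            views: indices.iter().map(|&i| self.views[i]).collect(),
        }
    }
}

fn main() {
    let arr = ViewArray {
        buffers: vec![Arc::from("hello"), Arc::from("world")],
        views: vec![(0, 0, 5), (1, 0, 5)],
    };
    let n = 3;

    // n copies via concat: n * b buffers.
    let via_concat = ViewArray::concat(&vec![arr.clone(); n]);
    assert_eq!(via_concat.buffers.len(), 6);

    // n copies via take with repeated indices: still b buffers.
    let indices: Vec<usize> = (0..n).flat_map(|_| 0..arr.views.len()).collect();
    let via_take = arr.take(&indices);
    assert_eq!(via_take.buffers.len(), 2);

    // Both results hold the same logical values.
    assert_eq!(via_take.views.len(), via_concat.views.len());
    assert_eq!(via_take.value(3), via_concat.value(3));
}
```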

## Are these changes tested?

Yes and benchmarked.

## Are there any user-facing changes?

No.

## AI usage

Iterated on the problem with Claude Code; I understand the problem and
the solution.
2026-02-24 13:53:10 +00:00
Tim Saucer a9c090141d Add support for FFI config extensions (#19469)
## Which issue does this PR close?

This addresses part of https://github.com/apache/datafusion/issues/17035

This is also a blocker for
https://github.com/apache/datafusion/issues/20450

## Rationale for this change

Currently we cannot support user defined configuration extensions via
FFI. This is because much of the infrastructure on how to add and
extract custom extensions relies on knowing concrete types of the
extensions. This is not supported in FFI. This PR adds an implementation
of configuration extensions that can be used across a FFI boundary.

## What changes are included in this PR?

- Implement `FFI_ExtensionOptions`.
- Update `ConfigOptions` to check if a `datafusion_ffi` namespace exists
when setting values
- Add unit test

## Are these changes tested?

Unit test added.

Also tested against `datafusion-python` locally. With this code I have
the following test that passes. I have created a simple python exposed
`MyConfig`:

```python
from datafusion import SessionConfig
from datafusion_ffi_example import MyConfig

def test_catalog_provider():
    config = MyConfig()
    config = SessionConfig().with_extension(config)
    config.set("my_config.baz_count", "42")
```

## Are there any user-facing changes?

New addition only.
2026-02-24 13:18:02 +00:00
kosiew 4a41587bdf Make custom_file_casts example schema nullable to allow null id values during casting (#20486)
## Which issue does this PR close?

* [Comment](https://github.com/apache/datafusion/pull/20202#discussion_r2804841561) on #20202

---

## Rationale for this change

The `custom_file_casts` example defines a *logical/table* schema that
uses `id: Int32` as the target type. In practice, casting and projection
paths in DataFusion can produce **nulls** (e.g. failed casts, missing
values, or intermediate expressions), and examples should avoid implying
that nulls are impossible when demonstrating casting behavior.

Marking the `id` field as **nullable** makes the example more realistic
and prevents confusion when users follow or adapt the example to
scenarios where nulls may appear.

---

## What changes are included in this PR?

* Update the logical/table schema in `custom_file_casts.rs` to define
`id` as **nullable** (`Field::new("id", DataType::Int32, true)`).
* Adjust the inline comment to reflect the nullable schema.

---

## Are these changes tested?

No new tests were added.

This is a documentation/example-only change that updates a schema
definition and comment. The example continues to compile and can be
exercised by running the `custom_file_casts` example as before.

---

## Are there any user-facing changes?

Yes (example behavior/expectations):

* The `custom_file_casts` example now documents `id` as nullable,
aligning the example schema with situations where cast/projection may
yield null values.
* No public APIs are changed and no breaking behavior is introduced.
2026-02-24 12:24:42 +00:00
Tim-53 0dfa542201 fix: HashJoin panic with dictionary-encoded columns in multi-key joins (#20441)
## Which issue does this PR close?

- Closes #20437


## Rationale for this change
`flatten_dictionary_array` returned only the unique values rather than
the fully expanded array when called on a `DictionaryArray`. When
building a `StructArray`, this caused a length mismatch panic.


## What changes are included in this PR?
Replaced `array.values()` with `arrow::compute::cast(array, value_type)`
in `flatten_dictionary_array`, which properly expands the dictionary
into a full length array matching the row count.
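
The length mismatch is easy to reproduce with a toy dictionary array. `DictArray` below is a hypothetical stand-in, not Arrow's `DictionaryArray`, and `expand` plays the role that casting to the value type plays in the fix.

```rust
/// Toy dictionary-encoded array: one key per row, pointing into `values`.
struct DictArray {
    keys: Vec<usize>,
    values: Vec<&'static str>,
}

impl DictArray {
    /// Number of rows.
    fn len(&self) -> usize {
        self.keys.len()
    }

    /// Only the distinct dictionary entries; NOT one entry per row.
    fn values(&self) -> &[&'static str] {
        &self.values
    }

    /// One value per row, resolved through the keys.
    fn expand(&self) -> Vec<&'static str> {
        self.keys.iter().map(|&k| self.values[k]).collect()
    }
}

fn main() {
    let dict = DictArray { keys: vec![0, 1, 0, 0], values: vec!["a", "b"] };

    // The dictionary values are shorter than the row count, hence the mismatch.
    assert_eq!(dict.values().len(), 2);
    assert_eq!(dict.len(), 4);

    // Expanding through the keys yields a full-length array.
    assert_eq!(dict.expand(), vec!["a", "b", "a", "a"]);
    assert_eq!(dict.expand().len(), dict.len());
}
```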

## Are these changes tested?

Yes, both a new unit test and a regression test were added.

## Are there any user-facing changes?

Nope

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2026-02-24 12:10:15 +00:00
dependabot[bot] 6c793694e9 chore(deps): bump the all-other-cargo-deps group with 2 updates (#20519)
Bumps the all-other-cargo-deps group with 2 updates:
[chrono](https://github.com/chronotope/chrono) and
[wasm-bindgen-test](https://github.com/wasm-bindgen/wasm-bindgen).

Updates `chrono` from 0.4.43 to 0.4.44
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/chronotope/chrono/releases">chrono's
releases</a>.</em></p>
<blockquote>
<h2>0.4.44</h2>
<h2>What's Changed</h2>
<ul>
<li>docs: match MSRV with <code>Cargo.toml</code> contents by <a
href="https://github.com/coryan"><code>@​coryan</code></a> in <a
href="https://redirect.github.com/chronotope/chrono/pull/1772">chronotope/chrono#1772</a></li>
<li>Add track_caller to non-deprecated functions by <a
href="https://github.com/svix-jplatte"><code>@​svix-jplatte</code></a>
in <a
href="https://redirect.github.com/chronotope/chrono/pull/1774">chronotope/chrono#1774</a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/chronotope/chrono/commit/c14b4599d07ef36ffa1f8a531fb0bc7eb3b42464"><code>c14b459</code></a>
Bump version to 0.4.44</li>
<li><a
href="https://github.com/chronotope/chrono/commit/ea832c5090369eefa2cb6a47d643e2f7ade7ffa7"><code>ea832c5</code></a>
Add track_caller to non-deprecated functions</li>
<li><a
href="https://github.com/chronotope/chrono/commit/cfae889a3a23507acf49b605794abba17effd2d7"><code>cfae889</code></a>
Fix panic message in to_rfc2822</li>
<li><a
href="https://github.com/chronotope/chrono/commit/f8900b5a44228a7f6282c65e8c407d3ecb6dcb7b"><code>f8900b5</code></a>
docs: match MSRV with <code>Cargo.toml</code> contents</li>
<li>See full diff in <a
href="https://github.com/chronotope/chrono/compare/v0.4.43...v0.4.44">compare
view</a></li>
</ul>
</details>
<br />

Updates `wasm-bindgen-test` from 0.3.61 to 0.3.62
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/wasm-bindgen/wasm-bindgen/commits">compare
view</a></li>
</ul>
</details>
<br />


Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's major version (unless you unignore this specific
dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's minor version (unless you unignore this specific
dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR
and stop Dependabot creating any more for the specific dependency
(unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore
conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will
remove the ignore condition of the specified dependency and ignore
conditions


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-24 11:24:24 +00:00
dependabot[bot] 4c0a6531ca chore(deps): bump taiki-e/install-action from 2.68.6 to 2.68.8 (#20518)
Bumps
[taiki-e/install-action](https://github.com/taiki-e/install-action) from
2.68.6 to 2.68.8.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/taiki-e/install-action/releases">taiki-e/install-action's
releases</a>.</em></p>
<blockquote>
<h2>2.68.8</h2>
<ul>
<li>
<p>Update <code>cargo-nextest@latest</code> to 0.9.129.</p>
</li>
<li>
<p>Update <code>mise@latest</code> to 2026.2.19.</p>
</li>
<li>
<p>Update <code>tombi@latest</code> to 0.7.32.</p>
</li>
</ul>
<h2>2.68.7</h2>
<ul>
<li>
<p>Update <code>mise@latest</code> to 2026.2.18.</p>
</li>
<li>
<p>Update <code>wasm-bindgen@latest</code> to 0.2.111.</p>
</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/taiki-e/install-action/blob/main/CHANGELOG.md">taiki-e/install-action's
changelog</a>.</em></p>
<blockquote>
<h1>Changelog</h1>
<p>All notable changes to this project will be documented in this
file.</p>
<p>This project adheres to <a href="https://semver.org">Semantic
Versioning</a>.</p>
<h2>[Unreleased]</h2>
<ul>
<li>
<p>Update <code>wasm-bindgen@latest</code> to 0.2.112.</p>
</li>
<li>
<p>Update <code>uv@latest</code> to 0.10.5.</p>
</li>
</ul>
<h2>[2.68.8] - 2026-02-23</h2>
<ul>
<li>
<p>Update <code>cargo-nextest@latest</code> to 0.9.129.</p>
</li>
<li>
<p>Update <code>mise@latest</code> to 2026.2.19.</p>
</li>
<li>
<p>Update <code>tombi@latest</code> to 0.7.32.</p>
</li>
</ul>
<h2>[2.68.7] - 2026-02-22</h2>
<ul>
<li>
<p>Update <code>mise@latest</code> to 2026.2.18.</p>
</li>
<li>
<p>Update <code>wasm-bindgen@latest</code> to 0.2.111.</p>
</li>
</ul>
<h2>[2.68.6] - 2026-02-21</h2>
<ul>
<li>Update <code>wasm-bindgen@latest</code> to 0.2.110.</li>
</ul>
<h2>[2.68.5] - 2026-02-20</h2>
<ul>
<li>Update <code>wasm-bindgen@latest</code> to 0.2.109.</li>
</ul>
<h2>[2.68.4] - 2026-02-20</h2>
<ul>
<li>Update <code>cargo-nextest@latest</code> to 0.9.128.</li>
</ul>
<h2>[2.68.3] - 2026-02-19</h2>
<ul>
<li>
<p>Update <code>mise@latest</code> to 2026.2.17.</p>
</li>
<li>
<p>Update <code>cargo-tarpaulin@latest</code> to 0.35.2.</p>
</li>
<li>
<p>Update <code>syft@latest</code> to 1.42.1.</p>
</li>
</ul>
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/taiki-e/install-action/commit/cfdb446e391c69574ebc316dfb7d7849ec12b940"><code>cfdb446</code></a>
Release 2.68.8</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/350f13bd74589d52195d1aed8e04b35b616a9c49"><code>350f13b</code></a>
Update <code>cargo-nextest@latest</code> to 0.9.129</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/8ba6eccac43cdb9aa5b83627b897b416b206be2a"><code>8ba6ecc</code></a>
Update <code>mise@latest</code> to 2026.2.19</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/cf805946ef1da29d90b652c870b7a19aae44b0f5"><code>cf80594</code></a>
Update <code>tombi@latest</code> to 0.7.32</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/f92912fad184299a31e22ad070a5059fd07d4f59"><code>f92912f</code></a>
Release 2.68.7</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/4970026aba514ced4229209c822802e1bff68b3e"><code>4970026</code></a>
Update <code>mise@latest</code> to 2026.2.18</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/6043f02f023f20fde8f9436e0d500ee1391fab70"><code>6043f02</code></a>
Update <code>wasm-bindgen@latest</code> to 0.2.111</li>
<li>See full diff in <a
href="https://github.com/taiki-e/install-action/compare/470679bc3a1580072dac4e67535d1aa3a3dcdf51...cfdb446e391c69574ebc316dfb7d7849ec12b940">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=taiki-e/install-action&package-manager=github_actions&previous-version=2.68.6&new-version=2.68.8)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-24 11:24:23 +00:00
dependabot[bot] 3aa34b33f5 chore(deps): bump strum from 0.27.2 to 0.28.0 (#20520)
Bumps [strum](https://github.com/Peternator7/strum) from 0.27.2 to
0.28.0.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/Peternator7/strum/blob/master/CHANGELOG.md">strum's
changelog</a>.</em></p>
<blockquote>
<h2>0.28.0</h2>
<ul>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/461">#461</a>:
Allow any kind of passthrough attributes on
<code>EnumDiscriminants</code>.</p>
<ul>
<li>Previously only list-style attributes (e.g.
<code>#[strum_discriminants(derive(...))]</code>) were supported. Now
path-only
(e.g. <code>#[strum_discriminants(non_exhaustive)]</code>) and
name/value (e.g. <code>#[strum_discriminants(doc =
&quot;foo&quot;)]</code>)
attributes are also supported.</li>
</ul>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/462">#462</a>:
Add missing <code>#[automatically_derived]</code> to generated impls not
covered by <a
href="https://redirect.github.com/Peternator7/strum/pull/444">#444</a>.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/466">#466</a>:
Bump MSRV to 1.71, required to keep up with updated <code>syn</code> and
<code>windows-sys</code> dependencies. This is a breaking change if
you're on an old version of rust.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/469">#469</a>:
Use absolute paths in generated proc macro code to avoid
potential name conflicts.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/465">#465</a>:
Upgrade <code>phf</code> dependency to v0.13.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/473">#473</a>:
Fix <code>cargo fmt</code> / <code>clippy</code> issues and add GitHub
Actions CI.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/477">#477</a>:
<code>strum::ParseError</code> now implements
<code>core::fmt::Display</code> instead of
<code>std::fmt::Display</code> to make it <code>#[no_std]</code>
compatible. Note the <code>Error</code> trait wasn't available in core
until <code>1.81</code>
so <code>strum::ParseError</code> still only implements that in std.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/476">#476</a>:
<strong>Breaking Change</strong> - <code>EnumString</code> now
implements <code>From&lt;&amp;str&gt;</code>
(infallible) instead of <code>TryFrom&lt;&amp;str&gt;</code> when the
enum has a <code>#[strum(default)]</code> variant. This more accurately
reflects that parsing cannot fail in that case. If you need the old
<code>TryFrom</code> behavior, you can opt back in using
<code>parse_error_ty</code> and <code>parse_error_fn</code>:</p>
<pre lang="rust"><code>#[derive(EnumString)]
#[strum(parse_error_ty = strum::ParseError, parse_error_fn =
make_error)]
pub enum Color {
    Red,
    #[strum(default)]
    Other(String),
}

fn make_error(x: &amp;str) -&gt; strum::ParseError {
    strum::ParseError::VariantNotFound
}
</code></pre>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/431">#431</a>:
Fix bug where <code>EnumString</code> ignored the
<code>parse_err_ty</code>
attribute when the enum had a <code>#[strum(default)]</code>
variant.</p>
</li>
<li>
<p><a
href="https://redirect.github.com/Peternator7/strum/pull/474">#474</a>:
EnumDiscriminants will now copy <code>default</code> over from the
original enum to the Discriminant enum.</p>
<pre lang="rust"><code>#[derive(Debug, Default, EnumDiscriminants)]
#[strum_discriminants(derive(Default))] // &lt;- Remove this in 0.28.
enum MyEnum {
    #[default] // &lt;- Will be the #[default] on the MyEnumDiscriminant
    #[strum_discriminants(default)] // &lt;- Remove this in 0.28
    Variant0,
    Variant1 { a: NonDefault },
}
</code></pre>
</li>
</ul>
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/Peternator7/strum/commit/7376771128834d28bb9beba5c39846cba62e71ec"><code>7376771</code></a>
Peternator7/0.28 (<a
href="https://redirect.github.com/Peternator7/strum/issues/475">#475</a>)</li>
<li><a
href="https://github.com/Peternator7/strum/commit/26e63cd964a2e364331a5dd977d589bb9f649d8c"><code>26e63cd</code></a>
Display exists in core (<a
href="https://redirect.github.com/Peternator7/strum/issues/477">#477</a>)</li>
<li><a
href="https://github.com/Peternator7/strum/commit/9334c728eedaa8a992d1388a8f4564bbccad1934"><code>9334c72</code></a>
Make TryFrom and FromStr infallible if there's a default (<a
href="https://redirect.github.com/Peternator7/strum/issues/476">#476</a>)</li>
<li><a
href="https://github.com/Peternator7/strum/commit/0ccbbf823c16e827afc263182cd55e99e3b2a52e"><code>0ccbbf8</code></a>
Honor parse_err_ty attribute when the enum has a default variant (<a
href="https://redirect.github.com/Peternator7/strum/issues/431">#431</a>)</li>
<li><a
href="https://github.com/Peternator7/strum/commit/2c9e5a9259189ce8397f2f4967060240c6bafd74"><code>2c9e5a9</code></a>
Automatically add Default implementation to EnumDiscriminant if it
exists on ...</li>
<li><a
href="https://github.com/Peternator7/strum/commit/e241243e48359b8b811b8eaccdcfa1ae87138e0d"><code>e241243</code></a>
Fix existing cargo fmt + clippy issues and add GH actions (<a
href="https://redirect.github.com/Peternator7/strum/issues/473">#473</a>)</li>
<li><a
href="https://github.com/Peternator7/strum/commit/639b67fefd20eaead1c5d2ea794e9afe70a00312"><code>639b67f</code></a>
feat: allow any kind of passthrough attributes on
<code>EnumDiscriminants</code> (<a
href="https://redirect.github.com/Peternator7/strum/issues/461">#461</a>)</li>
<li><a
href="https://github.com/Peternator7/strum/commit/0ea1e2d0fd1460e7492ea32e6b460394d9199ff8"><code>0ea1e2d</code></a>
docs: Fix typo (<a
href="https://redirect.github.com/Peternator7/strum/issues/463">#463</a>)</li>
<li><a
href="https://github.com/Peternator7/strum/commit/36c051b91086b37d531c63ccf5a49266832a846d"><code>36c051b</code></a>
Upgrade <code>phf</code> to v0.13 (<a
href="https://redirect.github.com/Peternator7/strum/issues/465">#465</a>)</li>
<li><a
href="https://github.com/Peternator7/strum/commit/9328b38617dc6f4a3bc5fdac03883d3fc766cf34"><code>9328b38</code></a>
Use absolute paths in proc macro (<a
href="https://redirect.github.com/Peternator7/strum/issues/469">#469</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/Peternator7/strum/compare/v0.27.2...v0.28.0">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=strum&package-manager=cargo&previous-version=0.27.2&new-version=0.28.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-24 11:23:48 +00:00
Dmitrii Blaginin 11ef486e6c Runs-on for extended CI checks (#20511)
part of https://github.com/apache/datafusion/issues/20052

## Which issue does this PR close?


example run:
https://github.com/apache/datafusion/actions/runs/22325922758

this reduced the run time from 3h to 1h. Still a lot (on my Mac it runs
in 5m!), but that's a start.

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-24 10:34:49 +00:00
Xander d59cdfe999 Fix name tracker (#19856)
## Which issue does this PR close?

- Closes #17508

## Rationale for this change

The previous implementation used UUID-based aliasing as a workaround to
prevent duplicate names for literals in Substrait plans. This approach
had several drawbacks:
- Non-deterministic plan names that made testing difficult (requiring
UUID regex filters)
- Only addressed literal naming conflicts, not the broader issue of name
deduplication
- Added unnecessary dependency on the `uuid` crate
- Didn't properly handle cases where the same qualified name could
appear with different schema representations

## What changes are included in this PR?

1. **Enhanced NameTracker**: Refactored to detect two types of conflicts:
   - Duplicate schema names: tracked via `schema_name()` to prevent
     `validate_unique_names` failures (e.g., two `Utf8(NULL)` literals)
   - Ambiguous references: tracked via `qualified_name()` to prevent
     `DFSchema::check_names` failures when a qualified field (e.g.,
     `left.Utf8(NULL)`) and an unqualified field (e.g., `Utf8(NULL)`) share the
     same column name
2. **Removed UUID dependency**: Eliminated the `uuid` crate from
`datafusion/substrait`
3. **Removed literal-specific aliasing**: The UUID-based workaround in
`project_rel.rs` is no longer needed as the improved NameTracker handles
all naming conflicts consistently
4. **Deterministic naming**: Name conflicts now use predictable
`__temp__N` suffixes instead of random UUIDs
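
A minimal sketch of the deterministic suffixing idea, assuming a single set of already-used names. `SimpleNameTracker` and `unique_name` are hypothetical names for illustration. This is not the actual `NameTracker` from `datafusion/substrait`, which, per the list above, also tracks qualified names.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified tracker: hands out unique names, appending a
/// deterministic `__temp__N` suffix when a name was already used.
#[derive(Default)]
struct SimpleNameTracker {
    seen: HashSet<String>,
}

impl SimpleNameTracker {
    fn unique_name(&mut self, name: &str) -> String {
        // First use of a name: keep it unchanged.
        if self.seen.insert(name.to_string()) {
            return name.to_string();
        }
        // Otherwise try suffixes 0, 1, 2, ... until one is free.
        let mut n = 0;
        loop {
            let candidate = format!("{}__temp__{}", name, n);
            if self.seen.insert(candidate.clone()) {
                return candidate;
            }
            n += 1;
        }
    }
}
```

Because the suffix depends only on the order in which names are requested, the same plan always produces the same names. That is why snapshot tests no longer need UUID regex filters.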

Note: This doesn't fully fix all the issues in #17508; some of them involve
special casing of `CAST`, which is not included here.

## Are these changes tested?

Yes:
- Updated snapshot tests to reflect the new deterministic naming (e.g.,
`Utf8("people")__temp__0` instead of UUID-based names)
- Modified some roundtrip tests to verify semantic equivalence (schema
matching and execution) rather than exact string matching, which is more
robust
- All existing integration tests pass with the new naming scheme

## Are there any user-facing changes?

Minimal. The generated plan names are now deterministic and more
readable (using `__temp__N` suffixes instead of UUIDs), but this is
primarily an internal representation change. The functional behavior and
query results remain unchanged.
2026-02-24 08:15:59 +00:00
Neil Conway b6d46a6382 perf: Optimize initcap() (#20352)
## Which issue does this PR close?

- Closes #20351.

## Rationale for this change

When all values in a `Utf8`/`LargeUtf8` array are ASCII, we can skip
using `GenericStringBuilder` and instead process the entire input buffer
in a single pass using byte-level operations. This also avoids
recomputing the offsets and nulls arrays. A similar optimization is
already used for lower() and upper().

Along the way, optimize `initcap_string()` for ASCII-only inputs. It
already had an ASCII-only fastpath but there was room for further
optimization, by iterating over bytes rather than characters.
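
As a rough illustration of the byte-level approach (not the code in this PR), an ASCII-only `initcap` can be written as a single pass over the input bytes. The name `initcap_ascii` and the rule that any non-alphanumeric byte starts a new word are assumptions made for this sketch.

```rust
/// ASCII-only sketch: uppercase the first letter of each word and lowercase
/// the rest, treating any non-alphanumeric byte as a word boundary.
/// Intended for ASCII input; a caller would check `input.is_ascii()` first.
fn initcap_ascii(input: &str) -> String {
    let mut out = Vec::with_capacity(input.len());
    let mut start_of_word = true;
    for &b in input.as_bytes() {
        if b.is_ascii_alphanumeric() {
            out.push(if start_of_word {
                b.to_ascii_uppercase()
            } else {
                b.to_ascii_lowercase()
            });
            start_of_word = false;
        } else {
            // Separators are copied through and reset the word state.
            out.push(b);
            start_of_word = true;
        }
    }
    // ASCII case conversion maps ASCII to ASCII, so the bytes stay valid UTF-8.
    String::from_utf8(out).expect("ASCII input stays valid UTF-8")
}
```

Since the output has the same byte length as the input, an array-level version can reuse the existing offsets and null buffer, which is the saving described above.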

## What changes are included in this PR?

* Cleanup benchmarks: we ran the scalar benchmark for different array
sizes, despite the fact that it is invariant to the array size
* Add benchmark for different string lengths
* Add benchmark for Unicode array input
* Optimize for ASCII-only inputs as described above
* Add test case for ASCII-only input that is a sliced array
* Add test case variants for `LargeStringArray`

## Are these changes tested?

Yes, by existing tests plus the new test cases.

## Are there any user-facing changes?

No.
2026-02-24 06:11:08 +00:00
Dmitrii Blaginin 7602913b0f Switch to the latest Mac OS (#20510) 2026-02-23 22:57:49 +00:00
Andrew Lamb b9328b9734 Upgrade to sqlparser 0.61.0 (#20177)
DRAFT until SQL parser is released

## Which issue does this PR close?

- part of https://github.com/apache/datafusion-sqlparser-rs/issues/2117


## Rationale for this change

Keep up to date with dependencies

I think @Samyak2 specifically would like access to the `:` field syntax

## What changes are included in this PR?
1. Update to 0.61.0
2. Update APIs 

## Are these changes tested?
Yes by existing tests

## Are there any user-facing changes?
New dependency

---------

Co-authored-by: Jeffrey Vo <jeffrey.vo.australia@gmail.com>
2026-02-23 18:49:08 +00:00
Neil Conway d303f5817f chore: Add end-to-end benchmark for array_agg, code cleanup (#20496)
## Which issue does this PR close?

- Prep work for #20465 

## Rationale for this change

- Add three queries to measure the end-to-end performance of
`array_agg()`, as prep work for optimizing its performance.

## What changes are included in this PR?

This PR also cleans up the `data_utils` benchmark code:

- Seed the RNG once and use it for all data generation. The previous
code seeded an RNG but only used it for some data, and also used the
same seed for every batch, which led to repeated data (... I assume
this was not the intent?)
- The previous code made `u64_wide` a nullable field, but passed `9.0`
for the `value_density` when generating data, which meant that no NULL
values would ever be generated. Switch to making `u64_wide`
non-nullable.
- Fix up comments, remove a clippy suppress, various other cleanups.
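
To see why re-seeding per batch repeats data, here is a small self-contained sketch. It uses a toy xorshift generator so that it needs no crates. `XorShift64`, `make_batch`, `batches_reseeded` and `batches_shared` are hypothetical names and do not come from the `data_utils` benchmark code.

```rust
/// Tiny xorshift64 generator, used here only so the sketch needs no crates.
/// The seed must be non-zero.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Generates one batch of `rows` values from the given generator.
fn make_batch(rng: &mut XorShift64, rows: usize) -> Vec<u64> {
    (0..rows).map(|_| rng.next_u64()).collect()
}

/// Re-seeding for every batch: all batches have identical contents.
fn batches_reseeded(seed: u64, n: usize, rows: usize) -> Vec<Vec<u64>> {
    (0..n)
        .map(|_| make_batch(&mut XorShift64(seed), rows))
        .collect()
}

/// Seeding once and sharing the generator: each batch continues the stream.
fn batches_shared(seed: u64, n: usize, rows: usize) -> Vec<Vec<u64>> {
    let mut rng = XorShift64(seed);
    (0..n).map(|_| make_batch(&mut rng, rows)).collect()
}
```

Both variants are reproducible for a fixed seed. Only the shared generator avoids handing the benchmark the same batch over and over.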

## Are these changes tested?

Yes.

## Are there any user-facing changes?

No.
2026-02-23 18:26:16 +00:00
Oleks V df8f818b29 chore: Avoid build fails on MinIO rate limits (#20472)
## Which issue does this PR close?

- Closes #.

## Rationale for this change
CI sometimes fails because of Docker pull rate limits:
```
thread 'test_s3_url_fallback' (11052) panicked at datafusion-cli/tests/cli_integration.rs:116:13:
Failed to start MinIO container. Ensure Docker is running and accessible: failed to pull the image 'minio/minio:RELEASE.2025-02-28T09-55-16Z', error: Docker responded with status code 500: toomanyrequests: You have reached your unauthenticated pull rate limit. https://www.docker.com/increase-rate-limit
stack backtrace:
```
Example
https://github.com/apache/datafusion/actions/runs/22262073722/job/64401977127

## What changes are included in this PR?

Ignore the tests only when the rate limit is hit.
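
One way such a check could look, as a sketch only: the helper name `is_pull_rate_limited` is hypothetical, and matching on the `toomanyrequests` marker is an assumption based on the error text quoted above, not necessarily what the PR does.

```rust
/// Hypothetical helper: decide whether a container start-up failure was
/// caused by the registry's pull rate limit rather than a real test failure.
/// Assumes the error text contains the `toomanyrequests` marker.
fn is_pull_rate_limited(error_message: &str) -> bool {
    error_message.contains("toomanyrequests")
}
```

A test would then return early (skip) when this returns `true` and still panic on any other start-up error, so genuine failures are not hidden.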

## Are these changes tested?

## Are there any user-facing changes?

2026-02-23 16:09:01 +00:00
Oleks V ed0323a2bb feat: support arrays_zip function (#20440)
## Which issue does this PR close?


- Closes #.

## Rationale for this change

Summary
- Adds a new arrays_zip scalar function that combines multiple arrays
into a single array of structs, where each struct field corresponds to
an input array
- Shorter arrays within a row are padded with NULLs to match the longest
array's length
- Compatible with Spark's arrays_zip behavior

## What changes are included in this PR?

```

  arrays_zip takes N list arrays and produces a List<Struct<c0, c1, ..., cN>> where each struct contains the elements at the same index from each input array.

  > SELECT arrays_zip([1, 2, 3], ['a', 'b', 'c']);
  [{c0: 1, c1: a}, {c0: 2, c1: b}, {c0: 3, c1: c}]

  > SELECT arrays_zip([1, 2], [3, 4, 5]);
  [{c0: 1, c1: 3}, {c0: 2, c1: 4}, {c0: NULL, c1: 5}]

  Implementation details:
  - Implemented in set_ops.rs following existing array function patterns
  - Uses MutableArrayData builders per column with row-by-row processing for efficient memory handling
  - For each row, computes the max array length, copies values from each input array, and pads shorter arrays with NULLs
  - Supports variadic arguments (2 or more arrays)
  - Handles NULL list entries, NULL elements, empty arrays, mixed types, and Null-typed arguments
  - Registered as arrays_zip with alias list_zip
  - Uses Signature::variadic_any with Volatility::Immutable
```
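
The padding rule can be sketched for a single row with two inputs. This is a plain-Rust illustration, not the `MutableArrayData`-based implementation; `zip_padded` is a hypothetical name and `None` stands in for SQL `NULL`.

```rust
/// Zip two lists element-wise, padding the shorter one with `None` (NULL)
/// so the result is as long as the longer input.
fn zip_padded<A: Clone, B: Clone>(a: &[A], b: &[B]) -> Vec<(Option<A>, Option<B>)> {
    let len = a.len().max(b.len());
    (0..len)
        // `get` returns `None` past the end of a slice, which gives the padding.
        .map(|i| (a.get(i).cloned(), b.get(i).cloned()))
        .collect()
}
```

For the second query above, `zip_padded(&[1, 2], &[3, 4, 5])` gives `(Some(1), Some(3))`, `(Some(2), Some(4))`, `(None, Some(5))`, matching the `{c0: NULL, c1: 5}` entry.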

## Are these changes tested?

## Are there any user-facing changes?

2026-02-23 16:08:36 +00:00
Andrew Lamb 89a8576171 docs: Document that adding new optimizer rules are expensive (#20348)
## Which issue does this PR close?

- Similarly to https://github.com/apache/datafusion/pull/20346

## Rationale for this change

As part of PR reviews, it seems that it is not obvious to some
contributors that there is a non-trivial cost to adding new optimizer
rules. Let's add that knowledge to the codebase as comments, so it is
less of a surprise.

## What changes are included in this PR?

Add comments
## Are these changes tested?
N/A
## Are there any user-facing changes?
No, this change only adds internal comments.

---------

Co-authored-by: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com>
2026-02-23 11:21:17 +00:00
Dmitrii Blaginin 60457d0b0a Runs-on for more actions (#20274)
Follow up on https://github.com/apache/datafusion/pull/20107: switch
more actions to the new flow

| Job | OLD | NEW | Delta |
|---|---|---|---|
| **linux build test** (from #20107) | 3m 55s | 1m 46s | -2m 09s (55%
faster) |
| **cargo test (amd64)** (from #20107) | 11m 34s | 3m 13s | -8m 21s (72%
faster) |
| **cargo check datafusion features** | 11m 18s | 6m 21s | -4m 57s (44%
faster) |
| **cargo examples (amd64)** | 9m 13s | 4m 35s | -4m 38s (50% faster) |
| **verify benchmark results (amd64)** | 11m 48s | 4m 22s | -7m 26s (63%
faster) |
| **cargo check datafusion-substrait features** | 10m 20s | 3m 56s | -6m
24s (62% faster) |
| **cargo check datafusion-proto features** | 4m 48s | 2m 25s | -2m 23s
(50% faster) |
| **cargo test datafusion-cli (amd64)** | 5m 42s | 1m 58s | -3m 44s (65%
faster) |
| **cargo test doc (amd64)** | 8m 07s | 3m 16s | -4m 51s (60% faster) |
| **cargo doc** | 5m 10s | 1m 56s | -3m 14s (63% faster) |
| **Run sqllogictest with Postgres runner** | 6m 06s | 2m 46s | -3m 20s
(55% faster) |
| **Run sqllogictest in Substrait round-trip mode** | 6m 42s | 2m 38s |
-4m 04s (61% faster) |
| **clippy** | 6m 01s | 2m 10s | -3m 51s (64% faster) |
| **check configs.md and \*\*\*_functions.md is up-to-date** | 6m 54s |
2m 12s | -4m 42s (68% faster) |
2026-02-23 10:04:37 +00:00
Filippo 7815732f0f feat(memory-tracking): implement arrow_buffer::MemoryPool for MemoryPool (#18928)
## Which issue does this PR close?

- Closes #18926

## Rationale for this change

Related to #16841. The ability to correctly account for memory usage of
arrow buffers in execution nodes is crucial to maximise resource usage
while preventing OOMs.

## What changes are included in this PR?

- An implementation of arrow_buffer::MemoryPool for DataFusion's
MemoryPool under the `arrow_buffer_pool` feature-flag

## Are these changes tested?

Yes!

## Are there any user-facing changes?

Introduced new API.
2026-02-23 06:15:09 +00:00
Andy Grove 9660c98743 perf: Use zero-copy slice instead of take kernel in sort merge join (#20463)
## Summary

Follows on from https://github.com/apache/datafusion/pull/20464 which
adds new criterion benchmarks.

- When the join indices form a contiguous ascending range (e.g.
`[3,4,5,6]`), replace the O(n) Arrow `take` kernel with O(1)
`RecordBatch::slice` (zero-copy pointer arithmetic)
- Applies to both the streamed (left) and buffered (right) sides of the
sort merge join

## Rationale

In SMJ, the streamed side cursor advances sequentially, so its indices
are almost always contiguous. The buffered side is scanned sequentially
within each key group, so its indices are also contiguous for 1:1 and
1:few joins. The `take` kernel allocates new arrays and copies data even
when a simple slice would suffice.

## Benchmark Results

Criterion micro-benchmark (100K rows, pre-sorted, no sort/scan
overhead):

| Benchmark | Baseline | Optimized | Improvement |
|-----------|----------|-----------|-------------|
| inner_1to1 (unique keys) | 5.11 ms | 3.88 ms | **-24%** |
| inner_1to10 (10K keys) | 17.64 ms | 16.29 ms | **-8%** |
| left_1to1_unmatched (5% unmatched) | 4.80 ms | 3.87 ms | **-19%** |
| left_semi_1to10 (10K keys) | 3.65 ms | 3.11 ms | **-15%** |
| left_anti_partial (partial match) | 3.58 ms | 3.43 ms | **-4%** |

All improvements are statistically significant (p < 0.05).

TPC-H SF1 with SMJ forced (`prefer_hash_join=false`) shows no
regressions across all 22 queries, with modest end-to-end improvements
on join-heavy queries (Q3 -7%, Q19 -5%, Q21 -2%).

## Implementation

- `is_contiguous_range()`: checks if a `UInt64Array` is a contiguous
ascending range. Uses quick endpoint rejection then verifies every
element sequentially.
- `freeze_streamed()`: uses `slice` instead of `take` for streamed
(left) columns when indices are contiguous.
- `fetch_right_columns_from_batch_by_idxs()`: uses `slice` instead of
`take` for buffered (right) columns when indices are contiguous.

When indices are not contiguous (e.g. repeated indices in many-to-many
joins), falls back to the existing `take` path.
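
A minimal sketch of the slice-versus-take decision described above, written against plain slices rather than Arrow arrays so it stays self-contained. The names `contiguous_range` and `select_rows` are hypothetical stand-ins (the PR's own helper is `is_contiguous_range()` over a `UInt64Array`), and `Cow::Borrowed` plays the role of the zero-copy `slice` while `Cow::Owned` plays the role of the copying `take`:

```rust
use std::borrow::Cow;

/// Returns `Some((start, len))` when `indices` is a contiguous ascending run
/// such as `[3, 4, 5, 6]`, and `None` otherwise (including for empty input).
fn contiguous_range(indices: &[u64]) -> Option<(usize, usize)> {
    let (&first, &last) = (indices.first()?, indices.last()?);
    // Quick endpoint rejection: a contiguous run of n values spans exactly n - 1.
    if last.checked_sub(first)? != (indices.len() - 1) as u64 {
        return None;
    }
    // Endpoints can match by accident (e.g. [3, 5, 4, 6]), so verify every step.
    if indices.windows(2).all(|w| w[0].checked_add(1) == Some(w[1])) {
        Some((first as usize, indices.len()))
    } else {
        None
    }
}

/// Picks rows from `column`: borrows a sub-slice when the indices are
/// contiguous (no copy), and gathers row by row into a new Vec otherwise.
fn select_rows<'a>(column: &'a [i32], indices: &[u64]) -> Cow<'a, [i32]> {
    match contiguous_range(indices) {
        Some((start, len)) => Cow::Borrowed(&column[start..start + len]),
        None => Cow::Owned(indices.iter().map(|&i| column[i as usize]).collect()),
    }
}
```

Calling `select_rows(&column, &[3, 4, 5, 6])` borrows from `column`, while `select_rows(&column, &[1, 1, 2])` (repeated indices, as in a many-to-many join) allocates and copies.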

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 14:43:55 +00:00
Zhang Xiaofeng bfc012e638 bench: Add IN list benchmarks for non-constant list expressions (#20444)
## Which issue does this PR close?

- Relates to #20427.

## Rationale for this change

The existing `in_list` benchmarks only cover the static filter path
(constant literal lists), which uses HashSet lookup. There are no
benchmarks for the dynamic evaluation path, triggered when the IN list
contains non-constant expressions such as column references (e.g., `a IN
(b, c, d)`). Adding these benchmarks establishes a baseline for
measuring the impact of upcoming optimizations to the dynamic path (see
#20428).
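
To illustrate the difference between the two paths, here is a rough sketch over plain `i32` slices. It is not DataFusion's implementation: the function names are hypothetical, and NULL handling is left out entirely. The static path can build a `HashSet` once because the list is constant; the dynamic path cannot, because every row sees a different list:

```rust
use std::collections::HashSet;

/// Static path: the list is constant, so build the set once and probe per row.
fn in_list_static(column: &[i32], list: &[i32]) -> Vec<bool> {
    let set: HashSet<i32> = list.iter().copied().collect();
    column.iter().map(|v| set.contains(v)).collect()
}

/// Dynamic path: each list entry is itself a column, so row `i` compares
/// `needle[i]` against `list_columns[k][i]` for every `k`.
/// Assumes all columns have the same length as `needle`.
fn in_list_dynamic(needle: &[i32], list_columns: &[&[i32]]) -> Vec<bool> {
    needle
        .iter()
        .enumerate()
        .map(|(row, v)| list_columns.iter().any(|col| col[row] == *v))
        .collect()
}
```

In this sketch the dynamic path does work proportional to rows × list size, which is why the benchmarks vary both list size and match rate.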

## What changes are included in this PR?

Add criterion benchmarks for the dynamic IN list evaluation path:

- `bench_dynamic_int32`: Int32 column references, list sizes [3, 8, 28]
× match rates [0%, 50%, 100%] × null rates [0%, 20%]
- `bench_dynamic_utf8`: Utf8 column references, list sizes [3, 8, 28] ×
match rates [0%, 50%, 100%]


## Are these changes tested?

Yes. The benchmarks compile and run correctly. No implementation code is
changed.

## Are there any user-facing changes?

2026-02-22 07:40:02 +00:00
Daniël Heres c1ad8636a0 [Minor] Use buffer_unordered (#20462)
## Which issue does this PR close?


- Closes #.

## Rationale for this change

`buffer_unordered` should be slightly better here, since we sort by the
paths anyway (perhaps we can also reduce the default concurrency).

Also remove some unnecessary allocations.

## What changes are included in this PR?


## Are these changes tested?

## Are there any user-facing changes?
2026-02-22 07:38:53 +00:00
Kumar Ujjawal f488a9071b perf: Optimize scalar fast path for regexp_like and rejects g inside combined flags like ig (#20354)
## Which issue does this PR close?

- Part of  https://github.com/apache/datafusion-comet/issues/2986

## Rationale for this change

`regexp_like` was converting scalar inputs into single‑element arrays,
adding avoidable overhead for constant folding and scalar‑only
evaluations.

## What changes are included in this PR?

- Add a scalar fast path in `RegexpLikeFunc::invoke_with_args` that
evaluates `regexp_like` directly for scalar inputs
- Add a benchmark
- Fix `regexp_like` to reject the global flag even when it is provided in
combined flags (e.g., `ig`) across the scalar and array+scalar execution
paths; add tests for both branches.
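
The flag check can be sketched roughly as follows. This is an illustration only: `validate_flags` and its error text are hypothetical and not taken from the DataFusion source. The point is that the check scans the whole flags string for `g` rather than comparing the string as a whole, so combined flags such as `ig` are rejected too:

```rust
/// Validates a `regexp_like` flags string, rejecting the global flag `g`
/// wherever it appears (so "ig" fails just like "g").
/// Hypothetical helper; the error message is illustrative only.
fn validate_flags(flags: &str) -> Result<&str, String> {
    if flags.contains('g') {
        return Err(format!(
            "regexp_like does not support the global flag (got '{flags}')"
        ));
    }
    Ok(flags)
}
```

Running the same validation from both the scalar and the array+scalar branches is one way to keep the two paths consistent.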

| Type | Before | After | Speedup |
|------|--------|-------|---------|
| regexp_like_scalar_utf8 | 12.092 µs | 10.943 µs | 1.10x |

## Are these changes tested?

Yes

## Are there any user-facing changes?

No

---------

Co-authored-by: Jeffrey Vo <jeffrey.vo.australia@gmail.com>
2026-02-22 01:03:52 +00:00
dependabot[bot] cfdd7c180c chore(deps): bump testcontainers-modules from 0.14.0 to 0.15.0 (#20471)
Bumps
[testcontainers-modules](https://github.com/testcontainers/testcontainers-rs-modules-community)
from 0.14.0 to 0.15.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/testcontainers/testcontainers-rs-modules-community/releases">testcontainers-modules's
releases</a>.</em></p>
<blockquote>
<h2>v0.15.0</h2>
<h3>Documentation</h3>
<ul>
<li>Complete doc string for mongodb usage (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/375">#375</a>)</li>
<li>Complete doc comments for confluents kafka image (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/376">#376</a>)</li>
<li>Complete doc-comment for dynamodb (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/378">#378</a>)</li>
<li>Complete doc comments for confluents ElasticMQ image (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/379">#379</a>)</li>
<li>Complete doc comments for nats' images (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/383">#383</a>)</li>
<li>Complete doc comments for k3s images (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/381">#381</a>)</li>
<li>Complete doc comments for elasticsearch image (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/380">#380</a>)</li>
<li>Complete doc comments for the parity image (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/384">#384</a>)</li>
<li>Complete doc comments for orientdb images (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/382">#382</a>)</li>
<li>Complete doc comment for minio (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/377">#377</a>)</li>
<li>Complete doc comments for the google_cloud_sdk_emulators image (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/385">#385</a>)</li>
<li>Add a docstring for the last missing function
<code>Consul::with_local_config</code> (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/386">#386</a>)</li>
</ul>
<h3>Features</h3>
<ul>
<li>[<strong>breaking</strong>] Update testcontainers to 0.25.0 (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/388">#388</a>)</li>
</ul>
<h3>Miscellaneous Tasks</h3>
<ul>
<li>Update redis requirement from 0.29.0 to 0.32.2 (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/362">#362</a>)</li>
<li>Update async-nats requirement from 0.41.0 to 0.42.0 (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/360">#360</a>)</li>
<li>Update lapin requirement from 2.3.1 to 3.0.0 (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/359">#359</a>)</li>
<li>Update arrow-flight requirement from 55.1.0 to 56.0.0 (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/374">#374</a>)</li>
<li>Update rdkafka requirement from 0.37.0 to 0.38.0 (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/365">#365</a>)</li>
<li>Update meilisearch-sdk requirement from 0.28.0 to 0.29.1 (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/370">#370</a>)</li>
<li>Update azure_core to 0.27.0 (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/390">#390</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/testcontainers/testcontainers-rs-modules-community/blob/main/CHANGELOG.md">testcontainers-modules's
changelog</a>.</em></p>
<blockquote>
<h2>[0.15.0] - 2026-02-21</h2>
<h3>Bug Fixes</h3>
<ul>
<li>Ready condition in ClickHouse (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/441">#441</a>)</li>
</ul>
<h3>Features</h3>
<ul>
<li>Add RustFS module (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/444">#444</a>)</li>
<li>[<strong>breaking</strong>] Update testcontainers to
<code>0.27</code> (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/445">#445</a>)</li>
</ul>
<h3>Miscellaneous Tasks</h3>
<ul>
<li>Expose compile feature to pass through testcontainers/ring or
aws-lc-rs (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/pull/442">#442</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/testcontainers/testcontainers-rs-modules-community/commit/8840e4ddfb59326fa4838a94fbeaee99999eb99c"><code>8840e4d</code></a>
chore: release v0.15.0 (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/issues/446">#446</a>)</li>
<li><a
href="https://github.com/testcontainers/testcontainers-rs-modules-community/commit/59cc33f008bfa10e2cb6aef04413e3a807eecb61"><code>59cc33f</code></a>
feat!: update testcontainers to <code>0.27</code> (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/issues/445">#445</a>)</li>
<li><a
href="https://github.com/testcontainers/testcontainers-rs-modules-community/commit/b0d7a17be741e28bc5a0fc39952992125539e653"><code>b0d7a17</code></a>
feat: add RustFS module (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/issues/444">#444</a>)</li>
<li><a
href="https://github.com/testcontainers/testcontainers-rs-modules-community/commit/893ea7f4bc9a9434e5f918ea585dd92f97860bce"><code>893ea7f</code></a>
chore(deps): expose compile feature to pass through testcontainers/ring
or aw...</li>
<li><a
href="https://github.com/testcontainers/testcontainers-rs-modules-community/commit/331abcc6e61d9d76e5f8e6ec91566ce874d8fc32"><code>331abcc</code></a>
fix: ready condition in ClickHouse (<a
href="https://redirect.github.com/testcontainers/testcontainers-rs-modules-community/issues/441">#441</a>)</li>
<li>See full diff in <a
href="https://github.com/testcontainers/testcontainers-rs-modules-community/compare/v0.14.0...v0.15.0">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=testcontainers-modules&package-manager=cargo&previous-version=0.14.0&new-version=0.15.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-21 22:49:05 +00:00
dependabot[bot] 043f908b60 chore(deps): bump the all-other-cargo-deps group with 6 updates (#20470)
Bumps the all-other-cargo-deps group with 6 updates:

| Package | From | To |
| --- | --- | --- |
| [async-compression](https://github.com/Nullus157/async-compression) |
`0.4.39` | `0.4.40` |
| [clap](https://github.com/clap-rs/clap) | `4.5.59` | `4.5.60` |
| [wasm-bindgen-test](https://github.com/wasm-bindgen/wasm-bindgen) |
`0.3.58` | `0.3.61` |
| [aws-credential-types](https://github.com/smithy-lang/smithy-rs) |
`1.2.12` | `1.2.13` |
| [tonic](https://github.com/hyperium/tonic) | `0.14.4` | `0.14.5` |
| [syn](https://github.com/dtolnay/syn) | `2.0.116` | `2.0.117` |

Updates `async-compression` from 0.4.39 to 0.4.40
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/Nullus157/async-compression/commit/9d848a02f13f3a56542e4123be8947a8da06097e"><code>9d848a0</code></a>
chore: release (<a
href="https://redirect.github.com/Nullus157/async-compression/issues/452">#452</a>)</li>
<li><a
href="https://github.com/Nullus157/async-compression/commit/9df508b037dafb9a2d80bfd60fcd6679891abef1"><code>9df508b</code></a>
Fix update of bytes read in the encoder state machine. (<a
href="https://redirect.github.com/Nullus157/async-compression/issues/456">#456</a>)</li>
<li><a
href="https://github.com/Nullus157/async-compression/commit/0370b470db4dbe8f92a178320438e3094495a99a"><code>0370b47</code></a>
Stop consuming input on errors in codecs. (<a
href="https://redirect.github.com/Nullus157/async-compression/issues/451">#451</a>)</li>
<li><a
href="https://github.com/Nullus157/async-compression/commit/9a4b0961f988cdc2b70dae0f4310046c7fedc307"><code>9a4b096</code></a>
chore(deps): update rand requirement from 0.9 to 0.10 (<a
href="https://redirect.github.com/Nullus157/async-compression/issues/449">#449</a>)</li>
<li>See full diff in <a
href="https://github.com/Nullus157/async-compression/compare/async-compression-v0.4.39...async-compression-v0.4.40">compare
view</a></li>
</ul>
</details>
<br />

Updates `clap` from 4.5.59 to 4.5.60
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/clap-rs/clap/releases">clap's
releases</a>.</em></p>
<blockquote>
<h2>v4.5.60</h2>
<h2>[4.5.60] - 2026-02-19</h2>
<h3>Fixes</h3>
<ul>
<li><em>(help)</em> Quote empty default values, possible values</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/clap-rs/clap/blob/master/CHANGELOG.md">clap's
changelog</a>.</em></p>
<blockquote>
<h2>[4.5.60] - 2026-02-19</h2>
<h3>Fixes</h3>
<ul>
<li><em>(help)</em> Quote empty default values, possible values</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/clap-rs/clap/commit/33d24d844b11c0e926ae132e1af338ff070bdf4a"><code>33d24d8</code></a>
chore: Release</li>
<li><a
href="https://github.com/clap-rs/clap/commit/9332409f4a6c1d5c22064e839ec8e9bc040f3be7"><code>9332409</code></a>
docs: Update changelog</li>
<li><a
href="https://github.com/clap-rs/clap/commit/b7adce5a17089596eecb2af6985e6503f2ffcd38"><code>b7adce5</code></a>
Merge pull request <a
href="https://redirect.github.com/clap-rs/clap/issues/6166">#6166</a>
from fabalchemy/fix-dynamic-powershell-completion</li>
<li><a
href="https://github.com/clap-rs/clap/commit/009bba44ec3d182028ec3a72f5b6f3e507827768"><code>009bba4</code></a>
fix(clap_complete): Improve powershell registration</li>
<li><a
href="https://github.com/clap-rs/clap/commit/d89d57dfb4bdd18930a40c6d7f4fadb23ee9c5b3"><code>d89d57d</code></a>
chore: Release</li>
<li><a
href="https://github.com/clap-rs/clap/commit/f18b67ec3d4ce6ac1acf115adaab2f16ab2ed3c7"><code>f18b67e</code></a>
docs: Update changelog</li>
<li><a
href="https://github.com/clap-rs/clap/commit/9d218eb418526143c9110f734f78a608b8cf6440"><code>9d218eb</code></a>
Merge pull request <a
href="https://redirect.github.com/clap-rs/clap/issues/6165">#6165</a>
from epage/shirt</li>
<li><a
href="https://github.com/clap-rs/clap/commit/126440ca846613671e1dac98198b2ceb17dab2b0"><code>126440c</code></a>
fix(help): Correctly calculate padding for short-only args</li>
<li><a
href="https://github.com/clap-rs/clap/commit/9e3c05ef3800a3e638b8224a7881a81517a4f4db"><code>9e3c05e</code></a>
test(help): Show panic with short, valueless arg</li>
<li><a
href="https://github.com/clap-rs/clap/commit/c9898d0fece98d8520d3dd954cf457b685b3308f"><code>c9898d0</code></a>
test(help): Verify short with value</li>
<li>Additional commits viewable in <a
href="https://github.com/clap-rs/clap/compare/clap_complete-v4.5.59...clap_complete-v4.5.60">compare
view</a></li>
</ul>
</details>
<br />

Updates `wasm-bindgen-test` from 0.3.58 to 0.3.61
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/wasm-bindgen/wasm-bindgen/commits">compare
view</a></li>
</ul>
</details>
<br />

Updates `aws-credential-types` from 1.2.12 to 1.2.13
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/smithy-lang/smithy-rs/commits">compare
view</a></li>
</ul>
</details>
<br />

Updates `tonic` from 0.14.4 to 0.14.5
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/hyperium/tonic/releases">tonic's
releases</a>.</em></p>
<blockquote>
<h2>v0.14.5</h2>
<h2>What's Changed</h2>
<ul>
<li>Add max connections setting</li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/hyperium/tonic/compare/v0.14.4...v0.14.5">https://github.com/hyperium/tonic/compare/v0.14.4...v0.14.5</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/hyperium/tonic/commit/3f7caf3171393734ef19e12d010bd9c945c9e242"><code>3f7caf3</code></a>
chore: prepare v0.14.5 release (<a
href="https://redirect.github.com/hyperium/tonic/issues/2516">#2516</a>)</li>
<li><a
href="https://github.com/hyperium/tonic/commit/3f56644955162b344ce4a2641823776574ae98e4"><code>3f56644</code></a>
grpc(chore): add missing copyright notices (<a
href="https://redirect.github.com/hyperium/tonic/issues/2513">#2513</a>)</li>
<li><a
href="https://github.com/hyperium/tonic/commit/1769c91a96f054416e0d11c84fcc26284262dda2"><code>1769c91</code></a>
feat(xds): implement xDS subscription worker (<a
href="https://redirect.github.com/hyperium/tonic/issues/2478">#2478</a>)</li>
<li><a
href="https://github.com/hyperium/tonic/commit/56f8c6db4718c32e8cb1732438b87c85a3a8c1f6"><code>56f8c6d</code></a>
feat(grpc): Add TCP listener API in the Runtime trait + tests for server
cred...</li>
<li><a
href="https://github.com/hyperium/tonic/commit/149f3668f0514bd79f12524778ca76eb6341a3f5"><code>149f366</code></a>
feat(grpc) Add channel credentials API + Insecure credentials (<a
href="https://redirect.github.com/hyperium/tonic/issues/2495">#2495</a>)</li>
<li>See full diff in <a
href="https://github.com/hyperium/tonic/compare/v0.14.4...v0.14.5">compare
view</a></li>
</ul>
</details>
<br />

Updates `syn` from 2.0.116 to 2.0.117
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/dtolnay/syn/releases">syn's
releases</a>.</em></p>
<blockquote>
<h2>2.0.117</h2>
<ul>
<li>Fix parsing of <code>self::</code> pattern in first function
argument (<a
href="https://redirect.github.com/dtolnay/syn/issues/1970">#1970</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/dtolnay/syn/commit/7bcb37cdb3399977658c8b52d2441d37e42e48f2"><code>7bcb37c</code></a>
Release 2.0.117</li>
<li><a
href="https://github.com/dtolnay/syn/commit/9c6e7d3b8df7b30909d60395f88a6ca07688e1c1"><code>9c6e7d3</code></a>
Merge pull request <a
href="https://redirect.github.com/dtolnay/syn/issues/1970">#1970</a>
from dtolnay/receiver</li>
<li><a
href="https://github.com/dtolnay/syn/commit/019a84847eded0cdb1f7856e0752ba618155cfc9"><code>019a848</code></a>
Fix self:: pattern in first function argument</li>
<li><a
href="https://github.com/dtolnay/syn/commit/23f54f3cf61ddedd5daea4f347eca2d4b84c8abb"><code>23f54f3</code></a>
Update test suite to nightly-2026-02-18</li>
<li><a
href="https://github.com/dtolnay/syn/commit/b99b9a627c46580343398472e7b08a131357a994"><code>b99b9a6</code></a>
Unpin CI miri toolchain</li>
<li>See full diff in <a
href="https://github.com/dtolnay/syn/compare/2.0.116...2.0.117">compare
view</a></li>
</ul>
</details>
<br />


Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-21 22:48:36 +00:00
dependabot[bot] 626bc01b04 chore(deps): bump astral-sh/setup-uv from 6.1.0 to 7.3.0 (#20468)
Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from
6.1.0 to 7.3.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/astral-sh/setup-uv/releases">astral-sh/setup-uv's
releases</a>.</em></p>
<blockquote>
<h2>v7.3.0 🌈 New features and bug fixes for activate-environment</h2>
<h2>Changes</h2>
<p>This release contains a few bug fixes and a new feature for the
activate-environment functionality.</p>
<h2>🐛 Bug fixes</h2>
<ul>
<li>fix: warn instead of error when no python to cache <a
href="https://github.com/eifinger"><code>@​eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/762">#762</a>)</li>
<li>fix: use --clear to create venv <a
href="https://github.com/eifinger"><code>@​eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/761">#761</a>)</li>
</ul>
<h2>🚀 Enhancements</h2>
<ul>
<li>feat: add venv-path input for activate-environment <a
href="https://github.com/eifinger"><code>@​eifinger</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/746">#746</a>)</li>
</ul>
<h2>🧰 Maintenance</h2>
<ul>
<li>chore: update known checksums for 0.10.0 @<a
href="https://github.com/apps/github-actions">github-actions[bot]</a>
(<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/759">#759</a>)</li>
<li>refactor: tilde-expansion tests as unittests and no self-hosted
tests <a href="https://github.com/eifinger"><code>@​eifinger</code></a>
(<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/760">#760</a>)</li>
<li>chore: update known checksums for 0.9.30 @<a
href="https://github.com/apps/github-actions">github-actions[bot]</a>
(<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/756">#756</a>)</li>
<li>chore: update known checksums for 0.9.29 @<a
href="https://github.com/apps/github-actions">github-actions[bot]</a>
(<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/748">#748</a>)</li>
</ul>
<h2>📚 Documentation</h2>
<ul>
<li>Fix punctuation <a
href="https://github.com/pm-dev563"><code>@​pm-dev563</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/747">#747</a>)</li>
</ul>
<h2>⬆️ Dependency updates</h2>
<ul>
<li>Bump typesafegithub/github-actions-typing from 2.2.1 to 2.2.2 @<a
href="https://github.com/apps/dependabot">dependabot[bot]</a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/753">#753</a>)</li>
<li>Bump peter-evans/create-pull-request from 8.0.0 to 8.1.0 @<a
href="https://github.com/apps/dependabot">dependabot[bot]</a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/751">#751</a>)</li>
<li>Bump actions/checkout from 6.0.1 to 6.0.2 @<a
href="https://github.com/apps/dependabot">dependabot[bot]</a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/740">#740</a>)</li>
<li>Bump release-drafter/release-drafter from 6.1.0 to 6.2.0 @<a
href="https://github.com/apps/dependabot">dependabot[bot]</a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/743">#743</a>)</li>
<li>Bump eifinger/actionlint-action from 1.9.3 to 1.10.0 @<a
href="https://github.com/apps/dependabot">dependabot[bot]</a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/731">#731</a>)</li>
<li>Bump actions/setup-node from 6.1.0 to 6.2.0 @<a
href="https://github.com/apps/dependabot">dependabot[bot]</a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/738">#738</a>)</li>
</ul>
<h2>v7.2.0 🌈 add outputs python-version and python-cache-hit</h2>
<h2>Changes</h2>
<p>Among some minor typo fixes and quality of life features for
developers of actions the main feature of this release are new
outputs:</p>
<ul>
<li><strong>python-version:</strong> The Python version that was set
(same content as existing <code>UV_PYTHON</code>)</li>
<li><strong>python-cache-hit:</strong> A boolean value to indicate the
Python cache entry was found</li>
</ul>
<p>While implementing this it became clear, that it is easier to handle
the Python binaries in a separate cache entry. The added benefit for
users is that the &quot;normal&quot; cache containing the dependencies
can be used in all runs no matter if these cache the Python binaries or
not.</p>
<blockquote>
<p>[!NOTE]<br />
This release will invalidate caches that contain the Python binaries.
This happens a single time.</p>
</blockquote>
<h2>🐛 Bug fixes</h2>
<ul>
<li>chore: remove stray space from UV_PYTHON_INSTALL_DIR message <a
href="https://github.com/akx"><code>@​akx</code></a> (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/720">#720</a>)</li>
</ul>
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/astral-sh/setup-uv/commit/eac588ad8def6316056a12d4907a9d4d84ff7a3b"><code>eac588a</code></a>
Bump typesafegithub/github-actions-typing from 2.2.1 to 2.2.2 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/753">#753</a>)</li>
<li><a
href="https://github.com/astral-sh/setup-uv/commit/a97c6cbe9c11a3fc620e0f506b2967ef4fe74ebb"><code>a97c6cb</code></a>
Bump peter-evans/create-pull-request from 8.0.0 to 8.1.0 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/751">#751</a>)</li>
<li><a
href="https://github.com/astral-sh/setup-uv/commit/02182fa02a198f2423c87ba9a41982b2efbaa3ef"><code>02182fa</code></a>
fix: warn instead of error when no python to cache (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/762">#762</a>)</li>
<li><a
href="https://github.com/astral-sh/setup-uv/commit/a3b3eaea92d7cf978795e7ae0a996f861347b70b"><code>a3b3eae</code></a>
chore: update known checksums for 0.10.0 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/759">#759</a>)</li>
<li><a
href="https://github.com/astral-sh/setup-uv/commit/78cebeceac116b9740b3fb83de1d99c68aa4ced9"><code>78cebec</code></a>
fix: use --clear to create venv (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/761">#761</a>)</li>
<li><a
href="https://github.com/astral-sh/setup-uv/commit/b6b8e2cd6a1bad11205c4c74af16307cdbecd194"><code>b6b8e2c</code></a>
refactor: tilde-expansion tests as unittests and no self-hosted tests
(<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/760">#760</a>)</li>
<li><a
href="https://github.com/astral-sh/setup-uv/commit/e31bec8546a22248f075a182e7e60c534bffa057"><code>e31bec8</code></a>
chore: update known checksums for 0.9.30 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/756">#756</a>)</li>
<li><a
href="https://github.com/astral-sh/setup-uv/commit/db2b65ebaeba7fdae1dfc2a646812fa8ebccefe2"><code>db2b65e</code></a>
Bump actions/checkout from 6.0.1 to 6.0.2 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/740">#740</a>)</li>
<li><a
href="https://github.com/astral-sh/setup-uv/commit/3511ff7054b4bdbf897f4410d573261859a8eeb2"><code>3511ff7</code></a>
feat: add venv-path input for activate-environment (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/746">#746</a>)</li>
<li><a
href="https://github.com/astral-sh/setup-uv/commit/99b0f0474b8c709992d2d82e9cfa8745d4715d14"><code>99b0f04</code></a>
Fix punctuation (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/747">#747</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/astral-sh/setup-uv/compare/f0ec1fc3b38f5e7cd731bb6ce540c5af426746bb...eac588ad8def6316056a12d4907a9d4d84ff7a3b">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=astral-sh/setup-uv&package-manager=github_actions&previous-version=6.1.0&new-version=7.3.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-21 22:40:06 +00:00
dependabot[bot] d2c5666f5a chore(deps): bump taiki-e/install-action from 2.68.0 to 2.68.6 (#20467)
Bumps
[taiki-e/install-action](https://github.com/taiki-e/install-action) from
2.68.0 to 2.68.6.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/taiki-e/install-action/releases">taiki-e/install-action's
releases</a>.</em></p>
<blockquote>
<h2>2.68.6</h2>
<ul>
<li>Update <code>wasm-bindgen@latest</code> to 0.2.110.</li>
</ul>
<h2>2.68.5</h2>
<ul>
<li>Update <code>wasm-bindgen@latest</code> to 0.2.109.</li>
</ul>
<h2>2.68.4</h2>
<ul>
<li>Update <code>cargo-nextest@latest</code> to 0.9.128.</li>
</ul>
<h2>2.68.3</h2>
<ul>
<li>
<p>Update <code>mise@latest</code> to 2026.2.17.</p>
</li>
<li>
<p>Update <code>cargo-tarpaulin@latest</code> to 0.35.2.</p>
</li>
<li>
<p>Update <code>syft@latest</code> to 1.42.1.</p>
</li>
</ul>
<h2>2.68.2</h2>
<ul>
<li>
<p>Update <code>uv@latest</code> to 0.10.4.</p>
</li>
<li>
<p>Update <code>tombi@latest</code> to 0.7.31.</p>
</li>
<li>
<p>Update <code>rclone@latest</code> to 1.73.1.</p>
</li>
</ul>
<h2>2.68.1</h2>
<ul>
<li>
<p>Update <code>mise@latest</code> to 2026.2.15.</p>
</li>
<li>
<p>Update <code>tombi@latest</code> to 0.7.30.</p>
</li>
<li>
<p>Update <code>knope@latest</code> to 0.22.3.</p>
</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/taiki-e/install-action/blob/main/CHANGELOG.md">taiki-e/install-action's
changelog</a>.</em></p>
<blockquote>
<h1>Changelog</h1>
<p>All notable changes to this project will be documented in this
file.</p>
<p>This project adheres to <a href="https://semver.org">Semantic
Versioning</a>.</p>
<!-- raw HTML omitted -->
<h2>[Unreleased]</h2>
<ul>
<li>Update <code>wasm-bindgen@latest</code> to 0.2.111.</li>
</ul>
<h2>[2.68.6] - 2026-02-21</h2>
<ul>
<li>Update <code>wasm-bindgen@latest</code> to 0.2.110.</li>
</ul>
<h2>[2.68.5] - 2026-02-20</h2>
<ul>
<li>Update <code>wasm-bindgen@latest</code> to 0.2.109.</li>
</ul>
<h2>[2.68.4] - 2026-02-20</h2>
<ul>
<li>Update <code>cargo-nextest@latest</code> to 0.9.128.</li>
</ul>
<h2>[2.68.3] - 2026-02-19</h2>
<ul>
<li>
<p>Update <code>mise@latest</code> to 2026.2.17.</p>
</li>
<li>
<p>Update <code>cargo-tarpaulin@latest</code> to 0.35.2.</p>
</li>
<li>
<p>Update <code>syft@latest</code> to 1.42.1.</p>
</li>
</ul>
<h2>[2.68.2] - 2026-02-18</h2>
<ul>
<li>
<p>Update <code>uv@latest</code> to 0.10.4.</p>
</li>
<li>
<p>Update <code>tombi@latest</code> to 0.7.31.</p>
</li>
<li>
<p>Update <code>rclone@latest</code> to 1.73.1.</p>
</li>
</ul>
<h2>[2.68.1] - 2026-02-17</h2>
<ul>
<li>
<p>Update <code>mise@latest</code> to 2026.2.15.</p>
</li>
<li>
<p>Update <code>tombi@latest</code> to 0.7.30.</p>
</li>
<li>
<p>Update <code>knope@latest</code> to 0.22.3.</p>
</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/taiki-e/install-action/commit/470679bc3a1580072dac4e67535d1aa3a3dcdf51"><code>470679b</code></a>
Release 2.68.6</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/6d8a751fa8ca34ab6f9c3fd87eea05661fa2196d"><code>6d8a751</code></a>
Update <code>wasm-bindgen@latest</code> to 0.2.110</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/71b48393496777ee11188c07a34d48b048a985cd"><code>71b4839</code></a>
Release 2.68.5</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/4ca0169380867518b6c0cb49cb63c9646ac66e21"><code>4ca0169</code></a>
Update <code>wasm-bindgen@latest</code> to 0.2.109</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/2723513a70062521fb56e5df87a04967751efd2f"><code>2723513</code></a>
Release 2.68.4</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/564854d94ec8d55b29e46a990a0bb8a1edc78e71"><code>564854d</code></a>
Update <code>cargo-nextest@latest</code> to 0.9.128</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/1cf3de8de323df92fe08c793e53eaef58799aec4"><code>1cf3de8</code></a>
Release 2.68.3</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/ef14f86a60d221f1fe25998845372fdf90cdd7d4"><code>ef14f86</code></a>
Update changelog</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/d7329c5811e2d509a381c912e9bd5b235cec5fdf"><code>d7329c5</code></a>
Update <code>mise@latest</code> to 2026.2.17</li>
<li><a
href="https://github.com/taiki-e/install-action/commit/bc11002a6517dd702174597bd0a8e6350d2a7211"><code>bc11002</code></a>
Update <code>cargo-tarpaulin@latest</code> to 0.35.2</li>
<li>Additional commits viewable in <a
href="https://github.com/taiki-e/install-action/compare/f8d25fb8a2df08dcd3cead89780d572767b8655f...470679bc3a1580072dac4e67535d1aa3a3dcdf51">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=taiki-e/install-action&package-manager=github_actions&previous-version=2.68.0&new-version=2.68.6)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-21 22:38:21 +00:00
Oleks V d03601547a chore: group minor dependencies into single PR (#20457)
## Which issue does this PR close?

- Closes #.


## Rationale for this change

- **Reduce Dependabot PR noise without reducing coverage**  
Grouping most *minor* and *patch* Cargo updates into a single PR keeps
routine churn manageable while still ensuring updates are applied
regularly.

- **Keep riskier updates isolated**  
*Major* version bumps can include breaking changes, so we intentionally
**do not group major updates**. This preserves **one PR per crate** for
majors, simplifying review, CI triage, and rollback.

- **Preserve existing special handling for Arrow/Parquet**  
  - Arrow/Parquet updates are higher impact and often coordinated, so we
    keep their **minor/patch** updates grouped together for consistency.
  - Arrow/Parquet **major** bumps are handled manually (and ignored by
    Dependabot) to avoid surprise large-scale breakage.

- **Ensure `object_store` and `sqlparser` remain easy to diagnose**  
These dependencies can have outsized downstream impact in DataFusion.
Excluding them from the catch-all group ensures their updates land as
**individual PRs**, making it easier to attribute regressions and bisect
failures.

- **Maintain targeted grouping where it’s beneficial**  
Protocol-related crates (`prost*`, `pbjson*`) are commonly updated
together, so grouping their minor/patch updates reduces churn while
keeping changes cohesive.

## What changes are included in this PR?

## Are these changes tested?

## Are there any user-facing changes?

2026-02-21 22:14:22 +00:00
Andy Grove 42dd4279de bench: Add criterion benchmark for sort merge join (#20464)
## Summary
- Adds a criterion micro-benchmark for SortMergeJoinExec that measures
join kernel performance in isolation
- Pre-sorted RecordBatches are fed directly into the join operator,
avoiding sort/scan overhead
- Data is constructed once and reused across iterations; only the
`TestMemoryExec` wrapper is recreated per iteration

## Benchmarks

Five scenarios covering the most common SMJ patterns:

| Benchmark | Join Type | Key Pattern |
|-----------|-----------|-------------|
| `inner_1to1` | Inner | 100K unique keys per side |
| `inner_1to10` | Inner | 10K keys, ~10 rows per key |
| `left_1to1_unmatched` | Left | ~5% unmatched on left side |
| `left_semi_1to10` | Left Semi | 10K keys |
| `left_anti_partial` | Left Anti | Partial key overlap |

## Usage

```bash
cargo bench -p datafusion-physical-plan --features test_utils --bench sort_merge_join
```
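
To make the key patterns in the table concrete, here is a minimal sketch of an inner merge join over pre-sorted integer keys. It is a toy illustration of the technique only, not the `SortMergeJoinExec` implementation; the function name `merge_join_inner` and the index-pair output are assumptions made for this example.

```rust
/// Toy inner merge join over two key slices that are already sorted ascending.
/// Returns (left_index, right_index) pairs for every matching key combination.
/// Hypothetical helper for illustration; not part of DataFusion.
pub fn merge_join_inner(left: &[i64], right: &[i64]) -> Vec<(usize, usize)> {
    let mut out = Vec::new();
    let (mut l, mut r) = (0usize, 0usize);
    while l < left.len() && r < right.len() {
        if left[l] < right[r] {
            // Left key has no match yet; advance the left cursor.
            l += 1;
        } else if left[l] > right[r] {
            // Right key has no match yet; advance the right cursor.
            r += 1;
        } else {
            let key = left[l];
            // Find the run of rows sharing this key on each side.
            let l_end = l + left[l..].iter().take_while(|&&k| k == key).count();
            let r_end = r + right[r..].iter().take_while(|&&k| k == key).count();
            // Emit the cross product of the two runs (1:1, 1:N, or N:M).
            for i in l..l_end {
                for j in r..r_end {
                    out.push((i, j));
                }
            }
            l = l_end;
            r = r_end;
        }
    }
    out
}
```

With unique keys on both sides each match emits one pair (the `inner_1to1` shape), while a key repeated on one side emits one pair per repeated row (the `inner_1to10` shape).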

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 18:32:33 +00:00
Paul J. Davis 0d63ced04a Implement FFI table provider factory (#20326)
> ## Which issue does this PR close?
> * Closes [expose TableProviderFactory via
FFI #17942](https://github.com/apache/datafusion/issues/17942)
> 

This PR re-opens PR #17994 and updates it to match the current FFI
approach (i.e., I made it look like `FFI_TableProvider` in various
places).

> ## Rationale for this change
> Expose `TableProviderFactory` via FFI to enable external languages
(e.g., Python) to implement custom table provider factories and extend
DataFusion with new data source types.
> 
> ## What changes are included in this PR?
> * Added `datafusion/ffi/src/table_provider_factory.rs` with:
>   * `FFI_TableProviderFactory`: Stable C ABI struct with function
>     pointers for `create`, `clone`, `release`, and `version`
>   * `ForeignTableProviderFactory`: Wrapper implementing the
>     `TableProviderFactory` trait
> 
> ## Are these changes tested?
> Yes
> 

I've also added the integration tests as requested in the original PR.

> ## Are there any user-facing changes?
> Yes - new FFI API that enables custom `TableProviderFactory`
implementations in foreign languages. This is an additive change with no
breaking changes to existing APIs.
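
As a rough illustration of the "C ABI struct with function pointers" pattern described above, here is a simplified sketch. Everything in it is hypothetical: `FfiFactorySketch`, `PrivateData`, and the `name` field are invented for this example, the `create` pointer is omitted, and the real `FFI_TableProviderFactory` layout may differ.

```rust
use std::ffi::c_void;

/// Hypothetical private state owned by the library that created the handle.
struct PrivateData {
    name: String,
}

/// Hypothetical, simplified FFI-safe handle: plain function pointers plus an
/// opaque data pointer, so the layout does not depend on Rust trait objects.
#[repr(C)]
pub struct FfiFactorySketch {
    /// Lets both sides check that they agree on the struct layout.
    pub version: unsafe extern "C" fn() -> u64,
    /// Produces an independent handle backed by its own private data.
    pub clone: unsafe extern "C" fn(factory: &FfiFactorySketch) -> FfiFactorySketch,
    /// Frees the private data; safe to call again once the pointer is null.
    pub release: unsafe extern "C" fn(factory: &mut FfiFactorySketch),
    /// Opaque pointer; only the creating library knows its real type.
    pub private_data: *mut c_void,
}

unsafe extern "C" fn version_fn() -> u64 {
    1
}

unsafe extern "C" fn clone_fn(factory: &FfiFactorySketch) -> FfiFactorySketch {
    let data = unsafe { &*(factory.private_data as *const PrivateData) };
    FfiFactorySketch::new(data.name.clone())
}

unsafe extern "C" fn release_fn(factory: &mut FfiFactorySketch) {
    if !factory.private_data.is_null() {
        // Reclaim the Box so the private data is dropped exactly once.
        drop(unsafe { Box::from_raw(factory.private_data as *mut PrivateData) });
        factory.private_data = std::ptr::null_mut();
    }
}

impl FfiFactorySketch {
    pub fn new(name: String) -> Self {
        let data = Box::new(PrivateData { name });
        Self {
            version: version_fn,
            clone: clone_fn,
            release: release_fn,
            private_data: Box::into_raw(data) as *mut c_void,
        }
    }

    /// Reads the private data back; valid only before `release` has run.
    pub fn name(&self) -> &str {
        unsafe { &(*(self.private_data as *const PrivateData)).name }
    }
}

impl Clone for FfiFactorySketch {
    fn clone(&self) -> Self {
        // Calls the function pointer stored in the `clone` field.
        unsafe { (self.clone)(self) }
    }
}

impl Drop for FfiFactorySketch {
    fn drop(&mut self) {
        let release = self.release;
        unsafe { release(self) }
    }
}
```

Routing `Clone` and `Drop` through the stored function pointers means the memory is always freed by the library that allocated it, which is the main reason for this style of handle.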

Also, I'd like to thank @Weijun-H for the initial version of this PR as
it simplified getting up to speed on the serialization logic that I
hadn't encountered yet.

---------

Co-authored-by: Weijun-H <huangweijun1001@gmail.com>
2026-02-21 12:36:30 +00:00
Liang-Chi Hsieh 1736fd2a40 refactor: Extract sort-merge join filter logic into separate module (#19614)
Refactored the sort-merge join implementation to improve code
organization by extracting all filter-related logic into a dedicated
filter.rs module.

Changes:
- Created new filter.rs module (~576 lines) containing:
  - Filter metadata tracking (FilterMetadata struct)
  - Deferred filtering decision logic (needs_deferred_filtering)
  - Filter mask correction for different join types
    (get_corrected_filter_mask)
  - Filter application with null-joined row handling
    (filter_record_batch_by_join_type)
  - Helper functions for filter column extraction and batch filtering

- Updated stream.rs:
  - Removed ~450 lines of filter-specific code
  - Now delegates to filter module functions
  - Simplified main join logic to focus on stream processing

- Updated tests.rs:
  - Updated imports to use new filter module
  - Changed test code to use FilterMetadata struct
  - All 47 sort-merge join tests passing

🤖 Generated with [Claude Code](https://claude.com/claude-code)

## Which issue does this PR close?

- Closes #.

## Rationale for this change

## What changes are included in this PR?

## Are these changes tested?

## Are there any user-facing changes?

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-21 02:10:36 +00:00
Kazantsev Maksim fc98d5c282 feat: Implement Spark bitmap_bucket_number function (#20288)
## Which issue does this PR close?

N/A

## Rationale for this change

Add new function:
https://spark.apache.org/docs/latest/api/sql/index.html#bitmap_bucket_number
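
For readers unfamiliar with the function, the sketch below shows the bucket arithmetic as I understand it from Spark's reference implementation. The 32768-bit bitmap width and the formula are assumptions that should be checked against Spark, and this is not the code added in this PR.

```rust
/// Assumed bitmap width in bits (4 KiB bitmaps); verify against Spark.
const NUM_BITS: i64 = 8 * 4 * 1024;

/// Sketch of the bucket arithmetic: positive values map to buckets starting
/// at 1, so 1..=NUM_BITS share bucket 1; other values use plain truncating
/// division. Hypothetical helper for illustration only.
pub fn bitmap_bucket_number(value: i64) -> i64 {
    if value > 0 {
        1 + (value - 1) / NUM_BITS
    } else {
        value / NUM_BITS
    }
}
```

Under these assumptions, `bitmap_bucket_number(123)` falls in bucket 1 and `bitmap_bucket_number(0)` in bucket 0.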

## What changes are included in this PR?

- Implementation
- Unit Tests
- SLT tests

## Are these changes tested?

Yes, tests added as part of this PR.

## Are there any user-facing changes?

No, this is a new function.

---------

Co-authored-by: Kazantsev Maksim <mn.kazantsev@gmail.com>
2026-02-21 02:08:44 +00:00
Neil Conway 7f99947390 chore: Cleanup "!is_valid(i)" -> "is_null(i)" (#20453)
## Which issue does this PR close?

N/A

## Rationale for this change

This makes the code easier to read; per suggestion from @Jefffrey in
code review for a different change.
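
A minimal sketch of the before/after shape of this cleanup is shown below. `ToyArray` is a hypothetical stand-in invented for this example (real Arrow arrays store validity in a packed bitmap); the point is only that the two predicates are equivalent and the second reads more directly.

```rust
/// Hypothetical stand-in for an Arrow-style array: one validity flag per slot.
pub struct ToyArray {
    pub values: Vec<i32>,
    pub validity: Vec<bool>,
}

impl ToyArray {
    pub fn len(&self) -> usize {
        self.values.len()
    }
    pub fn is_valid(&self, i: usize) -> bool {
        self.validity[i]
    }
    pub fn is_null(&self, i: usize) -> bool {
        !self.validity[i]
    }
}

/// Before: the null check is written as a negated validity check.
pub fn count_nulls_before(arr: &ToyArray) -> usize {
    (0..arr.len()).filter(|&i| !arr.is_valid(i)).count()
}

/// After: the same check, stated directly.
pub fn count_nulls_after(arr: &ToyArray) -> usize {
    (0..arr.len()).filter(|&i| arr.is_null(i)).count()
}
```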

## What changes are included in this PR?

## Are these changes tested?

Yes.

## Are there any user-facing changes?

No.
2026-02-21 02:04:39 +00:00
Eren Avsarogullari a936d0de95 test: Extend Spark Array functions: array_repeat, shuffle and slice test coverage (#20420)
## Which issue does this PR close?
- Closes #20419.

## Rationale for this change
This PR adds new positive test cases for the `datafusion-spark` array
functions `array_repeat`, `shuffle`, and `slice`, covering the following
use cases:
- nested function execution,
- different data types such as timestamp,
- casting before function execution.

It also updates the contributor-guide testing documentation with a minor
addition.

## What changes are included in this PR?
Adds new positive test cases for the `datafusion-spark` array
functions `array_repeat`, `shuffle`, and `slice`.

## Are these changes tested?
Yes, adding new positive test cases.

## Are there any user-facing changes?
No
2026-02-20 18:39:37 +00:00
Yu-Chuan Hung 0f7a405b8c feat: support Spark-compatible json_tuple function (#20412)
## Which issue does this PR close?

- Part of #15914
- Related comet issue:
https://github.com/apache/datafusion-comet/issues/3160

## Rationale for this change

- Apache Spark's `json_tuple` extracts top-level fields from a JSON
string.
- This function is used in Spark SQL and needed for DataFusion-Comet
compatibility.
- Reference:
https://spark.apache.org/docs/latest/api/sql/index.html#json_tuple

## What changes are included in this PR?

- Add Spark-compatible `json_tuple` function in `datafusion-spark` crate
- Function signature: `json_tuple(json_string, key1, key2, ...) ->
Struct<c0: Utf8, c1: Utf8, ...>`
  - `json_string`: The JSON string to extract fields from
  - `key1, key2, ...`: Top-level field names to extract
- Returns a Struct because DataFusion ScalarUDFs return one value per
row; caller (Comet) destructures the fields

### Examples

```sql
SELECT json_tuple('{"f1":"value1","f2":"value2","f3":3}', 'f1', 'f2', 'f3');
-- {c0: value1, c1: value2, c2: 3}

SELECT json_tuple('{"f1":"value1"}', 'f1', 'f2');
-- {c0: value1, c1: NULL}

SELECT json_tuple(NULL, 'f1');
-- NULL
```

## Are these changes tested?

- Unit tests: return_field_from_args shape validation and too-few-args
error
- sqllogictest: test_files/spark/json/json_tuple.slt, test cases derived
from Spark JsonExpressionsSuite

## Are there any user-facing changes?
Yes.
2026-02-20 18:38:32 +00:00
Adrian Garcia Badaracco 1ee782f783 Migrate Python usage to uv workspace (#20414)
I was having trouble getting the benchmarks to generate data.

## Summary
- Replace three independent `requirements.txt` files with a uv workspace
(`benchmarks`, `dev`, `docs` projects)
- Single `uv.lock` lockfile for reproducible dependency resolution
- Simplify `bench.sh` by removing all ad-hoc venv/pip logic in favor of
`uv run`

## Test plan
- [ ] `uv sync` resolves all deps from repo root
- [ ] `uv run --project benchmarks python3 benchmarks/compare.py` works
- [ ] `uv run --project docs sphinx-build docs/source docs/build` builds
docs
- [ ] Run a benchmark from `bench.sh` that uses Python (e.g., h2o data
gen or compare flow)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 16:29:56 +00:00