perf: Use Hashbrown for array_distinct (#20538)

## Which issue does this PR close?

N/A

## Rationale for this change

#20364 recently optimized `array_distinct` to use batched row
conversion. As part of that PR, `std::HashSet` was used. This PR just
replaces `std::HashSet` with `hashbrown::HashSet`, which measurably
improves performance.
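
For illustration only, the kind of hash-set deduplication involved can be sketched as below. This is not DataFusion's implementation: `distinct_preserving_order` is a hypothetical helper, and the sketch is written against `std` so it compiles without extra dependencies. The two set types expose the same `with_capacity` and `insert` calls used here, so the swap described above amounts to changing the `use` line. Since `std`'s `HashSet` is itself built on hashbrown's table, the speedup most likely comes from hashbrown's faster default hasher rather than from a different table layout.

```rust
// Written against std so the sketch has no external dependencies.
// With `hashbrown` listed in Cargo.toml, replacing the next line with
// `use hashbrown::HashSet;` is the only change this sketch would need.
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical helper (not DataFusion code): returns the distinct
/// elements of `values`, keeping the first occurrence of each.
pub fn distinct_preserving_order<T: Eq + Hash + Clone>(values: &[T]) -> Vec<T> {
    let mut seen = HashSet::with_capacity(values.len());
    let mut out = Vec::new();
    for v in values {
        // `insert` returns true only when the value was not already present.
        if seen.insert(v.clone()) {
            out.push(v.clone());
        }
    }
    out
}
```

Calling `distinct_preserving_order(&[3, 1, 3, 2, 1])` yields `[3, 1, 2]` with either set type, because the result depends only on set membership and not on the hasher.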

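## What changes are included in this PR?

The `std::collections::HashSet` import is replaced with `hashbrown::HashSet`; no other code changes.
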
## Are these changes tested?

Yes.

## Are there any user-facing changes?

No.
Neil Conway
2026-02-25 13:12:42 -05:00
committed by GitHub
parent e6849945bf
commit e894a03bea
+1 -1
@@ -34,8 +34,8 @@ use datafusion_expr::{
ColumnarValue, Documentation, ScalarUDFImpl, Signature, Volatility,
};
use datafusion_macros::user_doc;
+use hashbrown::HashSet;
use std::any::Any;
-use std::collections::HashSet;
use std::fmt::{Display, Formatter};
use std::sync::Arc;