mirror of
https://github.com/langchain-ai/datafusion.git
synced 2026-07-01 21:24:06 -04:00
perf: Use Hashbrown for array_distinct (#20538)
## Which issue does this PR close?

N/A

## Rationale for this change

#20364 recently optimized `array_distinct` to use batched row conversion. As part of that PR, `std::HashSet` was used. This PR replaces `std::HashSet` with `hashbrown::HashSet`, which measurably improves performance.

## What changes are included in this PR?

## Are these changes tested?

Yes.

## Are there any user-facing changes?

No.
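As a rough illustration of why a set swap like this can be a drop-in change, here is a minimal sketch of hash-set-based deduplication that is generic over the hasher. It is not DataFusion's `array_distinct` implementation: the function names `distinct_with_hasher` and `distinct` are hypothetical, the sketch works on plain slices rather than Arrow arrays, and it uses only the standard library. The assumption being illustrated is that the set type and its hasher affect speed only, not the result.

```rust
use std::collections::hash_map::RandomState;
use std::collections::HashSet;
use std::hash::{BuildHasher, Hash};

/// Hypothetical helper: returns the distinct items of `items` in first-seen
/// order. `S` is the hasher factory; changing it should change only speed.
fn distinct_with_hasher<T, S>(items: &[T], hasher: S) -> Vec<T>
where
    T: Eq + Hash + Clone,
    S: BuildHasher,
{
    // Pre-size the set so it does not rehash while being filled.
    let mut seen: HashSet<&T, S> =
        HashSet::with_capacity_and_hasher(items.len(), hasher);
    let mut out = Vec::new();
    for item in items {
        // `insert` returns true only the first time a value is seen.
        if seen.insert(item) {
            out.push(item.clone());
        }
    }
    out
}

/// Hypothetical convenience wrapper using the standard library's default
/// hasher (`RandomState`).
fn distinct<T: Eq + Hash + Clone>(items: &[T]) -> Vec<T> {
    distinct_with_hasher(items, RandomState::new())
}
```

Calling `distinct(&[3, 1, 3, 2, 1])` yields `[3, 1, 2]`. With the `hashbrown` crate as a dependency, the same loop should compile against `hashbrown::HashSet`, since it exposes the same `insert` method; any speedup from such a swap would need to be confirmed by benchmarking.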
@@ -34,8 +34,8 @@ use datafusion_expr::{
     ColumnarValue, Documentation, ScalarUDFImpl, Signature, Volatility,
 };
 use datafusion_macros::user_doc;
+use hashbrown::HashSet;
 use std::any::Any;
-use std::collections::HashSet;
 use std::fmt::{Display, Formatter};
 use std::sync::Arc;