51 Commits

Author SHA1 Message Date
Qi Zhu 02bad143cf perf: single-pass plan traversal in Predicate::new (#113)
* perf: single-pass plan traversal in Predicate::new

* address comments

* add join error test
2026-01-20 13:45:13 +08:00
xudong.w 4071577345 Expose a get_mv_candidates_for_table API for ViewMatcher (#112)
* Expose a get_mv_candidates_for_table API for ViewMatcher

* refine comments
2026-01-12 15:48:08 +00:00
xudong.w 547aa4d703 Upgrade DF52 (#111)
* Upgrade DF52

* update test

* use 52
2026-01-12 15:31:11 +00:00
xudong.w 0d1aefae25 Expose mv_plans for ViewMatcher (#22) (#109) 2025-12-19 14:22:46 +00:00
xudong.w 4539acf8fa prevent rewriting strict inequality to closed interval for non-discrete types (#21) (#108)
Co-authored-by: Matt Friede <7852262+Friede80@users.noreply.github.com>
2025-12-15 11:02:51 +08:00
xudong.w 9c915808e6 Fix mv dependencies involving unrelated files (#107) 2025-12-12 20:27:46 +08:00
xudong.w f1f7ad8e72 Upgrade DF51.0.0 (#104)
* Upgrade DF51.0.0

* update
2025-11-19 09:57:11 +00:00
Qi Zhu ec7e88ab4a Add benchmark for heavy operation for datafusion-materialized-views (#101) 2025-10-28 13:19:42 +08:00
github-actions[bot] e6205944dd chore: release v0.2.0 (#96)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-10-24 09:33:23 +00:00
xudong.w 6162aea1d5 Chore: remove useless lines in changelog (#97) 2025-10-24 09:24:25 +00:00
xudong.w f3d5eb1c7e Improve the doc (#95) 2025-10-24 09:10:37 +00:00
xudong.w 3910e12ce9 Support limit pushdown for OneOfExec (#94) 2025-10-11 15:03:11 +08:00
Matthew Cramerus 0c408a73ba Improved documentation on IVM algorithm (#90)
* inline mermaid diagrams

* additional comment

* mention duplicates, not cardinality

* more documentation
2025-10-10 09:41:36 +08:00
Matthew Cramerus 5915f4cddb Support static partition columns for MV (#89)
* Support static partition columns for MV

* runtime checks

* unit test for dynamic partition columns

* lint
2025-09-24 23:18:02 -05:00
xudong.w 169eb66628 upgrade to DF50 (#87) 2025-09-16 11:07:50 +00:00
Qi Zhu 540f29ee55 Fix empty unnest columns handling when pushdown_projection_inexact (#88) 2025-09-15 19:16:41 +08:00
xudong.w d8364fbf6d make cost fn accept candidates (#83) 2025-09-13 08:45:50 +00:00
Qi Zhu 3026895c6d Upgrade DF to 49.0.2 (#86)
* Upgrade DF to 49.0.2

* fix clippy and upgrade rust

* upgrade version for Cargo Deny
2025-09-03 13:55:04 +08:00
xudong.w 25e5ccc06a Upgrade to DF49 (#75)
* Upgrade to DF49

* fix licenses

* use 49

* resolve comments
2025-08-01 08:51:20 +08:00
xudong.w d4cc10f0fb Upgrade DataFusion 48.0.0 (#61)
* Upgrade to DF48

* fix bug

* update

* update more
2025-06-26 09:27:47 +08:00
Jared Combs b733a12409 Allow customization of list_all_files function. (#69)
We need to be able to support custom file listing logic (e.g.
versioning). Introduced a new trait to allow the injection of custom
logic.

Co-authored-by: Matthew Turner <matthew.m.turner@outlook.com>
2025-06-19 09:45:09 -04:00
Jared Combs 819843c40c Allow for 'special' partitions that are omitted in the staleness check. (#68)
It is sometimes useful to have additional information in the file path
that we can access programmatically but that is not a 'normal' partition. For
instance, it can be helpful to encode versioning or hash metadata into
the path. However, this can prevent the stale dependencies UDTF from
correctly matching the file metadata against its dependencies.

To work around this, I've added the notion of a 'special' partition.
These partitions are prefixed with an underscore and must be the
leaf-most nodes in the path. This makes it very easy to omit these from
the path in the same way that we already omit file names.
2025-06-19 09:42:23 -04:00
Matthew Cramerus 3dfda8f003 don't panic if eq class is not found (#60) 2025-05-23 14:19:52 -05:00
Matthew Cramerus e206db5051 Handle table scan filters that reference dropped columns (#59)
* use full table schema when analyzing predicates

* uncomment tests

* comment

* fmt
2025-05-23 11:50:03 -05:00
Matthew Cramerus 1ffaad516b exclude some materialized views from query rewriting (#57)
* exclude some materialized views from query rewriting

* newline at EOF

* fmt
2025-05-13 14:58:07 -05:00
xudong.w 54d810c6c2 Optimize performance bottleneck if projection is large (#56) 2025-05-05 12:25:56 -05:00
xudong.w b88a762b48 Upgrade df47 (#55) 2025-04-21 10:53:45 -05:00
dependabot[bot] 7c1861ed0d Update itertools requirement from 0.13 to 0.14 (#32)
Updates the requirements on [itertools](https://github.com/rust-itertools/itertools) to permit the latest version.
- [Changelog](https://github.com/rust-itertools/itertools/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-itertools/itertools/compare/v0.13.0...v0.14.0)

---
updated-dependencies:
- dependency-name: itertools
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Matthew Cramerus <8771538+suremarc@users.noreply.github.com>
2025-04-01 18:16:14 -05:00
dependabot[bot] 895d5f313f Update ordered-float requirement from 4.6.0 to 5.0.0 (#49)
Updates the requirements on [ordered-float](https://github.com/reem/rust-ordered-float) to permit the latest version.
- [Release notes](https://github.com/reem/rust-ordered-float/releases)
- [Commits](https://github.com/reem/rust-ordered-float/compare/v4.6.0...v5.0.0)

---
updated-dependencies:
- dependency-name: ordered-float
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-04-01 18:07:50 -05:00
xudong.w f64e8f06f2 Upgrade DF46 (#48) 2025-03-25 21:08:31 -05:00
Matthew Turner 7d635a3abc Update extension (#45) 2025-03-20 21:09:51 +00:00
Matthew Cramerus f77260cf9c make explain output stable (#44) 2025-03-12 17:42:44 +00:00
Matthew Cramerus 6632011f60 Add alternate analysis for MVs with no partition columns (#39)
* port alternate analysis for no partitions

* fix
2025-03-04 02:20:59 +00:00
Matthew Cramerus 600a4d8d9c upgrade to datafusion 45 (#38)
* upgrade to datafusion 45

* fix test
2025-03-03 19:00:42 -06:00
Matthew Cramerus 7616021b08 use nanosecond timestamps in file metadata (#28) 2025-01-08 11:20:26 -06:00
Matthew Cramerus 64eaabdcfc feat: Decorator trait (#26)
* new decorator api

* exercise decorator in tests
2025-01-08 10:57:57 -06:00
github-actions[bot] 5fdd03e825 chore: release v0.1.1 (#17)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-01-07 11:38:59 -06:00
Matthew Cramerus 1b6105a394 add constructor for RowMetadataRegistry from FileMetadata (#25) 2025-01-07 11:30:25 -06:00
Matthew Cramerus ed6114383e feat: view exploitation (#19)
* feat: view exploitation

* add PruneCandidates physical optimizer
2025-01-03 14:55:20 -06:00
Matthew Cramerus cf65d796ce feat: SPJ normal form (#18)
* feat: SPJ normal form

* formatting on doc comment

* fix typo

* ColumnEquivalenceClass not public
2025-01-03 12:38:33 -06:00
Matthew Cramerus 55301b5a65 add changelog manually (#14) 2025-01-03 16:30:24 +00:00
Matthew Cramerus 80b616864f don't use paths-ignore (#15) 2025-01-03 16:15:11 +00:00
Matthew Cramerus a739bc5301 some api improvements + remove manual changelog (#12) 2025-01-03 09:47:07 -06:00
Matthew Cramerus da8ed33b5b Integration test (#10)
* wip integration test

* working integration test

* fix license header + update readme
2025-01-03 01:06:20 -06:00
Matthew Cramerus e1ba25452a setup changelog (#9) 2024-12-31 18:42:08 +00:00
Matthew Cramerus ce6a3b6f3b Release plz (#7)
* add release-plz

* newline at eof
2024-12-31 12:22:37 -06:00
Matthew Cramerus 7c77433216 stale_files + rename to mv_dependencies (#6) 2024-12-31 12:08:26 -06:00
Matthew Cramerus 14a35187ac Incremental view maintenance (#3)
* port MV dependency code

* improved documentation

* fix spelling mistake

* fix typo

* don't forget license header

* readme

* explain what UDTF means

* fix typo in readme
2024-12-27 11:19:39 -06:00
Matthew Cramerus b7974e8e24 Add FileMetadata table and RowMetadataRegistry (#2)
* port file metadata + hive_partition + row metadata + add tests

* better docs

* more comments

* fix link

* fix another comment

* allow the same licenses as datafusion-orc

* add unicode-3.0 license to allowlist
2024-12-26 12:01:41 -06:00
suremarc 759aa19c51 Setup cargo + CI (#1)
cargo + GH actions + taplo

add a test

update Cargo.toml package metadata

forgot to add #[test]

add licenserc.toml

add deny.toml

only require license headers on .rs and .toml files

add a license to the Cargo.toml

newlines at EOF
2024-12-23 20:00:36 +00:00