Add Cargo.lock (#14483)

* Add `Cargo.lock`

* Style fix

* Update `Cargo.lock`

* Remove unused path

* Remove cli-specific ci job and dependabot config

* Remove home pin

* Make cli test work with backtrace feature

* More changes resulting from moving the cli crate in the workspace

* Exclude `depcheck` `Cargo.lock`

* Remove `--locked` from `depcheck` run

* Refer to guidance instead of updated guidance

* Remove `--locked` from benchmark script

* Only run with `--locked` in `linux-build-lib` job of `Rust` workflow

* Remove unrelated formatting changes

* Add a note about using `--locked`

* Update cargo.lock

* Update .github/workflows/rust.yml

* Update Cargo.lock

* fix yaml

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Matthijs Brobbel
2025-02-08 17:05:25 +01:00
committed by GitHub
parent 56a30acbde
commit 94d2baf318
16 changed files with 2644 additions and 294 deletions
-15
@@ -28,24 +28,9 @@ updates:
# arrow is bumped manually
- dependency-name: "arrow*"
update-types: ["version-update:semver-major"]
- package-ecosystem: cargo
directory: "datafusion-cli/"
schedule:
interval: daily
open-pull-requests-limit: 10
target-branch: main
labels: [auto-dependencies]
ignore:
# arrow is bumped manually
- dependency-name: "arrow*"
update-types: ["version-update:semver-major"]
# datafusion is bumped manually
- dependency-name: "datafusion*"
update-types: ["version-update:semver-major"]
- package-ecosystem: "github-actions"
directory: "/"
schedule:
interval: "daily"
open-pull-requests-limit: 10
labels: [auto-dependencies]
+3 -1
@@ -25,9 +25,11 @@ on:
push:
paths:
- "**/Cargo.toml"
- "**/Cargo.lock"
pull_request:
paths:
- "**/Cargo.toml"
- "**/Cargo.lock"
# manual trigger
# https://docs.github.com/en/actions/managing-workflow-runs/manually-running-a-workflow
workflow_dispatch:
@@ -50,4 +52,4 @@ jobs:
- name: Check dependencies
run: |
cd dev/depcheck
cargo run
cargo run
+45 -83
@@ -60,7 +60,11 @@ jobs:
with:
rust-version: stable
- name: Prepare cargo build
run: cargo check --profile ci --all-targets --features integration-tests
run: |
# Adding `--locked` here to assert that the `Cargo.lock` file is up to
# date with the manifest. When this fails, please make sure to commit
# the changes to `Cargo.lock` after building with the updated manifest.
cargo check --profile ci --workspace --all-targets --features integration-tests --locked
# cargo check common, functions and substrait with no default features
linux-cargo-check-no-default-features:
@@ -95,12 +99,6 @@ jobs:
- name: Check workspace with additional features
run: cargo check --profile ci --workspace --benches --features avro,json,integration-tests
- name: Check Cargo.lock for datafusion-cli
run: |
# If this test fails, try running `cargo update` in the `datafusion-cli` directory
# and check in the updated Cargo.lock file.
cargo check --profile ci --manifest-path datafusion-cli/Cargo.toml --locked
# cargo check datafusion to ensure that the datafusion crate can be built with only a
# subset of the function packages enabled.
linux-cargo-check-datafusion:
@@ -189,28 +187,6 @@ jobs:
- name: Verify Working Directory Clean
run: git diff --exit-code
linux-test-datafusion-cli:
name: cargo test datafusion-cli (amd64)
needs: linux-build-lib
runs-on: ubuntu-latest
container:
image: amd64/rust
steps:
- uses: actions/checkout@v4
with:
submodules: true
fetch-depth: 1
- name: Setup Rust toolchain
uses: ./.github/actions/setup-builder
with:
rust-version: stable
- name: Run tests (excluding doctests)
run: |
cd datafusion-cli
cargo test --profile ci --lib --tests --bins --all-features
- name: Verify Working Directory Clean
run: git diff --exit-code
linux-test-example:
name: cargo examples (amd64)
needs: linux-build-lib
@@ -252,10 +228,7 @@ jobs:
with:
rust-version: stable
- name: Run doctests
run: |
cargo test --profile ci --doc --features avro,json
cd datafusion-cli
cargo test --profile ci --doc --all-features
run: cargo test --profile ci --doc --features avro,json
- name: Verify Working Directory Clean
run: git diff --exit-code
@@ -364,45 +337,40 @@ jobs:
POSTGRES_HOST: postgres
POSTGRES_PORT: ${{ job.services.postgres.ports[5432] }}
# Temporarily commenting out the Windows flow, the reason is enormously slow running build
# Waiting for new Windows 2025 github runner
# Details: https://github.com/apache/datafusion/issues/13726
#
# windows:
# name: cargo test (win64)
# runs-on: windows-latest
# steps:
# - uses: actions/checkout@v4
# with:
# submodules: true
# - name: Setup Rust toolchain
# uses: ./.github/actions/setup-windows-builder
# - name: Run tests (excluding doctests)
# shell: bash
# run: |
# export PATH=$PATH:$HOME/d/protoc/bin
# cargo test --lib --tests --bins --features avro,json,backtrace
# cd datafusion-cli
# cargo test --lib --tests --bins --all-features
# Temporarily commenting out the Windows flow, the reason is enormously slow running build
# Waiting for new Windows 2025 github runner
# Details: https://github.com/apache/datafusion/issues/13726
#
# windows:
# name: cargo test (win64)
# runs-on: windows-latest
# steps:
# - uses: actions/checkout@v4
# with:
# submodules: true
# - name: Setup Rust toolchain
# uses: ./.github/actions/setup-windows-builder
# - name: Run tests (excluding doctests)
# shell: bash
# run: |
# export PATH=$PATH:$HOME/d/protoc/bin
# cargo test --lib --tests --bins --features avro,json,backtrace
# Commenting out intel mac build as so few users would ever use it
# Details: https://github.com/apache/datafusion/issues/13846
# macos:
# name: cargo test (macos)
# runs-on: macos-latest
# steps:
# - uses: actions/checkout@v4
# with:
# submodules: true
# fetch-depth: 1
# - name: Setup Rust toolchain
# uses: ./.github/actions/setup-macos-builder
# - name: Run tests (excluding doctests)
# shell: bash
# run: |
# cargo test run --profile ci --exclude datafusion-examples --exclude datafusion-benchmarks --workspace --lib --tests --bins --features avro,json,backtrace
# cd datafusion-cli
# cargo test run --profile ci --lib --tests --bins --all-features
# Commenting out intel mac build as so few users would ever use it
# Details: https://github.com/apache/datafusion/issues/13846
# macos:
# name: cargo test (macos)
# runs-on: macos-latest
# steps:
# - uses: actions/checkout@v4
# with:
# submodules: true
# fetch-depth: 1
# - name: Setup Rust toolchain
# uses: ./.github/actions/setup-macos-builder
# - name: Run tests (excluding doctests)
# shell: bash
# run: cargo test run --profile ci --exclude datafusion-examples --exclude datafusion-benchmarks --workspace --lib --tests --bins --features avro,json,backtrace
macos-aarch64:
name: cargo test (macos-aarch64)
@@ -416,10 +384,7 @@ jobs:
uses: ./.github/actions/setup-macos-aarch64-builder
- name: Run tests (excluding doctests)
shell: bash
run: |
cargo test --profile ci --lib --tests --bins --features avro,json,backtrace,integration-tests
cd datafusion-cli
cargo test --profile ci --lib --tests --bins --all-features
run: cargo test --profile ci --lib --tests --bins --features avro,json,backtrace,integration-tests
test-datafusion-pyarrow:
name: cargo test pyarrow (amd64)
@@ -615,19 +580,19 @@ jobs:
# (Min Supported Rust Version) than the one specified in the
# `rust-version` key of `Cargo.toml`.
#
# To reproduce:
# 1. Install the version of Rust that is failing. Example:
# To reproduce:
# 1. Install the version of Rust that is failing. Example:
# rustup install 1.80.1
# 2. Run the command that failed with that version. Example:
# cargo +1.80.1 check -p datafusion
#
#
# To resolve, either:
# 1. Change your code to use older Rust features,
# 1. Change your code to use older Rust features,
# 2. Revert dependency update
# 3. Update the MSRV version in `Cargo.toml`
#
# Please see the DataFusion Rust Version Compatibility Policy before
# updating Cargo.toml. You may have to update the code instead.
# updating Cargo.toml. You may have to update the code instead.
# https://github.com/apache/datafusion/blob/main/README.md#rust-version-compatibility-policy
cargo msrv --output-format json --log-target stdout verify
- name: Check datafusion-substrait
@@ -636,6 +601,3 @@ jobs:
- name: Check datafusion-proto
working-directory: datafusion/proto
run: cargo msrv --output-format json --log-target stdout verify
- name: Check datafusion-cli
working-directory: datafusion-cli
run: cargo msrv --output-format json --log-target stdout verify
-2
@@ -42,8 +42,6 @@ venv/*
# Rust
target
Cargo.lock
!datafusion-cli/Cargo.lock
rusty-tags.vi
.history
+2562 -99
File diff suppressed because it is too large
+2 -2
@@ -16,8 +16,7 @@
# under the License.
[workspace]
# datafusion-cli is excluded because of its Cargo.lock. See datafusion-cli/README.md.
exclude = ["datafusion-cli", "dev/depcheck"]
exclude = ["dev/depcheck"]
members = [
"datafusion/common",
"datafusion/common-runtime",
@@ -48,6 +47,7 @@ members = [
"datafusion/sqllogictest",
"datafusion/substrait",
"datafusion/wasmtest",
"datafusion-cli",
"datafusion-examples",
"datafusion-examples/examples/ffi/ffi_example_table_provider",
"datafusion-examples/examples/ffi/ffi_module_interface",
+7 -20
@@ -152,26 +152,13 @@ deprecate methods before removing them, according to the [deprecation guidelines
[deprecation guidelines]: https://datafusion.apache.org/library-user-guide/api-health.html
## Dependencies and a `Cargo.lock`
## Dependencies and `Cargo.lock`
`datafusion` is intended for use as a library and thus purposely does not have a
`Cargo.lock` file checked in. You can read more about the distinction in the
[Cargo book].
Following the [guidance] on committing `Cargo.lock` files, this project commits
its `Cargo.lock` file.
CI tests always run against the latest compatible versions of all dependencies
(the equivalent of doing `cargo update`), as suggested in the [Cargo CI guide]
and we rely on Dependabot for other upgrades. This strategy has two problems
that occasionally arise:
CI uses the committed `Cargo.lock` file, and dependencies are updated regularly
using [Dependabot] PRs.
1. CI failures when downstream libraries upgrade in some non compatible way
2. Local development builds that fail when DataFusion inadvertently relies on
a feature in a newer version of a dependency than declared in `Cargo.toml`
(e.g. a new method is added to a trait that we use).
However, we think the current strategy is the best tradeoff between maintenance
overhead and user experience and ensures DataFusion always works with the latest
compatible versions of all dependencies. If you encounter either of these
problems, please open an issue or PR.
[cargo book]: https://doc.rust-lang.org/cargo/guide/cargo-toml-vs-cargo-lock.html
[cargo ci guide]: https://doc.rust-lang.org/cargo/guide/continuous-integration.html#verifying-latest-dependencies
[guidance]: https://blog.rust-lang.org/2023/08/29/committing-lockfiles.html
[dependabot]: https://docs.github.com/en/code-security/dependabot/working-with-dependabot
-2
@@ -19,5 +19,3 @@
set -ex
cargo clippy --all-targets --workspace --features avro,pyarrow,integration-tests -- -D warnings
cd datafusion-cli
cargo clippy --all-targets --all-features -- -D warnings
-2
@@ -20,5 +20,3 @@
set -ex
export RUSTDOCFLAGS="-D warnings"
cargo doc --document-private-items --no-deps --workspace
cd datafusion-cli
cargo doc --document-private-items --no-deps
-2
@@ -19,5 +19,3 @@
set -ex
cargo fmt --all -- --check
cd datafusion-cli
cargo fmt --all -- --check
+20 -33
@@ -18,23 +18,22 @@
[package]
name = "datafusion-cli"
description = "Command Line Client for DataFusion query engine."
version = "45.0.0"
authors = ["Apache DataFusion <dev@datafusion.apache.org>"]
edition = "2021"
keywords = ["arrow", "datafusion", "query", "sql"]
license = "Apache-2.0"
homepage = "https://datafusion.apache.org"
repository = "https://github.com/apache/datafusion"
rust-version = "1.81.0"
readme = "README.md"
version = { workspace = true }
edition = { workspace = true }
homepage = { workspace = true }
repository = { workspace = true }
license = { workspace = true }
authors = { workspace = true }
rust-version = { workspace = true }
[dependencies]
arrow = { version = "54.1.0" }
async-trait = "0.1.0"
arrow = { workspace = true }
async-trait = { workspace = true }
aws-config = "1.5.16"
aws-credential-types = "1.2.0"
clap = { version = "4.5.28", features = ["derive", "cargo"] }
datafusion = { path = "../datafusion/core", version = "45.0.0", features = [
datafusion = { workspace = true, features = [
"avro",
"crypto_expressions",
"datetime_expressions",
@@ -46,31 +45,19 @@ datafusion = { path = "../datafusion/core", version = "45.0.0", features = [
"compression",
] }
dirs = "6.0.0"
env_logger = "0.11"
futures = "0.3"
# pin as home 0.5.11 has MSRV 1.81. Can remove this once we bump MSRV to 1.81
env_logger = { workspace = true }
futures = { workspace = true }
mimalloc = { version = "0.1", default-features = false }
object_store = { version = "0.11.0", features = ["aws", "gcp", "http"] }
parking_lot = { version = "0.12" }
parquet = { version = "54.1.0", default-features = false }
regex = "1.8"
object_store = { workspace = true, features = ["aws", "gcp", "http"] }
parking_lot = { workspace = true }
parquet = { workspace = true, default-features = false }
regex = { workspace = true }
rustyline = "15.0"
tokio = { version = "1.43", features = ["macros", "rt", "rt-multi-thread", "sync", "parking_lot", "signal"] }
url = "2.5.4"
tokio = { workspace = true, features = ["macros", "rt", "rt-multi-thread", "sync", "parking_lot", "signal"] }
url = { workspace = true }
[dev-dependencies]
assert_cmd = "2.0"
ctor = "0.2.9"
ctor = { workspace = true }
predicates = "3.0"
rstest = "0.24"
[profile.ci]
inherits = "dev"
incremental = false
# ci turns off debug info, etc for dependencies to allow for smaller binaries making caching more effective
[profile.ci.package."*"]
debug = false
debug-assertions = false
strip = "debuginfo"
incremental = false
rstest = { workspace = true }
+3 -7
@@ -18,18 +18,14 @@
FROM rust:bookworm AS builder
COPY . /usr/src/datafusion
COPY ./datafusion /usr/src/datafusion/datafusion
COPY ./datafusion-cli /usr/src/datafusion/datafusion-cli
WORKDIR /usr/src/datafusion/datafusion-cli
WORKDIR /usr/src/datafusion
RUN rustup component add rustfmt
RUN cargo build --release
RUN cargo build -p datafusion-cli --release
FROM debian:bookworm-slim
COPY --from=builder /usr/src/datafusion/datafusion-cli/target/release/datafusion-cli /usr/local/bin
COPY --from=builder /usr/src/datafusion/target/release/datafusion-cli /usr/local/bin
RUN mkdir /data
-16
@@ -30,19 +30,3 @@ DataFusion CLI (`datafusion-cli`) is a small command line utility that runs SQL
## Where can I find more information?
See the [`datafusion-cli` documentation](https://datafusion.apache.org/user-guide/cli/index.html) for further information.
## How do I make my IDE work with `datafusion-cli`?
"open" the `datafusion/datafusion-cli` project as its own top level
project in my IDE (rather than opening `datafusion`)
The reason `datafusion-cli` is not part of the main workspace in
[`datafusion Cargo.toml`] file is that `datafusion-cli` is a binary and has a
checked in `Cargo.lock` file to ensure reproducible builds.
However, the `datafusion` and sub crates are intended for use as libraries and
thus do not have a `Cargo.lock` file checked in, as described in the [main
README] file.
[`datafusion cargo.toml`]: https://github.com/apache/datafusion/blob/main/Cargo.toml
[main readme]: ../README.md
+1 -1
@@ -549,7 +549,7 @@ mod tests {
.await
.unwrap_err();
assert_eq!(err.to_string(), "Invalid or Unsupported Configuration: Invalid endpoint: http://endpoint33. HTTP is not allowed for S3 endpoints. To allow HTTP, set 'aws.allow_http' to true");
assert_eq!(err.to_string().lines().next().unwrap_or_default(), "Invalid or Unsupported Configuration: Invalid endpoint: http://endpoint33. HTTP is not allowed for S3 endpoints. To allow HTTP, set 'aws.allow_http' to true");
} else {
return plan_err!("LogicalPlan is not a CreateExternalTable");
}
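Note on the assertion change above: the updated test compares only the first line of the error message. Presumably this is because building with the `backtrace` feature can append further lines after the message; that reading is an assumption drawn from the commit note "Make cli test work with backtrace feature". A minimal sketch of the idiom follows. The `first_line` helper and the sample messages are hypothetical and not part of DataFusion.

```rust
/// Hypothetical helper (not part of DataFusion): returns the first line of a
/// message, or an empty string when the message has no lines.
fn first_line(message: &str) -> &str {
    message.lines().next().unwrap_or_default()
}

fn main() {
    // Made-up error text: a one-line message followed by extra diagnostic lines.
    let with_extra = "Invalid endpoint: http://endpoint33\n\nbacktrace: <frames elided>";
    assert_eq!(first_line(with_extra), "Invalid endpoint: http://endpoint33");

    // A message without extra lines is returned unchanged.
    assert_eq!(first_line("plain message"), "plain message");

    // An empty message yields the default (empty) string instead of panicking.
    assert_eq!(first_line(""), "");
}
```

Using `unwrap_or_default()` rather than `unwrap()` keeps the comparison from panicking on an empty message; the assertion then fails with a readable diff instead.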
+1
@@ -0,0 +1 @@
Cargo.lock
-9
@@ -141,15 +141,6 @@ git checkout apache/main
Manually update the datafusion version in the root `Cargo.toml` to `38.0.0`.
Run `cargo update` in the root directory and also in `datafusion-cli`:
```shell
cargo update
cd datafustion-cli
cargo update
cd ..
```
Run `cargo test` to re-generate some example files:
```shell