Fix: strict multi-shard health checks (#27) — backport to 1.x; bump 1.1.1b1

Summary

- Backports the 2.0 strict health policy to 1.x and ships as a pre-release.

Health changes (cache_utils.py)

- Index-aware: validate safetensors and PyTorch indexes; all referenced shards must exist, be >0B, and not be Git LFS
pointers.
- Pattern policy: shard patterns (model-XXXXX-of-YYYYY.*) without an index → unhealthy (parity with 2.0).
- Partial/tmp markers: any “.partial”, “.tmp”, or names containing “partial” anywhere under the snapshot → unhealthy.
- LFS scan: recursive detection of Git LFS pointer files (<200B with LFS header).
- Single-file fallback: a non-empty .safetensors, .bin, or .gguf file (with no pattern shards present) remains healthy.
- Ergonomics: is_model_healthy() accepts direct snapshot paths; check_lfs_corruption() scans recursively.
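
A minimal sketch of how the rules above could compose. It is not the code in cache_utils.py: the function and constant names are hypothetical. The index file names and the "weight_map" key are assumed to follow the usual Hugging Face layout, and the rule order is an illustration only.

```python
import json
import re
from pathlib import Path

# Hypothetical names; assumptions for illustration, not the project's API.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"
INDEX_FILES = ("model.safetensors.index.json", "pytorch_model.bin.index.json")
SHARD_PATTERN = re.compile(r".+-\d{5}-of-\d{5}\..+")
SINGLE_FILE_SUFFIXES = (".safetensors", ".bin", ".gguf")


def is_lfs_pointer(path: Path) -> bool:
    """True for a file under 200 bytes that starts with the Git LFS header."""
    if path.stat().st_size >= 200:
        return False
    return path.read_bytes().startswith(LFS_HEADER)


def has_partial_marker(path: Path) -> bool:
    """True for *.partial, *.tmp, or any name containing 'partial'."""
    name = path.name.lower()
    return name.endswith((".partial", ".tmp")) or "partial" in name


def index_is_complete(snapshot: Path, index: Path) -> bool:
    """Every shard referenced by the index must exist and be non-empty."""
    try:
        data = json.loads(index.read_text())
    except ValueError:  # covers malformed JSON and undecodable bytes
        return False
    weight_map = data.get("weight_map") if isinstance(data, dict) else None
    if not isinstance(weight_map, dict) or not weight_map:
        return False
    for shard_name in set(weight_map.values()):
        shard = snapshot / str(shard_name)
        if not shard.is_file() or shard.stat().st_size == 0:
            return False
    return True


def snapshot_is_healthy(snapshot: Path) -> bool:
    files = [p for p in snapshot.rglob("*") if p.is_file()]

    # Partial/tmp markers and LFS pointers anywhere under the snapshot.
    if any(has_partial_marker(p) or is_lfs_pointer(p) for p in files):
        return False

    # Index-aware path: all present indexes must be fully satisfied.
    indexes = [snapshot / n for n in INDEX_FILES if (snapshot / n).is_file()]
    if indexes:
        return all(index_is_complete(snapshot, i) for i in indexes)

    # Pattern policy: shard-style names without an index are unhealthy.
    if any(SHARD_PATTERN.fullmatch(p.name) for p in files):
        return False

    # Single-file fallback: at least one non-empty weight file.
    return any(
        p.suffix in SINGLE_FILE_SUFFIXES and p.stat().st_size > 0 for p in files
    )
```

Running the marker and LFS scans first means one stray temp file or pointer fails the snapshot, whichever of the later rules would otherwise apply.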

Tests

- Add tests/unit/test_health_multishard.py covering:
    - index complete → healthy; missing/empty shard → unhealthy; LFS pointer → unhealthy
    - pattern shards (no index) → unhealthy
    - partial marker → unhealthy
    - PyTorch index parity (complete → healthy)
    - single-file safetensors/gguf → healthy

Docs

- CHANGELOG.md: add 1.1.1-beta.1 with detailed rules; note GitHub tag vs PyPI mapping (1.1.1-beta.1 ↔ 1.1.1b1).
- README.md: tests badge 160/160; pre-release note for 1.1.1b1.
- TESTING.md: status 160/160; update test structure (add test_health_multishard.py; remove 2.0 note).

Version

- version = 1.1.1b1 (PEP 440 pre-release); VERSION = (1, 1, 1).
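
The tag-to-PyPI mapping (1.1.1-beta.1 ↔ 1.1.1b1) follows PEP 440 pre-release normalization. A rough sketch of that mapping, assuming a hypothetical helper named tag_to_pep440. It handles only a release segment plus an optional pre-release label. Epochs and post, dev and local segments are out of scope here. This helper is not part of the project.

```python
import re

# PEP 440 pre-release spellings and their normalized forms.
_PRE_LABELS = {
    "alpha": "a", "a": "a",
    "beta": "b", "b": "b",
    "rc": "rc", "c": "rc", "pre": "rc", "preview": "rc",
}

_TAG = re.compile(
    r"^v?(?P<release>\d+(?:\.\d+)*)"
    r"(?:[-_.]?(?P<label>alpha|beta|preview|pre|rc|a|b|c)[-_.]?(?P<num>\d+)?)?$",
    re.IGNORECASE,
)


def tag_to_pep440(tag: str) -> str:
    """Map a Git tag such as 'v1.1.1-beta.1' to its PEP 440 form."""
    match = _TAG.match(tag.strip())
    if match is None:
        raise ValueError(f"unsupported version tag: {tag!r}")
    version = match.group("release")
    label = match.group("label")
    if label:
        # A missing pre-release number is treated as 0 (e.g. '1.2a' -> '1.2a0').
        number = int(match.group("num") or 0)
        version += f"{_PRE_LABELS[label.lower()]}{number}"
    return version


print(tag_to_pep440("v1.1.1-beta.1"))  # 1.1.1b1
```

Because 1.1.1b1 is a pre-release, pip skips it by default unless --pre is passed or the exact version is requested.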

Behavioral impact

- Health reporting is stricter but touches no functional code paths: incomplete multi-shard downloads now correctly
report unhealthy. No changes to pull/run/server behavior.

Validation

- Python 3.9 local: 160 passed, 36 deselected; warnings eliminated on 3.9/3.10 under project defaults.
- New multishard tests pass; manual spot-checks show expected unhealthy→healthy transitions as downloads complete.

Release

- GitHub: tag v1.1.1-beta.1 (Pre-release).
- PyPI: upload 1.1.1b1 (install via pip install --pre mlx-knife).
Local Test
2025-09-01 01:26:27 +02:00
parent f511dd9c74
commit b9db12ae89
6 changed files with 269 additions and 37 deletions
TESTING.md: +5 -3
@@ -2,7 +2,7 @@
## Current Status
-**150/150 tests passing** (August 2025) - **STABLE RELEASE** 🚀
+**160/160 tests passing** (September 2025) - **STABLE RELEASE + Pre-release** 🚀
**Apple Silicon verified** (M1/M2/M3)
**Python 3.9-3.13 compatible**
**Production ready** - comprehensive testing with real model execution
@@ -55,12 +55,14 @@ tests/
│ ├── test_end_token_issue.py # Issue #20: End-token filtering (@server)
│ ├── test_issue_14.py # Issue #14: Chat self-conversation (@server)
│ └── test_issue_15_16.py # Issues #15/#16: Dynamic token limits (@server)
-└── unit/ # Module-level unit tests (72 tests)
+└── unit/ # Module-level unit tests (82 tests)
     ├── test_cache_utils.py # Cache management & Issue #21/#23 tests
     ├── test_cli.py # CLI argument parsing
+    ├── test_health_multishard.py # Strict multi-shard/index health (Issue #27)
     └── test_mlx_runner_memory.py # Memory management tests
```
## 3-Category Test Strategy (MLX Knife 1.1.0+)
MLX Knife uses a **3-category test strategy** to balance test isolation, performance, and user cache protection:
@@ -552,4 +554,4 @@ def test_new_feature(mlx_server, model_name: str, size_str: str, ram_needed: int
1. **Mark with `@pytest.mark.server`** - excludes from default `pytest`
2. **Use `mlx_server` fixture** - automatic server lifecycle management
3. **Test RAM requirements** - use `get_safe_models_for_system()` helper
-4. **Document in TESTING.md** - add to this guide
+4. **Document in TESTING.md** - add to this guide