Fix: strict multi-shard health checks (#27) — backport to 1.x; bump 1.1.1b1

Summary
- Backports the 2.0 strict health policy to 1.x and ships as a pre-release.

Health changes (cache_utils.py)
- Index-aware: validate safetensors and PyTorch indexes; all referenced shards must exist, be >0B, and not be Git LFS pointers.
- Pattern policy: shard patterns (model-XXXXX-of-YYYYY.*) without an index → unhealthy (parity with 2.0).
- Partial/tmp markers: any “.partial”, “.tmp”, or names containing “partial” anywhere under the snapshot → unhealthy.
- LFS scan: recursive detection of Git LFS pointer files (<200B with LFS header).
- Single-file fallback: non-empty .safetensors/.bin/*.gguf (no pattern shards) remain healthy.
- Ergonomics: is_model_healthy() accepts direct snapshot paths; check_lfs_corruption() scans recursively.

Tests
- Add tests/unit/test_health_multishard.py covering:
  - index complete → healthy; missing/empty shard → unhealthy; LFS pointer → unhealthy
  - pattern shards (no index) → unhealthy
  - partial marker → unhealthy
  - PyTorch index parity (complete → healthy)
  - single-file safetensors/gguf → healthy

Docs
- CHANGELOG.md: add 1.1.1-beta.1 with detailed rules; note GitHub tag vs PyPI mapping (1.1.1-beta.1 ↔ 1.1.1b1).
- README.md: tests badge 160/160; pre-release note for 1.1.1b1.
- TESTING.md: status 160/160; update test structure (add test_health_multishard.py; remove 2.0 note).

Version
- version = 1.1.1b1 (PEP 440 pre-release); VERSION = (1, 1, 1).

Behavioral impact
- Health reporting is stricter (cannot regress functionality): incomplete multi-shard downloads correctly report unhealthy. No changes to pull/run/server behavior.

Validation
- Python 3.9 local: 160 passed, 36 deselected; warnings eliminated on 3.9/3.10 under project defaults.
- New multishard tests pass; manual spot-checks show expected unhealthy→healthy transitions as downloads complete.

Release
- GitHub: tag v1.1.1-beta.1 (Pre-release).
- PyPI: upload 1.1.1b1 (install via pip install --pre mlx-knife).
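
The health rules listed in this message can be sketched roughly as follows. This is an illustrative reading of the rules, not the code in cache_utils.py: the function names (`sketch_is_healthy`, `is_lfs_pointer`), the shard regex, and the order of the checks are assumptions, and the index filenames are the usual Hugging Face conventions rather than anything this commit states.

```python
import json
import re
from pathlib import Path

# Assumptions for illustration: conventional Hugging Face index filenames
# and the standard Git LFS pointer header.
INDEX_NAMES = ("model.safetensors.index.json", "pytorch_model.bin.index.json")
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"
# Hypothetical regex for the "model-XXXXX-of-YYYYY.*" shard pattern.
SHARD_RE = re.compile(r".*-\d{5}-of-\d{5}\..+")
WEIGHT_SUFFIXES = (".safetensors", ".bin", ".gguf")


def is_lfs_pointer(path: Path) -> bool:
    """True for a small file (<200 bytes) that starts with the LFS header."""
    try:
        if path.stat().st_size >= 200:
            return False
        return path.read_bytes().startswith(LFS_HEADER)
    except OSError:
        return False


def sketch_is_healthy(snapshot: Path) -> bool:
    """Apply the rules from the commit message to one snapshot directory."""
    files = [p for p in snapshot.rglob("*") if p.is_file()]

    # Partial/tmp markers anywhere under the snapshot -> unhealthy.
    for p in files:
        name = p.name.lower()
        if name.endswith((".partial", ".tmp")) or "partial" in name:
            return False

    # Recursive LFS pointer scan -> unhealthy.
    if any(is_lfs_pointer(p) for p in files):
        return False

    # Index-aware: every shard referenced by the index must exist and be >0B.
    for index_name in INDEX_NAMES:
        index_path = snapshot / index_name
        if index_path.is_file():
            try:
                weight_map = json.loads(index_path.read_text())["weight_map"]
                shards = set(weight_map.values())
            except (OSError, ValueError, KeyError, TypeError, AttributeError):
                return False
            if not shards:
                return False
            return all(
                (snapshot / s).is_file() and (snapshot / s).stat().st_size > 0
                for s in shards
            )

    # Pattern policy: shard-style names without an index -> unhealthy.
    if any(SHARD_RE.fullmatch(p.name) for p in files):
        return False

    # Single-file fallback: at least one non-empty weight file.
    return any(
        p.suffix in WEIGHT_SUFFIXES and p.stat().st_size > 0 for p in files
    )
```

Calling `sketch_is_healthy(Path(snapshot_dir))` on a partially downloaded multi-shard snapshot returns `False` until every shard named in the index is present and non-empty, which mirrors the unhealthy→healthy transition described under Validation.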
2026-06-30 20:48:03 -04:00 · 2025-09-01 01:26:27 +02:00
parent f511dd9c74
commit b9db12ae89
6 changed files with 269 additions and 37 deletions
@@ -2,7 +2,7 @@

 ## Current Status

-✅ **150/150 tests passing** (August 2025) - **STABLE RELEASE** 🚀  
+✅ **160/160 tests passing** (September 2025) - **STABLE RELEASE + Pre-release** 🚀  
 ✅ **Apple Silicon verified** (M1/M2/M3)  
 ✅ **Python 3.9-3.13 compatible**  
 ✅ **Production ready** - comprehensive testing with real model execution
@@ -55,12 +55,14 @@ tests/
 │   ├── test_end_token_issue.py             # Issue #20: End-token filtering (@server)
 │   ├── test_issue_14.py                    # Issue #14: Chat self-conversation (@server)
 │   └── test_issue_15_16.py                 # Issues #15/#16: Dynamic token limits (@server)
-└── unit/                              # Module-level unit tests (72 tests)
+└── unit/                              # Module-level unit tests (82 tests)
    ├── test_cache_utils.py                 # Cache management & Issue #21/#23 tests
    ├── test_cli.py                         # CLI argument parsing
+    ├── test_health_multishard.py           # Strict multi-shard/index health (Issue #27)
    └── test_mlx_runner_memory.py           # Memory management tests
 ```

+
 ## 3-Category Test Strategy (MLX Knife 1.1.0+)

 MLX Knife uses a **3-category test strategy** to balance test isolation, performance, and user cache protection:
@@ -552,4 +554,4 @@ def test_new_feature(mlx_server, model_name: str, size_str: str, ram_needed: int
 1. **Mark with `@pytest.mark.server`** - excludes from default `pytest`
 2. **Use `mlx_server` fixture** - automatic server lifecycle management
 3. **Test RAM requirements** - use `get_safe_models_for_system()` helper
-4. **Document in TESTING.md** - add to this guide
+4. **Document in TESTING.md** - add to this guide