diff --git a/CHANGELOG.md b/CHANGELOG.md index 44ac7f2..b0ae4da 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,23 @@ # Changelog +## [1.1.1-beta.1] - 2025-09-01 + +### Fix: Strict Health Completeness for Multi‑Shard Models (Issue #27) +- Problem: Health reported some multi‑part downloads as OK with missing/empty shards (false positives). +- Solution: Backported 2.0 health rules to 1.x with index‑aware validation, pattern detection, and robust corruption checks. +- Details: + - Config validation: `config.json` must exist and be a non‑empty JSON object. + - Index‑aware: If `model.safetensors.index.json` or `pytorch_model.bin.index.json` exists, every referenced shard must exist, be non‑empty, and not be a Git LFS pointer file. + - Pattern fallback policy: If pattern shards like `model-XXXXX-of-YYYYY.*` are present but no index file exists, the model is considered unhealthy (parity with 2.0 policy). + - Partial/tmp markers: Any `*.partial`, `*.tmp`, or names containing `partial` anywhere under the snapshot mark the model as unhealthy. + - LFS detection: Recursive scan flags suspiciously small files (<200B) that contain the Git LFS pointer header. + - Single‑file weights: Non‑empty `*.safetensors`, `*.bin`, or `*.gguf` without pattern shards remain supported and healthy if not LFS pointers. +- Impact: “Healthy” now reliably means “complete and usable” for automation and CLI workflows. +- Tests: Added `tests/unit/test_health_multishard.py` covering complete/missing/empty shards, pointer detection, pattern‑without‑index policy, partial markers, and PyTorch index parity. + +Note: GitHub tag/version uses `1.1.1-beta.1`. PyPI release uses PEP 440 `1.1.1b1`. + + ## [1.1.0] - 2025-08-26 - **STABLE RELEASE** 🚀 ### Production Readiness & Enhanced Testing 🧪 @@ -300,4 +318,4 @@ - Comprehensive test suite (86/86 passing) ## Known Issues -- See GitHub Issues for tracking \ No newline at end of file +- See GitHub Issues for tracking diff --git a/README.md b/README.md index bf83bac..ec89e8e 100644 --- a/README.md +++ b/README.md @@ -9,6 +9,7 @@ A lightweight, ollama-like CLI for managing and running MLX models on Apple Sili > **Note**: MLX Knife is designed as a command-line interface tool only. While some internal functions are accessible via Python imports, only CLI usage is officially supported. **Current Version**: 1.1.0 (August 2025) - **STABLE RELEASE** 🚀 +- Pre-release: 1.1.1b1 — strict multi-shard health checks (Issue #27). Install with `pip install --pre mlx-knife`. - **Production Ready**: First stable release since 1.0.4 with comprehensive testing - **Enhanced Test System**: 150/150 tests passing with real model lifecycle integration tests - **Python 3.9-3.13**: Full compatibility verified across all Python versions @@ -21,7 +22,7 @@ A lightweight, ollama-like CLI for managing and running MLX models on Apple Sili [![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/) [![Apple Silicon](https://img.shields.io/badge/Apple%20Silicon-M1%2FM2%2FM3-green.svg)](https://support.apple.com/en-us/HT211814) [![MLX](https://img.shields.io/badge/MLX-Latest-orange.svg)](https://github.com/ml-explore/mlx) -[![Tests](https://img.shields.io/badge/tests-150%2F150%20passing-brightgreen.svg)](#testing) +[![Tests](https://img.shields.io/badge/tests-160%2F160%20passing-brightgreen.svg)](#testing) ## Features diff --git a/TESTING.md b/TESTING.md index ea48d0a..b59c9b2 100644 --- a/TESTING.md +++ b/TESTING.md @@ -2,7 +2,7 @@ ## Current Status -✅ **150/150 tests passing** (August 2025) - **STABLE RELEASE** 🚀 +✅ **160/160 tests passing** (September 2025) - **STABLE RELEASE + Pre-release** 🚀 ✅ **Apple Silicon verified** (M1/M2/M3) ✅ **Python 3.9-3.13 compatible** ✅ **Production ready** - comprehensive testing with real model execution @@ -55,12 +55,14 @@ tests/ │ ├── test_end_token_issue.py # Issue #20: End-token filtering (@server) │ ├── test_issue_14.py # Issue #14: Chat self-conversation (@server) │ └── test_issue_15_16.py # Issues #15/#16: Dynamic token limits (@server) -└── unit/ # Module-level unit tests (72 tests) +└── unit/ # Module-level unit tests (82 tests) ├── test_cache_utils.py # Cache management & Issue #21/#23 tests ├── test_cli.py # CLI argument parsing + ├── test_health_multishard.py # Strict multi-shard/index health (Issue #27) └── test_mlx_runner_memory.py # Memory management tests ``` + ## 3-Category Test Strategy (MLX Knife 1.1.0+) MLX Knife uses a **3-category test strategy** to balance test isolation, performance, and user cache protection: @@ -552,4 +554,4 @@ def test_new_feature(mlx_server, model_name: str, size_str: str, ram_needed: int 1. **Mark with `@pytest.mark.server`** - excludes from default `pytest` 2. **Use `mlx_server` fixture** - automatic server lifecycle management 3. **Test RAM requirements** - use `get_safe_models_for_system()` helper -4. **Document in TESTING.md** - add to this guide \ No newline at end of file +4. **Document in TESTING.md** - add to this guide diff --git a/mlx_knife/__init__.py b/mlx_knife/__init__.py index c753355..f6ad890 100644 --- a/mlx_knife/__init__.py +++ b/mlx_knife/__init__.py @@ -9,7 +9,7 @@ import warnings warnings.filterwarnings('ignore', message='urllib3 v2 only supports OpenSSL 1.1.1+') -__version__ = "1.1.0" +__version__ = "1.1.1b1" __author__ = "The BROKE team" __email__ = "broke@gmx.eu" __license__ = "MIT" @@ -17,7 +17,7 @@ __description__ = "ollama-style CLI for MLX models on Apple Silicon" __url__ = "https://github.com/mzau/mlx-knife" # Version tuple for programmatic access (major, minor, patch) -VERSION = (1, 1, 0) +VERSION = (1, 1, 1) # Core functionality imports from .cache_utils import ( diff --git a/mlx_knife/cache_utils.py b/mlx_knife/cache_utils.py index 870dfd3..7fd4d8d 100644 --- a/mlx_knife/cache_utils.py +++ b/mlx_knife/cache_utils.py @@ -249,55 +249,138 @@ def get_model_hash(model_path): return latest.name[:8] def is_model_healthy(model_spec): - model_path, _, _ = resolve_single_model(model_spec) + """Strict health check for 1.x (backport of #27 rules). + + Rules: + - config.json must exist and be valid non-empty JSON object. + - If a safetensors or PyTorch index exists, all referenced shards must exist, be non-empty, + and not be Git LFS pointer files. + - Without an index: if multi-shard pattern files exist (model-XXXXX-of-YYYYY.*), require index (unhealthy without index). + Single-file weights (*.safetensors/*.bin/*.gguf) are allowed if non-empty and not LFS pointers. + - Any '.partial'/'partial' or '.tmp' artifacts anywhere => unhealthy. + - Recursive LFS pointer scan for suspiciously small files (<200B). + """ + + # Resolve model path: accept direct directory paths or model specs + candidate = Path(str(model_spec)) + if candidate.exists() and candidate.is_dir(): + model_path = candidate + else: + model_path, _, _ = resolve_single_model(model_spec) if not model_path: return False + + # 1) config.json must be valid, non-empty dict config_path = model_path / "config.json" if not config_path.exists(): return False - # Check if config.json is valid JSON and not empty try: with open(config_path) as f: config_data = json.load(f) - # Basic sanity check: should be a non-empty dict - if not isinstance(config_data, dict) or len(config_data) == 0: + if not isinstance(config_data, dict) or not config_data: return False except (OSError, json.JSONDecodeError): return False - weight_files = list(model_path.glob("*.safetensors")) + list(model_path.glob("*.bin")) + list(model_path.glob("*.gguf")) - if not weight_files: - weight_files = list(model_path.glob("**/*.safetensors")) + list(model_path.glob("**/*.bin")) + list(model_path.glob("**/*.gguf")) - if not weight_files: - index_file = model_path / "model.safetensors.index.json" - if index_file.exists(): + + # 2) Fail fast on partial/tmp markers anywhere in the snapshot + for p in model_path.rglob("*"): + name = p.name.lower() + if ".partial" in name or name.endswith(".partial") or name.endswith(".tmp") or "partial" in name: + return False + + # Helper: detect Git LFS pointer file + def _is_lfs_pointer(fp: Path) -> bool: + try: + if fp.stat().st_size >= 200: + return False + with open(fp, "rb") as f: + head = f.read(200) + return b"version https://git-lfs.github.com/spec/v1" in head + except Exception: + return False + + # Helper: verify referenced shards + def _verify_shards(files: list[Path]) -> bool: + if not files: + return False + for f in files: try: - with open(index_file) as f: - index = json.load(f) - if 'weight_map' in index: - referenced_files = set(index['weight_map'].values()) - existing_files = [f for f in referenced_files if (model_path / f).exists()] - if len(existing_files) > 0: - return True - except: - pass - if not weight_files: + if (not f.exists()) or f.stat().st_size == 0: + return False + if _is_lfs_pointer(f): + return False + except Exception: + return False + return True + + # 3) Index-aware checks (safetensors or PyTorch) + st_index = model_path / "model.safetensors.index.json" + pt_index = model_path / "pytorch_model.bin.index.json" + if st_index.exists() or pt_index.exists(): + index_files = [p for p in [st_index, pt_index] if p.exists()] + for idx in index_files: + try: + with open(idx) as f: + idx_data = json.load(f) + weight_map = idx_data.get("weight_map") + if not isinstance(weight_map, dict) or not weight_map: + return False + referenced = sorted(set(weight_map.values())) + shard_paths = [model_path / r for r in referenced] + if not _verify_shards(shard_paths): + return False + except (OSError, json.JSONDecodeError): + return False + # Also ensure no recursive LFS pointers elsewhere + ok, _ = check_lfs_corruption(model_path) + return ok + + # 4) No index present — detect multi-shard pattern + # If pattern shards exist, require index (unhealthy without index by policy parity with 2.0) + import re + shard_re = re.compile(r"model-([0-9]{5})-of-([0-9]{5})\.(safetensors|bin)") + pattern_files = [] + for f in model_path.glob("*"): + if f.is_file(): + m = shard_re.match(f.name) + if m: + pattern_files.append((f, int(m.group(1)), int(m.group(2)))) + if pattern_files: + # Even if complete by pattern, absence of index => unhealthy return False - lfs_ok, _ = check_lfs_corruption(model_path) - if not lfs_ok: + + # 5) Single-file weights fallback (includes GGUF) + weight_files = list(model_path.rglob("*.safetensors")) + list(model_path.rglob("*.bin")) + list(model_path.rglob("*.gguf")) + # Exclude known pattern shards from consideration (handled above) + filtered_weights = [] + for f in weight_files: + name = f.name + if shard_re.match(name): + continue + filtered_weights.append(f) + if not filtered_weights: return False - return True + for wf in filtered_weights: + if wf.stat().st_size == 0 or _is_lfs_pointer(wf): + return False + + # Final recursive LFS scan + ok, _ = check_lfs_corruption(model_path) + return ok def check_lfs_corruption(model_path): + """Recursively scan for Git LFS pointer files (suspiciously small files).""" corrupted_files = [] - for file_path in model_path.glob("*"): - if file_path.is_file() and file_path.stat().st_size < 200: - try: + for file_path in model_path.rglob("*"): + try: + if file_path.is_file() and file_path.stat().st_size < 200: with open(file_path, 'rb') as f: - header = f.read(100) + header = f.read(200) if b'version https://git-lfs.github.com/spec/v1' in header: - corrupted_files.append(file_path.name) - except: - pass + corrupted_files.append(str(file_path.relative_to(model_path))) + except Exception: + # Ignore unreadable files in corruption scan, keep conservative + continue if corrupted_files: return False, f"LFS pointers instead of files: {', '.join(corrupted_files)}" return True, "No LFS corruption detected" diff --git a/tests/unit/test_health_multishard.py b/tests/unit/test_health_multishard.py new file mode 100644 index 0000000..1fe46e8 --- /dev/null +++ b/tests/unit/test_health_multishard.py @@ -0,0 +1,128 @@ +""" +Strict health checks for multi-shard/index models (Issue #27 backport). +""" +import json +from pathlib import Path + +from mlx_knife.cache_utils import is_model_healthy + + +def _write_json(p: Path, data: dict) -> None: + p.write_text(json.dumps(data)) + + +def _mk_snapshot(tmp_dir: Path, name: str = "models--org--model", snap: str = "main") -> Path: + d = tmp_dir / "hub" / name / "snapshots" / snap + d.mkdir(parents=True, exist_ok=True) + return d + + +def test_index_complete_healthy(temp_cache_dir: Path): + d = _mk_snapshot(temp_cache_dir) + # Valid config + _write_json(d / "config.json", {"model_type": "test"}) + # Two shards + index + (d / "model-00001-of-00002.safetensors").write_bytes(b"a" * 1024) + (d / "model-00002-of-00002.safetensors").write_bytes(b"b" * 1024) + _write_json( + d / "model.safetensors.index.json", + {"weight_map": {"w1": "model-00001-of-00002.safetensors", "w2": "model-00002-of-00002.safetensors"}}, + ) + assert is_model_healthy(str(d)) is True + + +def test_index_missing_shard_unhealthy(temp_cache_dir: Path): + d = _mk_snapshot(temp_cache_dir) + _write_json(d / "config.json", {"model_type": "test"}) + # Only one shard present + (d / "model-00001-of-00002.safetensors").write_bytes(b"a" * 1024) + _write_json( + d / "model.safetensors.index.json", + {"weight_map": {"w1": "model-00001-of-00002.safetensors", "w2": "model-00002-of-00002.safetensors"}}, + ) + assert is_model_healthy(str(d)) is False + + +def test_index_empty_shard_unhealthy(temp_cache_dir: Path): + d = _mk_snapshot(temp_cache_dir) + _write_json(d / "config.json", {"model_type": "test"}) + (d / "model-00001-of-00002.safetensors").write_bytes(b"") # empty + (d / "model-00002-of-00002.safetensors").write_bytes(b"b" * 1024) + _write_json( + d / "model.safetensors.index.json", + {"weight_map": {"w1": "model-00001-of-00002.safetensors", "w2": "model-00002-of-00002.safetensors"}}, + ) + assert is_model_healthy(str(d)) is False + + +def test_index_lfs_pointer_unhealthy(temp_cache_dir: Path): + d = _mk_snapshot(temp_cache_dir) + _write_json(d / "config.json", {"model_type": "test"}) + # One shard becomes an LFS pointer (small text with signature) + (d / "model-00001-of-00002.safetensors").write_text( + "version https://git-lfs.github.com/spec/v1\n" + "oid sha256:deadbeef\n" + "size 100\n" + ) + (d / "model-00002-of-00002.safetensors").write_bytes(b"b" * 1024) + _write_json( + d / "model.safetensors.index.json", + {"weight_map": {"w1": "model-00001-of-00002.safetensors", "w2": "model-00002-of-00002.safetensors"}}, + ) + assert is_model_healthy(str(d)) is False + + +def test_pattern_complete_without_index_unhealthy(temp_cache_dir: Path): + d = _mk_snapshot(temp_cache_dir) + _write_json(d / "config.json", {"model_type": "test"}) + # Two shards with pattern but no index + (d / "model-00001-of-00002.safetensors").write_bytes(b"a" * 1024) + (d / "model-00002-of-00002.safetensors").write_bytes(b"b" * 1024) + assert is_model_healthy(str(d)) is False + + +def test_pattern_incomplete_unhealthy(temp_cache_dir: Path): + d = _mk_snapshot(temp_cache_dir) + _write_json(d / "config.json", {"model_type": "test"}) + # Only one pattern shard present, no index + (d / "model-00001-of-00002.safetensors").write_bytes(b"a" * 1024) + assert is_model_healthy(str(d)) is False + + +def test_partial_marker_unhealthy(temp_cache_dir: Path): + d = _mk_snapshot(temp_cache_dir) + _write_json(d / "config.json", {"model_type": "test"}) + (d / "model.safetensors.partial").write_bytes(b"x") + assert is_model_healthy(str(d)) is False + + +def test_single_file_safetensors_ok(temp_cache_dir: Path): + d = _mk_snapshot(temp_cache_dir) + _write_json(d / "config.json", {"model_type": "test"}) + (d / "model.safetensors").write_bytes(b"weights" * 256) + assert is_model_healthy(str(d)) is True + + +def test_single_file_gguf_ok(temp_cache_dir: Path): + d = _mk_snapshot(temp_cache_dir) + _write_json(d / "config.json", {"model_type": "test"}) + (d / "model.q4_0.gguf").write_bytes(b"gguf-weights" * 256) + assert is_model_healthy(str(d)) is True + + +def test_pytorch_index_complete_ok(temp_cache_dir: Path): + d = _mk_snapshot(temp_cache_dir) + _write_json(d / "config.json", {"model_type": "test"}) + (d / "pytorch_model-00001-of-00002.bin").write_bytes(b"a" * 1024) + (d / "pytorch_model-00002-of-00002.bin").write_bytes(b"b" * 1024) + _write_json( + d / "pytorch_model.bin.index.json", + { + "weight_map": { + "w1": "pytorch_model-00001-of-00002.bin", + "w2": "pytorch_model-00002-of-00002.bin", + } + }, + ) + assert is_model_healthy(str(d)) is True +