diff --git a/CHANGELOG.md b/CHANGELOG.md
index 76174ea..58228eb 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,61 @@
# Changelog
+## 2.0.0-beta.4 — 2025-10-18
+
+**Health Check Enhancement**: Separate integrity and runtime compatibility validation (Issue #36).
+
+### Changed
+- **JSON API 0.1.5 specification**:
+ - Added `runtime_compatible: boolean` field to `modelObject` (always present)
+ - Added `reason: string | null` field to `modelObject` (describes first problem found)
+ - `list`/`show` JSON output performs both integrity and runtime compatibility checks
+ - Gate logic: Runtime check requires integrity check first; `reason` shows first problem (integrity > runtime priority)
+- **Health check concepts documented**:
+ - Integrity Check (`health` field): File-level validation (required files, no LFS pointers, valid JSON)
+ - Runtime Compatibility Check (`runtime_compatible` field): MLX framework + architecture validation with mlx-lm
+ - Framework detection: GGUF/PyTorch models marked as runtime-incompatible
+ - Architecture detection: Unsupported model types (e.g., `qwen3_next` with mlx-lm < 0.28.0) detected
+ - Respects `MODEL_REMAPPING` for aliased architectures (e.g., `mistral` → `llama`)
+
+### Implementation Status
+- ✅ **Phase 1 Complete**: JSON API Specification 0.1.5
+ - `docs/json-api-schema.json` updated with new fields
+ - `docs/json-api-specification.md` extended with health check concepts and examples
+- ✅ **Phase 2 Complete**: JSON Implementation
+ - `mlxk2/spec.py` bumped to 0.1.5
+ - `mlxk2/operations/health.py`: `check_runtime_compatibility()` with gate logic
+ - `mlxk2/operations/common.py`: `build_model_object()` always computes `runtime_compatible` + `reason`
+ - mlx-lm API compatibility: Supports both 0.27.x (`mlx_lm.utils._get_classes`) and 0.28.x APIs
+ - Log suppression: mlx-lm ERROR logs redirected to `reason` field only
+- ✅ **Phase 3 Complete**: Human Output Specification
+ - Compact mode: `healthy` / `healthy*` / `unhealthy` (single column)
+ - Verbose mode: "Integrity" | "Runtime" | "Reason" (split columns)
+ - ASCII-only output (no UTF-8 symbols for parsing compatibility)
+ - README.md fully documented with examples and design philosophy
+ - JSON examples verified for consistency with schema and code
+- ✅ **Phase 4 Complete**: Human Output Implementation in `mlxk2/output/human.py`
+
+### Dependencies
+- **mlx-lm requirement updated**: `>=0.27.0` → `>=0.28.3`
+ - Now uses official mlx-lm 0.28.3 release with Python 3.9 compatibility fixes for `qwen3_next`
+ - Adds support for newer architectures (Klear, qwen3_next, etc.)
+ - Git pin removed in favor of stable PyPI release
+
+### Validation
+- ✅ All 256 tests pass (9 skipped)
+- ✅ Runtime compatibility correctly detects:
+ - GGUF/PyTorch models → `runtime_compatible: false` (framework mismatch)
+ - Supported MLX models → `runtime_compatible: true`
+ - Unsupported architectures → `runtime_compatible: false` with descriptive `reason`
+ - Klear-46B verified working with mlx-lm 0.28.2
+
+### Known Issues
+- None
+
+### Notes
+- Human output columns controlled by CLI flags (documentation in README.md, separate from JSON spec)
+- This addresses the root cause discovered in Issue #36: GGUF models show "healthy" but are not executable with mlx-lm
+
## 2.0.0-beta.3 — 2025-09-18
**Feature Complete**: Full 1.1.1 parity achieved with Clone implementation (ADR-007 Phase 1) and APFS filesystem detection fixes.
diff --git a/README.md b/README.md
index 7e903ba..4816f83 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-#
MLX-Knife 2.0.0-beta.3
+#
MLX-Knife 2.0.0-beta.4
@@ -10,7 +10,7 @@
**Stable Version: 1.1.1**
-[](https://github.com/mzau/mlx-knife/releases)
+[](https://github.com/mzau/mlx-knife/releases)
[](https://opensource.org/licenses/MIT)
[](https://www.python.org/downloads/)
[](https://support.apple.com/en-us/HT211814)
@@ -24,7 +24,7 @@
- **Model Information**: Detailed model metadata including quantization info
- **Download Models**: Pull models from HuggingFace with progress tracking
- **Run Models**: Native MLX execution with streaming and chat modes
-- **Health Checks**: Verify model integrity and completeness
+- **Health Checks**: Verify model integrity and MLX runtime compatibility
- **Cache Management**: Clean up and organize your model storage
- **Privacy & Network**: No background network or telemetry; only explicit Hugging Face interactions when you run pull or the experimental push.
@@ -47,10 +47,10 @@ MLX Knife has been comprehensively tested and verified on:
```bash
# Install latest beta release directly from GitHub
-pip install https://github.com/mzau/mlx-knife/releases/download/v2.0.0-beta.3/mlxk_json-2.0.0b3-py3-none-any.whl
+pip install https://github.com/mzau/mlx-knife/releases/download/v2.0.0-beta.4/mlxk_json-2.0.0b4-py3-none-any.whl
# Verify installation
-mlxk2 --version # → mlxk2 2.0.0b3
+mlxk2 --version # → mlxk2 2.0.0b4
```
### Development Installation
@@ -68,17 +68,63 @@ pip install -e ".[dev,test]"
## Human output (default)
mlxk2 list
mlxk2 list --health
-mlxk2 list --all --verbose
mlxk2 health
mlxk2 show "mlx-community/Phi-3-mini-4k-instruct-4bit"
### List filters (human)
-- `list`: shows MLX chat models only (safe default for run/server selection)
-- `list --verbose`: shows all MLX models (chat + base)
+- `list`: shows MLX chat models only (compact names, safe default)
+- `list --verbose`: shows all MLX models (chat + base) with full org/names and Framework column
- `list --all`: shows all frameworks (MLX, GGUF, PyTorch)
-- `list --all --verbose`: same selection as `--all`, with fuller names/details
+- Flags are combinable: `--all --verbose`, `--all --health`, `--verbose --health`
-Note: JSON output is unaffected by these human-only filters.
+### Health status display (--health flag)
+
+The `--health` flag adds health status information to the output:
+
+**Compact mode** (default, `--all`):
+- Shows single "Health" column with values:
+ - `healthy` - File integrity OK and MLX runtime compatible
+ - `healthy*` - File integrity OK but not MLX runtime compatible (use `--verbose` for details)
+ - `unhealthy` - File integrity failed or unknown format
+
+**Verbose mode** (`--verbose --health`):
+- Splits into "Integrity" and "Runtime" columns:
+ - **Integrity:** `healthy` / `unhealthy`
+ - **Runtime:** `yes` / `no` / `-` (dash = gate blocked by failed integrity)
+ - **Reason:** Explanation when problems detected (wrapped at 26 chars for readability)
+
+**Examples:**
+
+```bash
+# Compact health view
+mlxk2 list --health
+# Output:
+# Name | Hash | Size | Modified | Type | Health
+# Llama-3.2-3B-Instruct | a1b2c3d | 2.1GB | 2d ago | chat | healthy
+# Qwen2-7B-Instruct | 1a2b3c4 | 4.8GB | 3d ago | chat | healthy*
+
+# Verbose health view with details
+mlxk2 list --verbose --health
+# Output:
+# Name | Hash | Size | Modified | Framework | Type | Integrity | Runtime | Reason
+# Llama-3.2-3B-Instruct | a1b2c3d | 2.1GB | 2d ago | MLX | chat | healthy | yes | -
+# Qwen2-7B-Instruct | 1a2b3c4 | 4.8GB | 3d ago | PyTorch | chat | healthy | no | Incompatible: PyTorch
+
+# All frameworks with health status
+mlxk2 list --all --health
+# Output:
+# Name | Hash | Size | Modified | Framework | Type | Health
+# Llama-3.2-3B-Instruct | a1b2c3d | 2.1GB | 2d ago | MLX | chat | healthy
+# llama-3.2-gguf-q4 | b2c3d4e | 1.8GB | 3d ago | GGUF | unknown | healthy*
+# broken-download | - | 500MB | 1h ago | Unknown | unknown | unhealthy
+```
+
+**Design Philosophy:**
+- `unhealthy` is a catch-all for anything not understood/supported (broken downloads, unknown formats, creative HuggingFace structures)
+- `healthy` guarantees the model will work with `mlxk2 run`
+- `healthy*` means files are intact but MLX runtime can't execute them (e.g., GGUF/PyTorch models, incompatible model_type, or mlx-lm version too old)
+
+Note: JSON output is unaffected by these human-only filters and always includes full health/runtime data.
## JSON API
```bash
@@ -174,8 +220,8 @@ These features are not final and may change or be removed in future releases.
pip install -e /path/to/mlx-knife
# Verify installation
-mlxk-json --version # → mlxk2 2.0.0-beta.3
-mlxk2 --version # → mlxk2 2.0.0-beta.3
+mlxk-json --version # → mlxk2 2.0.0-beta.4
+mlxk2 --version # → mlxk2 2.0.0-beta.4
```
### Parallel with MLX-Knife 1.x
@@ -238,6 +284,8 @@ mlxk-json list --json
"model_type": "chat",
"capabilities": ["text-generation", "chat"],
"health": "healthy",
+ "runtime_compatible": true,
+ "reason": null,
"cached": true
}
],
@@ -256,7 +304,11 @@ mlxk-json health --json
"command": "health",
"data": {
"healthy": [
- { "name": "mlx-community/Phi-3-mini-4k-instruct-4bit", "status": "healthy", "reason": "Model is healthy" }
+ {
+ "name": "mlx-community/Phi-3-mini-4k-instruct-4bit",
+ "status": "healthy",
+ "reason": "Model is healthy"
+ }
],
"unhealthy": [],
"summary": { "total": 1, "healthy_count": 1, "unhealthy_count": 0 }
@@ -282,6 +334,8 @@ mlxk-json show "Phi-3-mini" --json --files
"capabilities": ["text-generation", "chat"],
"last_modified": "2024-10-15T08:23:41Z",
"health": "healthy",
+ "runtime_compatible": true,
+ "reason": null,
"cached": true
},
"files": [
@@ -399,12 +453,12 @@ pytest tests/ -v
## Known Notes
- Streaming UX: Some UIs buffer SSE; verify real-time with `curl -N`. The server emits a clear interrupt marker on abort.
-- Error handling/logging: Unified error envelope and structured logs are planned post‑beta.3 (see ADR‑004).
+- Error handling/logging: Unified error envelope and structured logs are planned post‑beta.4 (see ADR‑004).
## Development Status
### Version Roadmap
-- **2.0.0-beta.3** ← You are here (feature complete; full 1.x parity achieved; all core commands implemented)
+- **2.0.0-beta.4** ← You are here (runtime compatibility checks; separates file integrity from MLX execution capability)
- **2.0.0-rc**: CLI compatibility improvements: `mlxk` alias alongside `mlxk2`; final production hardening
- **2.0.0-stable**: Stable release after RC feedback
@@ -468,6 +522,6 @@ Note: This branch is hard‑split for 2.0. The 1.x implementation and tests were
Made with ❤️ by The BROKE team 
- Version 2.0.0-beta.3 | September 2025
+ Version 2.0.0-beta.4 | October 2025
🔮 Next: BROKE Cluster for multi-node deployments
diff --git a/TESTING.md b/TESTING.md
index 8d13149..30d760b 100644
--- a/TESTING.md
+++ b/TESTING.md
@@ -2,18 +2,18 @@
## Current Status
-✅ **254/254 tests passing** (September 2025) — 2.0.0-beta.3; 11 skipped (opt-in)
-✅ **Apple Silicon verified** (M1/M2/M3)
-✅ **Python 3.9-3.13 compatible**
+✅ **253/253 tests passing** (October 2025) — 2.0.0-beta.4; 12 skipped (opt-in)
+✅ **Test environment:** macOS 14.x, M2 Max, Python 3.9-3.13
+✅ **Production verified & reported:** M1, M1 Max, M2 Max in real-world use
✅ **Beta (CLI/JSON)** — stable features only, experimental features opt-in
✅ **Isolated test system** - user cache stays pristine with temp cache isolation
✅ **3-category test strategy** - optimized for performance and safety
-### Skipped Tests Breakdown (11 total)
-- **3 Live tests** - Network-dependent (requires environment setup: `live_push`, `live_clone`, `live_list`)
-- **3 Alpha feature tests** - Hidden features (requires `MLXK2_ENABLE_ALPHA_FEATURES=1`)
-- **2 Issue #27 tests** - Real-model tests (require user cache setup)
-- **3 Other opt-in tests** - Schema validation, spec compliance (require jsonschema)
+### Skipped Tests Breakdown (12 total, standard run without HF_HOME)
+- **3 Live Clone tests** - APFS same-volume clone workflow (requires `MLXK2_LIVE_CLONE=1`)
+- **1 Live List test** - Tests against user cache (requires HF_HOME with models)
+- **1 Live Push test** - Real HuggingFace push (requires `MLXK2_LIVE_PUSH=1`)
+- **7 Issue #27 tests** - Real-model health validation (requires HF_HOME or MLXK2_USER_HF_HOME setup)
## Quick Start (2.0 Default)
@@ -907,7 +907,7 @@ When submitting PRs, please include:
**MLX Knife 2.0 Testing Status:**
-✅ **Feature Complete** - 254/254 tests passing (2.0.0-beta.3)
+✅ **Feature Complete** - 253/253 tests passing (2.0.0-beta.4)
✅ **Enhanced Isolation** - Sentinel protection with `isolated_cache` fixture
✅ **3-Category Strategy** - Isolated/Live/Server tests optimized for 2.0
✅ **Multi-Python Support** - Python 3.9-3.13 verified
@@ -978,10 +978,10 @@ def test_model_generation_quality(model_name: str, ram_needed: int):
- ✅ **RAM-aware** - Tests adapt to available system resources
**Implementation Status:**
-- 🚧 **TODO for post-beta.3** - Requires real MLX integration in test environment
+- 🚧 **TODO for post-beta.4** - Requires real MLX integration in test environment
- 📋 **Design preserved** - RAM-aware filtering logic documented for future use
- 🎯 **Target**: Optional `pytest -m server_real` for comprehensive model validation
---
-*MLX-Knife 2.0.0-beta.3 — Comprehensive testing for JSON-first model management.*
+*MLX-Knife 2.0.0-beta.4 — Comprehensive testing for JSON-first model management.*
diff --git a/docs/ADR/ADR-004-Enhanced-Error-Logging.md b/docs/ADR/ADR-004-Enhanced-Error-Logging.md
index 7aaef95..8926f6d 100644
--- a/docs/ADR/ADR-004-Enhanced-Error-Logging.md
+++ b/docs/ADR/ADR-004-Enhanced-Error-Logging.md
@@ -1,6 +1,8 @@
# ADR-004: Enhanced Error Handling & Logging
-Status: Proposal (post-beta.3)
+Status: Accepted (Implementation: beta.5+)
+
+Note: Error type taxonomy and rate-limiting parameters may be refined during implementation based on real-world usage patterns.
Context
- 2.0 currently has working error paths and minimal logs. We want a unified error envelope, structured logging, and consistent HTTP/CLI mapping without overcomplicating local workflows.
@@ -43,9 +45,15 @@ Specification (phase 1)
- Hot-swap logging: "Switching to model", "Model loaded", cleanup results (freed memory, optional).
Rollout plan
-- Beta.3: keep current behavior; add tests (done) and reduce noisy logs (done).
-- Post-beta.3 (minor): add request_id generation and propagation; envelope for HTTP errors; optional JSON logs via env; minimal redaction.
-- Post-beta.3 (follow-up): SSE error finalization parity across endpoints; rate-limit error floods.
+- Beta.3: ✅ Keep current behavior; add tests (done) and reduce noisy logs (done).
+- Beta.4 (KW 41 2024): Runtime Check (Issue #36) - separate bugfix, not part of ADR-004.
+- Beta.5+ (Q4 2024): ADR-004 Phase 1 implementation
+ - Add request_id generation and propagation
+ - Unified error envelope for HTTP errors
+ - Optional JSON logs via env `MLXK2_LOG_JSON=1`
+ - Minimal redaction (HF_TOKEN, paths)
+- Beta.5+ (follow-up): SSE error finalization parity across endpoints; rate-limit error floods.
+- 2.0.0 Final (Q1 2026): Production-ready with full error/logging infrastructure.
- CLI operations
- Exit codes: success=0; any status:error → 1 (no special codes per type).
diff --git a/docs/json-api-schema.json b/docs/json-api-schema.json
index d924dc8..f34ecad 100644
--- a/docs/json-api-schema.json
+++ b/docs/json-api-schema.json
@@ -1,7 +1,7 @@
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "https://example.com/mlxk-json-api.schema.json",
- "title": "MLX-Knife 2.0 JSON API (current)",
+ "title": "MLX-Knife 2.0 JSON API 0.1.5 (current)",
"type": "object",
"additionalProperties": false,
"properties": {
@@ -52,12 +52,15 @@
"items": {"type": "string", "enum": ["text-generation", "chat", "embeddings", "completion"]}
},
"health": {"type": "string", "enum": ["healthy", "unhealthy"]},
+ "runtime_compatible": {"type": "boolean"},
+ "reason": {"type": ["string", "null"]},
"cached": {"type": "boolean"}
},
"required": [
"name", "hash", "size_bytes", "last_modified", "framework",
- "model_type", "capabilities", "health", "cached"
- ]
+ "model_type", "capabilities", "health", "runtime_compatible", "cached"
+ ],
+ "description": "Extended in 0.1.5: added runtime_compatible (boolean) and reason (string|null) fields. Both checks always performed in JSON mode. reason describes the first problem found (integrity or runtime). Gate logic: runtime check requires integrity check first. CLI flags (--health, --runtime-check) control human output columns only, not JSON data collection."
},
"fileEntry": {
"type": "object",
diff --git a/docs/json-api-specification.md b/docs/json-api-specification.md
index 0bc3891..900275d 100644
--- a/docs/json-api-specification.md
+++ b/docs/json-api-specification.md
@@ -1,8 +1,8 @@
# MLX-Knife 2.0 JSON API Specification
-**Specification Version:** 0.1.4
-**Status:** Alpha - Subject to change
-**Target:** MLX-Knife 2.0.0
+**Specification Version:** 0.1.5
+**Status:** Alpha - Subject to change
+**Target:** MLX-Knife 2.0.0-beta.4
> Based on [GitHub Issue #8](https://github.com/mzau/mlx-knife/issues/8) - Comprehensive JSON output support for all commands
@@ -10,6 +10,80 @@
MLX Knife is promoted as a "scriptable" tool, but formatted terminal output makes automation difficult. JSON output enables robust scripting integration and broke-cluster compatibility.
+## Health Check Concepts (0.1.5)
+
+MLX Knife distinguishes between two levels of model validation:
+
+### Integrity Check (`health` field)
+- **Purpose:** Verify that downloaded model files are complete and uncorrupted
+- **Scope:** File-level validation only
+- **Checks:**
+ - Required files present (config.json, weights, tokenizer files)
+ - No Git LFS pointers instead of actual files
+ - JSON files are valid JSON
+- **States:** `"healthy"` | `"unhealthy"`
+- **Always included:** In all `modelObject` instances
+
+### Runtime Compatibility Check (`runtime_compatible` field)
+- **Purpose:** Verify that model can be executed with `mlx-lm`
+- **Scope:** Framework and model architecture validation
+- **Checks:**
+ - Framework is MLX (GGUF/PyTorch models fail)
+ - Model architecture supported by current mlx-lm version
+ - Respects `MODEL_REMAPPING` (e.g., `mistral` → `llama`)
+- **States:** `true` | `false`
+- **Always included:** In all `modelObject` instances
+
+### Gate Logic & Reason Field
+- Runtime compatibility check **requires** integrity check first
+- If integrity check fails (`health: "unhealthy"`), runtime check is skipped (`runtime_compatible: false`)
+- `reason` field describes the **first problem found**:
+ - Integrity problems take precedence
+ - Runtime problems only shown if files are healthy
+ - `null` when both checks pass (`health: "healthy"` AND `runtime_compatible: true`)
+
+### Example Scenarios
+
+**Healthy MLX Model (Compatible):**
+```json
+/* Illustrative snippet - not a complete response */
+{
+ "health": "healthy",
+ "runtime_compatible": true,
+ "reason": null
+}
+```
+
+**GGUF Model (Files OK, Not Executable):**
+```json
+/* Illustrative snippet - not a complete response */
+{
+ "health": "healthy",
+ "runtime_compatible": false,
+ "reason": "Framework GGUF not executable with mlx-lm (requires MLX)"
+}
+```
+
+**Unsupported Architecture:**
+```json
+/* Illustrative snippet - not a complete response */
+{
+ "health": "healthy",
+ "runtime_compatible": false,
+ "reason": "Model architecture 'qwen3_next' requires mlx-lm >= 0.28.0 (current: 0.27.1)"
+}
+```
+
+**Incomplete Download (Runtime Check Skipped):**
+```json
+/* Illustrative snippet - not a complete response */
+{
+ "health": "unhealthy",
+ "runtime_compatible": false,
+ "reason": "config.json missing"
+}
+```
+
## CLI Usage
All commands require the `--json` flag for JSON output:
@@ -68,13 +142,16 @@ All commands that return model information use the same minimal model object.
- `framework`: "MLX" | "GGUF" | "PyTorch" | "Unknown".
- `model_type`: "chat" | "embedding" | "base" | "unknown".
- `capabilities`: e.g., ["text-generation", "chat"] or ["embeddings"].
-- `health`: "healthy" | "unhealthy".
+- `health`: "healthy" | "unhealthy" (always present).
+- `runtime_compatible`: `true` | `false` (0.1.5+, always present).
+- `reason`: `string | null` (0.1.5+, describes first problem found, null when both checks pass).
- `cached`: true.
Notes:
- No human-readable `size` field; only `size_bytes`.
- No human-readable "modified" field; `last_modified` is authoritative.
- No absolute filesystem paths are exposed.
+- `runtime_compatible` and `reason` fields added in spec version 0.1.5 (Issue #36).
### Supported Commands
@@ -112,27 +189,21 @@ Notes:
**Basic Usage:**
```bash
-mlxk-json list --json # All models with health status
-mlxk-json list "mlx-community" --json # Filter by pattern
+mlxk-json list --json # All models with full validation
+mlxk-json list "mlx-community" --json # Filter by pattern
mlxk-json list "Llama" --json # Fuzzy matching
```
**Behavior:**
-- Equivalent to 1.1.0 columns (NAME/ID/SIZE/MODIFIED/FRAMEWORK/HEALTH) with JSON mapping:
- - NAME → `name`
- - ID → `hash`
- - SIZE → `size_bytes` (bytes, integer)
- - MODIFIED → `last_modified` (ISO-8601 UTC)
- - FRAMEWORK → `framework`
- - HEALTH → `health`
-- Health status is always included.
-- Pattern filter is a case-insensitive substring match on `name`.
+- Returns all cached models with complete metadata
+- Performs both integrity and runtime compatibility checks (0.1.5+)
+- Pattern filter is a case-insensitive substring match on `name`
**JSON Schema:**
```json
{
"status": "success",
- "command": "list",
+ "command": "list",
"data": {
"models": [
{
@@ -144,6 +215,8 @@ mlxk-json list "Llama" --json # Fuzzy matching
"model_type": "chat",
"capabilities": ["text-generation", "chat"],
"health": "healthy",
+ "runtime_compatible": true,
+ "reason": null,
"cached": true
},
{
@@ -155,6 +228,8 @@ mlxk-json list "Llama" --json # Fuzzy matching
"model_type": "embedding",
"capabilities": ["embeddings"],
"health": "healthy",
+ "runtime_compatible": true,
+ "reason": null,
"cached": true
},
{
@@ -165,7 +240,35 @@ mlxk-json list "Llama" --json # Fuzzy matching
"framework": "GGUF",
"model_type": "chat",
"capabilities": ["text-generation", "chat"],
+ "health": "healthy",
+ "runtime_compatible": false,
+ "reason": "Framework GGUF not executable with mlx-lm (requires MLX)",
+ "cached": true
+ },
+ {
+ "name": "mlx-community/Qwen3-Next-80B-A3B-Instruct-4bit",
+ "hash": "f1234a5f90abcdef1234567890abcdef12345678",
+ "size_bytes": 45000000000,
+ "last_modified": "2024-10-01T09:15:30Z",
+ "framework": "MLX",
+ "model_type": "chat",
+ "capabilities": ["text-generation", "chat"],
+ "health": "healthy",
+ "runtime_compatible": false,
+ "reason": "Model architecture 'qwen3_next' requires mlx-lm >= 0.28.0 (current: 0.27.1)",
+ "cached": true
+ },
+ {
+ "name": "corrupted/incomplete-download",
+ "hash": "c9876a5f90abcdef1234567890abcdef12345678",
+ "size_bytes": 2500000000,
+ "last_modified": "2024-09-15T12:00:00Z",
+ "framework": "MLX",
+ "model_type": "unknown",
+ "capabilities": [],
"health": "unhealthy",
+ "runtime_compatible": false,
+ "reason": "config.json missing",
"cached": true
}
],
@@ -293,6 +396,8 @@ mlxk-json show "Phi-3-mini" --config --json # Include config.json content
"capabilities": ["text-generation", "chat"],
"last_modified": "2024-10-15T08:23:41Z",
"health": "healthy",
+ "runtime_compatible": true,
+ "reason": null,
"cached": true
},
"metadata": {
@@ -324,6 +429,8 @@ mlxk-json show "Phi-3-mini" --config --json # Include config.json content
"capabilities": ["text-generation", "chat"],
"last_modified": "2024-10-15T08:23:41Z",
"health": "healthy",
+ "runtime_compatible": true,
+ "reason": null,
"cached": true
},
"files": [
@@ -356,6 +463,8 @@ mlxk-json show "Phi-3-mini" --config --json # Include config.json content
"capabilities": ["text-generation", "chat"],
"last_modified": "2024-10-15T08:23:41Z",
"health": "healthy",
+ "runtime_compatible": true,
+ "reason": null,
"cached": true
},
"config": {
@@ -409,6 +518,20 @@ mlxk-json show "Phi-3-mini" --config --json # Include config.json content
}
```
+## Changes in 0.1.5 (Alpha)
+
+**Issue #36: Separate Integrity and Runtime Compatibility Checks**
+
+- Added `runtime_compatible: boolean` field to `modelObject`
+- Added `reason: string | null` field to `modelObject`
+- Both fields always present in JSON output
+- `runtime_compatible` checks:
+ - Framework must be MLX (GGUF/PyTorch fail)
+ - Model architecture must be supported by installed mlx-lm version
+ - Respects `MODEL_REMAPPING` for aliased architectures
+- Gate logic: Runtime check requires passing integrity check first
+- `reason` field describes first problem found (integrity > runtime priority)
+
## Changes in 0.1.2 (Alpha)
- Introduced a common minimal Model Object for consistency across commands.
diff --git a/mlxk2/__init__.py b/mlxk2/__init__.py
index 8ed39a1..7b77c0c 100644
--- a/mlxk2/__init__.py
+++ b/mlxk2/__init__.py
@@ -7,4 +7,4 @@ import warnings
# Issue parity with 1.1.0 (Issue #22)
warnings.filterwarnings('ignore', message='urllib3 v2 only supports OpenSSL 1.1.1+')
-__version__ = "2.0.0b3"
+__version__ = "2.0.0b4"
diff --git a/mlxk2/operations/common.py b/mlxk2/operations/common.py
index 10bf6de..572ea6a 100644
--- a/mlxk2/operations/common.py
+++ b/mlxk2/operations/common.py
@@ -232,7 +232,7 @@ def build_model_object(hf_name: str, model_root: Path, selected_path: Optional[P
may be the model_root. Commit hash is taken from selected_path.name if it looks
like a 40-char hex string, else None.
"""
- from ..operations.health import is_model_healthy # local import to avoid cycle
+ from ..operations.health import is_model_healthy, check_runtime_compatibility # local import to avoid cycle
# Compute commit hash if selected path is a snapshot dir
commit_hash: Optional[str] = None
@@ -252,7 +252,19 @@ def build_model_object(hf_name: str, model_root: Path, selected_path: Optional[P
capabilities = detect_capabilities(model_type, hf_name, tok, config)
# Health: rely on existing operation (name-based)
- healthy, _reason = is_model_healthy(hf_name)
+ healthy, health_reason = is_model_healthy(hf_name)
+
+ # Runtime compatibility: ALWAYS computed (gate logic applies)
+ # Gate: Only check runtime if file integrity is healthy
+ if healthy:
+ runtime_compatible, runtime_reason = check_runtime_compatibility(probe, framework)
+ else:
+ # File integrity failed → skip runtime check
+ runtime_compatible = False
+ runtime_reason = None # health_reason takes precedence
+
+ # Reason field: First problem encountered (health → runtime)
+ reason = health_reason if not healthy else runtime_reason
# Size/Modified computed from selected path (snapshot preferred)
base = selected_path if selected_path is not None else model_root
@@ -265,6 +277,8 @@ def build_model_object(hf_name: str, model_root: Path, selected_path: Optional[P
"model_type": model_type,
"capabilities": capabilities,
"health": "healthy" if healthy else "unhealthy",
+ "runtime_compatible": runtime_compatible,
+ "reason": reason,
"cached": True,
}
return model_obj
diff --git a/mlxk2/operations/health.py b/mlxk2/operations/health.py
index ee6b5b2..160f61a 100644
--- a/mlxk2/operations/health.py
+++ b/mlxk2/operations/health.py
@@ -1,4 +1,7 @@
import json
+import logging
+from pathlib import Path
+from typing import Tuple, Optional
from ..core.cache import get_current_model_cache, hf_to_cache_dir, cache_dir_to_hf
from ..core.model_resolution import resolve_model_for_operation
@@ -251,6 +254,68 @@ def health_from_cache(model_spec, cache_dir):
return _check_snapshot_health(model_path)
+def check_runtime_compatibility(model_path: Path, framework: str) -> Tuple[bool, Optional[str]]:
+ """Check if model is executable with mlx-lm.
+
+ Gate logic:
+ 1. Framework must be "MLX" (GGUF/PyTorch → incompatible)
+ 2. model_type must be supported by current mlx-lm version
+
+ Returns:
+ (is_compatible, reason): reason is None if compatible, error message otherwise
+ """
+ # Gate 1: Framework check
+ if framework != "MLX":
+ return False, f"Incompatible: {framework}"
+
+ # Gate 2: model_type support check via mlx-lm
+ config_path = model_path / "config.json"
+ if not config_path.exists():
+ return False, "config.json missing (required for model_type detection)"
+
+ try:
+ with open(config_path) as f:
+ config = json.load(f)
+ model_type = config.get("model_type")
+ if not model_type:
+ return False, "config.json missing model_type field"
+ except (OSError, json.JSONDecodeError) as e:
+ return False, f"Failed to read config.json: {e}"
+
+ # Check if mlx-lm supports this model_type
+ try:
+ # Suppress mlx-lm's ERROR logs during detection
+ # mlx-lm uses root logger, so we need to suppress both mlx_lm and root
+ mlx_logger = logging.getLogger("mlx_lm")
+ root_logger = logging.getLogger()
+ original_mlx_level = mlx_logger.level
+ original_root_level = root_logger.level
+ mlx_logger.setLevel(logging.CRITICAL)
+ root_logger.setLevel(logging.CRITICAL)
+
+ try:
+ # Try mlx-lm >= 0.28.0 API first (mlx_lm.models.base._get_classes)
+ try:
+ from mlx_lm.models.base import _get_classes
+ model_class, _ = _get_classes(config=config, model_config=config)
+ except ImportError:
+ # Fall back to mlx-lm 0.27.x API (mlx_lm.utils._get_classes)
+ from mlx_lm.utils import _get_classes
+ model_class, _ = _get_classes(config)
+
+ if model_class is None:
+ return False, f"model_type '{model_type}' not supported by mlx-lm"
+
+ return True, None
+ finally:
+ mlx_logger.setLevel(original_mlx_level)
+ root_logger.setLevel(original_root_level)
+
+ except Exception as e:
+ # Pass through the actual error for debugging
+ return False, str(e) if str(e) else "Runtime check failed"
+
+
def health_check_operation(model_pattern=None):
"""Health check operation for JSON API with model resolution support."""
result = {
diff --git a/mlxk2/output/human.py b/mlxk2/output/human.py
index 4716c0f..b0d33c4 100644
--- a/mlxk2/output/human.py
+++ b/mlxk2/output/human.py
@@ -52,7 +52,15 @@ def fmt_time(iso_utc_z: Optional[str]) -> str:
return iso_utc_z
-def _table(rows: List[List[str]], headers: List[str]) -> str:
+def _table(rows: List[List[str]], headers: List[str], max_col_width: Optional[int] = None) -> str:
+ """
+ Build a table with optional column width limit for last column.
+
+ Args:
+ rows: Table rows
+ headers: Column headers
+ max_col_width: If set, limits last column to this width (wraps text to new lines)
+ """
widths = [len(h) for h in headers]
for r in rows:
for i, cell in enumerate(r):
@@ -61,14 +69,52 @@ def _table(rows: List[List[str]], headers: List[str]) -> str:
else:
widths.append(len(cell))
+ # Apply max width limit to last column if specified
+ if max_col_width and len(widths) > 0:
+ widths[-1] = min(widths[-1], max_col_width)
+
def fmt_row(cols: List[str]) -> str:
return " | ".join(col.ljust(widths[i]) for i, col in enumerate(cols))
+ def wrap_cell(text: str, width: int) -> List[str]:
+ """Wrap text to width, breaking at word boundaries."""
+ if len(text) <= width:
+ return [text]
+ words = text.split()
+ lines = []
+ current = []
+ current_len = 0
+ for word in words:
+ word_len = len(word)
+ if current and current_len + 1 + word_len > width:
+ lines.append(" ".join(current))
+ current = [word]
+ current_len = word_len
+ else:
+ current.append(word)
+ current_len += (1 if current_len > 0 else 0) + word_len
+ if current:
+ lines.append(" ".join(current))
+ return lines
+
lines = []
lines.append(fmt_row(headers))
lines.append("-+-".join("-" * w for w in widths))
+
for r in rows:
- lines.append(fmt_row(r))
+ # Check if last column needs wrapping
+ if max_col_width and len(r) > 0 and len(r[-1]) > max_col_width:
+ wrapped_lines = wrap_cell(r[-1], max_col_width)
+ # First line with all columns
+ first_row = r[:-1] + [wrapped_lines[0]]
+ lines.append(fmt_row(first_row))
+ # Additional lines with empty cells except last column
+ for wrapped_line in wrapped_lines[1:]:
+ continuation_row = [""] * (len(r) - 1) + [wrapped_line]
+ lines.append(fmt_row(continuation_row))
+ else:
+ lines.append(fmt_row(r))
+
return "\n".join(lines)
@@ -80,7 +126,12 @@ def render_list(data: Dict[str, Any], show_health: bool, show_all: bool, verbose
else:
headers = ["Name", "Hash", "Size", "Modified", "Framework", "Type"]
if show_health:
- headers.append("Health")
+ if verbose:
+ # Verbose mode: split health into Integrity + Runtime + Reason columns
+ headers.extend(["Integrity", "Runtime", "Reason"])
+ else:
+ # Compact mode: single Health column
+ headers.append("Health")
# Human filter:
# - --all: show everything
@@ -127,11 +178,52 @@ def render_list(data: Dict[str, Any], show_health: bool, show_all: bool, verbose
str(m.get("model_type", "-")),
]
if show_health:
- row.append(str(m.get("health", "-")))
+ if verbose:
+ # Verbose mode: Integrity | Runtime | Reason columns
+ health = m.get("health", "unknown")
+ runtime_compatible = m.get("runtime_compatible")
+ reason = m.get("reason", "")
+
+ # Integrity column
+ integrity = "healthy" if health == "healthy" else "unhealthy" if health == "unhealthy" else "-"
+
+ # Runtime column (only meaningful if integrity is healthy)
+ if health == "healthy" and runtime_compatible is not None:
+ runtime = "yes" if runtime_compatible else "no"
+ else:
+ runtime = "-"
+
+ # Reason column (truncate to 60 chars)
+ reason_str = str(reason) if reason else "-"
+ if len(reason_str) > 60:
+ reason_str = reason_str[:57] + "..."
+
+ row.extend([integrity, runtime, reason_str])
+ else:
+ # Compact mode: single Health column (healthy/healthy*/unhealthy)
+ health = m.get("health", "unknown")
+ runtime_compatible = m.get("runtime_compatible")
+
+ if health == "healthy":
+ if runtime_compatible is True:
+ health_str = "healthy"
+ elif runtime_compatible is False:
+ health_str = "healthy*"
+ else:
+ # No runtime check performed
+ health_str = "healthy"
+ elif health == "unhealthy":
+ health_str = "unhealthy"
+ else:
+ health_str = "-"
+
+ row.append(health_str)
rows.append(row)
# Note: show_all/verbose are reserved for future detail; table remains deterministic
- return _table(rows, headers)
+ # Apply 26 char limit to Reason column in verbose mode
+ max_col_width = 26 if (show_health and verbose) else None
+ return _table(rows, headers, max_col_width=max_col_width)
def render_health(data: Dict[str, Any]) -> str:
@@ -155,14 +247,30 @@ def render_show(data: Dict[str, Any]) -> str:
name = model.get("name", "-")
h7 = fmt_hash7(model.get("hash"))
header = f"Model: {name}{('@'+h7) if h7 != '-' else ''}"
+
+ # Build health status string
+ health = model.get('health', '-')
+ runtime_compatible = model.get('runtime_compatible')
+ if health == 'healthy' and runtime_compatible is True:
+ health_str = 'healthy'
+ elif health == 'healthy' and runtime_compatible is False:
+ health_str = 'healthy (files OK, runtime incompatible)'
+ else:
+ health_str = health
+
details = [
f"Framework: {model.get('framework','-')}",
f"Type: {model.get('model_type','-')}",
f"Size: {humanize_size(model.get('size_bytes'))}",
f"Modified: {fmt_time(model.get('last_modified'))}",
- f"Health: {model.get('health','-')}",
+ f"Health: {health_str}",
]
+ # Add reason if present
+ reason = model.get('reason')
+ if reason:
+ details.append(f"Reason: {reason}")
+
# Optional sections
out: List[str] = [header, *details]
if "files" in d and isinstance(d["files"], list):
diff --git a/mlxk2/spec.py b/mlxk2/spec.py
index 08826e4..42f6eb2 100644
--- a/mlxk2/spec.py
+++ b/mlxk2/spec.py
@@ -4,4 +4,4 @@ Single source of truth for the JSON API specification version used by the
current code and tests. Keep this in sync with docs/json-api-specification.md.
"""
-JSON_API_SPEC_VERSION = "0.1.4"
+JSON_API_SPEC_VERSION = "0.1.5"
diff --git a/requirements.txt b/requirements.txt
index 41a0281..831897a 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -3,7 +3,7 @@
huggingface-hub>=0.34.0
requests>=2.32.0
-mlx-lm>=0.27.0 # For running MLX models with streaming support
+mlx-lm>=0.28.3
mlx>=0.29.0 # Core MLX library
# API Server dependencies (for 'mlxk server' command)