Local Test de7ccf9018 2.0.0-alpha: default 2.0 tests, cache safety, and docs
Testing:
- pytest defaults to tests_2.0 via pytest.ini
- README/TESTING updated; Quick Start uses `pip install -e . && pip install pytest`

Safety:
- Add test-cache sentinel + centralized checks
- Strict delete guard via MLXK2_STRICT_TEST_DELETE=1
- Hide sentinel from 2.0 list output

Portability:
- Remove site-specific paths; generic test/user cache detection (mlxk2_test_ prefix + sentinel)

Docs:
- Environment & Caches, HF cache integrity
- Local-only hooks/excludes and local test script (excluded from VCS)
2025-08-29 16:57:45 +02:00


MLX-Knife 2.0.0-alpha

JSON-First Model Management for Automation & Scripting

🚧 Alpha Development Branch: This is the feature/2.0.0-json-only branch containing MLX-Knife 2.0.0-alpha. For stable production use, see MLX-Knife 1.1.0.

License: MIT · Python 3.9+

Quick Start

# Installation (local development)
git clone https://github.com/mzau/mlx-knife.git -b feature/2.0.0-json-only
cd mlx-knife
pip install -e .

# Basic usage - JSON API
mlxk-json list --json | jq '.data.models[].name'
mlxk-json health --json | jq '.data.summary'
mlxk-json show "Phi-3-mini" --json | jq '.data.model_info'

What's New: JSON-first architecture for automation and scripting
What's Missing: Server mode, run command (use MLX-Knife 1.x for those)

⚠️ Alpha Status Disclaimer

MLX-Knife 2.0.0-alpha is feature-complete for JSON operations with production-quality reliability:

  • Core functionality works: All 5 commands (list, health, show, pull, rm)
  • Test status: 45/45 passing with comprehensive edge case coverage
  • Production use: Suitable for broke-cluster integration and automation
  • Parallel use: Deploy alongside MLX-Knife 1.x for server functionality

What 2.0.0-alpha Includes

| Command | Status | Description |
| --- | --- | --- |
| list | Complete | Model discovery with JSON output |
| health | Complete | Corruption detection and cache analysis |
| show | Complete | Detailed model information with --files, --config |
| pull | Complete | HuggingFace model downloads with corruption detection |
| rm | Complete | Model deletion with lock cleanup and fuzzy matching |

What's Coming Later

| Feature | Target Version | Description |
| --- | --- | --- |
| 🔄 server | 2.0.0-rc | OpenAI-compatible API server |
| 🔄 run | 2.0.0-rc | Interactive model execution |
| 🔄 Human-readable output | 2.0.0-rc | CLI formatting layer |
| 🔄 embed | TBD | Embedding generation (if merged from 1.x) |

Installation & Parallel Usage

Development Installation

# Install 2.0.0-alpha (this branch)
pip install -e /path/to/mlx-knife

# Verify installation
mlxk-json --version  # → MLX-Knife JSON 2.0.0-alpha
mlxk2 --version      # → MLX-Knife JSON 2.0.0-alpha

Parallel with MLX-Knife 1.x

Both versions can coexist safely:

# Install stable 1.x for server/run features
pip install mlx-knife

# Commands available:
mlxk list                    # 1.x - Human-readable output
mlxk server --port 8080      # 1.x - Server mode
mlxk run "model" -p "Hello"  # 1.x - Interactive execution

mlxk-json list --json        # 2.0 - JSON API
python -m mlxk2.cli list     # 2.0 - Module invocation

Package Names:

  • MLX-Knife 1.x: mlx-knife → mlxk command
  • MLX-Knife 2.0: mlxk-json → mlxk-json, mlxk2 commands

JSON API Documentation

📋 Complete API Specification: See docs/json-api-specification.md for comprehensive JSON schema, error codes, and integration examples.

Command Structure

All commands follow this JSON response format:

{
    "status": "success|error", 
    "command": "list|health|show|pull|rm",
    "data": { /* command-specific data */ },
    "error": null | { "message": "...", "details": "..." }
}
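
A consumer can treat this envelope as the single contract to check before touching command-specific data. The following is a minimal Python sketch, assuming only the four fields shown above; `MlxkError` and `parse_response` are hypothetical helper names and are not part of MLX-Knife.

```python
import json


class MlxkError(RuntimeError):
    """Raised when an envelope reports a non-success status (hypothetical helper)."""


def parse_response(text, expected_command=None):
    """Parse one JSON envelope and return its "data" payload.

    Assumes only the fields shown above: status, command, data, error.
    """
    envelope = json.loads(text)
    if envelope.get("status") != "success":
        error = envelope.get("error") or {}
        raise MlxkError(error.get("message", "unknown error"))
    if expected_command is not None and envelope.get("command") != expected_command:
        raise MlxkError("unexpected command: %r" % envelope.get("command"))
    return envelope["data"]


sample = (
    '{"status": "success", "command": "list", '
    '"data": {"models": [], "count": 0}, "error": null}'
)
print(parse_response(sample, expected_command="list"))
```

Checking `status` first keeps callers from reading `data` out of a failed response.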

Examples

List Models

mlxk-json list --json
# Output:
{
    "status": "success",
    "command": "list", 
    "data": {
        "models": [
            {
                "name": "mlx-community/Phi-3-mini-4k-instruct-4bit",
                "hashes": ["e9675aa3def456789abcdef0123456789abcdef0"],
                "cached": true
            }
        ],
        "count": 1
    },
    "error": null
}

Health Check

mlxk-json health --json
# Output:
{
    "status": "success",
    "command": "health",
    "data": {
        "healthy": [...],
        "unhealthy": [...],
        "summary": {"total": 5, "healthy_count": 4, "unhealthy_count": 1}
    },
    "error": null
}
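
The summary block lends itself to a simple pass/fail gate. The sketch below is one possible way to turn a health response into an exit code; it assumes the summary fields shown above and also assumes that `healthy_count` and `unhealthy_count` add up to `total`, as they do in the sample. `health_exit_code` is a hypothetical name.

```python
import json


def health_exit_code(response_text):
    """Return 0 if the health response reports no unhealthy models, else 1.

    Assumes the "summary" fields shown above (total, healthy_count, unhealthy_count).
    """
    envelope = json.loads(response_text)
    if envelope.get("status") != "success":
        return 1
    summary = envelope["data"]["summary"]
    # Assumption: the two counts add up to total; treat a mismatch as a failure.
    if summary["healthy_count"] + summary["unhealthy_count"] != summary["total"]:
        return 1
    return 0 if summary["unhealthy_count"] == 0 else 1


sample = json.dumps({
    "status": "success",
    "command": "health",
    "data": {
        "healthy": [],
        "unhealthy": [],
        "summary": {"total": 5, "healthy_count": 4, "unhealthy_count": 1},
    },
    "error": None,
})
print(health_exit_code(sample))
```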

Show Model Details

mlxk-json show "Phi-3-mini" --json --files
# Output includes file listings, model config, capabilities

Hash Syntax Support

All commands support @hash syntax for specific model versions:

mlxk-json health "Qwen3@e96" --json     # Check specific hash
mlxk-json show "model@3df9bfd" --json   # Short hash matching
mlxk-json rm "Phi-3@e967" --json --force  # Delete specific version
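
To illustrate how a client might resolve such a spec against the output of `list`, here is a small Python sketch. The matching rules in it (case-insensitive substring match on the name, prefix match on the hash) are assumptions made for illustration; the actual rules are defined by MLX-Knife. `split_spec` and `matching_models` are hypothetical names.

```python
def split_spec(spec):
    """Split "name@hash" into (name, hash_prefix); hash_prefix is None if absent."""
    name, sep, hash_prefix = spec.partition("@")
    return name, (hash_prefix.lower() if sep and hash_prefix else None)


def matching_models(models, spec):
    """Return entries of a list response that match the spec.

    Assumption: the name matches as a case-insensitive substring and the hash
    as a prefix of any entry in "hashes". This is not MLX-Knife's own code.
    """
    name, hash_prefix = split_spec(spec)
    result = []
    for model in models:
        if name.lower() not in model["name"].lower():
            continue
        hashes = model.get("hashes", [])
        if hash_prefix is None or any(h.lower().startswith(hash_prefix) for h in hashes):
            result.append(model)
    return result


models = [{
    "name": "mlx-community/Phi-3-mini-4k-instruct-4bit",
    "hashes": ["e9675aa3def456789abcdef0123456789abcdef0"],
    "cached": True,
}]
print([m["name"] for m in matching_models(models, "Phi-3@e967")])
```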

HuggingFace Cache Safety

MLX-Knife 2.0 respects standard HuggingFace cache structure and practices:

Best Practices for Shared Environments

  • Read operations (list, health, show) are always safe alongside concurrent processes
  • Write operations (pull, rm) should be coordinated and run during maintenance windows
  • Lock cleanup is automatic, but avoid triggering it during active downloads
  • Your responsibility: coordinate with your team and time write operations carefully
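
One possible way to coordinate write operations between scripts on the same machine is an advisory file lock. The sketch below is an assumption-laden illustration, not an MLX-Knife feature: the lock path and the name `cache_write_lock` are hypothetical, and the lock only excludes other processes that opt in to using the same file.

```python
import contextlib
import fcntl
import os
import tempfile


@contextlib.contextmanager
def cache_write_lock(lock_path):
    """Hold an exclusive advisory lock while running pull/rm.

    Only cooperating processes that use the same lock file are excluded;
    MLX-Knife itself is not assumed to know about this lock.
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        # Non-blocking: fail fast if another job already holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock


# Hypothetical lock location shared by all cooperating scripts
lock_file = os.path.join(tempfile.gettempdir(), "mlxk-cache-write.lock")
with cache_write_lock(lock_file):
    print("lock held; run pull/rm here")
```

If the lock is already taken, `flock` raises `BlockingIOError`, which a script can report as "cache maintenance in progress" instead of starting a second write.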

Example Safe Workflow

# Check what's in cache (always safe)
mlxk-json list --json | jq '.data.count'

# Maintenance window - coordinate with team
mlxk-json rm "corrupted-model" --json --force
mlxk-json pull "replacement-model" --json

# Back to normal operations
mlxk-json health --json | jq '.data.summary'
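
The same workflow can be driven from Python. This is a minimal sketch that assumes the CLI prints exactly one JSON envelope on stdout for each command, including on failure; `run_mlxk`, `replace_model`, and `fake_runner` are hypothetical names. The runner is injectable so the example executes without the CLI installed.

```python
import json
import subprocess


def run_mlxk(args, runner=subprocess.run):
    """Run `mlxk-json <args>` and return the parsed JSON envelope."""
    completed = runner(["mlxk-json", *args], capture_output=True, text=True)
    return json.loads(completed.stdout)


def replace_model(old, new, runner=subprocess.run):
    """Remove `old`, pull `new`, then return the health summary."""
    steps = (["rm", old, "--json", "--force"], ["pull", new, "--json"])
    for args in steps:
        response = run_mlxk(args, runner)
        if response.get("status") != "success":
            raise RuntimeError("%s failed: %s" % (args[0], response.get("error")))
    return run_mlxk(["health", "--json"], runner)["data"]["summary"]


def fake_runner(cmd, **kwargs):
    """Stand-in for the CLI: returns a success envelope for any command."""
    summary = {"total": 1, "healthy_count": 1, "unhealthy_count": 0}
    envelope = {"status": "success", "command": cmd[1],
                "data": {"summary": summary}, "error": None}
    return subprocess.CompletedProcess(cmd, 0, stdout=json.dumps(envelope), stderr="")


print(replace_model("corrupted-model", "replacement-model", runner=fake_runner))
```

Dropping the `runner=fake_runner` argument makes the same call invoke the real `mlxk-json` binary.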

Real-World Examples

🔗 Integration Reference: External projects should implement against docs/json-api-specification.md - this alpha phase helps validate that specification matches actual implementation.

Broke-Cluster Integration

# Get available model names for scheduling
MODELS=$(mlxk-json list --json | jq -r '.data.models[].name')

# Check cache health before deployment
HEALTH=$(mlxk-json health --json | jq -r '.data.summary.healthy_count // 0')
# Treat an empty result (e.g. the command failed) as zero healthy models
if [ "${HEALTH:-0}" -eq 0 ]; then
    echo "No healthy models available"
    exit 1
fi

# Download required models
mlxk-json pull "mlx-community/Phi-3-mini-4k-instruct-4bit" --json

CI/CD Pipeline Usage

# Verify model integrity in CI
mlxk-json health --json | jq -e '.data.summary.unhealthy_count == 0'

# Clean up CI artifacts
mlxk-json rm "test-model-*" --json --force

# Pre-warm cache for deployment
mlxk-json pull "production-model" --json

Model Management Automation

# Find models by pattern
LARGE_MODELS=$(mlxk-json list --json | jq -r '.data.models[] | select(.name | contains("30B")) | .name')

# Show detailed info for analysis
for model in $LARGE_MODELS; do
    mlxk-json show "$model" --json --config | jq '.data.model_config'
done

Testing

The 2.0 test suite runs by default (pytest discovery points to tests_2.0/):

# Run 2.0 tests (default)
pytest -v

# Explicitly run legacy 1.x tests (not maintained on this branch)
pytest tests/ -v

# Test categories (2.0 example):
# - ADR-002 edge cases
# - Integration scenarios  
# - Model naming logic
# - Robustness testing

# Current status: 45/45 passing ✅

Test Architecture:

  • Isolated Cache System - Zero risk to user data
  • Atomic Context Switching - Production/test cache separation
  • Comprehensive Mock Models - Realistic test scenarios
  • Edge Case Coverage - All documented failure modes tested
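
The isolation idea can be sketched as a context manager that swaps the cache location and refuses to delete anything that does not look like a test cache. This is an illustration under stated assumptions, not the project's test code: the sentinel file name is hypothetical, and redirecting the cache through the `HF_HOME` environment variable is assumed here; only the `mlxk2_test_` prefix comes from the project's notes.

```python
import contextlib
import os
import pathlib
import shutil
import tempfile

SENTINEL = ".mlxk2_test_cache"  # hypothetical sentinel file name


def safe_delete(path):
    """Delete a cache directory only if it carries the test prefix and sentinel."""
    path = pathlib.Path(path)
    if not (path.name.startswith("mlxk2_test_") and (path / SENTINEL).is_file()):
        raise RuntimeError("refusing to delete non-test cache: %s" % path)
    shutil.rmtree(path)


@contextlib.contextmanager
def isolated_cache():
    """Point HF_HOME at a throwaway cache and restore the old value afterwards."""
    root = pathlib.Path(tempfile.mkdtemp(prefix="mlxk2_test_"))
    (root / SENTINEL).touch()
    previous = os.environ.get("HF_HOME")
    os.environ["HF_HOME"] = str(root)
    try:
        yield root
    finally:
        # Restore the caller's environment before removing the test cache.
        if previous is None:
            os.environ.pop("HF_HOME", None)
        else:
            os.environ["HF_HOME"] = previous
        safe_delete(root)


with isolated_cache() as cache:
    print(os.environ["HF_HOME"] == str(cache))
```

Requiring both the prefix and the sentinel means a misconfigured path pointing at a user cache fails loudly instead of being deleted.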

Known Issues & Limitations

Critical Issues

  • Health Check False Positive: Health check may report incomplete downloads as healthy during model pull operations (affects both 1.1.0 and 2.0.0-alpha)

Alpha Limitations

  • No interactive prompts (use --force flag for rm operations)
  • JSON output only (no human-readable formatting)
  • Error messages are minimal (improvements planned for beta)

GitHub Issues

  • Issue #18: Server signal handling limitation (known, will fix in 2.0.0-rc)
  • Issue #24: Lock cleanup command (planned for future release)

Development Status

Version Roadmap

  • 2.0.0-alpha ← You are here (JSON API core complete)
  • 2.0.0-beta: 6-8 weeks robust testing, production validation
  • 2.0.0-rc: Server/run features, full 1.x parity
  • 2.0.0-stable: Community validated, enterprise ready

Architecture Decisions

  • JSON-First: All output structured for scripting and automation
  • Cache Safety: Respects HuggingFace standards, no custom formats
  • Atomic Operations: Clean separation between test and production contexts
  • Backward Compatibility: Parallel deployment with 1.x maintained

Contributing

This branch follows the established MLX-Knife development patterns:

# Run quality checks
./test-multi-python.sh       # Tests across Python 3.9-3.13
./run_linting.sh             # Code quality validation

# Key files:
mlxk2/                       # 2.0.0 implementation
tests_2.0/                   # Alpha test suite  
docs/ADR/                    # Architecture decision records

See CONTRIBUTING.md for detailed guidelines.

Support & Feedback

For production use: Consider MLX-Knife 1.1.0 until 2.0.0-beta is available.

Alpha Testing Goals

  • Validate JSON API specification matches implementation
  • Real-world integration feedback from external projects
  • Edge case discovery through broke-cluster usage
  • API stability testing before beta release

MLX-Knife 2.0.0-alpha - Built for automation, tested for reliability, designed for the future.

Local Safety Setup (Optional)

To keep local coordination files out of Git and avoid accidental pushes during development:

  • Ignore locally (branch-independent): add to .git/info/exclude
    • AGENTS.md
    • CLAUDE.md
  • Local hooks (not versioned):
    • .git/hooks/pre-commit blocks commits including AGENTS.md/CLAUDE.md.
    • .git/hooks/pre-push blocks pushes of the current branch. Override once with ALLOW_PUSH=1 git push.

Minimal pre-commit example:

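#!/usr/bin/env bash
set -euo pipefail
# Names of all staged files, one per line
staged=$(git diff --cached --name-only || true)
for f in AGENTS.md CLAUDE.md; do
  # -F: literal match, -x: whole line. Using 'if' keeps a non-match from becoming the hook's exit status.
  if grep -qxF -- "$f" <<<"$staged"; then
    echo "Commit blocked: $f" >&2
    exit 1
  fi
done
exit 0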

Minimal pre-push example:

#!/usr/bin/env bash
set -euo pipefail
[ "${ALLOW_PUSH:-}" = "1" ] && exit 0
br=$(git rev-parse --abbrev-ref HEAD)
while read -r l _ r _; do [ "$l" = "refs/heads/$br" ] && { echo "Push blocked: $br" >&2; exit 1; }; done
exit 0