mirror of https://github.com/cloudstack-llc/mlx-knife.git synced 2026-07-01 20:44:14 -04:00

The BROKE Cluster Team 57bf6d86be 2.0.0-beta.3: Feature Complete - Full 1.1.1 Parity Achieved

Major Features Added:
  • Complete run command implementation with interactive/single-shot modes
  • MLXRunner core engine ported from 1.x with modular architecture
  • OpenAI-compatible server with SIGINT-robust supervisor mode
  • Experimental push feature properly isolated behind environment variable

  Key Improvements:
  - Full feature parity with 1.1.1 stable releases
  - Enhanced human output formatting across all commands
  - Clean separation of stable (184 tests) vs experimental features
  - Updated demo GIF showcasing improved 2.0 interface

  Fixes:
  - Pull operation cache pollution (Issue #30) with preflight access checks
  - Test stability improvements across all environments

  Architecture:
  - Modular runner design with focused helper modules
  - Thread-safe model loading and memory management
  - Stable testing across Python 3.9-3.13

  Ready for use as comprehensive 1.x alternative.

2025-09-14 18:04:18 +02:00


MLX-Knife 2.0 Versioning Strategy

Document Status: Living (pre-release)
Purpose: Principles for versioning and deploying MLX-Knife 2.0 until the stable release

Versioning Schema

2.0.0-alpha (Feature-Complete for JSON-Only)

Scope: Core JSON operations without server/run functionality

Features:

✅ All 5 Operations: list, health, show, pull, rm
✅ JSON API fully implemented per specification
✅ Core functionality working (broke-cluster compatible)
⚠️ Experimental features MAY be present; they MUST be clearly labeled "experimental", safe by default, and must not break existing behavior
❌ Testing at pre-release level only (not yet production-grade)
❌ No server or run commands

Quality Gate (alpha):

Core operations functional in isolation
JSON schema stable and documented
Basic edge case handling

Target Users:

Broke-cluster integration (POC environment)
Early adopters for JSON automation
Parallel deployment alongside 1.x

2.0.0-beta (Feature-Complete with Server/Run)

Scope: All alpha features PLUS server/run functionality from 1.x

New in Beta:

✅ server command - OpenAI-compatible API from 1.x
✅ run command - Interactive model execution from 1.x
✅ Reasoning model support (GPT-OSS/MXFP4)
✅ Human output backend (already in alpha.3)
✅ 100% test coverage including server/run tests

Version Strategy:

2.0.0-beta.1-local - Initial server/run port (git tag only)
2.0.0-beta.2-local - Full reasoning support (git tag only)
2.0.0-beta.3 - First public beta release (PyPI)
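
The tag naming used above can be ordered and filtered mechanically. The following is a minimal sketch, not part of MLX-Knife: the grammar `X.Y.Z[-stage[.N][-local]]` is inferred from the tags listed in this document, and `parse_tag` and `is_public` are hypothetical helper names. An untagged `X.Y.Z` is assumed to rank as stable.

```python
import re
from typing import NamedTuple, Tuple

# Stage order from this document's schema: alpha < beta < rc < stable.
STAGE_ORDER = {"alpha": 0, "beta": 1, "rc": 2, "stable": 3}

# Assumed tag grammar (inferred, not an official spec):
#   X.Y.Z[-stage[.N][-local]]
_TAG_RE = re.compile(
    r"^(\d+)\.(\d+)\.(\d+)"
    r"(?:-(alpha|beta|rc|stable)(?:\.(\d+))?(-local)?)?$"
)


class Tag(NamedTuple):
    release: Tuple[int, int, int]  # (major, minor, patch)
    stage: int                     # index into STAGE_ORDER
    serial: int                    # N in "beta.N"; 0 when absent
    local_only: bool               # True for "-local" tags (git tag only)


def parse_tag(tag: str) -> Tag:
    """Parse a version tag into a tuple that sorts in release order."""
    match = _TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"unrecognized version tag: {tag!r}")
    major, minor, patch, stage, serial, local = match.groups()
    return Tag(
        release=(int(major), int(minor), int(patch)),
        stage=STAGE_ORDER[stage or "stable"],  # assumption: bare X.Y.Z is stable
        serial=int(serial or 0),
        local_only=local is not None,
    )


def is_public(tag: str) -> bool:
    """Tags with a "-local" suffix are git-only and not published."""
    return not parse_tag(tag).local_only


tags = [
    "2.0.0-rc",
    "2.0.0-beta.3",
    "2.0.0-beta.1-local",
    "2.0.0-stable",
    "2.0.0-alpha.3",
]
ordered = sorted(tags, key=parse_tag)
```

Sorting by the parsed tuple places `2.0.0-beta.1-local` before `2.0.0-beta.3`, and `is_public` separates git-only tags from those intended for PyPI.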

Quality Gate:

Feature parity with 1.1.1-beta.3
All server/run tests passing
Reasoning models working
Documentation complete

Target Users:

Internal testing and validation
Beta.3: Public beta testers
Users seeking full MLX-Knife functionality

2.0.0-rc (Feature-Complete vs 1.x)

Scope: Full feature parity with MLX-Knife 1.x

New Features:

✅ server command - OpenAI-compatible API server
✅ run command - Interactive model execution
✅ embed command - Embedding generation (if merged from 1.x)
✅ Human-readable output via CLI layer formatting

Quality Gate:

All 1.x functionality replicated
Migration path documented
Performance parity or better
Server functionality validated

Target Users:

Full 1.x replacement candidates
Users requiring both JSON and human output
Server-mode applications

2.0.0-stable

Scope: Production-ready replacement for MLX-Knife 1.x

Requirements:

✅ All RC features stable and documented
✅ Migration guide with examples
✅ Community feedback incorporated
✅ Long-term support commitment
✅ Package management (pip/brew) ready

Target Users:

All MLX-Knife users
General availability deployment

Deployment Strategy

Broke-Cluster POC Environment

Parallel Deployment Architecture:

# System-wide: MLX-Knife 1.1.0 (stable server functionality)
pip install mlx-knife==1.1.0

# Local development: MLX-Knife 2.0.0-alpha (JSON management)
pip install -e /path/to/mlx-knife  # Local install (current 2.0 feature branch)
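
When two versions are installed side by side, it can help to confirm which distributions the active Python environment actually sees. This is a rough sketch using the standard library's `importlib.metadata`; the distribution name `mlxk2` for the local 2.0 install is an assumption based on the naming section of this document, and `installed_version` and `describe` are hypothetical helpers.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def describe(dist_name: str) -> str:
    """One-line summary suitable for a deployment check script."""
    version = installed_version(dist_name)
    return f"{dist_name}: {version or 'not installed'}"


if __name__ == "__main__":
    # "mlx-knife" is the 1.x package name; "mlxk2" is an assumed name
    # for the editable 2.0 install and may differ in practice.
    for name in ("mlx-knife", "mlxk2"):
        print(describe(name))
```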

Usage Pattern:

# Server operations: Use 1.x (stable, proven)
mlxk server --model "Phi-3-mini" --port 8000

# Management operations: Use 2.0.0-alpha (JSON automation)
mlxk-json list --json | jq '.data.models[].name'
mlxk-json health --json | jq '.data.summary'
mlxk-json pull "new-model" --json
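
For automation that outgrows `jq`, the same extraction can be done in Python. The sketch below assumes only what the filters above imply, namely a top-level `data` object containing a `models` array whose entries have a `name` field. The sample payload is hypothetical and real output may carry additional fields; `model_names` and `list_models` are illustrative helper names.

```python
import json
import subprocess
from typing import List

# Hypothetical sample shaped after the jq filter '.data.models[].name'.
SAMPLE_LIST_OUTPUT = (
    '{"data": {"models": [{"name": "Phi-3-mini"}, {"name": "new-model"}]}}'
)


def model_names(raw: str) -> List[str]:
    """Extract model names from the JSON text of a `list --json` call."""
    payload = json.loads(raw)
    models = payload.get("data", {}).get("models", [])
    return [entry["name"] for entry in models if "name" in entry]


def list_models() -> List[str]:
    """Run the CLI and parse its output (requires mlxk-json on PATH)."""
    result = subprocess.run(
        ["mlxk-json", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return model_names(result.stdout)


names = model_names(SAMPLE_LIST_OUTPUT)
```

Keeping the parsing in `model_names` separate from the subprocess call makes it testable without a model cache present.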

Benefits:

✅ Risk mitigation: Server stability maintained with 1.x
✅ Feature validation: JSON API tested in production environment
✅ Gradual migration: Teams can adopt 2.0 features incrementally
✅ Rollback safety: Can disable 2.0 without affecting server operations

Package Naming Strategy

Development Phase:

mlx-knife (1.1.0) - Stable production version
mlxk2 / mlxk-json - Development 2.0.0-alpha local install (single long‑lived 2.0 branch; releases via annotated tags)

Production Phase:

mlx-knife (2.0.0+) - New major version
mlx-knife-v1 (1.1.0) - Legacy support if needed

Quality Gates Summary

Version	Test Coverage	Features	Server Mode	Production Ready
alpha	~70% (mock issues)	JSON-only (5 ops)	❌	Limited
beta	100%	JSON (5 ops) + server/run	✅	Yes (JSON)
rc	100%	Full parity	✅	Yes (All)
stable	100% + community	Full parity	✅	Yes (LTS)
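
The coverage column of this table could be enforced in a release script. The following is a minimal sketch under stated assumptions: the alpha threshold treats "~70%" as exactly 70, the "+ community" requirement for stable is not modeled, and `meets_coverage_gate` is a hypothetical helper rather than part of the project's tooling.

```python
# Minimum test coverage per stage, taken from the table above.
# Assumption: "~70%" for alpha is treated as a hard 70.0 threshold.
MIN_COVERAGE = {
    "alpha": 70.0,
    "beta": 100.0,
    "rc": 100.0,
    "stable": 100.0,
}


def meets_coverage_gate(stage: str, coverage_pct: float) -> bool:
    """Return True if measured coverage satisfies the gate for a stage."""
    if stage not in MIN_COVERAGE:
        raise ValueError(f"unknown release stage: {stage!r}")
    return coverage_pct >= MIN_COVERAGE[stage]


alpha_ok = meets_coverage_gate("alpha", 72.5)
beta_ok = meets_coverage_gate("beta", 99.9)
```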

Success Metrics

Alpha Success Criteria

Broke-cluster integration working
Core JSON operations stable
No user cache corruption in testing
JSON schema documentation complete

Beta Success Criteria

100% test pass rate
Performance benchmarks established
All ADR-002 edge cases handled
Production deployment successful

RC Success Criteria

Feature parity with 1.x achieved
Migration guide validated
Server mode performance acceptable
Community feedback positive

Stable Success Criteria

6+ months beta stability
Multiple production deployments
Documentation comprehensive
Long-term support plan

Timeline Estimates

Current Status: Active alpha cycle with tagged pre‑releases

JSON CLI stable for broke‑cluster use

Projected Milestones:

2.0.0-alpha: rolling alphas (tagged), experimental features allowed but clearly marked and safe by default
2.0.0-beta: 4-6 weeks (robust testing)
2.0.0-rc: 8-12 weeks (server/run implementation)
2.0.0-stable: 16-20 weeks (community validation)

Risk Mitigation

HuggingFace Cache Compatibility (CRITICAL)

Apple MLX Team & HuggingFace Hub Integration:

At least 20 MLX ecosystem users depend on cache stability
HuggingFace Hub attention: changes are monitored by upstream
Cache structure: MLX-Knife follows HuggingFace standards

Cache Safety Guidelines:

### Shared Cache Environment Best Practices
- **Read operations** (`list`, `health`, `show`): Always safe with concurrent processes
- **Write operations** (`pull`, `rm`): Coordinate with team during maintenance windows
- **Lock cleanup**: Automatic in MLX-Knife, avoid during active HuggingFace downloads
- **User responsibility**: Coordinate cache access, no special flags needed
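
The read/write split in these guidelines can be captured in a small guard for wrapper scripts. This is only a sketch: the guidelines state that no special flags are needed and that coordination is the user's responsibility, so the `in_maintenance_window` parameter and the `may_run` helper are hypothetical conveniences, not MLX-Knife features.

```python
# Operation classes as listed in the guidelines above.
READ_OPS = frozenset({"list", "health", "show"})
WRITE_OPS = frozenset({"pull", "rm"})


def may_run(op: str, in_maintenance_window: bool) -> bool:
    """Decide whether an operation is appropriate on a shared cache.

    Reads are always allowed; writes only when the team has agreed on a
    maintenance window (how that is determined is up to the caller).
    """
    if op in READ_OPS:
        return True
    if op in WRITE_OPS:
        return in_maintenance_window
    raise ValueError(f"unknown operation: {op!r}")


list_allowed = may_run("list", in_maintenance_window=False)
pull_allowed = may_run("pull", in_maintenance_window=False)
```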

Parallel Deployment Risks

Configuration conflicts: Different cache paths, environment variables
User confusion: Clear naming and documentation required
Maintenance burden: Supporting two codebases temporarily

Mitigation Strategies

Clear separation: Different package names, installation paths
Comprehensive docs: Usage examples, best practices, cache guidelines
Automated testing: Both versions in CI/CD pipeline
Community support: Active communication about roadmap

Decision Authority

Architecture Decisions: Development team consensus required
Version Releases: Lead maintainer approval + community review
Breaking Changes: Major version bump + migration period
Support Policy: LTS for stable versions, best-effort for pre-release

This versioning strategy provides a clear path from current alpha-quality code to production-ready 2.0.0 while maintaining stability through parallel deployment with 1.x versions.
