- Add recommended beta installation section with direct GitHub release URL
- Simplify user onboarding for v2.0.0-beta.3 testing
- Reorganize Quick Start with clear Development vs Release paths
Major Features Added:
• Complete run command implementation with interactive/single-shot modes
• MLXRunner core engine ported from 1.x with modular architecture
• OpenAI-compatible server with SIGINT-robust supervisor mode
• Experimental push feature properly isolated behind environment variable
Key Improvements:
- Full feature parity with the 1.1.1 stable release
- Enhanced human output formatting across all commands
- Clean separation of stable (184 tests) vs experimental features
- Updated demo GIF showcasing improved 2.0 interface
Fixes:
- Pull operation cache pollution (Issue #30) with preflight access checks
- Test stability improvements across all environments
Architecture:
- Modular runner design with focused helper modules
- Thread-safe model loading and memory management
- Stable testing across Python 3.9-3.13
Ready for use as comprehensive 1.x alternative.
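One common way to get the thread-safe model loading listed under Architecture is to serialize the check-then-load step behind a lock. The sketch below is only an illustration under that assumption: `ModelCache` and its `loader` callable are hypothetical names, not MLX Knife's actual classes.

```python
import threading


class ModelCache:
    """Hypothetical cache; illustrative only, not MLX Knife's real API."""

    def __init__(self, loader):
        self._loader = loader          # callable: model name -> loaded model
        self._lock = threading.Lock()
        self._models = {}

    def get(self, name):
        # Hold the lock across the whole check-then-load sequence so that
        # concurrent callers never load the same model twice.
        with self._lock:
            if name not in self._models:
                self._models[name] = self._loader(name)
            return self._models[name]
```

A single coarse lock keeps the sketch simple; it also means unrelated models load one at a time, which may or may not match the real design.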
- Detect MLX/chat via README front‑matter + tokenizer; unify list/show; human list filters aligned (Refs #31)
- Push: create missing branch with --create and retry once on “Invalid rev id”; tolerate missing branches offline; no‑op still creates branch with --create
- Tests: add offline retry test; detection/human coverage; live list (opt‑in); 98/98 passing
- Docs/Meta: CHANGELOG/TESTING/README/SECURITY/CLAUDE updated; hard split 1.x from this branch; Apache‑2.0 + NOTICE
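The retry-once behavior described for Push can be sketched roughly as follows. This is a minimal illustration, not the project's code: `upload` and `create_branch` are hypothetical callables standing in for the real Hub operations, and the use of `RuntimeError` as the error type is an assumption. The offline and no-op cases mentioned above are not modeled.

```python
def push_with_retry(upload, create_branch, branch, create=False):
    """Upload to `branch`, creating it and retrying once if it is missing.

    `upload` and `create_branch` are hypothetical callables that take a
    branch name; they stand in for the real Hub calls.
    """
    try:
        return upload(branch)
    except RuntimeError as exc:
        # Retry only for the missing-branch error, and only when the
        # caller asked for branch creation (the --create flag).
        if not create or "Invalid rev id" not in str(exc):
            raise
    create_branch(branch)
    # Single retry: a second failure propagates to the caller.
    return upload(branch)
```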
Major Achievements:
- Live reproduction and documentation of Issue #27 (health check false positive)
- Comprehensive development README.md for alpha phase parallel usage
- JSON API specification integration and references
- 45/45 tests passing with production-quality reliability
Issue #27 Critical Discovery:
- Health check false positives for multi-part model downloads
- Root cause: Multi-part pattern detection flaw in shared logic
- GitHub issue created with reproduction steps and technical analysis
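To make the failure mode concrete: a health check for multi-part downloads has to confirm that every shard implied by a file name is present. The sketch below assumes the common `name-00001-of-00003.ext` shard naming; `missing_shards` is a hypothetical helper, not the shared logic referenced in Issue #27.

```python
import re
from collections import defaultdict

# Assumed shard naming: <stem>-<index>-of-<total><ext>, five digits each.
SHARD_RE = re.compile(
    r"^(?P<stem>.+)-(?P<idx>\d{5})-of-(?P<total>\d{5})(?P<ext>\.\w+)$"
)


def missing_shards(filenames):
    """Return the sorted names of shards that are implied but absent."""
    present = defaultdict(set)
    for name in filenames:
        m = SHARD_RE.match(name)
        if m:
            key = (m["stem"], int(m["total"]), m["ext"])
            present[key].add(int(m["idx"]))

    missing = []
    for (stem, total, ext), seen in present.items():
        for idx in range(1, total + 1):
            if idx not in seen:
                missing.append(f"{stem}-{idx:05d}-of-{total:05d}{ext}")
    return sorted(missing)
```

A model would only count as healthy when this list is empty; files that do not follow the shard pattern are ignored here.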
2.0.0-Alpha Development Status:
- Revolutionary test isolation architecture complete
- Atomic cache system with triple safety verification
- Development handbook with parallel deployment guide
- Ready for production testing and broke-cluster integration
Complete isolated test system with 150/150 tests passing.
Production-ready after successful beta testing cycle.
See CHANGELOG.md for comprehensive details including:
- All critical issues from 1.1.0-beta3 resolved
- Enhanced test infrastructure with real model validation
- Multi-Python compatibility (3.9-3.13)
Three major bug fixes for production readiness:
- Issue #21: Fix crash on fresh installations (empty cache directory)
- Issue #22: Suppress urllib3 LibreSSL warnings on macOS Python 3.9
- Issue #23: Fix double rm execution bug - models now deleted in single command
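For the urllib3 warning in Issue #22, one possible approach is a message-based filter from the standard `warnings` module, installed before urllib3 is first imported. This is a sketch under the assumption that the warning text mentions "LibreSSL"; the function name is hypothetical and the project's actual fix may differ.

```python
import warnings


def suppress_libressl_warning():
    """Install a filter for warnings whose text mentions LibreSSL.

    Assumption: the urllib3 warning seen on macOS Python 3.9 includes the
    string "LibreSSL" in its message. Call this before importing urllib3,
    because the warning is emitted at import time.
    """
    warnings.filterwarnings("ignore", message=r".*LibreSSL")
```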
Test improvements:
- 140/140 tests passing (up from 137)
- Added real integration tests for lock cleanup
- Fixed unit tests broken by path corrections
All known cache management issues resolved for stable release.
Issues Resolved:
• Issue #15: Token limits vs natural stop tokens race condition - FIXED
• Issue #16: Interactive vs server token limit policies - FIXED
Major Improvements:
• Automatic optimal token limits - no configuration needed
• Manual --max-tokens control still available when desired
• Eliminates old hardcoded 500/2000 token restrictions
• Performance gains: Up to 524x improvement for large context models
• Enhanced web client with model capabilities display and better UX
Additional Enhancements:
• Enhanced /v1/models API with context_length field
• Comprehensive test expansion: 114 → 131 tests (131/131 passing)
• Python 3.9-3.13 compatibility verified
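As an illustration of the `/v1/models` change, the sketch below builds an OpenAI-style model list with an added `context_length` per entry. Only the `context_length` field name comes from the notes above; the helper name and the rest of the payload shape are assumptions, and the real server may include more fields.

```python
def models_response(models):
    """Build a model-list payload.

    `models` maps a model id to its context length (None when unknown).
    """
    return {
        "object": "list",
        "data": [
            {"id": model_id, "object": "model", "context_length": ctx}
            for model_id, ctx in models.items()
        ],
    }
```

A client could read `context_length` from each entry to choose a sensible token limit without extra configuration.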
Known Issues (Beta Status):
• Server deadlock possible under extreme concurrent model loading stress
• Workaround: Avoid simultaneous heavy model operations
Community-driven feature development implementing key GitHub Issues:
- Fix Issue #7: Health check consistency for fuzzy model names - unified health check logic ensures
identical status regardless of name format (Phi-3 vs mlx-community/Phi-3-mini-4k-instruct-4bit)
- Add Issue #6: Repository name validation - pre-validation for HuggingFace Hub 96-character limit with
clear error messages
- Add Issue #13: Hash-based disambiguation - use commit hashes to resolve ambiguous model names (mlxk show
Llama@de2dfaf5 → mlx-community/Llama-3.3-70B-Instruct-4bit)
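The pre-validation from Issue #6 might look roughly like this. It is a minimal sketch: `validate_repo_name` is a hypothetical name, and it assumes the 96-character limit applies to the string exactly as passed in.

```python
MAX_REPO_NAME_LEN = 96  # limit cited in the entry above


def validate_repo_name(name):
    """Return `name` unchanged, or raise ValueError with a clear message."""
    if not name:
        raise ValueError("Repository name must not be empty")
    if len(name) > MAX_REPO_NAME_LEN:
        raise ValueError(
            f"Repository name is {len(name)} characters long; "
            f"the limit is {MAX_REPO_NAME_LEN}"
        )
    return name
```

Running this before any network call lets the command fail fast with a readable message.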
Enhanced user experience:
- Pure local hash resolution without external API calls, offline-capable
- Improved model name disambiguation logic for better workflow
- Real user workflow support - see hashes in mlxk list, use directly in other commands
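Local hash resolution of a spec such as `Llama@de2dfaf5` can be sketched as a lookup over data already in the cache, with no network access. The function name and the `(repo_id, commit_hash)` pair format are assumptions made for illustration, not the project's actual data model.

```python
def resolve_model(spec, cached):
    """Resolve 'name' or 'name@hashprefix' to exactly one cached repo id.

    `cached` is an iterable of (repo_id, commit_hash) pairs read from the
    local cache (hypothetical format).
    """
    name, _, short_hash = spec.partition("@")
    matches = [
        repo
        for repo, commit in cached
        if name.lower() in repo.lower()
        and (not short_hash or commit.startswith(short_hash))
    ]
    if len(matches) != 1:
        raise LookupError(f"{spec!r} matched {len(matches)} cached models")
    return matches[0]
```

Without a hash, a partial name that matches several models raises an error; adding the short hash narrows the match to one.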
Major milestone: First stable release with official PyPI distribution.
New Features:
- PyPI publication: Now installable via `pip install mlx-knife`
- Official CLI-only designation with clear API policy
- Absolute GitHub URLs for PyPI package display (logo + demo)
Documentation Updates:
- All docs updated to v1.0.0 (README, CHANGELOG, TESTING, SECURITY, CLAUDE.md)
- Added PyPI installation instructions to README
- Updated supported versions tables
- Clarified CLI-only usage policy
Release Highlights:
- Transition from 1.0-rc3 to stable 1.0.0
- Production-ready with 104/104 tests passing
- Global accessibility via PyPI distribution
- Comprehensive documentation overhaul
Ready for community adoption and production use.
- Fix #1: Partial name filtering for `mlxk list` command
- Fix #2: Fuzzy matching for single-model commands
- Fix #3: Default behavior for `mlxk health` (no --all flag required)
- Expanded test suite to 104/104 tests passing
**Key Improvements**
- Robust exception handling during model loading with guaranteed cleanup
- Protection against nested context manager usage in MLXRunner
- Safe cleanup that handles partial loading failures gracefully
- Exception-resilient cache clearing operations
- Safe tokenizer attribute access with proper defaults
- Graceful memory statistics handling when metrics unavailable
- Comprehensive unit test coverage for memory management edge cases
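The first three improvements above (guaranteed cleanup, a guard against nested use, and tolerance of partial loading failures) fit a familiar context-manager pattern. The sketch below uses a hypothetical `Runner` class with injected `load` and `unload` callables; it is not the real MLXRunner.

```python
class Runner:
    """Hypothetical stand-in for a model runner; not the real MLXRunner."""

    def __init__(self, load, unload):
        self._load, self._unload = load, unload
        self._active = False
        self.model = None

    def __enter__(self):
        if self._active:
            # Reject nested `with` blocks on the same instance.
            raise RuntimeError("Runner is already in use as a context manager")
        self._active = True
        try:
            self.model = self._load()
        except Exception:
            self._cleanup()  # release anything a partial load left behind
            raise
        return self

    def __exit__(self, exc_type, exc, tb):
        self._cleanup()
        return False  # never swallow the caller's exception

    def _cleanup(self):
        try:
            self._unload(self.model)
        except Exception:
            pass  # cleanup must not mask the original error
        finally:
            self.model = None
            self._active = False
```

Typical use would be `with Runner(load, unload) as r: ...`, after which `r.model` is `None` whether or not the body raised.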
**Changes**
- Updated version to 1.0-rc2 across all documentation files
- Enhanced MLXRunner context manager with bulletproof exception safety
- Added comprehensive unit tests for memory management scenarios
- Improved error handling for partial model loading failures
- Updated test coverage documentation (96/96 tests passing)
- Refined README to focus on key features rather than test metrics
This release focuses on production-ready memory management and exception
safety, making MLX Knife more robust for real-world usage scenarios.