mlx-knife

mirror of https://github.com/cloudstack-llc/mlx-knife.git synced 2026-07-01 20:44:14 -04:00

Author	SHA1	Message	Date
The BROKE Cluster Team	86f669dc82	Release 2.0.4-beta.1: Vision + Pipes + Memory - Vision Support (Issue #45): CLI + Server with OpenAI-compatible image API, EXIF metadata - Unix Pipes (ADR-014): stdin support, isatty detection, SIGPIPE handling - Memory-Aware Loading (ADR-016): Pre-load checks with >70% RAM warnings - Python 3.9-3.14: Full compatibility verified (476-485 tests passing) - Fixed: --log-json regression (Issue #44), Vision multimodal history filtering See CHANGELOG.md for complete details.	2025-12-16 19:35:30 +01:00
The BROKE Cluster Team	05f1c30486	Release 2.0.3: Foundation for pipes Foundation release for Unix pipe integration with stderr separation, benchmark infrastructure, and reasoning control improvements. Breaking Changes: - stdout/stderr separation (Issue #43) - errors to stderr in human mode - JSON mode unchanged (all output to stdout) Features: - Benchmark reporting infrastructure (ADR-013 Phase 0) - --no-reasoning flag (Issue #40 partial - GPT-OSS/QwQ only) - Interactive mode reasoning control (review_report.md fixes) Bug Fixes: - huggingface-hub 1.x incompatibility (critical dependency fix) - Streaming parity tests refactored (Portfolio Discovery) Testing: - 308 tests passing (Python 3.9-3.13) - 35 skipped (opt-in live tests) - 79/91 E2E tests passing with HF_HOME See CHANGELOG.md for complete details and migration guide.	2025-11-17 22:54:06 +01:00
The BROKE Cluster Team	d32d3185dd	Release 2.0.2: Test infrastructure hardening & empirical validation Stable release completing Issue #32 recovery plan - all tests passing. Bug Fixes: - Test collection regression (E2E suite parametrization) - Stop token ordering (batch + streaming modes) - E2E test temperature flakiness (deterministic sampling) - Web API framework detection (PR #42 by @limey, fixes #41) - E2E test marker fix (show_model_portfolio diagnostics) Architecture: - mlx-lm API evaluation: Keep manual text-based implementation - Stop token workarounds: All 3 validated (Phi-3, DeepSeek-R1, GPT-oss) Testing: - Portfolio Discovery: 73/81 tests, 17 models, 0 failures - E2E infrastructure hardened (TOKENIZERS, polling, gc.collect()) - Multi-Python validation: 3.9-3.13 passing Documentation: - ADR-009 Outstanding Work completed + Implementation Plan removed - TESTING-DETAILS.md: Portfolio Discovery + E2E Architecture updated - CHANGELOG.md: Complete 2.0.2 stable release notes	2025-11-15 22:10:08 +01:00
The BROKE Cluster Team	21cf188fcc	Release 2.0.1: Portfolio Discovery + CLI Exit Code Fixes Issue #32: Stop token Portfolio Discovery validates generic fix across all models - Auto-discovers MLX chat models in HF_HOME with 4-filter validation - RAM-aware testing (40-70% budgets) prevents OOM - Empirical report generation (stop_token_config_report.json) - Fallback to 3 predefined models without HF_HOME - Implementation: tests_2.0/test_stop_tokens_live.py (~110 LOC) Issue #38: CLI exit codes now propagate run command errors correctly - Both text and JSON modes return exit code 1 on model execution failures - Fixed: run_model() now returns error strings in both modes - Implementation: mlxk2/operations/run.py + mlxk2/cli.py error detection - New tests: tests_2.0/test_cli_run_exit_codes.py (9 comprehensive tests) Testing: 306 passed, 20 skipped (zero regressions) Docs: Updated README, TESTING, SECURITY for 2.0.1 stable release Version: 2.0.0 → 2.0.1 (mlxk2/__init__.py)	2025-11-08 20:28:54 +01:00
The BROKE Cluster Team	a2ca3da2d9	Release 2.0.0: Full rewrite with Apache 2.0 license MLX Knife 2.0 replaces 1.x as the primary version. Highlights: - Full 1.x feature parity (list, show, pull, rm, run, server, health) - JSON API for automation (--json flag) - Enhanced error handling and logging - Runtime compatibility checks - Improved stop token detection - License: MIT→Apache 2.0 Breaking changes: - mlxk rm: requires --force flag for models with active locks Migration guide: MIGRATION.md Changelog: CHANGELOG.md Testing: 297/317 tests passed, Python 3.9-3.13 verified Merge branch 'feature/2.0.0-alpha.1'	2025-11-06 16:00:35 +01:00
The BROKE Cluster Team	ae80bbe554	Release 2.0.0: Package rename, Apache 2.0 license, documentation updates	2025-11-06 15:21:10 +01:00
The BROKE Cluster Team	fb54f59cd4	Release 2.0.0-beta.6: Stop token & compatibility bug fixes Fixes Issue #32 (generic multi-EOS detection) and Issue #37 (model detection) - Generic stop token detection: Multi-EOS models (MXFP4, Qwen, Llama) now use eos_token_ids Set instead of model-specific workarounds - Private/org MLX model detection: `mlxk run` now works outside `mlx-community/*` namespace - Commit-pinned compatibility checks: Models with `@commit_hash` validated before inference - Packaging dependencies: Fixed `pip install -e .` requirements - ADR-009: Stop Token Detection Fix (generic approach + test strategy) - ADR-011: E2E Live Test Architecture (planned) See CHANGELOG.md and TESTING.md for details.	2025-10-24 15:46:42 +02:00
The BROKE Cluster Team	f5fe1dd061	Release 2.0.0-beta.5: Enhanced error handling & bug fixes Features: - Enhanced error handling & logging (ADR-004): Unified error envelope, structured logging with JSON support, request correlation - Legacy format detection (Issue #37): Runtime compatibility check for weight file formats Bug Fixes: - Issue #37: Models with legacy weight formats now correctly detected as runtime-incompatible - CLI regression fix: mlxk2 without arguments shows help instead of JSON error Test Status: 295/295 passed, 14 skipped	2025-10-21 00:24:47 +02:00
The BROKE Cluster Team	4b75a22726	Release 2.0.0-beta.4: Runtime compatibility check (Issue #36) - JSON API 0.1.5: runtime_compatible + reason fields - mlx-lm dependency updated to >=0.28.3 (stable PyPI release) - Human output: healthy / healthy* / unhealthy status display - All tests passing (253 passed, 12 skipped) across Python 3.9-3.13	2025-10-18 16:06:58 +02:00
The BROKE Cluster Team	d07bf66990	minor changes of badges	2025-10-01 12:09:22 +02:00
The BROKE Cluster Team	58facfb079	markdown formatting issues in README.md corrected	2025-10-01 11:47:45 +02:00
The BROKE Cluster Team	316c2f5585	README: Add beta release installation instructions - Add recommended beta installation section with direct GitHub release URL - Simplifies user onboarding for v2.0.0-beta.3 testing - Reorganized Quick Start with clear Development vs Release paths	2025-09-21 13:59:00 +02:00
mzfive	4a8bcba9b4	Update README.md: linked to updated gif Demo	2025-09-18 14:19:27 +02:00
The BROKE Cluster Team	9261bc0c4e	2.0.0-beta.3: Feature Complete - Clone Implementation & Issue Resolution - Clone Feature (Issue #29): Complete workspace-based workflow with ADR-007 - Pull Preflight (Issue #30): Prevents cache pollution from gated/private repos - Lenient MLX Detection (Issue #31): Framework detection beyond mlx-community - Multi-shard Health (Issue #27): Strict completeness validation - Full JSON API 0.1.4: Complete schema for all 10 commands - Test Suite: 254/254 passed, comprehensive validation See CHANGELOG.md and TESTING.md for technical implementation details.	2025-09-18 14:09:32 +02:00
The BROKE Cluster Team	6279b82be8	Release: MLX Knife 1.1.1 - Stable Release Promote 1.1.1-beta.3 to stable with metadata updates: ✅ Version: 1.1.1b3 → 1.1.1 (stable) ✅ PyPI classifier: Development Status 4 Beta → 5 Stable ✅ Documentation: Updated to reflect stable release ✅ Sponsor link: Fixed tileslauncher → tileshq ✅ Security policy: Support 1.1.1 + 2.0.0-beta.3 only No functional changes - same MXFP4 + GPT-OSS features as beta.3	2025-09-14 18:52:25 +02:00
The BROKE Cluster Team	57bf6d86be	2.0.0-beta.3: Feature Complete - Full 1.1.1 Parity Achieved Major Features Added: • Complete run command implementation with interactive/single-shot modes • MLXRunner core engine ported from 1.x with modular architecture • OpenAI-compatible server with SIGINT-robust supervisor mode • Experimental push feature properly isolated behind environment variable Key Improvements: - Full feature parity with 1.1.1 stable releases - Enhanced human output formatting across all commands - Clean separation of stable (184 tests) vs experimental features - Updated demo GIF showcasing improved 2.0 interface Fixes: - Pull operation cache pollution (Issue #30) with preflight access checks - Test stability improvements across all environments Architecture: - Modular runner design with focused helper modules - Thread-safe model loading and memory management - stable testing across Python 3.9-3.13 Ready for use as comprehensive 1.x alternative.	2025-09-14 18:04:18 +02:00
The BROKE Cluster Team	ce46601d9d	Release: 1.1.1-beta.3 - MXFP4 support and GPT-OSS reasoning • MXFP4 Quantization Support (MLX ≥0.29.0, MLX-LM ≥0.27.0) • GPT-OSS Reasoning Models with --hide-reasoning flag • Enhanced Show Command with improved quantization display • Documentation updates (README.md, TESTING.md) See CHANGELOG.md for complete technical details. Partial Issue #32 (GPT-OSS only, other reasoning models remain open).	2025-09-10 13:32:04 +02:00
The BROKE Cluster Team	3f57248121	2.0.0-alpha.3: lenient MLX detection + push branch handling - Detect MLX/chat via README front‑matter + tokenizer; unify list/show; human list filters aligned (Refs #31) - Push: create missing branch with --create and retry once on “Invalid rev id”; tolerate missing branches offline; no‑op still creates branch with --create - Tests: add offline retry test; detection/human coverage; live list (opt‑in); 98/98 passing - Docs/Meta: CHANGELOG/TESTING/README/SECURITY/CLAUDE updated; hard split 1.x from this branch; Apache‑2.0 + NOTICE	2025-09-08 01:14:01 +02:00
Local Test	5045f9e1bd	Release: 1.1.1‑b2 — Issue #31 , stable server tests - Lenient MLX detection via README/tokenizer (Issue #31) - CLI: `show` type, strict `list` (chat), `run` accepts private MLX - Server tests: RAM‑aware gating with `mlxk show`, MoE parsing fix (8x7B), server‑manager process guard, thread‑based timeout - Multi‑Python script hardened; no ANSI; log tail on errors - Docs: CHANGELOG, TESTING, CLAUDE updated; 166/166 green (Py 3.9–3.13), 32 server tests green	2025-09-07 02:24:08 +02:00
Local Test	eedb91b75c	Feat: add experimental push (2.0.0-alpha.2) - Push (upload-only): quiet JSON by default; capture hub logs in data.hf_logs - No-op detection aligned to hub signal; clear commit fields; uploaded_files_count=0 - Add --dry-run (plan vs remote) and --check-only (offline preflight); merge .hfignore; extend default ignores - Human output: concise; --verbose shows commit URL; JSON shape unchanged - Tests: add offline dry-run cases; live push remains opt-in (wet/live_push) - Docs: README push section updated; TESTING.md reference + mini-matrix; - Changelog: add 2.0.0-alpha.2; note Issue #31 under 1.1.1 pending - Spec: keep schema stable (0.1.3); CLI/version docs consistent	2025-09-05 22:42:39 +02:00
Local Test	b9db12ae89	Fix: strict multi-shard health checks (#27 ) — backport to 1.x; bump 1.1.1b1 Summary - Backports the 2.0 strict health policy to 1.x and ships as a pre-release. Health changes (cache_utils.py) - Index-aware: validate safetensors and PyTorch indexes; all referenced shards must exist, be >0B, and not be Git LFS pointers. - Pattern policy: shard patterns (model-XXXXX-of-YYYYY.) without an index → unhealthy (parity with 2.0). - Partial/tmp markers: any “.partial”, “.tmp”, or names containing “partial” anywhere under the snapshot → unhealthy. - LFS scan: recursive detection of Git LFS pointer files (<200B with LFS header). - Single-file fallback: non-empty .safetensors/.bin/.gguf (no pattern shards) remain healthy. - Ergonomics: is_model_healthy() accepts direct snapshot paths; check_lfs_corruption() scans recursively. Tests - Add tests/unit/test_health_multishard.py covering: - index complete → healthy; missing/empty shard → unhealthy; LFS pointer → unhealthy - pattern shards (no index) → unhealthy - partial marker → unhealthy - PyTorch index parity (complete → healthy) - single-file safetensors/gguf → healthy Docs - CHANGELOG.md: add 1.1.1-beta.1 with detailed rules; note GitHub tag vs PyPI mapping (1.1.1-beta.1 ↔ 1.1.1b1). - README.md: tests badge 160/160; pre-release note for 1.1.1b1. - TESTING.md: status 160/160; update test structure (add test_health_multishard.py; remove 2.0 note). Version - version = 1.1.1b1 (PEP 440 pre-release); VERSION = (1, 1, 1). Behavioral impact - Health reporting is stricter (cannot regress functionality): incomplete multi-shard downloads correctly report unhealthy. No changes to pull/run/server behavior. Validation - Python 3.9 local: 160 passed, 36 deselected; warnings eliminated on 3.9/3.10 under project defaults. - New multishard tests pass; manual spot-checks show expected unhealthy→healthy transitions as downloads complete. Release - GitHub: tag v1.1.1-beta.1 (Pre-release). 
- PyPI: upload 1.1.1b1 (install via pip install --pre mlx-knife).	2025-09-01 01:26:27 +02:00
Local Test	19a66674c0	2.0.0-alpha.1: human output default; strict health (#27, PyTorch index). See CHANGELOG.md and README.md	2025-08-31 22:25:43 +02:00
Local Test	f511dd9c74	Docs: add sponsor section + badge; add FUNDING; fix footer to 1.1.0	2025-08-29 19:58:32 +02:00
Local Test	de7ccf9018	2.0.0-alpha: default 2.0 tests, cache safety, and docs Testing: - pytest defaults to tests_2.0 via pytest.ini - README/TESTING updated; Quick Start uses `pip install -e . && pip install pytest` Safety: - Add test-cache sentinel + centralized checks - Strict delete guard via MLXK2_STRICT_TEST_DELETE=1 - Hide sentinel from 2.0 list output Portability: - Remove site-specific paths; generic test/user cache detection (mlxk2_test_ prefix + sentinel) Docs: - Environment & Caches, HF cache integrity - Local-only hooks/excludes and local test script (excluded from VCS)	2025-08-29 16:57:45 +02:00
The BROKE Team	d375e1bd3e	MLX-Knife 2.0.0-alpha: Issue #27 Discovery & Development README Major Achievements: - Live reproduction and documentation of Issue #27 (health check false positive) - Comprehensive development README.md for alpha phase parallel usage - JSON API specification integration and references - 45/45 tests passing with production-quality reliability Issue #27 Critical Discovery: - Health check false positives for multi-part model downloads - Root cause: Multi-part pattern detection flaw in shared logic - GitHub issue created with reproduction steps and technical analysis 2.0.0-Alpha Development Status: - Revolutionary test isolation architecture complete - Atomic cache system with triple safety verification - Development handbook with parallel deployment guide - Ready for production testing and broke-cluster integration	2025-08-28 23:49:14 +02:00
The BROKE Team	cf169e28ad	Release MLX Knife 1.1.0 - Stable Release Complete isolated test system with 150/150 tests passing. Production-ready after successful beta testing cycle. See CHANGELOG.md for comprehensive details including: - All critical issues from 1.1.0-beta3 resolved - Enhanced test infrastructure with real model validation - Multi-Python compatibility (3.9-3.13)	2025-08-26 16:30:12 +02:00
The BROKE Team	7d0d6be66d	Release MLX Knife 1.1.0-beta3 - Critical Cache Management Fixes Three major bug fixes for production readiness: - Issue #21: Fix crash on fresh installations (empty cache directory) - Issue #22: Suppress urllib3 LibreSSL warnings on macOS Python 3.9 - Issue #23: Fix double rm execution bug - models now deleted in single command Test improvements: - 140/140 tests passing (up from 137) - Added real integration tests for lock cleanup - Fixed unit tests broken by path corrections All known cache management issues resolved for stable release.	2025-08-26 01:46:40 +02:00
The BROKE Team	1aad374d08	Release MLX Knife 1.1.0-beta2 - Critical Bug Fixes & Test Stability Major fixes: - Issue #19: Server response truncation resolved - large context models work at full capacity - Issue #20: End-Token filtering in non-streaming mode - clean professional output - Test stability: Fixed flaky server tests, improved lifecycle management Technical changes: - Server: Dynamic token limits by default (--max-tokens None) - MLXRunner: Added _filter_end_tokens_from_response() for batch consistency - Tests: 132/132 passing + 48 comprehensive server tests - Documentation: Updated CHANGELOG.md, README.md, TESTING.md	2025-08-22 23:16:50 +02:00
The BROKE Team	74239c4e43	Release MLX Knife 1.1.0-beta1 - Dynamic Token Limits & Enhanced Web Client Issues Resolved: • Issue #15: Token limits vs natural stop tokens race condition - FIXED • Issue #16: Interactive vs server token limit policies - FIXED Major Improvements: • Automatic optimal token limits - no configuration needed • Manual --max-tokens control still available when desired • Eliminates old hardcoded 500/2000 token restrictions • Performance gains: Up to 524x improvement for large context models • Enhanced web client with model capabilities display and better UX Additional Enhancements: • Enhanced /v1/models API with context_length field • Comprehensive test expansion: 114 → 131 tests (131/131 passing) • Python 3.9-3.13 compatibility verified Known Issues (Beta Status): • Server deadlock possible under extreme concurrent model loading stress • Workaround: Avoid simultaneous heavy model operations	2025-08-21 17:36:44 +02:00
The BROKE Team	6117e571ca	Release MLX Knife 1.0.4 - Issue #14 Chat Self-Conversation Fix & Web UI Overhaul Fix Issue #14: Interactive chat self-conversation bug resolved - Added context-sensitive chat stop tokens (\nHuman:, \nAssistant:, \nYou:, \nUser:) - Smart priority system: native model stop tokens first, chat tokens as fallback - Affects both `mlxk run` and `mlxk server` modes with backward compatibility Web UI complete transformation (simple_chat.html): - 🦫 Beaver branding replaces 🔪 knife emoji - Model and chat history persistence across browser sessions - Smart model switching with option to keep or clear chat history Testing infrastructure enhancements: - Automated server testing with RAM-aware model filtering - 15 new regression tests across 7+ MLX models validating Issue #14 fix - Comprehensive TESTING.md guide for server-based testing All 114 tests passing	2025-08-19 20:43:44 +02:00
The BROKE Team	1f70b4984a	Release MLX Knife 1.0.3 - GitHub Issues Implementation & Enhanced User Experience Community-driven feature development implementing key GitHub Issues: - Fix Issue #7: Health check consistency for fuzzy model names - unified health check logic ensures identical status regardless of name format (Phi-3 vs mlx-community/Phi-3-mini-4k-instruct-4bit) - Add Issue #6: Repository name validation - pre-validation for HuggingFace Hub 96-character limit with clear error messages - Add Issue #13: Hash-based disambiguation - use commit hashes to resolve ambiguous model names (mlxk show Llama@de2dfaf5 → mlx-community/Llama-3.3-70B-Instruct-4bit) Enhanced user experience: - Pure local hash resolution without external API calls, offline-capable - Improved model name disambiguation logic for better workflow - Real user workflow support - see hashes in mlxk list, use directly in other commands	2025-08-18 20:21:43 +02:00
The BROKE Team	cbd25c658d	Release MLX Knife 1.0.2 - HF_HOME Cache Consistency & Corruption Fixes Major bug fixes addressing cache path inconsistencies and silent failures: - Fix Issue #11: HF_HOME environment variable handling - unified cache logic ensures consistent /hub subdirectory usage - Fix Issue #9: Silent failure on corrupted models with empty snapshots directories - Enhanced download throttling with adaptive delays (512KB chunks, 2-3s for large files) - Added migration warnings for legacy cache locations with clear user guidance - Improved corruption detection and deletion workflow consistency Technical improvements: - Unified cache architecture: CACHE_ROOT/hub for both default and HF_HOME scenarios - Exception-safe memory management with enhanced baseline tracking - Updated dependencies to latest tested versions (Python 3.9-3.13 support) - All 105 tests passing with real MLX model verification	2025-08-18 14:02:30 +02:00
The BROKE Team	8b0db287e4	Release MLX Knife 1.0.1 - Update PyPI Package Description	2025-08-15 23:32:58 +02:00
The BROKE Team	29548fe3f1	Release MLX Knife 1.0.0 - Stable Release with PyPI Publication Major milestone: First stable release with official PyPI distribution. New Features: - PyPI publication: Now installable via `pip install mlx-knife` - Official CLI-only designation with clear API policy - Absolute GitHub URLs for PyPI package display (logo + demo) Documentation Updates: - All docs updated to v1.0.0 (README, CHANGELOG, TESTING, SECURITY, CLAUDE.md) - Added PyPI installation instructions to README - Updated supported versions tables - Clarified CLI-only usage policy Release Highlights: - Transition from 1.0-rc3 to stable 1.0.0 - Production-ready with 104/104 tests passing - Global accessibility via PyPI distribution - Comprehensive documentation overhaul Ready for community adoption and production use.	2025-08-15 13:57:26 +02:00
mzfive	8e65ef03c7	Update README.md broke-cluster hint Broke Cluster project link in README.md footer section	2025-08-14 17:20:24 +02:00
mzfive	47af2c7096	Release MLX Knife 1.0-rc3 Resolves GitHub Issues #1, #2, #3: - Fix #1: Partial name filtering for `mlxk list` command - Fix #2: Fuzzy matching for single-model commands - Fix #3: Default behavior for `mlxk health` (no --all flag required) - Expanded test suite to 104/104 tests passing	2025-08-14 14:06:26 +02:00
mzfive	01229cb6ef	Release MLX Knife 1.0-rc2: Enhanced Memory Management & Exception Safety Key Improvements - Robust exception handling during model loading with guaranteed cleanup - Protection against nested context manager usage in MLXRunner - Safe cleanup that handles partial loading failures gracefully - Exception-resilient cache clearing operations - Safe tokenizer attribute access with proper defaults - Graceful memory statistics handling when metrics unavailable - Comprehensive unit test coverage for memory management edge cases Changes - Updated version to 1.0-rc2 across all documentation files - Enhanced MLXRunner context manager with bulletproof exception safety - Added comprehensive unit tests for memory management scenarios - Improved error handling for partial model loading failures - Updated test coverage documentation (96/96 tests passing) - Refined README to focus on key features rather than test metrics This release focuses on production-ready memory management and exception safety, making MLX Knife more robust for real-world usage scenarios.	2025-08-13 20:52:34 +02:00
mzfive	b927fa1e33	Update documentation and remove GitHub Actions - no testing on github possible - Remove .github/workflows/tests.yml (local testing only) - Update CONTRIBUTING.md with current development workflow - Refine README.md for 1.0-rc1 release readiness - Update TESTING.md with comprehensive testing guide	2025-08-13 16:14:15 +02:00
mzfive	6b04448d1c	Initial commit: MLX Knife 1.0-rc1	2025-08-12 23:00:55 +02:00

39 Commits