51 Commits

Author SHA1 Message Date
The BROKE Cluster Team d4cd89fab0 Release 2.0.4 stable - see CHANGELOG.md for details 2026-02-11 15:05:09 +01:00
The BROKE Cluster Team 7f10187bee fix: Runtime gates + unit tests; benchmark GPU analysis
Core:
- Run preflight passes probe/framework to audio_runtime_compatibility
- STT model_type gate extended (vibevoice, audio)
- MLX 0.30.x compat: catch Exception in whisper_tokenizer
- Embedding-gate unit tests (3 tests)
- Removed get_encoding duplication (-45 LOC)

Benchmark:
- GPU analysis section in reports
2026-02-07 23:38:34 +01:00
The BROKE Cluster Team e021fb32cd Release 2.0.4-beta.10: Audio PyPI fix (tiktoken workaround complete)
Audio/Whisper works with pip install - no Git workaround needed.
See CHANGELOG.md for details.

Tested: 647 passed, 11 skipped (Python 3.10-3.12)
2026-02-05 10:42:50 +01:00
The BROKE Cluster Team 69a3c19c16 Release 2.0.4-beta.9: PyPI-ready with tiktoken workaround
Changes:
- Bundle tiktoken assets locally (mlxk2/assets/whisper/*.tiktoken)
- Monkey-patch mlx-audio to use bundled assets at import time
- Update pyproject.toml: mlx-audio>=0.3.1 from PyPI (no Git install)
- Simplify README: 3 clear installation methods
- Add NOTICE file for tiktoken asset attribution

Fixes: mlx-audio Issue #479 (missing tiktoken assets in PyPI wheels)
Workaround: Temporary until upstream fix available
2026-02-04 14:41:52 +01:00
The BROKE Cluster Team bf7480d042 Release 2.0.4-beta.9: Audio transcription via mlx-audio
Major Features:
- Audio transcription via mlx-audio backend (Whisper, >10min duration)
- OpenAI /v1/audio/transcriptions endpoint
- Memory Gate System (Vision: 8GB, Audio: 4GB)
- Config-based backend routing (ADR-020)
- Benchmark toolchain (memmon/memplot, Schema v0.2.2)

Key Fixes:
- EuroLLM tokenizer decoding
- Vision-model text-only routing regression
- Multimodal model context length detection
- Memory cleanup bug (mx.metal.clear_cache)
- Orphan process bug

Test Results:
- Unit tests: 647 passed, 11 skipped (Python 3.10-3.12)
- wet-umbrella: 171 passed total

See CHANGELOG.md for complete details and known issues.
2026-02-04 03:10:30 +01:00
The BROKE Cluster Team e8b10ea10b Release 2.0.4-beta.8: Audio transcription support (experimental)
Audio input via --audio flag (CLI) and input_audio content type (Server API).
Uses mlx-vlm native audio processing. ~30s duration limit (model constraint).
Currently only Gemma-3n tested (requires --repair-index fix).

Also includes:
- SERVER-HANDBOOK compliance (image limits, validation error envelopes)
- Dependency updates: mlx>=0.30.0, mlx-lm>=0.30.0, huggingface-hub>=1.0.0
- Audio E2E test suite + ADR-019
2026-01-23 20:20:59 +01:00
The BROKE Cluster Team 5751545b8b Release 2.0.4-beta.7: Server robustness + Vision per-chunk streaming
- Server: exit codes, /v1/models crash fix, vision routing, MLXK2_MAX_TOKENS
- Vision: true SSE streaming, hallucination fix (local numbering)
- Workspace: list prefix-match, push ambiguous pattern handling
- Docs: SERVER-HANDBOOK accuracy updates

See CHANGELOG.md for details.
2026-01-18 16:57:32 +01:00
The BROKE Cluster Team 53d9cca82d Release 2.0.4-beta.6: Local workspace workflow + Vision batch processing
- Complete local development cycle: clone → repair → run/show/server on
  workspace paths without HuggingFace round-trips
- Vision processing now defaults to safe chunking (one image at a time,
  prevents OOM + hallucination)
- Resumable clone with --force-resume and deterministic temp cache naming
- Improved test infrastructure (umbrella marker convention)
- 161 Wet Umbrella tests passing including new Vision→Geo pipe integration tests

See CHANGELOG.md for complete details.
2026-01-07 17:11:07 +01:00
The BROKE Cluster Team 25609e4dcb Release 2.0.4-beta.5: Community repair tool + OS-agnostic benchmarking
Closes #49 (Mistral Tokenizer Bug)

Major features:
- Workspace Infrastructure (ADR-018 Phase 0a): Managed workspace detection,
  provenance metadata, backward compatible with unmanaged workspaces
- Convert Operation (ADR-018 Phase 1): `mlxk convert --repair-index` fixes
  mlx-vlm #624 affected models (7+ models including Qwen2.5-VL, gemma-3)
- Resumable Pull: Auto-detect partial downloads with `--force-resume`
- Wet Umbrella Test Integration: Single entry point for all real model tests

Fixes:
- #49: BPE space markers now correctly converted (Mistral-family models)
- Vision Portfolio Discovery: Filter by capabilities instead of model_type
- Memory Cleanup Hook: Triggers for both live_e2e and wet markers

Test suite: 528 passed, 60 skipped (Python 3.9-3.14)
2025-12-31 16:05:18 +01:00
The BROKE Cluster Team c51ca1b10e Release 2.0.4-beta.4: Pixtral pad_token fix (mlx-vlm upstream)
Fixed Pixtral text-only regression via upstream mlx-vlm fix.
Updated dependency to Blaizzy/mlx-vlm@c536165d (merged PR #643).
Will switch to PyPI v0.3.10 when released.

Test suite: 144 passed, 21 skipped.
2025-12-25 20:17:51 +01:00
The BROKE Cluster Team d3f7d091bc Release 2.0.4-beta.3: Dependency compatibility + Documentation
Bugfixes and compatibility improvements. No new features.

Core fixes:
- Framework detection for web API models (Issue #48)
- Video-only model filtering from vision capability
- Page size detection for memory metrics (macOS)
- Model switch log timing (after load completion)

Compatibility:
- hub 1.x + transformers 5.0 support
- Python 3.9-3.14 verified (494 tests passing)

Testing infrastructure:
- Benchmark schema v0.2.0 (hardware profiling, system health)
- Benchmark template v1.0 (automated JSONL→Markdown reports)
- Memory timeline visualization (memplot.py)
- Unified model filter (build_model_object single source)

Documentation:
- Multi-Modal Support section in README (Vision subsection)
- JSON API 0.1.5-0.1.6 marked Stable
- Vision promoted from alpha to beta status
- Removed conceptual drift and outdated references

See CHANGELOG.md for complete details.
2025-12-23 12:19:04 +01:00
The BROKE Cluster Team f9e40c1720 Release 2.0.4-beta.2: PyPI compatibility + improved documentation
- Fix: Use PyPI mlx-vlm>=0.3.9 instead of Git URL (PyPI requirement)
- Doc: Add Vision installation instructions to README Installation section
- Doc: Clarify Python version requirements (Text: 3.9-3.14, Vision: 3.10-3.14)
- Doc: Update all version references (2.0.4-beta.1 → 2.0.4-beta.2)
- Version: 2.0.4b1 → 2.0.4b2
2025-12-16 21:10:16 +01:00
The BROKE Cluster Team 86f669dc82 Release 2.0.4-beta.1: Vision + Pipes + Memory
- Vision Support (Issue #45): CLI + Server with OpenAI-compatible image API, EXIF metadata
- Unix Pipes (ADR-014): stdin support, isatty detection, SIGPIPE handling
- Memory-Aware Loading (ADR-016): Pre-load checks with >70% RAM warnings
- Python 3.9-3.14: Full compatibility verified (476-485 tests passing)
- Fixed: --log-json regression (Issue #44), Vision multimodal history filtering
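
The memory-aware loading item above mentions a pre-load check that warns above 70% RAM. A minimal sketch of such a check follows, assuming the ratio compared against the threshold is model size over total system RAM; the function name `preload_warning` and the message text are hypothetical and not taken from the mlxk2 code.

```python
from typing import Optional


def preload_warning(model_bytes: int, total_ram_bytes: int,
                    threshold: float = 0.70) -> Optional[str]:
    """Return a warning string if the model would exceed `threshold` of RAM.

    Assumption: the 70% figure is compared against model size / total RAM.
    """
    if total_ram_bytes <= 0:
        raise ValueError("total_ram_bytes must be positive")
    ratio = model_bytes / total_ram_bytes
    if ratio > threshold:
        return (f"warning: model needs {ratio:.0%} of system RAM "
                f"(threshold {threshold:.0%})")
    return None


GIB = 2 ** 30
over = preload_warning(12 * GIB, 16 * GIB)   # 75% of RAM, above threshold
under = preload_warning(8 * GIB, 16 * GIB)   # 50% of RAM, below threshold
```

Keeping the check a pure function of two byte counts means it can be unit-tested without loading a model or querying the host.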

See CHANGELOG.md for complete details.
2025-12-16 19:35:30 +01:00
The BROKE Cluster Team 05f1c30486 Release 2.0.3: Foundation for pipes
Foundation release for Unix pipe integration with stderr separation,
benchmark infrastructure, and reasoning control improvements.

Breaking Changes:
- stdout/stderr separation (Issue #43) - errors to stderr in human mode
- JSON mode unchanged (all output to stdout)

Features:
- Benchmark reporting infrastructure (ADR-013 Phase 0)
- --no-reasoning flag (Issue #40 partial - GPT-OSS/QwQ only)
- Interactive mode reasoning control (review_report.md fixes)

Bug Fixes:
- huggingface-hub 1.x incompatibility (critical dependency fix)
- Streaming parity tests refactored (Portfolio Discovery)

Testing:
- 308 tests passing (Python 3.9-3.13)
- 35 skipped (opt-in live tests)
- 79/91 E2E tests passing with HF_HOME

See CHANGELOG.md for complete details and migration guide.
2025-11-17 22:54:06 +01:00
The BROKE Cluster Team d32d3185dd Release 2.0.2: Test infrastructure hardening & empirical validation
Stable release completing Issue #32 recovery plan - all tests passing.

Bug Fixes:
- Test collection regression (E2E suite parametrization)
- Stop token ordering (batch + streaming modes)
- E2E test temperature flakiness (deterministic sampling)
- Web API framework detection (PR #42 by @limey, fixes #41)
- E2E test marker fix (show_model_portfolio diagnostics)

Architecture:
- mlx-lm API evaluation: Keep manual text-based implementation
- Stop token workarounds: All 3 validated (Phi-3, DeepSeek-R1, GPT-oss)

Testing:
- Portfolio Discovery: 73/81 tests, 17 models, 0 failures
- E2E infrastructure hardened (TOKENIZERS, polling, gc.collect())
- Multi-Python validation: 3.9-3.13 passing

Documentation:
- ADR-009 Outstanding Work completed + Implementation Plan removed
- TESTING-DETAILS.md: Portfolio Discovery + E2E Architecture updated
- CHANGELOG.md: Complete 2.0.2 stable release notes
2025-11-15 22:10:08 +01:00
The BROKE Cluster Team 21cf188fcc Release 2.0.1: Portfolio Discovery + CLI Exit Code Fixes
Issue #32: Stop token Portfolio Discovery validates generic fix across all models
- Auto-discovers MLX chat models in HF_HOME with 4-filter validation
- RAM-aware testing (40-70% budgets) prevents OOM
- Empirical report generation (stop_token_config_report.json)
- Fallback to 3 predefined models without HF_HOME
- Implementation: tests_2.0/test_stop_tokens_live.py (~110 LOC)

Issue #38: CLI exit codes now propagate run command errors correctly
- Both text and JSON modes return exit code 1 on model execution failures
- Fixed: run_model() now returns error strings in both modes
- Implementation: mlxk2/operations/run.py + mlxk2/cli.py error detection
- New tests: tests_2.0/test_cli_run_exit_codes.py (9 comprehensive tests)

Testing: 306 passed, 20 skipped (zero regressions)
Docs: Updated README, TESTING, SECURITY for 2.0.1 stable release
Version: 2.0.0 → 2.0.1 (mlxk2/__init__.py)
2025-11-08 20:28:54 +01:00
The BROKE Cluster Team a2ca3da2d9 Release 2.0.0: Full rewrite with Apache 2.0 license
MLX Knife 2.0 replaces 1.x as the primary version.

Highlights:
- Full 1.x feature parity (list, show, pull, rm, run, server, health)
- JSON API for automation (--json flag)
- Enhanced error handling and logging
- Runtime compatibility checks
- Improved stop token detection
- License: MIT→Apache 2.0

Breaking changes:
- mlxk rm: requires --force flag for models with active locks

Migration guide: MIGRATION.md
Changelog: CHANGELOG.md
Testing: 297/317 tests passed, Python 3.9-3.13 verified

Merge branch 'feature/2.0.0-alpha.1'
2025-11-06 16:00:35 +01:00
The BROKE Cluster Team ae80bbe554 Release 2.0.0: Package rename, Apache 2.0 license, documentation updates 2025-11-06 15:21:10 +01:00
The BROKE Cluster Team fb54f59cd4 Release 2.0.0-beta.6: Stop token & compatibility bug fixes
Fixes Issue #32 (generic multi-EOS detection) and Issue #37 (model detection)

  - Generic stop token detection: Multi-EOS models (MXFP4, Qwen, Llama) now use eos_token_ids Set instead of
  model-specific workarounds
  - Private/org MLX model detection: `mlxk run` now works outside `mlx-community/*` namespace
  - Commit-pinned compatibility checks: Models with `@commit_hash` validated before inference
  - Packaging dependencies: Fixed `pip install -e .` requirements
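
  The generic stop token detection listed above replaces per-model workarounds with a set lookup. A minimal sketch of that idea, assuming token IDs arrive as a plain iterable; the function name and the example token IDs are hypothetical:

```python
from typing import Iterable, Iterator, Set


def generate_until_eos(token_ids: Iterable[int],
                       eos_token_ids: Set[int]) -> Iterator[int]:
    """Yield tokens until any of the model's EOS token IDs is seen."""
    for tok in token_ids:
        if tok in eos_token_ids:   # O(1) membership test, any EOS ends output
            return
        yield tok


# Hypothetical stream and EOS IDs, for illustration only.
stream = [11, 42, 7, 200002, 99]
kept = list(generate_until_eos(stream, {200002, 200007}))
```

  A set handles models with one EOS token and models with several in the same code path, which is what makes the approach generic.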

  - ADR-009: Stop Token Detection Fix (generic approach + test strategy)
  - ADR-011: E2E Live Test Architecture (planned)

See CHANGELOG.md and TESTING.md for details.
2025-10-24 15:46:42 +02:00
The BROKE Cluster Team f5fe1dd061 Release 2.0.0-beta.5: Enhanced error handling & bug fixes
Features:
- Enhanced error handling & logging (ADR-004): Unified error envelope, structured logging with JSON support, request correlation
- Legacy format detection (Issue #37): Runtime compatibility check for weight file formats

Bug Fixes:
- Issue #37: Models with legacy weight formats now correctly detected as runtime-incompatible
- CLI regression fix: mlxk2 without arguments shows help instead of JSON error

Test Status: 295/295 passed, 14 skipped
2025-10-21 00:24:47 +02:00
The BROKE Cluster Team 4b75a22726 Release 2.0.0-beta.4: Runtime compatibility check (Issue #36)
- JSON API 0.1.5: runtime_compatible + reason fields

- mlx-lm dependency updated to >=0.28.3 (stable PyPI release)

- Human output: healthy / healthy* / unhealthy status display

- All tests passing (253 passed, 12 skipped) across Python 3.9-3.13
2025-10-18 16:06:58 +02:00
The BROKE Cluster Team d07bf66990 minor changes of badges 2025-10-01 12:09:22 +02:00
The BROKE Cluster Team 58facfb079 markdown formatting issues in README.md corrected 2025-10-01 11:47:45 +02:00
The BROKE Cluster Team 316c2f5585 README: Add beta release installation instructions
- Add recommended beta installation section with direct GitHub release URL
- Simplifies user onboarding for v2.0.0-beta.3 testing
- Reorganized Quick Start with clear Development vs Release paths
2025-09-21 13:59:00 +02:00
mzfive 4a8bcba9b4 Update README.md: linked to updated gif Demo 2025-09-18 14:19:27 +02:00
The BROKE Cluster Team 9261bc0c4e 2.0.0-beta.3: Feature Complete - Clone Implementation & Issue Resolution
- Clone Feature (Issue #29): Complete workspace-based workflow with ADR-007
  - Pull Preflight (Issue #30): Prevents cache pollution from gated/private repos
  - Lenient MLX Detection (Issue #31): Framework detection beyond mlx-community
  - Multi-shard Health (Issue #27): Strict completeness validation
  - Full JSON API 0.1.4: Complete schema for all 10 commands
  - Test Suite: 254/254 passed, comprehensive validation

  See CHANGELOG.md and TESTING.md for technical implementation details.
2025-09-18 14:09:32 +02:00
The BROKE Cluster Team 6279b82be8 Release: MLX Knife 1.1.1 - Stable Release
Promote 1.1.1-beta.3 to stable with metadata updates:

   Version: 1.1.1b3 → 1.1.1 (stable)
   PyPI classifier: Development Status 4 Beta → 5 Stable
   Documentation: Updated to reflect stable release
   Sponsor link: Fixed tileslauncher → tileshq
   Security policy: Support 1.1.1 + 2.0.0-beta.3 only

  No functional changes - same MXFP4 + GPT-OSS features as beta.3
2025-09-14 18:52:25 +02:00
The BROKE Cluster Team 57bf6d86be 2.0.0-beta.3: Feature Complete - Full 1.1.1 Parity Achieved
Major Features Added:
  • Complete run command implementation with interactive/single-shot modes
  • MLXRunner core engine ported from 1.x with modular architecture
  • OpenAI-compatible server with SIGINT-robust supervisor mode
  • Experimental push feature properly isolated behind environment variable

  Key Improvements:
  - Full feature parity with 1.1.1 stable releases
  - Enhanced human output formatting across all commands
  - Clean separation of stable (184 tests) vs experimental features
  - Updated demo GIF showcasing improved 2.0 interface

  Fixes:
  - Pull operation cache pollution (Issue #30) with preflight access checks
  - Test stability improvements across all environments

  Architecture:
  - Modular runner design with focused helper modules
  - Thread-safe model loading and memory management
  - stable testing across Python 3.9-3.13

  Ready for use as comprehensive 1.x alternative.
2025-09-14 18:04:18 +02:00
The BROKE Cluster Team ce46601d9d Release: 1.1.1-beta.3 - MXFP4 support and GPT-OSS reasoning
• MXFP4 Quantization Support (MLX ≥0.29.0, MLX-LM ≥0.27.0)
• GPT-OSS Reasoning Models with --hide-reasoning flag
• Enhanced Show Command with improved quantization display
• Documentation updates (README.md, TESTING.md)

See CHANGELOG.md for complete technical details.
Partial Issue #32 (GPT-OSS only, other reasoning models remain open).
2025-09-10 13:32:04 +02:00
The BROKE Cluster Team 3f57248121 2.0.0-alpha.3: lenient MLX detection + push branch handling
- Detect MLX/chat via README front‑matter + tokenizer; unify list/show; human list filters aligned (Refs #31)
  - Push: create missing branch with --create and retry once on “Invalid rev id”; tolerate missing branches
  offline; no‑op still creates branch with --create
  - Tests: add offline retry test; detection/human coverage; live list (opt‑in); 98/98 passing
  - Docs/Meta: CHANGELOG/TESTING/README/SECURITY/CLAUDE updated; hard split 1.x from this branch; Apache‑2.0 + NOTICE
2025-09-08 01:14:01 +02:00
Local Test 5045f9e1bd Release: 1.1.1‑b2 — Issue #31, stable server tests
- Lenient MLX detection via README/tokenizer (Issue #31)
- CLI: `show` type, strict `list` (chat), `run` accepts private MLX
- Server tests: RAM‑aware gating with `mlxk show`, MoE parsing fix (8x7B), server‑manager process guard, thread‑based timeout
- Multi‑Python script hardened; no ANSI; log tail on errors
- Docs: CHANGELOG, TESTING, CLAUDE updated; 166/166 green (Py 3.9–3.13), 32 server tests green
2025-09-07 02:24:08 +02:00
Local Test eedb91b75c Feat: add experimental push (2.0.0-alpha.2)
- Push (upload-only): quiet JSON by default; capture hub logs in data.hf_logs
  - No-op detection aligned to hub signal; clear commit fields; uploaded_files_count=0
  - Add --dry-run (plan vs remote) and --check-only (offline preflight); merge .hfignore; extend
  default ignores
  - Human output: concise; --verbose shows commit URL; JSON shape unchanged
  - Tests: add offline dry-run cases; live push remains opt-in (wet/live_push)
  - Docs: README push section updated; TESTING.md reference + mini-matrix;
  - Changelog: add 2.0.0-alpha.2; note Issue #31 under 1.1.1 pending
  - Spec: keep schema stable (0.1.3); CLI/version docs consistent
2025-09-05 22:42:39 +02:00
Local Test b9db12ae89 Fix: strict multi-shard health checks (#27) — backport to 1.x; bump 1.1.1b1
Summary

- Backports the 2.0 strict health policy to 1.x and ships as a pre-release.

Health changes (cache_utils.py)

- Index-aware: validate safetensors and PyTorch indexes; all referenced shards must exist, be >0B, and not be Git LFS
pointers.
- Pattern policy: shard patterns (model-XXXXX-of-YYYYY.*) without an index → unhealthy (parity with 2.0).
- Partial/tmp markers: any “.partial”, “.tmp”, or names containing “partial” anywhere under the snapshot → unhealthy.
- LFS scan: recursive detection of Git LFS pointer files (<200B with LFS header).
- Single-file fallback: non-empty .safetensors/.bin/*.gguf (no pattern shards) remain healthy.
- Ergonomics: is_model_healthy() accepts direct snapshot paths; check_lfs_corruption() scans recursively.
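
The index-aware rule above can be sketched as follows. This is an illustration under stated assumptions, not the code in cache_utils.py: the helper names `is_lfs_pointer` and `index_is_healthy` are hypothetical, and only the safetensors index (with its `weight_map` of tensor name to shard file) is handled.

```python
import json
import tempfile
from pathlib import Path

LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """A file under 200 bytes that starts with the Git LFS pointer header."""
    return path.stat().st_size < 200 and path.read_bytes().startswith(LFS_HEADER)


def index_is_healthy(snapshot: Path,
                     index_name: str = "model.safetensors.index.json") -> bool:
    """Every shard named in the index must exist, be >0B, and not be a pointer."""
    index_path = snapshot / index_name
    if not index_path.is_file():
        return False
    weight_map = json.loads(index_path.read_text()).get("weight_map", {})
    shards = set(weight_map.values())
    if not shards:
        return False
    for name in shards:
        shard = snapshot / name
        if not shard.is_file() or shard.stat().st_size == 0 or is_lfs_pointer(shard):
            return False
    return True


# Small demonstration on a throwaway snapshot directory.
with tempfile.TemporaryDirectory() as tmp:
    snap = Path(tmp)
    index = {"weight_map": {"a": "model-00001-of-00002.safetensors",
                            "b": "model-00002-of-00002.safetensors"}}
    (snap / "model.safetensors.index.json").write_text(json.dumps(index))
    (snap / "model-00001-of-00002.safetensors").write_bytes(b"\x00" * 256)
    missing_shard = index_is_healthy(snap)   # second shard not downloaded yet
    (snap / "model-00002-of-00002.safetensors").write_bytes(LFS_HEADER + b"\n")
    lfs_pointer = index_is_healthy(snap)     # second shard is only a pointer
    (snap / "model-00002-of-00002.safetensors").write_bytes(b"\x00" * 256)
    complete = index_is_healthy(snap)        # both shards present and real
```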

Tests

- Add tests/unit/test_health_multishard.py covering:
    - index complete → healthy; missing/empty shard → unhealthy; LFS pointer → unhealthy
    - pattern shards (no index) → unhealthy
    - partial marker → unhealthy
    - PyTorch index parity (complete → healthy)
    - single-file safetensors/gguf → healthy

Docs

- CHANGELOG.md: add 1.1.1-beta.1 with detailed rules; note GitHub tag vs PyPI mapping (1.1.1-beta.1 ↔ 1.1.1b1).
- README.md: tests badge 160/160; pre-release note for 1.1.1b1.
- TESTING.md: status 160/160; update test structure (add test_health_multishard.py; remove 2.0 note).

Version

- version = 1.1.1b1 (PEP 440 pre-release); VERSION = (1, 1, 1).
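
The GitHub tag to PyPI version mapping noted above (1.1.1-beta.1 ↔ 1.1.1b1) can be sketched with a small converter. This is an illustrative helper, not part of the project; the name `tag_to_pep440` and the accepted tag shapes (`alpha`, `beta`, `rc` with a numeric suffix) are assumptions.

```python
import re

_TAG = re.compile(r"^v?(\d+\.\d+\.\d+)(?:-(alpha|beta|rc)\.(\d+))?$")
_PEP440_LABEL = {"alpha": "a", "beta": "b", "rc": "rc"}


def tag_to_pep440(tag: str) -> str:
    """Convert a tag such as 'v1.1.1-beta.1' to a PEP 440 version string."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError(f"unrecognized tag: {tag!r}")
    release, label, number = match.groups()
    if label is None:
        return release                      # final release, no pre-release part
    return f"{release}{_PEP440_LABEL[label]}{number}"


pre = tag_to_pep440("v1.1.1-beta.1")
final = tag_to_pep440("1.1.1")
```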

Behavioral impact

- Health reporting is stricter (cannot regress functionality): incomplete multi-shard downloads correctly report
unhealthy. No changes to pull/run/server behavior.

Validation

- Python 3.9 local: 160 passed, 36 deselected; warnings eliminated on 3.9/3.10 under project defaults.
- New multishard tests pass; manual spot-checks show expected unhealthy→healthy transitions as downloads complete.

Release

- GitHub: tag v1.1.1-beta.1 (Pre-release).
- PyPI: upload 1.1.1b1 (install via pip install --pre mlx-knife).
2025-09-01 01:26:27 +02:00
Local Test 19a66674c0 2.0.0-alpha.1: human output default; strict health (#27, PyTorch index)
See CHANGELOG.md and README.md
2025-08-31 22:25:43 +02:00
Local Test f511dd9c74 Docs: add sponsor section + badge; add FUNDING; fix footer to 1.1.0 2025-08-29 19:58:32 +02:00
Local Test de7ccf9018 2.0.0-alpha: default 2.0 tests, cache safety, and docs
Testing:
- pytest defaults to tests_2.0 via pytest.ini
- README/TESTING updated; Quick Start uses `pip install -e . && pip install pytest`

Safety:
- Add test-cache sentinel + centralized checks
- Strict delete guard via MLXK2_STRICT_TEST_DELETE=1
- Hide sentinel from 2.0 list output

Portability:
- Remove site-specific paths; generic test/user cache detection (mlxk2_test_ prefix + sentinel)

Docs:
- Environment & Caches, HF cache integrity
- Local-only hooks/excludes and local test script (excluded from VCS)
2025-08-29 16:57:45 +02:00
The BROKE Team d375e1bd3e MLX-Knife 2.0.0-alpha: Issue #27 Discovery & Development README
Major Achievements:
- Live reproduction and documentation of Issue #27 (health check false positive)
- Comprehensive development README.md for alpha phase parallel usage
- JSON API specification integration and references
- 45/45 tests passing with production-quality reliability

Issue #27 Critical Discovery:
- Health check false positives for multi-part model downloads
- Root cause: Multi-part pattern detection flaw in shared logic
- GitHub issue created with reproduction steps and technical analysis

2.0.0-Alpha Development Status:
- Revolutionary test isolation architecture complete
- Atomic cache system with triple safety verification
- Development handbook with parallel deployment guide
- Ready for production testing and broke-cluster integration
2025-08-28 23:49:14 +02:00
The BROKE Team cf169e28ad Release MLX Knife 1.1.0 - Stable Release
Complete isolated test system with 150/150 tests passing.
  Production-ready after successful beta testing cycle.

  See CHANGELOG.md for comprehensive details including:
  - All critical issues from 1.1.0-beta3 resolved
  - Enhanced test infrastructure with real model validation
  - Multi-Python compatibility (3.9-3.13)
2025-08-26 16:30:12 +02:00
The BROKE Team 7d0d6be66d Release MLX Knife 1.1.0-beta3 - Critical Cache Management Fixes
Three major bug fixes for production readiness:

- Issue #21: Fix crash on fresh installations (empty cache directory)
- Issue #22: Suppress urllib3 LibreSSL warnings on macOS Python 3.9
- Issue #23: Fix double rm execution bug - models now deleted in single command

Test improvements:
- 140/140 tests passing (up from 137)
- Added real integration tests for lock cleanup
- Fixed unit tests broken by path corrections

All known cache management issues resolved for stable release.
2025-08-26 01:46:40 +02:00
The BROKE Team 1aad374d08 Release MLX Knife 1.1.0-beta2 - Critical Bug Fixes & Test Stability
Major fixes:
  - Issue #19: Server response truncation resolved - large context models work at full capacity
  - Issue #20: End-Token filtering in non-streaming mode - clean professional output
  - Test stability: Fixed flaky server tests, improved lifecycle management

  Technical changes:
  - Server: Dynamic token limits by default (--max-tokens None)
  - MLXRunner: Added _filter_end_tokens_from_response() for batch consistency
  - Tests: 132/132 passing + 48 comprehensive server tests
  - Documentation: Updated CHANGELOG.md, README.md, TESTING.md
2025-08-22 23:16:50 +02:00
The BROKE Team 74239c4e43 Release MLX Knife 1.1.0-beta1 - Dynamic Token Limits & Enhanced Web Client
Issues Resolved:
  • Issue #15: Token limits vs natural stop tokens race condition - FIXED
  • Issue #16: Interactive vs server token limit policies - FIXED

  Major Improvements:
  • Automatic optimal token limits - no configuration needed
  • Manual --max-tokens control still available when desired
  • Eliminates old hardcoded 500/2000 token restrictions
  • Performance gains: Up to 524x improvement for large context models
  • Enhanced web client with model capabilities display and better UX

  Additional Enhancements:
  • Enhanced /v1/models API with context_length field
  • Comprehensive test expansion: 114 → 131 tests (131/131 passing)
  • Python 3.9-3.13 compatibility verified

  Known Issues (Beta Status):
  • Server deadlock possible under extreme concurrent model loading stress
  • Workaround: Avoid simultaneous heavy model operations
2025-08-21 17:36:44 +02:00
The BROKE Team 6117e571ca Release MLX Knife 1.0.4 - Issue #14 Chat Self-Conversation Fix & Web UI Overhaul
Fix Issue #14: Interactive chat self-conversation bug resolved
  - Added context-sensitive chat stop tokens (\nHuman:, \nAssistant:, \nYou:, \nUser:)
  - Smart priority system: native model stop tokens first, chat tokens as fallback
  - Affects both `mlxk run` and `mlxk server` modes with backward compatibility
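
  The priority rule above can be sketched as a two-stage truncation. This is one possible reading, assuming "native first, chat tokens as fallback" means chat tokens are only consulted when no native stop token occurs in the text; the function names are hypothetical, while the four chat tokens are the ones listed in this commit.

```python
from typing import Optional, Sequence

CHAT_STOP_TOKENS = ("\nHuman:", "\nAssistant:", "\nYou:", "\nUser:")


def _earliest(text: str, stops: Sequence[str]) -> Optional[int]:
    """Index of the first occurrence of any stop string, or None."""
    hits = [i for i in (text.find(s) for s in stops if s) if i != -1]
    return min(hits) if hits else None


def truncate_at_stop(text: str, native_stops: Sequence[str],
                     chat_stops: Sequence[str] = CHAT_STOP_TOKENS) -> str:
    """Cut at a native stop token if present, else at a chat stop token."""
    cut = _earliest(text, native_stops)
    if cut is None:                         # fallback only when no native hit
        cut = _earliest(text, chat_stops)
    return text if cut is None else text[:cut]


fallback = truncate_at_stop("Hi there.\nHuman: and you?", ["</s>"])
native = truncate_at_stop("Done</s>\nUser: next", ["</s>"])
```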

  Web UI complete transformation (simple_chat.html):
  - 🦫 Beaver branding replaces 🔪 knife emoji
  - Model and chat history persistence across browser sessions
  - Smart model switching with option to keep or clear chat history

  Testing infrastructure enhancements:
  - Automated server testing with RAM-aware model filtering
  - 15 new regression tests across 7+ MLX models validating Issue #14 fix
  - Comprehensive TESTING.md guide for server-based testing

  All 114 tests passing
2025-08-19 20:43:44 +02:00
The BROKE Team 1f70b4984a Release MLX Knife 1.0.3 - GitHub Issues Implementation & Enhanced User Experience
Community-driven feature development implementing key GitHub Issues:
    - Fix Issue #7: Health check consistency for fuzzy model names - unified health check logic ensures
  identical status regardless of name format (Phi-3 vs mlx-community/Phi-3-mini-4k-instruct-4bit)
    - Add Issue #6: Repository name validation - pre-validation for HuggingFace Hub 96-character limit with
  clear error messages
    - Add Issue #13: Hash-based disambiguation - use commit hashes to resolve ambiguous model names (mlxk show
  Llama@de2dfaf5 → mlx-community/Llama-3.3-70B-Instruct-4bit)
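
  The hash-based disambiguation above can be sketched as a purely local lookup. This is a sketch, not the mlxk implementation: the `resolve` function, the substring name match, and the padded commit hashes in the example cache are all assumptions for illustration.

```python
from typing import Dict, List


def resolve(spec: str, cached: Dict[str, str]) -> List[str]:
    """Resolve 'name' or 'name@hashprefix' against locally cached models.

    `cached` maps repository name to the full commit hash of its snapshot.
    """
    name, _, short_hash = spec.partition("@")
    matches = []
    for repo, commit in cached.items():
        if name.lower() not in repo.lower():
            continue                        # name part must match
        if short_hash and not commit.startswith(short_hash.lower()):
            continue                        # hash prefix must match if given
        matches.append(repo)
    return sorted(matches)


# Hypothetical cache contents; hashes are padded placeholders.
cache = {
    "mlx-community/Llama-3.3-70B-Instruct-4bit": "de2dfaf5" + "0" * 32,
    "mlx-community/Llama-3.2-3B-Instruct-4bit": "0123abcd" + "0" * 32,
}
ambiguous = resolve("Llama", cache)
unique = resolve("Llama@de2dfaf5", cache)
```

  Because the lookup only reads the local cache mapping, it needs no network access, which matches the offline-capable behavior described in this release.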

  Enhanced user experience:
    - Pure local hash resolution without external API calls, offline-capable
    - Improved model name disambiguation logic for better workflow
    - Real user workflow support - see hashes in mlxk list, use directly in other commands
2025-08-18 20:21:43 +02:00
The BROKE Team cbd25c658d Release MLX Knife 1.0.2 - HF_HOME Cache Consistency & Corruption Fixes
Major bug fixes addressing cache path inconsistencies and silent failures:
  - Fix Issue #11: HF_HOME environment variable handling - unified cache logic ensures consistent /hub subdirectory usage
  - Fix Issue #9: Silent failure on corrupted models with empty snapshots directories
  - Enhanced download throttling with adaptive delays (512KB chunks, 2-3s for large files)
  - Added migration warnings for legacy cache locations with clear user guidance
  - Improved corruption detection and deletion workflow consistency

  Technical improvements:
  - Unified cache architecture: CACHE_ROOT/hub for both default and HF_HOME scenarios
  - Exception-safe memory management with enhanced baseline tracking
  - Updated dependencies to latest tested versions (Python 3.9-3.13 support)
  - All 105 tests passing with real MLX model verification
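
  The unified cache rule in this release (CACHE_ROOT/hub in both the default and the HF_HOME case) can be sketched as a single resolver. The function name `model_cache_dir` is hypothetical; the default root `~/.cache/huggingface` is the standard Hugging Face location and is assumed to be what CACHE_ROOT falls back to.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def model_cache_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the model cache directory, always ending in 'hub'."""
    env = os.environ if env is None else env
    hf_home = env.get("HF_HOME")
    if hf_home:
        root = Path(hf_home).expanduser()   # user-supplied cache root
    else:
        root = Path.home() / ".cache" / "huggingface"
    return root / "hub"                     # same subdirectory in both cases


custom = model_cache_dir({"HF_HOME": "/tmp/hf"})
default = model_cache_dir({})
```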
2025-08-18 14:02:30 +02:00
The BROKE Team 8b0db287e4 Release MLX Knife 1.0.1 - Update PyPI Package Description 2025-08-15 23:32:58 +02:00
The BROKE Team 29548fe3f1 Release MLX Knife 1.0.0 - Stable Release with PyPI Publication
Major milestone: First stable release with official PyPI distribution.

  New Features:
  - PyPI publication: Now installable via `pip install mlx-knife`
  - Official CLI-only designation with clear API policy
  - Absolute GitHub URLs for PyPI package display (logo + demo)

  Documentation Updates:
  - All docs updated to v1.0.0 (README, CHANGELOG, TESTING, SECURITY, CLAUDE.md)
  - Added PyPI installation instructions to README
  - Updated supported versions tables
  - Clarified CLI-only usage policy

  Release Highlights:
  - Transition from 1.0-rc3 to stable 1.0.0
  - Production-ready with 104/104 tests passing
  - Global accessibility via PyPI distribution
  - Comprehensive documentation overhaul

  Ready for community adoption and production use.
2025-08-15 13:57:26 +02:00
mzfive 8e65ef03c7 Update README.md broke-cluster hint
Broke Cluster project link in README.md footer section
2025-08-14 17:20:24 +02:00
mzfive 47af2c7096 Release MLX Knife 1.0-rc3 - Resolves GitHub Issues #1, #2, #3:
- Fix #1: Partial name filtering for `mlxk list` command
  - Fix #2: Fuzzy matching for single-model commands
  - Fix #3: Default behavior for `mlxk health` (no --all flag required)
  - Expanded test suite to 104/104 tests passing
2025-08-14 14:06:26 +02:00
mzfive 01229cb6ef Release MLX Knife 1.0-rc2: Enhanced Memory Management & Exception Safety
**Key Improvements**
  - Robust exception handling during model loading with guaranteed cleanup
  - Protection against nested context manager usage in MLXRunner
  - Safe cleanup that handles partial loading failures gracefully
  - Exception-resilient cache clearing operations
  - Safe tokenizer attribute access with proper defaults
  - Graceful memory statistics handling when metrics unavailable
  - Comprehensive unit test coverage for memory management edge cases

 **Changes**
  - Updated version to 1.0-rc2 across all documentation files
  - Enhanced MLXRunner context manager with bulletproof exception safety
  - Added comprehensive unit tests for memory management scenarios
  - Improved error handling for partial model loading failures
  - Updated test coverage documentation (96/96 tests passing)
  - Refined README to focus on key features rather than test metrics

  This release focuses on production-ready memory management and exception
  safety, making MLX Knife more robust for real-world usage scenarios.
2025-08-13 20:52:34 +02:00
mzfive b927fa1e33 Update documentation and remove GitHub Actions - no testing on GitHub possible
- Remove .github/workflows/tests.yml (local testing only)
  - Update CONTRIBUTING.md with current development workflow
  - Refine README.md for 1.0-rc1 release readiness
  - Update TESTING.md with comprehensive testing guide
2025-08-13 16:14:15 +02:00