Commit Graph

16 Commits

Author SHA1 Message Date
The BROKE Cluster Team ce46601d9d Release: 1.1.1-beta.3 - MXFP4 support and GPT-OSS reasoning
• MXFP4 Quantization Support (MLX ≥0.29.0, MLX-LM ≥0.27.0)
• GPT-OSS Reasoning Models with --hide-reasoning flag
• Enhanced Show Command with improved quantization display
• Documentation updates (README.md, TESTING.md)

See CHANGELOG.md for complete technical details.
Partial Issue #32 (GPT-OSS only, other reasoning models remain open).
2025-09-10 13:32:04 +02:00
Local Test 5045f9e1bd Release: 1.1.1‑b2 — Issue #31, stable server tests
- Lenient MLX detection via README/tokenizer (Issue #31)
- CLI: `show` type, strict `list` (chat), `run` accepts private MLX
- Server tests: RAM‑aware gating with `mlxk show`, MoE parsing fix (8x7B), server‑manager process guard, thread‑based timeout
- Multi‑Python script hardened; no ANSI; log tail on errors
- Docs: CHANGELOG, TESTING, CLAUDE updated; 166/166 green (Py 3.9–3.13), 32 server tests green
2025-09-07 02:24:08 +02:00
Local Test b9db12ae89 Fix: strict multi-shard health checks (#27) — backport to 1.x; bump 1.1.1b1
Summary

- Backports the 2.0 strict health policy to 1.x and ships as a pre-release.

Health changes (cache_utils.py)

- Index-aware: validate safetensors and PyTorch indexes; all referenced shards must exist, be >0B, and not be Git LFS
pointers.
- Pattern policy: shard patterns (model-XXXXX-of-YYYYY.*) without an index → unhealthy (parity with 2.0).
- Partial/tmp markers: any “.partial”, “.tmp”, or names containing “partial” anywhere under the snapshot → unhealthy.
- LFS scan: recursive detection of Git LFS pointer files (<200B with LFS header).
- Single-file fallback: non-empty .safetensors/.bin/*.gguf (no pattern shards) remain healthy.
- Ergonomics: is_model_healthy() accepts direct snapshot paths; check_lfs_corruption() scans recursively.
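
The index-aware rule and the LFS pointer rule above can be sketched roughly as follows. This is a minimal illustration, not the code in cache_utils.py: the function names `is_lfs_pointer` and `shards_healthy` are hypothetical, and the sketch assumes the Hugging Face layout in which `model.safetensors.index.json` carries a `weight_map` from tensor names to shard file names. It covers only the indexed multi-shard case, not the single-file fallback or the partial/tmp marker scan.

```python
import json
from pathlib import Path

# First line of a Git LFS pointer file, per the Git LFS pointer format.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """Return True for a small file (<200 B) that starts with the LFS header."""
    return path.stat().st_size < 200 and path.read_bytes().startswith(LFS_HEADER)


def shards_healthy(snapshot: Path,
                   index_name: str = "model.safetensors.index.json") -> bool:
    """Hypothetical index-aware check for one snapshot directory.

    Every shard referenced by the index must exist, be non-empty,
    and not be a Git LFS pointer. A missing index counts as unhealthy here.
    """
    index_path = snapshot / index_name
    if not index_path.is_file():
        return False
    weight_map = json.loads(index_path.read_text())["weight_map"]
    for shard_name in set(weight_map.values()):
        shard = snapshot / shard_name
        if not shard.is_file() or shard.stat().st_size == 0 or is_lfs_pointer(shard):
            return False
    return True
```

Checking the set of referenced shard names, rather than globbing for files on disk, is what lets a missing shard be detected at all.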

Tests

- Add tests/unit/test_health_multishard.py covering:
    - index complete → healthy; missing/empty shard → unhealthy; LFS pointer → unhealthy
    - pattern shards (no index) → unhealthy
    - partial marker → unhealthy
    - PyTorch index parity (complete → healthy)
    - single-file safetensors/gguf → healthy

Docs

- CHANGELOG.md: add 1.1.1-beta.1 with detailed rules; note GitHub tag vs PyPI mapping (1.1.1-beta.1 ↔ 1.1.1b1).
- README.md: tests badge 160/160; pre-release note for 1.1.1b1.
- TESTING.md: status 160/160; update test structure (add test_health_multishard.py; remove 2.0 note).

Version

- version = 1.1.1b1 (PEP 440 pre-release); VERSION = (1, 1, 1).
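
The GitHub tag vs PyPI version mapping noted above (1.1.1-beta.1 ↔ 1.1.1b1) follows the PEP 440 pre-release spelling. A small sketch of that conversion, assuming tags of the form `vX.Y.Z-label.N`; the helper name `tag_to_pep440` is hypothetical and not part of the project:

```python
import re

# Matches tags such as "v1.1.1-beta.1" or "1.1.1-rc.2" (assumed tag shape).
_TAG = re.compile(r"^v?(\d+\.\d+\.\d+)-(alpha|beta|rc)\.(\d+)$")
# PEP 440 pre-release segment spellings.
_PEP440_LABEL = {"alpha": "a", "beta": "b", "rc": "rc"}


def tag_to_pep440(tag: str) -> str:
    """Convert a pre-release Git tag to its PEP 440 version string."""
    match = _TAG.match(tag)
    if match is None:
        # Not a pre-release tag: only drop a leading "v".
        return tag[1:] if tag.startswith("v") else tag
    base, label, number = match.groups()
    return f"{base}{_PEP440_LABEL[label]}{number}"


print(tag_to_pep440("v1.1.1-beta.1"))  # 1.1.1b1
print(tag_to_pep440("v1.1.0"))         # 1.1.0
```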

Behavioral impact

- Health reporting is stricter but does not change functionality: incomplete multi-shard downloads now correctly
report unhealthy. No changes to pull/run/server behavior.

Validation

- Python 3.9 local: 160 passed, 36 deselected; warnings eliminated on 3.9/3.10 under project defaults.
- New multishard tests pass; manual spot-checks show expected unhealthy→healthy transitions as downloads complete.

Release

- GitHub: tag v1.1.1-beta.1 (Pre-release).
- PyPI: upload 1.1.1b1 (install via pip install --pre mlx-knife).
2025-09-01 01:26:27 +02:00
The BROKE Team cf169e28ad Release MLX Knife 1.1.0 - Stable Release
Complete isolated test system with 150/150 tests passing.
  Production-ready after successful beta testing cycle.

  See CHANGELOG.md for comprehensive details including:
  - All critical issues from 1.1.0-beta3 resolved
  - Enhanced test infrastructure with real model validation
  - Multi-Python compatibility (3.9-3.13)
2025-08-26 16:30:12 +02:00
The BROKE Team 7d0d6be66d Release MLX Knife 1.1.0-beta3 - Critical Cache Management Fixes
Three major bug fixes for production readiness:

- Issue #21: Fix crash on fresh installations (empty cache directory)
- Issue #22: Suppress urllib3 LibreSSL warnings on macOS Python 3.9
- Issue #23: Fix double rm execution bug - models now deleted in single command

Test improvements:
- 140/140 tests passing (up from 137)
- Added real integration tests for lock cleanup
- Fixed unit tests broken by path corrections

All known cache management issues resolved for stable release.
2025-08-26 01:46:40 +02:00
The BROKE Team 1aad374d08 Release MLX Knife 1.1.0-beta2 - Critical Bug Fixes & Test Stability
Major fixes:
  - Issue #19: Server response truncation resolved - large context models work at full capacity
  - Issue #20: End-Token filtering in non-streaming mode - clean professional output
  - Test stability: Fixed flaky server tests, improved lifecycle management

  Technical changes:
  - Server: Dynamic token limits by default (--max-tokens None)
  - MLXRunner: Added _filter_end_tokens_from_response() for batch consistency
  - Tests: 132/132 passing + 48 comprehensive server tests
  - Documentation: Updated CHANGELOG.md, README.md, TESTING.md
2025-08-22 23:16:50 +02:00
The BROKE Team 74239c4e43 Release MLX Knife 1.1.0-beta1 - Dynamic Token Limits & Enhanced Web Client
Issues Resolved:
  • Issue #15: Token limits vs natural stop tokens race condition - FIXED
  • Issue #16: Interactive vs server token limit policies - FIXED

  Major Improvements:
  • Automatic optimal token limits - no configuration needed
  • Manual --max-tokens control still available when desired
  • Eliminates old hardcoded 500/2000 token restrictions
  • Performance gains: Up to 524x improvement for large context models
  • Enhanced web client with model capabilities display and better UX

  Additional Enhancements:
  • Enhanced /v1/models API with context_length field
  • Comprehensive test expansion: 114 → 131 tests (131/131 passing)
  • Python 3.9-3.13 compatibility verified

  Known Issues (Beta Status):
  • Server deadlock possible under extreme concurrent model loading stress
  • Workaround: Avoid simultaneous heavy model operations
2025-08-21 17:36:44 +02:00
The BROKE Team 6117e571ca Release MLX Knife 1.0.4 - Issue #14 Chat Self-Conversation Fix & Web UI Overhaul
Fix Issue #14: Interactive chat self-conversation bug resolved
  - Added context-sensitive chat stop tokens (\nHuman:, \nAssistant:, \nYou:, \nUser:)
  - Smart priority system: native model stop tokens first, chat tokens as fallback
  - Affects both `mlxk run` and `mlxk server` modes with backward compatibility
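
One way to read the priority rule above is sketched below: the model's native stop tokens are checked first, and the chat stop tokens apply only when no native token appears in the output. This is an assumption about the behavior, not the project's implementation, and `truncate_at_stop` is a hypothetical name.

```python
# Chat stop tokens listed in the commit message.
CHAT_STOP_TOKENS = ["\nHuman:", "\nAssistant:", "\nYou:", "\nUser:"]


def truncate_at_stop(text, native_stops, chat_stops=CHAT_STOP_TOKENS):
    """Cut text at the earliest native stop token, else at the earliest chat stop.

    The chat tokens act as a fallback and are consulted only when
    none of the native stop tokens occurs in the text.
    """
    for stops in (native_stops, chat_stops):
        hits = [i for i in (text.find(s) for s in stops if s) if i != -1]
        if hits:
            return text[:min(hits)]
    return text


print(repr(truncate_at_stop("Hi there.\nHuman: and you?", ["</s>"])))  # 'Hi there.'
```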

  Web UI complete transformation (simple_chat.html):
  - 🦫 Beaver branding replaces 🔪 knife emoji
  - Model and chat history persistence across browser sessions
  - Smart model switching with option to keep or clear chat history

  Testing infrastructure enhancements:
  - Automated server testing with RAM-aware model filtering
  - 15 new regression tests across 7+ MLX models validating Issue #14 fix
  - Comprehensive TESTING.md guide for server-based testing

  All 114 tests passing
2025-08-19 20:43:44 +02:00
The BROKE Team 1f70b4984a Release MLX Knife 1.0.3 - GitHub Issues Implementation & Enhanced User Experience
Community-driven feature development implementing key GitHub Issues:
    - Fix Issue #7: Health check consistency for fuzzy model names - unified health check logic ensures
  identical status regardless of name format (Phi-3 vs mlx-community/Phi-3-mini-4k-instruct-4bit)
    - Add Issue #6: Repository name validation - pre-validation for HuggingFace Hub 96-character limit with
  clear error messages
    - Add Issue #13: Hash-based disambiguation - use commit hashes to resolve ambiguous model names (mlxk show
  Llama@de2dfaf5 → mlx-community/Llama-3.3-70B-Instruct-4bit)
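
The name validation and hash-based disambiguation described above could look roughly like this. It is a sketch under assumptions: `validate_repo_name` and `resolve` are hypothetical names, the 96-character limit is applied to the whole string passed in, and `cached` stands in for a local mapping of repository name to snapshot commit hash, so no network call is needed.

```python
MAX_REPO_NAME_LEN = 96  # limit cited in the commit message


def validate_repo_name(name: str) -> None:
    """Reject names over the length limit before any download is attempted."""
    if len(name) > MAX_REPO_NAME_LEN:
        raise ValueError(
            f"Repository name is {len(name)} characters; "
            f"the limit is {MAX_REPO_NAME_LEN}."
        )


def resolve(spec: str, cached: dict[str, str]) -> list[str]:
    """Resolve 'pattern' or 'pattern@shorthash' against locally cached models.

    `cached` maps repository name -> full commit hash of the snapshot.
    Returns every matching name; more than one result means the
    spec is still ambiguous.
    """
    pattern, _, short_hash = spec.partition("@")
    return sorted(
        name
        for name, commit in cached.items()
        if pattern.lower() in name.lower() and commit.startswith(short_hash)
    )
```

With no `@` part the hash prefix is empty and matches every commit, so plain fuzzy names keep working and the hash only narrows the result.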

  Enhanced user experience:
    - Pure local hash resolution without external API calls, offline-capable
    - Improved model name disambiguation logic for better workflow
    - Real user workflow support - see hashes in mlxk list, use directly in other commands
2025-08-18 20:21:43 +02:00
The BROKE Team cbd25c658d Release MLX Knife 1.0.2 - HF_HOME Cache Consistency & Corruption Fixes
Major bug fixes addressing cache path inconsistencies and silent failures:
  - Fix Issue #11: HF_HOME environment variable handling - unified cache logic ensures consistent
  /hub subdirectory usage
  - Fix Issue #9: Silent failure on corrupted models with empty snapshots directories
  - Enhanced download throttling with adaptive delays (512KB chunks, 2-3s for large files)
  - Added migration warnings for legacy cache locations with clear user guidance
  - Improved corruption detection and deletion workflow consistency

  Technical improvements:
  - Unified cache architecture: CACHE_ROOT/hub for both default and HF_HOME scenarios
  - Exception-safe memory management with enhanced baseline tracking
  - Updated dependencies to latest tested versions (Python 3.9-3.13 support)
  - All 105 tests passing with real MLX model verification
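
The unified CACHE_ROOT/hub rule can be illustrated with a short sketch. The function name `resolve_hub_cache` is hypothetical, and the default root of `~/.cache/huggingface` is assumed from the Hugging Face Hub convention rather than taken from the project's code:

```python
import os
from pathlib import Path


def resolve_hub_cache(env=None) -> Path:
    """Return CACHE_ROOT/hub for both the default and the HF_HOME case."""
    env = os.environ if env is None else env
    hf_home = env.get("HF_HOME")
    if hf_home:
        cache_root = Path(hf_home).expanduser()
    else:
        # Assumed default, following the Hugging Face Hub convention.
        cache_root = Path.home() / ".cache" / "huggingface"
    # The /hub subdirectory is appended in both branches.
    return cache_root / "hub"
```

Appending `hub` in one place, after the root is chosen, is what keeps the two scenarios consistent.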
2025-08-18 14:02:30 +02:00
The BROKE Team 8b0db287e4 Release MLX Knife 1.0.1 - Update PyPI Package Description
2025-08-15 23:32:58 +02:00
The BROKE Team 29548fe3f1 Release MLX Knife 1.0.0 - Stable Release with PyPI Publication
Major milestone: First stable release with official PyPI distribution.

  New Features:
  - PyPI publication: Now installable via `pip install mlx-knife`
  - Official CLI-only designation with clear API policy
  - Absolute GitHub URLs for PyPI package display (logo + demo)

  Documentation Updates:
  - All docs updated to v1.0.0 (README, CHANGELOG, TESTING, SECURITY, CLAUDE.md)
  - Added PyPI installation instructions to README
  - Updated supported versions tables
  - Clarified CLI-only usage policy

  Release Highlights:
  - Transition from 1.0-rc3 to stable 1.0.0
  - Production-ready with 104/104 tests passing
  - Global accessibility via PyPI distribution
  - Comprehensive documentation overhaul

  Ready for community adoption and production use.
2025-08-15 13:57:26 +02:00
mzfive 47af2c7096 Release MLX Knife 1.0-rc3
Resolves GitHub Issues #1, #2, #3:
  - Fix #1: Partial name filtering for `mlxk list` command
  - Fix #2: Fuzzy matching for single-model commands
  - Fix #3: Default behavior for `mlxk health` (no --all flag required)
  - Expanded test suite to 104/104 tests passing
2025-08-14 14:06:26 +02:00
mzfive 01229cb6ef Release MLX Knife 1.0-rc2: Enhanced Memory Management & Exception Safety
**Key Improvements**
  - Robust exception handling during model loading with guaranteed cleanup
  - Protection against nested context manager usage in MLXRunner
  - Safe cleanup that handles partial loading failures gracefully
  - Exception-resilient cache clearing operations
  - Safe tokenizer attribute access with proper defaults
  - Graceful memory statistics handling when metrics unavailable
  - Comprehensive unit test coverage for memory management edge cases
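
The first three points above (guaranteed cleanup, nested-use protection, safe cleanup after a partial load) can be sketched with a generic context manager. `Runner` is a hypothetical stand-in that takes an arbitrary `loader` callable; it is not MLXRunner and makes no claim about how MLXRunner is written.

```python
class Runner:
    """Hypothetical runner illustrating exception-safe load and cleanup."""

    def __init__(self, loader):
        self._loader = loader  # callable that returns the loaded model
        self._active = False
        self.model = None

    def __enter__(self):
        if self._active:
            # Guard against nested use of the same instance.
            raise RuntimeError("Runner is already in use as a context manager")
        self._active = True
        try:
            self.model = self._loader()
        except Exception:
            # Loading failed part-way: release whatever was set, then re-raise.
            self._cleanup()
            raise
        return self

    def __exit__(self, exc_type, exc, tb):
        self._cleanup()
        return False  # never swallow the caller's exception

    def _cleanup(self):
        # Safe to call repeatedly and after a partial load.
        self.model = None
        self._active = False
```

Because `__exit__` is not called when `__enter__` raises, the explicit cleanup inside `__enter__` is what covers the partial-loading case.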

 **Changes**
  - Updated version to 1.0-rc2 across all documentation files
  - Enhanced MLXRunner context manager with bulletproof exception safety
  - Added comprehensive unit tests for memory management scenarios
  - Improved error handling for partial model loading failures
  - Updated test coverage documentation (96/96 tests passing)
  - Refined README to focus on key features rather than test metrics

  This release focuses on production-ready memory management and exception
  safety, making MLX Knife more robust for real-world usage scenarios.
2025-08-13 20:52:34 +02:00
mzfive b927fa1e33 Update documentation and remove GitHub Actions - no testing on GitHub possible
- Remove .github/workflows/tests.yml (local testing only)
  - Update CONTRIBUTING.md with current development workflow
  - Refine README.md for 1.0-rc1 release readiness
  - Update TESTING.md with comprehensive testing guide
2025-08-13 16:14:15 +02:00
mzfive 6b04448d1c Initial commit: MLX Knife 1.0-rc1
2025-08-12 23:00:55 +02:00