Release MLX Knife 1.1.0-beta1 - Dynamic Token Limits & Enhanced Web Client

  Issues Resolved:
  • Issue #15: Token limits vs natural stop tokens race condition - FIXED
  • Issue #16: Interactive vs server token limit policies - FIXED

  Major Improvements:
  • Automatic optimal token limits - no configuration needed
  • Manual --max-tokens control still available when desired
  • Eliminates old hardcoded 500/2000 token restrictions
  • Performance gains: Up to 524x improvement for large context models
  • Enhanced web client with model capabilities display and better UX
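
  The limit selection described above could look roughly like the sketch below: an explicit value (as passed via --max-tokens) wins, otherwise the limit follows the model's context length. The function name `resolve_max_tokens`, the `FALLBACK_LIMIT` value and the capping rule are assumptions made for illustration, not MLX Knife's actual implementation.

```python
from typing import Optional

FALLBACK_LIMIT = 2048  # hypothetical default when the context length is unknown


def resolve_max_tokens(context_length: Optional[int],
                       max_tokens: Optional[int] = None) -> int:
    """Pick a generation limit: explicit value first, else derive it from the model."""
    if max_tokens is not None:
        if max_tokens <= 0:
            raise ValueError("max_tokens must be positive")
        # A manual limit always wins, but is capped at the context length if known.
        return min(max_tokens, context_length) if context_length else max_tokens
    # No manual limit: use the model's own context length when it is known.
    return context_length if context_length else FALLBACK_LIMIT
```

  Deriving the limit from the model keeps a short hardcoded cap from cutting off output before a natural stop token, while leaving manual control available.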

  Additional Enhancements:
  • Enhanced /v1/models API with context_length field
  • Comprehensive test expansion: 114 → 131 tests (131/131 passing)
  • Python 3.9-3.13 compatibility verified
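
  A client might read the new context_length field along these lines. The sketch assumes an OpenAI-style response body with a top-level `data` list of model entries; that shape, the helper name `context_lengths` and the sample payload are illustrative assumptions, not taken from the server code.

```python
import json
from typing import Dict, Optional


def context_lengths(models_json: str) -> Dict[str, Optional[int]]:
    """Map model id -> context_length from a /v1/models response body."""
    payload = json.loads(models_json)
    return {
        # Entries without the field map to None instead of raising.
        entry["id"]: entry.get("context_length")
        for entry in payload.get("data", [])
    }


# Hypothetical response body; the model id and the value are made up.
sample = (
    '{"object": "list", "data": '
    '[{"id": "example-model", "object": "model", "context_length": 8192}]}'
)
print(context_lengths(sample))  # {'example-model': 8192}
```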

  Known Issues (Beta Status):
  • Server deadlock possible under extreme concurrent model loading stress
  • Workaround: Avoid simultaneous heavy model operations
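
  One way a client could follow this workaround is to funnel heavy requests through a single lock so they never overlap. This is only a sketch: `run_serialized` is a hypothetical helper, and it serializes calls within one client process only, not across separate clients.

```python
import threading
from typing import Callable, TypeVar

T = TypeVar("T")

# Process-wide lock: at most one heavy model operation runs at a time.
_model_ops_lock = threading.Lock()


def run_serialized(operation: Callable[[], T]) -> T:
    """Run a heavy model operation while holding the shared lock."""
    with _model_ops_lock:
        return operation()
```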
The BROKE Team
2025-08-21 17:36:44 +02:00
parent 6117e571ca
commit 74239c4e43
12 changed files with 993 additions and 42 deletions
+3 -3
@@ -8,7 +8,7 @@ A lightweight, ollama-like CLI for managing and running MLX models on Apple Sili
> **Note**: MLX Knife is designed as a command-line interface tool only. While some internal functions are accessible via Python imports, only CLI usage is officially supported.
-**Current Version**: 1.0.4 (August 2025)
+**Current Version**: 1.1.0-beta1 (August 2025) - Dynamic Token Limits & Web UI Enhancements
[![GitHub Release](https://img.shields.io/github/v/release/mzau/mlx-knife)](https://github.com/mzau/mlx-knife/releases)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
@@ -16,7 +16,7 @@ A lightweight, ollama-like CLI for managing and running MLX models on Apple Sili
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![Apple Silicon](https://img.shields.io/badge/Apple%20Silicon-M1%2FM2%2FM3-green.svg)](https://support.apple.com/en-us/HT211814)
[![MLX](https://img.shields.io/badge/MLX-Latest-orange.svg)](https://github.com/ml-explore/mlx)
-[![Tests](https://img.shields.io/badge/tests-114%2F114%20passing-brightgreen.svg)](#testing)
+[![Tests](https://img.shields.io/badge/tests-131%2F131%20passing-brightgreen.svg)](#testing)
## Features
@@ -325,6 +325,6 @@ Copyright (c) 2025 The BROKE team 🦫
<p align="center">
<b>Made with ❤️ by The BROKE team <img src="broke-logo.png" alt="BROKE Logo" width="30" style="vertical-align: middle;"></b><br>
-<i>Version 1.0.4 | August 2025</i><br>
+<i>Version 1.1.0-beta1 | August 2025</i><br>
<a href="https://github.com/mzau/broke-cluster">🔮 Next: BROKE Cluster for multi-node deployments</a>
</p>