mirror of
https://github.com/cloudstack-llc/mlx-knife.git
synced 2026-06-30 20:48:03 -04:00
Release MLX Knife 1.0-rc2: Enhanced Memory Management & Exception Safety
**Key Improvements**

- Robust exception handling during model loading with guaranteed cleanup
- Protection against nested context manager usage in MLXRunner
- Safe cleanup that handles partial loading failures gracefully
- Exception-resilient cache clearing operations
- Safe tokenizer attribute access with proper defaults
- Graceful memory statistics handling when metrics unavailable
- Comprehensive unit test coverage for memory management edge cases

**Changes**

- Updated version to 1.0-rc2 across all documentation files
- Enhanced MLXRunner context manager with bulletproof exception safety
- Added comprehensive unit tests for memory management scenarios
- Improved error handling for partial model loading failures
- Updated test coverage documentation (96/96 tests passing)
- Refined README to focus on key features rather than test metrics

This release focuses on production-ready memory management and exception safety, making MLX Knife more robust for real-world usage scenarios.
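The cleanup guarantees listed in the commit message can be illustrated with a minimal sketch. It is not the actual `MLXRunner` code: the class `SketchRunner`, its `loader` and `clear_cache` parameters, and the `eos_token` helper are hypothetical names chosen for illustration, and the loader is injected so the example runs without MLX installed.

```python
from typing import Any, Callable, Optional, Tuple


class SketchRunner:
    """Hypothetical stand-in for a model runner; not the real MLXRunner."""

    def __init__(
        self,
        loader: Callable[[], Tuple[Any, Any]],
        clear_cache: Callable[[], None] = lambda: None,
    ) -> None:
        self._loader = loader            # returns (model, tokenizer) or raises
        self._clear_cache = clear_cache  # e.g. a GPU cache flush; may itself raise
        self._active = False
        self.model: Optional[Any] = None
        self.tokenizer: Optional[Any] = None

    def __enter__(self) -> "SketchRunner":
        if self._active:
            # Guard against `with runner:` nested inside `with runner:`.
            raise RuntimeError("runner is already active; nested use is not supported")
        self._active = True
        try:
            self.model, self.tokenizer = self._loader()
        except BaseException:
            # __exit__ is not called when __enter__ raises, so clean up here.
            self.cleanup()
            raise
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.cleanup()
        return False  # never swallow the caller's exception

    def cleanup(self) -> None:
        """Release whatever was loaded; safe on partial state or repeated calls."""
        self.model = None
        self.tokenizer = None
        try:
            self._clear_cache()
        except Exception:
            pass  # a failing cache flush must not mask the original error
        finally:
            self._active = False

    def eos_token(self, default: str = "") -> str:
        """Read a tokenizer attribute that may be missing or None."""
        value = getattr(self.tokenizer, "eos_token", None)
        return default if value is None else value
```

Cleanup is triggered from both `__enter__` and `__exit__` because Python only calls `__exit__` after `__enter__` has returned successfully, so a failure during loading would otherwise leave partially loaded state behind.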
@@ -6,7 +6,7 @@
 
 A lightweight, ollama-like CLI for managing and running MLX models on Apple Silicon. **Designed for personal, local use** - perfect for individual developers and researchers working with MLX models.
 
-**Current Version**: 1.0-rc1 (August 2025)
+**Current Version**: 1.0-rc2 (August 2025)
 
 [](https://github.com/mzau/mlx-knife/releases)
 [](https://opensource.org/licenses/MIT)
@@ -14,7 +14,7 @@ A lightweight, ollama-like CLI for managing and running MLX models on Apple Sili
 [](https://www.python.org/downloads/)
 [](https://support.apple.com/en-us/HT211814)
 [](https://github.com/ml-explore/mlx)
-[](#testing)
+[](#testing)
 
 ## Features
 
@@ -43,7 +43,7 @@ A lightweight, ollama-like CLI for managing and running MLX models on Apple Sili
 - **Memory Insights**: See GPU memory usage after model loading and generation
 - **Dynamic Stop Tokens**: Automatic detection and filtering of model-specific stop tokens
 - **Customizable Generation**: Control temperature, max_tokens, top_p, and repetition penalty
-- **RAII Memory Management**: Context manager pattern ensures automatic cleanup and no memory leaks
+- **Context-Managed Memory**: Context manager pattern ensures automatic cleanup and prevents memory leaks
 - **Exception-Safe**: Robust error handling with guaranteed resource cleanup
 
 ## Installation
@@ -298,7 +298,7 @@ mlxk run bert-base-uncased
 
 ## Testing
 
-MLX Knife includes comprehensive test coverage with **86/86 tests passing** across all supported Python versions.
+MLX Knife includes comprehensive test coverage across all supported Python versions.
 
 ### Quick Start
 
@@ -346,7 +346,7 @@ Stop tokens are dynamically extracted from each model's tokenizer:
 - Common tokens verified as single-token entities
 
 ### Memory Management
-- **RAII Pattern**: Context manager ensures automatic resource cleanup
+- **Context Managers**: Automatic resource cleanup with Python context managers
 - **Exception-Safe**: Model cleanup guaranteed even on errors
 - **Baseline Tracking**: Memory captured before model loading
 - **Real-time Monitoring**: GPU memory tracking via `mlx.core.get_active_memory()`
@@ -416,5 +416,5 @@ Copyright (c) 2025 The BROKE team 🦫
 
 <p align="center">
 <b>Made with ❤️ by The BROKE team <img src="broke-logo.png" alt="BROKE Logo" width="30" style="vertical-align: middle;"></b><br>
-<i>Version 1.0-rc1 | August 2025</i>
+<i>Version 1.0-rc2 | August 2025</i>
 </p>
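The "Baseline Tracking" and "Real-time Monitoring" bullets in the Memory Management hunk, together with the commit message's note about handling unavailable metrics, might look roughly like the sketch below. It assumes `mlx.core.get_active_memory()` returns a byte count, as the README names it. `MemoryTracker`, `default_probe`, and the dictionary keys are hypothetical names and not part of MLX Knife.

```python
from typing import Callable, Dict, Optional

GIB = 1024 ** 3


def default_probe() -> Optional[int]:
    """Return active GPU memory in bytes, or None when the metric is unavailable."""
    try:
        import mlx.core as mx  # assumption: MLX is installed on Apple Silicon
        return int(mx.get_active_memory())
    except Exception:
        return None  # missing package or missing API: report "unknown"


class MemoryTracker:
    """Hypothetical helper: record a baseline, then report growth against it."""

    def __init__(self, probe: Callable[[], Optional[int]] = default_probe) -> None:
        self._probe = probe
        self.baseline = probe()  # captured before any model is loaded

    def stats(self) -> Dict[str, Optional[float]]:
        current = self._probe()
        if self.baseline is None or current is None:
            # Degrade gracefully instead of raising when metrics are missing.
            return {"baseline_gb": None, "current_gb": None, "delta_gb": None}
        return {
            "baseline_gb": self.baseline / GIB,
            "current_gb": current / GIB,
            "delta_gb": (current - self.baseline) / GIB,
        }
```

Creating the tracker before loading a model and calling `stats()` afterwards gives the memory attributable to the model. Passing a custom `probe` makes the helper usable in unit tests on machines without a GPU.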