Architecture Decision Records (ADRs)
Overview
This directory contains Architecture Decision Records (ADRs) that document significant architectural and design decisions for the MLX-Knife project.
Active ADRs
| ADR | Title | Status | Date |
|---|---|---|---|
| ADR-001 | JSON API Strategy & 2.0 Migration Path | Accepted | 2025-08-28 |
| ADR-002 | Edge Cases from 1.x Test Suite | Accepted | 2025-08-28 |
| ADR-003 | Server and Run Functionality Port from 1.x to 2.0 | Accepted | 2025-09-10 |
| ADR-004 | Enhanced Error Handling & Logging | Accepted | 2025-10-19 |
| ADR-005 | Clone Implementation Beta3 | Superseded by ADR-007 | 2025-09-18 |
| ADR-006 | Clone Implementation Revised | Superseded by ADR-007 | 2025-09-18 |
| ADR-007 | Clone Implementation Fixed Strategy | Accepted | 2025-09-18 |
| ADR-008 | MLXModel Package Format | Proposed | (not committed) |
| ADR-009 | Stop Token Detection Fix | Implemented | 2025-10-21 |
| ADR-010 | Reasoning Content API | Draft | (not committed) |
| ADR-011 | E2E Live Test Architecture | Implemented | 2025-10-21 |
| ADR-012 | Vision Support Roadmap | Implemented (Phase 1-3) | 2025-11-12 |
| ADR-013 | Community Model Quality Database | Planned | (not committed) |
| ADR-014 | Unix Pipe Integration | Implemented (Phase 1) | 2025-11-16 |
| ADR-015 | Embeddings API | Planned | (not committed) |
| ADR-016 | Memory-Aware Model Loading | Implemented (Phase 1-2b) | 2026-01-29 |
| ADR-017 | Image Metadata Extraction (EXIF) | Implemented (Phase 1) | (not committed) |
| ADR-018 | Convert Operation | Implemented (Phase 0-1) | 2025-12-18 |
| ADR-019 | Audio Input Support (beta.8) | Obsolete (→ ADR-020) | 2026-01-20 |
| ADR-020 | Audio Backend Architecture (beta.9) | Implemented | 2026-01-31 |
ADR Format
Each ADR follows this structure:
- Status: Proposed / Accepted / Rejected / Superseded (the table above also uses Draft, Planned, Implemented, and Obsolete)
- Context: Why this decision is needed
- Decision: What we decided to do
- Consequences: What happens as a result
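
As a rough illustration of the structure above, the following Python sketch renders a plain-text ADR skeleton with the four sections. It is not part of MLX-Knife: the `ADR` class, its `render` method, the `ADR-NNN: Title` header line, and the `(to be written)` placeholder are all hypothetical names chosen for this example, and the number 999 does not refer to a real ADR.

```python
from dataclasses import dataclass


@dataclass
class ADR:
    """Hypothetical container for the four sections listed in this README."""

    number: int
    title: str
    status: str = "Proposed"  # one of the status values listed above
    context: str = ""         # why this decision is needed
    decision: str = ""        # what we decided to do
    consequences: str = ""    # what happens as a result

    def render(self) -> str:
        """Return the ADR as plain text, one block per section."""
        sections = [
            ("Status", self.status),
            ("Context", self.context),
            ("Decision", self.decision),
            ("Consequences", self.consequences),
        ]
        # Header format is an assumption that mirrors the ADR-NNN ids in the table.
        lines = [f"ADR-{self.number:03d}: {self.title}", ""]
        for name, body in sections:
            lines.append(name)
            # Empty sections get a visible placeholder so gaps are easy to spot.
            lines.append(body or "(to be written)")
            lines.append("")
        return "\n".join(lines).rstrip() + "\n"


# Illustrative usage with a made-up ADR number and title.
print(ADR(number=999, title="Hypothetical Example").render())
```

Keeping the section order fixed in one place means every new record starts from the same skeleton, which makes ADRs easier to compare side by side.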