fix: P0 bugfixes + test infrastructure + benchmark metadata sync

P0 Bugfixes:
- cache.py: Handle empty HF_HOME strings in get_current_cache_root()
- clone.py: Remove obsolete _validate_same_volume() check
- common.py: Use importlib.metadata instead of importing transformers
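
The cache.py and common.py fixes above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `get_cache_root_sketch` and `installed_version_sketch` are hypothetical, and the assumption that the cache root is `$HF_HOME/hub` (falling back to the Hugging Face default `~/.cache/huggingface/hub`) is ours.

```python
import os
from importlib import metadata
from pathlib import Path


def get_cache_root_sketch(env=None):
    """Return the hub cache root, treating an empty HF_HOME like an unset one.

    Hypothetical stand-in for get_current_cache_root(); the real function
    may resolve the path differently.
    """
    env = os.environ if env is None else env
    # `or ""` covers a missing key; strip() covers whitespace-only values.
    hf_home = (env.get("HF_HOME") or "").strip()
    if hf_home:
        return Path(hf_home).expanduser() / "hub"
    # Assumed default location when HF_HOME is unset or empty.
    return Path.home() / ".cache" / "huggingface" / "hub"


def installed_version_sketch(package):
    """Look up a package version without importing the package itself."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

Reading the version through `importlib.metadata` avoids the import-time cost and side effects of `import transformers` when only the version string is needed.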

Test Infrastructure:
- runner/__init__.py: Replace "mock" fallback with clear RuntimeError
- Fix mock paths in test_runner_core, test_token_limits, etc.
- Add VISION_TEST_MODELS + AUDIO_TEST_MODELS fallbacks
- Portfolio fixtures work with and without HF_HOME
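
The runner change above (fail loudly instead of silently falling back to a "mock") might look something like this. It is a sketch under assumptions: `resolve_runner_backend_sketch` and its `load_backend` parameter are hypothetical names, and the real runner/__init__.py may detect the missing backend differently.

```python
def resolve_runner_backend_sketch(load_backend):
    """Return the real backend or raise; never substitute a mock.

    `load_backend` is a hypothetical zero-argument callable that imports
    and returns the runner backend.
    """
    try:
        return load_backend()
    except ImportError as exc:
        # A silent mock fallback hides broken installs; surface the cause.
        raise RuntimeError(
            "Runner backend is unavailable; install the required "
            "dependencies instead of relying on a mock fallback."
        ) from exc
```

Chaining with `from exc` keeps the original ImportError in the traceback, so the missing dependency is still visible to whoever hits the error.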

Benchmark Fixes:
- Sort models/tests alphabetically instead of by regression %
- Fix vision metadata drift: pixtral-12b-8bit → pixtral-12b-4bit

Documentation:
- ADR-022: Workspace-First Paradigm (draft)
- ADR-018: Phase 2 details expanded
- TESTING.md/TESTING-DETAILS.md: Fallback docs updated
The BROKE Cluster Team
2026-02-10 15:52:36 +01:00
parent 7f10187bee
commit dab7ffb6fc
21 changed files with 1443 additions and 278 deletions
+2 -5
@@ -271,12 +271,9 @@ HF_HOME=/path/to/cache pytest -m live_e2e -v
**Stop token validation** (ADR-009):
```bash
# Option A: Portfolio Discovery (recommended)
export HF_HOME=/path/to/cache
pytest -m live_stop_tokens -v
# Option B: Hardcoded models (requires 3 specific models in cache)
# See TESTING-DETAILS.md for model list
# Uses Portfolio Discovery if models found, else fallback models
# See TESTING-DETAILS.md "Required Models for Live Tests"
```
**Push/Clone tests** (alpha features):