Holo3-35B-A3B · OSWorld-Verified · NUC16 LIVE


Progress

tasks done / total
success rate
scored
elapsed
ETA remaining
mean task wall
Official Holo3-35B-A3B leaderboard runs: 82.56% / 78.15% (mean 80.4%) at the same 100-step budget

Current task

idle

Live system telemetry

Serving aggregate

Success rate by domain

Package power (W) & CPU util — last ~3 h

Tasks

Device bring-up — why CPU (GPU/NPU tried first)

Root cause — why the NPU and iGPU can't serve this model

NPU (Intel NPU 5, OpenVINO) — blocked at the software-stack level, twice over:

iGPU (Xe3, llama.cpp Vulkan) — three independent, measured failure layers:

Proposed solution (concise).

Now, on this box: stay on CPU and recover the broken KV prefix cache. Restructure the agent's message history so that screenshot eviction stops invalidating the prefix: measured re-prefills of ~25K tokens at ~60 tok/s dominate step latency, and fixing this is estimated to make steps 2–3× faster. Add a loop detector that early-FAILs runaway episodes.

Upstream: file the two minimal repros (lm_head-only corruption; clip DeviceLost). Either fix unlocks vision-prefill offload to the iGPU.

Long term: the right fix is heterogeneous placement once OpenVINO gains MoE-VLM support (vision tower on NPU, compute-bound prefill on iGPU, bandwidth-bound expert decode on CPU), or edge silicon with real bandwidth headroom (≥200 GB/s class), where the GPU path pays off.
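
The two on-box changes can be illustrated with a minimal sketch. It is not the agent's actual code: the message strings, `sliding_window`, `batched_eviction`, `simulate` and `LoopDetector` are hypothetical names, and it assumes that eviction drops old screenshots from the history outright. The idea shown is that evicting one screenshot per step changes the prompt near its start on every step, whereas evicting in batches leaves the prompt append-only between batches, so the cached prefix stays valid. For scale, a full re-prefill at the figures quoted above costs roughly 25,000 / 60 ≈ 417 s.

```python
from collections import deque


def common_prefix_len(a, b):
    """Number of leading messages shared by two prompts."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def _drop_oldest_screenshots(history, n_drop):
    """Return history without its n_drop oldest screenshot messages."""
    shots = [i for i, m in enumerate(history) if m.startswith("img")]
    drop = set(shots[:n_drop])
    return [m for i, m in enumerate(history) if i not in drop]


def sliding_window(history, keep=3):
    """Assumed baseline: keep exactly the last `keep` screenshots."""
    n_shots = sum(m.startswith("img") for m in history)
    return _drop_oldest_screenshots(history, max(0, n_shots - keep))


def batched_eviction(history, keep=3, batch=4):
    """Prefix-friendly variant: evict only in multiples of `batch`,
    so the prompt is append-only between eviction events."""
    n_shots = sum(m.startswith("img") for m in history)
    excess = max(0, n_shots - keep)
    return _drop_oldest_screenshots(history, (excess // batch) * batch)


def simulate(policy, steps=12):
    """Count steps whose prompt extends the previous prompt unchanged,
    i.e. steps where the whole cached prefix could be reused."""
    history, prev, hits = ["sys"], None, 0
    for k in range(1, steps + 1):
        history.append(f"img{k}")          # new screenshot arrives
        prompt = policy(history)
        if prev is not None and common_prefix_len(prev, prompt) == len(prev):
            hits += 1
        prev = prompt
        history.append(f"act{k}")          # model's action for this step
    return hits


class LoopDetector:
    """Flags an episode when one action repeats too often in a recent window."""

    def __init__(self, window=6, max_repeats=4):
        self.recent = deque(maxlen=window)
        self.max_repeats = max_repeats

    def observe(self, action):
        """Record an action; return True if the episode should early-FAIL."""
        self.recent.append(action)
        return self.recent.count(action) >= self.max_repeats
```

With the defaults, `simulate(sliding_window)` reuses the full prefix on 2 of 11 step transitions and `simulate(batched_eviction)` on 9 of 11. The trade-off in this sketch is that batching holds up to `keep + batch - 1` screenshots in context at once. The detector's thresholds are placeholders and would need tuning against real episodes.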