Description: Mixture-of-Experts (MoE) models scale efficiently by routing each input to only a few specialists ("experts"). Most safety controls today sit after the model generates text (filters and classifiers), which means unsafe content can still be produced internally and is only blocked at the end. This paper proposes Safety-MoE, a design that pushes safety into the architecture: a risk-aware router that uses safety signals to choose experts; an auditor that monitors the output as it is being written and can switch to safer experts or halt generation; and a final gate that either approves the answer or abstains and offers a safe alternative. We formalize safety as a risk functional ℛ(x, y), train with a utility-vs-risk objective, and show how to calibrate the abstention threshold so that the overall system's risk stays under a target budget τ. A color-coded figure illustrates the architecture and the safety-utility trade-off.
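The threshold-calibration step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's actual procedure: the helper name `calibrate_threshold`, the use of a held-out calibration set of (auditor score, realized risk) pairs, and the convention that abstentions incur zero risk are all assumptions introduced here.

```python
import numpy as np

def calibrate_threshold(scores, risks, tau):
    """Choose the largest abstention threshold t such that answering only
    when the auditor's risk score is <= t keeps the system's average
    realized risk within the budget tau.

    scores: auditor risk scores on a calibration set (higher = riskier)
    risks:  realized risk R(x, y) for each calibration example
    tau:    target risk budget
    Abstained examples are assumed to contribute zero risk.
    """
    order = np.argsort(scores)
    s, r = scores[order], risks[order]
    n = len(s)
    # Average population risk if we answer only the k lowest-score items
    # (abstentions count as zero risk, hence dividing by n, not k).
    cum_risk = np.cumsum(r) / n
    ok = np.nonzero(cum_risk <= tau)[0]
    if len(ok) == 0:
        return -np.inf  # budget unattainable: abstain on everything
    return s[ok[-1]]
```

At inference time the gate answers when the auditor's score is at or below the calibrated threshold and abstains (offering the safe alternative) otherwise; tightening τ lowers the threshold and trades utility for safety.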