A Hugging Face model checkpoint that attaches a speculative decoding module to DeepSeek-V4-Flash, enabling million-token context handling with MoE architecture, FP4/FP8 mixed precision, and long-context inference optimizations.