kimi-k2.6-eagle3-mla (continual fine-tune)

Eagle3 MTP draft model with MLA (Multi-Latent Attention) for accelerating inference of Kimi-K2.6.

This checkpoint is a continual fine-tune of lightseekorg/kimi-k2.6-eagle3-mla, further trained on an in-house instruction distribution. The architecture, config, and tensor layout are identical to the base model, so it is a drop-in replacement in the same vLLM serving path.

Training Setup

Init: continual FT from lightseekorg/kimi-k2.6-eagle3-mla.
Framework: Camelot online speculative-decoding training — concurrent FSDP training + vLLM rollout with cross-node hidden-state transfer over Mooncake (RDMA).
Topology: 1 datagen node (Kimi-K2.6, TP=8) -> Mooncake -> 1 trainer GPU.
Schedule: an initial 10k-step cosine phase (LR 2e-5), then a constant low-LR (2e-6) refinement phase continued from the best checkpoint. This published checkpoint is the best-val checkpoint of the refinement phase.
Data: online-generated hidden states from a ~100k-prompt instruction set (each sample consumed once).
seq len 4096, Eagle3 TTT steps 3, global batch size 1.

Validation (training-time, teacher-forced)

Per-position draft accuracy on a fixed held-out val split, measured during training. acc@i is the accuracy at TTT position i (i = 0, 1, 2): full_acc@i requires positions 0..i all correct; cond_acc@i is conditioned on 0..i-1 correct. This is a training-time, teacher-forced metric on a small online-sampled val split — it indicates per-position draft quality but is not a runtime accept-length and is not comparable across runs/splits.

Best checkpoint of the refinement phase (this published checkpoint):

metric	value
val_loss	3.608
full_acc@0	0.799
full_acc@1	0.517
full_acc@2	0.306
cond_acc@1	0.648
cond_acc@2	0.594

A runtime accept_length benchmark (vLLM 0.20, num_speculative_tokens=3) on a common held-out set is pending and will be added once measured.

Quick Start (vLLM >= 0.20.0)

vllm serve moonshotai/Kimi-K2.6 \
    --tensor-parallel-size 8 \
    --speculative-config '{"model": "k-l-lambda/kimi-k2.6-eagle3-mla", "method": "eagle3", "num_speculative_tokens": 3}' \
    --trust-remote-code

Downloads last month: 17

Safetensors

Model size

3B params

Tensor type

BF16

Model tree for k-l-lambda/kimi-k2.6-eagle3-mla

Base model

moonshotai/Kimi-K2.6

Finetuned

lightseekorg/kimi-k2.6-eagle3-mla

Finetuned

(1)

this model