SMOLM2Prover - GGUF Format

GGUF quantized version of the SMOLM2Prover model for use with llama.cpp and compatible runtimes.

Model Details

  • Original Model: reaperdoesntknow/SMOLM2Prover
  • Architecture: LlamaForCausalLM
  • Parameters: ~0.4B
  • Context Length: 8192 tokens
  • Embedding Dimension: 960
  • Layers: 32
  • Attention Heads: 15 query / 5 key-value (grouped-query attention, GQA)
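The GQA layout above directly bounds inference memory: only the 5 key-value heads are cached, not all 15 query heads. A back-of-envelope sketch (assuming an F16 KV cache and head_dim = embedding_dim / query_heads, both standard for llama.cpp):

```python
# Back-of-envelope KV-cache size for the dimensions listed above.
# Assumes an F16 (2 bytes/element) KV cache and head_dim = 960 / 15.
n_layers = 32
n_kv_heads = 5            # GQA: 5 KV heads shared across 15 query heads
head_dim = 960 // 15      # = 64
ctx = 8192
bytes_per_elem = 2        # F16

# K and V are each (n_layers, ctx, n_kv_heads, head_dim)
kv_cache_bytes = 2 * n_layers * ctx * n_kv_heads * head_dim * bytes_per_elem
print(f"KV cache at full context: {kv_cache_bytes / 2**20:.0f} MiB")  # → 320 MiB
```

So even at the full 8192-token context, the cache stays around 320 MiB; with 15 full KV heads it would be three times that.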

Available Files

File                        Size   Quantization   Quality
SMOLM2Prover.gguf           692M   F16            Original (no quantization)
SMOLM2Prover-Q4_K_M.gguf    258M   Q4_K_M         Recommended (good quality/size balance)

Usage

With llama.cpp

# Run with the quantized model
./llama-cli -m SMOLM2Prover-Q4_K_M.gguf -p "Your prompt here" -n 256
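llama.cpp also ships llama-server, which exposes an OpenAI-compatible HTTP API. The sketch below builds a chat request against a locally running server; the endpoint path and port 8080 are llama.cpp defaults, and the prompt is only an example:

```python
import json
import sys
import urllib.request

# Assumes llama-server is already running locally, e.g.:
#   ./llama-server -m SMOLM2Prover-Q4_K_M.gguf --port 8080
URL = "http://localhost:8080/v1/chat/completions"  # OpenAI-compatible endpoint

payload = {
    "model": "SMOLM2Prover-Q4_K_M",  # informational; llama-server serves the loaded file
    "messages": [
        {"role": "user", "content": "Prove that the sum of two even numbers is even."}
    ],
    "max_tokens": 256,
}

def send(url: str = URL) -> dict:
    """POST the chat request and return the parsed JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__" and "--send" in sys.argv:
    print(send()["choices"][0]["message"]["content"])
```

Run with `--send` once the server is up; the response follows the usual OpenAI chat-completions shape (`choices[0].message.content`).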

With Ollama

Create a Modelfile:

FROM ./SMOLM2Prover-Q4_K_M.gguf

Then:

ollama create smolm2prover -f Modelfile
ollama run smolm2prover
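By default Ollama may run with a shorter context than the model supports. A slightly fuller Modelfile can pin it to the trained 8192-token window (the parameter values below are suggestions, not shipped defaults):

```
FROM ./SMOLM2Prover-Q4_K_M.gguf

# Match the model's trained context length
PARAMETER num_ctx 8192
PARAMETER temperature 0.7
```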

With LM Studio

  1. Download SMOLM2Prover-Q4_K_M.gguf
  2. Place in LM Studio models folder
  3. Load and chat!

Quantization Details

The Q4_K_M quantization uses:

  • Q4_K for most weights
  • Q5_0 fallback for tensors whose dimensions are not a multiple of the 256-element K-quant superblock
  • Q6_K/Q8_0 for some critical layers

Size reduction: 692M → 258M (63% smaller)
Bits per weight (BPW): 5.94
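These figures can be sanity-checked from the rounded file sizes alone. The sketch below infers the weight count from the F16 file (2 bytes per weight) and recomputes the reduction and BPW; the small gap versus the reported 5.94 comes from rounded sizes and file metadata:

```python
# Approximate the quantization stats from the rounded file sizes above.
f16_bytes = 692 * 2**20   # 692M file, F16 = 2 bytes per weight
q4_bytes = 258 * 2**20    # 258M Q4_K_M file

n_params = f16_bytes / 2                  # ~363M weights implied by the F16 file
reduction = 1 - q4_bytes / f16_bytes      # fraction of size saved
bpw = q4_bytes * 8 / n_params             # average bits per weight

print(f"reduction: {reduction:.0%}, BPW: {bpw:.2f}")
```

This reproduces the 63% reduction and lands within a few hundredths of a bit of the reported BPW.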

Discrepancy Calculus Foundation

This model is part of the Convergent Intelligence LLC: Research Division portfolio. All models in this portfolio are developed under the Discrepancy Calculus (DISC) framework — a measure-theoretic approach to understanding and controlling the gap between what a model should produce and what it actually produces.

DISC treats training singularities (loss plateaus, mode collapse, catastrophic forgetting) not as failures to be smoothed over, but as structural signals that reveal the geometry of the learning problem. Key concepts:

  • Discrepancy Operator (D): Measures the gap between expected and observed behavior at each training step
  • Jump Sets: Boundaries where model behavior changes discontinuously — these are features, not bugs
  • Ghost Imprinting: Teacher knowledge that transfers to student models through weight-space topology rather than explicit distillation signal
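Purely as an illustration of the jump-set idea — not the DISC operators themselves, which are defined measure-theoretically in the cited paper — one can flag discontinuous behavior in a training-loss trace by thresholding each step's deviation from the recent trend:

```python
import statistics

def jump_set(losses, window=5, k=3.0):
    """Illustrative only: indices where the step-to-step loss change
    deviates from the recent trend by more than k standard deviations."""
    jumps = []
    for i in range(window + 1, len(losses)):
        recent = [losses[j] - losses[j - 1] for j in range(i - window, i)]
        mu = statistics.mean(recent)
        sigma = max(statistics.pstdev(recent), 1e-6)  # floor avoids div-by-noise
        delta = losses[i] - losses[i - 1]
        if abs(delta - mu) > k * sigma:
            jumps.append(i)
    return jumps

# A smoothly decreasing loss curve with one injected discontinuity at step 12
curve = [1.0 - 0.01 * t for t in range(20)]
curve[12] += 0.5
print(jump_set(curve))
```

In DISC terms, the detector treats the flagged step as a signal about the loss landscape rather than noise to be averaged away; the real framework formalizes this with the discrepancy operator D.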

For the full mathematical treatment, see Discrepancy Calculus: Foundations and Core Theory (DOI: 10.57967/hf/8194).

Citation chain: Structure Over Scale (DOI: 10.57967/hf/8165) → Three Teachers to Dual Cognition (DOI: 10.57967/hf/8184) → Discrepancy Calculus (DOI: 10.57967/hf/8194)

License

Same as the original model.


Convergent Intelligence Portfolio

Part of the standalone model series from Convergent Intelligence LLC: Research Division

Related Models

Model              Downloads   Format
SMOLM2Prover       56          HF
DeepReasoning_1R   16          HF
SAGI               3           HF
S-AGI              0           HF

Top Models from Our Lab

Total Portfolio: 41 models | 2,781 total downloads

Last updated: 2026-03-28 12:55 UTC


From the Convergent Intelligence Portfolio

DistilQwen Collection — Our only BF16 series. Proof-weighted distillation from Qwen3-30B-A3B → 1.7B and 0.6B on H100. Three teacher variants (Instruct, Thinking, Coder), nine models, 2,788 combined downloads. The rest of the portfolio proves structure beats scale on CPU. This collection shows what happens when you give the methodology real hardware.

Top model: Qwen3-1.7B-Coder-Distilled-SFT — 508 downloads

Full methodology: Structure Over Scale (DOI: 10.57967/hf/8165)

Convergent Intelligence LLC: Research Division


Part of the reaperdoesntknow research portfolio — 48 models, 12,094 total downloads | Last refreshed: 2026-03-29 21:05 UTC
