# Model Card for Recursive Transformer Model (RTM) / ERS PyTorch Implementation
This is the official PyTorch implementation of the Recursive Transformer Model (RTM), a novel architecture that augments standard Transformer-based systems with recursive memory reconsideration, temporal decay mechanisms, and Persistent Memory Logic Loops (PMLL). It addresses "nostalgic incorrectness" (the tendency of stateless AI to retain outdated or contradictory beliefs) by maintaining coherent, self-correcting state across inference sessions. The production-grade reference implementation is the Enhanced Reconsideration System (ERS) library, which includes PyTorch components for embeddings, lattice-based tensor routing, multi-petal attention, and knowledge-graph integration.
The Kaggle-hosted PyTorch model provides the core RTM/ERS runtime (including PMLLLattice, MemoryBlock, temporal decay, consensus, and contradiction detection) for integration with any LLM/transformer stack. It is not a standalone pretrained language model but a stateful memory layer/framework.
## Model Details
### Model Description
The Recursive Transformer Model (RTM) extends the classic Transformer architecture with:
- Adaptive temporal decay on memory confidence.
- Multi-dimensional consensus via embedding-space geometry and knowledge graphs.
- Vector-based contradiction detection with integrated rewrite capabilities.
- Persistent Memory Logic Loops (PMLL): a lattice-based DAG for compressed, low-rank tensor routing and recursive passes over memory slots.
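As an illustration of the multi-pass idea, the sketch below decays confidence on each pass and flags memory pairs whose embeddings are strongly opposed as contradictions. All names, thresholds, and the loop structure are illustrative assumptions, not the ERS API:

```python
import math

def cosine(a, b):
    """Plain cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def reconsider(memories, passes=2, tau_contra=-0.5):
    """Toy multi-pass reconsideration: decay confidence, then flag
    pairs whose embeddings point in opposite directions."""
    contradictions = []
    for _ in range(passes):
        for m in memories:
            m["conf"] *= math.exp(-m["lam"])  # temporal decay step
        contradictions = [
            (a["text"], b["text"])
            for i, a in enumerate(memories)
            for b in memories[i + 1:]
            if cosine(a["emb"], b["emb"]) < tau_contra
        ]
        # A real loop would also rewrite flagged slots and stop early
        # once embedding drift between passes falls below a delta.
    return contradictions

mems = [
    {"text": "Paris is the capital of France", "emb": [1.0, 0.1], "conf": 0.9, "lam": 0.05},
    {"text": "Lyon is the capital of France", "emb": [-1.0, -0.1], "conf": 0.6, "lam": 0.05},
]
print(reconsider(mems))  # flags the opposed pair
```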
Key innovations solve the stateless limitation of standard transformers by enabling iterative, multi-pass reconsideration of beliefs during inference. The Enhanced Reconsideration System (ERS) is the complete, production-ready Python/PyTorch reference implementation.
- Developed by: Dr. Josef “Q.” Edwards (Josef Kurk Edwards / josefedwards / drQedwards), University of Colorado Boulder
- Funded by: U.S. Department of Defense (funder identifier 100000005)
- Shared by: Josef Edwards (via Kaggle and GitHub)
- Model type: Recursive Transformer extension / stateful memory framework (PMLL + ERS)
- Language(s) (NLP): Language-agnostic (works with any text/embedding-based input; primarily demonstrated on English factual/knowledge-base tasks)
- License: MIT (see ERS repository)
- Finetuned from model: Not finetuned; augments any base Transformer (integrates with sentence-transformers, LangChain, etc.)
### Model Sources
- Repository: Kaggle Model • GitHub ERS (primary implementation) • GitHub PMLL_archive
- Paper: Edwards, J. K. (2025). The Recursive Transformer Model: Architecture, Theory, and Implementation with Persistent Memory Logic Loops. TechRxiv. DOI: 10.36227/techrxiv.176118936.69886233/v1 (October 23, 2025)
- Demo: See ERS README quick-start example (async memory reconsideration loop)
## Uses
### Direct Use
Use as a drop-in memory layer for any Transformer/LLM pipeline:
- Add factual or conversational memories.
- Run recursive reconsideration loops (temporal decay → consensus → contradiction detection → optional rewrite).
- Persist state across sessions via JSON + safetensors.
Ideal for agents, chatbots, or knowledge-intensive applications that require long-term coherence.
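Cross-session persistence might look like the following minimal sketch, which keeps the non-tensor fields in stdlib JSON (in ERS, tensor payloads are stored with safetensors; the helper names here are assumptions):

```python
import json
import os
import tempfile
import time

def save_state(memories, path):
    """Write non-tensor memory fields to JSON; embeddings would be
    saved separately with safetensors in a real setup."""
    with open(path, "w") as f:
        json.dump({"saved_at": time.time(), "memories": memories}, f)

def load_state(path):
    """Return the saved memories, or an empty store on a fresh session."""
    if not os.path.exists(path):
        return []
    with open(path) as f:
        return json.load(f)["memories"]

path = os.path.join(tempfile.gettempdir(), "ers_state_demo.json")
save_state([{"text": "Paris is the capital of France", "conf": 0.9}], path)
print(load_state(path)[0]["text"])  # prints the restored memory text
```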
### Downstream Use
- Integrate with LangChain agents or any LLM stack via Graphiti/Mem0 knowledge graphs.
- Extend base models (e.g., Llama, Mistral) with stateful recursive passes.
- Use in production AI systems needing self-correction and belief updating.
### Out-of-Scope Use
- Not intended as a standalone generative LLM.
- Not suitable for real-time low-latency inference without hardware acceleration (multiple recursive passes add compute).
- Avoid use in safety-critical systems without additional ethical/guardrail layers (rewrites can be LLM-guided).
## Bias, Risks, and Limitations
- Technical limitations: Recursive loops increase inference-time compute; performance depends on embedding quality and KG backend (Neo4j recommended for Graphiti).
- Sociotechnical risks: Automated memory rewrites could propagate or amplify biases present in the underlying LLM or knowledge graph. Contradiction detection relies on embedding geometry and may miss subtle nuances.
- Nostalgic incorrectness mitigation: The core goal is to reduce outdated beliefs, but incorrect source data or poor consensus thresholds can still lead to erroneous updates.
### Recommendations
Users should:
- Monitor rewrite logs and confidence deltas.
- Use high-quality, verified knowledge graphs.
- Apply domain-specific safety policies before committing rewrites.
- Test with synthetic contradictory memory scenarios to validate behavior.
## How to Get Started with the Model
```python
# Via Kaggle (PyTorch model) or direct from the ERS GitHub repository.
# Install dependencies (from the ERS README):
#   pip install torch sentence-transformers safetensors mem0-ai graphiti-core langchain langchain-community
import asyncio

from ERS import EnhancedReconsiderationSystem, MemoryBlock, ERSPromise  # or load from Kaggle PyTorch weights

async def main():
    ers = EnhancedReconsiderationSystem()  # loads saved state if present
    await ers.add_memory("Paris is the capital of France")
    await ers.add_memory("Paris is the largest city in France")  # contradictory example
    await ers.reconsider_deferred()
    await ers.recursive_loop_check()  # performs RTM-style multi-pass reconsideration
    await ers.close()

asyncio.run(main())
```
Full usage and configuration details are in the ERS GitHub README. The Kaggle PyTorch model loads the core PMLLLattice and related tensors.
## Training Details
### Training Data
None (this is an architectural extension/framework, not a pretrained LLM). It operates on top of any Transformer embeddings (e.g., via sentence-transformers). Memory content is user-provided or agent-generated.
### Training Procedure
#### Preprocessing
Memory blocks are created with embeddings (via sentence-transformers), timestamps, confidence scores, and SHA-256 hashes. Optional KG indexing via Graphiti/Mem0.
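A memory block along these lines might look like the following sketch (field names are illustrative, not the exact ERS definitions):

```python
import hashlib
import time
from dataclasses import dataclass, field

@dataclass
class MemoryBlock:
    """Illustrative memory block: text, embedding, confidence,
    timestamp, and a SHA-256 content digest."""
    text: str
    embedding: list  # would come from sentence-transformers in practice
    confidence: float = 1.0
    timestamp: float = field(default_factory=time.time)

    @property
    def digest(self) -> str:
        # SHA-256 over the raw text gives a stable content identity
        return hashlib.sha256(self.text.encode("utf-8")).hexdigest()

mb = MemoryBlock("Paris is the capital of France", embedding=[0.1, 0.9])
print(mb.digest[:12])  # short prefix of the content hash
```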
#### Training Hyperparameters
- Training regime: Not applicable (no end-to-end training). Runtime inference uses PyTorch (fp32/bf16 supported via torch).
- Configuration options (RTM integration):
  `passes: 2`, `early_stop_cosine_delta: 0.002`, `max_rewrites_per_slot: 1`, `decay_alpha: 0.95`, adaptive λ decay rates, similarity threshold τ_sim, etc. (fully configurable in ERS).
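Bundling those options into a config object might look like this (the dataclass itself is illustrative, not the ERS API; the τ_sim default is a placeholder, while the other defaults come from the list above):

```python
from dataclasses import dataclass

@dataclass
class RTMConfig:
    """Illustrative configuration bundle for RTM integration."""
    passes: int = 2
    early_stop_cosine_delta: float = 0.002
    max_rewrites_per_slot: int = 1
    decay_alpha: float = 0.95
    tau_sim: float = 0.85  # similarity threshold (placeholder default)

cfg = RTMConfig(passes=3)  # override any option per run
print(cfg.passes, cfg.decay_alpha)
```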
#### Speeds, Sizes, Times
The ERS reference implementation demonstrates real-time recursive reconsideration. Exact throughput depends on hardware, the number of recursive passes, and the KG backend; the lattice uses low-rank compression for scalability.
## Evaluation
### Testing Data, Factors & Metrics
No public benchmark datasets or quantitative results published in the preprint. Evaluation is qualitative/conceptual via synthetic contradictory memory scenarios (e.g., Paris facts example) and convergence metrics (confidence delta, rewrite count, cosine similarity shifts).
#### Factors
- Memory age, source quality, domain volatility, embedding similarity.
#### Metrics
- Nostalgic Incorrectness (NI) metric defined in paper.
- Consensus score, contradiction score, confidence update delta.
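One way to realize consensus and contradiction scores from embedding geometry is sketched below, under the assumption that both reduce to cosine similarity against the rest of the store; the function names are ours, and the paper's NI metric is not reproduced here:

```python
import math

def cosine(a, b):
    """Plain cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def consensus_score(target, others):
    """Mean similarity to the rest of the store: high = well supported."""
    return sum(cosine(target, o) for o in others) / len(others)

def contradiction_score(target, others):
    """Strongest opposition found: near 1.0 = a near-inverse memory exists."""
    return max(-cosine(target, o) for o in others)

target = [1.0, 0.0]
others = [[0.9, 0.1], [-1.0, 0.0]]
print(round(consensus_score(target, others), 3),
      round(contradiction_score(target, others), 3))
```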
### Results
[More Information Needed] — Paper focuses on theoretical framework and architectural feasibility rather than large-scale empirical benchmarks. ERS demonstrates real-time recursive reconsideration.
#### Summary
The model successfully maintains coherent state and resolves contradictions in controlled memory scenarios.
## Model Examination
Interpretability is built-in: per-pass logs of embedding shifts, confidence changes, rewrite proposals, and KG updates. Visualize memory graph evolution (planned roadmap feature).
## Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: [More Information Needed] (tested on standard CPU/GPU with PyTorch)
- Hours used: [More Information Needed]
- Cloud Provider: [More Information Needed]
- Compute Region: [More Information Needed]
- Carbon Emitted: [More Information Needed]
## Technical Specifications
### Model Architecture and Objective
- Base: Transformer stack with augmented embedding layer and reconsideration head.
- Key equations: temporal decay $\text{conf}_i(t) = \text{conf}_i(0) \cdot e^{-\lambda_i (t - t_i)} \cdot \dots$, consensus scoring, integrated confidence update, PMLL lattice (DAG with quantization and low-rank compression).
- Objective: Stateful, self-correcting memory across sessions.
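Worked numerically, the decay term alone behaves as follows (the paper's additional multiplicative factors, elided in the equation above, are omitted; the per-day units are an assumption for illustration):

```python
import math

def decayed_confidence(conf0, lam, age):
    """conf_i(t) = conf_i(0) * exp(-lambda_i * (t - t_i)); the paper's
    further multiplicative factors are omitted in this sketch."""
    return conf0 * math.exp(-lam * age)

# A memory stored at confidence 0.9 with decay rate 0.05 per day:
print(round(decayed_confidence(0.9, 0.05, 10.0), 4))  # → 0.5459 after 10 days
```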
### Compute Infrastructure
#### Hardware
Standard PyTorch-compatible (CPU/GPU).
#### Software
Python 3.8+, PyTorch, sentence-transformers, safetensors, mem0-ai, graphiti-core, LangChain.
## Citation
BibTeX:
```bibtex
@article{edwards2025recursive,
  author  = {Edwards, Josef Kurk},
  title   = {The Recursive Transformer Model: Architecture, Theory, and Implementation with Persistent Memory Logic Loops},
  journal = {TechRxiv},
  year    = {2025},
  month   = {October},
  doi     = {10.36227/techrxiv.176118936.69886233/v1},
  url     = {https://www.techrxiv.org/users/856117/articles/1345789}
}
```
APA: Edwards, J. K. (2025). The Recursive Transformer Model: Architecture, Theory, and Implementation with Persistent Memory Logic Loops. TechRxiv. https://doi.org/10.36227/techrxiv.176118936.69886233/v1
## Glossary
- PMLL: Persistent Memory Logic Loop — lattice-based memory compression and routing.
- ERS: Enhanced Reconsideration System — production Python/PyTorch library.
- Nostalgic Incorrectness: Retention of outdated/conflicting beliefs in stateless models.
## More Information
- Full paper and math: TechRxiv preprint.
- Live implementation: ERS GitHub.
- Related work: Hybrid TRM-RTM model, PMLL P=NP proof paper (separate preprint).
## Model Card Authors
Compiled by Dr. Q (Josef Edwards) from public sources.
## Model Card Contact
Josef Edwards (Kaggle: josefedwards, GitHub: drqedwards, Email: joed6834@colorado.edu) or open an issue on the ERS repository.