N8Programs/NextTerm-440M
N8Programs/NextTerm-440M is a 440-million-parameter causal transformer based on the Qwen3 architecture and designed specifically for integer sequence prediction. Trained on an extended OEIS corpus, it is strongest at continuing integer sequences in long-context settings and at polynomial continuation tasks. The model uses a compact 16-token digit vocabulary and was trained on 14 billion tokens with a 4096-token context length.
NextTerm-440M: Specialized Integer Sequence Prediction
NextTerm-440M is a 440-million-parameter causal transformer with a Qwen3-style architecture, optimized for continuing integer sequences. Developed by N8Programs, it was trained on an augmented OEIS (On-Line Encyclopedia of Integer Sequences) corpus that adds terms from b-files and applies various prefix-preserving data transformations.
Key Capabilities & Features
- Integer Sequence Continuation: Specifically designed to predict the next terms in integer sequences.
- Optimized for Long Context: Improves significantly over previous models like NextTerm-47M in long-context sequence continuation and long-range in-context learning, with a training context length of 4096 tokens.
- Specialized Vocabulary: Uses a compact 16-token digit vocabulary that includes the decimal digits, a negative sign, and a comma separator. Integers are written digit by digit, so their magnitude is not limited by the vocabulary.
- Strong Polynomial Continuation: Demonstrates high accuracy in continuing sequences generated from polynomials of degree 1 through 4, outperforming larger Qwen3 models in cubic and quartic polynomial tasks.
- Robust Training: Trained for 14 billion tokens using a Muon/AdamW hybrid optimizer on a single H100 GPU.
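To make the digit-level vocabulary and the 4096-token context concrete, here is a minimal sketch of a character-level encoder. It is not the model's actual tokenizer: the model card confirms only the ten digits, the negative sign, and the comma, so the four special tokens, the token IDs, and the helper names (`encode`, `decode`, `fits_in_context`) are assumptions made for illustration.

```python
# Hypothetical 16-token vocabulary. Only the digits, "-" and "," are stated
# in the model card; the four special tokens and all IDs are assumptions.
SPECIALS = ["<pad>", "<bos>", "<eos>", "<unk>"]  # assumed, not from the source
SYMBOLS = list("0123456789") + ["-", ","]
VOCAB = {tok: i for i, tok in enumerate(SPECIALS + SYMBOLS)}
ID_TO_TOKEN = {i: tok for tok, i in VOCAB.items()}

CONTEXT_LENGTH = 4096  # training context length stated in the model card


def encode(terms):
    """Encode a list of ints as one token per character of e.g. '1,-20,300'."""
    text = ",".join(str(t) for t in terms)
    return [VOCAB[ch] for ch in text]


def decode(ids):
    """Invert encode(), skipping the assumed special tokens."""
    text = "".join(ID_TO_TOKEN[i] for i in ids if ID_TO_TOKEN[i] not in SPECIALS)
    return [int(part) for part in text.split(",") if part not in ("", "-")]


def fits_in_context(terms, reserve=0):
    """True if the encoded terms plus `reserve` generated tokens fit in the context."""
    return len(encode(terms)) + reserve <= CONTEXT_LENGTH
```

Under this scheme a term costs one token per digit plus one for each sign and separator, so `encode([1, -20, 300])` uses nine tokens. Large terms therefore consume the context budget faster than small ones.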
Performance Highlights
NextTerm-440M scores 34.43% on OEIS-Eval-Neo and a macro MAPE of 17.6239 on the M1 Competition 111. It also reaches 86.39% accuracy on quadratic and 75.20% accuracy on cubic polynomial continuation, significantly surpassing other Qwen3 models in its class on these tasks.
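The sketch below shows one way the two kinds of metric reported above could be computed. It is illustrative only and not the author's evaluation code: it assumes polynomial continuation is scored by exact match on the continued terms, and that "macro MAPE" means the per-series mean absolute percentage error averaged with equal weight per series. All function names are hypothetical.

```python
def poly_sequence(coeffs, n_terms, start=0):
    """Terms of sum(c_k * n**k) for n = start, start + 1, ... (integer coefficients)."""
    return [sum(c * n**k for k, c in enumerate(coeffs))
            for n in range(start, start + n_terms)]


def exact_match_accuracy(predictions, targets):
    """Fraction of cases whose predicted continuation equals the target exactly."""
    hits = sum(1 for p, t in zip(predictions, targets) if list(p) == list(t))
    return hits / len(targets)


def mape(actual, forecast):
    """Mean absolute percentage error for one series, in percent.

    Assumes no actual value is zero, since the error is divided by it.
    """
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def macro_mape(series_pairs):
    """Average the per-series MAPE so that every series counts equally."""
    scores = [mape(actual, forecast) for actual, forecast in series_pairs]
    return sum(scores) / len(scores)
```

For example, `poly_sequence([1, 0, 1], 6)` yields the quadratic sequence `[1, 2, 5, 10, 17, 26]` for n² + 1. A test case could show the first four terms as the prompt and hold out the last two as the target.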
When to Use This Model
This model is ideal for applications requiring precise and robust prediction of integer sequences, especially those found in mathematical contexts like the OEIS. Its specialized training makes it a strong candidate for research in sequence prediction, mathematical problem-solving, and educational tools focused on number patterns.
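As a starting point, the following sketch shows how a continuation might be requested through the Hugging Face `transformers` library. Several things here are assumptions rather than documented behavior: that the checkpoint loads through `AutoTokenizer` and `AutoModelForCausalLM`, that prompts are comma-separated terms with a trailing comma, and the helper names `format_prompt`, `parse_terms`, and `continue_sequence`. Check the upstream repository for the exact prompt format before relying on it.

```python
import re

MODEL_ID = "N8Programs/NextTerm-440M"  # repository name from the model card


def format_prompt(terms):
    """Join known terms with commas and end with a comma to cue the next term.

    The comma-separated layout is an assumption based on the vocabulary description.
    """
    return ",".join(str(t) for t in terms) + ","


def parse_terms(text):
    """Return the complete integers in generated text.

    The piece after the last comma is dropped because generation may have
    stopped partway through a number.
    """
    complete = text.split(",")[:-1]
    return [int(p) for p in complete if re.fullmatch(r"-?\d+", p.strip())]


def continue_sequence(terms, max_new_tokens=32):
    """Greedy-decode a continuation (assumes the checkpoint loads via the Auto classes)."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(format_prompt(terms), return_tensors="pt")
    output = model.generate(
        inputs["input_ids"], max_new_tokens=max_new_tokens, do_sample=False
    )
    new_tokens = output[0][inputs["input_ids"].shape[1]:]  # strip the prompt
    return parse_terms(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

Calling `continue_sequence([1, 4, 9, 16])` would download the weights and return whatever complete terms the model generates. Greedy decoding (`do_sample=False`) is used so that repeated runs give the same continuation.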