NovoCode/Novocode7b-v2
NovoCode/Novocode7b-v2 is a 7 billion parameter causal language model based on the Mistral architecture, developed by NovoCode. Trained from scratch on the /leet10k-alpaca dataset, it features a 4096-token context length and is optimized for general language understanding and generation tasks. This model demonstrates competitive performance across various benchmarks, including MMLU and HellaSwag, making it suitable for a range of applications requiring robust language capabilities.
NovoCode/Novocode7b-v2 Overview
NovoCode/Novocode7b-v2 is a 7 billion parameter causal language model built on the Mistral architecture. It was trained from scratch on the /leet10k-alpaca dataset, with a focus on general language understanding and generation. The model uses a 4096-token context length and was trained with a learning rate of 5e-06 and a batch size of 8 (with gradient accumulation).
Key Capabilities & Performance
This model demonstrates solid performance across several benchmarks, as evaluated on the Open LLM Leaderboard:
- Average Score: 56.57
- MMLU (5-Shot): 64.05
- HellaSwag (10-Shot): 84.12
- AI2 Reasoning Challenge (25-Shot): 61.01
- Winogrande (5-Shot): 79.87
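The "k-shot" settings above indicate how many solved examples are prepended to each test question at evaluation time. A minimal sketch of building such a prompt (the questions and answers below are placeholders, not actual benchmark items):

```python
def build_few_shot_prompt(examples, question):
    """Prepend k solved examples before the test question, as in k-shot evaluation."""
    parts = [f"Question: {q}\nAnswer: {a}\n" for q, a in examples]
    parts.append(f"Question: {question}\nAnswer:")
    return "\n".join(parts)


# Placeholder examples standing in for benchmark items (a 5-shot run would use five).
shots = [
    ("What is 2 + 2?", "4"),
    ("What color is the sky on a clear day?", "Blue"),
]
prompt = build_few_shot_prompt(shots, "What is the capital of France?")
print(prompt)
```

The model then continues the text after the final "Answer:", and the continuation is scored against the reference answer.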
While the model shows strong results on reasoning and commonsense benchmarks, its low GSM8k score (8.19) indicates that mathematical tasks remain an area for further specialization. The training run used 1 epoch with a cosine learning rate scheduler and flash attention enabled for efficiency.
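The cosine scheduler mentioned above decays the learning rate from its peak (5e-06 here) toward zero over the training run. A minimal sketch of that decay, assuming the common half-cosine form with no warmup (the total step count is illustrative and not taken from the actual run):

```python
import math

PEAK_LR = 5e-06     # learning rate stated in the model card
TOTAL_STEPS = 1000  # illustrative; the actual step count is not stated


def cosine_lr(step: int) -> float:
    """Half-cosine decay from PEAK_LR at step 0 down to ~0 at TOTAL_STEPS."""
    progress = min(step, TOTAL_STEPS) / TOTAL_STEPS
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(cosine_lr(0))               # peak learning rate
print(cosine_lr(TOTAL_STEPS))     # decayed to ~0 at the end of training
```

With a single epoch, TOTAL_STEPS equals the number of optimizer updates over one pass of the dataset, so the rate reaches its minimum exactly as training ends.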
Intended Uses
NovoCode/Novocode7b-v2 is suitable for a variety of natural language processing tasks, including:
- General text generation and completion
- Question answering
- Summarization
- Reasoning tasks, where it performs well on benchmarks like MMLU and ARC
It provides a capable base model for developers looking for a 7B parameter solution with a Mistral-derived architecture.
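A hedged usage sketch with Hugging Face transformers, assuming the model is hosted under this repo id and that the environment has enough memory for a 7B model in fp16 (the prompt is a hypothetical example):

```python
MODEL_ID = "NovoCode/Novocode7b-v2"
MAX_CONTEXT = 4096  # context length stated in the model card


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a completion; downloads the model weights on first call."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16, device_map="auto"
    )
    inputs = tokenizer(
        prompt, return_tensors="pt", truncation=True, max_length=MAX_CONTEXT
    ).to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("Write a Python function that reverses a string."))
```

Truncating the input to MAX_CONTEXT keeps prompts within the model's 4096-token window; longer inputs are cut rather than causing errors.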