agentlans/Llama3.1-SuperDeepFuse-CrashCourse12K

Text Generation · Model size: 8B · Quantization: FP8 · Context length: 32k · Published: Jan 24, 2025 · License: llama3.1 · Architecture: Transformer · Concurrency cost: 1

agentlans/Llama3.1-SuperDeepFuse-CrashCourse12K is an 8B-parameter multilingual instruction-tuned language model, fine-tuned from Llama3.1-SuperDeepFuse on 12,000 samples drawn from high-quality instruct datasets. It targets improved multi-task reasoning, mathematics, and coding, and aims to deliver stronger instruction following at the 8B scale.


Model Overview

Llama3.1-SuperDeepFuse-CrashCourse12K is an 8 billion parameter, multilingual, instruction-tuned language model developed by agentlans. It is based on the Llama3.1-SuperDeepFuse model and has been further fine-tuned using 12,000 samples from the agentlans/crash-course dataset, which aggregates data from 10 high-quality instruct datasets.

Key Capabilities

  • Enhanced Multi-task Reasoning: Designed to improve performance across various complex tasks.
  • Mathematics and Coding: Shows improved capabilities in mathematical problem-solving and code generation.
  • Instruction Following: Aims for better adherence to given instructions.
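In practice, `tokenizer.apply_chat_template` builds the prompt for you; the sketch below just makes the single-turn prompt format explicit, assuming the model keeps the stock Llama 3.1 chat template from its base (the card does not state otherwise, so this is an assumption):

```python
# Sketch of the Llama 3.1 instruct prompt format. ASSUMPTION: this
# fine-tune inherits the template unchanged from Llama3.1-SuperDeepFuse.

def build_prompt(system: str, user: str) -> str:
    """Assemble a single-turn Llama 3.1 chat prompt ending at the
    assistant header, so generation continues as the assistant."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_prompt("You are a helpful assistant.", "What is 7 * 8?")
```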

Training Details

The model was fine-tuned using LoRA (Low-Rank Adaptation) with a maximum sequence length of 2048. Training ran for one epoch and used 4-bit quantization (bitsandbytes), BF16 precision, NEFTune embedding noise, and rank-stabilized LoRA (RS-LoRA) to balance quality and efficiency.
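A setup like the one described might be configured as follows with `peft`, `bitsandbytes`, and `trl`. This is a hedged sketch, not the author's actual recipe: the LoRA rank and alpha, target modules, NEFTune noise alpha, and output path are all assumed values not stated in the card.

```python
# Config sketch of the described fine-tuning setup: 4-bit bitsandbytes,
# BF16 compute, NEFTune, RS-LoRA, max sequence length 2048, one epoch.
# Rank/alpha, target modules, and neftune_noise_alpha are ASSUMPTIONS.
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit base-model weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # BF16 compute
)

peft_config = LoraConfig(
    task_type="CAUSAL_LM",
    r=16,                                   # assumed rank
    lora_alpha=32,                          # assumed alpha
    use_rslora=True,                        # rank-stabilized LoRA
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

train_args = SFTConfig(
    output_dir="crashcourse12k-lora",       # assumed path
    num_train_epochs=1,
    bf16=True,
    max_seq_length=2048,
    neftune_noise_alpha=5.0,                # NEFTune noise (assumed alpha)
)
```

These configs would then be passed to `trl`'s `SFTTrainer` along with the base model and the training dataset.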

Performance and Limitations

While this 8B model offers improved reasoning and instruction following, it may still underperform larger models. It can produce misleading or incorrect outputs, so results should be verified before use in critical applications.

Evaluation Results

According to the Open LLM Leaderboard, the model achieved an Average score of 27.93%. Specific metrics include:

  • IFEval (0-Shot): 71.87%
  • BBH (3-Shot): 31.83%
  • MATH Lvl 5 (4-Shot): 17.67%
  • MMLU-PRO (5-shot): 29.24%

For more detailed results, refer to the Open LLM Leaderboard.