j05hr3d/Llama-3.2-1B-Instruct-C_M_T-SAM-AUX_CT_CE-RHO0_05lr2
j05hr3d/Llama-3.2-1B-Instruct-C_M_T-SAM-AUX_CT_CE-RHO0_05lr2 is a 1 billion parameter instruction-tuned causal language model, fine-tuned by j05hr3d from the Meta Llama-3.2-1B-Instruct base model. Trained with Supervised Fine-Tuning (SFT) via TRL, this model is designed for general instruction-following tasks. It supports a context length of 32768 tokens, making it suitable for applications that process moderately long inputs.
Model Overview
This model, j05hr3d/Llama-3.2-1B-Instruct-C_M_T-SAM-AUX_CT_CE-RHO0_05lr2, is a 1 billion parameter instruction-tuned language model. It is built upon the meta-llama/Llama-3.2-1B-Instruct base model, indicating its foundation in the Llama 3.2 architecture.
Key Characteristics
- Base Model: Fine-tuned from meta-llama/Llama-3.2-1B-Instruct.
- Parameter Count: 1 billion parameters, offering a balance between performance and computational efficiency.
- Training Method: Utilizes Supervised Fine-Tuning (SFT) via the TRL library.
- Context Length: Supports a context window of 32768 tokens, allowing for processing of substantial input lengths.
Intended Use Cases
This model is primarily designed for general instruction-following tasks, leveraging its instruction-tuned nature. Its 1B parameter size makes it suitable for:
- Quick prototyping and development.
- Applications where computational resources are limited.
- Tasks requiring efficient text generation and understanding based on user prompts.
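A minimal usage sketch for these tasks, using the Transformers `pipeline` API. The model id is taken from this card; the prompt and generation settings (`max_new_tokens`) are illustrative assumptions, and running the function requires downloading the checkpoint from the Hugging Face Hub.

```python
# Sketch: load this instruction-tuned checkpoint and generate a reply
# to a chat-style prompt with the Transformers text-generation pipeline.
from transformers import pipeline

MODEL_ID = "j05hr3d/Llama-3.2-1B-Instruct-C_M_T-SAM-AUX_CT_CE-RHO0_05lr2"

def generate_reply(user_prompt: str, max_new_tokens: int = 128) -> str:
    # Instruct-tuned Llama checkpoints accept chat-style message lists;
    # the pipeline applies the tokenizer's chat template automatically.
    generator = pipeline("text-generation", model=MODEL_ID)
    messages = [{"role": "user", "content": user_prompt}]
    output = generator(messages, max_new_tokens=max_new_tokens)
    return output[0]["generated_text"][-1]["content"]

# Example call (downloads the model weights on first use):
# print(generate_reply("Summarize the benefits of small language models."))
```

Because the model is only 1B parameters, it can typically be run on a single consumer GPU or even CPU, which is what makes it practical for the prototyping and resource-constrained scenarios listed above.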
Training Details
The model was trained using specific versions of popular machine learning frameworks:
- TRL: 0.27.1
- Transformers: 4.57.6
- PyTorch: 2.10.0+cu128
- Datasets: 4.8.4
- Tokenizers: 0.22.2
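To reproduce this environment, the versions above can be pinned in a requirements file. This is a sketch: the extra index URL assumes a CUDA 12.8 PyTorch wheel is available for your platform; on other platforms, pin `torch==2.10.0` without the build suffix.

```text
# Pinned versions from the training details above (assumed pip-installable)
--extra-index-url https://download.pytorch.org/whl/cu128
trl==0.27.1
transformers==4.57.6
torch==2.10.0+cu128
datasets==4.8.4
tokenizers==0.22.2
```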