mjf-su/AutoVLA
AutoVLA is a 4 billion parameter instruction-tuned language model developed by mjf-su, fine-tuned using TRL. This model is designed for general text generation tasks, leveraging its 32768-token context length for processing longer inputs. It is a fine-tuned version of an unspecified base model, optimized for conversational AI and question answering.
Loading preview...
AutoVLA-sft: A Fine-Tuned Language Model
mjf-su/AutoVLA is a 4 billion parameter language model that has been fine-tuned using the TRL (Transformers Reinforcement Learning) library. This model is designed for text generation tasks, building upon an unspecified base model through supervised fine-tuning (SFT).
Key Capabilities
- General Text Generation: Capable of generating human-like text based on given prompts.
- Instruction Following: Optimized through fine-tuning to better understand and respond to instructions.
- Extended Context: Benefits from a 32768-token context window, allowing it to process and generate longer sequences of text.
Training Details
The model was trained using the SFT method, leveraging the TRL framework (version 1.4.0) alongside Transformers (4.57.6), Pytorch (2.10.0), Datasets (4.8.5), and Tokenizers (0.22.1). The training process can be visualized via Weights & Biases, as indicated in the original model card.
Good For
- Conversational AI: Responding to user queries and engaging in dialogue.
- Question Answering: Generating answers to a wide range of questions.
- Text Completion: Extending partial texts or generating continuations.