Model Overview
Heralax/Augmentoolkit-DataSpecialist-v0.1 is a 7 billion parameter language model, fine-tuned from the Heralax/datagen-pretrain-v1-7b-mistralv0.2 base model, which is derived from the Mistral v0.2 architecture. This model was trained using Axolotl, focusing on a broad spectrum of data generation tasks.
Key Capabilities
- Diverse Data Generation: Trained on a wide array of datasets, including
29_mil_asstr.jsonl, 40mil_gutenberg.jsonl, hle-1_formatted_2mil.jsonl, and 11_mil_fineweb.jsonl for completion tasks. - Conversational AI: Incorporates multi-turn and single-turn segments, alongside datasets like
openhermes2_5 and openthoughts, to enhance its ability in generating conversational text. - ChatML Support: Utilizes the ChatML format for its chat-templated datasets, facilitating structured conversational outputs.
- Optimized Training: Achieved a validation loss of 0.6304 after 2 epochs, with a learning rate of 2e-05 and a total training batch size of 150.
Good For
- Synthetic Data Creation: Ideal for generating large volumes of diverse text data for various applications.
- Chatbot Development: Suitable for developing and enhancing conversational agents due to its extensive training on chat-templated datasets.
- Content Generation: Can be used for generating varied content, from general text to structured dialogues, based on its broad training data.