Overview
Delta-Vector/Austral-70B-Preview is a 70-billion-parameter language model developed by Delta-Vector. A "Vulpecula Finetune" and Delta-Vector's first 70B finetune, it builds on the same datasets used for the Francois-Huali series. The training mix is a custom blend of filtered open-source and self-created data, with a strong emphasis on light novel and book content and minimal synthetic data.
Key Characteristics
- Model Size: 70 billion parameters.
- Training Data: Primarily light novel/book data with very little synthetic data, aiming for a distinct writing style.
- Context Length: Supports a 32,768-token context window.
- Chat Format: Uses the Llama-Instruct chat format; optional thinking can be enabled by prefilling the response with think tags (see the sketch after this list).
- Quantization: Available in GGUF (for llama.cpp), EXL3 (for TabbyAPI), and FP8 (for Aphrodite/vLLM) formats.
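As a point of reference, here is a minimal sketch of a prompt built by hand in the Llama-Instruct format with a think-tag prefill. The special tokens follow the standard Llama-3 Instruct template, and the exact `<think>` spelling is an assumption; verify both against the model's tokenizer and chat template before relying on them.

```python
# Minimal sketch of a Llama-3-Instruct-style prompt string. The special
# tokens below follow the standard Llama-3 template, and the "<think>"
# spelling is an assumption to check against the model's chat template.
prompt = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
    "You are a helpful writing assistant.<|eot_id|>"
    "<|start_header_id|>user<|end_header_id|>\n\n"
    "Write the opening scene of a light novel.<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
    "<think>"  # prefilling the assistant turn with a think tag opts in to reasoning
)
```

Omitting the trailing `<think>` prefill leaves the model in its default (non-thinking) mode.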
Training Details
The model was trained for 2 epochs on 8x A100 GPUs, using a rank-64, alpha-32 (R64 A32) 16-bit LoRA configuration with no dropout, Axolotl's LoRA kernels, and a learning rate of 2e-5 (a rough mapping of these hyperparameters is sketched below). The training configuration is publicly available.
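For illustration only, the reported hyperparameters map roughly onto a PEFT-style LoRA config as follows. The actual run used Axolotl and its own config schema, so the field names here are PEFT's, not the original settings; consult the published training configuration for the authoritative values.

```python
from peft import LoraConfig

# Illustrative sketch only: the actual training used Axolotl's config,
# so these PEFT field names are assumptions, not the original settings file.
lora_config = LoraConfig(
    r=64,              # LoRA rank (the "R64" in the training notes)
    lora_alpha=32,     # LoRA alpha (the "A32")
    lora_dropout=0.0,  # no dropout, per the training notes
)
# The 2e-5 learning rate is an optimizer/trainer setting rather than part of
# the LoRA config, e.g. torch.optim.AdamW(model.parameters(), lr=2e-5).
```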
Intended Use
This model targets users who value the distinctive writing style that its dataset composition produces. The developer notes some coherency issues but regards the overall writing style as the model's standout quality.