Kanonenbombe/llama3.2-1B-Function-calling
Text Generation · Concurrency Cost: 1 · Model Size: 1B · Quant: BF16 · Ctx Length: 32k · License: apache-2.0 · Architecture: Transformer
Kanonenbombe/llama3.2-1B-Function-calling is a 1-billion-parameter model based on the Llama 3.2 architecture, developed by Kanonenbombe. It is specifically intended for function-calling tasks and supports a context length of 32,768 tokens. The model is a work in progress: it is not yet optimized for production use and requires further fine-tuning to improve its performance.
Model Overview
Kanonenbombe/llama3.2-1B-Function-calling is a 1-billion-parameter model built on the Llama 3.2 architecture and developed by Kanonenbombe. The model is in an early stage of development and is designed specifically for function-calling tasks. It supports a context length of 32,768 tokens.
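Since the model targets function calling, the typical workflow is to describe available tools to the model and parse the structured call it emits. The model card does not document the exact prompt template or output format this checkpoint expects, so the tool schema and example output below are illustrative assumptions in the common JSON-schema style:

```python
import json

# Hypothetical tool definition in JSON-schema style, as commonly used for
# function-calling models. The exact template this model expects is not
# documented, so treat this as an illustrative sketch.
get_weather_tool = {
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
        },
        "required": ["city"],
    },
}

def parse_tool_call(raw: str) -> tuple[str, dict]:
    """Parse a JSON tool call emitted by the model into (name, arguments)."""
    call = json.loads(raw)
    return call["name"], call["arguments"]

# Example of the kind of output a function-calling model might emit:
raw_output = '{"name": "get_weather", "arguments": {"city": "Berlin"}}'
name, args = parse_tool_call(raw_output)
```

After parsing, the application would dispatch `name` to the matching function with `args` and feed the result back to the model.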
Key Characteristics
- Development Status: The model is a work-in-progress and has not been fully fine-tuned or optimized. It is not yet suitable for production environments.
- Intended Use: Primarily aimed at handling function-calling tasks.
- Training Details: Trained from scratch on an unspecified dataset. Initial training hyperparameters included a learning rate of 2e-05, a batch size of 1 (with 32 gradient accumulation steps), and 3 epochs.
- Preliminary Results: Initial training showed a validation loss of 0.1491 after 3 epochs, though these results are subject to change with further development.
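The reported hyperparameters can be summarized as a config sketch. The values below come from the model card; the actual training framework is not specified, so this plain-dict form is only an illustration. Note that a per-device batch size of 1 with 32 gradient accumulation steps yields an effective batch size of 32:

```python
# Sketch of the reported training setup (values taken from the model card);
# the trainer/framework actually used is not specified.
training_config = {
    "learning_rate": 2e-05,
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 32,
    "num_train_epochs": 3,
}

# Effective batch size = per-device batch size x accumulation steps
effective_batch_size = (
    training_config["per_device_train_batch_size"]
    * training_config["gradient_accumulation_steps"]
)
```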
Important Considerations
- Not Production Ready: Given its developmental status, the model still requires comprehensive fine-tuning before deployment.
- Limited Evaluation: The model has not yet undergone a full evaluation, so its capabilities remain unconfirmed.