Kanonenbombe/llama3.2-1B-Function-calling

Text Generation · Concurrency Cost: 1 · Model Size: 1B · Quant: BF16 · Ctx Length: 32k · License: apache-2.0 · Architecture: Transformer · Open Weights

Kanonenbombe/llama3.2-1B-Function-calling is a 1 billion parameter model based on the Llama 3.2 architecture, developed by Kanonenbombe. The model is under active development and is intended specifically for function-calling tasks. It supports a context length of 32,768 tokens, but as a work in progress it is not yet optimized for production use and requires further fine-tuning to improve its performance.


Model Overview

Kanonenbombe/llama3.2-1B-Function-calling is a 1 billion parameter model built on the Llama 3.2 architecture and developed by Kanonenbombe. The model is in the early stages of development and is designed specifically for function-calling tasks. It supports a context length of 32,768 tokens.
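The card does not document the model's expected prompt or tool-call format, so the following is only a hypothetical sketch of how a function-calling prompt might be assembled. The tool schema, tag names, and reply convention here are assumptions for illustration, not the model's confirmed template.

```python
import json

# Hypothetical OpenAI-style tool schema; the model's actual expected
# format is not documented on this card.
tools = [{
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

def build_prompt(user_message: str) -> str:
    """Embed the tool schema in a system turn, then append the user turn.

    The <system>/<user> tags are placeholders; a real integration should
    use the tokenizer's chat template once one is published.
    """
    system = (
        "You can call the following functions. To call one, reply with a "
        'JSON object of the form {"name": ..., "arguments": ...}.\n'
        + json.dumps(tools, indent=2)
    )
    return f"<system>\n{system}\n</system>\n<user>\n{user_message}\n</user>"

prompt = build_prompt("What's the weather in Berlin?")
print(prompt)
```

If the repository later ships a chat template, `tokenizer.apply_chat_template` would be the preferred way to build this prompt instead.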

Key Characteristics

  • Development Status: The model is a work-in-progress and has not been fully fine-tuned or optimized. It is not yet suitable for production environments.
  • Intended Use: Primarily aimed at handling function-calling tasks.
  • Training Details: Trained from scratch on an unspecified dataset. Initial training hyperparameters included a learning rate of 2e-05, a batch size of 1 (with 32 gradient accumulation steps), and 3 epochs.
  • Preliminary Results: Initial training showed a validation loss of 0.1491 after 3 epochs, though these results are subject to change with further development.
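As a rough illustration of the hyperparameters above: a per-device batch size of 1 with 32 gradient accumulation steps gives an effective batch size of 32. The dataset size below is a made-up placeholder, since the card does not specify the training data.

```python
# Hyperparameters reported on the card.
learning_rate = 2e-5
per_device_batch_size = 1
grad_accum_steps = 32
epochs = 3

# Effective batch size seen by each optimizer step.
effective_batch_size = per_device_batch_size * grad_accum_steps

# Hypothetical dataset size, used only to show how step counts follow
# from the hyperparameters; the actual dataset is unspecified.
num_examples = 10_000
steps_per_epoch = num_examples // effective_batch_size
total_optimizer_steps = steps_per_epoch * epochs

print(effective_batch_size, steps_per_epoch, total_optimizer_steps)
```

These values would map directly onto `per_device_train_batch_size`, `gradient_accumulation_steps`, `learning_rate`, and `num_train_epochs` in a Hugging Face `TrainingArguments` configuration.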

Important Considerations

  • Not Production Ready: Due to its developmental status, comprehensive fine-tuning and evaluation are still required.
  • Limited Evaluation: The model has not yet undergone full evaluation, and its capabilities have not been confirmed.