harsha070/expfinal-phi-mbpp-s42-lambda-0p0
harsha070/expfinal-phi-mbpp-s42-lambda-0p0 is a 4-billion-parameter language model fine-tuned from harsha070/sft-warmup-phi-v1, with a 4096-token context length. It was trained with GRPO, the reinforcement learning method introduced in the DeepSeekMath paper to improve mathematical reasoning, and is intended for general text generation tasks.
Model Overview
harsha070/expfinal-phi-mbpp-s42-lambda-0p0 is a 4-billion-parameter language model fine-tuned from harsha070/sft-warmup-phi-v1. Its 4096-token context length makes it suitable for moderately long inputs.
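
Below is a minimal loading-and-generation sketch using the Transformers library. The model ID comes from this card; the dtype, device placement, prompt, and generation settings are illustrative assumptions, not documented defaults.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "harsha070/expfinal-phi-mbpp-s42-lambda-0p0"

# Tokenizer and model load; dtype and device_map are assumptions,
# not values documented in this card.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Write a Python function that checks whether a string is a palindrome."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding with an illustrative length cap; tune for your task.
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```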
Training Methodology
This model was trained with GRPO (Group Relative Policy Optimization), the method introduced in "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300). Training used the TRL (Transformer Reinforcement Learning) framework, version 1.3.0, together with Transformers 5.8.0 and PyTorch 2.11.0.
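
The card does not include the training script, so the following is only a sketch of how a GRPO run is typically wired up with TRL's GRPOTrainer. The dataset, reward function, and hyperparameters are illustrative assumptions (the model name hints at MBPP, but no training data is documented here).

```python
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Illustrative prompt dataset; the actual training data is not
# documented in this card.
dataset = load_dataset("trl-lib/tldr", split="train")

# Toy reward: GRPO samples a group of completions per prompt and
# optimizes their relative, group-normalized rewards. A real code
# run would reward something like unit-test pass rate instead.
def reward_len(completions, **kwargs):
    return [-abs(200 - len(c)) for c in completions]

training_args = GRPOConfig(
    output_dir="expfinal-phi-mbpp-s42-lambda-0p0",
    num_generations=8,          # completions per prompt (assumption)
    max_completion_length=512,  # assumption
)

trainer = GRPOTrainer(
    model="harsha070/sft-warmup-phi-v1",  # base model stated in this card
    reward_funcs=reward_len,
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```

Because rewards are normalized within each group of sampled completions, GRPO avoids training a separate value model, which is the main efficiency argument made in the DeepSeekMath paper.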
Key Capabilities
- General Text Generation: Capable of generating coherent and contextually relevant text based on user prompts.
- Fine-tuned Performance: Builds on the harsha070/sft-warmup-phi-v1 checkpoint, itself a fine-tune of a base Phi model.
- GRPO Training: Uses a method originally developed to strengthen mathematical reasoning, which may carry over to improved logical coherence in generated text.
Potential Use Cases
- Question Answering: Generating responses to open-ended questions.
- Creative Writing: Assisting with story generation, dialogue, or other creative text tasks.
- Conversational AI: Building chatbots or interactive agents (see the sketch below).
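
For the conversational use case, here is a minimal chat sketch. It assumes the checkpoint retains a chat template from its Phi base, which this card does not confirm; without one, use plain-text prompting as in the loading example above.

```python
from transformers import pipeline

# Assumes a chat template is available on this checkpoint; if not,
# pass a plain string prompt instead of a message list.
chat = pipeline(
    "text-generation",
    model="harsha070/expfinal-phi-mbpp-s42-lambda-0p0",
)

messages = [
    {"role": "user", "content": "Summarize what a 4096-token context length allows."},
]
result = chat(messages, max_new_tokens=200)

# With message input, the pipeline returns the full conversation;
# the last entry is the assistant's reply.
print(result[0]["generated_text"][-1]["content"])
```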