olusegunola/phi-1.5-distill-v2-Ablation_No_L2_Norm-merged is a 1.4-billion-parameter language model with a 2048-token context length. It is a variant of the phi-1.5 architecture in which L2 normalization has been removed as part of an ablation study, and it is most likely intended for research into the impact of L2 normalization on model performance and behavior.
Model Overview
This model, olusegunola/phi-1.5-distill-v2-Ablation_No_L2_Norm-merged, is a 1.4-billion-parameter language model based on the phi-1.5 architecture, with a context length of 2048 tokens. Its defining characteristic is the deliberate removal of L2 normalization, marking it as one variant in an ablation study.
Key Characteristics
- Architecture: Based on the phi-1.5 model family.
- Parameter Count: 1.4 billion parameters.
- Context Length: Supports a 2048-token context window.
- Unique Feature: Developed as an ablation study variant with L2 normalization intentionally excluded.
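The model card does not specify where in the network L2 normalization was applied (e.g., to hidden states or embeddings), but the operation itself is simple to illustrate. The sketch below, using NumPy, shows what L2 normalization does to a batch of vectors; the function name and toy data are illustrative, not taken from the model's actual implementation:

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors along `axis` to unit L2 norm (the operation this ablation removes)."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)  # eps guards against division by zero

# Toy stand-ins for hidden-state vectors: batch of 2 vectors of dimension 4.
hidden = np.array([[3.0, 4.0, 0.0, 0.0],
                   [1.0, 1.0, 1.0, 1.0]])

normalized = l2_normalize(hidden)
print(np.linalg.norm(normalized, axis=-1))  # each row now has norm 1.0
```

Removing this step lets vector magnitudes vary freely, which is precisely the effect an ablation study would seek to measure.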
Intended Use Cases
Given the specific modification (removal of L2 normalization), this model is primarily suited for:
- Research: Investigating the effects of L2 normalization on language model training, performance, and generalization.
- Comparative Analysis: Comparing its behavior and outputs against versions of phi-1.5 that include L2 normalization to understand its impact.
- Experimental Development: Exploring alternative regularization techniques or understanding the baseline performance without standard regularization methods.
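One way to carry out the comparative analysis described above is to compare next-token distributions from the ablated and baseline models on the same inputs. The sketch below computes a mean KL divergence between two sets of logits; the arrays here are synthetic stand-ins, since running the actual models requires downloading both checkpoints:

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax over the given axis."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def mean_kl(logits_a, logits_b, axis=-1):
    """Mean KL(P_a || P_b) between next-token distributions, one possible
    divergence signal for an ablation study."""
    p = softmax(logits_a, axis=axis)
    q = softmax(logits_b, axis=axis)
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=axis)))

# Synthetic logits for 3 positions over a 5-token vocabulary (NOT real model output).
rng = np.random.default_rng(0)
baseline_logits = rng.normal(size=(3, 5))
ablation_logits = baseline_logits + rng.normal(scale=0.1, size=(3, 5))

print(mean_kl(baseline_logits, ablation_logits))  # small non-negative divergence
```

In a real study, the logits would come from forward passes of the baseline phi-1.5 variant and this ablated model over a shared evaluation set, and the divergence would be tracked alongside standard metrics such as perplexity.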