Name: winglian/basilisk-4b API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: winglian

Basilisk 4B: A Llama-2 Model Fine-tuned for Reasoning

Basilisk 4B is a 4-billion-parameter language model built on the winglian/llama-2-4b base. It was fine-tuned on the OpenOrca Chain-of-Thought (CoT) dataset to improve its reasoning and problem-solving capabilities.

Key Capabilities

Reasoning Tasks: The model has been evaluated on reasoning benchmarks, including AGIEval (e.g., LSAT, SAT math) and BigBench (e.g., logical deduction, causal judgement).
Common Sense Understanding: It has been evaluated on BoolQ, PIQA, and Winogrande, which test foundational common-sense understanding.
General Language Understanding: The model has been evaluated on ARC Challenge and ARC Easy, which test general question answering and comprehension.

Performance Highlights

Reported benchmark scores, grouped by domain:

AGIEval: Achieves 0.2362 acc on AQUA-RAT, 0.2688 acc on LogiQA-en, and 0.2318 acc on SAT-math.
BigBench: Scores 0.5000 multiple_choice_grade on Causal Judgement and Navigate, and 0.3800 on Logical Deduction Three Objects.
Common Sense: Attains 0.7196 acc on BoolQ and 0.6937 acc on PIQA.
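
To put the scores above in context, it can help to compare each one against a random-guess baseline. The following Python sketch does that. The scores are copied from the list above; the number of answer options per task is not stated in the source and is an assumption here, so treat the computed margins as indicative only.

```python
# Scores copied from the Performance Highlights list.
REPORTED = {
    "agieval/aqua-rat": 0.2362,
    "agieval/logiqa-en": 0.2688,
    "agieval/sat-math": 0.2318,
    "bigbench/causal_judgement": 0.5000,
    "bigbench/navigate": 0.5000,
    "bigbench/logical_deduction_three_objects": 0.3800,
    "boolq": 0.7196,
    "piqa": 0.6937,
}

# ASSUMPTION: answer options per task (not given in the source).
NUM_CHOICES = {
    "agieval/aqua-rat": 5,
    "agieval/logiqa-en": 4,
    "agieval/sat-math": 4,
    "bigbench/causal_judgement": 2,
    "bigbench/navigate": 2,
    "bigbench/logical_deduction_three_objects": 3,
    "boolq": 2,
    "piqa": 2,
}


def chance(task: str) -> float:
    """Accuracy expected from uniform random guessing."""
    return 1.0 / NUM_CHOICES[task]


def margin_over_chance(task: str) -> float:
    """Reported score minus the random-guess baseline."""
    return REPORTED[task] - chance(task)


# Print tasks from largest to smallest margin over chance.
for task in sorted(REPORTED, key=margin_over_chance, reverse=True):
    print(
        f"{task:42s} score={REPORTED[task]:.4f} "
        f"chance={chance(task):.4f} margin={margin_over_chance(task):+.4f}"
    )
```

Under these assumptions, the common-sense tasks sit furthest above the baseline, while several reasoning tasks land close to it.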

Good For

This model suits applications that need a compact Llama-2-based model with Chain-of-Thought fine-tuning for reasoning tasks. Its results on academic and common-sense benchmarks make it a candidate for general-purpose language understanding and generation where a smaller parameter count is an advantage.
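
As a rough illustration of how such a model might be tried locally, here is a minimal Python sketch using the Hugging Face `transformers` library. It assumes the identifier `winglian/basilisk-4b` from the page header resolves on the Hugging Face Hub, and the prompt wording in `build_prompt` is a hypothetical chain-of-thought cue, since the source does not document a prompt template.

```python
# ASSUMPTION: the model id from the page header is available on the Hugging Face Hub.
MODEL_ID = "winglian/basilisk-4b"


def build_prompt(question: str) -> str:
    """Wrap a question in a generic chain-of-thought cue.

    ASSUMPTION: this template is illustrative; the source does not
    specify the prompt format used during fine-tuning.
    """
    return f"Question: {question.strip()}\nAnswer: Let's think step by step."


def generate(question: str, max_new_tokens: int = 128) -> str:
    """Load the model and generate a completion for one question."""
    # Imported here so the helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("If all cats are animals, is a cat an animal?")` would download the weights on first use, so it needs network access and enough memory for a 4B-parameter model.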
