Ashenone3/LM-Searcher

Text generation · Model size: 8B · Quantization: FP8 · Context length: 32k · Published: Sep 3, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights

Ashenone3/LM-Searcher is an 8-billion-parameter language model designed for neural architecture search (NAS). As a task-agnostic framework, it uses an LLM to explore design spaces efficiently and identify strong candidate architectures. It is engineered specifically to drive automated search processes, making it suitable for researchers and developers focused on optimizing neural network architectures.


LM-Searcher: LLM-Powered Neural Architecture Search

LM-Searcher, developed by Ashenone3, is an 8-billion-parameter language model designed for cross-domain neural architecture search (NAS). Unlike conventional LLMs focused on text generation or understanding, LM-Searcher's core function is to act as a task-agnostic framework for exploring and optimizing neural network architectures.

Key Capabilities

  • Automated Architecture Search: Utilizes an LLM to sample new architectural configurations within a defined search space.
  • Task-Agnostic: Designed to be adaptable across various problem domains, allowing users to define custom reward functions for evaluation.
  • Unified Numerical Encoding: Employs a method for encoding architectural designs that LLMs can process effectively.
  • Scalable Deployment: Supports deployment via vLLM for efficient inference during the search process.
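To make the "unified numerical encoding" idea concrete, here is a minimal sketch in which each architecture is a fixed-order tuple of choices from a search space, encoded as a list of choice indices an LLM can emit and parse. The search space and encoding scheme below are illustrative assumptions, not LM-Searcher's exact scheme.

```python
# Illustrative search space: each dimension offers a discrete set of options.
SEARCH_SPACE = {
    "depth":  [2, 4, 8],
    "width":  [128, 256, 512],
    "kernel": [3, 5, 7],
}

def encode(config):
    """Map a config dict to a list of choice indices the LLM can emit."""
    return [SEARCH_SPACE[dim].index(config[dim]) for dim in SEARCH_SPACE]

def decode(indices):
    """Invert encode(): map choice indices back to a concrete config."""
    return {dim: SEARCH_SPACE[dim][i] for dim, i in zip(SEARCH_SPACE, indices)}

config = {"depth": 4, "width": 256, "kernel": 3}
codes = encode(config)
print(codes)                  # [1, 1, 0]
assert decode(codes) == config
```

Representing architectures as short index sequences keeps prompts compact and makes the model's outputs trivially checkable against the search space before evaluation.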

Good For

  • Researchers and Engineers: Ideal for those looking to automate the discovery of optimal neural network architectures.
  • Experimentation: Facilitates rapid prototyping and evaluation of different architectural designs.
  • Optimizing Performance: Can be integrated with custom evaluation metrics to find architectures that maximize specific performance indicators (e.g., accuracy, efficiency).

LM-Searcher provides a programmatic interface for defining search spaces and integrating custom reward functions, enabling a flexible approach to automated machine learning model design.
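The search-space-plus-reward-function pattern can be sketched as a simple propose-and-evaluate loop. Here `propose` stands in for the LLM sampler (replaced by random sampling so the sketch is self-contained), and `reward` is a toy user-defined metric; both are assumptions about how such a framework is typically driven, not LM-Searcher's actual API.

```python
import random

# Illustrative search space (same shape as a NAS design space).
SEARCH_SPACE = {
    "depth":  [2, 4, 8],
    "width":  [128, 256, 512],
    "kernel": [3, 5, 7],
}

def propose(rng):
    """Placeholder for the LLM sampler: pick one option per dimension."""
    return {dim: rng.choice(opts) for dim, opts in SEARCH_SPACE.items()}

def reward(config):
    """User-defined metric; here a toy proxy favoring wide, shallow nets.
    In practice this would train/evaluate the candidate architecture."""
    return config["width"] / 512 - config["depth"] / 16

def search(steps=50, seed=0):
    """Sample `steps` candidates and return the best one with its score."""
    rng = random.Random(seed)
    best = max((propose(rng) for _ in range(steps)), key=reward)
    return best, reward(best)

best, score = search()
print(best, score)
```

In a real run, `propose` would condition the LLM on previously evaluated (encoding, reward) pairs so each sampling round exploits what the search has already learned, rather than drawing candidates independently.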