Determine project structure #1

Open
opened 2026-03-07 17:04:09 +01:00 by cobuzz · 0 comments
Member

This is a new project, and it is still unclear how to structure it.

Features:

  • Routes OpenAI-API-compatible requests
  • Offline mode - can spawn llama.cpp servers as workers
  • Queues up requests when a model is unloaded by llama.cpp
  • Option to defer requests for specific models to Mistral
  • Tunes parameters as defined by the config for maximum efficiency
    - Note: this mostly affects num_ctx for larger models, since we want
    everything to be on the GPU when one is available
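
To make the structure question more concrete, here is a minimal sketch of how the routing, queueing, and deferral features above could fit into one component. Everything in it is an assumption for discussion, not a decided design: the names `ModelConfig`, `Router`, `defer_to_remote`, `route`, and `mark_loaded` are hypothetical, and spawning llama.cpp workers and forwarding HTTP requests are left out entirely.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ModelConfig:
    """Per-model settings. All field names here are hypothetical."""
    defer_to_remote: bool = False  # send this model's requests to the remote provider
    num_ctx: int = 4096            # context size the config asks for (placeholder default)


@dataclass
class Router:
    models: dict[str, ModelConfig]
    loaded: set[str] = field(default_factory=set)            # models with a running worker
    pending: dict[str, deque] = field(default_factory=dict)  # queued requests per model

    def route(self, request: dict) -> str:
        """Return 'remote', 'local', or 'queued' for an OpenAI-style request body."""
        name = request["model"]
        cfg = self.models.get(name)
        if cfg is None:
            raise KeyError(f"unknown model: {name}")
        if cfg.defer_to_remote:
            return "remote"
        if name in self.loaded:
            return "local"
        # No worker is up for this model yet: hold the request until one is.
        self.pending.setdefault(name, deque()).append(request)
        return "queued"

    def mark_loaded(self, name: str) -> list[dict]:
        """Record that a worker is up and hand back its queued requests in arrival order."""
        self.loaded.add(name)
        return list(self.pending.pop(name, deque()))
```

Keeping the decision logic free of I/O like this would let it be unit-tested without a GPU or a running llama.cpp server. Worker management and the HTTP layer could then live in separate modules that call `route` and `mark_loaded`.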
Reference
CuLabs/rover#1