THE CORE ARCHITECTURE
Dive deep into the technology that makes AVA possible.
01
vLLM Engine
A high-throughput, memory-efficient LLM serving engine. It uses PagedAttention to manage attention key and value memory in fixed-size blocks, which cuts memory waste and keeps inference fast for local models.
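To make the PagedAttention idea concrete, here is a minimal sketch in plain Python of block-based key/value cache bookkeeping. It is a toy illustration, not vLLM's implementation or API: `BlockAllocator`, `Sequence`, and `BLOCK_SIZE` are hypothetical names chosen for this example, and no real tensors are stored.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 16  # tokens per KV block (illustrative value, an assumption)


@dataclass
class BlockAllocator:
    """Hands out fixed-size physical KV-cache blocks from a shared pool."""
    num_blocks: int
    free_blocks: list = field(init=False)

    def __post_init__(self) -> None:
        self.free_blocks = list(range(self.num_blocks))

    def allocate(self) -> int:
        if not self.free_blocks:
            raise MemoryError("KV cache pool exhausted")
        return self.free_blocks.pop()

    def release(self, block_id: int) -> None:
        self.free_blocks.append(block_id)


@dataclass
class Sequence:
    """Tracks one request's mapping from logical to physical blocks."""
    allocator: BlockAllocator
    num_tokens: int = 0
    block_table: list = field(default_factory=list)

    def append_token(self) -> None:
        # A new physical block is claimed only when the last one is full,
        # so at most BLOCK_SIZE - 1 slots per sequence sit unused.
        if self.num_tokens % BLOCK_SIZE == 0:
            self.block_table.append(self.allocator.allocate())
        self.num_tokens += 1

    def release_all(self) -> None:
        # Return every block to the pool once the request finishes.
        for block_id in self.block_table:
            self.allocator.release(block_id)
        self.block_table.clear()
        self.num_tokens = 0


pool = BlockAllocator(num_blocks=8)
seq = Sequence(pool)
for _ in range(40):
    seq.append_token()
print(len(seq.block_table), len(pool.free_blocks))  # prints "3 5"
```

Because memory is claimed one block at a time instead of being reserved up front for the longest possible output, more requests can share the same GPU memory.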
02
Ray Framework
A unified framework for scaling AI applications. The AVA SDK uses Ray to orchestrate distributed inference and to manage GPU and CPU resources, so concurrent tasks run smoothly side by side.
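Ray schedules work by resource requirements: each task declares how much CPU and GPU it needs, and it starts only when that much is free. The sketch below imitates that accounting in plain Python. It is a toy model, not Ray's API and not the AVA SDK's actual scheduling code; `Task`, `ResourcePool`, and the task names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """A unit of work with declared resource needs (hypothetical type)."""
    name: str
    cpus: float = 1.0
    gpus: float = 0.0


class ResourcePool:
    """Tracks free CPU/GPU capacity and admits tasks that fit."""

    def __init__(self, cpus: float, gpus: float) -> None:
        self.available = {"CPU": cpus, "GPU": gpus}

    def try_start(self, task: Task) -> bool:
        need = {"CPU": task.cpus, "GPU": task.gpus}
        # Admit the task only if every requested resource is available.
        if any(self.available[kind] < amount for kind, amount in need.items()):
            return False
        for kind, amount in need.items():
            self.available[kind] -= amount
        return True

    def finish(self, task: Task) -> None:
        # Return the task's resources to the pool.
        self.available["CPU"] += task.cpus
        self.available["GPU"] += task.gpus


pool = ResourcePool(cpus=4, gpus=1)
inference = Task("inference", cpus=1, gpus=1)
overlay = Task("overlay", cpus=2)
second_model = Task("second_model", cpus=1, gpus=1)

print(pool.try_start(inference))     # True: the GPU is free
print(pool.try_start(overlay))       # True: CPU-only work runs alongside
print(pool.try_start(second_model))  # False: the GPU is already claimed
pool.finish(inference)
print(pool.try_start(second_model))  # True: the GPU was released
```

In Ray itself, the equivalent declaration is made with arguments such as `num_cpus` and `num_gpus` on a remote task or actor.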
03
LlamaFactory
A toolkit for fine-tuning LLMs. We provide predefined recipes for fine-tuning Llama 3 and other models for gaming and assistance contexts within the AVA ecosystem.