📄️ In-Process SDK for Python
This tutorial walks you through the LLMBoost In-Process SDK so you can integrate it directly into your own Python application.
📄️ Using OpenAI API
LLMBoost exposes an OpenAI-compatible web endpoint that implements the OpenAI Chat Completions API.
📄️ Using OpenWeb UI
This guide shows how to deploy the Open WebUI chat interface and connect it to your LLMBoost server.
📄️ Using Multiple GPUs Effectively
On a server with multiple GPUs, LLMBoost supports multiple dimensions of parallelism to maximize GPU utilization, scalability, and inference throughput. These parallelism strategies can be configured independently or combined, depending on your deployment needs.