📄️ In-Process SDK for Python
This tutorial walks you through the LLMBoost In-Process SDK so you can integrate it directly into your own Python application.
📄️ Using OpenAI API
LLMBoost exposes an OpenAI-compatible web endpoint that implements the OpenAI Chat Completions API.
📄️ Using OpenWeb UI
This guide shows how to deploy the Open WebUI chat interface and connect it to your LLMBoost server.
📄️ Using Multiple GPUs Effectively
On a server with multiple GPUs, LLMBoost supports multiple dimensions of parallelism to maximize GPU utilization, scalability, and inference throughput. These parallelism strategies can be configured independently or combined, depending on your deployment needs.