# AI Inference Platform Usage Guide
This guide explains how to configure, run, and interact with the AI Inference Platform.
## Prerequisites
- Rust: Stable toolchain (1.75+)
- Docker: For running dependencies (Redis) and the containerized app.
- Redis: Required for caching (optional in development if the application is configured to run without it)
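To confirm the prerequisites are in place, you can check the installed versions from a shell (version numbers will vary; the Redis check only succeeds if a local instance is already running):

```bash
# Rust toolchain (1.75 or newer)
rustc --version
cargo --version

# Docker and Docker Compose
docker --version
docker-compose --version

# Redis answers PONG if it is reachable locally
redis-cli ping
```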
## Configuration

The application is configured via `config/default.toml` or environment variables. Environment variables take precedence and use double underscores (`__`) for nesting.

### Key Configuration Options
| TOML Key | Env Variable | Description | Default |
| :--- | :--- | :--- | :--- |
| server.host | APP_SERVER__HOST | Host to bind to | 127.0.0.1 |
| server.port | APP_SERVER__PORT | Port to listen on | 8080 |
| cache.redis_url | APP_CACHE__REDIS_URL | Redis connection URL | redis://127.0.0.1:6379 |
| models.ollama.base_url | APP_MODELS__OLLAMA__BASE_URL | URL for Ollama service | http://localhost:11434 |
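For example, to override the listen port and the Redis URL for a single run using the environment variables from the table above (the Redis host here is a placeholder for your own instance):

```bash
# Nested keys use double underscores: server.port -> APP_SERVER__PORT
export APP_SERVER__PORT=9000
export APP_CACHE__REDIS_URL="redis://my-redis-host:6379"  # placeholder host

cargo run --release
```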
## Running Locally
- **Start Dependencies.** Ensure Redis is running. You can use Docker Compose for this:

  ```bash
  docker-compose up -d redis
  ```

- **Run the Application:**

  ```bash
  cargo run --release
  ```

- **Verify Status.** Visit `http://localhost:8080/health` or `http://localhost:8080/api/docs`; a command-line check is shown below.
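As a quick command-line check (the exact response body depends on the application):

```bash
# A 2xx response indicates the server is up
curl -i http://localhost:8080/health
```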
## Running with Docker
- **Build and Start:**

  ```bash
  docker-compose up --build -d
  ```

  This starts the application, Redis, Prometheus, and Grafana; verification commands are shown after this list.

- **Access Services:**

  - API: `http://localhost:8080`
  - Prometheus: `http://localhost:9090`
  - Grafana: `http://localhost:3000` (default login: `admin`/`admin`)
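To confirm the stack came up, you can list the containers and probe each service. The API health path comes from this guide; the Prometheus and Grafana paths are the standard health endpoints those tools expose by default:

```bash
# Containers started by docker-compose
docker-compose ps

# Probe each service
curl -s http://localhost:8080/health       # API
curl -s http://localhost:9090/-/healthy    # Prometheus
curl -s http://localhost:3000/api/health   # Grafana
```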
## Production Deployment

For production, use the release artifacts found in the `release/` directory.
- **Build Release Binary:**

  ```bash
  cargo build --release
  ```

  The binary will be at `target/release/ai-inference-platform`.

- **Use Systemd.** Copy `release/infer.service` to `/etc/systemd/system/` and adjust paths; service-management commands are shown after this list.

  ```bash
  sudo systemctl enable --now infer
  ```

- **Use Release Docker Compose.** Use `docker-compose.release.yml` for a lean production deployment.

  ```bash
  docker-compose -f docker-compose.release.yml up -d
  ```
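Once the systemd unit is enabled, routine service management uses the standard systemd commands (the unit name `infer` matches the command above):

```bash
# Check that the service is active
sudo systemctl status infer

# Follow the application logs
sudo journalctl -u infer -f

# Restart after deploying a new binary
sudo systemctl restart infer
```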