# AI Inference Platform Usage Guide
This guide explains how to configure, run, and interact with the AI Inference Platform.
## Prerequisites
- Rust: Stable toolchain (1.75+)
- Docker: For running dependencies (Redis) and the containerized app.
- Redis: Required for caching (optional in development if the application is configured to run without it)
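To confirm the prerequisites are in place, you can check the installed versions from a shell (version numbers will vary; the Redis check only succeeds if a local instance is already running):

```bash
# Rust toolchain (1.75 or newer)
rustc --version
cargo --version

# Docker and Docker Compose
docker --version
docker-compose --version

# Redis answers PONG if it is reachable locally
redis-cli ping
```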
## Configuration

The application is configured via `config/default.toml` or environment variables. Environment variables take precedence and use double underscores (`__`) for nesting.

### Key Configuration Options
| TOML Key | Env Variable | Description | Default |
| :--- | :--- | :--- | :--- |
| server.host | APP_SERVER__HOST | Host to bind to | 127.0.0.1 |
| server.port | APP_SERVER__PORT | Port to listen on | 8080 |
| cache.redis_url | APP_CACHE__REDIS_URL | Redis connection URL | redis://127.0.0.1:6379 |
| models.ollama.base_url | APP_MODELS__OLLAMA__BASE_URL | URL for Ollama service | http://localhost:11434 |
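For example, to override the listen port and the Redis URL for a single run using the environment variables from the table above (the Redis host here is a placeholder for your own instance):

```bash
# Nested keys use double underscores: server.port -> APP_SERVER__PORT
export APP_SERVER__PORT=9000
export APP_CACHE__REDIS_URL="redis://my-redis-host:6379"  # placeholder host

cargo run --release
```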
## Running Locally
- **Start Dependencies.** Ensure Redis is running. You can use Docker Compose for this:

  ```bash
  docker-compose up -d redis
  ```

- **Run the Application:**

  ```bash
  cargo run --release
  ```

- **Verify Status.** Visit `http://localhost:8080/health` or `http://localhost:8080/api/docs`; a command-line check is shown below.
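As a quick command-line check (the exact response body depends on the application):

```bash
# A 2xx response indicates the server is up
curl -i http://localhost:8080/health
```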
## Running with Docker
- **Build and Start:**

  ```bash
  docker-compose up --build -d
  ```

  This starts the application, Redis, Prometheus, and Grafana; verification commands are shown after this list.

- **Access Services:**

  - API: `http://localhost:8080`
  - Prometheus: `http://localhost:9090`
  - Grafana: `http://localhost:3000` (default login: `admin`/`admin`)
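To confirm the stack came up, you can list the containers and probe each service. The API health path comes from this guide; the Prometheus and Grafana paths are the standard health endpoints those tools expose by default:

```bash
# Containers started by docker-compose
docker-compose ps

# Probe each service
curl -s http://localhost:8080/health       # API
curl -s http://localhost:9090/-/healthy    # Prometheus
curl -s http://localhost:3000/api/health   # Grafana
```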
## Production Deployment

For production, use the release artifacts found in the `release/` directory.
- **Build Release Binary:**

  ```bash
  cargo build --release
  ```

  The binary will be at `target/release/ai-inference-platform`.

- **Use Systemd.** Copy `release/infer.service` to `/etc/systemd/system/` and adjust paths; service-management commands are shown after this list.

  ```bash
  sudo systemctl enable --now infer
  ```

- **Use Release Docker Compose.** Use `docker-compose.release.yml` for a lean production deployment.

  ```bash
  docker-compose -f docker-compose.release.yml up -d
  ```
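Once the systemd unit is enabled, routine service management uses the standard systemd commands (the unit name `infer` matches the command above):

```bash
# Check that the service is active
sudo systemctl status infer

# Follow the application logs
sudo journalctl -u infer -f

# Restart after deploying a new binary
sudo systemctl restart infer
```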