AI Inference Platform Usage Guide

This guide explains how to configure, run, and interact with the AI Inference Platform.

Prerequisites

  • Rust: Stable toolchain (1.75+)
  • Docker: For running dependencies (Redis) and the containerized app.
  • Redis: Required for caching (optional in dev if configured).

Configuration

The application is configured via config/default.toml or environment variables. Environment variables take precedence over the file. They use the APP_ prefix and a double underscore (__) to separate nesting levels, so server.host becomes APP_SERVER__HOST.

Key Configuration Options

| TOML Key | Env Variable | Description | Default |
| :--- | :--- | :--- | :--- |
| server.host | APP_SERVER__HOST | Host to bind to | 127.0.0.1 |
| server.port | APP_SERVER__PORT | Port to listen on | 8080 |
| cache.redis_url | APP_CACHE__REDIS_URL | Redis connection URL | redis://127.0.0.1:6379 |
| models.ollama.base_url | APP_MODELS__OLLAMA__BASE_URL | URL for Ollama service | http://localhost:11434 |
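
As a rough illustration of the naming rule, the shell sketch below derives the environment variable name from a TOML key and then overrides two settings for the current session. The helper name toml_key_to_env is hypothetical, and the APP_ prefix is inferred from the table rather than from the application's source.

```bash
# Hypothetical helper: map a TOML key to its environment variable name.
# Assumes the APP_ prefix and the "__" nesting separator shown in the table.
toml_key_to_env() {
  printf 'APP_%s\n' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | sed 's/\./__/g')"
}

toml_key_to_env server.port             # prints APP_SERVER__PORT
toml_key_to_env models.ollama.base_url  # prints APP_MODELS__OLLAMA__BASE_URL

# Override settings for this shell session; the values are examples only.
export APP_SERVER__PORT=9000
export APP_CACHE__REDIS_URL=redis://127.0.0.1:6379
```

Variables exported this way apply to any process started from the same shell, so they should override the matching keys in config/default.toml for that run.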

Running Locally

  1. Start Dependencies: Ensure Redis is running. You can use Docker Compose for this:

    ```bash
    docker-compose up -d redis
    ```

  2. Run the Application

    ```bash
    cargo run --release
    ```

  3. Verify Status: Visit http://localhost:8080/health or http://localhost:8080/api/docs.
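
Because the server may take a moment to start, a small polling loop can stand in for the manual check. This is a sketch that assumes curl is installed and that /health answers with a 2xx status once the server is ready; the function name wait_for_health is hypothetical.

```bash
# Hypothetical helper: poll a URL until it returns a successful HTTP status.
# Usage: wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_health() {
  url=$1
  attempts=${2:-30}
  delay=${3:-1}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f: treat HTTP errors as failures; -s: silent; --max-time: cap each request.
    if curl -fs -o /dev/null --max-time 2 "$url"; then
      echo "ready: $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not ready after $attempts attempts: $url" >&2
  return 1
}

# Example: wait_for_health http://localhost:8080/health
```

The function returns 0 as soon as one request succeeds and 1 if every attempt fails, so it can gate later steps in a script.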

Running with Docker

  1. Build and Start

    ```bash
    docker-compose up --build -d
    ```

    This starts the application, Redis, Prometheus, and Grafana.

  2. Access Services

    • API: http://localhost:8080
    • Prometheus: http://localhost:9090
    • Grafana: http://localhost:3000 (Default login: admin/admin)
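
To confirm that the stack came up, each service can be probed from the shell. The sketch below assumes curl is available. The function name check_service is hypothetical; /health is the application endpoint mentioned earlier in this guide, while /-/healthy and /api/health are the usual health endpoints of Prometheus and Grafana, assuming this deployment exposes them unchanged.

```bash
# Hypothetical helper: report whether a URL answers with a successful status.
check_service() {
  name=$1
  url=$2
  if curl -fs -o /dev/null --max-time 3 "$url"; then
    echo "$name: up"
  else
    echo "$name: unreachable"
  fi
}

check_service "API"        http://localhost:8080/health
check_service "Prometheus" http://localhost:9090/-/healthy
check_service "Grafana"    http://localhost:3000/api/health
```

A service reported as unreachable right after startup may simply still be initializing, so it is worth retrying before inspecting the container logs.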

Production Deployment

For production, use the release artifacts found in the release/ directory.

  1. Build Release Binary

    ```bash
    cargo build --release
    ```

    The binary will be at target/release/ai-inference-platform.

  2. Use Systemd: Copy release/infer.service to /etc/systemd/system/ and adjust the paths in it.

    ```bash
    sudo systemctl enable --now infer
    ```

  3. Use Release Docker Compose: Use docker-compose.release.yml for a lean production deployment.

    ```bash
    docker-compose -f docker-compose.release.yml up -d
    ```
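
Under systemd, the environment-variable overrides described in the Configuration section can be supplied through a drop-in file instead of editing the unit itself. The sketch below only prints such a drop-in. The function name render_dropin and the values are illustrative, and this guide does not say whether release/infer.service already sets these variables.

```bash
# Hypothetical helper: print a systemd drop-in that sets environment variables.
# Each argument is one NAME=value pair.
render_dropin() {
  printf '[Service]\n'
  for pair in "$@"; do
    printf 'Environment=%s\n' "$pair"
  done
}

render_dropin APP_SERVER__HOST=0.0.0.0 APP_SERVER__PORT=8080
```

Saving that output as /etc/systemd/system/infer.service.d/override.conf, then running sudo systemctl daemon-reload and sudo systemctl restart infer, would apply it. journalctl -u infer -f follows the service log afterwards.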