Deployment

This page provides an overview of deployment options and considerations for the Templar decentralized training framework. It covers Docker-based and Ansible-based deployment approaches, environment configuration, and resource requirements for running miner and validator nodes.

For Docker-specific deployment details, see Docker Deployment. For Ansible-specific deployment instructions, see Ansible Deployment.

The Templar framework offers two primary deployment methods:

  1. Docker-based deployment - Containerized approach with Docker and Docker Compose
  2. Ansible-based deployment - Infrastructure-as-code approach for configuring hosts

The choice between these depends on your operational requirements, infrastructure management approach, and team preferences.

flowchart TD
    subgraph "Deployment Options"
        D["Docker Deployment"] --> DC["Docker Compose"]
        D --> DT["Docker Test Environment"]
        A["Ansible Deployment"] --> AP["Ansible Playbooks"]
        A --> AR["Ansible Roles"]
    end
    
    subgraph "Operational Components"
        N["Node Types"]
        N --> M["Miner"]
        N --> V["Validator"]
        RM["Resource Management"]
        RM --> GPU["GPU Assignment"]
        RM --> MEM["Memory Allocation"]
        EV["Environment Variables"]
        WT["Watchtower Updates"]
    end
    
    D --> N
    D --> RM
    D --> EV
    D --> WT
    A --> N
    A --> RM
    A --> EV

Sources: docker/compose.yml, docker/Dockerfile, ansible/playbook.yml, ansible/README.md

Docker is the most streamlined deployment method for Templar, using NVIDIA GPU-enabled containers.

flowchart TD
    subgraph "Docker Host"
        subgraph "Templar Node Container"
            E["Entrypoint.sh"] --> N["Node Process"]
            N --> M["Miner.py"] 
            N --> V["Validator.py"]
            ENV["Environment Variables"]
            VOL["Volumes"]
            VOL --> W["Wallets"]
            VOL --> L["Logs"]
        end
        
        subgraph "Watchtower Container"
            WT["Watchtower Service"]
            WT --> IM["Image Updates"]
            WT --> CR["Container Restart"]
        end
        
        GPUs["NVIDIA GPUs"]
        GPUs --> N
    end
    
    GH["GitHub Container Registry"] --> WT

Sources: docker/compose.yml, docker/Dockerfile, scripts/entrypoint.sh

The Templar Docker image is based on NVIDIA’s CUDA runtime image, with Python and essential dependencies installed:

  • Base image: nvidia/cuda:12.6.0-runtime-ubuntu22.04
  • Python with dependencies installed via uv
  • Entrypoint script for node startup

The official image is published to GitHub Container Registry as ghcr.io/tplr-ai/templar.
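
For example, the published image can be pulled directly from the registry (the latest tag shown here is one of the tags produced by the release workflow described later on this page):

docker pull ghcr.io/tplr-ai/templar:latest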

Sources: docker/Dockerfile, .github/workflows/docker.yml

The Compose file (docker/compose.yml) defines the services required to run Templar nodes:

  1. node service - Configures the Templar node (miner or validator)
  2. watchtower service - Provides automatic updates of container images

Key configuration aspects include:

  • Volume mounts for wallet and log persistence
  • Environment variable configuration
  • GPU device assignment
  • Automatic updates via Watchtower

For testing environments, a docker-compose-test.yml is provided that configures a multi-node setup with miners and validators.
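
Putting these pieces together, a single-node compose file looks roughly like the sketch below. Service names, the image tag, mount paths, and the Watchtower options are illustrative assumptions, not the exact contents of docker/compose.yml:

services:
  node:
    image: ghcr.io/tplr-ai/templar:latest
    env_file: .env                                      # NODE_TYPE, wallet, network, and R2 settings
    volumes:
      - ~/.bittensor/wallets:/root/.bittensor/wallets   # wallet persistence (path is an assumption)
      - ./logs:/app/logs                                # log persistence (path is an assumption)
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: [ '0' ]
              capabilities: [ gpu ]
  watchtower:
    image: containrrr/watchtower
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    command: --interval 300                             # poll the registry for new images every 5 minutes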

Sources: docker/compose.yml, docker/docker-compose-test.yml

Environment Variables for Docker Deployment

Docker deployments require a number of environment variables to configure node behavior, access storage resources, and connect to the Bittensor network.

| Category | Variable Name | Description | Required |
| --- | --- | --- | --- |
| Node Configuration | NODE_TYPE | Either “miner” or “validator” | Yes |
| | WALLET_NAME | Bittensor wallet name | Yes |
| | WALLET_HOTKEY | Bittensor wallet hotkey | Yes |
| | CUDA_DEVICE | CUDA device to use (e.g., “cuda:0”) | Yes |
| | NETWORK | Bittensor network (e.g., “finney”, “test”) | Yes |
| | NETUID | Bittensor subnet UID | Yes |
| | DEBUG | Enable debug mode (true/false) | No |
| API Keys | WANDB_API_KEY | Weights & Biases API key | Yes |
| R2 Storage | R2_GRADIENTS_ACCOUNT_ID | Cloudflare R2 account ID | Yes |
| | R2_GRADIENTS_BUCKET_NAME | Bucket name for gradients | Yes |
| | R2_GRADIENTS_READ_ACCESS_KEY_ID | Read access key ID | Yes |
| | R2_GRADIENTS_READ_SECRET_ACCESS_KEY | Read secret access key | Yes |
| | R2_GRADIENTS_WRITE_ACCESS_KEY_ID | Write access key ID | Yes |
| | R2_GRADIENTS_WRITE_SECRET_ACCESS_KEY | Write secret access key | Yes |
| | R2_DATASET_* | Similar set of variables for the dataset bucket | Yes |
| | R2_AGGREGATOR_* | Similar set of variables for the aggregator bucket | Yes |
| GitHub Integration | GITHUB_USER | GitHub username for Watchtower | No |
| | GITHUB_TOKEN | GitHub token for Watchtower | No |

Sources: docker/compose.yml, scripts/entrypoint.sh

Ansible provides a more infrastructure-focused approach to deployment, suitable for managing multiple machines or complex deployments.

flowchart TD
    subgraph "Control Machine"
        A["Ansible Playbook"]
        IT["Inventory"]
        VT["Vault"]
    end
    
    subgraph "Target Host"
        subgraph "Per GPU"
            TR["Templar Repository"]
            EV["Environment Configuration"]
            VE["Python venv"]
            SD["Systemd Service"]
            G["GPU Assignment"]
        end
        APT["System Packages"]
        PIP["Global Pip Packages"]
    end
    
    A --> IT
    A --> VT
    A --> APT
    A --> PIP
    A --> TR
    A --> EV
    A --> VE
    A --> SD
    EV --> G

Sources: ansible/playbook.yml, ansible/roles/templar/defaults/main.yml, ansible/README.md

The Ansible deployment requires:

  1. Inventory file - Defines target hosts and their GPU configurations
  2. Vault file - Securely stores environment variables and secrets
  3. Playbook - Orchestrates the deployment process

The Ansible setup supports multi-GPU deployments, creating separate instances for each GPU with dedicated directories and services.
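
As a purely illustrative sketch, an inventory entry for a two-GPU host might be shaped like the YAML below; the actual variable names are defined by the role (see ansible/roles/templar/defaults/main.yml and ansible/README.md) and may differ:

all:
  hosts:
    gpu-host-1:
      ansible_host: 203.0.113.10
      ansible_user: ubuntu
      # one entry per GPU; these variable names are hypothetical
      gpu_instances:
        - cuda_device: "cuda:0"
          wallet_hotkey: "hotkey-0"
        - cuda_device: "cuda:1"
          wallet_hotkey: "hotkey-1"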

Key files:

  • ansible/playbook.yml - Main playbook
  • ansible/roles/templar/* - Role definitions
  • group_vars/all/vault.yml - Encrypted variables

Sources: ansible/playbook.yml, ansible/roles/templar/defaults/main.yml, ansible/README.md, ansible/group_vars/all/vault.yml.example

For persistent operation, the Ansible deployment can configure systemd services to manage Templar processes:

[Unit]
Description=Templar Miner Service
After=network.target
[Service]
WorkingDirectory=/path/to/templar
EnvironmentFile=/path/to/templar/.env
ExecStart=/path/to/templar/.venv/bin/python neurons/miner.py [options]
Restart=always
RestartSec=5s

This ensures the process is restarted automatically on failure and that it starts only after the network is available.
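
Once the unit file is installed (the unit name below matches the templar-miner example used in the troubleshooting section of this page), the service is driven with standard systemd commands:

sudo systemctl daemon-reload
sudo systemctl start templar-miner
sudo systemctl restart templar-miner   # after configuration changes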

Sources: ansible/roles/templar/templates/miner.service.j2

Templar requires:

  • CUDA-capable NVIDIA GPU(s)
  • Sufficient RAM for model operations
  • Storage for checkpoints and logs
  • Network connectivity for Bittensor communication and R2 storage access

The docker-compose configuration allows assigning specific GPUs to containers using the device_ids property:

deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          device_ids: [ '0', '1', '2' ]
          capabilities: [ gpu ]
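
Note that device_ids selects which host GPUs are exposed to the container; inside the container those devices are renumbered starting from 0, so the CUDA_DEVICE variable (for example “cuda:0”) refers to an index within the exposed set rather than the host-wide index.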

Sources: docker/compose.yml, docker/docker-compose-test.yml

Templar includes a GitHub Actions workflow for building and publishing Docker images:

flowchart TD
    subgraph "GitHub Actions Workflow"
        T["Trigger"] --> C["Checkout Code"]
        C --> S["Setup Docker Buildx"]
        S --> L["Login to Registry"]
        L --> M["Extract Metadata"]
        M --> B["Build Docker Image"]
        B --> P["Push to Registry"]
    end
    
    subgraph "Triggers"
        REL["Release Published"]
        WD["Workflow Dispatch"]
    end
    
    subgraph "Tags Generated"
        SV["Semantic Version Tags"]
        LT["Latest Tag"]
        SHA["SHA Tags"]
    end
    
    REL --> T
    WD --> T
    M --> SV
    M --> LT
    M --> SHA

The workflow automatically builds images when releases are published and tags them appropriately based on semantic versioning.
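
A workflow of this shape, built from the standard Docker actions, might look roughly like the following sketch; trigger names and tag rules are simplified assumptions, so refer to .github/workflows/docker.yml for the actual definition:

name: docker
on:
  release:
    types: [published]
  workflow_dispatch:
permissions:
  contents: read
  packages: write
jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - id: meta
        uses: docker/metadata-action@v5
        with:
          images: ghcr.io/tplr-ai/templar
          tags: |
            type=semver,pattern={{version}}
            type=raw,value=latest
            type=sha
      - uses: docker/build-push-action@v5
        with:
          context: .
          file: docker/Dockerfile
          push: true
          tags: ${{ steps.meta.outputs.tags }}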

Sources: .github/workflows/docker.yml

When a Templar container starts, the entrypoint script (scripts/entrypoint.sh) performs several initialization steps:

  1. Validates required environment variables
  2. Activates Python virtual environment
  3. Checks CUDA availability
  4. Logs in to Weights & Biases
  5. Starts the appropriate node type (miner or validator)

The script constructs the correct command-line arguments based on environment variables.
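  
For illustration only, the dispatch step follows a pattern like the sketch below; this is not the actual script, and the exact flag names are defined by the miner and validator argument parsers:

# Illustrative only: dispatch on NODE_TYPE and pass configuration through
case "$NODE_TYPE" in
  miner)     SCRIPT=neurons/miner.py ;;
  validator) SCRIPT=neurons/validator.py ;;
  *) echo "Unknown NODE_TYPE: $NODE_TYPE" >&2; exit 1 ;;
esac
exec python "$SCRIPT" \
  --wallet.name "$WALLET_NAME" \
  --wallet.hotkey "$WALLET_HOTKEY" \
  --netuid "$NETUID" \
  --subtensor.network "$NETWORK" \
  --device "$CUDA_DEVICE"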

Sources: scripts/entrypoint.sh

Common deployment issues include:

  1. CUDA availability - Ensure NVIDIA drivers are installed and compatible with the container’s CUDA version (a quick check follows this list)
  2. Environment variables - Check that all required variables are set correctly
  3. GPU assignment - Verify that GPUs are correctly assigned to containers
  4. Network connectivity - Ensure access to Bittensor network and R2 storage
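
For the CUDA check in particular, running nvidia-smi in a bare CUDA container that matches the Templar base image confirms that the NVIDIA drivers and container runtime are working before you debug Templar itself:

docker run --rm --gpus all nvidia/cuda:12.6.0-runtime-ubuntu22.04 nvidia-smi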

For Docker deployments, you can check logs with:

docker logs templar-{NODE_TYPE}-{WALLET_HOTKEY}

For Ansible deployments with systemd, check:

systemctl status templar-miner
journalctl -u templar-miner

Sources: scripts/entrypoint.sh, docker/compose.yml, ansible/roles/templar/templates/miner.service.j2