Deployment
This page provides an overview of deployment options and considerations for the Templar decentralized training framework. It covers Docker-based and Ansible-based deployment approaches, environment configuration, and resource requirements for running miner and validator nodes.
For Docker-specific deployment details, see Docker Deployment. For Ansible-specific deployment instructions, see Ansible Deployment.
Deployment Options Overview
The Templar framework offers two primary deployment methods:
- Docker-based deployment - Containerized approach with Docker and Docker Compose
- Ansible-based deployment - Infrastructure-as-code approach for configuring hosts
The choice between these depends on your operational requirements, infrastructure management approach, and team preferences.
```mermaid
flowchart TD
subgraph "Deployment Options"
D["Docker Deployment"] --> DC["Docker Compose"]
D --> DT["Docker Test Environment"]
A["Ansible Deployment"] --> AP["Ansible Playbooks"]
A --> AR["Ansible Roles"]
end
subgraph "Operational Components"
N["Node Types"]
N --> M["Miner"]
N --> V["Validator"]
RM["Resource Management"]
RM --> GPU["GPU Assignment"]
RM --> MEM["Memory Allocation"]
EV["Environment Variables"]
WT["Watchtower Updates"]
end
D --> N
D --> RM
D --> EV
D --> WT
A --> N
A --> RM
A --> EV
```
Sources: docker/compose.yml , docker/Dockerfile , ansible/playbook.yml , ansible/README.md
Docker Deployment Architecture
Docker is the most streamlined deployment method for Templar, using NVIDIA GPU-enabled containers.
```mermaid
flowchart TD
subgraph "Docker Host"
subgraph "Templar Node Container"
E["Entrypoint.sh"] --> N["Node Process"]
N --> M["Miner.py"]
N --> V["Validator.py"]
ENV["Environment Variables"]
VOL["Volumes"]
VOL --> W["Wallets"]
VOL --> L["Logs"]
end
subgraph "Watchtower Container"
WT["Watchtower Service"]
WT --> IM["Image Updates"]
WT --> CR["Container Restart"]
end
GPUs["NVIDIA GPUs"]
GPUs --> N
end
GH["GitHub Container Registry"] --> WT
```
Sources: docker/compose.yml , docker/Dockerfile , scripts/entrypoint.sh
Docker Image
The Templar Docker image is based on NVIDIA’s CUDA runtime image, with Python and essential dependencies installed:
- Base image: nvidia/cuda:12.6.0-runtime-ubuntu22.04
- Python with dependencies installed via uv
- Entrypoint script for node startup
The official image is published to GitHub Container Registry as ghcr.io/tplr-ai/templar.
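As an illustration, the image can be pulled and run directly. This is only a sketch; in practice the Compose setup described below wires up GPUs, volumes, and environment variables, and the .env file referenced here stands in for the variables listed later on this page:
```bash
# Pull the published Templar image from GitHub Container Registry
docker pull ghcr.io/tplr-ai/templar:latest

# Illustrative manual run (the Compose file is the usual path)
docker run --gpus all --env-file .env ghcr.io/tplr-ai/templar:latest
```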
Sources: docker/Dockerfile , .github/workflows/docker.yml
Docker Compose Configuration
The docker-compose.yml file defines the services required for running Templar nodes:
- node service - Configures the Templar node (miner or validator)
- watchtower service - Provides automatic updates of container images
Key configuration aspects include:
- Volume mounts for wallet and log persistence
- Environment variable configuration
- GPU device assignment
- Automatic updates via Watchtower
For testing environments, a docker-compose-test.yml is provided that configures a multi-node setup with miners and validators.
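The sketch below illustrates the general shape of such a Compose file. It is not the actual docker/compose.yml; the volume paths, image tag, and Watchtower wiring shown here are assumptions for illustration:
```yaml
# Illustrative sketch only -- see docker/compose.yml for the real configuration
services:
  node:
    image: ghcr.io/tplr-ai/templar:latest
    env_file: .env                            # NODE_TYPE, wallet, network, and R2 variables
    volumes:
      - ./wallets:/root/.bittensor/wallets    # wallet persistence (assumed path)
      - ./logs:/app/logs                      # log persistence (assumed path)
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: [ '0' ]
              capabilities: [ gpu ]

  watchtower:
    image: containrrr/watchtower
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock   # lets Watchtower pull new images and restart containers
```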
Sources: docker/compose.yml , docker/docker-compose-test.yml
Environment Variables for Docker Deployment
Docker deployments require a number of environment variables to configure node behavior, access storage resources, and connect to the Bittensor network.
| Category | Variable Name | Description | Required |
|---|---|---|---|
| Node Configuration | NODE_TYPE | Either “miner” or “validator” | Yes |
| | WALLET_NAME | Bittensor wallet name | Yes |
| | WALLET_HOTKEY | Bittensor wallet hotkey | Yes |
| | CUDA_DEVICE | CUDA device to use (e.g., “cuda:0”) | Yes |
| | NETWORK | Bittensor network (e.g., “finney”, “test”) | Yes |
| | NETUID | Bittensor subnet UID | Yes |
| | DEBUG | Enable debug mode (true/false) | No |
| API Keys | WANDB_API_KEY | Weights & Biases API key | Yes |
| R2 Storage | R2_GRADIENTS_ACCOUNT_ID | Cloudflare R2 account ID | Yes |
| | R2_GRADIENTS_BUCKET_NAME | Bucket name for gradients | Yes |
| | R2_GRADIENTS_READ_ACCESS_KEY_ID | Read access key ID | Yes |
| | R2_GRADIENTS_READ_SECRET_ACCESS_KEY | Read secret access key | Yes |
| | R2_GRADIENTS_WRITE_ACCESS_KEY_ID | Write access key ID | Yes |
| | R2_GRADIENTS_WRITE_SECRET_ACCESS_KEY | Write secret access key | Yes |
| | R2_DATASET_* | Similar set of variables for the dataset bucket | Yes |
| | R2_AGGREGATOR_* | Similar set of variables for the aggregator bucket | Yes |
| GitHub Integration | GITHUB_USER | GitHub username for Watchtower | No |
| | GITHUB_TOKEN | GitHub token for Watchtower | No |
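For reference, a minimal .env file for a miner might look like the sketch below. All values are placeholders, and only the gradients bucket is shown; the dataset and aggregator buckets follow the same pattern:
```bash
# Node configuration (placeholder values)
NODE_TYPE=miner
WALLET_NAME=my_wallet
WALLET_HOTKEY=my_hotkey
CUDA_DEVICE=cuda:0
NETWORK=finney
NETUID=...            # use the actual Templar subnet UID
DEBUG=false

# API keys
WANDB_API_KEY=...

# Cloudflare R2 gradients bucket (R2_DATASET_* and R2_AGGREGATOR_* follow the same pattern)
R2_GRADIENTS_ACCOUNT_ID=...
R2_GRADIENTS_BUCKET_NAME=...
R2_GRADIENTS_READ_ACCESS_KEY_ID=...
R2_GRADIENTS_READ_SECRET_ACCESS_KEY=...
R2_GRADIENTS_WRITE_ACCESS_KEY_ID=...
R2_GRADIENTS_WRITE_SECRET_ACCESS_KEY=...
```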
Sources: docker/compose.yml , scripts/entrypoint.sh
Ansible Deployment Approach
Ansible provides a more infrastructure-focused approach to deployment, suitable for managing multiple machines or complex deployments.
```mermaid
flowchart TD
subgraph "Control Machine"
A["Ansible Playbook"]
IT["Inventory"]
VT["Vault"]
end
subgraph "Target Host"
subgraph "Per GPU"
TR["Templar Repository"]
EV["Environment Configuration"]
VE["Python venv"]
SD["Systemd Service"]
G["GPU Assignment"]
end
APT["System Packages"]
PIP["Global Pip Packages"]
end
A --> IT
A --> VT
A --> APT
A --> PIP
A --> TR
A --> EV
A --> VE
A --> SD
EV --> G
```
Sources: ansible/playbook.yml , ansible/roles/templar/defaults/main.yml , ansible/README.md
Ansible Configuration
The Ansible deployment requires:
- Inventory file - Defines target hosts and their GPU configurations
- Vault file - Securely stores environment variables and secrets
- Playbook - Orchestrates the deployment process
The Ansible setup supports multi-GPU deployments, creating separate instances for each GPU with dedicated directories and services.
Key files:
- ansible/playbook.yml - Main playbook
- ansible/roles/templar/* - Role definitions
- group_vars/all/vault.yml - Encrypted variables
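As a rough sketch, an inventory entry and the command used to run the playbook might look like the following. The hostname, address, and the gpus variable are assumptions for illustration rather than names taken from the repository:
```yaml
# inventory.yml -- illustrative only
all:
  hosts:
    gpu-host-1:
      ansible_host: 203.0.113.10   # placeholder address
      gpus: [ '0', '1' ]           # assumed per-host GPU list
```
```bash
# Run the playbook against the inventory, prompting for the vault password
ansible-playbook -i inventory.yml ansible/playbook.yml --ask-vault-pass
```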
Sources: ansible/playbook.yml , ansible/roles/templar/defaults/main.yml , ansible/README.md , ansible/group_vars/all/vault.yml.example
Systemd Service Configuration
For persistent operation, the Ansible deployment can configure systemd services to manage Templar processes:
```ini
[Unit]
Description=Templar Miner Service
After=network.target

[Service]
WorkingDirectory=/path/to/templar
EnvironmentFile=/path/to/templar/.env
ExecStart=/path/to/templar/.venv/bin/python neurons/miner.py [options]
Restart=always
RestartSec=5s
```
This ensures automatic restart on failure and proper startup sequence.
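Assuming the unit is installed as templar-miner (the name used in the troubleshooting commands below), the service is enabled and started with standard systemd commands:
```bash
sudo systemctl daemon-reload
sudo systemctl enable --now templar-miner
```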
Sources: ansible/roles/templar/templates/miner.service.j2
Resource Requirements
Templar requires:
- CUDA-capable NVIDIA GPU(s)
- Sufficient RAM for model operations
- Storage for checkpoints and logs
- Network connectivity for Bittensor communication and R2 storage access
The docker-compose configuration allows assigning specific GPUs to containers using the device_ids property:
```yaml
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          device_ids: [ '0', '1', '2' ]
          capabilities: [ gpu ]
```
Sources: docker/compose.yml , docker/docker-compose-test.yml
CI/CD Pipeline
Templar includes a GitHub Actions workflow for building and publishing Docker images:
```mermaid
flowchart TD
subgraph "GitHub Actions Workflow"
T["Trigger"] --> C["Checkout Code"]
C --> S["Setup Docker Buildx"]
S --> L["Login to Registry"]
L --> M["Extract Metadata"]
M --> B["Build Docker Image"]
B --> P["Push to Registry"]
end
subgraph "Triggers"
REL["Release Published"]
WD["Workflow Dispatch"]
end
subgraph "Tags Generated"
SV["Semantic Version Tags"]
LT["Latest Tag"]
SHA["SHA Tags"]
end
REL --> T
WD --> T
M --> SV
M --> LT
M --> SHA
```
The workflow automatically builds images when releases are published and tags them appropriately based on semantic versioning.
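A condensed sketch of such a workflow is shown below. It mirrors the stages in the diagram but is not the repository's actual .github/workflows/docker.yml; action versions and the exact tag rules are assumptions:
```yaml
# Illustrative sketch of the build-and-publish workflow
name: docker
on:
  release:
    types: [published]
  workflow_dispatch:

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - id: meta
        uses: docker/metadata-action@v5
        with:
          images: ghcr.io/tplr-ai/templar
          tags: |
            type=semver,pattern={{version}}
            type=sha
      - uses: docker/build-push-action@v5
        with:
          context: .
          file: docker/Dockerfile
          push: true
          tags: ${{ steps.meta.outputs.tags }}
```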
Sources: .github/workflows/docker.yml
Startup Process
When a Templar container starts, the entrypoint script (scripts/entrypoint.sh) performs several initialization steps:
- Validates required environment variables
- Activates Python virtual environment
- Checks CUDA availability
- Logs in to Weights & Biases
- Starts the appropriate node type (miner or validator)
The script constructs the correct command-line arguments based on environment variables.
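The sketch below illustrates that flow in simplified form. It is not the actual scripts/entrypoint.sh; the venv location, the exact variables checked, and the CLI flags are assumptions:
```bash
#!/bin/bash
# Simplified illustration of the entrypoint flow
set -euo pipefail

# 1. Validate required environment variables
for var in NODE_TYPE WALLET_NAME WALLET_HOTKEY CUDA_DEVICE NETWORK NETUID WANDB_API_KEY; do
  [ -n "${!var:-}" ] || { echo "Missing required variable: $var" >&2; exit 1; }
done

# 2. Activate the Python virtual environment (assumed location)
source /app/.venv/bin/activate

# 3. Check CUDA availability
python -c "import torch; assert torch.cuda.is_available(), 'CUDA not available'"

# 4. Log in to Weights & Biases
wandb login "$WANDB_API_KEY"

# 5. Start the requested node type (flags are illustrative)
exec python "neurons/${NODE_TYPE}.py" \
  --wallet.name "$WALLET_NAME" \
  --wallet.hotkey "$WALLET_HOTKEY" \
  --device "$CUDA_DEVICE" \
  --netuid "$NETUID"
```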
Sources: scripts/entrypoint.sh
Troubleshooting
Common deployment issues include:
- CUDA availability - Ensure NVIDIA drivers are installed and compatible with the container’s CUDA version
- Environment variables - Check that all required variables are set correctly
- GPU assignment - Verify that GPUs are correctly assigned to containers
- Network connectivity - Ensure access to Bittensor network and R2 storage
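For CUDA and GPU-assignment problems on a Docker host, a quick sanity check (assuming the NVIDIA Container Toolkit is installed) is to run nvidia-smi on the host and then inside the same CUDA base image the Templar image is built on:
```bash
# Host driver should list the expected GPUs
nvidia-smi

# Containers should see the same GPUs through the NVIDIA runtime
docker run --rm --gpus all nvidia/cuda:12.6.0-runtime-ubuntu22.04 nvidia-smi
```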
For Docker deployments, you can check logs with:
```bash
docker logs templar-{NODE_TYPE}-{WALLET_HOTKEY}
```
For Ansible deployments with systemd, check:
```bash
systemctl status templar-miner
journalctl -u templar-miner
```
Sources: scripts/entrypoint.sh , docker/compose.yml , ansible/roles/templar/templates/miner.service.j2