Testing
This page documents the testing infrastructure and practices for the Templar codebase. It covers the test organization, key test fixtures, testing of core components, and guidelines for running tests and creating new tests. For information about the CI/CD pipeline, see CI/CD Pipeline.
Test Organization
The Templar project uses pytest as its primary testing framework. Tests are organized in the tests/ directory, with individual test files corresponding to specific components of the system.
```mermaid
flowchart TD
subgraph "Test Structure"
TS["/tests directory"]
CF["conftest.py"]
MET["test_metrics_logger.py"]
EVAL["test_evaluator.py"]
CKPT["test_checkpoints.py"]
GRAD["test_prepare_gradient_dict.py"]
end
subgraph "System Components"
ML["Metrics Logger"]
EV["Evaluator"]
CP["Checkpoint Management"]
GP["Gradient Processing"]
end
CF --> MET
CF --> EVAL
CF --> CKPT
CF --> GRAD
MET --> ML
EVAL --> EV
CKPT --> CP
GRAD --> GP
```
Sources: tests/conftest.py, tests/test_metrics_logger.py, tests/test_evaluator.py, tests/test_checkpoints.py, tests/test_prepare_gradient_dict.py
Test Fixtures
Test fixtures provide a consistent environment for tests to run in. The conftest.py file defines fixtures that are available across all test files, including mock models, metadata, communications interfaces, and system configurations.
```mermaid
flowchart LR
subgraph "Core Fixtures"
M["model()"]
TK["totalks()"]
CI["comms_instance()"]
EL["enable_tplr_logger_propagation()"]
end
subgraph "Mocks for Integration Testing"
MM["mock_metagraph()"]
MV["mock_validator()"]
MNN["num_non_zero_incentive()"]
NAM["num_active_miners()"]
end
subgraph "Helper Classes"
DW["DummyWallet"]
DC["DummyConfig"]
DH["DummyHParams"]
DM["DummyMetagraph"]
end
M --- CI
TK --- CI
MM --> MV
MNN --> MM
NAM --> MV
DW --> CI
DC --> CI
DH --> CI
DM --> CI
```
Sources: tests/conftest.py:60-197
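A test consumes these fixtures simply by naming them as parameters; pytest resolves them from conftest.py. The test below is a hypothetical illustration (it is not part of the suite) and assumes only that the `model()` fixture returns the small `torch.nn.Sequential` model shown under Core Fixtures below:

```python
import torch

# Hypothetical example of fixture injection: pytest matches the `model`
# argument to the model() fixture defined in tests/conftest.py.
def test_model_forward_shape(model):
    batch = torch.randn(4, 10)
    output = model(batch)            # Linear(10, 10) preserves the feature dimension
    assert output.shape == (4, 10)
```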
Testing Core Components
Metrics Logging
The tests for the metrics logging system verify that metrics can be properly collected, formatted, and sent to the metrics storage backend (InfluxDB). The tests use mocking to isolate the metrics logger from the actual InfluxDB service.
```mermaid
flowchart TD
subgraph "MetricsLogger Tests"
TI["test_init()"]
TPV["test_process_value()"]
TLB["test_log_basic()"]
TLSM["test_log_with_system_metrics()"]
TLGM["test_log_with_gpu_metrics()"]
TLWL["test_log_with_list_fields()"]
TLCT["test_log_with_config_tags()"]
TLE["test_log_with_exception()"]
TLOC["test_log_call_invokes_write_once()"]
end
subgraph "Fixtures"
MI["mock_influxdb_client()"]
ML["metrics_logger()"]
MCF["mock_cuda_functions()"]
MSM["mock_system_metrics()"]
BTC["bt_config()"]
end
MI --> ML
ML --> TI
ML --> TPV
ML --> TLB
ML --> TLSM
ML --> TLGM
ML --> TLWL
ML --> TLCT
ML --> TLE
ML --> TLOC
MCF --> TLGM
MSM --> TLSM
BTC --> TLCT
```
Sources: tests/test_metrics_logger.py:58-341
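The core of this isolation pattern is patching `InfluxDBClient` where `tplr.metrics` looks it up and asserting against the mock's write API instead of a live database. The sketch below is illustrative only: the `MetricsLogger` constructor arguments and the exact `log()` signature are assumptions, and the real fixtures and assertions live in tests/test_metrics_logger.py.

```python
from unittest.mock import patch

# Illustrative sketch of the isolation pattern, not the real test code.
# The MetricsLogger constructor arguments and log() signature are assumptions.
def test_log_invokes_write_once_sketch():
    with patch("tplr.metrics.InfluxDBClient", autospec=True) as mock_client_class:
        from tplr.metrics import MetricsLogger

        logger = MetricsLogger(...)  # arguments elided, as in the fixtures shown later
        logger.log("test_measurement", tags={}, fields={"value": 1.0})

        # All writes go to the mocked client, so a real InfluxDB is never touched.
        write_api = mock_client_class.return_value.write_api.return_value
        assert write_api.write.call_count == 1
```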
Evaluator Testing
The evaluator tests verify that the evaluator component can properly detect, load, and evaluate new model checkpoints, handling versioning and tracking state correctly.
| Test Case | Purpose |
|---|---|
| `test_evaluator_skips_old_checkpoints` | Verifies the evaluator doesn’t reload already evaluated checkpoints |
| `test_evaluator_loads_new_checkpoints` | Confirms the evaluator correctly loads and processes new checkpoints |
Sources: tests/test_evaluator.py:21-143
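The behaviour under test boils down to a comparison between a checkpoint's version/window and the last one already evaluated. The sketch below expresses that rule with a hypothetical helper; `should_evaluate` is not part of the Evaluator API and exists only to make the skip/load distinction concrete.

```python
# Hypothetical helper capturing the rule the evaluator tests exercise:
# only checkpoints newer than the last evaluated one are processed.
def should_evaluate(checkpoint_window: int, last_eval_window: int) -> bool:
    return checkpoint_window > last_eval_window


def test_skips_already_evaluated_checkpoint():
    assert should_evaluate(checkpoint_window=10, last_eval_window=10) is False


def test_loads_newer_checkpoint():
    assert should_evaluate(checkpoint_window=12, last_eval_window=10) is True
```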
Checkpoint Management Testing
The checkpoint tests verify the creation, storage, and loading of model checkpoints, which are critical for distributed training continuity and recovery.
```mermaid
flowchart TD
subgraph "Checkpoint Tests"
TCDS["test_checkpoint_data_structure()"]
TMK["test_missing_key_in_checkpoint()"]
TCF["test_corrupted_checkpoint_file()"]
TCL["test_catch_up_logic()"]
TNC["test_no_catch_up_when_aligned()"]
TMC["test_miner_checkpoint_cycle()"]
TSO["test_scheduler_optimizer_sync_after_catch_up()"]
TAG["test_async_gather_failures()"]
TCSF["test_checkpoint_save_trigger_frequency()"]
TLFC["test_local_file_creation_after_checkpoint_save()"]
TCPUV["test_cpu_device_verification_for_checkpoint_saved_tensors()"]
TCSLC["test_checkpoint_save_and_load_cycle()"]
end
subgraph "Test Components"
DC["DummyComponents"]
CCD["create_checkpoint_data()"]
DCOM["DummyComms"]
end
DC --> TCDS
DC --> TMK
DC --> TCL
DC --> TNC
DC --> TMC
DC --> TSO
DC --> TAG
DC --> TCSF
DC --> TLFC
DC --> TCPUV
DC --> TCSLC
CCD --> TCDS
CCD --> TMK
CCD --> TCL
CCD --> TNC
DCOM --> TMK
DCOM --> TCF
DCOM --> TCL
DCOM --> TNC
DCOM --> TMC
DCOM --> TSO
DCOM --> TAG
DCOM --> TCSF
DCOM --> TLFC
DCOM --> TCPUV
DCOM --> TCSLC
```
Sources: tests/test_checkpoints.py:236-774
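As a self-contained illustration of the round-trip property several of these tests check (checkpoint state written to disk, with tensors kept on CPU, restores unchanged), here is a hedged sketch; the dictionary keys and the test name are hypothetical and do not mirror `create_checkpoint_data()` exactly.

```python
import torch

# Hypothetical round-trip test; checkpoint keys are illustrative only.
def test_checkpoint_round_trip(tmp_path):
    model = torch.nn.Linear(10, 10)
    checkpoint = {
        "model_state_dict": {k: v.cpu().clone() for k, v in model.state_dict().items()},
        "current_window": 5,
    }
    path = tmp_path / "checkpoint.pt"
    torch.save(checkpoint, path)

    loaded = torch.load(path)
    assert loaded["current_window"] == 5
    for key, tensor in loaded["model_state_dict"].items():
        assert tensor.device.type == "cpu"   # saved tensors stay on CPU
        assert torch.equal(tensor, checkpoint["model_state_dict"][key])
```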
Gradient Processing Testing
Tests for the prepare_gradient_dict function verify proper gradient processing, compression, momentum calculation, and metadata attachment.
| Test Case | Purpose |
|---|---|
| `test_return_structure_and_types` | Verifies the function returns the expected structure |
| `test_metadata_attachment` | Confirms metadata is properly attached to gradients |
| `test_weight_decay_application` | Tests weight decay is correctly applied |
| `test_momentum_decay_and_gradient_accumulation` | Verifies momentum calculation |
| `test_compressor_and_transformer_calls` | Tests compression and transformation operations |
| `test_handling_multiple_parameters` | Verifies handling of multiple model parameters |
| `test_behavior_when_p_grad_is_none` | Tests error handling for missing gradients |
| `test_logging_behavior` | Verifies proper logging |
| `test_correct_use_of_scheduler_learning_rate` | Tests learning rate is properly used |
| `test_propagation_of_compressor_failure` | Tests exception handling from compressor |
| `test_propagation_of_transformer_failure` | Tests exception handling from transformer |
Sources: tests/test_prepare_gradient_dict.py:80-516
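To make the momentum check concrete, the sketch below spells out the kind of update `test_momentum_decay_and_gradient_accumulation` reasons about; the decay factor and learning-rate handling here are assumptions for illustration, not the exact formula used by `tplr.neurons.prepare_gradient_dict`.

```python
import torch

# Illustrative momentum update; the constants and formula are assumptions.
def test_momentum_decay_and_accumulation_sketch():
    momentum = torch.zeros(3)
    grad = torch.tensor([1.0, 2.0, 3.0])
    decay, lr = 0.9, 0.01

    momentum = momentum * decay + grad * lr      # decayed history plus lr-scaled gradient
    assert torch.allclose(momentum, grad * lr)   # first step: the history term is zero
```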
Test Fixtures and Mock Objects
Templar tests use a variety of fixtures and mock objects to create isolated test environments. These include:
Core Fixtures
- Model Fixture

  ```python
  @pytest.fixture
  def model():
      # Create a simple dummy model for testing.
      return torch.nn.Sequential(torch.nn.Linear(10, 10))
  ```

- Communications Fixture

  ```python
  @pytest.fixture
  async def comms_instance():
      # Initialize communications with mock dependencies
      comms = comms_module.Comms(...)
      # Add transformer and compressor
      return comms
  ```

- Metagraph and Validator Fixtures

  ```python
  @pytest.fixture
  def mock_metagraph(mocker, num_non_zero_incentive, num_miners=250):
      # Create a mock metagraph with specified miners and incentive distribution
      metagraph = mocker.Mock()
      # Configure properties
      return metagraph

  @pytest.fixture
  def mock_validator(mocker, mock_metagraph, num_active_miners):
      # Initialize mock validator
      validator = object.__new__(Validator)
      # Set up necessary attributes
      return validator
  ```
Sources: tests/conftest.py:60-198
Running Tests
Tests can be run using pytest. The Templar project has both synchronous and asynchronous tests, with the latter being marked with the @pytest.mark.asyncio decorator.
Basic Test Execution
```bash
# Run all tests
pytest

# Run tests in a specific file
pytest tests/test_metrics_logger.py

# Run a specific test function
pytest tests/test_metrics_logger.py::TestMetricsLogger::test_init
```
Handling Asynchronous Tests
Asynchronous tests are configured via the pytest_configure function in conftest.py:
```python
def pytest_configure(config):
    config.addinivalue_line("markers", "asyncio: mark test as requiring async")
```
When writing asynchronous tests, use the `@pytest.mark.asyncio` decorator and an `async def` function definition:
```python
@pytest.mark.asyncio
async def test_async_function():
    # Test code here
    pass
```
Sources: tests/conftest.py:1-6, tests/test_evaluator.py:59-143, tests/test_checkpoints.py:279-593
Testing Patterns
Fixture-Based Testing
Templar tests use pytest fixtures extensively to set up test dependencies:
```python
@pytest.fixture
def metrics_logger(self, mock_influxdb_client):
    # Create a MetricsLogger instance for testing
    logger = MetricsLogger(...)
    return logger
```
Mock Implementation Patterns
Mocks are used to isolate components under test:
```python
@pytest.fixture
def mock_influxdb_client(self):
    # Patch InfluxDBClient, configure mocks, and return the mock class
    with patch("tplr.metrics.InfluxDBClient", autospec=True) as mock_client_class:
        # Configure mock
        yield mock_client_class
```
Waiting For Asynchronous Operations
For asynchronous operations, the codebase includes helper functions like wait_for_mock_call:
```python
def wait_for_mock_call(mock_object: Mock, timeout: float = 3.0):
    """Waits for a mock object to be called at least once."""
    start_time = time.monotonic()
    while time.monotonic() < start_time + timeout:
        if mock_object.call_count > 0:
            return True
        time.sleep(0.05)
    return False
```
Sources: tests/test_metrics_logger.py:40-55, tests/test_metrics_logger.py:84-114, tests/test_metrics_logger.py:60-82
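A hypothetical usage of this helper (the test name and threading setup are illustrative, not taken from the suite) waits for a mock fired from a background thread before asserting on it:

```python
import threading
from unittest.mock import Mock

# Hypothetical usage sketch: poll until the background thread has invoked the
# mock, then make ordinary assertions on it.
def test_background_callback_fires():
    callback = Mock()
    threading.Thread(target=callback).start()

    assert wait_for_mock_call(callback, timeout=3.0)
    callback.assert_called_once()
```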
Adding New Tests
When adding new tests to the Templar project, follow these guidelines (a sketch of a new test module applying them appears after the list):
- Test Organization: Place tests in the appropriate file based on the component being tested
- Fixtures: Use existing fixtures from `conftest.py` when possible, or add new ones if needed
- Mocking: Use mocks to isolate the component under test from external dependencies
- Async Testing: Use the `@pytest.mark.asyncio` decorator for asynchronous tests
- Test Coverage: Aim to test both normal operation and error conditions
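As a sketch only (the file name, fixture usage, and assertions are placeholders rather than prescribed conventions), a new test module following these guidelines might look like this:

```python
# tests/test_my_component.py (hypothetical)
import pytest


def test_normal_operation(model):
    # Reuse the shared model fixture from conftest.py for the normal path.
    assert model is not None


@pytest.mark.asyncio
async def test_async_operation(comms_instance):
    # Asynchronous behaviour is marked with the asyncio marker.
    assert comms_instance is not None
```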
Test Architecture and Component Relationship
The test architecture mirrors the Templar system architecture, with tests for each major component and their interactions.
```mermaid
flowchart TD
subgraph "Test Framework"
PF["pytest Framework"]
FST["Fixtures (conftest.py)"]
TMO["Test Mocks"]
UHF["Utility Helper Functions"]
end
subgraph "Component Tests"
MTL["MetricsLogger Tests"]
EVT["Evaluator Tests"]
CKT["Checkpoint Tests"]
GPT["Gradient Processing Tests"]
end
subgraph "System Components"
ML["tplr.metrics.MetricsLogger"]
EV["scripts.evaluator.Evaluator"]
CP["tplr.comms (Checkpoint Functions)"]
GP["tplr.neurons.prepare_gradient_dict"]
end
PF --> FST
PF --> TMO
PF --> UHF
FST --> MTL
FST --> EVT
FST --> CKT
FST --> GPT
TMO --> MTL
TMO --> EVT
TMO --> CKT
TMO --> GPT
UHF --> MTL
UHF --> EVT
UHF --> CKT
MTL --> ML
EVT --> EV
CKT --> CP
GPT --> GP
```
Sources: tests/conftest.py, tests/test_metrics_logger.py, tests/test_evaluator.py, tests/test_checkpoints.py, tests/test_prepare_gradient_dict.py