Testing
This page documents the testing infrastructure and practices for the Templar codebase. It covers the test organization, key test fixtures, testing of core components, and guidelines for running tests and creating new tests. For information about the CI/CD pipeline, see CI/CD Pipeline.
Test Organization
The Templar project uses pytest as its primary testing framework. Tests are organized in the `tests/` directory, with individual test files corresponding to specific components of the system.
flowchart TD subgraph "Test Structure" TS["/tests directory"] CF["conftest.py"] MET["test_metrics_logger.py"] EVAL["test_evaluator.py"] CKPT["test_checkpoints.py"] GRAD["test_prepare_gradient_dict.py"] end subgraph "System Components" ML["Metrics Logger"] EV["Evaluator"] CP["Checkpoint Management"] GP["Gradient Processing"] end CF --> MET CF --> EVAL CF --> CKPT CF --> GRAD MET --> ML EVAL --> EV CKPT --> CP GRAD --> GP
Sources: tests/conftest.py, tests/test_metrics_logger.py, tests/test_evaluator.py, tests/test_checkpoints.py, tests/test_prepare_gradient_dict.py
Test Fixtures
Test fixtures provide a consistent environment for tests to run in. The `conftest.py` file defines fixtures that are available across all test files, including mock models, metadata, communications interfaces, and system configurations.
```mermaid
flowchart LR
    subgraph "Core Fixtures"
        M["model()"]
        TK["totalks()"]
        CI["comms_instance()"]
        EL["enable_tplr_logger_propagation()"]
    end
    subgraph "Mocks for Integration Testing"
        MM["mock_metagraph()"]
        MV["mock_validator()"]
        MNN["num_non_zero_incentive()"]
        NAM["num_active_miners()"]
    end
    subgraph "Helper Classes"
        DW["DummyWallet"]
        DC["DummyConfig"]
        DH["DummyHParams"]
        DM["DummyMetagraph"]
    end
    M --- CI
    TK --- CI
    MM --> MV
    MNN --> MM
    NAM --> MV
    DW --> CI
    DC --> CI
    DH --> CI
    DM --> CI
```
Sources: tests/conftest.py:60-197
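For example, a test can request these fixtures simply by naming them as parameters, and pytest resolves them from `conftest.py`. The sketch below is illustrative only; the test name and assertions are invented here, and only the `model` fixture comes from the repository.

```python
import torch


# Hypothetical test sketch: `model` is a fixture defined in conftest.py;
# pytest injects it because the parameter name matches the fixture name.
def test_model_fixture_exposes_parameters(model):
    params = list(model.parameters())
    # The fixture returns a small torch.nn model, so it should expose parameters.
    assert len(params) > 0
    assert all(isinstance(p, torch.Tensor) for p in params)
```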
Testing Core Components
Metrics Logging
The tests for the metrics logging system verify that metrics can be properly collected, formatted, and sent to the metrics storage backend (InfluxDB). The tests use mocking to isolate the metrics logger from the actual InfluxDB service.
flowchart TD subgraph "MetricsLogger Tests" TI["test_init()"] TPV["test_process_value()"] TLB["test_log_basic()"] TLSM["test_log_with_system_metrics()"] TLGM["test_log_with_gpu_metrics()"] TLWL["test_log_with_list_fields()"] TLCT["test_log_with_config_tags()"] TLE["test_log_with_exception()"] TLOC["test_log_call_invokes_write_once()"] end subgraph "Fixtures" MI["mock_influxdb_client()"] ML["metrics_logger()"] MCF["mock_cuda_functions()"] MSM["mock_system_metrics()"] BTC["bt_config()"] end MI --> ML ML --> TI ML --> TPV ML --> TLB ML --> TLSM ML --> TLGM ML --> TLWL ML --> TLCT ML --> TLE ML --> TLOC MCF --> TLGM MSM --> TLSM BTC --> TLCT
Sources: tests/test_metrics_logger.py:58-341
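The central pattern — substitute a mock for the InfluxDB client so no network calls are made, then assert on the mock — can be shown with a self-contained sketch. `FakeMetricsSender` and its methods are stand-ins invented for this illustration; the real tests exercise `tplr.metrics.MetricsLogger` with `tplr.metrics.InfluxDBClient` patched.

```python
from unittest.mock import MagicMock


class FakeMetricsSender:
    """Stand-in for a metrics logger that writes through an InfluxDB-style client."""

    def __init__(self, client_class):
        # Accept the client class as a parameter so a mock can be injected in tests.
        self.client = client_class(url="http://localhost:8086", token="token", org="org")

    def log(self, measurement: str, fields: dict) -> None:
        self.client.write_api().write(bucket="tplr", record={measurement: fields})


def test_log_invokes_write_once():
    # Replace the client class with a mock so no real InfluxDB connection is made.
    mock_client_class = MagicMock()
    sender = FakeMetricsSender(mock_client_class)
    sender.log("validator_step", {"loss": 1.23})
    mock_client_class.return_value.write_api.return_value.write.assert_called_once()
```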
Evaluator Testing
The evaluator tests verify that the evaluator component can properly detect, load, and evaluate new model checkpoints, handling versioning and tracking state correctly.
| Test Case | Purpose |
|---|---|
| test_evaluator_skips_old_checkpoints | Verifies the evaluator doesn't reload already evaluated checkpoints |
| test_evaluator_loads_new_checkpoints | Confirms the evaluator correctly loads and processes new checkpoints |
Sources: tests/test_evaluator.py:21-143
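The behaviour these two tests pin down can be reduced to a small bookkeeping rule: remember the most recently evaluated checkpoint and only act on strictly newer ones. The `CheckpointTracker` class below is invented for illustration; the real tests drive `scripts.evaluator.Evaluator` with mocked dependencies.

```python
class CheckpointTracker:
    """Illustrative stand-in for the evaluator's checkpoint bookkeeping."""

    def __init__(self):
        self.last_eval_window = -1

    def should_evaluate(self, checkpoint_window: int) -> bool:
        return checkpoint_window > self.last_eval_window

    def mark_evaluated(self, checkpoint_window: int) -> None:
        self.last_eval_window = checkpoint_window


def test_skips_old_checkpoints():
    tracker = CheckpointTracker()
    tracker.mark_evaluated(10)
    # A checkpoint from an already-evaluated window must not be reloaded.
    assert not tracker.should_evaluate(10)


def test_loads_new_checkpoints():
    tracker = CheckpointTracker()
    tracker.mark_evaluated(10)
    # A strictly newer checkpoint should be picked up.
    assert tracker.should_evaluate(11)
```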
Checkpoint Management Testing
The checkpoint tests verify the creation, storage, and loading of model checkpoints, which are critical for distributed training continuity and recovery.
flowchart TD subgraph "Checkpoint Tests" TCDS["test_checkpoint_data_structure()"] TMK["test_missing_key_in_checkpoint()"] TCF["test_corrupted_checkpoint_file()"] TCL["test_catch_up_logic()"] TNC["test_no_catch_up_when_aligned()"] TMC["test_miner_checkpoint_cycle()"] TSO["test_scheduler_optimizer_sync_after_catch_up()"] TAG["test_async_gather_failures()"] TCSF["test_checkpoint_save_trigger_frequency()"] TLFC["test_local_file_creation_after_checkpoint_save()"] TCPUV["test_cpu_device_verification_for_checkpoint_saved_tensors()"] TCSLC["test_checkpoint_save_and_load_cycle()"] end subgraph "Test Components" DC["DummyComponents"] CCD["create_checkpoint_data()"] DCOM["DummyComms"] end DC --> TCDS DC --> TMK DC --> TCL DC --> TNC DC --> TMC DC --> TSO DC --> TAG DC --> TCSF DC --> TLFC DC --> TCPUV DC --> TCSLC CCD --> TCDS CCD --> TMK CCD --> TCL CCD --> TNC DCOM --> TMK DCOM --> TCF DCOM --> TCL DCOM --> TNC DCOM --> TMC DCOM --> TSO DCOM --> TAG DCOM --> TCSF DCOM --> TLFC DCOM --> TCPUV DCOM --> TCSLC
Sources: tests/test_checkpoints.py:236-774
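A minimal sketch of what a save-and-load-cycle test checks: the checkpoint round-trips through a file, and the saved tensors live on the CPU. The checkpoint keys used here (`model_state_dict`, `start_window`, `current_window`) are assumptions for illustration rather than the exact structure built by `create_checkpoint_data()` in the real tests.

```python
import torch


def test_checkpoint_save_and_load_cycle(tmp_path):
    model = torch.nn.Linear(10, 10)
    checkpoint = {
        # Assumed layout; the real tests build their checkpoint via create_checkpoint_data().
        "model_state_dict": {k: v.cpu().clone() for k, v in model.state_dict().items()},
        "start_window": 5,
        "current_window": 7,
    }
    path = tmp_path / "checkpoint.pt"
    torch.save(checkpoint, path)

    loaded = torch.load(path)
    # Saved tensors should be CPU tensors so any worker can load them.
    assert all(t.device.type == "cpu" for t in loaded["model_state_dict"].values())
    # The round trip must preserve parameter values exactly.
    for k, v in model.state_dict().items():
        assert torch.equal(loaded["model_state_dict"][k], v.cpu())
```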
Gradient Processing Testing
Tests for the `prepare_gradient_dict` function verify proper gradient processing, compression, momentum calculation, and metadata attachment.
| Test Case | Purpose |
|---|---|
| test_return_structure_and_types | Verifies the function returns the expected structure and types |
| test_metadata_attachment | Confirms metadata is properly attached to gradients |
| test_weight_decay_application | Verifies that weight decay is correctly applied |
| test_momentum_decay_and_gradient_accumulation | Verifies momentum decay and gradient accumulation |
| test_compressor_and_transformer_calls | Verifies compression and transformation operations are invoked |
| test_handling_multiple_parameters | Verifies handling of multiple model parameters |
| test_behavior_when_p_grad_is_none | Tests error handling for missing gradients |
| test_logging_behavior | Verifies proper logging |
| test_correct_use_of_scheduler_learning_rate | Verifies the scheduler learning rate is used correctly |
| test_propagation_of_compressor_failure | Tests exception handling when the compressor fails |
| test_propagation_of_transformer_failure | Tests exception handling when the transformer fails |
Sources: tests/test_prepare_gradient_dict.py:80-516
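As an illustration of the arithmetic such tests pin down, the sketch below checks momentum decay and weight decay on a toy update rule. `toy_momentum_step` and its formula are stand-ins; they do not reproduce the signature or exact update of `tplr.neurons.prepare_gradient_dict`.

```python
import torch


def toy_momentum_step(param, grad, momentum, lr, momentum_decay=0.9, weight_decay=0.1):
    """Toy stand-in: decay the momentum buffer and accumulate the weight-decayed gradient."""
    grad = grad + weight_decay * param           # weight decay applied to the gradient
    return momentum_decay * momentum + lr * grad # momentum decay plus accumulation


def test_momentum_decay_and_gradient_accumulation():
    param = torch.ones(3)
    grad = torch.full((3,), 2.0)
    momentum = torch.full((3,), 4.0)
    out = toy_momentum_step(param, grad, momentum, lr=0.5)
    # 0.9 * 4.0 + 0.5 * (2.0 + 0.1 * 1.0) = 3.6 + 1.05 = 4.65
    assert torch.allclose(out, torch.full((3,), 4.65))
```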
Test Fixtures and Mock Objects
Templar tests use a variety of fixtures and mock objects to create isolated test environments. These include:
Core Fixtures
- Model Fixture

  ```python
  @pytest.fixture
  def model():
      # Create a simple dummy model for testing.
      return torch.nn.Sequential(torch.nn.Linear(10, 10))
  ```

- Communications Fixture

  ```python
  @pytest.fixture
  async def comms_instance():
      # Initialize communications with mock dependencies
      comms = comms_module.Comms(...)
      # Add transformer and compressor
      return comms
  ```

- Metagraph and Validator Fixtures

  ```python
  @pytest.fixture
  def mock_metagraph(mocker, num_non_zero_incentive, num_miners=250):
      # Create a mock metagraph with specified miners and incentive distribution
      metagraph = mocker.Mock()
      # Configure properties
      return metagraph

  @pytest.fixture
  def mock_validator(mocker, mock_metagraph, num_active_miners):
      # Initialize mock validator
      validator = object.__new__(Validator)
      # Set up necessary attributes
      return validator
  ```
Sources: tests/conftest.py:60-198
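Because `mock_metagraph` and `mock_validator` consume the `num_non_zero_incentive` and `num_active_miners` fixtures, a test module can shadow those count fixtures locally (standard pytest fixture overriding) to shape the mocked network. The sketch below is hypothetical; what the mock validator actually exposes is defined in `conftest.py`, so the assertion is deliberately minimal.

```python
import pytest


# Hypothetical local overrides: pytest resolves these instead of the
# conftest.py versions for tests defined in this module.
@pytest.fixture
def num_non_zero_incentive():
    return 50


@pytest.fixture
def num_active_miners():
    return 10


def test_validator_with_small_network(mock_validator):
    # The attributes configured on the mock are set up in conftest.py;
    # this illustrative sketch only checks that the fixture resolves.
    assert mock_validator is not None
```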
Running Tests
Tests can be run using pytest. The Templar project has both synchronous and asynchronous tests, with the latter marked with the `@pytest.mark.asyncio` decorator.
Basic Test Execution
Section titled “Basic Test Execution”# Run all testspytest
# Run tests in a specific filepytest tests/test_metrics_logger.py
# Run a specific test functionpytest tests/test_metrics_logger.py::TestMetricsLogger::test_init
Handling Asynchronous Tests
Asynchronous tests are configured via the `pytest_configure` function in `conftest.py`:
```python
def pytest_configure(config):
    config.addinivalue_line("markers", "asyncio: mark test as requiring async")
```
When writing asynchronous tests, use the `@pytest.mark.asyncio` decorator and an `async def` function definition:
```python
@pytest.mark.asyncio
async def test_async_function():
    # Test code here
    pass
```
Sources: tests/conftest.py:1-6, tests/test_evaluator.py:59-143, tests/test_checkpoints.py:279-593
Testing Patterns
Fixture-Based Testing
Templar tests use pytest fixtures extensively to set up test dependencies:
```python
@pytest.fixture
def metrics_logger(self, mock_influxdb_client):
    # Create a MetricsLogger instance for testing
    logger = MetricsLogger(...)
    return logger
```
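Note the `self` parameter: the fixture above is defined inside a test class, so it is shared by that class's test methods. A generic illustration of the same pattern, with invented names:

```python
import pytest


class TestWidget:
    @pytest.fixture
    def widget(self):
        # Class-level fixture: available to every test method in this class.
        return {"name": "widget", "count": 0}

    def test_widget_starts_empty(self, widget):
        assert widget["count"] == 0
```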
Mock Implementation Patterns
Mocks are used to isolate components under test:
```python
@pytest.fixture
def mock_influxdb_client(self):
    # Patch InfluxDBClient, configure mocks, and return the mock class
    with patch("tplr.metrics.InfluxDBClient", autospec=True) as mock_client_class:
        # Configure mock
        yield mock_client_class
```
Waiting For Asynchronous Operations
For asynchronous operations, the codebase includes helper functions like `wait_for_mock_call`:
```python
def wait_for_mock_call(mock_object: Mock, timeout: float = 3.0):
    """Waits for a mock object to be called at least once."""
    start_time = time.monotonic()
    while time.monotonic() < start_time + timeout:
        if mock_object.call_count > 0:
            return True
        time.sleep(0.05)
    return False
```
Sources: tests/test_metrics_logger.py:40-55, tests/test_metrics_logger.py:84-114, tests/test_metrics_logger.py:60-82
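A typical use, sketched under the assumption that `wait_for_mock_call` (as defined above) is in scope and that the operation under test completes on a background thread:

```python
import threading
from unittest.mock import Mock


def test_background_operation_eventually_writes():
    mock_write = Mock()
    # Simulate an asynchronous operation completing ~100 ms after the test starts.
    threading.Timer(0.1, mock_write).start()
    # Poll until the call is observed or the timeout expires.
    assert wait_for_mock_call(mock_write, timeout=2.0)
```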
Adding New Tests
When adding new tests to the Templar project, follow these guidelines (a combined sketch is shown after the list):
- Test Organization: Place tests in the appropriate file based on the component being tested
- Fixtures: Use existing fixtures from `conftest.py` when possible, or add new ones if needed
- Mocking: Use mocks to isolate the component under test from external dependencies
- Async Testing: Use the `@pytest.mark.asyncio` decorator for asynchronous tests
- Test Coverage: Aim to test both normal operation and error conditions
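Putting the guidelines together, a new asynchronous test might look like the sketch below. The component under test (`publish_status`) and its mocked collaborator are hypothetical placeholders, not Templar APIs.

```python
import pytest
from unittest.mock import AsyncMock


# Hypothetical component under test: publishes a status dict through a client.
async def publish_status(client, status: dict) -> bool:
    await client.post("status", status)
    return True


@pytest.mark.asyncio
async def test_publish_status_posts_once():
    # Mocking: isolate the component from its external dependency.
    client = AsyncMock()
    ok = await publish_status(client, {"window": 3})
    assert ok
    client.post.assert_awaited_once_with("status", {"window": 3})


@pytest.mark.asyncio
async def test_publish_status_propagates_errors():
    # Test Coverage: exercise the error path as well as the happy path.
    client = AsyncMock()
    client.post.side_effect = RuntimeError("connection lost")
    with pytest.raises(RuntimeError):
        await publish_status(client, {"window": 3})
```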
Test Architecture and Component Relationship
The test architecture mirrors the Templar system architecture, with tests for each major component and their interactions.
flowchart TD subgraph "Test Framework" PF["pytest Framework"] FST["Fixtures (conftest.py)"] TMO["Test Mocks"] UHF["Utility Helper Functions"] end subgraph "Component Tests" MTL["MetricsLogger Tests"] EVT["Evaluator Tests"] CKT["Checkpoint Tests"] GPT["Gradient Processing Tests"] end subgraph "System Components" ML["tplr.metrics.MetricsLogger"] EV["scripts.evaluator.Evaluator"] CP["tplr.comms (Checkpoint Functions)"] GP["tplr.neurons.prepare_gradient_dict"] end PF --> FST PF --> TMO PF --> UHF FST --> MTL FST --> EVT FST --> CKT FST --> GPT TMO --> MTL TMO --> EVT TMO --> CKT TMO --> GPT UHF --> MTL UHF --> EVT UHF --> CKT MTL --> ML EVT --> EV CKT --> CP GPT --> GP
Sources: tests/conftest.py, tests/test_metrics_logger.py, tests/test_evaluator.py, tests/test_checkpoints.py, tests/test_prepare_gradient_dict.py