Testing Guidelines¶
This guide explains how to write and run tests for this project, and what is required for contributions to be accepted.
Writing Tests¶
- Use
pytestfor all unit and integration tests. - Place tests in files named
test_*.pyor*_test.py. - Write tests for any new features or bugfixes you add.
- Strive for good test coverage and meaningful assertions.
Example¶
A simple function and its test:
# test_my_module.py
from my_module import sum
def test_sum():
assert sum(2, 3) == 5
assert sum(-1, 1) == 0
Running Tests¶
- From the root of the repository, run:
- This will automatically discover and run all tests.
Continuous Integration (CI)¶
- All pull requests are automatically checked for code style and tests.
- If tests fail or style checks do not pass, the pull request will be rejected until the issues are fixed.
Sample Test Data for macro_data¶
Overview¶
The macro_data package includes comprehensive sample test data located in:
This sample data mirrors the structure and format of the full datasets used by the simulation framework, but contains only a subset of countries and time periods to keep tests fast and the repository size manageable.
Sample Data Structure¶
The sample data is organized by data source:
sample_raw_data/
├── eurostat/ # European statistics (GDP, debt, CPI, etc.)
├── world_bank/ # World Bank indicators (inflation, population, etc.)
├── oecd_econ/ # OECD economic data (employment, taxes, etc.)
├── icio/ # Input-Output tables
├── wiod_sea/ # World Input-Output Database
├── hfcs/2014/ # Household Finance and Consumption Survey
├── compustat/ # Corporate financial data
├── exchange_rates/ # Currency exchange rates
├── emissions/ # Energy price data
└── [other_data_sources]/ # Additional data sources
Sample Data Characteristics¶
- Primary test country: France (FRA)
- Additional countries: Canada (CAN), UK (GBR), USA
- Time focus: Primarily 2014 (simulation year) with historical data 2010-2018
- Representative values: Real data values, not synthetic or dummy data
- File formats: CSV, PKL, JSON, XLSB depending on source
Testing Requirements for New Data Sources¶
When adding a new data source to macro_data, you MUST:
- Add sample data to
tests/test_macro_data/unit/sample_raw_data/[your_data_source]/ - Follow the subset pattern: Include data for France (FRA) and at least one other country
- Include test years: Focus on 2014 with some historical context if relevant
- Use real data format: Same structure and format as the full dataset
- Keep size reasonable: Subset the data to essential test cases
Example: Adding New Data Source Sample Data¶
sample_raw_data/
└── my_new_data_source/
├── emissions_data.csv # Main data file
├── metadata.json # Supporting metadata
└── 2014/
└── detailed_data.csv # Year-specific data if needed
Sample Data Integration¶
Reader Integration:
# In your reader class
@classmethod
def from_data(cls, data_path: Path) -> "MyNewReader":
"""Create reader from data directory."""
# Reader should work with both full data and sample data
data_file = data_path / "my_data_file.csv"
data = pd.read_csv(data_file)
return cls(data=data)
Test Integration:
# In your test file
def test_my_reader_functionality(readers):
"""Test reader with sample data."""
# Use the readers fixture which provides sample data
result = readers.my_new_reader.get_data_for_country("FRA", 2014)
# Validate against expected sample data values
assert result == pytest.approx(expected_value, abs=tolerance)
Validation Testing Pattern¶
All sample data should be validated with specific value tests:
def test_sample_data_values(readers):
"""Test that sample data returns expected values."""
# Test specific known values from sample data
gdp_fra_2014 = readers.eurostat.get_quarterly_gdp("FRA", 2014, 1)
assert gdp_fra_2014 == pytest.approx(535467e6, abs=1e6)
# Test data availability
assert readers.my_new_reader.has_data_for_country("FRA")
# Test data consistency
assert len(readers.my_new_reader.get_country_list()) >= 1
Sample Data Maintenance¶
- Keep synchronized: Sample data should reflect the structure of full data
- Update when needed: If full data format changes, update sample data accordingly
- Document changes: Note any changes to sample data in test documentation
- Validate regularly: Ensure sample data tests continue to pass
Best Practices¶
- Validate data and check for edge cases in your tests.
- Use fixtures for setup if needed.
- Tests also serve as documentation for expected behavior.
- For macro_data: Always include sample test data for new data sources.
- Test with real values: Use actual data values in sample data, not synthetic ones.
- Validate specific values: Test against known values from sample data to catch regressions.
For more on workflow and style, see the Development Guide and Code Style Guide.