macro_data Package Documentation¶
The macro_data package is responsible for creating and managing synthetic economic data for the macromodel. It handles data preprocessing, validation, and transformation to create a comprehensive dataset that represents the economic structure of multiple countries.
Overview¶
The package provides tools to:
- Create synthetic country-level economic data
- Process and validate input-output tables
- Handle trade flows and exchange rates
- Manage emissions data
- Generate calibration data
Core Components¶
DataWrapper¶
The main container class that manages synthetic economic data for multiple countries. It serves as the primary interface for data preprocessing and management, coordinating all economic agents, markets, and their relationships.
View DataWrapper documentation →
Data Readers¶
A collection of specialized readers for different types of economic data, handling data ingestion, preprocessing, and validation.
View Data Readers documentation →
Country and Region Objects¶
- Country: Enum and utilities for country codes, EU membership, and code conversion.
- Region: Class for regional codes and metadata, including parent country and value-added ratios.
Data Structure¶
The package organizes data in a hierarchical structure:
-
DataWrapper
- Manages all synthetic economic data
- Handles data preprocessing and validation
- Coordinates between different data sources
-
Synthetic Countries
- Individual country data containers
- Contains all economic agents and markets
- Manages country-specific data and relationships
-
Economic Agents
- Households
- Firms
- Banks
- Government institutions
-
Markets
- Credit market
- Housing market
- Goods market
- Labor market
Usage Example¶
from macro_data import DataWrapper
from macro_data.configuration_utils import default_data_configuration
# Create configuration for multiple countries
data_config = default_data_configuration(
countries=["FRA", "CAN", "USA"],
proxy_country_dict={"CAN": "FRA", "USA": "FRA"}
)
# Initialize DataWrapper with configuration
creator = DataWrapper.from_config(
configuration=data_config,
raw_data_path="path/to/raw/data",
single_hfcs_survey=True
)
# Save processed data
creator.save("path/to/save/data.pkl")
Best Practices¶
-
Data Validation
- Always validate input data before processing
- Check for missing or inconsistent values
- Verify economic relationships
-
Configuration Management
- Use configuration files for reproducible results
- Document all configuration parameters
- Version control configuration files
-
Performance Optimization
- Use appropriate data structures
- Implement efficient data processing
- Cache intermediate results when possible