
Synthetic Matching

The Synthetic Matching modules handle the harmonization of data between different economic agents to create a consistent initial state for the model. These modules are not markets themselves, but rather data processing components that ensure consistency between different data sources.

Core Functionality

The matching modules handle:

  1. Individual-Firm Matching:

    • Harmonizing employment data
    • Reconciling wages with labor expenses
    • Matching workers to positions
    • Validating industry relationships
  2. Firm-Bank Matching:

    • Reconciling financial data
    • Matching firms to banking relationships
    • Harmonizing deposit and loan balances
    • Validating balance sheet consistency
  3. Household-Bank Matching:

    • Harmonizing retail banking data
    • Matching households to accounts
    • Reconciling deposit and loan balances
    • Validating customer relationships
  4. Household-Housing Matching:

    • Reconciling property ownership
    • Matching tenants to properties
    • Harmonizing rental relationships
    • Processing social housing allocation

Key Components

Individual-Firm Matching

  • matching_individuals_with_firms.py: Handles employment data harmonization
    • Validates employment counts
    • Reconciles wage totals
    • Computes position assignments
    • Adjusts for tax effects

Firm-Bank Matching

  • matching_firms_with_banks.py: Manages corporate banking relationships
    • Reconciles financial data
    • Allocates accounts
    • Validates balance sheets
    • Records relationships

Household-Bank Matching

  • matching_households_with_banks.py: Processes retail banking relationships
    • Harmonizes deposit data
    • Reconciles loan balances
    • Validates customer accounts
    • Records assignments

Household-Housing Matching

  • matching_households_with_houses.py: Handles property relationships
    • Processes ownership data
    • Manages rental market
    • Handles social housing
    • Validates tenure status

Implementation

Module for harmonizing individual and firm employment data.

This module matches and harmonizes employment data from different sources:

  1. Individual Data Source:

    • Actual wages from household surveys
    • Industry of employment
    • Activity status (1: employed, 2: unemployed, 3: inactive)
    • Employment income
  2. Firm Data Source:

    • Total labor expenses
    • Number of employees
    • Industry classification
    • Production data

The harmonization process involves:

  1. Data Validation:

    • Checking employment numbers match across sources
    • Ensuring industry totals align
    • Validating activity status
  2. Data Reconciliation:

    • Scaling individual wages to match firm labor expenses
    • Adjusting for tax effects
    • Computing per-position wages from total expenses
  3. Optimal Matching:

    • Minimizing discrepancy between data sources
    • Preserving industry-specific relationships
    • Recording final assignments

Note

This module focuses on harmonizing employment data from different sources to create a consistent initial state. The actual labor market dynamics are implemented in the simulation package.
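
The reconciliation step "scaling individual wages to match firm labor expenses" can be illustrated with a minimal sketch. It assumes a single proportional scaling factor per industry and ignores the tax adjustment; the function name `scale_wages_to_labor_expenses` and the numbers are hypothetical and are not taken from the module itself.

```python
def scale_wages_to_labor_expenses(wages, total_labor_expenses):
    """Rescale wages proportionally so that they sum to total_labor_expenses."""
    total_wages = sum(wages)
    if total_wages <= 0:
        raise ValueError("Total wages must be positive to compute a scaling factor.")
    factor = total_labor_expenses / total_wages
    # Every wage is multiplied by the same factor, so relative wages are preserved.
    return [wage * factor for wage in wages]


# Illustrative survey wages for three employees in one industry.
survey_wages = [20_000.0, 30_000.0, 50_000.0]
# Illustrative labor expenses reported by the firms in that industry.
firm_labor_expenses = 120_000.0

scaled_wages = scale_wages_to_labor_expenses(survey_wages, firm_labor_expenses)
```

A proportional factor keeps the wage distribution from the survey intact while forcing the industry total to agree with the firm side.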

match_individuals_with_firms_country(industries: list[str] | np.ndarray, income_taxes: float, employee_social_contribution_taxes: float, firms: SyntheticFirms, population: SyntheticPopulation)

Harmonize employment data across all industries in a country.

This function coordinates the complete data harmonization by:

  1. Processing each industry sector
  2. Validating employment data consistency
  3. Reconciling wages with labor expenses
  4. Computing final labor inputs

The process ensures:

  • Industry-specific relationships are preserved
  • Tax effects are properly handled
  • Wage totals match firm expenses
  • Employment counts are consistent

Parameters:

Name                                Type                 Description                         Default
industries                          list[str] | ndarray  Industry sectors to process         required
income_taxes                        float                Income tax rate for reconciliation  required
employee_social_contribution_taxes  float                Social security tax rate            required
firms                               SyntheticFirms       Firm labor expense data             required
population                          SyntheticPopulation  Individual employment data          required
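
The employment count consistency check can be sketched as follows. This is only an illustration under assumed inputs: a plain list of industry labels for employed individuals and a dictionary of firm-side employee totals. The helper `employment_count_mismatches` is hypothetical and does not use the actual `SyntheticFirms` or `SyntheticPopulation` interfaces.

```python
from collections import Counter


def employment_count_mismatches(individual_industries, firm_employees_by_industry):
    """Return {industry: (individual_count, firm_count)} where the two counts differ."""
    individual_counts = Counter(individual_industries)
    industries = set(individual_counts) | set(firm_employees_by_industry)
    mismatches = {}
    for industry in sorted(industries):
        n_individuals = individual_counts.get(industry, 0)
        n_firm = firm_employees_by_industry.get(industry, 0)
        if n_individuals != n_firm:
            mismatches[industry] = (n_individuals, n_firm)
    return mismatches


# Illustrative data: industry of each employed individual, and firm-side totals.
employed_industries = ["A", "A", "C", "C", "C"]
firm_employees = {"A": 2, "C": 4}

mismatches = employment_count_mismatches(employed_industries, firm_employees)
```

An empty result means both sources agree; here industry "C" is reported because the survey side has three workers while the firms report four.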

Module for harmonizing firm and bank financial data.

This module harmonizes financial data from different sources:

  1. Firm Survey/Balance Sheet Data:

    • Reported cash holdings
    • Outstanding credit balances
    • Industry-specific financials
    • Balance sheet information
  2. Banking System Data:

    • Aggregate corporate deposits
    • Commercial loan portfolio
    • Balance sheet totals
    • Corporate account data

The harmonization process involves:

  1. Data Validation:

    • Checking total deposits match across sources
    • Validating loan balances
    • Ensuring consistent client counts
  2. Data Reconciliation:

    • Scaling firm deposits to match bank totals
    • Adjusting loan balances for consistency
    • Computing account distributions
  3. Optimal Assignment:

    • Minimizing discrepancy between data sources
    • Preserving financial relationships
    • Recording final assignments

Note

This module focuses on harmonizing financial data from different sources to create a consistent initial state. The actual financial market dynamics are implemented in the simulation package.

match_firms_with_banks_optimal(firms: SyntheticFirms, banks: SyntheticBanks) -> None

Harmonize firm and bank financial data using optimal assignment.

This function reconciles financial data by:

  1. Scaling firm data to match bank totals
  2. Allocating accounts based on bank size
  3. Using linear sum assignment to minimize discrepancies
  4. Recording harmonized relationships

The optimization:

  • Minimizes differences between reported values
  • Respects bank balance sheet constraints
  • Maintains consistent financial totals
  • Preserves deposit-loan relationships

Parameters:

Name   Type            Description              Default
firms  SyntheticFirms  Firm financial data      required
banks  SyntheticBanks  Bank balance sheet data  required
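
The linear sum assignment step can be illustrated with `scipy.optimize.linear_sum_assignment`, which finds the minimum-cost one-to-one matching for a cost matrix. The sketch below is an assumption about how such a cost matrix might look: it uses the absolute gap between a firm's reported deposits and a target deposit for one open account slot per bank. The variable names and numbers are hypothetical, and the module's real cost function may differ.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical inputs: deposits reported by three firms, and the target
# deposit for one open corporate account slot at each of three banks.
firm_deposits = np.array([100.0, 250.0, 40.0])
bank_slot_deposits = np.array([50.0, 120.0, 240.0])

# cost[i, j] is the discrepancy incurred if firm i takes bank slot j.
cost = np.abs(firm_deposits[:, None] - bank_slot_deposits[None, :])

# Solve for the one-to-one assignment with the smallest total discrepancy.
firm_idx, slot_idx = linear_sum_assignment(cost)
total_discrepancy = cost[firm_idx, slot_idx].sum()

# Map each firm index to the bank slot it was assigned.
assignment = dict(zip(firm_idx.tolist(), slot_idx.tolist()))
```

In this toy case each firm ends up at the bank whose slot is closest to its reported deposits, which is the behaviour the "minimizes differences between reported values" goal describes.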

Module for harmonizing household and bank financial data.

This module harmonizes financial data from different sources:

  1. Household Survey Data:

    • Reported deposit holdings
    • Outstanding loan balances
    • Financial asset information
    • Debt service payments
  2. Banking System Data:

    • Aggregate household deposits
    • Retail loan portfolio
    • Balance sheet totals
    • Customer account data

The harmonization process involves:

  1. Data Validation:

    • Checking total deposits match across sources
    • Validating loan balances
    • Ensuring consistent customer counts
  2. Data Reconciliation:

    • Scaling household deposits to match bank totals
    • Adjusting loan balances for consistency
    • Computing account distributions
  3. Optimal Assignment:

    • Minimizing discrepancy between data sources
    • Preserving financial relationships
    • Recording final assignments

Note

This module focuses on harmonizing financial data from different sources to create a consistent initial state. The actual financial market dynamics are implemented in the simulation package.

match_households_with_banks_optimal(population: SyntheticPopulation, banks: SyntheticBanks) -> None

Harmonize household and bank financial data using optimal assignment.

This function reconciles financial data by:

  1. Scaling household data to match bank totals
  2. Allocating accounts based on bank size
  3. Using linear sum assignment to minimize discrepancies
  4. Recording harmonized relationships

The optimization:

  • Minimizes differences between reported values
  • Respects bank balance sheet constraints
  • Maintains consistent financial totals
  • Preserves deposit-loan relationships

Parameters:

Name        Type                 Description               Default
population  SyntheticPopulation  Household financial data  required
banks       SyntheticBanks       Bank balance sheet data   required
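
"Allocating accounts based on bank size" can be sketched as a proportional split with integer rounding. The example below assumes largest-remainder rounding, which is one common way to keep the total exact; the source does not state which rounding rule the module uses, and `allocate_accounts_by_size` is a hypothetical helper.

```python
def allocate_accounts_by_size(n_accounts, bank_sizes):
    """Split n_accounts across banks in proportion to bank_sizes.

    Uses largest-remainder rounding so the integer counts sum to n_accounts.
    """
    total_size = sum(bank_sizes)
    if total_size <= 0:
        raise ValueError("Bank sizes must sum to a positive number.")
    exact_shares = [n_accounts * size / total_size for size in bank_sizes]
    counts = [int(share) for share in exact_shares]  # floor of non-negative shares
    leftover = n_accounts - sum(counts)
    # Give the remaining accounts to the banks with the largest fractional parts.
    by_remainder = sorted(
        range(len(bank_sizes)),
        key=lambda i: exact_shares[i] - counts[i],
        reverse=True,
    )
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts


# Illustrative: ten household accounts split across three banks of relative size 3:2:2.
household_counts = allocate_accounts_by_size(10, [3.0, 2.0, 2.0])
```

The resulting counts give the number of account slots per bank, which could then serve as the columns of an assignment cost matrix.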

Module for harmonizing household and housing unit data.

This module harmonizes housing data from different sources:

  1. Household Survey Data:

    • Reported property ownership
    • Rental payments
    • Housing wealth
    • Tenure status
  2. Property Register Data:

    • Housing unit values
    • Rental income
    • Ownership records
    • Social housing status

The harmonization process involves:

  1. Owner-Occupied Housing:

    • Reconciling ownership records
    • Harmonizing property values
    • Adjusting imputed rents
    • Validating relationships
  2. Rental Market:

    • Reconciling landlord holdings
    • Harmonizing rental income
    • Adjusting rental rates
    • Matching tenant records
  3. Data Validation:

    • Removing outliers
    • Imputing missing values
    • Ensuring market consistency
    • Validating relationships

Note

This module focuses on harmonizing housing data from different sources to create a consistent initial state. The actual housing market dynamics are implemented in the simulation package.

Tenure status in HFCS falls into 4 categories:

  1. Own ('1')
  2. Part own ('2')
  3. Rent ('3')
  4. Free use of their home ('4')

This module treats part-owners and free-users as owner-occupiers. It adds a category for social renters ('-1').
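
The tenure grouping described above can be written as a simple lookup. The codes come from the text; the group labels (`owner_occupier`, `renter`, `social_renter`) and the helper `tenure_group` are hypothetical names chosen for this sketch.

```python
OWNER_OCCUPIER = "owner_occupier"
RENTER = "renter"
SOCIAL_RENTER = "social_renter"

# HFCS tenure codes plus the extra social-renter code ('-1') added by the module.
TENURE_GROUPS = {
    "1": OWNER_OCCUPIER,   # own
    "2": OWNER_OCCUPIER,   # part own, treated as owner-occupier
    "3": RENTER,           # rent
    "4": OWNER_OCCUPIER,   # free use, treated as owner-occupier
    "-1": SOCIAL_RENTER,   # social renter
}


def tenure_group(code):
    """Map a tenure status code to its group, rejecting unknown codes."""
    try:
        return TENURE_GROUPS[str(code)]
    except KeyError:
        raise ValueError(f"Unknown tenure status code: {code!r}") from None


groups = [tenure_group(code) for code in ["1", "2", "3", "4", "-1"]]
```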

set_housing_df(synthetic_population: SyntheticPopulation, rental_income_taxes: float, social_housing_rent: float, total_imputed_rent: float) -> pd.DataFrame

Create and harmonize the housing market dataset.

This function reconciles housing data through several steps:

  1. Owner-Occupied Properties:

    • Reconciling ownership records with survey data
    • Harmonizing property values across sources
    • Computing consistent imputed rents
  2. Rental Properties:

    • Reconciling landlord holdings with property records
    • Harmonizing rental income with payments
    • Adjusting for tax effects
    • Validating social housing data
  3. Data Cleaning:

    • Removing outliers in values and rents
    • Imputing missing data points
    • Validating price-rent relationships
    • Ensuring data consistency
  4. Market Reconciliation:

    • Harmonizing tenant-property relationships
    • Recording consistent ownership data
    • Updating household records

Parameters:

Name                  Type                 Description                              Default
synthetic_population  SyntheticPopulation  Household survey data                    required
rental_income_taxes   float                Tax rate on rental income                required
social_housing_rent   float                Standardized social housing rent         required
total_imputed_rent    float                Total imputed rent for owned properties  required

Returns:

Type       Description
DataFrame  Harmonized housing market data with consistent ownership and rental relationships
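
One way to read "computing consistent imputed rents" is that per-property imputed rents should add up to the `total_imputed_rent` argument. The sketch below assumes the total is shared out in proportion to property value. That allocation rule is an assumption made for illustration, not something the source states, and `distribute_imputed_rent` is a hypothetical helper.

```python
def distribute_imputed_rent(property_values, total_imputed_rent):
    """Share total_imputed_rent across properties in proportion to their value."""
    total_value = sum(property_values)
    if total_value <= 0:
        raise ValueError("Property values must sum to a positive number.")
    return [total_imputed_rent * value / total_value for value in property_values]


# Illustrative: two owner-occupied properties and an aggregate imputed rent.
imputed_rents = distribute_imputed_rent([100_000.0, 300_000.0], 8_000.0)
```

Under this rule the property worth three times as much is assigned three times the imputed rent, and the two amounts sum to the aggregate.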