CompustatBanksReader

This module reads and processes Compustat bank-level financial data. It handles quarterly bank financial statements for multiple countries and converts monetary values between currencies at a caller-supplied exchange rate.

Key Features:
  • Read and process quarterly Compustat bank data
  • Handle multiple countries and currencies
  • Automatic missing value imputation
  • Support for proxy data using US banks
  • Currency conversion capabilities

The module processes various banking metrics, including:
  • Balance sheet items (assets, liabilities, equity)
  • Funding sources (deposits, debt)
  • Debt dynamics (issuance, reduction)
  • Income data

Example
from pathlib import Path
from macro_data.readers.population_data.compustat_banks_reader import CompustatBanksReader
from macro_data.configuration.countries import Country

# Initialize reader with raw data
reader = CompustatBanksReader.from_raw_data(
    year=2020,
    quarter=4,
    raw_quarterly_path=Path("path/to/quarterly.csv"),
    countries=["GBR", Country.FRANCE],
    proxy_with_us=True  # Include US banks for proxying
)

# Get bank data for a specific country
uk_banks = reader.get_country_data(
    country="GBR",
    exchange_rate=1.25
)

# Get proxy data using another country's banks
proxy_banks = reader.get_proxied_country_data(
    proxy_country="USA",
    exchange_rate=1.25
)
Note

Missing values are imputed using scikit-learn's IterativeImputer, with imputation performed separately for each country's banks.
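As an illustration of the per-country imputation described in this note, the following is a minimal sketch and not the reader's actual implementation. The helper name `impute_by_country` and the column names `assets` and `deposits` are assumptions made for the example; only the use of scikit-learn's IterativeImputer and the per-country grouping come from the note above.

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401  (needed to expose IterativeImputer)
from sklearn.impute import IterativeImputer


def impute_by_country(df: pd.DataFrame, numeric_cols: list[str]) -> pd.DataFrame:
    """Fill missing numeric values, fitting a separate imputer per country.

    Hypothetical helper: assumes the frame is indexed by country code and that
    no numeric column is entirely missing within a country (IterativeImputer
    would otherwise drop that column by default).
    """
    filled = []
    for _, group in df.groupby(level=0, sort=False):
        group = group.copy()
        imputer = IterativeImputer(random_state=0)
        # Observed values are kept; only NaN cells are replaced by estimates.
        group[numeric_cols] = imputer.fit_transform(group[numeric_cols])
        filled.append(group)
    return pd.concat(filled)


# Toy data with made-up column names and values, indexed by country.
banks = pd.DataFrame(
    {
        "assets": [100.0, 200.0, np.nan, 50.0, 80.0, 65.0],
        "deposits": [60.0, np.nan, 180.0, 30.0, np.nan, 40.0],
    },
    index=pd.Index(["GBR", "GBR", "GBR", "FRA", "FRA", "FRA"], name="country"),
)
imputed = impute_by_country(banks, ["assets", "deposits"])
```

Fitting one imputer per country means that relationships between columns are estimated only from that country's banks, so a country with very few banks gives the imputer little to learn from.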

CompustatBanksReader

A class for reading and processing Compustat bank-level financial data.

This class handles the reading and processing of Compustat bank data, including:
  • Reading quarterly financial statements
  • Filtering by country and time period
  • Imputing missing values
  • Currency conversion for international comparisons

Parameters

data : pd.DataFrame Processed Compustat bank data with standardized columns

Attributes

data : pd.DataFrame Processed bank-level data indexed by country
numerical_columns : list[str] List of columns containing monetary values

Notes
  • Missing values are imputed separately for each country's banks
  • All monetary values are in their original currencies unless converted
data = data instance-attribute
numerical_columns property

Get the list of columns containing monetary values.

Returns

list[str] Names of columns containing monetary values that are present in the data
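A property of this kind can be pictured as a filter over a fixed list of candidate names. The sketch below is only an assumption about the general shape: the candidate list `MONETARY_CANDIDATES` and the helper `present_monetary_columns` are hypothetical, and the reader's real column names are not given here.

```python
import pandas as pd

# Hypothetical candidate names, loosely based on the metrics listed for this module.
MONETARY_CANDIDATES = ["assets", "liabilities", "equity", "deposits", "debt", "income"]


def present_monetary_columns(df: pd.DataFrame) -> list[str]:
    """Return the candidate monetary columns that actually exist in df."""
    return [col for col in MONETARY_CANDIDATES if col in df.columns]


df = pd.DataFrame({"bank_name": ["A"], "assets": [1.0], "deposits": [0.5]})
cols = present_monetary_columns(df)
```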

__init__(data: pd.DataFrame)
from_raw_data(year: int, quarter: int, raw_quarterly_path: Path | str, countries: list[str | Country], proxy_with_us: bool = True) classmethod

Create a CompustatBanksReader instance from raw Compustat data.

This method:
  1. Reads quarterly bank data
  2. Filters for specific time period
  3. Processes and cleans the data
  4. Imputes missing values by country

Parameters

year : int Year to filter data for
quarter : int Quarter to filter data for (1-4)
raw_quarterly_path : Path | str Path to quarterly Compustat data file
countries : list[str | Country] List of countries to include in the data
proxy_with_us : bool, optional Whether to include US banks for proxying (default: True)

Returns

CompustatBanksReader Initialized reader with processed data

Notes
  • Data is filtered to match the specified year and quarter
  • Countries can be specified as strings or Country enum values
  • Missing values are imputed separately for each country
  • US banks are included if proxy_with_us is True
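The filtering behaviour in these notes can be sketched with plain pandas. This is a hedged illustration, not the method's real code: the `Country` enum below is a stand-in for `macro_data.configuration.countries.Country`, and the column names `year`, `quarter`, and `country`, the helper names `to_code` and `filter_raw`, and the toy values are all assumptions.

```python
from enum import Enum

import pandas as pd


class Country(Enum):
    """Hypothetical stand-in for the project's Country enum."""
    FRANCE = "FRA"
    UNITED_KINGDOM = "GBR"
    UNITED_STATES = "USA"


def to_code(country) -> str:
    """Accept either a plain string code or a Country enum member."""
    return country.value if isinstance(country, Country) else str(country)


def filter_raw(raw: pd.DataFrame, year: int, quarter: int,
               countries: list, proxy_with_us: bool = True) -> pd.DataFrame:
    """Keep rows for the requested period and countries, indexed by country."""
    codes = {to_code(c) for c in countries}
    if proxy_with_us:
        codes.add("USA")  # US banks are kept so they can serve as a proxy
    mask = (
        (raw["year"] == year)
        & (raw["quarter"] == quarter)
        & raw["country"].isin(codes)
    )
    return raw.loc[mask].set_index("country")


# Toy raw data with assumed column names.
raw = pd.DataFrame(
    {
        "year": [2020, 2020, 2020, 2019, 2020],
        "quarter": [4, 4, 4, 4, 3],
        "country": ["GBR", "USA", "DEU", "FRA", "FRA"],
        "assets": [10.0, 20.0, 30.0, 40.0, 50.0],
    }
)
selected = filter_raw(raw, 2020, 4, ["GBR", Country.FRANCE])
```

In this toy input, France has no rows for 2020 Q4, so only the GBR and USA rows remain; a country in that situation is the case where proxying with US banks becomes relevant.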
get_country_data(country: str, exchange_rate: float) -> pd.DataFrame

Get bank-level data for a specific country.

For non-US countries, this method returns US bank data converted to the target country's currency using the provided exchange rate.

Parameters

country : str Country to get data for
exchange_rate : float Exchange rate for currency conversion (if using US proxy)

Returns

pd.DataFrame Bank-level data for the specified country

Notes
  • Returns actual data for US banks
  • Returns converted US bank data for other countries
get_proxied_country_data(proxy_country: str | Country, exchange_rate: float)

Get bank-level data from a proxy country with currency conversion.

This method is useful when direct data for a country is not available and data from another country needs to be used as a proxy.

Parameters

proxy_country : str | Country Country to use as proxy (string or Country enum)
exchange_rate : float Exchange rate to convert monetary values

Returns

pd.DataFrame Converted bank-level data from the proxy country

Notes
  • Only monetary values are converted
  • Non-monetary fields are unchanged
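To make the two notes above concrete, here is a minimal sketch of converting only the monetary columns of a proxy frame. It assumes that conversion multiplies by `exchange_rate`; the source does not state the direction of the rate, so check that against the actual implementation. The helper `convert_monetary`, the column names, and the values are hypothetical.

```python
import pandas as pd


def convert_monetary(df: pd.DataFrame, monetary_cols: list[str],
                     exchange_rate: float) -> pd.DataFrame:
    """Return a copy of df with only the monetary columns rescaled.

    Assumption: values are multiplied by exchange_rate.
    """
    converted = df.copy()
    converted[monetary_cols] = converted[monetary_cols] * exchange_rate
    return converted


# Toy proxy data with made-up values; bank_name is a non-monetary field.
us_banks = pd.DataFrame(
    {
        "bank_name": ["Bank A", "Bank B"],
        "assets": [1000.0, 2500.0],
        "deposits": [800.0, 2000.0],
    }
)
proxy = convert_monetary(us_banks, ["assets", "deposits"], exchange_rate=0.5)
```

Working on a copy leaves the reader's stored data in its original currency, which fits the class-level note that monetary values stay in their original currencies unless converted.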