Upgrade breakdown enum mismatch from WARNING to ERROR #444

@MaxGhenis

Summary

When a parameter YAML file declares breakdown: [some_enum_variable], the homogenize_parameters function in policyengine-core checks whether the YAML's top-level keys match the enum values. Currently, if a YAML contains keys that are not in the breakdown enum, it only logs a WARNING and keeps the orphaned children — it should raise an ERROR instead.

Problem

In policyengine_core/parameters/helpers/homogenize_parameters.py (around lines 109-111), mismatched keys produce a warning like:

WARNING  policyengine_core.parameters.helpers.homogenize_parameters:homogenize_parameters.py:111
Parameter gov.usda.snap.income.deductions.utility.standard.main has children ['AK_C', 'AK_N', ...] that are not in the breakdown variable snap_utility_region

But execution continues, silently dropping or ignoring those keys. This caused real data loss in PolicyEngine/policyengine-us#7726, where SNAP utility allowance parameters had breakdown: [state_code] but used snap_utility_region enum keys (like AK_C, NY_NYC). The sub-region values were silently ignored with only a warning.

Proposed fix

Upgrade the warning to an error (raise an exception) when a parameter YAML contains keys that are not in the breakdown enum.

Important: only flag extra keys not in the enum, not missing enum values. There are legitimate cases where a YAML intentionally omits some enum values:

  • state_code breakdown omitting armed forces codes (AA, AE, AP) — no data exists for those
  • filing_status breakdown omitting SURVIVING_SPOUSE — IRS SOI data doesn't track it separately

So the validation should be:

  • yaml_keys - enum_values is non-empty → ERROR (unknown keys that will be silently dropped)
  • enum_values - yaml_keys is non-empty → fine (partial coverage is intentional)
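The two rules above can be sketched as a one-directional set difference. This is only an illustration, not the actual homogenize_parameters code: the function name check_breakdown_keys, the exception class BreakdownKeyError, and the toy StateCode enum are all hypothetical, and the sketch assumes YAML keys are compared against enum member names.

```python
from enum import Enum


class BreakdownKeyError(ValueError):
    """Hypothetical exception: YAML keys that are absent from the breakdown enum."""


def check_breakdown_keys(parameter_name, yaml_keys, breakdown_enum, breakdown_name):
    """Raise if any YAML key is not a member name of the breakdown enum.

    Assumption: keys are matched against enum member *names*.
    """
    enum_values = {member.name for member in breakdown_enum}
    # yaml_keys - enum_values: unknown keys, treated as an error.
    extra_keys = sorted(set(yaml_keys) - enum_values)
    if extra_keys:
        raise BreakdownKeyError(
            f"Parameter {parameter_name} has children {extra_keys} that are "
            f"not in the breakdown variable {breakdown_name}"
        )
    # enum_values - yaml_keys is deliberately not checked:
    # partial coverage of the enum is allowed.


class StateCode(Enum):
    """Toy stand-in for a breakdown enum (not the real state_code variable)."""

    AK = "AK"
    NY = "NY"
    AA = "AA"


# Partial coverage (AA omitted) passes without raising.
check_breakdown_keys("example.parameter", ["AK", "NY"], StateCode, "state_code")
```

Under this sketch, passing a key such as AK_C with the toy StateCode enum would raise BreakdownKeyError, while omitting AA stays valid.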

Impact

Without this fix, typos or mismatched breakdown declarations silently discard parameter data, which is very difficult to debug. The current half-state (warn but keep the orphaned children) is confusing and error-prone.
