UniBM packages reusable statistical code for block-maxima inference under
serial dependence, with benchmark and application workflows for both the
extreme value index (xi) and the extremal index (theta).
The main methodological selling point is not the term "design-life level" by itself, but the paired UniBM workflow that starts from dependent block-maxima quantile scaling and then reports:
- one severity branch for `xi` estimation on short records;
- one persistence branch for `theta` estimation for clustering and episode persistence;
- one decision-facing design-life output derived from the severity branch, i.e. `T`-year block-maximum `tau`-quantiles on the original physical scale.
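The quantile-scaling idea behind the severity branch can be sketched in a few lines: for block size `m`, the `tau`-quantile of the block maximum follows an approximately log-log-linear law whose slope plays the role of `xi`. The toy sketch below assumes iid unit-Frechet data (for which the law is exact with slope 1); every name here is made up for illustration and is not the package API, and UniBM's actual sliding/FGLS estimators are considerably more involved:

```python
import math
import random

def block_maxima(x, m):
    """Disjoint block maxima for block size m."""
    return [max(x[i:i + m]) for i in range(0, len(x) - m + 1, m)]

def quantile(v, tau):
    """Simple nearest-rank empirical tau-quantile."""
    s = sorted(v)
    return s[min(len(s) - 1, int(tau * len(s)))]

def fit_loglog_slope(block_sizes, q_values):
    """OLS slope of log q_tau(m) against log m; the slope plays the role of xi."""
    lx = [math.log(m) for m in block_sizes]
    ly = [math.log(q) for q in q_values]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

# Unit-Frechet toy series: the tau-quantile of the block maximum is
# q_tau(m) = -m / log(tau), i.e. exactly log-log-linear with slope xi = 1.
rng = random.Random(42)
series = [-1.0 / math.log(rng.random()) for _ in range(20000)]

block_sizes = [5, 10, 20, 40]
q50 = [quantile(block_maxima(series, m), 0.50) for m in block_sizes]
xi_hat = fit_loglog_slope(block_sizes, q50)  # should land near 1
```

The fitted slope `xi_hat` is then the severity-branch tail index estimate in this toy setup.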
From the repository root, the standard end-to-end workflow is:
```
just full
```
This is the standard code-repo full rebuild. It covers:
- environment sync into the external uv environment declared in `.env`;
- benchmark rebuilds;
- benchmark report generation;
- USGS site freezing plus application rebuild;
- vignette regeneration plus in-place notebook execution;
- unit tests, `src/unibm` coverage, Sphinx docs, and repo-wide formatting normalization.
If you only want the main workflow blocks individually, use:
```
just verify
just benchmark
just application
just vignette
```
Optional: install `just` if you want short task aliases. Without `just`, all commands below can still be run directly with `uv run python ...` and `uv run python -m jupytext ...`.
The package documentation source lives under docs/.
Build the local HTML site with:
```
uv run sphinx-build -b html docs docs/_build/html
```
Use `just docs` for the same build through the repo task runner. `just full` also includes a docs rebuild before the cold workflow refresh.
After the build finishes, open `docs/_build/html/index.html`.
For the docs source on GitHub, browse `docs/`.
- Install dependencies with `uv`.
- Copy the local environment template: `cp .env.example .env`
- Edit `.env` and set:
  - `DIR_WORK` to your local code-repo clone path;
  - `UV_PROJECT_ENVIRONMENT` to a dedicated external uv environment path, e.g. `/Users/yourname/.venvs/unibm`.
- Run one of the top-level `just` tasks. `just verify` is the lightest first run and will create or update the environment automatically:
```
just verify
# or, for the full rebuild:
just full
# or, if you prefer raw uv commands:
set -a; source .env; set +a
uv sync --dev
```
The main task entrypoints are:
```
just full
just verify
just docs
just benchmark
just application
just vignette
just clean-generated
```
`just full` expands to verify + docs + clean-generated + benchmark + application + vignette. Each top-level `just` task loads `.env` and runs `uv sync --dev` before its main work. The commands you mainly need to remember are:
- `just full`: fail-fast verify, then cold rebuild of the main code-repo outputs;
- `just verify`: `uv sync --dev` + tests + coverage + `ruff format .`;
- `just docs`: `uv sync --dev` + Sphinx HTML build into `docs/_build/html`;
- `just benchmark`: benchmark rebuild + benchmark reports;
- `just application`: USGS freeze + application rebuild;
- `just vignette`: sync the paired Jupytext notebook, execute it in place, and format outputs.
Current defaults:
- `workers="6"` for benchmark and application multiprocessing;
- `screening_bootstrap="20"` for USGS screening.
These defaults are intentionally conservative. They are fast enough for routine use without being overly aggressive on CPU or memory.
Examples with explicit overrides:
```
just full 8 40
just application 6 40
just benchmark 8
```
Your standard code-repo full workflow is:
```
just full
```
If you prefer the raw commands instead of `just`, the workflow behind `just full` is:
```
set -a; source .env; set +a
uv sync --dev
uv run coverage run -m unittest discover -s tests -p 'test_*.py'
uv run coverage report -m
uv run coverage xml
uv run coverage html
uv run ruff format .
rm -rf docs/_build
uv run sphinx-build -b html docs docs/_build/html
mkdir -p out/benchmark/cache
find out -mindepth 1 -maxdepth 1 ! -name benchmark -exec rm -rf {} +
find out/benchmark -mindepth 1 -maxdepth 1 ! -name cache -exec rm -rf {} +
UNIBM_BENCHMARK_WORKERS=6 uv run python scripts/benchmark/evi_benchmark.py
UNIBM_BENCHMARK_WORKERS=6 uv run python scripts/benchmark/ei_benchmark.py
uv run python scripts/benchmark/evi_report.py
uv run python scripts/benchmark/ei_report.py
UNIBM_SCREENING_BOOTSTRAP_REPS=20 uv run python scripts/application/freeze_usgs.py
UNIBM_APPLICATION_WORKERS=6 uv run python scripts/application/build.py
uv run python -m jupytext --sync notebooks/vignette.py
uv run python -m nbconvert --to notebook --execute --inplace notebooks/vignette.ipynb
uv run ruff format .
```
Notes:
- Prefer `uv sync --dev` over `uv sync -U` for reproducible rebuilds. `-U` upgrades dependencies and is better treated as an explicit maintenance step.
- Top-level `just` tasks load `.env` automatically and sync the development environment before running. If you run `uv ...` commands directly, load `.env` into your shell first so `DIR_WORK` and `UV_PROJECT_ENVIRONMENT` are respected.
- `just vignette` syncs `notebooks/vignette.py` into `notebooks/vignette.ipynb`, executes the paired notebook in place, and then formats tracked `.py` and `.ipynb` files.
- `just verify` expands to `uv sync --dev` + tests + coverage + `ruff format .`.
- `just docs` expands to `uv sync --dev` + `sphinx-build -b html docs docs/_build/html`.
- `just full` expands to verify + docs + clean-generated + benchmark + application + vignette.
- `just clean-generated` removes generated outputs under `out/` while preserving `out/benchmark/cache`. Use it when you want a cold rebuild of all rendered artifacts without deleting the benchmark cache.
Rebuild the raw benchmark summaries:
```
UNIBM_BENCHMARK_WORKERS=6 uv run python scripts/benchmark/evi_benchmark.py
UNIBM_BENCHMARK_WORKERS=6 uv run python scripts/benchmark/ei_benchmark.py
```
`UNIBM_BENCHMARK_WORKERS` controls scenario-level multiprocessing.
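A minimal sketch of how such an environment-variable worker cap is typically resolved on the Python side. The helper name and clamping policy below are invented for illustration; the repo's scripts may resolve the variable differently:

```python
import os

def benchmark_workers(default: int = 6) -> int:
    """Resolve scenario-level parallelism from the environment.

    Mirrors the UNIBM_BENCHMARK_WORKERS convention: fall back to the
    default on unset/garbage values, and clamp to [1, cpu_count].
    """
    raw = os.environ.get("UNIBM_BENCHMARK_WORKERS", str(default))
    try:
        workers = int(raw)
    except ValueError:
        return default
    return max(1, min(workers, os.cpu_count() or 1))
```

A value like `0` clamps up to `1`, while a non-numeric value falls back to the default.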
Build benchmark figures and tables:
```
uv run python scripts/benchmark/evi_report.py
uv run python scripts/benchmark/ei_report.py
```
Run the application workflow:
```
UNIBM_SCREENING_BOOTSTRAP_REPS=20 uv run python scripts/application/freeze_usgs.py
UNIBM_APPLICATION_WORKERS=6 uv run python scripts/application/build.py
```
The current application package uses six application-facing series across three environmental-risk layers:
- Houston wet-season precipitation as a secondary EVI-only weather case;
- Phoenix hot-dry severity as a secondary EVI-only compound-hazard case;
- Texas streamflow;
- Florida streamflow;
- Texas NFIP daily building payouts;
- Florida NFIP daily building payouts.
`scripts/application/freeze_usgs.py` screens a curated Texas/Florida gauge pool and freezes one flagship USGS site per state into `data/metadata/application/usgs_frozen_sites.json`. The main application workflow then downloads any missing raw inputs and writes application CSVs and figures.
Provider-specific notes:
- **GHCN-Daily** is used for Houston and Phoenix weather-side EVI cases.
- **USGS daily discharge** is used for Texas and Florida streamflow.
- **OpenFEMA NFIP claims** is used for Texas and Florida impact series.
- NFIP uses **positive-payout-day** totals for EVI and **zero-filled daily** totals for EI so claim-wave timing is preserved.
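The two NFIP clocks can be illustrated with a toy aggregation. This is a plain-Python sketch only; the repo's actual preprocessing lives under `scripts/data_prep/` and is more careful:

```python
from datetime import date, timedelta

# Toy daily claim payouts (date -> building payout); missing dates are
# zero-payout days.
payouts = {
    date(2024, 1, 1): 120.0,
    date(2024, 1, 2): 80.0,
    date(2024, 1, 6): 300.0,
}

# EVI branch: keep only days with positive payouts ("active days").
evi_series = [amt for _, amt in sorted(payouts.items())]

# EI branch: zero-fill the full calendar so inter-event gaps
# (claim-wave timing) survive for clustering estimation.
start, end = min(payouts), max(payouts)
n_days = (end - start).days + 1
ei_series = [payouts.get(start + timedelta(days=d), 0.0) for d in range(n_days)]
```

Here `evi_series` has three active-day entries while `ei_series` spans all six calendar days, which is why the two branches are said to live on different clocks.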
Cached application downloads are reused by default. The workflow only refreshes files when they are missing, obviously broken, or explicitly force-refreshed.
- USGS raw extracts are automatically refreshed when the cached file is too short or unreadable.
- GHCN station files and NFIP state extracts are reused unless their cached file fails a basic integrity check.
- Set `UNIBM_FORCE_REFRESH_APPLICATION_DATA=1` to force fresh downloads across application inputs.
Main application outputs are written to `out/applications/`:
- `application_series_registry.csv`
- `application_screening.csv`
- `application_summary.csv`
- `application_design_life_levels.csv`: design-life-level curves over the application tau grid
- `application_stationarity.csv`
- `application_scaling_gof.csv`
- `application_design_life_intervals.csv`
- `application_methods.csv`
- `application_ei_methods.csv`
- `application_ei_seasonal_methods.csv`
- `application_usgs_site_screening.csv`
Application method defaults are now intentionally asymmetric:
- `application_methods.csv` records only the headline EVI fit `sliding_median_fgls`, but expands it over the application tau grid `0.50 / 0.90 / 0.95 / 0.99` by reusing the same plateau and `xi` while estimating only tau-specific intercept shifts;
- `application_ei_methods.csv` records the four-method EI comparison set (`bb_sliding_fgls`, `northrop_sliding_fgls`, `k_gaps`, and `ferro_segers`) only for the EI applications (`tx_streamflow`, `fl_streamflow`, `tx_nfip_claims`, and `fl_nfip_claims`);
- `application_ei_seasonal_methods.csv` stores the appendix-only monthly empirical-PIT-to-unit-Frechet seasonal sensitivity for those same four EI methods and the same EI applications.
Interpreting the streamflow/NFIP application diagnostics:
- the quantile-scaling panel is the fitted UniBM log-log block-summary curve;
- the design-life-level panel is not a separate GEV fit, but the same fitted scaling law evaluated at larger block sizes and then mapped to longer design-life spans;
- the literature term closest to this output is a design-life level, i.e. a `T`-year block-maximum `tau`-quantile;
- the current application default is `tau = 0.50`, so the headline curve is a median design-life level;
- the application plots and exports now also show `tau = 0.90 / 0.95 / 0.99` as increasingly conservative shared-`xi` companion curves;
- the EVI plateau and the EI stable window are selected from different statistical paths, so they do not need to match;
- different `tau` values are conceptually valid and should share the same asymptotic slope `xi` while differing mainly in intercept; in the application workflow those higher-`tau` curves are derived by holding the headline plateau and slope fixed and re-estimating only the intercept;
- in this direct block-maxima framework, serial dependence is already internalized in the fitted block-maximum law, so there is no second BM-side `theta` adjustment on the design-life-level curve;
- for NFIP, active-day design-life levels and calendar-day EI estimates are kept separate on purpose because they live on different clocks.
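The shared-`xi` companion-curve construction can be sketched as follows: hold the slope `xi` fixed, least-squares-fit only the intercept `a_tau` of `log q_tau(m) = a_tau + xi * log m`, then evaluate the law at a larger block size. The function names are illustrative, not the package API:

```python
import math

def intercept_for_tau(block_sizes, block_max_quantiles, xi):
    """Least-squares intercept a_tau of log q_tau(m) = a_tau + xi*log(m),
    with the headline slope xi held fixed (a plain residual mean)."""
    resid = [math.log(q) - xi * math.log(m)
             for m, q in zip(block_sizes, block_max_quantiles)]
    return sum(resid) / len(resid)

def design_life_level(a_tau, xi, m):
    """Evaluate the fitted scaling law at a larger block size m,
    e.g. one corresponding to a T-year design-life span."""
    return math.exp(a_tau + xi * math.log(m))

# Exact synthetic example: q_tau(m) = 2 * m**0.5, so a_tau = log(2), xi = 0.5.
a_tau = intercept_for_tau([4, 16, 64], [4.0, 8.0, 16.0], xi=0.5)
level = design_life_level(a_tau, 0.5, 100)  # 2 * sqrt(100) = 20
```

Re-running `intercept_for_tau` with higher-`tau` empirical quantiles produces the parallel companion curves: the same slope, shifted intercepts.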
Rendered application figures and tables stay under this repository's generated output tree together with the benchmark outputs and cached summaries.
The composite figure is now the default notebook-facing visual. For streamflow
and NFIP it combines target stability, the headline median-sliding-FGLS scaling
fit, the four-method EI comparison, and the design-life-level panel in one 2x2
layout. The scaling and design-life-level panels now show the application tau grid
0.50 / 0.90 / 0.95 / 0.99, with tau = 0.50 as the headline design-life
median and the higher curves as shared-xi upper companions. Houston and
Phoenix use an EVI-only composite variant where the raw daily series replaces
the EI panel. The older single-purpose PDFs remain
available as secondary/debug outputs.
Design-life-level plotting uses a mixed scale convention:
- Houston precipitation and Phoenix hot-dry severity keep a linear `y` axis.
- Texas/Florida streamflow and Texas/Florida NFIP design-life-level plots use a log `y` axis so the multi-order-of-magnitude spread remains readable.
Sync the notebook artifact from the Jupytext source of truth and execute the paired notebook in place:
```
just vignette
# or
uv run python -m jupytext --sync notebooks/vignette.py
uv run python -m nbconvert --to notebook --execute --inplace notebooks/vignette.ipynb
```
`just vignette` finishes with a repo-wide ruff format pass so the paired `.py` and `.ipynb` stay normalized after execution.
The source of truth now lives at `notebooks/vignette.py` in Jupytext py:percent format, and the committed paired notebook lives at `notebooks/vignette.ipynb`.
The vignette presents the application section in the same style as the benchmark sections:
- benchmark results are shown from cached benchmark summaries plus inline plotting helpers;
- application results are re-fit inside the notebook via `build_application_bundles(...)` and rendered inline with `plot_application_*` helpers rather than embedding external PDFs.
Application sections now use `plot_application_composite(...)` as the main
visual. The notebook still shows the raw time series separately, but the
headline EI application comparison is carried by the composite figure
plus the CSV/LaTeX tables for streamflow and NFIP. Houston and Phoenix appear
later in the notebook only as secondary EVI-only weather plots.
The application notebook also includes a dedicated Data Provenance and Source Records section summarizing:
- the NOAA GHCN-Daily station ids and source URLs used for Houston and Phoenix;
- the frozen USGS gauge ids used for Texas and Florida streamflow;
- the OpenFEMA NFIP endpoint and state-level claim filters used for Texas and Florida building-payout series.
It also reports an appendix-only seasonal-adjusted EI sensitivity based on a monthly empirical PIT -> unit-Frechet transform of each prepared EI series for the EI applications only. Those rows are a robustness check and are not used in the main median design-life-level summaries or in the headline application summary tables.
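The monthly empirical-PIT to unit-Frechet transform can be sketched as below. This is a naive illustration (ranks without tie handling), not the repo's seasonal-sensitivity code:

```python
import math

def monthly_pit_to_frechet(values, months):
    """Rank each observation within its calendar month (empirical PIT with
    the n+1 convention), then map U -> -1/log(U) onto the unit-Frechet scale.
    Naive sketch: ties get distinct ranks by sort order."""
    out = [0.0] * len(values)
    for month in set(months):
        idx = [i for i, mo in enumerate(months) if mo == month]
        ordered = sorted(idx, key=lambda i: values[i])
        n = len(idx)
        for rank, i in enumerate(ordered, start=1):
            u = rank / (n + 1)           # empirical PIT, strictly inside (0, 1)
            out[i] = -1.0 / math.log(u)  # unit-Frechet quantile transform
    return out
```

Because the PIT is computed month by month, within-month ordering is preserved while month-specific seasonality in the margins is removed before EI estimation.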
out/benchmark/preview/ is no longer part of the formal workflow. It was used
only for temporary benchmark figure previews while tuning display limits and has
been removed.
For the quick local-docs pointer, see Docs near the top of this README. The
reusable statistical library under src/unibm/ has a lightweight Sphinx
site under docs/, and the generated HTML entrypoint is
docs/_build/html/index.html.
The docs include:
- API reference pages for the slim root facade plus the canonical `unibm.evi` and `unibm.ei` packages;
- an "EVI and EI Workflow Guide" overview page;
- a "Worked Examples" page with small runnable examples for EVI, bootstrap backbones, design-life intervals, and EI estimation;
- a "Reading Returned Objects" page showing which result fields to inspect first for EVI and EI fits.
On the EI side, the docs now separate:
- `unibm.ei.preparation` and `unibm.ei.paths` for sample preparation and BM-path construction;
- `unibm.ei.selection` for stable-window selection/extraction;
- `unibm.ei.bm` and `unibm.ei.threshold` for the two formal EI estimator families;
- `unibm.ei.plotting` for library-grade EI path and fit plotting helpers.
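As a standalone sketch of the threshold-based family, the classical Ferro-Segers intervals estimator of `theta` fits in a few lines. This is illustrative only; the `ferro_segers` implementation in `unibm.ei.threshold` may differ in details such as tie and boundary handling:

```python
def ferro_segers_theta(exceedance_times):
    """Intervals estimator of the extremal index (Ferro & Segers, 2003).

    `exceedance_times` are the sorted time indices at which a high
    threshold is exceeded; theta is estimated from the interexceedance
    gaps, with the bias-corrected form used when any gap exceeds 2.
    """
    gaps = [t2 - t1 for t1, t2 in zip(exceedance_times, exceedance_times[1:])]
    k = len(gaps)
    if k == 0:
        return 1.0
    if max(gaps) <= 2:
        num = 2.0 * sum(gaps) ** 2
        den = k * sum(t * t for t in gaps)
    else:
        num = 2.0 * sum(t - 1 for t in gaps) ** 2
        den = k * sum((t - 1) * (t - 2) for t in gaps)
    return min(1.0, num / den)
```

Regularly spaced exceedances give `theta = 1` (no clustering), while tightly clustered exceedance episodes pull the estimate well below 1.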
On the EVI side, the docs now separate:
- `unibm.evi.estimation` for the canonical UniBM scaling-estimator family;
- `unibm.evi.tail` for tail/order-statistic `xi` comparator estimators;
- `unibm.evi.spectrum` for spectrum-style block-maxima `xi` comparator estimators.
For the combined repo-level verification pass, use:
```
just verify
```
Run the unit test suite:
```
uv run python -m unittest discover -s tests -p 'test_*.py'
```
Run the same suite with branch coverage:
```
uv run coverage run -m unittest discover -s tests -p 'test_*.py'
uv run coverage report -m
uv run coverage xml
uv run coverage html
```
Run the explicit static lint pass separately with:
```
uv run ruff check .
```
Coverage is intentionally scoped to the reusable statistical core under `src/unibm/`, not to raw-data downloads or workflow glue. The current coverage gate is 90% for `src/unibm/`.
- `src/unibm/` contains the installable reusable statistical core.
- `scripts/benchmark/` contains synthetic benchmark compute/report pipelines.
- `scripts/application/` contains real-data application build, screening, metadata, and export code.
- `scripts/shared/` contains shared CLI bootstrap and runtime helpers.
- `scripts/notebook_api.py` contains the notebook-facing helper API used by the Jupytext vignette.
- `scripts/data_prep/` contains application-specific preprocessing helpers.
- `data/metadata/application/` contains frozen USGS site selections and the CPI deflator table used by the NFIP workflow.
- `docs/` contains Sphinx docs for the reusable statistical layer.
- `notebooks/vignette.py` is the Jupytext source of truth for the research notebook, and `just vignette` syncs plus executes the paired `notebooks/vignette.ipynb` in place.