Train model in simulation variable space by RemiLehe · Pull Request #405 · BLAST-AI-ML/synapse

RemiLehe · 2026-03-15T13:49:34Z

Context

This PR extracts one of the changes from #389 to make that PR easier to review.

What this PR does

Replaces the old approach (converting simulation data into experimental variable space before training) with the inverse: converting experimental data into simulation variable space, and training the model entirely in that space.

Concretely:

Adds build_guess_calibration() (analogous to build_normalizations) which constructs input_guess_calibration and output_guess_calibration — AffineInputTransforms encoding the alpha/beta guess from the config.
Converts experimental data to simulation variable space before building normalization and training.
Prepends input_guess_calibration to the input transformer list and appends output_guess_calibration to the output transformer list in the LUME model, so that at inference time inputs are mapped from experimental to simulation space and outputs are mapped back.
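
As a rough illustration of the second step (converting experimental data to simulation variable space before building the normalization), here is a minimal pure-Python sketch. It is not the code in this PR: the `AffineCalibration` class, the convention `exp = alpha * sim + beta`, and all numeric values are assumptions made for the example, and the convention actually used by the config may differ.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class AffineCalibration:
    """Hypothetical stand-in for a guess calibration.

    Assumed convention (for illustration only): exp = alpha * sim + beta.
    """

    alpha: float
    beta: float

    def exp_to_sim(self, x_exp: float) -> float:
        # Invert the assumed affine map to land in simulation variable space.
        return (x_exp - self.beta) / self.alpha

    def sim_to_exp(self, x_sim: float) -> float:
        # Apply the assumed affine map to return to experimental space.
        return self.alpha * x_sim + self.beta


# Made-up alpha/beta guess and made-up data.
calibration = AffineCalibration(alpha=2.0, beta=1.0)
sim_inputs = [0.0, 1.0, 2.0]
exp_inputs = [3.0, 5.0]

# Convert the experimental points first ...
exp_inputs_in_sim_space = [calibration.exp_to_sim(x) for x in exp_inputs]

# ... then build the normalization from data that all lives in simulation space.
training_inputs = sim_inputs + exp_inputs_in_sim_space
center = mean(training_inputs)
scale = pstdev(training_inputs)
normalized = [(x - center) / scale for x in training_inputs]
```

Because the conversion happens before the normalization statistics are computed, the normalization itself is expressed in simulation variables, which is the property the list above relies on.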

Why train in simulation variable space?

In #389, the model is always trained first on simulation data (Phase 1), and then a calibration is learned on experimental data (Phase 2). Training the model (NN or GP) in simulation variable space makes the code, the calibration logic, and the sequence of transformers in #389 much easier to follow: the inner model always works in a single, well-defined space (simulation variables), and the guess calibration transforms sit cleanly at the boundary between experimental and simulation spaces.
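
The boundary placement described above can be pictured with a toy pipeline. This is only a sketch under stated assumptions: the transforms are plain Python functions rather than LUME or BoTorch objects, the inner model is a made-up linear function, every constant is invented, and the affine convention (`exp = alpha * sim + beta`) is assumed rather than taken from the config.

```python
from typing import Callable, Sequence

Transform = Callable[[float], float]

# Invented constants, for illustration only.
ALPHA_IN, BETA_IN = 2.0, 1.0      # assumed input guess calibration
ALPHA_OUT, BETA_OUT = 10.0, 0.0   # assumed output guess calibration
IN_MEAN, IN_STD = 1.0, 0.5        # input normalization (simulation space)
OUT_MEAN, OUT_STD = 3.0, 1.5      # output normalization (simulation space)


def input_guess_calibration(x_exp: float) -> float:
    # Experimental space -> simulation space.
    return (x_exp - BETA_IN) / ALPHA_IN


def input_normalization(x_sim: float) -> float:
    return (x_sim - IN_MEAN) / IN_STD


def inner_model(z: float) -> float:
    # Toy model; it only ever sees normalized simulation variables.
    return 2.0 * z


def output_denormalization(z: float) -> float:
    return z * OUT_STD + OUT_MEAN


def output_guess_calibration(y_sim: float) -> float:
    # Simulation space -> experimental space.
    return ALPHA_OUT * y_sim + BETA_OUT


def apply_chain(x: float, transforms: Sequence[Transform]) -> float:
    for transform in transforms:
        x = transform(x)
    return x


# The guess calibration is first on the input side and last on the output side.
input_transformers = [input_guess_calibration, input_normalization]
output_transformers = [output_denormalization, output_guess_calibration]


def predict(x_exp: float) -> float:
    z = apply_chain(x_exp, input_transformers)
    return apply_chain(inner_model(z), output_transformers)


print(predict(5.0))  # 90.0
```

Everything between the two guess calibration steps operates in simulation variables, so a later calibration step only needs to adjust the outermost transforms.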

Instead of converting simulation data to experimental variable space, convert experimental data to simulation variable space and train the model there. This ensures the model operates natively in the simulation variable space defined by the config.

Introduces build_guess_calibration() (analogous to build_normalizations) which builds input_guess_calibration and output_guess_calibration AffineInputTransforms. These are prepended/appended to the transformer lists in the LUME model so that at inference time inputs are mapped from experimental to simulation space, and outputs are mapped back.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

RemiLehe and others added 9 commits March 15, 2026 06:45

[pre-commit.ci] auto fixes from pre-commit.com hooks

1cf197f

for more information, see https://pre-commit.ci

Move simulation calibration extraction into build_guess_calibration

d1118cc

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Merge remote branch

37555f1

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

[pre-commit.ci] auto fixes from pre-commit.com hooks

709ca65

for more information, see https://pre-commit.ci

Remove unnecessary .detach() when converting exp data to sim space

f5cec3d

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

[pre-commit.ci] auto fixes from pre-commit.com hooks

eafad32

for more information, see https://pre-commit.ci

Remove .numpy() when assigning calibrated values to DataFrame

bdfc313

Consistent with the normalize() function, which also assigns torch tensors directly into pandas DataFrame columns.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Remove unneeded type conversion

f056ae5

RemiLehe changed the title ~~Train model in simulation variable space~~ [WIP] Train model in simulation variable space Mar 16, 2026

RemiLehe added 2 commits March 24, 2026 14:54

Merge branch 'main' into train-in-sim-variables

10e451c

Fix test

b207e61

RemiLehe changed the title ~~[WIP] Train model in simulation variable space~~ Train model in simulation variable space Mar 25, 2026

Update comments

61ff8b4

EZoni requested a review from RevathiJambunathan March 26, 2026 23:26

EZoni added the ml Changes related to the ML models label Mar 26, 2026

RemiLehe wants to merge 12 commits into BLAST-AI-ML:main from RemiLehe:train-in-sim-variables
