Skip to content

refactor: combine RBMDS#70

Open
sahiljhawar wants to merge 6 commits intomainfrom
sahiljhawar/merge-RBMDS-Nc
Open

refactor: combine RBMDS#70
sahiljhawar wants to merge 6 commits intomainfrom
sahiljhawar/merge-RBMDS-Nc

Conversation

@sahiljhawar
Copy link
Copy Markdown
Collaborator

This pull request extends the RBMDataSet class to add comprehensive support for loading and managing RBM data from NetCDF (.nc) files, in addition to the existing .mat and .pickle formats. The changes introduce format-specific loading logic, variable mapping for NetCDF files, and update the class interface and documentation accordingly. The most important changes are grouped below:

NetCDF File Support and Variable Loading:

  • Added the _read_all_datasets_netcdf helper function to recursively read all variables (including those in groups) from NetCDF-4 files and return them in a dictionary keyed by their full hierarchical path.
  • Introduced _load_variable_netcdf and _get_rbm_name_for_nc methods to handle loading variables from NetCDF files and mapping NetCDF variable names to internal RBM variable names. This includes logic for time trimming, computed variables, and attribute assignment. (Fa091418L445R534, Fa091418L534R621)
  • Updated the _load_variable method to dispatch to either NetCDF or .mat/.pickle loading logic based on the selected file extension. (Fa091418L445R534)

Class Interface and Initialization:

  • Extended the preferred_extension parameter and related logic to accept "nc" as a valid option, with validation and storage in the class. [1] [2] [3]
  • When using NetCDF files, set up a variable_lut mapping NetCDF variable paths to internal variable names, including handling of the magnetic field model (mfm).

File Path and Name Handling:

  • Modified _create_file_path_stem and _create_file_name_stem to generate the correct file path and name patterns for NetCDF files, which differ from .mat and .pickle conventions. [1] [2]

Documentation and Usability:

  • Updated the class and method docstrings to reflect support for NetCDF files and clarified usage instructions. [1] [2] [3]
  • Improved error handling and validation for file extension and variable loading scenarios. (swvo/io/RBMDataSet/RBMDataSet.pyR196-R206, Fa091418L534R621)

Other Improvements:

  • Refactored computed property logic for P and InvV to cache results and handle empty values more robustly.

These changes make the RBMDataSet class more flexible and robust, allowing it to seamlessly handle multiple data formats and improving its usability for a wider range of datasets.

Copilot AI review requested due to automatic review settings March 31, 2026 08:47
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors RBM dataset handling by consolidating NetCDF (.nc) support into the existing RBMDataSet implementation, removing the separate NetCDF dataset class and dataset manager.

Changes:

  • Add NetCDF recursive variable reading and NetCDF-specific variable loading/mapping to RBMDataSet.
  • Extend preferred_extension to support "nc" and adjust file path/name stem logic for NetCDF directory conventions.
  • Restructure tests by removing RBMNcDataSet/manager tests and adding NetCDF-focused tests to the main RBMDataSet test suite.

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
swvo/io/RBMDataSet/RBMDataSet.py Adds NetCDF support (reader + loader dispatch), extends preferred_extension, and adjusts computed variable caching/loaded-variable tracking.
swvo/io/RBMDataSet/__init__.py Removes exports for RBMDataSetManager and RBMNcDataSet, leaving only RBMDataSet as the main entry point.
swvo/io/RBMDataSet/RBMNcDataSet.py Deleted (NetCDF functionality moved into RBMDataSet).
swvo/io/RBMDataSet/RBMDataSetManager.py Deleted (manager no longer available/exported).
tests/io/RBMDataSet/test_RBMDataset.py Adds NetCDF-mode tests and new coverage for computed variables being tracked by get_loaded_variables().
tests/io/RBMDataSet/test_RBMNcDataset.py Deleted (superseded by NetCDF tests in test_RBMDataset.py).
tests/io/RBMDataSet/test_RBMDatasetManager.py Deleted (manager removed).
Comments suppressed due to low confidence (1)

swvo/io/RBMDataSet/init.py:26

  • Removing RBMDataSetManager (and RBMNcDataSet) from the public package exports is a breaking change and currently leaves internal code referencing these names (e.g. swvo/io/RBMDataSet/scripts/create_RBSP_line_data.py still imports RBMDataSetManager, and some modules reference RBMNcDataSet under TYPE_CHECKING). Either keep/backfill these symbols (possibly as deprecated aliases) or update the remaining internal imports/type hints accordingly so the package imports and type-checking don’t break.
from swvo.io.RBMDataSet.interp_functions import TargetType as TargetType
from swvo.io.RBMDataSet.scripts.create_RBSP_line_data import create_RBSP_line_data as create_RBSP_line_data
from swvo.io.RBMDataSet.RBMDataSet import RBMDataSet as RBMDataSet


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants