Description of issue or feature request:
Right now, tuf.ngclient is tightly coupled to local filesystem I/O: it assumes a metadata directory on disk that it can read and write. For example (python-tuf/tuf/ngclient/updater.py, lines 293 to 312 in 4d2ff8d):
    def _persist_metadata(self, rolename: str, data: bytes) -> None:
        """Write metadata to disk atomically to avoid data loss."""
        temp_file_name: Optional[str] = None
        try:
            # encode the rolename to avoid issues with e.g. path separators
            encoded_name = parse.quote(rolename, "")
            filename = os.path.join(self._dir, f"{encoded_name}.json")
            with tempfile.NamedTemporaryFile(
                dir=self._dir, delete=False
            ) as temp_file:
                temp_file_name = temp_file.name
                temp_file.write(data)
            os.replace(temp_file.name, filename)
        except OSError as e:
            # remove tempfile if we managed to create one,
            # then let the exception happen
            if temp_file_name is not None:
                with contextlib.suppress(FileNotFoundError):
                    os.remove(temp_file_name)
            raise e
This is problematic in distributed worker setups like Warehouse (PyPI), where each worker runs in its own container or VM and therefore can't easily share an on-disk TUF repo. In particular, this raises both reliability and security concerns:
- Reliability: an unfortunate corruption in a single worker's TUF repo results in a hard-to-diagnose flaky worker, since each worker has its own copy of the repo.
- Security: each worker's TUF repo is stored independently on a (machine-local) disk, making the copies harder to audit.
This problem was noted a few years back, before tuf.ngclient was created: #1009. The solution then was to add a filesystem abstraction to the tuf.metadata APIs, which was done via secure-systems-lab/securesystemslib#232 and #1009. However, this abstraction wasn't added to the ngclient APIs, only to the low-level metadata ones.
Current behavior:
tuf.ngclient currently assumes that it can perform persistent local I/O for its repository.
Expected behavior:
tuf.ngclient should support an I/O abstraction (such as the pre-existing StorageBackendInterface, if suitable) for persistent repo operations, enabling use in distributed deployments.
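To make the request concrete, here is a minimal sketch of what such an abstraction could look like. The names `MetadataStore`, `DirectoryStore`, and `InMemoryStore` are hypothetical and are not part of python-tuf or securesystemslib. The directory-backed class reuses the atomic write pattern from `_persist_metadata` above. The in-memory class only stands in for a shared backend (an object store or database, for example) that a distributed deployment might plug in.

```python
import contextlib
import os
import tempfile
from typing import Dict, Optional, Protocol
from urllib import parse


class MetadataStore(Protocol):
    """Hypothetical storage interface (an assumption, not an existing API)."""

    def load(self, rolename: str) -> Optional[bytes]:
        """Return the stored metadata bytes, or None if nothing is stored."""
        ...

    def persist(self, rolename: str, data: bytes) -> None:
        """Store the metadata bytes for the given role."""
        ...


class DirectoryStore:
    """Local-disk store that mirrors the current on-disk behavior."""

    def __init__(self, metadata_dir: str) -> None:
        self._dir = metadata_dir

    def _path(self, rolename: str) -> str:
        # encode the rolename to avoid issues with e.g. path separators
        encoded_name = parse.quote(rolename, "")
        return os.path.join(self._dir, f"{encoded_name}.json")

    def load(self, rolename: str) -> Optional[bytes]:
        try:
            with open(self._path(rolename), "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None

    def persist(self, rolename: str, data: bytes) -> None:
        temp_file_name: Optional[str] = None
        try:
            # write to a temp file in the same directory, then rename over
            # the target so readers never see a partially written file
            with tempfile.NamedTemporaryFile(
                dir=self._dir, delete=False
            ) as temp_file:
                temp_file_name = temp_file.name
                temp_file.write(data)
            os.replace(temp_file_name, self._path(rolename))
        except OSError:
            # remove the temp file if one was created, then re-raise
            if temp_file_name is not None:
                with contextlib.suppress(FileNotFoundError):
                    os.remove(temp_file_name)
            raise


class InMemoryStore:
    """Stand-in for a shared, non-filesystem backend (illustration only)."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def load(self, rolename: str) -> Optional[bytes]:
        return self._blobs.get(rolename)

    def persist(self, rolename: str, data: bytes) -> None:
        self._blobs[rolename] = data


def save_and_reload(store: MetadataStore, rolename: str, data: bytes) -> Optional[bytes]:
    """Client-side code would depend only on the interface, not on the disk."""
    store.persist(rolename, data)
    return store.load(rolename)
```

With an interface along these lines, the updater could accept a store object in place of a directory path, and the choice of backend would become a deployment decision. Whether the existing StorageBackendInterface can fill this role directly is exactly the open question above.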