Skip to content

feat: add pilot database and router#570

Open
Robin-Van-de-Merghel wants to merge 8 commits intoDIRACGrid:mainfrom
Robin-Van-de-Merghel:robin-pilot-management
Open

feat: add pilot database and router#570
Robin-Van-de-Merghel wants to merge 8 commits intoDIRACGrid:mainfrom
Robin-Van-de-Merghel:robin-pilot-management

Conversation

@Robin-Van-de-Merghel
Copy link
Copy Markdown
Contributor

Split of #421 , first part : pilot management

@Robin-Van-de-Merghel Robin-Van-de-Merghel force-pushed the robin-pilot-management branch 2 times, most recently from 8c655b0 to 2cfe44a Compare June 13, 2025 11:19
@Robin-Van-de-Merghel Robin-Van-de-Merghel marked this pull request as draft June 13, 2025 11:36
@Robin-Van-de-Merghel Robin-Van-de-Merghel marked this pull request as ready for review June 15, 2025 06:13
Copy link
Copy Markdown
Contributor

@fstagni fstagni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First review.

"""Tried to access a value which is asynchronously loaded but not yet available."""


class DiracFormattedError(DiracError):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All exceptions apart from the 2 below inherit from DiracError, and this one is further customization for only a few related to pilot. What if instead you modify DiracError?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know really if I should keep as before classes with code duplication, or if a generic thingy will help... I had that in mind, but seems overkill:

class DiracFormattedError(Exception):
     http_status_code = HTTPStatus.BAD_REQUEST  # 400
    pattern = "Error [args]"

    def __init__(self, data: dict[str, str], detail: str = ""):
        self.data = data
        self.detail = detail
        super().__init__(self._render_message())

    def _render_message(self) -> str:
        context = {
            **self.data,
            "args": self._format_args(),
            "detail": self._format_detail(),
        }

        return re.sub(r'\[([^\]]+)\]', lambda m: context.get(m.group(1), ""), self.pattern)

    def _format_args(self) -> str:
        if not self.data:
            return ""
        return " ".join(f"({k}: {v})" for k, v in self.data.items())

    def _format_detail(self) -> str:
        return f": {self.detail}" if self.detail else ""

Then:

class PilotAlreadyExistsError(DiracFormattedError):
    http_status_code = HTTPStatus.CONFLICT 
    pattern = "Pilot [args] already exists[detail]"

@Robin-Van-de-Merghel
Copy link
Copy Markdown
Contributor Author

Robin-Van-de-Merghel commented Jun 24, 2025

Where an issue?

Whether we have a base router with require_auth or not, we won't be able to override it in its children (cf #417 ).
So for pilots we must have require_auth=False at the base router because of secret-exchange. But now every pilot endpoint is opened.

Possibilities

Splitting

Let's start with the routers themselves. We can separate pilots and users endpoints : /api/pilots (only for pilots) and ~/api/pilot_management (only for users).

That brings us:

  1. Cleaner: each their own territory, no overlapping, easier to read (you know that on /pilots its only pilot resources)
  2. More secure: we can add a dependency to the pilot router (same as verify_dirac_pilot_token but for pilots)
  3. We can have require_auth for /pilot_management, and enforce tokens

Using another approach

We could have a bare base router without dependency injection, with sub routers with verify_dirac_access_token:

# Authenticated parent router
protected_router = APIRouter(dependencies=[Depends(verify_dirac_access_token)])

@protected_router.get("/secure")
def secure_route():
    return {"msg": "secure"}

# Public sub-router (no auth)
public_router = APIRouter()

@public_router.get("/public")
def public_route():
    return {"msg": "public"}

Its goal would be to replacer DiracxRouter by fixing the issue, and keeping an explicit depedency injection.
Or have @router.authentificated.get(...) instead of @router.get, to explicitely say if we want auth or not

)


async def clear_pilots(
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No description provided.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(For deletePilots: delete mapping also, same later for logs #511 + PilotOutput)

Copy link
Copy Markdown
Contributor Author

@Robin-Van-de-Merghel Robin-Van-de-Merghel Jun 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left open as a reminder (already did mapping and output)

@Robin-Van-de-Merghel Robin-Van-de-Merghel force-pushed the robin-pilot-management branch 2 times, most recently from a9b353d to 4ba2c9b Compare July 4, 2025 11:48
@Robin-Van-de-Merghel Robin-Van-de-Merghel force-pushed the robin-pilot-management branch 3 times, most recently from 3f801ac to bbade5e Compare August 5, 2025 07:45
@Robin-Van-de-Merghel
Copy link
Copy Markdown
Contributor Author

Will need to be cleaned before merge (integration test)

@Robin-Van-de-Merghel Robin-Van-de-Merghel force-pushed the robin-pilot-management branch 2 times, most recently from 1192a8b to 7c6ba77 Compare August 7, 2025 09:10
@@ -0,0 +1,20 @@
## Presentation

Pilots are a piece of software that is running on *worker nodes*. There are two types of pilots: "DIRAC pilots", and "DiracX pilots". The first type corresponds to pilots with proxies, sent by DIRAC; and the second type corresponds to pilots with secrets. Both kinds will eventually interact with DiracX using tokens (DIRAC pilots by exchanging their proxies for tokens, DiracX by exchanging their secrets for tokens).
Copy link
Copy Markdown
Contributor

@fstagni fstagni Aug 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This, I am afraid, it is misleading. There are no DiracX pilots, while there are pilots equipped for interacting with DIRAC or DiracX servers.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll rephrase it

@aldbr aldbr linked an issue Jan 29, 2026 that may be closed by this pull request
@fstagni fstagni force-pushed the robin-pilot-management branch from 20795e5 to f0e6846 Compare March 24, 2026 15:10
@read-the-docs-community
Copy link
Copy Markdown

read-the-docs-community bot commented Mar 24, 2026

Documentation build overview

📚 diracx | 🛠️ Build #31967459 | 📁 Comparing 7e63b63 against latest (e552651)


🔍 Preview build

Show files changed (1 files in total): 📝 0 modified | ➕ 1 added | ➖ 0 deleted
File Status
dev/explanations/pilots/index.html ➕ added

@fstagni fstagni changed the title Add pilot management: create/delete/patch and query feat: add pilot management database and router Mar 24, 2026
@fstagni fstagni marked this pull request as draft March 24, 2026 15:20
@fstagni fstagni force-pushed the robin-pilot-management branch 3 times, most recently from 40814d8 to f584afc Compare March 25, 2026 08:58
@fstagni fstagni force-pushed the robin-pilot-management branch from f584afc to fc29b47 Compare March 25, 2026 09:07
@fstagni fstagni force-pushed the robin-pilot-management branch from aa0b56a to 706ab09 Compare March 25, 2026 10:08
@fstagni fstagni changed the title feat: add pilot management database and router feat: add pilot database and router Mar 25, 2026
@fstagni fstagni force-pushed the robin-pilot-management branch from 9cb8d15 to 7e63b63 Compare March 25, 2026 13:46
@fstagni fstagni marked this pull request as ready for review March 25, 2026 13:49
@fstagni
Copy link
Copy Markdown
Contributor

fstagni commented Mar 25, 2026

This PR is I think ready for being tested in the certification setup. In the meantime, please review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Open access and require auth not working inside a router

3 participants