feat: add CLI shell and exec commands for pod terminal access #124
Open
V2arK wants to merge 39 commits into main from honglin/shell-forwarding-clean
Commits
1684078 feat: add CLI shell and exec commands for deployment pod terminal access (V2arK)
e26170c fix: use urlparse for scheme replacement to satisfy CodeQL (V2arK)
996e25b fix: apply black formatting and fix CodeQL url.startswith alert (V2arK)
3c380ed style: condense multiline expressions for readability (V2arK)
1d4a977 fix: resolve pylint warnings in shell.py and test_shell.py (V2arK)
d84ce85 fix: skip PyTorch-dependent tests in sanity mode (V2arK)
cef57d7 fix: break out of exec loop after end marker to prevent hanging (V2arK)
c4b5757 fix: re-enable OPOST after setraw to fix terminal rendering (V2arK)
58a6005 fix: replace pytest-asyncio with asyncio.run in tests for CI compat (V2arK)
7f60fdc fix: match Web UI protocol - remove rows/cols from stdin messages, re… (V2arK)
20cf81d fix: send delayed resize to fix prompt rendering after shell startup (V2arK)
d3529f2 fix: await cancelled tasks for cleanup, reduce WS close_timeout to 2s (V2arK)
0ae4ba4 fix: toggle PTY width to force SIGWINCH and prompt redraw on connect (V2arK)
a661f3b fix: include rows/cols in stdin messages and send Ctrl+L after resize… (V2arK)
90f8d9e fix: use stty to set PTY dimensions from inside shell instead of resi… (V2arK)
69b8aad fix: re-enable OPOST after setraw to convert bare \n to \r\n like xte… (V2arK)
dd62ac5 fix: convert \n to \r\n in output and use stty to fix PTY dimensions … (V2arK)
1ba56af feat: use pyte terminal emulator for interactive shell rendering (V2arK)
7449bf0 fix: swap rows/cols unpacking from shutil.get_terminal_size (V2arK)
ec9230b fix: use alternate screen buffer to prevent scrollback in Warp terminal (V2arK)
571059c fix: handle WebSocket ConnectionClosed to prevent hang on shell exit (V2arK)
192a826 refactor: use pyte for exec ANSI stripping and add ConnectionClosed h… (V2arK)
68142c4 fix: treat ArgoCD Code message as reconnect signal, not shell exit code (V2arK)
18bb688 fix: stop reconnecting when shell has genuinely exited (V2arK)
289362a chore: add debug file logging to shell and exec for exit hang diagnosis (V2arK)
94b7b64 fix: detect shell exit via idle timeout instead of Code message (V2arK)
b7e2441 fix: exit immediately on exit echo, ignore echo exit with trailing pr… (V2arK)
62176d6 fix: skip websocket close handshake wait after session ends (V2arK)
13d1d48 refactor: extract shell logic from CLI to SDK layer (V2arK)
4e619ec refactor: extract shell logic to SDK layer, rely on server close frame (V2arK)
1101c51 ruff format (V2arK)
2a201ed refactor: remove debug logging, fix unused imports and SDK/CLI bounda… (V2arK)
d3d2064 update redundancy (michaelshin)
f79fbf0 clean up pyte (michaelshin)
bb14ba2 clean up implementation (michaelshin)
ea6e0c8 address comments (michaelshin)
78cfb45 fix lint (michaelshin)
6f84d40 address comments (michaelshin)
1dca1d6 revert (michaelshin)
@@ -0,0 +1,84 @@

    """CLI commands for interactive shell and command execution in deployment pods."""

    import asyncio
    import shlex
    import sys

    import click

    from centml.cli.cluster import handle_exception
    from centml.sdk import auth
    from centml.sdk.api import get_centml_client
    from centml.sdk.config import settings
    from centml.sdk.shell import build_ws_url, exec_session, get_running_pods, interactive_session


    def _select_pod(running_pods, deployment_id):
        click.echo(f"Multiple running pods found for deployment {deployment_id}:")
        for i, name in enumerate(running_pods, 1):
            click.echo(f" [{i}] {name}")

        choice = click.prompt(
            "Select a pod", type=click.IntRange(1, len(running_pods)), prompt_suffix=f" [1-{len(running_pods)}]: "
        )
        return running_pods[choice - 1]


    def _connect_args(deployment_id, pod, shell_type, first_pod=False):
        """Resolve pod, build WebSocket URL, and obtain auth token."""
        with get_centml_client() as cclient:
            running_pods = get_running_pods(cclient, deployment_id)
        if not running_pods:
            raise click.ClickException(f"No running pods found for deployment {deployment_id}")

        if pod is not None and pod not in running_pods:
            pods_list = ", ".join(running_pods)
            raise click.ClickException(f"Pod '{pod}' not found. Available running pods: {pods_list}")

        if pod is not None:
            pod_name = pod
        elif len(running_pods) == 1 or first_pod:
            pod_name = running_pods[0]
        elif not sys.stdin.isatty():
            raise click.ClickException(
                "Multiple running pods found and stdin is not a TTY. "
                "Please specify a pod with --pod or use --first-pod."
            )
        else:
            pod_name = _select_pod(running_pods, deployment_id)

        ws_url = build_ws_url(settings.CENTML_PLATFORM_API_URL, deployment_id, pod_name, shell_type)
        token = auth.get_centml_token()
        return ws_url, token


    @click.command(help="Open an interactive shell to a deployment pod")
    @click.argument("deployment_id", type=int)
    @click.option("--pod", default=None, help="Specify a pod name")
    @click.option("--shell", "shell_type", default=None, type=click.Choice(["bash", "sh", "zsh"]), help="Shell type")
    @click.option(
        "--first-pod", is_flag=True, default=False, help="Auto-select the first running pod (skip interactive selection)"
    )
    @handle_exception
    def shell(deployment_id, pod, shell_type, first_pod):
        if not sys.stdin.isatty():
            raise click.ClickException("Interactive shell requires a terminal (TTY)")

        ws_url, token = _connect_args(deployment_id, pod, shell_type, first_pod)
        exit_code = asyncio.run(interactive_session(ws_url, token))
        sys.exit(exit_code)


    @click.command(help="Execute a command in a deployment pod", context_settings={"ignore_unknown_options": True})
    @click.argument("deployment_id", type=int)
    @click.argument("command", nargs=-1, required=True, type=click.UNPROCESSED)
    @click.option("--pod", default=None, help="Specific pod name")
    @click.option("--shell", "shell_type", default=None, type=click.Choice(["bash", "sh", "zsh"]), help="Shell type")
    @click.option(
        "--first-pod", is_flag=True, default=False, help="Auto-select the first running pod (skip interactive selection)"
    )
    @handle_exception
    def exec_cmd(deployment_id, command, pod, shell_type, first_pod):
        ws_url, token = _connect_args(deployment_id, pod, shell_type, first_pod)
        exit_code = asyncio.run(exec_session(ws_url, token, shlex.join(command)))
        sys.exit(exit_code)

michaelshin marked the review conversation on the --first-pod option of exec_cmd as resolved.
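`exec_cmd` collects the remote command as a tuple of arguments (`nargs=-1`) and passes `shlex.join(command)` to `exec_session`. The sketch below shows only what that joining step does to arguments containing spaces or shell metacharacters. The helper name `join_command` and the sample arguments are hypothetical, and `shlex.join` requires Python 3.8 or later.

```python
import shlex


def join_command(args):
    """Join argv-style arguments into a single shell-safe command line."""
    # shlex.join quotes each argument with shlex.quote where needed.
    return shlex.join(args)


# Hypothetical arguments, as Click would collect them from nargs=-1.
argv = ("ls", "-la", "/tmp/my dir")
line = join_command(argv)

# Splitting the joined line recovers the original arguments unchanged,
# so an argument containing a space stays a single argument remotely.
assert shlex.split(line) == list(argv)
```

Joining with plain `" ".join(command)` would instead let the remote shell split `/tmp/my dir` into two arguments.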
@@ -0,0 +1,12 @@

    from centml.sdk.shell.exceptions import NoPodAvailableError, PodNotFoundError, ShellError
    from centml.sdk.shell.session import build_ws_url, exec_session, get_running_pods, interactive_session

    __all__ = [
        "NoPodAvailableError",
        "PodNotFoundError",
        "ShellError",
        "build_ws_url",
        "exec_session",
        "get_running_pods",
        "interactive_session",
    ]
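`build_ws_url` is exported here, but its body is not visible in this view. Commit e26170c mentions using `urlparse` for scheme replacement, so the following is only a minimal sketch of that general idea, not the PR's implementation. The function name `to_ws_scheme` and the example URL are assumptions, and the real function also takes a deployment ID, pod name, and shell type that this sketch ignores.

```python
from urllib.parse import urlparse, urlunparse

# Assumed mapping from HTTP schemes to their WebSocket counterparts.
_SCHEME_MAP = {"https": "wss", "http": "ws"}


def to_ws_scheme(url: str) -> str:
    """Return url with its http(s) scheme swapped for ws(s)."""
    parsed = urlparse(url)
    if parsed.scheme not in _SCHEME_MAP:
        raise ValueError(f"Unsupported URL scheme: {parsed.scheme!r}")
    # Replace only the parsed scheme component, leaving host, path,
    # and query untouched, instead of matching on a string prefix.
    return urlunparse(parsed._replace(scheme=_SCHEME_MAP[parsed.scheme]))
```

Working on the parsed components avoids prefix checks such as `url.startswith("https")`, which is the kind of pattern the CodeQL-related commits refer to.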
@@ -0,0 +1,10 @@

    class ShellError(Exception):
        """Base exception for shell operations."""


    class NoPodAvailableError(ShellError):
        """No running pods found for the deployment."""


    class PodNotFoundError(ShellError):
        """Specified pod not found among running pods."""
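Because both specific errors derive from `ShellError`, a caller can handle every shell failure with a single `except` clause. The sketch below repeats the three classes from the diff so it runs on its own. The `run_guarded` helper and its return convention are hypothetical, and the diff does not show where the SDK actually raises these exceptions.

```python
class ShellError(Exception):
    """Base exception for shell operations."""


class NoPodAvailableError(ShellError):
    """No running pods found for the deployment."""


class PodNotFoundError(ShellError):
    """Specified pod not found among running pods."""


def run_guarded(action):
    """Run action() and return (exit_code, message) instead of raising.

    Hypothetical helper: catching the base class also catches
    NoPodAvailableError and PodNotFoundError.
    """
    try:
        action()
    except ShellError as exc:
        return 1, f"Error: {exc}"
    return 0, ""
```

Errors that are not part of the hierarchy, such as `ValueError`, still propagate to the caller.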
I don't think we need this, right? If --pod is not provided, we default to the first pod.
If there are multiple replicas, we interactively ask the user to choose a pod. --first-pod skips that prompt and ensures the command is applied to the first pod.
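The selection order discussed in this thread matches the branch order in `_connect_args`. It can be sketched as a pure function that is easy to test without a terminal. The name `resolve_pod`, the `stdin_is_tty` and `prompt` parameters, and the use of `LookupError` in place of `click.ClickException` are all assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence


def resolve_pod(
    running_pods: Sequence[str],
    pod: Optional[str] = None,
    first_pod: bool = False,
    stdin_is_tty: bool = True,
    prompt: Optional[Callable[[Sequence[str]], str]] = None,
) -> str:
    """Pick a pod name using the same precedence as _connect_args."""
    if not running_pods:
        raise LookupError("No running pods found")
    # 1. An explicit --pod wins, but it must be one of the running pods.
    if pod is not None:
        if pod not in running_pods:
            raise LookupError(f"Pod '{pod}' not found")
        return pod
    # 2. A single pod, or --first-pod, needs no prompt.
    if len(running_pods) == 1 or first_pod:
        return running_pods[0]
    # 3. Several pods and no terminal: there is nobody to ask.
    if not stdin_is_tty or prompt is None:
        raise LookupError("Multiple running pods found; specify a pod or use first_pod")
    # 4. Otherwise ask the user interactively.
    return prompt(running_pods)
```

For example, `resolve_pod(["a", "b"], first_pod=True)` returns `"a"` without prompting, which is the non-interactive behavior the reply describes.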