Skip to content

OCPNODE-4106: Propagate KubeletConfig TLS min version to CRI-O via ContainerRuntime #5672

Open
asahay19 wants to merge 1 commit intoopenshift:mainfrom
asahay19:mco
Open

OCPNODE-4106: Propagate KubeletConfig TLS min version to CRI-O via ContainerRuntime #5672
asahay19 wants to merge 1 commit intoopenshift:mainfrom
asahay19:mco

Conversation

@asahay19
Copy link
Copy Markdown

@asahay19 asahay19 commented Feb 19, 2026

What this PR does

This PR is part of the OCPSTRAT-2748 and epic OCPNODE-3916 initiative to enforce consistent TLS security policies across all OpenShift components. Today, when a cluster administrator sets a TLS security profile (via KubeletConfig or the cluster APIServer), the kubelet does it but CRI-O does not — leaving CRI-O's streaming and metrics servers with default TLS settings that may not meet the organization's security requirements.

The CRI-O side of this work has already been done and PR has been merged : cri-o/cri-o#9723 , which added support for reading tls_min_version and tls_cipher_suites from CRI-O's drop-in configuration. Both PRs target OpenShift 4.22.
Right now, when an admin sets a TLS profile in KubeletConfig, the kubelet picks it up but CRI-O doesn't. This PR fixes that by generating a CRI-O drop-in file (/etc/crio/crio.conf.d/01-ctrcfg-tls) with the matching TLS version and ciphers. If no KubeletConfig sets TLS, it falls back to the cluster APIServer profile (default: Intermediate/TLS 1.2).

Reference: OCPNODE-4106

Background:

The ContainerRuntimeConfig controller now has a dedicated tlsQueue and syncTLSConfig handler, following the same pattern as kubelet_config_nodes.go. It watches KubeletConfig, APIServer, and MCP events, and for each pool generates a 99--generated-containerruntime-tls MachineConfig with the CRI-O drop-in containing both tls_min_version and tls_cipher_suites .
This runs independently of ContainerRuntimeConfig CRs, so every pool gets TLS config even without any CTRCFGs defined. KubeletConfig takes priority; if none is set, it falls back to the APIServer profile (default: Intermediate/TLS 1.2). When multiple KubeletConfigs target the same pool, the most recently created one (greatertimestamp) wins.
At bootstrap, a separate RunTLSBootstrap function generates TLS MachineConfigs for all pools, called outside the ContainerRuntimeConfig block so TLS is propagated from the start.

The controller writes a TOML drop-in at /etc/crio/crio.conf.d/01-ctrcfg-tls containing the [crio.api] section with tls_min_version and tls_cipher_suites. CRI-O reads this on startup (or reload) and applies the settings to its streaming and metrics servers. For example, an Intermediate profile produces tls_min_version = "VersionTLS12" with the full set of IANA cipher suites, while a Modern profile produces tls_min_version = "VersionTLS13" with only TLS 1.3 ciphers.

Output

Locally tested all the unit test cases are Passing :
--- PASS: TestRunTLSBootstrap (0.00s) --- PASS: TestRunTLSBootstrap/NilKubeletConfigsAndNilAPIServer_FallsBackToIntermediate --- PASS: TestRunTLSBootstrap/KubeletConfigCustomTLS13_OverridesAPIServer --- PASS: TestRunTLSBootstrap/CustomProfileWithCiphers_PropagatesCipherSuites --- PASS: TestRunTLSBootstrap/MCNameAndAnnotations --- PASS: TestCreateCRIOTLSDropinFile (0.00s) --- PASS: TestCreateCRIOTLSDropinFile/TLS_1.2_generates_drop-in --- PASS: TestCreateCRIOTLSDropinFile/TLS_1.3_generates_drop-in --- PASS: TestCreateCRIOTLSDropinFile/Empty_version_returns_error --- PASS: TestCreateCRIOTLSDropinFile/TLS_1.2_with_cipher_suites --- PASS: TestTLSConfigFromKubeletConfigs (0.00s) --- PASS: TestTLSConfigFromKubeletConfigs/No_KubeletConfigs_returns_empty --- PASS: TestTLSConfigFromKubeletConfigs/KubeletConfig_without_TLS_returns_empty --- PASS: TestTLSConfigFromKubeletConfigs/KubeletConfig_with_TLS_1.3_matching_worker_pool --- PASS: TestTLSConfigFromKubeletConfigs/KubeletConfig_with_TLS_targeting_different_pool_returns_empty --- PASS: TestTLSConfigFromKubeletConfigs/KubeletConfig_with_TLS_matching_master_pool --- PASS: TestTLSConfigFromKubeletConfigs/Multiple_matching_KubeletConfigs:_newest_wins --- PASS: TestTLSConfigFromKubeletConfigs/Custom_profile_with_ciphers_returns_ciphers --- PASS: TestTLSConfigFromKubeletConfigs/Intermediate_profile_returns_ciphers --- PASS: TestKubeletConfigTLSPropagationToCRIO (20.30s) --- PASS: TestKubeletConfigTLSPropagationToCRIO/APIServerFallback --- PASS: TestKubeletConfigTLSPropagationToCRIO/CustomProfileTLS12WithCiphers --- PASS: TestKubeletConfigTLSPropagationToCRIO/CustomProfileTLS13 --- PASS: TestKubeletConfigTLSPropagationToCRIO/IntermediateProfile --- PASS: TestKubeletConfigTLSPropagationToCRIO/ModernProfile --- PASS: TestKubeletConfigTLSPropagationToCRIO/RevertOnKubeletConfigDeletion

e2e test result
asahay@asahay-ubuntu:~/crio-test/machine-config-operator$ go test -v -timeout 30m -run TestKubeletConfigTLSPropagationToCRIO ./test/e2e-2of2/... === RUN TestKubeletConfigTLSPropagationToCRIO ctrcfg_test.go:400: Created empty MCP "tls-api-test" (no matching nodes) ctrcfg_test.go:411: Found dedicated TLS MC "99-tls-api-test-generated-containerruntime-tls" === RUN TestKubeletConfigTLSPropagationToCRIO/APIServerFallback ctrcfg_test.go:506: Verified: APIServer fallback includes cipher suites from Intermediate profile === RUN TestKubeletConfigTLSPropagationToCRIO/CustomProfileTLS12WithCiphers ctrcfg_test.go:529: Verified: Custom profile with TLS 1.2 + specific ciphers propagated to CRI-O === RUN TestKubeletConfigTLSPropagationToCRIO/CustomProfileTLS13 ctrcfg_test.go:542: Verified: Custom profile with TLS 1.3 propagated to CRI-O === RUN TestKubeletConfigTLSPropagationToCRIO/IntermediateProfile ctrcfg_test.go:554: Verified: Intermediate profile propagated TLS 1.2 + cipher suites to CRI-O === RUN TestKubeletConfigTLSPropagationToCRIO/ModernProfile ctrcfg_test.go:562: Verified: Modern profile propagated TLS 1.3 to CRI-O === RUN TestKubeletConfigTLSPropagationToCRIO/RevertOnKubeletConfigDeletion ctrcfg_test.go:573: Verified: Reverted to APIServer fallback with cipher suites after KubeletConfig deletion --- PASS: TestKubeletConfigTLSPropagationToCRIO (20.34s) --- PASS: TestKubeletConfigTLSPropagationToCRIO/APIServerFallback (0.02s) --- PASS: TestKubeletConfigTLSPropagationToCRIO/CustomProfileTLS12WithCiphers (0.03s) --- PASS: TestKubeletConfigTLSPropagationToCRIO/CustomProfileTLS13 (2.04s) --- PASS: TestKubeletConfigTLSPropagationToCRIO/IntermediateProfile (2.05s) --- PASS: TestKubeletConfigTLSPropagationToCRIO/ModernProfile (2.03s) --- PASS: TestKubeletConfigTLSPropagationToCRIO/RevertOnKubeletConfigDeletion (2.03s) PASS ok github.com/openshift/machine-config-operator/test/e2e-2of2 20.415s

Pre-Merge Testing : https://issues.redhat.com/browse/OCPNODE-4113
Upgrade Testing : https://redhat.atlassian.net/browse/OCPNODE-4220
Functionality Testing : https://redhat.atlassian.net/browse/OCPNODE-4222

cc @ngopalak-redhat @bitoku @haircommander

Comment thread cmd/machine-config-controller/start.go
Comment thread pkg/controller/container-runtime-config/container_runtime_config_controller.go Outdated
@bitoku
Copy link
Copy Markdown
Contributor

bitoku commented Feb 19, 2026

/hold
It should wait for cri-o is updated to 1.35 in the 4.22 release.

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 19, 2026
@asahay19 asahay19 force-pushed the mco branch 2 times, most recently from 8db40b4 to 8c13d8f Compare February 24, 2026 08:09
@asahay19
Copy link
Copy Markdown
Author

/hold It should wait for cri-o is updated to 1.35 in the 4.22 release.

We can unhold it, latest 4.22 build now contains crio-1.35 .
sh-5.1# crio --version
crio version 1.35.0-2.rhaos4.22.git7deb12c.el9

@asahay19
Copy link
Copy Markdown
Author

/retest

@asahay19 asahay19 force-pushed the mco branch 2 times, most recently from 2b984f7 to ce2c9c0 Compare February 25, 2026 10:55
@asahay19
Copy link
Copy Markdown
Author

/test unit

@asahay19
Copy link
Copy Markdown
Author

/retest

2 similar comments
@asahay19
Copy link
Copy Markdown
Author

/retest

@asahay19
Copy link
Copy Markdown
Author

/retest

@asahay19 asahay19 changed the title Propagate KubeletConfig TLS min version to CRI-O via ContainerRuntime OCPNODE-4106: Propagate KubeletConfig TLS min version to CRI-O via ContainerRuntime Mar 6, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 6, 2026
@openshift-ci-robot
Copy link
Copy Markdown
Contributor

openshift-ci-robot commented Mar 6, 2026

@asahay19: This pull request references OCPNODE-4106 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

What this PR does

Add a KubeletConfig watcher to the ContainerRuntimeConfig controller so that when a
KubeletConfig CR has a TLSSecurityProfile set, the controller picks up the tls_min_version
and generates a CRI-O drop-in config file at /etc/crio/crio.conf.d/01-ctrcfg-tlsMinVersion.

Reference: OCPNODE-4106

Background:

The kubelet controller stays responsible for kubelet config only. The ContainerRuntimeConfig
controller now handles all CRI-O configuration including TLS. No API surface changes are
needed — TLS settings come from the existing KubeletConfig CR, and the CRI-O drop-in is
written by the ContainerRuntimeConfig controller.

  • The ContainerRuntimeConfig controller watches KubeletConfig add/update/delete events
    and re-enqueues all ContainerRuntimeConfigs on change.
  • For each pool, the controller extracts the TLS min version from the matching KubeletConfig
    and generates a CRI-O drop-in at /etc/crio/crio.conf.d/01-ctrcfg-tlsMinVersion.
  • The TLS version is tracked via the machineconfiguration.openshift.io/crio-tls-min-version
    annotation on the MachineConfig, so changes are detected even when the ContainerRuntimeConfig
    itself hasn't changed.
  • The same logic is applied at bootstrap time so that new clusters get the correct CRI-O
    TLS configuration from the start.

Changes:

  • Add KubeletConfig watcher to ContainerRuntimeConfig controller
  • When a KubeletConfig CR has a TLSSecurityProfile, extract tls_min_version and generate a CRI-O drop-in config at /etc/crio/crio.conf.d/01-ctrcfg-tlsMinVersion
  • TLS version tracked via machineconfiguration.openshift.io/crio-tls-min-version annotation on MachineConfig to detect drift and force re-sync
  • Same TLS propagation logic applied at bootstrap time for new clusters
  • Kubelet controller stays responsible for kubelet config only; ContainerRuntimeConfig controller handles all CRI-O config including TLS
  • No API surface changes — TLS comes from existing KubeletConfig CR
  • Added unit tests for createCRIOTLSDropinFile and getTLSMinVersionForPool and adding KubeletConfig after bootstrap
  • Updated existing unit tests to handle new KubeletConfig informer actions
  • Added E2E integration test TestKubeletConfigTLSPropagationToCRIO — verifies TLS 1.3 drop-in creation, update to TLS 1.2, and cleanup on KubeletConfig deletion
  • Added E2E bootstrap test case with ContainerRuntimeConfig

Files changed:

  • cmd/machine-config-controller/start.go - Pass KubeletConfigs informer, ClusterVersionInformer, and FeatureGatesHandler to ContainerRuntimeConfig controller
  • pkg/controller/container-runtime-config/container_runtime_config_controller.go - Watch KubeletConfig events, re-enqueue ContainerRuntimeConfigs on change, new getTLSMinVersionFromKubeletConfigs(), generate CRI-O TLS drop-in, annotate MC with TLS version
  • pkg/controller/container-runtime-config/container_runtime_config_bootstrap.go - Accept KubeletConfig slice, compute TLS min version per pool at bootstrap, include drop-in in generated MCs
  • pkg/controller/bootstrap/bootstrap.go - Pass KubeletConfig list to RunContainerRuntimeBootstrap
  • pkg/controller/common/constants.go - New CRIOTLSMinVersionAnnotationKey annotation constant
  • pkg/controller/container-runtime-config/helpers.go - New CRIODropInFilePathTLSMinVersion constant,
    tomlConfigCRIOTLSMinVersion struct, and createCRIOTLSDropinFile() function
  • pkg/controller/container-runtime-config/helpers_test.go - Unit tests for createCRIOTLSDropinFile (TLS 1.2, TLS 1.3, empty version) and getTLSMinVersionForPool (no configs, no TLS, correct pool, wrong pool, master pool)
  • pkg/controller/container-runtime-config/container_runtime_config_controller_test.go - Updated filterInformerActions to filter kubeletconfigs list/watch actions from the new informer
  • pkg/controller/container-runtime-config/container_runtime_config_bootstrap_test.go - Test for adding KubeletConfig after bootstrap
  • test/e2e-2of2/ctrcfg_test.go - E2E test TestKubeletConfigTLSPropagationToCRIO: create KubeletConfig with TLS 1.3, verify drop-in on node, update to TLS 1.2, verify update, delete and verify removal
  • test/e2e-bootstrap/bootstrap_test.go - Bootstrap E2E test case with ContainerRuntimeConfig

Output

Locally tested all the test cases are Passing :
--- PASS: TestAddKubeletCfgAfterBootstrapKubeletCfg (4.36s) --- PASS: TestAddKubeletCfgAfterBootstrapKubeletCfg/AWS (1.50s) --- PASS: TestAddKubeletCfgAfterBootstrapKubeletCfg/None (1.44s) --- PASS: TestAddKubeletCfgAfterBootstrapKubeletCfg/unrecognized (1.41s) --- PASS: TestContainerRuntimeConfigCreate (2.35s) --- PASS: TestContainerRuntimeConfigCreate/AWS (0.80s) --- PASS: TestContainerRuntimeConfigCreate/None (0.77s) --- PASS: TestContainerRuntimeConfigCreate/unrecognized (0.78s) --- PASS: TestContainerRuntimeConfigOptions (0.00s) --- PASS: TestContainerruntimeConfigResync (6.51s) --- PASS: TestContainerruntimeConfigResync/AWS (2.20s) --- PASS: TestContainerruntimeConfigResync/None (2.13s) --- PASS: TestContainerruntimeConfigResync/unrecognized (2.17s)

Pre-Merge Testing : https://issues.redhat.com/browse/OCPNODE-4113

cc @ngopalak-redhat @bitoku @haircommander

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Comment thread pkg/controller/container-runtime-config/container_runtime_config_controller.go Outdated
Comment thread cmd/machine-config-controller/start.go
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 16, 2026

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Review skipped — only excluded labels are configured. (1)
  • do-not-merge/work-in-progress

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e71ce707-4097-4e26-bae3-1451ca4946e4

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds TLS minimum-version propagation from KubeletConfig (with APIServer fallback) into CRI-O via generated drop-ins, wires KubeletConfig and APIServer informers into the container-runtime-config controller and bootstrap paths, and adds helpers and tests to compute, inject, and validate the TLS setting.

Changes

Cohort / File(s) Summary
Informer wiring / bootstrap
cmd/machine-config-controller/start.go, test/e2e-bootstrap/bootstrap_test.go
Passes KubeletConfigs informer into the containerruntimeconfig controller creation and updates the constructor signature to accept the extra informer.
Bootstrap logic
pkg/controller/bootstrap/bootstrap.go, pkg/controller/container-runtime-config/container_runtime_config_bootstrap.go, pkg/controller/container-runtime-config/container_runtime_config_bootstrap_test.go
Extends RunContainerRuntimeBootstrap to accept kubeletConfigs and apiServer; computes TLS min version (KubeletConfig preferred, APIServer fallback) and conditionally injects a CRI-O TLS drop-in during bootstrap; tests updated to pass new args (nil in unit tests).
Controller runtime and eventing
pkg/controller/container-runtime-config/container_runtime_config_controller.go, pkg/controller/container-runtime-config/container_runtime_config_controller_test.go
Adds KubeletConfig and APIServer informers/listers and event handlers; integrates TLS-min-version computation into per-pool syncs, includes CRI-O TLS drop-in changes in generation/hash checks, and adds helpers to enqueue all CRs on KubeletConfig/APIServer changes; tests wired to include fixtures and listers.
CRI-O TLS helpers and tests
pkg/controller/container-runtime-config/helpers.go, pkg/controller/container-runtime-config/helpers_test.go
Adds CRIODropInFilePathTLSMinVersion, TOML struct, createCRIOTLSDropinFile, and machineConfigCRIOTLSMatchesCurrent plus unit tests for drop-in creation, parsing, and KubeletConfig TLS selection logic.
Controller bootstrap tests and unit updates
pkg/controller/container-runtime-config/container_runtime_config_bootstrap_test.go, pkg/controller/container-runtime-config/container_runtime_config_controller_test.go
Updated calls and fixtures to provide kubeletConfigs/apiServer (or nil) to extended APIs; test informers/listers include KubeletConfig and APIServer and are marked ready.
End-to-end test updates
test/e2e-2of2/ctrcfg_test.go
Adds TestKubeletConfigTLSPropagationToCRIO e2e test validating TLS propagation and rollback; changes fileExists to return (bool, error) and use retry semantics; imports APIServer and KubeletConfig types and exercises KubeletConfig-driven TLS changes.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (1)
pkg/controller/bootstrap/bootstrap.go (1)

256-262: ⚠️ Potential issue | 🟠 Major

Bootstrap still skips CRI-O TLS when no ContainerRuntimeConfig exists.

This new path only runs behind len(crconfigs) > 0, so a fresh cluster that sets TLS through KubeletConfig or APIServer but never defines a ContainerRuntimeConfig gets no CRI-O TLS drop-in at install time. That leaves bootstrap behavior dependent on an unrelated user CR.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/bootstrap/bootstrap.go` around lines 256 - 262, The current
conditional skips CRI-O TLS bootstrap when crconfigs is empty; ensure CRI-O TLS
drop-ins are always generated by invoking the container runtime bootstrap logic
unconditionally (or extracting the TLS-specific logic out of
containerruntimeconfig.RunContainerRuntimeBootstrap into a function that runs
regardless of crconfigs). Specifically, call or refactor
containerruntimeconfig.RunContainerRuntimeBootstrap (or a new
RunContainerRuntimeTLSBootstrap) using b.templatesDir, crconfigs (possibly
empty), cconfig, pools, kconfigs, apiServer so TLS artifacts are produced even
when len(crconfigs)==0, handle the returned configs and error the same way, and
append the resulting containerRuntimeConfigs to configs.
🧹 Nitpick comments (1)
pkg/controller/container-runtime-config/helpers_test.go (1)

1401-1512: Cover the actual bootstrap selection path here.

This test only exercises getTLSMinVersionFromKubeletConfigsBootstrap, so it never hits the APIServer fallback in getTLSMinVersionForPoolBootstrap or the multi-match precedence case. Adding those cases would lock down the new bootstrap behavior much better.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/container-runtime-config/helpers_test.go` around lines 1401 -
1512, The test currently only calls getTLSMinVersionFromKubeletConfigsBootstrap;
extend TestGetTLSMinVersionForPool to call the higher-level
getTLSMinVersionForPoolBootstrap so you exercise the APIServer fallback and
multi-match precedence logic: add a case where kubeletConfigs is nil but an
apiserver TLSProfile (use the APIServer object with TLSSecurityProfile set)
yields the expected min version, and add a case with multiple KubeletConfigs
matching the same pool with conflicting MinTLSVersion values to assert which one
wins (matching the precedence implemented in getTLSMinVersionForPoolBootstrap);
reference getTLSMinVersionForPoolBootstrap,
getTLSMinVersionFromKubeletConfigsBootstrap (and
getTLSMinVersionFromAPIServerBootstrap if present) when locating code to update.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_bootstrap.go`:
- Around line 106-125: The function getTLSMinVersionFromKubeletConfigsBootstrap
is relying on slice iteration order to pick the TLS minimum version, which makes
selection order-dependent; instead, first determine the single effective
KubeletConfig for the pool (by filtering using
metav1.LabelSelectorAsSelector/selector.Matches on
kc.Spec.MachineConfigPoolSelector and kc.Name) and then derive the TLS min
version from that resolved config using ctrlcommon.GetSecurityProfileCiphers.
Implement this by collecting matching kc objects into a slice, pick the
effective one (prefer the explicit resolver if available, e.g. a helper like
ResolveKubeletConfig, or if none exists, choose the newest by
kc.CreationTimestamp), then call GetSecurityProfileCiphers on that single config
and return its minVersion.

In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 875-880: The log claims the TLS version came "from KubeletConfig"
even when getTLSMinVersionForPool falls back to the API server; change the code
to record the actual source and log it. Modify getTLSMinVersionForPool to return
both the tlsMinVersion and a source identifier (e.g., "KubeletConfig" or
"APIServer"), update the call site here to unpack (tlsMinVersion, tlsSource),
and replace the klog.Infof message to include tlsSource (e.g., "Propagating TLS
min version %q from %s to CRI-O drop-in for pool %s"); keep usage of
createCRIOTLSDropinFile and appending to configFileList the same.
- Around line 1458-1489: The function getTLSMinVersionFromKubeletConfigs is
non-deterministic because kubeletConfigs from ctrl.kcLister.List are iterated in
arbitrary order; sort the slice by KubeletConfig.Name (add "sort" to imports)
before the for-loop so the selection of tlsMinVersion is deterministic across
runs—i.e., perform sort.Slice(kubeletConfigs, func(i,j int) bool { return
kubeletConfigs[i].Name < kubeletConfigs[j].Name }) then keep the existing logic
(which uses ctrlcommon.GetSecurityProfileCiphers on each kc) to set
tlsMinVersion.

In `@test/e2e-2of2/ctrcfg_test.go`:
- Around line 417-435: Capture the initial TLS drop-in presence in
tlsDropinFileName (as done with initialTLSExists) and after deleting the
KubeletConfig assert that the node returns to that captured state instead of
always expecting the file to be removed; update the post-delete checks (the
block around the delete and the similar block at 510-524) to call fileExists
and, if initialTLSExists was true, verify the drop-in is present and its
tls_min_version value (via getValueFromCrioConfig) has returned to the expected
pre-test value, otherwise if initialTLSExists was false verify the file was
removed.
- Around line 378-407: The test currently ignores AlreadyExists when creating
fixed-name fixtures (defaultMCConfig via helpers.CreateMC and ctrcfgObj created
with ctrcfgName) which makes runs on reused clusters non-deterministic; change
the creation logic to either generate unique names (e.g., append a
UUID/timestamp to mcName and ctrcfgName before calling helpers.CreateMC and
building ctrcfgObj) or, if you must keep fixed names, call
cs.MachineConfigs().Get and cs.ContainerRuntimeConfigs().Get first and validate
the existing object's spec matches the expected spec (and fail the test or
update the object if it differs) instead of silently continuing when
IsAlreadyExists is returned. Ensure you update cleanupFuncs to delete the actual
resource names you create or validate.

---

Duplicate comments:
In `@pkg/controller/bootstrap/bootstrap.go`:
- Around line 256-262: The current conditional skips CRI-O TLS bootstrap when
crconfigs is empty; ensure CRI-O TLS drop-ins are always generated by invoking
the container runtime bootstrap logic unconditionally (or extracting the
TLS-specific logic out of containerruntimeconfig.RunContainerRuntimeBootstrap
into a function that runs regardless of crconfigs). Specifically, call or
refactor containerruntimeconfig.RunContainerRuntimeBootstrap (or a new
RunContainerRuntimeTLSBootstrap) using b.templatesDir, crconfigs (possibly
empty), cconfig, pools, kconfigs, apiServer so TLS artifacts are produced even
when len(crconfigs)==0, handle the returned configs and error the same way, and
append the resulting containerRuntimeConfigs to configs.

---

Nitpick comments:
In `@pkg/controller/container-runtime-config/helpers_test.go`:
- Around line 1401-1512: The test currently only calls
getTLSMinVersionFromKubeletConfigsBootstrap; extend TestGetTLSMinVersionForPool
to call the higher-level getTLSMinVersionForPoolBootstrap so you exercise the
APIServer fallback and multi-match precedence logic: add a case where
kubeletConfigs is nil but an apiserver TLSProfile (use the APIServer object with
TLSSecurityProfile set) yields the expected min version, and add a case with
multiple KubeletConfigs matching the same pool with conflicting MinTLSVersion
values to assert which one wins (matching the precedence implemented in
getTLSMinVersionForPoolBootstrap); reference getTLSMinVersionForPoolBootstrap,
getTLSMinVersionFromKubeletConfigsBootstrap (and
getTLSMinVersionFromAPIServerBootstrap if present) when locating code to update.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1c1eb8aa-19fe-477d-a5b1-9f8405771346

📥 Commits

Reviewing files that changed from the base of the PR and between 74fef05 and c1054dd.

📒 Files selected for processing (10)
  • cmd/machine-config-controller/start.go
  • pkg/controller/bootstrap/bootstrap.go
  • pkg/controller/container-runtime-config/container_runtime_config_bootstrap.go
  • pkg/controller/container-runtime-config/container_runtime_config_bootstrap_test.go
  • pkg/controller/container-runtime-config/container_runtime_config_controller.go
  • pkg/controller/container-runtime-config/container_runtime_config_controller_test.go
  • pkg/controller/container-runtime-config/helpers.go
  • pkg/controller/container-runtime-config/helpers_test.go
  • test/e2e-2of2/ctrcfg_test.go
  • test/e2e-bootstrap/bootstrap_test.go

Comment thread pkg/controller/container-runtime-config/container_runtime_config_bootstrap.go Outdated
Comment thread pkg/controller/container-runtime-config/container_runtime_config_controller.go Outdated
Comment thread pkg/controller/container-runtime-config/container_runtime_config_controller.go Outdated
Comment thread test/e2e-2of2/ctrcfg_test.go Outdated
Comment thread test/e2e-2of2/ctrcfg_test.go Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 350-352: The handler kubeletConfigUpdated currently enqueues all
ContainerRuntimeConfigs on any update event (including status/metadata-only
changes); change it to cast the old/new params to the KubeletConfig type, return
early if casting fails, and compare only the meaningful spec/config fields (e.g.
the Spec or ConfigData) using a deep-equality check; call
enqueueAllContainerRuntimeConfigs() only when the spec/config actually changed
to avoid full requeues on status-only updates.

In `@test/e2e-2of2/ctrcfg_test.go`:
- Around line 506-509: The poll is using wait.PollImmediate with
getValueFromCrioConfig which currently fails the test on transient node exec
errors, preventing retries; change the approach so the closure passed to
wait.PollImmediate handles transient exec errors instead of causing test fatal:
modify getValueFromCrioConfig (or add a non-fatal variant like
getValueFromCrioConfigNoFail) to return (string, error) rather than calling
t.Fatalf, and update the poll closure in ctrcfg_test.go to call that variant,
returning (false, nil) when it receives a transient exec error so PollImmediate
can retry, and only return an actual error for unrecoverable conditions; apply
the same pattern to the other occurrence around lines 534-535.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6581c020-31df-4b36-980d-d73d4685afa2

📥 Commits

Reviewing files that changed from the base of the PR and between c1054dd and 2d93d44.

📒 Files selected for processing (10)
  • cmd/machine-config-controller/start.go
  • pkg/controller/bootstrap/bootstrap.go
  • pkg/controller/container-runtime-config/container_runtime_config_bootstrap.go
  • pkg/controller/container-runtime-config/container_runtime_config_bootstrap_test.go
  • pkg/controller/container-runtime-config/container_runtime_config_controller.go
  • pkg/controller/container-runtime-config/container_runtime_config_controller_test.go
  • pkg/controller/container-runtime-config/helpers.go
  • pkg/controller/container-runtime-config/helpers_test.go
  • test/e2e-2of2/ctrcfg_test.go
  • test/e2e-bootstrap/bootstrap_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • cmd/machine-config-controller/start.go
  • pkg/controller/container-runtime-config/helpers_test.go

Comment thread pkg/controller/container-runtime-config/container_runtime_config_controller.go Outdated
Comment thread test/e2e-2of2/ctrcfg_test.go Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (3)
test/e2e-2of2/ctrcfg_test.go (2)

378-381: ⚠️ Potential issue | 🟠 Major

Wait for leaked fixtures to disappear before recreating them.

These deletes are async, so the creates on Lines 385, 407, and 467 can still race a terminating object and fail with AlreadyExists on reused clusters. Use unique names per run or poll each resource to NotFound before recreating it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e-2of2/ctrcfg_test.go` around lines 378 - 381, The cleanup deletes for
cs.MachineConfigs(), cs.ContainerRuntimeConfigs(), and cs.KubeletConfigs() are
asynchronous and may still be terminating when the test later calls create,
causing AlreadyExists races; update the test to either use per-run unique
resource names (mcName, ctrcfgName, kcName) or, after calling Delete(...), poll
the API (e.g., using a short retry loop) until Get(...) returns NotFound for
each resource before proceeding to the creates on the later lines (385, 407,
467), ensuring the objects are fully removed prior to recreation.

506-509: ⚠️ Potential issue | 🟠 Major

Keep these waits retryable across reboot windows.

getValueFromCrioConfig still fails the test on transient node exec errors, so these wait.PollImmediate loops cannot actually retry during reboot windows.

Suggested change
+func getValueFromCrioConfigWithError(cs *framework.ClientSet, node corev1.Node, regexKey, confPath string) (string, error) {
+	out, err := helpers.ExecCmdOnNodeWithError(cs, node, "cat", filepath.Join("/rootfs", confPath))
+	if err != nil {
+		return "", err
+	}
+	re := regexp.MustCompile(regexKey)
+	matches := re.FindStringSubmatch(string(out))
+	if len(matches) != 2 || matches[1] == "" {
+		return "", fmt.Errorf("unable to parse %q from %s on node %s", regexKey, confPath, node.Name)
+	}
+	return matches[1], nil
+}
@@
 err = wait.PollImmediate(5*time.Second, 15*time.Minute, func() (bool, error) {
-	val := getValueFromCrioConfig(t, cs, node, `tls_min_version = "(\S+)"`, tlsDropinPath)
-	return val == "VersionTLS12", nil
+	val, err := getValueFromCrioConfigWithError(cs, node, `tls_min_version = "(\S+)"`, tlsDropinPath)
+	if err != nil {
+		return false, nil
+	}
+	return val == "VersionTLS12", nil
 })
@@
 err = wait.PollImmediate(15*time.Second, 25*time.Minute, func() (bool, error) {
 	exists, err := fileExists(t, cs, node, tlsDropinFileName)
 	if err != nil {
 		return false, nil
 	}
 	if !initialTLSExists {
 		return !exists, nil
 	}
 	if !exists {
 		return false, nil
 	}
-	val := getValueFromCrioConfig(t, cs, node, `tls_min_version = "(\S+)"`, tlsDropinPath)
-	return val == initialTLSValue, nil
+	val, err := getValueFromCrioConfigWithError(cs, node, `tls_min_version = "(\S+)"`, tlsDropinPath)
+	if err != nil {
+		return false, nil
+	}
+	return val == initialTLSValue, nil
 })

Also applies to: 523-535

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e-2of2/ctrcfg_test.go` around lines 506 - 509, The PollImmediate
closures fail fast because getValueFromCrioConfig can return an error on
transient node exec failures; modify the anonymous functions used in
wait.PollImmediate (the one calling getValueFromCrioConfig for tls_min_version
and the similar block at the 523-535 range) to treat execution errors as
retryable by returning (false, nil) when getValueFromCrioConfig returns an
error, and only return (true, nil) when the desired value is observed (or return
the error only for terminal/non-transient failures), so the wait loop can
continue across reboot windows.
pkg/controller/container-runtime-config/container_runtime_config_bootstrap.go (1)

108-130: ⚠️ Potential issue | 🟠 Major

A newer matching KubeletConfig still can't clear an older TLS override.

Line 111 filters out TLSSecurityProfile == nil before choosing newest. If the newest matching KubeletConfig removes TLS settings, this helper ignores it and keeps returning an older TLS floor instead of falling back to APIServer/defaults.

Suggested change
 func getTLSMinVersionFromKubeletConfigsBootstrap(kubeletConfigs []*mcfgv1.KubeletConfig, pool *mcfgv1.MachineConfigPool) string {
 	var newest *mcfgv1.KubeletConfig
 	for _, kc := range kubeletConfigs {
-		if kc.Spec.TLSSecurityProfile == nil {
-			continue
-		}
 		selector, err := metav1.LabelSelectorAsSelector(kc.Spec.MachineConfigPoolSelector)
 		if err != nil {
 			klog.Warningf("invalid label selector in KubeletConfig %s: %v", kc.Name, err)
 			continue
 		}
 		if selector.Empty() || !selector.Matches(labels.Set(pool.Labels)) {
 			continue
 		}
 		if newest == nil || kc.CreationTimestamp.After(newest.CreationTimestamp.Time) {
 			newest = kc
 		}
 	}
-	if newest == nil {
+	if newest == nil || newest.Spec.TLSSecurityProfile == nil {
 		return ""
 	}
 	minVersion, _ := ctrlcommon.GetSecurityProfileCiphers(newest.Spec.TLSSecurityProfile)
 	return minVersion
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@pkg/controller/container-runtime-config/container_runtime_config_bootstrap.go`
around lines 108 - 130, The helper getTLSMinVersionFromKubeletConfigsBootstrap
currently skips KubeletConfigs with TLSSecurityProfile==nil which prevents a
newer config that clears TLS settings from being chosen; change the selection
logic to consider all matching kubeletConfigs (do not continue on
TLSSecurityProfile==nil) so the loop can pick the newest matching kc for the
pool, then after newest is chosen if newest.Spec.TLSSecurityProfile == nil
return "" else call
ctrlcommon.GetSecurityProfileCiphers(newest.Spec.TLSSecurityProfile) to obtain
and return the minVersion.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_bootstrap.go`:
- Around line 57-63: The TLS drop-in is only added when rendering user
ContainerRuntimeConfig, so pools without a ContainerRuntimeConfig never get
01-ctrcfg-tlsMinVersion; move the tlsMinVersion retrieval and append logic (use
getTLSMinVersionForPoolBootstrap(pool, kubeletConfigs, apiServer) and
createCRIOTLSDropinFile) out of the ContainerRuntimeConfig-specific block and
into the per-pool rendering path where configFileList is assembled for every
pool (ensuring you still append to the same configFileList and avoid duplicating
entries if a user ContainerRuntimeConfig already provided the same drop-in).

In `@test/e2e-2of2/ctrcfg_test.go`:
- Around line 473-487: The current wait.PollImmediate call only checks for the
TLS drop-in file via fileExists (tlsDropinFileName) so it can return early if
the file is present but still contains the old value; change the predicate
inside wait.PollImmediate to read and validate the tls_min_version value using
getValueFromCrioConfig (with tlsDropinPath and the regex `tls_min_version =
"(\S+)"`) and only return true when that value equals "VersionTLS13", while
still treating transient errors (from getValueFromCrioConfig or exec) as
non-fatal by returning (false, nil) to keep retrying. Ensure the require.Nil
error message remains if the poll times out.

---

Duplicate comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_bootstrap.go`:
- Around line 108-130: The helper getTLSMinVersionFromKubeletConfigsBootstrap
currently skips KubeletConfigs with TLSSecurityProfile==nil which prevents a
newer config that clears TLS settings from being chosen; change the selection
logic to consider all matching kubeletConfigs (do not continue on
TLSSecurityProfile==nil) so the loop can pick the newest matching kc for the
pool, then after newest is chosen if newest.Spec.TLSSecurityProfile == nil
return "" else call
ctrlcommon.GetSecurityProfileCiphers(newest.Spec.TLSSecurityProfile) to obtain
and return the minVersion.

In `@test/e2e-2of2/ctrcfg_test.go`:
- Around line 378-381: The cleanup deletes for cs.MachineConfigs(),
cs.ContainerRuntimeConfigs(), and cs.KubeletConfigs() are asynchronous and may
still be terminating when the test later calls create, causing AlreadyExists
races; update the test to either use per-run unique resource names (mcName,
ctrcfgName, kcName) or, after calling Delete(...), poll the API (e.g., using a
short retry loop) until Get(...) returns NotFound for each resource before
proceeding to the creates on the later lines (385, 407, 467), ensuring the
objects are fully removed prior to recreation.
- Around line 506-509: The PollImmediate closures fail fast because
getValueFromCrioConfig can return an error on transient node exec failures;
modify the anonymous functions used in wait.PollImmediate (the one calling
getValueFromCrioConfig for tls_min_version and the similar block at the 523-535
range) to treat execution errors as retryable by returning (false, nil) when
getValueFromCrioConfig returns an error, and only return (true, nil) when the
desired value is observed (or return the error only for terminal/non-transient
failures), so the wait loop can continue across reboot windows.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 953fc993-dea6-4101-ae5c-67ab524f9977

📥 Commits

Reviewing files that changed from the base of the PR and between 2d93d44 and f9bef9a.

📒 Files selected for processing (10)
  • cmd/machine-config-controller/start.go
  • pkg/controller/bootstrap/bootstrap.go
  • pkg/controller/container-runtime-config/container_runtime_config_bootstrap.go
  • pkg/controller/container-runtime-config/container_runtime_config_bootstrap_test.go
  • pkg/controller/container-runtime-config/container_runtime_config_controller.go
  • pkg/controller/container-runtime-config/container_runtime_config_controller_test.go
  • pkg/controller/container-runtime-config/helpers.go
  • pkg/controller/container-runtime-config/helpers_test.go
  • test/e2e-2of2/ctrcfg_test.go
  • test/e2e-bootstrap/bootstrap_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • pkg/controller/container-runtime-config/container_runtime_config_bootstrap_test.go
  • test/e2e-bootstrap/bootstrap_test.go
  • cmd/machine-config-controller/start.go
  • pkg/controller/container-runtime-config/helpers.go

Comment thread pkg/controller/container-runtime-config/container_runtime_config_bootstrap.go Outdated
Comment thread test/e2e-2of2/ctrcfg_test.go Outdated
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 17, 2026
Comment thread pkg/controller/container-runtime-config/helpers.go
…Config controller

Signed-off-by: Aditi <asahay@redhat.com>
@asahay19
Copy link
Copy Markdown
Author

/payload-job periodic-ci-openshift-machine-config-operator-release-4.22-periodics-e2e-aws-mco-disruptive-techpreview-1of3

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Mar 23, 2026

@asahay19: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-machine-config-operator-release-4.22-periodics-e2e-aws-mco-disruptive-techpreview-1of3

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/c9c38cf0-26b8-11f1-8a86-439171d498a4-0

@ngopalak-redhat
Copy link
Copy Markdown
Contributor

/payload-job periodic-ci-openshift-release-master-nightly-4.22-e2e-aws-ovn-upgrade-fips

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Mar 23, 2026

@ngopalak-redhat: trigger 0 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

@ngopalak-redhat
Copy link
Copy Markdown
Contributor

/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-upgrade-fips

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Mar 23, 2026

@ngopalak-redhat: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-upgrade-fips

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/75472520-26bc-11f1-8107-c74e29aafe8f-0

@asahay19
Copy link
Copy Markdown
Author

/payload-job periodic-ci-openshift-machine-config-operator-release-4.22-periodics-e2e-gcp-mco-disruptive-techpreview-1of3 periodic-ci-openshift-machine-config-operator-release-4.22-periodics-e2e-gcp-mco-disruptive-techpreview-2of3 periodic-ci-openshift-machine-config-operator-release-4.22-periodics-e2e-gcp-mco-disruptive-techpreview-3of3

Hey @bitoku . All the jobs got passed. Even the jobs triggered by me also got passed.

Copy link
Copy Markdown
Contributor

@bitoku bitoku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Mar 24, 2026
@asahay19
Copy link
Copy Markdown
Author

/retest

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Mar 24, 2026

@asahay19: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-gcp-op-2of2 ce2c9c0 link true /test e2e-gcp-op-2of2

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@asahay19
Copy link
Copy Markdown
Author

/retest

@asahay19
Copy link
Copy Markdown
Author

/hold cancel

Latest 4.22 build contains crio 1.35.

asahay@asahay-ubuntu:~/crio-test/machine-config-operator$ oc get clusterversion NAME VERSION AVAILABLE PROGRESSING SINCE STATUS version 4.22.0-0-2026-03-24-092947-test-ci-ln-ql5c42k-latest True False 19h Cluster version is 4.22.0-0-2026-03-24-092947-test-ci-ln-ql5c42k-latest

$ oc debug node/asahay-gcp-dbpcq-worker-a-h886f
Starting pod/asahay-gcp-dbpcq-worker-a-h886f-debug-stdxc ...
To use host binaries, run `chroot /host`. Instead, if you need to access host namespaces, run `nsenter -a -t 1`.
Pod IP: 10.0.128.4
All commands and output from this session will be recorded in container logs, including credentials and sensitive information passed through the command prompt.
If you don't see a command prompt, try pressing enter.
sh-5.1# chroot /host
sh-5.1# crio -version
crio version 1.35.1-5.rhaos4.22.gitc22ce9f.el9

@openshift-ci openshift-ci Bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 25, 2026
@ngopalak-redhat
Copy link
Copy Markdown
Contributor

/lgtm
I reviewed some of the test cases along with the code

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Mar 25, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: asahay19, bitoku, ngopalak-redhat
Once this PR has been reviewed and has the lgtm label, please assign djoshy for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@asahay19
Copy link
Copy Markdown
Author

/verified by ci

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Mar 25, 2026
@openshift-ci-robot
Copy link
Copy Markdown
Contributor

@asahay19: This PR has been marked as verified by ci.

Details

In response to this:

/verified by ci

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Copy link
Copy Markdown
Contributor

openshift-ci-robot commented Mar 25, 2026

@asahay19: This pull request references OCPNODE-4106 which is a valid jira issue.

Details

In response to this:

What this PR does

This PR is part of the OCPSTRAT-2748 and epic OCPNODE-3916 initiative to enforce consistent TLS security policies across all OpenShift components. Today, when a cluster administrator sets a TLS security profile (via KubeletConfig or the cluster APIServer), the kubelet does it but CRI-O does not — leaving CRI-O's streaming and metrics servers with default TLS settings that may not meet the organization's security requirements.

The CRI-O side of this work has already been done and PR has been merged : cri-o/cri-o#9723 , which added support for reading tls_min_version and tls_cipher_suites from CRI-O's drop-in configuration. Both PRs target OpenShift 4.22.
Right now, when an admin sets a TLS profile in KubeletConfig, the kubelet picks it up but CRI-O doesn't. This PR fixes that by generating a CRI-O drop-in file (/etc/crio/crio.conf.d/01-ctrcfg-tls) with the matching TLS version and ciphers. If no KubeletConfig sets TLS, it falls back to the cluster APIServer profile (default: Intermediate/TLS 1.2).

Reference: OCPNODE-4106

Background:

The kubelet controller stays responsible for kubelet config only. The ContainerRuntimeConfig controller now handles all CRI-O configuration including TLS. No API surface changes are needed — TLS settings come from the existing KubeletConfig CR's TLSSecurityProfile field, with the cluster APIServer config as fallback..

  • A dedicated tlsQueue and syncTLSConfig handler run independently of ContainerRuntimeConfig CRs
  • For each MachineConfigPool, the controller resolves TLS settings (KubeletConfig override or APIServer fallback) and creates/updates a dedicated 99--generated-containerruntime-tls MachineConfig
  • KubeletConfig, APIServer, and MCP add events enqueue to the tlsQueue
  • RetryOnConflict wraps the full Get+Create/Update cycle for correctness
  • The same logic applies at bootstrap via a separate RunTLSBootstrap function, called outside the ContainerRuntimeConfig block so TLS is propagated even with zero CTRCFGs
  • Both tls_min_version and tls_cipher_suites are propagated; OpenSSL cipher names are converted to IANA format via GetSecurityProfileCiphers

Changes:

  • Add tlsQueue, tlsWorker, and syncTLSConfig to the ContainerRuntimeConfig controller
  • Watch KubeletConfig, APIServer, and MCP events to trigger TLS re-sync\
  • Shared tlsConfigFromKubeletConfigs helper used by both controller and bootstrap paths; picks the most recently created KubeletConfig when multiple match a pool
  • Separate RunTLSBootstrap function for bootstrap-time TLS MC generation
  • CRIODropInFilePathTLSMinVersion drop-in at /etc/crio/crio.conf.d/01-ctrcfg-tlsMinVersion carries both tls_min_version and tls_cipher_suites
  • Unit tests: TestTLSConfigFromKubeletConfigs, TestRunTLSBootstrap (4 subtests covering fallback, override, cipher propagation, naming/annotations)
  • API-level E2E test TestKubeletConfigTLSPropagationToCRIO with 6 subtests: APIServer fallback, Custom TLS 1.2 with ciphers, Custom TLS 1.3, Intermediate, Modern, and revert on KubeletConfig deletion

Output

Locally tested all the test cases are Passing :
--- PASS: TestAddKubeletCfgAfterBootstrapKubeletCfg (4.36s) --- PASS: TestAddKubeletCfgAfterBootstrapKubeletCfg/AWS (1.50s) --- PASS: TestAddKubeletCfgAfterBootstrapKubeletCfg/None (1.44s) --- PASS: TestAddKubeletCfgAfterBootstrapKubeletCfg/unrecognized (1.41s) --- PASS: TestContainerRuntimeConfigCreate (2.35s) --- PASS: TestContainerRuntimeConfigCreate/AWS (0.80s) --- PASS: TestContainerRuntimeConfigCreate/None (0.77s) --- PASS: TestContainerRuntimeConfigCreate/unrecognized (0.78s) --- PASS: TestContainerRuntimeConfigOptions (0.00s) --- PASS: TestContainerruntimeConfigResync (6.51s) --- PASS: TestContainerruntimeConfigResync/AWS (2.20s) --- PASS: TestContainerruntimeConfigResync/None (2.13s) --- PASS: TestContainerruntimeConfigResync/unrecognized (2.17s)

Pre-Merge Testing : https://issues.redhat.com/browse/OCPNODE-4113

cc @ngopalak-redhat @bitoku @haircommander

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Copy link
Copy Markdown
Contributor

openshift-ci-robot commented Mar 25, 2026

@asahay19: This pull request references OCPNODE-4106 which is a valid jira issue.

Details

In response to this:

What this PR does

This PR is part of the OCPSTRAT-2748 and epic OCPNODE-3916 initiative to enforce consistent TLS security policies across all OpenShift components. Today, when a cluster administrator sets a TLS security profile (via KubeletConfig or the cluster APIServer), the kubelet does it but CRI-O does not — leaving CRI-O's streaming and metrics servers with default TLS settings that may not meet the organization's security requirements.

The CRI-O side of this work has already been done and PR has been merged : cri-o/cri-o#9723 , which added support for reading tls_min_version and tls_cipher_suites from CRI-O's drop-in configuration. Both PRs target OpenShift 4.22.
Right now, when an admin sets a TLS profile in KubeletConfig, the kubelet picks it up but CRI-O doesn't. This PR fixes that by generating a CRI-O drop-in file (/etc/crio/crio.conf.d/01-ctrcfg-tls) with the matching TLS version and ciphers. If no KubeletConfig sets TLS, it falls back to the cluster APIServer profile (default: Intermediate/TLS 1.2).

Reference: OCPNODE-4106

Background:

The kubelet controller stays responsible for kubelet config only. The ContainerRuntimeConfig controller now handles all CRI-O configuration including TLS. No API surface changes are needed — TLS settings come from the existing KubeletConfig CR's TLSSecurityProfile field, with the cluster APIServer config as fallback..

  • A dedicated tlsQueue and syncTLSConfig handler run independently of ContainerRuntimeConfig CRs
  • For each MachineConfigPool, the controller resolves TLS settings (KubeletConfig override or APIServer fallback) and creates/updates a dedicated 99--generated-containerruntime-tls MachineConfig
  • KubeletConfig, APIServer, and MCP add events enqueue to the tlsQueue
  • RetryOnConflict wraps the full Get+Create/Update cycle for correctness
  • The same logic applies at bootstrap via a separate RunTLSBootstrap function, called outside the ContainerRuntimeConfig block so TLS is propagated even with zero CTRCFGs
  • Both tls_min_version and tls_cipher_suites are propagated; OpenSSL cipher names are converted to IANA format via GetSecurityProfileCiphers

Changes:

  • Add tlsQueue, tlsWorker, and syncTLSConfig to the ContainerRuntimeConfig controller
  • Watch KubeletConfig, APIServer, and MCP events to trigger TLS re-sync\
  • Shared tlsConfigFromKubeletConfigs helper used by both controller and bootstrap paths; picks the most recently created KubeletConfig when multiple match a pool
  • Separate RunTLSBootstrap function for bootstrap-time TLS MC generation
  • CRIODropInFilePathTLSMinVersion drop-in at /etc/crio/crio.conf.d/01-ctrcfg-tlsMinVersion carries both tls_min_version and tls_cipher_suites
  • Unit tests: TestTLSConfigFromKubeletConfigs, TestRunTLSBootstrap (4 subtests covering fallback, override, cipher propagation, naming/annotations)
  • API-level E2E test TestKubeletConfigTLSPropagationToCRIO with 6 subtests: APIServer fallback, Custom TLS 1.2 with ciphers, Custom TLS 1.3, Intermediate, Modern, and revert on KubeletConfig deletion

Output

Locally tested all the unit test cases are Passing :
--- PASS: TestRunTLSBootstrap (0.00s) --- PASS: TestRunTLSBootstrap/NilKubeletConfigsAndNilAPIServer_FallsBackToIntermediate --- PASS: TestRunTLSBootstrap/KubeletConfigCustomTLS13_OverridesAPIServer --- PASS: TestRunTLSBootstrap/CustomProfileWithCiphers_PropagatesCipherSuites --- PASS: TestRunTLSBootstrap/MCNameAndAnnotations --- PASS: TestCreateCRIOTLSDropinFile (0.00s) --- PASS: TestCreateCRIOTLSDropinFile/TLS_1.2_generates_drop-in --- PASS: TestCreateCRIOTLSDropinFile/TLS_1.3_generates_drop-in --- PASS: TestCreateCRIOTLSDropinFile/Empty_version_returns_error --- PASS: TestCreateCRIOTLSDropinFile/TLS_1.2_with_cipher_suites --- PASS: TestTLSConfigFromKubeletConfigs (0.00s) --- PASS: TestTLSConfigFromKubeletConfigs/No_KubeletConfigs_returns_empty --- PASS: TestTLSConfigFromKubeletConfigs/KubeletConfig_without_TLS_returns_empty --- PASS: TestTLSConfigFromKubeletConfigs/KubeletConfig_with_TLS_1.3_matching_worker_pool --- PASS: TestTLSConfigFromKubeletConfigs/KubeletConfig_with_TLS_targeting_different_pool_returns_empty --- PASS: TestTLSConfigFromKubeletConfigs/KubeletConfig_with_TLS_matching_master_pool --- PASS: TestTLSConfigFromKubeletConfigs/Multiple_matching_KubeletConfigs:_newest_wins --- PASS: TestTLSConfigFromKubeletConfigs/Custom_profile_with_ciphers_returns_ciphers --- PASS: TestTLSConfigFromKubeletConfigs/Intermediate_profile_returns_ciphers --- PASS: TestKubeletConfigTLSPropagationToCRIO (20.30s) --- PASS: TestKubeletConfigTLSPropagationToCRIO/APIServerFallback --- PASS: TestKubeletConfigTLSPropagationToCRIO/CustomProfileTLS12WithCiphers --- PASS: TestKubeletConfigTLSPropagationToCRIO/CustomProfileTLS13 --- PASS: TestKubeletConfigTLSPropagationToCRIO/IntermediateProfile --- PASS: TestKubeletConfigTLSPropagationToCRIO/ModernProfile --- PASS: TestKubeletConfigTLSPropagationToCRIO/RevertOnKubeletConfigDeletion

e2e test result
``

Pre-Merge Testing : https://issues.redhat.com/browse/OCPNODE-4113
Upgrade Testing : https://redhat.atlassian.net/browse/OCPNODE-4220
Functionality Testing : https://redhat.atlassian.net/browse/OCPNODE-4222

cc @ngopalak-redhat @bitoku @haircommander

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Copy link
Copy Markdown
Contributor

openshift-ci-robot commented Mar 25, 2026

@asahay19: This pull request references OCPNODE-4106 which is a valid jira issue.

Details

In response to this:

What this PR does

This PR is part of the OCPSTRAT-2748 and epic OCPNODE-3916 initiative to enforce consistent TLS security policies across all OpenShift components. Today, when a cluster administrator sets a TLS security profile (via KubeletConfig or the cluster APIServer), the kubelet does it but CRI-O does not — leaving CRI-O's streaming and metrics servers with default TLS settings that may not meet the organization's security requirements.

The CRI-O side of this work has already been done and PR has been merged : cri-o/cri-o#9723 , which added support for reading tls_min_version and tls_cipher_suites from CRI-O's drop-in configuration. Both PRs target OpenShift 4.22.
Right now, when an admin sets a TLS profile in KubeletConfig, the kubelet picks it up but CRI-O doesn't. This PR fixes that by generating a CRI-O drop-in file (/etc/crio/crio.conf.d/01-ctrcfg-tls) with the matching TLS version and ciphers. If no KubeletConfig sets TLS, it falls back to the cluster APIServer profile (default: Intermediate/TLS 1.2).

Reference: OCPNODE-4106

Background:

The ContainerRuntimeConfig controller now has a dedicated tlsQueue and syncTLSConfig handler, following the same pattern as kubelet_config_nodes.go. It watches KubeletConfig, APIServer, and MCP events, and for each pool generates a 99--generated-containerruntime-tls MachineConfig with the CRI-O drop-in containing both tls_min_version and tls_cipher_suites .
This runs independently of ContainerRuntimeConfig CRs, so every pool gets TLS config even without any CTRCFGs defined. KubeletConfig takes priority; if none is set, it falls back to the APIServer profile (default: Intermediate/TLS 1.2). When multiple KubeletConfigs target the same pool, the most recently created one (greatertimestamp) wins.
At bootstrap, a separate RunTLSBootstrap function generates TLS MachineConfigs for all pools, called outside the ContainerRuntimeConfig block so TLS is propagated from the start.

The controller writes a TOML drop-in at /etc/crio/crio.conf.d/01-ctrcfg-tls containing the [crio.api] section with tls_min_version and tls_cipher_suites. CRI-O reads this on startup (or reload) and applies the settings to its streaming and metrics servers. For example, an Intermediate profile produces tls_min_version = "VersionTLS12" with the full set of IANA cipher suites, while a Modern profile produces tls_min_version = "VersionTLS13" with only TLS 1.3 ciphers.

Output

Locally tested all the unit test cases are Passing :
--- PASS: TestRunTLSBootstrap (0.00s) --- PASS: TestRunTLSBootstrap/NilKubeletConfigsAndNilAPIServer_FallsBackToIntermediate --- PASS: TestRunTLSBootstrap/KubeletConfigCustomTLS13_OverridesAPIServer --- PASS: TestRunTLSBootstrap/CustomProfileWithCiphers_PropagatesCipherSuites --- PASS: TestRunTLSBootstrap/MCNameAndAnnotations --- PASS: TestCreateCRIOTLSDropinFile (0.00s) --- PASS: TestCreateCRIOTLSDropinFile/TLS_1.2_generates_drop-in --- PASS: TestCreateCRIOTLSDropinFile/TLS_1.3_generates_drop-in --- PASS: TestCreateCRIOTLSDropinFile/Empty_version_returns_error --- PASS: TestCreateCRIOTLSDropinFile/TLS_1.2_with_cipher_suites --- PASS: TestTLSConfigFromKubeletConfigs (0.00s) --- PASS: TestTLSConfigFromKubeletConfigs/No_KubeletConfigs_returns_empty --- PASS: TestTLSConfigFromKubeletConfigs/KubeletConfig_without_TLS_returns_empty --- PASS: TestTLSConfigFromKubeletConfigs/KubeletConfig_with_TLS_1.3_matching_worker_pool --- PASS: TestTLSConfigFromKubeletConfigs/KubeletConfig_with_TLS_targeting_different_pool_returns_empty --- PASS: TestTLSConfigFromKubeletConfigs/KubeletConfig_with_TLS_matching_master_pool --- PASS: TestTLSConfigFromKubeletConfigs/Multiple_matching_KubeletConfigs:_newest_wins --- PASS: TestTLSConfigFromKubeletConfigs/Custom_profile_with_ciphers_returns_ciphers --- PASS: TestTLSConfigFromKubeletConfigs/Intermediate_profile_returns_ciphers --- PASS: TestKubeletConfigTLSPropagationToCRIO (20.30s) --- PASS: TestKubeletConfigTLSPropagationToCRIO/APIServerFallback --- PASS: TestKubeletConfigTLSPropagationToCRIO/CustomProfileTLS12WithCiphers --- PASS: TestKubeletConfigTLSPropagationToCRIO/CustomProfileTLS13 --- PASS: TestKubeletConfigTLSPropagationToCRIO/IntermediateProfile --- PASS: TestKubeletConfigTLSPropagationToCRIO/ModernProfile --- PASS: TestKubeletConfigTLSPropagationToCRIO/RevertOnKubeletConfigDeletion

e2e test result
asahay@asahay-ubuntu:~/crio-test/machine-config-operator$ go test -v -timeout 30m -run TestKubeletConfigTLSPropagationToCRIO ./test/e2e-2of2/... === RUN TestKubeletConfigTLSPropagationToCRIO ctrcfg_test.go:400: Created empty MCP "tls-api-test" (no matching nodes) ctrcfg_test.go:411: Found dedicated TLS MC "99-tls-api-test-generated-containerruntime-tls" === RUN TestKubeletConfigTLSPropagationToCRIO/APIServerFallback ctrcfg_test.go:506: Verified: APIServer fallback includes cipher suites from Intermediate profile === RUN TestKubeletConfigTLSPropagationToCRIO/CustomProfileTLS12WithCiphers ctrcfg_test.go:529: Verified: Custom profile with TLS 1.2 + specific ciphers propagated to CRI-O === RUN TestKubeletConfigTLSPropagationToCRIO/CustomProfileTLS13 ctrcfg_test.go:542: Verified: Custom profile with TLS 1.3 propagated to CRI-O === RUN TestKubeletConfigTLSPropagationToCRIO/IntermediateProfile ctrcfg_test.go:554: Verified: Intermediate profile propagated TLS 1.2 + cipher suites to CRI-O === RUN TestKubeletConfigTLSPropagationToCRIO/ModernProfile ctrcfg_test.go:562: Verified: Modern profile propagated TLS 1.3 to CRI-O === RUN TestKubeletConfigTLSPropagationToCRIO/RevertOnKubeletConfigDeletion ctrcfg_test.go:573: Verified: Reverted to APIServer fallback with cipher suites after KubeletConfig deletion --- PASS: TestKubeletConfigTLSPropagationToCRIO (20.34s) --- PASS: TestKubeletConfigTLSPropagationToCRIO/APIServerFallback (0.02s) --- PASS: TestKubeletConfigTLSPropagationToCRIO/CustomProfileTLS12WithCiphers (0.03s) --- PASS: TestKubeletConfigTLSPropagationToCRIO/CustomProfileTLS13 (2.04s) --- PASS: TestKubeletConfigTLSPropagationToCRIO/IntermediateProfile (2.05s) --- PASS: TestKubeletConfigTLSPropagationToCRIO/ModernProfile (2.03s) --- PASS: TestKubeletConfigTLSPropagationToCRIO/RevertOnKubeletConfigDeletion (2.03s) PASS ok github.com/openshift/machine-config-operator/test/e2e-2of2 20.415s

Pre-Merge Testing : https://issues.redhat.com/browse/OCPNODE-4113
Upgrade Testing : https://redhat.atlassian.net/browse/OCPNODE-4220
Functionality Testing : https://redhat.atlassian.net/browse/OCPNODE-4222

cc @ngopalak-redhat @bitoku @haircommander

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@bitoku
Copy link
Copy Markdown
Contributor

bitoku commented Mar 27, 2026

/hold
until internal discussion settled.

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. verified Signifies that the PR passed pre-merge verification criteria

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants