coco: initial integration for Confidential Containers and Trustee operators #80
beraldoleal wants to merge 17 commits into validatedpatterns:main
29c9c84 to 341c962
5074bb3 to 74e2c74
8b9a6cf to 67509bc
sabre1041
left a comment
This is a good start. A few issues arose during the review:
- ZTWIM GA reconciles changes, so the imperative configurations applied here are reverted immediately.
- There is no mention of applying labels to nodes. Without them, the sample workload fails to be scheduled.
- There should be a callout about the instance types that may need to be configured. I tested in the eastasia region and the configured instance was not available.
- Additional comments inline.
We fixed the reconciliation issue by adding the CREATE_ONLY_MODE=true env var to the ZTWIM operator via the OLM subscription config in values-coco-dev.yaml.
I will add a proper CONFIDENTIAL-CONTAINERS.md file.
67509bc to 5c16b3c
Hey @sabre1041, @butler54, @bpradipt, let's give this a second shot! I addressed all the comments from the previous review. Feel free to reopen any or add new ones. This was tested on Azure with AMD SEV-SNP (DCasv6 / Genoa), OCP 4.20.8, using the ZTWIM operator stable-v1 channel, sandbox operator v1.11.0, and trustee operator v1.0.0. The chart references still point to custom branches while we wait for @butler54's PRs. Once those PRs merge, I will update the references. Hopefully that won't be a blocker for review.
a67a701 to c40eff1
@sabre1041 @butler54 @bpradipt no more fork references. It now uses the official validatedpatterns/charts release versions.
butler54
left a comment
There are a few hardcoded vars that definitely need to change.
The biggest question is the use of the imperative framework to generate certificates. If this can be moved to generation in cert-manager, I think that would be more 'kube friendly'.
The justification for this is that the imperative framework is always the second-to-last option (the last option being work done on the developer workstation).
---
Flag for future work - We should create an issue to add this playbook to the VP ansible collection. @mhjacks
# Generate SPIRE x509pop certificates for CoCo integration
# Creates CA certificate and agent certificates for all workloads

- name: Generate SPIRE x509pop certificates
I want to check whether we should be using cert-manager or the Ansible approach here.
To me it would make a lot more sense (if we can) to use cert-manager. Then we'd have less janking around to get things done (still a non-zero amount of janking).
Agreed, cert-manager would be cleaner. I went the imperative route to unblock testing faster. The main friction is that SPIRE expects the CA as a ConfigMap and cert-manager outputs Secrets. I will explore this in a follow-up PR.
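To make the Secret-versus-ConfigMap friction concrete, here is a minimal Python sketch of the conversion step that would be needed. It assumes the CA sits in the Secret under a `ca.crt` key, as cert-manager-issued Secrets commonly provide. The function name, the resource names, and the idea of doing this with plain manifest dictionaries are all hypothetical illustrations, not something this PR implements.

```python
import base64


def ca_configmap_from_secret(secret: dict, name: str, namespace: str,
                             key: str = "ca.crt") -> dict:
    """Build a ConfigMap manifest holding the CA found in a Secret manifest.

    Assumption: the Secret stores the CA under `key`. Secret data values are
    base64-encoded, while ConfigMap data values are plain text.
    """
    encoded = secret["data"][key]
    pem = base64.b64decode(encoded).decode("utf-8")
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "namespace": namespace},
        "data": {key: pem},
    }
```

The sketch only covers the data reshaping. Keeping the ConfigMap in sync when the certificate rotates is a separate concern that a follow-up would still need to address.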
echo "=== DEBUG: Testing network connectivity to KBS (cluster-internal) ==="
curl -k -I https://kbs-service.trustee-operator-system.svc.cluster.local:8080 2>&1 | head -20
echo "=== DEBUG: Testing network connectivity to KBS (public route) ==="
curl -k -I https://kbs.apps.bleal-vp.azure.sandboxedcontainers.com 2>&1 | head -20
This is hardcoded. You should be able to template this out using the VP-provided var for the cluster domain.
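As a rough illustration of the suggestion, the route URL can be derived from the cluster domain instead of being written out. The Python sketch below only shows the string construction. The function name and the `kbs` host prefix default are assumptions for illustration, and the thread does not say which VP variable carries the cluster domain, so it is passed in as a plain argument here.

```python
def kbs_route_url(cluster_domain: str, host_prefix: str = "kbs") -> str:
    """Return the public KBS route URL for a given cluster apps domain.

    Assumption: the route host is `<host_prefix>.<cluster_domain>`, which
    matches the hardcoded URL in the snippet under review.
    """
    domain = cluster_domain.strip().strip(".")
    if not domain:
        raise ValueError("cluster_domain must not be empty")
    return f"https://{host_prefix}.{domain}"
```

In the chart itself the same idea would presumably be expressed as a template expression rather than Python.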
ansible.builtin.shell: |
  hash=$(sha256sum "{{ rendered_path }}" | cut -d' ' -f1)
  initial_pcr=0000000000000000000000000000000000000000000000000000000000000000
  echo -n "$initial_pcr$hash" | python3 -c "import sys,hashlib; print(hashlib.sha256(bytes.fromhex(sys.stdin.read())).hexdigest())"
register: pcr8_hash
Did you get the python script to demonstrably work? This needs to be backported into the coco-pattern to avoid a custom container.
Yes, this works. And that is the plan.
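For readers following the PCR8 computation, the shell-plus-Python one-liner above amounts to a single extend operation: SHA-256 over the all-zero initial register concatenated with the SHA-256 digest of the rendered file. A standalone Python sketch of the same arithmetic follows. The function names are hypothetical, and it takes the file content as bytes rather than reading `rendered_path`.

```python
import hashlib

# All-zero initial value of a SHA-256 PCR bank register (32 bytes, hex-encoded).
INITIAL_PCR_HEX = "00" * 32


def extend_pcr(current_pcr_hex: str, measurement_hex: str) -> str:
    """Return SHA-256(current_pcr || measurement) as a hex string."""
    data = bytes.fromhex(current_pcr_hex + measurement_hex)
    return hashlib.sha256(data).hexdigest()


def pcr8_for_content(content: bytes) -> str:
    """Compute the expected PCR8 value after measuring `content` once."""
    measurement_hex = hashlib.sha256(content).hexdigest()
    return extend_pcr(INITIAL_PCR_HEX, measurement_hex)
```

Calling `pcr8_for_content(open(path, "rb").read())` on the rendered file should give the same hex digest as the task above, since both hash the raw bytes of the zero register followed by the file digest.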
Signed-off-by: Andrew Block <andy.block@gmail.com>
Signed-off-by: Manuel Lorenzo <mlorenzofr@redhat.com>
…idatedpatterns#89)
* Initial architecture diagrams for the ZTVP. It shows the layered approach for managing the development of independent use cases and components, and the initial use case logical and schematic structures. Includes the draw.io source and pages rendered as png images at the time of commit.
* Updated architecture diagrams for the ZTVP. Corrections added based on PR feedback: 1) added missing sidecar; 2) added duplicate connections; 3) moved csi driver into app space; 4) minor technical corrections for spelling, stray drawing objects, aligning objects. Most updates to use case 12 logical and schematic drawings. Includes the draw.io source and pages rendered as png images at the time of commit.
…patterns#93) Signed-off-by: Manuel Lorenzo <mlorenzofr@redhat.com>
Signed-off-by: Andrew Block <andy.block@gmail.com>
…ns#94)
Bumps [ansible/ansible-lint](https://github.com/ansible/ansible-lint) from 25 to 26.
- [Release notes](https://github.com/ansible/ansible-lint/releases)
- [Commits](ansible/ansible-lint@v25...v26)

---
updated-dependencies:
- dependency-name: ansible/ansible-lint
  dependency-version: '26'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/checkout](https://github.com/actions/checkout) from 5 to 6.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@v5...v6)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…tedpatterns#83)
* feat: add global certificate management with secretRef and extraValueFiles

Implements comprehensive certificate management for ZTVP:

Certificate Sources:
- Primary custom CA via secretRef (customCA.secretRef)
- Additional certificates via extraValueFiles (overrides/values-ztvp-certificates.yaml)
- Auto-detected proxy CA from trusted-ca-bundle (openshift-config-managed)
- Auto-detected ingress CA from all IngressControllers (not just default)
- Auto-detected service CA from openshift-service-ca

Features:
- Initial Job for immediate certificate extraction on install
- CronJob for periodic certificate rotation (daily at 2 AM)
- Warning and continue behavior for missing additional certificates
- Automatic rollout restart for consuming applications (labeled strategy)
- ACM Policy distribution to target namespaces

Configuration:
- Use extraValueFiles for complex nested structures (additionalCertificates, rollout)
- Simple overrides via values-hub.yaml for flat key-value pairs

Signed-off-by: Min Zhang <minzhang@redhat.com>

* feat: add secure Java truststore for qtodo with Vault integration

- Add init container to convert CA bundle to Java JKS truststore
- Add ExternalSecret for truststore password from Vault
- Configure JAVA_TOOL_OPTIONS for JVM-level SSL truststore
- Mount ztvp-trusted-ca ConfigMap for CA certificates
- Enable truststore by default when SPIRE is enabled

Signed-off-by: Min Zhang <minzhang@redhat.com>

* feat: use PKCS12 truststore with jshell bulk import for qtodo

- Use jshell with Java KeyStore API for truststore creation
- Single JVM startup instead of 149 keytool invocations
- Bulk import all CA certificates in one operation
- Significantly faster truststore creation (~2-5s vs 30-60s)

Signed-off-by: Min Zhang <minzhang@redhat.com>

* refactor: move truststore creation script to separate Java file

Move inline jshell script to a proper Java file (qtodo-truststore.java) loaded via ConfigMap. This improves code clarity and maintainability.

Changes:
- Add charts/qtodo/files/qtodo-truststore.java with proper Java class
- Add qtodo-truststore-config.yaml to load the Java file
- Update app-deployment.yaml to use 'java /usr/local/bin/qtodo-truststore.java'
- Update superlinter.yml to exclude the Java file from linting
- Add conditionals for truststore.enabled to control init container and volumes
- Refactor Helm conditionals to use shorter syntax (remove 'eq ... true')

Suggested-by: Manuel Lorenzo <mlorenzofr@redhat.com>
Signed-off-by: Min Zhang <minzhang@redhat.com>

---------

Signed-off-by: Min Zhang <minzhang@redhat.com>
…patterns#96)
Reorganize Vault secrets into segmented paths for least-privilege access:

Secret Path Structure:
- apps/<app-name>/ - Application-specific secrets (e.g., apps/qtodo/)
- hub/infra/<component>/ - Infrastructure secrets (e.g., hub/infra/keycloak/)
- global/ - Shared secrets (unchanged)
- hub/ - Hub-level secrets (unchanged)

Policy Naming Convention:
- K8s auth policies: <path>-k8s-secret (for ClusterSecretStore/ExternalSecrets)
- JWT auth policies: <path>-jwt-secret (for SPIFFE workload identity)

Changes:
- Update vaultPrefixes in values-secret.yaml.template for new paths
- Update ExternalSecret references in chart values.yaml files
- Add JWT policies to values-hub.yaml for SPIFFE workload authentication
- Pass JWT policies to vault-config-jwt ansible task via vault-utils.sh

This enables application-level secret isolation where each app only has access to its own secrets, following zero-trust principles.

Depends on: rhvp/rhvp.cluster_utils PR for auto-creating K8s auth policies

Signed-off-by: Min Zhang <minzhang@redhat.com>
…generated pull secrets (validatedpatterns#97) Signed-off-by: Andrew Block <andy.block@gmail.com>
This adds initial integration for Confidential Containers and Trustee Operators as a separate clustergroup. Co-authored-by: Chris Butler <chris.butler@redhat.com> Signed-off-by: Beraldo Leal <bleal@redhat.com>
Add automated configuration for SPIRE Server x509pop NodeAttestor plugin required for CoCo peer-pods attestation. CoCo peer-pods run on untrusted cloud infrastructure. Using k8s_psat would require trusting the cloud provider's cluster. Instead, pods perform hardware TEE attestation to KBS to obtain x509 certificates as cryptographic proof of running in genuine confidential hardware, then use x509pop to register with SPIRE. The Red Hat SPIRE Operator's SpireServer CRD does not expose x509pop configuration, requiring a ConfigMap patch via this imperative job. Signed-off-by: Beraldo Leal <bleal@redhat.com>
Add hello-coco Helm chart demonstrating SPIRE agent deployment in confidential containers using x509pop node attestation. The chart deploys a test pod in a CoCo peer-pod (confidential VM with AMD SNP or Intel TDX) that fetches SPIRE agent certificates from KBS after TEE attestation, establishing hardware as the root of trust instead of Kubernetes. The pod contains three containers: init container fetches sealed secrets from KBS, SPIRE agent uses x509pop for node attestation, and test workload receives SPIFFE SVIDs via unix attestation. This validates the complete integration flow between ZTVP and CoCo components. Note: This could be dropped, if we stick with only the todoapp. Signed-off-by: Beraldo Leal <bleal@redhat.com>
Signed-off-by: Beraldo Leal <bleal@redhat.com>
Signed-off-by: Beraldo Leal <bleal@redhat.com>
Basic markdown file with deployment steps. Signed-off-by: Beraldo Leal <bleal@redhat.com>
c40eff1 to 3436602
See individual commits for messages.