diff --git a/README.md b/README.md index 382a6a72..0bfac305 100644 --- a/README.md +++ b/README.md @@ -2,28 +2,34 @@ ## Introduction -This deployment is based on the `validated pattern framework`, using GitOps for -seamless provisioning of all operators and applications. It deploys a Chatbot -application that harnesses the power of Large Language Models (LLMs) combined +This deployment uses the [**Validated Patterns**](https://validatedpatterns.io/) framework, +taking advantage of GitOps for seamless provisioning of all operators and applications. +It deploys a Chatbot application that harnesses the power of Large Language Models (LLMs) combined with the Retrieval-Augmented Generation (RAG) framework. -The pattern uses the [Red Hat OpenShift AI](https://www.redhat.com/en/technologies/cloud-computing/openshift/openshift-ai) to deploy and serve LLM models at scale. +The pattern uses [**Red Hat OpenShift AI**](https://www.redhat.com/en/technologies/cloud-computing/openshift/openshift-ai) to deploy and serve LLM models at scale. -The application uses either the [EDB Postgres for Kubernetes operator](https://catalog.redhat.com/software/container-stacks/detail/5fb41c88abd2a6f7dbe1b37b) -(default) or Redis to store embeddings of Red Hat products, running on Red Hat -OpenShift to generate project proposals for specific Red Hat products. +By default, this pattern uses [**pgvector**](https://github.com/pgvector/pgvector) as the RAG DB backend. +[**EDB Postgres**](https://www.enterprisedb.com/docs/edb-postgres-ai/latest/ai-factory/vector-engine/), +[**Redis**](https://redis.io/docs/latest/develop/get-started/vector-database/), +[**Elasticsearch**](https://www.elastic.co/elasticsearch/vector-database), and +[**Microsoft SQL Server**](https://learn.microsoft.com/en-us/sql/sql-server/ai/vectors?view=sql-server-ver17) +(either a local deployment as part of the pattern or an existing SQL Server DB on Azure) are also options +for RAG DB backends. + +This pattern populates your chosen RAG DB with documents relating to Red Hat OpenShift AI for the purpose +of generating project proposals. ## Pre-requisites - Podman - Red Hat Openshift cluster running in AWS. Supported regions are : us-east-1 us-east-2 us-west-1 us-west-2 ca-central-1 sa-east-1 eu-west-1 eu-west-2 eu-west-3 eu-central-1 eu-north-1 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-southeast-1 ap-southeast-2 ap-south-1. -- GPU Node to run Hugging Face Text Generation Inference server on Red Hat OpenShift cluster. - Create a fork of the [rag-llm-gitops](https://github.com/validatedpatterns/rag-llm-gitops.git) Git repository. - **EDB Postgres Operator Credentials** (Required only if you select EDB): The EDB Postgres for Kubernetes operator from the certified-operators catalog requires authentication to pull images from `docker.enterprisedb.com`. You will need to: 1. Register for a free trial account at [EDB Registration](https://www.enterprisedb.com/accounts/register) 2. Obtain your subscription token from [EDB Repos Downloads](https://www.enterprisedb.com/repos-downloads) 3. Add the token to your `values-secret.yaml` file during configuration (see below) - + For more details, see the [EDB Installation Documentation](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/installation_upgrade/). ## Demo Description & Architecture @@ -111,7 +117,7 @@ cd rag-llm-gitops ### Configuring model -This pattern deploys [IBM Granite 3.1-8B-Instruct](https://huggingface.co/ibm-granite/granite-3.1-8b-instruct) out of box. Run the following command to configure vault with the model ID. +This pattern deploys [IBM Granite 3.3-8B-Instruct](https://huggingface.co/ibm-granite/granite-3.3-8b-instruct) out of box. Run the following command to configure vault with the model ID. ```sh # Copy values-secret.yaml.template to ~/values-secret-rag-llm-gitops.yaml. @@ -120,9 +126,12 @@ This pattern deploys [IBM Granite 3.1-8B-Instruct](https://huggingface.co/ibm-gr cp values-secret.yaml.template ~/values-secret-rag-llm-gitops.yaml ``` -To deploy a model that can requires an Hugging Face token, grab the [Hugging Face token](https://huggingface.co/settings/tokens) and accept the terms and conditions on the model page. Edit ~/values-secret-rag-llm-gitops.yaml to replace the `model Id` and the `Hugging Face` token. +To deploy a model that requires a Hugging Face token, grab the [Hugging Face token](https://huggingface.co/settings/tokens) and accept the terms and conditions on the model page. Update the `hftoken` secret in +`~/values-secret-rag-llm-gitops.yaml` and edit the value of `.global.model.vllm` in +[`values-global.yaml`](./values-global.yaml) to your desired model. -**IMPORTANT**: If you are using EDB Postgres for Kubernetes, you must add your EDB subscription token to the `values-secret.yaml` file: +**IMPORTANT**: If you are using EDB Postgres for Kubernetes, you must add your EDB subscription token to +`~/values-secret-rag-llm-gitops.yaml`: ```sh secrets: @@ -130,43 +139,36 @@ secrets: fields: - name: hftoken value: null - - name: modelId - value: "ibm-granite/granite-3.1-8b-instruct" - name: edb fields: - name: token value: "YOUR_EDB_TOKEN_HERE" # Replace with your EDB subscription token description: EDB subscription token for pulling certified operator images - - name: minio - fields: - - name: MINIO_ROOT_USER - value: minio - - name: MINIO_ROOT_PASSWORD - value: null - onMissingValue: generate ``` The EDB token is synced into Vault and then used by External Secrets to create the required pull secret (`postgresql-operator-pull-secret`) in `openshift-operators`. Without this token, the EDB operator will fail to pull its container image and the database will not be created. -### Provision GPU MachineSet +If you are using PGVector or SQL Server, you can update the password in this file. Otherwise, an autogenerated +password is used. -As a pre-requisite to deploy the application using the validated pattern, GPU nodes should be provisioned along with Node Feature Discovery Operator and NVIDIA GPU operator. To provision GPU Nodes +### Provision GPU MachineSet -Following command will take about 5-10 minutes. +As a pre-requisite to deploy the application using this Validated Pattern, a GPU node needs to be provisioned. +To provision the GPU node on AWS: ```sh ./pattern.sh make create-gpu-machineset ``` -Wait till the nodes are provisioned and running. +Wait till the node is provisioned and running. ![Diagram](images/nodes.png) -Alternatiely, follow the [instructions](./GPU_provisioning.md) to manually install GPU nodes, Node Feature Discovery Operator and NVIDIA GPU operator. +Alternatiely, follow the [instructions](./GPU_provisioning.md) to manually install the GPU node. ### Deploy application -***Note:**: This pattern supports four types of vector databases: PGVECTOR (local chart), EDB Postgres for Kubernetes, Elasticsearch, and Redis. By default the pattern will deploy PGVECTOR as a vector DB. To deploy EDB, set `global.db.type` to `EDB` in [values-global.yaml](./values-global.yaml). +***Note:**: This pattern supports five types of vector databases: pgvector, EDB Postgres for Kubernetes, Elasticsearch, Redis, and SQL Server. By default the pattern will deploy pgvector as the RAG DB. To deploy EDB, set `global.db.type` to `EDB` in [values-global.yaml](./values-global.yaml). ```yaml --- @@ -176,14 +178,28 @@ global: useCSV: false syncPolicy: Automatic installPlanApproval: Automatic -# Possible value for db.type = [REDIS, EDB, ELASTIC, PGVECTOR] + # Possible values for RAG vector DB db.type: + # REDIS -> Redis (Local chart deploy) + # EDB -> PGVector via EDB operator (Local chart deploy) + # PGVECTOR -> PGVector (Local Postgres chart deploy) + # ELASTIC -> Elasticsearch (Local chart deploy) + # MSSQL -> MS SQL Server (Local chart deploy) + # AZURESQL -> Azure SQL (Pre-existing in Azure) db: index: docs - type: PGVECTOR # <--- Default is PGVECTOR. Use EDB, REDIS, or ELASTIC as needed. + type: PGVECTOR + # Models used by the inference service (should be a HuggingFace model ID) + model: + vllm: ibm-granite/granite-3.3-8b-instruct + embedding: sentence-transformers/all-mpnet-base-v2 + + storageClass: gp3-csi + main: clusterGroupName: hub multiSourceConfig: enabled: true + clusterGroupChartVersion: 0.9.* ``` Following commands will take about 15-20 minutes diff --git a/charts/all/rag-llm/charts/pgvector/templates/secret.yaml b/charts/all/rag-llm/charts/pgvector/templates/secret.yaml index d0a92934..dbfeb3a7 100644 --- a/charts/all/rag-llm/charts/pgvector/templates/secret.yaml +++ b/charts/all/rag-llm/charts/pgvector/templates/secret.yaml @@ -1,15 +1,35 @@ -{{- if and (eq .Values.global.db.type "PGVECTOR") .Values.secret.create }} -kind: Secret -apiVersion: v1 +{{- if eq .Values.global.db.type "PGVECTOR" }} +apiVersion: "external-secrets.io/v1beta1" +kind: ExternalSecret metadata: - name: vectordb-app - labels: - {{- include "pgvector.labels" . | nindent 4 }} -data: - username: {{ .Values.secret.user | b64enc | quote }} - password: {{ .Values.secret.password | b64enc | quote }} - host: {{ default (include "pgvector.fullname" .) .Values.secret.host | b64enc | quote }} - port: {{ .Values.secret.port | b64enc | quote }} - dbname: {{ .Values.secret.dbname | b64enc | quote }} -type: Opaque + name: pgvector-external-secret +spec: + refreshInterval: 15s + secretStoreRef: + name: {{ .Values.secretStore.name }} + kind: {{ .Values.secretStore.kind }} + target: + name: vectordb-app + template: + type: Opaque + engineVersion: v2 + data: + username: "{{ `{{ .username }}` }}" + password: "{{ `{{ .password }}` }}" + dbname: "{{ `{{ .dbname }}` }}" + host: {{ include "pgvector.fullname" . }} + port: "{{ .Values.service.port }}" + data: + - secretKey: username + remoteRef: + key: {{ .Values.secretStore.key }} + property: username + - secretKey: password + remoteRef: + key: {{ .Values.secretStore.key }} + property: password + - secretKey: dbname + remoteRef: + key: {{ .Values.secretStore.key }} + property: dbname {{- end }} diff --git a/charts/all/rag-llm/charts/pgvector/values.yaml b/charts/all/rag-llm/charts/pgvector/values.yaml index afa168c1..6a83f9e2 100644 --- a/charts/all/rag-llm/charts/pgvector/values.yaml +++ b/charts/all/rag-llm/charts/pgvector/values.yaml @@ -73,18 +73,10 @@ volumeMounts: nodeSelector: {} affinity: {} -# Secret configuration for pgvector -# These values can be overridden from a parent chart or via --set flags -# Example: helm install ... -# --set pgvector.secret.user=myuser - -# --set pgvector.secret.password=mypass -secret: - create: true - user: postgres - password: rag_password - dbname: rag_blueprint - host: pgvector - port: "5432" +secretStore: + name: vault-backend + kind: ClusterSecretStore + key: secret/data/hub/pgvector extraDatabases: [] #- name: test_db diff --git a/values-secret.yaml.template b/values-secret.yaml.template index d38e970d..7e5f48fd 100644 --- a/values-secret.yaml.template +++ b/values-secret.yaml.template @@ -2,15 +2,8 @@ # https://github.com/validatedpatterns/common/tree/main/ansible/roles/vault_utils#values-secret-file-format version: "2.0" -# Ideally you NEVER COMMIT THESE VALUES TO GIT (although if all passwords are -# automatically generated inside the vault this should not really matter) - -# In order to use huggingface models, you will need to -# provide your token as a value for hftoken below. - -# EDB Postgres Operator requires authentication to pull images from docker.enterprisedb.com -# Register for a free trial at: https://www.enterprisedb.com/accounts/register -# Get your token from: https://www.enterprisedb.com/repos-downloads +# Do not update sensitive secrets (db credentials) in this file and commit to git. +# Copy this template file to ~/values-secret-rag-llm-gitops and update secrets in your home directory backingStore: vault @@ -22,15 +15,37 @@ vaultPolicies: rule "charset" { charset = "0123456789" min-chars = 1 } secrets: + # This must be set to use models requiring huggingface authentication + # The default model (ibm-granite/granite-3.3-8b-instruct) does not require authentication - name: hfmodel fields: - name: hftoken value: null + + # Only used when .global.db is set to PGVECTOR in values-global.yaml + - name: pgvector + fields: + - name: username + value: postgres + - name: password + onMissingValue: generate + override: true + vaultPolicy: basicPolicy + - name: dbname + value: rag_blueprint + + # Only used when .global.db is set to EDB in values-global.yaml + # EDB Postgres Operator requires authentication to pull images from docker.enterprisedb.com + # Register for a free trial at: https://www.enterprisedb.com/accounts/register + # Get your token from: https://www.enterprisedb.com/repos-downloads - name: edb fields: - name: token value: null description: EDB subscription token for pulling certified operator images + + # Only used when .global.db is set to MSSQL in values-global.yaml + # The pattern creates a local SQL Server deployment. To use an existing SQL Server DB on Azure, use secret below. - name: mssql fields: - name: sa-pass @@ -38,6 +53,9 @@ secrets: override: true vaultPolicy: basicPolicy description: mssql password for sa user + + # Only used when .global.db is set to AZURESQL in values-global.yaml + # The Azure SQL Server database needs to be created outside of the pattern. - name: azuresql fields: - name: user