feat: Add dataset management endpoints and refactor text model hierarchy #138
Walkthrough
This pull request updates the OpenAPI specification to introduce dataset management endpoints (/datasets, train_model, upload_assets) with comprehensive schemas for datasets, assets, sharing, and organizational structures. Additionally, it refactors the text generation model from a flat TextLayerV3 to a hierarchical TextSpanV3/TextItemV3/TextContainerV3 composition, updating related examples accordingly.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/libs/Ideogram/openapi.yaml`:
- Around line 5097-5178: The dataset example objects in openapi.yaml use null
for the enum fields permission, organization.role, and organization.type which
violates their non-null string-enum schemas; update the example entries (the
dataset example arrays shown with keys
collection_id/creation_time/user_id/name/cover_asset_identifier/sharing) to
either set permission to a valid enum value from the Permission schema and set
organization.role and organization.type to valid enum values from their
respective schemas or simply remove those keys when the value is unknown; apply
the same fixes to the other example blocks referenced (around the other example
ranges noted) so all dataset examples conform to their schemas.
- Around line 5303-5319: The example in openapi.yaml is inconsistent:
total_count is 0 while successful_assets contains 6 entries and failed_assets
contains at least one, and failure_reason is null despite being required. Update
the example so total_count equals success_count + number of failed_assets (e.g.,
set total_count to 7 if success_count is 6 and there is 1 failed asset), ensure
success_count matches the length of successful_assets, and populate each
failed_assets.failure_reason with a non-null string; apply the same fixes to the
other occurrence around the same example (lines referencing successful_assets,
failed_assets, total_count, success_count, failure_reason).
- Around line 52-64: The API uses two different identifier names (collection_id
vs dataset_id) causing ambiguity; pick one canonical name (recommend dataset_id)
and update the OpenAPI spec so all path parameters, parameter refs,
request/response schemas and operationIds consistently use that name (e.g.,
change /datasets/{collection_id}/... and any response fields returning
collection_id to dataset_id) — specifically adjust the path
/datasets/{dataset_id}/train_model and related operations (e.g.,
train_dataset_model and the list/upload/train flows) and any duplicated
occurrences elsewhere to the chosen identifier, or if both must be supported,
explicitly document and expose both fields in responses and accept both path
params; make the change in the upstream OpenAPI source (not just this generated
copy) so the rename persists across auto-updates.
- Around line 5370-5379: The OpenAPI schema for UploadDatasetAssetsRequest marks
"files" as required but allows an empty array; update the schema for the "files"
property in the UploadDatasetAssetsRequest definition to add "minItems: 1" so
validators and generated clients reject an empty files array (look for the
"UploadDatasetAssetsRequest" object and the "files" property in the
openapi.yaml).
- Around line 5416-5445: The schema currently only documents constraints in
descriptions for model_name, training_steps, lora_rank, num_chips, ema, and
learning_rate; update each property to use JSON Schema keywords so they are
enforced: for model_name add minLength: 5, maxLength: 30 and a pattern allowing
alphanumerics, spaces and hyphens; for training_steps add type: integer (already
present), minimum: 100, maximum: 20000, multipleOf: 100 and default: 1000; for
lora_rank replace prose with enum: [64,128,256] and default: 128 (and keep the
note about >128 requiring num_chips in description if needed); for num_chips use
enum: [16,32]; for ema and learning_rate set type: number, minimum: 0 (or
exclusiveMinimum: 0 if strictly >0) and include the documented defaults where
applicable; ensure these keywords are placed on the corresponding properties
(model_name, training_steps, lora_rank, num_chips, ema, learning_rate) so
validators and generated SDKs enforce them.
- Around line 2057-2085: The shared schema currently exposes binary fields
(font_file_h1, font_file_h2, font_file_body, font_file_small, etc.) for both
application/json and multipart/form-data; split the contract by creating two
distinct request schemas (e.g., GenerateGraphicRequestV3_JSON and
GenerateGraphicRequestV3_Multipart, and similarly GenerateDesignRequestV3_JSON
and GenerateDesignRequestV3_Multipart), remove all binary file properties from
the JSON variants, and update the endpoints' requestBody to reference the
per-media-type schema (application/json -> *_JSON, multipart/form-data ->
*_Multipart) so file fields only appear in the multipart schema.
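For the TrainDatasetModelRequest finding above, the promoted keywords could be sketched as follows. This is an illustrative fragment, not the actual spec: the values are copied from the documented constraints, and exclusiveMinimum is shown in OAS 3.1 style (OAS 3.0 instead uses minimum with a boolean exclusiveMinimum: true).

```yaml
# Hypothetical sketch: constraints promoted from prose descriptions to JSON Schema keywords.
model_name:
  type: string
  minLength: 5
  maxLength: 30
  pattern: '^[A-Za-z0-9 -]+$'   # alphanumerics, spaces, and hyphens
training_steps:
  type: integer
  minimum: 100
  maximum: 20000
  multipleOf: 100
  default: 1000
lora_rank:
  type: integer
  enum: [64, 128, 256]
  default: 128                  # keep the "> 128 requires num_chips >= 32" note in description
num_chips:
  type: integer
  enum: [16, 32]
ema:
  type: number
  exclusiveMinimum: 0           # strictly greater than 0
learning_rate:
  type: number
  exclusiveMinimum: 0
```

Validators and generated SDKs act on these keywords directly, whereas the description text remains documentation-only.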
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c0f185e5-9ba7-472b-af55-c441170404ac
⛔ Files ignored due to path filters (89)
All 89 excluded files are generated C# sources under src/libs/Ideogram/Generated/ (Ideogram.*.g.cs: the Datasets client and interface partials, JSON converters, serializer contexts, and models for the new dataset, asset, sharing, organization, and TextSpanV3/TextItemV3/TextContainerV3 types), each excluded by !**/generated/**.
📒 Files selected for processing (1)
src/libs/Ideogram/openapi.yaml
'/datasets/{dataset_id}/train_model':
  post:
    tags:
      - datasets
    summary: Train a model from a dataset
    description: "Start training a custom model from a dataset. The dataset must contain at least 5 images. Returns the model ID and training status.\n"
    operationId: train_dataset_model
    parameters:
      - name: dataset_id
        in: path
        required: true
        schema:
          type: string
Use one identifier name across the dataset API.
/datasets returns collection_id, but the follow-up routes and training response use dataset_id for what looks like the same identifier. Either standardize the name or expose both explicitly if they are actually distinct; otherwise the list → upload/train flow is ambiguous in generated clients and docs. Based on learnings: this file is auto-updated from the upstream OpenAPI source every 3 hours, so the rename needs to land there rather than only in this generated copy.
Also applies to: 87-99, 5188-5199, 5393-5396
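If dataset_id were chosen as the canonical name, the aligned spec could look roughly like this. This fragment is illustrative only; as noted above, the actual rename must land in the upstream OpenAPI source.

```yaml
# Illustrative: one identifier name shared by the list response and follow-up routes.
paths:
  /datasets/{dataset_id}/train_model:
    post:
      parameters:
        - name: dataset_id    # same name as the field returned by GET /datasets
          in: path
          required: true
          schema:
            type: string
components:
  schemas:
    Dataset:
      type: object
      properties:
        dataset_id:           # renamed from collection_id
          type: string
```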
font_file_h1:
  type: string
  description: 'Custom font file for H1 (heading) text. Supported formats .ttf, .otf, .woff, .woff2 (max 5MB). Takes priority over font_name_h1.'
  format: binary
font_name_h1:
  type: string
  description: Font name from the available font library for H1 text. Ignored if font_file_h1 is provided.
  example: Ubuntu-Bold.ttf
font_file_h2:
  type: string
  description: 'Custom font file for H2 (subheading) text. Supported formats .ttf, .otf, .woff, .woff2 (max 5MB). Takes priority over font_name_h2.'
  format: binary
font_name_h2:
  type: string
  description: Font name from the available font library for H2 text. Ignored if font_file_h2 is provided.
  example: Ubuntu-Regular.ttf
font_file_body:
  type: string
  description: 'Custom font file for body text. Supported formats .ttf, .otf, .woff, .woff2 (max 5MB). Takes priority over font_name_body.'
  format: binary
font_name_body:
  type: string
  description: Font name from the available font library for body text. Ignored if font_file_body is provided.
font_file_small:
  type: string
  description: 'Custom font file for small/caption text. Supported formats .ttf, .otf, .woff, .woff2 (max 5MB). Takes priority over font_name_small.'
  format: binary
font_name_small:
  type: string
  description: Font name from the available font library for small text. Ignored if font_file_small is provided.
🧩 Analysis chain
🌐 Web query:
In OpenAPI 3.x, what is the recommended way to model an endpoint that accepts both application/json and multipart/form-data when only the multipart variant has file fields (type: string, format: binary)?
💡 Result:
Model it as one operation with a single requestBody that lists both media types under content, giving each media type its own schema. Put the file fields (type: string, format: binary) only in the multipart/form-data schema, and omit them from the application/json schema. OpenAPI explicitly allows request bodies to consume multiple media types and to vary schemas by media type. [1]
paths:
  /example:
    post:
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ExampleJsonRequest'
          multipart/form-data:
            schema:
              $ref: '#/components/schemas/ExampleMultipartRequest'
      responses:
        '200': { description: OK }
components:
  schemas:
    ExampleCommon:
      type: object
      properties:
        name: { type: string }
        tags:
          type: array
          items: { type: string }
      required: [name]
    ExampleJsonRequest:
      allOf:
        - $ref: '#/components/schemas/ExampleCommon'
    ExampleMultipartRequest:
      allOf:
        - $ref: '#/components/schemas/ExampleCommon'
        - type: object
          properties:
            file:
              type: string
              format: binary # file upload in OAS3 multipart
          required: [file]

For the multipart variant, type: string + format: binary is the standard way to represent a file upload part. [2][3]
🏁 Script executed:
# First, let's examine the file structure around the mentioned lines
wc -l src/libs/Ideogram/openapi.yaml

Repository: tryAGI/Ideogram
Length of output: 94
🏁 Script executed:
# Check the content around lines 2057-2085
sed -n '2040,2090p' src/libs/Ideogram/openapi.yaml

Repository: tryAGI/Ideogram
Length of output: 2811
🏁 Script executed:
# Check the content around lines 2151-2179
sed -n '2140,2190p' src/libs/Ideogram/openapi.yaml

Repository: tryAGI/Ideogram
Length of output: 2596
🏁 Script executed:
# Find which schemas these font fields belong to
sed -n '2000,2100p' src/libs/Ideogram/openapi.yaml | grep -B 30 "font_file_h1"

Repository: tryAGI/Ideogram
Length of output: 1896
🏁 Script executed:
# Search for all occurrences of GenerateGraphicRequestV3 and GenerateDesignRequestV3 to see where they're used
rg 'GenerateGraphicRequestV3|GenerateDesignRequestV3' src/libs/Ideogram/openapi.yaml -B 2 -A 2

Repository: tryAGI/Ideogram
Length of output: 1413
Define separate request schemas for application/json and multipart/form-data to prevent file fields from appearing in the JSON contract.
Both GenerateGraphicRequestV3 and GenerateDesignRequestV3 are referenced by endpoints offering both application/json and multipart/form-data, but the same schema instance is used for both content types. The binary file fields (font_file_h1, font_file_h2, font_file_body, font_file_small, and others) now appear in the shared schema, which incorrectly implies the JSON variant accepts file uploads. OpenAPI 3.x supports per-media-type schemas within a single requestBody; use separate schema definitions for each content type, keeping binary fields only in the multipart variant.
Also applies to: 2151–2179
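Applied to this spec, the split could look like the fragment below (schema names follow the suggestion above; property lists are abbreviated and illustrative):

```yaml
# Sketch: per-media-type schemas so binary fields stay out of the JSON contract.
requestBody:
  required: true
  content:
    application/json:
      schema:
        $ref: '#/components/schemas/GenerateGraphicRequestV3_JSON'      # no font_file_* fields
    multipart/form-data:
      schema:
        $ref: '#/components/schemas/GenerateGraphicRequestV3_Multipart' # adds font_file_* (format: binary)
```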
example:
  datasets:
    - collection_id: collection_id
      creation_time: '2000-01-23T04:56:07+00:00'
      user_id: user_id
      name: name
      cover_asset_identifier:
        asset_type: RESPONSE
        asset_id: 7uS_VESkRI6O3-sVgHQp_A
      sharing:
        - updated_time: '2000-01-23T04:56:07+00:00'
          organization:
            role: null
            publishing_disabled: true
            organization_id: organization_id
            name: name
            avatar:
              avatar_url: avatar_url
            type: null
          permission: null
          user:
            badge: badge
            subscription_plan_id: subscription_plan_id
            display_handle: display_handle
            user_id: user_id
            photo_url: photo_url
        - updated_time: '2000-01-23T04:56:07+00:00'
          organization:
            role: null
            publishing_disabled: true
            organization_id: organization_id
            name: name
            avatar:
              avatar_url: avatar_url
            type: null
          permission: null
          user:
            badge: badge
            subscription_plan_id: subscription_plan_id
            display_handle: display_handle
            user_id: user_id
            photo_url: photo_url
    - collection_id: collection_id
      creation_time: '2000-01-23T04:56:07+00:00'
      user_id: user_id
      name: name
      cover_asset_identifier:
        asset_type: RESPONSE
        asset_id: 7uS_VESkRI6O3-sVgHQp_A
      sharing:
        - updated_time: '2000-01-23T04:56:07+00:00'
          organization:
            role: null
            publishing_disabled: true
            organization_id: organization_id
            name: name
            avatar:
              avatar_url: avatar_url
            type: null
          permission: null
          user:
            badge: badge
            subscription_plan_id: subscription_plan_id
            display_handle: display_handle
            user_id: user_id
            photo_url: photo_url
        - updated_time: '2000-01-23T04:56:07+00:00'
          organization:
            role: null
            publishing_disabled: true
            organization_id: organization_id
            name: name
            avatar:
              avatar_url: avatar_url
            type: null
          permission: null
          user:
            badge: badge
            subscription_plan_id: subscription_plan_id
            display_handle: display_handle
            user_id: user_id
            photo_url: photo_url
Make the dataset examples satisfy their own schemas.
These examples set permission, role, and type to null, but the referenced schemas are non-null string enums. Use a real enum value for permission, and either use real enum values or omit role / type when they are unknown.
Also applies to: 5216-5256, 6000-6016, 6051-6058
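A schema-conforming shape for a sharing entry could look like the sketch below. The permission value here is hypothetical — substitute a real value from the Permission schema — and role / type are simply omitted rather than set to null when unknown:

```yaml
sharing:
  - updated_time: '2000-01-23T04:56:07+00:00'
    permission: VIEW          # hypothetical — use an actual Permission enum value
    organization:
      # role and type omitted rather than null; add real enum values if known
      publishing_disabled: true
      organization_id: organization_id
      name: name
      avatar:
        avatar_url: avatar_url
    user:
      badge: badge
      subscription_plan_id: subscription_plan_id
      display_handle: display_handle
      user_id: user_id
      photo_url: photo_url
```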
example:
  successful_assets:
    - asset_identifier:
        asset_type: RESPONSE
        asset_id: 7uS_VESkRI6O3-sVgHQp_A
      file_name: file_name
    - asset_identifier:
        asset_type: RESPONSE
        asset_id: 7uS_VESkRI6O3-sVgHQp_A
      file_name: file_name
  total_count: 0
  failed_assets:
    - file_name: file_name
      failure_reason: null
    - file_name: file_name
      failure_reason: null
  success_count: 6
Fix the upload response example.
total_count is 0 while the same payload reports success_count: 6 and a non-empty failed_assets array, and failure_reason is null even though that field is required and non-nullable. This example is internally inconsistent and does not match the schema.
Also applies to: 5354-5356
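An internally consistent version of the example might look like this sketch, where the counts are derived from the arrays actually shown (the failure_reason value is hypothetical — pick one from the DatasetAssetFailureReason schema):

```yaml
example:
  successful_assets:
    - asset_identifier:
        asset_type: RESPONSE
        asset_id: 7uS_VESkRI6O3-sVgHQp_A
      file_name: file_name
    - asset_identifier:
        asset_type: RESPONSE
        asset_id: 7uS_VESkRI6O3-sVgHQp_A
      file_name: file_name
  success_count: 2            # matches the length of successful_assets
  failed_assets:
    - file_name: file_name
      failure_reason: INVALID_IMAGE   # hypothetical enum value
  total_count: 3              # success_count + number of failed_assets
```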
required:
  - files
type: object
properties:
  files:
    type: array
    items:
      type: string
      format: binary
    description: "Image files (JPEG, PNG, WebP), .txt caption sidecar files, and/or ZIP archives containing images and captions. Caption .txt files are matched to images by filename stem (e.g. sunset.txt provides the caption for sunset.jpg).\n"
Require at least one file in UploadDatasetAssetsRequest.
A required array can still be empty. Add minItems: 1 so validators and generated clients do not treat files: [] as a valid upload.
📎 Minimal schema fix
 UploadDatasetAssetsRequest:
   required:
     - files
   type: object
   properties:
     files:
       type: array
+      minItems: 1
       items:
         type: string
         format: binary

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
required:
  - files
type: object
properties:
  files:
    type: array
    minItems: 1
    items:
      type: string
      format: binary
    description: "Image files (JPEG, PNG, WebP), .txt caption sidecar files, and/or ZIP archives containing images and captions. Caption .txt files are matched to images by filename stem (e.g. sunset.txt provides the caption for sunset.jpg).\n"
model_name:
  title: model_name
  type: string
  description: "Name for the trained model. Must be 5-30 characters, alphanumeric with spaces and hyphens allowed.\n"
  example: my-custom-model
training_steps:
  title: training_steps
  type: integer
  description: "Number of training steps. Must be between 100 and 20000 and a multiple of 100. Defaults to 1000.\n"
  example: 1000
lora_rank:
  title: lora_rank
  type: integer
  description: "LoRA rank for model training. Must be one of 64, 128, or 256. Defaults to 128. Note: lora_rank > 128 requires num_chips >= 32.\n"
  example: 64
num_chips:
  title: num_chips
  type: integer
  description: "Number of TPU chips to use. Must be one of 16 or 32.\n"
  example: 16
ema:
  title: ema
  type: number
  description: "Exponential moving average decay rate. Must be greater than 0.\n"
  example: 0.999
learning_rate:
  title: learning_rate
  type: number
  description: "Learning rate for the optimizer. Must be greater than 0.\n"
  example: 0.0001
🧩 Analysis chain
🌐 Web query:
In OpenAPI 3.x / JSON Schema, do constraints written only in description affect validation or code generation, or must rules like length limits, allowed values, multipleOf, defaults, and positive-only numbers be expressed as schema keywords?
💡 Result:
Constraints written only in description are non-normative documentation and do not participate in JSON Schema / OpenAPI validation. In JSON Schema, description is an annotation keyword, not a validation/assertion keyword. (json-schema.org)
In OpenAPI 3.1 (which aligns Schema Objects with JSON Schema vocabularies), tooling is expected to treat annotation keywords as metadata; actual validation rules must be expressed using schema keywords (e.g., minLength/maxLength, enum/const, multipleOf, minimum + exclusiveMinimum, etc.). (spec.openapis.org)
Related note: default is also annotation-style metadata in JSON Schema—typically used by documentation/UI/codegen as a suggested value, not enforced by validation unless a tool adds nonstandard behavior. (json-schema.org)
Citations:
- 1: https://json-schema.org/understanding-json-schema/reference/annotations?utm_source=openai
- 2: https://spec.openapis.org/oas/v3.1.1.html?utm_source=openai
- 3: https://json-schema.org/understanding-json-schema/reference/annotations?utm_source=openai
🏁 Script executed:

```shell
# Check if the file exists and verify the specific line range
wc -l src/libs/Ideogram/openapi.yaml
```

Repository: tryAGI/Ideogram

Length of output: 94

🏁 Script executed:

```shell
# Read the specific lines mentioned in the review comment
sed -n '5416,5445p' src/libs/Ideogram/openapi.yaml
```

Repository: tryAGI/Ideogram

Length of output: 1295

🏁 Script executed:

```shell
# Check broader context to see the full TrainDatasetModelRequest definition
sed -n '5400,5460p' src/libs/Ideogram/openapi.yaml
```

Repository: tryAGI/Ideogram

Length of output: 2140
Encode the training limits as schema keywords, not only in prose.
The 5–30 character name rule, the allowed character set, the `training_steps` bounds and multiple, the allowed `lora_rank` and `num_chips` values, the positive-only `ema` and `learning_rate`, and the documented defaults exist only in `description` fields. Per the JSON Schema specification, `description` is annotation metadata and does not participate in validation. Validators and generated SDKs will not enforce or surface these constraints unless they are modeled as schema keywords such as `minLength`, `maxLength`, `pattern`, `enum`, `minimum`, `exclusiveMinimum`, `multipleOf`, and `default`.
📐 Proposed schema keywords

```diff
 model_name:
   title: model_name
   type: string
+  minLength: 5
+  maxLength: 30
+  pattern: '^[A-Za-z0-9 -]+$'
 training_steps:
   title: training_steps
   type: integer
+  minimum: 100
+  maximum: 20000
+  multipleOf: 100
+  default: 1000
 lora_rank:
   title: lora_rank
   type: integer
+  enum:
+    - 64
+    - 128
+    - 256
+  default: 128
 num_chips:
   title: num_chips
   type: integer
+  enum:
+    - 16
+    - 32
 ema:
   title: ema
   type: number
+  exclusiveMinimum: 0
 learning_rate:
   title: learning_rate
   type: number
+  exclusiveMinimum: 0
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```yaml
model_name:
  title: model_name
  type: string
  minLength: 5
  maxLength: 30
  pattern: '^[A-Za-z0-9 -]+$'
  description: "Name for the trained model. Must be 5-30 characters, alphanumeric with spaces and hyphens allowed.\n"
  example: my-custom-model
training_steps:
  title: training_steps
  type: integer
  minimum: 100
  maximum: 20000
  multipleOf: 100
  default: 1000
  description: "Number of training steps. Must be between 100 and 20000 and a multiple of 100. Defaults to 1000.\n"
  example: 1000
lora_rank:
  title: lora_rank
  type: integer
  enum:
    - 64
    - 128
    - 256
  default: 128
  description: "LoRA rank for model training. Must be one of 64, 128, or 256. Defaults to 128. Note: lora_rank > 128 requires num_chips >= 32.\n"
  example: 64
num_chips:
  title: num_chips
  type: integer
  enum:
    - 16
    - 32
  description: "Number of TPU chips to use. Must be one of 16 or 32.\n"
  example: 16
ema:
  title: ema
  type: number
  exclusiveMinimum: 0
  description: "Exponential moving average decay rate. Must be greater than 0.\n"
  example: 0.999
learning_rate:
  title: learning_rate
  type: number
  exclusiveMinimum: 0
  description: "Learning rate for the optimizer. Must be greater than 0.\n"
  example: 0.0001
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/libs/Ideogram/openapi.yaml` around lines 5416-5445: The schema
currently only documents constraints in descriptions for model_name,
training_steps, lora_rank, num_chips, ema, and learning_rate; update each
property to use JSON Schema keywords so they are enforced: for model_name add
minLength: 5, maxLength: 30 and a pattern allowing alphanumerics, spaces and
hyphens; for training_steps add type: integer (already present), minimum: 100,
maximum: 20000, multipleOf: 100 and default: 1000; for lora_rank replace prose
with enum: [64,128,256] and default: 128 (and keep the note about >128 requiring
num_chips in description if needed); for num_chips use enum: [16,32]; for ema
and learning_rate set type: number, minimum: 0 (or exclusiveMinimum: 0 if
strictly >0) and include the documented defaults where applicable; ensure these
keywords are placed on the corresponding properties (model_name, training_steps,
lora_rank, num_chips, ema, learning_rate) so validators and generated SDKs
enforce them.
Summary by CodeRabbit
Release Notes