feat:Add dataset management endpoints and refactor text model hierarchy #138

Merged
HavenDV merged 1 commit into main from bot/update-openapi_202603140105
Mar 14, 2026

@HavenDV HavenDV commented Mar 14, 2026

Summary by CodeRabbit

Release Notes

  • New Features
    • Added dataset management capabilities, enabling users to create datasets, upload assets, and train custom models
    • Enhanced text rendering in layered image generation with improved text structure and formatting support

@HavenDV HavenDV enabled auto-merge March 14, 2026 01:06

coderabbitai bot commented Mar 14, 2026

Walkthrough

This pull request updates the OpenAPI specification to introduce dataset management endpoints (/datasets, train_model, upload_assets) with comprehensive schemas for datasets, assets, sharing, and organizational structures. Additionally, it refactors the text generation model from a flat TextLayerV3 to a hierarchical TextSpanV3/TextItemV3/TextContainerV3 composition, updating related examples accordingly.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Dataset Management Endpoints & Schemas (src/libs/Ideogram/openapi.yaml) | Adds /datasets GET/POST endpoints, train_model, and upload_assets endpoints with associated request/response schemas (ListDatasetsResponse, CreateDatasetRequest, TrainDatasetModelRequest/Response, UploadDatasetAssetsRequest/Response) for dataset lifecycle management. |
| Dataset Supporting Schemas (src/libs/Ideogram/openapi.yaml) | Introduces Asset-related schemas (AssetIdentifier, AssetType, DatasetUploadSucceededAsset, DatasetUploadFailedAsset, DatasetAssetFailureReason) and sharing/organization infrastructure (SharingInfo, SharingPermission, OrganizationObject, OrganizationType, OrganizationRoleEnum, OrganizationAvatarObject, User) to support dataset ownership and access control. |
| Text Model Refactoring (src/libs/Ideogram/openapi.yaml) | Replaces flat TextLayerV3 with hierarchical text structure: TextSpanV3 (atomic text segment), TextItemV3 (grouped spans), and TextContainerV3 (container-level schema). Updates LayeredImageGenerationObjectV3 and LayeredImageGenerationResponseV3 to use text_containers and text_html instead of text_layers. |
| Updated Examples & Defaults (src/libs/Ideogram/openapi.yaml) | Revises example payloads and default values throughout the specification to align with new text composition model and provide more explicit field semantics. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 Hoppity hop, the schemas now align,
Datasets dance in structures so divine,
Text containers nest with grace and care,
New assets shared with organizations fair,
A specification blooms, refined and bright!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Title Check | ✅ Passed | Title check skipped as CodeRabbit has written the PR title. |

@HavenDV HavenDV merged commit 03138fd into main Mar 14, 2026
3 of 4 checks passed
@HavenDV HavenDV deleted the bot/update-openapi_202603140105 branch March 14, 2026 01:07
@coderabbitai coderabbitai bot changed the title from "feat:@coderabbitai" to "feat:Add dataset management endpoints and refactor text model hierarchy" Mar 14, 2026

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/libs/Ideogram/openapi.yaml`:
- Around line 5097-5178: The dataset example objects in openapi.yaml use null
for the enum fields permission, organization.role, and organization.type, which
violates their non-null string-enum schemas; update the example entries (the
dataset example arrays shown with keys
collection_id/creation_time/user_id/name/cover_asset_identifier/sharing) to
either set permission to a valid enum value from the SharingPermission schema
and set organization.role and organization.type to valid enum values from their
respective schemas or simply remove those keys when the value is unknown; apply
the same fixes to the other example blocks referenced (around the other example
ranges noted) so all dataset examples conform to their schemas.
- Around line 5303-5319: The example in openapi.yaml is inconsistent:
total_count is 0 and success_count is 6, while successful_assets lists 2 entries
and failed_assets lists 2 entries, and failure_reason is null despite being
required. Update the example so total_count equals success_count + number of
failed_assets (e.g., set success_count to 2 and total_count to 4 for 2
successful and 2 failed assets), ensure success_count matches the length of
successful_assets, and populate each failed_assets.failure_reason with a
non-null value; apply the same fixes to the other occurrence around the same
example (lines referencing successful_assets, failed_assets, total_count,
success_count, failure_reason).
- Around line 52-64: The API uses two different identifier names (collection_id
vs dataset_id) causing ambiguity; pick one canonical name (recommend dataset_id)
and update the OpenAPI spec so all path parameters, parameter refs,
request/response schemas and operationIds consistently use that name (e.g.,
change /datasets/{collection_id}/... and any response fields returning
collection_id to dataset_id) — specifically adjust the path
/datasets/{dataset_id}/train_model and related operations (e.g.,
train_dataset_model and the list/upload/train flows) and any duplicated
occurrences elsewhere to the chosen identifier, or if both must be supported,
explicitly document and expose both fields in responses and accept both path
params; make the change in the upstream OpenAPI source (not just this generated
copy) so the rename persists across auto-updates.
- Around line 5370-5379: The OpenAPI schema for UploadDatasetAssetsRequest marks
"files" as required but allows an empty array; update the schema for the "files"
property in the UploadDatasetAssetsRequest definition to add "minItems: 1" so
validators and generated clients reject an empty files array (look for the
"UploadDatasetAssetsRequest" object and the "files" property in the
openapi.yaml).
- Around line 5416-5445: The schema currently only documents constraints in
descriptions for model_name, training_steps, lora_rank, num_chips, ema, and
learning_rate; update each property to use JSON Schema keywords so they are
enforced: for model_name add minLength: 5, maxLength: 30 and a pattern allowing
alphanumerics, spaces and hyphens; for training_steps add type: integer (already
present), minimum: 100, maximum: 20000, multipleOf: 100 and default: 1000; for
lora_rank replace prose with enum: [64,128,256] and default: 128 (and keep the
note about >128 requiring num_chips in description if needed); for num_chips use
enum: [16,32]; for ema and learning_rate set type: number, minimum: 0 (or
exclusiveMinimum: 0 if strictly >0) and include the documented defaults where
applicable; ensure these keywords are placed on the corresponding properties
(model_name, training_steps, lora_rank, num_chips, ema, learning_rate) so
validators and generated SDKs enforce them.
- Around line 2057-2085: The shared schema currently exposes binary fields
(font_file_h1, font_file_h2, font_file_body, font_file_small, etc.) for both
application/json and multipart/form-data; split the contract by creating two
distinct request schemas (e.g., GenerateGraphicRequestV3_JSON and
GenerateGraphicRequestV3_Multipart, and similarly GenerateDesignRequestV3_JSON
and GenerateDesignRequestV3_Multipart), remove all binary file properties from
the JSON variants, and update the endpoints' requestBody to reference the
per-media-type schema (application/json -> *_JSON, multipart/form-data ->
*_Multipart) so file fields only appear in the multipart schema.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c0f185e5-9ba7-472b-af55-c441170404ac

📥 Commits

Reviewing files that changed from the base of the PR and between b292ce0 and 3f6bb75.

⛔ Files ignored due to path filters (89)
  • src/libs/Ideogram/Generated/Ideogram..JsonSerializerContext.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.DatasetsClient.CreateDataset.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.DatasetsClient.ListDatasets.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.DatasetsClient.TrainDatasetModel.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.DatasetsClient.UploadDatasetAssets.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.DatasetsClient.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.GenerateClient.PostGenerateDesignV3.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.GenerateClient.PostLayerizeDesignV3.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.GenerateClient.PostTryOnV3.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.IDatasetsClient.CreateDataset.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.IDatasetsClient.ListDatasets.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.IDatasetsClient.TrainDatasetModel.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.IDatasetsClient.UploadDatasetAssets.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.IDatasetsClient.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.IGenerateClient.PostGenerateDesignV3.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.IGenerateClient.PostLayerizeDesignV3.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.IGenerateClient.PostTryOnV3.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.IIdeogramClient.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.IdeogramClient.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.JsonConverters.AssetType.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.JsonConverters.AssetTypeNullable.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.JsonConverters.DatasetAssetFailureReason.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.JsonConverters.DatasetAssetFailureReasonNullable.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.JsonConverters.OrganizationRoleEnum.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.JsonConverters.OrganizationRoleEnumNullable.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.JsonConverters.OrganizationType.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.JsonConverters.OrganizationTypeNullable.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.JsonConverters.SharingPermission.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.JsonConverters.SharingPermissionNullable.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.JsonConverters.TextItemV3Alignment.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.JsonConverters.TextItemV3AlignmentNullable.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.JsonConverters.TextItemV3GlyphType.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.JsonConverters.TextItemV3GlyphTypeNullable.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.JsonConverters.TextItemV3StyleClass.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.JsonConverters.TextItemV3StyleClassNullable.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.JsonConverters.TextSpanV3FontFxItem.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.JsonConverters.TextSpanV3FontFxItemNullable.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.JsonSerializerContextTypes.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.AssetIdentifier.Json.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.AssetIdentifier.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.AssetType.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.CreateApiKeyResponse.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.CreateDatasetRequest.Json.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.CreateDatasetRequest.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.Dataset.Json.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.Dataset.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.DatasetAssetFailureReason.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.DatasetUploadFailedAsset.Json.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.DatasetUploadFailedAsset.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.DatasetUploadSucceededAsset.Json.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.DatasetUploadSucceededAsset.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.GenerateDesignRequestV3.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.GenerateGraphicRequestV3.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.LayeredImageGenerationObjectV3.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.LayeredImageGenerationResponseV3.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.LayerizeDesignRequestV3.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.ListDatasetsResponse.Json.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.ListDatasetsResponse.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.OrganizationAvatarObject.Json.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.OrganizationAvatarObject.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.OrganizationObject.Json.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.OrganizationObject.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.OrganizationRoleEnum.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.OrganizationType.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.PostApiKeyResponse.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.SharingInfo.Json.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.SharingInfo.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.SharingPermission.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.TextContainerV3.Json.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.TextContainerV3.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.TextItemV3.Json.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.TextItemV3.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.TextItemV3Alignment.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.TextItemV3GlyphType.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.TextItemV3StyleClass.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.TextSpanV3.Json.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.TextSpanV3.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.TextSpanV3FontFxItem.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.TrainDatasetModelRequest.Json.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.TrainDatasetModelRequest.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.TrainDatasetModelResponse.Json.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.TrainDatasetModelResponse.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.TryOnRequestV3.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.UploadDatasetAssetsRequest.Json.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.UploadDatasetAssetsRequest.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.UploadDatasetAssetsResponse.Json.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.UploadDatasetAssetsResponse.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.User.Json.g.cs is excluded by !**/generated/**
  • src/libs/Ideogram/Generated/Ideogram.Models.User.g.cs is excluded by !**/generated/**
📒 Files selected for processing (1)
  • src/libs/Ideogram/openapi.yaml

Comment on lines +52 to +64
  '/datasets/{dataset_id}/train_model':
    post:
      tags:
        - datasets
      summary: Train a model from a dataset
      description: "Start training a custom model from a dataset. The dataset must contain at least 5 images. Returns the model ID and training status.\n"
      operationId: train_dataset_model
      parameters:
        - name: dataset_id
          in: path
          required: true
          schema:
            type: string

⚠️ Potential issue | 🟠 Major

Use one identifier name across the dataset API.

/datasets returns collection_id, but the follow-up routes and training response use dataset_id for what looks like the same identifier. Either standardize the name or expose both explicitly if they are actually distinct; otherwise the list → upload/train flow is ambiguous in generated clients and docs. Based on learnings: this file is auto-updated from the upstream OpenAPI source every 3 hours, so the rename needs to land there rather than only in this generated copy.

Also applies to: 87-99, 5188-5199, 5393-5396
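
A fragment showing what a single identifier name might look like, assuming dataset_id is the name chosen. The properties shown under the Dataset schema are hypothetical, the fragment omits responses and other required parts of a full operation, and the real change would have to be made in the upstream OpenAPI source.

```yaml
# Hypothetical fragment: assumes dataset_id is chosen as the canonical name.
paths:
  '/datasets/{dataset_id}/train_model':
    post:
      operationId: train_dataset_model
      parameters:
        - name: dataset_id          # same name as the field returned by GET /datasets
          in: path
          required: true
          schema:
            type: string
components:
  schemas:
    Dataset:
      type: object
      properties:
        dataset_id:                 # assumed rename of collection_id in list responses
          type: string
```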

Comment on lines +2057 to +2085
     font_file_h1:
       type: string
       description: 'Custom font file for H1 (heading) text. Supported formats .ttf, .otf, .woff, .woff2 (max 5MB). Takes priority over font_name_h1.'
       format: binary
     font_name_h1:
       type: string
       description: Font name from the available font library for H1 text. Ignored if font_file_h1 is provided.
       example: Ubuntu-Bold.ttf
     font_file_h2:
       type: string
       description: 'Custom font file for H2 (subheading) text. Supported formats .ttf, .otf, .woff, .woff2 (max 5MB). Takes priority over font_name_h2.'
       format: binary
     font_name_h2:
       type: string
-      description: A font name from the available font library to use for text rendering. Cannot be used together with uploaded_font_file.
-      example: Ubuntu-Regular.ttf
-    uploaded_font_file:
       description: Font name from the available font library for H2 text. Ignored if font_file_h2 is provided.
     font_file_body:
       type: string
-      description: 'A custom font file to use for text rendering. Supported formats are .ttf, .otf, .woff, and .woff2 (max size 5MB). Cannot be used together with requested_font_file.'
       description: 'Custom font file for body text. Supported formats .ttf, .otf, .woff, .woff2 (max 5MB). Takes priority over font_name_body.'
       format: binary
     font_name_body:
       type: string
       description: Font name from the available font library for body text. Ignored if font_file_body is provided.
     font_file_small:
       type: string
       description: 'Custom font file for small/caption text. Supported formats .ttf, .otf, .woff, .woff2 (max 5MB). Takes priority over font_name_small.'
       format: binary
     font_name_small:
       type: string
       description: Font name from the available font library for small text. Ignored if font_file_small is provided.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In OpenAPI 3.x, what is the recommended way to model an endpoint that accepts both application/json and multipart/form-data when only the multipart variant has file fields (type: string, format: binary)?

💡 Result:

Model it as one operation with a single requestBody that lists both media types under content, giving each media type its own schema. Put the file fields (type: string, format: binary) only in the multipart/form-data schema, and omit them from the application/json schema. OpenAPI explicitly allows request bodies to consume multiple media types and to vary schemas by media type. [1]

paths:
  /example:
    post:
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ExampleJsonRequest'

          multipart/form-data:
            schema:
              $ref: '#/components/schemas/ExampleMultipartRequest'
      responses:
        '200': { description: OK }

components:
  schemas:
    ExampleCommon:
      type: object
      properties:
        name: { type: string }
        tags:
          type: array
          items: { type: string }
      required: [name]

    ExampleJsonRequest:
      allOf:
        - $ref: '#/components/schemas/ExampleCommon'

    ExampleMultipartRequest:
      allOf:
        - $ref: '#/components/schemas/ExampleCommon'
        - type: object
          properties:
            file:
              type: string
              format: binary   # file upload in OAS3 multipart
          required: [file]

For the multipart variant, type: string + format: binary is the standard way to represent a file upload part. [2][3]


🏁 Script executed:

# First, let's examine the file structure around the mentioned lines
wc -l src/libs/Ideogram/openapi.yaml

Repository: tryAGI/Ideogram

Length of output: 94


🏁 Script executed:

# Check the content around lines 2057-2085
sed -n '2040,2090p' src/libs/Ideogram/openapi.yaml

Repository: tryAGI/Ideogram

Length of output: 2811


🏁 Script executed:

# Check the content around lines 2151-2179
sed -n '2140,2190p' src/libs/Ideogram/openapi.yaml

Repository: tryAGI/Ideogram

Length of output: 2596


🏁 Script executed:

# Find which schemas these font fields belong to
sed -n '2000,2100p' src/libs/Ideogram/openapi.yaml | grep -B 30 "font_file_h1"

Repository: tryAGI/Ideogram

Length of output: 1896


🏁 Script executed:

# Search for all occurrences of GenerateGraphicRequestV3 and GenerateDesignRequestV3 to see where they're used
rg 'GenerateGraphicRequestV3|GenerateDesignRequestV3' src/libs/Ideogram/openapi.yaml -B 2 -A 2

Repository: tryAGI/Ideogram

Length of output: 1413


Define separate request schemas for application/json and multipart/form-data to prevent file fields from appearing in the JSON contract.

Both GenerateGraphicRequestV3 and GenerateDesignRequestV3 are referenced by endpoints offering both application/json and multipart/form-data, but the same schema instance is used for both content types. The binary file fields (font_file_h1, font_file_h2, font_file_body, font_file_small, and others) now appear in the shared schema, which incorrectly implies the JSON variant accepts file uploads. OpenAPI 3.x supports per-media-type schemas within a single requestBody; use separate schema definitions for each content type, keeping binary fields only in the multipart variant.

Also applies to: 2151–2179

Comment on lines +5097 to +5178
example:
  datasets:
    - collection_id: collection_id
      creation_time: '2000-01-23T04:56:07+00:00'
      user_id: user_id
      name: name
      cover_asset_identifier:
        asset_type: RESPONSE
        asset_id: 7uS_VESkRI6O3-sVgHQp_A
      sharing:
        - updated_time: '2000-01-23T04:56:07+00:00'
          organization:
            role: null
            publishing_disabled: true
            organization_id: organization_id
            name: name
            avatar:
              avatar_url: avatar_url
            type: null
          permission: null
          user:
            badge: badge
            subscription_plan_id: subscription_plan_id
            display_handle: display_handle
            user_id: user_id
            photo_url: photo_url
        - updated_time: '2000-01-23T04:56:07+00:00'
          organization:
            role: null
            publishing_disabled: true
            organization_id: organization_id
            name: name
            avatar:
              avatar_url: avatar_url
            type: null
          permission: null
          user:
            badge: badge
            subscription_plan_id: subscription_plan_id
            display_handle: display_handle
            user_id: user_id
            photo_url: photo_url
    - collection_id: collection_id
      creation_time: '2000-01-23T04:56:07+00:00'
      user_id: user_id
      name: name
      cover_asset_identifier:
        asset_type: RESPONSE
        asset_id: 7uS_VESkRI6O3-sVgHQp_A
      sharing:
        - updated_time: '2000-01-23T04:56:07+00:00'
          organization:
            role: null
            publishing_disabled: true
            organization_id: organization_id
            name: name
            avatar:
              avatar_url: avatar_url
            type: null
          permission: null
          user:
            badge: badge
            subscription_plan_id: subscription_plan_id
            display_handle: display_handle
            user_id: user_id
            photo_url: photo_url
        - updated_time: '2000-01-23T04:56:07+00:00'
          organization:
            role: null
            publishing_disabled: true
            organization_id: organization_id
            name: name
            avatar:
              avatar_url: avatar_url
            type: null
          permission: null
          user:
            badge: badge
            subscription_plan_id: subscription_plan_id
            display_handle: display_handle
            user_id: user_id
            photo_url: photo_url

⚠️ Potential issue | 🟡 Minor

Make the dataset examples satisfy their own schemas.

These examples set permission, role, and type to null, but the referenced schemas are non-null string enums. Use a real enum value for permission, and either use real enum values or omit role / type when they are unknown.

Also applies to: 5216-5256, 6000-6016, 6051-6058
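
A sketch of one sharing entry that would satisfy non-null enum schemas. The value SHARING_PERMISSION_VALUE is a placeholder, not a real enum member; it stands for whichever value the SharingPermission schema actually lists. Dropping role and type is only valid if those properties are not marked required, which is an assumption here.

```yaml
# Sketch only: SHARING_PERMISSION_VALUE is a placeholder for a real enum member.
sharing:
  - updated_time: '2000-01-23T04:56:07+00:00'
    organization:
      # role and type are omitted instead of being set to null (assumes they are optional)
      publishing_disabled: true
      organization_id: organization_id
      name: name
      avatar:
        avatar_url: avatar_url
    permission: SHARING_PERMISSION_VALUE
    user:
      badge: badge
      subscription_plan_id: subscription_plan_id
      display_handle: display_handle
      user_id: user_id
      photo_url: photo_url
```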

Comment on lines +5303 to +5319
example:
  successful_assets:
    - asset_identifier:
        asset_type: RESPONSE
        asset_id: 7uS_VESkRI6O3-sVgHQp_A
      file_name: file_name
    - asset_identifier:
        asset_type: RESPONSE
        asset_id: 7uS_VESkRI6O3-sVgHQp_A
      file_name: file_name
  total_count: 0
  failed_assets:
    - file_name: file_name
      failure_reason: null
    - file_name: file_name
      failure_reason: null
  success_count: 6

⚠️ Potential issue | 🟡 Minor

Fix the upload response example.

total_count is 0 while the same payload reports success_count: 6, lists two entries under successful_assets, and lists two entries under failed_assets. In addition, failure_reason is null even though that field is required and non-nullable. This example is internally inconsistent and does not match the schema.

Also applies to: 5354-5356
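
A consistent version of the example might look like the sketch below. It assumes total_count is meant to equal success_count plus the number of failed assets, and FAILURE_REASON_VALUE is a placeholder for a real member of the DatasetAssetFailureReason schema, not an actual value.

```yaml
# Sketch only: counts are derived from the list lengths; the failure reason is a placeholder.
example:
  successful_assets:
    - asset_identifier:
        asset_type: RESPONSE
        asset_id: 7uS_VESkRI6O3-sVgHQp_A
      file_name: file_name
    - asset_identifier:
        asset_type: RESPONSE
        asset_id: 7uS_VESkRI6O3-sVgHQp_A
      file_name: file_name
  failed_assets:
    - file_name: file_name
      failure_reason: FAILURE_REASON_VALUE
    - file_name: file_name
      failure_reason: FAILURE_REASON_VALUE
  success_count: 2   # length of successful_assets
  total_count: 4     # 2 successful + 2 failed (assumed relationship)
```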

Comment on lines +5370 to +5379
required:
  - files
type: object
properties:
  files:
    type: array
    items:
      type: string
      format: binary
    description: "Image files (JPEG, PNG, WebP), .txt caption sidecar files, and/or ZIP archives containing images and captions. Caption .txt files are matched to images by filename stem (e.g. sunset.txt provides the caption for sunset.jpg).\n"

⚠️ Potential issue | 🟡 Minor

Require at least one file in UploadDatasetAssetsRequest.

A required array can still be empty. Add minItems: 1 so validators and generated clients do not treat files: [] as a valid upload.

📎 Minimal schema fix
     UploadDatasetAssetsRequest:
       required:
         - files
       type: object
       properties:
         files:
           type: array
+          minItems: 1
           items:
             type: string
             format: binary
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
required:
  - files
type: object
properties:
  files:
    type: array
    minItems: 1
    items:
      type: string
      format: binary
    description: "Image files (JPEG, PNG, WebP), .txt caption sidecar files, and/or ZIP archives containing images and captions. Caption .txt files are matched to images by filename stem (e.g. sunset.txt provides the caption for sunset.jpg).\n"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/libs/Ideogram/openapi.yaml` around lines 5370 - 5379, The OpenAPI schema
for UploadDatasetAssetsRequest marks "files" as required but allows an empty
array; update the schema for the "files" property in the
UploadDatasetAssetsRequest definition to add "minItems: 1" so validators and
generated clients reject an empty files array (look for the
"UploadDatasetAssetsRequest" object and the "files" property in the
openapi.yaml).
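
To illustrate why `required` alone does not reject an empty upload, here is a minimal sketch of a validator that understands only `required`, `type: array`, and `minItems`. It is a toy written for this comment, not the behavior of any particular OpenAPI tool; the names `array_errors` and `request_errors` are hypothetical.

```python
from typing import Any


def array_errors(value: Any, schema: dict[str, Any]) -> list[str]:
    """Check a value against the `type: array` and `minItems` keywords only."""
    if not isinstance(value, list):
        return ["not an array"]
    errors: list[str] = []
    # JSON Schema treats an absent minItems as 0, so [] passes by default.
    min_items = schema.get("minItems", 0)
    if len(value) < min_items:
        errors.append(f"fewer than {min_items} item(s)")
    return errors


def request_errors(body: dict[str, Any], schema: dict[str, Any]) -> list[str]:
    """Check `required` (key presence only) and each present array property."""
    errors = [
        f"missing required property {name!r}"
        for name in schema.get("required", [])
        if name not in body
    ]
    for name, prop_schema in schema.get("properties", {}).items():
        if name in body:
            errors += [f"{name}: {e}" for e in array_errors(body[name], prop_schema)]
    return errors


current = {"required": ["files"], "properties": {"files": {"type": "array"}}}
proposed = {
    "required": ["files"],
    "properties": {"files": {"type": "array", "minItems": 1}},
}

print(request_errors({"files": []}, current))   # no errors: the empty upload is accepted
print(request_errors({"files": []}, proposed))  # one error from minItems
print(request_errors({}, current))              # `required` only catches the missing key
```

`required` asserts that the key is present; only `minItems` says anything about how many elements the array holds.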

Comment on lines +5416 to +5445
model_name:
  title: model_name
  type: string
  description: "Name for the trained model. Must be 5-30 characters, alphanumeric with spaces and hyphens allowed.\n"
  example: my-custom-model
training_steps:
  title: training_steps
  type: integer
  description: "Number of training steps. Must be between 100 and 20000 and a multiple of 100. Defaults to 1000.\n"
  example: 1000
lora_rank:
  title: lora_rank
  type: integer
  description: "LoRA rank for model training. Must be one of 64, 128, or 256. Defaults to 128. Note: lora_rank > 128 requires num_chips >= 32.\n"
  example: 64
num_chips:
  title: num_chips
  type: integer
  description: "Number of TPU chips to use. Must be one of 16 or 32.\n"
  example: 16
ema:
  title: ema
  type: number
  description: "Exponential moving average decay rate. Must be greater than 0.\n"
  example: 0.999
learning_rate:
  title: learning_rate
  type: number
  description: "Learning rate for the optimizer. Must be greater than 0.\n"
  example: 0.0001

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In OpenAPI 3.x / JSON Schema, do constraints written only in description affect validation or code generation, or must rules like length limits, allowed values, multipleOf, defaults, and positive-only numbers be expressed as schema keywords?

💡 Result:

Constraints written only in description are non-normative documentation and do not participate in JSON Schema / OpenAPI validation. In JSON Schema, description is an annotation keyword, not a validation/assertion keyword. (json-schema.org)

In OpenAPI 3.1 (which aligns Schema Objects with JSON Schema vocabularies), tooling is expected to treat annotation keywords as metadata; actual validation rules must be expressed using schema keywords (e.g., minLength/maxLength, enum/const, multipleOf, minimum + exclusiveMinimum, etc.). (spec.openapis.org)

Related note: default is also annotation-style metadata in JSON Schema—typically used by documentation/UI/codegen as a suggested value, not enforced by validation unless a tool adds nonstandard behavior. (json-schema.org)

🏁 Script executed:

# Check if the file exists and verify the specific line range
wc -l src/libs/Ideogram/openapi.yaml

Repository: tryAGI/Ideogram

Length of output: 94


🏁 Script executed:

# Read the specific lines mentioned in the review comment
sed -n '5416,5445p' src/libs/Ideogram/openapi.yaml

Repository: tryAGI/Ideogram

Length of output: 1295


🏁 Script executed:

# Check broader context to see the full TrainDatasetModelRequest definition
sed -n '5400,5460p' src/libs/Ideogram/openapi.yaml

Repository: tryAGI/Ideogram

Length of output: 2140


Encode the training limits as schema keywords, not only in prose.

The 5–30 character name rule, allowed character set, training_steps bounds and multiple, allowed lora_rank and num_chips values, positive-only ema and learning_rate, and the documented defaults are described only in description fields. Per JSON Schema specification, description is annotation metadata and does not participate in validation. Validators and generated SDKs will not enforce or surface these constraints unless they are modeled as schema keywords such as minLength, maxLength, pattern, enum, minimum, exclusiveMinimum, multipleOf, and default.

📐 Proposed schema keywords
         model_name:
           title: model_name
           type: string
+          minLength: 5
+          maxLength: 30
+          pattern: '^[A-Za-z0-9 -]+$'

         training_steps:
           title: training_steps
           type: integer
+          minimum: 100
+          maximum: 20000
+          multipleOf: 100
+          default: 1000

         lora_rank:
           title: lora_rank
           type: integer
+          enum:
+            - 64
+            - 128
+            - 256
+          default: 128

         num_chips:
           title: num_chips
           type: integer
+          enum:
+            - 16
+            - 32

         ema:
           title: ema
           type: number
+          exclusiveMinimum: 0

         learning_rate:
           title: learning_rate
           type: number
+          exclusiveMinimum: 0
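
The difference between prose-only limits and schema keywords can be seen with a small sketch of a validator that applies a few JSON Schema assertion keywords and never reads `description`. This is an illustrative toy, not a complete or official implementation: the function name `keyword_errors` is hypothetical, and `exclusiveMinimum` is handled in its numeric (JSON Schema 2020-12 / OpenAPI 3.1) form.

```python
import re
from typing import Any


def keyword_errors(value: Any, schema: dict[str, Any]) -> list[str]:
    """Apply a handful of assertion keywords; annotations such as description are ignored."""
    errors: list[str] = []
    if isinstance(value, str):
        if "minLength" in schema and len(value) < schema["minLength"]:
            errors.append("minLength")
        if "maxLength" in schema and len(value) > schema["maxLength"]:
            errors.append("maxLength")
        # JSON Schema patterns are unanchored, hence re.search rather than re.fullmatch.
        if "pattern" in schema and re.search(schema["pattern"], value) is None:
            errors.append("pattern")
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        if "minimum" in schema and value < schema["minimum"]:
            errors.append("minimum")
        if "maximum" in schema and value > schema["maximum"]:
            errors.append("maximum")
        if "exclusiveMinimum" in schema and value <= schema["exclusiveMinimum"]:
            errors.append("exclusiveMinimum")
        # Plain modulo is adequate for the integer case shown here.
        if "multipleOf" in schema and value % schema["multipleOf"] != 0:
            errors.append("multipleOf")
    if "enum" in schema and value not in schema["enum"]:
        errors.append("enum")
    return errors


prose_only = {
    "type": "integer",
    "description": "Must be between 100 and 20000 and a multiple of 100.",
}
with_keywords = {**prose_only, "minimum": 100, "maximum": 20000, "multipleOf": 100}

print(keyword_errors(150, prose_only))     # no errors: the description is never consulted
print(keyword_errors(150, with_keywords))  # reports the multipleOf violation
```

The same value passes or fails depending only on whether the limit is written as a keyword, which is the point of the proposed change.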
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
model_name:
  title: model_name
  type: string
  minLength: 5
  maxLength: 30
  pattern: '^[A-Za-z0-9 -]+$'
  description: "Name for the trained model. Must be 5-30 characters, alphanumeric with spaces and hyphens allowed.\n"
  example: my-custom-model
training_steps:
  title: training_steps
  type: integer
  minimum: 100
  maximum: 20000
  multipleOf: 100
  default: 1000
  description: "Number of training steps. Must be between 100 and 20000 and a multiple of 100. Defaults to 1000.\n"
  example: 1000
lora_rank:
  title: lora_rank
  type: integer
  enum:
    - 64
    - 128
    - 256
  default: 128
  description: "LoRA rank for model training. Must be one of 64, 128, or 256. Defaults to 128. Note: lora_rank > 128 requires num_chips >= 32.\n"
  example: 64
num_chips:
  title: num_chips
  type: integer
  enum:
    - 16
    - 32
  description: "Number of TPU chips to use. Must be one of 16 or 32.\n"
  example: 16
ema:
  title: ema
  type: number
  exclusiveMinimum: 0
  description: "Exponential moving average decay rate. Must be greater than 0.\n"
  example: 0.999
learning_rate:
  title: learning_rate
  type: number
  exclusiveMinimum: 0
  description: "Learning rate for the optimizer. Must be greater than 0.\n"
  example: 0.0001
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/libs/Ideogram/openapi.yaml` around lines 5416 - 5445, The schema
currently documents the constraints for model_name, training_steps, lora_rank,
num_chips, ema, and learning_rate only in their descriptions; express them as
JSON Schema keywords so they are enforced: for model_name add minLength: 5,
maxLength: 30 and pattern: '^[A-Za-z0-9 -]+$' (alphanumerics, spaces and
hyphens); for training_steps add minimum: 100, maximum: 20000, multipleOf: 100
and default: 1000; for lora_rank add enum: [64, 128, 256] and default: 128, and
keep the note that lora_rank > 128 requires num_chips >= 32 in the description
because that rule spans two properties; for num_chips add enum: [16, 32]; for
ema and learning_rate add exclusiveMinimum: 0, since both must be strictly
greater than 0 (if the document targets OpenAPI 3.0, write minimum: 0 with
exclusiveMinimum: true instead); add no default to num_chips, ema, or
learning_rate, because none is documented. Place these keywords on the
corresponding properties so validators and generated SDKs enforce them.
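
The note that lora_rank > 128 requires num_chips >= 32 relates two properties, so the per-property keywords proposed above do not capture it. A client could check it separately, as in the minimal sketch below. The function name `cross_field_errors` is hypothetical, and skipping the check when num_chips is omitted is an assumption, since no default for num_chips is documented.

```python
from typing import Any


def cross_field_errors(request: dict[str, Any]) -> list[str]:
    """Check the documented rule that lora_rank > 128 requires num_chips >= 32."""
    errors: list[str] = []
    lora_rank = request.get("lora_rank", 128)  # 128 is the documented default
    num_chips = request.get("num_chips")       # no documented default, so may be None
    # Assumption: when num_chips is omitted, the server decides, so nothing is reported.
    if lora_rank > 128 and num_chips is not None and num_chips < 32:
        errors.append("lora_rank > 128 requires num_chips >= 32")
    return errors


print(cross_field_errors({"lora_rank": 256, "num_chips": 16}))  # rule violated
print(cross_field_errors({"lora_rank": 256, "num_chips": 32}))  # rule satisfied
print(cross_field_errors({"lora_rank": 64, "num_chips": 16}))   # rule does not apply
```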
