Create enhanced AudioToTextEnhancedProvider that also reformats text from transcription by lukasdotcom · Pull Request #363 · nextcloud/integration_openai

lukasdotcom · 2026-05-06T18:46:09Z

Builds on #362 and adds an AudioToTextEnhancedProvider that can be used instead of the plain audio-to-text provider and that also reformats the transcribed text. (582825f is the important commit.)

Copilot

Pull request overview

This PR extends the app’s Task Processing providers by introducing a paragraph-reformatting text provider and an “enhanced” audio-to-text provider that chains transcription → paragraph reformatting (when the Nextcloud task type is available).

Changes:

- Add `ReformatParagraphsProvider` (`core:text2text:reformatparagraphs`) that inserts paragraph breaks based on LLM-returned “anchor” lines.
- Add `AudioToTextEnhancedProvider` that runs `AudioToTextProvider` and then invokes the reformat-paragraphs task as a follow-up step.
- Register the new providers conditionally based on task type availability and add a unit test + psalm reference for the new task type class.
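
The anchor-line approach from the first item can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the function name `insertParagraphBreaks`, the in-order substring search, and the decision to silently skip anchors that cannot be found are all assumptions.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not the PR's implementation: insert a blank line
 * before each anchor (the opening words of a paragraph, as returned by the LLM).
 *
 * @param string   $text    Original transcript.
 * @param string[] $anchors One anchor per intended paragraph, in document order.
 */
function insertParagraphBreaks(string $text, array $anchors): string {
	$result = '';
	$cursor = 0; // byte offset up to which $text has been copied into $result

	foreach ($anchors as $anchor) {
		$anchor = trim($anchor);
		if ($anchor === '') {
			continue;
		}
		// Search only from the cursor onwards so anchors are matched in document order.
		$pos = strpos($text, $anchor, $cursor);
		if ($pos === false || $pos === $cursor) {
			// Anchor not found verbatim, or it starts the current paragraph: no break to add.
			continue;
		}
		$result .= rtrim(substr($text, $cursor, $pos - $cursor)) . "\n\n";
		$cursor = $pos;
	}

	// Append whatever follows the last inserted break.
	return $result . substr($text, $cursor);
}
```

In this sketch an anchor that does not occur verbatim in the input is skipped, so a model that paraphrases its anchors instead of quoting them would leave the text unchanged.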

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| `lib/TaskProcessing/ReformatParagraphsProvider.php` | New synchronous provider that requests anchor lines from the LLM and inserts paragraph breaks into the original input text. |
| `lib/TaskProcessing/AudioToTextEnhancedProvider.php` | New provider extending `AudioToTextProvider` and post-processing transcription via task processing manager. |
| `lib/TaskProcessing/AudioToTextProvider.php` | Makes dependencies `protected` to enable subclass access from the enhanced provider. |
| `lib/AppInfo/Application.php` | Registers the enhanced audio provider and the new reformat provider when the task type exists. |
| `tests/unit/Providers/OpenAiProviderTest.php` | Adds a unit test for `ReformatParagraphsProvider` and gates it by task type availability. |
| `psalm.xml` | Adds `TextToTextReformatParagraphs` as a referenced class for analysis. |

lukasdotcom · 2026-05-08T06:28:57Z

+		if (isset($input['model']) && is_string($input['model'])) {
+			$model = $input['model'];
+		} else {
+			$model = $this->appConfig->getValueString(Application::APP_ID, 'default_completion_model_id', Application::DEFAULT_MODEL_ID, lazy: true) ?: Application::DEFAULT_MODEL_ID;


This could be changed in the other providers too
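
If the same fallback were to be shared across providers, one option would be a small helper like the one below. It is purely hypothetical: `resolveModelId` and its parameters do not exist in the app, and the configured value is passed in as a plain string so that the sketch runs without Nextcloud's app config service.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: pick the model for a task.
 * Order: explicit string in the task input, then the configured default,
 * then the hardcoded fallback when the configured value is empty.
 *
 * @param array<string, mixed> $input Task input as received by the provider.
 */
function resolveModelId(array $input, string $configuredDefault, string $fallback): string {
	if (isset($input['model']) && is_string($input['model'])) {
		return $input['model'];
	}
	// `?:` treats an empty string (and "0") as "not configured", like the diff above.
	return $configuredDefault ?: $fallback;
}

// In a provider, $configuredDefault would be the value read from the app config
// (the 'default_completion_model_id' key in the diff above).
```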

marcelklehr · 2026-05-11T08:17:27Z

+	}
+
+	public function getOptionalOutputShape(): array {
+		return $this->audioToTextProvider->getOptionalOutputShape();


Can we hardcode these? Otherwise changes in the audioToTextProvider will automatically change this provider as well, which feels like too much magic.

Hardcoding makes sense for some of them, such as the output shapes, but I would say the input shapes should not be hardcoded, because they are passed into the audioToTextProvider directly anyway and should always be the same.

Alright, that makes sense, yes.

marcelklehr · 2026-05-11T09:17:22Z

+				. 'A break is allowed only when the subject matter changes significantly. '
+				. 'Output format: For each identified paragraph, return only the first 8 to 12 words verbatim from the input. '
+				. 'Structure: Return exactly one paragraph per line. Do not include bullets, html tags, numbering, summaries, quotes, or any additional text. '
+				. 'Single topic: If the text covers only one topic, return exactly one line.';


Tested with output from stt_whisper2 and Mistral Small 24B from IONOS and it doesn't seem to work for me :/

I get the complete blob back from the task type, same as the input.

Interesting, what prompt did you use? I used the one below, tested it on Mistral Small 24B, and it gets split into two paragraphs:
Dogs have been humanity's most devoted companions for thousands of years, evolving from wild wolves into loyal members of our families. Renowned for their incredible intelligence, adaptability, and emotional connection, they serve in countless roles ranging from herding flocks and searching for disaster victims to providing comfort to those with anxiety. Whether they are playful puppies chasing a ball, energetic adults running off-leash, or quiet guardians watching over the home, dogs bring a unique blend of energy, affection, and unconditional love into our lives. Their ability to read human body language and emotions has fostered a deep bond between species, making them not just pets, but true partners in our daily journeys. Cats are captivating creatures that have held a unique place in human history for millennia, revered both as loyal companions and mysterious symbols of grace. With their sleek, muscular bodies and agile movements, they are masters of the hunt, capable of pouncing with pinpoint accuracy even in the dimmest light. Beyond their physical prowess, cats possess an enigmatic personality that ranges from aloof independence to devoted affection, often choosing their owners with a quiet intuition. Whether lounging lazily in a sunbeam, watching the world with wide, golden eyes, or performing a silent, graceful leap, cats embody a blend of elegance and mystery that continues to enchant people everywhere.

I tried with the transcript of the last company call from May 5th.

feat(TaskProcessing): add TextToTextReformatParagraphs task processing handler

7045804

Signed-off-by: Lukas Schaefer <lukas@lschaefer.xyz>

marcelklehr reviewed May 7, 2026

Comment thread lib/TaskProcessing/AudioToTextEnhancedProvider.php Outdated

marcelklehr reviewed May 7, 2026

Comment thread lib/TaskProcessing/AudioToTextEnhancedProvider.php

marcelklehr reviewed May 7, 2026

Comment thread lib/TaskProcessing/AudioToTextEnhancedProvider.php Outdated

marcelklehr requested a review from Copilot May 7, 2026 06:02

Copilot started reviewing on behalf of marcelklehr May 7, 2026 06:03

Copilot AI reviewed May 7, 2026

marcelklehr mentioned this pull request May 7, 2026

feat(TaskProcessing): add TextToTextReformatParagraphs task processing handler #362

Open

lukasdotcom added 3 commits May 8, 2026 09:25

Resolve feedback

4d03993

Signed-off-by: Lukas Schaefer <lukas@lschaefer.xyz>

Resolve feedback

7ca59cc

Signed-off-by: Lukas Schaefer <lukas@lschaefer.xyz>

Create enhanced AudioToTextEnhancedProvider that also reformats text from transcription

c37f479

Signed-off-by: Lukas Schaefer <lukas@lschaefer.xyz> # Conflicts: # lib/TaskProcessing/ReformatParagraphsProvider.php

lukasdotcom force-pushed the better-transcript branch 2 times, most recently from df57549 to fb23749 Compare May 8, 2026 08:44

Switch away from extend and to dependency injection

a85993e

Signed-off-by: Lukas Schaefer <lukas@lschaefer.xyz>

lukasdotcom force-pushed the better-transcript branch from fb23749 to a85993e Compare May 8, 2026 09:16

marcelklehr reviewed May 11, 2026

Don't use AudioToTextProvider functions for output types

83a1c94

Signed-off-by: Lukas Schaefer <lukas@lschaefer.xyz>
