Add Pixelization (SIGGRAPH Asia 2022) conversion + demo + hub integration #96
Open
john-rocky wants to merge 1 commit into master
Model
- conversion_scripts/convert_pixelization.py — --size/--precision flags,
FP16 (38 MB) default, FP32 variant available.
- Two non-obvious pitfalls handled and documented:
* cellcode from G_A.MLP reaches magnitudes of 1e8, overflowing FP16
inside ModulationConvBlock. Since cellcode is constant, bake
(W*c)/norm(W*c) into plain Conv2d weights at conversion time and
drop the modulation op entirely.
* Upstream's custom LayerNorm (manual view(-1).std() over ~8M
elements) diverges ~4x in FP16. Swap to nn.GroupNorm(1, C), which
is mathematically equivalent and lowers to coremltools' native
group_norm op.
- Both lessons added to docs/coreml_conversion_notes.md.
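The cellcode pitfall above can be illustrated with a small stdlib-only sketch. It assumes a 1x1 convolution whose weights are scaled per input channel by a constant cellcode and then normalized per output channel; the names `fits_fp16` and `bake_modulation`, the `eps` term, and the sample numbers are hypothetical and are not taken from convert_pixelization.py.

```python
import math
import struct


def fits_fp16(x: float) -> bool:
    """Return True if x can be packed as an IEEE 754 half-precision float."""
    try:
        struct.pack("<e", x)
        return True
    except (OverflowError, struct.error):
        return False


def bake_modulation(weight, cellcode, eps=1e-8):
    """Fold a constant per-input-channel scale into 1x1 conv weights.

    weight:   one row per output channel, each row of length C_in
    cellcode: C_in constant scales (stand-in for the G_A.MLP output)
    Returns (W * c) / ||W * c||, normalized per output channel.
    The per-output-channel norm and eps are assumptions of this sketch.
    """
    baked = []
    for row in weight:
        scaled = [w * c for w, c in zip(row, cellcode)]
        norm = math.sqrt(sum(v * v for v in scaled) + eps)
        baked.append([v / norm for v in scaled])
    return baked


# Illustrative values only: cellcode magnitudes around 1e8.
weight = [[0.5, -0.25, 0.125], [0.1, 0.2, -0.3]]
cellcode = [1.0e8, -3.0e7, 2.5e8]

# Computed once in double precision at conversion time.
baked = bake_modulation(weight, cellcode)
```

The raw cellcode does not fit in FP16 (the largest finite value is 65504), while every baked weight has magnitude at most 1, so the converted graph never sees the large intermediate.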
Sample app (sample_apps/PixelizationDemo/)
- SwiftUI demo that loads the mlpackage from the bundle.
- Preset picker: Off / Game Boy / NES / Pico-8 / C64 — palette-only
differences, network-native cellSize default (4).
- Cell-size slider (4-10) for post-process grid fineness.
- Abstraction segmented picker (Auto / Off / 256 / 128 / 64 / 32)
driving a pre-blur pass on the input — the photo is resized down and
back up to match the upstream test_pro.py's scale-aware behaviour
inside our fixed-512 network. Lets the user dial between 'fine
detail' and 'iconic chunky silhouette' independently of cell size.
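The resize-down-and-back-up idea behind the Abstraction picker can be sketched as follows. This is a minimal Python illustration on a square grayscale grid, not the demo's Swift code; the helper names, the box-average downsample, and the nearest-neighbour upsample are all assumptions, and the Auto setting is not modelled.

```python
def downsample(img, factor):
    """Box-average a square grid by an integer factor."""
    n = len(img)
    out = []
    for by in range(0, n, factor):
        row = []
        for bx in range(0, n, factor):
            block = [img[y][x]
                     for y in range(by, by + factor)
                     for x in range(bx, bx + factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def upsample(img, factor):
    """Nearest-neighbour upsample by an integer factor."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def abstract(img, abstraction):
    """Shrink to abstraction x abstraction, then return to the original size.

    abstraction=None models the picker's Off setting (input passes through).
    """
    n = len(img)
    if abstraction is None or abstraction >= n:
        return [list(r) for r in img]
    if n % abstraction != 0:
        raise ValueError("sketch only handles integer scale factors")
    factor = n // abstraction
    return upsample(downsample(img, factor), factor)


# 4x4 ramp reduced to 2x2 and restored: detail inside each 2x2 block is lost.
ramp = [[0, 4, 8, 12], [4, 8, 12, 16], [8, 12, 16, 20], [12, 16, 20, 24]]
coarse = abstract(ramp, 2)
```

Smaller abstraction values discard more detail before the network runs, which is independent of the cell size applied afterwards.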
Hub app integration (sample_apps/CoreMLModelsApp/)
- ImageInOutDemoView gains an output_type: 'pixel_art' path matching
the standalone demo: preset chips, cell-size slider, Abstraction
picker. Re-runs the network on preset / abstraction change; during
cell-size drag it only re-runs the cheap palette-snap.
- Raw 512x512 network output is cached so palette / cell-size changes
stay instant.
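A rough sketch of why caching the raw output keeps cell-size changes cheap: the post-process only averages each cell and snaps it to the nearest palette colour, so it can be re-run on the cached pixels without touching the network. The function names, the squared-RGB-distance metric, and the two-colour palette are assumptions for illustration, not the app's implementation.

```python
def nearest_color(color, palette):
    """Pick the palette entry with the smallest squared RGB distance."""
    return min(palette,
               key=lambda p: sum((a - b) ** 2 for a, b in zip(color, p)))


def palette_snap(raw, palette, cell_size):
    """Average each cell_size x cell_size cell, then snap it to the palette."""
    h, w = len(raw), len(raw[0])
    out = [[None] * w for _ in range(h)]
    for top in range(0, h, cell_size):
        for left in range(0, w, cell_size):
            ys = range(top, min(top + cell_size, h))
            xs = range(left, min(left + cell_size, w))
            cell = [raw[y][x] for y in ys for x in xs]
            mean = tuple(sum(c[i] for c in cell) / len(cell) for i in range(3))
            snapped = nearest_color(mean, palette)
            for y in ys:
                for x in xs:
                    out[y][x] = snapped
    return out


# Stand-in for the cached network output: dark left half, bright right half.
dark, bright = (10, 10, 10), (240, 240, 240)
raw = [[dark, dark, bright, bright],
       [dark, dark, bright, bright]]
two_tone = [(0, 0, 0), (255, 255, 255)]  # illustrative palette only

# Dragging the slider only repeats this step on the same cached pixels.
fine = palette_snap(raw, two_tone, cell_size=2)
chunky = palette_snap(raw, two_tone, cell_size=4)
```

With `cell_size=2` the two halves stay distinct; with `cell_size=4` the whole strip becomes a single cell whose mean (125) is closer to black.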
README / manifest
- Added Pixelization entry under Image2Image with license + sample
project + conversion-script links.
- Updated .gitignore to skip the vendored Pixelization/ clone and the
sample PNGs.
Summary
- mlboydaisuke/coreml-zoo/pixelization/.
- Sample app (sample_apps/PixelizationDemo/) with palette presets (Off / Game Boy / NES / Pico-8 / C64), a cell-size slider, and an Abstraction picker driving a scale-aware pre-blur.
- Hub app (CoreMLModelsApp) gets a new pixel_art output_type in ImageInOutDemoView with the same preset / slider / abstraction UI.
Conversion notes (also in docs/coreml_conversion_notes.md)
- G_A.MLP output (cellcode) has magnitudes around 1e8, which overflows FP16 inside ModulationConvBlock. Because cellcode is constant, precompute (W * c) / ‖W * c‖ at conversion time and drop the modulation op entirely.
- Swap upstream's custom LayerNorm for nn.GroupNorm(1, C). The upstream view(-1).std() over ~8M elements diverges ~4× in FP16. nn.GroupNorm(1, C) is mathematically equivalent and lowers to coremltools' native group_norm op.
UI design
- Presets default to cellSize: 4; the network's native grid is what reads cleanest across photos, and palettes are what distinguish each console mode.
- The Abstraction picker reproduces upstream test_pro.py's scale-adaptive resize inside our fixed-512 model.
Test plan
- sample_apps/PixelizationDemo on device
- sample_apps/CoreMLModelsApp, open Pixelization from the hub, verify the same flow
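The LayerNorm-to-GroupNorm swap described in the conversion notes can be sanity-checked numerically with a stdlib-only sketch. It assumes the upstream layer normalizes with the unbiased standard deviation of the flattened tensor and adds eps to the std, whereas a single-group GroupNorm uses the biased variance with eps inside the square root. Function names and test data are hypothetical, and the per-channel affine parameters are omitted.

```python
import math
import statistics


def custom_layer_norm(x, eps=1e-5):
    """Assumed upstream style: unbiased std over all elements, eps added to std."""
    mean = statistics.fmean(x)
    std = statistics.stdev(x)  # divides by N - 1
    return [(v - mean) / (std + eps) for v in x]


def group_norm_one_group(x, eps=1e-5):
    """Single-group GroupNorm: biased variance, eps inside the square root."""
    mean = statistics.fmean(x)
    var = statistics.pvariance(x, mu=mean)  # divides by N
    return [(v - mean) / math.sqrt(var + eps) for v in x]


# Deterministic stand-in for one flattened activation tensor.
values = [3.0 * math.sin(0.37 * i) + 0.5 for i in range(4096)]

a = custom_layer_norm(values)
b = group_norm_one_group(values)
max_diff = max(abs(p - q) for p, q in zip(a, b))
```

Under these assumptions the two outputs differ only by the N/(N-1) correction and the eps placement, both of which shrink further at the ~8M elements mentioned above.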