
Add Pixelization (SIGGRAPH Asia 2022) conversion + demo + hub integration#96

Open
john-rocky wants to merge 1 commit into master from feat/pixelization

Conversation

@john-rocky
Owner

Summary

  • Converts WuZongWei6/Pixelization (SIGGRAPH Asia 2022) to CoreML. FP16 mlpackage is 38 MB, uploaded to mlboydaisuke/coreml-zoo/pixelization/.
  • Standalone SwiftUI sample app (sample_apps/PixelizationDemo/) with palette presets (Off / Game Boy / NES / Pico-8 / C64), cell-size slider, and an Abstraction picker driving a scale-aware pre-blur.
  • Hub app (CoreMLModelsApp) gets a new pixel_art output_type in ImageInOutDemoView with the same preset / slider / abstraction UI.

Conversion notes (also in docs/coreml_conversion_notes.md)

  • Bake the fixed style code into conv weights. G_A.MLP output (cellcode) has magnitudes around 1e8, which overflows FP16 inside ModulationConvBlock. Because cellcode is constant, precompute (W * c) / ‖W * c‖ at conversion time and drop the modulation op entirely.
  • Swap the upstream custom LayerNorm for nn.GroupNorm(1, C). The upstream view(-1).std() over ~8M elements diverges ~4× in FP16. nn.GroupNorm(1, C) is mathematically equivalent and lowers to coremltools' native group_norm op.
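The weight-baking step above can be sketched in a few lines of PyTorch. The shapes and the StyleGAN2-style modulate/demodulate layout are assumptions for illustration; the real `ModulationConvBlock` may arrange things differently:

```python
import torch
import torch.nn as nn

def bake_modulation(conv: nn.Conv2d, cellcode: torch.Tensor) -> nn.Conv2d:
    """Fold a constant style code into plain Conv2d weights.

    Assumed shapes: conv.weight is (out_c, in_c, k, k) and cellcode is
    (in_c,), as in a StyleGAN2-style modulated conv.
    """
    w = conv.weight.data  # (out_c, in_c, k, k)
    # Modulate: scale each input channel by the style code.
    w = w * cellcode.view(1, -1, 1, 1)
    # Demodulate: normalize each output filter, so FP16 never sees the
    # ~1e8 magnitudes of the raw cellcode at inference time.
    norm = w.pow(2).sum(dim=(1, 2, 3), keepdim=True).sqrt() + 1e-8
    w = w / norm
    baked = nn.Conv2d(conv.in_channels, conv.out_channels,
                      conv.kernel_size, conv.stride, conv.padding,
                      bias=conv.bias is not None)
    baked.weight.data.copy_(w)
    if conv.bias is not None:
        baked.bias.data.copy_(conv.bias.data)
    return baked
```

After baking, the graph handed to coremltools contains only an ordinary `Conv2d`, with every filter at unit L2 norm.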

UI design

  • Presets default to cellSize: 4 — the network's native grid reads cleanest across photos; the palettes are what distinguish each console mode.
  • The cell-size slider (4–10) only controls the post-process; dragging it is cheap (no network re-run).
  • The Abstraction picker (Auto / Off / 256 / 128 / 64 / 32) is what re-runs the network with a pre-blurred input, emulating upstream test_pro.py's scale-adaptive resize inside our fixed-512 model.
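The pre-blur driven by the Abstraction picker is just a down/up resize before the fixed-size model input. A minimal sketch with Pillow (function name and `None`-means-Off convention are illustrative, not the app's actual API):

```python
from typing import Optional
from PIL import Image

def pre_blur(img: Image.Image, abstraction: Optional[int],
             model_size: int = 512) -> Image.Image:
    """Emulate upstream test_pro.py's scale-adaptive resize inside a
    fixed-size model: downscale to `abstraction` px, then upscale back
    to the model input size. Lower abstraction -> chunkier silhouette.
    """
    if abstraction is None:  # "Off": feed the photo at model size directly
        return img.resize((model_size, model_size), Image.BICUBIC)
    small = img.resize((abstraction, abstraction), Image.BICUBIC)
    return small.resize((model_size, model_size), Image.BICUBIC)
```

Because the blur changes the network input, any change here re-runs the model, unlike the cell-size slider.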

Test plan

  • Build sample_apps/PixelizationDemo on device
  • Pick a photo, tap each preset, and verify the palette switches without re-reading from Photos
  • Drag cell-size slider — should update instantly (post-process only)
  • Tap Abstraction options (128 / 64 / 32) — each should trigger a network re-run with different pre-blur levels
  • Build sample_apps/CoreMLModelsApp, open Pixelization from the hub, verify the same flow

Model
- conversion_scripts/convert_pixelization.py — --size/--precision flags,
  FP16 (38 MB) default, FP32 variant available.
- Two non-obvious pitfalls handled and documented:
  * cellcode from G_A.MLP reaches magnitudes of 1e8, overflowing FP16
    inside ModulationConvBlock. Since cellcode is constant, bake
    (W*c)/norm(W*c) into plain Conv2d weights at conversion time and
    drop the modulation op entirely.
  * Upstream's custom LayerNorm (manual view(-1).std() over ~8M
    elements) diverges ~4x in FP16. Swap to nn.GroupNorm(1, C), which
    is mathematically equivalent and lowers to coremltools' native
    group_norm op.
- Both lessons added to docs/coreml_conversion_notes.md.
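The LayerNorm-to-GroupNorm swap is easy to verify numerically: normalizing over all of (C, H, W) per sample is exactly what `nn.GroupNorm` with a single group computes. The `upstream_layernorm` below is a sketch of the custom norm, not the upstream code; the two agree up to the biased/unbiased variance convention and eps placement, which is negligible at these element counts:

```python
import torch
import torch.nn as nn

def upstream_layernorm(x: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Sketch of the custom norm: flatten each sample and normalize
    with its mean and std over all of (C, H, W)."""
    n = x.size(0)
    flat = x.view(n, -1)
    mean = flat.mean(dim=1).view(n, 1, 1, 1)
    std = flat.std(dim=1).view(n, 1, 1, 1)
    return (x - mean) / (std + eps)

# GroupNorm with one group normalizes over the same (C, H, W) extent.
gn = nn.GroupNorm(1, 16, eps=1e-5, affine=False)
```

In FP32 the two match to within ~1e-4; the difference is that `GroupNorm` lowers to a single fused coremltools op instead of a chain of reductions that accumulates FP16 error.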

Sample app (sample_apps/PixelizationDemo/)
- SwiftUI demo that loads the mlpackage from the bundle.
- Preset picker: Off / Game Boy / NES / Pico-8 / C64 — palette-only
  differences, network-native cellSize default (4).
- Cell-size slider (4-10) for post-process grid fineness.
- Abstraction segmented picker (Auto / Off / 256 / 128 / 64 / 32)
  driving a pre-blur pass on the input — the photo is resized down and
  back up to match the upstream test_pro.py's scale-aware behaviour
  inside our fixed-512 network. Lets the user dial between 'fine
  detail' and 'iconic chunky silhouette' independently of cell size.
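The palette/cell-size post-process that makes slider drags cheap can be sketched in NumPy (the app's version is Swift; the Game Boy palette values here are a common approximation, not necessarily the app's exact constants):

```python
import numpy as np

# Hypothetical 4-color Game Boy palette (RGB); the real app ships its own.
GAME_BOY = np.array([[15, 56, 15], [48, 98, 48],
                     [139, 172, 15], [155, 188, 15]], dtype=np.float32)

def palette_snap(rgb: np.ndarray, palette: np.ndarray,
                 cell: int = 4) -> np.ndarray:
    """Cheap post-process on the raw network output: average each
    cell x cell block, then snap the block to the nearest palette
    entry. No network run, so dragging the slider stays instant."""
    h, w, _ = rgb.shape
    out = rgb.astype(np.float32).copy()
    for y in range(0, h - h % cell, cell):
        for x in range(0, w - w % cell, cell):
            block = out[y:y + cell, x:x + cell].reshape(-1, 3).mean(axis=0)
            dist = ((palette - block) ** 2).sum(axis=1)
            out[y:y + cell, x:x + cell] = palette[dist.argmin()]
    return out.astype(np.uint8)
```

Switching presets swaps `palette`; moving the slider changes `cell`; neither touches the model.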

Hub app integration (sample_apps/CoreMLModelsApp/)
- ImageInOutDemoView gains an output_type: 'pixel_art' path matching
  the standalone demo: preset chips, cell-size slider, Abstraction
  picker. Re-runs the network on preset / abstraction change; during
  cell-size drag it only re-runs the cheap palette-snap.
- Raw 512x512 network output is cached so palette / cell-size changes
  stay instant.
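The caching contract is the important part: only the photo and the abstraction level force a network run, so the cache key is that pair. A minimal sketch (the app's actual cache lives in Swift view state; names here are illustrative):

```python
# Cache the raw 512x512 network output keyed on what actually forces a
# re-run. Palette and cell-size changes only touch the post-process,
# so they always hit the cache.
_raw_cache = {}

def raw_output(photo_id, abstraction, run_network):
    """Return the cached network output, invoking run_network only the
    first time a given (photo_id, abstraction) pair is requested."""
    key = (photo_id, abstraction)
    if key not in _raw_cache:
        _raw_cache[key] = run_network(photo_id, abstraction)
    return _raw_cache[key]
```

Dropping the cache on photo change keeps memory bounded to one 512x512 buffer per abstraction level actually visited.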

README / manifest
- Added Pixelization entry under Image2Image with license + sample
  project + conversion-script links.
- Updated .gitignore to skip the vendored Pixelization/ clone and the
  sample PNGs.