Peterc3-dev/README.md
 ██████╗ ███████╗████████╗███████╗██████╗  ██████╗██████╗
 ██╔══██╗██╔════╝╚══██╔══╝██╔════╝██╔══██╗██╔════╝╚════██╗
 ██████╔╝█████╗     ██║   █████╗  ██████╔╝██║      █████╔╝
 ██╔═══╝ ██╔══╝     ██║   ██╔══╝  ██╔══██╗██║      ╚═══██╗
 ██║     ███████╗   ██║   ███████╗██║  ██║╚██████╗██████╔╝
 ╚═╝     ╚══════╝   ╚═╝   ╚══════╝╚═╝  ╚═╝ ╚═════╝╚═════╝

Building open-source ML tooling for AMD APUs. Squeezing every FLOP out of hardware the industry forgot about.

I found 4 bugs blocking ML on AMD's newest GPU architecture (gfx1150/RDNA 3.5), built the first public PyTorch build for it, patched the NPU driver, and designed a tri-processor inference engine that routes compute across CPU + GPU + NPU using hyperdimensional computing. I also do security research, Android apps, and creative coding.


AMD / ML Stack

| Project | What it does |
| --- | --- |
| rag-race-router | R.A.G-Race-Router [TAP]: tri-processor inference engine. HDC cosine similarity routing across CPU+iGPU+NPU on Ryzen AI 300 |
| miopen-gfx1150 | Found 3 bugs blocking ML training on RDNA 3.5: whitelist patch, CK VGPR analysis, solver availability matrix |
| pytorch-gfx1150 | First public PyTorch build for AMD RDNA 3.5 (Radeon 890M) with native GPU acceleration |
| amdxdna-strix-fix | NPU driver patch for Strix Point: SMU init bypass, brought XDNA 2 NPU to life on Linux |
| unified-ml | Custom HIP + Vulkan kernels showing a 60% Vulkan speedup over ROCm on consumer AMD GPUs |
| apu-codec | Neural audio codec from scratch: NPU encoder, GPU decoder, 512x compression at 44.1 kHz |
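
As a rough illustration of what "HDC cosine similarity routing" can look like, here is a minimal sketch in plain Python. It is not the rag-race-router implementation: the feature names, device profiles, dimensionality, and the `route` helper are all hypothetical, chosen only to show the general technique of encoding a workload as a hypervector and picking the device whose profile is most similar.

```python
import math
import random

DIM = 2048  # hypervector dimensionality (illustrative choice, not from the project)


def random_hv(rng, dim=DIM):
    """Random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(dim)]


def bundle(hvs):
    """Bundle hypervectors: element-wise sum, then sign (ties resolve to +1)."""
    return [1 if sum(col) >= 0 else -1 for col in zip(*hvs)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical workload features, each assigned a random hypervector.
FEATURES = {
    name: random_hv(rng)
    for name in (
        "control_flow", "small_batch", "sparse",   # assumed CPU-friendly
        "matmul", "large_batch", "fp16",           # assumed iGPU-friendly
        "int8", "conv", "low_power",               # assumed NPU-friendly
    )
}

# Hypothetical device profiles: the bundle of features each device handles well.
PROFILES = {
    "cpu": bundle([FEATURES[n] for n in ("control_flow", "small_batch", "sparse")]),
    "igpu": bundle([FEATURES[n] for n in ("matmul", "large_batch", "fp16")]),
    "npu": bundle([FEATURES[n] for n in ("int8", "conv", "low_power")]),
}


def route(workload):
    """Return (best_device, scores) for a list of feature names."""
    query = bundle([FEATURES[name] for name in workload])
    scores = {dev: cosine(query, hv) for dev, hv in PROFILES.items()}
    return max(scores, key=scores.get), scores


device, scores = route(["matmul", "large_batch", "fp16"])
print(device)  # igpu (the query equals the iGPU profile, so its similarity is 1.0)
```

Because random high-dimensional bipolar vectors are nearly orthogonal, a query that shares even one feature with a profile scores well above the unrelated profiles, which is what makes a single cosine comparison per device enough for a routing decision in this toy setup.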

Security / Tools

| Project | What it does |
| --- | --- |
| graphql-authz-fuzzer | Detects GraphQL mutation authorization bypass: zero dependencies, pure Python |
| PlateAuth | NFC behavioral biometric authentication research prototype (Android/Kotlin) |
| cin-agent | Telegram bot with natural language shell execution and safe command sandboxing |

Apps / Creative

| Project | What it does |
| --- | --- |
| retune432-android | Batch A440→A432 Hz audio converter with metadata preservation |
| d-board | Ortholinear Android keyboard: true grid layout, zero stagger |
| geodancer | Audio-reactive geometric wireframe dancer: C++, OpenGL, libaubio |
| crt-magnifier-lens | Chrome extension: draggable CRT magnifier with barrel distortion |
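
For reference, the arithmetic behind an A440→A432 Hz conversion is small enough to sketch in Python. This only shows the frequency ratio and the size of the pitch shift; how retune432-android actually applies it (resampling, pitch shifting, or something else) is not stated here, and the function names below are hypothetical.

```python
import math

SOURCE_A4 = 440.0  # standard concert pitch, in Hz
TARGET_A4 = 432.0  # target tuning reference, in Hz


def retune(freq_hz):
    """Map a frequency from A440 tuning to the equivalent pitch in A432 tuning."""
    return freq_hz * TARGET_A4 / SOURCE_A4


def shift_in_cents():
    """Size of the A440 -> A432 shift in cents (100 cents = one semitone)."""
    return 1200.0 * math.log2(TARGET_A4 / SOURCE_A4)


print(retune(440.0))               # 432.0
print(round(shift_in_cents(), 2))  # about -31.77, i.e. roughly a third of a semitone flat
```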

Currently working on

  • Vulkan PyTorch backend — PrivateUse1 dispatch, no ROCm needed, runs on any GPU
  • HDC routing on NPU — Hyperdimensional Computing on XDNA 2 systolic array for sub-100us dispatch
  • Rust rewrite — zero-GIL tri-processor coordination with PyO3 bindings
  • Upstream contributions — MIOpen/CK/PyTorch issues filed, AMD engineers engaged

Tech

Python · Rust · C++/OpenGL · Kotlin · JavaScript · GLSL/SPIR-V · HIP/ROCm · Vulkan (Kompute) · IREE/MLIR · Android SDK · Chrome Extensions (MV3) · GraphQL · Linux (CachyOS) · Tailscale


Pinned

  1. miopen-gfx1150 (Public)

    MIOpen research for AMD RDNA 3.5 (gfx1150) — whitelist patch and three-bug analysis for Strix Point APU training

  2. recursive-routing-racer (Public)

    R.A.G-Race-Router [TAP — Trigonometric Assembly Processing Engine] — Self-optimizing CPU+iGPU+NPU inference via cosine similarity routing on AMD Ryzen AI 300

    Python

  3. amdxdna-strix-fix (Public)

    Fix for AMD XDNA NPU driver on Ryzen AI 300 (Strix Point/Halo) — SMU init bypass patch, systemd auto-loader, full root cause analysis

    Shell

  4. pytorch-gfx1150 (Public)

    PyTorch built from source for AMD RDNA 3.5 (gfx1150) — Radeon 890M/880M GPU acceleration

    Shell

  5. torch-vulkan (Public)

    Vulkan compute backend for PyTorch — runs on any GPU. PrivateUse1 dispatch, SPIR-V shaders, zero ROCm/CUDA dependency.

    C++

  6. unified-ml (Public)

    Unified memory ML inference engine for AMD APU (RDNA 3.5 / gfx1150) — Custom HIP kernels with 2.2x speedup over standard allocation

    Python