Metal SAXPY Compute Kernel

Metal Compute Shader

Jump to GitLab Runner Setup

Large vector operations can be slow on the CPU, where only a handful of cores work through the elements. Metal compute shaders on Apple Silicon run the same operation across many GPU threads in parallel, which can speed up element-wise computations such as SAXPY (single-precision a*X + Y).

Recommendation

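Use Metal compute kernels for parallelizable, element-wise operations. Map one GPU thread to each element, guard against thread IDs past the end of the data, and keep the buffer indices bound in the host code consistent with the kernel's [[buffer(n)]] attributes.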

Example

This project performs SAXPY on vectors of 1,000,000 floats (X[i] = i, Y[i] = 2i, a = 2.0) to produce Y[i] = 2.0i + 2i = 4i.
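The expected values can be checked on the CPU before involving the GPU. The sketch below is a plain-Swift reference under the inputs stated above; `saxpyReference` is a hypothetical helper name, not a function from this repository.

```swift
// CPU reference for the example inputs: X[i] = i, Y[i] = 2i, a = 2.0.
// `saxpyReference` is a hypothetical helper used only for illustration.
func saxpyReference(a: Float, x: [Float], y: [Float]) -> [Float] {
    precondition(x.count == y.count, "X and Y must have the same length")
    // Element-wise a * X[i] + Y[i], the same expression the kernel evaluates.
    return zip(x, y).map { a * $0 + $1 }
}

let n = 1_000_000
let x = (0..<n).map { Float($0) }       // X[i] = i
let y = (0..<n).map { Float(2 * $0) }   // Y[i] = 2i
let result = saxpyReference(a: 2.0, x: x, y: y)

// Every value here is an integer below 2^24, so Float represents it exactly.
print(result[1], result[n - 1])
```

Because all intermediate values are exactly representable in single precision, a GPU result can be compared against this reference with `==` for this particular input; other inputs would generally need a tolerance.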

Kernel Code (kernel.metal)

#include <metal_stdlib>
using namespace metal;

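kernel void saxpy(
    constant float &a [[buffer(0)]],      // scalar multiplier
    device const float* X [[buffer(1)]],  // input vector (read-only)
    device float* Y [[buffer(2)]],        // input/output vector, updated in place
    constant uint &count [[buffer(3)]],   // number of elements
    uint id [[thread_position_in_grid]]   // one thread per element
) {
    // Skip threads past the end when the grid is larger than count.
    if (id >= count) return;
    Y[id] = a * X[id] + Y[id];
}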
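The `id >= count` guard matters when the host launches whole threadgroups and the element count is not a multiple of the threadgroup size. A minimal sketch of that arithmetic follows; the threadgroup size of 256 is an assumption for illustration (real code would typically query the pipeline state), and `threadgroupCount` is a hypothetical helper name.

```swift
// Hypothetical helper: number of threadgroups needed to cover `elements`
// when each threadgroup holds `threadsPerGroup` threads (ceiling division).
func threadgroupCount(elements: Int, threadsPerGroup: Int) -> Int {
    precondition(elements >= 0 && threadsPerGroup > 0)
    return (elements + threadsPerGroup - 1) / threadsPerGroup
}

let elements = 1_000_000
let threadsPerGroup = 256  // assumed value, not taken from the repository
let groups = threadgroupCount(elements: elements, threadsPerGroup: threadsPerGroup)
let launched = groups * threadsPerGroup

// Threads with id >= elements must return early, as the kernel's guard does.
print("groups: \(groups), surplus threads: \(launched - elements)")
// prints "groups: 3907, surplus threads: 192"
```

With these numbers, 192 threads would index past the end of X and Y without the bounds check.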

Host Code (main.swift)

// ... (full code as provided)
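The repository's main.swift is not reproduced here. As an illustration only, a host program for this kernel could look roughly like the sketch below. It assumes the shader is compiled at runtime from an inline source string, which may differ from how the repository loads kernel.metal; `runSaxpy`, `SaxpyError`, and `kernelSource` are hypothetical names. It needs macOS with a Metal-capable GPU.

```swift
import Foundation
import Metal

// Hypothetical error type for this sketch.
enum SaxpyError: Error { case metalUnavailable, setupFailed }

// Same kernel as kernel.metal, inlined so the sketch is self-contained.
let kernelSource = """
#include <metal_stdlib>
using namespace metal;

kernel void saxpy(
    constant float &a [[buffer(0)]],
    device const float* X [[buffer(1)]],
    device float* Y [[buffer(2)]],
    constant uint &count [[buffer(3)]],
    uint id [[thread_position_in_grid]]
) {
    if (id >= count) return;
    Y[id] = a * X[id] + Y[id];
}
"""

// Runs Y = a * X + Y on the GPU and returns the updated Y.
func runSaxpy(a: Float, x: [Float], y: [Float]) throws -> [Float] {
    precondition(!x.isEmpty && x.count == y.count)
    guard let device = MTLCreateSystemDefaultDevice() else {
        throw SaxpyError.metalUnavailable
    }
    // Compile the shader source and build a compute pipeline for "saxpy".
    let library = try device.makeLibrary(source: kernelSource, options: nil)
    guard let function = library.makeFunction(name: "saxpy") else {
        throw SaxpyError.setupFailed
    }
    let pipeline = try device.makeComputePipelineState(function: function)

    let byteCount = x.count * MemoryLayout<Float>.stride
    guard let queue = device.makeCommandQueue(),
          let xBuffer = device.makeBuffer(bytes: x, length: byteCount, options: .storageModeShared),
          let yBuffer = device.makeBuffer(bytes: y, length: byteCount, options: .storageModeShared),
          let commandBuffer = queue.makeCommandBuffer(),
          let encoder = commandBuffer.makeComputeCommandEncoder()
    else { throw SaxpyError.setupFailed }

    // Bind arguments at the indices the kernel declares with [[buffer(n)]].
    var scalar = a
    var count = UInt32(x.count)
    encoder.setComputePipelineState(pipeline)
    encoder.setBytes(&scalar, length: MemoryLayout<Float>.stride, index: 0)
    encoder.setBuffer(xBuffer, offset: 0, index: 1)
    encoder.setBuffer(yBuffer, offset: 0, index: 2)
    encoder.setBytes(&count, length: MemoryLayout<UInt32>.stride, index: 3)

    // One thread per element; Metal splits the grid into threadgroups.
    let width = min(pipeline.maxTotalThreadsPerThreadgroup, x.count)
    encoder.dispatchThreads(
        MTLSize(width: x.count, height: 1, depth: 1),
        threadsPerThreadgroup: MTLSize(width: width, height: 1, depth: 1))
    encoder.endEncoding()

    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()

    // Shared storage lets the CPU read the result directly from the buffer.
    let pointer = yBuffer.contents().bindMemory(to: Float.self, capacity: x.count)
    return Array(UnsafeBufferPointer(start: pointer, count: x.count))
}

// Small input with the same pattern as the README example: X[i] = i, Y[i] = 2i.
let input: [Float] = (0..<8).map { Float($0) }
let output = try runSaxpy(a: 2.0, x: input, y: input.map { 2 * $0 })
print(output)
```

With this input, each output element should equal 4i. The sketch uses shared-storage buffers for simplicity, and it binds the two scalars with `setBytes` rather than allocating buffers for them, which is a common choice for small constant arguments.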

Sample Output

(Shows first 5 and last 5 elements for brevity; full array is 1,000,000 elements)

Elapsed (s): 0.001
Sample results:
Y[0] = 0.0
Y[1] = 4.0
Y[2] = 8.0
Y[3] = 12.0
Y[4] = 16.0
...
Y[999995] = 3999980.0
Y[999996] = 3999984.0
Y[999997] = 3999988.0
Y[999998] = 3999992.0
Y[999999] = 3999996.0

Local Build and Run

To build and run locally on Apple Silicon (M1/M2):

  1. Install dependencies: ./install.sh
  2. Compile: swiftc main.swift -framework Metal -o vector_add
  3. Run: ./vector_add

This runs the SAXPY kernel on 1,000,000 floats on the GPU. The binary keeps the name vector_add from the compile step above.

CI/CD

  • GitHub Actions: Runs linting on ubuntu-latest and syncs to GitLab.
  • GitLab CI: Runs full build and test on macOS for Apple Silicon compatibility.
  • CircleCI: Runs validation on Ubuntu (file checks, actionlint).
  • Local CI: Use act for GitHub Actions simulation, gitlab-ci-local for GitLab CI, or CircleCI local CLI.

Note: GitHub's hosted macos-26 runners run natively on Apple Silicon and support Metal compute shaders, so the full build and test can also run on GitHub Actions with runs-on: macos-26.

Setting Up GitLab Runner

To run GitLab CI jobs on macOS, register a self-hosted runner:

Option 1: Using Homebrew (Recommended)

brew install gitlab-runner

Option 2: Manual Download

sudo curl --output /usr/local/bin/gitlab-runner https://gitlab-runner-downloads.s3.amazonaws.com/latest/binaries/gitlab-runner-darwin-arm64
sudo chmod +x /usr/local/bin/gitlab-runner

# Verify installation
gitlab-runner --version

Register the runner

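# With sudo, the runner config is written to /etc/gitlab-runner/config.toml;
# without it, to ~/.gitlab-runner/config.toml.
# Register as the same user that will later start the runner.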
sudo gitlab-runner register --non-interactive \
  --url https://gitlab.com \
  --registration-token <TOKEN> \
  --description macos \
  --tag-list macos \
  --executor shell

Start and verify

  • For Homebrew installations: The service is managed by brew services.
    brew services start gitlab-runner
    gitlab-runner verify
  • For manual installations: First install the service, then start it.
    sudo gitlab-runner install
    sudo gitlab-runner start
    gitlab-runner verify

If gitlab-runner start fails (e.g., with a launchctl error), run the runner in the background instead:

gitlab-runner run &
gitlab-runner verify

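Get the registration token from: GitLab → Project → Settings → CI/CD → Runners → New runner. On recent GitLab versions this page issues a runner authentication token (prefixed glrt-) instead, which is passed to gitlab-runner register with --token rather than --registration-token.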

License

MIT (see LICENSE and LICENSE.md)
