Show HN: UHOP – An Open Hardware Optimization Platform for GPUs
I’ve been working on UHOP (Universal Hardware Optimization Platform) — an open-source framework that helps developers optimize GPU and accelerator workloads across architectures (CUDA, ROCm, OpenCL, etc.) without vendor lock-in.
It started as a personal frustration: I’d write something that ran great on CUDA, then have to rewrite or retune for ROCm or OpenCL. UHOP tries to make that portable — it detects your hardware, generates or benchmarks candidate kernels, and caches the best performer. It also supports AI-assisted kernel generation using OpenAI APIs and comes with a simple CLI for demos and benchmarking.
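To make the detect-benchmark-cache loop concrete, here is a minimal hand-rolled sketch of that pattern in plain NumPy. None of this is UHOP's actual API; the helper names and the cache key are placeholders I made up, and a real cache would also key on device/backend and dtype.

    # Illustrative only: a hand-rolled version of the tune-and-cache loop
    # that UHOP automates. Helper names are placeholders, not UHOP's API.
    import time
    import numpy as np

    def matmul_naive(a, b):
        # Baseline candidate: row-by-column loop (slow, but always available).
        m, _ = a.shape
        _, n = b.shape
        out = np.zeros((m, n), dtype=a.dtype)
        for i in range(m):
            for j in range(n):
                out[i, j] = np.dot(a[i, :], b[:, j])
        return out

    def matmul_blas(a, b):
        # Vendor-tuned candidate: whatever BLAS NumPy was built against.
        return a @ b

    def time_once(fn, *args):
        start = time.perf_counter()
        fn(*args)
        return time.perf_counter() - start

    _kernel_cache = {}  # (op name, shapes) -> fastest candidate seen so far

    def best_kernel(op, candidates, *args):
        key = (op, tuple(x.shape for x in args))
        if key not in _kernel_cache:
            timings = [(time_once(fn, *args), fn) for fn in candidates]
            _kernel_cache[key] = min(timings, key=lambda t: t[0])[1]
        return _kernel_cache[key]

    a = np.random.rand(64, 64).astype(np.float32)
    b = np.random.rand(64, 64).astype(np.float32)
    print("picked:", best_kernel("matmul", [matmul_naive, matmul_blas], a, b).__name__)

As I understand it, UHOP does this per detected backend and persists the cache between runs; the sketch only keeps it in memory, but the control flow is the same idea.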
Right now, UHOP can:
Auto-detect hardware backends and pick the best-performing candidate kernel
Run and benchmark fused ops like conv+ReLU (see the sketch after this list)
Cache and reuse tuned kernels
Generate kernels dynamically via codegen (CUDA/OpenCL/Python/Triton)
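A quick note on "fused" for anyone skimming: it means the elementwise ReLU is applied while the conv output is being produced, instead of as a second pass over memory. A toy 1-D illustration in plain NumPy (not UHOP code; on a GPU the win is skipping the extra trip through global memory):

    import numpy as np

    def conv1d_then_relu(x, w):
        # Unfused: materialize the full conv output, then a second ReLU pass.
        out = np.convolve(x, w, mode="valid")
        return np.maximum(out, 0.0)

    def conv1d_relu_fused(x, w):
        # Fused: clamp each output element as it is produced, in one pass.
        n = len(x) - len(w) + 1
        out = np.empty(n, dtype=x.dtype)
        for i in range(n):
            acc = np.dot(x[i:i + len(w)], w[::-1])  # same convolution sum
            out[i] = acc if acc > 0.0 else 0.0
        return out

    x = np.random.randn(32).astype(np.float32)
    w = np.random.randn(5).astype(np.float32)
    assert np.allclose(conv1d_then_relu(x, w), conv1d_relu_fused(x, w), atol=1e-5)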
There’s still a lot in progress — better backend integration, distributed optimization, and a web dashboard for visualizing results. I’m sharing it early to get feedback from folks who’ve worked on compilers, GPU runtimes, and ML infra.
Repo: github.com/sevenloops/uhop
Demo: uhop.dev
Would love any thoughts on the architecture or testing approach, and contributions are very welcome, especially from ex-NVIDIA/ROCm engineers.