Exploiting RISC-V Hint Instructions for Lightweight VLIW Execution
Leo Marek, Gregory Chekler, Jing Jing Chen, Elliot Chae, Ethan Carr, Ray Simar — Rice University
The pursuit of high Instruction-Level Parallelism (ILP) at lower power has renewed interest in Very Long Instruction Word (VLIW) architectures. Conventional VLIW designs, however, often suffer from poor code density and a lack of binary compatibility. This work introduces a hint-based VLIW implementation built on the RISC-V ISA. We use architecturally reserved HINT instructions to encode static scheduling decisions, enabling parallel execution without complex hazard-detection hardware.
The design was evaluated using the Google MPACT simulator and implemented at RTL by modifying the OpenHW Group CVW (Wally) core. It is verified with Questa and Verilator and is being synthesized for FPGA deployment. Results on a suite of handwritten DSP benchmarks (FFT, IIR, FIR, and Dot Product, a subset of Embench DSP) show speedups of 1.5–4× while preserving full binary compatibility: libraries written for STARBUG can be linked directly with unmodified RISC-V code.
We began by modeling the HINT-based extensions in MPACT to validate decoding and bundle formation before committing to RTL. For hardware, we modified the fetch and decode stages of Wally to recognize scheduled bundles and implemented a register file with 12 read ports and 4 write ports to support four integer datapaths (4-wide VLIW).
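As an illustration of what static bundle formation involves, the sketch below greedily packs an in-order instruction stream into bundles of at most four operations, closing a bundle whenever the next instruction reads or rewrites a register already written in that bundle. It is a minimal sketch under stated assumptions, not the project's scheduler: the Instr type, the pack_bundles function, and the dependency rules (register RAW and WAW only, no memory or control dependencies) are all hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction record used only for this illustration."""
    text: str
    dest: str                # destination register ("x0" if the result is discarded)
    srcs: Tuple[str, ...]    # source registers


def pack_bundles(program: List[Instr], width: int = 4) -> List[List[Instr]]:
    """Greedy in-order packing into bundles of at most `width` instructions.

    A new bundle is started when the current one is full, or when the next
    instruction reads (RAW) or rewrites (WAW) a register that an earlier
    instruction in the same bundle writes. Memory and control dependencies
    are ignored here (simplifying assumption).
    """
    bundles: List[List[Instr]] = []
    current: List[Instr] = []
    written = set()  # registers written by instructions in the current bundle

    for ins in program:
        raw = any(src in written for src in ins.srcs)
        waw = ins.dest in written
        if current and (len(current) == width or raw or waw):
            bundles.append(current)
            current, written = [], set()
        current.append(ins)
        if ins.dest != "x0":  # writes to x0 are discarded, so they create no dependency
            written.add(ins.dest)

    if current:
        bundles.append(current)
    return bundles


example = [
    Instr("add x5, x1, x2", "x5", ("x1", "x2")),
    Instr("add x6, x3, x4", "x6", ("x3", "x4")),
    Instr("mul x7, x5, x6", "x7", ("x5", "x6")),  # depends on both adds above
    Instr("add x8, x1, x3", "x8", ("x1", "x3")),
]
example_bundles = pack_bundles(example)
```

Here the two independent adds share a bundle, and the multiply starts a second bundle because it consumes their results. A real scheduler would also reorder instructions and account for latencies, which this sketch does not attempt.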
Superscalar cores extract ILP via dynamic scheduling, but speculation and hazard-detection logic cost area and power. In our approach, HINT instructions execute as NOPs on standard RISC-V cores but carry explicit dependency information for our microarchitecture, enabling VLIW-style parallel issue without the complexity of traditional dynamic scheduling.
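To make the NOP property concrete, the sketch below encodes and recognizes one kind of RISC-V HINT. In the base ISA, integer computational instructions whose destination is x0 have no architectural effect, and the unprivileged specification designates several of them, including SLTI with rd = x0, as HINT code points for custom use. The text above does not say which HINT encodings this design uses or how it lays out its dependency information, so the choice of SLTI and the treatment of the rs1 and immediate fields as a payload are assumptions made purely for illustration; the function names are hypothetical.

```python
OPCODE_OP_IMM = 0x13   # RISC-V OP-IMM major opcode
FUNCT3_SLTI = 0b010    # funct3 value for SLTI


def encode_slti(rd: int, rs1: int, imm: int) -> int:
    """Encode `slti rd, rs1, imm` as a 32-bit I-type instruction word."""
    if not (0 <= rd < 32 and 0 <= rs1 < 32 and -2048 <= imm < 2048):
        raise ValueError("field out of range")
    return (((imm & 0xFFF) << 20) | (rs1 << 15) | (FUNCT3_SLTI << 12)
            | (rd << 7) | OPCODE_OP_IMM)


def is_slti_hint(word: int) -> bool:
    """True if `word` is SLTI with rd = x0, i.e. a HINT with no architectural effect."""
    opcode = word & 0x7F
    rd = (word >> 7) & 0x1F
    funct3 = (word >> 12) & 0x7
    return opcode == OPCODE_OP_IMM and funct3 == FUNCT3_SLTI and rd == 0


def hint_fields(word: int) -> tuple:
    """Return (rs1, imm12) of a HINT word.

    Treating these fields as scheduling metadata is an assumption of this
    sketch, not the design's documented payload format.
    """
    rs1 = (word >> 15) & 0x1F
    imm12 = (word >> 20) & 0xFFF
    return rs1, imm12


hint_word = encode_slti(rd=0, rs1=0, imm=3)   # slti x0, x0, 3
```

A standard core executes hint_word as an ordinary SLTI whose result is discarded because it targets x0, while a hint-aware decoder can test is_slti_hint and read the remaining fields as metadata.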