THURSDAY November 05, 8:00am - 12:00pm | Slot 5
Designing Quantized IP Models with QKeras and hls4ml

Claudionor Coelho - Palo Alto Networks, Inc.
Sioni Summers - CERN, Geneva, Switzerland
Vladimir Loncar - CERN, Geneva, Switzerland
Jennifer Ngadiuba - CERN, Geneva, Switzerland
Thea Aarrestad - CERN, Geneva, Switzerland
In this hands-on workshop, we teach the basics of quantization using QKeras, show how to create a quantized model with it, and show how to generate ML/DL IPs using hls4ml. We introduce the QKeras library, an extension of Keras that allows heterogeneously quantized versions of deep neural network models to be created through drop-in replacement of Keras layers. These models are trained quantization-aware, so the user can trade off model area or energy consumption against accuracy. We demonstrate how reducing numerical precision through quantization-aware training significantly reduces resource consumption while retaining high accuracy when the model is implemented on FPGA hardware. hls4ml is a popular library for synthesizing ML models, and it supports QKeras. It generates HLS-based models targeted at the FPGA synthesis flow. This workshop complements the paper submission to ICCAD, giving participants hands-on experience with quantization and synthesis of ML models from industry experts in this area from Google and CERN.
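
To make the idea of reduced numerical precision concrete, the following is a minimal sketch of signed fixed-point quantization in plain Python. It is not the QKeras or hls4ml API: the function name `quantize_fixed_point`, its parameters, and the sample weights are hypothetical and chosen only for illustration. It assumes one sign bit, round-to-nearest, and saturation at the range limits.

```python
def quantize_fixed_point(x, total_bits=8, integer_bits=0):
    """Map x onto a signed fixed-point grid (illustrative helper, not a library call).

    Assumes one sign bit, `integer_bits` integer bits, and the remaining
    bits as the fractional part. Values outside the range are saturated.
    """
    frac_bits = total_bits - 1 - integer_bits
    scale = 2 ** frac_bits                 # number of grid steps per unit
    lowest = -(2 ** (total_bits - 1))      # most negative integer code
    highest = 2 ** (total_bits - 1) - 1    # most positive integer code
    code = round(x * scale)                # Python rounds exact ties to even
    code = max(lowest, min(highest, code)) # saturate instead of wrapping
    return code / scale


# Hypothetical weights, all inside the representable range [-1, 1).
weights = [0.7312, -0.0421, 0.2503, -0.9999]

for bits in (8, 4, 2):
    quantized = [quantize_fixed_point(w, total_bits=bits) for w in weights]
    worst = max(abs(w - q) for w, q in zip(weights, quantized))
    print(f"{bits} bits: {quantized} (largest error {worst:.4f})")
```

Fewer bits give a coarser grid and a larger rounding error, which is the accuracy cost that quantization-aware training is meant to absorb while the hardware benefits from the narrower arithmetic.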