KalEdge-Lite
Neural Network Compression & FPGA Deployment Framework
KalEdge-Lite is an end-to-end framework for neural network compression and FPGA deployment, accessible through a web-based interface. It orchestrates the full pipeline, from dataset preparation and architecture definition through compression and synthesis to resource estimation.
Compression pipeline
Multiple training strategies can be combined within a unified execution pipeline: pruning, TF-MOT quantization, QKeras quantization, QAT, QAP, and knowledge distillation, run either as separate experiments or as hybrid sequential workflows. All trained models are available for download.
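A hybrid sequential workflow can be pictured as an ordered list of compression stages applied one after another. The sketch below is a minimal, framework-agnostic illustration of that idea; the `Model` placeholder, stage names, and size formula are hypothetical and do not reflect KalEdge-Lite's actual API.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Model:
    # Hypothetical stand-in for a trained network: only the properties
    # a compression pipeline cares about are tracked here.
    params: int
    bits: int = 32          # weight precision
    sparsity: float = 0.0   # fraction of weights pruned

def prune(target_sparsity: float) -> Callable[[Model], Model]:
    """Pruning stage: raise sparsity to at least the target level."""
    def stage(m: Model) -> Model:
        m.sparsity = max(m.sparsity, target_sparsity)
        return m
    return stage

def quantize(bits: int) -> Callable[[Model], Model]:
    """Quantization stage: lower weight precision to `bits`."""
    def stage(m: Model) -> Model:
        m.bits = min(m.bits, bits)
        return m
    return stage

def run_pipeline(model: Model, stages: List[Callable[[Model], Model]]) -> Model:
    """Apply compression stages sequentially (a hybrid workflow)."""
    for stage in stages:
        model = stage(model)
    return model

def size_bytes(m: Model) -> float:
    """Rough compressed size: non-zero parameters at the chosen precision."""
    return m.params * (1.0 - m.sparsity) * m.bits / 8

# Hybrid workflow: prune to 50% sparsity, then quantize to 8 bits.
model = run_pipeline(Model(params=1_000_000), [prune(0.5), quantize(8)])
print(size_bytes(model))  # 500,000 remaining params at 8 bits -> 500000.0
```

Running the stages as separate experiments instead would mean calling `run_pipeline` once per single-stage list and comparing the resulting models.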
Hardware-aware screening & deployment
Each generated model is scored across task-level metrics (accuracy, model size, parameter count) using a configurable policy (FPGA-oriented, embedded, TinyML, or accuracy-driven). The top-ranked configuration is exported to a synthesizable HLS project via hls4ml. A lightweight analytical surrogate model then estimates latency and FPGA resource utilization without requiring full synthesis, enabling fast design-space exploration. The generated accelerator IP is automatically integrated into a preconfigured Vivado platform template, completing the ML-to-FPGA workflow.
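The policy-based screening described above can be sketched as a weighted ranking over normalized metrics. The weight vectors, candidate metrics, and function names below are illustrative assumptions, not KalEdge-Lite's actual policies or numbers.

```python
POLICIES = {
    # Hypothetical weight vectors; the real policies may differ.
    "fpga-oriented":   {"accuracy": 0.4, "size": 0.4, "params": 0.2},
    "accuracy-driven": {"accuracy": 0.8, "size": 0.1, "params": 0.1},
}

def rank(candidates, policy):
    """Rank compressed models best-first under a policy's weights.

    Each candidate is a dict with raw metrics: accuracy in [0, 1],
    size in bytes, params as a count.
    """
    w = POLICIES[policy]

    def minmax(key):
        # Scale a "lower is better" metric to [0, 1] across candidates.
        lo = min(c[key] for c in candidates)
        hi = max(c[key] for c in candidates)
        span = (hi - lo) or 1.0
        return {c["name"]: (c[key] - lo) / span for c in candidates}

    size_n, params_n = minmax("size"), minmax("params")

    def score(c):
        # Accuracy is higher-is-better; size and params are
        # lower-is-better, so their normalized values are inverted.
        return (w["accuracy"] * c["accuracy"]
                + w["size"] * (1.0 - size_n[c["name"]])
                + w["params"] * (1.0 - params_n[c["name"]]))

    return sorted(candidates, key=score, reverse=True)

candidates = [
    {"name": "fp32-baseline", "accuracy": 0.92, "size": 4_000_000, "params": 1_000_000},
    {"name": "pruned-int8",   "accuracy": 0.90, "size":   500_000, "params":   500_000},
]
print(rank(candidates, "fpga-oriented")[0]["name"])  # pruned-int8
```

Only the top-ranked configuration then proceeds to hls4ml export, so a cheap scoring pass like this keeps the expensive HLS step out of the inner search loop.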
End-to-end KalEdge-Lite application flow
Launch KalEdge-Lite [Coming soon]