The rise of eHealth technologies has transformed cardiac disease diagnosis, leveraging edge computing, AI, and IoT to offer critical insights into heart health. Data privacy constraints in centralized systems hinder access to large-scale ECG datasets, posing challenges for early diagnosis. While advances in quantization and compression enable neural networks to run on edge devices, effective solutions for efficient training and inference on constrained devices remain limited. To address the challenges of training on the edge, we propose a mixed-precision quantized DNN FPGA accelerator designed for multi-class cardiac diagnosis. Our solution achieves a top-1 test accuracy of up to 93.26% while enhancing computational efficiency, optimizing resource usage, and reducing transmission power. Our Mixed-Precision Quantized FPGA Accelerator achieves up to 136x and 7.2x faster inference than state-of-the-art Split-CNN and DCNN-Convolutional FPGA accelerators, respectively. The accelerators offer a throughput of up to 1439.36 samples per second, a latency of only 695 µs, and programmable logic power consumption below 600 mW. Using hardware-software co-design, our FPGA-based "Training on the Edge" approach combines software flexibility with hardware speed and improves diagnostic top-1 test accuracy by up to 2.8% within just five training cycles, making the model more robust to dataset diversity. The proposed approach also accelerates development and reduces hardware rebuild time by a factor of (Training Cycles - 1)x, enabling efficient, sustainable ML solutions on edge devices. The source code is made available at https://github.com/shakeelakram00/Continual-Learningon-FPGAs-using-FINN
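
The abstract does not spell out the network definition; as a minimal, illustrative sketch only, the snippet below shows how a mixed-precision quantized 1-D CNN for ECG classification might be expressed with Brevitas, the quantization-aware training library commonly paired with the FINN flow referenced in the repository. All layer sizes, bit widths, and the class count are hypothetical placeholders, not values taken from this work.

```python
# Hypothetical sketch: layer sizes, bit widths, and the number of
# diagnostic classes are placeholders, not taken from the paper.
import torch
import torch.nn as nn
from brevitas.nn import QuantConv1d, QuantLinear, QuantReLU, QuantIdentity


class MixedPrecisionECGNet(nn.Module):
    """Small 1-D CNN whose layers use different weight/activation bit widths."""

    def __init__(self, num_classes: int = 5):
        super().__init__()
        # Quantize the raw single-lead ECG input to 8 bits.
        self.quant_in = QuantIdentity(bit_width=8)
        # Early layer keeps wider 8-bit weights ...
        self.conv1 = QuantConv1d(1, 16, kernel_size=7, padding=3, weight_bit_width=8)
        self.relu1 = QuantReLU(bit_width=8)
        # ... later layers tolerate narrower 4-bit and 2-bit weights,
        # which is the essence of mixed-precision quantization.
        self.conv2 = QuantConv1d(16, 32, kernel_size=5, padding=2, weight_bit_width=4)
        self.relu2 = QuantReLU(bit_width=4)
        self.pool = nn.AdaptiveAvgPool1d(1)
        self.fc = QuantLinear(32, num_classes, bias=True, weight_bit_width=2)

    def forward(self, x):
        x = self.quant_in(x)
        x = self.relu1(self.conv1(x))
        x = self.relu2(self.conv2(x))
        x = self.pool(x).flatten(1)
        return self.fc(x)


if __name__ == "__main__":
    model = MixedPrecisionECGNet()
    dummy_ecg = torch.randn(1, 1, 1000)   # one single-lead ECG window
    print(model(dummy_ecg).shape)          # torch.Size([1, 5])
```

A model expressed this way can be trained with standard PyTorch loops and then exported for an FPGA dataflow build, which is consistent with the hardware-software co-design direction described above; the exact architecture and export path used by the authors are documented in the linked repository.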