Optimal control approaches, such as model predictive control (MPC) and reinforcement learning (RL), are promising tools in decentralized grid applications, but they face various challenges, e.g., in the presence of model inaccuracy or with regard to safety-related limitations. While contemporary research articles mainly cover simulation-based investigations, this contribution presents the application and head-to-head comparison of both control algorithms on commercial inverter hardware, the topology being a three-phase, three-level inverter with an LC filter and a common neutral point. Herein, the safety-shielded RL controller is trained from scratch using an edge learning toolchain while simultaneously interacting with the hardware. Both control approaches benefit from system identification, which decouples the achievable controller and safeguard performance from potentially unreliable or incomplete a priori information. During transients, both methods exhibit almost identically fast settling characteristics when operated under the same safety constraints. In steady state, MPC achieves a normalized mean absolute error (with respect to the reference voltage amplitude) of 0.33 %, while the RL controller even reaches 0.18 %. Lastly, it is empirically demonstrated that the RL controller compensates inverter non-linearities which, in contrast, noticeably reduce the MPC's tracking performance in the partial-load range if left unaddressed.
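For orientation, the steady-state metric quoted above can be read as follows; this is a minimal sketch assuming the voltage tracking error is averaged over N samples and normalized by the reference voltage amplitude, with the symbols $v^{*}[k]$, $v[k]$, $\hat{v}^{*}$, and $N$ introduced here purely for illustration rather than taken from the paper:

$$\mathrm{NMAE} = \frac{1}{N}\sum_{k=1}^{N}\frac{\lvert v^{*}[k]-v[k]\rvert}{\hat{v}^{*}}\cdot 100\,\%$$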