A long-standing challenge in studying the global carbon cycle has been to understand the factors that control interannual variation (IAV) of carbon fluxes and to improve their representation in existing biogeochemical models. Here, we compared an optimality-based model and a semi-empirical light use efficiency model to understand how current models can be improved to simulate the IAV of gross primary production (GPP). Both models simulated hourly GPP and were parameterized (1) for each site–year, (2) for each site with an additional constraint on IAV (CostIAV), (3) for each site, (4) for each plant functional type, and (5) globally. The calibrated parameters were then used in forward runs, and the models were evaluated at different temporal scales across 198 eddy-covariance sites representing diverse climate–vegetation types, with the Nash–Sutcliffe efficiency (NSE) as the measure of model fitness. For most sites, both models simulated hourly GPP better (median normalized NSE: 0.83 and 0.85) than annual GPP (median normalized NSE: 0.54 and 0.63). Specifically, the NSE of the optimality-based model improved substantially, from -1.39 to 0.92, when drought stress was explicitly included. Most of the variability in model performance was due to model type and parameterization strategy. The semi-empirical model produced statistically better hourly simulations than the optimality-based model, and site–year parameterization yielded better annual model performance. Annual model performance did not improve even when the models were parameterized using CostIAV. Furthermore, both models underestimated the peaks of diurnal GPP, suggesting that improving the prediction of these peaks could improve annual model performance. Our findings reveal current modelling deficiencies in representing the IAV of carbon fluxes and can guide further model development.
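The evaluation metric named above can be illustrated with a minimal sketch. The abstract does not spell out how NSE or its normalized form is computed, so the code below assumes the standard definition, NSE = 1 - SSE / SST, and one commonly used normalization, 1 / (2 - NSE), which maps the range (-inf, 1] onto (0, 1]. The authors may have used a different normalization. The function names `nse` and `normalized_nse` are hypothetical and chosen only for illustration.

```python
from statistics import fmean


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency (standard definition, assumed here).

    1 means a perfect fit, 0 means the simulation is no better than
    the mean of the observations, and negative values mean it is worse.
    """
    if len(observed) != len(simulated) or not observed:
        raise ValueError("observed and simulated must be non-empty and equal in length")
    obs_mean = fmean(observed)
    # Sum of squared errors between observations and simulation.
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Sum of squared deviations of the observations from their mean.
    sst = sum((o - obs_mean) ** 2 for o in observed)
    if sst == 0:
        raise ValueError("NSE is undefined when the observations have zero variance")
    return 1.0 - sse / sst


def normalized_nse(nse_value):
    """Rescale NSE to (0, 1] with 1 / (2 - NSE).

    Assumption: this is one common normalization, not necessarily
    the one used in the study.
    """
    return 1.0 / (2.0 - nse_value)
```

Under this assumed normalization, an NSE of 0 (a simulation no better than the observed mean) maps to 0.5, and strongly negative values are compressed toward 0, which keeps a median across many sites from being dominated by a few poorly fitted ones.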