Motion in-betweening is a flexible and efficient technique for generating 3D animations. In this paper, we propose a keyframe-driven synthesis framework that addresses the pose ambiguity issue and achieves robust in-betweening performance. The framework operates recursively: at each recursion, the key poses at both ends of a clip predict a new pose at its midpoint, and the process repeats on the two resulting halves. This recursive breakdown reduces motion ambiguity by treating the in-betweening sequence as a composition of short clips. A hybrid positional encoding scales the hidden states to adapt to both long-term and short-term dependencies. Additionally, we employ a temporal refinement network to capture local motion relationships, thereby enhancing the consistency of the predicted pose sequence. Comprehensive evaluations, including both quantitative and qualitative comparisons, show that the proposed model is competitive in prediction accuracy and in-betweening flexibility.
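The recursive midpoint breakdown described above can be illustrated with a minimal sketch. It is not the proposed model: the names `recursive_inbetween` and `average_midpoint` are hypothetical, poses are assumed to be flat lists of floats, and a simple average of the two end poses stands in for the learned midpoint predictor. The hybrid positional encoding and the temporal refinement network are omitted.

```python
from typing import Callable, List, Optional

Pose = List[float]  # assumption: a pose is a flat vector of joint values


def average_midpoint(left: Pose, right: Pose) -> Pose:
    """Placeholder predictor: averages the two key poses.

    This stands in for the learned network, which is not reproduced here.
    """
    return [(a + b) / 2.0 for a, b in zip(left, right)]


def recursive_inbetween(
    start: Pose,
    end: Pose,
    num_frames: int,
    predict_mid: Callable[[Pose, Pose], Pose] = average_midpoint,
) -> List[Pose]:
    """Fill a clip of num_frames poses by recursive midpoint prediction."""
    if num_frames < 2:
        raise ValueError("num_frames must include both key poses (>= 2)")

    frames: List[Optional[Pose]] = [None] * num_frames
    frames[0], frames[-1] = start, end

    def fill(lo: int, hi: int) -> None:
        # Stop when no frame lies strictly between the two key poses.
        if hi - lo < 2:
            return
        mid = (lo + hi) // 2
        # The poses at both ends predict the pose at the midpoint.
        frames[mid] = predict_mid(frames[lo], frames[hi])
        # The new pose becomes a key pose for each shorter sub-clip.
        fill(lo, mid)
        fill(mid, hi)

    fill(0, num_frames - 1)
    return frames


if __name__ == "__main__":
    for pose in recursive_inbetween([0.0], [4.0], 5):
        print(pose)
```

Each call only ever conditions on two poses that bound a progressively shorter clip, which is the sense in which the breakdown simplifies a long sequence into short ones. Swapping `predict_mid` for a trained predictor would leave the recursion itself unchanged.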