This paper investigates a non-zero-sum differential investment and reinsurance game between two competing, alpha-robust, risk-averse insurers under a time-consistent mean-variance criterion inspired by Li et al. (2016). The claim process of each insurer follows the classical Cramér-Lundberg model, and the reinsurance premium is calculated according to the variance premium principle. Each insurer can invest its surplus in a risk-free asset, a risky asset, and a defaultable corporate bond. We also account for bounded memory, which is modeled by a wealth process with delay. Using the dynamic programming approach, we derive the non-zero-sum alpha-robust optimal strategy and the corresponding value function by solving the Hamilton-Jacobi-Bellman (HJB) equation. In the numerical simulations, we observe that an insurer's optimal investment in the defaultable bond decreases as the competitor's risk aversion coefficient increases, provided that the insurer's own risk aversion coefficient is held constant. When the insurer's own risk aversion coefficient changes, however, the optimal investment in the defaultable bond decreases as that coefficient increases, even if the competitor's risk aversion coefficient moves in the opposite direction.