A radar high-resolution range profile (HRRP) provides target structure information and has great potential for target recognition. However, most existing deep learning methods do not fully exploit this structural information, because they focus on either local or sequence information alone. Furthermore, existing methods treat target and non-target regions in HRRP equally, which hinders target feature extraction. In this letter, we propose a target recognition method using wavelet patch merging and a contraction Transformer, called CT. CT adaptively focuses on the target region and efficiently extracts both local and sequence information. It uses convolution to extract local features and contraction self-attention to extract sequential features, while wavelet patch merging is used to avoid oversampling. Experimental results show that CT effectively extracts structural features in HRRP, improving target recognition performance, and that it remains robust under low signal-to-noise ratio conditions.
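
As an illustration of the wavelet patch merging idea, the following is a minimal sketch only, under assumptions not stated in the abstract: a single-level Haar wavelet, scalar range cells rather than learned token embeddings, and the hypothetical helper names `haar_patch_merge` and `haar_patch_split`. It shows how a 1-D profile can be halved in length while the low-pass and high-pass coefficients are kept as two channels, so that the merge step is invertible. The wavelet and layout actually used by CT may differ.

```python
import math


def haar_patch_merge(x):
    """Single-level Haar transform over a 1-D sequence (illustrative assumption).

    Each pair of neighbouring samples becomes one [approximation, detail] pair,
    so the sequence length halves while the channel count doubles.
    """
    if len(x) % 2 != 0:
        raise ValueError("sequence length must be even")
    s = 1.0 / math.sqrt(2.0)
    merged = []
    for i in range(0, len(x), 2):
        a, b = x[i], x[i + 1]
        merged.append([s * (a + b), s * (a - b)])
    return merged


def haar_patch_split(merged):
    """Inverse of haar_patch_merge: recover the original samples."""
    s = 1.0 / math.sqrt(2.0)
    x = []
    for low, high in merged:
        x.append(s * (low + high))
        x.append(s * (low - high))
    return x


# Hypothetical toy range profile (one amplitude per range cell).
profile = [0.1, 0.1, 0.9, 0.3, 0.2, 0.8, 0.1, 0.1]
merged = haar_patch_merge(profile)    # 4 positions, 2 channels each
restored = haar_patch_split(merged)   # matches profile up to rounding
```

Because the Haar transform is orthonormal, the merged representation preserves the energy of the profile and can be inverted exactly, unlike strided pooling, which discards samples.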