Anomaly detection (AD) in medical images aims to identify abnormal inputs at test time based on the normal samples in the training set. Knowledge distillation based on the teacher-student (T-S) model is a simple and effective way to identify anomalies, yet its efficacy is constrained by the similarity between the teacher and student network architectures. To address this problem, we propose a T-S model with skip connections (Skip-TS), trained by direct reverse knowledge distillation (DRKD), for AD in medical images. First, to overcome the low sensitivity to anomalies caused by structural similarity, we design an encoder-decoder architecture in which the teacher network (T-Net) is a pre-trained encoder and the student network (S-Net) is a randomly initialized decoder. During training, the S-Net learns to reconstruct the shallow representations of images from the output of the T-Net; we call this procedure DRKD. Second, we introduce skip connections into the T-S model to prevent the S-Net from missing multi-scale normal information in images. In addition, we design a multi-scale anomaly consistency (MAC) loss to improve anomaly detection and localization performance. Extensive experiments on twelve public and two private medical datasets demonstrate that our approach surpasses the current state of the art by 6.4% and 8.2% in AUROC on the public and private datasets, respectively. Code and organized benchmark datasets will be available at https://github.com/Arktis2022/Skip-TS.
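
To illustrate the general idea behind T-S anomaly detection, the following minimal sketch scores each spatial location by the discrepancy between a teacher feature and the corresponding student reconstruction. It is not the Skip-TS implementation: the abstract does not specify the scoring function, so the use of cosine distance, the max-pooling of the map into an image-level score, and the names `cosine_distance`, `anomaly_map`, and `image_score` are all assumptions made for illustration. Feature maps are represented as plain nested lists so that the example needs no external libraries.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: a zero vector is treated as maximally dissimilar.
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def anomaly_map(teacher_feats, student_feats):
    """Per-location discrepancy between teacher and student feature maps.

    Both arguments are H x W grids (lists of rows) of feature vectors.
    A large value means the student failed to reproduce the teacher
    feature at that location, which is read as evidence of an anomaly.
    """
    return [
        [cosine_distance(t, s) for t, s in zip(t_row, s_row)]
        for t_row, s_row in zip(teacher_feats, student_feats)
    ]


def image_score(score_map):
    """Image-level score: the largest per-location discrepancy (assumed)."""
    return max(max(row) for row in score_map)


# Toy 1 x 2 feature maps with 2-channel features (hypothetical values).
teacher = [[[1.0, 0.0], [0.0, 1.0]]]
student = [[[1.0, 0.0], [1.0, 0.0]]]  # second location is reconstructed badly

scores = anomaly_map(teacher, student)
print(scores)
print(image_score(scores))
```

In this toy input the first location is reproduced exactly and the second is orthogonal to the teacher feature, so only the second location receives a high score. In a multi-scale setting such as the one described above, one map per scale would typically be resized to a common resolution and combined, but how Skip-TS does this is not stated in the abstract.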