Underwater object tracking is a challenging task in computer vision. Uneven lighting, turbid water, and complex target motion patterns make the underwater environment complex and dynamic, and these conditions limit the accuracy and stability of existing underwater object tracking methods. This study proposes an underwater object tracking method based on fine-grained temporal encoding and decoding. A refinement module that combines fine-grained consistency with candidate elimination extracts fine-grained features of the target and mitigates interference from these underwater conditions during feature extraction, thereby improving the precision of the target features. A temporal encoding-decoding module then propagates the target features along the temporal sequence so that the relational information between frames is exploited, which further enhances tracking stability. Experiments were conducted on UVOT400, a large-scale, attribute-rich dataset with diverse target categories. The results demonstrate that the proposed method significantly outperforms existing methods in both accuracy and stability, offering new insights and an effective solution for underwater object tracking.