Generative modeling has advanced significantly with the rise of diffusion models, which leverage iterative denoising processes to generate high-quality data. Originally developed for image synthesis, these models have since been extended to a wide range of domains, including text, audio, and video. Beyond their generative capabilities, diffusion models have also demonstrated strong potential for representation learning, yielding features that transfer well to downstream tasks. This survey provides a comprehensive review of diffusion models in the context of representation learning. We first introduce the theoretical foundations and key architectural advances, such as improved noise schedules, latent diffusion, and conditional modeling. Next, we examine their role in self-supervised and contrastive learning, highlighting how the denoising process facilitates the learning of structured feature representations. We also explore the integration of diffusion models with other generative frameworks, including VAEs and GANs, to enhance efficiency and representation quality. Despite their growing success, diffusion models face several challenges, including high computational costs, slow inference, and limited theoretical understanding. We discuss ongoing efforts to address these limitations, including accelerated sampling techniques, model compression strategies, and the development of more interpretable representations. Additionally, we highlight ethical concerns, such as bias, robustness, and the potential misuse of generated content. Finally, we outline promising research directions, including multimodal and cross-domain learning, improved integration with large-scale language models, and real-world deployment considerations. By synthesizing recent developments, this survey aims to provide a structured perspective on the evolving role of diffusion models in both generative modeling and representation learning.
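For concreteness, the iterative denoising referred to above is most commonly instantiated as the DDPM formulation of Ho et al. (2020); the equations below are a standard sketch of that formulation, not a construction specific to this survey. A forward process gradually corrupts data $\mathbf{x}_0$ with Gaussian noise according to a schedule $\bar{\alpha}_t = \prod_{s=1}^{t}\alpha_s$, and a network $\boldsymbol{\epsilon}_\theta$ is trained to predict the injected noise:
\begin{align}
q(\mathbf{x}_t \mid \mathbf{x}_0) &= \mathcal{N}\!\left(\mathbf{x}_t;\ \sqrt{\bar{\alpha}_t}\,\mathbf{x}_0,\ (1-\bar{\alpha}_t)\mathbf{I}\right), \\
\mathcal{L}_{\text{simple}} &= \mathbb{E}_{\mathbf{x}_0,\ \boldsymbol{\epsilon}\sim\mathcal{N}(\mathbf{0},\mathbf{I}),\ t}\!\left[\left\lVert \boldsymbol{\epsilon} - \boldsymbol{\epsilon}_\theta\!\left(\sqrt{\bar{\alpha}_t}\,\mathbf{x}_0 + \sqrt{1-\bar{\alpha}_t}\,\boldsymbol{\epsilon},\ t\right)\right\rVert^2\right].
\end{align}
The intermediate activations of $\boldsymbol{\epsilon}_\theta$ computed while minimizing this objective are the features that the representation-learning work surveyed here builds on.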