In remote sensing image segmentation, recognizing buildings is challenging when the pixel-level visual evidence is weak or when buildings appear as small, spatially structured objects. To address this issue, we propose a structure-prior guided adaptive context selection network (SGACS-Net) for remote sensing semantic segmentation. Its core idea is to use structure-prior knowledge to dynamically capture prior contextual information and higher-order object structural features, thereby improving the accuracy of building segmentation. First, an adaptive context selection module is designed. By dynamically adjusting the spatial receptive field, this module models long-range contextual dependencies and captures the varying context of buildings at different scales, thereby strengthening the network’s building feature representations. Second, a structure-prior guided variable loss function is proposed. It uses the point, line, and surface structures of buildings to identify key regions and, by leveraging high-level structure-prior knowledge, enhances the network’s ability to express structural features. Experimental results on two datasets show that the proposed SGACS-Net outperforms typical and state-of-the-art methods in remote sensing semantic segmentation.
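The abstract does not specify how the adaptive context selection module adjusts its receptive field, so the following is only a minimal PyTorch sketch of one plausible realization, assuming a selective-kernel-style design in which parallel dilated convolutions provide different receptive fields and a learned gate weights them per channel. All names and hyperparameters (AdaptiveContextSelection, dilations, reduction) are illustrative assumptions, not the authors’ implementation.

```python
# Hypothetical sketch of an adaptive context selection block; not the
# authors' SGACS-Net code. Parallel dilated 3x3 branches give different
# receptive fields, and a softmax gate selects among them per channel.
import torch
import torch.nn as nn

class AdaptiveContextSelection(nn.Module):
    def __init__(self, channels, dilations=(1, 2, 4), reduction=4):
        super().__init__()
        # One branch per dilation rate; each preserves the spatial size.
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        ])
        # Gate: global pooling -> bottleneck -> per-branch, per-channel weights.
        hidden = max(channels // reduction, 8)
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, hidden, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels * len(dilations), 1),
        )
        self.num_branches = len(dilations)
        self.channels = channels

    def forward(self, x):
        feats = torch.stack([b(x) for b in self.branches], dim=1)  # (B, K, C, H, W)
        fused = feats.sum(dim=1)                                   # aggregate for gating
        weights = self.gate(fused)                                 # (B, K*C, 1, 1)
        weights = weights.view(-1, self.num_branches, self.channels, 1, 1)
        weights = torch.softmax(weights, dim=1)                    # select across receptive fields
        return (feats * weights).sum(dim=1)                        # (B, C, H, W)

if __name__ == "__main__":
    x = torch.randn(2, 64, 32, 32)
    print(AdaptiveContextSelection(64)(x).shape)  # torch.Size([2, 64, 32, 32])
```

The softmax over branches lets each channel favor a small or large receptive field depending on the input, which matches the stated goal of capturing context for buildings at different scales; the actual SGACS-Net module may differ substantially.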