Independent, blind assessment of structure prediction methods is essential for establishing the state of the art, identifying current limitations, and guiding future developments in the field. The Continuous Automated Model EvaluatiOn (CAMEO) platform provides weekly, automated, and independent benchmarking of structure prediction servers, serving as a continuous complement to the Critical Assessment of Structure Prediction (CASP) experiments. This work presents recent advancements in CAMEO aimed at evaluating predictions of macromolecular complexes, including protein–protein interactions, nucleic acid-containing assemblies, and polymer–ligand complexes. A comprehensive set of evaluation metrics captures complementary aspects of structural accuracy: global and local correctness, interface geometry, and ligand placement. In addition, CAMEO provides multiple reference baselines to facilitate systematic comparisons against state-of-the-art methods. We analyze the CAMEO benchmark dataset and report on the performance of the baseline predictors and the initial participating servers. By delivering continuous, blind, and objective evaluations, CAMEO supports the ongoing development and refinement of next-generation structure prediction methodologies.