Operational centres invest considerable effort in removing biases from models and observations, yet residual biases still degrade the quality of assimilated data products. In numerical weather prediction, they can lead to suboptimal use of the available data or even render the data unusable. In climate research based on reanalysis datasets, it can be difficult to distinguish genuine signals and trends from spurious ones caused by biases in models and data.

This study applied a detection algorithm written in R, a language for statistical computing and data analysis, to a synthetic experiment in which pseudo-stations derived from ERA5 were used to simulate and detect instrumental effects. Rather than relying on observational data from real-world sources, the study generated artificial scenarios to ensure the quality of the data assessment.

ERA5, a widely used atmospheric reanalysis product, served as the basis for the simulated (pseudo) weather stations. These stations were designed to mimic actual stations but were generated computationally to enable controlled experimentation. Twenty-five pseudo-stations were constructed around Frankfurt, Germany, within latitude 49–50° N and longitude 8–9° E, using hourly 2-m air temperature from the ERA5 land surface dataset for September 2013 and September 2014.

The study tool improves data quality assessment by evaluating the precision, dependability, and general robustness of the synthetic dataset. It introduces a range of factors, including station movements, errors, and noise, to assess how far data quality can be enhanced and maintained.

To determine how likely the threshold correlation is to occur at the confirmed noise threshold, the correlation values at a noise level of 1.53 were extracted for each locational trial.
The threshold correlation was then checked against the range of correlations observed at 1.53 degrees of noise; the values reported for this comparison are 0.9744052, 0.9744667, and 0.9781093. This process helps improve detection methods for data anomalies and contributes to advances in data quality assessment.
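
The noise experiment described above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the study's R algorithm. Several things here are assumptions rather than details from the study: the 5 × 5 grid layout of the twenty-five pseudo-stations, the synthetic temperature generator that stands in for ERA5 data, the use of Gaussian noise and the Pearson correlation, the placeholder value of `threshold_correlation`, and all function names.

```python
import math
import random

NOISE_SD = 1.53      # noise level quoted in the text (degrees)
HOURS = 30 * 24      # hourly values for one September


def make_pseudo_stations(n_side=5):
    """Place n_side x n_side stations at cell centres of the 49-50 N, 8-9 E box.

    The regular grid layout is an assumption; the text only gives the count and the box.
    """
    step = 1.0 / n_side
    return [(49.0 + (i + 0.5) * step, 8.0 + (j + 0.5) * step)
            for i in range(n_side) for j in range(n_side)]


def synthetic_temperature(rng, hours=HOURS):
    """Stand-in for an ERA5 2-m temperature series: diurnal cycle plus a slow random walk."""
    series, drift = [], 0.0
    for h in range(hours):
        drift += rng.gauss(0.0, 0.1)
        series.append(15.0 + 5.0 * math.sin(2.0 * math.pi * (h - 9) / 24.0) + drift)
    return series


def add_noise(series, sd, rng):
    """Add independent Gaussian noise with standard deviation sd to every value."""
    return [t + rng.gauss(0.0, sd) for t in series]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def within_range(value, samples):
    """True if value lies between the smallest and largest sample (inclusive)."""
    return min(samples) <= value <= max(samples)


rng = random.Random(42)  # fixed seed so the sketch is reproducible
stations = make_pseudo_stations()

# One "locational trial" per pseudo-station: correlate the clean and noisy series.
correlations = []
for _lat, _lon in stations:
    clean = synthetic_temperature(rng)
    noisy = add_noise(clean, NOISE_SD, rng)
    correlations.append(pearson(clean, noisy))

threshold_correlation = 0.95  # hypothetical placeholder, not a value from the study
print("range:", min(correlations), max(correlations))
print("threshold within range:", within_range(threshold_correlation, correlations))
```

Because the temperature series are synthetic, the printed correlations will not match the values reported in the text. The sketch only shows the shape of the check: one correlation per trial at the fixed noise level, then a test of whether the threshold falls inside the observed range.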