Understanding language is crucial to psychological research, as it forms the basis of almost all psychological measurement. Following previous work, we conducted a pre-registered replication study, analyzing 876 unique terms generated by 187 participants in a free-response format. These terms were rated on social desirability, observability, importance, abstractness, base rate, and stability by approximately 15 raters each (n = 456). Key findings were replicated, including a bimodal distribution of social desirability, a greater number of negative than positive terms, and a U-shaped relationship between importance and social desirability. Using a large language model (LLM) to derive synthetic social desirability estimates, we observed a moderate correlation (r = .65) with human ratings but did not replicate all predicted effects. LLM-based item property estimates are not yet fully equivalent to ratings by human participants, but they hold much promise for future research.
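
As a minimal illustration of the comparison reported above (not the study's actual pipeline), the sketch below shows how synthetic LLM estimates could be correlated with human mean ratings per term via a Pearson correlation. All data and variable names here are hypothetical placeholders, simulated only to make the example self-contained:

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical data: one human mean social desirability rating and one
# LLM-derived synthetic estimate per term (876 terms, as in the study).
rng = np.random.default_rng(seed=0)
human_means = rng.uniform(-3, 3, size=876)               # assumed rating scale
llm_estimates = human_means + rng.normal(0, 2, size=876)  # noisy synthetic proxy

# Pearson correlation between human and LLM-based estimates.
r, p = pearsonr(human_means, llm_estimates)
print(f"Pearson r = {r:.2f}, p = {p:.3g}")
```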