1. Background

Molecular simulation is a methodology for predicting the collective (in particular, thermodynamic and transport) properties of systems from information about how the molecules in the system interact with each other. That “information” can be obtained on the fly from quantum mechanics but, in most molecular simulations, it is encoded in a mathematical function, called a force field, that attempts to include all the intermolecular interactions (electrostatic interactions, and van der Waals repulsive and attractive interactions) as well as the intramolecular interactions (e.g., bond stretching, bond angle bending, and torsional interactions). More specifically, a force field is a representation of the total potential energy associated with the interactions of all the atoms in the system (obtained by summing over all the molecules), which can be differentiated with respect to the position of an atom to obtain the force exerted on that atom. Force fields can be derived from first-principles calculations (e.g., quantum chemistry calculations) and/or experimental data; thus, force fields are generally semi-empirical. For inhomogeneous systems (e.g., a fluid adsorbed on a surface or into a pore), the force field includes models for how the molecules interact with atoms in the surfaces or with an external field. Assuming that the molecular simulation runs long enough to attain equilibrium, and that the system is large enough or configured to eliminate unwanted surface effects (through so-called periodic boundary conditions), molecular simulation can provide, for a given force field, essentially exact information about the properties of the system, obtained by averaging over the configurations generated in the simulation.
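As an illustration of how a force follows from a force field by differentiation, the sketch below uses the Lennard-Jones pair potential as a stand-in for a full force field. The choice of potential, the reduced units (epsilon = sigma = 1), and the function names are assumptions made for this example only; a production force field would add electrostatic and intramolecular terms. The analytic force is checked against a finite-difference derivative of the energy.

```python
import math


def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair energy U(r) = 4*eps*((sigma/r)**12 - (sigma/r)**6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lj_force(r, epsilon=1.0, sigma=1.0):
    """Radial force F(r) = -dU/dr = 24*eps*(2*(sigma/r)**12 - (sigma/r)**6) / r.

    Positive values are repulsive, negative values are attractive.
    """
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r


def numerical_force(r, h=1e-6):
    """Central finite-difference estimate of -dU/dr (reduced units), used as a check."""
    return -(lj_energy(r + h) - lj_energy(r - h)) / (2.0 * h)


if __name__ == "__main__":
    r_min = 2.0 ** (1.0 / 6.0)  # separation at the potential minimum
    print("U(r_min) =", lj_energy(r_min))
    print("F(r_min) =", lj_force(r_min))
    print("analytic vs numerical force at r = 1.5:",
          lj_force(1.5), numerical_force(1.5))
```

At the potential minimum the energy equals -epsilon and the force vanishes, which gives a quick sanity check on any implementation of the derivative.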
Two major types of molecular simulations are routinely performed: molecular dynamics (MD), in which Newton’s equations, or a convenient variation thereof, are solved for the dynamics of each atom in the system, and Monte Carlo (MC) simulation, in which configurations of the system are generated via a Markov chain process such that, asymptotically, they are distributed according to the appropriate equilibrium ensemble probability (e.g., for systems at constant molecule number \(N\), volume \(V\), and temperature \(T\), the Boltzmann distribution, in which configurations have probability \(\propto e^{-E/k_{B}T}\), where \(E\) is the energy of the system and \(k_{B}\) is Boltzmann’s constant). In either case, the raw output of the simulation is a sequence of configurations of the system (known as a trajectory) that can then be analyzed to compute properties. From an MD simulation, the trajectory will consist of positions and velocities for all atoms in the system over the course of the simulation; a typical MD simulation will employ a time step of \(10^{-15}\) s, so that a 10-100 ns trajectory covers \(10^{7}\) to \(10^{8}\) steps. For a 100,000-atom simulation (a typical system size with current computational resources), a trajectory file can be of the order of terabytes, so that statistical analysis of such files can be thought of as a particular kind of “big data” problem.
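The Markov chain idea behind MC simulation can be sketched with the Metropolis acceptance rule applied to a deliberately simple system. The one-dimensional harmonic oscillator, the units (\(k_{B}T = 1\)), and all names and parameter values below are assumptions chosen for illustration, not a description of any particular simulation code. Trial moves are accepted with probability \(\min(1, e^{-\Delta E/k_{B}T})\), so that the sampled positions asymptotically follow the Boltzmann distribution.

```python
import math
import random


def metropolis_harmonic(n_steps, kT=1.0, k_spring=1.0, max_disp=2.0, seed=0):
    """Sample positions of a 1D harmonic oscillator, E(x) = 0.5*k_spring*x**2,
    with the Metropolis algorithm. Returns the list of sampled positions."""
    rng = random.Random(seed)
    x = 0.0
    energy = 0.5 * k_spring * x * x
    samples = []
    for _ in range(n_steps):
        # Propose a symmetric random displacement of the current position.
        x_trial = x + rng.uniform(-max_disp, max_disp)
        e_trial = 0.5 * k_spring * x_trial * x_trial
        delta = e_trial - energy
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if delta <= 0.0 or rng.random() < math.exp(-delta / kT):
            x, energy = x_trial, e_trial
        # A rejected move counts the old configuration again.
        samples.append(x)
    return samples


if __name__ == "__main__":
    samples = metropolis_harmonic(200_000)
    mean_x2 = sum(s * s for s in samples) / len(samples)
    # For the Boltzmann distribution, <x^2> = kT / k_spring.
    print("estimated <x^2> =", mean_x2, "(Boltzmann value: 1.0)")
```

The ensemble average is obtained by averaging over the generated configurations, here the mean of \(x^{2}\), which for this model should approach \(k_{B}T/k\) as the chain lengthens.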
Molecular simulation began in the 1950s with simple systems such as hard spheres (MC1 and MD2,3) and in the 1960s with the Lennard-Jones fluid (MC4 and MD5). For such monatomic systems, the force field is very simple, specifying the interaction energy between spherically symmetric molecules. Beginning in the 1970s, molecular simulation was introduced to the field of chemical engineering primarily by Keith Gubbins, the honoree of this Founders issue of AIChE Journal. Keith is known and admired internationally and across many disciplines not only for his contributions in molecular theory (which have been seminal, such as Gray-Gubbins perturbation theory and the statistical associating fluid theory, or SAFT, equation of state) but also for his research in molecular simulation. One of the earliest Gubbins simulation papers6 from 1979 has been cited almost 1000 times. [As an aside, his postdoctoral trainee co-author on this paper, Dominic Tildesley, was for many years a successful academic in the UK before joining Unilever, where he established one of the world’s premier industrial molecular modeling groups, eventually rising to Vice President of Discovery Platforms; Tildesley also co-authored one of the seminal textbooks on molecular simulation7.] Keith’s influence on the field of chemical engineering in relation to molecular simulation can be measured in programming at AIChE Annual Meetings (which in the early 1980s had no sessions on molecular simulation, in contrast to today, when a whole programming area – the Computational Molecular Science and Engineering Forum, Area 21 – is largely focused on molecular simulation) and in papers presented at the Properties and Phase Equilibria for Process and Product Design conference series established in 1977 (in which the first molecular simulation paper was presented in 1980, and by 2007 more than half the presentations involved molecular simulation and/or molecular theory).
Since its early days, molecular simulation has become a workhorse in science and industry. The promise of being able to predict collective properties from molecular interactions, and the attendant insight gained, have made molecular simulation (both MD and MC) an ideal and indispensable capability in materials science, biology, medicine (specifically, drug discovery) and engineering. There are commercial entities that market molecular simulation software (e.g., BIOVIA and Schrödinger). A 2002 international comparative study on molecular modeling (of which molecular simulation constitutes a major component) documented the widespread use of molecular modeling in industry, including many chemical, drug, and personal care product companies8.
The authors of this Perspective article are all beneficiaries of the trail-blazing efforts of Keith Gubbins in establishing molecular simulation as an accepted and respected subfield of chemical engineering. Today, molecular simulation is taught in most chemical engineering departments in the U.S. at the graduate level, and is increasingly available as an elective at the undergraduate level or even offered as a first-year seminar to incoming undergraduate students. It has become one of the major focuses of the educational foundation CACHE (Computer Aids for Chemical Engineering Education, cache.org), which established a molecular modeling task force in 1998. CACHE runs a highly successful technical conference, Foundations of Molecular Modeling and Simulation (fomms.org, held every three years since 2000), that has produced many educational resources to enable chemical engineers to teach and utilize molecular simulation in the classroom. In 2012, Keith Gubbins was awarded the FOMMS Medal for his numerous and long-standing contributions to the molecular simulation community. In addition to prodigious research contributions, he has authored seminal textbooks9, including the definitive two-volume treatise on the theory of molecular fluids10,11 that is an essential part of the library of any serious statistical mechanician interested in molecular fluids.

2. Development of molecular simulation tools in the chemical engineering community

Although molecular simulation (MD and MC) transcends disciplinary boundaries as noted above, chemical engineers have been particularly active in developing algorithms that compute properties of strong interest to the chemical engineering community (ChEC). One example is vapor-liquid phase equilibrium, which is of enduring interest to the ChEC because of its importance in separation processes. Thus, a molecular simulation methodology for computing phase equilibrium directly and efficiently, the Gibbs ensemble MC (GEMC) algorithm, was developed in 1987 within the ChEC by Panagiotopoulos12. Phase equilibria can involve differences in densities between phases of several orders of magnitude; likewise, in chemical manufacturing there can be wide ranges of state conditions. Hence, along with the development of algorithms, the ChEC has also been at the forefront of developing force fields that are accurate over wide ranges of state conditions, such as the TraPPE family of force fields optimized for vapor-liquid equilibrium (see the extensive resources at http://trappe.oit.umn.edu) and the Gaussian charge polarizable model (GCPM) for water13 that correctly predicts water’s phase equilibria and its thermodynamic, transport, and dielectric properties over wide ranges of temperature and pressure. By contrast, much of the molecular simulation community in other disciplines is focused on properties at or near ambient conditions (including ambient conditions for biological systems).