When Alán Aspuru-Guzik was in college, he really got into SETI@home, the project that uses home computers to speed the search for extraterrestrial intelligence. He was less interested in finding aliens in outer space, however, than in using fleets of computers to search molecular space. He wanted to find chemical compounds that could do intelligent things here on Earth.
SETI@home is a well-known distributed computing project that allows regular people to volunteer their idle computers to sift through reams of data — in this case, radio signals. Aspuru-Guzik, now a theoretical chemist at Harvard University, hopes to harness thousands of home computers to comb through almost every possible combination of atoms.
Sorting all of those chemical combinations and then making and testing each potentially useful molecule could take close to forever. Given the number of elements in the periodic table (118) and the number of ways they could be combined into reasonably sized molecules, Aspuru-Guzik estimates that there are from 10⁶⁰ to 10¹⁸⁰ possibilities, each with its own personality and behaviors.
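To get a feel for where numbers that large come from (a back-of-envelope illustration, not Aspuru-Guzik's actual estimate): even if each atom in a modest 30-atom molecule could independently be any of the 118 elements, the count of element assignments alone would be

$$118^{30} \approx 1.4 \times 10^{62},$$

already past the low end of the quoted range, and that is before counting the many ways those atoms could be bonded and arranged in space.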
“It sounds hopeless, it sounds crazy,” Aspuru-Guzik says. “But we are trying to go through that infinite space of combinations — all the different ways you could put these molecules together like Legos — and find ones that are useful.”
Knowing a molecule’s usefulness — as a drug ingredient, for example, or as a material for touch screens — means understanding how that molecule’s atoms influence its behavior, an endeavor that scientists have struggled with for decades. The difficulty is that atoms are essentially quantum beings, governed by math that rapidly becomes hard to apply when the molecules those atoms make become even a little bit complicated. At a fundamental level, Aspuru-Guzik and others like him are trying to glean each molecule’s quantumness, to figure out what its fuzzy clouds of electrons say about its nature. That information would offer instant insight into how any molecule would behave. One tool that could pull this off is a full-scale quantum computer, capable of simulating any atomic system.
But quantum computers are still in the early stages of development. (Aspuru-Guzik recently announced a bet on Twitter that quantum computers will be able to do certain intractable chemistry calculations by 2035.) In the meantime, he and others are using a work-around. By harnessing the power of classical computers, including the desktop machines of ordinary citizens à la SETI@home, scientists can run calculations that can reveal quite a bit about how a molecule will behave. The approach is still less precise than quantum computing promises to be. But combining these rough calculations with algorithms that can be trained to find connections between molecules and properties may open new doors. Adding this kind of machine learning offers a way to take compounds generated by the quantum calculations and quickly assess millions of other molecules with potentially similar properties. Scientists are jumping on this intelligent brute-force approach.
The potential payoff is enormous. While a very tiny portion of all the possible molecules identified via quantum chemistry calculations will prove useful, just a handful of home runs could affect people’s lives in major ways. New drugs might solve the problem of antibiotic resistance or offer cures for diseases like Ebola. There might be easy-to-produce substitutes for the rare earth elements that are used in electronic devices, lasers and medical diagnostics. Quantum chemistry could reveal compounds that lead to materials that help solve the energy crisis, such as better batteries, spray-on solar panels or superefficient lightbulbs.
The shortcut approach is starting to bear molecular fruit. Searching for materials with promising electrochemical properties, Aspuru-Guzik and colleagues winnowed a pool of more than a thousand quinones — compounds found widely in the photosynthesis machinery of plants — to a handful that could be tested. One of the best-suited quinones (similar to one in rhubarb) is being used in a new battery that scientists led by Harvard’s Michael Aziz described in Nature last year. In a separate advance, a thermoelectric material discovered by Lawrence Berkeley National Laboratory scientists could lead to technologies for converting waste heat from a car’s engine, for example, into power for its lights.
Aspuru-Guzik and Stanford University’s Zhenan Bao used quantum calculation shortcuts to find a variation on a semiconductor material that is more than twice as efficient at transporting charge as the parent molecule, they reported in Nature Communications in 2011.
“When I see the molecules that the computer programs predict, it gives me inspiration,” says Bao, who is creating new materials for all sorts of flexible electronics, from solar cells to electronic skin for robots or prostheses. “You have a million molecules, then a few thousand, then a few hundred. When you get to less than 10, it is much more practical for chemists to consider making them.” Bao should know. As a materials scientist who actually makes things in the lab, she says it still can take a year to synthesize and test a candidate molecule that looks good on paper. “This theory-guided design really helps to shorten the discovery time,” she says.
Electron behavior
When chemists are looking for a better-performing molecule, they typically start with something that works, and then “make only small changes in chemical structure,” Bao says. This classical approach, which often entails line drawings of molecular structures, relies on a chemist’s experience and intuition. But in some respects, using dots and lines on a notebook page to understand how molecules work is like using stick figures to appreciate human physiology. The method incorporates a small sliver of reality to guide a time-consuming and expensive process of trial and error.
For solar cells, repeating units in a polymer might have 40 to 60 atoms, says Bao. “If you rearrange just one atom, the electronic properties can change, how the molecule arranges itself physically in space might change.” Photovoltaic materials in particular are complicated beasts, she says, because there are so many qualities that matter. “You have to know about light absorption, charge separation, electron transport and charge recombination. It’s so much more complicated and there’s no one theory that captures all of that accurately.”
All that trial and error would be unnecessary if scientists could solve Schrödinger’s equation. Published in 1926, that equation determines the quantum mechanical wave function, or Ψ, which encapsulates all the information about the properties of a molecule. Those properties in turn dictate, for example, how an enzyme will grab a protein or whether a material reflects light. Schrödinger’s equation has only a handful of mathematical terms, but its simplicity on paper belies the computational effort required to solve it. By the late 1920s, physicists had figured out how to solve the cryptic code for really simple systems, such as the hydrogen atom. But the mathematical tools needed to solve it for more complicated systems still don’t exist.
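For reference, since the article doesn't print it, the time-independent form of the equation for a molecule's electrons (nuclei held fixed, atomic units assumed) can be written as

$$\hat{H}\,\Psi = E\,\Psi, \qquad \hat{H} = -\frac{1}{2}\sum_i \nabla_i^2 \;-\; \sum_{i,A} \frac{Z_A}{r_{iA}} \;+\; \sum_{i<j} \frac{1}{r_{ij}},$$

where the three sums are the electrons' kinetic energy, their attraction to nuclei of charge $Z_A$, and their repulsion from one another. Everything encapsulated in Ψ has to be extracted from this equation, and it is the last term, the electron-electron repulsion, that makes exact solutions intractable for anything bigger than the simplest atoms.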
“The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known,” wrote physicist Paul Dirac in 1929. “The difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble.”
Those words still apply today, says theoretical chemist Johannes Hachmann of the University at Buffalo, a former postdoc in Aspuru-Guzik’s lab. In principle, Hachmann says, Schrödinger’s equation reveals all the information about any molecule. In reality, “we can only do that for very small molecules that no one cares about,” he says.
So theoretical chemists and physicists have developed mathematical alternatives that find approximate, rather than exact, solutions to Schrödinger’s equation. By the 1930s, scientists had mapped out some general recipes for such approximations (primarily “mean field wave function theory,” “correlated wave function theory” and “density functional theory”). With the advent of computers, scientists put these theories into practice. Further refinements to the math improved the accuracy and reduced the computational costs of these methods.
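To give one concrete example of what such a recipe looks like (this is the standard Kohn-Sham formulation of density functional theory, not a formula spelled out in the article): instead of the full many-electron wave function, the method works with the electron density $n(\mathbf{r})$ and writes the ground-state energy as

$$E[n] \;=\; T_s[n] \;+\; \int v_{\text{ext}}(\mathbf{r})\, n(\mathbf{r})\, d\mathbf{r} \;+\; E_H[n] \;+\; E_{xc}[n],$$

where $T_s$ is the kinetic energy of a fictitious set of non-interacting electrons, the integral is the attraction to the nuclei, $E_H$ is the classical electron-electron repulsion and $E_{xc}$, the exchange-correlation term, is where all the hard-to-capture physics gets approximated.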
These simplified solutions aren’t perfect. “The more approximations you make, the more you are chopping away at Schrödinger’s equation,” Hachmann says. “It gets simpler, but you are also losing information; you start throwing away a lot of physics.”
But the approximations offer a way to abandon the incremental trial-and-error approach for one that is more rational and efficient — based on “first principles” of physics — to determine a molecule’s properties. Today, scientists are doing quantum chemistry calculations that don’t throw out the proverbial baby with the mathwater. Several quantum chemistry software packages and tools can chew through calculations for all sorts of molecules. Scientists are using them to probe the effectiveness of arthritis drugs and the elasticity of materials, for example.
Many of these calculations take hours to days for a single molecule, so some scientists are leaning on modern conveniences like supercomputers and parallel computing operations to do the work within a feasible time frame. The Materials Project, led by scientists at Lawrence Berkeley and MIT, is running quantum calculations for tens of thousands of inorganic compounds using the National Energy Research Scientific Computing Center, home to some of the fastest supercomputers in the world. Scientists using the Materials Project database have found new transparent conductive materials that might find uses in touch screens; others are seeking candidate compounds for semiconductors and batteries.
Better solar cells
The Harvard Clean Energy Project is Aspuru-Guzik’s SETI@home. It is using distributed computing to search for solar cell materials. He calls the project “the largest quantum chemistry experiment ever done.”
Traditional silicon-based solar cells are bulky, rigid and surprisingly fragile. Today’s commercially available solar cells can require roughly two years to generate the amount of energy used to make them, scientists estimate. Many researchers are looking to carbon-based, or organic, photovoltaic alternatives, which would be cheaper to make and much more versatile. For example, researchers are investigating spray-on or paintable materials that could be easily applied to a building or airplane. Yet current versions of these carbon-based solar cells just don’t make the grade; they’re inefficient and don’t last very long. Enter the Harvard Clean Energy Project.
Conceived in 2008, the project includes Bao and collaborators at MIT and Clark University in Worcester, Mass. The team is probing the photovoltaic propensities that emerge from 26 basic molecular building blocks, fragments chosen with advice from Bao based on the feasibility of making them in the lab. Phase 1 of the project focused, in part, on understanding how candidate molecules pack together to form a solid. The massive effort generated a library of 10 million potentially interesting molecular candidates, the researchers reported in the Journal of Physical Chemistry Letters in 2011.
Exploring all these iterations would be impossible without the immense power of IBM’s World Community Grid, which is tackling problems related to health, poverty and sustainability. The grid has 2.7 million computers, smartphones and tablets in 80 countries working on calculations. As of May, 477,118 computers were running calculations specifically for the Clean Energy Project. With the grid’s support, the project has moved to phase 2 and has already done in five years what it would take one computer 33,000 years to do.
The project is making serious headway. About 35,000 of the analyzed compounds look like they might perform at roughly double the efficiency of most current organic solar cells, the team reported last year in Energy & Environmental Science.
While the grid makes the quantum chemistry calculations much easier, it still can take several hours to do the calculations for a single molecule. But that’s OK. Machine learning algorithms can learn from these quantum calculation–based datasets and use much simpler math to help churn through millions of candidate compounds more quickly.
Human biases
Machine learning programs look for patterns; they don’t care about physics. The matching approach is pervasive, from the ads on Google that know just what you’ve been shopping for lately to the potential partners selected by popular dating sites. Such algorithms excel at generating new outputs based on known inputs. Like the ultimate personal shopper, they take stock of known information to make recommendations.
Say you need 20 new outfits. Show a machine learning program your closet and it learns what you like. It can then shop for you, picking clothes in your size, in fabrics and colors that fit your palette. The program might even recommend clothes you didn’t know you would like because it saw something similar in your closet, say a shirt made by a particular designer or pants with an especially wide leg.
The main appeal of machine learning programs is their speed; they can screen molecules in seconds rather than hours to days. Given a starter library of, say, 2 million molecules, scientists might run quantum chemistry calculations on only 50,000. Those 50,000 calculations will reveal relationships between particular structures, such as groups of atoms, and particular properties, such as solubility or stiffness. The machine learning program learns those relationships and can then sift through the remaining 1.95 million molecules in the library. Molecules flagged by that screening can be verified and refined using more quantum chemistry calculations. The result, researchers hope, is a targeted short list for chemists to synthesize and study in the lab.
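A minimal sketch of that screening loop, using entirely made-up data (random fingerprint vectors standing in for molecular descriptors and a random function standing in for the quantum-calculated property; none of the names or numbers come from the projects in this article), might look like this in Python with scikit-learn:

```python
# Hypothetical sketch of the workflow described above: run expensive quantum
# chemistry on a small subset, train a fast surrogate model, then screen the
# rest of the library. All data here are random placeholders, not chemistry.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Stand-in library: 20,000 candidate molecules, each described by a
# 32-number fingerprint (a real library might hold millions).
library = rng.random((20_000, 32))

# Step 1: "quantum calculations" on a 500-molecule subset, simulated here
# by a made-up linear function of the descriptors plus noise.
train_idx = rng.choice(len(library), size=500, replace=False)
X_train = library[train_idx]
y_train = X_train @ rng.random(32) + 0.1 * rng.standard_normal(len(X_train))

# Step 2: fit the fast surrogate model on the quantum-calculated subset.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Step 3: screen the whole library in seconds and shortlist the top 50
# candidates for follow-up quantum calculations and, eventually, the lab.
predicted = model.predict(library)
shortlist = np.argsort(predicted)[::-1][:50]
print("Indices of the 50 most promising candidates:", shortlist)
```

The payoff is step 3: once trained, the surrogate ranks the whole library in seconds, and only the shortlist goes back to the hours-per-molecule quantum calculations described above.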
While advances in computing power have pushed the field forward, partnerships between experimentalists like Bao and theorists like Hachmann and Aspuru-Guzik are also crucial. Experimentalists still have some understandable reluctance, however. Creating and testing molecules is time intensive, so it can be hard to embrace methods that turn up candidate compounds wildly different from those with a proven track record, says Geoff Hutchison, a materials chemist at the University of Pittsburgh. Spending time and money on counterintuitive ideas is a hard sell.
Hutchison has encountered such resistance. His lab has developed machine learning algorithms to identify promising compounds to use in solar cells. Traditional approaches look for mixtures of materials that are electron donors and electron acceptors, but Hutchison’s algorithms spat out a weird combination, electron donors mixed with other electron donors. No synthetic chemists were banging on Hutchison’s door, eager to make these compounds, even though their traits suggested they would be good at converting sunlight into electricity. “It actually made sense from a quantum point of view,” says Hutchison, who reported the work in the Journal of Physical Chemistry Letters in 2013. “But it wasn’t an obvious strategy to people.” He hasn’t yet persuaded anyone to make the compounds.
So part of the difficulty isn’t a computational challenge, it’s a human one. While traditional human knowledge from experimental chemists is necessary — it enriches the information generated by computers — it can also limit options. The real promise of computational quantum chemistry is allowing chemists to overcome their human biases, enabling the discovery of molecules that might otherwise get the cold shoulder or go unnoticed. The government is trying to help: the Materials Genome Initiative, launched by President Obama in 2011, aims to speed the discovery of fruitful new materials. Its four goals include leading “a culture shift in materials research to encourage and facilitate an integrated team approach” and “integrating experiment, computation and theory.”
Such a culture shift is essential for making real the promise of quantum chemistry. While probing the fundamental nature of molecules is a formidable task, it’s equally difficult to imagine how undiscovered materials might transform the world: skyscrapers built from some yet-unknown lightweight material rather than steel, smart clothing that keeps a body cool in warm weather, fuels from clean sources like water. Each of these aspirations might be possible if theorists, experimentalists, funding agencies and industry can keep pushing the chemistry envelope. The odds of success are at least as good as, if not much better than, the likelihood of finding aliens.
This article appeared in the June 13, 2015, issue of Science News under the headline, “Molecular Pursuits: Using quantum chemistry to sift through a sea of compounds could launch a manufacturing revolution.”