Researchers Hone Their Homology Tools

Robert F. Service

The Protein Structure Initiative (PSI) is churning out new protein structures at a pace never seen before. But even the hundreds of structures the initiative unveils each year don't make much of a dent in the millions of proteins and multiprotein complexes thought to be out there. One hope for PSI, however, is that the proteins it has solved will give researchers insights into the structures and functions of some of those whose shapes are unknown. For such work, computational biologists employ "homology models," which use solved structures as templates for computer models of the three-dimensional (3D) shapes of closely related proteins.

Figure 1. Model behavior. Computational biologists use known protein structures to help them model the shape of closely related proteins.

CREDIT: NEW YORK STRUCTURAL GENOMIX RESEARCH CONSORTIUM, PSI-1

How well do homology models work? Not well enough, according to members of a review panel that issued a mixed report card on PSI in December. Although such models often get the general shape of related proteins correct, they typically lack the atomic-scale resolution needed to gain specific insights into how a protein does its job--or even what job it does. "The large numbers of new structures determined by the PSI effort have not led to significant improvements in the accuracy of homology modeling that would allow modeling of more biologically relevant proteins, complexes or conformational states," the report concluded.

But computer modelers say that conclusion misses the mark on several counts. First off, they point out, it was never a stated goal of PSI to improve the accuracy of homology models. "This was a complete red herring," says John Moult, a computational biologist at the University of Maryland Biotechnology Institute in Rockville. Moult says the initial intention was simply to allow computational biologists to apply existing models to a larger number of target proteins. And that, he says, has undoubtedly occurred. PSI now contributes about 40% of the "novel" protein structures (those significantly different from any solved previously) submitted to the global Protein Data Bank. And according to one recent estimate, those structures allow for the creation of more than 40,000 homology models that could not otherwise be made.

That said, Moult and others argue that PSI is actually now beginning to contribute to the improvement of homology models themselves. In its second phase, PSI has supported two small centers geared toward improving computer models and has also supported individual computer-modeling groups. That bioinformatics support was perhaps "a little slow" in coming, says Andrej Sali, a computational biologist at the University of California, San Francisco. But he and others argue that this support, together with the increased number of structures, has helped spur advances in the basic algorithms to improve the accuracy of models.

Whether due to PSI or not, Moult and others say there's plenty of evidence that homology models are improving. For starters, they point to a biennial competition among computational biologists to predict the structures of a series of proteins. The Critical Assessment of Structure Prediction (CASP), which began in 1994, will hold its eighth competition later this year. The first "was embarrassing," says Moult, who heads the CASP competitions. Few of the early models even came close to the actual structures of their target proteins, which were solved simultaneously by x-ray crystallography for comparison. But by 2002, 60% of the models got close enough to the final structures to add useful information. By 2006, that number had climbed to 80%. "I don't want to say modeling was improving only because of the PSI," Moult says. But the added structures in the database, he argues, are making a "very significant contribution." Adds David Baker, a computational biologist at the University of Washington, Seattle: "Homology modeling is definitely getting better."

In another key advance, improved computer models are making it easier for x-ray crystallographers to solve their structures. Experimentalists solve these structures by firing powerful beams of x-rays at protein crystals and tracking how those x-rays ricochet off their targets. These data give them much of what they need to nail down the position of all the atoms in the protein. But for a complete 3D picture, researchers typically compare the original data with another set taken from a closely related protein. Combining the two data sets is usually enough to finish the job. Not all proteins have close relatives that have been solved. But in a Nature paper last November, Baker's team showed that it was possible to use newer high-resolution homology models as the close relative to help researchers solve the x-ray structures. "It's not an established method yet," Baker says. However, he argues, it shows the synergy that can occur between high-quality experimental data and computational models.