PAIRPred - Partner Aware Interacting Residue PREDictor


by Fayyaz ul Amir Afsar Minhas and Asa Ben-Hur

Department of Computer Science, Colorado State University, Fort Collins, CO USA.

Release Version 1.0 (March 1, 2013)


What is PAIRPred?

PAIRPred is a partner-specific protein-protein interaction site predictor that can accurately predict whether a pair of residues from two different proteins interact. Unlike most existing interaction site predictors, which produce partner-independent predictions, it uses information about a protein's interaction partner when making its predictions. It employs a Support Vector Machine (SVM) with pairwise kernels to generate interaction propensity scores for a pair of residues, either from sequence information alone or in conjunction with structure-based features. PAIRPred offers state-of-the-art prediction accuracy. More details about how PAIRPred works and its performance evaluation are available in this paper.
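To illustrate what a pairwise kernel is, here is a minimal sketch of one common construction, the symmetric tensor product pairwise kernel, built on a Gaussian base kernel. This is an illustration only: the function names `rbf` and `tppk`, the `gamma` value, and the choice of this particular kernel are assumptions and are not taken from the PAIRPred code.

```python
import math

def rbf(x, y, gamma=0.5):
    """Gaussian (RBF) base kernel between two residue feature vectors.
    gamma is an arbitrary illustrative value, not a PAIRPred setting."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def tppk(pair1, pair2, base=rbf):
    """Symmetric tensor product pairwise kernel between two residue pairs.

    Each pair is (features of a residue in protein 1,
                  features of a residue in protein 2).
    Summing both matchings makes the value independent of the order
    in which the two residues of a pair are listed.
    """
    (a, b), (c, d) = pair1, pair2
    return base(a, c) * base(b, d) + base(a, d) * base(b, c)

# Toy feature vectors (hypothetical values).
r1, r2 = [0.0, 1.0], [1.0, 0.0]
r3, r4 = [0.5, 0.5], [1.0, 1.0]
print(tppk((r1, r2), (r3, r4)))
```

A kernel of this kind compares two residue pairs rather than two single residues, which is what lets an SVM score a pair jointly and hence produce partner-specific predictions.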

A test case prediction

Below is an example prediction from PAIRPred for the interaction between the Influenza virus NS1 protein (1XEQ) and human ISG15 (1Z2M). The true complex structure is available as 3SDL. The AUC score for this test case (which is not part of PAIRPred's training data) was ~0.90. The true positives are shown as red and orange dotted lines (for different chain contacts), with the width of each dotted line proportional to the prediction score for the interaction between the two residues.

./3SDL_PAIRPred.jpg
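For readers unfamiliar with the AUC score quoted for this test case, the sketch below computes the area under the ROC curve as the fraction of (interacting, non-interacting) residue pairs that are ranked correctly, with ties counted as half. The function name `auc` and the toy scores are hypothetical; this is not the evaluation code shipped with PAIRPred.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise ranking.

    scores: prediction score for each residue pair
    labels: 1 for an interacting pair, anything else for non-interacting
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y != 1]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example with made-up scores: 3 of the 4 pos/neg pairs are ordered correctly.
print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))  # 0.75
```

This quadratic-time form is only meant to make the definition concrete; for the large numbers of residue pairs in a real complex, a sort-based implementation would be preferable.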

Download Code

The Python code for the program can be downloaded here (~74KB). Please read the readme.txt file included with the code to install all the dependencies for PAIRPred. The same file also details how to use the program for testing and training. Please note that this is a work in progress and the posted code may not be the latest version. We have used the Protein-Protein Docking Benchmark Dataset (DBD) 3.0 and 4.0 in our evaluation. For comparison with existing methods, you can download the pre-computed kernels obtained from the training data (DBD 3.0). Files for both the structure kernel (~1.7GB) and the sequence kernel (~1.6GB) are available. If you want to use PAIRPred for testing on novel protein pairs, we recommend the structure kernel computed from all 176 complexes in DBD 4.0 (~3.2GB).

Download Evaluation Data & Results

Downloading individual prediction files

You can download the prediction files generated using Leave-One-Complex-Out cross-validation over the 176 complexes in DBD 4.0. The prediction file format is described in the readme file included with the code. Prediction files generated using Leave-One-Complex-Out cross-validation over the 123 complexes in DBD 3.0 are also available for download.
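The Leave-One-Complex-Out scheme can be sketched as follows: all residue-pair examples from one complex are held out for testing while the examples from every other complex are used for training, and this is repeated once per complex. The function name, the tuple layout, and the toy data are assumptions made for illustration; the actual procedure is described in the readme included with the code.

```python
def leave_one_complex_out(examples):
    """Yield (held_out_id, train, test) splits, one per complex.

    examples: list of (complex_id, features, label) tuples, where each
    tuple describes one residue pair (hypothetical layout).
    """
    complex_ids = sorted({cid for cid, _, _ in examples})
    for held_out in complex_ids:
        train = [e for e in examples if e[0] != held_out]
        test = [e for e in examples if e[0] == held_out]
        yield held_out, train, test

# Toy data: three residue-pair examples from two complexes.
data = [
    ("3SDL", [0.1, 0.2], 1),
    ("3SDL", [0.3, 0.4], 0),
    ("XXXX", [0.5, 0.6], 1),  # placeholder complex identifier
]
for held_out, train, test in leave_one_complex_out(data):
    print(held_out, len(train), len(test))
```

Splitting by complex rather than by individual residue pair keeps examples from the same complex out of both the training and test sets at once, so each complex is evaluated as if it were unseen.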

Reproducing results in the paper

You can download the PAIRPred feature files for all proteins in DBD 4.0. Together with the kernel files and the code, these feature files will allow you to evaluate the performance of PAIRPred and reproduce the results given in our paper using the code posted above.

License

GPLv3

All programs in this collection are free software: you can redistribute them and/or modify them under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

Acknowledgements

Fayyaz Minhas gratefully acknowledges the support of the J. William Fulbright doctoral student scholarship from the US Department of State and of the Higher Education Commission of Pakistan.