The web server PHPred was developed to identify phage proteins located in the host cell from sequence information alone. Analysis of variance (ANOVA) was used to select the optimal g-gap dipeptide features. In jackknife cross-validation, the method discriminates between bacteriophage proteins located in the host cell (PH proteins) and bacteriophage proteins not located in the host cell (non-PH proteins) with a maximum overall accuracy of 84.2%; 84.7% of PH proteins and 83.6% of non-PH proteins were correctly predicted. It further classifies PH proteins into those located in the host cell cytoplasm (PHC proteins) and those located in the host cell membrane (PHM proteins) with a maximum overall accuracy of 92.4%; the best model correctly identifies 89.7% of PHM proteins and 94.7% of PHC proteins. All data can be downloaded from the Data window of this web server.
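As an illustration of the g-gap dipeptide features mentioned above, the following is a minimal sketch, not the PHPred implementation. It assumes the commonly used definition in which a g-gap dipeptide is a pair of residues separated by g other residues, and each of the 400 possible pairs is scored by its frequency among the L - g - 1 pairs in a sequence of length L. The function name `g_gap_dipeptide_composition` and the constant `AMINO_ACIDS` are hypothetical, and the ANOVA feature-selection step is not shown.

```python
from itertools import product

# Assumption: features are built over the 20 standard amino acids only.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def g_gap_dipeptide_composition(sequence, g):
    """Return the frequency of each of the 400 g-gap dipeptides.

    A g-gap dipeptide is the pair (residue i, residue i + g + 1);
    g = 0 gives the ordinary adjacent dipeptide composition.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    seq = sequence.upper()
    total = len(seq) - g - 1  # number of g-gap pairs in the sequence
    if total <= 0:
        # Sequence too short for this gap: return an all-zero vector.
        return dict.fromkeys(pairs, 0.0)
    for i in range(total):
        pair = seq[i] + seq[i + g + 1]
        if pair in counts:  # pairs with non-standard residues are skipped
            counts[pair] += 1
    return {pair: n / total for pair, n in counts.items()}
```

For example, `g_gap_dipeptide_composition("ACAC", 1)` pairs positions (1, 3) and (2, 4), so only the `AA` and `CC` entries are non-zero.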
(1) Each submission is limited to 100 protein sequences or fewer.
(2) The input sequences must be in FASTA format; i.e., each protein sequence should start with a greater-than symbol (">") in the first column. The text right after the ">" symbol on this single header line is optional and is used only for identification and description.
(3) If a query sequence contains any illegal character, the prediction will be stopped.
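The three input rules above can be checked locally before submitting. The following is a minimal sketch and not the server's own validation code. It assumes that the legal characters are the 20 standard amino-acid letters, which the rules do not specify; the names `parse_fasta`, `validate_submission`, `MAX_SEQUENCES` and `LEGAL_RESIDUES` are hypothetical.

```python
MAX_SEQUENCES = 100  # rule (1): at most 100 sequences per submission
# Assumption: "legal" means the 20 standard amino-acid letters.
LEGAL_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def parse_fasta(text):
    """Split FASTA text into (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()  # text after ">" is optional
            chunks = []
        else:
            if header is None:
                raise ValueError("sequence data found before a '>' header line")
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def validate_submission(text):
    """Return the parsed records, or raise ValueError on a rule violation."""
    records = parse_fasta(text)
    if not records:
        raise ValueError("no FASTA records found")
    if len(records) > MAX_SEQUENCES:
        raise ValueError(f"{len(records)} sequences submitted; limit is {MAX_SEQUENCES}")
    for header, seq in records:
        if not seq:
            raise ValueError(f"record '{header}' has no sequence")
        illegal = sorted(set(seq) - LEGAL_RESIDUES)
        if illegal:
            # Rule (3): any illegal character stops the prediction.
            raise ValueError(f"record '{header}' contains illegal characters: {illegal}")
    return records
```

Calling `validate_submission(open("query.fasta").read())` on a hypothetical input file would either return the parsed records or raise a `ValueError` that names the violated rule.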