Protein Secondary Structure Prediction Using a Hidden Markov Model on Imbalanced Data

Sari, Dian Puspita

Date

2014

Author

Sari, Dian Puspita

Haryanto, Toto

Abstract

This research aimed to predict protein secondary structure using a Hidden Markov Model (HMM). A total of 780 data were used, split into 600 training data and 180 testing data. The training data yielded 394052 secondary structure labels: 152782 alpha-helix (H), 82355 beta-sheet (B), and 158915 coil (C). In percentage terms the classes are imbalanced, so random oversampling was used to increase the smallest class until it equaled the largest class. The results show that an HMM can be applied to predict the secondary structure of proteins. With oversampling, the Q3 score was 45.49% on the training data and 43.21% on the testing data. Without oversampling, the Q3 score was 43.50% on the training data and 43.19% on the testing data.
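
The class imbalance and the Q3 score mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `class_percentages`, `oversampling_deficit`, and `q3_score` are hypothetical, and Q3 is assumed to follow its usual definition, the percentage of positions whose predicted state (H, B, or C) matches the true state. Only the three class counts come from the abstract.

```python
# Training-data class counts as reported in the abstract.
counts = {"H": 152782, "B": 82355, "C": 158915}


def class_percentages(counts):
    """Return each class's share of the total, in percent."""
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


def oversampling_deficit(counts):
    """Return (smallest class, number of extra samples it needs
    to match the largest class). Assumes only the smallest class
    is oversampled, as the abstract describes."""
    smallest = min(counts, key=counts.get)
    largest = max(counts, key=counts.get)
    return smallest, counts[largest] - counts[smallest]


def q3_score(true_states, predicted_states):
    """Percentage of positions where the predicted state equals
    the true state (assumed standard Q3 definition)."""
    if len(true_states) != len(predicted_states):
        raise ValueError("sequences must have the same length")
    if len(true_states) == 0:
        raise ValueError("sequences must not be empty")
    correct = sum(t == p for t, p in zip(true_states, predicted_states))
    return 100.0 * correct / len(true_states)


if __name__ == "__main__":
    for label, pct in class_percentages(counts).items():
        print(f"{label}: {pct:.2f}%")
    print(oversampling_deficit(counts))
    # Toy example: 3 of 4 positions match.
    print(q3_score("HHBC", "HHCC"))
```

Under these assumptions, beta-sheet (B) is the smallest class and coil (C) the largest, so B is the class that oversampling would enlarge.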

URI

http://repository.ipb.ac.id/handle/123456789/72900

Collections

UT - Computer Science [2482]