Metagenome Fragment Binning Based on Characterization Vectors
Abstract
We propose an approach to metagenome fragment binning that combines a support vector machine (SVM) with characterization vectors. We developed the method to overcome a limitation of composition-based binning with k-mer features, one that is most pronounced for short fragments. Characterization vectors capture global information about a DNA fragment without requiring sequence alignment, and they represent that information as a twelve-dimensional vector. Our experiments show that the method bins metagenome fragments with high accuracy at the genus level for fragment lengths ≥ 500 bp, on datasets representing both known and new organisms. The approach is promising for extension to other taxonomic levels.
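To make the idea of a twelve-dimensional, alignment-free descriptor concrete, the following is a minimal sketch. It assumes one construction that appears in the alignment-free sequence comparison literature: for each of the four nucleotides, record its count, its mean position, and a normalized positional variance. The abstract does not spell out the twelve components, so this particular definition, the function name `characterization_vector`, and the normalization are assumptions made for illustration, not necessarily the features used in the experiments.

```python
def characterization_vector(seq):
    """Return a 12-element list describing a DNA fragment.

    Assumed layout (hypothetical, for illustration only): for each base in
    A, C, G, T the vector holds
      1. the number of occurrences,
      2. the mean 1-based position of those occurrences,
      3. the positional variance divided by the fragment length.
    Characters other than A, C, G, T (for example N) are ignored.
    """
    seq = seq.upper()
    n = len(seq)
    vec = []
    for base in "ACGT":
        # 1-based positions at which this base occurs
        positions = [i + 1 for i, ch in enumerate(seq) if ch == base]
        count = len(positions)
        if count == 0:
            # Base absent: all three components are set to zero
            vec.extend([0, 0.0, 0.0])
            continue
        mean = sum(positions) / count
        # Second central moment of the positions, scaled by count * length
        var = sum((p - mean) ** 2 for p in positions) / (count * n)
        vec.extend([count, mean, var])
    return vec


if __name__ == "__main__":
    print(characterization_vector("ACGTAC"))
```

Unlike a k-mer profile, whose length grows as 4^k, a vector of this kind has a fixed length regardless of fragment size. Vectors like this, computed per fragment, could then serve as the input features for an SVM classifier.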