FEATURE VECTOR-BASED METHOD FOR REMOVING REDUNDANCY IN A TRAINING DATASET
Abstract
The present invention relates to a feature vector-based method for removing redundant data when generating training data. The technical problem it addresses is that conventional redundancy removal relies on sequence similarity. By comparison, the feature vector-based method avoids including unnecessarily repeated data while still allowing a larger dataset to be generated. Because the feature vectors encode features of both the protein and the RNA sequence, the method provides a more effective way to generate training data, and RNA-binding amino acids in a protein sequence are predicted better than with conventional methods. To this end, the feature vector-based training data generation method of the invention comprises the following steps: (1) determining the hydrogen bonds between RNA and protein in order to identify the RNA-binding sites, that is, the amino acids that bind RNA; (2) calculating the interaction propensity between amino acid triplets and RNA; (3) encoding feature vectors from various features of the protein and RNA sequences for predicting the RNA-binding amino acids in a protein sequence; and (4) removing redundancy from the training dataset of encoded data on the basis of the feature vectors.
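Step (4) can be illustrated with a minimal Python sketch. The abstract does not state the redundancy criterion, so the sketch assumes that two training examples are redundant when their encoded feature vectors and class labels are identical after rounding. The function name `remove_redundant`, the `decimals` parameter, and the toy three-component vectors are hypothetical and are not taken from the patent.

```python
from typing import Iterable, List, Sequence, Tuple

# One training example: (encoded feature vector, class label).
# Label 1 = RNA-binding amino acid, 0 = non-binding (assumed convention).
Example = Tuple[Sequence[float], int]


def remove_redundant(examples: Iterable[Example], decimals: int = 6) -> List[Example]:
    """Keep the first example for each distinct (rounded vector, label) pair.

    Assumption: redundancy means an identical feature vector with the same
    label. Examples are compared by their encoded features, not by the
    similarity of the sequences they came from.
    """
    seen = set()
    kept: List[Example] = []
    for vector, label in examples:
        # Round so that floating-point noise does not hide duplicates.
        key = (tuple(round(x, decimals) for x in vector), label)
        if key not in seen:
            seen.add(key)
            kept.append((vector, label))
    return kept


# Toy data with made-up values, for illustration only.
data = [
    ([0.12, 0.80, 1.0], 1),
    ([0.12, 0.80, 1.0], 1),  # exact duplicate of the first example: dropped
    ([0.12, 0.80, 1.0], 0),  # same vector, different label: kept
    ([0.45, 0.10, 0.0], 0),
]
reduced = remove_redundant(data)
print(len(reduced))  # 3
```

Filtering on the encoded vectors rather than on whole sequences means that only examples indistinguishable to the classifier are dropped, which is consistent with the abstract's claim that a larger dataset survives than under sequence-similarity filtering. Whether the label belongs in the comparison key is a design choice made here for illustration.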