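Supervised by: Chinese Academy of Agricultural Mechanization Sciences Group Co., Ltd.

Sponsored by: Beijing Zhuozhong Publishing Co., Ltd.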

Wheat germplasm recommendation model based on an improved weighted clustering algorithm

  • Abstract: As wheat germplasm resource data continue to grow, helping breeding experts obtain wheat germplasm efficiently and accurately has become an urgent problem. To address it, a clustering-based wheat germplasm recommendation model is proposed. The wheat germplasm dataset is first clustered with K-means to find its cluster centers; the cluster to which the breeding expert's required germplasm data belongs is then identified, and a nearest neighbor algorithm is used to retrieve the wheat germplasm the expert needs. Because wheat germplasm attributes contribute unequally, a grey weighted K-means clustering algorithm (GWK-means) is proposed. When the similarity between wheat germplasm is computed with the Euclidean distance, the attribute weights are determined by grey relational analysis, which increases the distance between different clusters, improves the accuracy and running speed of the clustering algorithm, and provides strong support for the recommendation model. Experimental results on a wheat germplasm dataset show that the average accuracy between the top 5 recommended wheat germplasm and the germplasm required by breeding experts exceeds 94%.
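The pipeline summarized in the abstract (weighted clustering, locating the query's cluster, then a nearest neighbor search inside it) can be illustrated with a minimal sketch. This is not the authors' GWK-means implementation. The function names `weighted_kmeans` and `recommend` are hypothetical, and the attribute weights are taken as a given input rather than derived by grey relational analysis, since the abstract does not spell out that step. The farthest-first initialization is likewise a simplification chosen for this sketch. The weighted Euclidean distance is assumed to be d_w(x, y) = sqrt(sum_j w_j (x_j - y_j)^2), which equals the ordinary Euclidean distance after each attribute is scaled by sqrt(w_j).

```python
import numpy as np


def _init_centers(Xs, k, rng):
    # Farthest-first initialization (an assumption of this sketch):
    # start from a random point, then repeatedly add the point that is
    # farthest from all centers chosen so far.
    centers = [Xs[rng.integers(len(Xs))]]
    for _ in range(1, k):
        diffs = Xs[:, None, :] - np.array(centers)[None, :, :]
        d = np.linalg.norm(diffs, axis=2).min(axis=1)
        centers.append(Xs[d.argmax()])
    return np.array(centers)


def _assign(Xs, centers):
    # Label each sample with the index of its nearest center.
    dists = np.linalg.norm(Xs[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def weighted_kmeans(X, weights, k, n_iter=100, seed=0):
    """K-means under a weighted Euclidean distance.

    `weights` holds one non-negative weight per attribute; here it is
    assumed to be supplied by the caller.
    Returns (centers in the scaled space, cluster label per sample).
    """
    rng = np.random.default_rng(seed)
    Xs = np.asarray(X, dtype=float) * np.sqrt(weights)
    centers = _init_centers(Xs, k, rng)
    for _ in range(n_iter):
        labels = _assign(Xs, centers)
        # Keep the old center if a cluster happens to be empty.
        new_centers = np.array([
            Xs[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, _assign(Xs, centers)


def recommend(query, X, weights, centers, labels, top_n=5):
    """Return indices of the top_n samples nearest to `query`,
    searched only inside the cluster whose center is closest to it."""
    scale = np.sqrt(weights)
    qs = np.asarray(query, dtype=float) * scale
    Xs = np.asarray(X, dtype=float) * scale
    cluster = np.linalg.norm(centers - qs, axis=1).argmin()
    members = np.flatnonzero(labels == cluster)
    d = np.linalg.norm(Xs[members] - qs, axis=1)
    return members[np.argsort(d)[:top_n]]
```

Restricting the nearest neighbor search to a single cluster is what lets the clustering step reduce the number of distance computations per query. The trade-off is that a germplasm record lying just across a cluster boundary will not be returned.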

     
