Design and Implementation of a Chinese Word Segmenter for the Agricultural Domain
Research and Realization of Chinese Word Segmentation for Agriculture
Abstract: Exploiting the lookup efficiency of hash tables, this paper presents hash-based algorithms for dictionary lookup, update, deletion, and insertion. Drawing on the characteristics of the GB encoding of Chinese characters, the algorithm stores the GB code of each word's first character, which improves storage utilization. The dictionary also establishes one-to-many correspondences between agricultural technical terms and dialect terms, which meets the system's requirements while improving word segmentation accuracy.
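The abstract does not give the data layout, so the following Python sketch is only one plausible reading of it. It assumes that the hash key is the position of a word's first character in the 94 x 94 GB2312 grid, that each bucket stores only the remainder of the word (so the first character is never repeated per entry), and that segmentation uses forward maximum matching. The names `AgriLexicon`, `bucket`, and `segment`, as well as the sample terms, are hypothetical and not taken from the paper.

```python
class AgriLexicon:
    """Dictionary bucketed by the GB2312 code of a word's first character (assumed layout)."""

    def __init__(self):
        self._terms = {}     # bucket index -> {word tail: set of dialect variants}
        self._dialects = {}  # bucket index -> {word tail: standard term}
        self.max_len = 0     # upper bound on entry length, used by the matcher

    @staticmethod
    def bucket(word):
        # A two-byte GB2312 character has both bytes in 0xA1..0xFE (a 94 x 94 grid).
        hi, lo = word[0].encode("gb2312")
        return (hi - 0xA1) * 94 + (lo - 0xA1)

    def add(self, term, dialects=()):
        # One standard term maps to many dialect variants (one-to-many).
        variants = self._terms.setdefault(self.bucket(term), {}).setdefault(term[1:], set())
        for d in dialects:
            variants.add(d)
            self._dialects.setdefault(self.bucket(d), {})[d[1:]] = term
        self.max_len = max(self.max_len, len(term), *(len(d) for d in dialects))

    def remove(self, term):
        variants = self._terms.get(self.bucket(term), {}).pop(term[1:], None)
        if variants is None:
            return False
        for d in variants:
            self._dialects[self.bucket(d)].pop(d[1:], None)
        return True

    def update(self, term, dialects):
        # Replace the dialect variants recorded for a standard term.
        self.remove(term)
        self.add(term, dialects)

    def lookup(self, word):
        """Return the standard term for `word`, or None if it is not in the dictionary."""
        try:
            idx = self.bucket(word)
        except (UnicodeEncodeError, ValueError):
            return None  # first character is not a two-byte GB2312 character
        tail = word[1:]
        if tail in self._terms.get(idx, {}):
            return word
        return self._dialects.get(idx, {}).get(tail)


def segment(text, lexicon):
    """Forward maximum matching (an assumption; the abstract names no matching strategy)."""
    tokens, i = [], 0
    while i < len(text):
        for n in range(min(lexicon.max_len, len(text) - i), 0, -1):
            if lexicon.lookup(text[i:i + n]) is not None:
                tokens.append(text[i:i + n])
                i += n
                break
        else:
            tokens.append(text[i])  # unknown character becomes a single-character token
            i += 1
    return tokens


lexicon = AgriLexicon()
lexicon.add("玉米", ["苞谷", "棒子"])    # maize and two dialect names (illustrative data)
lexicon.add("马铃薯", ["土豆", "洋芋"])  # potato and two dialect names (illustrative data)
print(segment("种植苞谷和马铃薯", lexicon))  # ['种', '植', '苞谷', '和', '马铃薯']
print(lexicon.lookup("苞谷"))               # 玉米
```

In this sketch a dialect word found during segmentation can be mapped back to its standard agricultural term with `lookup`, which is one way the one-to-many correspondence could help the segmenter recognize regional vocabulary.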