A frequency-based technique to improve the spelling suggestion rank in medical queries

Cited by: 23
Authors
Cowell, J [1]
Zeng, Q
Ngo, L
Lacroix, EM
Affiliations
[1] Harvard Univ, Sch Med, Brigham & Womens Hosp, Decis Syst Grp, Boston, MA 02115 USA
[2] Harvard Univ, Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
[3] Natl Lib Med, Publ Serv Div, Bethesda, MD 20209 USA
DOI
10.1197/jamia.M1474
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Objective: There is an abundance of health-related information online, and millions of consumers search for such information. Spell checking is of crucial importance in returning pertinent results, so the authors propose a technique for increasing the effectiveness of spell-checking tools used for health-related information retrieval.
Design: A sample of incorrectly spelled medical terms was submitted to two different spell-checking tools, and the resulting suggestions, derived under two different dictionary configurations, were re-sorted according to how frequently each term appeared in log data from a medical search engine.
Measurements: Univariable analysis was carried out to assess the effect of each factor (spell-checking tool, dictionary type, re-sort, or no re-sort) on the probability of success. The factors that were statistically significant in the univariable analysis were then used in multivariable analysis to evaluate the independent effect of each of the factors.
Results: The re-sorted suggestions proved to be significantly more accurate than the original list returned by the spell-checking tool. The odds of finding the correct suggestion in the number one rank were increased by 63% after re-sorting using the authors' method. This effect was independent of both the dictionary and the spell-checking tools that were used.
Conclusion: Using knowledge about the frequency of a given word's occurrence in the medical domain can significantly improve spelling correction for medical queries.
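The re-sorting step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the whitespace tokenization, the tie-breaking rule (keep the spell checker's original order) and the toy query log are all assumptions made for illustration, since the abstract states only that suggestions were re-sorted by how often each term appeared in search-engine log data.

```python
from collections import Counter


def build_frequency_table(logged_queries):
    """Count how often each lowercased word appears in a list of query strings.

    Assumption: queries are split on whitespace; the abstract does not say
    how the log data were tokenized.
    """
    counts = Counter()
    for query in logged_queries:
        counts.update(query.lower().split())
    return counts


def resort_suggestions(suggestions, frequencies):
    """Re-rank spell-checker suggestions by descending log frequency.

    sorted() is stable, so suggestions with equal frequency (including
    words absent from the log) keep the spell checker's original order.
    That tie-breaking rule is an assumption, not taken from the paper.
    """
    return sorted(suggestions, key=lambda word: -frequencies.get(word.lower(), 0))


# Toy data invented for illustration only.
query_log = ["diabetes symptoms", "diabetes diet", "type 2 diabetes", "diabetic diet"]
frequencies = build_frequency_table(query_log)

# Hypothetical spell-checker output for the misspelling "diabetis".
original = ["diabetic", "diabetes", "diabase"]
reranked = resort_suggestions(original, frequencies)
print(reranked)
```

In this toy log "diabetes" occurs more often than "diabetic", so it moves to the top rank, while "diabase", which never occurs in the log, stays last.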
Pages: 179-185
Number of pages: 7