A probabilistic query suggestion approach without using query logs

Abstract

Commercial web search engines include a query suggestion module so that, given a user's keyword query, alternative suggestions are offered and serve as a guide to assist the user in formulating queries which capture his/her intended information need in a quick and simple manner. The majority of these modules, however, perform an in-depth analysis of large query logs and thus (i) their suggestions are mostly based on queries frequently posted by users and (ii) their design methodologies cannot be applied to make suggestions on customized search applications for enterprises whose query logs are not large enough or are non-existent. To address these design issues, we have developed PQS, a probabilistic query suggestion module. Unlike its counterparts, PQS is not constrained by the existence of query logs, since it relies solely on the availability of user-generated content freely accessible online, such as the Wikipedia.org document collection, and applies simple, yet effective, probabilistic- and information retrieval-based models, i.e., the Multinomial, Bigram Language, and Vector Space Models, to provide useful and diverse query suggestions. Empirical studies conducted using a set of test queries and the feedback provided by Mechanical Turk appraisers have verified that PQS makes more useful suggestions than Yahoo! and is almost as good as Google and Bing, based on the relatively small difference in the performance measures achieved by Google and Bing over PQS.
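
To make the idea of log-free suggestion concrete, the following is a minimal sketch of how a bigram language model estimated from a document collection could rank expansions of a keyword query. It is not the PQS implementation, which the abstract does not detail: the toy corpus, the function names train_bigram_model and suggest, and the use of add-one smoothing are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigram_counts = Counter()
    bigram_counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        unigram_counts.update(tokens)
        for prev, nxt in zip(tokens, tokens[1:]):
            bigram_counts[prev][nxt] += 1
    return unigram_counts, bigram_counts


def suggest(query, unigram_counts, bigram_counts, k=3):
    """Return up to k (suggestion, probability) pairs for the query.

    Candidates are words seen after the query's last word, scored by the
    add-one smoothed estimate of P(next word | last word). The smoothing
    choice is an assumption of this sketch.
    """
    tokens = query.lower().split()
    if not tokens:
        return []
    followers = bigram_counts.get(tokens[-1], {})
    vocab_size = len(unigram_counts)
    total = sum(followers.values())
    scored = [
        ((count + 1) / (total + vocab_size), word)
        for word, count in followers.items()
    ]
    # Highest probability first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(f"{query} {word}", prob) for prob, word in scored[:k]]


# Hypothetical stand-in for a document collection such as Wikipedia.
corpus = [
    "information retrieval models rank documents",
    "information retrieval systems answer queries",
    "information need of the user",
]
unigrams, bigrams = train_bigram_model(corpus)
suggestions = suggest("information", unigrams, bigrams)
```

With this toy corpus, the query "information" is expanded with the words that follow it in the collection, and the more frequent continuation is ranked first. A fuller system would presumably combine such scores with other evidence, for example vector space similarity between the query and candidate suggestions, but that combination is not specified here.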