Algorithms for relevance in search engines

In the simplest case, relevance can be calculated by counting how many times a query term appears in a document (term frequency). This count is often weighted by how discriminative the term is across the searched collection (inverse document frequency), so that a term found in almost every document counts for little. The combined measure is commonly called Term Frequency-Inverse Document Frequency (TF-IDF).
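As a rough illustration of the idea above, here is a minimal Python sketch of TF-IDF scoring. It assumes a naive whitespace tokenizer, raw term counts for term frequency, and log(N / df) for inverse document frequency; real search engines use many variants of these weights. The function names and sample documents are made up for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase/whitespace tokenizer (an assumption for this sketch).
    return text.lower().split()


def tf_idf_scores(query, documents):
    """Return one TF-IDF relevance score per document for the query."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    query_terms = set(tokenize(query))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)  # term frequency within this document
        score = 0.0
        for term in query_terms:
            # Document frequency: number of documents containing the term.
            df = sum(1 for other in tokenized if term in other)
            if df == 0:
                continue  # term appears nowhere, so it cannot contribute
            idf = math.log(n_docs / df)  # 0 when the term is in every document
            score += counts[term] * idf
        scores.append(score)
    return scores


docs = [
    "the cat sat on the mat",
    "the dog chased the cat and the other cat",
    "the search engine ranks the documents",
]
scores = tf_idf_scores("the cat", docs)
# Document indices ordered from most to least relevant.
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In this toy collection the word "the" appears in every document, so its inverse document frequency is zero and it adds nothing to any score. The ranking is driven entirely by "cat", which is the discriminative term.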

Since search engines and other businesses rely upon the accuracy of their results, many additional, more complex algorithms have been developed to estimate result relevance. Many of these algorithms, particularly those used by search engines, are hidden from the public, because a user who knows the details of a search algorithm can artificially boost the ranking of their own content.

Relevance calculation is often misinterpreted by the press. For example, it has often been said that when Google burst onto the scene it was miles ahead of its competitors because it, unlike anyone else, ranked web pages by relevance. This is not true, since every search engine ranks by relevance. What Google had was a fairly new way of estimating relevance, namely PageRank. Even search engines that use only TF-IDF rank by relevance.

This guide is licensed under the GNU Free Documentation License. It uses material from Wikipedia.
