Understanding relevance and boosting
When a visitor searches your site for content, the number of items returned may be in the thousands. Visitors can sort the results by relevance, which orders the results by best match to the search criteria. Brightspot determines relevance using the following process:
- Ingest the visitor’s search terms.
- Retrieve all items containing the terms.
- Compute a relevance for each retrieved item.
- Multiply the relevance by a boost (if any).
- Sort the items by the boosted relevance.
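The steps above can be sketched as a small pipeline. This is a toy illustration only, not Brightspot's implementation: the `Item` class, the `search` function, the occurrence-counting relevance, and the boost value of 3.0 for articles are all hypothetical, and the sketch treats an item as matching if it contains at least one search term.

```python
import re
from dataclasses import dataclass


@dataclass
class Item:
    # Hypothetical, simplified content item.
    title: str
    body: str
    content_type: str


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into words."""
    return re.findall(r"\w+", text.lower())


def base_relevance(item: Item, terms: list[str]) -> float:
    """Toy relevance: the number of times the terms occur in the item."""
    words = tokenize(item.title + " " + item.body)
    return float(sum(words.count(term) for term in terms))


def search(items: list[Item], query: str, boosts: dict[str, float]) -> list[tuple[str, float]]:
    terms = tokenize(query)                          # ingest the search terms
    scored = []
    for item in items:
        relevance = base_relevance(item, terms)      # compute a relevance
        if relevance == 0:                           # skip items containing no term
            continue
        boost = boosts.get(item.content_type, 1.0)   # boost of 1.0 means "no boost"
        scored.append((item.title, relevance * boost))
    # sort by boosted relevance, best match first
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


items = [
    Item("Pumpkin pie", "A pumpkin recipe", "recipe"),
    Item("Fall harvest", "Pumpkin season begins", "article"),
    Item("Olive oil", "All about olives", "article"),
]
results = search(items, "pumpkin", {"article": 3.0})
print(results)  # [('Fall harvest', 3.0), ('Pumpkin pie', 2.0)]
```

In this sketch the recipe mentions pumpkin twice, but the article's boost lifts it to the top of the results. The item about olives is never scored because it does not contain the term.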
Suppose a visitor wants to retrieve the previous article and uses the search term pumpkin. Brightspot retrieves every item containing the search term and assigns a relevance to each item. If you know that your visitors are:
- more interested in articles than any other content type, you can boost articles to make them more relevant, and they appear higher in the search results.
- more interested in newer content, you can boost new content to make those items more relevant, and they appear higher in the search results.
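The two kinds of boost described above can be combined by multiplication. The following is a minimal sketch under stated assumptions: the function names, the article boost of 2.0, and the 30-day half-life for the recency boost are hypothetical values chosen for illustration, not Brightspot settings.

```python
from datetime import date


def type_boost(content_type: str) -> float:
    """Hypothetical boost: articles count double, other types are unchanged."""
    return 2.0 if content_type == "article" else 1.0


def recency_boost(published: date, today: date, half_life_days: float = 30.0) -> float:
    """Hypothetical boost: 2.0 for brand-new items, decaying toward 1.0 with age."""
    age_days = max((today - published).days, 0)
    return 1.0 + 0.5 ** (age_days / half_life_days)


def boosted_relevance(relevance: float, content_type: str, published: date, today: date) -> float:
    """Multiply the base relevance by every applicable boost."""
    return relevance * type_boost(content_type) * recency_boost(published, today)


today = date(2024, 10, 31)
# Both items start with the same base relevance of 4.0.
new_article = boosted_relevance(4.0, "article", date(2024, 10, 31), today)
old_gallery = boosted_relevance(4.0, "gallery", date(2024, 10, 1), today)
print(new_article, old_gallery)  # 16.0 6.0
```

Because both boosts multiply the base relevance, an item with a low base relevance can still be outranked by a strongly matching item of another type or age.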
The following table lists some of the components used to compute relevance. The examples are simplified versions of the actual calculation. (Your version of Brightspot may use different components or relevance calculations.)
Components affecting relevance
Component | Effect | Example |
Number of items containing the term | The more items that contain a term, the lower the relevance becomes. | |
Number of items with the field | The more items that contain the field, the higher the relevance becomes. | The matching field is Headline. |
Inverse item frequency | Terms that are rare across all items contribute to a higher relevance. | If pumpkin appears in only one of your items, that item receives a high relevance. |
Frequency | Items with many occurrences have higher relevance than items with fewer occurrences. | Items with many occurrences of pumpkin receive a higher relevance than items with fewer occurrences. |
Term saturation | As the number of occurrences grows, each additional occurrence contributes less to relevance. This component helps prevent items containing many occurrences of the search term from receiving an exaggerated relevance. | |
Length normalization | Compensates for the number of words in items of varying length. Without length normalization, a long item with many occurrences of the term receives a higher relevance than a shorter item, even though the additional occurrences may only reflect the item’s greater length. | An item 100 words long with 30 occurrences of the term pumpkin receives a relevance similar to that of an item 1,000 words long with 300 occurrences. |
Field length | As the length of a field grows, the containing item’s relevance decreases. | Item A has higher relevance. |
Average field length | As the average length of all fields containing the term increases, the containing items’ relevance increases. | Items C and D receive higher relevance than items A and B. |
Boost | Increases the relevance for items containing the boosted term. | If the term pumpkin has a boost of 50, and the term olives has a boost of 10, items containing pumpkin are five times more relevant than items containing olives. |
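The components in the table resemble the inputs of the classic Okapi BM25 ranking function. The sketch below shows how such components can interact for a single term. It is an assumption that your version of Brightspot scores this way; the function name, the parameter names, and the defaults `k1=1.2` and `b=0.75` are illustrative.

```python
import math


def bm25_term_score(
    freq: int,                # occurrences of the term in the field
    field_length: int,        # number of words in the field
    avg_field_length: float,  # average length of the field across items
    items_with_term: int,     # number of items containing the term
    items_with_field: int,    # number of items that have the field
    boost: float = 1.0,
    k1: float = 1.2,          # controls term saturation (assumed default)
    b: float = 0.75,          # controls length normalization (assumed default)
) -> float:
    # Inverse item frequency: rare terms score higher.
    idf = math.log(1 + (items_with_field - items_with_term + 0.5) / (items_with_term + 0.5))
    # Length normalization: fields longer than average are penalized.
    length_norm = 1 - b + b * field_length / avg_field_length
    # Frequency with saturation: each extra occurrence adds less than the one before.
    tf = freq * (k1 + 1) / (freq + k1 * length_norm)
    return boost * idf * tf


# The length normalization example from the table: 30 occurrences in 100 words
# versus 300 occurrences in 1,000 words, with an assumed average of 500 words.
short_item = bm25_term_score(30, 100, 500, items_with_term=10, items_with_field=1000)
long_item = bm25_term_score(300, 1000, 500, items_with_term=10, items_with_field=1000)
print(short_item, long_item)
```

With these assumed values the two scores differ by about one percent, which matches the table's claim that both items receive a similar relevance. Changing only `boost` scales the score proportionally, so a boost of 50 yields five times the score of a boost of 10.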
Brightspot does not search every field for a term. For example, when searching through images, Brightspot may not search the credits field. Contact your Brightspot administrator to determine which fields are included in searches.