Rails fuzzy searching with Sunspot gem

Written by John Eberly

Search for 'Jon Smath' and get 'John Smith'.

This post explains an easy way to get "fuzzy" search results when using Sunspot with Ruby on Rails. This is probably obvious to the Solr experts out there, but I found the information lacking in the Rails community. I had originally compared Solr and Sphinx for fuzzy searching. In that comparison I used Levenshtein distance for Solr, which turned out not to scale well and does not always return the best results when there are exact matches.

For this reason, we switched to the Double Metaphone algorithm for fuzzy searching in Solr. It is a simple way to get fuzzy results from Solr searches, and it scales well because most of the work is done at indexing time rather than at query time.
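To make the "work is done at indexing time" point concrete, here is a rough sketch in plain Ruby. It is not how Solr implements the filter, and it uses Soundex as a much simpler stand-in for Double Metaphone. The method names (soundex, phonetic_key, build_index, fuzzy_lookup) are made up for this illustration. The idea is the same, though: each name is reduced to a phonetic key once when it is indexed, so a misspelled query only needs a hash lookup.

```ruby
# Illustration only: Soundex stands in for Double Metaphone here,
# and all method names are hypothetical (not part of Sunspot or Solr).

# Letter-to-digit table for American Soundex.
SOUNDEX_CODES = {
  'bfpv' => '1', 'cgjkqsxz' => '2', 'dt' => '3',
  'l' => '4', 'mn' => '5', 'r' => '6'
}.flat_map { |letters, digit| letters.chars.map { |c| [c, digit] } }.to_h.freeze

# Encode one word as a four-character Soundex key, e.g. "John" -> "J500".
def soundex(word)
  letters = word.downcase.gsub(/[^a-z]/, '')
  return '' if letters.empty?

  key = letters[0].upcase
  prev = SOUNDEX_CODES[letters[0]]
  letters[1..-1].each_char do |ch|
    next if ch == 'h' || ch == 'w' # h and w do not separate repeated codes
    code = SOUNDEX_CODES[ch]
    key << code if code && code != prev
    prev = code # vowels reset prev to nil, so a repeated code after a vowel is kept
  end
  key[0, 4].ljust(4, '0')
end

# Phonetic key for a whole phrase: one Soundex code per word.
def phonetic_key(text)
  text.split.map { |word| soundex(word) }.join(' ')
end

# "Index time": compute the phonetic key once per name and group names by key.
def build_index(names)
  index = Hash.new { |hash, key| hash[key] = [] }
  names.each { |name| index[phonetic_key(name)] << name }
  index
end

# "Query time": encode the query the same way and do a single hash lookup.
def fuzzy_lookup(index, query)
  index.fetch(phonetic_key(query), [])
end

index = build_index(['John Smith', 'Jane Doe'])
p fuzzy_lookup(index, 'Jon Smath') # => ["John Smith"]
```

Compare this with an edit-distance approach, which has to score the query against many candidate terms at search time. Here the expensive part happens once per record, when it is indexed.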

Here is how to make it work using Sunspot.

Edit solr/conf/schema.xml and add the following line to the <analyzer> section of the "text" field type.

<filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone" inject="true"/>

So it looks something like this.

<fieldType name="text" class="solr.TextField" omitNorms="false">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone" inject="true"/>
  </analyzer>
</fieldType>

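Note: order is important. Solr applies the filters in the order they are listed, so you will want this one towards the bottom of the <analyzer> block, after the tokenizer and the lowercase filter. See the Solr Wiki for more detailed information.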

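Last step: restart Solr and reindex (with sunspot_rails, rake sunspot:reindex). The reindex is required because the phonetic tokens are generated at indexing time, so records indexed before the change will not have them. Then test your searches with misspellings such as 'Jon Smath'.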
