Sitecore Search and Indexing: Creating a simple search

With Sitecore 7, Sitecore introduced the new Sitecore.ContentSearch API, which out of the box can query Lucene- and Solr-based indexes.

Searching the indexes has been made easier through Linq to Sitecore, which lets you construct a query using Linq, just as you would with Entity Framework or Linq to SQL.

To run a query you first need a search context. Here I’m getting a context on one of the default indexes:

using (var context = ContentSearchManager.GetIndex("sitecore_web_index").CreateSearchContext())
{
    ...
}

Next, a simple query would look like this. Here I’m filtering with a where clause on the “body” field:

using (var context = ContentSearchManager.GetIndex("sitecore_web_index").CreateSearchContext())
{
    // Straight quotes and a terminating semicolon are required for this to compile.
    IQueryable<SearchResultItem> searchQuery = context.GetQueryable<SearchResultItem>().Where(item => item["body"] == "Sitecore");
}

But what if you want to add a search to your site? Typically you would want to filter on more than one field, what the user enters may be a collection of words rather than an exact phrase, and you’d also like some intelligent ordering of your results.

Here I am splitting the search term on spaces and then building a predicate that has an “or” between each of its conditions. For each condition, rather than doing a .Contains on a specific field, I’m doing it on the content field, which contains data from all the fields in the item.

using (var context = ContentSearchManager.GetIndex("sitecore_web_index").CreateSearchContext())
{
    IQueryable<SearchResultItem> query = context.GetQueryable<SearchResultItem>();

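    // Seed with False: the terms are OR'ed together, and "true OR anything" would match every item.
    // (With no terms at all, the query therefore returns nothing.)
    var predicate = PredicateBuilder.False<SearchResultItem>();

    foreach (string term in criteria.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
    {
        // Local copy so that each lambda captures this iteration's term.
        string searchTerm = term.Trim();
        predicate = predicate.Or(p => p.Content.Contains(searchTerm));
    }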

    SearchResults<SearchResultItem> searchResults = query.Where(predicate).GetResults();

    results = (from hit in searchResults.Hits
               select hit.Document).ToList();
}
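
The way the “or” predicate is assembled can be tried outside Sitecore. The following is only a sketch under stated assumptions: Doc is a hypothetical stand-in for SearchResultItem, and the Or helper is a hand-rolled substitute for the PredicateBuilder extension so that the example runs against plain LINQ to Objects. It is not how Sitecore implements it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical stand-in for a search document; only Content matters here.
public class Doc
{
    public string Content { get; set; }
}

public static class OrPredicateSketch
{
    // Combines two predicates with OR, rebinding the right lambda's
    // parameter to the left one so both bodies share a single parameter.
    public static Expression<Func<T, bool>> Or<T>(
        Expression<Func<T, bool>> left, Expression<Func<T, bool>> right)
    {
        Expression rebound =
            new ParameterSwap(right.Parameters[0], left.Parameters[0]).Visit(right.Body);
        return Expression.Lambda<Func<T, bool>>(
            Expression.OrElse(left.Body, rebound), left.Parameters);
    }

    // Splits the input on spaces and ORs one Contains condition per term.
    public static Expression<Func<Doc, bool>> BuildPredicate(string criteria)
    {
        // Seed with "false": false OR x is x, whereas a "true" seed matches everything.
        Expression<Func<Doc, bool>> predicate = d => false;

        foreach (string term in criteria.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
        {
            string current = term; // local copy so each lambda captures its own term
            predicate = Or(predicate, d => d.Content.Contains(current));
        }

        return predicate;
    }

    // Replaces one parameter expression with another throughout an expression tree.
    private sealed class ParameterSwap : ExpressionVisitor
    {
        private readonly ParameterExpression _from;
        private readonly ParameterExpression _to;

        public ParameterSwap(ParameterExpression from, ParameterExpression to)
        {
            _from = from;
            _to = to;
        }

        protected override Expression VisitParameter(ParameterExpression node)
        {
            return node == _from ? _to : base.VisitParameter(node);
        }
    }
}
```

Calling docs.AsQueryable().Where(OrPredicateSketch.BuildPredicate("sitecore lucene")) keeps any Doc whose Content contains either word. With an empty search string the predicate stays at its “false” seed and nothing is returned, which is usually the safer default for a site search.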

You get the intelligent ordering of results for free: hits come back ranked by their relevance to what was searched for.

Muddlings with Sitecore Indexes

On a recent Sitecore project we needed to have a faceted product search. For this we opted to use the Lucene-based Search and Indexing functionality that comes with Sitecore. Overall this proved very easy to use, but here are the details of a couple of issues we encountered.

Items Duplicating on Publish and never Deleting

The first issue we found was that although the index was being built and we could read it, if we ever deleted an item it wasn’t removed from the index. Equally, if we ever saved and published an item, it would become duplicated in the index.

Doing a manual rebuild of the index would clear the items back down to what we would normally expect, but for some reason changes were clearly just being appended to the index rather than updating the existing entries.

Looking through Sitecore’s “Sitecore Search and Indexing Guide” (http://sdn.sitecore.net/upload/sitecore7/75/sitecore_search_and_indexing_guide_sc75-a4.pdf) wasn’t much help; as far as we could tell the index was set up correctly. Comparing it to the default index that comes with a blank install of Sitecore didn’t help much either.

In the end it transpired that your index’s field name definitions must include a field for “_uniqueid”, presumably because without it the indexer cannot find an item’s existing entry in order to update or delete it. We had assumed that some sort of config like this must be needed; however, Sitecore’s indexing guide doesn’t actually mention it anywhere.

<fieldNames hint="raw:AddFieldByFieldName">
  <field fieldName="_uniqueid" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider">
    <analyzer type="Sitecore.ContentSearch.LuceneProvider.Analyzers.LowerCaseKeywordAnalyzer, Sitecore.ContentSearch.LuceneProvider" />
  </field>
  <!-- other field definitions omitted -->
</fieldNames>

Index not updating in the Content Delivery environment

At this point our indexes were working fine in our own test environments. Upon deploying to our client’s servers, however, the indexes were never updating on either their Content Management or Content Delivery servers. Doing a manual rebuild of the index would cause the Content Management server’s index to update, but the Content Delivery server’s index constantly remained empty.

Clearly there was some sort of difference between their environments and ours. We didn’t have any direct access to their servers, so we checked the config settings that are viewable at /sitecore/admin/showconfig.aspx.

Sure enough there was a difference.

This Sitecore install was running 7.2, and prior to that the latest version they had used was 6.6. They had set up a custom config setting which removed the hooks section from the config, because some of Sitecore’s default hooks interfered with performance monitoring tools they use on their sites. Unfortunately it was also removing a hook that loads Sitecore.ContentSearch. Without this hook, indexes are never updated on events.

<hooks>
  <hook type="Sitecore.ContentSearch.Hooks.Initializer, Sitecore.ContentSearch" patch:source="Sitecore.ContentSearch.config"/>
</hooks>
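
If the hooks section has to stay stripped for the monitoring tools, one option might be an include file that adds just the ContentSearch initializer back. This is a sketch only, not necessarily what was deployed here: the file name in the comment is hypothetical, and it assumes the file is loaded after the customization that removes the hooks section.

```xml
<!-- Hypothetical include file, e.g. App_Config/Include/zz.RestoreContentSearchHook.config -->
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <hooks>
      <!-- Re-register only the hook that initializes Sitecore.ContentSearch -->
      <hook type="Sitecore.ContentSearch.Hooks.Initializer, Sitecore.ContentSearch" />
    </hooks>
  </sitecore>
</configuration>
```

After deploying a change like this, /sitecore/admin/showconfig.aspx should show the hook again if the patch was applied in the expected order.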