On a recent Sitecore project we needed a faceted product search. For this we opted to use the Lucene-based search and indexing functionality that comes with Sitecore. Overall this proved very easy to use, but here are the details of a couple of issues we encountered.
Items Duplicating on Publish and never Deleting
The first issue we found was that although the index was being built and we could read it, if we ever deleted an item it wasn't removed from the index. Equally, if we ever saved and published an item, it would become duplicated in the index.
Doing a manual rebuild of the index would clear the items back down to what we would normally expect, but for some reason changes were clearly just being added to the index as new entries rather than updating the existing ones.
Looking through Sitecore's "Sitecore Search and Indexing Guide" (http://sdn.sitecore.net/upload/sitecore7/75/sitecore_search_and_indexing_guide_sc75-a4.pdf) wasn't much help; as far as we could tell the index was set up correctly. Comparing against the default index that comes with a blank install of Sitecore didn't help much either.
In the end it transpired that your index's field name definitions must include a field for "_uniqueid". We had assumed that some sort of config like this must be needed, but Sitecore's indexing guide doesn't actually mention it anywhere.
<fieldNames hint="raw:AddFieldByFieldName">
  <field fieldName="_uniqueid" storageType="YES" indexType="TOKENIZED" vectorType="NO" boost="1f" type="System.String" settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider">
    <analyzer type="Sitecore.ContentSearch.LuceneProvider.Analyzers.LowerCaseKeywordAnalyzer, Sitecore.ContentSearch.LuceneProvider" />
  </field>
</fieldNames>
Index not updating in the Content Delivery environment
At this point our indexes were working fine in our own test environments. Upon deploying to our client's servers, however, the indexes were never updating on either their Content Management or Content Delivery servers. Doing a manual rebuild of the index would cause the Content Management server's index to update, but the Content Delivery server's index constantly remained empty.
Clearly there was some sort of difference between their environments and ours. We didn't have any direct access to their servers, so we checked the config settings that are viewable by going to /sitecore/admin/showconfig.aspx
Sure enough there was a difference.
This Sitecore install was running 7.2, and prior to that the latest version the client had used was 6.6. They had set up a custom config setting which removed the hooks section from config, because some of the default hooks Sitecore has interfered with the performance monitoring tools they use on their sites. Unfortunately it was also removing a hook that loads Sitecore.ContentSearch. Without this hook, indexes are never updated on events.
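As a rough sketch of one way to put the hook back, a small include file can re-add it after the custom config has stripped the hooks section. The hook type below is what we would expect to find in a default Sitecore 7 Sitecore.ContentSearch.config, but treat it as an assumption and check it against the showconfig.aspx output of a clean install. The file name is hypothetical, and the file has to be loaded after the config that removes the hooks.

```xml
<!-- Hypothetical include file, e.g. App_Config/Include/zz.RestoreContentSearchHook.config -->
<!-- The "zz" prefix is an assumption to make it load after the config that removes the hooks -->
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <hooks>
      <!-- Assumed default hook that initialises Sitecore.ContentSearch; verify against a clean install -->
      <hook type="Sitecore.ContentSearch.Hooks.Initializer, Sitecore.ContentSearch" />
    </hooks>
  </sitecore>
</configuration>
```

After deploying something like this, /sitecore/admin/showconfig.aspx should show the hooks section again with just the ContentSearch hook in it, leaving out the other default hooks that were causing problems with the monitoring tools.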