We recently completed a Sitecore 7.2 public website project. We made the decision to use Coveo as our search engine because of its easy integration with Sitecore and the search centre’s attractive, highly functional front-end experience.
Habanero has a ton of SharePoint search experience, especially using SharePoint 2013. Initially, we figured “search is search – how different can it be?” The concept of publishing content, parsing the content, storing the parsed content in an index, and then returning that content in response to search queries is pretty much the same across search products.
What we found out is that although a lot of things are identical, or very similar, between Coveo for Sitecore and SharePoint search, there are some key differences to be aware of.
Coveo is not just a search engine for Sitecore. It started life as a standalone product, with integration points with Sitecore and other applications. As a result, Coveo for Sitecore still comes with Coveo’s own feature-rich management interface, much of which does not apply or is disabled in the Coveo for Sitecore product.
What’s the same?
- Automatically indexing published content in the content sources
- Automatically recognizing custom content types/fields (SharePoint) and templates/fields (Sitecore) and mapping these to searchable properties in the index
- Rich management interfaces for configuring the search indexes and the search experience
- Rich end-user search experience, including ranking, sorting, refining, and customizable display of results
- The ability to “promote” results in response to a particular query
- Query capabilities based on the full-text of a piece of content, or based on specific fields in the content
What’s different?
- How to manage the search experience
- Where to manage the search experience
- Crawling content
- Faceted search
- Promoted results and top results
- Infrastructure
In SharePoint 2013, search is largely a matter of configuring out-of-the-box functionality. Everything is provided for you, with settings so that you can tweak things the way you want. In some cases, you may need to do some custom development. For example, creating custom display templates to render search results the way you want generally requires a developer.
Defining content sources, defining crawl schedules, managing the search schema, resetting the index, defining result sources and result types, and configuring search Web Parts are all configuration exercises. SharePoint provides a user interface for doing all these things, or you can configure them using the .NET APIs or PowerShell.
In Coveo for Sitecore, out-of-the-box functionality is limited to the infrastructure side. Sitecore and Coveo work together to create an index for each Sitecore database, and Sitecore custom templates and fields are automatically pushed over to Coveo. For any other customization, you generally need to write .NET code to make it happen.
For example, to create a calculated field in the Coveo index, you need to write a .NET class and insert it into the Sitecore processing pipeline. For the end-user experience, Coveo for Sitecore provides a sample search centre results page with many configuration options. However, if you want to brand the page or modify functionality on any of the Web Parts, the way to do that is to take a copy of the Coveo for Sitecore Web Part and modify it as you wish. This will require a developer.
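To make the pipeline idea concrete, here is a minimal, language-neutral sketch in Python. It is not Coveo's or Sitecore's API (the real thing is a .NET class registered in the Sitecore pipeline); the names `add_display_title`, `run_pipeline`, and the `display_title` field are hypothetical and only illustrate how a processing step can add a calculated field to an item before it reaches the index.

```python
from typing import Callable, Dict, List

# An item is modelled as a flat dictionary of field name -> value (an assumption
# made for this sketch; real Sitecore items are much richer).
Item = Dict[str, str]
Processor = Callable[[Item], Item]


def add_display_title(item: Item) -> Item:
    """Hypothetical processor: add a calculated 'display_title' field.

    Uses the 'title' field when it has a value, otherwise falls back to 'name'.
    """
    enriched = dict(item)  # copy so the original item is left untouched
    title = item.get("title", "").strip()
    enriched["display_title"] = title if title else item.get("name", "")
    return enriched


def run_pipeline(item: Item, processors: List[Processor]) -> Item:
    """Pass the item through each processor in order, as a pipeline would."""
    for processor in processors:
        item = processor(item)
    return item


indexed = run_pipeline({"name": "about-us", "title": ""}, [add_display_title])
print(indexed["display_title"])
```

The point of the sketch is the shape of the work: each calculated field is a small piece of code that runs on every item on its way to the index, rather than a setting you toggle in an admin screen.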
In SharePoint, most of the search management is done in Central Administration, in the Search service application. Here, you can define content sources, manage the search schema, set search rules, reset the index, and start crawls.
In Coveo for Sitecore, most of the search management is done in Sitecore. Indexing is performed automatically when content is saved or published in Sitecore. Initiating a full indexing operation is done in Sitecore. Defining refinable properties (called “facets” in Coveo) is done partially in the Sitecore pipeline and partially in the Sitecore content tree.
A few things are still done on the Coveo management side. Resetting the search index (actually, “deleting” the indexes) is done in the Coveo admin tool. As well, setting up promoted results (called “top results” in Coveo) is done in the Coveo admin tool, and needs to be done in every index.
One area in which Coveo excels is the ability to directly examine the index. Using the Coveo admin tool, you can browse all content in the index, both full-text content as well as individual fields, which is a great help in troubleshooting.
SharePoint 2013 uses a “crawl” paradigm to discover new, changed, and deleted content. Each content source has a crawl schedule, which can perform a full, incremental, or continuous crawl. After SharePoint content is published, there is a delay before the crawler discovers the changes and updates the search index. Continuous and incremental crawls examine the SharePoint logs to determine what has changed since the last crawl, and then index only that content. Full crawls examine the entire content source, can be slow depending on the size of the content, and are usually performed relatively infrequently (e.g., weekly).
Coveo for Sitecore takes a different, “push,” approach. Every time a piece of content is saved in Sitecore, the modified content is pushed to Coveo for Sitecore, which parses the content and updates the search index. As a result, modified content is typically available in the Coveo index much faster, usually within seconds as opposed to minutes. This is in part driven by Sitecore’s search model: Coveo for Sitecore has to follow this pattern, just as the Sitecore Lucene and Solr search providers do.
For this reason, the equivalent of SharePoint’s full crawl in Sitecore-Coveo is really a full push: pushing every piece of content from the master database to the web database, which will pass each item to Coveo for Sitecore. In Sitecore, this is called reindexing and is available from the Sitecore control panel.
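The difference between the two models can be sketched in a few lines of Python. This is a toy simulation, not how either product is implemented: the classes `SearchIndex`, `CrawledSource`, and `PushedSource` are hypothetical, and deletions, scheduling, and parsing are left out to keep the contrast visible.

```python
from typing import Dict, List


class SearchIndex:
    """Stand-in for a search index: maps item IDs to indexed content."""

    def __init__(self) -> None:
        self.documents: Dict[str, str] = {}

    def update(self, item_id: str, content: str) -> None:
        self.documents[item_id] = content


class CrawledSource:
    """Crawl model: saves are only logged; the index catches up on the next crawl."""

    def __init__(self, index: SearchIndex) -> None:
        self.index = index
        self.items: Dict[str, str] = {}
        self.change_log: List[str] = []

    def save(self, item_id: str, content: str) -> None:
        self.items[item_id] = content
        self.change_log.append(item_id)  # the index is NOT touched here

    def incremental_crawl(self) -> int:
        """Index only the items changed since the last crawl; return how many."""
        changed = set(self.change_log)
        for item_id in changed:
            self.index.update(item_id, self.items[item_id])
        self.change_log.clear()
        return len(changed)


class PushedSource:
    """Push model: every save is sent to the index straight away."""

    def __init__(self, index: SearchIndex) -> None:
        self.index = index
        self.items: Dict[str, str] = {}

    def save(self, item_id: str, content: str) -> None:
        self.items[item_id] = content
        self.index.update(item_id, content)  # pushed immediately

    def reindex_all(self) -> int:
        """Full push: resend every item to the index; return how many."""
        for item_id, content in self.items.items():
            self.index.update(item_id, content)
        return len(self.items)
```

In the crawl model, a saved item is invisible to search until `incremental_crawl()` runs; in the push model, it is in the index as soon as `save()` returns, and `reindex_all()` plays the role of the full crawl.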
Faceted search refers to the experience of displaying a list of facets (which SharePoint calls refiners) on the search results page. These usually take the form of a list of categories with checkboxes. Checking the box filters the search results to just the ones that match the tag.
SharePoint 2013’s refiners are always “and” refiners, which means if you select more than one value for a specific refiner, the search results show only those results that are tagged with all the selected values.
Coveo for Sitecore’s facets are, by default, “or” refiners. This means that if you select more than one value for a specific refiner, the search results show the set of results that are tagged with at least one of the selected values. You can configure the facets to be “and” refiners instead.
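The two behaviours are easy to see with a small Python sketch. The documents and tags below are made up for illustration, and `refine` is a hypothetical helper, not part of either product.

```python
from typing import Dict, List, Set

# Hypothetical documents: each title maps to the set of values it is tagged
# with for a single facet/refiner.
DOCS: Dict[str, Set[str]] = {
    "Annual report": {"finance", "news"},
    "Press release": {"news"},
    "Budget memo": {"finance"},
    "Careers page": {"hr"},
}


def refine(docs: Dict[str, Set[str]], selected: Set[str], mode: str = "or") -> List[str]:
    """Return the titles that match the selected facet values.

    mode="or":  keep documents tagged with at least one selected value.
    mode="and": keep documents tagged with every selected value.
    """
    if not selected:
        return sorted(docs)  # nothing selected means no filtering
    if mode == "and":
        return sorted(title for title, tags in docs.items() if selected <= tags)
    if mode == "or":
        return sorted(title for title, tags in docs.items() if selected & tags)
    raise ValueError(f"unknown mode: {mode!r}")


print(refine(DOCS, {"finance", "news"}, mode="or"))
# ['Annual report', 'Budget memo', 'Press release']
print(refine(DOCS, {"finance", "news"}, mode="and"))
# ['Annual report']
```

With “or” semantics, checking a second box can only widen the result set; with “and” semantics, it can only narrow it.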
SharePoint 2013 has a single search index for all published content. To create a promoted result, you define the query you want to match, and then you set the result(s) you wish to promote. This is typically done in the search service application in Central Administration.
Coveo for Sitecore has multiple indexes, one for each Sitecore database. The process for creating a top result is similar (you can create one either by specifying the query first and then the result(s), or by navigating to the result and then defining the query it should match). However, you must define the top results in each index you want them to be in. If you have multiple published indexes (one of our applications does), this doubles the work for the search administrators. Coveo is looking into a way to manage the top results in Sitecore, which would push the results into multiple indexes, so this functionality may change in the future.
Coveo for Sitecore relies on an open-source message-queueing application called RabbitMQ to manage the content pushed from Sitecore to Coveo. When a piece of content is updated in Sitecore, Sitecore sends a message to RabbitMQ with the content, including any custom fields or modifications that are made in the Sitecore pipeline. RabbitMQ then passes the message to Coveo for indexing. RabbitMQ has a rich management interface which allows you to view the queues and the messages being relayed.
In a few cases, we found that RabbitMQ stalled and messages coming from Sitecore piled up in the queues without being relayed on to Coveo. Restarting the Coveo search service resolved the problem. We didn’t do much investigation on this, but we did write a small application which monitors the queues and sends a notification if the queues become overly full.
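A monitor along those lines could look roughly like the Python sketch below. It is not our actual application: it assumes the RabbitMQ management plugin is enabled (its HTTP API exposes queue statistics at `/api/queues`, including a `messages` count per queue), and the helper names, URL, credentials, and threshold are all placeholders.

```python
import base64
import json
import urllib.request
from typing import Dict, List


def fetch_queues(base_url: str, user: str, password: str) -> List[dict]:
    """Fetch queue statistics from the RabbitMQ management HTTP API."""
    request = urllib.request.Request(base_url.rstrip("/") + "/api/queues")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def overfull_queues(queues: List[dict], threshold: int) -> Dict[str, int]:
    """Return {queue name: message count} for queues above the threshold."""
    return {
        queue["name"]: queue.get("messages", 0)
        for queue in queues
        if queue.get("messages", 0) > threshold
    }


def check(base_url: str, user: str, password: str, threshold: int = 1000) -> None:
    """Print a warning per overfull queue; swap print for email or chat alerts."""
    for name, count in overfull_queues(fetch_queues(base_url, user, password), threshold).items():
        print(f"WARNING: queue {name!r} holds {count} messages (limit {threshold})")


# Example call (placeholder host and credentials; 15672 is the management
# plugin's default port):
# check("http://localhost:15672", "monitor_user", "secret", threshold=1000)
```

Run on a schedule, a check like this turns a silent backlog into an alert, which gives you a chance to restart the service before search results go noticeably stale.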