Sitecore
Debugging Sitecore 9 Analytics Issues

Debugging Sitecore 9 Analytics Issues

With Sitecore 9 the way analytics is recorded and processed changed. Rather than everything being done by the IIS application, the architecture changed to include:

  1. A second IIS application called xconnect
  2. A windows services called Sitecore XConnect Search Indexer

As well as this Mongo was replaced by:

  1. SQL Server databases
  2. A SOLR XDB Core

Sitecore 9 also introduced data being secure in transit as well as at rest which means all traffic is encrypted using certificates.

While all these peices can be considered good, it does also create more points of failure that becomes harder to debug. So after spending a decent amount of time debugging why no analytic reports were loading and then why no data was appearing in the reports, I've made a checklist to go through.

Debugging Sitecore Analytics Checklist

1. XConnect Connection String

The main Sitecore application has connection strings for where it will find the XConnect service. They will look like this:

1 <add name="xconnect.collection" connectionString="https://mysite.xconnect" />
2 <add name="xconnect.collection.certificate" connectionString="StoreName=My;StoreLocation=LocalMachine;FindType=FindByThumbprint;FindValue=EF7F38B623E6664359110F2C6EB6DA00D567950F" />

One gives the path to the XConnect service and the other gives the path to find the certificate.

Firstly check that the XConnect path is correct and that you can access it and secondly check that the thumbprint corresponds with a certificate in the certificate store.

You can see what certificates are on your machine using this PowerShell script:

1Get-ChildItem -Path "cert:\LocalMachine\Root" | Format-Table Subject, FriendlyName, Thumbprint
2Get-ChildItem -Path "cert:\LocalMachine\My" | Format-Table Subject, FriendlyName, Thumbprint
3Get-ChildItem -Path "cert:\CurrentUser\My" | Format-Table Subject, FriendlyName, Thumbprint

2. Check certificate expiry date

Having a certificate is a good start, but it could still have expired.

Find the certificate in the certificate store by hitting start > manage computer certificates and find it in one of the folders.

Check the expiration date. If it's in the past then it's not going to work. This is common because the installation script for Sitecore 9 will set the expiration date to a year after install by default.

If it has expired you will need a new cert, you can create this by using the same script that you used to install sitcore origionally (Just the certificate bit).

1#Switch to correct vesion of SIF
2Remove-Module -Name SitecoreInstallFramework
3Import-Module -Name SitecoreInstallFramework -RequiredVersion 1.2.1
4
5#define parameters
6$prefix = "SitePrefix"
7$PSScriptRoot = "C:\resourcefiles9.0"
8$XConnectCollectionService = "$prefix.xconnect"
9$sitecoreSiteName = "$prefix.sc"
10
11#install client certificate for xconnect
12$certParams = @{
13 Path = "$PSScriptRoot\xconnect-createcert.json"
14 CertificateName = "$prefix.xconnect_client"
15 RootCertFileName = "SIF121Root"
16}

3. Check security permissons on the certificate

If you have a certificate and it's still valid then it could be that the app pool the site is running in doesn't have access to read the certificate.

To check this:

  1. Right click the certificate
  2. All Tasks
  3. Manage Private Keys
  4. Check that the app pool user for your site is listed in the list of users and that it has read permission. If it's not there add it using the name
    IIS APPPOOL\app pool name

4. Check License Files

Partner licenses only last for a year so if your using one of those it may have expired.

We're all used to checking the license file in the Sitecore application but XConnect has a license too.

These will be in:
sitename.xconnect\App_data\
sitename.xconnect\App_data\jobs\continuous\IndexWorker\App_data
sitename.xconnect\App_data\jobs\continuous\AutomationEngine\App_Data

Only the first will be used when your viewing the site, but it's worth knowing about the others to, incase you ever run a job manually.

5. Check manual rebuild of indexes

You can trigger a manual rebuild of the xDB index by following these instructions:

https://doc.sitecore.com/developers/90/sitecore-experience-platform/en/rebuild-the-xdb-index-in-solr.html

Remember in point 4 that it has it's own license file. It also has it's own connection strings.

6. Check XConnect site works in a browser

If you open XConnect in a browser you should recieve no certificate errors and a timestamp saying how long XConnect had been running for.

7. Check certificates are in the right store

This stack overflow post was a big help for me (https://stackoverflow.com/questions/26247462/http-error-403-16-client-certificate-trust-issue ). I was at the point where everything seemed right, but moving the certificates as shown here got it to the point of the analytics reports loading.

Windows 2012 introduced stricter certificate store validations. According to KB 2795828: Lync Server 2013 Front-End service cannot start in Windows Server 2012, the Trusted Root Certification Authorities (i.e. Root) store can only have certificates that are self-signed. If that store contains non-self-signed certificates, client certificate authentication under IIS returns with a 403.16 error code.

To solve the problem, you have to remove all non-self-signed certificates from the root store. This PowerShell command will identify non-self-signed certificates:

1Get-Childitem cert:\LocalMachine\root -Recurse |
2 Where-Object {$_.Issuer -ne $_.Subject}

In my situation, we moved these non-self-signed certificates into the Intermediate Certification Authorities (i.e. CA) store:

1Get-Childitem cert:\LocalMachine\root -Recurse |
2 Where-Object {$_.Issuer -ne $_.Subject} |
3 Move-Item -Destination Cert:\LocalMachine\CA

Checklist for data not going into Analytics

If you've got to the point of the analytics reports working, but not showing any data, this is my checklist for making sure data goes in. In my case I was trying to log site searches as per my article from a few years ago https://himynameistim.com/2017/09/13/populating-the-internal-search-report-in-sitecore/ there wern't any errors, but no data ever showed.

1. Enable Analytics Debugging

In Sitecore.Analytics.Tracking.config there is a setting to set the analytics logging level to debug. You will also need to set the log level on log4net root in Sitecore.config to debug.

2. Disable Robot Detection

In my case Sitecore thought I was a robot. Changing these settings will disable that:

1<setting name="Analytics.AutoDetectBots" set:value="false" />
2<setting name="Analytics.Robots.IgnoreRobots" set:value="false" />

3. Test analytics is tracking something

One of the hardest parts about analytics is it's not instant. The initial tracking only goes into the DB at the end of the users session and that's only for collection. It won't appear in the reports until processing has happened. So to speed this up:

Create a page called kill.aspx as follows. This will end the users session and trigger the data to be fed into the DB.

1<%@ Page language="c#" %>
2
3<!DOCTYPE html>
4<html>
5 <head>
6
7 </head>
8 <body>
9
10<div>Session Abandoned</div>
11<% Session.Abandon(); %>
12
13 </body>
14</html>

Next do something on the site that will cause some tracking to get added to the DB. In my case it was the search. Then go to kill.aspx to force session abondon.

Check the logs. You should see something like this...

125016 11:19:13 DEBUG [Analytics]: The CommitSession pipeline, ProcessSubscriptions is skipped - there is no subscriptions for location id: 4ebd0208-8328-5d69-8c44-ec50939c0967

Check the DB an entry should have gone into a shard db for [xdb_collection].[Interactions] table

To speed up processing, restart the main sitecore application.

Sitecore strange language switching

Sitecore strange language switching

The other day we started experiencing a strange issue on one of our test sites. Pages were starting to error and looking at the logs the errors were happening in view's which hadn't been updated in a long time. Some of these seemed like the code wasn't robust enough to handle when a datasource hadn't been set (the classic object not set to an instance of an object error), but fixing these just resulted in an error in another. It also didn't explain why this suddenly started happening, not to mention the components in question should have all had data.

Then we noticed that on the first view of any page of the site with a new session the site would display fine even though these were server errors happening. Navigating to any other page or refreshing though caused the server error to return.

This led us to look at the differences between the request headers and we had our answer. On all subsequent requests the language context had been changed to some random thing which there is no content for. The site itself is not a multilanguage site and only has content for en. Adding the language code into the URL would force the context language back and the page would work again.

How language works in Sitecore

To understand what is happening it's good to know how languages in Sitecore work.

Sitecore is built as a platform which can server content in multiple languages. By default you start with one language in the editor (en) but are able to add more. You can read my blog post from a few years ago on how to do this here (https://himynameistim.com/2015/06/30/sitecore-adding-languages-for-a-multilingual-site/).

Sitecore will recognise which language should be displayed based on a language code in a URL. e.g. himynameistim.com/en-gb/ for English - Great Britian. This is done through the strip language pipeline which picks up these languages and sets the context language.

Config for the link manager then controls if links are generated with these language codes in the URL or not. On a single language site you would have this set to never resulting in URL's without a language code and the default language will be used.

The flaw in all of this though is the strip language pipeline always runs, even when your pages only have one lauange. The pipeline also doesn't check if the language it finds in the url is set up as a language on the site, so quite a lot of two letter combinations will work and change the language context. When this is changed, it is changed for the users session meaning it is possible for a url to inadvertedly cause the language context to change for a user on the site when the site only has one language.

Disabling Strip Language

As the site in question only has one language the fix is quite simple. For multilingual sites the solution is a bit harder.

For a single language site you can simply turn off the strip language functionality. You can do this using a patch config file as follows:

1<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
2 <sitecore>
3 <settings>
4 <setting name="Languages.AlwaysStripLanguage">
5 <patch:attribute name="value">false</patch:attribute>
6 </setting>
7 </settings>
8 </sitecore>
9</configuration>
Removing port 443 from urls generated by Sitecore

Removing port 443 from urls generated by Sitecore

For as long as I've been working on Sitecore there has been this really annoying issue where setting the link manager to include server url and running under https will cause urls to be generated with the port number included. e.g. https://www.himynameistim.com:443/ which naturally you don't actually want.

To overcome this there are a few methods you can take.

Method 1 - Set the Scheme and Port on you site defenition

This is possibly the smallest change you can make as it's just 2 settings in a config file.

Setting the external port on site node to 80 (yes 80) tricks the link manager code into not appending the port number as it does it for everything other than port 80.

1<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:xdt="http://schemas.microsoft.com/XML-Document-Transform">
2 <sitecore>
3 <sites xdt:Transform="Insert">
4 <site name="website">
5 <patch:attribute name="hostName">www.MySite.com</patch:attribute>
6 <patch:attribute name="rootPath">/sitecore/content/MySite</patch:attribute>
7 <patch:attribute name="scheme">https</patch:attribute>
8 <patch:attribute name="externalPort">80</patch:attribute>
9 </site>
10 </sites>
11 </sitecore>
12</configuration>

What I don't like about this method though, is your setting something to be wrong to get something else to come out right. It's all a bit wrong.

Method 2 - Write your own link provider

The second method which I have generally done is to write your own provider which strips the port number off the generated URL.

For this you will need:

1. A patch file to add the provider:

1<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
2 <sitecore>
3 <linkManager defaultProvider="sitecore">
4 <patch:attribute
5 name="defaultProvider"
6 value="CustomLinkProvider" />
7 <providers>
8 <add name="CustomLinkProvider"
9 type="MySite.Services.CustomLinkProvider,
10 MySite"
11 languageEmbedding="never"
12 lowercaseUrls="true"
13 useDisplayName="true"
14 alwaysIncludeServerUrl="true"
15 />
16 </providers>
17 </linkManager>
18 <mediaLibrary>
19 <mediaProvider>
20 <patch:attribute name="type">
21 MySite.Services.NoSslPortMediaProvider, MySite
22 </patch:attribute>
23 </mediaProvider>
24 </mediaLibrary>
25 </sitecore>
26</configuration>

2. A helper method that removes the SSL port

1namespace MySite
2{
3 /// <summary>
4 /// Link Helper is used to remove SSL Port
5 /// </summary>
6 public static class LinkHelper
7 {
8 /// <summary>
9 /// This method removes the 443 port number from url
10 /// </summary>
11 /// <param name="url">The url string being evaluated</param>
12 /// <returns>An updated URL minus 443 port number</returns>
13 public static string RemoveSslPort(string url)
14 {
15 if (string.IsNullOrWhiteSpace(url))
16 {
17 return url;
18 }
19
20 if (url.Contains(":443"))
21 {
22 url = url.Replace(":443", string.Empty);
23 }
24
25 return url;
26 }
27 }
28}

3. The custom link provider which first gets the item URL the regular way and then strips the SSL port

1using Sitecore.Data.Items;
2using Sitecore.Links;
3
4namespace MySite
5{
6 /// <summary>Provide links for resources.</summary>
7 public class CustomLinkProvider : LinkProvider
8 {
9 public override string GetItemUrl(Item item, UrlOptions options)
10 {
11 // Some code which manipulates and exams the item...
12
13 return LinkHelper.RemoveSslPort(base.GetItemUrl(item, options));
14 }
15 }
16}
17

4. The same provider for media

1using Sitecore.Data.Items;
2using Sitecore.Resources.Media;
3
4namespace MySite
5{
6 /// <summary>
7 /// This method removes SSL port number from Media Item URLs
8 /// </summary>
9 public class NoSslPortMediaProvider : MediaProvider
10 {
11 /// <summary>
12 /// Overrides Url mechanism for Media Items
13 /// </summary>
14 /// <param name="item">Sitecore Media Item</param>
15 /// <param name="options">Sitecore Media Url Options object</param>
16 /// <returns>Updated Media Item URL minus 443 port</returns>
17
18 public override string GetMediaUrl(MediaItem item, MediaUrlOptions options)
19 {
20 var mediaUrl = base.GetMediaUrl(item, options);
21 return LinkHelper.RemoveSslPort(mediaUrl);
22 }
23 }
24}

What I don't like about this method is it's messy in the opposite way. The port number is still being added, and we're just adding code to try and fix it after.

Credit to Sabo413 for the code in this example

Method 3 - Official Sitecore Patch

Given that it's Sitecore's bug, it does actually make sense that they fix it. After all people are paying a license fee for support! This simplifies your solution down to 1 extra patch file and a dll. What's better is as it's Sitecores code they have the responsibility of fixing it, if it ever breaks something, and you have less custom code in your repo.

You can get the fix here for Sitecore version 8.1 - 9.0.

So this may leave you wondering how did Sitecore fix it? Well having a look inside the dll reveals they wen't for method 2.

Sitecore: Returning a 404 response on the page requested rather than redirecting to a 404 page

Sitecore: Returning a 404 response on the page requested rather than redirecting to a 404 page

Previously I've blogged about:

but while looking through the posts today, I realised I had never written about how you stop Sitecore from issuing 302 redirects to your 404 page and instead return a 404 on the URL requested with the contents of the 404 page.

While search engines will recognise a 302 response to a 404 as a 404 (in fact they're intelligent enough to work out that a 404 page without a correct response status code is a 404) it's considered SEO best practice for the URL to stay the same and to issue the correct status code.

Creating a NotFoundResolver class

When Sitecore processes a request it will run the httpRequestBegin pipeline, and within that pipeline is a Item Resolver processor that will attempt to find the requested item. If after this the context item is still null then the logic to redirect to the ItemNotFoundUrl will kick in. To stop this happening we can simply add our own process to the pipeline after ItemResolver and set the item.

Our class looks like this:

1using Sitecore;
2using Sitecore.Configuration;
3using Sitecore.Data;
4using Sitecore.Diagnostics;
5using Sitecore.Pipelines.HttpRequest;
6
7namespace Pipelines.HttpRequest
8{
9 public class NotFoundResolver : HttpRequestProcessor
10 {
11 private static readonly string PageNotFoundID = Settings.GetSetting("PageNotFound");
12
13 public override void Process(HttpRequestArgs args)
14 {
15 Assert.ArgumentNotNull(args, nameof(args));
16
17 if ((Context.Item != null) || (Context.Database == null))
18 return;
19
20 if (args.Url.FilePath.StartsWith("/~/"))
21 return;
22
23 var notFoundPage = Context.Database.GetItem(new ID(PageNotFoundID));
24 if (notFoundPage == null)
25 return;
26
27 args.ProcessorItem = notFoundPage;
28 Context.Item = notFoundPage;
29 }
30 }
31}

To add our process to the pipeline we can use a patch file like this:

1<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
2 <sitecore>
3 <pipelines>
4 <httpRequestBegin>
5 <processor
6 patch:after="processor[@type='Sitecore.Pipelines.HttpRequest.ItemResolver, Sitecore.Kernel']"
7 type="LabSitecore.Core.Pipelines.NotFoundResolver, LabSitecore.Core" />
8 </httpRequestBegin>
9 </pipelines>
10 <settings>
11 <!-- Page Not Found Item Id -->
12 <setting name="PageNotFound" value="ID OF YOUR 404 PAGE" />
13 </settings>
14 </sitecore>
15</configuration>

Notice the setting for the ID of your 404 page to be loaded as the content.

Remember if you do do this make sure you also follow one of the methods to return a 404 status code, otherwise you will have just made every URL a valid 200 response on your site.