Tuning SharePoint Search Ranking

Tuning SharePoint Search Ranking in the object model

SharePoint Search results are returned in order of relevancy, which is determined by a ranking model. A number of ranking models are built into SharePoint 2010. With a bit of insight, these can be refined to a limited extent to better serve users.

To see the models and their definitions, let’s query the SharePoint Search application DB:

SELECT * FROM [Search_Service_Application_DB].[dbo].[MSSRankingModels]

The result set lists the models: for each one, the GUID, whether it is the default, and the underlying XML that specifies the model. The model name is at the beginning of the XML.

PowerShell can return the array of ranking models, and it is the only supported approach for manipulating the models, changing the default, and creating new ranking models. Here’s how to get the models:
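As a small illustration of where the name sits in that XML, the sketch below casts a shortened model definition to `[xml]` and reads the attributes on the root element. The `$modelXml` string is a stand-in built from the custom model shown later in this post, not an actual row from the table, and the variable names are only for illustration.

```powershell
# Shortened stand-in for a ranking model definition (not a real row from the table).
$modelXml = @'
<?xml version='1.0'?>
<rankingModel name='MyRank2' id='8447b4bc-3582-45c5-9cb8-ba2a319d850e' description='CustomJoelRank2' xmlns='http://schemas.microsoft.com/office/2009/rankingModel'>
  <queryDependentFeatures>
    <queryDependentFeature name='Title' pid='2' weight='1.46602125767061' lengthNormalization='0.549393313908594'/>
  </queryDependentFeatures>
</rankingModel>
'@

$doc  = [xml]$modelXml
$root = $doc.DocumentElement

# The model name and GUID are attributes of the root <rankingModel> element.
$modelName = $root.GetAttribute('name')
$modelId   = $root.GetAttribute('id')

"{0} ({1})" -f $modelName, $modelId
```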

Get-SPEnterpriseSearchServiceApplication | Get-SPEnterpriseSearchRankingModel

Now we can assign the ranking model array to a variable and index into it:

$A = Get-SPEnterpriseSearchServiceApplication | Get-SPEnterpriseSearchRankingModel
$A[0]   # first ranking model in the array

Or we can grab a single ranking model by its GUID. We have to look the GUID up beforehand, but that’s easy: it’s returned by the query above and never changes. For new models, we get to specify the GUID ourselves.

Once you know your rank model’s GUID, you can switch to that model by getting it and setting it as the default:

$r = Get-SPEnterpriseSearchServiceApplication | Get-SPEnterpriseSearchRankingModel 8f6fd0bc-06f9-43cf-bbab-08c377e083f4
$r.MakeDefault()

To create a custom rank model, first identify the Managed Properties you want to rank on, by PID. The name is part of the XML, but it is the PID that drives the ranking. Here’s how to get all the Managed Properties and their PIDs:

Get-SPEnterpriseSearchServiceApplication | Get-SPEnterpriseSearchMetadataManagedProperty

Now we create a new ranking model called MyRank2. Note that I want the LastModifiedTime property to be relevant, so I give it a high weight.

Get-SPEnterpriseSearchServiceApplication | New-SPEnterpriseSearchRankingModel -RankingModelXML "<?xml version='1.0'?><rankingModel name='MyRank2' id='8447b4bc-3582-45c5-9cb8-ba2a319d850e' description='CustomJoelRank2' xmlns='http://schemas.microsoft.com/office/2009/rankingModel'>
	<queryDependentFeatures>
		<queryDependentFeature name='Body' pid='1' weight='0.00125145559138435' lengthNormalization='0.0474870346616999'/>
		<queryDependentFeature name='LastModifiedTime' pid='4' weight='3.46602125767061' lengthNormalization='0.549393313908594'/>
		<queryDependentFeature name='Title' pid='2' weight='1.46602125767061' lengthNormalization='0.549393313908594'/>
		<queryDependentFeature name='Author' pid='3' weight='0.410225403867996' lengthNormalization='1.0563226501349'/>
		<queryDependentFeature name='DisplayName' pid='56' weight='0.570071355441683' lengthNormalization='0.552529462971364'/>
		<queryDependentFeature name='ExtractedTitle' pid='302' weight='1.67377875011698' lengthNormalization='0.600572652201123'/>
		<queryDependentFeature name='SocialTag' pid='264' weight='0.593169953073459' lengthNormalization='2.28258134389272'/>
		<queryDependentFeature name='QLogClickedText' pid='100' weight='1.87179361911171' lengthNormalization='3.31081658691434'/>
		<queryDependentFeature name='AnchorText' pid='10' weight='0.593169953073459' lengthNormalization='2.28258134389272'/>
	</queryDependentFeatures>
	<queryIndependentFeatures>
		<queryIndependentFeature name='ClickDistance' pid='96' default='5' weight='1.86902034145632'>
			<transformInvRational k='0.0900786349287429'/>
		</queryIndependentFeature>
		<queryIndependentFeature name='URLDepth' pid='303' default='3' weight='1.68597497899313'>
			<transformInvRational k='0.0515178916330992'/>
		</queryIndependentFeature>
		<queryIndependentFeature name='Lastclick' pid='341' default='0' weight='0.219043069749249'>
			<transformRational k='5.44735200915216'/>
		</queryIndependentFeature>
		<languageFeature name='Language' pid='5' default='1' weight='-0.56841237556044'/>
	</queryIndependentFeatures>
</rankingModel>"

There are two parts to the model: the query-dependent section, which is associated with the actual query and its metadata, and the query-independent section, which ranks on things like the number of slashes in the URL (URLDepth) and click frequency.

As soon as a model is set as the default, you can see its effect on search results.

Here’s how to change this model. Note that I add a new field called MyCompany and boost its relevance:
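To make the two sections easier to compare, the following sketch parses a trimmed copy of the MyRank2 definition and lists the features in each section. It only touches XML, so it does not need a farm. The shortened `$modelXml` string and the variable names are assumptions made for illustration.

```powershell
# Trimmed copy of the MyRank2 definition (a few features only, for illustration).
$modelXml = @'
<rankingModel name='MyRank2' xmlns='http://schemas.microsoft.com/office/2009/rankingModel'>
  <queryDependentFeatures>
    <queryDependentFeature name='Body' pid='1' weight='0.00125145559138435' lengthNormalization='0.0474870346616999'/>
    <queryDependentFeature name='LastModifiedTime' pid='4' weight='3.46602125767061' lengthNormalization='0.549393313908594'/>
    <queryDependentFeature name='Title' pid='2' weight='1.46602125767061' lengthNormalization='0.549393313908594'/>
  </queryDependentFeatures>
  <queryIndependentFeatures>
    <queryIndependentFeature name='ClickDistance' pid='96' default='5' weight='1.86902034145632'>
      <transformInvRational k='0.0900786349287429'/>
    </queryIndependentFeature>
    <queryIndependentFeature name='URLDepth' pid='303' default='3' weight='1.68597497899313'>
      <transformInvRational k='0.0515178916330992'/>
    </queryIndependentFeature>
  </queryIndependentFeatures>
</rankingModel>
'@

$model = ([xml]$modelXml).rankingModel

# Query-dependent features: one entry per managed property, with its weight.
$dependent = foreach ($f in @($model.queryDependentFeatures.queryDependentFeature)) {
    New-Object PSObject -Property @{
        Name   = $f.GetAttribute('name')
        PID    = [int]$f.GetAttribute('pid')
        Weight = [double]$f.GetAttribute('weight')
    }
}
# Highest weight first, to see which property dominates.
$ranked = @($dependent | Sort-Object Weight -Descending)

# Query-independent features: each one carries a transform child element.
$independent = foreach ($f in @($model.queryIndependentFeatures.queryIndependentFeature)) {
    $transform = $f.SelectSingleNode('*')   # first child element, whatever its name
    New-Object PSObject -Property @{
        Name      = $f.GetAttribute('name')
        Transform = $transform.LocalName
        K         = [double]$transform.GetAttribute('k')
    }
}

$ranked | Format-Table Name, PID, Weight -AutoSize
$independent | Format-Table Name, Transform, K -AutoSize
```

Sorting by weight is a quick sanity check before making a model the default: in this trimmed sample LastModifiedTime comes out on top, which is what the model was meant to do.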

Get-SPEnterpriseSearchServiceApplication | Get-SPEnterpriseSearchRankingModel 8447b4bc-3582-45c5-9cb8-ba2a319d850e | Set-SPEnterpriseSearchRankingModel -RankingModelXML "<?xml version='1.0'?><rankingModel name='CustomJoelRank2' id='8447b4bc-3582-45c5-9cb8-ba2a319d850e' description='MyRank2' xmlns='http://schemas.microsoft.com/office/2009/rankingModel'>
	<queryDependentFeatures>
		<queryDependentFeature name='Body' pid='1' weight='0.00125145559138435' lengthNormalization='0.0474870346616999'/>
		<queryDependentFeature name='MyCompany' pid='414' weight='3.610225403867996' lengthNormalization='1.0563226501349'/>
		<queryDependentFeature name='Title' pid='2' weight='0.46602125767061' lengthNormalization='0.549393313908594'/>
		<queryDependentFeature name='Author' pid='3' weight='0.410225403867996' lengthNormalization='1.0563226501349'/>
		<queryDependentFeature name='DisplayName' pid='56' weight='0.570071355441683' lengthNormalization='0.552529462971364'/>
		<queryDependentFeature name='ExtractedTitle' pid='302' weight='1.67377875011698' lengthNormalization='0.600572652201123'/>
		<queryDependentFeature name='SocialTag' pid='264' weight='0.593169953073459' lengthNormalization='2.28258134389272'/>
		<queryDependentFeature name='QLogClickedText' pid='100' weight='1.87179361911171' lengthNormalization='3.31081658691434'/>
		<queryDependentFeature name='AnchorText' pid='10' weight='0.593169953073459' lengthNormalization='2.28258134389272'/>
	</queryDependentFeatures>
	<queryIndependentFeatures>
		<queryIndependentFeature name='ClickDistance' pid='96' default='5' weight='1.86902034145632'>
			<transformInvRational k='0.0900786349287429'/>
		</queryIndependentFeature>
		<queryIndependentFeature name='URLDepth' pid='303' default='3' weight='1.68597497899313'>
			<transformInvRational k='0.0515178916330992'/>
		</queryIndependentFeature>
		<queryIndependentFeature name='Lastclick' pid='341' default='0' weight='0.219043069749249'>
			<transformRational k='5.44735200915216'/>
		</queryIndependentFeature>
		<queryIndependentFeature name='CustomJoelModified' pid='445' default='1' weight='2.56841237556044'>
			<transformRational k='5.44735200915216'/>
		</queryIndependentFeature>
		<languageFeature name='Language' pid='5' default='1' weight='1.5'/>
	</queryIndependentFeatures>
</rankingModel>"
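Hand-editing long weight values inside a quoted string is error-prone, so one option is to change the XML programmatically and pass the result to the cmdlet. The sketch below is a rough illustration only: the shortened `$modelXml`, the choice of the Title feature, and the new weight are assumptions, and the final cmdlet call is left commented out because it needs a SharePoint farm.

```powershell
# Shortened copy of the model (two features only, for illustration).
$modelXml = @'
<rankingModel name='CustomJoelRank2' id='8447b4bc-3582-45c5-9cb8-ba2a319d850e' description='MyRank2' xmlns='http://schemas.microsoft.com/office/2009/rankingModel'>
  <queryDependentFeatures>
    <queryDependentFeature name='MyCompany' pid='414' weight='3.610225403867996' lengthNormalization='1.0563226501349'/>
    <queryDependentFeature name='Title' pid='2' weight='0.46602125767061' lengthNormalization='0.549393313908594'/>
  </queryDependentFeatures>
</rankingModel>
'@

$doc = [xml]$modelXml

# Find the feature to tune by its name attribute.
$feature = @($doc.rankingModel.queryDependentFeatures.queryDependentFeature) |
    Where-Object { $_.GetAttribute('name') -eq 'Title' }

# Hypothetical new weight, chosen only for the example.
$feature.SetAttribute('weight', '1.46602125767061')

# Serialize the whole document back to a string for the cmdlet.
$updatedXml = $doc.OuterXml

# On a farm, the updated XML could then be applied like this:
# Get-SPEnterpriseSearchServiceApplication |
#     Get-SPEnterpriseSearchRankingModel 8447b4bc-3582-45c5-9cb8-ba2a319d850e |
#     Set-SPEnterpriseSearchRankingModel -RankingModelXML $updatedXml
```

Passing the XML in a variable also avoids the quoting issues that come with embedding a long string directly on the command line.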

I admittedly did not succeed in ranking by how recently a document was updated, which is known as “freshness”. SP2010 has very limited ability to customize ranking, and its transforms are obscure and not really usable for this, so a simple freshness ranking seems infuriatingly out of reach.

SP2013, however, supports it explicitly. The default SharePoint 2013 ranking model doesn’t boost the rank of search results based on their freshness, but we can achieve this by adding a tuning of the static rank that combines information from the LastModifiedTime managed property with the DateTimeUtcNow query property, using the freshness transform function. Transform functions are how ranking is customized in SP2013. Microsoft reports that the freshness transform is the only one we can use for this rank feature, because it converts the age of the item from an internal representation into days.

Even before migrating to SP2013, we can configure an SP2013 farm to crawl production SP2010 and return results tuned in this way, then use the SP2013 search results to serve any client we choose to point at SP2013, including a simple search site in SP2013.
