Fixing bad SharePoint taxonomy term references

How to fix bad Taxonomy Terms in SharePoint automatically

A document that uses managed metadata can end up with a term orphaned from its term set. This can happen when the item holds a bad reference to the site collection's cached terms list, a hidden list named “TaxonomyHiddenList” that exists in every site collection. You can browse it at this URL: [siteurl]/Lists/TaxonomyHiddenList/AllItems.aspx

While it’s easy to extend this to check the health of each document term, for simplicity let’s imagine we want to fix a single document, and have the URL and know the name of the taxonomy field. Let’s set the basics before we really get started:

$docurl = "http://WebApp/path/lib/doc.xlsx" #URL to correct
$site = New-Object Microsoft.SharePoint.SPSite($docurl)  #grab the SPSite
$web = $site.OpenWeb() #grab the SPWeb
$item = $web.GetListItem($docurl) #grab the SPListItem
$targetField = "MMS Field Name" # let's establish the name of the field
$TermValueToReplace = $item[$targetField].label;  #this is the term label we want to re-assign correctly

Now let’s get a taxonomy session, with proper error detection:

try
{
Write-Host "getting Tax Session for $($Site.url)..." -NoNewline
$taxonomySession = Get-SPTaxonomySession -Site $site -ErrorAction Stop  #one session per site collection; Stop makes a failure reach the catch block
$termStore = $taxonomySession.TermStores[0]  #use the first term store; if you ever loop over items, do this once outside the loop
Write-Host "Got Tax Session. " 
}
catch
{
Write-Host "Tax session acquisition problem for $($site.url)"
$ok=$false;
}

Given the field, let’s get the correct termset.

[Microsoft.SharePoint.Taxonomy.TaxonomyField]$taxonomyField = $item.Fields.GetField($targetField)  
$termSetID=$taxonomyField.TermSetId
$termSet = $termStore.GetTermSet($taxonomyField.TermSetId)  #if we ever loop, move to outside item loop, this takes a long time!

Now we look up the term in the Managed Metadata Service by its label. We expect precisely one match, but we’ll check for that later.

#"true" parameter avoids untaggable terms, like parent term at higher tier that should not be selected
[Microsoft.SharePoint.Taxonomy.TermCollection] $TC = $termSet.GetTerms($TermValueToReplace,$true)  

Now let’s populate a TaxonomyFieldValue, and assign it to the SPItem, and save it without changing timestamp or author by using SystemUpdate()

$taxonomyFieldValue = new-object Microsoft.SharePoint.Taxonomy.TaxonomyFieldValue($taxonomyField)
$t1=$TC[0]  #use the first result
$taxonomyFieldValue.set_TermGuid($t1.get_Id())  #this assigns the GUID
$taxonomyFieldValue.set_Label($TermValueToReplace)  #this assigns the value
$taxonomyField.SetFieldValue($Item,$taxonomyFieldValue)  #let's assign to the SPItem
$item.systemupdate()

That’s the meat of it. Let’s put it all together with error handling:

$ok=$true;
$docurl = "http://WebApp/path/lib/doc.xlsx" #URL to correct
$site = New-Object Microsoft.SharePoint.SPSite($docurl)
$web = $site.OpenWeb()
$item = $web.GetListItem($docurl)
$targetField = "FieldName"
$TermValueToReplace = $item[$targetField].label;
try
{
Write-Host "getting Tax Session for $($Site.url)..." -NoNewline
$taxonomySession = Get-SPTaxonomySession -Site $site -ErrorAction Stop  #one session per site collection; Stop makes a failure reach the catch block
$termStore = $taxonomySession.TermStores[0]  #use the first term store; if you ever loop over items, do this once outside the loop
Write-Host "Got Tax Session. " 
}
catch
{
Write-Host "Tax session acquisition problem for $($site.url)"
$ok=$false;
}
[Microsoft.SharePoint.Taxonomy.TaxonomyField]$taxonomyField = $item.Fields.GetField($targetField)  
$termSetID=$taxonomyField.TermSetId
$termSet = $termStore.GetTermSet($taxonomyField.TermSetId)  #Move to outside item loop, this takes a long time! 	
[Microsoft.SharePoint.Taxonomy.TaxonomyFieldValue]$taxonomyFieldValue = New-Object Microsoft.SharePoint.Taxonomy.TaxonomyFieldValue($taxonomyField)  
[Microsoft.SharePoint.Taxonomy.TermCollection] $TC = $termSet.GetTerms($TermValueToReplace,$true)  #true avoids untaggable terms, like parent company at higher tier
if ($TC.count -eq 0)
{
Write-Host -ForegroundColor DarkRed "Argh, no Taxonomy entry for term $($TermValueToReplace)"
$ok=$false;
}
else
{
if ( $TC.count -gt 1)
{
Write-Host -ForegroundColor DarkRed "Argh, $($TC.count) Taxonomy entries for term $($TermValueToReplace)"
$ok=$false; #we can't be sure we got the right term!
}
$taxonomyFieldValue = new-object Microsoft.SharePoint.Taxonomy.TaxonomyFieldValue($taxonomyField)
$t1=$TC[0]
$taxonomyFieldValue.set_TermGuid($t1.get_Id())
#$taxonomyFieldValue.ToString()
#$targetTC.add($taxonomyFieldValue)
}
try
{
$taxonomyFieldValue.set_Label($TermValueToReplace)
$taxonomyField.SetFieldValue($Item,$taxonomyFieldValue)  
}
catch
{
Write-Host -ForegroundColor DarkRed "Argh, can't write Tax Field for $($Item.url)"
$ok=$false;
}
if ($ok)
{
$item.systemupdate()
write-host "Fixed term for item"
}
else
{
Write-Host -ForegroundColor DarkRed "Did not fix term for item"
}
$web.Dispose()  #release the SPWeb and SPSite we opened ourselves
$site.Dispose()

Each TaxonomyFieldValue has three important properties. In the raw field value they often appear as a delimited string of the form WssId;#Label|TermGuid:
Label : the text of the label the user selected, taken from the Labels property of the Term object
TermGuid : the Id (Guid) property of the Term (inherited from TaxonomyItem)
WssId : the reference back to the TaxonomyHiddenList; it is the ID of the list entry in the site collection
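
To make the three properties concrete, here is a minimal sketch that splits a raw single-value taxonomy string of the form WssId;#Label|TermGuid into its parts. The function name ConvertFrom-RawTaxonomyValue, the sample label and the sample GUID are made up for illustration and are not part of the SharePoint API. Multi-value fields are not handled, and returning -1 when the WssId prefix is missing is just this sketch's convention.

```powershell
# Hypothetical helper (not a SharePoint cmdlet): parse "WssId;#Label|TermGuid".
function ConvertFrom-RawTaxonomyValue {
    param([Parameter(Mandatory = $true)][string]$RawValue)

    # Optional "WssId;#" prefix, then the label, a pipe, and a 36-character GUID.
    $pattern = '^(?:(?<wssid>-?\d+);#)?(?<label>[^|]*)\|(?<guid>[0-9a-fA-F-]{36})$'
    if ($RawValue -match $pattern) {
        $wssId = -1   # sketch convention: -1 means no WssId was present in the string
        if ($Matches['wssid']) { $wssId = [int]$Matches['wssid'] }

        # New-Object PSObject keeps this usable on older PowerShell versions.
        New-Object PSObject -Property @{
            WssId    = $wssId
            Label    = $Matches['label']
            TermGuid = [guid]$Matches['guid']
        }
    }
    else {
        throw "Not a recognizable taxonomy value: $RawValue"
    }
}

# Made-up sample value, for illustration only.
$sample = '12;#Finance|11111111-2222-3333-4444-555555555555'
$parsed = ConvertFrom-RawTaxonomyValue -RawValue $sample
Write-Host "WssId=$($parsed.WssId) Label=$($parsed.Label) TermGuid=$($parsed.TermGuid)"
```

Comparing the parsed TermGuid with the Id of the term returned by the term set lookup is one way to spot an item whose stored reference has drifted.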
