SharePoint Document IDs

The SharePoint Document ID Service is a new feature of SharePoint 2010 that offers a number of useful capabilities, but carries some limitations.  Let’s dig a bit deeper and see what it does and how it works.

One challenge for SharePoint users is that links break easily.  Rename a file or folder, or move the document, and a previously saved or shared link will no longer work.  By tagging a document with an ID, SharePoint can start referencing documents by this ID, even when the structure beneath them changes.  SharePoint can accept a link with this ID by referencing a dedicated page on each site that takes care of finding the document.  This page is named DocIDRedir.aspx.  Here’s what a URL might look like:

http://<sitecollection>/<web>/_layouts/DocIdRedir.aspx?ID=XXXX

There’s also a Document ID web part that lets users enter a Document ID.  This is used most prominently in the Records Center site, which is based on an out-of-the-box site template.

The Document ID Service is enabled at the Site Collection level, and assigns Document IDs that are unique only within the site collection.  A configurable prefix is available; assigning a unique prefix to each Site Collection ensures uniqueness across your web application and even your farm.  If you have more than one farm, it makes sense to embed a farm indicator in the prefix to ensure uniqueness globally.
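If you prefer scripting the rollout, here’s a minimal PowerShell sketch.  The site URL is hypothetical, and the “*DocId*” filter is just a guess at the feature’s internal display name, which is why the sketch discovers it rather than hard-coding it; the prefix itself is then configured under Site Settings > Document ID settings.

# Minimal sketch, run in the SharePoint 2010 Management Shell.
$siteUrl = "http://portal.contoso.com/sites/records"   # hypothetical

# Find the site-collection-scoped Document ID feature (name filter is an assumption)
Get-SPFeature -Limit All |
    Where-Object { $_.DisplayName -like "*DocId*" -and $_.Scope -eq "Site" } |
    Format-Table DisplayName, Id, Scope

# Enable it using the DisplayName (or Id) returned above
Enable-SPFeature -Identity "<DisplayName or Id from above>" -Url $siteUrl

# The ID prefix is then set under Site Settings > Document ID settings.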

Setting Document ID

Once the Document ID Service is enabled, every new or edited document instantly gets a Document ID assigned.  However, documents uploaded before the service was enabled do not get an immediate Document ID; they are assigned by a Timer Job called the “Document ID assignment job” that exists at the Web Application level.  By default this job runs nightly.  This is one of two jobs associated with the Document ID Service; the other is the “Document ID enable/disable job”.

When the Document ID Service is enabled for a Site Collection, Event Receivers are automatically installed in each Document Library.  Actually, there is a set of Event Receivers installed for each and every Content Type configured within that document library.  The Event Receiver is called “Document ID Generator” and is configured to be fired synchronously.  There is a separate Event Receiver for each of the following events:

  • ItemAdded
  • ItemUpdated
  • ItemCheckedIn
  • ItemUncheckedOut

Once a Document ID is assigned, it is changeable through the Object Model, although do so at your own risk.  Before the Document ID Service is enabled, the Document ID field does not exist, so there is nothing to assign.  If you are migrating from a legacy system that has existing Document IDs, you can first migrate the documents, then enable the Document ID Service, which adds the internal Document ID field.  Then, before the daily Document ID assignment job runs (better yet, disable it during this process), you can programmatically take the legacy Document IDs and assign their values to the SharePoint field, as in the sketch below.  With the Document ID field populated, the Document ID Service will not overwrite the already-set Document IDs.
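Here is a hedged sketch of that assignment, using the internal _dlc_DocId field (listed in the component breakdown below).  The site URL, library name, and the Get-LegacyDocId helper are hypothetical placeholders for however your migration tracks the old IDs.

# Hedged sketch: stamp legacy Document IDs onto migrated documents.
# Run in the SharePoint 2010 Management Shell; URL, library, and lookup are hypothetical.
$web = Get-SPWeb "http://portal.contoso.com/sites/records"
$list = $web.Lists["Migrated Documents"]

foreach ($item in $list.Items)
{
    # However you map a migrated file back to its legacy ID goes here:
    $legacyId = Get-LegacyDocId -FileName $item.File.Name   # hypothetical helper

    if ($legacyId)
    {
        $item["_dlc_DocId"] = $legacyId
        $item.SystemUpdate($false)   # avoid creating a new version or changing Modified
        # The nightly assignment job skips items that already have a value here.
    }
}
$web.Dispose()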

Note that part of the Document ID Service is redirecting URLs that reference a Document ID.  It turns out that if you manually assign duplicate Document IDs (something that in theory should never occur), the daily Document ID assignment job detects the situation, and DocIDRedir.aspx redirects to a site-based search page that passes in the Document ID.

Under the covers there are three internal components to a Document ID:

  • _dlc_DocIdUrl: the fully qualified URL for the document, referencing DocIDRedir.aspx along with the lookup parameter
  • _dlc_DocId: the Document ID itself.  This is the internal property you can directly address and assign as $item[“_dlc_DocId”]
  • _dlc_DocIdItemGuid: the GUID associated with the Document ID

That completes our tour of the Document ID Service.  I look forward to hearing of others’ experience with it.

Clarifying the Security Scope limits in SharePoint lists

Historically, Microsoft has defined the limit along the more conservative definition: “A scope is the security boundary for a securable object and any of its children that do not have a separate security boundary defined.”  This implies that 10,000 uniquely permissioned documents shared amongst five people count as 10,000 security scopes, not the far smaller number of distinct ACL combinations possible among those five users.

https://technet.microsoft.com/en-us/library/cc262787(v=office.15).aspx

After the number of unique security scopes exceeds the value of the List Query Size Threshold, SharePoint uses a code path that requires additional SQL round trips to analyze the scopes before rendering a view.  So the impact is on viewing the list of documents (and providing the security-trimmed view); hence my suggestion, in such situations, to consider retaining or even reducing the SharePoint default view limit of 30 items per page.

To test this out, I did an extract from a Content DB that houses a Document Library I have handy with many custom permissions divided amongst eight users.  Probing the underlying data, it has 18 distinct ACLs among 7K+ “Scopes”, each with a unique ScopeURL and ScopeID (the internal Microsoft field names).  Even though many of the ACLs are identical (SharePoint uses an ACL ID that I can pivot on to group the Scopes), each folder, and often each document, has a unique scope, because I had broken inheritance thousands of times, even though the ACL assignment was only one of a couple dozen permutations.  Here’s the SQL you can use if you wish to probe:

SELECT [SiteId], [ScopeId], [RoleDefWebId], [WebId], [ScopeUrl], [Acl]   
FROM [your Content DB].[dbo].[Perms]   order by scopeurl

An even better query also exposes the web URL:

SELECT [fullurl],perms.[SiteId] , perms.[ScopeId], [RoleDefWebId], [WebId], [ScopeUrl], [Acl]   
FROM [dbo].[Perms],[dbo].allwebs   
where perms.WebId = allwebs.id
order by scopeurl

As an aside, you can see the number of security principals per scope, using this SQL:

select COUNT(ra.PrincipalId) as [Count],p.ScopeUrl from RoleAssignment ra with(nolock) 
join Perms p with(nolock) 
on p.SiteId = ra.SiteId and p.ScopeId = ra.ScopeId 
group by p.ScopeUrl 
order by p.ScopeUrl desc
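If you’d rather not query the content database directly, a rough equivalent from the server object model is to count the items and folders that break permission inheritance, since each of those contributes a unique scope.  A hedged sketch follows; the site and library names are hypothetical, and note that it touches every item, so run it off-hours against large libraries.

# Hedged sketch: count items/folders in a library with unique role assignments.
$web = Get-SPWeb "http://portal.contoso.com/sites/finance"
$list = $web.Lists["Shared Documents"]

$uniqueScopes = 0
foreach ($item in @($list.Items) + @($list.Folders))
{
    if ($item.HasUniqueRoleAssignments) { $uniqueScopes++ }
}
"$($list.Title): $uniqueScopes items/folders with unique permissions"
$web.Dispose()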

So that closes the issue.  The limit is not about the number of unique ACL combinations, but the number of times permission inheritance has been broken from a parent within the Document Library.  So the guidelines might be:

  • Use folders whenever possible to encapsulate security
  • Break up huge Document Libraries along security boundaries
  • Never go past 50,000 unique permissions per list
  • Don’t go beyond 5,000 unique permissions per list, or you will suffer a severe performance impact
  • Don’t breach 1,000 unique permissions per list, or you will see a performance impact of 20% or more
  • Never, ever feed a SharePoint Administrator caffeine after midnight…lol

Microsoft writes:
https://technet.microsoft.com/en-us/library/cc262787(v=office.15).aspx
The maximum number of unique security scopes set for a list should not exceed 1,000.
A scope is the security boundary for a securable object and any of its children that do not have a separate security boundary defined. A scope contains an Access Control List (ACL), but unlike NTFS ACLs, a scope can include security principals that are specific to SharePoint Server. The members of an ACL for a scope can include Windows users, user accounts other than Windows users (such as forms-based accounts), Active Directory groups, or SharePoint groups.

Microsoft writes:
https://support.microsoft.com/en-us/kb/2420771

SharePoint 2010:
When a greater number of unique security scopes than the value of the List Query Size Threshold (default is 5,000) set to the web application are created for folders or documents in a list, there’s a significant performance degradation in SharePoint operations that badly affects end users operations(rendering content) as well as SharePoint activities like indexing SharePoint Content. After the number of unique security scopes exceeds the value of the List Query Size Threshold, SharePoint uses a code path that requires additional SQL round trips to analyze the scopes before rendering a view.

Joining a Server to the Farm

Joining a server to the farm, how hard could that be?  Heck, it’s just a command…let’s have a look at how to do it, and what to do if things don’t go smoothly.

First, let’s get the passphrase into a secure string.  This is the object that is required for all things password-related.

$plaintextphrase="your password"

Beware of the need to escape special characters.  For example a password with “$” has to be represented as `$

$securepw = convertto-securestring -string $plaintextphrase -asplaintext -force
Connect-SPConfigurationDatabase -DatabaseServer "your DB server" -DatabaseName "SharePoint_2010_Farm" -Passphrase $securepw

If you have difficulty recalling your Passphrase, no worries!  Just set a new one:

Set-SPPassPhrase -PassPhrase $securepw -Confirm

If you are lucky, the new server will join the farm.  Now here’s where the pain can start.  I recently encountered an error stating that the joining server and the farm did not have identical products installed.  After an exhaustive comparison between servers, I found that the installed product inventory is stored in the registry on each server.

To make sure that exactly the same “Products” are installed (as SharePoint sees it), first check the registry on an existing farm server and on the server to be added. The SKUs in the following registry key need to match 100%: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\14.0\WSS\InstalledProducts (for example, you will find ‘926E4E17-087B-47D1-8BD7-91A394BC6196’ for Office Web Apps).

Next, get a list of which products SharePoint is actually aware of being installed. On a server that is already in the farm, open a SharePoint Management Shell, then run:

Get-SPFarm | select Products

This returns an array of the GUIDs of the product SKUs installed in SharePoint.   This allows you to compare what the candidate server has in its registry to what the farm believes is running.  If the registry list does not match, run the following:

Set-SPFarmConfig -installedproductsrefresh

This will “synchronize” the registry with the list of products in the farm.  Then re-run

Get-SPFarm | select Products

If all your installed products are now shown, you should be able to join your server to the farm.
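To make the registry-versus-farm comparison less error-prone, here is a hedged sketch. Run the registry half on the candidate server and the Get-SPFarm half on a server already in the farm, then diff the two lists. Whether the SKU GUIDs show up as value names or value data seems to vary, so the sketch dumps both.

# Hedged sketch: compare the candidate server's registry inventory with the farm's product list.
# On the candidate server:
$regPath = 'HKLM:\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\14.0\WSS\InstalledProducts'
$key = Get-Item $regPath
$localSkus = (@($key.GetValueNames()) + @($key.GetValueNames() | ForEach-Object { $key.GetValue($_) })) |
    ForEach-Object { "$_".Trim('{}').ToUpper() } | Sort-Object -Unique

# On a server already in the farm (SharePoint Management Shell):
$farmSkus = (Get-SPFarm).Products | ForEach-Object { "$_".ToUpper() } | Sort-Object -Unique

# Any output from the comparison means the two inventories differ.
Compare-Object -ReferenceObject $farmSkus -DifferenceObject $localSkus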

Within the Farm database, the list of products can be viewed:

SELECT [Product], [Version], [PatchableUnit], [PatchableUnitName]
FROM [SharePoint_Config].[dbo].[ServerVersionInformation]

Redirect to Friendly URLs in SharePoint

It’s hard to pull the plug on an old URL host name in favor of a new one. Sure, AAM (Alternate Access Mappings) allows for the definition of up to five hostnames (one per zone) on the same Web Application, but there are wrinkles to having alternate hostnames floating about.

For just one example, Contextual Search (searching within a list or site) will only work if your search crawl targets the default zone and the user is browsing the default zone; otherwise no search results are returned.  Sometimes the goal is to retire a hostname, and perhaps we want a transition period during which users are redirected to the new URL.

You can have URLs redirected to your preferred host very simply using Microsoft’s URL Rewrite 2.0 add-on for IIS, a free download from Microsoft; just be sure to install the x64 version.

The preferred form of redirect in this case is a 301 (Moved Permanently).  It is respected by search engines and gives browsers the opportunity to correct bookmarks.   Per RFC 2616, clients should forget the redirecting address and save the address pointed to by the redirection as the preferred address.

Once you’ve installed URL Rewrite 2.0, go into IIS Manager (no reboot required) and find the target site.

Open URL Rewrite with a double-click (or via right-click properties) and add a rule.  I suggest a canonical domain name rule: simply choose the preferred URL to retain, and all other ways in get redirected to that URL.

You can view the rule that’s created. It’s simply a regular expression match on any host name that differs from your preferred URL, with a redirect to your preferred URL.

An IISReset is always advisable.  Happy Redirecting!

Records Management with SharePoint – Information Architecture: part 2

There is a good deal of groundwork required to fully implement Records Management in SharePoint.  The foundation is the overall Information Architecture.  SharePoint 2010 provides a range of capabilities and is very flexible.  With this flexibility comes a set of decisions, some of which affect the manageability, extensibility, and usability of SharePoint, so we want to plan carefully.  Below are the primary facets of a SharePoint Information Architecture:

  • Hierarchy
    This includes Web Applications, the breakdown of Site Collections, the Site Hierarchy, and associated Document Libraries.  Separate Site Collections that ride along managed paths allow a logical and granular division between content databases, allowing near endless scalability.
  • Navigation
    A good portion of navigation flows out of the decisions on Hierarchy combined with selection and standardization of navigation elements including tables of contents, left hand navigation, horizontal top level global navigation, breadcrumbs, and optionally additional techniques such as MegaMenus.  Best practice dictates security trimmed navigation, so users are only presented with navigation elements to which they have some level of access.
  • Security
    Best practice guides to the use of permissions inheritance wherever possible.  This will make administration as easy as possible.  If security is granted broadly at the top, and more restrictive as one descends the hierarchy, the user will have the best possible experience. This is because subsites will be reachable naturally via navigation, reducing the incidence of pockets and islands that can only be reached via manual bookmarks and links.  Leveraging AD and/or SharePoint groups further minimizes security overhead.
  • Metadata
    This is the heart of the Information Architecture, and the primary focus of this article.

Metadata can be assigned to individual documents and configured within individual document libraries; however, for a true enterprise-class Information Architecture, it needs to be viewed holistically from the top down.  To achieve this, the following should be viewed as best practices:

  • Leverage Content Types
    Content Types are the glue that connects data across the enterprise.  They encapsulate the metadata, the document template, workflow, and the policies that apply to documents.  A single centrally managed content type can control documents in libraries within countless sites.
  • Content Syndication Hub
    Before SharePoint 2010, Content Types were bounded by the Site Collection.   This was a significant obstacle to scalability and consistency across the enterprise.  The Content Syndication Hub changes all that.  From a single location, all Content Types can be defined and published across the farm, including their information policies, metadata, and document templates.
  • Content Type inheritance
    All Content Types must inherit from built-in SharePoint Content Types.  However, by structuring your content types to inherit in a logical and hierarchical fashion, management and evolution of your Information Architecture can be an elegant and simple affair.   An example could be a Corporation Content Type, with sub-companies inheriting from it, then divisions, departments, and finally use-oriented content types.  Imagine needing to add a new field (or site column) across an entire company: adding it high in your hierarchy will propagate to all subordinate content types (see the sketch after this list).
  • Build out enterprise taxonomies
    For the Information Architecture to be relevant and useful, it needs to map to the organization from a functional perspective.  The naming of data in an organization, as well as its hierarchy and relationships, needs definition so that users can tag, search, and utilize the documents and information in the farm.  The larger the organization, the harder this is to achieve.
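To make the inheritance point concrete, here is a hedged PowerShell sketch; the hub URL, content type name, and column name are hypothetical.  It adds a new site column to a parent content type and pushes the change to everything that inherits from it.

# Hedged sketch: add a site column high in the content type hierarchy and push it down.
# Run in the SharePoint 2010 Management Shell; URL and names are hypothetical.
$web = Get-SPWeb "http://contenthub.contoso.com"

# Create the site column (note the naming convention discussed below)
$fieldXml = '<Field Type="DateTime" Name="CompletionDatePMOTask" DisplayName="Completion Date (PMO Task)" Format="DateOnly" />'
$internalName = $web.Fields.AddFieldAsXml($fieldXml)

# Link it to the parent content type and push to all derived content types
$ct = $web.ContentTypes["Corporation Document"]
$field = $web.Fields.GetFieldByInternalName($internalName)
$ct.FieldLinks.Add((New-Object Microsoft.SharePoint.SPFieldLink($field)))
$ct.Update($true)   # $true = update all content types that inherit from this one

# Republish the content type from the hub so subscribing site collections pick up the change.
$web.Dispose()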

One challenge is managing all the Content Types and Site Columns.   This is because, on publishing, Site Collections actually identify these by name instead of by GUID (Globally Unique Identifier).  If a Site Column or Content Type with the same name is already defined locally in a Site Collection, the name collision will prevent those items from propagating into that Site Collection.  The challenge is magnified because the Content Syndication Hub publishes the Content Types and Site Columns to all subscribing Site Collections.  So even if your Site Collection only needs a few, it’s an all-or-nothing affair.

Given we are limited to planning our way around naming conflicts, my recommendation is to add identifying information to the trailing end of Site Column and Content Type names, especially when defining a generic content type such as “Reference Document” or a Site Column such as “Completion Date”.  Add qualifying text in a consistent manner, such as “Reference Document (AR)” (for Accounts Receivable) or “Completion Date (PMO Task)”.  The reason to add the text at the end is that in many situations the end of the text is cut off in the user interface.  While hovering over the text (such as in a grid column) often shows the full name, it is best to make the title easily identifiable from a user perspective.

The real challenge in setting up the Information Architecture is not the technical configuration.  That’s a walk in the park.  The hard part, especially in large organizations, is gathering the experts to define the taxonomies and make the appropriate decisions.   If you have an existing farm that has grown organically and has not taken advantage of content types or the syndication hub, it is actually possible to wrestle it from chaos to order, but it’s not a cakewalk.  I have created a range of scripts and techniques for publishing the components of the new Information Architecture, and reassigning documents and metadata to it, resulting in a structured farm that works within the defined Information Architecture framework.

Records Management with SharePoint – part 1

SharePoint 2010 has some great capabilities for implementing a true Records Management policy.  Let’s explore both the capabilities as well as limitations, and how to extend SharePoint to create a true enterprise-class Records Management system.

The overarching goal in Records Management is to ensure Records are handled in a manner that is consistent with the organization’s Records policy.  So first a policy must be defined across the enterprise, and then all systems including SharePoint must manage documents in a policy-compliant, automated, user-friendly and auditable fashion.  Strategically we want to:

  • Limit demands on the user
    Simplify metadata tagging, and hide records jargon from end-users
  • Policy based disposition
    Automate disposition to eliminate the dependency on end-users to take action on each document.
  • Enhanced reporting
    Enable users to self-serve when exploring document expiration and disposition.

First let’s clarify what Records are.  Not all documents are Records.  SharePoint offers a range of capabilities in support of defining and managing records:

  • Records can be managed centrally or in-place
    Central management, by sending documents to a “Records Center”, offers the ultimate in centralized control, search, storage, security, and administration.  However, there is a real impact on end-users when Records are moved from their usual home.  SharePoint also offers “In-Place Records Management”, which is the direction the industry seems to be heading (see the sketch after this list).
  • Records can be blocked from deletion
    End users can be prevented from deleting a Record, which makes sense, as Records Management by definition provides policy for treating Records.
  • Records can be stripped of versions
    This reduces the number of versions of a Record that are stored.
  • Records can be made read-only
    This can be used to lock down a record so it does not change.
  • Records can and likely do have their own expiration and disposition rules
    SharePoint allows a different policy to be applied to a document if it is a record.
  • Records are quickly identified
    Searching, sorting, filtering are available to identify records.  Documents that are records are also easily identified by a special record mark on the document icon.
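As a concrete taste of in-place records, here is a hedged sketch that declares a single document as a record from PowerShell.  It assumes in-place records management is active on the site; the DLL path, URL, and file name are assumptions for illustration only.

# Hedged sketch: declare one document as an in-place record and verify it.
# Paths and URLs are hypothetical; run in the SharePoint 2010 Management Shell.
Add-Type -Path "C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\ISAPI\Microsoft.Office.Policy.dll"

$web = Get-SPWeb "http://portal.contoso.com/sites/finance"
$item = $web.GetFile("Shared Documents/Contract.docx").Item

[Microsoft.Office.RecordsManagement.RecordsRepository.Records]::DeclareItemAsRecord($item)
[Microsoft.Office.RecordsManagement.RecordsRepository.Records]::IsRecord($item)   # should return True

$web.Dispose()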

Below are the pieces of the puzzle, each of which I will devote an article to addressing:

  1. Define your information architecture
  2. Creating a centrally managed set of Content Types with Information Policies
  3. Wrestling an unstructured set of sites, libraries and documents into a centrally managed information architecture
  4. Document Disposition in SharePoint
  5. Customizing the Expiration Date calculation
  6. Reporting on pending document disposition in SharePoint
  7. Review and approval of document disposition
  8. Control the timing of document disposition

I’m going to delve into how to accomplish the above to define and put in place a Records Management policy and system across an enterprise.

Using the SP2013 Client Object Model in InfoPath managed code

Someone asked me today about using the SharePoint 2013 Client Object Model in managed code.  I’ve done this in VSTA (Visual Studio Tools for Applications) in C#.  It took a bit of experimentation to get it going; here’s how I did it:

I extracted the Client Object Model runtime DLLs from the SharePoint server.  That’s both Microsoft.SharePoint.Client.dll and Microsoft.SharePoint.Client.runtime.dll.  For good measure I also took Microsoft.SharePoint.Client.Runtime.dll.deploy

In my C# project (VSTA, as used for managed C# in InfoPath), I:

  1. Added three project references:
    1. System.Core
    2. Microsoft.SharePoint.Client
    3. Microsoft.SharePoint.Client.Runtime
  2. Added a using directive for Microsoft.SharePoint.Client

Opened AssemblyInfo.cs to deal with partial trust exception, and added:

using System.Security; //Added to try to resolve the partial trust error: JP
[assembly: AllowPartiallyTrustedCallers]   //Added to try to resolve the partial trust error: JP

I had to change the form from “Automatically determine security level” to “Full Trust”.  This is an InfoPath-specific setting that may affect how this runs as a Farm Solution, and whether it needs a Farm Administrator’s approval.

At that point, you can use the Client Object Model.  Getting data is done two different ways:

  • CAML: my native old-style preferred approach for simple queries
  • LINQ: for SQL-like queries; the one catch is you need to use SPMetal to create a runtime thunk of sorts specific to your target sites/lists
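The same CSOM calls work from any .NET language; here is a minimal CAML query sketched in PowerShell for brevity, which maps one-to-one to the C# you would write in VSTA.  The DLL paths, site URL, and list name are hypothetical.

# Minimal CSOM + CAML sketch; the object model calls translate directly to C#.
Add-Type -Path "C:\CSOM\Microsoft.SharePoint.Client.dll"
Add-Type -Path "C:\CSOM\Microsoft.SharePoint.Client.Runtime.dll"

$ctx = New-Object Microsoft.SharePoint.Client.ClientContext("http://portal.contoso.com/sites/finance")
$list = $ctx.Web.Lists.GetByTitle("Shared Documents")

$query = New-Object Microsoft.SharePoint.Client.CamlQuery
$query.ViewXml = "<View><Query><OrderBy><FieldRef Name='Modified' Ascending='FALSE'/></OrderBy></Query><RowLimit>10</RowLimit></View>"

$items = $list.GetItems($query)
$ctx.Load($items)          # very old PowerShell versions may need a generic-method workaround here
$ctx.ExecuteQuery()

$items | ForEach-Object { $_["FileLeafRef"] }   # file names of the ten most recently modified documents
$ctx.Dispose()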

Hope that helps!

Book Review: SharePoint 2010 Development with Visual Studio 2010

Cookbook is a good term for this book.  It is an excellent overview of SharePoint programming, focused on the core programming tasks and deliverables a typical programmer faces.  This means it’s a handy shelf reference when faced with an assignment.

I particularly enjoyed the common approach of “How it works” and “There’s more” to continue exploration.  This is a well structured and approachable programming reference book I would recommend.

http://www.amazon.com/gp/product/1849684588/ref=cm_cr_mts_prod_img

RBS (Remote Blob Storage) part 3

A configured and running Remote Blob Storage (RBS) is a ticking time bomb until you’ve configured the Maintainer to run. That’s because RBS is designed to leave a trail of unused objects in its wake. RBS counts on a periodic process to run to eliminate all these unused objects. If you save a file in SharePoint a dozen times, even with versioning disabled, each save will leave a blob object behind, largely for performance reasons.

Setting up the Maintainer is not easy, and the difficulty is compounded by skimpy and inconsistent documentation.

Microsoft’s documentation indicates that encryption is not required if a trusted connection is used. However, I found encryption was required; the Maintainer complained when the connection string was unencrypted.

Overview

To configure the Maintainer, the following steps must be carefully done:
1. Decrypt Connection string, if needed
2. Define each connection string in the config file; all connections are defined in this one file
3. Encrypt the connection strings within the config file
4. Run the Maintainer for each connection, referencing each connection by name

Let’s go through these step by step. First, establish the locations of some key components. I like to put shortcuts to each on the desktop that open in a CMD window; if you do this, you’ll thank me later. For me the locations were:

Maintainer location:
C:\Program Files\Microsoft SQL Remote Blob Storage 10.50\Maintainer

.NET framework location:
C:\Windows\Microsoft.NET\Framework\v2.0.50727

aspnet_regiis.exe is the utility to encrypt and decrypt connection strings. It only works against files named “web.config”, so we will need to do a fair amount of file renaming along the way.

Command to encrypt the connection string, run from the Maintainer directory, then rename the file back:

C:\Windows\Microsoft.NET\Framework\v2.0.50727\aspnet_regiis.exe -pef connectionStrings . -prov DataProtectionConfigurationProvider
RENAME web.config Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.config

Alternatively, you can run the command to encrypt the connection string from the .NET directory:

aspnet_regiis.exe -pef connectionStrings "C:\Program Files\Microsoft SQL Remote Blob Storage 10.50\Maintainer" -prov DataProtectionConfigurationProvider
RENAME "C:\Program Files\Microsoft SQL Remote Blob Storage 10.50\Maintainer\web.config" "C:\Program Files\Microsoft SQL Remote Blob Storage 10.50\Maintainer\Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.config"

To decrypt from the Maintainer directory:

C:\Windows\Microsoft.NET\Framework\v2.0.50727\aspnet_regiis.exe -pdf connectionStrings .

This is the command to start the Maintainer, referencing it via the %programfiles% environment variable (or, from the Maintainer directory, by name alone):

"%programfiles%\Microsoft SQL Remote Blob Storage 10.50\Maintainer\Microsoft.Data.SqlRemoteBlobs.Maintainer.exe" -ConnectionStringName RBSMaintainerConnection -Operation GarbageCollection ConsistencyCheck ConsistencyCheckForStores -GarbageCollectionPhases rdo -ConsistencyCheckMode r -TimeLimit 120
Microsoft.Data.SqlRemoteBlobs.Maintainer.exe -ConnectionStringName RBSMaintainerConnection -Operation GarbageCollection ConsistencyCheck ConsistencyCheckForStores -GarbageCollectionPhases rdo -ConsistencyCheckMode r -TimeLimit 120

You will want to decrypt and re-encrypt multiple times until the connection string is right. Some suggestions:

  1. run from .NET directory
  2. rename the maintainer config file first to web.config
    REN Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.config web.config
  3. If the web.config file has an empty connectionstring, then the filename/directory was incorrect
  4. Review and tweak the connection string
  5. after encryption, rename back:
    REN web.config Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.config

Decrypt to examine (the “d” in -pdf is for “decrypt”): aspnet_regiis.exe -pdf connectionStrings "%programfiles%\Microsoft SQL Remote Blob Storage 10.50\Maintainer"

Encryption of connection string within Maintainer config file; run from .NET directory (the “e” in -pef is “Encrypt”):

aspnet_regiis -pef connectionStrings "%programfiles%\Microsoft SQL Remote Blob Storage 10.50\Maintainer" -prov DataProtectionConfigurationProvider

This is the command to start the Maintainer:

Microsoft.Data.SqlRemoteBlobs.Maintainer.exe -ConnectionStringName RBSMaintainerConnection -Operation GarbageCollection ConsistencyCheck  ConsistencyCheckForStores -GarbageCollectionPhases rdo -ConsistencyCheckMode r -TimeLimit 120

Note that the configuration file used by the Maintainer is called “Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.config”.

With the connection string unencrypted, a few things to note:

  1. “Application Name=&quot;…&quot;” must keep the &quot; entities around the name. This is essential: changing &quot; to an actual quote causes the request to fail.

The connection string must be encrypted for the Maintainer to work. After encryption, the connectionStrings section contains EncryptedData and CipherData elements holding the cipher text in place of the plain-text connection string.

Specific steps

The connection strings for the Maintainer are specified in the Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.config file, which has to live alongside the executable. The connection strings are either all encrypted or all unencrypted. The default connection string that RBS creates is always encrypted, and it’s good practice to encrypt the connection strings.

At first glance, the documentation is not clear on where Maintainer should be run. My findings indicate it should be configured on a single WFE (Web Front End) SharePoint server.

The documentation does not mention it, but I found a Name= attribute needs to be added for each connection string, and it needs to be referenced when running the command. On the command line, for each database the Maintainer is run against, the ConnectionStringName parameter needs to be set to the corresponding Name= in the XML. This matching reference is what connects the command-line Maintainer call to the specific database connection. Depending on how RBS was installed, the default config file may or may not have this attribute, so take care to check.

Logging to the screen is not too helpful, as log data scrolls out of the buffer. Enabling file logging in the config file is recommended.

In order to encrypt and decrypt, the source (starting) file needs to be web.config, so the file needs to be repeatedly renamed between web.config and Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.config

For each content database that is RBS enabled, a connection string needs to be added to the Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.config file. It needs to be named using Add Name=, and then referenced at runtime using the ConnectionStringName parameter. So, for additional databases, simply add an additional connection string for each RBS-enabled content database; the sketch below is a quick way to see which databases those are.
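A hedged way to enumerate that from PowerShell follows; the web application URL is hypothetical.

# Hedged sketch: list each content database in a web application and whether RBS is enabled,
# so you know which connection strings the Maintainer config needs.
Get-SPContentDatabase -WebApplication "http://portal.contoso.com" | ForEach-Object {
    $rbs = $_.RemoteBlobStorageSettings
    $installed = $rbs.Installed()
    New-Object PSObject -Property @{
        Database       = $_.Name
        RbsInstalled   = $installed
        RbsEnabled     = $(if ($installed) { $rbs.Enabled } else { $false })
        ActiveProvider = $(if ($installed) { $rbs.ActiveProviderName } else { "" })
    }
} | Format-Table Database, RbsInstalled, RbsEnabled, ActiveProvider -AutoSize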


Encrypt the web.config file again by using the following command:

cd /d %windir%\Microsoft.NET\Framework64\v2.0.50727

aspnet_regiis -pef connectionStrings "%programfiles%\Microsoft SQL Remote Blob Storage 10.50\Maintainer" -prov DataProtectionConfigurationProvider

Rename the file back to its original name:

cd /d "%programfiles%\Microsoft SQL Remote Blob Storage 10.50\Maintainer"

ren web.config Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.config

Note: the XML file is case sensitive; you need to use the exact string ‘connectionStrings’ in the command above.

Run the Maintainer manually the first time. Thereafter, you’ll want to schedule it to run:

  1. Create a maintenance task using the following steps (repeat for each database).
  2. Click Start, point to Administrative Tools, and click Task Scheduler.
  3. Right-click Task Scheduler (Local) and click Create Task.
  4. Click the Actions tab and click New.
  5. On the New Action page, specify:
i. Action as Start a Program.
ii. For the Program/script, click Browse and navigate to the RBS Maintainer application; by default, the location is %programfiles%\Microsoft SQL Remote Blob Storage 10.50\Maintainer\Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.
iii. In the Add Arguments (optional) field, enter the following parameter string (change the name of the connection string to the one specified in the config file earlier):
-ConnectionStringName RBSMaintainerConnection -Operation GarbageCollection ConsistencyCheck ConsistencyCheckForStores -GarbageCollectionPhases rdo -ConsistencyCheckMode r -TimeLimit 120
iv. Click OK.

Note: the XML file is case sensitive; you need to use the exact connection string name above.

  6. On the Triggers tab, click New.
  7. In the New Trigger dialog box, set:
i. Begin the task to On a schedule.
ii. The trigger schedule to Weekly, Sunday, at 2 AM (or another time when system usage is low).
iii. Click OK.
  8. On the General tab, enter a name for the task, such as “<Content DB name> RBS Maintainer”, where <Content DB name> identifies the database associated with the task. In the Security options section:
i. Make sure that the account under which the task is to be run has sufficient permissions to the database.
ii. Select the option to Run whether user is logged on or not.
iii. Click OK.

Tuning the internal Maintainer parameters

There are several internal parameters that control how frequently the Maintainer can be run, as well as how long deleted entries should be retained before being truly removed. This latter option is meant to save DBAs from trouble if they restore an older Content DB without restoring the FILESTREAM. If deletes are very “lazy” (i.e., delayed by days), rolling back to a previous DB without a FILESTREAM restore could work. I set the parameters more aggressively, knowing I won’t fall prey to this issue. Here’s the SQL to apply the changes. Note it’s better to use the stored procedures than to set the values directly:

USE [Content_database_yourname]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
exec mssqlrbs.rbs_sp_set_config_value delete_scan_period, 'days 1'
exec mssqlrbs.rbs_sp_set_config_value orphan_scan_period, 'days 1'
exec mssqlrbs.rbs_sp_set_config_value garbage_collection_time_window, 'days 1'
GO

If you want to get aggressive on storage recovery, use a smaller timeframe:

exec mssqlrbs.rbs_sp_set_config_value delete_scan_period, 'time 00:00:10'
exec mssqlrbs.rbs_sp_set_config_value orphan_scan_period, 'time 00:00:10'
exec mssqlrbs.rbs_sp_set_config_value garbage_collection_time_window, 'time 00:00:10'

Tips

  • Back up all configuration files first
  • Use Visual Studio to edit the XML files
  • XML is largely case sensitive, so take care
  • -pdf is for decryption, -pef is for encryption