Rationalizing SharePoint Storage

SharePoint Storage Management

SharePoint storage tends to grow organically, and in an uneven fashion. Periodically it makes sense to take stock of how Site Collections are distributed amongst Content DBs. The goal should ideally be to keep Content DBs at 50GB or below where possible. When a Site Collection grows to 100GB or more, steps should be taken to manage the growth, as large Content DB performance can degrade, and backup/restore can become lengthy.

Here’s a one-line script that outputs how site collections are distributed among Content DBs, and the size of the Content DB. The results can be pasted into Excel. If needed Excel can separate Text to Columns, allowing you to to pivot larger data sets:

Get-SPContentDatabase | get-spsite -Limit all | % {write-host "$($_.rootweb.url)|$($_.rootweb.title)|$($_.contentdatabase.name)|$($_.ContentDatabase.DiskSizeRequired)"}

System characteristics of a well performing SharePoint Farm

Characteristics of a well performing SharePoint Farm

Tempdb data files

  • Dedicated disks for the tempdb
  • Tempdb should be placed on RAID 10
  • The number of tempdb files should be equal the number of CPU Cores, and the tempdb data files should be set at an equal size
  •  The size of the tempdb data files should be equal to the (Physical memory no. of processor). That’s the initial size of the tempdb data files.

DB management

  •  Enough headroom must be available for the databases and log files, plus enough capacity to keep up the requests
  • Pre grow all databases and logs if you can. Be sure to monitor the sizes so that you do not run out of disk space
  • When using SQL Server mirroring, do not store more than 50 databases on a single physical instance of SQL Server
  • Database servers should not be overloaded by using too many databases or data
  • The content databases must be limited to 200GB. Limit content databases to 200GB Indices must be Defragmented and rebuilt daily, if users can absorb the downtime required to rebuild
  • More than 25% disk space must be free.
  • Another instance of SQL must be created for data exceeding 5 TB.
  • When using SQL Server mirroring, more than 50 databases must not be stored on a single physical instance of SQL Server

Monitor the database server

Key performance counters to monitor are:

  • Network Wait Queue: at 0 or 1 for good performance
  • Average disk queue Length (latency): less than 5 ms
  • Memory used: less than 70%
  • Free disk space: more than 25%

SharePoint Monitoring Metrics

SharePoint Monitoring Metrics

I pulled together some useful metrics for monitoring SharePoint environments:

Storage

  1. Server drive(s) disk space availability (15% free)
  2. Content DB size, and monthly growth rate
  3. Site Collection size and monthly growth rate (same as above, for where we have one site collection per content DB)
  4. Storage by file type (are people uploading home pictures or MP3s)
  5. MySites site size

Security

  1. Unique permission scopes per library

Usage

  1. Items per library, and monthly growth rate
  2. Most active site / least active site
  3. Libraries/Sites with no activity in 90 days
  4. Documents with over 200 Versions (considering version trimming of middle 75% of versions, if > 100 versions of a document)
  5. Duplicate Documents
  6. Users who do not create documents in 30 days (perhaps they are storing locally or on other media)
  7. Folders with over 2000 items in the root
  8. Any newly created library or site

Search

  1. Search terms resulting in no-hits
  2. Very frequent searches

System

–          Reboots

–          IIS Pool recycle frequency

–          Daily report of Unexpected events in ULS log (could be large, and not actionable)

–          RAM usage > 95%

–          Full Crawl exceeding 20 hours

–          Incremental crawl not running hourly

–          Failed Timer jobs

–          CPU Usage > 90% average over any five minute period

–          Disk queue length on SharePoint server

–          Web Page load time (can have a program pull a page every minute)

Configuring Blob Cache correctly

Blob Cache

For snappy SharePoint performance, one great option to enable is Blob Caching. Make sure to first back up the web-config for your target web app, which you can navigate to via IIS.

I prefer to increase the max-age=”600″ to a full day max-age=”86400″. This parameter forces remote browser caching which really helps with remote users over relatively poor connections.

Set an appropriate location; I prefer to segregate by web app name. Microsoft suggests a 10GB maxSize, with 20% extra space. You can lower it, it rarely increases to that level.

<BlobCache location="D:BlobCachewebappname" path=".(gif|jpg|jpeg|jpe|jfif|bmp|dib|tif|tiff|ico|png|wdp|hdp|css|js|asf|avi|flv|m4v|mov|mp3|mp4|mpeg|mpg|rm|rmvb|wma|wmv)$" maxSize="7" max-age="86400" enabled="true" />

Best is configuring this on a dedicated drive. Unless you have an enterprise-class SAN, dedicated spindles are the way to go.

For every cache, you’ll want to know how to flush it:

$webApp = Get-SPWebApplication "http ://sharepoint"
[Microsoft.SharePoint.Publishing.PublishingCache]::FlushBlobCache($webApp)
Write-Host "Flushed the BLOB cache for:" $webApp

It’s totally safe to flush the cache at any time without disruption, and best of all, wastes no water 🙂

Output Site Collection and storage information

Reporting on Site Collection and storage information

Wouldn’t it be nice to have a single report of Site Collections, the Content Database each is in, and the size in GB? Well, let’s do it!

First let’s grab the Web Application:

get-spwebapplication http ://SharePoint 
Then we grab the Site Collections (all of them):
Get-SPSite -Limit all 

Now, let’s select the information we want to see in the report:

select url,contentdatabase

Let’s calculate the size in Gigabytes:

select url,contentdatabase,@{label="Size in GB";Expression={$_.usage.storage/1GB}} 

Now, let’s output the report to a CSV file, making it easy to read in Excel:

convertto-csv | set-content "L:PowerShellDBsize.csv"

Now let’s put it all together in a single command, piping the commands to produce the report:

get-spwebapplication http ://SharePoint | Get-SPSite -Limit all | select url,contentdatabase,@{label="Size in GB";Expression={$_.usage.storage/1GB}} | convertto-csv | set-content "L:PowerShellDBsize.csv"

Sync DB grows indefinitely

Sync DB is a database that is a part of the User Profile Service related to Active Directory synchronization.   Unfortunately it tends to grow in an unconstrained fashion.  SharePoint has no built-in cleanup mechanism.  The table that grows without bounds is “InstanceData”.  The followed Stored Procedure should be first run in waves, so as not to overload the transaction logs or tempDB.  Running on at least a monthly basis is recommended.

The following is a report that shows monthly growth in rowcount:

SELECT COUNT ([lastUpdated]) as countOfInstancedata, month ([created]) as month, YEAR([created]) as year
       FROM [Sync DB].[dbo].[InstanceData]
  group by month ([created]), YEAR ([created]) order by YEAR ([created]), month ([created])

Here’s the stored procedure:

USE [Sync DB]
GO
CREATE SCHEMA FIM
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [fim].[TruncateInstanceData]
AS
BEGIN
SET NOCOUNT ON;
DECLARE @truncationTime datetime;
SET @truncationTime = DATEADD(day, -1, GETUTCDATE());
DELETE FROM [dbo].[InstanceData]
WHERE ([created] < @truncationTime)
END

Alternatively, I’ve been running this loop to gradually delete rows.  You can adjust the loop number, delete size (rowcount) and delay to taste:

EXEC sp_spaceused N'dbo.Instancedata';
 -- Declare local variables
DECLARE @NumberOfLoops AS int;
SET @NumberOfLoops = 500;
DECLARE @CurrentLoop AS int;
SET @CurrentLoop = 0
WHILE @CurrentLoop < @NumberOfLoops BEGIN
set rowcount 10000
delete from [Sync DB].[dbo].[InstanceData] 
where [Sync DB].[dbo].[InstanceData].created <CONVERT(DATETIME,'2012-02-01 00:00:00', 102)
WAITFOR DELAY '00:00:01:00';
SET @CurrentLoop = @CurrentLoop + 1;END
-- Check space used by table after we are done
EXEC sp_spaceused N'dbo.BigLoggingTable';

A more generic WHERE clause I’ve successfully used is:
WHERE [Sync DB1].[dbo].[InstanceData].created <= DATEADD(day, -60, GETUTCDATE())

SharePoint Farm Trusted Certificate

We all want our SharePoint farms to be blindingly fast.  Here’s an easy step you can take to improve performance, by reducing the traffic from your SharePoint servers that check on farm certificate validity.  Even though you may be using Classic NTLM as authentication for your Web Applications, inside SharePoint the components rely on Claims Authentication.

First, let’s export the Farm Certificate to C: with these two PowerShell commands:

$rootCert = (Get-SPCertificateAuthority).RootCertificate
$rootCert.Export("Cert") | Set-Content C:SharePointRootAuthority.cer -Encoding byte

Now, let’s import the certificate into the Microsoft Management Console (MMC):

1. Click Start, Run, MMC

2. Add the Certificates snap-in:

3. Select “Computer Account”

4. Then import the Certificate: