Gradual Site Collection Deletion

I had a mission to achieve overnight: move 28 site collections into a new managed path, and rename the 28 associated content databases. Pretty straightforward to do, and to script in advance to run in stages:

  1. Backup site collections
  2. Delete site collections
  3. Dismount content databases
  4. Rename content databases and shuffle around underlying storage including FILESTREAM RBS location
  5. Create the new managed path; reset of IIS
  6. Remount 28 content DBs
  7. Delete the old managed path
  8. Restore the 28 site collections (this is where tinkling glass is heard, followed by painful silence)
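The staged script can be sketched roughly as follows; the URLs, backup paths, and database names here are hypothetical placeholders, not the actual production values:

```powershell
# Sketch of the staged migration (hypothetical URLs, paths, and DB names).
# Stages 1-2: back up, then delete, each site collection.
$sites = "A","B","C"   # ...through all 28 site collections
foreach ($s in $sites) {
    Backup-SPSite "http://sharepoint/div/clm/int/$s/" -Path "E:\Backups\$s.bak"
    Remove-SPSite "http://sharepoint/div/clm/int/$s/" -Confirm:$false
}
# Stage 3: dismount the content databases before renaming them in SQL.
Dismount-SPContentDatabase "Content_mydivision_a"
# Stage 4 happens in SQL Server: rename DBs, shuffle storage and RBS.
# Stage 5: create the new managed path and reset IIS.
New-SPManagedPath "div/clm/ext" -WebApplication "http://sharepoint/"
iisreset
# Stages 6 and 8: remount under the new names, then restore.
Mount-SPContentDatabase "Content_division_a" -WebApplication "http://sharepoint/"
Restore-SPSite "http://sharepoint/div/clm/ext/A/" -Path "E:\Backups\A.bak"
```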

After enough Jolt Cola to get that turtle to beat the hare fair and square, I got to the bottom of the problem. First, the problem: the site collections could not be restored into their original database, as the site collection purportedly already existed there. Even though it was deleted.

By midnight I gave up, and did the 28 Restore-SPSite commands into random content databases, tossing organization and structure to the winds (temporarily), knowing I had gads of available storage, and knowing that once I got to the bottom of the issue, a simple set of Move-SPSite commands would set things right. No, I don’t like randomness, but I also like a few winks of sleep and happy users…

Now the cause. Since SharePoint 2010 SP1, SharePoint (including SP2013) has had the ability to recover deleted site collections (not through the UI, but only through PowerShell). I used the -GradualDelete option, thinking I would be nice and gentle with a production farm. Here’s a sample of my delete commands, where I also disable prompting:

Remove-SPSite "http://sharepoint/div/clm/int/A/" -Confirm:$false -GradualDelete

Here’s the kicker. After the delete, the site collection is indeed still there. It actually sticks around for the duration of the Recycle Bin retention period (default 30 days). There’s one good way to see: let’s dive into the forbidden content database, and have a peek:

SELECT [DeletionTime]
FROM [Content_mydivision_a].[dbo].[SiteDeletion]
where (Restorable=1)

Restorable=1 indicates this site collection could be restored.

The solution? Well, it’s not the Recycle Bin job; that has no effect on this. There is a Gradual Delete job at the web application level, but that won’t help us either; at least not just yet. First you have to use the Remove-SPDeletedSite cmdlet to remove each site permanently. Here’s the syntax:

Remove-SPDeletedSite f5f7639d-536f-4f76-8f94-57834d177a99 -Confirm:$false

Ah, you don’t know your site collection GUIDs by heart? Well, me neither; I prefer a more useful allocation of brain cells, so here’s the command that will give you the site collection GUIDs that have been (partially) deleted:

Get-SPDeletedSite -WebApplication "http://sharepoint/"

So, you’ve got your partially deleted GUIDs, you diligently did a Remove-SPDeletedSite for each, but Restore-SPSite still will not work. Now’s the time to run the handy-dandy Gradual Delete timer job for your web application, in Central Admin, Monitoring. The first thing you might notice is that the job takes a bit of time to run. That’s good: it’s doing something for you, actually triggering the content DB stored procedure called proc_DeleteSiteCoreAsync, which deletes in batches.
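If you’d rather not click through Central Admin, the same timer job can be kicked off from PowerShell; here’s a sketch (the job’s display name may vary slightly by build, hence the wildcard match):

```powershell
# Find and start the Gradual Delete job for the web application.
Get-SPTimerJob -WebApplication "http://sharepoint/" |
    Where-Object { $_.DisplayName -like "*Gradual*" } |
    Start-SPTimerJob
```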

Here’s how to wipe out all these mildly annoying site collections from your recycle bin for a web application:

get-spdeletedsite -webapplication "http://my-srv-sp10/" | Remove-SPDeletedSite

At this point your Restore-SPSite will work against your target content database, and if you played SharePoint roulette like me and restored to random locations, a Move-SPSite will make fast work of putting things where they should be.
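That cleanup looks something like this sketch, with a hypothetical URL and database name:

```powershell
# Move a site collection from its random landing spot into the intended content DB.
Move-SPSite "http://sharepoint/div/clm/ext/A/" -DestinationDatabase "Content_division_a"
```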

More information on the Gradual Deletion timer job can be found in this TechNet article by Bill Baer.

Converting SQL for embedded use within VBA

After creating and testing SQL to embed within a VB or VBA application, it needs to be added to the VB project as usable strings. How do you convert your SQL easily without introducing errors? Here’s a PowerShell script that takes in a SQL file (sql.txt) and condenses it into 80+ character strings, with VB line continuations, for copying into your VB code.

$MyFile = Get-Content -Path "sql.txt"
$outfile = "sqlNew.txt"
Remove-Item $outfile -ErrorAction SilentlyContinue
$str = ""
$firstLine = $true
for ($i = 0; $i -lt $MyFile.Count; $i++)
{
    # Flush a full chunk as a VB string with a line continuation
    if ($str.Length -gt 80)
    {
        if ($firstLine)
        { $str = '"' + $str + '" _' }
        else
        { $str = '& "' + $str + '" _' }
        Add-Content $outfile "$($str)`n"
        $firstLine = $false
        $str = ""
    }
    $nextLine = $MyFile[$i]
    $nextLine = $nextLine.Replace("`t", " ")
    $nextLine = $nextLine.Replace("  ", " ").Replace("  ", " ").Replace("  ", " ")
    # Strip SQL line comments
    $idx = $nextLine.IndexOf("--")
    if ($idx -ge 0)
    { $nextLine = $nextLine.Substring(0, $idx) }
    $str = $str + ' ' + $nextLine
}
# Flush the final chunk, without a continuation
if ($firstLine)
{ $str = '"' + $str + ' "' }
else
{ $str = '& "' + $str + ' "' }
Add-Content $outfile "$($str)`n"

Custom SQL Reporting in MS-Project Server 2013

It is easy to navigate the database schema in MS-Project Server to generate reports. The SQL can be embedded in an ODC, or can be used within PowerPivot. If joining Task and Project data, there’s a challenge with rollups. The first challenge is avoiding double-counting from summary tasks. The solution is to exclude them in the join by adding this condition:

where TaskIsSummary=0

The next source for double-counting are external tasks; those exposed through cross-linking tasks in separate projects. We can exclude both this way:

where TaskIsSummary=0 and TaskIsExternal = 0

The next problem: when merging the task and project tables, project-level values would roll up incorrectly. However, such numeric fields can be pro-rated against the project work, as long as we avoid divide-by-zero errors. Here’s how, referencing a custom field called “Budgeted Costs”; note how its value is made proportionate to the task work:

, case
when [MSP_EpmProject_UserView].[Budgeted Costs] = 0 THEN 0
when MSP_EpmTask_UserView.TaskRegularWork = 0 THEN 0
when MSP_EpmProject_UserView.ProjectWork = 0 THEN 0
else [MSP_EpmProject_UserView].[Budgeted Costs] * ( MSP_EpmTask_UserView.TaskRegularWork / MSP_EpmProject_UserView.ProjectWork )
END as [Budgeted Costs]
FROM dbo.MSP_EpmProject_UserView INNER JOIN dbo.MSP_EpmTask_UserView
ON MSP_EpmProject_UserView.ProjectUID = MSP_EpmTask_UserView.ProjectUID
where TaskIsSummary=0 and TaskIsExternal = 0
ORDER BY MSP_EpmProject_UserView.ProjectName, MSP_EpmTask_UserView.TaskIndex, MSP_EpmTask_UserView.TaskName

One step further, we can do the same using task assignment data, here’s what that looks like using the assignment work:

, case
when [MSP_EpmProject_UserView].[Budgeted Costs] = 0 THEN 0
when MSP_EpmAssignment_UserView.AssignmentWork = 0 THEN 0
when MSP_EpmProject_UserView.ProjectWork = 0 THEN 0
else [MSP_EpmProject_UserView].[Budgeted Costs] * ( MSP_EpmAssignment_UserView.AssignmentWork / MSP_EpmProject_UserView.ProjectWork )
END as [Budgeted Costs]
,[MSP_EpmResource_UserView].[Cost Type]
,[MSP_EpmResource_UserView].[Resource Departments]
,[MSP_EpmResource_UserView].[Resource Title]
FROM dbo.MSP_EpmProject_UserView
INNER JOIN dbo.MSP_EpmTask_UserView
ON MSP_EpmProject_UserView.ProjectUID = MSP_EpmTask_UserView.ProjectUID
LEFT OUTER JOIN dbo.MSP_EpmAssignment_UserView
ON MSP_EpmTask_UserView.TaskUID = MSP_EpmAssignment_UserView.TaskUID
AND MSP_EpmTask_UserView.ProjectUID = MSP_EpmAssignment_UserView.ProjectUID
LEFT OUTER JOIN dbo.MSP_EpmResource_UserView
ON MSP_EpmAssignment_UserView.ResourceUID = MSP_EpmResource_UserView.ResourceUID
where TaskIsSummary=0 and TaskIsExternal = 0
ORDER BY MSP_EpmProject_UserView.ProjectName, MSP_EpmTask_UserView.TaskIndex, MSP_EpmTask_UserView.TaskName

SharePoint Database naming

The primary naming convention to put in place is the database naming standard. By default, SharePoint puts a GUID at the end of every database. This is to ensure that two SharePoint farms can use the same database server without conflict. However, the GUIDs make memorizing a database name practically impossible. Here are some basic naming conventions to consider adopting:

  • Avoid GUIDs at all costs
  • Avoid blanks
  • Avoid underscores
  • Leave “DB” and “database” out of the name
  • Use Capital Letters to highlight the start of words (CamelCase)
  • Consistent description to allow a database to be clearly associated with a specific Web App, and Service App
  • References to Production vs. Development are not necessary
  • References to SharePoint version are not necessary
  • References to “SharePoint” are unnecessary, especially for a dedicated SQL Server
  • Leave out the obscure “WSS” SharePoint convention for content databases; instead use “Content” at the start of the database name. That’s clearer for DBAs who are not versed in the mysterious acronyms of SharePoint.

Here’s a proposed syntax for structuring database names:

[Major Application] [Type] [Minor Application] [Specific]

  • [Major Application] – the major category of application. Sample values: left blank for SharePoint; MSPS (for MS Project Server)
  • [Type] – the category or type of database, based on the primary system using the database. Sample value: Content
  • [Minor Application] – can be the Service Application. Sample value: PerformancePoint
  • [Specific] – distinguishes each of multiple service app DBs, or describes the use of a content DB for a web app. Sample value: CentralAdmin
Default: Search_Connector_CrawlStoreDB_4040b7300e9e42779edb3e6b926be5a7

New: ServiceApp_SearchConnectorCrawlStoreDB

Default: SharePoint_AdminContent_ff35d171-482c-4f9d-8305-a4a259ec1a15

New: Content_CentralAdmin

Default: wss_content_eaee9d8f-ed75-4a56-bad3-5abf232b4f66

New: Content_DIV_HR

Default: StateService_0f2a42e8b90d4c60830ca442e753de13

New: ServiceApp_State

Reporting on SharePoint MySite distribution by Content Database

Knowing how sites are distributed among content databases is key; for example, knowing which content database to restore for a given user.

Wouldn’t it be nice to see the breakdown of MySites, which belong to a given Content Database, and the total size of each Content Database? Here’s a script that generates this useful report:

$DBs = Get-SPWebApplication http://MySites | Get-SPContentDatabase
foreach ($db in $DBs)
{
    Write-Host -ForegroundColor DarkBlue "DB Name: $($db.Name)"
    $siz = "{0:N3}" -f ($db.DiskSizeRequired/1GB)
    Write-Host -ForegroundColor DarkBlue "DB Size: $($siz) GB"
    Write-Host "========"
    $db | Get-SPSite -Limit All
    Write-Host " "
}

Metadata Warehouse of SharePoint Content

I often write custom reports to analyze large datasets from a SharePoint farm. I can answer questions such as:
– Number of documents updated per user per month, over time.
– What metadata fields and values are in use, and their frequency by department.

I typically export as a CSV, using pipe delimited format (to avoid misinterpreting the ubiquitous comma), and import into a spreadsheet.
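The export itself can be as simple as this sketch; the property list and output path are hypothetical, but note the pipe delimiter:

```powershell
# Export with a pipe delimiter to avoid misinterpreting commas inside values.
Get-SPSite -Limit All |
    Select-Object Url, Owner, LastContentModifiedDate |
    Export-Csv -Path "C:\Reports\sites.csv" -Delimiter "|" -NoTypeInformation
```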

What happens when the dataset is too large for Excel? The data I analyze often pushes the limits of Excel or my machine. In that case, I’ll import into SQL Server. To make it repeatable, I’ll create an import process that drops and recreates the target table, defines the fields and how to handle errors for each, then export it as an SSIS DTSX package.

To run a DTSX package, I’ll import it into Business Intelligence Development Studio as a package and run it from there.

Once in SQL Server, one can handle larger datasets, with all the tools and techniques of SQL for dealing with really big data.

The data can be exposed as an SSRS report.

Quick Inventory of SharePoint databases

Often I have to quickly analyze SharePoint databases: both content DBs as well as the full set of existing databases. This could be to see which of the databases are still part of the farm. Here’s how to quickly see which content DBs are mounted in the farm:

Get-SPContentDatabase | Select-Object Name, WebApplication

For the more general set of all databases associated with the farm, it is nice to know the type of each database. Here’s a quick query to get that for you:

Get-SPDatabase | select name, type

Happy farming!

SharePoint Timer Job History analysis

At times one needs to analyze the timer job history to see which jobs have run a long time. The Central Admin user interface limits you to a sequential view, although you can filter by service or web application. The actual timer job history is maintained in a single table in the Config DB. Here’s a simple SQL SELECT to get the 100 longest-running timer jobs in a time range:

/****** Script for SelectTopNRows command from SSMS  ******/
SELECT top 100 [Id]
,datediff(S,StartTime,EndTime) as SecondsElapsed
FROM [SharePoint_2013_Farm].[dbo].[TimerJobHistory]
where StartTime > '2013-02-12 02:40:00' and EndTime < '2013-02-12 03:55:00'
--note that the TimerHistory timestamp is always in GMT!
order by SecondsElapsed desc
The history can also be mined with PowerShell. First, pull the last day’s failed jobs (Status=3) from the Config DB:

$events = Invoke-SQLcmd -Server "NY-SRV-SQLPRD02" -Database SharePoint_2013_Farm "select Status,JobTitle,WebApplicationName,ServerName,DatabaseName,StartTime,EndTime,ErrorMessage from dbo.TimerJobHistory where Status=3 and StartTime between GETDATE()-1 and GETDATE()"

Then in PowerShell, insert each row into a reporting database:

foreach ($event_num in $events)
{
    Invoke-SQLcmd -Server "NY-SRV-SQLPRD04" -Database MYdbreports "insert into [SharePoint].[TimerJobHistory] (Status,StartTime,EndTime,JobTitle,ServerName,DatabaseName,ErrorMessage) values ('$($event_num.Status)','$($event_num.StartTime)','$($event_num.EndTime)','$($event_num.JobTitle)','$($event_num.ServerName)','$($event_num.DatabaseName)','$($event_num.ErrorMessage)')"
}

System characteristics of a well performing SharePoint Farm

Tempdb data files

  • Dedicated disks for the tempdb
  • Tempdb should be placed on RAID 10
  • The number of tempdb data files should equal the number of CPU cores, and the tempdb data files should be set to an equal size
  • The initial size of each tempdb data file should be approximately the physical memory divided by the number of processors

DB management

  • Enough headroom must be available for the databases and log files, plus enough capacity to keep up with requests
  • Pre-grow all databases and logs if you can; be sure to monitor the sizes so that you do not run out of disk space
  • When using SQL Server mirroring, do not store more than 50 databases on a single physical instance of SQL Server
  • Database servers should not be overloaded with too many databases or too much data
  • Content databases should be limited to 200 GB
  • Indices should be defragmented and rebuilt daily, if users can absorb the downtime required to rebuild
  • More than 25% disk space must be free
  • Another instance of SQL Server must be created for data exceeding 5 TB
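Pre-growing can be scripted; here’s a sketch, assuming hypothetical database and logical file names (and sizes you’d pick from your own capacity planning):

```sql
-- Pre-grow the data file and log to their expected sizes,
-- with a fixed growth increment instead of the default percentage.
ALTER DATABASE [Content_DIV_HR]
    MODIFY FILE (NAME = 'Content_DIV_HR', SIZE = 50GB, FILEGROWTH = 1GB);
ALTER DATABASE [Content_DIV_HR]
    MODIFY FILE (NAME = 'Content_DIV_HR_log', SIZE = 10GB, FILEGROWTH = 1GB);
```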

Monitor the database server

Key performance counters to monitor are:

  • Network Wait Queue: at 0 or 1 for good performance
  • Average disk queue Length (latency): less than 5 ms
  • Memory used: less than 70%
  • Free disk space: more than 25%

Diagnosing and rectifying SQL Transaction Log Growth

Ever wonder why your transaction logs grow out of control? The first place to check is whether the database is set to the Full or Simple recovery model. Unless you are running in Simple mode, transaction log backups need to be taken at least hourly so that the accumulated transaction log space can be reused.
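Such a log backup is a one-liner; a sketch with a hypothetical database name and path:

```sql
-- Back up the transaction log; once backed up, the inactive
-- portion of the log can be reused, halting growth.
BACKUP LOG [Content_DIV_HR]
TO DISK = 'E:\Backups\Content_DIV_HR_log.trn';
```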

The database server keeps its own system view that can tell you what’s preventing transaction log space from being cleared; the log_reuse_wait_desc column gives the reason. Here’s a query to run:

SELECT [name], [log_reuse_wait_desc]
FROM [master].[sys].[databases]

The first thing to try is to shrink the log file, assuming you are already in Simple recovery mode:

USE dbname
--First parameter below is the file number for the log file, often 2. Check sys.database_files.
--Second parameter is the target size in MB.
DBCC SHRINKFILE (2, 100)

Log file space is not automatically recovered. The SQL to truncate and recover the log space is to run the following against the LDF file:

DBCC SHRINKFILE ( logical_file_name [ , target_size_in_MB ] , TRUNCATEONLY )