SharePoint Document IDs

The SharePoint Document ID Service is a new feature of SharePoint 2010 that offers a number of useful capabilities, but carries some limitations.  Let’s dig a bit deeper and see what it does and how it works.

One challenge for SharePoint users is that links break easily. Rename a file or folder, or move the document, and a previously saved or shared link will no longer work.  By tagging a document with an ID, SharePoint can reference the document by that ID, even when the structure around it has changed.  SharePoint accepts a link containing this ID through a dedicated page on each site that takes care of finding the document.  This page is named DocIdRedir.aspx.  Here’s what a URL might look like:

http://<sitecollection>/<web>/_layouts/DocIdRedir.aspx?ID=XXXX
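
To make the URL shape concrete, here is a minimal sketch in Python that builds such a link and reads the ID back out of it. The helper names and the example host are made up for illustration; only the `_layouts/DocIdRedir.aspx?ID=` part comes from the URL shown here.

```python
from typing import Optional
from urllib.parse import parse_qs, quote, urlsplit


def doc_id_redirect_url(web_url: str, doc_id: str) -> str:
    """Build a DocIdRedir.aspx link for a Document ID under the given web URL."""
    base = web_url.rstrip("/")  # tolerate a trailing slash on the web URL
    return f"{base}/_layouts/DocIdRedir.aspx?ID={quote(doc_id, safe='')}"


def doc_id_from_url(url: str) -> Optional[str]:
    """Return the ID query parameter of a redirect link, or None if it is absent."""
    values = parse_qs(urlsplit(url).query).get("ID")
    return values[0] if values else None


# Hypothetical host and site path; XXXX is the placeholder ID from the post.
link = doc_id_redirect_url("http://intranet.example.com/sites/records/", "XXXX")
print(link)
print(doc_id_from_url(link))
```

Because the link only carries the ID, it stays the same when the file is renamed or moved within the site collection.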

There’s also a Document ID web part that’s available for users to enter a Document ID.  This is used most prominently when creating a Records Center site, which is based on an out-of-box website template.

The Document ID Service is enabled at the Site Collection level, and assigns Document IDs that are unique only within the site collection.  Each Site Collection has a configurable prefix; assigning a different prefix to each Site Collection keeps Document IDs unique across your web application and even your farm.  If you have more than one farm, it makes sense to embed a farm indicator in the prefix, to ensure uniqueness globally.
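
Before rolling prefixes out, it can help to check the plan for collisions. The following is a rough sketch in Python, assuming you keep a simple mapping of site collection URL to intended prefix. The two-part "farm code + site code" scheme, the function names, and the URLs are illustrative assumptions, not something SharePoint prescribes.

```python
from collections import defaultdict
from typing import Dict, List


def build_prefix(farm_code: str, site_code: str) -> str:
    """Combine a farm code and a site collection code into one prefix (assumed scheme)."""
    return f"{farm_code}{site_code}".upper()


def find_prefix_collisions(prefix_by_site: Dict[str, str]) -> Dict[str, List[str]]:
    """Return each prefix that is planned for more than one site collection."""
    sites_by_prefix = defaultdict(list)
    for site_url, prefix in prefix_by_site.items():
        # Compare case-insensitively so "f1hr" and "F1HR" count as the same prefix.
        sites_by_prefix[prefix.upper()].append(site_url)
    return {p: sorted(sites) for p, sites in sites_by_prefix.items() if len(sites) > 1}


# Hypothetical plan: two farms, with one accidental reuse of the F1HR prefix.
planned = {
    "http://farm1/sites/hr": build_prefix("F1", "HR"),
    "http://farm1/sites/legal": build_prefix("F1", "LGL"),
    "http://farm2/sites/hr": build_prefix("F2", "HR"),
    "http://farm1/sites/people": "f1hr",
}
collisions = find_prefix_collisions(planned)
print(collisions)
```

An empty result means every site collection in the plan has its own prefix.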

Setting Document ID

Once the Document ID Service is enabled, every new or edited document instantly gets a Document ID assigned.  However, historical documents do not get an immediate Document ID assignment.  Documents that were uploaded before the service was enabled are assigned their Document IDs by a Timer Job called the “Document ID assignment job” that exists at the Web Application level.  By default this job runs nightly.  This is one of two jobs associated with the Document ID Service, the other being the “Document ID enable/disable job”.

When the Document ID Service is enabled for a Site Collection, Event Receivers are automatically installed in each Document Library.  In fact, a set of Event Receivers is installed for each and every Content Type configured within that document library.  The Event Receiver is called “Document ID Generator” and is configured to be fired synchronously.  There is a separate Event Receiver for each of the following events:

  • ItemAdded
  • ItemUpdated
  • ItemCheckedIn
  • ItemUncheckedOut

Once a Document ID is assigned, it can be changed through the Object Model, although you do so at your own risk.  Before the Document ID Service is enabled, the Document ID field does not exist, so there is nothing to assign.  If you are migrating from a legacy system that has existing Document IDs, you can first migrate the documents and then enable the Document ID Service.  This adds the internal Document ID field.  Then, before the daily Document ID Assignment job runs (better yet, disable it during this process), you can programmatically copy the legacy Document IDs into the SharePoint Document ID field.  Once the Document ID field is populated, the Document ID Service will not overwrite the IDs that are already set.
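
The assignment step of such a migration can be sketched as follows. This is only a model of the logic in Python, with each document represented as a plain dictionary; it does not call the SharePoint Object Model. `_dlc_DocId` is the internal Document ID field, while the `LegacyDocId` column name and the sample values are assumptions made for the example.

```python
from typing import Dict, List, Optional

DOC_ID_FIELD = "_dlc_DocId"  # internal Document ID field


def assign_legacy_ids(items: List[Dict[str, Optional[str]]],
                      legacy_field: str = "LegacyDocId") -> int:
    """Copy legacy IDs into empty Document ID fields; return how many were set."""
    legacy_values = [i[legacy_field] for i in items if i.get(legacy_field)]
    if len(legacy_values) != len(set(legacy_values)):
        # Carrying duplicates over would produce duplicate Document IDs.
        raise ValueError("legacy IDs contain duplicates; resolve them first")

    assigned = 0
    for item in items:
        legacy_id = item.get(legacy_field)
        # Only fill empty fields, so IDs that already exist are left alone.
        if legacy_id and not item.get(DOC_ID_FIELD):
            item[DOC_ID_FIELD] = legacy_id
            assigned += 1
    return assigned


# Made-up sample data: one item to fill, one already assigned, one with no legacy ID.
migrated = [
    {"LegacyDocId": "LEG-001", DOC_ID_FIELD: None},
    {"LegacyDocId": "LEG-002", DOC_ID_FIELD: "ABC-1-7"},
    {"LegacyDocId": None, DOC_ID_FIELD: None},
]
count = assign_legacy_ids(migrated)
print(count, migrated)
```

Items that end up without a value are the ones the assignment job would pick up once it is switched back on.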

Note that part of Document ID Service is to redirect URLs referencing the Document ID.  It turns out, if you manually assign duplicate Document IDs (something that in theory should never occur), the daily Document ID Assignment Job detects this situation, and the DocIDRedir.aspx redirects to a site-based search page that passes in the Document ID.
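
If you want to find duplicates yourself, the check is straightforward once you have a list of documents with their IDs. Here is a minimal sketch in Python, assuming the documents have already been exported as (URL, Document ID, created date) records. The `Doc` type, the function name, and the sample IDs are hypothetical, and treating the oldest document as the original is an assumption rather than a SharePoint rule.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, List, NamedTuple


class Doc(NamedTuple):
    url: str
    doc_id: str
    created: datetime


def newer_duplicates(docs: List[Doc]) -> Dict[str, List[Doc]]:
    """For each Document ID used more than once, return all but the oldest document."""
    by_id = defaultdict(list)
    for doc in docs:
        by_id[doc.doc_id].append(doc)
    return {
        doc_id: sorted(group, key=lambda d: d.created)[1:]  # skip the oldest
        for doc_id, group in by_id.items()
        if len(group) > 1
    }


# Made-up export: the second file shares its ID with the first.
exported = [
    Doc("/lib/plan.docx", "ABC-1-10", datetime(2012, 1, 5)),
    Doc("/lib/plan-copy.docx", "ABC-1-10", datetime(2012, 3, 1)),
    Doc("/lib/other.docx", "ABC-1-11", datetime(2012, 2, 1)),
]
flagged = newer_duplicates(exported)
for doc_id, copies in flagged.items():
    print(doc_id, [c.url for c in copies])
```

The flagged documents are the candidates whose IDs would need to be corrected.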

Under the covers there are three internal components to a Document ID:

  • _dlc_DocIdUrl: fully qualified URL for document referencing the DocIDRedir.aspx along with the lookup parameter
  • _dlc_DocId: The Document ID.  This is the internal property you can directly address and assign as $item["_dlc_DocId"]
  • _dlc_DocIdItemGuid: DocID related GUID

That completes our tour of the Document ID Service.  I look forward to hearing of others’ experience with it.

14 replies
  1. Ian
    Ian says:

    I have been using the DOC ID service for our EDRMS for about 5 months now. Recently a user discovered an issue that after doing some digging I have no idea how to fix.

    If you download a copy of a document, change the file name, and upload it to the same library (the user had 4 or 5 similar documents to create and she was looking for the fastest way to do this) you end up with 5 documents, different file names but duplicate DOC IDs!

    The reason this happens is the metadata is saved within the word document when it is downloaded. Changing the file name does not reset the metadata. When that file is uploaded into SP, the DOC ID service thinks the DOC ID has already been assigned and doesn’t trigger.

    I could write some code so that when the user clicks “Download a Copy”, the PERSIST ID column is set to FALSE to ensure that a new DOC ID will get assigned when it is uploaded. The problem is that this will cause issues for other scenarios. Example: The user downloads a copy to work at home over the weekend, comes in Monday morning and uploads the new version. In this scenario they are using the same file name and want to keep the same DOC ID.

    Any thoughts?

    Reply
    • Joel Plaut
      Joel Plaut says:

      Wow, that’s quite a situation. Best is to create a tipsheet for users who want to copy a document they want to rename, that is intended to be different.

      Note that the March CU fixed a related issue, but that was when documents got restored as part of a Site template: https://support.microsoft.com/en-us/kb/2597150

      Other possible solutions are:
      – Go into Site Collection Document ID settings, and trigger a re-assignment of Doc IDs. This is not feasible if users reference these IDs long term.

      – I can write a PowerShell script that periodically scans all documents searching for duplicate IDs, and correcting them, by wiping the document IDs out of the newer files, then triggering the DocID Assignment Job that assigns IDs to any document, then trigger an email telling people what was done.

      I’m very interested in this, as I have not heard this occurring elsewhere. Let me know what you find.

      Warmest regards,

      Joel Plaut

      Reply
      • Ian
        Ian says:

        Hi Joel.

        Thank you for your quick response. What we decided was that users are creating copies of documents outside of SharePoint and that is not the “best practice”. We are adding a rule to our Governance Plan and creating training materials that show users how to create copies of documents inside SharePoint.

        The easiest method is to open a document in “read-only”, click “File” -> “Save As” and type in the new name you want. This will create a new copy in the same library and SharePoint will assign a new DOC ID.

        This is a bit of a cop-out but I didn’t see any clean way to handle this scenario. I have been a SharePoint consultant for about 5 years and I haven’t run into this issue before. I am also surprised that other companies haven’t come across this as the scenario performed by the user makes perfect sense to me.

        We will manage the one off situations when users don’t follow the “best practice” by creating a script and running it every so often (as you suggested…..thank you!).

        Reply
        • Joel Plaut
          Joel Plaut says:

          Glad to hear. Remember to set your DocID prefix to something unique (and even meaningful, optionally) per site collection. That can be scripted, let me know if you’d like that script.

          Warm regards,

          Joel

          Reply
    • Brian Kline
      Brian Kline says:

      If the downloaded files are DOC or DOCX files you can clear the custom properties by using the “Document Inspector” to remove the custom properties.

      Reply
  2. Ian
    Ian says:

    Thanks for the offer and advice Joel. My client wants to create the script on their own but it was nice of you to offer.

    Just wanted to say that I love your blog. I think I’ve read every one of your posts and there is lots of great stuff here. I often use your scripts as you have identified very practical situations and solutions. Keep up the good work!

    Reply
  3. Stephen
    Stephen says:

    We’re finding that the _dlc_DocIDUrl property is not updated when a database is restored into another farm/web application. Manually editing the document afterwards will update it, but that isn’t practical. Re-crawling and kicking off the Document ID enable/disable and assignment jobs have no effect on this property (as viewed through SharePoint Manager). We have external integrations using this property.

    Do you have any ideas how to get this property to update to show the URL of the new farm/web application?

    Reply
    • Joel Plaut
      Joel Plaut says:

      When you say “Enable/Disable” I assume you Disable/Enable the site collection feature? Please run the DocID Timer Job “Document ID Enable/Disable” once after disabling the Site Collection feature, and once after re-enabling the site collection feature.

      Please check the CU level of your farm. You may wish to be on a recent patch level, in case this is recently addressed.

      When you restored to a different farm, the underlying properties of the DocID service need to be updated, which SharePoint should be doing.

      Let me know.

      Kind regards,

      joel

      Reply
      • Stephen
        Stephen says:

        I did not disable the site collection feature, I was just running the jobs. I did try what you suggested; no luck. On another suggestion I chose the “Reset all Document IDs in this Site Collection to begin with these characters” and chose a new prefix (no effect whatsoever if I don’t change the prefix). This had the desired effect on the top-level part of the URL in the _dlc_DocIDUrl field, except now my DocIDs were changed. So, one more time to change it back. The DocID assignment job must be run each time after changing the settings. Not pretty, but it’s the best I’ve got so far. Thanks for your help!

        Reply
  4. Thanh-Nu Leroy
    Thanh-Nu Leroy says:

    Hi Joel,
    Thanks for this helpful post. I would like to get your script for setting the DocID prefix to something unique per site collection. Could you please send me the link or post it somewhere so that I can read it? Thanks for your help.

    Reply
  5. Flo
    Flo says:

    Hi Joel,
    we are having the same issues with some duplicate Doc IDs in our sharepoint site collection. I would be very interested if you can provide a script that does what you have said above, quoting you: “I can write a PowerShell script that periodically scans all documents searching for duplicate IDs, and correcting them, by wiping the document IDs out of the newer files, then triggering the DocID Assignment Job that assigns IDs to any document, then trigger an email telling people what was done.”
    Can you please provide us some powershell script that does exactly that?

    Thanks and best regards

    Reply
  6. Silv
    Silv says:

    OOPS. Fixed typos:

    Hi,

    What if I keep the location of a document, but change its name? We had to rename all of the documents in one library to shorter names. This broke all of the links we had in a Curriculum. I had to then manually go in and fix all of the links. Is there a workaround to this?

    Thanks!

    Reply
    • joel plaut
      joel plaut says:

      Document ID should work regardless of the name change. However I’ve heard of it breaking. Did this occur after a crawl?
      If there’s an issue in the metadata, that could be fixed up via PowerShell. Let me know.

      Reply
