SharePoint 2010 : How to retain Document ID while moving documents

Today’s water cooler discussion revolved around the Document ID management system in SharePoint. It is a nice feature that helps you find documents quickly and predictably. Basically, you get a short URL for your document rather than a cryptic one, and SharePoint will find the document no matter where you move it.

But there is a catch: SharePoint can find the document only if you move it within the site collection. If you move it out of the site collection, a new Doc ID is generated and assigned to the document. BOOM!! Your short URL now points to nothing.

So how do we move documents without changing their Doc ID?

For this, we need to understand how Document ID generation and assignment happen. There is a comprehensive document on MSDN which provides these details. You can read about it here.

So basically, SharePoint uses event handlers to assign Document IDs at runtime (there are Timer Jobs as well, but more about them some other day). One paragraph in the article caught my eye:

When a new item is added, SharePoint Server 2010 first checks to see whether the item has a document ID. If the item has a document ID, the server checks to see whether the PreserveID attribute is set to True or False, and then sets it to False if it is currently set to True. If the item does not already have a document ID, the server gets a document ID for the item from the specified provider, writes it to metadata, and sets the PreserveID attribute to False.

This says that there is an attribute, “PreserveID”, which is checked before the document ID is updated but is then set back to False, effectively not retaining its value. But I could not find any explanation for this anywhere on MSDN.

Time to dive into the trusted .NET Reflector!

The code for the Document ID infrastructure is in Microsoft.Office.DocumentManagement.dll, specifically in the ItemChangedInternal method, where it checks for an internal property on the document, “_dlc_DocIdPersistId” (called the PreserveID attribute in the article). Every time the code finds this attribute set to True, it flips it back to False but skips assigning a new document ID, which is exactly what we need.

For that particular execution cycle the document ID will not change, but on the next move it will, because the PreserveID attribute has been set back to False. This works well when you are using custom code to add files to the library, and you can use the code given in a very good blog. But what if you are using OOTB features to move documents around? How do you set the PreserveID attribute?

It turns out to be a simple trick: add an ItemAdding event handler to the library and set the property on every document being added.

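       // Goes in a class deriving from SPItemEventReceiver (using Microsoft.SharePoint;)
       public override void ItemAdding(SPItemEventProperties properties)
       {
           // Ask the Document ID handler to keep the incoming ID instead of assigning a new one.
           properties.AfterProperties["_dlc_DocIdPersistId"] = "true";
           base.ItemAdding(properties);
       }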

 

This executes every time a new document is added to the library and simply says that I want to retain the document ID. The ItemChangedInternal handler then checks whether a document ID exists: if it does, it skips assigning a new ID; otherwise it assigns a new document ID.

Hope this helps some of you out there.

Happy Coding 🙂
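
To make the interplay between the ItemAdding handler and the built-in check easier to follow, here is a small stand-alone model of the behaviour as I read it. It is only a sketch and not the decompiled code: the class DocIdModel, the dictionary standing in for a list item, the sequential ID provider and the prefix parameter are all made up for illustration, and I am assuming the ID value lives in a field called "_dlc_DocId".

```csharp
using System;
using System.Collections.Generic;

// Simplified model of the behaviour described in this post.
// It does NOT call or reproduce Microsoft.Office.DocumentManagement.dll.
public static class DocIdModel
{
    // Assumed name of the field that stores the Document ID value.
    public const string DocIdField = "_dlc_DocId";
    // The internal "PreserveID" flag discussed above.
    public const string PersistField = "_dlc_DocIdPersistId";

    private static int counter = 0;

    // Hypothetical ID provider: hands out sequential IDs for the demo.
    private static string NextId(string prefix)
    {
        counter++;
        return prefix + "-" + counter;
    }

    // Stand-in for the custom ItemAdding receiver: request that the ID be kept.
    public static void ItemAdding(IDictionary<string, string> item)
    {
        item[PersistField] = "true";
    }

    // Stand-in for the built-in check that runs when an item is added.
    public static void ItemAdded(IDictionary<string, string> item, string prefix)
    {
        string flag;
        bool preserve = item.TryGetValue(PersistField, out flag)
            && string.Equals(flag, "true", StringComparison.OrdinalIgnoreCase);

        string current;
        bool hasId = item.TryGetValue(DocIdField, out current)
            && !string.IsNullOrEmpty(current);

        // Keep the existing ID only when one exists and the flag was set;
        // in every other case ask the provider for a new one.
        if (!(hasId && preserve))
        {
            item[DocIdField] = NextId(prefix);
        }

        // The flag is always flipped back, so it only covers this one add.
        item[PersistField] = "false";
    }
}
```

Passing a dictionary that already holds an ID through ItemAdding and then ItemAdded leaves the ID untouched and the flag reset to "false". Calling ItemAdded again without going through ItemAdding first assigns a fresh ID, which mirrors the "next move" problem described earlier.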

7 thoughts on “SharePoint 2010 : How to retain Document ID while moving documents”

  1. Nice. Wish I had read this prior to export/importing a large document library to a new site collection (and contentdb). I would have liked to retain the old Document IDs even though I don’t believe my users are relying on them.

  2. Just a quick note to say there appears to be a fault introduced in the Sep 2015 CU for SP 2013 (not sure about SP 2010), whereby the _dlc_DocIdPersistId no longer works. When you set it to true in one of the “ing” event handlers, the persistence demand is ignored and the Document Id is over-written with a new one.

    Microsoft Support have reproduced the fault using the code above in a simple event handler Visual Studio project: In the March 2015 CU it works fine, in the Sep 2015 CU the code fails to instigate the persistence expected. The event handler does fire, it’s just the outcome isn’t right. Using Reflector shows the code in the ItemChangedInternal method of Microsoft.Office.DocumentManagement.dll was indeed refactored between the two CU releases.

    It’s heading to MS Product Group for further investigation; fix TBA.

    Anyone brighter than me (not a challenge) wishing to analyse and correct the MS code for the Product Group would probably get some brownie points, but I’m not sure who from.

    So if you’ve relied on the excellent work above to deliver software using the “ing” event handlers that persists the Document Id for documents that exit SharePoint and re-enter later, or similar type application, don’t apply the Sep 2015 CU or later until the issue is fixed.

    • Hi Lawrence,

      Thanks for the update. It’s been a while I have been away from SharePoint. Let me see what I can find out and will update here.

      Regards,
      Manpreet

      • “It’s been a while I have been away from SharePoint.”

        Lucky you! 🙂

        I didn’t mean to drag you back to the beast, I just wanted to get the warning out there. Yours is the highest result on the search engines for anyone looking for _dlc_DocIdPersistId, so I thought it the best place to comment.

        I’ll report back on here what the MS Product Group says, if that’s ok.

        Cheers,
        Loz

        • hahahaa… as they say, once you are a SharePoint guy… they can’t take SharePoint out of you!

          Surely, update here what PG says and if there are code changes required, will update the blog with latest changes.

  3. MS Product Group have responded. I’m quoting from the support incident: –

    “I’ve heard back from Product Group and this is currently By Design behavior. Any externally uploaded document should get a new id assigned as SharePoint isn’t aware of this document.
    DocId feature was developed to make sure SharePoint can track a document as it moves inside the farm using move to, drop off library etc. but once it leaves the farm – it is a brand new document for SharePoint.
    _dlc_DocIdPersistId property is officially not documented, it is internal and it is not supposed to be used.
    It can very easily cause ids duplications in different scenarios. One such scenario is given below

    If someone set _dlc_DocIdPersistId, upload a document to the site where doc ids feature is not enabled. Because of properties demotion it will end up in document properties. Customer’s keeps using this document as a template for other document.
    Once doc id feature is activated – many documents will get same doc id assigned.

    As a workaround you can create a column in doc libraries and generate the ids. Properties promotion/demotion will ensure that after document is downloaded from SharePoint it will get the same value in that column after it is uploaded back.
    You could use this Column to compare if the document uploaded is a new doc or a old one.”

    So it seems the usage of “_dlc_DocIdPersistId” should be internal to SP only, not available to us directly for development.

    If you want to track documents that leave SP and return later, and compare their document id’s to SP, you must either implement your own id column and supporting code (as described above), or use the CSOM to test the document id’s before uploading. Problem is you have to open the documents you wish to upload in code using the Office app on the client, using VSTO or Office add-in model, whatever it’s called these days, for each document you’re uploading first, to get the returning document id. Or you could write an SP app to which the uploaded documents go first, which then opens them in code in the Office app model to get the document id, uses web services to check if the document id is already in use. Sort of like an Office server (but that’s what SharePoint is supposed to be, isn’t it? Well, yes, exactly …).

    Bet you drink a toast on the anniversary day you got away from “DespairPoint” ?! …

    Cheers,
    Loz

  4. Thanks for the update!

    Well, that’s a typical answer I was expecting from PG. The particular scenario that I was discussing in my blog was moving documents within the same SharePoint farm but different Site Collections like to an Archival site, which is a pretty common scenario and still breaks the DocID feature.

    I agree to the fact that once the document is out of the SP Farm and then comes back, it becomes a tedious process to maintain the metadata, but isn’t that one scenario that we are looking to prevent by having Office Web Apps or have deep integration with Office client applications, so that you DO NOT HAVE A NEED to download and upload the documents over and over again. This goes against the principle of having single version of truth, the need for file locks, checkin/checkouts and OneDrive for Business to take docs offline when on move…

    But writing all that code just to cover that one scenario is equivalent to writing your own Document ID feature, so why would SP even have one? Just provide the guidance and let everyone write their own stuff, or better, let an ISV build it for MS!

    Enough ranting. I guess I should get back to see what has been done to the code internally, and whether we can find a new fix and get our share of high 🙂
