joeware - never stop exploring... :)

Information about joeware mixed with wild and crazy opinions...

Protecting yourself from massive accidental deletes in Active Directory

by @ 4:33 pm on 3/4/2007. Filed under tech

Now I know that title caught your attention… Why? Because most of the folks reading this blog do something with Active Directory and this is a pain point in AD. An unnecessary pain point in my opinion, but a pain point nonetheless.

I bring it up because Tim Springston from PSS (or CSS, whatever you want to call them now 🙂) brought it up over on his blog.

http://blogs.technet.com/ad/archive/2007/02/07/active-directory-in-longhorn-feature-feedback-requested.aspx

This is something that I have brought up on several occasions in the past and have submitted DCRs to MSFT about. The gist of the whole post is the idea of having a special check box on specific objects (in ADUC) that you have to uncheck to complete a delete. The implementation, as he mentions in the comments, is through a DENY ACE on the object for delete.

Now this is something, besides the checkbox in ADUC, that you can easily implement right now by yourself if you want that functionality. I don’t, in general, want MSFT spending lots of time on doing things that I can already set up for myself, especially when it isn’t a very comprehensive solution; as Tim admits, this won’t help subtree delete ops.
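For anyone who wants to see what that DENY ACE looks like, here is a minimal sketch that only builds and inspects the SDDL text; it does not talk to a directory. The helper names (deny_delete_ace, has_deny_delete) are made up for illustration, and using Everyone as the trustee is an assumption, not something the post specifies.

```python
import re

# Standard SDDL abbreviations.
ACE_DENY = "D"            # ACE type: access denied
RIGHT_DELETE = "SD"       # standard DELETE right
RIGHT_DELETE_TREE = "DT"  # directory-service delete-tree right
SID_EVERYONE = "WD"       # alias for the Everyone (World) group


def deny_delete_ace(trustee: str = SID_EVERYONE, include_tree: bool = False) -> str:
    """Build the SDDL text for a deny-delete ACE (hypothetical helper)."""
    rights = RIGHT_DELETE + (RIGHT_DELETE_TREE if include_tree else "")
    # ACE layout: (type;flags;rights;object_guid;inherit_object_guid;trustee)
    return f"({ACE_DENY};;{rights};;;{trustee})"


def has_deny_delete(sddl: str, trustee: str = SID_EVERYONE) -> bool:
    """Return True if the SDDL string has a deny ACE covering DELETE for trustee."""
    for body in re.findall(r"\(([^()]*)\)", sddl):
        fields = body.split(";")
        if len(fields) != 6:
            continue  # skip ACE forms this sketch does not handle
        ace_type, _flags, rights, _obj, _inherit_obj, sid = fields
        # Rights are read as two-letter tokens; hex masks are ignored here.
        tokens = {rights[i:i + 2] for i in range(0, len(rights), 2)}
        if ace_type == ACE_DENY and sid == trustee and RIGHT_DELETE in tokens:
            return True
    return False


print(deny_delete_ace())                              # (D;;SD;;;WD)
print(has_deny_delete("D:(D;;SD;;;WD)(A;;GA;;;DA)"))  # True
```

Applying the ACE to a real object is left to whatever tool you already use for ACL changes; the point here is only that the "checkbox" boils down to one small ACE.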

I would like to see MSFT solve this the way I have been pushing for years now: a better implementation of the delete functionality. This is something that we cannot do ourselves and would be comprehensive and EXTREMELY useful. It is something I have been asking for, again, for years both in ActiveDir Org posts as well as through submissions to MSFT for design changes.

The idea that I have been pushing is a staged delete process. Many, if not most, large companies already implement something like this manually to the best of their ability, and many have problems with it through mistakes or misunderstandings of how various apps will respond to the process.

The process most companies basically follow is that they move objects they would normally delete to some OU or container X. They disable the object and depending on the level of understanding they will set specific attributes to certain values to make it so apps like Exchange don’t have a complete identity crisis and soil their panties.

Many of you know the process as you implemented it and probably most of you have encountered issues with it that caused you to specifically tweak the process to work with xyz apps that you have running.

Some of you may even have put into place processes and procedures that specifically state “NO USER/GROUP OBJECT WILL EVER BE DELETED”… I have seen/heard of that as well. Then you have the de facto standards of no computer objects ever getting deleted and that is simply because no one is ever sure if they can be safely deleted.

 

My thoughts are that the delete process should be changed to a staged delete process. At a minimum I visualize four distinct configurable phases. It allows you to configure which object classes go through which phases and how long they hang out in each, plus you should have the ability to move objects through some of the phases quickly if need be – say for a test or some mass delete and you don’t want the normal tombstones to apply.

So what are the four stages? They are:

Stage 1: See you later alligator

In this phase, objects are marked as unusable by the directory in a similar way that objects are marked right now, only they maintain all of their original attributes, including linked attributes. I could visualize this being implemented in a couple of ways. The first is via the standard tombstone/delete process we have now, but maintaining all attributes. The second is a new container which is hidden like the deleted objects container and is avoided by the DS for things like group expansion when users log on, for the case of, say, a deleted group. You would have to use the show deleted objects control to check this new container out, but support could easily be added to, say, ADUC so admins could reach in and pull objects back out of it and drag them to another OU, or right click and select restore to put the object back where it was before the delete.

There is a new attribute added, something like mS-DS-DeletionStage. It will be set to 1 when the object is in this stage[1]. Also, you could use a mS-DS-WhenDeleted attribute to make things easier for LDAP, or just read the metadata on the mS-DS-DeletionStage attribute to determine when the next stage of delete should occur. I would like to see that attribute myself, as well as, possibly, another attribute called mS-DS-WhenRecovered. Being able to find objects that have been recovered may be nice so you can go talk to the people who are deleting things that constantly need to be recovered. Oh yeah, we also want to see mS-DS-WhoDeleted and mS-DS-WhoRecovered and have something set other than “Administrators”. 🙂
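To make the "when should the next stage occur" bit concrete, here is a rough sketch of the date math. It assumes you have already read a timestamp for when the object entered its current stage (from the proposed mS-DS-WhenDeleted attribute or from the metadata on mS-DS-DeletionStage, neither of which exists today) and the retention period in days for that stage. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

KEEP_FOREVER = -1  # the "-1 days" convention: never leave this stage


def next_stage_due(stage_entered: datetime, tsl_days: int) -> Optional[datetime]:
    """Return when the object should leave its current stage, or None if never."""
    if tsl_days == KEEP_FOREVER:
        return None
    if tsl_days < 0:
        raise ValueError("tsl_days must be >= 0, or -1 for 'keep forever'")
    return stage_entered + timedelta(days=tsl_days)


def is_due(stage_entered: datetime, tsl_days: int, now: datetime) -> bool:
    """True once the retention period for the current stage has run out."""
    due = next_stage_due(stage_entered, tsl_days)
    return due is not None and now >= due


deleted = datetime(2007, 3, 4, tzinfo=timezone.utc)
print(next_stage_due(deleted, 30))  # 2007-04-03 00:00:00+00:00
print(next_stage_due(deleted, KEEP_FOREVER))  # None
```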

This would be an OPTIONAL stage of delete, maybe you don’t want all objects to go here, but instead want them to go to a later delete stage immediately, say like AD Connection objects or maybe OUs, which would get a tombstone period for this stage of 0 days. Or maybe you want different objects to be retained in this stage for different periods, say like OUs for 3 days but users for 365 days or even -1 days (i.e. keep in this stage forever). This would be configured with some new AD Schema definitions; however, it would be nice to have some overrides on a per object basis as well (mS-DS-DeletePolicy, which points to a custom defined delete policy which could also be used by the schema objects as I think about it), say for any apps that like to use a generic object type like container for many of its objects but you normally don’t want those objects maintained.

By default I would have the OS maintain user, computer, and group objects for 30 days. That way even if someone isn’t aware enough to enable this, the primary objects of concern would still be kept.
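The lookup order I have in mind (per-object override first, then the object class setting, then a fallback) could be sketched like this. Everything here is hypothetical: the DeletePolicy type, the resolve_policy function, and the numbers, which simply reuse the 30 day default for user, computer, and group plus the 120 day Stage2 value from the examples further down.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class DeletePolicy:
    """Days spent in each stage; 0 skips the stage, -1 keeps the object there forever."""
    stage1_days: int
    stage2_days: int
    stage3_days: int


# Assumed fallback: skip Stage 1 and Stage 3, go straight to a classic tombstone.
FALLBACK = DeletePolicy(stage1_days=0, stage2_days=120, stage3_days=0)

# Assumed class defaults: the primary objects of concern get 30 days in Stage 1.
CLASS_POLICIES: Dict[str, DeletePolicy] = {
    cls: DeletePolicy(stage1_days=30, stage2_days=120, stage3_days=0)
    for cls in ("user", "computer", "group")
}


def resolve_policy(object_class: str,
                   override: Optional[DeletePolicy] = None) -> DeletePolicy:
    """Per-object override wins, then the class policy, then the fallback."""
    if override is not None:
        return override
    return CLASS_POLICIES.get(object_class, FALLBACK)


print(resolve_policy("user").stage1_days)       # 30
print(resolve_policy("container").stage1_days)  # 0
```

An app that parks important data in generic container objects would pass its own DeletePolicy as the override, which is the role the mS-DS-DeletePolicy pointer would play.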

 

Stage 2: Classic (or Don’t I know you?)

This would be a delete as we know it now, where an object is immediately tombstoned upon receipt of the delete request. Attributes are scrubbed from the object as per the schema and internal hardcoding, but I would add the ability to actually maintain link attributes. This would be mS-DS-DeletionStage=2. Objects could be undeleted from here just like they are now, same processes, etc. Though with luck, you get back linked attributes. 🙂

I figure the vast majority of object types would hit this stage as their first stage as they would be set to skip stage 1.

The current tombstone lifetime would be used for this stage.

 

Stage 3: I seem to recall…

Another very optional stage that would be used for only a very few object types, I expect. This stage maintains a couple of key attributes that you cannot easily duplicate later if you need to recreate an object, namely the objectGuid, objectSid, sIDHistory, etc. I would recommend allowing customers to specify additional attributes to be maintained here as well, like say sAMAccountName or whatever custom attributes they never want to lose. Objects here can be recovered like regular tombstoned objects, you just have more stuff you have to repopulate. On the positive side, if several months down the road you need to bring something back with the same old SID for some odd reason (yes I know you shouldn’t ever HAVE to but….) then you can relatively easily do so. I know of several companies that maintain deleted objects like this in directories outside of AD because they don’t ever want to lose the info in case they find a SID/GUID they can’t resolve, or want to use sAMAccountNames as unique keys FOREVER so they need a way to check to make sure they don’t duplicate them. mS-DS-DeletionStage=3

Honestly I would only expect users and groups to go into this stage because security principals would be the primary need for this. By default I wouldn’t let anything into this category, admins would have to specifically choose to do this. I would expect objects that went into this stage would stay there indefinitely but if someone wanted a tombstone time on these to push them into stage 4 that would be fine too, maybe 3,5,10 year values?

The tricky thing here is that you couldn’t manually push an object from this stage into stage 4 easily because there is nothing to pass along to the other DCs after doing so. I almost wonder too if these objects should actually pop back out of the deleted phase and reappear as another object type but viewable through normal means. I could visualize it being handy to be able to resolve these objects, say for event logs, etc. that have SIDs listed. How many times have you tried to resolve a SID only to come to the conclusion that the object must have been deleted? If you resolve and it comes out to be an object of type “user-Deleted” then you know for sure and you can even know the name of the user.

 

Stage 4: Hit the road Jack, don’t you come back no more no more no more no more…

This is the scavenge phase where the object is yanked out of the DS and no longer retrievable which happens at the end of the tombstone lifetime now.
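Putting the four stages side by side, what survives at each one could be sketched roughly as below. The Stage 2 keep list is a made-up stand-in for "as per the schema and internal hardcoding" plus a couple of linked attributes; only the Stage 3 list (objectGuid, objectSid, sIDHistory) comes straight from the description above. The function name is hypothetical.

```python
from typing import Any, Dict, FrozenSet, Optional

# Stage 3 keeps only the hard-to-recreate identifiers named in the post.
STAGE3_KEEP: FrozenSet[str] = frozenset({"objectGuid", "objectSid", "sIDHistory"})

# Illustrative assumption only: a stand-in for the schema-defined tombstone
# attribute set, plus linked attributes that Stage 2 would now preserve.
STAGE2_KEEP: FrozenSet[str] = STAGE3_KEEP | {"name", "sAMAccountName", "memberOf"}


def retained_attributes(attrs: Dict[str, Any], stage: int,
                        extra_stage3: FrozenSet[str] = frozenset()
                        ) -> Optional[Dict[str, Any]]:
    """Return the attributes an object would still carry in the given stage."""
    if stage == 1:
        return dict(attrs)  # everything survives, including linked attributes
    if stage == 2:
        keep = STAGE2_KEEP
    elif stage == 3:
        keep = STAGE3_KEEP | extra_stage3  # admins may add attributes here
    elif stage == 4:
        return None  # scavenged: the object no longer exists
    else:
        raise ValueError("stage must be 1, 2, 3 or 4")
    return {k: v for k, v in attrs.items() if k in keep}


user = {"objectGuid": "g-1", "objectSid": "S-1-5-21-1", "sAMAccountName": "jdoe",
        "mail": "jdoe@example.com", "memberOf": ["CN=Staff"]}
print(sorted(retained_attributes(user, 2)))  # ['memberOf', 'objectGuid', 'objectSid', 'sAMAccountName']
print(sorted(retained_attributes(user, 3)))  # ['objectGuid', 'objectSid']
```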

 

Examples

So I have a few examples, that is where you flesh out many issues…

Ex 1:

Default production environment spun up anywhere. Default settings for user objects Stage1 TSL=30, Stage2 TSL=120, Stage3 TSL=0. Ten users get deleted, go into Stage1. Three of the users get restored in 2 weeks as accidental deletes, all of the work handled in ADUC with a simple move of the object from one container to another OR a right click and Restore op. Objects are exactly as before. In 30 days, the remaining 7 users go to Stage2 and get stripped down to the attribute set as per schema. 120 days later, the objects are scavenged from the directory (Stage4).

Ex 2:

Test lab which is configured for speedy cleanup because of small size and huge churn in add/deletes of users. Tweaked settings for user objects Stage1 TSL=0, Stage2 TSL=1, Stage3 TSL=0. Ten users get deleted, go into Stage2 and get scrubbed. A day later the objects get scavenged (Stage4).
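The first two examples can be double checked with a few lines that walk the three TSL values and report on which day an object enters each stage. This is only a sketch of the scheduling logic as described here (0 skips a stage, -1 parks the object forever); the timeline function is a hypothetical name.

```python
from typing import List, Tuple

KEEP_FOREVER = -1


def timeline(stage_days: Tuple[int, int, int]) -> List[Tuple[int, int]]:
    """Return (stage, day_entered) pairs; day 0 is the delete request."""
    events: List[Tuple[int, int]] = []
    day = 0
    for stage, days in enumerate(stage_days, start=1):
        if days == 0:
            continue  # a TSL of 0 means the stage is skipped
        events.append((stage, day))
        if days == KEEP_FOREVER:
            return events  # the object never moves on to a later stage
        day += days
    events.append((4, day))  # scavenged once every configured stage has expired
    return events


print(timeline((30, 120, 0)))  # Ex 1: [(1, 0), (2, 30), (4, 150)]
print(timeline((0, 1, 0)))     # Ex 2: [(2, 0), (4, 1)]
```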

Ex 3:

Production environment tweaked such that user delete stages are configured as Stage1 TSL=365, Stage2 TSL=120, Stage3 TSL=-1. Ten users get deleted, users can be fully restored for up to 1 year. This means that if you delete 1000 users per month, you need to have DIT capacity to maintain your normal active users + 12,000 inactive users. After one year, the objects are moved into Stage2 and scrubbed of all attributes not specifically marked to be maintained in this stage. Finally, after that 120-day Stage2 period ends, the objects are further scrubbed down to just a couple of attributes like objectGuid, objectSid, sIDHistory, sAMAccountName, and name, and will always exist from then on unless the configuration is changed later. This means that after 10 years at 1000 user deletes a month, your DIT would have 120,000 inactive users in your directory which, assuming no sIDHistory and SAM Name and Name being 20 characters or less and my math hasn’t taken a complete right turn, would be around 10MB of information. That is a pretty small cost IMO.
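The arithmetic behind those numbers, as a quick sanity check. The only inputs are the figures from this example; treating "10MB" as 10 × 2^20 bytes is my assumption, and the per-object size is just what that estimate implies, not a measured value.

```python
def stage1_backlog(deletes_per_month: int, stage1_months: int) -> int:
    """Fully restorable objects held at steady state during Stage 1."""
    return deletes_per_month * stage1_months


def total_deleted(deletes_per_month: int, years: int) -> int:
    """All objects deleted over the period, counted the way the example does."""
    return deletes_per_month * 12 * years


def implied_bytes_per_object(total_bytes: int, objects: int) -> float:
    """Average size per stripped-down object implied by a total size estimate."""
    return total_bytes / objects


backlog = stage1_backlog(1000, 12)   # 12000 extra objects during the first year
inactive = total_deleted(1000, 10)   # 120000 objects after ten years
per_obj = implied_bytes_per_object(10 * 2**20, inactive)

print(backlog, inactive, round(per_obj))  # 12000 120000 87
```

Note the example counts every deleted user as already stripped down; the most recent 485 days (365 + 120) of deletes would still be sitting in Stage1 or Stage2 with more attributes attached, so the real figure would be somewhat higher.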

 

Do we need Microsoft for this?

No. We don’t strictly need MSFT for this… This could be implemented via a vendor. However, to do so would really involve screwing with the DS pretty deeply. The vendor would have to intercept all of the delete ops, which is pretty darn intrusive. It could be done by someone who has a very strong understanding of AD and a history of doing similar things. Quest comes immediately to mind. ;o)  Microsoft would be EXTREMELY unhappy about it though, and I would be very slow to recommend someone implement something like this from a third party vendor. I would love to see MSFT step up and do this. I think it would help with one of the more common issues out there that people end up running auth restores for and IMO, auth restores are dangerous for people to be doing.

I have walked people through setting up processes to implement this procedurally, but seriously, they have to jump through special hoops and it is kind of a pain. This would be great to have built in because then you don’t have to worry about whether people are following procedure or not. If they use ADUC or dsrm or admod or NET USER /DELETE it doesn’t matter, the rules would be applied.

 

   joe

 

[1] 0 or null if not in a deletion stage at all – i.e. normal object.


One Response to “Protecting yourself from massive accidental deletes in Active Directory”

  1. Hilde says:

    EXCELLENT idea – make it so! thanks, Joe. For all you do, this Bud’s for YOU!
