We, as IT people in general and AD people in particular, often get dinged with the ubiquitous “quick question”. You all know what I mean, the “Hey I have a quick question” or “Hey do you have a second for a quick question” or most often just “Hey, got a sec?” as the person sits down and searches your desk for cookies, candy, or other things that they have no right to but will instantly latch onto as they settle in for decidedly more than “a second”.
Or if you are lucky enough to work from home, the IM window pops up with “Got a sec? :)” and you look to make sure you previously set your status to unavailable or away so that, if you want, you can just ignore the implied lie behind the seemingly harmless text with the disarming smiley face.
Either way, in our minds we are screaming “NOOOOOOOO, not for you. My life is composed in its entirety of ‘these seconds’ you take so cavalierly and I would rather not waste them on whatever you think will only take ‘just a second’ plus I have the letters Q, J, K, D, W, B, and O in Words with Friends and I have no clue where I am going to place any of them and am already losing by 160 points.” But… in the end… we know that saying no is pretty much pointless and that “second” could turn into three times as long as it would have been anyway if we waste the time trying to fight it… so we only get to respond with… “Sure, what’s up?” and may even feign some level of enthusiasm for dramatic effect.
Anyway…
How big is the AD DIT?
“Got a sec?”
“Sure. What’s up?”
“What is the size of the AD DIT?”
<LONG PAUSE with deep breath>
The only thing “got a sec” about that question is the amount of time it takes to utter the syllables. The only person that single datum is valuable to is the person worried about disk space on a domain controller. So unless you are looking to figure out how big a disk you need to order for your next DC, or perhaps you are in a “who has the biggest DIT???” contest, asking that specific question is simply the act of pushing the first in a long chain of dominoes.
So, instead of looking at your favorite DC and quickly spouting whatever the value is, you say[1] “Why?” You then get the response you were likely dreading… “Because we are having problems with Exchange and the Microsoft support guy wants to know how big the DIT is.”… Sigh.
Some of you may be asking, “But joe, what’s wrong with asking that question?” The problem is that the answer to that question doesn’t really tell you anything without the appropriate contextual information to go around it. Say the answer is 3GB. What does that mean? Do we jump for joy? Do we skulk in shame? Do we yip in pain? I don’t know. It could be good, it could be bad, it may not matter at all – who am I to know with the information in front of me?
The answer starts to make some amount of sense once you know the OS level: Windows 2000 versus Windows Server 2003 versus Windows Server 2008 R2. It makes more sense when you have some clue as to what other functions are running on the domain controller and what memory load those functions impose. And finally it makes a heck of a lot more sense when you know where your domain controller sits on the scale between 256MB of RAM and 64GB of RAM. A 6GB DIT means something entirely different on a machine with Windows 2000 and 512MB of RAM with SQL Server running in the background than it does on a Windows Server 2008 R2 machine with 16 processors and 64GB of RAM running only DNS and AD functions. So simply asking “How big is the DIT?” is like asking how much oxygen is in the room. Without understanding the context around it, the answer is pointless.
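To put a rough number on that context, here is a minimal Python sketch. It is not a Microsoft formula. The function name, the idea of subtracting the other workloads' memory, the OS reserve, and the sample figures are all assumptions made up for illustration, and it ignores things like 32-bit address space limits on older OS versions.

```python
def dit_cache_coverage(dit_gb, ram_gb, other_load_gb, os_reserve_gb=1.0):
    """Rough fraction of the DIT that could sit in memory.

    Illustrative heuristic only: every parameter here is an assumption,
    not a value reported by Active Directory.
    """
    if dit_gb <= 0:
        raise ValueError("dit_gb must be positive")
    # Memory left over once the OS and the other roles take their share.
    available_gb = max(ram_gb - other_load_gb - os_reserve_gb, 0.0)
    # Cap at 1.0: once the whole DIT fits, more RAM does not help this ratio.
    return min(available_gb / dit_gb, 1.0)


# Same 6GB DIT, two very different machines (sample numbers are made up).
starved = dit_cache_coverage(dit_gb=6, ram_gb=0.5, other_load_gb=0.25)
roomy = dit_cache_coverage(dit_gb=6, ram_gb=64, other_load_gb=1)
print(starved, roomy)
```

The point of the sketch is only that the same DIT size produces opposite answers depending on the inputs around it, which is why the bare size is not a useful thing to ask for.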
SIDEBAR: That being said, how nice would it be to have a fancy RootDSE operational attribute that you could query on all of your DCs for some value that gives you a clue about DIT size versus RAM utilization? Then if someone was, say, troubleshooting Exchange or something else, they could query the DC for that attribute and it would give them an idea of whether or not they should follow up with the DAs. Or perhaps the DAs could even monitor[2] the attribute across all of their DCs and be alerted that they need to be a little more aggressive in checking things out. Sure, sure, there are a ton of performance counters available that could be used, but in all reality most admins look at them and their eyes glaze over. Heck, my eyes don’t much like them either. It would be nice if they broke those out by role and feature like they have been doing with the Server Manager functions[3]. Anyway, Microsoft Exchange Support Engineers, imagine if you could ask the Exchange folks you are working with to do a quick LDAP query of the RootDSE of a DC to get the answer you really want, versus asking them to ask someone else what the size of the DIT is. Heck, it could be put into the ExRAP tool as well as the Baseline Analyzer tools.
We are seeing delays in replication…
“Got a sec?”
“Sure. What’s up?”
“We are seeing delays in replication, why?”
<PAUSE>
My response, to get a feel for what direction the questioner is driving and what kind of vehicle they are using, is usually of the type “Why do you think there is a delay?” That often, but not always, results in a response of the type “It just doesn’t seem to be moving as fast as we would expect.” Which I translate in my head to “We have no clue how long it is supposed to take and our stuff isn’t working correctly and we need a wall to throw the problem over…” And when I get the feeling someone is looking for a wall to toss things over, I usually come out with the old standby “You need to get a network trace of the problem”, which tends to make them go away for a while, if not permanently, when they find some other group to accept the task of troubleshooting their problem.
But in this case of replication delay there is a better response… “What is your expected theoretical max replication latency from the source DC to the destination DC?” If they say they don’t know, then I respond with “How do you know you are seeing delays? You don’t even know how long it is supposed to take in the first place.” The fact that it “feels slow” or isn’t what you expect doesn’t mean it is delayed. The entire issue could be, and very often is, that they have an incorrect expectation. To be able to make an objective claim of “it is delayed” means you have a thorough understanding of how long replication is designed to take and how long it actually takes during normal functioning. You should be able to say it is delayed by x minutes or hours, and be able to point at the expected latency based on the design and at what it is really taking.
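As a back-of-the-napkin illustration of what “expected theoretical max” could look like, here is a minimal Python sketch. It assumes site link schedules are always open, that a change waits at most one full replication interval per site link on the path, and that each intrasite hop adds one notification delay. The function names, the 15-second default, and the sample topology are assumptions for illustration; plug in the values from your own design.

```python
def max_replication_latency_minutes(site_link_intervals_min,
                                    intrasite_hops=0,
                                    intrasite_delay_sec=15):
    """Worst-case convergence estimate along one path, in minutes.

    Assumes always-open schedules and ignores the time to actually
    transfer the data; both are simplifications.
    """
    # Worst case, the change just missed the window on every site link.
    intersite = sum(site_link_intervals_min)
    # Each hop inside a site waits for one change notification delay.
    intrasite = intrasite_hops * intrasite_delay_sec / 60
    return intersite + intrasite


def delay_beyond_expected(observed_min, expected_max_min):
    """Minutes past the design maximum; 0 means it is not delayed."""
    return max(observed_min - expected_max_min, 0)


# Hypothetical path: two site links (15 and 180 minutes), three intrasite hops.
expected = max_replication_latency_minutes([15, 180], intrasite_hops=3)
print(expected)                              # 195.75
print(delay_beyond_expected(240, expected))  # 44.25
print(delay_beyond_expected(120, expected))  # 0
```

With a number like that in hand, “it feels slow” turns into “it took 240 minutes against a design maximum of about 196”, which is a claim someone can actually investigate.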
SIDEBAR: And again… That being said, it doesn’t seem like it would be terribly hard for the AD site and subnet tool, or for some other tool supplied by MSFT, to tell you the expected max theoretical convergence time when you select a source and destination DC. I actually have, and have had for some time, a tool listed on my “tools to build someday” list that could do this. Unfortunately, my time isn’t as free as it once was and you may notice that joeware updates and tools don’t flow quite as freely as previously. This is being worked on, but MSFT definitely has quite a few more available man hours for producing things like this. Again, how nice would it be for the PSS guys to tell the admin who is having problems: fire up this tool, click on the DC that you put the change on, click on the DC you want the change to get to, and the tool will tell you the theoretical minimum and maximum time frame for convergence, assuming a properly running replication environment.
Why are my LDAP queries going slow???
“Got a sec?”
“Sure. What’s up?”
“Why are my queries going slow?” or alternately “The PSS ExRAP or the Exchange PSS guy says the LDAP Queries are going slow. Why?”[4]
<PAUSE>
My response to this is always, “What exactly is the query that is going slow? Specifically I want the host you are querying, the search base, the search scope, the filter, and what attributes you are asking for.” This one is really quite annoying to me because the Exchange people through the years have really irked me by looking at some DSACCESS counters that say things aren’t good, but no one can tell me specifically what it is that isn’t good… Just something. Sorry, that isn’t good enough. Find out the queries, try them manually, and show me that they are not performing properly. Otherwise I am more likely to believe, based on personal experience through the years, that Exchange is screwed up in its configuration somewhere versus the DCs not functioning properly. A problem isn’t a problem to me unless you can show me specifically what isn’t working properly as it applies to me; showing me some generic counter from your application isn’t proof. Literally dozens of times, if not more, someone has come to me with those DSACCESS counter complaints, I start performing LDAP query tests on the DCs, the DCs are operating just fine, I tell the Exchange folks that, and they go off and find something else to blame.
If you come to me with specific queries, I can *usually* determine why they are going slow, and 98.9% of the time it is because of a poorly formulated query, a real poor choice of search scope, or a complete lack of anything resembling an indexed attribute. Have I had DCs that were underperforming? Yes, but that is the rounding error compared to the other issues that resided outside of the domain controller.
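For what it is worth, the five things asked for above fit in a very small structure. The Python sketch below is only an illustration: the class name, the field names, the scope spellings, and the sample values are all made up, and `time_query` takes any callable you supply rather than talking to a real LDAP library.

```python
import time
from dataclasses import dataclass, field

VALID_SCOPES = ("base", "onelevel", "subtree")  # spellings assumed for this sketch


@dataclass
class SlowQueryReport:
    """The details needed before a 'slow query' can be looked at."""
    host: str
    base: str
    scope: str
    ldap_filter: str
    attributes: list = field(default_factory=list)

    def missing_details(self):
        """Return the names of the pieces that were not supplied."""
        missing = []
        if not self.host.strip():
            missing.append("host")
        if not self.base.strip():
            missing.append("base")
        if self.scope not in VALID_SCOPES:
            missing.append("scope")
        if not self.ldap_filter.strip():
            missing.append("filter")
        if not self.attributes:
            missing.append("attributes")
        return missing


def time_query(run_query, repeats=3):
    """Run a caller-supplied, zero-argument query callable several times.

    Returns (fastest, slowest) wall-clock seconds across the runs.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return min(timings), max(timings)


# A typical first complaint: a host and a base, and not much else.
report = SlowQueryReport(host="dc01.example.com", base="DC=example,DC=com",
                         scope="subtree", ldap_filter="")
print(report.missing_details())  # ['filter', 'attributes']
```

Once nothing is missing, the same query can be run by hand against the DC and timed, which is the evidence being asked for here instead of a generic counter.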
SIDEBAR: And finally… Debugging LDAP queries on Active Directory and ADAM, IMO, is more painful than it should be. Most LDAP directories I have seen have a simple LDAP query debugging capability that dumps LDAP queries and debugging info into a simple text log file; Active Directory doesn’t have this. I know there is the whole Tracing thing but I have had zero time to dig into it and if it requires me to dig in and study it to figure it out, it is too difficult to enable and use.
Anyway, that is my rant for the day. Have a good week and Happy Lunar New Year / Chinese New Year – Year of the Dragon.
joe
[1] Because you naively think you can nip the whole chain of events you know is about to start in the bud.
[2] Monitor – to proactively and automatically check the service quality, availability, and functionality of your service in substantial regular intervals and alert on system faults and non-optimal performance. I only define this because lately I seem to be finding a lot of people who think the best “monitors” for AD are called “Users” and “The Help Desk”. When your users contact you to tell you the service isn’t working, that isn’t called monitoring, that is called failing.
[3] And perhaps they have been in the most recent versions of the OS. I, unfortunately, seem to be spending a lot of time on Windows Server 2003 lately which is a step up from the Windows 2000 I had to keep dealing with previously.
[4] Yes, yes, I am picking on Exchange. But as I said years ago, completely off the cuff in a humorous (but serious) manner in a Dean and joe Show session at one of the Directory Experts Conferences, <finger air quotes>Exchange is Special</finger air quotes>. To be honest, they aren’t the only ones I have had this issue with over the last 12 or so years, but they certainly win the award for the most consistent and excessive volume. :D I also had some nice fun with issues around poorly written LDAP queries with IBM’s WebSphere Portal application software. That one was pretty bad: IBM consultants onsite testing WebSphere functionality against a test DC sitting on the same switch as their app server… A DC with an AD they built “out” with 5 users and 3 groups on hardware that was 50 times better than anything anyone has ever used anywhere in the world for a DC, and then getting pissed when they tried to run the same queries against an environment with hundreds of thousands of users and hundreds of thousands of groups across 6 routers shared with thousands of people.