It has been almost 11 years since Windows 2000 and Active Directory hit the world, and apps are still having issues. At this point I just want to look at vendors and say “Seriously??? Figure it out already.”
So I am just going to start typing issues that I have encountered that by now shouldn’t really be issues IMO. Maybe by doing this, vendors will stumble upon it and make sure their apps don’t have these issues. This isn’t an all-inclusive list; this is just me typing things that pop into my head as I sit here watching TV.
Simple Binds
The first thing that pops into my head is simple bind. If you are using simple bind you absolutely should be using LDAPS. Period. You have this important (at least it should be, right?) application that needs to talk to AD, so they give you an ID and you set a nice safe secure password to keep the ID, your application, and any data it has access to safe, and then you throw it out there in the clear for the world to see.
Hard-coding DCs
Hard-coding DCs is probably one of the most frustrating things to hit Active Directory Admins around the world. Someone thinks their application is so important that, of course, the AD Admins, who have nothing else to worry about, will be constantly focused on the DC the application people decided to target and make sure it has 110% uptime. Ah yeah, not going to happen. Application vendors, developers, integrators, and support teams… the AD Support people have lots of things to do that have nothing to do with your application, they very likely don’t even remember talking to you about your app if they know you exist at all.
There are multiple ways to solve this issue but the most professional is to set things up like MSFT intended. Microsoft didn’t come up with the idea of publishing SRV records for the services so they could keep the DNS people in work, they did it so applications can locate DC resources as needed. I have seen teams in companies working on UNIX apps as long ago as 2002 writing their own DC Locator services. If these teams who work for companies whose focus in the world is something entirely different than writing software to integrate with AD can pull this off, how come vendors who write LDAP products that they sell can’t accomplish the same thing? Especially when they say their applications are Active Directory aware. I would propose that if an LDAP based application can’t locate the closest DCs via some sort of DC Locator functionality, they are NOT Active Directory aware and we should be pretty quick to tell them that. I haven’t looked in a while, but I expect JNDI still doesn’t work right for this[2]. That is pretty annoying because it is so ubiquitous out there, if they could fix that, lots of apps would be fixed.
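To make the SRV point a bit more concrete, here is a minimal sketch of the application side of DC location: build the SRV names to ask DNS for (site-specific first, then domain-wide) and order whatever comes back by priority and weight. The function names and sample hosts are made up for illustration, the weighting is a simplified take on RFC 2782, and the actual DNS query is left out because it needs a resolver library.

```python
import random

def dc_srv_names(domain, site=None):
    """Return the SRV names to query for DCs, most specific first."""
    names = []
    if site:
        # Site-specific record: DCs covering the client's AD site.
        names.append(f"_ldap._tcp.{site}._sites.dc._msdcs.{domain}")
    # Domain-wide record: any DC in the domain.
    names.append(f"_ldap._tcp.dc._msdcs.{domain}")
    return names

def order_srv(records, rng=random):
    """Order (priority, weight, port, host) tuples for connection attempts.

    Lowest priority value first; within one priority, hosts are drawn
    at random in proportion to their weight (simplified RFC 2782).
    """
    ordered = []
    for prio in sorted({r[0] for r in records}):
        group = [r for r in records if r[0] == prio]
        while group:
            total = sum(r[1] for r in group)
            pick = rng.uniform(0, total)
            running = 0
            for r in group:
                running += r[1]
                if pick <= running:
                    break
            ordered.append(r)
            group.remove(r)
    return ordered

# Hypothetical answers a resolver might hand back for the names above.
sample = [(10, 100, 389, "dc2.na.company.com"),
          (0, 100, 389, "dc1.na.company.com")]
print(dc_srv_names("na.company.com", site="Detroit"))
print([host for _, _, _, host in order_srv(sample)])
```

An app that walks this list and falls through to the next host on failure no longer cares which individual DC is up.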
Poor Handling of Multiple Domain / Complex Domain Configurations
What is an Active Directory forest supposed to look like? The default answer for most companies, sometimes even including MSFT, is one single domain in the forest. Reality, though, is much different. As much as many companies, especially nowadays, would like to have a single domain forest for their corporate AD, they often have an AD forest that was built up from several geographically based NT4 domains that they had deployed before AD came along. Possibly even having more than one or two domains for a single geographic area. This could be for a variety of reasons but the most common I have encountered are 1) NT4 SAM DB size limitations 2) Corporate HQ gets its own domain and everyone else in the geographic region gets another 3) Business user domain versus manufacturing domains.
These domains are often combined in a variety of configurations. The most common I have experienced is the empty forest root with child domains for each of the geographic domains. This is so common for me that I am in shock if I don’t see it in larger multinational companies. You will have company.com, na.company.com (or am.company.com or americas.company.com or nam.company.com or northam.company.com or amer.company.com, etc), eu.company.com (or emea.company.com or ema.company.com or europe.company.com, etc), ap.company.com (or apj.company.com or apac.company.com) and possibly a few more depending on the company. But in general, you have Americas, Europe/Middle East, and Asia Pacific Rim all as children of the forest root. I have also encountered a case where someone actually used grandchildren domains which is, for me, unusual to see. You will also occasionally run into environments where different domain trees are deployed. Out of all of these, the different domain trees are probably the worst for normal MSFT based apps and scripts. Most scripts do not take this case into account, and if you try to run generic scripts that make assumptions instead of properly checking the RootDSE for the namespace, they get confused and can’t find things they should be finding.
People writing tools for alternate LDAP directories that get “ported” to work on AD tend to make additional mistakes that hurt you even when you have a nice pretty namespace layout (as defined by having a nice clean domain hierarchy like parent and 3 children). The issue is that they have you point at a single DC (aka LDAP server) and expect to be able to see the whole LDAP hierarchy for the forest. That can work, but only if you point at a Global Catalog and specify the GC port… But wait, now you don’t have access to all of the attributes and if the tool writes to AD, that offers additional issues. Add in multiple domain trees and you can totally freak out those alternate directory tools.
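As a rough illustration of checking the RootDSE instead of assuming, the sketch below takes RootDSE attributes that have already been read (how you read them depends on your LDAP library) and works out the domain, the forest root, and whether the two even live in the same domain tree. The helper names are made up, and the DN parsing is deliberately naive: it only handles plain DC= components, which is all a domain naming context contains.

```python
def dn_to_dns(dn):
    """Turn 'DC=na,DC=company,DC=com' into 'na.company.com'."""
    parts = [p.strip() for p in dn.split(",")]
    labels = [p[3:] for p in parts if p.upper().startswith("DC=")]
    return ".".join(labels)

def describe_namespace(rootdse):
    """Summarise where the DC we are talking to sits in the forest.

    `rootdse` is a dict holding the defaultNamingContext and
    rootDomainNamingContext values read from the RootDSE.
    """
    domain = dn_to_dns(rootdse["defaultNamingContext"]).lower()
    root = dn_to_dns(rootdse["rootDomainNamingContext"]).lower()
    return {
        "domain": domain,
        "forest_root": root,
        "is_forest_root": domain == root,
        # False means the domain is in a different tree than the root.
        "same_tree": domain == root or domain.endswith("." + root),
    }

# Sample values for a child domain under a forest root.
print(describe_namespace({
    "defaultNamingContext": "DC=na,DC=company,DC=com",
    "rootDomainNamingContext": "DC=company,DC=com",
}))
```

A script that starts from these values rather than from a hard-coded base DN at least knows when it has landed in a child domain or a separate tree.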
Lack of Paging/Ranging Support
This one is another one of the “If your LDAP app can’t do this, your app really isn’t Active Directory Aware” items in my book. I hate when I see emails or forum posts or any mechanism of asking, hey, how do I change the page size or default range size in Active Directory because I have an app that doesn’t do paging/ranging. My response usually has two parts: 1) “If you increase it now, what happens later when that isn’t enough? You raise it again? Will your app be smart enough to know it has hit the limit?” 2) Go beat on the vendor to fix their app.
If you are a vendor and your app doesn’t page or range properly, you can be 100% sure that if one of your customers or possible customers asks me what I think about it, I will say they need to run as fast as they can from your app because you do not know enough about Active Directory to be messing with it.
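For the ranging half, the logic an app needs is small. When a multi-valued attribute is too big for one response, the DC hands it back under a name like member;range=0-1499, and the last chunk comes back with * as the upper bound. Here is a minimal sketch of the "what do I ask for next" step; the function name is made up, and the surrounding search loop is left out because it depends on your LDAP library.

```python
import re

# Matches attribute names such as 'member;range=0-1499' or 'member;range=1500-*'.
_RANGE = re.compile(
    r"^(?P<attr>[^;]+);range=(?P<low>\d+)-(?P<high>\d+|\*)$",
    re.IGNORECASE,
)

def next_range_request(returned_name):
    """Return the attribute description to request next, or None if done.

    `returned_name` is the attribute name exactly as the DC returned it.
    """
    m = _RANGE.match(returned_name)
    if m is None:
        return None  # not ranged: the full value set already arrived
    if m.group("high") == "*":
        return None  # final chunk of a ranged retrieval
    start = int(m.group("high")) + 1
    return f"{m.group('attr')};range={start}-*"

print(next_range_request("member;range=0-1499"))   # member;range=1500-*
print(next_range_request("member;range=1500-*"))   # None
```

Because the next request is derived from what the DC actually returned, the app keeps working no matter what the range limit is set to. Paging of search results is the same idea one level up, using the paged results control that any decent LDAP library exposes.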
LDAP Signing/Sealing
LDAP Signing and Sealing is really cool, for security. It sucks for when you are trying to troubleshoot an application that doesn’t have good logging and you want to use a network sniffer to figure out what is happening. Application developers, yes use LDAP Signing and Sealing, but please give an easy to find and use method to disable it at least for short periods of time so when your application isn’t doing what is expected, we have a method to troubleshoot it that doesn’t entirely depend on what you foresaw you needed to do for logging. Hey if it only disables for 24 hours and then resets itself or maybe it resets itself between every launch of the app, whatever, that is fine.
LDAP Logging
This is one I fall down on myself, logging of LDAP transactions. Active Directory’s logging is traditionally pretty weak. Many LDAP Servers will log darn near everything you could possibly want from the requests the server will handle to a text file, Active Directory doesn’t. Now there is some stuff in the Event Tracing functionality that is more difficult than just saying, log this stuff to a text file, but I haven’t yet had time to dig into it. You can learn more from Brandon (http://bsonposh.com/archives/347) and Tony (http://www.activedir.org/Articles/tabid/54/articleType/ArticleView/articleId/49/Default.aspx).
Inefficient and/or Incorrect Queries
This one really irks me when I see it from big time vendors. While paging and ranging and domain controller location and even complex forest structures take some decent knowledge to figure out, an ok filter really shouldn’t be that difficult. A vendor who doesn’t produce a decent query for AD really should invest some time and money into figuring it out. At the very least read the following document – http://msdn.microsoft.com/en-us/library/ms808539.aspx. It’s bad enough to find this in apps when looking at network traces (and yes even MSFT has screwed this up themselves) because you can maybe hope no one will go look and see what you screwed up, but when you actually publish this in a support document you publish on the web… sheesh. As a recent example, I had to go through the Google Apps Directory Sync for Postini Services document this last week, I ran into a couple of bad queries right off:
All users, but exclude disabled users:
(&(&(objectclass=user)(objectcategory=person))(!(userAccountControl=514)))
This isn’t inefficient, it is just incorrect. If you are checking for disabled users, you need to perform a bit-wise check of userAccountControl, not check it for a specific value.
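To spell out why the equality test is wrong: 514 is just one of many values a disabled account can have (512 for a normal account plus 2 for disabled), and any other flag changes the number. A bit-wise test uses the LDAP_MATCHING_RULE_BIT_AND matching rule (OID 1.2.840.113556.1.4.803) against the disabled bit. A minimal sketch, with made-up helper names:

```python
ACCOUNTDISABLE = 0x0002  # userAccountControl flag for a disabled account
LDAP_MATCHING_RULE_BIT_AND = "1.2.840.113556.1.4.803"

def enabled_users_filter():
    """LDAP filter for user objects whose disabled bit is NOT set."""
    return (
        "(&(objectCategory=person)(objectClass=user)"
        f"(!(userAccountControl:{LDAP_MATCHING_RULE_BIT_AND}:={ACCOUNTDISABLE})))"
    )

def is_disabled(uac):
    """Client-side equivalent of the bit-wise check."""
    return bool(uac & ACCOUNTDISABLE)

print(enabled_users_filter())
# 514 = normal account + disabled; 66050 adds "password never expires".
# A filter on (userAccountControl=514) would miss the second one.
print(is_disabled(514), is_disabled(66050), is_disabled(512))
```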
Active Directory LDAP: All users
(objectClass=person)
Unless you have indexed objectClass or you have deployed Windows Server 2008, this will be inefficient. Also it will return more than users… such as contacts, trusts, etc.
Active Directory LDAP: All email users (alternate)
(&(objectclass=user)(objectcategory=person))
This one is incorrect for what they are looking for, it is an efficient query for all users… Regardless of email status. Will still get trusts btw.
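One way to tighten these up, sketched below, is to filter on sAMAccountType, which is indexed and separates normal user accounts (805306368) from trust accounts, computers, and contacts, and to add a presence test on mail when you only want users with an email address. Treating “email user” as “has a mail attribute” is my assumption about what the document was after, and the helper name is made up.

```python
SAM_NORMAL_USER_ACCOUNT = 805306368  # 0x30000000, normal user accounts only

def user_filter(require_mail=False):
    """Build an LDAP filter for normal user accounts.

    With require_mail=True, only users that have a mail attribute match.
    """
    clauses = [f"(sAMAccountType={SAM_NORMAL_USER_ACCOUNT})"]
    if require_mail:
        clauses.append("(mail=*)")
    if len(clauses) == 1:
        return clauses[0]
    return "(&" + "".join(clauses) + ")"

print(user_filter())                   # all normal user accounts
print(user_filter(require_mail=True))  # only those with a mail attribute
```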
If Google is looking for good AD people… contact me… ;o)
Inefficient ADSI Use
This last one may confuse people. Folks may think… “Hey it’s a Microsoft framework, doesn’t it just work efficiently automatically?” No. This impacts people using ADSI directly or the .NET stuff that thunks down to ADSI, i.e. the stuff under System.DirectoryServices (excluding S.DS.Protocols). My recommendation here is if you aren’t completely positive about what ADSI is doing with what you are telling it, get out a network sniffer and look at the traffic for your operations and see if you can clean it up[1]. Heck, do it even if you think you know exactly what it is doing. You could very likely find parts of your app that are incredibly inefficient and slow and unnecessarily beating up the DCs. One quick one here is app developers not taking advantage of IADs::GetInfoEx to pull the actual individual attributes needed versus the whole object. When and why is this important? If you have an app running on one machine that is dealing with one-off AD objects occasionally, then the importance goes down a little as it won’t be generating a lot of load on the DC or network, though personally I think you should still only pull what you need. If it is an app running on hundreds of thousands of machines and you are pulling the whole object instead of just the attributes you need, then you are pulling way too much data from the DC. Or if you have the app running on one machine but pulling lots of objects, again, you are pulling way too much data from the DC.
joe
[1] I would love for the Microsoft Developers on the Exchange team to pay attention to this point because I was looking at an Exchange 2010 network trace the other day and the best thing that I had to say was that whatever I was looking at was an unnecessary network and AD utilization pig. Every object that it pulled it pulled twice. The first time to see if it existed with just the objectClass, the next time it pulled the whole object, for every object.
[2] UPDATE (2010-12-09): I was looking for something and I found what appears to look like good news for JNDI. It seems that in the JNDI 1.5 docs they discuss automatic discovery of LDAP Service. (http://download.oracle.com/javase/1.5.0/docs/guide/jndi/jndi-ldap.html) It doesn’t look like they look at site specific records, but at least they are looking at LDAP SRV records which is further along than they were previously that I recall.
Nice post. I find this one interesting and relevant to AD: “ldap paging” http://support.microsoft.com/kb/951581 . I’m quite positive that the majority of developers out there don’t include a paging trigger in their app. It’s either always on or always off.
Pat: Thanks for that link and info. I recall hearing about that some time ago but never saw the KB article. I may try to write up a blog entry around that because I do it in AdFind and all of my LDAP apps with paging always on as well unless I know for a fact I am only returning a couple of objects. I need to work up some generic mechanism to detect and handle that better.