While working on some posts about writing code for leveraging Active Directory I realized that a very weak point I have run into with many (perhaps most) apps is the logging, particularly for use in troubleshooting and/or debugging of issues. I don’t care how good of a coder you are (or think you are), your code will eventually be smack dab in the middle of a problem and someone is going to have to troubleshoot it. So just assume that that is going to happen right from the jump and do something about it as you write your very first lines of code and you won’t be called “a moron coder” when someone runs into the issue that needs that debugging… NB: I myself have been called a moron coder, often by myself, sometimes multiple times in a single sitting.
Personally, I have always been a fan of plain text file logging because it is simple to implement and simple to use requiring no special viewers, is easily searchable, etc but in the end, ANY logging is better than no logging. If instead of text logging you would rather use binary or XML log entries and instead of a text file you would rather send logging events to SYSLOG or the Windows Event Log or Windows Debug Stream or a sweet MariaDB SQL Server or all of them or something else entirely please do so if it gets you to implement logging. There is a caveat in that the “something else” shouldn’t be console logging unless your console is a teletype of some sort because otherwise the logs will just scroll off the screen and never be seen again. That makes it too easy to lose salient details that way. Actually, on second thought, even if you do have a teletype console available don’t use console logging as your sole logging method. I used to have to look through reams of paper with console logging entries and I regularly missed things in it.
I realize that logging can impact performance so if you want to allow the consumer to control the impact they are experiencing but still allow them to get data that is useful for troubleshooting set up logging levels that can be configured such that the higher the logging level that is configured the more verbose the logging becomes. Visualize a sliding dial, less logging/higher performance to super verbose logging/lower performance. There should probably be at least 3-5 levels of verbosity from normal regular running and tracking high level events to full on debugging tracking everything that is happening. If you can figure out a way to do it, you could even log the performance hit currently being experienced because the level of logging. That would require some baselining in any given environment but could be done such that perhaps you occasionally run a “transaction”, whatever that is for your app, raw without logging and then run the same “transaction” with the logging and see what the delta is. The more you do that the closer you can be to defining the performance hit involved.
For AD interaction logging in particular, when configured for the most verbose logging you should log entries for the whole process to select the domain controller such as what DC was used to figure out the bootstrap information, anything read in from config files/registry/etc, DNS results, possible DCs, selected DCs and some of the information that is used to help you determine which DC to use (like performance, capabilities of the DCs, etc). You want to track all connection attempts, the parameters, controls, etc and their results (but don’t put in clear text passwords!) and performance of same. All queries again including parameters, controls, etc and the results and ta da… Performance of same. All updates again including parameters, controls, etc and the results and performance of same. You really want to include performance information (how long it takes to do various things – response time for connections and queries are usually quite useful), etc because some of the most common issues as you scale up environments is that something is “running slow” and you don’t want to make people have to break their arms trying to find convoluted ways to determine performance issues. If you are using multiple threads/processes then they should be uniquely identified in the logs as well so you can track the various streams that are likely occurring simultaneously. Oh, time stamps, lots and lots of time stamps. Perhaps the machine name the code is actually running on especially if you could have a pool of application servers running and want to consolidate the logging somehow. Pretty much ANYTHING that would help you troubleshoot issues with connecting and/or returning information and/or updating information in Active Directory or any other LDAP directory.
Also, and this is a bit of Psych 101, tie in non-configurable logging to the most verbose performance impacting logging levels that indicates what debug/hardcoding flags are set. Track everything put into the code to use for troubleshooting or debugging that someone may enable and actually forget about and leave running for long periods of time, this could include verbose debug logging levels being configured, hardcoding, etc. I see nothing wrong in stamping the log every 15 minutes, for example, when someone has hardcoded a specific machine or other resource to use when the application supports dynamic resource detection. No one wants to see errors popping up in their logs over and over again so when you do that for the debug configuration items you make it far more likely that someone will notice what is happening and will get it corrected. If there is a good reason for the debug logging then the admin/integrator/dev will understand and be able to properly look past those alert entries in the log for the limited period of time they are debugging and once they are done it will hopefully be so noisy and painful they will fix it. You can also put in regular warning reminders in the logs for perhaps some less than optimal configurations, especially configurations that impact security or performance. For example, a log entry once an hour saying “Hey goofball, you are sending passwords across an LDAP connection with no encryption, start using LDAPS or STARTTLS or LDAP Signing/Sealing!” That little bit could help keep your application from being used insecurely and jeopardizing a company.
This should be able to go unsaid but unfortunately I absolutely know better as I have seen regular occurrences of this rule being broken year over year but as I specifically called out above… Do NOT output clear text security secrets like passwords into log files. If you want to put passwords into the log files then a few basic rules:
1. It should only be secrets/passwords that you already know and control. I.E. Application ID passwords for YOUR application. DO NOT EVER output in any way shape or form the passwords of anyone using your application. You can and should log the IDs, but don’t even think about logging the passwords or even an encrypted form of the passwords. The user passwords should never be available outside of the memory space of the currently running application and it should be in memory only for a very short period of time as well, milliseconds at most, the time taken to authenticate the user.
2. The secrets that are ok for you to log you should encrypt in some way shape or form so a casual glance cannot pick up on them. And when I say encrypt I mean actually encrypt, don’t do something stupid like MIME encode them such that anyone who can grab the text can put it into one of a thousand different pages on the internet to revert back to clear text. Better than encryption is to use a hash. It still isn’t full proof but you can easily create the hash of your password that you think it is and then compare to the hash that is in the log to make sure it is the same.
3. If you ever gain knowledge or have a feeling that your secrets have been exposed to someone they shouldn’t be exposed to immediately change those secrets. Secrets like application / process passwords should be changed frequently, at least anytime someone leaves a team that knows the password and at a minimum annually though more preferably monthly, weekly, or even daily. I have even seen applications that changed their own passwords every 8-12 hours. That last one may be a bit excessive but it really depends on how critical or sensitive the information is that the secrets/passwords are protecting.
To wrap up logging… Produce good logging, you will thank yourself and operations will thank you as well when they are forced to use it. Be clear, be concise, be complete. Think about what information someone may need from your application when it is 2AM and it isn’t doing what it is supposed to be doing and someone needs to figure out what is wrong and they have called you to ask you to fix it. Things WILL go wrong, period. No one writes perfect code, no systems run 100% perfectly 100% of the time. It may not be your system or code that is failing but your code could still be blamed. You can NOT depend on other people, especially the people who support the underlying infrastructure that you use, to be able to tell you why your application is failing, it isn’t their job and very often, if the environment has any serious scale, they have no real capability to help you. It is all on you.
joe
Ditto this. I used to do text based logging but last year I discovered SQL. It is such an amazingly cool store for everything. Worth investing the time to get into it.