joeware - never stop exploring... :)

Information about joeware mixed with wild and crazy opinions...

LDAP Client API Memory Leak

by @ 10:07 pm on 10/6/2006. Filed under tech

A user in one of the newsgroups I frequent ran into an issue with some LDAPS code that was leaking memory on him. You can view the entire thread by clicking here.

The summary is that the user was writing some LDAP API code and submitting one of the LDAP Options, specifically LDAP_OPT_CLIENT_CERTIFICATE session option which allows you to specify a callback for the LDAP Client code to invoke your own code for selecting a client cert to use.

Well the user actually didn’t want anything returned to the server so was setting the callback and the function he registered simply returned FALSE. Here is the minimal code snippet to cause the leak.

#include <windows.h>
#include <ntldap.h>
#include <winldap.h>
#include <schnlsp.h>
#include <stdio.h>

BOOLEAN _cdecl GetClientCertRoutine(void *Connection,
                                    void *trusted_CAs,
                                    void *ppCertificate)
 {
  return (FALSE);
 }

int main(int argc, char* argv[])
 {
  LDAP* ld = NULL;
  INT iRtn = 0;
  INT connectSuccess = 0;
  PCHAR pHost = NULL;
  ULONG version = LDAP_VERSION3;
  SecPkgContext_ConnectionInfo sslInfo;
  LONG lv = 0;

  SecPkgContext_ConnectionInfo *psslInfo;

  pHost = “hostname.domain.com”;

  while (1)
   {
    ld = ldap_sslinit(pHost,LDAP_SSL_PORT,1);
    if (ld == NULL)
     {
      printf( “ldap_sslinit failed with 0x%x.\n”,GetLastError());
      return -1;
     }

    iRtn = ldap_set_option(ld, LDAP_OPT_PROTOCOL_VERSION,(void*)&version);
    if (iRtn != LDAP_SUCCESS)
      goto FatalExit;

    // setting function to avoid client authetication
    iRtn = ldap_set_option(ld, LDAP_OPT_CLIENT_CERTIFICATE, &GetClientCertRoutine);
    if (iRtn != LDAP_SUCCESS)
      goto FatalExit;

    iRtn = ldap_get_option(ld,LDAP_OPT_SSL,(void*)&lv);
    if (iRtn != LDAP_SUCCESS)
      goto FatalExit;
    if ((void*)lv == LDAP_OPT_ON)
      printf(“SSL is enabled\n”);
    else
     {
      iRtn = ldap_set_option(ld,LDAP_OPT_SSL,LDAP_OPT_ON);
      if (iRtn != LDAP_SUCCESS)
        goto FatalExit;
     }

    connectSuccess = ldap_connect(ld, NULL);
    if (connectSuccess == LDAP_SUCCESS)
      printf(“ldap_connect succeeded \n”);
    else
     {
      printf(“ldap_connect failed with 0x%x.\n”,connectSuccess);
      goto FatalExit;
     }

    printf(“Binding …\n”);
    iRtn = ldap_bind_s(ld,NULL,NULL,LDAP_AUTH_NEGOTIATE);
    if (iRtn != LDAP_SUCCESS)
      goto FatalExit;

    goto NormalExit;

    NormalExit:
      if (ld != NULL)
       {
        ldap_unbind_s(ld);
        continue;
       }

    // Cleanup after an error.
    FatalExit:
      if ( ld != NULL )
        ldap_unbind_s(ld);
      printf( “\n\nERROR: 0x%x\n”, iRtn);
      break;
   }
 }

This is a little shorter and slightly different than the code that the user posted so it is quicker to read through. I copied the code and slapped it into a new Borland Developer Studio c++ console project, added the wldap32.lib and compiled it. I fired up good old perfmon and then launched the app so I could select counters for that process. The specific counters I wanted were Process\Private Bytes and Process\Working Set. I then watched the counters and sure enough… There appeared to be a leak.

 

Since the application isn’t allocating memory within the loop that it isn’t deallocating, the counters should be a straight and HORIZONTAL line… While that line is relatively straight, horizontal it is not.

So I started commented out almost all of the code and recompiled and got the straight line that I expected. So the next step was to either un-comment parts and recompile and rerun or un-comment everything and then comment certain pieces until you find the source. Doesn’t matter which way you go really, up to you. In the end I found that if I commented out just this section

    // setting function to avoid client authetication
    iRtn = ldap_set_option(ld, LDAP_OPT_CLIENT_CERTIFICATE, &GetClientCertRoutine);
    if (iRtn != LDAP_SUCCESS)
      goto FatalExit;

The leak was gone. Looking at the call back function (described more below) I saw that one of the parameters was a list of trusted CAs and was dynamically allocated… It was likely the leak was in that section of the LDAP Client API code. So I then broke out my super secret MSDN Premium smart card which gives me access to the Windows Operating Source Code….

Having played in the source quite a bit over the last 3 or 4 years it was fairly easy (a minute or two) to find the location where the user specified function was being called. I then saw what I thought to be the problem,  as I figured the list of trusted CAs was allocated and I didn’t immediately see it deallocated. So I chased a little more and still didn’t find a deallocation. That made me feel even more strongly that it was the culprit. I verified by actually changing the callback function so that it deallocated the list… I then recompiled and reran the test only to see….

 

That is the nice flat line I expected in the first place… A little bit of allocation up front and then nice and flat… Definitely stands out as quite different eh?

So I spent a little time and forwarded the newsgroup posting with my observations written up including the lines of code in the LDAP Client API source that I felt were at fault and sent them off to a very helpful MSFT guy. He dug into it and sure enough, the leak was confirmed and a bug opened on for it (note I was told I could say this). This will be corrected in some future version of the LDAP Client API DLLs though if someone needs a QFE for this, they could open up a support ticket with Microsoft and request a QFE and possibly get it (and this…).

Since this is a problem in the actual LDAP Client API it will most likely impact anything utilizing the API, this means ADSI, .NET S.DS, etc. So if you are using the callback function for the LDAP_OPT_CLIENT_CERTIFICATE option then you need to be aware of this. It shouldn’t be a tremendous issue unless you are binding and unbinding over and over again so most apps shouldn’t have to worry about it. However if you are doing something similar to what is above, you may want to be take care.

Oh, what code did I add to remove the leak? I had to add a couple of items. Basically the definition of the callback function is defined here: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/ldap/ldap/queryclientcert.asp. The definition is

BOOLEAN _cdecl QUERYCLIENTCERT(
  PLDAP Connection,
  PSecPkgContext_IssuerListInfoEx trusted_CAs,
  PCCERT_CONTEXT* ppCertificate
);

SecPkgContext_IssuerListInfoEx is a structure and the dynamically allocated CA blobs are in the aIssuers property.

This memory is freed with FreeContextBuffer which requires you to include security.h.

 

So at the top of the program, add in

#define SECURITY_WIN32 1    // Required for security.h

and then include security.h where the other includes are

#include <security.h>      // Required for FreeContextBuffer()

and then change the callback function, this could be done in a couple of ways:

BOOLEAN _cdecl GetClientCertRoutine(void *Connection,
                                    void *trusted_CAs,
                                    void *ppCertificate)
 {
  SecPkgContext_IssuerListInfoEx *p = (SecPkgContext_IssuerListInfoEx *)trusted_CAs;
  FreeContextBuffer((void*)p->aIssuers);
  return (FALSE);
 }

or

BOOLEAN _cdecl GetClientCertRoutine(void *Connection,
                                    SecPkgContext_IssuerListInfoEx *trusted_CAs,
                                    void *ppCertificate)
 {
  FreeContextBuffer((void*)trusted_CAs->aIssuers);
  return (FALSE);
 }

Whichever floats your boat… I am sure one looks or feels more elegant to you than the other, chose the one you like, just make sure you deallocate the memory.  🙂

 

Certainly I could have figured this out without the OS source code access; for this particular problem the information needed was available in MSDN if you took the leap of faith that the CA list was where the leak was. Quiet honestly, after seeing that the callback function caused the leak and looking at the MSDN info on the callback function parameters I was pretty sure that was, in fact, it. Having source access though certainly helped me move along a little more confidently and definitely helped me to describe exactly where in the LDAP Client API code the issue was for Microsoft. I don’t know how much if any time that could have saved for them but I am sure the developers were happy for direct pointers, I know I would be. I think this is one of those cases where having the ability to see the source (though not recompile it) is quite handy. It is why if MSFT ever opened their source in any large scale, I think doing so in such a fashion that you can look but not modify would be fine. However, I admit that is tough to enforce and quite honestly, I don’t care if every OSS person in the world has a heartattack over MSFT NOT opening their source, MSFT needs to feel comfortable in releasing anything to the open source world and if they don’t, I fully agree with their right not to do so. Though I admit it is useful to have the access – but I feel I earned the access rights I have and continue to try and earn the right. 😉

   joe

Rating 3.00 out of 5

One Response to “LDAP Client API Memory Leak”

  1. Joe Kaplan says:

    Great post! I tried to follow up with a brief discussion of the .NET implications on my blog. I forgot to mention that I don’t think this affects ADSI at all, as there is no way to change client certificate handling with ADSI. You get the default Schannel behavior. It only affects the lower level S.DS.Protocols stacks which directly wraps the affected callback function.

    http://www.joekaplan.net/JoeRichardsFindLDAPClientAPIBugThatAffectsSDSProtocols.aspx

    Go Tigers!

[joeware – never stop exploring… :) is proudly powered by WordPress.]