joeware - never stop exploring... :)

Information about joeware mixed with wild and crazy opinions...

The trust relationship between this workstation and the primary domain failed.

by @ 8:06 pm on 6/5/2012. Filed under tech

[Image: the error dialog showing the message above, with an OK button]

Ever see that message before?

How about on a machine that was previously working just perfectly?

It’s anything but [OK]. You can usually think of some different verbiage that should be in the button under that message… Especially when the machine has special software on it that takes offense when you remove a machine from the domain and then try to rejoin the domain…

So what is the fix when you encounter this problem? You, as the AD fixit person, tell the people complaining that you have no clue why that happened but it must be something they did because if it was AD, there would be a lot of people with the issue. 🙂

And most likely, you are right.

To date, with over 12 years of playing around with Active Directory and 16 years of playing with Windows domains in total, I have yet to positively determine a case where AD broke and caused this issue. I can usually track it down to a bad process: cloning desktops after they were joined to a domain; someone resetting the machine account thinking that “reset account” meant something else; or some goofy software that tries to make your machine a member of multiple domains and screws it up instead. And finally there are the cases where I can’t find anything, but I can see that AD is just fine and the password hasn’t been changed in AD for a week or more, meaning that if AD were the issue, you should have been hurting long before now…

Certainly part of the issue is troubleshooting. It is difficult to get information for this problem; you get that crappy error message and you can look at the date of the last password change in AD on the computer object (and even see it in human readable formats thanks to AdFind) and that is pretty much about it. So in the end, the fix for the problem is to tell the people to disjoin and rejoin the machine to the domain. That will solve it… Again, this gets painful with some apps that don’t like the idea of being dropped out of the domain and then rejoined to the domain. You can probably think of one or two (Microsoft designed and written even) apps like that…

Sidebar: So what is actually wrong? Why can’t the machine talk to the domain properly?

So as you may or may not know, every computer that is joined to a domain has an object in the domain that is called a “computer account”. This computer account is pretty much the same thing as a “user account”, as AD looks at computers as users. There are some special cases, like AD not forcing computers to change their passwords the way it can for a normal user, but realistically there isn’t much more difference than that, apart from the various attributes that can be stacked onto the different object types. So like users, these computer accounts have passwords, and the computer has to know the password for its account in order to “log into” the domain, just like a normal user must log into the domain.

When this broken trust message is displayed, it is either because the computer object has been deleted OR the computer doesn’t agree with Active Directory on what its password is (and it can’t find the post-it note that it wrote the correct password down on). So like any user that forgets their password, AD tells them to go pound sand. Only in this case, instead of seeing an error message of “Hey numpty, that password isn’t correct, try again”, the message is far more dire… The trust relationship has FAILED! DAH DA DUMMMMMMMM! And the men swoon and the women scream… And what is the fix to the failed trust? To divorce the machine from AD and then to remarry them… At least effectively… You have to tell the machine that it is no longer with that pretty Domain X and is now hitting the singles bar WORKGROUP. But then, phew, an admin tells Domain X you want her back and she accepts and the machine gets to join back up with pretty Domain X.

How is that accomplished? Well you reset the machine account in AD (or if you are silly you actually delete the machine account, then recreate it) which results in a machine account with a password that is known to the domain and to the machine and then you tell the machine to disjoin from the domain and then rejoin the domain. When the machine rejoins it uses this already well known password and voila, it can log on again. <And everyone cheers and throws bird seed>.

And now you are saying, great, so what, I have learned absolutely nothing…

Some time ago, when digging through the new protocol docs that Microsoft was required to document (http://msdn.microsoft.com/en-us/library/hh128055(v=prot.13).aspx), I found some interesting discussion about computer passwords. That got me digging into the API more, and also into the Windows source, where I learned where the member computers store the current password they have saved for their domain account (i.e. the machine’s form of post-it note). I wrote some basic code and was able to display the information stored in the password slots but, even cooler, I was able to display when the password (and previous password) were last set… Ah finally, additional debugging info. Now I also would really have liked to have gotten a good clear text password so I could say, hey, if we just change the password in AD to this same password, we will be good again as the machine and Active Directory will agree with each other on what the password is. Unfortunately, something else came up and I stopped looking at it and got dragged into some other issues and promptly forgot all about that line of research for some time…

Then recently a friend of mine and former Microsoft Directory Services MVP (which is how I met him years and years ago) who will go nameless to protect the guilty was complaining about some issues he was seeing in an environment he works in around machines that were failing their trusts. This guy is a bright guy, good photographer, and also knows a fair bit about Espresso I have learned; we shall call him Rich for lack of a better name. Well Rich starts pointing out to me that some folks in his account are having this issue and he has given them the standard AD response but “Gosh darn it joe… isn’t there a better answer?” I asked how many machines? He responded it was some smallish amount, like tens of machines out of tens of thousands of machines… I said, that isn’t even a rounding error; no, there isn’t a better answer; it’s their fault. 🙂

But then I thought about it and recalled my previous experiments and said to Rich, “Hey, I was working on something to extract the machine account passwords from machines previously, I don’t recall where I was on it, but let me check it out, we can probably do something with it.” and he thought that would be pretty cool.

So I dug into my source code and saw where I got stuck and spent some nights walking through the Microsoft source code and completely failed to determine how to get nice clean clear text passwords out. But then I thought… who cares? Does it even matter? The old proverb… If you can’t solve the problem, change the problem.

So I changed my direction a little and started working on a tool that could be used both to get some troubleshooting information as well as actually set the password on the machine to something that AD and the machine could agree upon. Specifically, the default of the tool is to set the password to the default used by ADUC when you tell it to reset a computer account (though I let you specify a different password if you want). Once the computer account has been reset in AD and on the machine, you can simply force a secure channel reset (via NLTEST /SC_RESET:DOMAIN) and the machine should reconnect fine. In fact, I have also built that secure channel reset capability into the tool so you don’t have to add an extra step. Voila, the machine and AD are trusting each other again. Sort of like what a great marriage counselor could do I guess, only there are no lingering doubts and silent internal screams of “I HATE YOU!” afterwards. 😉
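As a rough illustration of what “the default used by ADUC” looks like: the sample run later in this post shows the tool setting the password of MEMB1-K8 to memb1-k8, i.e. the computer name in lowercase. The Python sketch below is not the tool’s code. The function name is hypothetical, and the 14-character truncation is an assumption about the reset default, not something stated in this post.

```python
def default_reset_password(computer_name, max_length=14):
    """Guess the password a computer account gets after a reset.

    Hypothetical helper. The lowercase rule matches the sample run in this
    post (MEMB1-K8 -> memb1-k8); the max_length of 14 is an assumption.
    """
    # Accept either the machine name or the sAMAccountName (with trailing $).
    name = computer_name.rstrip("$")
    return name.lower()[:max_length]


if __name__ == "__main__":
    print(default_reset_password("MEMB1-K8"))
```

This also shows why a reset account should not be left in that state for long: anyone who knows the machine name can guess the password.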

The utility isn’t completely done; I am slowly adding new features to it in the spare time I may or may not have in the evenings after work (doesn’t that make some/any company want to pay me millions to have me around to just dream crap up like this???). The most recent piece I added this last weekend is for those security conscious people that say… “Wait a minute, so you set that password to a known password and now the machine and AD are talking happily… but I don’t like the idea that the machine account now has a well-known password…” And to that I say, ah good thinking, that is a bad idea isn’t it? Good thing I thought of it first. 😉 So there is an updatepwd option that tells the machine to kick LSASS into performing its normal password update routine. That way the weak password that originally got it talking to AD is no longer there and instead it now has the default strength computer password.

Once I put that last piece in I decided I should simplify the switches and allow a single switch to set the password to the ADUC default, reset the secure channel, then update the password to something secure. And for that I used a very simple, and hopefully intuitive, switch… /fix.

 

Here is a sample run of the whole process…

Pretend you saw the dialog box above at the top of the post and you have finished swearing. You have moved forward and logged into a machine that can still talk to the domain with an ID that has the ability to reset the password of the untrusted machine account and into the machine itself with a local administrator ID[1].

You open dsa.msc (ADUC aka Active Directory Users and Computers) on the machine that can talk to the domain and you find the computer account in question and then you right click and select actions and then reset account.

You then open an admin command prompt on the machine that is hurt and this is what it will look like…

Yep this trust isn’t happy… NLTEST with both QUERY and RESET say so…

C:\temp>nltest /sc_query:test
Flags: 0
Trusted DC Name
Trusted DC Connection Status Status = 5 0x5 ERROR_ACCESS_DENIED
The command completed successfully

C:\temp>nltest /sc_reset:test
I_NetLogonControl failed: Status = 5 0x5 ERROR_ACCESS_DENIED
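If you want to check a pile of machines rather than eyeball each one, the status line in that NLTEST output is easy to pick apart. A minimal Python sketch, assuming the output has been captured as text and looks like the samples in this post; the function names are hypothetical and only the `Status = <code> <hex> <name>` pattern shown above is relied on.

```python
import re

# Matches the status that nltest prints, for example
# "Trusted DC Connection Status Status = 5 0x5 ERROR_ACCESS_DENIED" or
# "I_NetLogonControl failed: Status = 5 0x5 ERROR_ACCESS_DENIED".
STATUS_RE = re.compile(r"Status = (\d+) (0x[0-9A-Fa-f]+) (\S+)")


def parse_nltest_status(output):
    """Return (code, name) from captured nltest output, or None if absent."""
    match = STATUS_RE.search(output)
    if match is None:
        return None
    return int(match.group(1)), match.group(3)


def secure_channel_ok(output):
    """Hypothetical helper: treat status code 0 as a healthy secure channel."""
    status = parse_nltest_status(output)
    return status is not None and status[0] == 0


if __name__ == "__main__":
    broken = "I_NetLogonControl failed: Status = 5 0x5 ERROR_ACCESS_DENIED"
    print(parse_nltest_status(broken), secure_channel_ok(broken))
```

Note that the sketch looks at the status code, not at the trailing “The command completed successfully” line, since that line appears even in the failing query above.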

 

So what does my new utility, MachinePwd, say about the password age… See below… Note that AD shows you a different time stamp. Depending on how well your domain is maintaining time, you will have a good idea how far out of sync they are. My example times below are forced and arbitrary. In real life I expect you will see one or the other timestamp has a much older value. If AD has the older time, something happened on the machine. If AD has the newer time, someone probably “accidentally” reset the computer account.
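That rule of thumb can be written down as a small comparison. The Python sketch below is only an illustration: the function name, the five-minute tolerance, and the “agree” case are assumptions added for the example, and it assumes both timestamps have already been expressed in the same time zone.

```python
from datetime import datetime, timedelta

# Timestamp layout as printed by MachinePwd and AdFind in this post.
STAMP_FORMAT = "%Y/%m/%d-%H:%M:%S"


def diagnose(machine_stamp, ad_stamp, tolerance=timedelta(minutes=5)):
    """Compare the machine's password timestamp with pwdLastSet from AD.

    Hypothetical helper; both stamps must be in the same time zone.
    """
    machine_time = datetime.strptime(machine_stamp, STAMP_FORMAT)
    ad_time = datetime.strptime(ad_stamp, STAMP_FORMAT)
    if abs(ad_time - machine_time) <= tolerance:
        return "timestamps agree"
    if ad_time < machine_time:
        return "AD is older: something happened on the machine"
    return "AD is newer: the computer account was probably reset in AD"


if __name__ == "__main__":
    print(diagnose("2012/06/04-20:40:47", "2012/06/04-21:42:03"))
```

Fed the two timestamps from the sample run in this post, it reports that AD is newer, which fits a password that was deliberately changed in AD.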

One thing I learned is that the standard processes update both the current and previous passwords every time they are told to update the current password. I don’t know why they do that but I saw it clearly in the source code. One guess is that someone didn’t realize that the Secrets portion of the registry would maintain the previous secret string for you automatically, you don’t have to mess with it. If you do, and they do, well then the time stamp gets dorked. Alternately, there is some other reason that I couldn’t divine.

Notice the string “LocalTime”… that is there because I am writing this code in Visual Studio 2010 and I haven’t ported all of my previous libs over to it so the cool stuff I had for building time strings including time zone info isn’t available yet and I didn’t want to go through and whip up a temp TZ processing routine. You know what your local time is, I don’t have to tell you. I expect future versions will get it as I get more modules ported.

Anyway, what the machine says…

C:\temp>machinepwd

MachinePwd V01.00.00cpp Joe Richards (joe@joeware.net)  May 2012

Determining Machine Name…
MachineName: MEMB1-K8
Determining Domain Membership…
Domain Name: TEST
Opening policy…
Determining primary domain DNS info from machine policy…
DNS Domain Name: test.loc
DNS Forest Name: test.loc
Retrieving Machine Password Information…
Current  Password TimeStamp: 2012/06/04-20:40:47 LocalTime
Previous Password TimeStamp: 2012/06/04-20:40:47 LocalTime
Command completed successfully.

 

What AD says… I can tell you that I purposely changed the password in AD to break the trust between the machine and the domain.

C:\temp>adfind -default -f samaccountname=memb1-k8$ pwdlastset -tdcs

AdFind V01.45.00cpp Joe Richards (joe@joeware.net) March 2011

Using server: TEST-DC1.test.loc:389
Directory: Windows Server 2003
Base DN: DC=test,DC=loc

dn:CN=MEMB1-K8,OU=LockoutOU,DC=test,DC=loc
>pwdLastSet: 2012/06/04-21:42:03 Eastern Daylight Time

1 Objects returned
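For anyone reading pwdLastSet without the -tdcs decoding: the raw attribute is a large integer counting 100-nanosecond intervals since 1601-01-01 UTC (the Windows FILETIME convention). A minimal Python sketch of the conversion follows; the function name is hypothetical and the result is in UTC, not the local time zone that AdFind prints above.

```python
from datetime import datetime, timedelta, timezone

# Start of the Windows FILETIME epoch.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_utc(raw):
    """Convert a raw pwdLastSet value (100-ns ticks since 1601) to UTC."""
    # Ten ticks of 100 ns make one microsecond; sub-microsecond detail is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=int(raw) // 10)


if __name__ == "__main__":
    # 116444736000000000 ticks is the Unix epoch, a handy sanity check.
    print(filetime_to_utc("116444736000000000").isoformat())
```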

 

So we have the current info, let’s fix the secure channel (trust)…

C:\temp>machinepwd /fix

MachinePwd V01.00.00cpp Joe Richards (joe@joeware.net)  May 2012

Determining Machine Name…
MachineName: MEMB1-K8
Determining Domain Membership…
Domain Name: TEST
Opening policy…
Determining primary domain DNS info from machine policy…
DNS Domain Name: test.loc
DNS Forest Name: test.loc
Retrieving Machine Password Information…
Current  Password TimeStamp: 2012/06/04-20:40:47 LocalTime
Previous Password TimeStamp: 2012/06/04-20:40:47 LocalTime
Setting Machine Password to memb1-k8…
Set Machine Password.
Checking Secure Channel for test.loc…
Secure Channel is not established, error 0…
Resetting Secure Channel for test.loc…
Secure Channel is established with \\TEST-DC1.test.loc.
Request LSASS to securely update local machine password…
Machine password successfully updated.
Retrieving Machine Password Information…
Current  Password TimeStamp: 2012/06/04-21:46:18 LocalTime
Previous Password TimeStamp: 2012/06/04-21:46:18 LocalTime
Command completed successfully.

 

And let’s take a look at what NLTEST says now…

C:\temp>nltest /sc_query:test
Flags: 30 HAS_IP  HAS_TIMESERV
Trusted DC Name \\TEST-DC1.test.loc
Trusted DC Connection Status Status = 0 0x0 NERR_Success
The command completed successfully

If you so choose, you don’t have to use /fix. You can instead set a default (or other) password and then manually use nltest to reset the secure channel, or you can use the tool to tell the machine to change its current password (assuming there is a good secure channel in place already).

I expect to release the tool within the next few days. If you have any thoughts, leave a comment or send me an email. 🙂

[UPDATE]: You can find info on the new tool and download info here –> http://blog.joeware.net/2012/06/07/2513/

 

joe

 

[1] You can actually use the broken trust member to do all of the work if you use something other than ADUC that knows how to talk to AD without an established trust (say like AdMod).


25 Responses to “The trust relationship between this workstation and the primary domain failed.”

  1. Daniel says:

    Just wanted to say Thank You for the wonderful read. I “couldn’t put it down.” I’ve always wondered what the problem/resolution was for this. I’ve seen it a handful of times in our environment and we always disjoined/rejoined to resolve, because the troubleshooting was exhausting. The read was more exciting than the fix, for me. Thanks again, Joe, love these thorough posts.

  2. Dave S says:

    “You then open an admin command prompt on the machine that is hurt”

    This can be a problem if all of your admin access on that machine was through your domain login.

    Let’s say you joined the machine months/years ago and long since forgot the local administrator password, so getting local admin can be very difficult. This is especially difficult if BitLocker is enabled, so any offline local administrator reset tool won’t work.

    A neat trick is to unplug the machine from the network but still use your network login. It doesn’t care that the trust channel is broken. Since there is no network it won’t try to contact the DC. It will use your old cached password and you will get logged in. This assumes your domain account is in the local administrators group. Now you have local admin again to run your process.

  3. Mike Kline says:

    Wow joe, this is going to be an immediate download and I’m guessing will become one of the most popular tools you have released because this is such a common problem. Thanks as usual for giving the community these great tools…for free.

    For those reading comments…remember there is a tip jar 🙂

  4. Mike Kline says:

    In addition one thing I’d love to see in future versions is more verbose information in the error to help people identify root cause. I agree with Daniel many times it would be easier just to join it again (especially for help desk folks just needing to get the users back up again)

    Thanks again

    Mike

  5. Fred Woodbridge says:

    I love you, man. Thanks for this.

  6. Tony Murray says:

    Thanks Joe. Awesome job!

    I fully expect Microsoft to include the corresponding Powershell cmdlet in Windows 9.

    Tony

  7. Good read AND good utility, joe.

    Thanks again, as always. (And thanks, unnamed friend)

  8. Matthew Huxtable says:

    Awesome, Joe! Fantastic stuff. Thanks (as always) for your time investment to further benefit the community.

    -Matt

  9. joe says:

    Thanks for all of the feedback, yes, just disjoining/rejoining was previously the fastest solution, now this should beat that out. The work at the AD level is the same either way, the work on the machine has two fewer reboots (leave domain, reboot, join domain, reboot) with this utility. 😉

    Also, something that I thought would be good is to actually output the OLD information before stomping over it with the fix… That way you can just do the /FIX option but still have something to take away with you to think about. 🙂

  10. joe says:

    I heard from an old friend that they have been seeing this problem caused by the Windows Restore functionality as well. I.E. You restore a machine to a previous point and if there has been a password change in the meanwhile… you are now broken.

  11. SuperGumby says:

    There’s a MUCH easier way to reset the trust.
    rt-click ‘computer’, properties, on Win7 in the domain/workgroup area click ‘change settings’ (XP/Vista has similar) and run the ‘Network’ wiz, just confirm the current domain settings and when prompted supply credentials that have the right to add a PC to the domain.
    You will be prompted ‘A computer account already exists for This_PC, do you wish to use it?’. The ‘secure channel’ will be reset.

    • joe says:

      Do you feel it’s easier because it’s a GUI? What specifically makes it easier?

      Also I don’t think that wizard exists on server OS machines.

      joe

  12. Awinish says:

    The tool is simply incredible in cutting the unnecessary effort due to broken secure channel. Kudos to you & your tools.

  13. MikeC says:

    Thank you, Thank you, Thank you! This will make our lives easier as we have had to do the domain re-join process on remote networks in different countries and it’s not a fun task. You are once again giving us a great tool!

  14. Dhiraj says:

    Great Tool, Joe. Thanks as always.

  15. JP Rhodes says:

    I’ve run into a situation where a domain controller itself has become the victim of this issue; an old machine that had the same AD name as the DC was reintroduced to the network, and its name was changed. Resetting the machine account password of the DC isn’t allowed, and now the only login option for the DC is DSRM, where demotion isn’t possible.

    Any thoughts or solutions?

  16. fakey says:

    Hey,

    There’s a PowerShell cmdlet that you can use to supposedly fix the secure channel, Test-ComputerSecureChannel [-reset]

    Haven’t had the opportunity to test it yet, but yeah.

    Great stuff, this article. Though this issue is rather “obscure”, I’d definitely love to know why it happens. I know of a brand new computer (+-1 month) that has had this happen many times already, like, how is that even possible if it’s only related to the machine password (changing)?

  17. Byron Pearce says:

    How can we download the Utility you were working on?

    Happy Holidays,

    Byron Pearce
    bpearce@interthinx.com
    byronwp@yahoo.com

  18. Carsten says:

    Hi Joe you saved my life 🙂
    I had to restore a broken SharePoint web front-end server (W2k3) with an installed sub-CA. I had to change the hardware and restore it from an image. Then Murphy came around the corner and everything… *no comment*. The trust relationship was broken and there were problems with the sub-CA
    because the old admin had not renewed the sub-CA’s main certificate, so the CA could not start, and so on and so on, one problem following the next. Disjoining from AD was not possible due to the installed CA… -> netdom pwd reset did not work (in Wireshark I could see an access denied error at the protocol level, but netdom printed “Domain not found” -> WTF??? Useless error message). Head -> Desk
    But then I found your little tool and now the sun is shining again 🙂 The secure channel is online again, the CA is running again and the SharePoint is also back online.

    Big Thanks 🙂 🙂 🙂

    Carsten

    • joe says:

      That is awesome news Carsten. 🙂 Sounds like it saved you a metric shit ton of work.

  19. rick says:

    Hi Joe,

    I am having this trust issue with a user’s computer account. Would this tool reset the user’s password in AD? Do I run this under the local admin account on her computer? What do the other switches do, such as /forcescreset and /fix?

  20. Patric says:

    Joe, not that I doubted it, but it works like a charm.
    It’s great getting a little more knowledge on those things. Keep up the good work!

    (btw, guess you don’t remember me, but we were in contact during your time at the blue oval employer ;o) and I’ve been using your tools which were deployed there almost everywhere during my daily helpdesk business ;o))

  21. Andy Godfrey says:

    Hi,

    Just wanted to say that this tool gives far better results than any other solution I have found so far, with a 100% success rate (so far)! I thought I would add that this seems to give results where the wizard to rejoin the domain works but only for 24 hours or so. I wasn’t able to get the PowerShell cmdlet Test-ComputerSecureChannel [-reset] to get results either, so this is a godsend! Perhaps more importantly it has highlighted that there is a time discrepancy, which gives me something to look at for a more preventative solution.

    Thanks!

    Andy
