joeware - never stop exploring... :)

Information about joeware mixed with wild and crazy opinions...

How not to handle abuse…

by @ 1:30 pm on 3/4/2006. Filed under tech

As some of you may have found, the joeware downloads, forums, and blog were not working for a majority of the day on Friday. They were down from about 11AM to about 6PM EST, with the forums being down until probably about 9PM.

Basically what happened is that my ISP, www.powweb.com, saw 24 PHP scripts running from my website at once, felt they might be using excessive CPU, and turned off ALL scripting on my website: PHP, perl CGI, you name it. For fun I will walk through the process of what happened. I am extremely disappointed in how POWWEB handled this, which is sad, as I was just about to spin up another website with them that I had already purchased the domain name for. Now I have to sit back and think about that, because this was handled so poorly.

So around 11:15AM EST or so I start receiving emails from various folks indicating that joeware downloads aren’t working. This is something I tend to hear from business folks on a regular basis and is usually an issue with their firewall/proxy admins deciding that the joeware site shouldn’t be accessed or, in some odd cases, that CGI scripts shouldn’t be accessed. I pull up the site to verify that I can still access it and, lo and behold, the site is moving fast but the downloads do indeed throw back an error. So I take an early lunch break so I can sort it all out and start looking into it. Within a few minutes I determine that none of the scripts (neither perl nor PHP) are working.

Since I haven’t modified any of this stuff in a while (I have been too busy with other things) I know that the issue is server related, so I submit a ticket to POWWEB at 11:30AM EST.

I dig a little more into it and don’t find anything, so I decide to look at email to see if there are any notifications of maintenance. No, but I find this entirely too cryptic email that I received just before 11AM EST:

*****************POSSIBLE ABUSER FOUND************************
TIME: 2006-03-03 07:46:18
Machine: clust04-www06.powweb.com
User: xxxx
Proccess count: 24
load averages: 5.92, 4.40, 4.24

To verify this information, try /powweb/bin/user.procs xxxx
xxxx 74677 1.8 0.3 16592 11448 ?? S 7:46AM 0:00.29 php4
xxxx 74697 2.6 0.3 16592 11448 ?? S 7:46AM 0:00.33 php4
xxxx 74643 1.2 0.3 16796 11668 ?? S 7:46AM 0:00.31 php4
xxxx 74632 0.9 0.3 16592 11448 ?? S 7:46AM 0:00.28 php4
xxxx 74686 1.9 0.3 16876 11788 ?? S 7:46AM 0:00.31 php4
xxxx 74690 2.1 0.3 16796 11668 ?? S 7:46AM 0:00.31 php4
xxxx 74667 1.6 0.3 16796 11668 ?? S 7:46AM 0:00.33 php4
xxxx 74706 2.3 0.3 16592 11448 ?? S 7:46AM 0:00.21 php4
xxxx 74616 0.6 0.3 16796 11668 ?? S 7:45AM 0:00.30 php4
xxxx 74685 1.9 0.3 16796 11668 ?? S 7:46AM 0:00.29 php4
xxxx 74676 1.8 0.3 16796 11668 ?? S 7:46AM 0:00.30 php4
xxxx 74681 1.9 0.3 16796 11668 ?? S 7:46AM 0:00.30 php4
xxxx 74691 2.3 0.3 16796 11668 ?? S 7:46AM 0:00.28 php4
xxxx 74647 1.4 0.3 16592 11448 ?? S 7:46AM 0:00.29 php4
xxxx 74635 1.1 0.3 16796 11668 ?? S 7:46AM 0:00.31 php4
xxxx 74639 0.9 0.3 16592 11448 ?? S 7:46AM 0:00.29 php4
xxxx 74673 1.7 0.3 16592 11448 ?? S 7:46AM 0:00.31 php4
xxxx 74678 1.7 0.3 16796 11668 ?? S 7:46AM 0:00.29 php4
xxxx 74708 2.1 0.3 16796 11668 ?? S 7:46AM 0:00.18 php4
xxxx 74689 2.0 0.3 16796 11668 ?? S 7:46AM 0:00.31 php4
xxxx 74644 1.2 0.3 16796 11668 ?? S 7:46AM 0:00.34 php4
xxxx 74646 1.2 0.3 16592 11448 ?? S 7:46AM 0:00.29 php4
xxxx 74622 0.9 0.3 16796 11668 ?? S 7:46AM 0:00.37 php4
xxxx 74645 1.2 0.3 16592 11448 ?? S 7:46AM 0:00.32 php4

Status unchanged in 0.00 minutes
Status message received from 66.152.98.46

Sincerly,
The PowWeb Hosting Team
http://www.powweb.com

As you can see that email is pretty much useless for troubleshooting anything.

SIDEBAR: If you don’t mind the momentary segue, this is about as useful as the dsaccess ldap counters in Exchange… i.e. AD might be having troubles or it might not be, you need to go figure it out by yourself, and no, I won’t give you any real hints of things to look at other than AD. Couple that with the fact that you can’t trust the info ESM tells you for what DCs/GCs are being used (this is a bug I have been fighting with MS over since last summer) and voila, you understand how useless those counters truly are for getting to the root of a problem.

Now what this email is supposed to tell me is that my site is considered abusive and that my scripting has been turned off. I don’t know about you but I don’t read that in the email. The only reason I understand what is going on is because of users (if you paid to access my site or tools you would be called customers) sending me emails saying, hey dude, WTF!?!.

So around 11:40AM EST I responded to the email asking how I am supposed to know what to fix from that email. I then surf through their FAQs looking for info on resolving abuse issues. I finally find that I am supposed to go figure out what the problem is by looking at the logs and then let them know that I fixed it. Unfortunately the only way to access logs by default is through their sitetools, which is all CGI, so it doesn’t work. So now all avenues for me to figure out on my own what is wrong are blocked.

So I take a guess that the problem is with the sucky forum software I have (PHPBB2) and that possibly there is yet another hack out there for it that I have to update it for. So I rename the folder that software is run from and send another note to update my support ticket at POWWEB indicating what was done. I then log into the Support Chat line and ping a support person and find out that they can’t do a thing about it, I have to wait for an abuse analyst.

joe: How do I contact them?
support: Email these two addresses.
joe: What number do I call?
support: You don’t…

So I email the addresses, abuse@powweb.com and sa-abuse@powweb.com, that shouldn’t take too long, the common alias for all external abuse issues….

So I go back to work pretty much having wasted a majority of my lunch break. But hey as my friend Dean would probably say, I won’t hurt for missing that meal or more likely I should miss more meals. 😉

I finally get an email response around 1PM EST and find out it is only that the support analyst who told me to email abuse decided to close out the first ticket I submitted by saying:

Our admin team has disabled your cgi for abuse of server’s resources. An email was sent to you at xxxx@yyyyy.zzz

Please read that email and then reply back to abuse@powweb.com

Please don’t hesitate to email us back if you need further assistance.

GREAT! Quite helpful.


if (bNothing_To_Do)
 {
  DoWork(NULL, 60);   // Do nothing but sleep for 60 minutes
  iSupport_Tickets_Handled_For_The_Day++;
 }

I also get an automated response from ABUSE….

Thank you for contacting PowWeb. The problem you’ve reported requires additional help in our Abuse Department. Your ticket has been escalated and you can expect a response when the issue has been resolved.

Please understand that due to the more complex nature of most of the issues involved in the escalation process, a response may require more than 24 hours. We appreciate your patience as we continue to work toward resolving to your problem.

Also quite helpful… not.

Fast forward to 5PM EST. Still nothing. In the meantime, while I went unanswered all afternoon, I analyzed perf logs for ~15 Domain Controllers heavily used by Exchange and looked at separate LDAP search query latency perf across the pool of DCs over the course of a week, using a custom script I have.

SIDEBAR: Note to people building DCs for LARGE (read tens of thousands of mailboxes) Exchange deployments, regardless of what the MS planning docs say, RAID-1 is NOT acceptable for the disk with the DIT on it. You need spindles, Exchange generates a ton of queries and unless your whole DIT is cached, your disks will get the crap beat out of them. RAID 10 or 0+1 or even RAID-5 is much better for this, the more spindles the better. The customer in question here had changed from RAID-1 to RAID 0+1 and the difference in the counters was night and day.

So at 5PM I write up another email to support@powweb.com and abuse@powweb.com, adding sales@powweb.com, indicating that this support really sucks and they need to fix their process. At the very least they need to have a way to contact the abuse folks directly or have an automated system that web site owners can use to “unlock” the script lock.

Time for dinner and to go out and run some errands.

At approximately 8:20PM EST I get home and see a response from Carlos in POWWEB sales. He sent the email at about 5:40PM EST saying that he had enabled CGI and that if the admins locked me down again I would have to patiently wait for abuse to eventually contact me. Carlos followed up with another email at 8:01PM EST asking if I had made any progress. I responded as soon as I saw the emails, indicating that I would start digging into it. It is amazing how much more you can learn when you can actually get at data, since now that CGI worked I could actually see logs.

Looking at the logs I noticed that I was getting blasted with a ton of requests to the forum software from two IP addresses. Luckily those were the only two IP addresses hitting me for PHP at the time. These IP addresses appear to be web proxies for Lockheed Martin. Specifically:

Name: proxy3b.external.lmco.com
Address: 192.35.35.35

Name: proxy3d.external.lmco.com
Address: 192.91.173.42

Within the two minutes indicated by the original abuse email there are 721 requests for PHP scripts from those two IP addresses, spread across 6 different PHP scripts. No way to narrow it down further.
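For anyone wanting to do the same kind of tally on their own logs, here is a minimal sketch. It assumes the access log is in the Apache Common Log Format, which may not match what POWWEB actually wrote out. The function name `php_hits`, the sample lines, and the 192.0.2.x addresses are all made up for illustration; they are not taken from the real logs.

```python
import re
from collections import Counter
from datetime import datetime, timedelta, timezone

# Assumed Common Log Format:
#   IP - - [03/Mar/2006:07:46:18 -0500] "GET /path HTTP/1.1" 200 123
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "\S+ (\S+)')

def php_hits(lines, start, end):
    """Count .php requests per client IP and per script within [start, end]."""
    by_ip, by_script = Counter(), Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that do not look like the assumed format
        ip, stamp, url = m.groups()
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        path = url.split("?", 1)[0]  # drop the query string
        if start <= when <= end and path.endswith(".php"):
            by_ip[ip] += 1
            by_script[path] += 1
    return by_ip, by_script

# Invented sample lines using placeholder addresses (hypothetical data).
TZ = timezone(timedelta(hours=-5))
SAMPLE = [
    '192.0.2.10 - - [03/Mar/2006:07:45:30 -0500] "GET /forum/viewtopic.php?t=1 HTTP/1.1" 200 5120',
    '192.0.2.11 - - [03/Mar/2006:07:46:02 -0500] "GET /forum/index.php HTTP/1.1" 200 4096',
    '192.0.2.10 - - [03/Mar/2006:07:46:10 -0500] "GET /forum/viewtopic.php?t=2 HTTP/1.1" 200 5120',
    '192.0.2.12 - - [03/Mar/2006:07:46:11 -0500] "GET /downloads/tool.zip HTTP/1.1" 200 90000',
    '192.0.2.10 - - [03/Mar/2006:09:00:00 -0500] "GET /forum/index.php HTTP/1.1" 200 4096',
]

by_ip, by_script = php_hits(
    SAMPLE,
    datetime(2006, 3, 3, 7, 45, 0, tzinfo=TZ),
    datetime(2006, 3, 3, 7, 47, 0, tzinfo=TZ),
)
```

A tally like this shows who was asking and for what, but it still cannot say which of the requests turned into the long running processes, which is the gap the abuse email left open.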

Was it an attack? Was it simply high load because lots of people at Lockheed Martin didn’t have anything to do Friday morning but access my forums? I don’t know. I haven’t looked at the entries closely enough yet to pull out individual session IDs and look at the patterns.

In the meanwhile I lock down the forum software a little more so that requests can’t come in as fast. I briefly consider requiring everyone to register and authenticate to possibly avoid automated attacks.
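The post doesn’t say which forum setting was changed, so the following is only a sketch of the general idea of slowing requests down: a per-client sliding-window limiter. The class name `SlidingWindowLimiter` and its parameters are hypothetical and have nothing to do with the forum software’s own options.

```python
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most max_hits requests per client within any window_seconds span."""

    def __init__(self, max_hits, window_seconds):
        self.max_hits = max_hits
        self.window = window_seconds
        self.hits = defaultdict(deque)  # client -> timestamps of accepted requests

    def allow(self, client, now):
        """Return True if the request is accepted, False if the client is over the limit."""
        q = self.hits[client]
        # Drop timestamps that have aged out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.max_hits:
            return False
        q.append(now)
        return True

# Example: at most 3 requests per client in any 10 second span.
limiter = SlidingWindowLimiter(max_hits=3, window_seconds=10)
```

One catch with keying on the client address: many people behind one corporate web proxy all look like a single client, so a limit tight enough to stop a flood can also lock out legitimate readers coming through that proxy.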

Around 8:50PM I finally get a response (form letter) back from abuse… Again, not extremely helpful.

Thank you for contacting PowWeb Technical Support

There were multiple instances of your script[s] running, which began to cause a burden on the server and affect our other customers. Your account was flagged for this and your ability to run cgi scripts was disabled. You will need to check the software packages on your site and make sure that they are fully updated and patched (or if the script was just an individual script, you need to ensure that it is optimized). You can determine which scripts were running at the time of your disabling by referencing your access logs (available in your OPS control panel, in the Packages -> Site Tools -> Log Viewer section).

Your CGI scripting ability has been restored.

The only way you could tell from the logs what script was running is if the website was the slowest, least busy site on earth or only had one perl script and one PHP script, so you could ascertain what it was from the abuse email. There were 24 instances of long running scripts when 721 were fired. Which ones were they? No way to tell from anything POWWEB provided.

I responded thanking the analyst for his time and then indicated the four major concerns I had over how this was all handled and four recommendations for better handling in the future.

My main concerns briefly 😉

1. Cryptic email was suited for POWWEB admins, not customers. I pay, I am a customer.

2. FAQ says to look at your logs but there is no way to look at your logs by default since the tool to do it requires the script access.

3. Silence for hours from support. No mechanism to help myself, no way to contact people live to get help.

4. Form letters that are absolutely worthless in helping anything.

My recommendations, again briefly 😉

1. Send informative emails with a good explanation of what happened and the impact. Don’t make me guess that you shut off scripting.

2. Send the most current log with the email above or let me get access to the logs when there is a problem. Better yet, send a few lines from the log that indicate the most likely causes of the issues. If you know that the script was launched at time X and is a PHP script, then send me the lines from that time frame where a PHP script was requested.

3. Shut down the source of the problem, not everything. If PHP is the problem, stop PHP, not every scripting mechanism. Better yet, stop the specific script or IP address(es) requesting the script(s). Assume your customers aren’t trying to screw you so don’t screw them.

4. Allow an automated process for people to unlock their site or make sure abuse responds within an hour or less.
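To make recommendations 1 and 2 concrete, here is a rough sketch of what a more useful notice could look like: it states the action taken and quotes the PHP requests logged just before the account was flagged. Everything here is an assumption for illustration: the Common Log Format, the function name `build_notice`, the wording of the notice, and the sample lines are invented and are not anything POWWEB has.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed Common Log Format; captures the timestamp and the requested URL.
STAMP_RE = re.compile(r'\[([^\]]+)\] "\S+ (\S+)')

def build_notice(account, flagged_at, log_lines,
                 window=timedelta(minutes=2), max_lines=20):
    """Return a plain-text notice that states the action and quotes matching log lines."""
    excerpt = []
    for line in log_lines:
        m = STAMP_RE.search(line)
        if not m:
            continue
        when = datetime.strptime(m.group(1), "%d/%b/%Y:%H:%M:%S %z")
        is_php = m.group(2).split("?", 1)[0].endswith(".php")
        if is_php and flagged_at - window <= when <= flagged_at:
            excerpt.append(line)
    minutes = int(window.total_seconds() // 60)
    body = [
        f"Account: {account}",
        f"Flagged at: {flagged_at.isoformat()}",
        "Action taken: PHP scripting disabled (other scripting left on).",
        f"PHP requests in the {minutes} minutes before the flag: {len(excerpt)}",
        "Sample log lines:",
    ]
    body.extend("  " + line for line in excerpt[:max_lines])
    return "\n".join(body)

# Invented sample data with placeholder addresses (hypothetical).
FLAGGED = datetime(2006, 3, 3, 7, 46, 18, tzinfo=timezone(timedelta(hours=-5)))
LINES = [
    '192.0.2.10 - - [03/Mar/2006:07:45:30 -0500] "GET /forum/viewtopic.php?t=1 HTTP/1.1" 200 5120',
    '192.0.2.12 - - [03/Mar/2006:07:46:11 -0500] "GET /downloads/tool.zip HTTP/1.1" 200 90000',
    '192.0.2.10 - - [03/Mar/2006:07:30:00 -0500] "GET /forum/index.php HTTP/1.1" 200 4096',
]
notice = build_notice("xxxx", FLAGGED, LINES)
```

Even a short excerpt like this would have pointed straight at the forum and at the requesting addresses, with no need for the log viewer that was itself switched off.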

In the meantime, it seems it might be possible to set up a cron job to copy the logs for me. I used the site tools to configure it but it didn’t seem to work, so now I have to go look into that.

Meanwhile, I am really unsure if POWWEB will get any more business from me. I am very thankful to Carlos in POWWEB sales. He probably broke a rule by enabling scripting for me, but at least I was able to actually start figuring out what was going on. No one in support had helped me to that point.

My main focus tends to be around making systems run at scale. Sure, it is usually around AD or Exchange or whatever, but the concepts flow across technologies. It isn’t feasible to have enough people to support everything that could come up in a large environment; as such, you need automated systems to help your customers keep working. Without them, you cannot effectively scale. You hit the point where your size is detrimental to your customer service. POWWEB has hit that, at least in this aspect. I am a paying customer and they locked me out of my stuff like I was a hacker attacking them. This is a fundamentally wrong way to treat customers.

joe


One Response to “How not to handle abuse…”

  1. cjsmith says:

    I had a different problem but got the same results with both Powweb and Lunarpages. I don’t trust any of them, period. They disabled my downloads directory and did a few other not so nice things. They wouldn’t reply to my emails or phone calls for about 3 1/2 days. When they did, they didn’t even respond to any of my questions. I finally got a dedicated server, which is somewhat expensive 🙁

    I actually found this page because I was searching for proxy3d.external.lmco.com. The IP was definitely trying to hack into my site or preparing for a hack. I’ve also had hits from other military and even gov IPs but think that they have been faked or spoofed somehow. It no longer freaks me out when the NSC (NSA) shows up and downloads everything on my website. I don’t know if they are faked, but I think many of them are. The only thing I can come up with is there is something on your site a “corporation” doesn’t like. These are corporate hackers who have inside connections with all the low-cost hosting sites and can turn things on and off, sign up for your forum but don’t post anything etc…
