PLEASE BOOKMARK (usually control-D) THIS PAGE NOW SO YOU CAN FIND IT AGAIN IN CASE OF AN EMERGENCY!


If you are experiencing a problem not reported here, check our web panel for more information.
(Please remember, posting in the comments here IS NOT an official way to contact DreamHost.)

Emergency OS patch on file server

Posted 4 hours, 26 minutes ago (May 9th, 2008 at 8:27 am PST) by Kelly

Severity: Low   Resolved: Yes

Per an open ticket with Sun we need to apply two patches to one of our file servers. This is to hopefully fix a degraded zpool which will not finish a parity rebuild. There are exactly 78 users in the frisky cluster on this file server. This will bring your email and web services offline for the time being. The patch should only require about 15 minutes of downtime. I apologize for doing this patch during peak hours, but we really need to get this data back up to full integrity.

Tech nerd details: The RAID array is a raidz2 operating with one failed disk. It has been rebuilding onto a hot spare for a day or two, and the rebuild restarted itself after reaching 99%. We contacted Sun and, after analyzing the troubleshooting information, believe a kernel + ZFS patch should resolve the problem. Fortunately this is a raidz2, so it can sustain a disk failure and still be fault tolerant.
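For the extra curious, the fault-tolerance arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name and the lookup table are made up for this post and are not part of any Sun or ZFS tooling. The parity counts themselves (raidz1 = 1 disk, raidz2 = 2 disks) are the standard ZFS definitions.

```python
# Parity disks per ZFS raidz level (standard ZFS definitions).
PARITY = {"raidz1": 1, "raidz2": 2}


def remaining_tolerance(level: str, failed_disks: int) -> int:
    """Return how many more disk failures the array can survive.

    Hypothetical helper for illustration only. A result of 0 means
    the next failure would lose data.
    """
    parity = PARITY[level]
    if failed_disks > parity:
        raise ValueError("more failed disks than parity disks: data is already lost")
    return parity - failed_disks


# Our situation: a raidz2 running with one failed disk.
print(remaining_tolerance("raidz2", 1))  # prints 1
```

In other words, a raidz1 in the same state would have had no redundancy left at all, which is why we are glad this one is a raidz2.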

Update: Well, one of the two patches was installed. It seems SunSolve is rejecting our service contract to download a specific patch in the dependency tree. We’ve updated our case with Sun. The system is back online and serving files!

DingDong and Pizarro down

Posted 17 hours, 40 minutes ago (May 8th, 2008 at 7:14 pm PST) by glen

Severity: Low   Resolved: Yes

The HTTP servers DingDong and Pizarro are both currently unresponsive to our reboot efforts. We are working on getting manual reboots done in their respective data centers or evaluating whether moving to new hardware is necessary. Estimated downtime is up to 1 hour.

Update 7:44p
Pizarro is back up and seemingly stable on new hardware
DingDong is still waiting for a tech to reach its data center.

Update 8:15p

DingDong has been manually rebooted and is up. Tests of sites show them responding. If you have any further issues related to either of these 2 servers, please contact support through our panel.

Central database crash

Posted 1 day, 20 hours ago (May 7th, 2008 at 4:45 pm PST) by Kelly

Severity: Medium   Resolved: Yes

Our central database server crashed and restarted itself. It is currently replaying transaction logs and should be back in under an hour. This should not affect your websites, email, etc, but the user control panel (https://panel.dreamhost.com) and similar services are down until it comes online.

We are monitoring the situation and will report back here when it comes online!

Edit: And webmail! I forgot those two were tied together. Regular IMAP/POP3 email access should continue to work.

Edit: 5:28PM Pacific It looks like we’re back in business! The user control panel and webmail are working. We will continue to check the rest of our central services and update this if we find anything else still broken! If you are having problems, please contact technical support.

FTP problems (connection drops)

Posted 2 days, 3 hours ago (May 7th, 2008 at 9:49 am PST) by andrea

Severity: Medium   Resolved: Yes

Some of our customers are experiencing problems related to their FTP service. This includes error messages while connecting or dropped connections. We’re looking into it, and will post an update as soon as we know more. Sorry about the inconvenience.

Please check back here for updates.

Update 5/7/08 11am Pacific: We have narrowed down the problem to only affecting customers using the following ISPs: AT&T/Sbcglobal, Comcast and RoadRunner. We have tested the following solution with the help of one of the affected customers, so this should work for all: You will need to change your ftp-only user to sftp to get around what we suspect to be bandwidth throttling affecting only FTP connections.

Edit your ftp-only user to be sftp in the control panel under Users/Manage Users ( https://panel.dreamhost.com/index.cgi?tree=users.users& ). This should schedule a user update. As soon as the little clock next to your username in the control panel is gone, log in with that user using the SFTP settings in your upload client, and you should be able to upload the large files. If your current client doesn’t support SFTP connections, you will need to download another one.
Please note, if you’re already using a shell user to connect through ftp, then all you need to do is change to sftp in your client. You don’t need to modify your user.
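If your upload client or script takes a connection URL rather than separate settings, the change on the client side usually amounts to swapping the scheme. Here is a minimal Python sketch of that, assuming your tool accepts sftp:// URLs. The function name, username, and hostname are placeholders made up for this example, not real DreamHost values. SFTP runs over SSH (port 22 by default), so the example leaves the port unspecified.

```python
from urllib.parse import urlsplit


def to_sftp_url(ftp_url: str) -> str:
    """Rewrite an ftp:// URL as sftp://, keeping user, host, and path.

    Hypothetical helper for illustration; it only edits the URL text
    and does not open any connection.
    """
    parts = urlsplit(ftp_url)
    if parts.scheme != "ftp":
        raise ValueError("expected an ftp:// URL")
    return parts._replace(scheme="sftp").geturl()


# Placeholder user and host; substitute your own.
print(to_sftp_url("ftp://exampleuser@example.com/uploads/big.zip"))
```

Your client may instead have a protocol drop-down or a separate port field, in which case pick SFTP there and leave the username and host the same.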

If the above doesn’t make your problems go away, then it’s most likely unrelated. Please send support a message, so we can look into it for you.

Update 5/8/08 9am Pacific — Apparently Verizon and other ISPs have been added to the list. Even if your ISP is not listed, please try to change to sftp, and if that doesn’t work, let us know with as many details as possible.

Update 5/8/08 10:50am Pacific: We’re continuing to look into this problem to make sure it’s not something on our end that’s causing the connection drops. We have just tested a few more cases where switching to SFTP didn’t make a difference for the customer. So far we haven’t been able to replicate the problem, but we’re still checking and examining more data.
In the meantime, please keep trying SFTP. We recommend it over FTP anyway for its security. If you’re using telnet, you should be switching to ssh instead.

Here’s a link to the article in our Wiki regarding the use of SFTP…
(http://wiki.dreamhost.com/SFTP)

Webserver Hermes being moved to new hardware

Posted 2 days, 3 hours ago (May 7th, 2008 at 9:25 am PST) by chih

Severity: Low   Resolved: Yes

Hermes crashed earlier and the server isn’t coming back up when rebooted. It’s currently being migrated over to new hardware and should be back up shortly.

Sorry for the inconvenience this has caused.

UPDATE: Migration is complete. Your sites should be back up and running. Contact support if your sites are still down.

Email server emergency maintenance tonight (janky,randy,postal,spunky)

Posted 2 days, 21 hours ago (May 6th, 2008 at 3:26 pm PST) by jordan

Severity: Low   Resolved: Yes

About 30 minutes ago we had some major problems with one of our email load balancers, which serves email for the janky, randy, spunky and postal clusters. Tonight at approximately 11:30 PM PST we will be doing some maintenance that may affect the performance / uptime of any customers that have email in these clusters. It is only expected to last about 10-15 minutes at most, and we apologize for the short notice. This post will be updated as soon as the maintenance has completed.

UPDATE 11:30 PDT
Maintenance has completed; all mail services should be up and running again. If you are still having problems, please contact support.

Problems caused by apache service updates

Posted 3 days, 1 hour ago (May 6th, 2008 at 11:18 am PST) by andrea

Severity: High   Resolved: Yes

We have noticed that any apache service updates initiated from the control panel are breaking the service. This includes anything that has to do with the domain’s web service, like adding a new domain, changing FTP users on a domain, etc. This doesn’t just break the domain that initiated the change, but all domains on that apache service.

We are fixing these as we find them, and trying to catch up with the ones that broke a little while ago. Very sorry about the web downtime this has caused you, and please check back here for updates.

Update 5/6/08 5pm Pacific: This issue is now resolved. Sorry about the delay; we have reconfigured all apache services for which the first fix didn’t take. If you’re still having problems, then it’s probably unrelated. Please write to support with some details, and we’ll be happy to check into it for you.

Delay in quarantined junk mail delivery

Posted 5 days, 13 hours ago (May 3rd, 2008 at 11:47 pm PST) by andrea

Severity: Low   Resolved: Yes

We have noticed that messages quarantined by our junk filter are not getting delivered to the Junk Folder. We checked on the messages, and they are still on the servers. They weren’t getting delivered because of a connection problem to the mysql database that controls the Junk Folder. The problem is now fixed, but it will be a while before it all catches up, as mail has been backed up since last night. We apologize for any inconvenience this has caused. We’ll post updates on the progress of junk mail delivery as soon as we have them.

Update 05/04/2008 noon Pacific — Quarantined junk mail is again being delivered to the Junk Folder. Sorry about the delay.

redhot getting new hardware

Posted 1 week ago (May 2nd, 2008 at 12:06 pm PST) by kitchen

Severity: Low   Resolved: No

Redhot has had a hardware failure and needs to be replaced. We are currently working on moving it over to new hardware.

This should take about 30 minutes, and we will keep you posted as things progress!

Update: 1:06pm pacific: the failover has completed and services are up and running. If you experience any issues, please contact support!

Update May 3, 2008: It looks like the new hardware was not set up properly, so for now we’ve moved to another backup server that was available (I am ensuring that this machine is configured correctly for you). The changeover is almost complete.

Mysql Maintenance

Posted 1 week ago (May 1st, 2008 at 9:47 pm PST) by Hutch

Severity: Medium   Resolved: Yes

These mysql servers will be experiencing some downtime tonight for maintenance:
jake
leo
midnight
snarf

Downtime is expected to be less than 30 minutes. It should happen shortly after midnight PDT.

–Update
The maintenance was performed and the servers were back up within 15 minutes.