I know nothing about coding or servers, so dunno if these suggestions are feasible:
(1) Restart the server immediately after it crashes.
In another thread some dev(s) mentioned that we keep getting crashes in Live because, with so many players on, there are simply exponentially more ways for errors to occur that cause crashes (and that this, rather than an underlying problem the devs are unable to fix, is what causes them). I would say that if an error is so rare that it happened for the first time only weeks after Live began, why not restart immediately, without waiting for a hotfix, and figure out the problem once the server is back up? What are the chances that this same error causes a server crash before some other error does?
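A rough sketch of what suggestion (1) could look like, in Python. This is only an illustration under assumptions: the names `supervise` and `run_server`, the restart limit, and the delay are all made up here, and nothing in it reflects how the actual server is started or monitored.

```python
import subprocess
import time


def run_server(cmd):
    """Run the server command, block until it exits, and return its exit code."""
    return subprocess.call(cmd)


def supervise(cmd, run=run_server, max_restarts=5, delay_seconds=5.0,
              sleep=time.sleep, log=print):
    """Restart the server after each crash, up to max_restarts times.

    Hypothetical helper: a non-zero exit code is treated as a crash and
    an exit code of 0 as a deliberate shutdown. Returns the number of
    restarts that were performed.
    """
    restarts = 0
    while True:
        code = run(cmd)
        if code == 0:
            log("server shut down cleanly; not restarting")
            return restarts
        log(f"server crashed with exit code {code}")
        if restarts >= max_restarts:
            # Stop looping so a server that crashes on startup
            # does not restart forever without anyone noticing.
            log("too many crashes; giving up until someone investigates")
            return restarts
        restarts += 1
        sleep(delay_seconds)  # short pause before bringing it back up
```

Calling something like `supervise(["./game_server"])` (the command name is a placeholder) would keep bringing the server back up after a crash, while the crash itself can still be investigated afterwards. The restart limit is there so that a crash that repeats immediately does not loop forever.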
(2) Designate more people to be able to restart the server.
I don't know how hard it would be to do this remotely without having a dev client installed, etc., but we seem to have some blind spots in getting the server back up when it goes down, especially during US West Coast evening hours. Perhaps if it could be set up so that the people who can restart it are not able to bring it down, but only start it up, there would be fewer issues with giving non-devs this ability?
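The "start only" idea in suggestion (2) can be sketched as a simple permission check, again in Python. The role names (`dev`, `starter`), the action names, and the function `is_allowed` are all hypothetical and only show the shape of the rule, not any existing tool.

```python
# Hypothetical roles: devs can do everything, "starters" can only start.
ROLE_ACTIONS = {
    "dev": {"start", "stop", "restart"},
    "starter": {"start"},
}


def is_allowed(role, action, server_running):
    """Return True if this role may perform this action right now."""
    if action not in ROLE_ACTIONS.get(role, set()):
        return False  # unknown role, or action not granted to this role
    if action == "start" and server_running:
        return False  # nothing to start; avoids touching a live server
    return True
```

With a rule like this, a non-dev could bring a crashed server back up but could never take a running one down, since `is_allowed("starter", "stop", True)` is false.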