Friday, July 3, 2009

Worst-case scenario vs. Majority of situations

Here is a great question-and-answer thread I found on LinkedIn (url).

Question:

Planning for every single thing that can go wrong is virtually impossible. What % of your effort would you put towards planning for the large majority of cases, versus planning for the absolute worst?

Put another way, if one spends all his or her time planning for disaster-recovery, rather than smooth standard operating procedures, would that lead to more of the worst-case situations?

Thanks in advance for your thoughts. -Mark

Answer 1:

We normally plan for normal operations (within reason) and have a separate disaster recovery plan (as opposed to supporting a lot of interim disasters).

The disaster recovery plan is often bid or costed separately, along with an assessment of how likely a disaster is. This allows the business to decide whether or not to invest.

Often one can substantially reduce disaster recovery costs by accepting less stringent recovery targets, such as 4-8 hour recovery times in case of complete destruction of the primary facility, or backing up only 3-4 times a day.
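To make the backup-frequency trade-off concrete, here is a minimal sketch of the arithmetic. It is an illustration added to this post, not part of Karl's answer, and it assumes backups are evenly spaced across the day, so the worst-case data loss equals one full interval between backups. The function name is hypothetical.

```python
def worst_case_data_loss_hours(backups_per_day: int) -> float:
    """Worst-case data-loss window in hours.

    Assumption: backups are evenly spaced over 24 hours, so a failure
    just before the next backup loses one full interval of work.
    """
    if backups_per_day <= 0:
        raise ValueError("backups_per_day must be positive")
    return 24 / backups_per_day


# Compare the 3-4 backups/day mentioned above with hourly backups.
for n in (3, 4, 24):
    hours = worst_case_data_loss_hours(n)
    print(f"{n} backups/day: up to {hours:g} hours of data at risk")
```

Under that assumption, 3-4 backups a day means accepting up to 6-8 hours of lost data, which is the kind of relaxed target that keeps recovery costs down.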

Focus on design and COTS products that can auto-sync data and transactions to minimize having to invest inordinate amounts of time worrying about disasters.

Karl Garrison, CTO/Owner Intelligent Fusion, Enterprise Architect, Business Strategist, PhD

Answer 2:

Similar to Karl's answer (above), we focus on planning normal operations, and then on tolerance levels given the severity of a potential DR situation. You start by listing your business units, systems, operations, etc. Then, for each one, ask yourself how critical it is to your business, and what the solution is should it go offline, for a short-term outage (hours), a medium-range one (a couple of days), a longer one (weeks), or being wiped off the face of the earth (never coming back).
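The listing exercise described above can be sketched as a small table in code. This is only an illustration added to this post: the unit names, the 1-5 criticality scale, the threshold, and the fallback descriptions are all hypothetical, not anything from Jamie's answer. The idea is to rate each unit per outage horizon and put the units that become critical soonest at the top of the planning list.

```python
from dataclasses import dataclass

# Outage horizons from the answer above, shortest first.
HORIZONS = ("hours", "days", "weeks", "never")


@dataclass
class Assessment:
    unit: str
    criticality: dict[str, int]  # horizon -> 1 (minor) .. 5 (business-stopping); hypothetical scale
    fallback: str                # what you would do if the unit goes offline


def first_critical_horizon(a: Assessment, threshold: int = 4):
    """Shortest outage horizon at which the unit reaches the threshold, or None."""
    for horizon in HORIZONS:
        if a.criticality[horizon] >= threshold:
            return horizon
    return None


def prioritize(assessments: list[Assessment], threshold: int = 4) -> list[Assessment]:
    """Units that become critical after the shortest outage come first."""
    def key(a: Assessment) -> int:
        horizon = first_critical_horizon(a, threshold)
        return HORIZONS.index(horizon) if horizon is not None else len(HORIZONS)
    return sorted(assessments, key=key)


# Hypothetical example data.
units = [
    Assessment("marketing site", {"hours": 1, "days": 2, "weeks": 3, "never": 3}, "static status page"),
    Assessment("payroll", {"hours": 1, "days": 3, "weeks": 5, "never": 5}, "manual checks via bank"),
    Assessment("order processing", {"hours": 4, "days": 5, "weeks": 5, "never": 5}, "failover to secondary site"),
]

for a in prioritize(units):
    print(f"{a.unit}: critical from '{first_critical_horizon(a)}', fallback: {a.fallback}")
```

A spreadsheet does the same job; the point is only that each unit gets one rating per horizon plus a named fallback, so the short-to-medium outages end up driving the priorities.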

We all know that the majority of DR situations are in the short to medium term range, so that is where you need to concentrate the bulk of your efforts. Knowing that operations go offline due to power outages, storms/hurricanes, etc., and how you operate around those kinds of events, is the most practical use of your time.

Most minor issues (while not disasters) can be managed by using redundancy in your network, creative staffing strategies, etc. The real DR situations require a plan in place for the most likely scenario(s), but beyond having that plan and informing your team of their role in it should it become reality, you're back to managing your day-to-day business.

Other than that, you would need to look at outside vendor relationships and make sure that the services you depend on in a DR situation are covered. Case in point: at TMW we have the ability to install larger generator capacity in our DCs should we lose power for extended periods of time. We don't own the generator(s), but we pay a retainer fee to a vendor so that, if we need one, it can be installed within a specified amount of time. In addition, we have a fuel agreement to supply us with diesel in the event that we use that generator as a short to medium range solution during recovery from an incident. This is one example of the many contingencies to plan for, and it does take a lot of work to get to that point, but you will feel more comfortable once a plan is in place and has been communicated properly to your team(s).

Jamie Bragg, Senior Vice President — Operations at The Men's Wearhouse, Inc.


I suggest we discuss this in the comments: share your conclusions, offer your own answers, and so on.
