Uptime On A Dime

Every data center manager wants the best for his or her facility. But let’s face it: sometimes a few dimes are all you have left in your budget. That shouldn’t get in the way of aiming for the most reliable facility. Uptime isn’t about how much money you can spend. It’s about implementing practices that make your facility more efficient and cost effective, and sticking to them. Sure, adding a redundant generator and a second utility feed may be out of the question. But there are plenty of best practices that can be implemented at little to no cost and with only a small investment of your time.

Let’s look at a dozen architectural, mechanical, electrical, fire suppression, maintenance, and operations best practices that you can put into effect today.

Architectural

Protect your raised access floor during moves, additions, and changes.

Whether you are moving equipment around in your facility, adding equipment, or changing the contents of an equipment rack, remember to protect your raised access floor. Lay down plywood or other hardboard before you roll in heavy equipment. Equipment should be brought into the data center only after the floor system is installed. This practice helps avoid unnecessary bumps and scratches and ultimately extends the life of your floor system.

Establish a standard size for cable cutouts.

This is a manageable task with numerous benefits. For starters, it assumes that you will be using the same method of closing holes in the raised access floor for every cutout, which is a wise decision. Second, it allows you to cut access floor panels in batches, saving you time and money. Third, once adopted, it enables you to easily calculate the open space in your raised access floor system.
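With one standard cutout size, the open-space calculation reduces to a single multiplication. The sketch below is only an illustration: the 8 by 6 inch cutout and the function name are assumptions, not dimensions recommended here, and it ignores any area recovered by grommets or brush seals.

```python
# Hypothetical standard cutout size in inches (assumed for illustration only).
CUTOUT_WIDTH_IN = 8.0
CUTOUT_DEPTH_IN = 6.0

SQ_IN_PER_SQ_FT = 144.0


def open_area_sq_ft(cutout_count, width_in=CUTOUT_WIDTH_IN, depth_in=CUTOUT_DEPTH_IN):
    """Total open floor area, in square feet, for identical rectangular cutouts."""
    if cutout_count < 0:
        raise ValueError("cutout_count must be non-negative")
    return cutout_count * width_in * depth_in / SQ_IN_PER_SQ_FT


if __name__ == "__main__":
    # 30 cutouts of the assumed standard size
    print(f"{open_area_sq_ft(30):.1f} sq ft of open floor")
```

If cutouts come in several sizes, the same count has to be kept per size, which is exactly the bookkeeping a single standard avoids.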

Mechanical

Label all CRAC units.

Labeling all CRAC units, their associated condensers, and the piping that runs between them will speed up the troubleshooting process.

Seal space behind all CRAC units.

As simple as it seems, sealing the space between each CRAC unit and the wall is often overlooked. As a result, conditioned air slips behind the unit and into the flow of the return air. Not only do you lose capacity, but the temperature reading of the return air also becomes inaccurate. The same lesson applies to the rest of the unsealed openings in your floor, and the payback for tightening up your mechanical system is great.

Electrical

Track amp readings.

Just as you balance your checkbook on a regular basis, you should track the amp readings on all PDU circuits to make sure that they do not exceed 80 percent of each circuit’s capacity. It’s a good habit to check readings semi-annually and whenever a change is made.
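If the readings are recorded in a script or spreadsheet export, the 80 percent check is easy to automate. The following is a minimal sketch, assuming each reading carries the circuit's rated amperage alongside the measured value; the class name, field names, and sample figures are hypothetical.

```python
from dataclasses import dataclass

THRESHOLD = 0.80  # the 80 percent limit described above


@dataclass
class CircuitReading:
    pdu: str              # hypothetical PDU label
    circuit: str          # hypothetical circuit identifier
    rated_amps: float     # breaker or circuit rating
    measured_amps: float  # reading taken during the walk-through

    @property
    def utilization(self) -> float:
        return self.measured_amps / self.rated_amps


def over_threshold(readings, threshold=THRESHOLD):
    """Return the readings whose load exceeds the given fraction of rated capacity."""
    return [r for r in readings if r.utilization > threshold]


# Made-up sample readings for illustration.
readings = [
    CircuitReading("PDU-A", "1", 20.0, 12.5),
    CircuitReading("PDU-A", "2", 20.0, 17.0),
    CircuitReading("PDU-B", "1", 30.0, 18.0),
]

if __name__ == "__main__":
    for r in over_threshold(readings):
        print(f"{r.pdu} circuit {r.circuit}: {r.utilization:.0%} of {r.rated_amps:.0f} A")
```

Keeping dated copies of each set of readings also shows which circuits are trending toward the limit between checks.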

Create your checkbook.

It is also important to have the checkbook mentioned in the paragraph above on hand. A surprisingly large number of data center managers don’t have ready access to a complete inventory of the equipment housed in their facilities. A simple spreadsheet program and a disciplined change control process will provide you with an up-to-date, up-to-the-minute inventory of what is in your room. At a minimum, you can collect heat and power load at the rack level and take a proactive approach to utilizing capacity. Expand your spreadsheet and voila: asset management numbers, lease information, equipment model numbers, and OS updates are at your fingertips. The important thing to know here is that you don’t necessarily need an expensive software package to manage your books.
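As an illustration of how far a plain spreadsheet can go, the sketch below rolls a CSV export up to rack-level power and heat load. The column names and sample rows are hypothetical, and the heat figure assumes that all electrical power drawn ends up as heat, converted at roughly 3.412 BTU/hr per watt.

```python
import csv
import io
from collections import defaultdict

BTU_PER_HR_PER_WATT = 3.412  # standard watt to BTU/hr conversion

# Hypothetical spreadsheet export; the column names are assumptions.
SAMPLE_INVENTORY = """rack,asset_tag,model,power_watts
R01,A-1001,server-x,450
R01,A-1002,server-x,450
R02,A-1003,switch-y,150
"""


def rack_loads(csv_text):
    """Sum power per rack and derive the heat load from it."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["rack"]] += float(row["power_watts"])
    return {
        rack: {"watts": watts, "btu_per_hr": watts * BTU_PER_HR_PER_WATT}
        for rack, watts in sorted(totals.items())
    }


if __name__ == "__main__":
    for rack, load in rack_loads(SAMPLE_INVENTORY).items():
        print(f"{rack}: {load['watts']:.0f} W, {load['btu_per_hr']:.0f} BTU/hr")
```

The rollup is only as good as the change control behind it: a row has to be added, moved, or removed every time equipment is.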

Fire suppression

Install a telephone next to the abort switch.

Imagine holding down the abort switch while being unable to call for help. The only phone on you is your mobile phone, and you have no signal. Install a telephone next to the abort switch and pre-program numbers for the fire department and supervisory contacts. This will greatly reduce stress and lost time in the event of an emergency. When programming the numbers, put them in the order in which they should be called: speed dial number one is the first contact to call, speed dial number two is the second, and so on.

Establish a training program for in-house data center personnel.

Eighty percent of accidental agent discharges occur because of human error.  Of that 80 percent, 90 percent occur with someone in the room.  Establishing a training program, implementing it, and conducting reviews and continuous educational opportunities will help you greatly reduce your risk of an accidental agent discharge.

Maintenance

Know your technicians.

It is extremely important to have a good relationship with your maintenance technicians—good enough that you actually have their home telephone numbers.  Maintenance technicians have their hands on your vital equipment. Being able to communicate well with them is just as important as the communication between you and your physician.  In fact, during each visit, you should make an effort to spend some time with the technicians. Ask them each time if there are any upgrades available for your equipment.  Also ask them if there are any recent issues that might affect your facility’s reliability. You’d be surprised at the advice they offer for free.

Conduct visual inspections of equipment.

Your in-house data center staff should know how to complete a visual inspection of your CRAC units, UPS, PDU filters, battery terminals, and condenser units. It’s important that they have a basic understanding of how to properly maintain your facility. That way your staff will be prepared and know what to do when they spot something out of the ordinary between your regular maintenance visits.

Operations

Synchronize clocks.

In the event that your facility experiences downtime, it will be difficult to create a log of events if each clock in your facility has a different time. Precision is key.  Check all the clocks on your support equipment and make sure they are synchronized.  Clocks should also be updated monthly. It’s surprising how many devices don’t track time well and tend to drift.
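A monthly check can be as simple as noting the time each device displays at the same moment and comparing it with a trusted reference. Here is a minimal sketch of that comparison; the device names, sample times, and five-second tolerance are assumptions for illustration, not recommended values.

```python
from datetime import datetime, timedelta


def clock_drift(reference, device_times, tolerance=timedelta(seconds=5)):
    """Return {device: offset} for clocks that differ from the reference by more than tolerance."""
    drifted = {}
    for name, reported in device_times.items():
        offset = reported - reference  # negative means the device clock is slow
        if abs(offset) > tolerance:
            drifted[name] = offset
    return drifted


# Made-up readings, all noted at the same instant as the reference time.
reference = datetime(2024, 1, 1, 12, 0, 0)
device_times = {
    "UPS-1": datetime(2024, 1, 1, 12, 0, 2),
    "CRAC-3": datetime(2024, 1, 1, 11, 58, 30),
    "PDU-A": datetime(2024, 1, 1, 12, 0, 0),
}

if __name__ == "__main__":
    for name, offset in clock_drift(reference, device_times).items():
        print(f"{name} is off by {offset.total_seconds():+.0f} s")
```

Logging the offsets from month to month shows which devices drift fastest and may need more frequent attention.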

Establish materials handling rules.

We all know the importance of keeping the data center clean and contamination-free. Spelling out materials handling rules will reduce mistakes. Make it clear that storing and handling packaging materials within the data center is strongly discouraged, and that uncrating equipment shipped in cardboard boxes or wood crates inside the room is not permitted.

So even if you have only a few dimes in your budget, you can put these simple, inexpensive best practices into effect today and make your data center more reliable, cost effective, and healthy for tomorrow.

Bick Group has subject matter experts in this and many other topics. Talk to a Bick Consulting Services expert by emailing: tdavies@bickgroup.com.