Archive for June, 2007
Filed Under ( cool tools) by Dave Mast on June-30-2007
While I was at GCC a couple weeks ago, Kyle showed me a nifty little utility for gauging where disk space is being used on your hard drives.
It’s called TreeSize. It’s made by JAM Software, and it’s available in three different flavors: Free, Personal, and Pro. Each step up offers more powerful searching and reporting capabilities, with the Pro version topping out at $50.
Below is a screen cap of TreeSize Free. Even the free version is pretty useful for finding out where the majority of your hard drive space is being used. The purple bar represents the total file size for each folder.

In the next screenshot, I set TreeSize’s filter to only check for audio files (.WMA .MP3 .AAC .AIF .M4A .M4P). While these files affect our storage capacity on our file server, we filter out these same file types with our backup software, so they don’t take up space on our backup drives.
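For anyone curious what that kind of per-folder, per-file-type tally boils down to, here is a minimal Python sketch. It is not how TreeSize works internally; it just walks a directory tree and adds up file sizes per top-level folder, optionally counting only the audio extensions listed above. The names `folder_sizes` and `AUDIO_EXTENSIONS` are made up for this illustration.

```python
import os
from collections import defaultdict

# The audio file types mentioned in the post, lower-cased for comparison.
AUDIO_EXTENSIONS = {".wma", ".mp3", ".aac", ".aif", ".m4a", ".m4p"}


def folder_sizes(root, extensions=None):
    """Return {top-level folder: total bytes} for files under root.

    If extensions is given, only files whose extension is in that set
    are counted. Files sitting directly in root are reported under ".".
    """
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        top = "." if rel == "." else rel.split(os.sep)[0]
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            if extensions is not None and ext not in extensions:
                continue
            try:
                totals[top] += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                # Skip files that vanish or cannot be read mid-scan.
                continue
    return dict(totals)


def largest_first(totals):
    """Sort (folder, bytes) pairs so the biggest folder comes first."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Calling `largest_first(folder_sizes(some_path, AUDIO_EXTENSIONS))` gives a rough text-only version of the filtered view, biggest folder on top.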

I like the free version, but I’m gonna download the Pro version and give it a spin as well.
|
Filed Under ( off-topic) by Dave Mast on June-30-2007
I was out early this morning and happened to catch this sunrise…
Definitely worth a picture. I want to be awake to see more of these before it’s too cold to be outside.
Here’s another pic a few minutes later…

|
Filed Under ( roundtable) by Dave Mast on June-28-2007
Yeah I’m pumped.
|
It’s about 2AM right now in Northeast Ohio, and earlier this evening I started the task of taking all of our virtual servers down so that I could defrag the host machines that they live on.
For taking care of large virtual disk files, I use Contig, which is part of the Windows Sysinternals software lineup. Basically Contig is a tool for defragmenting large files. It can take wildcards and even recurse subdirectories if you want it to. This makes it pretty simple to go to the directory where your virtual machines are kept and defrag all of your .vmdk (virtual disk) files in one sweep.
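Here is a rough Python sketch of that "one sweep" idea. Contig can handle the wildcard and subdirectory work itself; this version does the searching in Python instead, so you can look over the list of files before anything runs. The folder layout, the `contig_commands` name, and the `contig.exe` location are assumptions for illustration, and each command is just the executable followed by one file path, with no extra switches.

```python
from pathlib import Path


def contig_commands(vm_root, pattern="*.vmdk", contig_exe="contig.exe"):
    """Build one Contig command line per virtual disk found under vm_root.

    Subdirectories are searched too. Nothing is executed here; the
    result is a list of argument lists that can be reviewed first.
    """
    disks = sorted(Path(vm_root).rglob(pattern))
    return [[contig_exe, str(disk)] for disk in disks]
```

Each entry can then be handed to `subprocess.run(command, check=True)` once the virtual machines that own those disks are powered off.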
I was nervous for a good while this evening, because Contig was taking an EXTRA long time to defragment one piece of the virtual hard drive that belongs to our Exchange server, and perfmon was showing little-to-no disk activity at the same time. About halfway through writing the second paragraph of this post, however, that virtual disk finally finished up and Contig continued on to the next file. WHEW!
Looking at our file server’s disk usage, I am amazed at how our storage needs have skyrocketed. When I started in 2005, our dinky little file server had a 30something-GB SCSI drive on it, and it was enough to hold everyone’s information. Since then we’ve moved to a 425GB RAID array, and we’ve managed to fill over 80% of that space. Safe to say we’ll be looking for another storage solution sooner rather than later.
|
This past Saturday night I was able to get a script working on our servers that would take them all down gracefully in the event of a power outage. This was in response to a previous blunder on my part that had allowed every last server in our building to go down hard (although nothing was damaged in the event).
Fast forward to this Tuesday (earlier this week). Our area got rocked by a couple of huge storms. About 3/4 of the way through the first storm, the lights in the office flickered, dimmed, and finally went out. I didn’t think the UPS script would get tested this quickly.
When I flipped open the KVM on our server rack, I was pleased to see that a countdown was already running on each machine. The UPS software had run its script, and now each server was about 60 seconds away from shutting down automatically. When it was all said and done, every last server shut down on its own, with plenty of battery life to spare.
The only tweak I ended up making to this process so far involved our physical domain controller (we have two, and the other one is a VM). It resides in one of the IDFs and it shut down too quickly in response to the power outage. As a result, after the VM-based DC went down, the remaining servers had no DC to talk to, and thus took a longer time to shut down cleanly. All in all though, the real-life test proved successful, and as a result, I have one more reason to sleep better at night.
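The ordering lesson here (member servers first, domain controllers last) can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual script or PowerAlert configuration; the server names, the `shutdown_plan` function, and the 120-second extra delay are all hypothetical.

```python
def shutdown_plan(servers, domain_controllers, base_delay=0, dc_extra_delay=120):
    """Return (server, delay_in_seconds) pairs with domain controllers last.

    Member servers keep base_delay. Domain controllers get extra time so
    the other machines still have a DC to talk to while they shut down.
    The delay values are placeholders, not measured numbers.
    """
    dcs = set(domain_controllers)
    members = [(name, base_delay) for name in servers if name not in dcs]
    controllers = [
        (name, base_delay + dc_extra_delay) for name in servers if name in dcs
    ]
    return members + controllers
```

Keeping the delay as a parameter makes it easy to tune after the next real outage instead of guessing up front.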
|
Filed Under ( off-topic) by Dave Mast on June-17-2007
Normally you’d expect a post like this to come shortly after New Year’s Day rather than heading into the dog days of summer.
There was a time when I was in pretty decent shape. I wasn’t super-athletic but I had plenty of energy and I at least made an attempt to get some regular exercise in. In July of last year my life started to take a big shift in the way of scheduling and priority-type stuff. We were getting ready for the transition into our new building, and I was already starting to get things set on the network end.
In August, it was apparent that some big changes were going to happen in our new infrastructure. Don’t get me wrong, these changes were good, but they cost a lot, especially in terms of time. I found myself working some night-owl-style hours and putting away lots of caffeine drinks to get things done, and by December when we opened, I was taking in enough Rockstar drinks to be at a very serious health risk. Since then, I can hardly look at a can of Rockstar or Full Throttle without gagging.
My diet has definitely improved since then, but I didn’t realize how bad of shape I was still in until this last Friday. I was asked to stand in for one of our drummers so that our worship team could practice their music. Now I LOVE drumming (and bass guitar almost as much), so I jumped at the chance to do this, because I haven’t played in about 9 or so months. Long story short, I was able to tell just how out of shape I was over the next couple of days. Everything except my legs and feet just HURT!!
It’s crazy how it takes something like that to motivate me to want to get exercising again. Don’t get me wrong, either: Exercise is NOT my favorite thing to do. However, the health issues that hinder me now are the same health issues that could seriously disrupt my life 10 years down the road, and I don’t wanna be that guy.
So tomorrow after our staff meeting, I’m headed down to the gym I used to work out at to restart my membership.
It’s weird…you know for me, the exercise isn’t half as hard as the act of prioritizing it. That will be my challenge. My goal is to start waking up at 6am and be in the gym by 7, because past experience tells me that if I don’t do it in the morning, it’s EXTREMELY hard to fit it in the middle of my day.
This isn’t something I’m going to post on regularly, but it’s been on my mind a good bit this afternoon, so it was worth some typing.
Enjoy the rest of your weekend.
|
After tinkering around with PowerAlert a little more tonight (yeah, I know, it’s Saturday), I stumbled onto some interesting things about the program, and ultimately got it working the way I want.
First, as far as executing command scripts: You need to make sure that the file, even if it’s a CMD or BAT file, has its permissions set properly. PowerAlert attempts to run the scripts as SYSTEM (<LOCALMACHINE>\SYSTEM), so not only did I need to set permissions on the script files to reflect that, I also had to set them on psshutdown.exe.
After doing that, things worked like a charm. Once the UPS loses power, the monitoring server starts a 2-minute timer. At the end of those 2 minutes, PsShutdown starts, reads a text file containing the names of the servers to shut down, and sends the shutdown command to each one. Plus, if the power happens to kick back on within those 60 seconds, a second script runs that calls PsShutdown to cancel the previous shutdown commands.
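The shutdown-and-cancel flow can be sketched roughly like this in Python. It is not the CMD/BAT script described here, just an illustration of the same steps: read server names from a text file, then build one PsShutdown command per server, plus a matching abort command. The switches used are the ones PsShutdown documents (`-k` power off, `-f` force applications closed, `-t` countdown in seconds, `-a` abort a pending shutdown). The file name, function names, and the 60-second default are assumptions.

```python
def read_servers(path):
    """Read one server name per line, skipping blanks and # comments."""
    servers = []
    with open(path) as handle:
        for line in handle:
            name = line.strip()
            if name and not name.startswith("#"):
                servers.append(name)
    return servers


def shutdown_command(server, countdown=60, psshutdown="psshutdown.exe"):
    """Command that powers off one remote server after a countdown.

    countdown=60 is an assumed value for this sketch.
    """
    return [psshutdown, "\\\\" + server, "-k", "-f", "-t", str(countdown)]


def abort_command(server, psshutdown="psshutdown.exe"):
    """Command that cancels a shutdown still counting down on a server."""
    return [psshutdown, "\\\\" + server, "-a"]
```

Building the commands as lists keeps them easy to log or print for a dry run before passing each one to `subprocess.run`.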
Since our firewalls (we run pfSense) are running on PCs as well, one of the next steps will be to find a utility that can automatically connect to each firewall over SSH or telnet and shut it down too.
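One possible shape for that firewall step, sketched in Python: build an `ssh` command that runs a shutdown on each pfSense box. This is untested against real firewalls and leans on several assumptions: SSH is enabled on the firewall, key-based login is already set up for the account, the account is allowed to run commands non-interactively, and FreeBSD's `shutdown -p now` (halt and power off) is acceptable on the pfSense install. The host names, the `admin` user, and the function name are hypothetical.

```python
def firewall_shutdown_command(host, user="admin", timeout=10):
    """Build an ssh command that asks one firewall to power itself off.

    BatchMode=yes makes ssh fail instead of prompting for a password,
    which matters when nobody is at the keyboard during an outage.
    """
    return [
        "ssh",
        "-o", "BatchMode=yes",
        "-o", "ConnectTimeout=%d" % timeout,
        "%s@%s" % (user, host),
        "/sbin/shutdown -p now",
    ]
```

As with the server commands, each list can be printed for a dry run first and only later passed to `subprocess.run`.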
There are still a couple of minor tweaks that I want to do, but they will have to wait until the UPS battery is back up to 100%. For now though, I’m pretty happy that PowerAlert is working like it’s supposed to. AND, I feel a lot better knowing that the servers are going to take themselves down gracefully next time we lose power.
|
It seems like things I’ve put on the back burner since the move have started to move quickly towards the edge of the stove on their own…
During my drive home from Granger on Wednesday evening, my phone started receiving text messages stating that either our Exchange Server had gone down, or our internet connection had been totally hosed. Minutes passed, then an hour, and I didn’t receive any notification that things had come back up. Finally, about 90 minutes later, my Q started buzzing again to tell me that connectivity had been restored. Relieved that things appeared to be back to normal, I continued my trek back to Dover.
The next day I came in to work and found no signs that anything was wrong. Servers were working properly, my phone didn’t have any voicemail on it, and it seemed that our infrastructure HADN’T turned into a flaming ball of molten aluminum while I was gone. However, when I logged into our backup server to work on our CommVault configuration, I got hit up with a prompt asking me to explain an unexpected shutdown that took place on Wednesday night.
As I logged onto each server and looked at the event logs and did some asking around, it became pretty apparent what had happened: The notices that I received on the way home from GCC were actually a result of our server rack’s UPS running out of juice after a lengthy power outage. Our servers, every last one, had gone down. Hard. Ouch.
How EVERY server managed to start back up with no errors is beyond me…WAY beyond me. But needless to say, the once-back-burner task of automating a shutdown process for our servers has come straight to the front.
I’ve spent the past 2 days dorking around with Tripp-Lite’s PowerAlert software. My plan is to have it execute a PsTools command called PsShutdown, which has the ability to shut down any Windows machine on your network remotely. So far though, I’ve yet to see any evidence of PowerAlert even trying to run the script. I’m going to mess with it a little more and then give Tripp-Lite’s tech support a call. In any event, I found myself thanking God yesterday that we were able to get off so easy from my mistake of doing IT “on the edge” like that. Things could have been much MUCH different on Thursday, in which case I might still be in the server room right now.
|
This past Tuesday I had the opportunity to drive out to Granger Community Church to hang with Jason, Ed, Kyle, and their team of volunteers as we got to see a demonstration of 2 high-powered products from Fluke Networks: The EtherScope and the OmniView. Kevin from Fluke did a fantastic job demonstrating what these units are capable of. The guy knows his stuff.
Jason has a very good write-up on some of the things that we picked up throughout the demo. You can read the rest yourself, but here’s a small tidbit…
The BIG take away of the night for all of us was the incredible impact bluetooth devices have on access points! A single bluetooth device in use causes an amazing amount of interference in the .11b/g range. Kevin said he’s seen as few as 12 bluetooth devices take down an access point! What?! We all started turning on our bluetooth phones and doing partner searches and the real-time wifi graphs on the Fluke’s were going nuts!
Just like everyone else, I was blown away by this. How in the world has this not been widely publicized? Did I miss it somewhere?
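Part of the reason is plain frequency overlap, which a few lines of Python can show. Classic Bluetooth hops across 79 one-megahertz channels from 2402 to 2480 MHz, while an 802.11b channel is roughly 22 MHz wide and sits in the same 2.4 GHz band. The sketch below only counts how many Bluetooth hop frequencies land inside one Wi-Fi channel. It does not model real interference, adaptive frequency hopping, or the 12-device figure quoted above, and the function names are made up for this example.

```python
# Center frequencies (MHz) of the 79 classic Bluetooth hop channels.
BLUETOOTH_CHANNELS_MHZ = [2402 + k for k in range(79)]


def wifi_center_mhz(channel):
    """Center frequency in MHz of 2.4 GHz Wi-Fi channels 1 through 13."""
    if not 1 <= channel <= 13:
        raise ValueError("expected a channel from 1 to 13")
    return 2407 + 5 * channel


def overlapping_hops(channel, half_width_mhz=11):
    """Bluetooth hop centers within half_width_mhz of the Wi-Fi center."""
    center = wifi_center_mhz(channel)
    return [f for f in BLUETOOTH_CHANNELS_MHZ if abs(f - center) <= half_width_mhz]


def overlap_fraction(channel):
    """Share of Bluetooth hop channels that fall inside the Wi-Fi channel."""
    return len(overlapping_hops(channel)) / len(BLUETOOTH_CHANNELS_MHZ)
```

For Wi-Fi channel 6 this counts 23 of the 79 hop frequencies, so a little under a third of a Bluetooth device's hops can land on top of that channel.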
Are wireless manufacturers going to use this to make .11a and .11n easier to market? I mean, seriously, was b/g equipment NOT tested against other wireless gear that would be used in like manner? “What’s that? Your b/g wireless products are falling victim to an influx of bluetooth devices? Here, we’ve got some sweet a and n units that will work MUCH better.”
Sounds like a good forced-upgrade plan. Maybe not, but it was enough to make my mind slip into conspiracy mode.
The next day I was able to hang with Jason, Ed, and Kyle and basically go through a normal work day with them. I had my laptop with me, so I VPN’d in to NewPointe and took care of some support issues when we weren’t in discussion about anything.
I finally arrived home around midnight exhausted, but VERY glad to have had the opportunity to hook up with everyone at GCC. Big thanks to Ed, Kyle, Jason, and their volunteer team for making an Ohio geek feel welcome. :-) I always leave there wiser than when I show up.
|
My adventures with the Norco disk system have continued throughout the day, and after some more tinkering and some software-aided intervention, our storage array for the editor seems to be back up and running.
Apple’s Disk Utility program failed to do any sort of repair work on the array. After many repair attempts, all I could get were directory errors. After browsing around for some Mac disk repair utilities, I landed on one that showed promise: DiskWarrior by Alsoft.
I fired up DiskWarrior, pointed it to our now-unmounted disk array, and watched it go to work. It identified the array and cleaned up the directory structure, various file attributes, and various other things, and in about 5 minutes the array was back online. SWEET! This was WELL worth the price tag (about $80).
Thinking things were back to normal, I went ahead and shut everything down so I could install a UPS in front of the Mac and the resurrected RAID box. After plugging everything in, I turned on the Norco drives, powered the Mac up, and watched in utter astonishment…as the system failed to recognize five of the twelve installed drives.
At this point I was about at my limit with this RAID array. Not really knowing what else to do, I went ahead and shut both the disks and the Mac completely down, and then restarted the RAID after about a minute. I let the disk system run for about 2 minutes before powering up the Mac. My thought on this was that if the backplane or anything else in the RAID box has to run a POST or anything, I’m going to give it plenty of time to do so before restarting the computer.
I pushed the power button on the Mac, and lo and behold, I heard many drives starting to spin up all at one time. In a few seconds, all 12 drives had spun up, and the RAID seemed to be back on its feet. A quick look through the Finder revealed that all of our stuff on the array appeared to be intact.
So is this over? I really don’t know. I really haven’t felt like testing to see if I can replicate the issue by starting the Mac “too soon” after the RAID array is powered up. What I am going to do is exchange our eSATA controller card for one that has been tested and is more compatible with the unit. If that clears things up, then I’ll feel more confident about marking this down as a hardware issue. At this point, it makes sense that 2 different chipsets wouldn’t play well together. But at the same time, you’d think there would be more consistency to it.
Maybe the constant here is just sheer unreliability. Time will tell. Until then, I’ll be copying our FCP project files onto a safer hard drive.
|