Backup Storage Part 5: Realization of a failure

No one likes admitting they’re wrong, and I’m certainly no different.  Being a mature person means being able to admit you’re wrong, even if it means doing it publicly, and that is what I’m about to do.

I’ve been writing this series slowly over the past few months, and during that time, I’ve noticed an increasing number of instances where the NTFS file systems on my Storage Spaces virtual disks would become corrupt.  Basically, I’d see Veeam errors writing to our repository, and when investigating, I would find files (old VBKs) that wouldn’t delete.  When trying to manually delete them, they would either throw an error, or they would act like they were deleted (they’d disappear), only to return a second later.  The only way to fix this (temporarily) was to do a check disk, which requires taking the disk offline.  When you have a number of backup jobs running at any time, this means something is going to crash, and with my luck it was always in the middle of a 4TB+ VM.
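
In case you’re wondering what I mean by a check disk here, this is roughly the PowerShell shape of it (the drive letter is made up).  The online scan is harmless, but the offline scan and fix is the part that takes the volume down and kills whatever jobs happen to be writing to it at the time:

    # Rough sketch only - drive letter is made up.
    Repair-Volume -DriveLetter E -Scan               # online check, non-disruptive
    Repair-Volume -DriveLetter E -OfflineScanAndFix  # full check disk, takes the volume offline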

Basically, what I’m saying is that as of this date, I can no longer recommend NTFS running on Storage Spaces, at least not on bare metal HW.  My best guess is we were suffering from bit rot, but who knows, since Storage Spaces / NTFS can’t tell me either way, or at least I don’t know how to figure it out.

All that said, I suspect I wouldn’t have run into these issues had I been running ReFS.  ReFS has online scrubbing, and it looks for things like failed CRC checks (and auto-repairs them).  At this point, I’m burnt out on running Storage Spaces, so I’m not going to even attempt to try ReFS.  Enough v1 product evals in prod for me :-).

Fortunately, I knew this might not work out, so my back-out plan is to take the same disks / JBODs and attach them to a few RAID cards.  Not exactly thrilled about it, but hopefully it will bring a bit more consistency / reliability back to my backup environment.  Long term, I’m looking at getting a SAN implemented for this, but that’s for a later time.

It’s a shame, as I really had high hopes for Storage Spaces, but as with many MS products, I should have known better than to go with a v1 release.  At least it was only backups and not prod…

Update (09/13/2016):

I wanted to add a bit more information.  At this point it’s just a theory, but in case this article is (or isn’t) dissuading you from doing Storage Spaces, it’s worth noting some additional information.

We had two NTFS volumes, each 100TB in size: one for Veeam and one for our SQL backup data.  We never had problems with the SQL backup volume (probably luck), but the Veeam volume certainly had issues.  Anyway, after tearing it all down, I was still bugged about the issue and felt really disappointed about the whole thing.  During some random googling, I stumbled across this link going over some of NTFS’s practical maximums.  In theory at least, we went over the tested (recommended) max volume size.

Again, I’m not one to hide things, and I fess up when I screw up.  Some of the Storage Spaces issues may have been related to us exceeding the recommended size, and NTFS couldn’t proactively fix things in the background.  I don’t know for sure, and I really don’t have the appetite to try it again.  I know it sounds crazy to have a 100TB volume, but we had 80TB of data stored in there.  In other words, most smaller companies don’t hit that size limit, but we had no problem at all exceeding it.  If you’re wondering why we made such a large volume, it really boiled down to wanting to maximize contiguous space as well as not wanting to waste space.  Storage Spaces doesn’t let you thin provision storage when it’s clustered, so if, for example, we had created five 20TB LUNs instead, the contiguous space would have been much smaller and ultimately more difficult to manage with Veeam.  We don’t have that issue anymore with CommVault, as it can deal with lots of smaller volumes with ease.

Anyway, while I would love to say MS shouldn’t let you format a volume larger than what they’ve tested (and they shouldn’t, at least not without a warning), ultimately the blame falls on me for not digging into this a bit more.  Then again, try as I might, I’ve been unable to validate the information posted on the linked blog above.  I don’t doubt the accuracy of the information; I often find fellow bloggers do a better job of explaining how to do something, or of conveying real world limits, than the vendor does.

Best of luck to you if you do go forward with Storage Spaces, and if you have questions, let me know; I worked with it in production for over a year, at a decent scale.

Quicky post: Centralized task management and workflow

Just wanted to post this quickly, and I may have time to do a more thorough post later, but this will do for now.

In our backup solution, we had to join together two products, CommVault and Veeam.  Veeam backed up the data to disk and CommVault backed the data up to tape.  The problem with this setup is that the systems were unaware of each other.  We really needed a solution that could take disparate systems and jobs and basically link them together in a workflow.  After a lot of searching we found this AWESOME product from MVP Systems called JAMS (Job Automation Management Scheduler).

If you’re managing tons of scheduled tasks (or rather, not able to manage them) and want alerts, job restarts, automatic logging, and support for cross-platform, cross-system, cross-job workflows, there’s not a product out there that I’ve seen that can beat it.

They’re not paying me, nor have they asked me to comment on the product.  This isn’t an official review, it’s just a “hey, you should really check these guys out” sort of post.

Backup Storage Part 4a: Windows Storage Spaces Gotchas

The pro and the con of software defined storage is that it’s not a turnkey solution.  Not only do you have the power to customize your solution, you also have no choice but to design your solution.  With Storage Spaces, we figured most of this stuff out before selecting it as our new backup storage solution.  At the time, there was some documentation on Storage Spaces, but it was very much a learning process.  Most “how to’s” were demonstrated inside labs, so I found some aspects of the documentation to be useless, but I was able to glean enough information to at least know what to think about.

So to get started, if you ended up here because you want to build a solution like this, I would encourage you to start with this FAQ that MS has put together.  A lot of the questions I had were answered there.

I want to go over a number of gotchas with WSS before you take the plunge:

  1. If you’re used to and demand a solution that notifies you automatically of HW failures, this solution may not be right for you.  Don’t get me wrong, you can tell that things are going bad, but you’ll need to monitor them yourself.  MS has written a health script, and I myself was also able to put together a very simple health script (I’ll post it once I get my GitHub page up, but there’s a rough sketch of the idea after this list).
  2. WSS only polls HW health once every 30 minutes.  You can’t change this.  That means if you rip a power supply out of your enclosure, it will take up to 30 minutes before the enclosure goes into an unhealthy state.  I confirmed this with MS’s product manager.
  3. Disk rebuilds are not automatic, nor are they what I would call intuitive.  You shouldn’t just rip a disk out when it’s bad, plop a new disk in and walk away.  There is a process that must be followed in order to replace a failed disk.  BTW, this process, as of now, is all PowerShell based (see the replacement sketch after this list).
  4. Do NOT cheap out on consumer grade HW.  Stick to the MS HCL found here.  There have been a number of stability problems reported with WSS, and it almost always has to do with not sticking to the HCL, rather than with WSS’s reliability.
  5. This isn’t specific to WSS, but do not plan on mixing SATA and SAS drives on the same SAS backplane.  Either go all SAS or go all SATA; avoid mixing.  For HDDs specifically, the cost difference between SATA and SAS is so negligible that I would personally recommend just sticking with SAS, unless you never plan to cluster.
    1. Do NOT use SAS expanders either.  Again, stop cheaping out, this is your data we’re talking about here.
  6. Do NOT plan on using parity based virtual disks; they’re horrible at writes, as in 90MBps tops.  Use mirroring or nothing.
  7. Do NOT plan on using dedicated hot spares; instead, plan on reserving free space in your storage pool.  This is one of many advantages of WSS: it uses free space across ALL disks to rebuild your data.
    1. If you plan on using “enclosure awareness”, you need to reserve one drive’s worth of free space times the number of enclosures you have.  So if you have 4 enclosures and you’re using 4TB drives, you must reserve 16TB of space per storage pool spanned across those enclosures.
  8. On top of point 7, plan on keeping at least 20% free space in your pool.  I like to plan like this:
    1. Subtract 20% right off the top of your raw capacity.  So if you have 200TB raw, that’s 160TB.
    2. As mentioned in point 7.1, if you plan to use enclosure awareness, subtract a drive for each enclosure; otherwise, subtract at least one drive’s worth of capacity.  So if we had 4 enclosures with 4TB drives, that would be 160TB – 16TB = 144TB usable before mirroring.
  9. I recommend thick provisioned disks, unless you’re going to be VERY diligent about monitoring your pool’s free space.  Your storage pool will go OFFLINE if even one disk in the pool runs low on free space.  Thick provisioning prevents this, since the space is fully reserved up front and the pool can’t be overcommitted.
  10. Figure out your stripe width (the number of disks in a span, i.e. the column count) before building a virtual disk.  Plan this carefully because it can’t be reversed.  This has its own set of gotchas (there’s a creation example a little further down).
    1. This determines the increments of disks you need to expand your pool.  If you choose a 2 column mirror, that means you need 4 disks to grow the virtual disk.
    2. Enclosure awareness is also taken into account with columns.
    3. Performance is dependent on the number of columns.  If you have 80 disks and you create a 2 column mirrored virtual disk, you’ll only have the read performance of 4 disks and the write performance of 2 disks (even with 80 disks).  You will, however, be able to grow in 4 disk increments.  On the other hand, if you create a virtual disk with 38 columns, you’ll have the read performance of 76 disks and the write performance of 38, but you’ll need 76 disks to grow it.  So plan your balance of growth vs. performance.
  11. Find a VAR that has WSS experience to purchase your HW through.  Raid Inc. is who we used, but now other vendors such as Dell have also taken to selling WSS approved solutions.  I still prefer Raid Inc due to pricing, but if you want the warm and fuzzies, it’s good to know you can now go to Dell.
  12. Adding storage to an existing pool does not rebalance the data across the new disks.  The only way to do this is to create a new virtual disk, move the data over, and remove the original virtual disk.
    1. This is resolved in Server 2016.
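
To give you an idea of what I mean by a simple health script (point 1 above), below is a bare-bones sketch of the kinds of checks involved.  It is not the script I actually run (that will come with the GitHub page); the 20% free space threshold and the email details are just placeholders, so adjust them to your environment.

    # Bare-bones Storage Spaces health check sketch - thresholds and alerting are placeholders.
    $problems = @()

    # Anything in the pool / virtual disk / physical disk layers that isn't healthy
    $problems += Get-StoragePool -IsPrimordial $false | Where-Object HealthStatus -ne 'Healthy'
    $problems += Get-VirtualDisk  | Where-Object HealthStatus -ne 'Healthy'
    $problems += Get-PhysicalDisk | Where-Object HealthStatus -ne 'Healthy'

    # Enclosure health (fans, power supplies, etc.) - remember point 2, this only refreshes every 30 minutes
    $problems += Get-StorageEnclosure | Where-Object HealthStatus -ne 'Healthy'

    # Pool free space (point 8) - flag any pool that drops below 20% free
    $problems += Get-StoragePool -IsPrimordial $false |
        Where-Object { ($_.Size - $_.AllocatedSize) / $_.Size -lt 0.20 }

    if ($problems) {
        $body = $problems | Format-List FriendlyName, HealthStatus, OperationalStatus | Out-String
        # Swap this for whatever alerting you actually use (the addresses below are made up)
        Send-MailMessage -To 'storage-alerts@example.com' -From 'wss@example.com' `
            -Subject 'Storage Spaces health warning' -Body $body -SmtpServer 'smtp.example.com'
    }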

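And for point 3, the failed disk replacement flow looks something like the sketch below.  It is simplified, assumes a single failed disk, and the pool / virtual disk names are made up, so treat it as the general shape of the process rather than a runbook.

    # General shape of replacing a failed disk (names are made up, assumes one failed disk)
    $bad = Get-PhysicalDisk | Where-Object HealthStatus -ne 'Healthy'

    # Retire the disk so Storage Spaces stops allocating new data to it
    Set-PhysicalDisk -FriendlyName $bad.FriendlyName -Usage Retired

    # Rebuild the affected virtual disk into the pool's free space, then watch the progress
    Repair-VirtualDisk -FriendlyName 'VeeamRepo1'
    Get-StorageJob

    # Once the rebuild finishes, pull the retired disk out of the pool, physically swap it,
    # and add the replacement back in
    Remove-PhysicalDisk -StoragePoolFriendlyName 'BackupPool' -PhysicalDisks $bad
    $new = Get-PhysicalDisk -CanPool $true
    Add-PhysicalDisk -StoragePoolFriendlyName 'BackupPool' -PhysicalDisks $new
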
Those are most of the gotchas to consider.  I know it looks like a big list, but I’m pretty confident that every storage vendor you look at has their own unique list of gotchas.  Heck, a number of these are actually very similar to NetApp/ZFS and other tier 1 storage solutions.
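
Since points 6, 7.1, 9 and 10 are all decisions you make when the virtual disk is created, here is roughly what that creation looks like in PowerShell.  The pool name, virtual disk name, column count and size are made up for illustration; the point is that resiliency, columns, provisioning type and enclosure awareness all get set here, and most of them can’t be changed afterwards.

    # Example only - names, column count and size are made up
    $vdiskParams = @{
        StoragePoolFriendlyName = 'BackupPool'
        FriendlyName            = 'VeeamRepo1'
        ResiliencySettingName   = 'Mirror'   # point 6: mirror, not parity
        NumberOfColumns         = 8          # point 10: stripe width, a one-shot decision
        ProvisioningType        = 'Fixed'    # point 9: thick provisioning
        IsEnclosureAware        = $true      # point 7.1: tolerate losing an enclosure
        Size                    = 40TB
    }
    New-VirtualDisk @vdiskParams
    # After that you initialize, partition and format it like any other disk.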

We’ll nerd out in my next post on what HW we ended up getting, what it basically looks like and why.

Backup Storage Part 3d: Traditional SAN / NAS

Of all the types of storage we looked at, this was the one type that I kept coming back to.  It was the storage technology I had the most personal experience with, and having just finished implementing my company’s first SAN, I was pretty well versed in what was out on the market.

There were other reasons as well.  Traditional storage is arguably the most familiar storage technology out there.  Even if you don’t understand SAN, you probably have a basic understanding of local RAID, and to some degree that knowledge is far more transferable to traditional storage than to something like a scale out solution.  More disks generally = faster, RAID 10 = good for writes, and 7k drives are slower than 10k drives.  I’m oversimplifying on purpose, but you can see where I’m coming from.  The point is, adopting a storage solution where existing skills transfer is a big win when you don’t have time to be an expert in everything.  On top of that, you can NOT beat the cost per GB or the performance edge of traditional storage compared to all its fancier alternatives.  I’m not saying there aren’t cases where deduplication targets will offer a better cost per GB, nor am I saying scale out or cloud don’t have their own winning use cases.  However, as a general purpose solution that does a little bit of everything well and a few things great, you can’t beat traditional storage.

By now, it’s probably pretty apparent that we went with a traditional storage solution, and I suspect you’re curious as to whose.  Well, I think you’ll find the answer a little surprising, but let’s first dig into who/what we looked at.

  1. Dell Equallogic (6510s):  I actually have a good bit of experience with Equallogic.  In fact, at my previous employer we had been running Equallogic for a good 6 years, so there wasn’t a whole lot of research we needed to do here.  Ultimately, Equallogic, while a great SAN (during its heyday), just couldn’t compete with what was out in the 2014 time frame.
    1. 15TB LUN limit in the year 2014 is just stupid.  Yes I know I could have spanned volumes in the OS, but that’s messy and frankly, I shouldn’t need to do it.
    2. The cost, while better than most, was still on the pricey side for backup.  Things like the lack of 4TB drive support and not being able to add trays of storage (instead being forced to buy an entire SAN to expand) just made the solution more expensive than it was worth.
    3. I didn’t like that it was either RAID 10 or RAID 50.  Again, it’s 2014, and they had no support for RAID 60.  Who in their right mind would use RAID 50 with drives greater than 1TB?  Yes, I know they have RAID 6, but again, who is going to want a wide disk span with RAID 6?  That might be fine for 24 400GB drives, but that’s totally not cool with 48 (-2 for hot spares) 3TB drives.
  2. Nimble CS 200 series:  This is our current tier 1 SAN vendor and still my favorite all-around enterprise SAN.  I REALLY wanted to use them for backup, but they ultimately weren’t a good fit.
    1. If I had to pick a single reason, it would be price.  Nimble’s affordability as a storage vendor isn’t just about performance; it’s also that they do inline compression.  The problem is, the data I’d be storing on them would already be compressed, so I wouldn’t be taking advantage of their “usable” capacity price point and instead would be left with their raw price point.  I even calculated adding shelves of storage, 100% maxed out, and the price point was still way above what we could afford.  Add to all of that the fact that, for backup storage, they really didn’t have a ton of capacity at the time: 24TB for the head SAN, and 33TB for each shelf, with a 3 shelf maximum.  That’s 123TB usable, at 100% max capacity.  It would have taken up 12 rack units as well, and that’s if we stacked them on top of each other (which we don’t).
    2. Performance *may* have been an issue with them for backup.  Most of Nimble’s performance story is based on the assumption that your working data set lives in flash.  Well, my backup working data set is somewhere between 30TB and 40TB over a weekend, so good luck with that.  What this means is the spindles alone would need to be able to support the load.  While the data set would be mostly sequential, which is generally easy for disks, it would be A LOT of sequential workloads in parallel, which Nimble’s architecture just wouldn’t be able to handle well IMO.  If it was all write or even all read, that may have been different, but this would be 50% to even 75% write, with 25% – 50% reads.  Add to that, some of our workload would be random, but not accessed frequently enough (or small enough) to fit in cache, so again, left to the disks to handle.  There’s a saying most car guys know, which is “there’s no replacement for displacement”, and in the case of backup, you can’t beat lots of disks.
  3. ZFS:  While I know it’s not true for everyone, I’d like to think most folks in IT have at least heard of ZFS by now.  Me personally, I heard about it almost 8 years ago on a message board, when a typical *NIX zealot was preaching about how horrible NTFS was and how they were using the most awesome filesystem ever, called ZFS.  Back then I think I was doing tech support, so I didn’t pay it much mind, as file system science was too nerdy for my interests.  Fast forward to my SAN evaluation, and good ‘ol ZFS ended up in my search results.  Everything about it sounded awesome, except that it was not Windows admin friendly.  Don’t misunderstand me, I’m sure there are tons of Windows admins that would have no problem with ZFS and no GUI, but I had to make sure whatever we implemented was easy enough for the average admin to support.  Additionally, I ideally wanted something with a phone number that I could call for support, and I really wanted something that had HA controllers built in.  After a lot of searching, it was clear there are at least 50 different ways to implement ZFS, but only a few of what I would consider enterprise implementations of ZFS.
    1. Nexenta:  I looked at these guys during my SAN evaluation.  Pretty awesome product that was unfortunately axed from my short list almost right away when I saw that they licensed their storage based on RAW capacity and not usable capacity.  While there was tiered pricing (the more you buy, the cheaper it gets), it was still way too expensive for backup grade storage.  For the price we would have paid for their solution, plus HW, we would have been better off with Nimble.
    2. TrueNAS:  I had heard a lot about, and even tried out, FreeNAS.  TrueNAS was an enterprise supported version of FreeNAS.  Unfortunately I didn’t give them much thought because I honestly couldn’t find one damn review about them on the web.  Add to that, I honestly found FreeNAS to be totally unpolished as a solution.  So many changes required restarting services, which led to storage going offline.  Who knows, maybe these services were replaced with more enterprise-friendly services in the TrueNAS version, but I doubted it.
    3. Napp-IT:  I tried running this on OpenIndiana and on a Joyent distribution of Unix (can’t recall its name).  In either case, while I had no doubt I could get things running, I found the GUI so unintuitive that I just reverted to the CLI for configuring things.  This of course defeats the purpose of looking at Napp-IT to begin with.  The only positive thing I could say about it is that it did make it a little easier to see what the labels were for the various drives in the system.  On top of all this, no HA was natively built into this solution (not without a third party solution), so this was pretty much doomed to fail from the beginning.  If we were a shop that had a few full time *nix admins, I probably would have been all over it, but I didn’t know enough to feel 100% comfortable supporting it, and I couldn’t expect others on my team to pick it up either.
    4. Tegile:  Not exactly the first name you think of when you’re looking up ZFS, but they are in fact based on ZFS.  Again, they were a nice product, but way too expensive for backup storage.
    5. Oracle:  Alas, in desperation, knowing there was one place left to look, one place that I knew would check all the boxes, I hunkered down and called Oracle.  Now let me be clear, *I* do not have a problem with Oracle, but pretty much everyone at my company hates them.  In fact, I’m pretty sure everyone that was part of the original ZFS team hates Oracle too, for obvious reasons.  Me, I was personally calling them as a last resort, because I just assumed pricing would be stupid expensive and a huge waste of my time.  I thought wrong!  Seriously, in the end these guys were my second overall pick for backup storage, and they’re surprisingly affordable.  As far as ZFS goes, if you’re looking for an enterprise version of ZFS, join the dark side and buy into Oracle.  You get enterprise grade HW 100% certified to work (and with tech support / warranty), a kick ass GUI, and compared to the likes of Tegile, NetApp and even Nexenta, they’re affordable.  Add to that, you’re now dealing with the company that owns ZFS (so to speak), so you’re going to end up with decent development and support.  There were only a few reasons we didn’t go with Oracle:
      1. My company hated Oracle.  Now, if everything else was equal, I’m sure I could work around this, but it was a negative for them right off the get-go.
      2. We have a huge DC and rack space isn’t a problem in general, but man do they chew up a ton of rack space.  24 drives per 4U wasn’t the kind of density I was looking for.
      3. They’re affordable, but only at scale.  I think when comparing them against my first choice, we didn’t break even until I hit six JBODs filled with 4TB drives.  And that assumed that my first choice was running brand new servers.
      4. Adding to point 3, I couldn’t fill all their JBODs.  They have either 20 drive or 24 drive options.  I needed to allow room for ZIL drives, which meant at a minimum one JBOD would only be populated with 20 4TB drives.
      5. Those that work with ZFS know you need LOTS of memory for metadata/caching.  In my case, we were looking at close to 400TB of raw capacity, which meant close to 400GB of DRAM or DRAM + SSD.  Oracle’s design specifically doesn’t use shared read cache and instead populates each controller with its own read cache.  In my case, that meant I was paying for cache in a second controller that would rarely get used, and those cache drives weren’t cheap.
  4. Windows Storage Spaces (2012 R2):  I know what you’re thinking, and let me be clear: despite what some may say, this is not the same solution as old-school Windows RAID.  Windows Storage Spaces is surprisingly a very decent storage solution, as long as you understand its limitations and work within them.  It’s very much their storage equivalent of Hyper-V.  It’s not as polished as other solutions, but in many cases it’s likely good enough, and in our case, it was perfect for backup storage.  It ultimately beat out all the other storage solutions we looked at, which if you’ve been following this series is a lot.  As for why?  Here are a few reasons:
    1. We were able to use commodity HW, which I know sounds cliche, but honestly, it’s a big deal when you’re trying to keep the cost per TB down.  I love Dell, but their portfolio was not only too limiting, it’s also expensive.  I still use Dell servers, but everything else is generic (LSI, Seagate) HW.
    2. With the money we saved going commodity, we were able to not only buy more storage, but also design our storage so that it was resilient.  There’s not a single component in our infrastructure that’s not resilient.
      1. Dual servers (failover clustering)
      2. Dual-port 10Gb NICs with teaming
      3. Dual quad-port SAS cards with MPIO for SAS failover in each server
      4. A RAID 10 distributed across 4 JBODs in a way that we can survive an entire JBOD going offline
    3. Because these are servers running Windows, we could also dual-purpose them for other uses; in our case, we were able to use them as Veeam proxies.  Storage Spaces is very low in memory and CPU consumption.
    4. Storage Spaces is portable.  If we find that we grow out of the JBODs or the servers, it’s as simple as attaching the storage to new servers, or moving the disks from one JBOD into a newer one.  Windows will import the storage pool and virtual disk, and you’ll be back up and running in no time.
    5. Storage Spaces offered a good compromise of enabling us to build a solution on our own, while still allowing us to have a vendor we could call in the event of a serious issue.  Sure, there is more than one neck to choke now, but honestly, it’s not that much different than VMware and Dell.
    6. It’s surprisingly fast, or at least Windows doesn’t add overhead like you might think.  I’ll go more into my HW specs later, but basically I’m limited by my HW, not Windows.  As a teaser, I’m getting over 5.8GBps sequential read (that’s bytes with a big B).  Normally I’m not a big fan of sequential IO as a benchmark, but for backup, most of what goes on is sequential IO, so it’s a pretty good IO pattern to demonstrate (there’s a note on measuring this after the list).  Again, it’s not the fastest solution out there, but it’s fast enough.
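
If you want to sanity check numbers like that on your own HW, DiskSpd (Microsoft’s free IO benchmark tool) is the simplest thing I can point you at.  Something along these lines approximates a large sequential read test; the path, test file size, thread count and queue depth below are made up for illustration, so tune them to your own setup and double check the switches against the DiskSpd docs.

    # Illustrative DiskSpd run only - path, file size, thread and queue depth values are made up.
    # -w0 = 100% reads, -b512K = 512KB blocks, -Sh = bypass caching, -d60 = run for 60 seconds.
    .\diskspd.exe -c100G -b512K -d60 -t8 -o8 -w0 -Sh -L E:\diskspd-test.dat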

This is the final post on the types of storage we looked at.  All in all, it was pretty fun to check out all of the solutions out there, even if it was for something that was kind of on the boring side of IT.  Coming up, I’m going to start going over how we tackled going from nothing to a fairly large, fast and highly resilient Storage Spaces solution.

Backup Storage Part 3c: Cloud Storage

Anyone who’s heard the word “cloud” as it pertains to technology knows it’s a fairly nebulous term.  About the only thing people seem to grasp is that it means “not in my datacenter or home”.  When we talk about “cloud storage” the term is still fuzzy, but at least we know it has something to do with storage.  For this post, I’m going to be writing about the two specific types of storage outlined below.

  • Online / Realtime storage:  This is just a name I’m going to use to describe the type of storage that you’d normally run cloud hosted VMs on.  I like to think of it as cloud SAN.  This type of storage is great for primary backup storage, both because of its performance capability and because it’s always online.  However, due to its price, it may not be the most prudent cloud storage option for long term retention or for secondary copies.
  • Nearline or Archive storage: This storage is kind of like tape (and depending on the vendor, may actually be tape).  You don’t write to or read from this storage directly; instead you use either a virtual appliance or some form of API / SW.  Archive and nearline storage are great options for secondary storage or for hosting secondary copies of your backups.  Other than how you interact with the storage, the only downside tends to be the potential for longer recovery times.

With two great options, what’s not to love?  You have the perfect mix of short term, fast storage for your more recent backups, and your cheap and deep storage for your secondary and extended retention copies.

Initially it sounded great, until we started digging deeper into it.  Ultimately we ended up having to pass on the solution for a number of reasons below.

  1. Cost:
    1. Let’s face it, unless you’re one of the lucky few, you, like us, have a tough enough time getting budget for things that might actually make your company more productive.  Trying to get your company to invest heavily in something they might never (or rarely) need to use is not likely to happen.
    2. This one’s a toss-up, but if your company prefers capex over opex, cloud is not going to be an easy battle for you.  In our case, capex is preferred, which gave cloud storage a big black eye.
    3. The storage is only one small piece of the cost of cloud options.
      1. Your network pipe is probably not big enough for all the data you need to send.  This mostly depends on your backup solution as a whole (software optimized backups with dedupe and compression may help), but I suspect you don’t have enough bandwidth to send your weekly full, let alone try to recover one (a rough worked example follows this list).  So on top of the never-ending cost of cloud storage, you’re going to need to add a more expensive pipe to get the data there, and maybe even to pull the data back.
      2. Potentially in addition to the network, or maybe instead of the network, you’ll need to invest in a cloud gateway or some other backup data optimizing software.  Either way, you’re going to be investing even more capex on top of your increasing opex.
  2. DR: Our new DR plan wasn’t finalized yet, which made it really difficult to pick a solution that we weren’t sure would make sense in two years.  For example, if we move our DR solution to the cloud, some of the considerations in point 1 go away, as we’ll be running our secondary site in the cloud anyway.  However, if we decided to stay in a colocation unit, while the cloud technically speaking could still work, it wouldn’t make sense to send our data to the cloud compared to just sending it to our DR site.
  3. Speed:  While point 1/3/1 (network), if invested in correctly, shouldn’t pose a bottleneck, it’s tough to argue against copying our data to tape from a performance perspective.  We easily saturate 5 LTO6 tape drives in parallel (off our new backup storage solution, to be discussed in an upcoming post).  There’s no way it would be cost effective to get that same level of performance out of cloud storage.
  4. Integration:  While more and more backup vendors are integrating cloud storage APIs, they’re not always free, and they’re not always good.  Veeam, as an example, had a cloud option for their product, but it was terribly inefficient (as I recall).  There was no backup optimization to the cloud; Veeam was simply copying files for you to the cloud.  It’s simple, but not efficient.  This also goes back to point 1/3/2.  We could have worked around this limitation with different backup SW or cloud gateways, but again, we’re talking about adding cost to an already limited budget.

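To put some rough numbers behind point 1/3/1, here is the kind of back-of-the-napkin math we did.  The 40TB weekly full and the 1Gbps pipe are round example numbers, not our actual figures:

    # Back-of-the-napkin math (example numbers, not our actual figures)
    $weeklyFullTB = 40      # size of the weekly full in TB
    $linkGbps     = 1       # dedicated upload bandwidth in Gbps

    $bits    = $weeklyFullTB * 1TB * 8      # total bits to move
    $seconds = $bits / ($linkGbps * 1e9)    # at line rate, ignoring protocol overhead
    "{0:N1} days just to upload the weekly full" -f ($seconds / 86400)
    # Roughly 4 days, and that's before you ever try to pull anything back down
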
At the end of the day, I really wanted cloud, but financially speaking it doesn’t make much sense for us to back up to the cloud.  While the price of the storage itself is quite reasonable, the cost of the pipe or SW to get the data there is what kills the solution.  I’m not saying it doesn’t make sense for others, but for us, our data was too big and the cost would have been too high.

Factors that would change my view are below:

  1. If ISPs’ prices were dropping at the same rate as cloud vendors’, I could see cloud becoming more affordable.
  2. If the cloud providers themselves started offering deduplication targets as part of their storage offering, I think this would make a big difference.  Perhaps instead of charging for what we physically consume (per GB), they could charge for what we logically consume.  This way they still win, and so do we.

Backup Storage Part 3b: Scale Out Storage

While evaluating EMC’s DataDomain solution, our EMC rep suggested we also take a look at their Isilon product.  Not being a person to turn down a chance to check out new technology, I happily obliged.

Before we get into my thoughts on Isilon, let’s go over a simplistic overview of what scale out storage is, and what makes it different from, say, an EMC VNX.

SANs like the EMC VNX are what I’ll refer to as more traditional storage.  Typically they’re deployed with two controllers, and each controller shares access to one or more JBODs.  Within this architecture, there are a few typical “limitations”.  I’m putting limitations in quotes because in a lot of cases most folks don’t run into these limitations when a system is properly designed.

  1. A controller is really just an x86 server, and like any server, it’s only going to have so many PCI slots.  Just like you would in any server, you need to balance the use of those PCI slots.  In most cases, you’re going to be balancing how many slots are used for host uplinks (targets) versus how many are used for storage connections (initiators).
  2. While some high-end SANs support true active / active (and more than two controllers, BTW), your average configuration is going to have one controller active and the other one waiting to take over.  Meaning, 50% of your controllers are doing nothing all day long.
    1. Some people will divide their storage up and have both controllers hosting LUNs.  Meaning if there are 50 disks, 25 disks may be active on controller 1, and the remaining 25 will be active on controller 2.  In the case of a two controller configuration, your net worst-case performance will be that of a single controller.
  3. When you max out the number of disks, CPU, or target/initiator ports that a SAN can host, you need to deploy another SAN.  At this point, it’s likely you’re leaving some amount of resources on an island, never to be utilized again.  Maybe it’s free space, maybe it’s CPU; no matter what, something is being left under-utilized.
    1. This has a negative effect when we start talking about things like file shares.  Now you’re spreading your file system over unrelated storage units, and it can get messy quick.

This is the way most shops run, and honestly it’s not a bad thing.  Having some extra of something isn’t always a negative, as long as it’s the right extra.

So how does scale out fit into all of this?  What does any of this have to do with backup?  Well, first let’s start with the basics of scale out storage.  There are a few things to keep in mind.

  1. Scale out no longer uses shared disk.  There is instead 1 controller to 1 set of disks and this forms a node or a brick.  You then connect multiple nodes together over a backbone network (think a private network used for node to node communication only). This forms a scale out cluster.  Over the scale out cluster you typically have one file system, or rather one contiguous allocated amount of space.
  2. The number of nodes in a cluster is mostly limited by the number of backbone connections that can be provisioned.  In the case of a 48 port infiniband switch, that means you basically have room for 48 nodes in a single cluster.
  3. Resiliency within the cluster is maintained by providing copies or parity at not only an individual disk level, but also a node level.
  4. Typically (not always) scale out is accessed over SMB/NFS and not iSCSI or FC.

So what makes this architecture so great, and why did we look at it for backup?  Well I’m going to highlight what we learned from speaking with EMC about Isilon.

First, what makes the architecture slick.

  1. There’s a lot less waste (at scale) in this architecture.  You don’t have pools of capacity spread across multiple SANs going unused.  With scale out, it’s one massive pool of storage.
    1. A cluster is made up of 3 or more like nodes.
    2. Nodes can be divided up into tiers based on their performance.  Each performance tier must have a minimum of three nodes to be added into a cluster.
  2. To grow capacity, you simply add a node into the cluster.  Depending on your resiliency scheme, you may end up adding 100% of the node’s capacity into the cluster.
    1. One big win with scale out that most traditional SANs don’t have is that when you do add capacity, the data in use is rebalanced across all nodes, thus ensuring every node is being evenly used.
  3. Isilon offers some pretty slick management capabilities.  For example, being able to control the resiliency settings at not only the cluster level but also down to the folder / file level.  Maybe that archive you have doesn’t need the same resiliency as your active files.  This same management capability also applies to performance policies.  Your archive data can reside on slow spinning disk and your active data can reside on their 10k RPM tier.
  4. Performance increases linearly as you add nodes.

So what makes this potentially great for backup?

  1. The lack of wasted space means you’re able to suck 100% of the capacity out of your storage (after parity).  This means no space being wasted across LUNs or different SAN units; 100% of the space is allocated across these devices.
  2. Due to the simplicity of adding nodes hot, and thus adding capacity, if you start running out of space there’s no need to do crazy things like expand LUNs, move data to new LUNs, or migrate data within a storage pool so that it’s rebalanced.  All this stuff goes away with scale out.
  3. The amount of space in general supported by this solution should be able to hold a lot of your data for a long period of time.
  4. Throughput is pretty good on this architecture and it only gets better as you add nodes.

All this said, no architecture is perfect and Isilon is no exception.  There were a few negatives that ultimately led to us passing on it at the time of evaluation.

  1. We planned to back up to the Isilon unit and then back up the Isilon unit to tape.  The only way to do this (correctly) was to purchase their NDMP appliance.  This wouldn’t have been an issue per se, except that the NDMP appliance IIRC has some pretty terrible backup throughput numbers.  I don’t recall the exact numbers, but I want to say something like 200MBps.  We were going to be backing up over 100TB of data, and 200MBps wasn’t going to cut it.
  2. Isilon is cool, and it has a hot price tag along with it.  While I’m sure at scale the cost per GB would be decent, it just wasn’t something we could afford.  You have to buy three nodes at a minimum, and we’d only have one node’s capacity.  It would have blown our storage budget and we would have ended up with a lot less capacity than other solutions.
  3. Performance does scale as you add nodes, but it takes A LOT of nodes to equal the performance of some other solutions we were looking at.  So again, at scale, this may have been a contender, but not at the levels we were thinking about.
  4. They had no cloud solution, meaning I couldn’t replicate to an Isilon unit in AWS.  This wasn’t a deal breaker but it wasn’t ideal either.
  5. Like most EMC solutions, it was mired in all kinds of licensing features and various ways to nickel and dime you to death.  Half of the cool features we heard about were a la carte licensed features, each jacking the price up more and more.

All in all, a very cool solution that’s unfortunately too expensive and too limiting for us.  I think if we were contending with PBs of data, this solution would be the only right way to tackle it, but in our case, a few hundred TB can easily be managed by plenty of other solutions.

Backup Storage Part 3a: Deduplication Targets

Deduplication and backup kind of go hand in hand, so we couldn’t evaluate backup storage and not check out this segment.  We had two primary goals for a deduplication appliance.

  1. Reduce rack space while enabling us to store more data.  As you know from part 1, we had a lot of rack space being consumed.  While we weren’t hurting for rack space in our primary DC, we were in our DR DC.
  2. We were hoping that something like a deduplication target would finally enable us to get rid of tape and replicate our data to our DR site (instead of sneakernet).

For those of you not particularly versed in deduplicated storage, there are a few things to keep in mind.

  • Backup deduplication and the deduplication you’ll find running on high performance storage arrays are a little different.  Backup deduplication tends to use either variable or much smaller block size comparisons.  For example, your primary array might be looking for 32k blocks that are the same, whereas a deduplication target might be looking for 4k blocks that are the same.  Huge difference in the deduplication potential.  The point is, just because you have deduplication baked into your primary array does not mean it’s the same level of deduplication that’s used in a deduplication target (there’s a toy illustration of this after the list).
  • Deduplication targets normally include compression as well.  Again, it’s not the same level of compression found in your primary storage array; it’s typically a more aggressive (CPU intensive) compression algorithm.
  • Deduplication targets tend to do in-line deduplication.  Not all do, but the majority of the ones I looked at did.  There are pros and cons to this that I’ll go into later.
  • Of all the appliances I’ve looked at, every one of them had a primary access method of NFS/SMB.  Some of them also offered VTL, but the standard deployment method is for them to act as a file share.
  • Not all deduplication targets offer what’s referred to as global deduplication.  Depending on the target, you may only deduplicate at the share level.  This can make a big difference in your deduplication rates.  A true global deduplication solution will deduplicate data across the entire target, which is the most ideal.

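If the block size point above feels abstract, here is a toy way to see it.  The sketch below hashes a file in fixed size chunks and counts how many chunks repeat; it is nothing like how these appliances actually work (they use variable size chunking, compression and a lot more cleverness), it just shows why a smaller chunk size tends to find more duplicate data.  The sample path is made up.

    # Toy illustration only - fixed-size chunk hashing, not a real dedupe engine.
    # Run it with -ChunkSize 4KB and again with -ChunkSize 32KB against the same file
    # and compare the duplicate percentages.
    param(
        [string]$Path      = 'C:\Backups\sample.vbk',   # made-up sample file
        [int]   $ChunkSize = 4KB
    )

    $sha    = [System.Security.Cryptography.SHA256]::Create()
    $stream = [System.IO.File]::OpenRead($Path)
    $buffer = New-Object byte[] $ChunkSize
    $seen   = @{}
    $chunks = 0
    $dupes  = 0

    while (($read = $stream.Read($buffer, 0, $ChunkSize)) -gt 0) {
        $hash = [BitConverter]::ToString($sha.ComputeHash($buffer, 0, $read))
        if ($seen.ContainsKey($hash)) { $dupes++ } else { $seen[$hash] = $true }
        $chunks++
    }
    $stream.Close()

    "{0} chunks of {1} bytes: {2} duplicates ({3:P1})" -f $chunks, $ChunkSize, $dupes, ($dupes / $chunks)
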
Now I’d like to elaborate a bit on the pros and cons of in-line vs. post process deduplication.

Pros of In-Line:

  • As the name implies, data is instantly deduplicated as it’s being absorbed.
  • You don’t need to worry about maintaining a buffer or landing zone space like post process appliances need.
  • Once an appliance has seen the data (meaning it’s getting deduplication hits), writes tend to be REALLY fast since it’s just metadata updates.  In turn, replication speed also goes through the roof.
  • You can start replication almost instantly or in real time depending on the appliance.  Post process can’t do this, because you need to wait for the data to be deduplicated.

Pros of Post Process:

  • Data written isn’t deduplicated right away, which means if you’re doing, say, a tape backup right afterwards, or a DB verification, you’re not having to rehydrate the data.  Basically, they tend to deal with reads a lot better.
  • Some of them actually cache the data (un-deduplicated) so that restores and other actions are fast (even days later).
  • I know this probably sounds redundant, but random disk IO in general is much better on these devices.  A good use case example would be doing a Veeam VM verification.  So not only reads in general, but random writes.

Again, like most comparisons, you can draw the inverse of each device’s pros to figure out its cons.  Anyway, on to the devices we looked at.

There were three names that kept coming up in my research: EMC’s DataDomain, ExaGrid and Dell.  It’s not that they’re the only players in town; HP, Quantum, Sepaton, and a few others all had appliances.  However, EMC and ExaGrid were well known, and we’re a Dell shop, so we stuck with evaluating these three devices.

Dell DR series appliances (In-line):

After doing a lot of research, discussions, demos, the whole 9 yards, it became very clear that Dell wasn’t in the same league as the other solutions we looked at.  I’m not saying I wouldn’t recommend them, nor am I saying I wouldn’t reconsider them, but not yet, and not in the platform’s current iteration.  That said, as of this writing, it’s clear Dell is investing in this platform, so it’s certainly worth keeping an eye on.

Below are the reasons we weren’t sold on their solution at the time of evaluation.

  • At the time, they had a fairly limited set of certified backup solutions.  We planned to dump SQL straight to these devices, and SQL wasn’t on the supported list.
  • They often compared their performance to EMC, except they were typically quoting their source side deduplicated protocol vs. EMC’s raw (unoptimized) throughput.  Meaning it wasn’t an apples to apples comparison.  When you’re planning on transferring 100TB+ of data on a weekly basis and not everything can use source side deduplication, this makes a huge difference.  At the time we were evaluating, Dell was comparing their DR4100 vs. a DD2500.  The reality is, the Dell DR6100 is a better match for the DD2500.  Regardless, we were looking at the DD4200, so we were way above what Dell could provide.
  • They would only back a 10:1 deduplication ratio.  Now this I don’t have a problem with.  I’d much rather a vendor be honest than claim I can fit the moon in my pocket.
  • They didn’t do multi-to-multi replication.  Not the end of the world, but also kind of a bummer.  Once you pick a destination, that’s it.
  • Their deduplication was at a share level, not global.  If we wanted one share for our DBAs and one for us, no shared deduplication.
  • They didn’t support snapshots.  Not the end of the world, but it’s 2015; snapshots have kind of been a thing for 10+ years now.
  • Their source side deduplication protocol was only really suited to Dell products.  Given that we weren’t planning on going all in with Dell’s backup suite, this was a negative for us.
  • No one, and I mean no one was talking about them on the net.  With EMC or ExaGrid, it wasn’t hard at all to find some comments, even if they were negative.
  • They had a very limited amount of raw data (real usable capacity) that they could offer.  This is a huge negative when you consider that splitting off a new appliance means you just lost half or more of your deduplication potential.
  • There was no real analysis done to determine if they were even a good fit for our data.

ExaGrid (Post process):

I heard pretty good things about ExaGrid after having a chat with a former EMC storage contact of mine.  If EMC has one competitor in this space, it would be ExaGrid.  Like Dell, we spent time chatting with them, researching what others said, and really just mulling over the whole solution.  It’s kind of hard to place them solely in the deduplication segment, as they’re also scale out storage to a degree, but I think this is the more appropriate spot for them.

Pros:

  • The post process approach is a bit of a double-edged sword.  One of the pros that I outlined above is that data is not deduplicated right away.  This means we could use this device as both our primary and archive backup storage.
  • The storage scaled out linearly in both performance and capacity.  I really like the idea of not having to forklift upgrade our unit if we grew out of it.
  • They had what I’ll refer to as “backup specialists”.  These were techs that were well versed in the backup software we’d be using with ExaGrid.  In our case SQL and Veeam.  Point being, if we had questions about maximizing our backup app with ExaGrid, they’d have folks that know not just ExaGrid but the application as well.
  • The unit pricing wasn’t simply a “let’s get ’em in cheap and suck ’em dry later” deal.  Predictable (fair) pricing was part of who they are.

Cons:

  • As I mentioned, post process was a bit of a double-edged sword.  One of the big negatives for us was that their replication engine required waiting until a given file was fully deduplicated before it could begin.  So not only did we have to wait, say, 8 hours for a 4TB file server backup, but then we had to wait potentially another 8 hours before replication could begin.  Trying to keep any kind of RPO with that kind of variable is tough.
  • While they “scale out” their nodes, they’re not true scale out storage IMO.
    • Rather than pointing a backup target at a single share and letting the storage figure everything out, we’d have to manually balance which backups go to which node.  With the number of backups we were talking about and the number of nodes there could be, this sounded like too much of a hassle to me.
    • The landing zone space (un-deduplicated storage) was not scale out, and was instead pinned to the local node.
    • There is no node resiliency.  Meaning if you lose one node, everything is down, or at least everything on that node.  While I’m not in love with giving up two or three nodes for parity, at least having it as an option would be nice.  IIRC (and I could be wrong) this also affected the deduplication part of the storage cluster.
    • Individual nodes didn’t have the best throughput IMO.  While its great that you can aggregate multiple nodes throughput, if I have a single 4TB backup, I need that to go as fast as possible and I can’t break that across multiple nodes.
  • I didn’t like that the landing zone : deduplication zone ratio was manually managed on each node.  This just seemed to me like something that should be automated.

EMC DataDomain (Inline):

All I can say is it’s no wonder they’re the leader in this segment.  Just an absolutely awesome product overall.  As many who know me can attest, I’m not a huge EMC (Expensive Machine Company) fan in general, but there are a few areas they do well, and this is one of them.

Pros:

  • Snapshots, file retention policies, ACLs; they have all the basic file server stuff you’d want and expect.
  • Multi : Multi replication.
  • Very high throughput of non-source (DDBoost) optimized data and even better when it is source optimized.
  • Easy to use (based on demo) and intuitive interface.
  • The ability to store huge amounts of data in a single unit.  At times a head swap may be required, but having the ability to simply swap the head is nice.
  • Source based optimization baked into a lot of non-EMC products, SQL and Veeam in our case.
  • Archive storage as a secondary option for data not accessed frequently.
  • End to end data integrity.  These guys were the only ones that actually bragged about it.  When I asked this question to others, they didn’t exactly instill faith in their data integrity.
  • They actually analyzed all my backup data and gave me reasonably accurate predictions of what my dedupe rate would be and how much storage I’d need.  All in all, I can’t speak highly enough about their whole sales process.  Obviously everyone wants to win, but EMC’s process was very diplomatic, non-pushy and in general a good experience.

Cons:

  • EMC provided some great initial pricing for their devices, but any upgrades would be cost prohibitive.  That said, I at least appreciate that they were up front with the upgrade costs so we knew what we were getting into.  If you go down this path yourself, my suggestion is buy a lot more storage than you need.
  • They treat archive storage and backup storage differently and it needs to be manually separated.  For the price you pay for a solution like this, I’d like to think they could auto tier the data.
  • They license a la carte.  It’s not like there’s even a slew of options; I don’t get why they don’t make things all inclusive.  It’s easier for the customer and it’s easier for them.
  • In general, the device is super expensive.  Unless you plan on storing 6+ months of data on the device, I’d bet you could do better with large cheap disks, or even something like a disk to tape tiering solution (SpectraLogic Black Pearl).  Add to that, unless your data deduplicates well, you’ll also be paying through the nose for storage.
  • Going off the above statement, if you’re only keeping a few weeks’ worth of data on disk, you can likely build a faster solution $ for $ than what’s offered by them.
  • No cloud option for replication.  I was specifically told they see AWS as competition, not as a partner. Maybe this will change in the future, but it wasn’t something we would have banked on.

All in all, the deduplication appliances were fun to evaluate.  However, cutting to the chase, we ended up not going with any of these solutions.  As for ROI, these devices are too specialized, and too expensive for what we were looking to accomplish.  I think if you’re looking to get rid of tape (and your employer is on board), EMC DataDomain would be my first stop.  Unfortunately, for our needs, tape was staying in the picture, which meant this storage type was not a good fit.

Next up, scale out storage…

VMware vSAN in my environment? Not yet…

When VMware first announced vSAN, the premise of the solution was just pure awesomeness.  After all, VMware has the best hypervisor (fact, not opinion), and they’re in the process of honing what I think will be a Cisco butt-kicking software defined network solution.  Basically, vSAN was the only missing piece to building a software defined datacenter.  However, like NSX, I just don’t think vSAN is at a point where it can replace my tried and true shared storage solution.  That’s not a knock against hyperconvergence (although I have my reservations with the architecture as a whole), but rather against VMware’s current implementation.

Before getting into my current reservations with vSAN, I wanted to highlight one awesome thing I love about it.  It’s not the kernel integration, it’s not that it’s from VMware; no, it’s simply that it’s a truly software only solution.  I really dig this part of vSAN.  There are plenty of things I don’t like about vSAN, but this is one area they did right, and one that I wish other vendors would follow.  Seriously, it’s 2015 and we STILL buy our storage / network as appliances.  It sucks IMO, and I’m sick of being tethered to some vendor’s crappy HW platform (I’m looking at you, Nimble Storage).  With vSAN, so long as the SW doesn’t get in the way (and it does to a degree), you can build a solution on your terms.  Want an all Intel platform? Go for it! Want FusionIO (SanDisk)? Go for it! Want 6TB drives? Go for it! Want the latest 18 core procs from Intel?  Go for it!  Do you want Dell, HP, IBM, Cisco, Quanta, or Supermicro servers?  Take your pick…  It is the way solutions in this day and age should be, or at least should have the option to be.

Cons of vSAN in my opinion are plentiful.

  • Lack of tiers:  One very simple thing VMware could do that would ease my adoption of their solution is allow me to have two different tiers of storage: one that’s all flash and one that’s hybrid.  This way my file servers can go in a hybrid pool, and my SQL servers can go into the all flash pool.  I’m not even looking for automation, just static / manual tiers.
  • In-line compression: I would love to have a deduplication + compression solution, but if VMware simply offered in-line compression, that would make a world of difference in making those expensive flash drives go a little farther.  Not to mention the potential throughput improvements.
  • Disk groups:  This is one area I just don’t get.  Why do I need to have disk groups?  The architecture, from my view, just seems unneeded.  Here is what I WISH vSAN actually looked like:
    • Disks come in three classifications:
      • Write Cache: I want a dedicated write cache; mixing read / write cache on the same device is wasteful.  Now I need to run a massive / expensive cache drive.  It needs to be massive (in a hybrid design) so that it can actually have enough capacity to cache my working set.  It also needs to be expensive because it needs to deliver good write performance and have decent write endurance.  Just imagine what vSAN’s write performance would be if I could use an NVRAM based write cache.
      • Read Cache: No need to hash this out, but I’d want a dedicated read cache.
      • Data Disk:  Pretty self explanatory, and this could be either SSD or HDD.
    • Let me pool these devices rather than “grouping” them.  Create one simple rule, you get 35 disks per host, do with them as you please.
    • Let me create sub-vSANs out of this pool of disks, AND let me make decisions like “this vSAN only runs on hosts 1 – 5, and this other vSAN runs on hosts 6 – 10”.  I would love to institute some form of true separation for environments that have multiple nodes.  For example, for two Exchange servers, I’d like to make sure the disks are never on the same host’s storage, ever, even the redundant parts.  Maybe this feature already exists.  I’d still want the VMs to be able to float to any compute node (so long as they’re not on the same node).  This is also where I could see a vSAN being created for hybrid pools AND SSD only pools.
    • I know it’s probably complex and CPU intensive, but at least provide a parity based option.  Copies are great for resiliency, but man do they consume a lot of capacity.  Then again, see my point about in-line compression.
  • Replication at the vSAN level: Don’t make me fire some appliance up for this feature, this should be baked into the code and as simple as “right click the VM and pick a replica destination”.  Obviously you’d want groups and polices, and all that good stuff, but you get my point.
  • Standalone option: I would actually consider vSAN (now) if it wasn’t converged.  I know that must sound like blasphemy, but I’d really love an option to build a scale out storage solution that anything could use, including VMware.  Having it converged is a really cool option, but I’d also like the opposite.
  • Easier to set up:  It’s not that it looks hard, but when you have third parties or enthusiasts creating tools to make your product easier to set up, to me it’s clear you dropped the ball.
  • Real world benchmarks / configurations:  This is one area where, if you’re going to offer a software only solution, you need to work a little harder.  You can’t hide behind the “oh, everyone’s environment is different” or the “your mileage may vary”.  On top of that, when your marketing states that you do 45k IOPS in a hybrid configuration and then your marketing engineer releases a blog article showing you doing 80k IOPS http://blogs.vmware.com/storage/2015/03/17/double-vsan-performance/  it tells me that VMware doesn’t 100% know what its own limits are, and that’s a problem.  I’m not saying they need to lab out every single possible scenario, but put together a few examples for each vSAN type (hybrid or all flash).  I realize Dell, HP, and the various other partners are partly to blame here, but then again, it’s not entirely in their best interest for vSAN to succeed.

Overall, I think vSAN is cool, but it’s not at a point where it’s truly an enterprise solution.

 

Backup Storage Part 2: What we wanted

In part 1, I went over our legacy backup storage solution and why it needed to change.  In this section, I’m going to outline what we were looking for in our next storage refresh.

Also, just to give you a little more context: while evaluating storage, we were also looking at new backup solutions, including CommVault, Veeam, EMC (Avamar and Networker), NetVault, Microsoft DPM, and a handful of cloud solutions.  Point being, there were a lot of moving parts and a lot of things to consider.  I’ll dive more into this in the coming sections, but I wanted to let you know it was more than just a simple storage refresh.

The core of what we were seeking is outlined below.

  1. Capacity: We wanted capacity, LOTS of capacity.  88TB of storage wasn’t cutting it.
  2. Scalability: Just as important as capacity, and obviously related, we needed the ability to scale the capacity and performance.  It didn’t always have to be easy, but we needed it to be doable.
  3. Performance: We weren’t looking for 100K IOPS, but we were looking for multiple GB/s of throughput; that’s bytes, with an uppercase B.
  4. Reliable/Resilient: The storage needed to be reasonably reliable; we weren’t looking for five 9’s or anything crazy like that.  However, we didn’t want to be down for a day at a time, so something with resiliency built in, or a really great on-site warranty, was highly desired.
  5. Easy to use: There are tons of solutions out there, but not all of them are easy to use.
  6. Affordable: Again, there’s lots of solutions out there, but not all of them are affordable for backup storage.
  7. Enterprise support: Sort of related to easy to use, but not the same; we needed something with a support contract.  When the stuff hits the fan, we needed someone to fall back on.

At a deeper level, we also wanted to evaluate a few storage architectures.  Each one meets different aspects of our core requirements.

  1. Deduplication Targets:  We weren’t as concerned about deduplication’s ability to store lots of redundant data on few disks (storage is cheap), but we were interested in the side effect of really efficient replication (in theory).
  2. Scale out storage:  What we were looking for out of this architecture was the ability to limitlessly scale.
  3. Cloud Storage: We liked the idea of our backups being off-site (get rid of tape).  Also, in theory it too was easy to scale (for us).
  4. Traditional SAN / NAS:  Not much worth explaining here, other than that this would be a reliable fallback if the other architectures didn’t pan out.

With all that out there, it was clear we had a lot of conflicting criteria.  We all know that something can’t be fast, reliable, and affordable all at once.  It was apparent that some concessions would need to be made, but we hadn’t figured out what those were yet.  However, after evaluating a lot of vendors and solutions and passing quotes along to management, things started to become much clearer towards the end.  That is something I’m going to go over in my upcoming sections.  There is too much information to force it all into one section, so I’ll be breaking it out further.

-Eric

Backup Storage Part 1: Why it needed to change

Introduction:

This is going to be a multi-part series where I walk you through the whole process we took in evaluating, implementing, and living with a new backup storage solution.  While it’s not a perfect solution, given the parameters we had to work within, I think we ended up with some very decent storage.

Backup Storage Part 1: The “Why” it needed to change

Last year my team and I began a project to overhaul our backup solution, and part of that solution involved researching some new storage options.  At the time, we ran what most folks on a budget would: simply a server with some local DAS.  It was a Dell 2900 (IIRC) with 6 MD1000’s and 1 MD1200.  The solution was originally designed to manage about 10TB of data, and was never really expected to handle what ultimately ended up being much more than that.  The solution was less than ideal for a lot of reasons, which I’m sharing below.

  • It wasn’t just a storage unit; it also ran the backup server (CommVault) on top of it, and our tape drives were locally attached as well.  Basically a single box to handle a lot of data and ALL aspects of managing it.
  • The whole solution was a single point of failure, and many of its sub-components were single points of failure too.
  • This solution was 7 years old and had grown organically, one JBOD at a time.  This had a few pitfalls:
    • JBODs were daisy chained off each other, which meant that while each one added more capacity and spindles, the throughput was ultimately limited to 12Gbps per chain (SAS x4 3G; see the rough math after this list).  We only had two chains for the MD1000’s and one chain / JBOD for the MD1200.
    • The JBODs were carved up into independent LUNs, which from CommVault’s view was fine (awesome SW), but it left potential IOPS on the table.  So as we added JBODs, the IOPS didn’t increase linearly, per se.  Sure, the “aggregate” IOPS increased, but a single job was limited to the speed of a 15-disk RAID 6 instead of the potential of, say, a 60-disk RAID 60.
  • The disks at the time were fast (for SATA drives, that is), but compared to modern NL-SAS drives they had much lower throughput capability and density.
  • The PCI bus and the FSB (this server still had an FSB) were overwhelmed.  Remember, this was doing tape AND disk copies.  I know a lot of less seasoned folks don’t think it’s easy to overwhelm a PCI bus, but that’s not actually true (more on that later), even more so when your PCI bus is version 1.x.
  • This solution consumed a TON of rack space; each JBOD was 3U and we had 7 of them (the MD1200 was 2U).  And each drive was only 1TB, so best case with a 15-disk RAID 6, we were looking at 13TB usable.  By today’s standards, this TB per RU is terrible even for tier 1 storage, let alone backup.
  • We were using RAID 6 instead of 10.  I know what some of you are thinking: it’s backup, so why would you use RAID 10?  Backups are probably every bit as disk intensive as your production workloads, and likely more so.  On top of that, we force them to do all their work in what’s typically a very constrained window of time.  RAID 6, while great for sequential / random reads, does horribly at writes in comparison to RAID 10 (I’m excluding fancy file systems like CASL from this generalization; see the write-penalty sketch after this list).  Unless you’re running one backup at a time, you’re likely throwing tons of parallel writes at this storage.  And while each stream may be sequential in nature, the aggregation of them looks random to the storage.  At the end of the day, disk is cheap, and cheap disk (NL-SAS) is even cheaper, so splurge on RAID 10.
    • This was also compounded by the point I made above about one LUN per JBOD.
  • It was locked into what is, IMO, a less than ideal Dell (LSI) RAID card.  Again, I know what some of you are thinking: “HW” RAID is SO much better than SW RAID.  That’s a common and pervasive myth.  EMC, NetApp, HP, IBM, etc. are all simple x86 servers with a really fancy SW RAID.  SW RAID is fine, so long as the SW that’s doing the RAID is good.  In fact, SW RAID is not only fine; in many cases, it’s FAR better than a HW RAID card.  Now, I’m not saying LSI sucks at RAID, they’re fine cards, but SW has come a long way, and I really see it as the preferred solution over HW RAID.  I’m not going to go into the whys in this post, but if you’re curious, do some research of your own until I have time to write another “vs” article.
  • Using Dell, HP, IBM, etc. for bulk storage is EXPENSIVE compared to what I’ll call 2nd tier solutions.  Think of these as somewhere in between Dell and home-brewing your own solution.
    • Add to this that manufacturers only want you running “their” disks in “their” JBODs, which means not only are you stuck paying a lot for a false sense of security, you’re also incredibly limited in what options you have for your storage.
  • All the HW was approaching EOL.
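
To put rough numbers on the daisy-chain limit mentioned above, here is a quick, illustrative sketch in Python.  The link rate and encoding efficiency are generic SAS 1.x figures (4 lanes at 3Gb/s, 8b/10b encoding), not measurements from our hardware.

```python
# Back-of-the-envelope math for a daisy-chained 4-lane 3Gb/s SAS connection.
# The encoding efficiency (8b/10b, roughly 80%) is an approximation.

lanes = 4
gbps_per_lane = 3.0           # SAS 1.x link rate per lane
encoding_efficiency = 0.8     # 8b/10b line encoding overhead

raw_gbps = lanes * gbps_per_lane             # 12 Gb/s raw per chain
usable_gbps = raw_gbps * encoding_efficiency
usable_gigabytes = usable_gbps / 8           # bits -> bytes

print(f"Raw chain bandwidth:   {raw_gbps:.0f} Gb/s")
print(f"Usable after encoding: {usable_gbps:.1f} Gb/s (~{usable_gigabytes:.1f} GB/s)")
# Every JBOD added to the same chain shares this same ~1.2 GB/s ceiling.
```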
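
And for the RAID 6 vs RAID 10 point, here is a similarly rough sketch.  The per-disk IOPS figure is a generic ballpark for 7.2k SATA / NL-SAS spindles and the write penalties are the standard textbook values (2 for RAID 10, 6 for RAID 6); none of this comes from actual measurements on that box.

```python
# Approximate sustained random-write IOPS for a RAID set.
# RAID 10 would really need an even disk count; using the same count for both
# is just to keep the write-penalty comparison apples-to-apples.

def random_write_iops(disks, per_disk_iops, write_penalty):
    return disks * per_disk_iops / write_penalty

disks = 15
per_disk = 75   # rough random IOPS per 7.2k spindle

raid10 = random_write_iops(disks, per_disk, write_penalty=2)  # ~563 IOPS
raid6 = random_write_iops(disks, per_disk, write_penalty=6)   # ~188 IOPS

print(f"15-disk RAID 10 random writes: ~{raid10:.0f} IOPS")
print(f"15-disk RAID 6 random writes:  ~{raid6:.0f} IOPS")
```

It also illustrates why one LUN per 15-disk JBOD hurt: a single backup job could only ever draw on one of those RAID 6 sets, not the aggregate spindle count.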

There are probably a few more reasons why this solution was no longer ideal, but you get the point.  The reality is, our backup solution was changing, and it was a perfect time to re-think our storage.  In part 2, we’ll get into what we thought we wanted, what we needed, and what we had budget for.