
Review: 5 years virtualizing Microsoft SQL Server

Introduction:

I know what you’re thinking: it’s 2017, so why write about virtualizing Microsoft SQL?  Most are doing it after all.  And even if they’re not, there’s this whole SQLaaS thing that’s starting to take off, so why would anyone care?  Well, I’m writing this as more of a reflection on virtualizing SQL.  What works well, what doesn’t, what lessons I’ve learned, what I’m still learning, etc.

Like most things on the internet, I find that folks tend to share all the good, without sharing any of the bad (or vice versa).  There’s also just a lot of folks out there saying they’ve done it, without quantifying how well it’s working.  Sure, I’ve seen the cranky DBA say it’s the worst thing to happen, and I’ve seen the sysadmins say it’s the best thing that they ever did.  I find both types of feedback to be mostly useless, as they’re all missing context and depth.  This post is going to follow my typical review style, so I’ll outline things like the specs, the pros and cons, and share some general thoughts.

Background:

When I first started at ASI, I was told we’d never virtualize SQL.  It was the un-virtualizeable workload.  That was roughly six and a half years ago.  Fast forward to today, and we’ve been running a primarily virtualized SQL environment for close to five years.  It took a bit of convincing on my side, but this is basically how I convinced ASI to virtualize SQL.

  • Virtualizing SQL (and other big iron) was gaining a lot of popularity back in 2012
  • I had just completed my first successful POC of virtualizing a lot of other workloads at ASI.
  • We were running SQL on older physical systems and they were running adequately. The virtual hosts I was proposing were, at the time, two processor generations newer.  Meaning, if it was running OK on this dinosaur HW, it should run even better on the newer processors, regardless of whether it was virtual or not.
  • I did a ton of research, and a lot of political marketing / sales. Basically, compiling a list of things virtualization was going to fix in our current SQL environment.  Best of all, I was able to point at my POC as proof of these things.  For example, we had virtualized Exchange, and Exchange was a pretty big iron system that was running well. Many of the things I laid out as pros, I could point to Exchange as proof.

Basically, it was proposed as a win / win solution.  It wasn’t that I didn’t share the cons of virtualizing SQL, it was that I wasn’t as familiar with the cons until after virtualizing SQL.  This is going back to that whole lack of real-world feedback issue.  I brought up things like there would be some performance overhead, troubleshooting would be more difficult, and some of the more well-known issues.  But there was never a detailed list of gotchas.  No one that I was aware of had virtualized BIG SQL servers in the real world and then shared their experience in great detail.  Sure, I saw DBA’s complain a lot, but most of it was FUD (and still is).

Anyway, the point is, we did a 180, and went from not virtualizing any SQL, to virtualizing any and all SQL with the exception of one platform (more on that later).

The numbers and specs:

Bear in mind, this was five years ago; these were big numbers back then.

  • VMware cluster comprised of seven Dell r820’s
    • 32 total cores (quad socket, 8 cores per socket)
    • 768GB of RAM
    • Quad 10gb networking
      • Two for the storage network
      • Two for all other traffic
    • Fusion-io ioDrive2 card
    • Fusion-io ioTurbine cache acceleration.
    • VMware ESXi 5.x – 6.x (over time)
  • Five Nimble cs460 SANs
  • Dual Nexus 5596 10Gb switches
  • Approximately 80 SQL servers (peak)
    • 20 – 30 of which were two node clusters
    • Started with Windows Server 2012 (the original, pre-R2 release) + SQL 2012
    • Currently running Windows 2012 R2 + SQL 2014 and moving on to Windows 2016 + SQL 2017

To summarize, we have a dedicated VMware cluster for production SQL systems and another cluster (not detailed) for non-production workloads.  It didn’t start out that way, more on that later.

Pros:

No surprise, but there are a lot of advantages to virtualizing SQL that, even after five years, I still think hold true.  Let’s dig into it.

  • The ability to expand resources with minimal disruption. I’m not talking about anything hot-add here, simply the fact that you can add resources.  In essence, it gives you the ability to right-size each SQL server.
  • Through virtualization, you can run any number of OS + SQL version combinations that you need. Previously there were all kinds of instance stacking and OS + SQL version lag.  With virtualization, if we want a specific OS + SQL combo, we spin up a new VM and away we go.
  • Virtualization made it easy for us to have a proper dev, stage, UAT and finally production environment for all systems. Before these would have been instances on existing SQL servers.
  • Physical hardware maintenance is mostly non-disruptive. By being able to easily move workloads (scheduled) to different physical hosts, we’re able to perform maintenance without risking data loss.  There’s also the added benefit that there are basically no firmware or driver updates (other than VMware Tools / version) to apply in the OS itself.  This makes maintenance a lot easier for the SQL server itself.
  • Related to the above, hardware upgrades are as easy as a shutdown / power on. There’s no need to re-install and re-configure SQL on a new system.
  • We were able to build SQL VM’s for specific purposes rather than trying to co-mingle a bunch of databases on the same SQL server. Some might say six of one, half a dozen of the other, but I disagree.  By making a specific SQL server virtual, it enabled us to migrate that workload to any number of virtual hosts.
  • With enterprise licensing, we could build as many SQL systems as we wanted within the confines of resources.
  • Migrating SQL data from one storage location to another was easy, but I won’t go so far as saying non-disruptive. Doing that on a physical SQL server, requires moving the data files manually.  With VMware, we just moved the virtual disk.
  • Better physical host utilization. This is a double-edged sword, but we’ve been able to more fully utilize our physical HW than before.  When you consider how much SQL licensing costs, that’s a pretty big deal.
  • Redundancy for older OS versions. Before Windows 2012, there was no official support for NIC teaming.  You could do it, but Microsoft wouldn’t support it.  With VMware, we had both NIC redundancy and host redundancy.  In a non-clustered SQL server, VMware’s HA could kick in as a backup for host failures.

Pretty much, all the standard pros you’d expect from a virtual environment, and a few SQL specific ones.

Cons:

This is a tough one to admit, but there are a TON of cons to virtualizing SQL if a sysadmin has to deal with it at scale.

  • Troubleshooting just got tougher with SQL virtualized. VMware will now always be suspect for any and all issues.  Some of it is justified, a lot of it isn’t.  Still, trying to prove it’s not a VMware issue is tough.  You’re no longer simply looking at the OS stats; now you have to review the VM host stats, check for things like co-stop, wait, busy, etc.  Were there any noisy neighbors, anything in the VMware logs, etc.
  • Things behave differently in a virtual world. In a physical world, “stuns” or “waits” don’t happen.  This is related to the above, but basically, for every simplicity that virtualization adds, it at least matches it with an equal or greater complexity.
  • The politics, OH the politics of a virtual SQL environment. If you don’t have a great relationship with your SQL team, I would say, don’t virtualize SQL.  It’s just not worth the pain and agony you’re going to go through.  It will only increase finger pointing.
  • DBA’s in charge of sizing VM’s on a virtual host you’re in charge of supporting. This is related to politics, but basically now that DBA’s know they can expand resources, you can bet your hind end your VM’s will get bigger and almost never shrink (we’ve gotten some resources back, so kudos to our DBA’s).  It doesn’t matter if you explain NUMA concerns, co-stop, etc.  It’s nothing more than “I want more CPU” or “I want more memory”.  Then a week later, when you have VM’s stepping on each other’s toes, it will be finger pointing back at you again.  I think what’s mostly happening here is that the DBA’s are focused on individual server performance, whereas it’s difficult to convey the multi-server impact.
  • vMotion (host or storage) will cause interruptions. In a SQL cluster, you will have failovers.  At least that’s my experience.  Despite what VMware puts on their matrix, DON’T plan on using DRS.  Even if you can get the VM’s to migrate without a failover, the applications accessing SQL will slow down.  At least if your SQL VM’s are a decent size.  This was probably the number one disappointment with our SQL environment.
    • Once you can’t rely on DRS, managing VM’s across different hosts becomes a nightmare. You’ll either end up in CPU overload, or memory ballooning. I’ve never seen memory ballooning before virtualizing SQL, and that’s the last application you want to see ballooning and swapping.
    • Since you can’t vMotion VM’s to rebalance the cluster without causing disruptions (save for maybe non-clustered VMs), the struggles just keep piling on.
  • SQL VMware hosts are EXPENSIVE, at least when you’re running a good number of big VM’s like we are. We actually maxed out our quad socket servers from a power perspective.  Even if we wanted to do something like add memory, it’s not an option.  And when you want to talk about swapping in new hosts, it’s not some cheap 30k host; no, it’s a host that probably costs close to 110k if not more.  Adding to that, you’re now tasked with trying to determine if you should stay with the same number of CPU cores, or try to make a case for more CPU cores, which adds SQL licensing costs.

I could probably keep going on, but the point is virtualizing SQL isn’t all sunshine and roses like it is for other workloads.

Lessons learned:

I’m thankful to have had this opportunity, because it’s enabled me to experience first-hand what it’s like virtualizing SQL in a shop where SQL is respectably large and critical.  In this time, I’ve learned a number of things.

  • DRS + SQL clusters = no go. Maybe it works for you and your puny 4 vCPU / 16GB VM, but for one of our vm’s with 24 vCPU and 228GB of RAM, you will cause failovers.  And no DBA wants a failover.
    • Actually DRS + any Windows cluster = no go, but that’s for another post.
  • If I had to do it over again, I would have gotten Dell R920’s instead of R820’s. While both were quad socket, I didn’t realize or appreciate the scalability difference between the 4600 and 8600 series Xeons.  If I was building this today, I would go after hosts that are super dense.  Rather than relying on a scale-out technique, I’d shoot for a scale-up approach.  Most ideal would be something like the HPE Superdome, but even getting newer M-series Xeons with 128GB DIMMs would be a wise choice.  In essence, build a virtual platform just like you would a physical one.  So if you normally would have had three really big hosts, do the same in VMware.
  • Accept the fact that SQL VM’s are going to be larger than you think they should be. Some of this, to be fair, is that sysadmins think they understand SQL, and we don’t.  There’s a lot more to SQL than CPU utilization.  For example, I’ve seen SQL queries that only used 25% of every CPU core they were running on, but the more vCPUs we allocated to the VM, the faster that query ran.  It was the oddest thing I had ever seen, but it also wasn’t the only application I’ve seen like this.  Likely a disk bottleneck issue, or at least that’s my guess.
  • Just give SQL memory and be done with it. When we virtualized our first SQL cluster, the one thing we noticed was the disk IO on our SAN (and Fusion-io card) was pretty impressive.  At first, it’s pretty cool to see 80k IOPS from a real workload, but then you hear the DBA’s saying “it’s slow”, and you realize that if every SQL server you add needs this kind of disk IO, you’re going to run out of IOPS in no time.  We added something like 64GB more memory to those nodes, the disk IO went from 80k to 3k, and performance from the DBA’s perspective was back to what they expected.  There’s no replacement for memory (see the sketch just after this list).
  • Virtualizing SQL is complex. While it CAN be as simple as what you’re used to doing, once you start adding clustering and managing a lot of monster VM’s on the same cluster, it’s a different kind of experience than you’re used to.  To me, it’s worth investing in VMware Log Insight for your SQL environment to make it easier to troubleshoot things.  I would also add Ops Manager as another potential value add.  At least these are things I’m thinking of pushing for.
  • Keep your environment as simple as possible. We started out with Fusion IO cards + Fusion IO caching software.  All that did was create a lot of headache, and once we increased the RAM in SQL, the disk bottleneck went away (mostly).  I could totally see using an Intel NVMe (or 3dxpoint) card for something like TempDB.  However, I would put the virtual disk on the drive directly, not use any sort of caching solution.
  • I would have broken our seven node cluster up into two or three two node clusters. This goes back to treating them like they’re physical servers.  Again, scaling up is much better, but if you’re going to use more, smaller hosts, treat them like they’re physical.
    • We kind of do this now. Node 1’s on odd hosts, node 2’s on even hosts
  • We found that we ultimately didn’t need VMware’s Enterprise Plus. We couldn’t vMotion or use DRS, and the distributed switch was of little value, so we converted everything to Standard edition.  Now, I have no clue what would happen if we wanted Ops Manager.  It used to be a la carte, but I’m not so sure anymore.
  • We originally had non-prod and prod on the same cluster. We eventually moved all of non-prod off.  This provided a little more breathing room, and now we have two out of seven hosts free to use for maintenance.  Before, they were partially consumed with non-prod SQL VM’s.
  • We made the mistake of starting with virtualizing big SQL servers and learning about Microsoft clustering + AlwaysOn Availability Groups at the same time. Not recommended.  I don’t think those lessons would have been easy to learn any other way, difficult as it was.
  • Just because VMware says something will work, doesn’t mean it will. I quadruple checked their clustering matrix and recommended practices guides.  We were doing everything they recommended and our clusters still failed over.
  • Big VM’s don’t behave the same way as little VM’s. I know it sounds like a no duh, but it’s really not something you think about.  This is especially true when it comes to vMotion or even trying to balance resources (manually) on different hosts.  You never realize how much you really appreciate DRS until you can’t use it.
  • I’ve learned to absolutely despise Microsoft clustering when it’s virtualized. It just doesn’t behave well.  I think MS clustering is built for a physical world, where there are certain assumptions about how the host will react.  For the record, our physical SQL cluster is rock solid.  All our issues typically circle back to virtualization.
    • BTW, yes, we’ve tried tuning the subnet failover thresholds, no it doesn’t work, and no I can’t tell you why.
  • We’ve learned that VMware support just isn’t up to par, and that you’re really playing with fire if you’re virtualizing complex workloads like SQL. We can’t afford mission critical support, so maybe that’s what we need, but production support is basically useless if you need their help.
  • Having access to Microsoft’s premier support would be very beneficial in this environment. It’s probably something we should have insisted on.
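
As a concrete illustration of the “just give SQL memory” lesson above: the knob that usually goes along with a RAM bump is SQL’s “max server memory” setting, so the buffer pool can actually use what you granted the VM.  Below is a minimal sketch, assuming the SqlServer PowerShell module is installed; the instance names and sizes are hypothetical, so sanity check the numbers with your DBA’s before running anything like it.

```powershell
# Hypothetical instance names and sizes; adjust for your environment.
$instances = 'SQLPROD01', 'SQLPROD02'

# Raise 'max server memory' so the buffer pool can use the RAM the VM was
# given, while leaving headroom for the OS (example: ~192GB on a 228GB VM).
$query = @"
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 196608;
RECONFIGURE;
"@

foreach ($instance in $instances) {
    Invoke-Sqlcmd -ServerInstance $instance -Query $query
}
```

The point isn’t the exact number; it’s that after you hand the VM more RAM, make sure SQL is actually allowed to use it, and then watch the disk IO drop.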

Conclusion:

Do I recommend virtualizing SQL?  I would say it depends, but mostly yes.  There are certainly days where I want to go back to physical, but then I think about all the things I would miss with our virtual environment.  And I’m sure if you asked our DBA’s, they too would admit to missing some of the pros as well.  Here are my final thoughts.

I would say if you’re a shop that has a lot of smaller SQL servers, and they’re non-clustered, virtualization is a no-brainer.  When SQL is small, and non-clustered, it mostly behaves about the same as other VM’s.  We never have issues with our dev or stage systems, and they’re all on the smaller side and they’re all non-clustered.  Even with our UAT environment, we almost never have issues, even though they are clustered.

For us, it seems to be the combination of a clustered and large SQL server where things start getting sketchy.  I don’t want to make it sound like we’re dealing with failovers all the time.  We’ve worked through most of our issues, and for the most part, things are stable.  We occasionally have random failovers, which is incredibly frustrating for all parties, but they’re rare nowadays.

My suggestion is, if you do want to virtualize large clustered SQL systems, treat them like they’re physical.  Here are a few rough recommendations:

  • Avoid heavy CPU oversubscription. Shoot for something like less than 3:1, with less than 2:1 being more ideal.
  • Size your VM’s so they fit in a NUMA node. That would have been impossible back in the day, but nowadays we could probably do this.  For some of you, though, this will still be an issue.  Our largest VM’s (so far) are only 24 vCPU, so we can fit in a single NUMA node on newer HW.
  • Don’t cluster in VMware period. No HA, no DRS.  Keep your hosts standalone and manage your SQL VM’s just like you would if they were physical.  Meaning, plan the VMware host to accommodate the SQL VM(s).
  • Don’t intermix non-SQL VM’s with these systems. We didn’t do this, but I wanted to point it out.
  • Plan on a physical host that can scale up its memory if needed.
  • When doing VMware host maintenance, failover your SQL listeners / clusters before migrating the VMs (a rough sketch of this follows below).
    • BTW, it’s typically faster to shut down a VM and migrate it than to vMotion it while powered on, at the sizes we’re dealing with.
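
To put those last two bullets into something concrete, here’s a rough sketch of the pre-maintenance flow, assuming the FailoverClusters PowerShell module (run against the guest cluster) and VMware PowerCLI.  The cluster, role, VM, and host names are all hypothetical, and if you’re using AlwaysOn Availability Groups you may prefer to fail the group over from within SQL instead.

```powershell
# Rough pre-maintenance sketch: fail the SQL role over first, then move the VM.
Import-Module FailoverClusters

# 1. Move the clustered SQL role off the node whose VM is about to be touched,
#    so the vMotion (or shutdown) can't trigger an unplanned failover.
Move-ClusterGroup -Cluster 'SQLCLUS01' -Name 'SQL Server (MSSQLSERVER)' -Node 'SQLNODE2'

# 2. Then migrate the now-passive node's VM from vCenter.  For big VMs, a clean
#    shutdown + cold migration + power-on is often faster than a live vMotion.
Connect-VIServer -Server 'vcenter.example.local'
Move-VM -VM 'SQLNODE1' -Destination (Get-VMHost -Name 'esx02.example.local')
```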

Finally, I wanted to close by pointing out, that performance was never an issue in our environment.  In fact, things got faster when we moved to the newer HW + SAN.  One of the biggest concerns I used to see with virtualizing SQL was performance, and yet it was everything else that no one mentioned that ended up being the issues.

Hope this helps someone else who hasn’t taken the plunge yet or is struggling themselves.

Review: 877stockcar.com exotic experiences

Introduction:

This post is 100% off topic, it’s about my “exotic car” experience through 877stockcar.com.  In general, my blog is for tech stuff, but I figure it might be fun to write about something non-tech for once.  This is about the https://877stockcar.com/experiences/exotic-experiences/ located at Pocono Raceway.

I wanted to write this for anyone that might be thinking of dropping up to $700 on their package, so you know what you’re in for.  My wife got me the mid-tier package for Christmas (best gift ever) because she knows I’m a pretty big car nut.

In case someone reads this that’s not familiar with my review style, besides going over the pros and cons, you’ll find that my assessment will be blunt.  While I may have a degree of diplomacy in my views, the point of my review style is to be brutally honest.

The weather:

In my case, I couldn’t have asked for a more perfect day.  70ish and sunny, with no rain for days, which meant the track and waiting area was dry.

Pros:

As usual, I like to start with the good before digging into the bad.

  • For the most part, the cars they had are what I would consider pretty respectable. I personally drove an Audi R8 (v10) and a Maserati MC.  If you’re thinking to yourself “those are six figure cars” I get it, but they’re low six figure cars, as in less than 200k.
  • The cars were clean inside and out. I’m only bringing it up because you’re paying for an experience, and no one wants a dusty dash and a dirty car. No, it doesn’t affect how they drive, but I know it can skeeve some folks out.
  • They provide something that I can best describe as a head glove, so you keep your germs to yourself. Similarly, inside the cars, the seats are covered, although I suspect that’s more to protect the interior of the car than the driver.
  • The instructors I had were all friendly, and knew the track like the back of their hand.
  • They had the apexes all coned off for you. Short of painting a driving line (more on that later), you knew exactly where to go if you were trying to maximize your speed.
  • Similar to above, they had the braking point marked off for their one straightaway.
  • The helmet they offered fit my large head, which was good. It was honestly a concern I had going in.
  • For the little amount of time you do get with the cars, it is a fun experience.

Cons:

This was my “exotic car” experience. I’m not trying to imply the whole experience was negative, it wasn’t. However, as you’ll see it was far from perfect.

  • Where am I supposed to go? So, I plugged in the address as marked on the site, and arrived at a locked gate.  A few thoughts on this:
    • We tried calling them to see what’s up. We were greeted by the “we’re closed today, but you can leave a message”.  Here’s the thing: if I’m dropping (or my wife in this case) anywhere from $250 – $700 for a course that lasts maybe 20 minutes, your ass can staff someone to answer a damn phone during the hours of the event.
    • When I called to make my reservation, there was zero mention of where to go specifically, or that the main gate would be locked. The only thing I was told was make sure I wear socks and sneakers, that’s it.  In my not so humble opinion, I think pointing out something that I imagine is pretty common would make sense to do.
      • Related to this, I did find directions on their website (https://877stockcar.com/wp-content/uploads/2017/04/Directions.pdf) on where to actually go. I can see this being partially on me for not looking (like I’m sure most people don’t), but I’m totally calling bullshit on their inability to provide a set of GPS coordinates (let alone bring attention to the main gate not being the right place to go).  So, you’re telling me, NO ONE in the whole facility, with all the revenue this place probably brings in, can afford or has access to GPS?  Right…
    • Why not place a sign right in front of the main gate, saying something like “go here, wrong entrance”?  Again, just to brow beat the concept, I can’t imagine I’m the only one to do this.
      • Once we started driving down the road (knowing there were a few more entrances) we saw that they had small post signs that eventually led us to the right entrance.
    • Where do we park? I’m not trying to nitpick here, but knowing where to park wasn’t made abundantly clear. We guessed where we parked was fine, but there were no signs saying park here.  Actually, adding to that, there were no signs even letting us know we were at the right spot.  I mean it was kind of obvious with a bunch of Lambos running around and a large tent, but there was no official indication that we were at the right spot.  For all we knew, it was some crew area.
    • The check-in: To be honest, the guy at the check-in acted like I was bothering him, and was clearly pre-occupied with something else.  Here’s the thing, it’s my first time (and probably my last with them) and I have zero clue what the process is. He didn’t ask if it was my first time, he didn’t ask how I was, it was “sign here”.    So after checking in, I basically had to keep asking questions in order to figure out where I’m supposed to go, how the process works, etc.
    • The introduction: After standing there for a few minutes, some random employee walked up to the area and asked if anyone had just arrived, and the few of us flocked to him.  He proceeded to rapid fire off a rough set of instructions on how the process works, didn’t ask if anyone had questions, and walked off. He was a nice guy, but you could tell he did this all the time and was probably on auto pilot. Meaning, I think he just assumed everyone understood what to do.
    • So when do I drive? After standing around for a bit longer and frankly pretty frustrated, I started observing what others were doing and basically figured out that helmets get dropped off at a table, and we’re supposed to just go fight over them.  Once you figure that out, the next part is just standing in a random area near the drop off / pickup.  And again, it’s more or less a diplomatic fight for going after whatever car you want.
    • Driving:
      • Instructor: Alas, I’m finally sitting in an R8. The instructor is a super nice guy, and goes over adjusting the seat and basic instructions to get the car into go mode.  We take off for my “warm up” lap, and he takes me through the course showing me the apexes (while holding the steering wheel, really weird).  And then he mostly lets me have at it.  He continues coaching me on trying to hit the apexes, but other than that, he’s pretty much along for the ride.
        • Cool down lap conversation: I figured since we were basically driving as fast as I do in a school zone for the cool down lap that I’d break the awkward silence and try to have a conversation.  I tried asking him about the cars to which he didn’t have much knowledge (or didn’t want to chat).   I get that you don’t need to know the cars to be a good driver, but this is kind of a driver enthusiast experience, I’d think the instructors could talk all about the cars. I don’t know, maybe they’re just busy keeping an eye out for other drivers too.
      • Track: IMO, the track sucks.  Here’s the thing, it’s not that the track was badly maintained or anything like that, it’s just that the thing is so damn small.  Their straightaway, I’m fairly confident, isn’t even a quarter mile.  You spend more time trying to whip through corners (which IS fun) and you never really get a chance to get the car over 100.  Now, being fair, I suspect a good deal of that has to do with my skill, but ALSO the skill of the drivers in front of you, more on that in a sec.  So, when they tell you 4 laps, it’s like ten minutes tops, and that’s if you’re poking around.
      • Other people: The fact is, they have way too many people on the track at a given time.  During both of my lap sessions, the busyness varied, but it was very rare that you’d have even close to a wide-open track in front of you.  By the time I was in the Maserati, I got to a point where I was mostly getting stuck behind other drivers.  The instructor kept telling me if I could catch them we could pass them. I was like a car length and a half behind, and that was only because I didn’t want to rear end anyone.  So I’m not sure what’s defined as catching, but if you think you’re going to be passing slower drivers, I’ll say you’ll typically burn 25% of your laps before you get the opportunity.  That said, I know they said it’s not racing, so just make sure you have your expectations in line.
      • Picking the car: I kind of knew what I wanted to drive, but they didn’t ask what I wanted to drive.  Hell, they didn’t even tell you what all the cars were, specs or anything like that.  Being fair, they mentioned which cars were RWD vs. AWD.  It was also really disappointing that a few cars were only available for the folks who had the $700 package, more specifically the McLaren.  Although, it sounded like there were reliability issues with it, so maybe not a big deal.
    • The cars: To be blunt, I wasn’t impressed with the car selection. It’s not that they had bad cars, it’s that they lacked variety.  I think the fastest car they had was the R8, or the McLaren when it was working.  A lot of their cars were convertibles (lame), and really the variety was lacking.  I would have much rather seen one of a few different types of cars than having a pick of four or five cars that are basically all the same.  I mean, comparing the Lambo and the R8, it’s basically the same car with a different skin.

Conclusion:

All in all, the experience is plagued with terrible customer service, practically zero training / overviews, a complete lack of organization, overcrowding, and ultimately, it’s a ton of money to dump on what is essentially 20 minutes at most of driving.  It was certainly fun to drive the cars, but I’d never give them another dollar of my money.  Instead, I’d probably just spend a little extra and go to a Porsche, BMW, or similar driving school.  I suppose if you just want to know what it’s like to drive the car, it’s an OK experience, but for the money you spend, you could probably rent the car for a whole day.  At least then, you’d get some real seat time with the car.

What would I do differently?

  • The registration process should include detailed directions emailed to you (and discussed over the phone). I would also send a reminder email, along with a restatement of where to go, and where to park.
  • If the main gate is locked, I’d put a sign right in front telling folks to turn around and go this way.
  • I’d staff a person or two on the phones (how about the registration people?) to answer calls during the event times.
  • I would run the event as batches of people, rather than make it a free for all.
    • Everyone would have a helmet
    • The cars would be lined up, with specs and performance numbers outlined.
      • I would let folks look at the cars for a few minutes at the very least so you can see what you might actually want to drive.
    • I would document which cars folks wanted to drive, and have a program that organizes an order when driver x gets car y and how long the wait is estimated to be.
    • I would have the instructors take each person out for a lap to show them the course before having the driver do it.
    • I would then have the instructor take a driver out in something like a Miata for a few laps so they can get familiar with the track in a fuel efficient, affordable sports car.
    • I would limit the track to no more than three cars at a time.  At least if we’re talking the track layout they had.  MAYBE, if the track was longer and they had an actual straight away, they could get away with more cars, without spoiling the experience.
    • Rather than doing “laps” it would simply be a timed event. You get 15 minutes for every $300 or whatever would make business sense.  This way faster drivers don’t lose seat time.
      • And that would be 15 minutes, with the car you want, and with no more need to “warm up”.
      • Cool downs? Just let the car sit for five minutes or so when they’re done.  If people are really seeing brake fade, equip the cars with some better pads.  And if the car can’t handle having the piss beat out of it for at least 15 minutes in a row, it’s not exactly a great exotic car.
    • I would have GoPros on the helmets and the cars themselves. I would record videos that could be purchased, provide lap times, top speed, most g’s pulled, etc.  They had none of that stuff.
    • I would have a larger variety of cars, and 100% of them would be coupes. If you want to drive a freaking convertible, go get a Solara.  To name a few…
      • Corvette ZR1
      • Audi R8 (was a good pick)
      • A real Ferrari, like a 488 GTB
      • BMW M5
      • Ford GT 40
      • Lotus
      • Porsche (maybe GT3?)
      • Ariel Atom (OK, not a coupe, but it gets to be the exception).
    • For Pocono Raceway specifically, I would open up the track so that maybe you could end up on the actual race track for a bit, so there’s enough room to actually open the car up a bit. What in the world is the point to a car that can go 180+ if you can’t even get it to 100?  Maybe offer two options, an open track for speed demons, and a closed track for folks that like to feel the G force.
    • I would paint a driving line rather than relying on cones. Or some similar material.
    • How about something for the family to do while they wait?  I don’t have a particular idea of what that might be, but I suspect standing around isn’t their idea of fun.

I realize it’s a business and ultimately, it’s about making money.  The cars aren’t cheap, and I’m sure they’re getting the piss beat out of them, but I think those are some relatively cheap things they could do, that would make a dramatic improvement in the driving experience.

Quicky Review: GPO/GPP vs. DSC

Introduction:

If you’re not in a DevOps-based shop (or you’re living under a rock), you may not know that Microsoft has been working on a solution that by all accounts sounds like it’s poised to usurp GPO / GPP.  The solution I’m talking about is Desired State Configuration, or DSC. According to all the marketing hype, DSC is the next best thing for IT since virtualization.  If the vision comes to fruition, GPO and GPP will be a legacy solution.  Enterprise mobility management will be used for desktops and DSC will be used for servers.  Given that I currently manage 700 VM’s and about an equal number of desktops, I figured why not take it for a test drive. So I stood up a simplistic environment, played around with it for a full week, and my conclusion is this.

I can totally see why DSC is awesome for non-domain joined systems, but it’s absolutely not a good replacement, in today’s iteration, for domain-joined systems. Does that mean you should shun it since all your systems are domain joined?  That depends on the size of your environment and how much you care about consistency and automation.  Below are all my unorganized thoughts on the subject.

The points:

DSC can do everything GPO can do, but the reverse is not true. At first that sounds like DSC is clearly the winner, but I say no.  The reality is, GPO does what it was meant to do, and it does it well.  Reproducing what you’ve already done in GPO, while certainly doable, has the potential to make your life hell.  Here are a few fun facts about DSC.

  1. The DSC “agent” runs as local system. This means it only has local computer rights, and nothing else.
  2. Every server that you want to manage with DSC needs its own unique config file built. That means if you have 700 servers like me, and you want to manage them with DSC, they each are going to have a unique config file.  Don’t get me wrong, you can create a standard config and duplicate it “x” times, but nonetheless, it’s not like GPO where you just drop the computer in an OU and walk away.  That being said, and to be fair, there’s no reason you couldn’t automate the DSC config build process to do just that.
    1. DSC has no concept of “inheritance / merging” like you’re used to with GPO. Each config must be built to encompass all of those things that GPO would normally handle in a very easy way.  DSC does have config merges in the sense that you can have a partial config for, say, your OS team, your SQL team and maybe some other team.  So they can “merge” configs and work on them independently (awesome).  However, if the DBA config and the OS config conflict, errors are thrown, and someone has to figure it out.  Maybe not a bad thing at all, but nonetheless, it’s a different mindset, and there is certainly potential for conflicts to occur.
  3. A DSC configuration needs to store user credentials for a lot of different operations. It stores these credentials in a config file that’s hosted both on a pull server (SMB share / HTTPS site) and on the local host.  What this means is you need a certificate to encrypt the config file and then of course for the agent to decrypt the config file (a minimal sketch of one of these configs follows this list).  You thought managing certificates was a pain for a few Exchange servers and some web sites?  Ha! Now every server and the build server need certs.  In the most ideal scenario, you’re using some sort of PKI infrastructure.  This is just the start of the complexity.
    1. You of course need to deploy said certificate to the DSC system before the DSC config file can be applied. In case you can’t figure it out by now, this is a bootstrap problem you have to solve on your own if you don’t use GPO.  You could use the same certificate and bake it into an image.  That certainly makes your life easier, but it’s also going to make your life that much harder when it comes to replacing those certs on 700 systems.  Not to mention, a paranoid security nut would argue how terrible that potentially is.
  4. The DSC agent of course needs to be configured before it knows what to do. You can “push” configurations, which does mitigate some of these issues, but the preferred method is “pull”.  So that means you need to find a way (bootstrapping again) to configure your DSC agent so that it knows where to pull its config from, and what certificate thumbprint to use.
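
To make points 2 and 3 a little more tangible, here’s a minimal sketch of what one of those per-node configs looks like, with a certificate used to encrypt the credential.  The node name, paths, and thumbprint are all hypothetical; treat this as an illustration of the moving parts, not a production config.

```powershell
Configuration BaseServer
{
    param (
        [Parameter(Mandatory)]
        [pscredential] $ServiceAccount
    )

    Import-DscResource -ModuleName PSDesiredStateConfiguration

    Node 'SQLVM01'
    {
        # Simple stuff GPO could also do...
        WindowsFeature NetFramework
        {
            Name   = 'NET-Framework-45-Core'
            Ensure = 'Present'
        }

        # ...but anything touching credentials drags in the certificate dance.
        Service SqlBrowser
        {
            Name        = 'SQLBrowser'
            StartupType = 'Automatic'
            State       = 'Running'
            Credential  = $ServiceAccount
        }
    }
}

# ConfigurationData points the compiler at the node's public cert so the
# credential is encrypted in the MOF instead of being stored in plain text.
$configData = @{
    AllNodes = @(
        @{
            NodeName        = 'SQLVM01'
            CertificateFile = 'C:\dsc\certs\SQLVM01.cer'   # hypothetical path
            Thumbprint      = 'PASTE-THUMBPRINT-HERE'      # hypothetical thumbprint
        }
    )
}

BaseServer -ServiceAccount (Get-Credential) -ConfigurationData $configData -OutputPath 'C:\dsc\output'
```

Now imagine generating, encrypting, and tracking something like that for 700 servers, and the bootstrap problem becomes pretty obvious.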

Based on the above points, you probably think DSC is a mess, and to some degree it is. However, a few other thoughts.

  1. It’s a new solution, so it still needs time to mature. GPO has been in existence since 2000, and DSC, I’m going to guess, since maybe 2012.  GPO is mature, and DSC is the new kid.
  2. Remember when I wrote that DSC can do everything that GPO can do, but not the reverse? Well, let’s dig into that.  Let’s just say you still manage Exchange on premises, or more likely, you manage some IIS / SQL systems.  DSC has the potential to make setting those up and administering them significantly easier.  DSC can manage not only the simple stuff that GPO does, but also way beyond that.  For example, here are just a few things.
    1. For exchange:
      1. DSC could INSTALL exchange for you
      2. Configure all your connectors, including not only creating them, but defining all the “allowed to relay” and what not.
      3. Configure all your web settings (think removing the default domain\username).
      4. Install and configure your exchange certificate in IIS
      5. Configure all your DAG relationships
      6. Setup your disks and folders
    2. For SQL
      1. DSC could INSTALL SQL for you.
      2. Configure your max and min server memory
      3. Configure your TempDB requirements
      4. Setup all your SQL jobs and other default DB’s
    3. Pick another MS app, and there’s probably a series of DSC resources for it…
    4. DSC lets you know when things are compliant, and it can automatically attempt to remediate them. It can even handle things like auto reboots if you want it to (a quick sketch of this follows this list).  GPO can’t do this.  To the above point, what I like about DSC is I’ll know if someone went into my receive connector and added an unauthorized IP, and even better, DSC will whack it and set it back to what it should be.
    5. Part of me thinks that while DSC is cool, I wish Microsoft would just extend GPO to encompass the things that DSC does that GPO doesn’t. I know it’s because the goal is to start with non-domain joined systems, but nonetheless, GPO works well and honestly, I think most people would rather use GPO over DSC if both were equally capable.
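
To make the compliance / auto-remediation point above a bit more concrete (and to show the agent “pull” settings mentioned earlier), here’s a quick sketch of an LCM meta-configuration plus the compliance checks.  The pull server URL, registration key, node name, and thumbprint are hypothetical; the cmdlets are the standard PSDesiredStateConfiguration ones.

```powershell
# LCM settings that make the agent pull its config and auto-correct drift.
[DSCLocalConfigurationManager()]
Configuration PullAndCorrect
{
    Node 'EXCH01'
    {
        Settings
        {
            RefreshMode        = 'Pull'
            ConfigurationMode  = 'ApplyAndAutoCorrect'   # monitor and remediate drift
            RebootNodeIfNeeded = $true                   # let DSC handle required reboots
            CertificateID      = 'PASTE-THUMBPRINT-HERE' # cert used to decrypt credentials
        }
        ConfigurationRepositoryWeb PullServer
        {
            ServerURL          = 'https://dsc-pull.example.local:8080/PSDSCPullServer.svc'
            RegistrationKey    = '00000000-0000-0000-0000-000000000000'
            ConfigurationNames = @('BaseServer')
        }
    }
}

# Compile and apply the LCM settings, then check compliance after the fact.
PullAndCorrect -OutputPath 'C:\dsc\lcm'
Set-DscLocalConfigurationManager -Path 'C:\dsc\lcm' -ComputerName 'EXCH01'

Test-DscConfiguration -ComputerName 'EXCH01' -Detailed   # per-resource drift report
Get-DscConfigurationStatus -CimSession 'EXCH01'          # result of the last consistency run
```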

Conclusion:

Should you use DSC for domain joined systems?  I think so, or at least I think it would be a great way to learn DSC.  I currently look at DSC as being a great addition to GPO, not a replacement.  My goal is going to be to use GPO to manage the DSC dependencies (like the certificates as one example) and then use DSC for specific systems where I need consistency, like our exchange, SQL and web servers.  At this stage, unless you have a huge non-domain joined infrastructure, and you NEED to keep it that way, I wouldn’t use DSC to replace GPO.

 

Review: 2.5 years with Nimble Storage

Disclaimer: I’m not getting paid for this review, nor have I been asked to do this by anyone.  These views are my own and not my employer’s, and they’re opinions, not facts.

Intro:

To begin with, as you can tell, I’ve been running Nimble Storage for a few years at this point, and I felt like it was time to provide a review of both the good and bad.  When I was looking at storage a few years ago, it was hard to find reviews of vendors; they were very short, non-informative, clearly paid for, or posts by obvious fanboys.

Ultimately Nimble won us over against the various storage lines listed below.  It’s not a super huge list, as there was only so much time and budget that I had to work with.  There were other vendors I was interested in, but the cost would have been prohibitive, or the solution would have been too complex.  At the time, Tintri and Tegile never showed up in my search results, but ultimately Tintri wouldn’t have worked (and still doesn’t) and Tegile is just not something I’m super impressed with.

  • NetApp
  • X-IO
  • Equallogic
  • Compellent
  • Nutanix

After a lot of discussions and research, it basically boiled down to NetApp vs. Nimble Storage, with Nimble obviously winning us over.  While I made the recommendation with a high degree of trepidation, and even after a month with the storage wondered if I had made an expensive mistake, I’m happy to say it was, and still is, a great storage decision.  I’m not going into why I chose Nimble over NetApp, perhaps some other time; for now this post is about Nimble, so let’s dig into it.

When I’m thinking about storage, the following are the high-level areas that I’m concerned about.  This is going to be the basic outline of the review.

  • Performance / Capacity ratios
  • Ease of use
  • Reliability
  • Customer support
  • Scaling
  • Value
  • Design
  • Continued innovation

Finally, for your reference, we’re running five of their CS460’s, which sit between their CS300 and CS500 platforms, and these are hybrid arrays.

Performance / Capacity Ratios

Good performance, like a lot of things, is in the eye of the beholder.  When I think of what defines storage as being fast, it’s IOPS, throughput and latency.  Depending on your workload, more of one than the other may be more important to you, or maybe you just need something that can do OK with all of those factors, but not awesome in any one area.  To me, Nimble falls into the general purpose array category; it doesn’t do any one thing great, but it does a lot of things very well.

Below you’ll find a break down of our workloads and capacity consumers.

IO breakdown (estimates):

  • MS SQL (50% of our total IO)
    • 75% OLTP
    • 25% OLAP
  • MS Exchange (30% of total IO)
  • Generic servers (15% of total IO)
  • VDI (5% of total IO)

Capacity consuming apps:

  • SQL (40TB after compression)
  • File server (35TB after compression)
  • Generic VM’s (16TB after compression)
  • Exchange (8TB after compression)

Compression?  yeah, Nimble’s got compression…

Nimble’s probably telling you that compression is better than dedupe; they even have all kinds of great marketing literature to back it up.  The reality, like anything, is that it all depends.  I will start by saying if you need a general purpose array, and can only get one or the other, there’s only one case where I would choose dedupe over compression, which is data sets mostly consisting of operating system and application installer data.  The biggest example of that would be VDI, but basically wherever you find your data mostly consisting of the same data over and over.  Dedupe will always reduce better than compression in these cases.  Everything else, you’re likely better off with compression.  At this point, compression is pretty much a commodity, but if you’re still not a believer, below you can see my numbers.  Basically, Nimble (and everyone else using compression) delivers on what they promise.

  • SQL: Compresses very well, right now I’m averaging 3x.  That said, there is a TON of white space in some of my SQL volumes.  The reality is, I normally get a minimum of 1.5x and usually end up more along the 2x range.
  • Exchange 2007: Well, this isn’t quite as impressive, but anything is better than nothing; 1.3x is about what we’re looking at.  Still not bad…
  • Generic VM’s: We’re getting about 1.6x, so again, pretty darn good.
  • Windows File Servers: For us it’s not entirely fair to just use the general average, as we have a TON of media files that are pre-compressed.  What I’ll say is our generic user / department file server gets about 1.6x – 1.8x reduction.

Show me the performance…

Ok, so great, we can store a lot of data, but how fast can we access it?  In general, pretty darn fast…

The first thing I did when we got the arrays was fire up IOMeter and try trashing the array with a 100% random read 8k IO profile (500GB file), and you know what, the array sucked.  I mean I was getting like 1,200 IOPS, really high latency, and I was utterly disappointed almost instantly.  In hindsight, that test was unrealistic and unfair to some extent.  Nimble’s caching algorithm is based on random in, random out, and IOMeter was sequential in (ignored) and then attempting random out.  For me, what was more bothersome at the time, and still is to some degree, is that it took FOREVER before the cache hit ratio got high enough that I was starting to get killer performance.  It’s actually pretty simple to figure out how long it would take a cold dataset like that to completely heat up: divide 524,288,000 KB by 9,600 KB/s and you get roughly 15 hours.  The 524,288,000 is 500GB converted to KB, and the 9,600 is 8k * 1,200 IOPS, the approximate throughput at 8k.
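
For what it’s worth, here’s that back-of-the-napkin math as a quick sketch, using the same assumptions as above (a fully cold, 100% random 500GB working set and roughly 1,200 IOPS at 8k off the spindles):

```powershell
# Rough cache warm-up estimate for a fully cold, random working set.
$workingSetKB = 500GB / 1KB        # 524,288,000 KB of cold data
$throughputKB = 8 * 1200           # 8k IOs at ~1,200 IOPS = 9,600 KB/s off disk
$hours        = $workingSetKB / $throughputKB / 3600
"{0:N1} hours to fully heat the cache" -f $hours   # ~15.2 hours
```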

So you’re probably thinking all kinds of doom and gloom, and how could I recommend Nimble with such a long theoretical warm up time?  Well let’s dig into why:

  • That’s a synthetic test and a worst case test.  That’s 500GB of 100% random, non-compressed data.  If that data was compressed, for example, to 250GB, it would “only” take 7.5 hours to copy into cache.
  • On average only 10% – 20% of your total dataset is actually hot.  If that file was compressed to 250GB, worst case you’re probably looking at 50GB that’s hot, and more realistically 25GB.
  • That was data that was written 100% sequential and then being read 100% random.  It’s not a normal data pattern.
  • That time is how long it takes for 100% of the data to get a 100% cache hit.  The reality is, it’s not too long before you’re starting to get cache hits, and that 1,200 IOPS starts looking a lot higher (depending on your model).

There are a few example cases where that IO pattern is realistic:

  • TempDB: When we were looking at Fusion-io cards, the primary workload that folks used them for in SQL was TempDB.  TempDB can be such a varied workload that it’s really tough to tune for, unless you know your app.  Having sequential in, random out in TempDB is a very realistic scenario.
  • Storage Migrations:  Whether you use Hyper-V or VMware, when you migrate storage, that storage is going to be cold all over again with Nimble.  Storage migrations tend to be sequential write.
  • Restoring backup data:  Most restores tend to be sequential in nature.  With SQL, if you’re restoring a DB, that DB is going to be cold.

If you recall, I highlighted that my IOMeter test was unrealistic except in a few circumstances, and one of those realistic circumstances can be TempDB, and that’s a big “it depends”.  But what if you did have such a circumstance?  Well, any good array should have some knobs to turn, and Nimble is no different.  Nimble now has two ways to solve this:

  • Cache Pinning: This feature was released in NOS 2.3, basically volumes that are pinned run out of flash.  You’ll never have a cache miss.
  • Aggressive caching: Nimble had this from day one, and it was reserved for cases like this.  Basically when this is turned on (volume or performance policy granularity, to my knowledge), Nimble caches any IO coming in or going out.  While it doesn’t guarantee 100% cache hit ratios, in the case of TempDB it’s highly likely the data will have a very high cache hit ratio.

Performance woes:

That said, Nimble suffers the same issues that any hybrid array does, which is that a cache miss will make it fall on its face, which is further amplified in Nimble’s case by having a weak disk subsystem IMO.  If you’re not seeing at least a 90% cache hit ratio, you’re going to start noticing pretty high latency.  While their SW can do a lot to defy physics, random reads from disk are one area they can’t cheat.  When they re-assure you that you’ll be just fine with 12 7k drives, they’re mostly right, but make sure you don’t skimp on your cache.  When they size your array, they’ll likely suggest anywhere between 10% and 20% of your total data set size.  Go with 20% of your data set size or higher; you’ll thank me.  Also, if you plan to do pinning or anything like that, account for that on top of the 20%.  When in doubt, add cache.  Yes it’s more expensive, but it’s also still cheaper than buying NetApp, EMC, or any other overpriced dinosaur of an array.

The only other area where I don’t see screaming performance is situations where 50% sequential read + 50% sequential write is going on.  Think of something like copying a table from one DB to another.  I’m not saying it’s slow, in fact, it’s probably faster than most, but it’s not going to hit the numbers you see when it’s closer to 100% in either direction.  Again, I suspect part of this has to do with the NL-SAS drives and only having 12 of them.  Even with coalesced writes, they still have to commit at some point, which means you have to stop reading data for that to happen, and since sequential data comes off disk by design, you end up with disk contention.

Performance, the numbers…

I touched on it above, but I’ll basically summarize what Nimble’s IO performance specs look like in my shop.  Again, remember I’m running their slightly older CS460’s; if these were CS500’s or CS700’s, all these numbers (except cache misses) would be much higher.

  • Random Read:
    • Cache hit: Smoking fast (60k IOPS)
    • Cache miss: dog slow (1.2k IOPS)
  • Random Write: fast (36k IOPS)
  • Sequential
    • 100% read: smoking fast (2GBps)
    • 100% write: fast (800MBps – 1GBps)
    • 50%/50%: not bad, not great (500MBps)

Again, these are rough numbers; I’ve seen higher numbers in all the categories, and I’ve seen lower, but these are very realistic numbers I see.

Ease of use:

Honestly the simplest SAN I’ve ever used, or at least mostly.  Carving up volumes, setting up snapshots and replication has all been super easy and intuitive.  While Nimble provided training, I would contend it’s easy enough that you likely don’t need it.  I’d even go so far as saying you’ll probably think you’re missing something.

Also, growing the HW has been simple as well.  Adding a data shelf or cache shelf has been as simple as a few cables and clicking “activate” in the GUI.

Why do I say mostly?  Well if you care about not wasting cache, and optimizing performance, you do need to adapt your environment a bit.  Things like transaction logs vs DB, SQL vs Exchange, they all should have separate volume types.  Depending on your SAN, this is either common place, or completely new.  I came from an Equallogic shop, where all you did was carve up volumes.  With Nimble you can do that too, but you’re not maximizing your investment, nor would you be maximizing your performance.

Troubleshooting performance can take a bit of storage knowledge in general (can’t fault Nimble for that per se) and also a good understanding of Nimble itself.  That being said, I don’t think they do as good of a job as they could in presenting performance data in a way that would make it easier to pin down the problem.  From the time I purchased Nimble till now, everything I’ve been requesting is being siloed in this tool they call “InfoSight”, and the important data that you need to troubleshoot performance in many ways is still kept under lock and key by them, or is buried in a CLI.  Yeah, you can see IOPS, latency, throughput and cache hits, but you need to do a lot of correlations.  For example, they have a line graph showing total read / write IOPS, but they don’t tell you in the line graph whether it was random or sequential.  So when you see high latency, you now need to correlate that with the cache hits and throughput to make a guess as to whether the latency was due to a cache miss, or if it was a high queue depth sequential workload.  Add to that, you get no view of the CPU, average IO size, or other things that are helpful for troubleshooting performance.  Finally, they roll up the performance data so fast that if you’re out to lunch when there was a performance problem, it’s hard to find, because the data is averaged way too quickly.

Reliability:

Besides disk failures (commonplace), we’ve had two controller failures.  Probably not super normal, but nonetheless, not a big deal.  Nimble failed over seamlessly, and replacing them was super simple.

Customer Support:

I find that their claim of having engineers staffing support is mostly true.  By and large, their support is responsive, very knowledgeable, and if they don’t know the answer, they find it out.  It’s not always perfect, but certainly better than other vendors I’ve worked with.

Scaling:

I think Nimble scales fantastically so long as you have the budget.  At first, when they didn’t have data or cache shelves, I would have said they have some limits, but nowadays, with their ability to scale in any direction, it’s hard to say that they can’t adapt to your needs.

That said, there is one area where I’m personally very disappointed in their scaling, which is going up from older generation to newer generation controllers.  In our case, running the CS460’s requires a disruptive upgrade to go to the CS500’s or CS700’s.  They’ll tell me it’s non-disruptive if I move my volumes to a new pool, but that first assumes I have SAN groups, and second assumes I have the performance and capacity to do that.  So I would say this is mostly true, but not always.

Value / Design:

The hard parts of Nimble…

If we just take them at face value, and compare them based on performance and capacity to their competitors, they’re a great value.  If you open up the black box, though, and start really looking at the HW you’re getting, you start to realize Nimble’s margins are made up in their HW.  A few examples…

  • Using Intel S3500’s (or comparable) with SAS interposers instead of something like a STEC or HGST SAS-based SSD.
  • Supermicro HW instead of something rebranded from Dell or HP.  The build quality of Supermicro just doesn’t compare to the others.  Again, I’ve had two controller failures in 2 years.
  • Crappy rail system.  I know it’s kind of petty, but honestly they have some of the worst rails I’ve seen, next to maybe Dell’s EQL 6550 series.  Toolless kits have kind of been a thing for many years now; it would be nice to see Nimble work on this.
  • Lack of cable management, seriously, they have nothing…

Other things that bug me about their HW design…

It’s tough to understand how to power off / on certain controllers without looking in the manual.  Again, not something you’re going to be doing a lot, but still it could be better.  Their indicator lights are also slightly misleading, with a continually blinking amber-orange light on the chassis.  The color initially suggests that perhaps an issue is occurring.

While I like the convenience of the twin controller chassis, and understand why they, and many other vendors, use it, I’d really like to see a full sized dual 2U rack mount server chassis.  Not because I like wasting space, but because I suspect it would actually allow them to build a faster array.  It’s only slightly more work to unrack a full sized server, and the reality is I’d trade that any day for better performance and scalability (more IO slots).

I would like to see a more space conscious JBOD.  Given that they oversubscribe the SAS backplane anyway, they might as well do it while saving space.  Unlike my controller argument, where more space would equal more performance, they’re offering a configuration that chews up more space with no other value add, except maybe having front facing HDD’s.  I have 60 bay JBODs for backup that fit in 4U.  I would love to see that option for Nimble; that would be four times the amount of storage in about the same amount of space.

Its time to talk about the softer side of Nimble….

The web console, to be blunt, is a POS.  It’s slow, buggy, unstable, and really, I hate using it.  To be fair, I’m bigoted against web consoles in general, but if they’re done well, I can live with them.  Is it usable, sure, but I certainly don’t like living in it.  If I had a magic wand, I would actually do away with the web console on the SAN itself and instead produce two things:

  • A C# client that mimics the architecture of VMware.  VMware honestly had the best management architecture I’ve seen (until they shoved the web console down my throat).  There really is no need for a web site running on the SAN.  The SAN should be locked down to CLI only, with the only web traffic being API calls.  Give me a C# client that I can install on my desktop, and that can connect directly to the SAN or to my next idea below.  I suspect that Nimble could ultimately display a lot more useful information if this was the case, and things would work much faster.
  • Give me a central console (like vCenter) to centrally manage my arrays.  I get that you want us to use InfoSight, and while it’s gotten better, it’s still not good enough.  I’m not saying do away with InfoSight, but let me have a central, local, fast solution for my arrays.  Heck, if you still want to do a web console option, this would be the perfect place to run it.

The other area I’m not a fan of right now is their intelligent MPIO.  I mean I like it, but I find it too restrictive.  Being enabled on the entire array or nothing is just too extreme.  I’d much rather see it at the volume level.

Finally, while I love the Windows connection manager, it still needs a lot of work.

  • NCM should be forwards and backwards compatible, at least to some reasonable degree.  Right now it’s expected that it matches the SAN’s FW version, and that’s not realistic.
  • NCM should be able to kick off on demand snaps (in guest) and offer a snapshot browser (meaning show me all snaps of the volume).
  • If Nimble truly wants to say they can replace my backup with their snapshots, then make accessing the data off them easier.  For example, if I have a snap of a DB, I should be able to right click that DB and say “mount a snapshot copy of this DB, with this name”, and the Nimble goes off and runs some sort of workflow to make that happen.  Or just let us browse the snap’s data almost like a UNC share.

The backup replacement myth…

Nimble will tell you in some cases that they have a combined backup and primary storage solution.  IMO, that’s a load of crap.  Just because you take a snapshot doesn’t mean you’ve backed up the data.  Even if you replicate that data, it still doesn’t count as a backup.  To me, Nimble can say they’ve solved the backup dilemma with their solution when they can do the following:

  • Replicate your data to more than one location
  • Replicate your data to tape every day and send it offsite.
  • Provide an easy, straightforward way to restore data out of the snapshots.
  • Truncate transaction logs after a successful backup.
  • Provide a way of replicating the data to non-Nimble solution, so the data can be restored anywhere.  Or provide something like a “Nimble backup / recovery in the cloud” product.

Continued Innovation:

I find Nimble’s innovation to be on the slow side, but steady, which is a good thing.  I’d much rather have a vendor be slow to release something because they’re working on perfecting it.  In the time I’ve been a customer, they’ve released the following features post purchase:

  • Scale out
  • Scale deep
  • External flash expansion
  • Cache Pinning
  • Virtual Machine IOPS break down per volume
  • Intelligent MPIO
  • QOS
  • REST API
  • RBAC
  • Refreshed generation of SANs (faster)
  • Larger and larger cache and disk shelves

It’s not a huge list, but I also know what they’re currently working on, and all I can say is, yeah, they’re pretty darn innovative.

Conclusion and final thoughts:

Nimble is honestly my favorite general purpose array right now.  Coming from Equallogic, and having looked at much bigger / badder arrays, I honestly find them to be the best bang for the buck out there.  They’re not without faults, but I don’t know an array out there that’s perfect.  If you’re worried they’re not “mature enough”, I’ll tell you, you have nothing to fear.

That said, it’s almost 2016, and with flash prices being where they are now, I personally don’t see a very long life for hybrid arrays going forward, at least not as high performance mid-size to enterprise storage arrays.  Flash is getting so cheap that it’s practically not worth the savings you get from a hybrid, compared to the guaranteed performance you get from an all flash.  Hybrids were really filling a niche until all flash became more attainable, and that attainable day is here IMO.  Nimble thankfully has announced that an AFA is in the works, and I think that’s a wise move on their part.  If you have the time, I would honestly wait out your next SAN purchase until their AFAs are out; I suspect they’ll be worth the wait.