Category Archives: thinking out loud

Thinking out loud: The broken talent acquisition process

I was scanning through my LinkedIn feed the other day and stumbled across an interesting chart, posted by a former colleague, showing common mistakes that interview candidates make. To summarize, it was a collection of physical characteristics (slouching, not smiling, clothing that is too trendy or too outdated, etc.), how much you know about the company and position, and various other common do’s and don’ts.

My first reaction was “no duh”, but then I paused for a moment and asked myself “Why?” Do these criteria really lead to you getting the best candidate for the position, or are they actually filtering out the best candidate? It got me a little annoyed just thinking about how shallow a recruiter must be if these were deciding factors. Then I started thinking about other facets of recruiting that seem not only outdated, but also likely detrimental to a company finding the best possible candidates. It bothered me enough that I felt compelled to write a blog post about what I think is wrong, not only on the recruiting side, but on the candidate’s side too.

I realize a lot of the things I’m going to mention will have exceptions. There are always exceptions. I’m not writing to debate the exceptions, I’m writing to discuss the averages. Contrary to what most companies believe, they’re likely in the average too, and that’s not a bad thing.

First, let’s really think about what a company’s goal SHOULD be when attempting to fill a position.

  1. A person who has the best demonstrated skills for the position you need filled.
  2. Ideally the person should be reliable. You might also consider this a great work ethic. It’s not an inherently easy trait to get out of an interview, but it’s an important one.
  3. They should be passionate about their career, and love their work.

With those three simple yet crucial traits, there’s no reason you should not be able to find the perfect candidate. Now, I’m not saying they’re easy traits to determine, nor am I saying that the candidates who possess them are abundant. However, if you put up superficial filters, you’re reducing your chances of finding them.

What’s broken in the talent acquisition process from a candidate’s view:

From where I sit as a former job seeker, these are things I saw that led me to not waste my time applying for a position, or to just being generally frustrated with the company that I was trying to apply to. I think more often than not, companies forget that employment is a two-way street.

  • Making applying for your position a lot of work. This could be anything from forcing candidates to go through lengthy online applications, to something as simple as not having a “click here to submit your resume”. The application process should be as simple and as quick as possible. I’m not condoning applicants shotgunning their resume out either, but I think the risk of an applicant not applying, or forgetting to apply, outweighs the cost of sifting through more resumes than you’d like.
  • Related to the above, cover letters need to go. I’ve started noticing that they’re slowly being dropped, so hopefully that continues. Seriously, it’s an old-fashioned formality. 99% of the time the candidate is going to use a template, and I suspect most recruiters don’t read them anyway.
  • If salaries are not going to be in the job posting, they need to be discussed at a high level early on, not at the end. It’s a waste of the company’s time, and the candidate’s time, if the salary the candidate is looking for is way off what the company is willing to stretch to.
    • Adding to this, the candidate’s salary history is frankly none of the hiring company’s business. It really shouldn’t matter what I was paid. What should matter is what I’m willing to work for, and what you’re willing to compensate. That’s it.
  • Not doing the majority of your interviews via phone is, IMO, a disservice to your candidate. As a person who has now been on both sides of the table, I can say without question that you know if you want to hire someone before you ever meet them in person. To me, the in-person interview is mostly a formality, and really just a chance to meet face to face and get an idea of the work environment. Add to this, taking time off of work to go to an interview is WAY HARDER than, say, slipping out during lunch for a one-hour conversation via phone. Just think about it from the candidate’s view. How would you like to tell your manager for the 5th time that you’re sick, or your car broke down, or whatever other lie you force the candidate to make up so they can meet with you?
  • If you’re going to bring the candidate in for an in-person interview, then make sure it’s a once and done thing, unless the candidate wants a second in-person interview. Again, the point above is the main reason, but it’s also highly likely the candidate is burning up vacation time to come and interview with you. I can think of one place I interviewed at where I literally went in 5 times, only to get told “no”.
  • Offer interview times after hours, during lunch, and heck even over the weekend. Forcing someone to interview during business hours makes it tough for the candidate to coordinate. If you really want this person, and the person really wants to work for you, what’s the big deal with spending a non-work day in the office to meet and greet? I’d even add that neither you nor the candidate would feel as rushed as you might during the work week.
  • Be both flexible and understanding when it comes to someone showing up a little late or a little early. I’m not talking about 30 minutes or an hour, but if they’re say 5 – 10 minutes late, so what. I can tell you about a number of times where I was basically racing from work to an interview (pre-GPS) and it was just tough to find the place. Add to that, accidents and other things happen. I get that being punctual is important, but I bet your average person (even you, the recruiter) is late at times, and it’s beyond their control.
  • Except in a few circumstances, judging a candidate in person based on their appearance and NOT focusing on their skill set and experience doesn’t mean they’re a bad candidate, it means you’re a poor interviewer. Sure, maybe they showed up to interview at a company that claims to be “casual dress” in a polo instead of a suit, but does that really change that they’re the kick ass SysAdmin you’re trying to hire for a casual dress code company? If the company dress code is business professional and they show up for the interview in a polo shirt, then I would say that might be an issue if they didn’t tell you ahead of time. But consider: what if the candidate’s current employer’s dress code is casual? Everyone wears jeans and t-shirts. Now they need to go from that place to your business professional dress code. You’re only leaving them with a few options, none of which are good for them.
    • Dress professionally while at their current employer. That won’t set off any signals that they’re going to an interview.
    • Change in a bathroom along the way there.  That’s not awkward for the candidate or anything.
  • Not letting the candidate know that you’re not moving forward with them via a phone call or, at the very least, an email. Getting some insincere letter in the mail weeks or months after you interviewed is just rude.
  • If you are still interested in them, but are still interviewing other candidates, a weekly update isn’t too much to ask for.
  • Requiring candidates to have a college degree when you know darn well that the degree isn’t required to accomplish the work. At least when it comes to IT, a college degree does not inherently make you a great IT person.
  • Worrying about skills that are not critical to the position at hand. For example, “SysAdmin must have excellent written and oral communication skills.” Why must they be excellent? Seriously, do you care that they can automate your entire infrastructure, or do you care that they can write a perfect announcement? One skill actually is important to getting work done, the other is just looking for something to nitpick about.
  • Putting 100 skills that a candidate must possess when any reasonable person could look at that and go “yeah, and unicorns are real”. Seriously, someone whose knowledge is both wide and deep is so unlikely to exist, and if they do, you probably can’t afford them. I’ve looked at job postings where they basically wanted an entire IT department’s worth of skill sets in a single person and wanted to pay them the salary of a Jr. Admin. I’m not saying that well rounded candidates don’t exist, but expecting someone to be an expert in networking, virtualization, storage, Linux and Windows? Sorry, ain’t gonna happen. Sure, they might know some stuff in those areas, but they’re not going to be true experts in all of them.
  • Having unrealistic salary expectations is another issue I see a lot. Contrary to popular belief, a GREAT SysAdmin (as an example) is VERY tough to find. If you want the best talent, and not just a person that says they can do stuff, it’s going to cost more than you likely budgeted for. And you know what, if they are that good, you probably will make back whatever extra salary overhead you think you’ll incur when they do their job about 3x better / faster than the cheaper candidate you wanted to hire. Not to mention, golden shackles are a pretty powerful way to keep most good talent from leaving.
  • Interviewing with either too many people or not enough people, and sometimes the wrong people, is also an issue I’ve seen. I remember interviewing through a recruiter where the SVP of infrastructure wanted to interview me in person. They insisted that the interview be both in person and at a date / time that they were around. I had probably 2 – 3 different interviews scheduled that were canceled at the last minute because the SVP either had vacation, or something came up. When I finally did come in for the interview, I never met with them, but I met with every other person that reported to them. What’s funny about this is that every person I interviewed with indirectly lamented how strict and demanding the SVP was. So not only was I turned off by the fact that the SVP, after changing my schedule around about 3 times, didn’t bother to meet with me, but the people he delegated the interviewing to basically convinced me (without knowing it) that there was no way in hell I wanted to work for this guy. I suspect, had I actually met with the SVP, I may have picked up on the cues, but who knows, maybe not. Either way, they lost me, and I know I could have turned things around for them.
  • Having a candidate fill out an application even during an interview is a waste of time.  I say, reserve the job application for when you think you’re ready to send them an offer letter.  Focus first on finding the candidate of your dreams, THEN go through the formalities once you’ve found them.
  • Asking stupid interview questions that have no chance of determining the quality of the candidate or are not applicable to the job. Most of the time I see these coming out of HR, but every once in a while I’ll see a hiring manager ask them too. The types of questions I’m talking about are things like “why do you want to work here?” or “what are your 3 greatest strengths and your 3 greatest weaknesses?”. Seriously, stop wasting my time and yours and let’s move on to questions that really determine if I’m a good candidate. One example would be “what project were you most proud of in your career and why?” Remember my “passion for their career” requirement: if they don’t light up when asked to brag about themselves and their career, there’s something wrong with them. Or if they can’t explain anything of significance, I think you have your answer as to whether they’re a good fit. Me personally, I could probably give you hundreds of projects that made me beam with pride. The same goes for technical questions that are really just trivia. If someone claims to be an expert in a field, then sure they should know the trivia, but if they’re not claiming to be an expert, don’t ask them something that you could just Google.
  • Can we just do away with the dumb requirement of bringing three printed resumes and references along? First, it’s a waste of paper, and second, you can print out the resume that I sent in, or just look at it on your phone.

What’s broken in the talent acquisition process from a hiring manager’s view:

These are points where I see that the candidate has messed up, or is making my life tougher than it should be.

  • Stop loading your resume up with stuff that you say you did, but really didn’t do. There are so many times where I would read through someone’s resume and start asking them to explain details about an accomplishment they listed, only to hear that really they just helped and someone else actually did all the complex parts of the project. If you put “designed and implemented a virtual environment hosting over 500 VM’s” then I’m going to dig into that environment so that I know you actually did it.
  • Sort of related to the above, but don’t apply for jobs that you know you’re not qualified for. Just because your employer gave you a fake title of *Senior* SysAdmin doesn’t make you one, when we both know at best you’re mid level, and more than likely not much better than Jr. You applying for a job that you’re not qualified for is only going to lead to you either getting declined (wasting both our time) or me having to let you go if you do talk a good game, but can’t back it up. Don’t get me wrong, there were times much later in my career where I realized I had to fake it till I made it, but I knew I had the skills to do the job. Not just because I thought so, but because everyone I worked with told me I did.
  • Related to the above, I actually appreciate a person that admits “you know, I really don’t know how to do this”, so long as that’s not your answer to every question I ask, and so long as that’s not the answer to critical points of the job description.  Let’s be real here, if you DO say you know how to do something, I’m going to ask you about it.
  • Not having a passion for what you do is a huge negative for me. Look, I’m glad that you enjoy your tomato garden, but I care a hell of a lot more that you love your job to a point where you’re going to stay up to date on your own. Unless you just happen to have raw talent (and some do), being good in IT requires a lot of work, and there aren’t enough business hours to get work done and stay up to date. If you love your job, you’ll never work a day in your life. I live by that, and I want the people I look for to as well. Think of it like this: do you want a surgeon operating on you that only cares about their job when they’re getting paid, or do you want someone that goes to seminars on their own, researches on their own and in general wants to excel at what they do? I’m not saying you should live to work, but reading a few blogs every night on the couch isn’t going to kill you, nor is thinking about how to architect solution x while you’re running / biking, etc.
  • You should be an expert at some things and pretty darn good at a lot of things. If you say you’re an expert in Active Directory, I’m going to ask you about bridgehead servers and how they’re selected. I’m going to ask you about the 5 FSMO roles, and what they do. If you say you’re a VMware expert, I’m going to ask you if HA requires vCenter, and I’m going to ask you if the vMotion kernel and the management kernel can co-exist on the same VLAN.
  • I want you to be able to talk about IT architecture, and how you’d solve certain problems and why you would use that tactic.  I might not agree with you, but if your answer is well thought out, we can always negotiate the tactics as long as the strategy ultimately solves the problem.
  • I want to see progressive experience and responsibility, and you should have the skills to back it up.  I realize the higher up you get the tougher it gets.  But if I see that it took you 10 years before you got your first sysadmin gig, I’m going to wonder if you really have what it takes.
    • Just because you’re a SysAdmin, doesn’t mean I don’t expect very good desktop management skills out of you.
  • I want to see fire and passion when I interview you. If you disagree with me, diplomatically correct me. I don’t know everything, and who knows, I might just be testing if you do, and if you have the non-technical skills to lead upwards. You’re no good to me if you saw the iceberg, didn’t say anything, and let me crash the Titanic into it. Besides, think of it from your view: do you really want to work with someone that isn’t open to discussions? That doesn’t mean I won’t push back, but if your point is well thought out, I can at least respect your view.
  • If you need to dress down, I’m okay with it, but just let me know ahead of time before you show up at the interview.
  • If you’re calling me from a cell, try to make sure you have decent signal.
  • If you think you might be late, just let me know, I get it.

Closing thoughts:

Just remember with all of this, the point isn’t to make all kinds of crazy demands of either the candidate or the hiring company. It’s about cutting through recruitment techniques that are outdated and, in many cases, proven to be ineffective. I want to work for the best company, and in turn I want to be able to find the best candidates too. The sooner we get rid of broken acquisition techniques and improve our process, the quicker we’ll all find what we’re looking for.

Thinking out loud: The cloud (IaaS) delusion


Just so we’re all being honest here, I’m not going to sit here and lie about how I’m not biased and I’m looking at both sides 100% objectively. I mean I’m going to try to, but I have a slant towards on prem, and a lot of that is based on my experience and research with IaaS solutions as they exist now. My view of course is subject to change as technology advances (as anyone’s should), and I think with enough time, IaaS will get to a point where it’s a no-brainer, but I don’t think that time has come yet for the masses. Additionally, I think it’s worth noting that in general, like with any technology, I’m a fan of what makes my life easier, what’s better for my employer, and what’s financially sound. In many cases cloud fits those requirements, and I currently run and have run cloud solutions (long before it was trendy). I’m not anti-cloud, I’m anti throwing money away, which is what IaaS is mostly doing.

Where is this stemming from? After working with Azure for the past month, and reading about why I’m a cranky old SysAdmin for not wanting to move my datacenter to the cloud, I wanted to speak up on why, on the contrary, I think you’re a fool if you do. Don’t get me wrong, I think there are perfectly valid reasons to use IaaS, and there are things that don’t make sense to do in house, but running a primary (and at times a DR) datacenter in the cloud is just wasting money and limiting your company’s capabilities. Let’s dig into why…

Basic IaaS History:

Let’s start with a little history as I know it on how IaaS was initially used, and IMO, this is still the best fit for IaaS.

I need more power… Ok, I’m done, you can have it back.

There are companies out there (not mine) that do all kinds of crazy calculations, data crunching and other compute intensive operations. They needed huge amounts of compute capacity for relatively short periods of time (or at least that was the ideal setup). Meaning, they were striving to get the work done as fast as possible, and for argument’s sake, let’s just say their process scaled linearly as they added compute nodes. There was only so much time, so much power, so much cooling, and so much budget to be able to house all these physical servers for solving what is in essence one big complex math equation. What they were left with was a balancing act of buying as much compute as they could manage, without being excessively wasteful. After all, if they purchased so much compute that they could solve the problem in a minimal amount of time, then unless they kept those servers busy once the problem was solved, it was a waste of capital. About 10 years ago (taking a rough guess here), AWS released this awesome product capable of renting compute by the hour, and offering what’s basically unlimited amounts of CPU / GPU power. Now all of a sudden a company that would have had to operate a massive datacenter had a new option of renting mass amounts of compute by the hour. This company could fire up as many compute nodes as they could afford, and not only could they solve their problem quicker, but they only had to pay for the time they used.
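To put rough numbers on that, here is a minimal sketch of the arithmetic, assuming the job really does scale linearly and that billing is purely per node-hour. The job size and hourly price are made-up values for illustration, not real AWS pricing.

```python
def rental_cost(total_node_hours, nodes, price_per_node_hour):
    """Return (wall_clock_hours, total_cost) for a job that scales linearly.

    Assumes perfect linear scaling: doubling the nodes halves the run time.
    """
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    wall_clock_hours = total_node_hours / nodes
    total_cost = wall_clock_hours * nodes * price_per_node_hour
    return wall_clock_hours, total_cost


# Hypothetical values, chosen only to make the arithmetic easy to follow.
JOB_NODE_HOURS = 10_000      # total work in the job, measured in node-hours
PRICE_PER_NODE_HOUR = 0.50   # made-up hourly rate per node

for node_count in (10, 100, 1000):
    hours, cost = rental_cost(JOB_NODE_HOURS, node_count, PRICE_PER_NODE_HOUR)
    print(f"{node_count:>5} nodes: {hours:>7.1f} hours, ${cost:,.2f}")
```

Under those assumptions the bill is the same whether you rent 10 nodes or 1000, and only the wait changes. Owned hardware works the other way around: you pay for the 1000 nodes whether or not they are busy after the job finishes.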

I want to scale my web platform on demand…. and then shrink it, and then scale it, and then shrink it.

It evolved further. If it’s affordable to scale up and scale down in mass for folks that fold genomes or trend the stock market, why not for running things like next generation web scale architectures? Sort of a similar principle, except that you run everything in the cloud. To make it affordable and scalable, they designed their web infrastructure so that it could scale out, and scale on demand. Again, we’re not talking about a few massive database servers and a few massive web servers, we’re talking about tons of smaller web infrastructure components, all broken out so they’re independently scalable. Again the cloud model worked brilliantly here, because it was built on the premise that you designed small nodes, scaled them out on demand as load increased, and destroyed nodes as demand dwindled. You could never have this level of dynamic capacity affordably on prem.

I want a datacenter for my remote office, but I don’t need a full server, let alone multiples for redundancy.

At this stage IaaS is working great for the DNA crunchers and your favorite web scale company, and all the while, it’s getting more and more development time, more functionality, and finally gaining the attention of more folks for different use cases. I’m talking about folks that are sick of waiting on their SysAdmins to deploy test servers, or folks that needed a handful of servers in a remote location, or folks that only needed a handful of small servers in general and didn’t need a big expensive SAN or server. Again, it worked mostly well for these folks. They saved money by not needing to manage 20 small datacenters, or they were able to test that code on demand and on the platform they wanted, and things were good.

The delusion begins…

Fast forward to now, and everyone thinks that if the cloud worked for the genome folders, the web scale companies and finally for small datacenter replacements, then it must also be great for my relatively static, large legacy enterprise environment. At least that’s what every cloud peddling vendor and blogger would have you believe, and thus the cloud delusion was born.

Why do I call it the cloud delusion?  Simple, your enterprise architecture is likely NOT getting the same degrees of wins that these types of companies were/are getting out of IaaS.

Let’s break down the wins that cloud offered and still offers you. In essence, if this is functionality that you need, then the cloud MAY make sense for you.

  1. Scale on demand: Do you find yourself frequently needing to scale servers by the hundreds every day, week or even month? Shucks, I’ll even give you some leeway and ask if you’re adding multiple hundred servers every year. In turn, are you finding that you are also destroying said servers in this quantity? We’re trying to find out if you really need the dynamic scale on demand advantage that the cloud brings over your on prem solution.
  2. Programmatic Infrastructure: Now I want to be very clear with this from the start: while on prem may not be as advanced as IaaS, infrastructure on prem is mostly programmatic too, so weigh this pro carefully. Do you find that you hate using a GUI to manage your infrastructure, or need something that can be highly repeatable, and fully configurable via a few JSON files and a few scripts? I mean really think about that. How many of you right now are just drowning because you haven’t automated your infrastructure, and are currently head first in automating every single task you do? If so, the cloud may be a good fit, because practically everything can be done via a script and some config files. If however, you’re still running through a GUI, or using a handful of simple scripts, and really have no intention of doing everything through a JSON file / script, it’s likely that IaaS isn’t offering you a big win here. Even if you are, you have to question if your on prem solution offers similar capabilities, and if so, what’s the win that a cloud provider offers that your on prem does not.
  3. Supplement infrastructure personnel: Do you find your infrastructure folks are holding you back? If only they didn’t have to waste time on all that low level stuff like managing hypervisors, SANs, switches, firewalls, and other solutions, they’d have so much free time to do other things. I’m talking about things like patching firmware, racking / unracking equipment, installing hypervisors, provisioning switch ports. We’re talking about all of this consuming a considerable portion of your infrastructure team’s time. If they’re not spending that much time on this stuff (and chances are very high that they’re not), then this is not going to be a big win for you. Again, companies that would have teams busy with this stuff all the time probably have problem number 1 that I identified. I’d also like to add that even if this is an issue you have, there is still a limited amount of gain you’ll get out of this. You’re still going to need to provision storage, networking and compute, but now instead of in the HW, it will simply be transferred to a CLI / GUI. Mostly the same problem, just a different interface. Again, unless you plan to solve this problem ALONG with problem 2, it’s not going to be a huge win.
  4. VM’s on demand for all: Do you plan on giving all your folks (developers, DBA, QA, etc.) access to your portal to deploy VM’s? IaaS has an awesome on demand capability that’s easy to delegate to folks. If you need something like this, without having to worry about them killing your production workload, then IaaS might be great for you. Don’t get me wrong, we can do this on prem too, but there’s a bit more work and planning involved. Then again, letting anyone deploy as much as they want can be an equally expensive proposition. Also, let’s not forget problem number 2. Chances are pretty high your folks need some pre-setup tasks performed, and unless you’ve got that problem figured out, VM’s on demand probably isn’t going to work well anywhere, let alone the cloud.
  5. At least 95% of your infrastructure is going to the cloud: While the number may seem arbitrary (and to some degree it is a guess), you need a critical mass of some sort for it to make financial sense to send your infrastructure to the cloud (if you’re not fixing a point problem). What good is it to send 70% of your infrastructure to the cloud, if you have to keep 30% on prem? You’re still dealing with all the on prem issues, but now your economies of scale are reduced. If you can’t move the lion’s share of your infrastructure to the cloud, then what’s the point in moving random parts of it? I’m not saying don’t move certain workloads to the cloud. For example, if you have a mission critical web site, but it’s OK for everything else to have an outage, then move that component to the cloud. However, if most of your infrastructure needs five 9’s, and you can only move 70% of it, then you’re still stuck supporting five 9’s on prem, so again, what’s the point?

Disclaimer:  Extreme amounts of snark are coming, be prepared.

Ok, ok, maybe you don’t need any of these features, but you’ve got money to burn, you want these features just because you might use them at some point, everyone else is “going cloud” so why not you, or who knows whatever reason you might be coming up with for why the cloud is the best decision. What’s the big deal? I mean you’re probably thinking you lose nothing, but gain all kinds of great things. Well that, my friend, is where you’d be wrong. Now my talking points are going to be coming from my short experience with Azure, so I can’t say these apply to all clouds.

  1. No matter what, you still need on prem infrastructure. Maybe it’s not a horde of servers, but you’ll need stuff.
    1. Networking isn’t going anywhere (should have been a network engineer). Maybe you won’t have as many datacenter switches to contend with (and you shouldn’t have a lot if your infrastructure is modern and not greater than a few thousand VM’s), but you’ll still need access switches for your staff. You’re going to need VPN’s and routers. Oh, and NOW you’re going to need a MUCH bigger router and firewall (err… more expensive). All that data you were accessing locally now has to go across the WAN, and if you’re encrypting that data, that’s going to take more horsepower, and that means bigger, badder WAN networking.
    2. You’re probably still going to have some form of servers on site. In a Windows shop that will be at least a few domain controllers. You’ll also have file server caching appliances, and possibly other WAN acceleration devices depending on what apps you’re running in the cloud.
    3. Well, you’ve got this super critical networking and file caching HW in place, so you need to make sure it stays on. That potentially is going to lead back to UPS’s at a minimum and maybe even a generator. Then again, being fair, if the power is out, perhaps it’s out for your desktops too, so no one is working anyway. That’s a call you need to make.
    4. Is your phone system moving to the cloud too?  No… guess you’re going to need to maintain servers and other proprietary equipment for that too.
    5. How about application “X”? Can you move it to the cloud, and will it even run in the cloud? Say it’s based on Windows 2003, and Azure doesn’t support Windows 2003. What are application “X”’s dependencies and how will they affect the application if they’re in the cloud? That might mean more servers staying on prem.
  2. They told you it would be cheaper, right? I mean the cloud saves you on so much infrastructure, so much personnel power, and it provides this unlimited flexibility and scalability that you don’t actually need.
    1. Every VM you build now actually has a hard cost. Sorry, but there’s no such thing as “over provisioning” in the cloud. Your cloud provider gets to milk that benefit out of you and make a nice profit. Yeah, I can run a hundred small VM’s on a single host on prem, but in a cloud solution I’d pay for each of those same VM’s. But hey, it’s cheaper in the cloud, or so the cloud providers have told me.
    2. Well at least the storage is cheaper, except that to get decent performance in the cloud, you need to run on premium storage, and premium storage isn’t cheap (and not really all that premium either). You don’t get to enjoy the nice low latency, high IOPS, high throughput, adaptive caching (or all flash) that your on prem SAN provided. And if you want to try and match what you can get on prem, you’ll need to over-provision your storage, and do crazy in guest disk striping techniques.
    3. What about your networking? I mean what is one of the most expensive recurring networking costs to a business? The WAN links… well they just got A LOT more expensive. So on top of now spending more capex on a router and firewall, you also need to pump more money into the WAN link so your users have a good experience. Then again, they’ll never have the same sub-millisecond latency that they had when the app was local to them.
      1. No problem you say, I’ll just move my desktop to the cloud, and then you remember that the latency still exists, it’s just been moved from between the client and the application to between the user and the client. Not really sure which is worse.
        1. Even if you’re not deterred by this, now you’re incurring the costs of running your desktops in the cloud. You know, for the same folks you force desktops that are 5 years old or older on.
    4. How many IP’s or how many NIC’s does your VM have?  I hope it’s one and one.  You see, there are limitations (in Azure) of one IP per NIC, and in order to run multiple NIC’s per server, you need a larger VM.  Ouch…
    5. I hope you weren’t thinking you’d run exactly 8 vCPU’s and 8GB of vRAM because that’s all your server needs.  Sorry, that’s not the way the cloud works.  You can have any size VM you want, as long as it’s one of the sizes that your cloud provider offers.  So you may end up paying for a VM that has 8 vCPU and 64GB of RAM because that’s the closest fit.  But wait, there’s more…  what if you don’t need a ton of CPU or RAM, but you have a ton of data, say a file server?  Sorry, again, the cloud provider only enables a certain number of disks per vCPU, so you now need to bump up your VM size to support the disk size you need.
    6. At least with cloud, everything will be easy, I mean yeah it might cost more, but oh… the simplicity of it all.  Yep, because having a year 2005 limitation of 1TB disks just makes everything easy.  Hope you’re really good with dynamic disks, Windows Storage Spaces, or LVM (Linux) because you’re going to need them.  Also, I hope you have everything thought out in advance if you plan to stripe disks in guest.  MS has the most unforgiving disk striping capabilities if you don’t.
    7. Snapshots, they at least have snapshots… right?  Well sort of, except it’s totally convoluted, and probably not something you’d ever want to implement for fear of wrecking your VM (which is what you were trying to avoid with the snap, right?).
    8. Ok, ok, well how about dynamically resizing your VM’s?  They can at least do that right?  Yes, sort of, so long as you’re sizing up within a specific VM class.  Otherwise, to my knowledge, you have to rebuild once you outgrow a given VM class.  For example, the “D series” can be scaled until you reach the maximum for the “D”.  You can’t easily convert it to a “G” series in a few clicks to continue growing it.
    9. Changes are quick and non-disruptive right?  LOL, sure with any other hypervisor they might be, but this is the cloud (Azure) and from what I can see, it’s iffy whether your VM’s will need to be shut down, or even worse, if you do something that is supported hot, you may see longer than normal stuns.
    10. Ever need to troubleshoot something in the console?  Me too, a shame because Azure doesn’t let you access the console.
    11. Well at least they have a GUI for everything right?  Nope, I found I need to drop into PS more often than not.  Want to resize that premium storage disk?  That’s gonna take a PowerShell cmdlet.  That’s good though right, I mean you like wasting time finding the disk GUID, digging into a CLI, just to resize one disk, which BTW is a powered off operation, WIN!
    12. You like being in control of maintenance windows right?  Of course you do, but with cloud you don’t get a say.
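
The “closest fit” problem from point 5 can be sketched in a few lines.  This is only an illustration under loud assumptions: the size catalog below (names, disk limits and hourly prices) is invented and does not reflect any real provider’s offerings.  It simply shows how a modest CPU/RAM requirement, or a large disk count, can push you into a much bigger VM than you need.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmSize:
    name: str
    vcpus: int
    ram_gb: int
    max_data_disks: int
    hourly_cost: float


# Hypothetical catalog: every name and number here is made up for illustration.
CATALOG = [
    VmSize("small", 2, 4, 4, 0.10),
    VmSize("medium", 4, 16, 8, 0.25),
    VmSize("large", 8, 64, 16, 0.80),
    VmSize("xlarge", 16, 128, 32, 1.60),
]


def smallest_fit(vcpus, ram_gb, data_disks, catalog=CATALOG):
    """Return the cheapest catalog size that satisfies every requirement."""
    candidates = [
        size for size in catalog
        if size.vcpus >= vcpus
        and size.ram_gb >= ram_gb
        and size.max_data_disks >= data_disks
    ]
    if not candidates:
        raise ValueError("no size in the catalog satisfies the requirements")
    return min(candidates, key=lambda size: size.hourly_cost)


# A server that only needs 8 vCPU / 8GB still lands on the 64GB size.
app_server = smallest_fit(vcpus=8, ram_gb=8, data_disks=2)

# A file server with tiny CPU/RAM needs but 20 data disks lands on the biggest size.
file_server = smallest_fit(vcpus=2, ram_gb=4, data_disks=20)

print(app_server.name, file_server.name)
```

With this made-up catalog, the 8 vCPU / 8GB server ends up on the 8 vCPU / 64GB size, and the file server is forced onto the largest size purely because of its disk count.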

I could keep going on, but honestly I think you get the point.  There are caveats in spades when switching to the cloud as a primary (or even DR) datacenter.  It’s not a simple case of paying more for features you don’t have; you lose flexibility / performance, and you pay more for it too.

Alright, but what about all those bad things they say about on prem, or things like TCO they’re using to woo you to the cloud?  Well, let’s dig into it a bit.

  1. Despite what “they” tell you, they’re likely out of touch.  Most of the cloud folks you’re dealing with, have been chewing their own dog food so long, they don’t have a clue about what exists in the on prem world, let alone dealing with your infrastructure and all its nuances.  They might convince you they’re infrastructure experts, but only THEIR infrastructure, not yours and certainly not on prem in general.  Believe me, most of them have been in their bubble for half a decade at least, and we all know how fast things change in technology, they’re new school in cloud, but a dinosaur in on prem.  Don’t misunderstand me, I’m not saying they’re not smart, I’m saying I doubt they have the on prem knowledge you do, and if you’re smart, you’ll educate yourself in cloud so you’re prepared to evaluate if IaaS really is a good fit for you and your employer.
  2. Going cloud is NOT like virtualization.  With virtualization you didn’t change the app, you didn’t lose control and more importantly it actually saved you money and DID provide more flexibility, scalability and simplicity.  Cloud does not guarantee any of those for a traditional infrastructure.  Or rather it may offer different benefits that you don’t need nearly as much.
  3. They’ll tell you the TCO for cloud is better and they MAY be right if you’re doing foolish things like:
    1. Leasing servers and swapping them every three years.  A total waste of money.  There are very few good reasons not to be financing a server (capex) and re-purposing that server through a proper lifecycle.  A modern server should have a life cycle of at least five years.  You have DR, and other things you can use older HW for.
    2. You’re not maxing out the cores in your server to minimize licensing costs, reduce network connectivity costs, and also reduce power, cooling and rack space.  An average dual socket 18 core server can run 150 average VM’s without breaking a sweat.
    3. Your threshold for a maxed out cluster is too low.  There’s nothing wrong with a 10:1 or even a 15:1 vCPU to pCPU ratio so long as your performance is ok.  Your mileage may vary, but be honest with yourself before buying more servers based on arbitrary numbers like these.
    4. You take advice from a greedy VAR.  Do yourself a favor and just hire a smart person that knows infrastructure.  They’ll be cheaper than all the money you waste on a VAR, or cloud.  You should be pushing for someone that is borderline architect, if not an architect.
      1. FYI, I’m not saying all VARs are greedy, but more are than not.  I can’t tell you how many interviews I’ve had where I go “yeah, you got upsold”.
    5. Stop with this BS of only buying “EMC” or “Cisco” or “Juniper” or whatever your arbitrary preferred vendor is. Choose the solution based on price, reliability, performance, scalability and simplicity, not by its name.  I picked Nimble when NetApp would have been an easy, but expensive choice.  Again, see point 4 about getting the right person on staff.
    6. Total datacenter costs (Power, UPS, generator and cooling) are worth considering, but are often not as expensive as the providers would have you think.  If this is the only cost-savings point they have you sold on, you should consider colocation first, which takes care of some of that, but also incurs some of the same costs/caveats that come with cloud (but not nearly as many).  Again, I personally think this is FUD, and in a lot of cases, IT departments, let alone businesses, don’t even see the bill for all of this.  Even with things like DC space, if you’re using newer equipment, the rack density you get through virtualization is astounding.
    7. You’re not shopping your solution, ever.  I know folks that just love to go out to lunch (takes one to know one), and their VAR’s and vendors are happy to oblige.  If your management team isn’t pushing back on price, and lets you run around throwing PO’s like Monopoly money, there’s a good chance you’re paying more for something than you need to.
    8. You suck at your job, or you’ve hired the wrong person.  Sounds a little harsh, but again, going back to point 4, if you have the right people on staff, you’ll get the right solutions, they’ll be implemented better, and they’ll get implemented quicker.  Cloud, by the way, only fixes certain aspects of this problem.
  4. They’ll tell you that you can’t do it better than them, they scale better, and it would cost you millions to get to their level.  They’re right, they can build a 100,000 VM host datacenter better than you or I, and they can run it better.  But you don’t need that scale, and more importantly, they’re not passing those economies of scale on to you.  That’s their profit margin.  Remember, they’re not doing this to save you money, they’re doing this to make money.  In your case, if your DC is small enough (but not too small) you can probably do it MUCH cheaper than what you’d pay for in a cloud, and it will likely run much better.
  5. They’ll tell you that you’ll be getting rid of a SysAdmin or two thanks to cloud.  Total BS… An average sysadmin (contrary to marketing slides) does not spend a ton of time on the mundane tasks of racking HW, patching hypervisors (unless it’s Microsoft :-)), etc.  They spend most of their time managing the OS layer, doing deployments, etc., which BTW all still need to be done in the cloud.
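
To put numbers on the density and overcommit points above (the dual socket 18 core host running 150 VM’s, and the 10:1 or 15:1 ratios), here is a rough back-of-the-envelope sketch.  The 2 vCPU per VM figure is purely an assumption for illustration, and hyperthreading is ignored; plug in your own averages.

```python
def overcommit_ratio(sockets, cores_per_socket, vm_count, vcpus_per_vm):
    """vCPU to pCPU ratio for a host, counting physical cores only."""
    physical_cores = sockets * cores_per_socket
    return (vm_count * vcpus_per_vm) / physical_cores


def max_vms(sockets, cores_per_socket, target_ratio, vcpus_per_vm):
    """How many VMs fit on a host before exceeding the target ratio."""
    physical_cores = sockets * cores_per_socket
    return int(physical_cores * target_ratio // vcpus_per_vm)


# Assumption: an "average" VM has 2 vCPUs (not a figure from the post).
VCPUS_PER_VM = 2

# The dual socket, 18 core host running 150 VMs mentioned above.
ratio = overcommit_ratio(2, 18, 150, VCPUS_PER_VM)
print(f"150 VMs on 36 cores is roughly {ratio:.1f}:1")

# Headroom at the 10:1 and 15:1 thresholds.
for target in (10, 15):
    print(f"{target}:1 allows about {max_vms(2, 18, target, VCPUS_PER_VM)} VMs")
```

Under that 2 vCPU assumption, 150 VM’s on 36 physical cores works out to a ratio below 10:1, which is consistent with the “without breaking a sweat” claim.  Whether performance actually holds up depends on the workloads, not the ratio alone.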

For now, that’s all I’ve got.  I wrote this because I was so tired of hearing folks spew pro cloud dogma from their mouths without even having a simplistic understanding of what it takes to run infrastructure in the cloud or on prem.  Maybe I am the cranky mainframe guy, and maybe I’m the one who is delusional and wrong.  I’m not saying the cloud doesn’t have its place, and I’m not even saying that IaaS won’t be the home of my DC in ten years.  What I am saying is right now, at this point in time, I see moving to the cloud as a big, expensive mistake if your goal is to simply replace your on prem DC.  If you’re truly being strategic with what you’re using IaaS for, and there are pain points that are difficult to solve on prem, then by all means go for it.  Just don’t tell me that IaaS is ready for the general masses, because IMO, it has a long way to go yet.

Thinking out loud: CLI vs. GUI, a pointless debate

I see this come up occasionally and I really don’t get why it’s always an “x is better than y”.  A lot of times folks are talking about what works best for them in their world, and for all intents and purposes stating that if it’s best for them, it’s best for all.  I’d like to challenge this reasoning and also make the case that both a CLI and a GUI have their place.

The GUI: It’s pretty and functional

I’m not sure where all this hate on GUI’s comes from, but I can tell you that it stereotypically comes from either the *NIX or network engineer crowds.  Thinking about a GUI from their point of view, I can completely understand why they’re not huge fans of a GUI.  The folks in these crowds spend most of their day buried in a CLI, for the simple reason that no GOOD GUI exists for most of what they’re doing.  KDE? GNOME? Some crappy web interface (this one’s a toss-up depending on whose interface)?  I wouldn’t want to live day in and day out with half of the GUI’s they’re using either.  So what’s the problem with their GUI’s?

  • In my admittedly limited experience with Linux GUI’s, most of them offer limited functionality.  They can do the basics, but when you really want to get in and configure something, it almost always leads to firing up a console.  If everything in your ecosystem relies on you ending up in the CLI, pretty soon, you’re going to skip the GUI.  Even with Apple’s OS X I’ve found this to be the case.  I wanted to change my mouse wheel’s scrolling direction, and that required going into the CLI (seriously).
  • Their GUI’s are not always intuitive, and honestly this is supposed to be one of the value adds of a good GUI.
  • Their GUI’s sometimes implement sloppy / bad configurations.  One area I recall hearing this a lot with is the Cisco ASA, but I’m sure this occurs on more solutions than Cisco’s firewalls.
  • Not so much a problem of their GUI, but there’s just a negative stigma with anyone in these crowds using a GUI.  To paraphrase, if you use a GUI, you’re not a good/real admin (load of crap BTW).
  • Again, not so much a problem of the GUI, but likely due to the above statement, there really aren’t a lot of “how to’s” for using a GUI.  You ultimately end up firing up the CLI, because that’s the way things are documented.
  • A lot of GUI’s don’t do bulk or automated administration well.  I think this is pretty true across the board.  That said, I have used purpose built automation tools, but they were for very specific use cases.  98% of the time, I go to a CLI for automation / bulk tasks.
  • Depending on the task you’re doing, GUI’s can be slower IF you’re familiar with the CLI (commands and proper syntax).

Clearly there are a lot of cons for the GUI, but I think a lot of them tend to pertain more to *NIX and network engineers.  It’s not that a GUI by nature is bad, it’s just that their GUI is bad.  In case you haven’t guessed, I’m a Windows admin, and with that, I enjoy an OS that was specifically designed with a great GUI in mind (even Windows 8 / 2012).  This is really the key: if the GUI is good, then your desire to use a GUI instead of the CLI will be increased.  We went over what the problems are with a GUI, so why not go over what’s good about a GOOD GUI?

  • They’re easy and intuitive to use.
  • They present data in a way that’s easier to analyze visually.  This is important and I really think it’s overlooked a lot of times.  Sure you CAN analyze data in the CLI, but the term “a picture is worth a thousand words” isn’t a hollow phrase.
  • When they’re designed well, single task items (non-bulk) are quick and easy.  It’s debatable whether a CLI is quicker and there are A LOT of factors that go into that.
  • GUI’s by nature allow multi-tasking.  Now I get that you can technically have multiple PuTTY windows open (that’s a GUI BTW), but it’s not quite the same as an application, interactively speaking.
  • They’re pretty.  I know that’s not exactly a great reason, but let’s all be honest, a pretty GUI is a lot nicer than a cold dark CLI.
  • Options and parameters tend to be fully displayed making it easier to see all the possible options and requirements. (I know PowerShell ISE now offers this, very sweet.)
  • Some newer GUI’s like MS’s SQL and Exchange management consoles even show you the CLI commands so you can use them as a foundation for scripts.  Meaning, GUI’s can help you to learn the CLI.
  • Certain highly complex tasks are simplified by using a GUI.  Things that may have taken multiple commands or digging deep into an object can be accomplished with a few clicks.

The CLI: Lean, mean automation machine

Like I said, I’m a Windows guy, but at the same time, I have a healthy appreciation and love for the CLI.  I’m lazy by nature, and HATE routine tasks.  Speaking of hate, I see a lot of Windows admins doling out their equal amount of hate on the CLI, or more specifically on being forced into PowerShell.  I get it, after all, it wasn’t until PowerShell came out that Microsoft actually had a decent CLI / scripting language.  There was VBScript, it worked, but man was it a lot of work (and not really interactive).  Nonetheless, PowerShell has been out for what’s got to be close to ten years now and there’s still pushback from some of the Windows crowd.  There’s no need for the resistance (it’s futile), your life will be better with a CLI, AND the GUI you know and love.  So let’s go into why you’re still not using a CLI, and why it’s all a bunch of nonsense.  Also, this is going to be targeted mostly at Windows admins, and specifically Windows admins that are still avoiding the CLI.  It’s kind of redundant to go over the pros / cons of the CLI because they’re mostly the inverse of what I mentioned about a GUI.

  • It’s a PITA to learn and a GUI is easy, or at least that’s what you tell yourself.  The reality: PowerShell is EASY to use and learn.
  • You don’t have time to learn how to script.  After all, you could spend 30 minutes clicking, or spend 4 hours trying to write a script.  Sure, the first time you attempt to write a new script, it might take you 16 hours, but the second script might take you 4 hours, and the third script might take you 5 minutes.  The more you familiarize yourself with all the syntax, commands, functions, methods, etc. the easier it will be to solve the next problem.
  • You’re afraid that your script might purge 500 mailboxes in a single key tap.  This is absolutely a possibility, and you know what, mistakes happen (even GOOD admins make really dumb mistakes).  But that’s why you start out small, and learn how to target.  That’s also why you have backups 🙂
  • You’re afraid you’ll automate yourself out of a job.  That’s not likely to happen.  Someone still needs to maintain the automation logic (it’s never perfect, OR things change), and it frees you up to do more interesting (fun) things.
  • Once you learn one CLI, a lot of it is transferable to other CLI’s.  For me, going from PowerShell to T-SQL was actually pretty easy.  I’m not a pro with T-SQL, but a lot of the concepts were similar.  I also found that as I learned how to do things in T-SQL, it helped me with other problems in PowerShell (see how that works).  I don’t have a lot of experience with *NIX CLI’s, but I’d be willing to bet I could figure it out.
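
The “30 minutes clicking vs. 4 hours scripting” trade-off above is really a break-even question.  As a rough sketch (the function name is hypothetical, and the assumption that a scripted run costs close to zero minutes of your time is mine, not a measurement):

```python
import math


def break_even_runs(manual_minutes, scripting_minutes, scripted_run_minutes=0.0):
    """Number of repetitions after which writing the script has paid for itself."""
    saved_per_run = manual_minutes - scripted_run_minutes
    if saved_per_run <= 0:
        raise ValueError("the script must be faster per run than doing it by hand")
    return math.ceil(scripting_minutes / saved_per_run)


# 30 minutes of clicking vs. a 4 hour script.
print(break_even_runs(30, 4 * 60))

# The painful first script that takes 16 hours to write.
print(break_even_runs(30, 16 * 60))
```

Even the 16 hour first attempt pays for itself if the task comes back often enough, and that ignores the fact that the next script gets written faster.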

I probably did a bad job of being unbiased, but I really did try.  I truly see a place for both administration methods in IT and I hope you do too.

Thinking out loud: Core vs. Socket Licensing

I recall when Microsoft changed their SQL licensing model from per socket to per core.  It was not a well received model and it certainly wasn’t following the industry standard.  I’d even say it ranked up there with VMware’s vRAM tax debacle.  The big difference being that VMware reversed their licensing model.  To this day, other than perhaps Oracle, I don’t know of any other product / manufacturer using this model.  And you know what, it’s a shame.

I bet you weren’t expecting that were you?  No I don’t own stock in Microsoft, and yes I do like per socket (at times), but I honestly feel like in more cases than not, per core is a better licensing model for everyone.

I didn’t always feel this way; like most of you I was pretty ticked off when Microsoft changed to this model.  After all, the cores per socket are getting denser and denser, and now that we’re finally starting to get an incredible number of cores per socket, here’s Microsoft (and only Microsoft BTW) changing their model.  Clearly it must be driven by greed.

No, I don’t think it’s greed, in fact I think it’s an evolutionary model that was needed for the good of us the consumers and for the manufacturers.  Let’s go into why I feel this way.

To start, I’m going to use my company’s environment as an example.  We have what I would personally consider a respectably sized environment.

In my production site, I currently have the following setup:

  • 4 Dell R720’s with the v2 processors (dual socket, 12 cores per, 768GB of RAM)
    • General purpose VM’s
  • 10 Dell R720’s with the v1 processors (dual socket, 8 cores per, 384GB of RAM)
    • Web infrastructure
  • 7 Dell R820’s with v1 processors (quad socket, 8 cores per, 768GB of RAM)
    • Microsoft SQL only

In my sister company’s site we have the following setup:

  • 3 Dell r710’s (dual socket, quad core, 256GB of RAM)
    • Remote office setup

In my DR site, we have the following setup (there are more servers, they’re just not VMware yet).

  • 4 Dell r710’s (dual socket, quad core, 256GB of RAM)
    • DR servers (more to come).

As you can see, beefy hardware and a pretty wide array of different configurations for different uses.  All of these have a per socket licensing model, with the one caveat that my SQL cluster is also licensed per core.

What got me thinking about this whole licensing model to begin with is that I’d like to refresh my DR site, and take an opportunity to re-provision our prod site in a more holistic way.  The biggest problem here is SQL because it’s per core, and everything else is per socket.  Which really limits my options to the following:

  1. Have two separate clusters, one for SQL and one for everything else.  Then I can design my HW and licensing as it makes the most sense with SQL, and also maximize my VM’s per socket with my other cluster.
  2. Take a density hit on the number of VM’s per socket, and run a single cluster with quad socket 8 core procs.
  3. Have my SQL VM’s take a clock rate hit (ie make them slower) and adopt a dual socket 16 core setup.

So what does any of this have to do with this whole core vs. socket thing?  It’s pretty simple actually, and if you don’t get it, go back and re-read my current setup, and go read the design questions I’m juggling with.  Really think about it…

Let me highlight a few things about my current infrastructure.  I’m going to attack the principle of per socket, and why it’s a futile licensing model (even if you don’t run anything else that’s licensed per core).

Every server in the infrastructure I laid out above is licensed on a flat per socket model, regardless of the number of cores.  That means my dual socket quad core procs cost me the same amount to license as my dual socket 12 core procs.  They’re doing less work, not able to support the same workloads, yet they cost me the same amount.  Now, let’s look at it from the manufacturer’s view: at the time of pricing, my quad core proc was a pretty fair deal, but now as cores per socket have increased, their revenue potential decreases.  Depending on which side you’re on, someone is losing out.

Here’s another example of how per socket isn’t fair for us.  Remember my SQL hosts, quad socket 8 core?  Remember how I was saying I was thinking about dual socket 16 core procs, but at the expense of a really slow clock rate?  How is that fair for me?  The reality is, I have the same number of cores, yet if I design my solution based on performance, it costs me twice as much in licensing, compared to a model that’s mostly about density.
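
The arithmetic behind these two examples can be sketched out quickly.  The prices here are normalized placeholders (1.0 per socket, 1.0 per core), not real list prices; only the ratios matter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    label: str
    sockets: int
    cores_per_socket: int

    @property
    def cores(self):
        return self.sockets * self.cores_per_socket


def per_socket_cost(host, price_per_socket=1.0):
    """License cost when every socket costs the same, regardless of cores."""
    return host.sockets * price_per_socket


def per_core_cost(host, price_per_core=1.0):
    """License cost when every physical core is licensed."""
    return host.cores * price_per_core


# Hosts taken from the environment described above.
dual_quad = Host("dual socket, quad core", 2, 4)
dual_12 = Host("dual socket, 12 core", 2, 12)
quad_8 = Host("quad socket, 8 core", 4, 8)
dual_16 = Host("dual socket, 16 core", 2, 16)

for host in (dual_quad, dual_12, quad_8, dual_16):
    socket_cost = per_socket_cost(host)
    print(
        f"{host.label}: {host.cores} cores, "
        f"per-socket cost {socket_cost:.1f} "
        f"({socket_cost / host.cores:.3f} per core), "
        f"per-core cost {per_core_cost(host):.1f}"
    )
```

Per socket, the quad core and 12 core hosts cost the same even though one has three times the cores, and the quad socket 8 core SQL host costs twice what the dual socket 16 core host does despite both having 32 cores.  Per core, the cost tracks the cores you actually run on.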

I’d like to share a final example of why I think per socket isn’t relevant any longer.  This should hit home a lot more with SMB’s and those that try to convince SMB’s to go virtual.  We have a sister company that we manage, and they have all of about 20 servers.  That doesn’t mean virtualization isn’t a good fit for them, and why should they be forced into using something less optimal like Hyper-V?  Yet, a simple 3 host dual socket config for them would cost an arm and a leg.  The reason being, VMware is charging them the same price that they’d charge a large enterprise.  You know what would level the playing field?  Per core instead of per socket.

In my opinion, per core would be a win for both the consumers and the manufacturers.  It provides the most flexibility, the fairest pricing, and it doesn’t force you into building your environment based on simply maximizing your licensing ROI (you do take application performance into consideration, right?)

Finally, if we were to adopt a per core model, there’s one thing I would insist that manufacturers do to be fair.  I’m calling Microsoft out on this, because I think it’s very underhanded of them to put this in their EULA.  Modern BIOS’s have the ability to limit the number of cores presented to the OS.  You can effectively turn a 12 core proc into a quad core proc if you want.  Microsoft specifically doesn’t allow a company to use this feature, and still requires that all physically installed cores be licensed.  This is so wrong on their part, it’s not fair, it’s not right, and the reality is, it just makes this model look worse than it really is.  So with that being said, if a manufacturer were to switch to this model, I’d implore them to offer a few different options that allow me to buy more physical cores than I’m licensed for, but still be compliant.

  1. Let me use the BIOS feature to limit the number of cores presented to the OS.  Do you really think I’m going to shut down all my servers before an audit and lower their cores?  Really, what good would that do?  If my server needs 8 cores, it needs 8 cores; lowering it to 4 cores for the audit would likely have a more devastating effect on my business than properly licensing the server.
  2. Design your product so that with a simple license key, or license augmentation, you can logically go from “x” cores to “y” cores.  Meaning you self restrict the number of physical cores used in a system.  If I have a dual socket 12 core proc, and I’m only licensed for 8 cores, only utilize 8/24 cores.
    1. Heck, take NUMA minimums into consideration.  Meaning, if I have a quad socket, 4 cores at a minimum are required, and if I have a dual socket, 2 cores at a minimum are required.
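
To make the second option a little more concrete, here is a minimal sketch of what “self restricting” to the licensed core count could look like.  This is not how any real product behaves; the function name is hypothetical, and spreading cores evenly across sockets with at least one core per socket is my reading of the NUMA minimum described above.

```python
def active_cores_per_socket(sockets, cores_per_socket, licensed_cores):
    """Spread the licensed cores evenly across sockets.

    Assumes the license must cover at least one core per socket
    (the NUMA minimum) and never activates more cores than exist.
    """
    if licensed_cores < sockets:
        raise ValueError(
            f"a {sockets} socket host needs at least {sockets} licensed cores"
        )
    usable = min(licensed_cores, sockets * cores_per_socket)
    base, extra = divmod(usable, sockets)
    # The first `extra` sockets each get one additional core.
    return [base + (1 if index < extra else 0) for index in range(sockets)]


# Dual socket, 12 core procs, licensed for 8 cores: 8 of 24 cores in use.
print(active_cores_per_socket(2, 12, 8))

# Quad socket host at the bare minimum license of 4 cores.
print(active_cores_per_socket(4, 8, 4))
```

The dual socket example ends up with four active cores on each socket, so the host stays balanced across NUMA nodes while the other 16 physical cores sit idle and unlicensed.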

What are your thoughts?