Dirty Cache

Last week (during EMC World) a discussion came up on Twitter around Oracle licensing and whether Oracle would support CPU affinity as a way to license subsets of a physical server these days.

@sam_lucido @EMCOracle @CacheFlush is #VMware and accepted hypervisor for #oracle hard partitioning these days? thought they weren't

— Andre karlsson (@karlsson_andre) May 7, 2014

@karlsson_andre @sam_lucido @EMCOracle Good topic for my next blogpost maybe? 😉

— Bart Sjerps (@CacheFlush) May 7, 2014

Unfortunately, the answer is NO (that is, if you run any other hypervisor than Oracle’s own Oracle VM). Enough has been said on this being anti-competitive and obviously another way for Oracle to lock in customers to their own stack. But keeping my promise, here’s the blogpost 😉

A good writeup on that can be found here: Oracle’s reaction on the licensing discussion
And see Oracle’s own statement on this: Oracle Partitioning Policy

So let’s accept the situation and see if we can find smarter ways to run Oracle on a smaller license footprint – without having to use an inferior hypervisor from a vendor who isn’t likely to help you use it to reduce license costs…

The vast majority of enterprise customers run Oracle based on CPU licensing (actually, licensing is based on how many cores you have that run Oracle or have Oracle installed).

An important note here is that an ELA or ULA (Enterprise License Agreement / Unlimited License Agreement) is also based on the number of cores in use. Don’t let people tell you that with an ELA/ULA you can deploy as much Oracle as you want without paying extra (more info here: Oracle ULA contract agreement risk factors).

I’ve compared ULAs to the debt crises in the USA and Europe (the United States debt-ceiling crisis of 2013 and the Eurozone crisis).
By allowing people (or governments, or companies) to spend (or deploy) without limits now (hence Unlimited License Agreement) and paying later, you may find yourself in a lot of future trouble.

Reducing license spend

So one way to reduce (current and/or future) license spend is to reduce the number of licensed cores. It’s that simple. So how do you get to that?

Get the best CPU cores available, in other words, those cores that drive the most database transactions per core
Get rid of idle CPU cycles (no can do with physical deployed servers, need to go virtual – so choose the best hypervisor that’s available to you)

Now there are more ways to reduce license footprint (ordered from low risk/low impact to high risk/high impact):

Get rid of unneeded database options (the options on Oracle Enterprise Edition may add up to being more expensive than the base EE license)
Move away from Enterprise Edition altogether and run Standard Edition
Move workloads away from Oracle to something else (think Hadoop, e.g. EMC HAWQ/Pivotal, NoSQL, or open-source enterprise databases, e.g. PostgreSQL)

Getting the best CPU cores

The way I compare CPU cores is by using the TPC-C benchmark numbers – and actually the TpmC (Transactions per minute, type C, which is OLTP oriented) per core. The problem with that is, not all CPU types are listed, and the results may be influenced by not only the CPU but also the server platform. So I also use SPEC and other benchmarks, and I try to be creative with MHz ratings, etc. It’s not exact science and the results may not be 100% accurate – but I guess it’s close enough.

Given that Oracle uses a CPU core factor for Intel of 0.5, and 1.0 for some RISC processors (except their recent SPARC models…), you have to take that into account as well. For example, IBM Power 7 is a very fast CPU even per core, however, Oracle requires double the license cost per core as compared to Intel X86-64. So on a license cost per transaction basis, IBM Power has to outperform Intel by 2X to match up for the core factor. Again you may argue that this is anti-competitive from Oracle (and rightfully so) but that’s the way it is.

Currently among the fastest CPU cores are (disclaimer: this list is not complete and likely not 100% accurate, and some of these are based on other databases than Oracle):

CPU type      TpmC/core
Intel E5-2690:   100574
Intel E5-2643:   100574  (exact same cores as the 2690)
Intel E7-8870:    63199
SPARC T5:         66816  (currently top-listed overall TPC-C benchmark 
                         but they required lots of cores - expensive!)
IBM POWER6:      101116  (but core factor 1.0)
IBM POWER7:      150001  (core factor 1.0, this is the 4.14 GHz CPU type)

Estimates (no TPC-C results yet so these are estimated based on other benchmarks)
POWER8:         ~200000+ (awesome!)
Intel E5-2697v2 ~115000  (no TPC benchmark available yet so I used other benchmarks
                         to make an estimate)

(Many thanks to Kevin Closson for guidance on CPU ratings)

So you can see which CPU works best for DB workload consolidation. Be cautious as some CPU types (Intel E7?) may drive more bandwidth than their TpmC rating suggests – so for data warehouse workloads you might be better off looking at other benchmark results. But for regular OLTP I guess TPC-C works fine.

So clearly the best CPU (currently) for DB consolidation is the Intel E5-2600 series (preferably the newer v2 models). IBM has (very impressive) much higher TpmC/core ratings but requires double the license cost (and the pSeries hardware ain’t cheap either). I’d consider IBM Power8 a good alternative if you want to go RISC/UNIX (better than SPARC on a price/core basis) – and don’t forget to use all the virtualization capabilities of IBM POWER to drive up your utilization. IBM supports hard partitioning, so you may want to run Oracle on a few cores licensed and use the rest of the system for other workloads.
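To make the core factor argument concrete, here is a minimal sketch that divides the TpmC/core figures from the table above by the core factor, giving a rough "TpmC per processor license" number. The core factors are the ones stated in this post (0.5 for Intel x86-64, 1.0 for IBM POWER); SPARC T5 is left out because its factor is not given here. The function name `tpmc_per_license` is only an illustration, and the estimated rows carry the same uncertainty as in the table.

```python
# TpmC per core taken from the table in this post; core factors as stated
# in the text (0.5 for Intel x86-64, 1.0 for IBM POWER).
CPUS = {
    "Intel E5-2690": (100574, 0.5),
    "Intel E7-8870": (63199, 0.5),
    "IBM POWER6": (101116, 1.0),
    "IBM POWER7": (150001, 1.0),
    "IBM POWER8 (estimate)": (200000, 1.0),
    "Intel E5-2697v2 (estimate)": (115000, 0.5),
}


def tpmc_per_license(tpmc_per_core, core_factor):
    """Throughput covered by one processor license (one license = 1/core_factor cores)."""
    return tpmc_per_core / core_factor


# Rank CPU types by how much OLTP throughput a single license buys.
ranking = sorted(
    ((name, tpmc_per_license(tpmc, factor)) for name, (tpmc, factor) in CPUS.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, value in ranking:
    print(f"{name:28s} {value:10.0f} TpmC per license")
```

On these numbers the Intel E5 parts come out ahead per license even though POWER7 and POWER8 are faster per core, which is the point made above. The ranking ignores hardware cost and sub-server (hard partitioning) licensing.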

Back to Intel/VMware. Very large environments probably will dedicate an entire VMware cluster to Oracle database (remember to not run anything else there: no middleware, no apps, no data replication overhead etc – dedicate all your CPU cycles to DB processing where possible).

Smaller server environments

Average sized organizations will likely only have one VMware farm – so they will deploy VMware sub-clusters for Oracle database (so they can license only the servers that actually run Oracle). For those organizations, the issue around CPU affinity and hard- vs soft partitioning is a non-issue as they need more than a few entire servers for their database workload anyway. The issue is around smaller organizations who only need a few CPUs to run their database load.

For those, a dual-socket server with Intel E5-2690 (16 cores) might even be too large – so it seems to be perfectly valid to look at sub-server licensing. But no can do, unless you buy from Oracle.

Now smaller organizations typically also don’t need things like partitioning (because of smaller database sizes) and other features of Enterprise Edition (EE). Going from EE + options to SE (Standard Edition) pays off:

(assuming 50% street price discount)

2-socket, 16-core server with EE, partitioning, advanced compression, diagnostics & tuning pack: $322,000
2-socket, 16-core server with SE (no options possible): $17,500

That’s about 95% savings!

2-socket, 24-core server with EE, partitioning, advanced compression, diagnostics & tuning pack: $483,000
2-socket, 24-core server with SE (no options possible): $17,500

That’s about 96% savings!
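The arithmetic behind these figures can be sketched as follows. Note the assumption: the per-license list price for EE plus the listed options ($80,500) is back-calculated from the $322,000 total (8 licenses at 50% discount), not quoted from an Oracle price list, and the SE list price per socket ($17,500) is back-calculated the same way. The function names are hypothetical.

```python
CORE_FACTOR_X86 = 0.5   # Oracle core factor for Intel, as stated earlier in this post
DISCOUNT = 0.50         # street price discount assumed in this post

# ASSUMPTION: list prices back-calculated from the totals quoted in the post,
# not taken from an official price list.
EE_PLUS_OPTIONS_PER_LICENSE = 80_500  # EE + partitioning + adv. compression + diag & tuning
SE_PER_SOCKET = 17_500


def ee_cost(cores):
    """Enterprise Edition: licensed per core, scaled by the core factor."""
    licenses = cores * CORE_FACTOR_X86
    return licenses * EE_PLUS_OPTIONS_PER_LICENSE * (1 - DISCOUNT)


def se_cost(sockets):
    """Standard Edition: licensed per socket, regardless of core count."""
    return sockets * SE_PER_SOCKET * (1 - DISCOUNT)


def savings_pct(ee, se):
    """Percentage saved by moving from EE + options to SE."""
    return 100 * (1 - se / ee)


for cores in (16, 24):
    ee, se = ee_cost(cores), se_cost(2)
    print(f"{cores} cores: EE ${ee:,.0f}  SE ${se:,.0f}  savings {savings_pct(ee, se):.1f}%")
```

Because SE is priced per socket, the SE cost stays flat while the EE cost grows with every core added, so the gap widens on larger processors.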

Remember that Oracle Standard Edition is licensed by socket, so that’s a good deal considering you can have many cores per socket with today’s processors.

If you can’t use SE because of EE features that you need, then the only way to reduce license cost is reducing – licensed – CPU cores. Fortunately you don’t have to buy the largest server you can get. For example, Cisco UCS has blades that run with fewer CPU cores: the UCS B200M3 has 2 sockets for Intel E5 processors. If you only populate one socket with an E5-2407v2, then you have only 4 cores in your machine. That requires only 25% of the licensing of a dual-socket E5-2690 or 17% of a dual-socket E5-2697v2.

The extra (empty) socket allows you to quickly add more CPU power if your workload grows. Cisco UCS blades are also “stateless” which means that if you require a more powerful server than the B200, you just add the faster blade, and apply the UCS template to the new blade, and restart your system.

As you see, you can achieve cost savings as long as you’re a bit creative and ignore FUD from software vendors…

Oracle, VMware and sub-server partitioning

Tagged on: cost savings license licensing oracle VMware

Bart Sjerps 2014-05-12 Oracle, Virtualization 9 Comments

9 thoughts on “Oracle, VMware and sub-server partitioning”

Brett Murphy says:

2014-05-17 at 05:15

Bart, this was an excellent write-up. Kudos to you for the tone and approach to doing a fair comparison of options for customers. However – yes, here it comes 🙂 It appears you either do not appreciate the differences or understand how Power technology works and have fallen victim to a traditional comparison to the other chip offerings – x86 and SPARC. In the past I have seen EMC do this by design in an effort to support their VCE partnership but I did not get that impression from you. I am a business partner and competitive specialist having worked at Sun and IBM not to mention a high performance computing instructor in the military.

Power chips beginning with Power6 have a licensing factor of 1.0, compared to values ranging from .25 – .75 on competitive platforms (unless you piss off Oracle like HP did and they double your factor from .5 to 1.0, as they did with Itanium). You correctly state with x86 servers that all cores must be licensed even if just a portion are used for Oracle. You are also spot on when you suggest that any customer running software licensed by the core pick the most powerful cores out there, not the largest number of cores. For x86 that can be very expensive. The challenge x86 servers have is if they go too small to control Oracle licensing they can run out of headroom. Most prod Oracle environments on x86 do not virtualize for good reason so going from a 4 core single socket to an 8 core single socket or 4 core x 2 socket server isn’t easy – everything is a trade-off – although you said the UCS blades are stateless so you could just “swap” them out – I learned something here. Still disruptive and you didn’t say if that applies to Windows, Linux or both but depending on some of these variables this is interesting. Won’t spend too much time on any Oracle servers because they are not competitive by and large, everything they do has a feeling of overstating or misleading on its features or capabilities. Just like EMC would like to sell storage capacity, Oracle wants to sell software licenses – I’m simply not convinced and do not see much more than a powerful marketing department making big claims.

Back to Power servers. You only license the cores required to run Oracle. If the server has 16 cores but you only need 2 cores for Oracle then you only need to license 2 cores. That same 16 core x86 server would license 8. Further, benchmarks are great – x86 and Power both publish a bunch so they have credibility. However, they don’t tell the same story. Power has PowerVM which is there for every benchmark. It is highly secure, scalable, flexible and “efficient”. This efficiency explains why it may only need 4 cores with a Power7+ 740 server and an x86 server would need 8 physical cores to do the same workload. Now, with the Power server, I can license the 4 cores. For the 8 x86 cores, that would be on a 12 or 16 core server. I cite one sizing I did against Cisco last year where they required 300 x86 cores compared to just 28 on a Power7+ 780 server. At first blush that looks preposterous. But the 300, which was the sizing provided by Cisco, was the sum of the x86 servers required for the Oracle workload whereas it actually needed somewhere around 168(ish) cores (effectively). But, because of the way the chips fill the sockets that is the way it works. Thus, 300 times .5 = 150 Oracle licenses compared to 28 Power cores times 1.0 = 28 Oracle licenses. It was 28 Power cores because the server can allocate the exact compute resource to the VM – if that is .5, 1.3, 2.1, .7 and so on, that all equals 4.6 or 5 cores to round up. If it isn’t obvious I’ll state it – the Power server is fully virtualized which means it can take advantage of all the available virtualization features. x86 benchmarks do not (usually) include any hypervisor such as VMware. So, if you use that then add some overhead to the processor, understand the I/O impact and if using vMotion understand the “broader” licensing implications (yes, this can be very, very painful).
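The license counts in this comment can be reproduced with a few lines. This is only a sketch of the arithmetic as described by the commenter: the helper name `oracle_licenses` is hypothetical, and rounding the summed entitlements up to a whole core follows his "4.6 or 5 cores to round up" remark rather than any documented Oracle rule.

```python
import math


def oracle_licenses(cores, core_factor):
    """Processor licenses needed for a number of licensed cores, rounded up."""
    return math.ceil(cores * core_factor)


# Sizing example from the comment: whole x86 servers versus licensed Power cores.
x86_licenses = oracle_licenses(300, 0.5)    # x86 core factor 0.5
power_licenses = oracle_licenses(28, 1.0)   # POWER core factor 1.0

# Fractional core entitlements per VM, as listed in the comment.
entitlements = [0.5, 1.3, 2.1, 0.7]
total_entitlement = sum(entitlements)
# Round first to avoid floating-point noise, then round up to whole cores.
cores_to_license = math.ceil(round(total_entitlement, 2))

print(x86_licenses, power_licenses)
print(round(total_entitlement, 2), cores_to_license)
```

The comparison only holds if the 28 Power cores really carry the same workload as the 300 x86 cores, which is the claim disputed further down the thread.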

With Power8, the benchmarks show roughly 75% to 2X per core improvement over Ivy Bridge EP/EX and Power7+ – I said “roughly” so nobody nitpick on me for one that is only 70% more instead of a full 100%. There are some that are more and a few that are less – we are friends here right so would you allow me to just “average” it? 🙂 (It’s a Friday night and past 11 pm, I’m tired and don’t want to write a table). Factor in the efficiency I mention above due to the Power Hypervisor and the performance of Power8 across all workloads then you have even fewer cores required. In that example of 300 x86 cores versus 28 Power7+ cores, Oracle licensing for EE + RAC with a 75% discount gave a TCA license cost of around $2.5M for Power and $9.5M for x86. Factor in 5 years of maintenance at 22% per year, which with Oracle begins with the first year, and the Power TCO comes to around $3.5M and the x86 solution to around $14.5M. These numbers scale whether it is 300 vs 28 or 30 vs 3 or 8 vs 2.

Hope this explanation helps both you and your readers understand the differences. Not sure how my name and website will show up on this post. I won’t put it here as I don’t want this to seem like I’m marketing here. If it allows the user to contact me feel free to do so for a deeper explanation or simply post your question or challenge here.

1. Bart Sjerps says:
  
  2014-05-17 at 22:45
  
  Hi Brett, thanks for your extensive comments, highly appreciated (I bet my readers would agree). Some comments:
  
  First of all, yes it is difficult in today’s highly competitive world to stay away from prejudice and granted, at EMC, our main focus is on Intel and VMware, and the VCE joint venture but we partner with many more and try to be open in architecture.
  I have a personal background with pSeries, or rather, RS/6000 as it was called back then. I was senior UNIX engineer in the late nineties and mostly working on AIX (version 4) although I worked on HP-UX and to a lesser extent on Solaris as well. We had various IBM models all the way up to R50/J50 and later even S70 (the first 64-bit RS). Always enjoyed working with AIX but in those days there was no virtualisation and I never worked with that after joining EMC in 2000. So I guess you’re right in that my knowledge on pSeries virtualisation is a bit limited – for sure I need to polish up my skills there a bit 😉
  However I included both POWER and SPARC in my post – and the traditional comparison based on TPC is the closest and most realistic for comparing DB consolidation. I mentioned that POWER allows sub-server partitioning (hard partitioning) licenses there – not sure if I missed something there?
  
  Anyway, answers based on the best of my knowledge:
  
  – Stateless UCS blades: OS is not relevant. But you’re correct on swapping blades being disruptive – you need to shut down the existing physical blade, apply the profile to another (faster) blade and reboot. That would take, what, 10-15 minutes? Sure it’s not entirely non-disruptive but can be scheduled – and database servers need to be down at some time for maintenance anyway. But can pSeries move existing LPARs or domains, completely without downtime or performance impact, to another physical system (rack?) with different CPUs? Not sure but that sounds impressive.
  – EMC wants to sell storage capacity: absolutely, but sales guys just trying to sell raw gigabytes are dinosaurs – and the meteor leading to their extinction is closing in fast. It’s not about gigabytes anymore, it’s about the “third platform” as well as offering extreme IO performance and flexibility, at acceptable cost levels, for platforms that are quickly becoming legacy (2nd platform) but will be with us for a long time (such as Oracle databases+apps).
  I myself try to create confidence with customers in choosing the right platforms and to build long lasting customer relationships. No sales quota for me, fortunately!
  Which, with Oracle’s current sales tactics, isn’t as easy as it used to be (some DBAs suddenly questioning technology that protected their systems 20+ years, for example – guess who obfuscated their brains). So you might catch me balancing between FUDbusting and highlighting some competitors’ limitations myself (note the difference) because someone has got to warn our customers about inconvenient hidden truths. And avoiding drinking one’s own company’s Kool-Aid remains a challenge too.
  – PowerVM efficiency vs Intel (I think we can assume VMware here?): I accept PowerVM has zero overhead as it was architected directly into the POWER architecture and is hardware assisted. VMware started by having to implement a lot of translation in software (thus, overhead) but Intel (and AMD for that matter) have kept up and provided more and more hardware assist, while at the same time VMware improved their kernel to be much more scalable and running with less overhead. And as such, vSphere 5.1 was found to have as little as 4% overhead for Oracle by our own IT department (white papers on that are available). With 5.5 it has probably come down a bit again. I find 4% not negligible but acceptable – but if Power has zero then granted, it’s still better but the significance has faded.
  
  Where I fail to fully agree with your comments is where you claim 28 POWER cores can handle the same (database) workload as 300 Intel cores. Which benchmark backs that claim? Or otherwise how to explain? 28 POWER8 cores can probably handle the workload of 60 (new, state of the art) Intel cores (and I round up in IBM’s favor). Let’s say that hypervisor overhead (4%, ignoring ESX 5.5 innovations) drives you to 64 Intel cores.
  That’s a big gap with 300. Not disqualifying your statement but I would enjoy seeing some proof in benchmarks or otherwise 🙂
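A rough version of that estimate, using the TpmC/core figures from the table in the post (POWER8 is itself an estimate there) and the 4% vSphere overhead mentioned above. The function `equivalent_cores` is a hypothetical helper, and the result is only as good as the assumption that TPC-C per core scales linearly.

```python
POWER8_TPMC_PER_CORE = 200_000    # estimate from the table in the post
E5_2690_TPMC_PER_CORE = 100_574   # from the table in the post
HYPERVISOR_OVERHEAD = 0.04        # 4% overhead quoted for vSphere 5.1


def equivalent_cores(src_cores, src_tpmc, dst_tpmc, overhead=0.0):
    """Destination cores needed to match the source cores' aggregate TpmC."""
    usable_dst_tpmc = dst_tpmc * (1 - overhead)
    return src_cores * src_tpmc / usable_dst_tpmc


bare_metal = equivalent_cores(28, POWER8_TPMC_PER_CORE, E5_2690_TPMC_PER_CORE)
virtualized = equivalent_cores(
    28, POWER8_TPMC_PER_CORE, E5_2690_TPMC_PER_CORE, overhead=HYPERVISOR_OVERHEAD
)

print(f"bare metal:  {bare_metal:.1f} Intel cores")
print(f"virtualized: {virtualized:.1f} Intel cores")
```

Both results land below the rounded-up 60 and 64 used in the text, so the generous rounding does not change the conclusion that the gap to 300 is large.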
  
  Final comment (personal view and may not reflect that of my company, but I already mentioned that in the blog disclaimer): I appreciate IBM P and wish we (EMC) would partner a bit more with the IBM server guys (and their business partners such as yourself). Have successfully worked with IBM recently on both the Intel and Power side of the house. We should do that more often, if for nothing else, to provide customers reasonable alternatives to Oracle’s Red Stack and Engineered Systems portfolio lock-in. Shame that such initiatives are often blocked by our sales teams because we happen to compete on storage.
  
  1. Brett Murphy says:
    
    2014-05-17 at 23:52
    
    Thank you Bart – I read what it means to “Blog with Integrity” and wanted to say I appreciate the endorsement. Too often on these internet forums are people who attack just for the sake of it, make claims just for marketing purposes and argue just to be argumentative. I take the approach to make customers “informed consumers” and over time (only way it can happen) to become their “Trusted Advisor”. But, we all make mistakes and with that trust comes the integrity to admit if we have made a mistake.
    
    I will stand by everything I said. Per your statement if Power can move a LPAR running from one server to another without disruption then you would be impressed – Ok, you should be impressed. I can move an AIX LPAR from a Power6 to Power7 to Power8 (there are + models in between for the P6 & P7 as well) live. Assumption is the older models just have to be at the supported OS version of the newer models. Power8 actually lets customers move those older versions without updating and they will just run in the same “mode” as where they came from but they are on the new server and can update later to pick up some of the new features. To put a “cherry” on top of this – I can move from a single socket Power6 server to a 256 core Power7 795. I can move from a Flex node to a Power8 2 socket server. I can move a Linux, AIX or IBM i VM all equally. I can further add/remove CPU, memory and I/O to each of these LPARs at any time – all without disruption.
    
    I’m glad you called BS on the 300 to 28 core example. I actually say that to customers when I talk with them. A couple of things on this. You have to talk with customers who have done this to appreciate it. I helped a customer reduce their Oracle licensing from 190 licenses to 40, over 10K IBM PVU of software to ~3K PVU (they needed less than 2500 but rounded up to 2500 and added 500 when they made the move because they were skeptical.) That was 3 years ago with 54 servers and roughly 240 cores to 1 server and 64 cores. They actually deployed that on 4 servers, each with 32 cores for redundancy and spare capacity. They have since moved over 200+ workloads to these servers. They recently upgraded one of the 3 year old Power7 servers which was a B model running with ~24 cores at 60% utilization for 32 LPARs to a new Power7+ server which we call a D model also with 32 cores. The customer “live” moved the 32 LPARs to the new server where it was using less than 10 cores running at 20% utilization. Lots of reasons for that – faster processors, more L3 cache, improved I/O, offloaded some tasks from the P7 core to hardware accelerators, etc.
    
    The reason you struggle to accept or understand (whatever applies) is because your own background goes to about Power4 where the servers were still essentially core based – in other words you assigned resources to a workload by core and that was it. Starting with Power5 and now with Power8 we can allocate cores to different processor pools then create VMs in those proc pools (PP). Within the PP for the VMs I can assign a compute resource as small as .05 of a core and with Power8 it has SMT8 or 8 threads vs 4 for Power7 and 2 threads for Power5/6 and of course still what x86 has. .05 is the smallest resource element so by default that would become 1 virtual processor. If I had assigned .1 or one-tenth that could also be 1 virtual processor or if I wanted to, I could make it 2 virtual processors – each assigned .05 of a processor. Take these examples and extrapolate up to the 256 core 795 (remember just 4 threads though). A 24 core S824 server with 8 threads means I could have up to 24 physical cores, 480 virtual processors or 3,840 logical processors (this is the last level when SMT is applied). Take where I live – I have a 4 bedroom house but with just 2 bedrooms occupied (my wife & I in one and my child in another). This means I have extra capacity with the 2 empty bedrooms. I can accommodate visitors just like a server can accommodate EOM/EOQ/EOY workload spikes or if we adopt a child and use a 3rd bedroom or the server workload grows it can consume more. Nobody would argue with this – pretty consistent across most hypervisors. By the way – yes, I typically am referring to VMware but try to not call them out by name, same for x86 instead of Intel, etc – just to protect myself from a zealous legal department if they took exception with a flip remark I may make – and I do sometimes make them 🙂
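The virtual and logical processor counts quoted in this comment follow from the 0.05-core minimum entitlement and the SMT width. The sketch below is just that arithmetic; it ignores any platform or firmware limits that may cap the real numbers, and the function names are hypothetical.

```python
# 0.05 of a core is the smallest entitlement per virtual processor,
# so one physical core can back at most 1 / 0.05 = 20 virtual processors.
VIRTUAL_PROCESSORS_PER_CORE = 20


def max_virtual_processors(physical_cores):
    """Upper bound on virtual processors at the minimum entitlement."""
    return physical_cores * VIRTUAL_PROCESSORS_PER_CORE


def max_logical_processors(physical_cores, smt_threads):
    """Virtual processors multiplied by the SMT threads each one exposes."""
    return max_virtual_processors(physical_cores) * smt_threads


# 24-core S824 with SMT8 (Power8), as in the comment.
print(max_virtual_processors(24), max_logical_processors(24, 8))

# 256-core 795 with SMT4 (Power7), the extrapolation the comment suggests.
print(max_virtual_processors(256), max_logical_processors(256, 4))
```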
    
    I mention the efficiency of the Power hypervisor – you tried to associate that with overhead. Different topics which I will address below. Power hypervisor is microcode and not a guest OS. Just like my bedrooms are only occupied 1/3 of the day or 33% utilized across 2 bedrooms. I could increase that by reducing the number of bedrooms – let’s say to 1. Let our child sleep for the first 8 hours then my wife and I the next 8 hours – our utilization would increase to 66% across one bedroom. Same with the cores – but there are trade-offs with that. Going to move off this example and just focus on the compute example. If I have an efficient scheduler though, I could schedule (ie weave) all CPU instructions across all of the processors – if they are only on a physical core for .1 or one-tenth of a core there is nine-tenths available. I can allocate that to another workload. Plus, I have multiple execution pipelines and don’t forget the 8 threads which get engaged when a thread reaches a certain utilization level – I think that is 65% right now (would have to check) as it has been tweaked over the generations. So, if a DB workload on x86 required 5 cores running at 40% utilization I might only need 2.3 cores running at 55% utilization (making it up here just to make the point but these aren’t untypical). I add up the many examples of 2.3 and 1.2 and .9 and whatever to get to 28 cores. The x86 servers allowed some to use VMware but the prod DB workloads for example were physical (again, they were sized by the vendor and not me). So, that 5 core example was maybe put on a 2 socket 8 core or maybe a 2 socket 12 core. Hopefully that helps. Feel free to contact me and we can walk through it using GoTo Meeting. You don’t have to become a fan but you will at least know how it works today.
    
    With regard to hypervisor overhead. This is a hot topic – I will throw out my number but as the story teller I can say what I want 🙂 it’s my story. I also tell you the listener to use what your experience is. Since I have yet to see it confirmed what that tells me is the vendors are happy to let us fight it out – the hard core will defend it and the competition will attack it. If it was 5% I would almost guarantee they would post and publish that everywhere. Since x86 is commodity relying on a multitude of vendors and there is a plethora of chip offerings that require a savant to know them all I am confident in saying that any user’s mileage would vary. Also, if the overhead were not that significant I think we would see more benchmarks performed with it. After all, there are VMware specific benchmarks. But, how many TPC, SPEC, SAP, Oracle, LinPack or others are performed with VMware? All are done with PowerVM – except now that Power8 does offer PowerKVM for those OpenSource guys who want it the open way or no way which is fine and which is why IBM is opening up the Power portfolio to be more inclusive whereas it seems the x86 community is becoming more proprietary or closed. EMC should join the OpenPower Foundation and get in on the CAPI capabilities – it is stupid fast with ultra low latency which is a slice of heaven for storage.
    
    I didn’t intend to write this much. Please email me and I’ll walk you through the details of the 300:28. That is one of many I have done. x86 sellers try to compare a 16 core x86 to a 16 core Power with price and Oracle licensing factor and say there you go. Power is too expensive so buy my x86. That is BS, misleading and either uninformed or intentionally deceptive. There is a lot of EMC attached to Power servers – I’m ok with that until they try to replace Power with VCE then it is game on. You can imagine the difficulty the storage component has in overcoming a $7M software delta I provided in my previous example – not a good way to start. I would also prefer to see EMC come in and partner with my Power solution more with mutual respect for each other’s role in the account but all sellers are under a lot of pressure to deliver results... sigh
    
    Take care
    
E. Pierce says:

2014-05-19 at 15:43

I’m going to chime in here. I’m the customer Brett is talking about, and I can confirm what he says is correct. We moved all our middleware and Oracle to the Power platform. IBM licensing was cut by about 65-70% (saving about 500K a year), and our Oracle environment is consolidated and virtualized using PowerVM. It’s been fantastic for us; high performance, year on year licensing savings that far outstrip the cost of the hardware.

The sizing we did for the number of cores it would take to run any given workload has been accurate within 10-15%. A couple of examples: moved an Oracle database from a Sun M5000 with 16 cores to an LPAR that peaks at about 6 cores of utilization. Steady state is about 3 cores. We consolidated a 32 core RAC cluster (AMD CPUs) onto a single 2 core LPAR. The multiplier may be 1, which throws people off, but the fact is you can do so much more with less that the licensing factor is irrelevant.

With the release of Power8, our consolidation will go up even more. Smaller servers have more RAS features, which will also lower the cost of acquisition for us.

The general market trend, of course, is to go to x86 with Linux. We made a decision to split our environment — VMware for mostly Windows stuff and some one-off Linux. AIX/Power runs Oracle and all our middleware. We’re fine bucking the trend. 🙂

1. Bart Sjerps says:
  
  2014-05-19 at 22:33
  
  Great to hear real end-customer validation!
  
  Your numbers make sense; POWER7+ CPUs can do much more work than SPARC64 – roughly 1:5, if you’re on P7+ the TPC-C rating would be around 150,000 per core, SPARC64 probably around 40,000 or less.
  The 32-core RAC probably used older AMD cores so you benefit from Moore’s law as well as better CPU architecture on POWER. I wonder how many modern Intel E5 cores you would have needed to replace the 32 AMD cores – and there’s the factor of utilization and – especially with RAC – overhead.
  A 32-core RAC cluster, how many server nodes would that require? 4? On a 4-node RAC cluster I bet the RAC overhead will be 40%. You get rid of that (and then some) if you go to single-instance.
  Say the AMD cores could handle 25,000 TpmC. 32 cores would rate @ 800,000. RAC overhead eats away 320,000 so the cluster could handle 480,000 TpmC. A P7+ can do 150,000 so you’d need roughly 3 P7 cores for the same peak load. But you can drive up utilization much higher because of single instance + PowerVM so you can get rid of one more core -> 2 cores 😉
  Same env on Intel E5: 480,000 TpmC requires 5 E5-2600v1 cores (of course you’d need to buy a server with 6 or 8 and if you virtualize you have to subtract ~ 4% for overhead) or just over 4 E5-2600v2 (if my estimate TPC numbers would hold)
  Granted I made up the numbers here because I don’t have the real ones – but I’m not really surprised.
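The back-of-the-envelope calculation above can be written out as a short sketch. Every input is one of the made-up or estimated numbers from this comment (25,000 TpmC per AMD core, 40% RAC overhead) or from the table in the post; `cores_needed` is a hypothetical helper, not a sizing tool.

```python
import math

AMD_TPMC_PER_CORE = 25_000   # made-up figure from the comment
RAC_OVERHEAD = 0.40          # guessed overhead for a 4-node RAC cluster
RAC_CORES = 32

# Per-core ratings used in the comment and the table in the post.
TARGETS = {
    "POWER7+": 150_000,
    "Intel E5-2600v1": 100_574,
    "Intel E5-2600v2 (estimate)": 115_000,
}


def cores_needed(workload_tpmc, tpmc_per_core):
    """Cores required to carry a workload at the given per-core rating."""
    return workload_tpmc / tpmc_per_core


raw_tpmc = RAC_CORES * AMD_TPMC_PER_CORE          # rating of the cluster on paper
effective_tpmc = raw_tpmc * (1 - RAC_OVERHEAD)    # what is left after RAC overhead

for name, rating in TARGETS.items():
    exact = cores_needed(effective_tpmc, rating)
    print(f"{name:28s} {exact:4.2f} cores (round up to {math.ceil(exact)})")
```

The further step from roughly 3 POWER cores down to 2 relies on higher utilization under single instance plus PowerVM, which this sketch does not model.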
  Replacing 3++ year old processors + getting rid of clustering overhead + virtualization to drive up %util = 10x CPU core reduction. Matches the numbers Brett was talking about (300:28) and I believe the numbers as long as you replace fairly old processors with state-of-the-art ones.
  
  I doubt you can replace 300 Intel E5-2697v2 cores with 28 POWER8 ones…
  
  Thanks for sharing your experiences!
  
  1. Brett Murphy says:
    
    2014-05-19 at 23:41
    
    Hello Bart,
    Well, haven’t made a believer of you yet but don’t forget I did impress you with our ability to do the dynamic upgrades, etc from above! Baby steps!!
    
    The RAC cluster Eric was talking about was a 2 node RAC cluster which is as efficient as it can be for x86. As you know, you add nodes to RAC – particularly beyond 3 and it goes downhill from there fast. Maybe Eric can say what generation of AMD servers they were but I recall them being either N or N-1 so not as old as you suggest. That was also the case in 2011 when we moved a lot of their x86 workloads over which were running on HP G7 class servers. Also, with regard to the SPARC – most of those workloads were running near 100% utilization. Just like they moved some from a 16 core to a 3 core VM they also moved some to less than 2 and 1 – lots of factors such as the server model, proc generation, workload utilization & type, etc.
    
    You can see, though, that I am already debunking your assumptions. It is natural to look for flaws in a solution you either don’t understand or need to quantify to explain such dramatic capabilities that differ from what one considers to be the “norm”. However, you don’t get to cherry pick data just to make a point. Since this is a “Blog with Integrity” I will say that you could say lots of things, but the reality is that customers would not typically take an x86 solution that is using Oracle RAC to a non-RAC solution just because they could get better single server performance – they may reduce or eliminate a node but not the availability, because x86 is inherently unreliable compared to Power. This is a typical x86 vendor talking point – “We can do that with 1 server and 16 cores just like Power, and our licensing factor is .5 and theirs is 1.0, so they are twice as expensive.” But the reality is that when it is time for the architect to sign his “John Hancock” to the design and have it sold to the customer, stating this will work for the desired solution, it all of a sudden has multiple servers with extra software like RAC, whereas the Power solution is still the single server with 16 cores, but it only had to license the cores required – maybe 4 or 6 or whatever is required. And as you said, my single instance will outperform the x86 solution, which has to endure the overhead from RAC, not to mention the licensing of each entire server at $70,500 per core times .5. And if you want to virtualize it, then add more overhead to it and more cost, because you have to license the cores in the HA cluster as well.
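    To make the licensing argument concrete, here is a minimal sketch of the arithmetic. The $70,500 list price and the 0.5 / 1.0 core factors are the figures quoted in the comment; the node and core counts (two fully licensed 16-core x86 nodes versus 4 or 6 licensed Power cores) are hypothetical, and rounding fractional licenses up is an assumption of this sketch.

```python
import math

PRICE_PER_LICENSE = 70_500  # list price per processor license as quoted in the comment

def license_cost(cores, core_factor, price=PRICE_PER_LICENSE):
    """Cost of licensing `cores` cores at the given core factor.

    Fractional license counts are rounded up (an assumption of this sketch).
    """
    licenses = math.ceil(cores * core_factor)
    return licenses * price

# Hypothetical x86 design: 2 RAC nodes x 16 cores, every core in each server licensed
x86_cost = license_cost(cores=2 * 16, core_factor=0.5)

# Hypothetical Power design: one 16-core server, only the cores in use licensed
power_cost_4 = license_cost(cores=4, core_factor=1.0)
power_cost_6 = license_cost(cores=6, core_factor=1.0)

print(f"x86, 2 x 16 cores @ 0.5: ${x86_cost:,}")
print(f"Power, 4 cores @ 1.0:    ${power_cost_4:,}")
print(f"Power, 6 cores @ 1.0:    ${power_cost_6:,}")
```

    The comparison is only as good as the sizing behind it: the x86 side changes substantially if fewer nodes or cores turn out to be enough.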
    
    What you missed in my example of the 300:28 was that it wasn’t entirely or even solely about performance as much as it was about efficiency. Performance is good because I can have multiple workloads with a quality of service all running on a highly reliable server sharing software licensing. Because of its ability to allocate compute resources to the workload it is entitled to, and offer more if available, you can stack or weave many, many more onto fewer servers.
    
    You are right, I probably wouldn’t need 28 Power7+ cores for 300 Intel cores from just 1 year ago. My guess with Power8 is that if everything was equal and the sizing for the Intel solution was 300 x86 v2 cores, I would probably need fewer than 20 Power8 cores. I know that pokes a guy in the eye, but that is why more customers are looking at Power technology: they are tired of the skyrocketing software licensing costs required with x86, and of the constant challenges of managing 10 – 20X more servers for the same number of VMs, not to mention the increase in data center space, power consumption, cooling, and port count for LAN & SAN switches, plus the typical 2 – 3X as many FTEs to support it.
    
    If a customer running Ivy Bridge v2 processors with Oracle wants to do a proof of concept against my Power8 servers with the winner “remains”, I would be happy to talk with them. I’m a bigger fan of using a real workload than relying on vendor benchmarks, because even though you repeatedly used them above, you never made a distinction whether VMware was used or not. All of that overhead adds up. Did I mention I can provide concurrent OS upgrades, apply firmware concurrently, and add & remove CPU, memory and I/O live to any VM without a reboot? Live move a VM to other servers of other generations. Actually upgrade an OS like AIX – have you tried to upgrade RedHat on x86, for example? You ought to look at the cautions, not to mention the official statement at RedHat which says it isn’t officially supported. So, customers should set aside their biases and look at what helps them sleep at night and delivers the greatest value to the business. Gotta go. They just closed the door on the plane.
    
    1. Bart Sjerps says:
      
      2014-05-30 at 09:59
      
      Hi Brett,
      
      Think we agree on many things – I’d say both Intel and POWER are viable platforms for DB (and other apps) consolidation. When I’m digesting the numbers they more or less match with my expectations. The only thing I cannot get a grasp on is what you mention as “efficiency” for consolidated workloads. I understand what you mean but I’d like to see it quantified (in some kind of consolidation benchmark although I’m not sure such a thing exists).
      
      Re availability of Intel – Guess the Power arch is very reliable, but Intel (and Linux for that matter) have made major leaps over the last years, and these days getting 5 9’s on Intel is no longer a problem (even without clustering like RAC). The majority of downtime issues come from user errors and software bugs anyway, and no OS or HW platform is going to protect you from that 😉
      
      Anyway, thanks for the interesting discussion and hereby I promise to update my POWER/AIX knowledge a bit (when I can find the time 😉)
      
      Regards from Holland!
      
      1. Brett Murphy says:
        
        2014-05-31 at 01:00
        
        I’ve enjoyed the discussion and your blog. Look forward to many more. I could respond a bit more to your latest remarks but will leave it where it is for now. Regards, Brett