Dirty Cache

Sometimes I hear people claim that by using faster storage, you can save on database licenses. True or false?

The idea is that many database servers are suffering from IO wait – which actually means that the processors are waiting for data to be transferred to or from storage – and in the meantime, no useful work can be done. Given the expensive licenses that are needed for running commercial database software, usually licensed per CPU core, this then leads to loss of efficiency.

Let’s see if we can visualise the problem with a real-world example – baking a cake.

The recipe of the cake mentions the ingredients:

200 g butter
200 g sugar
200 g flour
4 eggs
2 teaspoons of baking powder
vanilla sugar
lemon juice

In contrast, what do you need to process a business transaction? Something like this maybe?

1 million database CPU cycles
100,000 app server CPU cycles
100 IO requests of 8K each
2 Megabytes of memory
1 Megabyte of network transfers

Let’s simplify things by looking at two resources only:

Cake	Transaction
200 g butter	1M CPU cycles
200 g flour	100 IO requests

Let’s also assume that butter is 10 times more expensive than flour, much like the CPU cycles cost more than the IO requests (per transaction).

If I want to reduce the cost of baking cakes, could I reduce the amount of butter if I added more flour? I don’t think as a customer I would buy such a cake. So, can you reduce the number of required CPU cycles by adding more IO? I guess we all understand that it will not work out…

But the original idea was to eliminate overhead. So let’s say we can buy butter in packages of 1kg (think “servers with X processors”) and flour in packages of 500g (think “storage with X spindles”). We only need to bake one cake, so we buy one of each. After baking the cake we still have 800g of expensive butter left (underutilized resources) and 300g of inexpensive flour.

By applying a consolidation strategy (a bit of a strange word when baking cakes, but you get the idea) we could use the same oven to bake 2 cakes at the same time instead of one. That leaves us with 600g of unused butter and 100g of unused flour. The efficiency is increasing, but now I’m limited by the availability of the inexpensive resource (flour).

To improve the efficiency I need to make sure that all of the butter is used. There are a few things I could do to achieve that:

Buy butter in smaller packages (a smaller server with fewer CPUs)
Buy more flour so I can bake more cakes (more available IOPS due to Flash storage)

Sometimes, the flash vendors (including EMC), in their unlimited enthusiasm, forget the first option. So let’s say we go for option 2 and we buy 2 packages of flour (1kg in total) so we can bake 5 cakes with no butter left over. Great!
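The package arithmetic above can be checked with a few lines of Python. This is only a sketch of the analogy, not a sizing model: the 200 g per cake figures come from the simplified recipe, and the function name bake is made up for illustration.

```python
# Illustrative only: 200 g of butter and 200 g of flour per cake, as in the
# simplified recipe. The function name "bake" is hypothetical.
BUTTER_PER_CAKE_G = 200
FLOUR_PER_CAKE_G = 200


def bake(butter_g, flour_g):
    """Return (cakes, leftover_butter_g, leftover_flour_g)."""
    # The scarcer ingredient limits how many cakes can be baked.
    cakes = min(butter_g // BUTTER_PER_CAKE_G, flour_g // FLOUR_PER_CAKE_G)
    return (
        cakes,
        butter_g - cakes * BUTTER_PER_CAKE_G,
        flour_g - cakes * FLOUR_PER_CAKE_G,
    )


# One 1kg pack of butter and one 500g pack of flour: flour is the bottleneck.
print(bake(1000, 500))   # (2, 600, 100)
# Same butter, two packs of flour: both ingredients are used up completely.
print(bake(1000, 1000))  # (5, 0, 0)
```

Translated back: butter stands for CPU capacity and flour for IOPS. Whichever runs out first caps the number of transactions, and the leftover of the other one is what you paid for but did not use.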

Now, the consumption of cakes is not driven by how much I can bake, but by how much I can sell in the bakery – during opening hours!

Think about it – peak production capacity is meaningless if we cannot deliver the products (transactions) at the maximum rate all the time, unless we open up the bakery 24×7 and have as many customers shopping for cake at 3am as at 3pm, and on Sundays as well as on Tuesdays. Cakes can be stored for maybe a day or so. Business transactions need to be consumed immediately.

So, the consumption of business transactions is not driven by how much I have available but by how much the business needs. Which is why I see my customers all too often buying very expensive database appliances, capable of driving millions of transactions per minute, only to end up mostly idle due to limited business demand.

If my bakery can only sell 2 cakes out of 5, how does that help? In other words, the capability of driving more workload doesn’t necessarily mean it gets consumed.

Driving up the potential production capacity by removing bottlenecks does not automatically lead to better efficiency. As said, the number of transactions actually consumed is driven by the demand, not by the supply – although over time the demand may increase, if the supply (of information) has become frictionless.
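To put a number on the bakery example, here is a minimal sketch. It assumes that whatever is not sold is simply lost, which holds for transactions more than for cakes; the function name utilization is hypothetical.

```python
def utilization(capacity, demand):
    """Fraction of the available capacity that is actually consumed."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    # We can never sell more than we can bake, nor more than customers want.
    consumed = min(capacity, demand)
    return consumed / capacity


# Capacity for 5 cakes, demand for only 2: 60% of the capacity sits idle.
print(utilization(5, 2))  # 0.4
# Capacity for 2 cakes, demand for 2: fully used.
print(utilization(2, 2))  # 1.0
```

Raising the capacity from 2 to 5 without any change in demand lowers the utilization from 100% to 40%, which is the point of the paragraph above.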

Baking a cake: trading CPU for IO?

4 thoughts on “Baking a cake: trading CPU for IO?”

jweinshe says:

2016-02-01 at 16:18

One thing to note – and I haven’t had enough coffee to figure out how to put this in the cake analogy – even with flash, we’re talking huge differences in access times vs CPU. As humans it’s hard for us (me anyhow) to see how 50-150 microseconds (µs) is that bad compared to nanoseconds, but I use a chart from Systems Performance: Enterprise and the Cloud that, to help humans understand the huge differences, puts 1 CPU cycle (normally 0.3 nanoseconds) as 1 second and then adjusts the scale for everything else based off that.

When you do that, you see that
1 CPU cycle is 1 second
Level 1 CPU cache access 3 seconds
Level 2 CPU cache access 9 seconds
Level 3 CPU cache access 43 seconds
Main memory access 6 min
Solid state disk IO 2-6 days

So yeah – although we’re talking 50-150 microseconds for solid state, that’s still 172,800 times SLOWER than a CPU cycle – and that’s going with the best case of 50 microseconds / 2 days.
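The scaling behind that chart is easy to reproduce. The sketch below uses only the two figures quoted above (0.3 ns per CPU cycle and 50-150 microseconds per flash IO); the function name scaled_seconds is made up for illustration.

```python
CPU_CYCLE_NS = 0.3        # one CPU cycle, as quoted above
SECONDS_PER_DAY = 86400


def scaled_seconds(latency_ns):
    """Latency on a scale where one CPU cycle lasts one second."""
    return latency_ns / CPU_CYCLE_NS


for label, latency_us in [("flash, best case", 50), ("flash, worst case", 150)]:
    seconds = scaled_seconds(latency_us * 1000)  # microseconds -> nanoseconds
    print(f"{label}: {seconds / SECONDS_PER_DAY:.1f} scaled days")
# flash, best case: 1.9 scaled days
# flash, worst case: 5.8 scaled days
```

The exact ratio for the best case is about 167,000 CPU cycles per IO; the 172,800 figure is what you get after rounding 1.9 days up to 2 full days.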

1. Bart Sjerps says:
  
  2016-02-01 at 18:08
  
  Hi Jay,
  You mean something like this? https://dirty-cache.com/2011/12/08/performance-hourglass-time-lightning/
  🙂
  And good point – slow IO makes the CPU wait for data before it can do anything. But then I could also remove some of the CPU cores and still get the same performance, at higher %CPU and lower %wait (ignoring the effects of peak load and parallelism for a moment).
  
Terry golden says:

2016-02-01 at 17:29

I’ve seen a great many DBs in IO wait, and there is more than enough demand, as the blocks aren’t being served up fast enough to keep up with it. Many times the root cause is a lack of understanding of the SAN configuration (putting office automation on the same drives as, say, the redo logs, with of course the least amount of SAN cache, not tiering storage to IO demand, etc.) or using fibre cards tasked with serving many times their capacity. The net effect is not getting the full value of those expensive, licensed, weighted cores. On the other side is, as they say in DC, “don’t do stupid stuff”, like writing SQL that generates a petabyte of intermediate data to process a 10M row email list, causing days of unnecessary IO requests. Engineered systems like Exadata in theory take the headache out of building a system properly for Oracle, but many sites could benefit from just tuning their SQL, SAN and Fibre Channel network to ensure that there is sufficient capacity to serve demand without any bottlenecks. Sadly, most shops don’t seem to have the interest or skills to get very close to the hardware.

1. Bart Sjerps says:
  
  2016-02-01 at 18:14
  
  Hi Terry,
  I agree and I share your experiences. BTW there are more efficient “engineered” systems available than Exadata such as EMC’s vBlock and VxRacks etc. More upcoming.
  
  The Exadata systems I’ve seen in my customer base are typically notoriously under-utilized on CPU – i.e. paying for too much butter, baking capacity for a lot of cakes but not enough consistent demand from the business. Making it a VERY expensive deal. Not to speak about the other differences in architecture.
  
  Will blog on that in the (near) future 😉
  Thx for commenting!
