Here you will find an overview (index) of all articles and pages on this blog.
Reblogged articles and references to other resources are not included.
About this blog – i.e. why is it called “Dirty Cache”?
A few pages containing links to white papers, news articles, videos, demos, presentations, other blog posts and the like for certain topics. Currently Greenplum and Oracle related stuff.
A few presentations I have given at various conferences and EMC events.
Blog posts (in chronological order)
My first blog post and an introduction to myself, what kind of stuff I have done during my career and how I ended up doing what I do today.
A two-part post on the innovation of Enterprise Flash Drives (aka Solid State Disk), why it is important for database applications, how to manage them and what EMC has done to have them work at extremely high performance and reliability rates.
A series of posts on how to apply tiering to (Oracle) databases. Some of the presentation material I have co-developed with Oracle consultants. It explains the fundamentals of how to get lower cost and better performance at the same time using different storage tiers for a database.
Most (Unix-based) file systems have a write cache. So if data integrity and consistency are important with databases, how does Oracle prevent data corruption due to data kept in the write cache? Or does it work a bit differently? And how does this enable EMC cloning/snapshot technology to make complete database copies without compromising consistency?
Still by far the most popular post on my blog. It explains why data should be aligned on certain blocksizes if you use EMC storage, and how to do it. Actually I provide a simpler method than what you will find in most white papers and product documentation.
Database management tools, operating system tools and storage management tools all have different definitions and ways of measuring storage performance. Very frequently this leads to confusion and finger-pointing between different support groups and technology providers. This post is an attempt to explain some of the confusion and how to get around it.
Every now and then I see customers duplexing their Oracle redo logs in search of better data protection. Is this worth it, and what are the drawbacks? My opinion on the matter.
A bit off-topic as this has nothing to do with optimizing business applications. It is my personal objection against the way desktop (and mobile) applications are developed these days – where a piece of application data also can contain code to be executed by the same application, how this might improve ease-of-use and functionality but also has huge implications for information security as a whole.
One EMC innovation is thin (virtual) provisioning for data storage. An explanation of how it works without going into deep bits & bytes stuff, and why it can bring huge benefits for reducing infrastructure cost.
EMC storage systems offer features to manage performance levels by throttling I/O rates, or by allocating more resources (mainly storage cache) to an application and less to others. Although technically it works fine, I believe applying those to databases is not a good idea. My argumentation why.
The way databases interface with the storage layer is – not surprisingly – very important to us at EMC. One of the frequent discussions I have is whether to use Oracle ASM or go for any of the available (Unix based) file systems. My view on why I think ASM is the best choice.
Some people are pushing for higher availability for their databases. Even if a datacenter were to fail (in the unlikely case of a disaster) they still want to keep their databases going without restarts or recovery. Why is it desirable on some occasions to completely eliminate the last minutes of downtime? And how can you achieve this?
A follow-up to the previous post. Some folks attempt to build stretched clusters by using some form of host-based data mirroring. What are the limitations?
The next in the series on stretched clustering. What is required to build a stretched cluster without introducing serious trouble due to split-brain situations, or subtle misconfigurations that prevent the fully automatic failover we wished for in the first place?
Being a market leader, EMC has – not surprisingly – tough competition from other vendors. Some vendors claim to have built a solution for stretched clusters long before EMC did. Is it true? Or are they dangerously cutting some sharp corners?
Can the same piece of data be physically present at multiple locations? We don’t really need science fiction to achieve this. EMC storage virtualization with VPLEX offers this functionality – initially developed to allow data mobility without downtime across distance. But I have pushed EMC engineering to use this technology for building extremely highly available stretched database clusters, too. Some history and an explanation of our solution.
Are hot (Oracle) backups impacting your service levels? Worry no more. You can make perfectly usable backups without ever going into hot backup mode. But for a long time, Oracle thought it could not be done. EMC finally convinced them and now it is supported (but it was already working fine for a long time). An explanation of how it works and other useful purposes of creating consistent database copies.
EMC has now had Enterprise Flash Drive technology available for about 3 years. I still see many customers buying a storage box with fast spinning rust only and not leveraging the new innovations. Why?
Another off-topic article which has nothing to do with optimizing business apps. But I am interested in information security and one of the last disasters in security happened in my own country. What happened and what are the (largely underestimated) implications?
One of the hot topics in my journeys is the discussion around how to replicate data for disaster recovery purposes. Oracle’s standard way to do this is using Oracle Data Guard. EMC offers other (and, in my opinion, often better) alternatives. A comparison.
Stretched clusters are quickly becoming hot! But some people still asked me to explain what the benefits are of EMC’s VPLEX solution over other alternatives. I tried to make it as simple to understand as possible (but no simpler).
A critical post on the value of performing Proofs of Concept. I wrote this because I was confronted a few times with customers who – having performed POCs with our competitors – found out that the POC results do not mean that much, or at least do not tell customers how their applications will behave (in performance or otherwise) in a real, non-ideal production environment. On separating marketing and reality.
When tuning our applications for performance, we should focus on not one, but all of the technology layers in the application stack. But are we frequently missing out on the most important one?
A picture says more than 1000 words. So I have stolen an old picture from one of my colleagues and modified it to show the various layers in the application I/O stack – including virtualization layers. Maybe you can use it in performance discussions with colleagues or vendors.
In the average database infrastructure stack, where do you spend most money? I bet it is on database licenses (plus support). But what about the utilization of – very expensively licensed – processors? I strongly believe you can achieve enormous cost savings by going virtual and thereby reducing license cost. Here is how. Including some answers to the most common objections.
Just a joke – what happens if you use systems beyond what they are designed for?
Modern microprocessors work at incredible speeds and clock cycles are measured in nanoseconds. How do these compare to other speeds in the I/O stack? If you expand a nanosecond to a second, then what do the other response times look like? By doing so you might get a better feeling for how fast (or slow) some technologies are compared to others.
Plain disk drives (including the more expensive ones) are not 100% accurate at all times. Sometimes they return wrong data without error. Why does this happen, what is the impact on a database and how can you protect against this?
If you’re a frequent visitor of my blog (thanks!) then you might know that I use Wikipedia a lot to point my readers to explanations of certain ICT concepts. However, Wikipedia went black for one day (at least, when hitting the page the first time) in protest against certain proposed laws against information freedom on the internet. Although as an EMC employee maybe I should be neutral and not comment on such events, in fact I share Wikipedia’s (and many others’) view that any law threatening free internet communications is not a good thing (to put it in very mild words). So this post is a statement of my full support for Wikipedia’s initiative.
Another picture that might help in database performance discussions. I created this for a training explaining how an Oracle database (with ASM) interacts with the storage layer. Although oversimplified and possibly not 100% accurate, it still might help when troubleshooting performance.
Some competitors have claimed to our customers that EMC SRDF would allow certain data corruptions to occur where Oracle Data Guard would not, thereby claiming Data Guard is better than EMC SRDF. I don’t appreciate such half-truths so here is the full explanation.
Another update on the Oracle RAC / VPLEX stretched cluster solution: Oracle has certified it! EMC is now the only vendor who is certified by Oracle for stretched cluster implementations.
Moore’s law has given us double CPU speeds every 2 years, double disk capacity and bandwidth every so many months, so that a current system compared to one 10 years ago has dramatically more speed and power. Still, many of the people I talk to are struggling to solve I/O bottlenecks. In this post I focus on REDO log performance as this is often the Achilles heel of the behaviour of the entire database.
What is columnar store for databases? How has Oracle implemented it in Exadata Hybrid Columnar Compression? Why is it not available for EMC customers running Oracle? And how does it compare against what EMC has to offer as alternatives?
Is there any disadvantage for a customer in using Oracle/SUN ZFS appliances to create database/application snapshots in comparison with EMC’s cloning/snapshot offerings? Some things to consider that Oracle isn’t telling you about…
Explore how idling processors on a database server are driving up the TCO, and what you can do about it.
A description of EMC’s customer support strategy and the joint escalation with Oracle (including the procedure for engaging).
Why not all customers need to run synchronous D/R, what the hidden problems are around application consistency, theoretical vs. real world (“rolling”) disasters, benefits of asynchronous replication, and more.
Some vendors claim that VMware is expensive – or at least more expensive than other virtualization platforms. But are they looking at the complete picture?
Some thoughts about the past 2012 and things I expect to happen in the future. Maybe wishful thinking…
Why it makes sense to make a quick copy of production databases – not just because of Oracle support requirements in virtualized platforms, but always before starting serious troubleshooting on mission critical databases.
Here I provide technical proof that the ZFS filesystem causes heavy fragmentation when used for Oracle database files. I don’t comment yet on how that affects performance; that’s material for a future post.
Follow up to an earlier post about how to set disk alignment on Linux. The new method uses “parted” which makes things a lot easier.
Solving the problem of replicating VMware virtualized Oracle databases, using VMDK/VMFS, on physical hosts, using iSCSI storage protocols.
Providing the PDF version of the session I co-presented at Oracle Openworld. Contains lots of experiences, best practices and tips around tuning Oracle database I/O performance.
Virtualizing databases is still a hot topic. Here I discuss the influence of CPU utilisation on the total infrastructure cost and how to identify a few gotchas when looking at system stats.
My experiences in a customer proof-of-concept. To show the madness of such POCs and how certain vendors influence the outcome, I translated the metrics into those of passenger transportation.
What’s the difference between support and certification? Why are some vendors making such a big thing out of this? How should we deal with the FUD?