Oracle Automatic Storage Management (ASM) is Oracle’s proprietary volume manager and file system, built to optimize storage and storage management for Oracle databases. Combined with Silk, it forms a powerful, robust, and performant I/O subsystem for mission-critical, database-intensive workloads. ASM accomplishes this with diskgroups and ASM instances, a special kind of Oracle instance that serves storage to subscribed database instances. Let’s discuss ASM a bit first and then show how Silk leverages ASM to provide best-of-breed performance for critical IaaS database workloads.
The ASM instance is an Oracle instance that mounts no database: it has an SGA, a PGA, and background processes, but no mounted or open datafiles. Instead, specially prepared OS LUNs are presented to this instance and formed into diskgroups. These striped diskgroups can in turn be offered to other Oracle database instances (non-ASM, “traditional”) and provide storage through OMF (Oracle Managed Files) conventions.
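To make the relationship between the ASM instance, its diskgroups, and the database instances it serves more concrete, the following queries can be run while connected to the ASM instance. This is a minimal sketch using the standard V$ASM_DISKGROUP and V$ASM_CLIENT views; the rows returned depend entirely on your own environment.

```sql
-- Run while connected to the ASM instance (for example, AS SYSASM).
-- List the diskgroups this ASM instance has mounted, with capacity figures.
SELECT name, state, type, total_mb, free_mb
FROM   v$asm_diskgroup;

-- List the database instances currently using those diskgroups.
SELECT group_number, instance_name, db_name, status
FROM   v$asm_client;
```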
Device preparation is done in one of two ways. The older approach installs a set of RPMs known as ASMLib to manage the LUNs; as of 12c, the newer ASM Filter Driver (ASMFD) can be used instead. ASMLib is being deprecated, but when I recently tried to install ASMFD 19c on a CentOS Linux system, the install failed with the message “Unsupported Operating System”. Apparently, the install scripts are quite fussy about where they will install themselves. More to the point, you won’t be able to get support for that platform/Oracle combination. Compatibility aside, ASM Filter Driver is the way of the future.
Once ASM is set up and the diskgroups are created, the diskgroups can be managed dynamically. Setting the init.ora parameter DB_CREATE_FILE_DEST to a diskgroup, in conjunction with OMF, makes tablespace creation this easy:
CREATE BIGFILE TABLESPACE big_file_asm DATAFILE SIZE 10G AUTOEXTEND ON NEXT 10G MAXSIZE UNLIMITED;
The SQL statement above creates a 10-gigabyte tablespace called big_file_asm. It will auto-extend indefinitely in 10G increments, so its size is limited only by the physical capacity of the storage. Note the BIGFILE keyword: even on 64-bit systems, a regular (smallfile) Oracle datafile is restricted to about 32GB at the default 8 KB block size. A multi-terabyte tablespace would therefore need dozens of datafiles, possibly hundreds, all of which have to be managed and monitored. ASM with bigfile/OMF simplifies this task. The ability to expand and contract the diskgroup by dynamically adding or removing LUNs, while everything is online and running, is a powerful feature.
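For the tablespace statement above to work without naming a file, OMF has to know where to place new datafiles. A minimal sketch of that setup follows. It assumes a mounted diskgroup named DATA1 (a name borrowed for illustration) and a database instance started from an spfile, so that SCOPE = BOTH is accepted.

```sql
-- Assumption: a diskgroup named DATA1 exists and is mounted.
-- Point Oracle Managed Files at it so new datafiles are created in ASM.
ALTER SYSTEM SET db_create_file_dest = '+DATA1' SCOPE = BOTH;

-- After creating the tablespace, confirm it is a bigfile tablespace
-- and see the OMF-generated file name inside the diskgroup.
SELECT t.tablespace_name,
       t.bigfile,
       f.file_name,
       f.bytes / 1024 / 1024 / 1024 AS size_gb
FROM   dba_tablespaces t
JOIN   dba_data_files  f ON f.tablespace_name = t.tablespace_name
WHERE  t.tablespace_name = 'BIG_FILE_ASM';
```

The file name returned is generated by Oracle, which is the point of OMF: the DBA chooses the diskgroup, not the path.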
It is true striping, too. When a new ASM disk is added, the entire diskgroup is rebalanced by a background process whose intensity is tunable (the POWER clause). ASM does not merely concatenate the new storage onto the end of the volume. This is key to performance.
ASM’s manageability goes hand in hand with its performance. ASM presents the LUNs as raw devices and thus bypasses the overhead of the OS buffer cache; Oracle’s own buffer cache does a far better job of caching database blocks than the OS buffer system. The result can be up to 10% better I/O performance than the same workload running against a regular Linux filesystem such as ext4.
With ASM, reducing a striped volume (as long as the remaining capacity is sufficient for the existing data) is accomplished by dropping an ASM disk from the diskgroup. The data is then redistributed across the remaining ASM disks, all while the databases stay available and running. It is, of course, recommended that rebalance operations take place on a quiet database, but this is not an absolute requirement. The rebalance runs in the background, and its priority and resource consumption can be adjusted.
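The drop-and-rebalance sequence described above might look like the sketch below when run on the ASM instance. The diskgroup name data1 and the disk name DATA1_0004 are assumptions for illustration; the real disk names can be listed from V$ASM_DISK.

```sql
-- Check that the diskgroup has enough free space to absorb the dropped disk's data.
SELECT name, total_mb, free_mb
FROM   v$asm_diskgroup
WHERE  name = 'DATA1';

-- Assumption: DATA1_0004 is a hypothetical ASM disk name in diskgroup data1.
-- Drop it and start a low-priority rebalance.
ALTER DISKGROUP data1 DROP DISK data1_0004 REBALANCE POWER 3;

-- Watch the rebalance progress; no rows returned means no operation is running.
SELECT group_number, operation, state, power, sofar, est_work, est_minutes
FROM   v$asm_operation;

-- Raise the priority of the running rebalance if the system is quiet.
ALTER DISKGROUP data1 REBALANCE POWER 8;
```

A lower POWER value leaves more I/O capacity for the databases at the cost of a longer rebalance; the right trade-off depends on the workload.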
Adding an ASM disk to the diskgroup follows the same process. Once the new LUN is presented to the OS, a SQL command such as the following adds it to an existing diskgroup:
ALTER DISKGROUP data1 ADD DISK
'/dev/oradata_5'
REBALANCE POWER 5 WAIT;
Here the ASMFD-prepared raw device “/dev/oradata_5” is added to a diskgroup called “data1”. The REBALANCE clause tells ASM to rebalance the entire diskgroup at a power of five. The WAIT keyword instructs SQL not to return until the rebalance operation is complete.
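One way to see that the new disk was striped into the diskgroup, rather than appended to it, is to compare per-disk usage after the rebalance finishes. The query below is a sketch against the standard V$ASM_DISK and V$ASM_DISKGROUP views, assuming the diskgroup is stored under the name DATA1 (unquoted names are stored in upper case).

```sql
-- Per-disk capacity and usage for diskgroup DATA1.
-- After a completed rebalance, disks of equal size should show
-- roughly the same used_mb, including the newly added disk.
SELECT d.name,
       d.path,
       d.total_mb,
       d.free_mb,
       d.total_mb - d.free_mb AS used_mb
FROM   v$asm_disk      d
JOIN   v$asm_diskgroup g ON g.group_number = d.group_number
WHERE  g.name = 'DATA1'
ORDER  BY d.name;
```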
One Oracle-specific parameter, set at the ASM init.ora level, is a particularly good fit for Silk because it makes ASM rebalance operations more efficient. In the final phase of an ASM rebalance, Oracle attempts to move all the data out to the “outer tracks” of the media. This makes perfect sense with spinning disks (AKA rotating rust) but not in an all-flash situation. Thus, the underscore parameter _DISABLE_REBALANCE_COMPACT=TRUE can be set on the ASM instance to tell Oracle to skip this wasted effort. The change requires a restart to take effect.
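As a sketch, the parameter could be set as shown below. This assumes the ASM instance uses an spfile, and SCOPE = SPFILE is chosen to match the restart requirement described above. Underscore parameters are undocumented, so it is prudent to confirm the setting with Oracle Support for your version before using it.

```sql
-- Run on the ASM instance, connected AS SYSASM.
-- Assumption: the ASM instance was started from an spfile.
-- The double quotes are required because the name starts with an underscore.
ALTER SYSTEM SET "_disable_rebalance_compact" = TRUE SCOPE = SPFILE;

-- Confirm the entry was recorded in the spfile; it takes effect after the restart.
SELECT name, value
FROM   v$spparameter
WHERE  name = '_disable_rebalance_compact';
```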
My esteemed colleague Flash DBA has a great discussion about the Oracle ASM rebalance process.
Silk leverages ASM to present its high-speed I/O subsystem to Oracle databases. At the OS level, Silk devices appear as regular LUNs and as such are prepared by ASMFD before being formed into diskgroups. For Oracle on Silk, the best allocation unit (AU) size depends on the workload. Large data warehouses can see up to a 16 percent performance improvement from a larger AU size of 8 MB or 16 MB. Oracle recommends an AU size between 1 MB and 4 MB for mostly OLTP workloads, and an AU size of 8 MB or larger for OLAP workloads.
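The AU size is a diskgroup attribute that is fixed when the diskgroup is created, so it has to be chosen up front. The sketch below creates a diskgroup with an 8 MB AU for an OLAP-style workload. The diskgroup name, the device paths, the compatibility levels, and the choice of EXTERNAL REDUNDANCY (which assumes data protection is handled by the storage layer) are all illustrative assumptions, not Silk-documented settings.

```sql
-- Assumptions: dw_data, the /dev/oradata_* paths, and the 19.0 compatibility
-- values are hypothetical; EXTERNAL REDUNDANCY assumes the array protects the data.
CREATE DISKGROUP dw_data EXTERNAL REDUNDANCY
  DISK '/dev/oradata_1', '/dev/oradata_2'
  ATTRIBUTE 'au_size'          = '8M',
            'compatible.asm'   = '19.0',
            'compatible.rdbms' = '19.0';

-- Verify the AU size of every mounted diskgroup, in megabytes.
SELECT name, allocation_unit_size / 1024 / 1024 AS au_mb
FROM   v$asm_diskgroup;
```

Because the AU size cannot be altered later, changing it means creating a new diskgroup and moving the data into it.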
For very large databases, an AU size of 16 MB or larger is recommended. Sub-millisecond latency and 10GB+ throughput make Silk a compelling choice for the high I/O demands of mission-critical, database-oriented workloads. While presenting Silk LUNs to Oracle as cooked file systems is certainly possible, it is not recommended. The raw I/O advantage of ASM, coupled with the ability to dynamically expand and contract diskgroups, makes ASM a very worthwhile component to install, configure, and support. Coupled with Silk, you will get the highest performance and the best manageability.
Want to learn more about how Silk boosts the performance of Oracle? Check out these throughput, latency, and read/write performance tests that were recently conducted for Oracle on Silk.