Back in 2011, I made the statement, "I have put my Oracle redo logs or SQL Server transaction log on nothing but SSDs" (Improve Database Performance: Redo and Transaction Logs on Solid State Disks (SSDs)). In fact, since the release of the Intel® SSD X25-E series in 2008, it is fair to say I have never looked back. Even though those X25-Es have long since retired, every new product has convinced me further still that, from a performance perspective, a hard drive configuration just cannot compete. This is not to say that there have not been new skills to learn, such as the configuration details explained here (How to Configure Oracle Redo on SSD (Solid State Disks) with ASM). The Intel® SSD 910 series provided a definite step up from the X25-E for Oracle workloads (Comparing Performance of Oracle Redo on Solid State Disks (SSDs)) and proved that concerns over write peaks were unfounded (Should you put Oracle Database Redo on Solid State Disks (SSDs)).

Now, with the PCIe*-based Intel® SSD DC P3600/P3700 series, we have the next step in the evolutionary development of SSDs for all types of Oracle workloads. Additionally, we have updates in operating system and driver support, and therefore a refresh of the previous posts on SSDs for Oracle is warranted to help you get the best out of the Intel SSD DC P3700 series for Oracle redo.

NVMe

One significant difference in the new SSDs is the change in interface and driver from AHCI and SATA to NVMe (Non-Volatile Memory Express). For an introduction to NVMe see this video by James Myers, and to understand the efficiency that NVMe brings read this post by Christian Black. As James noted, high performance, consistent, low latency Oracle redo logging also needs high endurance, therefore the P3700 is the drive to use. With a new interface comes a new driver, which fortunately is included in the Linux kernel in the Oracle-supported Linux releases of Red Hat and Oracle Linux 6.5, 6.6 and 7. I am using Oracle Linux 7.
Booting my system with both a RAID array of Intel SSD DC S3700 series and Intel SSD DC P3700 series shows two new disk devices.
First, the S3700 array using the previous interface:
Second, the new PCIe P3700 using NVMe:
Changing the Sector Size to 4KB

As Oracle introduced support for 4KB sector sizes at Oracle release 11g R2, it is important to be at a minimum of this release or Oracle 12c to take full advantage of SSD for Oracle redo. However, 'out of the box', as shown, the P3700 presents a 512 byte sector size. We can use this 'as is' and set the Oracle parameter '_disk_sector_size_override' to true. With this we can then specify the blocksize to be 4KB when creating a redo log file. Oracle will then use 4KB redo log blocks and performance will not be compromised.

As a second option, the P3700 offers a feature called 'Variable Sector Size'. Because we know we need 4KB sectors, we can set up the P3700 to present a 4KB sector size instead. This can then be used transparently by Oracle without the requirement for additional parameters. It is important to do this before you have configured or started to use the drive for Oracle, because the operation destroys any existing data on the device.

To do this, first check that everything is up to date by using the Intel Solid State Drive Data Center Tool from https://downloadcenter.intel.com/download/23931/Intel-Solid-State-Drive-Data-Center-Tool. Be aware that after running the command it will be necessary to reboot the system to pick up the new configuration and use the device.
Then run the following command to change the sector size. The parameter LBAFormat=3 sets it to 4KB and LBAFormat=0 sets it back to 512 bytes.
After it ran I rebooted. The reboot is necessary because the device needs an NVMe reset and I am on Oracle Linux 7 with a UEK kernel at 3.8.13-35.3.1. At Linux kernels 3.10 and above you can also run the following command with the system online to do the reset.
The disk should now present the 4KB sector size we want for Oracle redo.
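As a cross-check independent of any vendor tool, the sector size the kernel sees can be read from sysfs. The following is a minimal sketch, assuming a Linux system that exposes `logical_block_size` and `physical_block_size` under `/sys/block/<device>/queue`; the device name `nvme0n1` used in the note below and the `sysfs_root` parameter are assumptions for illustration, not taken from my system output.

```python
from pathlib import Path


def read_block_sizes(device: str, sysfs_root: str = "/sys/block") -> dict:
    """Return the logical and physical block sizes (in bytes) the kernel reports.

    `device` is a block device name such as "nvme0n1" (an assumed name).
    `sysfs_root` is a parameter only so the function can be pointed at a
    test directory instead of the real sysfs tree.
    """
    queue = Path(sysfs_root) / device / "queue"
    return {
        "logical": int((queue / "logical_block_size").read_text().strip()),
        "physical": int((queue / "physical_block_size").read_text().strip()),
    }


def presents_4k_sectors(device: str, sysfs_root: str = "/sys/block") -> bool:
    """True when the device presents a 4096 byte logical sector size."""
    return read_block_sizes(device, sysfs_root)["logical"] == 4096
```

Calling `presents_4k_sectors("nvme0n1")` after the format and reset should return `True` if the change took effect; a drive still in its default format would report a logical size of 512.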
Configuring the P3700 for ASM

For ASM (Automatic Storage Management) we need a disk with a single partition and, after giving the disk a gpt label, I use the following command to create and check the use of an aligned partition.
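What "aligned" means here can be expressed as simple arithmetic: the byte offset at which the partition starts should be a whole multiple of the 4KB sector. The sketch below is illustrative only; the function name and the example start sectors are assumptions, and it does not replace the check made by the partitioning tool.

```python
def is_aligned(start_sector: int, sector_bytes: int, alignment_bytes: int = 4096) -> bool:
    """Return True if a partition's starting byte offset is a multiple of alignment_bytes.

    start_sector    -- first sector of the partition, in units of sector_bytes
    sector_bytes    -- size of one sector in the units the tool reports (512 or 4096)
    alignment_bytes -- boundary to align to; 4096 matches the 4KB sector used for redo
    """
    if start_sector < 0 or sector_bytes <= 0 or alignment_bytes <= 0:
        raise ValueError("sector numbers and sizes must be positive")
    return (start_sector * sector_bytes) % alignment_bytes == 0


# Hypothetical examples, not taken from my system:
legacy_start = is_aligned(63, 512)      # 63 * 512 = 32256 bytes, not a multiple of 4096
modern_start = is_aligned(2048, 512)    # 2048 * 512 = 1 MiB, a multiple of 4096
native_4k_start = is_aligned(256, 4096) # any start sector is aligned on a 4KB-sector device
```

The legacy start at sector 63 is the classic misalignment case on 512 byte devices; once the drive presents 4KB sectors, every sector boundary is already 4KB aligned, which is one more reason to prefer that format.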
I then use udev to set the device permissions. Note: the scsi_id command can be run independently to find the device id to put in the file and the udevadm command used to apply the rules. Rebooting the system is useful during configuration to ensure that the correct permissions are applied on boot.
Successfully applied, the oracle user now has ownership of the DC S3700 RAID array device and the P3700 presented by NVMe.
Use ASMLIB to mark both disks for ASM.
As the Oracle user, use the ASMCA utility to create the ASM disk groups. I now have 2 disk groups created under ASM. Because of the way the disks were configured, Oracle has automatically detected and applied the sector size of 4KB.
SPFILES in 4K DISKGROUPS

In previous posts I noted Oracle bug "16870214 : DB STARTUP FAILS WITH ORA-17510 IF SPFILE IS IN 4K SECTOR SIZE DISKGROUP" and even with Oracle 12.1.0.2 this bug is still with us. As both of my diskgroups have a 4KB sector size, this will affect me if I try to create a database in either without having applied patch 16870214. With this bug, upon creating a database with DBCA you will see the following error. The database is created and the spfile does exist, so it can be extracted as follows:
This spfile is corrupt and attempts to reuse it will result in errors.
However, you can extract the parameters by using the strings command and create an external spfile or an spfile in a diskgroup with a 512 byte sector size. Once complete, the Oracle instance can be started.
Creating Redo Logs under ASM

Viewing the same disks from within the Oracle instance shows that the underlying sector size has been passed right through to the database.
Now it is possible to create a redo log file with a command such as the following:
…and Oracle will create a redo log automatically with an optimal blocksize of 4KB.
Running an OLTP workload with Oracle Redo on Intel® SSD DC P3700 series

To put the Oracle redo on P3700 through its paces I used a HammerDB workload. The redo is set with a standard production type configuration without commit_write and commit_wait parameters. A test shows we are running almost 100,000 transactions per second with redo at over 500MB / second, and therefore we would be archiving almost 2 TB per hour.
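The archiving figure follows directly from the redo rate. A quick sketch of the arithmetic, using decimal units (1 TB = 1,000,000 MB) and the rounded figures quoted above; the function names are mine, for illustration only:

```python
def redo_tb_per_hour(redo_mb_per_second: float) -> float:
    """Convert a sustained redo rate in MB/s to TB generated per hour (decimal units)."""
    seconds_per_hour = 3600
    mb_per_tb = 1_000_000
    return redo_mb_per_second * seconds_per_hour / mb_per_tb


def redo_bytes_per_transaction(redo_mb_per_second: float, transactions_per_second: float) -> float:
    """Average redo generated per transaction, in bytes (decimal MB)."""
    return redo_mb_per_second * 1_000_000 / transactions_per_second


hourly_tb = redo_tb_per_hour(500)                      # 500 * 3600 / 1,000,000 = 1.8
per_txn = redo_bytes_per_transaction(500, 100_000)     # about 5,000 bytes per transaction
```

At 500MB / second this gives 1.8 TB per hour, so with the rate somewhat above 500MB / second the total approaches 2 TB, which is worth keeping in mind when sizing the archive destination.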
Log file sync, even at this level of throughput, is just above 1ms
…and the average log file parallel write shows the average disk response time to be just 0.13ms
There are six log writers on this system. As with previous blog posts on SSDs I observed the log activity to be heaviest on the first three and therefore traced the log file parallel write activity on the first one with the following method:
The trace file shows the following results for log file parallel write latency to the P3700.
A scatter plot of all of the log file parallel write latencies, recorded in microseconds on the y axis, clearly illustrates that any outliers are statistically insignificant and that none exceeds 15 milliseconds. Most of the writes are sub-millisecond, on a system that is processing many millions of transactions a minute while doing so. A subset of iostat data shows that the device is also far from full utilization.
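For readers who want to summarise their own trace rather than plot it, the sketch below pulls the elapsed times out of wait lines and reports the count, mean, maximum and the share of sub-millisecond writes. It assumes the wait lines carry the event name as `nam='log file parallel write'` followed by `ela=` and a value in microseconds; check that against your own trace file before relying on it. The sample lines are hypothetical and are not taken from my trace.

```python
import re
from statistics import mean

# Assumed line shape: ... nam='log file parallel write' ela= <microseconds> ...
WAIT_PATTERN = re.compile(r"nam='log file parallel write' ela=\s*(\d+)")


def parallel_write_latencies(lines):
    """Return the elapsed times (microseconds) of all log file parallel write waits."""
    latencies = []
    for line in lines:
        match = WAIT_PATTERN.search(line)
        if match:
            latencies.append(int(match.group(1)))
    return latencies


def summarise(latencies_us):
    """Summarise a non-empty list of latencies given in microseconds."""
    if not latencies_us:
        raise ValueError("no log file parallel write waits found")
    sub_ms = sum(1 for value in latencies_us if value < 1000)
    return {
        "count": len(latencies_us),
        "mean_us": mean(latencies_us),
        "max_us": max(latencies_us),
        "sub_ms_fraction": sub_ms / len(latencies_us),
    }


# Hypothetical sample lines for illustration only.
SAMPLE_LINES = [
    "WAIT #0: nam='log file parallel write' ela= 120 files=1 blocks=2 requests=1 obj#=-1 tim=1",
    "WAIT #0: nam='log file parallel write' ela= 135 files=1 blocks=4 requests=1 obj#=-1 tim=2",
    "WAIT #0: nam='rdbms ipc message' ela= 5000 timeout=300 p2=0 p3=0 obj#=-1 tim=3",
    "WAIT #0: nam='log file parallel write' ela= 900 files=1 blocks=8 requests=1 obj#=-1 tim=4",
    "WAIT #0: nam='log file parallel write' ela= 14000 files=1 blocks=64 requests=1 obj#=-1 tim=5",
]

sample_summary = summarise(parallel_write_latencies(SAMPLE_LINES))
```

To use it on a real trace, pass an open file object in place of `SAMPLE_LINES`, since the function only iterates over lines. A single slow write, like the 14,000 microsecond one in the sample, dominates the mean of a small sample, which is why the sub-millisecond fraction and the maximum are reported alongside it.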
Conclusion

As a confirmed believer in SSDs, I have long been convinced that most experiences of poor Oracle redo performance on SSDs have been due to an error in configuration, such as sector size, block size and/or alignment, as opposed to the performance of the underlying device itself. Following the configuration steps I have outlined here, the Intel SSD DC P3700 series shows itself to be an ideal candidate to take Oracle redo to the next level of performance without compromising endurance.