Reading Oracle ASM Files Directly Using x$kffxp
DBA Team Lead and Oracle ACE Associate. Building database tools and helping teams manage Oracle infrastructure.
In a previous post, I explored implementing asmfs using dbms_diskgroup.read(). Since then, asmfs (my open source GitHub project) has gained a new capability: reading files stored in Oracle ASM by directly accessing block devices.
Rather than needing deep knowledge of ASM's internal structures, asmfs can rely on the x$kffxp view, which exposes nearly all the metadata required. In this post, I'll walk through how asmfs leverages this view to read ASM files.
Disk Groups and Allocation Units (AU)
Here are the basics: an ASM diskgroup is made of one or more disks. Space for each file is allocated in uniformly sized "chunks" known as extents. The same extent can be written to one (aka external redundancy), two (normal redundancy) or three (high redundancy) different disks. So, for example, in a high redundancy diskgroup, each extent needs 3 allocation units (AU).
Each ASM diskgroup can have different AU size, but strictly all allocation units (AU) have the same size within one diskgroup. You can define your preferred AU size (1, 2, 4, 8, 16, 32 or 64 MB) when you create a diskgroup.
We can figure out our current AU size and ASM redundancy level (v$asm_diskgroup.type) by querying v$asm_diskgroup like this:
SQL> select group_number, name, type, allocation_unit_size / 1024 / 1024 as au_size_mb from v$asm_diskgroup;
GROUP_NUMBER NAME TYPE AU_SIZE_MB
------------ -------- -------- ----------
1 DATA NORMAL 4
SQL> select disk_number, path from v$asm_disk where group_number=1;
DISK_NUMBER PATH
----------- ---------------
0 AFD:DATA1
1 AFD:DATA2
2 AFD:DATA3
If you're using AFD then you'll also need to resolve AFD labels to actual block devices like this:
$ afdtool -getdevlist
--------------------------------------------------------------------------------
Label Path
================================================================================
DATA1 /dev/sdd
DATA2 /dev/sdb
DATA3 /dev/sde
File Number
In order to obtain the list of all allocation units (AU) that belong to a given file, we need to obtain the file_number of the file we want to read. It can be obtained by simply querying v$asm_alias like this:
SQL> select name, file_number from v$asm_alias where name = 'SYSTEM.261.1206729191' and group_number=1;
NAME FILE_NUMBER
------------------------------ -----------
SYSTEM.261.1206729191 261
x$kffxp (Extent Map)
Now the fun part:
SQL> SELECT
x.disk_kffxp AS disk_number,
x.au_kffxp AS allocation_unit
FROM x$kffxp x
WHERE x.group_kffxp = 1 /* :group_number */
AND x.number_kffxp = 261 /* :file_number */
AND x.lxn_kffxp = 0 /* :mirror */
ORDER BY x.xnum_kffxp /* extent number */;
DISK_NUMBER ALLOCATION_UNIT
----------- ---------------
2 148
0 147
1 152
...
By using first two where conditions (:group_number and :file_number) we select only those allocation units that belong to our file. But because our example diskgroup has normal redundancy, we get two allocation units for each extent.
So, we filter by :mirror, which can be:
0for primary copy1for secondary copy (available in normal and high redundancy diskgroups)2for tertiary copy (available in high redundancy diskgroups)
In asmfs, you can select which :mirror copy you want to use by specifying --mirror parameter to asmfs command (e.g. asmfs --mirror 1 for always reading only from secondary copy).
Order is also important, xnum_kffxp is extent number. First extent starts with 0, second one is 1 etc. A constant value of 2147483648 refers to the triple-mirrored file metadata.
So, how can we use disk number (x.disk_kffxp) and allocation unit number x.au_kffxp?
Running dd
Based on the example three rows returned by our example query, we can run three dd commands like this:
dd if=/dev/sde bs=4194304 skip=148 count=1 >> /tmp/SYSTEM.DBF
dd if=/dev/sdd bs=4194304 skip=147 count=1 >> /tmp/SYSTEM.DBF
dd if=/dev/sdb bs=4194304 skip=152 count=1 >> /tmp/SYSTEM.DBF
...
Let's explain the first command in detail:
if=/dev/sdebecauseDISK_NUMBER=2fromx$kffxprefers toPATH=AFD:DATA3(according tov$asm_disk) which refers to/dev/sde(according toafdtool).bs=4194304because AU size is 4 MB (according tov$asm_diskgroup.allocation_unit_size)skip=148because allocation unit fromx$kffxp.au_kffxpis148count=1because we're reading only one AU at a time
btw, if you're running AFD, you can only do such dd commands as root because one of the AFD features is to deny non-oracle non-root software direct access to block devices.
File size
Note that the file we've created by running all those dd commands must be multiple of AU size (4 MB in our example). But actual Oracle files (e.g. datafiles) can be any multiple of any valid block size (e.g. 8 KB).
In other words, there is a high chance that our file contains garbage at the end of file, which we can get rid of by running truncate:
truncate -s 3659538432 /tmp/SYSTEM.DBF
The number of bytes to which we need to truncate is obtained simply from v$asm_file:
SQL> select bytes from v$asm_file where file_number=261;
BYTES
----------
3659538432
File Headers
By using the approach described above, you'll get file contents as it is written in the ASM. The problem is that if you copy a file using, say rman or asmcmd, then md5sum will be different than what you'll get from described dd & truncate commands.
The difference comes from first block only. Not really the complete first block, but just two 4-byte regions:
- The 4 bytes from
0x20to0x23seems to be the magic constant, which is different for files in ASM and for files in a local filesystem. For files in local filesystem, this always seems to be the same magic value0x000081a0in little-endian format. - The 4 bytes from
0x10to0x13seems to be the checksum. Due to the way this checksum is calculated, we can simply read the existing checksum and XOR it with the previously mentioned magic value.
For more details, feel free to read the function fix_header_block in file fuse.rs.
Not every file type follows this rule, but most do - datafiles, tempfiles, and archivelogs all behave this way. I stumbled upon this magic constant by doing a simple byte-by-byte comparison of files and noticed it kept showing up with the same value.
Since we're deep in undocumented territory here, there's always a chance I've got something slightly wrong. But so far, it seems to work! :)
Getting started with ASMFS
Now that you understand how it works, feel free to grab a ready-made asmfs.rpm files compatible with Oracle Linux versions 10, 9, 8 from asmfs releases.
Installation instructions and examples are all in the README.md of the github project.
If you run into issues or discover edge cases, feel free to open an issue on GitHub.
References
A lot of useful additional information regarding ASM Metadata and Internals is also available on twiki.cern.ch.