Skip to main content

Command Palette

Search for a command to run...

Reading Oracle ASM Files Directly Using x$kffxp

Updated
5 min read
U

DBA Team Lead and Oracle ACE Associate. Building database tools and helping teams manage Oracle infrastructure.

In a previous post, I explored implementing asmfs using dbms_diskgroup.read(). Since then, asmfs (my open source GitHub project) has gained a new capability: reading files stored in Oracle ASM by directly accessing block devices.

Rather than needing deep knowledge of ASM's internal structures, asmfs can rely on the x$kffxp view, which exposes nearly all the metadata required. In this post, I'll walk through how asmfs leverages this view to read ASM files.

Disk Groups and Allocation Units (AU)

Here are the basics: an ASM diskgroup is made of one or more disks. Space for each file is allocated in uniformly sized "chunks" known as extents. The same extent can be written to one (aka external redundancy), two (normal redundancy) or three (high redundancy) different disks. So, for example, in a high redundancy diskgroup, each extent needs 3 allocation units (AU).

Each ASM diskgroup can have different AU size, but strictly all allocation units (AU) have the same size within one diskgroup. You can define your preferred AU size (1, 2, 4, 8, 16, 32 or 64 MB) when you create a diskgroup.

We can figure out our current AU size and ASM redundancy level (v$asm_diskgroup.type) by querying v$asm_diskgroup like this:

SQL> select group_number, name, type, allocation_unit_size / 1024 / 1024 as au_size_mb from v$asm_diskgroup;

GROUP_NUMBER NAME     TYPE     AU_SIZE_MB
------------ -------- -------- ----------
           1 DATA     NORMAL            4

SQL> select disk_number, path from v$asm_disk where group_number=1;

DISK_NUMBER PATH
----------- ---------------
          0 AFD:DATA1
          1 AFD:DATA2
          2 AFD:DATA3

If you're using AFD then you'll also need to resolve AFD labels to actual block devices like this:

$ afdtool -getdevlist
--------------------------------------------------------------------------------
Label                     Path
================================================================================
DATA1                     /dev/sdd
DATA2                     /dev/sdb
DATA3                     /dev/sde

File Number

In order to obtain the list of all allocation units (AU) that belong to a given file, we need to obtain the file_number of the file we want to read. It can be obtained by simply querying v$asm_alias like this:

SQL> select name, file_number from v$asm_alias where name = 'SYSTEM.261.1206729191' and group_number=1;

NAME                           FILE_NUMBER
------------------------------ -----------
SYSTEM.261.1206729191                  261

x$kffxp (Extent Map)

Now the fun part:

SQL> SELECT
    x.disk_kffxp AS disk_number,
    x.au_kffxp AS allocation_unit
FROM x$kffxp x  
WHERE x.group_kffxp = 1        /* :group_number */
    AND x.number_kffxp = 261   /* :file_number */
    AND x.lxn_kffxp = 0        /* :mirror */
ORDER BY x.xnum_kffxp /* extent number */;

DISK_NUMBER ALLOCATION_UNIT
----------- ---------------
          2             148
          0             147
          1             152
...

By using first two where conditions (:group_number and :file_number) we select only those allocation units that belong to our file. But because our example diskgroup has normal redundancy, we get two allocation units for each extent. So, we filter by :mirror, which can be:

  • 0 for primary copy
  • 1 for secondary copy (available in normal and high redundancy diskgroups)
  • 2 for tertiary copy (available in high redundancy diskgroups)

In asmfs, you can select which :mirror copy you want to use by specifying --mirror parameter to asmfs command (e.g. asmfs --mirror 1 for always reading only from secondary copy).

Order is also important, xnum_kffxp is extent number. First extent starts with 0, second one is 1 etc. A constant value of 2147483648 refers to the triple-mirrored file metadata.

So, how can we use disk number (x.disk_kffxp) and allocation unit number x.au_kffxp?

Running dd

Based on the example three rows returned by our example query, we can run three dd commands like this:

dd if=/dev/sde bs=4194304 skip=148 count=1 >> /tmp/SYSTEM.DBF
dd if=/dev/sdd bs=4194304 skip=147 count=1 >> /tmp/SYSTEM.DBF
dd if=/dev/sdb bs=4194304 skip=152 count=1 >> /tmp/SYSTEM.DBF
...

Let's explain the first command in detail:

  • if=/dev/sde because DISK_NUMBER=2 from x$kffxp refers to PATH=AFD:DATA3 (according to v$asm_disk) which refers to /dev/sde (according to afdtool).
  • bs=4194304 because AU size is 4 MB (according to v$asm_diskgroup.allocation_unit_size)
  • skip=148 because allocation unit from x$kffxp.au_kffxp is 148
  • count=1 because we're reading only one AU at a time

btw, if you're running AFD, you can only do such dd commands as root because one of the AFD features is to deny non-oracle non-root software direct access to block devices.

File size

Note that the file we've created by running all those dd commands must be multiple of AU size (4 MB in our example). But actual Oracle files (e.g. datafiles) can be any multiple of any valid block size (e.g. 8 KB). In other words, there is a high chance that our file contains garbage at the end of file, which we can get rid of by running truncate:

truncate -s 3659538432 /tmp/SYSTEM.DBF

The number of bytes to which we need to truncate is obtained simply from v$asm_file:

SQL> select bytes from v$asm_file where file_number=261;

     BYTES
----------
3659538432

File Headers

By using the approach described above, you'll get file contents as it is written in the ASM. The problem is that if you copy a file using, say rman or asmcmd, then md5sum will be different than what you'll get from described dd & truncate commands. The difference comes from first block only. Not really the complete first block, but just two 4-byte regions:

  • The 4 bytes from 0x20 to 0x23 seems to be the magic constant, which is different for files in ASM and for files in a local filesystem. For files in local filesystem, this always seems to be the same magic value 0x000081a0 in little-endian format.
  • The 4 bytes from 0x10 to 0x13 seems to be the checksum. Due to the way this checksum is calculated, we can simply read the existing checksum and XOR it with the previously mentioned magic value.

For more details, feel free to read the function fix_header_block in file fuse.rs.

Not every file type follows this rule, but most do - datafiles, tempfiles, and archivelogs all behave this way. I stumbled upon this magic constant by doing a simple byte-by-byte comparison of files and noticed it kept showing up with the same value.

Since we're deep in undocumented territory here, there's always a chance I've got something slightly wrong. But so far, it seems to work! :)

Getting started with ASMFS

Now that you understand how it works, feel free to grab a ready-made asmfs.rpm files compatible with Oracle Linux versions 10, 9, 8 from asmfs releases. Installation instructions and examples are all in the README.md of the github project.

If you run into issues or discover edge cases, feel free to open an issue on GitHub.

References

A lot of useful additional information regarding ASM Metadata and Internals is also available on twiki.cern.ch.