Compressing and Decompressing XMLTYPE’s

We are testing with compression of our XMLTYPE data. We are initially testing with the following…

create or replace package compression_utils is
  function comp_data(p_data in xmltype) return blob;
  function decomp_data(p_data in blob) return xmltype;
end;
/

create or replace package body compression_utils is
  function comp_data (p_data in xmltype) return blob is
    l_original_clob      clob;
    l_original_blob      blob;
    l_compressed_blob    blob;
    l_warning number := 0;
    l_in number := 1;
    l_out number := 1;
    l_lang number := 873;
  begin

    dbms_lob.createtemporary(l_compressed_blob,TRUE);
    dbms_lob.createtemporary(l_original_blob,TRUE);
    l_original_clob := p_data.getclobval();
    dbms_lob.converttoblob(l_original_blob,
                           l_original_clob,
                           dbms_lob.getlength(l_original_clob),
                           l_in,
                           l_out,
                           dbms_lob.default_csid,
                           l_lang,
                           l_warning);
    utl_compress.lz_compress (src => l_original_blob, dst => l_compressed_blob);
    return l_compressed_blob;
  end;

  function decomp_data (p_data in blob) return xmltype is
    l_uncompressed_blob  blob := to_blob('1');
    temp_clob CLOB;
    dest_offset number :=1;
    src_offset number :=1;
    default_lang_ctx integer := 0;
    warning integer :=1;
  begin
    dbms_lob.createtemporary(temp_clob,true);
    utl_compress.lz_uncompress (src => p_data, dst => l_uncompressed_blob);  

    dbms_lob.converttoclob(temp_clob,
                           l_uncompressed_blob,
                           dbms_lob.getlength(l_uncompressed_blob),
                           dest_offset,
                           src_offset,
                           873,
                           default_lang_ctx,
                           warning);
    return xmltype.createxml(temp_clob);
  end;
end;
/

We loaded 200,000 XMLTYPE rows from a large journal table we have. When I run an extract using the decomp_data procedure above, my session time is spent entirely on CPU. It took two minutes to get the average length of the uncompressed data in the 200,000 rows. To get the same average length with the uncompressed rows, it took less than two seconds.

Out of the two minutes, the time was spent as follows:

  • 20 seconds on decompressing
  • 20 seconds on converttoclob()
  • 80 seconds on createxml()

I realize a user call like UTL_COMPRESS will never be as fast as native database compression, but I was still pretty disappointed with this. I will do some more testing to see if I can increase a buffer size that may make the decompression faster, but it doesn’t look good right now.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.