Range scans when using VARCHAR2 instead of NUMBER

I was a little surprised today when another DBA did not know the effect of using a VARCHAR2 datatype for what is functionally a number.  I wanted to ensure I had an article to which I could point when this question comes up in the future, so here it is.

A VARCHAR2 that functions as a number will not be very usable for what would normally result in a range scan execution plan. If you think about it, this makes sense. You can intuitively look at all values (regardless of “datatype”) between 10 and 900 and say there are 890 values between the two. If however, you are a computer, you will sort it as though 10 is a alphabetic character, and 900 is an alphabetic character, since that is what the data is. In other words, 1,000,000,000,000 is less than 2, since the leading 1 is less than 2.

Below shows a test case and the optimizer execution plan for each.

SQL> create table vchar(c varchar2(20));

Table created.

SQL> create table num(c number);

Table created.

SQL> begin
  2    for i in 1..100000 loop
  3      insert into vchar values(i);
  4      insert into num values(i);
  5    end loop;
  6  end;
  7  /

PL/SQL procedure successfully completed.

SQL> create index vchar_i on vchar(c);

Index created.

SQL> create index num_i on num(c);

Index created.

SQL> exec dbms_stats.gather_table_stats(USER,'VCHAR');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(USER,'NUM');

PL/SQL procedure successfully completed.

SQL> explain plan for select * from vchar where c between '10' and '900';

Explained.

SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------
Plan hash value: 600218931
--------------------------------------------------------------------------------
| Id  | Operation            | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |         | 88900 |   434K|    45   (5)| 00:00:01 |
|*  1 |  INDEX FAST FULL SCAN| VCHAR_I | 88900 |   434K|    45   (5)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("C"<='900' AND "C">='10')

13 rows selected.

SQL> explain plan for select * from num where c between 10 and 900;

Explained.

SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------
Plan hash value: 3771115364
--------------------------------------------------------------------------
| Id  | Operation        | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT |       |   892 |  3568 |     3   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| NUM_I |   892 |  3568 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
----------------------------------------------------
   1 - access("C">=10 AND "C"<=900)

13 rows selected.

SQL>

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.