Friday, November 24, 2006

Consider declaring NOT NULL columns

Consider declaring NOT NULL columns


People sometimes do not bother to define columns as NOT NULL in the data dictionary, even though these columns should not contain nulls, and indeed never do contain nulls because the application ensures that a value is always supplied. You may think that this is a matter of indifference, but it is not. The optimizer sometimes needs to know that a column is not nullable, and without that knowledge it is constrained to choose a less than optimal execution plan.

1. An index on a nullable column cannot be used to drive access to a table unless the query contains one or more predicates against that column that exclude null values.

Of course, it is not normally desirable to use an index based access path unless the query contains such predicates, but there are important exceptions.

For example, if a full table scan would otherwise be required against the table and the query can be satisfied by a fast full scan (scan for which table data need not be read) against the index, then the latter plan will normally prove more efficient.

Test-case for the above reasoning

SQL> create index color_indx on automobile(color);

Index created.

SQL> select distinct color,count(*) from automobile group by color;


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=4 Card=1046 Bytes=
54392)

1 0 SORT (GROUP BY) (Cost=4 Card=1046 Bytes=54392)
2 1 TABLE ACCESS (FULL) OF 'AUTOMOBILE' (TABLE) (Cost=3 Card
=1046 Bytes=54392)


SQL> alter table automobile modify color not null;

Table altered.

SQL> select distinct color,count(*) from automobile group by color;


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=4 Card=1046 Bytes=
54392)

1 0 SORT (GROUP BY) (Cost=4 Card=1046 Bytes=54392)
2 1 INDEX (FAST FULL SCAN) OF 'COLOR_INDX' (INDEX) (Cost=3 C
ard=1046 Bytes=54392)



2. If you are calling a sub-query in a parent query using the NOT IN predicate, the indexing on column (in where clause of parent query) will not be used.

Because as per optimizer, results of parent query needs to be displayed only when there is no equi-matching from sub-query, And if the sub-query can potentially contain NULL value (UNKNOWN, incomparable), parent query will have no value to compare with NULL value, so it will not use the INDEX.

Test-case for the above Reasoning

SQL> create index sal_indx on emp(sal);

Index created.

SQL> create index ename_indx on emp(ename);

Index created.

SQL> select * from emp where sal not in (select sal from emp where ename='JONES');


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=17 Card=13 Bytes=4
81)

1 0 FILTER
2 1 TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=3 Card=14 Byt
es=518) –> you can see a full table scan even when index exist on SAL

3 1 TABLE ACCESS (BY INDEX ROWID) OF 'EMP' (TABLE) (Cost=2 C
ard=1 Bytes=10)

4 3 INDEX (RANGE SCAN) OF 'ENAME_INDX' (INDEX) (Cost=1 Car
d=1)



SQL> alter table emp modify sal not null;

Table altered.

SQL> select * from emp where sal not in (select sal from emp where ename='JONES');


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=5 Card=12 Bytes=56
4)

1 0 MERGE JOIN (ANTI) (Cost=5 Card=12 Bytes=564)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'EMP' (TABLE) (Cost=2 C
ard=14 Bytes=518)

3 2 INDEX (FULL SCAN) OF 'SAL_INDX' (INDEX) (Cost=1 Card=1
4) -> Here you go, your index getting used now

4 1 SORT (UNIQUE) (Cost=3 Card=1 Bytes=10)
5 4 TABLE ACCESS (BY INDEX ROWID) OF 'EMP' (TABLE) (Cost=2
Card=1 Bytes=10)

6 5 INDEX (RANGE SCAN) OF 'ENAME_INDX' (INDEX) (Cost=1 C
ard=1)


The above article has been an inspiration after reading an article on ixora . The article was missing some of the testcases, so I thought of adding few for newbiews to relate to it.

1 comment:

Anonymous said...
This comment has been removed by a blog administrator.