Translate

Search This Blog

Oracle text index replacing like query

Text indexes in comparison query ( tune like query using text index for large tables)
consider text comparison query:

select e.empno,e.ename,d.loc,d.dname from emp e ,dept d
where e.ename like '%'||d.dname||'%'

This is suitable case of use of text indexes if table emp is quite big.

Steps:

1 .FROM DBA ---------

GRANT SELECT ON ctxsys.dr$ths_phrase to scott;
GRANT EXECUTE ON CTX_DDL TO scott;

FROM CTXSYS USER ----------------
exec ctx_ddl.create_stoplist('empty_stoplist', 'BASIC_STOPLIST');
exec ctx_ddl.create_preference('matching_lexer', 'BASIC_LEXER');

2.FROM application USER scott ------

CREATE INDEX emp_ename2 ON emp2 (ename) INDEXTYPE IS CTXSYS.CONTEXT parameters ('lexer CTXSYS.matching_lexer');

in order to take benefit of this text index rewrite above query as below :

select e.*
from emp2 e, dept d
where CONTAINS(e.ename, '%'||d.dname||'%')>0

***** Spool outputs along with execution plans *****

SYSTEM session:

1* alter user ctxsys identified by sys
SQL> /

User altered.

SQL> GRANT SELECT ON ctxsys.dr$ths_phrase to scott;
Grant succeeded.

SQL> GRANT EXECUTE ON CTX_DDL TO scott;
Grant succeeded.

SQL> exec ctx_ddl.create_stoplist('empty_stoplist', 'BASIC_STOPLIST');
PL/SQL procedure successfully completed.

SQL> exec ctx_ddl.create_preference('matching_lexer', 'BASIC_LEXER');

PL/SQL procedure successfully completed.

--SCOTT session --

SQL> create table emp2 as select * from emp;
Table created.

SQL> alter table emp2 modify empno number(7);
Table altered.

SQL> alter table emp2 modify ename varchar2(200);
Table altered.

SQL> create sequence s1 ;
Sequence created.

SQL> insert into emp2(empno,ename) select s1.nextval,'SALES' from emp2;
15 rows created.

insert repeated for

30720 rows created.

SQL> commit;
Commit complete.

SQL> insert into emp2(empno,ename) select s1.nextval,'ABCD SALES HOLA' from emp2 where rownum<=1000; 1000 rows created.
 SQL> select * from dept;

DEPTNO DNAME LOC
---------- -------------- -------------
10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
30 SALES CHICAGO
40 OPERATIONS BOSTON

SQL> insert into emp2(empno,ename) select s1.nextval,'ABCDACCOUNTINGEXYZ' from emp2;
62440 rows created.

SQL> CREATE INDEX emp_ename2 ON emp2 (ename) INDEXTYPE IS CTXSYS.CONTEXT parameters ('lexer CTXSYS.matching_lexer');

Index created.

SQL> insert into emp2(empno,ename) select s1.nextval,'bola' from emp2 ;
124880 rows created.

SQL> commit;
Commit complete.

SQL> set autotrace traceonly explain statistics timing on
SQL> select e.*
2 from emp2 e, dept d
3 where CONTAINS(e.ename, '%'||d.dname||'%')>0 ;

124866 rows selected.

Elapsed: 00:00:01.23

Execution Plan
----------------------------------------------------------
Plan hash value: 4225699595

-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 682 | 135K| 146 (0)| 00:00:02 |
| 1 | NESTED LOOPS | | 682 | 135K| 146 (0)| 00:00:02 |
| 2 | TABLE ACCESS FULL | DEPT | 4 | 40 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| EMP2 | 170 | 32980 | 146 (0)| 00:00:02 |
|* 4 | DOMAIN INDEX | EMP_ENAME2 | | | 4 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("CTXSYS"."CONTAINS"("E"."ENAME",'%'||"D"."DNAME"||'%')>0)

Note
-----
- dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
750 recursive calls
0 db block gets
10354 consistent gets
5 physical reads
0 redo size
2295427 bytes sent via SQL*Net to client
91959 bytes received via SQL*Net from client
8326 SQL*Net roundtrips to/from client
26 sorts (memory)
0 sorts (disk)
124866 rows processed

SQL> /
124866 rows selected.
Elapsed: 00:00:00.84

SQL> drop index EMP_ENAME2;
Index dropped.

Elapsed: 00:00:02.93
1 select e.*
2 from emp2 e, dept d
3* where e.ename like '%'||d.dname||'%'
SQL>
SQL> /

124866 rows selected.

Elapsed: 00:00:01.01

Execution Plan
----------------------------------------------------------
Plan hash value: 4088618096

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 43131 | 8087K| 831 (2)| 00:00:10 |
| 1 | NESTED LOOPS | | 43131 | 8087K| 831 (2)| 00:00:10 |
| 2 | TABLE ACCESS FULL| DEPT | 4 | 40 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| EMP2 | 10783 | 1916K| 207 (2)| 00:00:03 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("E"."ENAME" LIKE '%'||"D"."DNAME"||'%')

Note
-----
- dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
178 recursive calls
0 db block gets
11239 consistent gets
0 physical reads
0 redo size
2295427 bytes sent via SQL*Net to client
91959 bytes received via SQL*Net from client
8326 SQL*Net roundtrips to/from client
5 sorts (memory)
0 sorts (disk)
124866 rows processed

SQL> /
124866 rows selected.
Elapsed: 00:00:01.01


SQL> insert into emp2(empno,ename) select s1.nextval,'z' from emp2;

49760 rows created.

Elapsed: 00:00:02.75

SQL> commit;
Commit complete.

Elapsed: 00:00:00.10
SQL> set autotrace off
SQL> insert into emp2(empno,ename) select s1.nextval,null from emp2;

499520 rows created.

Elapsed: 00:00:06.32
SQL> commit;
Commit complete.

Elapsed: 00:00:00.01
SQL> CREATE INDEX emp_ename2 ON emp2 (ename) INDEXTYPE IS CTXSYS.CONTEXT parameters ('lexer CTXSYS.matching_lexer');

Index created.

Elapsed: 00:00:22.06
SQL> select e.*
2 from emp2 e, dept d
3 where CONTAINS(e.ename, '%'||d.dname||'%')>0
4
SQL>
SQL> set autotrace traceonly explain statistics timing on
SQL>
SQL> /

124866 rows selected.

Elapsed: 00:00:00.86

Execution Plan
----------------------------------------------------------
Plan hash value: 4225699595

-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1852 | 368K| 356 (0)| 00:00:05 |
| 1 | NESTED LOOPS | | 1852 | 368K| 356 (0)| 00:00:05 |
| 2 | TABLE ACCESS FULL | DEPT | 4 | 40 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| EMP2 | 463 | 89822 | 356 (0)| 00:00:05 |
|* 4 | DOMAIN INDEX | EMP_ENAME2 | | | 4 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("CTXSYS"."CONTAINS"("E"."ENAME",'%'||"D"."DNAME"||'%')>0)

Note
-----
- dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
392 recursive calls
0 db block gets
10671 consistent gets
2 physical reads
0 redo size
2295427 bytes sent via SQL*Net to client
91959 bytes received via SQL*Net from client
8326 SQL*Net roundtrips to/from client
6 sorts (memory)
0 sorts (disk)
124866 rows processed

SQL> select e.*
2 from emp2 e, dept d
3 where e.ename like '%'||d.dname||'%' ;

124866 rows selected.

Elapsed: 00:00:01.42

Execution Plan
----------------------------------------------------------
Plan hash value: 4088618096

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 185K| 33M| 2103 (3)| 00:00:26 |
| 1 | NESTED LOOPS | | 185K| 33M| 2103 (3)| 00:00:26 |
| 2 | TABLE ACCESS FULL| DEPT | 4 | 40 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| EMP2 | 46308 | 8230K| 525 (3)| 00:00:07 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("E"."ENAME" LIKE '%'||"D"."DNAME"||'%')

Note
-----
- dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
5 recursive calls
0 db block gets
15761 consistent gets
0 physical reads
0 redo size
2295427 bytes sent via SQL*Net to client
91959 bytes received via SQL*Net from client
8326 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
124866 rows processed

SQL> /

124866 rows selected.

Elapsed: 00:00:01.42

SQL> select e.*
2 from emp2 e, dept d
3 where CONTAINS(e.ename, '%'||d.dname||'%')>0;

124866 rows selected.

Elapsed: 00:00:00.82

Execution Plan
----------------------------------------------------------
Plan hash value: 4225699595

-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1852 | 368K| 356 (0)| 00:00:05 |
| 1 | NESTED LOOPS | | 1852 | 368K| 356 (0)| 00:00:05 |
| 2 | TABLE ACCESS FULL | DEPT | 4 | 40 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| EMP2 | 463 | 89822 | 356 (0)| 00:00:05 |
|* 4 | DOMAIN INDEX | EMP_ENAME2 | | | 4 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("CTXSYS"."CONTAINS"("E"."ENAME",'%'||"D"."DNAME"||'%')>0)

Note
-----
- dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
52 recursive calls
0 db block gets
10242 consistent gets
0 physical reads
0 redo size
2295427 bytes sent via SQL*Net to client
91959 bytes received via SQL*Net from client
8326 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
124866 rows processed