Text indexes in comparison query ( tune like query using text index for large tables)
consider text comparison query:
select e.empno,e.ename,d.loc,d.dname from emp e ,dept d
where e.ename like '%'||d.dname||'%'
This is suitable case of use of text indexes if table emp is quite big.
Steps:
1 .FROM DBA ---------
GRANT SELECT ON ctxsys.dr$ths_phrase to scott;
GRANT EXECUTE ON CTX_DDL TO scott;
FROM CTXSYS USER ----------------
exec ctx_ddl.create_stoplist('empty_stoplist', 'BASIC_STOPLIST');
exec ctx_ddl.create_preference('matching_lexer', 'BASIC_LEXER');
2.FROM application USER scott ------
CREATE INDEX emp_ename2 ON emp2 (ename) INDEXTYPE IS CTXSYS.CONTEXT parameters ('lexer CTXSYS.matching_lexer');
in order to take benefit of this text index rewrite above query as below :
select e.*
from emp2 e, dept d
where CONTAINS(e.ename, '%'||d.dname||'%')>0
***** Spool outputs along with execution plans *****
SYSTEM session:
1* alter user ctxsys identified by sys
SQL> /
User altered.
SQL> GRANT SELECT ON ctxsys.dr$ths_phrase to scott;
Grant succeeded.
SQL> GRANT EXECUTE ON CTX_DDL TO scott;
Grant succeeded.
SQL> exec ctx_ddl.create_stoplist('empty_stoplist', 'BASIC_STOPLIST');
PL/SQL procedure successfully completed.
SQL> exec ctx_ddl.create_preference('matching_lexer', 'BASIC_LEXER');
PL/SQL procedure successfully completed.
--SCOTT session --
SQL> create table emp2 as select * from emp;
Table created.
SQL> alter table emp2 modify empno number(7);
Table altered.
SQL> alter table emp2 modify ename varchar2(200);
Table altered.
SQL> create sequence s1 ;
Sequence created.
SQL> insert into emp2(empno,ename) select s1.nextval,'SALES' from emp2;
15 rows created.
insert repeated for
30720 rows created.
SQL> commit;
Commit complete.
SQL> insert into emp2(empno,ename) select s1.nextval,'ABCD SALES HOLA' from emp2 where rownum<=1000; 1000 rows created.
SQL> select * from dept;
DEPTNO DNAME LOC
---------- -------------- -------------
10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
30 SALES CHICAGO
40 OPERATIONS BOSTON
SQL> insert into emp2(empno,ename) select s1.nextval,'ABCDACCOUNTINGEXYZ' from emp2;
62440 rows created.
SQL> CREATE INDEX emp_ename2 ON emp2 (ename) INDEXTYPE IS CTXSYS.CONTEXT parameters ('lexer CTXSYS.matching_lexer');
Index created.
SQL> insert into emp2(empno,ename) select s1.nextval,'bola' from emp2 ;
124880 rows created.
SQL> commit;
Commit complete.
SQL> set autotrace traceonly explain statistics timing on
SQL> select e.*
2 from emp2 e, dept d
3 where CONTAINS(e.ename, '%'||d.dname||'%')>0 ;
124866 rows selected.
Elapsed: 00:00:01.23
Execution Plan
----------------------------------------------------------
Plan hash value: 4225699595
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 682 | 135K| 146 (0)| 00:00:02 |
| 1 | NESTED LOOPS | | 682 | 135K| 146 (0)| 00:00:02 |
| 2 | TABLE ACCESS FULL | DEPT | 4 | 40 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| EMP2 | 170 | 32980 | 146 (0)| 00:00:02 |
|* 4 | DOMAIN INDEX | EMP_ENAME2 | | | 4 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("CTXSYS"."CONTAINS"("E"."ENAME",'%'||"D"."DNAME"||'%')>0)
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
750 recursive calls
0 db block gets
10354 consistent gets
5 physical reads
0 redo size
2295427 bytes sent via SQL*Net to client
91959 bytes received via SQL*Net from client
8326 SQL*Net roundtrips to/from client
26 sorts (memory)
0 sorts (disk)
124866 rows processed
SQL> /
124866 rows selected.
Elapsed: 00:00:00.84
SQL> drop index EMP_ENAME2;
Index dropped.
Elapsed: 00:00:02.93
1 select e.*
2 from emp2 e, dept d
3* where e.ename like '%'||d.dname||'%'
SQL>
SQL> /
124866 rows selected.
Elapsed: 00:00:01.01
Execution Plan
----------------------------------------------------------
Plan hash value: 4088618096
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 43131 | 8087K| 831 (2)| 00:00:10 |
| 1 | NESTED LOOPS | | 43131 | 8087K| 831 (2)| 00:00:10 |
| 2 | TABLE ACCESS FULL| DEPT | 4 | 40 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| EMP2 | 10783 | 1916K| 207 (2)| 00:00:03 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("E"."ENAME" LIKE '%'||"D"."DNAME"||'%')
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
178 recursive calls
0 db block gets
11239 consistent gets
0 physical reads
0 redo size
2295427 bytes sent via SQL*Net to client
91959 bytes received via SQL*Net from client
8326 SQL*Net roundtrips to/from client
5 sorts (memory)
0 sorts (disk)
124866 rows processed
SQL> /
124866 rows selected.
Elapsed: 00:00:01.01
SQL> insert into emp2(empno,ename) select s1.nextval,'z' from emp2;
49760 rows created.
Elapsed: 00:00:02.75
SQL> commit;
Commit complete.
Elapsed: 00:00:00.10
SQL> set autotrace off
SQL> insert into emp2(empno,ename) select s1.nextval,null from emp2;
499520 rows created.
Elapsed: 00:00:06.32
SQL> commit;
Commit complete.
Elapsed: 00:00:00.01
SQL> CREATE INDEX emp_ename2 ON emp2 (ename) INDEXTYPE IS CTXSYS.CONTEXT parameters ('lexer CTXSYS.matching_lexer');
Index created.
Elapsed: 00:00:22.06
SQL> select e.*
2 from emp2 e, dept d
3 where CONTAINS(e.ename, '%'||d.dname||'%')>0
4
SQL>
SQL> set autotrace traceonly explain statistics timing on
SQL>
SQL> /
124866 rows selected.
Elapsed: 00:00:00.86
Execution Plan
----------------------------------------------------------
Plan hash value: 4225699595
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1852 | 368K| 356 (0)| 00:00:05 |
| 1 | NESTED LOOPS | | 1852 | 368K| 356 (0)| 00:00:05 |
| 2 | TABLE ACCESS FULL | DEPT | 4 | 40 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| EMP2 | 463 | 89822 | 356 (0)| 00:00:05 |
|* 4 | DOMAIN INDEX | EMP_ENAME2 | | | 4 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("CTXSYS"."CONTAINS"("E"."ENAME",'%'||"D"."DNAME"||'%')>0)
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
392 recursive calls
0 db block gets
10671 consistent gets
2 physical reads
0 redo size
2295427 bytes sent via SQL*Net to client
91959 bytes received via SQL*Net from client
8326 SQL*Net roundtrips to/from client
6 sorts (memory)
0 sorts (disk)
124866 rows processed
SQL> select e.*
2 from emp2 e, dept d
3 where e.ename like '%'||d.dname||'%' ;
124866 rows selected.
Elapsed: 00:00:01.42
Execution Plan
----------------------------------------------------------
Plan hash value: 4088618096
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 185K| 33M| 2103 (3)| 00:00:26 |
| 1 | NESTED LOOPS | | 185K| 33M| 2103 (3)| 00:00:26 |
| 2 | TABLE ACCESS FULL| DEPT | 4 | 40 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| EMP2 | 46308 | 8230K| 525 (3)| 00:00:07 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("E"."ENAME" LIKE '%'||"D"."DNAME"||'%')
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
5 recursive calls
0 db block gets
15761 consistent gets
0 physical reads
0 redo size
2295427 bytes sent via SQL*Net to client
91959 bytes received via SQL*Net from client
8326 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
124866 rows processed
SQL> /
124866 rows selected.
Elapsed: 00:00:01.42
SQL> select e.*
2 from emp2 e, dept d
3 where CONTAINS(e.ename, '%'||d.dname||'%')>0;
124866 rows selected.
Elapsed: 00:00:00.82
Execution Plan
----------------------------------------------------------
Plan hash value: 4225699595
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1852 | 368K| 356 (0)| 00:00:05 |
| 1 | NESTED LOOPS | | 1852 | 368K| 356 (0)| 00:00:05 |
| 2 | TABLE ACCESS FULL | DEPT | 4 | 40 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| EMP2 | 463 | 89822 | 356 (0)| 00:00:05 |
|* 4 | DOMAIN INDEX | EMP_ENAME2 | | | 4 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("CTXSYS"."CONTAINS"("E"."ENAME",'%'||"D"."DNAME"||'%')>0)
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
52 recursive calls
0 db block gets
10242 consistent gets
0 physical reads
0 redo size
2295427 bytes sent via SQL*Net to client
91959 bytes received via SQL*Net from client
8326 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
124866 rows processed
consider text comparison query:
select e.empno,e.ename,d.loc,d.dname from emp e ,dept d
where e.ename like '%'||d.dname||'%'
This is suitable case of use of text indexes if table emp is quite big.
Steps:
1 .FROM DBA ---------
GRANT SELECT ON ctxsys.dr$ths_phrase to scott;
GRANT EXECUTE ON CTX_DDL TO scott;
FROM CTXSYS USER ----------------
exec ctx_ddl.create_stoplist('empty_stoplist', 'BASIC_STOPLIST');
exec ctx_ddl.create_preference('matching_lexer', 'BASIC_LEXER');
2.FROM application USER scott ------
CREATE INDEX emp_ename2 ON emp2 (ename) INDEXTYPE IS CTXSYS.CONTEXT parameters ('lexer CTXSYS.matching_lexer');
in order to take benefit of this text index rewrite above query as below :
select e.*
from emp2 e, dept d
where CONTAINS(e.ename, '%'||d.dname||'%')>0
***** Spool outputs along with execution plans *****
SYSTEM session:
1* alter user ctxsys identified by sys
SQL> /
User altered.
SQL> GRANT SELECT ON ctxsys.dr$ths_phrase to scott;
Grant succeeded.
SQL> GRANT EXECUTE ON CTX_DDL TO scott;
Grant succeeded.
SQL> exec ctx_ddl.create_stoplist('empty_stoplist', 'BASIC_STOPLIST');
PL/SQL procedure successfully completed.
SQL> exec ctx_ddl.create_preference('matching_lexer', 'BASIC_LEXER');
PL/SQL procedure successfully completed.
--SCOTT session --
SQL> create table emp2 as select * from emp;
Table created.
SQL> alter table emp2 modify empno number(7);
Table altered.
SQL> alter table emp2 modify ename varchar2(200);
Table altered.
SQL> create sequence s1 ;
Sequence created.
SQL> insert into emp2(empno,ename) select s1.nextval,'SALES' from emp2;
15 rows created.
insert repeated for
30720 rows created.
SQL> commit;
Commit complete.
SQL> insert into emp2(empno,ename) select s1.nextval,'ABCD SALES HOLA' from emp2 where rownum<=1000; 1000 rows created.
SQL> select * from dept;
DEPTNO DNAME LOC
---------- -------------- -------------
10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
30 SALES CHICAGO
40 OPERATIONS BOSTON
SQL> insert into emp2(empno,ename) select s1.nextval,'ABCDACCOUNTINGEXYZ' from emp2;
62440 rows created.
SQL> CREATE INDEX emp_ename2 ON emp2 (ename) INDEXTYPE IS CTXSYS.CONTEXT parameters ('lexer CTXSYS.matching_lexer');
Index created.
SQL> insert into emp2(empno,ename) select s1.nextval,'bola' from emp2 ;
124880 rows created.
SQL> commit;
Commit complete.
SQL> set autotrace traceonly explain statistics timing on
SQL> select e.*
2 from emp2 e, dept d
3 where CONTAINS(e.ename, '%'||d.dname||'%')>0 ;
124866 rows selected.
Elapsed: 00:00:01.23
Execution Plan
----------------------------------------------------------
Plan hash value: 4225699595
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 682 | 135K| 146 (0)| 00:00:02 |
| 1 | NESTED LOOPS | | 682 | 135K| 146 (0)| 00:00:02 |
| 2 | TABLE ACCESS FULL | DEPT | 4 | 40 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| EMP2 | 170 | 32980 | 146 (0)| 00:00:02 |
|* 4 | DOMAIN INDEX | EMP_ENAME2 | | | 4 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("CTXSYS"."CONTAINS"("E"."ENAME",'%'||"D"."DNAME"||'%')>0)
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
750 recursive calls
0 db block gets
10354 consistent gets
5 physical reads
0 redo size
2295427 bytes sent via SQL*Net to client
91959 bytes received via SQL*Net from client
8326 SQL*Net roundtrips to/from client
26 sorts (memory)
0 sorts (disk)
124866 rows processed
SQL> /
124866 rows selected.
Elapsed: 00:00:00.84
SQL> drop index EMP_ENAME2;
Index dropped.
Elapsed: 00:00:02.93
1 select e.*
2 from emp2 e, dept d
3* where e.ename like '%'||d.dname||'%'
SQL>
SQL> /
124866 rows selected.
Elapsed: 00:00:01.01
Execution Plan
----------------------------------------------------------
Plan hash value: 4088618096
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 43131 | 8087K| 831 (2)| 00:00:10 |
| 1 | NESTED LOOPS | | 43131 | 8087K| 831 (2)| 00:00:10 |
| 2 | TABLE ACCESS FULL| DEPT | 4 | 40 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| EMP2 | 10783 | 1916K| 207 (2)| 00:00:03 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("E"."ENAME" LIKE '%'||"D"."DNAME"||'%')
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
178 recursive calls
0 db block gets
11239 consistent gets
0 physical reads
0 redo size
2295427 bytes sent via SQL*Net to client
91959 bytes received via SQL*Net from client
8326 SQL*Net roundtrips to/from client
5 sorts (memory)
0 sorts (disk)
124866 rows processed
SQL> /
124866 rows selected.
Elapsed: 00:00:01.01
SQL> insert into emp2(empno,ename) select s1.nextval,'z' from emp2;
49760 rows created.
Elapsed: 00:00:02.75
SQL> commit;
Commit complete.
Elapsed: 00:00:00.10
SQL> set autotrace off
SQL> insert into emp2(empno,ename) select s1.nextval,null from emp2;
499520 rows created.
Elapsed: 00:00:06.32
SQL> commit;
Commit complete.
Elapsed: 00:00:00.01
SQL> CREATE INDEX emp_ename2 ON emp2 (ename) INDEXTYPE IS CTXSYS.CONTEXT parameters ('lexer CTXSYS.matching_lexer');
Index created.
Elapsed: 00:00:22.06
SQL> select e.*
2 from emp2 e, dept d
3 where CONTAINS(e.ename, '%'||d.dname||'%')>0
4
SQL>
SQL> set autotrace traceonly explain statistics timing on
SQL>
SQL> /
124866 rows selected.
Elapsed: 00:00:00.86
Execution Plan
----------------------------------------------------------
Plan hash value: 4225699595
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1852 | 368K| 356 (0)| 00:00:05 |
| 1 | NESTED LOOPS | | 1852 | 368K| 356 (0)| 00:00:05 |
| 2 | TABLE ACCESS FULL | DEPT | 4 | 40 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| EMP2 | 463 | 89822 | 356 (0)| 00:00:05 |
|* 4 | DOMAIN INDEX | EMP_ENAME2 | | | 4 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("CTXSYS"."CONTAINS"("E"."ENAME",'%'||"D"."DNAME"||'%')>0)
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
392 recursive calls
0 db block gets
10671 consistent gets
2 physical reads
0 redo size
2295427 bytes sent via SQL*Net to client
91959 bytes received via SQL*Net from client
8326 SQL*Net roundtrips to/from client
6 sorts (memory)
0 sorts (disk)
124866 rows processed
SQL> select e.*
2 from emp2 e, dept d
3 where e.ename like '%'||d.dname||'%' ;
124866 rows selected.
Elapsed: 00:00:01.42
Execution Plan
----------------------------------------------------------
Plan hash value: 4088618096
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 185K| 33M| 2103 (3)| 00:00:26 |
| 1 | NESTED LOOPS | | 185K| 33M| 2103 (3)| 00:00:26 |
| 2 | TABLE ACCESS FULL| DEPT | 4 | 40 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| EMP2 | 46308 | 8230K| 525 (3)| 00:00:07 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("E"."ENAME" LIKE '%'||"D"."DNAME"||'%')
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
5 recursive calls
0 db block gets
15761 consistent gets
0 physical reads
0 redo size
2295427 bytes sent via SQL*Net to client
91959 bytes received via SQL*Net from client
8326 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
124866 rows processed
SQL> /
124866 rows selected.
Elapsed: 00:00:01.42
SQL> select e.*
2 from emp2 e, dept d
3 where CONTAINS(e.ename, '%'||d.dname||'%')>0;
124866 rows selected.
Elapsed: 00:00:00.82
Execution Plan
----------------------------------------------------------
Plan hash value: 4225699595
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1852 | 368K| 356 (0)| 00:00:05 |
| 1 | NESTED LOOPS | | 1852 | 368K| 356 (0)| 00:00:05 |
| 2 | TABLE ACCESS FULL | DEPT | 4 | 40 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| EMP2 | 463 | 89822 | 356 (0)| 00:00:05 |
|* 4 | DOMAIN INDEX | EMP_ENAME2 | | | 4 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("CTXSYS"."CONTAINS"("E"."ENAME",'%'||"D"."DNAME"||'%')>0)
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
52 recursive calls
0 db block gets
10242 consistent gets
0 physical reads
0 redo size
2295427 bytes sent via SQL*Net to client
91959 bytes received via SQL*Net from client
8326 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
124866 rows processed