全文索引常用查询方式

一、一般的查询格式:

select * from article_detail where 

contains(content_text,'发觉')>0

and rownum<11;

 

select * from article_detail where 

contains(content_text,'هؤلاء المظلومين')>0

and rownum<11;

二、包括查询操作符的

1. 通配符%(阿拉伯文、英文成功,中文却不成功)

select * from mark_tab where 

contains(text,'قوامها وفعاليتها وع%ائها')>0

and rownum<11;

2. near

语法:NEAR((word1, word2,..., wordn) [, max_span [, order]])

max_span最大是100,默认也是100.

 

例子:

Text下有:The dog chases the cat.好人一生平安

 

select * from mark_tab where 

contains(text,'near((好人,平安),6,true)')>0

and rownum<11;

表示“好人”和“平安”之间最大长度不能超过6,并且排序。

 

select * from mark_tab where 

contains(text,'near((اختلفت,中国),100,false)')>0

and rownum<11;

 

查询有 '南京路' 字样的地址
SELECT student_id,student_name
FROM students
WHERE CONTAINS( address, 'nanjing NEAR road' )
remark:
上面的查询将返回包含 'nanjing road''nanjing east road''nanjing west road' 等字样的地址。
A NEAR B
,就表示条件: A 靠近 B

 

3. and, or, not

select * from mark_tab where 

contains(text,'dog and cat and 好人')>0 and rownum<11;

 

select * from mark_tab where 

contains(text,'والاحساس and العمل and 中国')>0

and rownum<11;

 

 

select * from mark_tab where 

contains(text,'dog | cat | 好人')>0

and rownum<11;

 

4. Fuzzy

Use the fuzzy operator to expand queries to include words that are spelled similarly to the specified term

语法:fuzzy(term, score, numresults, weight)

 

Parameter

Description

term

Specify the word on which to perform the fuzzy expansion. Oracle Text expands term to include words only in the index. The word needs to be at least 3 characters for the fuzzy operator to process it.

score

Specify a similarity score. Terms in the expansion that score below this number are discarded. Use a number between 1 and 80. The default is 60.

numresults

Specify the maximum number of terms to use in the expansion of term. Use a number between 1 and 5000. The default is 100.

weight

Specify WEIGHT or W for the results to be weighted according to their similarity scores.

Specify NOWEIGHT or N for no weighting of results.

 

Supported Languages

Oracle Text supports fuzzy definitions for English, French, German, Italian, Dutch, Spanish, Portuguese, Japanese, OCR, and auto-language detection.

select * from mark_tab where 

contains(text,'fuzzy(google, 70, 6, weight)')>0

and rownum<11;

查询结果:The dog chases the cat.好人一生平安 goole.com baidu

 

阿语试了几个不成功,也不知道哪些才是相似的,不知该怎样测试。

网上资料说fuzzy不支持阿拉伯语:

Is there no way to make a "fuzzy matching" with my context index on Arabic words?, I mean what can I do to get the alternatives/matched words in a specific column (Arabic values) for a search text?

The only thing I can think of is to create your own thesaurus using ctx_thes.create_relation and add the matching words in pairs using ctx_thes.create_relation, making them synonyms, so that you can use ctx_thes.syn on your search string to obtain all of the possibilities.

 

5. SYNonym (SYN)

 

Syntax

Description

SYN(term[,thes])

Expands a query to include all the terms defined in the thesaurus as synonyms for term.

 

From:

http://download.oracle.com/docs/cd/B28359_01/text.111/b28304/cthes.htm#i997464

步骤:

--创建名为'DEFAULT'同义词词典,同义词不区分大小写

begin

CTX_THES.CREATE_THESAURUS('DEFAULT',False);

end;

--添加同义词(关系)

begin

ctx_thes.create_relation('DEFAULT','google','SYN','googel');

end;

 

begin

ctx_thes.create_relation('DEFAULT','المجتمعات','SYN','سطحية');

end;

 

 

--查看同义词(关系)

declare

 synonyms varchar2(2000);

begin

 synonyms := CTX_THES.SYN('google','DEFAULT');

 dbms_output.put_line('the synonym expansion for google is: '||synonyms);

end;

 

--删除同义词(关系),整个不可用

begin

CTX_THES.DROP_RELATION('DEFAULT','google','SYN','googe');

end;

 

--删除某个词

begin

ctx_thes.drop_phrase('DEFAULT', 'GOOGEL');

end;

 

 

----查询测试,英文,阿文,中文都测试成功

select * from mark_tab where 

contains(text,'SYN(dog)')>0

and rownum<11;

 

select * from mark_tab where 

contains(text,'SYN(سطحية)')>0

and rownum<11;

 

select * from mark_tab where 

contains(text,'SYN(中国)')>0

and rownum<11;

--------------------------------

posted on 2010-01-07 13:05  思静  阅读(846)  评论(1编辑  收藏  举报