Yes, Oracle Text can do this.
http://download-west.oracle.com/docs...cpkg.htm#12729
You need a context index with the INSO filter on the RTF document column in
order to use it.
e.g.
create table RTFDOCUMENT
(
ID NUMBER not null,
DOC CLOB not null,
PLAIN_DOC CLOB
);
[insert an RTF document with id=1 using dbms_lob.loadclobfromfile or whatever]
SQL> create index rtfdocument_tx1 on rtfdocument (doc) indextype is
ctxsys.context parameters ('filter ctxsys.INSO_FILTER'); Index created
SQL> declare
2 v_doc clob;
3 v_plain clob;
4 r number;
5 begin
6 select id, doc
7 into r, v_doc
8 from rtfdocument
9 where id = 1 for update;
10
11 dbms_lob.createtemporary(v_plain, true);
12 dbms_lob.append(v_plain, v_doc); 13
14 ctx_doc.filter('RTFDOCUMENT_TX1', to_char(r), v_plain, plaintext =>
true);
15
16 update rtfdocument set plain_doc = v_plain where id = 1;
17 end;
18 /
PL/SQL procedure successfully completed
SQL> commit;
Commit complete
[plain_doc now contains a plain text version of the RTF document].
It seems a bit over-enthusiastic with newlines, but it certainly works.
from ask tom
Tim -- Thanks for the question regarding "Remove RTF Tags from a Text Field Containing RTF", version 9i
You Asked
Hi Tom, In our database we have some varchar2(4000) fields (could later be a clob). Where we store RTF data. I would like to know if there is anyway I can remove the RTF tags returning only the text again. For Example a Function RTFtoTEXT that takes the RTF: {/rtf1/ansi/deff0{/fonttbl{/f0/fnil/fcharset0 Courier New;}} {/colortbl ;/red0/green0/blue255;} {/*/generator Msftedit 5.41.15.1507;}/viewkind4/uc1/pard/lang1033/ul/b/i/f0/fs20 This is a test./par /cf1/ulnone/i0 This is a test./cf0/b0/par /par /par /par /par } And Returns: This is a test. This is a test. Can you do this with Oracle Text or Something? TIA Tim
and we said...
Yes, we can do this with text -- you can a) filter to plaintext if you have an index b) filter to html with or without an index c) call ctxhx directly from the command line to filter the text and load it I'll demo a) and b). you can play with ctxhx from the command line from $ORACLE_HOME/ctx/bin if you want (run it, it'll tell you the inputs it takes) ops$tkyte@ORA9IR2> create table demo 2 ( id int primary key, 3 theblob blob, 4 theclob clob 5 ) 6 / Table created. ops$tkyte@ORA9IR2> create table filter ( query_id number, document clob ); Table created. ops$tkyte@ORA9IR2> ops$tkyte@ORA9IR2> create index demo_idx on demo(theblob) indextype is ctxsys.context; Index created. ops$tkyte@ORA9IR2> ops$tkyte@ORA9IR2> create sequence s; Sequence created. ops$tkyte@ORA9IR2> ops$tkyte@ORA9IR2> ops$tkyte@ORA9IR2> create or replace directory my_files as '/home/tkyte/Desktop/' 2 / Directory created. ops$tkyte@ORA9IR2> @trace ops$tkyte@ORA9IR2> alter session set events '10046 trace name context forever, level 12'; Session altered. ops$tkyte@ORA9IR2> declare 2 l_blob blob; 3 l_clob clob; 4 l_id number; 5 l_bfile bfile; 6 begin 7 insert into demo values ( s.nextval, empty_blob(), empty_clob() ) 8 returning id, theblob, theclob into l_id, l_blob, l_clob; 9 10 l_bfile := bfilename( 'MY_FILES', 'asktom.rtf' ); 11 dbms_lob.fileopen( l_bfile ); 12 13 dbms_lob.loadfromfile( l_blob, l_bfile, 14 dbms_lob.getlength( l_bfile ) ); 15 16 dbms_lob.fileclose( l_bfile ); 17 18 ctx_doc.ifilter( l_blob, l_clob ); 19 commit; 20 ctx_doc.filter( 'DEMO_IDX', l_id, 'FILTER', l_id, TRUE ); 21 end; 22 / PL/SQL procedure successfully completed. ops$tkyte@ORA9IR2> ops$tkyte@ORA9IR2> set long 500 ops$tkyte@ORA9IR2> select utl_raw.cast_to_varchar2(dbms_lob.substr(theblob,500,1)) from demo; UTL_RAW.CAST_TO_VARCHAR2(DBMS_LOB.SUBSTR(THEBLOB,500,1)) ------------------------------------------------------------------------------- {/rtf1/ansi/ansicpg1252/uc1 /deff0/deflang1033/deflangfe1033{/fonttbl{/f0/froma n/fcharset0/fprq2{/*/panose 02020603050405020304}Times New Roman;}{/f1/fswiss/f charset0/fprq2{/*/panose 020b0604020202020204}Arial;} {/f2/fmodern/fcharset0/fprq1{/*/panose 02070309020205020404}Courier New;}{/f23/ froman/fcharset128/fprq1{/*/panose 00000000000000000000}MS Mincho{/*/falt MS ?? };}{/f28/froman/fcharset128/fprq1{/*/panose 00000000000000000000}@MS Mincho;} {/f29/froman/fcharset238/fprq2 Times New Roman CE; ops$tkyte@ORA9IR2> select theclob from demo; THECLOB ------------------------------------------------------------------------------- <HTML><BODY> <h1><font size="5" face="Arial"><b>Primary key index in Partitioning</b></font> </h1> <p><font size="3" face="Times New Roman"><i>I</i> <i>have a table accounts whic h has 80 million records (OLTP system). I would like to partition the table by acct_by_date column. I will be going with range partition and global index es. My concern is regd the primary key acct_id. The index that will be created for primary key should it be local or g lobal and which should I opt for?</i></font></ ops$tkyte@ORA9IR2> select document from filter; DOCUMENT ------------------------------------------------------------------------------- Primary key index in Partitioning I have a table accounts which has 80 million records (OLTP system). I would lik e to partition the table by acct_by_date column.�I will be going with range par tition and global indexes. My concern is regd the primary key acct_id. The inde x that will be created for primary key should it be local or global and which s hould I opt for? Well, this is an easy one.�The primary key index can be local IF and ONLY IF, t he primary key is in fact the (or part of th ifilter works without a ctxsys.context index, but only lets you get HTML, filter works only with an index -- but lets you get plain text OR html