Fulltext_index_doc

c/c++pascalsql

BOOL cd_Fulltext_index_doc(cdp_t cdp, const char * ft_label, t_docid docid, const char * filename, const char * format);
function cd_Fulltext_index_doc(cdp : cdp_t; ft_label : PChar; docid : integer; filename, format : PChar) : Boolean;
FUNCTION Fulltext_index_doc(IN ft_label CHAR(52), IN docid INT, IN doc CLOB, IN format CHAR(10), IN mode INT) RETURNS BOOLEAN;

Parameters:

ft_label fulltext system label (schema.fulltext or .fulltext for actual opened schema);
docid document id (a unique value in the fulltext system);
filename full name of the file containing the document (client API only);
format document format (see below);
doc document or document reference according to the mode parameter value;
mode document passing mode

Since version:

7.0

Description:

This function stores the document specified by the parameters into the ft_label fulltext system. There exists a simpler method of declaring fulltext system triggers since version 9.5, however this general function can still be used in more complex cases.

The document to be indexed may be passed in two ways - the mode parameter values may be:

value  description
0 the document is contained in the doc parameter (i.e. the doc parameter is the contents of a column (or phrase) of CHAR or CLOB type, or BLOB if the content is a binary file)
1 the document is in a file, whose name is in the doc parameter (i.e. the doc parameter is the CHAR type column, that contains only the full names of the files; the files must be places on the computer running the server;

The format parameter allows you to specify, whether a plain text will be indexed (parameter TXT, TEXT or PLAIN) or some other format (RTF, DOC, HTM, XLS, PDF, PPT and Open Office. org formats) - further text analysis is done in this case for this format (format specification, and also version for DOC). If the analysis fails, the CONVERSION_NOT_SUPPORTED (sqlstate W0226) error will occur (or GERERR_FULLTEXT_CONV - sqlstate W6005). If the format parameter is an empty string, text analysis will be done and a known format will be chosen (or an error will occur). This is the usual method of indexing various documents (e.g. files).

If you set the mode parameter to 1 (file name) and the file is not found, the CONVERSION_NOT_SUPPORTED (sqlstate W0226) error will occur.

CONVERSION_NOT_SUPPORTED and GERERR_FULLTEXT_CONV errors don't cause rollback. If they are treaded with a CONTINUE handler, the procedure will continue with the next statement (see an example bellow).

If the function is called from a client (C or Pascal syntax), then the document must be in a file, whose full name is passed as the filename parameter.

If the fulltext system already contains information about the contents of the document set in the docid parameter, the indexed information is deleted before indexing the document. If you're indexing multiple table, pay attention to the uniqueness of the document id for all tables (e.g. by using one sequence)!

Function value:

This function returns TRUE if successful, FALSE otherwise.

Example:

We assume that the TXTAB table contains text in the TX column and the id of the document in the DOC_ID column. The text is either plain or RTF format. We may execute indexing in a trigger defined as follows:

TRIGGER txupd AFTER UPDATE ON Txtab
REFERENCING NEW AS new FOR EACH ROW
BEGIN
  IF Substring(new.tx FROM 1 FOR 5) = '{\rtf' THEN
    CALL Fulltext_index_doc('.FT_FORUM', new.DOC_ID, new.tx, 'rtf', 0);
  ELSE
    CALL Fulltext_index_doc('.FT_FORUM', new.DOC_ID, new.tx, 'txt', 0);
  END IF;
END

This trigger assumes, that the document id is not modified during the UPDATE operation.

Example:

We assume that the MyCD table has the Id_CD primary key and columns Album and Interpret of CHAR(250) type and column Note of CLOB type. All these data need to be indexed (when adding a new CD or editing).

TRIGGER AfterCDInsert AFTER INSERT ON MyCD
REFERENCING NEW AS new FOR EACH ROW
BEGIN
  CALL Fulltext_index_doc('.cd_ft', new.id_CD, new.Album||' '||new.interpret||' '||new.Note, 'txt', 0);
END
TRIGGER AfterCDEdit AFTER UPDATE OF ANY Album,interpret,tracks,Note ON MyCD
REFERENCING NEW AS new FOR EACH ROW
BEGIN
  CALL Fulltext_index_doc('.cd_ft', new.id_CD, new.Album||' '||new.interpret||' '||new.Note, 'txt', 0);
END
TRIGGER BeforeDelete BEFORE DELETE ON MyCD
REFERENCING OLD AS old FOR EACH ROW
BEGIN
  CALL Fulltext_remove_doc(".cd_ft", old.id_CD);
END

SQL procedure example: indexing of external document files (mode=1):

PROCEDURE `deferred_indexing`( );
BEGIN
  DECLARE i,j INTEGER DEFAULT 0;
  DECLARE CONTINUE HANDLER FOR SQLSTATE 'W0226', SQLSTATE 'W6005' BEGIN 
    CALL Log_write("File indexing failed. Error number: "+SQLSTATE);
  END;
  FOR row AS cx CURSOR FOR
    SELECT id_doc, doc, indexed 
    FROM Doctab
    WHERE NOT indexed
  DO
    IF Fulltext_index_doc('.ft_test_deferred', row.id_doc, row.filename, '', 1) THEN
      SET i = i+1;
    ELSE
      SET j = j+1;
    END IF;
    UPDATE SET indexed = TRUE WHERE CURRENT OF cx;
  END FOR;
  CALL Log_write(Int2str(i)+' document(s) indexed, '+Int2str(j)+' failed');
END

See