602SQL Documentation Index  

Fulltext in 602SQL

Fulltext document indexing and search are part of the 602SQL Server. The fulltext extension in 602SQL can do the following:

Documents indexed within the document tables can be stored in the following ways:

A database column in a database record will represent each document. This column either directly contains the document or contains a reference to the document. The fulltext document search provides search results from the document tables using the SQL language.

Documents may be in plain text, PDF, RTF, XML, FO, ZFO (created using 602XML), HTML, DOC (created using Microsoft Word or OpenOffice.org), XLS (created using Microsoft Excel or OpenOffice.org), PPT (created using Microsoft PowerPoint or OpenOffice.org) and formats of an OpenOffice.org version 1 or 2 (SXW, SXC, ODT, ODS).

Utilities for getting words from binary documents (fulltext plug-in) need to be configured before using them (see Fulltext plug-ins for more information). Parameters can be set either in the client or server, depending on the entity that will execute indexing.

Word Variation

Words may occur in a document in various forms (e.g. singular, plural, various verb forms). The fulltext system may work in one of the following ways:

The first method requires the appropriate language support to be installed (the 602SQL installation pack contains support for Czech, Slovak, German and American or British English). The method of dealing with word variation is specified when the fulltext system is created and cannot be changed.

Fulltext System

If an application wants to use fulltext search, a fulltext system must be created in the database. You may have more than one fulltext system in a single database. Each fulltext system is designated with an identifier. Unlike other object identifiers, the fulltext system identifier may have a maximum of 20 characters.

A fulltext system is always created for a specific language.

The name of the application followed by a dot and the fulltext system identifier is used when referring to a fulltext system (e.g. using the FULLTEXT predicate). If the application name is omitted, the current application will be used instead. The dot prior to the fulltext system name may not be omitted.

A new fulltext system is created using the CREATE FULLTEXT statement, definition modifications are performed using the ALTER FULLTEXT statement and the system is deleted using the DROP FULLTEXT statement. You can also create, modify and delete a fulltext system from the 602SQL Development Client using the fulltext designer.

When a new internal fulltext system is created, new tables will be created in the application. These tables are FTX_WORDTABid (word table) and FTX_REFTABidft (reference table) (and also the FTX_DOCPARTidft table if archive indexing is enabled), sequence FTX_DOCIDidft, where idft is the fulltext system identifier. The tables for document storage are not created. The application author must create these tables.

When a new external fulltext system is created (since version 9.5.1) only the FTX_DOCPARTidft table will be created in the application (if archive indexing is enabled) and the sequence FTX_DOCIDidft, where idft is the fulltext system identifier. Tables for words and references will not be created. This information is stored in a binary file named Application_name.Ftx_system_name.ftx. This file will be created in the same folder where the database file is located, unless set differently by the server property ExtFulltextDir.

Example

The internal fulltext system Correspondence from the Literary application is composed of the fulltext object type Correspondence, tables FTX_WORDTABCorrespondence, FTX_REFTABCorrespondence, sequence FTX_DOCIDCorrespondence and the document tables. The predicate Fulltext('Literary.Correspondence', ...) is used for a search (or the predicate Fulltext('.Correspondence', ...) in the Literary application).

Differences between internal and external index storage

Internal storage (indexes are part of the database file)

External storage (indexes are separate from the database file)

The index speed is the same for both cases.

List of topics: