A brief walkthrough of full-text search

Now that you've seen what a parameterized search looks like, let's turn to full-text search. Full-text search functionality involves two different operations:

• Indexing: This is the process by which metadata or keywords are extracted from a file and stored elsewhere in a format that is easier to search. Take a printed encyclopedia as an example (the paper kind, not the electronic one). If you wanted to look up a section on grasshoppers, you wouldn't need to search through the book page by page. You could just flip to the index at the back, look up the word, and it would tell you which page to turn to. Indexing is the process of creating this index.

• Searching: This is the process of looking through your document content for a specific phrase. After you've indexed a document, you do not need to scan every line in the file for a particular phrase. You only need to search the index that was created, which is typically far faster than scanning the full text, especially as the number of documents grows.
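The two operations above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in the application: the function names (tokenize, build_index, search) and the sample documents are hypothetical, and the index is a plain in-memory dictionary rather than a database table.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Indexing: map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, word):
    """Searching: look the word up in the index instead of scanning every document."""
    return index.get(word.lower(), set())


# Hypothetical sample documents, keyed by document id.
documents = {
    1: "Medical certificate for Chandra Joseph",
    2: "Contract document covering medical leave",
    3: "Quarterly sales report",
}
index = build_index(documents)
print(sorted(search(index, "medical")))  # prints [1, 2]
```

The lookup touches only the index entry for the word, so its cost does not depend on how long the documents are.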

In your application, it makes good sense for the indexing to be performed automatically every time a user uploads a file. The best place to do this is the File details pop-up window in the Account Details area you created in the previous chapter (shown in the following screenshot). When the user clicks the Save button, you can index the file right there and then.

Moving on to the search interface itself, you will need to provide the user with a place to key in the search phrase and start the search. The full-text search window can be launched by double-clicking the Search icon in the main menu. The full-text search screen features a single text box that allows the user to specify a search phrase and a Search button to run the search.

Your full-text search engine will also need to support basic Boolean searches. Boolean searches allow you to specify how multiple words in a search phrase are used in a search. In the previous screenshot, for instance, the search phrase Medical OR Certificate will indicate that documents containing either the word Medical or the word Certificate will be returned.

In contrast, a search phrase consisting of the two words Medical Certificate would indicate an AND relationship between these words, and would only return documents that contain both words. The following is a list of basic Boolean search operators commonly found in most search engines:

Boolean operator | Description

Keyword1 AND Keyword2 | The AND operator (usually omitted) specifies that *both* the keywords around this operator must exist in the document for it to be returned in the search results. Examples: "Medical AND Certificate", "Medical Certificate"

Keyword1 OR Keyword2 | The OR operator specifies that all documents containing either one of the keywords should be returned.

Keyword1 NOT Keyword2 | The NOT operator specifies that all documents containing Keyword1 but not Keyword2 should be returned.
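To make the operator semantics above concrete, here is a minimal sketch that evaluates a search phrase against a keyword index using set operations. The evaluate function and the sample index are hypothetical. The sketch also makes some simplifying assumptions: operators are written in uppercase, the phrase is evaluated strictly left to right with no precedence or parentheses, and NOT is treated as a binary operator, as in the list above.

```python
def evaluate(query, index):
    """Evaluate a Boolean search phrase left to right.

    index maps a lowercase keyword to the set of document ids containing it.
    Two keywords with no operator between them are treated as AND.
    """
    result = None
    operator = "AND"
    for token in query.split():
        if token in ("AND", "OR", "NOT"):
            operator = token
            continue
        matches = index.get(token.lower(), set())
        if result is None:
            result = set(matches)   # first keyword seeds the result
        elif operator == "AND":
            result &= matches       # both keywords must be present
        elif operator == "OR":
            result |= matches       # either keyword may be present
        else:
            result -= matches       # NOT: drop documents with this keyword
        operator = "AND"            # default for the next keyword
    return result if result is not None else set()


# Hypothetical keyword index: keyword -> ids of documents containing it.
index = {"medical": {1, 2}, "certificate": {1, 3}}

print(sorted(evaluate("Medical AND Certificate", index)))  # prints [1]
print(sorted(evaluate("Medical Certificate", index)))      # prints [1]
print(sorted(evaluate("Medical OR Certificate", index)))   # prints [1, 2, 3]
print(sorted(evaluate("Medical NOT Certificate", index)))  # prints [2]
```

A production search engine would normally add operator precedence, grouping, and phrase matching. This sketch only covers the three basic operators.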

The full-text search functionality runs the search phrase (with the Boolean logic applied) against the stored Keywords metadata for all documents. This can be done at the database level. All documents with matching results are returned in a paged list (again making use of the paging control you've created in the previous chapter).
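One way to picture running the search at the database level is with SQL set operators, where AND, OR, and NOT correspond to INTERSECT, UNION, and EXCEPT. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The document_keywords table, its columns, and the search_page function are assumptions made for illustration, not the schema from the earlier chapters, and the LIMIT/OFFSET clause only stands in for the paging control.

```python
import sqlite3

# Hypothetical schema: one row per (document, keyword) pair.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE document_keywords (document_id INTEGER, keyword TEXT)")
conn.executemany(
    "INSERT INTO document_keywords VALUES (?, ?)",
    [(1, "medical"), (1, "certificate"), (2, "medical"),
     (2, "leave"), (3, "certificate"), (4, "contract")],
)

# Each Boolean operator maps onto a SQL set operator.
SET_OPERATORS = {"AND": "INTERSECT", "OR": "UNION", "NOT": "EXCEPT"}


def search_page(conn, keyword1, operator, keyword2, page, page_size):
    """Return one page of document ids matching 'keyword1 OPERATOR keyword2'."""
    sql = (
        "SELECT document_id FROM document_keywords WHERE keyword = ? "
        f"{SET_OPERATORS[operator]} "  # taken from a fixed dict, never from user input
        "SELECT document_id FROM document_keywords WHERE keyword = ? "
        "ORDER BY document_id LIMIT ? OFFSET ?"
    )
    params = (keyword1.lower(), keyword2.lower(), page_size, page * page_size)
    return [row[0] for row in conn.execute(sql, params)]


print(search_page(conn, "Medical", "OR", "Certificate", page=0, page_size=2))  # prints [1, 2]
print(search_page(conn, "Medical", "OR", "Certificate", page=1, page_size=2))  # prints [3]
```

The keywords are passed as bound parameters, so user input is never concatenated into the SQL text.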

[Screenshot: the Search summary window. Each result shows the account name (for example, Chandra Joseph), the document name and size (for example, Medical Cert 09/2009, 110 Kb), an Open file link, and an excerpt of the document text containing the search phrase.]

The summary details window will display a summary of the text containing the search phrase (which is highlighted in bold in the previous screenshot). The name and size of the document are also displayed, with an Open file link at the side to let the user open the file directly. The name of the account is displayed as a link so that end users can easily jump to the details window of the account if they wish to.
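A summary line like the one described above could be produced roughly as follows. This is a hedged sketch: the summarize function, its radius parameter, and the use of <b> tags for bold text are assumptions made for illustration. It handles a single keyword, and it does not HTML-escape the surrounding text, which a real page would need to do.

```python
import re


def summarize(text, keyword, radius=40):
    """Return an excerpt around the first match, with the keyword wrapped in <b> tags.

    Returns None when the keyword does not occur in the text.
    """
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    match = pattern.search(text)
    if match is None:
        return None
    # Keep up to `radius` characters on each side of the first match.
    start = max(match.start() - radius, 0)
    end = min(match.end() + radius, len(text))
    excerpt = text[start:end]
    # Highlight every occurrence inside the excerpt, preserving its original case.
    return pattern.sub(lambda m: f"<b>{m.group(0)}</b>", excerpt)


text = "All medical leave submissions must be accompanied with an exemption note"
print(summarize(text, "medical", radius=10))  # prints All <b>medical</b> leave sub
```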
