Creating the fulltext search query for SQL Server CE

As you've learned earlier, SQL Server CE does not support any full-text search functionality and, therefore, does not come with the contains() and freetext() T-SQL functions that would otherwise have made full-text search an easier task.

The closest equivalent you are given is the CHARINDEX(sequence,expression) and PATINDEX(pattern,expression) T-SQL functions. These functions search text-based SQL fields for a given sequence or pattern and return the starting point of the pattern in the text (if it is found). Consider the following table for instance:

TABLE [Fruits]

FruitName FruitDescription

Apple I love apples

Running the following SQL would produce the result 8 (the starting position of the word 'Apple' in the preceding text)

SELECT CHARINDEX('apple,,FruitDescription) FROM [Fruits]

The patindex() function gives you a little more control, in that you can specify a small subset of regular expression patterns. An example follows:

SELECT CHARINDEX('%apple[0-9]%',FruitDescription) FROM [Fruits]

So, how does the charindex() function translate to the full-text search feature? Let's take a look at an example. If the user, for instance, wanted to search for all documents containing either the Medical or Certificates keyword, he or she would key in Medical or Certificates as the search phrase. You could basically retrieve a listing of the uploaded files matching these keywords using the following SQL:

SELECT * FROM AccountFiles WHERE CHARINDEX(

'Medical', Keywords)>0 OR CHARINDEX('Certificates', Keywords)>0

[rlTt The CHARINDEX() and PATINDEX() T-SQL functions, unlike the LIKE I ^¡s^-5' keyword, work well on large data types such as TEXT or NTEXT. I

You now need to find a way to translate a Boolean search phrase into an sql where clause. The user can key in search phrases in many different ways. A few are shown in the following table:

Search phrase example

Description

Medical OR Certificates OR Receipts

Medical OR Certificates NOT Clinic

Medical Certificates OR Medical Receipts NOT Medical

Containing any of the 'Medical', 'Certificates' or 'Receipts,' keywords

Containing either the 'Medical' or the 'Certificates' keywords but must not contain the 'Clinic' keyword Containing either the 'Medical Certificates' phrase or the 'Medical Receipts' phrase

Any document that does not contain the 'Medical' keyword

You can create a BuildwhereClause() function that takes in a search phrase keyed in by the user and converts it into SQL. The code is as follows:

private string BuildWhereClause(string SearchPhrase) {

int _counter; int _counter2; string[] _NOTPhrases; string[] _ORPhrases; string _NOTPhrase; string _ORPhrase; string _WhereClause; string _SubWhereClause;

//Split the search phrase using 'NOT' as the delimiter //The \b Regex special character specifies a word boundary //and basically tells Regex to only consider occurrences of //'NOT' that are standalone words

_NOTPhrases = Regex.Split(SearchPhrase, "\\bNOT\\b", RegexOptions.IgnoreCase);

//Loop through the split parts for (_counter = 0; _counter <= (_NOTPhrases.Length - 1); _counter++)

_NOTPhrase = _NOTPhrases[_counter].Trim();

//Split this part further using 'OR' as the delimiter

_ORPhrases = Regex.Split(_NOTPhrase, "\\bOR\\b",

RegexOptions.IgnoreCase); _SubWhereClause = "";

//Here we enter another loop for (_counter2 = 0; _counter2 <= (_ORPhrases.Length -1); _counter2++)

_ORPhrase = _ORPhrases[_counter2].Trim();

//Generate the CHARINDEX() expression for each //word in the search phrase and attaches them //together using the OR operator

_SubWhereClause += ((_SubWhereClause.Length > 0) ? " OR " : "") + "CHARINDEX(\'" + _ORPhrase + "\',Keywords)>0";

//As we are still inside a loop that runs through //phrases separated by the NOT operator, //we reattach each formatted portion together //using the NOT operator

_WhereClause += ((_WhereClause.Length > 0) ? " AND

" : "") + (_counter > 0 ? "NOT " : "") + _SubWhereClause;

//If the WHERE clause is not empty, we append the //'WHERE' SQL keyword at the front

_WhereClause = "WHERE " + _WhereClause;

return _WhereClause;

As an example, passing the search phrase "Medical or Certificate not Clinic" through the function above will produce the following:

WHERE CHARINDEX('Medical',Keywords)>0 OR CHARINDEX('Certificate',Keywo rds)>0 AND NOT CHARINDEX('Clinic',Keywords)>0

As you've seen in the brief walk-through earlier on in this chapter, the full-text search results are displayed differently. Instead of the usual DataGrid listing, we choose to display in similar fashion to online search engines, which is to display each search result item with a text summary at the bottom.

James Broccoli

Contract document (20 Kb) Open file

All medical leave submissions must be accompanied with an exemption note or letter from the doctor and

The text summary displayed isn't just a random snippet of text from the Keywords field. It must contain at least one of the search phrase keywords typed in by the user for the display to be meaningful. These keywords are highlighted in bold in the text summary.

To retrieve these text summaries, you can use another T-SQL function available in SQL Server CE — the SUBSTRING(expression, starting_position, length) function. This function simply extracts a chunk of text (with a specified length) from a specified field and starting position. Hence, assuming you have the following data:

TABLE [AccountFiles]

Keywords

This Medical Certificate proves that...

If you combine the charindex() function with the substring() function in the following manner:

SELECT SUBSTRING(Keywords,CHARINDEX(,Medical,,Keywords)/12)

FROM AccountFiles

You will get the following result:

Medical Cert

You can use these two functions to extract the first chunk of text that contains the search phrase keywords. Keeping in mind that a search phrase may consist of more than one set of keywords (attached via the or operator), you need to call these two functions for every OR keyword in the search phrase. The search phrase Medical OR Certificate NOT Clinic will hence translate to the following:

SELECT SUBSTRING(Keywords,CHARINDEX(,Medical,,Keywords)/100) AS [TextSummaryl], SUBSTRING(Keywords,CHARINDEX(,Certificate,,Keywords)/ 100) AS [TextSummary2] FROM AccountFiles

Let's write some code to do this translation:

private string BuildSelectClause(string SearchPhrase) {

int _counter; int _counter2 = 1; string[] _NOTPhrases; string[] _ORPhrases; string _ActivePhrase; string _ORPhrase;

string _SelectClause; _SelectClause = "";

//Split the search phrase at the NOT operator _NOTPhrases = Regex.Split(SearchPhrase, "\\bNOT\\b", RegexOptions.IgnoreCase);

//Everything after the NOT is not needed. We only need the //part on the left of the NOT operator

_ActivePhrase = _N0TPhrases[0].Trim();

//Split the phrase using the OR operator _ORPhrases = Regex.Split(_ActivePhrase, "\\bOR\\b",

RegexOptions.IgnoreCase); for (_counter = 0; _counter <= (_ORPhrases.Length - 1); _counter++)

_ORPhrase = _ORPhrases[_counter].Trim();

//For each OR operator, we call the SUBSTRING and //CHARINDEX functions. Take note that we deduct 30 //characters from the starting point so that the //keyword does not appear exactly as the first //word in the text summary. This is purely for //aesthetic value

_SelectClause += ((_SelectClause.Length > 0) ? "," : "") + "SUBSTRING(Keywords,CHARINDEX(\'" + _ORPhrase + "\',Keywords)-30,120) AS [TextSummary" + _counter2.ToString() + "]"; _counter2++;

//If no keyword was specified in the search phrase, we //simply return the first 120 characters of the text

_SelectClause = "SUBSTRING(Keywords,0,12 0) AS TextSummaryl";

return _SelectClause;

Now that you've created this function, you can proceed to build the full-text search queries. The full-text search function is expected to return paged data, so this means that you will again implement two functions—one to return the search result count and another to return paged data. The code is very similar to the previous ones you've done. The code for the count function follows. It makes use of the BuildwhereClause() function you've created earlier.

public int GetAccountFilesCountBySearchPhrase(string SearchPhrase)

_whereClause = BuildWhereClause(SearchPhrase); _command.CommandText = "SELECT COUNT(*) AS RecordCount FROM AccountFiles " + whereClause;

The code to retrieve the paged data follows next. It makes use of both the

BuildWhereClause() and BuildSelectClause() functions you've created.

public DataSet GetAccountFilesBySearchPhrase(string SearchPhrase, int TotalRecords, int PageNumber, int PageSize, string SortColumn, GlobalVariables.SortingOrder SortDirection)

_command = _globalConnection.CreateCommand(); whereClause = BuildWhereClause(SearchPhrase); textSummaryClause = BuildSelectClause(SearchPhrase);

_command.CommandText = "SELECT * FROM (SELECT TOP(" + _pageRecordCount + ") * FROM (SELECT TOP(" + _initialSelectSize + ") b.FirstName, b.LastName, b.AccountType, a.AccountGUID,a.AttachmentID, a.AttachmentName, a.AttachmentSize, a.Attachment," + _textSummaryClause + " FROM AccountFiles a LEFT JOIN Accounts b ON a.AccountGUID=b.AccountGUID " + _whereClause + " ORDER BY " + _sortColumn + " " + _sortDirection + ", AttachmentID DESC) AS [mytable] ORDER BY " + _sortColumn + " " + _sortOppDirection + ", AttachmentID ASC) AS [mytable2] ORDER BY " + _sortColumn + " " + _sortDirection + ",AttachmentID DESC";

Let's now take a look at the equivalent for Oracle Lite.

Was this article helpful?

0 0

Post a comment