Empress Technical News – March 2009

Database Text Search Index – Fast Text Data Retrieval – Part 2

Introduction

Empress Ultra Embedded V10.20 offers many additional features to application developers. One of those features is a Text Search Index capability for fast text data retrieval.

Empress Text Search Index

The Empress Ultra Embedded V10.20 Text Search Index capability lets application developers implement an efficient search for database records using keywords, tokens or phrases. Typically, those keywords are associated with a particular character- or text-based attribute in a database table.

The search index capability is provided as an additional set of C calls that are used in conjunction with the Empress C/C++ Kernel Level API – mr Routines. The search index is a user-maintained index, that is, an index not maintained via Empress database engine calls. To build the text search index, the application supplies the list (array) of tokens/keywords/phrases when it inserts records into the Empress database.

“Fast Text Data Retrieval - Part 1” demonstrated the Text Search facility on a very simple retrieval example. Part 2 demonstrates how to create a text search index, how to insert keywords/tokens into this index and how to perform more complex searches. The Part 1 example was written for a Windows Mobile device; the Part 2 examples are written for Linux devices.

Create Search Index

The following example provides the Empress C/C++ Kernel Level API – mr Routines program code (make_database.c) to show how to create the search index on the table songs.

/* The following example provides the actual mr program code to show how to create the table songs and the search index on the same table.
When translated into SQL the example does something like:

CREATE TABLE songs (id INTEGER, title NLSCHAR (80,1))
CREATE TEXTSEARCH INDEX ON songs (title) */

#include <mscc.h>

mscall ("-", "CREATE DATABASE karaokedb")

Insert Into Search Index

The following example provides the Empress C/C++ Kernel Level API – mr Routines program code (insert_text.c) to show how to insert records into the table songs and into the search index on the same table.

/* The following example provides the actual mr program code to show how to insert records into the table songs and into the search index on the same table. When translated into pseudo code the example does something like:

songs_tabdesc = mropen (DATABASE, "songs", 'u')
return 1

Complex Retrieval Using Search Index With Multiple Tokens

We will now demonstrate a more complex retrieval example. The Empress C/C++ Kernel Level API – mr Routines program code (select2_text.c) shows how to perform a complex retrieval with multiple tokens from the table songs using the search index. The example retrieves all the records from the songs table that contain the tokens “Want” and “Hand”, together with all the records that contain the token “Really”. When translated into SQL the example does something like:

SELECT id, title FROM songs WHERE title LIKE "%Want%" AND title LIKE "%Hand%" OR title LIKE "%Really%"

Because AND binds more tightly than OR in SQL, the first two conditions are grouped together before the OR is applied.

/* The following example provides the actual mr program code to show how to perform a complex retrieval with multiple tokens from the table songs using the search index. The example retrieves all the records from the songs table that contain the tokens Want and Hand and all the records that contain the token Really. When translated into SQL the example does something like:

SELECT id, title FROM songs WHERE title LIKE "%Want%" AND title LIKE "%Hand%" OR title LIKE "%Really%" */

#include <mscc.h>
The output of the program is the same as if you would run the following SQL statement:

SELECT * FROM songs WHERE title LIKE "%Hold%" AND

Songs that contain Hold and Want or Really

Id Title

The results from the SQL statement are the same as those from the Empress text search index; the difference is in performance. If there are many records, the search using the Empress text search index could be orders of magnitude faster than the SQL query producing the same output.

Another advantage of the Empress text search index is flexibility: because the application supplies the tokens/keywords itself, they do not even have to be present in the text attribute (here, title) they are associated with.
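To make the performance and flexibility points more concrete, here is a minimal sketch in plain C of the general idea behind a user-maintained token index. It is not the Empress API and does not use any mr Routines: the names TextIndex, Posting, index_add, index_lookup and query_and_or are hypothetical, invented for illustration only. The fixed-size bitset (at most 32 records) and the linear token lookup are simplifying assumptions to keep the example short. The sketch shows how a query of the form (A AND B) OR C can be answered by combining per-token record sets rather than scanning every title with LIKE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_RECORDS 32   /* assumption: one bit per record in an unsigned long */
#define MAX_TOKENS  64   /* assumption: small fixed-size token table */

/* One posting entry: a token and the set of record ids tagged with it. */
typedef struct {
    const char   *token;     /* not copied: caller keeps the string alive */
    unsigned long records;   /* bit i set => record i carries this token */
} Posting;

/* Hypothetical in-memory index; zero-initialise before first use. */
typedef struct {
    Posting entries[MAX_TOKENS];
    int     count;
} TextIndex;

/* Tag record `id` with `token`. The token is supplied by the application
   and need not occur in the record's text at all.
   Returns 0 on success, -1 if the id is out of range or the table is full. */
int index_add(TextIndex *ix, int id, const char *token)
{
    int i;

    assert(ix != NULL && token != NULL);
    if (id < 0 || id >= MAX_RECORDS)
        return -1;

    /* Token already known: just add the record to its set. */
    for (i = 0; i < ix->count; i++) {
        if (strcmp(ix->entries[i].token, token) == 0) {
            ix->entries[i].records |= 1UL << id;
            return 0;
        }
    }

    /* New token: append a posting entry if there is room. */
    if (ix->count == MAX_TOKENS)
        return -1;
    ix->entries[ix->count].token = token;
    ix->entries[ix->count].records = 1UL << id;
    ix->count++;
    return 0;
}

/* Return the set of records carrying `token` (empty set if unknown). */
unsigned long index_lookup(const TextIndex *ix, const char *token)
{
    int i;

    assert(ix != NULL && token != NULL);
    for (i = 0; i < ix->count; i++) {
        if (strcmp(ix->entries[i].token, token) == 0)
            return ix->entries[i].records;
    }
    return 0UL;
}

/* Evaluate (a AND b) OR c on record sets instead of scanning titles. */
unsigned long query_and_or(const TextIndex *ix,
                           const char *a, const char *b, const char *c)
{
    return (index_lookup(ix, a) & index_lookup(ix, b)) | index_lookup(ix, c);
}
```

As a usage illustration with made-up data: after zero-initialising a TextIndex and tagging record 0 with "Want" and "Hand", record 1 with "Want" only and record 2 with "Really", the call query_and_or(&ix, "Want", "Hand", "Really") returns a set with bits 0 and 2 set. The work done depends on the number of distinct tokens and not on the length or number of titles, which is the intuition behind the speed-up described above. A production index would likely use a hash table or tree for token lookup and a growable record set.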