The SeSQL data model is described in a 'sesql_config.py' file in the project.
It should define three variables:
The SeSQL data model is composed of fields. A field is something you can query and sort on; it is similar to a Django field.
Fields are computed from sources. A source fetches the values that make up a field by reading object attributes, concatenating several of them, calling methods, or following relations.
Each field has a type, a name and a source, and may have some type-dependent options.
Known types are:
Each field requires a 'source'. The source can be one of the following classes:
The source can also be given in a friendlier way:
If the source is not specified, it defaults to a SimpleField with the same name as the index.
The type map is a list (or tuple) of (class, table_name, recursive) entries. All Django objects of the given class are indexed into the given table. Objects of its subclasses are indexed there too, unless the recursive parameter is set to False (it defaults to True).
If the same class is reachable twice (because of multiple inheritance, or because both a base class and a derived class are listed in the map), the first matching entry is used.
Specify None as the table to explicitly exclude a content type from indexing, even if one of its base classes is indexed.
Example
TYPE_MAP = (
    (models.Photo, "sesql_photo", False),
    (models.Comment, "sesql_comment"),
    (models.BaseModel, "sesql_default"),
)
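The matching rules described above (first match wins, subclasses are included unless recursive is False) can be illustrated with a small stand-alone sketch. This is not SeSQL's own code: the resolve_table helper and the plain Python classes standing in for Django models are hypothetical, and the sketch only mirrors the rules as stated in this document.

```python
# Plain classes standing in for Django models (hypothetical, for illustration).
class BaseModel:
    pass

class Photo(BaseModel):
    pass

class Thumbnail(Photo):
    pass

class Comment(BaseModel):
    pass

class Article(BaseModel):
    pass

TYPE_MAP = (
    (Photo, "sesql_photo", False),   # exact class only, subclasses excluded
    (Comment, "sesql_comment"),      # recursive defaults to True
    (BaseModel, "sesql_default"),
)

def resolve_table(klass, type_map):
    """Return the table of the first matching entry, or None.

    Hypothetical helper: None is returned both when no entry matches
    and when the matching entry explicitly names None as its table.
    """
    for entry in type_map:
        base, table = entry[0], entry[1]
        recursive = entry[2] if len(entry) > 2 else True
        if klass is base or (recursive and issubclass(klass, base)):
            return table
    return None
```

Under these assumptions, Photo resolves to "sesql_photo", while Thumbnail skips the non-recursive Photo entry and falls through to the BaseModel entry, so it resolves to "sesql_default".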
SeSQL provides semi-automatic dependency tracking. This works in two steps.
1. Implement a method get_related_objects_for_indexation on your models. It must return a list of (classname, id) pairs (or of Django objects). This method is called when an object is indexed. All "related objects" are then inserted into a special table called sesql_reindex_schedule.
2. Then, asynchronously, a daemon fetches rows from this table and reindexes the corresponding objects.
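As a rough sketch of the first step, the method can simply list the objects whose index entries depend on the current one. The Author class, its article_ids attribute, and the schedule_related helper below are hypothetical; a plain Python list stands in for the sesql_reindex_schedule table, and a real model would be a Django model class.

```python
class Author:
    """Stand-in for a Django model (hypothetical, for illustration)."""

    def __init__(self, id, name, article_ids):
        self.id = id
        self.name = name
        self.article_ids = article_ids

    def get_related_objects_for_indexation(self):
        # Articles index their author's name, so they need to be
        # reindexed whenever this author is indexed.
        return [("Article", article_id) for article_id in self.article_ids]


def schedule_related(obj, schedule):
    """Append the related (classname, id) pairs of obj to schedule.

    Hypothetical helper: 'schedule' is an in-memory list standing in
    for the sesql_reindex_schedule table.
    """
    getter = getattr(obj, "get_related_objects_for_indexation", None)
    if getter is not None:
        schedule.extend(getter())
    return schedule
```

Objects without the method are left alone, which matches the semi-automatic nature of the mechanism: only models that opt in contribute rows to the schedule.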
Before text fields are indexed, the text has to be cleaned up. In SeSQL, the cleanup is a two-phase process.
The first phase is user-configurable. It should strip the text of all meta-information (HTML tags, wiki syntax, ...) and yield plain text.
The second phase is automatic. It strips all accents, converts upper-case letters to lower-case letters and replaces special characters with spaces.
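To give an idea of what the automatic phase does, here is a minimal sketch using only the Python standard library. It is not SeSQL's own implementation: the function name basic_cleanup is hypothetical, and the choice of which characters count as "special" (anything other than ASCII letters, digits and whitespace) is an assumption.

```python
import re
import unicodedata

def basic_cleanup(text):
    """Strip accents, lower-case, and replace special characters with spaces.

    Hypothetical sketch: 'special' is assumed to mean anything that is
    not an ASCII letter, a digit, or whitespace.
    """
    # Decompose accented characters into base character + combining mark,
    # then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    lowered = without_accents.lower()
    # Each remaining special character becomes a single space.
    return re.sub(r"[^a-z0-9\s]", " ", lowered)
```

With these assumptions, "C'est l'Été" becomes "c est l ete".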
To configure the first phase, you have two options:
1. Specify the ADDITIONAL_CLEANUP_FUNCTION in the configuration file. It should be a function that takes the text as its parameter and returns the cleaned-up text.
2. Add a cleanup parameter to the FullTextField, with a similar function.
The ADDITIONAL_CLEANUP_FUNCTION will be used only if the cleanup parameter was not specified, or set to None.
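The precedence rule above can be sketched as follows. The two cleanup functions and the pick_cleanup helper are hypothetical examples, not part of SeSQL; the regular expression is a deliberately naive way to drop HTML tags and is only meant to show the expected shape of a cleanup function (text in, cleaned-up text out).

```python
import re

def strip_html_tags(text):
    """Naive first-phase cleanup: replace anything that looks like a tag."""
    return re.sub(r"<[^>]+>", " ", text)

def strip_wiki_bold(text):
    """Another example cleanup: remove wiki-style bold markers."""
    return text.replace("'''", "")

# In sesql_config.py this would be the global default.
ADDITIONAL_CLEANUP_FUNCTION = strip_html_tags

def pick_cleanup(field_cleanup=None, default_cleanup=ADDITIONAL_CLEANUP_FUNCTION):
    """Return the per-field cleanup if given, else the global default.

    Hypothetical helper mirroring the rule stated above: the global
    function is used only when the field's cleanup is missing or None.
    """
    return field_cleanup if field_cleanup is not None else default_cleanup
```

A field that passes its own cleanup function keeps it; every other field falls back to the function named in ADDITIONAL_CLEANUP_FUNCTION.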
If you want additional text search configurations to be created when SeSQL tables are created, you can add the SQL lines to an ADDITIONAL_TS_CONFIG variable in the config file. Refer to the PostgreSQL documentation for more details.