 Version 5

Legacy documentation for PageSeeder v5

Servlet: GenericSearch

com.pageseeder.search.GenericSearch

Description

This servlet has been removed as of PageSeeder v6. Use Service: /groups/{group}/search [GET] instead.

Search the specified index using the Lucene search engine.

Select the index

The current implementation of PageSeeder produces one Lucene index per group. Use the groups parameter to specify which Lucene index to search. The groups parameter can be either a group id or a group name.

Note that, while it is possible to search across multiple groups, this class currently cannot collate the results from different indexes; specifying multiple groups is therefore not recommended.

The question

The main parameters for this servlet are question and fields, which together form the main predicate for the search.

The question is typically one or more terms separated by spaces, and may come directly from the user.

Each term in the question is converted into Lucene Terms for each field specified in the fields parameter. For a full-text search, the fields parameter would typically include the title (pstitle) and content (pscontent). The fields should be indexed by Lucene, and it is preferable that they are analyzed. To generate extracts, they must also be stored.

The Lucene query produced is the equivalent of:

+(field1:term1 field1:term2 field2:term1 field2:term2 ...)
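
As a rough illustration of the expansion above, the following Python sketch builds the equivalent query string from a question and a list of fields. The function name build_question_clause is hypothetical, the question is split on whitespace only, and no analysis or escaping is applied, so this approximates the shape of the query rather than reproducing what the servlet does internally.

```python
def build_question_clause(question: str, fields: list[str]) -> str:
    """Expand each whitespace-separated term of the question over every field."""
    terms = question.split()
    # One field:term pair per combination, grouped by field as in the pattern above
    pairs = [f"{field}:{term}" for field in fields for term in terms]
    if not pairs:
        return ""
    return "+(" + " ".join(pairs) + ")"


print(build_question_clause("lucene search", ["pstitle", "pscontent"]))
# +(pstitle:lucene pstitle:search pscontent:lucene pscontent:search)
```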

Document types

The types parameter restricts which types of Lucene documents can be returned. It is a comma-separated list of values; each value must match the value of a ‘pstype’ field. Valid types include document, comment and task.

The Lucene query produced is the equivalent of:

+(pstype:type1 pstype:type2 ...)

Using facets

This servlet can be used for a faceted search.

Facet cardinality, that is, the number of search results matching a facet within the current results, is calculated automatically based on the values of the fields specified as facets. Use the facets parameter to specify which fields in the index to use; generally, it is preferable to use fields which are indexed but not analyzed.

As an example, if the facets parameter includes the ‘psauthor’ field, this servlet calculates, for each possible value of the ‘psauthor’ field, how many results within the current results match that value.

Specifying facets to compute does not affect the search results.
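
The cardinality calculation described above can be pictured with a small Python sketch. It is only an in-memory illustration: the results are modelled as plain dictionaries, the function name facet_counts is hypothetical, and returning the most frequent values first (capped by a limit that mirrors facet-size) is an assumption rather than documented behaviour.

```python
from collections import Counter


def facet_counts(results: list[dict], facet: str, facet_size: int = 10) -> list[tuple[str, int]]:
    """Count how many results carry each value of the given facet field."""
    # Results without the facet field are simply ignored
    counts = Counter(doc[facet] for doc in results if facet in doc)
    # Assumption: keep the most frequent values, up to facet_size of them
    return counts.most_common(facet_size)


results = [
    {"psauthor": "john", "pstype": "document"},
    {"psauthor": "jane", "pstype": "comment"},
    {"psauthor": "john", "pstype": "task"},
    {"pstype": "comment"},
]
print(facet_counts(results, "psauthor"))  # [('john', 2), ('jane', 1)]
```

Note that counting does not remove anything from the result list, which matches the statement that computing facets does not affect the search results.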

To affect search results, a particular facet must be selected with the select parameter. The select parameter is a comma-separated list of Lucene index terms including the field name: [field]:[term] .

For example, the select parameter value psauthor:john,pspriority:high returns only results by author ‘john’ that have priority ‘high’.

When selecting facets, the Lucene query produced is the equivalent of:

+facet1:value1 +facet2:value2 ...

In the select parameter, commas inside terms can be escaped with \. For example: psauthor:smith\,john,pspriority:high
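
To make the escaping rule concrete, here is a minimal Python sketch that splits a select value into field and term pairs and then renders the equivalent clause. The names parse_select and build_select_clause are hypothetical, and the sketch assumes a backslash is only ever used to escape a comma.

```python
import re


def parse_select(select: str) -> list[tuple[str, str]]:
    """Split a select value into (field, term) pairs, honouring escaped commas."""
    pairs = []
    # Split only on commas that are not preceded by a backslash
    for part in re.split(r"(?<!\\),", select):
        if not part:
            continue
        field, _, term = part.partition(":")
        # Unescape the commas that belong to the term itself
        pairs.append((field, term.replace("\\,", ",")))
    return pairs


def build_select_clause(select: str) -> str:
    """Render every selected facet as a required clause."""
    return " ".join(f"+{field}:{term}" for field, term in parse_select(select))


print(parse_select(r"psauthor:smith\,john,pspriority:high"))
# [('psauthor', 'smith,john'), ('pspriority', 'high')]
print(build_select_clause("psauthor:john,pspriority:high"))
# +psauthor:john +pspriority:high
```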

Date filtering

Results can be filtered by date by using either the from and to parameters or the last parameter.

The from and to parameters take dates in ISO-8601 extended format, for example 2010-10-25 or 2010-10-25T12:26. Open-ended date ranges are possible if only one of from or to is specified.

The last parameter automatically generates the date range based on the current date using the specified duration. The duration must match [span][unit], where [span] is the length of time (digits only) and [unit] is the unit of time; valid units are years, months, days, hours, minutes and seconds.

Examples: 30years, 15days, 2hours.

When filtering by date, the Lucene query produced is the equivalent of:

[psdate:start TO psdate:end]

When using date filtering, only results which have a date are returned; in other words, if the Lucene document does not have a date, it is not considered by the search.
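
As an illustration of how a last value could be turned into the start of a date range, consider the Python sketch below. The function name range_start is hypothetical, and the handling of years and months (stepping back whole calendar months and clamping the day to the length of the target month) is an assumption of this sketch, not a description of how the servlet computes it.

```python
import calendar
import re
from datetime import datetime, timedelta

# Digits followed by one of the documented units
LAST_PATTERN = re.compile(r"(\d+)(years|months|days|hours|minutes|seconds)")


def range_start(last: str, now: datetime) -> datetime:
    """Return the start of the date range implied by a value such as '15days'."""
    match = LAST_PATTERN.fullmatch(last)
    if match is None:
        raise ValueError(f"invalid duration: {last!r}")
    span, unit = int(match.group(1)), match.group(2)
    if unit in ("years", "months"):
        # Assumption: step back whole calendar months and clamp the day
        months = span * 12 if unit == "years" else span
        year, month_index = divmod(now.year * 12 + now.month - 1 - months, 12)
        day = min(now.day, calendar.monthrange(year, month_index + 1)[1])
        return now.replace(year=year, month=month_index + 1, day=day)
    # days, hours, minutes and seconds map directly onto timedelta keywords
    return now - timedelta(**{unit: span})


now = datetime(2010, 10, 25, 12, 26)
print(range_start("15days", now).isoformat())   # 2010-10-10T12:26:00
print(range_start("30years", now).isoformat())  # 1980-10-25T12:26:00
```

The end of the range is the current date, so the filter corresponds to the interval from range_start(last, now) to now.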

Size filtering

The min-size and max-size parameters can be used to filter documents by byte size; values must be positive integers. These parameters only apply to URIs (pstype=document) which have a byte size, such as images and videos.

When filtering by size, the Lucene query produced is the equivalent of:

[pssize:minsize TO pssize:maxsize]

When using size filtering, only results which have a size are returned.

Ranges

The ranges parameter can be used to filter using a set of ranges. It is a comma-separated list of range searches with format:

field:[|{(lower)?;(upper)?}|]

where [ or ] includes the limit value and { or } excludes it. Either limit can be omitted to leave that end of the range open. For example:

ranges=psproperty-expires:[2015-03-20;2016-01-01],psproperty-title:[A;C}
ranges=psxrefcount:{50;],psreversexrefcount:[;10]
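
The range syntax can be unpacked with a short Python sketch. The Range tuple and the parse_ranges function are hypothetical names, and the regular expression assumes that field names contain neither colons nor commas and that limit values contain no semicolons or closing brackets.

```python
import re
from typing import NamedTuple, Optional


class Range(NamedTuple):
    field: str
    lower: Optional[str]   # None when the lower limit is omitted
    upper: Optional[str]   # None when the upper limit is omitted
    include_lower: bool    # True for '[', False for '{'
    include_upper: bool    # True for ']', False for '}'


# field : opening bracket, lower limit ; upper limit, closing bracket
RANGE_PATTERN = re.compile(r"([^:,]+):([\[{])([^;]*);([^\]}]*)([\]}])")


def parse_ranges(value: str) -> list[Range]:
    """Parse a comma-separated list of range searches."""
    ranges = []
    for match in RANGE_PATTERN.finditer(value):
        field, opening, lower, upper, closing = match.groups()
        ranges.append(Range(field, lower or None, upper or None,
                            opening == "[", closing == "]"))
    return ranges


for r in parse_ranges("psxrefcount:{50;],psreversexrefcount:[;10]"):
    print(r)
```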

Organizing search results

Results can be paged and ordered.

The page and page-size parameters control which part of the search results is returned. They must be positive integer values and they default to 1 and 100 respectively.

The sortby parameter is a comma-separated list of fields to sort the results by; the results are sorted by relevance if no sortby parameter is specified.
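
The relationship between page, page-size and the slice of results returned can be sketched in Python as follows. The function name page_bounds is hypothetical, and the sketch assumes pages are numbered from 1, as the default value suggests.

```python
def page_bounds(page: int = 1, page_size: int = 100) -> tuple[int, int]:
    """Return the (start, end) slice indexes for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page-size must be positive integers")
    start = (page - 1) * page_size
    return start, start + page_size


# 250 hypothetical results: the third page of 100 holds the last 50
results = [f"result-{i}" for i in range(1, 251)]
start, end = page_bounds(page=3, page_size=100)
print(results[start:end][0], len(results[start:end]))  # result-201 50
```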

HTTP method: GET

Same as POST method.

HTTP method: POST

Performs a search based on the specified parameters and returns the search results as XML.

HTTP parameters

Name        Required  Type     Default  Description
groups      yes       strings           A list of groups to search (comma-separated list of IDs or names)
question    no        string            The query entered by the user (not a Lucene query)
fields      no        strings           A comma-separated list of fields to search for the question
types       no        enums             A comma-separated list of the types of documents to search
facets      no        strings           A comma-separated list of fields to include as facets
facet-size  no        string   10       The maximum number of facet values returned per facet
ranges      no        strings           A comma-separated list of range searches (format is field:[|{(lower)?;(upper)?}|] where { or } means exclude limit value)
select      no        strings           A comma-separated list of terms to select facets
sortby      no        strings           A comma-separated list of fields to sort the results by; results are sorted by relevance if unspecified
from        no        date              An ISO-8601 date to specify the start of a date range filter
to          no        date              An ISO-8601 date to specify the end of a date range filter
last        no        enum              A duration to filter the results
min-size    no        int               The minimum size (in bytes) the document must have
max-size    no        int               The maximum size (in bytes) the document must have
page        no        int      1        The current page to view
page-size   no        int      100      How many results a page contains
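
As a sketch of how these parameters might be assembled on the client side, the Python example below encodes them into a query string. The function name build_search_params and the group name acme-docs are hypothetical, and the servlet URL is deliberately left out because it is not given here.

```python
from urllib.parse import parse_qs, urlencode


def build_search_params(groups: list[str], options: dict) -> str:
    """Encode the groups and any optional parameters as a query string."""
    params = {"groups": ",".join(groups)}
    for name, value in options.items():
        # List-valued parameters are sent as comma-separated strings
        if isinstance(value, (list, tuple)):
            value = ",".join(str(item) for item in value)
        params[name] = value
    return urlencode(params)


query = build_search_params(
    ["acme-docs"],  # hypothetical group name
    {
        "question": "lucene search",
        "fields": ["pstitle", "pscontent"],
        "types": ["document", "comment"],
        "last": "15days",
        "page-size": 20,
    },
)
print(query)
print(parse_qs(query)["fields"])  # ['pstitle,pscontent']
```

A dictionary is used for the optional parameters because several names, such as page-size and min-size, contain hyphens and cannot be passed as Python keyword arguments.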

Types

Supported types are 

Last

The format of the parameter must be [span][unit], where [span] is the length of time (digits only) and [unit] is the unit of time. Valid units are years, months, days, hours, minutes and seconds.

Examples: 30years, 15days, 2hours, 5years
