When a query call completes normally, it returns the result as a SearchResults object. The results object tells you how many matching documents were found in the index, and how many matched documents were returned. It also includes a list of matching ScoredDocuments. The list usually contains only a portion of the matching documents found, since search returns a limited number of documents each time it's called. By using an offset or a cursor you can retrieve all the matching documents, a subset at a time.
Results
Your calling code should be prepared to handle exceptions that might be raised if the query is invalid or there were problems processing it:
import logging

from google.appengine.api import search

# index and query have already been defined ...
try:
    result = index.search(query)
    total_matches = result.number_found
    list_of_docs = result.results
    number_of_docs_returned = len(list_of_docs)
except search.Error:
    logging.exception('Search failed')
Depending on the value of the limit query option, the number of matching documents returned in the result may be less than the number found. Remember that the number found is only an estimate when the number_found_accuracy query option is less than the number found. No matter how you configure the search options, a search() call will find no more than 10,000 matching documents.
If more documents were found than returned, and you want to retrieve all of them, you need to repeat the search using either an offset or a cursor, as explained below.
Scored documents
The search results will include a list of ScoredDocuments that match the query. You can iterate over the list to process each document in turn:
search_results = index.search(query)
for doc in search_results.results:
    pass  # do work with doc here
By default, a scored document contains all the fields of the original document that was indexed. If your query options specified returned_fields, only those fields appear in the fields property of the document. If you created any computed fields by specifying returned_expressions or snippeted_fields, they appear separately in the expressions property of the document.
Using offsets
If your search finds more documents than you can return at once, use an offset to index into the list of matching documents. For example, the default query limit is 20 documents. After you've executed a search the first time (with offset 0) and retrieved the first 20 documents, retrieve the next 20 documents by setting the offset to 20 and running the same search again. Keep repeating the search, incrementing the offset each time by the number of documents returned:
# index and query_string have already been defined
offset = 0
# initialized so the search is called at least once
number_retrieved = 1
try:
    while number_retrieved > 0:
        # build options and query
        options = search.QueryOptions(offset=offset)
        query = search.Query(query_string=query_string, options=options)
        # run the search for the current offset
        result = index.search(query)
        number_retrieved = len(result.results)
        if number_retrieved > 0:
            offset += number_retrieved
            # ... process the matched docs
except search.Error:
    logging.exception('Search failed')
Offsets can be inefficient when iterating over a very large result set.
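The offset loop can also be wrapped in a generator so that callers see one continuous stream of documents. The following is a minimal sketch under stated assumptions, not part of the Search API: iter_by_offset and the fetch_page callback are hypothetical names. In a real app, fetch_page(offset) would build QueryOptions(offset=offset), call index.search(query), and return result.results. Here a plain list stands in for the index so the example runs anywhere.

```python
def iter_by_offset(fetch_page):
    """Yield every document by calling fetch_page(offset) until a page is empty.

    fetch_page is a hypothetical callback. In an App Engine app it would run
    the search with QueryOptions(offset=offset) and return result.results.
    """
    offset = 0
    while True:
        page = fetch_page(offset)
        if not page:
            break
        for doc in page:
            yield doc
        # Advance by the number of documents actually returned.
        offset += len(page)


# Stand-in for an index: 45 fake documents served 20 at a time
# (20 is the default query limit).
FAKE_DOCS = ['doc-%d' % i for i in range(45)]
PAGE_LIMIT = 20


def fake_fetch_page(offset):
    """Return the slice of fake documents that starts at the given offset."""
    return FAKE_DOCS[offset:offset + PAGE_LIMIT]


all_docs = list(iter_by_offset(fake_fetch_page))
print(len(all_docs))
```

The generator stops on the first empty page, which mirrors the number_retrieved > 0 test in the loop above.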
Using cursors
You can also use cursors to retrieve a subrange of results. Cursors are useful when you intend to present your search results in consecutive pages and you want to be sure you do not skip any documents in the case where an index could be modified between queries. Cursors are also more efficient when iterating across a very large result set.
To use cursors, you must create an initial cursor and include it in the query options. There are two kinds of cursors, per-query and per-result. A per-query cursor causes a separate cursor to be associated with the results object returned by the search call. A per-result cursor causes a cursor to be associated with every scored document in the results.
Using a per-query cursor
By default, a newly constructed cursor is a per-query cursor. This cursor holds the position of the last document returned in the search's results. It is updated with each search. To enumerate all matching documents in an index, repeat the same search, passing in the cursor from the previous result each time, until the result's cursor is None:
# index and query_string have already been defined
cursor = search.Cursor()
try:
    while cursor is not None:
        # build options and query
        options = search.QueryOptions(cursor=cursor)
        query = search.Query(query_string=query_string, options=options)
        # run the search; result.cursor is None once there are no more results
        result = index.search(query)
        number_retrieved = len(result.results)
        cursor = result.cursor
        if number_retrieved > 0:
            pass  # ... process the matched docs
    # all done!
except search.Error:
    logging.exception('Search failed')
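The same control flow can be expressed as a generator that keeps searching until the returned cursor is None. This is only an illustrative sketch: iter_by_cursor and search_page are hypothetical names, not Search API calls. In a real app, search_page(cursor) would run index.search() with QueryOptions(cursor=cursor) and return (result.results, result.cursor). The stand-in below uses an integer position as its cursor purely for demonstration; real cursors are opaque objects.

```python
def iter_by_cursor(search_page, first_cursor):
    """Yield documents page by page until search_page returns a None cursor.

    search_page is a hypothetical callback that takes a cursor and returns
    a (documents, next_cursor) pair.
    """
    cursor = first_cursor
    while cursor is not None:
        docs, cursor = search_page(cursor)
        for doc in docs:
            yield doc


# Stand-in for an index: 45 fake documents served 20 at a time.
FAKE_DOCS = ['doc-%d' % i for i in range(45)]
PAGE_LIMIT = 20


def fake_search_page(cursor):
    """Pretend search: the cursor is an integer position (an assumption)."""
    page = FAKE_DOCS[cursor:cursor + PAGE_LIMIT]
    next_position = cursor + len(page)
    # Return None as the next cursor once the last document has been served.
    next_cursor = next_position if next_position < len(FAKE_DOCS) else None
    return page, next_cursor


cursor_docs = list(iter_by_cursor(fake_search_page, 0))
print(len(cursor_docs))
```

Unlike the offset version, the loop never computes a position itself; it only hands back whatever cursor the previous search produced.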
Using a per-result cursor
To create per-result cursors, set the cursor's per_result property to True when you create the initial cursor. When the search returns, every document will have a cursor associated with it. You can use that cursor to specify a new search with results that begin with a specific document. Note that when you pass a per-result cursor to search, there is no per-query cursor associated with the result itself; result.cursor will be None, so you can't use it to test whether you've retrieved all the matches.
# index and query_string have already been defined
cursor = search.Cursor(per_result=True)
try:
    # build options and query
    options = search.QueryOptions(cursor=cursor)
    query = search.Query(query_string=query_string, options=options)
    result = index.search(query)
    # process the matched docs
    number_retrieved = len(result.results)
    cursor = None
    for doc in result.results:
        # discover some document of interest and grab its cursor
        # (replace ... with your own test)
        if ...:
            cursor = doc.cursor
    # Start the next search from the document of interest
    if cursor is not None:
        options = search.QueryOptions(cursor=cursor)
        query = search.Query(query_string=query_string, options=options)
        result = index.search(query)
except search.Error:
    logging.exception('Search failed')
Saving and restoring cursors
A cursor can be serialized as a web-safe string, saved, and then restored for later use:
cursor_string = cursor.web_safe_string
# Save the string ... and restore:
cursor = search.Cursor(web_safe_string=cursor_string)
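One common reason to serialize a cursor is to carry it in a "next page" link. The sketch below shows one possible way to round-trip the web-safe string through a URL query parameter using Python 3's standard library. The helper names, the /search path, the cursor parameter name, and the example string are all made up for illustration; none of them come from the Search API.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def next_page_url(base_url, cursor_string):
    """Build a link that carries the serialized cursor as a query parameter."""
    return base_url + '?' + urlencode({'cursor': cursor_string})


def cursor_string_from_url(url):
    """Return the cursor string carried by the URL, or None if there is none."""
    values = parse_qs(urlparse(url).query).get('cursor')
    return values[0] if values else None


# A made-up placeholder, not a real cursor value.
example_cursor_string = 'False:AbC-123_xyz'

url = next_page_url('/search', example_cursor_string)
restored = cursor_string_from_url(url)
print(restored == example_cursor_string)
```

In a request handler, the restored string would then be passed to search.Cursor(web_safe_string=...) as shown above; a missing parameter (None) would mean starting from a fresh cursor.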