super*. Clearly, it matches different sets of keywords starting with
"superman dc"~10. This means that results will be shown wherever the two words are encountered within less than
10 words on either end.
.PPT etc. Solr's ExtractingRequestHandler uses Tika, allowing users to upload binary files to Solr and have Solr extract their text and index it, making the files searchable.
HTTP POST. The following
cURL request does the work:
curl "http://localhost:8983/solr/update/extract?literal.id=1&commit=true" -F "myfile=@ps.pdf"
ps.pdf file is sent to Solr so that its content can be extracted.
literal.id=1: assigns the
id=1 to the Solr Document thus created.
commit=true: commits the changes to the Solr index.
myfile=@ps.pdf: This needs to be a valid relative or absolute path.
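The query-string part of the request above can also be assembled programmatically. The following is a minimal Python sketch (not the PHP used in Moodle) that only builds the URL; the helper name build_extract_url and its arguments are assumptions made for illustration, and the file itself would still have to be sent as a multipart POST.

```python
from urllib.parse import urlencode


def build_extract_url(base_url, doc_id, commit=True):
    """Build the update/extract URL for Solr's ExtractingRequestHandler.

    base_url and doc_id are illustrative; literal.id and commit are the
    two parameters explained above.
    """
    params = {"literal.id": doc_id}
    if commit:
        params["commit"] = "true"
    return base_url + "/update/extract?" + urlencode(params)


url = build_extract_url("http://localhost:8983/solr", 1)
print(url)
```

Leaving commit out (commit=False) fits the case where many files are uploaded and a single commit is issued at the end.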
ExtractingRequestHandler
Now, using the
PECL PHP SOLR client in Moodle, there isn't a way to get the extracted content and add it to a Solr document's field. The
cURL request creates an all-new Solr Document specifically for the files and adds content to that Solr Document's fields.
get_content_file_location() function of Moodle, which provides the absolute file path of a file, is protected. But there is a predefined
add_to_curl_request() function that adds a file to the
$curlrequest->_tmp_file_post_params[$key] = '@' . $this->get_content_file_location()
ExtractingRequestHandler in Global Search.
$id of the Solr Document and passing it to the forum's access check function. [Full code]
cURL request, as it would take a lot of time. Hence, after all the documents have been added, I execute
$client->commit at the end.
_get_iterator($from=0) will return a recordset. I've already covered it in Updating Solr Index in Global Search.
SolrInputDocument by including data from the database by specifying fields. An example is shown below:
_get_iterator() will return the record of a particular chapter. Hence, each chapter will be a separate
SolrInputDocument having Solr field
admin to delete the Solr index recently. The code can be seen here.
SolrClient::deleteByQuery. I have provided two ways of deleting the index:
admin select the delete option:
All or let the
admin choose the modules. I made these options available to the
admin through Moodle Quick Forms. Here is a code snippet:
$client->deleteByQuery('*:*') is executed, deleting the entire Solr index.
admin chooses only some modules to delete their index, the names of the modules are concatenated together, separated by a string, stored as a
stdClass $data->module and passed as a parameter into the
search_delete_index function, thus executing
admin is able to re-index. This is done by resetting the values in the
config_plugin table. This is done through the simple code below:
$mods will be a simple array containing the names of all modules or only those modules whose index was selected for deletion.
admin page for Global Search. Here are the three indexing configurations that I've planned to implement:
solr gives us two options:
SolrDocument and re-index the complete document.
iterator will return a recordset having
timemodified from a previous index run, and those records will be re-indexed accordingly. [As implemented by my mentor Tomasz earlier. See wiki]
1000 books in Moodle stored in
courseid : 1. The teacher/admin imports all the books to another course, say
courseid : 2. So re-indexing all 1000 books might not be very useful here. All we need to do is update only
field: 'courseid' of all the books.
set – set or replace a particular value, or remove the value if null is specified as the new value
add – adds an additional value to a list
inc – increments a numeric value by a specific amount
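As an illustration of these modifiers, here is a minimal Python sketch (rather than the post's PHP) that builds the XML body of a partial update changing only the courseid field. The helper name build_atomic_update and the sample document ids are assumptions; the update attribute on a field element is how Solr 4 marks a partial update in XML.

```python
import xml.etree.ElementTree as ET

ALLOWED_MODIFIERS = {"set", "add", "inc"}


def build_atomic_update(doc_ids, field, value, modifier="set"):
    """Return an <add> XML string that updates one field in each document."""
    if modifier not in ALLOWED_MODIFIERS:
        raise ValueError("unsupported modifier: " + modifier)
    add = ET.Element("add")
    for doc_id in doc_ids:
        doc = ET.SubElement(add, "doc")
        # The unique key identifies which indexed document to modify.
        id_field = ET.SubElement(doc, "field", {"name": "id"})
        id_field.text = str(doc_id)
        # Only this field is changed; the others are left as indexed.
        changed = ET.SubElement(doc, "field", {"name": field, "update": modifier})
        changed.text = str(value)
    return ET.tostring(add, encoding="unicode")


payload = build_atomic_update([1, 2], "courseid", 2)
print(payload)
```

The resulting string is the kind of raw XML that could be posted to Solr's update handler; its length grows linearly with the number of document ids passed in.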
PHP approach of doing it but only
SolrClient::request function to send a raw
XML update request to the Solr server. Here is sample code for doing it in PHP.
2GB as defined in
multipartUploadLimitInKB="2048000". Running the above code resulted in a string of
~80KB, so we could easily use it for updating fields in a large set of documents.
SEARCH_ACCESS_DENIED, that particular result will not be shown to the user.
SEARCH_ACCESS_GRANTED, then that record will be further checked to see whether it has been deleted.
SEARCH_ACCESS_DELETED, the index will be updated by deleting that document from the index using
$query->setRows(1000) and check for access. Once we have
100 results to be shown to the user (having
SEARCH_ACCESS_GRANTED), it will stop checking for permission and will terminate, showing those results.
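The filtering flow described above can be sketched roughly as follows. This is a Python illustration, not the Moodle implementation: the numeric values of the three constants, the filter_results helper, and the check_access and delete_from_index callbacks are all hypothetical stand-ins.

```python
# Hypothetical values; only the constant names come from the post.
SEARCH_ACCESS_DENIED = 0
SEARCH_ACCESS_GRANTED = 1
SEARCH_ACCESS_DELETED = 2


def filter_results(candidates, check_access, delete_from_index, limit=100):
    """Collect up to `limit` accessible documents from the candidates."""
    visible = []
    for doc in candidates:
        status = check_access(doc)
        if status == SEARCH_ACCESS_DELETED:
            # The record no longer exists, so drop it from the index.
            delete_from_index(doc)
        elif status == SEARCH_ACCESS_GRANTED:
            visible.append(doc)
            if len(visible) == limit:
                break  # enough results; stop checking permissions
        # SEARCH_ACCESS_DENIED: silently skip the document.
    return visible


# Toy data: 1000 candidate ids with a made-up access rule.
deleted = []
statuses = [SEARCH_ACCESS_DENIED, SEARCH_ACCESS_GRANTED, SEARCH_ACCESS_DELETED]
visible = filter_results(range(1000), lambda d: statuses[d % 3], deleted.append)
print(len(visible), len(deleted))
```

Stopping at the limit means the remaining candidates are never checked, which is the point of fetching 1000 rows but showing only 100.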
Trie in real-life situations (apart from college assignments), so I thought that a post was in order.
schema.xml. Much of the effectiveness and optimization of the search depends on it. The file contains details about the different fields of our documents that will be indexed, and the manner in which the indexing will take place.
Trie for a while. Suppose we have five words:
Trie field types in
4208909997. When Solr indexes this integer, it saves the
hex equivalents at different precision levels. (
precisionStep="12" would result in:
precisionStep="8" would result in:
precisionStep="8" would result in:
FADEDEAD, using a precision step of
8 would result in going through all 16 possibilities to find the record, but just one record in the case of a precision step of
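To make the precision levels concrete, here is a small Python sketch that lists the hex prefixes of a 32-bit value obtained by repeatedly dropping precisionStep low-order bits. It is a simplification for illustration, not Solr's actual encoding; the function name trie_terms and the 32-bit width are assumptions.

```python
def trie_terms(value, precision_step, bits=32):
    """Return the hex prefixes of `value` at each precision level."""
    terms = []
    shift = 0
    while shift < bits:
        remaining_bits = bits - shift
        width = (remaining_bits + 3) // 4  # hex digits needed for the prefix
        terms.append(format(value >> shift, "0{}X".format(width)))
        shift += precision_step
    return terms


print(trie_terms(4208909997, 8))   # ['FADEDEAD', 'FADEDE', 'FADE', 'FA']
print(trie_terms(4208909997, 12))  # ['FADEDEAD', 'FADED', 'FA']
```

A smaller precision step produces more prefixes per value, which means a larger index but fewer terms to visit for a range query.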
php-pecl-solr extension doesn't work with Solr 4. There's a minute difference in the client constructor that makes it flexible to use for