Prateek Sachan

Indian Institute of Technology Delhi, India

posted on: Wednesday January 22, 2014
The title above is contrary to the usual catchy ones (I was unable to come up with one), and that aptly fits my current objective: to revive an almost dead blog.

I often wonder how people influence our lives, our character, and everyone connected to us. It's just like a huge graph where even a minute change gets transmitted everywhere. One decision, and it gives birth to a whole new universe. There's a constant battle amongst us towards attaining a certain perception. No matter what happens, we always tend towards that version. All our beliefs, our actions, our choices, everything is based on that single hypothetical perception: one of our own absurd creations. Often, we easily get offended if someone perceives us a bit differently. Can't Homo sapiens simply make mistakes, gradually learn and ultimately prosper symbiotically? No one comes with a "Dummies Guide to Knowing Me".

Are there any imaginary points being awarded for recognition somewhere? Everyone wants to be perceived as a certain individual. Everyone wants to see others according to their own designed perceptions. (Sort of a many-to-many mapping, perhaps.) Why the need? What if one is entitled to something even better? What if you wish to change someone's perception for their own happiness? Those are the really stubborn ones: hard to change, hard to convince.

Which is the correct path to stop being offended by the next person? How do we stop ourselves from offending our close ones? It might not even be a mistake. Some people just can't open up that well. There are cultural gaps, generation gaps, etc. As Wittgenstein said in Tractatus Logico-Philosophicus: "The limits of my language mean the limits of my world".

One doesn't really have a choice. They have to fathom all the minute details and dynamically keep modifying their outlook towards everything. You may try ignoring every individual you meet (which is a bit difficult), or you may try winning the Perception Championship throughout your life.
posted on: Tuesday January 21, 2014
About a month ago, I received an e-mail from Packt Publishing asking if I would like to be a technical reviewer for one of their books, “Moodle 2.5 Multimedia”. They had found me through this blog where I've covered my projects on Moodle. I was really excited and immediately said yes.

Having finally read the book, I can say it provides a different outlook on presenting courses with multimedia content embedded in Moodle modules and resources. It will help people provide their audience with a very rich Moodle course experience. As a FOSS lover, I was really happy to see that most examples in the book are based on Open Source applications.

The contents of the book have been laid out nicely to make implementation really simple and interesting. Initially, it explains using images, music and video content in Moodle: enabling users to find, insert and export content. With today's growing technology, it becomes really important to design your courses to boost student participation.

The section on pictures focuses on finding pictures and embedding them in Moodle. Comprehensive procedures with examples from Flickr and Wikimedia are provided. It discusses how one can modify and optimize images for web quality. It also outlines easy steps for creating slideshows and comic strips and successfully importing them into Moodle courses. Similarly, for audio and video content, it teaches how to select the appropriate formats, provides plenty of settings for modifying them, and finally covers integration in Moodle. Frequent "Moodle it!" sections in the book connect the topics at hand back to Moodle.

One of the key parts of the book is integrating content from web-based applications in Moodle. Notable applications include Google Drive (docs, spreadsheets, infographics, etc.), Floorplanner (designing spaces), Mindomo (mind maps), Tiki-toki (timelines), Google Maps and even Prezi. The section has rich illustrations on integrating content made with these web apps into Moodle.

The next chapter demonstrates using multimedia content to create assessment activities: MCQs, lessons, puzzles, quizzes, etc. It also covers creating interactive exercises using the Hot Potatoes software. The chapter after that focuses on real-time communication and collaboration through Google Hangouts, also covering desktop interactions and remote desktop functionality.

The last few chapters emphasize FOSS: licensing one's work, laying down references and avoiding plagiarism. The book ends by listing some fascinating Moodle plugins.

The procedures in the book are brilliantly presented, with the authors explaining all concepts clearly and logically. The content flows logically throughout, which makes the book an easy read and hence easy to use. It would really help enhance the whole teacher-course-student experience.
posted on: Friday August 02, 2013
These past few days I've been busy trying to come up with a working product with the features that I'd earlier planned to include in this version of Global Search.

Finally, I'm happy to have completed this milestone. Both my mentors, Tomasz and Aparup, have guided me well in this project, clearing my doubts every now and then. Tomasz has even installed the Global Search plugin on his website. You may try it out.

Feel free to clone my Moodle gs2 branch and try out the product. I need developers to try it out and test it for any security leaks. The wiki has been updated with the complete procedure for setting up Global Search. Feel free to contact me with your feedback and comments.

I'm including screenshots of some advanced search queries. The example here indexes two pages and one PDF file generated from the Superman Wikipedia page. It takes into account Moodle's Wiki module.

The first screenshot shows a normal search for superman returning 3 results.

The second screenshot shows a wildcard search for super*. Clearly, it matches different sets of keywords starting with super.

The third screenshot shows an example of a proximity search: "superman dc"~10. This means results will be returned wherever the two words occur within 10 words of each other.

The remaining screenshots show boolean searches. You can clearly figure out the differences between them.
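For reference, the queries in these screenshots use Lucene's standard query-parser syntax; a few illustrative query shapes (the example matches are hypothetical):

```
superman              – plain keyword search
super*                – wildcard: any term beginning with "super"
"superman dc"~10      – proximity: both words within 10 words of each other
superman AND dc       – boolean: both terms must match
superman NOT dc       – boolean: exclude documents containing "dc"
```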

My next task will be to design a search UI page. Suggestions are welcome. You may contact me directly or post on Moodle's developer forum.
posted on: Monday July 15, 2013
This week I integrated Apache Tika into Moodle to support indexing of rich documents (.pdf, .doc, .ppt, etc.). Solr's ExtractingRequestHandler uses Tika, allowing users to upload binary files to Solr and have Solr extract the text, index it, and make it searchable.

One has to send the file to Solr via HTTP POST. The following cURL request does the work:

curl "http://localhost:8983/solr/update/extract?literal.id=1&commit=true" -F "myfile=@ps.pdf"

The ps.pdf file is sent to Solr so that content can be extracted from it.
literal.id=1: Assigns id=1 to the Solr Document thus created.
commit=true: Commits the changes to the Solr index.
myfile=@ps.pdf: This needs to be a valid relative or absolute path.

Refer to the Solr wiki for more options on ExtractingRequestHandler. Now, using the PECL PHP Solr client in Moodle, there isn't a way to get the extracted content and add it to an existing Solr document's field. The cURL request creates an all-new Solr Document specifically for the files and adds content to that Document's fields.
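For illustration, the query string for that cURL request could be assembled in PHP first (a sketch: search_build_extract_url is a hypothetical helper, and commit=false here reflects committing once at the end rather than per request):

```php
<?php
// Hypothetical helper: build the ExtractingRequestHandler URL for one file.
// literal.id assigns the id of the Solr Document that will be created;
// committing is deferred so many files can be sent before a single commit.
function search_build_extract_url(string $baseurl, int $docid): string {
    $params = http_build_query([
        'literal.id' => $docid,
        'commit'     => 'false',
    ]);
    return $baseurl . '/update/extract?' . $params;
}

// The file itself is then attached to the cURL request as a POST field,
// e.g. 'myfile' => '@/absolute/path/to/ps.pdf'.
echo search_build_extract_url('http://localhost:8983/solr', 1), "\n";
```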

Also, Moodle's get_content_file_location() function, which stores the absolute filepath of files, is protected. But there is a predefined add_to_curl_request() function that adds a file to the cURL request:
$curlrequest->_tmp_file_post_params[$key] = '@' . $this->get_content_file_location();

So, keeping all these things in mind I had to come up with the following logic for including the feature of indexing Rich Documents via ExtractingRequestHandler in Global Search.

The access rights will be checked by extracting the $id of the Solr Document and passing it to the forum's access check function. [Full code]

And, here's the code that I've written for the Forum Module.

The above code sends the external files to Solr to extract content and create new Solr Documents. I'm not committing the Documents after each cURL request, as that would take a lot of time. Hence, after all the documents have been added, I execute $client->commit() at the end.
posted on: Sunday June 30, 2013
This week I started writing the Search API functions for Global Search. The idea is to code 3 functions for each module. These will be written in the module's lib.php file.

The first two functions are used while indexing records, while the last one is used to check user permissions before displaying search results.

The admin has the option to enable a particular module/resource to support Global Search through settings. You may view the code here.

The first function, _get_iterator($from=0), returns a recordset. I've already covered it in Updating Solr Index in Global Search.

The second function, _search_get_documents($id), creates a SolrInputDocument from data in the database, specifying the fields to be indexed. An example is shown below:

The tricky part is structuring our indexed records correctly. For example, for the book module, _get_iterator() will return the record of a particular chapter. Hence, each chapter will be a separate SolrInputDocument whose Solr id field holds the chapterid.
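To make that document structure concrete, here is a rough sketch for the book module, using a plain PHP array in place of a SolrInputDocument so it runs without the pecl-solr extension (the property and field names are illustrative, not Moodle's actual schema):

```php
<?php
// Sketch: turn one book chapter record into an indexable document.
// Each chapter becomes a separate document whose id is the chapterid.
function book_search_get_documents(object $chapter): array {
    return [
        'id'       => $chapter->id,                   // solr field id -> chapterid
        'module'   => 'book',
        'title'    => $chapter->title,
        'content'  => strip_tags($chapter->content),  // index plain text only
        'modified' => $chapter->timemodified,
    ];
}

$chapter = (object) [
    'id'           => 42,
    'title'        => 'Chapter One',
    'content'      => '<p>Some chapter text</p>',
    'timemodified' => 1372500000,
];
print_r(book_search_get_documents($chapter));
```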

The third function maintains security by checking Moodle capabilities and restricting access to prohibited search results. I've already discussed Global Search security in Handling security in Global Search.
posted on: Thursday June 20, 2013
I recently implemented the functionality allowing the admin to delete the Solr index. The code can be seen here.

Solr provides a simple way of deleting the index using SolrClient::deleteByQuery. I have provided two types of index deletion:
• Deleting the complete index in one go.
• Deleting the index modularly (for example, deleting the index for records belonging to the book and page modules only).

The idea was to let the admin either select the All option or choose individual modules. I made these options available through Moodle Quick Forms. Here is a code snippet:
If the admin checks the all checkbox, $client->deleteByQuery('*:*') is executed, deleting the entire Solr index.

If, on the other hand, the admin chooses only some modules, their names are concatenated into a single string, stored in a stdClass as $data->module, and passed as a parameter to the search_delete_index function, which executes $client->deleteByQuery('modules:' . $data->module).

That's the first part: deletion.
After deletion, I need to handle the config settings so that the admin is able to re-index. This is done by resetting the values in the config_plugin table, through the simple code below:

Here, $mods will be a simple array containing the names of all modules or only those modules whose index was selected for deletion.
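The two deletion modes above could be sketched as a small query builder (an illustration, not the actual plugin code; grouping the selected modules with OR is one way to form the field query):

```php
<?php
// Hypothetical helper: build the deleteByQuery string for either mode.
function search_build_delete_query(array $mods, bool $all = false): string {
    if ($all) {
        return '*:*';  // delete the complete index in one go
    }
    // Delete only documents whose 'modules' field matches a selected module.
    return 'modules:(' . implode(' OR ', $mods) . ')';
}

echo search_build_delete_query([], true), "\n";
echo search_build_delete_query(['book', 'page']), "\n";
// Real code would pass the result to $client->deleteByQuery(...) and commit.
```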
posted on: Wednesday June 19, 2013
Last week, I started coding the admin page for Global Search. Here are the three indexing configurations that I've planned to implement:
• Adding new documents. (This will be written such that the indexing is resumed from a previous run).
• Deleting index.
• Updating the index for updated records.

For updating the index after a record is updated/changed, Solr gives us two options:
• Treat the "updated" record as a whole new SolrDocument and re-index the complete document.
• Perform a partial update by re-indexing only that field which was updated.

The first approach outlined above is pretty simple. The iterator will return a recordset of records whose timemodified is later than the previous index run, and those records will be re-indexed accordingly. [As implemented by my mentor Tomasz earlier. See wiki]

The second approach was introduced by Solr only recently. It could be very useful where thousands of documents have been updated at once and the first approach would consume a lot of time.

Let's take an example. Suppose we have 1000 books in Moodle stored in courseid: 1. The teacher/admin imports all the books into another course, say courseid: 2. Re-indexing all 1000 books might not be very useful here; all we need to do is update the 'courseid' field of all the books.

Solr supports several modifiers that atomically update values of a document.

• set – sets or replaces a particular value, or removes the value if null is specified as the new value
• add – adds an additional value to a list
• inc – increments a numeric value by a specific amount

However, there's no specific PHP way of doing this; atomic updates are supported only via XML and JSON.
Hence, I will have to use the SolrClient::request function to send a raw XML update request to the Solr server. Here is a sample of doing it in PHP.
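A minimal sketch of assembling that raw XML (search_build_atomic_update is a hypothetical helper; the payload shape follows Solr 4's atomic-update XML format with the set modifier, matching the courseid example above):

```php
<?php
// Hypothetical helper: build an atomic-update payload that sets one field
// of an existing document, identified by its id.
function search_build_atomic_update(int $id, string $field, string $value): string {
    return '<add><doc>'
         . '<field name="id">' . $id . '</field>'
         . '<field name="' . $field . '" update="set">' . htmlspecialchars($value) . '</field>'
         . '</doc></add>';
}

$xml = search_build_atomic_update(7, 'courseid', '2');
echo $xml, "\n";
// $client->request($xml);  // would need a running Solr server and pecl-solr
```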

This is followed by these commands:


One thing to keep in mind is that the string above should stay below the upload limit defined in solrconfig.xml: multipartUploadLimitInKB="2048000". Running the above code resulted in a string of ~80KB, so we could easily use it for updating fields in a large set of documents.

However, I still have to discuss this second approach with my mentors, probably this week, on how to implement it in Global Search.
posted on: Sunday June 16, 2013
Handling security issues will be an integral part of Global Search. The last thing we want is users gaining access to prohibited records through search; it would be a huge blow to the project if users could view documents they are not permitted to see. The solution will be to filter the results after receiving the XML format of the query response from the Solr server.

Here, I will be using 3 cases for every search result:

For every result, I will check whether the user has access to view it. If the user doesn't have access (SEARCH_ACCESS_DENIED), that particular result will not be shown to the user.

Alternatively, if the user is found to have permission to view a particular result (SEARCH_ACCESS_GRANTED), that record will be further checked to see whether it has been deleted.
• If it has been deleted (SEARCH_ACCESS_DELETED), the index will be updated by removing that document using deleteByQuery('id:' . $docid).
• If the record still exists, the result is then displayed to the user.

We will fetch only 1000 results from the Solr response object per query ($query->setRows(1000)) and check access for each. Once we have 100 results to show the user (those with SEARCH_ACCESS_GRANTED), the loop stops checking permissions and those 100 results are displayed.
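The filtering loop described above could be sketched as follows (standalone PHP: the constant names mirror the post, while the stubbed result set, the access-check callback and the helper name are hypothetical; the real code would work on the Solr response object):

```php
<?php
// Sketch of the post-query security filter for Global Search.
const SEARCH_ACCESS_DENIED  = 0;
const SEARCH_ACCESS_GRANTED = 1;
const SEARCH_ACCESS_DELETED = 2;
const SEARCH_MAX_RESULTS    = 100;  // display cap described above

function search_filter_results(array $results, callable $check): array {
    $shown = [];
    foreach ($results as $doc) {
        $access = $check($doc);
        if ($access === SEARCH_ACCESS_DELETED) {
            // The real code would also purge the stale document:
            // $client->deleteByQuery('id:' . $doc['id']);
            continue;
        }
        if ($access === SEARCH_ACCESS_GRANTED) {
            $shown[] = $doc;
            if (count($shown) === SEARCH_MAX_RESULTS) {
                break;  // enough results to display; stop checking
            }
        }
    }
    return $shown;
}

// Hypothetical usage: 1000 fetched results, access granted to even ids only.
$docs  = array_map(fn($i) => ['id' => $i], range(1, 1000));
$shown = search_filter_results(
    $docs,
    fn($d) => $d['id'] % 2 === 0 ? SEARCH_ACCESS_GRANTED : SEARCH_ACCESS_DENIED
);
```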
posted on: Sunday June 09, 2013
Well, this particular post is dedicated to Data Structures. It will be the first time I implement a Trie in a real-life situation (apart from college assignments), hence I thought a post was in immediate order.

When integrating the Apache Solr search engine, one of the most important files is schema.xml. A lot of the effectiveness and optimization of search depends upon it. The file contains details about the different fields of our documents that will be indexed, and the manner in which the indexing will take place.

So, let's talk about Tries for a while. Suppose we have five words:

The above five words could be implemented in the following manner:

Now, Solr uses this structure to index documents. Following is the declaration of Trie field types in schema.xml.

Suppose I want to index the integer 4208909997. When Solr indexes this integer, it saves the hex equivalents at different precision levels. (4208909997 = FADEDEAD in hex.)

A precisionStep="12" would result in:

A precisionStep="8" would result in:

A precisionStep="4" would result in:

Now, if Solr has to search from FADEDEAx to FADEDEAD, using a precision step of 8 would mean going through all 16 possibilities to find the record, but just one in the case of a precision step of 4.
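In spirit, the prefixing works by dropping low-order bits of the value at each coarser level; a quick PHP illustration (this mirrors the example above, though Solr's actual trie encoding differs in detail):

```php
<?php
// With precisionStep = 8 (two hex digits per step), 0xFADEDEAD is indexed
// alongside its progressively shorter prefixes.
$value = 4208909997;  // 0xFADEDEAD
foreach ([0, 8, 16, 24] as $shift) {
    echo strtoupper(dechex($value >> $shift)), "\n";
}
// Prints FADEDEAD, FADEDE, FADE, FA — one line per precision level.
```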

So clearly, a small precision step makes queries fast but also increases the index size. Hence, I will have to test different cases to come up with the "perfect" schema for Solr.
posted on: Saturday June 08, 2013
This week I got started integrating search into Moodle core, writing code for search within Moodle's page module. I decided to quickly pick a module and test Solr with it; things become easy when you actually do stuff and see it happening in front of you. This also gave me the advantage of laying down the search API structure. Thanks to my mentor Aparup for guiding me with it.

The idea is to use the php-pecl-solr extension. It's faster, as it's built into PHP itself. Some major advantages:
• It's wholly written in C, and hence extremely fast compared to other PHP Solr client libraries.
• An object-oriented API that makes it easier and more efficient for developers to communicate and interact with the Apache Solr server.
• Documented in the PHP manual.

However, one has to have a dedicated server to install this extension.
There were many doubts concerning integrating such an extension into Moodle, chiefly that it relies on the server: Moodle sites installed on shared servers cannot use this search feature, and the server also needs Java hosting for Solr. However, my mentor Tomasz has previously tried implementing a search feature in Moodle and informs me that there are no search engines written in PHP (Moodle is based on PHP). I feel that as Moodle is largely used by colleges and universities across the world that have their own dedicated servers, using the extension is a safe bet to some extent. Something is always better than nothing.

In the future, after seeing the success of this first version of Global Search, we will obviously make it flexible enough to avoid the dependency on the server.

The current official version of the php-pecl-solr extension doesn't work with Solr 4; a minute difference in the client constructor determines whether it works with Solr 3.x or Solr 4.x.

However, a patch is available and will be merged into the official stable release in the future.

I'm maintaining a complete procedure for installing this extension on Moodle docs.

Feel free to go through, or drop in on, the discussion here.
posted on: Tuesday May 28, 2013
The GSoC results were announced yesterday. After weeks of anticipation, anxiety and horror (yes!), it was exhilarating to see my proposal for Global Search accepted by Moodle. Two months of hard work, sleepless nights, bunking lectures and screwing up my exams didn't go to waste after all. No matter how hard you try or work, you're always pessimistic about the things that aren't under your own control.

I love Open Source; it's become sort of an addiction lately. The major reason is the awesome developers. Hats off to them. They are so smart, witty and opinionated. Like a piece of optimized code, their replies too are exact, precise and just to the point. Nothing less, nothing more. They'll outsmart you every time with their double-meaning wit. Also, they can talk indefinitely about anything and everything: from earthquakes to cats and from North Korea to nuclear power.

I first came to know about Moodle by using it in college; it is currently installed on my college intranet server for course management. I want to thank the Moodle selection team for selecting my proposal for the project, and the whole Moodle community for giving such great feedback and help whenever I needed it. I would also like to deeply thank Aparup Banerjee and Tomasz Muras for helping me out and tolerating my random queries from time to time.

For the next three-four months, I will be involved in developing a full-site search in Moodle. The idea is to adopt the widely-used Open Source Solr search platform from the Apache Lucene Project and integrate it in Moodle.