What I learned about information retrieval in one week
October 19th, 2008
It has been about a week since I began a deeper study of information retrieval. It all started with a new course on the subject at my university, and I fell in love with it almost immediately. It really got me interested, and I began doing some experiments (one involves Django as well, keep reading to know more).
This week I learned a lot about information retrieval, text categorization, natural language processing and machine learning. But the most relevant lesson is this: the principles are easy, their implementation is not. Most of the techniques are relatively simple, but you usually have to deal with very large datasets, and that can be challenging because one of the main requirements of information retrieval is speed. It’s much more important to return fewer results in one second than better results in one hour. No one will care to use your system if it takes an hour to return a result. And if you’re considering storing your data in a database, forget about normalization: the extra joins a normalized schema needs tend to cost you at query time, so it won’t really get you anywhere.
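To make the normalization point a bit more concrete, here is a minimal sketch of what a denormalized layout could look like. Everything in it is an assumption for illustration only: the use of Python's built-in sqlite3 module, the `postings` table, and the `build_index` and `search` helpers are hypothetical names, not the setup from my experiments. The idea is simply that the term text and the document title are repeated on every row, so a lookup is a single indexed query with no joins.

```python
import sqlite3


def build_index(docs):
    """Build an in-memory, denormalized term -> document table.

    `docs` is a list of (title, text) pairs (hypothetical input format).
    """
    conn = sqlite3.connect(":memory:")
    # One flat table: term and title are stored redundantly on every row,
    # trading disk space for join-free lookups.
    conn.execute(
        "CREATE TABLE postings (term TEXT, doc_id INTEGER, title TEXT, tf INTEGER)"
    )
    conn.execute("CREATE INDEX idx_term ON postings (term)")
    for doc_id, (title, text) in enumerate(docs):
        # Count how often each term occurs in this document.
        counts = {}
        for word in text.lower().split():
            counts[word] = counts.get(word, 0) + 1
        conn.executemany(
            "INSERT INTO postings VALUES (?, ?, ?, ?)",
            [(term, doc_id, title, tf) for term, tf in counts.items()],
        )
    conn.commit()
    return conn


def search(conn, term):
    """Return (title, term frequency) pairs, most frequent first."""
    rows = conn.execute(
        "SELECT title, tf FROM postings WHERE term = ? ORDER BY tf DESC, doc_id",
        (term.lower(),),
    )
    return rows.fetchall()


if __name__ == "__main__":
    conn = build_index([("A", "django search django"), ("B", "search engine")])
    print(search(conn, "django"))
```

A normalized design would keep terms, documents and occurrences in separate tables and join them on every query. The flat table wastes space, but it keeps the read path as short as possible, which is the trade-off that matters when response time is the main requirement.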