Abstract:
This project aims to implement indexing for Web 2.0 applications. An Ajax application consists of a set of states that the user generates through events such as click, focus, and blur. By saving the DOM of each state, we can index information obtained from dynamically generated web content. To prevent duplicate DOM states from being indexed, a tree edit distance algorithm known as Fast Match Edit Script has been implemented.
Search results are ranked with the tf-idf weighting function.
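
As an illustration of tf-idf ranking, the following is a minimal sketch and not the project's actual code. It assumes each saved DOM state has already been reduced to a list of text tokens, and it uses one common tf-idf variant (relative term frequency multiplied by the logarithm of the inverse document frequency). The function name `tf_idf_scores` and the input layout are hypothetical.

```python
import math
from collections import Counter


def tf_idf_scores(query_terms, documents):
    """Score each document against the query with a basic tf-idf sum.

    documents: list of token lists, one per saved DOM state
    (a hypothetical layout chosen for this sketch).
    """
    n_docs = len(documents)

    # Document frequency: number of documents that contain each term.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in documents:
        counts = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if counts[term] == 0:
                continue  # term absent from this document contributes nothing
            tf = counts[term] / len(tokens)           # relative term frequency
            idf = math.log(n_docs / doc_freq[term])   # rarer terms weigh more
            score += tf * idf
        scores.append(score)
    return scores


# Example: three token lists standing in for three indexed DOM states.
states = [
    ["ajax", "state", "index"],
    ["ajax", "dom"],
    ["ranking", "index", "index"],
]
scores = tf_idf_scores(["index"], states)
```

States would then be returned in descending order of score. A term that appears in every state gets an idf of zero under this variant, so it does not influence the ranking.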