[Soc-coordination] applying for Aptitude search ranking and presentation
KUTLU EMRE YILMAZ
keylmz at gmail.com
Mon Mar 23 21:41:10 UTC 2009
Hi all,
I'm a 4th-year undergraduate student in CS and I'm interested in the Aptitude
search ranking implementation project.
To be fair, I haven't used Xapian before.
My thesis topic is a comparison of the Turkish information retrieval
performance of the Lemur and Terrier toolkits with respect to their different
retrieval algorithms.
I have also used Lucene in my IR course, so I have some background in IR
terminology and believe I can do my best for this project.
As for what I can add to this project: I see that Xapian has the Okapi
algorithm. I can try to improve the ranking of results by trying all the
things that affect IR performance: tokenization, stemming, possibly
misspellings (python - ptyhon), AND/OR-specific boolean queries, or
differently weighted queries.
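
As a rough illustration of the misspelling idea (python - ptyhon), the sketch
below maps a mistyped query to the closest known package name using Python's
standard difflib module. It does not use Xapian; the function name
suggest_package and the sample package list are hypothetical and only serve
the example.

```python
import difflib

def suggest_package(query, known_names, cutoff=0.6):
    """Return the closest known name to `query`, or None if nothing is close.

    `cutoff` is the minimum similarity ratio (0..1) accepted by difflib.
    """
    matches = difflib.get_close_matches(query, known_names, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# Hypothetical package names, for illustration only.
names = ["python", "perl", "ruby"]
print(suggest_package("ptyhon", names))
```

The corrected term could then be fed to the normal search, either instead of
or alongside the original query.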
I also got good results in my experiments with the Lemur tf-idf model weighted
with a modified Okapi weighting function, so I can try different weighting
algorithms.
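
For reference, a minimal pure-Python sketch of Okapi BM25 scoring is given
below. It is neither Xapian's nor Lemur's implementation, and it is not the
modified weighting function from the experiments mentioned above. The
parameter defaults k1=1.2 and b=0.75 and the non-negative form of the idf term
are assumptions chosen for the example.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms` with BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Non-negative idf variant (an assumption of this sketch).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical tokenized package descriptions.
docs = [
    ["java", "sdk", "development", "kit"],
    ["ruby", "rails", "web", "framework"],
    ["java", "runtime"],
]
print(bm25_scores(["java", "sdk"], docs))
```

Trying a different weighting scheme then amounts to swapping the idf or the
normalization line and comparing the resulting rankings.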
Before implementing a new ranking heuristic, I would first like to try the
things I mentioned above, but you are the professionals, and I would be glad
to implement some different unigram language models for Xapian.
I believe that unigrams can give better results for small queries like
package names, and package names don't fluctuate as much as natural language,
where words take on different meanings.
"java sdk" --> the probability of the word "sdk" coming after "java" will be
higher than that of "rails" coming after "java". Maybe I can do this by
modelling the collection; here my collection would be the filenames in the
repository and their explanations.
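
The "java sdk" example can be sketched as a small language model estimated
from the collection. Strictly speaking, the probability of one word following
another is a bigram estimate rather than a unigram one, so the sketch below
counts word pairs. The add-one smoothing, the function name bigram_model, and
the sample texts are all assumptions made for illustration.

```python
from collections import Counter

def bigram_model(texts):
    """Build an add-one smoothed estimator of P(word | prev) from `texts`."""
    vocab, contexts, bigrams = set(), Counter(), Counter()
    for text in texts:
        tokens = text.lower().split()
        vocab.update(tokens)
        contexts.update(tokens[:-1])          # words that have a successor
        bigrams.update(zip(tokens, tokens[1:]))
    vocab_size = len(vocab)

    def prob(prev, word):
        # Add-one (Laplace) smoothing keeps unseen pairs above zero.
        return (bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size)

    return prob

# Hypothetical package names and descriptions standing in for the collection.
texts = [
    "java sdk development kit",
    "java sdk documentation",
    "ruby rails web framework",
]
p = bigram_model(texts)
print(p("java", "sdk"), p("java", "rails"))
```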
Maybe we can create two fields, one for the filename and the other for the
explanation of the package and its job, and then combine these two fields in
search.
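
One simple way to combine the two fields is a weighted sum of per-field
scores, sketched below in plain Python. The field names "name" and
"description", the weights, and the raw term-count scoring are assumptions for
the example; a real implementation would use the search engine's own
weighting for each field.

```python
import re

def tokenize(text):
    """Lowercase `text` and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def field_score(query_terms, field_text):
    """Count how often the query terms occur in one field."""
    tokens = tokenize(field_text)
    return sum(tokens.count(term) for term in query_terms)

def combined_score(query, package, name_weight=3.0, desc_weight=1.0):
    """Weighted sum of the name-field and description-field scores."""
    terms = tokenize(query)
    return (name_weight * field_score(terms, package["name"])
            + desc_weight * field_score(terms, package["description"]))

# Hypothetical packages, for illustration only.
packages = [
    {"name": "java-sdk", "description": "development kit"},
    {"name": "editor", "description": "text editor with java sdk support"},
]
ranked = sorted(packages, key=lambda p: combined_score("java sdk", p), reverse=True)
print([p["name"] for p in ranked])
```

Giving the name field a higher weight reflects the guess that a match in the
package name matters more than a match in its description.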
Please don't hesitate to send your feedback, whether bad or good.
I really wish to work on this project and I hope I have explained myself
well.
Thanks,
Emre