Boolean&VS model

Telechargé par Rihab BEN LAMINE
Boolean Model
Hongning Wang
CS@UVa
Abstraction of search engine architecture
User
Ranker
Indexer
Doc Analyzer
Index results
Crawler
Doc Representation Query Rep
(Query)
Evaluation
Feedback
CS@UVa CS 4501: Information Retrieval 2
Indexed corpus
Ranking procedure
Search with Boolean query
Boolean query
E.g., “obama” AND “healthcare” NOT “news”
Procedures
Lookup query term in the dictionary
Retrieve the posting lists
Operation
AND: intersect the posting lists
OR: union the posting list
NOT: diff the posting list
CS@UVa CS 4501: Information Retrieval 3
Search with Boolean query
Example: AND operation
128
34
2 4 8 16 32 64
1 2 3 5 8 13 21
Term1
Term2
scan the postings
Time complexity: 
Trick for speed-up: when performing multi-way
join, starts from lowest frequency term to highest
frequency ones
CS@UVa CS 4501: Information Retrieval 4
Deficiency of Boolean model
The query is unlikely precise
“Over-constrained” query (terms are too specific): no
relevant documents can be found
“Under-constrained” query (terms are too general):
over delivery
It is hard to find the right position between these two
extremes (hard for users to specify constraints)
Even if it is accurate
Not all users would like to use such queries
All relevant documents are not equally important
No one would go through all the matched results
Relevance is a matter of degree!
CS@UVa CS 4501: Information Retrieval 5
1 / 41 100%

Boolean&VS model

Telechargé par Rihab BEN LAMINE
La catégorie de ce document est-elle correcte?
Merci pour votre participation!

Faire une suggestion

Avez-vous trouvé des erreurs dans linterface ou les textes ? Ou savez-vous comment améliorer linterface utilisateur de StudyLib ? Nhésitez pas à envoyer vos suggestions. Cest très important pour nous !