Boolean Retrieval Model

Ramishamukhtar
3 min read · Aug 11, 2023

Introduction:

The Boolean retrieval model is the most basic information retrieval model. Although it is generic, it can be extended to a variety of formats. The model is based on Boolean logic, and the data structure it uses is either a term-document matrix or an inverted index.

Below are the theoretical assumptions of Boolean Retrieval Model:
- User-specific features can be extracted from documents, since all documents are assumed to be machine readable.
- All documents are composed of those features.
- Users are assumed to have cognition-based knowledge of the features.
- Users are assumed to form Boolean queries to express their information needs.

Model:

Term-Document Matrix:

Term: a feature (such as a word) that occurs in a document.

Document: a file in the collection.

Document-Term Matrix:

[Figure: Document-Term Matrix]

Term-Document Matrix:

[Figure: Term-Document Matrix]
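To make the two layouts concrete, here is a minimal Python sketch that builds a term-document incidence matrix (one row per term, one column per document) and then transposes it into a document-term matrix. The toy corpus, the document IDs, and the function name `build_term_document_matrix` are assumptions made for illustration, and tokenization is simplified to lowercasing and splitting on whitespace.

```python
# Toy corpus; document IDs and texts are made up for illustration.
docs = {
    "d1": "android file system",
    "d2": "file system design",
    "d3": "information retrieval system",
}

def build_term_document_matrix(docs):
    """Return (terms, doc_ids, matrix); matrix[i][j] is 1 if terms[i] occurs in doc_ids[j]."""
    doc_ids = sorted(docs)
    # Naive tokenization: lowercase and split on whitespace.
    tokens = {doc_id: set(text.lower().split()) for doc_id, text in docs.items()}
    terms = sorted(set().union(*tokens.values()))
    matrix = [[1 if term in tokens[d] else 0 for d in doc_ids] for term in terms]
    return terms, doc_ids, matrix

terms, doc_ids, matrix = build_term_document_matrix(docs)

# Term-document matrix: one row per term.
for term, row in zip(terms, matrix):
    print(f"{term:12s} {row}")

# Document-term matrix: the transpose, one row per document.
doc_term = [list(col) for col in zip(*matrix)]
for doc_id, row in zip(doc_ids, doc_term):
    print(f"{doc_id:12s} {row}")
```

Each cell only records whether a term is present, not how often it occurs, which is all the Boolean model needs.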

Boolean Queries: Queries are composed of terms and Boolean operators (AND, OR, NOT).

Boolean Query Processing:
1- Find the location of each query term in the term-document matrix.
2- Get the entire row for each term.
3- Apply the Boolean operations to the rows, one pair of terms at a time.
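The three steps above can be sketched in Python as bitwise operations on matrix rows. The matrix contents, the document IDs, and the helper names (`row`, `bool_and`, `bool_or`, `bool_not`, `matching_docs`) are hypothetical and chosen only for this example.

```python
# Rows of a hypothetical term-document matrix over documents d1, d2, d3.
DOC_IDS = ["d1", "d2", "d3"]
MATRIX = {
    "android": [1, 0, 0],
    "file":    [1, 1, 0],
    "system":  [1, 1, 1],
}

def row(term):
    # Steps 1 and 2: locate the term and fetch its entire row.
    # A term that is not in the matrix matches no document.
    return MATRIX.get(term, [0] * len(DOC_IDS))

def bool_and(a, b):
    return [x & y for x, y in zip(a, b)]

def bool_or(a, b):
    return [x | y for x, y in zip(a, b)]

def bool_not(a):
    return [1 - x for x in a]

def matching_docs(vector):
    # Translate a 0/1 result vector back into document IDs.
    return [doc_id for doc_id, bit in zip(DOC_IDS, vector) if bit]

# Step 3: evaluate "file AND system AND NOT android" pair by pair.
result = bool_and(bool_and(row("file"), row("system")), bool_not(row("android")))
print(matching_docs(result))  # ['d2']
```

Here `file AND system` gives `[1, 1, 0]`, `NOT android` gives `[0, 1, 1]`, and combining the two leaves only `d2`.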

Inverted Index:

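In an inverted index, the data structure consists of a dictionary and postings. The dictionary holds the terms, and each term points to a postings list of the documents that contain it.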

[Figure: Dictionary and Postings]
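As a rough illustration of the dictionary and postings structure, the sketch below builds an inverted index as a Python dict that maps each term to a sorted list of document IDs, then answers an AND query by merging two postings lists. The corpus, the integer document IDs, and the function names `build_inverted_index` and `intersect` are assumptions for this example, not part of any particular library.

```python
from collections import defaultdict

# Toy corpus keyed by integer document ID (made up for illustration).
docs = {
    1: "android file system",
    2: "file system design",
    3: "information retrieval system",
}

def build_inverted_index(docs):
    """Map each term (dictionary entry) to a sorted list of doc IDs (its postings)."""
    index = defaultdict(list)
    for doc_id in sorted(docs):  # ascending IDs keep every postings list sorted
        for term in set(docs[doc_id].lower().split()):
            index[term].append(doc_id)
    return dict(index)

def intersect(p1, p2):
    """AND of two sorted postings lists, walking both lists in a single pass."""
    i = j = 0
    out = []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            out.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return out

index = build_inverted_index(docs)
print(index["file"])                               # [1, 2]
print(intersect(index["file"], index["system"]))   # [1, 2]
```

Unlike the matrix, the index stores only the documents in which a term actually occurs, so it stays compact when most terms are absent from most documents.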

Advantages of Boolean Retrieval Model:
- This model is easy to implement.
- This model has a clear mathematical formalism.

Disadvantages of Boolean Retrieval Model:
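- Flat results: the retrieved documents are not ranked.
- Equally weighted terms: every term in the query counts the same.
- Hard query formation: users must express their need as a Boolean expression.
- Exact match: a document either satisfies the query or it does not; partial matches are not returned.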
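The exact-match and flat-result limitations listed above can be seen in a small Python sketch. The postings sets and the helper names `search_and` and `search_or` are hypothetical.

```python
# Hypothetical postings, stored as sets of document IDs.
postings = {
    "android": {1},
    "file": {1, 2},
    "system": {1, 2, 3},
    "retrieval": {3},
}

def search_and(*terms):
    # A document must contain every term to be returned.
    sets = [postings.get(t, set()) for t in terms]
    return sorted(set.intersection(*sets)) if sets else []

def search_or(*terms):
    # A document is returned if it contains any of the terms.
    return sorted(set().union(*(postings.get(t, set()) for t in terms)))

# Document 1 contains two of the three terms, yet AND returns nothing.
print(search_and("android", "file", "retrieval"))  # []

# OR returns every document, in ID order, with no indication of
# which one matches the query best.
print(search_or("android", "file", "retrieval"))   # [1, 2, 3]
```

The AND query is too strict and the OR query is too loose, and in both cases the result is an unordered set of matches rather than a ranked list.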

Queries in Information Retrieval:

1- Bi-word Query: A phrase query having exactly two terms, e.g., “file system”
2- General Phrase: A phrase query having more than two terms, e.g., “android file system”
3- Free Text Query: A query that is composed of free text, e.g., “where to find new folder created in gallery in android file system”
4- Single-Term Query: A query having one term only
5- Multi-Term Query: A query having more than one term
6- Boolean Query: A query that is composed of multiple terms and Boolean operations, e.g., “information AND retrieval”

The model’s strength lies in the precise control it gives over queries, but it may yield results that are either too narrow or too broad. In essence, the Boolean retrieval model uses logical operations to retrieve the documents that match user-defined conditions, and it forms the basis for more complex information retrieval systems.

Want to learn more about Information Retrieval? Stay connected and subscribe. Feel free to write suggestions as well in the comments below!

Thank you!
