1-DAV-202 Data Management 2023/24
Previously 2-INF-185 Data Source Integration
HWcpp
Revision as of 10:27, 12 April 2024
See the lecture
You should implement all functionalities in two versions:
- Pure Python
- Python interface with C++ implementation
Task A
You are given a list of documents. Each document consists of multiple words separated by spaces. You should design and implement an indexing data structure that supports the following three operations:
- Add a document to the index (your index should assign it a unique ID)
- Retrieve a document with the given ID
- Find all documents that contain the given word
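One possible shape for the pure Python version is an inverted index: a mapping from each word to the set of IDs of documents that contain it. The sketch below is only an illustration under stated assumptions, not the required interface: the class name `DocumentIndex` and the method names `add`, `get` and `find` are hypothetical, IDs are assumed to be consecutive integers starting at 0, and words are compared exactly (case-sensitive, no punctuation handling).

```python
from collections import defaultdict


class DocumentIndex:
    """Hypothetical inverted index; all names here are illustrative only."""

    def __init__(self):
        # The position of a document in this list serves as its unique ID.
        self._documents = []
        # word -> set of IDs of the documents that contain the word
        self._postings = defaultdict(set)

    def add(self, text):
        """Store a document and return the ID assigned to it."""
        doc_id = len(self._documents)
        self._documents.append(text)
        # split() with no arguments splits on any run of whitespace.
        for word in text.split():
            self._postings[word].add(doc_id)
        return doc_id

    def get(self, doc_id):
        """Return the text of the document with the given ID."""
        return self._documents[doc_id]

    def find(self, word):
        """Return the sorted IDs of all documents containing the word."""
        # .get() avoids creating an empty entry for an unknown word.
        return sorted(self._postings.get(word, ()))


# Example usage
index = DocumentIndex()
first = index.add("the quick brown fox")
second = index.add("the lazy dog")
print(index.get(second))   # the lazy dog
print(index.find("the"))   # [0, 1]
print(index.find("cat"))   # []
```

With this layout, adding a document costs time proportional to its number of words, and a word lookup does not need to scan the stored documents.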
TODO Python signature and implementation hints, example usage
Task B
Extend your index so that it supports the following query:
- Given a list of words (W1, W2, ...), find all documents that contain every word in the list.
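A multi-word query of this kind can be answered by intersecting per-word sets of document IDs. The sketch below assumes the index keeps a mapping from each word to the set of IDs of documents containing it; the function name `find_all` and the `postings` dictionary are hypothetical, and returning an empty list for an empty query is an assumption, since the task does not specify that case.

```python
def find_all(postings, words):
    """Return the sorted IDs of documents that contain every word in `words`.

    `postings` is assumed to map each word to the set of IDs of the
    documents containing it.
    """
    id_sets = [postings.get(word, set()) for word in words]
    if not id_sets:
        return []  # assumption: an empty query matches nothing
    # Start from the smallest set so intermediate results stay small.
    id_sets.sort(key=len)
    result = set(id_sets[0])  # copy, so the index itself is not modified
    for ids in id_sets[1:]:
        result &= ids
        if not result:
            break  # no document can match any more
    return sorted(result)


# Example usage with a small hand-built mapping
postings = {
    "the": {0, 1, 2},
    "quick": {0, 2},
    "dog": {1, 2},
}
print(find_all(postings, ["the", "quick"]))   # [0, 2]
print(find_all(postings, ["quick", "dog"]))   # [2]
print(find_all(postings, ["quick", "cat"]))   # []
```

Sorting the sets by size first means a rare word quickly narrows the candidates, and a word that occurs in no document ends the search immediately.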
TODO Python signature and implementation hints, example usage