git clone https://github.com/amandaay/IndexingNoSQLDB.git
- To compile,
clang++ -std=c++17 BTree.cpp BTreeTest.cpp -o <name-you-want-to-give>
org++ -std=c++17 BTree.cpp BTreeTest.cpp -o <name-you-want-to-give>
./<name-you-want-to-give> <testFile.txt> <NODESIZE>
, e.g.<name-you-want-to-give>
=BTreeTest
- In this program, user will have to feed in a textfile similar to
DemoTestFile.txt
to implement a BTree. i.e. The textfile must be a set of integers separated by commas. - After building the BTree, the user can input values to be searched.
No known bugs so far.
- NodeId cannot be -1 according to our implementation.
- If user does not provide a nodesize, the default value will be 5.
- NodeId cannot be -1.
- Assume keys are unique without duplicates.
- Assume nodesize is greater than 2.
https://www.baeldung.com/cs/b-trees-vs-btrees
https://www.programiz.com/dsa/b-tree
https://stackoverflow.com/questions/28846377/what-is-the-difference-btw-order-and-degree-in-terms-of-tree-data-structure
https://stackoverflow.com/questions/50800605/using-binary-search-in-b-tree#:~:text=In%20practice%2C%20a%20linear%20search,followed%20by%20a%20linear%20search.
- To compile,
g++ -std=c++17 NoSQLDatabase.o BTree.cpp main.o -o NoSQLDb
orclang++ -std=c++17 NoSQLDatabase.cpp BTree.cpp main.cpp -o NoSQLDb
- To run the program,
./NoSQLDb
. - Prompt
NoSQL >
will be shown to enter desired commands.
- Test file should always have:
- Header
- Endline ("\n") at the end of each line or record
- First character or integer before the first comma (",") is our key
-
Each data record has a maximum of 40 bytes, bfr = 256/40 = 6 records
-
Each block has a maximum of 256 bytes
-
Index block allocation assumption:
- child block # 5 bytes
- key, block # (8 bytes, 5 bytes)
- index blocking factor (bfri) for index block: (256 - 5) / (8 + 5 + 5) = 13 key
- Directory Structure Assumptions:
- Database name: 50 bytes
- Total size: 10 bytes
- Number of total files (PFS): 2 bytes
- Block size: 3 bytes
- Number of uploaded files: 1 byte
Assume that there are 3 FCBs:
- filename: 50 bytes
- filesize: 10 bytes
- date/time: 20 bytes
- start block: 5 bytes
- Number of blocks used: 5 bytes
- Starting block for index (i.e. root): 5 bytes
Bit Map:
-
Number of blocks = 1 Mbytes / block size = 1024 * 1024 / 256 = 4096 blocks
-
Each block has 1 or 0. (bit)
-
0 indicates a free block, 1 indicates that the block is allocated.
-
Total blocks in a PFS / BLOCK_SIZE = 4096 / 256 = 16 blocks
-
Thus, we will use 16 blocks to represent.
-
Assume that metadata uses 1 block, each FCB uses 1 block (3 * 1 block), bitmap uses 16 blocks
-
The directory structure takes around 20 blocks.
-
Every PFS File has a directory structure.
-
For find command:
- Number of block accessed: Read fcbs + (every index block from the root) + Data Block
https://docs.google.com/presentation/d/11No_ByUtp9IdFpWdFastP0NVLu4G1fSZGQK9316PIJQ/edit?usp=sharing
So Man Amanda Au-Yeung, NUID: 001673193
Chin Yuen Au (Isaac), NUID: 002639665
Moreno Gong (Gongtianchou)
Group 1