diff --git a/docs-2.0/1.introduction/1.what-is-nebula-graph.md b/docs-2.0/1.introduction/1.what-is-nebula-graph.md index 0da2a1f42c6..a62a953a4ee 100644 --- a/docs-2.0/1.introduction/1.what-is-nebula-graph.md +++ b/docs-2.0/1.introduction/1.what-is-nebula-graph.md @@ -14,7 +14,7 @@ Graph databases are well suited for storing most kinds of data models abstracted Nebula Graph, as a typical native graph database, allows you to store the rich relationships as edges with edge types and properties directly attached to them. -## Benefits of Nebula Graph +## Advantages of Nebula Graph ### Open-source @@ -40,7 +40,7 @@ Nebula Graph supports strict role-based access control and external authenticati More and more native tools of Nebula Graph have been released, such as [Nebula Graph Studio](https://github.com/vesoft-inc/nebula-web-docker), [Nebula Console](https://github.com/vesoft-inc/nebula-console), and [Nebula Exchange](https://github.com/vesoft-inc/nebula-exchange). For more ecosystem tools, see [Ecosystem tools overview](../20.appendix/6.eco-tool-version.md). -Besides, Nebula Graph has the ability to be integrated with many cutting-edge technologies, such as Spark, Flink, and HBase, for the purpose of mutual strengthening in a world of increasing challenges and chances. For more information, see [Ecosystem development](../20.appendix/6.eco-tool-version.md). +Besides, Nebula Graph has the ability to be integrated with many cutting-edge technologies, such as Spark, Flink, and HBase, for the purpose of mutual strengthening in a world of increasing challenges and chances. ### OpenCypher-compatible query language @@ -48,8 +48,7 @@ The native Nebula Graph Query Language, also known as nGQL, is a declarative, op ### Future-oriented hardware with balanced reading and writing -Solid-state drives have extremely high performance and [they are getting cheaper](https://blocksandfiles.com/wp-content/uploads/2021/01/Wikibon-SSD-less-than-HDD-in-2026.jpg). 
- Nebula Graph is a product based on SSD. Compared with products based on HDD and large memory, it is more suitable for future hardware trends and easier to achieve balanced reading and writing. +Solid-state drives have extremely high performance and [they are getting cheaper](https://blocksandfiles.com/wp-content/uploads/2021/01/Wikibon-SSD-less-than-HDD-in-2026.jpg). Nebula Graph is a product based on SSD. Compared with products based on HDD and large memory, it is more suitable for future hardware trends and easier to achieve balanced reading and writing. ### Easy data modeling and high flexibility diff --git a/docs-2.0/1.introduction/2.1.path.md b/docs-2.0/1.introduction/2.1.path.md index 7a4a44e8cb4..647744bd8ba 100644 --- a/docs-2.0/1.introduction/2.1.path.md +++ b/docs-2.0/1.introduction/2.1.path.md @@ -4,7 +4,7 @@ In graph theory, a path in a graph is a finite or infinite sequence of edges whi Paths can be categorized into 3 types: `walk`, `trail`, and `path`. For more information, see [Wikipedia](https://en.wikipedia.org/wiki/Path_(graph_theory)#Walk,_trail,_path). -The following picture is an example for a brief introduction. +The following figure is an example for a brief introduction. ![path](../images/path1.png) @@ -12,7 +12,7 @@ The following picture is an example for a brief introduction. A `walk` is a finite or infinite sequence of edges. Both vertices and edges can be repeatedly visited in graph traversal. -In the above picture C, D, and E form a cycle. So, this picture contains infinite paths, such as `A->B->C->D->E`, `A->B->C->D->E->C`, and `A->B->C->D->E->C->D`. +In the above figure C, D, and E form a cycle. So, this figure contains infinite paths, such as `A->B->C->D->E`, `A->B->C->D->E->C`, and `A->B->C->D->E->C->D`. !!! note @@ -22,26 +22,26 @@ In the above picture C, D, and E form a cycle. So, this picture contains infinit A `trail` is a finite sequence of edges. Only vertices can be repeatedly visited in graph traversal. 
The Seven Bridges of Königsberg is a typical `trail`. -In the above picture, edges cannot be repeatedly visited. So, this picture contains finite paths. The longest path in this picture consists of 5 edges: `A->B->C->D->E->C`. +In the above figure, edges cannot be repeatedly visited. So, this figure contains finite paths. The longest path in this figure consists of 5 edges: `A->B->C->D->E->C`. !!! note `MATCH`, `FIND PATH`, and `GET SUBGRAPH` statements use `trail`. -There are two special cases of trail, `cycle`, and `circuit`. The following picture is an example for a brief introduction. +There are two special cases of trail, `cycle` and `circuit`. The following figure is an example for a brief introduction. ![trail](../images/Circuits1.png) - cycle - A `cycle` refers to a closed `trail`. Only the terminal vertices can be repeatedly visited. The longest path in this picture consists of 3 edges: `A->B->C->A` or `C->D->E->C`. + A `cycle` refers to a closed `trail`. Only the terminal vertices can be repeatedly visited. The longest path in this figure consists of 3 edges: `A->B->C->A` or `C->D->E->C`. - circuit - A `circuit` refers to a closed `trail`. Edges cannot be repeatedly visited in graph traversal. Apart from the terminal vertices, other vertices can also be repeatedly visited. The longest path in this picture: `A->B->C->D->E->C->A`. + A `circuit` refers to a closed `trail`. Edges cannot be repeatedly visited in graph traversal. Apart from the terminal vertices, other vertices can also be repeatedly visited. The longest path in this figure: `A->B->C->D->E->C->A`. ## Path A `path` is a finite sequence of edges. Neither vertices nor edges can be repeatedly visited in graph traversal. -So, the above picture contains finite paths. The longest path in this picture consists of 4 edges: `A->B->C->D->E`. +So, the above figure contains finite paths. The longest path in this figure consists of 4 edges: `A->B->C->D->E`. 
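The `walk`, `trail`, and `path` definitions in `2.1.path.md` can be checked mechanically. The following is a minimal sketch, not part of Nebula Graph: `classify` is a hypothetical helper, and it assumes a traversal is given as an ordered list of vertex names with at most one directed edge between any ordered pair of vertices, so that an edge can be identified by `(source, destination)`. It returns the strictest category the traversal satisfies.

```python
from typing import Sequence


def classify(vertices: Sequence[str]) -> str:
    """Return the strictest category ("path", "trail", or "walk") of a traversal.

    Assumption: at most one directed edge exists between any ordered pair of
    vertices, so each edge is identified by its (source, destination) pair.
    """
    # Consecutive vertex pairs are the edges that the traversal visits.
    edges = list(zip(vertices, vertices[1:]))

    # A repeated edge is only allowed in a walk.
    if len(set(edges)) < len(edges):
        return "walk"

    # Edges are distinct; a repeated vertex is allowed in a trail but not a path.
    if len(set(vertices)) < len(vertices):
        return "trail"

    # Neither vertices nor edges repeat.
    return "path"


if __name__ == "__main__":
    # The three traversals quoted in the Walk section of the document.
    for traversal in (
        ["A", "B", "C", "D", "E"],
        ["A", "B", "C", "D", "E", "C"],
        ["A", "B", "C", "D", "E", "C", "D"],
    ):
        print("->".join(traversal), classify(traversal))
```

Every `path` is also a `trail`, and every `trail` is also a `walk`, which is why the helper reports only the strictest match.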
diff --git a/docs-2.0/1.introduction/2.data-model.md b/docs-2.0/1.introduction/2.data-model.md index 53116d1d6e3..8a872e25978 100644 --- a/docs-2.0/1.introduction/2.data-model.md +++ b/docs-2.0/1.introduction/2.data-model.md @@ -7,22 +7,27 @@ A data model is a model that organizes data and specifies how they are related t Nebula Graph data model uses six data structures to store data. They are graph spaces, vertices, edges, tags, edge types and properties. - **Graph spaces**: Graph spaces are used to isolate data from different teams or programs. Data stored in different graph spaces are securely isolated. Storage replications, privileges, and partitions can be assigned. + - **Vertices**: Vertices are used to store entities. - In Nebula Graph, vertices are identified with vertex identifiers (i.e. `VID`). The `VID` must be unique in the same graph space. VID should be int64, or fixed_string(N). - A vertex must have at least one tag or multiple tags. + - **Edges**: Edges are used to connect vertices. An edge is a connection or behavior between two vertices. - There can be multiple edges between two vertices. - Edges are directed. `->` identifies the directions of edges. Edges can be traversed in either direction. - - An edge is identified uniquely with a source vertex, an edge type, a rank value, and a destination vertex. Edges have no EID. + - An edge is identified uniquely with `<a source vertex, an edge type, a rank value, and a destination vertex>`. Edges have no EID. - An edge must have one and only one edge type. - The rank value is an immutable user-assigned 64-bit signed integer. It identifies the edges with the same edge type between two vertices. Edges are sorted by their rank values. The edge with the greatest rank value is listed first. The default rank value is zero. + - **Tags**: Tags are used to categorize vertices. Vertices that have the same tag share the same definition of properties. + - **Edge types**: Edge types are used to categorize edges. Edges that have the same edge type share the same definition of properties. 
+ - **Properties**: Properties are key-value pairs. Both vertices and edges are containers for properties. !!! Note - Tag and Edge type are similar to the vertex table and edge table in the relational databases. + Tags and Edge types are similar to "vertex tables" and "edge tables" in the relational databases. ## Directed property graph @@ -35,12 +40,12 @@ Nebula Graph stores data in directed property graphs. A directed property graph - **PV** is the property of vertices. - **PE** is the property of edges. -The following table is an example of the structure of the basketball player dataset. We have two types of vertices, that is **player** and **team**, and two types of edges, that is **_serve_** and **_follow_**. +The following table is an example of the structure of the basketball player dataset. We have two types of vertices, that is **player** and **team**, and two types of edges, that is **serve** and **follow**. | Element | Name | Property name (Data type) | Description | | :--- | :--- | :--- | :--- | | Tag | **player** | name (string) <br>age (int) | Represents players in the team. | -| Tag | **team** | name (string) | Represents the teams. +| Tag | **team** | name (string) | Represents the teams. | | Edge type | **serve** | start_year (int) <br>end_year (int) | Represents actions taken by players in the team. <br>An action links a player with a team, and the direction is from a player to a team. | | Edge type | **follow** | degree (int) | Represents actions taken by players in the team. <br>An action links a player with another player, and the direction is from one player to the other player. | @@ -49,7 +54,7 @@ The following table is an example of the structure of the basketball player data Nebula Graph supports only directed edges. !!! compatibility - + Nebula Graph {{ nebula.release }} allows dangling edges. Therefore, when adding or deleting, you need to ensure the corresponding source vertex and destination vertex of an edge exist. For details, see [INSERT VERTEX](../3.ngql-guide/12.vertex-statements/1.insert-vertex.md), [DELETE VERTEX](../3.ngql-guide/12.vertex-statements/4.delete-vertex.md), [INSERT EDGE](../3.ngql-guide/13.edge-statements/1.insert-edge.md), and [DELETE EDGE](../3.ngql-guide/13.edge-statements/4.delete-edge.md). The MERGE statement in openCypher is not supported. diff --git a/docs-2.0/1.introduction/3.nebula-graph-architecture/1.architecture-overview.md b/docs-2.0/1.introduction/3.nebula-graph-architecture/1.architecture-overview.md index 5410a42fe06..9d45a5d5bc7 100644 --- a/docs-2.0/1.introduction/3.nebula-graph-architecture/1.architecture-overview.md +++ b/docs-2.0/1.introduction/3.nebula-graph-architecture/1.architecture-overview.md @@ -20,18 +20,20 @@ Nebula Graph applies the separation of storage and computing architecture. The G * Great scalability - The separated structure makes both the Graph Service and the Storage Service flexible and easy to scale in or out. + The separated structure makes both the Graph Service and the Storage Service flexible and easy to scale in or out. * High availability - If part of the Graph Service fails, the data stored by the Storage Service suffers no loss. And if the rest part of the Graph Service is still able to serve the clients, service recovery can be performed quickly, even unfelt by the users. + If part of the Graph Service fails, the data stored by the Storage Service suffers no loss. 
And if the rest part of the Graph Service is still able to serve the clients, service recovery can be performed quickly, even unfelt by the users. * Cost-effective - The separation of storage and computing architecture provides a higher resource utilization rate, and it enables clients to manage the cost flexibly according to business demands. The cost savings can be more distinct if the [Nebula Graph Cloud](https://www.nebula-cloud.io/ "Nebula Graph Cloud official website") service is used. + The separation of storage and computing architecture provides a higher resource utilization rate, and it enables clients to manage the cost flexibly according to business demands. + + * Open to more possibilities - With the ability to run separately, the Graph Service may work with multiple types of storage engines, and the Storage Service may also serve more types of computing engines. + With the ability to run separately, the Graph Service may work with multiple types of storage engines, and the Storage Service may also serve more types of computing engines. For details on the Graph Service and the Storage Service, see [Graph Service](3.graph-service.md) and [Storage Service](4.storage-service.md). diff --git a/docs-2.0/1.introduction/3.nebula-graph-architecture/2.meta-service.md b/docs-2.0/1.introduction/3.nebula-graph-architecture/2.meta-service.md index 9decb33d7ec..43dc21b5894 100644 --- a/docs-2.0/1.introduction/3.nebula-graph-architecture/2.meta-service.md +++ b/docs-2.0/1.introduction/3.nebula-graph-architecture/2.meta-service.md @@ -27,7 +27,7 @@ The leader is elected by the majorities and only the leader can provide service The Meta Service stores the information of user accounts and the privileges granted to the accounts. When the clients send queries to the Meta Service through an account, the Meta Service checks the account information and whether the account has the right privileges to execute the queries or not. 
-For more information on Nebula Graph access control, see [Authentication and authorization](../../7.data-security/1.authentication/1.authentication.md). +For more information on Nebula Graph access control, see [Authentication](../../7.data-security/1.authentication/1.authentication.md). ### Manages partitions diff --git a/docs-2.0/1.introduction/3.nebula-graph-architecture/3.graph-service.md b/docs-2.0/1.introduction/3.nebula-graph-architecture/3.graph-service.md index c1a34bf1174..20cdf632548 100644 --- a/docs-2.0/1.introduction/3.nebula-graph-architecture/3.graph-service.md +++ b/docs-2.0/1.introduction/3.nebula-graph-architecture/3.graph-service.md @@ -1,12 +1,12 @@ # Graph Service -Graph Service is used to process the query. It has four submodules: Parser, Validator, Planner, and Executor. This topic will describe Graph Service accordingly. +The Graph Service is used to process the query. It has four submodules: Parser, Validator, Planner, and Executor. This topic will describe the Graph Service accordingly. -## The architecture of Graph Service +## The architecture of the Graph Service ![The architecture of the Graph Service](https://docs-cdn.nebula-graph.com.cn/docs-2.0/1.introduction/2.nebula-graph-architecture/query-engine-architecture.png) -After a query is sent to Graph Service, it will be processed by the following four submodules: +After a query is sent to the Graph Service, it will be processed by the following four submodules: 1. **Parser**: Performs lexical analysis and syntax analysis. @@ -18,9 +18,9 @@ After a query is sent to Graph Service, it will be processed by the following fo ## Parser -After receiving a request, the statements will be parsed by the Parser composed of Flex (lexical analysis tool) and Bison (syntax analysis tool), and its corresponding AST will be generated. Statements will be directly intercepted in this stage because of its invalid syntax. 
+After receiving a request, the statements will be parsed by Parser composed of Flex (lexical analysis tool) and Bison (syntax analysis tool), and its corresponding AST will be generated. Statements will be directly intercepted in this stage because of its invalid syntax. -For example, the structure of the AST of `GO FROM "Tim" OVER like WHERE properties(edge).likeness > 8.0 YIELD dst(edge)` is shown in the following picture. +For example, the structure of the AST of `GO FROM "Tim" OVER like WHERE properties(edge).likeness > 8.0 YIELD dst(edge)` is shown in the following figure. ![AST](https://docs-cdn.nebula-graph.com.cn/docs-2.0/1.introduction/2.nebula-graph-architecture/parser-ast-tree.png) @@ -70,7 +70,7 @@ In the `nebula-graphd.conf` file, when `enable_optimizer` is set to be `true`, P - Before optimization - In the execution plan on the right side of the preceding picture, each node directly depends on other nodes. For example, the root node `Project` depends on the `Filter` node, the `Filter` node depends on the `GetNeighbor` node, and so on, up to the leaf node `Start`. Then the execution plan is (not truly) executed. + In the execution plan on the right side of the preceding figure, each node directly depends on other nodes. For example, the root node `Project` depends on the `Filter` node, the `Filter` node depends on the `GetNeighbor` node, and so on, up to the leaf node `Start`. Then the execution plan is (not truly) executed. During this stage, every node has its input and output variables, which are stored in a hash table. The execution plan is not truly executed, so the value of each key in the associated hash table is empty (except for the `Start` node, where the input variables hold the starting data), and the hash table is defined in `src/context/ExecutionContext.cpp` under the `nebula-graph` repository. 
diff --git a/docs-2.0/1.introduction/3.nebula-graph-architecture/4.storage-service.md b/docs-2.0/1.introduction/3.nebula-graph-architecture/4.storage-service.md index 8e74a9b3218..ab021090f5c 100644 --- a/docs-2.0/1.introduction/3.nebula-graph-architecture/4.storage-service.md +++ b/docs-2.0/1.introduction/3.nebula-graph-architecture/4.storage-service.md @@ -2,7 +2,7 @@ The persistent data of Nebula Graph have two parts. One is the [Meta Service](2.meta-service.md) that stores the meta-related data. -The other is the Storage Service that stores the data, which is run by the nebula-storaged process. This topic will describe the architecture of Storage Service. +The other is the Storage Service that stores the data, which is run by the nebula-storaged process. This topic will describe the architecture of the Storage Service. ## Advantages @@ -16,11 +16,11 @@ The other is the Storage Service that stores the data, which is run by the nebul - Supports synchronizing with the third party systems, such as [Elasticsearch](../../4.deployment-and-installation/6.deploy-text-based-index/2.deploy-es.md). -## The architecture of Storage Service +## The architecture of the Storage Service ![image](https://www-cdn.nebula-graph.com.cn/nebula-blog/nebula-reading-storage-architecture.png) -Storage Service is run by the nebula-storaged process. Users can deploy nebula-storaged processes on different occasions. For example, users can deploy 1 nebula-storaged process in a test environment and deploy 3 nebula-storaged processes in a production environment. +The Storage Service is run by the nebula-storaged process. Users can deploy nebula-storaged processes on different occasions. For example, users can deploy 1 nebula-storaged process in a test environment and deploy 3 nebula-storaged processes in a production environment. All the nebula-storaged processes consist of a Raft-based cluster. 
There are three layers in the Storage Service: @@ -28,11 +28,11 @@ All the nebula-storaged processes consist of a Raft-based cluster. There are thr The top layer is the storage interface. It defines a set of APIs that are related to the graph concepts. These API requests will be translated into a set of KV operations targeting the corresponding [Partition](#data_partitioning). For example: - - `getNeighbors`: query the in-edge or out-edge of a set of vertices, return the edges and the corresponding properties, and support conditional filtering. + - `getNeighbors`: queries the in-edge or out-edge of a set of vertices, returns the edges and the corresponding properties, and supports conditional filtering. - - `insert vertex/edge`: insert a vertex or edge and its properties. + - `insert vertex/edge`: inserts a vertex or edge and its properties. - - `getProps`: get the properties of a vertex or an edge. + - `getProps`: gets the properties of a vertex or an edge. It is this layer that makes the Storage Service a real graph storage. Otherwise, it is just a KV storage. @@ -42,9 +42,9 @@ All the nebula-storaged processes consist of a Raft-based cluster. There are thr - Store engine - The bottom layer is the local storage engine library, providing operations like `get`, `put`, and `scan` on local disks. The related interfaces are stored in `KVStore.h` and `KVEngine.h` files. Users can develop their own local store plugins based on their needs. + The bottom layer is the local storage engine library, providing operations like `get`, `put`, and `scan` on local disks. The related interfaces are stored in `KVStore.h` and `KVEngine.h` files. You can develop your own local store plugins based on your needs. -The following will describe some features of Storage Service based on the above architecture. +The following will describe some features of the Storage Service based on the above architecture. 
## KVStore @@ -60,7 +60,7 @@ Therefore, Nebula Graph develops its own KVStore with RocksDB as the local stora - For multiple local hard disks, Nebula Graph can make full use of its concurrent capacities through deploying multiple data directories. -- Meta Service manages all the Storage servers. All the partition distribution data and current machine status can be found in the meta service. Accordingly, users can execute a manual load balancing plan in meta service. +- The Meta Service manages all the Storage servers. All the partition distribution data and current machine status can be found in the meta service. Accordingly, users can execute a manual load balancing plan in meta service. !!! Note @@ -124,7 +124,7 @@ Since in an ultra-large-scale relational network, vertices can be as many as ten ### Edge and storage amplification -In Nebula Graph, an edge corresponds to two key-value pairs on the hard disk. When there are lots of edges and each has many properties, storage amplification will be obvious. The storage format of edges is shown in the picture below. +In Nebula Graph, an edge corresponds to two key-value pairs on the hard disk. When there are lots of edges and each has many properties, storage amplification will be obvious. The storage format of edges is shown in the figure below. ![edge storage](https://docs-cdn.nebula-graph.com.cn/docs-2.0/1.introduction/2.nebula-graph-architecture/two-edge-format.png) @@ -154,7 +154,7 @@ Nebula Graph uses a **static Hash** strategy to shard data through a modulo oper When inserting into Nebula Graph, vertices and edges are distributed across different partitions. And the partitions are located on different machines. The number of partitions is set in the CREATE SPACE statement and cannot be changed afterward. -If certain vertices need to be placed on the same partition (i.e., on the same machine), see [Formula/code](https://github.com/vesoft-inc/nebula-common/blob/master/src/common/clients/meta/MetaClient.cpp). 
+If certain vertices need to be placed on the same partition (i.e., on the same machine), see [Formula/code](https://github.com/vesoft-inc/nebula-common/blob/master/src/common/clients/meta/MetaClient.cpp). The following code will briefly describe the relationship between VID and partition. @@ -206,11 +206,13 @@ Failure: Scenario 1: Take a (space) cluster of a single replica as an example. I Raft and HDFS have different modes of duplication. Raft is based on a quorum vote, so the number of replicas cannot be even. + ### Multi Group Raft -Storage Service supports a distributed cluster architecture, so Nebula Graph implements Multi Group Raft according to Raft protocol. Each Raft group stores all the replicas of each partition. One replica is the leader, while others are followers. In this way, Nebula Graph achieves strong consistency and high availability. The functions of Raft are as follows. +The Storage Service supports a distributed cluster architecture, so Nebula Graph implements Multi Group Raft according to Raft protocol. Each Raft group stores all the replicas of each partition. One replica is the leader, while others are followers. In this way, Nebula Graph achieves strong consistency and high availability. The functions of Raft are as follows. Nebula Graph uses Multi Group Raft to improve performance when there are many partitions because Raft-wal cannot be NULL. When there are too many partitions, costs will increase, such as storing information in Raft group, WAL files, or batch operation in low load. @@ -230,6 +232,7 @@ For each partition, it is necessary to do a batch to improve throughput when wri For example, lock-free CAS operations will execute after all the previous WALs are committed. So for a batch, if there are several WALs in CAS type, we need to divide this batch into several smaller groups and make sure they are committed serially. 
+ ### Transfer Leadership @@ -250,14 +254,14 @@ To avoid split-brain, when members in a Raft Group change, an intermediate state ## Differences with HDFS -Storage Service is a Raft-based distributed architecture, which has certain differences with that of HDFS. For example: +The Storage Service is a Raft-based distributed architecture, which has certain differences with that of HDFS. For example: -- Storage Service ensures consistency through Raft. Usually, the number of its replicas is odd to elect a leader. However, DataNode used by HDFS ensures consistency through NameNode, which has no limit on the number of replicas. +- The Storage Service ensures consistency through Raft. Usually, the number of its replicas is odd to elect a leader. However, DataNode used by HDFS ensures consistency through NameNode, which has no limit on the number of replicas. -- In Storage Service, only the replicas of the leader can read and write, while in HDFS all the replicas can do so. +- In the Storage Service, only the replicas of the leader can read and write, while in HDFS all the replicas can do so. -- In Storage Service, the number of replicas needs to be determined when creating a space, since it cannot be changed afterward. But in HDFS, the number of replicas can be changed freely. +- In the Storage Service, the number of replicas needs to be determined when creating a space, since it cannot be changed afterward. But in HDFS, the number of replicas can be changed freely. -- Storage Service can access the file system directly. While the applications of HDFS (such as HBase) have to access HDFS before the file system, which requires more RPC times. +- The Storage Service can access the file system directly. While the applications of HDFS (such as HBase) have to access HDFS before the file system, which requires more RPC times. 
-In a word, Storage Service is more lightweight with some functions simplified and its architecture is simpler than HDFS, which can effectively improve the read and write performance of a smaller block of data. +In a word, the Storage Service is more lightweight with some functions simplified and its architecture is simpler than HDFS, which can effectively improve the read and write performance of a smaller block of data. diff --git a/docs-2.0/1.introduction/3.vid.md b/docs-2.0/1.introduction/3.vid.md index 17387229973..cab2d29d603 100644 --- a/docs-2.0/1.introduction/3.vid.md +++ b/docs-2.0/1.introduction/3.vid.md @@ -4,7 +4,7 @@ In Nebula Graph, a vertex is uniquely identified by its ID, which is called a VI ## Features -- The data types of VIDs are restricted to `FIXED_STRING()` or `INT64`; a graph space can only select one VID type. +- The data types of VIDs are restricted to `FIXED_STRING()` or `INT64`. One graph space can only select one VID type. - A VID in a graph space is unique. It functions just as a primary key in a relational database. VIDs in different graph spaces are independent. @@ -42,7 +42,7 @@ VIDs can be generated via applications. Here are some tips: - If short primary keys greatly outnumber long primary keys, do not enlarge the `N` of `FIXED_STRING()` too much. Otherwise, it will occupy a lot of memory and hard disks, and slow down performance. Generate VIDs via BASE64, MD5, hash by encoding and splicing. -- If you generate inte64 VID via hash, the probability of collision is about 1/10 when there are 1 billion vertices. The number of edges has no concern with the probability of collision. +- If you generate int64 VID via hash, the probability of collision is about 1/10 when there are 1 billion vertices. The number of edges has no concern with the probability of collision. ## Define and modify the data type of VIDs