[FI Components] GraphBase

Latest reply: Mar 15, 2022 06:00:03

Hello all,
Today we will learn about GraphBase. With the rapid development of network technologies, enterprises in the Internet era face massive amounts of data. As data sets grow, the query performance of traditional relational databases deteriorates, especially in service scenarios that involve complex relationships between records. A new solution is needed to cope with this, and GraphBase was developed to address the complex relationship problem.
GraphBase is a distributed graph database based on FusionInsight HD. Built on the distributed storage mechanism of HBase, it supports graphs with tens of billions of nodes and hundreds of billions of relationships, and provides Spark-based data import and Elasticsearch-based indexing. FusionInsight GraphBase is widely used in recommendations, relationship analysis, and financial anti-fraud.
FusionInsight GraphBase has the following features:
• Distributed architecture with seamless integration into the Hadoop ecosystem.
• Queries over hundreds of billions of relationships on tens of billions of nodes return in seconds.
• Easy-to-use REST APIs facilitate data query and analysis.
• The powerful Gremlin graph traversal language supports complex service logic.
• Offline batch import, real-time stream import, and import performance optimization.
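To make the idea of a relationship query concrete, here is a minimal in-memory sketch in Python. It is not GraphBase code and does not use Gremlin; the `Graph` class, the `transfers` label, and the account names are all made up for illustration. It shows a multi-hop traversal, which in a relational database would typically need one self-join per hop.

```python
from collections import defaultdict


class Graph:
    """Toy directed graph with labelled edges (illustrative only, not GraphBase)."""

    def __init__(self):
        # adjacency[label][source] -> list of target vertices
        self.adjacency = defaultdict(lambda: defaultdict(list))

    def add_edge(self, source, label, target):
        self.adjacency[label][source].append(target)

    def out(self, vertices, label):
        """Follow outgoing edges with the given label from every vertex in `vertices`."""
        reached = set()
        for vertex in vertices:
            reached.update(self.adjacency[label].get(vertex, []))
        return reached

    def within_hops(self, start, label, hops):
        """Return every vertex reachable from `start` in 1..hops steps."""
        frontier, seen = {start}, set()
        for _ in range(hops):
            # Only expand vertices that have not been visited yet.
            frontier = self.out(frontier, label) - seen - {start}
            seen |= frontier
        return seen


graph = Graph()
graph.add_edge("acct_A", "transfers", "acct_B")
graph.add_edge("acct_B", "transfers", "acct_C")
graph.add_edge("acct_C", "transfers", "acct_D")

two_hop = graph.within_hops("acct_A", "transfers", 2)
print(sorted(two_hop))
```

Each extra hop is one more loop iteration over the current frontier, which is why graph stores tend to handle deep relationship queries more gracefully than join-based plans.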
Architecture
GraphBase contains GraphServer and LoadBalancer roles.
• GraphServer: includes the GremlinServer and StandardServer services. GremlinServer handles graph queries written in Gremlin, and StandardServer provides the REST service. When the system starts, the meta_graph graph is started first; it stores multi-graph metadata and asynchronous tasks. ZooKeeper monitors live service instances and provides distributed lock services.
• LoadBalancer: provides the load sharing capability for graph services.
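The post does not say which policy the LoadBalancer role uses, so the following is only a sketch of one common approach, round-robin with a liveness check. The class name, the `graphserver-N` addresses, and the `mark_down` call (standing in for whatever liveness signal the real system gets) are assumptions for illustration.

```python
import itertools


class RoundRobinBalancer:
    """Hands out backend addresses in rotation, skipping any marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.down = set()
        self._cycle = itertools.cycle(self.backends)

    def mark_down(self, backend):
        self.down.add(backend)

    def mark_up(self, backend):
        self.down.discard(backend)

    def next_backend(self):
        # Try each backend at most once per call.
        for _ in range(len(self.backends)):
            candidate = next(self._cycle)
            if candidate not in self.down:
                return candidate
        raise RuntimeError("no live GraphServer instance")


balancer = RoundRobinBalancer(["graphserver-1", "graphserver-2", "graphserver-3"])
first_round = [balancer.next_backend() for _ in range(3)]

# Simulate one instance failing; later requests rotate over the survivors.
balancer.mark_down("graphserver-2")
after_failure = [balancer.next_backend() for _ in range(4)]
```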
Figure 1 shows the GraphBase architecture.
Figure 1 GraphBase architecture
• Access layer
Gremlin API: an open-source standard interface for interactive graph queries, provided by the open-source Apache TinkerPop Gremlin component.
REST APIs: include APIs for graph query, modification, and management, as well as Huawei-enhanced graph algorithms for online analysis.
Load Balancer: provides load sharing across multiple GraphServer instances.
• Computing layer
Provides the core engine for data management and metadata management in GraphBase.
Provides interface adaptation for backend storage and indexing.
• Storage layer
Provides distributed KV storage for massive graph data.
Provides a search engine with secondary index, full-text search, and fuzzy search capabilities.
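As a rough picture of how a storage layer can pair a KV store with an index, here is a small Python sketch. The key layout, the `GraphStore` class, and its method names are assumptions made for illustration; the actual GraphBase schema in HBase and Elasticsearch is not described in this post.

```python
class GraphStore:
    """Toy storage layer: a KV map for graph data plus an inverted index for lookups."""

    def __init__(self):
        self.kv = {}     # key -> value, stands in for the distributed KV store
        self.index = {}  # (property, value) -> vertex ids, stands in for the search engine

    def put_vertex(self, vertex_id, properties):
        self.kv[("v", vertex_id)] = dict(properties)
        for name, value in properties.items():
            self.index.setdefault((name, value), set()).add(vertex_id)

    def put_edge(self, source, label, target):
        # Edges are keyed by source vertex so that all neighbours share a key prefix.
        self.kv[("e", source, label, target)] = {}

    def neighbours(self, source, label):
        prefix = ("e", source, label)
        return sorted(key[3] for key in self.kv if key[:3] == prefix)

    def find(self, name, value):
        # Index lookup: no scan over the vertex data is needed.
        return sorted(self.index.get((name, value), set()))


store = GraphStore()
store.put_vertex("p1", {"name": "Alice", "city": "Paris"})
store.put_vertex("p2", {"name": "Bob", "city": "Paris"})
store.put_vertex("p3", {"name": "Carol", "city": "Oslo"})
store.put_edge("p1", "knows", "p2")
store.put_edge("p1", "knows", "p3")

in_paris = store.find("city", "Paris")
friends_of_p1 = store.neighbours("p1", "knows")
```

The point of the split is that traversals read adjacent keys from the KV store, while property lookups go through the index instead of scanning every vertex.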
Typical application scenarios:
• Financial anti-fraud
• Knowledge graph
• Relationship analysis
Key Features
Multi-graph
Scenarios:
• Different service departments can use the same graph database to import different graphs for application development.
• Different applications use different data. Data is not associated, which facilitates service isolation.
Design of multi-graph solution

• GraphServer: includes the GremlinServer and StandardServer services. GremlinServer handles graph queries written in Gremlin, and StandardServer provides the REST service. When the system starts, the meta_graph graph is started first; it stores multi-graph metadata and asynchronous tasks. ZooKeeper monitors live service instances and provides distributed lock services.
• LoadBalancer: provides the load sharing capability for graph services.
• GraphWriter: the module for batch data import.
• GraphStreaming: the module for real-time data import.
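The multi-graph idea, one metadata graph that tracks several isolated graphs, can be pictured with the short Python sketch below. The `MultiGraphRegistry` class, its methods, and the graph and department names are hypothetical; only the role of meta_graph as a metadata holder comes from the description above.

```python
class MultiGraphRegistry:
    """Toy multi-graph container: metadata in one place, data kept per graph."""

    def __init__(self):
        self.meta_graph = {}  # graph name -> metadata (stands in for meta_graph)
        self.graphs = {}      # graph name -> isolated vertices and edges

    def create_graph(self, name, owner):
        if name in self.meta_graph:
            raise ValueError(f"graph {name!r} already exists")
        self.meta_graph[name] = {"owner": owner}
        self.graphs[name] = {"vertices": set(), "edges": set()}

    def add_edge(self, graph_name, source, target):
        graph = self.graphs[graph_name]
        graph["vertices"].update((source, target))
        graph["edges"].add((source, target))

    def vertices(self, graph_name):
        return sorted(self.graphs[graph_name]["vertices"])


registry = MultiGraphRegistry()
registry.create_graph("risk", owner="risk-dept")
registry.create_graph("marketing", owner="marketing-dept")

registry.add_edge("risk", "acct_A", "acct_B")
registry.add_edge("marketing", "user_1", "product_9")

risk_vertices = registry.vertices("risk")
marketing_vertices = registry.vertices("marketing")
```

Because every read and write names its graph, data imported by one department never shows up in another department's queries.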
Importing Data
Batch import and real-time import
FusionInsight GraphBase supports batch data import and real-time data import. For batch import, Spark imports all historical data stored in HDFS into GraphBase. For real-time import, Kafka and Spark Streaming import data into GraphBase as it arrives.
Flexible data mapping rules are provided to map original data to graph models.
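The post does not show what a mapping rule looks like, so the format below is invented purely to illustrate the idea of turning flat records into vertices and edges. The `MAPPING` structure, the column names, and the `map_rows` helper are all assumptions, not GraphBase configuration syntax.

```python
import csv
import io

# Hypothetical mapping rule: which columns become vertices and which edge links them.
MAPPING = {
    "source": {"label": "person", "id_column": "payer"},
    "target": {"label": "person", "id_column": "payee"},
    "edge": {"label": "transfers", "property_columns": ["amount"]},
}


def map_rows(rows, mapping):
    """Convert flat records into a vertex set and an edge list according to `mapping`."""
    vertices, edges = set(), []
    for row in rows:
        source = (mapping["source"]["label"], row[mapping["source"]["id_column"]])
        target = (mapping["target"]["label"], row[mapping["target"]["id_column"]])
        vertices.update((source, target))
        properties = {col: row[col] for col in mapping["edge"]["property_columns"]}
        edges.append((source, mapping["edge"]["label"], target, properties))
    return vertices, edges


# Stand-in for a file in HDFS or a batch of Kafka messages.
raw = "payer,payee,amount\nalice,bob,100\nbob,carol,250\n"
rows = list(csv.DictReader(io.StringIO(raw)))
vertices, edges = map_rows(rows, MAPPING)
```

The same mapping function could serve both import paths: applied once over historical files for batch import, or per micro-batch for streaming import.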

BulkLoad supported in batch data import
Batch import can also run in BulkLoad mode, which makes data import faster.
During data import, graph HFiles and internal secondary index HFiles can be generated in a single MapReduce job.
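The single-job idea can be sketched as one pass over the input that emits both graph cells and index cells, each then sorted by key, since HFiles store cells in sorted key order. This is a plain Python illustration only; the key formats and the `build_bulk_files` helper are assumptions and do not reflect GraphBase's real row keys or any HBase API.

```python
def build_bulk_files(edges):
    """Emit graph cells and index cells in one pass, then sort each set by key."""
    graph_cells, index_cells = [], []
    for source, label, target, properties in edges:
        # Graph cell: keyed by source so a vertex's edges sit next to each other.
        graph_cells.append((f"{source}|{label}|{target}", properties))
        # Index cells: one per property, keyed by the property value.
        for name, value in properties.items():
            index_cells.append((f"{name}={value}|{source}|{label}|{target}", ""))
    graph_cells.sort(key=lambda cell: cell[0])
    index_cells.sort(key=lambda cell: cell[0])
    return graph_cells, index_cells


edges = [
    ("bob", "transfers", "carol", {"amount": "250"}),
    ("alice", "transfers", "bob", {"amount": "100"}),
]
graph_cells, index_cells = build_bulk_files(edges)
graph_keys = [key for key, _ in graph_cells]
index_keys = [key for key, _ in index_cells]
```

Reading the input once and producing both outputs avoids a second pass just to build the index.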

Thanks.


The post is synchronized to: FusionInsight Components


Geek69
Created Feb 6, 2020 02:11:25

Helpful!

olive.zhao
Admin Created Mar 15, 2022 03:05:49

Good post!

MahMush
Moderator Author Created Mar 15, 2022 06:00:03

Amazingly crafted article...

