Hello all,
Have a good day. This post is about the data lake.
With big data in full swing today, our primary challenge is leveraging big data technologies to mass-produce industry services.
As enterprises speed up their digital transformation, their data grows exponentially. In addition, cross-system analysis makes data usage even more expensive. Huawei serves its enterprise customers with the data lake solution, a one-stop answer to the following issues:
Petabyte-level storage: Centralized data management covers storage of both existing structured data and the unstructured data that digital transformation brings. This influx of unstructured data includes user behavior logs, images, videos, and documents. Big data applications will be embedded into more and more business scenarios.
Terabyte-level computation: Compute power is in demand for large-scale processing before and during data input. Orders, contracts, and user profiles in ultra-large wide tables of over a thousand dimensions require aggregation, processing, and calculation. Scanned barcodes add further terabytes.
Access to same-source heterogeneous data: Diversified data storage means that Oracle GoldenGate (OGG) tables are stored in the Oracle database, while barcodes that need fast key-value queries are stored in HBase. Cross-database analysis requires query engines such as Spark and Hive to directly access local metadata. The challenge arises when the actual data is stored in multiple environments, such as HDFS, HBase, or Oracle.
Large-throughput data pipes: Massive volumes of service data require rapid aggregation for downstream big data analysis, computing, and modeling. Predictive models are useless if data access cannot keep up with analysis.
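To make the heterogeneous-access issue more concrete, here is a minimal Python sketch of the idea: one analysis reads two logical tables through a shared catalog, while the rows actually live in different kinds of stores. Everything in it is an assumption made for illustration and is not Huawei's implementation. An in-memory sqlite3 database stands in for the Oracle database, a plain dict stands in for HBase, and the names CATALOG, read_orders, read_barcodes, and amount_by_product, as well as the sample rows, are hypothetical.

```python
import sqlite3

# Stand-in for the relational store (Oracle in the post). sqlite3 is used
# only so that the sketch runs without any external service.
relational = sqlite3.connect(":memory:")
relational.execute(
    "CREATE TABLE orders (order_id TEXT, barcode TEXT, amount REAL)"
)
relational.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("o1", "b100", 250.0), ("o2", "b200", 90.0), ("o3", "b100", 10.0)],
)

# Stand-in for the key-value store (HBase in the post): row key -> columns.
key_value = {
    "b100": {"product": "router"},
    "b200": {"product": "switch"},
}


def read_orders():
    """Read order rows from the relational stand-in as dictionaries."""
    cursor = relational.execute("SELECT order_id, barcode, amount FROM orders")
    columns = ("order_id", "barcode", "amount")
    return [dict(zip(columns, row)) for row in cursor]


def read_barcodes():
    """Read barcode rows from the key-value stand-in as dictionaries."""
    return [{"barcode": key, **cols} for key, cols in key_value.items()]


# Hypothetical central catalog: logical table name -> reader function.
# The analysis below only knows table names, not where the rows are stored.
CATALOG = {"orders": read_orders, "barcodes": read_barcodes}


def amount_by_product():
    """Join orders with barcodes on the barcode key and sum amounts per product."""
    products = {row["barcode"]: row["product"] for row in CATALOG["barcodes"]()}
    totals = {}
    for order in CATALOG["orders"]():
        product = products.get(order["barcode"], "unknown")
        totals[product] = totals.get(product, 0.0) + order["amount"]
    return totals


print(amount_by_product())
```

In a real deployment a query engine such as Spark or Hive would play the role of the catalog and the join. The sketch only shows why the analysis gets simpler once every store is reachable through one set of metadata.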
The data lake is a platform that converges data through a hybrid architecture combining traditional Oracle with Huawei’s FusionInsight HD&LibrA. This central platform integrates data from Huawei R&D, manufacturing, supply, storage, installation, and delivery. Such integration enables more interaction across these domains as well as digital twin functionality. These automated and intelligent capabilities make overall operations more efficient.
The following figure shows an overview of the data lake scenario.
FusionInsight Intelligent Data Lake
HUAWEI CLOUD FusionInsight Intelligent Data Lake provides intelligent data lakes that are cost-effective and easy to build. It is used by over 3,000 customers in multiple industries in more than 60 countries and regions. The solution enables you to derive value from massive amounts of data. Application examples include e-government data governance and all-in-one network offices, real-time financial risk control, carrier BOM convergence, smart campuses for large enterprises, smart urban rail, and smart airports. FusionInsight includes cloud services such as MapReduce Service (MRS), GaussDB (DWS), Cloud Search Service (CSS), Graph Engine Service (GES), and DAYU. It is applicable to big data service scenarios such as offline analysis, real-time stream processing, real-time retrieval, and interactive query.
That's all for today, thank you.