Hello, everyone!
Today I'm going to introduce DAYU.
Background:
Our business scenario needs to show the dependencies between jobs, and some open source plug-ins conflict with our current business logic, so I plan to start from scratch and implement my own DAG diagram step by step. I will also use this blog to record the implementation process and what I learn along the way. If I get something wrong, please correct me.
Scene analysis:
1. Jobs in data development often depend on one another. When jobs run, some may fail or be retried. Operations and maintenance or development staff then need an intuitive diagram of the jobs, and a DAG dependency flowchart is a good way to present it.
2. In big data scenarios there may be many nodes, thousands or even tens of thousands, with many edges between them. At that scale, the rendering time of the graph has to be taken into account.
3. After viewing the graph, the user may want to perform operation and maintenance actions directly, such as rerunning a node or viewing its upstream and downstream nodes.
4. The user may need to drag an operator, click on an operator, copy an operator's name, and so on.
Goals:
Based on the requirements above, we have roughly defined the goals to achieve, sorted out as follows:
1. Use a diagram to show the dependencies between tasks. This is the basic capability and must be achieved.
2. The graph must support big data scenarios with many nodes:
   Clear layout between nodes
   Lines between nodes do not overlap and cross as little as possible
   The graph supports zooming and dragging
   The nodes in the graph support dragging
3. The graph needs to support operation and maintenance capabilities:
   Hover over a node to show its upstream and downstream nodes, with upstream and downstream nodes visually distinguished
   Select a node, with support for selecting all of its upstream or downstream nodes
   Right-click for custom business operations, such as rerunning or stopping job instances
   Job names can be copied
   Select a specific node to display that node's information panel
4. Optimization:
   Provide a bird's-eye view so that nodes can still be found in a large graph
   Provide a search function that accurately locates the job you are looking for on the graph
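Goal 3 asks for showing and selecting all upstream or downstream nodes of a node. A minimal sketch of how that could work, assuming the graph is stored as a plain list of edges: the `Edge` type and the function names below are hypothetical and only for illustration, not part of any library.

```typescript
// Hypothetical edge type: the job `from` must finish before the job `to` starts.
interface Edge {
  from: string;
  to: string;
}

type Direction = "upstream" | "downstream";

// Build a lookup from each node id to its direct neighbours in one direction.
function buildNeighbours(edges: Edge[], direction: Direction): Map<string, string[]> {
  const neighbours = new Map<string, string[]>();
  for (const { from, to } of edges) {
    const key = direction === "downstream" ? from : to;
    const value = direction === "downstream" ? to : from;
    const list = neighbours.get(key) ?? [];
    list.push(value);
    neighbours.set(key, list);
  }
  return neighbours;
}

// Breadth-first walk that collects every node reachable from `start`
// in the given direction. The start node itself is not included.
function collectRelated(edges: Edge[], start: string, direction: Direction): string[] {
  const neighbours = buildNeighbours(edges, direction);
  const seen = new Set<string>([start]);
  const queue: string[] = [start];
  const result: string[] = [];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const next of neighbours.get(current) ?? []) {
      if (!seen.has(next)) {
        seen.add(next);
        result.push(next);
        queue.push(next);
      }
    }
  }
  return result;
}
```

Calling `collectRelated(edges, id, "upstream")` and `collectRelated(edges, id, "downstream")` separately gives two sets that can be highlighted in different colors, which is one way to tell upstream and downstream apart on hover.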
Interaction design draft:
Based on the goals above, the design draft is roughly as follows:

Technology selection:
There are generally two ways to draw a DAG: canvas and svg. Event handling is relatively simple with svg, but its performance is worse. I am a novice, so I chose svg here; if svg performance turns out to be a problem later, the rendering can be optimized with canvas. I use svg.js, an open source library that wraps some basic svg functions and reduces part of the workload.
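To make the svg option a bit more concrete, here is a minimal sketch that builds the svg markup as a plain string, deliberately without svg.js so that it runs anywhere. The `PlacedNode` and `Edge` types and the `renderSvg` function are hypothetical, and node positions are assumed to come from a separate layout step. Each node becomes its own `<g>` element, which is why per-node events are easy to attach in a browser, whereas canvas needs manual hit-testing.

```typescript
// Hypothetical node shape: position and size are assumed to be already computed.
interface PlacedNode {
  id: string;
  x: number;
  y: number;
  width: number;
  height: number;
}

interface Edge {
  from: string;
  to: string;
}

// Escape text so that job names cannot break the markup.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function renderSvg(nodes: PlacedNode[], edges: Edge[], width: number, height: number): string {
  const byId = new Map<string, PlacedNode>();
  for (const n of nodes) byId.set(n.id, n);

  const parts: string[] = [];
  parts.push(`<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">`);

  // Draw edges first so that node boxes are painted on top of them.
  for (const { from, to } of edges) {
    const a = byId.get(from);
    const b = byId.get(to);
    if (!a || !b) continue; // skip edges that point at unknown nodes
    // Straight line from the bottom center of the upstream node
    // to the top center of the downstream node.
    const x1 = a.x + a.width / 2;
    const y1 = a.y + a.height;
    const x2 = b.x + b.width / 2;
    const y2 = b.y;
    parts.push(`<line x1="${x1}" y1="${y1}" x2="${x2}" y2="${y2}" stroke="#999"/>`);
  }

  // One <g> per node, tagged with the job id so events can be mapped back to it.
  for (const n of nodes) {
    const label = escapeXml(n.id);
    parts.push(
      `<g data-id="${label}">` +
        `<rect x="${n.x}" y="${n.y}" width="${n.width}" height="${n.height}" fill="#fff" stroke="#333"/>` +
        `<text x="${n.x + n.width / 2}" y="${n.y + n.height / 2}" text-anchor="middle" dominant-baseline="middle">${label}</text>` +
        `</g>`
    );
  }

  parts.push("</svg>");
  return parts.join("\n");
}
```

In the real implementation svg.js would create these elements for us; the string version only shows what ends up in the document.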
Overview of the implementation process:
I intend to divide the work into several parts and implement them step by step:
1. Planning: background, interaction design draft, technology selection
2. Hierarchical layout algorithm for nodes
3. Line paths between nodes
4. Operation and maintenance on the graph: right-click, select, node copy, etc.
5. Events on the graph: zoom, pan, node drag, etc.
6. Later optimization: vertical arrangement
7. Later optimization: bird's-eye view, search box, and others
8. Later optimization: large numbers of nodes
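As a preview of the hierarchical layout part, here is a minimal sketch of one common first step, longest-path layering: nodes without upstream dependencies go to layer 0, and every other node sits one layer below its deepest upstream node. This is only one possible approach under my own assumptions, not necessarily the algorithm the final implementation will use, and the `Edge` type and `assignLayers` name are hypothetical.

```typescript
// Hypothetical edge type: the job `from` must finish before the job `to` starts.
interface Edge {
  from: string;
  to: string;
}

// Assign each node a layer index using a topological walk (Kahn's algorithm).
// Throws if the edges contain a cycle, because a cyclic graph has no valid layering.
function assignLayers(nodes: string[], edges: Edge[]): Map<string, number> {
  const inDegree = new Map<string, number>();
  const downstream = new Map<string, string[]>();
  for (const id of nodes) {
    inDegree.set(id, 0);
    downstream.set(id, []);
  }
  for (const { from, to } of edges) {
    if (!downstream.has(from) || !inDegree.has(to)) {
      throw new Error(`edge ${from} -> ${to} references an unknown node`);
    }
    downstream.get(from)!.push(to);
    inDegree.set(to, inDegree.get(to)! + 1);
  }

  // Nodes with no upstream dependencies start in layer 0.
  const layer = new Map<string, number>();
  const queue: string[] = nodes.filter((id) => inDegree.get(id) === 0);
  for (const id of queue) layer.set(id, 0);

  let processed = 0;
  while (queue.length > 0) {
    const current = queue.shift() as string;
    processed += 1;
    for (const next of downstream.get(current)!) {
      // A node must sit below every one of its upstream nodes.
      layer.set(next, Math.max(layer.get(next) ?? 0, layer.get(current)! + 1));
      const remaining = inDegree.get(next)! - 1;
      inDegree.set(next, remaining);
      // Enqueue only once all upstream nodes have been processed.
      if (remaining === 0) queue.push(next);
    }
  }

  if (processed !== nodes.length) {
    throw new Error("dependency graph contains a cycle");
  }
  return layer;
}
```

The layer index can then be mapped to a y coordinate, and the order of nodes inside each layer to an x coordinate. Reducing line crossings within a layer is a separate step that this sketch does not cover.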
The groundwork is done, so let's start!