Description
The UI4t architecture currently has the following entities:
- A model node, i.e. `OrgDbtModel`, which is a materialized table. Ideally, every chain should start and end here.
- An operation node, i.e. `OrgDbtOperation`, which is more like a configuration node that defines how the data should transform while moving through it.
- A `DbtEdge` that maps or joins two `OrgDbtModel` nodes.
- The `DAG` (directed acyclic graph) is built from the above entities and sent to the client.
The architecture/codebase has become complex and might be difficult to scale, for the following reasons:
- We don't actually consider `OrgDbtOperation` a node on the transform graph, for some reason. There are no edges to or from an `OrgDbtOperation` node, which makes adding an operation, deleting an operation, and rendering the DAG unnecessarily complicated.
- `transform_api.get_dbt_project_DAG` is quite complex. It should be as simple as fetching all the edges and the nodes on them.
- `transform.get_operation` has to read from a cryptic JSON config, `dbt_operation.config`, and figure out the edges to it.
- If we ever want to save the state of the nodes (with coordinates), we definitely won't be able to do it in this architecture.
- The schemas of the various operation configs (drop, union) are not typed, which makes it difficult to understand what is going on.
Proposed solution
- There should be a concept of a `CanvasNode` or `Node`: a more generic node. Different types of node would inherit from this one, i.e. `OrgDbtModel extends Node` and `OrgDbtOperation extends Node`.
- A `DbtEdge` should exist between two generic `Node`s.
- Generating the `DAG` is now nothing but going through all edges and finding the unique set of nodes included.
- Deleting a node from the canvas should be straightforward: delete the `Node` and handle the side effects based on what type it is, i.e. `OrgDbtModel` or `OrgDbtOperation`.
- Currently `OrgDbtOperation.config` has the information about the sources, i.e. `config.input_models`, in its cryptic JSON if it's the first in the chain. This can go away since the edges will tell us this now.
- Make the schema of various operations `typed` and see if there can be a more general schema that saves us from creating individual forms/functions for every new operation.
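
To make the proposal concrete, here is a rough sketch of what a generic node plus edge model could look like. It uses plain Python dataclasses rather than the real ORM models, and every name in it (`NodeType`, `Node`, `Edge`, `Canvas`, `inputs_of`, the `x`/`y` fields) is hypothetical and only meant to illustrate the idea, not to describe the existing code.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class NodeType(Enum):
    # Hypothetical discriminator; in the real codebase these would map to
    # OrgDbtModel and OrgDbtOperation extending a common Node.
    MODEL = "model"
    OPERATION = "operation"


@dataclass(frozen=True)
class Node:
    """Generic canvas node. Coordinates live here, so canvas state can be saved."""
    id: str
    node_type: NodeType
    x: float = 0.0
    y: float = 0.0


@dataclass(frozen=True)
class Edge:
    """Directed edge between any two generic nodes, whatever their type."""
    source: str
    target: str


@dataclass
class Canvas:
    nodes: dict[str, Node] = field(default_factory=dict)
    edges: set[Edge] = field(default_factory=set)

    def add_node(self, node: Node) -> None:
        self.nodes[node.id] = node

    def connect(self, source_id: str, target_id: str) -> None:
        self.edges.add(Edge(source_id, target_id))

    def build_dag(self) -> dict:
        """Walk all edges and collect the unique set of nodes they touch."""
        used = {e.source for e in self.edges} | {e.target for e in self.edges}
        return {
            "nodes": sorted(used),
            "edges": sorted((e.source, e.target) for e in self.edges),
        }

    def inputs_of(self, node_id: str) -> list[str]:
        """Sources of a node, read from edges instead of config.input_models."""
        return sorted(e.source for e in self.edges if e.target == node_id)

    def delete_node(self, node_id: str) -> Node:
        """Remove the node and its edges; the caller handles type-specific side effects."""
        node = self.nodes.pop(node_id)
        self.edges = {e for e in self.edges if node_id not in (e.source, e.target)}
        return node


# Example chain: model -> operation -> model
canvas = Canvas()
canvas.add_node(Node("customers", NodeType.MODEL))
canvas.add_node(Node("drop_cols", NodeType.OPERATION, x=200.0))
canvas.add_node(Node("customers_clean", NodeType.MODEL, x=400.0))
canvas.connect("customers", "drop_cols")
canvas.connect("drop_cols", "customers_clean")
dag = canvas.build_dag()
```

With this shape, the inputs of an operation come from `inputs_of` rather than from the operation's JSON config, and `delete_node` returns the removed node so the caller can branch on `node_type` for the model-specific or operation-specific cleanup.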