Metabuilder diagram for Records in Contexts parser for draw.io
Online Mermaid Diagram on Kroki
Metabuilder Python module: drawio_meta_builder.py
Source chats for diagram:
initial: 2025-10-16_claude
revision: 2025-10-23_gemini
Override workflow
Custom parser behaviour can be introduced by placing Python modules in
legacy/overrides/. Decorate replacement functions or classes with the
@override decorator exported by meta_builder.drawio_meta_builder and specify
their data type, role, and phase. Matching entries in the builder mapping are
replaced, while new symbols are injected directly into the generated pipeline
namespace. Overrides are discovered by default when running
python -m meta_builder, and the CLI now reports which modules were loaded.
Data type, role, and phase: These three conceptual dimensions describe the purpose and place of an override. They also describe the original parser’s functions and classes, for which labels were deliberately developed and hardcoded into meta_builder.
Adhering to this framework helps maintain continuity between the old and new codebases while respecting the single-file output layout tailored for ingestion into Pyodide.
Phase:
["pre", "core", "post"]. Denotes the timing of execution in the pipeline: pre means before the main XML tree (e.g., mxCells) has started processing; core runs from XML processing to graph building; post is after DrawIOParserGraph (our custom subclass of rdflib.graph.Graph) is constructed.
Data type:
["xml", "internal", "rdf"]. Denotes the data model that is the primary focus of the override, namely the XML tree or xml.etree.ElementTree.Element, internal models, or rdflib terms and graphs, respectively.
Role:
["metadata", "data", "control"]. Denotes the kind of data primarily operated on by the override. Operations on data from the main Draw.io XML tree (exclusive of metadata nodes) have the data role; metadata concerns operations specifically on metadata; control includes operations that involve or depend on both data and metadata, plus occasional control-flow functionality such as I/O operations or exception handling. Dependence on data vs. metadata is the primary consideration for the control role, which is already rather overloaded, so self-contained control operations may be better placed under data or metadata.
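The three dimensions can be checked mechanically before a symbol is registered. The sketch below is illustrative only: the helper name namespace_path is hypothetical, and the phase, data type, role ordering of the path is inferred from the example path pipeline.pre.xml.metadata.MetadataNodeNotFoundError rather than taken from the metabuilder source.

```python
# Allowed labels, as listed in this document.
PHASES = ("pre", "core", "post")
DATA_TYPES = ("xml", "internal", "rdf")
ROLES = ("metadata", "data", "control")


def namespace_path(phase: str, data_type: str, role: str, name: str) -> str:
    """Validate the three labels and build a dotted pipeline path.

    ASSUMPTION: the ordering phase.data_type.role is inferred from the
    example path used in this document, not from the metabuilder itself.
    """
    checks = (
        ("phase", phase, PHASES),
        ("data type", data_type, DATA_TYPES),
        ("role", role, ROLES),
    )
    for kind, label, allowed in checks:
        if label not in allowed:
            raise ValueError(f"unknown {kind} {label!r}; expected one of {allowed}")
    return f"pipeline.{phase}.{data_type}.{role}.{name}"


path = namespace_path("pre", "xml", "metadata", "MetadataNodeNotFoundError")
```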
Managing pipeline imports
When writing code that interacts with the metabuilder pipeline, import patterns depend on context. For override modules placed in legacy/overrides/, import the compiled pipeline using from legacy.draw_io_parser import pipeline, then reference pipeline symbols via their nested namespace path (e.g., pipeline.pre.xml.metadata.MetadataNodeNotFoundError). Within override functions, assign any needed pipeline annotations to local variables at the top of the function for cleaner code.
External code such as tests or utility modules uses the same import: from legacy.draw_io_parser import pipeline. Original functions are technically accessible through their source classes (e.g., xml_metadata_pre._find_metadata_node), but the intended access pattern is through the pipeline namespace. When circular import issues arise during function overrides, either access pattern will work; use whichever resolves the circular dependency.
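The access pattern can be sketched as follows. To keep the example runnable without the generated package, pipeline is built here from types.SimpleNamespace as a stand-in for the real import, and the exception class is a local placeholder. The function require_metadata_node is hypothetical.

```python
from types import SimpleNamespace


class MetadataNodeNotFoundError(Exception):
    """Local placeholder for the pipeline exception of the same name."""


# Stand-in for: from legacy.draw_io_parser import pipeline
# ASSUMPTION: the real object exposes the same nested attribute path.
pipeline = SimpleNamespace(
    pre=SimpleNamespace(
        xml=SimpleNamespace(
            metadata=SimpleNamespace(
                MetadataNodeNotFoundError=MetadataNodeNotFoundError,
            )
        )
    )
)


def require_metadata_node(node):
    """Hypothetical override body: fail loudly when no metadata node exists."""
    # Bind the pipeline symbols this function needs to short local names first.
    NotFound = pipeline.pre.xml.metadata.MetadataNodeNotFoundError
    if node is None:
        raise NotFound("no metadata node found in diagram")
    return node
```

Resolving the nested path inside the function body, rather than at module import time, also means the lookup happens only when the function is called, which can help when import order is fragile.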
For external code that imports the pipeline, ensure Python can locate the necessary modules by setting up sys.path before importing. Use Path(__file__).resolve().parents[N] to locate package roots relative to your script, then insert these paths into sys.path if not already present. Place these path manipulations before any from legacy.draw_io_parser import pipeline statements (which should be marked with # noqa: E402 to suppress linting warnings about non-top-level imports). This setup is essential because the metabuilder generates code in locations that may not be in Python’s default import path.
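The path setup described above might look like the following sketch. The helper name add_package_root and the choice of levels_up are illustrative; the right value of N depends on where your script sits relative to the package root.

```python
import sys
from pathlib import Path


def add_package_root(script_file: str, levels_up: int) -> Path:
    """Insert an ancestor directory of script_file into sys.path once.

    levels_up is the N in Path(...).resolve().parents[N].
    """
    root = Path(script_file).resolve().parents[levels_up]
    root_str = str(root)
    if root_str not in sys.path:
        sys.path.insert(0, root_str)
    return root


# In a real test or utility module, call the helper before the import:
#
#     add_package_root(__file__, 1)
#     from legacy.draw_io_parser import pipeline  # noqa: E402
```

The membership check keeps repeated calls from adding duplicate entries to sys.path.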