Task 2b – DrawIO Parser Metadata Integration (2025-02-14)

Summary

  • Implemented a DrawioParserGraph subclass that records csv_path metadata while leveraging rdflib’s namespace manager for prefix bindings.

  • Added XML metadata extraction that captures CSV paths, base URIs, and user-defined prefix mappings injected by the DrawIO UI.

  • Sanitised metadata-patched DrawIO documents by replacing the root <UserObject> wrapper with the expected <mxCell id="0" /> node prior to structural parsing.

  • Refactored CLI and library entry points to funnel parsing through a shared _build_graph_from_raw_xml helper that returns fully initialised DrawioParserGraph instances.

  • Extended pytest coverage to assert metadata exposure, namespace registration, and backward compatibility for legacy fixtures.

  • Added regression coverage that patches every pristine DrawIO fixture with metadata on the fly via patchDrawioWithMetadata.ts, then verifies that the parsed graph is isomorphic to the pristine one and that the metadata propagates.
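
The sanitisation and metadata-extraction steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code from this change: the function name sanitise_root_user_object is hypothetical, and it assumes the metadata lives as plain attributes on a <UserObject id="0"> element. Only csv_path is named in this note, so any other attribute is simply passed through.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple


def sanitise_root_user_object(raw_xml: str) -> Tuple[str, Dict[str, str]]:
    """Swap a metadata-bearing <UserObject id="0"> for <mxCell id="0" />.

    Hypothetical helper: returns the cleaned XML plus the metadata
    attributes found on the wrapper (everything except id and label).
    """
    document = ET.fromstring(raw_xml)
    for parent in document.iter():
        for index, child in enumerate(list(parent)):
            if child.tag == "UserObject" and child.get("id") == "0":
                # Collect metadata before discarding the wrapper.
                metadata = {
                    key: value
                    for key, value in child.attrib.items()
                    if key not in ("id", "label")
                }
                # Put the plain root cell back in the same position.
                replacement = ET.Element("mxCell", {"id": "0"})
                replacement.tail = child.tail
                parent.remove(child)
                parent.insert(index, replacement)
                return ET.tostring(document, encoding="unicode"), metadata
    # Pristine documents have no wrapper and are returned unchanged.
    return raw_xml, {}
```

Returning the input string untouched when no wrapper is found mirrors the backward-compatibility goal: legacy fixtures follow exactly the same path as before.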

Testing

  • pip install rdflib

  • pytest src/main/webapp/plugins/rdfexport/legacy/tests/test_patched_parser.py

  • bun run test

Notes

  • The metadata sanitisation step preserves backward compatibility: pristine fixtures are left untouched, while metadata-augmented documents generated by patchDrawioWithMetadata.ts or the DrawIO UI are normalised before parsing.

  • Namespace bindings now replace any existing entry for the same prefix, so user-provided prefixes override the defaults. When a base URI is present, it is bound as the default (empty-prefix) namespace and stored on the graph’s base attribute.