Developing a Neuro-Symbolic Approach for Monitoring and Analyzing International IP Law


John Brüne, University of Bologna

Supervision: Prof. Monica Palmirani


  github.com/john-bruene/Jurix2025

Funded by the European Union (MSCA COFUND, Grant No. 101126733).
Views and opinions expressed are those of the author only and do not necessarily reflect those of the European Union.

Problem / Motivation

  • WIPO must monitor and compare IP legislation across 194 Member States
  • Legal sources:
    • heterogeneous formats (PDF, HTML, XML, scans)
    • multiple languages and legal traditions
  • Manual comparison is extremely slow
  • Knowledge sharing across jurisdictions is limited
  • → We need a structured, explainable, scalable method

Practical Problems (WIPO Perspective)

Heterogeneous sources:

  • WIPO treaties (curated, English, relatively clean)

  • National laws (e.g. from Italy's Normattiva portal; HTML/XML, scans, multiple versions)

  • Different languages and legal traditions

Versioning: Same law, different consolidated versions over time


Need to answer questions like:

  • “Which provisions in country X correspond to Article Y of a WIPO treaty?”

  • “Where are gaps or missing implementations?”

Research Vision

Build a usable tool that helps WIPO analyze and harmonize national IP legislation.

  • Combine:
    • Standardized legal representation (Akoma Ntoso XML)
    • Neuro-symbolic AI (symbolic logic + neural models)
  • Goals:
    • Identify shared patterns, principles, divergences
    • Make AI explainable, verifiable, and useful for legal experts

Research Questions

RQ1
How can legislative IP texts from different WIPO member states be collected and standardized so that they can be meaningfully compared?

RQ2
How can logical structures and ontologies be combined with neural language models to detect shared principles and divergences?

RQ3
Which kinds of visualization and interaction best support legal experts in understanding and verifying the results?

Background: From Symbolic to Neuro-Symbolic AI

  • Symbolic / rule-based systems
    • Transparent and interpretable
    • Limited flexibility in ambiguous, contextual language
  • Neural / data-driven models
    • Powerful pattern recognition in large corpora
    • Opaque, hard to justify decisions
  • Neuro-symbolic AI
    • Combine semantic flexibility of neural models
    • With logical precision and structure of symbolic methods

Methodology Overview

  1. Collect & normalize legal sources
    – IP legislation from WIPO member states; various formats (HTML, PDF, XML, scans)

  2. Apply Rules-as-Code principles
    – Extract normative logic into formal rule structures

  3. Exploration
    – Neural embeddings + symbolic constraints

  4. Validation
    – Human-in-the-loop, legal experts

  5. Interactive tooling for WIPO analysts
    – Support for monitoring legal evolution

Schematic representation of the HAIMLA methodology according to Palmirani, Sapienza, and Ashley (2024), published in BioLaw Journal – Rivista di BioDiritto, n. 3/2024.

Case Study: WIPO ↔︎ Italy

Goal: Identify which Italian copyright provisions correspond to WIPO treaty articles.

  • Start with a concrete pair:
    • WIPO Copyright Treaty (WCT, 1996)
    • Italian Copyright Law (L. 633/1941, consolidated version)
  • Research subquestion:
    • “How can we automatically identify which Italian articles correspond to which WIPO provisions?”

Steps:

  • Convert both laws to AKN-XML

  • Parse each article as a structured unit

  • Apply multilingual sentence-transformers

  • Compute semantic similarity (cosine)

Step 1: From XML to DataFrames

  • Input:
    • WIPO Copyright Treaty_1996.xml (AKN from WIPO Lex)
    • 19410716_041U0633_VIGENZA_20251010.xml (AKN from Normattiva)
  • Parsing strategy:
    • Loop over each <article> tag
    • Extract:
      • num → article number
      • heading / content → full text
    • Store as rows in df_wipo and df_it:
      • id (article number)
      • label (e.g. Art. 1 – [title])
      • text (concatenated content)
  • Result: One row = one article ready for semantic comparison.
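The parsing strategy above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the inline `SAMPLE_AKN` document, its abridged article text, and the helper names `local_name` and `parse_articles` are assumptions made for the example. Real input would be the two AKN files listed above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, abridged sample; the real pipeline would read the AKN files.
SAMPLE_AKN = """<akomaNtoso xmlns="http://docs.oasis-open.org/legaldocml/ns/akn/3.0">
  <act><body>
    <article eId="art_4">
      <num>Article 4</num>
      <heading>Computer Programs</heading>
      <paragraph><content><p>Computer programs are protected as literary works.</p></content></paragraph>
    </article>
    <article eId="art_5">
      <num>Article 5</num>
      <heading>Compilations of Data (Databases)</heading>
      <paragraph><content><p>Compilations of data are protected as such.</p></content></paragraph>
    </article>
  </body></act>
</akomaNtoso>"""


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}article'."""
    return tag.rsplit("}", 1)[-1]


def parse_articles(xml_text):
    """Return one dict per <article> with the keys id, label and text."""
    root = ET.fromstring(xml_text)
    rows = []
    for elem in root.iter():
        if local_name(elem.tag) != "article":
            continue
        num, heading, body_parts = "", "", []
        for child in elem:
            name = local_name(child.tag)
            # Collapse all nested text and whitespace into one clean string.
            text = " ".join("".join(child.itertext()).split())
            if name == "num":
                num = text
            elif name == "heading":
                heading = text
            elif text:
                body_parts.append(text)
        label = f"{num} – {heading}" if heading else num
        rows.append({"id": num, "label": label, "text": " ".join(body_parts)})
    return rows


rows_wipo = parse_articles(SAMPLE_AKN)
for row in rows_wipo:
    print(row["label"], "|", row["text"])
```

If pandas is available, `pandas.DataFrame(rows_wipo)` would turn these rows into a table shaped like `df_wipo`. Matching on the local tag name keeps the sketch independent of the exact AKN namespace version used by each source.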

Step 2: Semantic Matching (Cross-Lingual)

  • Model:
    • paraphrase-multilingual-MiniLM-L12-v2
    • Maps English (WIPO) and Italian (L. 633) to a shared vector space
  • Pipeline:
    • Encode all WIPO articles → embeddings_wipo
    • Encode all Italian articles → embeddings_it
    • Compute cosine similarity for every pair
  • Intuition:
    • High score → likely semantic match
    • Low score → weak or no direct correspondence
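The pairwise comparison in the pipeline above can be illustrated with a small, dependency-free sketch. The three-dimensional vectors here are toy stand-ins, not real embeddings; in the actual pipeline they would presumably come from the sentence-transformers model named above (for instance via its `encode` method). The function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def similarity_matrix(emb_a, emb_b):
    """Rows follow emb_a (WIPO articles), columns follow emb_b (Italian articles)."""
    return [[cosine_similarity(u, v) for v in emb_b] for u in emb_a]


# Toy vectors standing in for real sentence embeddings (assumption).
embeddings_wipo = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
embeddings_it = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 1.0]]

sim = similarity_matrix(embeddings_wipo, embeddings_it)
for row in sim:
    print([round(score, 3) for score in row])
```

Cosine similarity ignores vector length, so the first WIPO vector and the first Italian vector score 1.0 even though their magnitudes differ. Only the direction in the shared vector space matters.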

Source: Semantic Matching using LLM, TowardsAI (2023)

Loading the cosine similarities

df <- read.csv("data/semantic_similarity_matrix_full.csv")
head(df[, 1:2])
                                                             label   Art..1.
1                       Article 1 Relation to the Berne Convention 0.4640073
2                          Article 2 Scope of Copyright Protection 0.5999359
3 Article 3 Application of Articles 2 to 6 of the Berne Convention 0.3424722
4                                      Article 4 Computer Programs 0.6192293
5                       Article 5 Compilations of Data (Databases) 0.6420113
6                                  Article 6 Right of Distribution 0.6009862

Step 3: Strong semantic matches (similarity > 0.6)

  • Left: WIPO treaty articles
  • Right: Italian law articles
  • Line thickness = similarity score

Best match per WIPO article

  • For each WIPO provision, show:
    • strongest Italian match, or
    • “no direct match” (red)
  • Identifies gaps and structural divergences
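One possible way to derive the "best match or no direct match" view from a similarity matrix is sketched below. The scores and the Italian labels are toy values chosen for illustration, not results from the study. The 0.6 cut-off is taken from the Step 3 slide, and `best_matches` is a hypothetical helper name.

```python
THRESHOLD = 0.6  # cut-off for a "strong" match, as on the Step 3 slide


def best_matches(sim, wipo_labels, it_labels, threshold=THRESHOLD):
    """For each WIPO article, return (label, best Italian match or fallback, score)."""
    results = []
    for label, row in zip(wipo_labels, sim):
        # Index of the highest-scoring Italian article in this row.
        best_idx = max(range(len(row)), key=row.__getitem__)
        score = row[best_idx]
        match = it_labels[best_idx] if score > threshold else "no direct match"
        results.append((label, match, score))
    return results


# Toy inputs (assumptions for illustration only).
wipo_labels = ["Article 4 Computer Programs", "Article 3 Application of Articles 2 to 6"]
it_labels = ["Art. 1", "Art. 2"]
toy_sim = [
    [0.62, 0.41],
    [0.34, 0.29],
]

results = best_matches(toy_sim, wipo_labels, it_labels)
for label, match, score in results:
    print(f"{label} -> {match} ({score:.2f})")
```

A fixed threshold is a simplification. Whether a below-threshold score reflects a genuine implementation gap or only a difference in drafting style still needs review by a legal expert.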

To be worked on (tips and suggestions welcome)

  • Integration of internal and external references
  • Temporal dimension / versioning
  • Evaluation framework

Text vs. Text + External References (Example: WCT Article 1)

Raw AKN-style article text:

<article eId="art_1-1">
<num>Article 1(1)</num>
<paragraph>
<content>
Nothing in this Treaty shall derogate from existing obligations that Contracting Parties have to each other under the Berne Convention.
</content>
</paragraph>
</article>

AKN-style text with external reference expansion:

<article eId="art_1-1">
<num>Article 1(1)</num>
<paragraph>
<content>
Nothing in this Treaty shall derogate from existing obligations that Contracting Parties have to each other under the
<ref href="urn:lex:int:convention:berne:1886-09-09;consolidated"
showAs="Berne Convention (Arts. 1–21, Appendix)">
Berne Convention
</ref>.
</content>
</paragraph>
<!-- Berne adds: scope & subject matter (Arts. 1–2),
exclusive rights (Arts. 9–14), term of protection (Art. 7),
and other core obligations for Contracting Parties. -->
</article>
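A rough sketch of how the `<ref>` elements in the expanded article above could be collected and used to enrich the text before embedding. This is only one conceivable approach and is not confirmed by the slides: the helper names and the `reference_texts` lookup are hypothetical, and the summary string is paraphrased from the XML comment in the example.

```python
import xml.etree.ElementTree as ET

ARTICLE_XML = """<article eId="art_1-1">
<num>Article 1(1)</num>
<paragraph>
<content>
Nothing in this Treaty shall derogate from existing obligations that Contracting Parties have to each other under the
<ref href="urn:lex:int:convention:berne:1886-09-09;consolidated"
showAs="Berne Convention (Arts. 1–21, Appendix)">
Berne Convention
</ref>.
</content>
</paragraph>
</article>"""


def extract_refs(xml_text):
    """List href, showAs and anchor text of every <ref> element."""
    root = ET.fromstring(xml_text)
    refs = []
    for elem in root.iter():
        if elem.tag.rsplit("}", 1)[-1] == "ref":
            refs.append({
                "href": elem.get("href", ""),
                "showAs": elem.get("showAs", ""),
                "anchor_text": " ".join("".join(elem.itertext()).split()),
            })
    return refs


def expand_with_references(xml_text, reference_texts):
    """Append the known text of each referenced source to the article text."""
    root = ET.fromstring(xml_text)
    base = " ".join("".join(root.itertext()).split())
    extras = [
        reference_texts[ref["href"]]
        for ref in extract_refs(xml_text)
        if ref["href"] in reference_texts
    ]
    return " ".join([base] + extras)


# Hypothetical lookup from reference URI to a short summary of the target.
reference_texts = {
    "urn:lex:int:convention:berne:1886-09-09;consolidated":
        "Berne Convention: scope and subject matter (Arts. 1–2), "
        "term of protection (Art. 7), exclusive rights (Arts. 9–14).",
}

refs = extract_refs(ARTICLE_XML)
expanded = expand_with_references(ARTICLE_XML, reference_texts)
print(refs[0]["href"])
print(expanded)
```

Embedding the expanded string instead of the raw text would let the referenced obligations influence the similarity score. Whether that improves matching is an open question for the evaluation framework.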

Status and Next Steps

  • Current status:
    • Setting AKN standards for WIPO treaties
    • Developing and testing the Italian case study
    • Prototype pipeline for semantic matching
  • Next steps:
    • Expand to more countries and treaties
    • Integrate rule extraction and symbolic reasoning
    • Prototype interactive tool for WIPO analysts

Thank you!