Skip to content

RFC-001: YAML Schema Design Decisions

Status: Proposed Date: 2025-11-20 Authors: regelrecht team

Context

As we stabilize the YAML schema (issue #7), we need to document small design decisions about the format. This RFC groups related choices rather than creating separate RFCs for each.

Decision

1. Endpoints: All Outputs Are Public

  • All outputs defined in machine_readable sections are publicly accessible
  • No separate endpoint definition needed - every output is an endpoint
  • Naming pattern: ^[a-z0-9_]+$ (e.g., toetsingsinkomen)

2. Article Text Format: Use Markdown with |- Style

  • Format: Article text field uses markdown to preserve original law formatting
  • YAML Style: Use |- (literal block scalar) for multiline text
  • Goal: Make YAML representation match official law publication as closely as possible

What to preserve:

  • Numbered lists (1., 2., 3.) for article paragraphs (leden)
  • Links to referenced laws/articles
  • Original paragraph structure and line breaks
  • Plain formatting (no bold/italic unless in source)

3. Preamble Structure: Include Aanhef Section

  • Structure: Preamble (aanhef) behaves as an article, including optional machine_readable section
  • Format: Markdown text preserving original formatting from official publication
  • Content: Complete preamble text as it appears in the source document
  • Location: Between metadata and articles section

4. POC v0.1.6 Regulation Discovery Fields Migration

The POC v0.1.6 schema had several top-level "regulation discovery" fields. This section documents how each is handled in v0.2.0:

POC v0.1.6 Fieldv0.2.0 StatusNotes
law_typeReplaced by regulatory_layerMore comprehensive: WET, AMVB, MINISTERIELE_REGELING, GRONDWET, EU_VERORDENING, etc.
legal_characterMoved to execution.produces.legal_characterNow per-article instead of per-service (BESCHIKKING, TOETS, etc.)
decision_typeMoved to execution.produces.decision_typeNow per-article instead of per-service (TOEKENNING, GOEDKEURING, etc.)
discoverableRemovedNo longer needed; all outputs are public endpoints (see Decision 1)
serviceReplaced by competent_authorityThe "service" concept becomes the authority responsible for execution; see RFC-002 for detailed design

Why per-article instead of per-service?

  • Laws contain multiple articles with different legal effects
  • Article 2 may produce a BESCHIKKING (toekenning) while Article 3 produces a TOETS (goedkeuring)
  • This matches the legal reality better than a single service-level classification

The produces construct on an article's execution block declares what legal artifact the article produces when executed. legal_character (e.g. BESCHIKKING, VASTSTELLING) classifies the type of legal act; decision_type (e.g. TOEKENNING, AFWIJZING) classifies the outcome. These fields serve as filter targets for reactive hooks (see RFC-007) and lifecycle stages (see RFC-008).

Open question: Single vs multiple executions per article?

Currently execution is an object (one per article). Should we allow multiple executions per article?

The execution.produces property could enable different outcomes from the same execution:

  • Different legal_character values (BESCHIKKING vs TOETS)

This raises the question: should produces be an array to support multiple distinct outcomes from one article's logic?

Alternative: Move produces to action level instead of execution level. This way, different actions within one execution can produce different legal outcomes without needing multiple executions.

5. Metadata Field Migrations

POC v0.1.6 had several top-level metadata fields. This section documents how each is handled in v0.2.0:

POC v0.1.6 Fieldv0.2.0 StatusNotes
name (top-level)KeptNow supports plain text or internal # reference
name (in fields)KeptStill used in baseField for field names (parameters, input, output)
lawRemovedReplaced by bwb_id + officiele_titel for proper identification
descriptionRemovedArticle text field is self-describing
valid_fromKeptEffective date (inwerkingtredingsdatum; when law becomes effective)
-Added publication_datePublication date (publicatiedatum; when law was published)
referencesReplaced by requiresNow in machineReadableSection with structured format
legal_basis (top-level)KeptArray structure: [{law_id, article, description}]

6. Reference and Variable Notation

Two prefix conventions distinguish input from output references:

$ prefix - References to input (what you read):

  • Parameters: $bsn
  • Input fields: $leeftijd_op_verkiezingsdatum
  • Definitions: $MINIMUM_LEEFTIJD

Used in execution context (actions, conditions, source parameters).

# prefix - References to output (what is produced):

  • References named outputs within the same law
  • Used in metadata fields like name and competent_authority
  • Example: name: '#wet_naam' or competent_authority: '#bevoegd_gezag'

Reason: Make properties traceable to the source.

This is a convention, not enforced by the schema.

The regelrecht:// URI Scheme

Cross-law references use the regelrecht:// URI scheme:

regelrecht://{law_id}/{output_name}#{field}
  • law_id — the $id slug of the target law
  • output_name — the name of an output produced by one of the law's articles
  • #field (optional fragment) — extracts a specific field from the output (e.g. #value)

The engine resolves these URIs by finding the law by $id, locating the article that produces the named output, executing it, and extracting the requested field. This scheme is used for cross-law input references (RFC-003, RFC-007) and as annotation targets (RFC-005).

7. UUID Field: Removed

  • POC v0.1.6: Required UUIDv4 field at top level
  • v0.2.0: Removed

Rationale: No clear purpose identified. Can be reintroduced when a concrete use case emerges (e.g., signature hashes, content verification).

8. Temporal Specifications

Field values can have temporal metadata describing how they relate to time.

  • Location: baseField definition (available on parameters, input, output)
  • Adopted from v0.1.6: type, period_type, reference

Structure:

yaml
temporal:
  type: "period" | "point_in_time"    # Required: snapshot or range?
  period_type: "year" | "month" | "continuous"  # Granularity (for periods)
  reference: "$calculation_date" | "$january_first" | "$prev_january_first" | "$variable"

Usage examples:

  • type: period, period_type: year → yearly income
  • type: period, period_type: month → monthly insurance status
  • type: point_in_time, reference: $calculation_date → age at calculation moment

Not adopted: immutable_after

POC v0.1.6 had immutable_after: "P2Y" to indicate when values become final (e.g., revision period (herzieningstermijn)). This is removed because immutability/finality should be modeled as explicit rules in laws (e.g., AWIR).

9. Input Source Consolidation

The v0.1.6 schema had two separate mechanisms for external input:

  • sourceField with sourceReference (database lookups)
  • inputField with serviceReference (cross-law calls)

In v0.2.0, all inputs use inputField.source with a unified format:

yaml
# Reference to another law or regulation (with parameters)
source:
  regulation: wet_basisregistratie_personen
  output: leeftijd
  parameters:
    bsn: $bsn
  description: "Leeftijd conform BRP"

# Internal reference (within same law, regulation is omitted)
source:
  output: vermogen_onder_grens

# External data source (no regulation - must be resolved outside YAML)
source:
  output: verkiezingsdatum_tweede_kamer
  parameters:
    bsn: $bsn
  description: "Datum van de Tweede Kamerverkiezingen"

Source properties:

  • output (required): Output name to retrieve from the source
  • regulation (optional): Name of external law/regulation. If omitted, it's either an internal reference or an external data source that must be resolved outside the YAML.
  • parameters (optional): Parameters to pass when calling the source (e.g., bsn: $bsn)
  • description (optional): Human-readable description or legal reference

10. Article Numbering: Free-Form

  • Field: articles[].number is a free-form string
  • No separate identifier needed - number serves as both display name and identifier

Rationale: Laws have varying article structures:

  • Simple: "1", "2", "3"
  • With paragraphs: "2.1", "2.2"
  • Nested: "1.1.1", "2.1.3", "14.1.1"
  • With letters: "2a", "2.1.a"

By keeping number free-form:

  • Authors can model at any granularity (whole article, per paragraph (lid), per sub-paragraph (onderdeel))
  • No schema changes needed for different law structures
  • Formatting conventions can be agreed on separately if needed

11. Removal of Requirement Property

The v0.1.6 requirements property is removed in v0.2.0.

Reason: Obsolete - requirements can be inferred from actions during execution.

12. Operation and Type Changes

Minor changes to operations and types between v0.1.6 and v0.2.0:

Operation enum renames (for clarity):

v0.1.6v0.2.0
GREATER_OR_EQUALGREATER_THAN_OR_EQUAL
LESS_OR_EQUALLESS_THAN_OR_EQUAL

Type specification additions:

FieldChange
type_spec.unitAdded "days" unit
variableReferenceNow allows lowercase ($name in addition to $NAME)

Rationale:

  • Operation renames: More explicit naming matches common programming conventions
  • Days unit: Needed for laws that specify durations in days (e.g., deadlines (termijnen))
  • Lowercase variables: Allows more natural naming (e.g., $standaardpremie vs $STANDAARDPREMIE)

References