Feature/modular ner #7

Closed
nenamagdalenaa wants to merge 18 commits into ConductionNL:main from nenamagdalenaa:feature/modular-NER

@nenamagdalenaa
Collaborator

Changes:

  • Add the GLiNER Presidio Recognizer as a module
  • Disable the custom pattern recognizers (set to false)
    • Skip the pattern recognizer tests in the workflow
  • Add a changelog entry

MWest2020 and others added 17 commits March 2, 2026 15:25
Plugin architecture:
- src/api/plugins.yaml: YAML registry for all recognizers and the NER config
- src/api/utils/plugin_loader.py: loads plugins.yaml and imports PatternRecognizers by name
- src/api/utils/adapters/: stub for future transformer/LLM adapters
- src/api/services/text_analyzer.py: uses plugin_loader instead of direct imports
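
A minimal sketch of the by-name loading described above. It is an illustration only, not the code in plugin_loader.py: the recognizer classes, the `AVAILABLE` lookup table, the `REGISTRY` layout (a stand-in for the parsed plugins.yaml), and the `enabled` flag are all assumptions.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for PatternRecognizer classes; the project's real
# recognizers are not shown in this PR description.
@dataclass
class DutchPhoneRecognizer:
    supported_entity: str = "PHONE_NUMBER"


@dataclass
class DutchIbanRecognizer:
    supported_entity: str = "IBAN"


# Classes that the registry is allowed to reference by name.
AVAILABLE = {cls.__name__: cls for cls in (DutchPhoneRecognizer, DutchIbanRecognizer)}

# Stand-in for the parsed contents of plugins.yaml (assumed layout).
REGISTRY = {
    "recognizers": [
        {"name": "DutchPhoneRecognizer", "enabled": True},
        {"name": "DutchIbanRecognizer", "enabled": False},
    ]
}


def load_recognizers(registry):
    """Instantiate every enabled recognizer listed in the registry."""
    loaded = []
    for entry in registry.get("recognizers", []):
        if not entry.get("enabled", True):
            continue  # disabled entries are skipped, not treated as errors
        name = entry["name"]
        if name not in AVAILABLE:
            raise ValueError(f"Unknown recognizer: {name}")
        loaded.append(AVAILABLE[name]())
    return loaded
```

With a registry like this, switching a recognizer on or off is a configuration change rather than an edit to the analyzer's imports.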

Benchmarking:
- benchmarks/evaluate.py: custom IoU-based span evaluator (no en_core_web_sm)
- benchmarks/data/dutch_pii_sentences.json: 24 labelled Dutch sentences covering all 15 entity types
- benchmarks/thresholds.yaml: minimum precision/recall thresholds per entity type
- .github/workflows/benchmark.yml: CI quality gate on main/staging
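
To illustrate what an IoU-based span evaluation can look like, here is a rough sketch. It is not the contents of benchmarks/evaluate.py: the function names, the `(entity_type, start, end)` tuple format, the greedy matching, and the 0.5 IoU cutoff are assumptions chosen for the example, and it reports overall rather than per-entity scores.

```python
def span_iou(a, b):
    """Intersection over union of two (start, end) character spans."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union else 0.0


def evaluate(gold, predicted, iou_threshold=0.5):
    """Return (precision, recall) for lists of (entity_type, start, end).

    Each predicted span is greedily matched to at most one unmatched gold
    span of the same entity type whose IoU meets the threshold.
    """
    unmatched = list(gold)
    true_positives = 0
    for p_type, p_start, p_end in predicted:
        for g in unmatched:
            same_type = g[0] == p_type
            if same_type and span_iou((p_start, p_end), (g[1], g[2])) >= iou_threshold:
                unmatched.remove(g)
                true_positives += 1
                break
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def passes_gate(precision, recall, min_precision, min_recall):
    """Quality gate: both scores must reach their minimum thresholds."""
    return precision >= min_precision and recall >= min_recall
```

A CI job could then fail the build whenever `passes_gate` returns False for the thresholds read from a file such as thresholds.yaml.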

Kubernetes + ArgoCD:
- k8s/base/: hardened deployment, service, ingress (NGINX + basic-auth + cert-manager)
- k8s/overlays/{dev,acc,prod}/: Kustomize overlays per environment
- argocd/applicationset.yaml: a single ApplicationSet for dev/acc/prod
- Ingress hostnames: api.{dev.,acc.}openanonymiser.commonground.nu

CI improvements:
- docker-build.yml: test job before build; staging branch → acc tag
- CONTRIBUTING.md: added an agentic engineering section
- presidio-evaluator added as a dev dependency

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
readOnlyRootFilesystem=true in the k8s deployment prevents creating
'logs/' in the working directory. /tmp is already mounted as an emptyDir.
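
The commit message states the problem but not the exact fix. One plausible approach, assumed here purely for illustration, is to try the preferred log directory and fall back to the system temp directory (typically /tmp) when it cannot be created or written. The function name `resolve_log_dir` is hypothetical.

```python
import os
import tempfile


def resolve_log_dir(preferred="logs"):
    """Return `preferred` if it can be created and written to.

    Otherwise fall back to the system temp directory, which stays writable
    when it is mounted as an emptyDir on a read-only root filesystem.
    """
    try:
        os.makedirs(preferred, exist_ok=True)
        if os.access(preferred, os.W_OK):
            return preferred
    except OSError:
        # Covers read-only filesystems, permission errors, and invalid paths.
        pass
    return tempfile.gettempdir()
```

Logging setup can then build its file handler path from the returned directory instead of hard-coding 'logs/'.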

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2 participants