feat(data-masking): add Data Masking utility by svozza · Pull Request #5143 · aws-powertools/powertools-lambda-typescript

svozza · 2026-03-29T11:08:11Z

Summary

Changes

This PR is an experiment in delivering a full feature, end to end, using spec-driven development and agentic coding. As such, I have set it as a draft. We may or may not merge this, but if we do, it will only be after a thorough review by the team. The purpose of this PR is as much to provoke discussion as it is to implement a feature. I would appreciate it if @dreamorosi and @sdangol could look at the code and give their opinions.

From my perspective, I think this was a very successful experiment: I am happy with the code quality and I also took the opportunity to add property tests, which are a perfect fit for this sort of logic.

Something I would note is that this probably worked so well because of how well-defined the issue was by @walmsles, and also that we could use the Python implementation as a reference.

One place I differed from the proposed implementation was that I don't batch the calls to KMS. This simplifies the API and means we mirror the Python implementation exactly. While ordinarily this would be a performance concern, we use the caching feature in the AWS Cryptography library to ensure that we only ever make one call to KMS when encrypting multiple fields.

What's included

@aws-lambda-powertools/data-masking package with erase, encrypt, and decrypt operations
AWSEncryptionSDKProvider using KMS envelope encryption (@aws-crypto/client-node as optional peer dep)
Field selection via dot notation, [*] array wildcards, and * object wildcards
Custom masking rules (regex, dynamic length, custom strings)
Encryption context (AAD) for integrity and authenticity
Prototype pollution protection
Unit tests with property-based testing via fast-check
E2e test scaffolding (CDK stack + Lambda handlers)
Documentation mirroring the Python Powertools data masking docs
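To make the three kinds of custom masking rule listed above concrete, here is a minimal sketch. The `MaskingRule` shape, the `maskValue` helper, and the `DEFAULT_MASK` value are hypothetical names chosen for illustration and are not taken from the package's actual API.

```typescript
// Hypothetical rule shape: one variant per rule kind named in the list above.
type MaskingRule =
  | { kind: "regex"; pattern: RegExp; replacement: string }
  | { kind: "dynamic" }
  | { kind: "custom"; mask: string };

// Assumed default mask used when no rule is supplied.
const DEFAULT_MASK = "*****";

const maskValue = (value: string, rule?: MaskingRule): string => {
  if (rule === undefined) return DEFAULT_MASK;
  switch (rule.kind) {
    case "regex":
      // Keep the captured groups and replace the rest.
      return value.replace(rule.pattern, rule.replacement);
    case "dynamic":
      // Preserve the original length so the shape of the data stays visible.
      return "*".repeat(value.length);
    case "custom":
      // Replace the whole value with a fixed string.
      return rule.mask;
  }
};

const maskedEmail = maskValue("jane@example.com", {
  kind: "regex",
  pattern: /^(.).*(@.*)$/,
  replacement: "$1****$2",
});
console.log(maskedEmail); // j****@example.com
```

In this sketch a regex rule keeps part of the value readable, a dynamic rule hides the content but not the length, and a custom rule hides both.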

A note on field path resolution

The Python implementation uses jsonpath_ng for field selection, which natively supports both querying and path extraction for write-back. We considered using a JavaScript JSONPath library (e.g. jsonpath-plus) to match this approach, but decided against it for a few reasons:

jsonpath-plus (9.8m weekly downloads) is marked as unmaintained by its maintainers
We already have @aws-lambda-powertools/jmespath in-house

Instead, we use JMESPath to validate expressions and a small (~20 line) custom walker to resolve wildcards ([*] and *) into concrete paths for write-back. JMESPath is read-only by design so it can't be used for path extraction directly, but the walker is simple, well-tested, and avoids any new dependencies.
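As a rough illustration of what such a walker does, the sketch below expands an expression containing [*] and * into the concrete paths that exist in a given object. It is not the code from this PR: `tokenize`, `resolvePaths`, and the `Json` type are names made up for the example, and it assumes that keys never contain dots or brackets.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type PathSegment = string | number;

// Split "orders[*].card" (or "orders.[*].card") into ["orders", "[*]", "card"].
const tokenize = (expression: string): string[] =>
  expression
    .replace(/\[\*\]/g, ".[*]")
    .split(".")
    .filter((segment) => segment.length > 0);

// Return every concrete path in `data` that matches `expression`.
const resolvePaths = (data: Json, expression: string): PathSegment[][] => {
  const results: PathSegment[][] = [];
  const walk = (node: Json, segments: string[], prefix: PathSegment[]): void => {
    if (segments.length === 0) {
      results.push(prefix);
      return;
    }
    if (node === null || typeof node !== "object") return;
    const head = segments[0];
    const rest = segments.slice(1);
    if (Array.isArray(node)) {
      // Arrays can only be traversed with the [*] wildcard.
      if (head !== "[*]") return;
      node.forEach((item, index) => walk(item, rest, [...prefix, index]));
      return;
    }
    if (head === "[*]") return;
    // "*" fans out over every own key; a literal segment selects a single key.
    const keys = head === "*" ? Object.keys(node) : [head];
    for (const key of keys) {
      if (Object.prototype.hasOwnProperty.call(node, key)) {
        walk(node[key], rest, [...prefix, key]);
      }
    }
  };
  walk(data, tokenize(expression), []);
  return results;
};

const payload: Json = {
  orders: [{ id: 1, card: "4111" }, { id: 2, card: "4222" }],
  users: { a: { email: "x" }, b: { email: "y" } },
};
console.log(resolvePaths(payload, "orders[*].card"));
console.log(resolvePaths(payload, "users.*.email"));
```

For the sample payload, the first call yields the paths orders → 0 → card and orders → 1 → card, and the second yields users → a → email and users → b → email. Each concrete path can then be used to write the masked or encrypted value back into the object.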

There is, however, another library, jsonpath, that has 3.8m weekly downloads. I am always hesitant to introduce new dependencies to the project, but I think if we want feature parity with Python we will need to take on the dependency. I would like to hear the maintainers' thoughts before committing to this course of action, though.

Issue number: closes #4960

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

svozza · 2026-03-29T22:30:26Z

A note here: I wasted a couple of hours getting the end-to-end tests to work because the linter wouldn't allow us to create an async function without the await keyword, and the LLM ended up accidentally making the Lambda function sync because of this. There are perfectly valid reasons to use the async keyword in a function that doesn't use await, e.g., when the value we return is a call to an async function: return someAsyncFunction(value). This has caused me issues multiple times in the past and I think we should disable this rule. I want a developer or an LLM to be able to tell that a function is async, and the best signal for that is the async keyword, not whether await happens to be used in the body. In fact, the Biome docs already say that this rule is not recommended:

Summary

Rule available since: v1.4.0

Diagnostic Category: lint/suspicious/useAwait

This rule isn’t recommended, so you need to enable it.

I will raise a separate issue and PR to handle this.
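For anyone less familiar with the distinction, the sketch below shows why the async keyword matters even when the body contains no await. The handler names are made up for the example and are not the e2e handlers from this PR.

```typescript
const someAsyncFunction = async (value: number): Promise<number> => value * 2;

// Marked async with no await in the body: the function always returns a Promise,
// and a synchronous throw is turned into a rejected Promise.
const asyncHandler = async (value: number): Promise<number> => {
  if (value < 0) throw new Error("negative input");
  return someAsyncFunction(value);
};

// Same body without the async keyword: the happy path still returns a Promise,
// but the throw now escapes synchronously, before any Promise exists.
const plainHandler = (value: number): Promise<number> => {
  if (value < 0) throw new Error("negative input");
  return someAsyncFunction(value);
};

// The async version reports the error through the returned Promise.
asyncHandler(-1).catch((error: Error) => console.log("rejected:", error.message));

// The plain version has to be wrapped in try/catch by the caller.
try {
  plainHandler(-1);
} catch (error) {
  console.log("thrown synchronously:", (error as Error).message);
}
```

Both functions have the same declared return type, so the difference is invisible to a caller reading the signature alone. The async keyword is what guarantees that every outcome, including errors, arrives through the Promise.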

svozza · 2026-04-02T13:40:07Z

I have removed the JMESPath dependency here. It's an unnecessary dependency because we only use it for reading, and it could mislead users into thinking we support the whole JMESPath spec for writes. In fact, for this MVP, we only support the following write paths:

dot notation (obj.key)
wildcard for object keys (obj.*.key)
wildcard for arrays (obj.[*].key)

I am very confident in the implementation of these three cases as they are validated with property tests, which are much more thorough than traditional unit tests.
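To show what a write along one of these paths involves, here is a minimal sketch of the simplest case, dot notation, together with the kind of prototype pollution guard mentioned in the PR description. `writeAtPath`, `FORBIDDEN_KEYS`, and `JsonValue` are hypothetical names, and the sketch assumes wildcards have already been expanded into concrete keys.

```typescript
type JsonValue = null | boolean | number | string | JsonValue[] | { [key: string]: JsonValue };

// Keys that would let a path reach into an object's prototype chain.
const FORBIDDEN_KEYS = new Set(["__proto__", "constructor", "prototype"]);

// Overwrite the value at a dot-notation path in place.
// Returns false when the path does not exist, so missing fields are never created.
const writeAtPath = (data: { [key: string]: JsonValue }, path: string, value: JsonValue): boolean => {
  const keys = path.split(".");
  if (keys.some((key) => FORBIDDEN_KEYS.has(key))) {
    throw new Error(`Refusing to write through unsafe key in path: ${path}`);
  }
  let node: JsonValue = data;
  for (const key of keys.slice(0, -1)) {
    if (node === null || typeof node !== "object" || Array.isArray(node)) return false;
    if (!Object.prototype.hasOwnProperty.call(node, key)) return false;
    node = node[key];
  }
  if (node === null || typeof node !== "object" || Array.isArray(node)) return false;
  const last = keys[keys.length - 1];
  if (!Object.prototype.hasOwnProperty.call(node, last)) return false;
  node[last] = value;
  return true;
};

const record = { customer: { name: "Jane", email: "jane@example.com" } };
writeAtPath(record, "customer.email", "*****");
console.log(record.customer.email); // *****
```

Checking own properties at every step, and rejecting the three special keys up front, means a crafted path such as `__proto__.polluted` cannot modify `Object.prototype`.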

Add new @aws-lambda-powertools/data-masking package with support for:
- Irreversible field erasure with default or custom masking rules
- Field-level and full-payload encryption/decryption via AWS Encryption SDK
- Encryption context for integrity and authenticity
- Dot notation and [*] wildcard field selection
- Prototype pollution protection

Includes unit tests with property-based testing (fast-check), e2e test scaffolding, and user-facing documentation.


…card support Replace jmespath validation with native path resolution, add object wildcard (.*) support alongside existing array wildcard ([*]), update JSDoc, add property-based tests, encryption context provider tests, and enable data-masking in CI test matrix.


sonarqubecloud · 2026-04-02T15:18:33Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

pull-request-size bot added the size/XXL label (PRs with 1K+ LOC, largely documentation related) on Mar 29, 2026

svozza force-pushed the 001-data-masking branch 4 times, most recently from b8ad946 to 8bf61c2, March 29, 2026 19:00

svozza temporarily deployed to e2e-tests March 29, 2026 19:03 — with GitHub Actions Inactive

svozza temporarily deployed to e2e-tests March 29, 2026 19:15 — with GitHub Actions Inactive

svozza added 16 commits April 2, 2026 17:15

test(data-masking): add type tests for generics

d92aea3

chore: suppress sonarqube false positive on test regex

574410c

chore: fix sonarqube findings

2a1d01c

refactor(data-masking): reduce cognitive complexity of erase method

393d126

chore: suppress sonarqube CDK construct warnings

b865f76

ci: add data-masking to e2e test matrix

33a675c

fix(data-masking): wire up e2e and type test scripts

13a2d92

fix(data-masking): fix e2e stack output key lookups

3e83d80

fix(data-masking): grant KMS encrypt/decrypt to e2e Lambda functions

7a20518

debug: add logging to erase handler

42b0fe5

debug: add logging to invoke helper

77a50c9

fix(data-masking): return Promise from erase e2e handler for ESM Lambda runtime

22d3a72

refactor(data-masking): extract resolveWildcardEntries and use arrow functions

2e042e5

Reduce cyclomatic complexity in walk by extracting wildcard/literal segment resolution into a helper. Convert module-level functions to arrow expressions per project code standards.

fix(data-masking): replace inverted negation with early return in resolveWildcardEntries

03110fd

svozza wants to merge 16 commits into main from 001-data-masking
