A note here: I wasted a couple of hours getting the end to end tests to work because the linter wouldn't allow us to create an async function without the
I will raise a separate issue and PR to handle this.
I have removed the JMESPath dependency here. It is an unnecessary dependency because we only use it for reading, and it could mislead users into thinking we support the whole JMESPath spec for writes. In fact, for this MVP, we only support the following write paths.
I am very confident in the implementation of these three cases as they are validated with property tests, which are much more thorough than traditional unit tests.
Add new @aws-lambda-powertools/data-masking package with support for:
- Irreversible field erasure with default or custom masking rules
- Field-level and full-payload encryption/decryption via AWS Encryption SDK
- Encryption context for integrity and authenticity
- Dot notation and [*] wildcard field selection
- Prototype pollution protection
Includes unit tests with property-based testing (fast-check), e2e test scaffolding, and user-facing documentation.
…card support Replace jmespath validation with native path resolution, add object wildcard (.*) support alongside existing array wildcard ([*]), update JSDoc, add property-based tests, encryption context provider tests, and enable data-masking in CI test matrix.
…functions Reduce cyclomatic complexity in walk by extracting wildcard/literal segment resolution into a helper. Convert module-level functions to arrow expressions per project code standards.
…olveWildcardEntries
Summary
Changes
This PR is an experiment in delivering a full feature, end to end, using spec-driven development and agentic coding. As such, I have set it as a draft. We may or may not merge this, but if we do, it will only be after a thorough review by the team. The purpose of this PR is as much to provoke discussion as it is to implement a feature. I would appreciate it if @dreamorosi and @sdangol could look at the code and give their opinions.
From my perspective, I think this was a very successful experiment: I am happy with the code quality and I also took the opportunity to add property tests, which are a perfect fit for this sort of logic.
Something I would note is that this probably worked so well because of how well-defined the issue was by @walmsles, and also that we could use the Python implementation as a reference.
One place I differed from the proposed implementation was that I don't batch the calls to KMS. This simplifies the API and means we mirror the Python implementation exactly. While ordinarily this would be a performance concern, we use the caching feature in the AWS Cryptography library to ensure that we only ever make one call to KMS when encrypting multiple fields.
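To make the caching argument concrete, here is a conceptual sketch of why per-field encryption does not have to mean per-field KMS calls. It is a model only: `createFakeKms`, `withDataKeyCache`, and `encryptField` are hypothetical names invented for this illustration, not the AWS Encryption SDK API, and the "encryption" is a placeholder string tag rather than real cryptography.

```typescript
// Conceptual model only: every name here is hypothetical, not a real SDK API.
type DataKey = { id: number; plaintext: string };

// Stand-in for KMS that counts how often a data key is requested.
const createFakeKms = () => {
  let calls = 0;
  return {
    generateDataKey: (): DataKey => {
      calls += 1;
      return { id: calls, plaintext: `key-${calls}` };
    },
    callCount: (): number => calls,
  };
};

// Wrap a key source so one data key is reused up to maxUses times
// before a fresh key is requested.
const withDataKeyCache = (source: () => DataKey, maxUses: number) => {
  let cached: DataKey | undefined;
  let uses = 0;
  return (): DataKey => {
    let key = cached;
    if (key === undefined || uses >= maxUses) {
      key = source();
      cached = key;
      uses = 0;
    }
    uses += 1;
    return key;
  };
};

// Placeholder "encryption": tags the value with the key id.
const encryptField = (value: string, key: DataKey): string =>
  `enc(${key.id}):${value}`;

const kms = createFakeKms();
const getKey = withDataKeyCache(kms.generateDataKey, 100);
const encryptedFields = ['card', 'email', 'phone'].map((value) =>
  encryptField(value, getKey())
);

console.log(encryptedFields.length); // 3 fields encrypted
console.log(kms.callCount()); // 1 key request
```

In this model the three fields share one data key, so the fake KMS is called once. A real caching layer would usually also bound reuse by limits such as key age, which the sketch leaves out.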
What's included
- @aws-lambda-powertools/data-masking package with erase, encrypt, and decrypt operations
- AWSEncryptionSDKProvider using KMS envelope encryption (@aws-crypto/client-node as optional peer dep)
- Dot notation, [*] array wildcards, and * object wildcards for field selection
- Property-based tests with fast-check
A note on field path resolution
The Python implementation uses jsonpath_ng for field selection, which natively supports both querying and path extraction for write-back. We considered using a JavaScript JSONPath library (e.g. jsonpath-plus) to match this approach, but decided against it for a few reasons:
- jsonpath-plus (9.8m weekly downloads) is marked as unmaintained by its maintainers
- @aws-lambda-powertools/jmespath is already available in-house
Instead, we use JMESPath to validate expressions and a small (~20 line) custom walker to resolve wildcards ([*] and *) into concrete paths for write-back. JMESPath is read-only by design so it can't be used for path extraction directly, but the walker is simple, well-tested, and avoids any new dependencies.
There is, however, another library, jsonpath, that has 3.8m weekly downloads. I am always hesitant to introduce new dependencies to the project, but I think if we want feature parity with Python we will need to take on the dependency. I would like to hear the maintainers' thoughts before committing to this course of action though.
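For reviewers who want a feel for what resolving wildcards into concrete paths means, here is a minimal sketch. It is not the code in this PR: `resolvePaths`, `tokenize`, and `FORBIDDEN_KEYS` are hypothetical names, and the path grammar (dot notation, [*] for arrays, * for object keys) is assumed from the description above.

```typescript
// Hypothetical sketch, not the walker shipped in this PR.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Keys we refuse to traverse, to avoid prototype pollution on write-back.
const FORBIDDEN_KEYS = ['__proto__', 'constructor', 'prototype'];

const isObject = (value: Json | undefined): value is { [key: string]: Json } =>
  typeof value === 'object' && value !== null && !Array.isArray(value);

// Split "orders[*].card.*" into ["orders", "[*]", "card", "*"].
const tokenize = (path: string): string[] =>
  path
    .replace(/\[\*\]/g, '.[*]')
    .split('.')
    .filter((segment) => segment.length > 0);

const walk = (node: Json | undefined, segments: string[], prefix: string): string[] => {
  const head = segments[0];
  if (head === undefined) return [prefix]; // path fully consumed
  const rest = segments.slice(1);
  const found: string[] = [];
  if (head === '[*]') {
    if (!Array.isArray(node)) return found;
    node.forEach((item, index) => {
      found.push(...walk(item, rest, `${prefix}[${index}]`));
    });
    return found;
  }
  if (!isObject(node)) return found;
  const join = (key: string): string => (prefix === '' ? key : `${prefix}.${key}`);
  if (head === '*') {
    for (const key of Object.keys(node)) {
      if (FORBIDDEN_KEYS.indexOf(key) !== -1) continue; // skip unsafe keys
      found.push(...walk(node[key], rest, join(key)));
    }
    return found;
  }
  if (FORBIDDEN_KEYS.indexOf(head) !== -1) {
    throw new Error(`Refusing to traverse "${head}"`);
  }
  if (!Object.prototype.hasOwnProperty.call(node, head)) return found;
  return walk(node[head], rest, join(head));
};

// Expand a wildcard path into the concrete paths that exist in `data`.
const resolvePaths = (data: Json, path: string): string[] =>
  walk(data, tokenize(path), '');

const payload: Json = {
  orders: [{ card: { number: '1', cvv: '2' } }, { card: { number: '3' } }],
};

console.log(resolvePaths(payload, 'orders[*].card.*'));
// [ 'orders[0].card.number', 'orders[0].card.cvv', 'orders[1].card.number' ]
```

Each concrete path can then be written to directly, which is the part a read-only query language cannot provide. Paths that match nothing resolve to an empty list in this sketch; whether the real implementation throws instead is not something the sketch settles.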
Issue number: closes #4960
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.