From b977e07da8ce256b09080b3063e8e31c227eed51 Mon Sep 17 00:00:00 2001 From: Sahil Bhimjiani Date: Fri, 19 Jun 2026 15:53:49 -0500 Subject: [PATCH 1/2] docs(aws-serverless): update LMI skill for region expansion and scheduled scaling - Region availability: now all commercial AWS Regions except Israel (Tel Aviv), Middle East (Bahrain), Middle East (UAE), and Asia Pacific (Auckland); update Unsupported Region handling. - Scheduled scaling: document EventBridge Scheduler-based scheduled scaling of Min/Max execution environments for predictable traffic (SKILL.md, configuration-guide.md, infrastructure-setup.md). --- .../aws-lambda-managed-instances/SKILL.md | 23 ++++-- .../references/configuration-guide.md | 20 +++++ .../references/infrastructure-setup.md | 77 +++++++++++++++++++ 3 files changed, 112 insertions(+), 8 deletions(-) diff --git a/plugins/aws-serverless/skills/aws-lambda-managed-instances/SKILL.md b/plugins/aws-serverless/skills/aws-lambda-managed-instances/SKILL.md index 4ea2189d..22d675e9 100644 --- a/plugins/aws-serverless/skills/aws-lambda-managed-instances/SKILL.md +++ b/plugins/aws-serverless/skills/aws-lambda-managed-instances/SKILL.md @@ -4,13 +4,14 @@ description: > Evaluate, configure, and migrate workloads to AWS Lambda Managed Instances (LMI). Triggers on: Lambda Managed Instances, LMI, capacity provider, multi-concurrency Lambda, dedicated instance Lambda, EC2-backed Lambda, cold start elimination, Graviton Lambda, - instance type for Lambda, Lambda cost optimization with Reserved Instances or Savings Plans. - Also trigger when users describe high-volume predictable workloads seeking cost savings, + instance type for Lambda, scheduled scaling for LMI, Lambda cost optimization with + Reserved Instances or Savings Plans. Also trigger when users describe high-volume + predictable workloads seeking cost savings, want to scale LMI capacity on a schedule, or compare Lambda vs EC2 for steady-state traffic. 
For standard Lambda without LMI, use the aws-lambda skill instead. argument-hint: "[describe your workload or what you need help with]" metadata: - tags: lambda, lmi, managed-instances, ec2, capacity-provider, multi-concurrency, cost-optimization + tags: lambda, lmi, managed-instances, ec2, capacity-provider, multi-concurrency, cost-optimization, scheduled-scaling --- # AWS Lambda Managed Instances (LMI) @@ -22,10 +23,10 @@ For standard Lambda development, see [aws-lambda skill](../aws-lambda/). For SAM ## When to Load Reference Files - **Cost comparison**, **pricing analysis**, **Lambda vs LMI cost**, **Savings Plans**, or **Reserved Instances** -> see [references/cost-comparison.md](references/cost-comparison.md) -- **Instance types**, **memory sizing**, **vCPU ratios**, **scaling tuning**, or **capacity provider config** -> see [references/configuration-guide.md](references/configuration-guide.md) +- **Instance types**, **memory sizing**, **vCPU ratios**, **scaling tuning**, **scheduled scaling**, or **capacity provider config** -> see [references/configuration-guide.md](references/configuration-guide.md) - **Thread safety**, **concurrency model**, **code review checklist**, **Powertools compatibility**, or **multi-concurrency readiness** -> see [references/thread-safety.md](references/thread-safety.md) - **Before/after code examples**, **runtime-specific migration** (Node.js, Python, Java, .NET), or **connection pooling** -> see [references/migration-patterns.md](references/migration-patterns.md) -- **IAM roles**, **VPC setup**, **CLI commands**, **SAM template**, or **CDK example** -> see [references/infrastructure-setup.md](references/infrastructure-setup.md) and [scripts/setup-lmi.sh](scripts/setup-lmi.sh) +- **IAM roles**, **VPC setup**, **CLI commands**, **SAM template**, **CDK example**, or **scheduled scaling setup (EventBridge Scheduler)** -> see [references/infrastructure-setup.md](references/infrastructure-setup.md) and 
[scripts/setup-lmi.sh](scripts/setup-lmi.sh) - **Errors**, **throttling**, **debugging**, or **stuck deployments** -> see [references/troubleshooting.md](references/troubleshooting.md) ## Quick Decision: Is LMI Right for This Workload? @@ -77,6 +78,8 @@ For discount analysis (Savings Plans, Reserved Instances), refer users to the [A **Scaling**: MinExecutionEnvironments (default 3), MaxVCpuCount (default 400), TargetResourceUtilization. +**Scheduled scaling**: For predictable traffic (business hours, marketing events), use EventBridge Scheduler to adjust Min/Max execution environments on a one-time or recurring schedule — scale up before peak, scale down or to zero when idle. + See [references/configuration-guide.md](references/configuration-guide.md) for decision trees and detailed tuning. ### Step 4: Migrate the Code @@ -135,8 +138,10 @@ See [references/infrastructure-setup.md](references/infrastructure-setup.md) for ### Operations - Do: Set CloudWatch alarms on throttle rate > 1% and CPU > 80% +- Do: Use scheduled scaling (EventBridge Scheduler) for predictable traffic — raise Min/Max before peak periods and lower them (or scale to zero) when idle - Don't: Manually terminate LMI EC2 instances (delete the capacity provider instead) - Don't: Forget to publish a version — unpublished functions cannot run on LMI +- Don't: Rely on a deactivated (Min=Max=0) function to self-recover — schedule an explicit scale-up to reactivate it ## Limits Quick Reference @@ -172,7 +177,7 @@ REQUIRED: AWS credentials configured on the host machine. ### Regional Availability -Currently available: us-east-1, us-east-2, us-west-2, ap-northeast-1, eu-west-1. Expanding to all commercial regions soon. +Available in all commercial AWS Regions except Israel (Tel Aviv), Middle East (Bahrain), Middle East (UAE), and Asia Pacific (Auckland). 
Check the [Lambda Managed Instances documentation](https://docs.aws.amazon.com/lambda/latest/dg/lambda-managed-instances.html) for the latest regional availability. @@ -204,12 +209,14 @@ Override: "use SAM" → SAM YAML, "use CloudFormation" → CloudFormation YAML. ### Unsupported Region -- State: "Lambda Managed Instances is not yet available in [region]" -- List available regions +- State: "Lambda Managed Instances is not available in [region]" +- Name the excluded regions: Israel (Tel Aviv), Middle East (Bahrain), Middle East (UAE), Asia Pacific (Auckland) +- Suggest the nearest supported region ## Resources - [Lambda Managed Instances Docs](https://docs.aws.amazon.com/lambda/latest/dg/lambda-managed-instances.html) +- [Scaling LMI & Scheduled Scaling Docs](https://docs.aws.amazon.com/lambda/latest/dg/lambda-managed-instances-scaling.html) - [Introducing LMI (AWS Blog)](https://aws.amazon.com/blogs/aws/introducing-aws-lambda-managed-instances-serverless-simplicity-with-ec2-flexibility/) - [Build High-Performance Apps with LMI](https://aws.amazon.com/blogs/compute/build-high-performance-apps-with-aws-lambda-managed-instances/) - [Migrating Functions to LMI (AWS Blog)](https://aws.amazon.com/blogs/compute/migrating-your-functions-to-aws-lambda-managed-instances/) diff --git a/plugins/aws-serverless/skills/aws-lambda-managed-instances/references/configuration-guide.md b/plugins/aws-serverless/skills/aws-lambda-managed-instances/references/configuration-guide.md index cc560c82..dcff8362 100644 --- a/plugins/aws-serverless/skills/aws-lambda-managed-instances/references/configuration-guide.md +++ b/plugins/aws-serverless/skills/aws-lambda-managed-instances/references/configuration-guide.md @@ -51,6 +51,26 @@ Total capacity = MinExecutionEnvironments × PerExecutionEnvironmentMaxConcurren | AllowedInstanceTypes | All | Restrict only for specific hardware needs | | ExcludedInstanceTypes | None | Exclude expensive types in dev/test | +## Scheduled Scaling (Predictable 
Traffic) + +For workloads with known traffic patterns (business hours, marketing events, batch windows), use [Amazon EventBridge Scheduler](https://docs.aws.amazon.com/scheduler/latest/UserGuide/managing-targets-universal.html) to adjust a function's `MinExecutionEnvironments` and `MaxExecutionEnvironments` on a one-time or recurring schedule. A schedule (cron or rate expression) targets the Lambda `PutFunctionScalingConfig` API as an EventBridge Scheduler universal target, passing new Min/Max values in the input payload. + +**Behavior:** + +- Scheduled scaling sets the provisioned floor and ceiling. Actual scaling between Min and Max still responds to CPU utilization and concurrency saturation. +- If traffic more than doubles within 5 minutes of a scheduled scale-up, you may still see throttles while capacity provisions. +- Setting both `MinExecutionEnvironments` and `MaxExecutionEnvironments` to 0 deactivates the function version (instances terminate). A deactivated function does NOT auto-recover — schedule a separate action with non-zero values to reactivate it. + +**Common patterns:** + +| Pattern | Scale-up schedule | Scale-down schedule | +| ---------------------- | ----------------------------------- | -------------------------------- | +| Business hours | Raise Min/Max before work starts | Lower Min/Max after hours | +| Marketing/launch event | Raise Min ahead of the campaign | Restore baseline after the event | +| Idle scale-to-zero | Reactivate (non-zero) before demand | Set Min=Max=0 when idle | + +See [infrastructure-setup.md](infrastructure-setup.md) for the EventBridge Scheduler IAM role and `create-schedule` CLI examples. 
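
As a rough illustration of the input payload described above, the following sketch assembles the `--target` object for a schedule. The universal-target ARN and the `Input` shape are taken from the `create-schedule` examples in infrastructure-setup.md. The helper name `buildScalingTarget`, the role ARN placeholder, and the example Min/Max values are hypothetical.

```javascript
// Universal-target ARN for the Lambda PutFunctionScalingConfig API,
// as used in the create-schedule examples in infrastructure-setup.md.
const SCALING_TARGET_ARN =
  "arn:aws:scheduler:::aws-sdk:lambda:PutFunctionScalingConfig";

// Hypothetical helper (not an AWS API): returns the object passed to --target.
function buildScalingTarget({ roleArn, functionName, qualifier, min, max }) {
  return {
    Arn: SCALING_TARGET_ARN,
    RoleArn: roleArn,
    // The target Input is a JSON *string*, so the API parameters are serialized here.
    Input: JSON.stringify({
      FunctionName: functionName,
      Qualifier: qualifier,
      FunctionScalingConfig: {
        MinExecutionEnvironments: min,
        MaxExecutionEnvironments: max,
      },
    }),
  };
}

const scaleUpTarget = buildScalingTarget({
  roleArn: "<scheduler-execution-role-arn>", // placeholder, supply your own
  functionName: "my-lmi-function",
  qualifier: "$LATEST.PUBLISHED",
  min: 100,
  max: 1000,
});

// The printed JSON can be passed as the --target argument of create-schedule.
console.log(JSON.stringify(scaleUpTarget));
```

Generating the target this way avoids hand-escaping the nested quotes in `Input`, which is the part of the CLI call that is easiest to get wrong.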
+ ## Monitoring Thresholds - **CPU > 80%**: reduce concurrency or add vCPUs diff --git a/plugins/aws-serverless/skills/aws-lambda-managed-instances/references/infrastructure-setup.md b/plugins/aws-serverless/skills/aws-lambda-managed-instances/references/infrastructure-setup.md index 83deaef3..e88a1f06 100644 --- a/plugins/aws-serverless/skills/aws-lambda-managed-instances/references/infrastructure-setup.md +++ b/plugins/aws-serverless/skills/aws-lambda-managed-instances/references/infrastructure-setup.md @@ -224,6 +224,83 @@ Resources: CapacityProviderArn: !GetAtt MyCP.Arn ``` +## Scheduled Scaling (EventBridge Scheduler) + +For predictable traffic, adjust `MinExecutionEnvironments`/`MaxExecutionEnvironments` on a schedule using [Amazon EventBridge Scheduler](https://docs.aws.amazon.com/scheduler/latest/UserGuide/managing-targets-universal.html). The schedule calls the Lambda `PutFunctionScalingConfig` API directly as a universal target — no Lambda code or extra glue required. + +### 1. Scheduler execution role + +Trust policy (allow EventBridge Scheduler to assume the role): + +```json +{ + "Version": "2012-10-17", + "Statement": [{ + "Effect": "Allow", + "Principal": { "Service": "scheduler.amazonaws.com" }, + "Action": "sts:AssumeRole" + }] +} +``` + +Permissions (call `PutFunctionScalingConfig` on the target function): + +```json +{ + "Version": "2012-10-17", + "Statement": [{ + "Effect": "Allow", + "Action": "lambda:PutFunctionScalingConfig", + "Resource": "arn:aws:lambda:*:*:function:my-lmi-function" + }] +} +``` + +### 2. Create schedules + +Scale up before peak (08:00 UTC daily): + +```bash +aws scheduler create-schedule \ + --name ScaleUpLmi \ + --schedule-expression "cron(0 8 * * ? 
*)" \ + --flexible-time-window '{"Mode": "OFF"}' \ + --target '{ + "Arn": "arn:aws:scheduler:::aws-sdk:lambda:PutFunctionScalingConfig", + "RoleArn": "arn:aws:iam:::role/eventbridge-scheduler-role", + "Input": "{\"FunctionName\": \"my-lmi-function\", \"Qualifier\": \"$LATEST.PUBLISHED\", \"FunctionScalingConfig\": {\"MinExecutionEnvironments\": 100, \"MaxExecutionEnvironments\": 1000}}" + }' +``` + +Scale down after peak (18:00 UTC daily): + +```bash +aws scheduler create-schedule \ + --name ScaleDownLmi \ + --schedule-expression "cron(0 18 * * ? *)" \ + --flexible-time-window '{"Mode": "OFF"}' \ + --target '{ + "Arn": "arn:aws:scheduler:::aws-sdk:lambda:PutFunctionScalingConfig", + "RoleArn": "arn:aws:iam:::role/eventbridge-scheduler-role", + "Input": "{\"FunctionName\": \"my-lmi-function\", \"Qualifier\": \"$LATEST.PUBLISHED\", \"FunctionScalingConfig\": {\"MinExecutionEnvironments\": 5, \"MaxExecutionEnvironments\": 20}}" + }' +``` + +Set both values to `0` to deactivate during idle periods; schedule a separate non-zero action to reactivate (a deactivated function does not auto-recover). + +### Manual override + +Update scaling limits directly at any time: + +```bash +aws lambda put-function-scaling-config \ + --function-name my-lmi-function \ + --qualifier '$LATEST.PUBLISHED' \ + --function-scaling-config MinExecutionEnvironments=5,MaxExecutionEnvironments=20 +``` + +`MinExecutionEnvironments` and `MaxExecutionEnvironments` accept values from 0 to 15000 and must be set together. Setting them on `$LATEST.PUBLISHED` propagates to future published versions. 
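
The constraints above (both values set together, each from 0 to 15000, and Min=Max=0 meaning deactivation) can be checked before a schedule or manual override is submitted. This is a minimal sketch under stated assumptions: `checkScalingConfig` is a hypothetical local helper, not part of any AWS SDK, and the rule that Min must not exceed Max is an assumption added for illustration.

```javascript
const MAX_ENVIRONMENTS = 15000; // upper bound quoted in the text above

// Hypothetical pre-flight check for a FunctionScalingConfig object.
function checkScalingConfig(config) {
  const { MinExecutionEnvironments: min, MaxExecutionEnvironments: max } = config;

  // Both limits must be supplied in the same request.
  if (min === undefined || max === undefined) {
    return { ok: false, reason: "Min and Max must be set together" };
  }
  for (const value of [min, max]) {
    if (!Number.isInteger(value) || value < 0 || value > MAX_ENVIRONMENTS) {
      return { ok: false, reason: `values must be integers from 0 to ${MAX_ENVIRONMENTS}` };
    }
  }
  // Assumption: a floor above the ceiling is treated as a mistake here.
  if (min > max) {
    return { ok: false, reason: "Min must not exceed Max" };
  }
  // Min=Max=0 deactivates the function version; an explicit scale-up is needed later.
  return { ok: true, deactivates: min === 0 && max === 0 };
}

console.log(checkScalingConfig({ MinExecutionEnvironments: 5, MaxExecutionEnvironments: 20 }));
console.log(checkScalingConfig({ MinExecutionEnvironments: 0, MaxExecutionEnvironments: 0 }));
console.log(checkScalingConfig({ MinExecutionEnvironments: 5 }));
```

Surfacing the `deactivates` flag makes it harder to schedule a scale-to-zero action without also scheduling the matching reactivation.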
+ ## Cleanup ```bash From 66dd3cf5acdf374cc0d79af2bf51fa6fbb975c07 Mon Sep 17 00:00:00 2001 From: Aleksandar Maksimovic Date: Mon, 22 Jun 2026 11:45:33 -0700 Subject: [PATCH 2/2] fix(dsql): reframe JSON/JSONB/array storage as a choice, not "MUST use JSONB" (#200) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * fix(dsql): reframe JSON/JSONB/array storage as a choice, not "MUST use JSONB" The native JSON/JSONB rollout left this skill prescribing a single answer. Two drift directions existed: some files said "MUST serialize arrays as JSONB" / "prefer JSONB over JSON" (over-correction), others still said "store as TEXT" / "SET -> TEXT" (stale pre-JSON framing). Bring all nine steering files in line with the already-reviewed wording in awslabs/mcp: serializing is the MUST (DSQL has no array column type); the format is a choice — PREFER JSONB for queryable values, MAY use TEXT when opaque, JSON when writes dominate or byte-exact preservation matters; keep existing JSON columns as JSON when migrating. ASK the user. Files: SKILL.md, references/development-guide.md, references/onboarding.md, references/troubleshooting.md, references/examples/{patterns,schema}.md, references/mysql-migrations/{type-mapping,full-example,ddl-type-alternatives}.md. Bump databases-on-aws to 1.3.3. * ci: exclude broken registry rule bbp-pattern-inject from semgrep The community registry rule bbp-pattern-inject fails to parse ("Invalid pattern for Python: Stdlib.Parsing.Parse_error"), making semgrep exit 2 and fail every PR regardless of findings. Exclude it the same way other unwanted registry rules already are. 
--- .claude-plugin/marketplace.json | 2 +- .github/workflows/security-scanners.yml | 1 + .semgrep.yaml | 5 +++ .../.claude-plugin/plugin.json | 2 +- .../.codex-plugin/plugin.json | 2 +- plugins/databases-on-aws/skills/dsql/SKILL.md | 2 +- .../dsql/references/development-guide.md | 13 ++++-- .../dsql/references/examples/patterns.md | 44 ++++++++++++++++--- .../skills/dsql/references/examples/schema.md | 4 +- .../mysql-migrations/ddl-type-alternatives.md | 24 ++++++++-- .../mysql-migrations/full-example.md | 11 ++--- .../mysql-migrations/type-mapping.md | 22 +++++----- .../skills/dsql/references/onboarding.md | 8 ++-- .../skills/dsql/references/troubleshooting.md | 8 ++-- 14 files changed, 107 insertions(+), 41 deletions(-) diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json index 31167ff3..a77156a4 100644 --- a/.claude-plugin/marketplace.json +++ b/.claude-plugin/marketplace.json @@ -160,7 +160,7 @@ "name": "databases-on-aws", "source": "./plugins/databases-on-aws", "tags": ["aws", "database", "aurora", "dsql", "serverless", "postgresql"], - "version": "1.3.2" + "version": "1.3.3" }, { "category": "deployment", diff --git a/.github/workflows/security-scanners.yml b/.github/workflows/security-scanners.yml index b9e873c0..ff85eace 100644 --- a/.github/workflows/security-scanners.yml +++ b/.github/workflows/security-scanners.yml @@ -214,6 +214,7 @@ jobs: set +e semgrep scan --oss-only --verbose --metrics=off --config=r/all \ --max-log-list-entries=0 \ + --exclude-rule="bbp-pattern-inject" \ --exclude-rule="ai.generic.detect-generic-ai-anthprop.detect-generic-ai-anthprop" \ --exclude-rule="generic.secrets.security.detected-sonarqube-docs-api-key.detected-sonarqube-docs-api-key" \ --exclude-rule="apex.lang.best-practice.ncino.accessmodifiers.globalaccessmodifiers.global-access-modifiers" \ diff --git a/.semgrep.yaml b/.semgrep.yaml index 2f2b019b..6901fce3 100644 --- a/.semgrep.yaml +++ b/.semgrep.yaml @@ -7,6 +7,11 @@ # Excluded rules: # +# 
bbp-pattern-inject +# Reason: Broken rule in the community registry (r/all) - fails to parse +# ("Invalid pattern for Python: Stdlib.Parsing.Parse_error"), which makes +# semgrep exit 2 and fail every PR regardless of findings. Not our rule. +# # ai.generic.detect-generic-ai-anthprop.detect-generic-ai-anthprop # Reason: This contains a Claude Code plugin repository - Anthropic references are expected # diff --git a/plugins/databases-on-aws/.claude-plugin/plugin.json b/plugins/databases-on-aws/.claude-plugin/plugin.json index e918fecc..5cf5c741 100644 --- a/plugins/databases-on-aws/.claude-plugin/plugin.json +++ b/plugins/databases-on-aws/.claude-plugin/plugin.json @@ -22,5 +22,5 @@ "license": "Apache-2.0", "name": "databases-on-aws", "repository": "https://github.com/awslabs/agent-plugins", - "version": "1.3.2" + "version": "1.3.3" } diff --git a/plugins/databases-on-aws/.codex-plugin/plugin.json b/plugins/databases-on-aws/.codex-plugin/plugin.json index be7d4f8b..1ebaa244 100644 --- a/plugins/databases-on-aws/.codex-plugin/plugin.json +++ b/plugins/databases-on-aws/.codex-plugin/plugin.json @@ -1,6 +1,6 @@ { "name": "databases-on-aws", - "version": "1.3.2", + "version": "1.3.3", "description": "Expert database guidance for the AWS database portfolio. 
Design schemas, execute queries, handle migrations, and choose the right database for your workload.", "author": { "name": "Amazon Web Services", diff --git a/plugins/databases-on-aws/skills/dsql/SKILL.md b/plugins/databases-on-aws/skills/dsql/SKILL.md index c44133cd..75f6d888 100644 --- a/plugins/databases-on-aws/skills/dsql/SKILL.md +++ b/plugins/databases-on-aws/skills/dsql/SKILL.md @@ -161,7 +161,7 @@ See [scripts/README.md](../../scripts/README.md) for usage and hook configuratio - MUST include tenant_id in all tables - MUST use `CREATE INDEX ASYNC` exclusively - MUST issue each DDL in its own transact call: `transact(["CREATE TABLE ..."])` -- MUST serialize arrays as JSONB; expand at query time with `jsonb_array_elements_text(data)` +- MUST serialize arrays into a single-column representation — DSQL has no array column type; PREFER `JSONB` (operators work directly); MAY use `TEXT` when the column is opaque to the database; ASK the user. For `JSONB` arrays, expand at query time with `jsonb_array_elements_text(data)` ### Workflow 2: Safe Data Migration diff --git a/plugins/databases-on-aws/skills/dsql/references/development-guide.md b/plugins/databases-on-aws/skills/dsql/references/development-guide.md index 72781dde..b29d6945 100644 --- a/plugins/databases-on-aws/skills/dsql/references/development-guide.md +++ b/plugins/databases-on-aws/skills/dsql/references/development-guide.md @@ -13,7 +13,7 @@ effortless scaling, multi-region viability, among other advantages. 
- **REQUIRED: Follow DDL Guidelines** - Refer to [DDL Rules](#schema-ddl-rules) - **SHALL repeatedly generate fresh tokens** - Refer to [Connection Limits](auth/authentication-guide.md#connection-rules) - **ALWAYS use ASYNC indexes** - `CREATE INDEX ASYNC` is mandatory -- **MUST serialize arrays as JSONB** - see [Schema Design Rules](#schema-design-rules) +- **MUST serialize arrays** into a single-column representation; **PREFER `JSONB`** (operators work directly); **MAY use `TEXT`** when the column is opaque to the database; **ASK** the user - see [Schema Design Rules](#schema-design-rules) - **ALWAYS Batch within row limit** - maintain transaction limits (verify via `awsknowledge`: `aurora dsql transaction limits`) - **REQUIRED: Build and sanitize all SQL with `safe_query.build()`** - See [Input Validation](../mcp/tools/input-validation.md#required-pattern) - **MUST follow correct Application Layer Patterns** - when multi-tenant isolation or application referential integrity are required; refer to [Application Layer Patterns](#application-layer-patterns) @@ -54,7 +54,14 @@ effortless scaling, multi-region viability, among other advantages. 
### Schema Design Rules - MUST verify column types via `awsknowledge`: `aurora dsql supported data types` or the [DSQL supported data types list](https://docs.aws.amazon.com/aurora-dsql/latest/userguide/working-with-postgresql-compatibility-supported-data-types.html) -- MUST serialize arrays as JSONB; expand at query time via `jsonb_array_elements_text(data)` +- MUST serialize arrays into a single-column representation — DSQL has no array column type: + - **PREFER `JSONB`** — `@>`, `?`, `?|`, `?&`, and `jsonb_array_elements_text(data)` work directly; values validated and normalized at write + - **MAY use `TEXT`** when the column is opaque to the database (application reads the whole value, parses it, never queries inside) +- For document columns: + - **`JSONB`** when querying with `@>`, `?`, or indexed JSONB paths + - **`JSON`** when writes dominate (no parse/sort overhead), when byte-exact input matters (audit, replay, payloads with duplicate keys), or when only `->`/`->>` is needed + - **SHOULD keep** existing `JSON` columns as `JSON` when migrating; **MAY upgrade to `JSONB`** if the application needs JSONB-only operators or indexed paths + - ASK the user about query patterns and read/write ratio before defaulting - **MUST NOT** add per-column `COLLATE` clauses — DSQL uses C collation database-wide and rejects `COLLATE "C"` in DDL. `dsql_lint(fix=true)` auto-strips `COLLATE` clauses from migrated schemas (rule `collation`, fix status `fixed`). 
- ALWAYS include tenant_id in tables for multi-tenant isolation - SHOULD create async indexes for tenant_id and common query patterns @@ -127,7 +134,7 @@ UPDATE table SET c = 'default' WHERE c IS NULL; ← AFTER ADD COLUMN **MUST verify** column types against the [DSQL supported data types docs](https://docs.aws.amazon.com/aurora-dsql/latest/userguide/working-with-postgresql-compatibility-supported-data-types.html) or via `awsknowledge`: `aurora dsql supported data types` — the supported set evolves, so do not treat any static list as exhaustive. -Arrays and `INET` are **[runtime-only](https://docs.aws.amazon.com/aurora-dsql/latest/userguide/working-with-postgresql-compatibility-supported-data-types.html#working-with-postgresql-compatibility-query-runtime)** — cast at query time. For structured data, prefer `JSONB` over `JSON` for queryable fields. +Arrays and `INET` are **[runtime-only](https://docs.aws.amazon.com/aurora-dsql/latest/userguide/working-with-postgresql-compatibility-supported-data-types.html#working-with-postgresql-compatibility-query-runtime)** — cast at query time. For structured data, **PREFER `JSONB`** when querying inside the value (`@>`, `?`, indexed JSONB paths); `JSON` is valid when writes dominate, byte-exact input matters, or only `->`/`->>` is needed. ASK the user about query patterns before defaulting. 
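
The guidance above can be read as a small decision procedure. The sketch below is one possible encoding, intended only as a starting suggestion before asking the user. The function name, the option names, and the priority order between conflicting answers are all assumptions, not something the guide prescribes.

```javascript
// Hypothetical helper: suggests a column type for serialized arrays or documents.
// The order of the checks is an assumption; conflicting needs should go back to the user.
function suggestColumnType({
  needsJsonbOperators = false, // @>, ?, ?|, ?&, or indexed JSONB paths
  opaqueToDatabase = false,    // application reads and parses the whole value
  byteExact = false,           // audit, replay, payloads with duplicate keys
  writeHeavy = false,          // writes dominate reads
  migratingFromJson = false,   // source column is already JSON
} = {}) {
  if (needsJsonbOperators) return "JSONB";
  if (opaqueToDatabase) return "TEXT";
  if (migratingFromJson || byteExact || writeHeavy) return "JSON";
  return "JSONB"; // preferred when nothing argues otherwise
}

console.log(suggestColumnType({ needsJsonbOperators: true }));
console.log(suggestColumnType({ opaqueToDatabase: true }));
console.log(suggestColumnType({ writeHeavy: true }));
console.log(suggestColumnType());
```

A suggestion from a helper like this does not replace the ASK step; it only gives the conversation a concrete default to confirm or reject.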
### Supported Key diff --git a/plugins/databases-on-aws/skills/dsql/references/examples/patterns.md b/plugins/databases-on-aws/skills/dsql/references/examples/patterns.md index 47951d5a..06b83c77 100644 --- a/plugins/databases-on-aws/skills/dsql/references/examples/patterns.md +++ b/plugins/databases-on-aws/skills/dsql/references/examples/patterns.md @@ -129,12 +129,16 @@ INSERT INTO distributors VALUES (nextval('order_seq'), 'nothing'); --- -## Arrays and Structured Data +## Data Serialization -Arrays and `INET` are [runtime-only](https://docs.aws.amazon.com/aurora-dsql/latest/userguide/working-with-postgresql-compatibility-supported-data-types.html#working-with-postgresql-compatibility-query-runtime) — not valid as column types. For structured data, prefer `JSONB` over `JSON` for queryable fields. +Arrays and `INET` are [runtime-only](https://docs.aws.amazon.com/aurora-dsql/latest/userguide/working-with-postgresql-compatibility-supported-data-types.html#working-with-postgresql-compatibility-query-runtime) — not valid as column types. **MUST** serialize arrays and structured data into a single-column representation. WHICH format is a choice — ASK the user which access pattern fits: -- **MUST** serialize arrays as `JSONB` -- **MAY** use `jsonb_array_elements_text(data)` to expand a JSONB array at query time +- **PREFER** `JSONB` when querying inside the value (`@>`, `?`, `?|`, `?&`, `jsonb_array_elements_text`, indexed JSONB paths); values are normalized at write. +- **MAY** use `TEXT` when the column is opaque to the database — the application reads the whole value, parses it, and never queries inside it. +- `JSON` is valid when writes dominate (no parse/sort overhead), byte-exact input matters (audit, replay, duplicate keys), or only `->`/`->>` is needed. +- When migrating, **SHOULD** keep existing `JSON` columns as `JSON`; **MAY** upgrade to `JSONB` if JSONB-only operators or indexed paths are needed. 
+ +**JSONB (write + query with operators):** ```javascript const categories = ['backend', 'api', 'database']; @@ -149,12 +153,38 @@ await pool.query( ); ``` -Query-time operations: - ```sql +-- JSONB-only operators (containment, key existence, indexed paths): +SELECT user_id FROM user_settings WHERE preferences @> '{"theme":"dark"}'; +SELECT project_id, jsonb_array_elements_text(categories) AS category FROM projects; + +-- ->/->> work on both JSON and JSONB: SELECT user_id, preferences->>'theme' AS theme FROM user_settings WHERE preferences->>'notifications' = 'true'; +``` -SELECT project_id, jsonb_array_elements_text(categories) AS category FROM projects; +**JSON (write-heavy, byte-exact, key-extraction only):** + +```javascript +const auditPayload = { event: 'login', ts: 1717890000, user_id: '...' }; +await pool.query( + 'INSERT INTO audit_log (id, payload) VALUES ($1, $2)', // no cast: column is JSON + [eventId, JSON.stringify(auditPayload)], +); +``` + +```sql +SELECT id, payload->>'event' AS event FROM audit_log WHERE payload->>'user_id' = $1; +``` + +**TEXT (opaque to the database):** + +```javascript +const tagsCsv = ['backend', 'api', 'database'].join(','); +await pool.query( + 'INSERT INTO projects (project_id, tags_csv) VALUES ($1, $2)', + [projectId, tagsCsv], +); +// Application parses tags_csv.split(',') on read; the database never inspects it. 
``` diff --git a/plugins/databases-on-aws/skills/dsql/references/examples/schema.md b/plugins/databases-on-aws/skills/dsql/references/examples/schema.md index d1462fb7..777fdd20 100644 --- a/plugins/databases-on-aws/skills/dsql/references/examples/schema.md +++ b/plugins/databases-on-aws/skills/dsql/references/examples/schema.md @@ -21,11 +21,13 @@ CREATE TABLE IF NOT EXISTS orders ( tenant_id VARCHAR(255) NOT NULL, status VARCHAR(50) NOT NULL, tags JSONB, - metadata JSONB, + metadata JSON, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ); ``` +Both `JSONB` and `JSON` are valid; pick by access pattern (see Schema Design Rules in `development-guide.md`). + --- ## Schema Design: Index Creation diff --git a/plugins/databases-on-aws/skills/dsql/references/mysql-migrations/ddl-type-alternatives.md b/plugins/databases-on-aws/skills/dsql/references/mysql-migrations/ddl-type-alternatives.md index a80006a7..76cb06ca 100644 --- a/plugins/databases-on-aws/skills/dsql/references/mysql-migrations/ddl-type-alternatives.md +++ b/plugins/databases-on-aws/skills/dsql/references/mysql-migrations/ddl-type-alternatives.md @@ -50,18 +50,36 @@ CREATE TABLE user_preferences ( ); ``` -**DSQL equivalent using TEXT (comma-separated):** +DSQL has no array column type. **MUST** serialize the SET into a single-column representation. **WHICH** format is a choice — ASK the user. ```sql +-- PREFER JSONB: filter with `@>`, expand with `jsonb_array_elements_text`, +-- and let the database validate JSON shape on write. transact([ "CREATE TABLE user_preferences ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), - permissions TEXT -- Stored as comma-separated: 'read,write,admin' + permissions JSONB -- '[\"read\",\"write\",\"admin\"]' + )" +]) + +-- MAY use TEXT when the column is opaque to the database (application +-- reads the whole value, parses it, never queries inside). +transact([ + "CREATE TABLE user_preferences ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + permissions TEXT -- e.g. 
'read,write,admin'; app validates and parses )" ]) ``` -**Note:** Application layer MUST validate and parse SET values. MySQL stores SET values as comma-separated strings internally, so direct migration preserves the format. +**Choosing:** + +- **PREFER JSONB** when querying inside the value — `permissions @> '[\"admin\"]'`, `jsonb_array_elements_text`, or indexed JSONB paths; values are normalized on write +- **MAY use TEXT** when the column is opaque to the database — application reads the whole value, parses it, never queries inside +- **JSON** is valid when writes dominate (no parse/sort overhead), byte-exact input matters (audit, replay, duplicate keys), or only `->`/`->>` is needed +- When migrating existing JSON columns: **SHOULD** keep them as `JSON`; **MAY** upgrade to `JSONB` if JSONB-only operators or indexed paths are needed + +**Note:** Application layer MUST validate `permissions` against the allowed value set on write regardless of the column type. Enum-of-values constraints belong in the application or as a `CHECK` against a derived column. 
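
The application-layer validation mentioned in the note could look roughly like the sketch below, shown for the `JSONB` variant. The allowed values are taken from the `'read,write,admin'` example comment. The helper name `serializePermissions` is hypothetical, and deduplicating on write is an assumption made to mirror MySQL SET semantics.

```javascript
// Allowed members of the original SET column (example values from the comment above).
const ALLOWED_PERMISSIONS = new Set(["read", "write", "admin"]);

// Hypothetical helper: validates the values and returns a JSON array string
// that can be bound to a JSONB (or TEXT) parameter.
function serializePermissions(values) {
  const unique = [...new Set(values)]; // drop duplicates, keep first-seen order
  const invalid = unique.filter((value) => !ALLOWED_PERMISSIONS.has(value));
  if (invalid.length > 0) {
    throw new Error(`Unknown permission(s): ${invalid.join(", ")}`);
  }
  return JSON.stringify(unique);
}

console.log(serializePermissions(["read", "write", "read"])); // prints ["read","write"]
```

Because the check runs before the write, it applies equally whether the column ends up as `JSONB` or `TEXT`.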
 ---
diff --git a/plugins/databases-on-aws/skills/dsql/references/mysql-migrations/full-example.md b/plugins/databases-on-aws/skills/dsql/references/mysql-migrations/full-example.md
index d56aed62..7acef404 100644
--- a/plugins/databases-on-aws/skills/dsql/references/mysql-migrations/full-example.md
+++ b/plugins/databases-on-aws/skills/dsql/references/mysql-migrations/full-example.md
@@ -44,8 +44,8 @@ transact([
  description TEXT,
  price DECIMAL(10,2) NOT NULL,
  category VARCHAR(255) DEFAULT 'other' CHECK (category IN ('electronics', 'clothing', 'food', 'other')),
- tags TEXT,
- metadata JSONB,
+ tags JSONB, -- source was SET (array); PREFER JSONB for queryable arrays (MAY use TEXT for opaque columns)
+ metadata JSON, -- source was JSON; keep JSON by default (MAY upgrade to JSONB for @>/?/indexed paths)
  stock INTEGER DEFAULT 0 CHECK (stock >= 0),
  is_active BOOLEAN DEFAULT true,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
@@ -69,8 +69,8 @@ transact(["CREATE INDEX ASYNC idx_products_category ON products(tenant_id, categ
 | `INT` tenant_id | `VARCHAR(255)` for multi-tenant pattern |
 | `MEDIUMTEXT` | `TEXT` |
 | `ENUM(...)` | `VARCHAR(255)` with `CHECK` constraint |
-| `SET(...)` | `TEXT` (comma-separated) |
-| `JSON` | `JSONB` (preferred) or `JSON` — `JSONB` for queryable structured data; `JSON` preserves key order and whitespace |
+| `SET(...)` | Serialize to a single column. PREFER `JSONB` (operators work directly); MAY use `TEXT` when opaque to the database. ASK the user. |
+| `JSON` | Keep as `JSON`. MAY upgrade to `JSONB` when the application needs `@>`/`?`/indexed JSONB paths. ASK the user about query patterns. |
 | `UNSIGNED` | `CHECK (col >= 0)` |
 | `TINYINT(1)` | `BOOLEAN` |
 | `DATETIME` | `TIMESTAMP` |
@@ -98,7 +98,8 @@ transact(["CREATE INDEX ASYNC idx_products_category ON products(tenant_id, categ
 - **MUST map** all MySQL data types to DSQL equivalents before creating tables
 - **MUST convert** AUTO_INCREMENT to UUID with gen_random_uuid(), IDENTITY column with `GENERATED AS IDENTITY (CACHE ...)`, or explicit SEQUENCE -- ALWAYS use `GENERATED AS IDENTITY` for auto-incrementing columns (see [AUTO_INCREMENT Migration](ddl-auto-increment.md#auto_increment-migration))
 - **MUST replace** ENUM with VARCHAR and CHECK constraint
-- **MUST replace** SET with TEXT (comma-separated)
+- **MUST serialize** SET into a single-column representation; **PREFER `JSONB`** (operators work directly), with **`TEXT`** as a MAY for opaque columns; **ASK** the user
+- **SHOULD keep** JSON columns as `JSON`; **MAY upgrade to `JSONB`** when the application needs `@>`/`?`/indexed JSONB paths; **ASK** the user about query patterns
 - **MUST replace** FOREIGN KEY constraints with application-layer referential integrity
 - **MUST replace** ON UPDATE CURRENT_TIMESTAMP with application-layer updates
 - **MUST convert** all index creation to use CREATE INDEX ASYNC
diff --git a/plugins/databases-on-aws/skills/dsql/references/mysql-migrations/type-mapping.md b/plugins/databases-on-aws/skills/dsql/references/mysql-migrations/type-mapping.md
index b7561df2..86405825 100644
--- a/plugins/databases-on-aws/skills/dsql/references/mysql-migrations/type-mapping.md
+++ b/plugins/databases-on-aws/skills/dsql/references/mysql-migrations/type-mapping.md
@@ -61,16 +61,16 @@ Map MySQL data types to their DSQL equivalents.

 ### String Types

-| MySQL Type | DSQL Equivalent | Notes |
-| ----------------- | ---------------------------------- | ----------------------------------------------------------------------------------------------- |
-| CHAR(n) | CHAR(n) | Direct equivalent |
-| VARCHAR(n) | VARCHAR(n) | Direct equivalent |
-| TINYTEXT | TEXT | DSQL uses TEXT for all unbounded strings |
-| TEXT | TEXT | Direct equivalent |
-| MEDIUMTEXT | TEXT | DSQL uses TEXT for all unbounded strings |
-| LONGTEXT | TEXT | DSQL uses TEXT for all unbounded strings |
-| ENUM('a','b','c') | VARCHAR(255) with CHECK constraint | See [ENUM Migration](ddl-type-alternatives.md#enum-type-migration) |
-| SET('a','b','c') | TEXT | Store as comma-separated TEXT; see [SET Migration](ddl-type-alternatives.md#set-type-migration) |
+| MySQL Type | DSQL Equivalent | Notes |
+| ----------------- | ---------------------------------- | --------------------------------------------------------------------------------------------------------------- |
+| CHAR(n) | CHAR(n) | Direct equivalent |
+| VARCHAR(n) | VARCHAR(n) | Direct equivalent |
+| TINYTEXT | TEXT | DSQL uses TEXT for all unbounded strings |
+| TEXT | TEXT | Direct equivalent |
+| MEDIUMTEXT | TEXT | DSQL uses TEXT for all unbounded strings |
+| LONGTEXT | TEXT | DSQL uses TEXT for all unbounded strings |
+| ENUM('a','b','c') | VARCHAR(255) with CHECK constraint | See [ENUM Migration](ddl-type-alternatives.md#enum-type-migration) |
+| SET('a','b','c') | JSONB (PREFERRED) or TEXT | PREFER JSONB; MAY use TEXT for opaque columns; see [SET Migration](ddl-type-alternatives.md#set-type-migration) |

 ### Date/Time Types

@@ -97,7 +97,7 @@ Map MySQL data types to their DSQL equivalents.

 | MySQL Type | DSQL Equivalent | Notes |
 | -------------- | --------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- |
-| JSON | `JSONB` (preferred) or `JSON` | Prefer `JSONB` for queryable structured data; use `JSON` to preserve key order and whitespace |
+| JSON | JSON (default); MAY upgrade to JSONB | Keep as `JSON`; MAY upgrade to `JSONB` when querying with `@>`/`?`/indexed JSONB paths |
 | AUTO_INCREMENT | UUID with gen_random_uuid(), IDENTITY column, or SEQUENCE | See [AUTO_INCREMENT Migration](ddl-auto-increment.md#auto_increment-migration) for all three options |

 ---
diff --git a/plugins/databases-on-aws/skills/dsql/references/onboarding.md b/plugins/databases-on-aws/skills/dsql/references/onboarding.md
index 7241779e..d7fc8ef7 100644
--- a/plugins/databases-on-aws/skills/dsql/references/onboarding.md
+++ b/plugins/databases-on-aws/skills/dsql/references/onboarding.md
@@ -35,7 +35,7 @@ These guidelines apply when users say "Get started with DSQL" or similar phrases
 - Example:
  - "What column names would you like in this table?"
  - "What is the column name of the primary key?"
- - "Should this column be `JSONB` or `JSON`? (`JSONB` is recommended for queryable structured data.)"
+ - "Should this column be JSON, JSONB, or TEXT? (PREFER JSONB for `@>`/`?` queries; JSON for write-heavy or byte-exact paths; TEXT for columns the database never inspects.)"

 **Examples:**

@@ -252,8 +252,8 @@ cargo add aws-sdk-dsql tokio --features full
 - If yes, MUST verify DSQL compatibility:
  - No SERIAL types (use `GENERATED AS IDENTITY` with sequences, or UUID)
  - No foreign keys (implement in application)
- - Serialize arrays as `JSONB`; expand at query time with `jsonb_array_elements_text(data)`
- - For structured data, prefer `JSONB` over `JSON` for queryable fields
+ - Arrays must be serialized into a single column — PREFER `JSONB` when querying inside the value (`@>`, `?`, `jsonb_array_elements_text(data)`, indexed JSONB paths); MAY use `TEXT` for columns the database never inspects; `JSON` is also valid for write-heavy or byte-exact paths. ASK the user.
+ - SHOULD keep existing `JSON` columns as `JSON`; MAY upgrade to `JSONB` if JSONB-only operators or indexed paths are needed
  - Verify column types against the [supported data types list](https://docs.aws.amazon.com/aurora-dsql/latest/userguide/working-with-postgresql-compatibility-supported-data-types.html)
  - Reference [`./development-guide.md`](./development-guide.md) for full constraints

@@ -351,7 +351,7 @@ Let them know you're ready to help with more:
 **ALWAYS follow these rules:**

 1. **Indexes:** Use `CREATE INDEX ASYNC` - synchronous index creation not supported
-2. **Arrays:** Serialize as `JSONB`; expand at query time with `jsonb_array_elements_text(data)`
+2. **Serialization:** Arrays must be serialized into a single column — PREFER `JSONB` (operators work directly); MAY use `TEXT` for columns the database never inspects. For document columns, `JSON` is also a valid choice (write-heavy or byte-exact paths). ASK the user.
 3. **Referential Integrity:** Implement foreign key validation in application code
 4. **DDL Operations:** Execute one DDL per transaction, no mixing with DML
 5. **Transaction Limits:** Maximum 3,000 row modifications, 10 MiB data size per transaction
diff --git a/plugins/databases-on-aws/skills/dsql/references/troubleshooting.md b/plugins/databases-on-aws/skills/dsql/references/troubleshooting.md
index 658441da..42558d40 100644
--- a/plugins/databases-on-aws/skills/dsql/references/troubleshooting.md
+++ b/plugins/databases-on-aws/skills/dsql/references/troubleshooting.md
@@ -93,10 +93,12 @@ See [full list of unsupported features](https://docs.aws.amazon.com/aurora-dsql/
 ### Error: "Datatype array not supported"

 **Cause:** Using `TEXT[]` or other array column types
-**Solution:**
+**Solution:** Serialize the array into a single column — DSQL has no array column type. PREFER `JSONB`; MAY use `TEXT` for opaque columns. ASK the user which format fits the access pattern.

-1. Change the column to `JSONB` (`tags JSONB`) and serialize the array as a JSONB array
-2. At query time, expand it with `jsonb_array_elements_text(tags)`
+- **PREFER `JSONB`** — the application queries inside the value (`@>`/`?`/`?|`/`?&`, `jsonb_array_elements_text`, or indexed JSONB paths); values are normalized on write. Insert: `INSERT INTO t (tags) VALUES ($1::jsonb)` with `JSON.stringify(arr)`. Query: `jsonb_array_elements_text(tags)`.
+- **MAY use `TEXT`** — the column is opaque to the database (the app reads the whole value, parses it, and never queries inside). Insert raw: `INSERT INTO t (tags_csv) VALUES ($1)` with `arr.join(',')`.
+- **`JSON` is valid** when writes dominate (no parse/sort overhead on write), byte-exact input matters (audit, replay, duplicate keys), or only `->`/`->>` is needed.
+- **When migrating:** keep existing `JSON` columns as `JSON`; upgrade to `JSONB` only when JSONB-only operators or indexed paths are needed.

 ### Error: "Please use CREATE INDEX ASYNC"