feat(blueprints): consolidate single-node into full-multi-node-cluster with Arc machine support #581
nguyena2 wants to merge 13 commits into
…urations
- specify required providers: azurerm, azuread, azapi
- set required Terraform version constraints
- configure azurerm provider with storage use and partner ID settings
🔧 - Generated by Copilot
…chine configuration
…Arc cluster blueprint
- separate long description into multiple lines for clarity
- adjust table formatting for better alignment
🔧 - Generated by Copilot
📚 Documentation Health Report
Generated on: 2026-06-03 19:23:38 UTC
📈 Documentation Statistics
🏗️ Three-Tree Architecture Status
🔍 Quality Metrics
This report is automatically generated by the Documentation Automation workflow.
bindsi
left a comment
🤖 Automated review: Large blueprint addition (full single-node Arc machine cluster). Adds devcontainer lock, Terraform/Bicep configurations, and supporting infrastructure. Structure follows established blueprint patterns. No obvious security or functional issues in the scaffolding visible from the diff.
- downgrade golang.org/x/crypto to v0.52.0
- upgrade golang.org/x/mod to v0.35.0
- upgrade golang.org/x/net to v0.55.0
- upgrade golang.org/x/sys to v0.45.0
- upgrade golang.org/x/text to v0.37.0
- upgrade golang.org/x/tools to v0.44.0
🔧 - Generated by Copilot
bindsi
left a comment
Automated batch review: requesting changes for one blocking Terraform dependency-cycle issue.
…local variable for client ID
- remove direct reference to client ID in locals
- compute client ID in a local variable to avoid module self-dependency
🔧 - Generated by Copilot
bindsi
left a comment
Automated batch re-review: no actionable findings. The prior Terraform self-dependency concern appears resolved: EventHubConnection__clientId is now computed inside the Azure Functions module rather than fed back through blueprint locals.
@nguyena2 thanks for the work here, though I do fear we already have this functionality in the
I actually like this idea. I'll remove the two single-cluster blueprints and streamline the multi-node one.
…-node to multi-node cluster blueprints
- replace references to full-single-node-cluster with full-multi-node-cluster
- ensure consistency across multiple application components and README files
- clarify deployment instructions for production environments
📚 - Generated by Copilot
…settings
- rename username_password_ref to usernamePasswordCredentials
- update secret names for camera credentials
- change trustSettings to use a config map for trusted CA certificates
🔒 - Generated by Copilot
…m_host is null 🔧 - Generated by Copilot
Would you feel like picking that one up? It's a major overhaul in the sense that the blueprint itself (both in TF and Bicep) is not major work, but the documentation and references to the
…tructions
- clarify connection command for MQTT tools container
- specify usage of anonymous listener for mqttui
🔧 - Generated by Copilot
This PR retires the standalone full-single-node-cluster blueprint and folds its capabilities into full-multi-node-cluster, which now serves as the single canonical cluster blueprint for both single- and multi-node deployments. The unified blueprint can target either Azure-provisioned VMs or pre-existing Arc-enabled machines, selected through a `should_use_arc_machines` toggle (Terraform) / `shouldUseArcMachines` parameter (Bicep). Supporting changes reach into the 100-cncf-cluster component, the messaging Azure Functions module, CI templates, developer tooling, and documentation across the repository.
Description
Blueprint consolidation
The entire `blueprints/full-single-node-cluster/` directory was deleted, including both Bicep and Terraform implementations, the generated READMEs, and the top-level README. Its `.tfvars.example` files (dataflow, dataflow-graphs variants, dataflow-endpoint, foundry-project, leak-detection, sse-connector-assets) and its Terratest suite were moved into `blueprints/full-multi-node-cluster/`, with test names updated from the single-node naming (for example, `TestTerraformFullSingleNodeClusterDeploy` became `TestTerraformFullMultiNodeClusterDeploy`).
Arc machine targeting
The full-multi-node-cluster blueprint gained a deployment-mode switch so it can target Arc-enabled machines instead of provisioning VMs:
- Terraform (`main.tf`, `variables.tf`): added `should_use_arc_machines`, `arc_machine_count`, `arc_machine_name`, `arc_machine_name_prefix`, and `arc_machine_resource_group_name`. When the toggle is on, a `data "azurerm_arc_machine" "arc_machines"` block resolves machines by exact name (when `arc_machine_count == 1`) or a `{prefix}{N}` pattern, the cloud_vm_host module is gated off, and `arc_onboarding_principal_ids` is derived from each machine's system-assigned identity. A new `cluster_server_ip` input is validated as required in Arc mode.
- Bicep (`main.bicep`): added `shouldUseArcMachines`, `arcMachineCount`, `arcMachineName`, `arcMachineNamePrefix`, and `arcMachineResourceGroupName`. Existing `Microsoft.HybridCompute/machines` resources are referenced with `existing` and gated by `shouldUseArcMachines`. The `adminPassword` parameter became optional (required only when targeting VMs), and outputs were regrouped into semantic sections.
The multi-node blueprint defaults also shifted toward a production-ready single-node baseline: `host_machine_count` now defaults to 1, `should_deploy_resource_sync_rules` defaults to true, and `aks_should_enable_private_cluster` defaults to true.
CNCF cluster component
The 100-cncf-cluster Bicep component was extended to onboard Arc machines: `arcOnboardingPrincipalIds` is now an array, `clusterServerArcMachineName` and `clusterNodeArcMachineNames` were added, and a `shouldDeployArcMachines` toggle routes script deployment to a new `deploy-scripts-to-arc.bicep` module (the Arc counterpart of `deploy-scripts-to-vm.bicep`). The key-vault-role-assignment module was updated to consume the principal-ID array.
Networking, messaging, and other components
- … `shouldEnableManagedOutboundAccess` parameter (default true).
- … `locals` block and conditionally sets `EventHubConnection__clientId`, avoiding a circular dependency.
- … `username_password_ref` to `usernamePasswordCredentials` with ConfigMap-based trust lists, and the MQTT tools manifest clarified container selection (`-c mqtt-tools`) and listener ports (18883 authenticated, 18884 anonymous).
Documentation and tooling
Repository-wide references to full-single-node-cluster were repointed to full-multi-node-cluster across `blueprints/README.md`, `docs/`, ADR libraries, scenario prerequisites, `copilot/` guidance, `.github/` agents/instructions/prompts, the `.azdo` cluster-test template, and `scripts/build/Detect-Folder-Changes.ps1`. The VS Code workspace added a `launch.json` and updated `settings.json` to wire the Azure Functions project at src/500-application/513-tiered-notification-service, and `.gitignore` now ignores compiled Bicep output (`**/bicep/main.json`).
Related Issue
None
Type of Change
Implementation Details
Rather than maintaining parallel single-node and multi-node blueprints, the single-node blueprint was removed and its examples and tests were migrated so that full-multi-node-cluster covers the full range from a one-machine cluster to a multi-node cluster. Deployment mode is driven by a single toggle (`should_use_arc_machines` / `shouldUseArcMachines`): when off, the blueprint provisions VMs via the existing cloud_vm_host path; when on, it resolves pre-existing Arc machines through `data.azurerm_arc_machine` / `existing` resources and feeds their system-assigned identities into Arc onboarding. The CNCF cluster component carries the matching `shouldDeployArcMachines` logic and the new `deploy-scripts-to-arc.bicep` module so the in-cluster setup scripts run on Arc machines. Optional capabilities remain behind their existing `should_*` toggles.
Testing Performed
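For the Arc-mode checks, a variable file along the following lines could be passed to `terraform plan`. This is only a sketch: the variable names come from the description above, but the file name `arc.tfvars` and every value are hypothetical placeholders, and the blueprint's other required inputs are omitted.

```hcl
# Hypothetical arc.tfvars; all values below are placeholders, not taken from the PR.

# Switch from provisioning VMs to targeting pre-existing Arc-enabled machines.
should_use_arc_machines = true

# With more than one machine, names are resolved from a {prefix}{N} pattern.
# For a single machine, arc_machine_name would be set instead.
arc_machine_count       = 2
arc_machine_name_prefix = "edge-node-"

# Resource group that already contains the Arc-enabled machines.
arc_machine_resource_group_name = "rg-arc-machines"

# Validated as required in Arc mode; placeholder address of the cluster server.
cluster_server_ip = "10.0.0.4"
```

Running `terraform plan -var-file=arc.tfvars` with such a file should show the Arc machine data sources being read and no resources planned from the cloud_vm_host module, assuming the named machines already exist in the given resource group.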
Validation Steps
- … `blueprints/full-multi-node-cluster/terraform`, run `terraform init` and `terraform validate` for both default (VM) and Arc modes.
- … `should_use_arc_machines = true`, provide `arc_machine_*` and `cluster_server_ip` values, and run `terraform plan` to confirm the Arc machine data sources resolve and the VM host module is skipped.
- … `az bicep build`) and confirm `shouldUseArcMachines` gates the `Microsoft.HybridCompute/machines` references.
- … `full-single-node-cluster` in docs, CI templates, or component READMEs.
Checklist
- … `terraform fmt` on all Terraform code
- … `terraform validate` on all Terraform code
- … `az bicep format` on all Bicep code
- … `az bicep build` to validate all Bicep code
Security Review
Additional Notes
This PR does not modify the security-sensitive paths flagged by the template (`SECURITY.md`, `src/000-cloud/010-security-identity/`, `deploy/`). Secret-bearing inputs and outputs remain marked sensitive. The Bicep blueprint outputs for notification are intentionally stubbed ("Not deployed") with a note that Bicep does not yet wire the 045-notification component, kept for parity with Terraform.
Screenshots (if applicable)