Fix static IP subnet mismatch when adding network to nic_group#2716
Fix static IP subnet mismatch when adding network to nic_group#2716vlast3k wants to merge 2 commits intocloudfoundry:mainfrom
Conversation
When adding a new network to existing instances in the same nic_group,
the placement planner picked the first unclaimed static IP in the AZ
without checking which cloud subnet it belonged to. This caused CPI
errors ("Networks in nic_group have different subnet_ids") when an AZ
has multiple cloud subnets with interleaved static IP ranges.
Store cloud_properties from the containing subnet on each StaticIpToAzs
entry. When selecting an IP for a network that shares a nic_group with
an already-assigned sibling, constrain the lookup to IPs on a subnet
with matching cloud_properties. Fail with a clear error when no
matching IP is available.
Fixes: BOSH-1922
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughThe PR extends static IP placement to track and match subnet Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@src/bosh-director/lib/bosh/director/deployment_plan/placement_planner/static_ips_availability_zone_picker.rb`:
- Around line 288-300: The helper nic_group_sibling_cloud_properties may
accidentally match the current network as its own sibling; add a defensive early
check inside the loop to skip when sibling_network == network so it only
considers other networks, keeping the rest of the logic (matching nic_group and
checking plan.reservation.ip and subnet) unchanged; update the loop in
nic_group_sibling_cloud_properties to continue to next when sibling_network is
the same object as the provided network.
- Around line 118-129: The error message raised when static_ip_to_azs is nil
uses a nic_group-specific message even when sibling_cloud_props is nil; update
the branch inside the desired_instance.az.nil? block to raise a different
message depending on whether sibling_cloud_props is present: if
sibling_cloud_props is truthy, raise Bosh::Director::NetworkReservationError
with the subnet/nic_group message using network.nic_group (matching the existing
nic_group wording), otherwise raise the generic "Failed to distribute static
IPs..." style message used in the else-branch; adjust the code paths around
sibling_cloud_props,
`@networks_to_static_ips.next_ip_for_network_with_cloud_properties` and
`@networks_to_static_ips.next_ip_for_network` so the error raised mirrors the
else-branch behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0e44e9cf-74b2-4434-8c77-eb01b375d94e
📒 Files selected for processing (4)
src/bosh-director/lib/bosh/director/deployment_plan/placement_planner/networks_to_static_ips.rbsrc/bosh-director/lib/bosh/director/deployment_plan/placement_planner/static_ips_availability_zone_picker.rbsrc/bosh-director/spec/unit/bosh/director/deployment_plan/placement_planner/networks_to_static_ips_spec.rbsrc/bosh-director/spec/unit/bosh/director/deployment_plan/placement_planner/static_ips_availability_zone_picker_spec.rb
Validation:
|
- Differentiate error when static IP lookup fails: nic_group-specific message when sibling_cloud_props is present, generic message otherwise. Mirrors the else-branch error handling. - Skip self-match in nic_group_sibling_cloud_properties to prevent the current network from matching itself as a sibling.
Summary
nic_group, the placement planner picked the first unclaimed static IP in the AZ without checking which cloud subnet it belonged to. This caused CPI errors (Networks in nic_group have different subnet_ids) when an AZ has multiple cloud subnets with interleaved static IP ranges.cloud_propertiesfrom the containing subnet on eachStaticIpToAzsentry. When selecting an IP for a network that shares anic_groupwith an already-assigned sibling, constrain the lookup to IPs on a subnet with matchingcloud_properties.What changed
networks_to_static_ips.rbStaticIpToAzsstruct now includescloud_propertiescreate()records the subnet'scloud_propertiesfor each IPfind_by_network_az_and_cloud_propertiesandnext_ip_for_network_with_cloud_propertiesstatic_ips_availability_zone_picker.rbcreate_network_plan_with_azlooks up the cloud_properties of an already-assigned sibling in the same nic_group, then constrains IP selection to matching subnetsnic_group_sibling_cloud_propertieshelperTests
networks_to_static_ips_spec.rb: cloud_properties storage and filtered lookupstatic_ips_availability_zone_picker_spec.rb: multi-subnet nic_group pairing, mismatch error, no-nic_group unchanged behaviorContext
On AWS, all networks in a
nic_groupmust resolve to the same ENI subnet. Landscapes with multiple/24subnets per AZ and interleaved static IP ranges hit this when enabling IPv6 (adding a second network to the nic_group). BOSH assigned IPv6 IPs on a different subnet than the existing IPv4 IPs because it only filtered by AZ, not by cloud subnet.Verified against a production cloud config: paired IPv4/IPv6 subnets sharing a
nic_grouphave identicalcloud_properties(same AWS subnet ID, same security groups), because AWS subnets are dual-stack and anic_groupmaps to a single ENI. Fullcloud_propertieshash equality is the correct comparison — narrowing to just thesubnetkey would be fragile and CPI-specific.Test plan