19 changes: 19 additions & 0 deletions TSG/Update/Update-Service-terminated-repeatedly-by-ALM.md
@@ -97,6 +97,24 @@ if ($resourceIds.Count -gt 0)
}
```

## Remove failed Update action plan instances
Delete all failed Update action plans except for the last failed one.

Comment on lines +101 to +102
Copilot AI Jan 13, 2026

The section description is too brief for a potentially destructive operation. The instruction should warn users that this operation will permanently delete action plan instances and cannot be undone. Consider adding context about why keeping the last failed instance is important for troubleshooting.

Suggested change
Delete all failed Update action plans except for the last failed one.
> [!WARNING]
> The following script permanently deletes failed Update action plan instances and cannot be undone. Only run this step after you have collected any required logs or when explicitly instructed by support.
Delete all failed Update action plans except for the last failed one so that you retain the most recent failure for troubleshooting and comparison while removing older, no‑longer‑needed instances that can interfere with subsequent update attempts.

```
Import-Module ECEClient -DisableNameChecking
$failedUpdates = Get-ActionPlanInstances | ? { $_.Status -eq "Failed" -and $_.ActionPlanName -match "MAS Update" } | sort LastModifiedDateTime -Descending | select -Skip 1
$instanceIDs = $failedUpdates.InstanceID

$eceClient = Create-ECEClusterServiceClient
$deleteActionPlanInstanceDescription = New-Object Microsoft.AzureStack.Solution.Deploy.EnterpriseCloudEngine.Controllers.Models.DeleteActionPlanInstanceDescription

foreach ($actionPlanInstanceId in $instanceIDs) {
   # remove old instance
   $deleteActionPlanInstanceDescription.ActionPlanInstanceID = $actionPlanInstanceID
Copilot AI Jan 13, 2026

Variable name case mismatch: the foreach loop declares $actionPlanInstanceId (lowercase 'd') but line 113 references it as $actionPlanInstanceID (uppercase 'ID'). PowerShell variable names are case-insensitive, so both spellings resolve to the same variable and the deletion still works, but the inconsistent casing hurts readability and can hide real typos. Use the loop variable's casing consistently.

Suggested change
   $deleteActionPlanInstanceDescription.ActionPlanInstanceID = $actionPlanInstanceID
   $deleteActionPlanInstanceDescription.ActionPlanInstanceID = $actionPlanInstanceId

   $eceClient.DeleteActionPlanInstance($deleteActionPlanInstanceDescription).Wait()
Comment on lines +104 to +114
Copilot AI Jan 13, 2026

The code formatting has inconsistent indentation with trailing spaces on lines 107, 109, 110, 111, and 112. Remove trailing whitespace and use consistent spacing to match the formatting style of other PowerShell scripts in this document.

Suggested change
Import-Module ECEClient -DisableNameChecking
$failedUpdates = Get-ActionPlanInstances | ? { $_.Status -eq "Failed" -and $_.ActionPlanName -match "MAS Update" } | sort LastModifiedDateTime -Descending | select -Skip 1
$instanceIDs = $failedUpdates.InstanceID
$eceClient = Create-ECEClusterServiceClient
$deleteActionPlanInstanceDescription = New-Object Microsoft.AzureStack.Solution.Deploy.EnterpriseCloudEngine.Controllers.Models.DeleteActionPlanInstanceDescription
foreach ($actionPlanInstanceId in $instanceIDs) {
   # remove old instance
   $deleteActionPlanInstanceDescription.ActionPlanInstanceID = $actionPlanInstanceID
   $eceClient.DeleteActionPlanInstance($deleteActionPlanInstanceDescription).Wait()
Import-Module ECEClient -DisableNameChecking
$failedUpdates = Get-ActionPlanInstances | ? { $_.Status -eq "Failed" -and $_.ActionPlanName -match "MAS Update" } | sort LastModifiedDateTime -Descending | select -Skip 1
$instanceIDs = $failedUpdates.InstanceID
$eceClient = Create-ECEClusterServiceClient
$deleteActionPlanInstanceDescription = New-Object Microsoft.AzureStack.Solution.Deploy.EnterpriseCloudEngine.Controllers.Models.DeleteActionPlanInstanceDescription
foreach ($actionPlanInstanceId in $instanceIDs) {
   # remove old instance
   $deleteActionPlanInstanceDescription.ActionPlanInstanceID = $actionPlanInstanceID
   $eceClient.DeleteActionPlanInstance($deleteActionPlanInstanceDescription).Wait()

}
Comment on lines +104 to +115
Copilot AI Jan 13, 2026

The PowerShell script lacks error handling that is present in other mitigation scripts in this document. This script performs state-changing delete operations that could fail. Add $ErrorActionPreference = "Stop" at the beginning of the script to ensure errors are caught and the script stops if operations fail.

Copilot generated this review using guidance from repository custom instructions.
Comment on lines +104 to +115
Copilot AI Jan 13, 2026

This script deletes action plan instances without defensive validation. Before performing the delete operations, add a check to verify that failed updates were found and provide user confirmation or output about what will be deleted. For example, check if $failedUpdates is empty and inform the user how many instances will be deleted.

Copilot generated this review using guidance from repository custom instructions.
```
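
The review comments above ask for error handling and for a check of what will be deleted before anything is removed. A minimal sketch of that selection step follows, under stated assumptions: `Select-StaleFailedInstance` is a hypothetical helper (it is not part of the ECEClient module), the `$sample` objects stand in for the output of `Get-ActionPlanInstances`, and the property names are the ones used by the script in this diff. The sketch only reports what would be deleted; it makes no ECE calls.

```powershell
# Stop on the first error instead of continuing with partial state.
$ErrorActionPreference = "Stop"

# Hypothetical helper (not part of ECEClient): returns every failed instance
# whose name matches the pattern, except the most recently modified one.
function Select-StaleFailedInstance {
    param(
        [object[]] $Instances = @(),
        [string] $NamePattern = "MAS Update"
    )
    $failed = @($Instances |
        Where-Object { $_.Status -eq "Failed" -and $_.ActionPlanName -match $NamePattern } |
        Sort-Object LastModifiedDateTime -Descending)
    # Skip the newest failure so it is retained for troubleshooting.
    return @($failed | Select-Object -Skip 1)
}

# Sample data standing in for Get-ActionPlanInstances (assumption for illustration).
$sample = @(
    [pscustomobject]@{ InstanceID = "a"; Status = "Failed";    ActionPlanName = "MAS Update"; LastModifiedDateTime = [datetime]"2026-01-10" },
    [pscustomobject]@{ InstanceID = "b"; Status = "Failed";    ActionPlanName = "MAS Update"; LastModifiedDateTime = [datetime]"2026-01-12" },
    [pscustomobject]@{ InstanceID = "c"; Status = "Failed";    ActionPlanName = "MAS Update"; LastModifiedDateTime = [datetime]"2026-01-11" },
    [pscustomobject]@{ InstanceID = "d"; Status = "Completed"; ActionPlanName = "MAS Update"; LastModifiedDateTime = [datetime]"2026-01-13" },
    [pscustomobject]@{ InstanceID = "e"; Status = "Failed";    ActionPlanName = "Deployment"; LastModifiedDateTime = [datetime]"2026-01-09" }
)

# Wrap in @() so .Count behaves the same for zero, one, or many results.
$stale = @(Select-StaleFailedInstance -Instances $sample)

if ($stale.Count -eq 0) {
    Write-Host "No failed Update action plan instances to delete."
} else {
    Write-Host "Would delete $($stale.Count) instance(s):"
    $stale | ForEach-Object { Write-Host "  $($_.InstanceID) (last modified $($_.LastModifiedDateTime))" }
}
```

On a real system, the same helper could presumably be fed the output of `Get-ActionPlanInstances`, and the delete loop from the diff run only after the printed list has been reviewed. That wiring is untested here.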

## (Last Resort) Increase update service memory limit
If the problem persists, as a last resort temporarily increase the configured memory limit for the update service.

@@ -162,3 +180,4 @@ else
Write-Host "No changes needed. Existing limits are already set. Warning limit $($memoryWarningLimit)MB and Error limit $($memoryErrorLimit)MB"
}
```
