80 changes: 0 additions & 80 deletions cloud-accounts/changing-instance-types.mdx

This file was deleted.

82 changes: 82 additions & 0 deletions cloud-accounts/cluster-observability.mdx
@@ -0,0 +1,82 @@
---
title: "Cluster Observability"
sidebarTitle: "Cluster Observability"
description: "Monitor cluster health, resource usage, and infrastructure metrics"
---

Porter provides built-in observability for your cluster infrastructure through the **Infrastructure** dashboard. Access it by clicking **Infrastructure** in the left sidebar.

---

## Pods

The **Pods** tab provides a real-time view of all pods running in your cluster.

- **Search**: Filter pods by name
- **Filters**: Filter by status or namespace

Each pod displays:

| Column | Description |
|--------|-------------|
| **Pod name** | The name of the pod |
| **Namespace** | Kubernetes namespace (e.g., `kube-system`, `default`) |
| **Status** | Current state (Running, Pending, Failed, etc.) |
| **Ready** | Container readiness (e.g., `1/1`) |
| **Restarts** | Number of container restarts |
| **CPU** | CPU usage |
| **Memory** | Memory usage |
| **Memory %** | Percentage of memory limit used |
| **Age** | Time since pod creation |
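
As a rough illustration of how the **Ready** and **Memory %** columns relate to the underlying numbers, here is a minimal Python sketch. The `PodRow` type, its field names, and the sample values are hypothetical and are not part of Porter's API; how the dashboard treats pods without a memory limit is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PodRow:
    """Hypothetical stand-in for one row of the Pods tab."""
    name: str
    ready_containers: int
    total_containers: int
    memory_used_mib: float
    memory_limit_mib: Optional[float]  # None when the pod sets no memory limit


def ready_column(pod: PodRow) -> str:
    """Format container readiness the way the table shows it, e.g. '1/1'."""
    return f"{pod.ready_containers}/{pod.total_containers}"


def memory_percent(pod: PodRow) -> Optional[float]:
    """Memory used as a percentage of the memory limit.

    Returns None when no limit is set (assumed behaviour, not confirmed).
    """
    if not pod.memory_limit_mib:
        return None
    return round(100 * pod.memory_used_mib / pod.memory_limit_mib, 1)


pod = PodRow("web-example", 1, 1, 192.0, 256.0)
print(ready_column(pod), memory_percent(pod))  # 1/1 75.0
```

A pod using 192 MiB against a 256 MiB limit would show 75% under this reading of the column description.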

---

## Nodes

The **Nodes** tab shows your cluster's node groups and individual nodes.

### Node Groups View

The default view displays all node groups:

| Column | Description |
|--------|-------------|
| **Node group** | Name of the node group (e.g., default, monitoring, system) |
| **Instance type** | The machine type for nodes in this group |
| **Utilization** | Visual indicator of resource usage |
| **Actions** | Link to view detailed metrics |

### Individual Nodes View

Click on a node group to see individual nodes:

- **Node name**: The cloud provider's node identifier
- **Node group**: Which node group this node belongs to
- **Instance type**: The machine type
- **CPU**: CPU utilization shown as utilized (yellow) vs reserved (blue)
- **Memory**: Memory utilization shown as utilized (yellow) vs reserved (blue)
- **Status**: Node health status (Ready, NotReady)

Click **Metrics >** on any node group to view historical instance counts over time.
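
The page does not define how the utilized and reserved values are computed. The sketch below assumes one common interpretation: "reserved" is the sum of pod resource requests and "utilized" is measured usage, both expressed as a percentage of the node's allocatable capacity. The function name, parameters, and sample numbers are hypothetical.

```python
def node_cpu_bars(allocatable_millicores: int,
                  used_millicores: int,
                  requested_millicores: int) -> dict:
    """Return utilized and reserved CPU as percentages of allocatable CPU.

    Assumption: reserved = sum of pod CPU requests, utilized = measured usage.
    """
    if allocatable_millicores <= 0:
        raise ValueError("allocatable_millicores must be positive")
    return {
        "utilized_pct": round(100 * used_millicores / allocatable_millicores, 1),
        "reserved_pct": round(100 * requested_millicores / allocatable_millicores, 1),
    }


# A node with 4 allocatable cores, 1 core in use, and 3 cores requested by pods.
bars = node_cpu_bars(4000, 1000, 3000)
print(bars)  # {'utilized_pct': 25.0, 'reserved_pct': 75.0}
```

Under this assumption, a large gap between reserved and utilized suggests that workloads request more than they use.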

---

## Integrating External Monitoring

For application-level monitoring and alerting, integrate with external observability platforms:

<CardGroup cols={3}>
<Card title="Datadog" icon="dog" href="/observability/integrations">
Full-stack monitoring with APM, logs, and infrastructure metrics
</Card>
<Card title="New Relic" icon="chart-line" href="/observability/integrations">
Application performance monitoring and alerting
</Card>
<Card title="Grafana" icon="chart-area" href="/observability/integrations">
Dashboards and visualization for metrics and logs
</Card>
</CardGroup>

See [Third party observability](/addons/third-party-observability) or reach out to support for more information.


11 changes: 7 additions & 4 deletions cloud-accounts/cluster-upgrades.mdx
@@ -1,10 +1,12 @@
---
title: "Cluster Upgrades"
sidebarTitle: "Cluster Upgrades"
description: "How Porter manages Kubernetes upgrades for your cluster"
---

Keeping your Kubernetes clusters up to date is essential for security, stability, and access to the latest features built by the wider Kubernetes community and the underlying public cloud. To that end, we take care of managed Kubernetes upgrades for all clusters provisioned through our platform. Our automated upgrade process keeps your clusters current without disrupting your workloads, so you can focus on building and deploying your applications while we handle the complexities of cluster maintenance.

## Shared Responsibility Model

We've endeavoured to build a world-class cluster management system that can manage and upgrade customer infrastructure without disrupting customer workloads. To support this, we've defined a shared responsibility model that maps out the roles played by Porter's engineering/SRE teams and by customers to ensure the best possible experience with upgrades.

@@ -30,11 +32,11 @@ More documentation around zero-downtime deployments may be found [here](/configu

3. Maintaining a constant stream of communication around upgrade timelines and statuses.

## Upgrade Calendar

Kubernetes follows a release cycle with approximately three minor version releases a year. Every release is followed by a period in which public clouds integrate the new version into their managed Kubernetes offerings and run tests to ensure compatibility with the underlying cloud. Our upgrade calendar therefore depends on both release cycles. To account for that, we carry out cluster upgrades twice a year, "leapfrogging" over versions so that customer clusters run the _latest stable_ version of Kubernetes. These upgrades are typically carried out once towards the end of Q1 or the beginning of Q2, and again towards the end of Q3.
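
To make the "leapfrog" idea concrete, here is a minimal Python sketch that picks an upgrade target from a list of minor versions a cloud provider has marked stable. The function name and the version numbers are hypothetical, and the sketch says nothing about how Porter actually steps through intermediate versions during an upgrade.

```python
def leapfrog_target(current_minor: int, stable_minors: list) -> tuple:
    """Pick the newest stable minor version and list the versions in between.

    Returns (target_minor, intermediate_minors). If nothing newer is stable,
    the target is the current version and the intermediate list is empty.
    """
    newer = sorted(m for m in stable_minors if m > current_minor)
    if not newer:
        return current_minor, []
    target = newer[-1]
    # Minor versions strictly between the current version and the target.
    intermediate = list(range(current_minor + 1, target))
    return target, intermediate


# Hypothetical example: cluster on 1.29, provider has 1.28 through 1.31 as stable.
target, intermediate = leapfrog_target(29, [28, 29, 30, 31])
print(target, intermediate)  # 31 [30]
```

With roughly three upstream minor releases a year and two upgrade windows, each window would typically span more than one minor version, which is what the intermediate list represents here.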

## Upgrade Path

When a new version of upstream Kubernetes is released, we closely track the corresponding release on public clouds in conjunction with the wider community as well as our public cloud partners (AWS, Google Cloud, Azure).

@@ -50,4 +52,5 @@ When a new version of upstream Kubernetes is released, we closely track the corr

3. After our tests are successful, we announce a timeline for upgrades over our comms channels on Slack. At this point, while we typically announce a window during low-traffic hours when upgrades are conducted, customers have the option of scheduling a specific slot.

4. When a cluster is upgraded, we upgrade system components, all app templates, the managed cluster control plane, and all node groups. While this operation is meant to be non-disruptive, there are certain prerequisites on the customer's side to ensure zero downtime (see the section below for more details).
