Shipper documentation¶
Documentation overview¶
- Introduction: Brief overview of what Shipper is and why you might be interested
- Quick start: 5 minutes to a working Shipper setup
- User guide: Using Shipper to deploy your code
- Administrator guide: Production installation, monitoring, and cluster fleet management
- Limitations and known issues
- API Reference: Detailed reference on the Shipper resources
Introduction¶
Shipper¶
Shipper is an extension for Kubernetes to add sophisticated rollout strategies and multi-cluster orchestration.
It lets you use kubectl to manipulate objects which represent any kind of rollout strategy, like blue/green or canary. These strategies can deploy to one cluster, or to many clusters across the world.
Why does Shipper exist?¶
Kubernetes is a wonderful platform, but implementing mature rollout strategies on top of it requires subtle multi-step orchestration: Deployment objects are a building block, not a solution.
When implemented as a set of scripts in CI/CD systems like Jenkins, GitLab, or Brigade, these strategies can become hard to debug, or leave out important properties like safe rollbacks.
These problems become more severe when the rollout targets multiple Kubernetes clusters in multiple regions: the complex, multi-step orchestration has many opportunities to fail and leave clusters in inconsistent states.
Shipper helps by providing a higher level API for complex rollout strategies to one or many clusters. It simplifies CI/CD pipeline scripts by letting them focus on the parts that matter to that particular application.
What is Shipper from a technical point of view?¶
Shipper is a collection of Kubernetes controllers that work with custom Kubernetes objects to provide a declarative API for advanced rollouts. These controllers continuously monitor the clusters involved, and converge them on the declared state. They act as control loops for the different aspects of a rollout: capacity management, traffic shifting, and Kubernetes object installation.
For example, you might have a Shipper Application like this:
apiVersion: shipper.booking.com/v1alpha1
kind: Application
metadata:
name: reviews-api
spec:
template:
# helm chart for this application
chart:
name: reviews-api
version: "0.0.1"
repoUrl: https://charts.example.com
# how to select clusters to deploy to
clusterRequirements:
regions:
- name: us-east1
# the rollout strategy
strategy:
steps:
- name: canary
capacity:
incumbent: 100
contender: 10
traffic:
incumbent: 9
contender: 1
- name: all-in
capacity:
incumbent: 0
contender: 100
traffic:
incumbent: 0
contender: 100
# the values for the helm chart
values:
image:
repository: image-registry.example.com/reviews-api
tag: v0.1.0
In this example, we’re defining an Application named reviews-api. It uses a Helm Chart of the same name, and deploys to a cluster in the us-east1 region. It uses a two-step rollout strategy: a basic canary step with a small share of traffic for the new version, then “all-in”. It populates the Helm Chart with values specifying the image tag.
In order to make this declared state a reality, Shipper will select a matching cluster, install the Chart objects into that cluster, and with your guidance, progress through the rollout strategy until the new release is fully live.
Multi-cluster, multi-region, multi-cloud¶
Shipper can deploy your application to multiple clusters in different regions. It expects only a standard Kubernetes API, so it should work with any compliant Kubernetes implementation like GKE or AKS. If you can use kubectl with it, chances are you can use Shipper with it as well.
Release Management¶
Shipper doesn’t just copy-paste your code onto multiple clusters for you – it allows you to customize the rollout strategy fully. This allows you to craft a rollout strategy with the appropriate speed/risk balance for your particular situation.
After each step of the rollout strategy, Shipper pauses to wait for another update to the Release object. This checkpointing approach means that rollouts are fully declarative, scriptable, and resumable. Shipper can keep a rollout on a particular step in the strategy for ten seconds or ten hours. At any point the rollout can be safely aborted, or moved backwards through the strategy to return to an earlier state.
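Because each step is just declared state, moving backwards is the same operation as moving forwards: you update the Release’s target step. A hedged sketch (the Release name here is hypothetical; use `kubectl get rel` to find yours):

```shell
# Move the rollout back to step 0 by declaring a lower targetStep.
kubectl patch rel reviews-api-<hash>-0 --type=merge -p '{"spec":{"targetStep":0}}'
```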
Roll Backs¶
Since Shipper keeps a record of all your successful releases, it allows you to roll back to an earlier release very easily.
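A rollback is just another rollout: you point the Application’s .spec.template back at the chart version and values of the release you want to return to. A minimal sketch, with an illustrative Application name and chart version:

```shell
# Revert the Application template to the previous chart version;
# Shipper creates a new Release that rolls the old code back out.
kubectl patch application reviews-api --type=merge \
  -p '{"spec":{"template":{"chart":{"version":"0.0.1"}}}}'
```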
Charts As Input¶
Shipper installs a complete set of Kubernetes objects for a given application. It does this by relying on Helm, and using Helm Charts as the unit of configuration deployment. Shipper’s Application object provides an interface for specifying values to a Chart just like the helm command line tool.
Getting help¶
We’re happy to take bug reports on the GitHub repo.
For user questions or general discussion you can find us on #shipper on the Kubernetes Slack.
Installing Shipper¶
Step 0: procure a cluster¶
The rest of this document assumes that you have access to a Kubernetes cluster and admin privileges on it. If you don’t have this, check out docker desktop, kind, microk8s or minikube. Cloud clusters like GKE are also fine. Shipper requires Kubernetes 1.17 or later, and you’ll need to be an admin on the cluster you’re working with. [1]
Make sure that kubectl works and can connect to your cluster before continuing.
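A quick way to confirm connectivity is to ask the cluster for its details:

```shell
# Both commands should return server details without errors.
kubectl cluster-info
kubectl get nodes
```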
Setting up kind clusters¶
Let’s write a kind.yaml manifest to configure our clusters:
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
Now we’ll use this to create the clusters:
$ kind create cluster --name app --config kind.yaml --image kindest/node:v1.17.17
$ kind create cluster --name mgmt --config kind.yaml --image kindest/node:v1.17.17
Congratulations, you have created your clusters!
Step 1: get shipperctl¶
shipperctl automates setting up clusters for Shipper. Grab the tarball for your operating system, extract it, and put it somewhere in your PATH.
You can find the binaries on the GitHub Releases page for Shipper.
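As a hedged sketch for Linux (the exact asset name depends on the release and your platform; check the Releases page for the real file name):

```shell
# Hypothetical asset name; substitute the tarball for your OS/arch.
curl -LO https://github.com/bookingcom/shipper/releases/latest/download/shipperctl-linux-amd64.tar.gz
tar xzf shipperctl-linux-amd64.tar.gz
sudo mv shipperctl /usr/local/bin/
```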
Step 2: write a cluster manifest¶
shipperctl expects a manifest of clusters to configure. It uses your ~/.kube/config to translate context names into cluster API server URLs.
Find out the name of your context like so:
$ kubectl config get-contexts
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
kind-app kind-app kind-app
* kind-mgmt kind-mgmt kind-mgmt
In my setup, the context name of the application cluster is kind-app.
This configuration will allow the management cluster to communicate with the application cluster. The cluster API server URL stored in the kubeconfig is a local address (127.0.0.1), so we need an address for our kind-app cluster that is reachable from the management cluster. This is how you can get it:
$ kind get kubeconfig --name app --internal | grep server
Note that app is the name we gave to kind when creating the application cluster. Copy the URL of the server.
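The output should look roughly like this (the exact host and port depend on your kind setup; this is only an illustration):

```shell
$ kind get kubeconfig --name app --internal | grep server
    server: https://app-control-plane:6443
```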
Now let’s write a clusters.yaml manifest to configure Shipper:
applicationClusters:
- name: kind-app
region: local
apiMaster: "SERVER_URL"
Paste your server URL as a string.
Step 3: Setup the Management Cluster¶
Before you run shipperctl, make sure that your kubectl context is set to the management cluster:
$ kubectl config get-contexts
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
kind-app kind-app kind-app
* kind-mgmt kind-mgmt kind-mgmt
First we’ll set up all the needed resources in the management cluster:
$ shipperctl clusters setup management -n shipper-system
Setting up management cluster:
Registering or updating custom resource definitions... done
Creating a namespace called shipper-system... already exists. Skipping
Creating a namespace called rollout-blocks-global... already exists. Skipping
Creating a service account called shipper-management-cluster... already exists. Skipping
Creating a ClusterRole called shipper:management-cluster... already exists. Skipping
Creating a ClusterRoleBinding called shipper:management-cluster... already exists. Skipping
Checking if a secret already exists for the validating webhook in the shipper-system namespace... yes. Skipping
Creating the ValidatingWebhookConfiguration in shipper-system namespace... done
Creating a Service object for the validating webhook... done
Finished setting up management cluster
Step 4: deploy Shipper¶
Now that we have the namespace, custom resource definitions, role bindings, service accounts, and so on, let’s create the Shipper Deployment:
$ kubectl --context kind-mgmt create -f https://github.com/bookingcom/shipper/releases/latest/download/shipper.deployment.yaml
deployment.apps/shipper created
This will create an instance of Shipper in the shipper-system namespace.
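You can verify that Shipper came up by listing the Deployment and its pods:

```shell
$ kubectl --context kind-mgmt -n shipper-system get deploy,pods
```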
Step 5: Join the Application cluster to the Management cluster¶
Now we’ll give clusters.yaml to shipperctl to configure the cluster for Shipper:
$ shipperctl clusters join -f clusters.yaml -n shipper-system
Creating application cluster accounts in cluster kind-app:
Creating a namespace called shipper-system... already exists. Skipping
Creating a service account called shipper-application-cluster... already exists. Skipping
Creating a ClusterRoleBinding called shipper:application-cluster... already exists. Skipping
Finished creating application cluster accounts in cluster kind-app
Joining management cluster to application cluster kind-app:
Creating or updating the cluster object for cluster kind-app on the management cluster... done
Checking whether a secret for the kind-app cluster exists in the shipper-system namespace... yes. Skipping
Finished joining management cluster to application cluster kind-app
Step 6: do a rollout!¶
Now you should have a working Shipper installation. Let’s roll something out!
Namespace manager¶
By design, Shipper does not create namespaces in the application clusters. Shipper requires that each application cluster has a namespace with the same name as the namespace on the management cluster where the Application object is installed. If the namespace does not exist in an application cluster that is selected for a Release, Shipper will keep trying to install the charts, and fail. This loop ends only when the namespace is created in the application cluster, or when that cluster is no longer selected (by deleting the Release or Application objects).
To help with this, we recommend having some sort of a namespace manager tool. This can be a simple controller that installs a namespace in all the application clusters for each namespace existing in the management cluster, or a more complex tool, depending on your needs.
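If you don’t run such a tool, you can create the namespace by hand. For instance, if your Application lives in a namespace called my-app on the management cluster (a hypothetical name):

```shell
# Create the matching namespace on the application cluster.
kubectl --context kind-app create namespace my-app
```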
Footnotes
[1] | For example, on GKE you need to bind yourself to cluster-admin before shipperctl will work. |
User guide¶
Rolling out with Shipper¶
Note
This documentation assumes that you have set up Shipper in two clusters. kind-mgmt is the name of the context that points to the management cluster, and kind-app is the name of the context that points to the application cluster.
Rollouts with Shipper are all about transitioning from an old Release, the incumbent, to a new Release, the contender. If you’re rolling out an Application for the very first time, then there is no incumbent, only a contender.
In general Shipper tries to present a familiar interface for people accustomed to Deployment objects.
Application object¶
Here’s the Application object we’ll use:
apiVersion: shipper.booking.com/v1alpha1
kind: Application
metadata:
name: super-server
spec:
revisionHistoryLimit: 3
template:
chart:
name: nginx
repoUrl: https://raw.githubusercontent.com/bookingcom/shipper/master/test/e2e/testdata
version: 0.0.1
clusterRequirements:
regions:
- name: local
strategy:
steps:
- capacity:
contender: 1
incumbent: 100
name: staging
traffic:
contender: 0
incumbent: 100
- capacity:
contender: 100
incumbent: 0
name: full on
traffic:
contender: 100
incumbent: 0
values:
replicaCount: 3
Copy this to a file called app.yaml and apply it to your Kubernetes management cluster:
$ kubectl --context kind-mgmt apply -f app.yaml
This will create an Application and Release object. Shortly thereafter, you should also see the set of Chart objects: a Deployment, a Service, and a Pod.
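You can watch those Chart objects appear on the application cluster (assuming, as above, the same namespace on both clusters):

```shell
$ kubectl --context kind-app get deploy,svc,pods
```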
Checking progress¶
There are a few different ways to figure out how your rollout is going.
We can check in on the Release to see the progress we’re making:
.status.achievedStep¶
This field is the definitive answer for whether Shipper considers a given step in a rollout strategy complete.
$ kubectl --context kind-mgmt get rel super-server-83e4eedd-0 -o json | jq .status.achievedStep
null
$ # "null" means Shipper has not written the achievedStep key, because it hasn't finished the first step
$ kubectl get rel -o json | jq .items[0].status.achievedStep
{
"name": "staging",
"step": 0
}
If everything is working, you should see one Pod active/ready.
.status.conditions¶
Just like any other object, the status field of a Release object contains information on anything that is going wrong, and anything that is going right. For example, a set of conditions might show that the strategy hasn’t been executed because Shipper cannot contact the application cluster called kind-app.
.status.strategy.conditions¶
For a more detailed view of what’s happening while things are in between states, you can use the Strategy conditions.
$ kubectl --context kind-mgmt get rel super-server-83e4eedd-0 -o json | jq .status.strategy.conditions
[
{
"lastTransitionTime": "2018-12-09T10:00:55Z",
"message": "clusters pending capacity adjustments: [microk8s]",
"reason": "ClustersNotReady",
"status": "False",
"type": "ContenderAchievedCapacity"
},
{
"lastTransitionTime": "2018-12-09T10:00:55Z",
"status": "True",
"type": "ContenderAchievedInstallation"
}
]
These will tell you which part of the step Shipper is currently working on. In this example, Shipper is waiting for the desired capacity in the microk8s cluster. This means that Pods aren’t ready yet.
.status.strategy.state
¶
Finally, because the Strategy conditions can be kind of a lot to parse, they
are summarized into estatus.strategy.state
.
$ kubectl get rel super-server-83e4eedd-0 -o json | jq .status.strategy.state
{
"waitingForCapacity": "True",
"waitingForCommand": "False",
"waitingForInstallation": "False",
"waitingForTraffic": "False"
}
The troubleshooting guide has more information on how to dig deep into what’s going on with any given Release.
Advancing the rollout¶
So now that we’ve checked on our Release and seen that Shipper considers step 0 achieved, let’s advance the rollout:
$ kubectl --context kind-mgmt patch rel super-server-83e4eedd-0 --type=merge -p '{"spec":{"targetStep":1}}'
I’m using patch here to keep things concise, but any means of modifying objects will work just fine.
Now, if you’ve got your kind-app context set to the same namespace as your Application object in the management cluster, you should be able to see 2 more pods spin up:
$ kubectl --context kind-app get po
NAME READY STATUS RESTARTS AGE
super-server-83e4eedd-0-nginx-5775885bf6-76l6g 1/1 Running 0 7s
super-server-83e4eedd-0-nginx-5775885bf6-9hdn5 1/1 Running 0 7s
super-server-83e4eedd-0-nginx-5775885bf6-dkqbh 1/1 Running 0 3m55s
And confirm that Shipper believes this rollout to be done:
$ kubectl --context kind-mgmt get rel -o json | jq .items[0].status.achievedStep
{
"name": "full on",
"step": 1
}
That’s it! Doing another rollout is as simple as editing the Application object, just like you would with a Deployment. The main principle is patching the Release object to move from step to step.
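For instance, bumping the chart version in the Application template triggers a new Release (the version here is illustrative):

```shell
# Edit the Application template; Shipper creates a new contender Release.
kubectl --context kind-mgmt patch application super-server --type=merge \
  -p '{"spec":{"template":{"chart":{"version":"0.0.2"}}}}'
```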
Troubleshooting Shipper¶
Prerequisites¶
To troubleshoot deployments effectively you need to be familiar with core Kubernetes and Shipper concepts (very briefly explained below) and be comfortable running kubectl commands.
Fundamentals¶
Shipper objects form a hierarchy:
Application
|
Release
|
InstallationTarget
CapacityTarget
TrafficTarget
You already know Applications and Releases, but there’s more. Below Releases you have what we call “target objects”. Each represents an important chunk of work we do when rolling out:
Kind | Shorthand | Description |
---|---|---|
InstallationTarget | it | Install charts in application clusters |
CapacityTarget | ct | Scale deployments up and down to reach desired number of pods |
TrafficTarget | tt | Orchestrate traffic by moving pods in and out of the LB |
The list is ordered (e.g. we can’t manipulate traffic before there are pods).
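You can list a Release’s target objects using those shorthands; they share the Release’s name:

```shell
$ kubectl get it,ct,tt
```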
The universal troubleshooting algorithm¶
Shipper is a fairly complex system that runs on top of an even more complex one. Things can fail in many different ways. It’s not really feasible for us to list all the possible problems and solutions for them. Instead, we’ll give you a rough algorithm that should help you deal with commonly encountered problems.
To summarise, the algorithm is roughly:
- Find what stage you’re at by looking at Release conditions and state
- Inspect the corresponding target object’s conditions
- Act accordingly
In the next sections we’ll explain in more detail how to do that.
Finding where you are¶
Before we attempt to fix anything we need to make sure we know where we are in the rollout process. The starting point is almost always looking at your Release’s status:
$ kubectl describe rel nginx-vj7sn-7cb440f1-0
...
Status:
Achieved Step:
Name: staging
Step: 0
Conditions:
Last Transition Time: 2018-07-27T07:21:14Z
Status: True
Type: Scheduled
Strategy:
Conditions:
Last Transition Time: 2018-07-27T07:23:29Z
Message: clusters pending capacity adjustments: [minikube]
Reason: ClustersNotReady
Status: False
Type: ContenderAchievedCapacity
Last Transition Time: 2018-07-27T07:23:29Z
Status: True
Type: ContenderAchievedInstallation
State:
Waiting For Capacity: True
Waiting For Command: False
Waiting For Installation: False
Waiting For Traffic: False
...
We already looked at .status.strategy.state.waitingForCommand, but there are more fields there: one for every type of target object. If your rollout isn’t finished and isn’t waiting for input, these fields tell you which stage you’re at.
Field | Meaning |
---|---|
waitingForInstallation | Waiting for the chart to be installed in application clusters |
waitingForCapacity | Waiting for the contender to scale up and/or the incumbent to scale down |
waitingForTraffic | Waiting for the contender traffic to increase and/or the incumbent to decrease |
Release conditions and strategy conditions¶
Category | Description |
---|---|
Object conditions | Conditions that apply to the object itself. All objects have this. |
Strategy conditions | Conditions that apply to the strategy of the Release that’s being rolled out. Only Releases have this. |
In the example above, under .status.strategy we can find a condition called ContenderAchievedCapacity, saying there are still clusters pending capacity adjustments.
Target objects¶
The next step would be to look at the corresponding target object. Since we’re waiting for capacity, we’ll be looking at the CapacityTarget. The object will have the same name as the Release, but a different kind:
$ kubectl describe ct nginx-vj7sn-7cb440f1-0
...
Status:
Clusters:
Achieved Percent: 0
Available Replicas: 0
Conditions:
Last Transition Time: 2018-07-27T07:23:29Z
Status: True
Type: Operational
Last Transition Time: 2018-07-27T07:23:29Z
Message: there are 1 sad pods
Reason: PodsNotReady
Status: False
Type: Ready
Name: minikube
Sad Pods:
Condition:
Last Probe Time: <nil>
Last Transition Time: 2018-07-27T07:23:14Z
Status: True
Type: PodScheduled
Containers:
Image: nginx:boom
Image ID:
Last State:
Name: nginx
Ready: false
Restart Count: 0
State:
Waiting:
Message: Back-off pulling image "nginx:boom"
Reason: ImagePullBackOff
Init Containers: <nil>
Name: nginx-vj7sn-7cb440f1-0-nginx-9b5c4d7c9-2gjwl
...
Important
For installation the command would be kubectl describe it <release name>, for traffic kubectl describe tt <release name>.
If we inspect .status.conditions of the CapacityTarget we’ll notice a condition called Ready which has status False and reason PodsNotReady. Further inspection reveals that we have a pod called nginx-vj7sn-7cb440f1-0-nginx-9b5c4d7c9-2gjwl and that Kubernetes can’t pull the Docker image for one of its containers:
Message: Back-off pulling image "nginx:boom"
Reason: ImagePullBackOff
The “boom” Docker tag clearly looks wrong. To fix this you can simply edit the application object and set the correct tag in .spec.template.values.
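A hedged sketch of that fix (the Application name matches this example’s Release naming, and the corrected tag is illustrative; it assumes the chart takes the tag via values.image.tag):

```shell
# Set a valid image tag in the Application's chart values;
# Shipper converges the rollout once the pods become ready.
kubectl patch application nginx-vj7sn --type=merge \
  -p '{"spec":{"template":{"values":{"image":{"tag":"stable"}}}}}'
```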
Other sources of useful information¶
Shipper emits Kubernetes events with useful information. You can look at that, if you prefer:
$ kubectl get events
...
1m 1h 238 nginx-vj7sn-7cb440f1-0.154528eb631aac75 CapacityTarget Normal CapacityTargetChanged capacity-controller Set "default/nginx-vj7sn-7cb440f1-0" status to {[{minikube 0 0 [{nginx-vj7sn-7cb440f1-0-nginx-9b5c4d7c9-2gjwl [{nginx {&ContainerStateWaiting{Reason:ImagePullBackOff,Message:Back-off pulling image "nginx:boom",} nil nil} {nil nil nil} false 0 nginx:boom }] [] {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2018-07-27 09:23:14 +0200 CEST }}] [{Operational True 2018-07-27 09:23:29 +0200 CEST } {Ready False 2018-07-27 09:23:29 +0200 CEST PodsNotReady there are 1 sad pods}]}]}
Typical failure scenarios¶
While we can’t list all the possible failures we can list the ones that we think happen more often than others:
Failure | Description |
---|---|
Can’t pull Docker image | Strategy condition ContenderAchievedCapacity is false, the CapacityTarget’s Ready condition is false, and the message is something like “Back-off pulling image “nginx:boom”” |
Previous release is unhealthy | Release condition IncumbentAchievedCapacity is false and the message is something like “incumbent capacity is unhealthy in clusters: [minikube]”. In this case, you can try describing the CapacityTarget from the previous release to find out what’s wrong. If you’re doing a rollout to fix that previous release, though, you can opt to proceed to the next step in your strategy, as Shipper does not require a step to be completed before moving on to the next. |
Can’t fetch Helm chart | Release condition Scheduled is false and the message is something like “download https://charts.example.com/charts/nginx-0.1.42.tgz: 404” |
Make sure you’re on the right cluster!¶
There are cases where the user is checking on the wrong cluster and can’t see the pods etc. To make sure you’re on the right one:
$ kubectl get release
NAME CREATED AT
myrelease-cf68dfe8-0 23m
$ kubectl describe release <your app release> | grep release.clusters
Annotations: shipper.booking.com/release.clusters=kube-us-east-1-a
Operations and administration¶
Shipper is designed to make it easier to manage a fleet of Kubernetes clusters with many teams deploying code to them.
Cluster architecture¶
Shipper defines two kinds of Kubernetes clusters, management clusters and application clusters.
Management clusters¶
Management clusters are where Shipper itself runs. It has the Shipper Custom Resource Definitions installed, and is where application developers interact with the Application or Release objects. The management cluster stores the set of Cluster objects and associated Secrets that enable Shipper to connect to the application clusters.
Typically you have one of these per large deployment, or one with a standby.
Application clusters¶
Application clusters are where Shipper installs and rolls out user workloads. Shipper does not run any custom software in the application clusters: it only needs a service account and associated RBAC configuration.
Patterns¶
One management, many application¶
This is the standard arrangement if you have a fleet of Kubernetes clusters that you would like to manage with Shipper. The single management cluster provides application developers with a single place to interface with Shipper’s objects and orchestrate their rollouts.
One-and-the-same¶
It is totally fine if the management cluster and the application cluster are the same. This is how Shipper is developed, and also how you would use Shipper if you only have a single Kubernetes cluster in your infrastructure. You can think about this configuration as using Shipper to provide a better Deployment object, but without any multi-cluster federation.
Multiple management, each with own set of application¶
While Shipper fully supports namespaces as units of multi-tenancy, it does not yet have any way to limit the set of clusters that an Application can select. So, if your organization has multiple groups of Kubernetes clusters that are consumed by disjoint sets of users, it might make sense to create a management cluster for each group of application clusters that need strong isolation between each other.
Using shipperctl¶
The shipperctl command exists to make using Shipper easier.
Setting Up Clusters Using shipperctl clusters Commands¶
Setting up clusters to work with Shipper means creating ClusterRoleBindings, ClusterRoles, Roles, RoleBindings, Clusters, and so forth.
Meet shipperctl clusters, which is made to make this easier.
There are two use cases for this set of commands.
First, you can use it to set up a local environment to run Shipper in, or to set up a fleet of clusters for the first time.
Second, you can integrate it into your continuous integration pipeline. Since these commands are idempotent, you can use them to apply the configuration of your clusters.
Note that these commands don’t apply a Shipper deployment. You should deploy Shipper once you’ve run these commands.
The commands under shipperctl clusters should be run in this order if you’re setting up a cluster for the very first time. Once you’ve followed this procedure, you can use the ones that apply to your situation.
Note that you need to change your context to point to the management cluster before running the following commands.
- shipperctl clusters setup management: creates the CustomResourceDefinitions, ServiceAccount, ClusterRoleBinding and other objects Shipper needs to function correctly.
- shipperctl clusters join: creates the ServiceAccount that Shipper is going to use on the application cluster, and copies its token back to the management cluster. This is so that Shipper, which runs on the management cluster, can modify Kubernetes objects on the application cluster. Once the token is created, this command also creates a Cluster object on the management cluster, which tells Shipper how to communicate with the application cluster.
All of these commands share a certain set of options. However, they each have their own set of options as well.
Below are the options that are shared between all the commands:
- --kube-config <path string>¶
The path to your kubectl configuration, where the contexts that shipperctl should use reside.
- -n, --shipper-system-namespace <string>¶
The namespace Shipper is running in. This is the namespace where you have a Deployment running the Shipper image.
- --management-cluster-context <string>¶
By default, shipperctl uses the context that is already set in your kubeconfig (i.e. using kubectl config use-context). However, if that’s not what you want, you can use this option to tell shipperctl to use another context.
shipperctl clusters setup management¶
As mentioned above, this command is used to set up the management cluster for use with Shipper.
- --management-cluster-service-account <string>¶
The name of the service account Shipper will use for the management cluster (default “shipper-mgmt-cluster”).
- -g, --rollout-blocks-global-namespace <string>¶
The namespace where global RolloutBlocks should be created (default “rollout-blocks-global”). This is the namespace in which users or administrators of the management cluster create RolloutBlock objects that disable all Shipper rollouts for Applications on that cluster.
shipperctl clusters join¶
As mentioned above, this command is used to join the management and application clusters together using a clusters.yaml file. To learn more about the format of that file, see the Clusters Configuration File Format section.
- --application-cluster-service-account <string>¶
The name of the service account Shipper will use in the application cluster (default “shipper-app-cluster”).
- -f, --file <string>¶
The path to a YAML file containing application cluster configuration (default “clusters.yaml”).
Clusters Configuration File Format¶
The clusters configuration file is a YAML file. At the top level, you should specify two keys, managementClusters and applicationClusters. The clusters you specify under each key are your management and application clusters, respectively. Check out Cluster Architecture to learn more about what this means.
For each item in the list of management or application clusters, you can specify these fields:
- name (mandatory): This is the name of the cluster. When specified for an application cluster, a Cluster object will be created on the management cluster, and will point to the application cluster.
- context (optional, defaults to the value of name): this is the name of the context from your kubectl configuration that points to this cluster. shipperctl will use this context to run commands to set up the cluster, and also to populate the URL of the API master.
- Fields from the Cluster object (optional): you can specify any field from the Cluster object, and shipperctl will patch the Cluster object for you the next time you run it. The only field that is mandatory is region, which you have to specify to create any Cluster object.
Examples¶
Minimal Configuration¶
Here is a minimal configuration to set up a local kind instance, assuming that you have created a cluster called mgmt and a cluster called app:
managementClusters:
- name: kind-mgmt # kind contexts are prefixed with `kind-`
applicationClusters:
- name: kind-app
region: local
Specifying Cluster Fields¶
Here is something more interesting: having 2 application clusters, and marking one of them as unschedulable:
managementClusters:
- name: eu-m
applicationClusters:
- name: eu-1
region: eu-west
- name: eu-2
region: eu-west
scheduler:
unschedulable: true
Using Google Kubernetes Engine (GKE) Context Names¶
If you’re running on GKE, your cluster context names are likely to have underscores in them,
like this: gke_ACCOUNT_ZONE_CLUSTERNAME
. shipperctl
’s usage of the context name as the
name of the Cluster object will break, because Kubernetes objects are not allowed to have
underscores in their names. To solve this, specify context
explicitly in clusters.yaml
, like so:
managementClusters:
- name: eu-m # make sure this is a Kubernetes-friendly name
context: gke_ACCOUNT_ZONE_CLUSTERNAME_MANAGEMENT # add this
applicationClusters:
- name: eu-1
region: eu-west
context: gke_ACCOUNT_ZONE_CLUSTERNAME_APP_1 # same here
- name: eu-2
region: eu-west
context: gke_ACCOUNT_ZONE_CLUSTERNAME_APP_2 # and here
scheduler:
unschedulable: true
Creating Backups and Restoring Using shipperctl backup Commands¶
shipperctl backup prepare¶
1. Acquire a backup file by running shipperctl backup prepare. The backup must be created by a shipperctl command; this guarantees you can restore it later:
$ kubectl config use-context mgmt-dev-cluster ##be sure to switch to correct context of the management cluster before backing up
Switched to context "mgmt-dev-cluster"
$ shipperctl backup prepare -v -f bkup-dev-29-10.yaml
NAMESPACE RELEASE NAME OWNING APPLICATION
default super-server-dc5bfc5a-0 super-server
default2 super-server2-dc5bfc5a-0 super-server2
default3 super-server3-dc5bfc5a-0 super-server3
Backup objects stored in "bkup-dev-29-10.yaml"
The command’s default format is YAML. This will create a file named “bkup-dev-29-10.yaml” and store the backup there in YAML format.
2. Save the backup file in a storage system of your liking (for example, AWS S3).
3. That’s it! Repeat steps 1 and 2 for all management clusters.
shipperctl backup restore¶
1. Download your latest backup from your selected storage system.
2. Make sure that Shipper is down (spec.replicas: 0) before applying objects.
3. Use shipperctl to restore your backup:
$ kubectl config use-context mgmt-dev-cluster ##be sure to switch to the correct management context before restoring
Switched to context "mgmt-dev-cluster"
$ shipperctl backup restore -v -f bkup-dev-29-10-from-s3.yaml
Would you like to see an overview of your backup? [y/n]: y
NAMESPACE RELEASE NAME OWNING APPLICATION
default super-server-dc5bfc5a-0 super-server
default2 super-server2-dc5bfc5a-0 super-server2
default3 super-server3-dc5bfc5a-0 super-server3
Would you like to review backup? [y/n]: y
- application:
apiVersion: shipper.booking.com/v1alpha1
kind: Application
...
backup_releases:
- capacity_target:
apiVersion: shipper.booking.com/v1alpha1
kind: CapacityTarget
...
installation_target:
apiVersion: shipper.booking.com/v1alpha1
kind: InstallationTarget
...
release:
apiVersion: shipper.booking.com/v1alpha1
kind: Release
...
traffic_target:
apiVersion: shipper.booking.com/v1alpha1
kind: TrafficTarget
...
...
Would you like to restore backup? [y/n]: y
application "default/super-server" created
release "default/super-server-dc5bfc5a-0" owner reference updates with uid "a6c587cb-624e-44ec-b267-b48630b0ed1c"
release "default/super-server-dc5bfc5a-0" created
installation target "default/super-server-dc5bfc5a-0" owner reference updates with uid "9ccfd876-7f4f-4b1c-9c10-653d295e21d2"
installation target "default/super-server-dc5bfc5a-0" created
traffic target "default/super-server-dc5bfc5a-0" owner reference updates with uid "9ccfd876-7f4f-4b1c-9c10-653d295e21d2"
traffic target "default/super-server-dc5bfc5a-0" created
capacity target "default/super-server-dc5bfc5a-0" owner reference updates with uid "9ccfd876-7f4f-4b1c-9c10-653d295e21d2"
capacity target "default/super-server-dc5bfc5a-0" created
...
- The command’s default input format is YAML. This applies the backup from the file bkup-dev-29-10-from-s3.yaml while maintaining the owner references between an application and its releases, and between each release and its target objects.
- The backup file must have been created with the shipperctl backup prepare command.
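The owner-reference maintenance that restore performs can be pictured with plain dictionaries: once a parent is re-created it receives a fresh uid, so each child must be rewired before it is re-created in turn. `rewire_owner` is a hypothetical sketch of the idea, not Shipper code:

```python
def rewire_owner(child: dict, parent: dict) -> dict:
    """Point the child's ownerReferences at the freshly created
    parent's new uid, matching references by parent name."""
    for ref in child["metadata"].get("ownerReferences", []):
        if ref["name"] == parent["metadata"]["name"]:
            ref["uid"] = parent["metadata"]["uid"]
    return child

# After restore re-creates the Application, it has a new uid:
app = {"metadata": {"name": "super-server", "uid": "a6c587cb-new"}}
release = {"metadata": {
    "name": "super-server-dc5bfc5a-0",
    "ownerReferences": [{"name": "super-server", "uid": "old-uid"}],
}}
rewire_owner(release, app)
print(release["metadata"]["ownerReferences"][0]["uid"])  # a6c587cb-new
```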
Cluster fleet management¶
Blocking rollouts¶
You can block rollouts in a specific namespace, or in all namespaces (if you have the permissions to do so), by creating a RolloutBlock object. The RolloutBlock object represents a rollout block in a specific namespace. When the object is deleted, the block is lifted.
RolloutBlock object¶
Here’s an example of a RolloutBlock object we’ll use:
apiVersion: shipper.booking.com/v1alpha1
kind: RolloutBlock
metadata:
  name: dns-outage
  namespace: rollout-blocks-global  # for a global rollout block; for a local one use the relevant namespace
spec:
  message: DNS issues, troubleshooting in progress
  author:
    type: user
    name: jdoe  # this indicates that the rollout block was put in place by user 'jdoe'
Copy this to a file called globalRolloutBlock.yaml
and apply it to your Kubernetes cluster:
$ kubectl apply -f globalRolloutBlock.yaml
This will create a global RolloutBlock object. To create a namespaced rollout block, simply state the relevant namespace in the YAML file. An example of a namespaced RolloutBlock object:
apiVersion: shipper.booking.com/v1alpha1
kind: RolloutBlock
metadata:
  name: fairy-investigation
  namespace: fairytale-land
spec:
  message: Investigating current Fairy state
  author:
    type: user
    name: fgodmother
While this object is in the system, no changes can be made to the .spec of any Application or Release object in its scope. Shipper will reject the creation of new objects and the patching of existing releases.
Overriding a rollout block¶
Rollout blocks can be overridden with an annotation applied to the Application or Release object that needs to bypass the block. This annotation lists each RolloutBlock object that it overrides by fully-qualified name (namespace + name).
For example, amending our Application object to override the global rollout block that we put in place:
apiVersion: shipper.booking.com/v1alpha1
kind: Application
metadata:
  name: super-server
  annotations:
    shipper.booking.com/rollout-block.override: rollout-blocks-global/dns-outage
spec:
  revisionHistoryLimit: 3
  template:
    # ... rest of template omitted here
The annotation may reference multiple blocks:
shipper.booking.com/rollout-block.override: rollout-blocks-global/dns-outage,frontend/demo-to-investors-in-progress
The block override annotation format is CSV.
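Because the annotation is plain CSV, inspecting or extending it takes only a few lines. These helpers are illustrative, not part of Shipper or kubectl:

```python
def parse_overrides(annotation: str) -> list[str]:
    """Split the rollout-block override annotation into
    fully-qualified (namespace/name) block references."""
    return [v.strip() for v in annotation.split(",") if v.strip()]

def add_override(annotation: str, block: str) -> str:
    """Append another block reference, avoiding duplicates."""
    overrides = parse_overrides(annotation)
    if block not in overrides:
        overrides.append(block)
    return ",".join(overrides)

ann = "rollout-blocks-global/dns-outage"
ann = add_override(ann, "frontend/demo-to-investors-in-progress")
print(ann)  # rollout-blocks-global/dns-outage,frontend/demo-to-investors-in-progress
```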
The override annotation must reference specific, fully-qualified RolloutBlock objects by name; listing a non-existent block in the annotation is not allowed. If a Release object exists for the application, the Release should be the one carrying the override.
Application and Release conditions¶
Application and Release objects will have a .status.conditions entry which lists all of the blocks which are currently in effect.
For example:
apiVersion: shipper.booking.com/v1alpha1
kind: Application
metadata:
  name: ui
  namespace: frontend
spec:
  # ... spec omitted
status:
  conditions:
  - type: Blocked
    status: "True"
    reason: RolloutsBlocked
    message: "rollouts blocked by: rollout-blocks-global/dns-outage"
This will be accompanied by an event (viewable with kubectl describe application ui -n frontend).
For example:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning RolloutBlock 3s (x3 over 5s) application-controller rollout-blocks-global/dns-outage
Checking a rollout block status¶
There are a few simple ways to know which objects are overriding your RolloutBlock object.
.status.overrides
¶
This field lists all live Application and Release objects that override this RolloutBlock object.
$ kubectl -n rollout-blocks-global get rb dns-outage -o yaml
This might look like this:
apiVersion: shipper.booking.com/v1alpha1
kind: RolloutBlock
metadata:
  name: dns-outage
  namespace: rollout-blocks-global
spec:
  # ... spec omitted
status:
  # associated because 'rollout-blocks-global/dns-outage' is referenced in the override annotation
  overrides:
    applications: default/super-server
    releases: default/super-server-83e4eedd-0
Wide output¶
This shows information about all rollout blocks in a namespace (the default namespace if none is specified; use -n rollout-blocks-global for global RolloutBlocks, or --all-namespaces for all rollout blocks):
$ kubectl -n rollout-blocks-global get rb -o wide
This might look like this:
NAMESPACE NAME MESSAGE AUTHOR TYPE AUTHOR NAME OVERRIDING APPLICATIONS OVERRIDING RELEASES
rollout-blocks-global dns-outage DNS issues, troubleshooting in progress user jdoe default/super-server default/super-server-83e4eedd-0
Limitations and known issues¶
Shipper is just software, and all software has limits. Here are the current highlights for Shipper. Some of these are not fundamental problems, just shortcuts we took while building Shipper.
Chart restrictions¶
Shipper expects a few properties to be true about the Chart it is rolling out. We hope to loosen or remove most of these restrictions over time.
Only Deployments¶
The Chart must have exactly one Deployment object. The name of the
Deployment should be templated with {{.Release.Name}}
. The Deployment
object should have apiVersion: apps/v1
.
Shipper cannot yet perform rollouts for StatefulSets, HorizontalPodAutoscalers, or bare ReplicaSets. These objects can be present in the Chart, but Shipper only knows how to manipulate Deployment objects to scale capacity over the course of a rollout.
Services¶
The Chart must contain either:
- exactly one Service, or
- exactly one Service labeled with the label
shipper-lb: production
.
The name of the Service should be fixed: either a literal in the Chart template, or a value which does not change from release to release.
The Service should have a selector
which matches the application, not
a single release. A Service with release: {{ .Release.Name }}
as part
of the Service selector
will cause Shipper to error, as it will not be
able to balance traffic between multiple Releases.
If you cannot modify the Chart you’re rolling out, you can ask Shipper to
remove the release
selector from the Service selector
by adding the
enable-helm-release-workaround: "true"
label to your Application. This
workaround helps make Charts created with helm create
work out of the box.
Load balancing¶
Shipper uses Kubernetes’ built-in mechanism for shifting traffic: labeling
Pods to add or remove them to a Service’s selector
. This means you
don’t need any special support in your Kubernetes clusters, but it has several
drawbacks.
We hope to mitigate these by adding support for service mesh providers as traffic shifting backends.
Pod-based traffic shifting¶
Traffic shifting happens at the granularity of Pods, not requests. While Shipper’s interface specifies a traffic weight, small fleets of Pods may find that their actual weight differs significantly from the one they requested.
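To see why small fleets drift from the requested weight, consider that only whole pods can be exposed to a Service. The nearest-pod rounding in this sketch is an assumption for illustration, not Shipper's exact behavior:

```python
def achieved_weight(requested_percent: float, pod_count: int) -> float:
    """With label-based traffic shifting, traffic share is quantized
    to whole pods, so small fleets cannot express fine-grained weights."""
    exposed = round(pod_count * requested_percent / 100)
    return 100 * exposed / pod_count

# A 3-pod fleet cannot express a 10% canary at all:
print(achieved_weight(10, 3))   # 0.0  (0 of 3 pods exposed)
print(achieved_weight(10, 50))  # 10.0 (5 of 50 pods exposed)
```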
New Pods don’t get traffic if Shipper is not working¶
Shipper adds the shipper-traffic-status: enabled
label to Pods after they
start. This allows Shipper to correctly manage the number of Pods exposed to
traffic. However, if a Pod is deleted and Shipper is not currently running or
cannot contact the cluster, the new Pod spawned by the ReplicaSet will not
get traffic until Shipper is working again.
The primary issue is that we cannot “cork” a successfully completed rollout by adding the traffic label to the Deployment or ReplicaSet without triggering a native Deployment-based rollout. We could solve this by working directly with ReplicaSets instead of Deployments, but that’s probably working against the grain of the ecosystem (most charts contain Deployments).
Lock-step rollouts¶
Shipper is good at making sure that all clusters involved in a rollout are in the same state. It does this by ensuring that all clusters are in the correct state before marking a rollout step as complete.
However, this means that Shipper cannot perform cluster-by-cluster rollouts,
like first kube-us-east1-a
, then kube-eu-west2-b
. Our “federation”
layer supports this, but we have not yet designed the extension to our strategy
language to describe this kind of rollout.
This cluster-by-cluster strategy is important when limiting traffic or capacity exposure to a new change is not enough to mitigate risk: for example, perhaps the new version will change a cluster-local schema once it starts running.
API Reference¶
High-level APIs¶
These objects represent the primary user interface to Shipper. They are the control and reporting layers for any rollout operation.
Application¶
An Application object represents a single application Shipper can manage on a user’s behalf. In this case, the term “application” means ‘a collection of Kubernetes objects installed by a single Helm chart’.
Application objects are a user interface, and are the primary way that application developers trigger new rollouts.
This is accomplished by editing an Application’s .spec.template field. The template field is a mold that Shipper uses to stamp out a new Release object on each edit. This model is identical to Kubernetes Deployment objects and their .spec.template field, which serves as a mold for ReplicaSet objects (and, by extension, Pod objects).
Application’s .spec.template.chart
contains ambiguity by design: a user is
expected to provide either a specific chart version or a SemVer constraint
defining the range of acceptable chart versions. Shipper will resolve an
appropriate available chart version and pin the Release on it. Shipper
resolves the version in-place: it will substitute the initial constraint with a
specific resolved version and preserve the initial constraint in the Application
annotation named shipper.booking.com/app.chart.version.raw
.
The resolved .spec.template
field will be copied to a new Release
object under the .spec.environment
field during deployment.
Example¶
apiVersion: shipper.booking.com/v1alpha1
kind: Application
metadata:
  name: reviews-api
spec:
  revisionHistoryLimit: 1
  template:
    chart:
      name: reviews-api
      version: "~0.1"
      repoUrl: https://charts.example.com
    clusterRequirements:
      capabilities:
      - gpu
      - high-memory-nodes
      regions:
      - name: us-east1
    strategy:
      steps:
      - name: staging
        capacity:
          incumbent: 100
          contender: 1
        traffic:
          incumbent: 100
          contender: 0
      - name: canary
        capacity:
          incumbent: 10
          contender: 90
        traffic:
          incumbent: 10
          contender: 90
      - name: full on
        capacity:
          incumbent: 0
          contender: 100
        traffic:
          incumbent: 0
          contender: 100
    values:
      replicaCount: 2
Spec¶
.spec.revisionHistoryLimit
¶
revisionHistoryLimit is an optional field that limits the number of associated Release objects kept in .status.history.
If you’re using Shipper to configure development environments,
revisionHistoryLimit
can be a small value, like 1
. In a production
setting it should be set to a larger number, like 10
or 20
. This
ensures that you have plenty of rollback targets to choose from if something
goes wrong.
.spec.template
¶
The .spec.template
is the only required field of the .spec
.
The .spec.template
is a Release template. It has the same schema as the
.spec.environment in a Release
object.
Application’s .spec.template.chart
can define either a specific chart version,
or a SemVer constraint.
Please refer to Semantic Version Ranges section for more details on supported constraints.
Status¶
.status.history
¶
history
is the sequence of Releases that belong to this Application.
This list is ordered by generation, old to new: the oldest Release is at the start of the list, and the most recent (the contender) is at the end.
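The ordering relies on a per-Release generation annotation. A sketch of the sort; the annotation key shown here is illustrative, not necessarily the exact key Shipper uses:

```python
GENERATION_KEY = "shipper.booking.com/release.generation"  # illustrative key name

def order_history(releases: list[dict]) -> list[str]:
    """Order Releases old-to-new by their generation annotation,
    as .status.history does; the last entry is the contender."""
    ordered = sorted(releases, key=lambda r: int(r["annotations"][GENERATION_KEY]))
    return [r["name"] for r in ordered]

releases = [
    {"name": "ui-deadbeef-2", "annotations": {GENERATION_KEY: "2"}},
    {"name": "ui-deadbeef-0", "annotations": {GENERATION_KEY: "0"}},
    {"name": "ui-deadbeef-1", "annotations": {GENERATION_KEY: "1"}},
]
print(order_history(releases))  # ['ui-deadbeef-0', 'ui-deadbeef-1', 'ui-deadbeef-2']
```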
.status.conditions
¶
All conditions contain five fields: lastTransitionTime
, status
, type
,
reason
, and message
. Typically reason
and message
are omitted in the
expected case, and populated in the error or unexpected case.
type: Aborting
¶
This condition indicates whether an abort is currently in progress. An abort is when the latest Release (the contender) is deleted, triggering an automatic rollback to the incumbent.
Type | Status | Reason | Description |
---|---|---|---|
Aborting | True | N/A | The contender was deleted, triggering an abort. The Application
.spec.template will be overwritten with the Release
.spec.environment of the incumbent. |
Aborting | False | N/A | No abort is occurring. |
type: ReleaseSynced
¶
This condition indicates whether the contender Release reflects the
current state of the Application .spec.template
.
Type | Status | Reason | Description |
---|---|---|---|
ReleaseSynced | True | N/A | Everything is OK: Release .spec.environment and Application .spec.template are in sync. |
ReleaseSynced | False | CreateReleaseFailed | The API call to Kubernetes to create the Release object failed. Check
message for the specific error. |
type: RollingOut
¶
This condition indicates whether a rollout is currently in progress. A rollout is in progress if the contender Release object has not yet achieved the final step in the rollout strategy.
Type | Status | Reason | Description |
---|---|---|---|
RollingOut | False | N/A | No rollout is in progress. |
RollingOut | True | N/A | A rollout is in progress. Check message for more details. |
type: ValidHistory
¶
This condition indicates whether the Releases listed in .status.history
form a valid sequence.
Type | Status | Reason | Description |
---|---|---|---|
ValidHistory | True | N/A | Everything is OK. All Releases have a valid generation annotation. |
ValidHistory | False | BrokenReleaseGeneration | One of the Releases does not have a valid generation annotation.
Check message for more details. |
ValidHistory | False | BrokenApplicationObservedGeneration | The Application has an invalid highestObservedGeneration
annotation. Check message for more details. |
Semantic Version Ranges¶
Shipper supports an extended range of semantic version constraints in
Application’s .spec.template.chart.version
.
This section highlights the major features of supported SemVer constraints. For a full reference please see the underlying library spec.
Composition¶
SemVer specifications are composable: there are 2 composition operators defined:
- ,
: stands for AND
- ||
: stands for OR
In the example >=1.2.3, <3.4.5 || 6.7.8
the constraint defines a range where
any version between 1.2.3 inclusive and 3.4.5 non-inclusive, or a specific
version 6.7.8 would satisfy it.
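The AND/OR composition can be demonstrated with a toy evaluator based on tuple comparison of dotted versions. This is an illustration only, not the SemVer library Shipper uses, and it ignores pre-release tags:

```python
import operator
import re

OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt,
       "<": operator.lt, "!=": operator.ne, "=": operator.eq}

def parse(v: str) -> tuple:
    return tuple(int(p) for p in v.split("."))

def satisfies(version: str, constraint: str) -> bool:
    """Evaluate a composed constraint: ',' is AND, '||' is OR."""
    def check(term: str) -> bool:
        m = re.match(r"(>=|<=|!=|[><=])?\s*([\d.]+)", term.strip())
        op, bound = m.group(1) or "=", m.group(2)
        return OPS[op](parse(version), parse(bound))
    return any(all(check(t) for t in alt.split(","))
               for alt in constraint.split("||"))

print(satisfies("2.0.0", ">=1.2.3, <3.4.5 || 6.7.8"))  # True
print(satisfies("6.7.8", ">=1.2.3, <3.4.5 || 6.7.8"))  # True
print(satisfies("5.0.0", ">=1.2.3, <3.4.5 || 6.7.8"))  # False
```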
Trivial Comparisons¶
Trivial comparison constraints belong to a category of equality check relationships.
The range of comparison checks is defined as:
- =
: strictly equal to
- !=
: not equal to
- >
: greater than (non-inclusive)
- <
: less than (non-inclusive)
- >=
: greater than or equal to (inclusive)
- <=
: less than or equal to (inclusive)
The remaining constraints are mostly syntactic sugar and are fully expressible in terms of this category, so the constraints that follow are explained using these operators.
Hyphens¶
A hyphen-separated range is an equivalent to defining a lower and an upper bound for a range of acceptable versions.
- 1.2.3-4.5.6 is equivalent to >=1.2.3, <=4.5.6
- 1.2-4.5 is equivalent to >=1.2, <=4.5
Wildcards¶
There are 3 wildcard characters: x
, X
and *
. They are absolutely
equivalent to each other: 1.2.*
is the same as 1.2.X
.
- 1.2.x is equivalent to >=1.2.0, <1.3.0 (note the non-inclusive range)
- >=1.2.* is equivalent to >=1.2.0 (the wildcard is optional here)
- * is equivalent to >=0.0.0 (one can use x and X as well)
Tildes¶
A tilde is a context-dependent operator: it changes the range based on the least significant version component provided.
- ~1.2.3 is equivalent to >=1.2.3, <1.3.0
- ~1.2 is equivalent to >=1.2, <1.3
- ~1 is equivalent to >=1, <2
Carets¶
Carets pin the major version to a specific branch.
- ^1.2.3 is equivalent to >=1.2.3, <2.0.0
- ^1.2 is equivalent to >=1.2, <2.0
A caret-defined constraint is a handy way to say: give me the latest non-breaking version.
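The tilde and caret equivalences above can be checked mechanically with a small rewriter. This is an illustration of the rules, not Shipper's actual constraint parser:

```python
def expand(constraint: str) -> str:
    """Rewrite a tilde or caret constraint into its equivalent
    comparison range, following the equivalences listed above."""
    op, ver = constraint[0], constraint[1:]
    parts = [int(p) for p in ver.split(".")]
    if op == "~":
        if len(parts) == 1:
            upper = [parts[0] + 1]          # ~1 -> <2
        else:
            upper = [parts[0], parts[1] + 1] + [0] * (len(parts) - 2)
    else:  # "^" pins the major version
        upper = [parts[0] + 1] + [0] * (len(parts) - 1)
    return f">={ver}, <{'.'.join(map(str, upper))}"

print(expand("~1.2.3"))  # >=1.2.3, <1.3.0
print(expand("~1"))      # >=1, <2
print(expand("^1.2.3"))  # >=1.2.3, <2.0.0
print(expand("^1.2"))    # >=1.2, <2.0
```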
Release¶
A Release contains all the information required for Shipper to run a particular version of an application.
To aid both the human and other users in finding resources related to a particular Release object, the following labels are expected to be present in a newly created Release and propagated to all of its related objects (both in the management and application clusters):
- shipper-app
- The name of the Application object owning the Release.
- shipper-release
- The name of the Release object.
Example¶
apiVersion: shipper.booking.com/v1alpha1
kind: Release
metadata:
  name: reviews-api-deadbeef-1
spec:
  targetStep: 2
  environment:
    chart:
      name: reviews-api
      version: 0.0.1
      repoUrl: https://charts.example.com
    clusterRequirements:
      capabilities:
      - gpu
      - high-memory-nodes
      regions:
      - name: us-east1
    strategy:
      steps:
      - name: staging
        capacity:
          incumbent: 100
          contender: 1
        traffic:
          incumbent: 100
          contender: 0
      - name: canary
        capacity:
          incumbent: 10
          contender: 90
        traffic:
          incumbent: 10
          contender: 90
      - name: full on
        capacity:
          incumbent: 0
          contender: 100
        traffic:
          incumbent: 0
          contender: 100
    values:
      replicaCount: 2
status:
  achievedStep:
    name: full on
    step: 2
  conditions:
  - lastTransitionTime: 2018-12-06T13:43:15Z
    status: "True"
    type: Complete
  - lastTransitionTime: 2018-12-06T12:43:09Z
    status: "True"
    type: Scheduled
  strategy:
    conditions:
    - lastTransitionTime: 2018-12-06T17:48:41Z
      status: "True"
      step: 2
      type: ContenderAchievedCapacity
    - lastTransitionTime: 2018-12-06T12:43:46Z
      status: "True"
      step: 2
      type: ContenderAchievedInstallation
    - lastTransitionTime: 2018-12-06T13:42:15Z
      status: "True"
      step: 2
      type: ContenderAchievedTraffic
    - lastTransitionTime: 2018-12-06T13:43:15Z
      status: "True"
      step: 2
      type: IncumbentAchievedCapacity
    - lastTransitionTime: 2018-12-06T13:42:45Z
      status: "True"
      step: 2
      type: IncumbentAchievedTraffic
    state:
      waitingForCapacity: "False"
      waitingForCommand: "False"
      waitingForInstallation: "False"
      waitingForTraffic: "False"
Spec¶
.spec.targetStep
¶
targetStep defines which strategy step this Release should be trying to complete. It is the primary interface for users to advance or retreat a given rollout.
.spec.environment
¶
The environment contains all the information required for an application to be deployed with Shipper.
Important
Roll-forwards and roll-backs have no difference from Shipper’s
perspective, so a roll-back can be performed simply by replacing an
Application’s .spec.template
field with the .spec.environment
field of the Release you want to roll-back to.
.spec.environment.chart
¶
chart:
  name: reviews-api
  version: 0.0.1
  repoUrl: https://charts.example.com
The environment chart key defines the Helm Chart that contains the Kubernetes object
templates for this Release. name
, version
, and repoUrl
are all
required. repoUrl
is the Helm Chart repository that Shipper should
download the chart from.
Note
Shipper will cache this chart version internally after fetching it, just
like pullPolicy: IfNotPresent
for Docker images in Kubernetes. This
protects against chart repository outages. However, it means that if you
need to change your chart, you need to tag it with a different version.
.spec.environment.clusterRequirements
¶
clusterRequirements:
  capabilities:
  - gpu
  - high-memory-nodes
  regions:
  - name: us-east1
The environment clusterRequirements key specifies what kinds of clusters this Release can be scheduled to. It is required.
clusterRequirements.capabilities
is a list of capability names this
Release requires. They should match capabilities specified in Cluster objects exactly. This may be left empty
if the Release has no required capabilities.
clusterRequirements.regions
is a list of regions this Release must run in. It is required.
.spec.environment.strategy
¶
strategy:
  steps:
  - name: staging
    capacity:
      incumbent: 100
      contender: 1
    traffic:
      incumbent: 100
      contender: 0
  - name: canary
    capacity:
      incumbent: 10
      contender: 90
    traffic:
      incumbent: 10
      contender: 90
  - name: full on
    capacity:
      incumbent: 0
      contender: 100
    traffic:
      incumbent: 0
      contender: 100
The environment strategy is a required field that specifies the rollout strategy to be used when deploying the Release.
.spec.environment.strategy.steps contains a list of steps that must be executed in order to complete a release. A step should have the following keys:
Key | Description |
---|---|
.name |
The step name, meant for human users. For example, staging , canary or full on . |
.capacity.incumbent |
The percentage of replicas, from the total number of required replicas the incumbent Release (previous release) should have at this step. |
.capacity.contender |
The percentage of replicas, from the total number of required replicas the contender Release (latest release) should have at this step. |
.traffic.incumbent |
The weight the incumbent Release has when load balancing traffic through all Release objects of the given Application. |
.traffic.contender |
The weight the contender Release has when load balancing traffic through all Release objects of the given Application. |
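Because the traffic values are weights rather than percentages, the actual share works out as each Release's weight divided by the sum of all weights. A small sketch for the two-release case; `traffic_split` is a hypothetical helper:

```python
def traffic_split(incumbent_weight: int, contender_weight: int) -> tuple[float, float]:
    """Convert the incumbent/contender traffic weights of a strategy
    step into fractions of total traffic."""
    total = incumbent_weight + contender_weight
    return incumbent_weight / total, contender_weight / total

# The 'canary' step from the example strategy (weights 10 and 90):
print(traffic_split(10, 90))  # (0.1, 0.9)
```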
.spec.environment.values
¶
The environment values key provides parameters for the Helm Chart templates. It is
exactly equivalent to a values.yaml
file provided to the helm install -f
values.yaml
invocation. Like values.yaml
it is technically optional, but
almost all rollouts are likely to include some dynamic values for the chart,
like the image tag.
Almost all Charts will expect some values like replicaCount
,
image.repository
, and image.tag
.
Status¶
.status.achievedStep
¶
achievedStep indicates which strategy step was most recently completed.
.status.conditions
¶
All conditions contain five fields: lastTransitionTime
, status
, type
,
reason
, and message
. Typically reason
and message
are omitted in the
expected case, and populated in the error or unexpected case.
type: Blocked
¶
This condition indicates whether a Release is blocked by a rollout block or not.
type: Complete
¶
This condition indicates whether a Release has finished its strategy, and should be considered complete.
type: Scheduled
¶
This condition indicates whether the clusterRequirements
were satisfied and
a concrete set of clusters selected for this Release.
type: StrategyExecuted
¶
This condition indicates whether a Release has achieved a strategy step. This means the installation, capacity and traffic specified in the .spec.environment.strategy step were achieved.
.status.strategy
¶
This section contains information on the progression of the strategy.
.status.strategy.conditions
¶
These conditions represent the precise state of the strategy: for each of the incumbent and contender, whether they have converged on the state defined by the given strategy step.
.status.strategy.state
¶
The state keys are intended to make it easier to interpret the strategy
conditions by summarizing into a high level conclusion: what is Shipper waiting
for right now? If it is waitingForCommand: "True"
then the rollout is
awaiting a change to .spec.targetStep
to proceed. If any other key is
True
, then Shipper is still working to achieve the desired state.
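The summarization described above can be mimicked with a small helper; `waiting_for` is illustrative, not part of Shipper:

```python
def waiting_for(state: dict) -> str:
    """Summarize .status.strategy.state: report the first thing
    Shipper is still waiting on, or note that the step is achieved."""
    for key in ("waitingForInstallation", "waitingForCapacity",
                "waitingForTraffic", "waitingForCommand"):
        if state.get(key) == "True":
            return key
    return "nothing: strategy step achieved"

state = {"waitingForCapacity": "False", "waitingForCommand": "True",
         "waitingForInstallation": "False", "waitingForTraffic": "False"}
print(waiting_for(state))  # waitingForCommand
```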
Low-level APIs¶
These objects represent low-level commands defining the state of specific clusters, as well as the current status of those commands. Together they provide ‘just enough federation’ to implement Shipper’s rollout strategies.
They depend on an associated Release object to work correctly: they cannot be created in isolation.
Installation Target¶
An InstallationTarget describes the concrete set of clusters where the release
should be installed. It is created by the Release Controller’s Scheduler after the
concrete clusters are picked using clusterRequirements
.
The Installation Controller acts on InstallationTarget objects by getting the chart, values, and sidecars from the associated Release object, rendering the chart per-cluster, and inserting those objects into each target cluster. Where applicable, these objects are always created with 0 replicas.
It updates the status
resource to indicate progress for each target cluster.
Example¶
apiVersion: shipper.booking.com/v1alpha1
kind: InstallationTarget
metadata:
  name: api-3f498d25-0
  namespace: service-directory
spec:
  clusters:
  - kube-us-east1-a
  - kube-eu-west2-b
status:
  clusters:
  - conditions:
    - lastTransitionTime: 2018-12-06T16:53:24Z
      status: "True"
      type: Operational
    - lastTransitionTime: 2018-12-06T16:53:24Z
      status: "True"
      type: Ready
    name: kube-us-east1-a
    status: Installed
  - conditions:
    - lastTransitionTime: 2018-12-06T16:53:24Z
      status: "True"
      type: Operational
    - lastTransitionTime: 2018-12-06T16:53:24Z
      status: "True"
      type: Ready
    name: kube-eu-west2-b
    status: Installed
Spec¶
.spec.clusters
¶
The clusters
field is a list of cluster names known to Shipper where the associated Release should be installed.
Installation means rendering all the objects in the Chart and inserting them
into the cluster.
spec:
  clusters:
  - kube-us-east1-a
  - kube-eu-west2-b
Status¶
.status.clusters
¶
.status.clusters
is a list of objects representing the installation status
of all clusters where the associated Release objects must be installed.
status:
  clusters:
  - conditions:
    - lastTransitionTime: 2018-12-06T16:53:24Z
      status: "True"
      type: Operational
    - lastTransitionTime: 2018-12-06T16:53:24Z
      status: "True"
      type: Ready
    name: kube-us-east1-a
    status: Installed
  - conditions:
    - lastTransitionTime: 2018-12-06T16:53:24Z
      status: "True"
      type: Operational
    - lastTransitionTime: 2018-12-06T16:53:24Z
      status: "True"
      type: Ready
    name: kube-eu-west2-b
    status: Installed
The following table displays the keys a cluster status entry should have:
Key | Description |
---|---|
name | The Application Cluster name. For example, kube-us-east1-a. |
status | Failed in case of failure, or Installed in case of success. |
message | A message describing why Shipper considers the installation failed. |
conditions | A list of all conditions observed for this particular Application Cluster. |
.status.clusters.conditions
¶
The following table displays the different conditions statuses and reasons reported in the InstallationTarget object for the Operational condition type:
Type | Status | Reason | Description |
---|---|---|---|
Operational | True | N/A | Cluster is reachable, and seems to be operational. |
Operational | False | TargetClusterClientError | There is a problem contacting the Application Cluster; Shipper
either doesn’t know about this Application Cluster, or there is
another issue when accessing the Application Cluster. Details
can be found in the .message field. |
Operational | False | ServerError | An error occurred that Shipper couldn’t classify. Details can be
found in the .message field. |
The following table displays the different conditions statuses and reasons reported in the InstallationTarget object for the Ready condition type:
Type | Status | Reason | Description |
---|---|---|---|
Ready | True | N/A | Indicates that Kubernetes has achieved the desired state related to the InstallationTarget object. |
Ready | False | ServerError | Shipper could not either create an object in the Application Cluster,
or an error occurred when trying to fetch an object from the
Application Cluster. Details can be found in the .message field. |
Ready | False | ChartError | There was an issue while processing a Helm Chart, such as invalid
templates being used as input, or rendered templates that do not
match any known Kubernetes object. Details can be found in the
.message field. |
Ready | False | ClientError | Shipper couldn’t create a resource client to process a particular
rendered object. Details can be found in the .message field. |
Ready | False | UnknownError | An error occurred that Shipper couldn’t classify. Details can be
found in the .message field. |
Capacity Target¶
A CapacityTarget is the interface used by the Release Controller to change the target number of replicas for an application in a set of clusters. It is acted upon by the Capacity Controller.
The status
resource includes status per-cluster so that the Release
Controller can determine when the Capacity Controller is complete and it can
move to the traffic step.
Example¶
apiVersion: shipper.booking.com/v1alpha1
kind: CapacityTarget
metadata:
  name: reviewsapi-deadbeef-0
  namespace: reviewsapi
  annotations:
    shipper.booking.com/v1/finalReplicaCount: "10"
  labels:
    release: reviewsapi-4
spec:
  clusters:
  - name: kube-us-east1-a
    percent: 10
  - name: kube-eu-west2-b
    percent: 10
status:
  clusters:
  - name: kube-us-east1-a
    availableReplicas: 1
    achievedPercent: 10
  - name: kube-eu-west2-b
    availableReplicas: 1
    achievedPercent: 10
    sadPods:
    - name: reviewsapi-deadbeef-0-cafebabe
      phase: Terminated
      containers:
      - name: app
        status: CrashLoopBackOff
      condition:
        type: Ready
        status: "False"
        reason: ContainersNotReady
        message: "unready containers [app]"
Spec¶
.spec.clusters
¶
clusters
is a list of clusters the associated Release object is present
in. Each item in the list has a name
, which should map to a Cluster object, and a percent
. percent
declares how
much capacity the Release should have in this cluster relative to the final
replica count. For example, if the final replica count is 10 and the
percent
is 50, the Deployment object for this Release will be patched to
have 5 pods.
spec:
  clusters:
  - name: kube-us-east1-a
    percent: 10
  - name: kube-eu-west2-b
    percent: 10
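The percent-to-pods arithmetic from the paragraph above can be sketched as follows. `target_replicas` is a hypothetical helper, and the rounding-up behavior is an assumption (it keeps a non-zero percent from yielding zero pods):

```python
import math

def target_replicas(final_replica_count: int, percent: int) -> int:
    """Number of pods the Deployment is patched to at this capacity
    step, assuming fractional replicas are rounded up."""
    return math.ceil(final_replica_count * percent / 100)

print(target_replicas(10, 50))  # 5
print(target_replicas(10, 10))  # 1
print(target_replicas(10, 1))   # 1 (rounding up keeps at least one pod)
```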
Status¶
.status.clusters
¶
.status.clusters
is a list of objects representing the capacity status
of all clusters where the associated Release objects must be installed.
status:
  clusters:
  - name: kube-us-east1-a
    availableReplicas: 1
    achievedPercent: 10
  - name: kube-eu-west2-b
    availableReplicas: 1
    achievedPercent: 10
    sadPods:
    - name: reviewsapi-deadbeef-0-cafebabe
      phase: Terminated
      containers:
      - name: app
        status: CrashLoopBackOff
      condition:
        type: Ready
        status: "False"
        reason: ContainersNotReady
        message: "unready containers [app]"
The following table displays the keys a cluster status entry should have:
Key | Description |
---|---|
name | The Application Cluster name. For example, kube-us-east1-a. |
availableReplicas | The number of pods that have successfully started up. |
achievedPercent | The percentage of the final replica count that availableReplicas represents. |
sadPods | Pod statuses for up to five Pods that are not yet Ready. |
conditions | A list of all conditions observed for this particular Application Cluster. |
.status.clusters.conditions¶
The following table displays the different conditions statuses and reasons reported in the CapacityTarget object for the Operational condition type:
Type | Status | Reason | Description |
---|---|---|---|
Operational | True | N/A | Cluster is reachable, and seems to be operational. |
Operational | False | ServerError | An error occurred that Shipper couldn't classify. Details can be found in the .message field. |
The following table displays the different conditions statuses and reasons reported in the CapacityTarget object for the Ready condition type:
Type | Status | Reason | Description |
---|---|---|---|
Ready | True | N/A | The correct number of pods are running and all of them are Ready. |
Ready | False | WrongPodCount | This cluster has not yet achieved the desired number of pods. |
Ready | False | PodsNotReady | The cluster has the desired number of pods, but not all of them are Ready. |
Ready | False | MissingDeployment | Shipper could not find the Deployment object whose capacity it expects to adjust. See the .message field for more details. |
Traffic Target¶
A TrafficTarget is an interface to a method of shifting traffic between different Releases based on weight. This may be implemented in a number of ways: pod labels and Service objects, service mesh manipulation, or something else. For the moment only vanilla Kubernetes traffic shifting is supported: pod labels and Service objects.
It is manipulated by the Release Controller as part of executing a release strategy.
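With the vanilla implementation, traffic is shifted by toggling a traffic label on each pod so that the pod matches, or stops matching, the Service's selector. A minimal sketch, assuming a Service per application and an invented label name (neither is specified in this reference):

```yaml
# Service fronting all Releases of the application; only pods carrying the
# traffic label (set and removed by the Traffic Controller) receive traffic
apiVersion: v1
kind: Service
metadata:
  name: reviewsapi          # hypothetical Service name
  namespace: reviewsapi
spec:
  selector:
    app: reviewsapi
    traffic: enabled        # hypothetical label toggled per pod
  ports:
  - port: 80
    targetPort: 8080
```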
Example¶
```yaml
apiVersion: shipper.booking.com/v1alpha1
kind: TrafficTarget
metadata:
  name: reviewsapi-deadbeef-0
  namespace: reviewsapi
spec:
  clusters:
  - name: kube-us-east1-a
    weight: 30
  - name: kube-eu-west2-b
    weight: 30
status:
  clusters:
  - achievedTraffic: 100
    conditions:
    - lastTransitionTime: 2018-12-06T12:43:09Z
      status: "True"
      type: Operational
    - lastTransitionTime: 2018-12-06T12:43:09Z
      status: "True"
      type: Ready
    name: kube-us-east1-a
    status: Synced
  - achievedTraffic: 100
    conditions:
    - lastTransitionTime: 2018-12-06T12:43:09Z
      status: "True"
      type: Operational
    - lastTransitionTime: 2018-12-06T12:43:09Z
      status: "True"
      type: Ready
    name: kube-eu-west2-b
    status: Synced
```
Spec¶
.spec.clusters
¶
```yaml
spec:
  clusters:
  - name: kube-us-east1-a
    weight: 30
  - name: kube-eu-west2-b
    weight: 30
```

`clusters` is a list of cluster entries and the desired traffic weight for this Release in each cluster. The Traffic Controller calculates the correct traffic ratio for this Release by summing the weights from all available TrafficTarget objects.
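For example, during a rollout the incumbent and contender Releases each have their own TrafficTarget, and a Release's share of traffic is its weight divided by the sum of all weights. A hypothetical pair of objects (names and weights invented for illustration):

```yaml
# incumbent Release: receives 90 / (90 + 10) = 90% of traffic
apiVersion: shipper.booking.com/v1alpha1
kind: TrafficTarget
metadata:
  name: reviewsapi-3-0       # hypothetical name
  namespace: reviewsapi
spec:
  clusters:
  - name: kube-us-east1-a
    weight: 90
---
# contender Release: receives 10 / (90 + 10) = 10% of traffic
apiVersion: shipper.booking.com/v1alpha1
kind: TrafficTarget
metadata:
  name: reviewsapi-4-0       # hypothetical name
  namespace: reviewsapi
spec:
  clusters:
  - name: kube-us-east1-a
    weight: 10
```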
Status¶
.status.clusters¶
`.status.clusters` is a list of objects representing the traffic status of each cluster where the associated Release object must be installed.
```yaml
status:
  clusters:
  - achievedTraffic: 100
    conditions:
    - lastTransitionTime: 2018-12-06T12:43:09Z
      status: "True"
      type: Operational
    - lastTransitionTime: 2018-12-06T12:43:09Z
      status: "True"
      type: Ready
    name: kube-us-east1-a
    status: Synced
  - achievedTraffic: 100
    conditions:
    - lastTransitionTime: 2018-12-06T12:43:09Z
      status: "True"
      type: Operational
    - lastTransitionTime: 2018-12-06T12:43:09Z
      status: "True"
      type: Ready
    name: kube-eu-west2-b
    status: Synced
```
The following table displays the keys a cluster status entry should have:
Key | Description |
---|---|
name | The Application Cluster name. For example, kube-us-east1-a. |
status | Synced on success, or Failed on failure. |
achievedTraffic | The traffic weight achieved by Shipper for this cluster. |
conditions | A list of all conditions observed for this particular Application Cluster. |
.status.clusters.conditions¶
The following table displays the different conditions statuses and reasons reported in the TrafficTarget object for the Operational condition type:
Type | Status | Reason | Description |
---|---|---|---|
Operational | True | N/A | Cluster is reachable, and seems to be operational. |
Operational | False | ServerError | There is a problem contacting the Application Cluster; Shipper either doesn't know about this Application Cluster, or there is some other issue accessing it. Details can be found in the .message field. |
The following table displays the different conditions statuses and reasons reported in the TrafficTarget object for the Ready condition type:
Type | Status | Reason | Description |
---|---|---|---|
Ready | True | N/A | The desired traffic weight has been successfully achieved. |
Ready | False | MissingService | Shipper could not find a Service object to use for traffic shifting. Check the .message field for more details. |
Ready | False | ServerError | Shipper got an error status code while calling the Kubernetes API of the Application Cluster. Details in the .message field. |
Ready | False | ClientError | Shipper couldn't create a resource client to process a particular rendered object. Details can be found in the .message field. |
Ready | False | InternalError | Something went wrong with the math Shipper does to calculate the desired number of pods. See the .message field for the exact error. |
Ready | False | UnknownError | An error occurred that Shipper couldn't classify. Details can be found in the .message field. |
Administrator APIs¶
These objects represent internal details of a Shipper installation. They expose tools for administrators to configure Shipper or change how Shipper works for application developers.
Cluster¶
A Cluster object represents a Kubernetes cluster that Shipper can deploy to. It is an administrative interface.
Cluster objects serve two purposes:
- Enable Shipper to connect to the cluster in order to manage it
- Enable administrators to influence how Releases are scheduled to the cluster
The second point allows administrators to perform tasks like load balancing workloads between clusters, shift workloads from one cluster to another, or drain clusters for risky maintenance. For examples of these tasks, see the administrator’s guide.
Example¶
```yaml
apiVersion: shipper.booking.com/v1alpha1
kind: Cluster
metadata:
  name: kube-us-east1-a
spec:
  apiMaster: https://10.0.0.1
  capabilities:
  - gpu
  - ssd
  - high-memory-nodes
  region: us-east1
  scheduler:
    unschedulable: false
    weight: 100
```
Spec¶
.spec.apiMaster¶
`apiMaster` is the URL of the Kubernetes cluster's API server. Shipper uses it to connect to the cluster and manage it. This is the same URL you would put in a `~/.kube/config` to enable `kubectl` commands against the cluster.
.spec.capabilities¶
`capabilities` is a required field that lists the capabilities the cluster has. Capabilities are arbitrary tags that Application objects can use to select clusters during a rollout. For example, one Kubernetes cluster might have nodes provisioned with GPUs for video encoding. Adding 'gpu' as a Cluster capability allows application developers to specify 'gpu' in their Application's `clusterRequirements` if their application needs access to that feature.
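A hypothetical Application excerpt requesting such a cluster (a sketch; check the exact shape of `clusterRequirements` against the Application reference, as the surrounding field names here are assumptions):

```yaml
# Application excerpt: only clusters in us-east1 that advertise the
# 'gpu' capability are candidates during cluster selection
spec:
  template:
    clusterRequirements:
      regions:
      - name: us-east1
      capabilities:
      - gpu
```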
.spec.region¶
`region` is a required field that specifies the region the cluster belongs to.
.spec.scheduler¶
`scheduler.unschedulable` is an optional field that causes the cluster to be ignored during rollout cluster selection. This allows operators to mark clusters to be drained. Default: `false`.
`scheduler.weight` is an optional field that assigns a weight to the cluster. The weight influences the priority of the cluster during rollout cluster selection. Default: `100`.
`scheduler.identity` is an optional field that assigns the cluster an identity different from its `.metadata.name` value. This allows operators to make one cluster 'impersonate' another in order to transfer all of the Applications on one cluster to another specific cluster. Default: `.metadata.name`.
More information on how to use these fields to manage a fleet of clusters can be found in the Administrator’s guide.
Status¶
Cluster objects do not currently have a meaningful `.status` field.