注意: 急于开始的读者可以直接前往快速入门

正在使用 Kubebuilder v1 或 v2 吗?请查看 v1 和 v2 的旧版文档。

适用对象

Kubernetes 用户

Kubernetes 用户将通过学习 API 的设计与实现所依据的基本概念,对 Kubernetes 形成更深入的认识。本书将教会读者如何开发自己的 Kubernetes API,以及核心 Kubernetes API 设计的原则。

包括:

  • Kubernetes API 和资源的结构
  • API 版本语义
  • 自愈
  • 垃圾回收和终结器
  • 声明式 vs 命令式 API
  • 基于级别 vs 基于边缘的 API
  • 资源 vs 子资源

Kubernetes API 扩展开发者

API 扩展开发者将学习实现规范 Kubernetes API 背后的原则和概念,以及快速执行的简单工具和库。本书涵盖了扩展开发者常遇到的陷阱和误解。

包括:

  • 如何将多个事件批量处理为单个协调调用
  • 如何配置定期协调
  • 即将推出
    • 何时使用列表缓存 vs 实时查找
    • 垃圾回收 vs 终结器
    • 如何使用声明式 vs Webhook 验证
    • 如何实现 API 版本管理

为什么选择 Kubernetes API

Kubernetes API 为对象提供了一致和明确定义的端点,这些对象遵循一致和丰富的结构。

这种方法培育了一个丰富的工具和库生态系统,用于处理 Kubernetes API。

用户通过将对象声明为 yaml 或 json 配置,并使用常见工具来管理对象,从而使用这些 API。

将服务构建为 Kubernetes API 相比于普通的 REST,提供了许多优势,包括:

  • 托管的 API 端点、存储和验证。
  • 丰富的工具和 CLI,如 kubectl 和 kustomize。
  • 对 AuthN 和细粒度 AuthZ 的支持。
  • 通过 API 版本控制和转换支持 API 演进。
  • 促进自适应/自愈 API 的发展,这些 API 可以持续响应系统状态的变化,而无需用户干预。
  • Kubernetes 作为托管环境

开发人员可以构建并发布自己的 Kubernetes API,以安装到运行中的 Kubernetes 集群中。

贡献

如果您想要为本书或代码做出贡献,请先阅读我们的贡献指南。

资源

架构概念图

下图将帮助您更好地理解 Kubebuilder 的概念和架构。

快速入门

本快速入门指南将涵盖以下内容:

先决条件

  • go 版本 v1.20.0+
  • docker 版本 17.03+。
  • kubectl 版本 v1.11.3+。
  • 访问 Kubernetes v1.11.3+ 集群。

安装

安装 kubebuilder

# 下载 kubebuilder 并在本地安装。
curl -L -o kubebuilder "https://go.kubebuilder.io/dl/latest/$(go env GOOS)/$(go env GOARCH)"
chmod +x kubebuilder && mv kubebuilder /usr/local/bin/

创建项目

创建一个目录,然后在其中运行 init 命令以初始化一个新项目。以下是一个示例。

mkdir -p ~/projects/guestbook
cd ~/projects/guestbook
kubebuilder init --domain my.domain --repo my.domain/guestbook

创建 API

运行以下命令以创建一个名为 webapp/v1 的新 API(组/版本),并在其中创建一个名为 Guestbook 的新 Kind(CRD):

kubebuilder create api --group webapp --version v1 --kind Guestbook

可选步骤: 编辑 API 定义和调和业务逻辑。有关更多信息,请参阅设计 API 和控制器概述。

如果您正在编辑 API 定义,可以使用以下命令生成诸如自定义资源(CRs)或自定义资源定义(CRDs)之类的清单:

make manifests
示例(api/v1/guestbook_types.go):

// GuestbookSpec 定义了 Guestbook 的期望状态
type GuestbookSpec struct {
	// 插入其他规范字段 - 集群的期望状态
	// 重要提示:在修改此文件后运行 "make" 以重新生成代码

	// 实例数量
	// +kubebuilder:validation:Minimum=1
	// +kubebuilder:validation:Maximum=10
	Size int32 `json:"size"`

	// GuestbookSpec 配置的 ConfigMap 名称
	// +kubebuilder:validation:MaxLength=15
	// +kubebuilder:validation:MinLength=1
	ConfigMapName string `json:"configMapName"`

	// +kubebuilder:validation:Enum=Phone;Address;Name
	Type string `json:"alias,omitempty"`
}

// GuestbookStatus 定义了 Guestbook 的观察状态
type GuestbookStatus struct {
	// 插入其他状态字段 - 定义集群的观察状态
	// 重要提示:在修改此文件后运行 "make" 以重新生成代码

	// 活动的 Guestbook 节点的 PodName
	Active string `json:"active"`

	// 待机的 Guestbook 节点的 PodNames
	Standby []string `json:"standby"`
}

// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// +kubebuilder:resource:scope=Cluster

// Guestbook 是 guestbooks API 的架构
type Guestbook struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   GuestbookSpec   `json:"spec,omitempty"`
	Status GuestbookStatus `json:"status,omitempty"`
}

测试

您需要一个 Kubernetes 集群来运行控制器。您可以使用 KIND 创建本地集群进行测试,也可以使用远程集群。

将 CRD 安装到集群中:

make install

为了快速反馈和代码级调试,运行您的控制器(这将在前台运行,如果要保持运行状态,请切换到新终端):

make run

安装自定义资源实例

如果在创建 API 时对 “Create Resource [y/n]” 选择了 y,则会在 config/samples 中为您的 CRD 生成一个示例 CR(如果已更改 API 定义,请确保先编辑这些示例):

kubectl apply -k config/samples/

在集群上运行

当您的控制器准备好打包并在其他集群中进行测试时,请按以下步骤操作。

构建并将您的镜像推送到 IMG 指定的位置:

make docker-build docker-push IMG=<some-registry>/<project-name>:tag

使用由 IMG 指定的镜像将控制器部署到集群中:

make deploy IMG=<some-registry>/<project-name>:tag

卸载 CRD

从集群中删除您的 CRD:

make uninstall

卸载控制器

从集群中卸载控制器:

make undeploy

下一步

现在,查看架构概念图以获得更好的概述,并跟随 CronJob 教程,通过开发一个演示示例项目来更好地了解其工作原理。

入门指南

概述

通过遵循Operator 模式,不仅可以提供所有预期的资源,还可以在执行时动态、以编程方式管理它们。为了说明这个想法,想象一下,如果有人意外更改了配置或者误删了某个资源;在这种情况下,操作员可以在没有任何人工干预的情况下进行修复。

示例项目

我们将创建一个示例项目,以便让您了解它是如何工作的。这个示例将会:

  • 对账一个 Memcached CR - 代表着在集群上部署/管理的 Memcached 实例
  • 创建一个使用 Memcached 镜像的 Deployment
  • 不允许超过 CR 中定义的大小的实例
  • 更新 Memcached CR 的状态

请按照以下步骤操作。

创建项目

首先,创建一个用于您的项目的目录,并进入该目录,然后使用 kubebuilder 进行初始化:

mkdir $GOPATH/memcached-operator
cd $GOPATH/memcached-operator
kubebuilder init --domain=example.com

创建 Memcached API (CRD)

接下来,我们将创建一个新的 API,负责部署和管理我们的 Memcached 解决方案。在这个示例中,我们将使用 Deploy Image 插件来获取我们解决方案的完整代码实现。

kubebuilder create api --group cache \
  --version v1alpha1 \
  --kind Memcached \
  --image=memcached:1.4.36-alpine \
  --image-container-command="memcached,-m=64,-o,modern,-v" \
  --image-container-port="11211" \
  --run-as-user="1001" \
  --plugins="deploy-image/v1-alpha" \
  --make=false

理解 API

这个命令的主要目的是为 Memcached 类型生成自定义资源(CR)和自定义资源定义(CRD)。它使用 group cache.example.com 和 version v1alpha1 来唯一标识 Memcached 类型的新 CRD。通过利用 Kubebuilder 工具,我们可以为这些平台定义我们的 API 和对象。虽然在这个示例中我们只添加了一种资源类型,但您可以根据需要拥有任意多个 Group 和 Kind。简而言之,CRD 是我们自定义对象的定义,而 CR 是它们的实例。

定义您的 API

在这个示例中,可以看到 Memcached 类型(CRD)具有一些特定规格。这些规格是由 Deploy Image 插件为管理目的而默认脚手架生成的:

状态和规格

MemcachedSpec 部分是我们封装所有可用规格和配置的地方,用于我们的自定义资源(CR)。此外,值得注意的是,我们使用了状态条件。这确保了对 Memcached CR 的有效管理。当发生任何更改时,这些条件为我们提供了必要的数据,以便在 Kubernetes 集群中了解此资源的当前状态。这类似于我们为 Deployment 资源获取的状态信息。

从:api/v1alpha1/memcached_types.go

// MemcachedSpec 定义了 Memcached 的期望状态
type MemcachedSpec struct {
	// 插入其他规格字段 - 集群的期望状态
	// 重要:修改此文件后运行 "make" 以重新生成代码
	// Size 定义了 Memcached 实例的数量
	// 以下标记将使用 OpenAPI v3 schema 来验证该值
	// 了解更多信息:https://book.kubebuilder.io/reference/markers/crd-validation.html
	// +kubebuilder:validation:Minimum=1
	// +kubebuilder:validation:Maximum=3
	// +kubebuilder:validation:ExclusiveMaximum=false
	Size int32 `json:"size,omitempty"`

	// Port 定义了将用于使用镜像初始化容器的端口
	ContainerPort int32 `json:"containerPort,omitempty"`
}

// MemcachedStatus 定义了 Memcached 的观察状态
type MemcachedStatus struct {
	// 代表了 Memcached 当前状态的观察结果
	// Memcached.status.conditions.type 为:"Available"、"Progressing" 和 "Degraded"
	// Memcached.status.conditions.status 为 True、False、Unknown 中的一个
	// Memcached.status.conditions.reason 的值应为驼峰字符串,特定条件类型的产生者可以为此字段定义预期值和含义,以及这些值是否被视为 API 的保证
	// Memcached.status.conditions.Message 是一个人类可读的消息,指示有关转换的详细信息
	// 了解更多信息:https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#typical-status-properties

	Conditions []metav1.Condition `json:"conditions,omitempty" patchStrategy:"merge" patchMergeKey:"type" protobuf:"bytes,1,rep,name=conditions"`
}

因此,当我们向此文件添加新规格并执行 make generate 命令时,我们使用 controller-gen 生成了 CRD 清单,该清单位于 config/crd/bases 目录下。

标记和验证

此外,值得注意的是,我们正在使用 标记,例如 +kubebuilder:validation:Minimum=1。这些标记有助于定义验证和标准,确保用户提供的数据 - 当他们为 Memcached 类型创建或编辑自定义资源时 - 得到适当的验证。有关可用标记的全面列表和详细信息,请参阅标记文档

观察 CRD 中的验证模式;此模式确保 Kubernetes API 正确验证应用的自定义资源(CR):

从:config/crd/bases/cache.example.com_memcacheds.yaml

description: MemcachedSpec 定义了 Memcached 的期望状态
properties:
  containerPort:
    description: Port 定义了将用于使用镜像初始化容器的端口
    format: int32
    type: integer
  size:
    description: 'Size 定义了 Memcached 实例的数量 以下标记将使用 OpenAPI v3 schema 来验证该值 了解更多信息:https://book.kubebuilder.io/reference/markers/crd-validation.html'
    format: int32
    maximum: 3 ## 从标记 +kubebuilder:validation:Maximum=3 生成
    minimum: 1 ## 从标记 +kubebuilder:validation:Minimum=1 生成
    type: integer

自定义资源示例

位于 “config/samples” 目录下的清单作为可以应用于集群的自定义资源的示例。 在这个特定示例中,通过将给定资源应用到集群中,我们将生成一个大小为 1 的 Deployment 实例(参见 size: 1)。

从:config/samples/cache_v1alpha1_memcached.yaml

apiVersion: cache.example.com/v1alpha1
kind: Memcached
metadata:
  name: memcached-sample
spec:
  # TODO(用户):编辑以下值,确保 Operand 在集群上必须拥有的 Pod/实例数量
  size: 1

  # TODO(用户):编辑以下值,确保容器具有正确的端口进行初始化
  containerPort: 11211

对账过程

对账函数在依据其中嵌入的业务逻辑,保持资源与其规格同步方面起着关键作用。它像一个循环,不断检查条件并执行操作,直到实际状态符合业务逻辑的要求。下面用伪代码来说明这一点:

reconcile App {

  // 检查应用的 Deployment 是否存在,如果不存在则创建一个
  // 如果出现错误,则重新开始对账
  if err != nil {
    return reconcile.Result{}, err
  }

  // 检查应用的 Service 是否存在,如果不存在则创建一个
  // 如果出现错误,则重新开始对账
  if err != nil {
    return reconcile.Result{}, err
  }

  // 查找数据库 CR/CRD
  // 检查数据库 Deployment 的副本大小
  // 如果 deployment.replicas 的大小与 cr.size 不匹配,则更新它
  // 然后,从头开始对账。例如,通过返回 `reconcile.Result{Requeue: true}, nil`。
  if err != nil {
    return reconcile.Result{Requeue: true}, nil
  }
  ...

  // 如果循环结束时:
  // 所有操作都成功执行,对账就可以停止了
  return reconcile.Result{}, nil

}

返回选项

以下是重新开始对账的一些可能返回选项:

  • 带有错误:
return ctrl.Result{}, err
  • 没有错误:
return ctrl.Result{Requeue: true}, nil
  • 停止对账(在成功执行之后,或不需要再次对账时使用):
return ctrl.Result{}, nil
  • X 时间后重新开始对账:
return ctrl.Result{RequeueAfter: nextRun.Sub(r.Now())}, nil

在我们的示例中

当将自定义资源应用到集群时,有一个指定的控制器来管理 Memcached 类型。您可以检查其对账是如何实现的:

从:internal/controller/memcached_controller.go

func (r *MemcachedReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	log := log.FromContext(ctx)

	// 获取 Memcached 实例
	// 目的是检查是否在集群上应用了 Memcached 类型的自定义资源
	// 如果没有,我们将返回 nil 以停止对账过程
	memcached := &examplecomv1alpha1.Memcached{}
	err := r.Get(ctx, req.NamespacedName, memcached)
	if err != nil {
		if apierrors.IsNotFound(err) {
			// 如果找不到自定义资源,通常意味着它已被删除或尚未创建
			// 这样,我们将停止对账过程
			log.Info("未找到 memcached 资源。忽略,因为对象可能已被删除")
			return ctrl.Result{}, nil
		}
		// 读取对象时出错 - 重新排队请求
		log.Error(err, "获取 memcached 失败")
		return ctrl.Result{}, err
	}

	// 当没有状态可用时,让我们将状态设置为 Unknown
	if memcached.Status.Conditions == nil || len(memcached.Status.Conditions) == 0 {
		meta.SetStatusCondition(&memcached.Status.Conditions,
			metav1.Condition{
				Type:    typeAvailableMemcached,
				Status:  metav1.ConditionUnknown,
				Reason:  "Reconciling",
				Message: "开始对账",
			})
		if err = r.Status().Update(ctx, memcached); err != nil {
			log.Error(err, "更新 Memcached 状态失败")
			return ctrl.Result{}, err
		}

		// 更新状态后,让我们重新获取 memcached 自定义资源
		// 以便我们在集群上拥有资源的最新状态,并且避免
		// 引发错误 "对象已被修改,请将您的更改应用到最新版本,然后重试"
		// 如果我们尝试在后续操作中再次更新它,这将重新触发对账过程
		if err := r.Get(ctx, req.NamespacedName, memcached); err != nil {
			log.Error(err, "重新获取 memcached 失败")
			return ctrl.Result{}, err
		}
	}

	// 添加 finalizer。然后,我们可以定义在删除自定义资源之前应执行的一些操作。
	// 更多信息:https://kubernetes.io/docs/concepts/overview/working-with-objects/finalizers
	if !controllerutil.ContainsFinalizer(memcached, memcachedFinalizer) {
		log.Info("为 Memcached 添加 Finalizer")
		if ok := controllerutil.AddFinalizer(memcached, memcachedFinalizer); !ok {
			log.Error(err, "无法将 finalizer 添加到自定义资源")
			return ctrl.Result{Requeue: true}, nil
		}

		if err = r.Update(ctx, memcached); err != nil {
			log.Error(err, "更新自定义资源以添加 finalizer 失败")
			return ctrl.Result{}, err
		}
	}

	// 检查是否标记要删除 Memcached 实例,这通过设置删除时间戳来表示。
	isMemcachedMarkedToBeDeleted := memcached.GetDeletionTimestamp() != nil
	if isMemcachedMarkedToBeDeleted {
		if controllerutil.ContainsFinalizer(memcached, memcachedFinalizer) {
			log.Info("在删除 CR 之前执行 Finalizer 操作")

			// 在这里添加一个状态 "Downgrade",以反映该资源开始其终止过程。
			meta.SetStatusCondition(&memcached.Status.Conditions,
				metav1.Condition{
					Type:    typeDegradedMemcached,
					Status:  metav1.ConditionUnknown,
					Reason:  "Finalizing",
					Message: fmt.Sprintf("执行自定义资源的 finalizer 操作:%s", memcached.Name),
				})

			if err := r.Status().Update(ctx, memcached); err != nil {
				log.Error(err, "更新 Memcached 状态失败")
				return ctrl.Result{}, err
			}

			// 执行在删除 finalizer 之前需要的所有操作,并允许
			// Kubernetes API 删除自定义资源。
			r.doFinalizerOperationsForMemcached(memcached)

			// TODO(用户):如果您在 doFinalizerOperationsForMemcached 方法中添加操作
			// 那么您需要确保一切顺利,然后再删除和更新 Downgrade 状态
			// 否则,您应该在此重新排队。

			// 在更新状态前重新获取 memcached 自定义资源
			// 以便我们在集群上拥有资源的最新状态,并且避免
			// 引发错误 "对象已被修改,请将您的更改应用到最新版本,然后重试"
			// 如果我们尝试在后续操作中再次更新它,这将重新触发对账过程
			if err := r.Get(ctx, req.NamespacedName, memcached); err != nil {
				log.Error(err, "重新获取 memcached 失败")
				return ctrl.Result{}, err
			}

			meta.SetStatusCondition(&memcached.Status.Conditions,
				metav1.Condition{
					Type:    typeDegradedMemcached,
					Status:  metav1.ConditionTrue,
					Reason:  "Finalizing",
					Message: fmt.Sprintf("自定义资源 %s 的 finalizer 操作已成功完成", memcached.Name),
				})

			if err := r.Status().Update(ctx, memcached); err != nil {
				log.Error(err, "更新 Memcached 状态失败")
				return ctrl.Result{}, err
			}

			log.Info("成功执行操作后移除 Memcached 的 Finalizer")
			if ok := controllerutil.RemoveFinalizer(memcached, memcachedFinalizer); !ok {
				log.Error(err, "移除 Memcached 的 finalizer 失败")
				return ctrl.Result{Requeue: true}, nil
			}

			if err := r.Update(ctx, memcached); err != nil {
				log.Error(err, "移除 Memcached 的 finalizer 失败")
				return ctrl.Result{}, err
			}
		}
		return ctrl.Result{}, nil
	}

	// 检查部署是否已经存在,如果不存在则创建新的
	found := &appsv1.Deployment{}
	err = r.Get(ctx, types.NamespacedName{Name: memcached.Name, Namespace: memcached.Namespace}, found)
	if err != nil && apierrors.IsNotFound(err) {
		// 定义一个新的部署
		dep, err := r.deploymentForMemcached(memcached)
		if err != nil {
			log.Error(err, "为 Memcached 定义新的 Deployment 资源失败")

			// 以下实现将更新状态
			meta.SetStatusCondition(&memcached.Status.Conditions, metav1.Condition{
				Type:    typeAvailableMemcached,
				Status:  metav1.ConditionFalse,
				Reason:  "Reconciling",
				Message: fmt.Sprintf("为自定义资源创建 Deployment 失败 (%s): (%s)", memcached.Name, err)})

			if err := r.Status().Update(ctx, memcached); err != nil {
				log.Error(err, "更新 Memcached 状态失败")
				return ctrl.Result{}, err
			}

			return ctrl.Result{}, err
		}

		log.Info("创建新的 Deployment",
			"Deployment.Namespace", dep.Namespace, "Deployment.Name", dep.Name)
		if err = r.Create(ctx, dep); err != nil {
			log.Error(err, "创建新的 Deployment 失败",
				"Deployment.Namespace", dep.Namespace, "Deployment.Name", dep.Name)
			return ctrl.Result{}, err
		}

		// 部署成功创建
		// 我们将重新排队对账,以便确保状态
		// 并继续进行下一步操作
		return ctrl.Result{RequeueAfter: time.Minute}, nil
	} else if err != nil {
		log.Error(err, "获取 Deployment 失败")
		// 让我们返回错误以重新触发对账
		return ctrl.Result{}, err
	}

	// CRD API 定义了 Memcached 类型具有 MemcachedSpec.Size 字段
	// 以设置集群上所需的 Deployment 实例数量。
	// 因此,以下代码将确保 Deployment 大小与我们对账的自定义资源的 Size spec 相同。
	size := memcached.Spec.Size
	if *found.Spec.Replicas != size {
		found.Spec.Replicas = &size
		if err = r.Update(ctx, found); err != nil {
			log.Error(err, "更新 Deployment 失败",
				"Deployment.Namespace", found.Namespace, "Deployment.Name", found.Name)

			// 在更新状态前重新获取 memcached 自定义资源
			// 以便我们在集群上拥有资源的最新状态,并且避免
			// 引发错误 "对象已被修改,请将您的更改应用到最新版本,然后重试"
			// 如果我们尝试在后续操作中再次更新它,这将重新触发对账过程
			if err := r.Get(ctx, req.NamespacedName, memcached); err != nil {
				log.Error(err, "重新获取 memcached 失败")
				return ctrl.Result{}, err
			}

			// 以下实现将更新状态
			meta.SetStatusCondition(&memcached.Status.Conditions, 
				metav1.Condition{
					Type:    typeAvailableMemcached,
					Status:  metav1.ConditionFalse,
					Reason:  "Resizing",
					Message: fmt.Sprintf("更新自定义资源的大小失败 (%s): (%s)", memcached.Name, err),
				})

			if err := r.Status().Update(ctx, memcached); err != nil {
				log.Error(err, "更新 Memcached 状态失败")
				return ctrl.Result{}, err
			}

			return ctrl.Result{}, err
		}

		// 现在,我们更新大小后,希望重新排队对账
		// 以便确保我们拥有资源的最新状态
		// 并帮助确保集群上的期望状态
		return ctrl.Result{Requeue: true}, nil
	}

	// 以下实现将更新状态
	meta.SetStatusCondition(&memcached.Status.Conditions,
		metav1.Condition{
			Type:    typeAvailableMemcached,
			Status:  metav1.ConditionTrue,
			Reason:  "Reconciling",
			Message: fmt.Sprintf("为自定义资源 (%s) 创建 %d 个副本的 Deployment 成功", memcached.Name, size),
		})

	if err := r.Status().Update(ctx, memcached); err != nil {
		log.Error(err, "更新 Memcached 状态失败")
		return ctrl.Result{}, err
	}

	return ctrl.Result{}, nil
}

观察集群上的变化

该控制器持续地观察与该类型相关的任何事件。因此,相关的变化会立即触发控制器的对账过程。值得注意的是,我们已经实现了 watches 特性。(更多信息)。这使我们能够监视与创建、更新或删除 Memcached 类型的自定义资源相关的事件,以及由其相应控制器编排和拥有的 Deployment。请注意以下代码:

// SetupWithManager 使用 Manager 设置控制器。
// 请注意,也将监视 Deployment 以确保其在集群中处于期望的状态
func (r *MemcachedReconciler) SetupWithManager(mgr ctrl.Manager) error {
    return ctrl.NewControllerManagedBy(mgr).
    For(&examplecomv1alpha1.Memcached{}). // 为 Memcached 类型创建监视
    Owns(&appsv1.Deployment{}). // 为其控制器拥有的 Deployment 创建监视
    Complete(r)
}

设置 RBAC 权限

RBAC 权限现在通过 RBAC markers 来配置,这些标记用于生成和更新 config/rbac/ 中的清单文件。这些标记可以(并且应该)定义在每个控制器的 Reconcile() 方法之上,请看我们示例中的实现方式:

//+kubebuilder:rbac:groups=cache.example.com,resources=memcacheds,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=cache.example.com,resources=memcacheds/status,verbs=get;update;patch
//+kubebuilder:rbac:groups=cache.example.com,resources=memcacheds/finalizers,verbs=update
//+kubebuilder:rbac:groups=core,resources=events,verbs=create;patch
//+kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;watch

重要的是,如果您希望添加或修改 RBAC 规则,可以通过更新或添加控制器中的相应标记来实现。在进行必要的更改后,运行 make manifests 命令。这将促使 controller-gen 刷新位于 config/rbac 下的文件。

Manager(main.go)

Manager 在监督控制器方面扮演着至关重要的角色,这些控制器进而使集群端的操作成为可能。如果您检查 cmd/main.go 文件,您会看到以下内容:

...
    mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
        Scheme:                 scheme,
        Metrics:                metricsserver.Options{BindAddress: metricsAddr},
        HealthProbeBindAddress: probeAddr,
        LeaderElection:         enableLeaderElection,
        LeaderElectionID:       "1836d577.testproject.org",
        // LeaderElectionReleaseOnCancel 定义了领导者在 Manager 结束时是否应主动放弃领导权。
        // 这要求二进制在 Manager 停止时立即结束,否则此设置是不安全的。设置此选项显著加快主动领导者转换的速度,
        // 因为新领导者无需等待 LeaseDuration 时间。
        //
        // 在提供的默认脚手架中,程序在 Manager 停止后立即结束,因此启用此选项是可以的。但是,
        // 如果您正在进行任何操作,例如在 Manager 停止后执行清理操作,那么使用它可能是不安全的。
        // LeaderElectionReleaseOnCancel: true,
    })
    if err != nil {
        setupLog.Error(err, "无法启动 Manager")
        os.Exit(1)
    }

上面的代码片段概述了 Manager 的配置选项。虽然我们在当前示例中不会更改这些选项,但了解其位置以及初始化您的基于 Operator 的镜像的过程非常重要。Manager 负责监督为您的 Operator API 生成的控制器。

检查在集群中运行的项目

此时,您可以执行 快速入门 中突出显示的命令。通过执行 make docker-build IMG=myregistry/example:1.0.0,您将为项目构建镜像。出于测试目的,建议将此镜像发布到公共注册表。这样可以确保轻松访问,无需额外的配置。完成后,您可以使用 make deploy IMG=myregistry/example:1.0.0 命令将镜像部署到集群中。

下一步

  • 要深入了解开发解决方案,请考虑阅读提供的教程。
  • 要了解优化您的方法的见解,请参阅最佳实践文档。

教程:构建 CronJob

许多教程都以一些非常牵强的设置或一些用于传达基础知识的玩具应用程序开头,然后在更复杂的内容上停滞不前。相反,这个教程应该带您(几乎)完整地了解 Kubebuilder 的复杂性,从简单开始逐步构建到相当全面的内容。

我们假装(当然,这有点牵强)我们终于厌倦了在 Kubernetes 中使用非 Kubebuilder 实现的 CronJob 控制器的维护负担,我们想要使用 Kubebuilder 进行重写。

CronJob 控制器的任务(不是故意的双关语)是在 Kubernetes 集群上定期间隔运行一次性任务。它通过在 Job 控制器的基础上构建来完成这一点,Job 控制器的任务是运行一次性任务一次,并确保其完成。

我们不打算试图重写 Job 控制器,而是将其视为一个机会来了解如何与外部类型交互。

构建项目框架

快速入门中所述,我们需要构建一个新项目的框架。确保您已经安装了 Kubebuilder,然后构建一个新项目:

# 创建一个项目目录,然后运行初始化命令。
mkdir project
cd project
# 我们将使用 tutorial.kubebuilder.io 作为域,
# 因此所有 API 组将是 <group>.tutorial.kubebuilder.io。
kubebuilder init --domain tutorial.kubebuilder.io --repo tutorial.kubebuilder.io/project

现在我们已经有了一个项目框架,让我们来看看 Kubebuilder 到目前为止为我们生成了什么…

基本项目结构包含什么?

在构建新项目的框架时,Kubebuilder 为我们提供了一些基本的样板文件。

构建基础设施

首先是构建项目的基本基础设施:

go.mod:与我们的项目匹配的新 Go 模块,具有基本依赖项
module tutorial.kubebuilder.io/project

go 1.21

require (
	github.com/onsi/ginkgo/v2 v2.14.0
	github.com/onsi/gomega v1.30.0
	github.com/robfig/cron v1.2.0
	k8s.io/api v0.29.0
	k8s.io/apimachinery v0.29.0
	k8s.io/client-go v0.29.0
	sigs.k8s.io/controller-runtime v0.17.0
)

require (
	github.com/beorn7/perks v1.0.1 // indirect
	github.com/cespare/xxhash/v2 v2.2.0 // indirect
	github.com/davecgh/go-spew v1.1.1 // indirect
	github.com/emicklei/go-restful/v3 v3.11.0 // indirect
	github.com/evanphx/json-patch/v5 v5.8.0 // indirect
	github.com/fsnotify/fsnotify v1.7.0 // indirect
	github.com/go-logr/logr v1.4.1 // indirect
	github.com/go-logr/zapr v1.3.0 // indirect
	github.com/go-openapi/jsonpointer v0.19.6 // indirect
	github.com/go-openapi/jsonreference v0.20.2 // indirect
	github.com/go-openapi/swag v0.22.3 // indirect
	github.com/go-task/slim-sprig v0.0.0-20230315185526-52ccab3ef572 // indirect
	github.com/gogo/protobuf v1.3.2 // indirect
	github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da // indirect
	github.com/golang/protobuf v1.5.3 // indirect
	github.com/google/gnostic-models v0.6.8 // indirect
	github.com/google/go-cmp v0.6.0 // indirect
	github.com/google/gofuzz v1.2.0 // indirect
	github.com/google/pprof v0.0.0-20210720184732-4bb14d4b1be1 // indirect
	github.com/google/uuid v1.3.0 // indirect
	github.com/imdario/mergo v0.3.6 // indirect
	github.com/josharian/intern v1.0.0 // indirect
	github.com/json-iterator/go v1.1.12 // indirect
	github.com/mailru/easyjson v0.7.7 // indirect
	github.com/matttproud/golang_protobuf_extensions/v2 v2.0.0 // indirect
	github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
	github.com/modern-go/reflect2 v1.0.2 // indirect
	github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
	github.com/pkg/errors v0.9.1 // indirect
	github.com/prometheus/client_golang v1.18.0 // indirect
	github.com/prometheus/client_model v0.5.0 // indirect
	github.com/prometheus/common v0.45.0 // indirect
	github.com/prometheus/procfs v0.12.0 // indirect
	github.com/spf13/pflag v1.0.5 // indirect
	go.uber.org/multierr v1.11.0 // indirect
	go.uber.org/zap v1.26.0 // indirect
	golang.org/x/exp v0.0.0-20220722155223-a9213eeb770e // indirect
	golang.org/x/net v0.19.0 // indirect
	golang.org/x/oauth2 v0.12.0 // indirect
	golang.org/x/sys v0.16.0 // indirect
	golang.org/x/term v0.15.0 // indirect
	golang.org/x/text v0.14.0 // indirect
	golang.org/x/time v0.3.0 // indirect
	golang.org/x/tools v0.16.1 // indirect
	gomodules.xyz/jsonpatch/v2 v2.4.0 // indirect
	google.golang.org/appengine v1.6.7 // indirect
	google.golang.org/protobuf v1.31.0 // indirect
	gopkg.in/inf.v0 v0.9.1 // indirect
	gopkg.in/yaml.v2 v2.4.0 // indirect
	gopkg.in/yaml.v3 v3.0.1 // indirect
	k8s.io/apiextensions-apiserver v0.29.0 // indirect
	k8s.io/component-base v0.29.0 // indirect
	k8s.io/klog/v2 v2.110.1 // indirect
	k8s.io/kube-openapi v0.0.0-20231010175941-2dd684a91f00 // indirect
	k8s.io/utils v0.0.0-20230726121419-3b25d923346b // indirect
	sigs.k8s.io/json v0.0.0-20221116044647-bc3834ca7abd // indirect
	sigs.k8s.io/structured-merge-diff/v4 v4.4.1 // indirect
	sigs.k8s.io/yaml v1.4.0 // indirect
)
Makefile:用于构建和部署控制器的 Make 目标

# Image URL to use all building/pushing image targets
IMG ?= controller:latest
# ENVTEST_K8S_VERSION refers to the version of kubebuilder assets to be downloaded by envtest binary.
ENVTEST_K8S_VERSION = 1.29.0

# Get the currently used golang install path (in GOPATH/bin, unless GOBIN is set)
ifeq (,$(shell go env GOBIN))
GOBIN=$(shell go env GOPATH)/bin
else
GOBIN=$(shell go env GOBIN)
endif

# CONTAINER_TOOL defines the container tool to be used for building images.
# Be aware that the target commands are only tested with Docker which is
# scaffolded by default. However, you might want to replace it to use other
# tools. (i.e. podman)
CONTAINER_TOOL ?= docker

# Setting SHELL to bash allows bash commands to be executed by recipes.
# Options are set to exit when a recipe line exits non-zero or a piped command fails.
SHELL = /usr/bin/env bash -o pipefail
.SHELLFLAGS = -ec

.PHONY: all
all: build

##@ General

# The help target prints out all targets with their descriptions organized
# beneath their categories. The categories are represented by '##@' and the
# target descriptions by '##'. The awk command is responsible for reading the
# entire set of makefiles included in this invocation, looking for lines of the
# file as xyz: ## something, and then pretty-format the target and help. Then,
# if there's a line with ##@ something, that gets pretty-printed as a category.
# More info on the usage of ANSI control characters for terminal formatting:
# https://en.wikipedia.org/wiki/ANSI_escape_code#SGR_parameters
# More info on the awk command:
# http://linuxcommand.org/lc3_adv_awk.php

.PHONY: help
help: ## Display this help.
	@awk 'BEGIN {FS = ":.*##"; printf "\nUsage:\n  make \033[36m<target>\033[0m\n"} /^[a-zA-Z_0-9-]+:.*?##/ { printf "  \033[36m%-15s\033[0m %s\n", $$1, $$2 } /^##@/ { printf "\n\033[1m%s\033[0m\n", substr($$0, 5) } ' $(MAKEFILE_LIST)

##@ Development

.PHONY: manifests
manifests: controller-gen ## Generate WebhookConfiguration, ClusterRole and CustomResourceDefinition objects.
	$(CONTROLLER_GEN) rbac:roleName=manager-role crd webhook paths="./..." output:crd:artifacts:config=config/crd/bases

.PHONY: generate
generate: controller-gen ## Generate code containing DeepCopy, DeepCopyInto, and DeepCopyObject method implementations.
	$(CONTROLLER_GEN) object:headerFile="hack/boilerplate.go.txt" paths="./..."

.PHONY: fmt
fmt: ## Run go fmt against code.
	go fmt ./...

.PHONY: vet
vet: ## Run go vet against code.
	go vet ./...

.PHONY: test
test: manifests generate fmt vet envtest ## Run tests.
	KUBEBUILDER_ASSETS="$(shell $(ENVTEST) use $(ENVTEST_K8S_VERSION) --bin-dir $(LOCALBIN) -p path)" go test $$(go list ./... | grep -v /e2e) -coverprofile cover.out

# Utilize Kind or modify the e2e tests to load the image locally, enabling compatibility with other vendors.
.PHONY: test-e2e  # Run the e2e tests against a Kind k8s instance that is spun up.
test-e2e:
	go test ./test/e2e/ -v -ginkgo.v

.PHONY: lint
lint: golangci-lint ## Run golangci-lint linter & yamllint
	$(GOLANGCI_LINT) run

.PHONY: lint-fix
lint-fix: golangci-lint ## Run golangci-lint linter and perform fixes
	$(GOLANGCI_LINT) run --fix

##@ Build

.PHONY: build
build: manifests generate fmt vet ## Build manager binary.
	go build -o bin/manager cmd/main.go

.PHONY: run
run: manifests generate fmt vet ## Run a controller from your host.
	go run ./cmd/main.go

# If you wish to build the manager image targeting other platforms you can use the --platform flag.
# (i.e. docker build --platform linux/arm64). However, you must enable docker buildKit for it.
# More info: https://docs.docker.com/develop/develop-images/build_enhancements/
.PHONY: docker-build
docker-build: ## Build docker image with the manager.
	$(CONTAINER_TOOL) build -t ${IMG} .

.PHONY: docker-push
docker-push: ## Push docker image with the manager.
	$(CONTAINER_TOOL) push ${IMG}

# PLATFORMS defines the target platforms for the manager image be built to provide support to multiple
# architectures. (i.e. make docker-buildx IMG=myregistry/mypoperator:0.0.1). To use this option you need to:
# - be able to use docker buildx. More info: https://docs.docker.com/build/buildx/
# - have enabled BuildKit. More info: https://docs.docker.com/develop/develop-images/build_enhancements/
# - be able to push the image to your registry (i.e. if you do not set a valid value via IMG=<myregistry/image:<tag>> then the export will fail)
# To adequately provide solutions that are compatible with multiple platforms, you should consider using this option.
PLATFORMS ?= linux/arm64,linux/amd64,linux/s390x,linux/ppc64le
.PHONY: docker-buildx
docker-buildx: ## Build and push docker image for the manager for cross-platform support
	# copy existing Dockerfile and insert --platform=${BUILDPLATFORM} into Dockerfile.cross, and preserve the original Dockerfile
	sed -e '1 s/\(^FROM\)/FROM --platform=\$$\{BUILDPLATFORM\}/; t' -e ' 1,// s//FROM --platform=\$$\{BUILDPLATFORM\}/' Dockerfile > Dockerfile.cross
	- $(CONTAINER_TOOL) buildx create --name project-v3-builder
	$(CONTAINER_TOOL) buildx use project-v3-builder
	- $(CONTAINER_TOOL) buildx build --push --platform=$(PLATFORMS) --tag ${IMG} -f Dockerfile.cross .
	- $(CONTAINER_TOOL) buildx rm project-v3-builder
	rm Dockerfile.cross

.PHONY: build-installer
build-installer: manifests generate kustomize ## Generate a consolidated YAML with CRDs and deployment.
	mkdir -p dist
	echo "---" > dist/install.yaml # Clean previous content
	@if [ -d "config/crd" ]; then \
		$(KUSTOMIZE) build config/crd > dist/install.yaml; \
		echo "---" >> dist/install.yaml; \
	fi
	cd config/manager && $(KUSTOMIZE) edit set image controller=${IMG}
	$(KUSTOMIZE) build config/default >> dist/install.yaml

##@ Deployment

ifndef ignore-not-found
  ignore-not-found = false
endif

.PHONY: install
install: manifests kustomize ## Install CRDs into the K8s cluster specified in ~/.kube/config.
	$(KUSTOMIZE) build config/crd | $(KUBECTL) apply -f -

.PHONY: uninstall
uninstall: manifests kustomize ## Uninstall CRDs from the K8s cluster specified in ~/.kube/config. Call with ignore-not-found=true to ignore resource not found errors during deletion.
	$(KUSTOMIZE) build config/crd | $(KUBECTL) delete --ignore-not-found=$(ignore-not-found) -f -

.PHONY: deploy
deploy: manifests kustomize ## Deploy controller to the K8s cluster specified in ~/.kube/config.
	cd config/manager && $(KUSTOMIZE) edit set image controller=${IMG}
	$(KUSTOMIZE) build config/default | $(KUBECTL) apply -f -

.PHONY: undeploy
undeploy: kustomize ## Undeploy controller from the K8s cluster specified in ~/.kube/config. Call with ignore-not-found=true to ignore resource not found errors during deletion.
	$(KUSTOMIZE) build config/default | $(KUBECTL) delete --ignore-not-found=$(ignore-not-found) -f -

##@ Dependencies

## Location to install dependencies to
LOCALBIN ?= $(shell pwd)/bin
$(LOCALBIN):
	mkdir -p $(LOCALBIN)

## Tool Binaries
KUBECTL ?= kubectl
KUSTOMIZE ?= $(LOCALBIN)/kustomize-$(KUSTOMIZE_VERSION)
CONTROLLER_GEN ?= $(LOCALBIN)/controller-gen-$(CONTROLLER_TOOLS_VERSION)
ENVTEST ?= $(LOCALBIN)/setup-envtest-$(ENVTEST_VERSION)
GOLANGCI_LINT = $(LOCALBIN)/golangci-lint-$(GOLANGCI_LINT_VERSION)

## Tool Versions
KUSTOMIZE_VERSION ?= v5.3.0
CONTROLLER_TOOLS_VERSION ?= v0.14.0
ENVTEST_VERSION ?= latest
GOLANGCI_LINT_VERSION ?= v1.54.2

.PHONY: kustomize
kustomize: $(KUSTOMIZE) ## Download kustomize locally if necessary.
$(KUSTOMIZE): $(LOCALBIN)
	$(call go-install-tool,$(KUSTOMIZE),sigs.k8s.io/kustomize/kustomize/v5,$(KUSTOMIZE_VERSION))

.PHONY: controller-gen
controller-gen: $(CONTROLLER_GEN) ## Download controller-gen locally if necessary.
$(CONTROLLER_GEN): $(LOCALBIN)
	$(call go-install-tool,$(CONTROLLER_GEN),sigs.k8s.io/controller-tools/cmd/controller-gen,$(CONTROLLER_TOOLS_VERSION))

.PHONY: envtest
envtest: $(ENVTEST) ## Download setup-envtest locally if necessary.
$(ENVTEST): $(LOCALBIN)
	$(call go-install-tool,$(ENVTEST),sigs.k8s.io/controller-runtime/tools/setup-envtest,$(ENVTEST_VERSION))

.PHONY: golangci-lint
golangci-lint: $(GOLANGCI_LINT) ## Download golangci-lint locally if necessary.
$(GOLANGCI_LINT): $(LOCALBIN)
	$(call go-install-tool,$(GOLANGCI_LINT),github.com/golangci/golangci-lint/cmd/golangci-lint,${GOLANGCI_LINT_VERSION})

# go-install-tool will 'go install' any package with custom target and name of binary, if it doesn't exist
# $1 - target path with name of binary (ideally with version)
# $2 - package url which can be installed
# $3 - specific version of package
define go-install-tool
@[ -f $(1) ] || { \
set -e; \
package=$(2)@$(3) ;\
echo "Downloading $${package}" ;\
GOBIN=$(LOCALBIN) go install $${package} ;\
mv "$$(echo "$(1)" | sed "s/-$(3)$$//")" $(1) ;\
}
endef
PROJECT:用于构建新组件的 Kubebuilder 元数据
# Code generated by tool. DO NOT EDIT.
# This file is used to track the info used to scaffold your project
# and allow the plugins properly work.
# More info: https://book.kubebuilder.io/reference/project-config.html
domain: tutorial.kubebuilder.io
layout:
- go.kubebuilder.io/v4
projectName: project
repo: tutorial.kubebuilder.io/project
resources:
- api:
    crdVersion: v1
    namespaced: true
  controller: true
  domain: tutorial.kubebuilder.io
  group: batch
  kind: CronJob
  path: tutorial.kubebuilder.io/project/api/v1
  version: v1
  webhooks:
    defaulting: true
    validation: true
    webhookVersion: v1
version: "3"

启动配置

我们还在config/目录下获得了启动配置。目前,它只包含了Kustomize YAML 定义,用于在集群上启动我们的控制器,但一旦我们开始编写控制器,它还将包含我们的自定义资源定义、RBAC 配置和 Webhook 配置。

config/default 包含了一个Kustomize 基础配置,用于在标准配置中启动控制器。

每个其他目录都包含一个不同的配置部分,重构为自己的基础配置:

  • config/manager:在集群中将您的控制器作为 Pod 启动

  • config/rbac:在其自己的服务帐户下运行您的控制器所需的权限

入口点

最后,但肯定不是最不重要的,Kubebuilder 为我们的项目生成了基本的入口点:main.go。让我们接着看看…

每段旅程都需要一个起点,每个程序都需要一个主函数

emptymain.go
Apache License

版权所有 2022 年 Kubernetes 作者。

根据 Apache 许可,版本 2.0(“许可”)获得许可; 除非符合许可的规定,否则您不得使用此文件。 您可以在以下网址获取许可的副本:

http://www.apache.org/licenses/LICENSE-2.0

除非适用法律要求或书面同意,否则根据许可分发的软件将按“原样“分发, 不附带任何明示或暗示的担保或条件。 请参阅许可,了解特定语言下的权限和限制。

我们的包从一些基本的导入开始。特别是:

  • 核心的 controller-runtime
  • 默认的 controller-runtime 日志记录,Zap(稍后会详细介绍)
package main

import (
	"flag"
	"os"

	// 导入所有 Kubernetes 客户端认证插件(例如 Azure、GCP、OIDC 等)
	// 以确保 exec-entrypoint 和 run 可以利用它们。
	_ "k8s.io/client-go/plugin/pkg/client/auth"

	"k8s.io/apimachinery/pkg/runtime"
	utilruntime "k8s.io/apimachinery/pkg/util/runtime"
	clientgoscheme "k8s.io/client-go/kubernetes/scheme"
	_ "k8s.io/client-go/plugin/pkg/client/auth/gcp"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/cache"
	"sigs.k8s.io/controller-runtime/pkg/healthz"
	"sigs.k8s.io/controller-runtime/pkg/log/zap"
	"sigs.k8s.io/controller-runtime/pkg/metrics/server"
	"sigs.k8s.io/controller-runtime/pkg/webhook"
	// +kubebuilder:scaffold:imports
)

每组控制器都需要一个 Scheme, 它提供了 Kinds 与它们对应的 Go 类型之间的映射。稍后在编写 API 定义时,我们将更详细地讨论 Kinds,所以稍后再谈。

var (
	scheme   = runtime.NewScheme()
	setupLog = ctrl.Log.WithName("setup")
)

func init() {
	utilruntime.Must(clientgoscheme.AddToScheme(scheme))

	//+kubebuilder:scaffold:scheme
}

此时,我们的主函数相当简单:

  • 我们为指标设置了一些基本的标志。

  • 我们实例化了一个 manager, 它负责运行我们所有的控制器,并设置了共享缓存和客户端到 API 服务器的连接(请注意我们告诉 manager 关于我们的 Scheme)。

  • 我们运行我们的 manager,它反过来运行我们所有的控制器和 Webhooks。 manager 被设置为在接收到优雅关闭信号之前一直运行。 这样,当我们在 Kubernetes 上运行时,我们会在 Pod 优雅终止时表现良好。

虽然目前我们没有任何东西要运行,但记住+kubebuilder:scaffold:builder注释的位置——很快那里会变得有趣起来。

func main() {
	var metricsAddr string
	var enableLeaderElection bool
	var probeAddr string
	flag.StringVar(&metricsAddr, "metrics-bind-address", ":8080", "The address the metric endpoint binds to.")
	flag.StringVar(&probeAddr, "health-probe-bind-address", ":8081", "The address the probe endpoint binds to.")
	flag.BoolVar(&enableLeaderElection, "leader-elect", false,
		"Enable leader election for controller manager. "+
			"Enabling this will ensure there is only one active controller manager.")
	opts := zap.Options{
		Development: true,
	}
	opts.BindFlags(flag.CommandLine)
	flag.Parse()

	ctrl.SetLogger(zap.New(zap.UseFlagOptions(&opts)))

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		Scheme: scheme,
		Metrics: server.Options{
			BindAddress: metricsAddr,
		},
		WebhookServer:          webhook.NewServer(webhook.Options{Port: 9443}),
		HealthProbeBindAddress: probeAddr,
		LeaderElection:         enableLeaderElection,
		LeaderElectionID:       "80807133.tutorial.kubebuilder.io",
	})
	if err != nil {
		setupLog.Error(err, "unable to start manager")
		os.Exit(1)
	}

注意,可以通过以下方式限制 Manager,让所有控制器只监视特定命名空间中的资源:

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		Scheme: scheme,
		Cache: cache.Options{
			DefaultNamespaces: map[string]cache.Config{
				namespace: {},
			},
		},
		Metrics: server.Options{
			BindAddress: metricsAddr,
		},
		WebhookServer:          webhook.NewServer(webhook.Options{Port: 9443}),
		HealthProbeBindAddress: probeAddr,
		LeaderElection:         enableLeaderElection,
		LeaderElectionID:       "80807133.tutorial.kubebuilder.io",
	})

上面的示例会把项目的作用范围限定到单个 Namespace。在这种情况下,建议同时将授权限制到该命名空间,方法是将默认的 ClusterRole 和 ClusterRoleBinding 替换为 Role 和 RoleBinding。有关更多信息,请参阅 Kubernetes 关于使用 RBAC 授权的文档。

此外,还可以通过 cache.Options{} 中的 DefaultNamespaces 缓存特定一组命名空间中的对象:

	var namespaces []string // 名称空间列表
	defaultNamespaces := make(map[string]cache.Config)

	for _, ns := range namespaces {
		defaultNamespaces[ns] = cache.Config{}
	}

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		Scheme: scheme,
		Cache: cache.Options{
			DefaultNamespaces: defaultNamespaces,
		},
		Metrics: server.Options{
			BindAddress: metricsAddr,
		},
		WebhookServer:          webhook.NewServer(webhook.Options{Port: 9443}),
		HealthProbeBindAddress: probeAddr,
		LeaderElection:         enableLeaderElection,
		LeaderElectionID:       "80807133.tutorial.kubebuilder.io",
	})

有关更多信息,请参阅 cache.Options{}

	// +kubebuilder:scaffold:builder

	if err := mgr.AddHealthzCheck("healthz", healthz.Ping); err != nil {
		setupLog.Error(err, "unable to set up health check")
		os.Exit(1)
	}
	if err := mgr.AddReadyzCheck("readyz", healthz.Ping); err != nil {
		setupLog.Error(err, "unable to set up ready check")
		os.Exit(1)
	}

	setupLog.Info("starting manager")
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		setupLog.Error(err, "problem running manager")
		os.Exit(1)
	}
}

有了这个,我们可以开始构建我们的 API 了!

组、版本和类型

实际上,在开始创建我们的 API 之前,我们应该稍微谈一下术语。

当我们在 Kubernetes 中讨论 API 时,我们经常使用 4 个术语:groups(组)、versions(版本)、kinds(类型)和resources(资源)。

组和版本

在 Kubernetes 中,API Group(API 组)简单地是相关功能的集合。每个组都有一个或多个versions(版本),正如其名称所示,允许我们随着时间的推移改变 API 的工作方式。

类型和资源

每个 API 组-版本包含一个或多个 API 类型,我们称之为 kinds(类型)。虽然一个类型在不同版本之间可能会改变形式,但每种形式都必须能够以某种方式存储其他形式的所有数据(我们可以将数据存储在字段中,或者在注释中)。这意味着使用较旧的 API 版本不会导致较新的数据丢失或损坏。您还会听到 resources(资源)的说法,资源只是 API 中对某个类型的一种使用。通常,类型和资源之间是一一对应的,例如 pods 资源对应于 Pod 类型。然而,有时同一类型可能由多个资源返回。例如,Scale 类型由所有规模子资源返回,比如 deployments/scale 或 replicasets/scale。这就是允许 Kubernetes HorizontalPodAutoscaler 与不同资源交互的原因。然而,对于自定义资源定义(CRD),每种类型将对应于单个资源。

请注意,资源始终以小写形式存在,并且按照惯例是类型的小写形式。

这如何对应到 Go 语言?

当我们提到特定组-版本中的一种类型时,我们将其称为 GroupVersionKind,简称 GVK;资源对应的则是 GroupVersionResource(GVR)。正如我们将很快看到的那样,每个 GVK 对应于包中的某个根 Go 类型。
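
下面用一个小示例展示 GVK 与 GVR 在 Go 中的表示(使用 k8s.io/apimachinery 的 schema 包,组名沿用本教程的 batch.tutorial.kubebuilder.io,仅作演示):

package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/runtime/schema"
)

func main() {
	// GVK:组-版本-类型,对应包中的某个根 Go 类型(例如 CronJob)
	gvk := schema.GroupVersionKind{
		Group:   "batch.tutorial.kubebuilder.io",
		Version: "v1",
		Kind:    "CronJob",
	}

	// GVR:组-版本-资源,资源名按惯例是类型名的小写复数形式
	gvr := schema.GroupVersionResource{
		Group:    "batch.tutorial.kubebuilder.io",
		Version:  "v1",
		Resource: "cronjobs",
	}

	fmt.Println(gvk) // batch.tutorial.kubebuilder.io/v1, Kind=CronJob
	fmt.Println(gvr) // batch.tutorial.kubebuilder.io/v1, Resource=cronjobs
}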

现在我们术语明晰了,我们可以实际地创建我们的 API!

那么,我们如何创建我们的 API?

在接下来的添加新 API部分中,我们将检查工具如何帮助我们使用命令kubebuilder create api创建我们自己的 API。

这个命令的目标是为我们的类型创建自定义资源(CR)和自定义资源定义(CRD)。要进一步了解,请参阅使用自定义资源定义扩展 Kubernetes API

但是,为什么要创建 API?

新的 API 是我们向 Kubernetes 介绍自定义对象的方式。Go 结构用于生成包括我们数据模式以及跟踪新类型名称等数据的 CRD。然后,我们可以创建我们自定义对象的实例,这些实例将由我们的controllers管理。

我们的 API 和资源代表着我们在集群中的解决方案。基本上,CRD 是我们定制对象的定义,而 CR 是它的一个实例。

啊,你有例子吗?

让我们想象一个经典的场景,目标是在 Kubernetes 平台上运行应用程序及其数据库。然后,一个 CRD 可以代表应用程序,另一个可以代表数据库。通过创建一个 CRD 描述应用程序,另一个描述数据库,我们不会伤害封装、单一责任原则和内聚等概念。损害这些概念可能会导致意想不到的副作用,比如扩展、重用或维护方面的困难,仅举几例。

这样,我们可以创建应用程序 CRD,它将拥有自己的控制器,并负责创建包含应用程序的部署以及创建访问它的服务等工作。类似地,我们可以创建一个代表数据库的 CRD,并部署一个负责管理数据库实例的控制器。

呃,那个 Scheme 是什么?

我们之前看到的Scheme只是一种跟踪给定 GVK 对应的 Go 类型的方式(不要被其godocs所压倒)。

例如,假设我们标记"tutorial.kubebuilder.io/api/v1".CronJob{}类型属于batch.tutorial.kubebuilder.io/v1 API 组(隐含地表示它具有类型CronJob)。

然后,我们可以根据来自 API 服务器的一些 JSON 构造一个新的&CronJob{},其中说

{
    "kind": "CronJob",
    "apiVersion": "batch.tutorial.kubebuilder.io/v1",
    ...
}

或者在我们提交一个&CronJob{}进行更新时,正确查找组-版本。
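
作为补充,下面的小示例演示如何用 Scheme 反查某个 Go 类型注册时对应的 GVK(假设已经像脚手架那样提供了 api/v1 的 AddToScheme,这一部分会在下一章生成;仅作示意):

import (
	"fmt"

	"k8s.io/apimachinery/pkg/runtime"

	batchv1 "tutorial.kubebuilder.io/project/api/v1"
)

func printCronJobGVK() error {
	scheme := runtime.NewScheme()
	if err := batchv1.AddToScheme(scheme); err != nil {
		return err
	}

	// ObjectKinds 根据 Go 类型返回它注册时对应的 GVK 列表
	gvks, _, err := scheme.ObjectKinds(&batchv1.CronJob{})
	if err != nil {
		return err
	}
	fmt.Println(gvks[0]) // batch.tutorial.kubebuilder.io/v1, Kind=CronJob
	return nil
}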

创建一个新的 API

要创建一个新的 Kind(你有关注上一章的内容吗?)以及相应的控制器,我们可以使用 kubebuilder create api 命令:

kubebuilder create api --group batch --version v1 --kind CronJob

按下 y 键来选择 “Create Resource” 和 “Create Controller”。

第一次针对每个 group-version 调用此命令时,它将为新的 group-version 创建一个目录。

在这种情况下,将创建一个名为 api/v1/ 的目录,对应于 batch.tutorial.kubebuilder.io/v1(还记得我们从一开始设置的 --domain 吗?)。

它还将添加一个用于我们的 CronJob Kind 的文件,即 api/v1/cronjob_types.go。每次使用不同的 Kind 调用该命令时,它都会添加一个相应的新文件。

让我们看看我们得到了什么,然后我们可以开始填写它。

emptyapi.go
Apache License

版权所有 2022。

根据 Apache 许可证 2.0 版(“许可证”)获得许可; 除非符合许可证的规定,否则您不得使用此文件。 您可以在下面的网址获取许可证的副本

http://www.apache.org/licenses/LICENSE-2.0

除非适用法律要求或书面同意,否则根据许可证分发的软件 以“原样“为基础分发,没有任何明示或暗示的保证或条件。 请参阅特定语言管理权限和限制的许可证。

我们从简单的开始:我们导入 meta/v1 API 组,它通常不是单独公开的,而是包含所有 Kubernetes Kind 的公共元数据。

package v1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

接下来,我们为我们的 Kind 的 Spec 和 Status 定义类型。Kubernetes 通过协调期望状态(Spec)与实际的集群状态(其他对象的 Status)以及外部状态,然后把它观察到的内容记录下来(Status)来运行。因此,几乎每个功能性对象都包括 spec 和 status。少数类型(比如 ConfigMap)不遵循这种模式,因为它们不编码期望状态,但大多数类型都是这样。

// 编辑此文件!这是你拥有的脚手架!
// 注意:json 标记是必需的。您添加的任何新字段必须具有 json 标记,以便对字段进行序列化。

// CronJobSpec 定义了 CronJob 的期望状态
type CronJobSpec struct {
	// 插入其他的 Spec 字段 - 集群的期望状态
	// 重要提示:在修改此文件后运行 "make" 以重新生成代码
}

// CronJobStatus 定义了 CronJob 的观察状态
type CronJobStatus struct {
	// 插入其他的状态字段 - 定义集群的观察状态
	// 重要提示:在修改此文件后运行 "make" 以重新生成代码
}

接下来,我们定义与实际 Kinds 对应的类型,CronJobCronJobListCronJob 是我们的根类型,描述了 CronJob 类型。与所有 Kubernetes 对象一样,它包含 TypeMeta(描述 API 版本和 Kind), 还包含 ObjectMeta,其中包含名称、命名空间和标签等信息。

CronJobList 简单地是多个 CronJob 的容器。它是用于批量操作(如 LIST)的 Kind。

一般情况下,我们从不修改它们中的任何一个 – 所有的修改都在 Spec 或 Status 中进行。

这个小小的 +kubebuilder:object:root 注释称为标记。我们稍后会看到更多这样的标记,但要知道它们作为额外的元数据,告诉 controller-tools(我们的代码和 YAML 生成器)额外的信息。 这个特定的标记告诉 object 生成器,这个类型表示一个 Kind。然后,object 生成器为我们生成了 runtime.Object 接口的实现,这是所有表示 Kinds 的类型必须实现的标准接口。

//+kubebuilder:object:root=true
//+kubebuilder:subresource:status

// CronJob 是 cronjobs API 的架构
type CronJob struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   CronJobSpec   `json:"spec,omitempty"`
	Status CronJobStatus `json:"status,omitempty"`
}

//+kubebuilder:object:root=true

// CronJobList 包含 CronJob 的列表
type CronJobList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []CronJob `json:"items"`
}

最后,我们将 Go 类型添加到 API 组中。这使我们可以将此 API 组中的类型添加到任何 Scheme 中。

func init() {
	SchemeBuilder.Register(&CronJob{}, &CronJobList{})
}

现在我们已经了解了基本结构,让我们继续填写它!

设计 API

在 Kubernetes 中,我们有一些关于如何设计 API 的规则。特别是,所有序列化字段必须使用 camelCase,因此我们使用 JSON 结构标记来指定这一点。我们还可以使用 omitempty 结构标记,表示字段为空时在序列化时应当省略。

字段可以使用大多数基本类型。数字是个例外:出于 API 兼容性的目的,我们接受三种形式的数字:int32int64 用于整数,resource.Quantity 用于小数。

等等,什么是 Quantity?

Quantity 是一种特殊的表示小数的记法,它具有明确定义的固定表示,使其在不同机器上更易于移植。在 Kubernetes 中,当指定 pod 的资源请求和限制时,您可能已经注意到了它们。

它们在概念上类似于浮点数:它们有一个有效数字、基数和指数。它们的可序列化和人类可读格式使用整数和后缀来指定值,就像我们描述计算机存储的方式一样。

例如,值 2m 表示十进制记法中的 0.002;2Ki 表示十进制中的 2048,而 2K 表示十进制中的 2000。如果我们想指定分数,我们可以切换到另一个后缀,从而仍然使用整数:2.5 等价于 2500m。

有两种支持的基数:10 和 2(分别称为十进制和二进制)。十进制基数用“正常”的 SI 后缀表示(例如MK),而二进制基数则用“mebi”记法表示(例如MiKi)。可以参考兆字节和二进制兆字节

我们还使用另一种特殊类型:metav1.Time。它的功能与time.Time完全相同,只是它具有固定的、可移植的序列化格式。
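
下面用一个小例子演示这两种类型的行为(基于 k8s.io/apimachinery;输出值与上文的描述一致,仅作演示):

package main

import (
	"encoding/json"
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// resource.Quantity:带后缀的定点数表示
	fmt.Println(resource.MustParse("2Ki").Value())      // 2048
	fmt.Println(resource.MustParse("2K").Value())       // 2000
	fmt.Println(resource.MustParse("2m").MilliValue())  // 2,即 0.002
	fmt.Println(resource.MustParse("2.5").MilliValue()) // 2500,等价于 2500m

	// metav1.Time:与 time.Time 等价,但序列化为固定的 RFC 3339 格式
	t := metav1.NewTime(time.Date(2024, 1, 2, 3, 4, 5, 0, time.UTC))
	out, _ := json.Marshal(t)
	fmt.Println(string(out)) // "2024-01-02T03:04:05Z"
}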

现在,让我们来看看我们的 CronJob 对象是什么样子的!

project/api/v1/cronjob_types.go
Apache License

版权所有 2024 年 Kubernetes 作者。

根据 Apache 许可证 2.0 版(以下简称“许可证”)获得许可; 除非符合许可证的规定,否则您不得使用此文件。 您可以在以下网址获取许可证的副本

http://www.apache.org/licenses/LICENSE-2.0

除非适用法律要求或书面同意,根据许可证分发的软件是基于“按原样”分发的, 没有任何明示或暗示的担保或条件。请参阅许可证以获取有关特定语言管理权限和限制的详细信息。

package v1
Imports
import (
	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// EDIT THIS FILE!  THIS IS SCAFFOLDING FOR YOU TO OWN!
// 注意:json 标记是必需的。您添加的任何新字段都必须具有字段的 json 标记以进行序列化。

首先,让我们看一下我们的规范。正如我们之前讨论过的,规范保存期望状态,因此我们控制器的任何“输入”都在这里。

从根本上讲,CronJob 需要以下几个部分:

  • 一个计划(CronJob 中的 cron
  • 一个要运行的作业的模板(CronJob 中的 job

我们还希望有一些额外的内容,这些将使我们的用户生活更轻松:

  • 启动作业的可选截止时间(如果错过此截止时间,我们将等到下一个预定的时间)
  • 如果多个作业同时运行,应该怎么办(我们等待吗?停止旧的作业?两者都运行?)
  • 暂停运行 CronJob 的方法,以防出现问题
  • 对旧作业历史记录的限制

请记住,由于我们从不读取自己的状态,我们需要有其他方法来跟踪作业是否已运行。我们可以使用至少一个旧作业来做到这一点。

我们将使用几个标记(// +comment)来指定额外的元数据。这些将在生成我们的 CRD 清单时由 controller-tools 使用。 正如我们将在稍后看到的,controller-tools 还将使用 GoDoc 来形成字段的描述。

// CronJobSpec 定义了 CronJob 的期望状态
type CronJobSpec struct {
	//+kubebuilder:validation:MinLength=0

	// Cron 格式的计划,请参阅 https://en.wikipedia.org/wiki/Cron。
	Schedule string `json:"schedule"`

	//+kubebuilder:validation:Minimum=0

	// 如果由于任何原因错过预定的时间,则作业启动的可选截止时间(以秒为单位)。错过的作业执行将被视为失败的作业。
	// +optional
	StartingDeadlineSeconds *int64 `json:"startingDeadlineSeconds,omitempty"`

	// 指定如何处理作业的并发执行。
	// 有效值包括:
	// - "Allow"(默认):允许 CronJob 并发运行;
	// - "Forbid":禁止并发运行,如果上一次运行尚未完成,则跳过下一次运行;
	// - "Replace":取消当前正在运行的作业,并用新作业替换它
	// +optional
	ConcurrencyPolicy ConcurrencyPolicy `json:"concurrencyPolicy,omitempty"`

	// 此标志告诉控制器暂停后续执行,它不适用于已经启动的执行。默认为 false。
	// +optional
	Suspend *bool `json:"suspend,omitempty"`

	// 指定执行 CronJob 时将创建的作业。
	JobTemplate batchv1.JobTemplateSpec `json:"jobTemplate"`

	//+kubebuilder:validation:Minimum=0

	// 要保留的成功完成作业的数量。
	// 这是一个指针,用于区分明确的零和未指定的情况。
	// +optional
	SuccessfulJobsHistoryLimit *int32 `json:"successfulJobsHistoryLimit,omitempty"`

	//+kubebuilder:validation:Minimum=0

	// 要保留的失败完成作业的数量。
	// 这是一个指针,用于区分明确的零和未指定的情况。
	// +optional
	FailedJobsHistoryLimit *int32 `json:"failedJobsHistoryLimit,omitempty"`
}

我们定义了一个自定义类型来保存我们的并发策略。实际上,它在内部只是一个字符串,但该类型提供了额外的文档,并允许我们在类型而不是字段上附加验证,使验证更容易重用。

// ConcurrencyPolicy 描述作业将如何被处理。
// 只能指定以下并发策略中的一种。
// 如果没有指定任何一种策略,则默认为 AllowConcurrent。
// +kubebuilder:validation:Enum=Allow;Forbid;Replace
type ConcurrencyPolicy string

const (
	// AllowConcurrent 允许 CronJob 并发运行。
	AllowConcurrent ConcurrencyPolicy = "Allow"

	// ForbidConcurrent 禁止并发运行,如果上一个作业尚未完成,则跳过下一个运行。
	ForbidConcurrent ConcurrencyPolicy = "Forbid"

	// ReplaceConcurrent 取消当前正在运行的作业,并用新作业替换它。
	ReplaceConcurrent ConcurrencyPolicy = "Replace"
)

接下来,让我们设计我们的状态,其中包含观察到的状态。它包含我们希望用户或其他控制器能够轻松获取的任何信息。

我们将保留一个正在运行的作业列表,以及我们成功运行作业的上次时间。请注意,我们使用 metav1.Time 而不是 time.Time 来获得稳定的序列化,如上文所述。

// CronJobStatus 定义了 CronJob 的观察状态
type CronJobStatus struct {
	// 插入额外的状态字段 - 定义集群的观察状态
	// 重要提示:在修改此文件后,请运行“make”以重新生成代码

	// 指向当前正在运行的作业的指针列表。
	// +optional
	Active []corev1.ObjectReference `json:"active,omitempty"`

	// 作业最后成功调度的时间。
	// +optional
	LastScheduleTime *metav1.Time `json:"lastScheduleTime,omitempty"`
}

最后,是我们已经讨论过的其余样板代码。如前所述,除了添加状态子资源的标记,让它表现得像内置的 Kubernetes 类型之外,我们不需要修改这部分。

//+kubebuilder:object:root=true
//+kubebuilder:subresource:status

// CronJob 是 cronjobs API 的模式
type CronJob struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   CronJobSpec   `json:"spec,omitempty"`
	Status CronJobStatus `json:"status,omitempty"`
}

//+kubebuilder:object:root=true

// CronJobList 包含 CronJob 的列表
type CronJobList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []CronJob `json:"items"`
}

func init() {
	SchemeBuilder.Register(&CronJob{}, &CronJobList{})
}

既然我们有了一个 API,我们需要编写一个控制器来实际实现功能。

简要说明:其他文件的内容是什么?

如果你浏览了 api/v1/ 目录中的其他文件,你可能会注意到除了 cronjob_types.go 外还有两个额外的文件:groupversion_info.gozz_generated.deepcopy.go

这两个文件都不需要进行编辑(前者保持不变,后者是自动生成的),但了解它们的内容是很有用的。

groupversion_info.go

groupversion_info.go 包含有关组版本的常见元数据:

project/api/v1/groupversion_info.go
Apache License

版权所有 2024 年 Kubernetes 作者。

根据 Apache 许可,版本 2.0 进行许可(“许可”); 除非遵守许可,否则您不得使用此文件。 您可以在以下网址获取许可的副本:

http://www.apache.org/licenses/LICENSE-2.0

除非适用法律要求或书面同意,根据许可分发的软件是基于“按原样“的基础分发的, 不附带任何明示或暗示的担保或条件。 请参阅许可以获取特定语言下的权限和限制。

首先,我们有一些 包级别 的标记,表示此包中有 Kubernetes 对象,并且此包表示组 batch.tutorial.kubebuilder.ioobject 生成器利用前者,而 CRD 生成器则利用后者从此包中生成正确的 CRD 元数据。

// Package v1 包含了 batch v1 API 组的 API Schema 定义
// +kubebuilder:object:generate=true
// +groupName=batch.tutorial.kubebuilder.io
package v1

import (
	"k8s.io/apimachinery/pkg/runtime/schema"
	"sigs.k8s.io/controller-runtime/pkg/scheme"
)

然后,我们有一些通常有用的变量,帮助我们设置 Scheme。 由于我们需要在我们的控制器中使用此包中的所有类型,有一个方便的方法将所有类型添加到某个 Scheme 中是很有帮助的(也是惯例)。SchemeBuilder 为我们简化了这一过程。

var (
	// GroupVersion 是用于注册这些对象的组版本
	GroupVersion = schema.GroupVersion{Group: "batch.tutorial.kubebuilder.io", Version: "v1"}

	// SchemeBuilder 用于将 go 类型添加到 GroupVersionKind scheme
	SchemeBuilder = &scheme.Builder{GroupVersion: GroupVersion}

	// AddToScheme 将此组版本中的类型添加到给定的 scheme 中。
	AddToScheme = SchemeBuilder.AddToScheme
)

zz_generated.deepcopy.go

zz_generated.deepcopy.go 包含了上述 runtime.Object 接口的自动生成实现,该接口标记了所有我们的根类型表示的 Kinds。

runtime.Object 接口的核心是一个深度复制方法 DeepCopyObject

controller-tools 中的 object 生成器还为每个根类型及其所有子类型生成了另外两个方便的方法:DeepCopyDeepCopyInto
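
以 CronJob 为例,生成出来的方法大致形如下面这样(这里只是手写的示意,实际内容以 controller-gen 生成的 zz_generated.deepcopy.go 为准):

// DeepCopyInto 把接收者深拷贝到 out 中
func (in *CronJob) DeepCopyInto(out *CronJob) {
	*out = *in
	out.TypeMeta = in.TypeMeta
	in.ObjectMeta.DeepCopyInto(&out.ObjectMeta)
	in.Spec.DeepCopyInto(&out.Spec)
	in.Status.DeepCopyInto(&out.Status)
}

// DeepCopy 返回接收者的深拷贝
func (in *CronJob) DeepCopy() *CronJob {
	if in == nil {
		return nil
	}
	out := new(CronJob)
	in.DeepCopyInto(out)
	return out
}

// DeepCopyObject 以 runtime.Object 的形式返回深拷贝,从而满足 runtime.Object 接口
func (in *CronJob) DeepCopyObject() runtime.Object {
	if c := in.DeepCopy(); c != nil {
		return c
	}
	return nil
}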

控制器的内容是什么?

控制器是 Kubernetes 和任何操作者的核心。

控制器的工作是确保对于任何给定的对象,世界的实际状态(集群状态,以及可能是 Kubelet 的运行容器或云提供商的负载均衡器等外部状态)与对象中的期望状态相匹配。每个控制器专注于一个 Kind,但可能会与其他 Kinds 交互。

我们称这个过程为调和

在 controller-runtime 中,实现特定 Kind 的调和逻辑称为Reconciler。调和器接受一个对象的名称,并返回我们是否需要再次尝试(例如在出现错误或周期性控制器(如 HorizontalPodAutoscaler)的情况下)。
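
controller-runtime 中对应的接口非常小,形状大致如下(reconcile.Func 是 controller-runtime 提供的函数适配器,便于用一个函数实现该接口;此处仅作示意):

import (
	"context"

	"sigs.k8s.io/controller-runtime/pkg/reconcile"
)

// Reconciler 只需要实现一个 Reconcile 方法:
// 输入是对象的名称(reconcile.Request 中的 NamespacedName),
// 输出是是否需要重新排队(reconcile.Result)以及可能的错误
var _ reconcile.Reconciler = reconcile.Func(
	func(ctx context.Context, req reconcile.Request) (reconcile.Result, error) {
		// 在这里读取对象的实际状态,并驱动它向期望状态收敛
		return reconcile.Result{}, nil
	},
)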

emptycontroller.go
Apache License

版权所有 2022。

根据 Apache 许可,版本 2.0 进行许可(“许可”); 除非遵守许可,否则您不得使用此文件。 您可以在以下网址获取许可的副本:

http://www.apache.org/licenses/LICENSE-2.0

除非适用法律要求或书面同意,根据许可分发的软件是基于“按原样“的基础分发的, 不附带任何明示或暗示的担保或条件。 请参阅许可以获取特定语言下的权限和限制。

首先,我们从一些标准的导入开始。 与之前一样,我们需要核心的 controller-runtime 库,以及 client 包和我们的 API 类型包。

package controllers

import (
	"context"

	"k8s.io/apimachinery/pkg/runtime"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/log"

	batchv1 "tutorial.kubebuilder.io/project/api/v1"
)

接下来,kubebuilder 为我们生成了一个基本的 reconciler 结构。 几乎每个 reconciler 都需要记录日志,并且需要能够获取对象,因此这些都是开箱即用的。

// CronJobReconciler reconciles a CronJob object
type CronJobReconciler struct {
	client.Client
	Scheme *runtime.Scheme
}

大多数控制器最终都会在集群上运行,因此它们需要 RBAC 权限,我们使用 controller-tools 的 RBAC markers 来指定这些权限。这些是运行所需的最低权限。 随着我们添加更多功能,我们将需要重新审视这些权限。

// +kubebuilder:rbac:groups=batch.tutorial.kubebuilder.io,resources=cronjobs,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=batch.tutorial.kubebuilder.io,resources=cronjobs/status,verbs=get;update;patch

ClusterRole manifest 位于 config/rbac/role.yaml,通过以下命令使用 controller-gen 从上述标记生成:

make manifests

注意:如果收到错误,请运行错误中指定的命令,然后重新运行 make manifests

Reconcile 实际上执行单个命名对象的对账。 我们的 Request 只有一个名称,但我们可以使用 client 从缓存中获取该对象。

我们返回一个空结果且没有错误,这向 controller-runtime 表明我们已成功对账了此对象,在其发生变更之前不需要再次尝试。

大多数控制器需要一个记录句柄和一个上下文,因此我们在这里设置它们。

context 用于允许取消请求,以及可能的跟踪等功能。它是所有 client 方法的第一个参数。Background 上下文只是一个基本上没有任何额外数据或时间限制的上下文。

记录句柄让我们记录日志。controller-runtime 通过一个名为 logr 的库使用结构化日志。很快我们会看到,日志记录通过将键值对附加到静态消息上来实现。我们可以在我们的 reconciler 的顶部预先分配一些键值对,以便将它们附加到此 reconciler 中的所有日志行。

func (r *CronJobReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	_ = log.FromContext(ctx)

	// your logic here

	return ctrl.Result{}, nil
}

最后,我们将此 reconciler 添加到 manager 中,以便在启动 manager 时启动它。

目前,我们只指出此 reconciler 作用于 CronJob。稍后,我们将使用这个来标记我们关心相关的对象。

func (r *CronJobReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&batchv1.CronJob{}).
		Complete(r)
}

现在我们已经看到了调和器的基本结构,让我们填写 CronJob 的逻辑。

实现一个控制器

我们的CronJob控制器的基本逻辑如下:

  1. 加载指定的CronJob

  2. 列出所有活动的作业,并更新状态

  3. 根据历史限制清理旧作业

  4. 检查我们是否被暂停(如果是,则不执行其他操作)

  5. 获取下一个预定运行时间

  6. 如果符合预定时间、未超过截止时间,并且不受并发策略阻塞,则运行一个新作业

  7. 当我们看到一个正在运行的作业(自动完成)或者到了下一个预定运行时间时,重新排队。

project/internal/controller/cronjob_controller.go
Apache License

版权所有 2024 Kubernetes 作者。

根据 Apache 许可证 2.0 版(“许可证”)获得许可; 除非符合许可证的规定,否则您不得使用此文件。 您可以在以下网址获取许可证的副本

http://www.apache.org/licenses/LICENSE-2.0

除非适用法律要求或书面同意,否则根据许可证分发的软件 将按“原样“分发,不附带任何明示或暗示的担保或条件。 请参阅许可证以了解特定语言下的权限和限制。

我们将从一些导入开始。您将看到我们需要比为我们自动生成的导入更多的导入。 我们将在使用每个导入时讨论它们。

package controller

import (
	"context"
	"fmt"
	"sort"
	"time"

	"github.com/robfig/cron"
	kbatch "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	ref "k8s.io/client-go/tools/reference"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/log"

	batchv1 "tutorial.kubebuilder.io/project/api/v1"
)

接下来,我们需要一个时钟,它将允许我们在测试中模拟时间。

// CronJobReconciler 调和 CronJob 对象
type CronJobReconciler struct {
	client.Client
	Scheme *runtime.Scheme
	Clock
}
Clock

我们将模拟时钟以便在测试中更容易地跳转时间,“真实”时钟只是调用 time.Now。

type realClock struct{}

func (_ realClock) Now() time.Time { return time.Now() }

// 时钟知道如何获取当前时间。
// 它可以用于测试中模拟时间。
type Clock interface {
	Now() time.Time
}
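
在测试中,可以注入一个返回固定时间的假时钟,让调度计算变得可预测。下面是一个简单的示意(fakeClock 是这里为了说明而起的名字,不属于脚手架生成的代码):

// fakeClock 总是返回构造时给定的时间
type fakeClock struct {
	now time.Time
}

func (f fakeClock) Now() time.Time { return f.now }

构造 reconciler 时把它赋给 Clock 字段即可,例如 CronJobReconciler{Client: c, Scheme: s, Clock: fakeClock{now: fixedTime}}(其中 c、s、fixedTime 是测试中已有的变量,这里仅作假设)。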

请注意,我们需要更多的 RBAC 权限 —— 因为我们现在正在创建和管理作业,所以我们需要为这些操作添加权限, 这意味着需要添加一些 标记

//+kubebuilder:rbac:groups=batch.tutorial.kubebuilder.io,resources=cronjobs,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=batch.tutorial.kubebuilder.io,resources=cronjobs/status,verbs=get;update;patch
//+kubebuilder:rbac:groups=batch.tutorial.kubebuilder.io,resources=cronjobs/finalizers,verbs=update
//+kubebuilder:rbac:groups=batch,resources=jobs,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=batch,resources=jobs/status,verbs=get

现在,我们进入控制器的核心——调和逻辑。

var (
	scheduledTimeAnnotation = "batch.tutorial.kubebuilder.io/scheduled-at"
)

// Reconcile 是主要的 Kubernetes 调和循环的一部分,旨在将集群的当前状态移动到期望的状态。
// TODO(用户):修改 Reconcile 函数以比较 CronJob 对象指定的状态与实际集群状态,然后执行操作以使集群状态反映用户指定的状态。
//
// 有关更多详细信息,请查看此处的 Reconcile 和其结果:
// - https://pkg.go.dev/sigs.k8s.io/controller-runtime@v0.17.0/pkg/reconcile
func (r *CronJobReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	log := log.FromContext(ctx)

1: 通过名称加载 CronJob

我们将使用我们的客户端获取 CronJob。所有客户端方法都以上下文(以允许取消)作为它们的第一个参数, 并以对象本身作为它们的最后一个参数。Get 有点特殊,因为它以一个 NamespacedName 作为中间参数(大多数没有中间参数,正如我们将在下面看到的)。

许多客户端方法还在最后接受可变选项。

	var cronJob batchv1.CronJob
	if err := r.Get(ctx, req.NamespacedName, &cronJob); err != nil {
		log.Error(err, "无法获取 CronJob")
		// 我们将忽略未找到的错误,因为它们不能通过立即重新排队来修复(我们需要等待新的通知),并且我们可以在删除的请求中得到它们。
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

2: 列出所有活动作业,并更新状态

为了完全更新我们的状态,我们需要列出此命名空间中属于此 CronJob 的所有子作业。 类似于 Get,我们可以使用 List 方法列出子作业。请注意,我们使用可变选项设置命名空间和字段匹配(实际上是我们在下面设置的索引查找)。

	var childJobs kbatch.JobList
	if err := r.List(ctx, &childJobs, client.InNamespace(req.Namespace), client.MatchingFields{jobOwnerKey: req.Name}); err != nil {
		log.Error(err, "无法列出子作业")
		return ctrl.Result{}, err
	}

一旦我们获取了所有属于我们的作业,我们就将它们分为活动、成功和失败三类,并跟踪最近一次的运行时间,以便在状态中记录它。请记住,状态应该能够根据实际情况随时重建,因此通常不建议从根对象的状态中读取数据。相反,您应该在每次运行时重新构建它。这就是我们在这里要做的事情。

我们可以使用状态条件来检查作业是否“完成”,以及它是成功还是失败。我们将把这个逻辑放在一个辅助函数中,使我们的代码更清晰。

	// 找到活动作业列表
	var activeJobs []*kbatch.Job
	var successfulJobs []*kbatch.Job
	var failedJobs []*kbatch.Job
	var mostRecentTime *time.Time // 找到最近的运行时间,以便我们可以在状态中记录它
isJobFinished

如果作业具有状态为 true 的 “Complete” 或 “Failed” 条件,我们就认为它已经“完成”。状态条件允许我们向对象添加可扩展的状态信息,其他人和控制器都可以检查这些信息,以了解作业的完成情况和健康状况等。

	isJobFinished := func(job *kbatch.Job) (bool, kbatch.JobConditionType) {
		for _, c := range job.Status.Conditions {
			if (c.Type == kbatch.JobComplete || c.Type == kbatch.JobFailed) && c.Status == corev1.ConditionTrue {
				return true, c.Type
			}
		}

		return false, ""
	}
getScheduledTimeForJob

我们将使用一个辅助函数从我们在作业创建时添加的注释中提取预定时间。

	getScheduledTimeForJob := func(job *kbatch.Job) (*time.Time, error) {
		timeRaw := job.Annotations[scheduledTimeAnnotation]
		if len(timeRaw) == 0 {
			return nil, nil
		}

		timeParsed, err := time.Parse(time.RFC3339, timeRaw)
		if err != nil {
			return nil, err
		}
		return &timeParsed, nil
	}
	for i, job := range childJobs.Items {
		_, finishedType := isJobFinished(&job)
		switch finishedType {
		case "": // 进行中
			activeJobs = append(activeJobs, &childJobs.Items[i])
		case kbatch.JobFailed:
			failedJobs = append(failedJobs, &childJobs.Items[i])
		case kbatch.JobComplete:
			successfulJobs = append(successfulJobs, &childJobs.Items[i])
		}

		// 我们将在注释中存储启动时间,因此我们将从活动作业中重新构建它。
		scheduledTimeForJob, err := getScheduledTimeForJob(&job)
		if err != nil {
			log.Error(err, "无法解析子作业的计划时间", "job", &job)
			continue
		}
		if scheduledTimeForJob != nil {
			if mostRecentTime == nil || mostRecentTime.Before(*scheduledTimeForJob) {
				mostRecentTime = scheduledTimeForJob
			}
		}
	}

	if mostRecentTime != nil {
		cronJob.Status.LastScheduleTime = &metav1.Time{Time: *mostRecentTime}
	} else {
		cronJob.Status.LastScheduleTime = nil
	}
	cronJob.Status.Active = nil
	for _, activeJob := range activeJobs {
		jobRef, err := ref.GetReference(r.Scheme, activeJob)
		if err != nil {
			log.Error(err, "无法引用活动作业", "job", activeJob)
			continue
		}
		cronJob.Status.Active = append(cronJob.Status.Active, *jobRef)
	}

在这里,我们将记录我们观察到的作业数量,以便进行调试。请注意,我们不使用格式字符串,而是使用固定消息,并附加附加信息的键值对。这样可以更容易地过滤和查询日志行。

	log.V(1).Info("作业数量", "活动作业", len(activeJobs), "成功的作业", len(successfulJobs), "失败的作业", len(failedJobs))
使用我们收集的数据,我们将更新我们的 CRD 的状态。

就像之前一样,我们使用我们的客户端。为了专门更新状态子资源,我们将使用客户端的 Status 部分,以及 Update 方法。

状态子资源会忽略对 spec 的更改,因此不太可能与任何其他更新冲突,并且可以具有单独的权限。

	if err := r.Status().Update(ctx, &cronJob); err != nil {
		log.Error(err, "无法更新 CronJob 状态")
		return ctrl.Result{}, err
	}

一旦我们更新了我们的状态,我们可以继续确保世界的状态与我们在规范中想要的状态匹配。

3: 根据历史限制清理旧作业

首先,我们将尝试清理旧作业,以免留下太多作业。

	// 注意:删除这些是"尽力而为"的——如果我们在特定的作业上失败,我们不会重新排队只是为了完成删除。
	if cronJob.Spec.FailedJobsHistoryLimit != nil {
		sort.Slice(failedJobs, func(i, j int) bool {
			if failedJobs[i].Status.StartTime == nil {
				return failedJobs[j].Status.StartTime != nil
			}
			return failedJobs[i].Status.StartTime.Before(failedJobs[j].Status.StartTime)
		})
		for i, job := range failedJobs {
			if int32(i) >= int32(len(failedJobs))-*cronJob.Spec.FailedJobsHistoryLimit {
				break
			}
			if err := r.Delete(ctx, job, client.PropagationPolicy(metav1.DeletePropagationBackground)); client.IgnoreNotFound(err) != nil {
				log.Error(err, "无法删除旧的失败作业", "job", job)
			} else {
				log.V(0).Info("已删除旧的失败作业", "job", job)
			}
		}
	}

	if cronJob.Spec.SuccessfulJobsHistoryLimit != nil {
		sort.Slice(successfulJobs, func(i, j int) bool {
			if successfulJobs[i].Status.StartTime == nil {
				return successfulJobs[j].Status.StartTime != nil
			}
			return successfulJobs[i].Status.StartTime.Before(successfulJobs[j].Status.StartTime)
		})
		for i, job := range successfulJobs {
			if int32(i) >= int32(len(successfulJobs))-*cronJob.Spec.SuccessfulJobsHistoryLimit {
				break
			}
			if err := r.Delete(ctx, job, client.PropagationPolicy(metav1.DeletePropagationBackground)); err != nil {
				log.Error(err, "无法删除旧的成功作业", "job", job)
			} else {
				log.V(0).Info("已删除旧的成功作业", "job", job)
			}
		}
	}

4: 检查我们是否被暂停

如果此对象被暂停,我们不希望运行任何作业,所以我们将立即停止。 如果我们正在运行的作业出现问题,我们希望暂停运行以进行调查或对集群进行操作,而不删除对象,这是很有用的。

	if cronJob.Spec.Suspend != nil && *cronJob.Spec.Suspend {
		log.V(1).Info("CronJob 已暂停,跳过")
		return ctrl.Result{}, nil
	}

5: 获取下一个预定运行时间

如果我们没有暂停,我们将需要计算下一个预定运行时间,以及我们是否有一个尚未处理的运行。

我们将使用我们有用的 cron 库来计算下一个预定时间。 我们将从我们的最后一次运行时间开始计算适当的时间,或者如果我们找不到最后一次运行,则从 CronJob 的创建开始计算。

如果错过了太多的运行并且我们没有设置任何截止时间,那么我们将中止,以免在控制器重新启动或发生故障时引起问题。

否则,我们将返回错过的运行(我们将只使用最新的),以及下一个运行,以便我们知道何时再次进行调和。

	getNextSchedule := func(cronJob *batchv1.CronJob, now time.Time) (lastMissed time.Time, next time.Time, err error) {
		sched, err := cron.ParseStandard(cronJob.Spec.Schedule)
		if err != nil {
			return time.Time{}, time.Time{}, fmt.Errorf("不可解析的调度 %q:%v", cronJob.Spec.Schedule, err)
		}

		// 为了优化起见,稍微作弊一下,从我们最后观察到的运行时间开始
		// 我们可以在这里重建这个,但是没有什么意义,因为我们刚刚更新了它。
		var earliestTime time.Time
		if cronJob.Status.LastScheduleTime != nil {
			earliestTime = cronJob.Status.LastScheduleTime.Time
		} else {
			earliestTime = cronJob.ObjectMeta.CreationTimestamp.Time
		}
		if cronJob.Spec.StartingDeadlineSeconds != nil {
			// 控制器将不会在此点以下调度任何内容
			schedulingDeadline := now.Add(-time.Second * time.Duration(*cronJob.Spec.StartingDeadlineSeconds))

			if schedulingDeadline.After(earliestTime) {
				earliestTime = schedulingDeadline
			}
		}
		if earliestTime.After(now) {
			return time.Time{}, sched.Next(now), nil
		}

		starts := 0

		// 我们将从最后一次运行时间开始,找到下一个运行时间
		for t := sched.Next(earliestTime); !t.After(now); t = sched.Next(t) {
			starts++
			if starts > 100 {
				return time.Time{}, time.Time{}, fmt.Errorf("错过了太多的运行")
			}
			lastMissed = t
		}

		return lastMissed, sched.Next(now), nil
	}

	// 在这个简化的版本中我们只用到下一次运行时间,因此丢弃返回的 lastMissed。
	_, nextRun, err := getNextSchedule(&cronJob, r.Now())
	if err != nil {
		log.Error(err, "无法计算下一个运行时间")
		return ctrl.Result{}, err
	}

6: 创建下一个作业

最后,我们将创建下一个作业,以便在下一个运行时间触发。

	// 我们将创建一个新的作业对象,并设置它的所有者引用以确保我们在删除时正确清理。
	newJob := &kbatch.Job{
		ObjectMeta: metav1.ObjectMeta{
			GenerateName: cronJob.Name + "-",
			Namespace:    cronJob.Namespace,
			OwnerReferences: []metav1.OwnerReference{
				*metav1.NewControllerRef(&cronJob, batchv1.GroupVersion.WithKind("CronJob")),
			},
			Annotations: map[string]string{
				scheduledTimeAnnotation: nextRun.Format(time.RFC3339),
			},
		},
		Spec: cronJob.Spec.JobTemplate.Spec,
	}

	// 然后在集群上创建这个作业
	if err := r.Create(ctx, newJob); err != nil {
		log.Error(err, "无法创建作业")
		return ctrl.Result{}, err
	}

	log.V(0).Info("已创建新作业", "job", newJob)

	// 我们已经创建了一个新的作业,所以我们将在下一个运行时间重新排队。
	return ctrl.Result{RequeueAfter: nextRun.Sub(r.Now())}, nil
}

现在我们已经实现了 CronJobReconciler 的 Reconcile 方法,我们需要在 manager 中注册它。

我们将在 manager 中注册一个新的控制器,用于管理 CronJob 对象。

func (r *CronJobReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&batchv1.CronJob{}).
		Owns(&kbatch.Job{}).
		Complete(r)
}

这是一个复杂的任务,但现在我们有一个可工作的控制器。让我们对集群进行测试,如果没有任何问题,就部署它吧!

你之前提到过 main 吗?

但首先,记得我们说过我们会再次回到 main.go 吗?让我们来看看发生了什么变化,以及我们需要添加什么。

project/cmd/main.go
Apache 许可证

版权所有 2024 年 Kubernetes 作者。

根据 Apache 许可证 2.0 版(“许可证”)获得许可; 除非符合许可证的规定,否则您不得使用此文件。 您可以在以下网址获取许可证的副本:

http://www.apache.org/licenses/LICENSE-2.0

除非适用法律要求或书面同意,根据许可证分发的软件是基于“按原样”的基础分发的,没有任何明示或暗示的担保或条件。请查看许可证以了解其中关于权限和限制的具体规定。

Imports
package main

import (
	"crypto/tls"
	"flag"
	"os"

	// 导入所有 Kubernetes 客户端认证插件(例如 Azure、GCP、OIDC 等)
	// 以确保 exec-entrypoint 和 run 可以利用它们。
	_ "k8s.io/client-go/plugin/pkg/client/auth"

	"k8s.io/apimachinery/pkg/runtime"
	utilruntime "k8s.io/apimachinery/pkg/util/runtime"
	clientgoscheme "k8s.io/client-go/kubernetes/scheme"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/healthz"
	"sigs.k8s.io/controller-runtime/pkg/log/zap"
	metricsserver "sigs.k8s.io/controller-runtime/pkg/metrics/server"
	"sigs.k8s.io/controller-runtime/pkg/webhook"

	batchv1 "tutorial.kubebuilder.io/project/api/v1"
	"tutorial.kubebuilder.io/project/internal/controller"
	//+kubebuilder:scaffold:imports
)

要注意的第一个变化是,kubebuilder 已将新 API 组的包(batchv1)添加到我们的 scheme 中。 这意味着我们可以在我们的控制器中使用这些对象。

如果我们将使用任何其他 CRD,我们将不得不以相同的方式添加它们的 scheme。 诸如 Job 之类的内置类型通过 clientgoscheme 添加了它们的 scheme。

var (
	scheme   = runtime.NewScheme()
	setupLog = ctrl.Log.WithName("setup")
)

func init() {
	utilruntime.Must(clientgoscheme.AddToScheme(scheme))

	utilruntime.Must(batchv1.AddToScheme(scheme))
	//+kubebuilder:scaffold:scheme
}
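
例如(纯属示意,本教程并不需要),假设我们的控制器还要读写另一个项目定义的 CRD,就需要像下面这样把那个 API 组也注册到 scheme 中。这里展示的是修改后的 init 函数;shipv1 及其导入路径都是虚构的,实际使用时请替换为对应 CRD 的 Go API 模块:

import (
	// 虚构的第三方 API 包,仅用于示意。
	shipv1 "example.com/ship-operator/api/v1"
)

func init() {
	utilruntime.Must(clientgoscheme.AddToScheme(scheme))
	utilruntime.Must(batchv1.AddToScheme(scheme))

	// 新增:注册第三方 CRD 的 scheme,之后 manager 的 client 才能读写这些对象。
	utilruntime.Must(shipv1.AddToScheme(scheme))
	//+kubebuilder:scaffold:scheme
}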

另一个发生变化的地方是,kubebuilder 已添加了一个块,调用我们的 CronJob 控制器的 SetupWithManager 方法。

func main() {
old stuff
	var metricsAddr string
	var enableLeaderElection bool
	var probeAddr string
	var secureMetrics bool
	var enableHTTP2 bool
	flag.StringVar(&metricsAddr, "metrics-bind-address", ":8080", "The address the metric endpoint binds to.")
	flag.StringVar(&probeAddr, "health-probe-bind-address", ":8081", "The address the probe endpoint binds to.")
	flag.BoolVar(&enableLeaderElection, "leader-elect", false,
		"Enable leader election for controller manager. "+
			"Enabling this will ensure there is only one active controller manager.")
	flag.BoolVar(&secureMetrics, "metrics-secure", false,
		"If set the metrics endpoint is served securely")
	flag.BoolVar(&enableHTTP2, "enable-http2", false,
		"If set, HTTP/2 will be enabled for the metrics and webhook servers")
	opts := zap.Options{
		Development: true,
	}
	opts.BindFlags(flag.CommandLine)
	flag.Parse()

	ctrl.SetLogger(zap.New(zap.UseFlagOptions(&opts)))

	// 如果 enable-http2 标志为 false(默认值),则应禁用 http/2
	// 由于其漏洞。更具体地说,禁用 http/2 将防止受到 HTTP/2 流取消和
	// 快速重置 CVE 的影响。更多信息请参见:
	// - https://github.com/advisories/GHSA-qppj-fm5r-hxr3
	// - https://github.com/advisories/GHSA-4374-p667-p6c8
	disableHTTP2 := func(c *tls.Config) {
		setupLog.Info("disabling http/2")
		c.NextProtos = []string{"http/1.1"}
	}

	tlsOpts := []func(*tls.Config){}
	if !enableHTTP2 {
		tlsOpts = append(tlsOpts, disableHTTP2)
	}

	webhookServer := webhook.NewServer(webhook.Options{
		TLSOpts: tlsOpts,
	})

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		Scheme: scheme,
		Metrics: metricsserver.Options{
			BindAddress:   metricsAddr,
			SecureServing: secureMetrics,
			TLSOpts:       tlsOpts,
		},
		WebhookServer:          webhookServer,
		HealthProbeBindAddress: probeAddr,
		LeaderElection:         enableLeaderElection,
		LeaderElectionID:       "80807133.tutorial.kubebuilder.io",
		// LeaderElectionReleaseOnCancel 定义了在 Manager 结束时领导者是否应主动下台。
		// 这需要二进制文件在 Manager 停止后立即结束,否则此设置是不安全的。启用它将显著
		// 加快自愿领导者过渡的速度,因为新领导者无需等待 LeaseDuration 时间。
		//
		// 在默认提供的脚手架中,程序在 Manager 停止后立即结束,因此可以启用此选项。
		// 但是,如果您正在执行或打算在 Manager 停止后执行任何操作,比如执行清理操作,
		// 那么它的使用可能是不安全的。
		// LeaderElectionReleaseOnCancel: true,
	})
	if err != nil {
		setupLog.Error(err, "unable to start manager")
		os.Exit(1)
	}
	if err = (&controller.CronJobReconciler{
		Client: mgr.GetClient(),
		Scheme: mgr.GetScheme(),
	}).SetupWithManager(mgr); err != nil {
		setupLog.Error(err, "unable to create controller", "controller", "CronJob")
		os.Exit(1)
	}
old stuff

我们还将为我们的类型设置 webhooks,接下来我们将讨论它们。 我们只需要将它们添加到 manager 中。由于我们可能希望单独运行 webhooks, 或者在本地测试控制器时不运行它们,我们将它们放在一个环境变量后面。

我们只需确保在本地运行时设置 ENABLE_WEBHOOKS=false

	if os.Getenv("ENABLE_WEBHOOKS") != "false" {
		if err = (&batchv1.CronJob{}).SetupWebhookWithManager(mgr); err != nil {
			setupLog.Error(err, "unable to create webhook", "webhook", "CronJob")
			os.Exit(1)
		}
	}
	//+kubebuilder:scaffold:builder

	if err := mgr.AddHealthzCheck("healthz", healthz.Ping); err != nil {
		setupLog.Error(err, "unable to set up health check")
		os.Exit(1)
	}
	if err := mgr.AddReadyzCheck("readyz", healthz.Ping); err != nil {
		setupLog.Error(err, "unable to set up ready check")
		os.Exit(1)
	}

	setupLog.Info("starting manager")
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		setupLog.Error(err, "problem running manager")
		os.Exit(1)
	}
}

至此,main.go 的改动就介绍完了。接下来,我们来为我们的类型实现 webhook。

实现默认值/验证 webhook

如果你想为你的 CRD 实现准入 webhook,你需要做的唯一事情就是实现 Defaulter 和(或)Validator 接口。

Kubebuilder 会为你处理其余工作,比如

  1. 创建 webhook 服务器。
  2. 确保服务器已添加到 manager 中。
  3. 为你的 webhook 创建处理程序。
  4. 在服务器中为每个处理程序注册一个路径。

首先,让我们为我们的 CRD(CronJob)生成 webhook 框架。我们需要运行以下命令,带有 --defaulting--programmatic-validation 标志(因为我们的测试项目将使用默认值和验证 webhook):

kubebuilder create webhook --group batch --version v1 --kind CronJob --defaulting --programmatic-validation

这将为你生成 webhook 函数,并在你的 main.go 中为你的 webhook 将其注册到 manager 中。

project/api/v1/cronjob_webhook.go
Apache 许可证

版权所有 2024 年 Kubernetes 作者。

根据 Apache 许可证 2.0 版进行许可; 除非符合许可证的规定,否则您不得使用此文件。 您可以在以下网址获取许可证副本:

http://www.apache.org/licenses/LICENSE-2.0

除非适用法律要求或书面同意,否则根据许可证分发的软件按“原样”分发,没有任何担保或条件,无论是明示的还是暗示的。请查看许可证以了解其中关于权限和限制的具体规定。

Go 导入
package v1

import (
	"github.com/robfig/cron"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/runtime/schema"
	validationutils "k8s.io/apimachinery/pkg/util/validation"
	"k8s.io/apimachinery/pkg/util/validation/field"
	ctrl "sigs.k8s.io/controller-runtime"
	logf "sigs.k8s.io/controller-runtime/pkg/log"
	"sigs.k8s.io/controller-runtime/pkg/webhook"
	"sigs.k8s.io/controller-runtime/pkg/webhook/admission"
)

接下来,我们为 Webhook 设置一个日志记录器。

var cronjoblog = logf.Log.WithName("cronjob-resource")

然后,我们使用管理器设置 Webhook。

// SetupWebhookWithManager 将设置管理器以管理 Webhook
func (r *CronJob) SetupWebhookWithManager(mgr ctrl.Manager) error {
	return ctrl.NewWebhookManagedBy(mgr).
		For(r).
		Complete()
}

请注意,我们使用 kubebuilder 标记生成 Webhook 清单。 此标记负责生成一个变更 Webhook 清单。

每个标记的含义可以在这里找到。

//+kubebuilder:webhook:path=/mutate-batch-tutorial-kubebuilder-io-v1-cronjob,mutating=true,failurePolicy=fail,groups=batch.tutorial.kubebuilder.io,resources=cronjobs,verbs=create;update,versions=v1,name=mcronjob.kb.io,sideEffects=None,admissionReviewVersions=v1

我们使用 webhook.Defaulter 接口为我们的 CRD 设置默认值。 将自动提供一个调用此默认值的 Webhook。

Default 方法应该改变接收器,设置默认值。

var _ webhook.Defaulter = &CronJob{}

// Default 实现了 webhook.Defaulter,因此将为该类型注册 Webhook
func (r *CronJob) Default() {
	cronjoblog.Info("默认值", "名称", r.Name)

	if r.Spec.ConcurrencyPolicy == "" {
		r.Spec.ConcurrencyPolicy = AllowConcurrent
	}
	if r.Spec.Suspend == nil {
		r.Spec.Suspend = new(bool)
	}
	if r.Spec.SuccessfulJobsHistoryLimit == nil {
		r.Spec.SuccessfulJobsHistoryLimit = new(int32)
		*r.Spec.SuccessfulJobsHistoryLimit = 3
	}
	if r.Spec.FailedJobsHistoryLimit == nil {
		r.Spec.FailedJobsHistoryLimit = new(int32)
		*r.Spec.FailedJobsHistoryLimit = 1
	}
}

此标记负责生成一个验证 Webhook 清单。

//+kubebuilder:webhook:verbs=create;update;delete,path=/validate-batch-tutorial-kubebuilder-io-v1-cronjob,mutating=false,failurePolicy=fail,groups=batch.tutorial.kubebuilder.io,resources=cronjobs,versions=v1,name=vcronjob.kb.io,sideEffects=None,admissionReviewVersions=v1

我们可以对我们的 CRD 进行超出声明性验证的验证。 通常,声明性验证应该足够了,但有时更复杂的用例需要复杂的验证。

例如,我们将在下面看到,我们使用此功能来验证格式良好的 cron 调度,而不是编写一个长正则表达式。

如果实现了 webhook.Validator 接口,将自动提供一个调用验证的 Webhook。

ValidateCreateValidateUpdateValidateDelete 方法预期在创建、更新和删除时验证其接收器。 我们将 ValidateCreateValidateUpdate 分开,以允许像使某些字段不可变这样的行为,这样它们只能在创建时设置。 我们还将 ValidateDeleteValidateUpdate 分开,以允许在删除时进行不同的验证行为。 在这里,我们只为 ValidateCreateValidateUpdate 使用相同的共享验证。在 ValidateDelete 中不执行任何操作,因为我们不需要在删除时验证任何内容。

var _ webhook.Validator = &CronJob{}

// ValidateCreate 实现了 webhook.Validator,因此将为该类型注册 Webhook
func (r *CronJob) ValidateCreate() (admission.Warnings, error) {
	cronjoblog.Info("验证创建", "名称", r.Name)

	return nil, r.validateCronJob()
}

// ValidateUpdate 实现了 webhook.Validator,因此将为该类型注册 Webhook
func (r *CronJob) ValidateUpdate(old runtime.Object) (admission.Warnings, error) {
	cronjoblog.Info("验证更新", "名称", r.Name)

	return nil, r.validateCronJob()
}
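
顺带一提,上面说到把 ValidateCreate 和 ValidateUpdate 分开可以实现“字段只能在创建时设置”之类的行为。下面是一个纯属示意的例子(本教程的 CronJob 实际上允许修改 schedule),展示如何在 ValidateUpdate 中拿旧对象和新对象做比较:

// validateImmutableSchedule 是一个虚构的校验示例,用来说明 ValidateUpdate
// 为什么会单独收到旧对象;本教程的 CronJob 并没有这条规则。
func (r *CronJob) validateImmutableSchedule(old runtime.Object) *field.Error {
	oldCronJob, ok := old.(*CronJob)
	if !ok {
		return nil
	}
	if oldCronJob.Spec.Schedule != r.Spec.Schedule {
		return field.Forbidden(
			field.NewPath("spec").Child("schedule"),
			"schedule 在创建后不可修改(仅为示例规则)")
	}
	return nil
}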

// ValidateDelete 实现了 webhook.Validator,因此将为该类型注册 Webhook
func (r *CronJob) ValidateDelete() (admission.Warnings, error) {
	cronjoblog.Info("验证删除", "名称", r.Name)

	// TODO(用户):在对象删除时填充您的验证逻辑。
	return nil, nil
}

我们验证 CronJob 的名称和规范。

func (r *CronJob) validateCronJob() error {
	var allErrs field.ErrorList
	if err := r.validateCronJobName(); err != nil {
		allErrs = append(allErrs, err)
	}
	if err := r.validateCronJobSpec(); err != nil {
		allErrs = append(allErrs, err)
	}
	if len(allErrs) == 0 {
		return nil
	}

	return apierrors.NewInvalid(
		schema.GroupKind{Group: "batch.tutorial.kubebuilder.io", Kind: "CronJob"},
		r.Name, allErrs)
}

一些字段通过 OpenAPI 模式进行声明式验证。 您可以在 API 设计部分找到 kubebuilder 验证标记(以 // +kubebuilder:validation 为前缀)。 您可以通过运行 controller-gen crd -w 来找到所有 kubebuilder 支持的用于声明式验证的标记,或者在这里找到它们。
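
下面是一个最小示意(ExampleSpec 是虚构的类型,并非本项目代码),展示这类标记直接写在字段上方的注释里,controller-gen 会把它们写进生成的 CRD 的 OpenAPI 模式:

// ExampleSpec 仅用于演示声明式验证标记的写法。
type ExampleSpec struct {
	// Cron 格式的调度表达式,不能为空字符串。
	// +kubebuilder:validation:MinLength=1
	Schedule string `json:"schedule"`

	// 保留的历史记录条数,限制在 0 到 100 之间。
	// +kubebuilder:validation:Minimum=0
	// +kubebuilder:validation:Maximum=100
	HistoryLimit int32 `json:"historyLimit,omitempty"`
}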

func (r *CronJob) validateCronJobSpec() *field.Error {
	// 来自 Kubernetes API 机制的字段助手帮助我们返回结构化良好的验证错误。
	return validateScheduleFormat(
		r.Spec.Schedule,
		field.NewPath("spec").Child("schedule"))
}

我们需要验证 cron 调度是否格式良好。

func validateScheduleFormat(schedule string, fldPath *field.Path) *field.Error {
	if _, err := cron.ParseStandard(schedule); err != nil {
		return field.Invalid(fldPath, schedule, err.Error())
	}
	return nil
}
验证对象名称

验证字符串字段的长度可以通过验证模式进行声明性验证。 但是,ObjectMeta.Name 字段是在 apimachinery 仓库的一个共享包中定义的,因此我们无法使用验证模式进行声明性验证。

func (r *CronJob) validateCronJobName() *field.Error {
	if len(r.ObjectMeta.Name) > validationutils.DNS1035LabelMaxLength-11 {
		// 与所有 Kubernetes 对象一样,作业名称的长度上限是 63 个字符(必须能放进一个 DNS 子域)。
		// cronjob 控制器在创建作业时会给 cronjob 名称追加一个 11 个字符的后缀(`-$TIMESTAMP`)。
		// 因此 cronjob 名称的长度必须不超过 63-11=52 个字符。
		// 如果不在这里校验,作业创建将在稍后失败。
		return field.Invalid(field.NewPath("metadata").Child("name"), r.Name, "必须不超过 52 个字符")
	}
	return nil
}

运行和部署控制器

可选步骤

如果选择对 API 定义进行任何更改,则在继续之前,可以使用以下命令生成清单,如自定义资源(CRs)或自定义资源定义(CRDs):

make manifests

要测试控制器,请在本地针对集群运行它。 在继续之前,我们需要安装我们的 CRDs,如快速入门中所述。这将自动使用 controller-tools 更新 YAML 清单(如果需要):

make install

现在我们已经安装了我们的 CRDs,我们可以针对集群运行控制器。这将使用我们连接到集群的任何凭据,因此我们暂时不需要担心 RBAC。

在另一个终端中运行

export ENABLE_WEBHOOKS=false
make run

您应该会看到有关控制器启动的日志,但它目前还不会执行任何操作。

此时,我们需要一个 CronJob 进行测试。让我们编写一个样本到 config/samples/batch_v1_cronjob.yaml,然后使用该样本:

apiVersion: batch.tutorial.kubebuilder.io/v1
kind: CronJob
metadata:
  labels:
    app.kubernetes.io/name: cronjob
    app.kubernetes.io/instance: cronjob-sample
    app.kubernetes.io/part-of: project
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/created-by: project
  name: cronjob-sample
spec:
  schedule: "*/1 * * * *"
  startingDeadlineSeconds: 60
  concurrencyPolicy: Allow # explicitly specify, but Allow is also default.
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
  
kubectl create -f config/samples/batch_v1_cronjob.yaml

此时,您应该会看到大量活动。如果观察更改,您应该会看到您的 CronJob 正在运行,并更新状态:

kubectl get cronjob.batch.tutorial.kubebuilder.io -o yaml
kubectl get job

现在我们知道它正在运行,我们可以在集群中运行它。停止 make run 命令,并运行

make docker-build docker-push IMG=<some-registry>/<project-name>:tag
make deploy IMG=<some-registry>/<project-name>:tag

如果再次列出 CronJob,就像我们之前所做的那样,我们应该看到控制器再次正常运行!

部署 cert-manager

我们建议使用 cert-manager 为 Webhook 服务器提供证书。只要它们将证书放在所需的位置,其他解决方案也应该可以正常工作。

您可以按照 cert-manager 文档 进行安装。

cert-manager 还有一个名为 CA 注入器 的组件,负责将 CA bundle 注入到 MutatingWebhookConfiguration / ValidatingWebhookConfiguration 中。

为了实现这一点,您需要在 MutatingWebhookConfiguration / ValidatingWebhookConfiguration 对象中使用一个带有键 cert-manager.io/inject-ca-from 的注释。注释的值应该指向一个现有的 证书请求实例,格式为 <证书命名空间>/<证书名称>

这是我们用于给 MutatingWebhookConfiguration / ValidatingWebhookConfiguration 对象添加注释的 kustomize 补丁:

# 这个补丁会向准入 Webhook 配置添加注释
# CERTIFICATE_NAMESPACE 和 CERTIFICATE_NAME 将由 kustomize 替换
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  labels:
    app.kubernetes.io/name: mutatingwebhookconfiguration
    app.kubernetes.io/instance: mutating-webhook-configuration
    app.kubernetes.io/component: webhook
    app.kubernetes.io/created-by: project
    app.kubernetes.io/part-of: project
    app.kubernetes.io/managed-by: kustomize
  name: mutating-webhook-configuration
  annotations:
    cert-manager.io/inject-ca-from: CERTIFICATE_NAMESPACE/CERTIFICATE_NAME
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  labels:
    app.kubernetes.io/name: validatingwebhookconfiguration
    app.kubernetes.io/instance: validating-webhook-configuration
    app.kubernetes.io/component: webhook
    app.kubernetes.io/created-by: project
    app.kubernetes.io/part-of: project
    app.kubernetes.io/managed-by: kustomize
  name: validating-webhook-configuration
  annotations:
    cert-manager.io/inject-ca-from: CERTIFICATE_NAMESPACE/CERTIFICATE_NAME

部署准入 Webhooks

Kind 集群

建议在 kind 集群中开发您的 Webhook,以便快速迭代。 为什么呢?

  • 您可以在本地不到 1 分钟内启动一个多节点集群。
  • 您可以在几秒钟内将其拆除。
  • 您不需要将镜像推送到远程仓库。

cert-manager

您需要按照 这里 的说明安装 cert-manager 捆绑包。

构建您的镜像

运行以下命令在本地构建您的镜像。

make docker-build docker-push IMG=<some-registry>/<project-name>:tag

如果您使用的是 kind 集群,您不需要将镜像推送到远程容器注册表。您可以直接将本地镜像加载到指定的 kind 集群中:

kind load docker-image <your-image-name>:tag --name <your-kind-cluster-name>

部署 Webhooks

您需要通过 kustomize 启用 Webhook 和 cert manager 配置。 config/default/kustomization.yaml 现在应该如下所示:

# Adds namespace to all resources.
namespace: project-system

# Value of this field is prepended to the
# names of all resources, e.g. a deployment named
# "wordpress" becomes "alices-wordpress".
# Note that it should also match with the prefix (text before '-') of the namespace
# field above.
namePrefix: project-

# Labels to add to all resources and selectors.
#labels:
#- includeSelectors: true
#  pairs:
#    someName: someValue

resources:
- ../crd
- ../rbac
- ../manager
# [WEBHOOK] To enable webhook, uncomment all the sections with [WEBHOOK] prefix including the one in
# crd/kustomization.yaml
- ../webhook
# [CERTMANAGER] To enable cert-manager, uncomment all sections with 'CERTMANAGER'. 'WEBHOOK' components are required.
- ../certmanager
# [PROMETHEUS] To enable prometheus monitor, uncomment all sections with 'PROMETHEUS'.
- ../prometheus

patches:
# Protect the /metrics endpoint by putting it behind auth.
# If you want your controller-manager to expose the /metrics
# endpoint w/o any authn/z, please comment the following line.
- path: manager_auth_proxy_patch.yaml

# [WEBHOOK] To enable webhook, uncomment all the sections with [WEBHOOK] prefix including the one in
# crd/kustomization.yaml
- path: manager_webhook_patch.yaml

# [CERTMANAGER] To enable cert-manager, uncomment all sections with 'CERTMANAGER'.
# Uncomment 'CERTMANAGER' sections in crd/kustomization.yaml to enable the CA injection in the admission webhooks.
# 'CERTMANAGER' needs to be enabled to use ca injection
- path: webhookcainjection_patch.yaml

# [CERTMANAGER] To enable cert-manager, uncomment all sections with 'CERTMANAGER' prefix.
# Uncomment the following replacements to add the cert-manager CA injection annotations
replacements:
  - source: # Add cert-manager annotation to ValidatingWebhookConfiguration, MutatingWebhookConfiguration and CRDs
      kind: Certificate
      group: cert-manager.io
      version: v1
      name: serving-cert # this name should match the one in certificate.yaml
      fieldPath: .metadata.namespace # namespace of the certificate CR
    targets:
      - select:
          kind: ValidatingWebhookConfiguration
        fieldPaths:
          - .metadata.annotations.[cert-manager.io/inject-ca-from]
        options:
          delimiter: '/'
          index: 0
          create: true
      - select:
          kind: MutatingWebhookConfiguration
        fieldPaths:
          - .metadata.annotations.[cert-manager.io/inject-ca-from]
        options:
          delimiter: '/'
          index: 0
          create: true
      - select:
          kind: CustomResourceDefinition
        fieldPaths:
          - .metadata.annotations.[cert-manager.io/inject-ca-from]
        options:
          delimiter: '/'
          index: 0
          create: true
  - source:
      kind: Certificate
      group: cert-manager.io
      version: v1
      name: serving-cert # this name should match the one in certificate.yaml
      fieldPath: .metadata.name
    targets:
      - select:
          kind: ValidatingWebhookConfiguration
        fieldPaths:
          - .metadata.annotations.[cert-manager.io/inject-ca-from]
        options:
          delimiter: '/'
          index: 1
          create: true
      - select:
          kind: MutatingWebhookConfiguration
        fieldPaths:
          - .metadata.annotations.[cert-manager.io/inject-ca-from]
        options:
          delimiter: '/'
          index: 1
          create: true
      - select:
          kind: CustomResourceDefinition
        fieldPaths:
          - .metadata.annotations.[cert-manager.io/inject-ca-from]
        options:
          delimiter: '/'
          index: 1
          create: true
  - source: # Add cert-manager annotation to the webhook Service
      kind: Service
      version: v1
      name: webhook-service
      fieldPath: .metadata.name # namespace of the service
    targets:
      - select:
          kind: Certificate
          group: cert-manager.io
          version: v1
        fieldPaths:
          - .spec.dnsNames.0
          - .spec.dnsNames.1
        options:
          delimiter: '.'
          index: 0
          create: true
  - source:
      kind: Service
      version: v1
      name: webhook-service
      fieldPath: .metadata.namespace # namespace of the service
    targets:
      - select:
          kind: Certificate
          group: cert-manager.io
          version: v1
        fieldPaths:
          - .spec.dnsNames.0
          - .spec.dnsNames.1
        options:
          delimiter: '.'
          index: 1
          create: true

config/crd/kustomization.yaml 现在应该如下所示:

# This kustomization.yaml is not intended to be run by itself,
# since it depends on service name and namespace that are out of this kustomize package.
# It should be run by config/default
resources:
- bases/batch.tutorial.kubebuilder.io_cronjobs.yaml
#+kubebuilder:scaffold:crdkustomizeresource

patches:
# [WEBHOOK] To enable webhook, uncomment all the sections with [WEBHOOK] prefix.
# patches here are for enabling the conversion webhook for each CRD
- path: patches/webhook_in_cronjobs.yaml
#+kubebuilder:scaffold:crdkustomizewebhookpatch

# [CERTMANAGER] To enable cert-manager, uncomment all the sections with [CERTMANAGER] prefix.
# patches here are for enabling the CA injection for each CRD
- path: patches/cainjection_in_cronjobs.yaml
#+kubebuilder:scaffold:crdkustomizecainjectionpatch

# [WEBHOOK] To enable webhook, uncomment the following section
# the following config is for teaching kustomize how to do kustomization for CRDs.

configurations:
- kustomizeconfig.yaml

现在您可以通过以下命令将其部署到集群中:

make deploy IMG=<some-registry>/<project-name>:tag

等待一段时间,直到 Webhook Pod 启动并证书被提供。通常在 1 分钟内完成。

现在您可以创建一个有效的 CronJob 来测试您的 Webhooks。创建应该成功通过。

kubectl create -f config/samples/batch_v1_cronjob.yaml

您还可以尝试创建一个无效的 CronJob(例如,使用格式不正确的 schedule 字段)。您应该看到创建失败并带有验证错误。

编写控制器测试

测试 Kubernetes 控制器是一个庞大的主题,而 kubebuilder 为您生成的样板测试文件相对较少。

为了引导您了解 Kubebuilder 生成的控制器的集成测试模式,我们将回顾我们在第一个教程中构建的 CronJob,并为其编写一个简单的测试。

基本方法是,在生成的 suite_test.go 文件中,您将使用 envtest 创建一个本地 Kubernetes API 服务器,实例化和运行您的控制器,然后编写额外的 *_test.go 文件使用 Ginkgo 进行测试。

如果您想调整您的 envtest 集群的配置,请参阅 为集成测试配置 envtest 部分以及 envtest 文档

测试环境设置

../../cronjob-tutorial/testdata/project/internal/controller/suite_test.go
Apache License

版权所有 2024 年 Kubernetes 作者。

根据 Apache 许可证 2.0 版(“许可证”)许可; 除非符合许可证的规定,否则您不得使用此文件。 您可以在以下网址获取许可证的副本:

http://www.apache.org/licenses/LICENSE-2.0

除非适用法律要求或经书面同意,否则根据许可证分发的软件按“原样”提供,不附带任何担保或条件,无论是明示的还是暗示的。请查看许可证以了解其中关于权限和限制的具体规定。

Imports

当我们在上一章中使用 kubebuilder create api 创建 CronJob API 时,Kubebuilder 已经为您做了一些测试工作。 Kubebuilder 生成了一个 internal/controller/suite_test.go 文件,其中包含了设置测试环境的基本内容。

首先,它将包含必要的导入项。

package controller

// 这些测试使用 Ginkgo(BDD 风格的 Go 测试框架)。请参考
// http://onsi.github.io/ginkgo/ 了解更多关于 Ginkgo 的信息。

现在,让我们来看一下生成的代码。

var (
    cfg       *rest.Config
    k8sClient client.Client // 您将在测试中使用此客户端。
    testEnv   *envtest.Environment
    ctx       context.Context
    cancel    context.CancelFunc
)

func TestControllers(t *testing.T) {
    RegisterFailHandler(Fail)

    RunSpecs(t, "Controller Suite")
}

var _ = BeforeSuite(func() {
    // 省略了一些设置代码
})
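
被省略的设置代码大致会做这些事:启动 envtest(本地的 etcd + kube-apiserver)、把我们的 API 注册到 scheme、创建测试客户端,并在后台启动 manager。下面是一个删节后的示意,仅供参考;具体细节(路径、导入的 filepath、envtest、logf、zap 等)以你项目中实际生成的 suite_test.go 为准,可能随 Kubebuilder 版本不同而变化:

var _ = BeforeSuite(func() {
    logf.SetLogger(zap.New(zap.WriteTo(GinkgoWriter), zap.UseDevMode(true)))

    ctx, cancel = context.WithCancel(context.TODO())

    // 启动 envtest,并让它加载我们生成的 CRD。
    testEnv = &envtest.Environment{
        CRDDirectoryPaths:     []string{filepath.Join("..", "..", "config", "crd", "bases")},
        ErrorIfCRDPathMissing: true,
    }

    var err error
    cfg, err = testEnv.Start()
    Expect(err).NotTo(HaveOccurred())

    // 注册我们的 API 类型,这样 client 才能识别 CronJob。
    Expect(batchv1.AddToScheme(scheme.Scheme)).To(Succeed())

    // 创建测试中使用的客户端。
    k8sClient, err = client.New(cfg, client.Options{Scheme: scheme.Scheme})
    Expect(err).NotTo(HaveOccurred())

    // 创建 manager、注册控制器,并在后台 goroutine 中运行它。
    k8sManager, err := ctrl.NewManager(cfg, ctrl.Options{Scheme: scheme.Scheme})
    Expect(err).NotTo(HaveOccurred())

    Expect((&CronJobReconciler{
        Client: k8sManager.GetClient(),
        Scheme: k8sManager.GetScheme(),
    }).SetupWithManager(k8sManager)).To(Succeed())

    go func() {
        defer GinkgoRecover()
        Expect(k8sManager.Start(ctx)).To(Succeed())
    }()
})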

Kubebuilder 还生成了用于清理 envtest 并在控制器目录中实际运行测试文件的样板函数。 您不需要修改这些函数。

var _ = AfterSuite(func() {
    // 省略了一些清理代码
})

现在,您的控制器在测试集群上运行,并且已准备好在您的 CronJob 上执行操作的客户端,我们可以开始编写集成测试了!

测试控制器行为

../../cronjob-tutorial/testdata/project/internal/controller/cronjob_controller_test.go
Apache License

根据 Apache 许可证 2.0 版(“许可证”)许可; 除非符合许可证的规定,否则您不得使用此文件。 您可以在以下网址获取许可证的副本:

http://www.apache.org/licenses/LICENSE-2.0

除非适用法律要求或经书面同意,根据许可证分发的软件按“原样”提供,不附带任何担保或条件,无论是明示的还是暗示的。请查看许可证以了解其中关于权限和限制的具体规定。

理想情况下,对于每个在 suite_test.go 中调用的控制器,我们应该有一个 <kind>_controller_test.go。 因此,让我们为 CronJob 控制器编写示例测试(cronjob_controller_test.go)。

Imports

和往常一样,我们从必要的导入项开始。我们还定义了一些实用变量。

package controller

import (
	"context"
	"reflect"
	"time"

	batchv1 "k8s.io/api/batch/v1"
	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"

	cronjobv1 "tutorial.kubebuilder.io/project/api/v1"
)

编写简单集成测试的第一步是实际创建一个 CronJob 实例,以便对其运行测试。 请注意,要创建 CronJob,您需要创建一个包含您的 CronJob 规范的存根 CronJob 结构。

请注意,当我们创建存根 CronJob 时,CronJob 还需要其所需的下游对象的存根。 如果没有下游的存根 Job 模板规范和下游的 Pod 模板规范,Kubernetes API 将无法创建 CronJob。

var _ = Describe("CronJob controller", func() {

    // 为对象名称和测试超时/持续时间和间隔定义实用常量。
    const (
        CronjobName      = "test-cronjob"
        CronjobNamespace = "default"
        JobName          = "test-job"

        timeout  = time.Second * 10
        duration = time.Second * 10
        interval = time.Millisecond * 250
    )

    Context("当更新 CronJob 状态时", func() {
        It("当创建新的 Job 时,应增加 CronJob 的 Status.Active 计数", func() {
            By("创建一个新的 CronJob")
            ctx := context.Background()
            cronJob := &cronjobv1.CronJob{
                TypeMeta: metav1.TypeMeta{
                    APIVersion: "batch.tutorial.kubebuilder.io/v1",
                    Kind:       "CronJob",
                },
                ObjectMeta: metav1.ObjectMeta{
                    Name:      CronjobName,
                    Namespace: CronjobNamespace,
                },
                Spec: cronjobv1.CronJobSpec{
                    Schedule: "1 * * * *",
                    JobTemplate: batchv1.JobTemplateSpec{
                        Spec: batchv1.JobSpec{
                            // 为简单起见,我们只填写了必填字段。
                            Template: v1.PodTemplateSpec{
                                Spec: v1.PodSpec{
                                    // 为简单起见,我们只填写了必填字段。
                                    Containers: []v1.Container{
                                        {
                                            Name:  "test-container",
                                            Image: "test-image",
                                        },
                                    },
                                    RestartPolicy: v1.RestartPolicyOnFailure,
                                },
                            },
                        },
                    },
                },
            }
            Expect(k8sClient.Create(ctx, cronJob)).Should(Succeed())

           

创建完这个 CronJob 后,让我们检查 CronJob 的 Spec 字段是否与我们传入的值匹配。 请注意,由于 k8s apiserver 在我们之前的 Create() 调用后可能尚未完成创建 CronJob,我们将使用 Gomega 的 Eventually() 测试函数,而不是 Expect(),以便让 apiserver 有机会完成创建我们的 CronJob。

Eventually() 将重复运行作为参数提供的函数,直到 (a) 函数的输出与随后的 Should() 调用中的预期值匹配,或者 (b) 尝试次数 * 间隔时间超过提供的超时值。

在下面的示例中,timeout 和 interval 是我们选择的 Go Duration 值。

            cronjobLookupKey := types.NamespacedName{Name: CronjobName, Namespace: CronjobNamespace}
            createdCronjob := &cronjobv1.CronJob{}

            // 我们需要重试获取这个新创建的 CronJob,因为创建可能不会立即发生。
            Eventually(func() bool {
                err := k8sClient.Get(ctx, cronjobLookupKey, createdCronjob)
                return err == nil
            }, timeout, interval).Should(BeTrue())
            // 让我们确保我们的 Schedule 字符串值被正确转换/处理。
            Expect(createdCronjob.Spec.Schedule).Should(Equal("1 * * * *"))
           

现在我们在测试集群中创建了一个 CronJob,下一步是编写一个测试,实际测试我们的 CronJob 控制器的行为。 让我们测试负责更新 CronJob.Status.Active 以包含正在运行的 Job 的 CronJob 控制器逻辑。 我们将验证当 CronJob 有一个活动的下游 Job 时,其 CronJob.Status.Active 字段包含对该 Job 的引用。

首先,我们应该获取之前创建的测试 CronJob,并验证它当前是否没有任何活动的 Job。 我们在这里使用 Gomega 的 Consistently() 检查,以确保在一段时间内活动的 Job 计数保持为 0。

            By("检查 CronJob 是否没有活动的 Jobs")
            Consistently(func() (int, error) {
                err := k8sClient.Get(ctx, cronjobLookupKey, createdCronjob)
                if err != nil {
                    return -1, err
                }
                return len(createdCronjob.Status.Active), nil
            }, duration, interval).Should(Equal(0))
           
接下来,我们实际创建一个属于我们的 CronJob 的存根 Job,以及其下游模板规范。
我们将 Job 状态中的 "Active" 计数设置为 2,以模拟该 Job 正在运行两个 Pod,也就是说该 Job 处于活动运行状态。

然后,我们获取存根 Job,并将其所有者引用设置为指向我们的测试 CronJob。
这确保测试 Job 属于我们的测试 CronJob,并由其跟踪。

完成后,我们创建我们的新 Job 实例。

            By("创建一个新的 Job")
            testJob := &batchv1.Job{
                ObjectMeta: metav1.ObjectMeta{
                    Name:      JobName,
                    Namespace: CronjobNamespace,
                },
                Spec: batchv1.JobSpec{
                    Template: v1.PodTemplateSpec{
                        Spec: v1.PodSpec{
                            // 为简单起见,我们只填写了必填字段。
                            Containers: []v1.Container{
                                {
                                    Name:  "test-container",
                                    Image: "test-image",
                                },
                            },
                            RestartPolicy: v1.RestartPolicyOnFailure,
                        },
                    },
                },
                Status: batchv1.JobStatus{
                    Active: 2,
                },
            }

            // 请注意,设置此所有者引用需要您的 CronJob 的 GroupVersionKind。
            kind := reflect.TypeOf(cronjobv1.CronJob{}).Name()
            gvk := cronjobv1.GroupVersion.WithKind(kind)

            controllerRef := metav1.NewControllerRef(createdCronjob, gvk)
            testJob.SetOwnerReferences([]metav1.OwnerReference{*controllerRef})
            Expect(k8sClient.Create(ctx, testJob)).Should(Succeed())
           
将此 Job 添加到我们的测试 CronJob 应该触发我们控制器的协调逻辑。

之后,我们可以编写一个测试,评估我们的控制器是否最终按预期更新我们的 CronJob 的 Status 字段!

            By("检查 CronJob 是否有一个活动的 Job")
            Eventually(func() ([]string, error) {
                err := k8sClient.Get(ctx, cronjobLookupKey, createdCronjob)
                if err != nil {
                    return nil, err
                }

                names := []string{}
                for _, job := range createdCronjob.Status.Active {
                    names = append(names, job.Name)
                }
                return names, nil
            }, timeout, interval).Should(ConsistOf(JobName), "应在状态的活动作业列表中列出我们的活动作业 %s", JobName)
        })
    })

})

编写完所有这些代码后,您可以再次在您的 controllers/ 目录中运行 go test ./... 来运行您的新测试!

上面的状态更新示例演示了一种针对带有下游对象的自定义 Kind 的通用测试策略。希望到目前为止,您已经学会了以下测试控制器行为的方法:

  • 设置您的控制器在 envtest 集群上运行
  • 编写用于创建测试对象的存根
  • 隔离对象的更改以测试特定的控制器行为

高级示例

有更复杂的示例使用 envtest 严格测试控制器行为。示例包括:

  • Azure Databricks Operator:可以查看他们完整成熟的 suite_test.go,以及同一目录下的各个 *_test.go 文件(例如这个)。

到目前为止,我们已经实现了一个相当全面的 CronJob 控制器,充分利用了 Kubebuilder 的大多数功能,并使用 envtest 为控制器编写了测试。

如果您想了解更多内容,请前往多版本教程,了解如何向项目添加新的 API 版本。

此外,您还可以自行尝试以下步骤(我们很快会在教程中介绍这些内容):

  • 为自定义资源添加额外的打印列,以改善 kubectl get 的输出显示。


Tutorial: Multi-Version API

Most projects start out with an alpha API that changes release to release. However, eventually, most projects will need to move to a more stable API. Once your API is stable though, you can’t make breaking changes to it. That’s where API versions come into play.

Let’s make some changes to the CronJob API spec and make sure all the different versions are supported by our CronJob project.

If you haven’t already, make sure you’ve gone through the base CronJob Tutorial.

Next, let’s figure out what changes we want to make…

Changing things up

A fairly common change in a Kubernetes API is to take some data that used to be unstructured or stored in some special string format, and change it to structured data. Our schedule field fits the bill quite nicely for this – right now, in v1, our schedules look like

schedule: "*/1 * * * *"

That’s a pretty textbook example of a special string format (it’s also pretty unreadable unless you’re a Unix sysadmin).

Let’s make it a bit more structured. According to our CronJob code, we support “standard” Cron format.

In Kubernetes, all versions must be safely round-tripable through each other. This means that if we convert from version 1 to version 2, and then back to version 1, we must not lose information. Thus, any change we make to our API must be compatible with whatever we supported in v1, and also need to make sure anything we add in v2 is supported in v1. In some cases, this means we need to add new fields to v1, but in our case, we won’t have to, since we’re not adding new functionality.

Keeping all that in mind, let’s convert our example above to be slightly more structured:

schedule:
  minute: */1

Now, at least, we’ve got labels for each of our fields, but we can still easily support all the different syntax for each field.

We’ll need a new API version for this change. Let’s call it v2:

kubebuilder create api --group batch --version v2 --kind CronJob

Press y for “Create Resource” and n for “Create Controller”.

Now, let’s copy over our existing types, and make the change:

project/api/v2/cronjob_types.go
Apache License

Copyright 2023 The Kubernetes authors.

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Since we’re in a v2 package, controller-gen will assume this is for the v2 version automatically. We could override that with the +versionName marker.

package v2
Imports
import (
	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// EDIT THIS FILE!  THIS IS SCAFFOLDING FOR YOU TO OWN!
// NOTE: json tags are required.  Any new fields you add must have json tags for the fields to be serialized.

We’ll leave our spec largely unchanged, except to change the schedule field to a new type.

// CronJobSpec defines the desired state of CronJob
type CronJobSpec struct {
	// The schedule in Cron format, see https://en.wikipedia.org/wiki/Cron.
	Schedule CronSchedule `json:"schedule"`
The rest of Spec
	// +kubebuilder:validation:Minimum=0

	// Optional deadline in seconds for starting the job if it misses scheduled
	// time for any reason.  Missed jobs executions will be counted as failed ones.
	// +optional
	StartingDeadlineSeconds *int64 `json:"startingDeadlineSeconds,omitempty"`

	// Specifies how to treat concurrent executions of a Job.
	// Valid values are:
	// - "Allow" (default): allows CronJobs to run concurrently;
	// - "Forbid": forbids concurrent runs, skipping next run if previous run hasn't finished yet;
	// - "Replace": cancels currently running job and replaces it with a new one
	// +optional
	ConcurrencyPolicy ConcurrencyPolicy `json:"concurrencyPolicy,omitempty"`

	// This flag tells the controller to suspend subsequent executions, it does
	// not apply to already started executions.  Defaults to false.
	// +optional
	Suspend *bool `json:"suspend,omitempty"`

	// Specifies the job that will be created when executing a CronJob.
	JobTemplate batchv1.JobTemplateSpec `json:"jobTemplate"`

	// +kubebuilder:validation:Minimum=0

	// The number of successful finished jobs to retain.
	// This is a pointer to distinguish between explicit zero and not specified.
	// +optional
	SuccessfulJobsHistoryLimit *int32 `json:"successfulJobsHistoryLimit,omitempty"`

	// +kubebuilder:validation:Minimum=0

	// The number of failed finished jobs to retain.
	// This is a pointer to distinguish between explicit zero and not specified.
	// +optional
	FailedJobsHistoryLimit *int32 `json:"failedJobsHistoryLimit,omitempty"`
}

Next, we’ll need to define a type to hold our schedule. Based on our proposed YAML above, it’ll have a field for each corresponding Cron “field”.

// describes a Cron schedule.
type CronSchedule struct {
	// specifies the minute during which the job executes.
	// +optional
	Minute *CronField `json:"minute,omitempty"`
	// specifies the hour during which the job executes.
	// +optional
	Hour *CronField `json:"hour,omitempty"`
	// specifies the day of the month during which the job executes.
	// +optional
	DayOfMonth *CronField `json:"dayOfMonth,omitempty"`
	// specifies the month during which the job executes.
	// +optional
	Month *CronField `json:"month,omitempty"`
	// specifies the day of the week during which the job executes.
	// +optional
	DayOfWeek *CronField `json:"dayOfWeek,omitempty"`
}

Finally, we’ll define a wrapper type to represent a field. We could attach additional validation to this field, but for now we’ll just use it for documentation purposes.

// represents a Cron field specifier.
type CronField string
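
If we did want to attach validation later, a declarative marker on a field type would be enough. As a purely illustrative sketch (this type is not part of the tutorial), a pattern-restricted variant could look like:

// ValidatedCronField is a hypothetical variant of CronField that only accepts
// characters commonly found in standard Cron field specifiers.
// +kubebuilder:validation:Pattern=`^[0-9A-Za-z*,/?\-]+$`
type ValidatedCronField string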
Other Types

All the other types will stay the same as before.

// ConcurrencyPolicy describes how the job will be handled.
// Only one of the following concurrent policies may be specified.
// If none of the following policies is specified, the default one
// is AllowConcurrent.
// +kubebuilder:validation:Enum=Allow;Forbid;Replace
type ConcurrencyPolicy string

const (
	// AllowConcurrent allows CronJobs to run concurrently.
	AllowConcurrent ConcurrencyPolicy = "Allow"

	// ForbidConcurrent forbids concurrent runs, skipping next run if previous
	// hasn't finished yet.
	ForbidConcurrent ConcurrencyPolicy = "Forbid"

	// ReplaceConcurrent cancels currently running job and replaces it with a new one.
	ReplaceConcurrent ConcurrencyPolicy = "Replace"
)

// CronJobStatus defines the observed state of CronJob
type CronJobStatus struct {
	// INSERT ADDITIONAL STATUS FIELD - define observed state of cluster
	// Important: Run "make" to regenerate code after modifying this file

	// A list of pointers to currently running jobs.
	// +optional
	Active []corev1.ObjectReference `json:"active,omitempty"`

	// Information when was the last time the job was successfully scheduled.
	// +optional
	LastScheduleTime *metav1.Time `json:"lastScheduleTime,omitempty"`
}

//+kubebuilder:object:root=true
//+kubebuilder:subresource:status

// CronJob is the Schema for the cronjobs API
type CronJob struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   CronJobSpec   `json:"spec,omitempty"`
	Status CronJobStatus `json:"status,omitempty"`
}

//+kubebuilder:object:root=true

// CronJobList contains a list of CronJob
type CronJobList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []CronJob `json:"items"`
}

func init() {
	SchemeBuilder.Register(&CronJob{}, &CronJobList{})
}

Storage Versions

project/api/v1/cronjob_types.go
Apache License

Copyright 2023 The Kubernetes authors.

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

package v1
Imports
import (
	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// EDIT THIS FILE!  THIS IS SCAFFOLDING FOR YOU TO OWN!
// NOTE: json tags are required.  Any new fields you add must have json tags for the fields to be serialized.
old stuff
// CronJobSpec defines the desired state of CronJob
type CronJobSpec struct {
	// +kubebuilder:validation:MinLength=0

	// The schedule in Cron format, see https://en.wikipedia.org/wiki/Cron.
	Schedule string `json:"schedule"`

	// +kubebuilder:validation:Minimum=0

	// Optional deadline in seconds for starting the job if it misses scheduled
	// time for any reason.  Missed jobs executions will be counted as failed ones.
	// +optional
	StartingDeadlineSeconds *int64 `json:"startingDeadlineSeconds,omitempty"`

	// Specifies how to treat concurrent executions of a Job.
	// Valid values are:
	// - "Allow" (default): allows CronJobs to run concurrently;
	// - "Forbid": forbids concurrent runs, skipping next run if previous run hasn't finished yet;
	// - "Replace": cancels currently running job and replaces it with a new one
	// +optional
	ConcurrencyPolicy ConcurrencyPolicy `json:"concurrencyPolicy,omitempty"`

	// This flag tells the controller to suspend subsequent executions, it does
	// not apply to already started executions.  Defaults to false.
	// +optional
	Suspend *bool `json:"suspend,omitempty"`

	// Specifies the job that will be created when executing a CronJob.
	JobTemplate batchv1.JobTemplateSpec `json:"jobTemplate"`

	// +kubebuilder:validation:Minimum=0

	// The number of successful finished jobs to retain.
	// This is a pointer to distinguish between explicit zero and not specified.
	// +optional
	SuccessfulJobsHistoryLimit *int32 `json:"successfulJobsHistoryLimit,omitempty"`

	// +kubebuilder:validation:Minimum=0

	// The number of failed finished jobs to retain.
	// This is a pointer to distinguish between explicit zero and not specified.
	// +optional
	FailedJobsHistoryLimit *int32 `json:"failedJobsHistoryLimit,omitempty"`
}

// ConcurrencyPolicy describes how the job will be handled.
// Only one of the following concurrent policies may be specified.
// If none of the following policies is specified, the default one
// is AllowConcurrent.
// +kubebuilder:validation:Enum=Allow;Forbid;Replace
type ConcurrencyPolicy string

const (
	// AllowConcurrent allows CronJobs to run concurrently.
	AllowConcurrent ConcurrencyPolicy = "Allow"

	// ForbidConcurrent forbids concurrent runs, skipping next run if previous
	// hasn't finished yet.
	ForbidConcurrent ConcurrencyPolicy = "Forbid"

	// ReplaceConcurrent cancels currently running job and replaces it with a new one.
	ReplaceConcurrent ConcurrencyPolicy = "Replace"
)

// CronJobStatus defines the observed state of CronJob
type CronJobStatus struct {
	// INSERT ADDITIONAL STATUS FIELD - define observed state of cluster
	// Important: Run "make" to regenerate code after modifying this file

	// A list of pointers to currently running jobs.
	// +optional
	Active []corev1.ObjectReference `json:"active,omitempty"`

	// Information when was the last time the job was successfully scheduled.
	// +optional
	LastScheduleTime *metav1.Time `json:"lastScheduleTime,omitempty"`
}

Since we’ll have more than one version, we’ll need to mark a storage version. This is the version that the Kubernetes API server uses to store our data. We’ll choose the v1 version for our project.

We’ll use the +kubebuilder:storageversion to do this.

Note that multiple versions may exist in storage if they were written before the storage version changes – changing the storage version only affects how objects are created/updated after the change.

//+kubebuilder:object:root=true
//+kubebuilder:subresource:status
//+kubebuilder:storageversion

// CronJob is the Schema for the cronjobs API
type CronJob struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   CronJobSpec   `json:"spec,omitempty"`
	Status CronJobStatus `json:"status,omitempty"`
}
old stuff
//+kubebuilder:object:root=true

// CronJobList contains a list of CronJob
type CronJobList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []CronJob `json:"items"`
}

func init() {
	SchemeBuilder.Register(&CronJob{}, &CronJobList{})
}

Now that we’ve got our types in place, we’ll need to set up conversion…

Hubs, spokes, and other wheel metaphors

Since we now have two different versions, and users can request either version, we’ll have to define a way to convert between our version. For CRDs, this is done using a webhook, similar to the defaulting and validating webhooks we defined in the base tutorial. Like before, controller-runtime will help us wire together the nitty-gritty bits, we just have to implement the actual conversion.

Before we do that, though, we’ll need to understand how controller-runtime thinks about versions. Namely:

Complete graphs are insufficiently nautical

A simple approach to defining conversion might be to define conversion functions to convert between each of our versions. Then, whenever we need to convert, we’d look up the appropriate function, and call it to run the conversion.

This works fine when we just have two versions, but what if we had 4 types? 8 types? That’d be a lot of conversion functions.

Instead, controller-runtime models conversion in terms of a “hub and spoke” model – we mark one version as the “hub”, and all other versions just define conversion to and from the hub:

(Diagram: a fully connected graph of pairwise conversions becomes a hub-and-spoke graph.)

Then, if we have to convert between two non-hub versions, we first convert to the hub version, and then to our desired version:

This cuts down on the number of conversion functions that we have to define, and is modeled off of what Kubernetes does internally.

What does that have to do with Webhooks?

When API clients, like kubectl or your controller, request a particular version of your resource, the Kubernetes API server needs to return a result that’s of that version. However, that version might not match the version stored by the API server.

In that case, the API server needs to know how to convert between the desired version and the stored version. Since the conversions aren’t built in for CRDs, the Kubernetes API server calls out to a webhook to do the conversion instead. For Kubebuilder, this webhook is implemented by controller-runtime, and performs the hub-and-spoke conversions that we discussed above.

Now that we have the model for conversion down pat, we can actually implement our conversions.

Implementing conversion

With our model for conversion in place, it’s time to actually implement the conversion functions. We’ll put them in a file called cronjob_conversion.go next to our cronjob_types.go file, to avoid cluttering up our main types file with extra functions.

Hub…

First, we’ll implement the hub. We’ll choose the v1 version as the hub:

project/api/v1/cronjob_conversion.go
Apache License

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

package v1

Implementing the hub method is pretty easy – we just have to add an empty method called Hub() to serve as a marker. We could also just put this inline in our cronjob_types.go file.

// Hub marks this type as a conversion hub.
func (*CronJob) Hub() {}

… and Spokes

Then, we’ll implement our spoke, the v2 version:

project/api/v2/cronjob_conversion.go
Apache License

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

package v2
Imports

For imports, we’ll need the controller-runtime conversion package, plus the API version for our hub type (v1), and finally some of the standard packages.

import (
	"fmt"
	"strings"

	"sigs.k8s.io/controller-runtime/pkg/conversion"

	"tutorial.kubebuilder.io/project/api/v1"
)

Our “spoke” versions need to implement the Convertible interface. Namely, they’ll need ConvertTo and ConvertFrom methods to convert to/from the hub version.

ConvertTo is expected to modify its argument to contain the converted object. Most of the conversion is straightforward copying, except for converting our changed field.

// ConvertTo converts this CronJob to the Hub version (v1).
func (src *CronJob) ConvertTo(dstRaw conversion.Hub) error {
	dst := dstRaw.(*v1.CronJob)

	sched := src.Spec.Schedule
	scheduleParts := []string{"*", "*", "*", "*", "*"}
	if sched.Minute != nil {
		scheduleParts[0] = string(*sched.Minute)
	}
	if sched.Hour != nil {
		scheduleParts[1] = string(*sched.Hour)
	}
	if sched.DayOfMonth != nil {
		scheduleParts[2] = string(*sched.DayOfMonth)
	}
	if sched.Month != nil {
		scheduleParts[3] = string(*sched.Month)
	}
	if sched.DayOfWeek != nil {
		scheduleParts[4] = string(*sched.DayOfWeek)
	}
	dst.Spec.Schedule = strings.Join(scheduleParts, " ")
rote conversion

The rest of the conversion is pretty rote.

	// ObjectMeta
	dst.ObjectMeta = src.ObjectMeta

	// Spec
	dst.Spec.StartingDeadlineSeconds = src.Spec.StartingDeadlineSeconds
	dst.Spec.ConcurrencyPolicy = v1.ConcurrencyPolicy(src.Spec.ConcurrencyPolicy)
	dst.Spec.Suspend = src.Spec.Suspend
	dst.Spec.JobTemplate = src.Spec.JobTemplate
	dst.Spec.SuccessfulJobsHistoryLimit = src.Spec.SuccessfulJobsHistoryLimit
	dst.Spec.FailedJobsHistoryLimit = src.Spec.FailedJobsHistoryLimit

	// Status
	dst.Status.Active = src.Status.Active
	dst.Status.LastScheduleTime = src.Status.LastScheduleTime
	return nil
}

ConvertFrom is expected to modify its receiver to contain the converted object. Most of the conversion is straightforward copying, except for converting our changed field.

// ConvertFrom converts from the Hub version (v1) to this version.
func (dst *CronJob) ConvertFrom(srcRaw conversion.Hub) error {
	src := srcRaw.(*v1.CronJob)

	schedParts := strings.Split(src.Spec.Schedule, " ")
	if len(schedParts) != 5 {
		return fmt.Errorf("invalid schedule: not a standard 5-field schedule")
	}
	partIfNeeded := func(raw string) *CronField {
		if raw == "*" {
			return nil
		}
		part := CronField(raw)
		return &part
	}
	dst.Spec.Schedule.Minute = partIfNeeded(schedParts[0])
	dst.Spec.Schedule.Hour = partIfNeeded(schedParts[1])
	dst.Spec.Schedule.DayOfMonth = partIfNeeded(schedParts[2])
	dst.Spec.Schedule.Month = partIfNeeded(schedParts[3])
	dst.Spec.Schedule.DayOfWeek = partIfNeeded(schedParts[4])
rote conversion

The rest of the conversion is pretty rote.

	// ObjectMeta
	dst.ObjectMeta = src.ObjectMeta

	// Spec
	dst.Spec.StartingDeadlineSeconds = src.Spec.StartingDeadlineSeconds
	dst.Spec.ConcurrencyPolicy = ConcurrencyPolicy(src.Spec.ConcurrencyPolicy)
	dst.Spec.Suspend = src.Spec.Suspend
	dst.Spec.JobTemplate = src.Spec.JobTemplate
	dst.Spec.SuccessfulJobsHistoryLimit = src.Spec.SuccessfulJobsHistoryLimit
	dst.Spec.FailedJobsHistoryLimit = src.Spec.FailedJobsHistoryLimit

	// Status
	dst.Status.Active = src.Status.Active
	dst.Status.LastScheduleTime = src.Status.LastScheduleTime
	return nil
}
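
To make the earlier round-trip requirement concrete, here is a small sanity check (not part of the scaffold) that exercises the two methods above from within package v2; it relies only on the types and the fmt and v1 imports already shown in this file:

// roundTripCheck verifies that a v2 schedule survives conversion to the v1 hub
// and back. It is illustrative only and not part of the generated project.
func roundTripCheck() error {
	minute := CronField("*/5")
	src := &CronJob{Spec: CronJobSpec{Schedule: CronSchedule{Minute: &minute}}}

	var hub v1.CronJob
	if err := src.ConvertTo(&hub); err != nil {
		return err
	}
	// hub.Spec.Schedule is now the flat string "*/5 * * * *".

	var dst CronJob
	if err := dst.ConvertFrom(&hub); err != nil {
		return err
	}
	if dst.Spec.Schedule.Minute == nil || *dst.Spec.Schedule.Minute != minute {
		return fmt.Errorf("round trip lost the minute field")
	}
	return nil
}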

Now that we’ve got our conversions in place, all that we need to do is wire up our main to serve the webhook!

Setting up the webhooks

Our conversion is in place, so all that’s left is to tell controller-runtime about our conversion.

Normally, we’d run

kubebuilder create webhook --group batch --version v1 --kind CronJob --conversion

to scaffold out the webhook setup. However, we’ve already got webhook setup, from when we built our defaulting and validating webhooks!

Webhook setup…

project/api/v1/cronjob_webhook.go
Apache License

Copyright 2023 The Kubernetes authors.

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Go imports
package v1

import (
	"github.com/robfig/cron"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/runtime/schema"
	validationutils "k8s.io/apimachinery/pkg/util/validation"
	"k8s.io/apimachinery/pkg/util/validation/field"
	ctrl "sigs.k8s.io/controller-runtime"
	logf "sigs.k8s.io/controller-runtime/pkg/log"
	"sigs.k8s.io/controller-runtime/pkg/webhook"
)
var cronjoblog = logf.Log.WithName("cronjob-resource")

This setup doubles as setup for our conversion webhooks: as long as our types implement the Hub and Convertible interfaces, a conversion webhook will be registered.

func (r *CronJob) SetupWebhookWithManager(mgr ctrl.Manager) error {
	return ctrl.NewWebhookManagedBy(mgr).
		For(r).
		Complete()
}
Existing Defaulting and Validation
// +kubebuilder:webhook:path=/mutate-batch-tutorial-kubebuilder-io-v1-cronjob,mutating=true,failurePolicy=fail,groups=batch.tutorial.kubebuilder.io,resources=cronjobs,verbs=create;update,versions=v1,name=mcronjob.kb.io,sideEffects=None,admissionReviewVersions=v1

var _ webhook.Defaulter = &CronJob{}

// Default implements webhook.Defaulter so a webhook will be registered for the type
func (r *CronJob) Default() {
	cronjoblog.Info("default", "name", r.Name)

	if r.Spec.ConcurrencyPolicy == "" {
		r.Spec.ConcurrencyPolicy = AllowConcurrent
	}
	if r.Spec.Suspend == nil {
		r.Spec.Suspend = new(bool)
	}
	if r.Spec.SuccessfulJobsHistoryLimit == nil {
		r.Spec.SuccessfulJobsHistoryLimit = new(int32)
		*r.Spec.SuccessfulJobsHistoryLimit = 3
	}
	if r.Spec.FailedJobsHistoryLimit == nil {
		r.Spec.FailedJobsHistoryLimit = new(int32)
		*r.Spec.FailedJobsHistoryLimit = 1
	}
}

// +kubebuilder:webhook:verbs=create;update;delete,path=/validate-batch-tutorial-kubebuilder-io-v1-cronjob,mutating=false,failurePolicy=fail,groups=batch.tutorial.kubebuilder.io,resources=cronjobs,versions=v1,name=vcronjob.kb.io,sideEffects=None,admissionReviewVersions=v1

var _ webhook.Validator = &CronJob{}

// ValidateCreate implements webhook.Validator so a webhook will be registered for the type
func (r *CronJob) ValidateCreate() error {
	cronjoblog.Info("validate create", "name", r.Name)

	return r.validateCronJob()
}

// ValidateUpdate implements webhook.Validator so a webhook will be registered for the type
func (r *CronJob) ValidateUpdate(old runtime.Object) error {
	cronjoblog.Info("validate update", "name", r.Name)

	return r.validateCronJob()
}

// ValidateDelete implements webhook.Validator so a webhook will be registered for the type
func (r *CronJob) ValidateDelete() error {
	cronjoblog.Info("validate delete", "name", r.Name)

	// TODO(user): fill in your validation logic upon object deletion.
	return nil
}

func (r *CronJob) validateCronJob() error {
	var allErrs field.ErrorList
	if err := r.validateCronJobName(); err != nil {
		allErrs = append(allErrs, err)
	}
	if err := r.validateCronJobSpec(); err != nil {
		allErrs = append(allErrs, err)
	}
	if len(allErrs) == 0 {
		return nil
	}

	return apierrors.NewInvalid(
		schema.GroupKind{Group: "batch.tutorial.kubebuilder.io", Kind: "CronJob"},
		r.Name, allErrs)
}

func (r *CronJob) validateCronJobSpec() *field.Error {
	// The field helpers from the kubernetes API machinery help us return nicely
	// structured validation errors.
	return validateScheduleFormat(
		r.Spec.Schedule,
		field.NewPath("spec").Child("schedule"))
}

func validateScheduleFormat(schedule string, fldPath *field.Path) *field.Error {
	if _, err := cron.ParseStandard(schedule); err != nil {
		return field.Invalid(fldPath, schedule, err.Error())
	}
	return nil
}

func (r *CronJob) validateCronJobName() *field.Error {
	if len(r.ObjectMeta.Name) > validationutils.DNS1035LabelMaxLength-11 {
		// The job name length is 63 character like all Kubernetes objects
		// (which must fit in a DNS subdomain). The cronjob controller appends
		// a 11-character suffix to the cronjob (`-$TIMESTAMP`) when creating
		// a job. The job name length limit is 63 characters. Therefore cronjob
		// names must have length <= 63-11=52. If we don't validate this here,
		// then job creation will fail later.
		return field.Invalid(field.NewPath("metadata").Child("name"), r.Name, "must be no more than 52 characters")
	}
	return nil
}

…and main.go

Similarly, our existing main file is sufficient:

project/cmd/main.go
Apache License

Copyright 2023 The Kubernetes authors.

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Imports
package main

import (
	"flag"
	"os"

	// Import all Kubernetes client auth plugins (e.g. Azure, GCP, OIDC, etc.)
	// to ensure that exec-entrypoint and run can make use of them.
	_ "k8s.io/client-go/plugin/pkg/client/auth"

	kbatchv1 "k8s.io/api/batch/v1"
	"k8s.io/apimachinery/pkg/runtime"
	utilruntime "k8s.io/apimachinery/pkg/util/runtime"
	clientgoscheme "k8s.io/client-go/kubernetes/scheme"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/healthz"
	"sigs.k8s.io/controller-runtime/pkg/log/zap"

	batchv1 "tutorial.kubebuilder.io/project/api/v1"
	batchv2 "tutorial.kubebuilder.io/project/api/v2"
	"tutorial.kubebuilder.io/project/internal/controller"
	//+kubebuilder:scaffold:imports
)
existing setup
var (
	scheme   = runtime.NewScheme()
	setupLog = ctrl.Log.WithName("setup")
)

func init() {
	utilruntime.Must(clientgoscheme.AddToScheme(scheme))

	utilruntime.Must(kbatchv1.AddToScheme(scheme)) // we've added this ourselves
	utilruntime.Must(batchv1.AddToScheme(scheme))
	utilruntime.Must(batchv2.AddToScheme(scheme))
	//+kubebuilder:scaffold:scheme
}
func main() {
existing setup
	var metricsAddr string
	var enableLeaderElection bool
	var probeAddr string
	flag.StringVar(&metricsAddr, "metrics-bind-address", ":8080", "The address the metric endpoint binds to.")
	flag.StringVar(&probeAddr, "health-probe-bind-address", ":8081", "The address the probe endpoint binds to.")
	flag.BoolVar(&enableLeaderElection, "leader-elect", false,
		"Enable leader election for controller manager. "+
			"Enabling this will ensure there is only one active controller manager.")
	opts := zap.Options{
		Development: true,
	}
	opts.BindFlags(flag.CommandLine)
	flag.Parse()

	ctrl.SetLogger(zap.New(zap.UseFlagOptions(&opts)))

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		Scheme:                 scheme,
		MetricsBindAddress:     metricsAddr,
		Port:                   9443,
		HealthProbeBindAddress: probeAddr,
		LeaderElection:         enableLeaderElection,
		LeaderElectionID:       "80807133.tutorial.kubebuilder.io",
		// LeaderElectionReleaseOnCancel defines if the leader should step down voluntarily
		// when the Manager ends. This requires the binary to immediately end when the
		// Manager is stopped, otherwise this setting is unsafe. Setting this significantly
		// speeds up voluntary leader transitions as the new leader doesn't have to wait
		// the LeaseDuration time first.
		//
		// In the default scaffold provided, the program ends immediately after
		// the manager stops, so it would be fine to enable this option. However,
		// if you are doing, or intend to do, any operation such as performing cleanups
		// after the manager stops, then its usage might be unsafe.
		// LeaderElectionReleaseOnCancel: true,
	})
	if err != nil {
		setupLog.Error(err, "unable to start manager")
		os.Exit(1)
	}

	if err = (&controller.CronJobReconciler{
		Client: mgr.GetClient(),
		Scheme: mgr.GetScheme(),
	}).SetupWithManager(mgr); err != nil {
		setupLog.Error(err, "unable to create controller", "controller", "CronJob")
		os.Exit(1)
	}

Our existing call to SetupWebhookWithManager registers our conversion webhooks with the manager, too.

	if os.Getenv("ENABLE_WEBHOOKS") != "false" {
		if err = (&batchv1.CronJob{}).SetupWebhookWithManager(mgr); err != nil {
			setupLog.Error(err, "unable to create webhook", "webhook", "CronJob")
			os.Exit(1)
		}
		if err = (&batchv2.CronJob{}).SetupWebhookWithManager(mgr); err != nil {
			setupLog.Error(err, "unable to create webhook", "webhook", "CronJob")
			os.Exit(1)
		}
	}
	//+kubebuilder:scaffold:builder
existing setup
	if err := mgr.AddHealthzCheck("healthz", healthz.Ping); err != nil {
		setupLog.Error(err, "unable to set up health check")
		os.Exit(1)
	}
	if err := mgr.AddReadyzCheck("readyz", healthz.Ping); err != nil {
		setupLog.Error(err, "unable to set up ready check")
		os.Exit(1)
	}

	setupLog.Info("starting manager")
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		setupLog.Error(err, "problem running manager")
		os.Exit(1)
	}
}

Everything’s set up and ready to go! All that’s left now is to test out our webhooks.

Deployment and Testing

Before we can test out our conversion, we’ll need to enable it in our CRD:

Kubebuilder generates Kubernetes manifests under the config directory with webhook bits disabled. To enable them, we need to:

  • Enable patches/webhook_in_<kind>.yaml and patches/cainjection_in_<kind>.yaml in config/crd/kustomization.yaml file.

  • Enable ../certmanager and ../webhook directories under the bases section in config/default/kustomization.yaml file.

  • Enable manager_webhook_patch.yaml and webhookcainjection_patch.yaml under the patches section in config/default/kustomization.yaml file.

  • Enable all the vars under the CERTMANAGER section in config/default/kustomization.yaml file.

Additionally, if present in our Makefile, we’ll need to set the CRD_OPTIONS variable to just "crd", removing the trivialVersions option (this ensures that we actually generate validation for each version, instead of telling Kubernetes that they’re the same):

CRD_OPTIONS ?= "crd"

Now we have all our code changes and manifests in place, so let’s deploy it to the cluster and test it out.

You’ll need cert-manager installed (version 0.9.0+) unless you’ve got some other certificate management solution. The Kubebuilder team has tested the instructions in this tutorial with the 0.9.0-alpha.0 release.

Once all our ducks are in a row with certificates, we can run make install deploy (as normal) to deploy all the bits (CRD, controller-manager deployment) onto the cluster.

Testing

Once all of the bits are up and running on the cluster with conversion enabled, we can test out our conversion by requesting different versions.

We’ll make a v2 version based on our v1 version (put it under config/samples)

apiVersion: batch.tutorial.kubebuilder.io/v2
kind: CronJob
metadata:
  labels:
    app.kubernetes.io/name: cronjob
    app.kubernetes.io/instance: cronjob-sample
    app.kubernetes.io/part-of: project
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/created-by: project
  name: cronjob-sample
spec:
  schedule:
    minute: "*/1"
  startingDeadlineSeconds: 60
  concurrencyPolicy: Allow # explicitly specify, but Allow is also default.
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure

Then, we can create it on the cluster:

kubectl apply -f config/samples/batch_v2_cronjob.yaml

If we’ve done everything correctly, it should create successfully, and we should be able to fetch it using both the v2 resource

kubectl get cronjobs.v2.batch.tutorial.kubebuilder.io -o yaml
apiVersion: batch.tutorial.kubebuilder.io/v2
kind: CronJob
metadata:
  labels:
    app.kubernetes.io/name: cronjob
    app.kubernetes.io/instance: cronjob-sample
    app.kubernetes.io/part-of: project
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/created-by: project
  name: cronjob-sample
spec:
  schedule:
    minute: "*/1"
  startingDeadlineSeconds: 60
  concurrencyPolicy: Allow # explicitly specify, but Allow is also default.
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure

and the v1 resource

kubectl get cronjobs.v1.batch.tutorial.kubebuilder.io -o yaml
apiVersion: batch.tutorial.kubebuilder.io/v1
kind: CronJob
metadata:
  labels:
    app.kubernetes.io/name: cronjob
    app.kubernetes.io/instance: cronjob-sample
    app.kubernetes.io/part-of: project
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/created-by: project
  name: cronjob-sample
spec:
  schedule: "*/1 * * * *"
  startingDeadlineSeconds: 60
  concurrencyPolicy: Allow # explicitly specify, but Allow is also default.
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure

Both should be filled out, and look equivalent to our v2 and v1 samples, respectively. Notice that each has a different API version.

Finally, if we wait a bit, we should notice that our CronJob continues to reconcile, even though our controller is written against our v1 API version.

Troubleshooting

steps for troubleshooting

Tutorial: ComponentConfig

Nearly every project that is built for Kubernetes will eventually need to support passing additional configuration into the controller. This could be to enable better logging, turn specific feature gates on or off, set the sync period, or a myriad of other controls. Previously this was commonly done using CLI flags that your main.go would parse to make them accessible within your program. While this works, it’s not a forward-looking design, and the Kubernetes community has been migrating the core components away from flags and toward versioned config files, referred to as “component configs”.

The rest of this tutorial will show you how to configure your kubebuilder project with the component config type, and then move on to implementing a custom type so that you can extend this capability.

Resources

Changing things up

This tutorial will show you how to create a custom configuration file for your project by modifying a project generated with the --component-config flag passed to the init command. The full tutorial’s source can be found here. Make sure you’ve gone through the installation steps before continuing.

New project:

# we'll use a domain of tutorial.kubebuilder.io,
# so all API groups will be <group>.tutorial.kubebuilder.io.
kubebuilder init --domain tutorial.kubebuilder.io --component-config

Setting up an existing project

If you’ve previously generated a project we can add support for parsing the config file by making the following changes to main.go.

First, add a new flag to specify the path that the component config file should be loaded from.

var configFile string
flag.StringVar(&configFile, "config", "",
    "The controller will load its initial configuration from this file. "+
        "Omit this flag to use the default configuration values. "+
            "Command-line flags override configuration from this file.")

Now, we can set up the Options struct and check whether configFile is set (this preserves backwards compatibility). If it is set, we use the AndFrom function on Options to parse the config file and populate the Options from it.

var err error
options := ctrl.Options{Scheme: scheme}
if configFile != "" {
    options, err = options.AndFrom(ctrl.ConfigFile().AtPath(configFile))
    if err != nil {
        setupLog.Error(err, "unable to load the config file")
        os.Exit(1)
    }
}

Lastly, we’ll change the NewManager call to use the options variable we defined above.

mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), options)

With that out of the way, we can get on to defining our new config!

Create the file /config/manager/controller_manager_config.yaml with the following content:

apiVersion: controller-runtime.sigs.k8s.io/v1alpha1
kind: ControllerManagerConfig
health:
  healthProbeBindAddress: :8081
metrics:
  bindAddress: 127.0.0.1:8080
webhook:
  port: 9443
leaderElection:
  leaderElect: true
  resourceName: ecaf1259.tutorial.kubebuilder.io
# leaderElectionReleaseOnCancel defines if the leader should step down voluntarily
# when the Manager ends. This requires the binary to immediately end when the
# Manager is stopped, otherwise this setting is unsafe. Setting this significantly
# speeds up voluntary leader transitions as the new leader doesn't have to wait
# the LeaseDuration time first.
# In the default scaffold provided, the program ends immediately after
# the manager stops, so it would be fine to enable this option. However,
# if you are doing, or intend to do, any operation such as performing cleanups
# after the manager stops, then its usage might be unsafe.
# leaderElectionReleaseOnCancel: true

Update the file /config/manager/kustomization.yaml by adding at the bottom the following content:

generatorOptions:
  disableNameSuffixHash: true

configMapGenerator:
- name: manager-config
  files:
  - controller_manager_config.yaml

Update the file default/kustomization.yaml by adding under the patchesStrategicMerge: key the following patch:

patchesStrategicMerge:
# Mount the controller config file for loading manager configurations
# through a ComponentConfig type
- manager_config_patch.yaml

Update the file default/manager_config_patch.yaml by adding under the spec: key the following patch:

spec:
  template:
    spec:
      containers:
      - name: manager
        args:
        - "--config=controller_manager_config.yaml"
        volumeMounts:
        - name: manager-config
          mountPath: /controller_manager_config.yaml
          subPath: controller_manager_config.yaml
      volumes:
      - name: manager-config
        configMap:
          name: manager-config

Defining your Config

Now that you have a component-config-based project, we need to customize the values that are passed into the controller. To do this, we can take a look at config/manager/controller_manager_config.yaml.

controller_manager_config.yaml
apiVersion: controller-runtime.sigs.k8s.io/v1alpha1
kind: ControllerManagerConfig
metrics:
  bindAddress: 127.0.0.1:8080
webhook:
  port: 9443
leaderElection:
  leaderElect: true
  resourceName: 80807133.tutorial.kubebuilder.io

To see all the available fields, you can look at the v1alpha1 controller-runtime config type, ControllerManagerConfiguration.

Using a Custom Type

If your project needs to accept additional configurations that aren’t controller-runtime specific, e.g. ClusterName, Region, or anything else serializable into yaml, you can do this by using kubebuilder to create a new type and then updating your main.go to set the new type up for parsing.

The rest of this tutorial will walk through implementing a custom component config type.

Adding a new Config Type

To scaffold out a new config Kind, we can use kubebuilder create api.

kubebuilder create api --group config --version v2 --kind ProjectConfig --resource --controller=false --make=false

Then, run make build, which will generate the file zz_generated.deepcopy.go and thereby implement the runtime.Object interface for your API type.

This will create a new type file in api/config/v2/ for the ProjectConfig kind. We’ll need to change this file to embed the v1alpha1.ControllerManagerConfigurationSpec.

projectconfig_types.go
Apache License

Copyright 2020 The Kubernetes authors.

Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

We start out simply enough: we import the config/v1alpha1 API group, which is exposed through ControllerRuntime.

package v2

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	cfg "sigs.k8s.io/controller-runtime/pkg/config/v1alpha1"
)

// +kubebuilder:object:root=true

Next, we’ll remove the default ProjectConfigSpec and ProjectConfigList then we’ll embed cfg.ControllerManagerConfigurationSpec in ProjectConfig.

// ProjectConfig is the Schema for the projectconfigs API
type ProjectConfig struct {
	metav1.TypeMeta `json:",inline"`

	// ControllerManagerConfigurationSpec returns the configurations for controllers
	cfg.ControllerManagerConfigurationSpec `json:",inline"`

	ClusterName string `json:"clusterName,omitempty"`
}

If you haven’t, you’ll also need to remove the ProjectConfigList from the SchemeBuilder.Register.

func init() {
	SchemeBuilder.Register(&ProjectConfig{})
}

Lastly, we’ll change the main.go to reference this type for parsing the file.

Updating main

Once you have defined your new custom component config type, we need to make sure the new config type is imported and its types are registered with the scheme. If you used kubebuilder create api, this should have been automated.

import (
    // ... other imports
    configv2 "tutorial.kubebuilder.io/project/apis/config/v2"
    // +kubebuilder:scaffold:imports
)

With the package imported we can confirm the types have been added.

func init() {
	// ... other scheme registrations
	utilruntime.Must(configv2.AddToScheme(scheme))
	// +kubebuilder:scaffold:scheme
}

Lastly, we need to change the options parsing in main.go to use this new type. To do this we’ll chain OfKind onto ctrl.ConfigFile() and pass in a pointer to the config kind.

var err error
ctrlConfig := configv2.ProjectConfig{}
options := ctrl.Options{Scheme: scheme}
if configFile != "" {
    options, err = options.AndFrom(ctrl.ConfigFile().AtPath(configFile).OfKind(&ctrlConfig))
    if err != nil {
        setupLog.Error(err, "unable to load the config file")
        os.Exit(1)
    }
}

Now, if you need to use the .clusterName field we defined in our custom kind, you can read ctrlConfig.ClusterName, which will be populated from the supplied config file.
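
One way to plumb that value into your controllers is to pass it along when wiring up the reconciler in main.go. This is only a sketch: the ClusterName field on CronJobReconciler below is a hypothetical addition of our own, not something the scaffold generates.

	// Hypothetical: CronJobReconciler has been extended with a ClusterName string field.
	if err = (&controller.CronJobReconciler{
		Client:      mgr.GetClient(),
		Scheme:      mgr.GetScheme(),
		ClusterName: ctrlConfig.ClusterName, // populated from the supplied config file
	}).SetupWithManager(mgr); err != nil {
		setupLog.Error(err, "unable to create controller", "controller", "CronJob")
		os.Exit(1)
	}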

Defining your Custom Config

Now that you have a custom component config, change config/manager/controller_manager_config.yaml to use the new GVK you defined.

project/config/manager/controller_manager_config.yaml
apiVersion: config.tutorial.kubebuilder.io/v2
kind: ProjectConfig
metadata:
  labels:
    app.kubernetes.io/name: controllermanagerconfig
    app.kubernetes.io/instance: controller-manager-configuration
    app.kubernetes.io/component: manager
    app.kubernetes.io/created-by: project
    app.kubernetes.io/part-of: project
    app.kubernetes.io/managed-by: kustomize
health:
  healthProbeBindAddress: :8081
metrics:
  bindAddress: 127.0.0.1:8080
webhook:
  port: 9443
leaderElection:
  leaderElect: true
  resourceName: 80807133.tutorial.kubebuilder.io
clusterName: example-test

This file now uses the new ProjectConfig kind under the GVK config.tutorial.kubebuilder.io/v2. With these custom configs, we can add any yaml-serializable fields that your controller needs and begin to reduce the reliance on flags to configure your project.
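
As a sketch of how such an extension might look (the Region field below is a hypothetical example, not part of the tutorial), any extra yaml-serializable field can be added to the kind and then set in the config file alongside clusterName:

// ProjectConfig is the Schema for the projectconfigs API
type ProjectConfig struct {
	metav1.TypeMeta `json:",inline"`

	// ControllerManagerConfigurationSpec returns the configurations for controllers
	cfg.ControllerManagerConfigurationSpec `json:",inline"`

	ClusterName string `json:"clusterName,omitempty"`

	// Region is a hypothetical extra setting; any yaml-serializable field works.
	Region string `json:"region,omitempty"`
}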

Migrations

Migrating between project structures in Kubebuilder generally involves a bit of manual work.

This section details what’s required to migrate, between different versions of Kubebuilder scaffolding, as well as to more complex project layout structures.

Migration guides from Legacy versions < 3.0.0

Follow the migration guides from the legacy Kubebuilder versions up to the latest required v3.x version. Note that from v3, a new plugin-based ecosystem is introduced for better maintainability, reusability, and user experience.

For more info, see the design docs of:

Also, you can check the Plugins section.

Kubebuilder v1 vs v2 (Legacy v1.0.0+ to v2.0.0 Kubebuilder CLI versions)

This document covers all breaking changes when migrating from v1 to v2.

The details of all changes (breaking or otherwise) can be found in controller-runtime, controller-tools and kubebuilder release notes.

Common changes

V2 projects use Go modules, but kubebuilder will continue to support dep until Go 1.13 is out.

controller-runtime

  • Client.List now uses functional options (List(ctx, list, ...option)) instead of List(ctx, ListOptions, list); see the sketch after this list.

  • Client.DeleteAllOf was added to the Client interface.

  • Metrics are on by default now.

  • A number of packages under pkg/runtime have been moved, with their old locations deprecated. The old locations will be removed before controller-runtime v1.0.0. See the godocs for more information.

  • Automatic certificate generation for webhooks has been removed, and webhooks will no longer self-register. Use controller-tools to generate a webhook configuration. If you need certificate generation, we recommend using cert-manager. Kubebuilder v2 will scaffold out cert manager configs for you to use – see the Webhook Tutorial for more details.

  • The builder package now has separate builders for controllers and webhooks, which facilitates choosing which to run.
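
To illustrate the Client.List change in the first item above, here is a minimal sketch of the new functional-options style, assuming c is a client.Client (from sigs.k8s.io/controller-runtime/pkg/client), ctx is a context.Context, and corev1 is k8s.io/api/core/v1; the namespace and label selector are just examples:

	var pods corev1.PodList
	if err := c.List(ctx, &pods,
		client.InNamespace("default"),
		client.MatchingLabels{"app": "sample"},
	); err != nil {
		// handle the error
	}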

controller-tools

The generator framework has been rewritten in v2. It still works the same as before in many cases, but be aware that there are some breaking changes. Please check marker documentation for more details.

Kubebuilder

  • Kubebuilder v2 introduces a simplified project layout. You can find the design doc here.

  • In v1, the manager is deployed as a StatefulSet, while it’s deployed as a Deployment in v2.

  • The kubebuilder create webhook command was added to scaffold mutating/validating/conversion webhooks. It replaces the kubebuilder alpha webhook command.

  • v2 uses distroless/static instead of Ubuntu as base image. This reduces image size and attack surface.

  • v2 requires kustomize v3.1.0+.

Migration from v1 to v2

Make sure you understand the differences between Kubebuilder v1 and v2 before continuing

Please ensure you have followed the installation guide to install the required components.

The recommended way to migrate a v1 project is to create a new v2 project and copy over the API and the reconciliation code. The conversion will end up with a project that looks like a native v2 project. However, in some cases, it’s possible to do an in-place upgrade (i.e. reuse the v1 project layout, upgrading controller-runtime and controller-tools).

Let’s take a v1 project as an example and migrate it to Kubebuilder v2. At the end, we should have something that looks like the example v2 project.

Preparation

We’ll need to figure out what the group, version, kind and domain are.

Let’s take a look at our current v1 project structure:

pkg/
├── apis
│   ├── addtoscheme_batch_v1.go
│   ├── apis.go
│   └── batch
│       ├── group.go
│       └── v1
│           ├── cronjob_types.go
│           ├── cronjob_types_test.go
│           ├── doc.go
│           ├── register.go
│           ├── v1_suite_test.go
│           └── zz_generated.deepcopy.go
├── controller
└── webhook

All of our API information is stored in pkg/apis/batch, so we can look there to find what we need to know.

In cronjob_types.go, we can find

type CronJob struct {...}

In register.go, we can find

SchemeGroupVersion = schema.GroupVersion{Group: "batch.tutorial.kubebuilder.io", Version: "v1"}

Putting that together, we get CronJob as the kind, and batch.tutorial.kubebuilder.io/v1 as the group-version.

Initialize a v2 Project

Now, we need to initialize a v2 project. Before we do that, though, we’ll need to initialize a new go module if we’re not on the gopath:

go mod init tutorial.kubebuilder.io/project

Then, we can finish initializing the project with kubebuilder:

kubebuilder init --domain tutorial.kubebuilder.io

Migrate APIs and Controllers

Next, we’ll re-scaffold out the API types and controllers. Since we want both, we’ll say yes to both the API and controller prompts when asked what parts we want to scaffold:

kubebuilder create api --group batch --version v1 --kind CronJob

If you’re using multiple groups, some manual work is required to migrate. Please follow this for more details.

Migrate the APIs

Now, let’s copy the API definition from pkg/apis/batch/v1/cronjob_types.go to api/v1/cronjob_types.go. We only need to copy the implementation of the Spec and Status fields.

We can replace the +k8s:deepcopy-gen:interfaces=... marker (which is deprecated in kubebuilder) with +kubebuilder:object:root=true.

We don’t need the following markers any more (they’re not used anymore, and are relics from much older versions of Kubebuilder):

// +genclient
// +k8s:openapi-gen=true

Our API types should look like the following:

// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// CronJob is the Schema for the cronjobs API
type CronJob struct {...}

// +kubebuilder:object:root=true

// CronJobList contains a list of CronJob
type CronJobList struct {...}

Migrate the Controllers

Now, let’s migrate the controller reconciler code from pkg/controller/cronjob/cronjob_controller.go to controllers/cronjob_controller.go.

We’ll need to copy

  • the fields from the ReconcileCronJob struct to CronJobReconciler
  • the contents of the Reconcile function
  • the rbac related markers to the new file.
  • the code under func add(mgr manager.Manager, r reconcile.Reconciler) error to func SetupWithManager (see the sketch after this list)
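
For reference, a minimal sketch of what the migrated SetupWithManager might look like, assuming the old add() function only watched the CronJob type (add Owns(...) or Watches(...) calls to mirror any other watches it registered):

func (r *CronJobReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&batchv1.CronJob{}).
		Complete(r)
}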

Migrate the Webhooks

If you don’t have a webhook, you can skip this section.

Webhooks for Core Types and External CRDs

If you are using webhooks for Kubernetes core types (e.g. Pods), or for an external CRD that is not owned by you, you can refer to the controller-runtime example for builtin types and do something similar. Kubebuilder doesn’t scaffold much for these cases, but you can use the library in controller-runtime.

Scaffold Webhooks for our CRDs

Now let’s scaffold the webhooks for our CRD (CronJob). We’ll need to run the following command with the --defaulting and --programmatic-validation flags (since our test project uses defaulting and validating webhooks):

kubebuilder create webhook --group batch --version v1 --kind CronJob --defaulting --programmatic-validation

Depending on how many CRDs need webhooks, we may need to run the above command multiple times with different Group-Version-Kinds.

Now, we’ll need to copy the logic for each webhook. For validating webhooks, we can copy the contents from func validatingCronJobFn in pkg/default_server/cronjob/validating/cronjob_create_handler.go to func ValidateCreate in api/v1/cronjob_webhook.go and then the same for update.

Similarly, we’ll copy from func mutatingCronJobFn to func Default.
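
For orientation, the destination methods in api/v1/cronjob_webhook.go have the following shapes in v2 (the bodies are where the copied logic goes); this mirrors the Default and ValidateCreate methods shown earlier in the book:

func (r *CronJob) Default() {
	// logic copied from mutatingCronJobFn goes here
}

func (r *CronJob) ValidateCreate() error {
	// logic copied from validatingCronJobFn goes here
	return nil
}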

Webhook Markers

When scaffolding webhooks, Kubebuilder v2 adds the following markers:

// These are v2 markers

// This is for the mutating webhook
// +kubebuilder:webhook:path=/mutate-batch-tutorial-kubebuilder-io-v1-cronjob,mutating=true,failurePolicy=fail,groups=batch.tutorial.kubebuilder.io,resources=cronjobs,verbs=create;update,versions=v1,name=mcronjob.kb.io

...

// This is for the validating webhook
// +kubebuilder:webhook:path=/validate-batch-tutorial-kubebuilder-io-v1-cronjob,mutating=false,failurePolicy=fail,groups=batch.tutorial.kubebuilder.io,resources=cronjobs,verbs=create;update,versions=v1,name=vcronjob.kb.io

The default verbs are verbs=create;update. We need to ensure verbs matches what we need. For example, if we only want to validate creation, then we would change it to verbs=create.

We also need to ensure failure-policy is still the same.

Markers like the following are no longer needed (since they deal with self-deploying certificate configuration, which was removed in v2):

// v1 markers
// +kubebuilder:webhook:port=9876,cert-dir=/tmp/cert
// +kubebuilder:webhook:service=test-system:webhook-service,selector=app:webhook-server
// +kubebuilder:webhook:secret=test-system:webhook-server-secret
// +kubebuilder:webhook:mutating-webhook-config-name=test-mutating-webhook-cfg
// +kubebuilder:webhook:validating-webhook-config-name=test-validating-webhook-cfg

In v1, a single webhook marker may be split into multiple ones in the same paragraph. In v2, each webhook must be represented by a single marker.

Others

If there are any manual updates in main.go in v1, we need to port the changes to the new main.go. We’ll also need to ensure all of the needed schemes have been registered.

If there are additional manifests added under config directory, port them as well.

Change the image name in the Makefile if needed.

Verification

Finally, we can run make and make docker-build to ensure things are working fine.

Kubebuilder v2 vs v3 (Legacy Kubebuilder v2.0.0+ layout to 3.0.0+)

This document covers all breaking changes when migrating from v2 to v3.

The details of all changes (breaking or otherwise) can be found in controller-runtime, controller-tools and kb-releases release notes.

Common changes

v3 projects use Go modules and request Go 1.18+. Dep is no longer supported for dependency management.

Kubebuilder

  • Preliminary support for plugins was added. For more info see the Extensible CLI and Scaffolding Plugins: phase 1, the Extensible CLI and Scaffolding Plugins: phase 1.5 and the Extensible CLI and Scaffolding Plugins - Phase 2 design docs. Also, you can check the Plugins section.

  • The PROJECT file now has a new layout. It stores more information about what resources are in use, to better enable plugins to make useful decisions when scaffolding.

    Furthermore, the PROJECT file itself is now versioned: the version field corresponds to the version of the PROJECT file itself, while the layout field indicates the scaffolding & primary plugin version in use.

  • The version of the image gcr.io/kubebuilder/kube-rbac-proxy, which is an optional component enabled by default to secure the request made against the manager, was updated from 0.5.0 to 0.11.0 to address security concerns. The details of all changes can be found in kube-rbac-proxy.

TL;DR of the New go/v3 Plugin

More details on this can be found here, but the highlights are summarized below.

  • Scaffolded/Generated API version changes:

    • Use apiextensions/v1 for generated CRDs (apiextensions/v1beta1 was deprecated in Kubernetes 1.16)
    • Use admissionregistration.k8s.io/v1 for generated webhooks (admissionregistration.k8s.io/v1beta1 was deprecated in Kubernetes 1.16)
    • Use cert-manager.io/v1 for the certificate manager when webhooks are used (cert-manager.io/v1alpha2 was deprecated in Cert-Manager 0.14. More info: CertManager v1.0 docs)
  • Code changes:

    • The manager flags --metrics-addr and enable-leader-election now are named --metrics-bind-address and --leader-elect to be more aligned with core Kubernetes Components. More info: #1839
    • Liveness and Readiness probes are now added by default using healthz.Ping.
    • A new option to create the projects using ComponentConfig is introduced. For more info see its enhancement proposal and the Component config tutorial
    • Manager manifests now use SecurityContext to address security concerns. More info: #1637
  • Misc:

    • Support for controller-tools v0.9.0 (for go/v2 it is v0.3.0 and previously it was v0.2.5)
    • Support for controller-runtime v0.12.1 (for go/v2 it is v0.6.4 and previously it was v0.5.0)
    • Support for kustomize v3.8.7 (for go/v2 it is v3.5.4 and previously it was v3.1.0)
    • Required Envtest binaries are automatically downloaded
    • The minimum Go version is now 1.18 (previously it was 1.13).

Migrating to Kubebuilder v3

If you want to upgrade your scaffolding to use the latest and greatest features, follow the guide below, which covers the steps in the most straightforward way to upgrade your project and pick up all the latest changes and improvements.

By updating the files manually

If you want to use the latest version of the Kubebuilder CLI without changing your scaffolding, check the following guide, which describes the manual steps required to upgrade only your PROJECT version and start using the plugin versions.

This way is more complex, susceptible to errors, and success cannot be assured. Also, by following these steps you will not get the improvements and bug fixes in the default generated project files.

You will see that you can still use the previous layout by using the go/v2 plugin, which will not upgrade controller-runtime and controller-tools to the latest versions used with go/v3 because of their breaking changes. This guide also shows how to manually change the files to use the go/v3 plugin and its dependency versions.

Migration from v2 to v3

Make sure you understand the differences between Kubebuilder v2 and v3 before continuing.

Please ensure you have followed the installation guide to install the required components.

The recommended way to migrate a v2 project is to create a new v3 project and copy over the API and the reconciliation code. The conversion will end up with a project that looks like a native v3 project. However, in some cases, it’s possible to do an in-place upgrade (i.e. reuse the v2 project layout, upgrading controller-runtime and controller-tools).

Initialize a v3 Project

Create a new directory with the name of your project. Note that this name is used in the scaffolds to create the name of your manager Pod and of the Namespace where the Manager is deployed by default.

$ mkdir migration-project-name
$ cd migration-project-name

Now, we need to initialize a v3 project. Before we do that, though, we’ll need to initialize a new go module if we’re not on the GOPATH. While technically this is not needed inside GOPATH, it is still recommended.

go mod init tutorial.kubebuilder.io/migration-project

Then, we can finish initializing the project with kubebuilder.

kubebuilder init --domain tutorial.kubebuilder.io

Migrate APIs and Controllers

Next, we’ll re-scaffold out the API types and controllers.

kubebuilder create api --group batch --version v1 --kind CronJob

Migrate the APIs

Now, let’s copy the API definition from api/v1/<kind>_types.go in our old project to the new one.

These files have not been modified by the new plugin, so you should be able to replace your freshly scaffolded files with your old ones. There may be some cosmetic changes, so you can choose to copy only the types themselves.

Migrate the Controllers

Now, let’s migrate the controller code from controllers/cronjob_controller.go in our old project to the new one. There is a breaking change and there may be some cosmetic changes.

The new Reconcile method receives the context as an argument now, instead of having to create it with context.Background(). You can copy the rest of the code in your old controller to the scaffolded methods replacing:

func (r *CronJobReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
    ctx := context.Background() 
    log := r.Log.WithValues("cronjob", req.NamespacedName)

With:

func (r *CronJobReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	log := r.Log.WithValues("cronjob", req.NamespacedName)

Migrate the Webhooks

Now let’s scaffold the webhooks for our CRD (CronJob). We’ll need to run the following command with the --defaulting and --programmatic-validation flags (since our test project uses defaulting and validating webhooks):

kubebuilder create webhook --group batch --version v1 --kind CronJob --defaulting --programmatic-validation

Now, let’s copy the webhook definition from api/v1/<kind>_webhook.go from our old project to the new one.

Others

If there are any manual updates in main.go in v2, we need to port the changes to the new main.go. We’ll also need to ensure all of the needed schemes have been registered.

If there are additional manifests added under config directory, port them as well.

Change the image name in the Makefile if needed.

Verification

Finally, we can run make and make docker-build to ensure things are working fine.

Migration from v2 to v3 by updating the files manually

Make sure you understand the differences between Kubebuilder v2 and v3 before continuing

Please ensure you have followed the installation guide to install the required components.

The following guide describes the manual steps required to upgrade your config version and start using the plugin-enabled version.

This way is more complex, susceptible to errors, and success cannot be assured. Also, by following these steps you will not get the improvements and bug fixes in the default generated project files.

Usually you will only try to do it manually if you customized your project and deviated too much from the proposed scaffold. Before continuing, ensure that you understand the note about project customizations. Note that doing this process manually might take more effort than organizing your project customizations to follow the proposed layout, which keeps your project maintainable and upgradable with less effort in the future.

The recommended upgrade approach is to follow the Migration Guide v2 to V3 instead.

Migration from project config version “2” to “3”

Migrating between project configuration versions involves additions, removals, and/or changes to fields in your project’s PROJECT file, which is created by running the init command.

The PROJECT file now has a new layout. It stores more information about what resources are in use, to better enable plugins to make useful decisions when scaffolding.

Furthermore, the PROJECT file itself is now versioned. The version field corresponds to the version of the PROJECT file itself, while the layout field indicates the scaffolding and the primary plugin version in use.

Steps to migrate

The following steps describe the manual changes required to bring the project configuration file (PROJECT) up to date. These changes will add the information that Kubebuilder would add when generating the file. This file can be found in the root directory.

Add the projectName

The project name is the name of the project directory in lowercase:

...
projectName: example
...

Add the layout

The default plugin layout which is equivalent to the previous version is go.kubebuilder.io/v2:

...
layout:
- go.kubebuilder.io/v2
...

Update the version

The version field represents the version of the project configuration layout. Update it to "3":

...
version: "3"
...

Add the resource data

The attribute resources represents the list of resources scaffolded in your project.

You will need to add the following data for each resource added to the project.

Add the Kubernetes API version by adding resources[entry].api.crdVersion: v1beta1:
...
resources:
- api:
    ...
    crdVersion: v1beta1
  domain: my.domain
  group: webapp
  kind: Guestbook
  ...
Add the scope used to scaffold the CRDs by adding resources[entry].api.namespaced: true, unless they were cluster-scoped:
...
resources:
- api:
    ...
    namespaced: true
  group: webapp
  kind: Guestbook
  ...
If you have a controller scaffolded for the API, add resources[entry].controller: true:
...
resources:
- api:
    ...
  controller: true
  group: webapp
  kind: Guestbook
Add the resource domain, such as resources[entry].domain: testproject.org, which will usually be the project domain unless the API scaffolded is a core type and/or an external type:
...
resources:
- api:
    ...
  domain: testproject.org
  group: webapp
  kind: Guestbook

Note that you will only need to add the domain if your project has a scaffold for a core type API whose Domain value is not empty in the Kubernetes API group qualified scheme definition. (For example, Kinds from the apps API have no domain, while Kinds from the authentication API have the domain k8s.io.)

Check the following list to see the supported core types and their domains:

Core Type                Domain
admission                “k8s.io”
admissionregistration    “k8s.io”
apps                     empty
auditregistration        “k8s.io”
apiextensions            “k8s.io”
authentication           “k8s.io”
authorization            “k8s.io”
autoscaling              empty
batch                    empty
certificates             “k8s.io”
coordination             “k8s.io”
core                     empty
events                   “k8s.io”
extensions               empty
imagepolicy              “k8s.io”
networking               “k8s.io”
node                     “k8s.io”
metrics                  “k8s.io”
policy                   empty
rbac.authorization       “k8s.io”
scheduling               “k8s.io”
setting                  “k8s.io”
storage                  “k8s.io”

Following is an example where a controller was scaffolded for the core type Kind Deployment via the command create api --group apps --version v1 --kind Deployment --controller=true --resource=false --make=false:

- controller: true
  group: apps
  kind: Deployment
  path: k8s.io/api/apps/v1
  version: v1
Add resources[entry].path with the import path for the API:
...
resources:
- api:
    ...
  ...
  group: webapp
  kind: Guestbook
  path: example/api/v1
If your project is using webhooks, add resources[entry].webhooks.[type]: true for each type generated, and then add resources[entry].webhooks.webhookVersion: v1beta1:
resources:
- api:
    ...
  ...
  group: webapp
  kind: Guestbook
  webhooks:
    defaulting: true
    validation: true
    webhookVersion: v1beta1

Check your PROJECT file

Now ensure that your PROJECT file has the same information that would be generated by the Kubebuilder v3 CLI.

For the QuickStart example, the PROJECT file manually updated to use go.kubebuilder.io/v2 would look like:

domain: my.domain
layout:
- go.kubebuilder.io/v2
projectName: example
repo: example
resources:
- api:
    crdVersion: v1
    namespaced: true
  controller: true
  domain: my.domain
  group: webapp
  kind: Guestbook
  path: example/api/v1
  version: v1
version: "3"

You can check the differences between the previous layout (version "2") and the current format (version "3") with go.kubebuilder.io/v2 by comparing an example scenario that involves more than one API and webhook; see:

Example (Project version 2)

domain: testproject.org
repo: sigs.k8s.io/kubebuilder/example
resources:
- group: crew
  kind: Captain
  version: v1
- group: crew
  kind: FirstMate
  version: v1
- group: crew
  kind: Admiral
  version: v1
version: "2"

Example (Project version 3)

domain: testproject.org
layout:
- go.kubebuilder.io/v2
projectName: example
repo: sigs.k8s.io/kubebuilder/example
resources:
- api:
    crdVersion: v1
    namespaced: true
  controller: true
  domain: testproject.org
  group: crew
  kind: Captain
  path: example/api/v1
  version: v1
  webhooks:
    defaulting: true
    validation: true
    webhookVersion: v1
- api:
    crdVersion: v1
    namespaced: true
  controller: true
  domain: testproject.org
  group: crew
  kind: FirstMate
  path: example/api/v1
  version: v1
  webhooks:
    conversion: true
    webhookVersion: v1
- api:
    crdVersion: v1
  controller: true
  domain: testproject.org
  group: crew
  kind: Admiral
  path: example/api/v1
  plural: admirales
  version: v1
  webhooks:
    defaulting: true
    webhookVersion: v1
version: "3"

Verification

In the steps above, you updated only the PROJECT file which represents the project configuration. This configuration is useful only for the CLI tool. It should not affect how your project behaves.

There is no option to verify that you properly updated the configuration file. The best way to ensure the configuration file has the correct V3+ fields is to initialize a project with the same API(s), controller(s), and webhook(s) in order to compare generated configuration with the manually changed configuration.

If you made mistakes in the above process, you will likely face issues using the CLI.

Update your project to use go/v3 plugin

Migrating between project plugins involves additions, removals, and/or changes to files created by any plugin-supported command, e.g. init and create. A plugin supports one or more project config versions; make sure you upgrade your project’s config version to the latest supported by your target plugin version before upgrading plugin versions.

The following steps describe the manual changes required to modify the project’s layout enabling your project to use the go/v3 plugin. These steps will not help you address all the bug fixes of the already generated scaffolds.

Steps to migrate

Update your plugin version into the PROJECT file

Before updating the layout, please ensure you have followed the above steps to upgrade your Project version to 3. Once you have upgraded the project version, update the layout to the new plugin version go.kubebuilder.io/v3 as follows:

domain: my.domain
layout:
- go.kubebuilder.io/v3
...

Upgrade the Go version and its dependencies:

Ensure that your go.mod is using Go version 1.18 and the following dependency versions:

module example

go 1.18

require (
    github.com/onsi/ginkgo/v2 v2.1.4
    github.com/onsi/gomega v1.19.0
    k8s.io/api v0.24.0
    k8s.io/apimachinery v0.24.0
    k8s.io/client-go v0.24.0
    sigs.k8s.io/controller-runtime v0.12.1
)

Update the golang image

In the Dockerfile, replace:

# Build the manager binary
FROM golang:1.13 as builder

With:

# Build the manager binary
FROM golang:1.16 as builder

Update your Makefile

To allow controller-gen to scaffold the new Kubernetes APIs

To allow controller-gen and the scaffolding tool to use the new API versions, replace:

CRD_OPTIONS ?= "crd:trivialVersions=true"

With:

CRD_OPTIONS ?= "crd"
To allow automatic downloads

To allow downloading the newer versions of the Kubernetes binaries required by Envtest into the testbin/ directory of your project instead of the global setup, replace:

# Run tests
test: generate fmt vet manifests
	go test ./... -coverprofile cover.out

With:

# Setting SHELL to bash allows bash commands to be executed by recipes.
# Options are set to exit when a recipe line exits non-zero or a piped command fails.
SHELL = /usr/bin/env bash -o pipefail
.SHELLFLAGS = -ec

ENVTEST_ASSETS_DIR=$(shell pwd)/testbin
test: manifests generate fmt vet ## Run tests.
	mkdir -p ${ENVTEST_ASSETS_DIR}
	test -f ${ENVTEST_ASSETS_DIR}/setup-envtest.sh || curl -sSLo ${ENVTEST_ASSETS_DIR}/setup-envtest.sh https://raw.githubusercontent.com/kubernetes-sigs/controller-runtime/v0.8.3/hack/setup-envtest.sh
	source ${ENVTEST_ASSETS_DIR}/setup-envtest.sh; fetch_envtest_tools $(ENVTEST_ASSETS_DIR); setup_envtest_env $(ENVTEST_ASSETS_DIR); go test ./... -coverprofile cover.out
To upgrade controller-gen and kustomize dependencies versions used

To upgrade the controller-gen and kustomize version used to generate the manifests replace:

# find or download controller-gen
# download controller-gen if necessary
controller-gen:
ifeq (, $(shell which controller-gen))
	@{ \
	set -e ;\
	CONTROLLER_GEN_TMP_DIR=$$(mktemp -d) ;\
	cd $$CONTROLLER_GEN_TMP_DIR ;\
	go mod init tmp ;\
	go get sigs.k8s.io/controller-tools/cmd/controller-gen@v0.2.5 ;\
	rm -rf $$CONTROLLER_GEN_TMP_DIR ;\
	}
CONTROLLER_GEN=$(GOBIN)/controller-gen
else
CONTROLLER_GEN=$(shell which controller-gen)
endif

With:

##@ Build Dependencies

## Location to install dependencies to
LOCALBIN ?= $(shell pwd)/bin
$(LOCALBIN):
	mkdir -p $(LOCALBIN)

## Tool Binaries
KUSTOMIZE ?= $(LOCALBIN)/kustomize
CONTROLLER_GEN ?= $(LOCALBIN)/controller-gen
ENVTEST ?= $(LOCALBIN)/setup-envtest

## Tool Versions
KUSTOMIZE_VERSION ?= v3.8.7
CONTROLLER_TOOLS_VERSION ?= v0.9.0

KUSTOMIZE_INSTALL_SCRIPT ?= "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh"
.PHONY: kustomize
kustomize: $(KUSTOMIZE) ## Download kustomize locally if necessary.
$(KUSTOMIZE): $(LOCALBIN)
	test -s $(LOCALBIN)/kustomize || { curl -Ss $(KUSTOMIZE_INSTALL_SCRIPT) | bash -s -- $(subst v,,$(KUSTOMIZE_VERSION)) $(LOCALBIN); }

.PHONY: controller-gen
controller-gen: $(CONTROLLER_GEN) ## Download controller-gen locally if necessary.
$(CONTROLLER_GEN): $(LOCALBIN)
	test -s $(LOCALBIN)/controller-gen || GOBIN=$(LOCALBIN) go install sigs.k8s.io/controller-tools/cmd/controller-gen@$(CONTROLLER_TOOLS_VERSION)

.PHONY: envtest
envtest: $(ENVTEST) ## Download envtest-setup locally if necessary.
$(ENVTEST): $(LOCALBIN)
	test -s $(LOCALBIN)/setup-envtest || GOBIN=$(LOCALBIN) go install sigs.k8s.io/controller-runtime/tools/setup-envtest@latest

And then, to make your project use the kustomize version defined in the Makefile, replace all usages of kustomize with $(KUSTOMIZE).

Update your controllers

Replace:

func (r *<MyKind>Reconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
    ctx := context.Background()
    log := r.Log.WithValues("cronjob", req.NamespacedName)

With:

func (r *<MyKind>Reconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    log := r.Log.WithValues("cronjob", req.NamespacedName)

Update your controller and webhook test suite

Replace:

	. "github.com/onsi/ginkgo"

With:

	. "github.com/onsi/ginkgo/v2"

Also, adjust your test suite.

For Controller Suite:

	RunSpecsWithDefaultAndCustomReporters(t,
		"Controller Suite",
		[]Reporter{printer.NewlineReporter{}})

With:

	RunSpecs(t, "Controller Suite")

For Webhook Suite:

	RunSpecsWithDefaultAndCustomReporters(t,
		"Webhook Suite",
		[]Reporter{printer.NewlineReporter{}})

With:

	RunSpecs(t, "Webhook Suite")

Last but not least, remove the timeout variable from the BeforeSuite blocks:

Replace:

var _ = BeforeSuite(func(done Done) {
	....
}, 60)

With

var _ = BeforeSuite(func(done Done) {
	....
})

Change Logger to use flag options

In the main.go file replace:

flag.Parse()

ctrl.SetLogger(zap.New(zap.UseDevMode(true)))

With:

opts := zap.Options{
	Development: true,
}
opts.BindFlags(flag.CommandLine)
flag.Parse()

ctrl.SetLogger(zap.New(zap.UseFlagOptions(&opts)))

Rename the manager flags

The manager flags --metrics-addr and enable-leader-election were renamed to --metrics-bind-address and --leader-elect to be more aligned with core Kubernetes Components. More info: #1839.

In your main.go file replace:

func main() {
	var metricsAddr string
	var enableLeaderElection bool
	flag.StringVar(&metricsAddr, "metrics-addr", ":8080", "The address the metric endpoint binds to.")
	flag.BoolVar(&enableLeaderElection, "enable-leader-election", false,
		"Enable leader election for controller manager. "+
			"Enabling this will ensure there is only one active controller manager.")

With:

func main() {
	var metricsAddr string
	var enableLeaderElection bool
	flag.StringVar(&metricsAddr, "metrics-bind-address", ":8080", "The address the metric endpoint binds to.")
	flag.BoolVar(&enableLeaderElection, "leader-elect", false,
		"Enable leader election for controller manager. "+
			"Enabling this will ensure there is only one active controller manager.")

And then, rename the flags in the config/default/manager_auth_proxy_patch.yaml and config/default/manager.yaml:

- name: manager
args:
- "--health-probe-bind-address=:8081"
- "--metrics-bind-address=127.0.0.1:8080"
- "--leader-elect"

Verification

Finally, we can run make and make docker-build to ensure things are working fine.

Change your project to remove the Kubernetes deprecated API versions usage

The following steps describe a workflow to upgrade your project to remove the deprecated Kubernetes APIs: apiextensions.k8s.io/v1beta1, admissionregistration.k8s.io/v1beta1, cert-manager.io/v1alpha2.

The Kubebuilder CLI tool does not support scaffolding resources for both Kubernetes API versions at once, e.g. an API/CRD with apiextensions.k8s.io/v1beta1 and another one with apiextensions.k8s.io/v1.

The first step is to update your PROJECT file by replacing api.crdVersion: v1beta1 and webhooks.webhookVersion: v1beta1 with api.crdVersion: v1 and webhooks.webhookVersion: v1, which would look like:

domain: my.domain
layout: go.kubebuilder.io/v3
projectName: example
repo: example
resources:
- api:
    crdVersion: v1
    namespaced: true
  group: webapp
  kind: Guestbook
  version: v1
  webhooks:
    defaulting: true
    webhookVersion: v1
version: "3"

Now, re-create the APIs (CRDs) and webhook manifests by running the kubebuilder create api and kubebuilder create webhook commands again for the same group, kind, and versions, passing the --force flag to overwrite the existing scaffolds.

V3 - Plugins Layout Migration Guides

The following are the migration guides for the plugin versions. Note that the plugin ecosystem was introduced with the Kubebuilder v3.0.0 release, and go/v3 has been the default layout since 28 Apr 2021.

Therefore, you can check here how to migrate projects built with Kubebuilder 3.x and the go/v3 plugin to the latest layout.

go/v3 vs go/v4

This document covers all breaking changes when migrating from projects built using the plugin go/v3 (default for any scaffold done since 28 Apr 2021) to the next alpha version of the Golang plugin go/v4.

The details of all changes (breaking or otherwise) can be found in:

Common changes

  • go/v4 projects use Kustomize v5.x (instead of v3.x)
  • note that some manifests under the config/ directory have been changed so that they no longer use deprecated Kustomize features such as env vars.
  • A kustomization.yaml is scaffolded under config/samples. This helps you simply and flexibly generate sample manifests: kustomize build config/samples.
  • adds support for Apple Silicon M1 (darwin/arm64)
  • removes support for the CRD/Webhook Kubernetes API v1beta1 versions, which are no longer supported since k8s 1.22
  • no longer scaffolds webhook test files with "k8s.io/api/admission/v1beta1", which is no longer served since k8s 1.25. By default, webhook test files are scaffolded using "k8s.io/api/admission/v1", which is supported since k8s 1.20
  • no longer provides backwards-compatible support for k8s versions < 1.16
  • changes the layout to accommodate the community request to follow the Standard Go Project Layout: the api(s) move under a new directory called api, the controller(s) under a new directory called internal, and main.go under a new directory named cmd

TL;DR of the New go/v4 Plugin

More details on this can be found here; the highlights are the common changes listed above.

Migrating to Kubebuilder go/v4

If you want to upgrade your scaffolding to use the latest and greatest features, follow the guide below, which covers the steps in the most straightforward way to upgrade your project and pick up all the latest changes and improvements.

By updating the files manually

If you want to use the latest version of the Kubebuilder CLI without changing your scaffolding, check the following guide, which describes the manual steps required to upgrade only your PROJECT version and start using the plugin versions.

This way is more complex, susceptible to errors, and success cannot be assured. Also, by following these steps you will not get the improvements and bug fixes in the default generated project files.

Migration from go/v3 to go/v4

Make sure you understand the differences between Kubebuilder go/v3 and go/v4 before continuing.

Please ensure you have followed the installation guide to install the required components.

The recommended way to migrate a go/v3 project is to create a new go/v4 project and copy over the API and the reconciliation code. The conversion will end up with a project that looks like a native go/v4 project layout (latest version).

However, in some cases, it’s possible to do an in-place upgrade (i.e. reuse the go/v3 project layout, upgrading the PROJECT file, and scaffolds manually). For further information see Migration from go/v3 to go/v4 by updating the files manually

Initialize a go/v4 Project

Create a new directory with the name of your project. Note that this name is used in the scaffolds to create the name of your manager Pod and of the Namespace where the Manager is deployed by default.

$ mkdir migration-project-name
$ cd migration-project-name

Now, we need to initialize a go/v4 project. Before we do that, we’ll need to initialize a new go module if we’re not on the GOPATH. While technically this is not needed inside GOPATH, it is still recommended.

go mod init tutorial.kubebuilder.io/migration-project

Now, we can finish initializing the project with kubebuilder.

kubebuilder init --domain tutorial.kubebuilder.io --plugins=go/v4

Migrate APIs and Controllers

Next, we’ll re-scaffold out the API types and controllers.

kubebuilder create api --group batch --version v1 --kind CronJob

Migrate the APIs

Now, let’s copy the API definition from api/v1/<kind>_types.go in our old project to the new one.

These files have not been modified by the new plugin, so you should be able to replace your freshly scaffolded files with your old ones. There may be some cosmetic changes, so you can choose to copy only the types themselves.

Migrate the Controllers

Now, let’s migrate the controller code from controllers/cronjob_controller.go in our old project to the new one.
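
Keep in mind that in the go/v4 layout the controller lives under internal/controller and its package is named controller (singular), so the copied code needs its package clause adjusted. A minimal sketch, assuming the standard scaffold:

// internal/controller/cronjob_controller.go (was controllers/cronjob_controller.go)
package controller // was: package controllers

import (
	"context"

	"k8s.io/apimachinery/pkg/runtime"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// CronJobReconciler reconciles a CronJob object.
type CronJobReconciler struct {
	client.Client
	Scheme *runtime.Scheme
}

// Reconcile receives the reconciliation logic copied from the go/v3 project.
func (r *CronJobReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// ... reconciliation logic copied over unchanged ...
	return ctrl.Result{}, nil
}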

Migrate the Webhooks

Now let’s scaffold the webhooks for our CRD (CronJob). We’ll need to run the following command with the --defaulting and --programmatic-validation flags (since our test project uses defaulting and validating webhooks):

kubebuilder create webhook --group batch --version v1 --kind CronJob --defaulting --programmatic-validation

Now, let’s copy the webhook definition from api/v1/<kind>_webhook.go from our old project to the new one.
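
The webhook methods themselves are copied as-is; only the surrounding scaffold (markers and SetupWebhookWithManager) is regenerated for you. A minimal sketch with the bodies elided (exact method signatures depend on your controller-runtime version):

// api/v1/cronjob_webhook.go (excerpt)
package v1

import (
	ctrl "sigs.k8s.io/controller-runtime"
)

// SetupWebhookWithManager registers the webhook with the manager
// (re-scaffolded by `kubebuilder create webhook`).
func (r *CronJob) SetupWebhookWithManager(mgr ctrl.Manager) error {
	return ctrl.NewWebhookManagedBy(mgr).
		For(r).
		Complete()
}

// Default contains the defaulting logic copied from the go/v3 project.
func (r *CronJob) Default() {
	// ... defaulting logic copied over unchanged ...
}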

Others

If there are any manual updates in main.go in v3, we need to port the changes to the new main.go. We’ll also need to ensure that all of the needed controller-runtime schemes have been registered.
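
Scheme registration in the scaffolded cmd/main.go typically looks like the excerpt below; make sure every API group your controllers use is added there (the module path matches the go mod init command above and may differ in your project):

// excerpt from cmd/main.go (func main omitted)
package main

import (
	"k8s.io/apimachinery/pkg/runtime"
	utilruntime "k8s.io/apimachinery/pkg/util/runtime"
	clientgoscheme "k8s.io/client-go/kubernetes/scheme"

	batchv1 "tutorial.kubebuilder.io/migration-project/api/v1"
	// +kubebuilder:scaffold:imports
)

var scheme = runtime.NewScheme()

func init() {
	// Built-in Kubernetes types (Pods, Deployments, ...).
	utilruntime.Must(clientgoscheme.AddToScheme(scheme))
	// Your own API group(s); add one line per group your controllers need.
	utilruntime.Must(batchv1.AddToScheme(scheme))
	// +kubebuilder:scaffold:scheme
}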

If there are additional manifests added under the config directory, port them as well. Please be aware that go/v4 uses Kustomize v5 and no longer Kustomize v4. Therefore, if you added customized implementations under config, you need to ensure that they work with Kustomize v5 and, if not, update them to address any breaking changes that you might face.

In v4, the installation of Kustomize has been changed from a bash script to go install. Change the kustomize dependency in the Makefile to:

.PHONY: kustomize
kustomize: $(KUSTOMIZE) ## Download kustomize locally if necessary. If wrong version is installed, it will be removed before downloading.
$(KUSTOMIZE): $(LOCALBIN)
	@if test -x $(LOCALBIN)/kustomize && ! $(LOCALBIN)/kustomize version | grep -q $(KUSTOMIZE_VERSION); then \
		echo "$(LOCALBIN)/kustomize version is not expected $(KUSTOMIZE_VERSION). Removing it before installing."; \
		rm -rf $(LOCALBIN)/kustomize; \
	fi
	test -s $(LOCALBIN)/kustomize || GOBIN=$(LOCALBIN) GO111MODULE=on go install sigs.k8s.io/kustomize/kustomize/v5@$(KUSTOMIZE_VERSION)

Change the image name in the Makefile if needed.

Verification

Finally, we can run make and make docker-build to ensure things are working fine.

Migration from go/v3 to go/v4 by updating the files manually

Make sure you understand the differences between Kubebuilder go/v3 and go/v4 before continuing.

Please ensure you have followed the installation guide to install the required components.

The following guide describes the manual steps required to upgrade your PROJECT config file to begin using go/v4.

This approach is more complex and error-prone, and success cannot be assured. Also, by following these steps you will not get the improvements and bug fixes in the default generated project files.

Usually it is suggested to do it manually if you have customized your project and deviated too much from the proposed scaffold. Before continuing, ensure that you understand the note about project customizations. Note that you might need to spend more effort to do this process manually than to organize your project customizations. The proposed layout will keep your project maintainable and upgradable with less effort in the future.

The recommended upgrade approach is to follow the Migration Guide go/v3 to go/v4 instead.

Migration from project config version “go/v3” to “go/v4”

Update the PROJECT file layout, which stores information about the resources and is used by plugins to make useful decisions while scaffolding. The layout field indicates the scaffolding and the primary plugin version in use.

Steps to migrate

Migrate the layout version into the PROJECT file

The following steps describe the manual changes required to bring the project configuration file (PROJECT) up to date. These changes will add the information that Kubebuilder would add when generating the file. This file can be found in the root directory.

Update the PROJECT file by replacing:

layout:
- go.kubebuilder.io/v3

With:

layout:
- go.kubebuilder.io/v4

Changes to the layout

New layout:
  • The directory apis was renamed to api to follow the standard
  • The controller(s) directory has been moved under a new directory called internal and renamed to the singular controller
  • The main.go previously scaffolded in the root directory has been moved under a new directory called cmd

Therefore, the changes in the layout result in:

...
├── cmd
│ └── main.go
├── internal
│ └── controller
└── api

Migrating to the new layout:
  • Create a new directory cmd and move the main.go under it.
  • If your project supports multi-group, the APIs are scaffolded under a directory called apis. Rename this directory to api.
  • Move the controllers directory under internal and rename it to controller.
  • Now ensure that the imports are updated accordingly:
    • Update the main.go imports to look for the new path of your controllers under the internal/controller directory (see the sketch below).
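
For example, the changed import line in cmd/main.go would look roughly like this (the module path below is a placeholder for your own; note that the package name also changes from controllers to controller, so references such as controllers.CronJobReconciler become controller.CronJobReconciler):

// Before (go/v3 layout):
//   "tutorial.kubebuilder.io/migration-project/controllers"
// After (go/v4 layout):
"tutorial.kubebuilder.io/migration-project/internal/controller"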

Then, let’s update the scaffold paths

  • Update the Dockerfile to ensure that you will have:
COPY cmd/main.go cmd/main.go
COPY api/ api/
COPY internal/controller/ internal/controller/

Then, replace:

RUN CGO_ENABLED=0 GOOS=${TARGETOS:-linux} GOARCH=${TARGETARCH} go build -a -o manager main.go

With:

RUN CGO_ENABLED=0 GOOS=${TARGETOS:-linux} GOARCH=${TARGETARCH} go build -a -o manager cmd/main.go
  • Update the Makefile targets to build and run the manager by replacing:
.PHONY: build
build: manifests generate fmt vet ## Build manager binary.
	go build -o bin/manager main.go

.PHONY: run
run: manifests generate fmt vet ## Run a controller from your host.
	go run ./main.go

With:

.PHONY: build
build: manifests generate fmt vet ## Build manager binary.
	go build -o bin/manager cmd/main.go

.PHONY: run
run: manifests generate fmt vet ## Run a controller from your host.
	go run ./cmd/main.go
  • Update the internal/controller/suite_test.go to set the path for the CRDDirectoryPaths:

Replace:

CRDDirectoryPaths:     []string{filepath.Join("..", "config", "crd", "bases")},

With:

CRDDirectoryPaths:     []string{filepath.Join("..", "..", "config", "crd", "bases")},

Note that if your project has multiple groups (multigroup: true), then the above update should use "..", "..", ".." instead of "..", "..".
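In other words, for a multi-group layout the line in internal/controller/<group>/suite_test.go would typically look like the following (the extra ".." accounts for the additional group directory level):

CRDDirectoryPaths:     []string{filepath.Join("..", "..", "..", "config", "crd", "bases")},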

Now, let’s update the path entries in the PROJECT file accordingly

The PROJECT tracks the paths of all APIs used in your project. Ensure that they now point to api/... as the following example:

Before update:

  group: crew
  kind: Captain
  path: sigs.k8s.io/kubebuilder/testdata/project-v4/apis/crew/v1

After Update:


  group: crew
  kind: Captain
  path: sigs.k8s.io/kubebuilder/testdata/project-v4/api/crew/v1

Update kustomize manifests with the changes made so far

  • Update the manifests under the config/ directory with all changes performed in the default scaffold done with the go/v4 plugin (see, for example, testdata/project-v4/config/) so that all changes in the default scaffolds are applied to your project
  • Create config/samples/kustomization.yaml listing all the Custom Resource samples under config/samples (see, for example, testdata/project-v4/config/samples/kustomization.yaml)

If you have webhooks:

Replace the import admissionv1beta1 "k8s.io/api/admission/v1beta1" with admissionv1 "k8s.io/api/admission/v1" in the webhook test files.

Makefile updates

Update the Makefile with the changes which can be found in the samples under testdata for the release tag used (see, for example, testdata/project-v4/Makefile).

Update the dependencies

Update the go.mod with the changes which can be found in the samples under testdata for the release tag used (see, for example, testdata/project-v4/go.mod). Then, run go mod tidy to ensure that you get the latest dependencies and that your Go code has no breaking changes.

Verification

In the steps above, you updated your project manually with the goal of ensuring that it follows the changes in the layout introduced with the go/v4 plugin, which updates the scaffolds.

There is no automated way to verify that you properly updated the PROJECT file of your project. The best way to ensure that everything is updated correctly is to initialize a project using the go/v4 plugin, i.e. using kubebuilder init --domain tutorial.kubebuilder.io --plugins=go/v4, and to generate the same API(s), controller(s), and webhook(s) in order to compare the generated configuration with the manually changed configuration.

Also, after all updates, run the following commands:

  • make manifests (to re-generate the files using the latest version of controller-gen after you update the Makefile)
  • make all (to ensure that you are able to build and perform all operations)

Single Group to Multi-Group

Let’s migrate the CronJob example.

To change the layout of your project to support Multi-Group, run the command kubebuilder edit --multigroup=true. Once you switch to a multi-group layout, the new Kinds will be generated in the new layout, but additional manual work is needed to move the old API groups to the new layout.

Generally, we use the prefix for the API group as the directory name. We can check api/v1/groupversion_info.go to find that out:

// +groupName=batch.tutorial.kubebuilder.io
package v1

Then, we’ll move our existing APIs into a new subdirectory, “batch”:

mkdir api/batch
mv api/* api/batch

After moving the APIs to a new directory, the same needs to be applied to the controllers. For go/v4:

mkdir internal/controller/batch
mv internal/controller/* internal/controller/batch/
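
Since the controllers now sit one directory deeper, their import path in cmd/main.go has to follow the new location as well; a rough sketch of the change, assuming the CronJob tutorial’s module path:

// cmd/main.go
// Before the move:
//   "tutorial.kubebuilder.io/project/internal/controller"
// After the move:
//   "tutorial.kubebuilder.io/project/internal/controller/batch"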