Crane Recommendation Component: Source Code Notes

2025-02-12

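These are excerpts from my own notes. The project is very large, and I hope this gives fellow enthusiasts some ideas. The focus here is a source-code walkthrough of the modules in the recommendation part.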

Architecture

The project is organized into several directories and files, each serving a specific purpose. Here’s a brief overview of the main components:

cmd/
- metric-adapter/: Contains code related to the metric adapter component
- craned/: Contains the main application code for the craned command
  - app/: Likely contains application-specific logic and configurations
  - main.go: The entry point for the craned application
- crane-agent/: Contains code related to the crane agent component
deploy/: Contains deployment scripts and configurations for deploying the application
docs/: Documentation files for the project
examples/: Example configurations or usage examples for the project
hack/: Scripts and utilities for development and maintenance tasks
overrides/: Possibly contains configuration overrides or custom settings
pkg/: Contains the core library code, which can be used by other applications or commands
site/: Could be related to the project’s website or documentation site
tools/: Additional tools or scripts that support the project
Files:
- go.mod and go.sum: Go module files that manage dependencies
- README.md and README_zh.md: Documentation files providing an overview and instructions for the project
- Makefile: Contains build instructions and tasks for the project
- Dockerfile: Instructions for building a Docker image of the application
- .gitignore: Specifies files and directories to be ignored by Git
- .golangci.yaml: Configuration for GolangCI, a Go linter
- CONTRIBUTING.md and CODE_OF_CONDUCT.md: Guidelines for contributing to the project and expected behavior

Manager

Command Initialization:
- NewManagerCommand(ctx context.Context) *cobra.Command:
  - Creates a new Cobra command for craned
  - Initializes options using options.NewOptions()
  - Sets up command flags and feature gates
Command Execution:
- The Run function within the Cobra command is executed:
  - Calls opts.Complete() to complete option setup
  - Validates options with opts.Validate()
  - Executes Run(ctx, opts) to start the manager
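
The Complete, Validate, Run ordering above can be sketched with the standard library alone. This is a minimal illustration, not Crane's code: it uses the `flag` package instead of Cobra, and the `Options` fields, the flag names, and `runCommand` are all hypothetical.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// Options is a hypothetical stand-in for Crane's options.Options.
type Options struct {
	MetricsAddr string
	APIQPS      int
}

// Complete fills in defaults for anything the flags left empty.
func (o *Options) Complete() error {
	if o.MetricsAddr == "" {
		o.MetricsAddr = ":8080"
	}
	return nil
}

// Validate rejects option values that cannot work.
func (o *Options) Validate() error {
	if o.APIQPS <= 0 {
		return errors.New("api-qps must be positive")
	}
	return nil
}

// runCommand parses flags, then calls Complete, Validate and run in that
// order, stopping at the first error.
func runCommand(args []string, run func(*Options) error) error {
	opts := &Options{}
	fs := flag.NewFlagSet("craned-sketch", flag.ContinueOnError)
	fs.StringVar(&opts.MetricsAddr, "metrics-addr", "", "metrics bind address")
	fs.IntVar(&opts.APIQPS, "api-qps", 300, "client QPS")
	if err := fs.Parse(args); err != nil {
		return err
	}
	if err := opts.Complete(); err != nil {
		return err
	}
	if err := opts.Validate(); err != nil {
		return err
	}
	return run(opts)
}

func main() {
	err := runCommand([]string{"-api-qps=100"}, func(o *Options) error {
		fmt.Printf("starting manager with %+v\n", *o)
		return nil
	})
	if err != nil {
		fmt.Println("error:", err)
	}
}
```

Keeping validation separate from execution means a bad flag value is reported before any manager state is created.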
Run Function:
- Run(ctx context.Context, opts *options.Options) error:
  - Configures the controller manager with ctrl.GetConfigOrDie()
  - Sets up controller options, including leader election and metrics binding
  - Creates a new manager with ctrl.NewManager(config, ctrlOptions)
Health Checks:
- Adds health and readiness checks using mgr.AddHealthzCheck and mgr.AddReadyzCheck
Data Sources and Predictor Initialization:
- initDataSources(mgr, opts): Initializes real-time and historical data sources
- initPredictorManager(opts, realtimeDataSources, historyDataSources): Sets up the predictor manager using the initialized data sources
Scheme and Indexer Initialization:
- initScheme(): Registers API types with the runtime scheme
- initFieldIndexer(mgr): Sets up field indexers for efficient querying
Webhooks Initialization:
- initWebhooks(mgr, opts): Configures webhooks if necessary
Pod OOM Recorder:
- Initializes oom.PodOOMRecorder to track out-of-memory events
- Sets up the recorder with the manager and runs it in a separate goroutine
Controller Initialization:
- initControllers(ctx, podOOMRecorder, mgr, opts, predictorMgr, historyDataSources[providers.PrometheusDataSource]): Initializes various controllers, including analytics and recommendation controllers
Metric Collector Initialization:
- initMetricCollector(mgr): Sets up custom metrics collection
Running All Components:
- runAll(ctx, mgr, predictorMgr, dataSourceProviders[providers.PrometheusDataSource], opts): Starts all components, ensuring that the manager and predictor manager are running

Controllers

PodOOMRecorder

SetupWithManager:
- SetupWithManager(mgr ctrl.Manager) error:
  - This function is called to register the PodOOMRecorder with the controller manager
  - It sets up the reconciler to watch for pod events and trigger the Reconcile function
Reconcile Function:
- Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error):
  - Triggered by the controller manager for each pod event
  - Retrieves the pod specified in the request
  - Checks if the pod was terminated due to an OOM event using IsOOMKilled(pod)
  - If OOMKilled, iterates over container statuses to find the terminated container
  - For each OOMKilled container, checks if memory requests are set
  - Increments the OOM count metric and adds an OOMRecord to the queue
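
The detection step in Reconcile can be sketched as follows. The types here are simplified, hypothetical stand-ins for the Kubernetes pod status types, and `collectOOMRecords` is an illustrative name. Skipping containers without a memory request, and requiring a restart count above zero, are assumptions of this sketch. The "OOMKilled" reason string is the one Kubernetes reports for such terminations.

```go
package main

import (
	"fmt"
	"time"
)

// Terminated, ContainerStatus and Pod are simplified stand-ins for the
// corresponding Kubernetes API types (hypothetical shapes).
type Terminated struct {
	Reason     string
	FinishedAt time.Time
}

type ContainerStatus struct {
	Name           string
	RestartCount   int32
	LastTerminated *Terminated // nil if the container never terminated
}

type Pod struct {
	Namespace, Name string
	Statuses        []ContainerStatus
	MemoryRequests  map[string]int64 // container name -> requested bytes
}

type OOMRecord struct {
	Namespace, Pod, Container string
	OOMAt                     time.Time
}

// oomKilled reports whether a single container last terminated due to OOM.
func oomKilled(cs ContainerStatus) bool {
	return cs.RestartCount > 0 &&
		cs.LastTerminated != nil &&
		cs.LastTerminated.Reason == "OOMKilled"
}

// isOOMKilled reports whether any container in the pod was OOM-killed.
func isOOMKilled(pod *Pod) bool {
	for _, cs := range pod.Statuses {
		if oomKilled(cs) {
			return true
		}
	}
	return false
}

// collectOOMRecords returns one record per OOM-killed container that has a
// memory request set (assumption: containers without one are skipped).
func collectOOMRecords(pod *Pod) []OOMRecord {
	if !isOOMKilled(pod) {
		return nil
	}
	var records []OOMRecord
	for _, cs := range pod.Statuses {
		if !oomKilled(cs) {
			continue
		}
		if _, ok := pod.MemoryRequests[cs.Name]; !ok {
			continue
		}
		records = append(records, OOMRecord{
			Namespace: pod.Namespace,
			Pod:       pod.Name,
			Container: cs.Name,
			OOMAt:     cs.LastTerminated.FinishedAt,
		})
	}
	return records
}

func main() {
	pod := &Pod{
		Namespace: "default",
		Name:      "web-0",
		Statuses: []ContainerStatus{
			{Name: "app", RestartCount: 2, LastTerminated: &Terminated{Reason: "OOMKilled", FinishedAt: time.Now()}},
			{Name: "sidecar", RestartCount: 1, LastTerminated: &Terminated{Reason: "Error"}},
		},
		MemoryRequests: map[string]int64{"app": 256 << 20},
	}
	for _, r := range collectOOMRecords(pod) {
		fmt.Printf("OOM: %s/%s container=%s\n", r.Namespace, r.Pod, r.Container)
	}
}
```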
Run Function:
- Run(stopCh <-chan struct{}) error:
  - Continuously processes the OOM records from the queue
  - Uses a ticker to periodically clean old OOM records to prevent exceeding ConfigMap storage limits
  - Retrieves OOM records using GetOOMRecord
  - Updates OOM records with updateOOMRecord
GetOOMRecord Function:
- GetOOMRecord() ([]OOMRecord, error):
  - Retrieves OOM records from the cache or ConfigMap
  - If the cache is nil, it fetches the ConfigMap and unmarshals the OOM records
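
The cache-then-ConfigMap lookup can be illustrated without a cluster. In this sketch, the ConfigMap is represented only by its string data map, and the `recorder` struct, the `fetchConfigMap` callback, the `dataKey` constant, and the JSON field names are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// OOMRecord uses hypothetical JSON field names for illustration.
type OOMRecord struct {
	Namespace string    `json:"namespace"`
	Pod       string    `json:"pod"`
	Container string    `json:"container"`
	OOMAt     time.Time `json:"oomAt"`
}

// dataKey is a hypothetical key under which records are stored in the ConfigMap.
const dataKey = "oom-records"

// recorder holds an in-memory cache and a way to read the ConfigMap data.
type recorder struct {
	cache          []OOMRecord
	fetchConfigMap func() (map[string]string, error)
}

// GetOOMRecord returns the cached records, loading and unmarshalling them
// from the ConfigMap data only when the cache has not been filled yet.
func (r *recorder) GetOOMRecord() ([]OOMRecord, error) {
	if r.cache != nil {
		return r.cache, nil
	}
	data, err := r.fetchConfigMap()
	if err != nil {
		return nil, err
	}
	records := []OOMRecord{}
	if raw, ok := data[dataKey]; ok && raw != "" {
		if err := json.Unmarshal([]byte(raw), &records); err != nil {
			return nil, err
		}
	}
	r.cache = records
	return records, nil
}

func main() {
	fetches := 0
	r := &recorder{fetchConfigMap: func() (map[string]string, error) {
		fetches++
		return map[string]string{
			dataKey: `[{"namespace":"default","pod":"web-0","container":"app","oomAt":"2025-02-12T00:00:00Z"}]`,
		}, nil
	}}

	first, err := r.GetOOMRecord()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	second, _ := r.GetOOMRecord() // served from the cache
	fmt.Println("records:", len(first), len(second), "fetches:", fetches)
}
```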
cleanOOMRecords Function:
- cleanOOMRecords(oomRecords []OOMRecord) []OOMRecord:
  - Cleans up old OOM records to maintain the maximum number specified by OOMRecordMaxNumber
updateOOMRecord Function:
- updateOOMRecord(oomRecord OOMRecord, saved []OOMRecord) error:
  - Updates the OOM records in the cache or storage with new records
IsOOMKilled Function:
- IsOOMKilled(pod *v1.Pod) bool:
  - Helper function to determine if a pod was terminated due to an OOM event

Detailed Flow

Initialization: The PodOOMRecorder is set up with the manager using SetupWithManager, which registers it to watch for pod events
Event Handling: For each pod event, Reconcile is called to check for OOM events and record them
Record Management: The Run function processes the queue of OOM records, periodically cleans old records, and updates the storage
Data Retrieval: GetOOMRecord fetches the current OOM records, either from the cache or the ConfigMap

Together, these pieces let the PodOOMRecorder track OOM events across a Kubernetes cluster and persist a bounded history of them.

Permission type	Non-batched release	Batched release
Namespace	Read	Read
ControllerRevision (apps/v1/ControllerRevision)	Full	Full
Kubernetes objects to be deployed	Full	Full
Rollout (standard.oam.dev/v1alpha1)	-	Full
Workload objects to be deployed	Full	Full
CRD (apiextensions.k8s.io/v1/CustomResourceDefinition)	-	Full (to install the Rollout CRD)
ServiceAccount (core/v1/ServiceAccount)	-	Full (to keep the Rollout controller running)
ClusterRole (rbac.authorization.k8s.io/v1/ClusterRole)	-	Full (to keep the Rollout controller running)
ClusterRoleBinding (rbac.authorization.k8s.io/v1/ClusterRoleBinding)	-	Full (to keep the Rollout controller running)
Role (rbac.authorization.k8s.io/v1/Role)	-	Full (to keep the Rollout controller running)
RoleBinding (rbac.authorization.k8s.io/v1/RoleBinding)	-	Full (to keep the Rollout controller running)
Deployment (apps/v1/Deployment)	-	Full (to keep the Rollout controller running)
Pod (core/v1/Pod)	-	Full (for the Rollout controller's post-install E2E tests)
