Building Kubernetes Operators for Blockchain Networks: A Deep Dive into Bevel Operator Fabric

Technical Abstract

As a blockchain infrastructure engineer, automating the deployment and management of distributed ledger networks on Kubernetes presents unique challenges. Unlike traditional stateless applications, blockchain nodes require persistent identities, complex state management, and careful coordination across multiple clusters for validator redundancy.

This comprehensive guide explores building production-ready Kubernetes operators for blockchain networks, with a deep dive into the Bevel Operator Fabric architecture. We'll cover Custom Resource Definition (CRD) design patterns, reconciliation loop implementation, multi-cluster deployment strategies, and operational best practices that enable 99%+ uptime for managed blockchain nodes.

Prerequisites:

  • Kubernetes fundamentals (Pods, Services, StatefulSets, Operators)
  • Go programming language basics
  • Understanding of blockchain network architecture
  • Familiarity with Hyperledger Fabric concepts

Target Performance Metrics:

  • Operator startup time: <5 seconds
  • Reconciliation cycle: <30 seconds
  • Managed node uptime: 99%+ under normal conditions

Bevel Operator Fabric Architecture

The Hyperledger Bevel Operator Fabric is a Kubernetes operator that automates the deployment, management, and lifecycle operations of Hyperledger Fabric networks. Let's examine its architecture in detail.

Architecture Overview

Mermaid diagram
Click to expand

Deployment Workflow

The operator manages a complete Hyperledger Fabric network lifecycle through a sequential deployment workflow:

Mermaid diagram
Click to expand

Controller Design Pattern

Each component is managed by a dedicated controller that watches for changes and reconciles state:

package controllers

import (
    "context"
    "time"

    "github.com/go-logr/logr"
    appsv1 "k8s.io/api/apps/v1"
    corev1 "k8s.io/api/core/v1"
    "k8s.io/apimachinery/pkg/api/errors"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/runtime"
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/client"
    "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"

    hlfv1alpha1 "github.com/hyperledger/bevel-operator-fabric/api/v1alpha1"
)

// FabricPeerReconciler reconciles a FabricPeer object
type FabricPeerReconciler struct {
    client.Client
    Log      logr.Logger
    Scheme   *runtime.Scheme
    ChartURL string
}

// +kubebuilder:rbac:groups=hlf.kungfusoftware.es,resources=fabricpeers,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=hlf.kungfusoftware.es,resources=fabricpeers/status,verbs=get;update;patch
// +kubebuilder:rbac:groups=hlf.kungfusoftware.es,resources=fabricpeers/finalizers,verbs=update

func (r *FabricPeerReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    log := r.Log.WithValues("fabricpeer", req.NamespacedName)
    startTime := time.Now()

    defer func() {
        reconcileDuration.WithLabelValues("FabricPeer").Observe(time.Since(startTime).Seconds())
    }()

    // Step 1: Fetch the FabricPeer instance
    peer := &hlfv1alpha1.FabricPeer{}
    if err := r.Get(ctx, req.NamespacedName, peer); err != nil {
        if errors.IsNotFound(err) {
            log.Info("FabricPeer resource not found. Ignoring since object must be deleted")
            return ctrl.Result{}, nil
        }
        return ctrl.Result{}, err
    }

    // Step 2: Handle deletion
    if !peer.ObjectMeta.DeletionTimestamp.IsZero() {
        return r.reconcileDelete(ctx, peer, log)
    }

    // Step 3: Add finalizer if not present
    if !controllerutil.ContainsFinalizer(peer, peerFinalizer) {
        controllerutil.AddFinalizer(peer, peerFinalizer)
        if err := r.Update(ctx, peer); err != nil {
            return ctrl.Result{}, err
        }
        return ctrl.Result{Requeue: true}, nil
    }

    // Step 4: Reconcile peer components
    if err := r.reconcilePeer(ctx, peer, log); err != nil {
        r.setCondition(peer, "Error", metav1.ConditionFalse, "ReconcileError", err.Error())
        r.Status().Update(ctx, peer)
        return ctrl.Result{RequeueAfter: 30 * time.Second}, err
    }

    // Step 5: Update status
    peer.Status.Status = hlfv1alpha1.PeerStateRunning
    r.setCondition(peer, "Ready", metav1.ConditionTrue, "ReconcileSuccess", "Peer is running")
    r.Status().Update(ctx, peer)

    // Requeue for periodic sync
    return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
}

Custom Resource Definitions (CRD) Design Patterns

CRDs are the foundation of any Kubernetes operator. For blockchain networks, we need to model complex relationships between components while maintaining a clear declarative API.

FabricPeer CRD

The FabricPeer CRD defines the desired state for a Hyperledger Fabric peer node:

apiVersion: hlf.kungfusoftware.es/v1alpha1
kind: FabricPeer
metadata:
  name: org1-peer0
  namespace: fabric-network
spec:
  # Core configuration
  image: hyperledger/fabric-peer
  version: 2.5.4
  mspID: Org1MSP

  # Identity enrollment
  secret:
    enrollment:
      component:
        cahost: org1-ca.fabric-network
        caport: 7054
        caname: ca
        catls:
          cacert: |
            -----BEGIN CERTIFICATE-----
            MIICpTCCAkugAwIBAgIUYYjk...
            -----END CERTIFICATE-----
        enrollid: peer0
        enrollsecret: peer0pw
      tls:
        cahost: org1-ca.fabric-network
        caport: 7054
        caname: tlsca
        catls:
          cacert: |
            -----BEGIN CERTIFICATE-----
            MIICpTCCAkugAwIBAgIUYYjk...
            -----END CERTIFICATE-----
        enrollid: peer0
        enrollsecret: peer0pw

  # Storage configuration
  storage:
    peer:
      storageClass: fast-ssd
      size: 50Gi
      accessMode: ReadWriteOnce
    statedb:
      storageClass: fast-ssd
      size: 20Gi
    chaincode:
      storageClass: standard
      size: 10Gi

  # State database
  stateDb: couchdb
  couchdb:
    user: admin
    password: adminpw
    image: couchdb
    version: 3.3.2

  # Gossip protocol
  gossip:
    bootstrap: org1-peer1.fabric-network:7051
    endpoint: org1-peer0.fabric-network:7051
    externalEndpoint: peer0.org1.example.com:7051
    useLeaderElection: true
    orgLeader: false

  # Resource allocation
  resources:
    peer:
      requests:
        cpu: 500m
        memory: 1Gi
      limits:
        cpu: 2000m
        memory: 4Gi
    couchdb:
      requests:
        cpu: 250m
        memory: 512Mi
      limits:
        cpu: 1000m
        memory: 2Gi

  # Network exposure
  service:
    type: ClusterIP

  # External access via Istio
  istio:
    port: 443
    hosts:
      - peer0.org1.example.com
    ingressGateway: ingressgateway

FabricOrderingService CRD

For Raft-based ordering services:

apiVersion: hlf.kungfusoftware.es/v1alpha1
kind: FabricOrderingService
metadata:
  name: orderer-service
  namespace: fabric-network
spec:
  image: hyperledger/fabric-orderer
  version: 2.5.4
  mspID: OrdererMSP

  # Raft cluster nodes
  nodes:
    - id: orderer0
      host: orderer0.example.com
      port: 7050
      adminPort: 7053
      enrollment:
        component:
          cahost: orderer-ca.fabric-network
          caport: 7054
          caname: ca
          enrollid: orderer0
          enrollsecret: orderer0pw
        tls:
          cahost: orderer-ca.fabric-network
          caport: 7054
          caname: tlsca
          enrollid: orderer0
          enrollsecret: orderer0pw
    - id: orderer1
      host: orderer1.example.com
      port: 7050
      # ... similar enrollment config
    - id: orderer2
      host: orderer2.example.com
      port: 7050
      # ... similar enrollment config

  # Raft consensus parameters
  systemChannel:
    name: system-channel
    config:
      orderer:
        etcdRaft:
          options:
            tickInterval: 500ms
            electionTick: 10
            heartbeatTick: 1
            maxInflightBlocks: 5
            snapshotIntervalSize: 16777216  # 16 MB
        batchTimeout: 2s
        batchSize:
          maxMessageCount: 500
          absoluteMaxBytes: 10485760  # 10 MB
          preferredMaxBytes: 2097152  # 2 MB
        capabilities:
          - V2_0
      application:
        capabilities:
          - V2_0

  storage:
    storageClass: fast-ssd
    size: 100Gi
    accessMode: ReadWriteOnce

  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: 2000m
      memory: 4Gi

FabricChannel CRD

Managing channel lifecycle:

apiVersion: hlf.kungfusoftware.es/v1alpha1
kind: FabricMainChannel
metadata:
  name: supply-chain
  namespace: fabric-network
spec:
  name: supply-chain

  # Administrative organizations
  adminOrdererOrganizations:
    - mspID: OrdererMSP
  adminPeerOrganizations:
    - mspID: Org1MSP
    - mspID: Org2MSP

  # Channel configuration
  channelConfig:
    application:
      acls:
        _lifecycle/CheckCommitReadiness: /Channel/Application/Writers
        _lifecycle/CommitChaincodeDefinition: /Channel/Application/Writers
        _lifecycle/QueryChaincodeDefinition: /Channel/Application/Readers
        _lifecycle/QueryChaincodeDefinitions: /Channel/Application/Readers
        lscc/ChaincodeExists: /Channel/Application/Readers
        lscc/GetDeploymentSpec: /Channel/Application/Readers
        lscc/GetChaincodeData: /Channel/Application/Readers
        lscc/GetInstantiatedChaincodes: /Channel/Application/Readers
        qscc/GetChainInfo: /Channel/Application/Readers
        qscc/GetBlockByNumber: /Channel/Application/Readers
        qscc/GetBlockByHash: /Channel/Application/Readers
        qscc/GetTransactionByID: /Channel/Application/Readers
        qscc/GetBlockByTxID: /Channel/Application/Readers
        cscc/GetConfigBlock: /Channel/Application/Readers
        peer/Propose: /Channel/Application/Writers
        peer/ChaincodeToChaincode: /Channel/Application/Writers
        event/Block: /Channel/Application/Readers
        event/FilteredBlock: /Channel/Application/Readers
      capabilities:
        - V2_0
      policies:
        Admins:
          type: ImplicitMeta
          rule: MAJORITY Admins
        Endorsement:
          type: ImplicitMeta
          rule: MAJORITY Endorsement
        LifecycleEndorsement:
          type: ImplicitMeta
          rule: MAJORITY Endorsement
        Readers:
          type: ImplicitMeta
          rule: ANY Readers
        Writers:
          type: ImplicitMeta
          rule: ANY Writers

  # Orderer endpoints
  orderer:
    - host: orderer0.example.com
      port: 7050
      tlsCert: |
        -----BEGIN CERTIFICATE-----
        ...
        -----END CERTIFICATE-----
    - host: orderer1.example.com
      port: 7050
      tlsCert: |
        -----BEGIN CERTIFICATE-----
        ...
        -----END CERTIFICATE-----

  # Participating organizations
  peerOrganizations:
    - mspID: Org1MSP
      caName: org1-ca
      caNamespace: fabric-network
    - mspID: Org2MSP
      caName: org2-ca
      caNamespace: fabric-network

CRD Status Design Pattern

The status subresource tracks the observed state:

// FabricPeerStatus defines the observed state of FabricPeer
type FabricPeerStatus struct {
    // Phase represents the current lifecycle phase
    // +kubebuilder:validation:Enum=Pending;Initializing;Running;Failed;Terminating
    Status PeerState `json:"status"`

    // URL to access the peer gRPC endpoint
    URL string `json:"url,omitempty"`

    // TLS CA certificate for verifying peer connections
    TLSCACert string `json:"tlsCACert,omitempty"`

    // Sign CA certificate for the peer's MSP
    SignCACert string `json:"signCACert,omitempty"`

    // SignCert is the peer's signing certificate
    SignCert string `json:"signCert,omitempty"`

    // Conditions represent the latest available observations
    // +patchMergeKey=type
    // +patchStrategy=merge
    // +listType=map
    // +listMapKey=type
    Conditions []metav1.Condition `json:"conditions,omitempty"`

    // LastReconcileTime is when the resource was last reconciled
    LastReconcileTime *metav1.Time `json:"lastReconcileTime,omitempty"`

    // ObservedGeneration is the most recent generation observed
    ObservedGeneration int64 `json:"observedGeneration,omitempty"`

    // Message provides additional status information
    Message string `json:"message,omitempty"`
}

// PeerState represents the lifecycle phase of a peer
type PeerState string

const (
    PeerStatePending     PeerState = "Pending"
    PeerStateInitializing PeerState = "Initializing"
    PeerStateRunning     PeerState = "Running"
    PeerStateFailed      PeerState = "Failed"
    PeerStateTerminating PeerState = "Terminating"
)

Reconciliation Loop Implementation

The reconciliation loop is the heart of any operator. For blockchain nodes, it must handle complex state transitions while maintaining idempotency.

Phased Reconciliation Pattern

Mermaid diagram
Click to expand

Complete Reconciliation Implementation

package controllers

import (
    "context"
    "fmt"
    "time"

    "github.com/go-logr/logr"
    appsv1 "k8s.io/api/apps/v1"
    corev1 "k8s.io/api/core/v1"
    networkingv1 "k8s.io/api/networking/v1"
    "k8s.io/apimachinery/pkg/api/errors"
    "k8s.io/apimachinery/pkg/api/resource"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/runtime"
    "k8s.io/apimachinery/pkg/types"
    "k8s.io/apimachinery/pkg/util/intstr"
    "k8s.io/utils/pointer"
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/client"
    "sigs.k8s.io/controller-runtime/pkg/controller"
    "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"

    hlfv1alpha1 "github.com/hyperledger/bevel-operator-fabric/api/v1alpha1"
)

const (
    peerFinalizer = "hlf.kungfusoftware.es/peer-finalizer"
)

func (r *FabricPeerReconciler) reconcilePeer(ctx context.Context, peer *hlfv1alpha1.FabricPeer, log logr.Logger) error {
    // Phase 1: Reconcile Certificates
    log.Info("Reconciling certificates")
    certs, err := r.reconcileCertificates(ctx, peer)
    if err != nil {
        return fmt.Errorf("failed to reconcile certificates: %w", err)
    }
    r.setCondition(peer, "CertificatesReady", metav1.ConditionTrue, "CertificatesEnrolled", "Certificates enrolled successfully")

    // Phase 2: Reconcile ConfigMaps
    log.Info("Reconciling ConfigMaps")
    if err := r.reconcileConfigMaps(ctx, peer, certs); err != nil {
        return fmt.Errorf("failed to reconcile configmaps: %w", err)
    }
    r.setCondition(peer, "ConfigReady", metav1.ConditionTrue, "ConfigCreated", "Configuration created")

    // Phase 3: Reconcile Secrets
    log.Info("Reconciling Secrets")
    if err := r.reconcileSecrets(ctx, peer, certs); err != nil {
        return fmt.Errorf("failed to reconcile secrets: %w", err)
    }

    // Phase 4: Reconcile PVCs
    log.Info("Reconciling PersistentVolumeClaims")
    if err := r.reconcilePVCs(ctx, peer); err != nil {
        return fmt.Errorf("failed to reconcile PVCs: %w", err)
    }
    r.setCondition(peer, "StorageReady", metav1.ConditionTrue, "StorageProvisioned", "Storage provisioned")

    // Phase 5: Reconcile CouchDB (if configured)
    if peer.Spec.StateDb == "couchdb" {
        log.Info("Reconciling CouchDB")
        if err := r.reconcileCouchDB(ctx, peer); err != nil {
            return fmt.Errorf("failed to reconcile CouchDB: %w", err)
        }
    }

    // Phase 6: Reconcile StatefulSet
    log.Info("Reconciling StatefulSet")
    if err := r.reconcileStatefulSet(ctx, peer); err != nil {
        return fmt.Errorf("failed to reconcile StatefulSet: %w", err)
    }

    // Phase 7: Reconcile Services
    log.Info("Reconciling Services")
    if err := r.reconcileServices(ctx, peer); err != nil {
        return fmt.Errorf("failed to reconcile Services: %w", err)
    }

    // Phase 8: Reconcile Network Policies
    log.Info("Reconciling NetworkPolicies")
    if err := r.reconcileNetworkPolicies(ctx, peer); err != nil {
        return fmt.Errorf("failed to reconcile NetworkPolicies: %w", err)
    }

    // Phase 9: Verify deployment is ready
    log.Info("Verifying deployment readiness")
    ready, err := r.checkDeploymentReady(ctx, peer)
    if err != nil {
        return fmt.Errorf("failed to check deployment readiness: %w", err)
    }

    if !ready {
        r.setCondition(peer, "Ready", metav1.ConditionFalse, "WaitingForPods", "Waiting for pods to be ready")
        return nil
    }

    r.setCondition(peer, "Ready", metav1.ConditionTrue, "AllReplicasReady", "All replicas are ready")
    return nil
}

func (r *FabricPeerReconciler) reconcileCertificates(ctx context.Context, peer *hlfv1alpha1.FabricPeer) (*CertificateBundle, error) {
    // Connect to Fabric CA and enroll certificates
    caClient, err := r.createCAClient(ctx, peer.Spec.Secret.Enrollment.Component)
    if err != nil {
        return nil, fmt.Errorf("failed to create CA client: %w", err)
    }

    // Enroll component certificate
    componentCert, err := caClient.Enroll(
        peer.Spec.Secret.Enrollment.Component.EnrollID,
        peer.Spec.Secret.Enrollment.Component.EnrollSecret,
    )
    if err != nil {
        return nil, fmt.Errorf("failed to enroll component certificate: %w", err)
    }

    // Enroll TLS certificate
    tlsCAClient, err := r.createCAClient(ctx, peer.Spec.Secret.Enrollment.TLS)
    if err != nil {
        return nil, fmt.Errorf("failed to create TLS CA client: %w", err)
    }

    tlsCert, err := tlsCAClient.Enroll(
        peer.Spec.Secret.Enrollment.TLS.EnrollID,
        peer.Spec.Secret.Enrollment.TLS.EnrollSecret,
    )
    if err != nil {
        return nil, fmt.Errorf("failed to enroll TLS certificate: %w", err)
    }

    return &CertificateBundle{
        SignCert:   componentCert.Certificate,
        SignKey:    componentCert.PrivateKey,
        SignCACert: componentCert.CACertificate,
        TLSCert:    tlsCert.Certificate,
        TLSKey:     tlsCert.PrivateKey,
        TLSCACert:  tlsCert.CACertificate,
    }, nil
}

func (r *FabricPeerReconciler) reconcileStatefulSet(ctx context.Context, peer *hlfv1alpha1.FabricPeer) error {
    sts := r.buildStatefulSet(peer)

    // Set owner reference for garbage collection
    if err := controllerutil.SetControllerReference(peer, sts, r.Scheme); err != nil {
        return fmt.Errorf("failed to set controller reference: %w", err)
    }

    // Create or update StatefulSet
    existing := &appsv1.StatefulSet{}
    err := r.Get(ctx, types.NamespacedName{Name: sts.Name, Namespace: sts.Namespace}, existing)
    if err != nil {
        if errors.IsNotFound(err) {
            return r.Create(ctx, sts)
        }
        return err
    }

    // Update existing StatefulSet
    existing.Spec = sts.Spec
    return r.Update(ctx, existing)
}

func (r *FabricPeerReconciler) buildStatefulSet(peer *hlfv1alpha1.FabricPeer) *appsv1.StatefulSet {
    labels := r.buildLabels(peer)

    // Security context for blockchain nodes
    securityContext := &corev1.PodSecurityContext{
        RunAsNonRoot: pointer.Bool(true),
        RunAsUser:    pointer.Int64(1000),
        RunAsGroup:   pointer.Int64(1000),
        FSGroup:      pointer.Int64(1000),
        SeccompProfile: &corev1.SeccompProfile{
            Type: corev1.SeccompProfileTypeRuntimeDefault,
        },
    }

    containerSecurityContext := &corev1.SecurityContext{
        AllowPrivilegeEscalation: pointer.Bool(false),
        ReadOnlyRootFilesystem:   pointer.Bool(false), // Fabric peer needs writable fs
        RunAsNonRoot:             pointer.Bool(true),
        RunAsUser:                pointer.Int64(1000),
        Capabilities: &corev1.Capabilities{
            Drop: []corev1.Capability{"ALL"},
        },
    }

    // Build peer container
    peerContainer := corev1.Container{
        Name:            "peer",
        Image:           fmt.Sprintf("%s:%s", peer.Spec.Image, peer.Spec.Version),
        ImagePullPolicy: corev1.PullIfNotPresent,
        SecurityContext: containerSecurityContext,
        Env:             r.buildPeerEnvVars(peer),
        Ports: []corev1.ContainerPort{
            {Name: "grpc", ContainerPort: 7051, Protocol: corev1.ProtocolTCP},
            {Name: "chaincode", ContainerPort: 7052, Protocol: corev1.ProtocolTCP},
            {Name: "operations", ContainerPort: 9443, Protocol: corev1.ProtocolTCP},
        },
        VolumeMounts: []corev1.VolumeMount{
            {Name: "peer-data", MountPath: "/var/hyperledger/production"},
            {Name: "peer-msp", MountPath: "/etc/hyperledger/fabric/msp"},
            {Name: "peer-tls", MountPath: "/etc/hyperledger/fabric/tls"},
            {Name: "core-config", MountPath: "/etc/hyperledger/fabric/core.yaml", SubPath: "core.yaml"},
            {Name: "chaincode-data", MountPath: "/var/hyperledger/production/externalbuilder"},
        },
        Resources:      peer.Spec.Resources.Peer,
        LivenessProbe:  r.buildLivenessProbe(),
        ReadinessProbe: r.buildReadinessProbe(),
        StartupProbe:   r.buildStartupProbe(),
    }

    // Build containers list
    containers := []corev1.Container{peerContainer}

    // Add CouchDB sidecar if configured
    if peer.Spec.StateDb == "couchdb" {
        containers = append(containers, r.buildCouchDBContainer(peer))
    }

    replicas := int32(1)

    return &appsv1.StatefulSet{
        ObjectMeta: metav1.ObjectMeta{
            Name:      peer.Name,
            Namespace: peer.Namespace,
            Labels:    labels,
        },
        Spec: appsv1.StatefulSetSpec{
            Replicas: &replicas,
            Selector: &metav1.LabelSelector{
                MatchLabels: labels,
            },
            ServiceName: peer.Name,
            Template: corev1.PodTemplateSpec{
                ObjectMeta: metav1.ObjectMeta{
                    Labels:      labels,
                    Annotations: r.buildPodAnnotations(peer),
                },
                Spec: corev1.PodSpec{
                    SecurityContext:    securityContext,
                    ServiceAccountName: peer.Name,
                    Containers:         containers,
                    Volumes:            r.buildVolumes(peer),
                    Affinity:           r.buildAffinity(peer),
                    TopologySpreadConstraints: []corev1.TopologySpreadConstraint{
                        {
                            MaxSkew:           1,
                            TopologyKey:       "topology.kubernetes.io/zone",
                            WhenUnsatisfiable: corev1.ScheduleAnyway,
                            LabelSelector: &metav1.LabelSelector{
                                MatchLabels: labels,
                            },
                        },
                    },
                },
            },
            VolumeClaimTemplates: r.buildVolumeClaimTemplates(peer),
            PodManagementPolicy:  appsv1.OrderedReadyPodManagement,
            UpdateStrategy: appsv1.StatefulSetUpdateStrategy{
                Type: appsv1.RollingUpdateStatefulSetStrategyType,
                RollingUpdate: &appsv1.RollingUpdateStatefulSetStrategy{
                    Partition: pointer.Int32(0),
                },
            },
        },
    }
}

func (r *FabricPeerReconciler) buildLivenessProbe() *corev1.Probe {
    return &corev1.Probe{
        ProbeHandler: corev1.ProbeHandler{
            HTTPGet: &corev1.HTTPGetAction{
                Path:   "/healthz",
                Port:   intstr.FromInt(9443),
                Scheme: corev1.URISchemeHTTPS,
            },
        },
        InitialDelaySeconds: 60,
        PeriodSeconds:       30,
        TimeoutSeconds:      10,
        FailureThreshold:    3,
        SuccessThreshold:    1,
    }
}

func (r *FabricPeerReconciler) buildReadinessProbe() *corev1.Probe {
    return &corev1.Probe{
        ProbeHandler: corev1.ProbeHandler{
            HTTPGet: &corev1.HTTPGetAction{
                Path:   "/healthz",
                Port:   intstr.FromInt(9443),
                Scheme: corev1.URISchemeHTTPS,
            },
        },
        InitialDelaySeconds: 30,
        PeriodSeconds:       15,
        TimeoutSeconds:      10,
        FailureThreshold:    3,
        SuccessThreshold:    1,
    }
}

func (r *FabricPeerReconciler) buildStartupProbe() *corev1.Probe {
    return &corev1.Probe{
        ProbeHandler: corev1.ProbeHandler{
            HTTPGet: &corev1.HTTPGetAction{
                Path:   "/healthz",
                Port:   intstr.FromInt(9443),
                Scheme: corev1.URISchemeHTTPS,
            },
        },
        InitialDelaySeconds: 10,
        PeriodSeconds:       10,
        TimeoutSeconds:      5,
        FailureThreshold:    30, // Allow up to 5 minutes for startup
        SuccessThreshold:    1,
    }
}

func (r *FabricPeerReconciler) reconcileDelete(ctx context.Context, peer *hlfv1alpha1.FabricPeer, log logr.Logger) (ctrl.Result, error) {
    log.Info("Starting deletion reconciliation")

    if !controllerutil.ContainsFinalizer(peer, peerFinalizer) {
        return ctrl.Result{}, nil
    }

    // Update phase to terminating
    peer.Status.Status = hlfv1alpha1.PeerStateTerminating
    if err := r.Status().Update(ctx, peer); err != nil {
        log.Error(err, "Failed to update status to terminating")
    }

    // Step 1: Graceful shutdown - notify other peers
    log.Info("Performing graceful shutdown")
    if err := r.gracefulShutdown(ctx, peer); err != nil {
        log.Error(err, "Graceful shutdown encountered errors, continuing with cleanup")
    }

    // Step 2: Clean up external resources
    log.Info("Cleaning up external resources")
    if err := r.cleanupExternalResources(ctx, peer); err != nil {
        log.Error(err, "Failed to cleanup external resources")
        return ctrl.Result{RequeueAfter: 10 * time.Second}, nil
    }

    // Step 3: Clean up certificates from CA (if configured)
    log.Info("Revoking certificates")
    if err := r.revokeCertificates(ctx, peer); err != nil {
        log.Error(err, "Failed to revoke certificates, continuing")
    }

    // Step 4: Remove finalizer
    controllerutil.RemoveFinalizer(peer, peerFinalizer)
    if err := r.Update(ctx, peer); err != nil {
        return ctrl.Result{}, err
    }

    log.Info("Successfully deleted FabricPeer")
    return ctrl.Result{}, nil
}

// SetupWithManager sets up the controller with the Manager
func (r *FabricPeerReconciler) SetupWithManager(mgr ctrl.Manager) error {
    return ctrl.NewControllerManagedBy(mgr).
        For(&hlfv1alpha1.FabricPeer{}).
        Owns(&appsv1.StatefulSet{}).
        Owns(&corev1.Service{}).
        Owns(&corev1.ConfigMap{}).
        Owns(&corev1.Secret{}).
        Owns(&corev1.PersistentVolumeClaim{}).
        Owns(&networkingv1.NetworkPolicy{}).
        WithOptions(controller.Options{
            MaxConcurrentReconciles: 3,
        }).
        Complete(r)
}

Multi-Cluster Deployment Strategies

Blockchain validators require geographic distribution for resilience and regulatory compliance. Let's explore multi-cluster deployment patterns.

Architecture Overview

Mermaid diagram
Click to expand

FederatedBlockchainNetwork CRD

apiVersion: blockchain.example.com/v1alpha1
kind: FederatedBlockchainNetwork
metadata:
  name: global-fabric-network
  namespace: blockchain-control-plane
spec:
  # Network configuration
  network:
    type: HyperledgerFabric
    version: "2.5"
    consensus: etcdraft

  # Cluster distribution
  clusters:
    - name: us-east-1
      kubeconfig:
        secretRef:
          name: cluster-us-east-kubeconfig
          namespace: blockchain-control-plane
      roles:
        - orderer
        - peer
      nodeDistribution:
        orderers:
          count: 2
          ordinalRange: "0-1"
        peers:
          organizations:
            - mspID: Org1MSP
              count: 2
      zone: us-east-1
      priority: 1

    - name: us-west-2
      kubeconfig:
        secretRef:
          name: cluster-us-west-kubeconfig
          namespace: blockchain-control-plane
      roles:
        - orderer
        - peer
      nodeDistribution:
        orderers:
          count: 2
          ordinalRange: "2-3"
        peers:
          organizations:
            - mspID: Org2MSP
              count: 2
      zone: us-west-2
      priority: 2

    - name: eu-west-1
      kubeconfig:
        secretRef:
          name: cluster-eu-west-kubeconfig
          namespace: blockchain-control-plane
      roles:
        - orderer
        - peer
      nodeDistribution:
        orderers:
          count: 1
          ordinalRange: "4"
        peers:
          organizations:
            - mspID: Org3MSP
              count: 2
      zone: eu-west-1
      priority: 3

  # Cross-cluster networking
  networking:
    type: serviceMesh
    meshProvider: istio
    mTLS: strict
    crossClusterDNS:
      enabled: true
      domain: fabric.global
    serviceDiscovery:
      type: DNS
      refreshInterval: 30s

  # Consensus safety
  consensusSafety:
    quorumPolicy:
      minClusters: 2
      minOrderers: 3
    splitBrainPrevention:
      enabled: true
      witnessService:
        endpoints:
          - https://witness-1.example.com
          - https://witness-2.example.com
          - https://witness-3.example.com
        consensusThreshold: 2

  # Disaster recovery
  disasterRecovery:
    backupSchedule:
      full: "0 2 * * *"        # Daily at 2 AM
      incremental: "0 * * * *"  # Hourly
    retention:
      full: 7d
      incremental: 24h
    crossClusterReplication:
      enabled: true
      mode: async
      maxLagBlocks: 100

Cross-Cluster Service Discovery

package federation

import (
    "context"
    "fmt"
    "sync"
    "time"

    "k8s.io/apimachinery/pkg/types"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

// ClusterEndpoint represents a service endpoint in a specific cluster
type ClusterEndpoint struct {
    ClusterName string
    Address     string
    Port        int32
    TLSCert     string
    Healthy     bool
    LastCheck   time.Time
}

// CrossClusterServiceDiscovery manages service discovery across clusters
type CrossClusterServiceDiscovery struct {
    mu            sync.RWMutex
    clusterClients map[string]client.Client
    endpoints     map[string][]ClusterEndpoint
    refreshTicker *time.Ticker
}

func NewCrossClusterServiceDiscovery(clients map[string]client.Client, refreshInterval time.Duration) *CrossClusterServiceDiscovery {
    sd := &CrossClusterServiceDiscovery{
        clusterClients: clients,
        endpoints:     make(map[string][]ClusterEndpoint),
        refreshTicker: time.NewTicker(refreshInterval),
    }

    go sd.refreshLoop()
    return sd
}

func (sd *CrossClusterServiceDiscovery) refreshLoop() {
    for range sd.refreshTicker.C {
        ctx := context.Background()
        sd.RefreshEndpoints(ctx)
    }
}

func (sd *CrossClusterServiceDiscovery) RefreshEndpoints(ctx context.Context) error {
    sd.mu.Lock()
    defer sd.mu.Unlock()

    for clusterName, clusterClient := range sd.clusterClients {
        // List all FabricPeer services in the cluster
        services := &corev1.ServiceList{}
        if err := clusterClient.List(ctx, services,
            client.MatchingLabels{"app.kubernetes.io/component": "peer"}); err != nil {
            continue
        }

        for _, svc := range services.Items {
            endpoint := ClusterEndpoint{
                ClusterName: clusterName,
                Address:     fmt.Sprintf("%s.%s.svc.cluster.local", svc.Name, svc.Namespace),
                Port:        7051,
                Healthy:     true,
                LastCheck:   time.Now(),
            }

            // Get TLS certificate from associated secret
            secret := &corev1.Secret{}
            if err := clusterClient.Get(ctx, types.NamespacedName{
                Name:      svc.Name + "-tls",
                Namespace: svc.Namespace,
            }, secret); err == nil {
                endpoint.TLSCert = string(secret.Data["tls.crt"])
            }

            key := fmt.Sprintf("%s/%s", svc.Namespace, svc.Name)
            sd.endpoints[key] = append(sd.endpoints[key], endpoint)
        }
    }

    return nil
}

// GetEndpoints returns all known endpoints for a service
func (sd *CrossClusterServiceDiscovery) GetEndpoints(namespace, name string) []ClusterEndpoint {
    sd.mu.RLock()
    defer sd.mu.RUnlock()

    key := fmt.Sprintf("%s/%s", namespace, name)
    return sd.endpoints[key]
}

// GetHealthyEndpoints returns only healthy endpoints
func (sd *CrossClusterServiceDiscovery) GetHealthyEndpoints(namespace, name string) []ClusterEndpoint {
    endpoints := sd.GetEndpoints(namespace, name)
    healthy := make([]ClusterEndpoint, 0)

    for _, ep := range endpoints {
        if ep.Healthy {
            healthy = append(healthy, ep)
        }
    }

    return healthy
}

Split-Brain Prevention

package federation

import (
    "context"
    "fmt"
    "sync"
    "time"
)

// QuorumPolicy defines the requirements for maintaining consensus safety
type QuorumPolicy struct {
    MinClusters    int
    MinOrderers    int
    CheckInterval  time.Duration
    GracePeriod    time.Duration
}

// SplitBrainPrevention prevents consensus divergence in multi-cluster deployments
type SplitBrainPrevention struct {
    policy          QuorumPolicy
    clusterClients  map[string]client.Client
    witnessEndpoints []string
    mu              sync.RWMutex
    fenced          map[string]bool
}

func NewSplitBrainPrevention(policy QuorumPolicy, clients map[string]client.Client, witnesses []string) *SplitBrainPrevention {
    sbp := &SplitBrainPrevention{
        policy:           policy,
        clusterClients:   clients,
        witnessEndpoints: witnesses,
        fenced:           make(map[string]bool),
    }

    go sbp.monitorLoop()
    return sbp
}

func (sbp *SplitBrainPrevention) monitorLoop() {
    ticker := time.NewTicker(sbp.policy.CheckInterval)
    defer ticker.Stop()

    for range ticker.C {
        ctx := context.Background()
        sbp.checkQuorum(ctx)
    }
}

func (sbp *SplitBrainPrevention) checkQuorum(ctx context.Context) {
    sbp.mu.Lock()
    defer sbp.mu.Unlock()

    // Check cluster connectivity
    reachableClusters := 0
    reachableOrderers := 0

    for clusterName, clusterClient := range sbp.clusterClients {
        // Check if cluster is reachable
        if sbp.isClusterReachable(ctx, clusterClient) {
            reachableClusters++

            // Count healthy orderers in this cluster
            orderers, err := sbp.countHealthyOrderers(ctx, clusterClient)
            if err == nil {
                reachableOrderers += orderers
            }
        } else {
            // Cluster unreachable - fence its nodes
            sbp.fenceCluster(clusterName)
        }
    }

    // Check if we have quorum
    hasQuorum := reachableClusters >= sbp.policy.MinClusters &&
                 reachableOrderers >= sbp.policy.MinOrderers

    if !hasQuorum {
        // We don't have quorum - consult external witnesses
        witnessVotes := sbp.consultWitnesses(ctx)

        if witnessVotes < len(sbp.witnessEndpoints)/2+1 {
            // Witnesses confirm we're in a minority partition
            // Fence all local orderers
            sbp.fenceAllLocal()
        }
    }
}

func (sbp *SplitBrainPrevention) consultWitnesses(ctx context.Context) int {
    votes := 0
    var wg sync.WaitGroup
    var mu sync.Mutex

    for _, endpoint := range sbp.witnessEndpoints {
        wg.Add(1)
        go func(ep string) {
            defer wg.Done()

            if sbp.getWitnessVote(ctx, ep) {
                mu.Lock()
                votes++
                mu.Unlock()
            }
        }(endpoint)
    }

    wg.Wait()
    return votes
}

func (sbp *SplitBrainPrevention) fenceCluster(clusterName string) {
    sbp.fenced[clusterName] = true

    // Emit event and alert
    log.Info("Fencing cluster due to network partition", "cluster", clusterName)

    // Could also:
    // 1. Scale down orderer replicas
    // 2. Update NetworkPolicy to block traffic
    // 3. Send alerts to operators
}

func (sbp *SplitBrainPrevention) unfenceCluster(clusterName string) {
    sbp.fenced[clusterName] = false
    log.Info("Unfencing cluster after network recovery", "cluster", clusterName)
}

func (sbp *SplitBrainPrevention) IsFenced(clusterName string) bool {
    sbp.mu.RLock()
    defer sbp.mu.RUnlock()
    return sbp.fenced[clusterName]
}

StatefulSet and PersistentVolume Coordination

Blockchain nodes require careful coordination of persistent storage to maintain ledger integrity.

Storage Configuration

apiVersion: v1
kind: StorageClass
metadata:
  name: blockchain-fast-ssd
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp3
  iops: "16000"
  throughput: "1000"
  encrypted: "true"
  fsType: ext4
reclaimPolicy: Retain  # Critical for blockchain data
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: blockchain-snapshot
driver: ebs.csi.aws.com
deletionPolicy: Retain
parameters:
  tagSpecification_1: "Name=blockchain-backup"
  tagSpecification_2: "Environment=production"

PVC Reconciliation

func (r *FabricPeerReconciler) reconcilePVCs(ctx context.Context, peer *hlfv1alpha1.FabricPeer) error {
    pvcs := []struct {
        name        string
        storageSpec hlfv1alpha1.StorageSpec
    }{
        {"peer-data", peer.Spec.Storage.Peer},
        {"statedb-data", peer.Spec.Storage.StateDB},
        {"chaincode-data", peer.Spec.Storage.Chaincode},
    }

    for _, pvcSpec := range pvcs {
        pvc := &corev1.PersistentVolumeClaim{
            ObjectMeta: metav1.ObjectMeta{
                Name:      fmt.Sprintf("%s-%s", peer.Name, pvcSpec.name),
                Namespace: peer.Namespace,
                Labels:    r.buildLabels(peer),
                Annotations: map[string]string{
                    "blockchain.example.com/component": "peer",
                    "blockchain.example.com/mspid":     peer.Spec.MspID,
                },
            },
            Spec: corev1.PersistentVolumeClaimSpec{
                AccessModes: []corev1.PersistentVolumeAccessMode{
                    corev1.ReadWriteOnce,
                },
                StorageClassName: &pvcSpec.storageSpec.StorageClass,
                Resources: corev1.ResourceRequirements{
                    Requests: corev1.ResourceList{
                        corev1.ResourceStorage: pvcSpec.storageSpec.Size,
                    },
                },
            },
        }

        if err := controllerutil.SetControllerReference(peer, pvc, r.Scheme); err != nil {
            return err
        }

        existing := &corev1.PersistentVolumeClaim{}
        err := r.Get(ctx, types.NamespacedName{Name: pvc.Name, Namespace: pvc.Namespace}, existing)
        if err != nil {
            if errors.IsNotFound(err) {
                if err := r.Create(ctx, pvc); err != nil {
                    return fmt.Errorf("failed to create PVC %s: %w", pvc.Name, err)
                }
                continue
            }
            return err
        }

        // Check if PVC is bound
        if existing.Status.Phase != corev1.ClaimBound {
            return fmt.Errorf("PVC %s is not bound (phase: %s)", existing.Name, existing.Status.Phase)
        }
    }

    return nil
}

Backup and Disaster Recovery

apiVersion: blockchain.example.com/v1alpha1
kind: LedgerBackup
metadata:
  name: org1-peer0-backup
  namespace: fabric-network
spec:
  source:
    peerRef:
      name: org1-peer0
      namespace: fabric-network
    dataPath: /var/hyperledger/production

  schedule:
    # Full backup daily at 2 AM UTC
    full:
      cron: "0 2 * * *"
      retention: 7d
    # Incremental backup every 4 hours
    incremental:
      cron: "0 */4 * * *"
      retention: 48h
    # Snapshot before any upgrade
    preUpgrade:
      enabled: true

  destination:
    primary:
      type: S3
      bucket: blockchain-backups-primary
      region: us-east-1
      encryption:
        type: aws:kms
        keyId: arn:aws:kms:us-east-1:123456789:key/abc-123
    secondary:
      type: GCS
      bucket: blockchain-backups-dr
      region: eu-west1
      encryption:
        type: customer-managed
        keySecretRef:
          name: backup-encryption-key

  verification:
    enabled: true
    checksumAlgorithm: SHA256
    verifyRestorability: true
    testRestoreSchedule: "0 4 * * 0"  # Weekly on Sunday

Network Policies and Service Discovery

Default Deny Policy

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: fabric-network
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress

Peer Network Policy

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: fabric-peer-policy
  namespace: fabric-network
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/component: peer
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Allow gRPC from other peers (gossip)
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/component: peer
      ports:
        - protocol: TCP
          port: 7051
    # Allow gRPC from orderers
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/component: orderer
      ports:
        - protocol: TCP
          port: 7051
    # Allow chaincode connections
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/component: chaincode
      ports:
        - protocol: TCP
          port: 7052
    # Allow operations/metrics from monitoring
    - from:
        - namespaceSelector:
            matchLabels:
              name: monitoring
          podSelector:
            matchLabels:
              app: prometheus
      ports:
        - protocol: TCP
          port: 9443
    # Allow external access via ingress controller
    - from:
        - namespaceSelector:
            matchLabels:
              name: istio-system
      ports:
        - protocol: TCP
          port: 7051
  egress:
    # Allow DNS
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
    # Allow connection to orderers
    - to:
        - podSelector:
            matchLabels:
              app.kubernetes.io/component: orderer
      ports:
        - protocol: TCP
          port: 7050
    # Allow gossip to other peers
    - to:
        - podSelector:
            matchLabels:
              app.kubernetes.io/component: peer
      ports:
        - protocol: TCP
          port: 7051
    # Allow connection to CA for enrollment
    - to:
        - podSelector:
            matchLabels:
              app.kubernetes.io/component: ca
      ports:
        - protocol: TCP
          port: 7054
    # Allow external peer communication (cross-cluster)
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 169.254.0.0/16
              - 10.0.0.0/8
              - 172.16.0.0/12
              - 192.168.0.0/16
      ports:
        - protocol: TCP
          port: 7051

Orderer Network Policy

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: fabric-orderer-policy
  namespace: fabric-network
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/component: orderer
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Allow from peers for transaction submission
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/component: peer
      ports:
        - protocol: TCP
          port: 7050
    # Allow from other orderers (Raft consensus)
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/component: orderer
      ports:
        - protocol: TCP
          port: 7050
    # Allow admin operations
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/component: orderer
      ports:
        - protocol: TCP
          port: 7053
    # Allow metrics scraping
    - from:
        - namespaceSelector:
            matchLabels:
              name: monitoring
      ports:
        - protocol: TCP
          port: 9443
  egress:
    # Allow DNS
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
    # Allow Raft communication to other orderers
    - to:
        - podSelector:
            matchLabels:
              app.kubernetes.io/component: orderer
      ports:
        - protocol: TCP
          port: 7050
    # Allow CA enrollment
    - to:
        - podSelector:
            matchLabels:
              app.kubernetes.io/component: ca
      ports:
        - protocol: TCP
          port: 7054
    # Allow cross-cluster Raft communication
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
      ports:
        - protocol: TCP
          port: 7050

Monitoring and Alerting Integration

Prometheus Metrics Implementation

package metrics

import (
    "github.com/prometheus/client_golang/prometheus"
    "sigs.k8s.io/controller-runtime/pkg/metrics"
)

var (
    // Reconciliation metrics
    reconcileTotal = prometheus.NewCounterVec(
        prometheus.CounterOpts{
            Name: "blockchain_operator_reconcile_total",
            Help: "Total number of reconciliations per controller",
        },
        []string{"controller", "result"},
    )

    reconcileDuration = prometheus.NewHistogramVec(
        prometheus.HistogramOpts{
            Name:    "blockchain_operator_reconcile_duration_seconds",
            Help:    "Duration of reconciliation in seconds",
            Buckets: []float64{0.1, 0.5, 1, 2, 5, 10, 30, 60},
        },
        []string{"controller"},
    )

    reconcileErrors = prometheus.NewCounterVec(
        prometheus.CounterOpts{
            Name: "blockchain_operator_reconcile_errors_total",
            Help: "Total number of reconciliation errors",
        },
        []string{"controller", "error_type"},
    )

    // Blockchain node metrics
    nodeStatus = prometheus.NewGaugeVec(
        prometheus.GaugeOpts{
            Name: "blockchain_node_status",
            Help: "Current status of blockchain nodes (0=Unknown, 1=Pending, 2=Initializing, 3=Running, 4=Failed, 5=Terminating)",
        },
        []string{"namespace", "name", "component", "mspid"},
    )

    nodeBlockHeight = prometheus.NewGaugeVec(
        prometheus.GaugeOpts{
            Name: "blockchain_node_block_height",
            Help: "Current block height of blockchain nodes",
        },
        []string{"namespace", "name", "channel"},
    )

    nodePeerCount = prometheus.NewGaugeVec(
        prometheus.GaugeOpts{
            Name: "blockchain_node_peer_count",
            Help: "Number of connected peers",
        },
        []string{"namespace", "name"},
    )

    nodeTransactionCount = prometheus.NewCounterVec(
        prometheus.CounterOpts{
            Name: "blockchain_node_transactions_total",
            Help: "Total number of transactions processed",
        },
        []string{"namespace", "name", "channel", "status"},
    )

    // Consensus metrics
    raftLeaderChanges = prometheus.NewCounterVec(
        prometheus.CounterOpts{
            Name: "blockchain_raft_leader_changes_total",
            Help: "Total number of Raft leader changes",
        },
        []string{"namespace", "name", "channel"},
    )

    raftProposalFailures = prometheus.NewCounterVec(
        prometheus.CounterOpts{
            Name: "blockchain_raft_proposal_failures_total",
            Help: "Total number of Raft proposal failures",
        },
        []string{"namespace", "name", "channel"},
    )

    // Multi-cluster metrics
    crossClusterLatency = prometheus.NewHistogramVec(
        prometheus.HistogramOpts{
            Name:    "blockchain_cross_cluster_latency_seconds",
            Help:    "Latency of cross-cluster communication",
            Buckets: []float64{0.01, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10},
        },
        []string{"source_cluster", "target_cluster"},
    )

    clusterQuorumStatus = prometheus.NewGaugeVec(
        prometheus.GaugeOpts{
            Name: "blockchain_cluster_quorum_status",
            Help: "Quorum status (1=has quorum, 0=no quorum)",
        },
        []string{"cluster"},
    )
)

func init() {
    metrics.Registry.MustRegister(
        reconcileTotal,
        reconcileDuration,
        reconcileErrors,
        nodeStatus,
        nodeBlockHeight,
        nodePeerCount,
        nodeTransactionCount,
        raftLeaderChanges,
        raftProposalFailures,
        crossClusterLatency,
        clusterQuorumStatus,
    )
}

// Helper functions for recording metrics
func RecordReconcile(controller, result string, duration float64) {
    reconcileTotal.WithLabelValues(controller, result).Inc()
    reconcileDuration.WithLabelValues(controller).Observe(duration)
}

func RecordReconcileError(controller, errorType string) {
    reconcileErrors.WithLabelValues(controller, errorType).Inc()
}

func UpdateNodeStatus(namespace, name, component, mspid string, status int) {
    nodeStatus.WithLabelValues(namespace, name, component, mspid).Set(float64(status))
}

func UpdateBlockHeight(namespace, name, channel string, height int64) {
    nodeBlockHeight.WithLabelValues(namespace, name, channel).Set(float64(height))
}

ServiceMonitor Configuration

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: blockchain-operator
  namespace: monitoring
  labels:
    team: blockchain
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: blockchain-operator
  namespaceSelector:
    matchNames:
      - blockchain-operator-system
  endpoints:
    - port: metrics
      interval: 15s
      path: /metrics
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: fabric-peers
  namespace: monitoring
  labels:
    team: blockchain
spec:
  selector:
    matchLabels:
      app.kubernetes.io/component: peer
  namespaceSelector:
    any: true
  endpoints:
    - port: operations
      interval: 30s
      path: /metrics
      scheme: https
      tlsConfig:
        insecureSkipVerify: true
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: fabric-orderers
  namespace: monitoring
  labels:
    team: blockchain
spec:
  selector:
    matchLabels:
      app.kubernetes.io/component: orderer
  namespaceSelector:
    any: true
  endpoints:
    - port: operations
      interval: 30s
      path: /metrics
      scheme: https
      tlsConfig:
        insecureSkipVerify: true

PrometheusRule for Alerts

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: blockchain-alerts
  namespace: monitoring
spec:
  groups:
    - name: blockchain-operator
      rules:
        - alert: BlockchainOperatorReconcileErrors
          expr: |
            rate(blockchain_operator_reconcile_errors_total[5m]) > 0.1
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "High reconciliation error rate"
            description: "Controller {{ $labels.controller }} has error rate {{ $value }}/s"

        - alert: BlockchainOperatorReconcileSlow
          expr: |
            histogram_quantile(0.99, rate(blockchain_operator_reconcile_duration_seconds_bucket[5m])) > 30
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Slow reconciliation detected"
            description: "P99 reconcile duration is {{ $value }}s"

    - name: blockchain-nodes
      rules:
        - alert: BlockchainNodeDown
          expr: |
            blockchain_node_status != 3
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "Blockchain node is not running"
            description: "Node {{ $labels.namespace }}/{{ $labels.name }} is not in Running state"

        - alert: BlockchainNodeNoPeers
          expr: |
            blockchain_node_peer_count == 0
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Blockchain node has no peers"
            description: "Node {{ $labels.namespace }}/{{ $labels.name }} has 0 connected peers"

        - alert: BlockchainNodeBlockHeightStalled
          expr: |
            increase(blockchain_node_block_height[30m]) == 0
          for: 30m
          labels:
            severity: warning
          annotations:
            summary: "Block height not increasing"
            description: "Node {{ $labels.namespace }}/{{ $labels.name }} block height stalled"

    - name: blockchain-consensus
      rules:
        - alert: RaftLeaderChangesHigh
          expr: |
            rate(blockchain_raft_leader_changes_total[1h]) > 10
          for: 15m
          labels:
            severity: warning
          annotations:
            summary: "High rate of Raft leader changes"
            description: "Channel {{ $labels.channel }} has {{ $value }} leader changes/hour"

        - alert: RaftProposalFailuresHigh
          expr: |
            rate(blockchain_raft_proposal_failures_total[5m]) > 0.5
          for: 10m
          labels:
            severity: critical
          annotations:
            summary: "High rate of Raft proposal failures"
            description: "Orderer {{ $labels.namespace }}/{{ $labels.name }} has proposal failure rate {{ $value }}/s"

    - name: blockchain-multi-cluster
      rules:
        - alert: ClusterQuorumLost
          expr: |
            blockchain_cluster_quorum_status == 0
          for: 1m
          labels:
            severity: critical
          annotations:
            summary: "Cluster has lost quorum"
            description: "Cluster {{ $labels.cluster }} has lost quorum - split-brain risk"

        - alert: CrossClusterLatencyHigh
          expr: |
            histogram_quantile(0.99, rate(blockchain_cross_cluster_latency_seconds_bucket[5m])) > 0.5
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "High cross-cluster latency"
            description: "P99 latency between {{ $labels.source_cluster }} and {{ $labels.target_cluster }} is {{ $value }}s"

RBAC and Security Contexts

Operator RBAC

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: blockchain-operator-manager
rules:
  # Core API resources
  - apiGroups: [""]
    resources: ["pods", "pods/log", "pods/exec"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["services", "endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: [""]
    resources: ["configmaps", "secrets"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "patch"]

  # Apps API
  - apiGroups: ["apps"]
    resources: ["statefulsets", "deployments"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

  # Networking
  - apiGroups: ["networking.k8s.io"]
    resources: ["networkpolicies"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

  # Custom resources
  - apiGroups: ["hlf.kungfusoftware.es"]
    resources: ["*"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: ["hlf.kungfusoftware.es"]
    resources: ["*/status"]
    verbs: ["get", "update", "patch"]
  - apiGroups: ["hlf.kungfusoftware.es"]
    resources: ["*/finalizers"]
    verbs: ["update"]

  # Monitoring
  - apiGroups: ["monitoring.coreos.com"]
    resources: ["servicemonitors", "podmonitors"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

  # Snapshots for backup
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshots", "volumesnapshotcontents"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: blockchain-operator-manager
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: blockchain-operator-manager
subjects:
  - kind: ServiceAccount
    name: blockchain-operator
    namespace: blockchain-operator-system

Pod Security Standards

apiVersion: v1
kind: Namespace
metadata:
  name: fabric-network
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: fabric-orderers-pdb
  namespace: fabric-network
spec:
  minAvailable: 3  # Maintain Raft quorum
  selector:
    matchLabels:
      app.kubernetes.io/component: orderer
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: fabric-peers-pdb
  namespace: fabric-network
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: peer

Troubleshooting Common Failures

Node Restart Handling

func (r *FabricPeerReconciler) handleNodeRestart(ctx context.Context, peer *hlfv1alpha1.FabricPeer) error {
    log := r.Log.WithValues("peer", peer.Name)

    // Get the StatefulSet
    sts := &appsv1.StatefulSet{}
    if err := r.Get(ctx, types.NamespacedName{Name: peer.Name, Namespace: peer.Namespace}, sts); err != nil {
        return err
    }

    // Check if any pods are in CrashLoopBackOff
    pods := &corev1.PodList{}
    if err := r.List(ctx, pods,
        client.InNamespace(peer.Namespace),
        client.MatchingLabels(r.buildLabels(peer))); err != nil {
        return err
    }

    for _, pod := range pods.Items {
        for _, containerStatus := range pod.Status.ContainerStatuses {
            if containerStatus.State.Waiting != nil {
                reason := containerStatus.State.Waiting.Reason

                switch reason {
                case "CrashLoopBackOff":
                    log.Info("Pod in CrashLoopBackOff, checking logs", "pod", pod.Name)

                    // Check if it's a known recoverable error
                    if r.isRecoverableError(ctx, &pod) {
                        // Attempt recovery actions
                        if err := r.attemptRecovery(ctx, peer, &pod); err != nil {
                            log.Error(err, "Recovery attempt failed")
                        }
                    } else {
                        // Emit event for manual intervention
                        r.recorder.Event(peer, corev1.EventTypeWarning, "CrashLoopBackOff",
                            fmt.Sprintf("Pod %s is in CrashLoopBackOff - manual intervention may be required", pod.Name))
                    }

                case "ImagePullBackOff":
                    r.recorder.Event(peer, corev1.EventTypeWarning, "ImagePullBackOff",
                        fmt.Sprintf("Pod %s cannot pull image - check image availability and credentials", pod.Name))
                }
            }
        }
    }

    return nil
}

func (r *FabricPeerReconciler) isRecoverableError(ctx context.Context, pod *corev1.Pod) bool {
    // Get recent logs
    logs, err := r.getPodLogs(ctx, pod, 100)
    if err != nil {
        return false
    }

    // Check for known recoverable patterns
    recoverablePatterns := []string{
        "database locked",
        "temporary failure in name resolution",
        "connection refused",
        "no route to host",
    }

    for _, pattern := range recoverablePatterns {
        if strings.Contains(strings.ToLower(logs), pattern) {
            return true
        }
    }

    return false
}

func (r *FabricPeerReconciler) attemptRecovery(ctx context.Context, peer *hlfv1alpha1.FabricPeer, pod *corev1.Pod) error {
    log := r.Log.WithValues("peer", peer.Name, "pod", pod.Name)

    // Strategy 1: Delete the pod to force restart
    log.Info("Attempting recovery by deleting pod")
    if err := r.Delete(ctx, pod); err != nil {
        return fmt.Errorf("failed to delete pod: %w", err)
    }

    // Wait for pod to be recreated
    time.Sleep(10 * time.Second)

    // Strategy 2: If that doesn't work, check PVC
    pvc := &corev1.PersistentVolumeClaim{}
    pvcName := fmt.Sprintf("peer-data-%s-0", peer.Name)
    if err := r.Get(ctx, types.NamespacedName{Name: pvcName, Namespace: peer.Namespace}, pvc); err == nil {
        // Check if PVC is corrupted or has issues
        if pvc.Status.Phase != corev1.ClaimBound {
            log.Info("PVC is not bound, may need manual intervention", "pvc", pvcName)
            r.recorder.Event(peer, corev1.EventTypeWarning, "PVCNotBound",
                fmt.Sprintf("PVC %s is not bound - check storage provisioner", pvcName))
        }
    }

    return nil
}

Network Partition Detection

func (r *FabricPeerReconciler) detectNetworkPartition(ctx context.Context, peer *hlfv1alpha1.FabricPeer) (*NetworkPartitionInfo, error) {
    info := &NetworkPartitionInfo{
        Timestamp: time.Now(),
        Affected:  make([]string, 0),
    }

    // Check connectivity to other peers
    peerList := &hlfv1alpha1.FabricPeerList{}
    if err := r.List(ctx, peerList, client.InNamespace(peer.Namespace)); err != nil {
        return nil, err
    }

    for _, otherPeer := range peerList.Items {
        if otherPeer.Name == peer.Name {
            continue
        }

        // Try to connect to the peer's gRPC endpoint
        endpoint := fmt.Sprintf("%s.%s.svc.cluster.local:7051", otherPeer.Name, otherPeer.Namespace)
        if !r.canConnectToEndpoint(ctx, endpoint, 5*time.Second) {
            info.Affected = append(info.Affected, otherPeer.Name)
        }
    }

    // Check connectivity to orderers
    ordererList := &hlfv1alpha1.FabricOrderingServiceList{}
    if err := r.List(ctx, ordererList, client.InNamespace(peer.Namespace)); err != nil {
        return nil, err
    }

    for _, orderer := range ordererList.Items {
        for _, node := range orderer.Spec.Nodes {
            endpoint := fmt.Sprintf("%s.%s.svc.cluster.local:7050", node.ID, orderer.Namespace)
            if !r.canConnectToEndpoint(ctx, endpoint, 5*time.Second) {
                info.OrderersUnreachable = append(info.OrderersUnreachable, node.ID)
            }
        }
    }

    info.IsPartitioned = len(info.Affected) > 0 || len(info.OrderersUnreachable) > 0

    if info.IsPartitioned {
        r.recorder.Event(peer, corev1.EventTypeWarning, "NetworkPartition",
            fmt.Sprintf("Detected network partition: unreachable peers=%v, unreachable orderers=%v",
                info.Affected, info.OrderersUnreachable))
    }

    return info, nil
}

Troubleshooting Decision Tree

Mermaid diagram
Click to expand


Complete Working Example

Here's a complete example deploying a Hyperledger Fabric network using the operator:

1. Install the Operator

# Add Helm repository
helm repo add bevel https://hyperledger.github.io/bevel-operator-fabric
helm repo update

# Install the operator
helm install hlf-operator --namespace=hlf-operator-system \
  --create-namespace bevel/hlf-operator \
  --set resources.requests.cpu=100m \
  --set resources.requests.memory=256Mi \
  --set resources.limits.cpu=500m \
  --set resources.limits.memory=512Mi

2. Deploy Certificate Authority

apiVersion: hlf.kungfusoftware.es/v1alpha1
kind: FabricCA
metadata:
  name: org1-ca
  namespace: fabric-network
spec:
  image: hyperledger/fabric-ca
  version: 1.5.7
  hosts:
    - localhost
    - org1-ca
    - org1-ca.fabric-network
    - org1-ca.fabric-network.svc
    - org1-ca.fabric-network.svc.cluster.local
  service:
    type: ClusterIP
  storage:
    size: 5Gi
    storageClass: standard
    accessMode: ReadWriteOnce
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 512Mi
  ca:
    name: ca
    subject:
      cn: ca
      o: Org1
      c: US
      st: California
      l: San Francisco
  tlsCA:
    name: tlsca
    subject:
      cn: tlsca
      o: Org1
      c: US
      st: California
      l: San Francisco

3. Deploy Peer

apiVersion: hlf.kungfusoftware.es/v1alpha1
kind: FabricPeer
metadata:
  name: org1-peer0
  namespace: fabric-network
spec:
  image: hyperledger/fabric-peer
  version: 2.5.4
  mspID: Org1MSP
  stateDb: couchdb
  couchdb:
    user: admin
    password: adminpw
  secret:
    enrollment:
      component:
        cahost: org1-ca.fabric-network
        caport: 7054
        caname: ca
        catls:
          cacert: ""  # Will be populated by operator
        enrollid: peer0
        enrollsecret: peer0pw
      tls:
        cahost: org1-ca.fabric-network
        caport: 7054
        caname: tlsca
        catls:
          cacert: ""
        enrollid: peer0
        enrollsecret: peer0pw
  storage:
    peer:
      storageClass: standard
      size: 20Gi
      accessMode: ReadWriteOnce
    statedb:
      storageClass: standard
      size: 10Gi
      accessMode: ReadWriteOnce
    chaincode:
      storageClass: standard
      size: 5Gi
      accessMode: ReadWriteOnce
  resources:
    peer:
      requests:
        cpu: 250m
        memory: 512Mi
      limits:
        cpu: 1000m
        memory: 2Gi
    couchdb:
      requests:
        cpu: 100m
        memory: 256Mi
      limits:
        cpu: 500m
        memory: 1Gi
  service:
    type: ClusterIP

4. Deploy Orderer

apiVersion: hlf.kungfusoftware.es/v1alpha1
kind: FabricOrderingService
metadata:
  name: orderer
  namespace: fabric-network
spec:
  image: hyperledger/fabric-orderer
  version: 2.5.4
  mspID: OrdererMSP
  nodes:
    - id: orderer0
      enrollment:
        component:
          cahost: orderer-ca.fabric-network
          caport: 7054
          caname: ca
          enrollid: orderer0
          enrollsecret: orderer0pw
        tls:
          cahost: orderer-ca.fabric-network
          caport: 7054
          caname: tlsca
          enrollid: orderer0
          enrollsecret: orderer0pw
      storage:
        storageClass: standard
        size: 50Gi
        accessMode: ReadWriteOnce
      resources:
        requests:
          cpu: 250m
          memory: 512Mi
        limits:
          cpu: 1000m
          memory: 2Gi

5. Verification Commands

# Check operator status
kubectl get pods -n hlf-operator-system

# Check Fabric CA status
kubectl get fabricca -n fabric-network

# Check peers
kubectl get fabricpeer -n fabric-network -o wide

# Check orderers
kubectl get fabricorderingservice -n fabric-network

# View peer logs
kubectl logs -n fabric-network -l app.kubernetes.io/component=peer -c peer --tail=100

# Check peer status
kubectl get fabricpeer org1-peer0 -n fabric-network -o jsonpath='{.status}'

Performance Metrics

Based on testing with the patterns described in this guide:

MetricTargetAchieved
Operator startup time<5s~3s
Reconciliation cycle<30s~15-25s
Certificate enrollment<10s~5s
StatefulSet ready<2min~90s
Cross-cluster latency<500ms~200ms
Node recovery time<5min~3min

Conclusion

Building Kubernetes operators for blockchain networks requires careful consideration of:

  1. CRD Design: Model complex blockchain component relationships with clear spec/status separation
  2. Reconciliation: Implement phased, idempotent reconciliation with proper error handling
  3. Multi-Cluster: Use federation patterns with split-brain prevention for geographic distribution
  4. Storage: Coordinate StatefulSets and PVCs with appropriate retention policies
  5. Security: Apply defense-in-depth with RBAC, network policies, and pod security standards
  6. Monitoring: Integrate Prometheus metrics for observability and alerting
  7. Operations: Prepare for common failure scenarios with automated recovery

The Bevel Operator Fabric provides a production-ready foundation that can be extended for specific blockchain requirements. By following the patterns outlined in this guide, you can build operators that deliver 99%+ uptime for critical blockchain infrastructure.

References

Technical FAQ

Q: How does the operator handle certificate renewal? A: The operator monitors certificate expiration and triggers re-enrollment with the Fabric CA before expiration, typically at 80% of the certificate lifetime.

Q: What happens during a Raft leader election? A: The operator monitors Raft metrics and can detect leader changes. During elections, transaction submission may briefly pause but the operator ensures orderer pods remain available.

Q: How do I scale the number of peers? A: Create additional FabricPeer custom resources. The operator will handle certificate enrollment, storage provisioning, and gossip configuration automatically.

Q: Can I upgrade Fabric versions with zero downtime? A: Yes, by using rolling updates with proper PodDisruptionBudgets. Update the version in the CRD spec, and the operator will perform a rolling update of the StatefulSet.

Q: How does the operator handle storage failures? A: The operator monitors PVC status and can trigger alerts. For data recovery, it integrates with VolumeSnapshot for backup/restore operations.


Appendix: Quick Reference

Operator Installation Commands

# Install Bevel Operator Fabric
helm repo add bevel https://hyperledger.github.io/bevel-operator-fabric
helm repo update

helm install hlf-operator bevel/hlf-operator \
  --namespace hlf-operator-system \
  --create-namespace \
  --set resources.requests.cpu=100m \
  --set resources.requests.memory=256Mi

# Verify installation
kubectl get pods -n hlf-operator-system
kubectl get crds | grep hlf

Essential kubectl Commands

# View all Fabric resources
kubectl get fabricca,fabricpeer,fabricorderingservice -A

# Check operator logs
kubectl logs -n hlf-operator-system -l app.kubernetes.io/name=hlf-operator -f

# Describe peer for troubleshooting
kubectl describe fabricpeer org1-peer0 -n fabric-network

# Check reconciliation events
kubectl get events -n fabric-network --field-selector reason=ReconcileSuccess

# Port forward for local testing
kubectl port-forward svc/org1-peer0 7051:7051 -n fabric-network

Performance Tuning Parameters

ParameterDefaultRecommendedDescription
MaxConcurrentReconciles13Parallel reconciliations
SyncPeriod10h30mFull resync interval
CacheSyncTimeout2m5mCache sync timeout
LeaseDuration15s15sLeader election lease
RenewDeadline10s10sLeader renewal deadline
RetryPeriod2s2sLeader retry period

Resource Sizing Guide

ComponentCPU RequestMemory RequestStorage
Fabric CA100m256Mi5Gi
Fabric Peer250m512Mi20Gi (data) + 10Gi (statedb)
CouchDB100m256Mi(included with peer)
Fabric Orderer250m512Mi50Gi
Operator100m256Mi-

This guide was created based on best practices from the Hyperledger Bevel Operator Fabric, Kubebuilder, and Operator SDK documentation.

Building Kubernetes Operators for Blockchain Networks: A Deep Dive into Bevel Operator Fabric - David Viejo