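1. Service overview

| Service      | Purpose                    | External access port |
|--------------|----------------------------|----------------------|
| Prometheus   | Monitoring data collection | 30080                |
| Grafana      | Data visualization         | 30081                |
| Alertmanager | Alert management           | 30082                |
| Loki         | Log data collection        | 32537                |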
2. Preparation

2.1 Image preparation

    # Pull the images on a machine with internet access
    docker pull grafana/loki:2.3.0
    docker pull grafana/promtail:2.3.0

    # Tag the images for the private registry and push them
    docker tag grafana/loki:2.3.0 172.16.10.160:80/grafana/loki:2.3.0
    docker push 172.16.10.160:80/grafana/loki:2.3.0
    docker tag grafana/promtail:2.3.0 172.16.10.160:80/grafana/promtail:2.3.0
    docker push 172.16.10.160:80/grafana/promtail:2.3.0

    # Export the images as tar archives for offline transfer
    docker save -o grafana-loki-2.3.0.tar grafana/loki:2.3.0
    docker save -o grafana-promtail-2.3.0.tar grafana/promtail:2.3.0
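The pull, tag, push, and save sequence above repeats for every image, so it can be written as a small loop. The following is a minimal sketch, assuming the same registry address and image list as above; the helper names `private_ref`, `archive_name`, `mirror_image`, and `mirror_all` are hypothetical and not part of any tool.

```shell
#!/bin/sh
# Sketch: mirror public images to a private registry and export them as archives.
REGISTRY="172.16.10.160:80"    # private registry address used in the commands above
IMAGES="grafana/loki:2.3.0 grafana/promtail:2.3.0"

# private_ref IMAGE -> reference of IMAGE inside the private registry
private_ref() {
  printf '%s/%s\n' "$REGISTRY" "$1"
}

# archive_name IMAGE -> tar file name, e.g. grafana/loki:2.3.0 -> grafana-loki-2.3.0.tar
archive_name() {
  printf '%s.tar\n' "$(printf '%s' "$1" | sed 's|[/:]|-|g')"
}

# mirror_image IMAGE: pull, retag, push, and save one image.
# Needs docker plus network access to both registries.
mirror_image() {
  docker pull "$1" &&
  docker tag "$1" "$(private_ref "$1")" &&
  docker push "$(private_ref "$1")" &&
  docker save -o "$(archive_name "$1")" "$1"
}

# mirror_all: process every image in IMAGES, stopping at the first failure
mirror_all() {
  for image in $IMAGES; do
    mirror_image "$image" || return 1
  done
}
```

Calling `mirror_all` on the connected machine produces the same archives as the manual commands. On a host that cannot reach any registry, an archive can be imported with `docker load -i grafana-loki-2.3.0.tar`.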
3. Loki deployment

3.1 RBAC configuration (loki-rbac.yaml)

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: loki
      namespace: logging
    ---
    # Grants use of a PodSecurityPolicy named "loki".
    # PodSecurityPolicy was removed in Kubernetes 1.25; on newer clusters this rule has no effect.
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: loki
      namespace: logging
    rules:
      - apiGroups: ["extensions"]
        resourceNames: ["loki"]
        resources: ["podsecuritypolicies"]
        verbs: ["use"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: loki
      namespace: logging
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: loki
    subjects:
      - kind: ServiceAccount
        name: loki
3.2 Configuration file (loki-configmap.yaml)

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: loki
      namespace: logging
      labels:
        app: loki
    data:
      loki.yaml: |
        auth_enabled: false
        ingester:
          chunk_idle_period: 3m
          chunk_block_size: 262144
          chunk_retain_period: 1m
          max_transfer_retries: 0
          lifecycler:
            ring:
              kvstore:
                store: inmemory
              replication_factor: 1
        limits_config:
          enforce_metric_name: false
          reject_old_samples: true
          reject_old_samples_max_age: 168h
        schema_config:
          configs:
            - from: 2020-10-24
              store: boltdb-shipper
              object_store: filesystem
              schema: v11
              index:
                prefix: index_
                period: 24h
        server:
          http_listen_port: 3100
        storage_config:
          boltdb_shipper:
            active_index_directory: /data/loki/boltdb-shipper-active
            cache_location: /data/loki/boltdb-shipper-cache
            cache_ttl: 24h
            shared_store: filesystem
          filesystem:
            directory: /data/loki/chunks
        chunk_store_config:
          max_look_back_period: 0s
        table_manager:
          retention_deletes_enabled: true
          retention_period: 48h
        compactor:
          working_directory: /data/loki/boltdb-shipper-compactor
          shared_store: filesystem
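When retention is handled by the table manager, Loki's documentation asks for `retention_period` to be a multiple of the index `period`. The sketch below checks that relationship for the values used above (48h and 24h). The function name `is_multiple_of_period` is hypothetical, and the check only understands whole hours written as `<N>h`.

```shell
#!/bin/sh
# is_multiple_of_period RETENTION PERIOD
# Succeeds when RETENTION is a non-zero whole multiple of PERIOD.
# Both arguments must be whole hours written as "<N>h" (a simplifying assumption).
is_multiple_of_period() {
  retention=${1%h}   # strip the trailing "h"
  period=${2%h}
  [ "$period" -gt 0 ] && [ "$retention" -gt 0 ] && [ $((retention % period)) -eq 0 ]
}

if is_multiple_of_period 48h 24h; then
  echo "retention_period 48h is a multiple of the 24h index period"
fi
```

A value such as 36h would fail this check and should be rounded to 24h or 48h before it goes into the ConfigMap.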
3.3 Deployment file (loki.yaml)

    # In-cluster Service, used by Promtail
    apiVersion: v1
    kind: Service
    metadata:
      name: loki
      namespace: logging
      labels:
        app: loki
    spec:
      type: ClusterIP
      ports:
        - port: 3100
          protocol: TCP
          name: http-metrics
          targetPort: http-metrics
      selector:
        app: loki
    ---
    # NodePort Service for access from outside the cluster
    apiVersion: v1
    kind: Service
    metadata:
      name: loki-outer
      namespace: logging
      labels:
        app: loki
    spec:
      type: NodePort
      ports:
        - port: 3100
          protocol: TCP
          name: http-metrics
          targetPort: http-metrics
          nodePort: 32537
      selector:
        app: loki
    ---
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: loki
      namespace: logging
      labels:
        app: loki
    spec:
      podManagementPolicy: OrderedReady
      replicas: 1
      selector:
        matchLabels:
          app: loki
      serviceName: loki
      updateStrategy:
        type: RollingUpdate
      template:
        metadata:
          labels:
            app: loki
        spec:
          serviceAccountName: loki
          securityContext:
            fsGroup: 10001
            runAsGroup: 10001
            runAsNonRoot: true
            runAsUser: 10001
          initContainers: []
          containers:
            - name: loki
              image: 172.16.10.160:80/grafana/loki:2.3.0
              imagePullPolicy: IfNotPresent
              args:
                - --config.file=/etc/loki/loki.yaml
              volumeMounts:
                - name: config
                  mountPath: /etc/loki
                - name: storage
                  mountPath: /data
              ports:
                - name: http-metrics
                  containerPort: 3100
                  protocol: TCP
              livenessProbe:
                httpGet:
                  path: /ready
                  port: http-metrics
                  scheme: HTTP
                initialDelaySeconds: 45
                timeoutSeconds: 1
                periodSeconds: 10
                successThreshold: 1
                failureThreshold: 3
              readinessProbe:
                httpGet:
                  path: /ready
                  port: http-metrics
                  scheme: HTTP
                initialDelaySeconds: 45
                timeoutSeconds: 1
                periodSeconds: 10
                successThreshold: 1
                failureThreshold: 3
              securityContext:
                readOnlyRootFilesystem: true
          terminationGracePeriodSeconds: 4800
          volumes:
            - name: config
              configMap:
                defaultMode: 420
                name: loki
      volumeClaimTemplates:
        - metadata:
            name: storage
            labels:
              app: loki
            annotations:
              # Deprecated annotation; newer manifests set spec.storageClassName instead
              volume.beta.kubernetes.io/storage-class: "course-nfs-storage"
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: "2Gi"
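Once the StatefulSet is running, Loki's HTTP API is reachable on port 3100 inside the cluster and on NodePort 32537 from outside. A simple end-to-end check is to push one log line by hand through `/loki/api/v1/push`, the same endpoint Promtail is pointed at in section 4.3. The following is a minimal sketch; the function names, the `smoke-test` job label, and the `<node-ip>` placeholder are assumptions for illustration.

```shell
#!/bin/sh
# push_payload JOB LINE TIMESTAMP_NS
# Prints a JSON body for Loki's push API containing one stream with one log line.
# JOB and LINE must not contain characters that need JSON escaping (sketch only).
push_payload() {
  printf '{"streams":[{"stream":{"job":"%s"},"values":[["%s","%s"]]}]}\n' "$1" "$3" "$2"
}

# push_line BASE_URL JOB LINE: send one line stamped with the current time.
# Requires curl and network access to Loki.
push_line() {
  now_ns="$(date +%s)000000000"   # seconds since epoch, padded to nanoseconds
  curl -sS -H 'Content-Type: application/json' \
    -X POST "$1/loki/api/v1/push" \
    --data "$(push_payload "$2" "$3" "$now_ns")"
}
```

For example, `push_line http://<node-ip>:32537 smoke-test "hello from curl"` sends a single entry that should then appear in Grafana under the selector `{job="smoke-test"}`.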
4. Promtail deployment

4.1 Configuration file (loki-promtail-configmap.yaml)

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: loki-promtail
      namespace: logging
      labels:
        app: promtail
    data:
      promtail.yaml: |
        client:
          backoff_config:
            max_period: 5m
            max_retries: 10
            min_period: 500ms
          batchsize: 1048576
          batchwait: 1s
          external_labels: {}
          timeout: 10s
        positions:
          filename: /run/promtail/positions.yaml
        server:
          http_listen_port: 3101
        target_config:
          sync_period: 10s
        scrape_configs:
        # Pods that carry a "name" label
        - job_name: kubernetes-pods-name
          pipeline_stages:
            - docker: {}
          kubernetes_sd_configs:
          - role: pod
          relabel_configs:
          - source_labels:
            - __meta_kubernetes_pod_label_name
            target_label: __service__
          - source_labels:
            - __meta_kubernetes_pod_node_name
            target_label: __host__
          - action: drop
            regex: ''
            source_labels:
            - __service__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - action: replace
            replacement: $1
            separator: /
            source_labels:
            - __meta_kubernetes_namespace
            - __service__
            target_label: job
          - action: replace
            source_labels:
            - __meta_kubernetes_namespace
            target_label: namespace
          - action: replace
            source_labels:
            - __meta_kubernetes_pod_name
            target_label: pod
          - action: replace
            source_labels:
            - __meta_kubernetes_pod_container_name
            target_label: container
          - replacement: /var/log/pods/*$1/*.log
            separator: /
            source_labels:
            - __meta_kubernetes_pod_uid
            - __meta_kubernetes_pod_container_name
            target_label: __path__
        # Pods without a "name" label that carry an "app" label
        - job_name: kubernetes-pods-app
          pipeline_stages:
            - docker: {}
          kubernetes_sd_configs:
          - role: pod
          relabel_configs:
          - action: drop
            regex: .+
            source_labels:
            - __meta_kubernetes_pod_label_name
          - source_labels:
            - __meta_kubernetes_pod_label_app
            target_label: __service__
          - source_labels:
            - __meta_kubernetes_pod_node_name
            target_label: __host__
          - action: drop
            regex: ''
            source_labels:
            - __service__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - action: replace
            replacement: $1
            separator: /
            source_labels:
            - __meta_kubernetes_namespace
            - __service__
            target_label: job
          - action: replace
            source_labels:
            - __meta_kubernetes_namespace
            target_label: namespace
          - action: replace
            source_labels:
            - __meta_kubernetes_pod_name
            target_label: pod
          - action: replace
            source_labels:
            - __meta_kubernetes_pod_container_name
            target_label: container
          - replacement: /var/log/pods/*$1/*.log
            separator: /
            source_labels:
            - __meta_kubernetes_pod_uid
            - __meta_kubernetes_pod_container_name
            target_label: __path__
        # Pods with neither label whose controller name has no hash suffix
        - job_name: kubernetes-pods-direct-controllers
          pipeline_stages:
            - docker: {}
          kubernetes_sd_configs:
          - role: pod
          relabel_configs:
          - action: drop
            regex: .+
            separator: ''
            source_labels:
            - __meta_kubernetes_pod_label_name
            - __meta_kubernetes_pod_label_app
          - action: drop
            regex: '[0-9a-z-.]+-[0-9a-f]{8,10}'
            source_labels:
            - __meta_kubernetes_pod_controller_name
          - source_labels:
            - __meta_kubernetes_pod_controller_name
            target_label: __service__
          - source_labels:
            - __meta_kubernetes_pod_node_name
            target_label: __host__
          - action: drop
            regex: ''
            source_labels:
            - __service__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - action: replace
            replacement: $1
            separator: /
            source_labels:
            - __meta_kubernetes_namespace
            - __service__
            target_label: job
          - action: replace
            source_labels:
            - __meta_kubernetes_namespace
            target_label: namespace
          - action: replace
            source_labels:
            - __meta_kubernetes_pod_name
            target_label: pod
          - action: replace
            source_labels:
            - __meta_kubernetes_pod_container_name
            target_label: container
          - replacement: /var/log/pods/*$1/*.log
            separator: /
            source_labels:
            - __meta_kubernetes_pod_uid
            - __meta_kubernetes_pod_container_name
            target_label: __path__
        # Pods with neither label whose controller name ends in a hash suffix
        - job_name: kubernetes-pods-indirect-controller
          pipeline_stages:
            - docker: {}
          kubernetes_sd_configs:
          - role: pod
          relabel_configs:
          - action: drop
            regex: .+
            separator: ''
            source_labels:
            - __meta_kubernetes_pod_label_name
            - __meta_kubernetes_pod_label_app
          - action: keep
            regex: '[0-9a-z-.]+-[0-9a-f]{8,10}'
            source_labels:
            - __meta_kubernetes_pod_controller_name
          - action: replace
            regex: '([0-9a-z-.]+)-[0-9a-f]{8,10}'
            source_labels:
            - __meta_kubernetes_pod_controller_name
            target_label: __service__
          - source_labels:
            - __meta_kubernetes_pod_node_name
            target_label: __host__
          - action: drop
            regex: ''
            source_labels:
            - __service__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - action: replace
            replacement: $1
            separator: /
            source_labels:
            - __meta_kubernetes_namespace
            - __service__
            target_label: job
          - action: replace
            source_labels:
            - __meta_kubernetes_namespace
            target_label: namespace
          - action: replace
            source_labels:
            - __meta_kubernetes_pod_name
            target_label: pod
          - action: replace
            source_labels:
            - __meta_kubernetes_pod_container_name
            target_label: container
          - replacement: /var/log/pods/*$1/*.log
            separator: /
            source_labels:
            - __meta_kubernetes_pod_uid
            - __meta_kubernetes_pod_container_name
            target_label: __path__
        # Static (mirror) pods, identified by the config.mirror annotation
        - job_name: kubernetes-pods-static
          pipeline_stages:
            - docker: {}
          kubernetes_sd_configs:
          - role: pod
          relabel_configs:
          - action: drop
            regex: ''
            source_labels:
            - __meta_kubernetes_pod_annotation_kubernetes_io_config_mirror
          - action: replace
            source_labels:
            - __meta_kubernetes_pod_label_component
            target_label: __service__
          - source_labels:
            - __meta_kubernetes_pod_node_name
            target_label: __host__
          - action: drop
            regex: ''
            source_labels:
            - __service__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - action: replace
            replacement: $1
            separator: /
            source_labels:
            - __meta_kubernetes_namespace
            - __service__
            target_label: job
          - action: replace
            source_labels:
            - __meta_kubernetes_namespace
            target_label: namespace
          - action: replace
            source_labels:
            - __meta_kubernetes_pod_name
            target_label: pod
          - action: replace
            source_labels:
            - __meta_kubernetes_pod_container_name
            target_label: container
          - replacement: /var/log/pods/*$1/*.log
            separator: /
            source_labels:
            - __meta_kubernetes_pod_annotation_kubernetes_io_config_mirror
            - __meta_kubernetes_pod_container_name
            target_label: __path__
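The relabel rules above are easier to follow with concrete values. In each job, the `source_labels` values are joined with the `separator`, matched against the `regex` (which defaults to matching the whole joined value), and then written to `target_label` through `replacement`. The sketch below imitates three of those rules in plain shell: the `job` label, the `__path__` glob, and the hash-suffix stripping used by `kubernetes-pods-indirect-controller`. The function names and sample values are hypothetical, and the character class is written as `[0-9a-z.-]` so that `sed` accepts it; it matches the same characters as `[0-9a-z-.]` in the config.

```shell
#!/bin/sh
# join_labels SEP VALUE... -> values joined with SEP, as relabeling joins source_labels
join_labels() {
  sep=$1; shift
  out=$1; shift
  for v in "$@"; do
    out="$out$sep$v"
  done
  printf '%s\n' "$out"
}

# job_label NAMESPACE SERVICE -> value of the `job` label (replacement: $1)
job_label() {
  join_labels / "$1" "$2"
}

# log_path POD_UID CONTAINER -> value of `__path__`
# (replacement: /var/log/pods/*$1/*.log)
log_path() {
  printf '/var/log/pods/*%s/*.log\n' "$(join_labels / "$1" "$2")"
}

# service_from_controller NAME -> NAME without a trailing "-<8 to 10 hex chars>".
# Prints nothing when NAME has no such suffix; in Promtail the `keep` rule
# would drop that pod from this job.
service_from_controller() {
  printf '%s\n' "$1" | sed -nE 's/^([0-9a-z.-]+)-[0-9a-f]{8,10}$/\1/p'
}
```

With these helpers, `job_label logging loki` yields `logging/loki`, and `service_from_controller web-7c9f8b6d4f` yields `web`, which is how a pod owned by a ReplicaSet ends up grouped under its Deployment-like name.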
4.2 RBAC configuration (loki-promtail-rbac.yaml)

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: loki-promtail
      labels:
        app: promtail
      namespace: logging
    ---
    # ClusterRole and ClusterRoleBinding are cluster-scoped, so they take no namespace
    kind: ClusterRole
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      labels:
        app: promtail
      name: promtail-clusterrole
    rules:
      - apiGroups: [""]
        resources: ["nodes", "nodes/proxy", "services", "endpoints", "pods"]
        verbs: ["get", "watch", "list"]
    ---
    kind: ClusterRoleBinding
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: promtail-clusterrolebinding
      labels:
        app: promtail
    subjects:
      - kind: ServiceAccount
        name: loki-promtail
        namespace: logging
    roleRef:
      kind: ClusterRole
      name: promtail-clusterrole
      apiGroup: rbac.authorization.k8s.io
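After the RBAC objects are applied, the binding can be checked without starting Promtail by asking the API server what the ServiceAccount is allowed to do. The following is a minimal sketch using `kubectl auth can-i` with impersonation; the helper names are hypothetical, and the account running the command is assumed to have permission to impersonate ServiceAccounts.

```shell
#!/bin/sh
# sa_user NAMESPACE NAME -> user name Kubernetes assigns to a ServiceAccount
sa_user() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# check_promtail_rbac: ask whether the Promtail ServiceAccount may
# get, list, and watch pods in all namespaces. Requires kubectl and cluster access.
check_promtail_rbac() {
  user="$(sa_user logging loki-promtail)"
  for verb in get list watch; do
    printf '%s pods: ' "$verb"
    kubectl auth can-i "$verb" pods --all-namespaces --as="$user" || return 1
  done
}
```

Running `check_promtail_rbac` should print `yes` for each verb when the ClusterRoleBinding is in place.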
4.3 Deployment file (loki-promtail.yaml)

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: loki-promtail
      namespace: logging
      labels:
        app: promtail
    spec:
      selector:
        matchLabels:
          app: promtail
      updateStrategy:
        rollingUpdate:
          maxUnavailable: 1
        type: RollingUpdate
      template:
        metadata:
          labels:
            app: promtail
        spec:
          serviceAccountName: loki-promtail
          containers:
            - name: promtail
              image: 172.16.10.160:80/grafana/promtail:2.3.0
              imagePullPolicy: IfNotPresent
              args:
                - --config.file=/etc/promtail/promtail.yaml
                # Push to the in-cluster "loki" Service defined in loki.yaml
                - --client.url=http://loki:3100/loki/api/v1/push
              env:
                - name: HOSTNAME
                  valueFrom:
                    fieldRef:
                      apiVersion: v1
                      fieldPath: spec.nodeName
              volumeMounts:
                - mountPath: /etc/promtail
                  name: config
                - mountPath: /run/promtail
                  name: run
                - mountPath: /var/lib/docker/containers
                  name: docker
                  readOnly: true
                - mountPath: /var/log/pods
                  name: pods
                  readOnly: true
              ports:
                - containerPort: 3101
                  name: http-metrics
                  protocol: TCP
              securityContext:
                readOnlyRootFilesystem: true
                runAsGroup: 0
                runAsUser: 0
              readinessProbe:
                failureThreshold: 5
                httpGet:
                  path: /ready
                  port: http-metrics
                  scheme: HTTP
                initialDelaySeconds: 10
                periodSeconds: 10
                successThreshold: 1
                timeoutSeconds: 1
          # Also schedule Promtail on master nodes
          tolerations:
            - effect: NoSchedule
              key: node-role.kubernetes.io/master
              operator: Exists
          volumes:
            - name: config
              configMap:
                defaultMode: 420
                name: loki-promtail
            - name: run
              hostPath:
                path: /run/promtail
                type: ""
            - name: docker
              hostPath:
                path: /var/lib/docker/containers
            - name: pods
              hostPath:
                path: /var/log/pods
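5. Deployment commands

    # Every manifest targets the "logging" namespace; create it first if it does not exist
    kubectl create namespace logging

    kubectl apply -f loki-rbac.yaml
    kubectl apply -f loki-configmap.yaml
    kubectl apply -f loki.yaml
    kubectl apply -f loki-promtail-rbac.yaml
    kubectl apply -f loki-promtail-configmap.yaml
    kubectl apply -f loki-promtail.yaml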
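6. Verifying the deployment

    # Check that the Loki and Promtail pods are running
    kubectl get pods -n logging

    # Check the Services and the exposed NodePort
    kubectl get svc -n logging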
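Listing pods and Services shows their current state but does not wait for the rollout to finish or confirm that Loki answers requests. The following is a minimal sketch that does both, assuming the workload names and NodePort from the manifests above; the function names and the 300-second timeout are arbitrary choices for illustration.

```shell
#!/bin/sh
NAMESPACE="logging"

# ready_url HOST PORT -> URL of Loki's readiness endpoint
ready_url() {
  printf 'http://%s:%s/ready\n' "$1" "$2"
}

# wait_for_stack: block until both workloads report a completed rollout.
# Requires kubectl and cluster access.
wait_for_stack() {
  kubectl rollout status statefulset/loki -n "$NAMESPACE" --timeout=300s &&
  kubectl rollout status daemonset/loki-promtail -n "$NAMESPACE" --timeout=300s
}

# check_loki NODE_IP: probe the NodePort Service from outside the cluster.
# curl -f makes the function fail on an HTTP error status.
check_loki() {
  curl -fsS "$(ready_url "$1" 32537)"
}
```

A typical sequence is `wait_for_stack && check_loki <node-ip>`, where `<node-ip>` is the address of any cluster node.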
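7. Notes for offline deployment

Image preparation:
- Download all required Docker images in advance.
- Push them to the internal private registry.
- Make sure every node can reach the private registry.

Storage configuration:
- Adjust the StorageClass setting to match your environment.
- Make sure enough storage space is available.

Network configuration:
- Make sure the NodePort ports are open in the firewall.
- Check that network policies allow the required traffic.

Resource limits:
- Adjust the resource requests and limits of Loki and Promtail to the size of the cluster.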
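The manifests in this guide set no resource requests or limits. One way to add them without editing the files is `kubectl set resources`. The sketch below wraps that command; the CPU and memory values are placeholders chosen for illustration rather than recommendations, and the `KUBECTL` variable and function names are hypothetical.

```shell
#!/bin/sh
# Set KUBECTL=echo to print the commands instead of running them.
KUBECTL="${KUBECTL:-kubectl}"

# set_resources KIND NAME REQUESTS LIMITS
# Applies requests and limits to all containers of a workload in "logging".
set_resources() {
  $KUBECTL set resources "$1" "$2" -n logging --requests="$3" --limits="$4"
}

# tune_logging_stack: example values only; size them for your own cluster
tune_logging_stack() {
  set_resources statefulset loki cpu=200m,memory=256Mi cpu=1,memory=1Gi &&
  set_resources daemonset loki-promtail cpu=50m,memory=64Mi cpu=200m,memory=128Mi
}
```

Changing resources this way triggers a rolling update of the affected pods. To keep the settings under version control, the same values can be written into a `resources` block in loki.yaml and loki-promtail.yaml instead.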
Reference: Loki+Promtail日志收集方案 (Loki + Promtail log collection solution)