Loki + Promtail:日志采集

1. 服务概览

服务 用途 外部访问端口
Prometheus 监控数据源采集 30080
Grafana 数据展示平台 30081
Alertmanager 告警管理 30082
Loki 日志数据采集 32537

2. 准备工作

2.1 镜像准备

1
2
3
4
5
6
7
8
9
10
11
12
13
14
# 拉取官方镜像
docker pull grafana/loki:2.3.0
docker pull grafana/promtail:2.3.0

# 重新打标签并推送到私有仓库
docker tag grafana/loki:2.3.0 172.16.10.160:80/grafana/loki:2.3.0
docker push 172.16.10.160:80/grafana/loki:2.3.0

docker tag grafana/promtail:2.3.0 172.16.10.160:80/grafana/promtail:2.3.0
docker push 172.16.10.160:80/grafana/promtail:2.3.0

# 保存镜像为离线包
docker save -o grafana-loki-2.3.0.tar grafana/loki:2.3.0
docker save -o grafana-promtail-2.3.0.tar grafana/promtail:2.3.0

3. Loki 部署

3.1 RBAC 配置 (loki-rbac.yaml)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
apiVersion: v1
kind: ServiceAccount
metadata:
name: loki
namespace: logging
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: loki
namespace: logging
rules:
- apiGroups: ["extensions"]
resourceNames: ["loki"]
resources: ["podsecuritypolicies"]
verbs: ["use"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: loki
namespace: logging
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: loki
subjects:
- kind: ServiceAccount
name: loki

3.2 配置文件 (loki-configmap.yaml)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
apiVersion: v1
kind: ConfigMap
metadata:
name: loki
namespace: logging
labels:
app: loki
data:
loki.yaml: |
auth_enabled: false

ingester:
chunk_idle_period: 3m # 如果块没有达到最大的块大小,那么在刷新之前,块应该在内存中不更新多长时间
chunk_block_size: 262144
chunk_retain_period: 1m # 块刷新后应该在内存中保留多长时间
max_transfer_retries: 0 # Number of times to try and transfer chunks when leaving before falling back to flushing to the store. Zero = no transfers are done.
lifecycler: # 配置ingester的生命周期,以及在哪里注册以进行发现
ring:
kvstore:
store: inmemory # 用于ring的后端存储,支持consul、etcd、inmemory
replication_factor: 1 # 写入和读取的ingesters数量,至少为1(为了冗余和弹性,默认情况下为3)

limits_config:
enforce_metric_name: false
reject_old_samples: true # 旧样品是否会被拒绝
reject_old_samples_max_age: 168h # 拒绝旧样本的最大时限

schema_config: # 配置从特定时间段开始应该使用哪些索引模式
configs:
- from: 2020-10-24 # 创建索引的日期。如果这是唯一的schema_config,则使用过去的日期,否则使用希望切换模式时的日期
store: boltdb-shipper # 索引使用哪个存储,如:cassandra, bigtable, dynamodb,或boltdb
object_store: filesystem # 用于块的存储,如:gcs, s3, inmemory, filesystem, cassandra,如果省略,默认值与store相同
schema: v11
index: # 配置如何更新和存储索引
prefix: index_ # 所有周期表的前缀
period: 24h # 表周期

server:
http_listen_port: 3100

storage_config: # 为索引和块配置一个或多个存储
boltdb_shipper:
active_index_directory: /data/loki/boltdb-shipper-active
cache_location: /data/loki/boltdb-shipper-cache
cache_ttl: 24h
shared_store: filesystem
filesystem:
directory: /data/loki/chunks

chunk_store_config: # 配置如何缓存块,以及在将它们保存到存储之前等待多长时间
max_look_back_period: 0s # 限制查询数据的时间,默认是禁用的,这个值应该小于或等于table_manager.retention_period中的值

table_manager:
retention_deletes_enabled: true # 日志保留周期开关,用于表保留删除
retention_period: 48h # 日志保留周期,保留期必须是索引/块的倍数

compactor:
working_directory: /data/loki/boltdb-shipper-compactor
shared_store: filesystem

3.3 部署文件 (loki.yaml)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
kind: Service
metadata:
name: loki
namespace: logging
labels:
app: loki
spec:
type: ClusterIP
ports:
- port: 3100
protocol: TCP
name: http-metrics
targetPort: http-metrics
selector:
app: loki
---
apiVersion: v1
kind: Service
metadata:
name: loki-outer
namespace: logging
labels:
app: loki
spec:
type: NodePort
ports:
- port: 3100
protocol: TCP
name: http-metrics
targetPort: http-metrics
nodePort: 32537
selector:
app: loki
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: loki
namespace: logging
labels:
app: loki
spec:
podManagementPolicy: OrderedReady
replicas: 1
selector:
matchLabels:
app: loki
serviceName: loki
updateStrategy:
type: RollingUpdate
template:
metadata:
labels:
app: loki
spec:
serviceAccountName: loki
securityContext:
fsGroup: 10001
runAsGroup: 10001
runAsNonRoot: true
runAsUser: 10001
initContainers: []
containers:
- name: loki
image: 172.16.10.160:80/grafana/loki:2.3.0
imagePullPolicy: IfNotPresent
args:
- --config.file=/etc/loki/loki.yaml
volumeMounts:
- name: config
mountPath: /etc/loki
- name: storage
mountPath: /data
ports:
- name: http-metrics
containerPort: 3100
protocol: TCP
livenessProbe:
httpGet:
path: /ready
port: http-metrics
scheme: HTTP
initialDelaySeconds: 45
timeoutSeconds: 1
periodSeconds: 10
successThreshold: 1
failureThreshold: 3
readinessProbe:
httpGet:
path: /ready
port: http-metrics
scheme: HTTP
initialDelaySeconds: 45
timeoutSeconds: 1
periodSeconds: 10
successThreshold: 1
failureThreshold: 3
securityContext:
readOnlyRootFilesystem: true
terminationGracePeriodSeconds: 4800
volumes:
- name: config
configMap:
defaultMode: 420
name: loki
volumeClaimTemplates:
- metadata:
name: storage
labels:
app: loki
annotations:
volume.beta.kubernetes.io/storage-class: "course-nfs-storage"
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: "2Gi"

4. Promtail 部署

4.1 配置文件 (loki-promtail-configmap.yaml)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
apiVersion: v1
kind: ConfigMap
metadata:
name: loki-promtail
namespace: logging
labels:
app: promtail
data:
promtail.yaml: |
client:
backoff_config:
max_period: 5m
max_retries: 10
min_period: 500ms
batchsize: 1048576
batchwait: 1s
external_labels: {}
timeout: 10s

positions:
filename: /run/promtail/positions.yaml

server:
http_listen_port: 3101

target_config:
sync_period: 10s

scrape_configs:
- job_name: kubernetes-pods-name
pipeline_stages:
- docker: {}
kubernetes_sd_configs:
- role: pod
relabel_configs:
- source_labels:
- __meta_kubernetes_pod_label_name
target_label: __service__
- source_labels:
- __meta_kubernetes_pod_node_name
target_label: __host__
- action: drop
regex: ''
source_labels:
- __service__
- action: labelmap
regex: __meta_kubernetes_pod_label_(.+)
- action: replace
replacement: $1
separator: /
source_labels:
- __meta_kubernetes_namespace
- __service__
target_label: job
- action: replace
source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- action: replace
source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- action: replace
source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- replacement: /var/log/pods/*$1/*.log
separator: /
source_labels:
- __meta_kubernetes_pod_uid
- __meta_kubernetes_pod_container_name
target_label: __path__

- job_name: kubernetes-pods-app
pipeline_stages:
- docker: {}
kubernetes_sd_configs:
- role: pod
relabel_configs:
- action: drop
regex: .+
source_labels:
- __meta_kubernetes_pod_label_name
- source_labels:
- __meta_kubernetes_pod_label_app
target_label: __service__
- source_labels:
- __meta_kubernetes_pod_node_name
target_label: __host__
- action: drop
regex: ''
source_labels:
- __service__
- action: labelmap
regex: __meta_kubernetes_pod_label_(.+)
- action: replace
replacement: $1
separator: /
source_labels:
- __meta_kubernetes_namespace
- __service__
target_label: job
- action: replace
source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- action: replace
source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- action: replace
source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- replacement: /var/log/pods/*$1/*.log
separator: /
source_labels:
- __meta_kubernetes_pod_uid
- __meta_kubernetes_pod_container_name
target_label: __path__

- job_name: kubernetes-pods-direct-controllers
pipeline_stages:
- docker: {}
kubernetes_sd_configs:
- role: pod
relabel_configs:
- action: drop
regex: .+
separator: ''
source_labels:
- __meta_kubernetes_pod_label_name
- __meta_kubernetes_pod_label_app
- action: drop
regex: '[0-9a-z-.]+-[0-9a-f]{8,10}'
source_labels:
- __meta_kubernetes_pod_controller_name
- source_labels:
- __meta_kubernetes_pod_controller_name
target_label: __service__
- source_labels:
- __meta_kubernetes_pod_node_name
target_label: __host__
- action: drop
regex: ''
source_labels:
- __service__
- action: labelmap
regex: __meta_kubernetes_pod_label_(.+)
- action: replace
replacement: $1
separator: /
source_labels:
- __meta_kubernetes_namespace
- __service__
target_label: job
- action: replace
source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- action: replace
source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- action: replace
source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- replacement: /var/log/pods/*$1/*.log
separator: /
source_labels:
- __meta_kubernetes_pod_uid
- __meta_kubernetes_pod_container_name
target_label: __path__

- job_name: kubernetes-pods-indirect-controller
pipeline_stages:
- docker: {}
kubernetes_sd_configs:
- role: pod
relabel_configs:
- action: drop
regex: .+
separator: ''
source_labels:
- __meta_kubernetes_pod_label_name
- __meta_kubernetes_pod_label_app
- action: keep
regex: '[0-9a-z-.]+-[0-9a-f]{8,10}'
source_labels:
- __meta_kubernetes_pod_controller_name
- action: replace
regex: '([0-9a-z-.]+)-[0-9a-f]{8,10}'
source_labels:
- __meta_kubernetes_pod_controller_name
target_label: __service__
- source_labels:
- __meta_kubernetes_pod_node_name
target_label: __host__
- action: drop
regex: ''
source_labels:
- __service__
- action: labelmap
regex: __meta_kubernetes_pod_label_(.+)
- action: replace
replacement: $1
separator: /
source_labels:
- __meta_kubernetes_namespace
- __service__
target_label: job
- action: replace
source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- action: replace
source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- action: replace
source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- replacement: /var/log/pods/*$1/*.log
separator: /
source_labels:
- __meta_kubernetes_pod_uid
- __meta_kubernetes_pod_container_name
target_label: __path__

- job_name: kubernetes-pods-static
pipeline_stages:
- docker: {}
kubernetes_sd_configs:
- role: pod
relabel_configs:
- action: drop
regex: ''
source_labels:
- __meta_kubernetes_pod_annotation_kubernetes_io_config_mirror
- action: replace
source_labels:
- __meta_kubernetes_pod_label_component
target_label: __service__
- source_labels:
- __meta_kubernetes_pod_node_name
target_label: __host__
- action: drop
regex: ''
source_labels:
- __service__
- action: labelmap
regex: __meta_kubernetes_pod_label_(.+)
- action: replace
replacement: $1
separator: /
source_labels:
- __meta_kubernetes_namespace
- __service__
target_label: job
- action: replace
source_labels:
- __meta_kubernetes_namespace
target_label: namespace
- action: replace
source_labels:
- __meta_kubernetes_pod_name
target_label: pod
- action: replace
source_labels:
- __meta_kubernetes_pod_container_name
target_label: container
- replacement: /var/log/pods/*$1/*.log
separator: /
source_labels:
- __meta_kubernetes_pod_annotation_kubernetes_io_config_mirror
- __meta_kubernetes_pod_container_name
target_label: __path__

4.2 RBAC 配置 (loki-promtail-rbac.yaml)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
apiVersion: v1
kind: ServiceAccount
metadata:
name: loki-promtail
labels:
app: promtail
namespace: logging
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
labels:
app: promtail
name: promtail-clusterrole
namespace: logging
rules:
- apiGroups: [""]
resources: ["nodes", "nodes/proxy", "services", "endpoints", "pods"]
verbs: ["get", "watch", "list"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: promtail-clusterrolebinding
labels:
app: promtail
namespace: logging
subjects:
- kind: ServiceAccount
name: loki-promtail
namespace: logging
roleRef:
kind: ClusterRole
name: promtail-clusterrole
apiGroup: rbac.authorization.k8s.io

4.3 部署文件 (loki-promtail.yaml)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: loki-promtail
namespace: logging
labels:
app: promtail
spec:
selector:
matchLabels:
app: promtail
updateStrategy:
rollingUpdate:
maxUnavailable: 1
type: RollingUpdate
template:
metadata:
labels:
app: promtail
spec:
serviceAccountName: loki-promtail
containers:
- name: promtail
image: 172.16.10.160:80/grafana/promtail:2.3.0
imagePullPolicy: IfNotPresent
args:
- --config.file=/etc/promtail/promtail.yaml
- --client.url=http://loki:3100/loki/api/v1/push
env:
- name: HOSTNAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: spec.nodeName
volumeMounts:
- mountPath: /etc/promtail
name: config
- mountPath: /run/promtail
name: run
- mountPath: /var/lib/docker/containers
name: docker
readOnly: true
- mountPath: /var/log/pods
name: pods
readOnly: true
ports:
- containerPort: 3101
name: http-metrics
protocol: TCP
securityContext:
readOnlyRootFilesystem: true
runAsGroup: 0
runAsUser: 0
readinessProbe:
failureThreshold: 5
httpGet:
path: /ready
port: http-metrics
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
tolerations:
- effect: NoSchedule
key: node-role.kubernetes.io/master
operator: Exists
volumes:
- name: config
configMap:
defaultMode: 420
name: loki-promtail
- name: run
hostPath:
path: /run/promtail
type: ""
- name: docker
hostPath:
path: /var/lib/docker/containers
- name: pods
hostPath:
path: /var/log/pods

5. 部署命令

1
2
3
4
5
6
kubectl apply -f loki-rbac.yaml
kubectl apply -f loki-configmap.yaml
kubectl apply -f loki.yaml
kubectl apply -f loki-promtail-rbac.yaml
kubectl apply -f loki-promtail-configmap.yaml
kubectl apply -f loki-promtail.yaml

6. 验证部署

1
2
3
4
5
# 检查Pod状态
kubectl get pods -n logging

# 检查服务状态
kubectl get svc -n logging

7. 离线部署注意事项

  1. 镜像准备

    • 提前下载所有需要的Docker镜像
    • 推送到内部私有镜像仓库
    • 确保所有节点都能访问私有仓库
  2. 存储配置

    • 根据实际情况调整StorageClass配置
    • 确保有足够的存储空间
  3. 网络配置

    • 确保NodePort端口在防火墙中开放
    • 检查网络策略是否允许必要的通信
  4. 资源限制

    • 根据集群规模调整Loki和Promtail的资源请求和限制

参考文档: Loki+Promtail日志收集方案