How to Monitor and Alert on Nginx on Ubuntu

Published: 2026-01-22 07:07:04 | Author: guest | Category: Hosting News

A Practical Approach to Nginx Monitoring and Alerting on Ubuntu

I. Monitoring Goals and Baseline Metrics

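Cover four categories of key metrics first:
1. Latency: track $request_time and $upstream_response_time, plot TP95/TP99 curves, and set thresholds according to what the business can tolerate.
2. Errors: monitor HTTP 5xx/4xx responses, especially 500/502/504, and also collect critical errors from error.log.
3. Traffic: track PV/UV, key endpoints, and backend hit rate to spot sudden spikes or drops.
4. Saturation: track CPU, connection counts/queues, and disk I/O. The maximum number of concurrent connections is roughly worker_processes × worker_connections; when Nginx acts as a reverse proxy, each proxied request also holds an upstream connection, so the usable client capacity is lower.
Enable ngx_http_stub_status_module to obtain basic connection and request counters for liveness and saturation monitoring.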
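The TP95/TP99 values in the latency item are percentiles of the request-time distribution. The following is a minimal Python sketch of the nearest-rank method, included only to make the definition concrete. The function name percentile and the sample values are illustrative assumptions; monitoring systems such as Prometheus estimate quantiles from histogram buckets rather than from raw samples.

```python
import math

def percentile(samples, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # 1-based rank of the smallest value that covers p percent of the samples
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

# Hypothetical $request_time values in seconds (not real measurements)
request_times = [0.05, 0.07, 0.08, 0.10, 0.12, 0.15, 0.20, 0.35, 0.90, 2.40]
tp50 = percentile(request_times, 50)
tp95 = percentile(request_times, 95)
tp99 = percentile(request_times, 99)
print(f"TP50={tp50}s TP95={tp95}s TP99={tp99}s")
```

With only ten samples, TP95 and TP99 both land on the slowest request, which is why percentile thresholds are usually evaluated over a window with many requests.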

II. Option 1: Prometheus + Alertmanager + Nginx Exporter (recommended)

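Enable the Nginx status page
Add the following to a server block:
location /nginx_status { stub_status; access_log off; allow 127.0.0.1; deny all; }
After reloading Nginx (for example sudo nginx -t && sudo systemctl reload nginx), http://127.0.0.1/nginx_status shows the active connections, the accepts/handled/requests totals, and the Reading/Writing/Waiting counts. Because of the allow/deny rules, the page is reachable only from the server itself; a request to http://<server-IP>/nginx_status from another host is rejected with 403.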
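To show what the status page counters look like to a program, here is a small Python sketch that parses the plain-text stub_status output. The sample text follows the layout produced by the module, but the numbers and the function name parse_stub_status are illustrative assumptions; in practice the exporter described next does this parsing for you.

```python
import re

# Sample stub_status output (values are illustrative, not from a real server)
SAMPLE = """Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
"""

def parse_stub_status(text):
    """Parse stub_status text output into a dict of integer counters."""
    active = re.search(r"Active connections:\s+(\d+)", text)
    totals = re.search(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s*$", text, re.MULTILINE)
    states = re.search(r"Reading:\s+(\d+)\s+Writing:\s+(\d+)\s+Waiting:\s+(\d+)", text)
    if not (active and totals and states):
        raise ValueError("unexpected stub_status format")
    return {
        "active": int(active.group(1)),
        "accepts": int(totals.group(1)),
        "handled": int(totals.group(2)),
        "requests": int(totals.group(3)),
        "reading": int(states.group(1)),
        "writing": int(states.group(2)),
        "waiting": int(states.group(3)),
    }

status = parse_stub_status(SAMPLE)
# accepts - handled > 0 means some accepted connections were dropped,
# for example because the worker_connections limit was reached
dropped = status["accepts"] - status["handled"]
print(status, "dropped:", dropped)
```

A non-zero difference between accepts and handled is a simple saturation signal worth alerting on.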
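Deploy the Nginx exporter (using the official nginx-prometheus-exporter as an example)
Example startup command; replace <nginx-host> with an address at which the exporter can reach Nginx:
docker run -d --name nginx-exporter \
  -p 9113:9113 \
  nginx/nginx-prometheus-exporter:latest \
  --nginx.scrape-uri=http://<nginx-host>/nginx_status
Recent exporter releases use the double-dash form --nginx.scrape-uri. If the exporter runs in a container, its requests do not arrive from 127.0.0.1, so the allow rule on the status page must also permit the container's source address.
Opening http://<exporter-host>:9113/metrics should then show the nginx metrics.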
Configure Prometheus scraping
Add the following to prometheus.yml, replacing <exporter-host> with the exporter's address:
scrape_configs:
  - job_name: 'nginx'
    static_configs:
      - targets: ['<exporter-host>:9113']
Configure Alertmanager email alerts
Example alertmanager.yml:
global:
  smtp_smarthost: 'smtp.example.com:587'
  smtp_from: 'alertmanager@example.com'
  smtp_auth_username: 'alertmanager'
  smtp_auth_password: 'password'
  smtp_require_tls: true   # STARTTLS on the submission port
route:
  receiver: 'email'
receivers:
  - name: 'email'
    email_configs:
      - to: 'admin@example.com'
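Prometheus alert rule example (/etc/prometheus/rules/alert.rules)
The file must be listed under rule_files in prometheus.yml, and the Alertmanager address under alerting, for the rules to be evaluated and the alerts delivered. The stub_status-based exporter exposes only connection gauges and an unlabeled nginx_http_requests_total counter, so the NginxHigh5xx and NginxHighLatency rules assume an additional metrics source (for example a log-based exporter) that provides a status label and a request-duration histogram under these names.
groups:
  - name: nginx
    rules:
      - alert: InstanceDown
        expr: up{job="nginx"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Nginx exporter down ({{ $labels.instance }})"
          description: "Nginx exporter has been down for more than 1 minute."
      # Assumes a status label, which the stub_status exporter does not provide
      - alert: NginxHigh5xx
        expr: sum(rate(nginx_http_requests_total{status=~"5.."}[5m])) by (instance) / sum(rate(nginx_http_requests_total[5m])) by (instance) > 0.05
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "High 5xx rate on {{ $labels.instance }}"
          description: "5xx rate is {{ $value | humanizePercentage }} over the last 5 minutes."
      # Assumes a request-duration histogram, which the stub_status exporter does not provide
      - alert: NginxHighLatency
        expr: histogram_quantile(0.95, sum(rate(nginx_http_request_duration_seconds_bucket[5m])) by (le,instance)) > 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High 95th percentile latency on {{ $labels.instance }}"
          description: "95th percentile request latency is {{ $value }}s over the last 5 minutes."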

III. Option 2: Log-Based Alerting with ELK or Grafana Loki

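ELK (Filebeat → Logstash → Elasticsearch → Kibana)
1. Filebeat collects /var/log/nginx/access.log and error.log.
2. Logstash parses the logs with Grok and writes the structured events to Elasticsearch.
3. Kibana provides the index pattern and the visualization dashboards.
4. Use Watcher or Kibana Alerting to configure threshold/anomaly rules (for example the 5xx ratio, a surge in 403/404 responses, or scanning of specific paths).
Grafana Loki + Promtail + Grafana
1. Promtail collects the Nginx logs and attaches labels.
2. Loki stores the logs and serves queries.
3. Grafana provides the log panels, and Grafana Alerting triggers alerts from LogQL queries, for example sum(rate({job="nginx"} |~ `" 5[0-9]{2} ` [5m])), which counts lines per second whose status field is 5xx in the default combined log format.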
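
Both log pipelines ultimately evaluate the same kind of condition: the share of requests in a window whose status is 5xx. The Python sketch below illustrates that calculation on a handful of access log lines, assuming the default combined log format. The function name five_xx_ratio and the sample lines are hypothetical; a real deployment would express this as a Kibana or Grafana alert rule instead.

```python
import re

# Matches the request, status and size fields of the default "combined" log format
LINE_RE = re.compile(r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)')

def five_xx_ratio(lines):
    """Return the share of parsed requests whose status is 5xx (0.0 if none parsed)."""
    total = errors = 0
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that do not look like combined-format entries
        total += 1
        if match.group("status").startswith("5"):
            errors += 1
    return errors / total if total else 0.0

# Hypothetical sample lines (documentation-range IP addresses)
sample = [
    '203.0.113.5 - - [22/Jan/2026:07:00:01 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/8.5.0"',
    '203.0.113.5 - - [22/Jan/2026:07:00:02 +0000] "GET /api HTTP/1.1" 502 157 "-" "curl/8.5.0"',
    '203.0.113.9 - - [22/Jan/2026:07:00:03 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.5.0"',
    '203.0.113.9 - - [22/Jan/2026:07:00:04 +0000] "POST /api HTTP/1.1" 504 160 "-" "curl/8.5.0"',
]
ratio = five_xx_ratio(sample)
print(f"5xx ratio: {ratio:.2%}")
```

A threshold such as the 5% used in the Prometheus rule can then be compared against this ratio.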

IV. Lightweight Quick Options and Operational Notes

Lightweight quick options
- Live view: tail -f /var/log/nginx/access.log /var/log/nginx/error.log
- Visual analysis: GoAccess
  goaccess /var/log/nginx/access.log -a -d --log-format=COMBINED -o report.html
  Real-time HTML: goaccess … --real-time-html --port=7890
- Security and abuse protection: Fail2Ban automatically bans clients that probe or brute-force paths such as /wp-admin or .env.
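The idea behind banning probing clients can be sketched as counting suspicious requests per client IP and flagging those above a threshold. The Python below is only an illustration of that counting step under assumptions: the prefix list, the threshold of 3, and the function names are hypothetical, and it does not reflect how Fail2Ban itself is implemented or configured.

```python
from collections import Counter

# Assumed probe path prefixes, based on the examples in the list above
PROBE_PREFIXES = ("/wp-admin", "/.env")

def count_probes(lines, prefixes=PROBE_PREFIXES):
    """Count, per client IP, requests whose path starts with a probe prefix."""
    hits = Counter()
    for line in lines:
        quoted = line.split('"')
        if len(quoted) < 3:
            continue  # no quoted request field on this line
        client_ip = line.split(" ", 1)[0]
        request = quoted[1].split()  # e.g. ['GET', '/wp-admin/', 'HTTP/1.1']
        if len(request) >= 2 and request[1].startswith(prefixes):
            hits[client_ip] += 1
    return hits

def offenders(hits, threshold=3):
    """Return the sorted IPs with at least `threshold` probe requests (threshold is an assumption)."""
    return sorted(ip for ip, count in hits.items() if count >= threshold)

# Hypothetical sample lines (documentation-range IP addresses)
sample = [
    '198.51.100.7 - - [22/Jan/2026:07:00:01 +0000] "GET /wp-admin/ HTTP/1.1" 404 153 "-" "-"',
    '198.51.100.7 - - [22/Jan/2026:07:00:02 +0000] "GET /.env HTTP/1.1" 404 153 "-" "-"',
    '198.51.100.7 - - [22/Jan/2026:07:00:03 +0000] "GET /wp-admin/setup-config.php HTTP/1.1" 404 153 "-" "-"',
    '203.0.113.5 - - [22/Jan/2026:07:00:04 +0000] "GET / HTTP/1.1" 200 612 "-" "-"',
]
hits = count_probes(sample)
print(offenders(hits))
```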
Log rotation and permissions
/var/log/nginx/*.log {
    daily
    missingok
    rotate 7
    compress
    delaycompress
    notifempty
    create 0640 www-data adm
    sharedscripts
    postrotate
        if [ -f /var/run/nginx.pid ]; then kill -USR1 "$(cat /var/run/nginx.pid)"; fi
    endscript
}
Make sure the ownership and permissions of /var/log/nginx/ are correct so that log collection and alerting do not fail.
