人生其实如草 – 活的不过“从容”二字

Upgrading the kernel on CentOS 7

Use the following commands to install the latest mainline kernel from ELRepo:

rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm
yum --disablerepo="*" --enablerepo="elrepo-kernel" list available
yum --enablerepo=elrepo-kernel install kernel-ml

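Edit /etc/default/grub and set GRUB_DEFAULT to 0 so that the first menu entry, normally the newest kernel, boots by default.
Then run the following command to regenerate the GRUB configuration: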

grub2-mkconfig -o /boot/grub2/grub.cfg
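
The GRUB_DEFAULT edit can also be scripted. The following is a minimal sketch, assuming GNU sed and a file that already contains a GRUB_DEFAULT= line; the helper name set_grub_default is made up for illustration and is not part of GRUB:

```shell
# set_grub_default FILE VALUE
# Rewrite the GRUB_DEFAULT= line in FILE (hypothetical helper, not part of GRUB).
# Assumes GNU sed (-i) and that a GRUB_DEFAULT= line already exists.
set_grub_default() {
    sed -i "s/^GRUB_DEFAULT=.*/GRUB_DEFAULT=$2/" "$1"
}

# Usage (as root), before running grub2-mkconfig:
#   set_grub_default /etc/default/grub 0
```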

Reboot the server and run "uname -r" to check that the kernel was upgraded.
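
The check after the reboot can be automated by comparing the running release against a minimum version. This is only a sketch: it relies on the -V (version sort) option of GNU sort, and the threshold 4.0 in the usage line is an arbitrary example, not a value from this post.

```shell
# version_ge A B: succeed when version A is greater than or equal to version B.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# kernel_base RELEASE: strip the packaging suffix, e.g.
# "5.17.1-1.el7.elrepo.x86_64" -> "5.17.1".
kernel_base() {
    printf '%s\n' "${1%%-*}"
}

# Usage after the reboot (4.0 is an arbitrary example threshold):
#   version_ge "$(kernel_base "$(uname -r)")" 4.0 && echo "kernel upgraded"
```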

Reference: https://linuxstory.org/how-to-install-or-upgrade-the-latest-kernel-in-centos-7/

Setting up a Prometheus monitoring stack

Installing Prometheus

Run the following commands to download and unpack Prometheus:

mkdir -p /opt/monitor && cd /opt/monitor
wget https://github.com/prometheus/prometheus/releases/download/v2.35.0/prometheus-2.35.0.linux-amd64.tar.gz
tar xvf prometheus-2.35.0.linux-amd64.tar.gz && mv prometheus-2.35.0.linux-amd64 prometheus
mkdir -p /opt/monitor/prometheus/data

Start Prometheus under supervisor with the program configuration below. For installing supervisor itself, see the "supervisor installation" post:

[program:prometheus]
process_name=%(program_name)s
command=/opt/monitor/prometheus/prometheus --config.file=/opt/monitor/prometheus/prometheus.yml --storage.tsdb.path=/opt/monitor/prometheus/data --storage.tsdb.retention.time=60d --log.level=info --web.listen-address="192.168.19.69:9090"
autostart=true
autorestart=true
user=root
redirect_stderr=true
stdout_logfile=/var/log/supervisor/prometheus.log

The contents of /opt/monitor/prometheus/prometheus.yml are as follows:

# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["192.168.19.69:9090"]

Run "supervisorctl update" to load the newly added configuration, then run "supervisorctl status" to check that Prometheus started.
Open http://192.168.19.69:9090/ in a browser to confirm that it is reachable.
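
Besides opening the page in a browser, the server can be probed from the command line through the /-/healthy endpoint that Prometheus exposes. A minimal sketch, assuming curl is installed; the function names are illustrative:

```shell
# health_url BASE: build the URL of Prometheus' /-/healthy endpoint.
health_url() {
    printf '%s/-/healthy\n' "${1%/}"
}

# prometheus_healthy BASE: succeed when the endpoint answers with HTTP 2xx.
# Needs curl; -f turns HTTP errors into a non-zero exit status.
prometheus_healthy() {
    curl -fsS --max-time 5 -o /dev/null "$(health_url "$1")"
}

# Usage:
#   prometheus_healthy http://192.168.19.69:9090 && echo "prometheus is up"
```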

Installing node_exporter

Download and unpack node_exporter with the following commands:

mkdir -p /opt/monitor && cd /opt/monitor
wget https://github.com/prometheus/node_exporter/releases/download/v1.3.1/node_exporter-1.3.1.linux-amd64.tar.gz
tar xvf node_exporter-1.3.1.linux-amd64.tar.gz && mv node_exporter-1.3.1.linux-amd64 node_exporter

Start node_exporter under supervisor with the program configuration below. For installing supervisor itself, see the "supervisor installation" post:

[program:node_exporter]
process_name=%(program_name)s
command=/opt/monitor/node_exporter/node_exporter  --web.listen-address="192.168.19.69:9111"  --web.config="/opt/monitor/node_exporter/config.yaml"
autostart=true
autorestart=true
user=root
redirect_stderr=true
stdout_logfile=/var/log/supervisor/node_exporter.log

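The contents of /opt/monitor/node_exporter/config.yaml are shown below. The password was generated with htpasswd, and the value stored in this file should be the resulting bcrypt hash rather than the plain-text password; see https://www.cnblogs.com/xjzyy/p/15602929.html for how to generate it. Basic authentication is optional and can be left out entirely.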
```
basic_auth_users:
  prometheus: your_password
```
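
One way to produce that file is sketched below. It assumes the htpasswd tool is available (apache2-utils on Debian/Ubuntu, httpd-tools on CentOS); the function names and the cost value 10 are illustrative choices, not something this post prescribes. Note that -b passes the password on the command line, where other local users may see it in the process list.

```shell
# bcrypt_hash PASSWORD: print a bcrypt hash of PASSWORD using Apache's htpasswd.
# -n prints to stdout, -b takes the password from the command line,
# -B selects bcrypt, -C sets the bcrypt cost. The empty user name leaves a
# leading ":" in the output, which tr removes together with the newlines.
bcrypt_hash() {
    htpasswd -nbB -C 10 "" "$1" | tr -d ':\n'
}

# render_web_config USER HASH: print a web config with one basic-auth user.
render_web_config() {
    printf 'basic_auth_users:\n  %s: %s\n' "$1" "$2"
}

# Usage:
#   render_web_config prometheus "$(bcrypt_hash 'your_password')" \
#       > /opt/monitor/node_exporter/config.yaml
```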
Run "supervisorctl update" to load the newly added configuration, then run "supervisorctl status" to check that node_exporter started.

Update /opt/monitor/prometheus/prometheus.yml as follows to add a scrape job for node_exporter:

# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["192.168.19.69:9090"]

  - job_name: '69_node_exporter'
    scrape_interval: 5s
    scheme: http
    basic_auth:
      username: prometheus
      password: your_password # the plain-text password configured for node_exporter
    static_configs:
    - targets: ['192.168.19.69:9111']
      labels:
        instance: 192.168.19.69

Run "supervisorctl restart prometheus" to restart Prometheus and apply the new configuration.

Installing Grafana

Install Grafana with the following commands:

mkdir -p /opt/monitor && cd /opt/monitor
wget https://dl.grafana.com/enterprise/release/grafana-enterprise-8.5.0.linux-amd64.tar.gz
tar xvf grafana-enterprise-8.5.0.linux-amd64.tar.gz && mv grafana-8.5.0/ grafana
cd /opt/monitor/grafana/conf && cp sample.ini grafana.ini

Start Grafana under supervisor with the program configuration below. For installing supervisor itself, see the "supervisor installation" post:

[program:grafana]
process_name=%(program_name)s
directory=/opt/monitor/grafana/bin
command=/opt/monitor/grafana/bin/grafana-server -config /opt/monitor/grafana/conf/grafana.ini
autostart=true
autorestart=true
user=root
redirect_stderr=true
stdout_logfile=/var/log/supervisor/grafana.log

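Run "supervisorctl update" to load the newly added configuration, then run "supervisorctl status" to check that Grafana started.
Log in at http://192.168.19.69:3000/login; the default user name and password are both "admin". Then open the settings, add a data source of type Prometheus, and set its URL to http://192.168.19.69:9090.
Click "+", then "Import", enter dashboard ID 1860, select the Prometheus data source, and import the dashboard.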
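
The data source can also be created without the UI through Grafana's HTTP API (POST /api/datasources). The sketch below assumes curl is installed and that the default admin:admin login is still in place; the function names and the data source name "Prometheus" are illustrative.

```shell
# datasource_json NAME URL: print the JSON body for a Prometheus data source.
# Values are inserted verbatim, so they must not contain quotes or backslashes.
datasource_json() {
    printf '{"name":"%s","type":"prometheus","access":"proxy","url":"%s","isDefault":true}\n' "$1" "$2"
}

# add_datasource GRAFANA_URL USER:PASSWORD NAME PROMETHEUS_URL
# Sends the body to Grafana's POST /api/datasources endpoint with basic auth.
add_datasource() {
    curl -fsS -u "$2" -H 'Content-Type: application/json' \
        -X POST -d "$(datasource_json "$3" "$4")" "${1%/}/api/datasources"
}

# Usage:
#   add_datasource http://192.168.19.69:3000 admin:admin Prometheus http://192.168.19.69:9090
```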

Installing Alertmanager

Install Alertmanager with the following commands:

mkdir -p /opt/monitor && cd /opt/monitor
wget https://github.com/prometheus/alertmanager/releases/download/v0.24.0/alertmanager-0.24.0.linux-amd64.tar.gz
tar xvf alertmanager-0.24.0.linux-amd64.tar.gz && mv alertmanager-0.24.0.linux-amd64/ alertmanager
mkdir -p /opt/monitor/alertmanager/template

Start Alertmanager under supervisor with the program configuration below. For installing supervisor itself, see the "supervisor installation" post:

[program:alertmanager]
process_name=%(program_name)s
command=/opt/monitor/alertmanager/alertmanager --config.file="/opt/monitor/alertmanager/alertmanager.yml"   --web.listen-address="192.168.19.69:9993"  --cluster.listen-address="192.168.19.69:9994"
autostart=true
autorestart=true
user=root
redirect_stderr=true
stdout_logfile=/var/log/supervisor/alertmanager.log

The contents of /opt/monitor/alertmanager/alertmanager.yml are as follows:

global:
  resolve_timeout: 5m

templates:
  - '/opt/monitor/alertmanager/template/test.tmpl'

route:
  group_by: ['alertname']
  group_wait: 5s
  group_interval: 5s
  repeat_interval: 2m
  receiver: 'wechat'

receivers:
- name: 'web.hook'
  webhook_configs:
  - url: 'http://127.0.0.1:5001/'
- name: 'wechat'  # WeCom (WeChat Work) alert settings; you must register an application first
  wechat_configs:
    - send_resolved: true
      agent_id: 'your_agent'
      to_user: 'your_name'
      api_secret: 'your_api_secret'
      corp_id: 'your_corp_id'

The contents of /opt/monitor/alertmanager/template/test.tmpl are as follows; this template defines the format of the alert message:

{{ define "wechat.default.message" }}
{{ range $i, $alert := .Alerts }}
=======  Monitoring alert  =========
Status: {{ .Status }}
Severity: {{ $alert.Labels.severity }}
Alert name: {{ $alert.Labels.alertname }}
Summary: {{ $alert.Annotations.summary }}
Host: {{ $alert.Labels.instance }}
Details: {{ $alert.Annotations.description }}
Trigger value: {{ $alert.Annotations.value }}
Triggered at: {{ ($alert.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}
Resolved at: {{ ($alert.EndsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}
==========  end  ==========
{{ end }}
{{ end }}
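
The constant 28800e9 in the two time lines is an offset in nanoseconds, the unit Go uses for durations; it shifts the UTC timestamps by eight hours to UTC+8. The small sketch below only reproduces that arithmetic (the function name is illustrative), which helps when adapting the template to another time zone:

```shell
# hours_to_ns HOURS: convert whole hours to nanoseconds, the unit of a Go duration.
hours_to_ns() {
    echo $(( $1 * 3600 * 1000000000 ))
}

hours_to_ns 8   # prints 28800000000000, i.e. 28800e9 (UTC+8)
```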

Run "supervisorctl update" to load the newly added configuration, then run "supervisorctl status" to check that Alertmanager started.

Edit /opt/monitor/prometheus/prometheus.yml and add the Alertmanager target and the rule files, as shown below:

# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['192.168.19.69:9993']

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "rules/*.yml"
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["192.168.19.69:9090"]

  - job_name: '69_node_exporter'
    scrape_interval: 5s
    scheme: http
    basic_auth:
      username: prometheus
      password: your_password # the plain-text password configured for node_exporter
    static_configs:
    - targets: ['192.168.19.69:9111']
      labels:
        instance: 192.168.19.69

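Run "mkdir -p /opt/monitor/prometheus/rules" and create three files in that directory: dist.yml, mem.yml, and unreachable.yml. The file names are arbitrary; each file defines alerting rules. Their contents are as follows.

The contents of dist.yml: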

groups:
- name: root_dist_error
  rules:
  - alert: "硬盘报警"
    expr: 100 - (node_filesystem_avail_bytes{device="rootfs",fstype="rootfs",mountpoint="/"} / node_filesystem_size_bytes{device="rootfs",fstype="rootfs",mountpoint="/"}) * 100 > 80
    for: 60s
    labels:
      severity: error 
      team: testteam
    annotations:
      summary: "root disk used is large"
      description: "root filesystem usage is above 80%"
      value: "{{ humanize $value }}%"
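
To sanity-check the threshold on the host itself, the same formula (100 - available / size * 100) can be applied to df output. This is a sketch with an illustrative function name; the result can differ slightly from df's own Capacity column, which is computed from used and available space rather than from the total size.

```shell
# root_used_pct: read `df -P` style output on stdin and print the used
# percentage as 100 - available / size * 100, mirroring the alert expression.
# Columns in POSIX df output: filesystem, size, used, available, capacity, mount.
root_used_pct() {
    awk 'NR == 2 { printf "%.1f\n", 100 - $4 / $2 * 100 }'
}

# Usage:
#   df -P / | root_used_pct
```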

The contents of mem.yml:

groups:
  - name: error_mem
    rules:
    - alert: "memory error"
      expr: (node_memory_MemTotal_bytes - (node_memory_MemFree_bytes+node_memory_Buffers_bytes+node_memory_Cached_bytes )) / node_memory_MemTotal_bytes * 100 > 85
      for: 20s
      labels:
        severity: error 
        team: testteam
      annotations:
        summary: "Memory Usage is busy"
        description: "memory usage is above 85%"
        value: "{{ humanize $value }}%"
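
The memory expression can be reproduced locally from /proc/meminfo, whose MemTotal, MemFree, Buffers, and Cached fields correspond to the node_memory_* metrics used in the rule. A minimal sketch with an illustrative function name; the optional argument exists only so that the function can be pointed at a sample file:

```shell
# mem_used_pct [FILE]: print (total - (free + buffers + cached)) / total * 100
# from a meminfo-style file (default: /proc/meminfo), as in the alert expression.
mem_used_pct() {
    awk '
        $1 == "MemTotal:" { total = $2 }
        $1 == "MemFree:"  { free = $2 }
        $1 == "Buffers:"  { buffers = $2 }
        $1 == "Cached:"   { cached = $2 }
        END { if (total > 0) printf "%.1f\n", (total - (free + buffers + cached)) / total * 100 }
    ' "${1:-/proc/meminfo}"
}

# Usage:
#   mem_used_pct
```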

The contents of unreachable.yml:

groups: 
  - name: InstanceDown # name of a group of related alerts for the monitored nodes
    rules:
    - alert: InstanceDown
      expr: up == 0 # every scraped instance has an "up" metric: 0 means the scrape failed, 1 means the target is alive
      for: 20m # the alert fires only after up == 0 has held for 20 minutes
      labels: # labels attached to the alert
        severity: error # severity level
      annotations: # descriptive annotations
        summary: "Instance {{ $labels.instance }} is down"
        description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 20 minutes."

Then run "supervisorctl restart prometheus" to restart Prometheus and apply the changes.
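
Before restarting, the configuration and the rule files it references can be validated with promtool, which ships in the same tarball as Prometheus. The wrapper below is only a sketch: the function name and the return code 2 for a missing binary are illustrative, and the default paths follow the layout used in this post.

```shell
# check_prometheus_config [PROMTOOL] [CONFIG]
# Run "promtool check config", which also checks the rule files the config
# references. Returns 2 when the promtool binary is missing.
check_prometheus_config() {
    promtool="${1:-/opt/monitor/prometheus/promtool}"
    config="${2:-/opt/monitor/prometheus/prometheus.yml}"
    if [ ! -x "$promtool" ]; then
        echo "promtool not found at $promtool" >&2
        return 2
    fi
    "$promtool" check config "$config"
}

# Usage:
#   check_prometheus_config && supervisorctl restart prometheus
```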

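Installing YApi on Debian 10

There are two prerequisites:

Install Node.js, version >= 7.6. In testing, Node 14 and Node 16 did not work either; the version currently in use is Node 10.
Install MongoDB, version >= 2.6. Run "mongo --version" to check the version.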
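
Since only Node 10 is reported to work here, a quick guard before installing can save a failed deployment. A minimal sketch; the function names are illustrative, and the check simply tests for major version 10 rather than encoding a supported range:

```shell
# node_major VERSION: extract the major number from a string such as "v10.24.1".
node_major() {
    v="${1#v}"
    printf '%s\n' "${v%%.*}"
}

# node_is_tested VERSION: succeed only for Node 10, the version this post used.
node_is_tested() {
    [ "$(node_major "$1")" = "10" ]
}

# Usage:
#   node_is_tested "$(node --version)" || echo "warning: only Node 10 was tested"
```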

Installation steps

Run the following commands to install the YApi bootstrap tool and start the deployment server:

npm install -g yapi-cli --registry https://registry.npm.taobao.org
npm install -g node-gyp
npm install -g pm2
yapi server

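Following the prompt printed on the command line, open the deployment page in a browser.
On that page: choose the version to deploy -> enter the company name -> enter the YApi deployment path -> enter the administrator email -> enter the site port -> enter the database address -> enter the database port -> enter the database name -> click to start the deployment.
When the deployment succeeds, it reports that the administrator account was initialized with user name "admin@admin.com" and password "ymfe.org". Change to the deployment directory and run "pm2 start vendors/server/app.js" to start the server, then open http://127.0.0.1:3000 in a browser.
The following commands make pm2 start on boot: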
```
pm2 startup
pm2 save
```

References: https://github.com/YMFE/yapi/issues/16 ; https://mp.weixin.qq.com/s/XNntrSbRhOokQivC9Hffwg

Ubuntu 20.04 initialization script

#!/bin/bash

#update soft
apt update && apt upgrade -y
apt install wget tar curl rsync bzip2 lsof telnet htop screen tree vim gcc git make net-tools lrzsz psmisc hwloc gsmartcontrol chrony -y

# time settings
timedatectl set-local-rtc 1
timedatectl set-timezone Asia/Shanghai
systemctl start chrony
systemctl enable chrony


cat <<EOF | sudo tee /lib/systemd/system/rc.local.service
[Unit]
Description=/etc/rc.local Compatibility
Documentation=man:systemd-rc-local-generator(8)
ConditionFileIsExecutable=/etc/rc.local
After=syslog.target network.target remote-fs.target nss-lookup.target

[Service]
Type=forking
ExecStart=/etc/rc.local start
TimeoutSec=0
RemainAfterExit=no
GuessMainPID=no

[Install]
WantedBy=multi-user.target
Alias=rc-local.service
EOF

ln -s /lib/systemd/system/rc.local.service /etc/systemd/system/rc.local.service


cat <<EOF | sudo tee /etc/rc.local
#!/bin/bash
# Put the commands you want to run at boot here; do not add commands that loop forever

exit 0
EOF

chmod 755 /etc/rc.local

# raise the maximum number of open file descriptors
cat >> /etc/security/limits.conf <<EOF
*           soft   nofile       65535
*           hard   nofile       65535
EOF

#set ssh
sed -i 's/^GSSAPIAuthentication yes$/GSSAPIAuthentication no/' /etc/ssh/sshd_config
sed -i 's/#UseDNS yes/UseDNS no/' /etc/ssh/sshd_config
systemctl  restart sshd.service


# profile changes
echo "export HISTTIMEFORMAT=\"%F %T \"" >> /etc/profile
echo "" >> /etc/profile
echo "## custom aliases" >> /etc/profile
echo "alias c=clear" >> /etc/profile
echo "alias vi=vim" >> /etc/profile
echo "alias dsh='du -hsx * | sort -rh | head -n 10'" >> /etc/profile
sed -i 's/HISTSIZE=1000/HISTSIZE=10000/g' /etc/profile
source /etc/profile
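
Running the script a second time would append the same lines to /etc/profile and /etc/security/limits.conf again. One possible way to make the appends idempotent is sketched below; append_once is a hypothetical helper, not something the script above defines.

```shell
# append_once LINE FILE: append LINE to FILE unless an identical line exists.
# grep -q: quiet, -x: match the whole line, -F: treat LINE as a fixed string.
append_once() {
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Usage:
#   append_once 'alias c=clear' /etc/profile
#   append_once '*           soft   nofile       65535' /etc/security/limits.conf
```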

Blocking IPs from outside China in Nginx (Debian)

Install Nginx with the following script:

#!/bin/bash
## install nginx
WORK_DIR=`mktemp -d`
apt install -y wget git libpcre3 libpcre3-dev zlib1g-dev \
    openssl libssl-dev libxml2 libxml2-dev libxslt-dev  \
    gcc  make libgd-dev  libgeoip-dev  libperl-dev 'libmaxminddb*'
cd /opt && git clone https://github.com/leev/ngx_http_geoip2_module.git

cd $WORK_DIR
wget http://nginx.org/download/nginx-1.20.1.tar.gz

tar zxvf nginx-1.20.1.tar.gz && cd nginx-1.20.1
./configure --prefix=/opt/nginx --user=apache --group=apache \
    --with-http_ssl_module --with-http_stub_status_module --with-http_gzip_static_module \
    --with-pcre --with-http_v2_module --with-http_dav_module \
    --with-http_flv_module --with-http_realip_module --with-http_addition_module \
    --with-http_xslt_module --with-http_sub_module --with-http_random_index_module \
    --with-http_degradation_module --with-http_secure_link_module --with-http_perl_module \
    --add-module=/opt/ngx_http_geoip2_module \
    --with-debug --with-file-aio --with-stream --with-ld-opt=-Wl,-E
make && make install
cd ~ && [ -d $WORK_DIR ] && rm $WORK_DIR -rf

## add the account that nginx runs as
if ! id apache >/dev/null 2>&1; then
    groupadd apache
    useradd -g apache -s /sbin/nologin -c "apache" apache
fi

An example nginx configuration file:

user  apache;
worker_processes  auto;

events {
    worker_connections  65535;
}

http {
    include       mime.types;
    default_type  application/octet-stream;
    sendfile        on;
    tcp_nopush     on;
    keepalive_timeout  65;
    gzip  on;

    #  Download GeoLite2-Country.mmdb yourself from https://dev.maxmind.com and place it in /usr/share/GeoIP
    geoip2 /usr/share/GeoIP/GeoLite2-Country.mmdb {
        auto_reload 5m;
        $geoip2_data_country_code country iso_code;
    }

    # "yes" only for clients whose address resolves to China; everything else is denied
    map $geoip2_data_country_code $allowed_country {
        default no;
        CN yes;
    }

    server {
        listen       80;
        server_name  localhost;

        location / {
            root   html;
            index  index.html index.htm;
        }

        if ($allowed_country = no) {
            return 403;
        }
    }

}

Testing this requires a client located outside China.
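
When no foreign host is at hand, the database can at least be queried directly to see which country code a given address maps to. The sketch below assumes the mmdblookup tool from libmaxminddb is installed (on Debian it should be in the mmdb-bin package); the function names are illustrative.

```shell
# parse_iso_code: read mmdblookup output on stdin and print the two-letter
# country code. The value line looks like:   "US" <utf8_string>
parse_iso_code() {
    sed -n 's/^[[:space:]]*"\([A-Z][A-Z]\)".*/\1/p'
}

# country_of IP: look up IP in the database used by the nginx config.
# Assumes the mmdblookup tool (shipped with libmaxminddb) is installed.
country_of() {
    mmdblookup --file /usr/share/GeoIP/GeoLite2-Country.mmdb --ip "$1" country iso_code | parse_iso_code
}

# Usage (with the config above, any result other than CN is answered with 403):
#   country_of 8.8.8.8
```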
