
Nginx reverse proxy statistics script

Task

Nginx runs as a frontend for several Tomcat and Apache servers.
We need to collect statistics on back-end response times and on traffic in general.

Below is a script that reads an Nginx log written in a custom format and generates a report like this
(the Errors column is the sum of 5xx and 4xx):

Nginx statistics from vps ( 192.168.77.140 ):

Total     Errors    5xx       4xx       3xx       2xx       1xx       Domain
============================================================================================
50666     2013      28        1985      1219      47434     0         spw4u.propertyminder.com
61335     165       2         163       144       61026     0         spw4u.com
122       84        0         84        11        27        0         1611-shirley-ave-94024.com
6959      16        0         16        2092      4851      0         homebrella.com
6111      15        0         15        0         4814      1282      messaging-history-us.propertyminder.com
4623      6         0         6         3313      1304      0         login-us.propertyminder.com
181       3         0         3         0         178       0         1310primavera.com
4199      1         0         1         2093      2105      0         provider.homebrella.com
4197      0         0         0         2092      2105      0         my.billy.com
389       0         0         0         108       281       0         crm.pminder.com
117       0         0         0         0         117       0         geo.propertyminder.com
--------------------------------------------------------------------------------------------
138899    2303      30        2273      11072     124242    1282      Totals

Upstream  Total                          
response  upstream                       
time      requests  Upstream             Domain
=====================================================================
14.92     6111      192.168.1.73:8080    messaging-history-us.propertyminder.com
2.61      122       192.168.1.72:80      1611-shirley-ave-94024.com
0.88      49684     192.168.1.72:80      spw4u.propertyminder.com
0.67      181       192.168.1.72:80      1310primavera.com
0.37      117       192.168.1.72:80      geo.propertyminder.com
0.19      61305     192.168.1.72:80      spw4u.com
0.09      4199      192.168.1.72:8082    provider.homebrella.com
0.05      6959      192.168.1.72:8082    homebrella.com
0.03      4197      192.168.1.72:80      my.billy.com
0.01      389       192.168.1.72:80      crm.pminder.com
0.01      4614      192.168.1.72:80      login-us.propertyminder.com

Solution

First, make Nginx write a log in the format we need by adding the following to the http section of the config:

log_format timing '[$time_local] $remote_addr $status $body_bytes_sent $request_time [$upstream_addr] $upstream_response_time $http_host "$request"';
access_log  /var/log/nginx/timing.log timing;
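
Before running the full script, it can help to check that the log really has the expected layout. The sketch below is only an illustration and is not part of the original script. It assumes that $time_local contains exactly one space, so that after whitespace splitting $status is field 4. The function name status_classes and the sample lines are made up.

```shell
# Print "<class> <count>" for each status class found in a timing-format log on stdin.
# Assumption: "[$time_local]" splits into two whitespace-separated fields,
# so $remote_addr is field 3 and $status is field 4.
status_classes() {
    awk '{ c[substr($4, 1, 1) "xx"]++ }
         END { for (k in c) print k, c[k] }' | sort
}

# Two made-up sample lines in the timing format
status_classes <<'EOF'
[28/Sep/2016:03:45:11 -0700] 66.249.69.159 200 159 0.003 [192.168.1.72:80] 0.003 example.com "GET /robots.txt HTTP/1.1"
[28/Sep/2016:03:45:12 -0700] 66.249.69.159 404 0 0.001 [192.168.1.72:80] 0.001 example.com "GET /missing HTTP/1.1"
EOF
# prints:
# 2xx 1
# 4xx 1
```

On a real server the same function can be fed the log directly: status_classes < /var/log/nginx/timing.log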

The script itself (/root/bin/nginx_stat/nginx_stat.pl):

#!/usr/bin/perl
#
# This script reads a specially formatted nginx log and generates statistics about upstreams
#
# To use it, add the following lines to the nginx config:
#
# log_format timing '[$time_local] $remote_addr $status $body_bytes_sent $request_time [$upstream_addr] $upstream_response_time $http_host "$request"';
# access_log  /var/log/nginx/timing.log timing;
#
# usage: cat /var/log/nginx/timing.log | ./nginx_stat.pl
#

use strict;
use Data::Dumper qw(Dumper);
use List::Util qw(sum);

# Constants
my $MIN_REQUESTS_TO_SHOW = 100;
my $HOSTNAME_BIN = '/bin/hostname';

# Variables
my %domain_stats;
my %domain_calculated_stats;
my %responses_stats;
my %responses_totals;
my $total_requests;

while (<STDIN>) {
    chomp;
    my $line = $_;

    my $domain;
    my ($log_time, $remote_addr, $status, $body_bytes_sent, $request_time,
        $upstream, $upstream_response_time, $http_host, $request);

    # [28/Sep/2016:03:45:11 -0700] 66.249.69.159 200 159 0.003 [192.168.1.72:80] 0.003 1101-bill-joseph-way.spw4u.propertyminder.com "GET /robots.txt HTTP/1.1"
    # The request is captured as the ninth group so that $request is actually filled in
    if ($line =~ /^\[(.+)\] (\d+\.\d+\.\d+\.\d+) (\d+) (\d+) (\d+\.\d+) \[(.+)\] (.+) (.+) "(.+)"/) {

        ($log_time, $remote_addr, $status, $body_bytes_sent, $request_time, $upstream, $upstream_response_time, $http_host, $request) = ($1, $2, $3, $4, $5, $6, $7, $8, $9);
        # trim $http_host to its last three labels (third-level domain)
        if ( $http_host =~ /.+\.(.+\..+\..+)$/ ) {  $domain = $1; }
        # three-label hosts under spw4u.com are grouped under spw4u.com itself
        elsif ( $http_host =~ /.+\.(spw4u\.com)$/ ) {  $domain = $1; }
        else {$domain = $http_host;}

        #
        # Count responses
        #
        $total_requests++;
        if ( !$responses_stats{$domain} ) {
            %{$responses_stats{$domain}}=(
                'Tot'=>0,
                'Err'=>0,
                '1xx'=>0,
                '2xx'=>0,
                '3xx'=>0,
                '4xx'=>0,
                '5xx'=>0,
            );
        }
        $responses_stats{$domain}{'Tot'}++;
        # Classify by the first digit of the status code
        if ($status =~ /^1/) { $responses_stats{$domain}{'1xx'}++; }
        if ($status =~ /^2/) { $responses_stats{$domain}{'2xx'}++; }
        if ($status =~ /^3/) { $responses_stats{$domain}{'3xx'}++; }
        if ($status =~ /^4/) { $responses_stats{$domain}{'4xx'}++; $responses_stats{$domain}{'Err'}++ }
        if ($status =~ /^5/) { $responses_stats{$domain}{'5xx'}++; $responses_stats{$domain}{'Err'}++ }
        #
        # Count Upstreams response times
        #
        next if $upstream eq "-"; # skip requests that never reached an upstream (string comparison, not numeric)
        push ( @{$domain_stats{$domain}{$upstream}{'arr_upstream_response_time'}} ,$upstream_response_time);
    }    
}

#
# Calculate upstream mean response time
#
foreach my $domain (keys %domain_stats ){
    foreach my $upstream (keys %{$domain_stats{$domain}} ){
        $domain_calculated_stats{$domain}{$upstream}{'adv_upstream_response_time'} = &mean( @{$domain_stats{$domain}{$upstream}{'arr_upstream_response_time'}} );
    }
}
#
# Output responses statistics
#
my $hostname = `$HOSTNAME_BIN`; chomp $hostname; 
$hostname = $hostname." ( ".`$HOSTNAME_BIN -I`; chomp $hostname;
$hostname = $hostname."):";

print "Nginx statistics from $hostname\n";
print "\n";
printf "%-10s%-10s%-10s%-10s%-10s%-10s%-10s%s\n", "Total","Errors","5xx","4xx","3xx","2xx","1xx", "Domain"; 
print  "============================================================================================\n";

foreach my $domain (sort { $responses_stats{$b}{"Err"} <=> $responses_stats{$a}{"Err"} } keys %responses_stats ) {
    next if $responses_stats{$domain}{"Tot"} < $MIN_REQUESTS_TO_SHOW;
    printf "%-10d%-10d%-10d%-10d%-10d%-10d%-10d%s\n", $responses_stats{$domain}{"Tot"},$responses_stats{$domain}{"Err"},$responses_stats{$domain}{"5xx"},$responses_stats{$domain}{"4xx"},$responses_stats{$domain}{"3xx"},$responses_stats{$domain}{"2xx"},$responses_stats{$domain}{"1xx"}, $domain; 
    foreach ('Tot', 'Err','5xx','4xx',"3xx","2xx","1xx"){
        $responses_totals{$_} += $responses_stats{$domain}{$_};
    }
}
print  "--------------------------------------------------------------------------------------------\n";
printf "%-10s%-10s%-10s%-10s%-10s%-10s%-10s%s\n", $responses_totals{"Tot"},$responses_totals{"Err"},$responses_totals{"5xx"},$responses_totals{"4xx"},$responses_totals{"3xx"},$responses_totals{"2xx"},$responses_totals{"1xx"}, 'Totals'; 
#
# Output upstream response time
#
print "\n";
printf "%-10s%-10s%-21s%s\n", "Upstream","Total",   "","";
printf "%-10s%-10s%-21s%s\n", "response","upstream","","";
printf "%-10s%-10s%-21s%s\n", "time",    "requests","Upstream","Domain";
print  "=====================================================================\n";

for my $keypair (
        sort { $domain_calculated_stats{$b->[0]}{$b->[1]}{adv_upstream_response_time} <=> $domain_calculated_stats{$a->[0]}{$a->[1]}{adv_upstream_response_time} }
        map { my $intKey=$_; map [$intKey, $_], keys %{$domain_calculated_stats{$intKey}} } keys %domain_calculated_stats
    ) {
    my $total_upstream_requests = scalar(@{$domain_stats{$keypair->[0]}{$keypair->[1]}{'arr_upstream_response_time'}});
    next if $total_upstream_requests < $MIN_REQUESTS_TO_SHOW;
    my $domain = $keypair->[0];
    my $upstream = $keypair->[1];
    #printf "%.2f\t\t%s\t\t%-17s\t%s\n",  $domain_calculated_stats{$keypair->[0]}{$keypair->[1]}{'adv_upstream_response_time'}, $total_upstream_requests, $keypair->[1], $keypair->[0];
    printf "%-10.2f%-10s%-21s%s\n",  $domain_calculated_stats{$keypair->[0]}{$keypair->[1]}{'adv_upstream_response_time'}, $total_upstream_requests, $keypair->[1], $keypair->[0];
}

#
# Subroutines
#
sub mean {
    return sum(@_)/@_;
}
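
To cross-check the second table independently of the Perl script, the mean upstream response time can be recomputed with a short awk pipeline. This is a rough sketch under some assumptions: every request goes to a single upstream, so that after whitespace splitting field 7 is [$upstream_addr], field 8 is $upstream_response_time and field 9 is $http_host. Unlike the script, it does not trim host names to the third level and does not apply the 100-request threshold. The function name upstream_means and the sample lines are made up.

```shell
# Print "<mean time> <requests> <host> <upstream>" sorted by mean time, slowest first.
# Lines logged with "[-]" never reached an upstream and are skipped.
upstream_means() {
    awk '$7 != "[-]" { key = $9 " " $7; sum[key] += $8; n[key]++ }
         END { for (k in sum) printf "%.2f %d %s\n", sum[k] / n[k], n[k], k }' | sort -rn
}

# Made-up sample: two proxied requests and one answered by nginx itself
upstream_means <<'EOF'
[28/Sep/2016:03:45:11 -0700] 66.249.69.159 200 159 0.100 [192.168.1.72:80] 0.100 example.com "GET /a HTTP/1.1"
[28/Sep/2016:03:45:12 -0700] 66.249.69.159 200 159 0.300 [192.168.1.72:80] 0.300 example.com "GET /b HTTP/1.1"
[28/Sep/2016:03:45:13 -0700] 66.249.69.159 301 0 0.000 [-] - example.com "GET /c HTTP/1.1"
EOF
```

If the numbers for a busy host differ noticeably from the report, the likely cause is the host-name trimming, which merges several hosts into one row.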

Finally, arrange for the statistics to be mailed out and also saved to disk (just in case).

Create the file /root/bin/nginx_stat/start_nginx_stat.pl (a bash script, despite the .pl extension):

#!/bin/bash
ADMIN_MAIL="admin@example.com"
WORKING_DIR="/root/bin/nginx_stat"
DATE=$(date +%Y-%m-%d)
# -p: do not fail if the directory already exists
/bin/mkdir -p "$WORKING_DIR/$DATE"
"$WORKING_DIR/nginx_stat.pl" < /var/log/nginx/timing.log > "$WORKING_DIR/$DATE/$DATE.txt"
# A blank line must separate the mail headers from the body
{ printf 'Subject: Nginx stat from my super host\n\n'; /bin/cat "$WORKING_DIR/$DATE/$DATE.txt"; } | /usr/sbin/sendmail "$ADMIN_MAIL"

Add it to logrotate (/etc/logrotate.d/nginx):

/var/log/nginx/*.log {
        daily
        missingok
        rotate 30
        compress
        delaycompress
        notifempty
        create 640 nginx adm
        sharedscripts
        prerotate
                /root/bin/nginx_stat/start_nginx_stat.pl
        endscript
        postrotate
                [ -f /var/run/nginx.pid ] && kill -USR1 `cat /var/run/nginx.pid`
        endscript
}

Done!


When republishing materials from this site, a link back to the site is required.
valynkin.ru © no rights reserved