---
title: "Site Analytics"
date: 2024-11-03T22:49:15-05:00
draft: true
description: "Some scripts to analyze my site's visitors"
type: "post"
tags: ["website", "meta", "scripts"]
---


I usually don't care much about who visits my site, but I was kind of curious today and decided to take a look.

First, I fetched the site logs:

```fish
journalctl -u nginx -g " a.exozy.me " > out
```

Then, I used GoAccess to analyze the logs, selecting the first option for log format:

```fish
goaccess (cut -d' ' -f8- out | psub)
```
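
For anyone who doesn't use `cut` much, here is a rough Python sketch of what `cut -d' ' -f8-` does to each line: split on single spaces and keep everything from the eighth field on, which drops the leading fields (journal timestamp, hostname, and so on) before GoAccess sees the entry. The sample line is made up for illustration (`f6` and `f7` are placeholders, not what the real log contains), and `cut_from_field` is a hypothetical helper name.

```python
def cut_from_field(line: str, start: int = 8, delim: str = " ") -> str:
    """Mimic `cut -d' ' -fN-`: keep fields N onward, splitting on every single delimiter."""
    if delim not in line:
        # Like cut without -s, pass through lines that contain no delimiter at all.
        return line
    return delim.join(line.split(delim)[start - 1:])


# Hypothetical journal line; f6 and f7 stand in for whatever precedes the client IP.
sample = (
    'Nov 03 22:49:15 host nginx[123]: f6 f7 '
    '203.0.113.7 - - [03/Nov/2024:22:49:15 -0500] "GET / HTTP/1.1" 200 512'
)

print(cut_from_field(sample))
```

Because the split is on every single space, two spaces in a row would count as an empty field and shift everything by one, just as with `cut`.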

Most of the data is pretty mundane, but I did learn a few interesting things: 25% of the requests are for my RSS feed and 27% of the visitors are crawlers. Phew, I thought crawlers would be more like 99%, because that's what the exogit visitor stats are like.

Lastly, I used `geoiplookup` to see what countries the visitors are from:

```fish
for i in (cut -d' ' -f8 out)
    if string match -q "*.*" $i
        geoiplookup $i
    else
        geoiplookup6 $i
    end
end | grep -v hostname | sort | uniq -c | sort -n
```
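
The loop decides between `geoiplookup` and `geoiplookup6` by checking whether the address contains a dot. As a sketch of a stricter alternative (not what the loop does), the same decision can be made by actually parsing the address with Python's standard `ipaddress` module. `pick_lookup_command` is a hypothetical helper name, and the addresses are documentation-range examples rather than real visitors.

```python
import ipaddress
from typing import Optional


def pick_lookup_command(addr: str) -> Optional[str]:
    """Return the GeoIP command name for an address, or None if it isn't a valid IP."""
    try:
        version = ipaddress.ip_address(addr).version
    except ValueError:
        return None  # not an IP address (e.g. a stray log field)
    return "geoiplookup" if version == 4 else "geoiplookup6"


for addr in ["203.0.113.7", "2001:db8::1", "not-an-ip"]:
    print(addr, "->", pick_lookup_command(addr))
```

One difference from the dot check: an IPv4-mapped IPv6 address such as `::ffff:192.0.2.1` contains a dot, so the fish loop would hand it to `geoiplookup`, while this sketch treats it as IPv6. None of this replaces the fish loop above, which is what was actually run.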

That produced the following output:

```
      1 GeoIP Country V6 Edition: DE, Germany
      2 GeoIP Country Edition: AU, Australia
      2 GeoIP Country Edition: GB, United Kingdom
      3 GeoIP Country Edition: NL, Netherlands
      5 GeoIP Country Edition: AT, Austria
      7 GeoIP Country Edition: RU, Russian Federation
      8 GeoIP Country Edition: CZ, Czech Republic
      8 GeoIP Country Edition: IT, Italy
     11 GeoIP Country Edition: BR, Brazil
     12 GeoIP Country Edition: JP, Japan
     12 GeoIP Country Edition: MA, Morocco
     12 GeoIP Country Edition: SG, Singapore
     13 GeoIP Country V6 Edition: AU, Australia
     15 GeoIP Country Edition: FR, France
     17 GeoIP Country Edition: DE, Germany
     22 GeoIP Country Edition: EE, Estonia
     25 GeoIP Country Edition: TW, Taiwan
     25 GeoIP Country Edition: VN, Vietnam
     28 GeoIP Country V6 Edition: CA, Canada
     44 GeoIP Country Edition: FI, Finland
     52 GeoIP Country Edition: CN, China
     57 GeoIP Country Edition: BG, Bulgaria
     57 GeoIP Country Edition: CA, Canada
     64 GeoIP Country V6 Edition: FR, France
     66 GeoIP Country Edition: JO, Jordan
     67 GeoIP Country Edition: IN, India
     83 GeoIP Country V6 Edition: KR, Korea, Republic of
     85 GeoIP Country Edition: HK, Hong Kong
    157 GeoIP Country V6 Edition: US, United States
    787 GeoIP Country Edition: US, United States
```
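
Since IPv4 and IPv6 lookups are counted separately, some countries show up twice in that list. A minimal Python sketch for merging the two editions into one total per country code might look like this. It assumes the lines have exactly the `uniq -c` shape shown above; `merge_country_counts` is a hypothetical helper, and the sample is five lines copied from the output.

```python
import re
from collections import Counter

# Matches e.g. "    157 GeoIP Country V6 Edition: US, United States"
LINE_RE = re.compile(r"^\s*(\d+) GeoIP Country (?:V6 )?Edition: ([A-Z]{2}), (.+)$")


def merge_country_counts(text: str) -> Counter:
    """Sum IPv4 and IPv6 counts per two-letter country code."""
    totals = Counter()
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:  # skip anything that isn't a count line
            totals[m.group(2)] += int(m.group(1))
    return totals


sample = """\
     15 GeoIP Country Edition: FR, France
     64 GeoIP Country V6 Edition: FR, France
     85 GeoIP Country Edition: HK, Hong Kong
    157 GeoIP Country V6 Edition: US, United States
    787 GeoIP Country Edition: US, United States
"""

totals = merge_country_counts(sample)
for code, count in totals.most_common():
    print(code, count)
```

On these five lines the US comes to 944 (787 + 157) and France to 79 (15 + 64), which puts France just behind Hong Kong's 85.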

All those US visitors are probably just me. But I didn't know my site was so big in Hong Kong? Maybe because of VPNs?

2024-12-11 update: I switched to Caddy so I'm now using the command `journalctl -u caddy --output cat --lines all -g http.log.access | goaccess --log-format caddy`.