# My personal infrastructure

This is a general overview.
See [HACKING](HACKING.md) for more "usage" instructions.

* Hetzner auction server: 128GB RAM, 2x1TB SSD. Runs Proxmox, Wireguard/ocserv, and Apache as a reverse proxy
  * LXC container running Nextcloud
  * LXC container running Vaultwarden
  * LXC container running Miniflux
  * LXC container running Flexisip
  * LXC container running Gitolite
  * LXC container running SeaweedFS (for Takahe)
  * LXC container running a workstation
  * LXC container running PostgreSQL
  * LXC container running a FreeIPA replica
  * LXC container running Ipsilon
  * LXC container running Nagios
  * LXC container running Grafana
  * LXC container running ClickHouse
  * Two VMs running Talos, providing two Kubernetes clusters (production/test)
    * My blog
    * Incarnator
    * A CRUD system I run to track my weight
    * Some other small projects
* Flat 1
  * HP Proliant Microserver: 4GB RAM, 2x4TB HDD
    * DHCP/DNS
    * Runs SMB/NFS
    * ZFS backups on external USB drives
    * Wireguard/ocserv
  * Raspberry Pi 3B (1GB RAM) running LibreElec + TVHeadend, records to an NFS share on the HP server
* Flat 2
  * Fanless N100 running Debian, runs DHCP/DNS, Wireguard/ocserv
* Netcup 2GB RAM VPS running FreeIPA (also Wireguard/ocserv)

## Configuration management

I prefer using Ansible for orchestration, and Puppet for configuration management.

* `up.py` compiles Puppet catalogs without a Puppet Server.
* `pseudo_resource_exporter.py` simulates exported resources on the catalogs generated by `up.py`.
  You can use this script as a template to implement your own catalog manipulations.
* `playbooks/roles/apply_puppet/` uses `up.py` to apply Puppet to Ansible hosts.
  This script collects facts, adds the Ansible inventory to Hiera (so you can use Ansible inventory data to parameterize Puppet), compiles the catalogs, ships them to Ansible nodes, and executes Puppet.

Except for exported resources, which work differently, this setup has most of the benefits of Puppet Server without having to run a Puppet Server and PuppetDB.

Being able to simulate exported resources without a master lets you use the `nagios_core` module without infrastructure.
With the `nagios_core` module, Puppet code, such as a module which sets up a web server, can define "inline" Puppet monitoring for the managed resources.
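
As a rough illustration of what simulating exported resources on compiled catalogs could look like (this is not the actual `pseudo_resource_exporter.py`), the sketch below assumes each catalog is a dict with a `resources` list whose entries carry `type`, `title`, and an `exported` flag, loosely following Puppet's JSON catalog format. The function name `redistribute_exported`, the `collector` parameter, and the sample hosts are made up for this example.

```python
import copy


def redistribute_exported(catalogs, collector):
    """Move exported resources from every catalog into the collector's catalog.

    catalogs: dict mapping host name -> catalog dict with a "resources" list.
    collector: host name that should receive the exported resources.
    Returns a new dict; the input catalogs are left untouched.
    """
    result = copy.deepcopy(catalogs)
    collected = []
    for catalog in result.values():
        kept = []
        for resource in catalog["resources"]:
            if resource.get("exported"):
                # Exported resources are not applied on the exporting host;
                # they become regular resources on the collecting host.
                collected.append(dict(resource, exported=False))
            else:
                kept.append(resource)
        catalog["resources"] = kept
    result[collector]["resources"].extend(collected)
    return result


# Hypothetical catalogs: a web server exporting its own Nagios check.
catalogs = {
    "web": {"resources": [
        {"type": "Package", "title": "httpd", "exported": False},
        {"type": "Nagios_service", "title": "web-http", "exported": True},
    ]},
    "nagios": {"resources": [
        {"type": "Package", "title": "nagios", "exported": False},
    ]},
}
merged = redistribute_exported(catalogs, collector="nagios")
print([r["title"] for r in merged["nagios"]["resources"]])
# prints ['nagios', 'web-http']
```

A real implementation would also have to deal with catalog edges, tags, and collector queries, which this sketch ignores.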

## Networking

I like having working DNS, so I run dnsmasq on both flats and for the Proxmox network on the Hetzner server.
It also provides integrated DHCP (almost everything gets its IP via DHCP, and thus a hostname).
Every environment has a /24 network with DNS/DHCP and its own domain (hetzner.int.mydomain, flat1.int.mydomain, etc.).
I use Route 53 for DNS records (except those of my own networks). DNS records are created with Ansible playbooks.

I have the following snippets in dnsmasq's configuration:

```
server=/flat1.mydomain/ip.of.flat1.dns
rev-server=net.mask.of/flat1,ip.of.flat1.dns
```

This way, each dnsmasq instance can look up records (even reverse DNS) on the other dnsmasq instances, so I can address systems on other networks by name.
This could also be achieved with NS records, if I'm not mistaken, but this way everything stays private on my own dnsmasq servers and not on public DNS.
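
A minimal sketch of how these forwarding lines could be generated for every remote network. The function `forwarding_lines`, the structure of the `networks` dict, and all IP addresses are assumptions made for illustration; only the `server=` and `rev-server=` dnsmasq options and the domain naming come from the setup described above.

```python
import ipaddress


def forwarding_lines(local, networks):
    """Return dnsmasq lines forwarding lookups for every non-local network.

    networks: dict name -> {"domain": str, "cidr": str, "dns": str}.
    """
    lines = []
    for name, net in sorted(networks.items()):
        if name == local:
            continue  # the local dnsmasq already answers for its own domain
        cidr = ipaddress.ip_network(net["cidr"])  # raises on an invalid network
        dns = ipaddress.ip_address(net["dns"])  # raises on an invalid address
        lines.append(f"server=/{net['domain']}/{dns}")
        lines.append(f"rev-server={cidr},{dns}")
    return lines


# Hypothetical addressing plan, one /24 per environment.
networks = {
    "hetzner": {"domain": "hetzner.int.mydomain", "cidr": "10.0.0.0/24", "dns": "10.0.0.1"},
    "flat1": {"domain": "flat1.int.mydomain", "cidr": "10.0.1.0/24", "dns": "10.0.1.1"},
    "flat2": {"domain": "flat2.int.mydomain", "cidr": "10.0.2.0/24", "dns": "10.0.2.1"},
}
lines = forwarding_lines("flat1", networks)
print("\n".join(lines))
```

Run for `flat1`, this emits one `server=`/`rev-server=` pair each for `flat2` and `hetzner`, and nothing for `flat1` itself.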

I join all networks using Wireguard. Wireguard keys are generated and distributed using an Ansible playbook.
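
To give an idea of what the distributed configuration might contain, here is a sketch that renders a `wg-quick` style file for one site, peering with all the others. It assumes a full mesh where every site has a reachable endpoint, which may not match the real topology. The function `render_wireguard_config`, the `sites` structure, the addresses, and the placeholder keys are all hypothetical, and the real playbook may work quite differently.

```python
def render_wireguard_config(name, sites, private_key, listen_port=51820):
    """Render a wg-quick style config for `name`, peering with all other sites.

    sites: dict name -> {"address": str, "public_key": str,
                         "endpoint": str, "allowed_ips": str}.
    """
    me = sites[name]
    lines = [
        "[Interface]",
        f"Address = {me['address']}",
        f"PrivateKey = {private_key}",
        f"ListenPort = {listen_port}",
    ]
    for peer_name, peer in sorted(sites.items()):
        if peer_name == name:
            continue  # a site does not peer with itself
        lines += [
            "",
            f"# {peer_name}",
            "[Peer]",
            f"PublicKey = {peer['public_key']}",
            f"Endpoint = {peer['endpoint']}",
            f"AllowedIPs = {peer['allowed_ips']}",
        ]
    return "\n".join(lines) + "\n"


# Placeholder values; real keys would come from `wg genkey` / `wg pubkey`.
sites = {
    "hetzner": {"address": "10.100.0.1/24", "public_key": "<hetzner-public-key>",
                "endpoint": "hetzner.example.org:51820", "allowed_ips": "10.0.0.0/24"},
    "flat1": {"address": "10.100.0.2/24", "public_key": "<flat1-public-key>",
              "endpoint": "flat1.example.org:51820", "allowed_ips": "10.0.1.0/24"},
    "flat2": {"address": "10.100.0.3/24", "public_key": "<flat2-public-key>",
              "endpoint": "flat2.example.org:51820", "allowed_ips": "10.0.2.0/24"},
}
config = render_wireguard_config("flat1", sites, "<flat1-private-key>")
print(config)
```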

On every network I've also set up ocserv to provide remote access if I'm outside these networks; I can pick the closest access point and reach my entire network.

## Authentication

I run a two-node FreeIPA cluster.
It provides a user directory and centralized auth, with passwordless single sign-on.
It also has sudo integration, so I can sudo on all systems with a single password.

Many systems and services are integrated in FreeIPA.
My workstations are joined to the domain so I can even log in to some web applications without typing a password.

Ipsilon adds OpenID for web application authentication.

Ipsilon is backed by Red Hat, although they seem to have shifted their focus to Keycloak. Keycloak is much more featureful, but I prefer Ipsilon because:

* It's deployed via RPM
* Integration with FreeIPA is a one-liner
* It's still used by the Fedora Project infrastructure

FreeIPA and Ipsilon are running on Rocky Linux 9.

## Mail

All systems are running Postfix configured to send emails.
This way I get notifications on failed cronjobs or automated updates.

## TLS

I set up certificates using [mod_md](https://httpd.apache.org/docs/2.4/mod/mod_md.html).

## Observability

I run Nagios monitoring all hosts and services.
I get alerts for hosts and services being down.
I use [ragent](https://github.com/alexpdp7/ragent) as the monitor, which also means I get notifications when a host is updated and requires a reboot.

I use [nagios-otel](https://github.com/alexpdp7/nagios-otel) to deliver metrics via OpenTelemetry into a ClickHouse database.
See [my configuration](puppet/site/nagios.h1.int.pdp7.net.pp) for opentelemetry-collector that scrapes the Nagios log to create logs in ClickHouse.
I use Grafana to explore monitoring information in ClickHouse.

## Operating systems

I use:

* Proxmox, as it provides LXC containers (and VMs if needed) and ZFS storage. I like ZFS for its protection against bitrot, and because send/recv and snapshots are great for backups.
* EL9/EL10, using Rocky Linux 9 and AlmaLinux 10.
* Debian on a few hosts.
* LibreElec for my mediacenter Raspberry. Common distros are not an option, as they don't support hardware video acceleration. LibreElec sets up everything I need with minimal fuss, so while it's the one system that doesn't use configuration management, it works fine.

## Software updates

I use `dnf-automatic` on EL9/EL10, and `unattended-upgrades` on Debian/Ubuntu so updates are automatically installed.

`ragent` monitors when systems need a reboot and warns me through Nagios.
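
For illustration only (this is not how `ragent` is implemented), a reboot check in the style of a Nagios plugin can be sketched as below. It assumes the Debian/Ubuntu convention of a `/var/run/reboot-required` flag file; EL systems do not use this file and would need another mechanism, such as `needs-restarting -r`. The function name `check_reboot_required` is made up for this example.

```python
import os
import tempfile

OK, WARNING = 0, 1  # standard Nagios plugin exit codes


def check_reboot_required(flag_path="/var/run/reboot-required"):
    """Return (exit_code, message) in the style of a Nagios plugin."""
    if os.path.exists(flag_path):
        return WARNING, "WARNING: reboot required"
    return OK, "OK: no reboot required"


# Demonstrate both outcomes using a temporary directory instead of /var/run.
with tempfile.TemporaryDirectory() as tmp:
    flag = os.path.join(tmp, "reboot-required")
    print(check_reboot_required(flag))  # prints (0, 'OK: no reboot required')
    open(flag, "w").close()
    print(check_reboot_required(flag))  # prints (1, 'WARNING: reboot required')
```

A real plugin would print the message and call `sys.exit()` with the exit code so that Nagios can pick up the state.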

## Packaging

* SeaweedFS RPM: [COPR repository](https://copr.fedorainfracloud.org/coprs/koalillo/seaweedfs/), [seaweedfs-rpm](https://github.com/alexpdp7/seaweedfs-rpm)

## Storage

I run Nextcloud in an LXC container; files are stored on a ZFS filesystem.

Media and other non-critical files are stored on the Proliant and shared via Samba and NFS.

### Media

The Raspberry has a DVB-T tuner and runs TVHeadend; recordings are stored on the Proliant in an NFS share.

### Backup

Valuable data is on dedicated datasets.
I run [sanoid](https://github.com/jimsalterjrs/sanoid) on the Hetzner and Proliant servers to create and prune snapshots.
I use [syncoid](https://github.com/jimsalterjrs/sanoid#syncoid) on the Proliant and a laptop to synchronize snapshots as a backup.

## Kubernetes

I use Talos Linux to run Kubernetes.

## My blog

See [blog](../blog).

## Phones

I wanted to eliminate my landlines, because I get a ton of spam there.
However, I need to provide calls between my home and another home using physical phones (people like wireless headsets; smartphones are not really well designed for extended phone calls).

The key to this is the SIP protocol.
You can get classical phones that work using the SIP protocol, or ATA devices that turn a regular phone into a SIP phone.

I installed Flexisip.

The major difficulty in setting up a SIP server is networking.
I run Flexisip in an LXC container on Proxmox.
I expose the SIP server's SSL TCP port to the Internet, plus a range of UDP ports, using iptables.
(I consulted some SIP forums, and apparently there are no major hardening requirements in exposing a SIP server to the Internet, although I think maybe it's better to use a SIP proxy.)
You can also use STUN/TURN servers, but I had lots of trouble getting that set up.

For the phones, I bought and set up two Grandstream HT801 ATA devices.
Those are quite cheap (around 40€), but they are quite fancy professional network devices, with a rough but featureful UI (they can do OpenVPN, SNMP, etc.).
They connect directly to Flexisip over the Internet, autoconfiguring via DHCP, so in theory they could work anywhere in the world with a network connection.
After configuration and assigning an extension, you only need to connect cheap wireless phones to them, and start making calls with the 1000...1020 extensions.

For testing and occasional calls I use [Baresip](https://github.com/baresip/baresip) and [Linphone](https://www.linphone.org/) from F-Droid on my smartphone, and from Debian on my laptop.
For smartphones, SIP has the drawback that it requires a persistent connection to the SIP server to receive calls, which drains the battery a bit.
Linphone/Flexisip are supposed to use mobile push, but I have not set this up.
So the only devices that are connected 24/7 are the ATAs; I use my smartphone and my laptop only occasionally.

## Possible improvements

* Add a lab so I can experiment with things in isolated environments.