Some Insights on Tinkering with Servers
Server Selection
- Cloud providers: Foreign ones basically require a credit card, so you can rule them out right away. Domestically, Alibaba Cloud has the largest market share, but for ordinary users the differences between the major providers' offerings are negligible. Choose by price, but be cautious with small providers.
- CPU: At least two cores. Operations work is painful on a single core, since your services will fight for resources with your SSH session and VS Code server.
- Memory: At least 2 GB; 4 GB is the comfort zone. 2 GB is fine if you run no Java services, but once Java enters the picture, forget it. Low-memory servers can really only run Go and Rust; JVM and Node services are unusable. Anyone who has deployed services by hand knows how pleasant Go is and would not dare touch the JVM.
- Network: If you are sure the traffic will be low or if you are not afraid of DDoS, you can choose pay-by-traffic; otherwise, choose fixed bandwidth.
- Region: Hong Kong is currently still the best option. Not only is ICP registration not required, but you can reach Docker Hub, GitHub, Hugging Face, etc. without a proxy, and you can even run a proxy server on it.
Operating System Selection
Server operating systems are typically either Debian-based or RHEL-based.
Among the free options, Debian and AlmaLinux stand out the most (the founder of Rocky Linux has a poor reputation). Those who prefer RHEL (the conservatives) should choose AlmaLinux, but dnf updates packages much more slowly than apt, and package updates are sometimes at the mercy of Red Hat. Debian's ecosystem advantage is overwhelming; even Docker chose Debian as the base for its Docker Hardened Images.
Note
Actually, CentOS Stream is not unusable; it is just that the shock it gave people was so great that it now feels dangerous. Unless you are a strict perfectionist about security, rather than worrying about this you should worry about whether your security measures elsewhere are in place. In fact, it sometimes receives security patches even earlier than RHEL, and it has the Fedora upstream behind it.
Preparation Work
ssh
Passwords are not recommended; use keys instead. Enable public-key login in sshd_config, and while you are at it, disable password login.
The best key algorithm is ed25519 (generate a pair with ssh-keygen -t ed25519). I will skip the remaining steps, as tutorials abound.
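As a command sketch of the steps above (filenames, comment, and server address are illustrative), run on your local machine:

```shell
# Generate an ed25519 key pair (illustrative comment and default path)
ssh-keygen -t ed25519 -C "me@laptop"

# Install the public key into the server's authorized_keys
ssh-copy-id -i ~/.ssh/id_ed25519.pub user@your-server

# Then, in /etc/ssh/sshd_config on the server:
#   PubkeyAuthentication yes
#   PasswordAuthentication no
# and reload sshd, e.g.: sudo systemctl reload sshd
```

Verify key login works in a second terminal before closing your existing session, or you can lock yourself out.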
Docker
Docker's advantage is not just reproducibility. Service configuration files, lifecycles, networks (ports), logs, updates, CVE patching, and so on can all be delegated to Docker.
The installation method is detailed in the official documentation: https://docs.docker.com/engine/install/
If you are not the root user by default, consider adding your regular user to the docker group.
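Concretely, that is one command (the group change takes effect on your next login):

```shell
# Add the current user to the docker group so docker works without sudo
sudo usermod -aG docker "$USER"
```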
Using Docker
Whether it is a single service or multiple services, it is recommended to configure them all using Docker Compose. The limitations of the command line are too great; configuration files are easier to maintain.
You can be like me and keep a dedicated folder to organize configuration files, one subdirectory per service (the subdirectory names below are my illustrative reconstruction):
- caddy/
  - Caddyfile
  - docker-compose.yml
  - Dockerfile
  - .env
  - reload.sh
- app/
  - app.conf
  - docker-compose.yml
- postgres/
  - docker-compose.yml
  - .env
- seaweedfs/
  - docker-compose.yml
  - s3.config.json
Specific configuration examples:
- Pin at least the major version number where possible, to allow seamless updates; failing that, check whether a tag like stable exists, and only fall back to latest. Do not update services casually afterwards, or you will very likely break them (upgrading an underlying service can also set off a chain reaction that takes down others). This is why I did not set `pull_policy: always`. Images that strictly follow SemVer can be updated boldly.
- Use `.env` files rather than writing environment variables directly in the compose file.
- Mount configuration files from local directories; for everything else, prefer Docker's volume management. You can check a volume's actual storage path on the host via `docker volume inspect <id>`.
- Prefer bridge networks and Docker's built-in name resolution (container names resolve automatically to container IPs). Host mode is convenient, but container ports conflict easily and a misconfigured firewall makes it dangerous. The producer creates its own bridge network; consumers join that network and reach the producer by container name. Accordingly, apart from the reverse proxy, avoid exposing ports to the host as much as possible.
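As a sketch of the points above (service name, image tag, paths, and the network name are all illustrative), a consumer service's compose file might look like this:

```yaml
services:
  app:
    image: ghcr.io/example/app:1    # pinned to a major version; no pull_policy: always
    env_file: .env                  # secrets live in .env, not in this file
    volumes:
      - ./app.conf:/etc/app/app.conf:ro   # config mounted from a local directory
      - app-data:/var/lib/app             # data kept in a named volume
    networks:
      - proxy                       # join the reverse proxy's bridge network
    # note: no ports: section — only the reverse proxy is exposed to the host

volumes:
  app-data:

networks:
  proxy:
    external: true                  # created by the reverse proxy's compose file
```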
Basic Service Selection
The most essential basic services are reverse proxy (external access), database (storing data), object storage (assisting the database in storing files), and identity authentication (needless to say).
Reverse Proxy
Many people's first reaction is surely Nginx, a very well-established web server. But next to Caddy, Nginx's DX (developer experience) is simply terrible. Reverse proxies like Traefik and HAProxy are too advanced and only suited to extreme-performance scenarios; gateway products like Kong are not appropriate for a personal server.
Using plugins with Caddy requires building a custom Caddy binary:
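A minimal Dockerfile sketch using the official builder image, per the Caddy Docker documentation (the DNS plugin shown is just an example; swap in whatever plugins you need):

```dockerfile
FROM caddy:2-builder AS builder
# xcaddy compiles Caddy with the requested plugins baked in
RUN xcaddy build \
    --with github.com/caddy-dns/cloudflare

FROM caddy:2
# Replace the stock binary with the custom build
COPY --from=builder /usr/bin/caddy /usr/bin/caddy
```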
Caddyfile example:
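A minimal Caddyfile sketch (domain and upstream are illustrative); Caddy provisions and renews TLS certificates automatically:

```
example.com {
    encode zstd gzip
    # "app" resolves via Docker's built-in DNS on the shared bridge network
    reverse_proxy app:3000
}
```

After editing, you can apply changes without downtime by running `caddy reload --config /etc/caddy/Caddyfile` inside the container (a likely job for the reload.sh in the folder layout above).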
Database
PostgreSQL is an outstanding leader among open-source databases; needless to say more.
User + database creation tutorial:
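A minimal sketch (names and password are placeholders), run via psql as the postgres superuser:

```sql
-- Dedicated login role for one application
CREATE USER myapp WITH PASSWORD 'CHANGE_ME';
-- Owning the database sidesteps the PostgreSQL 15+ restriction
-- on creating objects in the public schema
CREATE DATABASE myapp OWNER myapp;
```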
Object Storage
Since MinIO turned evil, it seems there are no easy-to-use object storages left. Currently, the ones with relatively high community attention are:
- GarageHQ: Only supports CLI management; the current version does not yet support anonymous file access.
- SeaweedFS: Distributed storage, but it supports both anonymous access and fine-grained authentication. The only object storage among the Docker Hardened Images; trustworthy enough.
- RustFS: A rising star, good user experience, but has many negative reviews and insufficient maturity.
- Ceph: Distributed storage. (I haven’t tried it yet).
My current choice is SeaweedFS. (Refer to another blog)
Note
WebDAV is an option; you can wrap an OpenList (AList successor) as a front-end to manage files.
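A minimal single-node SeaweedFS compose sketch with the S3 gateway enabled (mount paths and network name are illustrative; the flags follow the volume-growth note further down, and you should verify them against your SeaweedFS version):

```yaml
services:
  seaweedfs:
    image: chrislusf/seaweedfs:latest   # pin a concrete version tag in practice
    command: >
      server -dir=/data -s3 -s3.config=/etc/seaweedfs/s3.config.json
      -master.defaultReplication=000 -master.volumeSizeLimitMB=64
      -volume.max=0
    volumes:
      - ./s3.config.json:/etc/seaweedfs/s3.config.json:ro
      - seaweed-data:/data
    networks:
      - proxy           # reached only via the reverse proxy; no host ports

volumes:
  seaweed-data:

networks:
  proxy:
    external: true
```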
S3 configuration file:
Note
You can configure anonymous permissions, and it also supports granular permissions down to specific operations and Buckets.
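A sketch of s3.config.json following the identities/actions schema from the SeaweedFS wiki (keys and the bucket name are placeholders):

```json
{
  "identities": [
    {
      "name": "anonymous",
      "actions": ["Read:public"]
    },
    {
      "name": "admin",
      "credentials": [
        { "accessKey": "CHANGE_ME_KEY", "secretKey": "CHANGE_ME_SECRET" }
      ],
      "actions": ["Admin", "Read", "Write", "List", "Tagging"]
    }
  ]
}
```

The `anonymous` identity grants unauthenticated reads on the `public` bucket only; actions can be scoped per operation and per bucket with the `Action:bucket` form.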
Caddy proxying S3 API (supports Virtual Host Style)
Note
DNS requires configuring wildcard domain resolution for *.s3.mioyi.net; just *.mioyi.net is not enough.
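A sketch of the Caddy site block, assuming the SeaweedFS S3 gateway listens on its default port 8333; note that the wildcard certificate for *.s3.mioyi.net cannot be issued via the HTTP challenge, so a DNS-provider plugin (built in as shown earlier) is typically required:

```
s3.mioyi.net, *.s3.mioyi.net {
    # virtual-host-style requests like bucket.s3.mioyi.net
    # pass through with the Host header intact
    reverse_proxy seaweedfs:8333
}
```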
Note
SeaweedFS is primarily designed for distributed storage of massive numbers of small files, so it is very aggressive about volume creation: it creates many volumes for concurrent writes and disaster-recovery replicas, and it pre-allocates space, which is a tight fit for a personal server with a small disk. Therefore the master must be configured with `-defaultReplication=000 -volumeSizeLimitMB=64`, and the volume server with `-max=0` so it can grow automatically without a preset cap.
See https://github.com/seaweedfs/seaweedfs/wiki/Replication and https://github.com/seaweedfs/seaweedfs/wiki/Optimization
Identity Authentication
Casdoor's UX (user experience) is very good (at least better than Keycloak's and Authentik's), and it has Chinese-language support. Once single sign-on is configured, you can log into every app you deploy automatically. The specific configuration is lengthy, so it is omitted here.