20 random bookmarks

2025-08-23

249.

My process to debug DNS timeouts in a large EKS cluster · Jack's home on the web

cep.dev/posts/eks-dns-timeouts-sudo-hostname-lookups

TIL: sudo needs hostname lookups

2025-06-12

227.

ThinkPad T14s Gen 3 AMD Linux User Review + Tweaks

linuxblog.io/thinkpad-t14s-gen-3-amd-linux-user-review-tweaks#Extend_ThinkPad_Battery_life_on_Linux

2025-05-17

216.

Prometheus: How We Slashed Memory Usage

devoriales.com/post/384/prometheus-how-we-slashed-memory-usage

2025-02-21

213.

Tying Engineering Metrics to Business Metrics | by Iccha Sethi | Medium

icchasethi.medium.com/tying-engineering-metrics-to-business-metrics-f4df7651e026

2024-12-02

196.

Evolving my ergonomic setup (or, my laptop with extra steps) | nicole@web

ntietz.com/blog/evolving-ergo-setup

2024-09-28

187.

ArgoCD Finalizer Shield: Protecting Your Production Clusters from Unintended Deletion | by Tal Yitzhak | FAUN — Developer Community 🐾

faun.pub/argocd-finalizer-shield-protecting-your-clusters-from-unintended-deletion-c7929a82d983

2024-06-26

175.

Running a Raspberry Pi with a read-only root filesystem # Chris Dzombak

www.dzombak.com/blog/2024/03/Running-a-Raspberry-Pi-with-a-read-only-root-filesystem.html

2024-04-14

149.

Scaling Go to 192 Cores with Heavy I/O · Jaz's Blog

jazco.dev/2024/01/10/golang-and-epoll

2024-04-01

148.

Product-Focused Reliability for SRE - Google - Site Reliability Engineering

sre.google/resources/practices-and-processes/product-focused-reliability-for-sre

2024-01-06

125.

The challenges of configuring Kubernetes resources’ Requests & Limits in combination with HPA at Scale | by Alexandre Souza | Nov, 2023 | Medium

medium.com/@alexandre.highrollers/the-challenges-of-configuring-kubernetes-resources-requests-limits-in-combination-with-hpa-at-92177cb5a378

2023-12-16

109.

All my favorite tracing tools: eBPF, QEMU, Perfetto, new ones I built and more - Tristan Hume

thume.ca/2023/12/02/tracing-methods

2023-11-10

100.

Building a Successful SRE Team. Successful techniques to ensure your… | by Sven Hans Knecht | Medium

blog.hans-knecht.com/building-a-successful-sre-team-283232bc2694

2023-08-12

90.

An NGINX and DNS based outage | Andy Dote

andydote.co.uk/2022/04/23/nginx-dns

I recently encountered a behaviour in Nginx that I didn’t expect and caused a production outage in the process. While I would love to blame DNS for this, as it’s usually the cause of most network-related issues, in this case, the fault lies with Nginx.

2023-08-05

83.

The System Resiliency Pyramid

www.codereliant.io/the-system-resiliency-pyramid

2023-07-17

67.

Preventing Pipewire from being SIGKILLed

lantian.pub/en/article/random-notes/pipewire-sigkill-fix.lantian

I frequently encounter the situation that the Pipewire audio server is suddenly stopped:

The problem usually appears when I connect/disconnect my laptop from the power adapter. My computer usually lags for a short time while switching between performance profiles.
systemctl --user status pipewire.service only shows that the Pipewire process was terminated by a SIGKILL signal, without any other useful log information.
Neither coredumpctl nor dmesg shows the existence of a core dump event.

2023-07-01

54.

Linux network namespaces and HTTP requests in Go

blog.tomlebreux.com/2022/03/02/linux-network-namespaces-and-http-requests-in-go.html

2023-06-29

47.

A few words on taking notes

www.allthingsdistributed.com/2023/06/a-few-words-on-taking-notes.html

2023-06-17

19.

My 24 year old HP Jornada can do things your modern iPhone still can't do!

raymii.org/s/blog/My_24_year_old_HP_Jornada_can_do_things_your_modern_iPhone_still_cant_do.html

oldtech cools

2023-06-13

3.

Scaling Site Reliability Engineering Teams the Right Way

www.squadcast.com/blog/scaling-site-reliability-engineering-teams-the-right-way

2.

monitoring is a pain

matduggan.com/were-all-doing-metrics-wrong