Algorithms (3 blogmarks)

← Blogmarks

Rob Pike's 5 Rules of Programming

https://www.cs.unc.edu/~stotts/COMP590-059-f24/robsrules.html

The general theme here is "less is more". Don't write anticipatory code for potential bottlenecks; measure and identify them first. Start with simpler algorithms. Start with simpler data structures.

Rule 1. You can't tell where a program is going to spend its time. Bottlenecks occur in surprising places, so don't try to second guess and put in a speed hack until you've proven that's where the bottleneck is.

Rule 2. Measure. Don't tune for speed until you've measured, and even then don't unless one part of the code overwhelms the rest.

Rule 3. Fancy algorithms are slow when n is small, and n is usually small. Fancy algorithms have big constants. Until you know that n is frequently going to be big, don't get fancy. (Even if n does get big, use Rule 2 first.)

Rule 4. Fancy algorithms are buggier than simple ones, and they're much harder to implement. Use simple algorithms as well as simple data structures.

Rule 5. Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming.

This last bit of Rule 5 is particularly interesting: "Data structures, not algorithms, are central to programming."

It is so much easier to reason about and build a flow of logic when you've modeled the problem well and gotten the data structure right.
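A hypothetical illustration of that point: once a graph is stored as a plain adjacency dict, a breadth-first traversal practically writes itself (the graph and function here are my own invented example, not from the linked post):

```python
from collections import deque

# Model the problem as data first: each node maps to its neighbors.
graph = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": [],
}

def bfs(start):
    """Visit nodes in breadth-first order; the algorithm falls out of the data."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order

print(bfs("a"))  # ['a', 'b', 'c', 'd']
```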

Punycode: My New Favorite Algorithm

https://www.iankduncan.com/engineering/2025-12-01-punycode/

I enjoyed reading about this algorithm because it demonstrates some clever tricks and problem solving. It's useful to read about new algorithms from time to time, if for no other reason than as a source of inspiration that keeps you actively thinking about unique ways to solve problems.


A high level summary of what makes it so neat: the algorithm uses adaptive bias adjustment, variable-length encoding, and delta compression to achieve extremely dense (in the information-theoretic sense) results for both single-script domains (like German bücher-café.de or Chinese 北京-旅游.cn) and mixed-script domains (like hello世界.com). What makes it particularly elegant is that it does this without any shared state between encoder and decoder. The adaptation is purely a deterministic function of the encoded data itself. This is a really cool example of how you can use the constraints of a system to your advantage to solve a problem.
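You can try the algorithm directly: Python's standard library ships a Punycode codec (RFC 3492), so a quick sketch of the encoding looks like this (the "xn--" prefix seen in real domain names is added by the separate IDNA layer, not by Punycode itself):

```python
# Punycode pulls the ASCII characters out, then encodes the
# non-ASCII code points as deltas appended after a hyphen.
label = "bücher"
encoded = label.encode("punycode")
print(encoded)  # b'bcher-kva'

# The round trip is lossless, with no shared state between the
# encoder and decoder beyond the encoded bytes themselves.
assert encoded.decode("punycode") == label
```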

Reservoir Sampling

https://samwho.dev/reservoir-sampling/

Reservoir sampling is a technique for selecting a fair random sample when you don't know the size of the set you're sampling from.
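The classic single-pass version (Algorithm R) fits in a few lines; this is a minimal sketch, not code from the linked post:

```python
import random

def reservoir_sample(stream, k):
    """Keep a uniform random sample of k items from a stream
    of unknown length, using a single pass and O(k) memory."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)  # fill the reservoir first
        else:
            # Keep the new item with probability k / (i + 1),
            # replacing a uniformly chosen existing element.
            j = random.randrange(i + 1)
            if j < k:
                reservoir[j] = item
    return reservoir

sample = reservoir_sample(range(1_000_000), 10)
print(sample)  # 10 items, each equally likely to be chosen
```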