Early History of a Perfect Join Algorithm

I've written two posts about an optimal join algorithm that has been making a comeback in recent years - even Microsoft felt the FOMO. Throughout the years, the algorithm has become associated with Mihalis Yannakakis. But just like many other great ideas in computer science, there was not one single inventor of the algorithm, but three! In the same year Yannakakis published his seminal paper, Phil Bernstein and Dah-Ming Chiu published their paper, describing almost exactly the same algorithm for solving acyclic joins. To recognize Bernstein and Chiu's contribution, I think we should start calling the algorithm BCY algorithm.

But being obsessive as I am, I did some more archaeology to find out if this is really a case of concurrent discovery, or should one of them claim first dibs. The Bernstein & Chiu paper was actually accepted in 1979 (this is marked at the end of the paper), yet only appeared in 1981 due to the long turnaround for journals. I also stumbled upon this short review by Z. Meral Özsoyoglu (the "O" in "GYO reduction", another concurrently discovered algorithm). The review suggested that a technical report of the paper was already circulating in 1979. Beyond the "when", Özsoyoglu's review also pinpoints the "what":

Bernstein and Chiu's algorithm was limited to at most one join attribute between any two relations, i.e. semijoins involving single domains

So Bernstein & Chiu were 2 years early, but their algorithm was more limited? Not so fast! If you read their paper, you'll see the algorithm described is exactly the same as what we've been calling Yannakakis' algorithm. It's just that in the proofs, Bernstein & Chiu made the assumption that all joins are single-attribute, but they didn't actually need it - the algorithm remains correct for multiple-attribute joins! On the other hand, although Yannakakis' paper does deal with the more general case of multi-attribute joins,¹ he overshot a little bit: he also considers projections over joins, in which case the best complexity we can get is only polynomial in |\text{IN}| + |\text{OUT}|, not linear, but it is easy to specialize the algorithm for pure join queries and prove the linear complexity. Finally, the core of all algorithms variants above is a process to obtain fully reduced relations, called "full reducers", using semijoins, and a widely cited early work for that is a 1979 technical report The theory of semi-joins by Phil Bernstein and Nathan Goodman.