It has been more than 3 years now since the MIR ini­ti­at­ive has been ac­cep­ted. Just a year after it has been en­abled by de­fault to all the users of nightly Rust. Even then, it was fairly clear that MIR is well po­si­tioned to en­able op­tim­isa­tions that would, in many in­stances, res­ult in sig­ni­fic­antly faster com­pil­a­tion. “This would be achieved by op­tim­ising the gen­eric code”, we thought at the time, “which would make it easi­er, and thus faster, for LLVM to do its share of work on the mono­morphic code”. While the idea is prom­ising, not much has ma­ter­i­al­ised since.

Cur­rently rustc has a num­ber of MIR op­tim­isa­tions: a simple in­lin­er, ba­sic con­stant and copy propaga­tion, a single in­struc­tion com­bin­a­tion rule, a few graph sim­pli­fic­a­tion and clean up passes… The pat­tern here is clear – most of the op­tim­isa­tions we cur­rently have are ba­sic and lim­ited in their po­tency. Given the pace at which we man­aged to bring up MIR in the first place, one would be right to ex­pect… some­thing more.

As some­body who has made an at­tempt and failed to im­ple­ment a num­ber of data­flow-­based op­tim­isa­tions (a­mong other thing­s), I con­sider my­self fairly qual­i­fied to haz­ard a guess as to what is the reason for the cur­rent state we are at. Here it goes.

MIR is muta­tion-a­verse

The MIR data struc­ture has not changed much since it was first con­ceived. An oc­ca­sional enum size op­tim­isa­tion or new state­ment kind is usu­ally the limit of modi­fic­a­tions to the struc­ture of the type.

On one hand, this proves we did some­thing right, es­pe­cially given the fairly com­plex tool­ing such as miri which could be built on top of it. On the other hand, it has been ob­vi­ous since al­most the be­gin­ning that mutat­ing the MIR to ob­tain a dif­fer­ent look­ing MIR is fairly awk­ward. Es­pe­cially when non-trivial trans­form­a­tions get in­volved. And that is ex­actly what op­tim­isa­tions do!

Mul­tiple is­sues con­trib­ute to this is­sue. MIR it­self be­ing made out of Vec<T>ors is one thing. It makes trans­form­a­tions such as re­mov­ing or adding a state­ment in­ef­fi­cient and very much in your face. Lack of a con­veni­ent com­mon API to trans­form MIR in ar­bit­rary ways is an­oth­er. Op­tim­isa­tion code is then forced to re­im­ple­ment such modi­fic­a­tion by fid­dling with the data struc­tures dir­ectly. This in turn makes in­ef­fi­cient op­er­a­tions with vec­tors much more likely as well. While there is a MirPatch ap­proach which has been gain­ing some use, it does not al­low to eas­ily ex­press ar­bit­rary block trans­form­a­tions and it is­n’t too con­veni­ent either.

Hav­ing some­thing that is easy and cheap to modify in ar­bit­rary ways would go a long way to­wards en­abling im­ple­ment­a­tion of the more com­plic­ated op­tim­isa­tions. This some­thing does not ne­ces­sar­ily need to be MIR it­self – it could be some­thing spe­cific­ally tailored and used for op­tim­isa­tion.

As an aside, some­thing trans­ac­tion­al, with an abil­ity to ef­fi­ciently roll back the modi­fic­a­tions to a pre­vi­ous check­point, would be ideal. There are quite a few op­tim­isa­tions (esp. those that in­volve heur­ist­ics, such as in­lin­ing) where it is not ob­vi­ous whether tak­ing cer­tain ac­tions would res­ult in bet­ter code. For those op­tim­isa­tions we could as well just do the op­tim­isa­tion and de­cide whether it res­ul­ted in bet­ter code then and there.

Data col­lec­tion is hard

Most op­tim­isa­tions need a sig­ni­fic­ant amount of in­form­a­tion about the pro­gram, so that they could prove equi­val­ence between the code be­ing op­tim­ised and the op­tim­ised ver­sion of it. Data such as ali­asing in­form­a­tion and call graph have many uses in vari­ety of dif­fer­ent op­tim­isa­tions. Some other data col­lec­tion may be done via, say, data­flow ana­lysis.

While solu­tion here could be as simple as im­ple­ment­ing com­mon in­fra­struc­ture for ob­tain­ing such data, if our cur­rent us­age of the data­flow ana­lysis in­fra­struc­ture is any­thing to go by, we will end up with each pass do­ing their own ana­lysis and, there­fore, un­ne­ces­sar­ily re­peat­ing a lot of the work.

In prac­tice, many op­tim­isa­tions tend to need sim­ilar kind of in­form­a­tion. A good ex­ample of this is the triplet of con­stant propaga­tion, alias propaga­tion and ter­min­ator sim­pli­fic­a­tion. All these use ex­actly the same kind of data and pro­duce trans­form­a­tions very sim­ilar in their nature. To achieve the best res­ults it is im­port­ant to share the know­ledge between the passes. In fact – if the know­ledge is not shared it may be­come im­possible to achieve op­timal code. Sec­tion 2.2. of the Hoopl pa­per has an ex­ample of such a scen­ario.

Im­prov­ing in­fra­struc­ture here and es­tab­lish­ing a method for shar­ing data between passes ef­fi­ciently would make it sig­ni­fic­antly easier to im­ple­ment the op­tim­isa­tions without hav­ing to care about either nitty or gritty. This could be done by en­cod­ing such in­form­a­tion into MIR it­self.

Get­ting op­tim­isa­tions right is hard

This is some­what in the wish­list hand­wavy area and some­thing I would love to dis­cuss more at the All Hands.

My ex­per­i­ence with LLVM shows that im­ple­ment­ing the op­tim­isa­tions cor­rectly and keep­ing them that way as changes oc­cur is a sig­ni­fic­ant dif­fi­culty. There are mul­tiple sides to this is­sue. In­ter­ac­tions between dif­fer­ent op­tim­isa­tions are non-trivial to un­der­stand, some­times res­ult­ing in an op­tim­isa­tion that is valid in isol­a­tion but not when in com­bin­a­tion with an­other one. This of­ten hap­pens when the two op­tim­isa­tions dis­agree on what code is ac­cept­able and what is not. This be­comes more im­port­ant as op­tim­isa­tions are mod­i­fied – this is when most of the bugs are in­tro­duced.

Sadly, it is nigh im­possible to keep the whole set of rules in the head and con­sider all of the fine points in the con­text of some­what ar­bit­rary op­tim­isa­tion in­ter­ac­tion graph. All a doc­u­ment of rules ends up help­ing with is in resolv­ing dis­agree­ments and as­sign­ing blame, rather than main­tain­ing the op­tim­isa­tions.

Ideally we would end up with some sort of a tool that is able to prove, with an ar­bit­rary, but still reas­on­able, in­vest­ment of time, that some op­tim­isa­tion yiel­ded an equi­val­ent op­tim­ised func­tion. In gen­er­al, prov­ing this sort of thing is an un­de­cid­able prob­lem (equi­val­ent to the halt­ing prob­lem). To our be­ne­fit op­tim­isers are lim­ited to only mak­ing de­cid­ably equi­val­ent changes by con­struc­tion. This al­lows a tool to at least verify that, given a func­tion, an op­tim­ised ver­sion of it and whatever rules we im­pose on an op­tim­iser, the op­tim­isa­tion does not vi­ol­ate those rules.

miri already veri­fies some of these rules, so I ima­gine a tool could be based on it in some way. The tool could be some­thing akin to a fuzzer which at­tempts to find lack of equi­val­ence between a non-­op­timal ba­sic block and whatever that block was op­tim­ised to. This would give a fair amount of con­fid­ence and avoid the halt­ing prob­lem en­tirely.

rustc, the great op­tim­iser

It is heart­en­ing to see MIR op­tim­isa­tion re­ceive re­newed fo­cus and be con­sidered as one of the big top­ics of dis­cus­sion at the 2019 All Hands. Re­gard­less of de­cisions made there, the road ahead of us is long and it is im­port­ant to get the first steps right. This post is an out­line of what I be­lieve those steps are. With the right steps, the bright fu­ture with rustc as a great op­tim­iser will be upon us.