LuaJIT git HEAD now contains the new allocation sinking and store
sinking optimization.

This optimization is enabled by default. In case you encounter any
problems and want to check whether they are caused by this
optimization, you can turn it off with: -O-sink

The optimization is geared towards the elimination of short-lived
aggregates. It handles plain Lua tables as well as FFI cdata (e.g.
structs, complex or short arrays). It also handles elimination of
immutable FFI types that are implicitly boxed (e.g. 64 bit ints or
pointers) in more contexts (e.g. loop-carried variables).

This optimization adds quite a bit of complexity, so I'd appreciate
it if it received wider testing. Feedback welcome!

Here are a few examples that show the improved performance. The
timings in seconds are for Lua 5.1.5 vs. LuaJIT git HEAD on x86
(32 bit). Lower numbers are better:

Typical point class with Lua tables:

  local point
  point = {
    new = function(self, x, y)
      return setmetatable({x=x, y=y}, self)
    end,
    __add = function(a, b)
      return point:new(a.x + b.x, a.y + b.y)
    end,
  }
  point.__index = point
  local a, b = point:new(1.5, 2.5), point:new(3.25, 4.75)
  for i=1,1e8 do a = (a + b) + b end
  print(a.x, a.y)

  140.0   Lua
   26.9   LuaJIT -O-sink
    0.35  LuaJIT -O+sink   *** 400x faster than Lua ***

Typical point class with cdata struct:

  local ffi = require("ffi")
  local point
  point = ffi.metatype("struct { double x, y; }", {
    __add = function(a, b)
      return point(a.x + b.x, a.y + b.y)
    end
  })
  local a, b = point(1.5, 2.5), point(3.25, 4.75)
  for i=1,1e8 do a = (a + b) + b end
  print(a.x, a.y)

   10.9   LuaJIT -O-sink
    0.20  LuaJIT -O+sink   *** 700x faster than Lua ***

64 bit arithmetic in a loop:

  local x = 0LL
  for i=1,1e9 do x = x + 100 end
  print(x)

   45.8   LuaJIT -O-sink (x86)
   40.9   LuaJIT -O-sink (x64)
    0.84  LuaJIT -O+sink (x86)
    0.48  LuaJIT -O+sink (x64)

--Mike