LuaJIT 2.1 status and sponsorships

  • From: Mike Pall <mike-1305@xxxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Tue, 21 May 2013 22:02:00 +0200

LuaJIT 2.1 Status
=================

Development on LuaJIT 2.1 is currently underway. Since a lot of
the code is still in flux, it'll remain in alpha status for now.
Many more improvements will go into 2.1 before it becomes final.

Currently, LuaJIT 2.1 is only available via the 'v2.1' branch of
the git repository. If you want to check it out, please refer to
the instructions at: http://luajit.org/download.html

Here are the main changes in LuaJIT 2.1 as of today:

* The x87 FPU support for the interpreter has been removed.

  For LuaJIT 2.0, x86 CPUs without SSE2 only worked in interpreter
  mode and the JIT-compiler was disabled. LuaJIT 2.1 won't run at
  all on a CPU without SSE2 support.

  The 32 bit x86 port of LuaJIT 2.1 now offers consistent IEEE-754
  semantics, just like the other ports. And all implicit conversions
  of FP numbers to integers round towards zero.

* Several standard library functions have been replaced with
  builtin bytecode.

  These library functions have been rewritten in Lua -- the source
  code can be found inline in the corresponding modules (lib_*.c).
  The code is mangled, pre-compiled to bytecode and embedded into
  the binary. This saves space, simplifies compilation and avoids
  NYI cases that couldn't be resolved otherwise.

  Most of the changes shouldn't have any user-visible impact
  (string.len() etc.). But you may notice that some former NYI
  cases are now compiled, e.g. table.foreachi() or table.remove()
  (in 2.0, only the 'pop' operation was compiled).

  More builtins will follow. A bigger item on my TODO list is a
  Lua rewrite of the package library (package.*, require, module).

  I should note that replacing a C builtin with a Lua builtin does
  _not_ cause a slowdown, even on platforms where the JIT-compiler
  has to be disabled (iOS, consoles). The overhead of a series of
  Lua/C API calls from C code is comparable to the overhead of
  bytecode dispatch. The core of the bytecode interpreter and many
  important builtins are written in heavily optimized assembler.
  Of course, once JIT-compiled, both overheads can be eliminated.

* String formatting semantics have been unified across platforms.

  Both Lua 5.x and LuaJIT 2.0 show platform-dependent behavior for
  the integer formats %d, %u, %o and %x of string.format().
  Basically, you got 32 bit wide conversions on some platforms and
  64 bit conversions on others. string.format("%d", 2^32) returned
  either "0" or "4294967296" or an undefined result. Rounding was
  similarly unpredictable.

  LuaJIT 2.1 offers consistent 64 bit conversions for all integer
  formats. The unsigned formats also accept and convert negative
  numbers and re-interpret them as unsigned 64 bit integers.

  One potential pitfall for users is that
    string.format("%x", -1)
  now returns "ffffffffffffffff" on all platforms. Previously it
  returned that result on some LuaJIT 2.0 platforms and "ffffffff"
  on others. You may need to adapt your code if it relies on the
  latter result.

  Note that the Lua BitOp manual has warned about this behavior of
  %x with signed inputs. The suggested, portable workaround is
  to use bit.tohex(), which guarantees fixed length output.

  On a related note, the %s and %c formats now handle embedded NUL
  characters.

* All bit.* functions can now operate on 64 bit FFI cdata values.
  Details below.

* String buffers and string operations have been improved.
  Details below.

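The string formatting changes listed above can be illustrated with
a small sketch. It assumes a LuaJIT 2.1 build from the v2.1 branch;
the variable names are only illustrative, and LuaJIT 2.0 or plain
Lua may give different, platform-dependent results.

```lua
-- Sketch only: results assume LuaJIT 2.1 semantics as described above.
local bit = require("bit")

-- Integer formats use 64 bit conversions, so 2^32 no longer wraps.
local wide = string.format("%d", 2^32)

-- Implicit FP-to-integer conversion rounds towards zero.
local trunc = string.format("%d", -3.7)

-- Unsigned formats re-interpret negative numbers as unsigned 64 bit.
local hex64 = string.format("%x", -1)

-- Portable workaround: bit.tohex() gives fixed length output
-- (8 hex digits for a plain number).
local hex32 = bit.tohex(-1)

-- %s now handles embedded NUL characters, so the length is preserved.
local with_nul = string.format("%s", "a\0b")

print(wide, trunc, hex64, hex32, #with_nul)
```

Code that depends on the 8 digit form is probably safest calling
bit.tohex() directly instead of relying on "%x".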

Sponsorship for 64 bit Bitwise Operations
=========================================

A corporate sponsor, who wishes to remain anonymous, has sponsored
the development of 64 bit bitwise operations for LuaJIT 2.1.

The rules for bitwise operations follow the general rules for
arithmetic operations on FFI data types: 64 bit integer arguments
are 'sticky'. E.g. bit.band(0x16LL, 5) returns 4LL (a cdata
number, not a Lua number).

The only exception is bit.tobit(), which truncates to a signed
32 bit integer and returns a plain Lua number -- just like the
32 bit bitwise operations.

The default output size for bit.tohex() is 8 hex digits for plain
numbers and 16 hex digits for FFI cdata numbers.

For the exact semantics, point your browser to the file
  doc/ext_ffi_semantics.html#cdata_arith
in the v2.1 branch.

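As a rough sketch of the rules above (assuming a LuaJIT 2.1 build
with the FFI enabled; the local names are only illustrative):

```lua
-- Sketch only: requires LuaJIT 2.1 with FFI support for the LL literals.
local bit = require("bit")

-- A 64 bit integer argument is 'sticky': the result is a cdata number.
local r = bit.band(0x16LL, 5)
print(type(r), tostring(r))

-- 64 bit shifts are not limited to 32 bit results.
local big = bit.lshift(1LL, 40)
print(tostring(big))

-- bit.tobit() is the exception: it truncates to a signed 32 bit
-- integer and returns a plain Lua number.
local t = bit.tobit(0x100000001LL)
print(type(t), t)

-- bit.tohex() defaults to 8 digits for plain numbers and 16 digits
-- for FFI cdata numbers.
local h32 = bit.tohex(255)
local h64 = bit.tohex(255LL)
print(h32, h64)
```

The linked document remains the reference for the exact semantics;
this only exercises the cases mentioned in this post.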

Sponsorship for Performance Improvements
========================================

I'm happy to announce that CloudFlare Inc. (https://www.cloudflare.com/)
is sponsoring various performance improvements for LuaJIT 2.1.
CloudFlare is operating one of the world's largest deployments of
nginx + LuaJIT. As you can imagine, at this scale, every fraction
of a microsecond that can be shaved off for each request has a
significant impact.

I'm working closely with Yichun Zhang (agentzh) at CloudFlare to
identify and fix any remaining bottlenecks. The current focus is
on string operations. Many of these are NYI items in LuaJIT 2.0,
causing fallbacks to the interpreter.

These items have already been improved in LuaJIT 2.1 (git):

- Reorganize string buffer management and tune string ops.

- JIT-compile string concatenations ('..' operator, CAT bytecode).

- JIT-compile the following library functions:
  string.char(), string.rep(), string.reverse(), string.lower(),
  string.upper(), table.concat(), bit.tohex()

- Reorganize and speed up string formatting and remove most
  dependencies on sprintf(). FP number formats still use sprintf().

- Partially JIT-compile the following library functions:
  string.format() -- Not %p, not for non-string args to %s.
  string.find() -- Only for fixed string searches (no patterns).

This means that all string.* functions are now compiled, except
for string.dump() (not helpful to compile) and pattern matches.
http://wiki.luajit.org/NYI has been updated accordingly.

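To give an idea of what this covers, here is a minimal sketch of a
loop that sticks to the operations listed above. The helper names
build_line() and has_token() are hypothetical and not part of
LuaJIT. Whether a given loop is actually compiled depends on the
build and on the surrounding code; running the script with
'luajit -jv' should show the traces and any aborts.

```lua
-- Hypothetical helper: builds a header-like line using string.rep(),
-- string.upper() and the '..' operator.
local function build_line(key, value, width)
  local pad = string.rep(" ", width - #key)
  return string.upper(key) .. pad .. ": " .. value
end

-- Hypothetical helper: fixed string search. The fourth argument
-- (plain = true) turns off pattern matching.
local function has_token(s, token)
  return string.find(s, token, 1, true) ~= nil
end

local parts = {}
for i = 1, 100 do
  -- string.format() with an integer argument to %d.
  parts[i] = build_line("x-id", string.format("%d", i), 6)
end

local joined = table.concat(parts, "\n")
print(#parts, has_token(joined, "X-ID  : 100"))
```
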
More improvements are coming as we go along.

--Mike
