
The LuaJIT example isn't correct though; the lifetime of garbage-collected objects is clearly documented: https://luajit.org/ext_ffi_semantics.html#gc In the example function, `blob` will not be collected because it is reachable from the `blob` argument local variable (IOW it is on the Lua stack). `ffi.string()` copies the string data into a new Lua string, and the lifetime of `blob` is guaranteed until the function returns. So I'm not sure what the issue is.

  function blob_contents(blob) -- <- this ensures liveness until past return 
     local len_out = ffi.new('unsigned int[1]') -- one-element array, usable as an unsigned int* out-parameter
     local contents = hb.hb_blob_get_data(blob, len_out)
     local len = len_out[0];
     return ffi.string(contents, len)
  end


Unfortunately things aren't so simple, as when doing JIT compilation, LuaJIT _will_ try to shorten the lifetimes of local variables. Using the latest available version of LuaJIT (https://github.com/LuaJIT/LuaJIT/commit/0d313b243194a0b8d239...), the following reliably fails for me:

  local ffi = require"ffi"
  local function collect_lots()
    for i = 1, 20 do collectgarbage() end
  end
  local function f(s)
    local blob = ffi.new"int[2]"
    local interior = blob + 1
    interior[0] = 13 -- should become the return value
    s:gsub(".", collect_lots)
    return interior[0] -- kept alive by blob?
  end
  for i = 1, 60 do
    local str = ("x"):rep(i - 59)
    assert(f(str) == 13) -- can fail!!
  end


Well that is from 3 weeks ago. If that remains then it’s a bug or the documentation is wrong. What are the rules for keeping a GC object alive? What earthly useful meaning can “Lua stack” have in the FFI GC documentation if not local bindings, since that is the only user-visible exposure of it in the language?

From the LuaJIT docs: So e.g. if you assign a cdata array to a pointer, you must keep the cdata object holding the array alive as long as the pointer is still in use:

  ffi.cdef[[
  typedef struct { int *a; } foo_t;
  ]]

  local s = ffi.new("foo_t", ffi.new("int[10]")) -- WRONG!

  local a = ffi.new("int[10]") -- OK
  local s = ffi.new("foo_t", a)
  -- Now do something with 's', but keep 'a' alive until you're done.

What on earth does "OK" here mean if not the local variable binding? It's the expectation because this is what it says on the tin.

This then isn’t a discussion about fundamental issues or "impossibilities" with GC, but with poor language implementations not following their own specifications or not having them.

Since LuaJIT does not have an explicit pinning interface the expectation that a local variable binding remains until the end of scope is pretty basic. If your bug case is expected then even the line: interior[0] = 13 is undefined and so would everything after local s in the documentation, ie you can do absolutely nothing with a pointed to cdata until you pin it in a table. Who would want to use that?
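For reference, "pinning it in a table" could look roughly like the sketch below. The `anchors` table and the `with_anchor` helper are hypothetical names made up for illustration, not a LuaJIT API; the only assumption is the documented rule that a cdata object referenced from a Lua table counts as alive.

```lua
local ffi = require("ffi")

-- Hypothetical anchor table (not a LuaJIT API): anything stored here is
-- referenced from a Lua table, which the FFI docs list as a valid reference.
local anchors = {}

-- Hypothetical helper: keep obj anchored for the duration of fn(obj).
local function with_anchor(obj, fn)
  anchors[#anchors + 1] = obj   -- pin
  local result = fn(obj)
  anchors[#anchors] = nil       -- unpin (an error in fn would skip this)
  return result
end

local value = with_anchor(ffi.new("int[2]"), function(blob)
  local interior = blob + 1     -- raw pointer; holds no reference to blob
  interior[0] = 13
  collectgarbage()              -- blob is still referenced from anchors
  return interior[0]
end)
print(value)
```

Having to wrap every use of an interior pointer like this is exactly the ergonomic cost being objected to.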


You're absolutely right. I'm not particularly familiar with LuaJIT so when I read the article I got the impression the LuaJIT GC semantics weren't documented. Looks like the LuaJIT behavior is well defined and the implementation isn't keeping its own promises.


The argument is that the JIT might realise that blob is never used beyond that line, and collect it early. In general that would be a desirable feature.


I know it says this: "The semantics of LuaJIT do not prescribe when GC can happen and what values will be live, so the GC and the compiler are not constrained to extend the liveness of blob to, say, the entirety of its lexical scope."

But it is flat wrong. From the LuaJIT documentation: "All explicitly (ffi.new(), ffi.cast() etc.) or implicitly (accessors) created cdata objects are garbage collected. You need to ensure to retain valid references to cdata objects somewhere on a Lua stack, an upvalue or in a Lua table while they are still in use. Once the last reference to a cdata object is gone, the garbage collector will automatically free the memory used by it (at the end of the next GC cycle)."

The Lua stack in this case includes all the local variables in that function scope. It's a non-issue/straw man and is common sense. If LuaJIT FFI worked the way the author supposed, it would be near impossible to use practically.

“It is perfectly valid to collect blob after its last use”

This is a useless statement. It’s perfectly “valid” for LuaJIT to not even read your source code and exit immediately, but that isn’t what it does because it would be useless. What counts as a reference in both PUC Lua and LuaJIT is defined.

As far as the desirability of finer grained liveness, Lua has block scope (do end), but in practice LuaJIT does well inlining so functions ought to be short anyway.
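A minimal sketch of the block-scope point, assuming nothing beyond standard `do ... end` scoping and `ffi.new`: the local `buf` is only visible inside the block, so the program no longer asks for the array to stay alive after `end`. Exactly when the collector reclaims it is not something this sketch relies on.

```lua
local ffi = require("ffi")

local total = 0
do
  -- buf is a local binding visible only inside this block
  local buf = ffi.new("int[4]", {1, 2, 3, 4})
  for i = 0, 3 do
    total = total + buf[i]
  end
end
-- buf is out of scope here; the array may be collected from now on
collectgarbage()
print(total)
```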


On a related note, the OP says:

> or attempt to manually extend the lifetime of a finalizable object, and then pray the compiler and GC don’t learn new tricks to invalidate your trick.

This is also silly. There is no reason whatsoever that a GC can’t offer an actual API to keep an object alive.


LuaJIT lets you remove a finalizer:

> An existing finalizer can be removed by setting a nil finalizer, e.g. right before explicitly deleting a resource:

    ffi.C.free(ffi.gc(p, nil)) -- Manually free the memory.
https://luajit.org/ext_ffi_api.html#ffi_gc
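
For context, a slightly fuller sketch of that pattern, modeled on the `ffi.gc` documentation linked above. It assumes the C library's `malloc` and `free` are reachable through `ffi.C`; the buffer size and the value written are arbitrary.

```lua
local ffi = require("ffi")

ffi.cdef[[
void *malloc(size_t size);
void free(void *ptr);
]]

-- Attach free() as the finalizer: it runs if p is ever collected.
local p = ffi.gc(ffi.C.malloc(16), ffi.C.free)

local bytes = ffi.cast("uint8_t *", p)  -- the cast pointer does not keep p alive
bytes[0] = 42
local first = bytes[0]

-- Remove the finalizer, then free manually; ffi.gc returns the same object.
ffi.C.free(ffi.gc(p, nil))
p, bytes = nil, nil
```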


GC is usually not specified to happen at particular times, nor is it specified which values are definitely going to be GCed. Instead it relies on the language semantics, so that any value which is used later in the program is not going to be GCed. How and when the runtime system determines that a value is not going to be used again is an optimization problem, not a correctness problem.

So everything you quote Lua as saying here is consistent. The thing is that it only considers "used later" as "used later by the Lua program". Or rather, it only considers Lua values as "values". A value stored in non-managed memory is not a value. It's not GCed. The `ffi.new`-created Lua value is, and its finalizer happens to free the native memory that the pointer refers to.

So non-Lua "values" are not GCed; they are freed as side effects of Lua values being GCed.


Okay, LuaJIT is a rather specific example; what about the .NET CLR? It absolutely does this optimization: e.g. an object can get GC'd while one of its instance methods is still running, provided that this instance method is statically (or maybe even dynamically?) guaranteed not to access `this`.


Yes, and? It has mechanisms with well-defined semantics such as GCHandle and KeepAlive. That’s literally what they are there for, so it makes this: “or attempt to manually extend the lifetime of a finalizable object, and then pray the compiler and GC don’t learn new tricks to invalidate your trick” a non-starter.

It’s not a trick, it’s a documented interface.



