procaryote 16 hours ago

Right-shifting three bits would reduce the size of the lookup table to 32 slots

I guess something like

    const int extra_bits = (sizeof(int) - 1) * 8;
    /* mask to one byte so clz sees extra_bits zeros, then the leading ones; | 1 avoids clz(0) */
    int x = __builtin_clz((~lead_byte & 0xFF) | 1) - extra_bits;
    return (x == 0) + (x > 1) * (x < 5) * x;
could work, although I've not tested it for all cases or checked if it's fast

The idea there is to invert the bits, use a built-in operation to count leading zeros (i.e. leading ones in the original byte), and then do some math to achieve the same semantics as the lookup table.
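For anyone who wants to try the idea, here is a minimal, self-contained sketch of it. The function name `utf8_len_from_lead` is made up for illustration, it assumes a GCC/Clang-style `__builtin_clz`, and it assumes the lookup table maps a lead byte to a sequence length of 1 to 4 with 0 for anything else. Treat it as a sketch under those assumptions, not as a drop-in replacement.

```c
#include <limits.h>

/* Hypothetical helper: returns the sequence length implied by a UTF-8 lead
   byte (1 to 4), or 0 for continuation bytes and longer-than-4 patterns. */
static int utf8_len_from_lead(unsigned char lead_byte)
{
    /* Bits of an unsigned int above the low byte; clz counts these as zeros. */
    const int extra_bits = (int)(sizeof(unsigned int) - 1) * CHAR_BIT;

    /* Invert and keep only the low 8 bits. OR-ing in 1 keeps the argument
       nonzero, because __builtin_clz(0) is undefined; this only changes the
       result for 0xFF, which is rejected either way. */
    unsigned int inverted = ((unsigned int)~lead_byte & 0xFFu) | 1u;

    /* Number of leading one bits in the original byte (capped at 7). */
    int ones = __builtin_clz(inverted) - extra_bits;

    /* 0 ones: ASCII, length 1. 2 to 4 ones: that many bytes. Otherwise 0. */
    return (ones == 0) + (ones > 1) * (ones < 5) * ones;
}
```

Note that this only looks at the count of leading ones, so it reports 2 for 0xC0 and 0xC1 and 4 for 0xF5 through 0xF7. Those bytes would still need separate handling.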

  • zahlman 13 hours ago

    > Right-shifting three bits

    This is not compatible with the special cases that need to be checked (e.g. c0 and c1 start bytes must be rejected).
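
The collision behind that objection can be checked directly. In this small sketch, `utf8_len_strict` and `slot32` are hypothetical names, and the strict function is only one plausible reading of what the full table encodes (rejecting 0xC0, 0xC1 and 0xF5 and above, which can never begin valid UTF-8).

```c
/* Hypothetical strict check: sequence length for a lead byte, with the
   bytes that can never start valid UTF-8 rejected explicitly. */
static int utf8_len_strict(unsigned char b)
{
    if (b < 0x80) return 1;   /* ASCII */
    if (b < 0xC2) return 0;   /* continuation bytes, plus overlong 0xC0/0xC1 */
    if (b < 0xE0) return 2;
    if (b < 0xF0) return 3;
    if (b < 0xF5) return 4;   /* 0xF0..0xF4 */
    return 0;                 /* 0xF5..0xFF */
}

/* Index a 32-slot table would use after right-shifting by three bits. */
static unsigned slot32(unsigned char b)
{
    return b >> 3;
}
```

0xC0 and 0xC2 both land in slot 24, and 0xF4 and 0xF5 both land in slot 30, yet each pair needs different answers. A 32-slot table indexed this way therefore cannot express the special cases without an extra check.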