Open
Description
As a follow up on #424:
In the core c setting, the 10 most often occurring opcodes with a string index are:
$ bceval c '.map({.name if .operands.first("str")}).Bag.sort(-*.value).head(10)'
dispatch_o => 108459
const_s => 61445
getattr_o => 16352
dispatch_v => 15188
bindattr_o => 11969
dispatch_i => 7467
getlex_no => 2769
getattr_i => 1657
bindattr_i => 863
param_on_o => 788
It looks to me that we could get a 2 * (108459 + 61445)) / 11983692
= 2.8% reduction in frame's opcode size (and thus improve its chances of getting inlined) by creating 2 alternate versions of opcodes: dispatch_o
and const_s
: dispatch_o_16
and const_s_16
. The only difference with their counterparts would be reserving 2 bytes for the string index, rather than 4.
In interp.c
, the additions would be (I think, please correct me if I'm wrong, as my C
is rusty):
OP(const_s_16):
GET_REG(cur_op, 0).s = MVM_cu_string(tc, cu, GET_UI16(cur_op, 2).u32);
cur_op += 4;
goto NEXT;
OP(dispatch_o_16): {
MVMDispInlineCacheEntry **ice_ptr = MVM_disp_inline_cache_get(
cur_op, bytecode_start, tc->cur_frame);
MVMDispInlineCacheEntry *ice = *ice_ptr;
MVMString *id = MVM_cu_string(tc, cu, GET_UI16(cur_op, 2).u32);
MVMCallsite *callsite = cu->body.callsites[GET_UI16(cur_op, 4)];
MVMuint16 *args = (MVMuint16 *)(cur_op + 6);
MVMuint32 bytecode_offset = (cur_op - bytecode_start) - 2;
tc->cur_frame->return_value = &GET_REG(cur_op, 0);
tc->cur_frame->return_type = MVM_RETURN_OBJ;
cur_op += 6 + 2 * callsite->flag_count;
tc->cur_frame->return_address = cur_op;
ice->run_dispatch(tc, ice_ptr, ice, id, callsite, args, tc->cur_frame->work,
tc->cur_frame->static_info, bytecode_offset);
goto NEXT;
}
I guess the oplist parser would need to support a new kind of string index reference, something like "str16".
And of course the necessary changes in the MAST writing.