Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce superfluous move instructions #32

Open
wants to merge 3 commits into
base: trunk-updates
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 25 additions & 0 deletions gcc/brig/ChangeLog
Original file line number Diff line number Diff line change
@@ -1,3 +1,28 @@
2017-11-13 Henry Linjamäki <[email protected]>

Change internal representation of HSA registers. Instead
representing HSA's untyped registers as unsigned int the gccbrig
analyzes brig code and builds the register variables as a type
used in tree expressions at most. This gives better chance to
optimize CONVERT_VIEW_EXPRs away.
* brigfrontend/brig-code-entry-handler.cc: Add analysis method for
register type usage. Handle any-typed register variables.
* brigfrontend/brig-code-entry-handler.h: New declarations for the
above.
* brigfrontend/brig-copy-move-inst-handler.cc: Handle any-typed
register variables.
* brigfrontend/brig-cvt-inst-handler.cc: Likewise.
* brigfrontend/brig-function.cc: Build register variables as a
type based on results of analysis phase.
* brigfrontend/brig-function.h: Move HSA register count defines to
brig-utils.h.
* brigfrontend/brig-to-generic.cc: New analysis handler. Analyze
HSA register usage.
* brigfrontend/brig-to-generic.h: New declarations.
* brigfrontend/brig-util.cc: New utility functions.
* brigfrontend/brig-util.h: New declarations for the above.


2017-10-09 Pekka Jääskeläinen <[email protected]>

* brigfrontend/brig-to-generic.cc: Support BRIG_KIND_NONE
Expand Down
107 changes: 56 additions & 51 deletions gcc/brig/brigfrontend/brig-code-entry-handler.cc
Original file line number Diff line number Diff line change
Expand Up @@ -137,14 +137,7 @@ brig_code_entry_handler::build_tree_operand (const BrigInstBase &brig_inst,
correct size here so we don't need a separate unpack/pack for it.
fp16-fp32 conversion is done in build_operands (). */
if (is_input && TREE_TYPE (element) != operand_type)
{
if (int_size_in_bytes (TREE_TYPE (element))
== int_size_in_bytes (operand_type)
&& !INTEGRAL_TYPE_P (operand_type))
element = build1 (VIEW_CONVERT_EXPR, operand_type, element);
else
element = convert (operand_type, element);
}
element = build_reinterpret_cast (operand_type, element);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this correct? Don't we need proper casting (via the convert call) when the integer sizes differ to ensure no garbage is left to the upper bits?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed build_reinterpret_cast to do unsigned convert (zero extend or clip out upper bits when sizes differ). Signedness does not seem to matter here in the tests I have run but I check couple more cases for this.


CONSTRUCTOR_APPEND_ELT (constructor_vals, NULL_TREE, element);
++operand_ptr;
Expand Down Expand Up @@ -436,7 +429,8 @@ brig_code_entry_handler::build_address_operand
= (const BrigOperandRegister *) m_parent.get_brig_operand_entry
(addr_operand.reg);
tree base_reg_var = m_parent.m_cf->get_m_var_declfor_reg (mem_base_reg);
var_offset = convert_to_pointer (ptr_type_node, base_reg_var);
tree as_uint = build_reinterpret_to_uint (base_reg_var);
var_offset = convert_to_pointer (ptr_type_node, as_uint);

gcc_assert (var_offset != NULL_TREE);
}
Expand Down Expand Up @@ -527,7 +521,10 @@ brig_code_entry_handler::build_tree_operand_from_brig
= ((const uint32_t *) &operand_entries->bytes)[operand_index];
const BrigBase *operand_data
= m_parent.get_brig_operand_entry (operand_offset);
return build_tree_operand (*brig_inst, *operand_data, operand_type);

bool inputp = !gccbrig_hsa_opcode_op_output_p (brig_inst->opcode,
operand_index);
return build_tree_operand (*brig_inst, *operand_data, operand_type, inputp);
}

/* Builds a single (scalar) constant initialized element of type
Expand Down Expand Up @@ -1140,6 +1137,23 @@ brig_code_entry_handler::build_h2f_conversion (tree source)

tree_stl_vec
brig_code_entry_handler::build_operands (const BrigInstBase &brig_inst)
{
return build_or_analyze_operands (brig_inst, /* analyze = */ false);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please drop the comment, why it's there? Also below?

}

void
brig_code_entry_handler::analyze_operands (const BrigInstBase &brig_inst)
{
build_or_analyze_operands (brig_inst, /* analyze = */ true);
}

/* Implements both the build_operands () and analyze_operands () call
so changes goes in tandem. Performs build_operands () when ANALYZE
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

goes -> go

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also briefly explain what is being analyzed.

is false. Otherwise, analyze operands and return empty list. */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> "only analyze the operands and return an empty list."


tree_stl_vec
brig_code_entry_handler::
build_or_analyze_operands (const BrigInstBase &brig_inst, bool analyze)
{
/* Flush to zero. */
bool ftz_mod = false;
Expand Down Expand Up @@ -1327,9 +1341,19 @@ brig_code_entry_handler::build_operands (const BrigInstBase &brig_inst)
/* Treat the operands as the storage type at this point. */
operand_type = half_storage_type;

if (analyze)
{
if (operand_data->kind == BRIG_KIND_OPERAND_REGISTER)
{
const BrigOperandRegister &brig_reg
= (const BrigOperandRegister &) *operand_data;
m_parent.add_reg_used_as_type (brig_reg, operand_type);
}
continue;
}

tree operand = build_tree_operand (brig_inst, *operand_data, operand_type,
!is_output);

gcc_assert (operand);

/* Cast/convert the inputs to correct types as expected by the GENERIC
Expand All @@ -1342,28 +1366,9 @@ brig_code_entry_handler::build_operands (const BrigInstBase &brig_inst)
else if (TREE_CODE (operand) != LABEL_DECL
&& TREE_CODE (operand) != TREE_VEC
&& operand_data->kind != BRIG_KIND_OPERAND_ADDRESS
&& !VECTOR_TYPE_P (TREE_TYPE (operand)))
&& operand_data->kind != BRIG_KIND_OPERAND_OPERAND_LIST)
{
size_t reg_width = int_size_in_bytes (TREE_TYPE (operand));
size_t instr_width = int_size_in_bytes (operand_type);
if (reg_width == instr_width)
operand = build_reinterpret_cast (operand_type, operand);
else if (reg_width > instr_width)
{
/* Clip the operand because the instruction's bitwidth
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, is this really safe to replace with an interpret cast?

is smaller than the HSAIL reg width. */
if (INTEGRAL_TYPE_P (operand_type))
operand
= convert_to_integer (signed_or_unsigned_type_for
(TYPE_UNSIGNED (operand_type),
operand_type), operand);
else
operand = build_reinterpret_cast (operand_type, operand);
}
else if (reg_width < instr_width)
/* At least shift amount operands can be read from smaller
registers than the data operands. */
operand = convert (operand_type, operand);
operand = build_reinterpret_cast (operand_type, operand);
}
else if (brig_inst.opcode == BRIG_OPCODE_SHUFFLE)
/* Force the operand type to be treated as the raw type. */
Expand Down Expand Up @@ -1399,8 +1404,9 @@ tree
brig_code_entry_handler::build_output_assignment (const BrigInstBase &brig_inst,
tree output, tree inst_expr)
{
/* The destination type might be different from the output register
variable type (which is always an unsigned integer type). */
/* The result/input type might be different from the output register
variable type (can be any type; see get_m_var_declfor_reg @
brig-function.cc). */
tree output_type = TREE_TYPE (output);
tree input_type = TREE_TYPE (inst_expr);
bool is_fp16 = (brig_inst.type & BRIG_TYPE_BASE_MASK) == BRIG_TYPE_F16
Expand Down Expand Up @@ -1440,12 +1446,12 @@ brig_code_entry_handler::build_output_assignment (const BrigInstBase &brig_inst,
{
inst_expr = add_temp_var ("before_f2h", inst_expr);
tree f2h_output = build_f2h_conversion (inst_expr);
tree conv_int = convert_to_integer (output_type, f2h_output);
tree assign = build2 (MODIFY_EXPR, output_type, output, conv_int);
tree conv = build_reinterpret_cast (output_type, f2h_output);
tree assign = build2 (MODIFY_EXPR, output_type, output, conv);
m_parent.m_cf->append_statement (assign);
return assign;
}
else if (VECTOR_TYPE_P (TREE_TYPE (output)))
else if (VECTOR_TYPE_P (output_type) && TREE_CODE (output) == CONSTRUCTOR)
{
/* Expand/unpack the input value to the given vector elements. */
size_t i;
Expand Down Expand Up @@ -1473,22 +1479,21 @@ brig_code_entry_handler::build_output_assignment (const BrigInstBase &brig_inst,
bitwidths. */
size_t src_width = int_size_in_bytes (input_type);
size_t dst_width = int_size_in_bytes (output_type);

if (src_width == dst_width)
{
/* A simple bitcast should do. */
tree bitcast = build_reinterpret_cast (output_type, inst_expr);
tree assign = build2 (MODIFY_EXPR, output_type, output, bitcast);
m_parent.m_cf->append_statement (assign);
return assign;
}
else
tree input = inst_expr;
/* Integer results are extended to the target register width, using
the same sign as the inst_expr. */
if (INTEGRAL_TYPE_P (TREE_TYPE (input)) && src_width != dst_width)
{
tree conv_int = convert_to_integer (output_type, inst_expr);
tree assign = build2 (MODIFY_EXPR, output_type, output, conv_int);
m_parent.m_cf->append_statement (assign);
return assign;
bool unsigned_p = TYPE_UNSIGNED (TREE_TYPE (input));
tree resized_type
= build_nonstandard_integer_type (dst_width * BITS_PER_UNIT,
unsigned_p);
input = convert_to_integer (resized_type, input);
}
input = build_reinterpret_cast (output_type, input);
tree assign = build2 (MODIFY_EXPR, output_type, output, input);
m_parent.m_cf->append_statement (assign);
return assign;
}
return NULL_TREE;
}
Expand Down
6 changes: 6 additions & 0 deletions gcc/brig/brigfrontend/brig-code-entry-handler.h
Original file line number Diff line number Diff line change
Expand Up @@ -89,6 +89,7 @@ class brig_code_entry_handler : public brig_entry_handler
tree build_h2f_conversion (tree source);

tree_stl_vec build_operands (const BrigInstBase &brig_inst);
void analyze_operands (const BrigInstBase &brig_inst);
tree build_output_assignment (const BrigInstBase &brig_inst, tree output,
tree inst_expr);

Expand All @@ -105,6 +106,11 @@ class brig_code_entry_handler : public brig_entry_handler
/* HSAIL-specific builtin functions not yet integrated to gcc. */

static builtin_map s_custom_builtins;

private:

tree_stl_vec build_or_analyze_operands (const BrigInstBase &brig_inst,
bool analyze);
};

/* Implement the Visitor software pattern for performing various actions on
Expand Down
4 changes: 2 additions & 2 deletions gcc/brig/brigfrontend/brig-copy-move-inst-handler.cc
Original file line number Diff line number Diff line change
Expand Up @@ -53,12 +53,12 @@ brig_copy_move_inst_handler::operator () (const BrigBase *base)

tree input = build_tree_operand_from_brig (brig_inst, source_type, 1);
tree output = build_tree_operand_from_brig (brig_inst, dest_type, 0);

if (brig_inst->opcode == BRIG_OPCODE_COMBINE)
{
/* For combine, a simple reinterpret cast from the array constructor
works. */

tree casted = build_reinterpret_cast (dest_type, input);
tree casted = build_reinterpret_cast (TREE_TYPE (output), input);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does dest_type and TREE_TYPE (output) differ here? Sounds like a problem in gccbrig_tree_type_for_hsa_type()?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dest_type is tree type for combine_vX_bY_bZ so its always unsigned int type. output tree is 'register' variable which can be any type in this patch.

tree assign = build2 (MODIFY_EXPR, TREE_TYPE (output), output, casted);
m_parent.m_cf->append_statement (assign);
}
Expand Down
35 changes: 18 additions & 17 deletions gcc/brig/brigfrontend/brig-cvt-inst-handler.cc
Original file line number Diff line number Diff line change
Expand Up @@ -222,13 +222,9 @@ brig_cvt_inst_handler::generate (const BrigBase *base)
{
tree casted_input = build_reinterpret_cast (src_type, input);

/* Perform the int to float conversion. */
/* Perform the float to int conversion. */
conversion_result = convert_to_integer (dest_type, casted_input);
}
/* The converted result is finally extended to the target register
width, using the same sign as the destination. */
conversion_result
= convert_to_integer (TREE_TYPE (output), conversion_result);
}
else
{
Expand All @@ -239,21 +235,26 @@ brig_cvt_inst_handler::generate (const BrigBase *base)

size_t dst_reg_size = int_size_in_bytes (TREE_TYPE (output));

tree assign = NULL_TREE;
/* The output register can be of different type&size than the
conversion output size. Cast it to the register variable type. */
if (dst_reg_size > conv_dst_size)
{
tree casted_output
= build1 (CONVERT_EXPR, TREE_TYPE (output), conversion_result);
assign = build2 (MODIFY_EXPR, TREE_TYPE (output), output, casted_output);
}
else
conversion output size. Only need to handle signed integers, rest
is handled by reinterpret_cast. */
tree casted_output = conversion_result;
if (dst_reg_size > conv_dst_size &&
INTEGRAL_TYPE_P (TREE_TYPE (casted_output)))
{
tree casted_output
= build_reinterpret_cast (TREE_TYPE (output), conversion_result);
assign = build2 (MODIFY_EXPR, TREE_TYPE (output), output, casted_output);
gcc_assert (!VECTOR_TYPE_P (casted_output));

bool unsignedp = TYPE_UNSIGNED (TREE_TYPE (casted_output));
tree resized_int_type
= build_nonstandard_integer_type (dst_reg_size * BITS_PER_UNIT,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cannot you store dst_reg_type somewhere to avoid recreating it?

unsignedp);
casted_output = build1 (CONVERT_EXPR, resized_int_type, casted_output);
}

casted_output
= build_reinterpret_cast (TREE_TYPE (output), casted_output);
tree assign = build2 (MODIFY_EXPR, TREE_TYPE (output), output, casted_output);

m_parent.m_cf->append_statement (assign);

return base->byteCount;
Expand Down
69 changes: 48 additions & 21 deletions gcc/brig/brigfrontend/brig-function.cc
Original file line number Diff line number Diff line change
Expand Up @@ -272,40 +272,67 @@ brig_function::add_local_variable (std::string name, tree type)
return variable;
}

/* Return tree type for a HSA register.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an


The tree type can be anything (scalar, vector, int, float, etc.)
but its size is guaranteed to match the HSA register size.

HSA registers are untyped but we select a type based on their use
to reduce (unoptimizable) VIEW_CONVERT_EXPR nodes (seems to occurs
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(unoptimizable) -> (sometimes unoptimizable)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

occurs -> occur

when use or def reaches over current BB). */

tree
brig_function::get_tree_type_for_hsa_reg (const BrigOperandRegister *reg) const
{
size_t reg_size = gccbrig_reg_size (reg);

/* The default type. */
tree type = build_nonstandard_integer_type (reg_size, true);

if (m_parent->m_fn_regs_use_index.count (m_name) == 0)
return type;

const regs_use_index &index = m_parent->m_fn_regs_use_index[m_name];
size_t reg_id = gccbrig_hsa_reg_hash (*reg);
if (index.count (reg_id) == 0)
return type;

const reg_use_info &info = index.find (reg_id)->second;
std::vector<std::pair<tree, size_t> >::const_iterator it
= info.m_type_refs.begin ();
std::vector<std::pair<tree, size_t> >::const_iterator it_end
= info.m_type_refs.end ();
size_t max_refs_as_type_count = 0;
for (; it != it_end; it++)
{
size_t type_bit_size = int_size_in_bytes (it->first) * BITS_PER_UNIT;
if (type_bit_size != reg_size) continue;
if (it->second > max_refs_as_type_count)
{
type = it->first;
max_refs_as_type_count = it->second;
}
}

return type;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be a slight optimization opportunity in the fallback case: if HSA reg size matching one was found, vote the unsign/sign property to reduce masking/casting.

}

/* Returns a DECL_VAR for the given HSAIL operand register.
If it has not been created yet for the function being generated,
creates it as an unsigned int variable. */
creates it as a type determined by analysis phase. */

tree
brig_function::get_m_var_declfor_reg (const BrigOperandRegister *reg)
{
size_t offset = reg->regNum;
switch (reg->regKind)
{
case BRIG_REGISTER_KIND_QUAD:
offset
+= BRIG_2_TREE_HSAIL_D_REG_COUNT + BRIG_2_TREE_HSAIL_S_REG_COUNT +
BRIG_2_TREE_HSAIL_C_REG_COUNT;
break;
case BRIG_REGISTER_KIND_DOUBLE:
offset += BRIG_2_TREE_HSAIL_S_REG_COUNT + BRIG_2_TREE_HSAIL_C_REG_COUNT;
break;
case BRIG_REGISTER_KIND_SINGLE:
offset += BRIG_2_TREE_HSAIL_C_REG_COUNT;
case BRIG_REGISTER_KIND_CONTROL:
break;
default:
gcc_unreachable ();
break;
}
size_t offset = gccbrig_hsa_reg_hash (*reg);

reg_decl_index_entry *regEntry = m_regs[offset];
if (regEntry == NULL)
{
size_t reg_size = gccbrig_reg_size (reg);
tree type;
if (reg_size > 1)
type = build_nonstandard_integer_type (reg_size, true);
type = get_tree_type_for_hsa_reg (reg);
else
type = boolean_type_node;

Expand Down
Loading