
More interactive commit checker #8

Status: Open, 2 commits to merge into base branch master

Conversation

CohenArthur
Owner

No description provided.

@github-actions

github-actions bot commented Feb 2, 2023

Checking 687a870: FAILED
ERR: cannot find a ChangeLog location in message

@CohenArthur CohenArthur force-pushed the more-interactive-commit-checker branch 4 times, most recently from f69afcd to abfdbef Compare February 2, 2023 09:01
@CohenArthur
Owner Author

One issue is that the message poster edits its previous message rather than posting a new one. Is that the proper behavior, or do we want one message each time?

@CohenArthur CohenArthur force-pushed the more-interactive-commit-checker branch 2 times, most recently from 687a870 to 45a093f Compare February 2, 2023 10:16
@github-actions

github-actions bot commented Feb 2, 2023

Checking 6b0a285: FAILED
ERR: cannot find a ChangeLog location in message

@CohenArthur
Owner Author

Let's try a rerun now

@CohenArthur CohenArthur force-pushed the more-interactive-commit-checker branch 2 times, most recently from 6b0a285 to f4894c5 Compare February 2, 2023 10:48
@github-actions

github-actions bot commented Feb 2, 2023

Checking f4894c5: FAILED
ERR: cannot find a ChangeLog location in message

@CohenArthur CohenArthur force-pushed the more-interactive-commit-checker branch from f4894c5 to 1ed3032 Compare February 2, 2023 11:50
@CohenArthur
Owner Author

Let's try to get @gerris-rs to post here

@github-actions

github-actions bot commented Feb 2, 2023

Checking 1ed3032: FAILED
ERR: cannot find a ChangeLog location in message

@CohenArthur CohenArthur force-pushed the more-interactive-commit-checker branch from 1ed3032 to 391a58e Compare February 2, 2023 12:04
@github-actions

github-actions bot commented Feb 2, 2023

  • Changelog skeleton for commit 391a58e:

@CohenArthur
Owner Author

Again

@CohenArthur CohenArthur force-pushed the more-interactive-commit-checker branch from 391a58e to 7075ca9 Compare February 2, 2023 12:33
@github-actions

github-actions bot commented Feb 2, 2023

  • Changelog skeleton for commit 7075ca9:

@CohenArthur
Owner Author

The code inside the backticks does not display properly, despite gerris emitting the proper output:

  • Changelog skeleton for commit 7075ca9:
ChangeLog:

	* .github/workflows/commit-format.yml:


@CohenArthur CohenArthur force-pushed the more-interactive-commit-checker branch from 7075ca9 to b1b3ebc Compare February 2, 2023 12:56
@github-actions

github-actions bot commented Feb 2, 2023

  • Changelog skeleton for commit b1b3ebc:

@CohenArthur CohenArthur force-pushed the more-interactive-commit-checker branch from b1b3ebc to 3492e95 Compare February 2, 2023 14:31
@github-actions

github-actions bot commented Feb 2, 2023

  • Changelog skeleton for commit 3492e95:

@github-actions

github-actions bot commented Feb 2, 2023

  • Changelog skeleton for commit 3492e95:

@CohenArthur
Owner Author

and... again?

@github-actions

github-actions bot commented Feb 2, 2023

  • Changelog skeleton for commit 3492e95:

@CohenArthur
Owner Author

so the output is not great. For some reason gerris isn't picking up the output from mklog.py.

Finally, the access token is also not working. Not sure why that is. Worst case scenario we don't have a nice profile picture helping us :)

@CohenArthur CohenArthur force-pushed the more-interactive-commit-checker branch 2 times, most recently from f93047e to 72c36df Compare February 6, 2023 21:59
@github-actions

github-actions bot commented Feb 6, 2023

  • Changelog skeleton for commit f93047e:

@github-actions

github-actions bot commented Feb 6, 2023

  • Changelog skeleton for commit 72c36df:

@github-actions

github-actions bot commented Feb 6, 2023

  • Changelog skeleton for commit 423ee86:
* Changelog skeleton for commit 72c36df790744a3cf51f367868fb199a68268922:

CohenArthur pushed a commit that referenced this pull request Feb 8, 2023
The aarch64 ISA specification allows a left shift amount to be applied
after extension in the range of 0 to 4 (encoded in the imm3 field).

This is true for at least the following instructions:

 * ADD (extended register)
 * ADDS (extended register)
 * SUB (extended register)

The result of this patch can be seen, when compiling the following code:

uint64_t myadd(uint64_t a, uint64_t b)
{
    return a+(((uint8_t)b)<<4);
}

Without the patch the following sequence will be generated:

0000000000000000 <myadd>:
   0:	d37c1c21 	ubfiz	x1, x1, #4, #8
   4:	8b000020 	add	x0, x1, x0
   8:	d65f03c0 	ret

With the patch the ubfiz will be merged into the add instruction:

0000000000000000 <myadd>:
   0:	8b211000 	add	x0, x0, w1, uxtb #4
   4:	d65f03c0 	ret

gcc/ChangeLog:

	* config/aarch64/aarch64.cc (aarch64_uxt_size): Fix an
	off-by-one in checking the permissible shift-amount.
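
To make the before/after equivalence concrete, here is a minimal C sketch of both instruction sequences. The helper names `add_uxtb` and `ubfiz_then_add` are hypothetical scalar models written for illustration; they are not part of GCC or of the patch, and only `myadd` comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* The function from the commit message. */
static uint64_t myadd(uint64_t a, uint64_t b)
{
    return a + (((uint8_t)b) << 4);
}

/* Hypothetical model of "add xd, xn, wm, uxtb #shift":
   zero-extend the low byte of m, shift it left, then add. */
static uint64_t add_uxtb(uint64_t n, uint64_t m, unsigned shift)
{
    assert(shift <= 4); /* the commit message gives the range as 0 to 4 */
    return n + ((uint64_t)(uint8_t)m << shift);
}

/* Hypothetical model of the unpatched sequence:
   "ubfiz x1, x1, #4, #8" followed by "add x0, x1, x0". */
static uint64_t ubfiz_then_add(uint64_t a, uint64_t b)
{
    uint64_t t = (b & 0xff) << 4; /* keep 8 bits, place them at bit 4 */
    return t + a;
}

/* Check that all three formulations agree on a few sample inputs. */
static void check_myadd(void)
{
    const uint64_t samples[] = { 0, 1, 0x7f, 0xff, 0x100, 0x1ff, UINT64_MAX };
    const unsigned count = sizeof samples / sizeof samples[0];

    for (unsigned i = 0; i < count; i++) {
        for (unsigned j = 0; j < count; j++) {
            uint64_t a = samples[i], b = samples[j];
            assert(myadd(a, b) == add_uxtb(a, b, 4));
            assert(myadd(a, b) == ubfiz_then_add(a, b));
        }
    }
}
```

Calling `check_myadd()` exercises the model; a shift of 4 is the largest value the model accepts, which is the boundary the off-by-one fix concerns.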
CohenArthur pushed a commit that referenced this pull request Apr 6, 2023
…hook [PR108583]

This replaces the custom division hook with just an implementation through
add_highpart.  For NEON we implement the add highpart (Addition + extraction of
the upper highpart of the register in the same precision) as ADD + LSR.

This representation allows us to easily optimize the sequence using existing
sequences. This gets us a pretty decent sequence using SRA:

        umull   v1.8h, v0.8b, v3.8b
        umull2  v0.8h, v0.16b, v3.16b
        add     v5.8h, v1.8h, v2.8h
        add     v4.8h, v0.8h, v2.8h
        usra    v1.8h, v5.8h, 8
        usra    v0.8h, v4.8h, 8
        uzp2    v1.16b, v1.16b, v0.16b

To get the optimal sequence, however, we match (a + ((b + c) >> n)), where n
is half the precision of the mode of the operation, into addhn + uaddw, which is
a generally good optimization on its own and gets us back to:

.L4:
        ldr     q0, [x3]
        umull   v1.8h, v0.8b, v5.8b
        umull2  v0.8h, v0.16b, v5.16b
        addhn   v3.8b, v1.8h, v4.8h
        addhn   v2.8b, v0.8h, v4.8h
        uaddw   v1.8h, v1.8h, v3.8b
        uaddw   v0.8h, v0.8h, v2.8b
        uzp2    v1.16b, v1.16b, v0.16b
        str     q1, [x3], 16
        cmp     x3, x4
        bne     .L4
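
The addhn + uaddw pairing can be modelled one lane at a time. The sketch below is a hypothetical scalar model for 16-bit lanes (so n = 8), not code from the patch. The `div255` helper and its addend 257 are assumptions about the divide-by-0xff use case; the commit message does not state the constant.

```c
#include <assert.h>
#include <stdint.h>

/* One-lane model of ADDHN on 16-bit lanes: add, keep the high 8 bits. */
static uint8_t addhn16(uint16_t b, uint16_t c)
{
    return (uint8_t)((uint16_t)(b + c) >> 8);
}

/* One-lane model of UADDW: widen the 8-bit operand and add. */
static uint16_t uaddw16(uint16_t a, uint8_t t)
{
    return (uint16_t)(a + t);
}

/* a + ((b + c) >> n) with n = 8, half of the 16-bit lane width. */
static uint16_t shift_plus(uint16_t a, uint16_t b, uint16_t c)
{
    return uaddw16(a, addhn16(b, c));
}

/* Assumed use case: divide a product of two bytes by 255.
   The addend 257 is an assumption, not taken from the commit message.
   The final >> 8 models uzp2 picking the high byte of each lane. */
static uint8_t div255(uint16_t x)
{
    return (uint8_t)(shift_plus(x, x, 257) >> 8);
}

/* Exhaustively compare against plain division for every byte product. */
static void check_div255(void)
{
    for (unsigned p = 0; p < 256; p++) {
        for (unsigned q = 0; q < 256; q++) {
            uint16_t x = (uint16_t)(p * q);
            assert(div255(x) == x / 255);
        }
    }
}
```

The exhaustive check only covers products of two 8-bit values (at most 65025); for larger 16-bit inputs the intermediate sum can wrap, so the model makes no claim there.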

For SVE2 we optimize the initial sequence to the same ADD + LSR which gets us:

.L3:
        ld1b    z0.h, p0/z, [x0, x3]
        mul     z0.h, p1/m, z0.h, z2.h
        add     z1.h, z0.h, z3.h
        usra    z0.h, z1.h, #8
        lsr     z0.h, z0.h, #8
        st1b    z0.h, p0, [x0, x3]
        inch    x3
        whilelo p0.h, w3, w2
        b.any   .L3
.L1:
        ret

To get the optimal sequence, I match (a + b) >> n (same constraint on n)
to addhnb, which gets us to:

.L3:
        ld1b    z0.h, p0/z, [x0, x3]
        mul     z0.h, p1/m, z0.h, z2.h
        addhnb  z1.b, z0.h, z3.h
        addhnb  z0.b, z0.h, z1.h
        st1b    z0.h, p0, [x0, x3]
        inch    x3
        whilelo p0.h, w3, w2
        b.any   .L3

There are multiple RTL representations possible for these optimizations.  I did
not represent them using a zero_extend because we seem very inconsistent about
this in the backend.  Since they are unspecs we won't match them from vector ops
anyway. I figured maintainers would prefer this, but my maintainer ouija board
is still out for repairs :)

There are no new tests, as new correctness tests were added to the mid-end and
codegen tests for this already exist.

gcc/ChangeLog:

	PR target/108583
	* config/aarch64/aarch64-simd.md (@aarch64_bitmask_udiv<mode>3): Remove.
	(*bitmask_shift_plus<mode>): New.
	* config/aarch64/aarch64-sve2.md (*bitmask_shift_plus<mode>): New.
	(@aarch64_bitmask_udiv<mode>3): Remove.
	* config/aarch64/aarch64.cc
	(aarch64_vectorize_can_special_div_by_constant,
	TARGET_VECTORIZE_CAN_SPECIAL_DIV_BY_CONST): Removed.
	(TARGET_VECTORIZE_PREFERRED_DIV_AS_SHIFTS_OVER_MULT,
	aarch64_vectorize_preferred_div_as_shifts_over_mult): New.
CohenArthur pushed a commit that referenced this pull request Oct 24, 2024
…o_debug_section [PR116614]

cat abc.C
  #define A(n) struct T##n {} t##n;
  #define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9)
  #define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9)
  #define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9)
  #define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) D(n##8) D(n##9)
  E(1) E(2) E(3)
  int main () { return 0; }
./xg++ -B ./ -o abc{.o,.C} -flto -flto-partition=1to1 -O2 -g -fdebug-types-section -c
./xgcc -B ./ -o abc{,.o} -flto -flto-partition=1to1 -O2
(not included in testsuite as it takes a while to compile) FAILs with
lto-wrapper: fatal error: Too many copied sections: Operation not supported
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status

The following patch fixes that.  Most of the 64K+ section support for
reading and writing was already there years ago (reading in particular has
been used quite often already), and a further bug in it was fixed by the
PR104617 fix.

Yet, the fix isn't solely about removing the
  if (new_i - 1 >= SHN_LORESERVE)
    {
      *err = ENOTSUP;
      return "Too many copied sections";
    }
5 lines; the missing part was that the function only handled reading of
the .symtab_shndx section but not copying/updating of it.
If the result has fewer than 64K-epsilon sections, that actually wasn't
needed, but e.g. with -fdebug-types-section one can exceed that pretty
easily (reported to us on a WebKitGtk build on ppc64le).
Updating the section is slightly more complicated, because it basically
needs to be done in lock step with updating the .symtab section: if an entry
doesn't need to use SHN_XINDEX in there, the section should contain (or be
updated to contain) an SHN_UNDEF entry; otherwise it needs to hold whatever
would otherwise be stored but couldn't fit.  But repeating all the symtab
decisions about what to discard and how to rewrite it just for that would be
ugly.

So the patch instead emits the .symtab_shndx section (or sections) last:
it prepares their content during the .symtab processing and, in a second
pass over just the .symtab_shndx sections, uses the saved content.
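
The lock-step relationship between a .symtab entry and its .symtab_shndx entry can be sketched as follows. This is a minimal illustration of the ELF extended section index convention, not the code from simple-object-elf.c; the struct and function names are hypothetical, and the constants are defined locally with the values the ELF gABI assigns them.

```c
#include <assert.h>
#include <stdint.h>

/* Values from the ELF gABI, defined locally so no <elf.h> is needed. */
#define SHN_UNDEF     0x0000u
#define SHN_LORESERVE 0xFF00u
#define SHN_XINDEX    0xFFFFu

/* Hypothetical pair of values written in lock step for one symbol. */
struct sym_index {
    uint16_t st_shndx;  /* field stored in the .symtab entry */
    uint32_t shndx_ext; /* parallel entry stored in .symtab_shndx */
};

/* Encode a real section index (special values such as SHN_ABS are
   not modelled here). */
static struct sym_index encode_section_index(uint32_t section)
{
    struct sym_index out;

    if (section < SHN_LORESERVE) {
        /* Fits in 16 bits: the extended entry stays SHN_UNDEF. */
        out.st_shndx = (uint16_t)section;
        out.shndx_ext = SHN_UNDEF;
    } else {
        /* Does not fit: use the escape value and store the real
           index in the parallel section. */
        out.st_shndx = SHN_XINDEX;
        out.shndx_ext = section;
    }
    return out;
}

/* Recover the section index from the two stored values. */
static uint32_t decode_section_index(struct sym_index in)
{
    return in.st_shndx == SHN_XINDEX ? in.shndx_ext : in.st_shndx;
}
```

For example, `encode_section_index(70000)` yields `st_shndx == SHN_XINDEX` with the real index in `shndx_ext`, while `encode_section_index(5)` leaves the extended entry at `SHN_UNDEF`. Keeping both values in one struct mirrors the idea of preparing the .symtab_shndx content while the .symtab is processed.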

2024-09-07  Jakub Jelinek  <[email protected]>

	PR lto/116614
	* simple-object-elf.c (SHN_COMMON): Align comment with neighbouring
	comments.
	(SHN_HIRESERVE): Use uppercase hex digits instead of lowercase for
	consistency.
	(simple_object_elf_find_sections): Formatting fixes.
	(simple_object_elf_fetch_attributes): Likewise.
	(simple_object_elf_attributes_merge): Likewise.
	(simple_object_elf_start_write): Likewise.
	(simple_object_elf_write_ehdr): Likewise.
	(simple_object_elf_write_shdr): Likewise.
	(simple_object_elf_write_to_file): Likewise.
	(simple_object_elf_copy_lto_debug_section): Likewise.  Don't fail for
	new_i - 1 >= SHN_LORESERVE, instead arrange in that case to copy
	over .symtab_shndx sections, though emit those last and compute their
	section content when processing associated .symtab sections.  Handle
	simple_object_internal_read failure even in the .symtab_shndx reading
	case.
CohenArthur pushed a commit that referenced this pull request Oct 24, 2024
Implement vddup and vidup using the new MVE builtins framework.

We generate better code because we take advantage of the two outputs
produced by the v[id]dup instructions.

For instance, before:
	ldr	r3, [r0]
	sub	r2, r3, #8
	str	r2, [r0]
	mov	r2, r3
	vddup.u16	q3, r2, #1

now:
	ldr	r2, [r0]
	vddup.u16	q3, r2, #1
	str	r2, [r0]
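
One way to read "two outputs" is that the instruction produces both the vector of decreasing values and the updated start value in the scalar register. The sketch below is a hypothetical scalar model of that behaviour for 16-bit lanes; `model_vddup_u16_wb` is an illustrative name, not an MVE intrinsic, and the lane count of 8 assumes a 128-bit vector.

```c
#include <assert.h>
#include <stdint.h>

#define LANES_U16 8 /* assumed: 128-bit vector of 16-bit lanes */

/* Hypothetical scalar model of a write-back vddup on 16-bit lanes:
   lane i receives (start - i * step), and *start is decremented by
   LANES_U16 * step so that the next call continues the sequence. */
static void model_vddup_u16_wb(uint32_t *start, uint32_t step,
                               uint16_t out[LANES_U16])
{
    uint32_t cur = *start;

    for (int i = 0; i < LANES_U16; i++) {
        out[i] = (uint16_t)cur; /* first output: the vector lanes */
        cur -= step;
    }
    *start = cur;               /* second output: the updated scalar */
}
```

With a step of 1 the scalar ends up 8 lower than it started, which matches the separate `sub r2, r3, #8` in the "before" sequence; in the "now" sequence that subtraction is no longer emitted because the instruction's own updated register is stored back.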

2024-08-21  Christophe Lyon  <[email protected]>

	gcc/
	* config/arm/arm-mve-builtins-base.cc (class viddup_impl): New.
	(vddup): New.
	(vidup): New.
	* config/arm/arm-mve-builtins-base.def (vddupq): New.
	(vidupq): New.
	* config/arm/arm-mve-builtins-base.h (vddupq): New.
	(vidupq): New.
	* config/arm/arm_mve.h (vddupq_m): Delete.
	(vddupq_u8): Delete.
	(vddupq_u32): Delete.
	(vddupq_u16): Delete.
	(vidupq_m): Delete.
	(vidupq_u8): Delete.
	(vidupq_u32): Delete.
	(vidupq_u16): Delete.
	(vddupq_x_u8): Delete.
	(vddupq_x_u16): Delete.
	(vddupq_x_u32): Delete.
	(vidupq_x_u8): Delete.
	(vidupq_x_u16): Delete.
	(vidupq_x_u32): Delete.
	(vddupq_m_n_u8): Delete.
	(vddupq_m_n_u32): Delete.
	(vddupq_m_n_u16): Delete.
	(vddupq_m_wb_u8): Delete.
	(vddupq_m_wb_u16): Delete.
	(vddupq_m_wb_u32): Delete.
	(vddupq_n_u8): Delete.
	(vddupq_n_u32): Delete.
	(vddupq_n_u16): Delete.
	(vddupq_wb_u8): Delete.
	(vddupq_wb_u16): Delete.
	(vddupq_wb_u32): Delete.
	(vidupq_m_n_u8): Delete.
	(vidupq_m_n_u32): Delete.
	(vidupq_m_n_u16): Delete.
	(vidupq_m_wb_u8): Delete.
	(vidupq_m_wb_u16): Delete.
	(vidupq_m_wb_u32): Delete.
	(vidupq_n_u8): Delete.
	(vidupq_n_u32): Delete.
	(vidupq_n_u16): Delete.
	(vidupq_wb_u8): Delete.
	(vidupq_wb_u16): Delete.
	(vidupq_wb_u32): Delete.
	(vddupq_x_n_u8): Delete.
	(vddupq_x_n_u16): Delete.
	(vddupq_x_n_u32): Delete.
	(vddupq_x_wb_u8): Delete.
	(vddupq_x_wb_u16): Delete.
	(vddupq_x_wb_u32): Delete.
	(vidupq_x_n_u8): Delete.
	(vidupq_x_n_u16): Delete.
	(vidupq_x_n_u32): Delete.
	(vidupq_x_wb_u8): Delete.
	(vidupq_x_wb_u16): Delete.
	(vidupq_x_wb_u32): Delete.
	(__arm_vddupq_m_n_u8): Delete.
	(__arm_vddupq_m_n_u32): Delete.
	(__arm_vddupq_m_n_u16): Delete.
	(__arm_vddupq_m_wb_u8): Delete.
	(__arm_vddupq_m_wb_u16): Delete.
	(__arm_vddupq_m_wb_u32): Delete.
	(__arm_vddupq_n_u8): Delete.
	(__arm_vddupq_n_u32): Delete.
	(__arm_vddupq_n_u16): Delete.
	(__arm_vidupq_m_n_u8): Delete.
	(__arm_vidupq_m_n_u32): Delete.
	(__arm_vidupq_m_n_u16): Delete.
	(__arm_vidupq_n_u8): Delete.
	(__arm_vidupq_m_wb_u8): Delete.
	(__arm_vidupq_m_wb_u16): Delete.
	(__arm_vidupq_m_wb_u32): Delete.
	(__arm_vidupq_n_u32): Delete.
	(__arm_vidupq_n_u16): Delete.
	(__arm_vidupq_wb_u8): Delete.
	(__arm_vidupq_wb_u16): Delete.
	(__arm_vidupq_wb_u32): Delete.
	(__arm_vddupq_wb_u8): Delete.
	(__arm_vddupq_wb_u16): Delete.
	(__arm_vddupq_wb_u32): Delete.
	(__arm_vddupq_x_n_u8): Delete.
	(__arm_vddupq_x_n_u16): Delete.
	(__arm_vddupq_x_n_u32): Delete.
	(__arm_vddupq_x_wb_u8): Delete.
	(__arm_vddupq_x_wb_u16): Delete.
	(__arm_vddupq_x_wb_u32): Delete.
	(__arm_vidupq_x_n_u8): Delete.
	(__arm_vidupq_x_n_u16): Delete.
	(__arm_vidupq_x_n_u32): Delete.
	(__arm_vidupq_x_wb_u8): Delete.
	(__arm_vidupq_x_wb_u16): Delete.
	(__arm_vidupq_x_wb_u32): Delete.
	(__arm_vddupq_m): Delete.
	(__arm_vddupq_u8): Delete.
	(__arm_vddupq_u32): Delete.
	(__arm_vddupq_u16): Delete.
	(__arm_vidupq_m): Delete.
	(__arm_vidupq_u8): Delete.
	(__arm_vidupq_u32): Delete.
	(__arm_vidupq_u16): Delete.
	(__arm_vddupq_x_u8): Delete.
	(__arm_vddupq_x_u16): Delete.
	(__arm_vddupq_x_u32): Delete.
	(__arm_vidupq_x_u8): Delete.
	(__arm_vidupq_x_u16): Delete.
	(__arm_vidupq_x_u32): Delete.