Commit a26c5fc

aarch64: Add support for unpacked SVE FP conversions
This patch introduces expanders for FP<-FP conversions that leverage
partial vector modes.  We also extend the INT<-FP and FP<-INT
conversions using the same approach.

The ACLE enables vectorized conversions like the following:

	fcvt	z0.h, p7/m, z1.s

modelling the source vector as VNx4SF:

	... | SF| SF| SF| SF|

and the destination as a VNx8HF, where this operation would yield:

	... | 0 | HF| 0 | HF| 0 | HF| 0 | HF|

hence the useful results are stored unpacked, i.e.

	... | X | HF| X | HF| X | HF| X | HF| (VNx4HF)

This patch allows the vectorizer to use this variant of fcvt as a
conversion from VNx4SF to VNx4HF.  The same idea applies to widening
conversions, and between vectors with FP and integer base types.

If the source itself had been unpacked, e.g.

	... | X | SF| X | SF| (VNx2SF)

the result would yield

	... | X | X | X | HF| X | X | X | HF| (VNx2HF)

The upper bits of each container here are undefined, and it's important
to avoid interpreting them during FP operations - doing so could
introduce spurious traps.  The obvious route we've taken here is to
mask undefined lanes using the operation's predicate if we have
flag_trapping_math.

The VPRED predicate mode (e.g. VNx2BI here) cannot do this; to ensure
correct behavior, we need a predicate mode that can control the data as
if it were fully-packed (VNx4BI).

Both VNx2BI and VNx4BI must be recognised as legal governing predicate
modes by the corresponding FP insns.  In general, the governing
predicate mode for an insn could be any such with at least as many
significant lanes as the data mode.  For example, addvnx4hf3 could be
controlled by any of VNx{4,8,16}BI.  We implement
'aarch64_predicate_operand', a new define_special_predicate, to
achieve this.

gcc/ChangeLog:

	* config/aarch64/aarch64-protos.h (aarch64_sve_valid_pred_p):
	Declare helper for aarch64_predicate_operand.
	(aarch64_sve_packed_pred): Declare helper for new expanders.
	(aarch64_sve_fp_pred): Likewise.
	* config/aarch64/aarch64-sve.md (<optab><mode><v_int_equiv>2):
	Extend into...
	(<optab><SVE_HSF:mode><SVE_HSDI:mode>2): New expander for
	converting vectors of HF,SF to vectors of HI,SI,DI.
	(<optab><VNx2DF_ONLY:mode><SVE_2SDI:mode>2): New expander for
	converting vectors of SI,DI to vectors of DF.
	(*aarch64_sve_<optab>_nontrunc<SVE_PARTIAL_F:mode><SVE_HSDI:mode>):
	New pattern to match those we've added here.
	(@aarch64_sve_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>):
	Extend into...
	(@aarch64_sve_<optab>_trunc<VNx2DF_ONLY:mode><SVE_SI:mode>):
	Match both VNx2SI<-VNx2DF and VNx4SI<-VNx4DF.
	(<optab><v_int_equiv><mode>2): Extend into...
	(<optab><SVE_HSDI:mode><SVE_F:mode>2): New expander for
	converting vectors of HI,SI,DI to vectors of HF,SF,DF.
	(*aarch64_sve_<optab>_nonextend<SVE_HSDI:mode><SVE_PARTIAL_F:mode>):
	New pattern to match those we've added here.
	(trunc<SVE_SDF:mode><SVE_PARTIAL_HSF:mode>2): New expander to
	handle narrowing ('truncating') FP<-FP conversions.
	(*aarch64_sve_<optab>_trunc<SVE_SDF:mode><SVE_PARTIAL_HSF:mode>):
	New pattern to handle those we've added here.
	(extend<SVE_PARTIAL_HSF:mode><SVE_SDF:mode>2): New expander to
	handle widening ('extending') FP<-FP conversions.
	(*aarch64_sve_<optab>_nontrunc<SVE_PARTIAL_HSF:mode><SVE_SDF:mode>):
	New pattern to handle those we've added here.
	* config/aarch64/aarch64.cc (aarch64_sve_packed_pred): New
	function.
	(aarch64_sve_fp_pred): Likewise.
	(aarch64_sve_valid_pred_p): Likewise.
	* config/aarch64/iterators.md (SVE_PARTIAL_HSF): New mode
	iterator.
	(SVE_HSF): Likewise.
	(SVE_SDF): Likewise.
	(SVE_SI): Likewise.
	(SVE_2SDI): Likewise.
	(self_mask): Extend to all integer/FP vector modes.
	(narrower_mask): Likewise (excluding QI).
	* config/aarch64/predicates.md (aarch64_predicate_operand): New
	special predicate to handle narrower predicate modes.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/sve/pack_fcvt_signed_1.c: Disable the
	aarch64 vector cost model to preserve this test.
	* gcc.target/aarch64/sve/pack_fcvt_unsigned_1.c: Likewise.
	* gcc.target/aarch64/sve/pack_float_1.c: Likewise.
	* gcc.target/aarch64/sve/unpack_float_1.c: Likewise.
	* gcc.target/aarch64/sve/unpacked_cvtf_1.c: New test.
	* gcc.target/aarch64/sve/unpacked_cvtf_2.c: Likewise.
	* gcc.target/aarch64/sve/unpacked_cvtf_3.c: Likewise.
	* gcc.target/aarch64/sve/unpacked_fcvt_1.c: Likewise.
	* gcc.target/aarch64/sve/unpacked_fcvt_2.c: Likewise.
	* gcc.target/aarch64/sve/unpacked_fcvtz_1.c: Likewise.
	* gcc.target/aarch64/sve/unpacked_fcvtz_2.c: Likewise.
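To make the lane-layout argument above concrete, here is a small Python model (purely illustrative, not part of the patch) of how a "packed" predicate enables only the defined element of each container in a partial SVE vector, mirroring the role of aarch64_sve_packed_pred. The function name and the fixed 128-bit granule are assumptions for the sketch.

```python
# Model of SVE predication over a partial (unpacked) vector.  A VNx2SF
# vector stores one significant 32-bit float per 64-bit container:
#   ... | X | SF | X | SF |
# A VNx2BI-style predicate lane covers a whole 64-bit container, so a
# strict FP operation governed by it would also interpret the undefined
# upper 32 bits.  A VNx4BI-style predicate controls the data as if it
# were packed 32-bit lanes, so only the defined lanes can be enabled.

VL_BYTES = 16  # one 128-bit granule, enough to show the pattern

def packed_pred(container_bytes, elem_bytes, vl_bytes=VL_BYTES):
    """Enable only the first element of each container, i.e. a ptrue at
    the container stride viewed at the element size."""
    nelems = vl_bytes // elem_bytes
    step = container_bytes // elem_bytes
    return [1 if i % step == 0 else 0 for i in range(nelems)]

# VNx2SF: 64-bit containers, 32-bit elements -> every other lane enabled.
print(packed_pred(8, 4))   # [1, 0, 1, 0]
# Fully-packed case: container == element -> all lanes enabled.
print(packed_pred(4, 4))   # [1, 1, 1, 1]
```

With flag_trapping_math, the patch pairs such a predicate with SVE_STRICT_GP so the undefined container halves are never interpreted by the FP instruction.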
1 parent 35b8acb commit a26c5fc

16 files changed (+902, -39 lines)

gcc/config/aarch64/aarch64-protos.h (3 additions, 0 deletions)

@@ -947,6 +947,7 @@ bool aarch64_parallel_select_half_p (machine_mode, rtx);
 bool aarch64_pars_overlap_p (rtx, rtx);
 bool aarch64_simd_scalar_immediate_valid_for_move (rtx, scalar_int_mode);
 bool aarch64_simd_shift_imm_p (rtx, machine_mode, bool);
+bool aarch64_sve_valid_pred_p (rtx, machine_mode);
 bool aarch64_sve_ptrue_svpattern_p (rtx, struct simd_immediate_info *);
 bool aarch64_simd_valid_and_imm (rtx);
 bool aarch64_simd_valid_and_imm_fmov (rtx, unsigned int * = NULL);
@@ -1028,6 +1029,8 @@ rtx aarch64_ptrue_reg (machine_mode, unsigned int);
 rtx aarch64_ptrue_reg (machine_mode, machine_mode);
 rtx aarch64_pfalse_reg (machine_mode);
 bool aarch64_sve_same_pred_for_ptest_p (rtx *, rtx *);
+rtx aarch64_sve_packed_pred (machine_mode);
+rtx aarch64_sve_fp_pred (machine_mode, rtx *);
 void aarch64_emit_load_store_through_mode (rtx, rtx, machine_mode);
 bool aarch64_expand_maskloadstore (rtx *, machine_mode);
 void aarch64_emit_sve_pred_move (rtx, rtx, rtx);

gcc/config/aarch64/aarch64-sve.md (151 additions, 22 deletions)
@@ -154,8 +154,10 @@
 ;; ---- [FP<-INT] Packs
 ;; ---- [FP<-INT] Unpacks
 ;; ---- [FP<-FP] Packs
+;; ---- [FP<-FP] Truncating conversions
 ;; ---- [FP<-FP] Packs (bfloat16)
 ;; ---- [FP<-FP] Unpacks
+;; ---- [FP<-FP] Extending conversions
 ;; ---- [PRED<-PRED] Packs
 ;; ---- [PRED<-PRED] Unpacks
 ;;
@@ -9524,18 +9526,37 @@
 ;; - FCVTZU
 ;; -------------------------------------------------------------------------

-;; Unpredicated conversion of floats to integers of the same size (HF to HI,
-;; SF to SI or DF to DI).
-(define_expand "<optab><mode><v_int_equiv>2"
-  [(set (match_operand:<V_INT_EQUIV> 0 "register_operand")
-	(unspec:<V_INT_EQUIV>
+;; Unpredicated conversion of floats to integers of the same size or wider,
+;; excluding conversions from DF (see below).
+(define_expand "<optab><SVE_HSF:mode><SVE_HSDI:mode>2"
+  [(set (match_operand:SVE_HSDI 0 "register_operand")
+	(unspec:SVE_HSDI
+	  [(match_dup 2)
+	   (match_dup 3)
+	   (match_operand:SVE_HSF 1 "register_operand")]
+	  SVE_COND_FCVTI))]
+  "TARGET_SVE
+   && (~(<SVE_HSDI:self_mask> | <SVE_HSDI:narrower_mask>) & <SVE_HSF:self_mask>) == 0"
+  {
+    operands[2] = aarch64_sve_fp_pred (<SVE_HSDI:MODE>mode, &operands[3]);
+  }
+)
+
+;; SI <- DF can't use SI <- trunc (DI <- DF) without -ffast-math, so this
+;; truncating variant of FCVTZ{S,U} is useful for auto-vectorization.
+;;
+;; DF is the only source mode for which the mask used above doesn't apply,
+;; we define a separate pattern for it here.
+(define_expand "<optab><VNx2DF_ONLY:mode><SVE_2SDI:mode>2"
+  [(set (match_operand:SVE_2SDI 0 "register_operand")
+	(unspec:SVE_2SDI
 	  [(match_dup 2)
 	   (const_int SVE_RELAXED_GP)
-	   (match_operand:SVE_FULL_F 1 "register_operand")]
+	   (match_operand:VNx2DF_ONLY 1 "register_operand")]
 	  SVE_COND_FCVTI))]
   "TARGET_SVE"
   {
-    operands[2] = aarch64_ptrue_reg (<VPRED>mode);
+    operands[2] = aarch64_ptrue_reg (VNx2BImode);
   }
 )

@@ -9554,18 +9575,37 @@
 }
 )

-;; Predicated narrowing float-to-integer conversion.
-(define_insn "@aarch64_sve_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>"
-  [(set (match_operand:VNx4SI_ONLY 0 "register_operand")
-	(unspec:VNx4SI_ONLY
+;; As above, for pairs used by the auto-vectorizer only.
+(define_insn "*aarch64_sve_<optab>_nontrunc<SVE_PARTIAL_F:mode><SVE_HSDI:mode>"
+  [(set (match_operand:SVE_HSDI 0 "register_operand")
+	(unspec:SVE_HSDI
+	  [(match_operand:<SVE_HSDI:VPRED> 1 "aarch64_predicate_operand")
+	   (match_operand:SI 3 "aarch64_sve_gp_strictness")
+	   (match_operand:SVE_PARTIAL_F 2 "register_operand")]
+	  SVE_COND_FCVTI))]
+  "TARGET_SVE
+   && (~(<SVE_HSDI:self_mask> | <SVE_HSDI:narrower_mask>) & <SVE_PARTIAL_F:self_mask>) == 0"
+  {@ [ cons: =0 , 1   , 2 ; attrs: movprfx ]
+     [ w        , Upl , 0 ; *              ] fcvtz<su>\t%0.<SVE_HSDI:Vetype>, %1/m, %2.<SVE_PARTIAL_F:Vetype>
+     [ ?&w      , Upl , w ; yes            ] movprfx\t%0, %2\;fcvtz<su>\t%0.<SVE_HSDI:Vetype>, %1/m, %2.<SVE_PARTIAL_F:Vetype>
+  }
+)
+
+;; Predicated narrowing float-to-integer conversion.  The VNx2DF->VNx4SI
+;; variant is provided for the ACLE, where the zeroed odd-indexed lanes are
+;; significant.  The VNx2DF->VNx2SI variant is provided for auto-vectorization,
+;; where the upper 32 bits of each container are ignored.
+(define_insn "@aarch64_sve_<optab>_trunc<VNx2DF_ONLY:mode><SVE_SI:mode>"
+  [(set (match_operand:SVE_SI 0 "register_operand")
+	(unspec:SVE_SI
 	  [(match_operand:VNx2BI 1 "register_operand")
 	   (match_operand:SI 3 "aarch64_sve_gp_strictness")
 	   (match_operand:VNx2DF_ONLY 2 "register_operand")]
 	  SVE_COND_FCVTI))]
   "TARGET_SVE"
   {@ [ cons: =0 , 1   , 2 ; attrs: movprfx ]
-     [ w        , Upl , 0 ; *              ] fcvtz<su>\t%0.<VNx4SI_ONLY:Vetype>, %1/m, %2.<VNx2DF_ONLY:Vetype>
-     [ ?&w      , Upl , w ; yes            ] movprfx\t%0, %2\;fcvtz<su>\t%0.<VNx4SI_ONLY:Vetype>, %1/m, %2.<VNx2DF_ONLY:Vetype>
+     [ w        , Upl , 0 ; *              ] fcvtz<su>\t%0.<SVE_SI:Vetype>, %1/m, %2.<VNx2DF_ONLY:Vetype>
+     [ ?&w      , Upl , w ; yes            ] movprfx\t%0, %2\;fcvtz<su>\t%0.<SVE_SI:Vetype>, %1/m, %2.<VNx2DF_ONLY:Vetype>
  }
 )
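The comment in the hunk above distinguishes two result layouts for the same FCVTZS/FCVTZU narrowing instruction. A tiny Python model (illustrative only; the variable names are invented for the sketch) shows the difference between the ACLE view and the unpacked auto-vectorizer view of a VNx2DF source:

```python
# Two views of the FCVTZS result for a VNx2DF source.  In the ACLE
# VNx4SI view the inactive odd-indexed 32-bit lanes are zeroed and
# significant; in the auto-vectorized VNx2SI view the upper half of
# each 64-bit container is simply a don't-care ('X').

src = [1.9, -2.9]                # one DF per 64-bit container

acle_vnx4si = []
for d in src:
    acle_vnx4si += [int(d), 0]   # converted lane, then a zeroed lane
                                 # (int() truncates toward zero, as FCVTZS does)

unpacked_vnx2si = [(int(d), "X") for d in src]  # payload, ignored half

print(acle_vnx4si)       # [1, 0, -2, 0]
print(unpacked_vnx2si)   # [(1, 'X'), (-2, 'X')]
```

This is why the pattern can accept both VNx4SI and VNx2SI destinations: the instruction's behavior is identical, and only the consumer's interpretation of the odd lanes differs.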

@@ -9710,18 +9750,19 @@
 ;; - UCVTF
 ;; -------------------------------------------------------------------------

-;; Unpredicated conversion of integers to floats of the same size
-;; (HI to HF, SI to SF or DI to DF).
-(define_expand "<optab><v_int_equiv><mode>2"
-  [(set (match_operand:SVE_FULL_F 0 "register_operand")
-	(unspec:SVE_FULL_F
+;; Unpredicated conversion of integers to floats of the same size or
+;; narrower.
+(define_expand "<optab><SVE_HSDI:mode><SVE_F:mode>2"
+  [(set (match_operand:SVE_F 0 "register_operand")
+	(unspec:SVE_F
 	  [(match_dup 2)
-	   (const_int SVE_RELAXED_GP)
-	   (match_operand:<V_INT_EQUIV> 1 "register_operand")]
+	   (match_dup 3)
+	   (match_operand:SVE_HSDI 1 "register_operand")]
 	  SVE_COND_ICVTF))]
-  "TARGET_SVE"
+  "TARGET_SVE
+   && (~(<SVE_HSDI:self_mask> | <SVE_HSDI:narrower_mask>) & <SVE_F:self_mask>) == 0"
   {
-    operands[2] = aarch64_ptrue_reg (<VPRED>mode);
+    operands[2] = aarch64_sve_fp_pred (<SVE_HSDI:MODE>mode, &operands[3]);
   }
 )

@@ -9741,6 +9782,22 @@
 }
 )

+;; As above, for pairs that are used by the auto-vectorizer only.
+(define_insn "*aarch64_sve_<optab>_nonextend<SVE_HSDI:mode><SVE_PARTIAL_F:mode>"
+  [(set (match_operand:SVE_PARTIAL_F 0 "register_operand")
+	(unspec:SVE_PARTIAL_F
+	  [(match_operand:<SVE_HSDI:VPRED> 1 "aarch64_predicate_operand")
+	   (match_operand:SI 3 "aarch64_sve_gp_strictness")
+	   (match_operand:SVE_HSDI 2 "register_operand")]
+	  SVE_COND_ICVTF))]
+  "TARGET_SVE
+   && (~(<SVE_HSDI:self_mask> | <SVE_HSDI:narrower_mask>) & <SVE_PARTIAL_F:self_mask>) == 0"
+  {@ [ cons: =0 , 1   , 2 ; attrs: movprfx ]
+     [ w        , Upl , 0 ; *              ] <su>cvtf\t%0.<SVE_PARTIAL_F:Vetype>, %1/m, %2.<SVE_HSDI:Vetype>
+     [ ?&w      , Upl , w ; yes            ] movprfx\t%0, %2\;<su>cvtf\t%0.<SVE_PARTIAL_F:Vetype>, %1/m, %2.<SVE_HSDI:Vetype>
+  }
+)
+
 ;; Predicated widening integer-to-float conversion.
 (define_insn "@aarch64_sve_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>"
   [(set (match_operand:VNx2DF_ONLY 0 "register_operand")
@@ -9924,6 +9981,27 @@
 }
 )

+;; -------------------------------------------------------------------------
+;; ---- [FP<-FP] Truncating conversions
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - FCVT
+;; -------------------------------------------------------------------------
+
+;; Unpredicated float-to-float truncation.
+(define_expand "trunc<SVE_SDF:mode><SVE_PARTIAL_HSF:mode>2"
+  [(set (match_operand:SVE_PARTIAL_HSF 0 "register_operand")
+	(unspec:SVE_PARTIAL_HSF
+	  [(match_dup 2)
+	   (match_dup 3)
+	   (match_operand:SVE_SDF 1 "register_operand")]
+	  SVE_COND_FCVT))]
+  "TARGET_SVE && (~<SVE_SDF:narrower_mask> & <SVE_PARTIAL_HSF:self_mask>) == 0"
+  {
+    operands[2] = aarch64_sve_fp_pred (<SVE_SDF:MODE>mode, &operands[3]);
+  }
+)
+
 ;; Predicated float-to-float truncation.
 (define_insn "@aarch64_sve_<optab>_trunc<SVE_FULL_SDF:mode><SVE_FULL_HSF:mode>"
   [(set (match_operand:SVE_FULL_HSF 0 "register_operand")
@@ -9939,6 +10017,21 @@
 }
 )

+;; As above, for pairs that are used by the auto-vectorizer only.
+(define_insn "*aarch64_sve_<optab>_trunc<SVE_SDF:mode><SVE_PARTIAL_HSF:mode>"
+  [(set (match_operand:SVE_PARTIAL_HSF 0 "register_operand")
+	(unspec:SVE_PARTIAL_HSF
+	  [(match_operand:<SVE_SDF:VPRED> 1 "aarch64_predicate_operand")
+	   (match_operand:SI 3 "aarch64_sve_gp_strictness")
+	   (match_operand:SVE_SDF 2 "register_operand")]
+	  SVE_COND_FCVT))]
+  "TARGET_SVE && (~<SVE_SDF:narrower_mask> & <SVE_PARTIAL_HSF:self_mask>) == 0"
+  {@ [ cons: =0 , 1   , 2 ; attrs: movprfx ]
+     [ w        , Upl , 0 ; *              ] fcvt\t%0.<SVE_PARTIAL_HSF:Vetype>, %1/m, %2.<SVE_SDF:Vetype>
+     [ ?&w      , Upl , w ; yes            ] movprfx\t%0, %2\;fcvt\t%0.<SVE_PARTIAL_HSF:Vetype>, %1/m, %2.<SVE_SDF:Vetype>
+  }
+)
+
 ;; Predicated float-to-float truncation with merging.
 (define_expand "@cond_<optab>_trunc<SVE_FULL_SDF:mode><SVE_FULL_HSF:mode>"
   [(set (match_operand:SVE_FULL_HSF 0 "register_operand")
@@ -10081,6 +10174,27 @@
 }
 )

+;; -------------------------------------------------------------------------
+;; ---- [FP<-FP] Extending conversions
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - FCVT
+;; -------------------------------------------------------------------------
+
+;; Unpredicated float-to-float extension.
+(define_expand "extend<SVE_PARTIAL_HSF:mode><SVE_SDF:mode>2"
+  [(set (match_operand:SVE_SDF 0 "register_operand")
+	(unspec:SVE_SDF
+	  [(match_dup 2)
+	   (match_dup 3)
+	   (match_operand:SVE_PARTIAL_HSF 1 "register_operand")]
+	  SVE_COND_FCVT))]
+  "TARGET_SVE && (~<SVE_SDF:narrower_mask> & <SVE_PARTIAL_HSF:self_mask>) == 0"
+  {
+    operands[2] = aarch64_sve_fp_pred (<SVE_SDF:MODE>mode, &operands[3]);
+  }
+)
+
 ;; Predicated float-to-float extension.
 (define_insn "@aarch64_sve_<optab>_nontrunc<SVE_FULL_HSF:mode><SVE_FULL_SDF:mode>"
   [(set (match_operand:SVE_FULL_SDF 0 "register_operand")
@@ -10096,6 +10210,21 @@
 }
 )

+;; As above, for pairs that are used by the auto-vectorizer only.
+(define_insn "*aarch64_sve_<optab>_nontrunc<SVE_PARTIAL_HSF:mode><SVE_SDF:mode>"
+  [(set (match_operand:SVE_SDF 0 "register_operand")
+	(unspec:SVE_SDF
+	  [(match_operand:<SVE_SDF:VPRED> 1 "aarch64_predicate_operand")
+	   (match_operand:SI 3 "aarch64_sve_gp_strictness")
+	   (match_operand:SVE_PARTIAL_HSF 2 "register_operand")]
+	  SVE_COND_FCVT))]
+  "TARGET_SVE && (~<SVE_SDF:narrower_mask> & <SVE_PARTIAL_HSF:self_mask>) == 0"
+  {@ [ cons: =0 , 1   , 2 ; attrs: movprfx ]
+     [ w        , Upl , 0 ; *              ] fcvt\t%0.<SVE_SDF:Vetype>, %1/m, %2.<SVE_PARTIAL_HSF:Vetype>
+     [ ?&w      , Upl , w ; yes            ] movprfx\t%0, %2\;fcvt\t%0.<SVE_SDF:Vetype>, %1/m, %2.<SVE_PARTIAL_HSF:Vetype>
+  }
+)
+
 ;; Predicated float-to-float extension with merging.
 (define_expand "@cond_<optab>_nontrunc<SVE_FULL_HSF:mode><SVE_FULL_SDF:mode>"
   [(set (match_operand:SVE_FULL_SDF 0 "register_operand")

gcc/config/aarch64/aarch64.cc (51 additions, 0 deletions)
@@ -3860,6 +3860,44 @@ aarch64_sve_same_pred_for_ptest_p (rtx *pred1, rtx *pred2)
   return (ptrue1_p && ptrue2_p) || rtx_equal_p (pred1[0], pred2[0]);
 }

+
+/* Generate a predicate to control partial SVE mode DATA_MODE as if it
+   were fully packed, enabling the defined elements only.  */
+rtx
+aarch64_sve_packed_pred (machine_mode data_mode)
+{
+  unsigned int container_bytes
+    = aarch64_sve_container_bits (data_mode) / BITS_PER_UNIT;
+  /* Enable the significand of each container only.  */
+  rtx ptrue = force_reg (VNx16BImode, aarch64_ptrue_all (container_bytes));
+  /* Predicate at the element size.  */
+  machine_mode pmode
+    = aarch64_sve_pred_mode (GET_MODE_UNIT_SIZE (data_mode)).require ();
+  return gen_lowpart (pmode, ptrue);
+}
+
+/* Generate a predicate and strictness value to govern a floating-point
+   operation with SVE mode DATA_MODE.
+
+   If DATA_MODE is a partial vector mode, this pair prevents the operation
+   from interpreting undefined elements - unless we don't need to suppress
+   their trapping behavior.  */
+rtx
+aarch64_sve_fp_pred (machine_mode data_mode, rtx *strictness)
+{
+  unsigned int vec_flags = aarch64_classify_vector_mode (data_mode);
+  if (flag_trapping_math && (vec_flags & VEC_PARTIAL))
+    {
+      if (strictness)
+	*strictness = gen_int_mode (SVE_STRICT_GP, SImode);
+      return aarch64_sve_packed_pred (data_mode);
+    }
+  if (strictness)
+    *strictness = gen_int_mode (SVE_RELAXED_GP, SImode);
+  /* Use the VPRED mode.  */
+  return aarch64_ptrue_reg (aarch64_sve_pred_mode (data_mode));
+}
+
 /* Emit a comparison CMP between OP0 and OP1, both of which have mode
    DATA_MODE, and return the result in a predicate of mode PRED_MODE.
    Use TARGET as the target register if nonnull and convenient.  */
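The decision logic of aarch64_sve_fp_pred in the hunk above can be summarized with a short Python sketch (a simplified model, not GCC code; the function and return labels are invented for illustration):

```python
# Model of the aarch64_sve_fp_pred choice: with trapping math enabled, a
# partial FP vector mode must be governed by a strict predicate that
# enables only the defined lanes (the "packed" predicate); in every other
# case a relaxed all-true predicate at the natural VPRED mode is safe,
# because either all lanes are defined or spurious traps don't matter.

def fp_pred_strategy(is_partial, trapping_math):
    if trapping_math and is_partial:
        return ("SVE_STRICT_GP", "packed_pred")   # mask undefined lanes
    return ("SVE_RELAXED_GP", "vpred_ptrue")      # all-true is fine

print(fp_pred_strategy(True, True))    # ('SVE_STRICT_GP', 'packed_pred')
print(fp_pred_strategy(True, False))   # ('SVE_RELAXED_GP', 'vpred_ptrue')
print(fp_pred_strategy(False, True))   # ('SVE_RELAXED_GP', 'vpred_ptrue')
```

The strictness value matters because a relaxed governing predicate permits later passes to widen the predicate, which is exactly what must not happen when the extra lanes could trap.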
@@ -23697,6 +23735,19 @@ aarch64_simd_shift_imm_p (rtx x, machine_mode mode, bool left)
   return IN_RANGE (INTVAL (x), 1, bit_width);
 }

+
+/* Check whether X can control SVE mode MODE.  */
+bool
+aarch64_sve_valid_pred_p (rtx x, machine_mode mode)
+{
+  machine_mode pred_mode = GET_MODE (x);
+  if (!aarch64_sve_pred_mode_p (pred_mode))
+    return false;
+
+  return known_ge (GET_MODE_NUNITS (pred_mode),
+		   GET_MODE_NUNITS (mode));
+}
+
 /* Return the bitmask CONST_INT to select the bits required by a zero extract
    operation of width WIDTH at bit position POS.  */
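The core of aarch64_sve_valid_pred_p, which backs the new aarch64_predicate_operand special predicate, reduces to a lane-count comparison. A minimal Python model (illustrative; lane counts stand in for the mode objects) captures the rule from the commit message:

```python
# Model of aarch64_sve_valid_pred_p: a governing predicate is valid for a
# data mode when it has at least as many significant lanes as the data
# mode has elements.

def sve_valid_pred_p(pred_nunits, data_nunits):
    return pred_nunits >= data_nunits

# Per the commit message, addvnx4hf3 (4 lanes) can be governed by any of
# VNx4BI, VNx8BI or VNx16BI (4, 8 or 16 predicate lanes)...
print([sve_valid_pred_p(p, 4) for p in (4, 8, 16)])  # [True, True, True]
# ...but not by VNx2BI (2 lanes), which cannot address all 4 elements.
print(sve_valid_pred_p(2, 4))                        # False
```

This is why both VNx2BI and VNx4BI are acceptable for the unpacked VNx2-mode patterns: each provides at least as many lanes as the data mode needs.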
