Version: Unity 6.6 Alpha (6000.6)

Avx

class in Unity.Burst.Intrinsics

Description

AVX intrinsics

Static Properties

Property Description
IsAvxSupported Evaluates to true at compile time if AVX intrinsics are supported.
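
As a minimal sketch of how this property might be used, the example below guards an AVX code path with IsAvxSupported and falls back to scalar code otherwise. It assumes the class is reachable as Unity.Burst.Intrinsics.X86.Avx, and that the v256 type and the mm256_loadu_ps, mm256_add_ps, and mm256_storeu_ps methods listed on this page accept pointer and v256 arguments as shown. AvxAddExample and Add are hypothetical names, not part of the Burst API.

```csharp
using Unity.Burst;
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86.Avx;

// Hypothetical helper for illustration only; not part of the Burst API.
[BurstCompile]
public static unsafe class AvxAddExample
{
    // Adds two float buffers element-wise: dst[i] = a[i] + b[i].
    [BurstCompile]
    public static void Add(float* a, float* b, float* dst, int count)
    {
        int i = 0;

        // Assumption: this check is resolved at compile time under Burst,
        // so the unused branch is removed from the generated code.
        if (IsAvxSupported)
        {
            // Process 8 floats (256 bits) per iteration using unaligned loads and stores.
            for (; i + 8 <= count; i += 8)
            {
                v256 va = mm256_loadu_ps(a + i);
                v256 vb = mm256_loadu_ps(b + i);
                mm256_storeu_ps(dst + i, mm256_add_ps(va, vb));
            }
        }

        // Scalar tail, and the full fallback when AVX is not available.
        for (; i < count; i++)
        {
            dst[i] = a[i] + b[i];
        }
    }
}
```

Because the scalar loop starts wherever the vector loop stopped, the same method handles counts that are not a multiple of 8 and targets without AVX.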

Static Methods

Method Description
broadcast_ss Broadcast a single-precision (32-bit) floating-point element from memory to all elements of dst.
cmp_pd Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in dst.
cmp_ps Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in dst.
cmp_sd Compare the lower double-precision (64-bit) floating-point element in a and b based on the comparison operand specified by imm8, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst.
cmp_ss Compare the lower single-precision (32-bit) floating-point element in a and b based on the comparison operand specified by imm8, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
maskload_pd Load packed double-precision (64-bit) floating-point elements from memory into dst using mask (elements are zeroed out when the high bit of the corresponding element is not set).
maskload_ps Load packed single-precision (32-bit) floating-point elements from memory into dst using mask (elements are zeroed out when the high bit of the corresponding element is not set).
maskstore_pd Store packed double-precision (64-bit) floating-point elements from a into memory using mask.
maskstore_ps Store packed single-precision (32-bit) floating-point elements from a into memory using mask.
mm256_add_pd Add packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.
mm256_add_ps Add packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.
mm256_addsub_pd Alternately add and subtract packed double-precision (64-bit) floating-point elements in a to/from packed elements in b, and store the results in dst.
mm256_addsub_ps Alternately add and subtract packed single-precision (32-bit) floating-point elements in a to/from packed elements in b, and store the results in dst.
mm256_and_pd Compute the bitwise AND of packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.
mm256_and_ps Compute the bitwise AND of packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.
mm256_andnot_pd Compute the bitwise NOT of packed double-precision (64-bit) floating-point elements in a and then AND with b, and store the results in dst.
mm256_andnot_ps Compute the bitwise NOT of packed single-precision (32-bit) floating-point elements in a and then AND with b, and store the results in dst.
mm256_blend_pd Blend packed double-precision (64-bit) floating-point elements from a and b using control mask imm8, and store the results in dst.
mm256_blend_ps Blend packed single-precision (32-bit) floating-point elements from a and b using control mask imm8, and store the results in dst.
mm256_blendv_pd Blend packed double-precision (64-bit) floating-point elements from a and b using mask, and store the results in dst.
mm256_blendv_ps Blend packed single-precision (32-bit) floating-point elements from a and b using mask, and store the results in dst.
mm256_broadcast_pd Broadcast 128 bits from memory (composed of 2 packed double-precision (64-bit) floating-point elements) to all elements of dst.
mm256_broadcast_ps Broadcast 128 bits from memory (composed of 4 packed single-precision (32-bit) floating-point elements) to all elements of dst.
mm256_broadcast_sd Broadcast a double-precision (64-bit) floating-point element from memory to all elements of dst.
mm256_broadcast_ss Broadcast a single-precision (32-bit) floating-point element from memory to all elements of dst.
mm256_castpd_ps For compatibility with C++ code only. This is a no-op in Burst.
mm256_castpd_si256 For compatibility with C++ code only. This is a no-op in Burst.
mm256_castpd128_pd256 For compatibility with C++ code only. This is a no-op in Burst.
mm256_castpd256_pd128 For compatibility with C++ code only. This is a no-op in Burst.
mm256_castps_pd For compatibility with C++ code only. This is a no-op in Burst.
mm256_castps_si256 For compatibility with C++ code only. This is a no-op in Burst.
mm256_castps128_ps256 For compatibility with C++ code only. This is a no-op in Burst.
mm256_castps256_ps128 For compatibility with C++ code only. This is a no-op in Burst.
mm256_castsi128_si256 For compatibility with C++ code only. This is a no-op in Burst.
mm256_castsi256_pd For compatibility with C++ code only. This is a no-op in Burst.
mm256_castsi256_ps For compatibility with C++ code only. This is a no-op in Burst.
mm256_castsi256_si128 For compatibility with C++ code only. This is a no-op in Burst.
mm256_ceil_pd Round the packed double-precision (64-bit) floating-point elements in a up to an integer value, and store the results as packed double-precision floating-point elements in dst.
mm256_ceil_ps Round the packed single-precision (32-bit) floating-point elements in a up to an integer value, and store the results as packed single-precision floating-point elements in dst.
mm256_cmp_pd Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in dst.
mm256_cmp_ps Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in dst.
mm256_cvtepi32_pd Convert packed 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.
mm256_cvtepi32_ps Convert packed 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.
mm256_cvtpd_epi32 Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst.
mm256_cvtpd_ps Convert packed double-precision (64-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.
mm256_cvtps_epi32 Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst.
mm256_cvtps_pd Convert packed single-precision (32-bit) floating-point elements in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.
mm256_cvtss_f32 Copy the lower single-precision (32-bit) floating-point element of a to dst.
mm256_cvttpd_epi32 Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst.
mm256_cvttps_epi32 Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst.
mm256_div_pd Divide packed double-precision (64-bit) floating-point elements in a by packed elements in b, and store the results in dst.
mm256_div_ps Divide packed single-precision (32-bit) floating-point elements in a by packed elements in b, and store the results in dst.
mm256_dp_ps Conditionally multiply the packed single-precision (32-bit) floating-point elements in a and b using the high 4 bits in imm8, sum the four products, and conditionally store the sum in dst using the low 4 bits of imm8.
mm256_extract_epi32 Extract a 32-bit integer from a, selected with index (which must be a constant), and store the result in dst.
mm256_extract_epi64 Extract a 64-bit integer from a, selected with index (which must be a constant), and store the result in dst.
mm256_extractf128_pd Extract 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from a, selected with imm8, and store the result in dst.
mm256_extractf128_ps Extract 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from a, selected with imm8, and store the result in dst.
mm256_extractf128_si256 Extract 128 bits (composed of integer data) from a, selected with imm8, and store the result in dst.
mm256_floor_pd Round the packed double-precision (64-bit) floating-point elements in a down to an integer value, and store the results as packed double-precision floating-point elements in dst.
mm256_floor_ps Round the packed single-precision (32-bit) floating-point elements in a down to an integer value, and store the results as packed single-precision floating-point elements in dst.
mm256_hadd_pd Horizontally add adjacent pairs of double-precision (64-bit) floating-point elements in a and b, and pack the results in dst.
mm256_hadd_ps Horizontally add adjacent pairs of single-precision (32-bit) floating-point elements in a and b, and pack the results in dst.
mm256_hsub_pd Horizontally subtract adjacent pairs of double-precision (64-bit) floating-point elements in a and b, and pack the results in dst.
mm256_hsub_ps Horizontally subtract adjacent pairs of single-precision (32-bit) floating-point elements in a and b, and pack the results in dst.
mm256_insert_epi16 Copy a to dst, and insert the 16-bit integer i into dst at the location specified by index (which must be a constant).
mm256_insert_epi32 Copy a to dst, and insert the 32-bit integer i into dst at the location specified by index (which must be a constant).
mm256_insert_epi64 Copy a to dst, and insert the 64-bit integer i into dst at the location specified by index (which must be a constant).
mm256_insert_epi8 Copy a to dst, and insert the 8-bit integer i into dst at the location specified by index (which must be a constant).
mm256_insertf128_pd Copy a to dst, then insert 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from b into dst at the location specified by imm8.
mm256_insertf128_ps Copy a to dst, then insert 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from b into dst at the location specified by imm8.
mm256_insertf128_si256 Copy a to dst, then insert 128 bits of integer data from b into dst at the location specified by imm8.
mm256_lddqu_si256 Load 256-bits of integer data from unaligned memory into dst. This intrinsic may perform better than mm256_loadu_si256 when the data crosses a cache line boundary.
mm256_load_pd Load 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from memory.
mm256_load_ps Load 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from memory.
mm256_load_si256 Load 256-bits (composed of 8 packed 32-bit integer elements) from memory.
mm256_loadu_pd Load 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from memory.
mm256_loadu_ps Load 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from memory.
mm256_loadu_si256 Load 256-bits (composed of 8 packed 32-bit integer elements) from memory.
mm256_loadu2_m128 Load two 128-bit values (composed of 4 packed single-precision (32-bit) floating-point elements) from memory, and combine them into a 256-bit value in dst. hiaddr and loaddr do not need to be aligned on any particular boundary.
mm256_loadu2_m128d Load two 128-bit values (composed of 2 packed double-precision (64-bit) floating-point elements) from memory, and combine them into a 256-bit value in dst. hiaddr and loaddr do not need to be aligned on any particular boundary.
mm256_loadu2_m128i Load two 128-bit values (composed of integer data) from memory, and combine them into a 256-bit value in dst. hiaddr and loaddr do not need to be aligned on any particular boundary.
mm256_maskload_pd Load packed double-precision (64-bit) floating-point elements from memory into dst using mask (elements are zeroed out when the high bit of the corresponding element is not set).
mm256_maskload_ps Load packed single-precision (32-bit) floating-point elements from memory into dst using mask (elements are zeroed out when the high bit of the corresponding element is not set).
mm256_maskstore_pd Store packed double-precision (64-bit) floating-point elements from a into memory using mask.
mm256_maskstore_ps Store packed single-precision (32-bit) floating-point elements from a into memory using mask.
mm256_max_pd Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed maximum values in dst.
mm256_max_ps Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed maximum values in dst.
mm256_min_pd Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed minimum values in dst.
mm256_min_ps Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed minimum values in dst.
mm256_movedup_pd Duplicate even-indexed double-precision (64-bit) floating-point elements from a, and store the results in dst.
mm256_movehdup_ps Duplicate odd-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst.
mm256_moveldup_ps Duplicate even-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst.
mm256_movemask_pd Set each bit of mask dst based on the most significant bit of the corresponding packed double-precision (64-bit) floating-point element in a.
mm256_movemask_ps Set each bit of mask dst based on the most significant bit of the corresponding packed single-precision (32-bit) floating-point element in a.
mm256_mul_pd Multiply packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.
mm256_mul_ps Multiply packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.
mm256_or_pd Compute the bitwise OR of packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.
mm256_or_ps Compute the bitwise OR of packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.
mm256_permute_pd Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst.
mm256_permute_ps Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst.
mm256_permute2f128_pd Shuffle 128-bits (composed of 2 packed double-precision (64-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst.
mm256_permute2f128_ps Shuffle 128-bits (composed of 4 packed single-precision (32-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst.
mm256_permute2f128_si256 Shuffle 128-bits (composed of integer data) selected by imm8 from a and b, and store the results in dst.
mm256_permutevar_pd Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst.
mm256_permutevar_ps Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst.
mm256_rcp_ps Compute the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 1.5*2^-12.
mm256_round_pd Round the packed double-precision (64-bit) floating-point elements in a using the rounding parameter, and store the results as packed double-precision floating-point elements in dst.
mm256_round_ps Round the packed single-precision (32-bit) floating-point elements in a using the rounding parameter, and store the results as packed single-precision floating-point elements in dst.
mm256_rsqrt_ps Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 1.5*2^-12.
mm256_set_epi16 Set packed short elements in dst with the supplied values.
mm256_set_epi32 Set packed int elements in dst with the supplied values.
mm256_set_epi64x Set packed 64-bit integers in dst with the supplied values.
mm256_set_epi8 Set packed byte elements in dst with the supplied values.
mm256_set_m128 Set packed v256 vector with the supplied values.
mm256_set_m128d Set packed v256 vector with the supplied values.
mm256_set_m128i Set packed v256 vector with the supplied values.
mm256_set_pd Set packed double-precision (64-bit) floating-point elements in dst with the supplied values.
mm256_set_ps Set packed single-precision (32-bit) floating-point elements in dst with the supplied values.
mm256_set1_epi16 Broadcast 16-bit integer a to all elements of dst. This intrinsic may generate the vpbroadcastw instruction.
mm256_set1_epi32 Broadcast 32-bit integer a to all elements of dst. This intrinsic may generate the vpbroadcastd instruction.
mm256_set1_epi64x Broadcast 64-bit integer a to all elements of dst. This intrinsic may generate the vpbroadcastq instruction.
mm256_set1_epi8 Broadcast 8-bit integer a to all elements of dst. This intrinsic may generate the vpbroadcastb instruction.
mm256_set1_pd Broadcast double-precision (64-bit) floating-point value a to all elements of dst.
mm256_set1_ps Broadcast single-precision (32-bit) floating-point value a to all elements of dst.
mm256_setr_epi16 Set packed short elements in dst with the supplied values in reverse order.
mm256_setr_epi32 Set packed int elements in dst with the supplied values in reverse order.
mm256_setr_epi64x Set packed 64-bit integers in dst with the supplied values in reverse order.
mm256_setr_epi8 Set packed byte elements in dst with the supplied values in reverse order.
mm256_setr_m128 Set packed v256 vector with the supplied values in reverse order.
mm256_setr_m128d Set packed v256 vector with the supplied values in reverse order.
mm256_setr_m128i Set packed v256 vector with the supplied values in reverse order.
mm256_setr_pd Set packed double-precision (64-bit) floating-point elements in dst with the supplied values in reverse order.
mm256_setr_ps Set packed single-precision (32-bit) floating-point elements in dst with the supplied values in reverse order.
mm256_setzero_pd Return a vector with all elements set to zero.
mm256_setzero_ps Return a vector with all elements set to zero.
mm256_setzero_si256 Return a vector with all elements set to zero.
mm256_shuffle_pd Shuffle double-precision (64-bit) floating-point elements within 128-bit lanes using the control in imm8, and store the results in dst.
mm256_shuffle_ps Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst.
mm256_sqrt_pd Compute the square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst.
mm256_sqrt_ps Compute the square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst.
mm256_store_pd Store 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from a into memory.
mm256_store_ps Store 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from a into memory.
mm256_store_si256 Store 256-bits (composed of 8 packed 32-bit integer elements) from a into memory.
mm256_storeu_pd Store 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from a into memory.
mm256_storeu_ps Store 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from a into memory.
mm256_storeu_si256 Store 256-bits (composed of 8 packed 32-bit integer elements) from a into memory.
mm256_storeu2_m128 Store the high and low 128-bit halves (each composed of 4 packed single-precision (32-bit) floating-point elements) from a into two different 128-bit memory locations. hiaddr and loaddr do not need to be aligned on any particular boundary.
mm256_storeu2_m128d Store the high and low 128-bit halves (each composed of 2 packed double-precision (64-bit) floating-point elements) from a into two different 128-bit memory locations. hiaddr and loaddr do not need to be aligned on any particular boundary.
mm256_storeu2_m128i Store the high and low 128-bit halves (each composed of integer data) from a into two different 128-bit memory locations. hiaddr and loaddr do not need to be aligned on any particular boundary.
mm256_stream_pd Store 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from a into memory using a non-temporal memory hint. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
mm256_stream_ps Store 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from a into memory using a non-temporal memory hint. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
mm256_stream_si256 Store 256-bits of integer data from a into memory using a non-temporal memory hint. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
mm256_sub_pd Subtract packed double-precision (64-bit) floating-point elements in b from packed double-precision (64-bit) floating-point elements in a, and store the results in dst.
mm256_sub_ps Subtract packed single-precision (32-bit) floating-point elements in b from packed single-precision (32-bit) floating-point elements in a, and store the results in dst.
mm256_testc_pd Compute the bitwise AND of 256 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return the CF value.
mm256_testc_ps Compute the bitwise AND of 256 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return the CF value.
mm256_testc_si256 Compute the bitwise AND of 256 bits (representing integer data) in a and b, and set ZF to 1 if the result is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, and set CF to 1 if the result is zero, otherwise set CF to 0. Return the CF value.
mm256_testnzc_pd Compute the bitwise AND of 256 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.
mm256_testnzc_ps Compute the bitwise AND of 256 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.
mm256_testnzc_si256 Compute the bitwise AND of 256 bits (representing integer data) in a and b, and set ZF to 1 if the result is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, and set CF to 1 if the result is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.
mm256_testz_pd Compute the bitwise AND of 256 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return the ZF value.
mm256_testz_ps Compute the bitwise AND of 256 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return the ZF value.
mm256_testz_si256 Compute the bitwise AND of 256 bits (representing integer data) in a and b, and set ZF to 1 if the result is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, and set CF to 1 if the result is zero, otherwise set CF to 0. Return the ZF value.
mm256_undefined_pd Return a 256-bit vector with undefined contents.
mm256_undefined_ps Return a 256-bit vector with undefined contents.
mm256_undefined_si256 Return a 256-bit vector with undefined contents.
mm256_unpackhi_pd Unpack and interleave double-precision (64-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst.
mm256_unpackhi_ps Unpack and interleave single-precision (32-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst.
mm256_unpacklo_pd Unpack and interleave double-precision (64-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst.
mm256_unpacklo_ps Unpack and interleave single-precision (32-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst.
mm256_xor_pd Compute the bitwise XOR of packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.
mm256_xor_ps Compute the bitwise XOR of packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.
mm256_zeroall Zero the contents of all YMM registers.
mm256_zeroupper Zero the upper 128 bits of all YMM registers; the lower 128-bits of the registers are unmodified.
mm256_zextpd128_pd256 Casts vector of type v128 to type v256; the upper 128 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
mm256_zextps128_ps256 Casts vector of type v128 to type v256; the upper 128 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
mm256_zextsi128_si256 Casts vector of type v128 to type v256; the upper 128 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
permute_pd Shuffle double-precision (64-bit) floating-point elements in a using the control in imm8, and store the results in dst.
permute_ps Shuffle single-precision (32-bit) floating-point elements in a using the control in imm8, and store the results in dst.
permutevar_pd Shuffle double-precision (64-bit) floating-point elements in a using the control in b, and store the results in dst.
permutevar_ps Shuffle single-precision (32-bit) floating-point elements in a using the control in b, and store the results in dst.
testc_pd Compute the bitwise AND of 128 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return the CF value.
testc_ps Compute the bitwise AND of 128 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return the CF value.
testnzc_pd Compute the bitwise AND of 128 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.
testnzc_ps Compute the bitwise AND of 128 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.
testz_pd Compute the bitwise AND of 128 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return the ZF value.
testz_ps Compute the bitwise AND of 128 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return the ZF value.
undefined_pd Return a 128-bit vector with undefined contents.
undefined_ps Return a 128-bit vector with undefined contents.
undefined_si128 Return a 128-bit vector with undefined contents.
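
The descriptions of the testz, testc, and testnzc methods above are dense, so the following scalar model may help when reasoning about their return values. It is only a sketch of the ZF/CF rules as worded in the _ps rows of the table, written in plain C# over raw 32-bit element patterns; it is not how Burst implements these intrinsics, and TestFlagsModel and its members are hypothetical names.

```csharp
using System;

// Hypothetical scalar model for illustration; not part of the Burst API.
public static class TestFlagsModel
{
    private const uint SignBit = 0x80000000u;

    // Computes (ZF, CF) for two vectors given as raw 32-bit element patterns.
    // Pass 4 elements to model the 128-bit variants or 8 for the 256-bit ones.
    public static (int zf, int cf) FlagsPs(uint[] a, uint[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("a and b must have the same element count.");

        int zf = 1; // stays 1 only if no element of (a AND b) has its sign bit set
        int cf = 1; // stays 1 only if no element of ((NOT a) AND b) has its sign bit set

        for (int i = 0; i < a.Length; i++)
        {
            if (((a[i] & b[i]) & SignBit) != 0) zf = 0;
            if (((~a[i] & b[i]) & SignBit) != 0) cf = 0;
        }

        return (zf, cf);
    }

    // testz_ps / mm256_testz_ps: return the ZF value.
    public static int TestZ(uint[] a, uint[] b) => FlagsPs(a, b).zf;

    // testc_ps / mm256_testc_ps: return the CF value.
    public static int TestC(uint[] a, uint[] b) => FlagsPs(a, b).cf;

    // testnzc_ps / mm256_testnzc_ps: return 1 only if both ZF and CF are zero.
    public static int TestNzc(uint[] a, uint[] b)
    {
        var (zf, cf) = FlagsPs(a, b);
        return (zf == 0 && cf == 0) ? 1 : 0;
    }
}
```

Read this way, TestZ answers "do a and b share no set sign bits?", TestC answers "is every sign bit set in b also set in a?", and TestNzc is 1 only when neither holds. The _pd variants follow the same rules with 64-bit elements, and the _si256 variants test all bits rather than only the sign bits.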