Class X86.Sse4_1
SSE 4.1 intrinsics
Namespace: Unity.Burst.Intrinsics
Assembly: Unity.Burst.dll
Syntax
public static class X86.Sse4_1
Properties
Name | Description |
---|---|
IsSse41Supported | Evaluates to true at compile time if SSE 4.1 intrinsics are supported. |
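
A minimal sketch of how this property is typically used to guard an SSE 4.1 code path, assuming a Unity project with the Burst package installed. The class `FloorExample` and the method `FloorFour` are hypothetical names chosen for illustration; the `v128` four-float constructor and its `Float0`..`Float3` fields are assumed from the Burst `v128` type.

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86;

public static class FloorExample
{
    // Hypothetical helper: rounds four packed floats down to integer values.
    public static v128 FloorFour(v128 x)
    {
        if (Sse4_1.IsSse41Supported)
        {
            // SSE 4.1 path: a single packed floor over all four lanes.
            return Sse4_1.floor_ps(x);
        }
        else
        {
            // Scalar fallback for targets without SSE 4.1.
            return new v128(
                (float)System.Math.Floor(x.Float0),
                (float)System.Math.Floor(x.Float1),
                (float)System.Math.Floor(x.Float2),
                (float)System.Math.Floor(x.Float3));
        }
    }
}
```

Both branches produce the same values, so callers do not need to know which path was taken. For the input (1.5, -1.5, 2.0, -0.25) either branch yields (1, -2, 2, -1).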
Methods
Name | Description |
---|---|
MK_INSERTPS_NDX(int, int, int) | Helper macro to create the index-parameter value for insert_ps. |
blend_epi16(v128, v128, int) | Blend packed 16-bit integers from "a" and "b" using control mask "imm8", and store the results in "dst". |
blend_pd(v128, v128, int) | Blend packed double-precision (64-bit) floating-point elements from "a" and "b" using control mask "imm8", and store the results in "dst". |
blend_ps(v128, v128, int) | Blend packed single-precision (32-bit) floating-point elements from "a" and "b" using control mask "imm8", and store the results in "dst". |
blendv_epi8(v128, v128, v128) | Blend packed 8-bit integers from "a" and "b" using "mask", and store the results in "dst". |
blendv_pd(v128, v128, v128) | Blend packed double-precision (64-bit) floating-point elements from "a" and "b" using "mask", and store the results in "dst". |
blendv_ps(v128, v128, v128) | Blend packed single-precision (32-bit) floating-point elements from "a" and "b" using "mask", and store the results in "dst". |
ceil_pd(v128) | Round the packed double-precision (64-bit) floating-point elements in "a" up to an integer value, and store the results as packed double-precision floating-point elements in "dst". |
ceil_ps(v128) | Round the packed single-precision (32-bit) floating-point elements in "a" up to an integer value, and store the results as packed single-precision floating-point elements in "dst". |
ceil_sd(v128, v128) | Round the lower double-precision (64-bit) floating-point element in "b" up to an integer value, store the result as a double-precision floating-point element in the lower element of "dst", and copy the upper element from "a" to the upper element of "dst". |
ceil_ss(v128, v128) | Round the lower single-precision (32-bit) floating-point element in "b" up to an integer value, store the result as a single-precision floating-point element in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst". |
cmpeq_epi64(v128, v128) | Compare packed 64-bit integers in "a" and "b" for equality, and store the results in "dst". |
cvtepi16_epi32(v128) | Sign extend packed 16-bit integers in "a" to packed 32-bit integers, and store the results in "dst". |
cvtepi16_epi64(v128) | Sign extend packed 16-bit integers in "a" to packed 64-bit integers, and store the results in "dst". |
cvtepi32_epi64(v128) | Sign extend packed 32-bit integers in "a" to packed 64-bit integers, and store the results in "dst". |
cvtepi8_epi16(v128) | Sign extend packed 8-bit integers in "a" to packed 16-bit integers, and store the results in "dst". |
cvtepi8_epi32(v128) | Sign extend packed 8-bit integers in "a" to packed 32-bit integers, and store the results in "dst". |
cvtepi8_epi64(v128) | Sign extend packed 8-bit integers in the low 2 bytes of "a" to packed 64-bit integers, and store the results in "dst". |
cvtepu16_epi32(v128) | Zero extend packed unsigned 16-bit integers in "a" to packed 32-bit integers, and store the results in "dst". |
cvtepu16_epi64(v128) | Zero extend packed unsigned 16-bit integers in "a" to packed 64-bit integers, and store the results in "dst". |
cvtepu32_epi64(v128) | Zero extend packed unsigned 32-bit integers in "a" to packed 64-bit integers, and store the results in "dst". |
cvtepu8_epi16(v128) | Zero extend packed unsigned 8-bit integers in "a" to packed 16-bit integers, and store the results in "dst". |
cvtepu8_epi32(v128) | Zero extend packed unsigned 8-bit integers in "a" to packed 32-bit integers, and store the results in "dst". |
cvtepu8_epi64(v128) | Zero extend packed unsigned 8-bit integers in the low 2 bytes of "a" to packed 64-bit integers, and store the results in "dst". |
dp_pd(v128, v128, int) | Conditionally multiply the packed double-precision (64-bit) floating-point elements in "a" and "b" using the high 4 bits in "imm8", sum the two products, and conditionally store the sum in "dst" using the low 4 bits of "imm8". |
dp_ps(v128, v128, int) | Conditionally multiply the packed single-precision (32-bit) floating-point elements in "a" and "b" using the high 4 bits in "imm8", sum the four products, and conditionally store the sum in "dst" using the low 4 bits of "imm8". |
extract_epi32(v128, int) | Extract a 32-bit integer from "a", selected with "imm8", and store the result in "dst". |
extract_epi64(v128, int) | Extract a 64-bit integer from "a", selected with "imm8", and store the result in "dst". |
extract_epi8(v128, int) | Extract an 8-bit integer from "a", selected with "imm8", and store the result in the lower element of "dst". |
extract_ps(v128, int) | Extract a single-precision (32-bit) floating-point element from "a", selected with "imm8", and store the result in "dst". |
extractf_ps(v128, int) | Extract a single-precision (32-bit) floating-point element from "a", selected with "imm8", and store the result in "dst" (as a float). |
floor_pd(v128) | Round the packed double-precision (64-bit) floating-point elements in "a" down to an integer value, and store the results as packed double-precision floating-point elements in "dst". |
floor_ps(v128) | Round the packed single-precision (32-bit) floating-point elements in "a" down to an integer value, and store the results as packed single-precision floating-point elements in "dst". |
floor_sd(v128, v128) | Round the lower double-precision (64-bit) floating-point element in "b" down to an integer value, store the result as a double-precision floating-point element in the lower element of "dst", and copy the upper element from "a" to the upper element of "dst". |
floor_ss(v128, v128) | Round the lower single-precision (32-bit) floating-point element in "b" down to an integer value, store the result as a single-precision floating-point element in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst". |
insert_epi32(v128, int, int) | Copy "a" to "dst", and insert the 32-bit integer "i" into "dst" at the location specified by "imm8". |
insert_epi64(v128, long, int) | Copy "a" to "dst", and insert the 64-bit integer "i" into "dst" at the location specified by "imm8". |
insert_epi8(v128, byte, int) | Copy "a" to "dst", and insert the lower 8-bit integer from "i" into "dst" at the location specified by "imm8". |
insert_ps(v128, v128, int) | Copy "a" to "tmp", then insert a single-precision (32-bit) floating-point element from "b" into "tmp" using the control in "imm8". Store "tmp" to "dst" using the mask in "imm8" (elements are zeroed out when the corresponding bit is set). |
max_epi32(v128, v128) | Compare packed 32-bit integers in "a" and "b", and store packed maximum values in "dst". |
max_epi8(v128, v128) | Compare packed 8-bit integers in "a" and "b", and store packed maximum values in "dst". |
max_epu16(v128, v128) | Compare packed unsigned 16-bit integers in "a" and "b", and store packed maximum values in "dst". |
max_epu32(v128, v128) | Compare packed unsigned 32-bit integers in "a" and "b", and store packed maximum values in "dst". |
min_epi32(v128, v128) | Compare packed 32-bit integers in "a" and "b", and store packed minimum values in "dst". |
min_epi8(v128, v128) | Compare packed 8-bit integers in "a" and "b", and store packed minimum values in "dst". |
min_epu16(v128, v128) | Compare packed unsigned 16-bit integers in "a" and "b", and store packed minimum values in "dst". |
min_epu32(v128, v128) | Compare packed unsigned 32-bit integers in "a" and "b", and store packed minimum values in "dst". |
minpos_epu16(v128) | Horizontally compute the minimum amongst the packed unsigned 16-bit integers in "a", store the minimum and index in "dst", and zero the remaining bits in "dst". |
mpsadbw_epu8(v128, v128, int) | Compute the sum of absolute differences (SADs) of quadruplets of unsigned 8-bit integers in "a" compared to those in "b", and store the 16-bit results in "dst". |
mul_epi32(v128, v128) | Multiply the low 32-bit integers from each packed 64-bit element in "a" and "b", and store the signed 64-bit results in "dst". |
mullo_epi32(v128, v128) | Multiply the packed 32-bit integers in "a" and "b", producing intermediate 64-bit integers, and store the low 32 bits of the intermediate integers in "dst". |
packus_epi32(v128, v128) | Convert packed 32-bit integers from "a" and "b" to packed 16-bit integers using unsigned saturation, and store the results in "dst". |
round_pd(v128, int) | Round the packed double-precision (64-bit) floating-point elements in "a" using the "rounding" parameter, and store the results as packed double-precision floating-point elements in "dst". |
round_ps(v128, int) | Round the packed single-precision (32-bit) floating-point elements in "a" using the "rounding" parameter, and store the results as packed single-precision floating-point elements in "dst". |
round_sd(v128, v128, int) | Round the lower double-precision (64-bit) floating-point element in "b" using the "rounding" parameter, store the result as a double-precision floating-point element in the lower element of "dst", and copy the upper element from "a" to the upper element of "dst". |
round_ss(v128, v128, int) | Round the lower single-precision (32-bit) floating-point element in "b" using the "rounding" parameter, store the result as a single-precision floating-point element in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst". |
stream_load_si128(void*) | Load 128 bits of integer data from memory into "dst" using a non-temporal memory hint. "mem_addr" must be aligned on a 16-byte boundary or a general-protection exception may be generated. |
test_all_ones(v128) | Compute the bitwise NOT of "a" and then AND with a 128-bit vector containing all 1's, and return 1 if the result is zero, otherwise return 0. |
test_all_zeros(v128, v128) | Compute the bitwise AND of 128 bits (representing integer data) in "a" and "mask", and return 1 if the result is zero, otherwise return 0. |
test_mix_ones_zeroes(v128, v128) | Compute the bitwise AND of 128 bits (representing integer data) in "a" and "mask", and set "ZF" to 1 if the result is zero, otherwise set "ZF" to 0. Compute the bitwise NOT of "a" and then AND with "mask", and set "CF" to 1 if the result is zero, otherwise set "CF" to 0. Return 1 if both the "ZF" and "CF" values are zero, otherwise return 0. |
testc_si128(v128, v128) | Compute the bitwise AND of 128 bits (representing integer data) in "a" and "b", and set "ZF" to 1 if the result is zero, otherwise set "ZF" to 0. Compute the bitwise NOT of "a" and then AND with "b", and set "CF" to 1 if the result is zero, otherwise set "CF" to 0. Return the "CF" value. |
testnzc_si128(v128, v128) | Compute the bitwise AND of 128 bits (representing integer data) in "a" and "b", and set "ZF" to 1 if the result is zero, otherwise set "ZF" to 0. Compute the bitwise NOT of "a" and then AND with "b", and set "CF" to 1 if the result is zero, otherwise set "CF" to 0. Return 1 if both the "ZF" and "CF" values are zero, otherwise return 0. |
testz_si128(v128, v128) | Compute the bitwise AND of 128 bits (representing integer data) in "a" and "b", and set "ZF" to 1 if the result is zero, otherwise set "ZF" to 0. Compute the bitwise NOT of "a" and then AND with "b", and set "CF" to 1 if the result is zero, otherwise set "CF" to 0. Return the "ZF" value. |
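
The "imm8" control masks used by dp_ps and blend_ps in the table above can be easier to follow with concrete values. The following is a rough sketch, assuming a Unity project with the Burst package installed. `Sse41Sketch`, `Dot3`, and `BlendEvenFromB` are hypothetical names introduced only for this illustration, and the `v128` constructor and `Float0`..`Float3` fields are assumed from the Burst `v128` type.

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86;

public static class Sse41Sketch
{
    // Hypothetical helper: dot product of the first three lanes of a and b.
    public static float Dot3(v128 a, v128 b)
    {
        if (Sse4_1.IsSse41Supported)
        {
            // imm8 = 0x71:
            //   high 4 bits (0111) select lanes 0, 1 and 2 for multiplication,
            //   low 4 bits  (0001) store the sum in lane 0 and zero the rest.
            v128 d = Sse4_1.dp_ps(a, b, 0x71);
            return d.Float0;
        }

        // Scalar fallback for targets without SSE 4.1.
        return a.Float0 * b.Float0 + a.Float1 * b.Float1 + a.Float2 * b.Float2;
    }

    // Hypothetical helper: lanes 0 and 2 come from b, lanes 1 and 3 from a.
    public static v128 BlendEvenFromB(v128 a, v128 b)
    {
        if (Sse4_1.IsSse41Supported)
        {
            // imm8 = 0x5 (0101): a set bit i takes lane i from b, a clear bit from a.
            return Sse4_1.blend_ps(a, b, 0x5);
        }

        // Scalar fallback for targets without SSE 4.1.
        return new v128(b.Float0, a.Float1, b.Float2, a.Float3);
    }
}
```

With a = (1, 2, 3, 4) and b = (5, 6, 7, 8), `Dot3` returns 38 (5 + 12 + 21, with the fourth lane ignored). With a = (1, 2, 3, 4) and b = (10, 20, 30, 40), `BlendEvenFromB` returns (10, 2, 30, 4). The "imm8" arguments are written as literals here because the underlying instructions encode them as immediates.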