| Method | Description |
| --- | --- |
| blend_epi16 | Blend packed 16-bit integers from "a" and "b" using control mask "imm8", and store the results in "dst". |
| blend_pd | Blend packed double-precision (64-bit) floating-point elements from "a" and "b" using control mask "imm8", and store the results in "dst". |
| blend_ps | Blend packed single-precision (32-bit) floating-point elements from "a" and "b" using control mask "imm8", and store the results in "dst". |
| blendv_epi8 | Blend packed 8-bit integers from "a" and "b" using "mask", and store the results in "dst". |
| blendv_pd | Blend packed double-precision (64-bit) floating-point elements from "a" and "b" using "mask", and store the results in "dst". |
| blendv_ps | Blend packed single-precision (32-bit) floating-point elements from "a" and "b" using "mask", and store the results in "dst". |
| ceil_pd | Round the packed double-precision (64-bit) floating-point elements in "a" up to an integer value, and store the results as packed double-precision floating-point elements in "dst". |
| ceil_ps | Round the packed single-precision (32-bit) floating-point elements in "a" up to an integer value, and store the results as packed single-precision floating-point elements in "dst". |
| ceil_sd | Round the lower double-precision (64-bit) floating-point element in "b" up to an integer value, store the result as a double-precision floating-point element in the lower element of "dst", and copy the upper element from "a" to the upper element of "dst". |
| ceil_ss | Round the lower single-precision (32-bit) floating-point element in "b" up to an integer value, store the result as a single-precision floating-point element in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst". |
| cmpeq_epi64 | Compare packed 64-bit integers in "a" and "b" for equality, and store the results in "dst". |
| cvtepi16_epi32 | Sign extend packed 16-bit integers in "a" to packed 32-bit integers, and store the results in "dst". |
| cvtepi16_epi64 | Sign extend packed 16-bit integers in "a" to packed 64-bit integers, and store the results in "dst". |
| cvtepi32_epi64 | Sign extend packed 32-bit integers in "a" to packed 64-bit integers, and store the results in "dst". |
| cvtepi8_epi16 | Sign extend packed 8-bit integers in "a" to packed 16-bit integers, and store the results in "dst". |
| cvtepi8_epi32 | Sign extend packed 8-bit integers in "a" to packed 32-bit integers, and store the results in "dst". |
| cvtepi8_epi64 | Sign extend packed 8-bit integers in the low 8 bytes of "a" to packed 64-bit integers, and store the results in "dst". |
| cvtepu16_epi32 | Zero extend packed unsigned 16-bit integers in "a" to packed 32-bit integers, and store the results in "dst". |
| cvtepu16_epi64 | Zero extend packed unsigned 16-bit integers in "a" to packed 64-bit integers, and store the results in "dst". |
| cvtepu32_epi64 | Zero extend packed unsigned 32-bit integers in "a" to packed 64-bit integers, and store the results in "dst". |
| cvtepu8_epi16 | Zero extend packed unsigned 8-bit integers in "a" to packed 16-bit integers, and store the results in "dst". |
| cvtepu8_epi32 | Zero extend packed unsigned 8-bit integers in "a" to packed 32-bit integers, and store the results in "dst". |
| cvtepu8_epi64 | Zero extend packed unsigned 8-bit integers in the low 8 bytes of "a" to packed 64-bit integers, and store the results in "dst". |
| dp_pd | Conditionally multiply the packed double-precision (64-bit) floating-point elements in "a" and "b" using bits 5:4 of "imm8", sum the two products, and conditionally store the sum in "dst" using bits 1:0 of "imm8". |
| dp_ps | Conditionally multiply the packed single-precision (32-bit) floating-point elements in "a" and "b" using the high 4 bits in "imm8", sum the four products, and conditionally store the sum in "dst" using the low 4 bits of "imm8". |
| extract_epi32 | Extract a 32-bit integer from "a", selected with "imm8", and store the result in "dst". |
| extract_epi64 | Extract a 64-bit integer from "a", selected with "imm8", and store the result in "dst". |
| extract_epi8 | Extract an 8-bit integer from "a", selected with "imm8", and store the result in the lower element of "dst". |
| extract_ps | Extract a single-precision (32-bit) floating-point element from "a", selected with "imm8", and store the result in "dst". |
| extractf_ps | Extract a single-precision (32-bit) floating-point element from "a", selected with "imm8", and store the result in "dst" (as a float). |
| floor_pd | Round the packed double-precision (64-bit) floating-point elements in "a" down to an integer value, and store the results as packed double-precision floating-point elements in "dst". |
| floor_ps | Round the packed single-precision (32-bit) floating-point elements in "a" down to an integer value, and store the results as packed single-precision floating-point elements in "dst". |
| floor_sd | Round the lower double-precision (64-bit) floating-point element in "b" down to an integer value, store the result as a double-precision floating-point element in the lower element of "dst", and copy the upper element from "a" to the upper element of "dst". |
| floor_ss | Round the lower single-precision (32-bit) floating-point element in "b" down to an integer value, store the result as a single-precision floating-point element in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst". |
| insert_epi32 | Copy "a" to "dst", and insert the 32-bit integer "i" into "dst" at the location specified by "imm8". |
| insert_epi64 | Copy "a" to "dst", and insert the 64-bit integer "i" into "dst" at the location specified by "imm8". |
| insert_epi8 | Copy "a" to "dst", and insert the lower 8-bit integer from "i" into "dst" at the location specified by "imm8". |
| insert_ps | Copy "a" to "tmp", then insert a single-precision (32-bit) floating-point element from "b" into "tmp" using the control in "imm8". Store "tmp" to "dst" using the mask in "imm8" (elements are zeroed out when the corresponding bit is set). |
| max_epi32 | Compare packed 32-bit integers in "a" and "b", and store packed maximum values in "dst". |
| max_epi8 | Compare packed 8-bit integers in "a" and "b", and store packed maximum values in "dst". |
| max_epu16 | Compare packed unsigned 16-bit integers in "a" and "b", and store packed maximum values in "dst". |
| max_epu32 | Compare packed unsigned 32-bit integers in "a" and "b", and store packed maximum values in "dst". |
| min_epi32 | Compare packed 32-bit integers in "a" and "b", and store packed minimum values in "dst". |
| min_epi8 | Compare packed 8-bit integers in "a" and "b", and store packed minimum values in "dst". |
| min_epu16 | Compare packed unsigned 16-bit integers in "a" and "b", and store packed minimum values in "dst". |
| min_epu32 | Compare packed unsigned 32-bit integers in "a" and "b", and store packed minimum values in "dst". |
| minpos_epu16 | Horizontally compute the minimum amongst the packed unsigned 16-bit integers in "a", store the minimum and index in "dst", and zero the remaining bits in "dst". |
| MK_INSERTPS_NDX | Helper macro to create the index-parameter value for insert_ps. |
| mpsadbw_epu8 | Compute the sum of absolute differences (SADs) of quadruplets of unsigned 8-bit integers in "a" compared to those in "b", and store the 16-bit results in "dst". |
| mul_epi32 | Multiply the low 32-bit integers from each packed 64-bit element in "a" and "b", and store the signed 64-bit results in "dst". |
| mullo_epi32 | Multiply the packed 32-bit integers in "a" and "b", producing intermediate 64-bit integers, and store the low 32 bits of the intermediate integers in "dst". |
| packus_epi32 | Convert packed 32-bit integers from "a" and "b" to packed 16-bit integers using unsigned saturation, and store the results in "dst". |
| round_pd | Round the packed double-precision (64-bit) floating-point elements in "a" using the "rounding" parameter, and store the results as packed double-precision floating-point elements in "dst". |
| round_ps | Round the packed single-precision (32-bit) floating-point elements in "a" using the "rounding" parameter, and store the results as packed single-precision floating-point elements in "dst". |
| round_sd | Round the lower double-precision (64-bit) floating-point element in "b" using the "rounding" parameter, store the result as a double-precision floating-point element in the lower element of "dst", and copy the upper element from "a" to the upper element of "dst". |
| round_ss | Round the lower single-precision (32-bit) floating-point element in "b" using the "rounding" parameter, store the result as a single-precision floating-point element in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst". |
| stream_load_si128 | Load 128 bits of integer data from memory into "dst" using a non-temporal memory hint. "mem_addr" must be aligned on a 16-byte boundary or a general-protection exception may be generated. |
| test_all_ones | Compute the bitwise NOT of "a" and then AND with a 128-bit vector containing all 1's, and return 1 if the result is zero, otherwise return 0. |
| test_all_zeros | Compute the bitwise AND of 128 bits (representing integer data) in "a" and "mask", and return 1 if the result is zero, otherwise return 0. |
| test_mix_ones_zeroes | Compute the bitwise AND of 128 bits (representing integer data) in "a" and "mask", and set "ZF" to 1 if the result is zero, otherwise set "ZF" to 0. Compute the bitwise NOT of "a" and then AND with "mask", and set "CF" to 1 if the result is zero, otherwise set "CF" to 0. Return 1 if both the "ZF" and "CF" values are zero, otherwise return 0. |
| testc_si128 | Compute the bitwise AND of 128 bits (representing integer data) in "a" and "b", and set "ZF" to 1 if the result is zero, otherwise set "ZF" to 0. Compute the bitwise NOT of "a" and then AND with "b", and set "CF" to 1 if the result is zero, otherwise set "CF" to 0. Return the "CF" value. |
| testnzc_si128 | Compute the bitwise AND of 128 bits (representing integer data) in "a" and "b", and set "ZF" to 1 if the result is zero, otherwise set "ZF" to 0. Compute the bitwise NOT of "a" and then AND with "b", and set "CF" to 1 if the result is zero, otherwise set "CF" to 0. Return 1 if both the "ZF" and "CF" values are zero, otherwise return 0. |
| testz_si128 | Compute the bitwise AND of 128 bits (representing integer data) in "a" and "b", and set "ZF" to 1 if the result is zero, otherwise set "ZF" to 0. Compute the bitwise NOT of "a" and then AND with "b", and set "CF" to 1 if the result is zero, otherwise set "CF" to 0. Return the "ZF" value. |