1. Motivation
[P1928R7] introduced data parallel types to C++. It mostly provided operators
which worked on or with std::simd types, but it also included overloads of
useful functions from other parts of C++ (e.g., sin, cos, abs). In this paper we
propose some other functions from standard C++ headers which should receive
overloads to work with std::simd types. The list isn’t exhaustive, but
reflects those functions which are desirable to include.
2. Support for <bit>
The <bit> header is part of the numerics library and provides utilities for
manipulating and querying the properties of integral values when treated as
collections of bits. The table below summarises the contents of <bit>.
Name            Purpose                                                                           Proposed (Y/N)

endian          a type which indicates the endianness of scalar types                             N
bit_cast        reinterprets the object representation of one type as that of another             N
byteswap        reverses the bytes in the given integer value                                     Y
has_single_bit  checks if a number is an integral power of two                                    Y
bit_ceil        finds the smallest integral power of two not less than the given value            Y
bit_floor       finds the largest integral power of two not greater than the given value          Y
bit_width       finds the smallest number of bits needed to represent the given value             Y
rotl            computes the result of bitwise left-rotation                                      Y
rotr            computes the result of bitwise right-rotation                                     Y
countl_zero     counts the number of consecutive 0 bits, starting from the most significant bit   Y
countl_one      counts the number of consecutive 1 bits, starting from the most significant bit   Y
countr_zero     counts the number of consecutive 0 bits, starting from the least significant bit  Y
countr_one      counts the number of consecutive 1 bits, starting from the least significant bit  Y
popcount        counts the number of 1 bits in an unsigned integer                                Y
Of these types and functions, only the first two shouldn’t be handled by
std::simd:

endian indicates the endianness of a scalar type. A SIMD value with elements of the underlying scalar type will have the same properties, so endian needs no special handling in std::simd.
bit_cast should be handled differently for std::simd values, and a separate proposal for simd_bit_cast will be provided.
All the other functions from <bit> should be handled in std::simd by
elementwise application of the function to each element of the SIMD value. Any
constraints and behaviours on the function will be applied at the SIMD value
level. For instance, if a scalar function participates in overload resolution
only if the argument type satisfies a particular concept, then the overload of
that function with a std::simd parameter has the same constraint on its
element type.
One small modification to the behaviour of the <bit> functions for std::simd
is where the return type differs from the input type. For example, the
standard <bit> header defines some query functions as returning integer values:
template<class T> constexpr int bit_width(T x) noexcept;
template<class T> constexpr int countl_one(T x) noexcept;
If an int were to be returned from the std::simd overload of such functions
then the size of the elements could change. For example, computing
the bit width of an 8-bit integer could generate a std::simd of 64-bit integers
as the output, which would lead to a dramatic change in storage size and
performance. Instead, we propose that all the overloads for std::simd should
return element types which are the same physical size as the element types they
are querying. This would mean that calling bit_width on a std::simd of unsigned
8-bit integers will return a std::simd containing signed 8-bit values.
3. Wording
Below, substitute the � character with a number the editor finds appropriate for the table, paragraph, section or subsection.
3.1. Add new section [simd.bit]
� basic_simd bit library [simd.bit]
template<typename T, typename Abi>
  constexpr basic_simd<T, Abi> byteswap(const basic_simd<T, Abi>& n) noexcept;
template<typename T, typename Abi>
  constexpr basic_simd<T, Abi>::mask_type has_single_bit(const basic_simd<T, Abi>& x) noexcept;
template<typename T, typename Abi>
  constexpr basic_simd<T, Abi> bit_ceil(const basic_simd<T, Abi>& n) noexcept;
template<typename T, typename Abi>
  constexpr basic_simd<T, Abi> bit_floor(const basic_simd<T, Abi>& n) noexcept;
template<typename T, typename Abi>
  constexpr simd<std::make_signed_t<T>, basic_simd<T, Abi>> bit_width(const basic_simd<T, Abi>& n) noexcept;
template<typename T, typename Abi>
  [[nodiscard]] constexpr basic_simd<T, Abi> rotl(const basic_simd<T, Abi>& n) noexcept;
template<typename T, typename Abi>
  [[nodiscard]] constexpr basic_simd<T, Abi> rotr(const basic_simd<T, Abi>& n) noexcept;
template<typename T, typename Abi>
  constexpr simd<std::make_signed_t<T>, basic_simd<T, Abi>> countl_zero(const basic_simd<T, Abi>& n) noexcept;
template<typename T, typename Abi>
  constexpr simd<std::make_signed_t<T>, basic_simd<T, Abi>> countl_one(const basic_simd<T, Abi>& n) noexcept;
template<typename T, typename Abi>
  constexpr simd<std::make_signed_t<T>, basic_simd<T, Abi>> countr_zero(const basic_simd<T, Abi>& n) noexcept;
template<typename T, typename Abi>
  constexpr simd<std::make_signed_t<T>, basic_simd<T, Abi>> countr_one(const basic_simd<T, Abi>& n) noexcept;
template<typename T, typename Abi>
  constexpr simd<std::make_signed_t<T>, basic_simd<T, Abi>> popcount(const basic_simd<T, Abi>& n) noexcept;
Constraints:
Any constraints from the equivalent scalar function from <bit> will be applied to the equivalent elementwise function in simd.
Returns:
A basic_simd with the same width as the input, where the i-th element will be equal to the result of applying the same function to the i-th element of the input.
The element size of the return value will always be the same physical size as that of each input element, even when the equivalent scalar function would return a larger type.
Remarks:
The order in which the functions are applied to each element is unspecified.
4. Revision History
R0 => R1

Fix typo: std::make_signed to std::make_signed_t
Make several paper text improvements