Overview

Advanced Floating-point Vector Lane Selection

Select: Selects between the first set of lanes and the second according to the value in 'select'. If the bit of 'select' corresponding to a lane is 0, the value from the first set of lanes is returned; if it is 1, the value from the second set of lanes is returned.

Shuffle: Selects lanes from a single input according to the start/offset computation.

Note: fpsel behaves as a "Shuffle" intrinsic.

For more information on lane selection, refer to the lane selection documentation.
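
As a rough illustration of the two behaviours described above, the following scalar C++ sketch models them without using the intrinsics. The names `select_lanes` and `shuffle_lanes` are hypothetical, and lane indices are passed in directly instead of going through the start/offset computation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of "Select": bit i of `select` picks the source
// of output lane i (0 -> first set of lanes, 1 -> second set of lanes).
template <std::size_t N>
std::array<float, N> select_lanes(std::uint32_t select,
                                  const std::array<float, N>& first,
                                  const std::array<float, N>& second) {
    static_assert(N <= 32, "one select bit per lane");
    std::array<float, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        const bool take_second = ((select >> i) & 1u) != 0;
        out[i] = take_second ? second[i] : first[i];
    }
    return out;
}

// Hypothetical scalar model of "Shuffle": every output lane reads one lane
// of a single input; index[i] names the input lane copied to output lane i.
template <std::size_t N, std::size_t M>
std::array<float, N> shuffle_lanes(const std::array<float, M>& in,
                                   const std::array<std::size_t, N>& index) {
    std::array<float, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        assert(index[i] < M);  // the index must name an existing input lane
        out[i] = in[index[i]];
    }
    return out;
}
```

In this model a select mask of 0x5 takes lanes 0 and 2 from the second input and the remaining lanes from the first.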

Functions
v16float	fpselect16 (unsigned int select, v32float xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi)
	Performs a floating point selection between lanes of xbuff.

v16float	fpselect16 (unsigned int select, v16float xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi)
	Performs a floating point selection between lanes of xbuff.

v16float	fpselect16 (unsigned int select, v16float xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi, v16float ybuff, int ystart, unsigned int yoffsets, unsigned int yoffsets_hi)
	Performs a floating point selection between lanes of xbuff and ybuff.

v8cfloat	fpselect8 (unsigned int select, v16cfloat xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)
	Performs a floating point selection between lanes of xbuff.

v8cfloat	fpselect8 (unsigned int select, v8cfloat xbuff, int xstart, unsigned int xoffsets, int ystart, unsigned int yoffsets)
	Performs a floating point selection between lanes of xbuff.

v8cfloat	fpselect8 (unsigned int select, v8cfloat xbuff, int xstart, unsigned int xoffsets, v8cfloat ybuff, int ystart, unsigned int yoffsets)
	Performs a floating point selection between lanes of xbuff and ybuff.

v16float	fpshuffle16 (v32float xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi)
	Performs a floating point shuffle between lanes of xbuff.

v16float	fpshuffle16 (v16float xbuff, int xstart, unsigned int xoffsets, unsigned int xoffsets_hi)
	Performs a floating point shuffle between lanes of xbuff.

v8cfloat	fpshuffle8 (v16cfloat xbuff, int xstart, unsigned int xoffsets)
	Performs a floating point shuffle between lanes of xbuff.

v8cfloat	fpshuffle8 (v8cfloat xbuff, int xstart, unsigned int xoffsets)
	Performs a floating point shuffle between lanes of xbuff.

Function Documentation

v16float fpselect16	(	unsigned int	select,
		v32float	xbuff,
		int	xstart,
		unsigned int	xoffsets,
		unsigned int	xoffsets_hi,
		int	ystart,
		unsigned int	yoffsets,
		unsigned int	yoffsets_hi
	)

Performs a floating point selection between lanes of xbuff.

fpselect(a, b, s)
{
  if (s)
    return b;
  else
    return a;
}

for (int i = 0; i < 16; i++) {
    idx = f(xstart, xoffsets[i]);
    idy = f(ystart, yoffsets[i]);
    o[i] = fpselect(x[idx], x[idy], select[i]);
}

xoffsets, xoffsets_hi, yoffsets, and yoffsets_hi hold 8 offset values each, 4 bits per offset. In the loop above, xoffsets[i] for i >= 8 denotes xoffsets_hi[i - 8], and likewise for yoffsets.
For example, for the v16float output type, idx for output lane 0 = f(xstart, xoffsets[0]).
For example, for the v16float output type, idx for output lane 15 = f(xstart, xoffsets_hi[7]).
In the case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers, refer to the lane selection note below.
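
The pseudocode above can be turned into a scalar reference model. The sketch below is only an illustration: `fpselect16_model`, `f_model`, and `offset_for_lane` are hypothetical names, and the body of `f_model` (start plus offset, wrapped to the buffer size) is an assumption, since f() itself is defined in the lane selection note rather than here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// ASSUMPTION: f(start, offset) = (start + offset) wrapped to the buffer size.
// The actual f() is specified in the lane selection note, not in this section.
inline std::size_t f_model(int start, unsigned offset, int buffer_lanes) {
    const int idx = (start + static_cast<int>(offset)) % buffer_lanes;
    return static_cast<std::size_t>(idx < 0 ? idx + buffer_lanes : idx);
}

// Extracts the 4-bit offset of output lane 0..15: lanes 0..7 come from `lo`,
// lanes 8..15 from `hi`, least significant nibble first.
inline unsigned offset_for_lane(std::uint32_t lo, std::uint32_t hi, int lane) {
    assert(lane >= 0 && lane < 16);
    const std::uint32_t word = lane < 8 ? lo : hi;
    return (word >> (4 * (lane % 8))) & 0xFu;
}

// Hypothetical scalar model of the v32float overload of fpselect16:
// both candidates of every lane are read from the same 32-lane buffer.
std::array<float, 16> fpselect16_model(
    std::uint32_t select, const std::array<float, 32>& x,
    int xstart, std::uint32_t xoffsets, std::uint32_t xoffsets_hi,
    int ystart, std::uint32_t yoffsets, std::uint32_t yoffsets_hi) {
    std::array<float, 16> o{};
    for (int i = 0; i < 16; ++i) {
        const std::size_t idx = f_model(xstart, offset_for_lane(xoffsets, xoffsets_hi, i), 32);
        const std::size_t idy = f_model(ystart, offset_for_lane(yoffsets, yoffsets_hi, i), 32);
        // select bit i: 0 -> first candidate x[idx], 1 -> second candidate x[idy]
        o[static_cast<std::size_t>(i)] = ((select >> i) & 1u) ? x[idy] : x[idx];
    }
    return o;
}
```

Under the stated assumption, calling the model with select = 0xFF00, xstart = 0, ystart = 16, and identity offsets (0x76543210, 0xFEDCBA98) for both inputs keeps x[0..7] in the low lanes and places x[24..31] in the high lanes.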

Parameters

Input/Output	Type	Comments
return	v16float	Each output lane holds the result of a floating-point selection between lanes of xbuff; the result for lane 0 goes to lane 0 of the output.
select	unsigned int	Bit i selects which of the two candidate values is placed in output lane i (0: first input, 1: second input)
xbuff	v32float	Input buffer of 32 single-precision elements
xstart	int	Starting position offset applied to all lanes of the input from the X buffer
xoffsets	unsigned int	4-bit offset for each lane, applied to the X buffer. The LSBs apply to the first lane
xoffsets_hi	unsigned int	4-bit offset for each lane, applied to the X buffer. The LSBs apply to lane 8 (zero-based)
ystart	int	Starting position offset applied to all lanes of the second input, which is also read from the X buffer
yoffsets	unsigned int	4-bit offset for each lane of the second input, applied to the X buffer. The LSBs apply to the first lane
yoffsets_hi	unsigned int	4-bit offset for each lane of the second input, applied to the X buffer. The LSBs apply to lane 8 (zero-based)

Note

For more information on how the function f() selects data from the buffers, refer to the lane selection documentation.
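
Because each offsets word packs eight 4-bit values, it can help to build the words from a per-lane list. The helper below is a minimal sketch of that packing, following the parameter table above (first lane in the least significant bits, lane 8 in the least significant bits of the _hi word); `pack_offsets16` is a hypothetical name and not part of the intrinsics API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>

// Packs 16 per-lane offsets (each 0..15) into an {offsets, offsets_hi} pair.
// Lane 0 lands in the least significant nibble of the first word and
// lane 8 in the least significant nibble of the second word.
std::pair<std::uint32_t, std::uint32_t>
pack_offsets16(const std::array<unsigned, 16>& lane_offsets) {
    std::uint32_t lo = 0;
    std::uint32_t hi = 0;
    for (int lane = 0; lane < 16; ++lane) {
        const unsigned value = lane_offsets[static_cast<std::size_t>(lane)];
        assert(value <= 0xFu);  // every offset must fit in 4 bits
        const std::uint32_t nibble = value & 0xFu;
        if (lane < 8) {
            lo |= nibble << (4 * lane);
        } else {
            hi |= nibble << (4 * (lane - 8));
        }
    }
    return {lo, hi};
}
```

With this packing, the identity pattern 0, 1, ..., 15 becomes the pair (0x76543210, 0xFEDCBA98), and the reversed pattern 15, 14, ..., 0 becomes (0x89ABCDEF, 0x01234567).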

v16float fpselect16	(	unsigned int	select,
		v16float	xbuff,
		int	xstart,
		unsigned int	xoffsets,
		unsigned int	xoffsets_hi,
		int	ystart,
		unsigned int	yoffsets,
		unsigned int	yoffsets_hi
	)

Performs a floating point selection between lanes of xbuff.

fpselect(a, b, s)
{
  if (s)
    return b;
  else
    return a;
}

for (int i = 0; i < 16; i++) {
    idx = f(xstart, xoffsets[i]);
    idy = f(ystart, yoffsets[i]);
    o[i] = fpselect(x[idx], x[idy], select[i]);
}

xoffsets, xoffsets_hi, yoffsets, and yoffsets_hi hold 8 offset values each, 4 bits per offset. In the loop above, xoffsets[i] for i >= 8 denotes xoffsets_hi[i - 8], and likewise for yoffsets.
For example, for the v16float output type, idx for output lane 0 = f(xstart, xoffsets[0]).
For example, for the v16float output type, idx for output lane 15 = f(xstart, xoffsets_hi[7]).
In the case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers, refer to the lane selection note below.

Parameters

Input/Output	Type	Comments
return	v16float	Each output lane holds the result of a floating-point selection between lanes of xbuff; the result for lane 0 goes to lane 0 of the output.
select	unsigned int	Bit i selects which of the two candidate values is placed in output lane i (0: first input, 1: second input)
xbuff	v16float	Input buffer of 16 single-precision elements
xstart	int	Starting position offset applied to all lanes of the input from the X buffer
xoffsets	unsigned int	4-bit offset for each lane, applied to the X buffer. The LSBs apply to the first lane
xoffsets_hi	unsigned int	4-bit offset for each lane, applied to the X buffer. The LSBs apply to lane 8 (zero-based)
ystart	int	Starting position offset applied to all lanes of the second input, which is also read from the X buffer
yoffsets	unsigned int	4-bit offset for each lane of the second input, applied to the X buffer. The LSBs apply to the first lane
yoffsets_hi	unsigned int	4-bit offset for each lane of the second input, applied to the X buffer. The LSBs apply to lane 8 (zero-based)

Note

For more information on how the function f() selects data from the buffers, refer to the lane selection documentation.

v16float fpselect16	(	unsigned int	select,
		v16float	xbuff,
		int	xstart,
		unsigned int	xoffsets,
		unsigned int	xoffsets_hi,
		v16float	ybuff,
		int	ystart,
		unsigned int	yoffsets,
		unsigned int	yoffsets_hi
	)

Performs a floating point selection between lanes of xbuff and ybuff.

fpselect(a, b, s)
{
  if (s)
    return b;
  else
    return a;
}

for (int i = 0; i < 16; i++) {
    idx = f(xstart, xoffsets[i]);
    idy = f(ystart, yoffsets[i]);
    o[i] = fpselect(x[idx], y[idy], select[i]);
}

xoffsets, xoffsets_hi, yoffsets, and yoffsets_hi hold 8 offset values each, 4 bits per offset. In the loop above, xoffsets[i] for i >= 8 denotes xoffsets_hi[i - 8], and likewise for yoffsets.
For example, for the v16float output type, idx for output lane 0 = f(xstart, xoffsets[0]).
For example, for the v16float output type, idx for output lane 15 = f(xstart, xoffsets_hi[7]).
In the case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers, refer to the lane selection note below.

Parameters

Input/Output	Type	Comments
return	v16float	Each output lane holds the result of a floating-point selection between lanes of xbuff and ybuff; the result for lane 0 goes to lane 0 of the output.
select	unsigned int	Bit i selects which of the two candidate values is placed in output lane i (0: value from xbuff, 1: value from ybuff)
xbuff	v16float	Input buffer of 16 single-precision elements
xstart	int	Starting position offset applied to all lanes of the input from the X buffer
xoffsets	unsigned int	4-bit offset for each lane, applied to the X buffer. The LSBs apply to the first lane
xoffsets_hi	unsigned int	4-bit offset for each lane, applied to the X buffer. The LSBs apply to lane 8 (zero-based)
ybuff	v16float	Input buffer of 16 single-precision elements
ystart	int	Starting position offset applied to all lanes of the second input, read from the Y buffer
yoffsets	unsigned int	4-bit offset for each lane, applied to the Y buffer. The LSBs apply to the first lane
yoffsets_hi	unsigned int	4-bit offset for each lane, applied to the Y buffer. The LSBs apply to lane 8 (zero-based)

Note

For more information on how the function f() selects data from the buffers, refer to the lane selection documentation.
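
One possible use of the two-buffer form is interleaving elements of xbuff and ybuff. The sketch below models that in scalar C++ under stated assumptions: `fpselect16_xy_model`, `f_model16`, `nibble`, and `interleave_low` are hypothetical names, and f() is assumed to be start plus offset, wrapped to the 16-lane buffer.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Lanes16 = std::array<float, 16>;

// ASSUMPTION: f(start, offset) = (start + offset) modulo the 16-lane buffer.
inline std::size_t f_model16(int start, unsigned offset) {
    const int idx = (start + static_cast<int>(offset)) % 16;
    return static_cast<std::size_t>(idx < 0 ? idx + 16 : idx);
}

// 4-bit offset of a lane: lanes 0..7 from `lo`, lanes 8..15 from `hi`.
inline unsigned nibble(std::uint32_t lo, std::uint32_t hi, int lane) {
    assert(lane >= 0 && lane < 16);
    return ((lane < 8 ? lo : hi) >> (4 * (lane % 8))) & 0xFu;
}

// Hypothetical scalar model of the xbuff/ybuff overload of fpselect16.
Lanes16 fpselect16_xy_model(std::uint32_t select,
                            const Lanes16& x, int xstart,
                            std::uint32_t xoffsets, std::uint32_t xoffsets_hi,
                            const Lanes16& y, int ystart,
                            std::uint32_t yoffsets, std::uint32_t yoffsets_hi) {
    Lanes16 o{};
    for (int i = 0; i < 16; ++i) {
        const std::size_t idx = f_model16(xstart, nibble(xoffsets, xoffsets_hi, i));
        const std::size_t idy = f_model16(ystart, nibble(yoffsets, yoffsets_hi, i));
        // select bit i: 0 -> x[idx], 1 -> y[idy]
        o[static_cast<std::size_t>(i)] = ((select >> i) & 1u) ? y[idy] : x[idx];
    }
    return o;
}

// Example: out[2k] = x[k], out[2k+1] = y[k] for k = 0..7.
// Output lane i uses offset i / 2 on both buffers; odd lanes select y.
Lanes16 interleave_low(const Lanes16& x, const Lanes16& y) {
    return fpselect16_xy_model(0xAAAAu,
                               x, 0, 0x33221100u, 0x77665544u,
                               y, 0, 0x33221100u, 0x77665544u);
}
```

The select mask 0xAAAA sets every odd bit, so odd output lanes come from y and even output lanes from x.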

v8cfloat fpselect8	(	unsigned int	select,
		v16cfloat	xbuff,
		int	xstart,
		unsigned int	xoffsets,
		int	ystart,
		unsigned int	yoffsets
	)

Performs a floating point selection between lanes of xbuff.

fpselect(a, b, s)
{
  if (s)
    return b;
  else
    return a;
}

for (int i = 0; i < 8; i++) {
    idx = f(xstart, xoffsets[i]);
    idy = f(ystart, yoffsets[i]);
    o[i] = fpselect(x[idx], x[idy], select[i]);
}

xoffsets and yoffsets hold 8 offset values each, 4 bits per offset (of which 3 bits are used). This overload has no xoffsets_hi or yoffsets_hi.
For example, for the v8cfloat output type, idx for output lane 0 = f(xstart, xoffsets[0]).
For example, for the v8cfloat output type, idx for output lane 7 = f(xstart, xoffsets[7]).
In the case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers, refer to the lane selection note below.

Parameters

Input/Output	Type	Comments
return	v8cfloat	Each output lane holds the result of a floating-point selection between lanes of xbuff; the result for lane 0 goes to lane 0 of the output.
select	unsigned int	Bit i selects which of the two candidate values is placed in output lane i (0: first input, 1: second input)
xbuff	v16cfloat	Input buffer of 16 single-precision complex elements
xstart	int	Starting position offset applied to all lanes of the input from the X buffer
xoffsets	unsigned int	3-bit (aligned to 4 bits) offset for each lane, applied to the X buffer. The LSBs apply to the first lane
ystart	int	Starting position offset applied to all lanes of the second input, which is also read from the X buffer
yoffsets	unsigned int	3-bit (aligned to 4 bits) offset for each lane of the second input, applied to the X buffer. The LSBs apply to the first lane

Note

When xoffsets or yoffsets is a runtime parameter, it might be more efficient to use a non-complex fpselect instruction and calculate the offsets accordingly. In that case, both the real and the imaginary (real+1) lanes must be accounted for in the offsets. The same applies to the select parameter.
For more information on how the function f() selects data from the buffers, refer to the lane selection documentation.
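
The conversion mentioned in the note could look roughly as follows. This is a sketch under assumptions, not the documented procedure: it assumes a complex element occupies two adjacent float lanes (real at 2k, imaginary at 2k+1) and that f() is additive, so a complex offset k maps to real offsets 2k and 2k+1. The names `complex_to_real_offsets` and `complex_to_real_select` are hypothetical. Any start value would presumably need the same scaling, which is not shown.

```cpp
#include <cassert>
#include <cstdint>

// Offsets for 16 float lanes, split as in the 16-lane intrinsics.
struct RealOffsets {
    std::uint32_t lo;  // float lanes 0..7  (complex lanes 0..3)
    std::uint32_t hi;  // float lanes 8..15 (complex lanes 4..7)
};

// Expands 8 complex-lane offsets (3 bits used per nibble) into 16 float-lane
// offsets: complex offset k becomes 2k for the real lane and 2k+1 for the
// imaginary lane. ASSUMPTION: interleaved real/imaginary storage.
RealOffsets complex_to_real_offsets(std::uint32_t complex_offsets) {
    std::uint32_t words[2] = {0u, 0u};
    for (int c = 0; c < 8; ++c) {
        const std::uint32_t off = (complex_offsets >> (4 * c)) & 0x7u;
        const int real_lane = 2 * c;          // float lane of the real part
        const int nib = real_lane % 8;        // nibble position inside its word
        std::uint32_t& w = words[real_lane / 8];
        w |= (2u * off) << (4 * nib);             // real part
        w |= (2u * off + 1u) << (4 * (nib + 1));  // imaginary part (real+1)
    }
    return {words[0], words[1]};
}

// Duplicates every select bit so that the real and imaginary float lanes of
// a complex lane pick the same source.
std::uint32_t complex_to_real_select(std::uint32_t complex_select) {
    std::uint32_t out = 0;
    for (int c = 0; c < 8; ++c) {
        if ((complex_select >> c) & 1u) {
            out |= 0x3u << (2 * c);
        }
    }
    return out;
}
```

Since a 3-bit complex offset is at most 7, the derived float offsets are at most 15 and still fit in 4 bits.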

v8cfloat fpselect8	(	unsigned int	select,
		v8cfloat	xbuff,
		int	xstart,
		unsigned int	xoffsets,
		int	ystart,
		unsigned int	yoffsets
	)

Performs a floating point selection between lanes of xbuff.

fpselect(a, b, s)
{
  if (s)
    return b;
  else
    return a;
}

for (int i = 0; i < 8; i++) {
    idx = f(xstart, xoffsets[i]);
    idy = f(ystart, yoffsets[i]);
    o[i] = fpselect(x[idx], x[idy], select[i]);
}

xoffsets and yoffsets hold 8 offset values each, 4 bits per offset (of which 3 bits are used). This overload has no xoffsets_hi or yoffsets_hi.
For example, for the v8cfloat output type, idx for output lane 0 = f(xstart, xoffsets[0]).
For example, for the v8cfloat output type, idx for output lane 7 = f(xstart, xoffsets[7]).
In the case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers, refer to the lane selection note below.

Parameters

Input/Output	Type	Comments
return	v8cfloat	Each output lane holds the result of a floating-point selection between lanes of xbuff; the result for lane 0 goes to lane 0 of the output.
select	unsigned int	Bit i selects which of the two candidate values is placed in output lane i (0: first input, 1: second input)
xbuff	v8cfloat	Input buffer of 8 single-precision complex elements
xstart	int	Starting position offset applied to all lanes of the input from the X buffer
xoffsets	unsigned int	3-bit (aligned to 4 bits) offset for each lane, applied to the X buffer. The LSBs apply to the first lane
ystart	int	Starting position offset applied to all lanes of the second input, which is also read from the X buffer
yoffsets	unsigned int	3-bit (aligned to 4 bits) offset for each lane of the second input, applied to the X buffer. The LSBs apply to the first lane

Note

When xoffsets or yoffsets is a runtime parameter, it might be more efficient to use a non-complex fpselect instruction and calculate the offsets accordingly. In that case, both the real and the imaginary (real+1) lanes must be accounted for in the offsets. The same applies to the select parameter.
For more information on how the function f() selects data from the buffers, refer to the lane selection documentation.

v8cfloat fpselect8	(	unsigned int	select,
		v8cfloat	xbuff,
		int	xstart,
		unsigned int	xoffsets,
		v8cfloat	ybuff,
		int	ystart,
		unsigned int	yoffsets
	)

Performs a floating point selection between lanes of xbuff and ybuff.

fpselect(a, b, s)
{
  if (s)
    return b;
  else
    return a;
}

for (int i = 0; i < 8; i++) {
    idx = f(xstart, xoffsets[i]);
    idy = f(ystart, yoffsets[i]);
    o[i] = fpselect(x[idx], y[idy], select[i]);
}

xoffsets and yoffsets hold 8 offset values each, 4 bits per offset (of which 3 bits are used). This overload has no xoffsets_hi or yoffsets_hi.
For example, for the v8cfloat output type, idx for output lane 0 = f(xstart, xoffsets[0]).
For example, for the v8cfloat output type, idx for output lane 7 = f(xstart, xoffsets[7]).
In the case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers, refer to the lane selection note below.

Parameters

Input/Output	Type	Comments
return	v8cfloat	Each output lane holds the result of a floating-point selection between lanes of xbuff and ybuff; the result for lane 0 goes to lane 0 of the output.
select	unsigned int	Bit i selects which of the two candidate values is placed in output lane i (0: value from xbuff, 1: value from ybuff)
xbuff	v8cfloat	Input buffer of 8 single-precision complex elements
xstart	int	Starting position offset applied to all lanes of the input from the X buffer
xoffsets	unsigned int	3-bit (aligned to 4 bits) offset for each lane, applied to the X buffer. The LSBs apply to the first lane
ybuff	v8cfloat	Input buffer of 8 single-precision complex elements
ystart	int	Starting position offset applied to all lanes of the second input, read from the Y buffer
yoffsets	unsigned int	3-bit (aligned to 4 bits) offset for each lane of the second input, applied to the Y buffer. The LSBs apply to the first lane

Note

When xoffsets or yoffsets is a runtime parameter, it might be more efficient to use a non-complex fpselect instruction and calculate the offsets accordingly. In that case, both the real and the imaginary (real+1) lanes must be accounted for in the offsets. The same applies to the select parameter.
For more information on how the function f() selects data from the buffers, refer to the lane selection documentation.

v16float fpshuffle16	(	v32float	xbuff,
		int	xstart,
		unsigned int	xoffsets,
		unsigned int	xoffsets_hi
	)

Performs a floating point shuffle between lanes of xbuff.

for (int i = 0; i < 16; i++) {
    idx = f(xstart, xoffsets[i]);
    o[i] = x[idx];
}

xoffsets and xoffsets_hi hold 8 offset values each, 4 bits per offset. In the loop above, xoffsets[i] for i >= 8 denotes xoffsets_hi[i - 8].
For example, for the v16float output type, idx for output lane 0 = f(xstart, xoffsets[0]).
For example, for the v16float output type, idx for output lane 15 = f(xstart, xoffsets_hi[7]).
In the case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers, refer to the lane selection note below.

Parameters

Input/Output	Type	Comments
return	v16float	Each output lane holds the result of a floating-point shuffle between lanes of xbuff; the result for lane 0 goes to lane 0 of the output.
xbuff	v32float	Input buffer of 32 single-precision elements
xstart	int	Starting position offset applied to all lanes of the input from the X buffer
xoffsets	unsigned int	4-bit offset for each lane, applied to the X buffer. The LSBs apply to the first lane
xoffsets_hi	unsigned int	4-bit offset for each lane, applied to the X buffer. The LSBs apply to lane 8 (zero-based)

Note

For more information on how the function f() selects data from the buffers, refer to the lane selection documentation.
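
As an illustration of the shuffle, the scalar model below reverses a 16-lane window of a 32-lane buffer. It is a sketch only: `fpshuffle16_model` and `f_model32` are hypothetical names, and f() is assumed to be start plus offset, wrapped to the 32-lane buffer.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// ASSUMPTION: f(start, offset) = (start + offset) modulo the 32-lane buffer.
inline std::size_t f_model32(int start, unsigned offset) {
    const int idx = (start + static_cast<int>(offset)) % 32;
    return static_cast<std::size_t>(idx < 0 ? idx + 32 : idx);
}

// Hypothetical scalar model of the v32float overload of fpshuffle16.
std::array<float, 16> fpshuffle16_model(const std::array<float, 32>& x,
                                        int xstart,
                                        std::uint32_t xoffsets,
                                        std::uint32_t xoffsets_hi) {
    std::array<float, 16> o{};
    for (int i = 0; i < 16; ++i) {
        // Lanes 0..7 read their nibble from xoffsets, lanes 8..15 from xoffsets_hi.
        const std::uint32_t word = i < 8 ? xoffsets : xoffsets_hi;
        const unsigned offset = (word >> (4 * (i % 8))) & 0xFu;
        o[static_cast<std::size_t>(i)] = x[f_model32(xstart, offset)];
    }
    return o;
}

// Example: reverse the 16 lanes that begin at `start`.
// Output lane i uses offset 15 - i, packed as (0x89ABCDEF, 0x01234567).
std::array<float, 16> reverse16(const std::array<float, 32>& x, int start) {
    return fpshuffle16_model(x, start, 0x89ABCDEFu, 0x01234567u);
}
```

In this model, reverse16(x, 16) returns x[31] in lane 0 and x[16] in lane 15.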

v16float fpshuffle16	(	v16float	xbuff,
		int	xstart,
		unsigned int	xoffsets,
		unsigned int	xoffsets_hi
	)

Performs a floating point shuffle between lanes of xbuff.

for (int i = 0; i < 16; i++) {
    idx = f(xstart, xoffsets[i]);
    o[i] = x[idx];
}

xoffsets and xoffsets_hi hold 8 offset values each, 4 bits per offset. In the loop above, xoffsets[i] for i >= 8 denotes xoffsets_hi[i - 8].
For example, for the v16float output type, idx for output lane 0 = f(xstart, xoffsets[0]).
For example, for the v16float output type, idx for output lane 15 = f(xstart, xoffsets_hi[7]).
In the case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers, refer to the lane selection note below.

Parameters

Input/Output	Type	Comments
return	v16float	Each output lane holds the result of a floating-point shuffle between lanes of xbuff; the result for lane 0 goes to lane 0 of the output.
xbuff	v16float	Input buffer of 16 single-precision elements
xstart	int	Starting position offset applied to all lanes of the input from the X buffer
xoffsets	unsigned int	4-bit offset for each lane, applied to the X buffer. The LSBs apply to the first lane
xoffsets_hi	unsigned int	4-bit offset for each lane, applied to the X buffer. The LSBs apply to lane 8 (zero-based)

Note

For more information on how the function f() selects data from the buffers, refer to the lane selection documentation.

v8cfloat fpshuffle8	(	v16cfloat	xbuff,
		int	xstart,
		unsigned int	xoffsets
	)

Performs a floating point shuffle between lanes of xbuff.

for (int i = 0; i < 8; i++) {
    idx = f(xstart, xoffsets[i]);
    o[i] = x[idx];
}

xoffsets holds 8 offset values, 4 bits per offset (of which 3 bits are used). This overload has no xoffsets_hi.
For example, for the v8cfloat output type, idx for output lane 0 = f(xstart, xoffsets[0]).
For example, for the v8cfloat output type, idx for output lane 7 = f(xstart, xoffsets[7]).
In the case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers, refer to the lane selection note below.

Parameters

Input/Output	Type	Comments
return	v8cfloat	Each output lane holds the result of a floating-point shuffle between lanes of xbuff; the result for lane 0 goes to lane 0 of the output.
xbuff	v16cfloat	Input buffer of 16 single-precision complex elements
xstart	int	Starting position offset applied to all lanes of the input from the X buffer
xoffsets	unsigned int	3-bit (aligned to 4 bits) offset for each lane, applied to the X buffer. The LSBs apply to the first lane

Note

When xoffsets is a runtime parameter, it might be more efficient to use a non-complex fpshuffle instruction and calculate the offsets accordingly. In that case, both the real and the imaginary (real+1) lanes must be accounted for in the offsets.
For more information on how the function f() selects data from the buffers, refer to the lane selection documentation.
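
A scalar model of the complex shuffle may help when checking an offsets word by hand. The sketch below swaps adjacent complex lanes. `fpshuffle8_model` and `swap_pairs` are hypothetical names, std::complex<float> stands in for one cfloat lane, and f() is assumed to be start plus offset, wrapped to the 16-lane buffer.

```cpp
#include <array>
#include <cassert>
#include <complex>
#include <cstddef>
#include <cstdint>

using cfloat_model = std::complex<float>;  // stand-in for one cfloat lane

// Hypothetical scalar model of the v16cfloat overload of fpshuffle8.
// ASSUMPTION: f(start, offset) = (start + offset) modulo 16.
std::array<cfloat_model, 8> fpshuffle8_model(const std::array<cfloat_model, 16>& x,
                                             int xstart,
                                             std::uint32_t xoffsets) {
    std::array<cfloat_model, 8> o{};
    for (int i = 0; i < 8; ++i) {
        // One nibble per lane, of which only the low 3 bits are used.
        const unsigned offset = (xoffsets >> (4 * i)) & 0x7u;
        int idx = (xstart + static_cast<int>(offset)) % 16;
        if (idx < 0) idx += 16;
        o[static_cast<std::size_t>(i)] = x[static_cast<std::size_t>(idx)];
    }
    return o;
}

// Example: swap adjacent complex lanes of the 8-lane window at `start`.
// Per-lane offsets are 1, 0, 3, 2, 5, 4, 7, 6, packed as 0x67452301.
std::array<cfloat_model, 8> swap_pairs(const std::array<cfloat_model, 16>& x, int start) {
    return fpshuffle8_model(x, start, 0x67452301u);
}
```

In this model, swap_pairs(x, 8) returns x[9] in lane 0 and x[8] in lane 1.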

v8cfloat fpshuffle8	(	v8cfloat	xbuff,
		int	xstart,
		unsigned int	xoffsets
	)

Performs a floating point shuffle between lanes of xbuff.

for (int i = 0; i < 8; i++) {
    idx = f(xstart, xoffsets[i]);
    o[i] = x[idx];
}

xoffsets holds 8 offset values, 4 bits per offset (of which 3 bits are used). This overload has no xoffsets_hi.
For example, for the v8cfloat output type, idx for output lane 0 = f(xstart, xoffsets[0]).
For example, for the v8cfloat output type, idx for output lane 7 = f(xstart, xoffsets[7]).
In the case of v32int16, 1 offset is used for 2 adjacent lanes.
For more information on how the function f() selects data from the buffers, refer to the lane selection note below.

Parameters

Input/Output	Type	Comments
return	v8cfloat	Each output lane holds the result of a floating-point shuffle between lanes of xbuff; the result for lane 0 goes to lane 0 of the output.
xbuff	v8cfloat	Input buffer of 8 single-precision complex elements
xstart	int	Starting position offset applied to all lanes of the input from the X buffer
xoffsets	unsigned int	3-bit (aligned to 4 bits) offset for each lane, applied to the X buffer. The LSBs apply to the first lane

Note

When xoffsets is a runtime parameter, it might be more efficient to use a non-complex fpshuffle instruction and calculate the offsets accordingly. In that case, both the real and the imaginary (real+1) lanes must be accounted for in the offsets.
For more information on how the function f() selects data from the buffers, refer to the lane selection documentation.
