AI Engine Intrinsics  (AIE) r2p21
Direct Digital Synthesis (DDS) Interpolation

Overview

This intrinsic function is used to perform Direct Digital Synthesis (DDS) interpolation between two points on a unit circle on the IQ plane.

Carrier extraction needs 8 sin/cos pairs equally spaced in phase. With this function, only every 8th sin/cos pair has to be calculated directly; an interpolating FIR filter produces the remaining pairs.

The function implements a 16 tap FIR filter with conjugate symmetric taps. It takes two sin/cos points as inputs and produces 8 points as outputs.

Functions

v8cacc48 dds_ipol (v16cint16 xbuf, int xidx1, int xidx2, int swap, v16int16 zbuf)
 This intrinsic function interpolates the input by 8, using a 16 tap complex symmetric FIR filter.
 
v8cacc48 dds_ipol (v32cint16 xbuf, int xidx1, int xidx2, int swap, v16int16 zbuf)
 

Function Documentation

v8cacc48 dds_ipol ( v16cint16  xbuf,
int  xidx1,
int  xidx2,
int  swap,
v16int16  zbuf 
)

This intrinsic function interpolates the input by 8, using a 16 tap complex symmetric FIR filter.

This intrinsic function is used by the Direct Digital Synthesis (DDS) implementation on the AIE to perform an interpolation by 8 between two points on a unit circle on the IQ plane.

PARAMETERS

Input/Output | Type | Valid bits | Comments
return | v8cacc48 | all | Output accumulator which holds the output delay line.
xbuf | v16/32cint16 | all | Vector of 16 or 32 complex values containing the two points on the IQ unit circle to interpolate between.
xidx1 | int | 5b LSB | Index of the first value to interpolate in xbuf. This must be a compile time constant.
xidx2 | int | 3b LSB | Index of the second value to interpolate in xbuf. This must be a compile time constant.
swap | int | 1b LSB | Parameter which swaps real and imaginary values of the data. If the swap flag is set, the bits representing the imaginary part of the data will come first. This must be a compile time constant.
zbuf | v16int16 | all | Vector of 16 integers containing the pregenerated filter taps to be used.
Note
Parameters 'xidx1', 'xidx2' and 'swap' must be compile time constants.

The two input points are taken from the vector xbuf at the points indicated by the two index parameters given, xidx1 and xidx2.

The output of the function is a vector of 8 complex values stored in an accumulator.

The effective calculation is as follows:

dsz = sizeof(xbuf)==128 ? 32 : 16
x0 = xbuf[xidx1 % dsz ]
x1 = xbuf[(xidx1 & (dsz-1)) | (xidx2 % 8)]
o(0) = c7*x0 + conj(c0)*x1
o(1) = c6*x0 + conj(c1)*x1
o(2) = c5*x0 + conj(c2)*x1
o(3) = c4*x0 + conj(c3)*x1
o(7) = c0*x0 + conj(c7)*x1
o(6) = c1*x0 + conj(c6)*x1
o(5) = c2*x0 + conj(c5)*x1
o(4) = c3*x0 + conj(c4)*x1

The function produces the 8 outputs in two cycles, since the calculation requires 64 multiplications of 16 bit values.
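The effective calculation can be sketched as a scalar reference model in plain C++. This is only an illustration, not the intrinsic itself: the function name `dds_ipol_direct_model`, the use of `std::complex<double>` in place of the 16 bit fixed-point types, and the tap array `c` holding c0 to c7 are assumptions made for this sketch, and the `swap` flag is not modeled.

```cpp
#include <array>
#include <cassert>
#include <complex>
#include <cstddef>

using cplx = std::complex<double>;

// Hypothetical scalar model of the "effective calculation"; not the intrinsic.
// DSZ is the number of complex samples in xbuf (16 or 32), c holds taps c0..c7.
template <std::size_t DSZ>
std::array<cplx, 8> dds_ipol_direct_model(const std::array<cplx, DSZ>& xbuf,
                                          unsigned xidx1, unsigned xidx2,
                                          const std::array<cplx, 8>& c) {
    static_assert(DSZ == 16 || DSZ == 32, "xbuf holds 16 or 32 samples");

    // Input selection exactly as in the formulas above.
    const cplx x0 = xbuf[xidx1 % DSZ];
    const cplx x1 = xbuf[(xidx1 & (DSZ - 1)) | (xidx2 % 8)];

    std::array<cplx, 8> o{};
    for (std::size_t k = 0; k < 8; ++k) {
        // o(k) = c(7-k)*x0 + conj(c(k))*x1
        o[k] = c[7 - k] * x0 + std::conj(c[k]) * x1;
    }
    return o;  // returned in natural order o(0)..o(7)
}
```

For example, with a 16 entry xbuf, xidx1 = 8 and xidx2 = 3 select x0 = xbuf[8] and x1 = xbuf[11].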

The complex coefficients in the effective calculation shown above are converted into a set of real coefficients, which are passed into this function as the v16int16 zbuf. It is this conversion that exploits the symmetric nature of the filter to reduce the number of real multiplications to 32.

This zbuf vector is pre-calculated and stored in AIE memory as follows:

z0=re(c0+c7)/2
z1=re(c1+c6)/2
...
z4 =im(c0+c7)/2
...
z11=re(c0-c7)/2
...
z15=im(c0-c7)/2

where z0 to z15 are the 16 elements of the input zbuf.

These z values are then used to form the intermediate values m0a to m3a and m0b to m3b:

m0a = z0*(x0+x1) + -j*z4*(x0-x1);
m1a = z1*(x0+x1) + -j*z5*(x0-x1);
m2a = z2*(x0+x1) + -j*z6*(x0-x1);
m3a = z3*(x0+x1) + -j*z7*(x0-x1);
m0b = z11*(x0-x1) + -j*z15*(x0+x1);
m1b = z10*(x0-x1) + -j*z14*(x0+x1);
m2b = z9*(x0-x1) + -j*z13*(x0+x1);
m3b = z8*(x0-x1) + -j*z12*(x0+x1);

These intermediate terms are then used to generate the final output as follows:

acc0 = m0a - m0b;
acc1 = m1a - m1b;
acc2 = m2a - m2b;
acc3 = m3a - m3b;
acc4 = m3a + m3b;
acc5 = m2a + m2b;
acc6 = m1a + m1b;
acc7 = m0a + m0b;
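The intermediate terms and the final sums can be written as a small scalar model. The sketch below is an assumption-laden illustration in plain C++: the name `dds_ipol_factored_model` is hypothetical, `double` stands in for the 16 bit taps and 48 bit accumulators, and the 16 zbuf entries are taken as given rather than derived, because the listing above spells out only z0, z4, z11 and z15. It follows the formulas exactly as printed. Evaluated literally with those four entries, acc0 comes out as conj(c7)*x0 + c0*x1 and acc7 as conj(c0)*x0 + c7*x1, that is, the o(0) and o(7) expressions with each tap conjugated; the sketch does not try to settle that sign convention.

```cpp
#include <array>
#include <cassert>
#include <complex>
#include <cstddef>

using cplx = std::complex<double>;

// Hypothetical scalar model of the factored form; not the intrinsic.
// z holds the 16 real zbuf entries z0..z15.
std::array<cplx, 8> dds_ipol_factored_model(cplx x0, cplx x1,
                                            const std::array<double, 16>& z) {
    const cplx minus_j(0.0, -1.0);
    const cplx s = x0 + x1;  // shared sum term
    const cplx d = x0 - x1;  // shared difference term

    std::array<cplx, 4> ma{}, mb{};
    for (std::size_t k = 0; k < 4; ++k) {
        ma[k] = z[k] * s + minus_j * z[k + 4] * d;        // uses z0..z3, z4..z7
        mb[k] = z[11 - k] * d + minus_j * z[15 - k] * s;  // uses z11..z8, z15..z12
    }

    std::array<cplx, 8> acc{};
    for (std::size_t k = 0; k < 4; ++k) {
        acc[k] = ma[k] - mb[k];      // acc0..acc3
        acc[7 - k] = ma[k] + mb[k];  // acc7..acc4
    }
    return acc;
}
```

Each intermediate term multiplies a real coefficient by a complex value, which costs 2 real multiplications; 16 such products give the 32 real multiplications mentioned above.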

NB: To allow more efficient movement of values between registers, the outputs o(4) to o(7) are reversed, since the contents of the adders then do not need to be changed between the two operations of this function:

o(4)->o(7)
o(5)->o(6)
o(6)->o(5)
o(7)->o(4)

Because of this, the function that follows dds_ipol must undo the reversal.
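Undoing the reversal amounts to swapping lanes 4 and 7 and lanes 5 and 6. A minimal sketch in plain C++ follows, assuming the eight results have already been copied out of the accumulator into an array; the helper name `restore_lane_order` is hypothetical and not part of the intrinsics API. Because the permutation is its own inverse, the same helper applies in either direction.

```cpp
#include <array>
#include <cassert>
#include <utility>

// Hypothetical helper: reverse the upper four lanes of an 8 lane result.
// The lower four lanes o(0)..o(3) are left untouched.
template <typename T>
std::array<T, 8> restore_lane_order(std::array<T, 8> lanes) {
    std::swap(lanes[4], lanes[7]);
    std::swap(lanes[5], lanes[6]);
    return lanes;
}
```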

v8cacc48 dds_ipol ( v32cint16  xbuf,
int  xidx1,
int  xidx2,
int  swap,
v16int16  zbuf 
)