Advances in Wireless Communication Class Notes (Part 1)

Basics of error-correcting codes

Generations of wireless communications

| Generation | Time | Technique | Max speed |
| --- | --- | --- | --- |
| 1G | early 1980s | analog, FM | 12 kbps |
| 2G | 1991 | digital, TDMA | 50 kbps ~ 1 Mbps |
| 3G (includes UMTS, CDMA2000 & 3GPP) | 1998 | CDMA | 20 Mbps |
| 4G (LTE) | 2008 | OFDM | 1 Gbps |
| 5G | 2020 | | up to 10 Gbps |

communication system

[figure: communication-system block diagram]
The channel is characterized by its transition probabilities
$Pr\{Y=y \mid X=x\}$ for any $x \in \mathcal{X},\ y \in \mathcal{Y}$; here $\mathcal{X}=\{0,1\}$ and $\mathcal{Y}=\mathbb{R}$.
In the case of continuous $y$, we use the conditional pdf.

Channel model

  1. BSC: binary symmetric channel
    [figure: BSC transition diagram]
    capacity (bits per channel use): $1-h_2(p)$, where $h_2(\cdot)$ is the binary entropy function

  2. BEC: binary erasure channel
    [figure: BEC transition diagram]
    capacity: $1-\varepsilon$

  3. AWGN: additive white Gaussian noise channel
    [figure: AWGN channel diagram]
    $x \in \{-1,+1\}$ (binary), or continuous $x \in \mathbb{R}$
    $x$ is subject to the power constraint $E[X^2]=P$
    $P$ is the transmit power
    capacity: $\frac{1}{2}\log_2(1+\mathrm{SNR})$ bits per channel use, with $\mathrm{SNR}=\frac{P}{\sigma^2}$
    also $W\log_2(1+\mathrm{SNR})$ bits per second, where $W$ is the bandwidth

  4. Rayleigh fading channel
    $x \in \mathcal{X}$, $y \in \mathbb{R}$, $y = ax + n$, where $a$ is a random variable following a Rayleigh distribution with scale parameter $\tau^2$
    $n \sim \mathcal{N}(0, \sigma^2)$
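As a quick illustration (not from the lecture itself), the four channel models can be simulated in a few lines of NumPy; the parameter names p, eps, sigma2, tau2 mirror the notation above.

import numpy as np

rng = np.random.default_rng(0)

def bsc(x, p):
    # binary symmetric channel: each bit flips independently with probability p
    return x ^ (rng.random(x.shape) < p).astype(int)

def bec(x, eps):
    # binary erasure channel: erased positions are marked with NaN
    y = x.astype(float)
    y[rng.random(x.shape) < eps] = np.nan
    return y

def awgn(x, sigma2):
    # AWGN channel with BPSK input x in {-1, +1} and noise variance sigma2
    return x + rng.normal(0.0, np.sqrt(sigma2), x.shape)

def rayleigh(x, tau2, sigma2):
    # y = a*x + n with a ~ Rayleigh(scale tau) and n ~ N(0, sigma2)
    a = rng.rayleigh(np.sqrt(tau2), x.shape)
    return a * x + rng.normal(0.0, np.sqrt(sigma2), x.shape)

bits = rng.integers(0, 2, 8)
print(bsc(bits, 0.1), bec(bits, 0.3), awgn(2*bits - 1, 0.5), sep="\n")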

code design

Code: a structured subset of an ambient set; the collection of all codewords.
Encoder: a mapping between the set of messages and the set of codewords.
Decoder: given an element $y \in A$ ($y$ is the received symbol, or a sequence of such $y$'s), find the "most likely" codeword/message.

[figure: encoder/channel/decoder chain]
m: message
c: codeword
C: code
Minimize the probability of error $Pr\{\hat{m} \ne m\}$ through the structure of the code.

A natural algebraic structure to work with is a linear subspace of an ambient vector space.

A linear code C of dimension k is a k-dimensional subspace of the ambient space $F^n$; here $F$ is the binary field $\{0, 1\}$.

Each element $c \in C$ is represented as a vector of length n, $c = (c_1, c_2, c_3, \dots, c_n)$, $c_i \in F$; c is a binary sequence, and n is the length of the code.

An $(n,k)$ code: n is the block length, k is the dimension of the code, $k \le n$.

Example:
Let C be an $(n, n-1)$ linear code as follows
[figure: single-parity-check code]

C is the single-parity-check code.

rate: bits per channel/symbol use

Rate of a code C of length n over an alphabet of size $q$: $rate(C) = \frac{\log_q |C|}{n}$; with $|C| = q^k$ this equals $\frac{k}{n}$.
$q^k$: size of a code of dimension k
$q$: alphabet size

Hamming distance $d_H(x,y)$
$d_H(x,y)$ = number of positions (bits) in which $x$ and $y$ differ
$x$: transmitted, $y$: received

properties:

  1. $d_H(x,y) \ge 0$
  2. $d_H(x,y) = 0 \iff x = y$
  3. $d_H(x,y) = d_H(y,x)$
  4. triangle inequality: $d_H(x,z) \le d_H(x,y) + d_H(y,z)$

Hamming weight: number of non-zero entries of $\vec{x}$
Hamming weight of a vector $\vec{x}$: $w_H(x) = d_H(x, \mathbf{0})$
$\mathbf{0}$: the all-zero word

Minimum distance of a code:
$d_{min}(C) = \min_{x, x' \in C,\ x \ne x'} d_H(x, x')$

For a linear code C, $d_{min}(C) = \min_{c \in C,\ c \ne 0} w_H(c)$

So to find the minimum distance of a linear code, we just need to find the minimum distance of the non-zero codewords to the all-zero codeword, i.e. the minimum weight.

Theorem (worst-case guarantee): Let $d = d_{min}(C)$; then C can correct up to $\lfloor\frac{d-1}{2}\rfloor$ errors.

Approach to design code

Construct codes with maximum distance, given a certain rate (or length and size).
Code families: Turbo codes (used in 3G/4G), LDPC and polar codes (used in 5G).

linear code approach
Consider a basis for an (n,k) linear code C over a field $F$, denoted by $c_1, c_2, \dots, c_k$:
$C = \{\lambda_1 c_1 + \lambda_2 c_2 + \dots + \lambda_k c_k \mid \lambda_i \in F\}$
Let $G = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_k \end{bmatrix}_{k \times n}$, a generator matrix for the code C.
$c = (\lambda_1, \lambda_2, \dots, \lambda_k) \cdot G$
$C = \{vG \mid v \in F^k\}$, $v$: message vector
The generator matrix is not unique.

encoding mapping: $v \rightarrow vG$
$v$: message of length $k$ ($k$ bits)
$vG$: encoded codeword

example:
single-parity-check code: $(x_1, x_2, \dots, x_{n-1}) \rightarrow (x_1, x_2, \dots, x_{n-1}, \sum_{i=1}^{n-1} x_i)$
[figure: generator matrix]
(the region left of the dashed line can be any values)

systematic encoder: every encoded codeword contains the original message, as follows:
message = $(u_1, u_2, \dots, u_k)$, codeword = $(u_1, u_2, \dots, u_k, x_{k+1}, \dots, x_n)$
so $G = \begin{bmatrix} I_{k \times k} \mid A_{k \times (n-k)} \end{bmatrix}_{k \times n}$,
no matter what the matrix A is.
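A minimal encoding sketch (the matrix A below is an arbitrary illustration, not one from the notes):

import numpy as np

k, n = 3, 6
A = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 1, 1]])                  # any k x (n-k) binary matrix works
G = np.hstack([np.eye(k, dtype=int), A])   # systematic form [I_k | A]

u = np.array([1, 0, 1])                    # k-bit message
c = u @ G % 2                              # codeword; first k bits equal the message
print(c)                                   # -> [1 0 1 1 1 0]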

Theorem: Every linear code has a systematic encoder, up to a permutation of the code bits; this lets us design the generator matrix accordingly.

For a code C with generator matrix $G_{k \times n}$, let $H_{(n-k) \times n}$ denote a basis of the kernel of $G_{k \times n}$: $GH^T = 0_{k \times (n-k)}$
All rows of G are orthogonal to all rows of H.
H: the parity-check matrix for C

Note: In a binary field, non-zero vectors can be self-orthogonal. Any binary vector of even Hamming weight is self-orthogonal.

Example:
For the single-parity-check code C with $G_{(n-1) \times n}$, $H = [1, 1, \dots, 1]_{1 \times n}$. In general, for a systematic $G = \begin{bmatrix} I_{k \times k} \mid A_{k \times (n-k)} \end{bmatrix}_{k \times n}$ we have $H = [-A^T \mid I_{(n-k) \times (n-k)}]_{(n-k) \times n}$

Example:
Let C be a binary linear (6, 3) code with the generator matrix
$G = \begin{bmatrix} 1&0&1&1&0&1 \\ 0&1&0&1&1&0 \\ 0&0&1&0&0&1 \end{bmatrix}$
a. Find a systematic generator matrix for C.
systematic form: $G_{sys} = \begin{bmatrix} 1&0&0&1&0&0 \\ 0&1&0&1&1&0 \\ 0&0&1&0&0&1 \end{bmatrix}$
b. Find a parity-check matrix for C.
$G_{sys} = \begin{bmatrix} I_{k \times k} \mid A_{k \times (n-k)} \end{bmatrix}_{k \times n}$, so $H = [-A^T \mid I_{(n-k) \times (n-k)}]_{(n-k) \times n}$:
$H = \begin{bmatrix} 1&1&0&1&0&0 \\ 0&1&0&0&1&0 \\ 0&0&1&0&0&1 \end{bmatrix}$
c. What is the minimum distance of C?
The minimum distance is at least two, since there is no zero column in $H$.
And we do have a codeword of weight 2 (the third row of $G$), so $d_{min}(C) = 2$.

Lemma: let C be a linear (n,k) code with parity-check matrix H; then $c \in C \iff Hc^T = 0$
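This membership test is one line of linear algebra; the sketch below reuses the H derived in the (6, 3) example above.

import numpy as np

H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 0, 0, 1, 0],
              [0, 0, 1, 0, 0, 1]])

def is_codeword(c, H):
    # c is in C  <=>  H c^T = 0 (all arithmetic mod 2)
    return not np.any(H @ c % 2)

print(is_codeword(np.array([1, 0, 0, 1, 0, 0]), H))  # True: row 1 of G_sys
print(is_codeword(np.array([1, 1, 0, 0, 0, 0]), H))  # False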

Graphical model representation of decoding, message passing algorithms

linear code C

parity-check matrix $H_{(n-k) \times n}$
$n$: block length
$k$: # of information bits; $c \in C \Leftrightarrow Hc^T = 0$
[figure]
Each row of H is a parity-check equation.

For any $y \in F^n$, the syndrome of $y$ with respect to the code C with its parity-check matrix H is defined as $Hy^T$.
$Hy^T$: a vector of size $(n-k) \times 1$
number of possible syndromes: $2^{n-k}$

$H(a_i + c_j)^T = Ha_i^T + Hc_j^T$,
where $Hc_j^T = 0$

Let $S_1, S_2, \dots, S_{2^{n-k}}$ denote all possible syndromes; also let $a_i$ be the minimum-weight vector with $Ha_i^T = S_i$.

| Coset leader | Standard array | Syndrome |
| --- | --- | --- |
| $a_1$ | $a_1+c_1\ \dots\ a_1+c_{2^k}$ | $S_1$ |
| $a_2$ | $a_2+c_1\ \dots\ a_2+c_{2^k}$ | $S_2$ |
| $\dots$ | $\dots$ | $\dots$ |
| $a_{2^{n-k}}$ | $a_{2^{n-k}}+c_1\ \dots\ a_{2^{n-k}}+c_{2^k}$ | $S_{2^{n-k}}$ |

$c_1, c_2, \dots, c_{2^k}$ denote all the codewords, so the standard array covers every possible $y$ that can be received.

Syndrome decoding - only bit-flip errors

$y$: received vector (binary)

  1. compute the syndrome of $y$: $Hy^T$
  2. locate $S_i = Hy^T$ in the standard array, with coset leader $a_i$
  3. output codeword $c = y - a_i$; $a_i$: error pattern

The syndrome decoder is a minimum-distance decoder, which maps $y$ to the closest codeword. Let $d_{min}(C) = d$; then all binary vectors of weight up to $\lfloor\frac{d-1}{2}\rfloor$ will be among the coset leaders.
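For small codes the standard array never has to be written out in full; a sketch that builds only the coset-leader table (again for the (6, 3) example code) could look like this:

import numpy as np
from itertools import combinations

H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 0, 0, 1, 0],
              [0, 0, 1, 0, 0, 1]])
n = H.shape[1]

def syndrome(v):
    return tuple(H @ v % 2)

# coset leader = minimum-weight vector producing each syndrome,
# found by enumerating error patterns in order of increasing weight
leaders = {}
for w in range(n + 1):
    for pos in combinations(range(n), w):
        e = np.zeros(n, dtype=int)
        e[list(pos)] = 1
        leaders.setdefault(syndrome(e), e)

def syndrome_decode(y):
    # subtract the coset leader (error pattern) matching y's syndrome
    return (y - leaders[syndrome(y)]) % 2

y = np.array([1, 1, 0, 1, 1, 0])   # codeword [0 1 0 1 1 0] with the first bit flipped
print(syndrome_decode(y))          # -> [0 1 0 1 1 0]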

Maximum likelihood decoder

consider a BSC(p), p < 0.5
[figure: BSC(p)]
$Pr\{\text{receiving } y \mid c \text{ is transmitted}\} = p^w (1-p)^{n-w} = \left(\frac{p}{1-p}\right)^w (1-p)^n$,
where $w = w_H(y - c)$ is the number of positions in which $y$ and $c$ differ.
ML decoder $\Leftrightarrow$ maximize the probability $Pr\{y \mid c\}$ over $c$ $\Leftrightarrow$ minimize $w$ $\Leftrightarrow$ minimum-distance decoder $\Leftrightarrow$ syndrome decoder

For the BSC, these decoders are all the same.

Note that ML decoding has exponential (in n) complexity.
Syndrome decoding also needs to search within an array of exponential size $\Rightarrow$ exponential complexity.

LDPC code: A low-density parity-check code is a binary, linear block code for which the parity-check matrix is sparse (both rows and columns are sparse in terms of # of 1's).

A regular LDPC code has an equal # of 1's in each row ($w_r$) and an equal # of 1's in each column ($w_c$).

Note that $w_c \cdot n = w_r \cdot m$ for $H_{m \times n}$.

[figure]
With $m \ge n-k$ for an (n,k) code, this code is referred to as a ($w_c, w_r$) regular LDPC code.

Example:
A (2,4) regular LDPC code, n = 10, m = 5, k = 6, $w_c = 2$, $w_r = 4$
$H = \begin{bmatrix} 1&1&1&1&0&0&0&0&0&0 \\ 1&0&0&0&1&1&1&0&0&0 \\ 0&1&0&0&1&0&0&1&1&0 \\ 0&0&1&0&0&1&0&1&0&1 \\ 0&0&0&1&0&0&1&0&1&1 \end{bmatrix}_{5 \times 10}$
$rank(H) = n - k = 4$, $k = n - rank(H)$

Gallager's early work, Gallager's decoder

There exists a sequence of (regular) LDPC codes with increasing length, positive rate $k/n > 0$, and positive $d_{min}/n > 0$.

Gallager's decoder (hard-decision bit-flipping decoder)

  1. fix a threshold S (to be optimized)
  2. compute the syndrome bits $S_j$:
    $Hy^T = \begin{bmatrix} S_1 \\ S_2 \\ \vdots \\ S_m \end{bmatrix}$
    where y is the received vector
  3. if all $S_j$'s are 0, then stop
  4. otherwise, for each bit i, i = 1, 2, ..., n:
    $g_i$: number of non-zero syndrome bits that involve the i-th bit
  5. $A = \{i : g_i > S\}$
  6. flip bit i for all i in A and go back to step 2
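A compact sketch of these six steps (H is the (2,4) regular matrix from the example above; the threshold S = 1 is an illustrative choice):

import numpy as np

def bit_flip_decode(H, y, S, max_iter=20):
    y = y.copy()
    for _ in range(max_iter):
        s = H @ y % 2                      # step 2: syndrome bits
        if not s.any():                    # step 3: all checks satisfied
            return y
        # step 4: g_i = number of unsatisfied checks involving bit i
        g = H[s == 1, :].sum(axis=0)
        A = np.where(g > S)[0]             # step 5
        if A.size == 0:
            break
        y[A] ^= 1                          # step 6: flip and repeat
    return y

H = np.array([[1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
              [1, 0, 0, 0, 1, 1, 1, 0, 0, 0],
              [0, 1, 0, 0, 1, 0, 0, 1, 1, 0],
              [0, 0, 1, 0, 0, 1, 0, 1, 0, 1],
              [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]])
y = np.zeros(10, dtype=int); y[0] = 1      # all-zero codeword with one bit flipped
print(bit_flip_decode(H, y, S=1))          # -> all zeros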

Belief propagation Algorithm

Belief propagation (BP) is a type of message-passing algorithm. It uses a Tanner graph representation of the code (a bipartite graph):
one part has a node for each code bit (variable node); the other part has a check node for each parity check.
[figure: Tanner graph]
There is an edge connecting $f_j$ to $x_i$ if the (j, i) entry of the matrix H is one.

For instance, over AWGN, $y_i = (2x_i - 1) + n_i$, $n_i \sim \mathcal{N}(0, \sigma^2)$.

If $f_1$ connects to $x_1, x_2, x_3$, then $x_1 + x_2 + x_3 = 0$;
in general, $x_i + x_{i'} + x_{i''} + \dots = 0$.

BP algorithm is an iterative decoding algorithm
In each iteration

  • each variable node sends a message to each check node
  • each check node sends a message to each variable node
  • each variable node updates its "belief" about $x_i$

Goal of decoding: compute $P(x_i = 0 \mid y_1, y_2, \dots, y_n \text{ and all parity checks being "0" (satisfied)})$
Also called: bit-MAP decoder
[figure: messages on the Tanner graph]
$q_{ij}(x) = P(x_i = x \mid y_i,\ \text{all the extrinsic information passed to } x_i \text{ from check nodes other than } f_j)$
$r_{ji}(x) = P(\text{parity check } f_j \text{ is satisfied} \mid x_i = x,\ \text{the other bits } X_{i'} \text{ connected to } f_j \text{ (other than } X_i\text{) are distributed according to } q_{i',j})$

How to compute $q$ and $r$:
initialization: $q_{i,j}(x) = P(X_i = x \mid Y_i = y_i)$, $x \in \{0, 1\}$

The ratio $\frac{P(X_i=0 \mid Y_i=y_i)}{P(X_i=1 \mid Y_i=y_i)}$ is the likelihood ratio for making a decision. In practice, we work with the log-likelihood ratio (LLR): if the LLR is positive, the ratio is > 1, and we decide $x_i = 0$.

Notations:
$P_i = P(X_i = 1 \mid Y_i = y_i)$; $L(X_i)$ is the corresponding quantity in the LLR domain
$R_j$: indices of the 1's in row j of H
$C_i$: indices of the 1's in column i of H
$R_{j \backslash i}$: $R_j$ excluding i (for example, if row 1 is [0 1 1 0 1], then $R_{1 \backslash 2} = \{3, 5\}$)

Lemma: Let $(a_1, a_2, \dots, a_L)$ be independent binary random variables with $P(a_i = 1) = P_i$. Then (with sums taken mod 2):
$P(\sum_{i=1}^L a_i = 0) = \frac{1}{2} + \frac{1}{2}\prod_{i=1}^L (1 - 2P_i)$
$P(\sum_{i=1}^L a_i = 1) = \frac{1}{2} - \frac{1}{2}\prod_{i=1}^L (1 - 2P_i)$
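The lemma is easy to verify numerically by brute-force enumeration (the $P_i$ values below are arbitrary):

import itertools
import numpy as np

P = [0.1, 0.3, 0.25]                       # P(a_i = 1), arbitrary example values
L = len(P)

# total probability of all outcomes with even parity
p_even = sum(np.prod([p if b else 1 - p for p, b in zip(P, bits)])
             for bits in itertools.product([0, 1], repeat=L)
             if sum(bits) % 2 == 0)

formula = 0.5 + 0.5 * np.prod([1 - 2 * p for p in P])
print(p_even, formula)                     # both 0.58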

message passing:
$r_{j,i}(0) = \frac{1}{2} + \frac{1}{2}\prod_{i' \in R_{j \backslash i}} (1 - 2q_{i',j}(1))$
$r_{j,i}(1) = 1 - r_{j,i}(0)$
$\frac{q_{i,j}(0)}{q_{i,j}(1)} = \frac{1 - P_i}{P_i} \prod_{j' \in C_{i \backslash j}} \frac{r_{j',i}(0)}{r_{j',i}(1)}$
$L(q_{i,j}) = \log\left(\frac{q_{i,j}(0)}{q_{i,j}(1)}\right)$, $L(r_{j,i}) = \log\left(\frac{r_{j,i}(0)}{r_{j,i}(1)}\right)$
$\Rightarrow \begin{cases} L(q_{i,j}) = L(X_i) + \sum_{j' \in C_{i \backslash j}} L(r_{j',i}) \\ L(r_{j,i}) = 2\tanh^{-1}\left(\prod_{i' \in R_{j \backslash i}} \tanh\left(\frac{1}{2} L(q_{i',j})\right)\right) \\ \text{update belief of the } X_i\text{'s: } L_{new}(X_i) = L(X_i) + \sum_{j \in C_i} L(r_{j,i}) \end{cases}$

The check-node step can also be written as
$\alpha_{i,j} = \mathrm{sign}(L(q_{i,j}))$, $\beta_{i,j} = |L(q_{i,j})|$
$\phi(x) = \log\left(\frac{e^x + 1}{e^x - 1}\right)$; $\phi$ is self-inverse: $\phi^{-1} = \phi$
$L(r_{j,i}) = \prod_{i' \in R_{j \backslash i}} \alpha_{i',j} \cdot \phi^{-1}\left(\sum_{i' \in R_{j \backslash i}} \phi(\beta_{i',j})\right)$
Min-Sum approximation: approximate $\phi^{-1}\left(\sum_{i' \in R_{j \backslash i}} \phi(\beta_{i',j})\right)$ with $\min_{i' \in R_{j \backslash i}} \beta_{i',j}$; the exact value is smaller than this minimum.

offset Min-Sum approximation
$L(r_{j,i}) = \prod_{i' \in R_{j \backslash i}} \alpha_{i',j} \cdot \left(\min_{i' \in R_{j \backslash i}} \beta_{i',j} - \alpha\right)$
$\alpha$: a constant to be optimized per application
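A sketch of the check-node update with the (offset) min-sum approximation; Lq holds the incoming $L(q_{i',j})$ values at one check node:

import numpy as np

def min_sum_check_update(Lq, offset=0.0):
    # returns L(r_{j,i}) for each attached bit i, excluding i's own message
    Lq = np.asarray(Lq, dtype=float)
    out = np.empty_like(Lq)
    for i in range(len(Lq)):
        others = np.delete(Lq, i)                    # i' in R_j \ i
        sign = np.prod(np.sign(others))
        mag = max(np.min(np.abs(others)) - offset, 0.0)
        out[i] = sign * mag
    return out

print(min_sum_check_update([-1.0, 2.0, 3.0]))        # -> [ 2. -1. -1.]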

Example:
BP decoding with the Min-Sum approximation for a (2,3) regular LDPC code:
$H = \begin{bmatrix} 1&1&1&0&0&0 \\ 1&0&0&1&1&0 \\ 0&1&0&1&0&1 \\ 0&0&1&0&1&1 \end{bmatrix}_{4 \times 6}$
$n = 6$, $m = 4$, $k = 3$, $w_c = 2$, $w_r = 3$
$rank(H) = 3 \implies k = 6 - 3 = 3$
Tanner graph representation:
a "1" in the H matrix means a connection between the check node and the variable node.
[figure: Tanner graph]
Suppose we have $L(X_i) = -1, 2, 3, -4, 4, 1$ for $i = 1, 2, 3, 4, 5, 6$.
[figure: initial LLRs]
What is the updated belief of each $X_i$ after one iteration of BP with the Min-Sum approximation?
Write down the $q_{i,j}$ matching the connections, $q_{i,j} = L(X_i)$ (in black; ignoring the arrow direction, $q_{i,j}$ is the message from "circle" (variable node) to "square" (check node)).
[figure: q messages]
Find the $r_{j,i}$ (numbers in red).
[figure: r messages]

updated beliefs $L^{(new)}(X_i)$
[figure: updated beliefs]

complexity: BP can be used (in principle) to decode any linear code given H. For an LDPC code of "constant" degree (with respect to n), the complexity of each iteration is $O(n)$; otherwise, for a general code, it is $O(n^2)$.

The exact LLR calculation (max-product) is rather complex and is often approximated, but the approximation works well when there are only "a few" terms (in the sum of $\phi$'s).

If the length of the shortest cycle is $2l$, the LLR equations hold for up to $l$ iterations.
Another issue with BP for a general code is that the Tanner representation is dense $\Rightarrow$ it will have too many short cycles. Short cycles adversely affect the performance of BP, since the independence of the $r_{j,i}$'s (for a fixed i) or the $q_{i,j}$'s (for a fixed j) would be violated.

  • when to stop
    In practice, after a fixed number of iterations (usually in the range between 5 and 20).
  • early stopping
    Maybe check H to see if the parity-check equations are satisfied after making hard decisions on the $X_i$'s (according to the updated beliefs).
  • drawback
    Expensive.

CRC: cyclic redundancy check

An $m \times n$ parity-check matrix; LLR $= \log\left(\frac{P(X_i = 0 \mid Y_j\text{'s})}{P(X_i = 1 \mid Y_j\text{'s})}\right)$
if LLR > 0, $\hat{X_i} = 0$
if LLR < 0, $\hat{X_i} = 1$
This step is called making the hard decision.

Compute $H \begin{bmatrix} \hat{X_1} \\ \hat{X_2} \\ \vdots \\ \hat{X_n} \end{bmatrix}$:
if it equals 0, the decoder is done; output the $\hat{X_i}$'s.
If it does not equal 0, we still need to continue.
This process is complex - comparable in complexity to one iteration of BP decoding.

So we want an alternative solution (to stop), called CRC: cyclic redundancy check.
A few extra bits of redundancy (8, 12, 16, or 24) using a cyclic code - an algebraic code.
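For illustration, a CRC can be computed by polynomial long division over GF(2); the degree-8 generator polynomial below is an arbitrary example, not the one standardized in 5G.

def crc_remainder(bits, poly=0b100000111, deg=8):
    # remainder of the bit polynomial (MSB first) modulo poly
    reg = 0
    for b in bits:
        reg = (reg << 1) | b
        if reg >> deg:              # degree reached: subtract (XOR) the generator
            reg ^= poly
    return reg

def crc_append(data, poly=0b100000111, deg=8):
    # append deg remainder bits of data(x) * x^deg mod poly(x)
    rem = crc_remainder(data + [0] * deg, poly, deg)
    return data + [(rem >> i) & 1 for i in range(deg - 1, -1, -1)]

tx = crc_append([1, 0, 1, 1, 0, 0, 1])
print(crc_remainder(tx) == 0)       # True: valid CRC
tx[0] ^= 1
print(crc_remainder(tx) == 0)       # False: error detected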

If CRC = 8 bits, $H' = \begin{bmatrix} 1\,0\,0\,1\,\dots \\ 0\,1\,0\,0\,1\,\dots \\ \dots \end{bmatrix}_{8 \times n}$
Each row of the matrix H' is a cyclic shift of the previous one; CRC length = $l$ (here 8).

Overall H of the code + CRC: $\begin{bmatrix} H \\ H' \end{bmatrix}_{(m+l) \times n}$
(n,k) linear code + l bits of CRC $\Rightarrow$ # of information bits = $k - l$

$\begin{bmatrix} H_{(n-k) \times n} \\ H'_{l \times n} \end{bmatrix}_{(n-k+l) \times n}$

We can check the CRC equations at the end of each iteration:
if the CRC passes: stop the decoder; if the CRC does not pass: run the next iteration.

At the end of decoding, we also check the CRC to see if we have a "valid" codeword c:
valid codeword: $Hc^T = 0$ (valid code), $H'c^T = 0$ (valid CRC)
probability of an undetected CRC failure: $\frac{1}{2^l}$

Transport block
[figure: transport block segmentation]
Each code block has its own CRC;
the entire transport block has another CRC.

LDPC code over BEC

when decoding over BEC, LLRs do not matter as each coded bit is either known or erased.
[figure]
$q_{ij} = \begin{cases} X_i & \text{if } X_i \text{ is known} \\ e & \text{if } X_i \text{ is erased} \end{cases}$
[figure]
$r_{ji} = \sum_{i' \in R_{j \backslash i}} X_{i'} = \begin{cases} \text{known} & \text{if all } X_{i'}\ (i' \in R_{j \backslash i}) \text{ are known} \\ e & \text{otherwise} \end{cases}$

Update belief: if $X_i$ is erased, $X_i$ becomes known if at least one of the $r_{ji}$'s, $j \in C_i$, is known; otherwise it remains erased.
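These update rules amount to a "peeling" decoder: any check with exactly one erased neighbor resolves it as the XOR of its known neighbors. A sketch, using the (2,3) regular H from the earlier min-sum example and -1 to mark erasures:

import numpy as np

def bec_decode(H, y):
    # y holds bits 0/1 and -1 for erasures; fills erasures where possible
    y = y.copy()
    progress = True
    while progress:
        progress = False
        for row in H:
            idx = np.where(row == 1)[0]
            erased = idx[y[idx] == -1]
            if len(erased) == 1:               # exactly one unknown in this check
                known = idx[y[idx] != -1]
                y[erased[0]] = y[known].sum() % 2
                progress = True
    return y

H = np.array([[1, 1, 1, 0, 0, 0],
              [1, 0, 0, 1, 1, 0],
              [0, 1, 0, 1, 0, 1],
              [0, 0, 1, 0, 1, 1]])
y = np.array([1, -1, 0, -1, 1, 1])   # codeword [1 1 0 0 1 1] with two erasures
print(bec_decode(H, y))              # -> [1 1 0 0 1 1]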

Example:
[figures: step-by-step BEC decoding example]
A stopping set is a set of erased variables that cannot be corrected regardless of the other variables (even if all the others are known).
How does this happen?
Let G denote the set of check-node neighbors of the stopping set $V$; then every check node in G is connected to at least two variable nodes in $V$.
The minimum stopping set $V_{min}$ is the stopping set containing the fewest # of variable nodes.
Then the code can correct up to $|V_{min}| - 1$ erasures $\Rightarrow d_{min} \ge |V_{min}|$ (a code of minimum distance $d_{min}$ can correct up to $d_{min} - 1$ erasures).

Density evolution $\rightarrow$ over BEC(p)
Consider a ($w_c, w_r$) regular LDPC code. We track the probability that a variable node remains erased after the $l$-th iteration (assuming independence, valid for $l \le L/2$),

where $L$ is the length of the shortest cycle, also referred to as the "girth" of the Tanner graph. Denote this erasure probability by $\varepsilon_l$.

$\varepsilon_0 = p$
$\varepsilon_l = p \cdot (1 - (1 - \varepsilon_{l-1})^{w_r - 1})^{w_c - 1}$ for $l \ge 1$
$p$: probability that bit $X_i$ is originally erased by the channel
[figure]

$(1 - \varepsilon_{l-1})^{w_r - 1}$: the probability that all $X_{i'}$, $i' \in R_{j \backslash i}$, are not erased.
$1 - (1 - \varepsilon_{l-1})^{w_r - 1}$: the probability that $r_{ji}$ is erased.

Given the degree distribution ($w_c, w_r$), the threshold $\varepsilon^*$ is the maximum $p$ for which $\varepsilon_l \rightarrow 0$ as $l \rightarrow \infty$.
(as $n$ grows large, the girth grows large)
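The recursion and the threshold are straightforward to compute numerically; a sketch (the iteration counts and tolerances are arbitrary choices):

def erased_fraction(p, wc, wr, iters=3000):
    # run the density-evolution recursion for a fixed number of iterations
    eps = p
    for _ in range(iters):
        eps = p * (1 - (1 - eps) ** (wr - 1)) ** (wc - 1)
    return eps

def threshold(wc, wr, tol=1e-5):
    # bisect for the largest p where the erased fraction still dies out
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if erased_fraction(mid, wc, wr) < 1e-6:
            lo = mid          # erasures die out: mid is below the threshold
        else:
            hi = mid
    return lo

print(round(threshold(3, 6), 4))   # -> 0.4294, matching the (3,6) example below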

Example:
For the (3,6) regular LDPC code as $n \rightarrow \infty$, we have $\varepsilon^* = 0.4294$; the capacity of BEC(0.4294) is $0.5706$ while $R = \frac{1}{2}$, so the code operates below capacity.

For (d, 2d) regular LDPC codes, $\varepsilon^* \rightarrow 0.5$ as $d \rightarrow \infty$ (we can approach the capacity).
Assuming a random ensemble of all regular (d, 2d) LDPC codes, the girth grows large enough as $n$ grows large, with probability 1.

Note that:

  1. Density evolution only describes the asymptotic performance of the random ensemble of codes.
  2. Goals of a good H design: $\begin{cases} \text{high girth} \\ \text{high (or almost full) rank} \\ \text{large minimum stopping set} \end{cases}$

Channel coding techniques in 5G systems

Structure of LDPC in 5G

Protograph LDPC codes: lifting operation using a base matrix

For a lifting operation of size $z$, each check node is replaced by $z$ check nodes (and each variable node by $z$ variable nodes).
Then each edge in the Tanner graph is replaced by a shifted permutation matrix.
[figure]
The lifting keeps the degree distribution of the Tanner graph (regardless of z).
[figure]
lifting size: $z$
Each entry in the base matrix is a number from $-1, 0, 1, \dots, z-1$:
$-1 \Rightarrow$ no edges: a $z \times z$ all-zero matrix
$0 \Rightarrow$ the identity matrix; $i$ (for $i = 1, 2, \dots, z-1$) $\Rightarrow$ the permutation matrix obtained by cyclically shifting the identity by $i$
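A sketch of the lifting operation (the small base matrix is an arbitrary illustration):

import numpy as np

def lift(base, z):
    # entry -1 -> z x z all-zero block; entry s >= 0 -> identity shifted by s
    blocks = []
    for row in base:
        blocks.append([np.zeros((z, z), dtype=int) if s < 0
                       else np.roll(np.eye(z, dtype=int), s, axis=1)
                       for s in row])
    return np.block(blocks)

base = np.array([[0, 1, -1],
                 [2, -1, 0]])      # arbitrary small base matrix, z = 3
print(lift(base, 3))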

Example:
[figure: lifting example]

There are two types of base graphs/matrices in the 5G LDPC code:
$B_1$ of size 46 $\times$ 68, and $B_2$ of size 42 $\times$ 52.
There are lifting sizes up to 384; the maximum block length supported by 5G LDPC is 384 $\times$ 68 = 26112.

Polar codes (channel dependent)
Channel polarization theory: let W denote the channel BEC(p).
$W(y|x)$ denotes the probability of receiving $y$ given $x$:
[figure: BEC transition diagram]
$W(0|0) = 1-p$, $W(e|0) = p$, $W(1|0) = 0$
$W(1|1) = 1-p$, $W(e|1) = p$, $W(0|1) = 0$

[figure: combining two channel uses]

Now consider two channels: the channel that $u_1$ observes, and the channel that $u_2$ observes assuming $u_1$ is known.

[figure]
(the channel $u_1$ observes)

[figure]
(the channel $u_2$ observes, assuming $u_1$ is known)

$W^-$ is also a BEC, with erasure probability $1 - (1-p)^2 = 2p - p^2$, because $u_1$ is known/decoded if and only if both $y_1 (= u_1 + u_2)$ and $y_2 (= u_2)$ are non-erasures: $u_1 = y_1 + y_2 = u_1 + u_2 + u_2$.
$W^+$ is also a BEC, with erasure probability $p^2$, because $u_2$ is decoded if either $y_1$ or $y_2$ is a non-erasure (if both $y_1$ and $y_2$ are erased, $u_2$ is unknown).

Note

  1. The sum-capacity is preserved (the capacity of BEC(p) is $1-p$):
    $c(W^-) + c(W^+) = 1 - 2p + p^2 + 1 - p^2 = 2(1-p) = 2c(W)$
  2. Also, $2p - p^2 > p^2$ for $0 < p < 1$, so $W^+$ is better than $W^-$.
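Both facts can be checked in a couple of lines:

def split(p):
    # erasure probabilities of W^- and W^+ after one polarization step
    return 2*p - p*p, p*p

p = 0.5
z_minus, z_plus = split(p)
c = lambda z: 1 - z                        # BEC capacity
print(z_minus, z_plus)                     # 0.75 0.25
print(c(z_minus) + c(z_plus), 2 * c(p))    # sum-capacity preserved: 1.0 1.0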

Channel splitting operation

[figure: two-level channel combining, n = 4]
Then $W^{++}$ has input $u_4$ and output $y_1\,y_2\,y_3\,y_4\,u_1\,u_2\,u_3$;
$W^{+-}$ has input $u_3$ and output $y_1\,y_2\,y_3\,y_4\,u_1\,u_2$;
$W^{-+}$ has input $u_2$ and output $y_1\,y_2\,y_3\,y_4\,u_1$;
$W^{--}$ has input $u_1$ and output $y_1\,y_2\,y_3\,y_4$.

This can continue recursively: for $n = 2^m$, this is called the polarization transform of length $n$, denoted $P^{(n)}$; each recursion step goes from $n$ to $2n$.

[figure: recursive polarization transform]

$W^{++\dots+-\dots} \leftrightarrow W^{(i)}$ by writing $i-1$ in binary format of length $m$ ($m = \log_2 n$) and replacing "1" by "+" and "0" by "-".

The sum-capacity is preserved: $\sum_{i=1}^n c(W^{(i)}) = n \cdot c(W)$
(for symmetric channels)
The proof is by the chain rule of mutual information, assuming the input bits $u_i$ are uniform i.i.d. (independent and identically distributed).

polarization tree
[figure: polarization tree]
Channel polarization: as n grows large, the bit-channels become either completely noiseless (capacity goes to one) or completely noisy (capacity goes to zero)
(except for a vanishing fraction of bit-channels).
Furthermore, the fraction of noiseless channels $\rightarrow c(W)$.

Example:
BEC(0.5), n = 4
Let $z^{(i)}$ denote the erasure probability of $W^{(i)}$.
[figure: erasure probabilities for n = 4]
A proof of polarization for the BEC
Consider the $n = 2^m$ bit-channels (recall $\sum_{i=1}^n c(W^{(i)}) = n \cdot c(W)$) and define
$T_n = \frac{1}{n}\sum_{i=1}^n z^{(i)}(1 - z^{(i)})$
$z^{(i)}$: erasure probability of $W^{(i)}$, the $i$-th bit-channel.
It is sufficient to prove that $\lim_{n \rightarrow \infty} T_n = 0$.
[figure: one splitting step]

$z^2(1 - z^2) + (2z - z^2)(1 - 2z + z^2) = 2z(1-z)(1 - z(1-z))$
Define $\alpha_i = z^{(i)}(1 - z^{(i)})$.
$T_{2n} = \frac{1}{2n}\sum 2\alpha_i(1 - \alpha_i) = \frac{1}{n}\sum (\alpha_i - \alpha_i^2) = \frac{1}{n}\sum \alpha_i - \frac{1}{n}\sum \alpha_i^2 \le T_n - T_n^2$
Lemma: $\frac{\sum \alpha_i^2}{n} \ge \left(\frac{\sum \alpha_i}{n}\right)^2$
Note that the sequence $\{T_n\}_{n \ge 1}$ is positive and strictly decreasing $\rightarrow \lim_{n \rightarrow \infty} T_n$ exists.
Let $T_\infty = \lim_{n \rightarrow \infty} T_n$; then $T_\infty \le T_\infty - T_\infty^2 \Rightarrow T_\infty = 0$.

$\beta_n = \frac{|\{i : \varepsilon \le z^{(i)} \le 1 - \varepsilon\}|}{n}$ for some $\varepsilon > 0$
Note that $T_n \ge \beta_n \cdot \varepsilon(1 - \varepsilon)$.
For any fixed $\varepsilon$, $\beta_n \rightarrow 0$ since $T_n \rightarrow 0$.

Polarization transform

[figures: polarization transform circuits]
$x_1 = u_1 + u_2$
$x_2 = u_2$
$\rightarrow [x_1\;x_2] = [u_1\;u_2]\begin{bmatrix} 1&0 \\ 1&1 \end{bmatrix} = [u_1\;u_2]G_2$
$G_{2n} = \begin{bmatrix} G_n & 0_{n \times n} \\ G_n & G_n \end{bmatrix} = G_2 \otimes G_n$; in general, $G_n = \underbrace{G_2 \otimes G_2 \otimes \dots \otimes G_2}_{m \text{ times, } m = \log_2 n} = G_2^{\otimes m}$ (Kronecker power)

Kronecker product of $A_{m \times n}$ and $B_{p \times q}$:
$A \otimes B = \begin{bmatrix} a_{11}B & \dots & a_{1n}B \\ \vdots & & \vdots \\ a_{m1}B & \dots & a_{mn}B \end{bmatrix}_{mp \times nq}$
$G_4 = \begin{bmatrix} 1&0&0&0 \\ 1&1&0&0 \\ 1&0&1&0 \\ 1&1&1&1 \end{bmatrix}_{4 \times 4}$
(replacing "0" with "-1" yields a Hadamard matrix)
$\Rightarrow G_n^{-1} = G_n$, i.e. $G_n \times G_n = I_{n \times n}$ over GF(2): self-inverse
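A sketch building $G_n$ as a Kronecker power and checking the self-inverse property over GF(2):

import numpy as np
from functools import reduce

G2 = np.array([[1, 0],
               [1, 1]])

def G(n):
    m = int(np.log2(n))
    return reduce(np.kron, [G2] * m)       # G_2 kron'd with itself m times

G8 = G(8)
print(np.array_equal(G8 @ G8 % 2, np.eye(8, dtype=int)))   # True: G_n^{-1} = G_n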

Encoding complexity

Computing $u_{1 \times n} \times G_n$ can be done with $O(n \log(n))$ complexity using the recursive function below:

function x = G_multiplier(u)
  n = length(u);
  if n == 1
    x = u;
    return
  end
  x1 = G_multiplier(u(1:n/2));
  x2 = G_multiplier(u(n/2+1:n));
  x = [mod(x1 + x2, 2), x2];
end

(the computations of $x_1$ and $x_2$ can be done in parallel)
($x_1 + x_2$ is entry-wise addition over GF(2))

Output of the function: $x_{1 \times n} = u_{1 \times n} G_n$
$f(n)$: # of operations to compute $u_{1 \times n} G_n$:
$\begin{cases} f(n) = 2f(\frac{n}{2}) + \frac{n}{2} \Rightarrow f(n) = \frac{n \log_2(n)}{2} \\ f(1) = 0 \end{cases}$
latency (time needed, assuming parallelization):
$g(n)$: the latency of computing $uG$ with the function G_multiplier
$g(n) = g(\frac{n}{2}) + 1 \Rightarrow g(n) = \log_2(n)$ (fast enough)

polar code construction

length $n$, dimension $k$, channel $W$:
pick the indices of the $k$ "best" bit-channels $W^{(i)}$ in the polarization transform of length $n$.
The generator matrix for the ($n, k$) polar code associated with $W$:
from the matrix $G_{n \times n}$, select the rows that are indexed by the "good" bit-channels.

Polar encoder
[figure: polar encoder]

example:
n = 8, k = 4 for BEC(0.5)
[figure: bit-channel erasure probabilities for n = 8]
k = 4: pick the 4 best ones, indices 4, 6, 7, 8
[figure]
$u_{1 \times 8} = [0\;0\;0\;m_1\;0\;m_2\;m_3\;m_4]$
The message bits are $m_1, m_2, m_3, m_4$.
$\Rightarrow$ compute $u_{1 \times 8} G_8$ to get the encoded codeword

Decoding of polar codes

Successive cancellation decoder
Let A denote the set of indices of the "good" bit-channels selected for the code construction, $A \subseteq \{1, 2, \dots, n\}$. For $i = 1, 2, \dots, n$:
$\hat{u}_i$: decoded version of $u_i$
$\hat{u}_i = \begin{cases} 0 & \text{if } i \notin A \text{ (frozen bit)} \\ \text{ML decision of } u_i \text{ given } y_1, y_2, \dots, y_n \text{ and } \hat{u}_1, \hat{u}_2, \dots, \hat{u}_{i-1} & \text{if } i \in A \end{cases}$

Let $Pe(u_i) = Pe(W^{(i)})$ denote the probability of error in decoding $u_i$, assuming that $\hat{u}_1^{i-1} = u_1^{i-1}$ ($\hat{u}_1^{i-1}$: $\hat{u}_1, \hat{u}_2, \dots, \hat{u}_{i-1}$).
Lemma: Pe(the polar code associated with A and decoded with SC) $\le \sum_{i \in A} Pe(u_i)$
$Pe(u_i)$: probability of error of the individual bit-channel

PROOF:
By the union bound on the error events $\hat{u}_i \ne u_i$ for the first (smallest) such i. Going back to the construction of polar codes, there are two criteria:

  1. For a fixed rate: sort the bit-channels and pick the best $k = nR$ of them ($R$: given rate). Given the block length $n = 2^m$ and rate $R$, the dimension is $k = nR$. The polarization transform of length $n$ splits into $n$ bit-channels $W^{(1)}, W^{(2)}, \dots, W^{(n)}$. Sort them (according to capacity or probability of erasure) and pick the best $k$ of them; let $A$ = the set of indices of the selected/good ones.
  2. For a given bound on $Pe$: Pe(polar code associated with A under SC) $\le \sum_{i \in A} Pe(u_i)$, where $u_i$ corresponds to bit-channel $W^{(i)}$.
    Sort the bit-channels from best to worst: $u_{\pi(1)}, u_{\pi(2)}, \dots, u_{\pi(n)}$, with $\pi$ the sorting permutation according to $Pe(u_i)$.
    $u_n$ is always the best: $\pi(1) = n$.
    $u_1$ is always the worst: $\pi(n) = 1$.
    Then accumulate as many $u_{\pi(i)}$'s as possible (starting from $\pi(1)$) until the sum $\sum_{i=1}^k Pe(u_{\pi(i)})$ reaches the bound on $Pe$.

Example:
for n = 8, k = 4, BEC($\frac{1}{2}$)
[figure: sorted bit-channel error probabilities]
requirement: $Pe < \frac{1}{3}$
$\sum Pe = \frac{1}{256} + \frac{31}{256} + \frac{49}{256} = \frac{81}{256} < \frac{1}{3}$, good.
But adding $Pe(u_{\pi(4)})$ would make $\sum Pe$ greater than $\frac{1}{3}$, so $k = 3$ and the set of good bit-channels is $A = \{8, 7, 6\}$.
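The whole construction in this example can be reproduced with a short script (a sketch; the bit-channels are 1-indexed as in the notes):

import numpy as np

def bit_channel_erasures(p, m):
    # recursively split every channel into its W^- and W^+ children
    z = np.array([p])
    for _ in range(m):
        z = np.stack([2*z - z**2, z**2], axis=-1).ravel()
    return z

z = bit_channel_erasures(0.5, 3)          # n = 2^3 = 8
print(np.round(z, 4))                     # includes 1/256, 31/256, 49/256, 81/256
A = np.sort(np.argsort(z)[:4] + 1)        # 1-indexed set of good bit-channels
print(A)                                  # -> [4 6 7 8], as in the example above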

(To be continued in: Advances in Wireless Communication Class Notes (Part 2))
