# Entropy inequalities in quantum information theory

Quantum entropy inequalities
August 24, 2021
> "You should call it 'entropy' for two reasons; first, the function is already in use in thermodynamics; second, and most importantly, most people don't know what entropy really is, and if you use 'entropy' in an argument, you will win every time."
> — John von Neumann to Claude Shannon
Summary
This internship was supervised by Marius Lemm, professor at the École Polytechnique Fédérale de Lausanne, chair of analysis and mathematical physics. During this internship in quantum information theory, I learnt about quantum entropy. After having worked on the basic notions (definitions of the various quantum entropies, basic entropy inequalities, strong subadditivity, the data processing inequality), I learnt that strong subadditivity is the most important entropy inequality and that every other one can be derived from it. Then, I worked on the main tool used in a new method from [Sut19] to find entropy inequalities, the Golden-Thompson inequality. After that, I studied the application of this method to prove a variant of the Linden-Winter inequality, a constrained but new inequality. The proof follows the note [LX]. Eventually, we did some research on generalizing the proof to the case of approximate quantum Markov chains, without much success. However, we did manage to bring back the original symmetry in the proof.
Contents
1 Introduction
  1.1 Notations
  1.2 Specifics of quantum physics
  1.3 Entropy
2 Golden-Thompson inequality
3 Quantum Markov chains
  3.1 Quantum Markov chains
  3.2 Approximate quantum Markov chains
4 Linden-Winter inequality and possible extensions
  4.1 Linden and Winter's original inequality
  4.2 A new proof for the inequality
  4.3 Thoughts on the proof
  4.4 Symmetry
1 Introduction
In this section we introduce the notations as well as the basic notions used later.
This follows [NC00], [Wil19] and [Sut18].
1.1 Notations
Let A be a quantum system, i.e. a Hermitian space. We will be using the following notations:
• ρ† : conjugate transpose of a matrix ρ.
• log : the natural logarithm, both for complex numbers and for matrices.
• |ψ⟩ : ket ψ, a vector of A.
• ⟨ψ| : bra ψ, a vector of the dual of A.
• I_A : identity map of A.
• tr, tr_A : trace, and partial trace over A.
• supp(ρ) : support of ρ.
• Sp(ρ) : set of eigenvalues of ρ.
• ⊕ : sum of two operators acting on orthogonal spaces.
• ∥ρ∥_p : p-Schatten norm of the operator ρ.
• |ρ| := (ρρ†)^{1/2} : absolute value of the matrix ρ.
• L(A) : set of linear operators over A.
• D(A) : set of density operators over A.
• P(A) : set of positive semi-definite operators over A.
• P+(A) : set of positive definite operators over A.
• TPCP(A, B) : set of trace-preserving completely positive maps from A to B.
The matrices used here are positive, unless specified otherwise.
1.2 Specifics of quantum physics
A quantum system is a Hermitian space A, whose vectors |ψ⟩ represent its possible states. The adjoint of |ψ⟩ is written ⟨ψ|. For example, ⟨ψ|ψ⟩ is the scalar product of ψ with itself, and |ψ⟩⟨ψ| is an operator acting on A; more precisely, it is the projector onto the subspace spanned by |ψ⟩.
We can describe multipartite systems with tensor products. Let AB be a bipartite system, i.e. the Hermitian space A ⊗ B. A state of AB which cannot be written in the form |ψ_A⟩ ⊗ |ψ_B⟩ is said to be entangled.
We can also represent states with density operators:

Definition 1.1 Density operator
Let {(p_i, |ψ_i⟩)} be a set of pure states with associated probabilities, where Σ_i p_i = 1. Then we define

ρ := Σ_i p_i |ψ_i⟩⟨ψ_i|

the density operator of the system.
A density operator which is a projector, that is to say of the form |ψ⟩⟨ψ|, is called a pure state. Otherwise, it is a mixed state. We now need some mathematical criterion to describe the evolution of a quantum system. Let N be an operator that maps a density matrix ρ_A to ρ_B, so that ρ_B is the evolution of ρ_A over time. Using physical reasoning (see [Wil19, Section 4.4.1]) we can define quantum channels:
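The definition of a density operator is easy to experiment with numerically. The sketch below is not from the report (the helper names `density` and `is_pure` are ours); it builds ρ from an ensemble of kets with plain numpy and tests purity via tr(ρ²), which equals 1 exactly for pure states:

```python
import numpy as np

def density(ensemble):
    """Build rho = sum_i p_i |psi_i><psi_i| from (p_i, psi_i) pairs."""
    dim = len(ensemble[0][1])
    rho = np.zeros((dim, dim), dtype=complex)
    for p, psi in ensemble:
        psi = np.asarray(psi, dtype=complex)
        psi = psi / np.linalg.norm(psi)        # normalize the ket
        rho += p * np.outer(psi, psi.conj())   # p |psi><psi|
    return rho

def is_pure(rho, tol=1e-12):
    """rho is pure iff it is a rank-one projector, i.e. tr(rho^2) = 1."""
    return abs(np.trace(rho @ rho).real - 1.0) < tol

pure = density([(1.0, [1, 0])])                    # |0><0|
mixed = density([(0.5, [1, 0]), (0.5, [0, 1])])    # maximally mixed qubit
```

Both operators have unit trace, but only the first passes the purity test; the maximally mixed qubit has tr(ρ²) = 1/2.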
Definition 1.2 Quantum channels
Let A and B be quantum systems, and N : L(A) → L(B) be a linear map. It is called a quantum channel if it is a trace-preserving completely positive map, i.e.:
• ∀ρ_A ∈ L(A), tr(N(ρ_A)) = tr(ρ_A)
• For every quantum system R, I_{L(R)} ⊗ N maps positive semi-definite operators to positive semi-definite operators.
We can justify these conditions with the following reasoning: we want a density operator to be mapped to a density operator, as a quantum system is "stable" over time, so N has to be trace preserving and has to map positive semi-definite operators to positive semi-definite operators. But suppose we prepare a larger quantum system, whose subsystems are R and A. Then N extends to I_{L(R)} ⊗ N, and this extension has to be trace preserving (which it is automatically) and positivity preserving, which is the definition of N being a completely positive map.
We will use density operators to describe quantum systems, as they are convenient for describing multipartite systems. The key notion here is the partial trace:

Definition 1.3 Partial trace
Let |a_1⟩⟨a_2| ⊗ |b_1⟩⟨b_2| be in A ⊗ B. We define:

tr_A(|a_1⟩⟨a_2| ⊗ |b_1⟩⟨b_2|) := |b_1⟩⟨b_2| tr(|a_1⟩⟨a_2|)

where the trace on the right side is the usual trace. We then extend the definition by linearity.
An important property of the partial trace is a restricted form of cyclicity: for matrices ρ_1, ρ_2, ρ_3 acting on A ⊗ B, where ρ_3 acts nontrivially only on the system A being traced out (and the products are well defined), we have tr_A(ρ_1 ρ_2 ρ_3) = tr_A(ρ_3 ρ_1 ρ_2).
The partial trace is the operation which allows us to get the density operators of the subsystems:

Property 1.4 Reduced states
Let A, B be two systems, with ρ_AB, ρ_A, ρ_B the density operators of AB, A and B respectively. The so-called reduced density operators ρ_A and ρ_B are obtained by taking the partial trace:

ρ_A = tr_B ρ_AB,   ρ_B = tr_A ρ_AB
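The partial trace of Definition 1.3 can be sketched in a few lines of numpy (this is our own illustration, not part of the report; the einsum-based helpers are hypothetical names). Applied to the entangled Bell state (|00⟩ + |11⟩)/√2, both reduced states come out maximally mixed:

```python
import numpy as np

def ptrace_B(rho_AB, dA, dB):
    """Partial trace over B: (tr_B rho)_{a,a'} = sum_b rho_{(a,b),(a',b)}."""
    return np.einsum('abcb->ac', rho_AB.reshape(dA, dB, dA, dB))

def ptrace_A(rho_AB, dA, dB):
    """Partial trace over A: (tr_A rho)_{b,b'} = sum_a rho_{(a,b),(a,b')}."""
    return np.einsum('abad->bd', rho_AB.reshape(dA, dB, dA, dB))

# Bell state (|00> + |11>)/sqrt(2) on A (x) B, written in the product basis
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
rho_AB = np.outer(bell, bell)
rho_A = ptrace_B(rho_AB, 2, 2)
rho_B = ptrace_A(rho_AB, 2, 2)
```

The global state is pure while its reduced states are I/2: this loss of information under the partial trace is exactly the entanglement of the Bell state.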
We now introduce Schatten norms, which are the norms we will use for matrices:

Definition 1.5 Schatten norms
Let p ≥ 1 and L be a matrix. We define:

∥L∥_p := (tr |L|^p)^{1/p}

where |L| := (LL†)^{1/2}.
1.3 Entropy
Definition 1.6 Classical entropy
Let X be a discrete random variable. We introduce the Shannon entropy:

H(X) := − Σ_{x ∈ X(Ω)} P(X = x) log(P(X = x))
Entropy was developed by Shannon as a way to quantify how much information can be delivered by a source. Other entropy-related quantities exist, but the classical case appears here only to be compared with the quantum case:
Definition 1.7 Quantum entropies
Let ρ_A, ρ_B, ρ_C be density operators representing states of the systems A, B and C. We define:
• Von Neumann entropy: S(A) := −tr(ρ_A log ρ_A) = − Σ_{λ ∈ Sp(ρ_A)} λ log(λ)
• Quantum relative entropy: D(A||B) := S(A||B) := tr(ρ_A log ρ_A) − tr(ρ_A log ρ_B)
• Quantum measured relative entropy: D_M(ρ_A||ρ_B) := sup_{ω ∈ P+(A)} tr(ρ_A log ω) − log tr(ρ_B ω)
• Quantum joint entropy: S(AB) := −tr(ρ_AB log ρ_AB)
• Quantum conditional entropy: S(A|B) := S(AB) − S(B)
• Quantum mutual information: I(A; B) := S(A) + S(B) − S(AB)
• Quantum conditional mutual information: I(A; C|B) := S(AB) + S(BC) − S(ABC) − S(B)
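These definitions are straightforward to evaluate numerically. The sketch below (our own numpy illustration; the helper `S` is not from the report) computes the Von Neumann entropy from the eigenvalue formula and evaluates the joint, marginal and conditional entropies of the Bell state; note that S(A|B) = −log 2 is negative, which is impossible for the classical conditional entropy:

```python
import numpy as np

def S(rho):
    """Von Neumann entropy -tr(rho log rho) = -sum_l l log l (natural log)."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]                      # convention: 0 log 0 = 0
    return float(-np.sum(ev * np.log(ev)))

bell = np.array([1, 0, 0, 1]) / np.sqrt(2)   # entangled pure state on AB
rho_AB = np.outer(bell, bell)
r = rho_AB.reshape(2, 2, 2, 2)
rho_A = np.einsum('abcb->ac', r)             # tr_B
rho_B = np.einsum('abad->bd', r)             # tr_A

cond_entropy = S(rho_AB) - S(rho_B)              # S(A|B) = -log 2
mutual_info = S(rho_A) + S(rho_B) - S(rho_AB)    # I(A;B) = 2 log 2
```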
At this point we can make two remarks. First, the relative entropy is also called divergence, hence the notation with a "D". Second, one may notice that the definition of the measured relative entropy is quite different from the others. It is in fact a characterization, not the usual definition. It is however equivalent, and we will only use this form. The more standard definition would need to introduce other notions of quantum physics, in particular a specific sort of measurement. The relative entropy can also be expressed with a variational formula:
Theorem 1.8 Variational formula for relative entropy
Let ρ ∈ D(A) and σ ∈ P(A). Then:

D(ρ||σ) = sup_{ω ∈ P+(A)} tr(ρ log ω) − log tr(e^{log σ + log ω})
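Since the right-hand side is a supremum, every positive definite ω yields a lower bound on D(ρ||σ), which is easy to test numerically. In the sketch below (ours, not from the report; `logm_h`/`expm_h` are hand-rolled eigendecomposition helpers for Hermitian matrices, not a library API) we check the bound for a few random ω:

```python
import numpy as np

def logm_h(A):
    """Matrix logarithm of a positive definite Hermitian matrix via eigh."""
    w, V = np.linalg.eigh(A)
    return (V * np.log(w)) @ V.conj().T

def expm_h(A):
    """Matrix exponential of a Hermitian matrix via eigh."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.conj().T

def D(rho, sigma):
    """Quantum relative entropy tr(rho log rho) - tr(rho log sigma)."""
    return float(np.trace(rho @ (logm_h(rho) - logm_h(sigma))).real)

def rand_pd(d, rng):
    X = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    return X @ X.conj().T + 0.1 * np.eye(d)   # Hermitian, strictly positive

rng = np.random.default_rng(0)
rho = rand_pd(3, rng); rho /= np.trace(rho).real      # density operator
sigma = rand_pd(3, rng); sigma /= np.trace(sigma).real

d_val = D(rho, sigma)
# every positive definite omega yields a lower bound on D(rho||sigma)
bounds = []
for _ in range(5):
    omega = rand_pd(3, rng)
    lb = (np.trace(rho @ logm_h(omega)).real
          - np.log(np.trace(expm_h(logm_h(sigma) + logm_h(omega))).real))
    bounds.append(lb)
```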
These quantities obey some laws, one of the most important ones being non-negativity. We have the following inequalities:
Theorem 1.9 Quantum entropy inequalities
Let A, B and C be three quantum systems, and ρ and σ density operators. We have:
• Non-negativity of quantum entropy: S(A) ≥ 0
• Subadditivity: I(A; B) ≥ 0, i.e. S(AB) ≤ S(A) + S(B)
• Strong subadditivity: I(A; C|B) ≥ 0, i.e. S(ABC) + S(B) ≤ S(AB) + S(BC)
• Klein's inequality: S(A||B) ≥ 0
• Non-negativity of relative entropies: D(ρ||σ) ≥ D_M(ρ||σ) ≥ 0
Strong subadditivity is the strongest entropy inequality: all the other (unconstrained) inequalities can be deduced from it.
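Strong subadditivity can be probed numerically. The sketch below (our own; the `marginal` helper is a hypothetical name) draws a random mixed state on three qubits and checks that I(A;C|B) ≥ 0:

```python
import numpy as np

def S(rho):
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log(ev)))

def marginal(rho, keep):
    """Reduced state of a 3-qubit density matrix on the subsystems in `keep`
    (axes 0, 1, 2 stand for A, B, C)."""
    t = rho.reshape([2] * 6)
    for k in sorted({0, 1, 2} - set(keep), reverse=True):
        t = np.trace(t, axis1=k, axis2=k + t.ndim // 2)
    d = 2 ** len(keep)
    return t.reshape(d, d)

rng = np.random.default_rng(7)
X = rng.standard_normal((8, 8)) + 1j * rng.standard_normal((8, 8))
rho_ABC = X @ X.conj().T
rho_ABC /= np.trace(rho_ABC).real

# I(A;C|B) = S(AB) + S(BC) - S(ABC) - S(B) >= 0 (strong subadditivity)
cmi = (S(marginal(rho_ABC, (0, 1))) + S(marginal(rho_ABC, (1, 2)))
       - S(rho_ABC) - S(marginal(rho_ABC, (1,))))
```

A single random state is of course no proof, but such checks are a useful sanity test when manipulating entropy combinations like (8) and (9) later in the report.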
One other inequality is the data processing inequality. It states that the relative entropy cannot increase under the action of a quantum channel, i.e. over time:
Theorem 1.10 Data processing inequality
Let ρ, σ be two density operators and E be a quantum channel. Then:

D(E(ρ)||E(σ)) ≤ D(ρ||σ)
The measured relative entropy also respects the data processing inequality.
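As a sanity check (our own sketch, not from the report; the depolarizing channel is a standard example of a trace-preserving completely positive map), one can verify the data processing inequality numerically:

```python
import numpy as np

def logm_h(A):
    w, V = np.linalg.eigh(A)
    return (V * np.log(w)) @ V.conj().T

def D(rho, sigma):
    return float(np.trace(rho @ (logm_h(rho) - logm_h(sigma))).real)

def depolarize(rho, p):
    """Depolarizing channel: keep rho with prob. 1-p, replace by I/d with prob. p."""
    d = rho.shape[0]
    return (1 - p) * rho + p * np.trace(rho).real * np.eye(d) / d

def rand_state(d, rng):
    X = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    r = X @ X.conj().T
    return r / np.trace(r).real

rng = np.random.default_rng(1)
rho, sigma = rand_state(2, rng), rand_state(2, rng)
d_before = D(rho, sigma)
d_after = D(depolarize(rho, 0.3), depolarize(sigma, 0.3))
```

The channel mixes both states toward I/d, making them harder to distinguish, so the divergence can only shrink.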
2 Golden-Thompson inequality
The major result we will need later is the Golden-Thompson inequality. To prove this inequality, we first need some results from complex interpolation theory. These theorems come from [SBT16], except for Lemma 2.2, which comes from [G00].
Theorem 2.1
Let S := {z ∈ C, 0 ≤ Re(z) ≤ 1}, and let G : S → B(A) be a function, holomorphic in the interior of S, with values in the set of bounded operators on a separable Hilbert space. Let us suppose that G is also continuous on the boundary of S.
Let p_0, p_1 be in ]0, ∞[, θ ∈ ]0, 1[, and p_θ such that 1/p_θ = (1−θ)/p_0 + θ/p_1.
Let

β_θ : R → R,   t ↦ sin(πθ) / ( 2θ (cosh(πt) + cos(πθ)) )

be a probability distribution.
If furthermore z ↦ ∥G(z)∥_{p_{Re(z)}} is uniformly bounded on S, then:

log ∥G(θ)∥_{p_θ} ≤ ∫_{−∞}^{+∞} ( β_{1−θ}(t) log ∥G(it)∥_{p_0}^{1−θ} + β_θ(t) log ∥G(1+it)∥_{p_1}^{θ} ) dt
We first need a lemma:
Lemma 2.2
Let g be analytic on the open strip S := {z ∈ C, 0 < Re(z) < 1}, continuous and uniformly bounded on its closure. Then for θ ∈ ]0, 1[, we have:

log |g(θ)| ≤ ∫_{−∞}^{+∞} ( β_{1−θ}(t) log |g(it)|^{1−θ} + β_θ(t) log |g(1+it)|^{θ} ) dt
Proof:
The lemma is in fact a slightly stronger result: we can indeed show that for any z := x + iy ∈ S, we have:

log |g(x+iy)| ≤ (sin(πx)/2) ∫_{−∞}^{+∞} ( log |g(it+iy)| / (cosh(πt) − cos(πx)) + log |g(1+it+iy)| / (cosh(πt) + cos(πx)) ) dt

We will only do the proof for y = 0; in that case, we write θ = x. Notice that −cos(πθ) = cos(π(1−θ)) and sin(πθ) = sin(π(1−θ)), so this inequality can be rewritten as:

log |g(θ)| ≤ ∫_{−∞}^{+∞} ( (sin(π(1−θ))/2) log |g(it)| / (cosh(πt) + cos(π(1−θ))) + (sin(πθ)/2) log |g(1+it)| / (cosh(πt) + cos(πθ)) ) dt

which is exactly the lemma as stated above, according to the definition of β_θ: the two kernels are precisely (1−θ)β_{1−θ}(t) and θβ_θ(t).
The proof follows [G00, Lemma 1.3.8].
For U any harmonic function on the open disc D of radius 1, we have the Poisson formula:

U(z) = (1/2π) ∫_{−π}^{π} U(Re^{iφ}) (R² − ρ²) / |Re^{iφ} − ρe^{iθ}|² dφ   (1)

for z := ρe^{iθ} and |z| < R < 1.
Let u be subharmonic on D, continuous on the circle C_R of radius R. When U = u, the right side of (1) defines a harmonic function on the open disc D_R of radius R that coincides with u on the boundary.
The maximum principle for subharmonic functions implies, for z := ρe^{iθ} with |z| < R < 1:

u(z) ≤ (1/2π) ∫_{−π}^{π} u(Re^{iφ}) (R² − ρ²) / |Re^{iφ} − ρe^{iθ}|² dφ   (2)

This is valid for all subharmonic functions u that are continuous on the circle C_R.
Let:

h : D → S,   ξ ↦ (1/(πi)) log( i(1+ξ)/(1−ξ) )

Since the map ξ ↦ i(1+ξ)/(1−ξ) is holomorphic on D (its series representation has a radius of convergence of at least 1) and takes values in the upper half-plane, h is holomorphic. We then also have that g ∘ h is holomorphic on D, so u := log |g ∘ h| is subharmonic.
Indeed, for any holomorphic function F, log |F| is subharmonic. First, assume that F does not vanish. Then we have log |F| = Re(log(F)), and log(F) is a holomorphic function, so its real part is harmonic. For the general case, we need to allow infinite values, so log |F| is only subharmonic.
Applying (2) to this function and expanding the modulus in the denominator gives:

log |g ∘ h(z)| ≤ (1/2π) ∫_{−π}^{π} log |g ∘ h(Re^{iφ})| (R² − ρ²) / (R² − 2ρR cos(θ−φ) + ρ²) dφ   (3)

for z := ρe^{iθ} and ρ < R.
We use our assumption that g is uniformly bounded on the closure of S, so we can use the Lebesgue dominated convergence theorem in (3) for fixed ρ < R and R → 1:

log |g ∘ h(z)| ≤ (1/2π) ∫_{−π}^{π} log |g ∘ h(e^{iφ})| (1 − ρ²) / (1 − 2ρ cos(θ−φ) + ρ²) dφ   (4)

Let x := h(ρe^{iθ}). A simple computation gives that h is invertible and:

ρe^{iθ} = h^{−1}(x) = (e^{πix} − i)/(e^{πix} + i) = −i cos(πx)/(1 + sin(πx))

so, using the exponential form of complex numbers:

ρ = |cos(πx)| / (1 + sin(πx)),   |θ| = π/2

Note that the absolute value takes into account the sign of cos(πx), as x is in ]0, 1[. Either way, we have:

(1 − ρ²) / (1 − 2ρ cos(θ−φ) + ρ²) = sin(πx) / (1 + cos(πx) sin(φ))

Thus we can write (4) as:

log |g(x)| ≤ (1/2π) ∫_{−π}^{+π} ( sin(πx) / (1 + cos(πx) sin(φ)) ) log |g(h(e^{iφ}))| dφ   (5)
We now change variables:
• On ]−π, 0[, we use it = h(e^{iφ}). This is a legitimate change of variables, as h is holomorphic and invertible. We have:

(1/2π) ∫_{−π}^{0} ( sin(πx) / (1 + cos(πx) sin(φ)) ) log |g(h(e^{iφ}))| dφ = (1/2) ∫_{−∞}^{+∞} ( sin(πx) / (cosh(πt) − cos(πx)) ) log |g(it)| dt

• On ]0, π[, we use 1 + it = h(e^{iφ}). We have:

(1/2π) ∫_{0}^{π} ( sin(πx) / (1 + cos(πx) sin(φ)) ) log |g(h(e^{iφ}))| dφ = (1/2) ∫_{−∞}^{+∞} ( sin(πx) / (cosh(πt) + cos(πx)) ) log |g(1+it)| dt

We now add these two quantities and use (5):

log |g(x)| ≤ (1/2) ∫_{−∞}^{+∞} ( sin(πx) log |g(it)| / (cosh(πt) − cos(πx)) + sin(πx) log |g(1+it)| / (cosh(πt) + cos(πx)) ) dt

which is the desired result, according to the remark at the beginning of the proof.
□
We now prove Theorem 2.1:
Proof:
The proof follows [SBT16, Appendix D]. For x in [0, 1] we define q_x as the Hölder conjugate of p_x: 1/p_x + 1/q_x = 1. Using the definition of p_x given earlier, we have:

1/q_x = (1−x)/q_0 + x/q_1

Let θ ∈ ]0, 1[ be fixed. The operator G(θ) is bounded by assumption, so it has a polar decomposition G(θ) = U∆, with ∆ positive semi-definite and U a partial isometry, satisfying ∆UU† = U†U∆ = ∆ because G(θ) is normal (see for example [NC00, Section 2.1.10]). For z in S, let:

X(z)† := C^{−p_θ((1−z)/q_0 + z/q_1)} ∆^{p_θ((1−z)/q_0 + z/q_1)} U†

with C := ∥∆∥_{p_θ} = ∥G(θ)∥_{p_θ} < ∞. We find that X is anti-holomorphic on S, that is, z ↦ X(z)† is holomorphic on S (using the Cauchy-Riemann equations, for instance). We also have:

∥X(x+iy)∥_{q_x}^{q_x} = tr (C^{−1}∆)^{p_θ q_x ((1−x)/q_0 + x/q_1)} = tr (C^{−1}∆)^{p_θ} = 1

the last equality resulting from the definition of C.
Consequently, the Hilbert-Schmidt inner product g(z) := tr(X(z)†G(z)) is holomorphic (because z ↦ X(z)† and G are holomorphic and the trace is linear) and bounded on S, because Hölder's inequality gives:

|g(x+iy)| ≤ ∥X(x+iy)∥_{q_x} ∥G(x+iy)∥_{p_x} ≤ ∥G(x+iy)∥_{p_x}

Hence, the assumption on G says that g satisfies the assumptions of the previous lemma:

log |g(θ)| ≤ ∫_{−∞}^{+∞} ( β_{1−θ}(t) log |g(it)|^{1−θ} + β_θ(t) log |g(1+it)|^{θ} ) dt   (6)

We now have to verify the following relations to conclude:

g(θ) = tr(X(θ)†G(θ)) = C^{−p_θ/q_θ} tr( ∆^{p_θ/q_θ} U†U∆ ) = C^{1−p_θ} tr ∆^{p_θ} = ∥G(θ)∥_{p_θ}

|g(it)| ≤ ∥G(it)∥_{p_0}   and   |g(1+it)| ≤ ∥G(1+it)∥_{p_1}

The last two inequalities result from Hölder's inequality. Plugging these relations into (6) yields the result.
□
The Golden-Thompson inequality comes directly from the following result:
Theorem 2.3
Let p ≥ 1, r ∈ ]0, 1], β_r be as defined earlier, n ∈ N, and (A_k)_{k≤n} a collection of n positive definite matrices. Then:

log ∥ |∏_{k=1}^{n} A_k^r|^{1/r} ∥_p ≤ ∫_{−∞}^{+∞} β_r(t) log ∥ ∏_{k=1}^{n} A_k^{1+it} ∥_p dt
Proof:
We prove the case of positive definite matrices, as the case of semi-definite matrices follows by continuity. For r = 1 the assertion is trivial. Otherwise, let us define G(z) := ∏_{k=1}^{n} A_k^z for z ∈ S, where S is the strip of the complex plane previously defined. This function satisfies the regularity assumptions of Theorem 2.1. We pick θ = r, p_0 = ∞, p_1 = p, so that p_θ = p/r. We find:

log ∥G(1+it)∥_{p_1}^{θ} = r log ∥ ∏_{k=1}^{n} A_k^{1+it} ∥_p

and:

log ∥G(it)∥_{p_0}^{1−θ} = (1 − r) log ∥ ∏_{k=1}^{n} A_k^{it} ∥_∞ = 0

as the matrices A_k^{it} are unitary (recall that ∥·∥_∞ is the operator norm). Moreover, we have:

log ∥G(θ)∥_{p_θ} = log ∥ ∏_{k=1}^{n} A_k^r ∥_{p/r} = r log ∥ |∏_{k=1}^{n} A_k^r|^{1/r} ∥_p

Plugging these relations into Theorem 2.1 yields the result.
□
We need one last tool before proving the Golden-Thompson inequality:
Property 2.4 Multivariate Lie-Trotter product formula
For any collection of square matrices (A_k)_{k≤m} with m in N, we have:

lim_{n→+∞} ( e^{A_1/n} ... e^{A_m/n} )^n = e^{Σ_{k=1}^{m} A_k}
Proof:
Let S := Σ_{k=1}^{m} A_k. We compute the product using e^{A_k/n} = I + A_k/n + O(1/n²), where I is the identity matrix. We have:

∏_{k=1}^{m} e^{A_k/n} = ∏_{k=1}^{m} ( I + A_k/n + O(1/n²) ) = I + S/n + O(1/n²)

So we have:

( ∏_{k=1}^{m} e^{A_k/n} )^n = ( I + S/n + O(1/n²) )^n = I + Σ_{i=1}^{n} (n choose i) ( S/n + O(1/n²) )^i = Σ_{i=0}^{n} (S^i / i!) ( 1 + O(1/n) ) + O(1/n)

Taking the limit as n tends to infinity yields the result.
□
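The product formula is easy to observe numerically. In the sketch below (our own; Hermitian matrices are chosen so the matrix exponential can be computed by eigendecomposition), the error of the n-step approximation decays like O(1/n):

```python
import numpy as np

def expm_h(A):
    """Matrix exponential of a Hermitian matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.conj().T

rng = np.random.default_rng(3)

def rand_herm(d):
    X = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    return (X + X.conj().T) / 2

A1, A2, A3 = rand_herm(3), rand_herm(3), rand_herm(3)
target = expm_h(A1 + A2 + A3)

def trotter(n):
    """(e^{A1/n} e^{A2/n} e^{A3/n})^n, converging to e^{A1+A2+A3}."""
    step = expm_h(A1 / n) @ expm_h(A2 / n) @ expm_h(A3 / n)
    return np.linalg.matrix_power(step, n)

errors = [np.linalg.norm(trotter(n) - target) for n in (10, 100, 1000)]
```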
We can now easily prove the Golden-Thompson inequality:

Corollary 2.5 Golden-Thompson inequality
Let p ≥ 1, β_0 as defined previously, n ∈ N, and (H_k)_{k≤n} a collection of Hermitian matrices. Then:

log ∥ e^{Σ_k H_k} ∥_p ≤ ∫_{−∞}^{+∞} β_0(t) log ∥ ∏_k e^{(1+it)H_k} ∥_p dt
Proof:
We use the previous theorem with A_k := e^{H_k} and r = 1/n, letting n tend to infinity (so r → 0). On the left side, we use the Lie-Trotter product formula to get the expression we want. For the right side, we can use Lebesgue's dominated convergence theorem: the expression under the integral is continuous, hence bounded on [−1, 1], and is bounded by (π/(2cosh(πt))) log ∥ ∏_k e^{(1+it)H_k} ∥_p for |t| > 1, which is integrable because log ∥ ∏_k e^{(1+it)H_k} ∥_p is bounded with respect to the variable t. Using the sub-multiplicativity of the Schatten norms, it is sufficient to show that ∥A^{1+it}∥_p is bounded. To see that, we use that, as A is positive definite, log A is Hermitian and the matrix A^{it} is unitary:

(A^{it})† = (e^{it log A})† = e^{−it log A} = A^{−it}

Then we use that the Schatten norms are unitarily invariant: let U := A^{it}. We have:

∥AU∥_p^p = tr( (U†A†AU)^{p/2} ) = tr( U†(A†A)^{p/2}U ) = tr( (A†A)^{p/2} ) = ∥A∥_p^p

where we used (U†MU)^{p/2} = U†M^{p/2}U for unitary U, and then the cyclicity of the trace. Thus the expression in the integral is dominated by an integrable function, so we can use the Lebesgue dominated convergence theorem.
□
We can apply this inequality with p = 2 and n = 6, for matrices of the form e^{H_k/2} with k ≤ 6. Using that the logarithm is increasing, together with Jensen's inequality, we have:

∥ e^{(1/2) Σ_k H_k} ∥_2^2 = tr( e^{Σ_k H_k} )

Thus:

tr( e^{Σ_k H_k} ) ≤ ∫_{−∞}^{+∞} β_0(t) tr( e^{((1+it)/2) H_1} e^{((1+it)/2) H_2} ... e^{((1+it)/2) H_6} e^{((1−it)/2) H_6} ... e^{((1−it)/2) H_1} ) dt

which can be simplified using the cyclicity of the trace:

tr( e^{Σ_k H_k} ) ≤ ∫_{−∞}^{+∞} β_0(t) tr( e^{H_1} e^{((1+it)/2) H_2} ... e^{H_6} e^{((1−it)/2) H_5} ... e^{((1−it)/2) H_2} ) dt   (7)

This result will be used later.
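The simplest member of this family is the classical two-matrix Golden-Thompson inequality, tr e^{H1+H2} ≤ tr(e^{H1} e^{H2}), which is worth a quick numerical sanity check (our own numpy sketch, not part of the report):

```python
import numpy as np

def expm_h(A):
    """Matrix exponential of a Hermitian matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.conj().T

rng = np.random.default_rng(5)

def rand_herm(d):
    X = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    return (X + X.conj().T) / 2

H1, H2 = rand_herm(4), rand_herm(4)
lhs = np.trace(expm_h(H1 + H2)).real          # tr e^{H1+H2}
rhs = np.trace(expm_h(H1) @ expm_h(H2)).real  # tr e^{H1} e^{H2}
```

When H1 and H2 commute the two sides agree; for generic non-commuting Hermitian matrices the inequality is strict.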
3 Quantum Markov chains

3.1 Quantum Markov chains
In this section we define quantum Markov chains. Let A, B, C be three quantum systems (i.e. Hilbert spaces) and let ρ_ABC, ρ_A, ρ_B, ρ_C, ... be the density operators associated with the states of the corresponding systems. This follows [Sut18, Chapter 5].
Definition 3.1 Quantum Markov chains
A ↔ B ↔ C is a quantum Markov chain if there exists a recovery map R_{B→BC}, i.e. a trace-preserving completely positive map from B to BC, such that ρ_ABC = R_{B→BC}(ρ_AB).

Informally, the global state can be reconstructed from the state of AB by acting only on the B part. The idea behind this concept is close to that of classical Markov chains, where one can say that "the future depends on the past only through the present". We have two characterisations of quantum Markov chains:
Property 3.2 First characterisation of quantum Markov chains
A state ρ_ABC ∈ S(A ⊗ B ⊗ C) is a quantum Markov chain if and only if there exists a decomposition of the B system:

B = ⊕_j b_j^L ⊗ b_j^R

such that:

ρ_ABC = ⊕_j ρ_{A b_j^L} ⊗ ρ_{b_j^R C}
This characterisation describes the structure of a quantum Markov chain. However, the second characterisation links quantum Markov chains with entropy quantities, and thus is of greater importance for us:
Property 3.3 Second characterisation of quantum Markov chains
A state ρ_ABC ∈ S(A ⊗ B ⊗ C) is a quantum Markov chain if and only if I(A; C|B) = 0. Moreover, in this case, the rotated Petz recovery map:

T^{[t]}_{B→BC} : X_B ↦ ρ_BC^{(1+it)/2} ( ρ_B^{−(1+it)/2} X_B ρ_B^{−(1−it)/2} ⊗ I_C ) ρ_BC^{(1−it)/2}

satisfies ρ_ABC = T^{[t]}_{B→BC}(ρ_AB) for any t in R.
Proof:
Suppose that ρ_ABC is a quantum Markov chain, with recovery map R_{B→BC}. First, we compute:

−S(A|B) = D(ρ_AB || I_A ⊗ ρ_B) ≥ D( R_{B→BC}(ρ_AB) || I_A ⊗ R_{B→BC}(ρ_B) ) = −S(A|BC)

where we used the data processing inequality and the fact that tr_A(R_{B→BC}(ρ_AB)) = R_{B→BC}(ρ_B), which is a simple computation. We now have:

I(A; C|B) = S(A|B) − S(A|BC) ≤ 0

Combining this with strong subadditivity, we obtain that I(A; C|B) = 0.
The other direction is more complicated to prove. It requires a strengthening of the data processing inequality involving the rotated Petz recovery map and relying, inter alia, on the multivariate Golden-Thompson inequality [Sut18, Proposition 5.21].
□
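In the commutative (classical) case all the density operators are diagonal, the rotations ρ^{±it/2} cancel, and the Petz map reduces to multiplying by the conditional probability p(c|b). The sketch below (our own, with probability arrays rather than matrices) checks that a classical Markov chain A → B → C is exactly recovered from its AB marginal and has I(A;C|B) = 0:

```python
import numpy as np

rng = np.random.default_rng(4)
# classical Markov chain A -> B -> C: p(a,b,c) = p(a) T[a,b] U[b,c]
pA = rng.random(3); pA /= pA.sum()
T = rng.random((3, 3)); T /= T.sum(axis=1, keepdims=True)   # p(b|a)
U = rng.random((3, 3)); U /= U.sum(axis=1, keepdims=True)   # p(c|b)
p = np.einsum('a,ab,bc->abc', pA, T, U)

pAB, pBC, pB = p.sum(axis=2), p.sum(axis=0), p.sum(axis=(0, 2))

# commutative Petz recovery: R(p_AB)(a,b,c) = p(a,b) p(b,c) / p(b)
recovered = np.einsum('ab,bc->abc', pAB, pBC / pB[:, None])

def H(q):
    q = q.reshape(-1); q = q[q > 0]
    return float(-(q * np.log(q)).sum())

cmi = H(pAB) + H(pBC) - H(p) - H(pB)   # I(A;C|B)
```

The genuinely quantum content of Property 3.3 is that the same recovery works with non-commuting matrices, at the price of the rotated square roots.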
3.2 Approximate quantum Markov chains
The previous characterisation suggests a link between the conditional mutual information I(A; C|B) and the recoverability of the quantum Markov chain A ↔ B ↔ C, i.e. the distance between the recovered state T^{[t]}_{B→BC}(ρ_AB) and ρ_ABC. Here we measure the "distance" with the measured relative entropy. It is not a metric, as it does not satisfy the triangle inequality; however, its non-negativity, with equality if and only if both states are equal, allows us to use it as an indicator of "closeness".
Theorem 3.4
Let ρ_ABC ∈ S(A ⊗ B ⊗ C). We have:

I(A; C|B) ≥ D_M( ρ_ABC || T_{B→BC}(ρ_AB) )

where T_{B→BC} : X_B ↦ ∫_{−∞}^{+∞} β_0(t) T^{[t]}_{B→BC}(X_B) dt is the (averaged) rotated Petz recovery map.
The theorem is admitted here. The proof relies on a strengthened data processing
inequality.
Definition 3.5 Approximate quantum Markov chains
A state ρ_ABC ∈ S(A ⊗ B ⊗ C) is an ε-approximate Markov chain if I(A; C|B) < ε.
The previous theorem shows that an approximate quantum Markov chain has a good recoverability property. However, it can be shown that it is not necessarily close to a quantum Markov chain with respect to the trace distance [Sut18, Proposition 5.9].
4 Linden-Winter inequality and possible extensions

4.1 Linden and Winter's original inequality
In 2004, Linden and Winter developed a new (constrained) inequality [LW04]:

Theorem 4.1 Linden-Winter inequality
Let A, B, C, D be quantum systems, with the constraints:
• A ↔ B ↔ C is a quantum Markov chain, i.e. I(A; C|B) = 0
• B ↔ A ↔ C is a quantum Markov chain, i.e. I(B; C|A) = 0
• A ↔ D ↔ B is a quantum Markov chain, i.e. I(A; B|D) = 0
Then we have:

I(C; D) ≥ I(C; AB)

which can also be written as:

I(ABC; D) ≥ I(AB; CD)
They also proved that this inequality is not a mere consequence of strong subadditivity.
However, the constraints are really strong: as they are equalities, the probability of them occurring in an experiment is zero. Therefore the aim here is either to remove those constraints or to replace them with weaker ones. This is an open problem. The most promising way seems to be to replace these three quantum Markov chains by approximate quantum Markov chains.
4.2 A new proof for the inequality
The proof given in [LW04] relies on the first characterisation of quantum Markov chains. However, we do not have any similar characterisation for approximate quantum Markov chains; for this reason, that proof is not shown here. Instead, in [Sut19], Sutter gave a new method to prove entropy inequalities:
Method to prove entropy inequalities
• Write the inequality as the non-negativity of a relative entropy D(ρ||σ) ≥ 0.
• Expand the relative entropy using the variational formula.
• Use the multivariate Golden-Thompson inequality.
• Recognize the variational formula of the measured relative entropy.
• Use the non-negativity of the measured relative entropy (i.e. show that tr ρ = 1 and tr σ ≤ 1).
This method can be used to prove a slightly different version of the Linden-Winter inequality. This follows an unpublished note summarizing the work of M. Lemm and D. Xiang in 2019 [LX].
Theorem 4.2 Variant of the Linden-Winter inequality
Let A, B, C, D be quantum systems, with the constraints:
• A ↔ B ↔ C is a quantum Markov chain, i.e. I(A; C|B) = 0
• B ↔ A ↔ C is a quantum Markov chain, i.e. I(B; C|A) = 0
We also assume that ρ_AB is invertible. Then we have:

I(ABC; D) ≥ I(AB; CD)
Proof:
First, we use the two quantum Markov chains: we write that ρ_ABC can be recovered in two different ways using the rotated Petz recovery map:

ρ_ABC = ρ_BC^{(1+it)/2} ( ρ_B^{−(1+it)/2} ρ_AB ρ_B^{−(1−it)/2} ⊗ I_C ) ρ_BC^{(1−it)/2}

As the second characterisation shows, in a quantum Markov chain A and C play symmetric roles, so that if A ↔ B ↔ C is a quantum Markov chain, C ↔ B ↔ A is one too. Here we swap A and C in both chains to get:

ρ_AB^{(1+it)/2} ( ρ_B^{−(1+it)/2} ρ_BC ρ_B^{−(1−it)/2} ⊗ I_A ) ρ_AB^{(1−it)/2} = ρ_ABC = ρ_AB^{(1+it)/2} ( ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} ⊗ I_C ) ρ_AB^{(1−it)/2}
We will now write the terms without the appropriate tensored identity matrices, to make the proof easier to read. We thus write:

ρ_AB^{(1+it)/2} ρ_B^{−(1+it)/2} ρ_BC ρ_B^{−(1−it)/2} ρ_AB^{(1−it)/2} = ρ_AB^{(1+it)/2} ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} ρ_AB^{(1−it)/2}
Here we use the assumption that ρ_AB is invertible to cancel it, and we trace out over the A system:

dim A · ρ_B^{−(1+it)/2} ρ_BC ρ_B^{−(1−it)/2} = tr_A( ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} )   (8)
We now write the wanted inequality as a statement about divergences. First, let us write it with quantum entropies:

S(ABC) + S(D) − S(AB) − S(CD) ≥ 0

We use that I(A; C|B) = 0, i.e. that S(AB) + S(BC) − S(B) − S(ABC) = 0, and add these two quantities, so that the initial statement is equivalent to:

S(BC) + S(D) − S(B) − S(CD) + S(ABCD) − S(ABCD) ≥ 0   (9)
We now write these entropies with divergences. We have:

S(BC) = −tr_BC(ρ_BC log ρ_BC) = −tr_ABCD(ρ_ABCD (log ρ_BC) ⊗ I_AD)

Similarly:

S(D) = −tr_ABCD(ρ_ABCD (log ρ_D) ⊗ I_ABC)
S(B) = −tr_ABCD(ρ_ABCD (log ρ_B) ⊗ I_ACD)
S(CD) = −tr_ABCD(ρ_ABCD (log ρ_CD) ⊗ I_AB)
S(ABCD) = −tr_ABCD(ρ_ABCD log ρ_ABCD)

We can now sum these terms and pull out ρ_ABCD:

S(BC) + S(D) − S(B) − S(CD) + S(ABCD) − S(ABCD) = D(ρ_ABCD || e^σ)

where we define σ := log ρ_ABCD − log ρ_CD + log ρ_D − log ρ_B + log ρ_BC.
Once again, all terms are implicitly tensored with the appropriate identity matrices, so they all represent operators on the system ABCD.
We now use the variational formula for relative entropy:

D(ρ_ABCD || e^σ) = sup_{ω>0} tr(ρ_ABCD log ω) − log tr( e^{σ + log ω} )
 ≥ sup_{ω>0} tr(ρ_ABCD log ω) − log tr( ∫_{−∞}^{+∞} β_0(t) κ(t) ω dt )

We used the Golden-Thompson inequality for 6 matrices, in the form (7), to obtain the lower bound, with:

κ(t) := ρ_ABCD^{(1+it)/2} ρ_CD^{−(1+it)/2} ρ_D^{(1+it)/2} ρ_B^{−(1+it)/2} ρ_BC ρ_B^{−(1−it)/2} ρ_D^{(1−it)/2} ρ_CD^{−(1−it)/2} ρ_ABCD^{(1−it)/2}

We thus have:

D(ρ_ABCD || e^σ) ≥ D_M( ρ_ABCD || ∫_{−∞}^{+∞} β_0(t) κ(t) dt )
So, by linearity of the trace, proving that for each t in R the operator κ(t) has unit trace allows us to use the non-negativity of the measured relative entropy to conclude the proof.
Let t be in R. We now compute tr κ(t) to show that it is equal to 1. First, we can use the cyclicity of the trace to trace out over the A system:
tr κ(t) = tr( ρ_ABCD^{(1+it)/2} ρ_CD^{−(1+it)/2} ρ_D^{(1+it)/2} ρ_B^{−(1+it)/2} ρ_BC ρ_B^{−(1−it)/2} ρ_D^{(1−it)/2} ρ_CD^{−(1−it)/2} ρ_ABCD^{(1−it)/2} )
= tr( ρ_ABCD ρ_CD^{−(1+it)/2} ρ_D^{(1+it)/2} ρ_B^{−(1+it)/2} ρ_BC ρ_B^{−(1−it)/2} ρ_D^{(1−it)/2} ρ_CD^{−(1−it)/2} )
= tr_BCD( tr_A(ρ_ABCD) ρ_CD^{−(1+it)/2} ρ_D^{(1+it)/2} ρ_B^{−(1+it)/2} ρ_BC ρ_B^{−(1−it)/2} ρ_D^{(1−it)/2} ρ_CD^{−(1−it)/2} )
= tr_BCD( ρ_BCD^{(1+it)/2} ρ_CD^{−(1+it)/2} ρ_D^{(1+it)/2} ρ_B^{−(1+it)/2} ρ_BC ρ_B^{−(1−it)/2} ρ_D^{(1−it)/2} ρ_CD^{−(1−it)/2} ρ_BCD^{(1−it)/2} )

In the second line we merged the two ρ_ABCD factors by cyclicity; in the third line we traced out A, as every other factor acts trivially on A; in the last line we split ρ_BCD = ρ_BCD^{(1+it)/2} ρ_BCD^{(1−it)/2} and used cyclicity again.
We now use our identity (8):

tr_BCD( ρ_BCD^{(1+it)/2} ρ_CD^{−(1+it)/2} ρ_D^{(1+it)/2} ρ_B^{−(1+it)/2} ρ_BC ρ_B^{−(1−it)/2} ρ_D^{(1−it)/2} ρ_CD^{−(1−it)/2} ρ_BCD^{(1−it)/2} )
= (1/dim A) tr_BCD( ρ_BCD^{(1+it)/2} ρ_CD^{−(1+it)/2} ρ_D^{(1+it)/2} ( tr_A( ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} ) ⊗ I_B ) ρ_D^{(1−it)/2} ρ_CD^{−(1−it)/2} ρ_BCD^{(1−it)/2} )
We can now trace out over the B system, as we did for the A system:

(1/dim A) tr_BCD( ρ_BCD^{(1+it)/2} ρ_CD^{−(1+it)/2} ρ_D^{(1+it)/2} ( tr_A( ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} ) ⊗ I_B ) ρ_D^{(1−it)/2} ρ_CD^{−(1−it)/2} ρ_BCD^{(1−it)/2} )
= (1/dim A) tr_CD( ρ_CD^{(1+it)/2} ρ_CD^{−(1+it)/2} ρ_D^{(1+it)/2} tr_A( ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} ) ρ_D^{(1−it)/2} ρ_CD^{−(1−it)/2} ρ_CD^{(1−it)/2} )
= (1/dim A) tr_CD( ρ_D^{(1+it)/2} tr_A( ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} ) ρ_D^{(1−it)/2} )
We can now use the cyclicity of the trace and trace out over D:

(1/dim A) tr_CD( ρ_D^{(1+it)/2} tr_A( ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} ) ρ_D^{(1−it)/2} )
= (1/dim A) tr_CD( ρ_D tr_A( ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} ) )
= (1/dim A) tr_AC( ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} )
We now trace out over C, then over A:

tr_AC( ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} ) = tr_AC( ρ_A^{−1} ρ_AC ) = tr_A( ρ_A^{−1} ρ_A ) = tr_A( I_A ) = dim A
Eventually, we have tr κ(t) = 1. Thus, we get:

tr( ∫_{−∞}^{+∞} β_0(t) κ(t) dt ) = ∫_{−∞}^{+∞} β_0(t) tr(κ(t)) dt = 1

where we used that β_0 is a probability density. This concludes the proof.
□
We can comment on this version of the Linden-Winter inequality. First, regarding the assumptions, one can see that the third quantum Markov chain (A ↔ D ↔ B) is not used here. This seems to help loosen the constraints but, as will be discussed later, it may not be the right direction to go. Second, we added an assumption about the invertibility of the reduced density operator ρ_AB. As this matrix is diagonalizable and positive, this constraint states that every eigenvalue of ρ_AB is strictly positive, so it is not a very strong assumption.
4.3 Thoughts on the proof
This section presents different ideas we had during the last two weeks of the internship.
As already said, the assumptions of the Linden-Winter inequality are very strong: as they are equalities, we can think of them, in a probabilistic way, as a conditioning on a set of probability zero. Thus a lot of effort is made to relax the assumptions from equalities to inequalities; here the logical extension is, instead of having three quantum Markov chains, to have three approximate quantum Markov chains. The original proof relied on a characterization of quantum Markov chains for which we do not know any adaptation to approximate quantum Markov chains. However, with the new method from [Sut19], applied in [LX] and shown in the previous section, there is hope to adapt the proof to these new assumptions.
First, we need to see where the assumptions are used. We can see from the proof that (8) comes directly from the assumptions, and from the fact that ρ_AB is invertible. Then we use the assumptions again to get (9).
To get an equation similar to (8), we start by using the averaged rotated Petz recovery map; the two approximate quantum Markov chain conditions read:

D_M( ρ_ABC || ∫_{−∞}^{+∞} β_0(t) ρ_AB^{(1+it)/2} ρ_B^{−(1+it)/2} ρ_BC ρ_B^{−(1−it)/2} ρ_AB^{(1−it)/2} dt ) ≤ ε

D_M( ρ_ABC || ∫_{−∞}^{+∞} β_0(t) ρ_AB^{(1+it)/2} ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} ρ_AB^{(1−it)/2} dt ) ≤ ε
We would want to use the triangle inequality, but it does not hold here. We have two main options. The first is to use Rényi relative entropies:
Definition 4.3
Rényi relative entropies
Let $\alpha \in\, ]0,1[\,\cup\,]1,+\infty[$ and $\rho$ and $\sigma$ in $S(A)$:

$$D_\alpha(\rho\|\sigma) := \begin{cases} \dfrac{\alpha}{\alpha-1}\log\left\|\sigma^{\frac{1-\alpha}{2\alpha}}\,\rho\,\sigma^{\frac{1-\alpha}{2\alpha}}\right\|_\alpha & \text{if } \operatorname{supp}(\rho)\subset\operatorname{supp}(\sigma) \text{ or } \alpha<1 \\ +\infty & \text{otherwise}\end{cases}$$

As the limits for $\alpha$ going to 1 or to infinity exist, we define:

$$D_1(\rho\|\sigma) := D(\rho\|\sigma), \qquad D_{\max}(\rho\|\sigma) := \log\left\|\sigma^{-\frac{1}{2}}\,\rho\,\sigma^{-\frac{1}{2}}\right\|_\infty$$
We have two interesting properties:
Property 4.4
For $\rho$ and $\sigma$ in $S(A)$:

$$D_M(\rho\|\sigma) \le D(\rho\|\sigma) \le D_{\max}(\rho\|\sigma)$$
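These orderings can be tested numerically. The sketch below (Python/numpy, random full-rank states of an arbitrary dimension 3) checks monotonicity of $D_\alpha$ in $\alpha$ and the inequality $D \le D_{\max}$; the measured relative entropy $D_M$ is omitted, since evaluating it requires an optimization over measurements:

```python
import numpy as np

def random_density_matrix(d, seed):
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = X @ X.conj().T              # positive definite almost surely
    return rho / np.trace(rho)

def mpow(rho, p):
    # ρ^p via spectral decomposition; eigenvalues clipped to guard
    # against tiny negative numerical values
    w, V = np.linalg.eigh(rho)
    w = np.clip(w, 1e-15, None)
    return (V * w ** p) @ V.conj().T

def logm(rho):
    w, V = np.linalg.eigh(rho)
    return (V * np.log(np.clip(w, 1e-15, None))) @ V.conj().T

def D(rho, sigma):
    # Umegaki relative entropy D(ρ||σ), in nats
    return np.trace(rho @ (logm(rho) - logm(sigma))).real

def D_alpha(rho, sigma, a):
    # sandwiched Rényi relative entropy, written as
    # (1/(α-1)) log tr[(σ^{(1-α)/2α} ρ σ^{(1-α)/2α})^α],
    # equivalent to the α-norm form of Definition 4.3
    s = mpow(sigma, (1 - a) / (2 * a))
    return np.log(np.trace(mpow(s @ rho @ s, a)).real) / (a - 1)

def D_max(rho, sigma):
    s = mpow(sigma, -0.5)
    return np.log(np.linalg.norm(s @ rho @ s, ord=2))  # log of the operator norm

rho = random_density_matrix(3, seed=1)
sigma = random_density_matrix(3, seed=2)
vals = [D_alpha(rho, sigma, a) for a in (0.6, 0.9, 1.5, 3.0, 10.0)]
assert all(x <= y + 1e-10 for x, y in zip(vals, vals[1:]))    # D_α monotone in α
assert vals[0] <= D(rho, sigma) <= D_max(rho, sigma) + 1e-10  # D ≤ D_max
print("orderings verified")
```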
Property 4.5
Triangle-like inequality
For $\rho$, $\sigma$ and $\omega$ in $S(A)$ and $\alpha$ in $]\frac{1}{2}, +\infty[$:

$$D_\alpha(\rho\|\sigma) \le D_\alpha(\rho\|\omega) + D_{\max}(\omega\|\sigma)$$
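A quick numerical check of the triangle-like inequality on random states (a sketch under the same conventions as above; the dimension and the values of $\alpha$ are arbitrary):

```python
import numpy as np

def random_density_matrix(d, seed):
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = X @ X.conj().T
    return rho / np.trace(rho)

def mpow(rho, p):
    w, V = np.linalg.eigh(rho)
    w = np.clip(w, 1e-15, None)   # guard against tiny negative eigenvalues
    return (V * w ** p) @ V.conj().T

def D_alpha(rho, sigma, a):
    # sandwiched Rényi relative entropy (trace form of Definition 4.3)
    s = mpow(sigma, (1 - a) / (2 * a))
    return np.log(np.trace(mpow(s @ rho @ s, a)).real) / (a - 1)

def D_max(rho, sigma):
    s = mpow(sigma, -0.5)
    return np.log(np.linalg.norm(s @ rho @ s, ord=2))

rho = random_density_matrix(3, seed=1)
sigma = random_density_matrix(3, seed=2)
omega = random_density_matrix(3, seed=3)
for a in (0.8, 1.5, 4.0):  # any α > 1/2
    assert D_alpha(rho, sigma, a) <= D_alpha(rho, omega, a) + D_max(omega, sigma) + 1e-10
print("triangle-like inequality verified")
```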
With these two properties, we can see that if we were to replace the assumptions:
• $I(A;C|B) \le \varepsilon$
• $I(B;C|A) \le \varepsilon$
• $I(A;B|D) \le \varepsilon$
by:
• $D_{\max}(\rho_{ABC}\,\|\,T_{B\to BC}(\rho_{AB})) \le \varepsilon$
• $D_{\max}(\rho_{ABC}\,\|\,T_{A\to AC}(\rho_{AB})) \le \varepsilon$
• $D_{\max}(\rho_{ABD}\,\|\,T_{D\to BD}(\rho_{AD})) \le \varepsilon$
then we could use the triangle-like inequality, and we finally have:

$$D_M\left(\int_{-\infty}^{+\infty}\beta_0(t)\,\rho_{AB}^{\frac{1+it}{2}}\rho_A^{-\frac{1+it}{2}}\,\rho_{AC}\,\rho_A^{-\frac{1-it}{2}}\rho_{AB}^{\frac{1-it}{2}}\,dt\,\middle\|\,\int_{-\infty}^{+\infty}\beta_0(t)\,\rho_{AB}^{\frac{1+it}{2}}\rho_B^{-\frac{1+it}{2}}\,\rho_{BC}\,\rho_B^{-\frac{1-it}{2}}\rho_{AB}^{\frac{1-it}{2}}\,dt\right) \le 2\varepsilon$$
The problem now is to get from here to an equation similar to (8). We can no longer cancel the $\rho_{AB}$ factors, even though $\rho_{AB}$ is invertible, and the integrals are hard to manipulate. We could not find a solution to the first problem; for the second, we considered using the mean value theorem to get rid of the integrals, but unfortunately we did not succeed in pushing this idea further.
A similar approach is to use the trace norm instead of the Rényi relative entropies, which we may do because of Pinsker's inequality, but this does not help going further.
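For reference, the quantum Pinsker inequality invoked here reads $D(\rho\|\sigma) \ge \frac{1}{2}\|\rho-\sigma\|_1^2$ (with $D$ in nats); a minimal numerical check on random states:

```python
import numpy as np

def random_density_matrix(d, seed):
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = X @ X.conj().T
    return rho / np.trace(rho)

def logm(rho):
    w, V = np.linalg.eigh(rho)
    return (V * np.log(np.clip(w, 1e-15, None))) @ V.conj().T

def D(rho, sigma):
    # relative entropy in nats
    return np.trace(rho @ (logm(rho) - logm(sigma))).real

def trace_norm(X):
    # ||X||_1 = sum of |eigenvalues| for Hermitian X
    return np.abs(np.linalg.eigvalsh(X)).sum()

for seed in range(5):
    rho = random_density_matrix(4, seed)
    sigma = random_density_matrix(4, seed + 100)
    assert D(rho, sigma) >= 0.5 * trace_norm(rho - sigma) ** 2 - 1e-10
print("Pinsker inequality verified on 5 random pairs")
```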
Now, we need to see why we need equation (9). In the original proof, it helps us get rid of the $\rho_{ABC}$ term in $\kappa(t)$ after we "translated" the inequality into the non-negativity of measured relative entropy. To show that the trace of $\kappa(t)$ equals 1, we need to
trace out over the different systems, one at a time. Thus, if a system appears on
only one density operator, then we can easily trace out over this system (take for
example the first step when we trace out over the A system). The order of the terms
is important, but notice that in the Golden-Thompson inequality we can shuffle the order of the factors on the right-hand side, since on the left-hand side they appear inside the exponential of a sum, where the order is irrelevant. However, without (9), we cannot untangle the systems, so we cannot trace out step by step over
the different systems.
As subtracting conditional mutual informations does not help cancel any term,
we have to add them in order to get a simpler expression. Even if these conditional
mutual informations are non-zero, we have two options. The first one is to then
use a lower bound on the (measured) relative entropy to show that it is greater than $\varepsilon$.
Such a lower bound exists: we found one in [RW15]. However, this does not look like
a good solution: first, it seems hard to apply; second, the lower bound depends
on the dimension $d$, which appears in the expression. This might fail to generalize
to infinite-dimensional spaces, and might even fail for large finite dimensions.
The other option is to modify the inequality to include error terms, as suggested
in [Sut19], to get an inequality of the form:

$$\kappa_1 I(A;C|B) + \kappa_2 I(B;C|A) + \kappa_3 I(A;B|D) + I(ABC;D) - I(AB;CD) \ge 0$$

with $\kappa_1, \kappa_2, \kappa_3$ three constants to determine.
4.4 Symmetry
There is a symmetry between the A and the B systems in the original theorem by
Linden and Winter:
Theorem 4.6
Linden-Winter inequality
Let A, B, C, D be quantum systems, with the constraints:
• A ↔ B ↔ C is a quantum Markov chain, i.e. $I(A;C|B) = 0$
• B ↔ A ↔ C is a quantum Markov chain, i.e. $I(B;C|A) = 0$
• A ↔ D ↔ B is a quantum Markov chain, i.e. $I(A;B|D) = 0$
Then we have:

$$I(C;D) \ge I(C;AB)$$

which can also be written as:

$$I(ABC;D) \ge I(AB;CD)$$
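As a sanity check, the theorem can be verified numerically on a simple state satisfying all three constraints, for instance $\rho_{ABCD} = \rho_A \otimes \rho_B \otimes \rho_{CD}$ with a classically correlated $\rho_{CD}$; this example is ours, not taken from [LW04]. All three conditional mutual informations vanish, and the conclusion holds with $I(C;D) = \ln 2 > 0 = I(C;AB)$:

```python
import numpy as np

def S(rho):
    # von Neumann entropy in nats, ignoring zero eigenvalues
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-(w * np.log(w)).sum())

def ptrace(rho, keep, dims):
    # partial trace keeping the subsystems whose indices are in `keep`
    n = len(dims)
    t = rho.reshape(dims + dims)
    removed = 0
    for i in reversed(range(n)):
        if i in keep:
            continue
        t = np.trace(t, axis1=i, axis2=i + n - removed)
        removed += 1
    d = int(np.prod([dims[i] for i in keep]))
    return t.reshape(d, d)

def cmi(rho, X, Z, Y, dims):
    # I(X;Z|Y) = S(XY) + S(YZ) - S(Y) - S(XYZ)
    return (S(ptrace(rho, sorted(X + Y), dims)) + S(ptrace(rho, sorted(Y + Z), dims))
            - S(ptrace(rho, sorted(Y), dims)) - S(ptrace(rho, sorted(X + Y + Z), dims)))

def mi(rho, X, Z, dims):
    # I(X;Z) = S(X) + S(Z) - S(XZ)
    return (S(ptrace(rho, sorted(X), dims)) + S(ptrace(rho, sorted(Z), dims))
            - S(ptrace(rho, sorted(X + Z), dims)))

rho_A = np.diag([0.7, 0.3])
rho_B = np.diag([0.6, 0.4])
rho_CD = np.zeros((4, 4))
rho_CD[0, 0] = rho_CD[3, 3] = 0.5             # (|00><00| + |11><11|)/2
rho = np.kron(np.kron(rho_A, rho_B), rho_CD)  # subsystem order A, B, C, D
dims = (2, 2, 2, 2)
A, B, C, D = [0], [1], [2], [3]

assert abs(cmi(rho, A, C, B, dims)) < 1e-9  # A ↔ B ↔ C
assert abs(cmi(rho, B, C, A, dims)) < 1e-9  # B ↔ A ↔ C
assert abs(cmi(rho, A, B, D, dims)) < 1e-9  # A ↔ D ↔ B
assert mi(rho, C, D, dims) >= mi(rho, C, A + B, dims) - 1e-9  # I(C;D) ≥ I(C;AB)
print(mi(rho, C, D, dims))  # → ln 2 ≈ 0.6931
```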
If we swap A and B, the first assumption becomes the second and vice versa. The
third one is symmetric. We also notice that the inequality is symmetric, as only
the AB system as a whole is relevant. However, in the proof by Lemm and Xiang, the symmetry is broken twice: when we trace out over A to get (8), and when we add $I(A;C|B)$
to get (9). Nevertheless, we can adapt the proof to keep the symmetry by slightly modifying equation (9). We sketch the proof:
Proof:
In order to get an equation similar to (9), we add $\frac{1}{2}(I(A;C|B) + I(B;C|A))$ instead
of $I(A;C|B)$, and we multiply by two to obtain an inequality equivalent to the one
desired:

$$S(AC) + S(BC) - S(A) - S(B) + 2S(D) - 2S(CD) + 2S(ABCD) - 2S(ABCD) \ge 0$$

It is sufficient to prove the two following inequalities:

$$S(AC) - S(A) + S(D) - S(CD) + S(ABCD) - S(ABCD) \ge 0$$

$$S(BC) - S(B) + S(D) - S(CD) + S(ABCD) - S(ABCD) \ge 0$$
These two inequalities replace equation (9): the second one is actually equation (9),
and the first one is obtained by swapping A and B. Thus we know that the second
one is true, according to the previous proof, and to prove the first one it suffices to
get an equation similar to equation (8) but with A and B swapped, which is possible:
$$\rho_{AB}^{\frac{1+it}{2}}\rho_B^{-\frac{1+it}{2}}\,\rho_{BC}\,\rho_B^{-\frac{1-it}{2}}\rho_{AB}^{\frac{1-it}{2}} = \rho_{AB}^{\frac{1+it}{2}}\rho_A^{-\frac{1+it}{2}}\,\rho_{AC}\,\rho_A^{-\frac{1-it}{2}}\rho_{AB}^{\frac{1-it}{2}}$$
We then use that $\rho_{AB}$ is invertible, but we trace out over B this time, to get:

$$\dim B\cdot\rho_A^{-\frac{1+it}{2}}\,\rho_{AC}\,\rho_A^{-\frac{1-it}{2}} = \operatorname{Tr}_B\left(\rho_B^{-\frac{1+it}{2}}\,\rho_{BC}\,\rho_B^{-\frac{1-it}{2}}\right)$$
Using this, the rest of the proof is identical to the one shown earlier, with A and B swapped.
Thus we proved the two inequalities that yield the result.
□
References
[G00] Grafakos. Classical Fourier Analysis. Springer, third edition, 2000.
[LW04] Linden and Winter. A new inequality for the von Neumann entropy. 2004.
[LX] Lemm and Xiang. Entropy inequalities.
[NC00] Nielsen and Chuang. Quantum Computation and Quantum Information. Cambridge University Press, 10th edition, 2000.
[RW15] Reeb and Wolf. Tight bound on relative entropy by entropy difference. 2015.
[SBT16] Sutter, Berta, and Tomamichel. Multivariate trace inequalities. 2016.
[SR18] Sutter and Renner. Necessary criterion for approximate recoverability. 2018.
[Sut18] Sutter. Approximate Quantum Markov Chains. Springer, 2018.
[Sut19] Sutter. A new approach to prove von Neumann inequalities. Talk, 2019.
[Wil19] Wilde. From Classical to Quantum Shannon Theory. Cambridge University Press, 2019.
```