Quantum entropy inequalities

Théo Fradin

August 24, 2021

"You should call it 'entropy' for two reasons: first, the function is already in use in thermodynamics; second, and most importantly, most people don't know what entropy really is, and if you use 'entropy' in an argument, you will win every time."
John von Neumann to Claude Shannon

Summary

This internship was supervised by Marius Lemm, professor at the École Polytechnique Fédérale de Lausanne, chair of analysis and mathematical physics. During this internship in quantum information theory I learnt about quantum entropy. After working on the basic notions (definitions of the various quantum entropies, basic entropy inequalities, strong subadditivity, the data processing inequality), I learnt that strong subadditivity is the most important entropy inequality, and that every other unconstrained one can be derived from it. I then worked on the main tool of the new method of [Sut19] for finding entropy inequalities, namely the Golden-Thompson inequality. After that, I studied the application of this method to prove a variant of the Linden-Winter inequality, a constrained but new inequality; the proof follows the note [LX]. Eventually, we did some research on generalizing the proof to the case of approximate quantum Markov chains, without much success, although we did manage to restore the original symmetry of the proof.

Contents

1 Introduction
  1.1 Notations
  1.2 Specifics of quantum physics
  1.3 Entropy
2 Golden-Thompson inequality
3 Quantum Markov chains
  3.1 Quantum Markov chains
  3.2 Approximate quantum Markov chains
4 Linden-Winter inequality and possible extensions
  4.1 Linden and Winter's original inequality
  4.2 A new proof for the inequality
  4.3 Thoughts on the proof
  4.4 Symmetry

1 Introduction

In this section we introduce the notations as well as the basic notions used later. This follows [NC00], [Wil19] and [Sut18].

1.1 Notations

Let A be a quantum system, i.e. a Hermitian space. We will use the following notations:

• ρ† : conjugate transpose of the matrix ρ.
• log : the natural logarithm, for complex numbers as well as for matrices.
• |ψ⟩ : ket ψ, a vector of A.
• ⟨ψ| : bra ψ, a vector of the dual of A.
• I_A : identity map of A.
• tr, tr_A : trace, and partial trace over A.
• supp(ρ) : support of ρ.
• Sp(ρ) : set of eigenvalues of ρ.
• ⊕ : sum of two operators acting on orthogonal spaces.
• ∥ρ∥_p : p-Schatten norm of the operator ρ.
• |ρ| := (ρρ†)^{1/2} : absolute value of the matrix ρ.
• L(A) : set of linear operators on A.
• D(A) : set of density operators on A.
• P(A) : set of positive semi-definite operators on A.
• P+(A) : set of positive definite operators on A.
• TPCP(A, B) : set of trace preserving completely positive maps from A to B.

The matrices used here are positive, unless specified otherwise.

1.2 Specifics of quantum physics

A quantum system is a Hermitian space A, whose vectors |ψ⟩ represent its possible states. The adjoint of |ψ⟩ is written ⟨ψ|. For example, ⟨ψ|ψ⟩ is the scalar product of ψ with itself, and |ψ⟩⟨ψ| is an operator acting on A; more precisely, it is the projector onto the subspace spanned by |ψ⟩. Multipartite systems are described with tensor products. Let AB be a bipartite system, i.e. the Hermitian space A ⊗ B.
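To make this formalism concrete, here is a minimal numpy sketch (our own illustration, not part of the report's sources; the variable names are ours). It builds a standard joint state of two qubits with the tensor (Kronecker) product and checks the basic bra-ket identities:

```python
import numpy as np

# Kets of a single qubit: column vectors in the computational basis.
ket0 = np.array([[1.0], [0.0]])
ket1 = np.array([[0.0], [1.0]])

# A bipartite system AB is the tensor product A (x) B; np.kron realises it.
ket00 = np.kron(ket0, ket0)
ket11 = np.kron(ket1, ket1)

# The Bell state (|00> + |11>)/sqrt(2), a joint state of AB that has no
# product form |psi_A> (x) |psi_B>.
psi = (ket00 + ket11) / np.sqrt(2)

bra = psi.conj().T          # the bra <psi| is the adjoint of the ket |psi>
inner = (bra @ psi).item()  # <psi|psi>, the scalar product of psi with itself
proj = psi @ bra            # |psi><psi|, the projector onto span(|psi>)

print(round(inner.real, 10))                 # 1.0
print(bool(np.allclose(proj @ proj, proj)))  # True: a projector is idempotent
```

The same pattern (column vectors for kets, adjoints for bras, `np.kron` for tensor products) is enough to reproduce every finite-dimensional computation in this report numerically.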
A state of AB which cannot be written in the form |ψ_A⟩ ⊗ |ψ_B⟩ is said to be entangled. We can also represent states with density operators:

Definition 1.1 (Density operator) Let {(p_i, |ψ_i⟩)} be a set of pure states with associated probabilities, where ∑_i p_i = 1. Then we define

ρ := ∑_i p_i |ψ_i⟩⟨ψ_i|

the density operator of the system. A density operator which is a projector, that is to say of the form |ψ⟩⟨ψ|, is called a pure state. Otherwise, it is a mixed state.

We now need a mathematical criterion to describe the evolution of a quantum system. Let N be an operator that maps a density matrix ρ_A to ρ_B, so that ρ_B is the evolution of ρ_A over time. Using physical reasoning (see [Wil19, Section 4.4.1]), we can define quantum channels:

Definition 1.2 (Quantum channels) Let A and B be quantum systems, and N : L(A) → L(B) be a linear map. It is called a quantum channel if it is a trace preserving completely positive map, i.e.:

• ∀ρ_A ∈ L(A), tr(N(ρ_A)) = tr(ρ_A);
• for every quantum system R, I_{L(R)} ⊗ N maps positive semi-definite operators to positive semi-definite operators.

We can justify these conditions as follows. We want a density operator to be mapped to a density operator, as a quantum system remains a quantum system over time, so N has to be trace preserving and has to map positive semi-definite operators to positive semi-definite operators. But suppose we prepare a larger quantum system, whose subsystems are R and A. Then N extends to I_{L(R)} ⊗ N, and this extension has to be trace preserving (which it is automatically) and positivity preserving, which is exactly the definition of N being a completely positive map.

We will use density operators to describe quantum systems, as they are convenient for describing multipartite systems. The key notion here is the partial trace:

Definition 1.3 (Partial trace) Let |a_1⟩⟨a_2| ⊗ |b_1⟩⟨b_2| be an operator on A ⊗ B.
We define:

tr_A(|a_1⟩⟨a_2| ⊗ |b_1⟩⟨b_2|) := |b_1⟩⟨b_2| tr(|a_1⟩⟨a_2|)

where the trace on the right-hand side is the usual trace. We then extend the definition by linearity.

An important property of the partial trace (inherited from the full trace) is a form of cyclicity: for operators on A ⊗ B for which the products are well defined, tr_A is cyclic with respect to factors acting only on the A system, e.g. tr_A((X_A ⊗ I_B)ρ) = tr_A(ρ(X_A ⊗ I_B)).

The partial trace is the operation which gives the density operators of the subsystems:

Property 1.4 (Reduced states) Let A, B be two systems, and ρ_AB, ρ_A, ρ_B the density operators of AB, A and B respectively. The so-called reduced density operators ρ_A and ρ_B are obtained by taking the partial trace:

ρ_A = tr_B(ρ_AB),  ρ_B = tr_A(ρ_AB)

We now introduce Schatten norms, which are the norms we will use for matrices:

Definition 1.5 (Schatten norms) Let p ≥ 1 and L be a matrix. We define:

∥L∥_p := (tr|L|^p)^{1/p}, where |L| := (L L†)^{1/2}

1.3 Entropy

Definition 1.6 (Classical entropy) Let X be a discrete random variable. We introduce the Shannon entropy:

H(X) := − ∑_{x ∈ X(Ω)} P(X = x) log(P(X = x))

Entropy was introduced by Shannon as a way to quantify how much information can be delivered by a source. Other entropy-related quantities exist, but the classical case is here only to be compared with the quantum case:

Definition 1.7 (Quantum entropies) Let ρ_A, ρ_B, ρ_C be density operators representing states of the systems A, B and C.
We define:

• Von Neumann entropy: S(A) := −tr(ρ_A log ρ_A) = − ∑_{λ ∈ Sp(ρ_A)} λ log(λ)
• Quantum relative entropy: D(A||B) := S(A||B) := tr(ρ_A log ρ_A) − tr(ρ_A log ρ_B)
• Measured relative entropy: D_M(ρ_A||ρ_B) := sup_{ω ∈ P+(A)} tr(ρ_A log ω) − log tr(ρ_B ω)
• Quantum joint entropy: S(AB) := −tr(ρ_AB log ρ_AB)
• Quantum conditional entropy: S(A|B) := S(AB) − S(B)
• Quantum mutual information: I(A; B) := S(A) + S(B) − S(AB)
• Quantum conditional mutual information: I(A; C|B) := S(AB) + S(BC) − S(ABC) − S(B)

At this point we can make two remarks. First, the relative entropy is also called divergence, hence the notation with a "D". Second, one may notice that the definition of the measured relative entropy looks quite different from the others. It is in fact a characterization, not the usual definition. It is however equivalent, and we will only use this form; the more standard definition would require other notions of quantum physics, in particular a specific sort of measurement.

The relative entropy can also be expressed with a variational formula:

Theorem 1.8 (Variational formula for relative entropy) Let ρ ∈ D(A) and σ ∈ P(A). Then:

D(ρ||σ) = sup_{ω ∈ P+(A)} tr(ρ log ω) − log tr(e^{log σ + log ω})

These quantities obey several laws, one of the most important being non-negativity. We have the following inequalities:

Theorem 1.9 (Quantum entropy inequalities) Let A, B and C be three quantum systems, and ρ and σ density operators. We have:

• Non-negativity of quantum entropy: S(A) ≥ 0
• Subadditivity: I(A; B) ≥ 0, i.e. S(AB) ≤ S(A) + S(B)
• Strong subadditivity: I(A; C|B) ≥ 0, i.e. S(ABC) + S(B) ≤ S(AB) + S(BC)
• Klein's inequality: S(A||B) ≥ 0
• Non-negativity of relative entropies: D(ρ||σ) ≥ D_M(ρ||σ) ≥ 0

Strong subadditivity is the strongest entropy inequality: all the other (unconstrained) inequalities can be deduced from it.
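These inequalities can be checked numerically. The following numpy sketch (our own illustration, not part of the report's development; the helper names are ours) draws a random three-qubit density operator and verifies subadditivity and strong subadditivity on it:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_state(d):
    """A random density operator: G G-dagger normalised to unit trace."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def ptrace(rho, dims, keep):
    """Partial trace of rho (on subsystems of dimensions `dims`), keeping `keep`."""
    n = len(dims)
    t = rho.reshape(dims + dims)
    cur = n
    for i in sorted(set(range(n)) - set(keep), reverse=True):
        t = np.trace(t, axis1=i, axis2=i + cur)  # trace out subsystem i
        cur -= 1
    d = int(np.prod([dims[i] for i in keep]))
    return t.reshape(d, d)

def entropy(rho):
    """Von Neumann entropy S = -sum lambda log lambda (natural logarithm)."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(-np.sum(lam * np.log(lam)))

dims = [2, 2, 2]                    # three qubits A, B, C
rho_abc = random_state(8)
S = lambda keep: entropy(ptrace(rho_abc, dims, keep))

# Subadditivity: S(AB) <= S(A) + S(B), i.e. I(A;B) >= 0.
assert S([0, 1]) <= S([0]) + S([1]) + 1e-9
# Strong subadditivity: S(ABC) + S(B) <= S(AB) + S(BC), i.e. I(A;C|B) >= 0.
assert S([0, 1, 2]) + S([1]) <= S([0, 1]) + S([1, 2]) + 1e-9
print("subadditivity and strong subadditivity hold for this sample")
```

Of course, a random sample is not a proof, but this kind of check is a useful guard when manipulating the more elaborate entropy combinations of Section 4.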
Another fundamental inequality is the data processing inequality. It states that the relative entropy between two states cannot increase under the action of a quantum channel:

Theorem 1.10 (Data processing inequality) Let ρ, σ be two density operators and E be a quantum channel. Then:

D(E(ρ)||E(σ)) ≤ D(ρ||σ)

The measured relative entropy also satisfies the data processing inequality.

2 Golden-Thompson inequality

The major result we will need later is the Golden-Thompson inequality. To prove it, we first need some results from complex interpolation theory. These theorems come from [SBT16], except for Lemma 2.2, which comes from [G00].

Theorem 2.1 Let S := {z ∈ ℂ : 0 ≤ Re(z) ≤ 1}, and let G : S → B(A) be a map into the set of bounded operators on a separable Hilbert space, holomorphic in the interior of S and continuous on the boundary of S. Let p_0, p_1 ∈ ]0, ∞], θ ∈ ]0, 1[, and let p_θ be defined by 1/p_θ = (1−θ)/p_0 + θ/p_1. Let

β_θ : ℝ → ℝ, t ↦ sin(πθ) / ( 2θ (cosh(πt) + cos(πθ)) )

be a probability distribution. If furthermore z ↦ ∥G(z)∥_{p_{Re(z)}} is uniformly bounded on S, then:

log ∥G(θ)∥_{p_θ} ≤ ∫_{−∞}^{+∞} ( β_{1−θ}(t) log ∥G(it)∥_{p_0}^{1−θ} + β_θ(t) log ∥G(1+it)∥_{p_1}^{θ} ) dt

We first need a lemma:

Lemma 2.2 Let g be analytic on the open strip S° := {z ∈ ℂ : 0 < Re(z) < 1}, continuous and uniformly bounded on its closure. Then for θ ∈ ]0, 1[ we have:

log|g(θ)| ≤ ∫_{−∞}^{+∞} ( β_{1−θ}(t) log|g(it)|^{1−θ} + β_θ(t) log|g(1+it)|^{θ} ) dt

Proof: The lemma is in fact a slightly stronger result: one can indeed show that for any z := x + iy ∈ S°, we have:

log|g(x + iy)| ≤ (sin(πx)/2) ∫_{−∞}^{+∞} ( log|g(it + iy)| / (cosh(πt) − cos(πx)) + log|g(1 + it + iy)| / (cosh(πt) + cos(πx)) ) dt

We will only do the proof for y = 0, in which case we write θ = x. Notice that −cos(πθ) = cos(π(1−θ)) and sin(πθ) = sin(π(1−θ)).
Using these two identities, the stronger result at y = 0 can be rewritten, after absorbing the exponents 1−θ and θ into the prefactors, as:

log|g(θ)| ≤ ∫_{−∞}^{+∞} ( sin(π(1−θ)) / (2(1−θ)(cosh(πt) + cos(π(1−θ)))) log|g(it)|^{1−θ} + sin(πθ) / (2θ(cosh(πt) + cos(πθ))) log|g(1+it)|^{θ} ) dt

which is exactly the lemma as stated above, according to the definition of β_θ.

The proof follows [G00, Lemma 1.3.8]. For U any harmonic function on the open unit disc D, we have the Poisson formula:

U(z) = (1/2π) ∫_{−π}^{π} U(Re^{iφ}) (R² − ρ²) / |Re^{iφ} − ρe^{iθ}|² dφ   (1)

for z := ρe^{iθ} and |z| < R < 1. Let u be subharmonic on D and continuous on the circle C_R of radius R. Taking U = u, the right-hand side of (1) defines a harmonic function on the open disc D_R of radius R which coincides with u on its boundary. The maximum principle for subharmonic functions then implies, for |z| < R < 1:

u(z) ≤ (1/2π) ∫_{−π}^{π} u(Re^{iφ}) (R² − ρ²) / |Re^{iφ} − ρe^{iθ}|² dφ   (2)

for z := ρe^{iθ}. This is valid for all subharmonic functions u that are continuous on the circle C_R. Let:

h : D → S, ξ ↦ (1/πi) log( i(1+ξ)/(1−ξ) )

The fraction i(1+ξ)/(1−ξ) is analytic on D (its series representation has a radius of convergence of at least 1) and takes values in the upper half-plane, so h is holomorphic. Consequently g ∘ h is holomorphic on D, and u := log|g ∘ h| is subharmonic. Indeed, for any holomorphic function F, log|F| is subharmonic: first assume that F does not vanish; then log|F| = Re(log(F)), and log(F) is holomorphic, so its real part is harmonic. For the general case we must allow the value −∞, and log|F| is then only subharmonic.
Applying (2) to this function and computing the modulus in the denominator gives:

log|g ∘ h(z)| ≤ (1/2π) ∫_{−π}^{π} log|g ∘ h(Re^{iφ})| (R² − ρ²) / (R² − 2ρR cos(θ − φ) + ρ²) dφ   (3)

for z := ρe^{iθ} and |z| = ρ < R. We use our assumption that g is uniformly bounded on the closure of S, so we can apply the Lebesgue dominated convergence theorem in (3), with ρ fixed and R → 1:

log|g ∘ h(z)| ≤ (1/2π) ∫_{−π}^{π} log|g ∘ h(e^{iφ})| (1 − ρ²) / (1 − 2ρ cos(θ − φ) + ρ²) dφ   (4)

Let x := h(ρe^{iθ}). A simple computation shows that h is invertible, with:

ρe^{iθ} = h^{−1}(x) = (e^{πix} − i)/(e^{πix} + i) = −i cos(πx)/(1 + sin(πx))

so, writing this complex number in exponential form:

ρ = |cos(πx)| / (1 + sin(πx)),  |θ| = π/2

Note that the absolute value takes into account the sign of cos(πx), as x is in ]0, 1[. Either way, we have:

(1 − ρ²) / (1 − 2ρ cos(θ − φ) + ρ²) = sin(πx) / (1 + cos(πx) sin(φ))

Thus we can rewrite (4) as:

log|g(x)| ≤ (1/2π) ∫_{−π}^{+π} sin(πx) / (1 + cos(πx) sin(φ)) log|g(h(e^{iφ}))| dφ   (5)

We now change variables:

• On ]−π, 0[, we use it = h(e^{iφ}). This is a legitimate change of variables, as h is holomorphic and invertible. We get:

(1/2π) ∫_{−π}^{0} sin(πx) / (1 + cos(πx) sin(φ)) log|g(h(e^{iφ}))| dφ = (1/2) ∫_{−∞}^{+∞} sin(πx) / (cosh(πt) − cos(πx)) log|g(it)| dt

• On ]0, π[, we use 1 + it = h(e^{iφ}). We get:

(1/2π) ∫_{0}^{π} sin(πx) / (1 + cos(πx) sin(φ)) log|g(h(e^{iφ}))| dφ = (1/2) ∫_{−∞}^{+∞} sin(πx) / (cosh(πt) + cos(πx)) log|g(1+it)| dt

Adding these two quantities and using (5):

log|g(x)| ≤ (1/2) ∫_{−∞}^{+∞} sin(πx) ( log|g(it)| / (cosh(πt) − cos(πx)) + log|g(1+it)| / (cosh(πt) + cos(πx)) ) dt

which is the desired result, according to the remark at the beginning of the proof. □

We now prove Theorem 2.1:

Proof: The proof follows [SBT16, Appendix D]. For x ∈ [0, 1] we define q_x as the Hölder conjugate of p_x: 1/p_x + 1/q_x = 1. Using the definition of p_x given earlier, we have: 1/q_x = (1−x)/q_0 + x/q_1.
Let θ ∈ ]0, 1[ be fixed. The operator G(θ) is bounded by assumption, so it has a polar decomposition G(θ) = U∆, with ∆ positive semi-definite and U a partial isometry satisfying ∆UU† = U†U∆ = ∆ (see for example [NC00, Section 2.1.10]). For z ∈ S, let:

X(z)† := C^{−p_θ((1−z)/q_0 + z/q_1)} ∆^{p_θ((1−z)/q_0 + z/q_1)} U†, with C := ∥∆∥_{p_θ} = ∥G(θ)∥_{p_θ} < ∞

The map z ↦ X(z) is anti-holomorphic on S, i.e. z ↦ X(z)† is holomorphic (as can be checked with the Cauchy-Riemann equations). We also have:

∥X(x + iy)∥_{q_x}^{q_x} = tr( (C^{−1}∆)^{p_θ q_x ((1−x)/q_0 + x/q_1)} ) = tr( (C^{−1}∆)^{p_θ} ) = 1

the last equality resulting from the definition of C. Consequently, the Hilbert-Schmidt inner product g(z) := tr(X(z)†G(z)) is holomorphic (because X(·)† and G are holomorphic and the trace is linear) and bounded on S, because Hölder's inequality gives:

|g(x + iy)| ≤ ∥X(x + iy)∥_{q_x} ∥G(x + iy)∥_{p_x} ≤ ∥G(x + iy)∥_{p_x}

Hence the assumption on G shows that g satisfies the assumptions of the previous lemma:

log|g(θ)| ≤ ∫_{−∞}^{+∞} ( β_{1−θ}(t) log|g(it)|^{1−θ} + β_θ(t) log|g(1+it)|^{θ} ) dt   (6)

It now remains to verify the following relations to conclude:

g(θ) = tr(X(θ)†G(θ)) = C^{−p_θ/q_θ} tr( ∆^{p_θ−1} U†U ∆ ) = C^{1−p_θ} tr( ∆^{p_θ} ) = ∥G(θ)∥_{p_θ}

|g(it)| ≤ ∥G(it)∥_{p_0} and |g(1+it)| ≤ ∥G(1+it)∥_{p_1}

The last two inequalities result from Hölder's inequality. Plugging these relations into (6) yields the result. □

The Golden-Thompson inequality follows directly from this result:

Theorem 2.3 Let p ≥ 1, r ∈ ]0, 1], β_r as defined earlier, n ∈ ℕ, and a collection (A_k)_{k≤n} of n positive definite matrices. Then:

log ∥ ( ∏_{k=1}^{n} A_k^r )^{1/r} ∥_p ≤ ∫_{−∞}^{+∞} β_r(t) log ∥ ∏_{k=1}^{n} A_k^{1+it} ∥_p dt

Proof: We prove the case of positive definite matrices, as the semi-definite case follows by continuity. For r = 1 the assertion is trivial. Otherwise, let us define G(z) := ∏_{k=1}^{n} A_k^z for z ∈ S, where S is the strip of the complex plane defined previously.
This function satisfies the regularity assumptions of Theorem 2.1. We pick θ = r, p_0 = ∞, p_1 = p, so that p_θ = p/r. We find:

log ∥G(1+it)∥_{p_1}^{θ} = r log ∥ ∏_{k=1}^{n} A_k^{1+it} ∥_p

and:

log ∥G(it)∥_{p_0}^{1−θ} = (1 − r) log ∥ ∏_{k=1}^{n} A_k^{it} ∥_∞ = 0

as the matrices A_k^{it} are unitary, so their product is unitary (recall that ∥·∥_∞ is the operator norm). Moreover we have:

log ∥G(θ)∥_{p_θ} = log ∥ ∏_{k=1}^{n} A_k^r ∥_{p/r} = r log ∥ ( ∏_{k=1}^{n} A_k^r )^{1/r} ∥_p

Plugging these relations into Theorem 2.1 yields the result. □

We need one last tool before proving the Golden-Thompson inequality:

Property 2.4 (Multivariate Lie-Trotter product formula) For any collection of square matrices (A_k)_{k≤m} with m ∈ ℕ, we have:

lim_{n→+∞} ( e^{A_1/n} ⋯ e^{A_m/n} )^n = e^{∑_{k=1}^{m} A_k}

Proof: We expand each factor as e^{A_k/n} = I + A_k/n + O(1/n²), where I is the identity matrix. This gives:

∏_{k=1}^{m} e^{A_k/n} = ∏_{k=1}^{m} ( I + A_k/n + O(1/n²) ) = I + (1/n) ∑_k A_k + O(1/n²)

Hence, by the binomial theorem:

( ∏_{k=1}^{m} e^{A_k/n} )^n = ( I + (1/n) ∑_k A_k + O(1/n²) )^n = ∑_{i=0}^{n} (n choose i) ( (∑_k A_k)/n + O(1/n²) )^i = ∑_{i=0}^{n} (1/i!) (1 + O(1/n)) ( ∑_k A_k + O(1/n) )^i

where we used (n choose i)/nⁱ = (1/i!)(1 + O(1/n)) for fixed i. Taking the limit as n → +∞ yields the result. □

We can now easily prove the Golden-Thompson inequality:

Corollary 2.5 (Golden-Thompson inequality) Let p ≥ 1, β_0 as defined previously, n ∈ ℕ, and (H_k)_{k≤n} a collection of Hermitian matrices. Then:

log ∥ e^{∑_k H_k} ∥_p ≤ ∫_{−∞}^{+∞} β_0(t) log ∥ ∏_k e^{(1+it)H_k} ∥_p dt

Proof: We use the previous theorem with A_k := e^{H_k} and r = 1/n, letting n tend to infinity, i.e. r tend to 0. On the left side we use the Lie-Trotter product formula to get the expression we want. For the right side, we can use Lebesgue's dominated convergence theorem: the expression under the integral is continuous, hence bounded on [−1, 1], and for |t| > 1 (and r ≤ 1/2, say) it is bounded by π/(2 cosh(πt)) |log ∥∏_k e^{(1+it)H_k}∥_p|, which is integrable because log ∥∏_k e^{(1+it)H_k}∥_p is bounded with respect to the variable t. Indeed, by sub-multiplicativity of the Schatten norms, it is sufficient to show that ∥A^{1+it}∥_p is bounded.
To see that, we use that, as A is positive definite, the matrix A^{it} is unitary: (A^{it})† = (e^{it log A})† = e^{−it log A} = A^{−it}. Then we use that Schatten norms are unitarily invariant: let U := A^{it}. We have ∥AU∥_p^p := tr( (U†A†AU)^{p/2} ) = tr( (A†A)^{p/2} ) = ∥A∥_p^p, the last equality resulting from the cyclicity of the trace. Thus the expression in the integral is dominated by an integrable function, and we can use the Lebesgue dominated convergence theorem. □

We now apply this inequality with p = 2 and n = 6, replacing each H_k by H_k/2. Since ∥e^{(1/2)∑_{k≤6} H_k}∥_2² = tr(e^{∑_k H_k}), using that the logarithm is increasing together with Jensen's inequality (β_0 is a probability density), we get:

tr( e^{∑_k H_k} ) ≤ ∫_{−∞}^{+∞} β_0(t) tr( e^{(1+it)/2 H_1} e^{(1+it)/2 H_2} ⋯ e^{(1+it)/2 H_6} e^{(1−it)/2 H_6} ⋯ e^{(1−it)/2 H_1} ) dt

which can be simplified using the cyclicity of the trace:

tr( e^{∑_k H_k} ) ≤ ∫_{−∞}^{+∞} β_0(t) tr( e^{H_1} e^{(1+it)/2 H_2} ⋯ e^{(1+it)/2 H_5} e^{H_6} e^{(1−it)/2 H_5} ⋯ e^{(1−it)/2 H_2} ) dt   (7)

This result will be used later.

3 Quantum Markov chains

3.1 Quantum Markov chains

In this section we define quantum Markov chains. Let A, B, C be three quantum systems (i.e. Hilbert spaces), and let ρ_ABC, ρ_A, ρ_B, ρ_C, ... be the density operators associated with the states of the corresponding systems. This follows [Sut18, Chapter 5].

Definition 3.1 (Quantum Markov chains) A ↔ B ↔ C is a quantum Markov chain if there exists a recovery map R_{B→BC}, i.e. a trace preserving completely positive map from B to BC, such that:

ρ_ABC = R_{B→BC}(ρ_AB)

Informally, the state of C can be reconstructed from the state of AB by acting only on the B part. The idea behind this concept is close to that of classical Markov chains, for which one can say that "the future depends on the past only through the present".
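The simplest quantum Markov chain is a state of the form ρ_ABC = ρ_AB ⊗ ρ_C. The following numpy sketch (our own illustration, with hypothetical state names; the explicit recovery map used is the Petz map discussed in the next section, at t = 0) checks that such a state indeed has I(A;C|B) = 0 and that ρ_ABC is rebuilt from ρ_AB by acting only on B:

```python
import numpy as np

rng = np.random.default_rng(1)

def random_state(d):
    """A random density operator: G G-dagger normalised to unit trace."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def ptrace(rho, dims, keep):
    """Partial trace keeping the subsystems listed in `keep`."""
    n = len(dims)
    t = rho.reshape(dims + dims)
    cur = n
    for i in sorted(set(range(n)) - set(keep), reverse=True):
        t = np.trace(t, axis1=i, axis2=i + cur)
        cur -= 1
    d = int(np.prod([dims[i] for i in keep]))
    return t.reshape(d, d)

def entropy(rho):
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(-np.sum(lam * np.log(lam)))

def hpow(rho, p):
    """Real power of a positive definite matrix via eigendecomposition."""
    lam, u = np.linalg.eigh(rho)
    return (u * (lam.astype(complex) ** p)) @ u.conj().T

I2 = np.eye(2)
rho_ab = random_state(4)                  # two qubits A, B
rho_c = random_state(2)
rho_abc = np.kron(rho_ab, rho_c)          # A <-> B <-> C holds trivially

dims = [2, 2, 2]
rho_b = ptrace(rho_abc, dims, [1])
rho_bc = ptrace(rho_abc, dims, [1, 2])

# I(A;C|B) = S(AB) + S(BC) - S(ABC) - S(B) should vanish.
cmi = entropy(ptrace(rho_abc, dims, [0, 1])) + entropy(rho_bc) \
      - entropy(rho_abc) - entropy(rho_b)
print(bool(abs(cmi) < 1e-9))              # True

# Petz recovery map R_{B->BC} applied to rho_AB, each factor embedded in ABC:
mid_B = np.kron(np.kron(I2, hpow(rho_b, -0.5)), I2)   # I_A (x) rho_B^{-1/2} (x) I_C
rec = np.kron(I2, hpow(rho_bc, 0.5)) @ mid_B @ np.kron(rho_ab, I2) \
      @ mid_B @ np.kron(I2, hpow(rho_bc, 0.5))
print(bool(np.allclose(rec, rho_abc)))    # True: rho_ABC recovered from rho_AB
```

This only exercises the trivial Markov structure; the general structure of Property 3.2 below allows a direct sum over blocks of B as well.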
We have two characterisations of quantum Markov chains:

Property 3.2 (First characterisation of quantum Markov chains) A state ρ_ABC ∈ D(A ⊗ B ⊗ C) is a quantum Markov chain if and only if there exists a decomposition of the B system:

B = ⊕_j b_j^L ⊗ b_j^R

such that:

ρ_ABC = ⊕_j q_j ρ_{A b_j^L} ⊗ ρ_{b_j^R C}

for some probability distribution (q_j).

This characterisation describes the structure of a quantum Markov chain. However, the second characterisation links quantum Markov chains with entropic quantities, and is thus of greater importance for us:

Property 3.3 (Second characterisation of quantum Markov chains) A state ρ_ABC ∈ D(A ⊗ B ⊗ C) is a quantum Markov chain if and only if I(A; C|B) = 0. Moreover, in this case, the rotated Petz recovery map:

T^{[t]}_{B→BC} : X_B ↦ ρ_BC^{(1+it)/2} ( ρ_B^{−(1+it)/2} X_B ρ_B^{−(1−it)/2} ⊗ I_C ) ρ_BC^{(1−it)/2}

satisfies ρ_ABC = T^{[t]}_{B→BC}(ρ_AB) for any t ∈ ℝ.

Proof: Suppose that ρ_ABC is a quantum Markov chain. First, we compute:

−S(A|B) = D(ρ_AB || I_A ⊗ ρ_B) ≥ D( T^{[t]}_{B→BC}(ρ_AB) || I_A ⊗ T^{[t]}_{B→BC}(ρ_B) ) = −S(A|BC)

where we used the data processing inequality and the fact that tr_A( T^{[t]}_{B→BC}(ρ_AB) ) = T^{[t]}_{B→BC}(ρ_B), which is a simple computation. We now have:

I(A; C|B) = S(A|B) − S(A|BC) ≤ 0

Combining this with strong subadditivity, we obtain I(A; C|B) = 0. The other direction is harder to prove: it requires a strengthening of the data processing inequality involving the rotated Petz recovery map and relying, inter alia, on the multivariate Golden-Thompson inequality [Sut18, Proposition 5.21]. □

3.2 Approximate quantum Markov chains

The previous characterisation suggests a link between the conditional mutual information I(A; C|B) and the recoverability of the quantum Markov chain A ↔ B ↔ C, i.e. the distance between the recovered state T^{[t]}_{B→BC}(ρ_AB) and ρ_ABC. Here we measure this "distance" with the measured relative entropy.
It is not a metric, as it does not satisfy the triangle inequality; however, its non-negativity, with equality if and only if the two states are equal, allows us to use it as an indicator of "closeness".

Theorem 3.4 Let ρ_ABC ∈ D(A ⊗ B ⊗ C). We have:

I(A; C|B) ≥ D_M( ρ_ABC || T_{B→BC}(ρ_AB) )

where T_{B→BC} : X_B ↦ ∫_{−∞}^{+∞} β_0(t) T^{[t]}_{B→BC}(X_B) dt is the (averaged) rotated Petz recovery map.

The theorem is admitted here; the proof relies on a strengthened data processing inequality.

Definition 3.5 (Approximate quantum Markov chains) A state ρ_ABC ∈ D(A ⊗ B ⊗ C) is an ε-approximate quantum Markov chain if I(A; C|B) < ε.

The previous theorem shows that an approximate quantum Markov chain has good recoverability. However, it can be shown that it is not necessarily close to a quantum Markov chain with respect to the trace distance [Sut18, Proposition 5.9].

4 Linden-Winter inequality and possible extensions

4.1 Linden and Winter's original inequality

In 2004 Linden and Winter proved a new (constrained) inequality [LW04]:

Theorem 4.1 (Linden-Winter inequality) Let A, B, C, D be quantum systems, with the constraints:

• A ↔ B ↔ C is a quantum Markov chain, i.e. I(A; C|B) = 0
• B ↔ A ↔ C is a quantum Markov chain, i.e. I(B; C|A) = 0
• A ↔ D ↔ B is a quantum Markov chain, i.e. I(A; B|D) = 0

Then we have:

I(C; D) ≥ I(C; AB)

which can also be written as:

I(ABC; D) ≥ I(AB; CD)

They also proved that this inequality is not a mere consequence of strong subadditivity. However, the constraints are very strong: as they are equalities, the probability that they hold exactly in an experiment is zero. Therefore the aim is either to remove these constraints or to replace them with weaker ones. This is an open problem. The most natural approach seems to be replacing the three quantum Markov chains by approximate quantum Markov chains.

4.2 A new proof for the inequality

The proof given in [LW04] relies on the first characterisation of quantum Markov chains.
However, we do not have any similar characterisation for approximate quantum Markov chains; for this reason, that proof is not shown here. In [Sut19], Sutter gave a new method to prove entropy inequalities:

Method to prove entropy inequalities:

• Write the inequality as the non-negativity of a relative entropy, D(ρ||σ) ≥ 0.
• Expand the relative entropy using the variational formula.
• Use the multivariate Golden-Thompson inequality.
• Recognize the variational formula of the measured relative entropy.
• Use the non-negativity of the measured relative entropy (i.e. show that tr(ρ) = 1 and tr(σ) ≤ 1).

This method can be used to prove a slightly different version of the Linden-Winter inequality. This follows an unpublished note summarizing the work of M. Lemm and D. Xiang in 2019 [LX].

Theorem 4.2 (Variant of the Linden-Winter inequality) Let A, B, C, D be quantum systems, with the constraints:

• A ↔ B ↔ C is a quantum Markov chain, i.e. I(A; C|B) = 0
• B ↔ A ↔ C is a quantum Markov chain, i.e. I(B; C|A) = 0

We also assume that ρ_AB is invertible. Then we have:

I(ABC; D) ≥ I(AB; CD)

Proof: First, we use the two quantum Markov chains: ρ_ABC can be recovered in two different ways using the rotated Petz recovery map:

ρ_ABC = ρ_BC^{(1+it)/2} ( ρ_B^{−(1+it)/2} ρ_AB ρ_B^{−(1−it)/2} ⊗ I_C ) ρ_BC^{(1−it)/2}

As the second characterisation shows, in a quantum Markov chain the two extremal systems play symmetrical roles, so that if A ↔ B ↔ C is a quantum Markov chain, then C ↔ B ↔ A is one too. Here we swap A and C in the first chain, and B and C in the second, to get:

ρ_AB^{(1+it)/2} ( ρ_B^{−(1+it)/2} ρ_BC ρ_B^{−(1−it)/2} ⊗ I_A ) ρ_AB^{(1−it)/2} = ρ_ABC = ρ_AB^{(1+it)/2} ( ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} ⊗ I_B ) ρ_AB^{(1−it)/2}

From now on we omit the appropriate tensored identity matrices, to make the proof easier to read.
We thus write:

ρ_AB^{(1+it)/2} ρ_B^{−(1+it)/2} ρ_BC ρ_B^{−(1−it)/2} ρ_AB^{(1−it)/2} = ρ_AB^{(1+it)/2} ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} ρ_AB^{(1−it)/2}

Here we use the assumption that ρ_AB is invertible to cancel the outer factors, and we trace out over the A system:

dim(A) ρ_B^{−(1+it)/2} ρ_BC ρ_B^{−(1−it)/2} = tr_A( ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} )   (8)

We now write the desired inequality as a statement about divergences. First, let us write it with quantum entropies:

S(ABC) + S(D) − S(AB) − S(CD) ≥ 0

We use I(A; C|B) = 0, i.e. S(AB) + S(BC) − S(B) − S(ABC) = 0, and add these two quantities, so that the initial statement is equivalent to:

S(BC) + S(D) − S(B) − S(CD) + S(ABCD) − S(ABCD) ≥ 0   (9)

We now write these entropies with divergences. We have:

S(BC) = −tr_BC( ρ_BC log(ρ_BC) ) = −tr_ABCD( ρ_ABCD (log(ρ_BC) ⊗ I_AD) )

and similarly:

S(D) = −tr_ABCD( ρ_ABCD (log(ρ_D) ⊗ I_ABC) )
S(B) = −tr_ABCD( ρ_ABCD (log(ρ_B) ⊗ I_ACD) )
S(CD) = −tr_ABCD( ρ_ABCD (log(ρ_CD) ⊗ I_AB) )
S(ABCD) = −tr_ABCD( ρ_ABCD log(ρ_ABCD) )

We can now sum these terms and pull out ρ_ABCD:

S(BC) + S(D) − S(B) − S(CD) + S(ABCD) − S(ABCD) = D(ρ_ABCD || e^σ)

where we define σ as:

σ := log ρ_ABCD − log ρ_CD + log ρ_D − log ρ_B + log ρ_BC

Once again, all terms are implicitly tensored with the appropriate identity matrices, so that they all represent operators on the system ABCD.
We now use the variational formula for relative entropy:

D(ρ_ABCD || e^σ) = sup_{ω>0} tr(ρ_ABCD log ω) − log tr( e^{σ + log ω} ) ≥ sup_{ω>0} tr(ρ_ABCD log ω) − log tr( ∫_{−∞}^{+∞} β_0(t) κ(t) ω dt )

We used the Golden-Thompson inequality for 6 matrices, in the form (7), to obtain the lower bound, with:

κ(t) := ρ_ABCD^{(1+it)/2} ρ_CD^{−(1+it)/2} ρ_D^{(1+it)/2} ρ_B^{−(1+it)/2} ρ_BC ρ_B^{−(1−it)/2} ρ_D^{(1−it)/2} ρ_CD^{−(1−it)/2} ρ_ABCD^{(1−it)/2}

We thus have:

D(ρ_ABCD || e^σ) ≥ D_M( ρ_ABCD || ∫_{−∞}^{+∞} β_0(t) κ(t) dt )

So, by linearity of the trace, proving that tr(κ(t)) = 1 for each t ∈ ℝ allows us to use the non-negativity of the measured relative entropy to conclude the proof. Let t ∈ ℝ; we now compute tr(κ(t)). First, we use cyclicity of the trace to merge the two outer factors, and then trace out over the A system (every factor except ρ_ABCD acts trivially on A):

tr(κ(t)) = tr_ABCD( ρ_ABCD ρ_CD^{−(1+it)/2} ρ_D^{(1+it)/2} ρ_B^{−(1+it)/2} ρ_BC ρ_B^{−(1−it)/2} ρ_D^{(1−it)/2} ρ_CD^{−(1−it)/2} )
= tr_BCD( ρ_BCD ρ_CD^{−(1+it)/2} ρ_D^{(1+it)/2} ρ_B^{−(1+it)/2} ρ_BC ρ_B^{−(1−it)/2} ρ_D^{(1−it)/2} ρ_CD^{−(1−it)/2} )
= tr_BCD( ρ_BCD^{(1+it)/2} ρ_CD^{−(1+it)/2} ρ_D^{(1+it)/2} ρ_B^{−(1+it)/2} ρ_BC ρ_B^{−(1−it)/2} ρ_D^{(1−it)/2} ρ_CD^{−(1−it)/2} ρ_BCD^{(1−it)/2} )

using cyclicity again in the last step. We now use our identity (8):

= (1/dim A) tr_BCD( ρ_BCD^{(1+it)/2} ρ_CD^{−(1+it)/2} ρ_D^{(1+it)/2} ( tr_A( ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} ) ⊗ I_B ) ρ_D^{(1−it)/2} ρ_CD^{−(1−it)/2} ρ_BCD^{(1−it)/2} )

We can now trace out over the B system, as we did for the A system:

= (1/dim A) tr_CD( ρ_CD ρ_CD^{−(1+it)/2} ρ_D^{(1+it)/2} tr_A( ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} ) ρ_D^{(1−it)/2} ρ_CD^{−(1−it)/2} )
= (1/dim A) tr_CD( ρ_D^{(1+it)/2} tr_A( ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} ) ρ_D^{(1−it)/2} )

since, by cyclicity, ρ_CD^{−(1−it)/2} ρ_CD ρ_CD^{−(1+it)/2} = I_CD. We can now use the cyclicity of the trace and trace out over D (as tr_D(ρ_D) = 1):

= (1/dim A) tr_CD( ρ_D tr_A( ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} ) ) = (1/dim A) tr_AC( ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} )

We finally trace out over C and then over A:

tr_AC( ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} ) = tr_AC( ρ_A^{−1} ρ_AC ) = tr_A( ρ_A^{−1} ρ_A ) = tr_A( I_A ) = dim A

Eventually, we have tr(κ(t)) = 1. Thus we get:

tr( ∫_{−∞}^{+∞} β_0(t) κ(t) dt ) = ∫_{−∞}^{+∞} β_0(t) tr(κ(t)) dt = 1

where we used that β_0 is a probability density. This concludes the proof. □

We can comment on this version of the Linden-Winter inequality. First, regarding the assumptions, one can see that the third quantum Markov chain (A ↔ D ↔ B) is not used here. This seems to help loosen the constraints but, as will be discussed later, it may not be the right direction to go. Second, we added an assumption on the invertibility of the reduced density operator ρ_AB. As this matrix is diagonalizable and positive, this constraint states that every eigenvalue of ρ_AB is strictly positive, so it is not a very strong assumption.

4.3 Thoughts on the proof

This section presents different ideas we had during the last two weeks of the internship.

As already said, the assumptions of the Linden-Winter inequality are very strong: since they are equalities, we can think of them, in probabilistic terms, as a conditioning on a set of probability zero. Thus a lot of effort goes into relaxing the assumptions from equalities to inequalities; here the logical extension is, instead of three quantum Markov chains, to assume three approximate quantum Markov chains. The original proof, however, relied on a characterization of quantum Markov chains for which we do not know any analogue for approximate quantum Markov chains.
Thus, with the new method from [Sut19], applied in [LX] and shown in the previous section, there is hope of adapting the proof to these new assumptions. First, we need to see where the assumptions are used. We can see from the proof that (8) comes directly from the two quantum Markov chain assumptions, together with the invertibility of ρ_AB, and that the assumptions are used again to get (9). To get an analogue of (8), we start from the averaged rotated Petz recovery maps:

D_M( ρ_ABC || ∫_{−∞}^{+∞} β_0(t) ρ_AB^{(1+it)/2} ρ_B^{−(1+it)/2} ρ_BC ρ_B^{−(1−it)/2} ρ_AB^{(1−it)/2} dt ) ≤ ε

D_M( ρ_ABC || ∫_{−∞}^{+∞} β_0(t) ρ_AB^{(1+it)/2} ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} ρ_AB^{(1−it)/2} dt ) ≤ ε

We would like to use the triangle inequality, but it does not hold here. We have two main options. We could use Rényi relative entropies:

Definition 4.3 (Rényi relative entropies) Let α ∈ ]0, 1[ ∪ ]1, +∞[ and ρ, σ ∈ D(A):

D_α(ρ||σ) := (α/(α−1)) log ∥ σ^{(1−α)/(2α)} ρ σ^{(1−α)/(2α)} ∥_α if supp(ρ) ⊂ supp(σ) or α < 1, and +∞ otherwise.

As the limits for α tending to 1 or to infinity exist, we define:

D_1(ρ||σ) := D(ρ||σ),  D_max(ρ||σ) := log ∥ σ^{−1/2} ρ σ^{−1/2} ∥_∞

We have two interesting properties:

Property 4.4 For ρ and σ in D(A):

D_M(ρ||σ) ≤ D(ρ||σ) ≤ D_max(ρ||σ)

Property 4.5 (Triangle-like inequality) For ρ, σ and ω in D(A) and α ∈ ]1/2, ∞]:

D_α(ρ||σ) ≤ D_α(ρ||ω) + D_max(ω||σ)

With these two properties, we can see that if we were to replace the assumptions:

• I(A; C|B) ≤ ε
• I(B; C|A) ≤ ε
• I(A; B|D) ≤ ε

by:

• D_max( ρ_ABC || T_{B→BC}(ρ_AB) ) ≤ ε
• D_max( ρ_ABC || T_{A→AC}(ρ_AB) ) ≤ ε
• D_max( ρ_ABD || T_{D→BD}(ρ_AD) ) ≤ ε

then we could use the triangle-like inequality, and we finally get:

D_M( ∫_{−∞}^{+∞} β_0(t) ρ_AB^{(1+it)/2} ρ_A^{−(1+it)/2} ρ_AC ρ_A^{−(1−it)/2} ρ_AB^{(1−it)/2} dt || ∫_{−∞}^{+∞} β_0(t) ρ_AB^{(1+it)/2} ρ_B^{−(1+it)/2} ρ_BC ρ_B^{−(1−it)/2} ρ_AB^{(1−it)/2} dt ) ≤ 2ε

The problem now is to get from here to an equation similar to (8): we can no longer cancel the factors of ρ_AB even if it is invertible, and the integrals are hard to manipulate.
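As a quick sanity check of Property 4.4 (our own numerical sketch, with hypothetical helper names), one can compute D(ρ||σ) and D_max(ρ||σ) = log ∥σ^{−1/2} ρ σ^{−1/2}∥_∞ on random states and verify the ordering, together with Klein's inequality:

```python
import numpy as np

rng = np.random.default_rng(2)

def random_state(d):
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def hfun(rho, f):
    """Apply a scalar function to a positive matrix through its spectrum."""
    lam, u = np.linalg.eigh(rho)
    return (u * f(lam).astype(complex)) @ u.conj().T

def rel_entropy(rho, sigma):
    """D(rho||sigma) = tr(rho log rho) - tr(rho log sigma), natural log."""
    return float(np.trace(rho @ (hfun(rho, np.log) - hfun(sigma, np.log))).real)

def d_max(rho, sigma):
    """D_max(rho||sigma) = log || sigma^{-1/2} rho sigma^{-1/2} ||_inf."""
    s = hfun(sigma, lambda x: x ** -0.5)
    return float(np.log(np.linalg.norm(s @ rho @ s, 2)))  # 2-norm = op. norm

rho, sigma = random_state(4), random_state(4)

# Klein's inequality and the ordering D <= D_max of Property 4.4:
assert rel_entropy(rho, sigma) >= -1e-9
assert rel_entropy(rho, sigma) <= d_max(rho, sigma) + 1e-9
print("0 <= D(rho||sigma) <= D_max(rho||sigma) holds for this sample")
```

The measured relative entropy D_M, being a supremum over observables, is harder to evaluate numerically; here we only exercise the two outer quantities of the chain D_M ≤ D ≤ D_max.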
We could not find a solution to the first problem; for the second, we considered using the mean value theorem to get rid of the integrals. Unfortunately, we did not manage to push this idea further. A similar approach is to use the trace norm instead of the Rényi relative entropies, which Pinsker's inequality makes possible, but this does not help going further either.

Now, we need to see why we need equation (9). In the original proof, it helps us get rid of the $\rho_{ABC}$ term in $\kappa(t)$ after we "translated" the inequality into the non-negativity of a measured relative entropy. To show that the trace of $\kappa(t)$ equals 1, we need to trace out over the different systems, one at a time. Thus, if a system appears in only one density operator, then we can easily trace out over this system (take for example the first step, when we trace out over the $A$ system). The order of the factors matters, but notice that in the Golden-Thompson inequality we can shuffle the order on the right-hand side, since the order on the left-hand side is insignificant. However, without (9), we cannot untangle the systems, so we cannot trace out step by step over the different systems. As subtracting conditional mutual informations does not help cancelling any term, we have to add them in order to get a simpler expression. Even if these conditional mutual informations are non-zero, we have two options. The first one is to use a lower bound on the (measured) relative entropy to show that it is greater than $\varepsilon$. Such a lower bound exists: we found one in [RW15]. However, this does not look like a good solution. First, it seems hard to apply. Second, the lower bound depends on the dimension $d$, which appears in the expression: this might fail to generalize to infinite-dimensional spaces, and could even fail in high finite dimensions.
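The trace-norm route mentioned above rests on quantum Pinsker's inequality, $D(\rho\|\sigma) \ge \frac{1}{2}\|\rho-\sigma\|_1^2$ (with the natural logarithm). The sketch below is a minimal numerical check of this bound, assuming NumPy; the helper names are ours.

```python
import numpy as np

def rand_rho(d, rng):
    """Random full-rank density operator on a d-dimensional system."""
    x = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    rho = x @ x.conj().T
    return rho / np.trace(rho).real

def mlog(rho):
    w, v = np.linalg.eigh(rho)
    return (v * np.log(w)) @ v.conj().T

def relative_entropy(rho, sigma):
    """Umegaki relative entropy D(rho||sigma), natural logarithm."""
    return np.trace(rho @ (mlog(rho) - mlog(sigma))).real

def trace_norm(x):
    """Trace norm of a Hermitian matrix: sum of absolute eigenvalues."""
    return np.abs(np.linalg.eigvalsh(x)).sum()

rng = np.random.default_rng(2)
d = 4
rho, sigma = rand_rho(d, rng), rand_rho(d, rng)

# quantum Pinsker: D(rho||sigma) >= (1/2) ||rho - sigma||_1^2  (in nats)
assert relative_entropy(rho, sigma) >= 0.5 * trace_norm(rho - sigma) ** 2
```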
The other option is to modify the inequality to include error terms, as suggested in [Sut19], to get an inequality of the form:
\[
\kappa_1 I(A;C|B) + \kappa_2 I(B;C|A) + \kappa_3 I(A;B|D) + I(ABC;D) - I(AB;CD) \ge 0
\]
with $\kappa_1$, $\kappa_2$, $\kappa_3$ three constants to determine.

4.4 Symmetry

There is a symmetry between the $A$ and $B$ systems in the original theorem by Linden and Winter:

Theorem 4.6 (Linden-Winter inequality) Let $A$, $B$, $C$, $D$ be quantum systems, with the constraints:
• $A \leftrightarrow B \leftrightarrow C$ is a quantum Markov chain, i.e. $I(A;C|B) = 0$
• $B \leftrightarrow A \leftrightarrow C$ is a quantum Markov chain, i.e. $I(B;C|A) = 0$
• $A \leftrightarrow D \leftrightarrow B$ is a quantum Markov chain, i.e. $I(A;B|D) = 0$
Then we have:
\[
I(C;D) \ge I(C;AB)
\]
which can also be written as:
\[
I(ABC;D) \ge I(AB;CD)
\]

If we swap $A$ and $B$, the first assumption becomes the second and vice versa, and the third one is symmetric. We also notice that the inequality itself is symmetric, as only the joint $AB$ system is relevant. However, in the proof by Lemm and Xiang, the symmetry is broken twice: when we trace out over $A$ to get (8), and when we add $I(A;C|B)$ to get (9). Nevertheless, we can adapt the proof to keep the symmetry by slightly modifying equation (9). We sketch the proof:

Proof: In order to get an equation similar to (9), we add $\frac{1}{2}\left(I(A;C|B) + I(B;C|A)\right)$ instead of $I(A;C|B)$ and multiply by two, to obtain an inequality equivalent to the one desired:
\[
S(AC) + S(BC) - S(A) - S(B) + 2S(D) - 2S(CD) \ge 0
\]
It is sufficient to prove the two following inequalities:
\[
S(AC) - S(A) + S(D) - S(CD) \ge 0
\]
\[
S(BC) - S(B) + S(D) - S(CD) \ge 0
\]
These two inequalities replace equation (9): the second one is actually equation (9), and the first one is obtained by swapping $A$ and $B$.
Thus, we know that the second one is true, according to the previous proof, and to prove the first one it suffices to get an equation similar to (8), but with $A$ and $B$ swapped, which is possible. As for equation (8), we start with the equation:
\[
\rho_{AB}^{\frac{1+it}{2}}\rho_B^{-\frac{1+it}{2}}\rho_{BC}\,\rho_B^{-\frac{1-it}{2}}\rho_{AB}^{\frac{1-it}{2}} = \rho_{AB}^{\frac{1+it}{2}}\rho_A^{-\frac{1+it}{2}}\rho_{AC}\,\rho_A^{-\frac{1-it}{2}}\rho_{AB}^{\frac{1-it}{2}}
\]
We then use that $\rho_{AB}$ is invertible, but we trace out over $B$ this time, to get:
\[
\dim B\;\rho_A^{-\frac{1+it}{2}}\rho_{AC}\,\rho_A^{-\frac{1-it}{2}} = \operatorname{tr}_B\!\left[\rho_B^{-\frac{1+it}{2}}\rho_{BC}\,\rho_B^{-\frac{1-it}{2}}\right]
\]
Using this, the rest of the proof is identical to the one shown earlier, with $A$ and $B$ swapped. Thus we proved the two inequalities that yield the result. □

References

[G00] Grafakos. Classical Fourier Analysis. Springer, third edition, 2000.

[LW04] Linden and Winter. A new inequality for the von Neumann entropy. 2004.

[LX] Lemm and Xiang. Entropy inequalities.

[NC00] Nielsen and Chuang. Quantum Computation and Quantum Information. Cambridge University Press, 10th edition, 2000.

[RW15] Reeb and Wolf. Tight bound on relative entropy by entropy difference. 2015.

[SBT16] Sutter, Berta, and Tomamichel. Multivariate trace inequalities. 2016.

[SR18] Sutter and Renner. Necessary criterion for approximate recoverability. 2018.

[Sut18] Sutter. Approximate Quantum Markov Chains. Springer, 2018.

[Sut19] Sutter. A new approach to prove von Neumann inequalities. Talk, 2019.

[Wil19] Wilde. From Classical to Quantum Shannon Theory. Cambridge University Press, 2019.