CAMBRIDGE STUDIES IN ADVANCED MATHEMATICS 146

Editorial Board
B. BOLLOBÁS, W. FULTON, A. KATOK, F. KIRWAN, P. SARNAK, B. SIMON, B. TOTARO

UNIT EQUATIONS IN DIOPHANTINE NUMBER THEORY

Diophantine number theory is an active area that has seen tremendous growth over the past century, and in this theory unit equations play a central role. This comprehensive treatment is the first volume devoted to these equations. The authors gather together all the most important results and look at many different aspects, including effective results on unit equations over number fields, estimates on the number of solutions, analogues for function fields, and effective results for unit equations over finitely generated domains. They also present a variety of applications. Introductory chapters provide the necessary background in algebraic number theory and function field theory, as well as an account of the required tools from Diophantine approximation and transcendence theory. This makes the book suitable for young researchers as well as for experts who are looking for an up-to-date overview of the field.

Jan-Hendrik Evertse works at the Mathematical Institute of Leiden University. His research concentrates on Diophantine approximation and applications to Diophantine problems. In this area he has obtained some influential results, in particular on estimates for the numbers of solutions of Diophantine equations and inequalities. He has written more than 75 research papers and co-authored one book with Bas Edixhoven entitled Diophantine Approximation and Abelian Varieties.

Kálmán Győry is Professor Emeritus at the University of Debrecen, a member of the Hungarian Academy of Sciences and a well-known researcher in Diophantine number theory. Over his career he has obtained several significant and pioneering results, among others on unit equations, decomposable form equations, and their various applications.
His results have been published in one book and 160 research papers. Győry is also the founder and leader of the number theory research group in Debrecen, which consists of his former students and their students.

Unit Equations in Diophantine Number Theory

JAN-HENDRIK EVERTSE
Universiteit Leiden

KÁLMÁN GYŐRY
Debreceni Egyetem, Hungary

University Printing House, Cambridge CB2 8BS, United Kingdom

Cambridge University Press is part of the University of Cambridge. It furthers the University’s mission by disseminating knowledge in the pursuit of education, learning and research at the highest international levels of excellence.

www.cambridge.org
Information on this title: www.cambridge.org/9781107097605

© Jan-Hendrik Evertse and Kálmán Győry 2015

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.
First published 2015
Printed in the United Kingdom by Clays, St Ives plc
A catalogue record for this publication is available from the British Library
ISBN 978-1-107-09760-5 Hardback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

Contents

Preface
Summary

PART I PRELIMINARIES

1 Basic algebraic number theory
  1.1 Characteristic polynomial, trace, norm, discriminant
  1.2 Ideal theory for algebraic number fields
  1.3 Extension of ideals; norm of ideals
  1.4 Discriminant, class number, unit group and regulator
  1.5 Explicit estimates
  1.6 Absolute values: generalities
  1.7 Absolute values and places on number fields
  1.8 S-integers, S-units and S-norm
  1.9 Heights
    1.9.1 Heights of algebraic numbers
    1.9.2 v-adic norms and heights of vectors and polynomials
  1.10 Effective computations in number fields
  1.11 p-adic numbers
2 Algebraic function fields
  2.1 Valuations
  2.2 Heights
  2.3 Derivatives and genus
  2.4 Effective computations
3 Tools from Diophantine approximation and transcendence theory
  3.1 The Subspace Theorem and some variations
  3.2 Effective estimates for linear forms in logarithms

PART II UNIT EQUATIONS AND APPLICATIONS

4 Effective results for unit equations in two unknowns over number fields
  4.1 Effective bounds for the heights of the solutions
    4.1.1 Equations in units of a number field
    4.1.2 Equations with unknowns from a finitely generated multiplicative group
  4.2 Approximation by elements of a finitely generated multiplicative group
  4.3 Tools
    4.3.1 Some geometry of numbers
    4.3.2 Estimates for units and S-units
  4.4 Proofs
    4.4.1 Proofs of Theorems 4.1.1 and 4.1.2
    4.4.2 Proofs of Theorems 4.2.1 and 4.2.2
    4.4.3 Proofs of Theorem 4.1.3 and its corollaries
  4.5 Alternative methods, comparison of the bounds
    4.5.1 The results of Bombieri, Bombieri and Cohen, and Bugeaud
    4.5.2 The results of Murty, Pasten and von Känel
  4.6 The abc-conjecture
  4.7 Notes
    4.7.1 Historical remarks and some related results
    4.7.2 Some notes on applications
5 Algorithmic resolution of unit equations in two unknowns
  5.1 Application of Baker’s type estimates
    5.1.1 Infinite places
    5.1.2 Finite places
  5.2 Reduction of the bounds
    5.2.1 Infinite places
    5.2.2 Finite places
  5.3 Enumeration of the “small” solutions
  5.4 Examples
  5.5 Exceptional units
  5.6 Supplement: LLL lattice basis reduction
  5.7 Notes
6 Unit equations in several unknowns
  6.1 Results
    6.1.1 A semi-effective result
    6.1.2 Upper bounds for the number of solutions
    6.1.3 Lower bounds
  6.2 Proofs of Theorem 6.1.1 and Corollary 6.1.2
  6.3 A sketch of the proof of Theorem 6.1.3
    6.3.1 A reduction
    6.3.2 Notation
    6.3.3 Covering results
    6.3.4 The large solutions
    6.3.5 The small solutions, and conclusion of the proof
  6.4 Proof of Theorem 6.1.4
  6.5 Proof of Theorem 6.1.6
  6.6 Proofs of Theorems 6.1.7 and 6.1.8
  6.7 Notes
7 Analogues over function fields
  7.1 Mason’s inequality
  7.2 Proofs
  7.3 Effective results in the more unknowns case
  7.4 Results on the number of solutions
  7.5 Proof of Theorem 7.4.1
    7.5.1 Extension to the k-closure of Γ
    7.5.2 Some algebraic geometry
    7.5.3 Proof of Theorem 7.5.1
  7.6 Results in positive characteristic
8 Effective results for unit equations in two unknowns over finitely generated domains
  8.1 Statements of the results
  8.2 Effective linear algebra over polynomial rings
  8.3 A reduction
  8.4 Bounding the degree in Proposition 8.3.7
  8.5 Specializations
  8.6 Bounding the height in Proposition 8.3.7
  8.7 Proof of Theorem 8.1.3
  8.8 Notes
9 Decomposable form equations
  9.1 A finiteness criterion for decomposable form equations
  9.2 Reduction of unit equations to decomposable form equations
  9.3 Reduction of decomposable form equations to unit equations
    9.3.1 Proof of the equivalence (ii) ⇐⇒ (iii) in Theorem 9.1.1
    9.3.2 Proof of the implication (i) ⇒ (iii) in Theorem 9.1.1
    9.3.3 Proof of the implication (iii) ⇒ (i) in Theorem 9.1.1
  9.4 Finiteness of the number of families of solutions
  9.5 Upper bounds for the number of solutions
    9.5.1 Galois symmetric S-unit vectors
    9.5.2 Consequences for decomposable form equations and S-unit equations
  9.6 Effective results
    9.6.1 Thue equations
    9.6.2 Decomposable form equations in an arbitrary number of unknowns
  9.7 Notes
10 Further applications
  10.1 Prime factors of sums of integers
  10.2 Additive unit representations in finitely generated integral domains
  10.3 Orbits of polynomial and rational maps
  10.4 Polynomials dividing many k-nomials
  10.5 Irreducible polynomials and arithmetic graphs
  10.6 Discriminant equations and power integral bases in number fields
  10.7 Binary forms of given discriminant
  10.8 Resultant equations for monic polynomials
  10.9 Resultant inequalities and equations for binary forms
  10.10 Lang’s Conjecture for tori
  10.11 Linear recurrence sequences and exponential-polynomial equations
  10.12 Algebraic independence results

References
Glossary of frequently used notation
Index

Preface

Diophantine number theory (the study of Diophantine equations, Diophantine inequalities and their applications) is a very active area in number theory with a long history. This book is about unit equations, a class of Diophantine equations of central importance in Diophantine number theory, and their applications.

Unit equations are equations of the form

$$a_1 x_1 + \cdots + a_n x_n = 1$$

to be solved in elements $x_1, \ldots, x_n$ from a finitely generated multiplicative group $\Gamma$, contained in a field $K$, where $a_1, \ldots, a_n$ are non-zero elements of $K$. Such equations were studied originally in the cases where the number of unknowns $n = 2$, $K$ is a number field and $\Gamma$ is the group of units of the ring of integers of $K$, or more generally, where $\Gamma$ is the group of S-units in $K$.

Unit equations have a great variety of applications, among others to other classes of Diophantine equations, to algebraic number theory and to Diophantine geometry.

Certain results concerning unit equations and their applications covered in our book were already presented, mostly in special or weaker form, in the books of Lang (1962, 1978, 1983), Győry (1980b), Sprindžuk (1982, 1993), Evertse (1983), Mason (1984), Shorey and Tijdeman (1986), de Weger (1989), Schmidt (1991), Smart (1998), Bombieri and Gubler (2006), Baker and Wüstholz (2007) and Zannier (2009), and in the survey papers of Evertse, Győry, Stewart and Tijdeman (1988b), Győry (1992a, 1996, 2002a, 2010) and Bérczes, Evertse and Győry (2007b).

In 1988, we wrote, together with Stewart and Tijdeman, the survey Evertse, Győry, Stewart and Tijdeman (1988b) on unit equations and their applications, giving the state of the art of the subject at that time. Since then, the theory of unit equations has been greatly expanded. In the present book we have tried to give a comprehensive and up-to-date treatment of unit equations and their applications. We prove effective finiteness results for unit equations in two unknowns, describe practical algorithms to solve such equations, give explicit upper bounds for the number of solutions, discuss analogues of unit equations over function fields and over finitely generated domains, and present various applications.

The proofs of the results concerning unit equations are mostly based on the very powerful Thue–Siegel–Roth–Schmidt theory from Diophantine approximation and Baker’s theory from transcendence theory.
We note that there are other important methods and applications, some discovered very recently, that deserve a detailed discussion, but to which we could pay only little or no attention due to lack of time and space.

The present book is the first in a series of two. The second book, titled Discriminant Equations in Diophantine Number Theory, also published by Cambridge University Press, is about polynomials and binary forms of given discriminant, with applications to algebraic number theory, Diophantine approximation and Diophantine geometry. There, we will apply the results from the present book.

The contents of these two books are an outgrowth of research, done by the two authors since the 1970s. The present book is aimed at anybody (graduate students and experts) with basic knowledge of algebra (groups, commutative rings, fields, Galois theory) and elementary algebraic number theory. For convenience of the reader, in part I of the book we have provided some necessary background.

Acknowledgments

We are very grateful to Yann Bugeaud, Andrej Dujella, István Gaál, Rafael von Känel, Attila Pethő, Michael Pohst, Andrzej Schinzel and two anonymous referees for carefully reading and critically commenting on some chapters of our book, to Csaba Rakaczki for his careful typing of a considerable part of this book, and to Cambridge University Press, in particular David Tranah, Sam Harrison and Clare Dennison, for their suggestions for and assistance with the final preparation of the manuscript.

The research of the second named author was supported in part by Grants 100339 and 104208 from the Hungarian National Foundation for Scientific Research (OTKA).

Summary

We start with a brief historical overview and then outline the contents of our book.
Thue (1909) proved that if $F \in \mathbb{Z}[X, Y]$ is a binary form (i.e., a homogeneous polynomial) of degree at least 3 which is irreducible over $\mathbb{Q}$ and if $\delta$ is a non-zero integer, then the equation

$$F(x, y) = \delta \quad \text{in } x, y \in \mathbb{Z}$$

(nowadays called a Thue equation) has only finitely many solutions. To this end, Thue developed a very original Diophantine approximation method concerning the approximation of algebraic numbers by rationals, which was extended later by Siegel, Dyson, Gelfond and Roth.

Thue’s result was generalized by Siegel (1921) as follows. Let $K$ be an algebraic number field of degree $d$ with ring of integers $O_K$, let $F \in O_K[X, Y]$ be a binary form of degree $n > 4d^2 - 2d$ such that $F(1, 0) \neq 0$ and $F(X, 1)$ has no multiple zeros, and let $\delta$ be a non-zero element of $O_K$. Then the equation

$$F(x, y) = \delta \quad \text{in } x, y \in O_K$$

has only finitely many solutions. This has the following interesting consequence, which was not stated explicitly by Siegel, but which was implicitly proved by him. Denote by $O_K^*$ the group of units of $O_K$. Let $a_1, a_2$ be non-zero elements of the number field $K$. Then the equation

$$a_1 x_1 + a_2 x_2 = 1 \tag{1}$$

has only finitely many solutions in $x_1, x_2 \in O_K^*$.

To prove this, choose an integer $n > 4d^2 - 2d$. By Dirichlet’s Unit Theorem, the group $O_K^*$ is finitely generated, so the quotient of $O_K^*$ by its subgroup of $n$-th powers is finite. Thus, any solution $x_1, x_2 \in O_K^*$ of (1) can be written as $x_i = \beta_i \varepsilon_i^n$ for $i = 1, 2$ with $\beta_i, \varepsilon_i \in O_K^*$, such that $\beta_i$ may assume only finitely many values. Thus, we get a finite number of Thue equations

$$a_1 \beta_1 \varepsilon_1^n + a_2 \beta_2 \varepsilon_2^n = 1,$$

each of which has only finitely many solutions in $\varepsilon_1, \varepsilon_2$.

Mahler (1933a) proved another generalization of Thue’s theorem. Let $F \in \mathbb{Z}[X, Y]$ be a binary form of degree $n \geq 3$ such that $F(1, 0) \neq 0$ and $F(X, 1)$ has no multiple zeros, and let $p_1, \ldots, p_t$ be distinct primes. Then the equation

$$F(x, y) = \pm p_1^{z_1} \cdots p_t^{z_t}$$

(today called a Thue–Mahler equation) has only finitely many solutions in integers $x, y, z_1, \ldots, z_t$ with $\gcd(x, y) = 1$.
A consequence of this result, proved by Mahler in a slightly different formulation, is as follows. Let $a_1, a_2$ be non-zero rational numbers and let $\Gamma$ be the multiplicative group generated by $-1, p_1, \ldots, p_t$. Then (1) has only finitely many solutions in $x_1, x_2 \in \Gamma$. The argument is similar to that above. By extending the set of primes $p_1, \ldots, p_t$, we may assume that the numerators and denominators of $a_1, a_2$ are composed of primes from $p_1, \ldots, p_t$. Then, by clearing denominators, we can rewrite (1) as $u + v = w$, where $u, v, w$ are integers, composed of primes from $p_1, \ldots, p_t$, with $\gcd(u, v, w) = 1$. Choose $n \geq 3$. Then we may write $u$ as $ax^n$ and $v$ as $by^n$, where $a, b, x, y$ are integers composed of primes from $p_1, \ldots, p_t$ and $a, b$ are from a finite set independent of $x_1, x_2$. Thus, equation (1) can be reduced to a finite number of Thue–Mahler equations as above with $F = aX^n + bY^n$, which all have only finitely many solutions.

Lang (1960) considered equation (1) with unknowns $x_1, x_2$ taken from a finitely generated multiplicative group, and was the first to realize the central importance of this equation. He proved the general result that if $a_1, a_2$ are non-zero elements from an arbitrary field $K$ of characteristic 0 and $\Gamma$ is an arbitrary finitely generated multiplicative subgroup of $K^*$, then (1) has only finitely many solutions in elements $x_1, x_2 \in \Gamma$. Inspired by Siegel’s original result, equations of type (1) with unknowns from a finitely generated multiplicative group are called unit equations (in two unknowns), although the group $\Gamma$ need not be the unit group of a ring.

The proofs of all results mentioned above are based on extensions of Thue’s method, which are ineffective in the sense that they do not provide a method to determine the solutions of the equations considered above.

In the 1960s, A. Baker developed a new method in transcendence theory, giving non-trivial effective lower bounds for linear forms in logarithms of algebraic numbers.
This turned out to be a very powerful tool to prove effective finiteness results for Diophantine equations, that is, results that enable one to determine all solutions of the equation, at least in principle. With this method, and extensions thereof, it became possible to give explicit upper bounds for the heights of the solutions of Thue equations and Thue–Mahler equations, and also for the heights of the solutions of equations (1) in units of the ring of integers of a number field or, more generally, in S-units, that is, elements of the number field in whose prime ideal factorizations only prime ideals from a prescribed finite set $S$ occur. Baker (1968b) obtained explicit upper bounds for the solutions of Thue equations. His result was extended by Coates (1969) to Thue–Mahler equations. For explicit upper bounds for the heights of the solutions of unit equations and S-unit equations in two unknowns, see Győry (1972, 1973, 1974, 1979), and the many subsequent improvements discussed in Chapter 4. The bounds enabled one to determine, at least in principle, all solutions. Since the 1980s, practical algorithms have been developed, combining Baker’s theory with the Lenstra–Lenstra–Lovász (LLL) lattice basis reduction algorithm and enumeration techniques, which allow one to solve in practice concrete Thue equations, Thue–Mahler equations and (S-)unit equations, see for instance de Weger (1989), Wildanger (1997) and Smart (1998).

In the 1960s and early 1970s, Schmidt developed his higher dimensional generalization of the Thue–Siegel–Roth method, leading to his Subspace Theorem in Schmidt (1972). Schlickewei (1977b) proved an extension of the Subspace Theorem, involving both archimedean and non-archimedean absolute values. Using this so-called p-adic Subspace Theorem, several authors obtained finiteness results for the number of solutions of unit equations in an arbitrary number of unknowns, i.e., for linear equations

$$a_1 x_1 + \cdots + a_n x_n = 1 \quad \text{in } x_1, \ldots, x_n \in \Gamma, \tag{2}$$

where $\Gamma$ is a finitely generated multiplicative group in a field $K$ of characteristic 0 and $a_1, \ldots, a_n$ are non-zero elements of $K$, see Dubois and Rhin (1976), Schlickewei (1977a), Evertse (1984b), Evertse and Győry (1988b) and van der Poorten and Schlickewei (1982, 1991). We mention that the p-adic Subspace Theorem is ineffective, and so its consequences for equation (2) are ineffective. It is still open to solve unit equations of the form (2) in more than two unknowns effectively.

In Part I of the book, consisting of the first three chapters, we have collected some basic tools. Chapter 1 gives a collection of the results from elementary algebraic number theory that we need throughout the book. In Chapter 2 we recall some basic facts about algebraic function fields. These are used in Chapters 7 and 8. In Chapter 3 we have stated without proof some fundamental results from Diophantine approximation and transcendence theory. We have included some versions of the Subspace Theorem, due to Schmidt, Schlickewei and Evertse, and estimates of Matveev (2000) and Yu (2007) concerning linear forms in logarithms, which are used in Chapters 4, 5 and 6.

Part II, consisting of the other chapters, is the main body of our book. Chapter 4 provides a survey of effective results concerning unit equations in two unknowns over number fields. We derive among others the best effective upper bounds to date, established in Győry and Yu (2006), for the solutions of equation (1) in S-units of a number field. For applications, we give the bounds in completely explicit form. The main tools in the proofs are the results on linear forms in logarithms mentioned above.

In Chapter 5 we address the problem of practically solving concrete equations of the form (1) in units and S-units. Here, we combine estimates for linear forms in logarithms as mentioned in Chapter 3 with the LLL lattice basis reduction algorithm and an enumeration process.
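The enumeration step can be illustrated on a toy case. The following Python sketch is an illustration only and is not the algorithm of Chapter 5: it takes $K = \mathbb{Q}$ and the group generated by $-1, 2, 3$, and searches for solutions of $x_1 + x_2 = 1$ in which the exponents of $x_1$ are bounded by an arbitrarily chosen `bound`. The function names `is_s_unit` and `small_solutions` and the choice of bound are assumptions of this sketch; in a real computation the bound would have to come from estimates for linear forms in logarithms, followed by a reduction step.

```python
from fractions import Fraction
from itertools import product


def is_s_unit(q, primes):
    """True if the rational q is +-(a product of integer powers of the given primes)."""
    if q == 0:
        return False
    for part in (abs(q.numerator), q.denominator):
        for p in primes:
            while part % p == 0:
                part //= p
        if part != 1:
            return False
    return True


def small_solutions(primes, bound):
    """All (x, y) with x + y = 1, x = +-prod p^e with |e| <= bound, and y an S-unit."""
    found = set()
    for sign in (1, -1):
        for exps in product(range(-bound, bound + 1), repeat=len(primes)):
            x = Fraction(sign)
            for p, e in zip(primes, exps):
                x *= Fraction(p) ** e
            y = 1 - x
            if is_s_unit(y, primes):
                found.add((x, y))
    return sorted(found)


# Hypothetical usage: the primes 2, 3 and the exponent bound 4 are illustrative choices.
solutions = small_solutions([2, 3], 4)
print(len(solutions))
```

With the primes 2, 3 and the bound 4, the search finds 21 pairs, among them $(9/8, -1/8)$, which comes from $1 + 8 = 9$. A bounded search of this kind says nothing by itself about solutions outside the search box; that is exactly the gap the effective height bounds are meant to close.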
In Chapter 6, we give an overview of the ineffective theory of unit equations in several unknowns. Among other things, we sketch a proof of the theorem of Evertse, Schlickewei and Schmidt (2002), giving an explicit upper bound for the number of those solutions of (2) for which the left side in (2) has no vanishing subsum. The bound depends only on the number $n$ of unknowns and the rank of $\Gamma$. We also include a proof of the theorem of Beukers and Schlickewei (1996) which gives a similar, but sharper, result for equations in two unknowns. Further, we discuss some results giving lower bounds for the number of solutions of unit equations.

In Chapter 7, we deal with analogues over function fields of characteristic 0 of some of the effective and ineffective results discussed in Chapters 4 and 6. In particular, we present the Stothers–Mason abc-theorem due to Stothers (1981) and Mason (1984) for algebraic functions, and a result of Evertse and Zannier (2008) on the number of solutions of unit equations in two unknowns over function fields, analogous to the result of Beukers and Schlickewei mentioned above. Further, we give a brief overview of recent results on unit equations over function fields of positive characteristic.

In Chapter 8, the effective results of Chapters 4 and 7 on S-unit equations in two unknowns over number fields and over function fields are combined with an effective specialization argument to prove a general effective finiteness theorem, due to Evertse and Győry (2013), on the solutions of equation (1) in units $x_1, x_2$ of an arbitrary, effectively given finitely generated integral domain $A$ over $\mathbb{Z}$.

Chapter 9 deals with applications of unit equations to decomposable form equations, which are higher dimensional generalizations of Thue and Thue–Mahler equations.
It is proved that unit equations in an arbitrary number of unknowns are in a certain sense equivalent to decomposable form equations, and in particular unit equations in two unknowns are equivalent to Thue equations. Further, a complete description of the set of solutions of decomposable form equations is presented. We give explicit upper bounds for the number of solutions when this number is finite. The bounds do not depend on the coefficients of the decomposable forms involved. We also discuss effective results for some important classes of decomposable form equations, including Thue equations, discriminant form equations, and certain norm form equations. The presented results have many applications, especially to algebraic number theory.

The results on unit equations have many further applications to other Diophantine problems. In Chapter 10 we have made a small selection. We give among other things applications to prime factors of sums of integers, additive unit representations in integral domains, dynamics of polynomial maps, arithmetic graphs, irreducible polynomials, equations and inequalities involving discriminants and resultants, power integral bases in number fields, Diophantine geometry, exponential-polynomial equations, and transcendence theory. As was mentioned in the Preface, a number of applications of the results of the present book are given in our second book Discriminant Equations in Diophantine Number Theory.

At the end of several chapters there are Notes in which some historical remarks are made and further related results, generalizations and applications are mentioned.

1 Basic algebraic number theory

We have collected some basic facts about algebraic number fields (finite field extensions of $\mathbb{Q}$), p-adic numbers, and related topics. For further details and proofs, we refer to Lang (1970), chapters I–V, Neukirch (1992), Kapitel I–III and Koblitz (1984).
In the present book, a ring is by default a commutative ring with unit element, and an integral domain is a commutative ring with unit element and without divisors of 0. Given a ring $A$, we denote by $A^+$ its underlying additive group, and by $A^*$ its unit group (multiplicative group of invertible elements). The ring of integers of an algebraic number field $K$, that is the integral closure of $\mathbb{Z}$ in $K$, is denoted by $O_K$.

1.1 Characteristic polynomial, trace, norm, discriminant

For the moment, let $K$ be any field of characteristic 0. Choose an algebraic closure $\overline{K}$ of $K$. For every $\alpha \in \overline{K}$, there is a unique, monic, irreducible polynomial $f_\alpha \in K[X]$ such that $f_\alpha(\alpha) = 0$, and $f_\alpha$ divides $g$ for every polynomial $g \in K[X]$ with $g(\alpha) = 0$. We call $f_\alpha$ the monic minimal polynomial of $\alpha$.

Let $f \in K[X]$ be a non-zero polynomial. Then $f = a(X - \alpha_1) \cdots (X - \alpha_r)$ with $a \in K^*$, $\alpha_1, \ldots, \alpha_r \in \overline{K}$. We call $K(\alpha_1, \ldots, \alpha_r)$ the splitting field of $f$ over $K$.

Let $L$ be a finite extension of $K$ of degree $n$. Then there are precisely $n$ distinct $K$-isomorphic embeddings $L \to \overline{K}$, $\sigma_1, \ldots, \sigma_n$, say. The compositum of the fields $\sigma_1(L), \ldots, \sigma_n(L)$ is called the normal closure of $L$ over $K$.

We define the characteristic polynomial of $\alpha \in L$ with respect to $L/K$ by

$$\chi_{L/K,\alpha} := \prod_{i=1}^{n} (X - \sigma_i(\alpha)).$$

In fact, we have $\chi_{L/K,\alpha} = f_\alpha^{[L:K(\alpha)]}$ and $\chi_{L/K,\alpha}$ is the characteristic polynomial of the $K$-linear map $x \mapsto \alpha x$ from $L$ to $L$. So $\chi_{L/K,\alpha} \in K[X]$.

Downloaded from https:/www.cambridge.org/core. Cornell University Library, on 16 Jun 2017 at 04:27:23, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.003

We define the trace and norm of $\alpha \in L$ over $K$ by

$$\mathrm{Tr}_{L/K}(\alpha) := \sum_{i=1}^{n} \sigma_i(\alpha), \qquad N_{L/K}(\alpha) := \prod_{i=1}^{n} \sigma_i(\alpha).$$

Up to sign, these are the coefficients of $X^{n-1}$ and $X^0$ of $\chi_{L/K,\alpha}$. So we have

$$\mathrm{Tr}_{L/K}(\alpha),\ N_{L/K}(\alpha) \in K \quad \text{for } \alpha \in L. \tag{1.1.1}$$

Notice that $\mathrm{Tr}_{L/K}$ is a $K$-linear map $L \to K$ and that $N_{L/K}$ is a multiplicative map $L \to K$.
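The description of $\chi_{L/K,\alpha}$ as the characteristic polynomial of $x \mapsto \alpha x$ gives a direct way to compute traces and norms. The following Python sketch, which is not part of the book, does this under simplifying assumptions: $K = \mathbb{Q}$ and $L = \mathbb{Q}(\theta)$ is represented as $\mathbb{Q}[X]/(f)$ for a monic irreducible $f$ given by its coefficient list. The helper names `mult_matrix`, `trace` and `det` and the list-based representation are choices made only for this illustration.

```python
from fractions import Fraction


def mult_matrix(alpha, f):
    """Matrix of x -> alpha*x on Q[X]/(f) in the basis 1, t, ..., t^(n-1).

    f     -- [c0, ..., c_{n-1}], meaning f = X^n + c_{n-1} X^(n-1) + ... + c0
    alpha -- [a0, ..., a_{n-1}], meaning alpha = a0 + a1 t + ... + a_{n-1} t^(n-1)
    """
    n = len(f)
    v = [Fraction(a) for a in alpha]
    cols = []
    for _ in range(n):
        cols.append(v)                    # coordinates of alpha * t^j
        top = v[-1]
        shifted = [Fraction(0)] + v[:-1]  # multiply by t ...
        v = [s - top * Fraction(c) for s, c in zip(shifted, f)]  # ... and reduce t^n
    return [[cols[j][i] for j in range(n)] for i in range(n)]


def trace(m):
    """Sum of the diagonal entries."""
    return sum(m[i][i] for i in range(len(m)))


def det(m):
    """Determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in m]
    n = len(m)
    d = Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if m[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            m[c], m[piv] = m[piv], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            m[r] = [a - factor * b for a, b in zip(m[r], m[c])]
    return d


# alpha = 1 + sqrt(2) in Q(sqrt(2)), with f = X^2 - 2
m = mult_matrix([1, 1], [-2, 0])
print(trace(m), det(m))
```

For $\alpha = 1 + \sqrt{2}$ the two embeddings send $\alpha$ to $1 \pm \sqrt{2}$, so the trace is $2$ and the norm is $-1$, which is what the matrix computation returns.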
Further, the trace and norm are transitive in towers: let $M \supset L \supset K$ be a tower of finite extension fields; then

$$\mathrm{Tr}_{M/K}(\alpha) = \mathrm{Tr}_{L/K}(\mathrm{Tr}_{M/L}(\alpha)), \qquad N_{M/K}(\alpha) = N_{L/K}(N_{M/L}(\alpha)) \quad \text{for } \alpha \in M.$$

Let again $L$ be a finite extension of $K$ of degree $n$. Take a $K$-basis $\{\omega_1, \ldots, \omega_n\}$ of $L$. Then the discriminant of this basis is given by

$$D_{L/K}(\omega_1, \ldots, \omega_n) := \det\big(\mathrm{Tr}_{L/K}(\omega_i \omega_j)\big)_{i,j=1,\ldots,n}.$$

By (1.1.1) we have $D_{L/K}(\omega_1, \ldots, \omega_n) \in K$. The discriminant can be expressed otherwise as

$$D_{L/K}(\omega_1, \ldots, \omega_n) = \Big(\det\big(\sigma_i(\omega_j)\big)_{i,j=1,\ldots,n}\Big)^2,$$

where $\sigma_1, \ldots, \sigma_n$ are the $K$-isomorphic embeddings of $L$ in $\overline{K}$. For instance, if $L = K(\theta)$, then $\{1, \theta, \ldots, \theta^{n-1}\}$ is a $K$-basis of $L$ and by Vandermonde’s identity,

$$D_{L/K}(1, \theta, \ldots, \theta^{n-1}) = \prod_{1 \le i < j \le n} \big(\sigma_i(\theta) - \sigma_j(\theta)\big)^2 \neq 0. \tag{1.1.2}$$

Let $\{\theta_1, \ldots, \theta_n\}$, $\{\omega_1, \ldots, \omega_n\}$ be any two $K$-bases of $L$. Then

$$\omega_i = \sum_{j=1}^{n} a_{ij} \theta_j \quad \text{for } i = 1, \ldots, n$$

with $a_{ij} \in K$ and $\det(a_{ij}) \neq 0$. By a straightforward computation we have

$$D_{L/K}(\omega_1, \ldots, \omega_n) = \big(\det(a_{ij})_{i,j=1,\ldots,n}\big)^2 \, D_{L/K}(\theta_1, \ldots, \theta_n). \tag{1.1.3}$$

By applying this relation with $\{1, \theta, \ldots, \theta^{n-1}\}$ for $\{\theta_1, \ldots, \theta_n\}$ (such a $\theta$ with $L = K(\theta)$ exists by the primitive element theorem, since $K$ has characteristic 0), and using (1.1.2), we deduce that if $\{\omega_1, \ldots, \omega_n\}$ is any $K$-basis of $L$, then

$$D_{L/K}(\omega_1, \ldots, \omega_n) \neq 0.$$

We give an application to linear algebra. Let again $K$ be a field of characteristic 0, and let $G$ be a Galois extension of $K$. For a vector $\mathbf{x} = (x_1, \ldots, x_g) \in G^g$ and for $\sigma$ in the Galois group $\mathrm{Gal}(G/K)$ of $G$ over $K$, we define $\sigma(\mathbf{x}) := (\sigma(x_1), \ldots, \sigma(x_g))$.

Lemma 1.1.1 Let $g \geq 1$, and let $V$ be a $G$-linear subspace of $G^g$ such that $\sigma(\mathbf{x}) \in V$ for $\mathbf{x} \in V$, $\sigma \in \mathrm{Gal}(G/K)$. Then $V$ has a basis consisting of vectors from $K^g$.

Proof. Pick a non-zero vector $\mathbf{b} \in V$.
Let $L \subseteq G$ be the smallest Galois extension of $K$ containing the coordinates of $\mathbf{b}$ and choose a $K$-basis $\{\omega_1, \ldots, \omega_n\}$ of $L$. Let $\mathrm{Gal}(L/K) = \{\sigma_1, \ldots, \sigma_n\}$. We have $\mathbf{b} = \sum_{j=1}^{n} \omega_j \mathbf{y}_j$ with $\mathbf{y}_j \in K^g$ for $j = 1, \ldots, n$. Then also $\sigma_i(\mathbf{b}) = \sum_{j=1}^{n} \sigma_i(\omega_j) \mathbf{y}_j$ for $i = 1, \ldots, n$. The matrix $(\sigma_i(\omega_j))_{i,j=1,\ldots,n}$ is invertible (the square of its determinant being the discriminant of $\omega_1, \ldots, \omega_n$), hence $\mathbf{y}_1, \ldots, \mathbf{y}_n$ are $L$-linear combinations of $\sigma_i(\mathbf{b})$ ($i = 1, \ldots, n$). Each $\sigma_i$ extends to an element of $\mathrm{Gal}(G/K)$, so our assumption on $V$ implies that $\mathbf{y}_1, \ldots, \mathbf{y}_n \in V$. It follows that $V$ is generated by vectors from $K^g$, hence it has a basis from $K^g$.

Now let $K$ be an algebraic number field and $L$ a finite extension of $K$. Then for $\alpha \in L$ we have

$$\alpha \in O_L \iff \chi_{L/K,\alpha} \in O_K[X].$$

As a consequence, $\mathrm{Tr}_{L/K}(\alpha), N_{L/K}(\alpha) \in O_K$ for $\alpha \in O_L$, and $D_{L/K}(\omega_1, \ldots, \omega_n) \in O_K$ for every $K$-basis $\{\omega_1, \ldots, \omega_n\}$ of $L$ with $\omega_1, \ldots, \omega_n \in O_L$.

1.2 Ideal theory for algebraic number fields

We start with some general notation. Let $A$ be an integral domain with quotient field $K$. For $\alpha \in K$ and a subset $\mathcal{F}$ of $K$, we define $\alpha\mathcal{F} := \{\alpha x : x \in \mathcal{F}\}$.

A fractional ideal of $A$ is a subset $\mathfrak{a}$ of $K$ such that $\mathfrak{a} \neq \{0\}$ and there is $\alpha \in A \setminus \{0\}$ such that $\alpha\mathfrak{a}$ is an ideal of $A$. In particular, for $\alpha \in K^*$, the set $\alpha A$ is a fractional ideal, which we denote by $(\alpha)$ when it is clear from the context what the underlying domain $A$ is. More generally, given a subset $\mathcal{S} \neq \{0\}$ of $K$ such that there is $\alpha \in A \setminus \{0\}$ with $\alpha\mathcal{S} \subset A$, the set of all finite $A$-linear combinations with elements from $\mathcal{S}$ is a fractional ideal of $A$, denoted by $\mathcal{S}A$, called the fractional ideal generated by $\mathcal{S}$.

Let $K$ be an algebraic number field.
Recall that its ring of integers $O_K$ is a Dedekind domain, that is, $O_K$ is integrally closed, every ideal of $O_K$ is finitely generated, and every non-zero prime ideal of $O_K$ is a maximal ideal (see Lang (1970), chapter 1, sections 2, 3). Henceforth, when we are dealing with prime ideals of $O_K$, we always exclude $(0)$.

Let $\mathfrak{a}, \mathfrak{b}$ be two fractional ideals of $O_K$. We define their greatest common divisor or sum, lowest common multiple and product by
\[
\begin{aligned}
\gcd(\mathfrak{a}, \mathfrak{b}) = \mathfrak{a} + \mathfrak{b} &:= \{\alpha + \beta : \alpha \in \mathfrak{a},\ \beta \in \mathfrak{b}\},\\
\mathrm{lcm}(\mathfrak{a}, \mathfrak{b}) &:= \mathfrak{a} \cap \mathfrak{b},\\
\mathfrak{a}\mathfrak{b} &:= O_K\text{-module generated by all products } \alpha\beta \text{ with } \alpha \in \mathfrak{a} \text{ and } \beta \in \mathfrak{b},
\end{aligned}
\]
respectively. Further, the inverse of a fractional ideal $\mathfrak{a}$ of $O_K$ is defined by
\[
\mathfrak{a}^{-1} := \{\alpha \in K : \alpha\mathfrak{a} \subseteq O_K\}.
\]
The gcd, lcm and product of two fractional ideals of $O_K$, and the inverse of a fractional ideal of $O_K$, are again fractional ideals of $O_K$. We denote by $\mathcal{P}(O_K)$ the collection of non-zero prime ideals of $O_K$. The following result comprises the ideal theory for $O_K$.

Theorem 1.2.1 (i) The fractional ideals of $O_K$ form an abelian group with product and inverse as defined above, and with unit element $O_K = (1)$.
(ii) Every fractional ideal $\mathfrak{a}$ of $O_K$ can be decomposed uniquely as a product of powers of prime ideals
\[
\mathfrak{a} = \prod_{\mathfrak{p} \in \mathcal{P}(O_K)} \mathfrak{p}^{\mathrm{ord}_{\mathfrak{p}}(\mathfrak{a})},
\]
where the exponents $\mathrm{ord}_{\mathfrak{p}}(\mathfrak{a})$ are rational integers, at most finitely many of which are non-zero.
(iii) A fractional ideal $\mathfrak{a}$ of $O_K$ is contained in $O_K$ if and only if $\mathrm{ord}_{\mathfrak{p}}(\mathfrak{a}) \ge 0$ for every $\mathfrak{p} \in \mathcal{P}(O_K)$.

Proof. See Lang (1970), chapter 1, section 6.

The group of fractional ideals of $O_K$ is denoted by $I(O_K)$. The following consequences are obvious.

Corollary 1.2.2 Let $\mathfrak{a}, \mathfrak{b}$ be two fractional ideals of $O_K$. Then
\[
\mathfrak{a} \subseteq \mathfrak{b} \iff \mathrm{ord}_{\mathfrak{p}}(\mathfrak{a}) \ge \mathrm{ord}_{\mathfrak{p}}(\mathfrak{b}) \quad \text{for every } \mathfrak{p} \in \mathcal{P}(O_K).
\]
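Further, we have for every $\mathfrak{p} \in \mathcal{P}(O_K)$,
\[
\begin{aligned}
\mathrm{ord}_{\mathfrak{p}}(\mathfrak{a} \cdot \mathfrak{b}) &= \mathrm{ord}_{\mathfrak{p}}(\mathfrak{a}) + \mathrm{ord}_{\mathfrak{p}}(\mathfrak{b}),\\
\mathrm{ord}_{\mathfrak{p}}(\mathfrak{a} + \mathfrak{b}) &= \min(\mathrm{ord}_{\mathfrak{p}}(\mathfrak{a}), \mathrm{ord}_{\mathfrak{p}}(\mathfrak{b})),\\
\mathrm{ord}_{\mathfrak{p}}(\mathfrak{a} \cap \mathfrak{b}) &= \max(\mathrm{ord}_{\mathfrak{p}}(\mathfrak{a}), \mathrm{ord}_{\mathfrak{p}}(\mathfrak{b})).
\end{aligned}
\]

For $\mathfrak{p} \in \mathcal{P}(O_K)$ we define
\[
\mathrm{ord}_{\mathfrak{p}}(x) := \mathrm{ord}_{\mathfrak{p}}((x)) \ \text{ if } x \in K^*, \qquad \mathrm{ord}_{\mathfrak{p}}(0) := \infty.
\]
Corollary 1.2.2 implies that for every $\mathfrak{p} \in \mathcal{P}(O_K)$, $\mathrm{ord}_{\mathfrak{p}}$ defines a discrete valuation on $K$, i.e., $\mathrm{ord}_{\mathfrak{p}}$ is a surjective map from $K$ to $\mathbb{Z} \cup \{\infty\}$ such that for $x, y \in K$ we have
\[
\mathrm{ord}_{\mathfrak{p}}(xy) = \mathrm{ord}_{\mathfrak{p}}(x) + \mathrm{ord}_{\mathfrak{p}}(y); \quad \mathrm{ord}_{\mathfrak{p}}(x + y) \ge \min(\mathrm{ord}_{\mathfrak{p}}(x), \mathrm{ord}_{\mathfrak{p}}(y)); \quad \mathrm{ord}_{\mathfrak{p}}(x) = \infty \iff x = 0.
\]
The next corollary, whose proof is straightforward, gives some other consequences.

Corollary 1.2.3 (i) Let $\mathfrak{a}$ be a fractional ideal of $O_K$. Then
\[
x \in \mathfrak{a} \iff \mathrm{ord}_{\mathfrak{p}}(x) \ge \mathrm{ord}_{\mathfrak{p}}(\mathfrak{a}) \quad \text{for all } \mathfrak{p} \in \mathcal{P}(O_K).
\]
In particular, $x \in O_K \iff \mathrm{ord}_{\mathfrak{p}}(x) \ge 0$ for all $\mathfrak{p} \in \mathcal{P}(O_K)$.
(ii) Let $\mathfrak{a}$ be the fractional ideal of $O_K$ generated by a set $S$. Then
\[
\mathrm{ord}_{\mathfrak{p}}(\mathfrak{a}) = \min\{\mathrm{ord}_{\mathfrak{p}}(\alpha) : \alpha \in S\} \quad \text{for } \mathfrak{p} \in \mathcal{P}(O_K).
\]

1.3 Extension of ideals; norm of ideals

Let $K$ be an algebraic number field and $L$ a finite extension of $K$ of degree $n$. Every fractional ideal $\mathfrak{a}$ of $O_K$ can be extended to a fractional ideal of $O_L$,
\[
\mathfrak{a}O_L := \{\alpha y : \alpha \in \mathfrak{a},\ y \in O_L\},
\]
and the map $\mathfrak{a} \mapsto \mathfrak{a}O_L$ gives an injective group homomorphism from the group of fractional ideals of $O_K$ to the group of fractional ideals of $O_L$. The extension of a prime ideal $\mathfrak{p}$ of $O_K$ can be decomposed in a unique way as a product of powers of prime ideals of $O_L$, that is,
\[
\mathfrak{p}O_L = \prod_{i=1}^{g} \mathfrak{P}_i^{e_i},
\]
where $\mathfrak{P}_1, \ldots, \mathfrak{P}_g$ are distinct prime ideals of $O_L$ and $e_1, \ldots, e_g$ are positive integers. We call $\mathfrak{P}_1, \ldots, \mathfrak{P}_g$ the prime ideals of $O_L$ lying above $\mathfrak{p}$. The exponent $e_i$, henceforth denoted by $e(\mathfrak{P}_i|\mathfrak{p})$, is called the ramification index of $\mathfrak{P}_i$ over $\mathfrak{p}$.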
The residue class ring $O_L/\mathfrak{P}_i$ is a finite field extension of $O_K/\mathfrak{p}$. The degree $[O_L/\mathfrak{P}_i : O_K/\mathfrak{p}]$ of this extension, called the residue class degree of $\mathfrak{P}_i$ over $\mathfrak{p}$, is denoted by $f(\mathfrak{P}_i|\mathfrak{p})$. The next proposition gives some properties of ramification indices and residue class degrees.

Proposition 1.3.1 Let $L$, $\mathfrak{p}$, $\mathfrak{P}_1, \ldots, \mathfrak{P}_g$ be as above.
(i) We have $\sum_{i=1}^{g} e(\mathfrak{P}_i|\mathfrak{p}) f(\mathfrak{P}_i|\mathfrak{p}) = [L : K]$.
(ii) Assume that $L/K$ is a Galois extension. Then for any two prime ideals $\mathfrak{P}_i, \mathfrak{P}_j \in \{\mathfrak{P}_1, \ldots, \mathfrak{P}_g\}$ there is $\sigma \in \mathrm{Gal}(L/K)$ such that $\mathfrak{P}_j = \sigma\mathfrak{P}_i$. Further, $e(\mathfrak{P}_1|\mathfrak{p}) = \cdots = e(\mathfrak{P}_g|\mathfrak{p})$ and $f(\mathfrak{P}_1|\mathfrak{p}) = \cdots = f(\mathfrak{P}_g|\mathfrak{p})$.

Proof. See Lang (1970), chapter 1, section 7, proposition 21, corollary 2.

Proposition 1.3.2 (transitivity in towers) Let $M \supset L \supset K$ be a tower of finite field extensions. Further, let $\mathfrak{P}$ be a prime ideal of $O_L$ in the prime ideal factorization of $\mathfrak{p}O_L$ and $\mathfrak{Q}$ a prime ideal in the prime ideal factorization of $\mathfrak{P}O_M$. Then
\[
e(\mathfrak{Q}|\mathfrak{p}) = e(\mathfrak{Q}|\mathfrak{P}) \cdot e(\mathfrak{P}|\mathfrak{p}), \qquad f(\mathfrak{Q}|\mathfrak{p}) = f(\mathfrak{Q}|\mathfrak{P}) \cdot f(\mathfrak{P}|\mathfrak{p}).
\]

Proof. See Lang (1970), chapter 1, section 7, proposition 20.

Let again $K$ be an algebraic number field and $L$ a finite extension of $K$. We define the norm over $K$ of a prime ideal $\mathfrak{P}$ of $O_L$ by $N_{L/K}(\mathfrak{P}) := \mathfrak{p}^{f(\mathfrak{P}|\mathfrak{p})}$, where $\mathfrak{p}$ is the prime ideal of $O_K$ such that $\mathfrak{P}$ occurs in the prime ideal factorization of $\mathfrak{p}O_L$. Then the norm $N_{L/K}(\mathfrak{A})$ of an arbitrary fractional ideal $\mathfrak{A}$ of $O_L$ is defined by multiplicativity, i.e.,
\[
N_{L/K}(\mathfrak{A}) := \prod_{\mathfrak{p} \in \mathcal{P}(O_K)} \mathfrak{p}^{\sum_{\mathfrak{P}|\mathfrak{p}} f(\mathfrak{P}|\mathfrak{p}) \cdot \mathrm{ord}_{\mathfrak{P}}(\mathfrak{A})}, \tag{1.3.1}
\]
where the sum is over all prime ideals of $O_L$ lying above $\mathfrak{p}$. Thus, $N_{L/K}$ defines a homomorphism from the group of fractional ideals of $O_L$ to the group of fractional ideals of $O_K$.

Below, we give some properties of the norm.
Proposition 1.3.3 Let $L$ be a finite extension of $K$.
(i) Let $\mathfrak{A}$ be a fractional ideal of $O_L$. Then $N_{L/K}(\mathfrak{A})$ is equal to the fractional ideal generated by the numbers $N_{L/K}(\alpha)$, $\alpha \in \mathfrak{A}$.
(ii) For every $\alpha \in L^*$ we have $N_{L/K}(\alpha O_L) = N_{L/K}(\alpha) O_K$.
(iii) Let $\mathfrak{p}$ be a prime ideal of $O_K$, and $\mathfrak{P}_1, \ldots, \mathfrak{P}_g$ the prime ideals of $O_L$ dividing $\mathfrak{p}$. Then for every $\alpha \in O_L$,
\[
\mathrm{ord}_{\mathfrak{p}}(N_{L/K}(\alpha)) = \sum_{i=1}^{g} f(\mathfrak{P}_i|\mathfrak{p})\, \mathrm{ord}_{\mathfrak{P}_i}(\alpha).
\]
(iv) For every fractional ideal $\mathfrak{a}$ of $O_K$ we have $N_{L/K}(\mathfrak{a}O_L) = \mathfrak{a}^{[L:K]}$.
(v) Let $M$ be a finite extension of $L$. Then for every fractional ideal $\mathfrak{C}$ of $O_M$, $N_{M/K}(\mathfrak{C}) = N_{L/K}(N_{M/L}(\mathfrak{C}))$.

Proof. For (i), (iv), (v) see Neukirch (1992), Kapitel III, Satz 1.6. Part (ii) is a consequence of (i), and part (iii) a consequence of (ii) and (1.3.1).

Let $K$ be an algebraic number field. The norm $N_{K/\mathbb{Q}}(\mathfrak{a})$ of a fractional ideal $\mathfrak{a}$ of $O_K$ is a fractional ideal of $\mathbb{Z}$. Hence there is a positive rational number $a$ such that $N_{K/\mathbb{Q}}(\mathfrak{a}) = (a)$. This number $a$ is called the absolute norm of $\mathfrak{a}$, notation $N_K(\mathfrak{a})$ (often written as $N(\mathfrak{a})$ if it is clear from the context which is the underlying number field). It is obvious that the absolute norm is multiplicative. From parts (ii) and (iv) of Proposition 1.3.3, we obtain at once:
\[
N_K((\alpha)) = |N_{K/\mathbb{Q}}(\alpha)| \ \text{ for } \alpha \in K^*, \qquad N_K((a)) = |a|^{[K:\mathbb{Q}]} \ \text{ for } a \in \mathbb{Q}^*.
\]
If $\mathfrak{p}$ is a prime ideal of $O_K$ dividing a prime number $p$, we have $N_K(\mathfrak{p}) = p^{f(\mathfrak{p}|p)} = |O_K/\mathfrak{p}|$. More generally, for any non-zero ideal $\mathfrak{a}$ of $O_K$ we have $N_K(\mathfrak{a}) = |O_K/\mathfrak{a}|$.

1.4 Discriminant, class number, unit group and regulator

Let $K$ be an algebraic number field of degree $d$ over $\mathbb{Q}$. There are $d$ distinct isomorphic embeddings of $K$ in $\mathbb{C}$, which we denote by $\sigma_1, \ldots, \sigma_d$; further, we will write $\alpha^{(i)} := \sigma_i(\alpha)$ for $\alpha \in K$. We assume that among these embeddings there are precisely $r_1$ real embeddings, i.e., embeddings $\sigma$ with $\sigma(K) \subset \mathbb{R}$, and $r_2$ pairs of complex conjugate embeddings, i.e., pairs $\{\sigma, \overline{\sigma}\}$ where $\overline{\sigma}(\alpha) = \overline{\sigma(\alpha)}$ for $\alpha \in K$. Thus, $d = r_1 + 2r_2$ and after reordering the embeddings we
may assume that $\sigma_i$ ($i = 1, \ldots, r_1$) are the real embeddings and $\{\sigma_i, \sigma_{i+r_2}\}$ ($i = r_1 + 1, \ldots, r_1 + r_2$) the pairs of complex conjugate embeddings.

Viewed as a $\mathbb{Z}$-module, $O_K$ is free of rank $d$. Taking any $\mathbb{Z}$-basis $\{\omega_1, \ldots, \omega_d\}$ of $O_K$, we define the discriminant of $K$ by
\[
D_K := D_{K/\mathbb{Q}}(\omega_1, \ldots, \omega_d) = \Bigl(\det\bigl(\omega_j^{(i)}\bigr)_{i,j=1,\ldots,d}\Bigr)^2.
\]
This is a non-zero rational integer which, by (1.1.3), is independent of the choice of the basis.

Denote as before by $I(O_K)$ the group of fractional ideals of $O_K$. Further, denote by $P(O_K)$ the subgroup of principal fractional ideals of $O_K$. The quotient group $\mathrm{Cl}(O_K) = I(O_K)/P(O_K)$ is called the class group of $K$.

Theorem 1.4.1 The class group $\mathrm{Cl}(O_K)$ of $O_K$ is finite.

Proof. See Neukirch (1992), Kapitel I, Satz 6.3.

The cardinality of the class group is called the class number of $K$, and we denote this by $h_K$.

We denote by $W_K$ the multiplicative group consisting of all roots of unity in $K$. This is a finite, cyclic subgroup of $K^*$. We denote the number of roots of unity of $K$ by $\omega_K$.

We recall the following fundamental theorem of Dirichlet concerning the unit group $O_K^*$ of $O_K$. Elements of $O_K^*$ will usually be referred to as units of $K$. Recall that if $V$ is an $n$-dimensional vector space over $\mathbb{R}$, then a full lattice in $V$ is an additive subgroup $\{z_1\mathbf{a}_1 + \cdots + z_n\mathbf{a}_n : z_1, \ldots, z_n \in \mathbb{Z}\}$, where $\{\mathbf{a}_1, \ldots, \mathbf{a}_n\}$ is a basis of $V$.

Theorem 1.4.2 The map
\[
\mathrm{LOG}_K : \varepsilon \mapsto \bigl(e_1 \log|\varepsilon^{(1)}|, \ldots, e_{r_1+r_2} \log|\varepsilon^{(r_1+r_2)}|\bigr)
\]
(where $e_j = 1$ for $j = 1, \ldots, r_1$ and $e_j = 2$ for $j = r_1 + 1, \ldots, r_1 + r_2$) defines a surjective homomorphism from $O_K^*$ to a full lattice in the real vector space given by
\[
\{\mathbf{x} = (x_1, \ldots, x_{r_1+r_2}) \in \mathbb{R}^{r_1+r_2} : x_1 + \cdots + x_{r_1+r_2} = 0\}
\]
with kernel $W_K$.

Proof.
See Neukirch (1992), Kapitel I, Satz 7.1.

The following consequence is immediate.

Corollary 1.4.3 Put $r = r_K := r_1 + r_2 - 1$. Then
\[
O_K^* \cong W_K \times \mathbb{Z}^r.
\]
More explicitly, there are $\varepsilon_1, \ldots, \varepsilon_r \in O_K^*$ such that every $\varepsilon \in O_K^*$ can be expressed uniquely as
\[
\varepsilon = \zeta \varepsilon_1^{b_1} \cdots \varepsilon_r^{b_r},
\]
where $\zeta$ is a root of unity in $K$ and $b_1, \ldots, b_r$ are rational integers.

The number $r_K$ (denoted by $r$ if it is clear to which number field it refers) is called the unit rank of $K$. A set of units $\{\varepsilon_1, \ldots, \varepsilon_r\}$ as above is called a fundamental system of units for $K$. Given such a system, we define the regulator of $K$ by
\[
R_K := \Bigl|\det\bigl(e_j \log|\varepsilon_i^{(j)}|\bigr)_{i,j=1,\ldots,r}\Bigr|.
\]
This regulator is non-zero, and independent of the choice of $\varepsilon_1, \ldots, \varepsilon_r$.

1.5 Explicit estimates

We have collected from the literature some estimates for the field parameters defined above. As before, $K$ is an algebraic number field of degree $d$.

For the number $\omega_K$ of roots of unity of $K$ we have the estimate
\[
\omega_K \le 20 d \log\log d \quad \text{if } d \ge 3. \tag{1.5.1}
\]
This follows from the observation that the degree of the maximal cyclotomic subfield of $K$, which is $\varphi(\omega_K)$ where $\varphi$ is Euler's totient function, divides $d$, and from Rosser and Schoenfeld (1962), Theorem 15, which gives an explicit lower bound for $\varphi(n)$ of the order $n/\log\log n$.

For the class number and regulator of $K$ we have
\[
h_K R_K \le |D_K|^{1/2} (\log^* |D_K|)^{d-1}. \tag{1.5.2}
\]
The first inequality of this type was proved by Landau (1918). The above version follows from Louboutin (2000) and (1.5.1); see (59) in Győry and Yu (2006).

The following lower bound for the regulator is due to Friedman (1989):
\[
R_K > 0.2052. \tag{1.5.3}
\]

We recall an important lower estimate for discriminants.
By an inequality due to Minkowski (see, e.g., Lang (1970), chapter V, section 4, proof of corollary of theorem 4) we have
\[
|D_K| > \Bigl(\frac{\pi}{4}\Bigr)^{d} \Bigl(\frac{d^d}{d!}\Bigr)^{2}. \tag{1.5.4}
\]
Further, we need the following lemma.

Lemma 1.5.1 Let $g \in \mathbb{Z}[X]$ be a monic polynomial of degree $m$ with non-zero discriminant. Assume that the coefficients of $g$ have absolute values at most $M$. Let $K = \mathbb{Q}(\theta)$, where $\theta$ is a zero of $g$. Then
\[
|D_K| \le m^{2m-1} M^{2m-2}.
\]

Proof. The monic minimal polynomial, say $f$, of $\theta$ is in $\mathbb{Z}[X]$ and it divides $g$ in $\mathbb{Z}[X]$. Suppose $K$ has degree $d$. Using the expression of the discriminant of a monic polynomial as the product of the squares of the differences of its zeros, one easily shows that the discriminant $D(f)$ of $f$ divides $D(g)$ in the ring of algebraic integers and so also in $\mathbb{Z}$. Further, by (1.1.2), we have $D(f) = D_{K/\mathbb{Q}}(1, \theta, \ldots, \theta^{d-1})$. Writing $1, \theta, \ldots, \theta^{d-1}$ as $\mathbb{Z}$-linear combinations of a $\mathbb{Z}$-basis of $O_K$, and using (1.1.3), we infer that $D_K$ divides $D(f)$. Therefore, $D_K$ divides $D(g)$. Using for instance an estimate from Lewis and Mahler (1961) (bottom of p. 335), which uses a determinantal expression for $D(g)$, one obtains
\[
|D(g)| \le m^{2m-1} M^{2m-2}.
\]
This proves our lemma.

Remark There is an analogue for this lemma where for $g$ one can take any non-zero polynomial in $\mathbb{Z}[X]$, not necessarily monic or of non-zero discriminant. We will not work this out.

1.6 Absolute values: generalities

We have collected some facts on absolute values. Our basic reference is Neukirch (1992), Kapitel II.

Let $K$ be an infinite field. An absolute value on $K$ is a function $|\cdot| : K \to \mathbb{R}_{\ge 0}$ satisfying the following conditions:
\[
\begin{aligned}
&|xy| = |x| \cdot |y| \ \text{ for all } x, y \in K;\\
&\text{there is } C \ge 1 \text{ such that } |x + y| \le C \max(|x|, |y|) \ \text{ for all } x, y \in K;\\
&|x| = 0 \iff x = 0.
\end{aligned}
\]
These conditions imply that $|1| = 1$. An absolute value $|\cdot|$ on $K$ is called trivial if $|x| = 1$ for $x \in K^*$. Henceforth, all absolute values we will consider are non-trivial. Two absolute values $|\cdot|_1$, $|\cdot|_2$ on $K$ are called equivalent if there is $c > 0$ such that $|x|_2 = |x|_1^c$ for all $x \in K$.

An absolute value $|\cdot|$ on $K$ is called non-archimedean if it satisfies the ultrametric inequality
\[
|x + y| \le \max(|x|, |y|) \quad \text{for } x, y \in K,
\]
and archimedean if it does not satisfy this inequality.

A valuation on $K$ is a function $v : K \to \mathbb{R} \cup \{\infty\}$ such that $C^{-v}$ defines a non-archimedean absolute value on $K$, where $C$ is any constant $> 1$. Equivalently,
\[
\begin{aligned}
&v(0) = \infty, \quad v(x) \in \mathbb{R} \ \text{ for } x \in K^*,\\
&v(xy) = v(x) + v(y), \quad v(x + y) \ge \min(v(x), v(y)) \ \text{ for } x, y \in K.
\end{aligned}
\]
Notice that if $v$ is a valuation on $K$, then $v(K^*) = \{v(x) : x \in K^*\}$ is an additive subgroup of $\mathbb{R}$, called the value group of $v$. In this book, we agree that a discrete valuation on $K$ is a valuation $v$ on $K$ for which $v(K^*) = \mathbb{Z}$. (In much of the literature, a discrete valuation on $K$ is a valuation $v$ for which $v(K^*)$ is a non-trivial discrete subgroup of $\mathbb{R}$; then a valuation $v$ for which $v(K^*) = \mathbb{Z}$ is called a normalized discrete valuation.)

A field with absolute value is a pair $(K, |\cdot|)$, where $K$ is an infinite field, and $|\cdot|$ a non-trivial absolute value on $K$. An injective homomorphism/isomorphism of fields with absolute value $\varphi : (K_1, |\cdot|_1) \to (K_2, |\cdot|_2)$ is an injective homomorphism/isomorphism $\varphi : K_1 \to K_2$ such that $|x|_1 = |\varphi(x)|_2$ for $x \in K_1$.

Let $(K, |\cdot|)$ be a field with absolute value. A sequence $\{a_n\} = \{a_n\}_{n=0}^{\infty}$ in $K$ is called a convergent sequence of $(K, |\cdot|)$, if there is $\alpha \in K$ such that $\lim_{n \to \infty} |a_n - \alpha| = 0$.
A Cauchy sequence of $(K, |\cdot|)$ is a sequence $\{a_n\}$ in $K$ with $\lim_{m,n \to \infty} |a_m - a_n| = 0$. We call $(K, |\cdot|)$ complete if every Cauchy sequence of $(K, |\cdot|)$ converges.

Suppose that $(K, |\cdot|)$ is a non-complete field with absolute value. Then we can extend this to a complete field with absolute value $(\widetilde{K}, |\cdot|)$, the completion of $(K, |\cdot|)$, as follows. Let $R$ be the ring of Cauchy sequences of $(K, |\cdot|)$ with componentwise addition and multiplication, and $M$ the ideal of sequences of $(K, |\cdot|)$ converging to 0. Then $M$ is a maximal ideal of $R$ and thus, $R/M$ is a field which will be our $\widetilde{K}$. We view $K$ as a subfield of $\widetilde{K}$ by identifying $\alpha \in K$ with the element of $\widetilde{K}$ represented by the constant sequence $\{\alpha\}$. We extend $|\cdot|$ to $\widetilde{K}$ by setting $|\alpha| := \lim_{n \to \infty} |a_n|$ for $\alpha \in \widetilde{K}$, where $\{a_n\}$ is any Cauchy sequence of $(K, |\cdot|)$ representing $\alpha$. The field $\widetilde{K}$ is the smallest complete field containing $K$, in the sense that if there exists an injective homomorphism of $(K, |\cdot|)$ into a complete field with absolute value $(L, |\cdot|')$, say, then this can be extended in precisely one way to an injective homomorphism from $(\widetilde{K}, |\cdot|)$ into $(L, |\cdot|')$.

It is easy to see that notions such as convergence, Cauchy sequence, completeness, completion, depend on the equivalence class of an absolute value rather than the absolute value itself. If $K$ is a field with valuation $v$, then notions such as convergence, Cauchy sequence, completeness with respect to $v$ are meant to be the corresponding notions with respect to the absolute value $C^{-v}$, where $C > 1$ is any constant.

Example 1: Absolute values on $\mathbb{Q}$. Define $M_{\mathbb{Q}} := \{\infty\} \cup \{\text{prime numbers}\}$. This is called the set of places of $\mathbb{Q}$.
We call $\infty$ the infinite place, and the prime numbers the finite places of $\mathbb{Q}$. We define absolute values $|\cdot|_p$ ($p \in M_{\mathbb{Q}}$) by
\[
\begin{aligned}
|a|_\infty &:= \max(a, -a) && \text{for } a \in \mathbb{Q},\\
|a|_p &:= p^{-\mathrm{ord}_p(a)} && \text{for } a \in \mathbb{Q}
\end{aligned}
\]
for every prime number $p$, where $\mathrm{ord}_p(a)$ is the exponent of $p$ in the unique prime factorization of $a$, i.e., if $a = p^m b/c$ with $m, b, c \in \mathbb{Z}$ and $p \nmid bc$, then $\mathrm{ord}_p(a) = m$. We agree that $\mathrm{ord}_p(0) = \infty$ and $|0|_p = 0$. The absolute value $|\cdot|_\infty$ is archimedean, while the other ones are non-archimedean. The completion of $\mathbb{Q}$ with respect to $|\cdot|_\infty$ is $\mathbb{Q}_\infty := \mathbb{R}$. For a prime number $p$, the completion of $\mathbb{Q}$ with respect to $|\cdot|_p$ is the field of $p$-adic numbers, denoted by $\mathbb{Q}_p$. The above absolute values satisfy the Product Formula
\[
\prod_{p \in M_{\mathbb{Q}}} |a|_p = 1 \quad \text{for } a \in \mathbb{Q}^*.
\]
By a theorem of Ostrowski (see Neukirch (1992), Kapitel II, Satz 3.7), every non-trivial absolute value on $\mathbb{Q}$ is equivalent to one of $|\cdot|_p$ ($p \in M_{\mathbb{Q}}$).

Example 2. By another theorem of Ostrowski (see Neukirch (1992), Kapitel II, Satz 4.2), if $(K, |\cdot|)$ is a complete field with archimedean absolute value, then up to isomorphism, $K = \mathbb{R}$ or $\mathbb{C}$, and $|\cdot|$ is equivalent to the ordinary absolute value.

We finish by recalling some facts about extensions of absolute values. Let $(K, |\cdot|)$ be a field with absolute value, and $L$ an extension field of $K$. By an extension or continuation of $|\cdot|$ to $L$ we mean an absolute value on $L$ whose restriction to $K$ is $|\cdot|$.

Proposition 1.6.1 Let $(K, |\cdot|)$ be a complete field with absolute value.
(i) Let $L$ be a finite extension of $K$. Then there is precisely one extension of $|\cdot|$ to $L$, which is given by $|N_{L/K}(\cdot)|^{1/[L:K]}$. The field $L$ is complete with respect to this extension.
(ii) Let $\overline{K}$ be an algebraic closure of $K$.
Then $|\cdot|$ has a unique extension to $\overline{K}$. If we denote this extension also by $|\cdot|$, we have $|\tau(x)| = |x|$ for $x \in \overline{K}$, $\tau \in \mathrm{Gal}(\overline{K}/K)$.

Proof. See for instance Neukirch (1992), Kapitel II, Theorem 4.8.

1.7 Absolute values and places on number fields

Let $K$ be an algebraic number field. We introduce a collection of normalized absolute values $\{|\cdot|_v\}_{v \in M_K}$ on $K$ by taking suitable powers of the extensions to $K$ of the absolute values $|\cdot|_p$ ($p \in M_{\mathbb{Q}}$) defined in the previous section.

A real place of $K$ is a set $\{\sigma\}$ where $\sigma : K \to \mathbb{R}$ is a real embedding of $K$. A complex place of $K$ is a pair $\{\sigma, \overline{\sigma}\}$ of conjugate complex embeddings $K \to \mathbb{C}$. An infinite place is a real or complex place. Clearly, if $r_1$, $r_2$ denote the number of real and complex places of $K$, we have $r_1 + 2r_2 = [K : \mathbb{Q}]$. A finite place of $K$ is a non-zero prime ideal of $O_K$. We denote by $M_K^\infty$, $M_K^0$ the sets of infinite places and finite places, respectively, of $K$, and by $M_K$ the set of all places of $K$, i.e., $M_K := M_K^\infty \cup M_K^0$. With every place $v \in M_K$ we associate an absolute value $|\cdot|_v$ on $K$, which is defined as follows for $\alpha \in K$:
\[
\begin{aligned}
|\alpha|_v &:= |\sigma(\alpha)| && \text{if } v = \{\sigma\} \text{ is real};\\
|\alpha|_v &:= |\sigma(\alpha)|^2 = |\overline{\sigma}(\alpha)|^2 && \text{if } v = \{\sigma, \overline{\sigma}\} \text{ is complex};\\
|\alpha|_v &:= N_K(\mathfrak{p})^{-\mathrm{ord}_{\mathfrak{p}}(\alpha)} && \text{if } v = \mathfrak{p} \text{ is a prime ideal of } O_K,
\end{aligned}
\]
where $N_K(\mathfrak{p}) = |O_K/\mathfrak{p}|$ is the absolute norm of $\mathfrak{p}$, and $\mathrm{ord}_{\mathfrak{p}}(\alpha)$ is the exponent of $\mathfrak{p}$ in the prime ideal factorization of $(\alpha)$, where we agree that $\mathrm{ord}_{\mathfrak{p}}(0) = \infty$.

We denote by $K_v$ the completion of $K$ with respect to $|\cdot|_v$. Notice that $K_v = \mathbb{R}$ if $v$ is real, $K_v = \mathbb{C}$ if $v$ is complex, while $K_v$ is a finite extension of $\mathbb{Q}_p$ if $v = \mathfrak{p}$ is a prime ideal of $O_K$, and $p$ is the prime number with $\mathfrak{p} \cap \mathbb{Z} = (p)$.

Combining the Product Formula over $\mathbb{Q}$ with the identity $N_K((\alpha)) = |N_{K/\mathbb{Q}}(\alpha)|$ for $\alpha \in K$, where the left-hand side denotes the absolute norm of $(\alpha)$, one easily deduces the Product Formula over $K$,
\[
\prod_{v \in M_K} |\alpha|_v = 1 \quad \text{for } \alpha \in K^*. \tag{1.7.1}
\]
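For $K = \mathbb{Q}$ the Product Formula can be checked by direct computation. The following Python fragment is only an illustrative sketch, not part of the text's development; the function names `ord_p`, `abs_p`, `prime_divisors` and `product_over_places` are hypothetical helpers introduced here. It uses exact rational arithmetic and multiplies $|a|_\infty$ with $|a|_p$ over the finitely many primes $p$ with $|a|_p \ne 1$.

```python
from fractions import Fraction


def ord_p(a: Fraction, p: int) -> int:
    """Exponent of the prime p in the factorization of the non-zero rational a."""
    if a == 0:
        raise ValueError("ord_p(0) is infinite")
    m, num, den = 0, a.numerator, a.denominator
    while num % p == 0:
        num //= p
        m += 1
    while den % p == 0:
        den //= p
        m -= 1
    return m


def abs_p(a: Fraction, p: int) -> Fraction:
    """p-adic absolute value |a|_p = p^(-ord_p(a)) of a non-zero rational a."""
    return Fraction(p) ** (-ord_p(a, p))


def prime_divisors(n: int) -> list[int]:
    """Prime divisors of a non-zero integer, found by trial division."""
    n = abs(n)
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def product_over_places(a: Fraction) -> Fraction:
    """Product of |a|_v over the infinite place and all primes with |a|_p != 1."""
    primes = set(prime_divisors(a.numerator)) | set(prime_divisors(a.denominator))
    result = abs(a)  # contribution of the infinite place
    for p in primes:
        result *= abs_p(a, p)
    return result
```

For example, for $a = -360/49$ the factors are $360/49$, $1/8$, $1/9$, $1/5$ and $49$ at the places $\infty, 2, 3, 5, 7$, and their product is 1.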
To deal with archimedean and non-archimedean absolute values simultaneously, it is convenient to use
\[
|x_1 + \cdots + x_n|_v \le n^{s(v)} \max(|x_1|_v, \ldots, |x_n|_v) \tag{1.7.2}
\]
for $v \in M_K$, $x_1, \ldots, x_n \in K$, where
\[
s(v) = 1 \text{ if } v \text{ is real}, \quad s(v) = 2 \text{ if } v \text{ is complex}, \quad s(v) = 0 \text{ if } v \text{ is finite}.
\]
Note that $\sum_{v \in M_K^\infty} s(v) = [K : \mathbb{Q}]$.

Let $\rho : K_1 \to K_2$ be an isomorphism of algebraic number fields, and $v$ a place of $K_2$. We define a place $v \circ \rho$ on $K_1$ by
\[
v \circ \rho :=
\begin{cases}
\{\sigma\rho\} & \text{if } v = \{\sigma\} \text{ is real},\\
\{\tau\rho, \overline{\tau}\rho\} & \text{if } v = \{\tau, \overline{\tau}\} \text{ is complex},\\
\rho^{-1}(\mathfrak{p}) & \text{if } v = \mathfrak{p} \text{ is a prime ideal of } O_{K_2}.
\end{cases}
\]
Then
\[
|\alpha|_{v \circ \rho} = |\rho(\alpha)|_v \quad \text{for } \alpha \in K_1,\ v \in M_{K_2}. \tag{1.7.3}
\]

Let $L$ be a finite extension of $K$ and $v$, $V$ places of $K$, $L$, respectively. We say that $V$ lies above $v$ or $v$ below $V$, notation $V|v$, if the restriction of $|\cdot|_V$ to $K$ is a power of $|\cdot|_v$. This is the case precisely if either both $v$, $V$ are infinite and the embeddings in $v$ are the restrictions to $K$ of the embeddings in $V$; or if $v = \mathfrak{p}$, $V = \mathfrak{P}$ are prime ideals of $O_K$, $O_L$, respectively, with $\mathfrak{P} \supset \mathfrak{p}$. In that case, the completion $L_V$ of $L$ with respect to $|\cdot|_V$ is a finite extension of $K_v$. In fact, $[L_V : K_v]$ is 1 or 2 if $v$, $V$ are infinite, while if $v = \mathfrak{p}$, $V = \mathfrak{P}$ are finite, we have $[L_V : K_v] = e(\mathfrak{P}|\mathfrak{p}) f(\mathfrak{P}|\mathfrak{p})$, where $e(\mathfrak{P}|\mathfrak{p})$, $f(\mathfrak{P}|\mathfrak{p})$ denote the ramification index and residue class degree of $\mathfrak{P}$ over $\mathfrak{p}$.

We say that two places $V_1$, $V_2$ of $L$ are conjugate over $K$ if there is a $K$-automorphism $\sigma$ of $L$ such that $V_2 = V_1 \circ \sigma$.

Proposition 1.7.1 Let $K$ be a number field, and $L$ a finite extension of $K$. Further, let $v$ be a place of $K$ and $V_1, \ldots, V_g$ the places of $L$ above $v$. Then
\[
|\alpha|_{V_k} = |\alpha|_v^{[L_{V_k} : K_v]} \quad \text{for } \alpha \in K,\ k = 1, \ldots, g, \tag{1.7.4}
\]
\[
\prod_{k=1}^{g} |\alpha|_{V_k} = |N_{L/K}(\alpha)|_v \quad \text{for } \alpha \in L, \tag{1.7.5}
\]
\[
\sum_{k=1}^{g} [L_{V_k} : K_v] = [L : K]. \tag{1.7.6}
\]
Further, if $L/K$ is Galois, then $V_1, \ldots, V_g$ are conjugate to each other, and we have $[L_{V_k} : K_v] = [L : K]/g$ for $k = 1, \ldots, g$.

Proof. The verification is completely straightforward if $v$ is an infinite place. So assume that $v = \mathfrak{p}$ is a finite place; then $V_k = \mathfrak{P}_k$ ($k = 1, \ldots, g$) are the prime ideals of $O_L$ containing $\mathfrak{p}$. The first identity (1.7.4) follows from the observation that for $\alpha \in K$, $\mathfrak{P} \in \{\mathfrak{P}_1, \ldots, \mathfrak{P}_g\}$,
\[
|\alpha|_{\mathfrak{P}} = N_L(\mathfrak{P})^{-\mathrm{ord}_{\mathfrak{P}}(\alpha)} = N_K(\mathfrak{p})^{-f(\mathfrak{P}|\mathfrak{p}) e(\mathfrak{P}|\mathfrak{p}) \mathrm{ord}_{\mathfrak{p}}(\alpha)} = |\alpha|_{\mathfrak{p}}^{[L_{\mathfrak{P}} : K_{\mathfrak{p}}]}.
\]
Identity (1.7.5) follows by expressing both sides of the identity as powers of $N_K(\mathfrak{p})$ and showing by means of Proposition 1.3.3 (iii) that the exponents are equal. Identity (1.7.6) follows by combining (1.7.4) with (1.7.5) with $\alpha \in K^*$. The last assertion follows from Proposition 1.3.1 (ii).

1.8 S-integers, S-units and S-norm

Let $S$ denote a finite subset of $M_K$ containing all infinite places. We say that $\alpha \in K$ is an $S$-integer if $|\alpha|_v \le 1$ for all $v \in M_K \setminus S$. The $S$-integers form a ring in $K$, denoted by $O_S$. Its unit group, denoted $O_S^*$, is called the group of $S$-units.

For $S = M_K^\infty$ the ring of $S$-integers is just $O_K$ and the group of $S$-units just $O_K^*$. Otherwise, we have $S = M_K^\infty \cup \{\mathfrak{p}_1, \ldots, \mathfrak{p}_t\}$, where $\mathfrak{p}_1, \ldots, \mathfrak{p}_t$ are prime ideals of $O_K$. Then
\[
O_S = O_K[(\mathfrak{p}_1 \cdots \mathfrak{p}_t)^{-1}],
\]
and $O_S^*$ consists of those elements $\alpha$ of $K$ such that $(\alpha)$ is composed of prime ideals from $\mathfrak{p}_1, \ldots, \mathfrak{p}_t$. In the case $K = \mathbb{Q}$, $S = \{\infty, p_1, \ldots, p_t\}$ where $p_1, \ldots, p_t$ are prime numbers, we write $\mathbb{Z}_S$ for the ring of $S$-integers. Thus, $\mathbb{Z}_S = \mathbb{Z}[(p_1 \cdots p_t)^{-1}]$.

We define the $S$-norm of $\alpha \in K$ by
\[
N_S(\alpha) := \prod_{v \in S} |\alpha|_v.
\]
Notice that the $S$-norm is multiplicative. Let again $S = M_K^\infty \cup \{\mathfrak{p}_1, \ldots, \mathfrak{p}_t\}$.
Take $\alpha \in K^*$ and write $(\alpha) = \mathfrak{p}_1^{k_1} \cdots \mathfrak{p}_t^{k_t} \mathfrak{a}$, where $\mathfrak{a}$ is a fractional ideal of $O_K$ composed of prime ideals outside $S$. Then by the Product Formula,
\[
N_S(\alpha) = \prod_{v \in M_K \setminus S} |\alpha|_v^{-1} = \prod_{\mathfrak{p} \in \mathcal{P}(O_K) \setminus \{\mathfrak{p}_1, \ldots, \mathfrak{p}_t\}} N_K(\mathfrak{p})^{\mathrm{ord}_{\mathfrak{p}}(\alpha)} = N_K(\mathfrak{a}).
\]

Let $L$ be a finite extension of $K$ and $T$ the set of places of $L$ lying above the places of $S$. Then $O_T$, the ring of $T$-integers of $L$, is the integral closure of $O_S$ in $L$. Further, by Proposition 1.7.1,
\[
N_T(\alpha) = N_S(\alpha)^{[L:K]} \quad \text{for } \alpha \in K^*.
\]

We recall the extension of Dirichlet's Unit Theorem to $S$-units.

Theorem 1.8.1 (S-unit Theorem) Let $S = \{v_1, \ldots, v_s\}$ be a finite set of places of $K$, containing all infinite places. Then the map
\[
\mathrm{LOG}_S : \varepsilon \mapsto (\log|\varepsilon|_{v_1}, \ldots, \log|\varepsilon|_{v_s})
\]
defines a surjective homomorphism from $O_S^*$ to a full lattice in the real vector space
\[
\{\mathbf{x} = (x_1, \ldots, x_s) \in \mathbb{R}^s : x_1 + \cdots + x_s = 0\}
\]
with kernel $W_K$.

Proof. See Lang (1970), chapter V, section 1, Unit Theorem.

This implies at once:

Corollary 1.8.2 We have
\[
O_S^* \cong W_K \times \mathbb{Z}^{s-1}.
\]
More explicitly, there are $\varepsilon_1, \ldots, \varepsilon_{s-1} \in O_S^*$ such that every $\varepsilon \in O_S^*$ can be expressed uniquely as
\[
\varepsilon = \zeta \varepsilon_1^{b_1} \cdots \varepsilon_{s-1}^{b_{s-1}}, \tag{1.8.1}
\]
where $\zeta$ is a root of unity in $K$ and $b_1, \ldots, b_{s-1}$ are rational integers.

A system $\{\varepsilon_1, \ldots, \varepsilon_{s-1}\}$ as above is called a fundamental system of $S$-units.

Analogously as for units of $O_K$, we define the $S$-regulator by
\[
R_S := \Bigl|\det\bigl(\log|\varepsilon_i|_{v_j}\bigr)_{i,j=1,\ldots,s-1}\Bigr|. \tag{1.8.2}
\]
This quantity is non-zero, and independent of the choice of $\varepsilon_1, \ldots, \varepsilon_{s-1}$ and of the choice $v_1, \ldots, v_{s-1}$ from $S$. In the case that $S = M_K^\infty$, the $S$-regulator $R_S$ is equal to the regulator $R_K$. More generally, we have
\[
R_S = R_K \cdot [I(S) : P(S)] \cdot \prod_{i=1}^{t} \log N(\mathfrak{p}_i), \tag{1.8.3}
\]
where $\mathfrak{p}_1, \ldots, \mathfrak{p}_t$ are the prime ideals in $S$, $I(S)$ is the group of fractional ideals of $O_K$ composed of prime ideals from $\mathfrak{p}_1, \ldots, \mathfrak{p}_t$ and $P(S)$ is the group of principal fractional ideals of $O_K$ composed of prime ideals from $\mathfrak{p}_1, \ldots, \mathfrak{p}_t$.

We note that the index $[I(S) : P(S)]$ is a divisor of the class number $h_K$. By combining (1.8.3) with (1.5.2) we obtain
\[
R_S \le h_K R_K \cdot \prod_{i=1}^{t} \log N_K(\mathfrak{p}_i) \le |D_K|^{1/2} (\log^* |D_K|)^{d-1} \cdot \prod_{i=1}^{t} \log N_K(\mathfrak{p}_i). \tag{1.8.4}
\]
By combining (1.8.3) with (1.5.3), we obtain
\[
R_S \ge
\begin{cases}
(\log 2)(\log 3)^{s-2} & \text{if } K = \mathbb{Q},\ s := |S| \ge 3,\\
0.2052\,(\log 2)^{s-2} & \text{if } K \ne \mathbb{Q},\ s \ge 3.
\end{cases} \tag{1.8.5}
\]

1.9 Heights

There are various ways to define the height of an algebraic number, a vector with algebraic coordinates or a polynomial with algebraic coefficients. Here we have made a small selection. The other notions of height needed in this book will be defined on the spot. We fix an algebraic closure $\overline{\mathbb{Q}}$ of $\mathbb{Q}$.

1.9.1 Heights of algebraic numbers

The absolute multiplicative height of $\alpha \in \overline{\mathbb{Q}}$ is defined by
\[
H(\alpha) := \prod_{v \in M_K} \max(1, |\alpha|_v)^{1/[K:\mathbb{Q}]},
\]
where $K \subset \overline{\mathbb{Q}}$ is any number field containing $\alpha$. It follows from Proposition 1.7.1 that this is independent of the choice of $K$. The absolute logarithmic height of $\alpha$ is given by
\[
h(\alpha) := \log H(\alpha).
\]
Below, we have brought together some properties of the absolute logarithmic height. These can easily be reformulated into properties of the absolute multiplicative height.

We start with a trivial but useful observation: if $K$ is an algebraic number field, $S$ any non-empty subset of $M_K$, and $\alpha \ne 0$, then
\[
-h(\alpha) \le \frac{1}{[K : \mathbb{Q}]} \sum_{v \in S} \log|\alpha|_v \le h(\alpha). \tag{1.9.1}
\]
Indeed, the upper bound is obvious from $h(\alpha) = \frac{1}{[K:\mathbb{Q}]} \sum_{v \in M_K} \log\max(1, |\alpha|_v)$, and the lower bound follows in the same manner, applying the Product Formula $\sum_{v \in S} \log|\alpha|_v = -\sum_{v \notin S} \log|\alpha|_v$. In the case that $S$ is a finite subset of $M_K$, containing the infinite places, (1.9.1) translates into
\[
-h(\alpha) \le \frac{1}{[K : \mathbb{Q}]} \log N_S(\alpha) \le h(\alpha). \tag{1.9.2}
\]

The next lemma gives an estimate for the denominator of an algebraic number.

Lemma 1.9.1 Let $K$ be a number field and $\alpha \in K^*$. Then there is a positive integer $d$ such that
\[
d \le H(\alpha)^{[K:\mathbb{Q}]} \quad \text{and} \quad d\alpha \in O_K.
\]

Proof. We take $d := \prod_{v \in M_K^0} \max(1, |\alpha|_v)$. It is clear that $d \le H(\alpha)^{[K:\mathbb{Q}]}$. We show that $d$ is a positive integer and $d\alpha \in O_K$. First observe that
\[
d = \prod_{\mathfrak{p} \in \mathcal{P}(O_K)} N_K(\mathfrak{p})^{\max(0, -\mathrm{ord}_{\mathfrak{p}}(\alpha))} = \prod_{p} p^{\sum_{\mathfrak{p}|p} f(\mathfrak{p}|p) \max(0, -\mathrm{ord}_{\mathfrak{p}}(\alpha))} \in \mathbb{Z}_{>0},
\]
where the product is over the rational primes. Further, if $\mathfrak{p}$ is a prime ideal of $O_K$ lying above the prime $p$, say, then
\[
\mathrm{ord}_{\mathfrak{p}}(d) \ge \mathrm{ord}_p(d) \ge -\mathrm{ord}_{\mathfrak{p}}(\alpha),
\]
implying $\mathrm{ord}_{\mathfrak{p}}(d\alpha) \ge 0$. This holds for every prime ideal $\mathfrak{p}$ of $O_K$, hence $d\alpha \in O_K$.

In the following lemma we have listed some further properties.

Lemma 1.9.2 Let $\alpha, \alpha_1, \ldots, \alpha_n \in \overline{\mathbb{Q}}$, $m \in \mathbb{Z}$ and let $\sigma$ be an automorphism of $\overline{\mathbb{Q}}$. Then
\[
\begin{aligned}
h(\sigma(\alpha)) &= h(\alpha);\\
h(\alpha_1 \cdots \alpha_n) &\le \sum_{i=1}^{n} h(\alpha_i);\\
h(\alpha_1 + \cdots + \alpha_n) &\le \log n + \sum_{i=1}^{n} h(\alpha_i);\\
h(\alpha^m) &= |m|\, h(\alpha) \quad \text{if } \alpha \ne 0.
\end{aligned}
\]

Proof. The first property is a consequence of (1.7.3), the third of (1.7.2), while the other two are obvious. See also Waldschmidt (2000), chapter 3.

The minimal polynomial of $\alpha \in \overline{\mathbb{Q}}$ over $\mathbb{Z}$, denoted by $P_\alpha$, is by definition the polynomial $P \in \mathbb{Z}[X]$ of minimal degree, having positive leading coefficient and coefficients with greatest common divisor 1, such that $P(\alpha) = 0$. Writing
$P_\alpha = a_0 (X - \alpha^{(1)}) \cdots (X - \alpha^{(d)})$ where $d = \deg \alpha$ and $\alpha^{(1)}, \ldots, \alpha^{(d)}$ are the conjugates of $\alpha$ in $\mathbb{C}$, we have
\[
H(\alpha) = \Bigl(|a_0| \prod_{i=1}^{d} \max\bigl(1, |\alpha^{(i)}|\bigr)\Bigr)^{1/d},
\]
i.e., $H(\alpha)$ is the $d$-th root of the Mahler measure of $\alpha$ (see Waldschmidt (2000), Lemma 3.10). Further, writing $P_\alpha = a_0 X^d + \cdots + a_d$, we have
\[
-\frac{1}{2d} \log(d + 1) + h(\alpha) \le \frac{1}{d}\, h(P_\alpha) \le \log 2 + h(\alpha), \tag{1.9.3}
\]
where $h(P_\alpha) := \log\max(|a_0|, \ldots, |a_d|)$ (see Waldschmidt (2000), Lemma 3.11). From this we deduce at once:

Theorem 1.9.3 (Northcott's Theorem) Let $D$, $H$ be positive reals. Then there are only finitely many $\alpha \in \overline{\mathbb{Q}}$ such that $\deg \alpha \le D$ and $h(\alpha) \le H$.

1.9.2 v-adic norms and heights of vectors and polynomials

Let $K$ be an algebraic number field, $v \in M_K$, and denote the unique extension of $|\cdot|_v$ to $\overline{K_v}$ also by $|\cdot|_v$. We define the $v$-adic norm of a vector $\mathbf{x} = (x_1, \ldots, x_n) \in \overline{K_v}^{\,n}$ by
\[
|\mathbf{x}|_v = |x_1, \ldots, x_n|_v := \max(|x_1|_v, \ldots, |x_n|_v).
\]
Let $\mathbf{x} = (x_1, \ldots, x_n) \in \overline{\mathbb{Q}}^{\,n}$ and choose an algebraic number field $K$ such that $\mathbf{x} \in K^n$. Then the multiplicative height and homogeneous multiplicative height of $\mathbf{x}$ are defined by
\[
\begin{aligned}
H(\mathbf{x}) = H(x_1, \ldots, x_n) &:= \Bigl(\prod_{v \in M_K} \max(1, |\mathbf{x}|_v)\Bigr)^{1/[K:\mathbb{Q}]},\\
H^{\mathrm{hom}}(\mathbf{x}) = H^{\mathrm{hom}}(x_1, \ldots, x_n) &:= \Bigl(\prod_{v \in M_K} |\mathbf{x}|_v\Bigr)^{1/[K:\mathbb{Q}]},
\end{aligned}
\]
respectively. By Proposition 1.7.1, these definitions are independent of the choice of $K$. We define the corresponding logarithmic heights by
\[
h(\mathbf{x}) := \log H(\mathbf{x}), \qquad h^{\mathrm{hom}}(\mathbf{x}) := \log H^{\mathrm{hom}}(\mathbf{x}),
\]
respectively. For instance, for $\mathbf{x} = (x_1, \ldots, x_n) \in \mathbb{Z}^n \setminus \{\mathbf{0}\}$ we have
\[
\begin{aligned}
h(\mathbf{x}) &= \log\max(|x_1|, \ldots, |x_n|),\\
h^{\mathrm{hom}}(\mathbf{x}) &= \log\Bigl(\frac{\max(|x_1|, \ldots, |x_n|)}{\gcd(x_1, \ldots, x_n)}\Bigr).
\end{aligned} \tag{1.9.4}
\]
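The integer case (1.9.4) is easy to evaluate numerically. The following Python fragment is a minimal sketch under the assumption that the input is a non-zero vector of rational integers; the function names `log_height` and `log_height_hom` are hypothetical and introduced only for this illustration.

```python
import math
from functools import reduce


def log_height(x):
    """h(x) = log max|x_i| for a non-zero integer vector x, as in (1.9.4)."""
    return math.log(max(abs(c) for c in x))


def log_height_hom(x):
    """h^hom(x) = log(max|x_i| / gcd(x_1, ..., x_n)) for a non-zero integer vector x."""
    g = reduce(math.gcd, (abs(c) for c in x))
    # The gcd divides every coordinate, so the integer division below is exact.
    return math.log(max(abs(c) for c in x) // g)
```

For $\mathbf{x} = (6, -15, 9)$ the maximum is 15 and the gcd is 3, so $h(\mathbf{x}) = \log 15$ and $h^{\mathrm{hom}}(\mathbf{x}) = \log 5$. Replacing $\mathbf{x}$ by $2\mathbf{x}$ changes $h$ but leaves $h^{\mathrm{hom}}$ unchanged, since the homogeneous height does not see a common scalar factor.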
It is easy to verify that for $\mathbf{x} = (x_1, \ldots, x_n) \in \overline{\mathbb{Q}}^{\,n}$, $\lambda \in \overline{\mathbb{Q}}^*$ and for $\mathbf{x}_1, \ldots, \mathbf{x}_m \in \overline{\mathbb{Q}}^{\,n}$,
\[ h^{\mathrm{hom}}(\mathbf{x}) \le h(\mathbf{x}), \tag{1.9.5} \]
\[ \max_{1 \le i \le n} h(x_i) \le h(\mathbf{x}) \le \sum_{i=1}^{n} h(x_i), \tag{1.9.6} \]
\[ h(\mathbf{x}) - h(\lambda) \le h(\lambda \mathbf{x}) \le h(\mathbf{x}) + h(\lambda), \tag{1.9.7} \]
\[ h^{\mathrm{hom}}(\lambda \mathbf{x}) = h^{\mathrm{hom}}(\mathbf{x}), \tag{1.9.8} \]
\[ h(\mathbf{x}_1 + \cdots + \mathbf{x}_m) \le \sum_{i=1}^{m} h(\mathbf{x}_i) + \log m. \tag{1.9.9} \]

We recall a few facts on heights and norms of polynomials. Let $K$ be an algebraic number field and $v \in M_K$. Denote the unique extension of $|\cdot|_v$ to $\overline{K_v}$ also by $|\cdot|_v$. For a polynomial $P \in \overline{K_v}[X_1, \ldots, X_g]$, we denote by $|P|_v$ the $v$-adic norm of a vector consisting of all non-zero coefficients of $P$. We write as before $s(v) = 1$ if $v$ is real, $s(v) = 2$ if $v$ is complex, and $s(v) = 0$ if $v$ is finite.

Proposition 1.9.4 Let $P_1, \ldots, P_m$ be non-zero polynomials in $\overline{K_v}[X_1, \ldots, X_g]$ and let $n$ be the sum of the partial degrees of $P := P_1 \cdots P_m$. Then
\[ 2^{-n s(v)} \le \frac{|P|_v}{|P_1|_v \cdots |P_m|_v} \le 2^{n s(v)}. \]

Proof. In the case that $v$ is finite, the term $2^{n s(v)}$ is 1, and so this is Gauss' Lemma. In the case that $v$ is infinite, this is a version of a lemma of Gelfond. Proofs of both can be found for instance in Bombieri and Gubler (2006), Lemmas 1.6.3, 1.6.11.

For a polynomial $P \in \overline{\mathbb{Q}}[X_1, \ldots, X_g]$, we denote by $H(P)$, $H^{\mathrm{hom}}(P)$, $h(P)$, $h^{\mathrm{hom}}(P)$ the respective heights of a vector whose coordinates are the non-zero coefficients of $P$. Obviously, for polynomials we have similar inequalities as in (1.9.5)–(1.9.9). From Proposition 1.9.4 we deduce at once:

Corollary 1.9.5 Let $P_1, \ldots, P_m$ be non-zero polynomials in $\overline{\mathbb{Q}}[X_1, \ldots, X_g]$ and let $n$ be the sum of the partial degrees of $P := P_1 \cdots P_m$. Then
\[ \Big| h^{\mathrm{hom}}(P) - \sum_{i=1}^{m} h^{\mathrm{hom}}(P_i) \Big| \le n \log 2. \]

Proof. Choose a number field $K$ containing the coefficients of $P_1$, . . .
, $P_m$, apply Proposition 1.9.4 and take the product over $v \in M_K$.

Corollary 1.9.6 Let $P = (X - \alpha_1) \cdots (X - \alpha_n) \in \overline{\mathbb{Q}}[X]$. Then
\[ \Big| h(P) - \sum_{i=1}^{n} h(\alpha_i) \Big| \le n \log 2, \]
\[ 2^{-n} H(P) \le \prod_{i=1}^{n} H(\alpha_i) \le 2^{n} H(P). \]

Proof. The second assertion is an immediate consequence of the first one. To prove the first, observe that $h(\alpha) = h^{\mathrm{hom}}(X - \alpha)$ for $\alpha \in \overline{\mathbb{Q}}$ and that $h(P) = h^{\mathrm{hom}}(P)$ since $P$ is monic. Applying Corollary 1.9.5 to the identity $P(X) = \prod_{i=1}^{n} (X - \alpha_i)$, the first assertion follows.

For monic irreducible polynomials $P$ with coefficients in $\mathbb{Z}$, Corollary 1.9.6 gives a slightly weaker version of (1.9.3).

1.10 Effective computations in number fields

We have listed the basic algorithmic results for algebraic number fields that will be needed later. We shall not present the algorithms themselves, but refer to the literature for their description. Our main references are Borevich and Shafarevich (1967), Pohst and Zassenhaus (1989) and Cohen (1993, 2000).

When we say that for any given input from a specified set we can determine/compute effectively an output, we mean that there exists an algorithm (that is, a deterministic Turing machine) which, for any choice of input from the given set, computes the output in finitely many steps. We say that an object is given effectively if it is given in such a way that it can serve as input for an algorithm.

An algebraic number field $K$ is said to be effectively given if $K = \mathbb{Q}(\theta)$ and the monic minimal polynomial $P \in \mathbb{Q}[X]$ of $\theta$ are given. Then $K \cong \mathbb{Q}[X]/(P)$. Here we may assume that $P \in \mathbb{Z}[X]$ and that $\theta$ is an algebraic integer. Throughout this section we assume that $K$ is effectively given in the form $K = \mathbb{Q}(\theta)$ with the monic minimal polynomial $P$ of $\theta$ in $\mathbb{Z}[X]$.
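To make the identification $K \cong \mathbb{Q}[X]/(P)$ concrete, here is a minimal sketch of multiplication in this quotient. It is an illustration under our own assumptions and not one of the algorithms from the cited references: an element is stored as the list of its $\deg P$ rational coefficients (lowest degree first), $P$ is stored as its coefficient list with leading coefficient 1, and the name `mul_mod` is hypothetical.

```python
from fractions import Fraction


def mul_mod(a, b, P):
    """Multiply a and b in Q[X]/(P).

    a, b: coefficient lists [c_0, ..., c_{d-1}] of length d = deg P.
    P:    coefficient list [p_0, ..., p_{d-1}, 1] of a monic polynomial.
    """
    d = len(P) - 1
    if len(a) != d or len(b) != d or P[-1] != 1:
        raise ValueError("expected deg P coefficients and a monic P")
    # Plain polynomial product, degree at most 2d - 2.
    prod = [Fraction(0)] * (2 * d - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] += Fraction(ai) * Fraction(bj)
    # Reduce from the top using X^d = -(p_0 + p_1 X + ... + p_{d-1} X^{d-1}).
    for k in range(2 * d - 2, d - 1, -1):
        c = prod[k]
        prod[k] = Fraction(0)
        for i in range(d):
            prod[k - d + i] -= c * P[i]
    return prod[:d]


# Q(sqrt 2): P = X^2 - 2, theta = class of X.
P = [-2, 0, 1]
print(mul_mod([1, 1], [1, 1], P))  # (1 + theta)^2 = 3 + 2*theta
```

Addition and subtraction are coefficientwise; division would additionally need the extended Euclidean algorithm in $\mathbb{Q}[X]$, which is omitted here.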
We denote by $d$ the degree of $P$, that is, the degree of $K$ over $\mathbb{Q}$. We say that an element $\alpha$ of $K$ is effectively given/computable if in the representation
\[ \alpha = a_0 + a_1 \theta + \cdots + a_{d-1} \theta^{d-1} \tag{1.10.1} \]
of $\alpha$ the coefficients $a_0, \ldots, a_{d-1} \in \mathbb{Q}$ are effectively given/computable. (1.10.1) is regarded as the standard representation of $\alpha \in K$ with respect to $\theta$.

We shall need the following algorithmic results.

(I) If $\alpha, \beta \in K$ are effectively given/computable then $\alpha \pm \beta$, $\alpha\beta$ and $\alpha/\beta$ ($\beta \ne 0$) are also effectively computable; see e.g. Cohen (1993), section 4.2.

(II) One can determine effectively an integral basis of $K$, that is a $\mathbb{Z}$-module basis $\{1, \omega_2, \ldots, \omega_d\}$ of the ring of integers $\mathcal{O}_K$ of $K$, and from that the discriminant $D_K$ of $K$; see e.g. Cohen (1993), section 6.1. It is easy to see that if $\alpha \in K$ is effectively given then one can determine $b_1, \ldots, b_d$ in $\mathbb{Q}$ such that
\[ \alpha = b_1 + b_2 \omega_2 + \cdots + b_d \omega_d. \tag{1.10.2} \]
In particular, one can decide whether $\alpha$ is in $\mathcal{O}_K$.

(III) For any given $F \in K[X]$, one can factorize $F$ into irreducible polynomials over $K$; see Pohst and Zassenhaus (1989) or Cohen (1993), section 3.6. As a consequence, for given $F \in K[X]$ one can determine all zeros of $F$ in $K$.

(IV) If $\alpha \in K$ is effectively given, then its characteristic polynomial relative to $K/\mathbb{Q}$ and its monic minimal polynomial over $\mathbb{Q}$ can be effectively determined. Conversely, if a monic, irreducible polynomial $P(X)$ over $\mathbb{Q}$ is given, then one can decide whether any of its zeros belongs to $K$, and if so, all zeros of $P(X)$ in $K$ can be effectively determined. Consequently, if $K/\mathbb{Q}$ is normal, then all conjugates of any given $\alpha \in K$ can be effectively determined; see e.g. Győry (1983), remark 1.
(V) For given $C > 0$ one can determine a finite and effectively determinable subset $A$ of $K$ such that if $\alpha \in K$ and $h(\alpha) \le C$ then $\alpha \in A$. Indeed, representing $\alpha$ in the form (1.10.2) with an effectively given integral basis $\{1, \omega_2, \ldots, \omega_d\}$, taking conjugates with respect to $K/\mathbb{Q}$ and using Cramer's Rule, one can get an effective upper bound for $\max_i h(b_i)$. But, for such $b_i$, the numbers $b_1 + b_2 \omega_2 + \cdots + b_d \omega_d$ form a finite and effectively computable subset of $K$.

(VI) If $\alpha \in K$ is effectively given then one can effectively compute an upper bound for $h(\alpha)$. Indeed, by (IV) we can compute the minimal polynomial $P_\alpha(X) \in \mathbb{Z}[X]$ of $\alpha$ with relatively prime coefficients. Then (1.9.3) provides an upper bound for $h(\alpha)$.

We say that a fractional ideal $\mathfrak{a}$ of $\mathcal{O}_K$ is effectively given/computable if a finite set of generators of $\mathfrak{a}$ over $\mathcal{O}_K$ is effectively given/computable. For other representations of fractional ideals we refer to Pohst and Zassenhaus (1989), section 6.3 and Cohen (1993), section 4.7.

(VII) If a fractional ideal $\mathfrak{a}$ of $\mathcal{O}_K$ is effectively given then it can be decided whether $\mathfrak{a}$ is principal. Further, if it is, one can compute an $\alpha \in K$ such that $\mathfrak{a} = (\alpha)$; see Cohen (1993), section 6.5.

(VIII) For effectively given fractional ideals of $\mathcal{O}_K$ one can compute their sum, product and their absolute norms. Further, one can test equality, inclusion (i.e. divisibility) and whether a given element of $K$ is in a given fractional ideal; see e.g. Cohen (1993), section 4.7.

(IX) If $\mathfrak{a}$ is an effectively given fractional ideal of $\mathcal{O}_K$ then its prime ideal factorization can be effectively determined; see e.g. Cohen (2000), section 2.3. In particular, one can decide whether $\mathfrak{a}$ is an ideal of $\mathcal{O}_K$ or a prime ideal.
Let $S$ be a finite set of places of $K$ containing all infinite places. We say that $S$ is effectively given if the prime ideals in $S$ are effectively given. In what follows, we assume that $S$ is effectively given. We recall that $\mathcal{O}_S$ (resp. $\mathcal{O}_S^*$) denotes the ring of $S$-integers (resp. the group of $S$-units) in $K$.

(X) For an effectively given place $v \in M_K$ and an effectively given $\alpha \in K$, one can effectively compute $|\alpha|_v$. For the definition of an infinite place $v$ being effectively given, and the computation of $|\alpha|_v$, see Cohen (1993), section 4.2. For the computation of $|\alpha|_v$ for a finite place $v$, combine (IX) and (VIII) above.

(XI) In view of (IX) one can decide for any given $\alpha \in K^*$ whether $\alpha \in \mathcal{O}_S$ or $\alpha \in \mathcal{O}_S^*$.

(XII) A fundamental system of $S$-units can be effectively determined; see Cohen (2000), section 7.4. In particular, a fundamental system of units in $K$ can be effectively found; see Borevich and Shafarevich (1967), chapter 2, section 5 or Pohst and Zassenhaus (1989), section 5.7. Further, the roots of unity in $K$ can be effectively found; see e.g. Pohst and Zassenhaus (1989), section 5.4 or Cohen (1993), section 4.9.

(XIII) If $\varepsilon \in \mathcal{O}_S^*$ and a fundamental system of $S$-units $\{\varepsilon_1, \ldots, \varepsilon_{s-1}\}$ are effectively given, then one can determine effectively rational integers $b_1, \ldots, b_{s-1}$ and a root of unity $\zeta$ in $K$ such that (1.8.1) holds.

Proof. By Corollary 1.8.2, $\varepsilon$ can be written in the form (1.8.1). Let $S = \{v_1, \ldots, v_s\}$ be as in Theorem 1.8.1. Then (1.8.1) implies that
\[ \log|\varepsilon|_{v_i} = \sum_{j=1}^{s-1} b_j \log|\varepsilon_j|_{v_i} \quad \text{for } i = 1, \ldots, s. \]
Considering this as a system of linear equations in $b_1, \ldots, b_{s-1}$ and using Cramer's Rule, (1.8.5) and the fact that $|\log|\varepsilon|_{v_i}|$ and $|\log|\varepsilon_j|_{v_i}|$ can be
effectively bounded above for each $i$ and $j$, one can derive an effectively computable upper bound for $\max_j |b_j|$. Testing all possible values of $b_1, \ldots, b_{s-1}$, one can determine $b_1, \ldots, b_{s-1}$ such that $\varepsilon / (\varepsilon_1^{b_1} \cdots \varepsilon_{s-1}^{b_{s-1}})$ is a root of unity.

1.11 p-adic numbers

Let $p$ be a prime number. Recall that we have defined the absolute value $|\cdot|_p$ on $\mathbb{Q}$ by $|\cdot|_p := p^{-\operatorname{ord}_p(\cdot)}$, where $\operatorname{ord}_p(a)$ denotes the exponent of $p$ in the unique prime factorization of $a \in \mathbb{Q}^*$, and $\operatorname{ord}_p(0) = \infty$. We denote by $\mathbb{Q}_p$ the completion of $\mathbb{Q}$ with respect to $|\cdot|_p$ or, equivalently, with respect to $\operatorname{ord}_p$. Clearly, $\operatorname{ord}_p$ defines a discrete valuation on $\mathbb{Q}$, and hence on $\mathbb{Q}_p$. We define the ring of $p$-adic integers by $\mathbb{Z}_p := \{x \in \mathbb{Q}_p : \operatorname{ord}_p(x) \ge 0\}$.

Let $L$ be a finite extension of $\mathbb{Q}_p$. There is precisely one absolute value on $L$ that extends $|\cdot|_p$, given by $|N_{L/\mathbb{Q}_p}(\cdot)|_p^{1/[L:\mathbb{Q}_p]}$, and $L$ is complete with respect to this absolute value. We can extend $\operatorname{ord}_p$ to a valuation on $L$ by defining
\[ \operatorname{ord}_p(\alpha) := -\frac{\log|N_{L/\mathbb{Q}_p}(\alpha)|_p}{[L:\mathbb{Q}_p] \log p} \quad \text{for } \alpha \in L. \]
Clearly, $\operatorname{ord}_p(\alpha)$ is a rational number with denominator dividing $[L:\mathbb{Q}_p]$. As a consequence, the value set of $\operatorname{ord}_p$ on $L^*$ is a cyclic subgroup of $\mathbb{Q}$ containing $\mathbb{Z}$, say of the shape $e^{-1}\mathbb{Z}$, where $e$ is a positive integer. This integer $e$ is called the ramification index of $L$ over $\mathbb{Q}_p$. Any positive integer $e$ may occur as ramification index; for instance if $\alpha^e = p$, then $\mathbb{Q}_p(\alpha)$ has ramification index $e$ over $\mathbb{Q}_p$.

Now, let $\overline{\mathbb{Q}_p}$ denote an algebraic closure of $\mathbb{Q}_p$. Then the above considerations imply that $\operatorname{ord}_p$ extends uniquely to a valuation on $\overline{\mathbb{Q}_p}$, denoted also by $\operatorname{ord}_p$, with value group $\operatorname{ord}_p(\overline{\mathbb{Q}_p}^*) = \mathbb{Q}$. It can be shown that $\overline{\mathbb{Q}_p}$ is not complete with respect to $\operatorname{ord}_p$ but that the completion of $\overline{\mathbb{Q}_p}$ is algebraically closed (see Koblitz (1984), pp. 71–73). The ring of integers of $\overline{\mathbb{Q}_p}$ and its unit group are given by
\[ \overline{\mathbb{Z}_p} := \{x \in \overline{\mathbb{Q}_p} : \operatorname{ord}_p(x) \ge 0\}, \qquad \overline{\mathbb{Z}_p}^* = \{x \in \overline{\mathbb{Q}_p} : \operatorname{ord}_p(x) = 0\}. \]
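On $\mathbb{Q}$ itself, $\operatorname{ord}_p$ and $|\cdot|_p$ can be computed by stripping factors of $p$ from numerator and denominator. The sketch below covers rational inputs only; the function names are illustrative, and returning `float("inf")` for $\operatorname{ord}_p(0)$ is a convenience of this sketch.

```python
from fractions import Fraction


def ord_p(a, p):
    """Exponent of the prime p in the factorization of the rational number a."""
    a = Fraction(a)
    if a == 0:
        return float("inf")  # convention ord_p(0) = infinity
    num, den, k = a.numerator, a.denominator, 0
    while num % p == 0:
        num //= p
        k += 1
    while den % p == 0:
        den //= p
        k -= 1
    return k


def abs_p(a, p):
    """p-adic absolute value |a|_p = p^(-ord_p(a)), returned as an exact Fraction."""
    a = Fraction(a)
    if a == 0:
        return Fraction(0)
    return Fraction(p) ** (-ord_p(a, p))


a = Fraction(18, 25)  # 2 * 3^2 * 5^(-2)
print(ord_p(a, 3), ord_p(a, 5), abs_p(a, 5))  # 2 -2 25
```

For this `a`, multiplying the ordinary absolute value $18/25$ by $|a|_2 |a|_3 |a|_5 = \tfrac12 \cdot \tfrac19 \cdot 25$ gives 1, in line with the Product Formula on $\mathbb{Q}$.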
Let $K$ be an algebraic number field. Then any discrete valuation $v$ on $K$ lying above $\operatorname{ord}_p$ corresponds to a prime ideal of $\mathcal{O}_K$ above $p$, namely $\mathfrak{p} := \{x \in \mathcal{O}_K : v(x) > 0\}$. Further, for $x \in K^*$, $v(x)$ is precisely the exponent of $\mathfrak{p}$ in the unique prime ideal factorization of $(x)$, i.e., $v = \operatorname{ord}_{\mathfrak{p}}$. The completion of $K$ with respect to $\operatorname{ord}_{\mathfrak{p}}$, denoted by $K_{\mathfrak{p}}$, is a finite extension of $\mathbb{Q}_p$, and the ramification index of $K_{\mathfrak{p}}$ over $\mathbb{Q}_p$ is precisely the ramification index $e(\mathfrak{p}|p)$ of $\mathfrak{p}$ over $p$. Further, there is an embedding $\sigma : K \to \overline{\mathbb{Q}_p}$ such that $\operatorname{ord}_{\mathfrak{p}}(x) = e(\mathfrak{p}|p)\operatorname{ord}_p(\sigma(x))$ for $x \in K$.

We are going to define the $p$-adic logarithm. We start with some preliminaries.

Lemma 1.11.1 Let $\alpha \in \overline{\mathbb{Z}_p}^*$ and $c \in \mathbb{R}_{>0}$. Then there is a positive integer $m$ such that $\operatorname{ord}_p(\alpha^m - 1) > c$. This integer $m$ depends only on $p$, $c$ and the field $\mathbb{Q}_p(\alpha)$.

Proof. Let $L := \mathbb{Q}_p(\alpha)$ and define
\[ \mathfrak{o} := \{x \in L : \operatorname{ord}_p(x) \ge 0\}, \qquad \mathfrak{m} := \{x \in L : \operatorname{ord}_p(x) > c\}. \]
Then $\mathfrak{o}$ is the integral closure of $\mathbb{Z}_p$ in $L$ and $\mathfrak{m}$ is an ideal of $\mathfrak{o}$. Let $l$ be the smallest integer $> c$. Then $\mathfrak{m} \supseteq p^l \mathfrak{o}$. Since the additive structure of $\mathfrak{o}$ is that of a free $\mathbb{Z}_p$-module of rank $[L:\mathbb{Q}_p]$, the residue class ring $\mathfrak{o}/p^l\mathfrak{o}$ has cardinality $p^{l \cdot [L:\mathbb{Q}_p]}$. This shows that the residue class ring $\mathfrak{o}/\mathfrak{m}$ is finite. But then its unit group $(\mathfrak{o}/\mathfrak{m})^*$ is also finite, say of order $m$, and $\alpha^m - 1 \in \mathfrak{m}$. Clearly, $m$ depends only on $p$, $c$ and $L$.

Lemma 1.11.2
(i) Let $\alpha \in \overline{\mathbb{Z}_p}^*$ with $\operatorname{ord}_p(\alpha - 1) > 0$. Then the series
\[ \log_p(\alpha) := \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} (\alpha - 1)^n \]
converges to a limit in the field $\mathbb{Q}_p(\alpha)$.
(ii) Let $\alpha \in \overline{\mathbb{Z}_p}^*$ with $\operatorname{ord}_p(\alpha - 1) > 1/(p-1)$. Then $\operatorname{ord}_p(\log_p(\alpha)) = \operatorname{ord}_p(\alpha - 1)$.
(iii) Let $\alpha, \beta \in \overline{\mathbb{Z}_p}^*$ with $\operatorname{ord}_p(\alpha - 1) > 0$, $\operatorname{ord}_p(\beta - 1) > 0$. Then $\log_p(\alpha\beta) = \log_p(\alpha) + \log_p(\beta)$.

Proof.
The proofs of (i) and (iii) can be found in Koblitz (1984), section 4.1. To prove (ii), put $\kappa := \operatorname{ord}_p(\alpha - 1)$. Then, since $\kappa > 1/(p-1)$,
\[ \operatorname{ord}_p(\log_p(\alpha)) = \min(\kappa,\ p\kappa - 1,\ p^2\kappa - 2,\ \ldots) = \kappa. \]

We now define $\log_p$ on the whole group $\overline{\mathbb{Z}_p}^*$ as follows: take $\alpha \in \overline{\mathbb{Z}_p}^*$, choose a positive integer $m$ such that $\operatorname{ord}_p(\alpha^m - 1) > 0$ (which exists by Lemma 1.11.1) and put
\[ \log_p(\alpha) := \frac{1}{m} \log_p(\alpha^m). \]
By part (iii) of Lemma 1.11.2 this does not depend on the choice of $m$. Moreover,
\[ \log_p(\alpha\beta) = \log_p(\alpha) + \log_p(\beta) \quad \text{for } \alpha, \beta \in \overline{\mathbb{Z}_p}^*. \tag{1.11.1} \]

Proposition 1.11.3
(i) $\log_p$ defines a surjective group homomorphism from $\overline{\mathbb{Z}_p}^*$ to the additive group of $\overline{\mathbb{Q}_p}$ with kernel the roots of unity in $\overline{\mathbb{Q}_p}$.
(ii) Let $L$ be a finite extension of $\mathbb{Q}_p$. Then $\log_p$ defines a non-surjective homomorphism from $\overline{\mathbb{Z}_p}^* \cap L$ to the additive group of $L$.

Proof. (i) By (1.11.1), $\log_p$ defines a homomorphism from $\overline{\mathbb{Z}_p}^*$ to the additive group of $\overline{\mathbb{Q}_p}$. We determine the kernel of $\log_p$. First, let $\alpha$ be a root of unity from $\overline{\mathbb{Q}_p}$. Then $\alpha \in \overline{\mathbb{Z}_p}^*$. Further, there is a positive integer $m$ with $\alpha^m = 1$ and so $\log_p \alpha = m^{-1} \log_p(\alpha^m) = 0$. Now let $\alpha \in \overline{\mathbb{Z}_p}^*$ be an element which is not a root of unity. By Lemma 1.11.1, there exists a positive integer $m$ such that $\operatorname{ord}_p(\alpha^m - 1) > 1/(p-1)$. Then by part (ii) of Lemma 1.11.2,
\[ \operatorname{ord}_p(\log_p(\alpha)) = \operatorname{ord}_p(\log_p(\alpha^m)) - \operatorname{ord}_p(m) = \operatorname{ord}_p(\alpha^m - 1) - \operatorname{ord}_p(m) < \infty \tag{1.11.2} \]
and thus $\log_p(\alpha) \ne 0$. This proves that the kernel of $\log_p$ consists of the roots of unity of $\overline{\mathbb{Q}_p}$.

To prove the surjectivity of $\log_p$, we use the $p$-adic exponential. By, e.g., Koblitz (1984), section 4.1, for $\alpha \in \overline{\mathbb{Q}_p}$ with $\operatorname{ord}_p(\alpha) > p/(p-1)$, the series
\[ \exp_p(\alpha) := \sum_{n=0}^{\infty} \frac{\alpha^n}{n!} \]
converges to a limit in the field $\mathbb{Q}_p(\alpha)$.
Moreover, again by Koblitz (1984), section 4.1, for these $\alpha$ we have $\operatorname{ord}_p(\exp_p(\alpha) - 1) > 0$ and $\log_p(\exp_p(\alpha)) = \alpha$. Now let $\beta \in \overline{\mathbb{Q}_p}$ be arbitrary. Choose $k \in \mathbb{Z}_{>0}$ such that $k + \operatorname{ord}_p(\beta) > p/(p-1)$ and then $\alpha \in \overline{\mathbb{Q}_p}$ with $\alpha^{p^k} = \exp_p(p^k\beta)$. We have $\alpha \in \overline{\mathbb{Z}_p}^*$ since $\operatorname{ord}_p(\exp_p(p^k\beta) - 1) > 0$. Now, clearly,
\[ \log_p(\alpha) = p^{-k} \log_p(\exp_p(p^k\beta)) = \beta. \]
This proves the surjectivity of $\log_p$.

(ii) Let $\alpha \in \overline{\mathbb{Z}_p}^* \cap L$. By Lemma 1.11.1, there exists a positive integer $m$, depending only on $p$ and $L$, such that $\operatorname{ord}_p(\alpha^m - 1) > 1/(p-1)$. Then $\log_p(\alpha) = m^{-1}\log_p(\alpha^m) \in L$ and also, similarly to (1.11.2),
\[ \operatorname{ord}_p(\log_p(\alpha)) = \operatorname{ord}_p(\alpha^m - 1) - \operatorname{ord}_p(m) > \frac{1}{p-1} - \operatorname{ord}_p(m), \]
which is independent of $\alpha$. So $\log_p : \overline{\mathbb{Z}_p}^* \cap L \to L$ is certainly not surjective.

For the computation of $p$-adic logarithms of algebraic numbers, see de Weger (1989) and Smart (1998).

2 Algebraic function fields

By an algebraic function field in one variable over a field $k$, or, in short, function field over $k$, we mean a finitely generated field extension of transcendence degree 1 over $k$. We shall restrict ourselves to the case that $k$ is algebraically closed and of characteristic 0. Thus, if $K$ is a function field over $k$ and $z$ is any element from $K \setminus k$, then $K$ is a finite extension of the field of rational functions $k(z)$.

We have collected here the concepts and results that are used in our book. For further details and proofs, we refer to the books Eichler (1966) and Mason (1984) and to the paper Schmidt (1978).
2.1 Valuations

Let $k$ be an algebraically closed field of characteristic 0 and $K$ an algebraic function field over $k$. By a valuation on $K$ over $k$ we mean a discrete valuation with value group $\mathbb{Z}$ such that $v(x) = 0$ for $x \in k^*$, i.e., a surjective map $v : K \to \mathbb{Z} \cup \{\infty\}$ such that
\[ v(x) = \infty \iff x = 0; \]
\[ v(xy) = v(x) + v(y), \quad v(x + y) \ge \min(v(x), v(y)) \quad \text{for } x, y \in K; \]
\[ v(x) = 0 \quad \text{for } x \in k^*. \]
The corresponding local ring and maximal ideal of $v$ are given by
\[ \mathfrak{o}_v := \{x \in K : v(x) \ge 0\}, \qquad \mathfrak{m}_v := \{x \in K : v(x) > 0\}, \]
respectively. The quotient $\mathfrak{o}_v/\mathfrak{m}_v$ is a field, called the residue class field of $v$. Since $k$ is algebraically closed, we have $\mathfrak{o}_v/\mathfrak{m}_v = k$.

Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:27:57, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.004

By a local parameter of $v$ we mean an element $z_v$ of $K$ such that $v(z_v) = 1$. Then the completion of $K$ at $v$ is the field of formal Laurent series $k((z_v))$. Analogously to number fields, we denote the set of valuations on $K$ by $M_K$.

Let $K$ be a function field over $k$ and $L$ a finite extension of $K$. Let $v$ be a valuation on $K$. We say that a valuation $w$ of $L$ lies above $v$, notation $w|v$, if the restriction of $w$ to $K$ is a multiple of $v$. In that case, we have
\[ w(x) = e(w|v)\, v(x) \quad \text{for } x \in K, \]
where $e(w|v)$ is a positive integer, called the ramification index of $w$ over $v$.

First we describe the valuations on $k(z)$. For every element $a$ of $k$, each non-zero element $x$ of $k(z)$ may be expanded as a formal Laurent series
\[ \sum_{m=n}^{\infty} a_m (z - a)^m, \]
where $a_m$ is an element of $k$ and $a_n \ne 0$. Then $\operatorname{ord}_a$ defined by $\operatorname{ord}_a(x) := n$ is a valuation on $k(z)$, and the field of Laurent series $k((z - a))$ is the completion of $k(z)$ at $\operatorname{ord}_a$. Similarly, we define a valuation $\operatorname{ord}_\infty$ on $k(z)$ by expanding $x \in k(z)$ as a Laurent series in $z^{-1}$. In particular, $\operatorname{ord}_\infty(x) = -\deg(x)$ for $x \in k[z]$. The completion of $k(z)$ at $\operatorname{ord}_\infty$ is $k((z^{-1}))$.
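For a rational function $x = p/q$ the valuations just described reduce to root multiplicities and degrees: $\operatorname{ord}_a(x)$ is the multiplicity of $a$ as a zero of $p$ minus its multiplicity as a zero of $q$, and $\operatorname{ord}_\infty(x) = \deg q - \deg p$. The sketch below is only an illustration under simplifying assumptions: coefficients and the point $a$ are rational numbers (so it works inside $\mathbb{Q}(z)$ rather than over an algebraically closed $k$), polynomials are coefficient lists with lowest degree first, and all function names are our own.

```python
from fractions import Fraction


def _trim(poly):
    """Coefficient list (lowest degree first) without trailing zeros."""
    coeffs = [Fraction(c) for c in poly]
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()
    if not coeffs:
        raise ValueError("the zero polynomial has no finite order")
    return coeffs


def root_multiplicity(poly, a):
    """Multiplicity of a as a root of the non-zero polynomial poly."""
    coeffs, a, mult = _trim(poly), Fraction(a), 0
    while len(coeffs) > 1:
        # Synthetic division by (z - a), running from the top coefficient down.
        quotient, carry = [], Fraction(0)
        for c in reversed(coeffs):
            carry = carry * a + c
            quotient.append(carry)
        remainder = quotient.pop()
        if remainder != 0:
            break
        coeffs = quotient[::-1]
        mult += 1
    return mult


def ord_a(p, q, a):
    """ord_a(p/q) for a finite point a."""
    return root_multiplicity(p, a) - root_multiplicity(q, a)


def ord_inf(p, q):
    """ord_infinity(p/q) = deg q - deg p."""
    return (len(_trim(q)) - 1) - (len(_trim(p)) - 1)


# x = z^2 (z - 1) / (z + 2)^2
p, q = [0, 0, -1, 1], [4, 4, 1]
print(ord_a(p, q, 0), ord_a(p, q, 1), ord_a(p, q, -2), ord_inf(p, q))  # 2 1 -2 -1
```

In this example the four non-zero orders add up to $2 + 1 - 2 - 1 = 0$.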
The valuations $\operatorname{ord}_a$ ($a \in k \cup \{\infty\}$) provide all valuations on $k(z)$. These valuations satisfy the Sum Formula
\[ \sum_{a \in k \cup \{\infty\}} \operatorname{ord}_a(x) = 0 \quad \text{for } x \in k(z)^*. \]

Now, let $K$ be an algebraic function field over $k$. We give a concrete description of the valuations on $K$ by means of Puiseux expansions. To this end, fix $z \in K \setminus k$, so that $K$ is a finite extension of $k(z)$. Put $d := [K : k(z)]$. The function field $K$ has a primitive element $y$ over $k(z)$ which satisfies an irreducible equation
\[ P(y, z) = y^d + p_1(z) y^{d-1} + \cdots + p_d(z) = 0 \tag{2.1.1} \]
with coefficients $p_i(z)$ in $k(z)$. If $Q(z)$ is the common denominator of the rational functions $p_i(z)$, then $y_1 := Q(z) y$ satisfies an equation of the form (2.1.1) with coefficients from $k[z]$. Replacing $y$ by $y_1$, we may assume that in (2.1.1) $y$ is a primitive element of $K$ with polynomials $p_i(z) \in k[z]$.

The field $K$ may be embedded both in the field of fractional power series in $z - a$, where $a$ is an arbitrary element of $k$, and in the field of fractional power series in $z^{-1}$. These fields are all algebraically closed. Every element $x$ of $K$ may be expressed in a unique way in the form
\[ x = \sum_{i=1}^{d} q_i(z) y^{i-1} \]
with some $q_1, \ldots, q_d$ in $k(z)$. Hence, to expand the functions of $K$ in power series in $z - a$ or $z^{-1}$, it suffices to so expand the single function $y$. We recall Puiseux's classical theorem.

Theorem 2.1.1 For each element $a$ of $k$, there are positive integers $r_a \le d$ and $e_1, \ldots, e_{r_a}$ with $e_1 + \cdots + e_{r_a} = d$, and formal Puiseux series
\[ y_\rho = \sum_{m=n_\rho}^{\infty} a_{\rho m} (z - a)^{m/e_\rho}, \quad a_{\rho n_\rho} \ne 0, \quad \rho = 1, \ldots, r_a \]
with coefficients $a_{\rho m}$ in $k$, that satisfy (2.1.1). Further, if $\zeta$ is a primitive $e_\rho$-th root of unity and $y_{\rho j} = \sum_{m=n_\rho}^{\infty} a_{\rho m} \zeta^{jm} (z - a)^{m/e_\rho}$ ($j = 1$, . . .
, $e_\rho - 1$), then the left-hand side of (2.1.1) is identical with
\[ P(y, z) = \prod_{\rho, j} (y - y_{\rho j}). \]
A similar assertion holds with $z^{-1}$ instead of $z - a$.

Proof. See, e.g., Eichler (1966), chapter III, section 1.

For each $a$ in $k$ and each $\rho$, $j$ as above, the map $\varphi_{\rho j} : y \mapsto y_{\rho j}$ determines uniquely an embedding of $K$ into the field of formal Laurent expansions in powers of $(z - a)^{1/e_\rho}$, i.e., for $x \in K^*$ we have
\[ \varphi_{\rho j}(x) = \sum_{m=n}^{\infty} a_m (z - a)^{m/e_\rho} \quad \text{with } a_m \in k \text{ for } m \ge n \text{ and } a_n \ne 0. \]
For every $\rho$ with $1 \le \rho \le r_a$, we construct a valuation $v$ on $K$ as follows: choose any $j$ with $1 \le j \le e_\rho$. Then, for any $x \in K^*$, we define $v(x) := n$ in the above Laurent series expression for $\varphi_{\rho j}(x)$. Notice that the valuation $v$ on $K$ lies above $u := \operatorname{ord}_a$, that $(z - a)^{1/e_\rho}$ is a local parameter for $v$, and that $e_\rho$ is the ramification index $e(v|u)$ of $v$ over $u$. The above construction gives all extensions of $u = \operatorname{ord}_a$ to $K$.

One can construct in a similar way the extensions $v$ of $\operatorname{ord}_\infty$ to $K$. Each of these $v$ is defined as the order of vanishing of the Laurent expansion in a local parameter $z^{-1/e_\rho}$. In this way all valuations $v$ of $K$ are described. For convenience we say that $v$ lies above $a$ ($a \in k \cup \{\infty\}$) if it lies above $\operatorname{ord}_a$ and write $e(v|a)$ for $e(v|\operatorname{ord}_a)$. Notice that for all $a \in k \cup \{\infty\}$ we have
\[ \sum_{v|a} e(v|a) = [K : k(z)], \]
where the sum is taken over all valuations $v$ of $K$ lying above $a$.

We call $v \in M_K$ infinite with respect to $z$ if it lies above $\infty$, i.e., if $v(z) < 0$. We denote this by $v \mid \infty$; otherwise, we say that $v$ is finite with respect to $z$. To get a uniform notation, if $v$ lies above $a$ and corresponds to the pair $(\rho, j)$, we write $z_v$ for $(z - a)^{1/e_\rho}$ if $a \ne \infty$ and for $z^{-1/e_\rho}$ if $a = \infty$, and $y_v$ for $y_{\rho j}$.
Thus, for every valuation $v$ of $K$, $z_v$ is a local parameter for $v$, and $y \mapsto y_v$ defines an isomorphic embedding $\varphi_v : K \to k((z_v))$.

The valuations defined above have the following properties:
\[ v(x) = 0 \text{ for all } v \in M_K \iff x \in k^*, \tag{2.1.2} \]
\[ \sum_{v \in M_K} v(x) = 0 \quad \text{for } x \in K^* \quad \text{(Sum Formula)}. \tag{2.1.3} \]
For each non-zero $x \in K$, only finitely many summands are non-zero.

Let $L$ be a finite extension of $K$. On $L$ we define valuations in the same way as for $K$. Then
\[ \sum_{w|v} e(w|v) = [L : K] \quad \text{for } v \in M_K, \tag{2.1.4} \]
where the sum is taken over all valuations $w$ of $L$ lying above $v$. More generally, we have the Extension Formula
\[ \sum_{w|v} w(x) = v(N_{L/K}(x)) \quad \text{for } x \in L. \tag{2.1.5} \]

2.2 Heights

Let again $K$ be an algebraic function field in one variable over an algebraically closed field $k$ of characteristic 0 and $M_K$ its set of valuations over $k$. For a vector $\mathbf{x} = (x_1, \ldots, x_n) \in K^n \setminus \{0\}$ we define
\[ v(\mathbf{x}) := -\min(v(x_1), \ldots, v(x_n)) \quad \text{for } v \in M_K, \]
and then
\[ H_K^{\mathrm{hom}}(\mathbf{x}) = H_K^{\mathrm{hom}}(x_1, \ldots, x_n) := \sum_{v} v(\mathbf{x}), \]
where as usual $\sum_v$ indicates that the sum is taken over all valuations $v \in M_K$. This is called the homogeneous height of $\mathbf{x}$ with respect to $K$. The height $H_K^{\mathrm{hom}}$ may be viewed as the function field analogue of the logarithmic height $h_L^{\mathrm{hom}}(\mathbf{x}) := \sum_v \log\max_i |x_i|_v$ for $\mathbf{x} = (x_1, \ldots, x_n) \in L^n \setminus \{0\}$ relative to a number field $L$; this is $[L : \mathbb{Q}]$ times the absolute logarithmic height $h^{\mathrm{hom}}$ defined in Section 1.9. It has become common practice to denote function field heights by capital $H$.

By the Sum Formula we have
\[ H_K^{\mathrm{hom}}(\lambda \mathbf{x}) = H_K^{\mathrm{hom}}(\mathbf{x}) \quad \text{for } \lambda \in K^*. \tag{2.2.1} \]
For instance, let $p_1, \ldots, p_n \in k[z]$ with $\gcd(p_1, \ldots, p_n) = 1$. Then
\[ H_{k(z)}^{\mathrm{hom}}(p_1, \ldots, p_n) = \max(\deg p_1, \ldots, \deg p_n). \]
(2.2.2)

If $L$ is a finite extension of $K$, the valuations on $L$ may be constructed as above, and the height in $L$ may be defined accordingly. Furthermore, for $\mathbf{x} = (x_1, \ldots, x_n) \in K^n \setminus \{0\}$ we have
\[ H_L^{\mathrm{hom}}(\mathbf{x}) = [L : K]\, H_K^{\mathrm{hom}}(\mathbf{x}). \tag{2.2.3} \]
We define a height for elements of $K$ by
\[ H_K(x) := H_K^{\mathrm{hom}}(1, x) = -\sum_{v} \min(0, v(x)) \quad \text{for } x \in K. \tag{2.2.4} \]
For instance, if $K = k(z)$ and $x = p/q$ where $p$, $q$ are coprime polynomials from $k[z]$, then $H_{k(z)}(x) = \max(\deg p, \deg q)$. From (2.1.4) and (2.1.5), one deduces that for any finite extension $L$ of $K$,
\[ H_L(x) = [L : K] \cdot H_K(x) \quad \text{for } x \in K, \tag{2.2.5} \]
where $H_L(x)$ denotes the height of $x$ with respect to $L$.

We mention some properties of the height $H_K$. It is evident from (2.1.2) and (2.1.3) that
\[ H_K(x) \ge 0 \text{ for } x \in K, \qquad H_K(x) = 0 \iff x \in k. \]
Further, from simple manipulations with valuations and from the Sum Formula it follows that
\[ H_K(x^m) = |m|\, H_K(x) \quad \text{for } x \in K^*,\ m \in \mathbb{Z}, \tag{2.2.6} \]
and
\[ H_K(x + y),\ H_K(xy) \le H_K(x) + H_K(y) \quad \text{for } x, y \in K. \tag{2.2.7} \]
Next, from (2.2.4) and (2.2.6) it follows that
\[ H_K(x) = \tfrac{1}{2}\big( H_K(x) + H_K(x^{-1}) \big) = \tfrac{1}{2} \sum_{v \in M_K} |v(x)| \ge \tfrac{1}{2} |S| \quad \text{for } x \in K^*, \tag{2.2.8} \]
where $S$ is the set of valuations $v \in M_K$ for which $v(x) \ne 0$.

Let $P \in K[X_1, \ldots, X_r]$ be a non-zero polynomial, and $\{p_1, \ldots, p_n\}$ the set of non-zero coefficients of $P$. We define
\[ v(P) := v(p_1, \ldots, p_n) = -\min(v(p_1), \ldots, v(p_n)) \quad \text{for } v \in M_K, \]
\[ H_K^{\mathrm{hom}}(P) := \sum_{v} v(P). \]
We have obvious analogues of (2.2.1), (2.2.2) and (2.2.5) for polynomials. By Gauss' Lemma for valuations (the method of proof is similar to that of Bombieri and Gubler (2006), lemma 1.6.3) we have for any two polynomials $P, Q \in K[X_1, \ldots, X_r]$,
\[ v(PQ) = v(P) + v(Q) \quad \text{for } v \in M_K, \]
hence
\[ H_K^{\mathrm{hom}}(PQ) = H_K^{\mathrm{hom}}(P) + H_K^{\mathrm{hom}}(Q). \]
(2.2.9)

Suppose that $P = f_0 (X - \alpha_1) \cdots (X - \alpha_g)$ with $f_0, \alpha_1, \ldots, \alpha_g \in K$. Then by (2.2.9) and the Sum Formula, applied to $f_0$, we obtain
\[ H_K^{\mathrm{hom}}(P) = \sum_{i=1}^{g} H_K(\alpha_i) \ge \max_{1 \le i \le g} H_K(\alpha_i). \tag{2.2.10} \]

2.3 Derivatives and genus

For the moment, let $L$ be any field extension, not necessarily of finite type, which has transcendence degree 1 over an algebraically closed field $k$ of characteristic 0. The $L$-vector space $\Omega(L/k)$ of differentials of $L$ over $k$ may be constructed as follows. We start by taking a variable $\delta_x$ for every $x \in L$. Then let $V$ be the $L$-vector space consisting of all finite formal linear combinations $\sum_i y_i \delta_{x_i}$ with $y_i, x_i \in L$, and let $V_0$ be the $L$-linear subspace of $V$ generated by $\delta_{x+y} - \delta_x - \delta_y$ and $\delta_{xy} - x\delta_y - y\delta_x$ for $x, y \in L$ and $\delta_x$ for $x \in k$. Then define $\Omega(L/k) := V/V_0$. For $x \in L$ denote by $dx$ the residue class of $\delta_x$ modulo $V_0$. Thus, $\Omega(L/k)$ consists of all finite linear combinations $\omega = \sum_i y_i\, dx_i$, where $x_i, y_i \in L$, and we have $dx \ne 0$ for $x \in L \setminus k$, $dx = 0$ for $x \in k$, and
\[ d(x + y) = dx + dy, \qquad d(xy) = x\, dy + y\, dx \quad \text{for } x, y \in L. \]
Consequently, $d(\lambda x) = \lambda\, dx$ for $x \in L$, $\lambda \in k$. It is clear that if $L'$ is a subfield of $L$ then, up to isomorphism, $\Omega(L'/k)$ is contained in $\Omega(L/k)$.

For any $x, y \in L$ with $y \notin k$, there is an irreducible polynomial $Q \in k[X, Y]$ such that $Q(x, y) = 0$. Then we have
\[ \frac{\partial Q}{\partial X}(x, y)\, dx + \frac{\partial Q}{\partial Y}(x, y)\, dy = 0. \]
Hence there exists a function in $L$ (in fact in $k(x, y)$), which we denote by $dx/dy$, such that
\[ dx = \frac{dx}{dy} \cdot dy. \]
We call $dx/dy$ the derivative of $x$ with respect to $y$. Notice that we have the chain rule
\[ \frac{dx}{dz} = \frac{dx}{dy} \cdot \frac{dy}{dz} \]
for any $x, y, z \in L$ with $y, z \notin k$. As a consequence, if we fix $z \in L \setminus k$, then every differential $\omega \in \Omega(L/k)$ can be expressed as $(\omega/dz) \cdot dz$ with $\omega/dz \in L$.
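As a small worked illustration of the derivative $dx/dy$ (the curve is our own choice, not an example from the text), suppose $x, y \in L \setminus k$ satisfy $y^2 = x^3 + 1$. Then $Q = Y^2 - X^3 - 1$ is irreducible in $k[X, Y]$ with $Q(x, y) = 0$, and the relation between $dx$ and $dy$ reads:

```latex
\begin{gather*}
\frac{\partial Q}{\partial X}(x,y)\,dx + \frac{\partial Q}{\partial Y}(x,y)\,dy
  = -3x^{2}\,dx + 2y\,dy = 0, \\
\frac{dx}{dy} = \frac{2y}{3x^{2}} \in k(x,y).
\end{gather*}
```

The division is legitimate because $x \notin k$ forces $x \ne 0$, and $k$ has characteristic 0.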
Now let again $K$ be a function field in one variable over $k$, i.e., a finite type extension of transcendence degree 1 over $k$. For every valuation $v \in M_K$ we choose a local parameter $z_v$. Let $x \in K^*$. Then for $v \in M_K$ we can express $x$ as a formal Laurent series $\sum_{i=n_0}^{\infty} a_i z_v^i$ with $n_0 = v(x)$, $a_i \in k$ for $i \ge n_0$ and $a_{n_0} \ne 0$, and then $dx/dz_v = \sum_{i=n_0}^{\infty} i a_i z_v^{i-1}$. As a consequence,
\[ v\Big( \frac{dx}{dz_v} \Big) = v(x) - 1 \quad \text{for any } x \in K \text{ with } v(x) \ne 0, \tag{2.3.1} \]
\[ v\Big( \frac{dx}{dz_v} \Big) \ge 0 \quad \text{if } v(x) = 0. \tag{2.3.2} \]
This shows that $v(dx/dz_v)$ is independent of the choice of $z_v$. Indeed, let $z_v'$ be another local parameter for $v$. Then $v(z_v') = 1$, hence $v(dz_v'/dz_v) = 0$, and so $v(dx/dz_v) = v(dx/dz_v')$.

A differential $\omega$ of $K$ over $k$ is called holomorphic if $v(\omega/dz_v) \ge 0$ for all $v \in M_K$; this notion is independent of the choice of the $z_v$. It can be shown that the holomorphic differentials of $K$ over $k$ form a finite dimensional $k$-vector space. The dimension of this space is called the genus of $K$ over $k$, denoted by $g_{K/k}$.

Let $x \in K \setminus k$ be arbitrary. It follows from the Sum Formula (2.1.3) and the chain rule $dx/dz_v = (dx/dz) \cdot (dz/dz_v)$ that
\[ \sum_{v} v\Big( \frac{dx}{dz_v} \Big) = \sum_{v} v\Big( \frac{dz}{dz_v} \Big) \]
is independent of $x$, provided the right-hand side of the equality is finite. We need only the following special case of the Riemann–Roch Theorem.

Theorem 2.3.1 We have
\[ \sum_{v} v\Big( \frac{dx}{dz_v} \Big) = 2 g_{K/k} - 2 \]
for every $x \in K \setminus k$.

It is not difficult to check that $k(z)$ has genus 0. The following genus estimate will be useful.

Proposition 2.3.2 Let $z \in K \setminus k$, let $F = X^g + f_1 X^{g-1} + \cdots + f_g$ with coefficients $f_1, \ldots, f_g \in k[z]$, and suppose that $K$ is the splitting field of $F$ over $k(z)$. Then
\[ g_{K/k} \le (d - 1)\, g \max(\deg f_1, \ldots, \deg f_g), \]
where $d := [K : k(z)]$.

Proof. This is Lemma H in Schmidt (1978).
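The remark that $k(z)$ has genus 0 can be checked from Theorem 2.3.1 with $x = z$. The computation below is a sketch using the local parameters $z_v = z - a$ at $\operatorname{ord}_a$ and $z_v = z^{-1}$ at $\operatorname{ord}_\infty$; this choice of parameters is ours, made for convenience.

```latex
\begin{gather*}
v = \operatorname{ord}_a \ (a \in k): \quad z_v = z - a, \quad
  \frac{dz}{dz_v} = 1, \quad v\!\left(\frac{dz}{dz_v}\right) = 0, \\
v = \operatorname{ord}_\infty: \quad z_v = z^{-1}, \quad z = z_v^{-1}, \quad
  \frac{dz}{dz_v} = -z_v^{-2}, \quad v\!\left(\frac{dz}{dz_v}\right) = -2, \\
\sum_{v} v\!\left(\frac{dz}{dz_v}\right) = -2 = 2 g_{k(z)/k} - 2
  \quad\Longrightarrow\quad g_{k(z)/k} = 0.
\end{gather*}
```

Only the valuation at infinity contributes, so the sum is finite, as the statement preceding Theorem 2.3.1 requires.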
2.4 Effective computations

In order to perform effective computations in the function field $K$, it is necessary to assume that the ground field $k$ is presented explicitly in the sense of Fröhlich and Shepherdson (1956). This means here that there is an algorithm to determine the zeros of any polynomial with coefficients in $k$. In particular, in this case we can perform all the field operations with elements of $k$.

Further, we assume that $K$ is presented explicitly. This means that $K$ is given in the form $k(z)(y)$, with $z$ a variable, and $y$ a primitive element of $K$ over $k(z)$, with an explicitly given defining polynomial $y^d + p_1(z) y^{d-1} + \cdots + p_d(z)$ over $k(z)$. We may assume that $y$ is integral over $k[z]$, that is, that $p_1(z), \ldots, p_d(z)$ are polynomials with coefficients in $k$. Every element $x$ of $K$ can be expressed uniquely in the form
\[ \sum_{i=1}^{d} \frac{q_i(z)}{q(z)} \cdot y^{i-1}, \]
where $q_1, \ldots, q_d, q$ are polynomials of $k[z]$ such that $\gcd(q, q_1, \ldots, q_d) = 1$ and $q$ is monic. We call $(q_1, \ldots, q_d, q)$ a representation for $x$, and we say that $x$ is given explicitly if a representation for $x$ is given explicitly, and that $x$ can be determined effectively from certain given input data if there is an algorithm to determine a representation for $x$ from these data. From representations for elements $x_1, x_2 \in K$ one can determine representations for $x_1 \pm x_2$, $x_1 x_2$ and $x_1/x_2$, if $x_2 \ne 0$.

One can easily compute a minimal polynomial of an explicitly given $x$ over $k[z]$, i.e., a polynomial $F = f_0 X^r + f_1 X^{r-1} + \cdots + f_r \in k[z][X]$ of minimal degree such that $F(x) = 0$ and $\gcd(f_0, \ldots, f_r) = 1$. Indeed, one starts by computing representations for $x^2, \ldots, x^d$.
Then by straightforward linear algebra one can determine the smallest $r$, which is $\le d$, for which there exist $g_0, \ldots, g_r \in k(z)$, not all 0, such that $g_0 + g_1 x + \cdots + g_r x^r = 0$, and having found such, one obtains a minimal polynomial of $x$ by clearing denominators.

It is important to note that if $k$ and $K$ are presented explicitly, then the valuations of $K$ can be described explicitly. Specifically, in Section 2.1 we gave, for every valuation $v$ of $K$, a local parameter $z_v$ for $v$ as well as a Laurent series $y_v$ in $z_v$, such that $y \mapsto y_v$ gives rise to an isomorphic embedding of $K$ into $k((z_v))$. The pair $(z_v, y_v)$ can be determined from the defining polynomial of $y$ and the element of $k \cup \{\infty\}$ above which $v$ lies. By determining $y_v$ we mean that by an inductive procedure we can determine the coefficients of $y_v$ one by one. We say that the valuation $v$ is given explicitly, if the pair $(z_v, y_v)$ is given, i.e., the inductive procedure to compute the coefficients of $y_v$ is given. If $x \in K$ and the valuation $v$ are given, then we can express $x$ as a Laurent series in $z_v$ by substituting $y_v$ for $y$ in the expression $\sum_{i=1}^{d} (q_i(z)/q(z)) \cdot y^{i-1}$ for $x$ and by expressing $z$ as a Laurent series in $z_v$. Then we can compute $v(x)$ by searching for the first non-zero coefficient in the Laurent series expansion for $x$. Further details may be found in Eichler (1966), chapter III, section 1 and Mason (1984), chapter V.

We recall a result of Mason (1984), p. 11, lemma 1.

Proposition 2.4.1 Suppose that $k$, $K$ are presented explicitly, and a finite set $S$ of valuations of $K$ and integers $n_v$ ($v \in S$) are explicitly given. Then we can determine effectively whether there exists an element $x$ in $K$ such that

\[ v(x) = n_v \ \text{ for } v \in S, \qquad v(x) = 0 \ \text{ for } v \in M_K \setminus S. \tag{2.4.1} \]

Moreover, if such an $x$ exists then it may be computed, and it is unique up to a non-zero factor in $k$.

Proof. We do not lose any generality by augmenting $S$ with a finite set of explicitly given valuations and setting $n_v := 0$ for the added valuations.
So, without loss of generality, we may assume that $S$ contains all valuations that lie above $\{\infty, a_1, \ldots, a_t\}$, where $a_1, \ldots, a_t$ are certain elements of $k$. Further, it is enough to prove the assertion for the case when $n_v \ge 0$ for all finite $v \in S$, i.e., not lying above $\infty$; then the elements $x$ under consideration are integral over $k[z]$. For assume that $x$ satisfies (2.4.1). Denote by $e_v$ the ramification index of $v$ over the element of $k \cup \{\infty\}$ over which it lies. Choose an integer $m$ such that $m > 0$ and $m + n_v \ge 0$ for $v \in S$, and put $q(z) := \prod_{j=1}^{t} (z - a_j)^m$. Then if $v \mid \infty$ we have

\[ v(q(z)x) = n_v - e_v m t =: n_v', \]

while if $v \mid a_i$ for some $i$ we have

\[ v(q(z)x) = n_v + e_v m =: n_v' \ge 0. \]

Further, for $v$ outside $S$ we have $v(q(z)) = 0$ and so $v(q(z)x) = 0$. Clearly, $q(z)$ can be determined effectively. So it suffices to prove our assertion with $n_v'$ ($v \in S$) instead of $n_v$.

Recall that $K$ is explicitly given in the form $k(z)(y)$, with an explicitly given minimal polynomial of $y$ over $k(z)$ which is monic and has its coefficients in $k[z]$. Consider now the system of equations (2.4.1) in $x$, where it is assumed that $n_v \ge 0$ for $v \in S$ with $v$ finite. Thus, the elements $x \in K$ under consideration are integral over $k[z]$. Each such $x$ can be expressed in a unique way in the form

\[ x = \sum_{i=1}^{d} q_i(z) y^{i-1} \]

with $q_i(z) \in k(z)$, $i = 1, \ldots, d$. Denote by $\sigma_1, \ldots, \sigma_d$ the distinct embeddings of $K$ over $k(z)$ in a fixed algebraic closure of $K$. Then we infer that

\[ \sigma_j(x) = \sum_{i=1}^{d} q_i(z) (\sigma_j(y))^{i-1} \quad \text{for } j = 1, \ldots, d. \tag{2.4.2} \]

Let $D = \det(\sigma_j(y)^{i-1})^2 = \prod_{1 \le i < j \le d} (\sigma_i(y) - \sigma_j(y))^2$, i.e., $D$ is the discriminant of $y$. It has an explicit expression as a polynomial with integer coefficients in terms of the coefficients of the minimal polynomial of $y$.
Hence it belongs to $k[z]$ and is effectively computable. Since by assumption $k$ is presented explicitly, we can determine the zeros of $D$ in $k$ together with their multiplicities. Hence we can give all valuations of $K$ lying above the zeros of $D$ explicitly, and for each such $v$ we can determine $v(D)$. We may augment $S$ with these valuations. Thus, without loss of generality, $v(D) = 0$ for $v$ outside $S$.

It follows from (2.4.2) that, for each $i$, $Dq_i(z)$ may be expressed as a polynomial with integer coefficients in the $\sigma_j(x)$ and $\sigma_j(y)$. But $\sigma_j(x)$ and $\sigma_j(y)$ are integral over $k[z]$ for each $j$, hence $Dq_i(z) \in k[z]$ for all $i$. By selecting $\sigma_1, \ldots, \sigma_d$ to be the embeddings corresponding to the infinite valuations on $K$, we may determine an integer $u$ depending only on the integers $n_v$ ($v \mid \infty$) such that $\mathrm{ord}_\infty(Dq_i(z)) \ge -u$ for $1 \le i \le d$, and hence each $Dq_i(z)$ is a polynomial of degree at most $u$. Consequently, we may write

\[ Dx = \sum_{j=0}^{u} \sum_{i=1}^{d} a_{ij} z^j y^{i-1} \tag{2.4.3} \]

with some $a_{ij}$ in $k$ to be determined.

We now prove that, assuming that $x$ satisfies (2.4.3) and $x \neq 0$, we can replace (2.4.1) by the finite list of conditions

\[ v(x) \ge n_v \quad \text{for } v \in S. \tag{2.4.4} \]

Indeed, by the Sum Formula we must have $\sum_{v \in S} n_v = 0$; otherwise, (2.4.1) is not solvable. By (2.4.3) we have $v(Dx) \ge 0$ for every finite valuation $v$. So $v(x) \ge 0$ for every valuation $v$ outside $S$. Suppose that $v(x) > n_v$ for some $v \in S$ or $v(x) > 0$ for some $v$ outside $S$. Then

\[ 0 = \sum_{v \in M_K} v(x) > \sum_{v \in S} n_v = 0, \]

a contradiction. So (2.4.1) is equivalent to the combination of (2.4.3) and (2.4.4).
By replacing in (2.4.3) $z$ and $y$ by their Laurent expansions in terms of $z_v$, we obtain for $Dx$ an expansion

\[ \sum_{m=m_v}^{\infty} L_{vm}(\mathbf{a}) z_v^m, \]

where every term $L_{vm}(\mathbf{a})$ is a linear form with coefficients in $k$ in the coefficients $a_{ij}$ in (2.4.3). Thus, $x$ satisfies (2.4.3) and (2.4.4) if and only if the $a_{ij}$ satisfy the finite system of linear equations

\[ L_{vm}(\mathbf{a}) = 0 \quad \text{for } v \in S,\ m = m_v, m_v + 1, \ldots, n_v + v(D) - 1. \]

Now we can decide whether this system of linear equations has a non-zero solution in $k$, and if so, compute one. Consequently, we may determine whether there exists an element $x$ in $K$ with (2.4.1) and if so, compute such an $x$.

Finally, if there are two elements $x_1$ and $x_2$ in $K$ which satisfy (2.4.1), then $v(x_1/x_2) = 0$ for all $v$, whence $x_1/x_2 \in k$. Thus, $x$ is unique apart from a non-zero factor from $k$. This completes the proof.

Proposition 2.4.2 Let $a_1, \ldots, a_r, b$ be explicitly given elements of $K$. Then it can be determined effectively whether

\[ \xi_1 a_1 + \cdots + \xi_r a_r = b \tag{2.4.5} \]

is solvable in $(\xi_1, \ldots, \xi_r) \in k^r$. If so, the set of solutions in $k^r$ of (2.4.5) is a linear variety of dimension $r - \mathrm{rank}_k(a_1, \ldots, a_r)$, a parameter representation of which can be determined effectively.

Proof. First assume that $a_1, \ldots, a_r$ are linearly independent over $k$. Recall that from any given $x \in K$, we can effectively determine its derivatives $x^{(j)} := d^j x/dz^j$ for all $j \ge 0$. Now, clearly, if $\xi_1, \ldots, \xi_r \in k$ satisfy (2.4.5), then

\[ \xi_1 a_1^{(j)} + \cdots + \xi_r a_r^{(j)} = b^{(j)} \quad (j = 0, \ldots, r - 1). \]

Since $a_1, \ldots, a_r$ are linearly independent over $k$, the Wronskian determinant $\det\big(a_i^{(j-1)}\big)_{i,j=1,\ldots,r}$ is non-zero. Hence the latter system has a unique solution $(\xi_1, \ldots, \xi_r) \in K^r$ which can be determined effectively. Then it can be checked whether this solution belongs to $k^r$.

Now suppose that $\mathrm{rank}_k\{a_1, \ldots, a_r\} = m < r$. By means of the above procedure, we can select a $k$-linearly independent subset of $m$ elements from $\{a_1, \ldots, a_r\}$ and express the other elements as $k$-linear combinations of this subset. Assume that $\{a_1, \ldots, a_m\}$ is $k$-linearly independent. Check with the above procedure whether $b$ is a $k$-linear combination of $a_1, \ldots, a_m$. If so, express $b$ and $a_{m+1}, \ldots, a_r$ as $k$-linear combinations of $a_1, \ldots, a_m$, substitute these into (2.4.5) and compare the coefficients of $a_1, \ldots, a_m$. Thus, one can rewrite (2.4.5) as a system of linear equations with coefficients in $k$, whose solution set is a linear variety of dimension $r - m$, and it is straightforward to compute a parameter representation of the latter.

3 Tools from Diophantine approximation and transcendence theory

In this chapter, we have collected some fundamental results from Diophantine approximation and transcendence theory on which the main results of this book are based. Section 3.1 is about Schmidt’s Subspace Theorem and its variations. These will be applied in Chapter 6. In Section 3.2 we recall the best known effective estimates for linear forms in logarithms, which are used in Chapters 4 and 5. For more details and background, we refer to Schmidt (1980), Evertse and Schlickewei (2002), Bombieri and Gubler (2006), chapters 6 and 7, and Baker and Wüstholz (2007).

3.1 The Subspace Theorem and some variations

In this section we formulate some versions of the Subspace Theorem that are used in Chapter 6.
In particular, we recall the p-adic Subspace Theorem, the Parametric Subspace Theorem, and a quantitative version of a special case of the latter. We start with a brief introduction, taking as starting point Roth’s celebrated Theorem on the approximation of algebraic numbers by rationals.

Theorem 3.1.1 Let $\alpha \in \mathbb{R} \setminus \mathbb{Q}$ be an algebraic number and $\epsilon > 0$. Then there are only finitely many pairs $(x, y) \in \mathbb{Z}^2$ with $y > 0$ such that

\[ \left| \alpha - \frac{x}{y} \right| \le \max(|x|, |y|)^{-2-\epsilon}. \tag{3.1.1} \]

Proof. See Roth (1955). Roth’s proof consists of two steps: first, the deduction of a non-vanishing result for polynomials, now known as Roth’s Lemma; second, under the assumption that Theorem 3.1.1 is false, the construction of an auxiliary polynomial that violates Roth’s Lemma.

Weaker versions of Roth’s Theorem were proved earlier by Thue (1909) with exponent $\frac{1}{2}d + 1$ instead of 2 where $d = \deg \alpha$, Siegel (1921) with exponent $2\sqrt{d}$, and Dyson (1947) and Gelfond (1960) (result proved in the late 1940s), both with exponent $\sqrt{2d}$. The proofs of Thue–Roth are all ineffective, in that they do not provide a method to determine the solutions of the inequality under consideration.

Extending earlier work of Ridout (1958), Lang (1960) proved a generalization of Roth’s Theorem, usually referred to as the p-adic Roth’s Theorem, where the underlying inequality takes its solutions from an algebraic number field and where various archimedean and non-archimedean absolute values from this number field are involved.

Roth’s Theorem was generalized in another direction by W. M. Schmidt to simultaneous approximation. His work culminated in his so-called Subspace Theorem. Below, we denote by $\|\cdot\|$ the maximum norm on $\mathbb{C}^n$, i.e., $\|x\| := \max(|x_1|, \ldots, |x_n|)$ for $x = (x_1, \ldots, x_n) \in \mathbb{C}^n$.

Theorem 3.1.2 (Subspace Theorem) Let $n \ge 2$ and let $L_i = \sum_{j=1}^{n} \alpha_{ij} X_j$ ($i = 1, \ldots, n$) be linearly independent linear forms with algebraic coefficients in $\mathbb{C}$ and let $\epsilon > 0$. Then the set of solutions of the inequality

\[ |L_1(x) \cdots L_n(x)| \le \|x\|^{-\epsilon} \quad \text{in } x \in \mathbb{Z}^n \setminus \{0\} \tag{3.1.2} \]

is contained in a finite union of proper linear subspaces of $\mathbb{Q}^n$.

Proof. See Schmidt (1972) or Schmidt (1980).

In general, inequalities of the shape (3.1.2) need not have finitely many solutions.

Theorem 3.1.2 $\Longrightarrow$ Theorem 3.1.1. Notice that if $(x, y) \in \mathbb{Z}^2$ with $y > 0$ is a solution of (3.1.1), then

\[ |y(x - \alpha y)| \le \max(|x|, |y|)^{-\epsilon}. \]

Now Theorem 3.1.2 implies that the solutions of the latter, hence of (3.1.1), lie in finitely many one-dimensional subspaces of $\mathbb{Q}^2$. But the solutions of (3.1.1) in a given one-dimensional subspace of $\mathbb{Q}^2$ are all of the shape $m(x_0, y_0)$ with $(x_0, y_0)$ a fixed pair of integers with $\gcd(x_0, y_0) = 1$ and $y_0 > 0$, and $m \in \mathbb{Z}_{>0}$. By substituting this into (3.1.1) we see that $m$ is bounded. This shows that a given one-dimensional subspace of $\mathbb{Q}^2$ contains only finitely many solutions of (3.1.1).

Schmidt (1975) generalized his Subspace Theorem to inequalities of which the unknowns are taken from an algebraic number field, and Schlickewei (1977b) extended this further to inequalities involving both archimedean and non-archimedean absolute values, thus generalizing both Lang’s p-adic Roth’s Theorem mentioned above and Schmidt’s Subspace Theorem. We give a reformulation of his result that is better adapted to our purposes.

Let $K$ be an algebraic number field. We use the absolute values $|\cdot|_v$ ($v \in M_K$) defined in Section 1.7.
Let $S$ be a finite set of places of $K$, containing all infinite places. Recall that the ring of $S$-integers of $K$ is given by $O_S = \{x \in K : |x|_v \le 1 \text{ for } v \in M_K \setminus S\}$. We define the $S$-height of $x = (x_1, \ldots, x_n) \in O_S^n$ by

\[ H_S(x) = H_S(x_1, \ldots, x_n) := \prod_{v \in S} \max(|x_1|_v, \ldots, |x_n|_v). \]

It follows easily from the Product Formula that $H_S(\varepsilon x) = H_S(x)$ for $\varepsilon \in O_S^*$. We shall show below that for any $C > 0$ there are, up to multiplication with a scalar from $O_S^*$, only finitely many vectors $x \in O_S^n$ with $H_S(x) \le C$.

Theorem 3.1.3 (p-adic Subspace Theorem) For $v \in S$, let $L_{1v}, \ldots, L_{nv}$ be linearly independent linear forms in $X_1, \ldots, X_n$ with coefficients in $K$. Further, let $\epsilon > 0$. Then the set of solutions of

\[ \prod_{v \in S} |L_{1v}(x) \cdots L_{nv}(x)|_v \le H_S(x)^{-\epsilon} \quad \text{in } x \in O_S^n \setminus \{0\} \tag{3.1.3} \]

is contained in a union of finitely many proper linear subspaces of $K^n$.

Proof. This is a reformulation of a result of Schlickewei (1977b). His proof is based on his earlier papers Schlickewei (1976a, 1976b, 1976c). A special case of Schlickewei’s result was proved independently by Dubois and Rhin (1975). A complete proof of Schlickewei’s theorem can also be found in Bombieri and Gubler (2006), chapter 7.

Theorem 3.1.3 $\Longrightarrow$ Theorem 3.1.2. Let $L_1, \ldots, L_n$ be the linear forms from Theorem 3.1.2. Let $K$ be the algebraic number field generated by the coefficients of $L_1, \ldots, L_n$ and their conjugates, and suppose that $K$ has degree $d$. Let $S$ be the set of infinite places of $K$. Recall that if $v$ is an infinite place of $K$, then either $|\cdot|_v = |\sigma(\cdot)|$ if $v = \sigma$ is a real embedding of $K$ or $|\cdot|_v = |\sigma(\cdot)|^2$ if $v = \{\sigma, \bar{\sigma}\}$ is a pair of conjugate complex embeddings of $K$. For either $v = \sigma$ a real embedding or $v = \{\sigma, \bar{\sigma}\}$ a pair of conjugate complex embeddings, we put $L_{iv} := \sigma^{-1}(L_i)$, where $\sigma^{-1}(L_i)$ is the linear form obtained by applying $\sigma^{-1}$ to the coefficients of $L_i$. For $x \in \mathbb{Z}^n$, the left- and right-hand sides of (3.1.3) are precisely the $d$-th powers of the left- and right-hand sides of (3.1.2).
Thus, for $x \in \mathbb{Z}^n$, inequality (3.1.2) implies (3.1.3), and then an application of Theorem 3.1.3 implies that the solutions of (3.1.2) lie in a union of finitely many proper linear subspaces of $\mathbb{Q}^n$.

Schmidt’s proof of Theorem 3.1.2 is basically an extension of Roth’s method, i.e., the construction of an auxiliary polynomial and an application of Roth’s Lemma, combined with techniques from the geometry of numbers. The arguments of Schlickewei and Dubois and Rhin are essentially a “p-adization” of Schmidt’s method. In their groundbreaking paper Faltings and Wüstholz (1994) gave a totally different proof of Theorem 3.1.3, where they avoided the use of geometry of numbers by applying a very powerful generalization of Roth’s Lemma due to Faltings, his Product Theorem, see Faltings (1991). We mention that both the method of Schmidt and that of Faltings and Wüstholz are ineffective, in that they do not provide a method to determine the subspaces containing the solutions of the inequality under consideration.

Theorems 3.1.2 and 3.1.3 are very powerful tools to obtain finiteness results for various types of Diophantine equations, such as unit equations, norm form equations, decomposable form equations and exponential-polynomial equations, see Chapters 6, 9 and Section 10.11 in the present book. The proofs of these finiteness results are all ineffective, in the sense that they do not provide a method to determine the solutions. On the other hand, there are now good quantitative versions of Theorems 3.1.2 and 3.1.3, giving explicit upper bounds for the number of subspaces, that led to explicit upper bounds for the numbers of solutions of the above-mentioned equations.
Schmidt (1989) obtained a quantitative version of Theorem 3.1.2, giving an explicit upper bound for the number of subspaces containing the “large” solutions. Schlickewei (1992) generalized this, and obtained a quantitative version of Theorem 3.1.3. This was substantially improved by Evertse (1996), by using a quantitative version of Faltings’ Product Theorem. Schlickewei made the important observation that a sufficiently good quantitative version of the so-called Parametric Subspace Theorem (see below), which deals with a parametrized class of twisted heights, would lead to much better bounds for the number of solutions of certain classes of Diophantine equations than quantitative versions of Theorem 3.1.3. In Schlickewei (1996a), he proved a special case of such a quantitative version, and applied this to obtain sharper estimates for the zero multiplicity of a linear recurrence sequence (see Section 10.11). Evertse and Schlickewei (2002) sharpened and extended Schlickewei’s result, and obtained a completely general quantitative version of the Parametric Subspace Theorem. For more historical information we refer to Evertse and Schlickewei (1999).

Evertse and Schlickewei essentially followed Schmidt’s proof of his Theorem 3.1.2, with the necessary refinements. Evertse and Ferretti (2013) obtained a further improvement, also following Schmidt’s proof scheme, but inserting ideas from Faltings and Wüstholz (1994).

We first state the Parametric Subspace Theorem in a qualitative form and then give, in a special case relevant for our purposes, a quantitative version of the latter.
The Parametric Subspace Theorem is in fact a generalization of Theorem 3.1.3, although this is not obvious at first glance. Let $K$ again be a number field, $S$ a finite set of places of $K$ containing all infinite places, $\epsilon > 0$, and for $v \in S$, let $\{L_{1v}, \ldots, L_{nv}\}$ be a system of linearly independent linear forms in $K[X_1, \ldots, X_n]$. Take a solution $x \in O_S^n \setminus \{0\}$ of (3.1.3). Assume that the left-hand side of (3.1.3) is non-zero. Write

\[ |L_{iv}(x)|_v = H_S(x)^{d_{iv}} \quad (v \in S,\ i = 1, \ldots, n), \]
\[ L_{iv} := X_i, \quad d_{iv} := 0 \quad (v \in M_K \setminus S,\ i = 1, \ldots, n), \]
\[ \mathbf{d} = (d_{iv} : v \in M_K,\ i = 1, \ldots, n), \quad Q := H_S(x). \]

Define the so-called twisted height:

\[ H_{Q,\mathbf{d}}(x) := \prod_{v \in M_K} \max_{1 \le i \le n} |L_{iv}(x)|_v Q^{-d_{iv}}. \tag{3.1.4} \]

Notice that by (3.1.3) we have

\[ \sum_{v \in M_K} \sum_{i=1}^{n} d_{iv} \le -\epsilon, \]

and that $H_{Q,\mathbf{d}}(x) \le 1$. In the above observations, both $\mathbf{d}$ and $Q$ vary with $x$. The Parametric Subspace Theorem deals with inequalities involving twisted heights, where $Q$ varies but $\mathbf{d}$ is fixed.

Theorem 3.1.4 (Parametric Subspace Theorem) Let $K$ be an algebraic number field and $S$ a finite set of places of $K$ containing all infinite places. Further, let $n \ge 2$, let $\{L_{1v}, \ldots, L_{nv}\}$ ($v \in S$) be systems of linearly independent linear forms from $K[X_1, \ldots, X_n]$, and let $\mathbf{d} = (d_{iv} : v \in M_K,\ i = 1, \ldots, n)$ be a tuple of reals such that $d_{iv} = 0$ for $v \in M_K \setminus S$, $i = 1, \ldots, n$. Put

\[ \mu := \frac{1}{n} \sum_{v \in M_K} \sum_{i=1}^{n} d_{iv}. \]

Then for every $\delta > 0$ there are $Q_0$ and a finite collection $\{T_1, \ldots, T_t\}$ of proper linear subspaces of $K^n$ such that for every $Q \ge Q_0$ there is $T \in \{T_1, \ldots, T_t\}$ with

\[ \{x \in K^n : H_{Q,\mathbf{d}}(x) \le Q^{-\mu-\delta}\} \subseteq T. \]

Proof. This was first formulated by Evertse and Schlickewei (2002), in a quantitative form with explicit upper bounds for $Q_0$ and $t$.
In fact, in their paper, Evertse and Schlickewei proved an “Absolute Parametric Subspace Theorem”, with solutions $x$ taken from $\overline{\mathbb{Q}}^n$ instead of $K^n$.

Below we deduce Theorem 3.1.3 from Theorem 3.1.4. We first prove a lemma. We keep our convention that $K$ is a number field and $S$ a finite set of places of $K$, containing all infinite places. Further, we set $d := [K : \mathbb{Q}]$, $s := |S|$. For $x = (x_1, \ldots, x_n) \in K^n$, $v \in M_K$, we put $\|x\|_v := \max(|x_1|_v, \ldots, |x_n|_v)$.

Lemma 3.1.5 (i) There is a constant $C$ depending only on $K$ and $S$, such that for every $x \in O_S^n \setminus \{0\}$, there is $\varepsilon \in O_S^*$ with

\[ \|\varepsilon x\|_v \le C H_S(x)^{1/s} \quad \text{for } v \in M_K. \]

(ii) For every $A > 0$ there are, up to multiplication with a scalar from $O_S^*$, only finitely many vectors $x \in O_S^n$ with $H_S(x) \le A$.

Proof. (i) Let $S = \{v_1, \ldots, v_s\}$ and

\[ H := \{x = (x_1, \ldots, x_s) \in \mathbb{R}^s : x_1 + \cdots + x_s = 0\}. \]

Then by the $S$-unit Theorem (see Theorem 1.8.1) the map

\[ \mathrm{LOG}_S : \varepsilon \mapsto (\log |\varepsilon|_{v_1}, \ldots, \log |\varepsilon|_{v_s}) \]

maps $O_S^*$ to an $(s - 1)$-dimensional lattice in $H$. Let $x \in O_S^n \setminus \{0\}$. Then the point

\[ \mathbf{a} := \big(s^{-1} \log H_S(x) - \log \|x\|_{v_1}, \ldots, s^{-1} \log H_S(x) - \log \|x\|_{v_s}\big) \]

lies in $H$. Choose $\varepsilon \in O_S^*$ such that the lattice point $\mathrm{LOG}_S(\varepsilon)$ is closest to $\mathbf{a}$. Then in fact $\|\mathbf{a} - \mathrm{LOG}_S(\varepsilon)\| \le \gamma$, where $\|\cdot\|$ is the maximum norm on $\mathbb{R}^s$ and $\gamma$ is a constant depending only on $K$, $S$. This $\varepsilon$ satisfies (i) with $C = e^{\gamma}$.

(ii) Consider $x \in O_S^n$ with $H_S(x) \le A$. After multiplying $x$ with a suitable $S$-unit, we can arrange that $\|x\|_v \le C \cdot A^{1/s}$ for $v \in S$. Then the coordinates $x_1, \ldots, x_n$ of $x$ have absolute heights

\[ H(x_i) := \prod_{v \in M_K} \max(1, |x_i|_v)^{1/d} \le C^{s/d} A^{1/d} \quad \text{for } i = 1, \ldots, n. \]

By Northcott’s Theorem (see Theorem 1.9.3) this leaves only finitely many possibilities for $x_1, \ldots, x_n$, hence for $x$.
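The invariance of the $S$-height under multiplication by $S$-units, which underlies Lemma 3.1.5, can be checked numerically in the simplest case. The following is a minimal sketch, assuming $K = \mathbb{Q}$ and $S = \{\infty, 2, 3\}$ with the usual absolute values ($|p|_p = 1/p$); the function names `abs_v` and `s_height` and the sample vector are ours, chosen purely for illustration.

```python
from fractions import Fraction

def abs_v(x, v):
    """Absolute value |x|_v on Q: v is None for the infinite place, else a prime p."""
    x = Fraction(x)
    if v is None:
        return abs(x)
    if x == 0:
        return Fraction(0)
    num, den, e = x.numerator, x.denominator, 0
    while num % v == 0:          # count the power of p in the numerator
        num //= v
        e += 1
    while den % v == 0:          # and in the denominator
        den //= v
        e -= 1
    return Fraction(1, v) ** e   # |x|_p = p^(-ord_p(x))

def s_height(xs, primes):
    """H_S(x) = product over v in S of max_i |x_i|_v, where S = {infinity} + primes."""
    h = Fraction(1)
    for v in [None, *primes]:
        h *= max(abs_v(x, v) for x in xs)
    return h

S_PRIMES = [2, 3]
x = [Fraction(1), Fraction(5), Fraction(-6)]   # a vector of S-integers
unit = Fraction(4, 9)                          # an S-unit: only 2 and 3 occur
scaled = [unit * xi for xi in x]

print(s_height(x, S_PRIMES))        # 6
print(s_height(scaled, S_PRIMES))   # 6
```

Multiplying by the unit $4/9$ shrinks the archimedean and 2-adic factors and enlarges the 3-adic one, and the Product Formula makes these changes cancel exactly.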
Theorem 3.1.4 $\Longrightarrow$ Theorem 3.1.3. By Lemma 3.1.5 (i), it suffices to show that the solutions $x \in O_S^n \setminus \{0\}$ of (3.1.3) with the additional property

\[ \|x\|_v \le C H_S(x)^{1/s} \quad \text{for } v \in S \tag{3.1.5} \]

lie in finitely many proper linear subspaces of $K^n$. Lemma 3.1.5 (ii) implies that by assuming $H_S(x)$ to be sufficiently large, we exclude at most finitely many one-dimensional subspaces of solutions $x$. Solutions with (3.1.5) and with $H_S(x)$ sufficiently large in fact satisfy

\[ |L_{iv}(x)|_v \le H_S(x)^{2/s} \quad \text{for } v \in S,\ i = 1, \ldots, n. \tag{3.1.6} \]

Hence it suffices to prove that the solutions of (3.1.3) with (3.1.6) lie in finitely many proper linear subspaces of $K^n$.

Let $x$ be a solution of (3.1.3) with (3.1.6), and define

\[ d_{iv}(x) := \max\left(-2n - \epsilon,\ \frac{\log |L_{iv}(x)|_v}{\log H_S(x)}\right) \quad \text{for } v \in S,\ i = 1, \ldots, n; \]

taking $\log 0 := -\infty$, this is well-defined also if $L_{iv}(x) = 0$. Then

\[ -2n - \epsilon \le d_{iv}(x) \le 2/s \ \text{ for } v \in S,\ i = 1, \ldots, n, \qquad \sum_{v \in S} \sum_{i=1}^{n} d_{iv}(x) \le -\epsilon. \tag{3.1.7} \]

We define a tuple $\mathbf{d} := (d_{iv} : v \in M_K,\ i = 1, \ldots, n)$ by $d_{iv} := 0$ for $v \in M_K \setminus S$, $i = 1, \ldots, n$ and

\[ d_{iv} \in \frac{\epsilon}{2ns}\mathbb{Z}, \qquad d_{iv} - \frac{\epsilon}{2ns} < d_{iv}(x) \le d_{iv} \quad \text{for } v \in S,\ i = 1, \ldots, n. \tag{3.1.8} \]

Then by (3.1.7),

\[ -2n - \epsilon \le d_{iv} \le \frac{2}{s} + \frac{\epsilon}{2ns} \quad \text{for } v \in S,\ i = 1, \ldots, n. \tag{3.1.9} \]

Notice that by (3.1.7) we have also

\[ \mu := \frac{1}{n} \sum_{v \in M_K} \sum_{i=1}^{n} d_{iv} \le -\frac{\epsilon}{2n}. \]

Further, we have $|L_{iv}(x)|_v \le H_S(x)^{d_{iv}(x)} \le H_S(x)^{d_{iv}}$ for $v \in S$, $i = 1, \ldots, n$, hence with $Q := H_S(x)$ we have

\[ H_{Q,\mathbf{d}}(x) \le 1. \]

Taking $\delta := \epsilon/(2n)$, we have $-\mu - \delta \ge 0$ and hence $H_{Q,\mathbf{d}}(x) \le 1 \le Q^{-\mu-\delta}$. By Theorem 3.1.4, the solutions of (3.1.3) satisfying (3.1.8) for some fixed tuple $\mathbf{d}$ lie in finitely many proper linear subspaces of $K^n$. Further, by (3.1.8) and (3.1.9), the tuples $\mathbf{d}$ belong to a finite set independent of $x$. This proves Theorem 3.1.3.
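The discretization step (3.1.8) in the argument above is what makes the set of exponent tuples finite, and it can be illustrated with exact rational arithmetic. The following is a minimal sketch under made-up data: the values of $n$, $s$, $\epsilon$ and the exponents $d_{iv}(x)$ are hypothetical, and the helper name `round_up_to_grid` is ours.

```python
from fractions import Fraction
from math import ceil

def round_up_to_grid(value, step):
    """Return the multiple d of `step` with d - step < value <= d."""
    return ceil(value / step) * step

n, s = 2, 2                      # number of variables and |S| (illustrative values)
eps = Fraction(1)                # the exponent epsilon in (3.1.3)
step = eps / (2 * n * s)         # grid width epsilon/(2ns) from (3.1.8)

# Hypothetical exponents d_iv(x): one row per v in S, one column per i = 1..n.
# They satisfy (3.1.7): each lies in [-2n - eps, 2/s] and their sum is <= -eps.
d_x = [[Fraction(3, 10), Fraction(-9, 10)],
       [Fraction(1, 5), Fraction(-7, 10)]]

# Rounded exponents d_iv as in (3.1.8), and the resulting mu.
d = [[round_up_to_grid(e, step) for e in row] for row in d_x]
mu = sum(map(sum, d)) / n

print(mu)                        # -7/16
print(mu <= -eps / (2 * n))      # True
```

Each of the $ns$ exponents grows by less than $\epsilon/(2ns)$, so the total grows by less than $\epsilon/2$; this is why the bound $-\epsilon$ on the sum in (3.1.7) still yields $\mu \le -\epsilon/(2n)$ after rounding.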
The general statement of the quantitative version of Theorem 3.1.4, with explicit upper bounds for $Q_0$, $t$, is quite technical. We give here only a special case, which is sufficient for our purposes. We keep our assumptions that $K$ is an algebraic number field of degree $d$ and $S$ is a finite set of places of $K$, containing the infinite places.

Theorem 3.1.6 Let $L_{iv}$ ($v \in M_K$, $i = 1, \ldots, n$) be linear forms such that for every $v \in M_K$, the set $\{L_{1v}, \ldots, L_{nv}\}$ is linearly independent and

\[ \{L_{1v}, \ldots, L_{nv}\} \subset \{X_1, \ldots, X_n, X_1 + \cdots + X_n\}. \]

Let $\mathbf{d} = (d_{iv} : v \in M_K,\ i = 1, \ldots, n)$ be any tuple of reals such that $d_{iv} = 0$ for $v \in M_K \setminus S$, $i = 1, \ldots, n$. Put

\[ \mu := \frac{1}{n} \sum_{v \in M_K} \sum_{i=1}^{n} d_{iv} \]

and suppose that

\[ \sum_{v \in M_K} \max(d_{1v}, \ldots, d_{nv}) \le \lambda \quad \text{with } \lambda > \mu. \]

Let $0 < \delta < \lambda - \mu$ and put

\[ \Delta := \frac{\lambda - \mu}{\delta}. \]

Let $H_{Q,\mathbf{d}}$ be defined by (3.1.4). Then there is a finite collection $\{T_1, \ldots, T_t\}$ of proper linear subspaces of $K^n$ of cardinality $t \le C(n, \Delta)$ with $C(n, \Delta)$ effectively computable and depending only on $n$ and $\Delta$, such that for every $Q$ with

\[ Q > n^{2d/\delta} \tag{3.1.10} \]

there is $T \in \{T_1, \ldots, T_t\}$ such that

\[ \{x \in K^n : H_{Q,\mathbf{d}}(x) \le Q^{-\mu-\delta}\} \subseteq T. \tag{3.1.11} \]

This was proved by Evertse and Schlickewei (2002), Theorem 1.1 in the special case $\mu = 0$, $\lambda = 1$, with

\[ C(n, \Delta) = 4^{(n+9)^2} \Delta^{n+4}. \]

For our purposes, the precise value of $C(n, \Delta)$ will not matter, but we should mention here that Evertse and Ferretti (2013), Theorem 1.1 proved the same result with the better bound

\[ C(n, \Delta) = 10^6\, 2^{2n} n^{10} \Delta^3 (\log(6n\Delta))^2, \]

again in the special case $\mu = 0$, $\lambda = 1$.

It is not difficult to reduce Theorem 3.1.6 to the special case $\mu = 0$, $\lambda = 1$. Put

\[ d_{iv}' := \frac{1}{\lambda - \mu}\left(d_{iv} - \frac{1}{n} \sum_{j=1}^{n} d_{jv}\right) \quad (v \in M_K,\ i = 1, \ldots, n), \]
\[ \mathbf{d}' := (d_{iv}' : v \in M_K,\ i = 1, \ldots, n), \quad Q' := Q^{\lambda-\mu}, \quad \delta' := \frac{\delta}{\lambda - \mu}. \]

Then $d_{iv}' = 0$ for $v \in M_K \setminus S$, $i = 1, \ldots, n$,

\[ \sum_{v \in M_K} \sum_{i=1}^{n} d_{iv}' = 0, \qquad \sum_{v \in M_K} \max(d_{1v}', \ldots, d_{nv}') \le 1, \]

and (3.1.10) changes into $Q' > n^{2d/\delta'}$. Further, $H_{Q',\mathbf{d}'}(x) = H_{Q,\mathbf{d}}(x) Q^{\mu}$, hence (3.1.11) changes into

\[ \{x \in K^n : H_{Q',\mathbf{d}'}(x) \le Q'^{-\delta'}\} \subseteq T. \]

Thus, Theorem 3.1.6 follows from the special case $\mu = 0$, $\lambda = 1$.

The Subspace Theorem and its generalizations and quantitative refinements have many applications. In this book we have focused on applications to unit equations and subsequent applications thereof, see Chapters 6, 9, 10, but there is much more, see for instance the survey papers Bilu (2008), Corvaja and Zannier (2008), Bugeaud (2011), and the book Zannier (2003). Somewhat surprisingly, from Theorem 3.1.3 one can derive extensions where the linear polynomials are replaced by higher degree polynomials and where the solutions are taken from an arbitrary algebraic variety instead of $K^n$, see Corvaja and Zannier (2004a) and Evertse and Ferretti (2002, 2008).

3.2 Effective estimates for linear forms in logarithms

In this section we present some results from Baker’s theory of logarithmic forms that are used in Chapters 4 and 5. We formulate, without proof, the best known effective estimates for linear forms in logarithms, due to Matveev (2000) in the complex case and Yu (2007) in the p-adic case, as well as a common, uniform formulation of them which will be more convenient to apply.

We first give a brief introduction, starting with the famous Gelfond–Schneider Theorem on transcendental numbers. For the moment, $\overline{\mathbb{Q}}$ denotes the algebraic closure of $\mathbb{Q}$ in $\mathbb{C}$, and algebraic numbers are supposed to belong to $\overline{\mathbb{Q}}$.
Here and below $\log$ denotes, unless otherwise stated, any fixed determination of the logarithm, and for $\alpha, \beta \in \mathbb{C}$ with $\alpha \neq 0$ we define $\alpha^{\beta} := e^{\beta \log \alpha}$.

Theorem 3.2.1 Suppose that $\alpha$ and $\beta$ are algebraic numbers such that $\alpha \neq 0, 1$ and that $\beta$ is not rational. Then $\alpha^{\beta}$ is transcendental.

Proof. See Gelfond (1934) and Schneider (1934).

The theorem was proved independently by Gelfond and Schneider. Their proofs are different, but both depend on the construction of an auxiliary function. Assuming that in Theorem 3.2.1 $\alpha^{\beta}$ is algebraic and following the arguments of Gelfond, one can construct a function $F(z)$ of a complex variable $z$ which is a polynomial in $\alpha^{z}$ and $\alpha^{\beta z}$ with integral coefficients, not all zero, such that $F^{(m)}(l) = 0$ for all integers $l$, $m$ with $1 \le l \le h$ and $0 \le m < k$, where $h$, $k$ are appropriate parameters. Then combining some arithmetic and analytic considerations and using induction on $k$, one can prove that $F^{(m)}(l) = 0$ for all $m$, which leads to a contradiction.

Theorem 3.2.1 provided an answer to Hilbert’s seventh problem. An equivalent formulation of the theorem is that if $\alpha_1$, $\alpha_2$ are non-zero algebraic numbers such that $\log \alpha_1$ and $\log \alpha_2$ are linearly independent over $\mathbb{Q}$, then they are linearly independent over $\overline{\mathbb{Q}}$.

By means of a refinement of his method of proof, Gelfond (1935) gave a non-trivial effective lower bound for the absolute value of $\beta_1 \log \alpha_1 + \beta_2 \log \alpha_2$, where $\beta_1$, $\beta_2$ denote algebraic numbers, not both 0, and $\alpha_1$, $\alpha_2$ denote algebraic numbers different from 0 and 1 such that $\log \alpha_1 / \log \alpha_2$ is not rational. Mahler (1935b) proved a p-adic analogue of the Gelfond–Schneider Theorem. A generalization to the p-adic absolute value was given in Gelfond (1940) in a quantitative form.

In his book Gelfond (1960), Gelfond remarked that a generalization of his above results from two logarithms to arbitrarily many would be of great significance for the solutions of many difficult problems in number theory.
In his celebrated series of papers, Baker (1966, 1967a, 1967b, 1968a) made a major breakthrough in transcendental number theory by generalizing the Gelfond–Schneider Theorem to arbitrarily many logarithms. In Baker (1966, 1967b), he proved the following.

Theorem 3.2.2 Let $\alpha_1, \ldots, \alpha_n$ denote non-zero algebraic numbers. If $\log \alpha_1, \ldots, \log \alpha_n$ are linearly independent over $\mathbb{Q}$, then $1, \log \alpha_1, \ldots, \log \alpha_n$ are linearly independent over $\overline{\mathbb{Q}}$.

Further, Baker (1967a, 1967b, 1968a) gave non-trivial lower bounds for the absolute value of linear forms in logarithms of the form
\[ \beta_1 \log \alpha_1 + \cdots + \beta_n \log \alpha_n, \]
where $\alpha_1, \ldots, \alpha_n$ are non-zero algebraic numbers such that $\log \alpha_1, \ldots, \log \alpha_n$ are linearly independent over $\mathbb{Q}$ and $\beta_1, \ldots, \beta_n$ are algebraic numbers, not all 0.

Proof of Theorem 3.2.2 (sketch; see Baker (1967b) for full details). To illustrate most of the principal ideas of Baker, we sketch the main steps of the proof of a slightly weaker assertion, which states that if $\alpha_1, \ldots, \alpha_n, \beta_1, \ldots, \beta_{n-1}$ are non-zero algebraic numbers such that $\alpha_1, \ldots, \alpha_n$ are multiplicatively independent, then $\alpha_1^{\beta_1} \cdots \alpha_{n-1}^{\beta_{n-1}} = \alpha_n$ cannot hold. Supposing the opposite and following the arguments of Baker, one can construct an auxiliary function $F(z_1, \ldots, z_{n-1})$ in $n - 1$ complex variables, which generalizes the function of a single variable employed by Gelfond. The function is a polynomial in $\alpha_1^{z_1}, \ldots, \alpha_{n-1}^{z_{n-1}}$ and $\alpha_1^{\beta_1 z_1} \cdots \alpha_{n-1}^{\beta_{n-1} z_{n-1}}$, such that
\[ F(z, \ldots, z) = \sum_{\lambda_1 = 0}^{L} \cdots \sum_{\lambda_n = 0}^{L} p(\lambda_1, \ldots, \lambda_n)\, \alpha_1^{\lambda_1 z} \cdots \alpha_n^{\lambda_n z}, \]
where $L$ is a large parameter and the $p(\lambda_1, \ldots, \lambda_n)$ are rational integers, not all 0. Then for every positive integer $l$, the number $F(l, \ldots, l)$ lies in the algebraic number field $\mathbb{Q}(\alpha_1, \ldots, \alpha_n)$. It follows from a well-known lemma on linear equations (known as Siegel’s Lemma) that the $p(\lambda_1, \ldots, \lambda_n)$ can be chosen such that their absolute values are not too large and such that
\[ F_{m_1, \ldots, m_{n-1}}(l, \ldots, l) = 0 \tag{3.2.1} \]
for all integers $l, m_1, \ldots, m_{n-1}$ with $1 \le l \le h$ and $m_1 + \cdots + m_{n-1} \le k$, where $F_{m_1, \ldots, m_{n-1}}$ denotes the corresponding derivative of $F(z_1, \ldots, z_{n-1})$ and $h, k$ are appropriate parameters. In this situation, the basic interpolation techniques used earlier by Gelfond and others do not work in general. Using some analytic considerations Baker applied an ingenious extrapolation procedure to extend (3.2.1) to a larger range of values for $l$, at the price of slightly diminishing the range of values for $m_1 + \cdots + m_{n-1}$. Repeating this procedure, one can get $F(l, \ldots, l) = 0$ for $1 \le l \le (L + 1)^n$. This can be regarded as a system of linear equations in the coefficients $p(\lambda_1, \ldots, \lambda_n)$ of $F$, which, because of the multiplicative independence of $\alpha_1, \ldots, \alpha_n$, cannot have a non-zero solution. This proves the assertion.

Let again $\alpha_1, \ldots, \alpha_n$ be $n \ge 2$ non-zero algebraic numbers, and let $\log \alpha_1, \ldots, \log \alpha_n$ denote now the principal values of the logarithm.

Theorem 3.2.3 Let $b_1, \ldots, b_n$ be rational integers and $0 < \varepsilon \le 1$. Assume that
\[ 0 < |b_1 \log \alpha_1 + \cdots + b_n \log \alpha_n| < e^{-\varepsilon B}, \]
where $B = \max\{|b_1|, \ldots, |b_n|\}$. Then $B \le B_0$, where $B_0$ is effectively computable in terms of $\alpha_1, \ldots, \alpha_n$ and $\varepsilon$.

This was proved in Baker (1968a) with
\[ B_0 = \left(4^{n^2} \varepsilon^{-1} d^{2n} A\right)^{(2n+1)^2}, \]
where $d \ge 4$ and $A \ge 4$ are upper bounds for the degrees and heights, respectively, of $\alpha_1, \ldots, \alpha_n$. Here, by the height of an algebraic number we mean the maximum of the absolute values of the coefficients in its minimal defining polynomial, which is chosen such that its coefficients are relatively prime integers.

Baker’s general effective estimates led to significant applications in number theory. For applications to Diophantine equations, the inequalities of Baker (1968a, 1968b) in which $\beta_1, \ldots, \beta_n$ are rational integers proved to be particularly useful. Using his effective estimates, Baker (1968b, 1968c, 1969) gave the first explicit upper bounds for the solutions of Thue equations, Mordell equations, and superelliptic and hyperelliptic equations; see also Sections 9.6 and 9.7.

Later, several improvements and generalizations were established by Baker and others, including Feldman, Baker and Stark, Tijdeman, van der Poorten, Sprindžuk, Shorey, Wüstholz, Philippon and Waldschmidt, Waldschmidt, Baker and Wüstholz, Laurent, Mignotte, Nesterenko and Matveev and, in the $p$-adic case, Coates, Sprindžuk, Brumer, Vinogradov and Sprindžuk, van der Poorten, Bugeaud, Laurent and Yu. They have introduced various new ideas to improve or refine the previous bounds. Their results made it possible to obtain a great many applications. For further applications to Diophantine problems, we refer to Győry (1980b, 2002, 2010), Sprindžuk (1982, 1993), Shorey and Tijdeman (1986), Serre (1989), de Weger (1989), Bilu (1995), Wildanger (1997, 2000), Smart (1998), Gaál (2002) and Tzanakis (2013), to Chapters 4 and 5 of the present book, to our next book on discriminant equations, and to the references given there.
Using an elementary geometric lemma due to Bombieri (1993) and Bombieri and Cohen (1997), Bilu and Bugeaud (2000) showed that one does not need the full strength of Baker’s theory to get, for $b_n = \pm 1$, an effective version of Theorem 3.2.3: it can be deduced from an estimate for linear forms in just two logarithms. However, the results of the theory of linear forms in $n \ge 2$ logarithms provide much better bounds for $B$.

For comprehensive accounts of Baker’s theory, analogues for elliptic logarithms and algebraic groups and extensive bibliographies the reader can consult Baker (1975, 1988), Baker and Masser (1977), Lang (1978), Feldman and Nesterenko (1998), Waldschmidt (2000), Wüstholz (2002) and, for the state of the art as well, Baker and Wüstholz (2007).

We now state the results of Matveev and Yu and give a common, uniform formulation of them. Let again $K$ be an algebraic number field of degree $d$, and assume that it is embedded in $\mathbb{C}$. We put $\chi = 1$ if $K$ is real, and $\chi = 2$ otherwise. Let
\[ \Lambda = b_1 \log \alpha_1 + \cdots + b_n \log \alpha_n, \]
where $\alpha_1, \ldots, \alpha_n$ are $n$ ($\ge 2$) non-zero elements of $K$ with some fixed non-zero values of $\log \alpha_1, \ldots, \log \alpha_n$, and $b_1, \ldots, b_n$ are rational integers, not all zero. Let $A_1, \ldots, A_n$ be reals with
\[ A_i \ge \max\{d\,h(\alpha_i),\ |\log \alpha_i|,\ 0.16\} \quad (i = 1, \ldots, n) \]
and put
\[ B := \max\left\{1,\ \max\{|b_i| A_i / A_n : 1 \le i \le n\}\right\}. \]
The following deep result was proved by Matveev (2000).

Theorem 3.2.4 Let $K, \alpha_1, \ldots, \alpha_n, b_1, \ldots, b_n$ and $\Lambda$ be as above, and suppose that $\Lambda \neq 0$. Then
\[ \log |\Lambda| > -C_1(n, d)\, A_1 \cdots A_n \log(eB), \]
where
\[ C_1(n, d) := \min\left\{ \frac{1}{\chi} \left(\tfrac{1}{2} e n\right)^{\chi} 30^{n+3} n^{3.5},\ 2^{6n+20} \right\} d^2 \log(ed). \]
Further, $B$ may be replaced by $\max(|b_1|, \ldots, |b_n|)$.

Proof. This is Corollary 2.3 of Matveev (2000).
We shall use the following consequence of Theorem 3.2.4. Let
\[ \Lambda = \alpha_1^{b_1} \cdots \alpha_n^{b_n} - 1 \tag{3.2.2} \]
and
\[ A_i \ge \max\{d\,h(\alpha_i),\ \pi\}, \quad i = 1, \ldots, n. \]

Theorem 3.2.5 Suppose that $\Lambda \neq 0$, $b_n = \pm 1$ and that $B$ satisfies
\[ B \ge \max\left\{ |b_1|, \ldots, |b_{n-1}|,\ 2e \max\left( \frac{n\pi}{\sqrt{2}}, A_1, \ldots, A_{n-1} \right) A_n \right\}. \tag{3.2.3} \]
Then we have
\[ \log |\Lambda| > -C_2(n, d)\, A_1 \cdots A_n \log\left( B / (\sqrt{2} A_n) \right), \tag{3.2.4} \]
where
\[ C_2(n, d) := \min\left\{ 1.451\,(30\sqrt{2})^{n+4} (n + 1)^{5.5},\ \pi\, 2^{6.5n+27} \right\} d^2 \log(ed). \]

Proof. Let $\log$ denote the principal value of the logarithm. There exists an even rational integer $b_0$ such that $|b_0| \le |b_1| + \cdots + |b_n|$ and that $|\mathrm{Im}(\Lambda')| \le \pi$, where
\[ \Lambda' := b_0 \log \alpha_0 + b_1 \log \alpha_1 + \cdots + b_n \log \alpha_n \]
and $\alpha_0 = -1$. The assumption $\Lambda \neq 0$ implies that $\Lambda' \neq 0$. We may assume that $|e^{\Lambda'} - 1| = |\Lambda| \le 1/3$. Then it follows that $|\Lambda'| \le 0.6$, whence
\[ |\Lambda| \ge \tfrac{1}{2} |\Lambda'|. \tag{3.2.5} \]
Using $|\log |\alpha_i|| \le d\,h(\alpha_i)$, it is easy to show that
\[ |\log \alpha_i| \le \sqrt{2} \max\{d\,h(\alpha_i),\ \pi\}, \quad i = 1, \ldots, n. \]
Thus, setting $A_0 = \pi / \sqrt{2}$, we have
\[ \sqrt{2} A_i \ge \max\{d\,h(\alpha_i),\ |\log \alpha_i|,\ 0.16\}, \quad i = 0, 1, \ldots, n. \]
Further, (3.2.3) implies
\[ \left( \frac{B}{\sqrt{2} A_n} \right)^{2} \ge e \max\left\{ 1,\ \max_{0 \le i \le n} \frac{|b_i| A_i}{A_n} \right\}. \]
By applying now Theorem 3.2.4 to $|\Lambda'|$ and using (3.2.5), we obtain (3.2.4).

Remark 3.2.6 Since for any complex number $z$, $|e^z - 1| \le |z| e^{|z|}$ holds, for $|\Lambda'| \le 1$ it follows that $|\Lambda| \le e |\Lambda'|$. Together with (3.2.5) this implies that if we have an effective and quantitative result for $|\Lambda|$ then we also have a similar one for the corresponding linear form in logarithms $|\Lambda'|$, and conversely.

Consider again $\Lambda$ defined by (3.2.2). Let now $B$ and $B_n$ be real numbers satisfying
\[ B \ge \max\{|b_1|, \ldots, |b_n|\}, \quad B \ge B_n \ge |b_n|. \]
Let $\mathfrak{p}$ be a prime ideal of $O_K$ and denote by $e_{\mathfrak{p}}$ and $f_{\mathfrak{p}}$ the ramification index and the residue class degree of $\mathfrak{p}$, respectively. Suppose that $\mathfrak{p}$ lies above the rational prime number $p$. Then the norm of $\mathfrak{p}$ is $N(\mathfrak{p}) = p^{f_{\mathfrak{p}}}$. The following profound result is due to Yu (2007).

Theorem 3.2.7 Assume that $\mathrm{ord}_p\, b_n \le \mathrm{ord}_p\, b_i$ for $i = 1, \ldots, n$, and set
\[ h_i := \max\{h(\alpha_i),\ 1/(16 e^2 d^2)\}, \quad i = 1, \ldots, n. \]
If $\Lambda \neq 0$, then for any real $\delta$ with $0 < \delta \le 1/2$ we have
\[ \mathrm{ord}_{\mathfrak{p}} \Lambda < C_3(n, d)\, \frac{e_{\mathfrak{p}}^{n} N(\mathfrak{p})}{(\log N(\mathfrak{p}))^2} \max\left\{ h_1 \cdots h_n \log(M \delta^{-1}),\ \frac{\delta B}{B_n C_4(n, d)} \right\}, \tag{3.2.6} \]
where
\[ C_3(n, d) := (16ed)^{2(n+1)} n^{3/2} \log(2nd) \log(2d), \quad C_4(n, d) := (2d)^{2n+1} \log(2d) \log^3(3d), \]
and
\[ M := B_n C_5(n, d)\, N(\mathfrak{p})^{n+1} h_1 \cdots h_{n-1} \]
with
\[ C_5(n, d) := 2 e^{(n+1)(6n+5)} d^{3n} \log(2d). \]

Proof. This is the second consequence of the Main Theorem in Yu (2007). As is remarked there, for $p > 2$, the expression $(16ed)^{2(n+1)}$ can be replaced by $(10ed)^{2(n+1)}$.

For the proof of Theorem 4.2.1, it will be more convenient to use a uniform lower bound for $\log |\Lambda|_v$ which is valid both for infinite and for finite places $v$. For a place $v \in M_K$, we write as above
\[ N(v) := \begin{cases} 2 & \text{if } v \text{ is infinite,} \\ N(\mathfrak{p}) & \text{if } v = \mathfrak{p} \text{ is finite.} \end{cases} \]
The following theorem is a consequence of Theorems 3.2.5 and 3.2.7.

Theorem 3.2.8 Let $v \in M_K$. Suppose that in (3.2.2) $\Lambda \neq 0$, $b_n = \pm 1$ and that $\alpha_1, \ldots, \alpha_{n-1}$ are not roots of unity. Let
\[ \Theta := h(\alpha_1) \cdots h(\alpha_{n-1}), \quad H := \max\{h(\alpha_n),\ 1\}. \]
If $B$ is a real number such that
\[ B \ge \max\{|b_1|, \ldots, |b_{n-1}|,\ 2e(3d)^{2n} \Theta H\}, \tag{3.2.7} \]
then
\[ \log |\Lambda|_v > -C_6(n, d)\, \frac{N(v)}{\log N(v)}\, \Theta H \log^{*}\left( \frac{B N(v)}{H} \right), \tag{3.2.8} \]
where
\[ C_6(n, d) := \lambda (16ed)^{3n+2} (\log^{*} d)^2, \]
and $\lambda = 1$ or $12$ according as $n \ge 3$ or $n = 2$.

In the proof, we shall also need the following.
Proposition 3.2.9 Let $\alpha$ be a non-zero algebraic number of degree $d$ which is not a root of unity. Then
\[ d\,h(\alpha) \ge \begin{cases} \log 2 & \text{if } d = 1, \\ 2/(\log 3d)^3 & \text{if } d \ge 2. \end{cases} \]

Proof. This result is due to Voutier (1996).

Remark 3.2.10 For $d \ge 2$ this lower bound may be replaced by the quantity $(1/4)(\log \log d / \log d)^3$; see Voutier (1996). It is a conjecture, inspired by a question of D. H. Lehmer (1933), that even $d\,h(\alpha) \ge c > 0$ should hold for some absolute constant $c$.

Proof of Theorem 3.2.8. First assume that $v$ is infinite. There is an embedding $\sigma : K \to \mathbb{C}$ such that $|\Lambda|_v = |\sigma(\Lambda)|$ or $|\sigma(\Lambda)|^2$ according as $\sigma$ is real or not. Observe further that $h(\sigma(\alpha)) = h(\alpha)$ for each $\alpha \in \overline{\mathbb{Q}}$. Hence it suffices to prove (3.2.8) for $|\Lambda|$. Suppose that in Theorem 3.2.5 $A_i = \max\{d\,h(\alpha_i),\ \pi\}$ for $i = 1, \ldots, n$. Then, using Proposition 3.2.9, it is easy to see that
\[ A_1 \cdots A_n \le (2.52d)^{2n} \Theta H. \]
Further, we have
\[ \sqrt{2} A_n > H / N(v) \quad \text{and} \quad 2e \max\left( \frac{n\pi}{\sqrt{2}}, A_1, \ldots, A_{n-1} \right) A_n \le 2e(3d)^{2n} \Theta H. \]
Now (3.2.7) implies (3.2.3), and (3.2.8) follows from the inequality (3.2.4) of Theorem 3.2.5.

Next assume that $v$ is finite. Keeping the notation of Theorem 3.2.7 and using again Proposition 3.2.9, we infer that $h_i = h(\alpha_i)$ for $i = 1, \ldots, n - 1$ and $h_n \le \max\{h(\alpha_n),\ 1\} = H$. Choosing $\delta = h_1 \cdots h_{n-1} H / B$ and $B_n = 1$ in Theorem 3.2.7, (3.2.7) implies that $\delta \le \tfrac{1}{2}$. Using the fact that $|\Lambda|_v = N(\mathfrak{p})^{-\mathrm{ord}_{\mathfrak{p}} \Lambda}$, after some computation (3.2.8) follows from (3.2.6) of Theorem 3.2.7.
4 Effective results for unit equations in two unknowns over number fields

In this chapter we present effective finiteness results in quantitative form on equations of the shape
\[ a_1 x_1 + a_2 x_2 = 1, \tag{4.1} \]
where $a_1, a_2$ are non-zero elements of an algebraic number field $K$, and the unknowns $x_1, x_2$ are units, $S$-units or, more generally, elements of a finitely generated multiplicative subgroup $\Gamma$ of $K^*$. We usually refer to such equations as “unit equations”, even if the unknowns are taken from a group that is not the unit group of a ring. In the case that the unknowns are $S$-units, we speak about an $S$-unit equation. In certain applications, it is more convenient to consider equation (4.1) in homogeneous form
\[ a_1 x_1 + a_2 x_2 + a_3 x_3 = 0, \tag{4.2} \]
where $a_1, a_2, a_3$ denote non-zero elements of $K$, and the unknowns $x_1, x_2, x_3$ are units, $S$-units or elements of $\Gamma$.

For a long time equations (4.1) and (4.2) were utilized merely in special cases and in an implicit way. It was proved by Siegel (1921) (in an implicit form) for units of a number field, and by Mahler (1933a) for $S$-units in $\mathbb{Q}$ that equation (4.1) has only finitely many solutions. For $S$-unit equations over number fields, the finiteness of the number of solutions follows from work of Parry (1950). Extending results of Siegel, Mahler and Parry, Lang (1960) proved that equation (4.1) has only finitely many solutions in $x_1, x_2 \in \Gamma$ even in the case when $K$ is any field of characteristic 0 and $\Gamma$ is any finitely generated multiplicative subgroup of $K^*$. This implies that, up to a common proportional factor, (4.2) has also finitely many solutions. These results are ineffective.

In this chapter we restrict ourselves to the case when $K$ is a number field. The general case will be discussed in Chapters 6 and 8.
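To make equation (4.1) concrete, here is a minimal sketch under simplifying assumptions: $K = \mathbb{Q}$, $a_1 = a_2 = 1$ and $S = \{\infty, 2, 3\}$, so that the $S$-units are the rationals $\pm 2^a 3^b$. The function names `is_s_unit` and `s_unit_solutions` are illustrative only, and the exponent box `max_exp` is an arbitrary search range, not an effective bound of the kind proved in this chapter.

```python
from fractions import Fraction
from itertools import product


def is_s_unit(x, primes):
    """True if the rational x is non-zero and built only from the given primes."""
    if x == 0:
        return False
    for part in (abs(x.numerator), x.denominator):
        for p in primes:
            while part % p == 0:
                part //= p
        if part != 1:
            return False
    return True


def s_unit_solutions(primes, max_exp):
    """Search x + y = 1 with x = +-prod(p**e), |e| <= max_exp, and y an S-unit."""
    solutions = set()
    exponent_range = range(-max_exp, max_exp + 1)
    for sign in (1, -1):
        for exps in product(exponent_range, repeat=len(primes)):
            x = Fraction(sign)
            for p, e in zip(primes, exps):
                x *= Fraction(p) ** e
            y = 1 - x
            if is_s_unit(y, primes):
                solutions.add((x, y))
    return solutions


if __name__ == "__main__":
    for x, y in sorted(s_unit_solutions([2, 3], 4)):
        print(f"{x} + {y} = 1")
```

With `primes = [2, 3]` and `max_exp = 4` the search returns 21 pairs, among them $1/9 + 8/9 = 1$ and $9 - 8 = 1$. A brute-force box can only confirm solutions it happens to contain; the point of the effective results below is to supply a provable height bound beyond which no solutions exist.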
Using Baker’s theory of logarithmic forms, Győry (1972, 1973, 1974, 1979, 1979/1980) gave the first effective upper bounds for the heights of the solutions of unit equations and $S$-unit equations over number fields. He systematically applied his results among others to decomposable form equations, polynomials and algebraic numbers of given discriminant, and irreducible polynomials, see Győry (1972, 1973, 1974, 1976, 1978a,b, 1980a,b, 1981a,b,c, 1982c). Győry’s bounds have been improved by several people, for references see the Notes in Section 4.7.

In the present chapter, we derive effective upper bounds for the heights of the solutions of $S$-unit equations by means of the best known variants of the classical Baker’s method. There are now other methods giving effective bounds for the solutions, see Bombieri (1993), Bombieri and Cohen (1997, 2003), Bugeaud (1998), Murty and Pasten (2013) and von Känel (2014b). A brief discussion of these methods, together with a comparison of the bounds they yield, is given in Section 4.5.

In Section 4.1, we present the best upper bounds to date for the heights of the solutions of (4.1) and (4.2) in units, $S$-units and, more generally, in an arbitrary finitely generated multiplicative subgroup of $K^*$, for a number field $K$. These results will be used to prove the main results in Chapter 8 on unit equations over finitely generated integral domains, and in Section 9.6 on decomposable form equations over $K$. Further, they will be applied to discriminant equations in our next book. For these and other possible applications, we give the upper bounds in completely explicit form.
In Section 4.2 we state new effective and quantitative results on approximation of numbers from $K^*$ by elements of a finitely generated subgroup of $K^*$. These are the hard core of our proofs. In Section 4.6, an application is presented in the direction of the abc-conjecture over number fields. Many other applications are mentioned in the Notes, Section 4.7 of this chapter and in Chapter 10.

Sections 3.2 and 4.3 contain the main tools needed in the proofs. We recalled in Section 3.2 the best known effective estimates, due to Matveev (2000) and Yu (2007), for linear forms in logarithms. Further, in Section 4.3 we prove a new result from the geometry of numbers and give height estimates for units/$S$-units in a fundamental/maximal independent system of units/$S$-units. Finally, in Section 4.4 we prove the results from Sections 4.1 and 4.2.

4.1 Effective bounds for the heights of the solutions

4.1.1 Equations in units of a number field

Let $K$ be an algebraic number field of degree $d$, $O_K$ the ring of integers of $K$ and $O_K^*$ the group of units of $O_K$. We denote by $R$ the regulator of $K$, by $r$ the rank of $O_K^*$, by $M_K$ the set of (infinite and finite) places, and by $M_K^{\infty}$ the set of infinite places of $K$. For $v \in M_K$, $|\cdot|_v$ denotes the absolute value corresponding to $v$, defined in Section 1.7. We recall that the absolute (multiplicative) height $H(\alpha)$ of $\alpha \in K$ is defined by
\[ H(\alpha) := \left( \prod_{v \in M_K} \max(1, |\alpha|_v) \right)^{1/d} \]
and the absolute logarithmic height $h(\alpha)$ by
\[ h(\alpha) := \log H(\alpha). \]
More generally, we define the height $h(\alpha)$ of $\alpha \in \overline{\mathbb{Q}}$ by taking a number field $K$ containing $\alpha$ and using the above definition; one can show that this is independent of the choice of $K$.
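As an illustration of the height just defined, the following sketch evaluates the product over places in the simplest case $K = \mathbb{Q}$ (so $d = 1$), assuming the usual normalization $|x|_p = p^{-\mathrm{ord}_p(x)}$ for the $p$-adic absolute values; the function names are hypothetical. For a fraction $a/b$ in lowest terms the product collapses to $\max(|a|, |b|)$, which the code lets one check directly.

```python
from fractions import Fraction
from math import log


def prime_factors(n):
    """Set of primes dividing the positive integer n (trial division)."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def p_adic_abs(x, p):
    """Normalized p-adic absolute value p**(-ord_p(x)) of a non-zero rational x."""
    num, den, order = abs(x.numerator), x.denominator, 0
    while num % p == 0:
        num //= p
        order += 1
    while den % p == 0:
        den //= p
        order -= 1
    return Fraction(p) ** (-order)


def multiplicative_height(x):
    """H(x): product over all places v of Q of max(1, |x|_v)."""
    x = Fraction(x)
    height = max(Fraction(1), abs(x))  # contribution of the infinite place
    # Only primes dividing the denominator have |x|_p > 1.
    for p in prime_factors(x.denominator):
        height *= max(Fraction(1), p_adic_abs(x, p))
    return height


def logarithmic_height(x):
    """h(x) = log H(x)."""
    return log(multiplicative_height(x))
```

For example, `multiplicative_height(Fraction(3, 4))` picks up a factor 1 at the infinite place and a factor 4 at the prime 2, giving $H(3/4) = 4 = \max(3, 4)$.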
For more details and for the most important properties of the height, we refer to Section 1.9. We shall frequently use these properties without any further reference.

Let $a_1, a_2, a_3$ be non-zero elements of $K$ and let $H$ be a real with
\[ H \ge \max\{h(a_1), h(a_2), h(a_3)\}, \quad H \ge \max\{1, \pi/d\}. \]
Consider the homogeneous unit equation
\[ a_1 x_1 + a_2 x_2 + a_3 x_3 = 0 \quad \text{in } x_1, x_2, x_3 \in O_K^*. \tag{4.1.1} \]
The following theorem is due to Győry and Yu (2006).

Theorem 4.1.1 All solutions $x_1, x_2, x_3$ of (4.1.1) satisfy
\[ \max_{i,j} h(x_i / x_j) \le c_1 R (\log^{*} R) H, \tag{4.1.2} \]
where
\[ c_1 := 4 (r + 1)^{2r+9}\, 2^{3.2(r+12)} \log(2r + 2)\, (d \log^{*}(2d))^3. \]

In some applications, for instance in our book on discriminant equations, at least two of the unknowns $x_1, x_2, x_3$ are conjugate to each other over $\mathbb{Q}$. In these situations the following theorem will lead to much sharper quantitative results. Let $K_1$ be a subfield of $K$ with degree $d_1$, unit rank $r_1$ and regulator $R_{K_1}$. Assume that for some $\mathbb{Q}$-isomorphism $\sigma$ of $K_1$, $\sigma(K_1)$ is also a subfield of $K$.

Theorem 4.1.2 All solutions $x_1, x_2, x_3$ of (4.1.1) with $x_2 \in K_1$, $x_3 = \sigma(x_2)$ satisfy
\[ \max_{1 \le i,j \le 3} h(x_i / x_j) \le c_2 R_{K_1} H \log\left( \frac{h(x_2)}{H} \right), \tag{4.1.3} \]
provided that
\[ h(x_2) > c_3 R_{K_1} H, \tag{4.1.4} \]
where
\[ c_2 := 2^{5.5 r_1 + 45}\, r_1^{2 r_1 + 2.5}, \quad c_3 := 320\, d^2 r_1^{2 r_1}. \]

It should be observed that in (4.1.3) the upper bound depends on $h(x_2)$. In terms of $d$ and $r_1$, Theorem 4.1.2 is an improvement of a result of Győry (1998).

In the next subsection we give more general versions of Theorem 4.1.1. A similar generalization of Theorem 4.1.2 is given in Győry (1998). But Theorems 4.1.1 and 4.1.2 provide, in the special situation they deal with, much better bounds in terms of $d$ and $r$.
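To get a feeling for the size of the explicit constants, the following sketch evaluates $c_1$ and the right-hand side of (4.1.2). It assumes the convention $\log^{*} x := \max(1, \log x)$, which is not restated in this section, and the function names are illustrative. The sample field $\mathbb{Q}(\sqrt{2})$, with $d = 2$, $r = 1$ and regulator $R = \log(1 + \sqrt{2})$, is chosen here only as an example.

```python
from math import log, pi, sqrt


def log_star(x):
    """log* x := max(1, log x); assumed convention, not restated in this section."""
    return max(1.0, log(x))


def c1(r, d):
    """Constant c_1 of Theorem 4.1.1 for unit rank r and degree d."""
    return (4 * (r + 1) ** (2 * r + 9) * 2 ** (3.2 * (r + 12))
            * log(2 * r + 2) * (d * log_star(2 * d)) ** 3)


def theorem_4_1_1_bound(r, d, regulator, H):
    """Right-hand side of (4.1.2): c_1 * R * log*(R) * H."""
    return c1(r, d) * regulator * log_star(regulator) * H


if __name__ == "__main__":
    d, r = 2, 1
    regulator = log(1 + sqrt(2))   # regulator of Q(sqrt(2))
    H = max(1.0, pi / d)           # smallest admissible H when a_1 = a_2 = a_3 = 1
    print(f"c_1   = {c1(r, d):.3e}")
    print(f"bound = {theorem_4_1_1_bound(r, d, regulator, H):.3e}")
```

Even for this small field the bound is of order $10^{18}$, which illustrates that results of this type are effective in principle but need further reduction techniques before they can be used to list solutions in practice.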
4.1.2 Equations with unknowns from a finitely generated multiplicative group

Let again $K$ be an algebraic number field of degree $d$. Let $\Gamma$ be a finitely generated multiplicative subgroup of $K^*$ of rank $q > 0$, and $\Gamma_{\mathrm{tors}}$ the torsion subgroup of $\Gamma$ consisting of all elements of finite order. We recall that $q$ is the smallest positive integer such that $\Gamma / \Gamma_{\mathrm{tors}}$ has a system of $q$ generators. Let $S$ denote the smallest set of places of $K$ such that $S$ contains all infinite places, and $\Gamma \subseteq O_S^*$ where $O_S^*$ denotes the group of $S$-units in $K$. Further, let $a_1, a_2 \in K^*$. We consider the equation
\[ a_1 x_1 + a_2 x_2 = 1 \quad \text{in } x_1 \in \Gamma,\ x_2 \in O_S^*. \tag{4.1.5} \]
In our first theorem below the following notation is used:

$H := \max\{1, h(a_1), h(a_2)\}$;

$\{\xi_1, \ldots, \xi_m\}$ is a system of generators for $\Gamma / \Gamma_{\mathrm{tors}}$ (not necessarily a basis) and $\Theta := h(\xi_1) \cdots h(\xi_m)$;

$s := |S|$, $\mathfrak{p}_1, \ldots, \mathfrak{p}_t$ are the prime ideals in $S$, and $P := \max\{2, N(\mathfrak{p}_1), \ldots, N(\mathfrak{p}_t)\}$, where $N(\mathfrak{p}_i) := |O_K / \mathfrak{p}_i|$ denotes the norm of $\mathfrak{p}_i$; in the case that $S$ consists only of the infinite places we put $t := 0$, $P := 2$.

Theorem 4.1.3 If $x_1, x_2$ is a solution of (4.1.5), then
\[ \max\{h(x_1), h(x_2)\} < 6.5\, c_4\, s\, \frac{P}{\log P}\, \Theta H \max\{\log(c_4 s P),\ \log^{*} \Theta\}, \tag{4.1.6} \]
where
\[ c_4 := 11 \lambda \cdot (m + 1)(\log^{*} m)(16ed)^{3m+5} \]
with $\lambda = 12$ if $m = 1$, $\lambda = 1$ if $m \ge 2$.

For some of our applications it is essential that we allow $\xi_1, \ldots, \xi_m$ to be any set of generators of $\Gamma / \Gamma_{\mathrm{tors}}$ and not necessarily a basis; see for instance the proof of Theorem 9.6.2 and the proofs of certain results on discriminant equations, to be discussed in our next book.
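The bound (4.1.6) can be evaluated mechanically once the parameters are known. The sketch below does this under the assumption $\log^{*} x := \max(1, \log x)$ (a convention not restated in this section); the function names are illustrative. As sample input it takes $K = \mathbb{Q}$, $a_1 = a_2 = 1$ and $\Gamma$ generated by 2 and 3, so that $d = 1$, $m = 2$, $s = 3$, $P = 3$, $\Theta = \log 2 \cdot \log 3$ and $H = 1$.

```python
from math import e, log


def log_star(x):
    """log* x := max(1, log x); assumed convention, not restated in this section."""
    return max(1.0, log(x))


def c4(m, d):
    """Constant c_4 of Theorem 4.1.3 for m generators and degree d."""
    lam = 12 if m == 1 else 1
    return 11 * lam * (m + 1) * log_star(m) * (16 * e * d) ** (3 * m + 5)


def theorem_4_1_3_bound(m, d, s, P, theta, H):
    """Right-hand side of (4.1.6)."""
    c = c4(m, d)
    return (6.5 * c * s * (P / log(P)) * theta * H
            * max(log(c * s * P), log_star(theta)))


if __name__ == "__main__":
    theta = log(2) * log(3)   # h(2) * h(3) for the generators 2 and 3
    bound = theorem_4_1_3_bound(m=2, d=1, s=3, P=3, theta=theta, H=1.0)
    print(f"c_4   = {c4(2, 1):.3e}")
    print(f"bound = {bound:.3e}")
```

For this input the bound is of order $10^{22}$, whereas the largest height actually attained by a solution of $x_1 + x_2 = 1$ in this group is $\log 9$. The gap is typical: the theorem is valued for being completely explicit and uniform, not for being numerically sharp.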
Almost the same bounds as in (4.1.6) were obtained in Bérczes, Evertse and Győry (2009), but with $c_4$ replaced by a constant which, for $m > q > 0$, contains also the factor $q^q$. This improvement will be important in our book on discriminant equations.

Theorem 4.1.3 implies in an effective way the finiteness of the number of solutions $x_1, x_2 \in \Gamma$ of (4.1.5). To formulate this in a precise form we recall that as in Section 1.10, $K$ is said to be effectively given if the minimal polynomial over $\mathbb{Z}$ of a primitive element $\theta$ of $K$ over $\mathbb{Q}$ is given. We may assume that $\theta$ is an algebraic integer. Further, an element $\alpha$ of $K$ is said to be given/effectively determinable if it is expressed in the form
\[ \alpha = (p_0 + p_1 \theta + \cdots + p_{d-1} \theta^{d-1}) / q \]
with rational integers $p_0, \ldots, p_{d-1}, q$ with $\gcd(p_0, \ldots, p_{d-1}, q) = 1$ that are given/can be effectively computed (see Section 1.10).

Corollary 4.1.4 For given $a_1, a_2 \in K^*$, equation (4.1.5) has only finitely many solutions in $x_1, x_2 \in \Gamma$. Further, there exists an algorithm which, from effectively given $K$, $a_1$, $a_2$, a system of generators for $\Gamma / \Gamma_{\mathrm{tors}}$ and $\Gamma_{\mathrm{tors}}$, computes all solutions $x_1, x_2$.

In the special case $\Gamma = O_S^*$, we obtain from Theorem 4.1.3 the following. Let $S$ be a finite subset of $M_K$ containing all infinite places, with the above parameters $s$, $P$. Denote by $R_S$ the $S$-regulator (see (1.8.2)). Define
\[ c_5 := 11 \lambda s^2 (\log^{*} s)(16ed)^{3s+2} \quad \text{with } \lambda = 12 \text{ if } s = 2,\ \lambda = 1 \text{ if } s \ge 3, \]
\[ c_6 := ((s - 1)!)^2 / (2^{s-2} d^{s-1}). \]

Corollary 4.1.5 Every solution $x_1, x_2$ of
\[ a_1 x_1 + a_2 x_2 = 1 \quad \text{in } x_1, x_2 \in O_S^* \tag{4.1.7} \]
satisfies
\[ \max(h(x_1), h(x_2)) < 6.5\, c_5 c_6 (P / \log P) H R_S \max\{\log(c_5 P),\ \log^{*}(c_6 R_S)\}. \tag{4.1.8} \]

This was proved by Győry and Yu (2006) in a slightly sharper form in terms of $d$ and $s$. Their proof is a more general variant of that of Theorem 4.1.1. In the special case $S = M_K^{\infty}$, Corollary 4.1.5 gives Theorem 4.1.1 but only with
a weaker bound in terms of $d$ and $r$. From Theorem 4.1.3, a weaker version of Theorem 4.1.2 can also be deduced.

We say that $S$ is effectively given if the prime ideals in $S$ are effectively given in the sense defined in Section 1.10. The next corollary follows both from Corollary 4.1.5 and from Corollary 4.1.4.

Corollary 4.1.6 For given $a_1, a_2 \in K^*$, equation (4.1.7) has only finitely many solutions. Further, there exists an algorithm which, from effectively given $K$, $a_1$, $a_2$ and $S$, computes all solutions.

If the number $t$ of finite places in $S$ exceeds $\log P$, then, in terms of $S$, $s$ is the dominating factor in the bound occurring in (4.1.8). This factor is a consequence of the use of Proposition 4.3.9 concerning $S$-units whose proof is based on Minkowski’s Theorem on successive minima. In the following version of Corollary 4.1.5 there is no factor of the form $s^s$ or $t^t$. This improvement plays an important role in some applications, see e.g. Győry, Pink and Pintér (2004), Győry and Yu (2006), Győry (2006) and it is also applied in our next book on discriminant equations.

Let $\mathcal{R} := \max\{h, R\}$, where $h$ and $R$ denote the class number and regulator of $K$, respectively. Further, let $r$ denote the unit rank of $K$. From Theorem 4.2.1 below we shall deduce the following.

Theorem 4.1.7 Let $t > 0$. Then every solution $x_1, x_2$ of equation (4.1.7) satisfies
\[ \max\{h(x_1), h(x_2)\} < c_7\, d^{r+3}\, \mathcal{R}^{t+4}\, P H R_S, \tag{4.1.9} \]
where $c_7$ is an effectively computable positive absolute constant.

This was established in Győry and Yu (2006) in a somewhat different and completely explicit form; for a slight improvement see Győry (2008a). We note that in view of (1.5.2) and (1.5.3), $R$ can be estimated from above in terms of $d$ and the discriminant of $K$.
Further, in view of (1.8.3) we have
\[ R \prod_{i=1}^{t} \log N(\mathfrak{p}_i) \le R_S \le h R \prod_{i=1}^{t} \log N(\mathfrak{p}_i). \]
The linear dependence on $H$ of the bounds in (4.1.6), (4.1.8) and (4.1.9) cannot be improved. Indeed, let $a_1 = 1 - \varepsilon$ with $\varepsilon \in O_S^*$ and $a_2 = 1$. Then equations (4.1.5) and (4.1.7) have the solution $x_1 = 1$, $x_2 = \varepsilon$, and it is easy to see that
\[ H - \log 2 \le \max\{h(x_1), h(x_2)\} \le H + \log 2. \]

4.2 Approximation by elements of a finitely generated multiplicative group

We deduce Theorem 4.1.3 from the following Diophantine approximation theorem. Keeping the above notation, we put $N(v) := 2$ if $v$ is an infinite place, and $N(v) := N(\mathfrak{p})$ if $v = \mathfrak{p}$ is a finite place, i.e., prime ideal of $O_K$.

Theorem 4.2.1 Let $\Gamma$ be a finitely generated multiplicative subgroup of $K^*$ with system of generators $\{\xi_1, \ldots, \xi_m\}$ for $\Gamma / \Gamma_{\mathrm{tors}}$. Let $\alpha \in K^*$, and put
\[ H := \max(h(\alpha), 1), \quad \Theta := h(\xi_1) \cdots h(\xi_m). \]
Further, let $v \in M_K$. Then for every $\xi \in \Gamma$ with $\alpha \xi \neq 1$, we have
\[ \log |1 - \alpha \xi|_v > -c_8\, \frac{N(v)}{\log N(v)}\, \Theta H \log^{*}\left( \frac{N(v) h(\xi)}{H} \right), \tag{4.2.1} \]
where
\[ c_8 := 2 \lambda \cdot (m + 1) \log^{*}(dm)(\log^{*} d)^2 (16ed)^{3m+5} \]
with $\lambda = 12$ if $m = 1$, $\lambda = 1$ if $m \ge 2$.

The following theorem is an immediate consequence of Theorem 4.2.1. The estimate (4.2.3) below is of a similar flavour to results in Bombieri (1993), Bombieri and Cohen (1997, 2003) and Bugeaud (1998) (see also Bilu (2002), Bombieri and Gubler (2006), section 5.4), but, as will be seen in Section 4.5, inequality (4.2.3) below gives in many cases a better upper bound for $h(\xi)$.

Theorem 4.2.2 Let $\alpha \in K^*$, $v \in M_K$ and $0 < \kappa \le 1$. If $\xi \in \Gamma$ is such that $\alpha \xi \neq 1$ and
\[ \log |1 - \alpha \xi|_v < -\kappa h(\xi) \tag{4.2.2} \]
then
\[ h(\xi) < 6.4\, (c_8 / \kappa)\, \frac{N(v)}{\log N(v)}\, \Theta H \max\{\log((c_8 / \kappa) N(v)),\ \log^{*} \Theta\}. \tag{4.2.3} \]

Similar results were proved in Bérczes, Evertse and Győry (2009) but with $c_8$ replaced by a constant which, for $m > q > 0$, contains also the factor $q^q$. Here $q$ denotes the rank of $\Gamma$. It is crucial for some applications of Theorems 4.2.1 and 4.2.2, for example in Theorem 4.1.3 and Theorem 4.1.7, that no factor $q^q$ occurs in $c_8$.

The main tool in the proofs of Theorems 4.2.1 and 4.2.2 is the theory of logarithmic forms, more precisely Theorem 3.2.8. It will be combined with some new results from the geometry of numbers and some estimates for fundamental/independent units.

4.3 Tools

4.3.1 Some geometry of numbers

Let $V$ be a real vector space of finite dimension $n$. We endow $V$ with a topology by choosing a linear isomorphism $\varphi : V \to \mathbb{R}^n$ and taking the inverse images under $\varphi$ of the open sets of $\mathbb{R}^n$. This does not depend on the choice of $\varphi$. By a lattice in $V$ we mean an additive group of the shape
\[ L = \left\{ \sum_{i=1}^{q} z_i a_i : z_1, \ldots, z_q \in \mathbb{Z} \right\}, \]
where $a_1, \ldots, a_q$ are linearly independent vectors of $V$. We call $\{a_1, \ldots, a_q\}$ a basis of $L$ and $q$ the dimension of $L$. Clearly, $q \le n$. By a full lattice in $V$ we mean a lattice in $V$ of maximal dimension $n$.

A norm or convex distance function on $V$ is a function $\|\cdot\| : V \to \mathbb{R}_{\ge 0}$ such that
\[ \|x + y\| \le \|x\| + \|y\| \quad \text{for } x, y \in V; \]
\[ \|\lambda x\| = |\lambda| \cdot \|x\| \quad \text{for } x \in V,\ \lambda \in \mathbb{R}; \]
\[ \|x\| = 0 \quad \text{if and only if } x = 0. \]
The unit ball of $\|\cdot\|$ is defined by $B_{\|\cdot\|} = \{x \in V : \|x\| \le 1\}$. It is a convex, compact, symmetric body in $V$, i.e., it is convex, symmetric about 0, and it is compact and has interior points with respect to the topology on $V$ defined above.
Conversely, with any convex, compact, symmetric body $C$ in $V$ one can associate a norm $\|\cdot\|_C$ on $V$ such that $C$ is the unit ball of $\|\cdot\|_C$: take $\|x\|_C := \lambda$, where $\lambda$ is the minimum of all reals $\mu \ge 0$ such that $x \in \mu C := \{\mu y : y \in C\}$.

Let $\|\cdot\|$ be a norm on $V$, and $L$ a $q$-dimensional lattice in $V$. For $i = 1, \ldots, q$ we define the $i$-th successive minimum $\lambda_i = \lambda_i(\|\cdot\|, L)$ of $\|\cdot\|$ with respect to $L$, to be the minimum of all numbers $\lambda$ such that $\{x \in V : \|x\| \le \lambda\}$ contains at least $i$ linearly independent vectors from $L$.

We recall Minkowski’s Theorem on successive minima. For technical simplicity we restrict ourselves to the special case of full lattices in $\mathbb{R}^q$. But note that the general case can be reduced to this special case by means of a linear isomorphism. We denote by “vol” the Lebesgue measure on $\mathbb{R}^q$, normalized such that the unit cube $[0, 1]^q$ has measure 1. If $L$ is a full lattice in $\mathbb{R}^q$ with basis $\{a_1, \ldots, a_q\}$, say, we define the determinant of $L$ by $d(L) = |\det(a_1, \ldots, a_q)|$. This is independent of the choice of the basis.

Theorem 4.3.1 Let $\lambda_1, \ldots, \lambda_q$ be the successive minima of a norm $\|\cdot\|$ on $\mathbb{R}^q$ with respect to a full lattice $L$ in $\mathbb{R}^q$. Then
\[ \frac{2^q}{q!} \le \lambda_1 \cdots \lambda_q\, \frac{\mathrm{vol}(B_{\|\cdot\|})}{d(L)} \le 2^q. \]

Proof. For a proof, see Cassels (1959), chapter VIII or Minkowski (1910). We note that both the upper bound and the lower bound are best possible.

Corollary 4.3.2 Let $\|\cdot\|$ be a norm on $\mathbb{R}^q$, and $L$ a full lattice in $\mathbb{R}^q$ such that $\mathrm{vol}(B_{\|\cdot\|}) \ge 2^q d(L)$. Then there is a non-zero $x \in L$ with $\|x\| \le 1$.

Proof. By Theorem 4.3.1 we have $\lambda_1 \le (\lambda_1 \cdots \lambda_q)^{1/q} \le 1$.

Theorem 4.3.3 Let $\|\cdot\|$ be a norm on $\mathbb{R}^q$, $L$ a full lattice in $\mathbb{R}^q$, and $\lambda_1, \ldots, \lambda_q$ the successive minima of $\|\cdot\|$ with respect to $L$. Then $L$ has a basis $\{a_1, \ldots, a_q\}$ such that
\[ \|a_i\| \le \max(1, i/2) \lambda_i \quad \text{for } i = 1, \ldots, q. \]

Proof. See Cassels (1959), chapter V. The idea of the proof originates from Mahler.

We now prove a technical result, which will be applied later in combination with logarithmic forms estimates. Proposition 4.4.1 from Section 4.4, which is a consequence of Proposition 4.3.4 below, will play an important role in the proof of Theorem 4.2.1.

Proposition 4.3.4 Let $V$ be a real vector space, $L$ a lattice in $V$ of dimension $q \ge 1$, and $\|\cdot\|$ a norm on $V$, such that $\|x\| \ge \theta > 0$ for all $x \in L \setminus \{0\}$. Further, let $m \ge q$ be an integer, and let $a_1, \ldots, a_m$ be vectors in $L \setminus \{0\}$ for which
\[ a_1, \ldots, a_m \text{ generate } L \text{ as a } \mathbb{Z}\text{-module, and among all systems of } m \text{ vectors that generate } L,\ \prod_{i=1}^{m} \|a_i\| \text{ is minimal.} \tag{4.3.1} \]
Then for every $x \in L$ there are $b_1, \ldots, b_m \in \mathbb{Z}$ such that
\[ x = b_1 a_1 + \cdots + b_m a_m \quad \text{with } |b_i| \le q^{2q}\, \frac{\|x\|}{\theta} \text{ for } i = 1, \ldots, m. \]

It is crucial for applications that in Proposition 4.3.4 $a_1, \ldots, a_m$ do not have to form a basis of $L$.

We assume that $V = \mathbb{R}^q$, $L = \mathbb{Z}^q$ which is no loss of generality. Indeed, we may assume without loss of generality that $V$ is the real vector space spanned by $L$. Let $\varphi : \mathbb{R}^q \to V$ be a linear isomorphism such that $\varphi(\mathbb{Z}^q) = L$. Define a norm $\|\cdot\|_{\varphi}$ on $\mathbb{R}^q$ by $\|x\|_{\varphi} := \|\varphi(x)\|$. Then clearly, it suffices to prove Proposition 4.3.4 with $\mathbb{R}^q$, $\mathbb{Z}^q$, $\varphi^{-1}(a_1), \ldots, \varphi^{-1}(a_m)$, $\|\cdot\|_{\varphi}$ instead of $V$, $L$, $a_1, \ldots, a_m$, $\|\cdot\|$.

For the proof of Proposition 4.3.4 (with $V = \mathbb{R}^q$, $L = \mathbb{Z}^q$) we make some preparations. Since we assume $L = \mathbb{Z}^q$, the factor $d(L)$ in these results disappears. Denote by $\lambda_1, \ldots, \lambda_q$ the successive minima of $\|\cdot\|$ with respect to $\mathbb{Z}^q$. By assumption, we have $\lambda_1 \ge \theta$. We define
\[ V := \mathrm{vol}(B_{\|\cdot\|}) = \mathrm{vol}(\{x \in \mathbb{R}^q : \|x\| \le 1\}). \]
We need a number of lemmas.

Lemma 4.3.5 Let $f_0, f_1, \ldots, f_m$ be vectors in $\mathbb{Z}^q$ such that $f_1, \ldots, f_m$ generate $\mathbb{Z}^q$. Then there are integers $b_1, \ldots, b_m$ such that
\[ f_0 = \sum_{i=1}^{m} b_i f_i, \qquad |b_i| \le M(f_0, \ldots, f_m) \ \text{ for } i = 1, \ldots, m, \]
where
\[ M(f_0, \ldots, f_m) = \max_{0 \le i_1 < \cdots < i_q \le m} |\det(f_{i_1}, \ldots, f_{i_q})|. \]

Proof. This is a result of Borosh, Flahive, Rubin and Treybig (1989).

Lemma 4.3.6 Let $f_1, \ldots, f_q \in \mathbb{R}^q$. Then
\[ |\det(f_1, \ldots, f_q)| \le \frac{q!\,V}{2^q}\,\|f_1\| \cdots \|f_q\|. \]

Proof. We assume without loss of generality that $f_1, \ldots, f_q$ are linearly independent. Put $g_i := \|f_i\|^{-1} f_i$ for $i = 1, \ldots, q$, and denote by $D$ the convex hull of the points $\pm g_i$ $(i = 1, \ldots, q)$. Then our lemma follows at once from the observations $D \subset B_{\|\cdot\|}$ and
\[ \mathrm{vol}(D) = \frac{2^q}{q!} \cdot |\det(g_1, \ldots, g_q)| = \frac{2^q}{q!} \cdot \frac{|\det(f_1, \ldots, f_q)|}{\|f_1\| \cdots \|f_q\|}. \]

In what follows, let $a_1, \ldots, a_m$ be as in Proposition 4.3.4, and assume again that $L = \mathbb{Z}^q$.

Lemma 4.3.7 Let $i_1, \ldots, i_q$ be any distinct indices from $\{1, \ldots, m\}$. Then
\[ \prod_{j=1}^{q} \|a_{i_j}\| \le \frac{q!}{2^{q-1}}\,\lambda_1 \cdots \lambda_q. \]

Proof. For convenience, we put
\[ \mu_1 = \cdots = \mu_{m-q+1} := \lambda_1, \quad \mu_{m-q+2} := \lambda_2, \ \ldots, \ \mu_m := \lambda_q. \]
By Theorem 4.3.3, the lattice $\mathbb{Z}^q$ has a basis $\{y_1, \ldots, y_q\}$ such that $\|y_i\| \le \max(1, i/2)\lambda_i$ for $i = 1, \ldots, q$. This implies
\[ \|a_1\| \cdots \|a_m\| \le \|y_1\|^{m-q+1} \|y_2\| \cdots \|y_q\| \le \frac{q!}{2^{q-1}}\,\lambda_1^{m-q+1} \lambda_2 \cdots \lambda_q = \frac{q!}{2^{q-1}}\,\mu_1 \cdots \mu_m, \tag{4.3.2} \]
where we have used (4.3.1). Without loss of generality we may assume that $\|a_1\| \le \cdots \le \|a_m\|$. Let $i_0 := 0$ and for $j = 1, \ldots, q$ define $i_j$ to be the largest index $i$ such that $\mathrm{rank}\{a_1, \ldots, a_i\} = j$. Then
\[ \|a_i\| \ge \lambda_j \quad\text{for } i_{j-1} + 1 \le i \le i_j,\ j = 1, \ldots, q, \]
and so $\|a_i\| \ge \mu_i$ for $i = 1, \ldots, m$. Together with (4.3.2) this implies that for any subset $I$ of $\{1, \ldots, m\}$,
\[ \prod_{i \in I} \frac{\|a_i\|}{\mu_i} \le \frac{q!}{2^{q-1}}. \]
Hence for any $q$ distinct indices $i_1, \ldots, i_q$ from $\{1, \ldots, m\}$,
\[ \prod_{j=1}^{q} \|a_{i_j}\| \le \|a_{m-q+1}\| \cdots \|a_m\| \le \frac{q!}{2^{q-1}}\,\mu_{m-q+1} \cdots \mu_m \le \frac{q!}{2^{q-1}}\,\lambda_1 \cdots \lambda_q, \]
which is our lemma.

Proof of Proposition 4.3.4. Without loss of generality, we assume that $x \ne 0$. In view of Lemma 4.3.5, it suffices to show that
\[ M(x, a_1, \ldots, a_m) \le q^{2q} \cdot \frac{\|x\|}{\theta}. \tag{4.3.3} \]
First, let $i_1, \ldots, i_q$ be any $q$ distinct indices from $\{1, \ldots, m\}$. Then by Lemmas 4.3.6 and 4.3.7, Theorem 4.3.1 (Minkowski's Theorem on successive minima) and our assumption $\|x\| \ge \theta$, we have
\[ |\det(a_{i_1}, \ldots, a_{i_q})| \le \frac{q!}{2^q} \cdot V \cdot \|a_{i_1}\| \cdots \|a_{i_q}\| \le \frac{(q!)^2}{2^{2q-1}} \cdot V \lambda_1 \cdots \lambda_q \le \frac{(q!)^2}{2^{q-1}} \le q^{2q} \cdot \frac{\|x\|}{\theta}. \]
Next, let $i_1, \ldots, i_{q-1}$ be any $q-1$ distinct indices from $\{1, \ldots, m\}$. Using Lemma 4.3.6, our assumption $\|a_i\| \ge \theta$ for $i = 1, \ldots, m$, and Lemma 4.3.7 and Theorem 4.3.1 (Minkowski's Theorem), we get
\begin{align*}
|\det(x, a_{i_1}, \ldots, a_{i_{q-1}})| &\le \frac{q!}{2^q} \cdot V \|x\| \cdot \|a_{i_1}\| \cdots \|a_{i_{q-1}}\| \\
&\le \frac{q!}{2^q} \cdot V \|a_{i_1}\| \cdots \|a_{i_q}\| \cdot \frac{\|x\|}{\theta} \quad (\text{with } i_q \in \{1, \ldots, m\} \setminus \{i_1, \ldots, i_{q-1}\}) \\
&\le \frac{(q!)^2}{2^{2q-1}} \cdot V \lambda_1 \cdots \lambda_q \cdot \frac{\|x\|}{\theta} \le \frac{(q!)^2}{2^{q-1}} \cdot \frac{\|x\|}{\theta} \le q^{2q} \cdot \frac{\|x\|}{\theta}.
\end{align*}
This clearly proves (4.3.3) and Proposition 4.3.4.

4.3.2 Estimates for units and S-units

Let $K$ be an algebraic number field of degree $d$ with ring of integers $\mathcal{O}_K$, unit rank $r$ and regulator $R$. Denote by $\omega_K$ the number of roots of unity in $K$. We determine upper bounds for the heights of units and $S$-units in a fundamental/maximal independent system. We start with some auxiliary results. The first is due to Loher and Masser.

Proposition 4.3.8 For $n \ge 1$, let $\alpha_1, \ldots, \alpha_n$ be multiplicatively independent non-zero elements of $K$. Then we have
\[ 58\,(n!\,e^n/n^n)\,d^{\,n+1} (\log^* d)\,h(\alpha_1) \cdots h(\alpha_n) \ge \omega_K. \]

Proof.
This is a consequence of Loher and Masser (2004), Theorem 3.

As is known, $n!\,e^n/n^n$ is asymptotic to $\sqrt{2\pi n}$ and $n!\,e^n/n^n \le e\sqrt{n}$. Hence Proposition 4.3.8 gives
\[ 58\,e\sqrt{n}\;d^{\,n+1} (\log^* d)\,h(\alpha_1) \cdots h(\alpha_n) \ge \omega_K. \tag{4.3.4} \]
For simplicity, we shall apply the consequence (4.3.4) of Proposition 4.3.8.

Let $S = \{v_1, \ldots, v_s\}$ be a finite set of places on $K$ which contains the set $M_K^\infty$ of the infinite places. Denote by $\mathcal{O}_S$, $\mathcal{O}_S^*$ and $R_S$ the ring of $S$-integers, the group of $S$-units and the $S$-regulator of $K$, respectively. If in particular $S = M_K^\infty$, then $s = r + 1$, $\mathcal{O}_S = \mathcal{O}_K$, $\mathcal{O}_S^*$ is just the unit group $\mathcal{O}_K^*$ of $K$, and $R_S = R$.

We define the constants
\begin{align*}
c_9 &:= ((s-1)!)^2/(2^{s-2} d^{\,s-1}), & c_9' &:= (s-1)!/d^{\,s-1}, \\
c_{10} &:= 29e\sqrt{s-2}\;d^{\,s-1} (\log^* d)\,c_9 \quad (s \ge 3), & c_{10}' &:= 29e\sqrt{s-2}\;d^{\,s-1} (\log^* d)\,c_9' \quad (s \ge 3), \\
c_{11} &:= \bigl(((s-1)!)^2/2^{s-1}\bigr) (\log(3d))^3.
\end{align*}

Proposition 4.3.9 Let $s \ge 2$. There exists in $K$ a fundamental (respectively independent) system $\{\varepsilon_1, \ldots, \varepsilon_{s-1}\}$ of $S$-units with the following properties:
(i) $\prod_{i=1}^{s-1} h(\varepsilon_i) \le c_9 R_S$ (resp. $c_9' R_S$);
(ii) $\max_{1 \le i \le s-1} h(\varepsilon_i) \le c_{10} R_S$ (resp. $c_{10}' R_S$) if $s \ge 3$;
(iii) for such a fundamental system $\{\varepsilon_1, \ldots, \varepsilon_{s-1}\}$, the absolute values of the entries of the inverse matrix of $(\log |\varepsilon_i|_{v_j})_{i,j=1,\ldots,s-1}$ do not exceed $c_{11}$.

Remark A similar result was proved earlier by Siegel (1969) for ordinary units, i.e., in the case $S = M_K^\infty$. The proof given below, which is a straightforward extension of Siegel's argument, is due to Győry and Yu (2006) and, in slightly weaker forms, to Hajdu (1993) and Bugeaud and Győry (1996a). Recently, for multiplicatively independent $S$-units, Vaaler (2014), theorems 1, 2 obtained the slightly better upper bound $s!/(2d)^{s-1}$ instead of $c_9'$.

Proof.
For $\alpha \in K \setminus \{0\}$, put $\mathbf{v}(\alpha) := (\log |\alpha|_{v_1}, \ldots, \log |\alpha|_{v_{s-1}})$. The full lattice $L$ in $\mathbb{R}^{s-1}$ spanned by the vectors $\mathbf{v}(\eta)$ with $\eta \in \mathcal{O}_S^*$ has determinant $R_S$; see Section 1.8. The function $\|\cdot\| : \mathbb{R}^{s-1} \to \mathbb{R}$ defined by
\[ \|x\| := |x_1| + \cdots + |x_{s-1}| \quad\text{for } x = (x_1, \ldots, x_{s-1}) \in \mathbb{R}^{s-1} \]
is a norm; see Section 4.3.1. Denote by $V$ the volume of the unit ball $\{x \in \mathbb{R}^{s-1} : \|x\| \le 1\}$. It is easy to check that $V = 2^{s-1}/(s-1)!$. By Theorem 4.3.1 (Minkowski's Theorem on successive minima) the successive minima $\lambda_1, \ldots, \lambda_{s-1}$ of $L$ with respect to $\|\cdot\|$ have the property
\[ \lambda_1 \cdots \lambda_{s-1} \le 2^{s-1} R_S / V = (s-1)!\,R_S. \tag{4.3.5} \]
Further, there are multiplicatively independent $S$-units $\eta_1, \ldots, \eta_{s-1}$ in $\mathcal{O}_S$ for which
\[ \|\mathbf{v}(\eta_i)\| = \lambda_i, \quad i = 1, \ldots, s-1. \tag{4.3.6} \]
However, for every $\eta \in \mathcal{O}_S^*$ we have $\prod_{j=1}^{s} |\eta|_{v_j} = 1$, hence
\begin{align*}
h(\eta) &= \frac{1}{d} \sum_{j=1}^{s} \max\bigl(0, \log |\eta|_{v_j}\bigr) = \frac{1}{2d} \sum_{j=1}^{s} \bigl|\log |\eta|_{v_j}\bigr| \\
&= \frac{1}{2d} \Biggl( \sum_{j=1}^{s-1} \bigl|\log |\eta|_{v_j}\bigr| + \Bigl| \sum_{i=1}^{s-1} \log |\eta|_{v_i} \Bigr| \Biggr),
\end{align*}
which implies that
\[ \frac{1}{2d}\,\|\mathbf{v}(\eta)\| \le h(\eta) \le \frac{1}{d}\,\|\mathbf{v}(\eta)\|. \tag{4.3.7} \]
We infer from (4.3.5), (4.3.6) and (4.3.7) that
\[ \prod_{i=1}^{s-1} h(\eta_i) \le \frac{(s-1)!}{d^{\,s-1}} \cdot R_S, \]
i.e. (i) holds for $\eta_1, \ldots, \eta_{s-1}$.

It follows from Theorem 4.3.3 that there exists a fundamental system of $S$-units $\{\varepsilon_1, \ldots, \varepsilon_{s-1}\}$ in $\mathcal{O}_S$ such that
\[ \|\mathbf{v}(\varepsilon_i)\| \le \max\{1, i/2\}\,\|\mathbf{v}(\eta_i)\|, \quad i = 1, \ldots, s-1. \tag{4.3.8} \]
Further, by (4.3.7), (4.3.8), (4.3.6) and (4.3.5) we have
\[ \prod_{i=1}^{s-1} h(\varepsilon_i) \le \frac{1}{d^{\,s-1}} \prod_{i=1}^{s-1} \|\mathbf{v}(\varepsilon_i)\| \le \frac{(s-1)!}{2^{s-2} d^{\,s-1}} \prod_{i=1}^{s-1} \|\mathbf{v}(\eta_i)\| \le \frac{((s-1)!)^2}{2^{s-2} d^{\,s-1}} \cdot R_S, \tag{4.3.9} \]
which proves (i).

(ii) is an immediate consequence of (i) and (4.3.4).

It remains to prove (iii). Putting $E := (\log |\varepsilon_i|_{v_j})_{i,j=1,\ldots,s-1}$ we have $|\det(E)| = R_S$. If $s = 2$, then (1.5.3) and (1.8.5) prove (iii).
Now let $s > 2$ and $e_{ij} := \det(E_{ij})/\det(E)$, where $E_{ij}$ denotes the matrix obtained from $E$ by omitting the $i$-th row and $j$-th column. It follows from (4.3.9) and Hadamard's inequality that
\[ |\det(E_{ij})| \le \prod_{\substack{p=1 \\ p \ne i}}^{s-1} \Biggl( \sum_{\substack{q=1 \\ q \ne j}}^{s-1} \bigl(\log |\varepsilon_p|_{v_q}\bigr)^2 \Biggr)^{1/2} \le \prod_{\substack{p=1 \\ p \ne i}}^{s-1} \|\mathbf{v}(\varepsilon_p)\| \le \frac{((s-1)!)^2}{2^{s-2}} \cdot \frac{R_S}{\|\mathbf{v}(\varepsilon_i)\|}. \]
Together with (4.3.7), $|\det(E)| = R_S$ and Proposition 3.2.9, this proves (iii).

For $s \ge 3$, let
\[ c_{12} := 29e\sqrt{s-2} \cdot \frac{((s-1)!)^2}{2^{s-2}} \cdot \pi^{s-2}\, d \log^* d. \]
When we apply Theorem 3.2.5 to unit equations, we shall get better bounds by using the following version of (i), Proposition 4.3.9.

Lemma 4.3.10 Let $\{\varepsilon_1, \ldots, \varepsilon_{s-1}\}$ be a fundamental system of $S$-units in $K$ with the properties specified in Proposition 4.3.9. Then
\[ \prod_{i=1}^{s-1} \max(d\,h(\varepsilon_i), \pi) \le \begin{cases} \max(R_S, \pi) & \text{if } s = 2, \\ c_{12} R_S & \text{if } s \ge 3. \end{cases} \tag{4.3.10} \]

Proof. The case $s = 2$ is trivially true by Proposition 4.3.9. Suppose $s \ge 3$. Let $k$ denote the number of indices $i$ with $1 \le i \le s-1$ such that $d\,h(\varepsilon_i) < \pi$. Suppose first $1 \le k \le s-2$. Without loss of generality, we may assume $d\,h(\varepsilon_i) < \pi$ for $i = 1, \ldots, k$ and $d\,h(\varepsilon_j) \ge \pi$ for $j = k+1, \ldots, s-1$. Thus, using (4.3.4) and Proposition 4.3.9, we infer that
\begin{align*}
\prod_{i=1}^{s-1} \max(d\,h(\varepsilon_i), \pi) &= \frac{\pi^k}{d^{\,k} h(\varepsilon_1) \cdots h(\varepsilon_k)} \cdot d^{\,s-1} h(\varepsilon_1) \cdots h(\varepsilon_{s-1}) \\
&\le 29e\sqrt{k} \cdot \frac{((s-1)!)^2}{2^{s-2}} \cdot \pi^k d (\log^* d)\,R_S \le c_{12} R_S,
\end{align*}
which proves (4.3.10). If $k = 0$, then (4.3.10) immediately follows from (i) of Proposition 4.3.9. For $k = s-1$, we have $d\,h(\varepsilon_i) < \pi$ for each $i$. Further, if $s \ge 3$ then (1.8.5) gives a lower bound for $R_S$ and (4.3.10) follows.

Let $\mathfrak{p}_1, \ldots, \mathfrak{p}_t$ be the prime ideals in $S$, and put
\[ Q := N(\mathfrak{p}_1 \cdots \mathfrak{p}_t) \ \text{ if } t > 0, \qquad Q := 1 \ \text{ if } t = 0. \]
Let $h_K$ denote the class number of $K$, and let
\[ c_{13} := \begin{cases} 0 & \text{if } r = 0, \\ 1/d & \text{if } r = 1, \\ 29e\,r!\,r\sqrt{r-1}\,\log d & \text{if } r \ge 2. \end{cases} \]

Proposition 4.3.11 Let $\theta_v$ $(v \in S)$ be reals with $\sum_{v \in S} \theta_v = 0$. Then there exists $\varepsilon \in \mathcal{O}_S^*$ such that
\[ \sum_{v \in S} \bigl| \log |\varepsilon|_v - \theta_v \bigr| \le c_{13}\, d R + h_K \log Q. \tag{4.3.11} \]

Remark As will follow from the proof, in the special case $t = 0$ the unit $\varepsilon \in \mathcal{O}_K^*$ occurring in Proposition 4.3.11 can be chosen from the group generated by independent units having the properties specified in (i) and (ii) of Proposition 4.3.9.

Proof. We start with the case $t = 0$. Then $S = \{v_1, \ldots, v_{r+1}\}$, where $v_1, \ldots, v_{r+1}$ are the infinite places of $K$. Write $\theta_i$ for $\theta_{v_i}$. If $r = 0$, then $S = \{v_1\}$, hence $\theta_1 = 0$, and thus the assertion holds with $\varepsilon = 1$. Assume that $r > 0$. Choose a system of independent units $\varepsilon_1, \ldots, \varepsilon_r$ in $K$ with the properties specified in Proposition 4.3.9. Consider the system of linear equations
\[ \sum_{j=1}^{r} \log |\varepsilon_j|_{v_i}\, x_j = \theta_i, \quad i = 1, \ldots, r+1, \]
in $x_1, \ldots, x_r$. The equations with $i = 1, \ldots, r$ have a unique solution $(x_1, \ldots, x_r) \in \mathbb{R}^r$, since $\det(\log |\varepsilon_j|_{v_i})_{i,j=1,\ldots,r} \ne 0$ (its absolute value is a positive integer multiple of the regulator $R$). This solution satisfies also the equation with $i = r+1$, since $\sum_{i=1}^{r+1} \log |\varepsilon_j|_{v_i} = 0$ for $j = 1, \ldots, r$, and $\sum_{i=1}^{r+1} \theta_i = 0$. Let $b_1, \ldots, b_r$ be the rational integers with $-\tfrac12 < b_j - x_j \le \tfrac12$ for $j = 1, \ldots, r$ and take $\varepsilon = \varepsilon_1^{b_1} \cdots \varepsilon_r^{b_r}$. Then
\begin{align*}
\sum_{i=1}^{r+1} \bigl| \log |\varepsilon|_{v_i} - \theta_i \bigr| &= \sum_{i=1}^{r+1} \Bigl| \sum_{j=1}^{r} (b_j - x_j) \log |\varepsilon_j|_{v_i} \Bigr| \\
&\le \frac12 \sum_{i=1}^{r+1} \sum_{j=1}^{r} \bigl| \log |\varepsilon_j|_{v_i} \bigr| \le \sum_{j=1}^{r} \sum_{i=1}^{r} \bigl| \log |\varepsilon_j|_{v_i} \bigr|.
\end{align*}
We assert that if $r > 1$, then the inner sum over $i$ in the last expression is at most $(d/r)\,c_{13} R$. This can be seen by using (4.3.5), (4.3.6), the second inequality of (4.3.7) and by applying Proposition 4.3.8 to any $r-1$ of the $\varepsilon_i$ $(1 \le i \le r)$. Thus Proposition 4.3.11 is proved for $r > 1$. If $r = 1$, we can use (i) of Proposition 4.3.9 to prove the assertion.

Now let $t > 0$. Recall that $S = M_K^\infty \cup S_0$, where $S_0 = \{\mathfrak{p}_1, \ldots, \mathfrak{p}_t\}$ is the set of finite places, i.e., prime ideals in $S$. For $\mathfrak{p} \in S_0$, let $k_{\mathfrak{p}}$ be the integer such that
\[ -\tfrac12 < k_{\mathfrak{p}} + \frac{\theta_{\mathfrak{p}}}{h_K \log N\mathfrak{p}} \le \tfrac12. \]
There is $\alpha \in K^*$ such that $(\alpha) = \bigl( \prod_{\mathfrak{p} \in S_0} \mathfrak{p}^{k_{\mathfrak{p}}} \bigr)^{h_K}$. We have $\alpha \in \mathcal{O}_S^*$ and
\[ \bigl| \log |\alpha|_{\mathfrak{p}} - \theta_{\mathfrak{p}} \bigr| = \bigl| -k_{\mathfrak{p}} h_K \log N\mathfrak{p} - \theta_{\mathfrak{p}} \bigr| \le \tfrac12 h_K \log N\mathfrak{p} \quad (\mathfrak{p} \in S_0). \tag{4.3.12} \]
By the Product Formula, $\sum_{v \in S} \log |\alpha|_v = 0$. Together with what we just proved, this implies that there is $\eta \in \mathcal{O}_K^*$ such that
\[ A := \sum_{v \in M_K^\infty} \Bigl| \log |\eta|_v + \log |\alpha|_v - \theta_v + \frac{B}{r+1} \Bigr| \le c_{13}\, d R, \tag{4.3.13} \]
where
\[ B := \sum_{\mathfrak{p} \in S_0} \bigl( \log |\alpha|_{\mathfrak{p}} - \theta_{\mathfrak{p}} \bigr). \]
Now take $\varepsilon := \eta\alpha$. Clearly, (4.3.12) holds with $\varepsilon$ instead of $\alpha$. Hence
\[ \sum_{\mathfrak{p} \in S_0} \bigl| \log |\varepsilon|_{\mathfrak{p}} - \theta_{\mathfrak{p}} \bigr| \le \tfrac12 h_K \log Q. \]
Further, (4.3.13) and (4.3.12) imply
\[ \sum_{v \in M_K^\infty} \bigl| \log |\varepsilon|_v - \theta_v \bigr| \le A + |B| \le c_{13}\, d R + \tfrac12 h_K \log Q. \]
Now (4.3.11) follows by a simple addition.

Recall that we have defined the $S$-norm $N_S(\alpha) := \prod_{v \in S} |\alpha|_v$ for $\alpha \in K$; see Section 1.8. In the case of $S = M_K^\infty$, this is just $|N_{K/\mathbb{Q}}(\alpha)|$. In addition, we define
\[ M_S(\alpha) := \max \Biggl( \prod_{v \in M_K \setminus S} \max(1, |\alpha|_v),\ \prod_{v \in M_K \setminus S} \max(1, |\alpha|_v^{-1}) \Biggr) \]
for $\alpha \in K^*$. By the Product Formula we have
\[ M_S(\alpha) = \prod_{v \in M_K \setminus S} |\alpha|_v^{-1} = N_S(\alpha) \quad\text{for } \alpha \in \mathcal{O}_S \setminus \{0\}. \]

Proposition 4.3.12 Let $\alpha \in K^*$ and let $n$ be a positive integer. Then there exists $\varepsilon \in \mathcal{O}_S^*$ such that
\[ h(\varepsilon^n \alpha) \le \frac{1}{d} \log M_S(\alpha) + n \Bigl( c_{13} R + \frac{h_K}{d} \log Q \Bigr). \tag{4.3.14} \]
In particular, if $\alpha \in \mathcal{O}_S \setminus \{0\}$ then there exists $\varepsilon \in \mathcal{O}_S^*$ such that
\[ h(\varepsilon^n \alpha) \le \frac{1}{d} \log N_S(\alpha) + n \Bigl( c_{13} R + \frac{h_K}{d} \log Q \Bigr). \tag{4.3.15} \]

Proof. Inequality (4.3.15) is an immediate consequence of (4.3.14). We prove (4.3.14). We assume that $N_S(\alpha) \ge 1$. This is no loss of generality since both the height and $M_S$ are invariant under $x \mapsto x^{-1}$, and $N_S(\alpha) \ge 1$ can be achieved by replacing $\alpha$ by $\alpha^{-1}$ if necessary. By Proposition 4.3.11, there is $\varepsilon \in \mathcal{O}_S^*$ such that
\[ B := \sum_{v \in S} \Bigl| \log |\varepsilon|_v + \frac{1}{n} \log |\alpha|_v - \frac{1}{sn} \log N_S(\alpha) \Bigr| \le c_{13}\, d R + h_K \log Q, \]
where $s = |S|$. Hence
\begin{align*}
\frac{1}{d} \sum_{v \in S} \log \max(1, |\varepsilon^n \alpha|_v) &= \frac{1}{d} \sum_{v \in S} \max\bigl(0, n \log |\varepsilon|_v + \log |\alpha|_v\bigr) \\
&\le \frac{n}{d} \cdot B + \frac{1}{d} \log N_S(\alpha) \\
&\le n \Bigl( c_{13} R + \frac{h_K}{d} \log Q \Bigr) + \frac{1}{d} \log N_S(\alpha),
\end{align*}
since by assumption $N_S(\alpha) \ge 1$. By adding $\frac{1}{d} \log \prod_{v \in M_K \setminus S} \max(1, |\varepsilon^n \alpha|_v)$ on both sides and observing that by the Product Formula
\begin{align*}
N_S(\alpha) \prod_{v \in M_K \setminus S} \max(1, |\varepsilon^n \alpha|_v) &= N_S(\alpha) \prod_{v \in M_K \setminus S} \max(1, |\alpha|_v) \\
&= \prod_{v \in M_K \setminus S} |\alpha|_v^{-1} \max(1, |\alpha|_v) \le M_S(\alpha),
\end{align*}
our Proposition follows.

4.4 Proofs

4.4.1 Proofs of Theorems 4.1.1 and 4.1.2

We keep the notation from Section 4.1. Thus, $K$ is an algebraic number field of degree $d$, $R$ is the regulator of $K$ and $r$ the rank of $\mathcal{O}_K^*$. Further, $H$ is a real with $H \ge \max(h(a_1), h(a_2), h(a_3))$ and $H \ge \max(1, \pi/d)$.

Proof of Theorem 4.1.1. For $r = 0$ the assertion is trivial, hence we assume that $r \ge 1$. Let $x_1, x_2, x_3$ be a solution of (4.1.1). Assume without loss of generality that $h(x_1/x_3) \ge h(x_i/x_j)$ for $1 \le i < j \le 3$. Put
\[ \alpha := -a_1/a_3, \quad \beta := -a_2/a_3, \quad x := x_1/x_3, \quad y := x_2/x_3. \tag{4.4.1} \]
Then
\[ \alpha x + \beta y = 1, \quad x, y \in \mathcal{O}_K^*, \tag{4.4.2} \]
\[ \max\{h(\alpha), h(\beta)\} \le 2H. \tag{4.4.3} \]
Clearly, $h(x) \ge h(y)$. We give an upper bound for the height of $x$. Let $\varepsilon_1, \ldots, \varepsilon_r$ be a fundamental system of units in $K$ with the properties specified in Proposition 4.3.9. Then $y$ can be written in the form
\[ y = \zeta\, \varepsilon_1^{b_1} \cdots \varepsilon_r^{b_r}, \tag{4.4.4} \]
where $\zeta$ is a root of unity in $K$ and $b_1, \ldots, b_r$ are rational integers. Denote by $v_1, \ldots, v_{r+1}$ the infinite places of $K$. We infer from (4.4.4) that
\[ \log |y|_{v_j} = \sum_{i=1}^{r} b_i \log |\varepsilon_i|_{v_j}, \quad j = 1, \ldots, r, \]
whence, using (iii) of Proposition 4.3.9 and the fact that $y$ is a unit, we get
\[ \max\{|b_1|, \ldots, |b_r|\} \le c_{11}' \sum_{j=1}^{r} \bigl| \log |y|_{v_j} \bigr| \le 2 c_{11}'\, d\, h(y) \le 2 c_{11}'\, d\, h(x), \tag{4.4.5} \]
where $c_{11}'$ denotes the constant $c_{11}$ with $s - 1$ replaced by $r$. Set $\alpha_{r+1} := \zeta\beta$ and $b_{r+1} = 1$. Let $v$ be an infinite place for which $|x|_v$ is minimal. Then, from (4.4.2) we deduce that
\[ \log \bigl| \varepsilon_1^{b_1} \cdots \varepsilon_r^{b_r} \alpha_{r+1}^{b_{r+1}} - 1 \bigr|_v = \log |\alpha x|_v \le -\frac{d}{r+1}\, h(x) + 2dH. \tag{4.4.6} \]
We shall prove that
\[ h(x) < c_{14}\, R (\log^* R) H, \tag{4.4.7} \]
where
\[ c_{14} := \min\bigl\{ (r+1)^{2r+9}\, 2^{3.2r+38.4},\ (r+1)^{2r+3.5}\, 2^{4.3r+44.3} \bigr\} \times 2 \log(2r+2)\, (d \log^*(2d))^3, \]
which is somewhat stronger than (4.1.2). Set
\[ A_i = \max(d\,h(\varepsilon_i), \pi), \ i = 1, \ldots, r, \qquad A_{r+1} = 2dH \ \bigl( \ge \max(d\,h(\alpha_{r+1}), \pi) \bigr). \tag{4.4.8} \]
We may assume that $h(x) > 4(r+1)H$ and
\[ 2 c_{11}'\, d\, h(x) > 2e \max\Bigl\{ \frac{(r+1)\pi}{\sqrt2}, A_1, \ldots, A_r \Bigr\} A_{r+1}, \]
since otherwise, using (1.5.3), Proposition 4.3.9 and (4.3.4), the upper bound (4.4.7) easily follows. In view of (4.4.5), we can apply Theorem 3.2.5 with $B = 2 c_{11}'\, d\, h(x)$. Combining this with Lemma 4.3.10, and using (4.4.6) and (4.4.8), we infer that
\[ \log |\alpha x|_v > \begin{cases} -2 d_v\, C_2(2, d)\, dH \max\{R, \pi\} \log \dfrac{2 c_{11}' h(x)}{2\sqrt2\, H} & \text{if } r = 1, \\[2ex] -2 d_v\, C_2(r+1, d)\, c_{12}'\, dH R \log \dfrac{2 c_{11}' h(x)}{2\sqrt2\, H} & \text{if } r \ge 2, \end{cases} \]
where $C_2(r+1, d)$ is the constant occurring in Theorem 3.2.5 and $c_{12}'$ denotes the constant $c_{12}$ with $s - 1$ replaced by $r$.
Together with (4.4.6) this implies (4.4.7), hence (4.1.2).

Proof of Theorem 4.1.2. We follow the arguments of the proof of Theorem 4.1.1. Let $x_1, x_2, x_3$ be an arbitrary but fixed solution of (4.1.1) with $x_2 \in K_1$ and $x_3 = \sigma(x_2)$. The cases $x_3 = x_2$ and $r_1 = 0$ being trivial, we assume that $x_3 \ne x_2$ and $r_1 \ge 1$. Then $d \ge d_1 \ge 2$. We define again
\[ \alpha := -a_1/a_3, \quad \beta := -a_2/a_3, \quad x := x_1/x_3, \quad y := x_2/x_3, \]
so that we have again (4.4.2), (4.4.3). Let $\{\varepsilon_1, \ldots, \varepsilon_{r_1}\}$ be a fundamental system of units in $K_1$ with the properties specified in Proposition 4.3.9. Then
\[ x_2 = \zeta\, \varepsilon_1^{b_1} \cdots \varepsilon_{r_1}^{b_{r_1}} \]
with a root of unity $\zeta$ in $K_1$ and with rational integers $b_1, \ldots, b_{r_1}$. We obtain as in the proof of Theorem 4.1.1 that
\[ \max\{|b_1|, \ldots, |b_{r_1}|\} \le 2 c_{15}\, d_1\, h(x_2), \tag{4.4.9} \]
where
\[ c_{15} := \bigl( (r_1!)^2 / 2^{r_1} \bigr) (\log(3d_1))^3. \]
Consider the infinite place $v$ on $K$ for which $|x|_v$ is minimal. Setting $\alpha_{r_1+1} = -a_2 \zeta / (a_3 \sigma(\zeta))$ and $\eta_i = \varepsilon_i / \sigma(\varepsilon_i)$ for $i = 1, \ldots, r_1$, we deduce from (4.4.2) that
\[ \log \bigl| \eta_1^{b_1} \cdots \eta_{r_1}^{b_{r_1}} \alpha_{r_1+1} - 1 \bigr|_v = \log \Bigl| \frac{a_1 x_1}{a_3 x_3} \Bigr|_v \le -\frac{d}{r+1}\, h(x_1/x_3) + 2dH. \tag{4.4.10} \]
We have $h(\alpha_{r_1+1}) \le 2H$ and $h(\eta_i) \le 2h(\varepsilon_i)$ for $i = 1, \ldots, r_1$. To apply Theorem 3.2.5 to the left-hand side of (4.4.10), set
\[ A_i := \max\{d\,h(\varepsilon_i), \pi\}, \ i = 1, \ldots, r_1, \qquad A_{r_1+1} := 2dH. \tag{4.4.11} \]
These imply
\[ \max\{d\,h(\eta_i), \pi\} \le 2A_i, \quad i = 1, \ldots, r_1, \tag{4.4.12} \]
\[ \max\{d\,h(\alpha_{r_1+1}), \pi\} \le A_{r_1+1}. \tag{4.4.13} \]
We may assume that $h(x_1/x_3) > 4(r+1)H$ and
\[ 2 c_{15}\, d_1\, h(x_2) > 2e \max\Bigl\{ \frac{(r_1+1)\pi}{\sqrt2}, 2A_1, \ldots, 2A_{r_1} \Bigr\} A_{r_1+1}, \]
since otherwise, using (1.5.3), Proposition 4.3.9 and (4.4.11), (4.4.13), we obtain
\[ h(x_2) < 320\, d^2 r_1 R_{K_1} H, \]
which contradicts our assumption (4.1.4).
Applying now Theorem 3.2.5 and using (4.4.9), (4.4.10), (4.4.12) and (4.4.13), we obtain
\[ \log \Bigl| \frac{a_1 x_1}{a_3 x_3} \Bigr|_v > -C_2(r_1+1, d)\, 2^{r_1} A_1 \cdots A_{r_1+1} \log \frac{c_{15}\, d_1\, h(x_2)}{\sqrt2\, dH}, \tag{4.4.14} \]
where $C_2(r_1+1, d)$, coming from Theorem 3.2.5, is
\[ C_2(r_1+1, d) = \min\bigl\{ 1.451\,(30\sqrt2)^{r_1+5} (r_1+2)^{5.5},\ \pi\, 2^{6.5 r_1 + 33.5} \bigr\}\, d^2 \log(ed). \]
Comparing (4.4.14) with (4.4.10) and using (4.4.11), (4.4.13), Lemma 4.3.10, (4.1.4) and (1.5.3) we deduce first for $h(x_1/x_3)$ and then, by (4.4.1)–(4.4.3), for each $h(x_i/x_j)$ the estimate (4.1.3).

4.4.2 Proofs of Theorems 4.2.1 and 4.2.2

The proof of Theorem 4.2.1 will be based on Theorem 3.2.8. Theorem 4.2.2 is a simple corollary of Theorem 4.2.1. We need also the following.

Proposition 4.4.1 Let $\Gamma$ be a finitely generated multiplicative subgroup of $K^*$ with $\mathrm{rank}\,\Gamma = q > 0$. Let $m \ge q$ be a given integer, and let $\{\xi_1, \ldots, \xi_m\}$ be a system of generators for $\Gamma/\Gamma_{\mathrm{tors}}$ such that
\[ \text{the product } h(\xi_1) \cdots h(\xi_m) \text{ is minimal} \tag{4.4.15} \]
among all systems of $m$ elements that generate $\Gamma/\Gamma_{\mathrm{tors}}$. Then for every $\xi \in \Gamma$ there are rational integers $b_1, \ldots, b_m$ and a root of unity $\zeta$ such that
\[ \xi = \zeta\, \xi_1^{b_1} \cdots \xi_m^{b_m} \quad\text{and}\quad \max(|b_1|, \ldots, |b_m|) \le c_{16}\, (d/2) (\log 3d)^3\, h(\xi), \tag{4.4.16} \]
where $c_{16} := q^{2q}$.

Proof. Let $S = \{v_1, \ldots, v_s\}$ be a finite subset of $M_K$ such that $S \supseteq M_K^\infty$ and $\Gamma$ is a subgroup of $\mathcal{O}_S^*$. Then for $\xi \in \Gamma$ we have
\[ h(\xi) = \frac{1}{2d} \sum_{i=1}^{s} \bigl| \log |\xi|_{v_i} \bigr|. \]
Let $\{\eta_1, \ldots, \eta_q\}$ be a basis for $\Gamma/\Gamma_{\mathrm{tors}}$. Then every $\xi \in \Gamma$ can be expressed uniquely as
\[ \xi = \zeta\, \eta_1^{x_1} \cdots \eta_q^{x_q} \quad\text{with } x_1, \ldots, x_q \in \mathbb{Z} \text{ and a root of unity } \zeta. \tag{4.4.17} \]
We define a norm on $\mathbb{R}^q$ by
\[ \|x\| := \frac{1}{2d} \sum_{i=1}^{s} \Bigl| \sum_{j=1}^{q} x_j \log |\eta_j|_{v_i} \Bigr| \quad\text{for } x = (x_1, \ldots, x_q) \in \mathbb{R}^q. \]
Then if $\xi$ and $x = (x_1, \ldots, x_q) \in \mathbb{Z}^q$ are related by (4.4.17), we have
\[ h(\xi) = \|x\|. \tag{4.4.18} \]
Further, by Proposition 3.2.9,
\[ \|x\| \ge \theta := \frac{2}{d (\log 3d)^3} \quad\text{for } x \in \mathbb{Z}^q \setminus \{0\}. \]
Define vectors $a_i = (a_{i1}, \ldots, a_{iq}) \in \mathbb{Z}^q$ by
\[ \xi_i = \zeta_i\, \eta_1^{a_{i1}} \cdots \eta_q^{a_{iq}}, \quad i = 1, \ldots, m, \]
where $\zeta_i$ is a root of unity. Then $a_1, \ldots, a_m$ generate $\mathbb{Z}^q$ and by (4.4.15) and (4.4.18), the product $\|a_1\| \cdots \|a_m\|$ is minimal. Further, if $\xi$ and the vector $x = (x_1, \ldots, x_q) \in \mathbb{Z}^q$ are related by (4.4.17), it follows that
\[ \xi = \zeta\, \xi_1^{b_1} \cdots \xi_m^{b_m} \quad\text{with } \zeta \in \Gamma_{\mathrm{tors}},\ b_1, \ldots, b_m \in \mathbb{Z} \tag{4.4.19} \]
holds for some root of unity $\zeta$ if and only if
\[ x = \sum_{i=1}^{m} b_i a_i. \]
Now Proposition 4.4.1 follows immediately by applying Proposition 4.3.4 to the norm $\|\cdot\|$ defined above.

Proof of Theorem 4.2.1. Write $\Theta := h(\xi_1) \cdots h(\xi_m)$. We may assume without loss of generality that $\xi_1, \ldots, \xi_m$ have been chosen so that $h(\xi_1) \cdots h(\xi_m)$ is minimal among all systems of $m$ elements that generate $\Gamma$. By Proposition 4.4.1, there are rational integers $b_1, \ldots, b_m$ and a root of unity $\zeta$ in $K$ for which (4.4.19) and (4.4.16) hold. Then we have
\[ 1 - \alpha\xi = 1 - \alpha' \xi_1^{b_1} \cdots \xi_m^{b_m} \quad\text{with } \alpha' = \zeta\alpha. \]
Let $c_{17} := m^{2m} (d/2) (\log 3d)^3$. We distinguish two cases. First assume that $c_{17}\, h(\xi) \ge 2e (3d)^{2(m+1)} H \Theta$. Then we can apply Theorem 3.2.8 with $B = c_{17}\, h(\xi)$ and (4.2.1) follows. Next consider the case when $c_{17}\, h(\xi) < 2e (3d)^{2(m+1)} H \Theta$. Then, by the Product Formula we have the following Liouville type inequality
\begin{align*}
|1 - \alpha\xi|_v &= \prod_{\substack{w \in M_K \\ w \ne v}} |1 - \alpha\xi|_w^{-1} \ge 2^{-d} \prod_{\substack{w \in M_K \\ w \ne v}} \max(1, |\alpha\xi|_w)^{-1} \\
&\ge 2^{-d} \exp(-d\, h(\alpha\xi)) \ge 2^{-d} \exp\Bigl( -d \Bigl( H + \frac{2e (3d)^{2(m+1)} H \Theta}{c_{17}} \Bigr) \Bigr).
\end{align*}
In view of Proposition 3.2.9 we have
\[ \Theta \ge \Bigl( \frac{2}{d (\log 3d)^3} \Bigr)^{m} \ \text{ if } d \ge 2 \quad\text{and}\quad \Theta \ge (\log 2)^m \ \text{ if } d = 1, \tag{4.4.20} \]
hence we obtain again (4.2.1).

Proof of Theorem 4.2.2. Let $\xi \in \Gamma$ be such that $\alpha\xi \ne 1$ and (4.2.2) holds.
By Theorem 4.2.1 we have (4.2.1). Then, with the notation $X := N(v)\, h(\xi)/H$ and $b := c_8\, \kappa^{-1}\, \Theta\, N(v)^2 / (\log N(v))$, it follows that $X \le b \log X$. In view of (4.4.20) we have $b \ge e^2$. We use now that if $a, b, X$ are real numbers with $a \ge 0$, $b \ge 1$, $X \ge 1$ and $X \le b \log X + a$ then
\[ X \le \begin{cases} 2(b \log b + a) & \text{if } b > e^2, \\ 2(2e^2 + a) & \text{if } b \le e^2 \end{cases} \tag{4.4.21} \]
(see Pethő and de Weger (1986)). This gives (4.2.3).

4.4.3 Proofs of Theorem 4.1.3 and its corollaries

We keep the notation from Section 4.1. In particular, $K$ is an algebraic number field of degree $d$, $\Gamma$ is a finitely generated subgroup of $K^*$ of rank $q > 0$, and $S$ is the smallest set of places of $K$ containing all infinite places, such that $\Gamma \subseteq \mathcal{O}_S^*$.

Proof of Theorem 4.1.3. Let $x_1, x_2$ be a solution of (4.1.5). It follows from (4.1.5) that
\[ h(x_1) \le 3H + h(x_2) + \log 2. \tag{4.4.22} \]
First assume that $h(x_2) < 400sH$. Then (4.4.22) gives $h(x_1) < 404sH$, whence $P \cdot h(x_1)/H \le 404sP$. Using now Proposition 3.2.9 and the fact that the function $X/\log X$ is monotone increasing for $X > e$, (4.1.6) easily follows.

Now assume that
\[ h(x_2) \ge 400sH. \tag{4.4.23} \]
Pick $v \in S$ for which $|x_2|_v$ is minimal. Then we deduce from (4.1.5) that
\[ \log |1 - a_1 x_1|_v = \log |a_2 x_2|_v \le -\frac{d}{s}\, h(x_2) + dH. \tag{4.4.24} \]
Further, (4.4.22) and (4.4.23) imply that $h(x_1) \le 1.01\, h(x_2)$. Hence we infer from (4.4.23) and (4.4.24) that
\[ \log |1 - a_1 x_1|_v < -\kappa\, h(x_1) \]
with the choice $\kappa = d/(2.02s)$. By applying Theorem 4.2.2, for $h(x_1)$ we get the upper bound occurring in (4.1.6) with 6.5 replaced by 6.4 in the bound. Finally, (4.1.5) implies
\[ h(x_2) \le 3H + h(x_1) + \log 2. \]
But, in view of Proposition 3.2.9 the bound obtained for $h(x_1)$ is much larger than $3H + \log 2$, hence we get (4.1.6) for $h(x_2)$ as well.
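The elementary inequality (4.4.21) used in the proof of Theorem 4.2.2 can be checked numerically. The Python sketch below computes, for sample values of $a$ and $b$, the largest real $X \ge 1$ with $X \le b \log X + a$ by bisection and compares it with the bound in (4.4.21). The function names and the sample grid are assumptions made for this illustration. It is a sanity check on a few values, not a proof.

```python
import math


def largest_solution(a, b):
    """Largest X >= 1 with X <= b*log(X) + a, or None if no such X exists.

    Assumes a >= 0 and b >= 1, as in (4.4.21).
    """
    def f(X):
        return X - b * math.log(X) - a

    lo = max(b, 1.0)       # on [1, inf) the function f is minimal at X = max(b, 1)
    if f(lo) > 0:
        return None        # f > 0 everywhere on [1, inf): the inequality never holds
    hi = 2.0 * lo
    while f(hi) <= 0:      # f tends to +infinity, so this loop terminates
        hi *= 2.0
    for _ in range(200):   # bisection for the largest zero of f
        mid = (lo + hi) / 2.0
        if f(mid) <= 0:
            lo = mid
        else:
            hi = mid
    return lo


def bound_4_4_21(a, b):
    """Right-hand side of (4.4.21)."""
    if b > math.e ** 2:
        return 2 * (b * math.log(b) + a)
    return 2 * (2 * math.e ** 2 + a)


for a in (0, 1, 50):
    for b in (1, 3, 7, 8, 100, 1e4):
        X = largest_solution(a, b)
        print(a, b, X, bound_4_4_21(a, b))
```

For each sample pair the computed largest solution stays below the bound, and for $a = 0$, $b = 1$ there is no admissible $X$ at all.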
Proof of Corollary 4.1.4. For given $a_1, a_2 \in K^*$, the finiteness of the number of solutions of equation (4.1.5) in $x_1, x_2 \in \Gamma$ immediately follows from Theorem 4.1.3 and Theorem 1.9.3.

The group of roots of unity in $K$ being cyclic, $\Gamma_{\mathrm{tors}}$ is also cyclic. Suppose that $K$, $a_1$, $a_2$, a system of generators $\{\xi_1, \ldots, \xi_m\}$ of $\Gamma/\Gamma_{\mathrm{tors}}$ and a generator $\zeta$ of $\Gamma_{\mathrm{tors}}$ are effectively given. We shall utilize some algorithmic results from algebraic number theory, references to the literature of which are listed in Section 1.10. The factorizations of $\xi_1, \ldots, \xi_m$ into prime ideals can be effectively determined. Then the set $S$ consisting of all infinite places of $K$ and of all prime ideals occurring in these factorizations can be effectively determined. If $x_1, x_2 \in \Gamma$ is a solution of (4.1.5), then $x_1, x_2$ belong to $\mathcal{O}_S^*$, the group of $S$-units, and Theorem 4.1.3 provides an effectively computable upper bound for $h(x_1)$ and $h(x_2)$. Therefore $x_1, x_2$ are contained in a finite and effectively computable subset, say $\mathcal{H}$, of $K^*$. We can select those pairs $(x_1, x_2)$ from $\mathcal{H} \times \mathcal{H}$ that satisfy (4.1.5). From the remaining $x_1, x_2$ one can select those that are $S$-units. We have still to decide whether such $x_1, x_2$ are contained in $\Gamma$ or not, that is, whether
\[ x_1 = \zeta^{z_0} \xi_1^{z_1} \cdots \xi_m^{z_m} \quad\text{with some } z_0, \ldots, z_m \in \mathbb{Z}, \tag{4.4.25} \]
and similarly for $x_2$.

One can determine a fundamental system $\{\varepsilon_1, \ldots, \varepsilon_{s-1}\}$ of $S$-units in $K$ where $s = |S|$, and a generator $\rho$ of the group of roots of unity in $K$. Further, for any effectively given $\varepsilon \in \mathcal{O}_S^*$, one can determine effectively rational integers $b_1, \ldots, b_{s-1}$ and $b$ with $0 \le b < \omega_K$ such that
\[ \varepsilon = \rho^{b}\, \varepsilon_1^{b_1} \cdots \varepsilon_{s-1}^{b_{s-1}}, \tag{4.4.26} \]
where $\omega_K$ denotes the number of roots of unity in $K$.
In (4.4.25) we now represent $x_1, \zeta, \xi_1, \ldots, \xi_m$ in the form (4.4.26) and compare the representations of the left- and right-hand sides of (4.4.25). Then we arrive at a system of linear equations in $z_0, z_1, \ldots, z_m$. But one can decide whether this system of equations is solvable in $\mathbb{Z}$ or not, that is, whether $x_1 \in \Gamma$ or not. In the case of $x_2$ one can proceed in the same way.

Proof of Corollary 4.1.5. Let $\{\varepsilon_1, \ldots, \varepsilon_{s-1}\}$ be a fundamental system of $S$-units in $K$ with the properties described in Proposition 4.3.9. Then, putting $\Theta := h(\varepsilon_1) \cdots h(\varepsilon_{s-1})$, we have
\[ \Theta \le c_9 R_S \quad\text{with } c_9 = ((s-1)!)^2/(2^{s-2} d^{\,s-1}). \]
Now the assertion follows from Theorem 4.1.3.

Proof of Corollary 4.1.6. Corollary 4.1.6 can be deduced both from Corollary 4.1.5 and from Corollary 4.1.4. The finiteness of the number of solutions of (4.1.7) follows immediately from Corollary 4.1.4 with the choice $\Gamma = \mathcal{O}_S^*$. If $S$ is effectively given, a fundamental system of $S$-units and the roots of unity in $K$ are effectively determinable. Hence the effective part of Corollary 4.1.6 is also an immediate consequence of Corollary 4.1.4.

Theorem 4.1.7 will be deduced from Theorem 4.2.1.

Proof of Theorem 4.1.7. Let $x_1, x_2$ be a solution of (4.1.7). We infer as in the proof of Theorem 4.1.3 that, for some $v \in S$,
\[ \log |1 - a_1 x_1|_v < -\frac{d}{2.02s}\, h(x_1), \tag{4.4.27} \]
whence $|1 - a_1 x_1|_v \le 1$. This implies that $|a_1 x_1|_v \le 1$ or $4$ according as $v$ is finite or not. Consequently, we have
\[ \bigl| 1 - (a_1 x_1)^{h_K} \bigr|_v = |1 - a_1 x_1|_v \cdot \bigl| 1 + a_1 x_1 + \cdots + (a_1 x_1)^{h_K - 1} \bigr|_v \le c_{18}\, |1 - a_1 x_1|_v, \tag{4.4.28} \]
where $c_{18} := 1$ or $4^{h_K}$, according as $v$ is finite or not. Here $h_K$ denotes the class number of $K$.
We shall give a lower bound for $|1 - (a_1 x_1)^{h_K}|_v$ by means of Theorem 4.2.1.

We first construct a subgroup $\Gamma$ of $\mathcal{O}_S^*$ such that $x_1^{h_K} \in \Gamma$. Denote by $\mathfrak{p}_1, \ldots, \mathfrak{p}_t$ the prime ideals in $S$. There are $\pi_1, \ldots, \pi_t$ in $\mathcal{O}_K$ such that $(\pi_i) = \mathfrak{p}_i^{h_K}$ and by Proposition 4.3.12, they can be chosen so that
\[ h(\pi_i) \le c_{19}\, d^{\,r} R \log N(\mathfrak{p}_i), \quad i = 1, \ldots, t. \]
Here $c_{19}$ and $c_{20}, \ldots, c_{25}$ below denote effectively computable absolute constants. By Proposition 4.3.9, there exists a fundamental system of units $\{\varepsilon_1, \ldots, \varepsilon_r\}$ such that
\[ h(\varepsilon_1) \cdots h(\varepsilon_r) \le c_{20}\, d^{\,r} R. \]
Then
\[ \Theta := h(\varepsilon_1) \cdots h(\varepsilon_r)\, h(\pi_1) \cdots h(\pi_t) \le \bigl( c_{21}\, d^{\,r} R \bigr)^{t+1} \prod_{i=1}^{t} \log N(\mathfrak{p}_i). \tag{4.4.29} \]
Since $x_1$ is an $S$-unit in $K$, we can write $(x_1) = \mathfrak{p}_1^{u_1} \cdots \mathfrak{p}_t^{u_t}$ with appropriate integers $u_1, \ldots, u_t$. Consequently, we can write
\[ x_1^{h_K} = \zeta\, \varepsilon_1^{b_1} \cdots \varepsilon_r^{b_r}\, \pi_1^{u_1} \cdots \pi_t^{u_t}, \]
where $\zeta$ is a root of unity and $b_1, \ldots, b_r$ are integers. That is, $x_1^{h_K}$ belongs to the multiplicative subgroup $\Gamma$ of $\mathcal{O}_S^*$ generated by $\varepsilon_1, \ldots, \varepsilon_r$, $\pi_1, \ldots, \pi_t$ and the roots of unity of $K$. Further, putting $H' := \max(h(a_1), 1)$, we have
\[ h_K H' \ge \max\bigl( h(a_1^{h_K}), 1 \bigr) \ge H'. \]
If $a_1^{h_K} x_1^{h_K} = 1$, together with (4.1.7) and (1.8.5) this implies immediately (4.1.9). If $a_1^{h_K} x_1^{h_K} \ne 1$, then Theorem 4.2.1 gives
\[ \log \bigl| 1 - a_1^{h_K} x_1^{h_K} \bigr|_v > -c'(d, h_K, s)\, \frac{P}{\log P}\, \Theta H' \log^* \Bigl( \frac{P\, h(x_1)}{H'} \Bigr), \tag{4.4.30} \]
where $c'(d, h_K, s) = (c_{22}\, d)^{3s+2} (\log^* d)^3 \log^* h_K$. We may assume that $h(x_1) \ge 12 \log(4 s h_K / d)$, since otherwise by (4.1.7) we are done. Then (4.4.27) and (4.4.28) imply that $\log |1 - a_1^{h_K} x_1^{h_K}|_v$ can be estimated from above by $-(d/4s)\, h(x_1)$. Comparing this with (4.4.30), we infer that
\[ h(x_1) \le c''(d, h_K, s)\, \frac{P}{\log P}\, \Theta H' \log^*(P \Theta), \]
where $c''(d, h_K, s) = (c_{23}\, d)^{3s+2} (\log^* h_K)^2$.
In view of (4.4.29) we get
\[ \log^*(P \Theta) \le c_{24}\, t\, d (\log^* d)(\log^* R) \log P. \]
Finally, using again (4.4.29), (1.5.3) and (1.8.3), we obtain
\[ h(x_1) \le \bigl( c_{25}\, d^{\,r+3} R \bigr)^{t+4} P H \prod_{i=1}^{t} \log N(\mathfrak{p}_i) \le \bigl( c_{25}\, d^{\,r+3} R \bigr)^{t+4} P H R_S, \]
which gives (4.1.9) for $h(x_1)$. An upper bound of the same form follows for $h(x_2)$ from the equation (4.1.7).

4.5 Alternative methods, comparison of the bounds

Baker's theory of logarithmic forms made it possible to derive effective bounds for the solutions of $S$-unit equations. Later, some alternative methods were developed to obtain effective results for such equations. We briefly discuss these methods.

4.5.1 The results of Bombieri, Bombieri and Cohen, and Bugeaud

Bombieri (1993) and Bombieri and Cohen (1997, 2003) developed an effective method in Diophantine approximation, based on an extended version of the Thue–Siegel principle, the Dyson Lemma and some geometry of numbers, to prove an earlier, weaker version of Theorem 4.2.2. Bugeaud (1998), following their approach and combining it with estimates for linear forms in logarithms, obtained results which are in certain parameters sharper than those of Bombieri and Cohen. This improvement is partly due to the use of linear forms in at most three logarithms. It follows from Bugeaud's results that if (4.2.2) holds with $0 < \kappa \le 1$, then
\[ h(\xi) \le \begin{cases} 10\,T \max\{H, T\} & \text{if } v \text{ is infinite}, \\ 8 c_{26}\, T \max\{H, 40T\} & \text{if } v \text{ is finite}, \end{cases} \tag{4.5.1} \]
where $T := (2m c_{26})^m\, N(v) (\log N(v))\, \Theta$, with
\[ c_{26} := \begin{cases} 8 \times 10^{19}\, \bigl( d^4 (\log 3d)^7 / \kappa \bigr) \log^*(2d/\kappa) & \text{if } v \text{ is infinite}, \\ 8 \times 10^{6}\, (d^5/\kappa) \bigl( \log^*(2d/\kappa) \bigr)^2 & \text{if } v \text{ is finite}. \end{cases} \]
It is easily seen that the bound in (4.2.3) has a better dependence on each parameter than the bound in (4.5.1), except possibly $\Theta$ and $H$.
In fact, the bound in (4.5.1) is smaller than that in (4.2.3) precisely when both and H log / are large relative to d, κ and N (v), and in that case, the bound (4.5.1) is at most a factor log better than (4.2.3). It is important to observe that in contrast with (4.5.1), the bound (4.2.3) does not contain the factor mm . Bugeaud (1998) used his result (4.5.1) to derive the bound max{h(x1 ), h(x2 )} < c27 P (log∗ P )RS max{c27 P (log∗ P )RS , H } (4.5.2) for the solutions x1 , x2 of the S-unit equation (4.1.7), where c27 := (1023 s 4 (log∗ s)2 d 3 )s . Observe that (4.1.8) is better than (4.5.2) in terms of each parameter, except possibly RS and H . The bound in (4.5.2) is smaller than that in (4.1.8) precisely if both RS and H log RS /RS are large relative to P , d and s, and in that case the bound in (4.5.2) is at most a factor log∗ RS better than that in (4.1.8). If H > c27 P (log∗ P )RS , then there is no log∗ RS factor in (4.5.2), however this bound contains s s . Our bound in (4.1.9) contains neither log∗ RS nor s s , but it depends on Rt . 4.5.2 The results of Murty, Pasten and von Känel Let S ⊂ MQ consist of the infinite place and the prime numbers p1 , . . . , pt . Then the corresponding ring of S-integers is ZS = Z[(p1 · · · pt )−1 ]. Murty and Pasten (2013) (see also Pasten (2014)) developed a new effective method to bound the heights of the solutions of special S-unit equations of the form x1 + x2 = 1 in x1 , x2 ∈ Z∗S . (4.5.3) Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:29:29, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.006 4.6 The abc-conjecture 89 A similar method was obtained later and independently by von Känel (2014b). 
The basic idea behind the approach of Murty and Pasten and that of von Känel is an observation by Frey: the now proved Shimura–Taniyama Conjecture, which states that the L-function of an elliptic curve is equal to that of an associated modular form, implies that (4.5.3) has only finitely many solutions. Murty and Pasten and von Känel observed that this can be made effective, and used this to obtain an explicit upper bound for the heights of the solutions of (4.5.3). We formulate the result of Murty and Pasten, which is slightly sharper. To be precise, let S be as above, and put $Q := p_1 \cdots p_t$. Further, assume that $t \ge 2$ and $2 \in S$. Murty and Pasten proved that any solution $x_1, x_2$ of the S-unit equation (4.5.3) satisfies

$$h(x_1),\, h(x_2) \le 4.8\,Q \log Q + 13Q + 25. \tag{4.5.4}$$

In the special case of equation (4.1), where the number field is $\mathbb{Q}$ and the coefficients $a_1, a_2$ are equal to 1, (4.5.4) can be compared with the estimates obtained in Theorem 4.1.3 and Corollary 4.1.5, and even with the slightly sharper Theorem 2 of Győry and Yu (2006). In this case this latter result gives the estimate

$$h(x_1),\, h(x_2) \le 2^{10t+2}\, t^4\, (P/\log P) \prod_{i=1}^{t} \log p_i \tag{4.5.5}$$

for the solutions $x_1, x_2$ of (4.5.3), where $P := \max_i p_i$. It is easily seen that (4.5.4) improves (4.5.5) if Q is small, in particular if $Q \le 2^{30}$. However, if t is small and Q is large then (4.5.5) gives a better bound for the solutions. It should be remarked that for most applications of S-unit equations, more general results concerning equations of the form (4.1) $a_1 x_1 + a_2 x_2 = 1$ over number fields are needed.

4.6 The abc-conjecture

An extremely important S-unit equation is the abc-equation $a + b = c$, where S is a finite set of primes, and a, b, c are coprime positive integers not divisible by primes outside S. Then Corollary 4.1.5 and Theorem 4.1.7 provide explicit upper bounds for c in terms of S. However, these bounds are far from being best possible. The radical of (a, b, c) is defined as

$$Q(a, b, c) := \prod_{p \mid abc} p.$$

Oesterlé and, in a refined form, Masser (1985) proposed the following.

abc-conjecture For every $\varepsilon > 0$, there is a positive number $C(\varepsilon)$ such that if a, b, c are coprime positive integers with

$$a + b = c \quad \text{and radical } Q = Q(a, b, c), \tag{4.6.1}$$

then

$$c < C(\varepsilon)\, Q^{1+\varepsilon}. \tag{4.6.2}$$

This is already sharp in the sense that (4.6.2) does not remain valid for $\varepsilon = 0$.

On August 30, 2012, Shinichi Mochizuki (Kyoto University) posted on the internet a sequence of four papers on Inter-universal Teichmüller theory in which he claims to prove the abc-conjecture. For recent updates of these papers, see Mochizuki's home page www.kurims.kyoto-u.ac.jp/~motizuki/top-english.html. At the moment of completion of this book, his proof had not yet been checked.

There are several refinements or modifications of the abc-conjecture; for references see e.g. Robert, Stewart and Tenenbaum (2014), where the authors propose and motivate the following conjecture. We denote by $\log_n$ the n times iterated natural logarithm. There exists a real number $C_1$ such that if a, b, c are coprime positive integers as in (4.6.1) then

$$c < Q \exp\Bigl(4\sqrt{3 \log Q/\log_2 Q}\,\bigl(1 + (\log_3 Q + C_1)/(2 \log_2 Q)\bigr)\Bigr).$$

Furthermore, there exist a real number $C_2$ and infinitely many triples of coprime positive integers a, b, c with (4.6.1) such that

$$c > Q \exp\Bigl(4\sqrt{3 \log Q/\log_2 Q}\,\bigl(1 + (\log_3 Q + C_2)/(2 \log_2 Q)\bigr)\Bigr).$$

For any positive integer m we denote by $\omega(m)$ the number of distinct prime factors of m. Baker (1998, 2004) and Granville (1998) formulated refinements of the abc-conjecture which also involve $\omega(abc)$. The following completely explicit refined version is due to Baker (2004). If a, b, c are coprime positive integers with (4.6.1) then

$$c < \frac{6}{5}\, Q\, \frac{(\log Q)^t}{t!},$$

where $t = \omega(abc)$.
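The quantities in this section are easy to experiment with numerically. The following is a minimal sketch, not taken from the book: the helper names `prime_factors`, `radical`, `quality` and `baker_bound` are hypothetical, and trial division is only suitable for small inputs. It computes the radical Q(a, b, c), the ratio log c / log Q that (4.6.2) is concerned with, and the right-hand side of Baker's conjectural explicit inequality.

```python
from math import factorial, log


def prime_factors(n):
    """Distinct prime factors of a non-zero integer, by trial division."""
    n = abs(n)
    primes = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def radical(a, b, c):
    """Q(a, b, c): the product of the distinct primes dividing abc."""
    q = 1
    for p in prime_factors(a * b * c):
        q *= p
    return q


def quality(a, b, c):
    """log c / log Q; (4.6.2) says this exceeds 1 + eps only for boundedly large c."""
    return log(c) / log(radical(a, b, c))


def baker_bound(a, b, c):
    """Right-hand side (6/5) Q (log Q)^t / t! of Baker's conjectural inequality."""
    q = radical(a, b, c)
    t = len(prime_factors(a * b * c))
    return 6 / 5 * q * log(q) ** t / factorial(t)


# The triple 1 + 8 = 9 has radical 6.
print(radical(1, 8, 9), quality(1, 8, 9), baker_bound(1, 8, 9))
```

For the well-known triple 2 + 3^10 · 109 = 23^5 the same functions give radical 15042 and a ratio log c / log Q of about 1.63.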
The abc-conjecture has a very extensive literature. It unifies and motivates a number of results and problems in number theory. Further, it has several striking consequences. We mention here only some of them.

• It is easy to show that the abc-conjecture implies Fermat's Last Theorem for every sufficiently large exponent. Indeed, assume that x, y, z are relatively prime positive integers, $n > 3$ and $x^n + y^n = z^n$. Then, for $\varepsilon = 1$, the abc-conjecture gives $z^n < C(1) Q^2$, where

$$Q := \prod_{p \mid x^n y^n z^n} p = \prod_{p \mid xyz} p \le xyz < z^3,$$

whence $z^n < C(1) z^6$. This proves that there exists $n_0$ such that $n \le n_0$ or, in other words, Fermat's Last Theorem is asymptotically true. The weaker version of the abc-conjecture with $\varepsilon = 1$, $C(1) = 1$ implies in the same way that $n \le 5$. As is known, Fermat's Last Theorem has now been proved by Wiles (1995) and Taylor and Wiles (1995).
• It follows in a similar manner from the abc-conjecture that the generalized Fermat equation $A x^k + B y^m + C z^n = 0$, where A, B, C are given non-zero integers, has finitely many solutions in relatively prime integers x, y, z greater than 1, and positive integers k, m, n which satisfy $1/k + 1/m + 1/n < 1$.
• Elkies (1991) proved that the abc-conjecture implies Roth's Approximation Theorem, that is Theorem 3.1.1, and that an effective abc-theorem would make Roth's Theorem effective. See also Langevin (1999).
• Confirming a conjecture of Mordell (1922a), Faltings (1983) proved that a geometrically irreducible smooth projective curve of genus $g \ge 2$, defined over $\mathbb{Q}$, has only finitely many rational points. Faltings' Theorem is ineffective.
Elkies (1991) showed that the abc-conjecture implies this theorem of Faltings, and in fact even an effective version of it if an effective version of the abc-conjecture is available.
• By a result of Lagarias and Soundararajan (2011), the abc-conjecture implies that for any fixed $\kappa < 1$, there are only finitely many coprime positive integers a, b, c such that $a + b = c$ and $P(abc) \le (\log c)^{\kappa}$. On the other hand, under the Generalized Riemann Hypothesis the authors proved that for $\kappa \ge 8$ there are infinitely many triples a, b, c satisfying these properties.

Some weaker versions of the abc-conjecture have been proved in an effective way. By means of the theory of logarithmic forms, Stewart and Tijdeman (1986), Stewart and Yu (1991, 2001), Győry and Yu (2006), Surroca (2007) and Győry (2008a) obtained upper bounds for c as a function of Q(a, b, c). Stewart and Yu (2001) proved that

$$c < \exp\bigl\{c_{28}\, Q^{1/3} (\log Q)^3\bigr\}, \tag{4.6.3}$$

where $Q = Q(a, b, c)$ and $c_{28}$ is an effectively computable positive absolute constant. Further, Győry (2008a) deduced from a slightly improved and completely explicit version of Theorem 4.1.7 that

$$c < \exp\bigl\{(2^{10t+22}/t^{\,t-4})\, Q (\log Q)^t\bigr\}. \tag{4.6.4}$$

For a deeper connection between the abc-conjecture and the theory of logarithmic forms, we refer to Baker (2004).

We now present a number field version of the abc-conjecture. Let K be an algebraic number field of degree d, and $M_K$ the set of places on K; see Section 1.7. The height of $(a, b, c) \in (K^*)^3$ is defined as

$$H_K(a, b, c) = \prod_{v \in M_K} \max(|a|_v, |b|_v, |c|_v)$$

and the radical of (a, b, c) as

$$Q_K(a, b, c) := \prod_{\mathfrak{p}} N_K(\mathfrak{p})^{e(\mathfrak{p})}.$$

Here the product is taken over all prime ideals $\mathfrak{p}$ for which $|a|_{\mathfrak{p}}, |b|_{\mathfrak{p}}, |c|_{\mathfrak{p}}$ are not all equal, and $e(\mathfrak{p})$ is the ramification index of $\mathfrak{p}$ over the rational prime below it. Denote by $D_K$ the absolute value of the discriminant of K. Vojta (1987) proposed a very general conjecture and, as a consequence, suggested the first number field version of the abc-conjecture. Later, several refinements of Vojta's version were suggested; see Elkies (1991), Broberg (1999), Vojta (2000), Granville and Stark (2000), Browkin (2000) and Masser (2002). The following uniform version is due to Masser.

ABC-conjecture for the number field K For every $\varepsilon > 0$ there exists $C(\varepsilon) > 0$ such that if

$$a + b + c = 0 \text{ with } a, b, c \in K^*, \quad Q_K = Q_K(a, b, c), \tag{4.6.5}$$

then

$$H_K(a, b, c) < C(\varepsilon)^d\, (|D_K| \cdot Q_K)^{1+\varepsilon}.$$

For $K = \mathbb{Q}$, this reduces to the Oesterlé–Masser Conjecture. The upper bound is again best possible in terms of $\varepsilon$. This general conjecture also has a very rich literature, and has many profound implications; see the abc-conjecture home page mentioned below.

The bounds obtained for the solutions of S-unit equations can be used to derive weaker but unconditional upper bounds for $H_K(a, b, c)$. Let a, b, c be non-zero elements of K with $a + b + c = 0$, and let

$$S = M_K^{\infty} \cup \{\text{finite } v \in M_K \text{ such that } |a|_v, |b|_v, |c|_v \text{ are not all equal}\}.$$

Then $x = -a/c$, $y = -b/c$ is a solution of the S-unit equation

$$x + y = 1 \text{ in } x, y \in O_S^*.$$

Every bound for h(x), h(y) gives a bound for $H_K(a, b, c)$. Using a result of Bugeaud and Győry (1996a) concerning S-unit equations, Surroca (2007) derived a bound for $H_K(a, b, c)$.
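The passage from a + b + c = 0 to an S-unit equation can be made concrete for K = Q. The following is a small sketch under that assumption, restricted to non-zero integers a, b, c; the function names `valuation`, `support_primes`, `is_s_unit` and `to_unit_equation` are hypothetical and are not taken from the book.

```python
from fractions import Fraction


def valuation(n, p):
    """Exponent of the prime p in the non-zero integer n."""
    n = abs(n)
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def support_primes(a, b, c):
    """Primes p for which |a|_p, |b|_p, |c|_p are not all equal."""
    n = abs(a * b * c)
    primes = []
    p = 2
    while n > 1:
        if n % p == 0:
            vals = {valuation(a, p), valuation(b, p), valuation(c, p)}
            if len(vals) > 1:
                primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    return primes


def is_s_unit(q, primes):
    """True if the non-zero rational q has numerator and denominator built from `primes`."""
    num, den = abs(q.numerator), q.denominator
    for p in primes:
        while num % p == 0:
            num //= p
        while den % p == 0:
            den //= p
    return num == 1 and den == 1


def to_unit_equation(a, b, c):
    """Return (S, x, y) with x = -a/c, y = -b/c, assuming a + b + c = 0."""
    assert a + b + c == 0 and a * b * c != 0
    return support_primes(a, b, c), Fraction(-a, c), Fraction(-b, c)


S, x, y = to_unit_equation(1, 8, -9)
print(S, x, y, x + y == 1, is_s_unit(x, S), is_s_unit(y, S))
```

Note that a common prime factor with equal valuation in a, b and c, such as 5 in (5, 15, −20), does not enter S, in line with the definition above.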
By means of a slightly improved and explicit version of Theorem 4.1.7, Győry (2008a) considerably improved Surroca's bound by showing that if $\varepsilon > 0$ and (4.6.5) holds then $H_K(a, b, c)$ can be estimated from above by

$$\exp\bigl\{c_{29}(d, D_K, \varepsilon)\, Q_K^{1+\varepsilon}\bigr\} \tag{4.6.6}$$

and, if $Q_K = Q_K(a, b, c) > \max\bigl(|D_K|^{2/\varepsilon}, \exp\exp(\max(|D_K|, e))\bigr)$, by

$$\exp\bigl\{c_{30}(d, \varepsilon)\, (|D_K| \cdot Q_K)^{1+\varepsilon}\bigr\}, \tag{4.6.7}$$

where $c_{29}, c_{30}$ are effectively computable constants depending only on the parameters occurring in the parentheses. Clearly, the bounds in (4.6.3), (4.6.4), (4.6.6) and (4.6.7) are still far from the conjectured best bounds.

For other details, including generalizations and applications of the abc-conjecture, we refer the reader to Bombieri and Gubler (2006), Baker and Wüstholz (2007), the abc-conjecture home page created and maintained by Nitaj, www.math.unicaen.fr/~nitaj/abc.html, and the references given there.

4.7 Notes

4.7.1 Historical remarks and some related results

• In the special case of S-unit equations over $\mathbb{Q}$, effective finiteness results can be deduced for the solutions from a theorem of Coates (1969) on the greatest prime factor of binary forms and also from a result of Sprindžuk (1969) on ternary exponential equations. In the general case, for unit and S-unit equations over number fields, various effective bounds for the solutions were established in several papers and books, including Győry (1972, 1973, 1974, 1979, 1979/1980, 1980b, 2008a), Sprindžuk (1973, 1976, 1982, 1993), Lang (1978), Kotov and Trelina (1979), Schmidt (1992), Bugeaud and Győry (1996a), and Haristoy (2003). The best known bounds can be found in Győry and Yu (2006) and, in a more general form, for the solutions from a finitely generated multiplicative subgroup of $\overline{\mathbb{Q}}^*$, in Section 4.1 of the present book. Later, Bérczes, Evertse and Győry (2009) gave effective bounds for the heights and degrees of the solutions from the division group of a finitely generated multiplicative subgroup of $\overline{\mathbb{Q}}^*$. We note that in certain applications of Baker's theory, Bilu systematically used so-called functional units instead of applying unit equations; see Bilu (2002) and the references given there.
• Corollary 4.1.6 states that over a number field K, the S-unit equation

$$x_1 + x_2 = 1 \text{ in } x_1, x_2 \in O_S^*$$

has only finitely many solutions, and all of them can be, at least in principle, effectively determined. An equivalent statement is that the set of S-integral points of $\mathbb{P}^1(K) \setminus \{0, 1, \infty\}$ is finite, and these points can be, at least in principle, effectively computed. Here $\mathbb{P}^1(K)$ denotes the projective line over K. For this and other equivalent statements, see e.g. Section 9.2 and LeVesque and Waldschmidt (2011).
• Let $p_1, \ldots, p_t$ be distinct rational primes, let $S = \{\infty, p_1, \ldots, p_t\}$, and denote by $\mathbb{Z}_S^*$ the group of S-units in $\mathbb{Q}$. As a common generalization of S-unit equations and binomial Thue equations, Győry and Pintér (2008) considered over $\mathbb{Q}$ the equation

$$u_1^n x_1 + u_2^n x_2 = 1 \text{ in } u_1, u_2 \in \mathbb{Z} \setminus \{0\},\ n \ge 3,\ x_1, x_2 \in \mathbb{Z}_S^* \text{ with } \gcd(u_1 u_2, p_1 \cdots p_t) = 1. \tag{4.7.1}$$

They proved that the heights of $u_1^n, u_2^n, x_1$ and $x_2$ can be effectively bounded above in terms of S. This implies that there are only finitely many $u_1^n, u_2^n$ with the given properties for which equation (4.7.1) can have a solution $x_1, x_2$, and these $u_1^n, u_2^n$, together with the possible solutions $x_1, x_2$, can be, at least in principle, effectively determined.

All the results mentioned above were proved by means of the theory of logarithmic forms.
4.7.2 Some notes on applications

• The effective results concerning equations (4.1) and (4.2) led to a great number of applications, among others to
– Thue equations, Thue–Mahler equations and decomposable form equations, see Section 9.6 and the Notes in Chapter 9,
– discriminant form and index form equations, see Section 9.6, the Notes in Chapter 9 and our book on discriminant equations,
– discriminant equations and power integral bases, see Section 10.6 and our book on discriminant equations,
– binary forms and decomposable forms of given discriminant, see Section 10.7 and our book on discriminant equations,
– irreducible polynomials and arithmetic graphs, see Section 10.5,
– unit equations in two unknowns over finitely generated domains, see Chapter 8,
– bounding the number of solutions of S-unit equations, see the Notes in Chapter 6.
For applications of the so-obtained results and for references, the reader should consult Chapters 9 and 10, the books and survey papers Győry (1980b, 1992a, 2002, 2010), Sprindžuk (1982, 1993), Shorey and Tijdeman (1986), Evertse, Győry, Stewart and Tijdeman (1988b), Bombieri and Gubler (2006), Baker and Wüstholz (2007), and our book on discriminant equations.
• In many cases, the applicability of Baker's theory can be considerably extended by reducing the Diophantine problem under consideration to the study of systems of unit equations in which the equations possess certain graph-theoretic connectedness properties. Then a combination of the effective results concerning equation (4.1) with some combinatorial arguments enables one to derive a bound for the solutions of the initial Diophantine problem; see Sections 9.6, 10.5, 10.6, Győry (1980c, 1981a, 1981c, 1982c) and Evertse, Győry, Stewart and Tijdeman (1988b).
• We now mention some recent applications of the results presented in this chapter. There are many important applications to polynomials and binary forms of given discriminant; these will be discussed in full detail in our book on discriminant equations. Further, Theorem 1 of Győry and Yu (2006), that is Corollary 4.1.5 of the present chapter, has recently been used to obtain, among others, the following effective results.
– In von Känel (2011, 2014a), an effective version of Shafarevich's conjecture/Faltings' Theorem is proved for hyperelliptic curves. This has been worked out in our book on discriminant equations.
– In de Jong and Rémond (2011), the authors give an effective version of Shafarevich's conjecture/Faltings' Theorem for cyclic covers of the projective line of prime degree.
– Finally, in von Känel (2013) a generalization of Szpiro's Discriminant conjecture concerning elliptic curves is formulated for hyperelliptic curves, and a completely explicit exponential version of Szpiro's Generalized conjecture is established.

5 Algorithmic resolution of unit equations in two unknowns

Let K be an algebraic number field, $\Gamma_1, \Gamma_2$ two finitely generated multiplicative subgroups of $K^*$, and $a_1, a_2$ two non-zero elements of K.
It follows from the results of the preceding chapter that the equation

$$a_1 x_1 + a_2 x_2 = 1 \text{ in } (x_1, x_2) \in \Gamma_1 \times \Gamma_2$$

has only finitely many solutions, and effective upper bounds can be given for the heights of the solutions. These bounds are, however, too large for finding all solutions of concrete equations of the above form in practice. In this chapter a practical method will be provided to locate all the solutions to such equations, subject to the conditions that $a_1, a_2$ and the generators of $\Gamma_1$ and $\Gamma_2$ are effectively given and that the ranks of $\Gamma_1$ and $\Gamma_2$ are not too large; at present the bound is about 12. In particular, we present an efficient algorithm for solving completely S-unit equations in two unknowns.

The unknowns $x_1$ and $x_2$ can be represented as a power product of the generators of $\Gamma_1$ and $\Gamma_2$, respectively. Assuming that the generators of infinite order are multiplicatively independent, these representations are unique up to powers of roots of unity. Thus, we arrive at an exponential Diophantine equation of the form (5.1.3) below which has to be solved.

As in Chapter 4, we first derive an explicit upper bound for the absolute values of the unknown exponents, using the best known Baker's type inequalities concerning linear forms in logarithms. In this way the existence of "large" solutions will be excluded. This part is an adaptation of the method of Győry (1979), who was the first to give explicit bounds for the solutions in the case of S-unit equations over number fields. Then, in concrete cases, we can considerably reduce the obtained bound by means of de Weger's reduction techniques (de Weger (1987, 1989)) based on the LLL lattice basis reduction algorithm. This means that even "medium" sized solutions do not exist. Finally, some enumeration procedures due to Wildanger (1997, 2000) and Smart (1999) can be utilized to determine the "small" solutions under the reduced bound. We shall briefly illustrate the resolution process on two concrete equations.

Of course, during the process some standard algebraic number-theoretical concepts and algorithms will also be needed, references to which, for convenience, are collected in Section 1.10.

For further details, related results, methods, applications and examples we refer to de Weger (1987, 1989), Wildanger (1997, 2000), Smart (1998, 1999), Győry (2002), Gaál (2002) and Baker and Wüstholz (2007).

5.1 Application of Baker's type estimates

Let K be an algebraic number field of degree d, given by the minimal polynomial of a primitive integral element $\theta$ of K over $\mathbb{Q}$. Assume that two non-zero algebraic numbers $a_1, a_2$ are explicitly given, as defined in Section 1.10, that is,

$$a_i = (p_{i,0} + p_{i,1}\theta + \cdots + p_{i,d-1}\theta^{d-1})/q_i$$

with given rational integers $p_{i,0}, \ldots, p_{i,d-1}, q_i$ with $\gcd(p_{i,0}, \ldots, p_{i,d-1}, q_i) = 1$ for $i = 1, 2$. Further, for $i = 1, 2$, let $\Gamma_i$ be a multiplicative subgroup of rank $r_i$ in $K^*$, and $\Gamma_{i,\infty}$ the torsion subgroup of $\Gamma_i$. We assume that for $i = 1, 2$, a system of generators $\xi_{i,1}, \ldots, \xi_{i,r_i}$, that is a basis of $\Gamma_i/\Gamma_{i,\infty}$, is explicitly given. We consider the equation

$$a_1 x_1 + a_2 x_2 = 1 \text{ in } (x_1, x_2) \in \Gamma_1 \times \Gamma_2. \tag{5.1.1}$$

To avoid trivialities, we deal only with the case $r_1, r_2 \ge 1$. Then each solution $x_1, x_2$ of (5.1.1) can be written uniquely in the form

$$x_i = \zeta_i \prod_{j=1}^{r_i} \xi_{i,j}^{b_{i,j}}, \quad i = 1, 2, \tag{5.1.2}$$

where $\zeta_i$ is a root of unity in K and the $b_{i,j}$ are rational integers. Hence (5.1.1) takes the form

$$(a_1 \zeta_1) \prod_{j=1}^{r_1} \xi_{1,j}^{b_{1,j}} + (a_2 \zeta_2) \prod_{j=1}^{r_2} \xi_{2,j}^{b_{2,j}} = 1 \tag{5.1.3}$$

with unknown integer exponents $b_{i,j}$. Let $B := \max_{i,j} |b_{i,j}|$. We are going to derive an upper bound for B.
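To make the shape of (5.1.3) concrete, here is a toy illustration over K = Q with a1 = a2 = 1 and both groups generated by −1 and a given list of primes. It is only a naive search over exponent vectors with max |b| ≤ B and does not implement the reduction or enumeration methods of this chapter; the names `is_s_unit` and `small_solutions` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def is_s_unit(q, primes):
    """True if the rational q is non-zero and built only from the given primes."""
    if q == 0:
        return False
    num, den = abs(q.numerator), q.denominator
    for p in primes:
        while num % p == 0:
            num //= p
        while den % p == 0:
            den //= p
    return num == 1 and den == 1


def small_solutions(primes, bound):
    """All x = +-prod p_j^{b_j} with |b_j| <= bound such that 1 - x is an S-unit."""
    solutions = set()
    exponent_range = range(-bound, bound + 1)
    for sign in (1, -1):                      # the root of unity zeta in (5.1.2)
        for exps in product(exponent_range, repeat=len(primes)):
            x = Fraction(sign)
            for p, e in zip(primes, exps):
                x *= Fraction(p) ** e         # power product as in (5.1.2)
            y = 1 - x
            if is_s_unit(y, primes):
                solutions.add((x, y))
    return solutions


sols = small_solutions([2, 3], 4)
print(len(sols), max(sols))
```

For the primes 2 and 3 the search box with B = 4 already contains solutions such as (9, −8) and (9/8, −1/8), which come from 1 + 8 = 9. A search of this kind says nothing about solutions outside the box; that is exactly what the upper bound for B derived below is needed for.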
Such a bound could be deduced from the general effective results of Chapter 4. However, as will be seen, in concrete cases it is more profitable to reduce (5.1.3) to Baker's type inequalities; see (5.1.10), (5.1.13) and (5.1.15) below. Then we can apply Baker's method to the left-hand sides of these inequalities to get a bound for B, and then we can use the LLL-algorithm to reduce the bound so obtained.

Let $M_K$ denote the set of places on K. Further, for $i = 1, 2$ let $S_i$ be the support of the group $\Gamma_i$, that is the subset of $M_K$ which consists of the infinite places and of those finite places v for which $|\alpha|_v \ne 1$ for some $\alpha \in \Gamma_i$. In view of the assumptions made on $\Gamma_1$ and $\Gamma_2$, the sets $S_1$ and $S_2$ can be effectively determined in the sense defined in Section 1.10. In what follows, we assume some implicit fixed order for the real and complex infinite places and for the finite places in $M_K$. This gives an order on $S_1$ and $S_2$.

We first consider the case when $B = \max_j |b_{1,j}|$. We infer from (5.1.2) that

$$\log |x_1|_v = \sum_{j=1}^{r_1} b_{1,j} \log |\xi_{1,j}|_v$$

for all $v \in M_K$. Let $S_1 = \{v_1, \ldots, v_{s_1}\}$, and choose $k, l \in \{1, \ldots, s_1\}$ such that

$$|\log |x_1|_{v_k}| = \max_{v \in S_1} |\log |x_1|_v|, \qquad |x_1|_{v_l} = \min_{v \in S_1} |x_1|_v.$$

We need to perform our calculations for each possible l. Using the algorithms (XII) and (XIII) mentioned in Section 1.10, one can determine a fundamental system $\{\varepsilon_1, \ldots, \varepsilon_{s_1-1}\}$ of $S_1$-units, and write $\xi_{1,j} = \zeta_{1,j}\, \varepsilon_1^{a_{1,j}} \cdots \varepsilon_{s_1-1}^{a_{s_1-1,j}}$ with a root of unity $\zeta_{1,j}$ and with rational integers $a_{i,j} \in \mathbb{Z}$. In view of the multiplicative independence of $\xi_{1,1}, \ldots, \xi_{1,r_1}$ it follows that the rank of the matrix $(a_{i,j})_{1 \le i \le s_1-1,\ 1 \le j \le r_1}$ is $r_1$. But the matrix $(\log |\varepsilon_i|_{v_j})_{1 \le i \le s_1-1,\ 1 \le j \le s_1-1}$ is invertible, hence it is easy to show that the matrix $(\log |\xi_{1,i}|_{v_j})_{1 \le i \le r_1,\ 1 \le j \le s_1-1}$ is also of rank $r_1$. Consequently, there is a subset $\{u_1, \ldots, u_{r_1}\}$ of $S_1$ such that the matrix

$$M = \begin{pmatrix} \log |\xi_{1,1}|_{u_1} & \cdots & \log |\xi_{1,r_1}|_{u_1} \\ \vdots & & \vdots \\ \log |\xi_{1,1}|_{u_{r_1}} & \cdots & \log |\xi_{1,r_1}|_{u_{r_1}} \end{pmatrix}$$

is invertible. Thus we have

$$\begin{pmatrix} b_{1,1} \\ \vdots \\ b_{1,r_1} \end{pmatrix} = M^{-1} \begin{pmatrix} \log |x_1|_{u_1} \\ \vdots \\ \log |x_1|_{u_{r_1}} \end{pmatrix}.$$

This gives

$$B \le c_1\, |\log |x_1|_{v_k}|, \tag{5.1.4}$$

where $c_1$ is the row norm of $M^{-1}$, that is the maximum, taken over all rows, of the sum of the absolute values of the elements of a row of $M^{-1}$.

Remark 5.1.1 The value of $c_1$ depends not only on $\Gamma_1$ but also on the choice of the generators $\xi_{1,1}, \ldots, \xi_{1,r_1}$ and the matrix M. As will be seen later, the bounds that will be derived for B depend heavily on $c_1$.

When $B = \max_j |b_{2,j}|$, we obtain an inequality similar to (5.1.4). Thus we can compute a constant $c_1^* \ge c_1$ such that

$$\max_{i,j} |b_{i,j}| \le c_1^* \max_{v \in S_1 \cup S_2} \{|\log |x_1|_v|,\ |\log |x_2|_v|\}. \tag{5.1.5}$$

This constant $c_1^*$ will be needed in Section 5.3. For later purposes it is worth keeping $c_1$ and $c_1^*$ as small as possible.

Remark 5.1.2 In the case of S-unit equations, Hajdu (2009) proved that there is a system of fundamental S-units which is optimal with respect to $c_1$, and such a system can be constructed.

Choose now $c_2$ such that $0 < c_2 < 1/(c_1(s_1 - 1))$. We shall see that an appropriate choice for $c_2$ is $0.999/(c_1(s_1 - 1))$, provided that $s_1$ is not too large. We show that

$$|x_1|_{v_l} \le \exp\{-c_2 B\}. \tag{5.1.6}$$

Assuming the contrary, in view of (5.1.4) there are two possibilities.
If $|x_1|_{v_k} \ge \exp\{\frac{1}{c_1} B\}$, then using the Product Formula (1.7.1) we get

$$\exp\Bigl\{\frac{1}{c_1} B\Bigr\} \le \prod_{\substack{j=1 \\ j \ne k}}^{s_1} |x_1|_{v_j}^{-1} < \exp\{c_2 (s_1 - 1) B\},$$

which is impossible because of $1/c_1 > c_2 (s_1 - 1)$. On the other hand, if $|x_1|_{v_k} \le \exp\{-\frac{1}{c_1} B\}$ then

$$\exp\{-c_2 B\} < |x_1|_{v_l} \le |x_1|_{v_k} \le \exp\Bigl\{-\frac{1}{c_1} B\Bigr\},$$

which is again a contradiction. This proves (5.1.6).

Set

$$\Lambda := 1 - a_2 x_2 = 1 - a_2' \prod_{j=1}^{r_2} \xi_{2,j}^{b_{2,j}}, \tag{5.1.7}$$

where $a_2' := a_2 \zeta_2$ and, in view of (5.1.1), $\Lambda \ne 0$. The following computations must be performed for all roots of unity $\zeta_2$ in K. Let $c_3 := \max_{v \in S_1} |a_1|_v$. Then it follows from (5.1.1) and (5.1.6) that

$$|\Lambda|_{v_l} \le c_3 \exp\{-c_2 B\}. \tag{5.1.8}$$

We shall give a lower bound for $|\Lambda|_{v_l}$ which, together with (5.1.8), will yield an upper bound for B. We could apply here Theorem 3.2.8, which is valid for each v. We shall, however, get a slightly better bound for B if we use Theorem 3.2.4 or Theorem 3.2.7 according as $v_l$ is infinite or finite.

5.1.1 Infinite places

First consider the case when $v_l$ is infinite. There is an embedding $\sigma$ of K in $\mathbb{C}$ such that $|\Lambda|_{v_l} = |\sigma(\Lambda)|$ if $v_l$ is real and $|\sigma(\Lambda)|^2$ otherwise. Since $h(\sigma(\alpha)) = h(\alpha)$ for each $\alpha \in K$, in applying Theorem 3.2.4 we omit $v_l$ and $\sigma$ and we write simply

$$|\Lambda| \le c_3' \exp\{-c_2' B\} \tag{5.1.9}$$

in place of (5.1.8), where $c_3' := c_3$, $c_2' := c_2$ if $v_l$ is real and $c_3' := \sqrt{c_3}$, $c_2' := c_2/2$ otherwise.

$v_l$ is real. Using the inequality $|\log z| \le 2|z - 1|$, which holds for $|z - 1| < 0.795$, we deduce from (5.1.3), (5.1.7) and (5.1.9) that putting

$$\Lambda' := \log |a_2'| + b_{2,1} \log |\xi_{2,1}| + \cdots + b_{2,r_2} \log |\xi_{2,r_2}|,$$

we have

$$|\Lambda'| = |\log |a_2 x_2|| \le 2\,|1 - |a_2 x_2|| \le 2\,|1 - a_2 x_2| = 2|\Lambda| \le 2 c_3' \exp\{-c_2' B\}. \tag{5.1.10}$$

Further, $\Lambda \ne 0$ implies that $\Lambda' \ne 0$.
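The constants $c_1$ of (5.1.4) and $c_2$ of (5.1.6) can be computed directly in a toy case. The sketch below assumes K = Q and the group generated by 2 and 3, so that S1 = {∞, 2, 3} and s1 = 3; the helper name `row_norm_of_inverse_2x2` is hypothetical and only handles 2 × 2 matrices. It also illustrates Remark 5.1.1: two different choices of the places u1, u2 give different values of c1.

```python
from math import log


def row_norm_of_inverse_2x2(m):
    """Row norm (maximal absolute row sum) of the inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    inverse = [[d / det, -b / det], [-c / det, a / det]]
    return max(sum(abs(entry) for entry in row) for row in inverse)


# Rows are indexed by places, columns by the generators 2 and 3.
# Choice 1: u1 = 2-adic place, u2 = 3-adic place, using |p|_p = 1/p.
M_padic = [[-log(2), 0.0],
           [0.0, -log(3)]]
# Choice 2: u1 = infinite place, u2 = 2-adic place.
M_mixed = [[log(2), log(3)],
           [-log(2), 0.0]]

c1_padic = row_norm_of_inverse_2x2(M_padic)
c1_mixed = row_norm_of_inverse_2x2(M_mixed)

s1 = 3
c2 = 0.999 / (c1_padic * (s1 - 1))   # the choice of c2 suggested after Remark 5.1.2

print(c1_padic, c1_mixed, c2)
```

For x1 = 8/9 = 2^3 · 3^(−2) one has B = 3 and max over v of |log |x1|_v| = log 9, so (5.1.4) can be checked by hand against either value of c1.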
Let H := max{dh(α2 ), | log α2 |, 0.16} and c4 := max1≤j ≤r2 {dh(|ξ2,j |), | log |ξ2,j ||, 0.16}. We recall that B = maxj |b1,j |. Applying Theorem 3.2.4 to || with B replaced by c4 B/H , we can compute explicit constants c5 and c6 such that either B≤ 1 H c4 Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:29:29, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.007 5.1 Application of Baker’s type estimates or c6 B || > exp −c5 H log . H 101 (5.1.11) In the second case (5.1.10) and (5.1.11) imply that c5 c6 c6 B c6 B < log H c2 H Hence (5.1.12) and (4.4.21) give c5 c5 c6 B ≤ 2H max log c2 c2 Thus we get , + 2e2 c6 c6 log(2c3 ) . c2 H +2 1 H, c7 B0 (vl ) := max c4 (5.1.12) log(2c3 ) =: c7 . c2 as an upper bound for B. vl is complex. Let log denote the principal value of the logarithm. There exists an even rational integer b2,0 such that |b2,0 | ≤ 1 + |b2,1 | + · · · + |b2,r2 | ≤ (r2 + 1) max |b2,j | j and that |Im()| ≤ π , where now = log a2 + b2,0 log ξ2,0 + · · · + b2,r2 log ξ2,r2 with ξ2,0 = −1. It follows from = 0 that = 0. We infer from (5.1.9) that either B< log(3c3 ) =: c8 c2 or |e − 1| = || ≤ 1/3. In the latter case || ≤ 0.6, whence || ≤ 2|| and so, by (5.1.9), || ≤ 2c3 exp{−c2 B}. (5.1.13) We apply now Theorem 3.2.4 to ||. We can compute explicit constants c9 , c10 and c11 such that either B ≤ c9 H or c11 B . || > exp −c10 H log H (5.1.14) Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:29:29, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. 
https://doi.org/10.1017/CBO9781316160749.007 102 Algorithmic resolution of unit equations in two unknowns Comparing now (5.1.13) and (5.1.14), we deduce that c11 c10 c11 B c11 B < log H c2 H whence, using (4.4.21), we infer that c10 c10 c11 B < 2H max log c2 c2 , c11 log(2c3 ) , c2 H + 2e2 c11 + 2 log(2c3 ) =: c12 . c2 Thus B0 (vl ) := max{c8 , c9 H, c12 } is an upper bound for B. 5.1.2 Finite places Suppose now that in (5.1.8) vl is finite. Let p denote the prime ideal of K which corresponds to vl . Using log ||vl = −(ordp ) log N (p), we infer from (5.1.8) that ordp ≥ (c2 B − log c3 )/ log N (p). (5.1.15) Recall that the generators ξ2,1 , . . . , ξ2,r2 are not roots of unity. Taking into consideration Proposition 3.2.9, we can apply Theorem 3.2.7 to ordp with the choice hj = h(ξ2,j ), j = 1, . . . , r2 , H = max(h(a2 ), 1), Bn = 1 and δ = h(ξ2,1 ) · · · h(ξ2,r2 )H /B. We can compute explicit constants c13 , c14 , c15 which depend among others on vl such that 2h(ξ2,1 ) · · · h(ξ2,r2 ) ≤ c13 and either B ≤ c13 H or B > c13 H , which guarantees that in Theorem 3.2.7, δ ≤ 1/2. In the second case Theorem 3.2.7 gives ordp < c14 H log c15 B H . (5.1.16) Now (5.1.15) and (5.1.16) imply that c15 c15 c16 B B < log c15 H c2 H + c15 log c3 , c2 H where c16 := c14 log N (p). In view of (4.4.21) this gives 2 log c3 c15 c16 c16 , 2e2 + log =: c17 , B < 2H max c2 c2 c2 Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:29:29, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.007 5.2 Reduction of the bounds 103 whence we obtain B0 (vl ) := max{c13 H, c17 } as an upper bound for B. When B = maxj |b2,j |, we can get in the same way an upper bound for B for each v ∈ S2 . So in all cases we have a bound on B = maxi,j |bi,j |. 
5.2 Reduction of the bounds

The bounds obtained above for $B$ are too large to be used in practice for finding all solutions of (5.1.3) in $b_{i,j}$. We now show how to reduce these bounds by means of the LLL-algorithm. For the LLL-algorithm, we refer to Section 5.6. Further details on the applications of the LLL-algorithm to reduce Baker-type bounds can be found in de Weger (1989), Smart (1998) and, in the case of infinite places, in Gaál (2002).

We first consider the case when $B = \max_j|b_{1,j}|$. We shall again distinguish two cases according as $v_l$ is infinite or finite. We illustrate the reduction procedure on the inequalities (5.1.10) ($v_l$ infinite and real), (5.1.13) ($v_l$ infinite and complex) and (5.1.15) ($v_l$ finite), reducing the corresponding bounds $B_0(v_l)$ to much smaller ones.

5.2.1 Infinite places

The inequalities (5.1.10) and (5.1.13) are of the form
\[ |b_1\vartheta_1 + \cdots + b_t\vartheta_t| < c_{18}\exp\{-c_{19}B\}, \tag{5.2.1} \]
where $\vartheta_1,\dots,\vartheta_t$ are logarithms of some non-zero algebraic numbers, $c_{18}, c_{19}$ are given explicit positive constants, and $b_1,\dots,b_t$ are unknown rational integers such that $0 < \max(|b_1|,\dots,|b_t|) \le B$ and $B \le B_0$ with some explicit constant $B_0$.

Remark. We could also work with an inhomogeneous version of (5.2.1), in which $b_1 = 1$.

We want to reduce the upper bound $B_0$ considerably, in the following way. Consider the inequality (5.2.1), where $\vartheta_1,\dots,\vartheta_t$ are real or complex numbers. Denote by $\mathcal{L}$ the $t$-dimensional lattice spanned by the columns of the $(t+2)\times t$ matrix
\[
\begin{pmatrix}
1 & 0 & \cdots & 0\\
0 & 1 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & 1\\
C\,\mathrm{Re}(\vartheta_1) & C\,\mathrm{Re}(\vartheta_2) & \cdots & C\,\mathrm{Re}(\vartheta_t)\\
C\,\mathrm{Im}(\vartheta_1) & C\,\mathrm{Im}(\vartheta_2) & \cdots & C\,\mathrm{Im}(\vartheta_t)
\end{pmatrix},
\]
where $C$ is a large constant to be specified in numerical cases. The last row can be omitted if $\vartheta_1,\dots,\vartheta_t$ are all real. Let $\mathbf{a}_1$ denote the first vector of an LLL-reduced basis of $\mathcal{L}$.

Lemma 5.2.1 If in (5.2.1) $\max_i|b_i| \le B \le B_0$ and
\[ \|\mathbf{a}_1\| \ge \sqrt{(t+1)2^{t-1}}\,B_0, \tag{5.2.2} \]
then
\[ B \le \frac{\log C + \log c_{18} - \log B_0}{c_{19}}. \tag{5.2.3} \]

This is a slight extension of a result of Gaál and Pohst (2002), where it is assumed that $B = \max(|b_1|,\dots,|b_t|)$. Our version will be important below, when we apply Lemma 5.2.1 to (5.1.10) and (5.1.13).

Proof. Following the proof of Lemma 1 in Gaál and Pohst (2002), we denote by $\mathbf{a}_0$ the shortest non-zero vector in $\mathcal{L}$. Then it follows from the inequality (iv) of Proposition 5.6.1 that $\|\mathbf{a}_1\|^2 \le 2^{t-1}\|\mathbf{a}_0\|^2$. The lattice vector belonging to $(b_1,\dots,b_t)$ is non-zero, so its norm is at least $\|\mathbf{a}_0\|$. Using (5.2.1) and the assumptions of our lemma, we infer that
\[ 2^{1-t}(t+1)2^{t-1}B_0^2 \le 2^{1-t}\|\mathbf{a}_1\|^2 \le \|\mathbf{a}_0\|^2 \le tB_0^2 + C^2c_{18}^2\exp\{-2c_{19}B\}. \]
This gives $B_0 \le C\cdot c_{18}\exp\{-c_{19}B\}$, whence (5.2.3) follows.

We note that if in (5.2.1) the numbers $\vartheta_1,\dots,\vartheta_t$ are linearly dependent over $\mathbb{Q}$, then the number of unknowns can be reduced and we can apply Lemma 5.2.1 to a lower dimensional lattice.

We expect Lemma 5.2.1 to reduce our upper bound $B_0$ for $B$, because it is believed that the logarithms of algebraic numbers behave as random complex numbers. To ensure (5.2.2) we have to choose $C$ sufficiently large. A suitable value of $C$ is usually of magnitude $B_0^t$. Then the bound $B_0$ is reduced almost to its logarithm. If Lemma 5.2.1 does not reduce our upper bound, a larger $C$ can be chosen and we repeat the procedure.
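The test in Lemma 5.2.1 is easy to mechanise once the norm of the first LLL-reduced basis vector is known. The following is a minimal sketch, not the authors' implementation: the function name `lemma_5_2_1_bound` and all numerical values in the usage note are hypothetical, and the LLL reduction itself is assumed to have been carried out elsewhere.

```python
import math


def lemma_5_2_1_bound(norm_a1, t, B0, C, c18, c19):
    """Apply Lemma 5.2.1 once.

    norm_a1 : Euclidean norm of the first LLL-reduced basis vector
    t       : number of unknowns b_1, ..., b_t
    B0      : current upper bound for B (a Python int is fine for huge values)
    C       : scaling constant used in the lattice (int or float)
    c18,c19 : constants of inequality (5.2.1)

    Returns the reduced bound (5.2.3) if hypothesis (5.2.2) holds,
    and None otherwise (in which case a larger C should be tried).
    """
    threshold = math.sqrt((t + 1) * 2 ** (t - 1)) * B0
    if norm_a1 < threshold:
        return None  # (5.2.2) fails
    # math.log accepts arbitrarily large Python ints, so C ~ B0**t is unproblematic
    return (math.log(C) + math.log(c18) - math.log(B0)) / c19
```

With the purely illustrative values `t = 3`, `B0 = 10**30`, `C = 10**95`, `c18 = 2.0`, `c19 = 0.5` and `norm_a1 = 1e31`, hypothesis (5.2.2) holds and the returned bound is about 300, which shows the "reduction almost to the logarithm" mentioned above.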
Keeping the notation of Section 5.1, we apply Lemma 5.2.1, for each infinite $v_l$, to the corresponding inequality
\[ \bigl|\log|a_2'| + b_{2,1}\log|\xi_{2,1}| + \cdots + b_{2,r_2}\log|\xi_{2,r_2}|\bigr| \le 2c_3\exp\{-c_2B\} \tag{5.1.10} \]
or
\[ \bigl|\log a_2' + b_{2,0}\log(-1) + b_{2,1}\log\xi_{2,1} + \cdots + b_{2,r_2}\log\xi_{2,r_2}\bigr| \le 2\sqrt{c_3}\exp\left\{-\frac{c_2}{2(r_2+1)}B'\right\}, \tag{5.1.13} \]
according as $v_l$ is real or not. We recall that in the first case $\max_{1\le j\le r_2}|b_{2,j}| \le B$, while in the second case $\max_{0\le j\le r_2}|b_{2,j}| \le B'$, where $B' = (r_2+1)B$. In the previous section we derived in each case an explicit upper bound $B_0(v_l)$ for $B$.

Lemma 5.2.1 can be applied to (5.1.10) and (5.1.13) repeatedly. In every step we take as $B_0$ the previous bound, initially the bound $B_0(v_l)$, to get smaller and smaller bounds for $B$. The reduction is very efficient in the first and second steps. After about four or five steps the procedure stabilizes, that is, it does not yield any further improvement. The final reduced bound is usually between 100 and 1000.

5.2.2 Finite places

Now let $v_l$ be finite, and $\mathfrak{p}$ the prime ideal of $O_K$ corresponding to $v_l$. We recall that
\[ a_2' = \zeta_2a_2 \quad\text{and}\quad x_2 = \zeta_2\prod_{j=1}^{r_2}\xi_{2,j}^{b_{2,j}}. \]
Consider now (5.1.15) in the form
\[ \mathrm{ord}_{\mathfrak p}\Bigl(a_2'\prod_{j=1}^{r_2}\xi_{2,j}^{b_{2,j}} - 1\Bigr) \ge c_{20}B - c_{21}, \tag{5.2.4} \]
where $b_{2,1},\dots,b_{2,r_2}$ are rational integers with $\max_{1\le j\le r_2}|b_{2,j}| \le B$ and $c_{20} := c_2/\log N(\mathfrak p)$, $c_{21} := \log c_3/\log N(\mathfrak p)$. In Section 5.1.2 we derived an upper bound $B_0(v_l)$ for $B$. We are now going to reduce this bound to a much smaller one by means of the LLL-reduction algorithm.

To apply the reduction procedure we have to convert (5.2.4) into a linear form estimate. This will be done by using $p$-adic logarithms. We shall proceed in several steps.
We follow the arguments of de Weger (1989) and Smart (1998).

Step 1. Firstly we show that $a_2x_2$ can be written in the form
\[ a_2x_2 = \eta_0\prod_{j=1}^{q_2}\eta_j^{d_j}, \tag{5.2.5} \]
where $d_j$ are rational integers with $|d_j| \le |b_{2,j}| \le B$ and $\eta_j$, $j \ge 1$, are multiplicatively independent elements of $K^*$ with $\mathrm{ord}_{\mathfrak p}(\eta_j) = 0$ for $j = 0,\dots,q_2$, where $q_2 = r_2$ if $\mathrm{ord}_{\mathfrak p}(a_2) = 0$ and $v_l \notin S_2$, and $q_2 = r_2 - 1$ otherwise.

It follows from (5.2.4) and (5.1.1) that $\mathrm{ord}_{\mathfrak p}(a_1x_1) > 0$ if $B > c_{21}/c_{20}$. Hence (5.1.1) implies that $\mathrm{ord}_{\mathfrak p}(a_2x_2) = 0$. Put $m_0 := \mathrm{ord}_{\mathfrak p}(a_2)$ and $m_j := \mathrm{ord}_{\mathfrak p}(\xi_{2,j})$ for $j = 1,\dots,r_2$. Then we infer that
\[ m_0 + \sum_{j=1}^{r_2}m_jb_{2,j} = 0. \tag{5.2.6} \]
If $m_j = 0$ for each $j$ with $0 \le j \le r_2$, then we may take $\eta_0 = a_2'$, $\eta_j = \xi_{2,j}$ for $j = 1,\dots,r_2$ and we are done.

Next assume that not all $m_j$ are zero. Choose $k > 0$ such that $|m_k|$ is minimal among the non-zero numbers $|m_j|$, $j = 1,\dots,r_2$. Let
\[ \eta_j := \xi_{2,j}^{m_k}\,\xi_{2,k}^{-m_j} \quad\text{for } j = 1,\dots,r_2. \]
Then $\eta_k = 1$, the other $\eta_j$ are multiplicatively independent and $\mathrm{ord}_{\mathfrak p}(\eta_j) = 0$ for $j \ge 1$, $j \ne k$. Let $d_j, t_j$ be rational integers such that
\[ b_{2,j} = m_kd_j + t_j \quad\text{with } 0 \le t_j < |m_k|,\quad j = 1,\dots,r_2, \tag{5.2.7} \]
and let
\[ m = -\Bigl(m_0 + \sum_{\substack{j=1\\ j\ne k}}^{r_2}m_jt_j\Bigr), \qquad \eta_0 = a_2'\Bigl(\prod_{\substack{j=1\\ j\ne k}}^{r_2}\xi_{2,j}^{t_j}\Bigr)\xi_{2,k}^{m/m_k}. \tag{5.2.8} \]
It follows from (5.2.6), (5.2.7) and (5.2.8) that
\[ m = m_k\sum_{j=1}^{r_2}m_jd_j + m_kt_k, \tag{5.2.9} \]
whence $m \equiv 0 \pmod{m_k}$, which implies that $\eta_0 \in K^*$. Further, we obtain that
\[ \eta_0\prod_{\substack{j=1\\ j\ne k}}^{r_2}\eta_j^{d_j} = a_2'\Bigl(\prod_{\substack{j=1\\ j\ne k}}^{r_2}\xi_{2,j}^{t_j}\Bigr)\xi_{2,k}^{m/m_k}\prod_{\substack{j=1\\ j\ne k}}^{r_2}\xi_{2,j}^{m_kd_j}\xi_{2,k}^{-m_jd_j} = a_2'\prod_{j=1}^{r_2}\xi_{2,j}^{b_{2,j}} = a_2x_2, \]
whence (5.2.5) follows. Finally, in view of (5.2.7), (5.2.8) and (5.2.9) we have $\mathrm{ord}_{\mathfrak p}(\eta_0) = 0$, which proves our claim.
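The exponent bookkeeping of Step 1 can be checked on small integer data. The sketch below is only an illustration under stated assumptions: it works with the valuations $m_0, m_j$ and exponents $b_{2,j}$ alone (no number field arithmetic), uses 0-based indices, and the function names `split_exponents` and `recombine` are hypothetical.

```python
def split_exponents(m0, m, b):
    """Carry out (5.2.7)-(5.2.8) on the level of exponents.

    m0 : ord_p(a_2);  m[j] : ord_p(xi_{2,j+1});  b[j] : b_{2,j+1}.
    Requires m0 + sum(m[j] * b[j]) == 0, i.e. (5.2.6), and some m[j] != 0.
    Returns (k, d, t, mm): the chosen index k, the quotients d_j,
    the remainders t_j with 0 <= t_j < |m_k|, and the integer m of (5.2.8).
    """
    assert m0 + sum(mj * bj for mj, bj in zip(m, b)) == 0
    nonzero = [j for j, mj in enumerate(m) if mj != 0]
    k = min(nonzero, key=lambda j: abs(m[j]))        # |m_k| minimal, non-zero
    mk = m[k]
    t = [bj % abs(mk) for bj in b]                   # remainders in [0, |m_k|)
    d = [(bj - tj) // mk for bj, tj in zip(b, t)]    # exact division by m_k
    mm = -(m0 + sum(m[j] * t[j] for j in range(len(m)) if j != k))
    return k, d, t, mm


def recombine(m, k, d, t, mm):
    """Exponent of each xi_{2,j} in eta_0 * prod_{j != k} eta_j^{d_j}."""
    mk = m[k]
    exps = [mk * d[j] + t[j] for j in range(len(m))]
    # xi_{2,k} collects m/m_k from eta_0 and -m_j d_j from each eta_j
    exps[k] = mm // mk - sum(m[j] * d[j] for j in range(len(m)) if j != k)
    return exps
```

For instance, with the made-up data `m0 = 2`, `m = [2, -3, 0]`, `b = [5, 4, 7]`, the function picks `k = 0`, returns `m = -2`, which is divisible by $m_k = 2$ as (5.2.9) predicts, and `recombine` gives back the original exponents `b`.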
Remark 5.2.2 It is important to note that $m_0, m_1,\dots,m_{r_2}$ and hence the numbers $\eta_0, \eta_1,\dots,\eta_{r_2}$ can be explicitly determined. However, we get a different $\eta_0$ for each possible choice of $t_1,\dots,t_{r_2}$ with $0 \le t_j < |m_k|$, $j = 1,\dots,r_2$, and $m_0 + \sum_{j=1}^{r_2}m_jt_j \equiv 0 \pmod{m_k}$, and we have to perform our computations for each $\eta_0$.

Step 2. We reduce (5.2.4) to an inequality concerning a linear form in $p$-adic logarithms. In view of (5.2.4) and (5.2.5) we have
\[ \mathrm{ord}_{\mathfrak p}(\Lambda) \ge c_{20}B - c_{21}, \tag{5.2.10} \]
where
\[ \Lambda = 1 - \eta_0\prod_{j=1}^{q_2}\eta_j^{d_j}. \]
Then (5.2.10) implies that
\[ \mathrm{ord}_{\mathfrak p}(\Lambda) > \frac{1}{p-1} \quad\text{whenever}\quad B > \frac{1}{c_{20}}\Bigl(c_{21} + \frac{1}{p-1}\Bigr) =: c_{22}. \]
Using $p$-adic logarithms in $\overline{\mathbb{Q}_p}$, the algebraic closure of $\mathbb{Q}_p$, and applying Lemma 1.11.2 and (1.11.1), we infer that
\[ \mathrm{ord}_{\mathfrak p}\Lambda' = \mathrm{ord}_{\mathfrak p}(\Lambda) \ge c_{20}B - c_{21}, \tag{5.2.11} \]
where
\[ \Lambda' = \log_p\eta_0 + d_1\log_p\eta_1 + \cdots + d_{q_2}\log_p\eta_{q_2}. \]
We note that here $\mathrm{ord}_{\mathfrak p}(\eta_j) = 0$ for $j = 0,\dots,q_2$, hence the $p$-adic logarithms of $\eta_0,\dots,\eta_{q_2}$ are well-defined (see Section 1.11). Further, $\log_p\eta_0,\dots,\log_p\eta_{q_2}$ are elements of $K_{\mathfrak p}$, the $\mathfrak p$-adic completion of $K$ (see Proposition 1.11.3), and they can be approximated with any desired accuracy; see de Weger (1989) and Smart (1998), chapter 5.

Step 3. We consider now (5.2.11) in the following more general form:
\[ \infty > \mathrm{ord}_p(b_1\vartheta_1 + \cdots + b_t\vartheta_t) \ge c_{23}B - c_{24}, \tag{5.2.12} \]
where $\vartheta_1,\dots,\vartheta_t$ are given elements of $K_{\mathfrak p}$, $c_{23} > 0$ and $c_{24}$ are given explicit constants, $b_1,\dots,b_t \in \mathbb{Z}$ with $|b_j| \le B$ for $j = 1,\dots,t$ and $B \le B_0$ for some explicit constant $B_0$. This is the $p$-adic analogue of the inequality (5.2.1).

We first show that (5.2.12) can be reduced to the case when, in (5.2.12), all $\vartheta_i$ are integers in $\mathbb{Q}_p$. The field $\mathbb{Q}_p(\vartheta_1,\dots,\vartheta_t)$ is a finite extension of degree $m$, say, of $\mathbb{Q}_p$. Using standard arguments, we can determine an element $\delta$ which is integral over $\mathbb{Z}_p$, and $p$-adic numbers $\vartheta_{ij}$ ($i = 1,\dots,t$, $j = 0,\dots,m-1$), such that $\mathbb{Q}_p(\vartheta_1,\dots,\vartheta_t) = \mathbb{Q}_p(\delta)$ and
\[ \vartheta_i = \sum_{j=0}^{m-1}\vartheta_{ij}\delta^j, \quad i = 1,\dots,t. \]
Putting
\[ \Lambda(\mathbf b) := \sum_{i=1}^{t}b_i\vartheta_i \quad\text{and}\quad \Lambda_j(\mathbf b) = \sum_{i=1}^{t}b_i\vartheta_{ij} \quad (j = 0,\dots,m-1), \]
we have
\[ \Lambda(\mathbf b) = \sum_{j=0}^{m-1}\Lambda_j(\mathbf b)\delta^j. \tag{5.2.13} \]
We claim that
\[ \mathrm{ord}_p(\Lambda_j(\mathbf b)) \ge c_{23}B - c_{24} - \tfrac12\mathrm{ord}_p(D(\delta)) \tag{5.2.14} \]
for $j = 0,\dots,m-1$, where $D(\delta)$ denotes the discriminant of $\delta$ over $\mathbb{Q}_p$. To prove (5.2.14), consider the conjugates $\delta^{(1)} = \delta,\dots,\delta^{(m)}$ of $\delta$ over $\mathbb{Q}_p$. Taking the corresponding conjugates in (5.2.13) we get
\[ \Lambda^{(i)}(\mathbf b) = \sum_{j=0}^{m-1}\Lambda_j(\mathbf b)(\delta^{(i)})^j, \quad i = 1,\dots,m. \tag{5.2.15} \]
Put $\Delta(\delta) := \prod_{1\le i<j\le m}(\delta^{(i)} - \delta^{(j)})$. It follows from (5.2.15) that there are $p$-adic algebraic numbers $\kappa_{ij}$ such that $\mathrm{ord}_p(\kappa_{ij}) \ge 0$ and
\[ \Delta(\delta)\Lambda_j(\mathbf b) = \sum_{i=1}^{m}\kappa_{ij}\Lambda^{(i)}(\mathbf b) \quad\text{for } j = 0,\dots,m-1. \tag{5.2.16} \]
Since $\mathrm{ord}_p(\Lambda^{(i)}(\mathbf b))$ does not depend on $i$, we infer from (5.2.12), (5.2.16) and $2\,\mathrm{ord}_p(\Delta(\delta)) = \mathrm{ord}_p(D(\delta))$ that (5.2.14) holds.

We could consider here the linear form estimates (5.2.14) simultaneously, as is done in Smart (1995, 1998). This would give a better reduced bound than using only one linear form. Nevertheless, we work only with one form, say $\Lambda_{j_0}(\mathbf b) = \sum_{i=1}^{t}b_i\vartheta_{ij_0}$, such that $\Lambda_{j_0}(\mathbf b) \ne 0$. On the one hand, this case is simpler to apply. On the other hand, it will enable us to apply the LLL-reduction algorithm in a way similar to that used in the complex case. For simplicity we omit the index $j_0$. Then, in view of (5.2.14), we arrive at (5.2.12) under the assumption that now $\vartheta_1,\dots,\vartheta_t$ are elements of $\mathbb{Q}_p$ and $c_{24}$ is replaced by $c_{25} := c_{24} + \tfrac12\mathrm{ord}_p(D(\delta))$.
We may assume without loss of generality that $\min_i\mathrm{ord}_p(\vartheta_i) = \mathrm{ord}_p(\vartheta_t) =: c_{26}$. Then $\vartheta_i' := -\vartheta_i/\vartheta_t \in \mathbb{Z}_p$ for $i = 1,\dots,t-1$ and (5.2.12) implies that
\[ \infty > \mathrm{ord}_p(-b_1\vartheta_1' - \cdots - b_{t-1}\vartheta_{t-1}' + b_t) \ge c_{23}B - c_{27}, \tag{5.2.17} \]
where $c_{27} := c_{25} + c_{26}$.

Step 4. We apply now the LLL-reduction algorithm to (5.2.17) to reduce the bound $B_0$. For any $\vartheta \in \mathbb{Z}_p$ and for any positive integer $u$ denote by $\vartheta^{\{u\}}$ the unique rational integer such that $\vartheta \equiv \vartheta^{\{u\}} \pmod{p^u}$ and $0 \le \vartheta^{\{u\}} < p^u$. Let $\mathcal{L}_u$ denote the $t$-dimensional lattice generated by the columns of the matrix
\[
A = \begin{pmatrix}
1 & 0 & \cdots & 0 & 0\\
0 & 1 & \cdots & 0 & 0\\
\vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & \cdots & 1 & 0\\
\vartheta_1'^{\{u\}} & \vartheta_2'^{\{u\}} & \cdots & \vartheta_{t-1}'^{\{u\}} & p^u
\end{pmatrix}.
\]
For any $\mathbf b = (b_1,\dots,b_t) \in \mathbb{Z}^t$, write
\[ \Lambda(\mathbf b) = -b_1\vartheta_1' - \cdots - b_{t-1}\vartheta_{t-1}' + b_t. \]
We claim that
\[ \mathcal{L}_u = \{\mathbf b^T : \mathbf b = (b_1,\dots,b_t) \in \mathbb{Z}^t \text{ and } \mathrm{ord}_p(\Lambda(\mathbf b)) \ge u\}. \]
Indeed, if $\mathbf b^T \in \mathcal{L}_u$ then $\mathbf b^T = A\mathbf x^T$ for some $\mathbf x = (x_1,\dots,x_t) \in \mathbb{Z}^t$. This implies that $b_i = x_i$ for $i = 1,\dots,t-1$ and
\[ b_t = \sum_{i=1}^{t-1}x_i\vartheta_i'^{\{u\}} + x_tp^u \equiv \sum_{i=1}^{t-1}b_i\vartheta_i' \pmod{p^u}, \]
whence $\mathrm{ord}_p(\Lambda(\mathbf b)) \ge u$ follows. Conversely, if $\mathrm{ord}_p(\Lambda(\mathbf b)) \ge u$ for some $\mathbf b \in \mathbb{Z}^t$ then there exists $\mathbf x \in \mathbb{Z}^t$ such that $\mathbf b^T = A\mathbf x^T$, that is, $\mathbf b^T \in \mathcal{L}_u$, which proves our claim.

We recall that in (5.2.17) we have a bound $B_0$ for $B$. Choose an integer constant $u$ such that $p^u \ge B_0^{t+1}$. We may expect that $u$ is large enough to bound $B$ using the following lemma. If it is not sufficiently large then we make $u$ a little larger and apply the lemma again.

Let $\mathbf a_1$ denote the first vector of an LLL-reduced basis of $\mathcal{L}_u$. We prove the following analogue of Lemma 5.2.1.

Lemma 5.2.3 If in (5.2.17) $\max_i|b_i| \le B \le B_0$ and
\[ \|\mathbf a_1\| > \sqrt{t\,2^{t-1}}\,B_0, \tag{5.2.18} \]
then
\[ B \le \frac{1}{c_{23}}(u - 1 + c_{27}). \tag{5.2.19} \]

Proof. Using (iv) of Proposition 5.6.1 and (5.2.18) we infer that
\[ \|\mathbf a_0\|^2 \ge 2^{1-t}\|\mathbf a_1\|^2 > tB_0^2, \]
where $\mathbf a_0$ denotes the shortest non-zero vector in the lattice $\mathcal{L}_u$. Hence we infer that $\|\mathbf a_0\| > \sqrt{t}\,B_0$. Every $\mathbf b = (b_1,\dots,b_t) \in \mathbb{Z}^t$ with $\max_i|b_i| \le B \le B_0$ which satisfies (5.2.17) is non-zero and has norm at most $\sqrt{t}\,B_0$, so $(b_1,\dots,b_t)^T$ cannot be a lattice point in $\mathcal{L}_u$. Hence, for such a $\mathbf b$,
\[ \mathrm{ord}_p(\Lambda(\mathbf b)) \le u - 1, \]
which, together with (5.2.17), gives (5.2.19).

Similarly to Lemma 5.2.1, Lemma 5.2.3 also reduces the bound $B_0$ to almost its logarithm. Of course, Lemma 5.2.3 can also be applied repeatedly, as long as it yields a better bound than the previous one.

Finally we note that performing the above procedure in (5.2.11) for all $v \in S_1$ (when $B = \max_j|b_{1,j}|$), then proceeding similarly for all $v \in S_2$ (when $B = \max_j|b_{2,j}|$), and denoting by $B_R$ the maximum of the reduced bounds obtained, we get $B \le B_R$ in our equation (5.1.3).

5.3 Enumeration of the "small" solutions

In the first section we gave an upper bound for the solutions $b_{i,j}$ of the equation (5.1.3). Further, in the previous section we considerably reduced this bound to a new bound $B_R$. A crucial problem in the resolution of equation (5.1.3) is now to check the remaining $(2B_R+1)^{r_1+r_2}$ cases for the exponents, where $r_1, r_2$ denote the ranks of $\Gamma_1$ and $\Gamma_2$, respectively. Even if the bound $B_R$ is moderate (say $< 100$), direct enumeration is almost hopeless whenever the number $r_1 + r_2$ of exponents is greater than eight.

We now present an efficient algorithm for finding all solutions of (5.1.3) under the reduced bound $B_R$. This algorithm has been established by Wildanger (1997, 2000) for the case when both $\Gamma_1$ and $\Gamma_2$ are the unit group of $O_K$, and by Smart (1999) in the general case.
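To make the size of the naive search space concrete, the count $(2B_R+1)^{r_1+r_2}$ can be evaluated directly. The snippet below is only an illustration; the function name `naive_case_count` and the sample values are hypothetical.

```python
def naive_case_count(B_R, r1, r2):
    """Number of exponent vectors with |b_{i,j}| <= B_R to be tested naively."""
    return (2 * B_R + 1) ** (r1 + r2)


# Illustrative values only: a moderate bound and nine exponents in total.
count = naive_case_count(100, 4, 5)
print(f"{count:.3e}")
```

Already for $B_R = 100$ and $r_1 + r_2 = 9$ the count exceeds $10^{20}$, which is why a more refined enumeration is needed.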
We follow the presentation of Smart (1999) with certain simplifications.

For any real number $H > 1$ and for any finite set $S$ of places of $K$ containing the infinite places, we define the set
\[ \Omega_{H,S} := \Bigl\{\alpha \in K : \frac1H \le |\alpha|_v \le H \text{ for all } v \in S\Bigr\}. \]
Denote by $\mathcal{S}$ the set of solutions $(x_1, x_2) \in \Gamma_1\times\Gamma_2$ of (5.1.1). Writing $x_1, x_2$ in the form (5.1.2), we consider (5.1.1) as the exponential equation (5.1.3) in integers $b_{i,j}$ with $B = \max_{i,j}|b_{i,j}|$. For a positive integer $B_k$, denote by $\mathcal{S}_{B_k}$ the set of solutions of (5.1.1) such that the absolute values of the corresponding exponents $b_{i,j}$ are at most $B_k$. Then $\mathcal{S} = \mathcal{S}_{B_R}$. We define
\[ \mathcal{S}_{B_k}(H) := \{(x_1, x_2) \in \mathcal{S}_{B_k} : x_1 \in \Omega_{H,S_1}\}. \]
We first show that for
\[ H_0 := \max_{v\in S_1}\exp\Bigl(B_R\sum_{j=1}^{r_1}\bigl|\log|\xi_{1,j}|_v\bigr|\Bigr), \]
we have
\[ \mathcal{S} = \mathcal{S}_{B_R}(H_0). \tag{5.3.1} \]
Indeed, using (5.1.2), for every solution $(x_1, x_2)$ of (5.1.1) and for each $v \in S_1$ we infer that
\[ \bigl|\log|x_1|_v\bigr| = \Bigl|\sum_{j=1}^{r_1}b_{1,j}\log|\xi_{1,j}|_v\Bigr| \le B\sum_{j=1}^{r_1}\bigl|\log|\xi_{1,j}|_v\bigr| \le \max_{v\in S_1}\Bigl(B_R\sum_{j=1}^{r_1}\bigl|\log|\xi_{1,j}|_v\bigr|\Bigr) = \log H_0. \]
This means that
\[ \frac{1}{H_0} \le |x_1|_v \le H_0, \]
whence (5.3.1) follows.

In what follows, we shall proceed in several steps.

Step 1. We first decompose the solution set $\mathcal{S}$ into appropriate subsets. Set
\[ t_i := \max_{v\in S_1\cup S_2}\max\bigl(|a_i|_v, |a_i^{-1}|_v\bigr) \quad\text{for } i = 1, 2, \quad\text{and}\quad t_3 := \max_{v\in S_1\cup S_2}\min\bigl(|a_2|_v, |a_2^{-1}|_v\bigr). \]
For $k \ge 0$, let $B_k$ be a positive number with the choice $B_0 = B_R$, and let $H_k, H_{k+1}$ be real numbers such that
\[ \max\Bigl\{t_1, t_2, t_3, \frac{t_3 - 1}{t_1}\Bigr\} < H_{k+1} < H_k. \]
Note that $H_{k+1} > 1$. We intend to find a positive number $B_{k+1} < B_k$ and then decompose the set $\mathcal{S}_{B_k}(H_k)$ into the union of $\mathcal{S}_{B_{k+1}}(H_{k+1})$ and a union of some subsets, each containing a few elements which can be easily determined.
If, starting with $k = 0$, that is with (5.3.1), this process can be repeated, then in the end it remains to enumerate a set of the form $\mathcal{S}_{B_{k_0}}(H_{k_0})$ for some small values of $B_{k_0}$ and $H_{k_0}$.

We define the sets $T_{j,v} = T_{j,v}(B_k, H_k, H_{k+1})$, $j = 1, 2, 3, 4$, in the following way:
\[ T_{1,v} := \Bigl\{(x_1, x_2) \in \mathcal{S}_{B_k}(H_k) : |a_1x_1 - 1|_v < \frac{1}{1 + t_1H_{k+1}}\Bigr\} \quad (v \in S_2), \]
\[ T_{2,v} := \Bigl\{(x_1, x_2) \in \mathcal{S}_{B_k}(H_k) : \Bigl|\frac{1}{a_1x_1} - 1\Bigr|_v < \frac{1}{1 + t_1H_{k+1}}\Bigr\} \quad (v \in S_1\cup S_2), \]
\[ T_{3,v} := \Bigl\{(x_1, x_2) \in \mathcal{S}_{B_k}(H_k) : |a_2x_2 - 1|_v < \frac{t_1}{H_{k+1}},\ a_2x_2 \in \Omega_{1+t_1H_k,\,S_2}\Bigr\} \quad (v \in S_1), \]
\[ T_{4,v} := \Bigl\{(x_1, x_2) \in \mathcal{S}_{B_k}(H_k) : \Bigl|-\frac{a_2x_2}{a_1x_1} - 1\Bigr|_v < \frac{t_1}{H_{k+1}},\ \frac{a_2x_2}{a_1x_1} \in \Omega_{1+t_1H_k,\,S_1\cup S_2}\Bigr\} \quad (v \in S_1). \]
Further, let
\[ T_1(B_k, H_k, H_{k+1}) := \bigcup_{v\in S_2}T_{1,v}(B_k, H_k, H_{k+1}), \qquad T_2(B_k, H_k, H_{k+1}) := \bigcup_{v\in S_1\cup S_2}T_{2,v}(B_k, H_k, H_{k+1}), \]
\[ T_3(B_k, H_k, H_{k+1}) := \bigcup_{v\in S_1}T_{3,v}(B_k, H_k, H_{k+1}), \qquad T_4(B_k, H_k, H_{k+1}) := \bigcup_{v\in S_1}T_{4,v}(B_k, H_k, H_{k+1}). \]
We recall that $c_1^*$ denotes the constant occurring in (5.1.5).

Lemma 5.3.1 Let
\[ c_{28} := \max\Bigl\{\log\frac{t_1H_{k+1} + 1}{t_3},\ \log(H_{k+1})\Bigr\} \]
and $B_{k+1} := c_1^*c_{28}$. Then
\[ \mathcal{S}_{B_k}(H_k) = \mathcal{S}_{B_{k+1}}(H_{k+1}) \cup \bigcup_{j=1}^{4}T_j(B_k, H_k, H_{k+1}). \tag{5.3.2} \]

Proof. Assume that $(x_1, x_2) \in \mathcal{S}_{B_k}(H_k)$ and that $(x_1, x_2) \notin \mathcal{S}_{B_k}(H_{k+1})$. Then there is a $v \in S_1$ such that either $|x_1|_v < 1/H_{k+1}$ or $|x_1|_v > H_{k+1}$. In the first case we infer that
\[ |a_2x_2 - 1|_v = |a_1x_1|_v < \frac{t_1}{H_{k+1}}. \]
If $(x_1, x_2) \notin T_1(B_k, H_k, H_{k+1})$ then, for each $u \in S_2$,
\[ |a_2x_2|_u = |a_1x_1 - 1|_u \ge \frac{1}{1 + t_1H_{k+1}}. \]
Further, we have
\[ |a_2x_2|_u = |a_1x_1 - 1|_u \le 1 + |a_1x_1|_u \le \begin{cases} 1 + t_1H_k & \text{if } u \in S_1,\\ 1 + t_1 & \text{if } u \in S_2\setminus S_1. \end{cases} \]
Consequently, for each $u \in S_2$ we have
\[ \bigl|\log|a_2x_2|_u\bigr| \le \max\{\log(1 + t_1), \log(1 + t_1H_{k+1}), \log(1 + t_1H_k)\} = \log(1 + t_1H_k). \]
This implies that if $|x_1|_v < 1/H_{k+1}$ for some $v \in S_1$ and $(x_1, x_2) \notin T_1(B_k, H_k, H_{k+1})$, then $(x_1, x_2) \in T_3(B_k, H_k, H_{k+1})$.

Next consider the case when $|x_1|_v > H_{k+1}$. Then it follows that
\[ \Bigl|-\frac{a_2x_2}{a_1x_1} - 1\Bigr|_v = \Bigl|\frac{1}{a_1x_1}\Bigr|_v < \frac{t_1}{H_{k+1}}. \]
Assume that $(x_1, x_2) \notin T_2(B_k, H_k, H_{k+1})$. Then for each $u \in S_1\cup S_2$ we have
\[ \Bigl|\frac{a_2x_2}{a_1x_1}\Bigr|_u = \Bigl|\frac{1}{a_1x_1} - 1\Bigr|_u \ge \frac{1}{1 + t_1H_{k+1}}. \]
Further, it follows that
\[ \Bigl|\frac{a_2x_2}{a_1x_1}\Bigr|_u = \Bigl|\frac{1}{a_1x_1} - 1\Bigr|_u \le 1 + \Bigl|\frac{1}{a_1x_1}\Bigr|_u \le \begin{cases} 1 + t_1H_k & \text{if } u \in S_1,\\ 1 + t_1 & \text{if } u \in S_2\setminus S_1. \end{cases} \]
This implies that $(x_1, x_2) \in T_4(B_k, H_k, H_{k+1})$. Hence
\[ \mathcal{S}_{B_k}(H_k) = \mathcal{S}_{B_k}(H_{k+1}) \cup \Bigl(\bigcup_{j=1}^{4}T_j(B_k, H_k, H_{k+1})\Bigr). \tag{5.3.3} \]
Now consider the case when $(x_1, x_2) \in \mathcal{S}_{B_k}(H_{k+1})$. Then for each $v \in S_1\cup S_2$ we have $\bigl|\log|x_1|_v\bigr| \le \log H_{k+1}$ and
\[ |x_2|_v = \Bigl|\frac{a_1x_1 - 1}{a_2}\Bigr|_v \le \frac{|a_1x_1|_v + 1}{|a_2|_v} \le \frac{t_1H_{k+1} + 1}{t_3}. \]
If $(x_1, x_2) \notin T_1(B_k, H_k, H_{k+1})$, then for each $v \in S_2$
\[ |x_2|_v = \Bigl|\frac{a_1x_1 - 1}{a_2}\Bigr|_v \ge \frac{t_3}{t_1H_{k+1} + 1}. \]
Clearly, this inequality holds for $v \in S_1\setminus S_2$ as well. Thus, we deduce that if $(x_1, x_2) \in \mathcal{S}_{B_k}(H_{k+1})\setminus T_1(B_k, H_k, H_{k+1})$, then for each $v \in S_1\cup S_2$
\[ \bigl|\log|x_1|_v\bigr| \le c_{28}, \qquad \bigl|\log|x_2|_v\bigr| \le c_{28}. \]
However, in view of (5.1.5) we must have $B \le c_1^*c_{28} = B_{k+1}$.
Thus
\[ \mathcal{S}_{B_k}(H_{k+1}) = \mathcal{S}_{B_{k+1}}(H_{k+1}) \cup T_1(B_k, H_k, H_{k+1}), \]
which, together with (5.3.3), completes the proof.

Remark 5.3.2 When applying Lemma 5.3.1 we need to choose a value $H_{k+1}$ such that the algorithm described below allows us to enumerate the sets $T_{j,v}(B_k, H_k, H_{k+1})$ easily for each $j$ and $v$ under consideration. Wildanger (2000) provides a heuristic method to find the best value for $H_{k+1}$ in the case when $\Gamma_1 = \Gamma_2$ is the unit group of $O_K$. In the general case the analysis appears similar, and the choice of Wildanger for $H_{k+1}$ seems to be sufficient.

Step 2. We are going to show how to enumerate, for each $j$ and $v$ in question, all the possible elements in $T_{j,v}(B_k, H_k, H_{k+1})$. In all cases our problem can be reformulated as that of enumerating all non-trivial solutions of the following problem.

Let $\alpha, \xi_1,\dots,\xi_r$ be explicitly given elements of $K^*$ such that $\xi_1,\dots,\xi_r$ are multiplicatively independent. Further, let $S = \{v_1,\dots,v_s\}$ be the support of the multiplicative group generated by $\xi_1,\dots,\xi_r$. Set
\[ x = \zeta^{b_0}\prod_{j=1}^{r}\xi_j^{b_j}, \tag{5.3.4} \]
where $\zeta$ is a primitive root of unity in $K$, and $b_0,\dots,b_r$ are rational integers with $0 \le b_0 < w$, where $w$ denotes the number of roots of unity in $K$. We have, for some $H > 1$ and for all $v \in S$,
\[ \frac1H \le |\alpha x|_v \le H. \tag{5.3.5} \]
We wish to determine all $x$ for which (5.3.4), (5.3.5) and, for some $v \in S$ and some given $\varepsilon \in (0, 1)$,
\[ |\alpha x - 1|_v < \varepsilon \tag{5.3.6} \]
hold.

We shall distinguish two cases according as $v$ is infinite or finite in (5.3.6).

$v$ is infinite. Let $v = v_i$. For any $z \in \mathbb{C}$, $|z - 1| < \varepsilon$ implies that $\bigl|\log|z|\bigr| < \log(1/(1 - \varepsilon))$.
Hence we deduce from (5.3.6) that
\[ \bigl|\log|\alpha x|_v\bigr| \le \varepsilon' := \begin{cases} \log\dfrac{1}{1-\varepsilon}, & v \text{ real},\\[2ex] \dfrac12\log\dfrac{1}{1-\sqrt{\varepsilon}}, & v \text{ complex}. \end{cases} \]
Further, we have, with obvious notation,
\[ \bigl|\mathrm{Arg}\,(\alpha x)^{(i)}\bigr| \le \arccos\sqrt{1-\varepsilon} =: \varepsilon''. \]
For simplicity, for any $\beta \in K^*$ we denote by $\beta^{(i)}$ the conjugate of $\beta$ over $\mathbb{Q}$ corresponding to $v_i$. Consider the sublattice of $\mathbb{R}^{s+1}$ which is generated by the columns of the matrix $M$ obtained from the $(s+1)\times(r+1)$ matrix
\[
\frac{1}{\log H}
\begin{pmatrix}
\log|\xi_1|_{v_1} & \cdots & \log|\xi_r|_{v_1} & 0\\
\vdots & & \vdots & \vdots\\
\log|\xi_1|_{v_s} & \cdots & \log|\xi_r|_{v_s} & 0\\
0 & \cdots & 0 & 0
\end{pmatrix}
\]
by replacing the $i$-th row by
\[ \frac{1}{\varepsilon'}\bigl(\log|\xi_1|_{v_i},\dots,\log|\xi_r|_{v_i}, 0\bigr) \]
and the last row by
\[ \frac{1}{\varepsilon''}\bigl(\mathrm{Arg}\,\xi_1^{(i)},\dots,\mathrm{Arg}\,\xi_r^{(i)}, \mathrm{Arg}\,\zeta^{(i)}\bigr). \]
We expect the $i$-th and last row of $M$ to have much larger entries than the other rows. Let $\mathbf x = M\mathbf b$, where $\mathbf b = (b_1,\dots,b_r, b_0)^T$, and consider the vector $\mathbf y$ obtained from the vector
\[ \frac{1}{\log H}\bigl(-\log|\alpha|_{v_1},\dots,-\log|\alpha|_{v_s}, 0\bigr)^T \in \mathbb{R}^{s+1} \]
by replacing the $i$-th coordinate (where $1 \le i \le s$) by $-\log|\alpha|_{v_i}/\varepsilon'$, and the last coordinate by $\mathrm{Arg}((1/\alpha)^{(i)})/\varepsilon''$. Then we have
\[ \|\mathbf x - \mathbf y\|^2 = \frac{\log^2|\alpha x|_{v_i}}{\varepsilon'^2} + \frac{\mathrm{Arg}^2(\alpha x)^{(i)}}{\varepsilon''^2} + \sum_{\substack{v\in S\\ v\ne v_i}}\frac{\log^2|\alpha x|_v}{\log^2H} \le s + 1. \tag{5.3.7} \]
Hence we have proved that for any $(b_0, b_1,\dots,b_r) \in \mathbb{Z}^{r+1}$ which corresponds by (5.3.4) to a solution $x$ of (5.3.5) and (5.3.6), inequality (5.3.7) holds. The inequality (5.3.7) defines an ellipsoid with center $\mathbf y$. The lattice points contained in this ellipsoid can be enumerated by means of the algorithm of Fincke and Pohst (1985). The enumeration is usually very fast. However, it is essential to use the improved version of the algorithm, which involves LLL reduction; see Fincke and Pohst (1985).

$v$ is finite. Let again $\mathfrak p$ be the prime ideal of $O_K$ that corresponds to $v$.
We proceed as above with the following modifications. Then (5.3.6) implies that $\mathrm{ord}_{\mathfrak p}(\alpha x) = 0$. As in Step 1 of Section 5.2.2, we can reduce (5.3.4) to a similar problem where $\alpha x = \eta_0\prod_{j=1}^{q}\eta_j^{d_j}$ with multiplicatively independent $\eta_1,\dots,\eta_q$ such that $\mathrm{ord}_{\mathfrak p}(\eta_j) = 0$ for $j = 0,\dots,q$. We recall that $\eta_0$ may assume finitely many values, each of which can be determined. We have to perform our computations for every possible value of $\eta_0$.

Suppose that $\mathfrak p$ has residue degree $f$ and that it lies above the rational prime $p$. Choose a positive integer $n$ such that $\varepsilon \le p^{-fn}$. Then in view of (5.3.6) we have
\[ \eta_0\prod_{j=1}^{q}\eta_j^{d_j} \equiv 1 \pmod{\mathfrak p^n}. \tag{5.3.8} \]
First assume that $\eta_0,\dots,\eta_q$ are multiplicatively independent. Let $G$ denote the subgroup of $K^*$ generated by $\eta_0,\dots,\eta_q$. Since $\mathrm{ord}_{\mathfrak p}(\eta_j) = 0$ for all $j$, we can consider the image of $G$ in $(O_K/\mathfrak p^n)^*$ under reduction mod $\mathfrak p^n$. The order of $\eta_j \pmod{\mathfrak p^n}$ can be computed very quickly, as it is a divisor of the order of $(O_K/\mathfrak p^n)^*$, which is $p^{(n-1)f}(p^f - 1)$. All $\mathbf d = (d_0,\dots,d_q) \in \mathbb{Z}^{q+1}$ for which
\[ \eta_0^{d_0}\cdots\eta_q^{d_q} \equiv 1 \pmod{\mathfrak p^n} \]
form a full lattice in $\mathbb{Z}^{q+1}$, a basis of which can be computed by using the algorithm MINIMIZE; see Teske (1998). This algorithm computes such a basis in the form
\[ \mathbf d_j = (d_{0j},\dots,d_{jj}, 0,\dots,0) \quad\text{for } j = 0,\dots,q, \]
where $d_{ij} \in \mathbb{Z}$ and $d_{jj}$ is the smallest positive integer for which $\eta_j^{d_{jj}}$ belongs to the subgroup of $G/\mathfrak p^n$ generated by $\{\eta_0,\dots,\eta_{j-1}\}$. Then putting
\[ \eta_j' := \eta_0^{d_{0j}}\cdots\eta_j^{d_{jj}}, \]
we can write
\[ \alpha x = \prod_{j=0}^{q}\eta_j'^{\,n_j} \]
with suitable integers $n_0,\dots,n_q$. Let $S' = \{u_1,\dots,u_{s'}\}$ denote the support of the group generated by $\eta_0',\dots,\eta_q'$. Obviously $S' \subset S$. We can now proceed in a similar way as in the case of infinite places. Consider the sublattice of $\mathbb{R}^{s'}$ generated by the vectors
\[ \boldsymbol\eta_i = \frac{1}{\log H}\bigl(\log|\eta_i'|_{u_1},\dots,\log|\eta_i'|_{u_{s'}}\bigr) \]
for $i = 0,\dots,q$. We have
\[ \|n_0\boldsymbol\eta_0 + \cdots + n_q\boldsymbol\eta_q\|^2 = \sum_{u\in S'}\frac{\log^2|\alpha x|_u}{\log^2H} \le s + 1. \]
Then, as above, we can determine all $(n_0,\dots,n_q)$ and hence all $x$ under consideration using the Fincke–Pohst algorithm.

Next consider the case when $\eta_0,\dots,\eta_q$ are multiplicatively dependent, and let $h$ be a positive integer for which $\eta_0^h$ is contained in the multiplicative group generated by $\eta_1,\dots,\eta_q$. Then it follows from (5.3.8) that
\[ (\alpha x)^h = \prod_{j=1}^{q}\eta_j^{d_j'} \equiv 1 \pmod{\mathfrak p^n} \]
with some rational integers $d_1',\dots,d_q'$. Then we infer as above that
\[ (\alpha x)^h = \prod_{j=1}^{q}\eta_j'^{\,n_j} \]
with some rational integers $n_1,\dots,n_q$. Thus, following the above procedure, we can determine all $(n_1,\dots,n_q)$ and hence, up to a factor of a root of unity in $K$, all $x$ can also be found. The factor in question can be easily determined from equation (5.1.1).

Step 3. By means of the above process we can determine all elements of the set $T_{j,v}(B_k, H_k, H_{k+1})$ for $j = 1, 2, 3, 4$, and for all $v$ in question. Finally, at the end of the repeated procedure described in Steps 1 and 2 of this section we arrive at a set of the form $\mathcal{S}_{B_{k_0}}(H_{k_0})$ for some small values of $B_{k_0}$ and $H_{k_0}$. Consider the lattice in $\mathbb{R}^{s_1}$ generated by the vectors
\[ \boldsymbol\xi_i := \frac{1}{\log H_{k_0}}\bigl(\log|\xi_{1,i}|_{v_1},\dots,\log|\xi_{1,i}|_{v_{s_1}}\bigr)^T \]
for $i = 1,\dots,r_1$, where $\{v_1,\dots,v_{s_1}\} = S_1$. Then the set $\mathcal{S}_{B_{k_0}}(H_{k_0})$ is contained in the ellipsoid
\[ \|b_{1,1}\boldsymbol\xi_1 + \cdots + b_{1,r_1}\boldsymbol\xi_{r_1}\|^2 \le s_1, \]
whose points can be found by using again the Fincke–Pohst algorithm.
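The final enumeration in Step 3 asks for all integer vectors inside an ellipsoid centered at the origin. The following is a minimal sketch of a Fincke–Pohst style search, under simplifying assumptions: it uses a Cholesky factorisation of the Gram matrix and a depth-first search over the coordinates, omits the LLL preprocessing that the text calls essential for large instances, and does not handle the shifted center $\mathbf y$ of (5.3.7). The names `cholesky_upper` and `short_vectors` and the tolerance `eps` are hypothetical.

```python
import math


def cholesky_upper(G):
    """Upper-triangular R with G = R^T R, for symmetric positive definite G."""
    n = len(G)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        s = G[i][i] - sum(R[k][i] ** 2 for k in range(i))
        R[i][i] = math.sqrt(s)
        for j in range(i + 1, n):
            R[i][j] = (G[i][j] - sum(R[k][i] * R[k][j] for k in range(i))) / R[i][i]
    return R


def short_vectors(basis, bound, eps=1e-9):
    """All integer b with || sum_j b[j] * basis[j] ||^2 <= bound (up to eps)."""
    n = len(basis)
    G = [[sum(x * y for x, y in zip(u, v)) for v in basis] for u in basis]
    R = cholesky_upper(G)
    found, b = [], [0] * n

    def search(i, remaining):
        if i < 0:
            # final exact-ish check against the original basis vectors
            vec = [sum(b[j] * basis[j][c] for j in range(n))
                   for c in range(len(basis[0]))]
            if sum(x * x for x in vec) <= bound + eps:
                found.append(tuple(b))
            return
        c = sum(R[i][j] * b[j] for j in range(i + 1, n))
        radius = math.sqrt(max(remaining, 0.0) + eps)
        lo = math.ceil((-radius - c) / R[i][i])
        hi = math.floor((radius - c) / R[i][i])
        for v in range(lo, hi + 1):
            b[i] = v
            search(i - 1, remaining - (R[i][i] * v + c) ** 2)

    search(n - 1, bound)
    return found
```

In the setting of Step 3 one would pass the vectors $\boldsymbol\xi_1,\dots,\boldsymbol\xi_{r_1}$ as `basis` and $s_1$ as `bound`; for a toy check, `short_vectors([(1, 0), (0, 1)], 1)` returns the origin and the four unit vectors.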
5.4 Examples

We now briefly illustrate the use of the algorithm presented above on two concrete S-unit equations. In these examples $\Gamma_1 = \Gamma_2$ holds, and this group is an S-unit group of the underlying number field (Example 5.4.1), respectively the unit group of its ring of integers (Example 5.4.2). In our book on discriminant equations, we will meet examples where $\Gamma_1, \Gamma_2$ are distinct.

We note that in the examples below the fundamental units were computed by the KANT package; see Daberkow et al. (1997). For an alternative package, we mention MAGMA; see Bosma et al. (1997).

Example 5.4.1 (Smart (1997, 1999)). Let $K_{16}$ denote the 16th cyclotomic field generated by $\zeta$, where $\zeta$ is a primitive 16th root of unity. There is a prime ideal $\mathfrak p$ in $K_{16}$ such that $\mathfrak p^8 = (2)$. This ideal is principal and we can take $\pi = 1 - \zeta$ as a generator for $\mathfrak p$. Let $S$ denote the set of places of $K_{16}$ which consists of all infinite places and the single finite place $v = \mathfrak p$. Consider the S-unit equation
\[ x_1 + x_2 = 1 \quad\text{in S-units } x_1, x_2 \text{ of } K_{16}. \tag{5.4.1} \]
The degree and unit rank of $K_{16}$ are 8 and 3, respectively. One can take
\[ \varepsilon_1 = \zeta^2 + \zeta^4 + \zeta^6, \qquad \varepsilon_2 = -\zeta^2 + \zeta^3 + \zeta^4, \qquad \varepsilon_3 = 1 + \zeta^3 - \zeta^5 \]
as generators for the unit group of $K_{16}$. Then the solutions of (5.4.1) can be uniquely represented in the form
\[ x_1 = \zeta^{b_{1,0}}\varepsilon_1^{b_{1,1}}\varepsilon_2^{b_{1,2}}\varepsilon_3^{b_{1,3}}\pi^{b_{1,4}}, \qquad x_2 = \zeta^{b_{2,0}}\varepsilon_1^{b_{2,1}}\varepsilon_2^{b_{2,2}}\varepsilon_3^{b_{2,3}}\pi^{b_{2,4}}, \]
with rational integer exponents $b_{i,j}$. Obviously one can assume that
\[ 0 \le b_{1,0}, b_{2,0} \le 15. \tag{5.4.2} \]
Using Baker's method and reduction techniques, it was shown in Smart (1997, 1999) that
\[ \max_{i,j}|b_{i,j}| \le 1066 = B_R, \]
where $B_R$ denotes the reduced bound as defined in Section 5.2. We note that in Smart (1997, 1999) some earlier versions of Theorems 3.2.4, 3.2.7 and of the reduction algorithm were utilized. Using our versions described in Sections 5.1 and 5.2 we could get a slightly better value for $B_R$, but this is in fact irrelevant for the last part of the computations.
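Some of the data of Example 5.4.1 can be sanity-checked numerically. The sketch below is not part of the cited computations: it evaluates field norms from $K_{16}$ to $\mathbb{Q}$ in floating point by multiplying over the eight Galois conjugates $\zeta \mapsto \zeta^k$ ($k$ odd). The function name `norm_k16` and the dictionary encoding of elements (exponent of $\zeta$ mapped to its coefficient) are hypothetical, and only $\pi$, $\varepsilon_1$ and $\varepsilon_3$ are checked here.

```python
import cmath


def norm_k16(poly):
    """Numerical norm from Q(zeta_16) to Q of sum(coeff * zeta**exp).

    poly maps exponents of zeta to integer coefficients.
    """
    total = 1 + 0j
    for k in range(1, 16, 2):                  # Galois conjugates zeta -> zeta^k
        z = cmath.exp(2j * cmath.pi * k / 16)
        total *= sum(c * z ** e for e, c in poly.items())
    return total


pi_gen = {0: 1, 1: -1}        # pi = 1 - zeta
eps1 = {2: 1, 4: 1, 6: 1}     # zeta^2 + zeta^4 + zeta^6
eps3 = {0: 1, 3: 1, 5: -1}    # 1 + zeta^3 - zeta^5

for name, elt in [("pi", pi_gen), ("eps1", eps1), ("eps3", eps3)]:
    print(name, round(norm_k16(elt).real, 9))
```

A norm of absolute value 2 for $\pi$ is consistent with $\mathfrak p^8 = (2)$, and norms of absolute value 1 for $\varepsilon_1$ and $\varepsilon_3$ are consistent with their being units.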
The enumeration process was applied repeatedly with the initial values $B_0 = B_R$, $H_0 = 10^{3598}$ and with $c_1^* = 1.63189$. Then $\mathcal{S}_{B_0}(H_0)$ is just the set of solutions of (5.4.1). Smart (1999) then chose
\[ H_1 = 10^{90}, \quad H_2 = 10^{30}, \quad H_3 = 10^{15}, \quad H_4 = 10^{6}, \quad H_5 = 10^{3}. \]
After the necessary computations it turned out that the sets $T_{j,v}(B_k, H_k, H_{k+1})$ are empty both for $0 \le k \le 4$, $1 \le j \le 4$ and all infinite $v \in S$, and for $0 \le k \le 2$, $1 \le j \le 4$ and $v$ finite. For the finite $v$ and for $k = 3, 4$, $1 \le j \le 4$, the solutions in $T_{j,v}(B_k, H_k, H_{k+1})$ were determined by the Fincke–Pohst algorithm. Finally, it remained to enumerate the set $\mathcal{S}_{B_5}(H_5)$ for $B_5 = 11$, which was accomplished again by means of the Fincke–Pohst method. The equation (5.4.1) has exactly 795 solutions, each of which satisfies (5.4.2) and
\[ \max_{1\le j\le 4}\{|b_{1,j}|, |b_{2,j}|\} \le 11. \]
This result was needed in Smart (1997) for the calculation of curves of genus 2 with good reduction away from 2.

Example 5.4.2 (Wildanger (2000)). Let $K_{19}$ be the 19th cyclotomic field generated by $\zeta = \exp(2\pi i/19)$, and denote by $K_{19}^+$ the maximal real subfield of $K_{19}$. Then $K_{19}^+ = \mathbb{Q}(\theta)$, where $\theta = \zeta + \zeta^{-1}$. Consider the unit equation
\[ x_1 + x_2 = 1 \quad\text{in units } x_1, x_2 \text{ of the ring of integers of } K_{19}^+. \tag{5.4.3} \]
The number field $K_{19}^+$ is totally real, its degree is 9 and its unit rank is 8. Further,
\[ \varepsilon_1 = 1 - 4\theta - 10\theta^2 + 10\theta^3 + 15\theta^4 - 6\theta^5 - 7\theta^6 + \theta^7 + \theta^8, \]
\[ \varepsilon_2 = 3\theta - \theta^3, \qquad \varepsilon_3 = 1 - 2\theta - 3\theta^2 + \theta^3 + \theta^4, \qquad \varepsilon_4 = 2 - 9\theta^2 + 6\theta^4 - \theta^6, \]
\[ \varepsilon_5 = \theta, \qquad \varepsilon_6 = 2 - \theta^2, \qquad \varepsilon_7 = 2 - 4\theta^2 + \theta^4, \]
\[ \varepsilon_8 = -5\theta + 5\theta^2 + 10\theta^3 - 5\theta^4 - 6\theta^5 + \theta^6 + \theta^7 \]
is a system of fundamental units in $K_{19}^+$.
The solutions of (5.4.3) can be written uniquely in the form x1 = ± 8 b εj 1,j , x2 = ± j =1 8 b εj 2,j , j =1 where bi,j are rational integers. By means of Baker’s method Wildanger proved that max |bi,j | ≤ 1038 . i,j Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:29:29, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.007 5.5 Exceptional units 121 Further, using reduction techniques he showed that max |bi,j | ≤ 2076 = BR , i,j BR being the reduced bound. Finally, Wildanger’s variant of the enumeration algorithm was used repeatedly with the initial values B0 = BR , H0 = 6.9 × 104843 and then with the values H1 = 1.49 × 1030 , H2 = 3.89 × 1011 , H3 = 5.52 × 107 , H4 = 982 337.37, H5 = 73 360.74, H9 = 74.25, H6 = 9896.88, H7 = 1780.14, H8 = 365.36, H10 = 11.47. At the final enumeration all the 28 398 solutions of (5.4.3) were found. 5.5 Exceptional units The units ε of the ring of integers of a number field K such that 1 − ε is also a unit of this ring of integers are called exceptional units of K, see Nagell (1970). Nagell (1964, 1968b, 1970) determined all exceptional units in number fields of unit rank 1 and in certain number fields of unit rank 2. + has exactly 28 398 excepAs was mentioned above, the number field K19 tional units. For a positive integer m for which m ≡ 2 (mod 4), denote by Km the m-th cyclotomic field and by Km+ its maximal real subfield. Using the above method, Wildanger (2000) determined all exceptional units in the number fields Km+ for m ≤ 23. Further, by means of the next lemma he extended his result to the number fields Km as well. A number field is called a CM-field if it is a totally imaginary quadratic extension of a totally real number field. For example, the imaginary quadratic number fields and the cyclotomic fields are all CM-fields. Lemma 5.5.1 Let K be a CM-field. 
Then all non-real exceptional units in $K$ are of the form

$$\frac{1 - \zeta_2}{\zeta_1 - \zeta_2},$$

where $\zeta_1, \zeta_2$ are roots of unity in $K$.

Proof. See Győry (1971).

Denote by $S_m$ and $S_m^+$ the set of exceptional units in $K_m$ and $K_m^+$, respectively, and let $|S_m|$ and $|S_m^+|$ be their cardinalities. The following table given by Wildanger contains the values of $|S_m|$ and $|S_m^+|$ for those $m$ for which the unit rank of $K_m^+$ is at most 10.

   m    [K_m^+ : Q]    |S_m^+|    [K_m : Q]    |S_m|
   1         1              0         1             0
   3         1              0         2             2
   4         1              0         2             0
   5         2              6         4            18
   7         3             42         6            72
   8         2              0         4             0
   9         3             18         6            38
  11         5            570        10           660
  12         2              0         4            14
  13         6           1830        12          1962
  15         4             90         8           440
  16         4              0         8             0
  17         8          11700        16         11940
  19         9          28398        18         28704
  20         4             54         8           138
  21         6           1416        12          2192
  23        11         130812        22        131274
  24         4              0         8            86
  25        10          47766        20         48078
  27         9           8676        18          8858
  28         6            678        12           888
  32         8              0        16             0
  33        10          73110        20         75242
  36         6            354        12           710
  40         8           4398        16          4914
  44        10          30030        20         30660
  48         8              0        16           422
  60         8          14274        16         16340

We remark that for $m = 8, 16, 24, 32, 48$ there is a prime ideal in $O_{K_m^+}$ of norm 2. Hence these number fields $K_m^+$ cannot have exceptional units. This implies that in the solutions of the equation (5.4.1) in Example 5.4.1 one of the exponents $b_{1,4}$ and $b_{2,4}$ must be different from zero.

For each $d \in \{2, \ldots, 8\}$, $r \in \{2, \ldots, d - 1\}$, Wildanger (2000) considered the number fields of degree $d$ and unit rank $r$ having one of the five discriminants
with smallest absolute values, and computed for each of them all exceptional units. Finally we note that Wildanger’s method was implemented in KANT, see Daberkow et al. (1997).

5.6 Supplement: LLL lattice basis reduction

Let $n$ be an integer $\ge 2$. The standard inner product on $\mathbb{R}^n$ is defined by

$$\langle \mathbf{a}, \mathbf{b} \rangle = \sum_{i=1}^{n} a_i b_i \quad \text{for } \mathbf{a} = (a_1, \ldots, a_n),\ \mathbf{b} = (b_1, \ldots, b_n) \in \mathbb{R}^n.$$

We use $\|\cdot\|$ to denote the Euclidean norm on $\mathbb{R}^n$. Thus for $\mathbf{a} = (a_1, \ldots, a_n) \in \mathbb{R}^n$ we have

$$\|\mathbf{a}\| = \langle \mathbf{a}, \mathbf{a} \rangle^{1/2} = \sqrt{a_1^2 + \cdots + a_n^2}.$$

Let $L$ be a $t$-dimensional lattice in $\mathbb{R}^n$, i.e.,

$$L := \{z_1 \mathbf{a}_1 + \cdots + z_t \mathbf{a}_t : z_1, \ldots, z_t \in \mathbb{Z}\},$$

where $\mathbf{a}_1, \ldots, \mathbf{a}_t$ are linearly independent vectors in $\mathbb{R}^n$. Then the determinant $d(L)$ of $L$ is given by

$$d(L) = \Bigl(\det\bigl(\langle \mathbf{a}_i, \mathbf{a}_j \rangle\bigr)_{1 \le i,j \le t}\Bigr)^{1/2}.$$

If in particular $L$ is a full lattice in $\mathbb{R}^n$, i.e., with $t = n$, then

$$d(L) = |\det(\mathbf{a}_1, \ldots, \mathbf{a}_n)|.$$

The determinant of $L$ is independent of the choice of $\mathbf{a}_1, \ldots, \mathbf{a}_t$.

In this section, by a basis of a lattice or vector space we mean an ordered tuple of vectors $\mathbf{a}_1, \ldots, \mathbf{a}_t$ and not just a set $\{\mathbf{a}_1, \ldots, \mathbf{a}_t\}$, since the outcome of the LLL-algorithm depends on the order in which the vectors of the initial basis are inserted.

Let $\mathbf{a}_1, \ldots, \mathbf{a}_t$ be a basis of a $t$-dimensional lattice $L$ in $\mathbb{R}^n$, where $1 \le t \le n$. To define an LLL-reduced basis of $L$ we need an appropriate orthogonal basis in the subspace of $\mathbb{R}^n$ spanned by $L$. By means of the Gram–Schmidt orthogonalization process such an orthogonal basis $\mathbf{a}_1^*, \ldots, \mathbf{a}_t^*$ can be defined inductively by

$$\mathbf{a}_i^* = \mathbf{a}_i - \sum_{j=1}^{i-1} \mu_{ij} \mathbf{a}_j^*, \quad 1 \le i \le t, \tag{5.6.1}$$

where

$$\mu_{ij} = \langle \mathbf{a}_i, \mathbf{a}_j^* \rangle / \|\mathbf{a}_j^*\|^2, \quad 1 \le j < i \le t. \tag{5.6.2}$$

A. K. Lenstra, H. W. Lenstra and Lovász (1982) introduced the notion of what is nowadays called an LLL-reduced basis of a lattice. A basis $\mathbf{a}_1, \ldots, \mathbf{a}_t$ of a lattice $L$ in $\mathbb{R}^n$ is called LLL-reduced if $\mathbf{a}_1, \ldots, \mathbf{a}_t$ and the vectors $\mathbf{a}_1^*, \ldots, \mathbf{a}_t^*$ of the corresponding orthogonal basis satisfy

$$|\mu_{ij}| \le \tfrac{1}{2}, \quad 1 \le j < i \le t$$

and

$$\|\mathbf{a}_i^* + \mu_{i,i-1} \mathbf{a}_{i-1}^*\|^2 \ge \tfrac{3}{4} \|\mathbf{a}_{i-1}^*\|^2, \quad 1 < i \le t. \tag{5.6.3}$$

Clearly, (5.6.3) can be rewritten as

$$\|\mathbf{a}_i^*\|^2 \ge \bigl(\tfrac{3}{4} - \mu_{i,i-1}^2\bigr) \|\mathbf{a}_{i-1}^*\|^2.$$

Lenstra, Lenstra and Lovász proved that every lattice in $\mathbb{R}^n$ has such a basis. Further, they developed a very practical algorithm, which from any given lattice and any basis of this lattice computes an LLL-reduced basis of this lattice. (In fact, Lenstra, Lenstra and Lovász formally stated their results only for full lattices, but the generalization to arbitrary lattices is implicit in their proof; see also Pohst (1993).)

LLL-reduced bases have several useful properties. In our book the inequality (iv) below plays an important role in solving concrete Diophantine equations.

Proposition 5.6.1 Let $\mathbf{a}_1, \ldots, \mathbf{a}_t$ be an LLL-reduced basis of a lattice $L$ in $\mathbb{R}^n$ with associated orthogonal basis $\mathbf{a}_1^*, \ldots, \mathbf{a}_t^*$ defined in (5.6.1). Then we have

(i) $\|\mathbf{a}_j\|^2 \le 2^{i-1} \|\mathbf{a}_i^*\|^2$ for $1 \le j \le i \le t$;
(ii) $d(L) \le \prod_{i=1}^{t} \|\mathbf{a}_i\| \le 2^{t(t-1)/4} d(L)$;
(iii) $\|\mathbf{a}_1\| \le 2^{(t-1)/4} d(L)^{1/t}$;
(iv) $\|\mathbf{a}_1\|^2 \le 2^{t-1} \|\mathbf{x}\|^2$ for every $\mathbf{x} \in L$, $\mathbf{x} \ne \mathbf{0}$;
(v) $\|\mathbf{a}_j\|^2 \le 2^{t-1} \max\bigl(\|\mathbf{x}_1\|^2, \ldots, \|\mathbf{x}_s\|^2\bigr)$ for $1 \le j \le s$, where $1 \le s \le t$ and $\mathbf{x}_1, \ldots, \mathbf{x}_s$ are linearly independent vectors of $L$.

Proof. See Lenstra, Lenstra and Lovász (1982) for $t = n$, and Pohst (1993) in the case $t \le n$.

Following Lenstra, Lenstra and Lovász (1982) and Pohst (1993), we now briefly present the LLL-lattice basis reduction algorithm, which transforms a given basis $\mathbf{a}_1, \ldots, \mathbf{a}_t$ of a given lattice $L$ in $\mathbb{R}^n$ into an LLL-reduced one. First the constants $\mu_{ij}$ and the orthogonal basis vectors $\mathbf{a}_i^*$ are calculated using
(5.6.1) and (5.6.2). Then an LLL-reduced basis can be constructed by induction on the number of reduced basis vectors. The vectors $\mathbf{a}_1, \ldots, \mathbf{a}_t$ will be changed several times. However, the $\mathbf{a}_i^*$ and $\mu_{ij}$ will be updated at each step so that (5.6.1) and (5.6.2) remain valid.

Assume that for some $m$ with $2 \le m \le t + 1$, the vectors $\mathbf{a}_1, \ldots, \mathbf{a}_{m-1}$ are already LLL-reduced, that is, form an LLL-reduced basis of the lattice generated by them. In other words, we assume that

$$|\mu_{ij}| \le \tfrac{1}{2} \quad \text{for } 1 \le j < i < m$$

and

$$\|\mathbf{a}_i^* + \mu_{i,i-1} \mathbf{a}_{i-1}^*\|^2 \ge \tfrac{3}{4} \|\mathbf{a}_{i-1}^*\|^2 \quad \text{for } 1 < i < m.$$

These inequalities trivially hold if $m = 2$. For $m = t + 1$ the algorithm terminates because then the full basis $\mathbf{a}_1, \ldots, \mathbf{a}_t$ is reduced. Next consider the case $m \le t$. The major steps are as follows.

(a) Reduce $\mu_{m,m-1}$ to $|\mu_{m,m-1}| \le 1/2$, subtracting an appropriate multiple of $\mathbf{a}_{m-1}$ from $\mathbf{a}_m$. After these changes all the vectors $\mathbf{a}_i^*$ remain unchanged.
(b) If (5.6.3) holds for $i = m$, one can proceed to (c). Otherwise interchange $\mathbf{a}_{m-1}$ and $\mathbf{a}_m$ and, if $m > 2$, replace $m$ by $m - 1$. Then one can go on with (a).
(c) Reduce $\mu_{mj}$ as in (a) to $|\mu_{mj}| \le 1/2$ for $j = m - 2, m - 3, \ldots, 1$. Then take $m + 1$ in place of $m$. If $m > t$, the algorithm terminates, otherwise we can go on with (a).

The vectors $\mathbf{a}_i^*$ are not used explicitly in the algorithm, only the squares of their norms $A_i := \|\mathbf{a}_i^*\|^2$.

LLL-reduction algorithm (Pohst (1993)).

Input: A basis $\mathbf{a}_1, \ldots, \mathbf{a}_t$ of a $t$-dimensional lattice $L \subseteq \mathbb{R}^n$.
Output: A basis $\mathbf{a}_1, \ldots, \mathbf{a}_t$ of $L$ which is LLL-reduced.

(a) (Initialization) For $i = 1, \ldots, t$ set:

$$\mu_{ij} \leftarrow \langle \mathbf{a}_i, \mathbf{a}_j^* \rangle / A_j \quad (1 \le j \le i - 1), \qquad \mathbf{a}_i^* \leftarrow \mathbf{a}_i - \sum_{j=1}^{i-1} \mu_{ij} \mathbf{a}_j^*, \qquad A_i \leftarrow \langle \mathbf{a}_i, \mathbf{a}_i^* \rangle.$$
Then set $m \leftarrow 2$.

(b) (Set $l$). Set $l \leftarrow m - 1$.

(c) (Change $\mu_{ml}$ in the case $|\mu_{ml}| > \tfrac{1}{2}$). If $|\mu_{ml}| > \tfrac{1}{2}$, set $r$ to the closest rational integer to $\mu_{ml}$ and

$$\mathbf{a}_m \leftarrow \mathbf{a}_m - r \mathbf{a}_l, \qquad \mu_{mj} \leftarrow \mu_{mj} - r \mu_{lj} \quad (1 \le j \le l - 1), \qquad \mu_{ml} \leftarrow \mu_{ml} - r.$$

For $l = m - 1$ go to (d), else to (e).

(d) For $A_m < \bigl(\tfrac{3}{4} - \mu_{m,m-1}^2\bigr) A_{m-1}$ go to (f).

(e) (Decrease $l$). Set $l \leftarrow l - 1$. For $l > 0$ go to (c). For $m = t$ terminate; else set $m \leftarrow m + 1$ and go to (b).

(f) (Interchange $\mathbf{a}_{m-1}$, $\mathbf{a}_m$). Set

$$\mu \leftarrow \mu_{m,m-1}, \quad A \leftarrow A_m + \mu^2 A_{m-1}, \quad \mu_{m,m-1} \leftarrow \mu A_{m-1}/A, \quad A_m \leftarrow A_{m-1} A_m / A, \quad A_{m-1} \leftarrow A;$$

then set, for $1 \le j \le m - 2$ and $m + 1 \le i \le t$,

$$\begin{pmatrix} \mathbf{a}_{m-1} \\ \mathbf{a}_m \end{pmatrix} \leftarrow \begin{pmatrix} \mathbf{a}_m \\ \mathbf{a}_{m-1} \end{pmatrix}, \qquad
\begin{pmatrix} \mu_{m-1,j} \\ \mu_{mj} \end{pmatrix} \leftarrow \begin{pmatrix} \mu_{mj} \\ \mu_{m-1,j} \end{pmatrix},$$

$$\begin{pmatrix} \mu_{i,m-1} \\ \mu_{im} \end{pmatrix} \leftarrow \begin{pmatrix} 1 & \mu_{m,m-1} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & -\mu \end{pmatrix} \begin{pmatrix} \mu_{i,m-1} \\ \mu_{im} \end{pmatrix}.$$

For $m > 2$ decrease $m$ by 1. Then go to (b).

As is proved in Lenstra, Lenstra and Lovász (1982), see also Pohst (1993), the above algorithm always terminates. Further, it is shown that if $L$ is a sublattice of $\mathbb{Z}^n$ of rank $n$ with basis $\mathbf{a}_1, \ldots, \mathbf{a}_n$ with $\|\mathbf{a}_i\|^2 \le A$ for $i = 1, \ldots, n$, where $A \ge 2$, then the algorithm uses $O(n^4 \log A)$ arithmetic operations, while the integers occurring in the algorithm have binary lengths $O(n \log A)$.

For more detailed treatments of the LLL-algorithm as well as for some refinements, we refer to the books de Weger (1989), Pohst (1993), Cohen (1993), Smart (1998) and Gaál (2002).

5.7 Notes

• In the inhomogeneous version of (5.2.1), the first reduction algorithm was established in Baker and Davenport (1969). Generalizations of this algorithm to the case of several variables were given in Pethő and Schulenberg (1987) and de Weger (1989).
• We note that the enumeration algorithm presented in Section 5.3 can be made even more efficient by combining it with some sieving procedure with appropriate prime ideals; see e.g. Smart (1998).
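• The LLL-reduction algorithm of Section 5.6 can be illustrated by a short program. The Python sketch below is a simplified variant and not Pohst's formulation: it recomputes the Gram–Schmidt data (5.6.1), (5.6.2) from scratch after every change instead of updating $\mu_{ij}$ and $A_i$ incrementally, and it size-reduces $\mathbf{a}_m$ against all earlier vectors before testing (5.6.3). It uses exact rational arithmetic and assumes a linearly independent input basis. The function names (`dot`, `gram_schmidt`, `lll_reduce`, `is_lll_reduced`, `squared_determinant`) and the sample basis are chosen here for illustration.

```python
from fractions import Fraction
import math


def dot(u, v):
    """Standard inner product of two vectors."""
    return sum(x * y for x, y in zip(u, v))


def gram_schmidt(basis):
    """Return (ortho, mu): the vectors a_i^* of (5.6.1) and the mu_ij of (5.6.2)."""
    ortho, mu = [], []
    for i, a in enumerate(basis):
        v = [Fraction(x) for x in a]
        row = []
        for j in range(i):
            m_ij = dot(a, ortho[j]) / dot(ortho[j], ortho[j])
            row.append(m_ij)
            v = [x - m_ij * y for x, y in zip(v, ortho[j])]
        ortho.append(v)
        mu.append(row)
    return ortho, mu


def is_lll_reduced(basis):
    """Check |mu_ij| <= 1/2 and the rewritten form of (5.6.3)."""
    ortho, mu = gram_schmidt(basis)
    A = [dot(v, v) for v in ortho]
    size_ok = all(abs(mu[i][j]) <= Fraction(1, 2)
                  for i in range(len(basis)) for j in range(i))
    swap_ok = all(A[i] >= (Fraction(3, 4) - mu[i][i - 1] ** 2) * A[i - 1]
                  for i in range(1, len(basis)))
    return size_ok and swap_ok


def squared_determinant(basis):
    """d(L)^2, computed as the product of the squared norms A_i."""
    ortho, _ = gram_schmidt(basis)
    return math.prod(dot(v, v) for v in ortho)


def lll_reduce(basis):
    """Return an LLL-reduced basis of the lattice spanned by `basis`."""
    b = [[Fraction(x) for x in v] for v in basis]
    t = len(b)
    m = 1  # 0-based index; corresponds to m = 2 in the text
    while m < t:
        # Size-reduce b[m] against b[m-1], ..., b[0].
        for l in range(m - 1, -1, -1):
            _, mu = gram_schmidt(b)
            r = round(mu[m][l])  # a closest integer to mu_ml
            if r:
                b[m] = [x - r * y for x, y in zip(b[m], b[l])]
        ortho, mu = gram_schmidt(b)
        A = [dot(v, v) for v in ortho]
        if A[m] >= (Fraction(3, 4) - mu[m][m - 1] ** 2) * A[m - 1]:
            m += 1                       # (5.6.3) holds for this index
        else:
            b[m - 1], b[m] = b[m], b[m - 1]  # interchange and step back
            m = max(m - 1, 1)
    return b


basis = [[1, 1, 1], [-1, 0, 2], [3, 5, 6]]
reduced = lll_reduce(basis)
print([[int(x) for x in v] for v in reduced])
print(is_lll_reduced(reduced))
```

Recomputing the orthogonalization at every step costs far more than the $O(n^4 \log A)$ operations quoted above, but it keeps the sketch short and easy to compare with the definition of an LLL-reduced basis.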
• Exceptional units have several applications. An important application was given by Lenstra (1977) who showed that if a number field $K$ contains a “large” subset $\{\varepsilon_1, \ldots, \varepsilon_m\}$ of integers of $K$ such that $\varepsilon_i - \varepsilon_j$ is a unit for each $i \ne j$ then (the ring of integers in) $K$ is Euclidean (with respect to the norm). This was used by Lenstra and others, see, e.g., Lenstra (1977), Mestre (1981), Leutbecher and Martinet (1982), Leutbecher (1985), Leutbecher and Niklasch (1989), Houriet (2007) to obtain several hundreds of new examples of Euclidean number fields.
• There is also a link between exceptional units, Lenstra’s result and the dynamics of iterated polynomial mappings; see Zieve (1996).

6 Unit equations in several unknowns

https://doi.org/10.1017/CBO9781316160749.008

In the previous chapters we considered equations

$$a_1 x_1 + a_2 x_2 = 1, \tag{6.1}$$

where the unknowns $x_1, x_2$ are taken from the group of $S$-units, or more generally from a finitely generated multiplicative group in a number field. We proved effective finiteness results, which enable one to determine all solutions at least in principle. In fact, in several cases there are even practical algorithms to solve such equations. Our proofs are based on Baker-type inequalities for linear forms in ordinary or $p$-adic logarithms of algebraic numbers.

In this chapter, we consider equations

$$a_1 x_1 + \cdots + a_n x_n = 1 \tag{6.2}$$

in an arbitrary number of unknowns $x_1, \ldots, x_n$, which again may be $S$-units of a number field, or elements from a finitely generated multiplicative group. It should be noticed that equations of the type (6.2) in $n > 2$ unknowns may have infinitely many solutions. For instance, consider (6.2) with solutions taken from an infinite multiplicative group $\Gamma$, and let $(x_1, \ldots, x_n)$ be a solution of this equation, with $a_1 x_1 + \cdots + a_m x_m = 0$, say, where $2 \le m < n$. Then one obtains an infinite family of solutions by taking $(u x_1, \ldots, u x_m, x_{m+1}, \ldots, x_n)$ with $u$ an arbitrary element of $\Gamma$. To obstruct such obvious constructions of infinite families, we usually consider only non-degenerate solutions of (6.2), i.e., with

$$\sum_{i \in I} a_i x_i \ne 0 \quad \text{for each non-empty subset } I \text{ of } \{1, \ldots, n\}.$$

We mention that for equations of type (6.2) in more than two unknowns, we can prove only ineffective finiteness results, as the only available methods to deal with such equations are ineffective. On the other hand, these methods make it possible to give an explicit upper bound for the number of non-degenerate solutions of equation (6.2). The first method, which is the one followed in this book, is based on the $p$-adic Subspace Theorem of Schmidt and Schlickewei. The second method, originating from ideas in Faltings (1991) and further developed in Rémond (2002), is independent of the Subspace Theorem but uses instead Faltings’ Product Theorem. We should mention here that the second method has a wider applicability but that the first method based on the Subspace Theorem leads to smaller upper bounds for the number of non-degenerate solutions of (6.2).

We give a quick overview of the results proved in this chapter.
Our first theorem is a so-called “semi-effective” result, which is a reformulation of a result from Evertse (1984b). Let $K$ be an algebraic number field, and $S$ a finite set of places of $K$, containing the infinite places. For a vector $\mathbf{x} = (x_0, \ldots, x_n) \in O_S^{n+1}$, define

$$H_S(x_0, \ldots, x_n) := \prod_{v \in S} \max_i |x_i|_v, \qquad N_S(x_0 \cdots x_n) := \prod_{v \in S} |x_0 \cdots x_n|_v.$$

Then for every $\epsilon > 0$, and every $\mathbf{x} \in O_S^{n+1}$ with $x_0 + \cdots + x_n = 0$ and

$$\sum_{i \in I} x_i \ne 0 \quad \text{for each proper, non-empty subset } I \text{ of } \{0, \ldots, n\},$$

we have

$$H_S(x_0, \ldots, x_n) \ll_{K,S,n,\epsilon} N_S(x_0 \cdots x_n)^{1+\epsilon},$$

where the implied constant depends only on $K, S, n, \epsilon$. This constant is not effectively computable from our method of proof. We deduce this result from Theorem 3.1.3 (the $p$-adic Subspace Theorem). A consequence of this result is that equation (6.2) has only finitely many non-degenerate solutions in $S$-units $x_1, \ldots, x_n$.

More generally, we consider equation (6.2) as an equation with unknowns from a multiplicative group $\Gamma$ of finite rank, contained in any field $K$ of characteristic 0. Taking as starting point Theorem 3.1.6 (a quantitative version of the Parametric Subspace Theorem), we prove a result from Evertse, Schlickewei and Schmidt (2002), stating that equation (6.2) has only finitely many non-degenerate solutions in $x_1, \ldots, x_n \in \Gamma$, whose number is bounded above by $C(n)^{r+1}$, where $r = \operatorname{rank} \Gamma$, and $C(n)$ is an effectively computable number depending only on $n$.

Next, we consider again equation (6.1), in unknowns $x_1, x_2 \in \Gamma$. We have included a proof of the result of Beukers and Schlickewei (1996), implying that for every pair of non-zero coefficients $a_1, a_2$, equation (6.1) has at most $C^{r+1}$ solutions, where $r = \operatorname{rank} \Gamma$, and $C$ is an effectively computable constant. We
should mention here that the approach of Evertse, Schlickewei and Schmidt (2002) gives a similar result, but with a much larger constant $C$. Further, we prove a result from Evertse, Győry, Stewart and Tijdeman (1988a), which states in a precise way that for most pairs $(a_1, a_2)$, equation (6.1) has at most two solutions.

We finish with some results concerning lower bounds for the number of solutions of (6.1) and (6.2). In particular, we have included a result by Konyagin and Soundararajan (2007) which implies that for every $\beta < 2 - \sqrt{2}$ there are groups $\Gamma$ of arbitrarily large rank $r$ and $a_1, a_2$ such that (6.1) has at least $\exp(r^{\beta})$ solutions.

In Section 6.7, the Notes of this chapter, we give an overview of some historical developments, and some related results.

The results presented in this chapter have applications in Chapter 9 to decomposable form equations. Further, they will be applied in our book on discriminant equations.

6.1 Results

6.1.1 A semi-effective result

Let $K$ be an algebraic number field, and $S$ a finite set of places of $K$ containing all infinite places. We define the $S$-height of $\mathbf{x} = (x_0, \ldots, x_n) \in O_S^{n+1}$ by

$$H_S(\mathbf{x}) = H_S(x_0, \ldots, x_n) := \prod_{v \in S} \max(|x_0|_v, \ldots, |x_n|_v),$$

where the absolute value $|\cdot|_v$ is normalized as in Section 1.7. Recall that the $S$-norm of $a \in O_S$ is defined by

$$N_S(a) := \prod_{v \in S} |a|_v.$$

Our first result is as follows.

Theorem 6.1.1 Let $\epsilon > 0$, $n \ge 1$. There is a constant $C^{\mathrm{ineff}}(K, S, n, \epsilon)$ depending only on $K, S, n, \epsilon$ for which the following holds. For any non-zero $x_0, x_1, \ldots, x_n \in O_S$ with

$$x_0 + x_1 + \cdots + x_n = 0, \qquad \sum_{i \in I} x_i \ne 0 \quad \text{for each proper, non-empty subset } I \text{ of } \{0, \ldots, n\}$$
we have

$$H_S(x_0, x_1, \ldots, x_n) \le C^{\mathrm{ineff}}(K, S, n, \epsilon)\, N_S(x_0 \cdots x_n)^{1+\epsilon}. \tag{6.1.1}$$

This is in fact an equivalent formulation to Evertse (1984b), theorem 1. We have indicated by means of the superscript “ineff” that the constant $C^{\mathrm{ineff}}$ is not effectively computable by means of our method of proof. We view Theorem 6.1.1 as a “semi-effective result”, since it is effective in terms of $N_S(x_0 \cdots x_n)$, but ineffective in terms of $n, K, S, \epsilon$.

From Theorem 6.1.1 we deduce a finiteness result on the equation

$$a_1 x_1 + \cdots + a_n x_n = 1 \quad \text{in } x_1, \ldots, x_n \in O_S^*, \tag{6.1.2}$$

where $n \ge 2$ and $a_1, \ldots, a_n$ are non-zero elements of $K$. Recall that a solution $(x_1, \ldots, x_n)$ of (6.1.2) is called non-degenerate if

$$\sum_{i \in I} a_i x_i \ne 0 \quad \text{for each non-empty subset } I \text{ of } \{1, \ldots, n\}$$

and degenerate otherwise. Theorem 6.1.1 implies the following finiteness result.

Corollary 6.1.2 Equation (6.1.2) has only finitely many non-degenerate solutions in $x_1, \ldots, x_n \in O_S^*$.

This result was proved independently in Evertse (1984b) and van der Poorten and Schlickewei (1982). It was announced in van der Poorten and Schlickewei (1982) and then proved in Evertse and Győry (1988b) and later in van der Poorten and Schlickewei (1991) that Corollary 6.1.2 is valid in the more general situation as well when, in (6.1.2), $K$ is any field of characteristic 0, and $O_S^*$ is replaced by any finitely generated multiplicative subgroup $\Gamma$ of $K^*$. Further, in Evertse and Győry (1988b) it was shown that the number of non-degenerate solutions can be estimated from above by a number depending only on $n$ and $\Gamma$, but with the method of proof in that paper it is not possible to effectively compute this number.

In Section 6.2 we deduce Theorem 6.1.1 from the $p$-adic Subspace Theorem and then deduce from this Corollary 6.1.2. Here we follow Evertse (1984b).
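To make the quantities $H_S$ and $N_S$ concrete, the following Python sketch treats the simplest case $K = \mathbb{Q}$, $S = \{\infty, 2, 3\}$ and the equation $x_1 + x_2 = 1$ in $S$-units, that is, (6.1.2) with $n = 2$ and $a_1 = a_2 = 1$. It searches a box of exponents by brute force and evaluates the $S$-height of $(-1, x_1, x_2)$ and the $S$-norm of the product for every solution found. The choice of $S$, the exponent bound `bound=6` and all function names are assumptions made for this illustration; the search is not the method of proof used in this chapter.

```python
from fractions import Fraction
from itertools import product


def padic_abs(x, p):
    """Normalized p-adic absolute value on Q (|p|_p = 1/p), for x != 0."""
    x = Fraction(x)
    num, den, e = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        e += 1
    while den % p == 0:
        den //= p
        e -= 1
    return Fraction(1, p) ** e


def is_s_unit(x, primes):
    """True if the rational x is +-(a product of powers of the given primes)."""
    x = Fraction(x)
    if x == 0:
        return False
    num, den = abs(x.numerator), x.denominator
    for p in primes:
        while num % p == 0:
            num //= p
        while den % p == 0:
            den //= p
    return num == 1 and den == 1


def s_height(xs, primes):
    """H_S(x): product over v in S = {infinity} + primes of max_i |x_i|_v."""
    h = max(abs(Fraction(x)) for x in xs)
    for p in primes:
        h *= max(padic_abs(x, p) for x in xs)
    return h


def s_norm(x, primes):
    """N_S(x): product over v in S of |x|_v."""
    n = abs(Fraction(x))
    for p in primes:
        n *= padic_abs(x, p)
    return n


def solutions_in_box(primes, bound):
    """Pairs (x1, x2) of S-units with x1 + x2 = 1 and exponents of x1 in [-bound, bound]."""
    found = []
    for exps in product(range(-bound, bound + 1), repeat=len(primes)):
        magnitude = Fraction(1)
        for p, e in zip(primes, exps):
            magnitude *= Fraction(p) ** e
        for x1 in (magnitude, -magnitude):
            x2 = 1 - x1
            if is_s_unit(x2, primes):
                found.append((x1, x2))
    return found


primes = [2, 3]
sols = solutions_in_box(primes, bound=6)
heights = [s_height((-1, x1, x2), primes) for x1, x2 in sols]
print(len(sols), max(heights))
```

Since $-x_1 x_2$ is an $S$-unit, its $S$-norm equals 1 for every solution, so (6.1.1) bounds $H_S(-1, x_1, x_2)$ by a constant; this is the mechanism behind Corollary 6.1.2.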
6.1.2 Upper bounds for the number of solutions

In this subsection we consider a generalization of (6.1.2) and give an upper bound for the number of its solutions. We say that a multiplicatively written abelian group $\Gamma$ is of finite rank $r$ if $\Gamma$ has a free subgroup $\Gamma_0$ of rank $r$ such that for every $x \in \Gamma$ there is a positive integer $m$ such that $x^m \in \Gamma_0$. We say that $\Gamma$ is of rank 0 if every element of $\Gamma$ has finite order.

Let now $K$ be any field of characteristic 0, let $n \ge 2$, and denote by $(K^*)^n$ the $n$-fold direct product of the multiplicative group $K^*$ of $K$, endowed with coordinatewise multiplication $(x_1, \ldots, x_n)(y_1, \ldots, y_n) = (x_1 y_1, \ldots, x_n y_n)$ and exponentiation $(x_1, \ldots, x_n)^m = (x_1^m, \ldots, x_n^m)$.

The following result was established in Evertse, Schlickewei and Schmidt (2002).

Theorem 6.1.3 Let $K$ be a field of characteristic 0, let $n \ge 2$, let $a_1, \ldots, a_n \in K^*$ and let $\Gamma$ be a subgroup of $(K^*)^n$ of finite rank $r$. Then the number of non-degenerate solutions of

$$a_1 x_1 + \cdots + a_n x_n = 1 \quad \text{in } (x_1, \ldots, x_n) \in \Gamma \tag{6.1.3}$$

can be estimated from above by a quantity $A(n, r)$ depending on $n$ and $r$ only. For $A(n, r)$ one may take $\exp\bigl((6n)^{3n}(r + 1)\bigr)$.

The main ingredients of the proof of this result are a specialization argument, to make a reduction to the case that $K$ is a number field and $\Gamma$ is finitely generated, a version of the Quantitative Subspace Theorem (Evertse and Schlickewei (2002)) and an estimate of Schmidt (1996) for the number of points of very small height on an algebraic subvariety of a linear torus. This estimate of Schmidt was recently improved substantially by Amoroso and Viada (2009).
By going through the proof of Evertse, Schlickewei and Schmidt, but replacing Schmidt’s estimate by their own, Amoroso and Viada obtained a stronger version of the above Theorem 6.1.3 with

$$A(n, r) = (8n)^{4n^4(n+r+1)}. \tag{6.1.4}$$

We note that by a different approach, based on Faltings’ Product Theorem instead of the Subspace Theorem, Rémond (2002) proved a general quantitative result for subvarieties of tori (see Section 10.10), which gives as a special case that equation (6.1.3) has at most $\exp\bigl(n^{4n^2}(r + 1)\bigr)$ non-degenerate solutions.

If $n = 2$, then every solution is non-degenerate. In that case we have the following sharper result, which was proved by Beukers and Schlickewei (1996).

Theorem 6.1.4 Let $K$ be a field of characteristic 0 and $\Gamma$ a subgroup of $K^* \times K^*$ of finite rank $r$. Then the equation

$$x_1 + x_2 = 1 \quad \text{in } (x_1, x_2) \in \Gamma \tag{6.1.5}$$

has at most $2^{8(r+1)}$ solutions.

We immediately obtain the following corollary.

Corollary 6.1.5 Let $K$, $\Gamma$ be as in Theorem 6.1.4 and $a_1, a_2 \in K^*$. Then the equation

$$a_1 x_1 + a_2 x_2 = 1 \quad \text{in } (x_1, x_2) \in \Gamma \tag{6.1.6}$$

has at most $2^{8(r+2)}$ solutions.

Proof. Apply Theorem 6.1.4 with, instead of $\Gamma$, the group generated by $\Gamma$ and $(a_1, a_2)$.

In most cases, the bound $2^{8(r+2)}$ in Corollary 6.1.5 can be improved. Let $K$, $\Gamma$ be as in this corollary. Two pairs $(a_1, a_2), (b_1, b_2) \in K^* \times K^*$ are called $\Gamma$-equivalent if there is $(u_1, u_2) \in \Gamma$ such that $(b_1, b_2) = (a_1, a_2)(u_1, u_2)$. Obviously, the number of solutions of (6.1.6) does not change if $(a_1, a_2)$ is replaced by a $\Gamma$-equivalent pair. Then we have the following result, which was proved by Evertse, Győry, Stewart and Tijdeman (1988a).

Theorem 6.1.6 Let $K$, $\Gamma$ be as in Theorem 6.1.4.
There is a collection of at most finitely many $\Gamma$-equivalence classes of pairs in $K^* \times K^*$, such that for every pair $(a_1, a_2) \in K^* \times K^*$ outside the union of these classes, equation (6.1.6) has at most two solutions. The number of these $\Gamma$-equivalence classes is bounded above by a function $B(r)$ depending on the rank $r$ of $\Gamma$ only.

In fact, the method of proof gives

$$B(r) = 12A(5, 2r) + 24A(3, 2r) + 60A(2, 2r)^2, \tag{6.1.7}$$

where $A(n, r)$ is any upper bound depending only on $n$ and $r$ for the number of non-degenerate solutions of (6.1.3). By using (6.1.4) we obtain $B(r) = e^{20000(r+3)}$. For earlier bounds for $B(r)$, see Győry (1992b) and Bérczes (2000).

The bound 2 in Theorem 6.1.6 is optimal. For suppose that the set $\Gamma^+ := \{(u_1, u_2) \in \Gamma : u_1 \ne u_2\}$ is infinite. Then for any $(u_1, u_2) \in \Gamma^+$, the equation

$$\frac{1 - u_2}{u_1 - u_2}\, x_1 + \frac{u_1 - 1}{u_1 - u_2}\, x_2 = 1$$

has two solutions in $\Gamma$, namely $(1, 1)$ and $(u_1, u_2)$. But, by Corollary 6.1.5, the $\Gamma$-equivalence class of such an equation can have only finitely many equations with solutions $(1, 1)$. Hence if $(u_1, u_2)$ runs through $\Gamma^+$, then $\bigl(\frac{1 - u_2}{u_1 - u_2}, \frac{u_1 - 1}{u_1 - u_2}\bigr)$ runs through infinitely many $\Gamma$-equivalence classes.

In Section 6.3 we sketch a proof of the following: equation (6.1.3) has at most $c(n)^{r+1}$ non-degenerate solutions, where $c(n)$ is an effectively computable constant depending only on $n$. For a detailed proof of Theorem 6.1.3 we refer to Evertse, Schlickewei and Schmidt (2002) (see also Rémond (2002)). Theorem 6.1.4 is proved in Section 6.4. In Section 6.5 we deduce Theorem 6.1.6 from Theorem 6.1.3. Here we follow the proof of Evertse, Győry, Stewart and Tijdeman (1988a) and Bérczes (2000).
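The step from (6.1.4) and (6.1.7) to $B(r) = e^{20000(r+3)}$ is pure arithmetic and can be checked on a logarithmic scale. The Python sketch below is only an illustration, with function names chosen here: it evaluates the natural logarithms of the three bounds for $A(n, r)$ quoted in this subsection and of the right-hand side of (6.1.7) with $A(n, r)$ taken from (6.1.4).

```python
import math


def log_A_ess(n, r):
    """ln of exp((6n)^(3n) (r+1)), the bound stated in Theorem 6.1.3."""
    return (6 * n) ** (3 * n) * (r + 1)


def log_A_av(n, r):
    """ln of (8n)^(4 n^4 (n+r+1)), the bound (6.1.4)."""
    return 4 * n ** 4 * (n + r + 1) * math.log(8 * n)


def log_A_remond(n, r):
    """ln of exp(n^(4 n^2) (r+1)), the bound quoted from Remond (2002)."""
    return n ** (4 * n * n) * (r + 1)


def log_sum(log_terms):
    """ln of a sum of exponentials, computed without overflow."""
    top = max(log_terms)
    return top + math.log(sum(math.exp(t - top) for t in log_terms))


def log_B(r, log_A=log_A_av):
    """ln of 12 A(5,2r) + 24 A(3,2r) + 60 A(2,2r)^2, as in (6.1.7)."""
    return log_sum([
        math.log(12) + log_A(5, 2 * r),
        math.log(24) + log_A(3, 2 * r),
        math.log(60) + 2 * log_A(2, 2 * r),
    ])


for r in (0, 1, 5):
    print(r, round(log_B(r)), 20000 * (r + 3))
```

The dominant term is $12A(5, 2r)$, whose logarithm is $\ln 12 + 5000(r+3)\ln 40$, roughly $18444(r+3)$; this is why the round exponent $20000(r+3)$ suffices.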
For further historical comments related to the theorems in this subsection, we refer to Section 6.7.

6.1.3 Lower bounds

Erdős, Stewart and Tijdeman were the first to consider lower bounds for the number of solutions of $S$-unit equations. For a set of distinct primes $S = \{p_1, \ldots, p_t\}$, denote by $N(S)$ the number of solutions of

$$x_1 + x_2 = 1 \quad \text{in } x_1, x_2 \in \bigl\{\pm p_1^{z_1} \cdots p_t^{z_t} : z_1, \ldots, z_t \in \mathbb{Z}\bigr\}. \tag{6.1.8}$$

Then, in Erdős, Stewart and Tijdeman (1988), it was shown that for every $\epsilon > 0$ and every sufficiently large $t$, there is a set of primes $S$ of cardinality $t$ such that

$$N(S) \ge \exp\bigl((4 - \epsilon)(t/\log t)^{1/2}\bigr).$$

Recall that Theorem 6.1.4 implies $N(S) \le C^{t+1}$ with $C$ a constant $> 1$. Stewart conjectured that there are absolute constants $c_1, c_2 > 1$, such that for every $t > 0$ and every set of primes $S$ of cardinality $t$ we have $N(S) \le c_1^{t^{2/3}}$, while conversely, for arbitrarily large $t$ there is a set of primes $S$ of cardinality $t$ such that $N(S) \ge c_2^{t^{2/3}}$.

Konyagin and Soundararajan (2007) obtained the following result, which is a small further step towards Stewart’s conjecture.

Theorem 6.1.7 For every $\beta < 2 - \sqrt{2} = 0.586\ldots$, there are sets of primes $S$ of arbitrarily large cardinality $t$, such that

$$N(S) \ge \exp(t^{\beta}).$$

In Section 6.6 we have included the ingenious proof of Konyagin and Soundararajan.

In their paper mentioned above, Konyagin and Soundararajan proved also that for every sufficiently large $t$ there are distinct primes $p_1, \ldots, p_t$ such that the equation

$$x - y = 1$$

has at least $\exp(t^{1/16})$ solutions in positive integers $x, y$ composed of $p_1, \ldots, p_t$. We omit the proof of this result, which is based on much deeper analytic number theory.

There are also results on lower bounds for the number of solutions of $S$-unit equations in an arbitrary number of unknowns. Let again $S = \{p_1, \ldots, p_t\}$
be a finite set of primes, let $n \ge 2$, and denote by $N(n, S)$ the number of non-degenerate solutions to

$$x_1 + \cdots + x_n = 1 \quad \text{in } x_1, \ldots, x_n \in \bigl\{\pm p_1^{z_1} \cdots p_t^{z_t} : z_1, \ldots, z_t \in \mathbb{Z}\bigr\}. \tag{6.1.9}$$

In Evertse, Moree, Stewart and Tijdeman (2003), the authors proved that for every $n \ge 2$, $\epsilon > 0$, and every sufficiently large $t$ there is a set of primes $S$ of cardinality $t$ such that

$$N(n, S) \ge \exp\Bigl((1 - \epsilon)\, \frac{n^2}{n - 1}\, t^{1 - 1/n} (\log t)^{-1/n}\Bigr).$$

This is a slight improvement of an unpublished result by Granville. It would be of interest to improve this further, by extending the approach of Konyagin and Soundararajan.

We introduce another quantity, which more or less measures how much algebraic structure there is in the set of solutions of (6.1.9). Denote by $g(n, S)$ the smallest integer $g$ such that there is a non-zero polynomial $P \in \mathbb{C}[X_1, \ldots, X_n]$ of total degree $g$, not divisible by $X_1 + \cdots + X_n - 1$, such that

$$P(x_1, \ldots, x_n) = 0 \quad \text{for every solution } (x_1, \ldots, x_n) \text{ of (6.1.9)}.$$

In other words, the set of solutions of (6.1.9) cannot be contained in a hypersurface of $\mathbb{C}^n$ of degree $< g(n, S)$. It is not hard to show that

$$g(n, S) \le 2^{n-1} - n + (n - 1) N(n, S)^{1/(n-1)}. \tag{6.1.10}$$

Indeed, let $N := N(n, S)$ and let $g$ be the smallest integer such that $\binom{n+g-1}{n-1} > N$. Then there is a non-zero polynomial $P_1 \in \mathbb{C}[X_1, \ldots, X_{n-1}]$ of total degree $\le g$ such that $P_1(x_1, \ldots, x_{n-1}) = 0$ for each non-degenerate solution $(x_1, \ldots, x_n)$ of (6.1.9). This can be seen by viewing the relations $P_1(x_1, \ldots, x_{n-1}) = 0$ as linear equations in the coefficients of $P_1$. Thus, we obtain a system of $N$ linear equations in $\binom{n+g-1}{n-1}$ unknowns and by our choice of $g$ it has a non-trivial solution. Our choice of $g$ implies

$$\Bigl(\frac{g}{n - 1}\Bigr)^{n-1} \le \binom{n+g-2}{n-1} \le N,$$

hence $g \le (n - 1) N^{1/(n-1)}$.
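The choice of $g$ in the argument above, and the inequality it yields, can be checked mechanically. The following Python sketch is an illustration only: it computes the smallest $g$ with $\binom{n+g-1}{n-1} > N$ for sample values of $n$ and $N$ (hypothetical inputs, not values of $N(n, S)$ taken from the literature) and verifies both inequalities in exact integer arithmetic. The function names are chosen here.

```python
from math import comb


def smallest_g(n, N):
    """Smallest integer g >= 0 with C(n+g-1, n-1) > N, for n >= 2 and N >= 1."""
    g = 0
    while comb(n + g - 1, n - 1) <= N:
        g += 1
    return g


def check_choice_of_g(n, N):
    """Verify (g/(n-1))^(n-1) <= C(n+g-2, n-1) <= N for the smallest g."""
    g = smallest_g(n, N)
    middle = comb(n + g - 2, n - 1)
    # Clearing denominators keeps the comparison exact: g^(n-1) <= middle * (n-1)^(n-1).
    lower_ok = g ** (n - 1) <= middle * (n - 1) ** (n - 1)
    upper_ok = middle <= N
    # g <= (n-1) * N^(1/(n-1)) is equivalent to g^(n-1) <= (n-1)^(n-1) * N.
    bound_ok = g ** (n - 1) <= (n - 1) ** (n - 1) * N
    return g, lower_ok and upper_ok and bound_ok


for n, N in [(2, 21), (3, 100), (4, 1000)]:
    print(n, N, check_choice_of_g(n, N))
```

The lower inequality holds because each factor $(g - 1 + i)/i$ of the binomial coefficient, for $1 \le i \le n - 1$, is at least $g/(n - 1)$.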
Now let $P$ be the product of $P_1$ and of all polynomials $\sum_{i \in I} X_i$ with $I$ a subset of $\{1, \ldots, n - 1\}$ of cardinality at least 2. Then $P$ has total degree

$$g + 2^{n-1} - n \le 2^{n-1} - n + (n - 1) N^{1/(n-1)},$$

$P$ is not divisible by $X_1 + \cdots + X_n - 1$ since it depends only on $X_1, \ldots, X_{n-1}$, and every solution of (6.1.9), degenerate or non-degenerate, is a zero of $P$. This implies (6.1.10).

Again in Evertse, Moree, Stewart and Tijdeman (2003), it was shown that for every $n \ge 2$, $\epsilon > 0$ and every sufficiently large $t$ there is a set of primes $S$ of cardinality $t$ such that

$$g(n, S) \ge \exp\bigl((4 - \epsilon)(t/\log t)^{1/2}\bigr).$$

Using the above theorem of Konyagin and Soundararajan we improve this as follows.

Theorem 6.1.8 For every $n \ge 2$, $\beta < 2 - \sqrt{2}$, there are sets of primes $S$ of arbitrarily large cardinality $t$, such that

$$g(n, S) \ge \exp(t^{\beta}).$$

The proof of this result is given in Section 6.6.

6.2 Proofs of Theorem 6.1.1 and Corollary 6.1.2

We take as starting point Theorem 3.1.3. As before, $K$ is an algebraic number field, $S$ a finite set of places of $K$ containing all infinite places and $n$ an integer with $n \ge 2$. We prove the following result, which is in fact equivalent to Theorem 6.1.1.

Proposition 6.2.1 Let $T$ be a subset of $S$ and $\epsilon > 0$. There is a constant $C^{\mathrm{ineff}}(K, S, n, \epsilon) > 0$ such that for all vectors $\mathbf{x} = (x_1, \ldots, x_n) \in O_S^n$ satisfying

$$\sum_{i \in I} x_i \ne 0 \quad \text{for each non-empty subset } I \text{ of } \{1, \ldots, n\} \tag{6.2.1}$$

we have

$$\prod_{v \in S} \prod_{i=1}^{n} |x_i|_v \prod_{v \in T} |x_1 + \cdots + x_n|_v \ge C^{\mathrm{ineff}}(K, S, n, \epsilon) \Bigl(\prod_{v \in T} \max(|x_1|_v, \ldots, |x_n|_v)\Bigr) H_S(\mathbf{x})^{-\epsilon}. \tag{6.2.2}$$

Since $|x_1 + \cdots + x_n|_v \ll \max(|x_1|_v, \ldots, |x_n|_v)$ for $v \in M_K$, the special case of Proposition 6.2.1 with $T = S$ implies the general case of arbitrary subsets $T$ of $S$.
On the other hand, Proposition 6.2.1 with $T = S$ is a reformulation of Theorem 6.1.1. Indeed, writing $x_0 := -(x_1 + \cdots + x_n)$, we see that (6.2.2) with $T = S$ can be rewritten as

$$N_S(x_0 \cdots x_n) \gg H_S(x_0, \ldots, x_n)^{1-\epsilon}$$

(with implied constant depending on $K, S, n, \epsilon$), which in turn is equivalent to inequality (6.1.1) in Theorem 6.1.1.

A weaker version of Proposition 6.2.1 was proved earlier by van der Poorten and Schlickewei (unpublished).

The proof of Proposition 6.2.1 is by induction on $n$. For $n = 1$ the assertion is trivially true. Assume Proposition 6.2.1 is true for vectors with fewer than $n$ coordinates, where $n \ge 2$. We proceed to prove (6.2.2) under this induction hypothesis.

Henceforth we restrict ourselves to vectors $\mathbf{x} = (x_1, \ldots, x_n) \in O_S^n$ satisfying (6.2.1) and

$$\prod_{v \in S} \prod_{i=1}^{n} |x_i|_v \prod_{v \in T} |x_1 + \cdots + x_n|_v \le \Bigl(\prod_{v \in T} \max(|x_1|_v, \ldots, |x_n|_v)\Bigr) H_S(\mathbf{x})^{-\epsilon}. \tag{6.2.3}$$

This is obviously no loss of generality. We start with a lemma.

Lemma 6.2.2 The set of solutions $\mathbf{x} \in O_S^n \setminus \{\mathbf{0}\}$ of (6.2.3) is contained in a union of finitely many proper linear subspaces of $K^n$.

Proof. Let $\mathbf{x} = (x_1, \ldots, x_n)$ be a solution of (6.2.3). Define the linear form $X_0 := -(X_1 + \cdots + X_n)$ and put $x_0 := -(x_1 + \cdots + x_n)$. For $v \in S \setminus T$, let $L_{1v} = X_1, \ldots, L_{nv} = X_n$. For $v \in T$, let $i_v \in \{0, \ldots, n\}$ with $|x_{i_v}|_v = \max(|x_0|_v, \ldots, |x_n|_v)$, and let $L_{1v}, \ldots, L_{nv}$ be the linear forms from $\{X_0, \ldots, X_n\} \setminus \{X_{i_v}\}$ in some order. Then $|x_{i_v}|_v \gg \max(|x_1|_v, \ldots, |x_n|_v)$ for $v \in T$, so (6.2.3) implies

$$\prod_{v \in S} |L_{1v}(\mathbf{x}) \cdots L_{nv}(\mathbf{x})|_v \ll H_S(\mathbf{x})^{-\epsilon}.$$

By Theorem 3.1.3, the solutions $\mathbf{x} \in O_S^n \setminus \{\mathbf{0}\}$ of the latter inequality, and hence the solutions of (6.2.3) with $|x_{i_v}|_v = \max(|x_0|_v, \ldots, |x_n|_v)$ for $v \in T$, lie in a union of finitely many proper linear subspaces of $K^n$. By applying this to all tuples $(i_v : v \in T)$, Lemma 6.2.2 follows.

Let $\mathcal{T}_1, \ldots, \mathcal{T}_t$ be the subspaces from Lemma 6.2.2 and let $\mathcal{T} \in \{\mathcal{T}_1, \ldots, \mathcal{T}_t\}$. Then $\mathcal{T}$ can be given by an equation

$$x_1 + \cdots + x_n = \beta_{i_1} x_{i_1} + \cdots + \beta_{i_m} x_{i_m}, \tag{6.2.4}$$

where $\beta_{i_1}, \ldots, \beta_{i_m} \in K$ and $m < n$. Let $\mathbf{x} \in O_S^n$ be a vector with (6.2.1), (6.2.3) and with $\mathbf{x} \in \mathcal{T}$. So $\mathbf{x}$ satisfies (6.2.4). Let $J$ be a minimal subset of $\{i_1, \ldots, i_m\}$ such that $\sum_{j \in J} \beta_j x_j = 0$. By re-indexing, we may assume that

$$\left.\begin{aligned}
& x_1 + \cdots + x_n = \sum_{i=1}^{u} \beta_i x_i, \quad u < n, \\
& \sum_{i \in I} \beta_i x_i \ne 0 \quad \text{for each non-empty subset } I \text{ of } \{1, \ldots, u\}.
\end{aligned}\right\} \tag{6.2.5}$$

We now consider vectors $\mathbf{x} \in O_S^n$ with (6.2.1), (6.2.3), (6.2.5) and show that these satisfy (6.2.2) with an appropriate constant $C^{\mathrm{ineff}}$. Below, constants implied by the Vinogradov symbols $\ll$, $\gg$ will be ineffective, and will depend only on $K, S, n, \epsilon, \beta_1, \ldots, \beta_u$. But notice that $\beta_1, \ldots, \beta_u$ in turn depend only on $K, S, n, \epsilon$, as they are coming from Lemma 6.2.2. So the constants implied by $\ll$, $\gg$ ultimately depend only on $K, S, n, \epsilon$.

Choose $\delta \in O_S \setminus \{0\}$ such that $\delta \beta_i \in O_S$ for $i = 1, \ldots, u$, and for a solution $\mathbf{x} \in O_S^n$ of (6.2.1), (6.2.3) and (6.2.5), write

$$z_i := \delta \beta_i x_i \quad (i = 1, \ldots, u), \qquad \mathbf{z} = (z_1, \ldots, z_u).$$

Let $\mathbf{x} = (x_1, \ldots, x_n) \in O_S^n$ be a vector with (6.2.1), (6.2.3) and (6.2.5). Then $|x_i|_v \gg |z_i|_v$ for $v \in S$, $i = 1, \ldots, u$, and $|x_1 + \cdots + x_n|_v \gg |z_1 + \cdots + z_u|_v$ for $v \in T$. Hence

$$\prod_{v \in S} \prod_{i=1}^{n} |x_i|_v \prod_{v \in T} |x_1 + \cdots + x_n|_v \gg \prod_{v \in S} \prod_{i=u+1}^{n} |x_i|_v \cdot \prod_{v \in S} \prod_{i=1}^{u} |z_i|_v \prod_{v \in T} |z_1 + \cdots + z_u|_v.$$

Now a first application of the induction hypothesis gives

$$\begin{aligned}
\prod_{v \in S} \prod_{i=1}^{n} |x_i|_v \prod_{v \in T} |x_1 + \cdots + x_n|_v
&\gg \prod_{v \in S} \prod_{i=u+1}^{n} |x_i|_v \cdot \Bigl(\prod_{v \in T} \max(|z_1|_v, \ldots, |z_u|_v)\Bigr) H_S(\mathbf{z})^{-\epsilon/2} \\
&\gg \prod_{v \in S} \prod_{i=u+1}^{n} |x_i|_v \cdot \Bigl(\prod_{v \in T} \max(|x_1|_v, \ldots, |x_u|_v)\Bigr) H_S(\mathbf{x})^{-\epsilon/2}.
\end{aligned}$$

Let

$$T_1 = \{v \in T : \max(|x_1|_v, \ldots, |x_n|_v) = \max(|x_1|_v, \ldots, |x_u|_v)\}, \qquad T_2 = T \setminus T_1.$$

If $T_1 = T$, we obtain at once (6.2.2), since $\prod_{v \in S} \prod_{i=u+1}^{n} |x_i|_v \ge 1$. Suppose $T_1 \subsetneq T$. Then

$$\max(|x_1|_v, \ldots, |x_u|_v) = \max(|x_1|_v, \ldots, |x_n|_v) \quad \text{for } v \in T_1,$$

$$\max(|x_1|_v, \ldots, |x_u|_v) \gg |(\beta_1 - 1) x_1 + \cdots + (\beta_u - 1) x_u|_v = |x_{u+1} + \cdots + x_n|_v \quad \text{for } v \in T_2.$$

Now a second application of the induction hypothesis yields

$$\begin{aligned}
\prod_{v \in S} \prod_{i=1}^{n} |x_i|_v \prod_{v \in T} |x_1 + \cdots + x_n|_v
&\gg \Bigl(\prod_{v \in S} \prod_{i=u+1}^{n} |x_i|_v \prod_{v \in T_2} |x_{u+1} + \cdots + x_n|_v\Bigr) \times \Bigl(\prod_{v \in T_1} \max(|x_1|_v, \ldots, |x_n|_v)\Bigr) H_S(\mathbf{x})^{-\epsilon/2} \\
&\gg \Bigl(\prod_{v \in T_2} \max(|x_{u+1}|_v, \ldots, |x_n|_v)\Bigr) H_S(x_{u+1}, \ldots, x_n)^{-\epsilon/2} \times \Bigl(\prod_{v \in T_1} \max(|x_1|_v, \ldots, |x_n|_v)\Bigr) H_S(\mathbf{x})^{-\epsilon/2} \\
&\gg \Bigl(\prod_{v \in T} \max(|x_1|_v, \ldots, |x_n|_v)\Bigr) H_S(\mathbf{x})^{-\epsilon},
\end{aligned}$$

as required. This proves Proposition 6.2.1, hence Theorem 6.1.1.

Proof of Corollary 6.1.2. Let $T$ be the smallest set of places such that $S \subseteq T$ and $a_1, \ldots, a_n$ are $T$-units. Then for every non-degenerate solution $\mathbf{x} \in (O_S^*)^n$ of (6.1.2), we have

$$H(a_i x_i) \le \Bigl(\prod_{v \in T} \max(1, |a_1 x_1|_v, \ldots, |a_n x_n|_v)\Bigr)^{1/[K:\mathbb{Q}]} = H_T(1, a_1 x_1, \ldots, a_n x_n)^{1/[K:\mathbb{Q}]} \le C^{\mathrm{ineff}}$$

for some ineffective constant $C^{\mathrm{ineff}}$. This leaves only finitely many possibilities for $a_i x_i$ for $i = 1, \ldots, n$. This implies Corollary 6.1.2 at once.
6.3 A sketch of the proof of Theorem 6.1.3

We outline a proof of the following result.

Theorem 6.3.1 Let $K$ be a field of characteristic 0, $n \geq 2$, $\Gamma$ a subgroup of $(K^*)^n$ of finite rank $r$, and $a_1, \ldots, a_n \in K^*$. Then the equation
\[ a_1 x_1 + \cdots + a_n x_n = 1 \quad \text{in } (x_1, \ldots, x_n) \in \Gamma \tag{6.1.3} \]
has at most $c_1(n)^{r+1}$ non-degenerate solutions, where $c_1(n)$ is an effectively computable number depending only on $n$.

Constants $c_2(n), c_3(n), \ldots$ introduced below will also be effectively computable and depend only on $n$.

6.3.1 A reduction

We reduce Theorem 6.3.1 to the following apparently weaker result.

Theorem 6.3.2 Let $n \geq 2$ and let $\Gamma$ be a finitely generated subgroup of $(\overline{\mathbb{Q}}^*)^n$ of rank $r$. Then the set of solutions of
\[ x_1 + \cdots + x_n = 1 \quad \text{in } (x_1, \ldots, x_n) \in \Gamma \tag{6.3.1} \]
is contained in a union of at most $c_2(n)^{r+1}$ proper linear subspaces of $\overline{\mathbb{Q}}^n$.

In the proof of Theorem 6.3.1, we have to make a reduction from the case that $\Gamma$ is contained in an arbitrary field $K$ of characteristic 0 to the case $\Gamma \subset (\overline{\mathbb{Q}}^*)^n$, and for this, we need the following specialization result from algebraic geometry.

Lemma 6.3.3 Let $K$ be a field of characteristic 0 with $K \supset \overline{\mathbb{Q}}$, and $u_1, \ldots, u_m$ $(m \geq 1)$ non-zero elements of $K$. Then there exists a ring homomorphism $\varphi : \overline{\mathbb{Q}}[u_1, \ldots, u_m] \to \overline{\mathbb{Q}}$, leaving $\overline{\mathbb{Q}}$ invariant.

Proof. Define the ideal
\[ I := \{f \in \overline{\mathbb{Q}}[X_1, \ldots, X_m] : f(u_1, \ldots, u_m) = 0\} \]
and let
\[ Z(I) := \{\mathbf{x} = (x_1, \ldots, x_m) \in \overline{\mathbb{Q}}^m : f(\mathbf{x}) = 0 \text{ for all } f \in I\}. \]
Obviously, $I \neq (1)$, so by the Weak Nullstellensatz (see, e.g., Harris (1992), Theorem 5.17), the set $Z(I)$ is not empty. Choose $\mathbf{c} = (c_1, \ldots, c_m) \in Z(I)$. Then there is a well-defined ring homomorphism $\varphi : \overline{\mathbb{Q}}[u_1, \ldots, u_m] \to \overline{\mathbb{Q}}$, mapping $u_i$ to $c_i$ for $i = 1, \ldots, m$, and mapping the elements of $\overline{\mathbb{Q}}$ to themselves.
Proof of Theorem 6.3.1. First suppose that $\Gamma$ is a finitely generated subgroup of $(\overline{\mathbb{Q}}^*)^n$ of rank $r$, and that $a_1, \ldots, a_n \in \overline{\mathbb{Q}}^*$. By applying Theorem 6.3.2 to the group generated by $\Gamma$ and $(a_1, \ldots, a_n)$, we infer that the set of solutions of (6.1.3) is contained in a union of at most $c_2(n)^{r+2}$ proper linear subspaces of $\overline{\mathbb{Q}}^n$. By induction on $n$, it now follows that (6.1.3) has at most $c_1(n)^{r+1}$ non-degenerate solutions. We give the argument.

For $n = 2$ Theorem 6.3.1 is obviously true. Let $n \geq 3$ and assume Theorem 6.3.1 is true for equations in fewer than $n$ unknowns. Consider the solutions of (6.1.3) lying in a fixed proper linear subspace of $\overline{\mathbb{Q}}^n$, given by a non-trivial linear equation $\beta_1 x_1 + \cdots + \beta_n x_n = 0$, say. By combining this with (6.1.3) we can eliminate one of the unknowns, and obtain an equation $\sum_{i \in I} \gamma_i x_i = 1$, where $I$ is a proper subset of $\{1, \ldots, n\}$. For each subset $J$ of $I$, consider the solutions of the latter equation such that $\sum_{i \in J} \gamma_i x_i = 1$ but $\sum_{i \in J'} \gamma_i x_i \neq 0$ for each non-empty subset $J'$ of $J$. Assuming $J$ has cardinality $m$, by the induction hypothesis the latter equation has at most $c_1(m)^{r+1}$ solutions $(x_i : i \in J)$. Substituting a tuple $(x_i : i \in J)$ in (6.1.3) we obtain $\sum_{i \in J^c} a_i x_i = b$, where $J^c := \{1, \ldots, n\} \setminus J$ and $b := 1 - \sum_{i \in J} a_i x_i$, and $b$, as well as each proper subsum of the left-hand side, are non-zero. By applying the induction hypothesis again, we see that there are at most $c_1(n - m)^{r+1}$ possibilities for the remaining tuple $(x_i : i \in J^c)$. So for given $\beta_1, \ldots, \beta_n$ and $J$, we have at most $(c_1(m) c_1(n - m))^{r+1}$ solutions $(x_1, \ldots, x_n)$. By summing over $(\beta_1, \ldots, \beta_n)$ (the number of which is bounded by $c_2(n)^{r+2}$) and over all $J$, we obtain an upper bound $c_1(n)^{r+1}$ for the total number of non-degenerate solutions of (6.1.3). This completes the induction step.

We now consider the general case that $\Gamma$ is a subgroup of $(K^*)^n$ of finite rank $r$, and that $a_1, \ldots, a_n \in K^*$, where $K$ is any field of characteristic 0. We assume without loss of generality that $\overline{\mathbb{Q}}$ is contained in $K$. Assume that (6.1.3) has $M > c_1(n)^{r+1}$ non-degenerate solutions, $\mathbf{x}_1, \ldots, \mathbf{x}_M$, say, where $\mathbf{x}_i = (x_{i1}, \ldots, x_{in})$ for $i = 1, \ldots, M$. We apply Lemma 6.3.3 with the set $\{u_1, \ldots, u_m\}$ consisting of $a_1, \ldots, a_n$, $x_{11}, \ldots, x_{Mn}$, the subsums $\sum_{j \in I} a_j x_{ij}$ ($i = 1, \ldots, M$, $I$ a non-empty subset of $\{1, \ldots, n\}$), the non-zero numbers among $x_{i_1 j} - x_{i_2 j}$ ($1 \leq i_1 < i_2 \leq M$, $j = 1, \ldots, n$), and also the multiplicative inverses of all of these numbers. Thus, the images under $\varphi$ of $u_1, \ldots, u_m$ are all non-zero. Put
\[ a_j' := \varphi(a_j), \quad x_{ij}' := \varphi(x_{ij}) \quad (i = 1, \ldots, M, \ j = 1, \ldots, n). \]
Then $a_j' \neq 0$ for $j = 1, \ldots, n$ and $\mathbf{x}_i' = (x_{i1}', \ldots, x_{in}')$ $(i = 1, \ldots, M)$ are distinct, non-degenerate solutions of
\[ a_1' x_1 + \cdots + a_n' x_n = 1. \tag{6.3.2} \]
Let $\Gamma'$ be the group generated by $\mathbf{x}_1', \ldots, \mathbf{x}_M'$. Then $\Gamma'$ is a finitely generated subgroup of $(\overline{\mathbb{Q}}^*)^n$, and has rank at most $r$ since it is a homomorphic image of a subgroup of $\Gamma$. So (6.3.2) has at least $M > c_1(n)^{r+1}$ non-degenerate solutions in $\Gamma'$, contrary to what has been established above. This shows that in the general case, (6.1.3) cannot have more than $c_1(n)^{r+1}$ non-degenerate solutions.

The remainder of this section is devoted to the proof of Theorem 6.3.2.
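To make the finiteness statement of Theorem 6.3.1 concrete in the simplest case $n = 2$, $a_1 = a_2 = 1$, the following sketch lists the solutions of $x_1 + x_2 = 1$ in which both unknowns have the form $\pm 2^a 3^b$. The group $\{\pm 2^a 3^b\}$, the exponent bound and the helper names `s_units` and `unit_equation_solutions` are illustrative assumptions, not taken from the text. A bounded search of this kind cannot by itself prove that the list is complete; it only illustrates that the solution set is small.

```python
from fractions import Fraction
from itertools import product


def s_units(primes, bound):
    """Return all numbers +-prod(p**e) with |e| <= bound for each prime p.

    Hypothetical helper: a finite window into the group of S-units of Q.
    """
    units = set()
    for exponents in product(range(-bound, bound + 1), repeat=len(primes)):
        value = Fraction(1)
        for p, e in zip(primes, exponents):
            value *= Fraction(p) ** e  # exact rational arithmetic
        units.add(value)
        units.add(-value)
    return units


def unit_equation_solutions(primes, bound):
    """Solutions (x1, x2) of x1 + x2 = 1 with both entries in the window."""
    units = s_units(primes, bound)
    return sorted((x, 1 - x) for x in units if (1 - x) in units)


solutions = unit_equation_solutions((2, 3), 6)
for x1, x2 in solutions:
    print(x1, "+", x2, "= 1")
print("number of solutions found:", len(solutions))
```

Here every solution is non-degenerate, since for $n = 2$ a vanishing proper subsum would force one unknown to be zero.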
6.3.2 Notation

Assume henceforth that $\Gamma$ is a finitely generated subgroup of $(\overline{\mathbb{Q}}^*)^n$ of rank $r$. Let $K$ be an algebraic number field such that $\Gamma \subset (K^*)^n$, and let $S$ be a finite set of places of $K$, containing all infinite places, such that $\Gamma \subseteq (\mathcal{O}_S^*)^n$. Put $d := [K : \mathbb{Q}]$, $s := |S|$. For $\mathbf{x} = (x_1, \ldots, x_n) \in K^n$ define the heights
\[ \begin{aligned} h(\mathbf{x}) &:= \frac{1}{d} \sum_{v \in M_K} \max(0, \log |x_1|_v, \ldots, \log |x_n|_v), \\ h^{+}(\mathbf{x}) &:= \sum_{i=1}^{n} h(x_i) = \frac{1}{d} \sum_{i=1}^{n} \sum_{v \in M_K} \max(0, \log |x_i|_v), \end{aligned} \]
where $h(x)$ denotes the absolute logarithmic height of an algebraic number $x$. These heights can be extended in the usual manner to $(\overline{\mathbb{Q}}^*)^n$ by picking any number field $K$ containing $x_1, \ldots, x_n$ and applying the above definitions. It is straightforward to show that for $\mathbf{x} = (x_1, \ldots, x_n) \in (\overline{\mathbb{Q}}^*)^n$ we have
\[ \frac{1}{n} h^{+}(\mathbf{x}) \leq h(\mathbf{x}) \leq h^{+}(\mathbf{x}). \tag{6.3.3} \]
Further, for $\mathbf{x} = (x_1, \ldots, x_n) \in \Gamma$ we have
\[ \left. \begin{aligned} h(\mathbf{x}) &= \frac{1}{d} \sum_{v \in S} \max(0, \log |x_1|_v, \ldots, \log |x_n|_v), \\ h^{+}(\mathbf{x}) &= \frac{1}{d} \sum_{v \in S} \sum_{i=1}^{n} \max(0, \log |x_i|_v) = \frac{1}{2d} \sum_{v \in S} \sum_{i=1}^{n} \bigl| \log |x_i|_v \bigr|, \end{aligned} \right\} \tag{6.3.4} \]
where the last equality is a consequence of the Product Formula.

6.3.3 Covering results

We treat this in more detail since we can simplify the argument in Evertse, Schlickewei and Schmidt (2002).

Lemma 6.3.4 Let $V$ be an $r$-dimensional real vector space and $\|\cdot\|$ a norm on $V$. Let $C$, $\delta$ be positive reals, and let $\mathcal{S}$ be a subset of $\{\mathbf{x} \in V : \|\mathbf{x}\| \leq C\}$. Then $\mathcal{S}$ has a subset $\mathcal{S}_0$ such that
\[ |\mathcal{S}_0| \leq \Bigl(1 + \frac{2C}{\delta}\Bigr)^{r}, \tag{6.3.5} \]
\[ \text{for every } \mathbf{x} \in \mathcal{S} \text{ there is } \mathbf{x}_0 \in \mathcal{S}_0 \text{ with } \|\mathbf{x} - \mathbf{x}_0\| \leq \delta. \tag{6.3.6} \]

Proof. Let $\mathcal{S}_0$ be any subset of $\mathcal{S}$ with the property that $\|\mathbf{x}' - \mathbf{x}''\| > \delta$ for any two distinct $\mathbf{x}', \mathbf{x}'' \in \mathcal{S}_0$. We show that $\mathcal{S}_0$ satisfies (6.3.5). Knowing this, we can choose $\mathcal{S}_0$ of maximal cardinality; then it satisfies (6.3.6) as well.
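The construction in this proof can be imitated numerically. The sketch below is only an illustration under stated assumptions: it works in $\mathbb{R}^2$ with half the $\ell^1$-norm (an arbitrary choice of norm), draws a finite random sample from the ball of radius $C$, and builds a $\delta$-separated subset greedily; the names `half_l1_norm`, `distance` and `greedy_separated_subset` are hypothetical. For a finite sample the greedy subset is maximal with respect to inclusion, which is all that is needed for (6.3.6), and its size can be compared with the bound (6.3.5).

```python
import random


def half_l1_norm(u):
    """Half the sum of absolute values of the coordinates (illustrative norm)."""
    return 0.5 * sum(abs(t) for t in u)


def distance(x, y):
    """Distance induced by the norm above."""
    return half_l1_norm([a - b for a, b in zip(x, y)])


def greedy_separated_subset(points, delta):
    """Keep a point only if it is more than delta away from all points kept so far.

    Every rejected point lies within delta of a kept point, so the result is
    a delta-separated subset that also covers the sample as in (6.3.6).
    """
    chosen = []
    for x in points:
        if all(distance(x, y) > delta for y in chosen):
            chosen.append(x)
    return chosen


rng = random.Random(0)  # fixed seed so the run is reproducible
r, C, delta = 2, 1.0, 0.25
candidates = ([rng.uniform(-2 * C, 2 * C) for _ in range(r)] for _ in range(5000))
points = [p for p in candidates if half_l1_norm(p) <= C]

subset = greedy_separated_subset(points, delta)
covered = all(any(distance(x, y) <= delta for y in subset) for x in points)
bound = (1 + 2 * C / delta) ** r

print("sample size:", len(points))
print("size of separated subset:", len(subset), "bound (6.3.5):", bound)
print("covering property (6.3.6) holds on the sample:", covered)
```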
For $\mathbf{u} \in V$, define $B_{\mathbf{u}} := \{\mathbf{x} \in V : \|\mathbf{x} - \mathbf{u}\| \leq \delta/2\}$. Then by the triangle inequality, the balls $B_{\mathbf{u}}$ $(\mathbf{u} \in \mathcal{S}_0)$ are pairwise disjoint, and are all contained in $B := \{\mathbf{x} \in V : \|\mathbf{x}\| \leq C + \delta/2\}$. Let $\mu$ be the Lebesgue measure on $V$ normalized such that the unit ball $\{\mathbf{x} \in V : \|\mathbf{x}\| \leq 1\}$ has measure 1. Then, by comparing measures,
\[ |\mathcal{S}_0| \Bigl(\frac{\delta}{2}\Bigr)^{r} = \sum_{\mathbf{u} \in \mathcal{S}_0} \mu(B_{\mathbf{u}}) \leq \mu(B) = \Bigl(C + \frac{\delta}{2}\Bigr)^{r}, \]
which implies (6.3.5).

Write vectors in $\mathbb{R}^{ns}$ as $\mathbf{u} = (u_{iv} : v \in S, \ i = 1, \ldots, n)$ and define the following homomorphism from $\Gamma$ to the additive group of $\mathbb{R}^{ns}$:
\[ \varphi : (x_1, \ldots, x_n) \mapsto \Bigl( \frac{\log |x_i|_v}{d} : v \in S, \ i = 1, \ldots, n \Bigr). \]
Then the kernel of $\varphi$ is the torsion subgroup $\Gamma_{\mathrm{tors}}$ of $\Gamma$. Let $V$ be the real vector space generated by $\varphi(\Gamma)$. Then $V$ has dimension $r$. Define a norm on $\mathbb{R}^{ns}$ by
\[ \|\mathbf{u}\| := \frac{1}{2} \sum_{v \in S} \sum_{i=1}^{n} |u_{iv}|. \tag{6.3.7} \]
Then by (6.3.4) we have
\[ h^{+}(\mathbf{x}) = \|\varphi(\mathbf{x})\| \quad \text{for } \mathbf{x} \in \Gamma. \tag{6.3.8} \]
By combining this with Lemma 6.3.4 we obtain the following.

Lemma 6.3.5 Let $C$, $\delta$ be positive reals, and let $\mathcal{S}$ be a non-empty subset of $\{\mathbf{x} \in \Gamma : h^{+}(\mathbf{x}) \leq C\}$. Then $\mathcal{S}$ has a subset $\mathcal{S}_0$ such that
\[ |\mathcal{S}_0| \leq \Bigl(1 + \frac{2C}{\delta}\Bigr)^{r}, \tag{6.3.9} \]
\[ \text{for every } \mathbf{x} \in \mathcal{S} \text{ there is } \mathbf{x}_0 \in \mathcal{S}_0 \text{ with } h^{+}(\mathbf{x} \cdot \mathbf{x}_0^{-1}) \leq \delta. \tag{6.3.10} \]

Proof. Let $\mathcal{S}^* := \varphi(\mathcal{S})$. Choose $\mathcal{S}_0^* \subset \mathcal{S}^*$ as in Lemma 6.3.4. Then choose for each $\mathbf{u}_0 \in \mathcal{S}_0^*$ precisely one element $\mathbf{x}_0 \in \mathcal{S}$ with $\varphi(\mathbf{x}_0) = \mathbf{u}_0$, and let $\mathcal{S}_0$ be the set of all elements thus chosen. Then clearly, $\mathcal{S}_0$ satisfies (6.3.9). To show that it also satisfies (6.3.10), let $\mathbf{x} \in \mathcal{S}$, choose $\mathbf{u}_0 \in \mathcal{S}_0^*$ with $\|\varphi(\mathbf{x}) - \mathbf{u}_0\| \leq \delta$, and then $\mathbf{x}_0 \in \mathcal{S}_0$ with $\varphi(\mathbf{x}_0) = \mathbf{u}_0$. Then by (6.3.8), $h^{+}(\mathbf{x} \cdot \mathbf{x}_0^{-1}) \leq \delta$.

We give another application.

Lemma 6.3.6 Let $\theta > 0$. There is a subset $\mathcal{S}_1$ of $V$ of cardinality
\[ |\mathcal{S}_1| \leq \Bigl(1 + \frac{4n}{\theta}\Bigr)^{r} \]
such that for every $\mathbf{x} = (x_1, \ldots, x_n) \in \Gamma$ with $h(\mathbf{x}) > 0$ there is $\mathbf{c} = (c_{iv} : v \in S, \ i = 1, \ldots, n) \in \mathcal{S}_1$ with
\[ \sum_{v \in S} \sum_{i=1}^{n} \Bigl| \frac{\log |x_i|_v}{d\,h(\mathbf{x})} - c_{iv} \Bigr| \leq \theta. \tag{6.3.11} \]

Proof. We apply Lemma 6.3.4 to the set $\mathcal{S}$ of vectors
\[ \mathbf{u}(\mathbf{x}) = \Bigl( \frac{\log |x_i|_v}{d\,h(\mathbf{x})} : v \in S, \ i = 1, \ldots, n \Bigr) \quad (\mathbf{x} \in \Gamma, \ h(\mathbf{x}) > 0), \]
which is contained in $V$, and with the norm given by (6.3.7). By (6.3.8) and (6.3.3) we have for these $\mathbf{x}$,
\[ \|\mathbf{u}(\mathbf{x})\| = \frac{1}{h(\mathbf{x})} \|\varphi(\mathbf{x})\| = \frac{h^{+}(\mathbf{x})}{h(\mathbf{x})} \leq n. \]
So $\mathcal{S} \subseteq \{\mathbf{u} \in V : \|\mathbf{u}\| \leq n\}$. By Lemma 6.3.4 with $\delta = \theta/2$ there is a subset $\mathcal{S}_1$ of $\mathcal{S}$ of cardinality at most $\bigl(1 + \frac{2n}{\theta/2}\bigr)^{r} = \bigl(1 + \frac{4n}{\theta}\bigr)^{r}$ such that for every such $\mathbf{x}$ there is $\mathbf{c} \in \mathcal{S}_1$ with $\|\mathbf{u}(\mathbf{x}) - \mathbf{c}\| \leq \theta/2$. This implies (6.3.11).

6.3.4 The large solutions

We give an upper bound for the number of subspaces containing the solutions $\mathbf{x}$ of (6.3.1) for which $h(\mathbf{x})$ is large. Our main tool is Theorem 3.1.6, i.e., the quantitative version of the Parametric Subspace Theorem stated in Section 3.1.

We apply Lemma 6.3.6 with $\theta = c_3(n)^{-1}$, where $c_3(n)$ is a sufficiently large function of $n$. Let $\mathcal{S}_1$ be the set from Lemma 6.3.6. With our choice of $\theta$, we have
\[ |\mathcal{S}_1| \leq c_4(n)^{r}. \tag{6.3.12} \]
Pick a tuple $\mathbf{c} = (c_{iv} : v \in S, \ i = 1, \ldots, n)$ from $\mathcal{S}_1$, and consider the solutions $\mathbf{x} = (x_1, \ldots, x_n) \in \Gamma$ of (6.3.1) with
\[ \sum_{v \in M_K} \sum_{i=1}^{n} \Bigl| \frac{\log |x_i|_v}{d\,h(\mathbf{x})} - c_{iv} \Bigr| \leq \theta = c_3(n)^{-1}. \tag{6.3.13} \]
Put $x_0 := 1$, $X_0 := X_1 + \cdots + X_n$, $c_{0v} := 0$ for $v \in S$, and for $v \in S$, choose $i_v \in \{0, \ldots, n\}$ such that $c_{i_v, v} = \max(c_{0v}, \ldots, c_{nv})$. Let $L_{1v}, \ldots, L_{nv}$ be the linear forms $X_i$ $(i \in \{0, \ldots, n\} \setminus \{i_v\})$ in some order, and put $d_{jv} = c_{iv}$ if $L_{jv} = X_i$. Further, for $v \in M_K \setminus S$, $i = 1, \ldots, n$ put $L_{iv} = X_i$, $d_{iv} = 0$. Finally, put
\[ Q := \exp(d\,h(\mathbf{x})), \quad \mathbf{d} := (d_{iv} : v \in M_K, \ i = 1, \ldots, n) \]
and define the twisted height $H_{Q,\mathbf{d}}$ by (3.1.4).
By (6.3.13) we have
\[ \sum_{v \in M_K} \sum_{j=1}^{n} \Bigl| \frac{\log |L_{jv}(\mathbf{x})|_v}{\log Q} - d_{jv} \Bigr| \leq c_3(n)^{-1}, \]
and this implies
\[ H_{Q,\mathbf{d}}(\mathbf{x}) \leq Q^{c_3(n)^{-1}}. \tag{6.3.14} \]
Let
\[ \xi_{iv} := \frac{\log |x_i|_v}{d\,h(\mathbf{x})} \ (v \in S, \ i = 1, \ldots, n), \quad \xi_{0v} := 0 \ (v \in S). \]
Then
\[ \begin{aligned} \sum_{v \in S} \Bigl( \sum_{i=0}^{n} \xi_{iv} - \max(\xi_{0v}, \ldots, \xi_{nv}) \Bigr) &= \frac{1}{d\,h(\mathbf{x})} \Bigl( \log \prod_{v \in S} |x_1 x_2 \cdots x_n|_v - \sum_{v \in S} \log \max(1, |x_1|_v, \ldots, |x_n|_v) \Bigr) \\ &= \frac{-d\,h(\mathbf{x})}{d\,h(\mathbf{x})} = -1 \end{aligned} \]
and
\[ \sum_{v \in M_K} \max(\xi_{0v}, \ldots, \xi_{nv}) = 1. \]
In combination with (6.3.13) this implies, assuming that $c_3(n)$ is sufficiently large, that
\[ \mu := \frac{1}{n} \sum_{v \in M_K} \sum_{j=1}^{n} d_{jv} = \frac{1}{n} \sum_{v \in S} \Bigl( \sum_{i=0}^{n} c_{iv} - \max(c_{0v}, \ldots, c_{nv}) \Bigr) \]
is approximately equal to
\[ \frac{1}{n} \sum_{v \in S} \Bigl( \sum_{i=0}^{n} \xi_{iv} - \max(\xi_{0v}, \ldots, \xi_{nv}) \Bigr) = -\frac{1}{n}; \]
more precisely,
\[ -\frac{1}{n} - c_5(n)^{-1} \leq \mu \leq -\frac{1}{n} + c_5(n)^{-1}, \]
where $c_5(n) = 2c_3(n)$. Likewise,
\[ \lambda := \sum_{v \in M_K} \max(d_{1v}, \ldots, d_{nv}) \leq 1 + c_5(n)^{-1}. \]
Together with (6.3.14) this implies
\[ H_{Q,\mathbf{d}}(\mathbf{x}) \leq Q^{-\mu-\delta} \tag{6.3.15} \]
where, provided that $c_3(n)$ is sufficiently large,
\[ \delta = c_6(n)^{-1} := \frac{1}{n} - \bigl( c_5(n)^{-1} + c_3(n)^{-1} \bigr) > 0. \]
Recall that every solution $\mathbf{x}$ of (6.3.1) with (6.3.13) satisfies (6.3.15) with $Q = \exp(d\,h(\mathbf{x}))$. Now Theorem 3.1.6 (the Quantitative Parametric Subspace Theorem) implies that the set of solutions of (6.3.1) with (6.3.13) and with
\[ h(\mathbf{x}) = \frac{1}{d} \log Q \geq 2c_6(n) \log n =: c_7(n) \]
is contained in a union of at most $c_8(n)$ proper linear subspaces of $K^n$. Taking into account the upper bound (6.3.12) for the cardinality of $\mathcal{S}_1$, which is an upper bound for the number of different inequalities (6.3.13), we arrive at the following.

Lemma 6.3.7 The set of solutions $\mathbf{x} = (x_1, \ldots, x_n) \in \Gamma$ of (6.3.1) with $h(\mathbf{x}) \geq c_7(n)$ is contained in a union of at most $c_8(n) c_4(n)^{r}$ proper linear subspaces of $K^n$.
6.3.5 The small solutions, and conclusion of the proof

It remains to consider the solutions $\mathbf{x}$ of (6.3.1) with $h(\mathbf{x}) < c_7(n)$. The crucial tool is the following, which we state without proof.

Proposition 6.3.8 Let $b_1, \ldots, b_n \in \overline{\mathbb{Q}}^*$. Then the equation
\[ b_1 y_1 + \cdots + b_n y_n = 1 \quad \text{in } \mathbf{y} = (y_1, \ldots, y_n) \in (\overline{\mathbb{Q}}^*)^n \tag{6.3.16} \]
has at most $c_9(n)$ non-degenerate solutions with $h^{+}(\mathbf{y}) \leq c_{10}(n)^{-1}$.

This result is a special case of more general estimates for the number of points of small height lying on an arbitrary subvariety of a linear torus. From a result of Schmidt (1996), Theorem 4, which was obtained by an elementary method, one can deduce the above Proposition with
\[ c_9(n) = c_{10}(n) = \exp\bigl((4n)^{2n+2}\bigr). \]
From David and Philippon (1999), Theorem 1.3 and errata, which is much deeper and uses difficult commutative algebra, it follows that Proposition 6.3.8 holds with
\[ c_9(n) = 2^{(n+26)\,7^{n-1}}, \quad c_{10}(n) = c_9(n)^{3/4}. \]
Finally, a result of Amoroso and Viada (2009) implies that Proposition 6.3.8 holds with
\[ c_9(n) = (400 n^5 \log n)^{n^2 (n-1)^2}, \quad c_{10}(n) = 2 (400 n^5 \log n)^{n (n-1)^2}. \]
The proof of Amoroso and Viada also uses commutative algebra but it is not as difficult as that of David and Philippon.

Consider the set $\mathcal{S}$ of solutions $\mathbf{x}$ of (6.3.1) with $h(\mathbf{x}) < c_7(n)$. By (6.3.3), these solutions satisfy $h^{+}(\mathbf{x}) < n c_7(n)$. Apply Lemma 6.3.5 with $C = n c_7(n)$, $\delta = c_{10}(n)^{-1}$. According to that lemma, there is a subset $\mathcal{S}_0 \subseteq \mathcal{S}$ of cardinality
\[ |\mathcal{S}_0| \leq \bigl(1 + 2n c_7(n) c_{10}(n)\bigr)^{r} \leq c_{11}(n)^{r} \]
such that for every $\mathbf{x} \in \mathcal{S}$ there is $\mathbf{x}_0 \in \mathcal{S}_0$ with
\[ h^{+}(\mathbf{x} \cdot \mathbf{x}_0^{-1}) \leq c_{10}(n)^{-1}. \tag{6.3.17} \]
Write $\mathbf{x}_0 = (b_1, \ldots, b_n)$, $\mathbf{y} = (y_1, \ldots, y_n) := \mathbf{x} \cdot \mathbf{x}_0^{-1}$.
Then clearly, the number of non-degenerate solutions $\mathbf{x}$ of (6.3.1) with (6.3.17) is at most the number of non-degenerate solutions $\mathbf{y}$ of (6.3.16) with $h^{+}(\mathbf{y}) \leq c_{10}(n)^{-1}$, hence at most $c_9(n)$. Taking into account the cardinality of $\mathcal{S}_0$, it follows that (6.3.1) has at most $c_9(n) c_{11}(n)^{r}$ non-degenerate solutions with $h(\mathbf{x}) < c_7(n)$.

The degenerate solutions of (6.3.1) lie in at most $2^n$ proper linear subspaces of $K^n$, each given by a vanishing subsum. We infer that the solutions $\mathbf{x}$ of (6.3.1) (non-degenerate or not) with $h(\mathbf{x}) < c_7(n)$ lie in a union of at most $2^n + c_9(n) c_{11}(n)^{r}$ proper linear subspaces. Adding to this the quantity from Lemma 6.3.7, it follows that the complete set of solutions of (6.3.1) is contained in a union of at most
\[ c_8(n) c_4(n)^{r} + 2^n + c_9(n) c_{11}(n)^{r} \leq c_2(n)^{r+1} \]
proper linear subspaces of $K^n$. This completes the proof of Theorem 6.3.2.

6.4 Proof of Theorem 6.1.4

We follow Beukers and Schlickewei (1996). Let $K$ be a field of characteristic 0, and $\Gamma$ a subgroup of $K^* \times K^*$ of rank $r$. We first show that Theorem 6.1.4 can be reduced to the following special case.

Theorem 6.4.1 Let $\Gamma$ be a finitely generated subgroup of $\overline{\mathbb{Q}}^* \times \overline{\mathbb{Q}}^*$ of rank $r$. Then the equation
\[ x_1 + x_2 = 1 \quad \text{in } (x_1, x_2) \in \Gamma \tag{6.4.1} \]
has at most $2^{8(r+1)}$ solutions.

Proof of Theorem 6.1.4. We again use specializations. Let $K$ be any field of characteristic 0, and let $\Gamma$ be a subgroup of $K^* \times K^*$ of rank $r$. We have to prove that any finite subset of the set of solutions of (6.4.1) has cardinality at most $2^{8(r+1)}$. Let $\{(x_{i1}, x_{i2}) : i = 1, \ldots, N\}$ be such a finite subset. We apply Lemma 6.3.3 with $\{u_1, \ldots, u_m\}$ consisting of the numbers $x_{ik}$ $(i = 1, \ldots, N, \ k = 1, 2)$, the non-zero numbers among $x_{ik} - x_{jk}$ $(1 \leq i < j \leq N, \ k = 1, 2)$, and the multiplicative inverses of all these numbers. Let $\varphi$ be the ring homomorphism from Lemma 6.3.3, and put $y_{ik} := \varphi(x_{ik})$ for $i = 1, \ldots, N$, $k = 1, 2$. Since the images under $\varphi$ of the numbers listed above are all non-zero, $(y_{i1}, y_{i2})$ $(i = 1, \ldots, N)$ are distinct pairs from $\overline{\mathbb{Q}}^* \times \overline{\mathbb{Q}}^*$. In fact, they yield $N$ distinct solutions of
\[ y_1 + y_2 = 1 \quad \text{in } (y_1, y_2) \in \Gamma', \]
where $\Gamma'$ is the group generated by $(y_{i1}, y_{i2})$ $(i = 1, \ldots, N)$. But $\Gamma'$ is a subgroup of $\varphi(\Gamma)$, hence it has rank $r' \leq r$. Now it follows from Theorem 6.4.1 that $N \leq 2^{8(r'+1)} \leq 2^{8(r+1)}$.

In the remainder of this section, we prove Theorem 6.4.1. Instead of the Quantitative Parametric Subspace Theorem, we can now use a much simpler method from Diophantine approximation, based on certain polynomial identities, going back to Thue and Siegel.

For $N \in \mathbb{Z}_{>0}$ define the binary form
\[ W_N(X, Y) := \sum_{m=0}^{N} \binom{2N-m}{N-m} \binom{N+m}{m} X^{N-m} (-Y)^{m}, \]
and set $Z := -X - Y$, so that $X + Y + Z = 0$.

Lemma 6.4.2 We have the following polynomial identities, valid for every positive integer $N$:
\[ W_N(Y, X) = (-1)^N W_N(X, Y); \tag{6.4.2} \]
\[ X^{2N+1} W_N(Y, Z) + Y^{2N+1} W_N(Z, X) + Z^{2N+1} W_N(X, Y) = 0; \tag{6.4.3} \]
\[ \begin{vmatrix} Z^{2N+1} W_N(X, Y) & Y^{2N+1} W_N(Z, X) \\ Z^{2N+3} W_{N+1}(X, Y) & Y^{2N+3} W_{N+1}(Z, X) \end{vmatrix} = c_N (XYZ)^{2N+1} (X^2 + XY + Y^2) \quad \text{with } c_N \neq 0. \tag{6.4.4} \]

Proof. Identity (6.4.2) is obvious. Identity (6.4.3) can be deduced from classical relations between hypergeometric functions (see, for instance, Bombieri and Gubler (2006), chapter 5), but we give a direct proof. Fix a positive integer $N$, let $x, y, z$ be non-zero complex numbers with $x + y + z = 0$ and consider the rational function
\[ f(t) := \frac{1}{\bigl(t(1 - xt)(1 + yt)\bigr)^{N+1}}. \]
This function has poles of order $N + 1$ at $t = 0$, $t = 1/x$, $t = -1/y$ and no other poles on $\mathbb{C} \cup \{\infty\}$. The sum of the residues at these poles is 0. We compute the residues. The residue of $f$ at $t = 0$ is the coefficient of $t^N$ in the power series expansion of $(1 - xt)^{-N-1} (1 + yt)^{-N-1}$, which is
\[ \sum_{m=0}^{N} \binom{-N-1}{N-m} \binom{-N-1}{m} (-x)^{N-m} y^{m} = \sum_{m=0}^{N} \binom{2N-m}{N-m} \binom{N+m}{m} x^{N-m} (-y)^{m} = W_N(x, y). \]
The residue of $f$ at $t = 1/x$ is equal to the residue at $t = 0$ of
\[ f(t + 1/x) = \frac{1}{\bigl((t + 1/x)(-xt)(-z/x + yt)\bigr)^{N+1}} = \frac{(x/z)^{N+1}}{\bigl(t(1 - xyt/z)(1 + xt)\bigr)^{N+1}}. \]
This residue is equal to $(x/z)^{N+1} W_N(xy/z, x) = (x/z)^{2N+1} W_N(y, z)$. A similar computation gives that the residue of $f$ at $t = -1/y$ equals $(y/z)^{2N+1} W_N(z, x)$. Summing the residues and multiplying by $z^{2N+1}$ shows that (6.4.3) holds for all non-zero $x, y, z \in \mathbb{C}$ with $x + y + z = 0$.

Denote the left-hand side of (6.4.4) by $\Delta_N$. We first show that $\Delta_N \not\equiv 0$. Indeed, by (6.4.2), the value of $\Delta_N$ at $X = 2$, $Y = Z = -1$ is
\[ \begin{vmatrix} W_N(2, -1) & W_N(-1, 2) \\ W_{N+1}(2, -1) & W_{N+1}(-1, 2) \end{vmatrix} = \pm 2 W_N(2, -1) W_{N+1}(2, -1), \]
which is easily seen to be non-zero. By (6.4.3), up to sign, $\Delta_N$ is invariant under the substitutions $(X, Y) \to (Y, Z)$, $(X, Y) \to (Z, X)$. Hence $\Delta_N$ is divisible by $(XYZ)^{2N+1}$. Since $\Delta_N$ is homogeneous of degree $6N + 5$, the quotient $\Delta_N / (XYZ)^{2N+1}$ is a quadratic form, which is up to sign invariant under the above mentioned substitutions. So this quadratic form must be a scalar multiple of $X^2 + XY + Y^2$.

Recall that the homogeneous logarithmic height of $\mathbf{x} = (x_1, \ldots, x_n) \in \overline{\mathbb{Q}}^n$ is given by
\[ h^{\mathrm{hom}}(\mathbf{x}) = h^{\mathrm{hom}}(x_1, \ldots, x_n) := \frac{1}{[K : \mathbb{Q}]} \sum_{v \in M_K} \log \max_{1 \leq i \leq n} |x_i|_v, \]
where $K$ is a number field with $\mathbf{x} \in K^n$ (see Section 1.9).
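Over $K = \mathbb{Q}$ the homogeneous height is easy to evaluate, and a small computation may help to fix ideas. The sketch below assumes rational coordinates only and uses the standard fact that, by the Product Formula, $h^{\mathrm{hom}}$ is unchanged under scaling, so that for a vector of coprime integers it reduces to the logarithm of the largest absolute value. The function name `h_hom_rational` is hypothetical.

```python
from fractions import Fraction
from functools import reduce
from math import gcd, log


def h_hom_rational(vector):
    """Homogeneous logarithmic height of a non-zero vector with rational entries.

    Scale the vector to coprime integers; the finite places then contribute
    nothing and the infinite place contributes log(max |entry|).
    """
    fracs = [Fraction(t) for t in vector]
    if all(f == 0 for f in fracs):
        raise ValueError("the zero vector has no homogeneous height")
    # Least common multiple of the denominators clears all fractions.
    common_den = reduce(lambda a, b: a * b // gcd(a, b),
                        (f.denominator for f in fracs), 1)
    ints = [int(f * common_den) for f in fracs]
    g = reduce(gcd, ints, 0)  # gcd of the integer entries (positive)
    return log(max(abs(t) for t in ints) // g)


# The vector (1/2, 1/3) scales to the coprime integer vector (3, 2).
print(h_hom_rational([Fraction(1, 2), Fraction(1, 3)]))
# The vector (1, 2/3) scales to (3, 2), so its height is also log 3.
print(h_hom_rational([1, Fraction(2, 3)]))
```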
The other heights used in this chapter are related to $h^{\mathrm{hom}}$ by
\[ h(\mathbf{x}) = h^{\mathrm{hom}}(1, x_1, \ldots, x_n), \quad h^{+}(\mathbf{x}) = \sum_{i=1}^{n} h^{\mathrm{hom}}(1, x_i). \]

Lemma 6.4.3 Let $a, b, c$ be non-zero elements of $\overline{\mathbb{Q}}$, and let $(x_i, y_i, z_i)$ $(i = 1, 2)$ be two linearly independent vectors from $\overline{\mathbb{Q}}^3$ such that $a x_i + b y_i + c z_i = 0$ for $i = 1, 2$. Then
\[ h^{\mathrm{hom}}(a, b, c) \leq h^{\mathrm{hom}}(x_1, y_1, z_1) + h^{\mathrm{hom}}(x_2, y_2, z_2) + \log 2. \]

Proof. The vector $(a, b, c)$ is proportional to the exterior product of $(x_1, y_1, z_1)$, $(x_2, y_2, z_2)$, which is $(y_1 z_2 - y_2 z_1, \ z_1 x_2 - x_1 z_2, \ x_1 y_2 - x_2 y_1)$. So
\[ h^{\mathrm{hom}}(a, b, c) = h^{\mathrm{hom}}(y_1 z_2 - y_2 z_1, \ z_1 x_2 - x_1 z_2, \ x_1 y_2 - x_2 y_1). \]
Choose a number field $K$ containing $x_i, y_i, z_i$ for $i = 1, 2$. Let $s(v) := 1$ if $v$ is a real place, $s(v) := 2$ if $v$ is a complex place, and $s(v) := 0$ if $v$ is a finite place of $K$. Recall that $\sum_{v \in M_K} s(v) = [K : \mathbb{Q}]$. Now the lemma follows easily by observing that
\[ \max(|y_1 z_2 - y_2 z_1|_v, |z_1 x_2 - z_2 x_1|_v, |x_1 y_2 - x_2 y_1|_v) \leq 2^{s(v)} \max(|x_1|_v, |y_1|_v, |z_1|_v) \max(|x_2|_v, |y_2|_v, |z_2|_v) \]
for $v \in M_K$, and then taking the product over $v \in M_K$, taking logarithms, and dividing by $[K : \mathbb{Q}]$.

Lemma 6.4.4 Let $\mathbf{x}_i = (x_{i1}, x_{i2}) \in \overline{\mathbb{Q}}^* \times \overline{\mathbb{Q}}^*$ with $x_{i1} + x_{i2} = 1$ for $i = 1, 2$ and with $\mathbf{x}_1 \neq \mathbf{x}_2$. Then
\[ h(\mathbf{x}_1) \leq h(\mathbf{x}_2 \mathbf{x}_1^{-1}) + \log 2. \]

Proof. Apply Lemma 6.4.3 with $(a, b, c) = (x_{11}, x_{12}, -1)$, $(x_1, y_1, z_1) = (1, 1, 1)$, $(x_2, y_2, z_2) = (x_{21} x_{11}^{-1}, \ x_{22} x_{12}^{-1}, \ 1)$.

Lemma 6.4.5 Let $\mathbf{x}_i = (x_{i1}, x_{i2}) \in \overline{\mathbb{Q}}^* \times \overline{\mathbb{Q}}^*$ with $x_{i1} + x_{i2} = 1$ for $i = 1, 2$. Then for every positive integer $N$ there is $M \in \{N, N + 1\}$ such that
\[ h(\mathbf{x}_1) \leq \frac{1}{M+1} h(\mathbf{x}_2 \mathbf{x}_1^{-2M-1}) + \log 8. \]

Proof. If both $x_{11}$, $x_{12}$ are roots of unity, then $h(\mathbf{x}_1) = 0$ and the lemma is obviously true. Assume that $x_{11}$, $x_{12}$ are not both roots of unity.
Choose a number field $K$ containing $x_{i1}$, $x_{i2}$ for $i = 1, 2$. By (6.4.3) we have
\[ x_{11}^{2M+1} W_M(x_{12}, -1) + x_{12}^{2M+1} W_M(-1, x_{11}) - W_M(x_{11}, x_{12}) = 0 \]
for $M \in \mathbb{Z}_{>0}$, while also
\[ x_{11}^{2M+1} \bigl( x_{21} x_{11}^{-2M-1} \bigr) + x_{12}^{2M+1} \bigl( x_{22} x_{12}^{-2M-1} \bigr) - 1 = 0. \]
Let $N$ be a positive integer. By (6.4.4), and since $\mathbf{x}_1$ does not consist of roots of unity, there is $M \in \{N, N + 1\}$ such that the vectors $(x_{21}, x_{22}, -1)$ and
\[ \bigl( x_{11}^{2M+1} W_M(x_{12}, -1), \ x_{12}^{2M+1} W_M(-1, x_{11}), \ -W_M(x_{11}, x_{12}) \bigr) \]
are linearly independent. This implies that the two vectors
\[ \bigl( x_{21} x_{11}^{-2M-1}, \ x_{22} x_{12}^{-2M-1}, \ -1 \bigr), \quad \bigl( W_M(x_{12}, -1), \ W_M(-1, x_{11}), \ -W_M(x_{11}, x_{12}) \bigr) =: (a, b, c) \]
are linearly independent. So by Lemma 6.4.3, applied with the coefficient vector $(x_{11}^{2M+1}, x_{12}^{2M+1}, 1)$,
\[ (2M + 1) h(\mathbf{x}_1) \leq h(\mathbf{x}_2 \mathbf{x}_1^{-2M-1}) + h^{\mathrm{hom}}(a, b, c) + \log 2. \tag{6.4.5} \]
We estimate $h^{\mathrm{hom}}(a, b, c)$. Choose a number field $K$ containing $x_{11}$, $x_{12}$. The binary form $W_M$ has integer coefficients, whose absolute values have sum
\[ \sum_{m=0}^{M} \binom{2M-m}{M-m} \binom{M+m}{m} = \binom{3M+1}{M} \leq 2^{3M} \]
(this can be seen by comparing the coefficients of $X^M$ in the power series identity $(1 - X)^{-M-1} \cdot (1 - X)^{-M-1} = (1 - X)^{-2M-2}$). As a consequence, we have for $v \in M_K$,
\[ \max(|a|_v, |b|_v, |c|_v) \leq 2^{3M s(v)} \max(1, |x_{11}|_v, |x_{12}|_v)^{M}. \]
By taking the product over $v \in M_K$, then logarithms and then dividing by $[K : \mathbb{Q}]$, we obtain
\[ h^{\mathrm{hom}}(a, b, c) \leq M \cdot h(\mathbf{x}_1) + 3M \log 2. \]
Together with (6.4.5) this gives
\[ (M + 1) h(\mathbf{x}_1) \leq h(\mathbf{x}_2 \mathbf{x}_1^{-2M-1}) + (3M + 1) \log 2, \]
which easily implies our lemma.

The next result, which is needed to deal with the "small" solutions, is due to Beukers and Zagier.

Lemma 6.4.6 Let $\mathbf{x}_i = (x_{i1}, x_{i2})$ $(i = 0, 1, 2)$ be three distinct points from $\overline{\mathbb{Q}}^* \times \overline{\mathbb{Q}}^*$ with $x_{i1} + x_{i2} = 1$ for $i = 0, 1, 2$. Then
\[ h(\mathbf{x}_1 \mathbf{x}_0^{-1}) + h(\mathbf{x}_2 \mathbf{x}_0^{-1}) > 0.09. \]

Proof.
This is a consequence of Corollary 2.4 of Beukers and Zagier (1997). A similar result of this type, with a lower bound $1/2400$ instead of $0.09$, was obtained earlier by Schlickewei and Wirsing (1997). We give a sketch of the proof of Beukers and Zagier, referring for certain details to their paper.

Write $\mathbf{y}_i = (y_{i1}, y_{i2}) = \mathbf{x}_i \mathbf{x}_0^{-1}$ for $i = 1, 2$ and $a_j := x_{0j}$ for $j = 1, 2$. Then $(1, 1)$, $\mathbf{y}_1$, $\mathbf{y}_2$ lie on the line $L : a_1 x_1 + a_2 x_2 = 1$. Hence
\[ \begin{vmatrix} 1 & 1 & 1 \\ 1 & y_{11} & y_{12} \\ 1 & y_{21} & y_{22} \end{vmatrix} = 0. \tag{6.4.6} \]
Further,
\[ \Delta := \begin{vmatrix} 1 & 1 & 1 \\ y_{11} y_{12} & y_{12} & y_{11} \\ y_{21} y_{22} & y_{22} & y_{21} \end{vmatrix} = y_{11} y_{12} y_{21} y_{22} \begin{vmatrix} 1 & 1 & 1 \\ 1 & y_{11}^{-1} & y_{12}^{-1} \\ 1 & y_{21}^{-1} & y_{22}^{-1} \end{vmatrix} \neq 0. \tag{6.4.7} \]
For assume the contrary. Then the three points $(1, 1)$, $\mathbf{y}_1$, $\mathbf{y}_2$ lie on a conic $C : b_1 x_1 x_2 + b_2 x_2 + b_3 x_1 = 0$, where at least two among the coefficients $b_1, b_2, b_3$ must be non-zero. It is easy to see that $L$ and $C$ can have no more than two points in common, giving a contradiction.

In what follows, we need functions on nine-dimensional complex space. We write points in $\mathbb{C}^9$ as $\mathbf{z}$, $(z_{00}, z_{01}, \ldots, z_{22})$, or as $(\mathbf{z}_0, \mathbf{z}_1, \mathbf{z}_2)$ where $\mathbf{z}_i = (z_{i0}, z_{i1}, z_{i2})$ for $i = 0, 1, 2$. Define $F : \mathbb{C}^9 \to \mathbb{C}$ by
\[ F(\mathbf{z}) := \begin{vmatrix} z_{01} z_{02} & z_{02} z_{00} & z_{00} z_{01} \\ z_{11} z_{12} & z_{12} z_{10} & z_{10} z_{11} \\ z_{21} z_{22} & z_{22} z_{20} & z_{20} z_{21} \end{vmatrix}. \]
Notice that $F(\mathbf{z}) = \bigl( \prod_{i,j=0}^{2} z_{ij} \bigr) \det(z_{ij}^{-1})$ if all $z_{ij} \neq 0$. Let $\mu, \nu$ be reals with $\mu \geq 0$, $\nu \geq 0$, $2\mu + 3\nu = 1$, which will be chosen optimally later, and define $\Phi_{\mu,\nu} : \mathbb{C}^9 \to \mathbb{R}$ by
\[ \Phi_{\mu,\nu}(\mathbf{z}) := |F(\mathbf{z})|^{\mu} \prod_{i,j=0}^{2} |z_{ij}|^{\nu}. \]
This is equal to $\prod_{i,j=0}^{2} |z_{ij}|^{\mu+\nu} \cdot |\det(z_{ij}^{-1})|^{\mu}$ if all $z_{ij} \neq 0$. Finally, define the set
\[ D := \{\mathbf{z} \in \mathbb{C}^9 : \det(z_{ij}) = 0, \ |z_{ij}| \leq 1 \text{ for } i, j = 0, 1, 2\} \]
and put
\[ m(\mu, \nu) := \sup_{\mathbf{z} \in D} \Phi_{\mu,\nu}(\mathbf{z}). \]
We first show that
\[ h(\mathbf{y}_1) + h(\mathbf{y}_2) \geq -\log m(\mu, \nu). \tag{6.4.8} \]
Choose a number field $K$ containing the coordinates of $\mathbf{y}_1$, $\mathbf{y}_2$.
For $v \in M_K$, $i = 1, 2$, let $\lambda_{iv} := \max(1, |y_{i1}|_v, |y_{i2}|_v)$. First, let $v$ be an infinite place, and choose an embedding $\sigma_v : K \to \mathbb{C}$ such that $|\cdot|_v = |\sigma_v(\cdot)|^{s(v)}$, where as usual, $s(v) = 1$ if $v$ is real, $s(v) = 2$ if $v$ is complex. Let $\mathbf{z} = (\mathbf{z}_0, \mathbf{z}_1, \mathbf{z}_2) = (z_{00}, \ldots, z_{22})$, where $\mathbf{z}_0 = (1, 1, 1)$, and $\mathbf{z}_i$ is a scalar multiple of $(1, \sigma_v(y_{i1}), \sigma_v(y_{i2}))$ such that $\max_{0 \leq j \leq 2} |z_{ij}| = 1$, for $i = 1, 2$. Then
\[ \frac{|\Delta|_v^{\mu} \cdot |y_{11} y_{12} y_{21} y_{22}|_v^{\nu}}{(\lambda_{1v} \lambda_{2v})^{2\mu+3\nu}} = \Phi_{\mu,\nu}(\mathbf{z})^{s(v)} \leq m(\mu, \nu)^{s(v)}. \]
If $v$ is finite then we have by the ultrametric inequality,
\[ \frac{|\Delta|_v^{\mu} \cdot |y_{11} y_{12} y_{21} y_{22}|_v^{\nu}}{(\lambda_{1v} \lambda_{2v})^{2\mu+3\nu}} \leq 1. \]
By taking the product over $v \in M_K$, using (6.4.7) and the Product Formula, and the condition $2\mu + 3\nu = 1$, inequality (6.4.8) easily follows.

We first derive an upper bound for $m(\mu, \nu)$, and then determine the minimum of this upper bound over the set of $(\mu, \nu) \in \mathbb{R}_{\geq 0}^2$ with $2\mu + 3\nu = 1$. Since the set $D$ is compact, the function $\Phi_{\mu,\nu}$ attains a maximum on $D$, say at $\mathbf{z}$. We use the fact that $\mathbf{z}$ can be chosen in such a way that at most one of the coordinates of $\mathbf{z}$ has absolute value $< 1$ (see Beukers and Zagier (1997), Lemma 3.2 for a proof). By symmetry, we may assume that $|z_{00}| \leq 1$ and $|z_{ij}| = 1$ if $(i, j) \neq (0, 0)$. So $z_{ij}^{-1} = \overline{z_{ij}}$ if $(i, j) \neq (0, 0)$. Assume for the moment that $z_{00} \neq 0$. Then, since $\det(\overline{z_{ij}}) = \overline{\det(z_{ij})} = 0$,
\[ \begin{aligned} \Phi_{\mu,\nu}(\mathbf{z}) &= |z_{00}|^{\mu+\nu} \, |\det(z_{ij}^{-1})|^{\mu} = |z_{00}|^{\mu+\nu} \, |\det(z_{ij}^{-1}) - \det(\overline{z_{ij}})|^{\mu} \\ &= |z_{00}|^{\mu+\nu} \, \bigl| (z_{00}^{-1} - \overline{z_{00}}) \cdot (\overline{z_{11}} \cdot \overline{z_{22}} - \overline{z_{12}} \cdot \overline{z_{21}}) \bigr|^{\mu} \\ &\leq 2^{\mu} |z_{00}|^{\nu} (1 - |z_{00}|^2)^{\mu}. \end{aligned} \]
This is also true if $z_{00} = 0$ since in that case one can prove directly that $|F(\mathbf{z})| \leq 2$. Computing the maximum of $f(x) = 2^{\mu} x^{\nu} (1 - x^2)^{\mu}$ on $[0, 1]$, we obtain
\[ m(\mu, \nu) \leq 2^{\mu} \Bigl( \frac{\nu}{2\mu + \nu} \Bigr)^{\nu/2} \Bigl( \frac{2\mu}{2\mu + \nu} \Bigr)^{\mu}. \]
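The upper bound for $m(\mu, \nu)$ just obtained can be explored numerically. The following sketch, which is only a floating-point sanity check and not part of the argument, minimizes the bound on the segment $2\mu + 3\nu = 1$ by a grid search and compares the result with the root in $(0, 1)$ of $\tfrac{1}{2}x^2 + x^6 = 1$, the equation quoted from Beukers and Zagier in the paragraph that follows. The function names and the grid size are arbitrary choices made for this illustration.

```python
import math


def bound(mu):
    """Upper bound for m(mu, nu) on the segment 2*mu + 3*nu = 1, 0 <= mu <= 1/2."""
    nu = (1.0 - 2.0 * mu) / 3.0
    total = 2.0 * mu + nu
    value = 2.0 ** mu
    # Factors of the form t**0 are skipped, which corresponds to reading 0**0 as 1.
    if nu > 0.0:
        value *= (nu / total) ** (nu / 2.0)
    if mu > 0.0:
        value *= (2.0 * mu / total) ** mu
    return value


def root_of_sextic(tol=1e-14):
    """Bisection for the root in (0, 1) of x**2 / 2 + x**6 = 1 (left side is increasing)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if 0.5 * mid**2 + mid**6 < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


steps = 100000
best_mu = min((0.5 * k / steps for k in range(steps + 1)), key=bound)
minimum = bound(best_mu)
x0 = root_of_sextic()

print("minimizing mu:", best_mu)
print("minimum of the bound:", minimum)
print("root of x^2/2 + x^6 = 1:", x0)
print("-log(minimum):", -math.log(minimum))
```

The grid minimum and the bisection root should agree to many decimal places, and the negative logarithm of the minimum should lie slightly above $0.09$.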
We now let $\mu, \nu$ vary, and determine the minimum of the right-hand side under the constraints $2\mu + 3\nu = 1$, $\mu \geq 0$, $\nu \geq 0$. Elementary calculus (see Beukers and Zagier (1997), Lemma 3.3) shows that this minimum is equal to the unique root $x_0 \in (0, 1)$ of $\tfrac{1}{2} x^2 + x^6 = 1$. Inserting this into (6.4.8), we obtain $h(\mathbf{y}_1) + h(\mathbf{y}_2) \geq -\log x_0 > 0.09$. This proves our lemma.

We proceed further with equation (6.4.1) and assume henceforth that $\Gamma$ is a finitely generated subgroup of $\overline{\mathbb{Q}}^* \times \overline{\mathbb{Q}}^*$ of rank $r$. Then there exist an algebraic number field $K$ and a finite set of places $S$ of $K$ containing the infinite places, such that $\Gamma \subseteq O_S^* \times O_S^*$. Let $[K : \mathbb{Q}] = d$, $|S| = s$. We denote elements of $\mathbb{R}^{2s}$ as $\mathbf{u} = (u_{iv} : v \in S,\ i = 1, 2)$, and define a homomorphism from $\Gamma$ to the additive group of $\mathbb{R}^{2s}$ by
\[
\varphi : (x_1, x_2) \mapsto \left( \frac{\log |x_i|_v}{d} : v \in S,\ i = 1, 2 \right).
\]
Let $V \subseteq \mathbb{R}^{2s}$ be the real vector space spanned by $\varphi(\Gamma)$. Then $V$ has dimension $r$. By (6.3.8) we have
\[
\hat{h}(\mathbf{x}) = \|\varphi(\mathbf{x})\| \quad \text{for } \mathbf{x} \in \Gamma, \tag{6.4.9}
\]
where $\|\cdot\|$ is the norm on $\mathbb{R}^{2s}$ given by
\[
\|\mathbf{u}\| := \frac{1}{2} \sum_{v \in S} \sum_{i=1}^{2} |u_{iv}| \quad \text{for } \mathbf{u} = (u_{iv} : v \in S,\ i = 1, 2) \in \mathbb{R}^{2s}.
\]
Denote by $\mathcal{S}$ the image under $\varphi$ of the set of solutions of (6.4.1). We collect some properties of $\mathcal{S}$.

Lemma 6.4.7 For every $\mathbf{u} \in \mathcal{S}$ there are at most two solutions $\mathbf{x}$ of (6.4.1) such that $\varphi(\mathbf{x}) = \mathbf{u}$.

Proof. Let $v$ be an infinite place of $K$. Then there is an embedding $\sigma : K \to \mathbb{C}$ such that $|x|_v = |\sigma(x)|^{s(v)}$ for $x \in K$, where $s(v) = 1$ if $v$ is real, $s(v) = 2$ if $v$ is complex. Consider the solutions $\mathbf{x} = (x_1, x_2)$ of (6.4.1) with $\varphi(\mathbf{x}) = \mathbf{u}$. For these solutions, the absolute values $|\sigma(x_1)|$ and $|1 - \sigma(x_1)| = |\sigma(x_2)|$ have prescribed values, depending only on $\mathbf{u}$.
In geometric terms, $\sigma(x_1)$ is an intersection point of two given circles in the complex plane that depend on $\mathbf{u}$. This implies that for any given $\mathbf{u}$ there are at most two possibilities for $\sigma(x_1)$, hence for $\mathbf{x}$.

Lemma 6.4.8 The set $\mathcal{S}$ has the following properties:
(i) for any two distinct $\mathbf{u}_1, \mathbf{u}_2 \in \mathcal{S}$ we have
\[ \|\mathbf{u}_1\| \leq 2\|\mathbf{u}_2 - \mathbf{u}_1\| + \log 4; \]
(ii) for any two distinct $\mathbf{u}_1, \mathbf{u}_2 \in \mathcal{S}$ and any positive integer $N$, there is $M \in \{N, N + 1\}$ such that
\[ \|\mathbf{u}_1\| \leq \frac{2}{M+1} \|\mathbf{u}_2 - (2M + 1)\mathbf{u}_1\| + \log 64; \]
(iii) for any three distinct $\mathbf{u}_0, \mathbf{u}_1, \mathbf{u}_2 \in \mathcal{S}$ we have
\[ \|\mathbf{u}_1 - \mathbf{u}_0\| + \|\mathbf{u}_2 - \mathbf{u}_0\| > 0.09. \]

Proof. This is simply a combination of Lemmas 6.4.4–6.4.6, the inequality $h(\mathbf{x}) \leq \hat{h}(\mathbf{x}) \leq 2h(\mathbf{x})$ for $\mathbf{x} \in \overline{\mathbb{Q}}^* \times \overline{\mathbb{Q}}^*$, and (6.4.9).

Our strategy to prove Theorem 6.4.1 is to estimate the cardinality of $\mathcal{S}$, using (i)–(iii) of Lemma 6.4.8. Then in view of Lemma 6.4.7, we only have to multiply the upper bound for $|\mathcal{S}|$ by 2 to get an upper bound for the number of solutions of (6.4.1).

We cover $\mathcal{S}$ by cones and estimate the number of points in a cone. Let $\theta > 0$ be a parameter whose value will be specified later. By Lemma 6.3.4, there is a set $E \subset \{\mathbf{e} \in V : \|\mathbf{e}\| = 1\}$, with
\[ |E| \leq (1 + 2\theta^{-1})^r \tag{6.4.10} \]
such that for every non-zero $\mathbf{u} \in V$ there is $\mathbf{e} \in E$ for which
\[ \bigl\| \|\mathbf{u}\|^{-1} \mathbf{u} - \mathbf{e} \bigr\| \leq \theta. \tag{6.4.11} \]
For $\mathbf{e} \in E$ denote by $\mathcal{S}_{\mathbf{e}}$ the set of $\mathbf{u} \in \mathcal{S}$ with (6.4.11). Notice that every $\mathbf{u} \in \mathcal{S}_{\mathbf{e}}$ can be written as
\[ \mathbf{u} = \|\mathbf{u}\| \mathbf{e} + \mathbf{u}' \quad \text{with } \|\mathbf{u}'\| \leq \theta \|\mathbf{u}\|. \tag{6.4.12} \]

Lemma 6.4.9 Let $\mathbf{e} \in E$, $0 < \theta < \tfrac{1}{9}$.
(i) For any two distinct $\mathbf{u}_1, \mathbf{u}_2 \in \mathcal{S}_{\mathbf{e}}$ with $\|\mathbf{u}_2\| \geq \|\mathbf{u}_1\| \geq \frac{\log 16}{1 - 9\theta}$ we have $\|\mathbf{u}_2\| \geq \tfrac{5}{4} \|\mathbf{u}_1\|$.
(ii) For any two distinct $\mathbf{u}_1, \mathbf{u}_2 \in \mathcal{S}_{\mathbf{e}}$ with $\|\mathbf{u}_2\| \geq \|\mathbf{u}_1\| \geq \frac{\log 64}{1 - 9\theta}$ we have $\|\mathbf{u}_2\| < 10\theta^{-1} \|\mathbf{u}_1\|$.
(iii) The set of $\mathbf{u} \in \mathcal{S}_{\mathbf{e}}$ with $\|\mathbf{u}\| \geq \frac{\log 16}{1 - 9\theta}$ has cardinality at most $3 + [\log(10\theta^{-1}) / \log(5/4)]$.

Proof. Part (i) is an elementary gap principle.
Part (ii) is the more involved result, based on the polynomial identities from Lemma 6.4.2. Part (iii) follows from (i) and (ii).

We first prove (i) and (ii). Let $\mathbf{u}_1, \mathbf{u}_2$ be distinct elements of $\mathcal{S}_{\mathbf{e}}$ with $\|\mathbf{u}_2\| \geq \|\mathbf{u}_1\|$. Put $\lambda_i := \|\mathbf{u}_i\|$ for $i = 1, 2$. By (6.4.12), we have $\mathbf{u}_i = \lambda_i \mathbf{e} + \mathbf{u}_i'$ with $\|\mathbf{u}_i'\| \leq \theta \lambda_i$ for $i = 1, 2$.

Assume that $\lambda_2 < \tfrac{5}{4} \lambda_1$. Then by property (i) of Lemma 6.4.8,
\[
\begin{aligned}
\lambda_1 &\leq 2\|(\lambda_2 - \lambda_1)\mathbf{e} + \mathbf{u}_2' - \mathbf{u}_1'\| + \log 4 \\
&\leq 2(\lambda_2 - \lambda_1 + \theta\lambda_2 + \theta\lambda_1) + \log 4 < \left( \tfrac{1}{2} + \tfrac{9}{2}\theta \right) \lambda_1 + \log 4.
\end{aligned}
\]
Hence $\lambda_1 < \frac{\log 16}{1 - 9\theta}$. This implies (i).

Next, assume that $\lambda_2 \geq 10\theta^{-1} \lambda_1$. Let $N$ be the positive integer with $2N + 1 \leq \lambda_2/\lambda_1 < 2N + 3$, and let $M \in \{N, N + 1\}$ be the integer from Lemma 6.4.8 (ii). Thus, $|\lambda_2 - (2M + 1)\lambda_1| \leq 2\lambda_1$, and moreover,
\[ M > \frac{4}{\theta}. \]
It follows that
\[
\begin{aligned}
\lambda_1 &\leq \frac{2}{M+1} \|(\lambda_2 - (2M + 1)\lambda_1)\mathbf{e} + \mathbf{u}_2' - (2M + 1)\mathbf{u}_1'\| + \log 64 \\
&\leq \frac{2}{M+1} \bigl( 2\lambda_1 + \lambda_2\theta + (2M + 1)\lambda_1\theta \bigr) + \log 64 \\
&\leq \frac{2}{M+1} \bigl( 2 + (2M + 3)\theta + (2M + 1)\theta \bigr) \lambda_1 + \log 64 \\
&= \left( \frac{4}{M+1} + 8\theta \right) \lambda_1 + \log 64 < 9\theta\lambda_1 + \log 64,
\end{aligned}
\]
implying $\lambda_1 < \frac{\log 64}{1 - 9\theta}$. This proves (ii).

We next prove (iii). We first consider the points $\mathbf{u} \in \mathcal{S}_{\mathbf{e}}$ with
\[ \frac{\log 16}{1 - 9\theta} \leq \|\mathbf{u}\| < \frac{\log 64}{1 - 9\theta}. \tag{6.4.13} \]
Let $\mathbf{u}_1, \mathbf{u}_2, \ldots$ be these points, ordered such that $\|\mathbf{u}_1\| \leq \|\mathbf{u}_2\| \leq \cdots$. Then by (i), we have for the $n$-th point in this sequence that $\|\mathbf{u}_n\| \geq (5/4)^{n-1} \|\mathbf{u}_1\|$. Hence $(5/4)^{n-1} < (\log 64)/(\log 16) = 3/2$, implying $n \leq 2$. So $\mathcal{S}_{\mathbf{e}}$ has at most two points $\mathbf{u}$ with (6.4.13).

Next, we count the points $\mathbf{u} \in \mathcal{S}_{\mathbf{e}}$ with
\[ \|\mathbf{u}\| \geq \frac{\log 64}{1 - 9\theta}. \tag{6.4.14} \]
Similarly as above, we order these points in a sequence $\mathbf{u}_1, \mathbf{u}_2, \ldots$ such that $\|\mathbf{u}_1\| \leq \|\mathbf{u}_2\| \leq \cdots$. Then again, by (i), we have for the $n$-th point in this sequence that $\|\mathbf{u}_n\| \geq (5/4)^{n-1} \|\mathbf{u}_1\|$. On the other hand, by (ii) we have $\|\mathbf{u}_n\| < 10\theta^{-1} \|\mathbf{u}_1\|$. Hence $(5/4)^{n-1} < 10\theta^{-1}$.
Thus, we obtain an upper bound $1 + [\log(10\theta^{-1}) / \log(5/4)]$ for the number of $\mathbf{u} \in \mathcal{S}_{\mathbf{e}}$ with (6.4.14). Combined with the upper bound 2 for the number of points with (6.4.13), this gives (iii).

Proof of Theorem 6.4.1. Let $0 < \theta < \tfrac{1}{9}$. We divide $\mathcal{S}$ into large points, i.e., those with $\|\mathbf{u}\| \geq \frac{\log 16}{1 - 9\theta}$, and small points, i.e., those with $\|\mathbf{u}\| < \frac{\log 16}{1 - 9\theta}$. Combining the upper bound (6.4.10) for $|E|$ with (iii) of Lemma 6.4.9, we see that $\mathcal{S}$ has at most
\[ \left( 3 + \left[ \frac{\log(10\theta^{-1})}{\log(5/4)} \right] \right) \cdot (1 + 2\theta^{-1})^r \]
large points. To estimate the number of small points of $\mathcal{S}$, we observe that by Lemma 6.4.8 (iii), for any $\mathbf{u}_0 \in \mathcal{S}$, the set
\[ \{ \mathbf{u} \in \mathcal{S} : \|\mathbf{u} - \mathbf{u}_0\| \leq 0.045 \} \]
has cardinality at most 2. By Lemma 6.3.4, the set of small points of $\mathcal{S}$ can be covered by at most
\[ \left( 1 + \frac{2 \log 16}{0.045(1 - 9\theta)} \right)^r \]
such sets. Hence $\mathcal{S}$ has at most
\[ 2 \left( 1 + \frac{2 \log 16}{0.045(1 - 9\theta)} \right)^r \]
small points. We now choose $\theta$ such that $\theta^{-1} = (\log 16) / \bigl( 0.045(1 - 9\theta) \bigr)$, i.e., $\theta^{-1} = 9 + (\log 16)/0.045 = 70.613\ldots$, add the upper bounds for the number of large points and the number of small points of $\mathcal{S}$ obtained above, and finally multiply by 2 to get an upper bound for the number of solutions of (6.4.1). The resulting bound is $68 \times 143^r$, which is smaller than the bound stated in Theorem 6.4.1.

6.5 Proof of Theorem 6.1.6

Let $K$ be a field of characteristic 0, $a_1, a_2 \in K^*$, and $\Gamma$ a subgroup of $K^* \times K^*$ of finite rank $r$. We consider equations
\[ a_1 x_1 + a_2 x_2 = 1 \quad \text{in } (x_1, x_2) \in \Gamma \tag{6.1.6} \]
having at least three distinct solutions, and we have to show that there are at most $B(r)$ possibilities for the $\Gamma$-equivalence class of $(a_1, a_2)$, where $B(r)$ is given by (6.1.7). Thus, let $(u_1, u_2)$, $(v_1, v_2)$, $(w_1, w_2)$ be three distinct solutions of (6.1.6). Then
\[ \begin{vmatrix} 1 & u_1 & u_2 \\ 1 & v_1 & v_2 \\ 1 & w_1 & w_2 \end{vmatrix} = 0, \]
i.e.
\[ v_1 w_2 - v_2 w_1 + u_2 w_1 - u_1 w_2 + u_1 v_2 - u_2 v_1 = 0 \tag{6.5.1} \]
and
\[ \text{each } 2 \times 2 \text{ subdeterminant of the above determinant is } \neq 0. \tag{6.5.2} \]
A vanishing subsum of the left-hand side of (6.5.1) is called minimal if none of the proper subsums of this subsum is 0. We have to distinguish various cases depending on how (6.5.1) splits into minimal vanishing subsums. Clearly, two possible splittings that can be transformed into each other by permuting $(u_1, u_2)$, $(v_1, v_2)$, $(w_1, w_2)$ or interchanging the indices $(1, 2)$ can be treated in the same manner. Notice that, in this way, one can derive at most 12 splittings from a given splitting. More precisely, after permuting $(u_1, u_2)$, $(v_1, v_2)$, $(w_1, w_2)$ or interchanging $(1, 2)$, we are left with the following cases:

(I) no proper subsum of the left-hand side of (6.5.1) vanishes,
(II) $v_1 w_2 + u_2 w_1 = 0$, $\ -v_2 w_1 - u_1 w_2 + u_1 v_2 - u_2 v_1 = 0$,
(III) $v_1 w_2 - v_2 w_1 + u_2 w_1 = 0$, $\ -u_1 w_2 + u_1 v_2 - u_2 v_1 = 0$,
(IV) $v_1 w_2 + u_2 w_1 + u_1 v_2 = 0$, $\ -v_2 w_1 - u_1 w_2 - u_2 v_1 = 0$,
(V) $v_1 w_2 + u_2 w_1 - u_2 v_1 = 0$, $\ -v_2 w_1 - u_1 w_2 + u_1 v_2 = 0$.

Other splittings into minimal vanishing subsums are in conflict with (6.5.2).

We define the quantities
\[ y_1 := v_1/u_1, \quad y_2 := v_2/u_2, \quad z_1 := w_1/u_1, \quad z_2 := w_2/u_2. \]
The pair $(y_1, y_2)$ determines uniquely the $\Gamma$-equivalence class of $(a_1, a_2)$, since $(a_1 u_1, a_2 u_2)$ is the unique solution $(\xi_1, \xi_2)$ to $\xi_1 + \xi_2 = 1$, $y_1 \xi_1 + y_2 \xi_2 = 1$. Likewise, $(z_1, z_2)$ determines uniquely the $\Gamma$-equivalence class of $(a_1, a_2)$.

Case I. We rewrite (6.5.1) (by dividing by $u_2 v_1$) as
\[ z_2 - \frac{y_2 z_1}{y_1} + \frac{z_1}{y_1} - \frac{z_2}{y_1} + \frac{y_2}{y_1} = 1. \]
Let $\Gamma'$ be the image of $\Gamma \times \Gamma$ under the group homomorphism
\[ ((y_1, y_2), (z_1, z_2)) \mapsto \left( z_2, \frac{y_2 z_1}{y_1}, \frac{z_1}{y_1}, \frac{z_2}{y_1}, \frac{y_2}{y_1} \right). \]
Then $\Gamma'$ has rank at most $2r$, and $\left( z_2, \ldots, \frac{y_2}{y_1} \right)$ is a non-degenerate solution of
\[ x_1 - x_2 + x_3 - x_4 + x_5 = 1 \quad \text{in } (x_1, \ldots, x_5) \in \Gamma'. \]
By Theorem 6.1.3 there are at most $A(5, 2r)$ possibilities for $\left( z_2, \ldots, \frac{y_2}{y_1} \right)$. Such a tuple determines $z_2$ and $\frac{y_2 z_1}{y_1} \left( \frac{y_2}{y_1} \right)^{-1} = z_1$, hence the $\Gamma$-equivalence class of $(a_1, a_2)$. So case I gives rise to at most $A(5, 2r)$ possible $\Gamma$-equivalence classes of pairs $(a_1, a_2)$.

Case II. This implies
\[ \frac{y_1 z_2}{z_1} = -1, \qquad -\frac{y_2 z_1}{y_1} - \frac{z_2}{y_1} + \frac{y_2}{y_1} = 1. \]
Theorem 6.1.3 implies that we have at most $A(3, 2r)$ possibilities for the triple $\left( -\frac{y_2 z_1}{y_1}, -\frac{z_2}{y_1}, \frac{y_2}{y_1} \right)$ (using again the argument based on a homomorphic image of $\Gamma \times \Gamma$). In combination with $\frac{y_1 z_2}{z_1} = -1$ this tuple determines
\[ \frac{y_1 z_2}{z_1} \cdot \frac{y_2 z_1}{y_1} \cdot \left( \frac{z_2}{y_1} \right)^{-1} = y_1 y_2, \]
hence $y_1^2 = y_1 y_2 \cdot (y_2/y_1)^{-1}$. This leads to two possibilities for $(y_1, y_2)$, hence two possible $\Gamma$-equivalence classes for $(a_1, a_2)$. So case II gives rise to at most $2A(3, 2r)$ possible $\Gamma$-equivalence classes of pairs $(a_1, a_2)$.

Case III. This implies
\[ -\frac{y_1 z_2}{z_1} + y_2 = 1, \qquad -\frac{z_2}{y_1} + \frac{y_2}{y_1} = 1. \]
By Theorem 6.1.3 we have at most $A(2, 2r)^2$ possibilities for $\left( \frac{y_1 z_2}{z_1}, y_2, \frac{z_2}{y_1}, \frac{y_2}{y_1} \right)$. Each such tuple determines uniquely the pair $(y_1, y_2)$, hence the $\Gamma$-equivalence class of $(a_1, a_2)$. So case III gives rise to at most $A(2, 2r)^2$ possible $\Gamma$-equivalence classes of pairs $(a_1, a_2)$.

Case IV. This implies
\[ -\frac{y_1 z_2}{y_2} - \frac{z_1}{y_2} = 1, \qquad -\frac{y_2 z_1}{y_1} - \frac{z_2}{y_1} = 1. \]
According to Theorem 6.1.3, there are at most $A(2, 2r)^2$ possibilities for the tuple $\left( \frac{y_1 z_2}{y_2}, \frac{z_1}{y_2}, \frac{y_2 z_1}{y_1}, \frac{z_2}{y_1} \right)$. From this tuple we can compute
\[ \frac{y_2}{y_1 z_2} \cdot \frac{y_2}{z_1} \cdot \frac{y_2 z_1}{y_1} \cdot \frac{z_2}{y_1} = \left( \frac{y_2}{y_1} \right)^3. \]
This gives three possibilities for $\frac{y_2}{y_1}$, hence for $(z_1, z_2)$, and hence for the $\Gamma$-equivalence class of $(a_1, a_2)$. Thus in case IV there are at most $3A(2, 2r)^2$ possible $\Gamma$-equivalence classes of pairs $(a_1, a_2)$.

Case V. This implies
\[ z_2 + \frac{z_1}{y_1} = 1, \qquad z_1 + \frac{z_2}{y_2} = 1. \]
Theorem 6.1.3 implies that we have at most $A(2, 2r)^2$ possibilities for the tuple $\left( z_2, \frac{z_1}{y_1}, z_1, \frac{z_2}{y_2} \right)$, hence for $(z_1, z_2)$, and hence for the $\Gamma$-equivalence class of $(a_1, a_2)$. So in case V we have at most $A(2, 2r)^2$ possibilities for the $\Gamma$-equivalence class of $(a_1, a_2)$.

By adding the upper bounds for the numbers of possible $\Gamma$-equivalence classes of pairs $(a_1, a_2)$ found in cases I–V, and multiplying this with the number of permutations of $(u_1, u_2)$, $(v_1, v_2)$, $(w_1, w_2)$ and of the indices 1, 2, we obtain that the number of $\Gamma$-equivalence classes of pairs $(a_1, a_2) \in K^* \times K^*$ such that equation (6.1.6) has more than two solutions is at most
\[ 12 \bigl( A(5, 2r) + 2A(3, 2r) + 5A(2, 2r)^2 \bigr) = B(r). \]
This proves Theorem 6.1.6.

6.6 Proofs of Theorems 6.1.7 and 6.1.8

Proof of Theorem 6.1.7. We follow Konyagin and Soundararajan (2007). Recall that we are considering the equation
\[ x_1 + x_2 = 1 \quad \text{in } x_1, x_2 \in \{ \pm p_1^{z_1} \cdots p_t^{z_t} : z_1, \ldots, z_t \in \mathbb{Z} \}, \tag{6.1.8} \]
where $S = \{p_1, \ldots, p_t\}$ is a set of distinct primes. We intend to show that for every $\beta < 2 - \sqrt{2}$ there are sets of primes $S$ of arbitrarily large cardinality $t$ such that the number $N(S)$ of solutions of (6.1.8) is at least $\exp(t^{\beta})$.

Let $y$ be a large real number and fix real numbers $\beta, \gamma$ with $0 < \beta, \gamma < 1$, which will be chosen optimally later. We introduce two sets $\mathcal{L}$, $\mathcal{M}$.
The set $\mathcal{L}$ is the set of numbers that are the product of exactly $[y^{\beta}]$ distinct primes from the interval $[y/2, y]$, while $\mathcal{M}$ is the set of numbers that are the product of exactly $[\gamma y^{\beta}]$ distinct primes from $[y/4, y/2)$. Thus, the integers in $\mathcal{L}$ are coprime to those in $\mathcal{M}$. Using the Prime Number Theorem and
\[ \frac{\log \binom{a}{b}}{b \log(a/b)} \to 1 \quad \text{as } b, \tfrac{a}{b} \to \infty \]
(which follows from Stirling's Formula) we obtain that the set $\mathcal{L}$ has cardinality
\[ |\mathcal{L}| = \binom{\pi(y) - \pi(y/2)}{[y^{\beta}]} = L^{1-\beta+o(1)}, \tag{6.6.1} \]
where $L := y^{[y^{\beta}]}$ and here and below, $o(1)$ is used to denote functions of $y$ that tend to 0 as $y \to \infty$. In a similar manner,
\[ |\mathcal{M}| = L^{\gamma(1-\beta)+o(1)}. \tag{6.6.2} \]
The idea is to find a positive integer $u$ for which there are many triples $(l_1, l_2, m)$ such that $l_1, l_2 \in \mathcal{L}$, $m \in \mathcal{M}$ and $(l_1 - l_2)/m = u$, and then take for $S$ the set of primes in $[y/4, y]$ and those dividing $u$. Then the pairs $(um/l_1, l_2/l_1)$ yield many solutions to (6.1.8).

We first count the number of triples $(l_1, l_2, m)$ with
\[ l_1 \equiv l_2 \pmod{m}, \quad l_1, l_2 \in \mathcal{L}, \quad m \in \mathcal{M}, \quad l_1 > l_2. \tag{6.6.3} \]
Let $m \in \mathcal{M}$. For $a \in \mathbb{Z}$, denote by $r(\mathcal{L}, a, m)$ the number of elements in $\mathcal{L}$ that lie in the residue class $a \pmod{m}$. By the Cauchy–Schwarz inequality,
\[ \sum_{a=1}^{m} r(\mathcal{L}, a, m)^2 \geq \frac{1}{m} \left( \sum_{a=1}^{m} r(\mathcal{L}, a, m) \right)^2 = \frac{|\mathcal{L}|^2}{m}. \]
The left-hand side counts the pairs $(l_1, l_2)$ in $\mathcal{L}$ that are congruent modulo $m$. Among these, there are $|\mathcal{L}|$ trivial solutions with $l_1 = l_2$. Note that $m \leq y^{[\gamma y^{\beta}]} \leq L^{\gamma}$. We assume that $\gamma < 1 - \beta$. Then in view of (6.6.1), the integer $m$ is of smaller order of magnitude than $|\mathcal{L}|$. Deleting the pairs with $l_1 = l_2$ and using symmetry, we infer that for any fixed $m \in \mathcal{M}$, the number of pairs $l_1, l_2 \in \mathcal{L}$ with $l_1 > l_2$ that are congruent modulo $m$ is bounded below by
\[ \frac{|\mathcal{L}|^2}{2m} - |\mathcal{L}| = L^{2(1-\beta)-\gamma+o(1)}. \]
Now, using (6.6.2), and summing over the elements of $\mathcal{M}$, we see that the number of triples $(l_1, l_2, m)$ with (6.6.3) is at least
\[ L^{2(1-\beta)-\beta\gamma+o(1)}. \]
The elements of $\mathcal{L}$ are all $\leq y^{[y^{\beta}]} = L$, and the integers of $\mathcal{M}$ are all $\geq (y/4)^{[\gamma y^{\beta}]}$, so the integers $(l_1 - l_2)/m$ with $l_1, l_2, m$ satisfying (6.6.3) are all bounded above by $L^{1-\gamma+o(1)}$. If we assume that $2(1 - \beta) - \beta\gamma > 1 - \gamma$, or equivalently,
\[ (2 + \gamma)(1 - \beta) > 1, \]
we see that there is a positive integer $u \leq L^{1-\gamma+o(1)}$ with the property that there are at least
\[ L^{2(1-\beta)-\beta\gamma-1+\gamma+o(1)} = L^{(2+\gamma)(1-\beta)-1+o(1)} \]
triples $(l_1, l_2, m)$ with $l_1, l_2 \in \mathcal{L}$, $m \in \mathcal{M}$ and $(l_1 - l_2)/m = u$. Notice that $\gcd(l_1, l_2)$ is a divisor of $u$. By elementary number theory (see, e.g., Hardy and Wright (1980), chapter XVIII, Theorem 317), the number $u$ has at most $u^{O(1/\log\log u)} = L^{o(1)}$ divisors. Hence there are a divisor $v$ of $u$, and at least $L^{(2+\gamma)(1-\beta)-1+o(1)}$ triples $(l_1, l_2, m)$ such that $l_1, l_2$ are coprime integers composed of primes from $[y/2, y]$, $m$ is an integer composed of primes from $[y/4, y)$, and $(l_1 - l_2)/m = v$. Let $S$ consist of the primes from $[y/4, y]$ and the primes dividing $v$. Then these triples $(l_1, l_2, m)$ yield at least $L^{(2+\gamma)(1-\beta)-1+o(1)}$ solutions $(mv/l_1, l_2/l_1)$ to (6.1.8).

In the course of the proof we assumed that $\gamma < 1 - \beta$ and $(2 + \gamma)(1 - \beta) > 1$. Such a number $\gamma$ exists precisely if $\beta < 2 - \sqrt{2}$. Since $v \leq u \leq L^{1-\gamma+o(1)}$, the cardinality of $S$ is at most
\[ \pi(y) - \pi(y/4) + \log v < y \]
for $y$ sufficiently large. Further, for $y$ sufficiently large, we have
\[ L^{(2+\gamma)(1-\beta)-1+o(1)} = y^{[y^{\beta}]((2+\gamma)(1-\beta)-1+o(1))} > \exp(y^{\beta}). \]
This completes the proof of Theorem 6.1.7.

Proof of Theorem 6.1.8.
Let $\beta < 2 - \sqrt{2}$ and choose $t$ and a set of primes $S = \{p_1, \ldots, p_t\}$ such that $N(S) \geq \exp(t^{\beta}) =: A(t)$. We consider
\[ x_1 + \cdots + x_n = 1 \quad \text{in } x_1, \ldots, x_n \in \{ \pm p_1^{z_1} \cdots p_t^{z_t} : z_1, \ldots, z_t \in \mathbb{Z} \}. \tag{6.1.9} \]
Recall that $g(n, S)$ denotes the minimal integer $g$ such that there exists a non-zero polynomial $P \in \mathbb{C}[X_1, \ldots, X_n]$ of total degree $g$, which is not divisible by $X_1 + \cdots + X_n - 1$, and which has the property that $P(x_1, \ldots, x_n) = 0$ for every solution $(x_1, \ldots, x_n)$ of (6.1.9). We prove by induction on $n$ that $g(n, S) \geq A(t)$ for $n \geq 2$.

First, let $n = 2$. Let $P \in \mathbb{C}[X_1, X_2]$ be a polynomial of total degree $g(2, S)$, not divisible by $X_1 + X_2 - 1$, such that $P(x_1, x_2) = 0$ for every solution $(x_1, x_2)$ of (6.1.8). Let $Q(X) := P(X, 1 - X)$. Then $Q(x_1) = 0$ for every $x_1$ for which there exists $x_2$ such that $(x_1, x_2)$ is a solution of (6.1.8). Hence $g(2, S) = \deg P = \deg Q \geq A(t)$.

Suppose now that $n \geq 3$, and that $g(n - 1, S) \geq A(t)$ is known to hold. Let $U$ be the set of tuples
\[ (x_1, \ldots, x_n) = (y_1, \ldots, y_{n-2}, y_{n-1} x_1, y_{n-1} x_2), \]
where $(y_1, \ldots, y_{n-1})$ runs through the solutions of
\[ y_1 + \cdots + y_{n-1} = 1, \quad y_1, \ldots, y_{n-1} \in \{ \pm p_1^{z_1} \cdots p_t^{z_t} : z_1, \ldots, z_t \in \mathbb{Z} \} \tag{6.6.4} \]
and where $(x_1, x_2)$ runs through the solutions of (6.1.8). By construction, the tuples in $U$ satisfy
\[ y_1 + \cdots + y_{n-2} + y_{n-1}(x_1 + x_2) = 1 \]
and so they are solutions of (6.1.9).

Let $P \in \mathbb{C}[X_1, \ldots, X_n]$ be a polynomial of total degree $g(n, S)$, not divisible by $X_1 + \cdots + X_n - 1$, such that $P(x_1, \ldots, x_{n-1}, x_n) = 0$ for every solution $(x_1, \ldots, x_n)$ of (6.1.9). Put
\[ Q(X_1, \ldots, X_{n-1}) := P(X_1, \ldots, X_{n-1}, 1 - X_1 - \cdots - X_{n-1}). \]
Then $Q$ has total degree $g(n, S)$, and is not identically 0. So we have to prove that $Q$ has total degree at least $A(t)$. Clearly, we have
\[ Q(y_1, \ldots, y_{n-2}, y_{n-1} x_1) = 0 \tag{6.6.5} \]
for every solution $(y_1, \ldots, y_{n-1})$ of (6.6.4) and every solution $(x_1, x_2)$ of (6.1.8). Define a new polynomial in $n - 1$ variables,
\[ Q^*(Y_1, \ldots, Y_{n-2}, Z) := Q(Y_1, \ldots, Y_{n-2}, Z \cdot (1 - Y_1 - \cdots - Y_{n-2})). \tag{6.6.6} \]
Then $Q^*$ is not identically zero since $Q$ is not identically zero and since the change of variables
\[ (X_1, \ldots, X_{n-1}) \mapsto (Y_1, \ldots, Y_{n-2}, Z \cdot (1 - Y_1 - \cdots - Y_{n-2})) \]
is invertible. Now from (6.6.6), (6.6.5), and $y_{n-1} = 1 - y_1 - \cdots - y_{n-2}$, it follows that
\[ Q^*(y_1, \ldots, y_{n-2}, x_1) = 0 \tag{6.6.7} \]
for every solution $(y_1, \ldots, y_{n-1})$ of (6.6.4) and every solution $(x_1, x_2)$ of (6.1.8). We distinguish two cases.

Case I. There is a solution $(x_1, x_2)$ of (6.1.8) such that the polynomial
\[ Q^*_{x_1}(Y_1, \ldots, Y_{n-2}) := Q^*(Y_1, \ldots, Y_{n-2}, x_1) \]
is not identically zero. Then by (6.6.7), $Q^*_{x_1}$ is a non-zero polynomial with $Q^*_{x_1}(y_1, \ldots, y_{n-2}) = 0$ for every solution $(y_1, \ldots, y_{n-1})$ of (6.6.4). Hence $Q^*_{x_1}$ has total degree $\geq g(n - 1, S) \geq A(t)$. Now by (6.6.6) this implies that the total degree of $Q$ is at least $A(t)$.

Case II. The polynomial $Q^*_{x_1}(Y_1, \ldots, Y_{n-2})$ is identically zero for every solution $(x_1, x_2)$ of (6.1.8). Then since (6.1.8) has at least $A(t)$ solutions, the polynomial $Q^*$ must have degree at least $A(t)$ in the variable $Z$. By (6.6.6) this implies that $Q$ has degree at least $A(t)$ in the variable $X_{n-1}$. So again we conclude that the total degree of $Q$ is at least $A(t)$.

This completes our induction step and our proof.
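To make the quantity $N(S)$ from Theorem 6.1.7 concrete, the following Python sketch enumerates solutions of (6.1.8) for a small set of primes by brute force. It searches only a finite box of exponents $|z_i| \leq$ `max_exp`, so in general it yields a lower bound for $N(S)$ rather than $N(S)$ itself. The function names and the box size are assumptions made for this illustration and do not come from the text.

```python
from fractions import Fraction
from itertools import product


def s_units_in_box(primes, max_exp):
    """All +-p1^z1 * ... * pt^zt with |z_i| <= max_exp.

    This is a finite box inside the (infinite) group of S-units.
    """
    units = set()
    for exps in product(range(-max_exp, max_exp + 1), repeat=len(primes)):
        value = Fraction(1)
        for p, z in zip(primes, exps):
            value *= Fraction(p) ** z
        units.add(value)
        units.add(-value)
    return units


def solutions_in_box(primes, max_exp):
    """Pairs (x1, x2) with x1 + x2 = 1 and both entries in the box."""
    units = s_units_in_box(primes, max_exp)
    return sorted((x, 1 - x) for x in units if (1 - x) in units)


if __name__ == "__main__":
    sols = solutions_in_box((2, 3), 3)
    print(len(sols))  # 21
    for x1, x2 in sols:
        print(x1, "+", x2, "= 1")
```

For $S = \{2, 3\}$ and `max_exp = 3` the search returns 21 pairs. All of them arise from the four relations $1 + 1 = 2$, $1 + 2 = 3$, $1 + 3 = 4$, $1 + 8 = 9$ by permuting the terms and dividing through, for example $(1/9, 8/9)$ and $(9, -8)$.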
6.7 Notes

We recall some history concerning the number of solutions of unit equations and discuss some related results.

• Lewis and Mahler (1961) obtained an explicit upper bound for the number of solutions of the S-unit equation over $\mathbb{Q}$,
\[ x_1 + x_2 = 1 \quad \text{in } x_1, x_2 \in \mathbb{Z}_S^*, \tag{6.7.1} \]
where $S = \{\infty, p_1, \ldots, p_t\}$ with distinct primes $p_1, \ldots, p_t$ and $\mathbb{Z}_S^* = \{ \pm p_1^{z_1} \cdots p_t^{z_t} : z_i \in \mathbb{Z} \}$ is the corresponding group of S-units. But their bound depends on $p_1, \ldots, p_t$. Lewis and Mahler derived this by applying a general result of theirs on Thue–Mahler equations to equations of the type
\[ |a x^n + b y^n| = p_1^{z_1} \cdots p_t^{z_t} \quad \text{in } x, y, z_1, \ldots, z_t \in \mathbb{Z}. \]
In fact, as was unnoticed by Lewis and Mahler, applying their general result instead to $|xy(x + y)| = p_1^{z_1} \cdots p_t^{z_t}$ implies an upper bound $c^{t+1}$ for the number of solutions of (6.7.1), with $c$ an absolute constant independent of $p_1, \ldots, p_t$. A similar result was independently obtained by Silverman around 1984, by a different method (unpublished).

The above result was generalized and improved by Evertse (1984a) as follows. Let $K$ be an algebraic number field of degree $d$, $S$ a finite set of places of $K$ of cardinality $s$ containing the infinite places, and $a_1, a_2 \in K^*$. Then the equation
\[ a_1 x_1 + a_2 x_2 = 1 \quad \text{in } x_1, x_2 \in O_S^* \]
has at most $3 \times 7^{d+2s}$ solutions. We note that earlier Győry (1979), under certain assumptions concerning the S-norms of $a_1$ and $a_2$, obtained the better upper bound $4s + 1$ by means of the theory of logarithmic forms. These were the first upper bounds that depend only on $d$ and $s$, but not on the coefficients $a_1, a_2$.

• Schlickewei considered the equation
\[ a_1 x_1 + a_2 x_2 = 1 \quad \text{in } (x_1, x_2) \in \Gamma, \tag{6.1.6} \]
where $a_1, a_2$ are non-zero elements of an arbitrary field $K$ of characteristic 0, and $\Gamma$ is a subgroup of $K^* \times K^*$ of finite rank $r$. He derived a uniform upper bound for the number of solutions, depending only on $r$.
His unpublished result was later improved by Beukers and Schlickewei (1996) who obtained the upper bound 28(r+2) for the number of solutions. r First Poe (1997) alone in a special case, and then Poe together with Bombieri and Mueller (Bombieri, Mueller and Poe (1997)) developed a “cluster principle” for the solutions of (6.1.6), in the case that K is a number field and Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:29:39, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.008 166 Unit equations in several unknowns is again a subgroup of K ∗ × K ∗ of rank r. Here, a cluster is a set of solutions such that for any two solutions x1 , x2 in the cluster, the height h(x1 x−1 2 ) is small, and the cluster principle gives an upper bound for the number of such clusters. By combining this cluster principle with Baker-type upper bounds for the heights of the solutions of (6.1.6), Bombieri et al. proved that (6.1.6) 2 has at most d 9r e86r solutions, where d = [K : Q]. Although this bound is much larger than that of Beukers and Schlickewei, the method of proof is very different, and it may be applicable to other situations. r In the special case when K is a number field and = OS∗ × OS∗ , where OS∗ is the group of S-units for some finite set of places S of K containing the infinite places, a weaker but effective version of Theorem 6.1.6 was established in Evertse, Győry, Stewart and Tijdeman (1988a). Using some earlier versions of Theorems 3.2.5, 3.2.7 and Corollary 4.1.5, due to Baker, van der Poorten and Győry, respectively, it was proved that apart from finitely many and effectively determinable OS∗ -equivalence classes of pairs (a1 , a2 ) ∈ K ∗ × K ∗ , the equation a1 x1 + a2 x2 = 1 has at most s + 1 solutions (x1 , x2 ) ∈ OS∗ × OS∗ , where s denotes the cardinality of S. 
Further, in the case when $S$ is the set of infinite places of $K$ and $a_1, a_2 \in \mathbb{Q}^*$, the following more precise result was obtained in Brindza and Győry (1990). For given coprime positive integers $a_1, a_2$, there are only finitely many and effectively determinable positive integers $c$ such that the equation $a_1 x_1 + a_2 x_2 = c$ has more than one solution (up to conjugacy) in $x_1, x_2 \in O_K^*$. The proof utilizes a simultaneous variant of Baker's method.

• Corvaja and Zannier (2006), and in greater generality Levin (2006), considered one-parameter families of S-unit equations
\[ a_1(t) x_1 + a_2(t) x_2 = c(t) \quad \text{in } t \in K,\ x_1, x_2 \in O_S^*, \tag{6.7.2} \]
where as before $K$ is a number field, $S$ a finite set of places of $K$ containing all infinite places, and where $a_1, a_2, c \in K[X]$ are given polynomials. In his paper, Levin proved, among other things, that if $(a_1, a_2, c)$ is a general triple of non-constant polynomials with $\deg a_1 + \deg a_2 = \deg c > 2$, then (6.7.2) has only finitely many solutions with $a_1(t) a_2(t) c(t) \neq 0$. Here "general" means that if we view triples $(a_1, a_2, c)$ as points in the affine space $K^{2 \deg c + 3}$, then the set of triples $(a_1, a_2, c)$ for which the above mentioned finiteness result does not hold is contained in a proper Zariski closed subset of $K^{2 \deg c + 3}$. Levin's proof allows us to effectively determine this Zariski closed subset. Notice that Levin's result provides many examples of S-unit equations that have no solutions. In his proof, Levin heavily uses the finiteness results of Corvaja and Zannier (2002a, 2004b) on S-integral points on curves and surfaces, which they derived from the Subspace Theorem. As a consequence, Levin's finiteness result on (6.7.2) is ineffective.
• As was mentioned in Section 4.7, Győry and Pintér (2008) considered over $\mathbb{Q}$ the three-parameter family of S-unit equations
\[ u_1^n x_1 + u_2^n x_2 = 1 \quad \text{in } u_1, u_2 \in \mathbb{Z} \setminus \{0\},\ n \geq 3,\ x_1, x_2 \in \mathbb{Z}_S^* \text{ with } \gcd(u_1 u_2, p_1 \cdots p_t) = 1, \tag{6.7.3} \]
where $p_1, \ldots, p_t$ are distinct rational primes, $S = \{\infty, p_1, \ldots, p_t\}$ and $\mathbb{Z}_S^*$ denotes the group of S-units in $\mathbb{Q}$. They showed that apart from finitely many and effectively computable pairs $(u_1^n, u_2^n)$, the equations under consideration have no solution in $x_1, x_2$.

• We now compare the above result with the special case $K = \mathbb{Q}$, $\Gamma = \mathbb{Z}_S^* \times \mathbb{Z}_S^*$ of Theorem 6.1.6 and with the remark occurring after that theorem. Further, we complete these results with a new one. For given $a_1, a_2 \in \mathbb{Q}^*$, consider the equation
\[ a_1 x_1 + a_2 x_2 = 1 \quad \text{in } x_1, x_2 \in \mathbb{Z}_S^*. \tag{6.7.4} \]
We call two pairs $(a_1, a_2), (b_1, b_2) \in \mathbb{Q}^* \times \mathbb{Q}^*$ S-equivalent if they are $\mathbb{Z}_S^* \times \mathbb{Z}_S^*$-equivalent, i.e., if there is $(\varepsilon_1, \varepsilon_2) \in \mathbb{Z}_S^* \times \mathbb{Z}_S^*$ such that $b_i = a_i \varepsilon_i$ for $i = 1, 2$. Then the number of solutions of (6.7.4) does not change if $(a_1, a_2)$ is replaced by an S-equivalent pair.

Theorem 6.7.1 The following assertions hold.
(i) There are only finitely many S-equivalence classes of pairs $(a_1, a_2)$ in $\mathbb{Q}^* \times \mathbb{Q}^*$ for which equation (6.7.4) has more than two solutions.
(ii) For each $N \in \{0, 1, 2\}$, there are infinitely many S-equivalence classes of pairs $(a_1, a_2) \in \mathbb{Q}^* \times \mathbb{Q}^*$ such that equation (6.7.4) has exactly $N$ solutions.

The assertion (i) is a special case of Theorem 6.1.6, hence is ineffective. The statement (ii) for $N = 2$ has been proved in a more general form after the enunciation of Theorem 6.1.6, while for $N = 0$ it is an immediate consequence of the above result concerning equation (6.7.3). We now give a sketch of the proof for $N = 1$. The proof of each case of (ii) is constructive.

Sketch of the proof of the case $N = 1$ of (ii). Let $A$ be a large integer, and $\mathcal{S}$ the set of integers composed of the primes $p_1, \ldots, p_t$.
Denote by $H(A)$ the set of pairs $(a, b)$ of relatively prime positive integers $a, b$ with $a, b \leq A$, and by $P(A)$ the set of those triples $(a, b, c)$ of positive integers $a, b, c$ for which $(a, b) \in H(A)$, $\gcd(ab, p_1 \cdots p_t) = 1$ and $a + b = c$. It is known that $H(A)$ has cardinality at least $c_1 A^2$, where $c_1$ is an effectively computable positive absolute constant. This implies that the cardinality of $P(A)$ is at least $c_2 A^2$. Here $c_2$ and $c_3, c_4, c_5$ below are effectively computable positive numbers depending only on $p_1, \ldots, p_t$. If $x, y, z$ is a solution of the equation
\[ ax + by = cz \quad \text{in } x, y, z \in \mathcal{S} \text{ with } \gcd(x, y, z) = 1 \tag{6.7.5} \]
for some $(a, b, c)$ in $P(A)$, then by Corollary 4.1.5
\[ \max(|x|, |y|, |z|) \leq c_3 A^{c_4}. \]
Thus the total number of triples $(x, y, z)$ with relatively prime $x, y, z \in \mathcal{S}$ for which there exists $(a, b, c)$ in $P(A)$ satisfying (6.7.5) is at most $c_5 (\log A)^{3t}$. If $(a, b, c) \in P(A)$ is such that (6.7.5) holds for some relatively prime $x, y, z$ in $\mathcal{S}$ and $(x, y, z)$ is not $(1, 1, 1)$ or $(-1, -1, -1)$, then
\[ (a, b, c) = (y - z, z - x, y - x)/d \]
with $d = \gcd(y - z, z - x, y - x)$. Hence $(a, b, c)$ is uniquely determined by $(x, y, z)$. Consequently, the number of $(a, b, c) \in P(A)$ for which up to proportionality $(1, 1, 1)$ is the only solution of (6.7.5) in $\mathcal{S}$ is at least
\[ c_2 A^2 - c_5 (\log A)^{3t}, \]
which tends to infinity as $A$ tends to infinity. One can inductively construct an infinite sequence of such $(a, b, c)$. For a triple $(a, b, c)$ of this kind, write $c = \sigma c_0$ with positive integers $\sigma$, $c_0$ such that $\sigma \in \mathcal{S}$, $\gcd(c_0, p_1 \cdots p_t) = 1$. Then $(1/\sigma, 1/\sigma)$ is the only solution of the equation
\[ (a/c_0) x_1 + (b/c_0) x_2 = 1 \quad \text{in } x_1, x_2 \in \mathbb{Z}_S^*. \]
Since $a$, $b$ and $c_0$ are pairwise relatively prime, the pairs $(a/c_0, b/c_0)$ under consideration are pairwise S-inequivalent. This proves the case $N = 1$ of (ii).

• We discussed above results that give bounds for the number of solutions of S-unit equations. Here, we consider equations of the form
\[ x_1 + x_2 = 1 \quad \text{in } x_1, x_2 \in O_K^*, \tag{6.7.6} \]
where $K$ is a number field. Recall that Evertse's result mentioned above gives an upper bound $3 \times 7^{2d+3}$ for the number of solutions of this equation. Grant (1996) gave examples of number fields $K$ of arbitrarily large degree $d$ such that (6.7.6) has $d^2$ solutions. In fact, Grant's examples were cyclotomic fields $\mathbb{Q}(e^{2\pi i/p})$ with $p$ a prime, and certain number fields arising from elliptic curves.

We can get much better upper bounds for the number of solutions of (6.7.6) if we impose some restrictions on $x_1, x_2$. For instance, Silverman (1995) proved that if $\varepsilon$ is a fixed element of $O_K^*$, then the equation $\varepsilon^m + y = 1$ has at most $d^{1+o(1)}$ solutions $m \in \mathbb{Z}$, $y \in O_K^*$.

• We now deal with the equation
\[ a_1 x_1 + \cdots + a_n x_n = 1 \quad \text{in } (x_1, \ldots, x_n) \in \Gamma, \tag{6.1.3} \]
where $a_1, \ldots, a_n$ are non-zero elements of a field $K$ of characteristic 0 and $\Gamma$ is a subgroup of finite rank of the $n$-fold direct product $(K^*)^n$. Already in the 1970s, Dubois and Rhin (1976) and independently Schlickewei (1977a) obtained finiteness results for (6.1.3) in the special case that $K = \mathbb{Q}$ and $\Gamma = (\mathbb{Z}_S^*)^n$ for some finite set of places $S$ of $\mathbb{Q}$, and also with a condition imposed on the solutions stronger than non-degeneracy. The general result that for arbitrary $K$ of characteristic 0 and $\Gamma$ of finite rank, equation (6.1.3) has only finitely many non-degenerate solutions, was proved in several steps in the 1980s.
Van der Poorten and Schlickewei in their unpublished preprint van der Poorten and Schlickewei (1982) and independently Evertse (1984b) proved that this equation has only finitely many non-degenerate solutions if $K$ is a number field and $\Gamma = (O_S^*)^n$ for some finite set of places $S$ of $K$. Also in their above mentioned preprint, van der Poorten and Schlickewei claimed a generalization of this to the case that $K$ is an arbitrary field of characteristic 0 and $\Gamma$ a finitely generated subgroup of $(K^*)^n$, but their proof was incomplete. In van der Poorten and Schlickewei (1991) they published the complete proof of their claim. Meanwhile, Evertse and Győry (1988b) gave a different proof of the claim of van der Poorten and Schlickewei, and showed that the number of non-degenerate solutions can be bounded above by a number depending only on $n$, $K$ and $\Gamma$ (which is not effectively computable by their method of proof). Further, Laurent (1984) developed some Kummer theory, which made it possible to extend the finiteness result on (6.1.3) from finitely generated groups to groups of finite rank.

• Schlickewei (1990) was the first to obtain an explicit upper bound for the number of non-degenerate solutions of (6.1.3) in the case that $K$ is a number field and $\Gamma = (O_S^*)^n$, where $S$ is a finite set of places of $K$, containing all infinite places. His bound was improved in Evertse (1995) to $(2^{35} n^2)^{n^3 s}$, where $s$ is the cardinality of $S$ (see also Subsection 9.5.2 of the present book). In the case where $K$ is a number field and $\Gamma = (O_S^*)^n$ this has not been improved so far. Building further on unpublished weaker results of the last two authors, Evertse, Schlickewei and Schmidt (2002) proved that if $K$ is any field of zero characteristic, $a_1, \ldots, a_n \in K^*$, and $\Gamma$ a subgroup of rank $r$ of $(K^*)^n$, then (6.1.3) has at most
\[
A(n, r) = \exp\bigl((6n)^{3n} (r + 1)\bigr)
\]
non-degenerate solutions. In Amoroso and Viada (2009) this was improved to
\[
A(n, r) = (8n)^{4n^4 (n + r + 1)}.
\]
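To get a feeling for how far apart the two bounds $A(n, r)$ quoted above are, one can compare their natural logarithms for small $n$ and $r$. The following is only an illustrative sketch: the function names are hypothetical, and floating-point arithmetic restricts it to small values of $n$.

```python
import math


def log_bound_ess(n: int, r: int) -> float:
    """Natural log of exp((6n)^(3n) * (r + 1)), the bound quoted from
    Evertse, Schlickewei and Schmidt (2002)."""
    return float((6 * n) ** (3 * n) * (r + 1))


def log_bound_av(n: int, r: int) -> float:
    """Natural log of (8n)^(4 n^4 (n + r + 1)), the bound quoted from
    Amoroso and Viada (2009)."""
    return 4 * n ** 4 * (n + r + 1) * math.log(8 * n)


if __name__ == "__main__":
    # Print both logarithms side by side for a few small cases.
    for n in (2, 3, 4):
        for r in (0, 1, 5):
            print(n, r, log_bound_ess(n, r), log_bound_av(n, r))
```

Working with logarithms avoids ever forming the bounds themselves, which are far too large to represent as floating-point numbers.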
• We consider the case that $\Gamma$ has rank 0, i.e., we consider the equation
\[
a_1 \zeta_1 + \cdots + a_n \zeta_n = 1 \quad \text{in roots of unity } \zeta_1, \ldots, \zeta_n, \tag{6.7.7}
\]
where $a_1, \ldots, a_n$ again lie in a field $K$ of characteristic 0. Results from Mann (1965) and Conway and Jones (1976) imply that if $a_1, \ldots, a_n \in \mathbb{Q}^*$, then for each non-degenerate solution $(\zeta_1, \ldots, \zeta_n)$ of (6.7.7), the lowest common multiple of the orders of $\zeta_1, \ldots, \zeta_n$ is $\le C(n)$ with $C(n)$ effectively computable in terms of $n$ only. Further, results from Schinzel (1988) and Dvornicich and Zannier (2000) imply that if $a_1, \ldots, a_n$ generate a number field $K$ of degree $d$, then for each non-degenerate solution of (6.7.7) the lowest common multiple of the orders of its components is bounded above by an effectively computable number $C(n, d)$ depending on $n$ and $d$ only. This implies that the non-degenerate solutions of (6.7.7) can be determined effectively, and it implies also that the number of non-degenerate solutions of (6.7.7) is bounded above by a number depending on $n$ and $d$ only. Schlickewei (1996b) considered equations (6.7.7) with coefficients $a_1, \ldots, a_n$ from an arbitrary field $K$ of characteristic 0 and obtained an upper bound $2^{4(n+1)!}$ for the number of non-degenerate solutions. This was improved by Evertse (1999) to $(n + 1)^{3(n+1)^2}$. The proofs of the two last mentioned results use only simple properties of cyclotomic fields.

• Evertse, Győry, Stewart and Tijdeman (1988a) proved the following result, which shows that Theorem 6.1.6 has no obvious generalization to equations in more than two unknowns. Let $K$ be a field of characteristic 0, $n \ge 3$, and $\Gamma$ a subgroup of $(K^*)^n$ of finite rank. Call two tuples of coefficients $(a_1, \ldots, a_n), (b_1, \ldots, b_n) \in (K^*)^n$ $\Gamma$-equivalent if $(a_1 b_1^{-1}, \ldots, a_n b_n^{-1}) \in \Gamma$. Then there are groups $\Gamma$ of finite rank such that for every $m > 0$ there are infinitely many $\Gamma$-equivalence classes of tuples $(a_1, \ldots, a_n) \in (K^*)^n$ with the property that (6.1.3) has at least $m$ non-degenerate solutions.

We give an easy construction different from that of Evertse et al. Choose $m$ points $(x_{i1}, \ldots, x_{i,n-1}) \in (K^*)^{n-1}$ such that $x_{i1} + \cdots + x_{i,n-1} = 1$ for $i = 1, \ldots, m$ and no proper subsums of the left-hand sides vanish. Let $\Gamma_1$ be the multiplicative group generated by the $x_{ij}$, for all $i = 1, \ldots, m$, $j = 1, \ldots, n - 1$. Then the equation
\[
x_1 + \cdots + x_{n-1} = 1 \quad \text{in } (x_1, \ldots, x_{n-1}) \in \Gamma_1^{n-1}
\]
has at least $m$ non-degenerate solutions. It follows that for all but finitely many $\alpha \in K \setminus \{0, -1\}$, the equation
\[
\frac{1}{1+\alpha} x_1 + \cdots + \frac{1}{1+\alpha} x_{n-1} + \frac{\alpha}{1+\alpha} x_n = 1 \quad \text{in } (x_1, \ldots, x_n) \in \Gamma_1^n
\]
has at least $m$ non-degenerate solutions, all with $x_n = 1$. We claim that the tuples
\[
\varphi(\alpha) := \Bigl( \frac{1}{1+\alpha}, \ldots, \frac{1}{1+\alpha}, \frac{\alpha}{1+\alpha} \Bigr)
\]
with $\alpha \in K \setminus \{0, -1\}$ lie in infinitely many different $\Gamma_1^n$-equivalence classes. Indeed, it is easy to see that the $\Gamma_1^n$-equivalence of $\varphi(\alpha)$, $\varphi(\beta)$ implies that $\alpha/\beta \in \Gamma_1$. Now if the tuples $\varphi(\alpha)$ with $\alpha \in K \setminus \{0, -1\}$ lay in finitely many $\Gamma_1^n$-equivalence classes, the group $K^*$ would be finitely generated, which is clearly absurd.

Instead, Evertse and Győry (1998b) proved the following result.

Theorem 6.7.2 Let $K$ be a field of characteristic 0 and $\Gamma$ a subgroup of $(K^*)^n$ of finite rank. Then for all tuples $(a_1, \ldots, a_n) \in (K^*)^n$ with the exception of at most finitely many $\Gamma$-equivalence classes, the (non-degenerate or degenerate) solutions of (6.1.3) lie in a union of at most $2^{(n+1)!}$ proper linear subspaces of $K^n$.
This was improved in Evertse (2004) to $2^{n+1}$. This bound is probably not best possible. It is as yet not clear what the optimal bound should be.

• Let $K$ be a number field, $S$ a finite set of places of $K$ containing the infinite places, and consider again equation (6.1.3) in $S$-units $x_1, \ldots, x_n$. There is as yet no general effective method to find all non-degenerate solutions if the number of unknowns is larger than 2. However, in his thesis, Vojta (1983) gave an effective method to determine all non-degenerate solutions in $S$-units of (6.1.3) if the number $n$ of unknowns is 3, and the cardinality of the set $S$ is at most 3. Recently, Bennett (not published when this book went to press) extended this to $n = 4$, $|S| \le 3$. Both Vojta and Bennett proved more general effective results for systems of $S$-unit equations. More recently, Levin (2014) extended Vojta's result to an effective result for $S$-integral points on certain quasi-projective varieties, where again $|S|$ is small enough. The proofs of Vojta, Bennett and Levin all use Baker-type lower bounds for linear forms in logarithms.

• An effective version of the $p$-adic Subspace Theorem of Schmidt and Schlickewei would yield an effective version of Corollary 6.1.2, stating the finiteness of the number of non-degenerate solutions of the $S$-unit equation (6.1.2), i.e. (6.1.3). It seems, however, hopeless to make the Subspace Theorem effective by the present methods. As is pointed out in Győry (1992a), an effective variant of the following weaker Diophantine result would also imply an effective version of Corollary 6.1.2, which would be of great importance for its applications.

Let $k, n \ge 1$ be integers, $\alpha_0, \ldots, \alpha_k, \beta_1, \ldots, \beta_n$ non-zero elements of a number field $K$, and $b_{i1}, \ldots, b_{in}$ ($i = 1, \ldots, k$) rational integers with absolute values at most $B$ such that
\[
\Lambda = \sum_{i=1}^{k} \alpha_i \beta_1^{b_{i1}} \cdots \beta_n^{b_{in}} - \alpha_0
\]
has no vanishing subsum containing $\alpha_0$. Let $|\cdot|_v$ be a normalized absolute value on $K$.
Proposition If $0 < |\Lambda|_v < e^{-\delta B}$ for some $\delta > 0$, then $B < C$, where $C$ is a number depending only on $k$, $n$, $K$, $\alpha_0, \ldots, \alpha_k$, $\beta_1, \ldots, \beta_n$, $v$ and $\delta$.

For $k = 1$, this is a non-effective version of Baker's Theorem and its $p$-adic analogue; see Section 3.2. For $k \ge 1$, the above proposition is a straightforward consequence of Proposition 6.2.1, which was deduced from the Subspace Theorem. Hence the bound $C$ is not effectively computable for $k > 1$ by the method of proof. In Győry (1992a) it is shown that an effective version of the above Proposition would imply an effective variant of Corollary 6.1.2 on $S$-unit equations.

7 Analogues over function fields

Let $k$ be an algebraically closed field of characteristic 0, and $K$ a function field in one variable over $k$, i.e., a finitely generated extension of $k$ of transcendence degree 1. Thus, $K$ is a finite extension of the field of rational functions $k(z)$, where $z$ is any element of $K \setminus k$. For definitions and more information on function fields we refer to Chapter 2. We denote by $g_{K/k}$ the genus of $K/k$. By a valuation on $K$ we mean a discrete valuation on $K$ with value group $\mathbb{Z}$ such that $v(x) = 0$ for $x \in k^*$. Let $M_K$ denote the set of valuations of $K$. We recall that for a finite subset $S$ of $M_K$ a non-zero element $u$ of $K$ is called an $S$-unit if $v(u) = 0$ for all $v \in M_K \setminus S$.
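As a concrete illustration of these definitions (not taken from the text itself), consider the simplest case, the rational function field $K = k(z)$. The labels $v_a$, $v_\infty$ for the valuations are notation chosen for this sketch only.

```latex
% Illustration only: the rational function field K = k(z), which has genus 0.
\[
K = k(z), \qquad g_{K/k} = 0, \qquad
M_K = \{ v_a : a \in k \} \cup \{ v_\infty \},
\]
% v_a(f) is the order of vanishing of f at z = a, and
% v_\infty(p/q) = \deg q - \deg p for polynomials p, q.
\[
S = \{ v_0, v_1, v_\infty \}
\quad\Longrightarrow\quad
\{ S\text{-units} \} = \{\, c\, z^{m} (z - 1)^{n} : c \in k^*,\ m, n \in \mathbb{Z} \,\}.
\]
% Both z and 1 - z are S-units, and z + (1 - z) = 1, so
% (x_1, x_2) = (z, 1 - z) is a solution of x_1 + x_2 = 1 in S-units.
\[
v_0(z) = 1, \quad v_\infty(z) = -1, \quad
v_1(1 - z) = 1, \quad v_\infty(1 - z) = -1 .
\]
```

Since $k$ is algebraically closed, every non-constant polynomial splits into linear factors, which is why the valuations other than $v_\infty$ correspond exactly to the points $a \in k$.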
In this chapter we deal with equations
\[
a_1 x_1 + a_2 x_2 = 1 \tag{7.1}
\]
and, in a less detailed manner,
\[
a_1 x_1 + \cdots + a_n x_n = 1 \tag{7.2}
\]
to be solved in $S$-units $x_1, \ldots, x_n$, and with some generalizations. The coefficients are non-zero elements of $K$.

In Section 7.1 we state Stothers' and Mason's Theorem, giving a function field analogue of the abc-conjecture, as well as a corollary which states that (7.1) has only finitely many solutions in $S$-units $x_1, x_2$ with $a_1 x_1, a_2 x_2 \notin k^*$, which can be effectively determined in a well-defined sense. The theorem of Stothers and Mason and its corollary are proved in Section 7.2. In Section 7.3 we give a survey without proofs on effective results on $S$-unit equations (7.2) in an arbitrary number of unknowns, and explain the structure of the set of solutions of such equations, which is somewhat more complicated than that of $S$-unit equations over number fields.

In Sections 7.4 and 7.5 we consider, among other things, the equation
\[
a_1 x_1 + a_2 x_2 = 1 \quad \text{in } (x_1, x_2) \in \Gamma, \tag{7.3}
\]
where again $a_1, a_2 \in K^*$ and where $\Gamma$ is a multiplicative subgroup of $K^* \times K^*$ containing $k^* \times k^*$ such that $\Gamma/(k^* \times k^*)$ is a group of finite rank $r$. We prove a result from Evertse and Zannier (2008), stating that equation (7.3) has at most $3^r$ solutions with $a_1 x_1, a_2 x_2 \notin k$. The method of proof we use, which is based on algebraic geometry, was developed by Bombieri, Mueller and Zannier (2001) and Zannier (2004).

In the last section of this chapter, we give a brief overview of recent results on unit equations over fields of positive characteristic.

7.1 Mason's inequality

Recall that for any $\alpha \in K$, the height of $\alpha$ relative to $K$ is defined by
\[
H_K(\alpha) := - \sum_{v} \min(0, v(\alpha)).
\]
We have $H_K(\alpha) \ge 0$, and equality holds precisely when $\alpha \in k$. Further, we denote by $|S|$ the cardinality of a set $S$.

We start with a theorem of Mason (1983, 1984). It is a generalization of an earlier result of Stothers (1981).

Theorem 7.1.1 (abc-theorem for function fields) Let $S$ be a finite, non-empty subset of $M_K$, and let $x_1$, $x_2$ and $x_3$ be non-zero elements of $K$ with
\[
x_1 + x_2 + x_3 = 0 \tag{7.1.1}
\]
such that
\[
v(x_1) = v(x_2) = v(x_3) \quad \text{for every } v \text{ in } M_K \setminus S. \tag{7.1.2}
\]
Then either $x_1/x_2$ lies in $k$, or
\[
H_K(x_1/x_2) \le |S| + 2g_{K/k} - 2. \tag{7.1.3}
\]

We note that (7.1.3) is best possible in the sense that for every $g \ge 0$ there is a function field $K$ over $k$ of genus $g$ such that equality holds for infinitely many values of $|S|$; see Silverman (1984) for $g = 0$ and Brownawell and Masser (1986) for arbitrary $g$.

Theorem 7.1.1 implies at once that if $S$ is again a finite subset of $M_K$ and $x_1, x_2$ are $S$-units with $x_1 + x_2 = 1$ and $x_1, x_2 \notin k$, then $H_K(x_i) \le |S| + 2g_{K/k} - 2$ for $i = 1, 2$. We state a more general result for the $S$-unit equation
\[
a_1 x_1 + a_2 x_2 = 1 \quad \text{in } S\text{-units } x_1, x_2, \tag{7.1.4}
\]
where $a_1, a_2 \in K^*$.

Theorem 7.1.2 Let $(x_1, x_2)$ be a solution of (7.1.4) with $a_i x_i \notin k^*$ for $i = 1, 2$. Then
\[
\max_{i=1,2} H_K(x_i) \le |S| + 2g_{K/k} - 2 + 5 \max_{i=1,2} H_K(a_i).
\]

We observe that equation (7.1.4) may have infinitely many solutions $(x_1, x_2)$ such that one of $a_1 x_1$, $a_2 x_2$ lies in $k$. Indeed, suppose (7.1.4) has such a solution $(x_{1,0}, x_{2,0})$. Then both $a_1 x_{1,0}, a_2 x_{2,0} \in k^*$. Put $a_1 x_{1,0} / a_2 x_{2,0} =: \eta$. Then we obtain infinitely many solutions $(x_1, x_2)$ of (7.1.4) with $a_1 x_1, a_2 x_2 \in k^*$ by taking
\[
(x_1, x_2) = \bigl( x_{1,0}\, \xi,\ x_{2,0} (1 + (1 - \xi) \eta) \bigr)
\]
for any $\xi \in k^*$ with $1 + (1 - \xi)\eta \ne 0$.
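In the special case $K = k(z)$ (genus 0) with polynomials, Theorem 7.1.1 takes the classical form: if $a$, $b$ are coprime polynomials, not both constant, and $c = a + b$, then $\max(\deg a, \deg b, \deg c) \le n_0(abc) - 1$, where $n_0$ counts distinct roots. The sketch below checks this numerically. It is illustrative only: the helper names are hypothetical, rational coefficients stand in for $k$, and distinct roots over the algebraic closure are counted via the gcd with the derivative.

```python
from fractions import Fraction

# Polynomials are lists of coefficients, lowest degree first.


def trim(p):
    """Convert to Fractions and drop trailing zero coefficients."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p


def deg(p):
    """Degree of p (the zero polynomial gets degree -1)."""
    return len(trim(p)) - 1


def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([x + y for x, y in zip(p, q)])


def mul(p, q):
    p, q = trim(p), trim(q)
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)


def rem(p, q):
    """Remainder of p on division by the non-zero polynomial q."""
    p, q = trim(p), trim(q)
    while len(p) >= len(q):
        factor = p[-1] / q[-1]
        shift = len(p) - len(q)
        for i, c in enumerate(q):
            p[shift + i] -= factor * c
        p = trim(p)
    return p


def gcd(p, q):
    """Euclidean algorithm; the result is a gcd up to a constant factor."""
    p, q = trim(p), trim(q)
    while q:
        p, q = q, rem(p, q)
    return p


def derivative(p):
    p = trim(p)
    return trim([i * c for i, c in enumerate(p)][1:])


def distinct_roots(p):
    """Number of distinct roots of p over an algebraic closure (char 0)."""
    return deg(p) - deg(gcd(p, derivative(p)))


def mason_sides(a, b):
    """Return (max degree, n0(abc) - 1) for c = a + b with a, b coprime."""
    c = add(a, b)
    assert deg(gcd(a, b)) == 0, "a and b must be coprime"
    height = max(deg(a), deg(b), deg(c))
    return height, distinct_roots(mul(mul(a, b), c)) - 1


if __name__ == "__main__":
    # (z^2 - 1)^2 + (2z)^2 = (z^2 + 1)^2
    a = [1, 0, -2, 0, 1]
    b = [0, 0, 4]
    print(mason_sides(a, b))
```

For the sample identity $(z^2 - 1)^2 + (2z)^2 = (z^2 + 1)^2$ the product $abc$ equals $4z^2(z^4 - 1)^2$, which has five distinct roots, so both sides of the inequality coincide in this case.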
From Theorem 7.1.2, we obtain the following effective finiteness result. Here it is necessary to assume that $k$ is presented explicitly in the sense of Fröhlich and Shepherdson (1956). This means that there is an algorithm to determine the zeros of any polynomial with coefficients in $k$. In particular, in this case we can perform the field operations in $k$. Further, we assume that $K$ is presented explicitly, that is, $K$ is given in the form $k(z)(y)$ where $z$ is a variable and $y$ is a primitive element of $K$ over $k(z)$, with explicitly given minimal polynomial in $k(z)[X]$. We say that an element $x$ of $K$ is explicitly given if it is given in the form
\[
x = \sum_{i=1}^{d} \bigl( q_i(z)/q(z) \bigr) y^{i-1},
\]
where $d = [K : k(z)]$ and $q_1, \ldots, q_d, q$ are explicitly given polynomials from $k[z]$. We call $(q_1, \ldots, q_d, q)$ a representation for $x$. We say that a valuation $v$ of $K$ is explicitly given, if we are given a local parameter $z_v$ and a Laurent series $y_v$ in $k((z_v))$ such that $y \mapsto y_v$ defines an isomorphic embedding of $K$ into $k((z_v))$. By a Laurent series being explicitly given we mean that we are given an inductive procedure to compute its coefficients one by one. If a non-zero $x \in K$ and a valuation $v$ are explicitly given, then $v(x)$ can be determined by computing a Laurent series of $x$ in terms of $z_v$ and searching for the first non-zero coefficient. Finally, an element $x$ of $K$ is said to be effectively determinable from certain given input data if there is an algorithm to determine an explicit representation of $x$ from these data. We note that if elements $x_1$ and $x_2$ of $K$ are effectively determinable, then so are $x_1 \pm x_2$, $x_1 x_2$ and $x_1/x_2$ ($x_2 \ne 0$).

Corollary 7.1.3 Equation (7.1.4) has only finitely many solutions with $a_i x_i \notin k^*$ for $i = 1, 2$, and these can be determined effectively if we assume that $k$, $K$ are presented explicitly and $a_1$, $a_2$ and the valuations in $S$ are given explicitly in the sense described above.

For $a_1 = a_2 = 1$, Mason (1983, 1984) deduced this corollary from his Theorem 7.1.1 stated above and from Propositions 2.4.1 and 2.4.2. Further, in this special case he extended Corollary 7.1.3 to the case of positive characteristic. We deduce Corollary 7.1.3 in a manner similar to Mason's, using Theorem 7.1.2 instead of Theorem 7.1.1.

We mention that in his book, Mason (1984) gave various applications of the results mentioned above, to Thue equations, hyper- and superelliptic equations, and curves of genus 0 and genus 1.

7.2 Proofs

We prove Theorems 7.1.1, 7.1.2 and Corollary 7.1.3.

Proof of Theorem 7.1.1. We assume without loss of generality that $S$ is precisely the set of all $v \in M_K$ such that $v(x_1)$, $v(x_2)$, $v(x_3)$ are not all equal. For convenience, we write $u := x_3/x_1$. Thus, (7.1.1) implies that $x_2/x_1 = -(u + 1)$ and our assumption translates into
\[
S = \{ v \in M_K : v(u) \ne 0 \text{ or } v(u + 1) \ne 0 \}.
\]
We may assume that $u$ does not lie in $k$. Since $v(u + 1) \ge 0$ if $v(u) = 0$, we can partition $S$ into a disjoint union $S_\infty \cup S_0 \cup S_{-1}$, where
\[
S_\infty = \{ v \in S : v(u) < 0 \}, \quad
S_0 = \{ v \in S : v(u) > 0 \}, \quad
S_{-1} = \{ v \in S : v(u + 1) > 0 \}.
\]
These sets are pairwise disjoint. Notice that by the Sum Formula (see (2.1.3) in Section 2.1), $\sum_{v \in S_\infty \cup S_0} v(u) = 0$. Now choose for every valuation $v \in M_K$ a local parameter $z_v$. We compare the order of vanishing at $v$ of $u$ and the local derivative $du/dz_v$. From (2.3.1) and (2.3.2) in Section 2.3, we infer
\[
\begin{aligned}
v\Bigl( \frac{du}{dz_v} \Bigr) &= v(u) - 1 && \text{for } v \in S_\infty \cup S_0, \\
v\Bigl( \frac{du}{dz_v} \Bigr) &= v\Bigl( \frac{d(u + 1)}{dz_v} \Bigr) = v(u + 1) - 1 && \text{for } v \in S_{-1}, \\
v\Bigl( \frac{du}{dz_v} \Bigr) &\ge 0 && \text{for } v \in M_K \setminus S.
\end{aligned}
\]
By combining these with Theorem 2.3.1 and the Sum Formula we obtain
\[
\begin{aligned}
2g_{K/k} - 2 = \sum_{v} v\Bigl( \frac{du}{dz_v} \Bigr)
&\ge \sum_{v \in S_\infty \cup S_0} (v(u) - 1) + \sum_{v \in S_{-1}} (v(u + 1) - 1) \\
&= \sum_{v \in S_{-1}} v(u + 1) - |S| = H_K\bigl( (u + 1)^{-1} \bigr) - |S| = H_K(x_1/x_2) - |S|.
\end{aligned}
\]
This implies Theorem 7.1.1.

Proof of Theorem 7.1.2. Put $H := \max_{i=1,2} H_K(a_i)$. For $i = 1, 2$, let $S_i$ be the set of valuations $v$ outside $S$ with $v(a_i) \ne 0$. By (2.2.8) we have $|S_i| \le 2H_K(a_i) \le 2H$ for $i = 1, 2$. Take a solution $(x_1, x_2)$ of (7.1.4) with $a_i x_i \notin k^*$ for $i = 1, 2$. We have
\[
v(a_1 x_1) = v(a_2 x_2) = v(1) = 0 \quad \text{for } v \in M_K \setminus (S \cup S_1 \cup S_2).
\]
Notice that by (2.2.7) and (2.2.6),
\[
H_K(x_i) \le H_K(a_i x_i) + H_K(a_i^{-1}) = H_K(a_i x_i) + H_K(a_i) \le H_K(a_i x_i) + H
\]
for $i = 1, 2$. Now an application of Theorem 7.1.1 with $a_1 x_1$, $a_2 x_2$, $-1$ instead of $x_1$, $x_2$, $x_3$ and $S \cup S_1 \cup S_2$ instead of $S$ gives for $i = 1, 2$,
\[
H_K(x_i) \le H_K(a_i x_i) + H \le |S| + |S_1| + |S_2| + 2g_{K/k} - 2 + H \le |S| + 2g_{K/k} - 2 + 5H.
\]

Proof of Corollary 7.1.3. Let $(x_1, x_2)$ be a solution of (7.1.4) with $a_i x_i \notin k^*$ for $i = 1, 2$. Pick $i \in \{1, 2\}$. Then $v(x_i) = 0$ for every valuation $v \in M_K \setminus S$. Further, by (2.2.8),
\[
\sum_{v \in S} |v(x_i)| = 2H_K(x_i) \le 2C,
\]
where $C$ is the bound from Theorem 7.1.2. As explained in Section 2.4, we can compute a minimal polynomial over $k[z]$ of each $a_i$, and then estimate from above the heights of the $a_i$ using (2.2.10). This leads to an effectively computable upper bound for $C$. We conclude that the tuple of integers $(v(x_i) : i = 1, 2,\ v \in S)$ has only a finite number of effectively determinable possibilities as $(x_1, x_2)$ runs over the solutions of (7.1.4) with $a_i x_i \notin k^*$ for $i = 1, 2$. Applying Proposition 2.4.1 we infer that apart from a non-zero factor in $k$, $x_1$, $x_2$ have only a finite number of possibilities which are effectively determinable.
Hence for $i = 1, 2$ we may write $x_i = y_i \xi_i$, where $\xi_i$ is some non-zero element of $k$ and $y_i$ belongs to a finite computable subset of $K$. Now equation (7.1.4) transforms into
\[
b_1 \xi_1 + b_2 \xi_2 = 1, \tag{7.2.1}
\]
where $b_i := a_i y_i \notin k^*$ for $i = 1, 2$. The pair $(b_1, b_2)$ belongs to a finite, effectively determinable set, and for each such pair we have to determine the solutions $\xi_1, \xi_2 \in k^*$ of (7.2.1). Fix $b_1, b_2$. We have seen that $b_1, b_2 \notin k$. If $b_1, b_2$ are linearly dependent over $k$ then (7.2.1) is unsolvable. Assume that $b_1, b_2$ are linearly independent over $k$. Then (7.2.1) has at most one solution, and by Proposition 2.4.2 it can be determined effectively. This completes the proof of our assertion.

7.3 Effective results in the more unknowns case

For completeness, we now present without proof some generalizations of the results stated in Section 7.1. Let $n \ge 2$ be a given integer. For non-zero elements $x_0, x_1, \ldots, x_n$ of $K$, we define the homogeneous height by
\[
H_K^{\mathrm{hom}}(x_0, \ldots, x_n) = - \sum_{v} \min(v(x_0), \ldots, v(x_n)).
\]
The Sum Formula on $K$ shows that this is actually a height on the projective space $\mathbb{P}^n(K)$. Further, we have $H_K(x_i/x_j) \le H_K^{\mathrm{hom}}(x_0, \ldots, x_n)$ for each $i, j$ with $0 \le i, j \le n$. Write
\[
N_0 := 0, \qquad N_n := \tfrac{1}{2}(n - 1)(n - 2) \quad \text{if } n \ge 1.
\]
Brownawell and Masser (1986) proved the following general theorem.

Theorem 7.3.1 Suppose that $x_0, \ldots, x_n$ are non-zero elements of $K$ such that
\[
x_0 + \cdots + x_n = 0, \tag{7.3.1}
\]
and no proper subset of $\{x_0, \ldots, x_n\}$ is $k$-linearly dependent. For each valuation $v$ of $K$, let
\[
m(v) = m(v; x_0, \ldots, x_n) := |\{ i : 0 \le i \le n,\ v(x_i) = 0 \}|.
\]
Then
\[
H_K^{\mathrm{hom}}(x_0, \ldots, x_n) \le N_{n+1} (2g_{K/k} - 2) + \sum_{v} (N_{n+1} - N_{m(v)}). \tag{7.3.2}
\]

The proof of Brownawell and Masser uses logarithmic Wronskians. Fix $z \in K \setminus k$, so that $K$ is a finite extension of $k(z)$. For $f \in K$, define $f^{(k)} := (d/dz)^k f$. Now we define the logarithmic Wronskian of $f_1, \ldots, f_n \in K^*$ by
\[
\lambda(f_1, \ldots, f_n) := \det\bigl( f_i^{(j-1)} / f_i \bigr)_{i,j = 1, \ldots, n}.
\]
Given a solution $(x_0, \ldots, x_n)$ of (7.3.1), let $\lambda_i$ be the logarithmic Wronskian of $x_0, \ldots, x_n$ with $x_i$ omitted. Then the argument of Brownawell and Masser consists of showing that $H_K^{\mathrm{hom}}(x_0, \ldots, x_n) = H_K^{\mathrm{hom}}(\lambda_0, \ldots, \lambda_n)$, and estimating $v(\lambda_i)$ from below for $i = 0, \ldots, n$ and $v \in M_K$, which leads to an upper bound for $H_K^{\mathrm{hom}}(\lambda_0, \ldots, \lambda_n)$.

The following consequence of Theorem 7.3.1 was obtained independently by Voloch (1985).

Corollary 7.3.2 Suppose that for some finite subset $S$ of $M_K$, $x_0, \ldots, x_n$ give rise to a solution of (7.3.1) in $S$-units, and that no proper subset of $\{x_0, \ldots, x_n\}$ is $k$-linearly dependent. Then
\[
H_K^{\mathrm{hom}}(x_0, \ldots, x_n) \le \tfrac{1}{2} n(n - 1)(|S| + 2g_{K/k} - 2).
\]

In his proof, Voloch did not use the Wronskian argument of Brownawell and Masser, but instead used properties of Weierstrass points on algebraic curves. Notice that we obtain Mason's result, Theorem 7.1.1, from Corollary 7.3.2 by taking $x_1, x_2, x_3$ in $K^*$ with $x_1 + x_2 + x_3 = 0$ and with (7.1.2) and applying Corollary 7.3.2 to $(x_1/x_2) + (x_3/x_2) + 1 = 0$.

Independently, Mason (1986a) proved Corollary 7.3.2 with a larger bound in terms of $n$. Further, he showed that apart from a common proportional $S$-unit factor, the full range of possibilities for such $x_0, \ldots, x_n$ is finite, and may be determined effectively whenever $k$, $K$ are presented explicitly and the valuations in $S$ are given explicitly. A sharpening of Corollary 7.3.2 was given by Zannier (1993). Hsia and Wang (2004) obtained a generalization of the result of Brownawell and Masser to function fields of arbitrary transcendence degree over constant fields of arbitrary characteristic. Their proof uses generalized Wronskians.

A solution $x_0, \ldots, x_n$ of (7.3.1) is called non-degenerate if $\sum_{i \in I} x_i \ne 0$ for every non-empty proper subset $I$ of $\{0, \ldots, n\}$, and degenerate otherwise. Brownawell and Masser (1986) proved that the inequality (7.3.2) in Theorem 7.3.1 remains true, in a slightly modified form, if the assumption of linear independence is replaced by the weaker hypothesis of non-degeneracy. Set
\[
G_K := \max(0, 2g_{K/k} - 2).
\]

Theorem 7.3.3 Suppose that $x_0, \ldots, x_n$ is a non-degenerate solution of (7.3.1). Then
\[
H_K^{\mathrm{hom}}(x_0, \ldots, x_n) \le N_{n+1} \cdot G_K + \sum_{v} (N_{n+1} - N_{m(v)}).
\]

This implies the following version of Corollary 7.3.2.

Corollary 7.3.4 Suppose that for some finite subset $S$ of $M_K$, $x_0, \ldots, x_n$ is a non-degenerate solution of (7.3.1) in $S$-units. Then
\[
H_K^{\mathrm{hom}}(x_0, \ldots, x_n) \le \tfrac{1}{2} n(n - 1)(|S| + G_K). \tag{7.3.3}
\]

It is likely that for $n > 2$ the factor $\tfrac{1}{2} n(n - 1)$ in (7.3.3) is not the best possible one.

We derive a result on the inhomogeneous equation
\[
a_1 x_1 + \cdots + a_n x_n = 1 \quad \text{in } S\text{-units } x_1, \ldots, x_n, \tag{7.3.4}
\]
where, as before, $S$ is a finite subset of $M_K$ and where $a_1, \ldots, a_n$ are non-zero elements of $K$. A solution $(x_1, \ldots, x_n)$ of this equation is called non-degenerate if $\sum_{i \in I} a_i x_i \ne 0$ for each non-empty subset $I$ of $\{1, \ldots, n\}$.
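The non-degeneracy condition for (7.3.4) is a finite check over all non-empty index sets, which can be made concrete by brute force. The following is a minimal sketch under stated assumptions: rational numbers stand in for the terms $a_i x_i \in K$, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def vanishing_subsets(terms):
    """Return all non-empty index tuples I with sum(terms[i] for i in I) == 0.

    `terms` plays the role of the list (a_1 x_1, ..., a_n x_n).
    """
    n = len(terms)
    return [
        subset
        for size in range(1, n + 1)
        for subset in combinations(range(n), size)
        if sum(terms[i] for i in subset) == 0
    ]


def is_nondegenerate(terms):
    """True if the terms sum to 1 and no non-empty subsum vanishes."""
    return sum(terms) == 1 and not vanishing_subsets(terms)


if __name__ == "__main__":
    good = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
    bad = [Fraction(3), Fraction(-3), Fraction(1)]
    print(is_nondegenerate(good), vanishing_subsets(bad))
```

The search visits all $2^n - 1$ non-empty subsets, so it is only meant for small $n$; exact rational arithmetic is used so that the test for a vanishing subsum is not affected by rounding.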
Theorem 7.3.5 For every non-degenerate solution $(x_1, \ldots, x_n)$ of (7.3.4) we have
\[
\max_{1 \le i \le n} H_K(x_i) \le \tfrac{1}{2} n(n - 1)(|S| + G_K) + (n^3 - n^2 + 1) \max_{1 \le i \le n} H_K(a_i).
\]

Proof. Put $H := \max_{1 \le i \le n} H_K(a_i)$. For $i = 1, \ldots, n$, let $S_i$ be the set of valuations $v$ outside $S$ for which $v(a_i) \ne 0$. Then by (2.2.8) we have $|S_i| \le 2H_K(a_i) \le 2H$ for $i = 1, \ldots, n$. Choose a non-degenerate solution $(x_1, \ldots, x_n)$ of (7.3.4). Then, exactly as in the proof of Theorem 7.1.2, we have $H_K(x_i) \le H_K(a_i x_i) + H$ for $i = 1, \ldots, n$. Now by applying Corollary 7.3.4 with $(-1, a_1 x_1, \ldots, a_n x_n)$ and $S \cup S_1 \cup \cdots \cup S_n$ instead of $(x_0, x_1, \ldots, x_n)$, $S$, we obtain for $i = 1, \ldots, n$,
\[
\begin{aligned}
H_K(x_i) \le H + H_K(a_i x_i) &\le H + H_K^{\mathrm{hom}}(-1, a_1 x_1, \ldots, a_n x_n) \\
&\le H + \tfrac{1}{2} n(n - 1)(G_K + |S| + |S_1| + \cdots + |S_n|) \\
&\le \tfrac{1}{2} n(n - 1)(G_K + |S|) + (n^3 - n^2 + 1) H.
\end{aligned}
\]

The above result does not imply that (7.3.4) has only finitely many solutions. We say that two solutions $(x_1, \ldots, x_n)$, $(\tilde{x}_1, \ldots, \tilde{x}_n)$ of (7.3.4) are $k$-proportional, or lie in the same $k$-proportionality class, if $x_i/\tilde{x}_i \in k^*$ for $i = 1, \ldots, n$. In general, a $k$-proportionality class may contain infinitely many non-degenerate solutions.

Corollary 7.3.6 The set of non-degenerate solutions of (7.3.4) is contained in a union of finitely many $k$-proportionality classes, and if we assume that $k$, $K$ are presented explicitly and $S$, $a_1, \ldots, a_n$ are explicitly given, a full system of representatives of these classes can be determined effectively.

Proof. Let $(x_1, \ldots, x_n)$ be a non-degenerate solution of (7.3.4). Then by (2.2.8) we have
\[
\sum_{v \in S} |v(x_i)| = 2H_K(x_i) \le 2C \quad \text{for } i = 1, \ldots, n,
\]
where $C$ is the upper bound from Theorem 7.3.5. By the same method as in the proof of Corollary 7.1.3, one can effectively compute an upper bound for $C$. This shows that the tuple $(v(x_i) : i = 1, \ldots, n,\ v \in S)$ runs through a finite, effectively determinable set as $(x_1, \ldots, x_n)$ runs through the non-degenerate solutions of (7.3.4), and the non-degenerate solutions corresponding to a given tuple lie in the same $k$-proportionality class. Hence the non-degenerate solutions of (7.3.4) lie in only finitely many $k$-proportionality classes.

By Proposition 2.4.1, it can be decided effectively whether for a given tuple $(c_{iv} : i = 1, \ldots, n,\ v \in S)$ there exist $b_1, \ldots, b_n \in O_S^*$ with $v(b_i) = c_{iv}$ for $i = 1, \ldots, n$, $v \in S$ and if so, such $b_1, \ldots, b_n$ can be determined effectively. The non-degenerate solutions of (7.3.4) that are $k$-proportional to $(b_1, \ldots, b_n)$ are of the shape $(b_1 \xi_1, \ldots, b_n \xi_n)$ with
\[
a_1 b_1 \xi_1 + \cdots + a_n b_n \xi_n = 1, \quad (\xi_1, \ldots, \xi_n) \in k^n, \tag{7.3.5}
\]
\[
\sum_{i \in I} a_i b_i \xi_i \ne 0 \quad \text{for each non-empty } I \subset \{1, \ldots, n\}. \tag{7.3.6}
\]
The tuples $(\xi_1, \ldots, \xi_n)$ with (7.3.5) form a linear subvariety $V$ of $k^n$, and the elements of $V$ with (7.3.6) lie in the complement of a finite number of linear subvarieties of $V$, say $V_1, \ldots, V_r$. Thus, the set of non-degenerate solutions of (7.3.4) that are $k$-proportional to $(b_1, \ldots, b_n)$ can be described as
\[
(b_1 \xi_1, \ldots, b_n \xi_n) \quad \text{with } (\xi_1, \ldots, \xi_n) \in U := V \setminus (V_1 \cup \cdots \cup V_r).
\]
So we have to decide whether or not $U \ne \emptyset$ and if so, find an element of $U$. Notice that $U \ne \emptyset$ precisely if $V \ne \emptyset$ and $V_1, \ldots, V_r$ are proper linear subvarieties of $V$. This can be checked using Proposition 2.4.2. Further, assuming $U \ne \emptyset$ one can find an element of $U$ using the parameter representations of $V$, $V_1, \ldots, V_r$ that can be computed according to Proposition 2.4.2.

Assume that $U \ne \emptyset$. Then one easily checks that $U$ consists of only one element if $V$ has dimension 0, that is, if $a_1 b_1, \ldots, a_n b_n$ are linearly independent over $k$, and $U$ is infinite if $V$ is positive dimensional, which is the case precisely if $a_1 b_1, \ldots, a_n b_n$ are linearly dependent over $k$.

7.4 Results on the number of solutions

Let again $k$ be an algebraically closed field of characteristic 0 and let now $K$ be an extension field of $k$ of arbitrary positive transcendence degree over $k$. Let $n \ge 2$, and denote by $(K^*)^n$ the $n$-fold direct product of the multiplicative group $K^*$, endowed with coordinatewise multiplication. We consider equations with solution vectors from a subgroup $\Gamma$ of $(K^*)^n$ such that $(k^*)^n \subset \Gamma$, and $\Gamma/(k^*)^n$ has finite rank $r$. If $r = 0$ this means that $\Gamma = (k^*)^n$, while if $r > 0$, this means that there are multiplicatively independent elements $u_1, \ldots, u_r$ of $\Gamma$ such that every element of $\Gamma$ can be expressed as $\xi \cdot u_1^{w_1} \cdots u_r^{w_r}$ with $\xi \in (k^*)^n$ and $w_1, \ldots, w_r \in \mathbb{Q}$. (The coordinates of $u_i^{w_i}$ are determined only up to multiplication by roots of unity, but we just make any choice for them.)

We start with the equation in two unknowns
\[
a_1 x_1 + a_2 x_2 = 1 \quad \text{in } (x_1, x_2) \in \Gamma, \tag{7.4.1}
\]
where $\Gamma$ is a subgroup of $(K^*)^2$ and $a_1, a_2 \in K^*$. The following theorem is a generalization of a result of Zannier (2004).

Theorem 7.4.1 Suppose that $\Gamma \supset (k^*)^2$ and that $\Gamma/(k^*)^2$ has finite rank $r \ge 0$. Then (7.4.1) has at most $3^r$ solutions with $a_j x_j \notin k^*$ for $j = 1, 2$.

We now consider equations in an arbitrary number of unknowns, i.e.,
\[
a_1 x_1 + \cdots + a_n x_n = 1 \quad \text{in } (x_1, \ldots, x_n) \in \Gamma, \tag{7.4.2}
\]
where $\Gamma$ is a subgroup of $(K^*)^n$ and $a_1, \ldots, a_n \in K^*$. Recall that a solution of (7.4.2) is called non-degenerate if $\sum_{i \in I} a_i x_i \ne 0$ for each non-empty subset $I$ of $\{1, \ldots, n\}$. Further, we say that two solutions $(x_1, \ldots, x_n)$, $(\tilde{x}_1, \ldots, \tilde{x}_n)$ are $k$-proportional, or belong to the same $k$-proportionality class, if $x_i/\tilde{x}_i \in k^*$ for $i = 1, \ldots, n$. The next theorem is the main result from Evertse and Zannier (2008).

Theorem 7.4.2 Let $n \ge 2$. Suppose that $\Gamma \supset (k^*)^n$ and that $\Gamma/(k^*)^n$ has finite rank $r \ge 0$. Then the non-degenerate solutions of (7.4.2) lie in at most
\[
\sum_{i=2}^{n+1} \binom{i}{2}^{r} - n + 1
\]
$k$-proportionality classes.

Theorem 7.4.1 follows at once from Theorem 7.4.2. Indeed, all solutions of (7.4.1) are non-degenerate. Further, the solutions with $a_j x_j \notin k^*$ for $j = 1, 2$ are pairwise $k$-non-proportional, and by substituting $n = 2$ into the bound of Theorem 7.4.2 we obtain $\binom{2}{2}^r + \binom{3}{2}^r - 1 = 3^r$, which is precisely the bound from Theorem 7.4.1.

We mention that the proof of Theorem 7.4.2, given in Evertse and Zannier (2008), depends heavily on ideas introduced in Bombieri, Mueller and Zannier (2001) and Zannier (2004). Weaker and less general results were obtained earlier in Evertse and Győry (1988b) and Mueller (2000).

To give a flavour of the techniques used in the papers mentioned above, in the next section we prove Theorem 7.4.1 in the special case that $K$ has transcendence degree 1 over $k$. The general case that $K$ has arbitrary transcendence degree over $k$ can be reduced to this special case by means of a specialization argument. The proof of Theorem 7.4.2, which is not given here, is based on the same ideas as the proof of Theorem 7.4.1.

7.5 Proof of Theorem 7.4.1

Let $k$ be an algebraically closed field of characteristic 0, let $K$ be an extension of $k$ of transcendence degree 1, and let $\Gamma$ be a subgroup of $(K^*)^2$ which contains $(k^*)^2$ and such that $\Gamma/(k^*)^2$ has finite rank $r$.
We would like to define in some way the $k$-closure $\overline{\Gamma}$ of $\Gamma$, which is such that if a point $(x_1, x_2)$ belongs to this $k$-closure, then so does $(x_1^w, x_2^w)$ for any $w \in k$. Then we would like to consider equation (7.4.1) with solutions from the $k$-closure of $\Gamma$ instead of $\Gamma$ itself. The importance of this is that it will allow us to use techniques from algebraic geometry. It does not suffice to define the $k$-closure of $\Gamma$ formally by taking the tensor product of $\Gamma$ with $k$; we have to embed this $k$-closure into a ring or field to make sense of our desired extension of (7.4.1). As it turns out, one can define exponentiation with elements from $k$ for formal power series. Then the $k$-closure of $\Gamma$ can be defined after embedding $K$ into a formal power series ring.

7.5.1 Extension to the k-closure of $\Gamma$

We keep the notation introduced in Section 7.4. Let again $k$ be an algebraically closed field of characteristic 0, $K$ an extension of $k$ of transcendence degree 1, and $\Gamma$ a subgroup of $(K^*)^2$ such that $\Gamma \supset (k^*)^2$ and $\Gamma/(k^*)^2$ has rank $r$. Choose pairs $(b_{i1}, b_{i2})$ $(i = 1, \ldots, r)$ from $\Gamma$ that are multiplicatively independent over $(k^*)^2$, i.e., there is no non-zero vector $\mathbf{w} = (w_1, \ldots, w_r) \in \mathbb{Z}^r$ with $\prod_{i=1}^{r} (b_{i1}, b_{i2})^{w_i} \in (k^*)^2$. Then every element of $\Gamma$ can be expressed as
\[ (\xi_1, \xi_2) \prod_{i=1}^{r} (b_{i1}, b_{i2})^{w_i}, \tag{7.5.1} \]
where $(\xi_1, \xi_2) \in (k^*)^2$, $w_1, \ldots, w_r \in \mathbb{Q}$, and where the powers with rational exponents are defined up to multiplication by points consisting of roots of unity.

Let $L$ be the extension of $k$ generated by $a_1, a_2, b_{ij}$ $(i = 1, \ldots, r,\ j = 1, 2)$. Notice that $L$ is an algebraic function field in one variable over $k$. Choose a valuation $v$ of $L$ such that $v(a_j) = 0$, $v(b_{ij}) = 0$ for $j = 1, 2$, $i = 1, \ldots, r$.
Since $v$ has residue class field $k$, after multiplying $a_1, a_2$ and the $b_{ij}$ by appropriate elements from $k^*$, we can achieve that
\[ v(a_j - 1) > 0, \quad v(b_{ij} - 1) > 0 \quad\text{for } j = 1, 2,\ i = 1, \ldots, r. \tag{7.5.2} \]
Choose a local parameter $z$ for $v$. Then the completion of $L$ at $v$ is the field of formal Laurent series $k((z))$, and we may assume that $L$ is a subfield of $k((z))$.

The valuation $v$ is extended to $k((z))$ by setting $v\bigl(\sum_{i=i_0}^{\infty} c_i z^i\bigr) = i_0$ if $c_i \in k$ for $i \ge i_0$ and $c_{i_0} \ne 0$. We say that a sequence $\{f_m\}_{m=0}^{\infty}$ in $k((z))$ converges to $f$ if $v(f_m - f) \to \infty$ as $m \to \infty$. The derivative of $f = \sum_{i=i_0}^{\infty} c_i z^i \in k((z))$ is defined by $f' = df/dz = \sum_{i=i_0}^{\infty} i c_i z^{i-1}$. If $\lim_{m\to\infty} f_m = f$ for some sequence $\{f_m\}$ in $k((z))$, then also $\lim_{m\to\infty} f_m' = f'$.

Denote by $k[[z]]$ the ring of formal power series in $z$, and denote by $1 + zk[[z]]$ the set of power series
\[ 1 + \sum_{i=1}^{\infty} c_i z^i \quad\text{with } c_i \in k \text{ for } i \ge 1. \]
Notice that $1 + zk[[z]]$ is a multiplicative group. By (7.5.2) the elements $a_j, b_{ij}$ $(j = 1, 2,\ i = 1, \ldots, r)$ all belong to the group $1 + zk[[z]]$. Further, they belong to $L$, hence are algebraic over $k(z)$.

We are now ready to define exponentiation with elements from $k$. For $f \in 1 + zk[[z]]$, $w \in k$ we put
\[ f^w := \sum_{j=0}^{\infty} \binom{w}{j} (f - 1)^j, \]
where $\binom{w}{j} := w(w-1)\cdots(w-j+1)/j!$. In the topology of $k[[z]]$ defined by $v$, this infinite series converges to a limit which belongs to $1 + zk[[z]]$. Obviously, by Newton's binomial formula, $f^w = f \cdots f$ ($w$ times) for any non-negative integer $w$. This exponentiation has the usual properties:
\begin{align}
(f^w)' &= w f^{w-1} f' && \text{for } f \in 1 + zk[[z]],\ w \in k; \tag{7.5.3}\\
(fg)^w &= f^w g^w && \text{for } f, g \in 1 + zk[[z]],\ w \in k; \tag{7.5.4}\\
f^{w_1 + w_2} &= f^{w_1} f^{w_2} && \text{for } f \in 1 + zk[[z]],\ w_1, w_2 \in k; \tag{7.5.5}\\
(f^{w_1})^{w_2} &= f^{w_1 w_2} && \text{for } f \in 1 + zk[[z]],\ w_1, w_2 \in k. \tag{7.5.6}
\end{align}
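As an illustration of the definition of $f^w$ (an example of our own, not taken from the text), take $f = 1 + z$ and $w = 1/2$. The first few binomial coefficients can be computed by hand, and squaring the truncated series recovers $f$ up to the truncation order, in line with (7.5.6) for $w_1 = 1/2$, $w_2 = 2$:

```latex
\begin{align*}
\binom{1/2}{0} &= 1, \qquad \binom{1/2}{1} = \tfrac12, \qquad
\binom{1/2}{2} = \frac{\tfrac12\cdot\bigl(-\tfrac12\bigr)}{2!} = -\tfrac18, \qquad
\binom{1/2}{3} = \frac{\tfrac12\cdot\bigl(-\tfrac12\bigr)\cdot\bigl(-\tfrac32\bigr)}{3!} = \tfrac1{16},\\[4pt]
(1+z)^{1/2} &= 1 + \tfrac12 z - \tfrac18 z^2 + \tfrac1{16} z^3 + O(z^4),\\[4pt]
\bigl((1+z)^{1/2}\bigr)^{2} &= 1 + z + \bigl(\tfrac14 - \tfrac14\bigr) z^2
  + \bigl(\tfrac18 - \tfrac18\bigr) z^3 + O(z^4) = 1 + z + O(z^4).
\end{align*}
```

Here $O(z^4)$ stands for a power series of valuation at least 4; the computation only checks the identity to that order.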
Property (7.5.3) can be proved using the fact that $((f - 1)^j)' = j (f - 1)^{j-1} f'$ for all $j$ and that infinite series can be differentiated term by term. The other properties can be proved using logarithmic derivatives, where the logarithmic derivative of a non-zero $f \in k((z))$ is $f'/f$. The map $f \mapsto f'/f$ defines an injective homomorphism from the multiplicative group $1 + zk[[z]]$ to the additive group of $k[[z]]$, and
\[ (f^w)'/f^w = w f'/f \quad\text{for } f \in 1 + zk[[z]],\ w \in k. \]
For instance, (7.5.4) is proved by showing that both $(fg)^w$ and $f^w g^w$ have logarithmic derivative $w \cdot ((f'/f) + (g'/g))$. The identities (7.5.5) and (7.5.6) can be proved likewise.

In the usual manner, we put $(x_1, \ldots, x_n)^w := (x_1^w, \ldots, x_n^w)$ for $x_1, \ldots, x_n \in 1 + zk[[z]]$ and $w \in k$. We now define the $k$-closure of $\Gamma$ by
\[ \overline{\Gamma} := \Bigl\{ (\xi_1, \xi_2) \prod_{i=1}^{r} (b_{i1}, b_{i2})^{w_i} : \xi_1, \xi_2 \in k^*,\ w_1, \ldots, w_r \in k \Bigr\}. \tag{7.5.7} \]
By (7.5.1), this group indeed contains $\Gamma$. In what follows, we write $\mathbf{w}$ for vectors $(w_1, \ldots, w_r) \in k^r$. Our result for $\overline{\Gamma}$ is as follows.

Theorem 7.5.1 Let $r$ be a positive integer and let $a_j, b_{ij}$ $(j = 1, 2,\ i = 1, \ldots, r)$ be elements of $1 + zk[[z]]$ such that
\[ a_j, b_{ij} \text{ are algebraic over } k(z) \text{ for } j = 1, 2,\ i = 1, \ldots, r, \tag{7.5.8} \]
\[ \text{there is no } \mathbf{w} \in \mathbb{Z}^r \setminus \{\mathbf{0}\} \text{ with } \prod_{i=1}^{r} (b_{i1}, b_{i2})^{w_i} \in (k^*)^2. \tag{7.5.9} \]
Let $\overline{\Gamma}$ be given by (7.5.7). Then the equation
\[ a_1 x_1 + a_2 x_2 = 1 \quad\text{in } (x_1, x_2) \in \overline{\Gamma} \text{ with } a_1 x_1 \notin k^*,\ a_2 x_2 \notin k^* \tag{7.5.10} \]
has at most $3^r$ solutions.

The idea of the proof is to consider the vectors $\mathbf{w} \in k^r$ in the representation (7.5.7) for the solutions $(x_1, x_2)$ of (7.5.10), and to estimate the number of these $\mathbf{w}$ using techniques from algebraic geometry. Here it will be crucial that $\mathbf{w}$ can be chosen from $k^r$ and not just from $\mathbb{Q}^r$, as was the case with the group $\Gamma$.

7.5.2 Some algebraic geometry

We have collected some basic facts from algebraic geometry. Our basic reference is Hartshorne (1977), chapter 1.
As before, $k$ is an algebraically closed field of characteristic 0. Let $r$ be a positive integer.

By an algebraic subset of $k^r$ we mean the set of common zeros in $k^r$ of a collection of polynomials in $k[X_1, \ldots, X_r]$. The algebraic subset of $k^r$ given by $f_1, \ldots, f_m \in k[X_1, \ldots, X_r]$, notation $V(f_1, \ldots, f_m)$, is defined as the set of common zeros in $k^r$ of $f_1, \ldots, f_m$. The algebraic subsets of $k^r$ are the closed sets of the Zariski topology on $k^r$. We say that an algebraic subset of $k^r$ is defined over a subfield $k'$ of $k$ if it is the set of common zeros in $k^r$ of polynomials with coefficients in $k'$.

The collection of all polynomials $f \in k[X_1, \ldots, X_r]$ vanishing identically on a given algebraic set $\mathcal{X} \subset k^r$ is an ideal of $k[X_1, \ldots, X_r]$, which is denoted by $I(\mathcal{X})$. Clearly, if $\mathcal{X}, \mathcal{Y}$ are algebraic subsets of $k^r$, then $\mathcal{X} \subseteq \mathcal{Y}$ if and only if $I(\mathcal{X}) \supseteq I(\mathcal{Y})$. Since $k[X_1, \ldots, X_r]$ is a Noetherian domain, every ascending chain of ideals of $k[X_1, \ldots, X_r]$ is eventually constant. By applying this to ideals associated with algebraic sets, we obtain the descending chain property for algebraic sets: if $\mathcal{X}_i$ $(i = 1, 2, \ldots)$ are algebraic subsets of $k^r$ with $\mathcal{X}_1 \supseteq \mathcal{X}_2 \supseteq \mathcal{X}_3 \supseteq \cdots$, then there is $j_0$ such that $\mathcal{X}_j = \mathcal{X}_{j_0}$ for $j \ge j_0$.

An algebraic subset of $k^r$ is called irreducible if it is not the union of two strictly smaller algebraic subsets of $k^r$. An irreducible algebraic subset of $k^r$ is also called an algebraic subvariety of $k^r$. A linear subvariety of $k^r$ is an algebraic subvariety of $k^r$ given by linear polynomials.
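A small example of our own may help fix these notions (it is not part of the original exposition). In $k^2$, the zero set of $X_1 X_2$ is reducible, being the union of the two coordinate axes, each of which is a linear subvariety; the associated ideals are ordered in the reverse direction:

```latex
\begin{align*}
V(X_1 X_2) &= V(X_1) \cup V(X_2), \qquad V(X_1) \subsetneq V(X_1X_2), \quad V(X_2) \subsetneq V(X_1X_2),\\
I\bigl(V(X_1 X_2)\bigr) &= (X_1 X_2) \;\subsetneq\; (X_1) = I\bigl(V(X_1)\bigr).
\end{align*}
```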
From the descending chain property for algebraic sets it follows easily that every non-empty algebraic subset $\mathcal{X}$ of $k^r$ is a union $\mathcal{V}_1 \cup \cdots \cup \mathcal{V}_g$ of finitely many algebraic subvarieties of $k^r$. If we assume in addition that none of these algebraic subvarieties is contained in the union of the others, they are uniquely determined. In that case, $\mathcal{V}_1, \ldots, \mathcal{V}_g$ are called the irreducible components of $\mathcal{X}$. From the definition of irreducibility it follows at once that any algebraic subvariety contained in $\mathcal{X}$ must be contained in an irreducible component of $\mathcal{X}$.

Let $\mathcal{V}$ be an algebraic subvariety of $k^r$. Then its associated ideal $I(\mathcal{V})$ is a prime ideal of $k[X_1, \ldots, X_r]$, hence the quotient ring $k[X_1, \ldots, X_r]/I(\mathcal{V})$ is an integral domain. The quotient field of this domain is called the function field of $\mathcal{V}$, notation $k(\mathcal{V})$. The transcendence degree over $k$ of this field is called the dimension of $\mathcal{V}$, notation $\dim \mathcal{V}$. A zero-dimensional algebraic variety is a point, and a one-dimensional algebraic variety is an algebraic curve. Further, if $\mathcal{V}_1, \mathcal{V}_2$ are algebraic subvarieties of $k^r$ with $\mathcal{V}_1$ strictly contained in $\mathcal{V}_2$, then $\dim \mathcal{V}_1 < \dim \mathcal{V}_2$.

We recall some results from intersection theory. To state these, we need the notion of degree of an algebraic variety. Let $\mathcal{V}$ be an $n$-dimensional algebraic subvariety of $k^r$. Denote by $V_m$ the $k$-vector space of polynomials in $k[X_1, \ldots, X_r]$ of total degree at most $m$ and define $H_{\mathcal{V}}(m)$ to be the dimension of the quotient $k$-vector space $V_m/(V_m \cap I(\mathcal{V}))$. One can show that there is a polynomial $p_{\mathcal{V}} \in \mathbb{Q}[X]$, called the Hilbert polynomial of $\mathcal{V}$, such that $H_{\mathcal{V}}(m) = p_{\mathcal{V}}(m)$ for every sufficiently large integer $m$.
Further, there is a positive integer $\deg \mathcal{V}$, called the degree of $\mathcal{V}$, such that $p_{\mathcal{V}}(m) = \deg \mathcal{V} \cdot m^n/n! + (\text{lower powers of } m)$. For instance, $\deg \mathcal{V} = 1$ if $\mathcal{V} = k^r$ or if $\mathcal{V}$ is a point.

Proposition 7.5.2 Let $\mathcal{V}$ be an algebraic subvariety of $k^r$ and let $\mathcal{X}$ be an algebraic subset of $k^r$ given by polynomials of total degree at most $d$. Then $\mathcal{V} \cap \mathcal{X}$ is an algebraic subset of $k^r$ with at most $d^{\dim \mathcal{V}} \cdot \deg \mathcal{V}$ irreducible components.

Proof. We proceed by induction on $\dim \mathcal{V}$. If $\mathcal{V}$ has dimension 0, i.e., is a point, the assertion is obvious. Suppose that $\dim \mathcal{V} = n > 0$. If $\mathcal{V} \subseteq \mathcal{X}$ we are done since by definition, $\mathcal{V}$ itself is irreducible. Assume that $\mathcal{V} \not\subseteq \mathcal{X}$. Then there is a polynomial $f \in k[X_1, \ldots, X_r]$ of total degree at most $d$ that vanishes identically on $\mathcal{X}$, but does not vanish identically on $\mathcal{V}$. We now invoke a version of Bézout's Theorem (see Hartshorne (1977), chapter 1, Theorem 7.7 for a more precise version with multiplicities) which states that if $\mathcal{V}_1, \ldots, \mathcal{V}_g$ are the irreducible components of $\mathcal{V} \cap V(f)$, then
\[ \dim \mathcal{V}_i = n - 1 \text{ for } i = 1, \ldots, g, \qquad \sum_{i=1}^{g} \deg \mathcal{V}_i \le d \cdot \deg \mathcal{V}. \]
Now, by the induction hypothesis, we have for $i = 1, \ldots, g$ that $\mathcal{V}_i \cap \mathcal{X}$ has at most $d^{n-1} \cdot \deg \mathcal{V}_i$ irreducible components. Consequently, the number of irreducible components of $\mathcal{V} \cap \mathcal{X}$ is at most
\[ \sum_{i=1}^{g} d^{n-1} \cdot \deg \mathcal{V}_i \le d^n \deg \mathcal{V} = d^{\dim \mathcal{V}} \cdot \deg \mathcal{V}. \]

Corollary 7.5.3 Let $\mathcal{X}$ be an algebraic subset of $k^r$ given by polynomials of total degree at most $d$. Let $\mathcal{Y}$ be an algebraic subset of $\mathcal{X}$ such that $\mathcal{X} \setminus \mathcal{Y}$ is finite. Then $\mathcal{X} \setminus \mathcal{Y}$ has cardinality at most $d^r$.

Proof. Assume that $\mathcal{X} \setminus \mathcal{Y} = \{P_1, \ldots, P_g\}$. Notice that the irreducible components of $\mathcal{Y}$, together with $P_1, \ldots, P_g$, form a decomposition of $\mathcal{X}$ into irreducible subsets, none of which is contained in the union of the others. This shows that $\{P_1\}, \ldots, \{P_g\}$ are irreducible components of $\mathcal{X}$. Now our corollary follows at once by applying Proposition 7.5.2 with $\mathcal{V} = k^r$.
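The bound $d^r$ in Corollary 7.5.3 cannot be improved in general. The following example is our own illustration and is not taken from the text. It uses only that $k$, being algebraically closed of characteristic 0, contains exactly $d$ distinct $d$-th roots of unity, and that the empty set is an algebraic set (the zero set of the constant polynomial 1):

```latex
\begin{align*}
\mathcal{X} &:= V\bigl(X_1^d - 1, \ldots, X_r^d - 1\bigr)
   = \bigl\{ (\zeta_1, \ldots, \zeta_r) \in k^r : \zeta_1^d = \cdots = \zeta_r^d = 1 \bigr\},
   \qquad \mathcal{Y} := \emptyset = V(1),\\
|\mathcal{X} \setminus \mathcal{Y}| &= |\mathcal{X}| = d^r .
\end{align*}
```

Here $\mathcal{X}$ is given by polynomials of total degree $d$, so the corollary's bound is attained.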
7.5.3 Proof of Theorem 7.5.1

We keep the notation and assumptions from Subsection 7.5.1. We denote by $E$ the algebraic closure of $k(z)$ in the field $k((z))$. It is important to observe that the differentiation $x \mapsto x' = dx/dz$ maps elements from $E$ to elements from $E$.

Suppose that $a_j, b_{ij}$ $(j = 1, 2,\ i = 1, \ldots, r)$ satisfy (7.5.8) and (7.5.9). Denote by $\mathcal{X}$ the set of $\mathbf{w} = (w_1, \ldots, w_r) \in k^r$ such that the three functions
\[ 1, \qquad a_1 \prod_{i=1}^{r} b_{i1}^{w_i}, \qquad a_2 \prod_{i=1}^{r} b_{i2}^{w_i} \]
are $k$-linearly dependent. Further, denote by $\mathcal{Y}$ the set of $\mathbf{w} \in k^r$ such that some two among these functions are $k$-linearly dependent. Let $(x_1, x_2)$ be a solution of (7.5.10). Representing $(x_1, x_2)$ as in (7.5.7), we obtain
\[ \xi_1 a_1 \prod_{i=1}^{r} b_{i1}^{w_i} + \xi_2 a_2 \prod_{i=1}^{r} b_{i2}^{w_i} = 1 \tag{7.5.11} \]
with $\xi_1, \xi_2 \in k^*$, $\mathbf{w} = (w_1, \ldots, w_r) \in k^r$. Hence $\mathbf{w} \in \mathcal{X}$. Further, the condition $a_1 x_1 \notin k^*$, $a_2 x_2 \notin k^*$ implies that $\mathbf{w} \notin \mathcal{Y}$. This shows that $\mathbf{w} \in \mathcal{X} \setminus \mathcal{Y}$. Conversely, let $\mathbf{w} \in \mathcal{X} \setminus \mathcal{Y}$. Then there are unique $\xi_1, \xi_2 \in k^*$ with (7.5.11), and this leads to a unique solution $(x_1, x_2) = (\xi_1, \xi_2) \prod_{i=1}^{r} (b_{i1}, b_{i2})^{w_i}$ of (7.5.10). So in order to prove Theorem 7.5.1 it suffices to prove the following.

Proposition 7.5.4 The set $\mathcal{X} \setminus \mathcal{Y}$ has cardinality at most $3^r$.

Eventually, we will apply Corollary 7.5.3 from the previous subsection. We first show that $\mathcal{X} \setminus \mathcal{Y}$ is finite, and then that $\mathcal{X}, \mathcal{Y}$ are algebraic sets where $\mathcal{X}$ is given by polynomials of total degree at most 3. We first prove a number of lemmas.

Lemma 7.5.5 Let $L$ be a finite extension of $k(z)$ contained in $E$, and let $\beta_1, \ldots, \beta_r, \alpha \in L^*$ and $w_1, \ldots, w_r \in k$ be such that $\beta_1^{w_1} \cdots \beta_r^{w_r} = \alpha$. Then for every valuation $v$ of $L$ we have $\sum_{i=1}^{r} w_i v(\beta_i) = v(\alpha)$.

Proof. We need some facts on residues. Choose a local parameter $z_v$ of $v$.
Then every $f \in L$ can be expressed as a Laurent series $\sum_{i=i_0}^{\infty} c_i z_v^i$ with $c_i \in k$. We define the residue of $f$ at $z_v$ by $\mathrm{res}_{z_v}(f) := c_{-1}$; this defines a $k$-linear map from $L$ to $k$. One can show that the residue depends only on $v$, i.e., that it is independent of the choice of $z_v$, but we do not need this. We need only the easily verifiable fact that for the logarithmic derivative of $f \in L^*$ with respect to $z_v$ we have
\[ \mathrm{res}_{z_v}\Bigl( f^{-1} \frac{df}{dz_v} \Bigr) = v(f). \]
Recall that the logarithmic derivative $x \mapsto x^{-1}\,dx/dz$ is defined on the group $1 + zk[[z]]$ and that it maps products of powers with exponents in $k$ to linear combinations. Thus, $\prod_{i=1}^{r} \beta_i^{w_i} = \alpha$ maps to
\[ \sum_{i=1}^{r} w_i \cdot \frac{d\beta_i/dz}{\beta_i} = \frac{d\alpha/dz}{\alpha}, \]
which is an identity with functions in $L$. By multiplying with $dz/dz_v$, which also belongs to $L$, we obtain the same identity, but with $z_v$ instead of $z$. Then by taking residues with respect to $z_v$, our lemma follows.

Lemma 7.5.6 Let $\beta_{ij}$ $(i = 1, \ldots, r,\ j = 1, \ldots, m)$ be elements of $(1 + zk[[z]]) \cap E$ such that there is no non-zero $\mathbf{w} = (w_1, \ldots, w_r) \in \mathbb{Z}^r$ with
\[ \prod_{i=1}^{r} \beta_{ij}^{w_i} \in k^* \quad\text{for } j = 1, \ldots, m. \]
Then the map
\[ \psi : \mathbf{w} \mapsto \Bigl( \prod_{i=1}^{r} \beta_{i1}^{w_i}, \ldots, \prod_{i=1}^{r} \beta_{im}^{w_i} \Bigr) \]
defines an injective homomorphism from $k^r$ to $(1 + zk[[z]])^m$ with coordinatewise multiplication.

Proof. By (7.5.5), the map $\psi$ defines a homomorphism on $k^r$. Denote by $H$ the kernel of $\psi$. Notice that $H$ is the set of $\mathbf{w} \in k^r$ such that
\[ \prod_{i=1}^{r} \beta_{ij}^{w_i} = 1 \quad\text{for } j = 1, \ldots, m. \tag{7.5.12} \]
We have to prove that $H = (\mathbf{0})$. Let $L$ be the extension of $k(z)$ generated by the elements $\beta_{ij}$ $(i = 1, \ldots, r,\ j = 1, \ldots, m)$. Then $L \subset E$ and $L$ is a finite extension of $k(z)$. By Lemma 7.5.5, for every $\mathbf{w} \in H$ we have
\[ \sum_{i=1}^{r} w_i \cdot v(\beta_{ij}) = 0 \quad\text{for } j = 1, \ldots, m,\ v \in M_L, \]
where $M_L$ is the set of valuations of $L$.
The latter system defines a proper linear subspace $H'$ of $k^r$, defined over $\mathbb{Q}$, containing $H$. Suppose $H' \ne (\mathbf{0})$. Then $H'$ contains a non-zero vector $\mathbf{w} = (w_1, \ldots, w_r) \in \mathbb{Z}^r$. Put $x_j := \prod_{i=1}^{r} \beta_{ij}^{w_i}$ for $j = 1, \ldots, m$. Then $x_j \in L$ and $v(x_j) = 0$ for $j = 1, \ldots, m$, $v \in M_L$. But this implies $x_j \in k^*$ for $j = 1, \ldots, m$, contradicting our assumption.

Lemma 7.5.7 Let $m, r$ be positive integers, let $R \in E(U_1, \ldots, U_m)$ be a rational function in $m$ variables, and let $\beta_1, \ldots, \beta_r \in (1 + zk[[z]]) \cap E$. Then there are only finitely many $\alpha \in E$ for which there exist $\mathbf{w} = (w_1, \ldots, w_r) \in k^r$, $\mathbf{u} \in k^m$ such that
\[ \prod_{i=1}^{r} \beta_i^{w_i} = R(\mathbf{u}) = \alpha. \tag{7.5.13} \]

Proof. The assertion is obvious if all $\beta_i$ are equal to 1. Suppose that not all $\beta_i$ are equal to 1. Then since $(1 + zk[[z]]) \cap k = \{1\}$, not all $\beta_i$ belong to $k^*$. Write $R = P/Q$, where $P, Q \in E[U_1, \ldots, U_m]$. Let $L$ be the extension of $k(z)$ generated by $\beta_1, \ldots, \beta_r$ and the coefficients of $P, Q$. Then $L$ is a finite extension of $k(z)$ with $L \subset E$. There is a valuation $v$ of $L$ such that the integers $v(\beta_i)$ are not all equal to 0.

We claim that there are integers $a, b$ independent of $\mathbf{u}$ such that if $R(\mathbf{u})$ is defined and non-zero, then $a \le v(R(\mathbf{u})) \le b$. Choose a local parameter $z_v$ for $v$. By expressing the coefficients of $P$ as Laurent series in $z_v$, we obtain
\[ P(\mathbf{u}) = \sum_{i=i_0}^{\infty} p_i(\mathbf{u}) z_v^i \]
with $p_i \in k[U_1, \ldots, U_m]$ not all identically 0. Put $\mathcal{X}_{i_0 - 1} := k^m$, and for $j \ge i_0$, denote by $\mathcal{X}_j$ the set of $\mathbf{u} \in k^m$ such that $p_i(\mathbf{u}) = 0$ for $i_0 \le i \le j$. Then $\mathcal{X}_{i_0} \supseteq \mathcal{X}_{i_0 + 1} \supseteq \cdots$, and by the descending chain property for algebraic sets, there is $j_0$ such that $\mathcal{X}_j = \mathcal{X}_{j_0}$ for $j \ge j_0$. Let $j_0$ be the smallest index with this property.
If $\mathbf{u} \in k^m$ is such that $P(\mathbf{u}) \ne 0$ and $v(P(\mathbf{u})) = j$, say, we have $\mathbf{u} \in \mathcal{X}_{j-1} \setminus \mathcal{X}_j$. Hence $v(P(\mathbf{u})) \in \{i_0, \ldots, j_0\}$. By applying the same reasoning to $Q$, our claim follows.

Let $\alpha \in E$ be such that there exist $\mathbf{w} \in k^r$, $\mathbf{u} \in k^m$ with (7.5.13). By Lemma 7.5.5 we have
\[ j = \sum_{i=1}^{r} w_i v(\beta_i), \]
where $j = v(R(\mathbf{u})) \in \{a, a + 1, \ldots, b\}$. Thus, $\mathbf{w}$ belongs to one of finitely many linear subvarieties of $k^r$ of dimension $r - 1$, all defined over $\mathbb{Q}$.

We now proceed by induction on $r$. If $r = 1$, we have only finitely many possibilities for $\mathbf{w}$, and then our lemma follows at once. Suppose that $r \ge 2$, and let $\mathcal{L}$ be one of the linear varieties from above. We have to show that $\mathcal{L}$ gives rise to only finitely many $\alpha$ as in (7.5.13). There are $\mathbf{c}_0 \in \mathbb{Q}^r$ and linearly independent $\mathbf{c}_k \in \mathbb{Z}^r$ $(k = 1, \ldots, r - 1)$ such that every $\mathbf{w} \in \mathcal{L}$ can be expressed as $\mathbf{w} = \mathbf{c}_0 + \sum_{k=1}^{r-1} t_k \mathbf{c}_k$ with $t_1, \ldots, t_{r-1} \in k$. Write $\mathbf{c}_k = (c_{1k}, \ldots, c_{rk})$, $\gamma_k := \prod_{i=1}^{r} \beta_i^{c_{ik}}$ for $k = 0, \ldots, r - 1$. Notice that $\gamma_k \in E$ for $k = 0, \ldots, r - 1$. By substituting our expression for $\mathbf{w} \in \mathcal{L}$ into (7.5.13), we obtain
\[ \prod_{k=1}^{r-1} \gamma_k^{t_k} = \gamma_0^{-1} R(\mathbf{u}) = \gamma_0^{-1} \alpha. \]
By the induction hypothesis, we have only finitely many possibilities for $\gamma_0^{-1} \alpha$, hence for $\alpha$. This completes our induction step.

Lemma 7.5.8 The set $\mathcal{X} \setminus \mathcal{Y}$ is finite.

Proof. Let $\mathbf{w} = (w_1, \ldots, w_r) \in \mathcal{X} \setminus \mathcal{Y}$. Put $y_j := \prod_{i=1}^{r} b_{ij}^{w_i}$ for $j = 1, 2$. Then the logarithmic derivative of $a_j y_j$ with respect to $z$ is
\[ \frac{a_j'}{a_j} + \sum_{i=1}^{r} w_i \cdot \frac{b_{ij}'}{b_{ij}} =: Q_j \quad\text{for } j = 1, 2. \]
Since $\mathbf{w} \in \mathcal{X}$, there are $\xi_1, \xi_2 \in k^*$ with
\[ \xi_1 a_1 y_1 + \xi_2 a_2 y_2 = 1. \]
Upon differentiating this identity with respect to $z$, we obtain
\[ Q_1 \cdot \xi_1 a_1 y_1 + Q_2 \cdot \xi_2 a_2 y_2 = 0. \]
Since $\mathbf{w} \notin \mathcal{Y}$ we have $a_1 y_1 \ne a_2 y_2$, and since logarithmic differentiation $x \mapsto x'/x$ is injective on $1 + zk[[z]]$, we have $Q_1 \ne Q_2$. Hence the last two equations have a unique solution $(y_1, y_2)$, and on applying Cramer's rule we obtain
\[ \prod_{i=1}^{r} b_{ij}^{w_i} = y_j = R_j(\xi_1, \xi_2, w_1, \ldots, w_r) \quad\text{for } j = 1, 2, \]
where $R_1, R_2$ are certain rational functions in $E(U_1, \ldots, U_{r+2})$.

Put $\psi(\mathbf{w}) := \bigl( \prod_{i=1}^{r} b_{i1}^{w_i}, \prod_{i=1}^{r} b_{i2}^{w_i} \bigr)$. Lemma 7.5.7 implies that if $\mathbf{w}$ runs through $\mathcal{X} \setminus \mathcal{Y}$, then $\psi(\mathbf{w})$ runs through a finite set. On the other hand, by condition (7.5.9) and Lemma 7.5.6, $\psi$ defines an injective map. This shows that $\mathcal{X} \setminus \mathcal{Y}$ is finite.

Lemma 7.5.9 The set $\mathcal{X}$ is an algebraic subset of $k^r$, given by polynomials in $k[X_1, \ldots, X_r]$ of total degree at most 3. Further, $\mathcal{Y}$ is an algebraic subset of $\mathcal{X}$.

Proof. We apply the Wronskian criterion, that functions $1, f_1, f_2 \in k((z))$ are linearly dependent over $k$ if and only if $f_1' f_2'' - f_2' f_1'' = 0$, where $f_j'' := d^2 f_j/dz^2$. Let $f_j := a_j \prod_{i=1}^{r} b_{ij}^{w_i}$ for $j = 1, 2$. By a straightforward computation,
\[ f_j' = p_j(\mathbf{w}) f_j, \qquad f_j'' = q_j(\mathbf{w}) f_j \quad\text{for } j = 1, 2, \]
where $p_j$ are linear polynomials, and $q_j$ are quadratic polynomials with coefficients in $E$, for $j = 1, 2$. Since $f_1 f_2 \ne 0$ for every $\mathbf{w}$, it follows that $\mathbf{w} \in \mathcal{X}$ if and only if $h(\mathbf{w}) = 0$, where $h := p_1 q_2 - p_2 q_1$. Notice that $h$ has total degree at most 3. From the coefficients of $h$ we select a maximal subset which is linearly independent over $k$, $\{a_1, \ldots, a_s\}$, say. Then the other coefficients of $h$ can be expressed as $k$-linear combinations of $a_1, \ldots, a_s$, and we get $h = \sum_{k=1}^{s} a_k h_k$ with polynomials $h_k \in k[X_1, \ldots, X_r]$ $(k = 1, \ldots, s)$ of total degree at most 3.
Now, clearly, for $\mathbf{w} \in k^r$ we have $h(\mathbf{w}) = 0$ if and only if $h_k(\mathbf{w}) = 0$ for $k = 1, \ldots, s$. Hence $\mathcal{X}$ is an algebraic set given by $h_1, \ldots, h_s$.

To prove that $\mathcal{Y}$ is an algebraic subset of $\mathcal{X}$, we use the Wronskian criterion that two functions $f_1, f_2 \in k((z))$ are $k$-linearly dependent if and only if $f_1 f_2' - f_1' f_2 = 0$, and follow the same arguments as above.

Now Proposition 7.5.4 follows at once by combining Lemmas 7.5.8 and 7.5.9 with Corollary 7.5.3.

7.6 Results in positive characteristic

We give an overview of some results on S-unit equations and generalizations thereof over function fields of positive characteristic.

Let $k$ be a field of characteristic $p > 0$ and $K$ a finite extension of the rational function field $k(z)$. We assume that $k$ is algebraically closed in $K$. Denote by $g_{K/k}$ the genus of $K$ over $k$. Similarly as in the characteristic 0 case, we can endow $K$ with a set of valuations $M_K$ (i.e., normalized discrete valuations that are trivial on $k$) satisfying the Sum Formula $\sum_{v \in M_K} v(x) = 0$ for $x \in K^*$. Further, we define for $\mathbf{x} = (x_1, \ldots, x_n) \in K^n \setminus \{\mathbf{0}\}$,
\[ v(\mathbf{x}) := -\min(v(x_1), \ldots, v(x_n)) \quad\text{for } v \in M_K, \qquad H_K^{\mathrm{hom}}(\mathbf{x}) := \sum_{v \in M_K} v(\mathbf{x}). \]
The height of $x \in K$ is given by $H_K(x) := -\sum_{v \in M_K} \min(0, v(x))$. For a finite set of valuations $S$ of $M_K$, we define the group of $S$-units
\[ O_S^* := \{x \in K : v(x) = 0 \text{ for } v \in M_K \setminus S\}. \]
Recall that the Frobenius map $x \mapsto x^p$ defines an injective field homomorphism on $K$. As a consequence, the sets $K^{p^m} := \{x^{p^m} : x \in K\}$ $(m = 1, 2, \ldots)$ are subfields of $K$. We say that $a_1, \ldots, a_m \in K$ are linearly independent over a subset $U$ of $K$ if there are no $c_1, \ldots, c_m \in U \setminus \{0\}$ such that $c_1 a_1 + \cdots + c_m a_m = 0$.
We start with an analogue of Theorem 7.1.1 in the case of characteristic $p$, also due to Mason (1984), chapter VI, Lemma 10.

Theorem 7.6.1 Let $x_1, x_2, x_3$ be non-zero elements of $K$, and let $S$ be a finite set of valuations of $K$ such that
\[ x_1 + x_2 + x_3 = 0, \qquad v(x_1) = v(x_2) = v(x_3) \text{ for } v \in M_K \setminus S. \]
Then either $x_1/x_2 \in K^p$ or
\[ H_K(x_1/x_2) \le 2g_{K/k} - 2 + |S|. \]

Of course, if we allow $x_1/x_2 \in K^p$, the result may become false. For instance, if $x_1, x_2, x_3$ are any elements of $K$ with $x_1 + x_2 + x_3 = 0$ and $x_1/x_2$ not in the constant field $k$, then we have also $x_1^{p^m} + x_2^{p^m} + x_3^{p^m} = 0$ for every positive integer $m$, and clearly $H_K(x_1^{p^m}/x_2^{p^m})$ may become arbitrarily large.

Mason's proof of Theorem 7.6.1 is similar to his proof of Theorem 7.1.1, based on derivations. Silverman (1984) proved a similar result (stated in another but equivalent form) by a different geometric method, based on the Riemann–Hurwitz formula and properties of Fermat curves $x^N + y^N = 1$.

We now turn to $S$-unit equations in several unknowns. First Mason (1986b) and later, in a sharper form, Wang (1996, 1999) proved analogues of Corollary 7.3.4 in the case of positive characteristic. We recall Wang (1996), Corollary 1.

Theorem 7.6.2 Suppose that $K$ has genus $g$. Let $S$ be a finite set of valuations of $K$, let $m$ be a positive integer, and let $x_0, \ldots, x_n$ be elements of $O_S^*$ such that
\[ x_0 + \cdots + x_n = 0 \]
and every proper subset of $\{x_0, \ldots, x_n\}$ is linearly independent over $k \cdot K^{p^m}$. Then
\[ H_K^{\mathrm{hom}}(\mathbf{x}) \le \frac{n(n-1)}{2}\, p^{m-1} \max(0,\, 2g - 2 + |S|). \]

Wang's proof is a positive characteristic analogue of the Wronskian argument of Brownawell and Masser. In Wang (1999) she slightly sharpened her result. Hsia and Wang (2004) proved a further extension to function fields of arbitrary transcendence degree, in the cases of both zero characteristic and positive characteristic. Their proof uses generalized Wronskians.
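The remark following Theorem 7.6.1, that $p^m$-th powers produce solutions of unbounded height, can be made explicit as follows. This is a short verification of our own, using only the definition of $H_K$ given above and the additivity of the Frobenius map in characteristic $p$:

```latex
\begin{align*}
x_1^{p^m} + x_2^{p^m} + x_3^{p^m} &= (x_1 + x_2 + x_3)^{p^m} = 0,\\
H_K\bigl((x_1/x_2)^{p^m}\bigr)
  &= -\sum_{v \in M_K} \min\bigl(0,\, p^m\, v(x_1/x_2)\bigr)
   = p^m \cdot H_K(x_1/x_2).
\end{align*}
```

Since $k$ is algebraically closed in $K$, an element $x_1/x_2 \notin k$ has at least one pole, so $H_K(x_1/x_2) > 0$ and the right-hand side tends to infinity with $m$, while the conditions $v(x_1^{p^m}) = v(x_2^{p^m}) = v(x_3^{p^m})$ for $v \in M_K \setminus S$ continue to hold.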
We now turn to linear equations with unknowns from a finitely generated multiplicative group. Let $p$ be a prime, $\mathbb{F}_p$ the field of $p$ elements, and $\overline{\mathbb{F}_p}$ its algebraic closure. Again, $K$ is a field of characteristic $p > 0$ but we now assume that $K$ is a finitely generated, transcendental extension of $\mathbb{F}_p$. Let $k$ be the algebraic closure of $\mathbb{F}_p$ in $K$; so $k$ is a finite field.

Let us start with the equation
\[ a_1 x_1 + a_2 x_2 = 1 \quad\text{in } \mathbf{x} = (x_1, x_2) \in \Gamma, \tag{7.6.1} \]
where $\Gamma$ is a subgroup of $(K^*)^2$ of finite rank not contained in $(\overline{\mathbb{F}_p}^*)^2$ and $\mathbf{a} = (a_1, a_2) \in (K^*)^2$. For instance, if there is an integer $l$ coprime with $p$ such that $\mathbf{a}^l \in \Gamma$, then (7.6.1) may have infinitely many solutions. Specifically, let $q$ be a power of $p$ such that $l \mid q - 1$ and let $\mathbf{u} = (u_1, u_2)$ be a solution of (7.6.1) with $\mathbf{u} \notin (\overline{\mathbb{F}_p}^*)^2$; then
\[ \mathbf{a}^{q^e - 1} \cdot \mathbf{u}^{q^e} \quad (e = 0, 1, 2, \ldots) \tag{7.6.2} \]
yield infinitely many different solutions of (7.6.1).

The following nice result is due to Voloch (1998), Theorem 2.

Theorem 7.6.3 Let $\Gamma$ have rank $r \ge 0$. Assume there is no positive integer $l$ such that $\mathbf{a}^l \in \Gamma$. Then (7.6.1) has at most
\[ \frac{p^r (p^r + p - 2)}{p - 1} \]
solutions.

The proof uses derivations on $K$.

Now let $n \ge 2$, $\Gamma$ a finitely generated subgroup of $(K^*)^n$ ($n$-fold direct product, not to be confused with the field of $p$-th powers $K^p$ in case $n, p$ happen to be equal), and $\mathbf{a} = (a_1, \ldots, a_n) \in (K^*)^n$, and consider the equation
\[ a_1 x_1 + \cdots + a_n x_n = 1 \quad\text{in } \mathbf{x} = (x_1, \ldots, x_n) \in \Gamma. \tag{7.6.3} \]
The following result is a consequence of Hsia's and Wang's analogue of Theorem 7.6.2 for function fields in several variables mentioned above.

Theorem 7.6.4 Equation (7.6.3) has only finitely many solutions such that $a_1 x_1, \ldots, a_n x_n$ are linearly independent over $K^p$.

We want to weaken the condition that a1 x1 , . . . 
, an xn be linearly independent over $K^p$ to the condition that $(x_1, \ldots, x_n)$ be non-degenerate, that is,
\[ \sum_{i \in I} a_i x_i \ne 0 \quad\text{for each non-empty proper subset } I \text{ of } \{1, \ldots, n\}. \]
Then the situation becomes much more complicated, due to the Frobenius action on $K$, and in fact the solutions can be divided into finitely many infinite classes with a particular structure.

In Mason (1986b), Masser (2004), Adamczewski and Bell (2012) and Derksen and Masser (2012) various descriptions are given for the set of non-degenerate solutions of (7.6.3). We discuss here a result from the last paper, which unlike Voloch's result does not give an upper bound for the number of classes of solutions, but instead implies that these classes can be determined effectively.

As an introduction to the result of Derksen and Masser, note that we can write the set of solutions of (7.6.1) given in (7.6.2) as
\[ \psi_{\mathbf{a}}^{-1} \varphi_q^{e} \psi_{\mathbf{a}}(\mathbf{u}) \quad (e = 0, 1, 2, \ldots), \tag{7.6.4} \]
where $\psi_{\mathbf{a}}, \varphi_q$ are the maps given by $\psi_{\mathbf{a}}(\mathbf{x}) := \mathbf{a} \cdot \mathbf{x}$, $\varphi_q(\mathbf{x}) := \mathbf{x}^q$.

We now proceed to state the result of Derksen and Masser on the non-degenerate solutions of (7.6.3). Denote by $\Gamma_K$ the group of $\mathbf{u} \in (K^*)^n$ for which there exists $l \in \mathbb{Z}_{>0}$ with $\mathbf{u}^l \in \Gamma$. For a power $q$ of $p$ and for $\mathbf{u} \in \Gamma$, $\mathbf{g}_1, \ldots, \mathbf{g}_h \in \Gamma_K$, define the set
\[ [\mathbf{g}_1, \ldots, \mathbf{g}_h]_q(\mathbf{u}) := \bigl\{ \psi_{\mathbf{g}_1}^{-1} \varphi_q^{e_1} \psi_{\mathbf{g}_1} \cdots \psi_{\mathbf{g}_h}^{-1} \varphi_q^{e_h} \psi_{\mathbf{g}_h}(\mathbf{u}) : e_1, \ldots, e_h \in \mathbb{Z}_{\ge 0} \bigr\}, \]
where $\varphi_q$ and the $\psi_{\mathbf{g}_i}$ are maps from $(K^*)^n$ to $(K^*)^n$ given by (7.6.4). We call such a set a $(\Gamma_K, q)$-set of order $h$. We agree here that a $(\Gamma_K, q)$-set of order 0 is a single element. Computing a $(\Gamma_K, q)$-set means computing $q$ and the tuple $(\mathbf{g}_1, \ldots, \mathbf{g}_h, \mathbf{u})$ by which it is defined.

The following result is part of Derksen and Masser (2012), Theorem 3.
Theorem 7.6.5 Let $a_1, \ldots, a_n \in K^*$. Assume that $\Gamma$ is not contained in $(\overline{\mathbb{F}_p}^*)^n$ and that $\Gamma_K$ is finitely generated. Then there is a power $q$ of $p$ such that the set of non-degenerate solutions of (7.6.3) is contained in a finite union of $(\Gamma_K, q)$-sets of order at most $n - 1$. Further, if suitable effective representations of $K$ and $\Gamma$ are given, the prime power $q$ and these $(\Gamma_K, q)$-sets can be determined effectively.

In their proof, Derksen and Masser derived a sharpening of Theorem 7.6.4, basically by extending the argument of Brownawell and Masser based on Wronskians. From there, they completed the proof of Theorem 7.6.5 by means of an inductive argument.

We mention that earlier, Masser (2004) obtained a less precise and ineffective version of Theorem 7.6.5. Building further on work from Derksen (2007) on linear recurrence sequences, Adamczewski and Bell (2012), Theorem 3.1 gave a description of the set of non-degenerate solutions of (7.6.3) in terms of finite $p$-automata, and in fact they proved a much more general result for semi-abelian varieties. Finally, general results on semi-abelian varieties over fields of positive characteristic, implying Theorem 7.6.5, have been proved by means of methods from logic, see Hrushovski (1996), Moosa and Scanlon (2002, 2004) and Ghioca (2008).

We illustrate Theorem 7.6.5 with an example from Masser (2004). Let $K = \mathbb{F}_p(z)$ with $z$ transcendental over $\mathbb{F}_p$ and let $\Gamma = G^3$, where $G$ is the multiplicative subgroup of $K^*$ generated by $z$ and $1 - z$. Consider the equation $x_1 + x_2 - x_3 = 1$ in $(x_1, x_2, x_3) \in \Gamma$.
(7.6.5) This equation has (among others) the non-degenerate solutions
\[ \bigl( z^{(q-1)q'},\ (1 - z)^{qq'},\ z^{(q-1)q'} (1 - z)^{q'} \bigr), \]
where $q, q'$ run independently through the powers of $p$ different from 1. This set of solutions may be described as $[\mathbf{g}_1, \mathbf{g}_2]_p(\mathbf{u})$, where
\[ \mathbf{g}_1 = (1, 1, 1), \quad \mathbf{g}_2 = (z, 1, z(1 - z)^{-1}), \quad \mathbf{u} = \bigl( z^{(p-1)p},\ (1 - z)^{p^2},\ z^{(p-1)p} (1 - z)^{p} \bigr). \]
Leitner (2012), Theorem 2 gave a complete description of the set of solutions of (7.6.5) as a union of $(\Gamma_K, p)$-sets. As it turned out, the set of non-degenerate solutions is contained in a union of 40 $(\Gamma_K, p)$-sets if $p \ge 5$, 48 $(\Gamma_K, 3)$-sets if $p = 3$, and 240 $(\Gamma_K, 2)$-sets if $p = 2$. For $p = 2$, his result was obtained earlier by Arenas-Carmona, Berend and Bergelson (2008).

8 Effective results for unit equations in two unknowns over finitely generated domains

In Chapter 4 we established effective finiteness results on unit equations and $S$-unit equations in two unknowns in an algebraic number field. In this chapter we extend these results to the finitely generated case. More precisely, let $A \supset \mathbb{Z}$ be an integral domain which is finitely generated over $\mathbb{Z}$, i.e., $A = \mathbb{Z}[z_1, \ldots, z_r]$ for certain not necessarily algebraic generators $z_1, \ldots, z_r$, $K$ the quotient field of $A$, and $a_1, a_2, a_3$ non-zero elements of $K$. By a theorem of Roquette (1957), the group $A^*$ of units of $A$ is finitely generated, hence we know from, e.g., the results of Chapter 6 that the equation
\[ a_1 x_1 + a_2 x_2 = a_3 \quad\text{in } x_1, x_2 \in A^* \tag{8.1} \]
has only finitely many solutions. This result is, however, ineffective. In this chapter, we give an effective proof of this finiteness statement, which is valid for any integral domain $A$ that is finitely generated over $\mathbb{Z}$.
In fact, our main result, Theorem 8.1.1, provides effective upper bounds for the “sizes” of the solutions $x_1, x_2$ in terms of suitable effective representations for $A, a_1, a_2, a_3$. This enables one to determine all solutions in principle; see Corollary 8.1.2 below. As a further consequence of Theorem 8.1.1 we deduce an effective finiteness theorem on equation (8.1) in unknowns $x_1, x_2$ taken from a finitely generated and effectively given multiplicative subgroup of $K^*$; see Theorem 8.1.3 below.

Our strategy of proof of Theorem 8.1.1 is roughly as follows. We construct an integral domain $B \supset A$ of a special type that can be dealt with more easily, and consider instead of (8.1) the equation
\[ a_1 x_1 + a_2 x_2 = a_3 \quad \text{in } x_1, x_2 \in B^*. \tag{8.2} \]
In the construction of $B$ we use ideas from Seidenberg (1974). We reduce (8.2) to the function field case, and using Mason’s Theorem 7.1.1 we derive an effective upper bound for the degrees of the solutions. Next, by means of effective specializations, i.e., explicitly given ring homomorphisms $B \to \overline{\mathbb{Q}}$, we reduce (8.2) to various S-unit equations in different algebraic number fields, and apply the results of Chapter 4 to the S-unit equations obtained. This provides enough information to derive an effective upper bound for the heights of the solutions of (8.2). From this, we can effectively determine all solutions of (8.2). The final, crucial step is to go back from (8.2) to (8.1) and to select from the solutions in $B^*$ those that belong to $A^*$. For this we have developed an effective procedure, based on an effective result of Aschenbrenner (2004) on systems of linear equations over polynomial rings over $\mathbb{Z}$.
The above approach was developed by Győry (1983, 1984). However, at that time Aschenbrenner’s result was not yet available. Hence, to select those solutions from $B^*$ of the equations under consideration that belong to $A^*$, certain restrictions on the integral domain $A$ had to be imposed.

This chapter is organized as follows. In Section 8.1 we give the necessary definitions and state our results. In Sections 8.2–8.6 we prove Theorem 8.1.1. More precisely, in Sections 8.2 and 8.3 we construct the domain $B$ and give the effective procedure to select those elements of $B^*$ that belong to $A^*$, in Section 8.4 we reduce (8.2) to the function field case and apply Mason’s Theorem, in Section 8.5 we develop some effective specialization theory, and in Section 8.6 we reduce (8.2) to S-unit equations over number fields, apply the results from Chapter 4, and complete the proof. In Section 8.7 we prove Theorem 8.1.3. In Section 8.8 we briefly discuss some related results.

The results in this chapter were proved for the first time in Evertse and Győry (2013). We closely follow the exposition of that paper.

8.1 Statements of the results

We introduce the notation used in our theorems. Let again $A \supset \mathbb{Z}$ be an integral domain which is finitely generated over $\mathbb{Z}$, say $A = \mathbb{Z}[z_1, \ldots, z_r]$. Let $I$ be the ideal of polynomials $f \in \mathbb{Z}[X_1, \ldots, X_r]$ such that $f(z_1, \ldots, z_r) = 0$. Then $I$ is finitely generated, hence
\[ A \cong \mathbb{Z}[X_1, \ldots, X_r]/I, \qquad I = (f_1, \ldots, f_m) \]
for some finite set of polynomials $f_1, \ldots, f_m \in \mathbb{Z}[X_1, \ldots, X_r]$. We observe here that given $f_1, \ldots, f_m$, it can be checked effectively whether $A$ is a domain containing $\mathbb{Z}$. Indeed, this holds if and only if $I$ is a prime ideal of $\mathbb{Z}[X_1, \ldots, X_r]$ with $I \cap \mathbb{Z} = (0)$, and the latter can be checked effectively, for instance using Aschenbrenner (2004), Proposition 4.10, Corollary 3.5.
Denote by $K$ the quotient field of $A$. For $\alpha \in A$, we call $f$ a representative for $\alpha$, or say that $f$ represents $\alpha$, if $f \in \mathbb{Z}[X_1, \ldots, X_r]$ and $\alpha = f(z_1, \ldots, z_r)$. Further, for $\alpha \in K$, we call $(f, g)$ a pair of representatives for $\alpha$, or say that $(f, g)$ represents $\alpha$, if $f, g \in \mathbb{Z}[X_1, \ldots, X_r]$, $g \notin I$ and $\alpha = f(z_1, \ldots, z_r)/g(z_1, \ldots, z_r)$. We say that $\alpha \in A$ (resp. $\alpha \in K$) is given if a representative (resp. pair of representatives) for $\alpha$ is given.

To do explicit computations in $A$ and $K$, one needs an ideal membership algorithm for $\mathbb{Z}[X_1, \ldots, X_r]$, which decides, for any given polynomial and ideal of $\mathbb{Z}[X_1, \ldots, X_r]$, whether the polynomial belongs to the ideal. In the literature there are various such algorithms; we mention only the algorithm of Simmons (1970), and the more precise algorithm of Aschenbrenner (2004) which plays an important role in this chapter; see Proposition 8.2.5 below for a statement of his result. One can perform arithmetic operations on $A$ and $K$ by using representatives. Further, one can decide effectively whether two polynomials $g_1, g_2$ represent the same element of $A$, i.e., $g_1 - g_2 \in I$, or whether two pairs of polynomials $(g_1, h_1)$, $(g_2, h_2)$ represent the same element of $K$, i.e., $g_1 h_2 - g_2 h_1 \in I$, by using one of the ideal membership algorithms mentioned above.

The degree $\deg f$ of a polynomial $f \in \mathbb{Z}[X_1, \ldots, X_r]$ is by definition its total degree. By the logarithmic height $h(f)$ of $f$ we mean the logarithm of the maximum of the absolute values of its coefficients. The size of $f$ is defined by
\[ s(f) := \max(1, \deg f, h(f)). \]
Clearly, there are only finitely many polynomials in $\mathbb{Z}[X_1, \ldots, X_r]$ of size below a given bound, and these can be determined effectively.
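The definitions of degree, logarithmic height and size just given can be made concrete by a minimal sketch. The representation of a polynomial as a dictionary from exponent tuples to integer coefficients, the function names, and the convention that the zero polynomial has degree and height 0 are assumptions made here for illustration only; they are not taken from the text.

```python
import math

# Hypothetical model: a polynomial in Z[X_1, ..., X_r] is a dict mapping
# exponent tuples (e_1, ..., e_r) to integer coefficients.

def total_degree(f):
    """Total degree deg f: the largest e_1 + ... + e_r over the monomials of f.

    Assumed convention: the zero polynomial gets degree 0.
    """
    return max((sum(e) for e, c in f.items() if c != 0), default=0)

def log_height(f):
    """Logarithmic height h(f): log of the largest absolute value of a coefficient.

    Assumed convention: the zero polynomial gets height 0.
    """
    coeffs = [abs(c) for c in f.values() if c != 0]
    return math.log(max(coeffs)) if coeffs else 0.0

def size(f):
    """Size s(f) := max(1, deg f, h(f))."""
    return max(1, total_degree(f), log_height(f))

# f = 7*X1^2*X2 - 3*X2 + 1 in Z[X1, X2]: deg f = 3, h(f) = log 7, so s(f) = 3.
f = {(2, 1): 7, (0, 1): -3, (0, 0): 1}
print(total_degree(f), log_height(f), size(f))
```

Since a bound on $s(f)$ bounds both the exponents and the absolute values of the coefficients, only finitely many such dictionaries can occur below a given bound, which is the finiteness remark made above.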
We consider equations
\[ a_1 x_1 + a_2 x_2 = a_3 \quad \text{in } x_1, x_2 \in A^*, \tag{8.1.1} \]
where $a_1, a_2, a_3$ are non-zero elements of $A$.

Theorem 8.1.1 Assume that $r \geq 1$. Let $\tilde{a}_1, \tilde{a}_2, \tilde{a}_3$ be representatives for $a_1, a_2, a_3$, respectively. Assume that $f_1, \ldots, f_m$ and $\tilde{a}_1, \tilde{a}_2, \tilde{a}_3$ all have degree at most $d$ and logarithmic height at most $h$, where $d \geq 1$, $h \geq 1$. Then for each solution $(x_1, x_2)$ of (8.1.1), there are representatives $\tilde{x}_1, \tilde{x}_1', \tilde{x}_2, \tilde{x}_2'$ of $x_1, x_1^{-1}, x_2, x_2^{-1}$, respectively, such that
\[ s(\tilde{x}_i),\ s(\tilde{x}_i') \leq \exp\bigl( (2d)^{c_1^r} (h + 1) \bigr) \quad \text{for } i = 1, 2, \]
where $c_1$ is an effectively computable absolute constant $> 1$.

By a theorem of Roquette (1957), the unit group of an integral domain finitely generated over $\mathbb{Z}$ is finitely generated. In the case that $A = O_S$ is the ring of S-integers of a number field it is possible to determine effectively a system of generators for $A^*$, and this was used in all effective finiteness proofs for (8.1.1) with $A = O_S$. However, no general algorithm is known to determine a system of generators for the unit group of an arbitrary finitely generated domain $A$. In our proof of Theorem 8.1.1, we do not need any information on the generators of $A^*$.

By combining Theorem 8.1.1 with an ideal membership algorithm for the polynomial ring $\mathbb{Z}[X_1, \ldots, X_r]$, one easily deduces the following.

Corollary 8.1.2 Given $f_1, \ldots, f_m$ such that $A$ is an integral domain containing $\mathbb{Z}$, and given $a_1, a_2, a_3 \in A \setminus \{0\}$, the solutions of (8.1.1) can be determined effectively.

Proof. Clearly, $(x_1, x_2)$ is a solution of (8.1.1) if and only if there are polynomials $\tilde{x}_i$, $\tilde{x}_i'$ ∈ Z[X1 , . . .
, Xr ] (i = 1, 2) such that xi represents xi for i = 1, 2, and a1 · x1 + a2 · x2 − a3 ∈ I, , xi · xi − 1 ∈ I for i = 1, 2. (8.1.2) Thus, we obtain all solutions of (8.1.1) by checking, for each quadruple of r polynomials x1 , x1 , x2 , x2 ∈ Z[X1 , . . . , Xr ] of size at most exp((2d)c1 (h + 1)) whether it satisfies (8.1.2). Further, using the ideal membership algorithm, it can be checked effectively whether two different pairs (x1 , x2 ) represent the same solution of (8.1.1). Thus, we can make a list of representatives, one for each solution of (8.1.1). Let γ1 , . . . , γs be multiplicatively independent elements of K ∗ . For given elements γ1 , . . . , γs ∈ K ∗ the multiplicative independence of γ1 , . . . , γs can be checked effectively, see for instance Lemma 8.7.2 below. Let again a1 , a2 , a3 be non-zero elements of A and consider the equation a1 γ1v1 · · · γsvs + a2 γ1w1 · · · γsws = a3 in v1 , . . . , vs , w1 , . . . , ws ∈ Z. (8.1.3) Theorem 8.1.3 Let a1 , a2 , a3 be representatives for a1 , a2 , a3 and for i = 1, . . . , s, let (gi1 , gi2 ) be a pair of representatives for γi . Suppose that f1 , . . . , fm , a1 , a2 , a3 , and gi1 , gi2 (i = 1, . . . , s) all have degree at most d and logarithmic height at most h, where d ≥ 1, h ≥ 1. Then for each solution (v1 , . . . , ws ) of (8.1.3) we have r max(|v1 |, . . . , |vs |, |w1 |, . . . , |ws |) ≤ exp (2s d)c2 (h + 1) , where c2 is an effectively computable absolute constant > 1. An immediate consequence of Theorem 8.1.3 is that for given f1 , . . . , fm , a1 , a2 , a3 and γ1 , . . . , γs , the solutions of (8.1.3) can be determined effectively. Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:29:36, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. 
Since every integral domain finitely generated over $\mathbb{Z}$ has a finitely generated unit group, equation (8.1.1) may be viewed as a special case of (8.1.3). But since no general effective algorithm is known to find a finite system of generators for the unit group of a finitely generated integral domain, we cannot deduce an effective result for (8.1.1) from Theorem 8.1.3. In fact, we argue in the reverse direction, and prove Theorem 8.1.3 by combining Theorem 8.1.1 with an effective result on Diophantine equations of the type $\gamma_1^{v_1} \cdots \gamma_s^{v_s} = \gamma_0$ in integers $v_1, \ldots, v_s$, where $\gamma_1, \ldots, \gamma_s, \gamma_0 \in K^*$ (see Corollary 8.7.3 below).

8.2 Effective linear algebra over polynomial rings

We have gathered from the literature some effective results for systems of linear equations to be solved in polynomials with coefficients in a field, or with coefficients in $\mathbb{Z}$.

As usual, we write $\log^* u := \max(1, \log u)$ for $u > 0$, and $\log^* 0 := 1$. We use the notation $O(\cdot)$ as an abbreviation for $c$ times the expression between the parentheses, where $c$ is an effectively computable positive absolute constant (notice that the meaning of the $O$-symbol is different from that of the usual $O$-symbol, which means “at most $c$ times the expression between the parentheses”). At each occurrence of $O(\cdot)$, the value of $c$ may be different.

Given a ring $R$, we denote by $R^{m,n}$ the $R$-module of $m \times n$-matrices with entries in $R$ and by $R^n$ the $R$-module of $n$-dimensional column vectors with entries in $R$. Further, as usual $\mathrm{GL}(n, R)$ denotes the group of matrices in $R^{n,n}$ with determinant in the unit group $R^*$. The degree of a polynomial $f \in R[X_1, \ldots, X_N]$, that is, its total degree, is denoted by $\deg f$.

From matrices $U, V$ with the same number of rows, we form a matrix $[U, V]$ by placing the columns of $V$ after those of $U$, and from two matrices $U, V$ with the same number of columns we form $\begin{bmatrix} U \\ V \end{bmatrix}$ by placing the rows of $V$ below those of $U$.
The logarithmic height $h(S)$ of a finite set $S = \{a_1, \ldots, a_t\} \subset \mathbb{Z}$ is defined by $h(S) := \log \max(|a_1|, \ldots, |a_t|)$. The logarithmic height $h(U)$ of a matrix $U$ with entries in $\mathbb{Z}$ is defined as the logarithmic height of the set of entries of $U$. The logarithmic height $h(f)$ of a polynomial $f$ with coefficients in $\mathbb{Z}$ is the logarithmic height of the set of non-zero coefficients of $f$.

Lemma 8.2.1 Let $U \in \mathbb{Z}^{m,n}$. Then the $\mathbb{Q}$-vector space of $y \in \mathbb{Q}^n$ with $U y = 0$ is generated by vectors in $\mathbb{Z}^n$ of logarithmic height at most $m\,h(U) + \tfrac{1}{2} m \log m$.

Proof. Without loss of generality we may assume that $U$ has rank $m$, and moreover, that the matrix $V$ consisting of the first $m$ columns of $U$ is invertible. Let $\Delta := \det V$. By multiplying with $\Delta V^{-1}$, we can rewrite $U y = 0$ as $[\Delta I_m, W] y = 0$, where $I_m$ is the $m \times m$-unit matrix, and $W$ consists of $m \times m$-subdeterminants of $U$. The solution space of this system is generated by the columns of $\begin{bmatrix} -W \\ \Delta I_{n-m} \end{bmatrix}$. An application of Hadamard’s inequality gives the upper bound from the lemma for the logarithmic heights of these columns.

Proposition 8.2.2 Let $F$ be a field, $N \geq 1$, and $R := F[X_1, \ldots, X_N]$. Further, let $U$ be an $m \times n$-matrix and $b$ an $m$-dimensional column vector, both consisting of polynomials from $R$ of degree $\leq d$ where $d \geq 1$.
(i) The $R$-module of $x \in R^n$ with $U x = 0$ is generated by vectors $x$ whose coordinates are polynomials of degree at most $(2md)^{2^N}$.
(ii) Suppose that $U x = b$ is solvable in $x \in R^n$. Then it has a solution $x$ whose coordinates are polynomials of degree at most $(2md)^{2^N}$.

Proof. See Aschenbrenner (2004), theorems 3.2, 3.4.
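Returning to Lemma 8.2.1, the construction in its proof can be sketched in a few lines. The sketch below is an illustration under the same normalisation as in the proof (the matrix has rank $m$ and its first $m$ columns are invertible); the function names `det`, `kernel_generators` and `lemma_bound` are hypothetical, and exact rational arithmetic from the standard library stands in for whatever linear algebra routine one prefers.

```python
import math
from fractions import Fraction

def det(M):
    """Exact determinant of a square integer matrix (Gaussian elimination over Q)."""
    n = len(M)
    A = [[Fraction(x) for x in row] for row in M]
    sign, result = 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            return 0
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            sign = -sign
        result *= A[c][c]
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= factor * A[c][k]
    return int(sign * result)

def kernel_generators(U):
    """Integer generators of {y in Q^n : U y = 0}: the columns of [-W; Delta*I].

    Assumption (as in the proof): U is m x n of rank m and the matrix V formed
    by its first m columns is invertible.
    """
    m, n = len(U), len(U[0])
    V = [list(row[:m]) for row in U]
    delta = det(V)
    if delta == 0:
        raise ValueError("the first m columns must form an invertible matrix")
    gens = []
    for j in range(m, n):
        # Cramer's rule: w[i] = det(V with column i replaced by column j of U),
        # so that V w = delta * (column j of U).
        w = [det([row[:i] + [U[r][j]] + row[i + 1:] for r, row in enumerate(V)])
             for i in range(m)]
        tail = [delta if k == j else 0 for k in range(m, n)]
        gens.append([-x for x in w] + tail)
    return gens

def lemma_bound(U):
    """The bound m*h(U) + (1/2)*m*log(m) from Lemma 8.2.1 (U assumed non-zero)."""
    m = len(U)
    h_U = math.log(max(abs(x) for row in U for x in row))
    return m * h_U + 0.5 * m * math.log(m)

U = [[1, 2, 3], [4, 5, 6]]
gens = kernel_generators(U)
print(gens)  # [[-3, 6, -3]]
```

Every entry of a generator is, up to sign, an $m \times m$-subdeterminant of $U$, which is why Hadamard’s inequality gives the stated height bound; for the example above the generator has logarithmic height $\log 6$, well below `lemma_bound(U)`.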
Part (ii) of Proposition 8.2.2 was obtained earlier by Masser and Wüstholz (1983), Proposition on p.440, estimate on top of p.442, with the slightly smaller bound $(2md)^{2^{N-1}}$. Results of this type, but not with a completely correct proof, were given in Hermann (1926) and Seidenberg (1974).

Corollary 8.2.3 Let $R := \mathbb{Q}[X_1, \ldots, X_N]$. Further, let $U$ be an $m \times n$-matrix of polynomials in $\mathbb{Z}[X_1, \ldots, X_N]$ of degrees at most $d$ and logarithmic heights at most $h$ where $d \geq 1$, $h \geq 1$. Then the $R$-module of $x \in R^n$ with $U x = 0$ is generated by vectors $x$, consisting of polynomials in $\mathbb{Z}[X_1, \ldots, X_N]$ of degree at most $(2md)^{2^N}$ and height at most $(2md)^{6^N}(h + 1)$.

Proof. By Proposition 8.2.2 (i) we have to study $U x = 0$, restricted to vectors $x \in R^n$ consisting of polynomials of degree at most $(2md)^{2^N}$. The set of these $x$ is a finite dimensional $\mathbb{Q}$-vector space, and we have to prove that it is generated by vectors whose coordinates are polynomials in $\mathbb{Z}[X_1, \ldots, X_N]$ of logarithmic height at most $(2md)^{6^N}(h + 1)$.

If $x$ consists of polynomials of degree at most $(2md)^{2^N}$, then $U x$ consists of $m$ polynomials with coefficients in $\mathbb{Q}$ of degrees at most $(2md)^{2^N} + d$, all of whose coefficients have to be set to 0. This leads to a system of linear equations $V y = 0$, where $y$ consists of the coefficients of the polynomials in $x$ and $V$ consists of integers of logarithmic heights at most $h$. Notice that the number $m^*$ of rows of $V$ is $m$ times the number of monomials in $N$ variables of degree at most $(2md)^{2^N} + d$, that is
\[ m^* \leq m \binom{(2md)^{2^N} + d + N}{N}. \]
By Lemma 8.2.1 the solution space of $V y = 0$ is generated by integer vectors of logarithmic height at most
\[ m^* h + \tfrac{1}{2} m^* \log m^* \leq (2md)^{6^N}(h + 1). \]
This completes the proof of our corollary.

Lemma 8.2.4 Let $U \in \mathbb{Z}^{m,n}$, $b \in \mathbb{Z}^m$ be such that $U y = b$ is solvable in $\mathbb{Z}^n$. Then it has a solution $y \in \mathbb{Z}^n$ with $h(y) \leq m\,h([U, b]) + \tfrac{1}{2} m \log m$.

Proof. Assume without loss of generality that $U$ and $[U, b]$ have rank $m$. By a result of Borosh, Flahive, Rubin and Treybig (1989) (see also Lemma 4.3.5), $U y = b$ has a solution $y \in \mathbb{Z}^n$ such that the absolute values of the entries of $y$ are bounded above by the maximum of the absolute values of the $m \times m$-subdeterminants of $[U, b]$. The upper bound for $h(y)$ as in the lemma easily follows from Hadamard’s inequality.

Proposition 8.2.5 Let $N \geq 1$ and let $f_1, \ldots, f_m, b \in \mathbb{Z}[X_1, \ldots, X_N]$ be polynomials of degrees at most $d$ and logarithmic heights at most $h$ where $d \geq 1$, $h \geq 1$, such that
\[ f_1 x_1 + \cdots + f_m x_m = b \tag{8.2.1} \]
is solvable in $x_1, \ldots, x_m \in \mathbb{Z}[X_1, \ldots, X_N]$. Then (8.2.1) has a solution in polynomials $x_1, \ldots, x_m \in \mathbb{Z}[X_1, \ldots, X_N]$ with
\[ \deg x_i \leq (2d)^{\exp O(N \log^* N)}(h + 1), \qquad h(x_i) \leq (2d)^{\exp O(N \log^* N)}(h + 1)^{N+1} \tag{8.2.2} \]
for $i = 1, \ldots, m$.

Proof. Aschenbrenner’s main theorem (Aschenbrenner (2004), Theorem A) states that equation (8.2.1) has a solution $x_1, \ldots, x_m \in \mathbb{Z}[X_1, \ldots, X_N]$ with $\deg x_i \leq d_0$ for $i = 1, \ldots, m$, where $d_0 = (2d)^{\exp O(N \log^* N)}(h + 1)$. So it remains to show the existence of a solution with small logarithmic height.

Let us restrict ourselves to solutions $(x_1, \ldots, x_m)$ of (8.2.1) of degree $\leq d_0$, and denote by $y$ the vector of coefficients of the polynomials $x_1, \ldots, x_m$. Then (8.2.1) translates into a system of linear equations $U y = b$ which is solvable over $\mathbb{Z}$.
Here, the number of equations, i.e., the number of rows of $U$, is equal to $m^* := \binom{d_0 + d + N}{N}$. Further, $h([U, b]) \leq h$. By Lemma 8.2.4, $U y = b$ has a solution $y$ with coordinates in $\mathbb{Z}$ of height at most
\[ m^* h + \tfrac{1}{2} m^* \log m^* \leq (2d)^{\exp O(N \log^* N)}(h + 1)^{N+1}. \]
It follows that (8.2.1) has a solution $x_1, \ldots, x_m \in \mathbb{Z}[X_1, \ldots, X_N]$ satisfying (8.2.2).

Remarks
1 Aschenbrenner (2004) gives an example which shows that the upper bound for the degrees of the $x_i$ cannot depend on $d$ and $N$ only.
2 The above proposition gives an effective criterion for ideal membership in the polynomial ring $\mathbb{Z}[X_1, \ldots, X_N]$. Let $b \in \mathbb{Z}[X_1, \ldots, X_N]$ be given. Further, suppose that an ideal $I$ of $\mathbb{Z}[X_1, \ldots, X_N]$ is given by a finite set of generators $f_1, \ldots, f_m$. By the above proposition, if $b \in I$ then there are $x_1, \ldots, x_m \in \mathbb{Z}[X_1, \ldots, X_N]$ with upper bounds for the degrees and heights as in (8.2.2) such that $b = \sum_{i=1}^m x_i f_i$. It requires only a finite computation to check whether such $x_i$ exist.

8.3 A reduction

We reduce the general unit equation (8.1.1) to a unit equation over an integral domain $B$ of a special type that can be dealt with more easily.

Let again $A = \mathbb{Z}[z_1, \ldots, z_r]$ be an integral domain finitely generated over $\mathbb{Z}$ and denote by $K$ the quotient field of $A$. We assume that $r > 0$. We have
\[ A \cong \mathbb{Z}[X_1, \ldots, X_r]/I, \tag{8.3.1} \]
where $I$ is the ideal of polynomials $f \in \mathbb{Z}[X_1, \ldots, X_r]$ such that $f(z_1, \ldots, z_r) = 0$. The ideal $I$ is finitely generated. Let $d \geq 1$, $h \geq 1$ and assume that
\[ I = (f_1, \ldots, f_m) \quad \text{with } \deg f_i \leq d,\ h(f_i) \leq h \ (i = 1, \ldots, m). \tag{8.3.2} \]
Suppose that $K$ has transcendence degree $q \geq 0$. In the case that $q > 0$, we assume without loss of generality that $z_1, \ldots, z_q$ form a transcendence basis of $K/\mathbb{Q}$. We write $t := r - q$ and rename $z_{q+1}, \ldots, z_r$ as $y_1, \ldots, y_t$, respectively. In the case that $t = 0$ we have $A = \mathbb{Z}[z_1, \ldots, z_q]$, $A^* = \{\pm 1\}$ and Theorem 8.1.1 is trivial. So we assume henceforth that $t > 0$. Define A0 := Z[z1 , . . .
, zq ], K0 := Q(z1 , . . . , zq ) if $q > 0$, and $A_0 := \mathbb{Z}$, $K_0 := \mathbb{Q}$ if $q = 0$. Then
\[ A = A_0[y_1, \ldots, y_t], \qquad K = K_0(y_1, \ldots, y_t). \]
Clearly, $K$ is a finite extension of $K_0$, so in particular it is an algebraic number field if $q = 0$. Using standard algebra techniques, worked out in detail below, one can show that there exist $y \in A$ and $f \in A_0$ such that $K = K_0(y)$, $y$ is integral over $A_0$, and
\[ A \subseteq B := A_0[f^{-1}, y], \qquad a_1, a_2, a_3 \in B^*, \]
where $a_1, a_2, a_3$ are the coefficients in (8.1.1). If $x_1, x_2 \in A^*$ is a solution to (8.1.1), then $x_i' := a_i x_i / a_3$ ($i = 1, 2$) satisfy
\[ x_1' + x_2' = 1, \qquad x_1', x_2' \in B^*. \tag{8.3.3} \]
At the end of this section, we formulate Proposition 8.3.7 which gives an effective result for equations of the type (8.3.3). More precisely, we introduce a different type of degree and height, $\overline{\deg}(\alpha)$ and $\overline{h}(\alpha)$, for elements $\alpha$ of $B$, and give effective upper bounds for the $\overline{\deg}$ and $\overline{h}$ of $x_1', x_2'$. Subsequently we deduce Theorem 8.1.1.

The deduction of Theorem 8.1.1 is based on some auxiliary results which are proved first. We start with an explicit construction of $y$, $f$, with effective upper bounds in terms of $r$, $d$, $h$ and $a_1, a_2, a_3$ for the degrees and logarithmic heights of $f$ and of the coefficients in $A_0$ of the monic minimal polynomial of $y$ over $A_0$. Here we follow more or less Seidenberg (1974). Second, for a given solution $x_1, x_2$ of (8.1.1), we derive effective upper bounds for the degrees and logarithmic heights of representatives for $x_i$, $x_i^{-1}$ ($i = 1, 2$) in terms of $\overline{\deg}(x_i')$, $\overline{h}(x_i')$ ($i = 1, 2$). Here we use Proposition 8.2.5 (Aschenbrenner’s result).

We introduce some further notation. First let q > 0. Then since z1 , . . .
, zq are algebraically independent, we may view them as independent variables, and for α ∈ A0 , we denote by deg α, h(α) the total degree and logarithmic height of α, viewed as a polynomial in z1 , . . . , zq . In the case that q = 0, we have A0 = Z, and we agree that deg α = 0, h(α) = log |α| for α ∈ A0 . We write Y = (Xq+1 , . . . , Xr ) and K0 (Y) = K0 (Xq+1 , . . . , Xr ). Given f ∈ Q(X1 , . . . , Xr ) we denote by f ∗ the rational function of K0 (Y) obtained by substituting zi for Xi for i = 1, . . . , q (and f ∗ = f if q = 0). We denote by degY f ∗ the (total) degree of f ∗ ∈ K0 [Y] with respect to Y. We recall that the total degree deg g is defined for elements g ∈ A0 and is taken with respect to Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:29:36, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.010 206 Unit equations over finitely generated domains z1 , . . . , zq . With this notation, we can rewrite (8.3.1) and (8.3.2) as ⎫ ∼ A0 [Y]/(f ∗ , . . . , f ∗ ), A= ⎪ m 1 ⎪ ⎪ ⎬ degY fi∗ ≤ d for i = 1, . . . , m, the coefficients of f1∗ , . . . , fm∗ in A0 have degrees at most d ⎪ ⎪ ⎪ ⎭ and logarithmic heights at most h. (8.3.4) Put D := [K : K0 ] and denote by σ1 , . . . , σD the K0 -isomorphic embeddings of K in an algebraic closure K0 of K0 . Lemma 8.3.1 (i) We have D ≤ d t . (ii) There exist integers a1 , . . . , at with |ai | ≤ D 2 for i = 1, . . . , t such that for w := a1 y1 + · · · + at yt we have K = K0 (w). Proof. (i) The images of (y1 , . . . , yt ) under σ1 , . . . , σD all lie in t W := {y ∈ K0 : f1∗ (y) = · · · = fm∗ (y) = 0}. Conversely, using the fact that K ∼ = K0 [Y]/(f1∗ , . . . , fm∗ ), one sees that each assignment (y1 , . . . , yt ) → y with y ∈ W yields a K0 -isomorphic embedding of K in K0 . Hence |W| = D < ∞. Now Corollary 7.5.3 with k = K0 , X = t K0 , Y = ∅ implies that |W| ≤ d t . Hence D ≤ d t . 
(ii) Let a1 , . . . , at be integers. Then w := ti=1 ai yi generates K over K0 t if and only if j =1 aj σi (yj ) (i = 1, . . . , D) are distinct. There are integers ai with |ai | ≤ D 2 for which this holds. Lemma 8.3.2 There are G0 , . . . , GD ∈ A0 such that D Gi w D−i = 0, G0 GD = 0, (8.3.5) i=0 deg Gi ≤ (2d)exp O(r) , h(Gi ) ≤ (2d)exp O(r) (h + 1), (i = 0, . . . , D). (8.3.6) ut u1 Proof. In what follows we write Y = (Xq+1 , . . . , Xr ) and Yu := Xq+1 · · · Xq+t , |u| := u1 + · · · + ut for tuples of non-negative integers u = (u1 , . . . , ut ). Further, we define W := tj =1 aj Xq+j . G0 , . . . , GD as in (8.3.5) clearly exist since w has degree D over K0 . By ∗ ∈ A0 [Y] such that (8.3.4), there are g1∗ , . . . , gm D i=0 Gi W D−i = m gj∗ fj∗ . (8.3.7) j =1 Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:29:36, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.010 8.3 A reduction 207 By Proposition 8.2.2 (ii), applied with the field F = K0 , there are polynomials gj∗ ∈ K0 [Y] (so with coefficients being rational functions in z) satisfying t t (8.3.7) of degree at most (2 max(d, D))2 ≤ (2d t )2 =: d0 in Y. By multiplying G0 , . . . , GD with an appropriate non-zero factor from A0 we may assume that the gj∗ are polynomials in A0 [Y] of degree at most d0 in Y. By considering (8.3.7) with such polynomials gj∗ , we obtain D Gi W D−i = m ⎛ ⎝ j =1 i=0 ⎞ ⎛ gj,u Yu ⎠ · ⎝ |u|≤d0 ⎞ fj,v Yv ⎠ , (8.3.8) |v|≤d where gj,u ∈ A0 and fj∗ = |v|≤d fj,v Yv with fj,v ∈ A0 . We view G0 , . . . , GD and the polynomials gj,u as the unknowns of (8.3.8). Then (8.3.8) has solutions with G0 GD = 0. We may view (8.3.8) as a system of linear equations U x = 0 over K0 , where x consists of Gi (i = 0, . . . , D) and gj,u (j = 1, . . . , m, |u| ≤ d0 ). 
By Lemma 8.3.1 and an elementary estimate, the polynomial $W^{D-i} = \bigl( \sum_{k=1}^t a_k X_{q+k} \bigr)^{D-i}$ has logarithmic height at most $O(D \log(2D^2 t)) \leq (2d)^{O(t)}$. By combining this with (8.3.4), it follows that the entries of the matrix $U$ are elements of $A_0$ of degrees at most $d$ and logarithmic heights at most $h_0 := \max((2d)^{O(t)}, h)$. Further, the number of rows of $U$ is at most the number of monomials in $\mathbf{Y}$ of degree at most $d_0 + d$, which is bounded above by $m_0 := \binom{d_0 + d + t}{t}$. So, by Corollary 8.2.3, the solution module of (8.3.8) is generated by vectors $x = (G_0, \ldots, G_D, \{g_{j,u}\})$, consisting of elements from $A_0$ of degree and height at most
\[ (2 m_0 d)^{2^q} \leq (2d)^{\exp O(r)}, \qquad (2 m_0 d)^{6^q}(h_0 + 1) \leq (2d)^{\exp O(r)}(h + 1), \]
respectively. At least one of these vectors $x$ must have $G_0 G_D \neq 0$, since otherwise (8.3.8) would have no solution with $G_0 G_D \neq 0$, contradicting what we already observed about (8.3.5). Thus, there exists a solution $x$ whose components $G_0, \ldots, G_D$ satisfy both (8.3.5) and (8.3.6). This proves our lemma.

It will be more convenient to work with
\[ y := G_0 w = G_0 \cdot (a_1 y_1 + \cdots + a_t y_t). \]
In the case $D = 1$ we set $y := 1$. The following properties of $y$ follow at once from Corollary 1.9.5 and Lemmas 8.3.1 and 8.3.2.

Corollary 8.3.3 We have $K = K_0(y)$, $y \in A$, $y$ is integral over $A_0$, and $y$ has minimal polynomial $F(X) = X^D + F_1 X^{D-1} + \cdots + F_D$ over $K_0$ with
\[ F_i \in A_0, \qquad \deg F_i \leq (2d)^{\exp O(r)}, \qquad h(F_i) \leq (2d)^{\exp O(r)}(h + 1) \]
for $i = 1, \ldots, D$.

Recall that $A_0 = \mathbb{Z}$ if $q = 0$ and $A_0 = \mathbb{Z}[z_1, \ldots, z_q]$ if $q > 0$, where in the latter case, $z_1, \ldots, z_q$ are algebraically independent.
Hence A0 is a unique factorization domain, and so the greatest common divisor of a finite set of elements of A0 is well-defined and up to sign uniquely determined. With every element α ∈ K we can associate an up to sign unique tuple Pα,0 , . . . , Pα,D−1 , Qα of elements of A0 such that α = Q−1 α D−1 Pα,j y j with Qα = 0, gcd(Pα,0 , . . . , Pα,D−1 , Qα ) = 1. j =0 (8.3.9) Put deg α := max(deg Pα,0 , . . . , deg Pα,D−1 , deg Qα ), h(α) := max(h(Pα,0 ), . . . , h(Pα,D−1 ), h(Qα )). Then for q = 0 we have deg α = 0, h(α) = log max(|Pα,0 |, . . . , |Pα,D−1 |, |Qα |). Lemma 8.3.4 Let α ∈ K ∗ and let (a, b) be a pair of representatives for α, with a, b ∈ Z[X1 , . . . , Xr ], b ∈ I . Put d ∗ := max(d, deg a, deg b), h∗ := max(h, h(a), h(b)). Then deg α ≤ (2d ∗ )exp O(r) , h(α) ≤ (2d ∗ )exp O(r) (h∗ + 1). Proof. Consider the linear equation Q·α = D−1 Pj y j (8.3.10) j =0 in unknowns P0 , . . . , PD−1 , Q ∈ A0 . This equation has a solution with Q = 0, since α ∈ K = K0 (y) and y has degree D over K0 . Write again Y = (Xq+1 , . . . , Xr ) and put Y := G0 · ( tj =1 aj Xq+j ). Let a ∗ , b∗ ∈ A0 [Y] be obtained from a, b by substituting zi for Xi for i = 1, . . . , q (a ∗ = a, b∗ = b Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:29:36, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.010 8.3 A reduction 209 if q = 0). By (8.3.4), there are gj∗ ∈ A0 [Y] such that Q · a ∗ − b∗ D−1 Pj Y j = j =0 m gj∗ fj∗ . (8.3.11) j =1 By Proposition 8.2.2 (ii) this identity holds with polynomials gj∗ ∈ A0 [Y] of t t degree in Y at most (2 max(d ∗ , D))2 ≤ (2d ∗ )t2 , where possibly we have to multiply (P0 , . . . , PD−1 , Q) by a non-zero element from A0 . Now completely similarly as in the proof of Lemma 8.3.2, one can rewrite (8.3.11) as a system of linear equations over K0 and then apply Corollary 8.2.3. It follows that (8.3.10) is satisfied by P0 , . . . 
, PD−1 , Q ∈ A0 with Q = 0 and deg Pi , deg Q ≤ (2d ∗ )exp O(r) , h(Pi ), h(Q) ≤ (2d ∗ )exp O(r) (h∗ + 1) (i = 0, . . . , D − 1). By dividing P0 , . . . , PD−1 , Q by their greatest common divisor and using Corollary 1.9.5 we obtain Pα,0 , . . . , PD−1,α , Qα ∈ A0 satisfying both (8.3.9) and deg Pi,α , deg Qα ≤ (2d ∗ )exp O(r) , h(Pi,α ), h(Qα ) ≤ (2d ∗ )exp O(r) (h∗ + 1) (i = 0, . . . , D − 1). Lemma 8.3.5 Let α1 , . . . , αn ∈ K ∗ . For i = 1, . . . , n, let (ai , bi ) be a pair of representatives for αi , with ai , bi ∈ Z[X1 , . . . , Xr ], bi ∈ I . Put d ∗∗ := max(d, deg a1 , deg b1 , . . . , deg an , deg bn ), h∗∗ := max(h, h(a1 ), h(b1 ), . . . , h(an ), h(bn )). Then there is a non-zero f ∈ A0 such that A ⊆ A0 [y, f −1 ], α1 , . . . , αn ∈ A0 [y, f −1 ]∗ , deg f ≤ (n + 1)(2d ∗∗ )exp O(r) , h(f ) ≤ (n + 1)(2d ∗∗ )exp O(r) (h∗∗ + 1). (8.3.12) (8.3.13) Proof. Take f := t i=1 Qyi · n Qαi Qαi−1 . j =1 Since in general, Qβ β ∈ A0 [y] for β ∈ K ∗ , we have fβ ∈ A0 [y] for β = y1 , . . . , yt , α1 , α1−1 , . . . , αn , αn−1 . This implies (8.3.12). The inequalities (8.3.13) follow at once from Lemma 8.3.4 and Corollary 1.9.5. Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:29:36, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.010 210 Unit equations over finitely generated domains Lemma 8.3.6 Let λ ∈ K ∗ and let x be a non-zero element of A. Let (a, b) with a, b ∈ Z[X1 , . . . , Xr ] be a pair of representatives for λ. Put d0 := max(deg f1 , . . . , deg fm , deg a, deg b, deg λx), h0 := max(h(f1 ), . . . , h(fm ), h(a), h(b), h(λx)). Then x has a representative x ∈ Z[X1 , . . . , Xr ] such that deg x ≤ (2d0 )exp O(r log ∗ r) h(x) ≤ (2d0 )exp O(r log (h0 + 1), ∗ r) (h0 + 1)r+1 . If moreover x ∈ A∗ , then x −1 has a representative x ∈ Z[X1 , . . . 
, Xr ] with
\[ \deg \tilde{x}' \leq (2d_0)^{\exp O(r \log^* r)}(h_0 + 1), \qquad h(\tilde{x}') \leq (2d_0)^{\exp O(r \log^* r)}(h_0 + 1)^{r+1}. \]

Proof. In the case $q > 0$, we identify $z_i$ with $X_i$ and view elements of $A_0$ as polynomials in $\mathbb{Z}[X_1, \ldots, X_q]$. Put $Y := G_0 \cdot \bigl( \sum_{i=1}^t a_i X_{q+i} \bigr)$. We have
\[ \lambda x = Q^{-1} \sum_{i=0}^{D-1} P_i y^i \tag{8.3.14} \]
with $P_0, \ldots, P_{D-1}, Q \in A_0$ and $\gcd(P_0, \ldots, P_{D-1}, Q) = 1$. According to (8.3.14), $\tilde{x} \in \mathbb{Z}[X_1, \ldots, X_r]$ is a representative for $x$ if and only if there are $g_1, \ldots, g_m \in \mathbb{Z}[X_1, \ldots, X_r]$ such that
\[ \tilde{x} \cdot (Q \cdot a) + \sum_{i=1}^m g_i f_i = b \sum_{i=0}^{D-1} P_i Y^i. \tag{8.3.15} \]
We may view (8.3.15) as an inhomogeneous linear equation in the unknowns $\tilde{x}, g_1, \ldots, g_m$. Notice that by Lemmas 8.3.1–8.3.4 the degrees and logarithmic heights of $Qa$ and $b \sum_{i=0}^{D-1} P_i Y^i$ are all bounded above by $(2d_0)^{\exp O(r)}$, $(2d_0)^{\exp O(r)}(h_0 + 1)$, respectively. Now Proposition 8.2.5 implies that (8.3.15) has a solution with upper bounds for $\deg \tilde{x}$, $h(\tilde{x})$ as stated in the lemma.

Now suppose that $x \in A^*$. Again by (8.3.14), $\tilde{x}' \in \mathbb{Z}[X_1, \ldots, X_r]$ is a representative for $x^{-1}$ if and only if there are $g_1', \ldots, g_m' \in \mathbb{Z}[X_1, \ldots, X_r]$ such that
\[ \tilde{x}' \cdot b \sum_{i=0}^{D-1} P_i Y^i + \sum_{i=1}^m g_i' f_i = Q \cdot a. \]
Similarly as above, this equation has a solution with upper bounds for $\deg \tilde{x}'$, $h(\tilde{x}')$ as stated in the lemma.

Recall that we have defined $A_0 = \mathbb{Z}[z_1, \ldots, z_q]$, $K_0 = \mathbb{Q}(z_1, \ldots, z_q)$ if $q > 0$ and $A_0 = \mathbb{Z}$, $K_0 = \mathbb{Q}$ if $q = 0$, and that in the case $q = 0$, degrees and $\overline{\deg}$-s are always zero. Theorem 8.1.1 can be deduced from the following proposition, which makes sense also if $q = 0$. The proof of this proposition is given in Sections 8.4–8.6.
Proposition 8.3.7. Let $f \in A_0$ with $f \neq 0$, and let
\[ \mathcal{F} = X^D + F_1 X^{D-1} + \cdots + F_D \in A_0[X] \quad (D \ge 1) \]
be the minimal polynomial of $y$ over $K_0$. Let $d_1 \ge 1$, $h_1 \ge 1$ and suppose
\[ \max(\deg f, \deg F_1, \dots, \deg F_D) \le d_1, \qquad \max(h(f), h(F_1), \dots, h(F_D)) \le h_1. \tag{8.3.16} \]
Define the domain $B := A_0[y, f^{-1}]$. Then for each pair $(x_1, x_2)$ with
\[ x_1 + x_2 = 1, \qquad x_1, x_2 \in B^* \tag{8.3.17} \]
we have
\[ \overline{\deg}\,x_1,\ \overline{\deg}\,x_2 \le 4qD^2 \cdot d_1, \tag{8.3.18} \]
\[ \overline{h}(x_1),\ \overline{h}(x_2) \le \exp O\bigl( 2D(q+d_1)(\log^*\{2D(q+d_1)\})^2 + D \log^* D \cdot h_1 \bigr). \tag{8.3.19} \]

Proof of Theorem 8.1.1. Let $a_1, a_2, a_3 \in A \setminus \{0\}$ be the coefficients of (8.1.1), and $\tilde{a}_1, \tilde{a}_2, \tilde{a}_3$ the representatives for $a_1, a_2, a_3$ from the statement of Theorem 8.1.1. By Lemma 8.3.5, there exists a non-zero $f \in A_0$ such that $A \subseteq B := A_0[y, f^{-1}]$, $a_1, a_2, a_3 \in B^*$, and moreover,
\[ \deg f \le (2d)^{\exp O(r)}, \qquad h(f) \le (2d)^{\exp O(r)}(h+1). \]
By Corollary 8.3.3 we have the same type of upper bounds for the degrees and logarithmic heights of $F_1, \dots, F_D$. So in Proposition 8.3.7 we may take $d_1 = (2d)^{\exp O(r)}$, $h_1 = (2d)^{\exp O(r)}(h+1)$. Finally, by Lemma 8.3.1 we have $D \le d^t$.

Let $(x_1, x_2)$ be a solution of (8.1.1) and put $x_i' := a_i x_i / a_3$ for $i = 1, 2$. Let $i \in \{1, 2\}$. By Proposition 8.3.7 we have
\[ \overline{\deg}\,x_i' \le 4q d^{2t} (2d)^{\exp O(r)} \le (2d)^{\exp O(r)}, \qquad \overline{h}(x_i') \le \exp\bigl( (2d)^{\exp O(r)}(h+1) \bigr). \]
We apply Lemma 8.3.6 with $\lambda = a_i / a_3$. Notice that $\lambda$ is represented by $(\tilde{a}_i, \tilde{a}_3)$. By assumption, $\tilde{a}_i$ and $\tilde{a}_3$ have degrees at most $d$ and logarithmic heights at most $h$. Letting $\tilde{a}_i, \tilde{a}_3$ play the role of $a, b$ in Lemma 8.3.6, we see that in that lemma we may take $h_0 = \exp\bigl((2d)^{\exp O(r)}(h+1)\bigr)$ and $d_0 = (2d)^{\exp O(r)}$. It follows that $x_i$, $x_i^{-1}$ have representatives $\tilde{x}_i, \tilde{x}_i' \in \mathbb{Z}[X_1, \dots, X_r]$ such that
\[ \deg \tilde{x}_i,\ \deg \tilde{x}_i',\ h(\tilde{x}_i),\ h(\tilde{x}_i') \le \exp\bigl( (2d)^{\exp O(r)}(h+1) \bigr). \]
We observe here that the upper bound for $\overline{h}(x_i')$ dominates by far the other terms in our estimation. This completes the proof of Theorem 8.1.1.

Proposition 8.3.7 is proved in Sections 8.4–8.6. In Section 8.4 we deduce the degree bound (8.3.18). Here, our main tool is Theorem 7.1.1 (Mason's effective result on S-unit equations over function fields). In Section 8.5 we work out a more precise version of an effective specialization argument of Győry (1983, 1984). In Section 8.6 we prove (8.3.19) by combining the specialization argument from Section 8.5 with a recent effective result for S-unit equations over number fields, due to Győry and Yu (2006).

8.4 Bounding the degree in Proposition 8.3.7

We keep the notation from Proposition 8.3.7. We may assume that $q > 0$ because the case $q = 0$ is trivial. Let as before $K_0 = \mathbb{Q}(z_1, \dots, z_q)$, $K = K_0(y)$, $A_0 = \mathbb{Z}[z_1, \dots, z_q]$, $B = \mathbb{Z}[z_1, \dots, z_q, f^{-1}, y]$.

Choose an algebraic closure $\overline{K_0}$ of $K_0$. Then there are precisely $D$ $K_0$-isomorphic embeddings of $K$ into $\overline{K_0}$, which we denote by $x \mapsto x^{(i)}$ $(i = 1, \dots, D)$.

Fix $i \in \{1, \dots, q\}$. Let $k_i$ be an algebraic closure of $\mathbb{Q}(z_1, \dots, z_{i-1}, z_{i+1}, \dots, z_q)$, contained in $\overline{K_0}$. Thus, $A_0$ is contained in $k_i[z_i]$. Define the field
\[ M_i := k_i\bigl( z_i, y^{(1)}, \dots, y^{(D)} \bigr). \]
That is, $M_i$ is the splitting field of the polynomial $X^D + F_1 X^{D-1} + \cdots + F_D$ over $k_i(z_i)$. The subring
\[ B_i := k_i\bigl[ z_i, f^{-1}, y^{(1)}, \dots, y^{(D)} \bigr] \]
of $M_i$ contains $B = \mathbb{Z}[z_1, \dots, z_q, f^{-1}, y]$ as a subring. Put $\Delta_i := [M_i : k_i(z_i)]$.

We will apply the estimates from Sections 2.2 and 2.3 with $z_i, k_i, M_i$ instead of $z, k, K$. Denote by $g_{M_i/k_i}$ the genus of $M_i$ over $k_i$. The height $H_{M_i}$ is taken with respect to $M_i/k_i$. For $G \in A_0$, we denote by $\deg_{z_i} G$ the degree of $G$ in the variable $z_i$.

Lemma 8.4.1. For $\alpha \in K$ we have
\[ \overline{\deg}\,\alpha \le qD \cdot d_1 + \sum_{i=1}^{q} \Delta_i^{-1} \sum_{j=1}^{D} H_{M_i}\bigl( \alpha^{(j)} \bigr). \]
Proof. We have
\[ \alpha = Q^{-1} \sum_{j=0}^{D-1} P_j y^j \]
for certain $P_0, \dots, P_{D-1}, Q \in A_0$ with $\gcd(Q, P_0, \dots, P_{D-1}) = 1$. Clearly,
\[ \overline{\deg}\,\alpha \le \sum_{i=1}^{q} \mu_i, \quad \text{where } \mu_i := \max(\deg_{z_i} Q, \deg_{z_i} P_0, \dots, \deg_{z_i} P_{D-1}). \tag{8.4.1} \]
Below, we estimate $\mu_1, \dots, \mu_q$ from above. We heavily use the height properties listed in Section 2.2.

We fix $i \in \{1, \dots, q\}$ and use the notation introduced above. By taking conjugates, we obtain
\[ \alpha^{(k)} = Q^{-1} \sum_{j=0}^{D-1} P_j \cdot \bigl( y^{(k)} \bigr)^j \quad \text{for } k = 1, \dots, D. \]
Let $\Omega$ be the $D \times D$-matrix with rows
\[ (1, \dots, 1),\ \bigl( y^{(1)}, \dots, y^{(D)} \bigr),\ \dots,\ \bigl( (y^{(1)})^{D-1}, \dots, (y^{(D)})^{D-1} \bigr). \]
By Cramer's rule, $P_j / Q = \delta_j / \delta$, where $\delta = \det \Omega$, and $\delta_j$ is the determinant of the matrix obtained by replacing the $j$-th row of $\Omega$ by $(\alpha^{(1)}, \dots, \alpha^{(D)})$.

Gauss' Lemma implies that $\gcd(P_0, \dots, P_{D-1}, Q) = 1$ in the ring $k_i[z_i]$. So by (2.2.2) (with $z_i$ in place of $z$) we have
\[ \mu_i = H^{\mathrm{hom}}_{k_i(z_i)}(Q, P_0, \dots, P_{D-1}). \]
Notice that $(\delta, \delta_1, \dots, \delta_D)$ is a scalar multiple of $(Q, P_0, \dots, P_{D-1})$. By combining (2.2.3), (2.2.1) and inserting $[M_i : k_i(z_i)] = \Delta_i$, we obtain
\[ \mu_i = \Delta_i^{-1} H_{M_i}(Q, P_0, \dots, P_{D-1}) = \Delta_i^{-1} H_{M_i}(\delta, \delta_1, \dots, \delta_D). \tag{8.4.2} \]
We bound from above the right-hand side. A straightforward estimate yields that for every valuation $v$ of $M_i/k_i$,
\[ -\min(v(\delta), v(\delta_1), \dots, v(\delta_D)) \le -D \sum_{j=1}^{D} \min\bigl(0, v(y^{(j)})\bigr) - \sum_{j=1}^{D} \min\bigl(0, v(\alpha^{(j)})\bigr), \]
and then summation over $v$ gives
\[ H_{M_i}(\delta, \delta_1, \dots, \delta_D) \le D \sum_{j=1}^{D} H_{M_i}\bigl( y^{(j)} \bigr) + \sum_{j=1}^{D} H_{M_i}\bigl( \alpha^{(j)} \bigr). \tag{8.4.3} \]
A combination of (2.2.10), (2.2.3), (2.2.2) and assumption (8.3.16) gives
\[ \sum_{j=1}^{D} H_{M_i}\bigl( y^{(j)} \bigr) = H_{M_i}(\mathcal{F}) = \Delta_i H_{k_i(z_i)}(\mathcal{F}) = \Delta_i \max(0, \deg_{z_i} F_1, \dots, \deg_{z_i} F_D) \le \Delta_i \cdot d_1. \]
Together with (8.4.2) and (8.4.3) this leads to
\[ \mu_i \le D d_1 + \Delta_i^{-1} \sum_{j=1}^{D} H_{M_i}\bigl( \alpha^{(j)} \bigr). \]
Now these bounds for $i = 1, \dots, q$ together with (8.4.1) imply our lemma.

Proof of (8.3.18). We fix again $i \in \{1, \dots, q\}$ and use the notation introduced above. By Proposition 2.3.2, applied with $k_i, z_i, M_i$ instead of $k, z, K$ and with $F = \mathcal{F} = X^D + F_1 X^{D-1} + \cdots + F_D$, we have
\[ g_{M_i/k_i} \le (\Delta_i - 1) D \max_j \deg_{z_i}(F_j) \le (\Delta_i - 1) \cdot D d_1. \tag{8.4.4} \]
Let $S$ denote the subset of valuations $v$ of $M_i/k_i$ such that $v(z_i) < 0$ or $v(f) > 0$. Each valuation of $k_i(z_i)$ can be extended to at most $[M_i : k_i(z_i)] = \Delta_i$ valuations of $M_i$. Hence $M_i$ has at most $\Delta_i$ valuations $v$ with $v(z_i) < 0$ and at most $\Delta_i \deg f$ valuations with $v(f) > 0$. Thus,
\[ |S| \le \Delta_i + \Delta_i \deg_{z_i} f \le \Delta_i (1 + \deg f) \le \Delta_i (1 + d_1). \tag{8.4.5} \]
Define the ring of $S$-integers in $M_i$,
\[ O_S = \{ x \in M_i : v(x) \ge 0 \text{ for } v \in M_{M_i} \setminus S \}. \]
This ring contains $k_i$, $z_i$, $f^{-1}$ and is integrally closed. As a consequence, $A_0 = \mathbb{Z}[z_1, \dots, z_q] \subset O_S$. The elements $y^{(1)}, \dots, y^{(D)}$ belong to $M_i$ and are integral over $A_0$, so they certainly belong to $O_S$. As a consequence, the elements of $B$ and their conjugates over $\mathbb{Q}(z_1, \dots, z_q)$ belong to $O_S$. In particular, if $x_1, x_2 \in B^*$ and $x_1 + x_2 = 1$, then
\[ x_1^{(j)} + x_2^{(j)} = 1, \qquad x_1^{(j)}, x_2^{(j)} \in O_S^* \quad \text{for } j = 1, \dots, D. \]
We apply Mason's inequality, Theorem 7.1.1, and insert the upper bounds (8.4.4) and (8.4.5). It follows that for $j = 1, \dots, D$ we have either $x_1^{(j)} \in k_i$ or
\[ H_{M_i}\bigl( x_1^{(j)} \bigr) \le |S| + 2 g_{M_i/k_i} - 2 \le 3\Delta_i \cdot D d_1; \]
in fact the last upper bound is valid also if $x_1^{(j)} \in k_i$. Together with Lemma 8.4.1 this gives
\[ \overline{\deg}\,x_1 \le qDd_1 + qD \cdot 3Dd_1 \le 4qD^2 d_1. \]
For $\overline{\deg}\,x_2$ we derive the same estimate. This proves (8.3.18).
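The arithmetic in the last step of this proof can be checked mechanically. The following Python sketch is only an illustration, not part of the argument: the function names `mason_rhs`, `degree_bound` and `check` are hypothetical, and the search range is arbitrary. It inserts the bounds (8.4.4) and (8.4.5) into $|S| + 2g_{M_i/k_i} - 2$ and confirms the two final inequalities on a small parameter grid. It assumes $\Delta_i \le D!$, which holds because $M_i$ is the splitting field of a polynomial of degree $D$; for $D = 1$ this forces $\Delta_i = 1$, and without that restriction the first inequality can fail.

```python
from itertools import product
from math import factorial


def mason_rhs(delta, D, d1):
    """Bound for |S| + 2g - 2 after inserting (8.4.4) and (8.4.5)."""
    genus_bound = (delta - 1) * D * d1   # (8.4.4)
    s_bound = delta * (1 + d1)           # (8.4.5)
    return s_bound + 2 * genus_bound - 2


def degree_bound(q, D, d1):
    """Lemma 8.4.1 with each H_{M_i}(x^{(j)}) replaced by 3 * Delta_i * D * d1.

    The factor Delta_i^{-1} cancels Delta_i, so each of the q variables
    contributes D*d1 + D * (3*D*d1).
    """
    per_variable = D * d1 + D * (3 * D * d1)
    return q * per_variable


def check(limit=8):
    """Verify both inequalities for 1 <= q, D, d1 <= limit and Delta <= D!."""
    for D, d1 in product(range(1, limit + 1), repeat=2):
        for delta in range(1, min(factorial(D), limit) + 1):
            assert mason_rhs(delta, D, d1) <= 3 * delta * D * d1
        for q in range(1, limit + 1):
            assert degree_bound(q, D, d1) <= 4 * q * D * D * d1
    return True
```

Calling `check()` runs through the grid and raises an `AssertionError` if either inequality fails; a finite check of this kind is of course no substitute for the two-line algebraic verification.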
8.5 Specializations

In this section we prove some results about specialization homomorphisms from $B$ to $\overline{\mathbb{Q}}$, where $B$ is the integral domain from Proposition 8.3.7. We start with three auxiliary results that are used in the construction of our specializations.

Lemma 8.5.1. Let $m \ge 1$, let $\alpha_1, \dots, \alpha_m \in \overline{\mathbb{Q}}$ be distinct and suppose that $G(X) := \prod_{i=1}^{m} (X - \alpha_i) \in \mathbb{Z}[X]$. Let $q, p_0, \dots, p_{m-1}$ be integers with $\gcd(q, p_0, \dots, p_{m-1}) = 1$, and put
\[ \beta_i := \frac{1}{q} \sum_{j=0}^{m-1} p_j \alpha_i^j \quad (i = 1, \dots, m). \]
Then
\[ \log \max(|q|, |p_0|, \dots, |p_{m-1}|) \le 2m^2 + (m-1) h(G) + \sum_{j=1}^{m} h(\beta_j). \]

Proof. We use the height estimates from Section 1.9. For $m = 1$ the assertion is obvious, so we assume $m \ge 2$. Let $L = \mathbb{Q}(\alpha_1, \dots, \alpha_m)$ and $d := [L : \mathbb{Q}]$. Let $\Omega$ be the $m \times m$ matrix with rows $(\alpha_1^i, \dots, \alpha_m^i)$ $(i = 0, \dots, m-1)$. By Cramer's rule we have $p_i / q = \delta_i / \delta$ $(i = 0, \dots, m-1)$, where $\delta = \det \Omega$ and $\delta_i$ is the determinant of the matrix obtained by replacing the $i$-th row of $\Omega$ by $(\beta_1, \dots, \beta_m)$.

Put $\mu := \log \max(|q|, |p_0|, \dots, |p_{m-1}|)$. Then since $(\delta, \delta_0, \dots, \delta_{m-1})$ is a scalar multiple of $(q, p_0, \dots, p_{m-1})$ we have, by (1.9.4) and (1.9.8),
\[ \mu = h^{\mathrm{hom}}(q, p_0, \dots, p_{m-1}) = h^{\mathrm{hom}}(\delta, \delta_0, \dots, \delta_{m-1}) = \frac{1}{d} \sum_{v \in M_L} \log \max(|\delta|_v, |\delta_0|_v, \dots, |\delta_{m-1}|_v). \tag{8.5.1} \]
Estimating the determinants using Hadamard's inequality for the infinite places and the ultrametric inequality for the finite places, we get
\[ \max(|\delta|_v, |\delta_0|_v, \dots, |\delta_{m-1}|_v) \le m^{m s(v)/2} \prod_{i=1}^{m} \max(1, |\alpha_i|_v)^{m-1} \max(1, |\beta_i|_v) \]
for $v \in M_L$, where $s(v) = 1$ if $v$ is real, $s(v) = 2$ if $v$ is complex, and $s(v) = 0$ if $v$ is finite. Together with (8.5.1) this implies
\[ \mu \le \tfrac{1}{2} m \log m + \sum_{i=1}^{m} \bigl( (m-1) h(\alpha_i) + h(\beta_i) \bigr). \]
A combination with Corollary 1.9.6 implies Lemma 8.5.1.

Lemma 8.5.2. Let $g \in \mathbb{Z}[z_1, \dots, z_q]$ be a non-zero polynomial of degree $d$ and $\mathcal{N}$ a subset of $\mathbb{Z}$ of cardinality $> d$. Then
\[ |\{ u \in \mathcal{N}^q : g(u) = 0 \}| \le d |\mathcal{N}|^{q-1}. \]

Proof. We proceed by induction on $q$. For $q = 1$ the assertion is clear. Let $q \ge 2$. Write $g = \sum_{i=0}^{d_0} g_i(z_1, \dots, z_{q-1}) z_q^i$ with $g_i \in \mathbb{Z}[z_1, \dots, z_{q-1}]$ and $g_{d_0} \neq 0$. Then $\deg g_{d_0} \le d - d_0$. The induction hypothesis implies that there are at most $(d - d_0) |\mathcal{N}|^{q-2} \cdot |\mathcal{N}|$ tuples $(u_1, \dots, u_q) \in \mathcal{N}^q$ with $g_{d_0}(u_1, \dots, u_{q-1}) = 0$. Further, there are at most $|\mathcal{N}|^{q-1} \cdot d_0$ tuples $u \in \mathcal{N}^q$ with $g_{d_0}(u_1, \dots, u_{q-1}) \neq 0$ and $g(u_1, \dots, u_q) = 0$. Summing these two quantities implies that $g$ has at most $d |\mathcal{N}|^{q-1}$ zeros in $\mathcal{N}^q$.

Lemma 8.5.3. Let $g_1, g_2 \in \mathbb{Z}[z_1, \dots, z_q]$ be two non-zero polynomials of degrees $D_1, D_2$, respectively, and let $N$ be an integer $\ge \max(D_1, D_2)$. Define
\[ \mathcal{S} := \{ u \in \mathbb{Z}^q : |u| \le N,\ g_2(u) \neq 0 \}. \]
Then $\mathcal{S}$ is non-empty, and
\[ |g_1|_p \le (4N)^{q D_1 (D_1+1)/2} \max\{ |g_1(u)|_p : u \in \mathcal{S} \} \quad \text{for } p \in M_{\mathbb{Q}} = \{\infty\} \cup \{\text{primes}\}. \tag{8.5.2} \]

Proof. Put $C_p := \max\{ |g_1(u)|_p : u \in \mathcal{S} \}$ for $p \in M_{\mathbb{Q}}$. We proceed by induction on $q$, starting with $q = 0$. In the case $q = 0$ we interpret $g_1, g_2$ as non-zero constants with $|g_1|_p = C_p$ for $p \in M_{\mathbb{Q}}$. Then the lemma is trivial. Let $q \ge 1$. Write
\[ g_1 = \sum_{j=0}^{D_1'} g_{1j}(z_1, \dots, z_{q-1}) z_q^j, \qquad g_2 = \sum_{j=0}^{D_2'} g_{2j}(z_1, \dots, z_{q-1}) z_q^j, \]
where $g_{1,D_1'}, g_{2,D_2'} \neq 0$. By the induction hypothesis, the set
\[ \mathcal{S}' := \{ u' \in \mathbb{Z}^{q-1} : |u'| \le N,\ g_{2,D_2'}(u') \neq 0 \} \]
is non-empty and moreover,
\[ \max_{0 \le j \le D_1'} |g_{1j}|_p \le (4N)^{(q-1) D_1 (D_1+1)/2}\, C_p' \quad \text{for } p \in M_{\mathbb{Q}}, \tag{8.5.3} \]
where $C_p' := \max\{ |g_{1j}(u')|_p : u' \in \mathcal{S}',\ j = 0, \dots, D_1' \}$.

We estimate $C_p'$ from above in terms of $C_p$. Fix $u' \in \mathcal{S}'$. There are at least $2N + 1 - D_2' \ge D_1 + 1$ integers $u_q$ with $|u_q| \le N$ such that $g_2(u', u_q) \neq 0$. Let $a_0, \dots, a_{D_1'}$ be distinct integers from this set. By Lagrange's interpolation formula,
\[ g_1(u', X) = \sum_{j=0}^{D_1'} g_{1j}(u') X^j = \sum_{j=0}^{D_1'} g_1(u', a_j) \prod_{\substack{i=0 \\ i \neq j}}^{D_1'} \frac{X - a_i}{a_j - a_i}. \]
Since, in general, the coefficients of a polynomial $\prod_{k=1}^{m} (X - c_k)$ with $c_1, \dots, c_m \in \mathbb{C}$ have absolute values at most $\prod_{k=1}^{m} (1 + |c_k|)$, we deduce
\[ \max_{0 \le j \le D_1'} |g_{1j}(u')| \le C_\infty \sum_{j=0}^{D_1'} \prod_{\substack{i=0 \\ i \neq j}}^{D_1'} \frac{1 + |a_i|}{|a_j - a_i|} \le C_\infty (D_1 + 1)(N + 1)^{D_1} \le (4N)^{D_1 (D_1+1)/2}\, C_\infty. \]
Now let $p$ be a prime and put $\Delta := \prod_{0 \le i < j \le D_1'} |a_j - a_i|$. Then
\[ \max_{0 \le j \le D_1'} |g_{1j}(u')|_p \le C_p\, |\Delta|_p^{-1} \le \Delta\, C_p \le (4N)^{D_1 (D_1+1)/2}\, C_p. \]
It follows that $C_p' \le (4N)^{D_1 (D_1+1)/2}\, C_p$ for $p \in M_{\mathbb{Q}}$. A combination with (8.5.3) gives (8.5.2).

We now introduce our specializations $B \to \overline{\mathbb{Q}}$ and prove some properties. We assume $q > 0$ and apart from that keep the notation and assumptions from Proposition 8.3.7. In particular, $A_0 = \mathbb{Z}[z_1, \dots, z_q]$, $K_0 = \mathbb{Q}(z_1, \dots, z_q)$ and
\[ K = \mathbb{Q}(z_1, \dots, z_q, y), \qquad B = \mathbb{Z}[z_1, \dots, z_q, f^{-1}, y], \]
where $f$ is a non-zero element of $A_0$, $y$ is integral over $A_0$, and $y$ has minimal polynomial
\[ \mathcal{F} := X^D + F_1 X^{D-1} + \cdots + F_D \in A_0[X] \]
over $K_0$. In the case $D = 1$, we take $y = 1$, $\mathcal{F} = X - 1$.

To allow for other applications (e.g., Lemma 8.7.2 below), we consider a more general situation than what is needed for the proof of Proposition 8.3.7.
Let $d_1 \ge d_0 \ge 1$, $h_1 \ge h_0 \ge 1$ and assume that
\[ \max(\deg F_1, \dots, \deg F_D) \le d_0, \qquad \max(d_0, \deg f) \le d_1, \]
\[ \max(h(F_1), \dots, h(F_D)) \le h_0, \qquad \max(h_0, h(f)) \le h_1. \]
Let $u = (u_1, \dots, u_q) \in \mathbb{Z}^q$. Then the substitution $z_1 \mapsto u_1, \dots, z_q \mapsto u_q$ defines a ring homomorphism (specialization)
\[ \varphi_u \colon \{ g_1/g_2 : g_1, g_2 \in A_0,\ g_2(u) \neq 0 \} \to \mathbb{Q}, \qquad \alpha \mapsto \alpha(u). \]
We want to extend this to a ring homomorphism from $B$ to $\overline{\mathbb{Q}}$ and for this, we have to impose some restrictions on $u$. Denote by $\Delta_{\mathcal{F}}$ the discriminant of $\mathcal{F}$ (with $\Delta_{\mathcal{F}} := 1$ if $D = \deg \mathcal{F} = 1$), and let
\[ \mathcal{H} := \Delta_{\mathcal{F}} F_D \cdot f. \tag{8.5.4} \]
Then $\mathcal{H} \in A_0$. Using the fact that $\Delta_{\mathcal{F}}$ is a polynomial of degree $2D - 2$ with integer coefficients in $F_1, \dots, F_D$, it follows easily that
\[ \deg \mathcal{H} \le (2D - 1) d_0 + d_1 \le 2D d_1. \tag{8.5.5} \]
Now assume that $\mathcal{H}(u) \neq 0$. Then $f(u) \neq 0$ and, moreover, the polynomial
\[ \mathcal{F}_u := X^D + F_1(u) X^{D-1} + \cdots + F_D(u) \]
has $D$ distinct zeros which are all different from $0$, say $y_1(u), \dots, y_D(u)$ (these numbers should not be confused with the algebraic functions $y_1, \dots, y_t$ from Section 8.3). Thus, for $j = 1, \dots, D$ the assignment
\[ z_1 \mapsto u_1, \ \dots, \ z_q \mapsto u_q, \quad y \mapsto y_j(u) \]
defines a ring homomorphism $\varphi_{u,j}$ from $B$ to $\overline{\mathbb{Q}}$; in the case $D = 1$ it is just $\varphi_u$. The image of $\alpha \in B$ under $\varphi_{u,j}$ is denoted by $\alpha_j(u)$. Recall that we may express elements $\alpha$ of $B$ as
\[ \alpha = \sum_{i=0}^{D-1} (P_i / Q) y^i \tag{8.5.6} \]
with $P_0, \dots, P_{D-1}, Q \in A_0$, $\gcd(P_0, \dots, P_{D-1}, Q) = 1$. Since $\alpha \in B$, the denominator $Q$ must divide a power of $f$, hence $Q(u) \neq 0$. So we have
\[ \alpha_j(u) = \sum_{i=0}^{D-1} \bigl( P_i(u) / Q(u) \bigr) y_j(u)^i \quad (j = 1, \dots, D). \tag{8.5.7} \]
It is obvious that $\varphi_{u,j}$ is the identity on $B \cap \mathbb{Q}$.
Thus, if $\alpha \in B \cap \overline{\mathbb{Q}}$, then $\varphi_{u,j}(\alpha)$ has the same minimal polynomial as $\alpha$ and so it is conjugate to $\alpha$.

For $u = (u_1, \dots, u_q) \in \mathbb{Z}^q$, we put $|u| := \max(|u_1|, \dots, |u_q|)$. It is easy to verify that for any $g \in A_0$, $u \in \mathbb{Z}^q$,
\[ \log |g(u)| \le q \log \deg g + h(g) + \deg g \cdot \log \max(1, |u|). \tag{8.5.8} \]
In particular,
\[ h(\mathcal{F}_u) \le q \log d_0 + h_0 + d_0 \log \max(1, |u|). \tag{8.5.9} \]
Now an application of Corollary 1.9.6 gives
\[ \sum_{j=1}^{D} h(y_j(u)) \le D + 1 + q \log d_0 + h_0 + d_0 \log \max(1, |u|). \tag{8.5.10} \]
Define the algebraic number fields $K_{u,j} := \mathbb{Q}(y_j(u))$ $(j = 1, \dots, D)$. We derive an upper bound for the discriminant $D_{K_{u,j}}$ of $K_{u,j}$.

Lemma 8.5.4. Let $u \in \mathbb{Z}^q$ with $\mathcal{H}(u) \neq 0$. Then for $j = 1, \dots, D$ we have $[K_{u,j} : \mathbb{Q}] \le D$ and
\[ |D_{K_{u,j}}| \le D^{2D-1} \bigl( d_0^{\,q} \cdot e^{h_0} \max(1, |u|)^{d_0} \bigr)^{2D-2}. \]

Proof. Let $j \in \{1, \dots, D\}$. The estimate for the degree is obvious. By Lemma 1.5.1 we have $|D_{K_{u,j}}| \le D^{2D-1} H(\mathcal{F}_u)^{2D-2}$, where $H(\mathcal{F}_u)$ denotes the maximum of the absolute values of the coefficients of $\mathcal{F}_u$. Now our lemma follows by combining this with (8.5.9).

We finish with two lemmas, which relate $\overline{h}(\alpha)$ to the heights of $\alpha_j(u)$ for $\alpha \in B$, $u \in \mathbb{Z}^q$.

Lemma 8.5.5. Let $u \in \mathbb{Z}^q$ with $\mathcal{H}(u) \neq 0$. Let $\alpha \in B$. Then for $j = 1, \dots, D$,
\[ h(\alpha_j(u)) \le D^2 + q (D \log d_0 + \log \overline{\deg}\,\alpha) + D h_0 + \overline{h}(\alpha) + (D d_0 + \overline{\deg}\,\alpha) \log \max(1, |u|). \]

Proof. Let $P_0, \dots, P_{D-1}, Q$ be as in (8.5.6) and write $\alpha_j(u)$ as in (8.5.7). Let $L = \mathbb{Q}(y_j(u))$. Then for $v \in M_L$, we have
\[ |\alpha_j(u)|_v \le D^{s(v)} A_v \max(1, |y_j(u)|_v)^{D-1}, \]
where $s(v) = 1$ if $v$ is real, $s(v) = 2$ if $v$ is complex, $s(v) = 0$ if $v$ is finite, and
\[ A_v = \max(1, |P_0(u)/Q(u)|_v, \dots, |P_{D-1}(u)/Q(u)|_v). \]
Hence
\[ h(\alpha_j(u)) \le \log D + \frac{1}{[L : \mathbb{Q}]} \sum_{v \in M_L} \log A_v + (D - 1) h(y_j(u)). \tag{8.5.11} \]
From (1.9.8), (1.9.4) and (8.5.8) we infer
\begin{align*}
\frac{1}{[L : \mathbb{Q}]} \sum_{v \in M_L} \log A_v &= h\bigl( P_0(u)/Q(u), \dots, P_{D-1}(u)/Q(u) \bigr) \\
&= h^{\mathrm{hom}}(Q(u), P_0(u), \dots, P_{D-1}(u)) \\
&\le \log \max(|Q(u)|, |P_0(u)|, \dots, |P_{D-1}(u)|) \\
&\le q \log \overline{\deg}\,\alpha + \overline{h}(\alpha) + \overline{\deg}\,\alpha \cdot \log \max(1, |u|).
\end{align*}
By combining (8.5.11) with this inequality and with (8.5.10), our lemma easily follows.

Lemma 8.5.6. Let $\alpha \in B$, $\alpha \neq 0$, and let $N$ be an integer with
\[ N \ge \max\bigl( \overline{\deg}\,\alpha,\ 2D d_0 + 2(q+1)(d_1+1) \bigr). \]
Then the set $\mathcal{S} := \{ u \in \mathbb{Z}^q : |u| \le N,\ \mathcal{H}(u) \neq 0 \}$ is non-empty, and
\[ \overline{h}(\alpha) \le 5N^4 (h_1+1)^2 + 2D(h_1+1) H, \]
where $H := \max\{ h(\alpha_j(u)) : u \in \mathcal{S},\ j = 1, \dots, D \}$.

Proof. It follows from our assumption on $N$, (8.5.5) and Lemma 8.5.3 that $\mathcal{S}$ is non-empty. We proceed with estimating $\overline{h}(\alpha)$.

Let $P_0, \dots, P_{D-1}, Q \in A_0$ be as in (8.5.6). We analyse $Q$ more closely. Let
\[ f = \pm p_1^{k_1} \cdots p_m^{k_m}\, g_1^{l_1} \cdots g_n^{l_n} \]
be the unique factorization of $f$ in $A_0$, where $p_1, \dots, p_m$ are distinct prime numbers, and $\pm g_1, \dots, \pm g_n$ distinct irreducible elements of $A_0$ of positive degree. Notice that
\[ m \le h(f)/\log 2 \le h_1/\log 2, \tag{8.5.12} \]
\[ \sum_{i=1}^{n} l_i h(g_i) \le q d_1 + h_1, \tag{8.5.13} \]
where the last inequality is a consequence of Corollary 1.9.5. Since $\alpha \in B$, the polynomial $Q$ is also composed of $p_1, \dots, p_m, g_1, \dots, g_n$. Hence
\[ Q = a \widetilde{Q} \quad \text{with } a = \pm p_1^{k_1'} \cdots p_m^{k_m'}, \quad \widetilde{Q} = g_1^{l_1'} \cdots g_n^{l_n'} \tag{8.5.14} \]
for certain non-negative integers $k_1', \dots, l_n'$. Clearly,
\[ l_1' + \cdots + l_n' \le \deg Q \le \overline{\deg}\,\alpha \le N, \]
and by Corollary 1.9.5 and (8.5.13),
\[ h(\widetilde{Q}) \le q \deg \widetilde{Q} + \sum_{i=1}^{n} l_i' h(g_i) \le N (q + q d_1 + h_1) \le N^2 (h_1 + 1). \tag{8.5.15} \]
In view of (8.5.8), we have for $u \in \mathcal{S}$,
\[ \log |\widetilde{Q}(u)| \le q \log d_1 + h(\widetilde{Q}) + \deg \widetilde{Q} \log N \le \tfrac{3}{2} N \log N + N^2 (h_1 + 1) \le N^2 (h_1 + 2). \]
Hence
\[ h\bigl( \widetilde{Q}(u) \alpha_j(u) \bigr) \le N^2 (h_1 + 2) + H \]
for $u \in \mathcal{S}$, $j = 1, \dots, D$.
Further, by (8.5.7) and (8.5.14) we have
\[ \widetilde{Q}(u) \alpha_j(u) = \sum_{i=0}^{D-1} \bigl( P_i(u)/a \bigr) y_j(u)^i. \]
Put $\delta(u) := \gcd(a, P_0(u), \dots, P_{D-1}(u))$. Then by applying Lemma 8.5.1 and then (8.5.9) we obtain
\begin{align*}
\log \frac{\max(|a|, |P_0(u)|, \dots, |P_{D-1}(u)|)}{\delta(u)} &\le 2D^2 + (D-1) h(\mathcal{F}_u) + D \bigl( N^2 (h_1 + 2) + H \bigr) \\
&\le 2D^2 + (D-1)(q \log d_1 + h_1 + d_1 \log N) + D \bigl( N^2 (h_1 + 2) + H \bigr) \\
&\le N^3 (h_1 + 2) + DH. \tag{8.5.16}
\end{align*}
Our assumption that $\gcd(Q, P_0, \dots, P_{D-1}) = 1$ implies that the greatest common divisor of $a$ and the coefficients of $P_0, \dots, P_{D-1}$ is $1$. Let $p \in \{p_1, \dots, p_m\}$ be one of the prime factors of $a$. There is $j \in \{0, \dots, D-1\}$ such that $|P_j|_p = 1$. Our assumption on $N$ and (8.5.5) imply that $N \ge \max(\deg \mathcal{H}, \deg P_j)$. This means that Lemma 8.5.3 is applicable with $g_1 = P_j$ and $g_2 = \mathcal{H}$. It follows that
\[ \max\{ |P_j(u)|_p : u \in \mathcal{S} \} \ge (4N)^{-qN(N+1)/2}. \]
That is, there is $u_0 \in \mathcal{S}$ with $|P_j(u_0)|_p \ge (4N)^{-qN(N+1)/2}$. Hence
\[ |\delta(u_0)|_p \ge (4N)^{-qN(N+1)/2}. \]
Together with (8.5.16), this implies
\begin{align*}
\log |a|_p^{-1} &\le \log |a/\delta(u_0)| + \log |\delta(u_0)|_p^{-1} \\
&\le N^3 (h_1 + 2) + DH + \tfrac{1}{2} N^3 \log 4N \le N^4 (h_1 + 3) + DH.
\end{align*}
Combining this with the upper bound (8.5.12) for the number of prime factors of $a$, we obtain
\[ \log |a| \le 2N^4 h_1 (h_1 + 3) + 2D h_1 \cdot H. \tag{8.5.17} \]
Together with (8.5.14) and (8.5.15), this implies
\[ h(Q) \le 2N^4 h_1 (h_1 + 3) + 2D h_1 \cdot H + N^2 (h_1 + 1) \le 3N^4 (h_1 + 1)^2 + 2D h_1 \cdot H. \tag{8.5.18} \]
Further, the right-hand side of (8.5.17) is also an upper bound for $\log \delta(u)$, for $u \in \mathcal{S}$. Combining this with (8.5.16) gives
\begin{align*}
\log \max\{ |P_j(u)| : u \in \mathcal{S},\ j = 0, \dots, D-1 \} &\le N^3 (h_1 + 2) + DH + 3N^4 (h_1 + 1)^2 + 2D h_1 \cdot H \\
&\le 4N^4 (h_1 + 1)^2 + 2D(h_1 + 1) \cdot H.
\end{align*}
Another application of Lemma 8.5.3 yields
\begin{align*}
h(P_j) &\le \tfrac{1}{2} qN(N+1) \log 4N + 4N^4 (h_1 + 1)^2 + 2D(h_1 + 1) \cdot H \\
&\le 5N^4 (h_1 + 1)^2 + 2D(h_1 + 1) \cdot H
\end{align*}
for $j = 0, \dots, D-1$. Together with (8.5.18) this gives the upper bound for $\overline{h}(\alpha)$ from our lemma.

8.6 Bounding the height in Proposition 8.3.7

It remains to prove the height bound in (8.3.19). As before, we use $O(\cdot)$ to denote a quantity which is $c$ times the expression between the parentheses, where $c$ is an effectively computable positive absolute constant which may be different at each occurrence of the $O$-symbol.

We first consider the case $q > 0$. Pick $u \in \mathbb{Z}^q$ with $\mathcal{H}(u) \neq 0$, pick $j \in \{1, \dots, D\}$ and put $L := K_{u,j}$. Further, let the set of places $S$ consist of all infinite places of $L$, and all finite places of $L$ lying above the rational prime divisors of $f(u)$. Let $\mathfrak{p}_1, \dots, \mathfrak{p}_t$ be the prime ideals in $S$ and define, in the usual manner,
\[ s := |S|, \qquad P := \max(2, N_K(\mathfrak{p}_1), \dots, N_K(\mathfrak{p}_t)), \qquad Q := \prod_{i=1}^{t} N_K(\mathfrak{p}_i). \]
Further, denote by $R_S$ the $S$-regulator (see (1.8.2)). Note that $y_j(u)$ is an algebraic integer, and $f(u) \in O_S^*$. Hence $\varphi_{u,j}(B) \subseteq O_S$ and $\varphi_{u,j}(B^*) \subseteq O_S^*$.

Let $x_1, x_2$ be a solution of (8.3.17). So
\[ x_{1,j}(u) + x_{2,j}(u) = 1, \qquad x_{1,j}(u), x_{2,j}(u) \in O_S^*, \]
where $x_{1,j}(u), x_{2,j}(u)$ are the images of $x_1, x_2$ under $\varphi_{u,j}$. We apply Corollary 4.1.5. In a slightly less precise form, this result gives
\[ \max\bigl( h(x_{1,j}(u)), h(x_{2,j}(u)) \bigr) \le \exp(O(s \log s))\, \frac{P}{\log P} \cdot R_S \cdot \max(\log P, \log^* R_S). \tag{8.6.1} \]
We estimate this bound from above. By assumption, $f$ has degree at most $d_1$ and logarithmic height at most $h_1$, hence
\[ |f(u)| \le d_1^{\,q}\, e^{h_1} \max(1, |u|)^{d_1} =: R(u). \tag{8.6.2} \]
Since the degree of $L$ is at most $D$, the cardinality $s$ of $S$ is at most $s \le D(1 + \omega)$, where $\omega$ is the number of prime divisors of $f(u)$. Using the inequality from elementary number theory, $\omega \le O(\log |f(u)| / \log \log |f(u)|)$, we obtain
\[ s \le O\!\left( \frac{D \log^* R(u)}{\log^* \log^* R(u)} \right). \tag{8.6.3} \]
Next, we estimate $P$ and $R_S$. By (8.6.2), we have
\[ P \le Q \le |f(u)|^D \le \exp O(D \log^* R(u)). \tag{8.6.4} \]
By inequality (1.8.4) we have
\[ R_S \le |D_L|^{1/2} (\log^* |D_L|)^{D-1} \prod_{i=1}^{t} \log N_K(\mathfrak{p}_i) \le |D_L|^{1/2} (\log^* |D_L|)^{D-1} (\log Q)^s. \]
In view of Lemma 8.5.4 (using $d_0 \le d_1$) we have
\[ |D_L| \le D^{2D-1} \bigl( d_1^{\,q}\, e^{h_1} \max(1, |u|)^{d_1} \bigr)^{2D-2} \le \exp O(D \log^* DR(u)), \]
and this easily implies
\[ |D_L|^{1/2} (\log^* |D_L|)^{D-1} \le \exp O(D \log^* DR(u)). \]
Together with the estimates (8.6.3) and (8.6.4) for $s$ and $Q$, this leads to
\[ R_S \le \exp O\bigl( D \log^* DR(u) + s \log^* \log^* Q \bigr) \le \exp O(D \log^* DR(u)). \tag{8.6.5} \]
Now by inserting (8.6.3)–(8.6.5) into the upper bound (8.6.1) we obtain
\[ h(x_{1,j}(u)),\ h(x_{2,j}(u)) \le \exp O\bigl( D \log^* D \cdot \log^* R(u) \bigr). \]
We apply Lemma 8.5.6 with $N := 4D^2 (q + d_1 + 1)^2$. From the already established (8.3.18) it follows that $\overline{\deg}\,x_1, \overline{\deg}\,x_2 \le N$. Further, since $d_1 \ge d_0$ we have $N \ge 2D d_0 + 2(d_1 + 1)(q + 1)$. So indeed, Lemma 8.5.6 is applicable with this value of $N$. It follows that the set $\mathcal{S} := \{ u \in \mathbb{Z}^q : |u| \le N,\ \mathcal{H}(u) \neq 0 \}$ is not empty. Further, for $u \in \mathcal{S}$, $j = 1, \dots, D$, we have
\begin{align*}
h(x_{1,j}(u)) &\le \exp O\bigl( D \log^* D\, (q \log d_1 + h_1 + d_1 \log^* N) \bigr) \\
&\le \exp O\bigl( N^{1/2} (\log^* N)^2 + (D \log^* D) h_1 \bigr),
\end{align*}
and so by Lemma 8.5.6,
\[ \overline{h}(x_1) \le \exp O\bigl( N^{1/2} (\log^* N)^2 + (D \log^* D) h_1 \bigr). \]
For $\overline{h}(x_2)$ we obtain the same upper bound. This easily implies (8.3.19) in the case $q > 0$.

Now assume $q = 0$.
In this case, $K_0 = \mathbb{Q}$, $A_0 = \mathbb{Z}$ and $B = \mathbb{Z}[f^{-1}, y]$, where $y$ is an algebraic integer with minimal polynomial $\mathcal{F} = X^D + F_1 X^{D-1} + \cdots + F_D \in \mathbb{Z}[X]$ over $\mathbb{Q}$, and $f$ is a non-zero rational integer. By assumption, $\log |f| \le h_1$, $\log |F_i| \le h_1$ for $i = 1, \dots, D$. Denote by $y_1, \dots, y_D$ the conjugates of $y$, and let $L = \mathbb{Q}(y_j)$ for some $j$. By Lemma 1.5.1 we have $|D_L| \le D^{2D-1} e^{(2D-2) h_1}$. The isomorphism given by $y \mapsto y_j$ maps $K$ to $L$ and $B$ to $O_S$, where $S$ consists of the infinite places of $L$ and of the prime ideals of $O_L$ that divide $f$. The estimates (8.6.2)–(8.6.5) remain valid if we replace $R(u)$ by $e^{h_1}$. Hence for any solution $(x_1, x_2)$ of (8.3.17),
\[ h(x_{1,j}),\ h(x_{2,j}) \le \exp O\bigl( (D \log^* D) h_1 \bigr), \]
where $x_{1,j}, x_{2,j}$ are the $j$-th conjugates of $x_1, x_2$, respectively. Now an application of Lemma 8.5.1 with $G = \mathcal{F}$, $m = D$, $\beta_j = x_{1,j}$ gives
\[ \overline{h}(x_1) \le \exp O\bigl( (D \log^* D) h_1 \bigr). \]
Again we derive the same upper bound for $\overline{h}(x_2)$, and deduce (8.3.19). This completes the proof of Proposition 8.3.7.

8.7 Proof of Theorem 8.1.3

We start with some results on multiplicative (in)dependence. We first recall a result that was published by Loher and Masser, but attributed by them to Yu Kunrui. Another result of this type was obtained earlier in Loxton and van der Poorten (1983).

Lemma 8.7.1. Let $L$ be an algebraic number field of degree $d$, and $\gamma_0, \dots, \gamma_s$ non-zero elements of $L$ such that $\gamma_0, \dots, \gamma_s$ are multiplicatively dependent, but any $s$ elements among $\gamma_0, \dots, \gamma_s$ are multiplicatively independent. Then there are non-zero integers $k_0, \dots, k_s$ such that
\[ \gamma_0^{k_0} \cdots \gamma_s^{k_s} = 1, \]
\[ |k_i| \le 58\, (s!\, e^s / s^s) \cdot d^{s+1} (\log^* d)\, h(\gamma_0) \cdots h(\gamma_s) / h(\gamma_i) \quad \text{for } i = 0, \dots, s. \]

Proof. See Loher and Masser (2004), Corollary 3.2.
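For a toy illustration of the objects in Lemma 8.7.1, take $L = \mathbb{Q}$ (so $d = 1$). The Python sketch below is an assumption-laden illustration and not the method of Loher and Masser: the function name `find_relation` and the brute-force search box are hypothetical choices. It looks for a non-zero exponent vector $(k_0, \dots, k_s)$ with $\gamma_0^{k_0} \cdots \gamma_s^{k_s} = 1$ among vectors with entries bounded by a given `bound`, using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product


def find_relation(gammas, bound):
    """Search the box |k_i| <= bound for a multiplicative relation.

    gammas: non-zero rational numbers (ints or Fractions).
    Returns the first non-zero integer vector k, in the iteration order of
    itertools.product, with prod(gamma_i ** k_i) == 1, or None if the box
    contains no such vector.
    """
    for k in product(range(-bound, bound + 1), repeat=len(gammas)):
        if not any(k):
            continue  # skip the trivial relation k = 0
        value = Fraction(1)
        for gamma, exponent in zip(gammas, k):
            value *= Fraction(gamma) ** exponent
        if value == 1:
            return k
    return None
```

For instance, `find_relation([2, 3, 6], 1)` finds a relation expressing $2 \cdot 3 = 6$, in which all three exponents are non-zero, as the lemma predicts, since any two of $2, 3, 6$ are multiplicatively independent. By contrast, `find_relation([2, 3, 5], 3)` returns `None`, because distinct primes are multiplicatively independent. The point of Lemma 8.7.1 is that a search box of explicitly bounded size always suffices.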
We prove a generalization for arbitrary finitely generated integral domains. As before, let $A = \mathbb{Z}[z_1, \dots, z_r] \supseteq \mathbb{Z}$ be an integral domain finitely generated over $\mathbb{Z}$, and suppose that the ideal $I$ of polynomials $f \in \mathbb{Z}[X_1, \dots, X_r]$ with $f(z_1, \dots, z_r) = 0$ is generated by $f_1, \dots, f_m$. Let $K$ be the quotient field of $A$. Let $\gamma_0, \dots, \gamma_s$ be non-zero elements of $K$, and for $i = 0, \dots, s$, let $(g_{i1}, g_{i2})$ be a pair of representatives for $\gamma_i$, i.e., elements of $\mathbb{Z}[X_1, \dots, X_r]$ such that
\[ \gamma_i = \frac{g_{i1}(z_1, \dots, z_r)}{g_{i2}(z_1, \dots, z_r)}. \]

Lemma 8.7.2. Assume that $f_1, \dots, f_m$ and $g_{i1}, g_{i2}$ $(i = 0, \dots, s)$ have degrees at most $d$ and logarithmic heights at most $h$, where $d \ge 1$, $h \ge 1$. Further, assume that $\gamma_0, \dots, \gamma_s$ are multiplicatively dependent. Then there are integers $k_0, \dots, k_s$, not all equal to $0$, such that
\[ \gamma_0^{k_0} \cdots \gamma_s^{k_s} = 1, \qquad |k_i| \le (2d)^{\exp O(r+s)} (h+1)^s \quad \text{for } i = 0, \dots, s. \]

Proof. We assume without loss of generality that any $s$ numbers among $\gamma_0, \dots, \gamma_s$ are multiplicatively independent (if this is not the case, take a minimal multiplicatively dependent subset of $\{\gamma_0, \dots, \gamma_s\}$ and proceed further with this subset). We first assume that $q > 0$. We use an argument of van der Poorten and Schlickewei (1991). We keep the notation and assumptions from Sections 8.3–8.5. In particular, we assume that $z_1, \dots, z_q$ is a transcendence basis of $K$, and rename $z_{q+1}, \dots, z_r$ as $y_1, \dots, y_t$, respectively. For brevity, we have included the case $t = 0$ as well in our proof. But it should be possible to prove in this case a sharper result by means of a more elementary method.
In the case $t > 0$, $y$ and $\mathcal{F} = X^D + F_1 X^{D-1} + \cdots + F_D$ will be as in Corollary 8.3.3. In the case $t = 0$ we take $m = 1$, $f_1 = 0$, $d = h = 1$, $y = 1$, $\mathcal{F} = X - 1$, $D = 1$. We construct a specialization such that among the images of $\gamma_0, \dots, \gamma_s$ no $s$ elements are multiplicatively dependent, and then apply Lemma 8.7.1.

Let $V \ge 2d$ be a positive integer. Later we shall make our choice of $V$ more precise. Let
\[ \mathcal{V} := \bigl\{ v = (v_0, \dots, v_s) \in \mathbb{Z}^{s+1} \setminus \{0\} : |v_i| \le V \text{ for } i = 0, \dots, s,\ v_i = 0 \text{ for some } i \bigr\}. \tag{8.7.1} \]
Then
\[ \gamma_v := \prod_{i=0}^{s} \gamma_i^{v_i} - 1 \quad (v \in \mathcal{V}) \]
are non-zero elements of $K$, since each proper subset of $\{\gamma_0, \dots, \gamma_s\}$ is multiplicatively independent. It is not difficult to show that for $v \in \mathcal{V}$, $\gamma_v$ has a pair of representatives $(g_{1,v}, g_{2,v})$ such that $\deg g_{1,v}, \deg g_{2,v} \le sdV$. In the case $t > 0$, there exists by Lemma 8.3.5 a non-zero $f \in A_0$ such that
\[ A \subseteq B := A_0[y, f^{-1}], \qquad \gamma_v \in B^* \text{ for } v \in \mathcal{V} \]
and
\[ \deg f \le V^{s+1} (2sdV)^{\exp O(r)} \le V^{\exp O(r+s)}. \]
In the case $t = 0$ this holds true as well, with $y = 1$ and $f = \prod_{v \in \mathcal{V}} (g_{1,v} \cdot g_{2,v})$. We apply the theory on specializations explained in Section 8.5 with this $f$. We put $\mathcal{H} := \Delta_{\mathcal{F}} F_D f$, where $\Delta_{\mathcal{F}}$ is the discriminant of $\mathcal{F}$. Using Corollary 8.3.3 and inserting the bound $D \le d^t$ from Lemma 8.3.1 we get for $t > 0$,
\[ d_0 \le (2d)^{\exp O(r)}, \qquad h_0 \le (2d)^{\exp O(r)} (h+1), \tag{8.7.2} \]
where
\[ d_0 := \max(\deg f_1, \dots, \deg f_m, \deg F_1, \dots, \deg F_D), \qquad h_0 := \max(h(f_1), \dots, h(f_m), h(F_1), \dots, h(F_D)). \]
With the provision $\deg 0 = h(0) = -\infty$, the inequalities (8.7.2) hold true also if $t = 0$. Combining this with Lemma 8.3.4, we obtain
\[ \deg \mathcal{H} \le (2D - 1) d_0 + \deg f \le V^{\exp O(r+s)}. \]
By Lemma 8.5.2 there exists $u \in \mathbb{Z}^q$ with
\[ \mathcal{H}(u) \neq 0, \qquad |u| \le V^{\exp O(r+s)}. \tag{8.7.3} \]
We proceed further with this $u$.
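The existence statement taken from Lemma 8.5.2 is constructive in a naive sense: a non-zero polynomial of degree at most $d$ cannot vanish on all of $\mathcal{N}^q$ once $|\mathcal{N}| > d$, so an exhaustive search over such a grid must succeed. The Python sketch below illustrates this under stated assumptions. The names `nonvanishing_point` and `count_zeros` are hypothetical, the polynomial is supplied as an ordinary Python function, and the grid $\mathcal{N} = \{0, 1, \dots, d\}$ is one convenient choice, not the one underlying (8.7.3).

```python
from itertools import product


def count_zeros(g, q, grid):
    """Number of points u in grid^q with g(u) == 0."""
    return sum(1 for u in product(grid, repeat=q) if g(*u) == 0)


def nonvanishing_point(g, q, d):
    """Search N^q, N = {0, 1, ..., d}, for a point where g does not vanish.

    g is assumed to be a non-zero integer polynomial in q variables of
    total degree at most d, given as a callable on q integers.
    """
    grid = range(d + 1)  # |N| = d + 1 > d, as Lemma 8.5.2 requires
    for u in product(grid, repeat=q):
        if g(*u) != 0:
            return u
    return None  # cannot happen if the degree assumption holds


def example(z1, z2):
    """A sample polynomial of total degree 3 in two variables."""
    return z1 * z2 * (z1 - z2)
```

For `example` with $q = 2$, $d = 3$, the lemma bounds the number of zeros in $\{0, 1, 2, 3\}^2$ by $d\,|\mathcal{N}|^{q-1} = 12$ out of $16$ grid points, so `nonvanishing_point(example, 2, 3)` is guaranteed to return a point.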
As we have seen before, $\gamma_v \in B^*$ for $v \in \mathcal{V}$. By our choice of $u$, there are $D$ distinct specialization maps $\varphi_{u,j}$ $(j = 1, \dots, D)$ from $B$ to $\overline{\mathbb{Q}}$. We fix one of these specializations, say $\varphi_u$. Given $\alpha \in B$, we write $\alpha(u)$ for $\varphi_u(\alpha)$. As the elements $\gamma_v$ are all units in $B$, their images under $\varphi_u$ are non-zero. So we have
\[ \prod_{i=0}^{s} \gamma_i(u)^{v_i} \neq 1 \quad \text{for } v \in \mathcal{V}, \tag{8.7.4} \]
where $\mathcal{V}$ is defined by (8.7.1). We use Lemma 8.5.5 to estimate the heights $h(\gamma_i(u))$ for $i = 0, \dots, s$. Recall that by Lemma 8.3.4 we have
\[ \overline{\deg}\,\gamma_i \le (2d)^{\exp O(r)}, \qquad \overline{h}(\gamma_i) \le (2d)^{\exp O(r)} (h+1) \]
for $i = 0, \dots, s$. By inserting these bounds, together with the bound $D \le d^t$ from Lemma 8.3.1, those for $d_0, h_0$ from (8.7.2) and that for $u$ from (8.7.3) into the bound from Lemma 8.5.5, we obtain for $i = 0, \dots, s$,
\[ h(\gamma_i(u)) \le (2d)^{\exp O(r)} \bigl( 1 + h + \log \max(1, |u|) \bigr) \le (2d)^{\exp O(r+s)} (1 + h + \log V). \tag{8.7.5} \]
Assume that among $\gamma_0(u), \dots, \gamma_s(u)$ there are $s$ numbers that are multiplicatively dependent. By Lemma 8.7.1 there are integers $k_0, \dots, k_s$, at least one of which is non-zero and at least one of which is $0$, such that
\[ \prod_{i=0}^{s} \gamma_i(u)^{k_i} = 1, \qquad |k_i| \le (2d)^{\exp O(r+s)} (1 + h + \log V)^{s-1} \quad \text{for } i = 0, \dots, s. \]
Now for
\[ V = (2d)^{\exp O(r+s)} (h+1)^{s-1} \tag{8.7.6} \]
(with a sufficiently large constant in the $O$-symbol), the upper bound for the numbers $|k_i|$ is smaller than $V$. But this would imply that $\prod_{i=0}^{s} \gamma_i(u)^{v_i} = 1$ for some $v \in \mathcal{V}$, contrary to (8.7.4). Thus we conclude that with the choice (8.7.6) for $V$, there exists $u \in \mathbb{Z}^q$ with (8.7.3), such that any $s$ numbers among $\gamma_0(u), \dots, \gamma_s(u)$ are multiplicatively independent. The numbers $\gamma_0(u), \dots, \gamma_s(u)$ are multiplicatively dependent, since they are the images under $\varphi_u$ of $\gamma_0, \dots, \gamma_s$, which are multiplicatively dependent. Substituting (8.7.6) into (8.7.5) we obtain
\[ h(\gamma_i(u)) \le (2d)^{\exp O(r+s)} (h+1) \quad \text{for } i = 0, \dots, s. \tag{8.7.7} \]
Now Lemma 8.7.1 implies that there are non-zero integers $k_0, \dots, k_s$ such that
\[ \prod_{i=0}^{s} \gamma_i(u)^{k_i} = 1, \tag{8.7.8} \]
\[ |k_i| \le (2d)^{\exp O(r+s)} (h+1)^s \quad \text{for } i = 0, \dots, s. \tag{8.7.9} \]
Our assumption on $\gamma_0, \dots, \gamma_s$ implies that there are non-zero integers $l_0, \dots, l_s$ such that $\prod_{i=0}^{s} \gamma_i^{l_i} = 1$. Hence $\prod_{i=0}^{s} \gamma_i(u)^{l_i} = 1$. Together with (8.7.8) this implies
\[ \prod_{i=1}^{s} \gamma_i(u)^{l_0 k_i - l_i k_0} = 1. \]
But $\gamma_1(u), \dots, \gamma_s(u)$ are multiplicatively independent, hence $l_i k_0 = k_i l_0$ for $i = 1, \dots, s$. Therefore,
\[ \bigl( \gamma_0^{k_0} \cdots \gamma_s^{k_s} \bigr)^{l_0} = \bigl( \gamma_0^{l_0} \cdots \gamma_s^{l_s} \bigr)^{k_0} = 1, \]
implying that $\prod_{i=0}^{s} \gamma_i^{k_i} = \rho$ for some root of unity $\rho$. But $\varphi_u(\rho) = 1$ and it is conjugate to $\rho$. Hence $\rho = 1$. So in fact we have $\prod_{i=0}^{s} \gamma_i^{k_i} = 1$ with non-zero integers $k_i$ satisfying (8.7.9). This proves our lemma, but under the assumption $q > 0$. If $q = 0$ then a much simpler argument, without specializations, gives $h(\gamma_i) \le (2d)^{\exp O(r+s)} (h+1)$ for $i = 0, \dots, s$ instead of (8.7.7). Then the proof is finished in the same way as in the case $q > 0$.

Corollary 8.7.3. Assume that $f_1, \dots, f_m$ and $g_{i1}, g_{i2}$ $(i = 0, \dots, s)$ have degrees at most $d$ and logarithmic heights at most $h$, where $d \ge 1$, $h \ge 1$. Further, assume that $\gamma_1, \dots, \gamma_s$ are multiplicatively independent and
\[ \gamma_0 = \gamma_1^{k_1} \cdots \gamma_s^{k_s} \]
for certain integers $k_1, \dots, k_s$. Then
\[ |k_i| \le (2d)^{\exp O(r+s)} (h+1)^s \quad \text{for } i = 1, \dots, s. \]

Proof. By Lemma 8.7.2, and by the multiplicative independence of $\gamma_1, \dots, \gamma_s$, there are integers $l_0, \dots, l_s$ such that
\[ \prod_{i=0}^{s} \gamma_i^{l_i} = 1, \qquad l_0 \neq 0, \qquad |l_i| \le (2d)^{\exp O(r+s)} (h+1)^s \quad \text{for } i = 0, \dots, s. \]
Now, clearly, substituting $\gamma_0 = \gamma_1^{k_1} \cdots \gamma_s^{k_s}$, we have also

$$\prod_{i=1}^{s} \gamma_i^{l_0 k_i + l_i} = 1,$$

hence $l_0 k_i + l_i = 0$ for $i = 1, \ldots, s$. It follows that

$$|k_i| = |l_i / l_0| \leq (2d)^{\exp O(r+s)} (h + 1)^{s} \quad \text{for } i = 1, \ldots, s.$$

This implies our corollary.

Proof of Theorem 8.1.3. We keep the notation and assumptions from the statement of Theorem 8.1.3. For $i = 1, \ldots, s$, $j = 1, 2$, let $\alpha_{ij} := g_{ij}(z_1, \ldots, z_r)$. Then $\alpha_{i1}, \alpha_{i2} \in A$ and $\gamma_i = \alpha_{i1} / \alpha_{i2}$ for $i = 1, \ldots, s$. Further, let

$$g := \prod_{i=1}^{s} (g_{i1} g_{i2}), \quad \gamma := \prod_{i=1}^{s} (\alpha_{i1} \alpha_{i2})$$

and define the ring $\widetilde{A} := A[\gamma^{-1}]$. Then

$$\widetilde{A} \cong \mathbb{Z}[X_1, \ldots, X_r, X_{r+1}] / \widetilde{I} \quad \text{with } \widetilde{I} = (f_1, \ldots, f_m, g X_{r+1} - 1).$$

Clearly, $\gamma \in \widetilde{A}^*$, therefore also $\alpha_{i1}, \alpha_{i2} \in \widetilde{A}^*$, and hence $\gamma_i \in \widetilde{A}^*$ for $i = 1, \ldots, s$. Further, $g$ has total degree at most $O(sd)$ and logarithmic height at most $O(sh)$. As a consequence, $\widetilde{I}$ is generated by polynomials of total degrees at most $O(sd)$ and logarithmic heights at most $O(sh)$.

Let $(v_1, \ldots, v_s, w_1, \ldots, w_s)$ be a solution of (8.1.3), and put

$$x_1 := \prod_{i=1}^{s} \gamma_i^{v_i}, \quad x_2 := \prod_{i=1}^{s} \gamma_i^{w_i}.$$

Then

$$a_1 x_1 + a_2 x_2 = a_3, \quad x_1, x_2 \in \widetilde{A}^*.$$

By Theorem 8.1.1, $x_1$ has a representative $\widetilde{x}_1 \in \mathbb{Z}[X_1, \ldots, X_{r+1}]$ of degree and logarithmic height both bounded above by

$$\exp\big( (2sd)^{\exp O(r)} (h + 1) \big).$$

Now Corollary 8.7.3 implies

$$|v_i| \leq \exp\big( (2sd)^{\exp O(r)} (h + 1) \big) \quad \text{for } i = 1, \ldots, s.$$

For $|w_i|$ ($i = 1, \ldots, s$) we derive a similar upper bound. This completes the proof of Theorem 8.1.3.

8.8 Notes

• It is well known that if $A$ is a finitely generated integral domain over $\mathbb{Z}$, then its quotient field $K$ can be represented in the form $K = \mathbb{Q}(z_1, \ldots, z_q, y)$, where $\{z_1, \ldots, z_q\}$ is a transcendence basis of $K$ over $\mathbb{Q}$, $y$ is an integral element over $A_0 := \mathbb{Z}[z_1, \ldots, z_q]$, and $A$ is contained in the integral domain $B := \mathbb{Z}[z_1, \ldots, z_q, f^{-1}, y]$, for some non-zero $f \in A_0$. As was seen in Sections 8.4–8.6, such an overring $B$ has the advantage that it is easier to deal with its elements. The generating sets $\{z_1, \ldots, z_q, y\}$ of the above type proved to be useful in several other applications, among others in transcendental number theory; see e.g. Waldschmidt (1973, 1974), where an appropriate size is introduced for the elements of $K$ with respect to a generating set $\{z_1, \ldots, z_q, y\}$.

• Following a method of Győry (1983, 1984), analogues of some results of this chapter can be established over integral domains finitely generated over a field of characteristic zero instead of $\mathbb{Z}$. However, in this case finiteness results cannot be obtained; upper bounds can be derived only for the degrees of the solutions.

• Theorems 8.1.1–8.1.3 as well as the method of their proofs have several applications. For instance, Theorems 8.1.1–8.1.3 are applied in our book on discriminant equations. Further, their methods of proof are used in several papers to obtain general effective finiteness results for various classes of Diophantine equations over finitely generated domains over $\mathbb{Z}$, namely for Thue equations and superelliptic equations in Bérczes, Evertse and Győry (2014), for polynomial equations $f(x, y) = 0$ in solutions $x, y$ from a finitely generated multiplicative group, and even from the division group of the latter, in Bérczes (2015a, 2015b), and for Catalan's equation in Koymans (2015).
9 Decomposable form equations

Let $F \in \mathbb{Z}[X, Y]$ be a binary form, i.e., a homogeneous polynomial, of degree $n \geq 3$ that is irreducible over $\mathbb{Q}$, and $\delta$ a non-zero integer. Thue (1909) proved that the equation

$$F(x, y) = \delta \quad \text{in } x, y \in \mathbb{Z}$$

has only finitely many solutions. This was extended by Mahler (1933a) as follows. Let $p_1, \ldots, p_t$ be distinct prime numbers. Then the equation

$$F(x, y) = \pm \delta p_1^{z_1} \cdots p_t^{z_t} \quad \text{in } x, y, z_1, \ldots, z_t \in \mathbb{Z} \text{ with } \gcd(x, y, p_1 \cdots p_t) = 1$$

has only finitely many solutions. Mahler's result can be reformulated as follows. In accordance with terminology introduced before, let $S = \{\infty, p_1, \ldots, p_t\}$, where $\infty$ is the infinite place of $\mathbb{Q}$, and let $\mathbb{Z}_S = \mathbb{Z}[(p_1 \cdots p_t)^{-1}]$ be the ring of $S$-integers. Then the set of solutions of the equation

$$F(x, y) \in \delta \mathbb{Z}_S^* \quad \text{in } x, y \in \mathbb{Z}_S$$

is a union of only finitely many $\mathbb{Z}_S^*$-cosets, i.e., sets of the form $\{\varepsilon (x_0, y_0) : \varepsilon \in \mathbb{Z}_S^*\}$, where $(x_0, y_0)$ is a solution of the equation.

In this chapter, we deal with generalizations, where instead of equations over $\mathbb{Z}$ or $\mathbb{Z}_S$ we consider equations over integral domains that are finitely generated over $\mathbb{Z}$, and where instead of a binary form $F$ we take a decomposable form in an arbitrary number of variables, that is, a homogeneous polynomial that factors into linear forms over an extension of its field of definition. More precisely, let $K$ be a finitely generated (but not necessarily algebraic) extension field of $\mathbb{Q}$, and $F \in K[X_1, \ldots, X_m]$ a decomposable form in $m \geq 2$ variables, which factors into linear forms over a finite extension of $K$. Further, let $\delta \in K^*$ and let $A$ be a subring of $K$ that is finitely generated over $\mathbb{Z}$. We consider the equations

$$F(\mathbf{x}) = \delta \quad \text{in } \mathbf{x} = (x_1, \ldots, x_m) \in A^m \tag{9.1}$$

and

$$F(\mathbf{x}) \in \delta A^* \quad \text{in } \mathbf{x} = (x_1, \ldots, x_m) \in A^m, \tag{9.2}$$

where $A^*$ denotes the unit group of $A$. Equations of the type (9.1) and (9.2) are called decomposable form equations. The set of solutions of (9.2) can be divided into $A^*$-cosets $\mathbf{x}_0 A^* = \{\varepsilon \mathbf{x}_0 : \varepsilon \in A^*\}$, where $\mathbf{x}_0$ is a solution of (9.2). By Roquette's Theorem (Roquette (1957), p. 3), the unit group $A^*$ is finitely generated. Hence it is easy to see that (9.2) can be reduced to finitely many equations of the form (9.1).

Clearly, every binary form is a decomposable form in two variables. Equations of type (9.1) and (9.2) are called Thue equations and Thue–Mahler equations, respectively, in the case that $F$ is a binary form. Unlike in the results of Thue and Mahler mentioned above, we do not require that the binary form is irreducible over its field of definition. Other important special cases of decomposable form equations are norm form equations, discriminant form equations and index form equations. As we shall see, decomposable form equations are in a certain sense equivalent to unit equations and in particular, Thue equations are equivalent to unit equations in two unknowns. Decomposable form equations have many number-theoretic applications.

This chapter is basically an extensive survey, in which for some of the stated results we give a complete proof, whereas for the proofs of others we refer to the literature. Below, we give a brief overview.

In Section 9.1 we present a general finiteness criterion for equations (9.1) and (9.2), and in particular for Thue equations when $F$ is a binary form. For convenience in applications, we establish our criterion for slightly more general equations.
This criterion gives effectively decidable necessary and sufficient conditions in terms of the linear factors of $F$ such that equations (9.1) and (9.2) have only finitely many ($A^*$-cosets of) solutions.

In Section 9.2 we explain how our finiteness criterion for decomposable form equations implies the finiteness result for unit equations established in Chapter 6. In Section 9.3 we deduce our finiteness criterion for equations (9.1) and (9.2) from the finiteness result for unit equations. This shows that the finiteness results for unit equations and decomposable form equations are equivalent.

In Section 9.4 we give, without proof, a complete description of the set of solutions of (9.1) and (9.2) in the case when this set is infinite. More precisely, in this case, the set of solutions can be divided in a natural way into infinite families, and the number of these families is finite.

In Section 9.5, we present, without proofs, explicit upper bounds for the number of families and, in the case of finitely many solutions, for the number of $S$-integral solutions for decomposable form equations over the ring of $S$-integers of a number field.

In Section 9.6 we derive, with full proofs, effective bounds for the heights of the $S$-integral solutions of Thue equations, and of decomposable form equations in an arbitrary number of unknowns from a restricted class, including discriminant form equations and certain norm form equations. The proofs are based on effective results from Chapter 4 concerning $S$-unit equations.
In our next book, Discriminant Equations in Diophantine Number Theory, we work out various applications of unit equations to discriminant form equations, index form equations, decomposable form equations of discriminant type and related problems.

There is an extensive literature on decomposable form equations. Almost all books and survey papers listed in the Preface of this book that deal with unit equations and their applications are also concerned with decomposable form equations. We refer also to Borevich and Shafarevich (1967), Evertse and Győry (1988d), Feldman and Nesterenko (1998) and Győry (1999) on the subject. Some further references can be found in the Notes (Section 9.7).

9.1 A finiteness criterion for decomposable form equations

We present a general finiteness criterion which guarantees the finiteness of the number of solutions of equation (9.1), and the finiteness of the number of $A^*$-cosets of solutions of equation (9.2) for every $\delta \in K^*$ and every subring $A$ of $K$ which is finitely generated over $\mathbb{Z}$.

Let again $K$ be a field which is finitely generated over $\mathbb{Q}$. We fix an algebraic closure $\overline{K}$ of $K$. Let $A \subset K$ be a ring finitely generated over $\mathbb{Z}$. Further, let $F \in K[X_1, \ldots, X_m]$ be a non-zero decomposable form in $m \geq 2$ variables and let $\delta \in K^*$. For applications it is convenient to extend (9.1) and (9.2) mentioned in the introduction and to consider the equations

$$F(\mathbf{x}) = \delta \quad \text{in } \mathbf{x} \in \mathcal{M} \text{ with } l(\mathbf{x}) \neq 0 \text{ for } l \in \mathcal{L}, \tag{9.1.1}$$

and

$$F(\mathbf{x}) \in \delta A^* \quad \text{in } \mathbf{x} \in \mathcal{M} \text{ with } l(\mathbf{x}) \neq 0 \text{ for } l \in \mathcal{L}, \tag{9.1.2}$$

where $\mathcal{M}$ is a finitely generated $A$-module with $\mathcal{M} \subset K^m$, and $\mathcal{L}$ is a finite set of non-zero linear forms from $\overline{K}[X_1, \ldots, X_m]$. In the special case where $\mathcal{M} = A^m$ and $\mathcal{L}$ consists of the linear factors of $F$, equations (9.1.1) and (9.1.2) give (9.1) and (9.2), respectively.
We give necessary and sufficient conditions, in terms of $\mathcal{L}$ and the linear factors of $F$, such that (9.1.1) and (9.1.2) have only finitely many ($A^*$-cosets of) solutions.

We introduce some convenient notation. In what follows we let $G$ be a finite, normal extension of $K$ over which $F$ factorizes into linear factors. For a linear form $l = \alpha_1 X_1 + \cdots + \alpha_m X_m \in G[X_1, \ldots, X_m]$ and for $\sigma \in \mathrm{Gal}(G/K)$ we define

$$\sigma(l) := \sigma(\alpha_1) X_1 + \cdots + \sigma(\alpha_m) X_m.$$

For a subset $\mathcal{L}$ of linear forms from $G[X_1, \ldots, X_m]$, we define the following:

$\sigma(\mathcal{L}) = \{\sigma(l) : l \in \mathcal{L}\}$ for $\sigma \in \mathrm{Gal}(G/K)$;
$[\mathcal{L}]$ is the $G$-vector space generated by $\mathcal{L}$;
$\mathcal{L}$ is called $\mathrm{Gal}(G/K)$-stable if $\sigma(\mathcal{L}) = \mathcal{L}$ for each $\sigma \in \mathrm{Gal}(G/K)$;
$\mathcal{L}$ is called $\mathrm{Gal}(G/K)$-proper if for each $\sigma \in \mathrm{Gal}(G/K)$ we have either $\sigma(\mathcal{L}) = \mathcal{L}$ or $\sigma(\mathcal{L}) \cap \mathcal{L} = \emptyset$.

Given $G$-linear subspaces $V_1, \ldots, V_t$ of the space of linear forms from $G[X_1, \ldots, X_m]$, we denote by $V_1 + \cdots + V_t$ the smallest $G$-vector space containing them.

We have

$$F = c\, l_1^{e_1} \cdots l_n^{e_n}, \tag{9.1.3}$$

where $\mathcal{L}_0 := \{l_1, \ldots, l_n\} \subset G[X_1, \ldots, X_m]$ is a $\mathrm{Gal}(G/K)$-stable set of pairwise non-proportional linear forms, $c \in K^*$ and $e_1, \ldots, e_n$ are positive integers. Clearly, $e_i = e_j$ if $l_i = \sigma(l_j)$ for some $\sigma \in \mathrm{Gal}(G/K)$. Let $\mathcal{L}$ be a finite set of pairwise non-proportional linear forms with

$$\mathcal{L} \supseteq \mathcal{L}_0, \quad \mathcal{L} \subset G[X_1, \ldots, X_m].$$

The main result of this section is as follows.

Theorem 9.1.1 Let $m$, $K$, $F$, $G$, $\mathcal{L}_0$, $\mathcal{L}$ be as above. Then the following three assertions are equivalent:

(i) $\mathrm{rank}_G\, \mathcal{L}_0 = m$, and for each non-empty subset $\mathcal{L}_1 \subsetneq \mathcal{L}_0$ such that $\mathcal{L}_1$ is $\mathrm{Gal}(G/K)$-proper, we have

$$\mathcal{L} \cap \Big( \sum_{\sigma \in \mathrm{Gal}(G/K)} [\sigma(\mathcal{L}_1)] \cap [\mathcal{L}_0 \setminus \sigma(\mathcal{L}_1)] \Big) \neq \emptyset; \tag{9.1.4}$$
(ii) for every subring $A$ of $K$ which is finitely generated over $\mathbb{Z}$, every finitely generated $A$-module $\mathcal{M} \subset K^m$, and every $\delta \in K^*$, equation (9.1.1) has only finitely many solutions;

(iii) for every $A$, $\mathcal{M}$, $\delta$ as in (ii), equation (9.1.2) has only finitely many $A^*$-cosets of solutions.

Theorem 9.1.1 is new. We shall prove it in Section 9.3. An important feature of this theorem is that it relates statements (i.e., (ii), (iii)) about Diophantine equations to a statement (i.e., (i)) in linear algebra. Furthermore, assertion (i) is effectively decidable provided $K$, $G$ and the coefficients of the linear forms in $\mathcal{L}_0$ and $\mathcal{L}$ are effectively given in some sense.

The following corollary is an immediate consequence of Theorem 9.1.1.

Corollary 9.1.2 Let $m$, $K$, $F$, $G$, $\mathcal{L}_0$ and $\mathcal{L}$ be as in Theorem 9.1.1. If

(i′) $\mathrm{rank}_G\, \mathcal{L}_0 = m$ and $\mathcal{L} \cap ([\mathcal{L}_1] \cap [\mathcal{L}_0 \setminus \mathcal{L}_1]) \neq \emptyset$ for every proper, non-empty subset $\mathcal{L}_1$ of $\mathcal{L}_0$,

then (ii), (iii) hold. Moreover, if $G = K$, then (i′), (ii), (iii) are equivalent.

Similar finiteness criteria were established in Evertse and Győry (1988c); see also Evertse, Gaál and Győry (1989). We deduce some further consequences.

Corollary 9.1.3 Let $m$, $K$, $F$, $G$, $\mathcal{L}_0$ and $\mathcal{L}$ be as in Theorem 9.1.1, and let $\mathcal{L} = \mathcal{L}_0$. Assume that $|\mathcal{L}_0| \geq 2m - 1$ and that $\mathcal{L}_0$ is in general position, i.e., each subset of $m$ linear forms from $\mathcal{L}_0$ is linearly independent. Then equation (9.1.1) has only finitely many solutions.

For $\mathcal{M} = A^m$, this was proved in Győry (1993b).

Proof of Corollary 9.1.3. We apply Corollary 9.1.2. Notice that for each proper, non-empty subset $\mathcal{L}_1$ of $\mathcal{L}_0$ we have $|\mathcal{L}_1| \geq m$ or $|\mathcal{L}_0 \setminus \mathcal{L}_1| \geq m$, i.e., $\mathrm{rank}_G [\mathcal{L}_1] = m$ or $\mathrm{rank}_G [\mathcal{L}_0 \setminus \mathcal{L}_1] = m$. Hence $[\mathcal{L}_1] \cap [\mathcal{L}_0 \setminus \mathcal{L}_1]$ contains $\mathcal{L}_1$ or $\mathcal{L}_0 \setminus \mathcal{L}_1$. This implies (i′) with $\mathcal{L} = \mathcal{L}_0$, and, by Corollary 9.1.2, (ii) follows.

Corollary 9.1.4 Let $F \in K[X, Y]$ be a non-zero binary form.
Then the following two assertions are equivalent:

(iv) $F$ is divisible by at least three pairwise non-proportional linear forms from $\overline{K}[X, Y]$;

(v) for every subring $A$ of $K$ which is finitely generated over $\mathbb{Z}$ and every $\delta \in K^*$, the equation

$$F(x, y) = \delta \quad \text{in } x, y \in A \tag{9.1.5}$$

has only finitely many solutions.

The implication (iv) ⇒ (v) follows from work in Thue (1909) for $K = \mathbb{Q}$, $A = \mathbb{Z}$, Siegel (1921) for $K$ an arbitrary algebraic number field and $A$ its ring of integers, Mahler (1933a) for $K = \mathbb{Q}$ and $A = \mathbb{Z}_S$ for some finite set of places $S$ of $\mathbb{Q}$, Parry (1950) for $K$ an arbitrary algebraic number field and $A = O_S$ for some finite set of places $S$ of $K$, and Lang (1960) in the most general case. As was mentioned before, equation (9.1.5) is usually called a Thue equation.

Proof of Corollary 9.1.4. Let $G$ be the splitting field of $F$ over $K$. Then $G$ is a finite, normal extension of $K$. Let $\mathcal{L}_0$ be a maximal $\mathrm{Gal}(G/K)$-stable set of pairwise non-proportional linear forms from $G[X, Y]$ that divide $F$. We have to show that (i) with $m = 2$, $\mathcal{L} = \mathcal{L}_0$ is equivalent to $|\mathcal{L}_0| \geq 3$.

First assume that $|\mathcal{L}_0| \geq 3$. Then $\mathrm{rank}_G\, \mathcal{L}_0 = 2$. Next, for each proper, non-empty subset $\mathcal{L}_1$ of $\mathcal{L}_0$ we have $|\mathcal{L}_1| \geq 2$ or $|\mathcal{L}_0 \setminus \mathcal{L}_1| \geq 2$, i.e., $\mathrm{rank}_G [\mathcal{L}_1] = 2$ or $\mathrm{rank}_G [\mathcal{L}_0 \setminus \mathcal{L}_1] = 2$, and this implies that $[\mathcal{L}_1] \cap [\mathcal{L}_0 \setminus \mathcal{L}_1]$ contains $\mathcal{L}_1$ or $\mathcal{L}_0 \setminus \mathcal{L}_1$. This gives (9.1.4).

Conversely, assume that $|\mathcal{L}_0| = 2$. Then each proper, non-empty subset $\mathcal{L}_1$ of $\mathcal{L}_0$ has $|\mathcal{L}_1| = 1$ and so $[\sigma(\mathcal{L}_1)] \cap [\mathcal{L}_0 \setminus \sigma(\mathcal{L}_1)] = (0)$ for every $\sigma \in \mathrm{Gal}(G/K)$. Hence (9.1.4) cannot hold.
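The dichotomy in Corollary 9.1.4 can be observed numerically in the classical case $K = \mathbb{Q}$, $A = \mathbb{Z}$. The following Python sketch is an illustration only and proves nothing; the function names and the search bounds are assumptions made for this example. It compares $X^3 - 2Y^3$, which has three pairwise non-proportional linear factors over $\overline{\mathbb{Q}}$, with $X^2 - 2Y^2$, which has only two and whose equation $x^2 - 2y^2 = 1$ has the infinite family of solutions coming from the powers of $3 + 2\sqrt{2}$.

```python
def solutions_in_box(form, delta, bound):
    """All integer pairs (x, y) with |x|, |y| <= bound and form(x, y) == delta."""
    return [(x, y)
            for x in range(-bound, bound + 1)
            for y in range(-bound, bound + 1)
            if form(x, y) == delta]


def cubic(x, y):
    # X^3 - 2Y^3: three pairwise non-proportional linear factors over Q-bar.
    return x**3 - 2 * y**3


def pell(x, y):
    # X^2 - 2Y^2 = (X - sqrt(2) Y)(X + sqrt(2) Y): only two linear factors.
    return x**2 - 2 * y**2


def pell_family(count):
    """First `count` solutions of x^2 - 2y^2 = 1 from powers of 3 + 2*sqrt(2)."""
    x, y = 1, 0
    family = []
    for _ in range(count):
        family.append((x, y))
        # (x + y*sqrt(2)) * (3 + 2*sqrt(2)) = (3x + 4y) + (2x + 3y)*sqrt(2)
        x, y = 3 * x + 4 * y, 2 * x + 3 * y
    return family


if __name__ == "__main__":
    # A box search only lists solutions inside the box; it is not a proof.
    print(solutions_in_box(cubic, 1, 100))
    print(pell_family(6))
```

The box search for the cubic form stops finding new solutions almost immediately, whereas `pell_family` keeps producing larger and larger solutions of the quadratic equation, in line with the failure of (iv) for a form with only two linear factors.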
9.2 Reduction of unit equations to decomposable form equations

It can be shown that unit equations and decomposable form equations are equivalent in the sense that every unit equation leads to a decomposable form equation (over a suitable ring which is finitely generated over $\mathbb{Z}$), and every decomposable form equation can be reduced to finitely many unit equations (in an appropriate finite field extension). Consequently, general finiteness results for unit equations imply general finiteness results for decomposable form equations and vice versa. In the two unknowns case (i.e. for unit equations in two unknowns and for Thue equations) this equivalence was (implicitly) pointed out by Siegel (1926, 1929), while the general case was worked out by Evertse and Győry (1988c). More precisely, we show that Theorem 9.1.1 is equivalent to the following.

Theorem 9.2.1 Let $K$ be a field of characteristic $0$, $\Gamma$ a finitely generated multiplicative subgroup of $K^*$ and let $a_1, \ldots, a_m \in K^*$. Then the equation

$$a_1 x_1 + \cdots + a_m x_m = 1 \quad \text{in } x_1, \ldots, x_m \in \Gamma \tag{9.2.1}$$

has at most finitely many non-degenerate solutions, i.e., with $\sum_{i \in I} a_i x_i \neq 0$ for each non-empty subset $I$ of $\{1, \ldots, m\}$.

This theorem is proved in Chapter 6 in a more precise quantitative form, see Theorem 6.1.3. For further historical comments, see Section 6.7. All known proofs of Theorem 9.2.1 are ineffective.

In the next section, we prove Theorem 9.1.1, taking Theorem 9.2.1 as a starting point. In the present section we show that Theorem 9.1.1 implies Theorem 9.2.1.

Proof of the implication Theorem 9.1.1 ⇒ Theorem 9.2.1. Let $K$ be a field of characteristic $0$, $\Gamma$ a finitely generated subgroup of $K^*$, and $a_1, \ldots, a_m \in K^*$ with $m \geq 2$. Define the decomposable form

$$F := X_1 \cdots X_m (a_1 X_1 + \cdots + a_m X_m).$$

Let $\mathcal{L}_0 = \{a_1 X_1, \ldots, a_m X_m, a_1 X_1 + \cdots + a_m X_m\}$, and $\mathcal{L}$ be the set of all linear forms of the form $a_{i_1} X_{i_1} + \cdots + a_{i_s} X_{i_s}$, where $\{i_1, \ldots, i_s\}$ is a non-empty subset of $\{1, \ldots, m\}$. Then we have $\mathcal{L} \supset \mathcal{L}_0$. It is easy to check that these $\mathcal{L}_0$ and $\mathcal{L}$ satisfy statement (i) in Theorem 9.1.1 with $G = K$ (and even (i′) in Corollary 9.1.2).

Let $A$ be the subring of $K$ generated by $a_1, \ldots, a_m$ and the elements of $\Gamma$. Then $A$ is finitely generated over $\mathbb{Z}$, and $\Gamma$ is a subgroup of $A^*$. It is now clear that every non-degenerate solution $\mathbf{x} = (x_1, \ldots, x_m)$ of (9.2.1) satisfies

$$F(\mathbf{x}) \in A^*, \quad \mathbf{x} \in A^m, \quad l(\mathbf{x}) \neq 0 \text{ for } l \in \mathcal{L}. \tag{9.2.2}$$

Theorem 9.1.1 (or in this case Corollary 9.1.2) implies that there are at most finitely many pairwise linearly independent $\mathbf{x}$ with (9.2.2). This implies that (9.2.1) has only finitely many pairwise linearly independent non-degenerate solutions. But obviously, any two linearly dependent solutions of (9.2.1) have to be equal. This implies Theorem 9.2.1.

9.3 Reduction of decomposable form equations to unit equations

In this section we prove the equivalence of assertions (i), (ii) and (iii) of Theorem 9.1.1, taking Theorem 9.2.1 as starting point. This section is divided into three subsections: the first contains the proof of the equivalence of (ii) and (iii), which is elementary and independent of unit equations, the second contains the proof of the implication (i) ⇒ (iii) and the last the proof of the implication (iii) ⇒ (i). In both the second and third subsections we use Theorem 9.2.1.

We keep the notation and definitions from Section 9.1.
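To make the unit equation (9.2.1) of Theorem 9.2.1 concrete, the following Python sketch enumerates solutions of $x_1 + x_2 = 1$ in the group $\Gamma \subset \mathbb{Q}^*$ generated by $-1$, $2$ and $3$. It is a minimal illustration under stated assumptions: the function names and the exponent bound are choices made here, the search is exhaustive only inside the chosen exponent box, and nothing in it reflects the (ineffective) proof of the theorem. For $m = 2$ and $a_1 = a_2 = 1$ every solution is automatically non-degenerate.

```python
from fractions import Fraction
from itertools import product


def group_elements(primes, bound):
    """Elements +-p1^e1 * ... * pk^ek with all |ei| <= bound (a finite box in Gamma)."""
    elements = set()
    for exponents in product(range(-bound, bound + 1), repeat=len(primes)):
        value = Fraction(1)
        for p, e in zip(primes, exponents):
            value *= Fraction(p) ** e
        elements.add(value)
        elements.add(-value)
    return elements


def unit_equation_solutions(primes, bound):
    """Pairs (x1, x2) in the box with x1 + x2 = 1, sorted for reproducibility."""
    box = group_elements(primes, bound)
    return sorted((x, 1 - x) for x in box if (1 - x) in box)


if __name__ == "__main__":
    for x1, x2 in unit_equation_solutions((2, 3), 4):
        print(x1, "+", x2, "= 1")
```

With generators $2$ and $3$ and exponents bounded by $4$, the search returns 21 pairs, for example $9 + (-8) = 1$ and $\tfrac{1}{4} + \tfrac{3}{4} = 1$; enlarging the box does not change the list, as one expects from the finiteness statement.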
We need a few facts on Noetherian rings and modules; for a proof of these, we refer to Lang (1984), chapter 6. A commutative ring is called Noetherian if all its ideals are finitely generated. Let $A$ be a Noetherian commutative ring. Then for any ideal $I$ of $A$ the residue class ring $A/I$ is Noetherian. Further, for any integer $r \geq 1$ the polynomial ring in $r$ variables $A[X_1, \ldots, X_r]$ is Noetherian. Any finitely generated $A$-module is Noetherian, i.e., all its $A$-submodules are finitely generated. Any integral domain finitely generated over $\mathbb{Z}$ is isomorphic to $\mathbb{Z}[X_1, \ldots, X_r]/I$ for some ideal $I$ of $\mathbb{Z}[X_1, \ldots, X_r]$, hence it is Noetherian.

9.3.1 Proof of the equivalence (ii) ⇐⇒ (iii) in Theorem 9.1.1

We need the following result of Roquette.

Proposition 9.3.1 Let $A$ be an integral domain that is finitely generated over $\mathbb{Z}$. Then $A^*$ is a finitely generated group.

Proof. See Roquette (1957).

(ii) ⇐⇒ (iii). First assume that (ii) holds. Let $A$, $\mathcal{M}$, $\delta$ be as in the statement of (iii). Proposition 9.3.1 implies that there is a finite set $S \subset A^*$ such that every $\varepsilon \in A^*$ can be expressed as $\eta \zeta^n$ with $\eta \in S$, $\zeta \in A^*$, where $n := \deg F$. Now if $\mathbf{x} \in \mathcal{M}$ is a solution of (9.1.2), then $F(\mathbf{x}) = \delta \varepsilon$ with $\varepsilon \in A^*$. Hence there are $\eta \in S$, $\zeta \in A^*$ such that $F(\zeta^{-1} \mathbf{x}) = \delta \eta$. By (ii), each equation

$$F(\mathbf{y}) = \delta \eta \quad (\eta \in S) \quad \text{in } \mathbf{y} \in \mathcal{M} \text{ with } l(\mathbf{y}) \neq 0 \text{ for } l \in \mathcal{L}$$

has only finitely many solutions. This implies (iii).

Conversely, assume (iii), and take again $A$, $\mathcal{M}$, $\delta$ as in (ii). Then the solutions of (9.1.1) lie in finitely many $A^*$-cosets. If $\mathbf{x}_1$, $\mathbf{x}_2$ are two solutions in the same $A^*$-coset then $\mathbf{x}_2 = \varepsilon \mathbf{x}_1$ for some $\varepsilon \in A^*$, and $\varepsilon^n = F(\mathbf{x}_2)/F(\mathbf{x}_1) = 1$. So each $A^*$-coset contains at most $n$ solutions of (9.1.1). This proves (ii).

9.3.2 Proof of the implication (i) ⇒ (iii) in Theorem 9.1.1

We need the following consequence of Theorem 9.2.1.
Proposition 9.3.2 The solutions of (9.2.1) lie in a union of finitely many proper linear subspaces of $K^m$.

Proof. The degenerate solutions, i.e., with $\sum_{i \in I} a_i x_i = 0$ for some non-empty subset $I$ of $\{1, \ldots, m\}$, lie in finitely many subspaces and so do the non-degenerate solutions.

Remark It was pointed out in Evertse and Győry (1988b), and it is also implicit in the proof of Theorem 6.1.3, that Theorem 9.2.1 and Proposition 9.3.2 are equivalent.

(i) ⇒ (iii). We assume assertion (i) of Theorem 9.1.1 and deduce (iii). Let $A$, $\mathcal{M}$, $F$, $\mathcal{L}_0$, $\mathcal{L}$, $\delta$ be as in (i), (iii) and let $V := K\mathcal{M}$. We proceed by induction on $\dim_K V$. If $\dim_K V = 1$, assertion (iii) is trivially true. Assume that $\dim_K V =: d \geq 2$, and that the implication (i) ⇒ (iii) is true for finitely generated $A$-modules in $K^m$ that generate a $K$-vector space of dimension smaller than $d$. Without loss of generality we assume that none of the linear forms $l \in \mathcal{L}$ vanishes identically on $V$.

We say that a set of linear forms $\{l_1, \ldots, l_t\} \subset G[X_1, \ldots, X_m]$ is $V$-linearly dependent if there are $c_1, \ldots, c_t \in G$, not all $0$, such that $c_1 l_1 + \cdots + c_t l_t$ vanishes identically on $V$; otherwise, $\{l_1, \ldots, l_t\}$ is called $V$-linearly independent. Further, $\{l_1, \ldots, l_t\}$ is said to be minimally $V$-linearly dependent if the set itself is linearly dependent on $V$, but each of its non-empty proper subsets is linearly independent on $V$.

We first show that there is a subset of $\mathcal{L}_0$ of cardinality $\geq 3$ that is minimally $V$-linearly dependent. Assume the contrary. We divide $\mathcal{L}_0$ into classes such that two linear forms belong to the same class if and only if they are $V$-linearly dependent. Let $\{l_1, \ldots, l_s\}$ be a full set of representatives for these classes. Then by our assumption, $\{l_1, \ldots, l_s\}$ is $V$-linearly independent. Let $\mathcal{L}_1$ be the class of $l_1$. As is easily seen, $\mathcal{L}_1$ is $\mathrm{Gal}(G/K)$-proper. We show that all linear forms in $W := [\mathcal{L}_1] \cap [\mathcal{L}_0 \setminus \mathcal{L}_1]$ vanish identically on $V$. Let $l \in W$. Since all linear forms in $\mathcal{L}_1$ are $V$-linearly dependent on $l_1$ and since each linear form in $\mathcal{L}_0 \setminus \mathcal{L}_1$ is $V$-linearly dependent on one of $l_2, \ldots, l_s$, there are $c_1, \ldots, c_s \in G$ such that

$$l(\mathbf{x}) = c_1 l_1(\mathbf{x}) = - \sum_{i=2}^{s} c_i l_i(\mathbf{x}) \quad \text{for } \mathbf{x} \in V.$$

But then, $\sum_{i=1}^{s} c_i l_i$ vanishes identically on $V$, implying that $c_1 = \cdots = c_s = 0$. So $l$ vanishes identically on $V$. In the same way it follows that for each $\sigma \in \mathrm{Gal}(G/K)$, the linear forms in $\sigma(W) := [\sigma(\mathcal{L}_1)] \cap [\mathcal{L}_0 \setminus \sigma(\mathcal{L}_1)]$ vanish identically on $V$, hence so do the linear forms in $\sum_{\sigma \in \mathrm{Gal}(G/K)} \sigma(W)$. But then the latter vector space cannot contain elements of $\mathcal{L}$ since we assumed that these do not vanish identically on $V$. This violates assumption (i).

So $\mathcal{L}_0$ has a minimally $V$-linearly dependent subset, say $\{l_0, \ldots, l_t\}$ with $t \geq 2$. This implies that there are $a_1, \ldots, a_t \in G^*$ such that

$$l_0(\mathbf{x}) = a_1 l_1(\mathbf{x}) + \cdots + a_t l_t(\mathbf{x}) \quad \text{for } \mathbf{x} \in V.$$

Let the set $S$ consist of the coefficients of $l_1, \ldots, l_n$ (i.e., the linear factors of $F$ in $\mathcal{L}_0$), a finite set of generators for $\mathcal{M}$, and $c$, $c^{-1}$, $\delta$, $\delta^{-1}$, where $c$, $\delta$ are as in (9.1.3). Let $B := A[S]$. Then for any solution $\mathbf{x} \in \mathcal{M}$ of (9.1.2) we have $l_1(\mathbf{x}) \in B^*, \ldots, l_n(\mathbf{x}) \in B^*$. This shows that if $\mathbf{x} \in \mathcal{M}$ is a solution to (9.1.2), then the tuple

$$\Big( \frac{l_1(\mathbf{x})}{l_0(\mathbf{x})}, \ldots, \frac{l_t(\mathbf{x})}{l_0(\mathbf{x})} \Big)$$

is a solution to

$$a_1 y_1 + \cdots + a_t y_t = 1 \quad \text{in } y_1, \ldots, y_t \in B^*. \tag{9.3.1}$$

The domain $B$ is finitely generated over $\mathbb{Z}$, so by Proposition 9.3.1, the group $B^*$ is finitely generated. By Proposition 9.3.2, there are at most finitely many non-zero tuples $(b_1, \ldots, b_t) \in G^t$ such that every solution $\mathbf{y} = (y_1, \ldots, y_t)$ of (9.3.1) satisfies one of the relations $b_1 y_1 + \cdots + b_t y_t = 0$. As a consequence, every solution $\mathbf{x} \in \mathcal{M}$ of (9.1.2) satisfies one of the relations $b_1 l_1(\mathbf{x}) + \cdots + b_t l_t(\mathbf{x}) = 0$. Since $\{l_0, \ldots, l_t\}$ is minimally $V$-linearly dependent, each of these relations defines a proper linear subspace of $V$. Hence the solutions $\mathbf{x} \in \mathcal{M}$ of (9.1.2) lie in a finite union of proper linear subspaces of $V$. By applying the induction hypothesis to the intersection of $\mathcal{M}$ with any of these subspaces (which is a finitely generated $A$-module since $A$ is a Noetherian ring and $\mathcal{M}$ is a finitely generated $A$-module), we infer that (9.1.2) has only finitely many $A^*$-cosets of solutions.

9.3.3 Proof of the implication (iii) ⇒ (i) in Theorem 9.1.1

We need another consequence of Theorem 9.2.1. It is in fact a special case of the Skolem–Mahler–Lech Theorem on the zero multiplicity of linear recurrence sequences, see Theorem 10.11.1 below.

Proposition 9.3.3 Let $a_1, \ldots, a_m, b_1, \ldots, b_m \in K^*$ and suppose that none of the quotients $b_i / b_j$ ($1 \leq i < j \leq m$) is a root of unity. Then there are only finitely many $z \in \mathbb{Z}$ with

$$a_1 b_1^z + \cdots + a_m b_m^z = 0. \tag{9.3.2}$$

Proof. We proceed by induction on $m$. For $m = 2$ the assertion is clear. Let $m \geq 3$. Apply Proposition 9.3.2 with the group $\Gamma$ generated by $b_1, \ldots, b_m$. By that Proposition, there are a finite number of tuples $(c_1, \ldots, c_{m-1}) \neq \mathbf{0}$ such
that each solution of (9.3.2) satisfies one of the relations

$$\sum_{i=1}^{m-1} c_i (b_i / b_m)^z = 0.$$

By the induction hypothesis, each of these relations is satisfied by at most finitely many integers $z$.

(iii) ⇒ (i). We assume that assertion (i) of Theorem 9.1.1 does not hold and deduce that (iii) does not hold, that is, there are $A$, $\mathcal{M}$, $\delta$ as in (iii), such that equation (9.1.2) has infinitely many $A^*$-cosets of solutions.

First assume that $\mathrm{rank}_G\, \mathcal{L}_0 < m$. Then the vector space of $\mathbf{x} \in G^m$ with $l(\mathbf{x}) = 0$ for $l \in \mathcal{L}_0$ is non-zero. By Lemma 1.1.1 and since $\mathcal{L}_0$ is $\mathrm{Gal}(G/K)$-stable this vector space has a basis from $K^m$. So we can choose $\mathbf{x}_1 \in K^m \setminus \{\mathbf{0}\}$ with $l(\mathbf{x}_1) = 0$ for $l \in \mathcal{L}_0$. Choose $\mathbf{x}_0 \in K^m$ with $l(\mathbf{x}_0) \neq 0$ for $l \in \mathcal{L}$. Let $A$ be any subring of $K$ which is finitely generated over $\mathbb{Z}$, $\mathcal{M}$ the $A$-module generated by $\mathbf{x}_0$, $\mathbf{x}_1$ and $\delta = F(\mathbf{x}_0)$. Consider the vectors $\mathbf{x}_0 + k \mathbf{x}_1$ ($k \in \mathbb{Z}$). These vectors lie in different $A^*$-cosets since $\mathbf{x}_0$, $\mathbf{x}_1$ are $K$-linearly independent. For all but finitely many $k$ we have $l(\mathbf{x}_0 + k \mathbf{x}_1) \neq 0$ for $l \in \mathcal{L}$, and by (9.1.3),

$$F(\mathbf{x}_0 + k \mathbf{x}_1) = c \prod_{i=1}^{n} l_i(\mathbf{x}_0 + k \mathbf{x}_1)^{e_i} = F(\mathbf{x}_0) = \delta.$$

Hence (9.1.2) has infinitely many $A^*$-cosets of solutions.

Now assume that $\mathrm{rank}_G\, \mathcal{L}_0 = m$. Since by assumption, assertion (i) of Theorem 9.1.1 does not hold, there is a $\mathrm{Gal}(G/K)$-proper subset $\mathcal{L}_1$ of $\mathcal{L}_0$ with $\emptyset \subsetneq \mathcal{L}_1 \subsetneq \mathcal{L}_0$ such that

$$\mathcal{L} \cap W = \emptyset, \quad \text{with } W := \sum_{\sigma \in \mathrm{Gal}(G/K)} [\sigma(\mathcal{L}_1)] \cap [\mathcal{L}_0 \setminus \sigma(\mathcal{L}_1)].$$

Then $\dim_G W < m$. Hence the $G$-vector space

$$W^* := \{ \mathbf{x} \in G^m : l(\mathbf{x}) = 0 \text{ for all } l \in W \}$$

has dimension $m - \dim_G W > 0$. Since also $W$ is $\mathrm{Gal}(G/K)$-stable, we infer from Lemma 1.1.1 that $W^*$ is generated by vectors from $K^m$. Moreover, none of the linear forms in $\mathcal{L}$ vanishes identically on $W^*$ and so neither do they on $W^* \cap K^m$. Thus, there is $\mathbf{x}_0 \in K^m$ with

$$l(\mathbf{x}_0) = 0 \text{ for } l \in W, \quad l(\mathbf{x}_0) \neq 0 \text{ for } l \in \mathcal{L}. \tag{9.3.3}$$

We make a partition $\{\mathcal{L}_1, \ldots, \mathcal{L}_t\}$ of $\mathcal{L}_0$ as follows. Take the distinct sets among $\sigma(\mathcal{L}_1)$ ($\sigma \in \mathrm{Gal}(G/K)$). Since $\mathcal{L}_1$ is $\mathrm{Gal}(G/K)$-proper, these sets are pairwise
Since $\mathcal{L}_1$ is $\mathrm{Gal}(G/K)$-proper, these sets are pairwise disjoint. Let
\[ \mathcal{L}_0^*:=\bigcup_{\sigma\in\mathrm{Gal}(G/K)}\sigma(\mathcal{L}_1). \]
If $\mathcal{L}_0^*=\mathcal{L}_0$, let $\mathcal{L}_1,\ldots,\mathcal{L}_t$ be the distinct sets among $\sigma(\mathcal{L}_1)$ $(\sigma\in\mathrm{Gal}(G/K))$. If $\mathcal{L}_0^*\subsetneq\mathcal{L}_0$, let $\mathcal{L}_1,\ldots,\mathcal{L}_{t-1}$ be the distinct sets among $\sigma(\mathcal{L}_1)$ $(\sigma\in\mathrm{Gal}(G/K))$, and take $\mathcal{L}_t:=\mathcal{L}_0\setminus\mathcal{L}_0^*$. Then $\sigma(\mathcal{L}_t)=\mathcal{L}_t$ for all $\sigma\in\mathrm{Gal}(G/K)$. Let
\[ U:=\Big\{u=(u_l : l\in\mathcal{L}_0)\in G^n : \sum_{l\in\mathcal{L}_0}u_l\,l=0\Big\}. \]
We show that
\[ \sum_{l\in\mathcal{L}_i}u_l\,l(x)=0 \quad\text{for } x\in W^*,\ u\in U,\ i=1,\ldots,t. \tag{9.3.4} \]
Let $u\in U$, $x\in W^*$. For $\sigma\in\mathrm{Gal}(G/K)$ we have
\[ \sum_{l\in\mathcal{L}_0\setminus\sigma(\mathcal{L}_1)}u_l\,l=-\sum_{l\in\sigma(\mathcal{L}_1)}u_l\,l\in[\sigma(\mathcal{L}_1)]\cap[\mathcal{L}_0\setminus\sigma(\mathcal{L}_1)]\subseteq W. \]
If $\mathcal{L}_0^*=\mathcal{L}_0$ then (9.3.4) follows at once. If $\mathcal{L}_0^*\subsetneq\mathcal{L}_0$, then (9.3.4) holds for $i=1,\ldots,t-1$. But since $\sum_{l\in\mathcal{L}_0}u_l\,l(x)=0$, it must hold for $i=t$ as well.

We now construct numbers $\theta_l\in G^*$ $(l\in\mathcal{L}_0)$ with the following properties:
\[ \theta_l=\theta_i \text{ for } l\in\mathcal{L}_i,\ i=1,\ldots,t, \text{ with } \theta_i \text{ independent of } l; \tag{9.3.5} \]
\[ \theta_{\sigma(l)}=\sigma(\theta_l) \text{ for } l\in\mathcal{L}_0,\ \sigma\in\mathrm{Gal}(G/K); \tag{9.3.6} \]
\[ \theta_i/\theta_j \text{ is not a root of unity for } 1\le i<j\le t. \tag{9.3.7} \]
The construction is as follows. Define the field $M$ by
\[ \mathrm{Gal}(G/M):=\{\sigma\in\mathrm{Gal}(G/K) : \sigma(\mathcal{L}_1)=\mathcal{L}_1\}. \]
We first show that there is $\theta_1$ such that $M=K(\theta_1)$ and no quotient of any two distinct conjugates of $\theta_1$ over $K$ is a root of unity. We start by taking $\theta$ with $M=K(\theta)$. Let $\theta^{(1)},\ldots,\theta^{(d)}$ be the conjugates of $\theta$ over $K$ in $G$. Since the field $G$ is finitely generated, its group of roots of unity is finite, say of order $D$. Now we may take $\theta_1:=\theta+a$, where $a\in\mathbb{Z}$ is such that the numbers $(\theta^{(i)}+a)^D$ $(i=1,\ldots,d)$ are distinct.

Let $\theta_1\in M$ be as above and put $\theta_i:=\sigma_i(\theta_1)$, where $\sigma_i\in\mathrm{Gal}(G/K)$ is such that $\sigma_i(\mathcal{L}_1)=\mathcal{L}_i$. This does not depend on the choice of $\sigma_i$.
In the case $\mathcal{L}_0^*\subsetneq\mathcal{L}_0$, choose $\theta_t\in K^*$ such that $\theta_t/\theta_i$ is not a root of unity for $i=1,\ldots,t-1$. Finally, put $\theta_l:=\theta_i$ for $l\in\mathcal{L}_i$, $i=1,\ldots,t$. If $\sigma\in\mathrm{Gal}(G/K)$ is such that $\sigma(\mathcal{L}_i)=\mathcal{L}_j$, with $1\le i<j\le t$ if $\mathcal{L}_0^*=\mathcal{L}_0$, and with $1\le i<j\le t-1$ if $\mathcal{L}_0^*\subsetneq\mathcal{L}_0$, then $\sigma_j^{-1}\sigma\sigma_i\in\mathrm{Gal}(G/M)$, hence $\theta_j=\sigma(\theta_i)$. Further, if $\mathcal{L}_0^*\subsetneq\mathcal{L}_0$, then $\sigma(\mathcal{L}_t)=\mathcal{L}_t$ and $\sigma(\theta_t)=\theta_t$ for $\sigma\in\mathrm{Gal}(G/K)$. Thus, (9.3.5)–(9.3.7) follow.

We now construct $A$, $\mathcal{M}$, $\delta$ such that (9.1.2) has infinitely many $A^*$-cosets of solutions. Pick $x_0\in K^m$ with (9.3.3). We claim that for every $k\in\mathbb{Z}_{\ge 0}$ there is a unique $x_k\in K^m$ such that
\[ l(x_k)=l(x_0)\theta_l^k \quad\text{for } l\in\mathcal{L}_0, \tag{9.3.8} \]
and that, moreover, these vectors $x_k$ are pairwise non-proportional. Indeed, by (9.3.3), (9.3.4) and (9.3.5) we have for any $u\in U$,
\[ \sum_{l\in\mathcal{L}_0}u_l\,l(x_0)\theta_l^k=\sum_{i=1}^{t}\Big(\sum_{l\in\mathcal{L}_i}u_l\,l(x_0)\Big)\theta_i^k=0. \]
Hence there is $x_k\in G^m$ with (9.3.8). Further, since $\operatorname{rank}_G\mathcal{L}_0=m$, it is uniquely determined. By (9.3.6) we have
\[ \sigma(l)(\sigma(x_k))=\sigma(l)(x_0)\,\theta_{\sigma(l)}^k \quad\text{for } l\in\mathcal{L}_0,\ \sigma\in\mathrm{Gal}(G/K), \]
and then $\sigma(x_k)$ satisfies (9.3.8) since $\mathcal{L}_0$ is $\mathrm{Gal}(G/K)$-stable. Now since (9.3.8) has only one solution, we must have $\sigma(x_k)=x_k$ for $\sigma\in\mathrm{Gal}(G/K)$, hence $x_k\in K^m$. Finally, by (9.3.7), the tuples $(\theta_l^k : l\in\mathcal{L}_0)$ $(k\in\mathbb{Z}_{\ge 0})$ are pairwise non-proportional. Hence the vectors $x_k$ $(k\in\mathbb{Z}_{\ge 0})$ are pairwise non-proportional.

Notice that by (9.1.3), (9.3.8) and (9.3.6),
\[ F(x_k)=c\prod_{l\in\mathcal{L}_0}l(x_k)^{e_l}=F(x_0)u^k,\quad\text{where } u:=\prod_{l\in\mathcal{L}_0}\theta_l^{e_l}\in K^*. \]
We show that $l(x_k)\ne 0$ for $l\in\mathcal{L}$ and for all but finitely many $k$. Let $l^*\in\mathcal{L}$. Then since $\operatorname{rank}_G\mathcal{L}_0=m$, we have
\[ l^*=\sum_{l\in\mathcal{L}_0}\eta_l\,l \quad\text{with } \eta_l\in G \text{ for } l\in\mathcal{L}_0. \]
So by (9.3.5),
\[ l^*(x_k)=\sum_{i=1}^{t}\Big(\sum_{l\in\mathcal{L}_i}\eta_l\,l(x_0)\Big)\theta_i^k. \]
By $l^*(x_0)\ne 0$ and Proposition 9.3.3, we have $l^*(x_k)=0$ for at most finitely many $k$. Putting all this together, we infer for all but finitely many $k\in\mathbb{Z}_{\ge 0}$,
\[ F(x_k)=F(x_0)u^k,\qquad l(x_k)\ne 0 \text{ for } l\in\mathcal{L}. \tag{9.3.9} \]

We finish by constructing $A$, $\mathcal{M}$, $\delta$. Let $\delta:=F(x_0)$. Then $\delta\ne 0$ since $\mathcal{L}_0\subseteq\mathcal{L}$. Further, let $f(X)=X^s+c_{s-1}X^{s-1}+\cdots+c_0\in K[X]$ be a monic polynomial such that $\theta_l$ $(l\in\mathcal{L}_0)$ are all zeros of $f$. Let
\[ A:=\mathbb{Z}[u,u^{-1},c_0,\ldots,c_{s-1}]. \]
Then $u\in A^*$. Clearly, for $k\in\mathbb{Z}$, $k\ge s$, $l\in\mathcal{L}_0$, we have
\[ \theta_l^k=-c_{s-1}\theta_l^{k-1}-\cdots-c_0\theta_l^{k-s}, \]
and so, by the fact that $x_k\in K^m$ is the only solution of (9.3.8),
\[ x_k=-c_{s-1}x_{k-1}-\cdots-c_0x_{k-s} \quad\text{for } k\in\mathbb{Z},\ k\ge s. \]
Now let $\mathcal{M}$ be the $A$-module generated by $x_0,\ldots,x_{s-1}$. Then $x_k\in\mathcal{M}$ for $k\in\mathbb{Z}_{\ge 0}$. Invoking (9.3.9), we infer that for all but finitely many $k$ the vector $x_k$ is a solution to (9.1.2). Moreover, the vectors $x_k$ are pairwise non-proportional. Hence (9.1.2) has infinitely many distinct $A^*$-cosets of solutions, i.e., assertion (iii) of Theorem 9.1.1 does not hold.

9.4 Finiteness of the number of families of solutions

In this section we describe the structure of the set of solutions of the decomposable form equations (9.1) and (9.2).

Let $K$ be a finitely generated extension field of $\mathbb{Q}$, $L$ a finite extension of $K$ of degree $n\ge 2$ and $G$ a finite, normal extension of $K$ containing $L$. There are $n$ distinct $K$-isomorphisms of $L$ in $G$, $\sigma_1,\ldots,\sigma_n$ say. Let $\alpha_1,\ldots,\alpha_m$ $(m\ge 2)$ be elements of $L$ and consider the linear form $l=\alpha_1X_1+\cdots+\alpha_mX_m$. Define the conjugates of $l$,
\[ l^{(i)}=\sigma_i(l)=\sum_{j=1}^{m}\sigma_i(\alpha_j)X_j \quad (i=1,\ldots,n). \]
Then
\[ N_{L/K}(l):=\prod_{i=1}^{n}l^{(i)}=\prod_{i=1}^{n}\big(\sigma_i(\alpha_1)X_1+\cdots+\sigma_i(\alpha_m)X_m\big) \]
is a decomposable form of degree $n$ in $K[X_1,\ldots,X_m]$, called a norm form, and the equation
\[ N_{L/K}(l(x))=\delta \quad\text{in } x=(x_1,\ldots,x_m)\in A^m \tag{9.4.1} \]
is called a norm form equation over $K$, where $\delta\in K^*$ and $A$ is a subring of $K$ which is finitely generated over $\mathbb{Z}$. In what follows, it will be more convenient to consider equation (9.4.1) in the form
\[ N_{L/K}(\mu)=\delta \quad\text{in } \mu\in\mathcal{M}, \tag{9.4.2} \]
where $\mathcal{M}:=\{\mu=l(x) : x\in A^m\}$. Notice that $\mathcal{M}$ is a finitely generated $A$-submodule of $L$. If we assume that $\alpha_1,\ldots,\alpha_m$ are linearly independent over $K$, there is a one-to-one correspondence between the solutions of (9.4.1) and (9.4.2).

Using the Subspace Theorem and its p-adic generalization, Schmidt (1971, 1972) and Schlickewei (1977c) established very important finiteness theorems on these equations over $\mathbb{Q}$. The results of Schmidt and Schlickewei were later extended in Laurent (1984) to the case where the ground field $K$ is a finitely generated extension of $\mathbb{Q}$. These will be presented later as special cases of more general results concerning decomposable form equations stated below.

Equation (9.4.1) is a special decomposable form equation. Let now $F\in K[X_1,\ldots,X_m]$ be an arbitrary decomposable form of degree $n\ge 2$ and let $G$ be a finite, normal extension of $K$ over which $F$ factorizes into linear factors. Consider the decomposable form equation
\[ F(x)=\delta \quad\text{in } x=(x_1,\ldots,x_m)\in A^m, \tag{9.1} \]
where $A$ is a subring of $K$ which is finitely generated over $\mathbb{Z}$. We can reformulate this in a shape similar to (9.4.2) as follows. First observe that $F$ can be expressed as
\[ F=c\prod_{j=1}^{q}N_{L_j/K}(l_j), \tag{9.4.3} \]
where $L_1,\ldots,L_q$ are finite extensions of $K$, $l_j$ is a linear form from $L_j[X_1,\ldots,X_m]$ for $j=1,\ldots,q$, and $c\in K^*$.
Indeed, we may write $F$ as
\[ F=c\,l_1\cdots l_n, \tag{9.4.4} \]
where $c\in K^*$ and $l_j=X_{n_j}+\alpha_{n_j+1,j}X_{n_j+1}+\cdots+\alpha_{mj}X_m$ with $\alpha_{ij}\in G$ for $j=1,\ldots,n$, $i\in\{n_j+1,\ldots,m\}$. For each $\sigma\in\mathrm{Gal}(G/K)$ we have $\sigma(F)=F\in K[X_1,\ldots,X_m]$. Since $G[X_1,\ldots,X_m]$ is a unique factorization domain, (9.4.4) implies that there is a permutation $(\sigma(1),\ldots,\sigma(n))$ of $(1,\ldots,n)$ such that $\sigma(l_j)=l_{\sigma(j)}$ for $j=1,\ldots,n$. The index set $\{1,\ldots,n\}$ can be partitioned into subsets $C_1,\ldots,C_q$ such that $i$, $j$ belong to the same subset if and only if $\sigma(i)=j$ for some $\sigma\in\mathrm{Gal}(G/K)$. Assume without loss of generality that $j\in C_j$ for $j=1,\ldots,q$, and let $L_j=K(\alpha_{n_j+1,j},\ldots,\alpha_{mj})$ for $j=1,\ldots,q$. Then
\[ \prod_{i\in C_j}l_i=N_{L_j/K}(l_j) \quad\text{for } j=1,\ldots,q, \]
and (9.4.3) follows.

Define the $K$-algebra
\[ \Omega:=L_1\times\cdots\times L_q, \]
which is endowed with coordinatewise addition and multiplication. Recall that any $K$-algebra isomorphic to a direct product of finite field extensions of $K$ is called a finite étale $K$-algebra. Let $\mathbf{1}=(1,\ldots,1)$ denote the unit element of $\Omega$. We agree that $K$-subalgebras of $\Omega$ contain by default $\mathbf{1}$. It can be shown that any $K$-subalgebra of $\Omega$ is itself a finite étale $K$-algebra. We view $K$ as a subalgebra of $\Omega$ by identifying $a\in K$ with $a\cdot\mathbf{1}$.

We define the norm $N_{\Omega/K}(\boldsymbol{\alpha})$ of $\boldsymbol{\alpha}=(\alpha_1,\ldots,\alpha_q)\in\Omega$ by
\[ N_{\Omega/K}(\boldsymbol{\alpha})=N_{L_1/K}(\alpha_1)\cdots N_{L_q/K}(\alpha_q). \tag{9.4.5} \]
It can be shown that this is the determinant of the $K$-linear map $\mathbf{x}\mapsto\boldsymbol{\alpha}\mathbf{x}$ from $\Omega$ to itself. The $A$-module
\[ \mathcal{M}:=\{\boldsymbol{\mu}=(l_1(x),\ldots,l_q(x)) : x\in A^m\} \]
is contained in $\Omega$. Replacing $\delta/c$ by $\delta$ in (9.1), the identities (9.4.3) and (9.4.5) imply that every solution $x$ of the equation (9.1) yields a solution of the equation
\[ N_{\Omega/K}(\boldsymbol{\mu})=\delta \quad\text{in } \boldsymbol{\mu}\in\mathcal{M}. \tag{9.4.6} \]
Further, if $F$ is of maximal rank, that is, if $F$ has $m$ linearly independent linear factors in its factorization over $G$, then there is a one-to-one correspondence between the solutions of (9.1) and (9.4.6). For $q=1$, (9.4.6) reduces to a norm form equation.

In what follows we consider (9.4.6) where we allow $\mathcal{M}$ to be any finitely generated non-zero $A$-module in $\Omega$. Denote by $K\mathcal{M}$ the vector space generated by $\mathcal{M}$ in $\Omega$. For each $K$-subalgebra $\Upsilon$ of $\Omega$, denote by $A_\Upsilon$ the integral closure of $A$ in $\Upsilon$, and by $E_\Upsilon$ the multiplicative subgroup of $A_\Upsilon^*$ consisting of all elements $\boldsymbol{\varepsilon}\in A_\Upsilon^*$ with $N_{\Omega/K}(\boldsymbol{\varepsilon})=1$. The group $E_\Upsilon$ is finitely generated. For every solution $\boldsymbol{\mu}$ of (9.4.6) and every $K$-subalgebra $\Upsilon$ of $\Omega$ for which $\boldsymbol{\mu}\Upsilon\subseteq K\mathcal{M}$, all elements of $(\boldsymbol{\mu}E_\Upsilon)\cap\mathcal{M}$ are solutions of (9.4.6). Such a subset of solutions $(\boldsymbol{\mu}E_\Upsilon)\cap\mathcal{M}$ is called a wide $(\mathcal{M},\Upsilon)$-family of solutions of (9.4.6).

We state some results of Győry without proof.

Theorem 9.4.1 The set of solutions of (9.4.6) is a union of at most finitely many wide families of solutions of (9.4.6).

Proof. See Győry (1993a).

Consider now the equation
\[ N_{\Omega/K}(\boldsymbol{\mu})\in\delta A^* \quad\text{in } \boldsymbol{\mu}\in\mathcal{M}. \tag{9.4.7} \]
For any $K$-subalgebra $\Upsilon$ of $\Omega$, denote by $U_\Upsilon$ the subgroup of $A_\Upsilon^*$ consisting of all elements $\boldsymbol{\varepsilon}$ with $N_{\Omega/K}(\boldsymbol{\varepsilon})\in A^*$. The group $U_\Upsilon$ is finitely generated. Further, for every solution $\boldsymbol{\mu}$ of (9.4.7) with $\boldsymbol{\mu}\Upsilon\subseteq K\mathcal{M}$, all elements of $(\boldsymbol{\mu}U_\Upsilon)\cap\mathcal{M}$ are also solutions of (9.4.7). Such a set of solutions is called a wide $(\mathcal{M},\Upsilon)$-family of solutions of (9.4.7). Theorem 9.4.1 easily follows from the following.

Theorem 9.4.2 The set of solutions of (9.4.7) is a union of finitely many wide families of solutions of (9.4.7).

Proof. See Győry (1993a).
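The notion of a wide family can be made concrete in the simplest classical case. The following Python sketch is only an illustration under assumptions that are not part of the text: it takes $K=\mathbb{Q}$, $A=\mathbb{Z}$, the algebra equal to the field $L=\mathbb{Q}(\sqrt{2})$, the module $\mathbb{Z}[\sqrt{2}]$, and the subalgebra equal to $L$ itself. In that setting the norm-one units are $\pm(3+2\sqrt{2})^k$, and the solutions of the equation "norm equals 1" form a single wide family generated from the solution 1. The function names are hypothetical.

```python
# Elements a + b*sqrt(2) of Z[sqrt(2)] are stored as integer pairs (a, b).

def mul(x, y):
    """Multiply two elements of Z[sqrt(2)]."""
    a, b = x
    c, d = y
    return (a * c + 2 * b * d, a * d + b * c)

def norm(x):
    """Norm from Q(sqrt(2)) to Q of a + b*sqrt(2)."""
    a, b = x
    return a * a - 2 * b * b

def family(mu, eps, count):
    """First `count` elements mu * eps^k (k >= 0) of the coset mu * <eps>."""
    out, cur = [], mu
    for _ in range(count):
        out.append(cur)
        cur = mul(cur, eps)
    return out

EPS = (3, 2)  # 3 + 2*sqrt(2), a unit of norm 1 (illustrative choice)
solutions = family((1, 0), EPS, 5)
```

Every element produced this way has norm 1, so the whole coset consists of solutions; this is the mechanism by which a single solution together with a unit group of positive rank produces infinitely many solutions.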
The proof of Theorem 9.4.2 depends again on Proposition 9.3.2 concerning the unit equation (9.2.1).

Let $V$ be a non-zero $K$-linear subspace of $\Omega$. For a $K$-subalgebra $\Upsilon$ of $\Omega$ define
\[ V^\Upsilon:=\{\boldsymbol{\mu}\in V : \boldsymbol{\mu}\Upsilon\subseteq V\}. \]
We call $V$ non-degenerate if $V^\Upsilon=(0)$ for every $K$-subalgebra $\Upsilon$ of $\Omega$ different from $K$, and degenerate otherwise. Let $\mathcal{M}$ be a finitely generated $A$-module in $\Omega$ with $K\mathcal{M}=V$. If $V$ is non-degenerate, then by Theorem 9.4.1, all solutions of (9.4.6) are contained in a union of finitely many sets of the form $(\boldsymbol{\mu}E_K)\cap\mathcal{M}$. But $E_K$ is finite, hence this proves the implication (i) ⇒ (ii) of the following.

Theorem 9.4.3 Let $V$ be a fixed, non-zero $K$-linear subspace of $\Omega$. Then the following three statements are equivalent:
(i) $V$ is non-degenerate;
(ii) for every ring $A$ with quotient field $K$ which is finitely generated over $\mathbb{Z}$, every finitely generated $A$-module $\mathcal{M}$ with $K\mathcal{M}=V$ and every $\delta\in K^*$, equation (9.4.6) has only finitely many solutions;
(iii) for every $A$, $\mathcal{M}$, $\delta$ as in (ii), equation (9.4.7) has only finitely many $A^*$-cosets of solutions.

Proof. See Győry (1993a). This theorem is equivalent to Theorem 9.1.1 with $\mathcal{L}=\mathcal{L}_0$.

We now specialize the above results to the norm form equations (9.4.1) and (9.4.2). Then, in the classical case $K=\mathbb{Q}$, $A=\mathbb{Z}$, Schmidt (1971) proved a fundamental theorem, which states that the norm form equation (9.4.2) has finitely many solutions for all $\delta\in\mathbb{Q}^*$ if and only if the $\mathbb{Q}$-vector space $\mathbb{Q}\mathcal{M}$ has no subspace of the form $\mu L'$, where $\mu\in L^*$ and $L'$ is a subfield of $L$ different from $\mathbb{Q}$ and the imaginary quadratic number fields. The result of Schmidt was generalized by Schlickewei (1977c) for the case $K=\mathbb{Q}$ and $A$ a ring of $S$-integers.
In the case of norm form equations, Theorem 9.4.3 as well as Theorem 9.4.1 and Theorem 9.4.2 were proved in Laurent (1984).

In the number field case, when in (9.4.1) and (9.4.2) $K$ is an algebraic number field, Schmidt (1972) for $K=\mathbb{Q}$, $A=\mathbb{Z}$, Schlickewei (1977c) for $K=\mathbb{Q}$ and Laurent (1984) for an arbitrary number field $K$ gave a more precise description of the set of solutions of the norm form equations (9.4.1) and (9.4.2), in which the solutions are divided into more restrictive families of solutions instead of the wide families of Theorems 9.4.1 and 9.4.2. The next theorem is a generalization of these results to arbitrary decomposable form equations.

Let $K$ be an algebraic number field, $S$ a finite set of places on $K$ containing all infinite places, $O_S$ the ring of $S$-integers in $K$, and $\Omega$ a finite étale $K$-algebra. Let $\delta\in K^*$, $\mathcal{M}$ a finitely generated $O_S$-module contained in $\Omega$, and consider the equation
\[ N_{\Omega/K}(\boldsymbol{\mu})\in\delta O_S^* \quad\text{in } \boldsymbol{\mu}\in\mathcal{M}. \tag{9.4.8} \]
For each $K$-subalgebra $\Upsilon$ of $\Omega$, denote by $O_{S,\Upsilon}$ the integral closure of $O_S$ in $\Upsilon$. Further, we define the sets
\[ (K\mathcal{M})^\Upsilon:=\{\boldsymbol{\mu}\in K\mathcal{M} : \boldsymbol{\mu}\Upsilon\subseteq K\mathcal{M}\},\qquad \mathcal{M}^\Upsilon:=(K\mathcal{M})^\Upsilon\cap\mathcal{M}, \]
where $K\mathcal{M}$ is the $K$-vector space in $\Omega$ generated by $\mathcal{M}$. Consider the subgroup
\[ U_{\mathcal{M},\Upsilon}:=\{\boldsymbol{\varepsilon}\in O_{S,\Upsilon}^* : \boldsymbol{\varepsilon}\mathcal{M}^\Upsilon=\mathcal{M}^\Upsilon\} \]
of the unit group of $O_{S,\Upsilon}$. The group $O_{S,\Upsilon}^*$ is finitely generated, hence its rank is finite. One can show that $U_{\mathcal{M},\Upsilon}$ is of finite index in $O_{S,\Upsilon}^*$ and thus has the same rank as $O_{S,\Upsilon}^*$.

An $(\mathcal{M},\Upsilon)$-family of solutions of (9.4.8) is a coset $\boldsymbol{\mu}U_{\mathcal{M},\Upsilon}$, where $\Upsilon$ is a $K$-subalgebra of $\Omega$ and $\boldsymbol{\mu}\in\mathcal{M}^\Upsilon$ is a solution of (9.4.8). Every element of $\boldsymbol{\mu}U_{\mathcal{M},\Upsilon}$ is a solution of (9.4.8).

Theorem 9.4.4 The set of solutions of (9.4.8) is a union of finitely many families.

Proof. See Győry (1993a).

As was mentioned above, in the case of norm form equations, Theorem 9.4.4 is due to Schmidt (1972) for $K=\mathbb{Q}$, $O_S=\mathbb{Z}$, Schlickewei (1977c) for $K=\mathbb{Q}$ and Laurent (1984) for arbitrary number fields $K$.
Theorem 9.4.4 was deduced from Theorem 9.4.2 by showing that every wide family of solutions splits into finitely many families of solutions. It follows from an observation of Laurent (1984) that Theorem 9.4.4 cannot be extended to the case of an arbitrary finitely generated ground field $K$.

The proofs of Theorems 9.4.1–9.4.4 in Győry (1993a) are based on Theorem 9.2.1 on unit equations. See also Bombieri and Gubler (2006) where, in the case of norm form equations over $\mathbb{Z}$, the proof of the above Theorem 9.4.4 involves also Theorem 9.2.1. As was explained in Chapter 6, the proof of Theorem 9.2.1 depends on the p-adic Subspace Theorem. We note that in contrast, Schmidt and Schlickewei deduced their results concerning norm form equations directly from the Subspace Theorem and its p-adic generalization.

9.5 Upper bounds for the number of solutions

In this section, we consider decomposable form equations over the ring of $S$-integers in an algebraic number field. We give an overview of quantitative results, giving explicit upper bounds for the number of solutions. We first recall some history. In Subsection 9.5.1 we recall from Evertse (1995) a general result on systems of $S$-unit equations with a Galois action, and in Subsection 9.5.2 we deduce, among other things, a quantitative version of Theorem 9.1.1.

Let $K$ be an algebraic number field, $S$ a finite set of places of $K$ containing the infinite places, $\delta$ a non-zero element of $O_S$, and $F\in O_S[X,Y]$ a binary form of degree $n\ge 3$ with at least three pairwise non-proportional linear factors over $\overline{K}$. Consider the equation
\[ F(x,y)\in\delta O_S^* \quad\text{in } (x,y)\in O_S^2. \tag{9.5.1} \]
Denote by $s$ the cardinality of $S$ and by $\omega_S(\delta)$ the number of prime ideals outside $S$ occurring in the factorization of $\delta$. The solutions of (9.5.1) are divided into $O_S^*$-cosets in the usual manner. Lewis and Mahler (1961) were the first to give, in the case $K=\mathbb{Q}$, a completely explicit upper bound for the number of $O_S^*$-cosets of solutions of (9.5.1), depending on $s$, $\omega_S(\delta)$, $n$, and also on the heights of the coefficients of $F$. In chapter 6 of his PhD thesis Evertse (1983) extended the result of Lewis and Mahler to arbitrary number fields and sets of places $S$, and derived an explicit upper bound for the number of $O_S^*$-cosets of solutions of (9.5.1) that depends only on $n$, $s$, $\omega_S(\delta)$, and so is independent of the coefficients of $F$. On the other hand, Evertse's bound had a much worse dependence on the degree $n$ of $F$ than that of Lewis and Mahler.

Later, Evertse's bound was reduced substantially in the case that $F$ is irreducible over $K$. Bombieri and Schmidt (1987) proved that if $F\in\mathbb{Z}[X,Y]$ is an irreducible binary form of degree $n\ge 3$, then the equation $F(x,y)=1$ in $x,y\in\mathbb{Z}$ has at most $cn$ solutions with $c$ an absolute constant. For $n$ sufficiently large, $c$ can be taken equal to 430. The example
\[ (x-a_1y)\cdots(x-a_ny)+y^n=1 \]
with $a_1,\ldots,a_n$ distinct integers shows that the bound of Bombieri and Schmidt is best possible in terms of $n$. Bombieri considered more generally (9.5.1) with arbitrary $K$, $S$ but with $F$ irreducible over $K$ and of degree $n\ge 6$. In Bombieri (1994) he obtained the upper bound $(12n)^{12(s+\omega_S(\delta))}$ for the number of $O_S^*$-cosets of solutions of (9.5.1).
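The extremal example quoted above can be checked directly: each point $(a_i,1)$ kills one linear factor, so the left-hand side reduces to $1^n=1$ and the equation has at least $n$ solutions. The following Python sketch evaluates the form at these points; the particular integers in `a` are a hypothetical choice made only for illustration, and the sketch does not address irreducibility of the form.

```python
from math import prod

def F(x, y, a):
    """Value of (x - a_1 y)...(x - a_n y) + y^n for integer inputs."""
    return prod(x - ai * y for ai in a) + y ** len(a)

a = [0, 1, 2, 3, 4]  # hypothetical distinct integers a_1, ..., a_n with n = 5
solutions = [(ai, 1) for ai in a if F(ai, 1, a) == 1]
```

Here `solutions` contains one point per $a_i$, which is the linear growth in $n$ that the example is meant to exhibit.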
For binary forms $F\in O_S[X,Y]$ of degree $n\ge 3$ irreducible over $K$, Bombieri's bound was improved in Evertse (1997) to
\[ (10^5n)^{s+\omega_S(\delta)}. \]
This is still the best bound for general Thue–Mahler equations.

Schmidt generalized the above mentioned results on Thue equations to norm form equations over $\mathbb{Z}$ in more than two unknowns. Let $L$ be a number field of degree $n$ and $l=\alpha_1X_1+\cdots+\alpha_mX_m$, where $L=\mathbb{Q}(\alpha_1,\ldots,\alpha_m)$ and $\alpha_1,\ldots,\alpha_m$ are linearly independent over $\mathbb{Q}$. Let $c$ be a non-zero integer such that
\[ F:=cN_{L/\mathbb{Q}}(l)=c\prod_{\sigma}\big(\sigma(\alpha_1)X_1+\cdots+\sigma(\alpha_m)X_m\big)\in\mathbb{Z}[X_1,\ldots,X_m], \]
where the product is over all embeddings $\sigma : L\to\overline{\mathbb{Q}}$. Recall that $F$ is called non-degenerate if the $\mathbb{Q}$-vector space $V:=\{l(x) : x\in\mathbb{Q}^m\}$ contains no subspace of the form $\mu L'$, where $\mu\in L^*$ and $L'$ is a subfield of $L$ that is not equal to $\mathbb{Q}$ or an imaginary quadratic field. Under this hypothesis, Schmidt (1990), Theorem 1 proved that the equation
\[ |F(x)|=1 \quad\text{in } x\in\mathbb{Z}^m \]
has at most
\[ c_1(m,n)=\min\Big(n^{2^{30m}n^2},\ n^{c_2(m)}\Big) \quad\text{with } c_2(m)=(2m)^{m\times 2^{m+4}} \]
solutions. In the same paper, Schmidt proved also that if $\delta$ is any positive integer, then the equation $|F(x)|=\delta$ has at most
\[ \binom{n}{m-1}^{\omega(\delta)}d_{m-1}(\delta^n)\,c_1(m,n) \]
primitive solutions, i.e., with coordinates having greatest common divisor 1, where $\omega(\delta)$ denotes the number of distinct primes dividing $\delta$, and $d_{m-1}(\delta^n)$ denotes the number of ways that $\delta^n$ can be expressed as a product of $m-1$ positive integers. Schmidt's main tool was his quantitative version of the Subspace Theorem that he had established shortly before in Schmidt (1989).
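To get a feeling for the arithmetic factor in Schmidt's bound for $|F(x)|=\delta$, the following Python sketch computes $\omega(\delta)$, $d_{m-1}(\delta^n)$ and the product $\binom{n}{m-1}^{\omega(\delta)}d_{m-1}(\delta^n)$. It assumes that $d_k(N)$ counts ordered factorizations, as for the usual generalized divisor function, so that $d_k(N)=\prod_p\binom{e_p+k-1}{k-1}$ when $N=\prod_p p^{e_p}$; the function names are hypothetical, and the factor $c_1(m,n)$ is left out because it is far too large to be informative numerically.

```python
from math import comb

def factorize(n):
    """Prime factorization of n >= 1 as a dict {prime: exponent} (trial division)."""
    out, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            out[p] = out.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        out[n] = out.get(n, 0) + 1
    return out

def d(k, N):
    """Number of ordered ways to write N as a product of k positive integers."""
    result = 1
    for e in factorize(N).values():
        result *= comb(e + k - 1, k - 1)
    return result

def schmidt_factor(delta, n, m):
    """binom(n, m-1)^omega(delta) * d_{m-1}(delta^n), the factor multiplying c_1(m, n)."""
    omega = len(factorize(delta))
    return comb(n, m - 1) ** omega * d(m - 1, delta ** n)
```

For instance, `schmidt_factor(6, 3, 3)` combines $\omega(6)=2$, $\binom{3}{2}=3$ and $d_2(6^3)=16$. For $\delta=1$ the factor is 1, which is consistent with the bound $c_1(m,n)$ for $|F(x)|=1$.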
Győry (1993a) gave explicit upper bounds for the number of solutions of arbitrary decomposable form equations over the ring of $S$-integers of a number field $K$, in the case that this number is finite. More generally, in the case that the number of solutions is infinite, he gave an explicit upper bound for the number of families of solutions. He derived his bounds by making a reduction to $S$-unit equations over the splitting field over $K$ of the decomposable form involved, and this led to bounds that are exponential in both the cardinality of $S$ and the degree of the splitting field. Notice that if the decomposable form involved has degree $n$, then in the worst case, its splitting field has degree $n!$ and then Győry's bound is exponential in $n!$. Evertse (1995) proved a general quantitative result on "Galois symmetric $S$-unit vectors", and this enabled him to prove much sharper upper bounds for the number of solutions (if finite) of decomposable form equations over $O_S$. This was extended in Evertse and Győry (1997) to estimates for the number of families of solutions in the case when the number of solutions is infinite.

In the next subsection, we recall, without proof, Evertse's result on Galois symmetric $S$-unit vectors. In the subsequent subsection we discuss some consequences for decomposable form equations and $S$-unit equations.

9.5.1 Galois symmetric S-unit vectors

Let $K$ be an algebraic number field, $S$ a finite set of places of $K$, and $G$ a finite normal extension of $K$. Denote by $O_{S,G}$ the integral closure of $O_S$ in $G$. Let $n\ge 3$ be an integer and $\varrho$ an action of $\mathrm{Gal}(G/K)$ on $\{1,\ldots,n\}$, i.e., a homomorphism from $\mathrm{Gal}(G/K)$ to the permutation group of $\{1,\ldots,n\}$. That is, $\varrho$ maps $\sigma\in\mathrm{Gal}(G/K)$ to a permutation $(\sigma(1),\ldots,\sigma(n))$ of $(1,\ldots,n)$. We define the $K$-algebra
\[ \Lambda:=\{u=(u_1,\ldots,u_n)\in G^n : \sigma(u_i)=u_{\sigma(i)} \text{ for } \sigma\in\mathrm{Gal}(G/K),\ i=1,\ldots,n\} \]
with coordinatewise addition, multiplication, and scalar multiplication with $K$. The unit element of $\Lambda$ is $\mathbf{1}:=(1,\ldots,1)$. We embed $K$ into $\Lambda$ via $\iota : a\mapsto a\cdot\mathbf{1}$.

A $\varrho$-symmetric partition is a collection of non-empty, pairwise disjoint sets $\mathcal{P}=\{P_1,\ldots,P_t\}$ such that
\[ \bigcup_{i=1}^{t}P_i=\{1,\ldots,n\},\qquad \sigma(P_i)\in\mathcal{P} \text{ for } \sigma\in\mathrm{Gal}(G/K),\ i=1,\ldots,t. \]
In particular we have the trivial $\varrho$-symmetric partition $\mathcal{P}_0:=\{\{1,\ldots,n\}\}$. A pair $i\overset{\mathcal{P}}{\sim}j$ is a pair $i,j\in\{1,\ldots,n\}$ belonging to the same set of $\mathcal{P}$. With a $\varrho$-symmetric partition $\mathcal{P}$ we associate the sets
\[ \Lambda_{\mathcal{P}}:=\{u\in\Lambda : u_i=u_j \text{ for each pair } i\overset{\mathcal{P}}{\sim}j\},\qquad O_{S,\mathcal{P}}:=O_{S,G}^n\cap\Lambda_{\mathcal{P}}. \]
The set $\Lambda_{\mathcal{P}}$ is a $K$-subalgebra of $\Lambda$, and $O_{S,\mathcal{P}}$ is the integral closure of $O_S$ in $\Lambda_{\mathcal{P}}$. For instance, $\Lambda_{\mathcal{P}_0}=\iota(K)$, and $\Lambda_{\mathcal{P}}=\Lambda$ for $\mathcal{P}=\{\{1\},\ldots,\{n\}\}$.

Let $W$ be a $K$-linear subspace of $\Lambda$. Define
\[ W^\perp:=\Big\{y=(y_1,\ldots,y_n)\in G^n : \sum_{i=1}^{n}y_iu_i=0 \text{ for all } u\in W\Big\}. \]
For a $\varrho$-symmetric partition $\mathcal{P}=\{P_1,\ldots,P_t\}$, we define the subspace of $W$,
\[ W_{\mathcal{P}}:=\Big\{u\in W : \sum_{j\in P_i}y_ju_j=0 \text{ for all } y\in W^\perp,\ i=1,\ldots,t\Big\}. \]
One can show that $W_{\mathcal{P}}=\{u\in W : u\Lambda_{\mathcal{P}}\subseteq W\}$ (see Evertse (1995), Lemma 10). As a consequence, $\Lambda_{\mathcal{P}}W_{\mathcal{P}}\subseteq W_{\mathcal{P}}$.

A $K^*$-coset is a set $\{a\cdot u : a\in K^*\}$ with some fixed $u\in\Lambda$. We are now ready to state our result.

Theorem 9.5.1 Let $K$ be a number field, $G$ a finite normal extension of $K$, $n\ge 3$, $\varrho$ a $\mathrm{Gal}(G/K)$-action on $\{1,\ldots,n\}$, $W$ a $K$-linear subspace of $\Lambda$ of dimension $m$ and $S$ a finite set of places of $K$ of cardinality $s$, containing all infinite places. Then the set of $u=(u_1,\ldots,u_n)$ with
\[ u\in W,\quad u_1\cdots u_n\ne 0,\quad u_i/u_j\in O_{S,G}^* \text{ for } i,j=1,\ldots,n, \tag{9.5.2} \]
\[ u\notin W_{\mathcal{P}} \text{ for each } \varrho\text{-symmetric partition } \mathcal{P} \text{ such that } O_{S,\mathcal{P}}^*/\iota(O_S^*) \text{ is infinite} \tag{9.5.3} \]
is a union of at most
\[ (2^{33}n^2)^{m^3s} \]
$K^*$-cosets.

Proof. See Evertse (1995), Theorem 4, Lemma 10. The proof is based on a quantitative version of the Subspace Theorem, proved in Evertse (1996).

9.5.2 Consequences for decomposable form equations and S-unit equations

Let $K$ be a number field and $S$ a finite set of places of $K$ containing all infinite places. Suppose $|S|=s$. For non-zero $\delta\in O_S$, we denote by $\omega_S(\delta)$ the number of prime ideals outside $S$ occurring in the prime ideal factorization of $\delta$.

Let $F\in O_S[X_1,\ldots,X_m]$ be a decomposable form, and denote by $G$ its splitting field over $K$. Recall that there exist a $\mathrm{Gal}(G/K)$-stable set $\mathcal{L}_0=\{l_1,\ldots,l_n\}\subset G[X_1,\ldots,X_m]$ of pairwise non-proportional linear forms, $c\in K^*$ and positive integers $e_1,\ldots,e_n$, such that
\[ F=c\,l_1^{e_1}\cdots l_n^{e_n}. \]
Let $\mathcal{L}$ be a finite set of pairwise non-proportional linear forms from $G[X_1,\ldots,X_m]$ with $\mathcal{L}\supseteq\mathcal{L}_0$. We deduce a quantitative version of the implication (i) ⇒ (iii) of Theorem 9.1.1 from Theorem 9.5.1.

Theorem 9.5.2 Let $m$, $K$, $S$, $F$, $G$, $\mathcal{L}_0$, $\mathcal{L}$ be as above. Assume that
\[ \operatorname{rank}_G\mathcal{L}_0=m, \tag{9.5.4} \]
\[ \mathcal{L}\cap\Big(\sum_{\sigma\in\mathrm{Gal}(G/K)}[\sigma(\mathcal{L}_1)]\cap[\mathcal{L}_0\setminus\sigma(\mathcal{L}_1)]\Big)\ne\emptyset \tag{9.5.5} \]
for each non-empty $\mathrm{Gal}(G/K)$-proper subset $\mathcal{L}_1\subsetneq\mathcal{L}_0$. Then the solutions of
\[ F(x)\in\delta O_S^* \quad\text{in } x\in O_S^m \text{ with } l(x)\ne 0 \text{ for } l\in\mathcal{L} \tag{9.5.6} \]
lie in at most
\[ (2^{33}n^2)^{m^3(s+\omega_S(\delta))} \]
$O_S^*$-cosets.

Proof. Let $S'$ consist of the places in $S$ and of the prime ideals in the factorization of $\delta$. Then $|S'|=s+\omega_S(\delta)$. Assume that (9.5.6) is solvable (if not we are done) and choose a solution $x_0\in O_S^m$. After multiplying $l_1,\ldots,l_n$ by suitable scalars, which does not affect the above assumptions on $\mathcal{L}_0$, $\mathcal{L}$, we may assume that $l_i(x_0)=1$ for $i=1,\ldots,n$ and $c\in O_{S'}^*$. Denote by $O_{S',G}$ the integral closure of $O_{S'}$ in $G$.

We first show that for every solution $x\in O_S^m$ of (9.5.6) we have
\[ l_i(x)\in O_{S',G}^* \quad\text{for } i=1,\ldots,n. \tag{9.5.7} \]
This is equivalent to the assertion that $|l_i(x)|_V=1$ for $i=1,\ldots,n$ and every place $V$ of $G$ not lying above a place from $S'$. To prove this, take such a place $V$. For a polynomial $H$ with coefficients in $G$ denote by $|H|_V$ the maximum of the $|\cdot|_V$-values of the coefficients of $H$. By our assumption on $l_1,\ldots,l_n$ we have $|l_i|_V\ge 1$ for $i=1,\ldots,n$ while on the other hand, by Proposition 1.9.4 and our assumption $F\in O_S[X_1,\ldots,X_m]$,
\[ \prod_{i=1}^{n}|l_i|_V^{e_i}=|F|_V\le 1. \]
Hence $|l_i|_V=1$ for $i=1,\ldots,n$. So if $x\in O_S^m$ is a solution of (9.5.6) then $|l_i(x)|_V\le 1$ for $i=1,\ldots,n$ and $\prod_{i=1}^{n}|l_i(x)|_V^{e_i}=|F(x)|_V=1$. This implies $|l_i(x)|_V=1$ for $i=1,\ldots,n$, as required.

Define the $K$-linear map
\[ \varphi : x\mapsto(l_1(x),\ldots,l_n(x)) : K^m\to G^n. \]
By (9.5.4), it is injective. Let $\varphi(K^m)=:W$. Since $\{l_1,\ldots,l_n\}$ is $\mathrm{Gal}(G/K)$-stable, there is an action $\varrho$ of $\mathrm{Gal}(G/K)$ on $\{1,\ldots,n\}$ such that $\sigma(l_i)=l_{\sigma(i)}$ for $i=1,\ldots,n$, $\sigma\in\mathrm{Gal}(G/K)$. This implies that $W$ is an $m$-dimensional, $K$-linear subspace of $\Lambda$. In view of (9.5.7) we have for every solution $x\in O_S^m$ of (9.5.6) that
\[ \varphi(x)\in W,\qquad \varphi(x)\in(O_{S',G}^*)^n, \tag{9.5.8} \]
so certainly, $u:=\varphi(x)$ satisfies (9.5.2) with $S'$ in place of $S$.

We next show that if $x\in O_S^m$ is a solution of (9.5.6), then
\[ \varphi(x)\notin W_{\mathcal{P}} \text{ for each } \varrho\text{-symmetric partition } \mathcal{P}\ne\{\{1,\ldots,n\}\}, \tag{9.5.9} \]
which is stronger than (9.5.3). Let $x\in O_S^m$ be a solution of (9.5.6), and $\mathcal{P}=\{P_1,\ldots,P_t\}$ a $\varrho$-symmetric partition different from $\{\{1,\ldots,n\}\}$. Further, let $\mathcal{L}_1=\{l_i : i\in P_1\}$. Then $\mathcal{L}_1\subsetneq\mathcal{L}_0$ and $\mathcal{L}_1$ is $\mathrm{Gal}(G/K)$-proper. By assumption (9.5.5), there is a linear form in
\[ \sum_{\sigma\in\mathrm{Gal}(G/K)}[\sigma(\mathcal{L}_1)]\cap[\mathcal{L}_0\setminus\sigma(\mathcal{L}_1)] \]
that does not vanish at $x$. This implies that there are $\sigma\in\mathrm{Gal}(G/K)$ and $l\in[\sigma(\mathcal{L}_1)]\cap[\mathcal{L}_0\setminus\sigma(\mathcal{L}_1)]$ such that $l(x)\ne 0$. There is a set $P_i\in\mathcal{P}$ such that $\sigma(\mathcal{L}_1)=\{l_j : j\in P_i\}$. Now there are $c_j\in G$ for $j=1,\ldots,n$ such that
\[ l=\sum_{j\in P_i}c_jl_j=-\sum_{j\in P_i^c}c_jl_j, \]
where $P_i^c=\{1,\ldots,n\}\setminus P_i$. The vector $(c_1,\ldots,c_n)$ belongs to $W^\perp$, and our observation $l(x)\ne 0$ implies that for the vector $u=\varphi(x)$ we have $\sum_{j\in P_i}c_ju_j\ne 0$. So indeed, $\varphi(x)\notin W_{\mathcal{P}}$.

We conclude that if $x\in O_S^m$ is a solution of (9.5.6), then $\varphi(x)$ satisfies (9.5.8), (9.5.9), hence (9.5.2), (9.5.3) with $S'$ instead of $S$. Now Theorem 9.5.1 with $S'$, $s'=s+\omega_S(\delta)$ instead of $S$, $s$ implies that the vectors $\varphi(x)$, with $x\in O_S^m$ a solution of (9.5.6), lie in at most $N:=(2^{33}n^2)^{m^3(s+\omega_S(\delta))}$ $K^*$-cosets. Since $\varphi$ is an injective, $K$-linear map, this implies that the solutions $x$ themselves lie in at most $N$ $K^*$-cosets. But, clearly, any two solutions of (9.5.6) in the same $K^*$-coset lie in fact in the same $O_S^*$-coset. Theorem 9.5.2 follows.

The next consequence is an improvement of Theorem 6.1.3 in the case $\Gamma=(O_S^*)^m$.

Theorem 9.5.3 Let $K$, $S$ be as above, and let $a_1,\ldots,a_m\in K^*$. Then the equation
\[ a_1u_1+\cdots+a_mu_m=1 \quad\text{in } u_1,\ldots,u_m\in O_S^* \tag{9.5.10} \]
has at most $(2^{35}m^2)^{m^3s}$ solutions with
\[ \sum_{i\in I}a_iu_i\ne 0 \quad\text{for each non-empty } I\subseteq\{1,\ldots,m\}. \tag{9.5.11} \]

Proof. We apply Theorem 9.5.1 with $n=m+1$, $G=K$, and
\[ W=\{(u_1,\ldots,u_m,u_{m+1})\in K^{m+1} : a_1u_1+\cdots+a_mu_m=u_{m+1}\}. \]
Then the points $(u_1,\ldots,u_m,1)$ with (9.5.10) and (9.5.11) satisfy (9.5.2) and (9.5.3). Since these points lie in different $K^*$-cosets, Theorem 9.5.3 follows.

We state without proof a consequence of Theorem 9.5.1 for the number of families of solutions of decomposable form equations, giving a quantitative version of Theorem 9.4.4. We keep the notation from Section 9.4.

Let as before $K$ be an algebraic number field, $S$ a finite set of places on $K$ containing all infinite places and $\Omega$ a finite étale $K$-algebra. Let $c$, $\delta$ be non-zero elements of $O_S$, $\mathcal{M}$ a finitely generated $O_S$-module contained in $\Omega$, and consider the equation
\[ cN_{\Omega/K}(\boldsymbol{\mu})\in\delta O_S^* \quad\text{in } \boldsymbol{\mu}\in\mathcal{M}. \tag{9.5.12} \]
We assume that for some $O_S$-module generating set $\{\boldsymbol{\alpha}_1,\ldots,\boldsymbol{\alpha}_t\}$ of $\mathcal{M}$, the polynomial $cN_{\Omega/K}(X_1\boldsymbol{\alpha}_1+\cdots+X_t\boldsymbol{\alpha}_t)$ has its coefficients in $O_S$. In fact, this does not depend on the choice of the generating set.

For the definition of the submodules $\mathcal{M}^\Upsilon$ and the groups $U_{\mathcal{M},\Upsilon}$ (for $\Upsilon$ a $K$-subalgebra of $\Omega$) and that of a family of solutions of (9.5.12) we refer to Section 9.4. Recall that $U_{\mathcal{M},\Upsilon}$ is a subgroup of $O_{S,\Upsilon}^*$ of finite index if $\mathcal{M}^\Upsilon\ne(0)$. A family of solutions of (9.5.12) is called irreducible if it is not a union of finitely many strictly smaller families of solutions.

Let $n:=[\Omega:K]$, $m:=\dim_KK\mathcal{M}$, $s:=|S|$, and put
\[ \psi(\delta):=\binom{n}{m-1}^{\omega_S(\delta)}\prod_{v\notin S}\binom{\operatorname{ord}_v(\delta)+m-1}{m-1}, \]
where $\operatorname{ord}_v(\delta)$ is the exponent on the prime ideal corresponding to $v$ in the factorization of $\delta$. Consider the $K$-subalgebras $\Upsilon$ of $\Omega$ such that (9.5.12) has irreducible $(\mathcal{M},\Upsilon)$-families of solutions, and denote by $I_{\mathcal{M}}$ the maximum of the indices $[O_{S,\Upsilon}^* : U_{\mathcal{M},\Upsilon}]$, taken over all such algebras $\Upsilon$.

We state without proof the following quantitative result, which can be deduced from Theorem 9.5.1.

Theorem 9.5.4 The set of solutions of (9.5.12) is a union of at most
\[ (2^{33}n^2)^{m^3s}\,\psi(\delta)\cdot I_{\mathcal{M}} \]
irreducible families.

Proof. This is a simplified version of Evertse and Győry (1997), Theorem 1.

Notice that by taking for $\Omega$ a finite extension field of $K$, we obtain from Theorem 9.5.4 an upper bound for the number of families of solutions of a norm form equation.

By an $O_S^*$-coset of solutions, we mean a coset $\boldsymbol{\mu}O_S^*$, where $\boldsymbol{\mu}$ is a solution of (9.5.12).

Corollary 9.5.5 Assume that (9.5.12) has only finitely many $O_S^*$-cosets of solutions. Then the number of these is at most
\[ (2^{33}n^2)^{m^3s}\,\psi(\delta). \]

Proof. By assumption, (9.5.12) cannot have irreducible families of solutions that are the union of infinitely many $O_S^*$-cosets. So it has only irreducible families that are the union of only finitely many $O_S^*$-cosets, and such families must be $O_S^*$-cosets themselves. In this situation, $I_{\mathcal{M}}=1$. Corollary 9.5.5 follows.

We can express the set of solutions of (9.5.12) as a minimal finite union of irreducible families $\mathcal{F}_1\cup\cdots\cup\mathcal{F}_t$, i.e., none of the families in this union is contained in the union of the others. Evertse and Győry (1997) showed that this way of expressing the set of solutions is unique, and moreover, that $\mathcal{F}_1,\ldots,\mathcal{F}_t$ are precisely the maximal irreducible families of solutions of (9.5.12), that is, if $\mathcal{F}$ is any other irreducible family of solutions of (9.5.12), then $\mathcal{F}\subseteq\mathcal{F}_i$ for some $i\in\{1,\ldots,t\}$.

Voutier (2014) showed that if $L$ is an algebraic number field of degree $n>3$ and $\mathcal{M}$ a free $\mathbb{Z}$-module of rank 3 contained in $O_L$, then the norm form
equation
\[
N_{L/\mathbb{Q}}(\mu)=1 \quad\text{in } \mu\in M \tag{9.5.13}
\]
has at most $10^{969}n^{10}$ families of solutions. On the other hand, in his paper, Voutier showed that for every number field $L$ of degree $n\geq 3$ and every integer $N>0$, there exists a full module $M\subseteq O_L$, i.e., of rank equal to $n$, such that (9.5.13) has at least $N$ families of solutions. This implies that the bound in Theorem 9.5.4 cannot be replaced by one depending only on $m$, $n$, $s$, $\delta$ and independent of $M$.

9.6 Effective results

In this section, effective results are presented for some important classes of decomposable form equations of the form
\[
F(x)=\delta \quad\text{in } x=(x_1,\dots,x_m)\in O_S^m \text{ with } l(x)\neq 0 \text{ for } l\in\mathcal{L} \tag{9.6.1}
\]
and
\[
F(x)\in\delta O_S^* \quad\text{in } x=(x_1,\dots,x_m)\in O_S^m \text{ with } l(x)\neq 0 \text{ for } l\in\mathcal{L}, \tag{9.6.2}
\]
where $O_S$ is the ring of $S$-integers of a number field $K$, $\delta\in O_S\setminus\{0\}$, $F(X)$ is a decomposable form of degree $n\geq 3$ with coefficients in $O_S$ and $\mathcal{L}$ is a finite set of non-zero linear forms from $K[X_1,\dots,X_m]$. Using the effective results of Section 4.1 on $S$-unit equations, we derive effective bounds for the $S$-integral solutions of Thue equations, discriminant equations, certain norm form equations and decomposable form equations in an arbitrary number of unknowns. In the case of equation (9.6.1), these imply the finiteness of the number of solutions, and make it possible, at least in principle, to determine the solutions, provided that $K$, $S$, $\delta$, $n$ and the coefficients of $F$ are given effectively in the sense described in Section 1.10. The results presented in this section have many important applications in number theory.

As was already mentioned, equation (9.6.2) can be reduced to finitely many equations of the form (9.6.1). This can be carried out in an effective way. Indeed, let $x$ be a solution of (9.6.2). Then $F(x)=\delta\eta$ with some $\eta\in O_S^*$. By Proposition 4.3.12 there is an $\varepsilon\in O_S^*$ for which $h(\eta\varepsilon^n)$ and hence $h(\delta\eta\varepsilon^n)$ are effectively bounded.
Further, $\varepsilon x$ is a solution of equation (9.6.1) with $\delta$ replaced by $\delta\eta\varepsilon^n$. In what follows, we deal only with equation (9.6.1).

Further effective applications of $S$-unit equations to discriminant form and index form equations and related Diophantine problems are given in our next book on discriminant equations.

9.6.1 Thue equations

Let $K$ be an algebraic number field and $S$ a finite set of places of $K$, containing all infinite places. Let $F\in O_S[X,Y]$ be a binary form of degree $n\geq 3$ having at least three pairwise non-proportional linear factors over $\overline{K}$, and let $\delta\in O_S\setminus\{0\}$. Consider the Thue equation
\[
F(x,y)=\delta \quad\text{in } x,y\in O_S. \tag{9.6.3}
\]
In the classical case when $K=\mathbb{Q}$, $S=\{\infty\}$ and $F(X,Y)$ is irreducible over $\mathbb{Q}$, the first explicit upper bound for the solutions of this equation was obtained in Baker (1968a). His bound depends only on $\delta$, $n$, and the maximum of the absolute values of the coefficients of $F$. Baker's proof is based on his effective estimates for linear forms in logarithms of algebraic numbers. Baker's result was extended in Coates (1969) to the case when $K=\mathbb{Q}$ and $S$ is arbitrary and in Kotov and Sprindžuk (1973) to the case of equation (9.6.3). Later, several improvements and generalizations were established; for references see the Notes (Section 9.7). We note that better bounds can be obtained for the solutions if certain parameters of the number field generated by one or more zeros of $F(X,1)$ are also involved.

For applications, we give completely explicit upper bounds for the solutions of equation (9.6.3). Let $d$, $h_K$ and $R_K$ denote the degree, class number and regulator of $K$.
Further, let $s=|S|$, $R_S$ the $S$-regulator of $K$ (see (1.8.2)),
\[
P_K:=\max_{\mathfrak{p}} N(\mathfrak{p}),\quad Q_K:=\prod_{\mathfrak{p}} N(\mathfrak{p}) \quad\text{if } S\supsetneq M_K^{\infty}, \qquad P_K:=2,\quad Q_K:=1 \quad\text{if } S=M_K^{\infty},
\]
where the maximum and product are taken over all prime ideals $\mathfrak{p}$ from $S$, and $N(\mathfrak{p}):=|O_K/\mathfrak{p}|$ denotes the norm of $\mathfrak{p}$. The case $s=1$ being trivial, we assume that $s\geq 2$. Finally, let $H$ ($\geq 2$) be an upper bound for the maximum of the logarithmic heights of the coefficients of $F(X,Y)$.

The next theorem is a slightly weaker version of Corollary 3 of Győry and Yu (2006).

Theorem 9.6.1 Suppose that the binary form $F(X,Y)$ in (9.6.3) factorizes over $K$ into linear factors and that at least three of these factors are pairwise non-proportional. Then all solutions $x$, $y$ of (9.6.3) satisfy
\[
\max(h(x),h(y)) < \frac{1}{n}h(\delta) + 7n(64eds)^{2s+5}P_KN_KR_S(\log^*R_S), \tag{9.6.4}
\]
where
\[
N_K = n^5H + \frac{1}{d}\log N_S(\delta) + d^dR_K + \frac{h_K}{d}\log Q_K.
\]
The proof is based on Corollary 4.1.5 on $S$-unit equations.

Consider now the case when $F(X,Y)$ does not factorize over $K$ into linear forms. For later convenience, we assume that $F(1,0)\neq 0$. Then we may assume that three zeros, say $\alpha_1,\alpha_2,\alpha_3$, of $F(X,1)$ are distinct, and $\alpha_1$ is not contained in $K$. Let $L=K(\alpha_1)$, $h_L$ and $R_L$ be the class number and regulator of $L$, $T$ the set of places of $L$ lying above those of $S$, and $R_T$ the $T$-regulator of $L$. Further, let $M=K(\alpha_1,\alpha_2,\alpha_3)$ and $P_M=P_K^{[M:K]}$ if $S\supsetneq M_K^{\infty}$ and $P_M=2$ if $S=M_K^{\infty}$.

Theorem 9.6.2 Let $F(X,Y)$ be a binary form as in (9.6.3). Suppose that $F(1,0)\neq 0$, that $\alpha_1,\alpha_2,\alpha_3$ are distinct zeros of $F(X,1)$, and that $\alpha_1$ is not contained in $K$.
Then, with the above notation, all solutions of (9.6.3) satisfy
\[
\max(h(x),h(y)) < \frac{1}{n}h(\delta) + 17(64edn^2s)^{2ns+5}P_MN_LR_T(\log^*R_T), \tag{9.6.5}
\]
where
\[
N_L = n^2H + \frac{1}{d}\log N_S(\delta) + (nd)^{nd}R_L + \frac{h_L}{d}\log Q_K.
\]
In particular, if $K=\mathbb{Q}$, $S=\{\infty\}$ and $F(X,Y)$ is irreducible over $\mathbb{Q}$, then
\[
\max(|x|,|y|) < \exp\bigl\{c\,(H+\log|\delta|+n^nR_L)\,R_L(\log^*R_L)\bigr\},
\]
where $c=34(64en^2)^{2n+6}$.

Apart from the value of $c$, this latter bound was established in Bugeaud and Győry (1996b). Combining this bound with (1.5.2) and Lemma 1.5.1, one gets at once an upper bound that depends only on $\delta$, $n$ and $H$.

Using their methods mentioned in Section 4.5, Bombieri (1993) in the case $S=M_K^{\infty}$, $F(X,1)$ monic, and Bugeaud (1998) in the case $F(X,1)$ monic and irreducible over $K$ derived similar bounds for the solutions of equation (9.6.3). Theorem 9.6.2 is a generalization and, apart from the factor $\log^*R_T$ in (9.6.5), is an improvement of these results of Bombieri and Bugeaud.

Remark The restriction $F(1,0)\neq 0$ is not an essential one. Indeed, there is an $a\in\mathbb{Z}$ with $1\leq a\leq n$ such that $F(1,a)\neq 0$. Then one may take the binary form $G(X,Y)=F(X,aX+Y)$ instead of $F(X,Y)$, in which the coefficient of $X^n$ is $F(1,a)\neq 0$ and the logarithmic heights of the coefficients of $G$ do not exceed $(n+1)(H+n\log n)+\log(n+1)$.

For convenience, we give a common proof for Theorems 9.6.1 and 9.6.2. We first prove Theorem 9.6.1 by means of Corollary 4.1.5. To prove a version of Theorem 9.6.2, we could also use this corollary in the field $M=K(\alpha_1,\alpha_2,\alpha_3)$ with the set of places $V$ consisting of the places of $M$ lying above the places of $S$.
However, we get a better bound by applying Theorem 4.1.3, where one of the unknowns of the V -unit equation involved belongs to a finitely generated subgroup of M ∗ which is much smaller than the group of V -units in M. Proof of Theorems 9.6.1 and 9.6.2. We shall use some basic facts from Chapter 1 without any further mention. In view of the above remark we may assume that F (1, 0) = 0 holds in Theorem 9.6.1, too. Then, in the proof below of Theorem 9.6.1, one has to work with H1 = (n + 1)(H + n log n) + log(n + 1) instead of H . Further, we may assume that in both cases α1 , α2 , α3 are distinct zeros of F (X, 1) (in the latter case, not necessarily in K). For i = 1, 2, 3, let Li := K(αi ) with L1 = L, hLi , RLi the class number and regulator of Li , Ti with T1 = T the set of places of Li lying above those in S, OTi , OT∗i the ring of Ti -integers and the group of Ti -units in Li , and QL i = NLi (P) if S MK∞ and QLi = 1 if S = MK∞ , P where the product is taken over all prime ideals P from Ti and NLi (P) := |OLi /P| is the absolute norm of P. Let x, y be a solution of (9.6.3), and let a0 = F (1, 0). The number a0 αi is integral over OS , and so it is in OTi for i = 1, 2, 3. Thus a0 (x − αi y) is also in OTi , it divides a0n−1 F (x, y) and hence a0n−1 δ in OTi , i = 1, 2, 3. By Proposition 4.3.12 there is an εi in OT∗i such that, putting δi = εi a0 (x − αi y) and using the fact that NS (a0 ) ≤ dh(a0 ) ≤ dH , we have h(δi ) ≤ 1 log NTi (a0 (x − αi y)) + 300RLi [Li : Q] nd 2 1 log NS (δ) + 300RLi d for i = 1, 2, 3. nd ≤ (n − 1)H + =: Ai nd 2 nd + + h Li log QLi [Li : Q] hLi log QK d (9.6.6) Substituting x − αi y = δi /εi into the identity (α3 − α2 )(x − α1 y) + (α2 − α1 )(x − α3 y) + (α1 − α3 )(x − α2 y) = 0, we infer that τ ε2 ε2 + ρ = 1, ε1 ε3 (9.6.7) Downloaded from https:/www.cambridge.org/core. Cornell University Library, on 16 Jun 2017 at 04:34:11, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. 
https://doi.org/10.1017/CBO9781316160749.011 9.6 Effective results 261 where τ= α3 − α2 δ1 · , α3 − α1 δ2 ρ= α2 − α1 δ3 · . α3 − α1 δ2 (9.6.8) We shall give an upper bound for h(ε2 /ε1 ). First we must derive an upper bound for h(τ ) and h(ρ). The numbers a0 αi are zeros of the monic polynomial F (X) := a0n−1 F (X/a0 , 1). The maximum of the logarithmic heights of the coefficients of F is at most nH . Then Corollary 1.9.6 and (1.9.6) give h(a0 αi ) ≤ n2 H + n log 2 =: A4 for i = 1, 2, 3, whence, using (9.6.6) and (9.6.8), it follows that max(h(τ ), h(ρ)) < 4A4 + 2 log 2 + 2 max Ai =: A5 . 1≤i≤3 (9.6.9) We first prove Theorem 9.6.1 when α1 , α2 , α3 are in K. Then we must take H1 in place of H . Further, δi ∈ OS , εi ∈ OS∗ and, instead of (9.6.6), we get 1 log NS (δ) + 300RK d =: A6 for i = 1, 2, 3. h(δi ) < (n − 1)H1 + d 2 d + hK log QK d (9.6.10) In this case it follows as in (9.6.9) that max(h(τ ), h(ρ)) < 4A4 + 2 log 2 + 2A6 =: A7 with H1 instead of H in A4 . Since ε2 /ε1 , ε2 /ε3 are S-units in K, we can apply Corollary 4.1.5 to the S-unit equation (9.6.7) and we get h(ε2 /ε1 ) < 6.5c1 c2 (PK / log PK )A7 RS max(log(c1 PK ), log∗ (c2 RS )), where c1 = 11λs 2 (log∗ s)(16ed)3s+2 with λ = 12 if s = 2, λ = 1 if s ≥ 3, and √ c2 = ((s − 1)!)2 /(2s−2 d s−1 ). But we have m!em /mm ≤ e m for any integer m ≥ 1. Hence after some computation and simplification we obtain h(ε2 /ε1 ) < 1.9(64eds)2s+5 (PK / log PK )NK RS max(log PK , log∗ RS ) =: A8 , (9.6.11) where 1 hK log NS (δ) + d d RK + log QK . d d We now give an upper bound for h(x/y). Put κ := (x − α1 y)/(x − α2 y). Then κ = (δ1 /δ2 )(ε2 /ε1 ) and, by (9.6.10) and (9.6.11), we infer that h(κ) < 2A6 + A8 ≤ 1.1A8 . But x/y = (κα2 − α1 )/(κ − 1), hence we get h(x/y) < 3.3A8 . Finally, using y n F (x/y, 1) = δ, we get (9.6.4) for h(y). The bound for h(x) follows in the same way. NK := n5 H + Downloaded from https:/www.cambridge.org/core. 
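The passage from the three-term identity to the unit equation (9.6.7) can be checked on a small instance. The following Python sketch is only an illustration under simplifying assumptions that are not part of the proof: the zeros $\alpha_1,\alpha_2,\alpha_3$ are rational, $a_0=1$, and the units are chosen trivially as $\varepsilon_1=\varepsilon_2=\varepsilon_3=1$, so that $\delta_i=x-\alpha_iy$. The function names are hypothetical.

```python
from fractions import Fraction


def siegel_identity(a1, a2, a3, x, y):
    """Left-hand side of the identity used before (9.6.7); it is always 0."""
    return ((a3 - a2) * (x - a1 * y)
            + (a2 - a1) * (x - a3 * y)
            + (a1 - a3) * (x - a2 * y))


def unit_equation_terms(a1, a2, a3, x, y):
    """Return (tau, rho) as in (9.6.8), assuming eps_1 = eps_2 = eps_3 = 1.

    With this (illustrative) choice delta_i = x - a_i * y, and (9.6.7)
    reduces to tau + rho = 1.
    """
    a1, a2, a3, x, y = (Fraction(v) for v in (a1, a2, a3, x, y))
    if a3 == a1:
        raise ValueError("the zeros must be distinct")
    d1, d2, d3 = x - a1 * y, x - a2 * y, x - a3 * y
    if d2 == 0:
        raise ValueError("x - a2*y must be non-zero")
    tau = (a3 - a2) / (a3 - a1) * d1 / d2
    rho = (a2 - a1) / (a3 - a1) * d3 / d2
    return tau, rho


# Toy example: F(X, Y) = X (X - Y)(X + Y), zeros of F(X, 1) are 0, 1, -1,
# and (x, y) = (2, 1) solves F(x, y) = 6.
tau, rho = unit_equation_terms(0, 1, -1, 2, 1)
print(siegel_identity(0, 1, -1, 2, 1), tau, rho, tau + rho)
```

In the proof itself the units $\varepsilon_i$ are of course not trivial; the point of the sketch is only that dividing the identity by $(\alpha_3-\alpha_1)(x-\alpha_2y)$ yields a two-term equation with sum 1.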
Cornell University Library, on 16 Jun 2017 at 04:34:11, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.011 262 Decomposable form equations Next we prove Theorem 9.6.2. Then α1 is not contained in K and we may assume that α2 = σ (α1 ) with some K-isomorphism σ . We recall that M = K(α1 , α2 , α3 ). Let V be the set of places of M lying above the places of S, and OV , OV∗ the ring of V -integers and group of V -units in M. By Proposition 4.3.9 there exists in L a fundamental system {ξ1 , . . . , ξt−1 } of T -units such that t−1 h(ξj ) ≤ c3 RT , j =1 where t = |T | ≤ sn and c3 = ((t − 1)!)2 /2t−2 [L : Q]t−1 . Denote by the subgroup of OV∗ , generated by σ (ξ1 )/ξ1 , . . . , σ (ξt−1 )/ξt−1 . In this situation ε2 , δ2 above can be chosen so that ε2 = σ (ε1 ) and δ2 = σ (δ1 ). Then, in the equation (9.6.7), ε2 /ε1 ∈ and ε2 /ε3 ∈ OV∗ . We apply now Theorem 4.1.3 to the equation (9.6.7) under these conditions. Set := t−1 h(σ (ξj )/ξj ). j =1 Then by Theorem 4.1.3 we have h(ε2 /ε1 ) < 6.5c4 v(PM / log PM )A5 max(log(c4 vPM ), log∗ ), where v = |V | and c4 = 11λt(log∗ t)(16e[M : Q])3t+2 with λ = 12 if t = 1, λ = 1 if t ≥ 2. Further, [M : Q] ≤ dn(n − 1)(n − 2), v ≤ sn(n − 1)(n − 2), t ≤ sn, and it follows that ≤ 2t−1 t−1 h(ξj ) ≤ 2sn c3 RT . j =1 Using these inequalities and simplifying the bound so obtained for h(ε2 /ε1 ) we infer as above in the proof of Theorem 9.6.1 that h(ε2 /ε1 ) < 5.1(64edn2 s)2sn+4.5 PM NL RT (log∗ RT ), where 1 hL log NS (δ) + (nd)nd RL + log QK . d d Finally, we can derive the bound in (9.6.5) h(x) and h(y) as at the end of the proof of Theorem 9.6.1. NL = n2 H + Downloaded from https:/www.cambridge.org/core. Cornell University Library, on 16 Jun 2017 at 04:34:11, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. 
9.6.2 Decomposable form equations in an arbitrary number of unknowns

Let again $K$ be an algebraic number field, and $S$ a finite set of places of $K$ containing all infinite places. Consider now the general decomposable form equation
\[
F(x)=\delta \quad\text{in } x=(x_1,\dots,x_m)\in O_S^m \text{ with } l(x)\neq 0 \text{ for } l\in\mathcal{L}, \tag{9.6.1}
\]
where $O_S$ is the ring of $S$-integers of $K$, $\delta\in O_S\setminus\{0\}$, $F\in O_S[X_1,\dots,X_m]$ is a decomposable form of degree $n\geq 3$ and $\mathcal{L}$ is a finite set of non-zero linear forms from $K[X_1,\dots,X_m]$.

In this subsection we prove effective finiteness results for some important classes of equations of the form (9.6.1), including discriminant form equations and certain norm form equations. In the case of discriminant form equations and norm form equations in an arbitrary number of unknowns, the first effective results were established in Győry (1976) and Győry and Papp (1978), respectively.

The arguments of Section 9.3.2 show that equation (9.6.1) leads to systems of unit equations in some finite extension of $K$. There are no general effective results for unit equations in more than two unknowns, hence one cannot obtain for (9.6.1) effective theorems in full generality. However, it will be seen that if the linear factors of $F$ possess appropriate connectedness properties, then one can arrive at systems of unit equations consisting of equations in two unknowns in which the equations have similar connectedness properties. Then one can apply the effective results from Chapter 4 to the solutions of the arising unit equations, and using the connectedness properties of these equations, one can derive an effective upper bound for the heights of the solutions of (9.6.1). For simplicity, we shall give the bounds explicitly in terms of $S$ only. For completely explicit bounds, we shall refer to some original papers.

Extending the ground field $K$ if necessary, we may assume that in (9.6.1) $F$ factorizes into linear forms over $K$.
These linear factors of $F$ are uniquely determined over $K$ up to proportional factors from $K^*$. Fix a factorization of $F$ into linear forms $l_1,\dots,l_n$, and denote by $\mathcal{L}_0$ a maximal subset of pairwise linearly independent linear factors of $F$.

To obtain effective finiteness results on equation (9.6.1), we make some assumptions on $\mathcal{L}_0$. We denote by $\mathcal{G}(\mathcal{L}_0)$ the graph with vertex set $\mathcal{L}_0$ in which the edges are the unordered pairs $\{l,l'\}$, where $l$, $l'$ are distinct elements of $\mathcal{L}_0$ with the property that there exists a third linear form $l''\in\mathcal{L}_0$ that is a $K$-linear combination of $l$, $l'$. If $\mathcal{L}_0$ has at least three elements and $\mathcal{G}(\mathcal{L}_0)$ is connected, then $F$ is said to be triangularly connected. In this case one can reduce equation (9.6.1) to a so-called triangularly connected system of unit equations in two unknowns, and, as a consequence, can give an effective upper bound for the heights of the solutions of (9.6.1). The first effective result of this type was obtained in Győry and Papp (1978) for $S=M_K^{\infty}$, and in Győry (1978/1979, 1980a) for arbitrary $S$.

When $\mathcal{G}(\mathcal{L}_0)$ is not connected, let $\mathcal{L}_{01},\dots,\mathcal{L}_{0k}$ denote the vertex sets of the connected components of $\mathcal{G}(\mathcal{L}_0)$. If $k>1$, we introduce the graph $\mathcal{H}(\mathcal{L}_{01},\dots,\mathcal{L}_{0k})$ with vertex set $\{\mathcal{L}_{01},\dots,\mathcal{L}_{0k}\}$, in which the pair $\{\mathcal{L}_{0i},\mathcal{L}_{0j}\}$ is an edge if there exists a non-zero linear form $l_{ij}$ which can be expressed simultaneously as a $K$-linear combination of the forms in $\mathcal{L}_{0i}$ and in $\mathcal{L}_{0j}$. In this case $l_{ij}$ can be chosen so that the total number of non-zero terms in both representations
\[
l_{ij}=\sum_{l\in\mathcal{L}_{0i}}\lambda_l\cdot l=\sum_{l\in\mathcal{L}_{0j}}\lambda_l\cdot l
\]
is minimal. We pick for each edge $\{\mathcal{L}_{0i},\mathcal{L}_{0j}\}$ such an $l_{ij}$, and we denote by $\mathcal{L}$ the union of $\mathcal{L}_0$ and the set of the $l_{ij}$ so chosen.
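As an illustration of the definition of the graph $\mathcal{G}(\mathcal{L}_0)$, the following Python sketch builds its edge set and tests triangular connectedness in a toy setting. The assumptions are ours, not the text's: $K=\mathbb{Q}$, each linear form is given as its tuple of rational coefficients, and the input list is already pairwise linearly independent. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank of a list of rational vectors, by Gaussian elimination."""
    m = [[Fraction(c) for c in r] for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                f = m[i][col] / m[rk][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk


def triangle_graph(forms):
    """Edges {i, j} such that some third form lies in the span of forms i, j.

    Assumes the forms are pairwise linearly independent, so that
    rank([l_i, l_j, l_k]) == 2 means l_k is a combination of l_i and l_j.
    """
    edges = set()
    for i, j in combinations(range(len(forms)), 2):
        for k in range(len(forms)):
            if k not in (i, j) and rank([forms[i], forms[j], forms[k]]) == 2:
                edges.add((i, j))
                break
    return edges


def is_connected(n, edges):
    """Depth-first search connectivity test on vertices 0..n-1."""
    adj = {v: set() for v in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    seen, stack = {0}, [0]
    while stack:
        v = stack.pop()
        for w in adj[v] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == n


def is_triangularly_connected(forms):
    return len(forms) >= 3 and is_connected(len(forms), triangle_graph(forms))


# X1, X2, X1 + X2: every pair spans the third form, so the graph is a triangle.
print(is_triangularly_connected([(1, 0), (0, 1), (1, 1)]))
# X1, X2, X3: no linear relations among three forms, so there are no edges.
print(is_triangularly_connected([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))
```

The second example has $k=3$ components and no non-zero form common to the spans of two components, so neither connectedness condition holds for it.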
The following generalization was proved in Győry (1998) with an explicit but weaker upper bound in terms of $S$. The improvement in $S$ given below is due to the use of the recent Theorem 4.1.7, in which the upper bound is better in terms of $S$ than in the other effective results concerning $S$-unit equations.

In the formulation of the next theorem we keep the notation of Subsection 9.6.1. Namely, $s$ denotes the cardinality of $S$, $R_S$ the $S$-regulator of $K$, $P_K$ the maximal norm and $Q_K$ the product of the norms of the prime ideals in $S$ if $S\supsetneq M_K^{\infty}$, and $P_K=2$, $Q_K=1$ if $S=M_K^{\infty}$. Further, let $d$ and $D_K$ be the degree and discriminant of $K$, and $H$ an upper bound for the logarithmic heights of the coefficients of $F$.

Theorem 9.6.3 Let $F\in O_S[X_1,\dots,X_m]$ be a decomposable form of degree $n$ that factors into linear forms over $K$ and satisfies the following conditions:
(i) the set $\mathcal{L}_0$ has rank $m$,
(ii) either $k=1$ or $k>1$ and the graph $\mathcal{H}(\mathcal{L}_{01},\dots,\mathcal{L}_{0k})$ is connected.
Then every solution $x=(x_1,\dots,x_m)\in O_S^m$ of (9.6.1) with $l(x)\neq 0$ for all $l\in\mathcal{L}$ if $k>1$, satisfies
\[
\max_{1\leq i\leq m}h(x_i) < c_5^{s}P_K(\log^*Q_K)R_S, \tag{9.6.12}
\]
where $c_5$ is an effectively computable positive number which depends only on $d$, $D_K$, $H$, $m$, $n$ and $h(\delta)$.

The improved dependence on $S$ has applications in Corollaries 9.6.4 and 9.6.5; see also the Notes (Section 9.7). A completely explicit version of Theorem 9.6.3 can be found in Győry and Yu (2006).

We mention that Theorem 9.6.3 is also applicable if $F$ does not factor into linear forms over $K$ but over a finite extension $G$ of $K$, say: by applying the above theorem with $G$, $T$ instead of $K$, $S$, where $T$ is the set of places of $G$ lying above those in $S$, one obtains an upper bound for $h(x_i)$ ($i=1,\dots,m$) like (9.6.12) but with $K$, $S$ replaced by $G$, $T$. In Győry (1998), another effective result on (9.6.1) has been derived for decomposable forms $F$ satisfying conditions (i) and (ii) and with splitting field $G\supsetneq K$, which gives much better bounds if $G$ is large. This is important for applications to norm form equations and discriminant form equations, see Corollaries 9.6.6–9.6.8 below.

Remark Theorem 9.6.3 implies that under the assumptions (i) and (ii), equation (9.6.1) has only finitely many solutions, and all of them can be determined effectively, at least in principle. We note that the finiteness of the number of solutions in Theorem 9.6.3, and hence in Corollaries 9.6.6–9.6.8 below, follows already from Corollary 9.1.2 in the more general case as well, when $K$ is replaced by a finitely generated extension of $\mathbb{Q}$ and $O_S$ by a finitely generated subring $A$ of $K$ over $\mathbb{Z}$. More precisely, the finiteness condition

(i′) $\mathcal{L}\cap\bigl([\mathcal{L}_1]\cap[\mathcal{L}_0\setminus\mathcal{L}_1]\bigr)\neq\emptyset$ for every proper, non-empty subset $\mathcal{L}_1$ of $\mathcal{L}_0$, with $\mathcal{L}=\mathcal{L}_0$ if $k=1$,

of Corollary 9.1.2 is a consequence of the condition (ii) of Theorem 9.6.3. Indeed, let $\mathcal{L}_1$ be a proper, non-empty subset of $\mathcal{L}_0$. First consider the case when, in Theorem 9.6.3, $k=1$. Since $\mathcal{G}(\mathcal{L}_0)$ is connected, there are $l\in\mathcal{L}_1$ and $l'\in\mathcal{L}_0\setminus\mathcal{L}_1$ such that $l$, $l'$ are connected by an edge in $\mathcal{G}(\mathcal{L}_0)$, i.e., $\lambda l+\lambda'l'+\lambda''l''=0$ for some $l''\in\mathcal{L}_0$ and non-zero $\lambda,\lambda',\lambda''\in K$, which proves (i′). Next assume that $k>1$. If there is an $\mathcal{L}_{0i}$ with $1\leq i\leq k$ such that $\mathcal{L}_{0i}\cap\mathcal{L}_1\neq\emptyset$ and $\mathcal{L}_{0i}\cap(\mathcal{L}_0\setminus\mathcal{L}_1)\neq\emptyset$, then (i′) follows as in the case $k=1$. Suppose that any $\mathcal{L}_{0i}$ is either in $\mathcal{L}_1$ or in $\mathcal{L}_0\setminus\mathcal{L}_1$. Since by assumption $\mathcal{H}(\mathcal{L}_{01},\dots,\mathcal{L}_{0k})$ is connected, there is a pair $\mathcal{L}_{0i}$, $\mathcal{L}_{0j}$ which is an edge in $\mathcal{H}$ and $\mathcal{L}_{0i}$ is in $\mathcal{L}_1$ and $\mathcal{L}_{0j}$ in $\mathcal{L}_0\setminus\mathcal{L}_1$ or conversely. But there is a non-zero linear form $l_{ij}$ contained in $[\mathcal{L}_{0i}]$ and $[\mathcal{L}_{0j}]$ and hence in $[\mathcal{L}_1]$ and $[\mathcal{L}_0\setminus\mathcal{L}_1]$, which yields (i′).

We now present some consequences of Theorem 9.6.3.
We start with another version of Theorem 9.6.1 which gives a better bound for the solutions of equation (9.6.3) in terms of $S$. Consider again the equation
\[
F(x,y)=\delta \quad\text{in } x,y\in O_S, \tag{9.6.3}
\]
where $F\in O_S[X,Y]$ is a binary form of degree $n\geq 3$ which factorizes into linear factors over $K$ and at least three of these factors are pairwise non-proportional. Further, let $\delta\in O_S\setminus\{0\}$ and $H$ an upper bound for the logarithmic heights of the coefficients of $F$. It is easy to check that in this case $F$ satisfies the conditions (i), (ii) of Theorem 9.6.3 with $m=2$, $k=1$. Hence Theorem 9.6.3 implies the following.

Corollary 9.6.4 Under the above assumptions and notation, all solutions $x$, $y$ of (9.6.3) satisfy
\[
\max(h(x),h(y)) < c_6^{s}P_K(\log^*Q_K)R_S,
\]
where $c_6$ is an effectively computable positive number depending only on $d$, $D_K$, $H$, $n$ and $h(\delta)$.

For an explicit value of $c_6$, we refer to Győry and Yu (2006).

The following consequence of Theorem 9.6.3 provides some information about the arithmetical properties of decomposable forms at integral points with coordinates in $O_K$, where $O_K$ denotes the ring of integers of $K$. We denote by $\omega(\alpha)$ the number of distinct prime ideal divisors of $\alpha\in O_K\setminus\{0\}$, and by $P(\alpha)$ the greatest of the norms of these prime ideals, with the convention that $P(\alpha)=1$ if $\alpha\in O_K^*$.

Corollary 9.6.5 Let $F\in O_K[X_1,\dots,X_m]$ be a decomposable form as in Theorem 9.6.3, and let $N_0$ be a positive integer. Further, let $x=(x_1,\dots,x_m)\in O_K^m$ be such that
\[
N_K((x_1,\dots,x_m))\leq N_0,\quad F(x)\neq 0,\quad l(x)\neq 0 \text{ for } l\in\mathcal{L} \text{ if } k>1.
\]
Then
\[
P(\log P)^{\omega} > c_7(\log N)^{c_8}
\]
and
\[
P > \begin{cases} c_9(\log N)^{c_{10}} & \text{if } \omega\leq \log P/\log_2 P,\\ c_{11}(\log_2 N)(\log_3 N)/(\log_4 N) & \text{otherwise,} \end{cases}
\]
provided that $N=\max_{1\leq i\leq m}|N_{K/\mathbb{Q}}(x_i)|\geq N_1$, where $P=P(F(x))$ and $\omega=\omega(F(x))$. Here $c_7,\dots,c_{11}$ and $N_1$ are effectively computable positive numbers which depend at most on $K$, $F$ and $N_0$.

The deduction of this corollary from Theorem 9.6.3 is straightforward; for this we refer to Győry and Yu (2006).

An important special case of Corollary 9.6.5 is $m=2$, $k=1$, when $F$ is a binary form with splitting field $K$ and with at least three pairwise non-proportional linear factors. In this special case the corollary implies a similar result for polynomials $F(X)\in O_K[X]$.

Corollary 9.6.5 is a generalization and improvement of the corresponding results of Győry (1978/1979, 1981a), Haristoy (2003) and many earlier special lower estimates. It motivates the following.

Conjecture (Győry and Yu (2006)) Under the assumptions and notation of Corollary 9.6.5,
\[
P > c_{12}(\log N)^{c_{13}} \quad\text{if } N\geq N_1
\]
holds, where $c_{12}$, $c_{13}$ and $N_1$ are effectively computable positive numbers depending at most on $K$, $F$ and $N_0$.

Let now $L$ be an extension of $K$ of degree $n\geq 3$ and $\alpha_1=1,\alpha_2,\dots,\alpha_m$ $K$-linearly independent elements of $L$ with $m\geq 2$ which are integral over $O_S$. Consider the norm form equation
\[
N_{L/K}(\alpha_1x_1+\cdots+\alpha_mx_m)=\delta \quad\text{in } x_1,\dots,x_m\in O_S, \tag{9.6.13}
\]
where $\delta\in O_S\setminus\{0\}$. For $m=2$, this is a Thue equation over $O_S$.

Corollary 9.6.6 Suppose that $\alpha_m$ is of degree $\geq 3$ over $K(\alpha_1,\dots,\alpha_{m-1})$. Then all solutions $(x_1,\dots,x_m)$ of (9.6.13) with $x_m\neq 0$ satisfy
\[
\max_{1\leq i\leq m}h(x_i)\leq C_1,
\]
where $C_1$ is an effectively computable positive number which depends only on $K$, $L$, $S$, $m$, $n$, $\alpha_1,\dots,\alpha_m$ and $\delta$.
This implies the following.

Corollary 9.6.7 Suppose that $\alpha_{i+1}$ is of degree $\geq 3$ over $K(\alpha_1,\dots,\alpha_i)$ for $i=1,\dots,m-1$. Then every solution $(x_1,\dots,x_m)$ of (9.6.13) satisfies
\[
\max_{1\leq i\leq m}h(x_i)\leq C_2,
\]
where $C_2$ is an effectively computable positive number which depends only on $K$, $L$, $S$, $m$, $n$, $\alpha_1,\dots,\alpha_m$ and $\delta$.

For $S=M_K^{\infty}$, the first version of Corollary 9.6.7 was obtained in Győry and Papp (1978). In the case of arbitrary $S$, Corollaries 9.6.6 and 9.6.7 were first proved by Győry (1981a, 1981b) and independently by Kotov (1981). The best known, completely explicit upper bounds for the solutions of equation (9.6.13) are given in Bugeaud and Győry (1996b) and Győry (1998).

Remark In Corollaries 9.6.6, 9.6.7 and hence in Theorem 9.6.3, the respective assumptions $x_m\neq 0$ and $l(x)\neq 0$ for $l\in\mathcal{L}$ cannot be dropped, and the lower bound 3 for the degrees of $\alpha_i$ cannot be diminished in general. Indeed, let $\alpha\in\overline{\mathbb{Q}}$ be of degree $\geq 3$ over $L_1=\mathbb{Q}(\sqrt{2})$ and let $L_2=\mathbb{Q}(\sqrt{2},\alpha)$. Then the equations
\[
N_{L_1/\mathbb{Q}}(x_1+\sqrt{2}x_2)=\pm 1 \quad\text{and}\quad N_{L_2/\mathbb{Q}}(x_1+\sqrt{2}x_2+\alpha x_3)=\pm 1
\]
have infinitely many integral solutions $(x_1,x_2,x_3)$ with $x_3=0$.

Let again $L$ be an extension of degree $n\geq 3$ of $K$ and $1,\alpha_1,\dots,\alpha_m$ $K$-linearly independent elements of $L$, integral over $O_S$, such that $L=K(\alpha_1,\dots,\alpha_m)$, and let $l=X_0+\alpha_1X_1+\cdots+\alpha_mX_m$. Denote by $\sigma_1,\dots,\sigma_n$ the $K$-isomorphic embeddings of $L$ in $\overline{\mathbb{Q}}$, and define $l^{(i)}:=X_0+\sigma_i(\alpha_1)X_1+\cdots+\sigma_i(\alpha_m)X_m$ for $i=1,\dots,n$. Put
\[
D_{L/K}(\alpha_1X_1+\cdots+\alpha_mX_m):=\prod_{1\leq i<j\leq n}\bigl(l^{(i)}-l^{(j)}\bigr)^2.
\]
This is a decomposable form in $O_S[X_1,\dots,X_m]$ of degree $n(n-1)$, independent of $X_0$. It is called a discriminant form.
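To make the definition of a discriminant form concrete, here is a small numerical sketch in Python. It is an illustration under our own choice of example, not one taken from the text: $K=\mathbb{Q}$, $L=\mathbb{Q}(\theta)$ with $\theta=\sqrt[3]{2}$, $\alpha_1=\theta$, $\alpha_2=\theta^2$, so $n=3$ and $m=2$. A direct computation with the basis $1,\theta,\theta^2$ gives $D_{L/\mathbb{Q}}(\theta X_1+\theta^2X_2)=-108\,(X_1^3-2X_2^3)^2$, and the code compares this closed form with the product over the conjugates, evaluated in floating-point arithmetic. The function name is hypothetical.

```python
import cmath
from itertools import combinations


def discriminant_form_cuberoot2(x1, x2):
    """Evaluate prod_{i<j} (l^(i) - l^(j))^2 for l = X0 + theta*X1 + theta^2*X2.

    theta = 2^(1/3); the three embeddings send theta to theta * omega^k,
    where omega is a primitive cube root of unity. X0 cancels in the
    differences, so it is omitted. The result is a complex float whose
    imaginary part is only rounding noise.
    """
    theta = 2 ** (1 / 3)
    omega = cmath.exp(2j * cmath.pi / 3)
    conjugates = [theta * omega ** k for k in range(3)]
    values = [x1 * c + x2 * c ** 2 for c in conjugates]
    product = 1
    for a, b in combinations(values, 2):
        product *= (a - b) ** 2
    return product


def closed_form(x1, x2):
    """Closed form -108 * (x1^3 - 2*x2^3)^2 for this particular example."""
    return -108 * (x1 ** 3 - 2 * x2 ** 3) ** 2


for point in [(1, 0), (0, 1), (1, 1), (2, 1)]:
    numeric = discriminant_form_cuberoot2(*point)
    print(point, round(numeric.real), closed_form(*point))
```

In this example the form has degree $n(n-1)=6$, and the factor $X_1^3-2X_2^3$ is the index form of the basis $1,\theta,\theta^2$.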
Consider now the discriminant form equation
\[
D_{L/K}(\alpha_1x_1+\cdots+\alpha_mx_m)=\delta \quad\text{in } (x_1,\dots,x_m)\in O_S^m, \tag{9.6.14}
\]
where $\delta\in O_S\setminus\{0\}$.

Corollary 9.6.8 Under the above assumptions, all solutions $(x_1,\dots,x_m)$ of (9.6.14) satisfy
\[
\max_{1\leq i\leq m}h(x_i) < C_3,
\]
where $C_3$ is an effectively computable positive number which depends only on $K$, $S$, $m$, $n$, $\alpha_1,\dots,\alpha_m$ and $\delta$.

For $K=\mathbb{Q}$, $S=\{\infty\}$, the first version of this corollary was proved in Győry (1976), and for arbitrary $K$ and $S$, in Győry and Papp (1977) and Győry (1981b). The best known, explicit version of Corollary 9.6.8 can be found in Győry (1998).

Corollary 9.6.8 and its other versions have several applications, among others to index form equations, algebraic integers of given discriminant or of given index and power integral bases. Such results are treated in detail in our next book on discriminant equations. Some related results are also briefly discussed in the Notes (Section 9.7) and in Section 10.6.

We note that from Theorem 9.6.3 one could easily deduce in Corollaries 9.6.6–9.6.8 explicit bounds in terms of $S$. Moreover, combining the explicit version of Theorem 9.6.3 from Győry and Yu (2006) with some arguments from Győry (1998), one can give completely explicit versions of Corollaries 9.6.6–9.6.8 with slightly better upper bounds than those in Bugeaud and Győry (1996b) and Győry (1998).

Finally, we observe that Corollary 9.6.5 is in particular applicable in the case that $F$ is a discriminant form, or a norm form as in Corollaries 9.6.6 and 9.6.7.

We give only a sketch of the proof of Theorem 9.6.3. For a detailed proof we refer to Győry and Yu (2006).

Proof of Theorem 9.6.3 (sketch). We keep the above notation of Subsection 9.6.2.
We shall denote by $c_{14},c_{15},\dots,c_{44}$ effectively computable positive numbers which depend at most on $d$, the class number $h_K$ and regulator $R_K$ of $K$, and on $H$, $m$, $n$ and $h(\delta)$. But by (1.5.2) and (1.5.3), $h_K$ and $R_K$ can be estimated from above in terms of $d$ and the discriminant $D_K$ of $K$. Hence we can replace the dependence on $h_K$ and $R_K$ by $D_K$.

We make some preliminary remarks. We show in two steps that equation (9.6.1) can be written in the form
\[
l_1(x)\cdots l_n(x)=\delta \quad\text{in } x\in O_S^m \text{ with } l(x)\neq 0 \text{ for } l\in\mathcal{L}, \tag{9.6.15}
\]
where, up to a proportional factor, $l_1\cdots l_n$ is a factorization of $F$ into linear forms in $X_1,\dots,X_m$ with coefficients in $O_K$, the logarithmic heights of the coefficients of $l_1,\dots,l_n$ do not exceed $c_{14}$ and the new $\delta\in O_S\setminus\{0\}$ has height $h(\delta)\leq c_{15}\log^*Q_K$.

First we recall that as in (9.4.4), $F$ can be written as $cl_1\cdots l_n$, where $c\in O_S$, $c\neq 0$ and
\[
l_i = X_{n_i}+\alpha_{n_i+1,i}X_{n_i+1}+\cdots+\alpha_{mi}X_m
\]
with $\alpha_{ji}\in K$ for $i=1,\dots,n$, $j\in\{n_i+1,\dots,m\}$. Then by (1.9.5) and (1.9.6) we have $h^{\mathrm{hom}}(F)\leq c_{16}$ and, by Corollary 1.9.5,
\[
h(l_i)=h^{\mathrm{hom}}(l_i)\leq c_{17} \quad\text{for } i=1,\dots,n.
\]
Thus the maximum of the logarithmic heights of the coefficients of $l_i$ is at most $c_{18}$. Since $c$ is a coefficient of $F$, we have $h(c)\leq c_{19}$, which implies that the coefficients of $cl_1$ have logarithmic heights at most $c_{20}$.

In the next step we multiply $cl_1,l_2,\dots,l_n$ and $\delta$ by the product of the denominators of the coefficients of $cl_1,l_2,\dots,l_n$. Then the logarithmic heights of the new $\delta$ and the coefficients of the new linear factors, for simplicity denoted again by $l_1,\dots,l_n$, are at most $c_{21}$. Therefore our claim is proved.

Let now $x\in O_S^m$ be a solution of equation (9.6.15) with $l(x)\neq 0$ for $l\in\mathcal{L}$ if $k>1$, and write
\[
l_i(x)=\delta_i, \quad i=1,\dots,n. \tag{9.6.16}
\]
Then $\delta_i$ is a divisor of $\delta$ in $O_S$ and so, by (1.9.2) and the above upper bound for $h(\delta)$, we have
\[
\log N_S(\delta_i)\leq\log N_S(\delta)\leq c_{22}h(\delta)\leq c_{23}.
\]
By Proposition 4.3.12 there is an $\varepsilon_i \in O_S^*$ such that

$$h(\delta_i/\varepsilon_i) \le c_{24} \log^* Q_K, \quad i = 1, \ldots, n. \tag{9.6.17}$$

Let $\mathcal{L}_0$ be a maximal subset of pairwise linearly independent linear forms in the set of new linear forms $l_1, \ldots, l_n$. Then the new $\mathcal{L}_0$ and its associated graph $\mathcal{G}(\mathcal{L}_0)$ also satisfy the assumptions (i) and (ii) of the theorem. Let $\mathcal{L}_{01}, \ldots, \mathcal{L}_{0k}$ denote the vertex sets of the connected components of $\mathcal{G}(\mathcal{L}_0)$.

First assume that $k = 1$. Then by assumption (i), $\mathcal{G}(\mathcal{L}_0)$ is of order at least 3. If $\{l_i, l_j\}$ is an edge in $\mathcal{G}(\mathcal{L}_0)$, then

$$\lambda_i l_i + \lambda_j l_j + \lambda l = 0$$

for some $l \in \mathcal{L}_0$ and some non-zero $\lambda_i, \lambda_j, \lambda$ in $K$ with logarithmic heights not exceeding $c_{25}$. Together with (9.6.17) this yields an $S$-unit equation

$$\tau_i \varepsilon_i + \tau_j \varepsilon_j + \tau \varepsilon = 0 \quad \text{in } \varepsilon_i, \varepsilon_j, \varepsilon \in O_S^*, \tag{9.6.18}$$

where the coefficients $\tau_i, \tau_j, \tau$ are non-zero elements of $K$ with logarithmic height $\le c_{25} \log^* Q_K$. Now applying Theorem 4.1.7 to equation (9.6.18), we infer that

$$\max(h(\varepsilon_i/\varepsilon), h(\varepsilon_j/\varepsilon)) \le c_{26}^{s}\, P_K (\log^* Q_K) R_S$$

and so, by (9.6.16) and (9.6.17),

$$\max(h(\delta_i/\varepsilon), h(\delta_j/\varepsilon)) \le c_{27}^{s}\, P_K (\log^* Q_K) R_S =: A. \tag{9.6.19}$$

If now $\{l_j, l_q\}$ is an edge in $\mathcal{G}(\mathcal{L}_0)$, then we deduce in the same way that there is an $\varepsilon' \in O_S^*$ such that

$$\max(h(\delta_j/\varepsilon'), h(\delta_q/\varepsilon')) \le A.$$

Together with (9.6.19) this implies $h(\varepsilon'/\varepsilon) \le 2A$, whence $h(\delta_q/\varepsilon) \le 3A$. Using the assumption that $\mathcal{G}(\mathcal{L}_0)$ is connected and repeating the above procedure along the shortest path connecting two vertices, we infer that $h(\delta_i/\varepsilon) \le c_{28} A$ for each $i$ with $l_i \in \mathcal{L}_0$. Further, if $l_i \in \mathcal{L} \setminus \mathcal{L}_0$ is proportional to a linear form $l_{i'} \in \mathcal{L}_0$, then $l_i = \rho\, l_{i'}$ with some non-zero $\rho \in K$ with $h(\rho) \le c_{29}$, hence

$$h(\delta_i/\varepsilon) \le c_{30} A \quad \text{for } i = 1, \ldots, n.$$

Together with (9.6.15) this gives $h(\delta/\varepsilon^n) \le c_{31} A$.
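The propagation step in the case $k = 1$ above uses only two standard properties of the absolute logarithmic height, namely $h(\alpha\beta) \le h(\alpha) + h(\beta)$ and $h(\alpha^{-1}) = h(\alpha)$ for non-zero $\alpha, \beta$. As a sketch, with $\varepsilon$ attached to the edge $\{l_i, l_j\}$ and $\varepsilon'$ attached to a second edge through $l_j$ with other endpoint $l_q$, the two estimates can be written out as follows.

```latex
\begin{align*}
h(\varepsilon'/\varepsilon)
  &= h\bigl((\delta_j/\varepsilon)\cdot(\delta_j/\varepsilon')^{-1}\bigr)
   \le h(\delta_j/\varepsilon) + h(\delta_j/\varepsilon') \le 2A, \\
h(\delta_q/\varepsilon)
  &= h\bigl((\delta_q/\varepsilon')\cdot(\varepsilon'/\varepsilon)\bigr)
   \le h(\delta_q/\varepsilon') + h(\varepsilon'/\varepsilon) \le 3A.
\end{align*}
```

Each further edge along a path costs at most $2A$ more in the same way, so a path with $r$ edges yields a bound of $(2r-1)A$; since a shortest path in a graph on at most $n$ vertices has at most $n-1$ edges, this is consistent with a bound of the form $c_{28} A$.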
Thus $h(\varepsilon) \le c_{32} A$, and so $h(\delta_i) \le c_{33} A$ for $i = 1, \ldots, n$. Considering (9.6.16) as a system of linear equations in $x = (x_1, \ldots, x_m)$ and using the assumption (i), we infer by Cramer's Rule that

$$h(x_t) \le c_{34} A \quad \text{for } t = 1, \ldots, n. \tag{9.6.20}$$

Next consider the case when $k > 1$ and the graph $\mathcal{H}(\mathcal{L}_{01}, \ldots, \mathcal{L}_{0k})$ is connected. For $j = 1, \ldots, k$, let $J_j$ denote the set of indices $i$ with $l_i \in \mathcal{L}_{0j}$. We may assume without loss of generality that $\{\mathcal{L}_{01}, \mathcal{L}_{02}\}$ is an edge in this graph. Then by assumption there is a non-zero $l_{1,2} \in \mathcal{L}$ which can be represented in the form

$$\sum_{i \in J_1} \lambda_i l_i = \sum_{i \in J_2} \lambda_i l_i \tag{9.6.21}$$

such that the total number of non-zero $\lambda_i \in K$ in both sides of (9.6.21) is minimal. Then, up to a proportional factor, these $\lambda_i$ provide a uniquely determined solution of (9.6.21) as a system of linear equations in $\lambda_i$ with $i \in J_1 \cup J_2$. One can prove that there is a non-zero $\lambda_{1,2}$ in $K$ such that $\lambda_{1,2} l_{1,2}$ can be expressed in the form (9.6.21) with non-zero $\lambda_i \in K$ for which $h(\lambda_i) \le c_{35}$.

As was seen above in the case $k = 1$, we have

$$h(\delta_i/\varepsilon_1) \le c_{36} A \ \text{ for } i \in J_1 \quad \text{and} \quad h(\delta_i/\varepsilon_2) \le c_{37} A \ \text{ for } i \in J_2 \tag{9.6.22}$$

with some $\varepsilon_1, \varepsilon_2 \in O_S^*$. By (9.6.17), this also holds if $J_1$ or $J_2$ consists of a single element. For the solution $x$ considered above we deduce from (9.6.21) and (9.6.22) that

$$h(\lambda_{1,2}\, l_{1,2}(x)/\varepsilon_q) \le c_{38} A \quad \text{for } q = 1, 2.$$

But $l_{1,2}(x) \neq 0$, hence it follows that $h(\varepsilon_2/\varepsilon_1) \le c_{39} A$, whence, by (9.6.22),

$$h(\delta_i/\varepsilon_1) \le c_{40} A \quad \text{for } i \in J_1 \cup J_2.$$

Using the fact that the graph $\mathcal{H}(\mathcal{L}_{01}, \ldots, \mathcal{L}_{0k})$ is connected and repeating this process along the shortest path connecting two vertices, we infer that $h(\delta_i/\varepsilon_1) \le c_{41} A$ for each $i$ in $J_1 \cup \cdots \cup J_k$.
It follows as above in the case $k = 1$ that $h(\delta_i/\varepsilon_1) \le c_{42} A$ and so, in view of (9.6.15), $h(\delta_i) \le c_{43} A$ for $i = 1, \ldots, n$. We now infer as in the case $k = 1$ that (9.6.20) holds with a $c_{44}$ in place of $c_{34}$ for $t = 1, \ldots, n$, whence (9.6.12) follows.

Proof of Corollary 9.6.6. Put $M = K(\alpha_1, \ldots, \alpha_m)$, and denote by $\mathcal{L}_0$ the set of the conjugates of the linear form $l = \alpha_1 X_1 + \cdots + \alpha_m X_m$ with respect to $M/K$. By assumption $\alpha_1 = 1$, hence the forms in $\mathcal{L}_0$ are pairwise non-proportional. They form a maximal subset of such forms in the set of linear factors of $N_{L/K}(l)$. Partition the linear forms in $\mathcal{L}_0$ into subsets so that $l'$, $l''$ belong to the same subset if the coefficients of $X_1, \ldots, X_{m-1}$ in $l'$, $l''$ coincide. Then we get a partition $\mathcal{L}_{01}, \ldots, \mathcal{L}_{0k}$ with $k$ denoting the degree of $K(\alpha_1, \ldots, \alpha_{m-1})$ over $K$, and it is easily seen that each of the graphs $\mathcal{G}(\mathcal{L}_{01}), \ldots, \mathcal{G}(\mathcal{L}_{0k})$ defined above is connected. Further, $\mathcal{L}_0$ has the properties (i), (ii) from Theorem 9.6.3 with $\mathcal{L} = \mathcal{L}_0 \cup \{X_m\}$.

Considering now equation (9.6.13) over the normal closure, say $G$, of $L$ over $K$, Theorem 9.6.3 applies to equation (9.6.13) and gives an effective upper bound for the solutions in terms of $H$, $m$, $n$, $h(\delta)$, the degree $g$ and discriminant $D_G$ of $G$, and the parameters involved of $S_G$, that is, the set of places of $G$ lying above those of $S$. But $H$ can be effectively bounded in terms of $m$, $n$ and $h(\alpha_1), \ldots, h(\alpha_m)$. Further, using explicit estimates from Sections 1.4, 1.5 and 1.8, $g$, $|D_G|$ and the parameters mentioned can be effectively estimated from above in terms of $S$ and the degrees and discriminants of $K$ and $L$. This completes the proof.

Proof of Corollary 9.6.7. Let (x1 , . . . 
, xm ) be a solution of (9.6.13), and denote by $m'$ the greatest integer with $x_{m'} \neq 0$. If $m' \ge 2$, Corollary 9.6.6 applies with $m'$ instead of $m$, while for $m' = 1$ the assertion is trivial.

Proof of Corollary 9.6.8. Using the notation and assumptions of the corollary, $L = K(\alpha_1, \ldots, \alpha_m)$ implies that the linear forms $l^{(1)}, \ldots, l^{(n)}$ are pairwise non-proportional. Further, it follows from the linear independence of $1, \alpha_1, \ldots, \alpha_m$ over $K$ that there are indices $i_1, \ldots, i_{m+1}$ such that

$$\operatorname{rank}\{l^{(i_1)}, \ldots, l^{(i_{m+1})}\} = m + 1.$$

Notice that the linear forms $l_{ij} := l^{(i)} - l^{(j)}$ $(1 \le i, j \le n)$ depend only on $X_1, \ldots, X_m$, and that

$$D_{L/K}(\alpha_1 X_1 + \cdots + \alpha_m X_m) = (-1)^{n(n-1)/2} \prod_{\substack{1 \le i, j \le n \\ i \neq j}} l_{ij}.$$

Further,

$$\operatorname{rank}\{l_{i_1, i_{m+1}}, \ldots, l_{i_m, i_{m+1}}\} = m.$$

This means that $\operatorname{rank} \mathcal{L}_0 = m$, where $\mathcal{L}_0$ denotes a maximal set of pairwise non-proportional linear factors of the left-hand side. For distinct $u, v, w \in \{1, \ldots, n\}$ we have

$$l_{uv} + l_{vw} + l_{wu} = 0.$$

It is easy to check that the graph $\mathcal{G}(\mathcal{L}_0)$ is connected, and hence Theorem 9.6.3 combined with some arguments from the end of the proof of Corollary 9.6.6 yields Corollary 9.6.8.

9.7 Notes

We make some historical notes and mention some refinements, applications and generalizations of the results presented in this chapter.

• There are many papers on effective results for decomposable form equations. The first effective upper bound for the solutions of Thue equations over $\mathbb{Z}$ was established by Baker (1968b) by means of his effective estimates for linear forms in logarithms. In the case of discriminant form and index form equations, the first effective bounds for the solutions were given in Győry (1976), and for the case of certain norm form and decomposable form equations, in Győry and Papp (1978). Their proofs also involved Baker's method but via Győry's effective results on unit equations in two unknowns.
Later, various effective results with explicit bounds and generalizations were obtained for the equations mentioned; for results and references, see the books and survey papers Győry (1980b, 2002), Shorey and Tijdeman (1986), Evertse, Győry, Stewart and Tijdeman (1988b), Evertse and Győry (1988d), Sprindžuk (1993) and Feldman and Nesterenko (1998). Practical algorithms for solving concrete equations of these types were also worked out; see de Weger (1989), Tzanakis and de Weger (1989), Bilu and Hanrot (1996, 1999), Smart (1998), Gaál (2002), our book on discriminant equations and the references given there. All these results were established by Baker's method, many of them via unit equations.

As was mentioned in Subsection 9.6.1, another method was developed and used in Bombieri (1993), Bombieri and Cohen (1997, 2003) and Bugeaud (1998) to obtain effective bounds for the solutions of Thue equations.

For the solutions of decomposable form equations, the best effective upper bounds to date are given in Bugeaud and Győry (1996b), Bugeaud (1998), Győry (1998), Győry and Yu (2006) and in Section 9.6 above.

The effective results concerning decomposable form equations have many applications in Diophantine number theory and algebraic number theory. Several such applications are treated in detail in our book on discriminant equations.

• Thue equations have many applications. We present here a classical application. Let $f \in \mathbb{Z}[X]$ be a non-linear polynomial of degree $n$ and $m$ a given integer $\ge 2$, and consider the equation

$$f(x) = y^m \quad \text{in } x, y \in \mathbb{Z}. \tag{9.7.1}$$

An important special case is Mordell's equation $x^3 + k = y^2$, where $k$ is a non-zero integer.
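To make Mordell's equation concrete, the following minimal sketch enumerates its solutions in a bounded range. The function name `mordell_solutions` and the search bound are assumptions chosen for illustration; a bounded search only lists solutions with $|x|$ up to the bound and does not by itself prove that the list is complete.

```python
from math import isqrt


def mordell_solutions(k, x_bound):
    """Return all integer pairs (x, y) with x**3 + k == y**2 and |x| <= x_bound.

    Hypothetical helper for illustration: a plain exhaustive search over x.
    """
    solutions = []
    for x in range(-x_bound, x_bound + 1):
        rhs = x**3 + k
        if rhs < 0:
            continue  # y**2 cannot be negative
        y = isqrt(rhs)  # exact integer square root, no floating point
        if y * y == rhs:
            solutions.append((x, y))
            if y != 0:
                solutions.append((x, -y))
    return sorted(solutions)


if __name__ == "__main__":
    # Solutions of x^3 - 2 = y^2 with |x| <= 100.
    print(mordell_solutions(-2, 100))
```

An effective upper bound for the solutions turns such a search into a finite procedure in principle, although the bounds obtained from Baker's method are typically far too large for naive enumeration; the practical algorithms cited in these Notes add reduction steps.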
Equation (9.7.1) is called an elliptic equation if $m = 2$, $\deg f = 3$, a hyperelliptic equation if $m = 2$ and $\deg f \ge 3$, and a superelliptic equation if $m \ge 3$ and $\deg f \ge 2$. The example of the Pell equation $dx^2 + 1 = y^2$ shows that (9.7.1) may have infinitely many solutions if $m = 2$ and $\deg f = 2$. Mordell (1922b, 1923) in the elliptic case and later Siegel (1926) in the case that $f$ has degree $\ge 3$ and no multiple zeros, proved that (9.7.1) has only finitely many solutions. LeVeque (1964) gave a general finiteness criterion for equation (9.7.1). Their proofs are based on Thue's and Siegel's ineffective finiteness theorems on Thue equations over $\mathbb{Q}$ resp. over number fields, hence they are also ineffective.

Baker (1968b, 1968c, 1969) was the first to give effective upper bounds for the solutions of (9.7.1) in the case when $f$ has at least 3 simple zeros if $m = 2$ and at least 2 simple zeros if $m \ge 3$. We sketch the main steps of his proof. Assume, for simplicity, that $f$ is monic, and that $\alpha_1$, $\alpha_2$ and, in the case $m = 2$, $\alpha_3$ are simple zeros of $f$. Put $K_i = \mathbb{Q}(\alpha_i)$ for $i = 1, 2, 3$. If $(x, y)$ is a solution of (9.7.1), then following Siegel's argument one deduces that

$$x - \alpha_i = \beta_i \sigma_i^m,$$

where $\beta_i$ is a non-zero element of $K_i$ with bounded height, and $\sigma_i$ is an unknown integer in $K_i$ for $i = 1, 2, 3$. This implies that

$$\beta_1 \sigma_1^m - \beta_2 \sigma_2^m = \alpha_2 - \alpha_1.$$

For $m \ge 3$, this is a Thue equation over $K_1 K_2$. In this case, Baker applied his effective result concerning Thue equations over number fields to give an effective upper bound for the heights of $\sigma_1$, $\sigma_2$ and thereby for $x$ and $y$. If $m = 2$, we have a system of three equations

$$\beta_i \sigma_i^2 - \beta_j \sigma_j^2 = \alpha_j - \alpha_i \quad (1 \le i < j \le 3).$$
Following Siegel (1926), Baker reduced this system to a single Thue equation over an appropriate finite extension of $K_1 K_2 K_3$ and applied his effective result on Thue equations to the latter. We mention that alternatively one can combine the above system into a single equation over $K_1 K_2 K_3$ in three unknowns $\sigma_1, \sigma_2, \sigma_3$,

$$\prod_{1 \le i < j \le 3} (\beta_i \sigma_i^2 - \beta_j \sigma_j^2) = \prod_{1 \le i < j \le 3} (\alpha_j - \alpha_i),$$

where the left-hand side is a triangularly connected decomposable form in $\sigma_1, \sigma_2, \sigma_3$, whose linear factors form a system of rank 3, and then apply Theorem 9.6.3. We note that using here Theorem 9.6.1 or 9.6.2, or the explicit version of Theorem 9.6.3 from Győry and Yu (2006), one can get better bounds for the solutions $x, y$ of equation (9.7.1).

Quantitative improvements and generalizations of Baker's theorems were later obtained by many authors, including Brindza (1984), who gave an effective upper bound for the solutions $x, y$ of (9.7.1) under LeVeque's general criterion. For practical methods for complete resolution of elliptic and superelliptic equations, we refer to Gebel, Pethő and Zimmer (1994), Stroeker and Tzanakis (1994), Bilu and Hanrot (1998) and Tzanakis (2013). All these results and methods are based on the theory of logarithmic forms or its elliptic analogue.

Recently, Bérczes, Evertse and Győry (2014) proved an effective finiteness result for hyper- and superelliptic equations over finitely generated domains.

There are a couple of results on the number of solutions of hyper- and superelliptic equations. We recall without proof the following special case of a result of Evertse and Silverman (1986). Its proof takes as starting point Evertse's quantitative results on $S$-unit equations in two unknowns and the Thue–Mahler equation (Evertse (1984a)) and follows the same lines as the argument sketched above.

Let $m$ be an integer $\ge 2$ and $f \in \mathbb{Z}[X]$ a polynomial of degree $n$ and discriminant $D(f) \neq 0$. Let $\omega(D(f))$ denote the number of primes dividing $D(f)$.
Assume that n ≥ 3 if m = 2 and n ≥ 2 if m ≥ 3. Let K be a number field, containing three zeros of f if m = 2 and two zeros of f if m ≥ 3, and denote by hm (K) the number of ideal classes of OK whose Downloaded from https:/www.cambridge.org/core. Cornell University Library, on 16 Jun 2017 at 04:34:11, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.011 9.7 Notes 275 m-th power is the principal ideal class. Then the number of solutions of f (x) = y m is at most 3 717n (ω(D(f ))+1) 13 in x, y ∈ Z h2 (K)2 2 n2 (ω(D(f ))+1) (17 m ) (9.7.1) if m = 2, hm (K) if m ≥ 3. A folklore conjecture asserts that hm (K) m,[K:Q],ε |DK |ε for every ε > 0, where DK denotes the discriminant of K and the implied constant depends only on m, [K : Q] and ε. By elementary estimates, one can estimate |DK | from above by a power of |D(f )|. This leads to the conjecture that equation (9.7.1) has m,n,ε |D(f )|ε solutions, for every ε > 0. r The simplest discriminant equation is (xi − xj )2 ∈ A∗ in x = (x1 , . . . , xm ) ∈ Am , Dm (x) = (9.7.2) 1≤i<j ≤m where A is a subring of a field K of characteristic 0 which is integrally closed and finitely generated over Z. The form (Xi − Xj )2 , Dm := 1≤i<j ≤m called a decomposable form of discriminant type, is just the discriminant of the polynomial f (X) = (X − X1 ) · · · (X − Xm ). Two solutions x = (x1 , . . . , xm ), x = (x1 , . . . , xm ) of (9.7.2) are called A-equivalent if there are u ∈ A∗ , a ∈ A such that xi = uxi + a for i = 1, . . . , m. It is easily seen that the decomposable form Dm is triangularly connected. Hence, in the number field case when A = OS , the ring of S-integers of a number field K, Theorem 9.6.3 gives that the set of solution of (9.7.2) is the union of finitely many OS -equivalence classes of solutions which can be effectively determined. 
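The notion of $A$-equivalence above rests on the identity $D_m(ux + a) = u^{m(m-1)} D_m(x)$, so that for a unit $u$ the value of $D_m$ changes only by a unit factor. The following minimal sketch checks this over $A = \mathbb{Z}$, where the units are $\pm 1$; the helper names `discriminant_form` and `transform` are illustrative assumptions, not notation from the text.

```python
from itertools import combinations
from math import prod


def discriminant_form(x):
    """D_m(x): the product of (x_i - x_j)**2 over all pairs i < j."""
    return prod((a - b) ** 2 for a, b in combinations(x, 2))


def transform(x, u, a):
    """Apply x_i -> u * x_i + a to every coordinate of x."""
    return tuple(u * xi + a for xi in x)


if __name__ == "__main__":
    x = (0, 1)                    # D_2(x) = 1, a unit of Z
    x_equiv = transform(x, -1, 5)  # a Z-equivalent solution
    print(x_equiv, discriminant_form(x_equiv))

    # For a general integer factor u the value scales by u**(m*(m-1)).
    y = (1, 4, 6)
    m = len(y)
    print(discriminant_form(transform(y, 3, 7)) == 3 ** (m * (m - 1)) * discriminant_form(y))
```

Since $m(m-1)$ is even, $u = -1$ leaves $D_m$ unchanged over $\mathbb{Z}$, which is why solutions are naturally counted up to equivalence rather than individually.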
A generalization for the finitely generated case and other related results will be treated in detail in our next book on discriminant equations.

• For applications, discriminant form equations and index form equations belong to the most important classes of decomposable form equations. For a detailed treatment of these equations and their applications we refer again to our next book on discriminant equations. We mention here some basic facts only about these equations. See also Section 10.6.

Let $K$ be a field of characteristic 0, $L$ an extension of $K$ of degree $n \ge 2$, $A$ a domain with quotient field $K$ which is integrally closed in $K$, and $\mathcal{O}$ an $A$-order, that is, a subring of $L$ containing $A$ that as an $A$-module is free of rank $n$. Let $\{\omega_1 = 1, \omega_2, \ldots, \omega_n\}$ be an $A$-module basis of $\mathcal{O}$. Define the linear form

$$l := X_1 + \omega_2 X_2 + \cdots + \omega_n X_n,$$

let $l^{(1)} = l, l^{(2)}, \ldots, l^{(n)}$ be the conjugates of $l$ over $K$, and define the discriminant form $D_{L/K}(\omega_2 X_2 + \cdots + \omega_n X_n)$ as in Section 9.6, i.e.,

$$\prod_{1 \le i < j \le n} (l^{(i)} - l^{(j)})^2.$$

Then

$$D_{L/K}(\omega_2 X_2 + \cdots + \omega_n X_n) = I(\omega_2 X_2 + \cdots + \omega_n X_n)^2 \cdot \Delta,$$

where $I = I(\omega_2 X_2 + \cdots + \omega_n X_n)$ is a decomposable form in $A[X_2, \ldots, X_n]$ of degree $n(n-1)/2$ and $\Delta = D_{L/K}(1, \omega_2, \ldots, \omega_n)$ is the discriminant of the basis $\{1, \omega_2, \ldots, \omega_n\}$. Using the finiteness result of Lang (1960) on unit equations in two unknowns, it was proved in Győry (1982a, 1982b) that apart from a proportional factor from $A^*$ and a translation of the form $\alpha \to \alpha + a$, $a \in A$, the equations

(i) $\mathcal{O} = A[\alpha]$,
(ii) $D_{L/K}(\alpha) \in \delta A^*$,
(iii) $D_{L/K}(\omega_2 x_2 + \cdots + \omega_n x_n) \in \delta A^*$,
(iv) $I(\omega_2 x_2 + \cdots + \omega_n x_n) \in \delta A^*$,

where $\delta \in A \setminus \{0\}$, have only finitely many solutions in $\alpha \in \mathcal{O}$ resp. in x2 , . . . 
, xn ∈ A. Moreover, putting $\alpha = \sum_{i=1}^{n} \omega_i x_i$ with $x_1, \ldots, x_n \in A$, one can show that these equations are equivalent. In the special case $K = \mathbb{Q}$, $A = \mathbb{Z}$, the quantity $|I(\omega_2 x_2 + \cdots + \omega_n x_n)|$ is just the index of the additive group $\mathbb{Z}[\alpha]^+$ in $\mathcal{O}^+$; therefore, the form $I$ is called an index form. Effective versions of the above finiteness assertions are given in Győry (1976, 1978b, 1981b) and Győry and Papp (1977) over number fields, and, in our next book on discriminant equations, over finitely generated domains.

• Let $K$ be a number field with ring of integers $O_K$, $S$ a finite set of places of $K$ containing all infinite places, $\mathfrak{p}_1, \ldots, \mathfrak{p}_s$ the prime ideals in $S$, and $\mathfrak{p}_i^{h_K} = (\pi_i)$, where $h_K$ denotes the class number of $K$ and $\pi_i \in O_K \setminus \{0\}$ which, by Proposition 4.3.12, can be chosen so that $h(\pi_i)$ is effectively bounded, for $i = 1, \ldots, s$. Let $F \in K[X_1, \ldots, X_m]$ be a decomposable form, and let $\delta \in K \setminus \{0\}$. It is easy to see that the equation

$$F(x) = \delta \quad \text{in } x = (x_1, \ldots, x_m) \in O_S^m \tag{9.7.3}$$

leads to finitely many equations of the form

$$F(x) = \delta' \pi_1^{z_1} \cdots \pi_s^{z_s} \quad \text{in } x = (x_1, \ldots, x_m) \in O_K^m \text{ and } z_1, \ldots, z_s \in \mathbb{Z}_{\ge 0}, \tag{9.7.4}$$

where $\delta'$ can take only finitely many and effectively determinable values from $K \setminus \{0\}$. Conversely, any equation of the shape (9.7.4) can be reduced to finitely many and effectively determinable equations of the form (9.7.3). In our book, we considered equations in the form (9.7.3), but in the earlier literature many results were formulated and proved for equations (9.7.4).
• Combining the proof of Theorem 9.6.3 with the effective results of Chapter 8 on unit equations, Theorem 9.6.3 and its corollaries formulated in Section 9.6 can be generalized to the case where the ground ring is an arbitrary finitely generated integral domain over $\mathbb{Z}$. Using the method presented in Chapter 8, effective finiteness results concerning the solutions of Thue equations and discriminant equations over finitely generated domains are obtained in a more direct way in Bérczes, Evertse and Győry (2014) and in our next book on discriminant equations.

• Schmidt (1971) gave a finiteness criterion for decomposable form equations over $\mathbb{Z}$ in $m$ unknowns,

$$F(x) = \delta \quad \text{in } x \in \mathbb{Z}^m,$$

but he restricted himself to decomposable forms $F$ with the property that $F(x) \neq 0$ for all non-zero $x \in \mathbb{Z}^m$. Schmidt's result was later extended by Schlickewei (1977d) to the case of decomposable form equations over rings of $S$-integers in $\mathbb{Q}$. We note that the condition $F(x) \neq 0$ for $x \in \mathbb{Z}^m \setminus \{0\}$ is independent of condition (i) of Theorem 9.1.1. For instance, the decomposable form $F = X_1 \cdots X_m (a_1 X_1 + \cdots + a_m X_m)$ with $a_1, \ldots, a_m$ non-zero integers satisfies (i) with $\mathcal{L}$ consisting of all subsums of $a_1 X_1 + \cdots + a_m X_m$, but it certainly vanishes at non-zero integral points.

• Let $K$ be a number field, $S$ a finite set of places of $K$ of cardinality $s$ containing all infinite places, $\delta$ a non-zero element of $O_S$, and $F \in O_S[X, Y]$ a binary form of degree $n \ge 3$. It was proved in Evertse (1997) that the Thue–Mahler equation

$$F(x, y) \in \delta O_S^* \quad \text{in } x, y \in O_S \tag{9.7.5}$$

has at most $(5 \cdot 10^6 n)^{s + \omega_S(\delta)}$ $O_S^*$-cosets of solutions. Erdős, Stewart and Tijdeman (1988) proved the following result, which implies that Evertse's bound cannot be replaced by a bound polynomial in $s$, say. Let $p_1 = 2, p_2 = 3, \ldots$ be the sequence of primes and $n$ an integer $\ge 2$.
Then for every $\varepsilon > 0$, there exists $t_0(n, \varepsilon)$ such that for every $t \ge t_0(n, \varepsilon)$ there is a polynomial $f \in \mathbb{Z}[X]$ of degree $n$ with $n$ distinct zeros in $\mathbb{Q}$ for which the equation

$$f(x) = p_1^{z_1} \cdots p_t^{z_t}$$

has at least

$$\exp\bigl((n^2 - \varepsilon)\, t^{1/n} (\log t)^{-(n-1)/n}\bigr)$$

solutions in $x, z_1, \ldots, z_t \in \mathbb{Z}$. Moree and Stewart (1990) proved a similar result with $f$ irreducible over $\mathbb{Q}$.

On the other hand it turned out that, in a certain sense, most of the equations of type (9.7.5) have far fewer solutions. Let $K$, $S$ be as above. Two binary forms $F, G \in O_S[X, Y]$ are said to be $GL(2, O_S)$-equivalent if

$$G(X, Y) = \varepsilon F(aX + bY, cX + dY)$$

for some $\varepsilon \in O_S^*$ and $a, b, c, d \in O_S$ with $ad - bc \in O_S^*$. Obviously, the number of $O_S^*$-cosets of solutions of (9.7.5) does not change when $F$ is replaced by a $GL(2, O_S)$-equivalent form. Using the number field case of Theorem 6.1.6 (see Evertse, Győry, Stewart and Tijdeman (1988a)), Evertse and Győry (1989) proved the following: for every finite extension $L$ of $K$ and every integer $n \ge 3$ there are up to $GL(2, O_S)$-equivalence only finitely many binary forms $F \in O_S[X, Y]$ of degree $n$ with non-zero discriminant that factorize into linear factors over $L$ and for which equation (9.7.5) has more than two $O_S^*$-cosets of solutions. Here the bound 2 is already best possible. Further, the assertion does not remain valid without fixing the splitting field.

The proof of Evertse and Győry is ineffective in the sense that it does not allow us to determine effectively a full system of representatives for the exceptional equivalence classes. In Evertse and Győry (1989) the authors also established an effective version, but with the bound $1 + s \cdot \min(m, n(n-1)(n-2))$ instead of 2, where $m := [L : K]$.
For a connection with inequalities involving resultants of binary forms, see Section 10.9.

• Mahler (1933b) gave asymptotic formulas for the number of solutions of Thue and Thue–Mahler inequalities, and these were later generalized to decomposable form inequalities. We give an overview of the recent results.

We need the following notation. Let $S = \{\infty, p_1, \ldots, p_t\}$ be a finite set of places of $\mathbb{Q}$. Call a point $x = (x_1, \ldots, x_m) \in \mathbb{Z}^m$ $S$-primitive if $\gcd(x_1, \ldots, x_m, p_1 \cdots p_t) = 1$ (in the case $S = \{\infty\}$ this condition is void). Denote by $\mu = \mu_\infty$ the Lebesgue measure on $\mathbb{R}$ normalized such that $\mu_\infty([0, 1]) = 1$. For a prime number $p$, denote by $\mu_p$ the Haar measure on $\mathbb{Q}_p$, normalized such that $\mu_p(\mathbb{Z}_p) = 1$. Further, denote by $\mu_S$ the product measure $\prod_{p \in S} \mu_p$ on $\prod_{p \in S} \mathbb{Q}_p$ and, for positive integers $m$, by $\mu_S^m$ the product measure on $\prod_{p \in S} \mathbb{Q}_p^m$. We call a point $(x_p : p \in S) \in \prod_{p \in S} \mathbb{Q}_p^m$ $S$-primitive if $|x_p|_p = 1$ for $p \in S \setminus \{\infty\}$, where as usual we define $|x|_p := \max_i |x_i|_p$ for $x = (x_1, \ldots, x_m)$.

For a decomposable form $F \in \mathbb{Z}[X_1, \ldots, X_m]$, we denote by $N_{F,S}(k)$ the number of solutions of the decomposable form inequality

$$\prod_{p \in S} |F(x)|_p \le k \quad \text{in } S\text{-primitive } x \in \mathbb{Z}^m$$

and we write $N_F(k)$ for $N_{F,S}(k)$ if $S = \{\infty\}$. Further, we define the set

$$A_{F,S}(k) := \Bigl\{ (x_p : p \in S) \in \prod_{p \in S} \mathbb{Q}_p^m \ :\ \prod_{p \in S} |F(x_p)|_p \le k,\ (x_p : p \in S)\ S\text{-primitive} \Bigr\}.$$

Then for $k > 0$ we have $\mu_S^m(A_{F,S}(k)) = \mu_{F,S}\, k^{m/n}$, where $\mu_{F,S} := \mu_S^m(A_{F,S}(1))$. In the case $S = \{\infty\}$ we write $\mu_F$ for $\mu_{F,S}$. It is an obvious problem to compare $N_{F,S}(k)$ with $\mu_{F,S}\, k^{m/n}$.
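As a numerical illustration of this comparison in the simplest case $S = \{\infty\}$, $m = 2$, the sketch below counts integer points with $|F(x, y)| \le k$ for the binary quartic $F = X^4 + Y^4$ and compares the count with $\mu_F k^{2/4}$. The choice of $F$ is an assumption made for convenience: the form is positive definite, so the region is bounded and the count is exact. The value of $\mu_F$ uses the standard closed form $4\,\Gamma(5/4)^2 / \Gamma(3/2)$ for the area of $\{x^4 + y^4 \le 1\}$; the function names are hypothetical.

```python
from math import gamma


def count_solutions(k):
    """N_F(k) for F = X^4 + Y^4: number of (x, y) in Z^2 with x**4 + y**4 <= k."""
    # Every solution has |x|, |y| <= b, where b is the largest integer with b**4 <= k.
    b = 0
    while (b + 1) ** 4 <= k:
        b += 1
    return sum(
        1
        for x in range(-b, b + 1)
        for y in range(-b, b + 1)
        if x**4 + y**4 <= k
    )


def mu_F():
    """Area of the region x^4 + y^4 <= 1 (closed form via the Gamma function)."""
    return 4 * gamma(1.25) ** 2 / gamma(1.5)


if __name__ == "__main__":
    for k in (10**4, 10**6, 10**8):
        n_k = count_solutions(k)
        main_term = mu_F() * k ** (2 / 4)  # mu_F * k^(m/n) with m = 2, n = 4
        print(k, n_k, round(main_term, 1), round(n_k / main_term, 4))
```

For forms that are not definite the region $\{|F| \le k\}$ is unbounded, so a box search gives only a lower bound for $N_F(k)$; controlling the points far out along the asymptotes is exactly where the Diophantine approximation input is needed.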
In the 1930s, Mahler (1933b) proved that if $F \in \mathbb{Z}[X, Y]$ is an irreducible binary form of degree $n \ge 3$, then

$$|N_{F,S}(k) - \mu_{F,S}\, k^{2/n}| \ll_{F,S} k^{1/(n-1)} (\log k)^t \quad \text{as } k \to \infty,$$

where the implied constant depends on $F$ and $S$. In his master's thesis, de Jong (1999) proved analogues of Mahler's results for certain classes of norm form inequalities.

In the case $S = \{\infty\}$, Thunder proved a more substantial generalization of Mahler's result to decomposable form inequalities, and made Mahler's result more precise. For simplicity of presentation, we give slightly weaker versions of Thunder's results. Let $F \in \mathbb{Z}[X_1, \ldots, X_m]$ be a decomposable form satisfying condition (i) of Theorem 9.1.1 with $\mathcal{L} = \mathcal{L}_0$ and also $F(x) \neq 0$ for all $x \in \mathbb{Z}^m \setminus \{0\}$. This condition is slightly stronger than the one imposed by Thunder. In his paper, Thunder (2001) proved that

$$\mu_F \ll_{m,n} 1 \quad \text{and} \quad N_F(k) \ll_{m,n} k^{m/n} \quad \text{as } k \to \infty,$$

where the implied constants are effectively computable and depend only on $m$, $n$, and moreover,

$$|N_F(k) - \mu_F\, k^{m/n}| \ll_F k^{m/(n + m^{-2})} \quad \text{as } k \to \infty, \tag{9.7.6}$$

where the implied constant is effectively computable and depends on $F$. In the special case that $\gcd(m, n) = 1$, Thunder (2005) obtained an estimate similar to (9.7.6) with an effectively computable implied constant depending only on $m$ and $n$. Thunder's arguments consisted of an application of the Quantitative Subspace Theorem and geometry of numbers. It is still open to prove an estimate like (9.7.6), with implied constant depending only on $m$, $n$, without the constraint $\gcd(m, n) = 1$.

J. Liu (2015) obtained in his PhD thesis generalizations of Thunder's results for arbitrary finite sets of places $S$. More precisely, he proved that

$$\mu_{F,S} \ll_{m,n,S} 1, \quad N_{F,S}(k) \ll_{m,n,S} k^{m/n} \quad \text{as } k \to \infty,$$

and

$$|N_{F,S}(k) - \mu_{F,S}\, k^{m/n}| \ll_{F,S} k^{m/(n + m^{-2})} \quad \text{as } k \to \infty,$$
where now all implied constants depend also on the primes in $S$. Further, in the case that $m$ and $n$ are coprime, he obtained a similar estimate with implied constant depending only on $m$, $n$ and the primes in $S$. Here again, all implied constants are effectively computable.

• Let $A$ be an integral domain that is finitely generated over $\mathbb{Z}$ and $K$ its quotient field. Further, let $P \in A[X]$ be a polynomial of degree $n$ without multiple zeros. Consider the resultant equation

$$R(P, Q) = \delta \quad \text{in } Q \in A[X] \text{ with } \deg Q = m, \tag{9.7.7}$$

where $R(P, Q)$ denotes the resultant of $P$ and $Q$ and where $\delta \in K \setminus \{0\}$. Writing $P = a_0 (X - \alpha_1) \cdots (X - \alpha_n)$ with distinct $\alpha_1, \ldots, \alpha_n$ from a finite extension of $K$ and $Q = x_0 X^m + x_1 X^{m-1} + \cdots + x_m \in A[X]$, we have

$$R(P, Q) = a_0^m \prod_{i=1}^{n} \bigl( x_0 \alpha_i^m + x_1 \alpha_i^{m-1} + \cdots + x_m \bigr).$$

Thus, (9.7.7) can be regarded as a decomposable form equation. By means of an earlier version of Corollary 9.1.2 it was proved in Győry (1993b) that this equation has only finitely many solutions if $m < n/2$, and this bound $n/2$ is in general sharp. This improved and generalized results of Wirsing (1971), Schmidt (1973) and Schlickewei (1977e) obtained in the case $K = \mathbb{Q}$, $A = \mathbb{Z}$ or $\mathbb{Z}_S$, a ring of $S$-integers in $\mathbb{Q}$. In the case when $K$ is a number field and $A = O_S$ is a ring of $S$-integers in $K$, a quantitative finiteness result from Evertse (1995) on decomposable form equations was used in Győry (1994) to derive the upper bound $(2^{34} n^2)^{m^3 s}$ for the number of solutions of (9.7.7), where $s = |S|$. We note that in Sections 10.8, 10.9 other versions of (9.7.7) are considered, where both $P$ and $Q$ are unknowns, but the splitting field of $P \cdot Q$ is fixed.

• The next application is concerned with irreducible polynomials. It gave an affirmative answer to a problem of M. Szegedy. Let $P \in \mathbb{Z}[X]$ be a monic polynomial of degree $n$ without multiple zeros. Further, let p1 , . . . 
, ps be distinct primes and denote by $S$ the set of integers not divisible by primes different from $p_1, \ldots, p_s$. It was proved in Győry (1994) that there are at most $(2^{17} n)^{n^3 (s+1)/3}$ values $a \in S$ for which $P(X) + a$ is reducible over $\mathbb{Q}$. Indeed, if for some $a \in S$, $Q(X) = X^m + x_1 X^{m-1} + \cdots + x_m$ is a divisor of degree $m \le n/2$ of $P(X) + a$ in $\mathbb{Z}[X]$ and if $\alpha_1, \ldots, \alpha_n$ denote the zeros of $P(X)$, then $(1, x_1, \ldots, x_m)$ is a solution of the equation

$$F(x_0, x_1, \ldots, x_m) \in S \quad \text{in } (x_0, x_1, \ldots, x_m) \in \mathbb{Z}^{m+1}, \tag{9.7.8}$$

where $F = X_0 \prod_{i=1}^{n} (\alpha_i^m X_0 + \alpha_i^{m-1} X_1 + \cdots + X_m)$ is a decomposable form with coefficients in $\mathbb{Z}$. Using an earlier version of Corollary 9.5.5 from Evertse (1995), one can get an upper bound for the number of solutions of (9.7.8), i.e. for the number of polynomials $Q$ under consideration, and the assertion follows. From this it is easy to deduce that for any monic $P \in \mathbb{Z}[X]$ of degree $n$ there is an $a \in \mathbb{Z}$ with $|a| \le \exp\{(2^{17} n)^{n^3}\}$ for which $P(X) + a$ is irreducible over $\mathbb{Q}$. It is an important feature of this bound that it depends only on $n$.

• In the number field case when in (9.2) $K$ is a number field and $A$ is a ring of $S$-integers in $K$, Győry (1993a) gave a criterion for (9.2) to have only finitely many $A^*$-cosets of solutions. Also in the number field case when $\Gamma$ is a group of $S$-units in $K$, Theorem 9.2.1 on unit equations is equivalent not only to Theorem 9.1.1 on decomposable form equations, but also to the following assertion. For any set of n + 2 distinct hyperplanes H0 , . . . 
, Hn+1 in $\mathbb{P}^n(K)$, the set of $S$-integral points of $\mathbb{P}^n(K) \setminus (H_0 \cup \cdots \cup H_{n+1})$ is contained in a finite union of hyperplanes of $\mathbb{P}^n(K)$; see LeVesque and Waldschmidt (2011), Ru and Wong (1991), Győry (1993b) and, for more general results, Vojta (1987, 1996) and Levin (2008). Some refinements of Theorems 9.4.1 to 9.4.4 can be found in Győry (1993a) and Evertse and Győry (1997).

• Consider a decomposable form equation $F(x) = \pm\delta$ over $\mathbb{Z}$ and its reformulation of the form (9.4.8) with $K = \mathbb{Q}$, $O_S = \mathbb{Z}$. Assume that this equation has infinitely many solutions in $x \in \mathbb{Z}^m$. Then the maximal rank $r$ of its families of solutions satisfies $1 \le r < \infty$. Denote by $P(N)$ the number of solutions $x = (x_1, \ldots, x_m)$ with $\max_{1 \le i \le m} |x_i| \le N$. In Everest and Győry (1997) it was deduced from the case $K = \mathbb{Q}$, $O_S = \mathbb{Z}$ of Theorem 9.4.4 that

$$P(N) = c_1 (\log N)^r + O((\log N)^{r-1}) \quad \text{as } N \to \infty,$$

where $c_1$ is a positive number which depends only upon $F$ and $\delta$. See also Győry and Pethő (1980) and Evertse and Győry (1997).

• Let $G$ be a finite abelian group, and let $\mathbb{Z}[G]$ denote the integral group ring which consists of all formal expressions $\sum_{g \in G} x_g \cdot g$ with $x_g \in \mathbb{Z}$. Then $\mathbb{Z}[G]^*$, the unit group of $\mathbb{Z}[G]$, is finitely generated. There is considerable interest in the units of $\mathbb{Z}[G]$; see e.g. Karpilovsky (1988) and Sehgal (1978). For $x = \sum_{g \in G} x_g \cdot g \in \mathbb{Z}[G]$, let $|x| := \max_{g \in G} |x_g|$, and let

$$U_G(N) := |\{x \in \mathbb{Z}[G]^* : |x| \le N\}|.$$

Suppose that $\mathbb{Z}[G]^*$ has rank $r > 0$. It was proved in Everest and Győry (1997), as a special case of the above result concerning $P(N)$, that

$$U_G(N) = c_2 (\log N)^r + O((\log N)^{r-1}) \quad \text{as } N \to \infty,$$

where $c_2$ is a positive number which depends only on $G$.

• Let F ∈ Z[X1 , . . . 
, Xm ] be a decomposable form of degree n in m ≥ 2 variables. For given c > 0, ν ≥ 0 consider the decomposable form inequality 0 < |F (x)| ≤ c|x|ν in x = (x1 , . . . , xm ) ∈ Zm , (9.7.9) where |x| := max1≤i≤m |xi |. By means of his Subspace Theorem Schmidt (1973, 1980) proved that (9.7.9) has only finitely many solutions, provided that (i) n > 2(m − 1), ν < n − 2(m − 1) and the linear factors of F are in general position (i.e. any m of them are linearly independent over Q), (ii) F is not divisible in Q[X1 , . . . , Xm ] by any form of degree less than m. This was extended by Schlickewei (1977e) to the case when the ground ring is an arbitrary finitely generated subring of Q. These results have obvious applications to decomposable form equations of the form F (x) = G(x) = 0, where G ∈ Z[X1 , . . . , Xm ] is a non-zero polynomial of degree ν < n − 2(m − 1). The above results were generalized in Győry and Ru (1998) for the number field case, without assuming (ii). The proof involves Schmidt’s Subspace Theorem with moving targets proved by Ru and Vojta (1997). r As a generalization of decomposable form equations, several people studied decomposable polynomial equations of the form F (x) = δ in x = (x1 , . . . , xm ) ∈ Am , (9.7.10) where A is a subring of a finitely generated extension K of Q which is finitely generated over Z, δ ∈ K \ {0} and F ∈ K[X1 , . . . , Xm ] is a decomposable polynomial, i.e., it factorizes into not necessarily homogeneous linear polynomials over a finite extension of K. In Evertse, Gaál and Győry (1989) a finiteness criterion was given for equation (9.7.10). Later, in the case K = Q, explicit upper bounds were derived in Bérczes and Győry (2002) for the number of solutions, provided that this number is finite. Over number fields, effective bounds were derived for the solutions in Sprindžuk (1974) and Bilu (1995) for m = 2, and in Gaál (1984, 1985, 1986) for certain norm polynomial and discriminant polynomial equations. 
• Let $f_1, \ldots, f_n, G$ be non-zero polynomials in $K[X_1, \ldots, X_m]$ (m ≥ 2), where K is a number field. Let $F = f_1 \cdots f_n$, and assume that

$$\deg F > m \max_{1 \le i \le n} (\deg f_i) + \deg G.$$

Further, let $O_S$ be a ring of S-integers in K, and as a generalization of decomposable polynomial equations, consider the equation

$$F(x) = G(x) \quad \text{in } x \in O_S^m.$$

Let X be the hypersurface defined by F = G. It is proved in Corvaja and Zannier (2004a) that under certain additional assumptions, $X \cap O_S^m$ is not Zariski dense in X.

• Finally, we note that Győry (1983), Mason (1986a, 1986b, 1987, 1988) and Gaál (1988a, 1988b) established effective results for various decomposable form equations over function fields. Their proofs are based on some earlier variants of results from Chapter 7 concerning unit equations. Gaál and Pohst (2006a, 2006b, 2010) gave the complete resolution of some norm form equations over certain function fields over a finite field.

10 Further applications

In the previous chapters several applications of unit equations were presented or mentioned. Moreover, in Chapter 9 we showed that unit equations and decomposable form equations are in a certain sense equivalent, and using results concerning unit equations we proved several general results for decomposable form equations. Unit equations have, however, a great variety of other applications. In this chapter we briefly present some of these applications in their simplest form, without aiming at completeness.
We note that numerous further applications to discriminant equations are treated in our subsequent book Discriminant Equations in Diophantine Number Theory.

The following topics are discussed: prime factors of sums of integers in Section 10.1, representations of elements of integral domains as sums of units in Section 10.2, lengths of finite orbits of polynomial maps on integral domains in Section 10.3, divisibility properties of polynomials with few non-zero coefficients in Section 10.4, arithmetic graphs with applications to irreducibility problems for polynomials in Section 10.5, discriminant equations and power integral bases in number fields in Section 10.6, finiteness results for binary forms of given discriminant in Section 10.7, equations involving resultants of monic polynomials in Section 10.8, equations and inequalities involving resultants of binary forms in Section 10.9, Lang's conjecture for tori in Section 10.10, linear recurrence sequences and exponential-polynomial equations in Section 10.11, and finally algebraic independence results for values of lacunary power series in Section 10.12.

10.1 Prime factors of sums of integers

We start with a simple application. Denote by ω(n) the number of distinct prime factors of a positive integer n, and by P(n) the greatest prime factor of n.

Erdős and Turán (1934) proved that for any finite subset A of $\mathbb{Z}_{>0}$ with |A| ≥ 2,

$$\omega\Bigl(\prod_{a, a' \in A} (a + a')\Bigr) > c_1 \log |A|,$$

where $c_1$ denotes an effectively computable positive number. Further, they conjectured (see Erdős (1976)) that for every t there is a number c(t) such that if A and B are finite subsets of $\mathbb{Z}_{>0}$ with |A| = |B| ≥ c(t), then

$$\omega\Bigl(\prod_{a \in A,\, b \in B} (a + b)\Bigr) > t.$$

Using the result from Evertse (1984a) on the number of solutions of S-unit equations in two unknowns (see also the Notes in Section 6.7), Győry, Stewart and Tijdeman (1986) proved the conjecture in the following more general and more precise form.

Theorem 10.1.1 There exists an effectively computable positive absolute constant $c_2$ such that if A and B are any finite subsets of $\mathbb{Z}_{>0}$ with |A| ≥ |B| ≥ 2, then

$$\omega\Bigl(\prod_{a \in A,\, b \in B} (a + b)\Bigr) > c_2 \log |A|. \tag{10.1.1}$$

Since the n-th prime can be estimated from below by a constant times n log n, Theorem 10.1.1 implies the following result.

Corollary 10.1.2 There exists an effectively computable positive absolute constant $c_3$ such that if A and B are any finite subsets of $\mathbb{Z}_{>0}$ with |A| ≥ |B| ≥ 2, then there exist integers a ∈ A and b ∈ B for which

$$P(a + b) > c_3 \log |A| \log\log |A|. \tag{10.1.2}$$

Erdős, Stewart and Tijdeman (1988) proved that (10.1.1) and (10.1.2) are not far from being best possible. More precisely, they showed that there is a positive number $c_4$ such that for each integer k ≥ 3 there exist subsets A and B of $\mathbb{Z}_{>0}$ with k = |A| ≥ |B| ≥ 2 such that

$$\omega\Bigl(\prod_{a \in A,\, b \in B} (a + b)\Bigr) < c_4 (\log |A|)^2 \log\log |A|.$$

Further, they obtained a similar result for $P\bigl(\prod_{a \in A,\, b \in B} (a + b)\bigr)$ as well.

Proof of Theorem 10.1.1. We deduce Theorem 10.1.1 from Corollary 6.1.5 concerning unit equations. It is enough to prove this theorem in the case when |B| = 2. Let $a_1, \ldots, a_k$ denote the elements of A and let $b_1, b_2$ be the elements of B. Let $p_1, \ldots, p_t$ be the primes which divide

$$\prod_{i=1}^{k} \prod_{j=1}^{2} (a_i + b_j).$$

Each $a_i$ yields a solution $x = a_i + b_1$, $y = a_i + b_2$ of the equation $x - y = b_1 - b_2$ in integers x, y composed only of the primes $p_1, \ldots, p_t$. By Corollary 6.1.5, there are at most $2^{8(2t+2)}$ such pairs $(a_i + b_1, a_i + b_2)$.
Hence $k \le 2^{16(t+1)}$, which gives $t > c_5 \log k$ for some effectively computable positive absolute constant $c_5$.

Győry, Sárközy and Stewart (1996) proved a multiplicative analogue of (10.1.1) by showing that there exists an effectively computable positive number $c_6$ such that if A and B are any finite subsets of $\mathbb{Z}_{>0}$ with |A| ≥ |B| ≥ 2, then

$$\omega\Bigl(\prod_{a \in A,\, b \in B} (ab + 1)\Bigr) > c_6 \log |A|. \tag{10.1.3}$$

This implies a similar multiplicative analogue of (10.1.2). Further, they obtained the following common generalization of (10.1.1) and (10.1.3).

Theorem 10.1.3 Let n ≥ 2 be an integer, and let A and B be ordered finite subsets of $\mathbb{Z}^n_{>0}$ with |A| ≥ |B| ≥ 2(n − 1) and with the following properties: the n-th coordinate of each vector in A is equal to 1, and any n vectors in B ∪ {(0, . . . , 0, 1)} are linearly independent. Then

$$\omega\Bigl(\prod_{\substack{(a_1, \ldots, a_n) \in A \\ (b_1, \ldots, b_n) \in B}} (a_1 b_1 + \cdots + a_n b_n)\Bigr) > c_7 \log |A|$$

with an effectively computable positive number $c_7$ depending only on n.

Note that (10.1.1) follows from Theorem 10.1.3 by taking n = 2 and $b_1 = 1$ for all $(b_1, b_2)$ in B. Further, for n = 2, Theorem 10.1.3 gives (10.1.3) if $b_2 = 1$ for each $(b_1, b_2)$ in B. In Theorem 10.1.3, all assumptions are necessary. The proof of Theorem 10.1.3 depends on some finiteness results of Evertse and Győry (1988c) and Evertse (1995) on decomposable form equations.

In their above-mentioned paper, Győry, Sárközy and Stewart formulated the conjecture that if a, b and c denote distinct positive integers and max(a, b, c) → ∞, then

$$P((ab + 1)(bc + 1)(ca + 1)) \to \infty.$$

The conjecture was confirmed in stronger forms by Corvaja and Zannier (2003) and, independently, by Hernández and Luca (2003).
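For small sets, the quantities appearing in (10.1.1) and in the conjecture on P((ab + 1)(bc + 1)(ca + 1)) can be computed directly. The following is a minimal sketch using trial division; the function names `prime_factors`, `primes_of_sumset` and `greatest_prime_abc` are illustrative choices, not notation from the text, and the code only evaluates the quantities without proving any bound.

```python
def prime_factors(n):
    """Return the set of distinct prime factors of an integer n >= 1 (trial division)."""
    factors = set()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def primes_of_sumset(A, B):
    """Distinct primes dividing the product of (a + b) over a in A, b in B."""
    primes = set()
    for a in A:
        for b in B:
            primes |= prime_factors(a + b)
    return primes


def greatest_prime_abc(a, b, c):
    """P((ab+1)(bc+1)(ca+1)) for positive integers a, b, c."""
    primes = prime_factors(a * b + 1) | prime_factors(b * c + 1) | prime_factors(c * a + 1)
    return max(primes)


# omega of the product of (a + b) is the size of the returned set.
print(len(primes_of_sumset([1, 2, 3, 4], [1, 2])))
print(greatest_prime_abc(1, 3, 8))
```

For A = {1, 2, 3, 4} and B = {1, 2} the sums a + b are 2, 3, 4, 5, 6, so only the primes 2, 3, 5 occur; for (a, b, c) = (1, 3, 8) the three factors are 4, 25 and 9.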
For further related results, see Bugeaud and Luca (2004), Luca (2005) and Zannier (2012).

10.2 Additive unit representations in finitely generated integral domains

Many people have investigated additive unit representations of elements in various rings. A central problem is whether, as a $\mathbb{Z}$-module, the ring of integers of a number field or, more generally, a finitely generated integral domain of characteristic 0 can be generated by its units. Further, if the answer is yes, how many units are needed to represent the elements of the ring?

Ashrafi and Vámos (2005) proved that if K is a quadratic, a complex cubic or a cyclotomic number field generated by a primitive $2^m$-th root of unity, then there is no integer n ≥ 1 such that every integer in K can be represented as the sum of not more than n units. Further, they conjectured that this holds true for all algebraic number fields K. Jarden and Narkiewicz (2007) proved the conjecture in the following more general situation.

Theorem 10.2.1 If A is a finitely generated integral domain of characteristic 0, then there is no integer n such that every element of A is a sum of at most n units. In particular, this holds for the rings of integers and the rings of S-integers of number fields.

Theorem 10.2.1 is a consequence of the next theorem from Jarden and Narkiewicz (2007) and a classical result of van der Waerden.

Theorem 10.2.2 If A is a finitely generated integral domain of characteristic 0 and n ≥ 1 is an integer, then there exists a constant $C_1(A, n)$, depending only on A and n, such that every non-constant arithmetic progression in A having more than $C_1(A, n)$ elements contains an element which is not a sum of n units.

Theorem 10.2.2 is a special case of the following theorem, which was established independently by Hajdu (2007). Let K be a field of characteristic 0 and $\Gamma$ a multiplicative subgroup of $K^*$ of finite rank r. Further, let n ≥ 1 be an integer, and $\mathcal{A}$ a non-empty finite subset of $K^n$ of cardinality t.
Theorem 10.2.3 There exists a constant $C_2(r, n, t)$ depending only on r, n and t such that the length of any non-constant arithmetic progression in the set

$$\Bigl\{ \sum_{i=1}^{n} a_i s_i : (a_1, \ldots, a_n) \in \mathcal{A},\ (s_1, \ldots, s_n) \in \Gamma^n \Bigr\}$$

is at most $C_2(r, n, t)$.

In the special case when A is a finitely generated integral domain of zero characteristic, $\Gamma = A^*$, t = 1 and $\mathcal{A} = \{(1, \ldots, 1)\}$, Theorem 10.2.3 gives Theorem 10.2.2. We recall that $A^*$, i.e., the unit group of A, is finitely generated, and hence of finite rank.

The proofs of Theorems 10.2.2 and 10.2.3 are both based on earlier versions of Theorem 6.1.3 on unit equations and a result of van der Waerden (1927) from Ramsey theory. Later, Hajdu and Luca (2010) proved Theorem 10.2.3 with a completely explicit value of $C_2(r, n, t)$. Its proof depends only on Theorem 6.1.3, and avoids the use of van der Waerden's Theorem.

Below we sketch the proofs of Theorems 10.2.2 and 10.2.1. We shall use the following version of van der Waerden's Theorem.

Theorem 10.2.4 Let r, s be fixed positive integers. Then for any integer N sufficiently large in terms of r, s the following holds: for any arithmetic progression P of length N of rational integers, and any splitting of P into r subsets, at least one of these subsets contains an arithmetic progression of length s.

Proof. See van der Waerden (1927).

Proof of Theorem 10.2.2. We proceed by induction on n. Let first n = 1. Let $a_j = a_0 + (j - 1)\delta$, j = 1, . . . , N, be an arithmetic progression consisting of units of A, where δ is a non-zero element of A. We have $a_{j+1} - a_j = \delta$ for j = 1, . . . , N − 1, hence an earlier version, due to Evertse and Győry (1988b), of Theorem 6.1.3 implies that N is bounded by a number depending only on A.
Next let n ≥ 1, and assume that the assertion holds with a constant $C_1(A, k)$ for each positive integer k not exceeding n. For $\delta \in A \setminus \{0\}$, consider now a finite arithmetic progression $a_j = a_0 + (j - 1)\delta$ in A, j = 1, . . . , N, each term of which is a sum of n + 1 units from A. We show that N can be bounded above by a number which depends only on A and n.

Denote by $\Lambda(\delta)$ the set of all units u in A which appear in a proper representation of the form $\delta = u_1 + \cdots + u_m$ with m = 1, 2, . . . , 2n + 2; here proper means that the unit sum $u_1 + \cdots + u_m$ has no vanishing subsum. Put $\Lambda(\delta) = \{x_1, \ldots, x_M\}$. It follows from the above-mentioned result from Evertse and Győry (1988b) that M is bounded above by a number $C_3(A, n)$ which depends only on A and n. We have

$$a_j = \sum_{k=1}^{n+1} u_{k,j} \quad \text{for some } u_{k,j} \in A^*, \quad j = 1, \ldots, N,$$

whence

$$\delta = a_{j+1} - a_j = \sum_{k=1}^{n+1} u_{k,j+1} - \sum_{k=1}^{n+1} u_{k,j}, \quad j = 1, \ldots, N. \tag{10.2.1}$$

Cancel the possible vanishing subsums on the right-hand side of (10.2.1). Then, for each j, at least one of the units in (10.2.1) belongs to $\Lambda(\delta)$. We may assume without loss of generality that for every j either $u_{1,j}$ or $u_{1,j+1}$ belongs to $\Lambda(\delta)$. For t = 1, 2, . . . , M, put

$$X_t = \{1 \le j \le N : u_{1,j} = x_t\}, \quad Y_t = \{1 \le j \le N : u_{1,j+1} = x_t\}.$$

Then the set {1, 2, . . . , N} is the union of the sets $X_t, Y_t$, t = 1, . . . , M. It follows from van der Waerden's Theorem stated above that at least one of the sets $X_1, \ldots, X_M, Y_1, \ldots, Y_M$ contains an arithmetic progression P of length $T > C_1(A, n)$ if N is sufficiently large with respect to $C_1(A, n)$ and $C_3(A, n)$. We may assume that $X_1$ has this property. Let d be the difference of P, and put $P = \{n_1, \ldots, n_T\}$, where $n_i = n_1 + (i - 1)d$ for i = 1, . . . , T.
Then one can easily verify that

$$a_{n_i} - x_1 = a_{n_1} - x_1 + (i - 1)d\delta \quad \text{for } i = 1, \ldots, T,$$

and hence $a_{n_1} - x_1, \ldots, a_{n_T} - x_1$ is an arithmetic progression of length $> C_1(A, n)$ in A, each term of which is a sum of n units. This contradicts the induction hypothesis. Thus $N \le C_4(A, n)$, where $C_4(A, n)$ depends only on A and n.

Proof of Theorem 10.2.1. Assume that every non-zero element of A can be represented as the sum of at most n units, and let n be the smallest positive integer with this property. Consider a sufficiently long arithmetic progression $a_j = a_0 + (j - 1)\delta$, j = 1, . . . , N, in A, where δ is a non-zero element of A. We follow the above argument. Let $X_i$, 1 ≤ i ≤ n, be the set of those indices j ∈ {1, 2, . . . , N} for which $a_j$ is a proper sum of i units from A. Then the set {1, 2, . . . , N} is the union of $X_1, \ldots, X_n$. If N is large enough then van der Waerden's Theorem implies that one of the $X_i$, say $X_k$, contains a long arithmetic progression $P = \{n_1, \ldots, n_T\}$. Then one can see similarly as above that if N is sufficiently large then $a_{n_1}, \ldots, a_{n_T}$ is a long arithmetic progression each term of which is the sum of k units, contradicting Theorem 10.2.2. This proves Theorem 10.2.1.

We present some consequences of Theorems 10.2.2 and 10.2.3 as well as some related results. For a finite set S of primes, denote by $\mathbb{Z}_S$ the ring of S-integers, and by $\mathbb{Z}_S^*$ the group of S-units. Jarden and Narkiewicz proved the following consequence of their Theorem 10.2.2.

Corollary 10.2.5 Let n ≥ 1 be an integer, and S a finite set of primes. Then the set of positive integers which are sums of at most n elements of $\mathbb{Z}_S^*$ has density zero.
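The sparseness asserted in Corollary 10.2.5 can be observed numerically in a restricted setting. The sketch below counts only sums of positive integral S-units, i.e. positive integers composed of the primes in S; this is a proper subset of the sums covered by the corollary, which also allows negative and fractional S-units, so the count is merely a lower bound for the set in question. The function names are illustrative assumptions.

```python
def positive_s_units(primes, bound):
    """All integers 1 <= u <= bound whose prime factors all lie in `primes`."""
    units = {1}
    for p in primes:
        extended = set()
        for u in units:
            v = u
            while v <= bound:
                extended.add(v)
                v *= p
        units = extended
    return sorted(units)


def count_unit_sums(bound, primes, n):
    """Count the integers in 1..bound that are sums of at most n positive integral S-units."""
    units = positive_s_units(primes, bound)
    reachable = set()   # sums of at most k units, accumulated for k = 1..n
    frontier = {0}      # sums of exactly k units, starting from the empty sum
    for _ in range(n):
        frontier = {s + u for s in frontier for u in units if s + u <= bound}
        reachable |= frontier
    return len(reachable)


for bound in (10, 100, 1000, 10000):
    print(bound, count_unit_sums(bound, (2, 3), 2) / bound)
```

With S = {2, 3} and n = 2, every integer up to 10 is such a sum, while the proportion drops markedly as the bound grows.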
This follows from Theorem 10.2.2 applied with $A = \mathbb{Z}_S$ and from Szemerédi's Theorem (Szemerédi (1975)) on arithmetic progressions. In his above-mentioned paper, Hajdu deduced from his Theorem 10.2.3 and from the theorem of Green and Tao (2008) about arithmetic progressions of primes the following.

Corollary 10.2.6 Let n ≥ 1, let $S = \{p_1, \ldots, p_t\}$ be a finite set of primes, $U_S$ the set of integers of the shape $\pm p_1^{z_1} \cdots p_t^{z_t}$ with $z_1, \ldots, z_t \in \mathbb{Z}_{\ge 0}$, and $\mathcal{A}$ a non-empty finite subset of $\mathbb{Z}^n$. Then there are infinitely many primes outside the set

$$\Bigl\{ \sum_{i=1}^{n} a_i s_i : (a_1, \ldots, a_n) \in \mathcal{A},\ (s_1, \ldots, s_n) \in U_S^n \Bigr\}.$$

For $\mathcal{A} = \{(1, \ldots, 1)\}$ this gives the following.

Corollary 10.2.7 Let n ≥ 1, S and $U_S$ be as in Corollary 10.2.6. There are infinitely many primes which are not the sum of n elements of $U_S$.

For n = 2, S = {2, 3}, this provided a negative answer to a question of Pohst (oral communication), who asked whether every prime can be written in the form $2^u \pm 3^v$ with some non-negative integers u, v.

It is easy to see that there are number fields whose rings of integers cannot be generated by their units. Such number fields are, for example, the imaginary quadratic fields $\mathbb{Q}(\sqrt{d})$ with squarefree integers d < −3. Jarden and Narkiewicz (2007) formulated the following problem.

Problem Give a criterion for an algebraic extension of $\mathbb{Q}$ to have the property that its ring of integers is generated by its units.

Jarden and Narkiewicz provided some examples of infinite algebraic extensions of $\mathbb{Q}$ having this property. For example, the fields of all algebraic numbers and all real algebraic numbers are such fields. Further, by the Kronecker–Weber Theorem the maximal abelian extension of $\mathbb{Q}$ also has the property mentioned. In particular, the ring of integers of an abelian number field is generated by its units. Besides these results, the above problem has been solved for quadratic number fields in Belcher (1974) and later in Ashrafi and Vámos (2005), for pure cubic fields in Tichy and Ziegler (2007) and for pure quartic complex fields in Filipin, Tichy and Ziegler (2008). Answering affirmatively another problem of Jarden and Narkiewicz, Frei (2012) proved that for any number field K, there exists a finite extension L of K such that the ring of integers of L is generated by its units.

For further related results, we refer to Bertók (2013), Dombek, Hajdu and Pethő (2014), the survey paper Barroero, Frei, Tichy (2011) and the references given there.

10.3 Orbits of polynomial and rational maps

We start with some generalities. Let for the moment X be any non-empty set and $\varphi : X \to X$ any map from X to itself (usually called a self-map of X). We denote by $\varphi^{(i)}$ the i-th iterate of φ (φ applied i times), where we agree that $\varphi^{(0)}$ is the identity. An orbit of φ is a sequence $O_\varphi(a_0) := \{\varphi^{(i)}(a_0)\}_{i=0}^{\infty}$, where $a_0 \in X$.

A cycle of φ is a sequence $(a_0, \ldots, a_{m-1})$ in X, where $a_0, \ldots, a_{m-1}$ are distinct, $a_i = \varphi(a_{i-1})$ for i = 1, . . . , m − 1, and $a_0 = \varphi(a_{m-1})$. We call m the length of the cycle. In this case the orbit $O_\varphi(a_0)$ is periodic with period m. Any $a_0 \in X$ that is the starting point of a cycle of φ of length m is called a periodic point of φ of period m.

An orbit $O_\varphi(a_0)$ of φ is called finite if there are only finitely many distinct elements among $\varphi^{(i)}(a_0)$ (i = 0, 1, 2, . . .). Suppose this is the case. Write $a_i := \varphi^{(i)}(a_0)$ for i ≥ 0. Then there exists l > 0 such that there is k with 0 ≤ k < l and $a_k = a_l$. Take l with this property minimal and put m := l − k. Then $a_0, \ldots, a_{k+m-1}$ are distinct, and $a_{i+m} = a_i$ for i ≥ k. We express the orbit $O_\varphi(a_0)$ conveniently as

$$O_\varphi(a_0) = (a_0, \ldots, a_{k-1}, \overline{a_k, \ldots, a_{k+m-1}}),$$

where the overline indicates that $(a_k, \ldots, a_{k+m-1})$ is the recurring cycle of the orbit. We call k + m the length of $O_\varphi(a_0)$, $(a_0, \ldots, a_{k-1})$ the tail of $O_\varphi(a_0)$, and $(a_k, \ldots, a_{k+m-1})$ the cycle of $O_\varphi(a_0)$. Any $a_0 \in X$ for which $O_\varphi(a_0)$ is finite is called a preperiodic point of φ. In particular, every periodic point is preperiodic.

There is a vast literature on orbits of maps defined by polynomials on rings (see for instance Narkiewicz (1995)) or of morphisms of algebraic varieties (see, e.g., Silverman (2007)). Here, we restrict ourselves to certain aspects that are closely related to unit equations. These concern bounding the lengths of cycles and finite orbits in the cases that X = A is an integral domain and φ is defined by a polynomial, or X is the projective line $\mathbb{P}^1(K)$ over a field K and φ is a rational self-map of $\mathbb{P}^1(K)$.

Let A be an integral domain. A polynomial cycle, resp. finite polynomial orbit, in A is a cycle, resp. finite orbit, of a map of the type $x \mapsto f(x) : A \to A$ where $f \in A[X]$. By a slight abuse of language we say that it is a cycle or finite orbit of f. We denote by $f^{(i)}$ the i-th iterate of the map $x \mapsto f(x)$. Notice that linear polynomials cX + d with c, d ∈ A, c ≠ 0 do not give rise to finite orbits or cycles in A, unless c is a root of unity different from 1.

Narkiewicz (1989) proved that every polynomial cycle in $\mathbb{Z}$ has length at most 2, and in Narkiewicz and Pezda (1997) it is shown that every finite polynomial orbit in $\mathbb{Z}$ has length at most 4.
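The decomposition of a finite orbit into tail and cycle described above can be computed by iterating the map and recording the first repeated value. The following is a minimal sketch; the function name `orbit_structure` and the cut-off `max_steps` are illustrative assumptions, and a return value of `None` only means that no repetition was found within the cut-off, not that the orbit is infinite.

```python
def orbit_structure(f, a0, max_steps=1000):
    """Return (tail, cycle) of the orbit of a0 under f, or None if no repeat is found.

    The orbit is a0, f(a0), f(f(a0)), ...; the first index k at which a value
    recurs splits it into the tail (a_0..a_{k-1}) and the cycle (a_k..a_{k+m-1}).
    """
    first_index = {}
    orbit = []
    x = a0
    for _ in range(max_steps):
        if x in first_index:
            k = first_index[x]
            return orbit[:k], orbit[k:]
        first_index[x] = len(orbit)
        orbit.append(x)
        x = f(x)
    return None


# f(X) = -(X + 1)^2 (X + 3) has integer coefficients.
f = lambda x: -(x + 1) ** 2 * (x + 3)
print(orbit_structure(f, -2))
```

For this choice of f the orbit of −2 is −2, −1, 0, −3, 0, −3, . . ., so the tail is (−2, −1) and the cycle is (0, −3). This orbit has length 4 with a cycle of length 2, the maximal values over the integers according to the results of Narkiewicz and of Narkiewicz and Pezda quoted above.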
Results from Narkiewicz (1989), Pezda (1994) and Narkiewicz and Pezda (1997) imply that for a large class of integral domains A the lengths of polynomial cycles and finite polynomial orbits in A are uniformly bounded in terms of A. We define the following quantities:

$$N_1(A, b) := |\{(x_1, x_2) \in A^* \times A^* : x_1 + x_2 = b\}| \quad (b \in A \setminus \{0\}),$$
$$N_1(A) := \sup\{N_1(A, b) : b \in A \setminus \{0\}\}.$$

The following result is part of Pezda (2014), Theorem 1. Its proof is based on ideas from Narkiewicz (1989).

Theorem 10.3.1 Let A be an integral domain for which $N_1(A)$ is finite. Then every polynomial cycle in A has length at most $6(N_1(A) + 2)^2$.

By Roquette's Theorem (Roquette (1957)) (see also Proposition 9.3.1 in this book), if A is an integral domain of characteristic 0 that is finitely generated over $\mathbb{Z}$, then its unit group $A^*$ is finitely generated. We consider more generally integral domains of which the unit group has finite rank. If A is such a domain, and $A^*$ has rank r, then $N_1(A) \le 2^{16r+16}$ by Corollary 6.1.5. This leads at once to the following corollary.

Corollary 10.3.2 Let A be an integral domain of characteristic 0 such that $A^*$ has finite rank r. Then every polynomial cycle of A has length at most $2^{32r+35}$.

In the proof of Theorem 10.3.1 we need the following simple lemma.

Lemma 10.3.3 Let $g \in A[X]$ and a, b ∈ A with a ≠ b. Then a − b divides g(a) − g(b).

Proof. Use the fact that $\dfrac{g(X) - g(a)}{X - a} \in A[X]$.

Proof of Theorem 10.3.1. Notice that if $(a_0, a_1, \ldots, a_{m-1})$ (with m ≥ 2) is a cycle in A of $f \in A[X]$, then

$$\Bigl(0,\ 1,\ \frac{a_2 - a_0}{a_1 - a_0},\ \ldots,\ \frac{a_{m-1} - a_0}{a_1 - a_0}\Bigr)$$

is a cycle of $g(X) := (a_1 - a_0)^{-1}\bigl(f((a_1 - a_0)X + a_0) - a_0\bigr)$, which is a polynomial in A[X].
So there is no loss of generality in considering only polynomial cycles starting with 0, 1. Let $(a_0 = 0, a_1 = 1, a_2, \ldots, a_{m-1})$ be such a cycle, say of $f \in A[X]$, and assume without loss of generality that m ≥ 6.

Let V be the set of integers i ∈ {0, . . . , m − 1} such that i(i − 2) is coprime with m. Let $p_1, \ldots, p_t$ be the distinct primes dividing m, with $p_1 < \cdots < p_t$. Then V consists precisely of the integers i ∈ {0, . . . , m − 1} such that $i \not\equiv 0, 2 \pmod{p_j}$ for j = 1, . . . , t. Now the Chinese Remainder Theorem and a very generous estimate yield

$$|V| = \left.\begin{cases} m \cdot \prod_{j=1}^{t} \bigl(1 - 2p_j^{-1}\bigr) & \text{if } p_1 > 2, \\[4pt] m \cdot \tfrac{1}{2} \prod_{j=2}^{t} \bigl(1 - 2p_j^{-1}\bigr) & \text{if } p_1 = 2 \end{cases}\right\} \ \ge\ \sqrt{m/6}. \tag{10.3.1}$$

Given an integer k with 1 ≤ k ≤ m − 1, we easily see, by repeatedly applying Lemma 10.3.3 with $g = f^{(k)}$, that $a_k$ divides $a_{tk} - a_{(t-1)k}$ for t = 1, 2, . . ., where the indices are taken modulo m. This implies that $a_k$ divides $a_{tk}$ for t = 1, 2, . . ..

Let i ∈ V with i ≥ 3. There are k, l ∈ $\mathbb{Z}$ such that ik = 1 + lm, and so $a_i$ divides $a_{1+lm} = 1$. Hence $a_i \in A^*$. Likewise, $a_{i-2} \in A^*$. Further, by applying Lemma 10.3.3 with $g = f^{(2)}$, $g = f^{(m-2)}$, respectively, we deduce that $a_{i-2}$ divides $a_i - a_2$ and $a_i - a_2$ divides $a_{i-2}$, that is, $a_i - a_2 \in A^*$. It follows that $(a_i, a_2 - a_i)$ (i ∈ V, i ≥ 3) are all solutions to the unit equation

$$x_1 + x_2 = a_2 \quad \text{in } x_1, x_2 \in A^*.$$

Hence $|V| \le N_1(A) + 2$. Together with (10.3.1) this implies $m \le 6(N_1(A) + 2)^2$. This proves Theorem 10.3.1.

Pezda (1994) proved the following, by a totally different, local method, independent of unit equations. Let K be a field of characteristic 0 with discrete valuation v and $A = \{x \in K : v(x) \ge 0\}$ the associated discrete valuation domain. Assume that the residue class field of A is finite, say with $p^f$ elements, where p is a prime number. Then every polynomial cycle of A has length at most

$$p^f (p^f - 1)\, p^{1 + \log v(p)/\log 2}.$$

By applying this with K a number field of degree d and v the discrete valuation corresponding to a prime ideal of the ring of integers $O_K$ of K lying above 2, one obtains that every polynomial cycle of $O_K$ has length at most $2^{d+1}(2^d - 1)$. This bound is comparable with the bound of Corollary 10.3.2 in that it is exponential in rank $O_K^*$. In his Ph.D. thesis, Zieve (1996) proved various extensions of Pezda's result.

We now consider the finite polynomial orbits of an integral domain A. Denote by B(A) the supremum of the lengths of the polynomial cycles of A. Narkiewicz and Pezda (1997), Theorem 1, proved that if B(A) is finite and if moreover the number of non-degenerate solutions of

$$x_1 + x_2 + x_3 = 1 \quad \text{in } x_1, x_2, x_3 \in A^*$$

is finite, say C(A), then every finite polynomial orbit of A has length at most

$$B(A)\bigl(31\tfrac{1}{3} + C(A)\bigr) - 1.$$

We prove a variation on this result with a simpler proof. Define

$$N_2(A, b) := |\{(x_1, x_2) \in A^* \times A^* : (1 + x_1)(1 + x_2) = b\}| \quad (b \in A \setminus \{0, 1\}),$$
$$N_2(A) := \sup\{N_2(A, b) : b \in A \setminus \{0, 1\}\}.$$

Theorem 10.3.4 Let A be an integral domain for which both B(A) and $N_2(A)$ are finite. Then every finite polynomial orbit of A has length at most $B(A)(2N_2(A) + 5)$.

This has the following consequence for integral domains of characteristic 0 with unit group of finite rank.

Corollary 10.3.5 Let A be an integral domain of characteristic 0 such that $A^*$ has finite rank r. Then every finite polynomial orbit of A has length at most $2^{1600(r+5)}$.

Proof. The equation $(1 + x_1)(1 + x_2) = b$ in $x_1, x_2 \in A^*$ (with b ∈ A, b ≠ 0, 1) can be rewritten as a three term unit equation

$$x_1 + x_2 + x_1x_2 = b - 1.$$

Since $x_1, x_2 \ne -1$, there can be no solutions with $x_1 + x_1x_2 = 0$ or $x_2 + x_1x_2 = 0$. Further, there are at most two solutions with $x_1 + x_2 = 0$.
So apart from at most two solutions, each proper subsum of the left-hand side is non-zero. Now by applying the bound (6.1.4) of Amoroso and Viada with n = 3, we infer

$$N_2(A) \le 2 + 24^{324(r+4)}.$$

Together with the upper bound for B(A) from Corollary 10.3.2 this implies Corollary 10.3.5.

Proof of Theorem 10.3.4. Let $(a_0, \ldots, a_{k-1}, \overline{a_k, \ldots, a_{k+m-1}})$ be a finite polynomial orbit in A, say of $f \in A[X]$, where $a_0, \ldots, a_{k+m-1}$ are distinct. Then $(a_k, \ldots, a_{k+m-1})$ is a polynomial cycle, and so m ≤ B(A). We first make some reductions. Write k = qm + r with q, r ∈ $\mathbb{Z}$ and 0 ≤ r ≤ m − 1. Then $(a_r, a_{r+m}, \ldots, a_{r+qm})$ is a finite orbit of $f^{(m)}$. Let

$$b_i := \frac{a_{r+im} - a_k}{a_r - a_k} \quad (i = 0, 1, 2, \ldots),$$
$$h(X) := (a_r - a_k)^{-1}\bigl(f^{(m)}((a_r - a_k)X + a_k) - a_k\bigr).$$

Then $h \in A[X]$, $b_0 = 1$, $b_i = 0$ for i ≥ q, $b_0, \ldots, b_{q-1}$ are distinct, and $(1, b_1, \ldots, b_{q-1}, \overline{0})$ is a finite orbit of h. We show that

$$q \le 2N_2(A) + 3. \tag{10.3.2}$$

Then using $k + m < (q + 2)m \le (q + 2)B(A)$ we obtain at once Theorem 10.3.4.

We use that by Lemma 10.3.3 with $g = h^{(t)}$, $b_i - b_j$ divides $b_{i+t} - b_{j+t}$ for any i, j with 0 ≤ i, j ≤ q and i ≠ j and any t > 0. Assume without loss of generality that q ≥ 5 and let i be an index with q/2 ≤ i ≤ q − 2. Then $b_i - 1 = b_i - b_0$ divides $b_{2i} - b_i = -b_i$, hence $x_1 := b_i - 1 \in A^*$. Further, $b_{q-1} - b_i$ divides $b_{2q-2-i} - b_{q-1} = -b_{q-1}$ and hence also $b_i$, while $b_i = b_i - b_q$ divides $b_{q-1} - b_{2q-1-i} = b_{q-1}$, and hence also $b_{q-1} - b_i$. So $x_2 := (b_{q-1}/b_i) - 1 \in A^*$. Notice that $x_1, x_2$ are elements of $A^*$ satisfying $(1 + x_1)(1 + x_2) = b_{q-1}$ and that $b_{q-1} \ne 0, 1$. So the number of indices i with $\tfrac{1}{2}q \le i \le q - 2$ is at most $N_2(A)$. This implies (10.3.2) and hence Theorem 10.3.4.
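The divisibility property used repeatedly in the proof, namely that the difference of two orbit elements divides the difference of their later iterates, is easy to test on a concrete integer orbit. The following sketch is only an illustration over the integers; the function name and its parameters are assumptions made here, and a result of `True` confirms the property for the tested indices only.

```python
def check_difference_divisibility(f, a0, length, shifts):
    """Check that a_i - a_j divides a_{i+t} - a_{j+t} along the orbit of a0 under f.

    All index pairs 0 <= i, j < length with a_i != a_j and all shifts
    1 <= t <= shifts are tested.
    """
    a = [a0]
    for _ in range(length + shifts):
        a.append(f(a[-1]))
    for i in range(length):
        for j in range(length):
            d = a[i] - a[j]
            if d == 0:
                continue
            for t in range(1, shifts + 1):
                if (a[i + t] - a[j + t]) % d != 0:
                    return False
    return True


# An integer polynomial satisfies the property (Lemma 10.3.3 applied to its iterates).
print(check_difference_divisibility(lambda x: -(x + 1) ** 2 * (x + 3), -2, 4, 4))
# A map that is not given by an integer polynomial may violate it.
print(check_difference_divisibility(lambda x: x // 2, 4, 3, 1))
```

In the second call the orbit is 4, 2, 1, 0, . . ., and 4 − 2 = 2 does not divide 2 − 1 = 1, so halving cannot be induced by a polynomial with integer coefficients.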
Let again A be an integral domain. We call two sequences $\{a_i\}_{i=0}^{r}$, $\{b_i\}_{i=0}^{r}$ in A (with r finite or infinite) equivalent if there are $\varepsilon \in A^*$, a ∈ A such that $b_i = \varepsilon a_i + a$ for i = 0, 1, 2, . . .. If $\{a_i\}_{i=0}^{r}$ is a cycle or orbit of a polynomial $f \in A[X]$, then $\{b_i\}_{i=0}^{r}$ is a cycle or orbit of $g(X) := \varepsilon f(\varepsilon^{-1}(X - a)) + a$. A polynomial cycle of A is called linear if it is a cycle of a linear polynomial from A[X], otherwise non-linear. A finite polynomial orbit of A is called (non-)linear if its cycle is (non-)linear.

Halter-Koch and Narkiewicz (1997, 2000) proved that if A is an integral domain of characteristic 0 that is finitely generated over $\mathbb{Z}$ and integrally closed, then it has up to equivalence only finitely many non-linear polynomial cycles and only finitely many finite non-linear polynomial orbits. The non-linearity assumption is needed here. For instance, we obtain infinitely many pairwise inequivalent linear orbits by taking $(1, \overline{0, a})$ ($a \in A \setminus \{0, 1\}$), which is a finite orbit of f = (X − 1)(X − a) with linear cycle (0, a) coming from a − X. The proof of Halter-Koch and Narkiewicz heavily uses finiteness results on unit equations. Pezda (2014) gave an effective algorithm that computes, for any given number field K, a full set of representatives for the equivalence classes of the non-linear polynomial cycles and finite orbits of $O_K$.

We state without proof some results on the lengths of cycles and finite orbits of rational maps on the projective line.
In general, for an arbitrary field $K$, a rational map $\varphi : \mathbb{P}^1(K) \to \mathbb{P}^1(K)$ of degree $n$ is given by
\[ \varphi : (x : y) \mapsto (F(x, y) : G(x, y)), \tag{10.3.3} \]
where $F, G \in K[X, Y]$ are two binary forms of degree $n$ without a common factor, i.e., with resultant $R(F, G) \ne 0$. Notice that the map $\varphi$ is unaffected if we replace $F, G$ by $\lambda F, \lambda G$ for some $\lambda \in K^*$.

We assume henceforth that $K$ is a number field. Let $\varphi$ be the rational self-map of $\mathbb{P}^1(K)$ of degree $n$, given by (10.3.3). Let $\mathfrak{p}$ be a prime ideal of $O_K$. We say that $\varphi$ has good reduction at $\mathfrak{p}$ if the following holds: choose $F, G$ such that their coefficients lie in $O_K$ but not all in $\mathfrak{p}$; then $R(F, G) \notin \mathfrak{p}$. Otherwise, we say that $\varphi$ has bad reduction at $\mathfrak{p}$. It is not difficult to show that for a rational self-map of $\mathbb{P}^1(K)$ there are only finitely many prime ideals of $O_K$ at which it has bad reduction.

This notion of good reduction has an alternative interpretation. Let $\mathbb{F}_{\mathfrak{p}} := O_K/\mathfrak{p}$ denote the residue class field of $\mathfrak{p}$ and denote by $F_{\mathfrak{p}}, G_{\mathfrak{p}}$ the binary forms in $\mathbb{F}_{\mathfrak{p}}[X, Y]$ obtained by reducing the coefficients of $F, G$ modulo $\mathfrak{p}$. Then the reduction $\varphi_{\mathfrak{p}}$ of $\varphi$ at $\mathfrak{p}$ is the self-map of $\mathbb{P}^1(\mathbb{F}_{\mathfrak{p}})$ given by
\[ (x : y) \mapsto \left( \frac{F_{\mathfrak{p}}(x, y)}{H(x, y)} : \frac{G_{\mathfrak{p}}(x, y)}{H(x, y)} \right), \]
where $H$ is the greatest common divisor of $F_{\mathfrak{p}}, G_{\mathfrak{p}}$ in $\mathbb{F}_{\mathfrak{p}}[X, Y]$. Notice that $R(F, G) \notin \mathfrak{p}$ if and only if $H$ is constant. This means that $\varphi$ has good reduction at $\mathfrak{p}$ if and only if $\varphi_{\mathfrak{p}}$ has the same degree as $\varphi$. For more information on reduction of rational maps, see Silverman (2007), sections 2.3–2.5.

We state without proof the following result.

Theorem 10.3.6 Let $K$ be an algebraic number field of degree $d$ and let $\varphi$ be a rational self-map of $\mathbb{P}^1(K)$.
Let $S$ be the set of places of $K$ consisting of the infinite places of $K$ and the prime ideals at which $\varphi$ has bad reduction. Denote by $t$ the number of prime ideals of $O_K$ at which $\varphi$ has bad reduction, and let $s := |S|$.
(i) Every cycle of $\varphi$ has length at most
\[ C_1(d, t) := \bigl(12(t + 2) \log 5(t + 2)\bigr)^{d}. \]
(ii) Every finite orbit of $\varphi$ has length at most
\[ C_2(s) := \bigl(e^{10^{12}} (s + 1)^8 (\log 5(s + 1))^8\bigr)^{s}. \]

Part (i) has been proved by Morton and Silverman (1994), corollary B. The proof is by means of a local method, extending that of Pezda (1994). For similar and related results see Zieve (1996) and Silverman (2007), section 2.6. Part (ii) has been proved by Canci (2007), Theorem 1. His proof is an extension of that of Theorem 10.3.4. His main tools are part (i), and Theorem 6.1.3 on unit equations.

We consider the preperiodic points of rational self-maps of $\mathbb{P}^1(K)$. Let $\varphi$ be a rational self-map of $\mathbb{P}^1(K)$. First suppose that $\varphi$ is linear, that is, $\varphi(x : y) = (ax + by : cx + dy)$ where $B := \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in GL(2, K)$. If $B$ has two eigenvalues in $K$ whose quotient is a root of unity, then there is $m > 0$ such that $\varphi^{(m)}$ is the identity and every point in $\mathbb{P}^1(K)$ is a periodic point of $\varphi$. Otherwise, $\varphi$ has at most two fixed points, depending on the number of eigenvalues of $B$ in $K$, and no other preperiodic points.

Assume henceforth that $\varphi$ has degree at least 2. We denote by $\mathrm{PrePer}_K(\varphi)$ the set of preperiodic points of $\varphi$ in $\mathbb{P}^1(K)$. More generally, we may extend $\varphi$ to a rational self-map of $\mathbb{P}^1(\overline{\mathbb{Q}})$ and consider the set $\mathrm{PrePer}_{K,D}(\varphi)$ of all preperiodic points of $\varphi$ that have degree at most $D$ over $K$. Then Northcott (1950) proved that for any integer $D > 0$, the set $\mathrm{PrePer}_{K,D}(\varphi)$ is finite. We state a special case of the Uniform Boundedness Conjecture, which was first formulated in Morton and Silverman (1994).

Conjecture 10.3.7 Let $K$ be a number field of degree $d$, and $\varphi$ a rational self-map of $\mathbb{P}^1(K)$ of degree $n \ge 2$. Then $|\mathrm{PrePer}_K(\varphi)| \le C(d, n)$, where $C(d, n)$ depends on $d$ and $n$ only.
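The resultant criterion for good reduction used in Theorem 10.3.6 can be illustrated over $K = \mathbb{Q}$. The sketch below is not taken from the text: the helper names `sylvester_resultant`, `prime_divisors` and `bad_primes` are hypothetical. It assumes that the binary forms $F$, $G$ have integer coefficients with greatest common divisor 1, so that the normalisation in the definition holds at every prime simultaneously. A form $\sum_i c_i X^{n-i}Y^i$ is stored as the list `[c_0, ..., c_n]`, and the resultant is computed as the determinant of the Sylvester matrix.

```python
from fractions import Fraction


def sylvester_resultant(F, G):
    """Resultant of two binary forms of the same degree n >= 1,
    computed as the determinant of the 2n x 2n Sylvester matrix."""
    n = len(F) - 1
    assert n >= 1 and len(G) == n + 1
    rows = [[0] * i + list(F) + [0] * (n - 1 - i) for i in range(n)]
    rows += [[0] * i + list(G) + [0] * (n - 1 - i) for i in range(n)]
    M = [[Fraction(x) for x in row] for row in rows]
    size, det = 2 * n, Fraction(1)
    for c in range(size):                      # Gaussian elimination over Q
        pivot = next((r for r in range(c, size) if M[r][c] != 0), None)
        if pivot is None:
            return 0
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            det = -det
        det *= M[c][c]
        for r in range(c + 1, size):
            q = M[r][c] / M[c][c]
            for k in range(c, size):
                M[r][k] -= q * M[c][k]
    return int(det)


def prime_divisors(m):
    """Prime divisors of a positive integer, by trial division."""
    primes, p = [], 2
    while p * p <= m:
        if m % p == 0:
            primes.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        primes.append(m)
    return primes


def bad_primes(F, G):
    """Primes of bad reduction of (x : y) -> (F : G) over Q, assuming
    integer coefficients with overall gcd 1 and R(F, G) != 0."""
    res = sylvester_resultant(F, G)
    if res == 0:
        raise ValueError("F and G have a common factor")
    return prime_divisors(abs(res))


# phi(x) = (x^2 - 2)/x, i.e. F = X^2 - 2Y^2, G = XY
print(bad_primes([1, 0, -2], [0, 1, 0]))
# phi(x) = x^2, i.e. F = X^2, G = Y^2
print(bad_primes([1, 0, 0], [0, 0, 1]))
```

For the first map the reductions of $F$ and $G$ modulo 2 share the factor $X$, which is why 2 appears as a prime of bad reduction; the squaring map has good reduction everywhere.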
From Theorem 10.3.6 we deduce a weaker version.

Corollary 10.3.8 Let $K$, $d$, $\varphi$, $n$ be as in Conjecture 10.3.7 and assume that $\varphi$ has bad reduction at precisely $t$ prime ideals of $O_K$. Then $|\mathrm{PrePer}_K(\varphi)| \le C(d, n, t)$, where $C(d, n, t)$ is an effectively computable number, depending on $d$, $n$ and $t$ only.

Proof. For the $i$-th iterate of $\varphi$ we have $\varphi^{(i)}(x : y) = (F_i(x, y) : G_i(x, y))$, where both $F_i, G_i$ are binary forms of degree $n^i$ with $R(F_i, G_i) \ne 0$. By Theorem 10.3.6, for every point $(x : y) \in \mathrm{PrePer}_K(\varphi)$, there are $k, l$ with $0 \le k < l \le C_2(s)$, such that $\varphi^{(k)}(x : y) = \varphi^{(l)}(x : y)$, that is, $F_k(x, y)G_l(x, y) = F_l(x, y)G_k(x, y)$. This shows that the preperiodic points of $\varphi$ are among the zeros of the binary form
\[ P := \prod_{0 \le k < l \le C_2(s)} (F_k G_l - F_l G_k), \]
which is not identically zero since $F_i, G_i$ are coprime for $i \ge 0$. Now the number of preperiodic points of $\varphi$ is at most the degree of $P$, which can be estimated from above effectively in terms of $s$ and $n$, hence in terms of $d$, $t$ and $n$.

10.4 Polynomials dividing many k-nomials

By a monic $k$-nomial over $\mathbb{Q}$ we will mean a polynomial of the form
\[ X^{m_1} + a_2 X^{m_2} + \cdots + a_{k-1} X^{m_{k-1}} + a_k X^{m_k} \in \mathbb{Q}[X] \]
with $m_1 > \cdots > m_{k-1} > m_k = 0$. If the polynomial is not a $(k - 1)$-nomial, i.e., if all $a_i \ne 0$, we call $(m_1, \dots, m_k)$ its exponent $k$-tuple. Put
\[ \mathcal{PR}_k := \bigl\{ P \in \mathbb{Q}[X] : \exists\, Q \in \mathbb{Q}[X],\ r \in \mathbb{Z}_{\ge 1} \text{ with } \deg(Q) < k \text{ such that } P(X) \mid Q(X^r) \text{ over } \mathbb{Q} \bigr\}. \]
Posner and Rumsey (1965) noted that $P(X) \in \mathcal{PR}_k$ implies that $P(X)$ divides infinitely many monic $k$-nomials over $\mathbb{Q}$.
Indeed, if $P(X)$ divides $Q(X^r)$ over $\mathbb{Q}$ for some $Q(X)$ of degree $< k$ and integer $r \ge 1$, then the vector space of polynomials in $\mathbb{Q}[X]$ modulo $Q(X)$ is at most $(k - 1)$-dimensional, and hence $Q(X)$ divides infinitely many $k$-nomials $T(X)$ over $\mathbb{Q}$. But then $Q(X^r)$ divides $T(X^r)$ and so $P(X)$ divides $T(X^r)$ over $\mathbb{Q}$. Conversely, Posner and Rumsey conjectured that if a polynomial $P \in \mathbb{Q}[X]$ divides infinitely many monic $k$-nomials over $\mathbb{Q}$ then $P \in \mathcal{PR}_k$.

For $k = 2$ the conjecture is obvious. For $k = 3$, Posner and Rumsey proved a weaker version of their conjecture. Later, Győry and Schinzel (1994) showed that the conjecture is true for $k = 3$ and false for $k \ge 4$. The disproof for the case $k \ge 4$ is elementary. For $k = 3$, the proof involves some deep results on $S$-unit equations in two unknowns.

Győry and Schinzel (1994) proved the following stronger assertion.

Theorem 10.4.1 Let $P \in \mathbb{Q}[X]$ be a non-constant polynomial with $t$ distinct zeros, $K$ the splitting field of $P$, $d$ the degree of $K$ over $\mathbb{Q}$, and $s$ the number of distinct prime ideal factors of the zeros different from 0 of $P$. There are effectively computable numbers $C_1$, $C_2$ depending only on $d$ and $s$ such that if $P$ divides more than $C_1 \cdot C_2^{t}$ monic trinomials over $\mathbb{Q}$ then $P \in \mathcal{PR}_3$.

Győry and Schinzel gave $C_1$ and $C_2$ in explicit form. It should be observed that these numbers do not depend on the size of the coefficients of $P$.

Proof (sketch). The proof of Theorem 10.4.1 is based on some earlier, quantitative versions of Corollary 6.1.5 and Theorem 6.1.6. We sketch the basic idea of the proof. Let $P$ be a polynomial as in the theorem, and let $T = X^m + aX^n + b$ be a trinomial over $\mathbb{Q}$ which is divisible by $P$.
If $X$ divides $P(X)$ or if $ab = 0$, then $P \in \mathcal{PR}_3$ easily follows. Hence we assume that $X$ does not divide $P$ and $ab \ne 0$. It is easy to show that $P$ can be written in the form $P = P_1 P_2^2$, where $P_1$ and $P_2$ are relatively prime squarefree polynomials in $\mathbb{Q}[X]$. Denote by $\alpha_1, \dots, \alpha_t$ the distinct zeros of $P_1 P_2$, and by $S$ the set of prime ideal factors of these zeros in $K$. Then, for $i = 1, \dots, t$, $(\alpha_i^m, \alpha_i^n)$ is a solution of the $S$-unit equation
\[ (-1/b)x_1 + (-a/b)x_2 = 1 \quad \text{in } S\text{-units } x_1, x_2. \tag{10.4.1} \]

First consider those trinomials $T = X^m + aX^n + b$ ($ab \ne 0$) over $\mathbb{Q}$ which are divisible by $P$ and for which the corresponding equation (10.4.1) has at most two solutions. One can show that if there are more than 15 such trinomials then $P \in \mathcal{PR}_3$.

Next consider those trinomials $T = X^m + aX^n + b$ ($ab \ne 0$) over $\mathbb{Q}$ which are divisible by $P(X)$ and for which the corresponding equation (10.4.1) has more than two solutions. If $X^m + aX^n + b$ and $X^{m'} + a'X^{n'} + b'$ are such trinomials and if the corresponding equations of the form (10.4.1) are $S$-equivalent in the sense defined before the enunciation of Theorem 6.1.6, then $a'/a, b'/b \in O_S^* \cap \mathbb{Q}^*$, where $O_S^*$ denotes as usual the $S$-unit group in $K$. Hence it follows from a quantitative version, due to Győry (1992b), of Theorem 6.1.6 over number fields that there is a subset $A$ of $(\mathbb{Q}^*)^2$ of cardinality at most $C_3$ such that for each trinomial $X^m + aX^n + b$ under consideration, $a = \varepsilon a_0$, $b = \eta b_0$ with $\varepsilon, \eta \in O_S^* \cap \mathbb{Q}^*$ and some $(a_0, b_0) \in A$. Here $C_3$ is a number depending only on $d$ and $s$ which can be given explicitly.

Fix such a pair $(a_0, b_0) \in A$ and consider all the trinomials of the form $X^m + \varepsilon a_0 X^n + \eta b_0$ with $\varepsilon, \eta \in O_S^* \cap \mathbb{Q}^*$, which are divisible by $P$ over $\mathbb{Q}$.
If $X^m + \varepsilon a_0 X^n + \eta b_0$ and $X^m + \varepsilon' a_0 X^n + \eta' b_0$ are two distinct such trinomials with the same pair $(m, n)$, then $P$ divides their difference, so $P(X)$ divides $X^n + c$ with some $c \in \mathbb{Q}^*$ and hence $P \in \mathcal{PR}_3$. Hence it suffices to deal with those trinomials for which the pairs $(m, n)$ are pairwise distinct. We may assume that in the pairs $(m, n)$ in question, say $m_1, \dots, m_u$ are pairwise distinct for $u > C_4^{t}$ with a number $C_4$ specified below. Then $P$ divides $T_i = X^{m_i} + \varepsilon_i a_0 X^{n_i} + \eta_i b_0$ over $\mathbb{Q}$ for $i = 1, \dots, u$, and so, for each $i$,
\[ (-1/b_0)\,\alpha_j^{m_i}/\eta_i + (-a_0/b_0)\,\varepsilon_i \alpha_j^{n_i}/\eta_i = 1 \quad \text{for } j = 1, \dots, t, \]
where $\varepsilon_i, \eta_i \in O_S^* \cap \mathbb{Q}^*$ for $i = 1, \dots, u$. By the above-mentioned version, due to Evertse (1984a), of Corollary 6.1.5, $C_4$ can be chosen as an explicit expression of $d$ and $s$ such that for each $j$ with $1 \le j \le t$, $\alpha_j^{m_i}/\eta_i$ can assume at most $C_4$ values. Since by assumption $u > C_4^{t}$, there are distinct $i_1$ and $i_2$ with $1 \le i_1, i_2 \le u$ such that $\alpha_j^{m_{i_1}}/\eta_{i_1} = \alpha_j^{m_{i_2}}/\eta_{i_2}$ for $j = 1, \dots, t$ and, if $m_{i_1} > m_{i_2}$, then putting $r = m_{i_1} - m_{i_2}$ and $\eta = \eta_{i_1}/\eta_{i_2}$, we get $\alpha_j^r = \eta$ for $j = 1, \dots, t$. Consequently, $P_1 P_2$ divides $X^r - \eta$, i.e. $P$ divides $(X^r - \eta)^2$ and so $P \in \mathcal{PR}_3$. Finally, we obtain that if $P$ divides more than $15 + C_3 \cdot C_4^{2t}$ trinomials then $P \in \mathcal{PR}_3$.

Schlickewei and Viola (1997) improved the bound occurring in Theorem 10.4.1. They proved the theorem with a bound of the form $C_5 \cdot q^{C_6}$ where $q$ denotes the degree of $P$ and $C_5$, $C_6$ are explicitly given absolute constants. We note that under the above notation, $t \le q \le 2t$ and $q \le d \le q!$ hold. In their paper Schlickewei and Viola made the conjecture that the bound $C_5 \cdot q^{C_6}$ may be replaced by an absolute constant which does not involve the degree of $P$ at all. However, as is mentioned by them, at present this seems to be out of reach.

In Győry and Schinzel (1994), the authors proposed as a problem a modified version of the conjecture of Posner and Rumsey for $k \ge 4$. Hajdu (1997) gave a negative answer to the problem and proposed a further refinement of the conjecture.
For $k = 5$, this was disproved by Hajdu and Tijdeman (2003). Further, they noticed that if $P$ divides two monic $k$-nomials, say $T_1$ and $T_2$, over $\mathbb{Q}$ with the same exponent $k$-tuple, then it divides infinitely many $k$-nomials, for example the $k$-nomials $\frac{a}{a+b} T_1 + \frac{b}{a+b} T_2$ for every pair $(a, b)$ of positive rationals. Then in their paper (Hajdu and Tijdeman (2003)), they made the following conjecture.

Conjecture 10.4.2 For any $k \ge 5$, a polynomial $P \in \mathbb{Q}[X]$ with $P(0) \ne 0$ divides infinitely many monic $k$-nomials with non-zero constant terms over $\mathbb{Q}$ if and only if either (i) $P \in \mathcal{PR}_k$ or (ii) $P$ divides over $\mathbb{Q}$ two monic $k$-nomials with the same exponent $k$-tuple.

In the same paper, the authors proved this assertion for $k = 4$ and for polynomials $P$ with only simple zeros. Further, in Hajdu and Tijdeman (2008) they confirmed the conjecture for $k \ge 5$ in the important special case when $P$ is irreducible over $\mathbb{Q}$ and its Galois group is $[2k/3]$-times transitive. The proof is complicated; it depends on Theorem 6.1.3 which gives an upper bound for the number of solutions of multivariate unit equations.

Finally, we note that Schlickewei and Viola (1999) described a so-called “proper” family $\mathcal{F}_k$ of monic $k$-nomials such that if a polynomial $P$ having only simple zeros divides more than $C_7(k)$ elements of $\mathcal{F}_k$, with $C_7(k)$ given explicitly in terms of $k$, then $P \in \mathcal{PR}_k$.

10.5 Irreducible polynomials and arithmetic graphs

Let $K$ be an algebraic number field, $S$ a finite set of places on $K$ containing all infinite places, $O_S$ the ring of $S$-integers, $N_S(\cdot)$ the $S$-norm and $N$ a positive integer. For any finite subset $A = \{\alpha_1, \dots, \alpha_m\}$ of $O_S$ with $m \ge 3$, we denote by $G_S(A) = G_S(A, N)$ the graph whose vertex set is $A$ and whose edges are the unordered pairs $\{\alpha_i, \alpha_j\}$ with
\[ N_S(\alpha_i - \alpha_j) > N. \]
When $S$ consists of the infinite places, this graph will be denoted by $G(A) = G(A, N)$.

These graphs $G(A)$ and $G_S(A)$ were introduced in Győry (1971, 1972, 1980c) and were studied and applied by Győry and others; see Győry (2008b), Győry, Hajdu and Tijdeman (2011) and the references given there. Several Diophantine problems, for instance related to irreducibility of polynomials (see Theorem 10.5.3), decomposable form equations (see Subsection 9.6.2), discriminant equations (see Theorems 10.6.1–10.6.3) and resultant equations (see Theorem 10.8.1) can be reduced to the study of connectedness properties of graphs $G_S(A, N)$. Such properties are stated in Theorems 10.5.1 and 10.5.2 below.

In the complement of $G_S(A, 1)$, $\{\alpha_i, \alpha_j\}$ is an edge if and only if $\alpha_i - \alpha_j$ is an $S$-unit. Hence this complement is called a difference graph of $S$-units. For any finite (simple) graph $G$ of order $\ge 3$ there is a finite set $S$ of places on $K$ containing all infinite ones such that $G_S(A, 1)$ is isomorphic to $G$ for some subset $A$ of $O_S$. Further, such $S$ and $A$ can be effectively determined, provided that $K$ is effectively given; see Győry, Hajdu and Tijdeman (2014).

The subsets $A = \{\alpha_1, \dots, \alpha_m\}$, $A' = \{\alpha_1', \dots, \alpha_m'\}$ of $O_S$ are called $S$-equivalent if, after some reordering of $\alpha_1', \dots, \alpha_m'$,
\[ \alpha_i' = \varepsilon \alpha_i + \beta, \quad i = 1, \dots, m \]
for some $\varepsilon \in O_S^*$ and $\beta \in O_S$. In this case the graphs $G_S(A)$ and $G_S(A')$ are obviously isomorphic, that is, they have the same structure. There are infinitely many $S$-equivalence classes of subsets $A$ of $O_S$ with given cardinality $m \ge 3$.
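The definition of $G(A, N)$ can be made concrete in the simplest setting $K = \mathbb{Q}$ with $S$ consisting of the infinite place, where $O_S = \mathbb{Z}$ and $N_S(\alpha) = |\alpha|$. The following sketch, with the hypothetical function name `graph_components`, builds the graph for a finite set of integers and lists its connected components; the sample sets are chosen only for illustration.

```python
from itertools import combinations


def graph_components(A, N):
    """Connected components of G(A, N) for a finite set A of rational
    integers: x and y are joined by an edge when |x - y| > N."""
    A = list(A)
    adj = {a: set() for a in A}
    for x, y in combinations(A, 2):
        if abs(x - y) > N:
            adj[x].add(y)
            adj[y].add(x)
    seen, comps = set(), []
    for a in A:
        if a in seen:
            continue
        seen.add(a)
        stack, comp = [a], []
        while stack:                      # depth-first search from a
            v = stack.pop()
            comp.append(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(sorted(comp))
    return sorted(comps)


print(graph_components([0, 1, 2], 1))        # one edge {0, 2}, vertex 1 isolated
print(graph_components([0, 1, 2, 10], 1))    # connected
print(graph_components([0, 1, 2, 10], 10))   # no edges at all

# Replacing A by eps*A + beta with eps = -1, beta = 7 keeps the structure.
A = [0, 1, 2]
A_equiv = [-a + 7 for a in A]
print([len(c) for c in graph_components(A_equiv, 1)])
```

In the first example the complement of $G(A, 1)$ has the edges $\{0, 1\}$ and $\{1, 2\}$, whose differences are the units $\pm 1$ of $\mathbb{Z}$.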
We present two theorems in simplified form on the structure of graphs $G_S(A)$. Denote by $d$ the degree of $K$, and let $s := |S|$. Further, as in Chapter 4, let $P$ denote the greatest norm and $Q$ the product of norms of the prime ideals involved in $S$, and let $R_S$ be the $S$-regulator of $K$.

The following theorem was proved by Győry (2008b) in a more precise form. Its first, weaker version can be found in Győry (1980c).

Theorem 10.5.1 Let $m \ge 3$ be an integer, and $A = \{\alpha_1, \dots, \alpha_m\}$ a subset of $O_S$. Then the graph $G_S(A, N)$ has at most two connected components, except possibly in the case when there is an $\varepsilon \in O_S^*$ such that
\[ \max_{1 \le i, j \le m} h\bigl((\alpha_i - \alpha_j)/\varepsilon\bigr) \le C_1 m^3 (C_2 s)^{2(s+2)} P R_S (\log^* R_S)(\log^* QN). \]
Here $C_1$, $C_2$ are effectively computable positive numbers such that $C_1$ depends only on the degree $d$ of $K$ and the regulator and class number of $K$, and $C_2$ only on $d$.

This means that the number of exceptional $S$-equivalence classes is finite, and a representative of each class can be, at least in principle, effectively determined.

Proof (sketch). Theorem 10.5.1 is proved by repeated application of Corollary 4.1.5. We sketch some ideas behind the proof. For a finite graph $G$ we denote by $G^{\triangle}$ the triangle graph of $G$, i.e. the graph whose vertices are the edges of $G$, and two vertices $e_1$ and $e_2$ of $G^{\triangle}$ are connected by an edge if and only if $G$ contains a triangle having $e_1$ and $e_2$ as edges. Further, if both $G$ and $G^{\triangle}$ are connected then we say that $G$ is $\triangle$-connected.

Consider now $G_S(A) = G_S(A, N)$ in Theorem 10.5.1 and assume that this graph has at least three connected components. It is easy to see that in this case the complement of $G_S(A)$, for simplicity denoted by $G$, is $\triangle$-connected. Let $\{\alpha_i, \alpha_j, \alpha_k\}$ be a triangle in $G$. Then we have
\[ N_S(\alpha_i - \alpha_j) \le N, \quad N_S(\alpha_j - \alpha_k) \le N, \quad N_S(\alpha_k - \alpha_i) \le N. \]
Using Proposition 4.3.12, this gives that up to unknown $S$-unit factors, the numbers $\alpha_i - \alpha_j$, $\alpha_j - \alpha_k$, $\alpha_k - \alpha_i$ have effectively bounded heights. But
\[ (\alpha_i - \alpha_j)/(\alpha_i - \alpha_k) + (\alpha_j - \alpha_k)/(\alpha_i - \alpha_k) = 1, \]
hence Corollary 4.1.5 implies that the height of $(\alpha_i - \alpha_j)/(\alpha_j - \alpha_k)$ can be effectively bounded above. If $\{\alpha_j, \alpha_k, \alpha_l\}$ is another triangle in $G$, then the heights of $(\alpha_j - \alpha_k)/(\alpha_k - \alpha_l)$ and so $(\alpha_i - \alpha_j)/(\alpha_k - \alpha_l)$ are also effectively bounded. Continuing this procedure, it follows that for any two connected vertices $\{\alpha_i, \alpha_j\}$, $\{\alpha_p, \alpha_q\}$ in $G^{\triangle}$, the height of $(\alpha_i - \alpha_j)/(\alpha_p - \alpha_q)$ is effectively bounded. But $G^{\triangle}$ is connected, hence for each quadruple $\{\alpha_i, \alpha_j, \alpha_p, \alpha_q\}$ for which $\{\alpha_i, \alpha_j\}$ and $\{\alpha_p, \alpha_q\}$ are edges in $G$, the height of $(\alpha_i - \alpha_j)/(\alpha_p - \alpha_q)$ can be effectively bounded.

Fix $p$ and $q$. Since $G$ is connected, each distinct $\alpha_a$ and $\alpha_b$ can be connected by a path in $G$. Summing over all terms $(\alpha_i - \alpha_j)/(\alpha_p - \alpha_q)$ for the edges in this path we infer that for each pair $(a, b)$ the height of $(\alpha_a - \alpha_b)/(\alpha_p - \alpha_q)$ can be effectively bounded. From these facts it follows easily that up to a common $S$-unit factor, the height of $\alpha_a - \alpha_b$ is effectively bounded for each distinct $\alpha_a$, $\alpha_b$, as stated in Theorem 10.5.1.

The following theorem is a more precise but ineffective version of Theorem 10.5.1. There are only finitely many pairwise non-associate $\alpha \in O_S$ with $N_S(\alpha) \le N$. Denote by $\psi_S(N)$ the maximal number of such $\alpha$.

Theorem 10.5.2 Let $m \ge 3$ be an integer with $m \ne 4$. Apart from at most finitely many $S$-equivalence classes of subsets $A = \{\alpha_1, \dots, \alpha_m\}$ of $O_S$,
\[ G_S(A) \text{ has a connected component of order at least } m - 1. \tag{10.5.1} \]
Further, if
\[ m > 3 \cdot 2^{16s}\, \psi_S(N)^2, \tag{10.5.2} \]
then (10.5.1) holds for all subsets $A = \{\alpha_1, \dots, \alpha_m\}$ of $O_S$.

We note that the assumption $m \ne 4$ is necessary, and the lower bound $m - 1$ in (10.5.1) is sharp. A more general and quantitative version of the first part of Theorem 10.5.2 is given in Győry (2008b); see also Győry (1990). The second part is a special case of Theorem 2.3 of Győry (2008b). For earlier versions of this part, see Győry (1980c, 1990).

Proof (sketch). The proof of the first part of Theorem 10.5.2 depends on Corollary 6.1.2 or, in the quantitative case, on Theorem 6.1.3 and (6.1.4) concerning $S$-unit equations. We now sketch the ideas behind the proof of the second statement. Let $A = \{\alpha_1, \dots, \alpha_m\}$ be a subset of $O_S$, and let $G_1, \dots, G_l$ be the connected components of $G_S(A)$ such that $|G_1| \le |G_2| \le \cdots \le |G_l|$. Suppose that $l \ge 3$ or $l = 2$ and $|G_1| \ge 2$. If $l \ge 3$, let $\alpha_{i_1}, \alpha_{i_2}$ be vertices of $G_1$ and $G_2$, respectively, while if $l = 2$, let $\alpha_{i_1}, \alpha_{i_2}$ be vertices of $G_1$. Then
\[ \alpha_{i_2} - \alpha_{i_1} = (\alpha_{i_2} - \alpha_j) + (\alpha_j - \alpha_{i_1}) \tag{10.5.3} \]
follows for every vertex $\alpha_j$ of $G_3, \dots, G_l$ if $l \ge 3$, and of $G_2$ if $l = 2$. Further, $\alpha_{i_2} - \alpha_j$ and $\alpha_j - \alpha_{i_1}$ have $S$-norms at most $N$ for each $j$. There are $\psi_S(N)^2$ pairs $(\beta_1, \beta_2) \in O_S^2$ with non-zero $\beta_1, \beta_2$ such that $\alpha_{i_2} - \alpha_j = \beta_1 x_1$, $\alpha_j - \alpha_{i_1} = \beta_2 x_2$ with $S$-units $x_1, x_2$. For fixed $\alpha_{i_1}, \alpha_{i_2}$, (10.5.3) leads to at most $\psi_S(N)^2$ $S$-unit equations whose total number of solutions is by Theorem 6.1.4 at most $2^{16s}\, \psi_S(N)^2$. But the number of $\alpha_j$ in question is at least $\tfrac13 m$. This shows that if (10.5.2) holds, then $l = 1$ or $l = 2$ and $|G_1| = 1$, which was to be proved.

Theorems 10.5.1 and 10.5.2 have applications to irreducible polynomials. I. Schur and later A. Brauer, R. Brauer, H.
Hopf and others investigated the irreducibility of polynomials of the form $g(f(X))$, where $f$, $g$ are monic polynomials with integral coefficients, $g$ is irreducible over $\mathbb{Q}$, and the zeros of $f$ are distinct integers. For a survey of results of this type, see Győry (1972, 1982c). These investigations were extended in Győry (1971, 1972, 1982c, 1992c) to the more general case that the zeros of $f$ are in an arbitrary but fixed totally real number field $K$.

Let $A = \{\alpha_1, \dots, \alpha_m\}$ be the set of zeros of such a monic polynomial $f \in \mathbb{Z}[X]$ and suppose that $g \in \mathbb{Z}[X]$ is an irreducible monic polynomial whose splitting field is a CM-field, i.e. a totally imaginary quadratic extension of a totally real number field. In this case $g$ is called of CM-type. For example, cyclotomic polynomials and quadratic polynomials of negative discriminant are of CM-type.

Consider the graph $G(A) = G(A, N)$ with $N = 2^d |g(0)|^{d/\deg(g)}$, where $d = [K : \mathbb{Q}]$. It was proved in Győry (1971) that if this graph $G(A)$ has a connected component having $k$ vertices, then the number of irreducible factors of $g(f(X))$ over $\mathbb{Q}$ is not greater than $\deg(f)/k$. This estimate is in general best possible; see Győry (1972).

For $f \in \mathbb{Z}[X]$ and $a \in \mathbb{Z}$, the polynomials $f(X)$ and $f(X + a)$ will be called equivalent. Then, for irreducible $g \in \mathbb{Z}[X]$, $g(f(X))$ and $g(f(X + a))$ are at the same time reducible or irreducible. Using the fact that $\psi_{M_K^{\infty}}(N) \le C_3 N$ with an effectively computable number $C_3$ depending only on $d$ and the discriminant of $K$ (see Sunley (1973)), Theorem 10.5.2 implies immediately the following theorem.
Theorem 10.5.3 Let $g \in \mathbb{Z}[X]$ be a monic irreducible polynomial of CM-type, and $K$ a totally real number field of degree $d$. There are only finitely many equivalence classes of monic polynomials $f \in \mathbb{Z}[X]$ with $\deg(f) \ge 3$, $\deg(f) \ne 4$, and with distinct zeros in $K$ for which $g(f(X))$ is reducible over $\mathbb{Q}$. Further, if
\[ \deg(f) > C_4 |g(0)|^{2d/\deg(g)} \]
then $g(f(X))$ is irreducible over $\mathbb{Q}$. Here $C_4$ is an effectively computable number depending only on $d$ and the discriminant of $K$.

We note that for suitable $g$ and $K$, in Theorem 10.5.3 there exist infinitely many exceptional equivalence classes of quartic $f$ for which $g(f(X))$ is reducible, and these exceptions are described in Győry (1992c). Further, Theorem 10.5.3 does not remain valid for any monic irreducible $g \in \mathbb{Z}[X]$ and for any number field $K$; see e.g. Győry (1992c). In Győry, Hajdu and Tijdeman (2011), an upper bound is given for the number of exceptional equivalence classes of polynomials $f$.

Theorem 10.5.3 is ineffective, in the sense that the method of proof does not make it possible to determine the exceptional equivalence classes. A weaker but effective version can be deduced from Theorem 10.5.1. For the first effective results of this type, see Győry (1982c).

10.6 Discriminant equations and power integral bases in number fields

Several Diophantine problems of number theory lead to discriminant equations. To illustrate applications of unit equations to such equations, we restrict ourselves here to some finiteness results in their simplest form. Many other, more general results, quantitative versions and applications are discussed in our book Discriminant Equations in Diophantine Number Theory.

Two important discriminant equations are
\[ D_{K/\mathbb{Q}}(\alpha) = D \quad \text{in } \alpha \in O_K \tag{10.6.1} \]
and
\[ D(f) = D \quad \text{in monic polynomials } f \in \mathbb{Z}[X], \tag{10.6.2} \]
where $K$ is an algebraic number field, $O_K$ its ring of integers, $D(f)$ the discriminant of $f$, $D_{K/\mathbb{Q}}(\alpha)$ the discriminant of the minimal polynomial, say $f_\alpha$, of $\alpha$ over $\mathbb{Z}$, and $D$ a non-zero rational integer. In other words, if $\alpha$ satisfies (10.6.1) then $f_\alpha$ satisfies (10.6.2). Equation (10.6.2) can have, however, other, not necessarily irreducible solutions without zeros in $K$. Hence equation (10.6.2) is more general than (10.6.1).

If $\alpha$ is a solution of (10.6.1) then so is $\alpha + a$ for all $a \in \mathbb{Z}$. Elements $\alpha, \alpha^* \in O_K$ with $\alpha - \alpha^* \in \mathbb{Z}$ are called equivalent. Similarly, if $f$ is a solution of (10.6.2), then so is $f^*(X) = f(X + a)$ for every $a \in \mathbb{Z}$. As in Section 10.5, such polynomials $f$, $f^*$ are called equivalent. The minimal polynomials of equivalent $\alpha, \alpha^*$ from $O_K$ are obviously equivalent.

In the quadratic case, when in (10.6.1) $K$ is a quadratic number field and in (10.6.2) the polynomials $f$ are quadratic, the solutions of the above equations can be easily found. Delone (1930) and Nagell (1930) proved independently of each other that up to equivalence, there are only finitely many irreducible monic polynomials $f \in \mathbb{Z}[X]$ of degree 3 for which (10.6.2) holds. This implies that for a cubic number field $K$, equation (10.6.1) has also only finitely many equivalence classes of solutions. In the quartic case, the same assertions were obtained later by Nagell (1967, 1968a). The proofs of Delone and Nagell are ineffective. Nagell (1967) conjectured that the finiteness assertion concerning equation (10.6.1) is true for every number field $K$.

Let $K$ be as above an algebraic number field, and denote by $d$ and $D_K$ the degree and discriminant of $K$. By repeatedly applying an earlier version of Theorem 4.1.1, Győry (1973) proved the following general effective result.
Theorem 10.6.1 Every solution $\alpha$ of (10.6.1) is equivalent to a solution $\alpha^* \in O_K$ for which
\[ h(\alpha^*) < C_1, \tag{10.6.3} \]
where $C_1$ is an effectively computable number depending only on $d$, $D_K$ and $D$.

This implies that there are only finitely many pairwise inequivalent elements in $O_K$ with discriminant $D$, and a full set of representatives of such elements can be, at least in principle, effectively determined. This finiteness assertion was proved independently in an ineffective form in Birch and Merriman (1972).

In view of Minkowski’s inequality (1.5.4), the degree $d$ of $K$ can be estimated from above in terms of $|D_K|$. Further, if (10.6.1) is solvable then $D_K$ divides $D$. Hence, in (10.6.3), the dependence of the bound on $d$ and $D_K$, and hence on $K$, can be dropped; see Győry (1973).

Proof of Theorem 10.6.1 (sketch). We reduce (10.6.1) to a system of unit equations. Let $G$ denote the normal closure of $K/\mathbb{Q}$, let $g$ be its degree over $\mathbb{Q}$, and let $\alpha^{(1)} = \alpha, \alpha^{(2)}, \dots, \alpha^{(d)}$ be the conjugates of $\alpha$ with respect to $K/\mathbb{Q}$. If $d \ge 3$ then
\[ \frac{\alpha^{(i)} - \alpha^{(2)}}{\alpha^{(1)} - \alpha^{(2)}} + \frac{\alpha^{(1)} - \alpha^{(i)}}{\alpha^{(1)} - \alpha^{(2)}} = 1 \quad \text{for } i = 3, \dots, d. \tag{10.6.4} \]
Further, the numbers $\alpha^{(1)} - \alpha^{(2)}$, $\alpha^{(1)} - \alpha^{(i)}$ and $\alpha^{(i)} - \alpha^{(2)}$ divide $D$ in the ring of integers of $G$. Hence Proposition 4.3.12 implies that apart from some unknown unit factors, the heights of these differences can be effectively bounded above. Thus, equation (10.6.4) reduces indeed to finitely many unit equations in two unknowns in $G$.
Finally, by Theorem 4.1.1 the heights of $\alpha^{(1)} - \alpha^{(i)}$, $\alpha^{(2)} - \alpha^{(i)}$ and so $\alpha^{(i)} - \alpha^{(j)}$ can be effectively estimated from above up to the common factor $\alpha^{(1)} - \alpha^{(2)}$, whose height can be effectively bounded above from (10.6.1) in terms of $D$, $g$ and the class number and regulator of $G$. Since these parameters of $G$ can be estimated from above in terms of $d$, $D_K$ and $D$, Theorem 10.6.1 follows.

Theorem 10.6.1 is in fact a consequence of Theorem 10.5.1. Let $A = \{\alpha^{(1)}, \dots, \alpha^{(d)}\}$, $N = |D|^g$. Then $A$ is a subset of the ring of integers of $G$, and (10.6.1) gives
\[ |N_{G/\mathbb{Q}}(\alpha^{(i)} - \alpha^{(j)})| \le N \quad \text{for } 1 \le i < j \le d. \]
Hence the graph $G(A, N)$ defined in Section 10.5 consists of isolated vertices. Thus Theorem 10.5.1 applies and the heights of the differences $\alpha^{(i)} - \alpha^{(j)}$ can be effectively bounded above apart from a common unit factor $\varepsilon$ in $G$, while the height of $\varepsilon$ can be estimated from above from (10.6.1).

As was mentioned above, one may assume that in (10.6.1) $D_K$ divides $D$. Let $\omega$ denote the number of distinct prime factors of the quotient $D/D_K$. Using Theorem 6.1.4 concerning unit equations, one can prove the following theorem, as a special case of a more general result of Evertse and Győry from their book on discriminant equations.

Theorem 10.6.2 Equation (10.6.1) has at most
\[ 2^{5d^2(\omega+1)} \]
equivalence classes of solutions.

The first, weaker version of this type was proved in Evertse and Győry (1985).

Concerning equation (10.6.2), Delone and Faddeev (1940) posed the problem of giving an algorithm for finding all cubic monic polynomials with integer coefficients and given non-zero discriminant. Győry (1973) proved the following general theorem.
Theorem 10.6.3 Every solution $f \in \mathbb{Z}[X]$ of (10.6.2) is equivalent to a solution $f^* \in \mathbb{Z}[X]$ for which
\[ \deg(f^*) \le C_2, \quad H(f^*) \le C_3, \tag{10.6.5} \]
where $H(f^*)$ denotes the maximum of the absolute values of the coefficients of $f^*$ and $C_2$, $C_3$ are effectively computable numbers depending only on $D$.

This makes it possible, at least in principle, to determine all monic polynomials in $\mathbb{Z}[X]$ with given non-zero discriminant.

Later, several quantitative versions of Theorems 10.6.1 and 10.6.3, and generalizations for $S$-integers and for polynomials with $S$-integral coefficients in number fields, were established by Győry. References and the best known values for $C_1$ and $C_3$ are given in our book on discriminant equations. The best possible upper bound $C_2$ can be found in Győry (1974).

For irreducible polynomials $f \in \mathbb{Z}[X]$, Theorem 10.6.1 implies Theorem 10.6.3. The “reducible” case can be reduced to the “irreducible” one by means of the relation
\[ D(f) = \prod_{i=1}^{k} D(f_i) \cdot \Bigl( \prod_{1 \le i < j \le k} R(f_i, f_j) \Bigr)^{2}, \]
where $f = \prod_{i=1}^{k} f_i$ is the irreducible factorization of $f$ in $\mathbb{Z}[X]$ and $R(f_i, f_j)$ denotes the resultant of $f_i$ and $f_j$. Another option is to apply Theorem 10.5.1 to equation (10.6.2) as in the proof of Theorem 10.6.1, and then estimate in the bound obtained for $H(f^*)$ the parameters involved in terms of $D$. An upper bound can also be derived for $\deg(f^*)$ by means of Theorem 10.5.2.

We present some consequences of Theorems 10.6.1 and 10.6.2. For other applications, for example to discriminant form and index form equations, we refer to Győry (1976, 1980b), Evertse and Győry (1988a) and our book on discriminant equations.

As is known, there exist algebraic number fields $K$ having power integral bases (i.e. integral bases of the form $\{1, \alpha, \dots, \alpha^{d-1}\}$ where $d = [K : \mathbb{Q}]$), but this is not the case in general. The existence of such a basis considerably facilitates the calculations in $K$ and the study of arithmetical properties of $O_K$, the ring of integers of $K$.
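The relation between $D(f)$ and the discriminants and resultants of the factors of $f$, used above to pass from the irreducible to the reducible case, and the invariance of the discriminant under $f(X) \mapsto f(X + a)$, can be checked numerically for monic integer polynomials. The sketch below is an illustration only: all function names are hypothetical, polynomials are stored as coefficient lists with the leading coefficient first, and the discriminant of a monic $f$ of degree $n$ is computed as $(-1)^{n(n-1)/2}$ times the resultant of $f$ and $f'$, the resultant being the determinant of the Sylvester matrix.

```python
from fractions import Fraction


def det(rows):
    """Determinant by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in r] for r in rows]
    n, d = len(M), Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if M[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            q = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= q * M[c][k]
    return d


def resultant(f, g):
    """Resultant of f and g via the Sylvester matrix."""
    m, n = len(f) - 1, len(g) - 1
    rows = [[0] * i + list(f) + [0] * (n - 1 - i) for i in range(n)]
    rows += [[0] * i + list(g) + [0] * (m - 1 - i) for i in range(m)]
    return int(det(rows))


def discriminant(f):
    """Discriminant of a monic integer polynomial f."""
    n = len(f) - 1
    deriv = [c * (n - i) for i, c in enumerate(f[:-1])]
    return (-1) ** (n * (n - 1) // 2) * resultant(f, deriv)


def poly_mul(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


def shift(f, a):
    """Coefficients of f(X + a), by Horner's scheme."""
    out = [f[0]]
    for c in f[1:]:
        out = poly_mul(out, [1, a])
        out[-1] += c
    return out


f1, f2 = [1, 0, -2], [1, -1]          # X^2 - 2 and X - 1
f = poly_mul(f1, f2)
lhs = discriminant(f)
rhs = discriminant(f1) * discriminant(f2) * resultant(f1, f2) ** 2
print(lhs, rhs)                        # both sides of the relation
print(discriminant(shift(f, 3)) == lhs)
```

The resultant enters only through its square, so the sign convention chosen for the Sylvester matrix does not affect the check.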
More generally, we consider orders in $K$; these are the subrings of $O_K$ whose quotient field is $K$. There are infinitely many orders in $K$, and $O_K$ is the maximal one among them. The order $O$ in $K$ is said to be monogenic if $O = \mathbb{Z}[\alpha]$ for some $\alpha \in O$. Equivalently, in this case $\{1, \alpha, \dots, \alpha^{d-1}\}$ is a $\mathbb{Z}$-module basis of $O$, where $d = [K : \mathbb{Q}]$. In particular, the number field $K$ is called monogenic if $O_K$ is monogenic, that is, if $K$ has a power integral basis.

It is known that $\alpha \in O$ generates $O$ if and only if $D_{K/\mathbb{Q}}(\alpha) = D_O$, where $D_O$ denotes the discriminant of $O$. If $\alpha$ is a generator of $O$ then so are all $\alpha^* \in O$ which are equivalent to $\alpha$. Choosing $D = D_O$, and using the fact that $D_K$ divides $D_O$, Theorem 10.6.1 gives at once the following corollary, see Győry (1976):

Corollary 10.6.4 If $O = \mathbb{Z}[\alpha]$ for some $\alpha \in O$, then there is an $\alpha^* \in O$ which is equivalent to $\alpha$ such that $h(\alpha^*) < C_4$, where $C_4$ is an effectively computable number depending only on $d$ and $D_O$.

In the special case $O = O_K$, we get immediately the following consequence, already obtained in Győry (1976).

Corollary 10.6.5 If $\{1, \alpha, \dots, \alpha^{d-1}\}$ is an integral basis of $K$, then there is an $\alpha^* \in O_K$ which is equivalent to $\alpha$ such that $h(\alpha^*) < C_5$, where $C_5$ is an effectively computable number depending only on $d$ and $D_K$.

Thus, up to equivalence, there are only finitely many elements in $O_K$ and, more generally, in $O$ which generate a power integral basis, and they can be, at least in principle, effectively determined. Combining this effective approach with some reduction procedures, all power integral bases have been determined in many number fields of relatively small degree; see e.g.
Gaál (2002), Bilu, Gaál and Győry (2004) and our book on discriminant equations.

An immediate consequence of Theorem 10.6.2 is the following.

Corollary 10.6.6 Let $O$ be an order in $K$. Up to equivalence, there are at most $2^{5d^2}$ elements $\alpha \in O$ such that $O = \mathbb{Z}[\alpha]$. In particular, the same assertion is true for $O_K$.

An order $O$ in $K$ is said to be $k$ times monogenic if there are at least $k$ distinct equivalence classes of $\alpha$ satisfying $O = \mathbb{Z}[\alpha]$. The following result was proved by Bérczes, Evertse and Győry (2013).

Theorem 10.6.7 Let $K$ be an algebraic number field of degree $\ge 3$. Then there are at most finitely many three times monogenic orders in $K$.

The bound 3 is best possible, that is, there are number fields having infinitely many two times monogenic orders. The proof of Theorem 10.6.7 depends on earlier, qualitative versions of Corollary 6.1.5 and Theorem 6.1.6 on unit equations.

A non-zero element $\alpha$ in an order $O$ of an algebraic number field $K$ is called a basis of a canonical number system (or CNS basis) for $O$ if every non-zero element of $O$ can be represented in the form
\[ a_0 + a_1 \alpha + \cdots + a_m \alpha^m \]
with $m \ge 0$, $a_i \in \{0, 1, \dots, |N_{K/\mathbb{Q}}(\alpha)| - 1\}$ for $i = 0, \dots, m$ and $a_m \ne 0$. Canonical number systems can be viewed as natural generalizations of radix representations of rational integers to algebraic integers. If there exists a canonical number system in $O$, then $O$ is called a CNS order. Orders of this kind have been intensively investigated; we refer to the survey paper Brunotte, Huszti and Pethő (2006) and the references given there.

It was proved in Kovács (1981) and Kovács and Pethő (1991) that $O$ is a CNS order if and only if $O$ is monogenic.
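The definition of a CNS basis can be illustrated by a classical example that is not taken from the text: in the order $O = \mathbb{Z}[i]$ of Gaussian integers, the element $\alpha = -1 + i$ has norm 2, and every Gaussian integer has an expansion in powers of $\alpha$ with digits in $\{0, 1\}$. The sketch below computes such expansions; the function names are hypothetical, and a Gaussian integer $x + yi$ is represented as the pair `(x, y)`.

```python
def cns_digits(x, y):
    """Digits a_0, ..., a_m in {0, 1} with x + y*i = sum_k a_k * (-1 + i)**k."""
    digits = []
    while (x, y) != (0, 0):
        # -1 + i divides x + y*i exactly when x + y is even,
        # so the digit is the residue of x + y modulo 2.
        d = (x + y) % 2
        digits.append(d)
        x -= d
        # Exact division of x + y*i by -1 + i.
        x, y = (y - x) // 2, (-x - y) // 2
    return digits

def cns_value(digits):
    """Evaluate sum_k a_k * (-1 + i)**k by Horner's rule, as a pair (x, y)."""
    x, y = 0, 0
    for d in reversed(digits):
        # Multiply x + y*i by -1 + i, then add the digit d.
        x, y = -x - y + d, x - y
    return (x, y)

print(cns_digits(2, 0))   # [0, 0, 1, 1], i.e. 2 = alpha^2 + alpha^3
```

Indeed $\alpha^2 = -2i$ and $\alpha^3 = 2 + 2i$, so $\alpha^2 + \alpha^3 = 2$. The digit set $\{0, 1\}$ is exactly $\{0, \dots, |N_{K/\mathbb{Q}}(\alpha)| - 1\}$, as the definition requires.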
More precisely, if $\alpha$ is a CNS basis in $O$, then it is easy to see that $O = \mathbb{Z}[\alpha]$. Conversely, $O = \mathbb{Z}[\alpha]$ does not imply in general that $\alpha$ is a CNS basis. However, in this case there are infinitely many $\alpha'$ which are equivalent to $\alpha$ such that $\alpha'$ is a CNS basis for $O$. A characterization of CNS bases in $O$ is given in Kovács and Pethő (1991).

The close connection between elements $\alpha$ of $O$ with $O = \mathbb{Z}[\alpha]$ and CNS bases in $O$ enables one to apply results concerning monogenic orders to CNS orders and CNS bases. For example, it follows from Corollary 10.6.4 that up to equivalence there are only finitely many canonical number systems in $O$.

We say that $O$ is a $k$ times CNS order if there are at least $k$ pairwise inequivalent CNS bases in $O$. Theorem 10.6.7 implies the following result, see also Bérczes, Evertse and Győry (2013).

Corollary 10.6.8 Let $K$ be an algebraic number field of degree $\ge 3$. Then there are at most finitely many three times CNS orders in $K$.

10.7 Binary forms of given discriminant

Let $F = a_0 X^n + a_1 X^{n-1} Y + \cdots + a_n Y^n$ be a binary form of degree $n \ge 2$ with coefficients in a field $K$. We can factor $F$ over an algebraic closure of $K$ as
\[ F = \prod_{i=1}^{n} (\alpha_i X - \beta_i Y); \tag{10.7.1} \]
then the discriminant of $F$ is given by
\[ D(F) = \prod_{1 \le i < j \le n} (\alpha_i \beta_j - \alpha_j \beta_i)^2. \]
We can express $D(F)$ otherwise as a homogeneous polynomial of degree $2n - 2$ in $\mathbb{Z}[a_0, \dots, a_n]$. Define the binary form $F_U$ by
\[ F_U(X, Y) = F(aX + bY, cX + dY) \quad \text{for } U = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{GL}(2, K). \]
Then we have
\[ D(\lambda F_U) = \lambda^{2n-2} (\det U)^{n(n-1)} D(F) \quad \text{for } \lambda \in K^*,\ U \in \mathrm{GL}(2, K). \tag{10.7.2} \]
Given a subring $A$ of $K$, we say that two binary forms $F, G \in A[X, Y]$ are $\mathrm{GL}(2, A)$-equivalent if there are a unit $u \in A^*$ and $U \in \mathrm{GL}(2, A)$ such that $G = uF_U$.
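The transformation rule (10.7.2) can be checked numerically for cubic forms, where the discriminant has a well-known closed expression in the coefficients. The following sketch makes some assumptions for illustration only: a binary form $a_0 X^n + \cdots + a_n Y^n$ is stored as the list `[a_0, ..., a_n]`, the matrix $U$ as a pair of rows, and the function names are hypothetical.

```python
def form_mul(f, g):
    """Product of two binary forms given as coefficient lists [a_0, ..., a_n]."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def form_pow(f, k):
    """k-th power of a binary form (k >= 0)."""
    out = [1]
    for _ in range(k):
        out = form_mul(out, f)
    return out

def substitute(coeffs, U):
    """Coefficients of F_U(X, Y) = F(a*X + b*Y, c*X + d*Y) for U = ((a, b), (c, d))."""
    (a, b), (c, d) = U
    n = len(coeffs) - 1
    result = [0] * (n + 1)
    for i, coeff in enumerate(coeffs):
        # The term coeff * X^(n-i) * Y^i becomes coeff * (aX + bY)^(n-i) * (cX + dY)^i.
        term = form_mul(form_pow([a, b], n - i), form_pow([c, d], i))
        for k in range(n + 1):
            result[k] += coeff * term[k]
    return result

def cubic_discriminant(a, b, c, d):
    """Discriminant of the cubic form a*X^3 + b*X^2*Y + c*X*Y^2 + d*Y^3."""
    return b*b*c*c - 4*a*c**3 - 4*b**3*d - 27*a*a*d*d + 18*a*b*c*d

F = [1, 0, 0, -2]                      # X^3 - 2Y^3
U = ((2, 0), (0, 1))                   # det U = 2
det_U = U[0][0] * U[1][1] - U[0][1] * U[1][0]
lam = 3
G = [lam * c for c in substitute(F, U)]
# (10.7.2) with n = 3: D(lam * F_U) = lam^4 * (det U)^6 * D(F)
print(cubic_discriminant(*G), lam**4 * det_U**6 * cubic_discriminant(*F))
```

For $U \in \mathrm{GL}(2, \mathbb{Z})$ one has $\det U = \pm 1$, and since $n(n-1)$ is even the factor $(\det U)^{n(n-1)}$ equals 1.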
By (10.7.2), two $\mathrm{GL}(2, A)$-equivalent binary forms have, up to multiplication with a unit from $A$, the same discriminant.

We now restrict ourselves to binary forms with coefficients in $\mathbb{Z}$. By (10.7.2), two $\mathrm{GL}(2, \mathbb{Z})$-equivalent binary forms have the same discriminant. We have the following fundamental theorem.

Theorem 10.7.1 Let $n, D$ be integers with $n \ge 2$ and $D \ne 0$. Then there are only finitely many $\mathrm{GL}(2, \mathbb{Z})$-equivalence classes of binary forms $F \in \mathbb{Z}[X, Y]$ of degree $n$ and discriminant $D$.

For $n = 2$ this is a classical theorem of Lagrange (1773) and for $n = 3$ a classical theorem of Hermite (1851). For $n \ge 4$ this was proved only in 1972 by Birch and Merriman (Birch and Merriman (1972), Theorem 2). The proofs of Lagrange and Hermite are effective, while that of Birch and Merriman is ineffective.

Proof of Birch and Merriman (sketch). We give a brief sketch of the proof of Birch and Merriman, explaining at which point it fails to be effective. Take a binary form $F \in \mathbb{Z}[X, Y]$ of degree $n \ge 4$ and discriminant $D \ne 0$. The discriminant of the splitting field of $F$ can be estimated from above in terms of $D$, and by the Hermite–Minkowski Theorem, this leaves only a finite, effectively determinable collection of possible splitting fields for $F$. So we may restrict ourselves to binary forms $F$ with given splitting field $L$, say. Let $H$ denote the Hilbert class field of $L$, and let $S$ be a finite set of places of $H$ such that $D \in O_S^*$. Then $F$ can be factored as in (10.7.1) with $\alpha_i, \beta_i \in O_S$ and $\alpha_i \beta_j - \alpha_j \beta_i \in O_S^*$ for $1 \le i < j \le n$. There is a matrix $U \in \mathrm{GL}(2, O_S)$ such that
\[ F_U = \varepsilon XY(X - Y)(X - \gamma_3 Y) \cdots (X - \gamma_n Y) \]
with $\varepsilon \in O_S^*$, $\gamma_3, \dots, \gamma_n \in O_S$.
Further, by (10.7.2),
\[ D(F_U) = \pm \varepsilon^{2n-2} \prod_{i=3}^{n} \bigl( \gamma_i (1 - \gamma_i) \bigr) \in O_S^*, \]
which implies that $\gamma_i, 1 - \gamma_i \in O_S^*$ for $i = 3, \dots, n$. In this way, the problem of finding the binary forms $F$ of given discriminant reduces to an $S$-unit equation in two unknowns
\[ x + y = 1 \quad \text{in } x, y \in O_S^*. \]
Now by an effective finiteness result for such equations, such as Theorem 4.1.3, one can show that there are only finitely many possibilities for $\gamma_3, \dots, \gamma_n$ and that these can be determined effectively. This shows that the binary forms $F \in \mathbb{Z}[X, Y]$ of degree $n$ and discriminant $D$ lie in only finitely many $\mathrm{GL}(2, O_S)$-equivalence classes.

The final step of the proof of Birch and Merriman is to show that the binary forms in $\mathbb{Z}[X, Y]$ of discriminant $D$ in a given $\mathrm{GL}(2, O_S)$-equivalence class lie in only finitely many $\mathrm{GL}(2, \mathbb{Z})$-equivalence classes. At this point, the argument of Birch and Merriman is ineffective, since it does not give an effective procedure to check whether a given $\mathrm{GL}(2, O_S)$-equivalence class contains a binary form from $\mathbb{Z}[X, Y]$.

Evertse and Győry (1991) managed to give an effective version of the result of Birch and Merriman. The following is a less precise version of Theorem 1 from their paper. Given a binary form $F \in \mathbb{Z}[X, Y]$, denote by $H(F)$ the maximum of the absolute values of the coefficients of $F$.

Theorem 10.7.2 Let $n, D$ be integers with $n \ge 2$ and $D \ne 0$. Then there is an effectively computable number $C_1$, depending only on $n$ and $D$, such that for every binary form $F \in \mathbb{Z}[X, Y]$ of degree $n$ and discriminant $D$ there is $U \in \mathrm{GL}(2, \mathbb{Z})$ such that $H(F_U) \le C_1$.

Proof (sketch). We give only the main idea of the proof. We may again restrict ourselves to binary forms with given splitting field $L$. Let $F \in \mathbb{Z}[X, Y]$ be a binary form of degree $n$ and discriminant $D$ with splitting field $L$. Take a factorization of $F$ as in (10.7.1). After multiplying $F$ by a small integer (effectively bounded in terms of $L$), we may assume that $F$ has a factorization as in (10.7.1) with $\alpha_i, \beta_i \in O_L$ for $i = 1, \dots, n$.
Put $\Delta_{ij} := \alpha_i \beta_j - \alpha_j \beta_i$ for $1 \le i, j \le n$. Then for any quadruple $i, j, k, l$ of distinct indices we have the identity
\[ \Delta_{ij} \Delta_{kl} + \Delta_{jk} \Delta_{il} + \Delta_{ki} \Delta_{jl} = 0. \tag{10.7.3} \]
Notice that all terms $\Delta_{ij}$ are in $O_L$ and divide $D$; hence $|N_{L/\mathbb{Q}}(\Delta_{ij})| \le |D|^{[L:\mathbb{Q}]}$ for all $i, j$. Using Proposition 4.3.12 we can express each term $\Delta_{ij}$ as a product of an element of height effectively bounded in terms of $n$, $D$, $L$ and an element of $O_L^*$. By substituting this into the identities (10.7.3) we obtain homogeneous unit equations like those in Theorem 4.1.1. By applying the latter, we obtain effective upper bounds for the heights of the quotients $\Delta_{ij} \Delta_{kl} / \Delta_{ik} \Delta_{jl}$. We have some freedom to choose the $\alpha_i, \beta_i$ in (10.7.1). By doing this in an appropriate way, we can in fact deduce effective upper bounds for the heights of the numbers $\Delta_{ij}$ themselves. Then, with the help of an argument from the geometry of numbers, one can construct a matrix $U \in \mathrm{GL}(2, \mathbb{Z})$ as in Theorem 10.7.2.

In our book Discriminant Equations in Diophantine Number Theory we give a complete proof of Theorem 10.7.2, with the explicit value
\[ C_1 = \exp\bigl( (16n^3)^{25n^2} |D|^{5n-3} \bigr). \]
It is possible to give a semi-effective version of Theorem 10.7.2 in which the bound $C_1$ has a much better dependence on $D$, but an ineffective dependence on the splitting field of the binary form $F$. The following result is Theorem 1 of Evertse (1993).

Theorem 10.7.3 Let $F \in \mathbb{Z}[X, Y]$ be a binary form of degree $n \ge 4$ and of discriminant $D \ne 0$. Assume that $F$ has splitting field $L$. Then there is $U \in \mathrm{GL}(2, \mathbb{Z})$ such that
\[ H(F_U) \le C^{\mathrm{ineff}}(n, L) |D|^{21/(n-1)}. \]
Here, $C^{\mathrm{ineff}}(n, L)$ is a number, not effectively computable from the proof, that depends only on $n$ and $L$.

Proof (sketch).
The proof is similar to that of Theorem 10.7.2 but one has to apply Theorem 6.1.1 with $n = 2$ to (10.7.3). Some precise combinatorics is needed to get an exponent $O(1/n)$ on $|D|$.

It is possible to give explicit upper bounds for the number of $\mathrm{GL}(2, \mathbb{Z})$-equivalence classes of binary forms, under certain additional constraints. Although it is possible to treat reducible binary forms as well, we restrict ourselves to binary forms that are irreducible over $\mathbb{Q}$.

Let $F = a_0 X^n + a_1 X^{n-1} Y + \cdots + a_n Y^n \in \mathbb{Z}[X, Y]$ be a binary form of degree $n \ge 2$. We say that $F$ is associated with a number field $K$ if $F$ is irreducible over $\mathbb{Q}$, and there is $\alpha$ with $F(\alpha, 1) = 0$, $K = \mathbb{Q}(\alpha)$. This being the case, we can factor $F$ over $K$ as
\[ F = (X - \alpha Y)\bigl( a_0 X^{n-1} + \omega_1 X^{n-2} Y + \cdots + \omega_{n-1} Y^{n-1} \bigr), \]
where $\omega_1, \dots, \omega_{n-1} \in K$. Denote by $O_F$ the $\mathbb{Z}$-module generated by the numbers $1, \omega_1, \dots, \omega_{n-1}$. We call $O_F$ the invariant order of $F$. This naming is motivated by work of Simon (2001), who showed that $O_F$ is in fact an order in $K$, i.e., a subring of $K$ of rank $n$ as a $\mathbb{Z}$-module, and that $\mathrm{GL}(2, \mathbb{Z})$-equivalent binary forms have isomorphic invariant orders. Of course $O_F$ depends on the choice of $K$ and $\alpha$, but it is unique up to $\mathbb{Z}$-algebra isomorphism. It is not hard to show that for the discriminant of $O_F$, i.e., $D_{K/\mathbb{Q}}(1, \omega_1, \dots, \omega_{n-1})$, we have $D(O_F) = D(F)$. This implies
\[ D(F) = c^2 D_K, \tag{10.7.4} \]
where $c = [O_K : O_F]$. The following result is a less precise form of Corollary 2.2 of Bérczes, Evertse and Győry (2004).

Theorem 10.7.4 Let $K$ be a number field of degree $n \ge 2$ and $c$ a positive integer.
Then for every $\varepsilon > 0$, the number of $\mathrm{GL}(2, \mathbb{Z})$-equivalence classes of irreducible binary forms $F \in \mathbb{Z}[X, Y]$ that are associated with $K$ and satisfy (10.7.4) is
\[ \ll c^{(2/n(n-1)) + \varepsilon}, \]
where the implied constant is effectively computable and depends only on $n$ and $\varepsilon$.

It is shown in Bérczes, Evertse and Győry (2004) that the bound in Theorem 10.7.4 cannot be replaced by one of order $c^{\alpha}$ with $\alpha < \frac{2}{n(n-1)}$.

We subdivide the irreducible binary forms with (10.7.4) further and consider binary forms with given invariant order. By a result of Delone and Faddeev (1940), section 15, for every cubic number field $K$ and every order $O$ in $K$, there is precisely one $\mathrm{GL}(2, \mathbb{Z})$-equivalence class of cubic forms $F \in \mathbb{Z}[X, Y]$ such that $O_F \cong O$. On the other hand, in his paper referred to above, Simon proved that for every $n \ge 4$ there are number fields $K$ of degree $n$ such that $O_K$ is not the invariant order of a binary form. The following result is Corollary 2.1 of Bérczes, Evertse and Győry (2004).

Theorem 10.7.5 Let $O$ be an order in a number field $K$ of degree $n \ge 4$. Then there are at most $2^{24n^3}$ $\mathrm{GL}(2, \mathbb{Z})$-equivalence classes of irreducible binary forms $F \in \mathbb{Z}[X, Y]$ such that $O_F \cong O$.

In our book Discriminant Equations in Diophantine Number Theory the bound $2^{24n^3}$ is improved to $2^{5n^2}$.

One can define more generally the invariant order of a reducible binary form of degree $n$, which is an order of rank $n$, i.e. a commutative ring which as a $\mathbb{Z}$-module is free of rank $n$. When $F$ has non-zero discriminant, its invariant order has no nilpotents. In our book on discriminant equations, we have proved a generalization of Theorem 10.7.5 where $O$ is a given nilpotent-free order of rank $n$.
We mention that the proofs of both Theorems 10.7.4 and 10.7.5 use Theorem 6.1.4 (the result of Beukers and Schlickewei).

We finally remark that the papers Evertse and Győry (1991), Evertse (1993) and Bérczes, Evertse and Győry (2004), as well as our book on discriminant equations, contain proofs of generalizations of Theorems 10.7.2–10.7.5 for binary forms with $S$-integral coefficients in number fields. A further generalization of Theorem 10.7.2 is given in Evertse and Győry (1992a, 1992b) for decomposable forms of given discriminant. See also our book on discriminant equations.

10.8 Resultant equations for monic polynomials

Recall that the resultant of two monic polynomials
\[ f = \prod_{i=1}^{m} (X - \alpha_i), \quad g = \prod_{i=m+1}^{m+n} (X - \alpha_i) \]
is given by
\[ R(f, g) = \prod_{1 \le i \le m} \ \prod_{m+1 \le j \le m+n} (\alpha_i - \alpha_j) \]
and that $R(f, g)$ is a polynomial with integer coefficients in terms of the coefficients of $f$ and $g$.

Let $K$ be an algebraic number field, and consider the resultant equation
\[ R(f, g) = R \quad \text{in monic } f, g \in \mathbb{Z}[X] \text{ having all their zeros in } K, \tag{10.8.1} \]
where $R$ is a non-zero rational integer. If $f$, $g$ is a solution of (10.8.1) then so is $f^*(X) = f(X + a)$, $g^*(X) = g(X + a)$ for all $a \in \mathbb{Z}$. Such pairs of polynomials $f$, $g$ and $f^*$, $g^*$ are called equivalent. The following result was obtained in Győry (1990).

Theorem 10.8.1 There are only finitely many equivalence classes of pairs $f$, $g$ with $\deg(f) \ge 2$, $\deg(g) \ge 2$ and $\deg(f) + \deg(g) \ge 5$, without multiple zeros, such that (10.8.1) holds.

We note that the assumptions concerning the degrees of $f$ and $g$ are necessary. Further, the condition that the zeros of $f$ and $g$ are contained in a fixed number field cannot be dropped. However, the restriction concerning the multiplicity of the zeros can be weakened, see Győry (1993c, 2008b).
Győry (1990, 1993c, 2008b) and Bérczes, Evertse and Győry (2007a) obtained quantitative versions of Theorem 10.8.1 which provide upper bounds for the degrees of $f$ and $g$ and for the number of equivalence classes of pairs $f$, $g$ under consideration. For example, it is proved in Győry (1990) that, in Theorem 10.8.1,
\[ \deg(f) + \deg(g) \le 12 \cdot 7^{3d + 2\omega}, \]
where $d$ is the degree of $K$ over $\mathbb{Q}$, and $\omega$ denotes the number of distinct prime factors of $R$.

It should be remarked that Theorem 10.8.1 is established in Győry (1990) in the more general case when the ground ring is any integrally closed integral domain of characteristic 0 which is finitely generated over $\mathbb{Z}$.

Proof of Theorem 10.8.1 (sketch). We reduce equation (10.8.1) to unit equations. The basic idea is as follows. Let $f$, $g$ be a solution of (10.8.1) with $\deg(f) = m \ge 2$, $\deg(g) = n \ge 2$, $m + n \ge 5$, and let $\{\alpha_1, \dots, \alpha_m\}$, $\{\alpha_{m+1}, \dots, \alpha_{m+n}\}$ be the zeros of $f$ and $g$ in $K$. Since $f$, $g$ are monic, these zeros are contained in $O_K$, the ring of integers of $K$, and by assumption they are distinct. Then (10.8.1) can be written in the form
\[ \prod_{1 \le i \le m} \ \prod_{m+1 \le j \le m+n} (\alpha_i - \alpha_j) = R. \tag{10.8.2} \]
The differences $\alpha_i - \alpha_j$ divide $R$ in $O_K$. Hence, taking norms, we infer that
\[ |N_{K/\mathbb{Q}}(\alpha_i - \alpha_j)| \le N \quad \text{for each } i, j, \tag{10.8.3} \]
where $N = |R|^d$. By Proposition 4.3.12, $\alpha_i - \alpha_j$ may take only finitely many values up to a unit factor from $O_K$. There exist several linear relations among these differences, for example
\[ (\alpha_i - \alpha_j) + (\alpha_j - \alpha_k) + (\alpha_k - \alpha_l) = \alpha_i - \alpha_l, \]
with $1 \le i, k \le m$, $m + 1 \le j, l \le m + n$. This leads to inhomogeneous unit equations in three unknowns. We arrive in this way at a complicated system of unit equations.
However, in contrast with the case of discriminant equations, in this situation we get unit equations in more than two unknowns. Thus one has to apply the ineffective Corollary 6.1.2 or Theorem 6.1.3 to obtain Theorem 10.8.1. Therefore, Theorem 10.8.1 is ineffective.

It is simpler to deduce Theorem 10.8.1 from Theorem 10.5.2. We recall, however, that the proof of Theorem 10.5.2 is also based on the results concerning unit equations mentioned in Chapter 6. Consider the graph $\mathcal{G}(\mathcal{A}) = \mathcal{G}(\mathcal{A}, N)$, where $\mathcal{A} = \{\alpha_1, \dots, \alpha_m, \dots, \alpha_{m+n}\}$. Using the above notation, it follows from (10.8.2) and (10.8.3) that $\mathcal{G}(\mathcal{A})$ has either at least three connected components or two connected components of order at least 2. Hence Theorem 10.5.2 implies that $m + n$ is bounded. Further, for fixed $m$ and $n$, we have
\[ \alpha_i = \varepsilon \alpha_i' + \beta, \quad i = 1, \dots, m + n \]
with some $\varepsilon \in O_K^*$, $\beta \in O_K$ and with $\alpha_1', \dots, \alpha_{m+n}' \in O_K$ which may take only finitely many values. This gives
\[ \alpha_i - \alpha_j = \varepsilon(\alpha_i' - \alpha_j'), \quad 1 \le i \le m,\ m + 1 \le j \le m + n. \tag{10.8.4} \]
We see from (10.8.2) and (10.8.4) that for fixed $\alpha_1', \dots, \alpha_{m+n}'$, $\varepsilon^{mn}$ is also fixed, that is, $\varepsilon$ can assume only finitely many values. Finally, one can infer that $\alpha_i = \alpha_i^* + a$ with some $a \in \mathbb{Z}$ and with finitely many possible $\alpha_i^* \in O_K$, $i = 1, \dots, m + n$, whence Theorem 10.8.1 follows.

10.9 Resultant inequalities and equations for binary forms

We keep the notation introduced in Section 10.7. Let $F = \sum_{i=0}^{m} a_i X^{m-i} Y^i$, $G = \sum_{i=0}^{n} b_i X^{n-i} Y^i$ be two binary forms with coefficients in a field $K$.
Assume that over an algebraic closure of $K$, the forms $F$, $G$ factor as
\[ F = \prod_{i=1}^{m} (\alpha_i X - \beta_i Y), \quad G = \prod_{j=1}^{n} (\gamma_j X - \delta_j Y); \tag{10.9.1} \]
then the resultant of $F$, $G$ is given by
\[ R(F, G) = \prod_{i=1}^{m} \prod_{j=1}^{n} (\beta_i \gamma_j - \alpha_i \delta_j). \]
We can express $R(F, G)$ otherwise as a polynomial with integer coefficients in $a_0, \dots, a_m$, $b_0, \dots, b_n$, homogeneous of degree $n$ in $a_0, \dots, a_m$ and homogeneous of degree $m$ in $b_0, \dots, b_n$. Notice that for $\lambda, \mu \in K^*$ and $U \in \mathrm{GL}(2, K)$ we have
\[ R(\lambda F_U, \mu G_U) = \lambda^n \mu^m (\det U)^{mn} R(F, G). \tag{10.9.2} \]
Let $A$ be a subring of $K$. We call two pairs of binary forms $(F, G)$, $(F', G')$ with coefficients in $A$ $\mathrm{GL}(2, A)$-equivalent if $F' = u_1 F_U$, $G' = u_2 G_U$ for some $u_1, u_2 \in A^*$ and $U \in \mathrm{GL}(2, A)$. By (10.9.2), $\mathrm{GL}(2, A)$-equivalent pairs of binary forms have, up to multiplication with a unit from $A^*$, the same resultant.

We restrict ourselves to binary forms with coefficients in $\mathbb{Z}$. We start by formulating some results for resultant inequalities and then deduce, for binary forms, analogues of some of the results from the previous section. By $C_i^{\mathrm{ineff}}(\cdot)$ we denote positive numbers, depending on the parameters between the parentheses, that are not effectively computable by the method of proof of the theorem in which they appear. We call a binary form square-free if it is not divisible by the square of a non-constant binary form.

Our first result, which is Theorem 1 of Evertse and Győry (1993), gives a lower bound for the resultant of two binary forms in terms of their discriminants.

Theorem 10.9.1 Let $L$ be a finite, normal extension of $\mathbb{Q}$, and $F, G \in \mathbb{Z}[X, Y]$ binary forms such that
\[ \deg F = m \ge 3, \quad \deg G = n \ge 3, \quad FG \text{ is square-free}, \quad FG \text{ has splitting field } L. \tag{10.9.3} \]
Then
\[ |R(F, G)| \ge C_1^{\mathrm{ineff}}(m, n, L) \bigl( |D(F)|^{n/(m-1)} |D(G)|^{m/(n-1)} \bigr)^{1/18}. \]

It was shown in Evertse and Győry (1993) that the dependence on $L$ in Theorem 10.9.1 is necessary, and that neither of the conditions $m \ge 3$, $n \ge 3$ can be removed.

Proof (sketch). Let $F, G \in \mathbb{Z}[X, Y]$ be binary forms as in the statement of Theorem 10.9.1. After multiplying $F$, $G$ by small integers bounded above in terms of $L$, which will not have an effect on our result, we may assume that $F$, $G$ have factorizations as in (10.9.1) with $\alpha_i, \beta_i, \gamma_j, \delta_j \in O_L$ for $i = 1, \dots, m$, $j = 1, \dots, n$. Put $\Delta_{ij} := \beta_i \gamma_j - \alpha_i \delta_j$ for $i = 1, \dots, m$, $j = 1, \dots, n$. Then $\Delta_{ij} \in O_L$ for all $i, j$ and $\prod_{i,j} \Delta_{ij} = R(F, G)$. Further, for all distinct $i, j, k \in \{1, \dots, m\}$, $p, q, r \in \{1, \dots, n\}$ we have
\[ \begin{vmatrix} \Delta_{ip} & \Delta_{iq} & \Delta_{ir} \\ \Delta_{jp} & \Delta_{jq} & \Delta_{jr} \\ \Delta_{kp} & \Delta_{kq} & \Delta_{kr} \end{vmatrix} = \Delta_{ip}\Delta_{jq}\Delta_{kr} + \Delta_{iq}\Delta_{jr}\Delta_{kp} + \Delta_{ir}\Delta_{jp}\Delta_{kq} - \Delta_{iq}\Delta_{jp}\Delta_{kr} - \Delta_{ip}\Delta_{jr}\Delta_{kq} - \Delta_{ir}\Delta_{jq}\Delta_{kp} = 0. \tag{10.9.4} \]
As in the proof of Theorem 6.1.6, we consider all possible splittings of (10.9.4) into minimal non-vanishing subsums, and then apply Theorem 6.1.1 to each of these minimal sums. This leads to lower bounds for the quantities
\[ |N_{L/\mathbb{Q}}(\Delta_{ip}\Delta_{jp}\Delta_{kp}\Delta_{iq}\Delta_{jq}\Delta_{kq}\Delta_{ir}\Delta_{jr}\Delta_{kr})| \]
for all $i, j, k, p, q, r$. By taking the product of these, the theorem follows.

It is also possible to give a lower bound for $|R(F, G)|$ in terms of the heights of a pair of binary forms that is $\mathrm{GL}(2, \mathbb{Z})$-equivalent to $F$, $G$. The following result is Theorem 1 of Evertse (1998).

Theorem 10.9.2 Let $F, G \in \mathbb{Z}[X, Y]$ be binary forms with (10.9.3). Then there is $U \in \mathrm{GL}(2, \mathbb{Z})$ such that
\[ |R(F, G)| \ge C_2^{\mathrm{ineff}}(m, n, L) \bigl( H(F_U)^n H(G_U)^m \bigr)^{1/718}. \]

Of course, the theorem does not hold without the matrix $U$, since by varying the pairs $(F, G)$ in a given $\mathrm{GL}(2, \mathbb{Z})$-equivalence class, $|R(F, G)|$ remains the same, while $H(F)$, $H(G)$ may become arbitrarily large.

Proof (sketch). Apply Theorem 10.9.1. According to Theorem 10.7.3, there is $U \in \mathrm{GL}(2, \mathbb{Z})$ such that $H(G_U)$ is bounded above in terms of $|D(G)|$, and so in terms of $|R(F, G)|$. Writing $F_U = \prod_{i=1}^{m} (\alpha_i X - \beta_i Y)$, we get
\[ \prod_{i=1}^{m} G_U(\alpha_i, \beta_i) = \pm R(F_U, G_U) = \pm R(F, G). \]
Thus, for $i = 1, \dots, m$, the number $G_U(\alpha_i, \beta_i)$ divides $R(F, G)$ in $O_L$. We may view the pairs $(\alpha_i, \beta_i)$ as solutions to a Thue equation over $O_L$. This leads to upper bounds for the heights of the $\alpha_i, \beta_i$, and hence for $H(F_U)$, in terms of $H(G_U)$ and $|R(F, G)|$. Thus, both $H(F_U)$ and $H(G_U)$ can be estimated from above in terms of $R(F, G)$. A precise computation gives Theorem 10.9.2.

We deduce some consequences. The first is an analogue of Theorem 10.8.1.

Corollary 10.9.3 Let $R$ be a non-zero integer. Then the pairs of binary forms $F, G \in \mathbb{Z}[X, Y]$ with (10.9.3) and with $R(F, G) = R$ lie in at most finitely many $\mathrm{GL}(2, \mathbb{Z})$-equivalence classes.

Proof. Immediate consequence of Theorem 10.9.2.

The next consequence is a special case of Theorem 1 of Evertse and Győry (1989). Given a binary form $F \in \mathbb{Z}[X, Y]$ and an integer $m > 0$, we consider the Thue inequality
\[ 0 < |F(x, y)| \le m \quad \text{in } x, y \in \mathbb{Z}. \tag{10.9.5} \]
Two solutions $(x, y)$, $(x', y')$ of (10.9.5) are called proportional if $(x', y') = a(x, y)$ for some $a \in \mathbb{Q}^*$.

Corollary 10.9.4 Let $n \ge 3$ be an integer and $L$ a finite normal extension of $\mathbb{Q}$.
Then up to $\mathrm{GL}(2, \mathbb{Z})$-equivalence, there are only finitely many binary forms $F \in \mathbb{Z}[X, Y]$ of degree $n$ and with splitting field $L$ such that (10.9.5) has more than two pairwise non-proportional solutions.

Proof. Let $F \in \mathbb{Z}[X, Y]$ be a binary form of degree $n$ and splitting field $L$ and suppose that (10.9.5) has three pairwise non-proportional solutions, say $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$. Define the binary form $G := \prod_{i=1}^{3} (y_i X - x_i Y)$. Then
\[ 0 < |R(F, G)| = |F(x_1, y_1) F(x_2, y_2) F(x_3, y_3)| \le m^3. \]
By applying Corollary 10.9.3 with $R = \pm 1, \dots, \pm m^3$, we see that up to $\mathrm{GL}(2, \mathbb{Z})$-equivalence there are only finitely many possibilities for the pairs $F$, $G$, and so in particular only finitely many possibilities for $F$.

We finish with a result of LeVesque and Waldschmidt on parametrized Thue inequalities. Let $F = X^n + a_1 X^{n-1} Y + \cdots + a_n Y^n \in \mathbb{Z}[X, Y]$ be a square-free binary form of degree $n \ge 3$ and with given splitting field $L$. We can factor $F$ over $L$ as
\[ F = (X - \alpha_1 Y) \cdots (X - \alpha_n Y), \]
with $\alpha_1, \dots, \alpha_n$ distinct elements of $L$. Consider tuples $\boldsymbol{\varepsilon} := (\varepsilon_1, \dots, \varepsilon_n)$ with
\[ \varepsilon_1, \dots, \varepsilon_n \in O_L^*, \quad \varepsilon_1 \alpha_1, \dots, \varepsilon_n \alpha_n \text{ distinct}, \quad F_{\boldsymbol{\varepsilon}} := (X - \varepsilon_1 \alpha_1 Y) \cdots (X - \varepsilon_n \alpha_n Y) \in \mathbb{Z}[X, Y]. \tag{10.9.6} \]
Notice that for $\boldsymbol{\varepsilon}$ with (10.9.6) we necessarily have $\varepsilon_1 \cdots \varepsilon_n = \pm 1$. Let $m$ be an integer with $m \ge |F(0, 1)|$. Then for every $\boldsymbol{\varepsilon}$ with (10.9.6), the Thue inequality
\[ |F_{\boldsymbol{\varepsilon}}(x, y)| \le m \quad \text{in } x, y \in \mathbb{Z} \tag{10.9.7} \]
has solutions $(1, 0)$, $(0, 1)$. Solutions $(x, y)$ of (10.9.7) with $xy = 0$ are called trivial. The following result is a special case of Theorem 3.1 of LeVesque and Waldschmidt (2012).

Corollary 10.9.5 There are only finitely many $\boldsymbol{\varepsilon}$ with (10.9.6) such that (10.9.7) has non-trivial solutions.

Proof. By Corollary 10.9.4, the binary forms $F_{\boldsymbol{\varepsilon}}$ (with $\boldsymbol{\varepsilon}$ as in (10.9.6)) such that (10.9.7) has non-trivial solutions lie in only finitely many $\mathrm{GL}(2, \mathbb{Z})$-equivalence classes. So it suffices to show that a $\mathrm{GL}(2, \mathbb{Z})$-equivalence class can contain only finitely many binary forms $F_{\boldsymbol{\varepsilon}}$.
Let $\boldsymbol{\varepsilon}$ be as in (10.9.6), and suppose that $F_{\boldsymbol{\varepsilon}} = \pm F_U$ for some $U = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{GL}(2, \mathbb{Z})$. Then
\[ F(a, c) = \pm F_{\boldsymbol{\varepsilon}}(1, 0) = \pm F(1, 0) = \pm 1, \quad F(b, d) = \pm F_{\boldsymbol{\varepsilon}}(0, 1) = \pm F(0, 1). \]
Now by Thue's Theorem there are only finitely many possibilities for $a, b, c, d$, hence for $\boldsymbol{\varepsilon}$. This proves Corollary 10.9.5.

We mention that in all papers referred to above, generalizations of the theorems and corollaries stated above have been proved for binary forms with $S$-integral coefficients in a number field.

10.10 Lang's Conjecture for tori

Let $K$ be an algebraically closed field of characteristic 0 and $n$ an integer $\ge 2$. Let $(K^*)^n$ denote the $n$-fold direct product of the multiplicative group $K^*$ of non-zero elements of $K$. That is, $(K^*)^n$ is the group with coordinatewise multiplication
\[ \mathbf{x} \cdot \mathbf{y} := (x_1 y_1, \dots, x_n y_n) \quad \text{for } \mathbf{x} = (x_1, \dots, x_n),\ \mathbf{y} = (y_1, \dots, y_n) \in (K^*)^n, \]
and with unit element $\mathbf{1} = (1, \dots, 1)$. We write polynomials $f \in K[X_1, \dots, X_n]$ as $\sum_{\mathbf{a} \in I} c(\mathbf{a}) \mathbf{X}^{\mathbf{a}}$, where $I$ is a finite subset of $(\mathbb{Z}_{\ge 0})^n$, $c(\mathbf{a}) \in K^*$ for $\mathbf{a} \in I$ and $\mathbf{X}^{\mathbf{a}} := X_1^{a_1} \cdots X_n^{a_n}$ if $\mathbf{a} = (a_1, \dots, a_n)$. A subvariety of $(K^*)^n$ is a set
\[ \mathcal{X} = \{ \mathbf{x} \in (K^*)^n : f_1(\mathbf{x}) = 0, \dots, f_r(\mathbf{x}) = 0 \}, \]
where $f_1, \dots, f_r \in K[X_1, \dots, X_n]$. We do not require here that $\mathcal{X}$ is irreducible.

An algebraic subgroup of $(K^*)^n$ is a subvariety of $(K^*)^n$ that is also a subgroup of $(K^*)^n$. For instance, a subvariety of $(K^*)^n$ given by equations
\[ \mathbf{x}^{\mathbf{a}_i} = \mathbf{x}^{\mathbf{b}_i} \quad (i = 1, \dots, r) \]
with $\mathbf{a}_i, \mathbf{b}_i \in \mathbb{Z}_{\ge 0}^n$ for $i = 1, \dots, r$ is an algebraic subgroup of $(K^*)^n$ and in fact any algebraic subgroup of $(K^*)^n$ can be expressed in this form (see, e.g., Schmidt (1996)).
An algebraic coset is a subvariety of $(K^*)^n$ of the shape $\mathbf{u}H = \{\mathbf{u} \cdot \mathbf{x} : \mathbf{x} \in H\}$ where $H$ is an algebraic subgroup of $(K^*)^n$ and $\mathbf{u} \in (K^*)^n$. Such a coset is more precisely called a coset of $H$.

The following is a more precise quantitative version of theorems of Liardet (1974, 1975) for $n = 2$, and Laurent (1984) for $n \ge 3$.

Theorem 10.10.1 Let $\mathcal{X}$ be a subvariety of $(K^*)^n$ given by polynomials of total degree at most $\delta$. Let $\Gamma$ be a subgroup of $(K^*)^n$ of finite rank $r$. Then $\mathcal{X} \cap \Gamma$ is contained in a finite union $\mathbf{u}_1 H_1 \cup \cdots \cup \mathbf{u}_t H_t$ of algebraic cosets with $\mathbf{u}_i H_i \subseteq \mathcal{X}$ for $i = 1, \dots, t$, where $t \le C(n, \delta)^{r+1}$, with $C(n, \delta)$ effectively computable in terms of $n$ and $\delta$.

Below, we give a simple proof of Theorem 10.10.1 by making a reduction to unit equations
\[ a_1 x_1 + \cdots + a_n x_n = 1 \quad \text{in } (x_1, \dots, x_n) \in \Gamma, \tag{10.10.1} \]
where $a_1, \dots, a_n \in K^*$, and using the fact that such equations have only finitely many non-degenerate solutions (see Chapter 6). But, conversely, this finiteness result for (10.10.1) is a consequence of Theorem 10.10.1. Indeed, let $\mathcal{X}$ be the subvariety of $(K^*)^n$ given by the linear equation $a_1 x_1 + \cdots + a_n x_n = 1$. Denote by $\mathcal{X}^0$ the set of points of $\mathcal{X}$ that remain if we remove all algebraic cosets of dimension $> 0$ that are contained in $\mathcal{X}$. By Theorem 10.10.1, the set $\mathcal{X}^0 \cap \Gamma$ is finite. It can be shown that $\mathcal{X}^0$ consists precisely of the non-degenerate points in $\mathcal{X}$, i.e., those for which $\sum_{i \in J} a_i x_i \ne 0$ for each proper, non-empty subset $J$ of $\{1, \dots, n\}$. So Theorem 10.10.1 gives back the result that (10.10.1) has only finitely many non-degenerate solutions.

Theorem 10.10.1 may be viewed as the special case for algebraic tori of a general conjecture of Lang on semi-abelian varieties (see Lang (1960)).
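The distinction between degenerate and non-degenerate solutions of (10.10.1) can be made concrete with a short check over the rationals. The sketch below is illustrative only: the function names and the sample equation $x_1 + x_2 + x_3 = 1$ are assumptions, and the test simply searches for a vanishing subsum over a proper, non-empty index set $J$.

```python
from fractions import Fraction
from itertools import combinations

def is_solution(a, x):
    """Check a_1*x_1 + ... + a_n*x_n = 1 exactly."""
    return sum(Fraction(ai) * Fraction(xi) for ai, xi in zip(a, x)) == 1

def is_nondegenerate(a, x):
    """True if no proper, non-empty subsum of a_1*x_1 + ... + a_n*x_n vanishes."""
    terms = [Fraction(ai) * Fraction(xi) for ai, xi in zip(a, x)]
    n = len(terms)
    for size in range(1, n):                      # proper subsets only
        for J in combinations(range(n), size):
            if sum(terms[i] for i in J) == 0:
                return False
    return True

a = (1, 1, 1)
print(is_solution(a, (2, -4, 3)), is_nondegenerate(a, (2, -4, 3)))   # True True
print(is_solution(a, (2, -2, 1)), is_nondegenerate(a, (2, -2, 1)))   # True False
```

The second point lies on the family $(u, -u, 1)$, $u \in K^*$, which is an algebraic coset of positive dimension contained in the variety $x_1 + x_2 + x_3 = 1$; this is the kind of subset that is removed when passing from $\mathcal{X}$ to $\mathcal{X}^0$.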
We do not formally define the $n$-dimensional algebraic torus $\mathbb{G}^n_{m,K}$ over a field $K$. We only need the fact that its group of $K$-rational points is $\mathbb{G}^n_{m,K}(K) = (K^*)^n$, endowed with coordinatewise multiplication. A semi-abelian variety over a field $K$ is a commutative group variety $A$ over $K$ for which there exists a short exact sequence of group varieties over $K$,
\[ 0 \to \mathbb{G}^n_{m,K} \to A \to A_0 \to 0, \]
where $n \geq 0$ and $A_0$ is an abelian variety over $K$. If $n = 0$ then $A$ is an abelian variety, while if $A_0 = 0$ then $A$ is an algebraic torus. Writing $+$ for the group operation of $A$, we define a translate of a semi-abelian subvariety over $K$ of $A$ to be a subvariety of the shape $a + B := \{a + x : x \in B\}$, with $B$ a semi-abelian subvariety of $A$ over $K$ and $a \in A(K)$. Then Lang’s general conjecture for semi-abelian varieties is as follows.

Conjecture Let $A$ be a semi-abelian variety and $\mathcal{X}$ a subvariety of $A$, both defined over an algebraically closed field $K$ of characteristic 0. Let $\Gamma$ be a subgroup of $A(K)$ of finite rank. Then $\mathcal{X}(K) \cap \Gamma$ is contained in a finite union of translates $(a_1 + B_1) \cup \cdots \cup (a_t + B_t)$ of semi-abelian subvarieties of $A$ that are each contained in $\mathcal{X}$.

Lang’s Conjecture implies Mordell’s Conjecture (Mordell (1922a)) that for any irreducible algebraic curve $C$ of genus $g \geq 2$ defined over $\mathbb{Q}$, and any number field $L$, the set of $L$-rational points $C(L)$ is finite. Indeed, we may view $C$ as a subvariety of its Jacobian $\mathrm{Jac}_C$, which is a $g$-dimensional abelian variety over $\mathbb{Q}$. By the Mordell–Weil Theorem, the group $\mathrm{Jac}_C(L)$ is finitely generated. One-dimensional abelian subvarieties of $\mathrm{Jac}_C$ are elliptic curves, and so translates of those are curves of genus 1.
The curve $C$ itself cannot be a translate of an abelian subvariety of $\mathrm{Jac}_C$ since it has genus at least 2. So the translates of abelian subvarieties of $\mathrm{Jac}_C$ that are contained in $C$ are necessarily points, and one deduces that $C(L)$ is finite.

Lang’s Conjecture is now a theorem. Faltings (1983) proved Mordell’s Conjecture. Laurent (1984) proved Lang’s Conjecture in the case of tori. Faltings (1991, 1994) then proved Lang’s Conjecture in the case that $K = \overline{\mathbb{Q}}$ and $A$ is an abelian variety, but for $\Gamma$ a finitely generated subgroup of $A(\overline{\mathbb{Q}})$ instead of an arbitrary group of finite rank. Vojta (1996) proved Lang’s Conjecture for arbitrary semi-abelian varieties over $\overline{\mathbb{Q}}$, but still with $\Gamma$ finitely generated. Finally, McQuillan (1995), combining Vojta’s arguments with Hindry’s (Hindry (1988)), proved Lang’s Conjecture in full generality, with $K$ an arbitrary algebraically closed field of characteristic 0, and $\Gamma$ an arbitrary subgroup of $A(K)$ of finite rank.

We now prove Theorem 10.10.1. Our main tool is Theorem 6.1.3 on unit equations.

Proof of Theorem 10.10.1. We have
\[ \mathcal{X} = \Bigl\{\mathbf{x} \in (K^*)^n : f_i(\mathbf{x}) = \sum_{\mathbf{a}\in I_i} c_i(\mathbf{a})\mathbf{x}^{\mathbf{a}} = 0 \ (i = 1, \ldots, t)\Bigr\}, \]
where $I_i \subset \mathbb{Z}^n_{\geq 0}$ is finite, $c_i(\mathbf{a}) \in K^*$ for $i = 1, \ldots, t$, $\mathbf{a} \in I_i$, and the polynomials $f_1, \ldots, f_t$ have total degree at most $\delta$.

Let $I := \bigcup_{i=1}^t I_i$. With a point $\mathbf{x} \in \mathcal{X}$ we associate an unordered graph $\mathcal{G}_{\mathbf{x}}$ as follows. The vertices of $\mathcal{G}_{\mathbf{x}}$ are the elements of $I$, and a pair $\{\mathbf{p}, \mathbf{q}\}$ is an edge of $\mathcal{G}_{\mathbf{x}}$ if there are $i \in \{1, \ldots, t\}$ and a non-empty subset $J$ of $I_i$ such that
\[ \left.\begin{aligned} &\sum_{\mathbf{a}\in J} c_i(\mathbf{a})\mathbf{x}^{\mathbf{a}} = 0, \quad \mathbf{p}, \mathbf{q} \in J, \\ &\sum_{\mathbf{a}\in J'} c_i(\mathbf{a})\mathbf{x}^{\mathbf{a}} \neq 0 \ \text{ for each proper, non-empty subset } J' \text{ of } J. \end{aligned}\right\} \tag{10.10.2} \]
Notice that there are at most $2^{\binom{n+\delta}{n}^2}$ possibilities for the graph $\mathcal{G}_{\mathbf{x}}$. For a graph $\mathcal{G}$ on $I$, let
\[ \mathcal{X}_{\mathcal{G}} = \{\mathbf{x} \in \mathcal{X} : \mathcal{G}_{\mathbf{x}} = \mathcal{G}\}. \]
We fix a graph $\mathcal{G}$ on $I$, and show that $\mathcal{X}_{\mathcal{G}} \cap \Gamma$ is contained in a union of at most $C_1^{r+1}$ algebraic cosets, each of which is contained in $\mathcal{X}$, where $C_1$ is an effectively computable number depending only on $n$ and $\delta$. This clearly suffices.
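In fact, these cosets will all be cosets of the algebraic group
\[ H_{\mathcal{G}} := \{\mathbf{x} \in (K^*)^n : \mathbf{x}^{\mathbf{p}} = \mathbf{x}^{\mathbf{q}} \text{ for each edge } \{\mathbf{p}, \mathbf{q}\} \text{ of } \mathcal{G}\}. \]
We first show that if $\mathbf{u} \in \mathcal{X}_{\mathcal{G}}$, then $\mathbf{u}H_{\mathcal{G}} \subset \mathcal{X}$. We can express $I_1$ as a union of pairwise disjoint sets $J_1 \cup \cdots \cup J_r$ such that
\[ \sum_{\mathbf{a}\in J_i} c_1(\mathbf{a})\mathbf{u}^{\mathbf{a}} = 0 \quad \text{for } i = 1, \ldots, r, \]
and $\sum_{\mathbf{a}\in J} c_1(\mathbf{a})\mathbf{u}^{\mathbf{a}} \neq 0$ if $J$ is a proper, non-empty subset of one of the $J_i$. Let $\mathbf{x} \in H_{\mathcal{G}}$. Clearly, any pair $\{\mathbf{p}, \mathbf{q}\}$ contained in the same set $J_i$ is an edge of $\mathcal{G}$, hence $\mathbf{x}^{\mathbf{p}} = \mathbf{x}^{\mathbf{q}}$. Consequently, since $(\mathbf{u}\cdot\mathbf{x})^{\mathbf{a}} = \mathbf{u}^{\mathbf{a}}\mathbf{x}^{\mathbf{a}}$ and $\mathbf{x}^{\mathbf{a}}$ is constant on each $J_i$,
\[ f_1(\mathbf{u}\cdot\mathbf{x}) = \sum_{i=1}^r \sum_{\mathbf{a}\in J_i} c_1(\mathbf{a})(\mathbf{u}\cdot\mathbf{x})^{\mathbf{a}} = 0. \]
Similarly, $f_i(\mathbf{u}\cdot\mathbf{x}) = 0$ for $i = 2, \ldots, t$. This shows that $\mathbf{u}\cdot\mathbf{x} \in \mathcal{X}$ for every $\mathbf{x} \in H_{\mathcal{G}}$.

Let $\{\mathbf{p}, \mathbf{q}\}$ be an edge of $\mathcal{G}$ and $\mathbf{x} \in \mathcal{X}_{\mathcal{G}} \cap \Gamma$. Choose a set $J$ as in (10.10.2). Then
\[ \sum_{\mathbf{a}\in J\setminus\{\mathbf{q}\}} -\frac{c_i(\mathbf{a})}{c_i(\mathbf{q})}\,\mathbf{x}^{\mathbf{a}-\mathbf{q}} = 1. \]
Notice that the tuple $(\mathbf{x}^{\mathbf{a}-\mathbf{q}} : \mathbf{a} \in J \setminus \{\mathbf{q}\})$ is a non-degenerate solution to this equation, and that it lies in a homomorphic image of the group $\Gamma$. Now Theorem 6.1.3 implies that $\mathbf{x}^{\mathbf{p}-\mathbf{q}} \in U_{\mathbf{p},\mathbf{q}}$, where $U_{\mathbf{p},\mathbf{q}}$ is a set, which may depend on $\mathbf{p}, \mathbf{q}$ but is otherwise independent of $\mathbf{x}$, of cardinality at most $C_2^{r+1}$, where $C_2$ is effectively computable and depends only on $n, \delta$. The values $\mathbf{x}^{\mathbf{p}-\mathbf{q}}$, taken for all edges $\{\mathbf{p}, \mathbf{q}\}$ of $\mathcal{G}$, uniquely determine the coset $\mathbf{x}H_{\mathcal{G}}$. It follows that the points $\mathbf{x} \in \mathcal{X}_{\mathcal{G}} \cap \Gamma$ lie in a union of at most $C_1^{r+1}$ cosets of $H_{\mathcal{G}}$. This completes our proof.

We give an overview of some extensions and refinements of Theorem 10.10.1. For $\mathbf{x} = (x_1, \ldots, x_n) \in (\overline{\mathbb{Q}}^*)^n$ we define
\[ \widehat{h}(\mathbf{x}) := \sum_{i=1}^n h(x_i), \]
where, as usual, $h(x)$ denotes the absolute logarithmic height of $x \in \overline{\mathbb{Q}}$. Let $\Gamma$ be a finitely generated subgroup of $(\overline{\mathbb{Q}}^*)^n$. We denote by $\overline{\Gamma}$ the division group of $\Gamma$, that is, the subgroup of $(\overline{\mathbb{Q}}^*)^n$ consisting of the points $\mathbf{x} \in (\overline{\mathbb{Q}}^*)^n$ for which there is $m \in \mathbb{Z}_{>0}$ such that $\mathbf{x}^m \in \Gamma$.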
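To make the division group concrete, here is a one-dimensional worked example. The choice $n = 1$ and the generator $2$ are assumptions made purely for illustration; the computation uses only the standard height properties $h(\zeta y) = h(y)$ for roots of unity $\zeta$ and $h(y^m) = |m|\,h(y)$.

```latex
% Illustrative example: n = 1, Gamma generated by 2.
\[
  \Gamma = \{2^k : k \in \mathbb{Z}\} \subset \overline{\mathbb{Q}}^*, \qquad
  \overline{\Gamma} = \bigl\{\zeta\cdot 2^{k/m} : \zeta \text{ a root of unity},\ k \in \mathbb{Z},\ m \in \mathbb{Z}_{>0}\bigr\},
\]
% where 2^{k/m} denotes the positive real m-th root of 2^k.
% Heights of points of the division group:
\[
  h\bigl(\zeta\cdot 2^{k/m}\bigr) = \frac{|k|}{m}\,\log 2 .
\]
```

In particular, in this example the division group has rank 1 but is not finitely generated, and it contains points of arbitrarily small positive height.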
Define the following enlargements of $\overline{\Gamma}$:
\[ \overline{\Gamma}_\varepsilon := \{\mathbf{y}\cdot\mathbf{z} : \mathbf{y} \in \overline{\Gamma},\ \mathbf{z} \in (\overline{\mathbb{Q}}^*)^n,\ \widehat{h}(\mathbf{z}) < \varepsilon\}, \]
\[ C(\overline{\Gamma}, \varepsilon) := \{\mathbf{y}\cdot\mathbf{z} : \mathbf{y} \in \overline{\Gamma},\ \mathbf{z} \in (\overline{\mathbb{Q}}^*)^n,\ \widehat{h}(\mathbf{z}) < \varepsilon(1 + \widehat{h}(\mathbf{y}))\}. \]
We may view these as a “cylinder” and a “truncated cone” around $\overline{\Gamma}$. The sets $\overline{\Gamma}_\varepsilon$ and $C(\overline{\Gamma}, \varepsilon)$ were introduced by Poonen (1999) and Evertse (2002), respectively, in a more general context. Clearly, for $\varepsilon > 0$ we have $\overline{\Gamma} \subset \overline{\Gamma}_\varepsilon \subset C(\overline{\Gamma}, \varepsilon)$. It is important to note that $\overline{\Gamma}_\varepsilon$ and $C(\overline{\Gamma}, \varepsilon)$ are not groups.

Poonen (1999) formulated a “Lang–Bogomolov Conjecture” for semi-abelian varieties. In the case of algebraic tori, this states that if $\mathcal{X}$ is a subvariety of $(\overline{\mathbb{Q}}^*)^n$ and $\Gamma$ is a finitely generated subgroup of $(\overline{\mathbb{Q}}^*)^n$, then there is $\varepsilon > 0$ such that $\mathcal{X} \cap \overline{\Gamma}_\varepsilon$ is contained in a finite union of algebraic cosets, all contained in $\mathcal{X}$. For $\varepsilon = 0$ this is Lang’s Conjecture for tori, and for $\overline{\Gamma} = (\overline{\mathbb{Q}}^*_{\mathrm{tors}})^n$ we get Bogomolov’s Conjecture for tori. The general conjecture for semi-abelian varieties is similar, except that there one has to use a suitable canonical height on the semi-abelian variety under consideration. Poonen himself and, independently, S. Zhang (2000) proved the Lang–Bogomolov Conjecture for almost split semi-abelian varieties; these include algebraic tori and abelian varieties. The full conjecture was proved by Rémond (2003).

Let $\mathcal{X}$ be a subvariety of $(\overline{\mathbb{Q}}^*)^n$ and denote again by $\mathcal{X}^0$ the set that remains if we remove from $\mathcal{X}$ all algebraic cosets of positive dimension that are contained in $\mathcal{X}$. In Evertse (2002) it was stated, with a sketch of proof, that there is $\varepsilon > 0$ such that $\mathcal{X}^0 \cap C(\overline{\Gamma}, \varepsilon)$ is finite. Rémond (2003) proved a generalization of this for semi-abelian varieties.

Rémond (2002) obtained a quantitative version of the Lang–Bogomolov Conjecture for tori, a somewhat simplified version of which is as follows.
Suppose that $\mathcal{X}$ is given by polynomials of degree at most $\delta$. Define the number $\varepsilon$ := exp(−(n3n+3 log(n)). Assume $\Gamma$ has rank $r$. Then $\mathcal{X} \cap \overline{\Gamma}_\varepsilon$ is contained in a union of at most 2 exp n3n +3 log(n)(r + 1) algebraic cosets, each contained in $\mathcal{X}$. In his proof, Rémond did not use the Subspace Theorem or results on unit equations, but instead the ideas introduced by Faltings (1991), which led to bounds with a better dependence on $\delta$. Rémond (2000a, 2000b) proved an analogous result for subvarieties of abelian varieties.

Effective versions of the above-mentioned finiteness results for algebraic tori have been proved in very few cases. The first case is when $\mathcal{X}$ is a curve in $\overline{\mathbb{Q}}^* \times \overline{\mathbb{Q}}^*$, i.e., $\mathcal{X}$ is given by $P(x_1, x_2) = 0$ with $P \in \overline{\mathbb{Q}}[X_1, X_2]$. We assume that $P$ is an absolutely irreducible polynomial not of the form $aX_1^m + bX_2^n$ or $aX_1^mX_2^n + b$, so that $\mathcal{X}$ is not an algebraic coset of $\overline{\mathbb{Q}}^* \times \overline{\mathbb{Q}}^*$, which means that $\mathcal{X}$ does not contain one-dimensional algebraic cosets. Let $\Gamma$ be a finitely generated subgroup of $\overline{\mathbb{Q}}^* \times \overline{\mathbb{Q}}^*$. Bombieri and Gubler (2006), Theorem 5.4.5, gave an upper bound for the heights of the points $\mathbf{x} \in \mathcal{X} \cap \overline{\Gamma}$ that is effectively computable in terms of the heights of the coefficients of $P$ and of a set of generators for $\Gamma$.

The result of Bombieri and Gubler was extended by Bérczes, Evertse, Győry and Pontreau (2009) to sets $\mathcal{X} \cap C(\overline{\Gamma}, \varepsilon)$, where $\Gamma$ is a finitely generated subgroup of $\overline{\mathbb{Q}}^* \times \overline{\mathbb{Q}}^*$, and $\varepsilon > 0$ is an effectively computable number depending only on the coefficients of $P$ and a generating set for $\Gamma$. Moreover, in this latter work explicit upper bounds are given both for the heights and for the degrees of the coordinates of the points of these sets, all in terms of the coefficients of $P$ and the given generators of $\Gamma$.
Further, this work contains effective versions of Lang’s Conjecture for tori, with extensions to $\mathcal{X} \cap \overline{\Gamma}_\varepsilon$, $\mathcal{X} \cap C(\overline{\Gamma}, \varepsilon)$, for higher dimensional subvarieties $\mathcal{X}$ of $(\overline{\mathbb{Q}}^*)^n$ from a very restricted class, namely subvarieties given by polynomials with at most three non-zero terms.

Applying the specialization techniques discussed in Chapter 8, Bérczes (2015a) proved effective finiteness results for equations $P(x_1, x_2) = 0$ in $x_1, x_2 \in A^*$, where $A$ is an integral domain that is finitely generated over $\mathbb{Z}$ and $P \in A[X_1, X_2]$. In a subsequent paper, Bérczes (2015b) proved an effective finiteness result for $P(x_1, x_2) = 0$ in $(x_1, x_2) \in \overline{\Gamma}$, where $\Gamma$ is a finitely generated subgroup of $K^* \times K^*$.

10.11 Linear recurrence sequences and exponential-polynomial equations

We give a brief overview of some results concerning zeros of linear recurrence sequences and, more generally, integer solutions of exponential-polynomial equations. Much more on these topics can be found in Schmidt (2003).

Let $K$ be an algebraically closed field of characteristic 0. Recall that a (two-sided) linear recurrence sequence $U = \{u_m\}_{m=-\infty}^{\infty}$ in $K$ is given by initial values $u_0, \ldots, u_{k-1}$ and a linear recurrence
\[ u_{m+k} = c_1u_{m+k-1} + \cdots + c_ku_m \quad \text{for } m \in \mathbb{Z}, \]
where the $c_i$ belong to $K$ and $c_k \neq 0$. Further, assume that the length $k$ of the recurrence has been chosen minimally. Then we call $k$ the order of $U$, and
\[ f_U := X^k - c_1X^{k-1} - \cdots - c_k \]
the companion polynomial of $U$. These are uniquely determined by $U$. Assume that
\[ f_U = (X - \alpha_1)^{e_1} \cdots (X - \alpha_r)^{e_r}, \]
where $\alpha_1, \ldots, \alpha_r$ are distinct elements of $K$ and $e_1, \ldots, e_r$ positive integers.
Then the terms $u_m$ can also be expressed as
\[ u_m = f_1(m)\alpha_1^m + \cdots + f_r(m)\alpha_r^m \quad (m \in \mathbb{Z}), \]
where $f_i \in K[X]$ is a polynomial of degree exactly $e_i - 1$, for $i = 1, \ldots, r$. The sequence $U$ is called non-degenerate if $\alpha_1 \cdots \alpha_r \neq 0$ and $\alpha_i/\alpha_j$ is not a root of unity for any two distinct indices $i, j$ from $\{1, \ldots, r\}$. The zero-multiplicity $N(U)$ of $U$ is the number of integers $m$ such that $u_m = 0$, that is, the number of solutions of the exponential-polynomial equation
\[ f_1(m)\alpha_1^m + \cdots + f_r(m)\alpha_r^m = 0 \quad \text{in } m \in \mathbb{Z}. \tag{10.11.1} \]
We have the following general result, due to Skolem, Mahler and Lech.

Theorem 10.11.1 Let $U$ be a non-degenerate linear recurrence sequence in a field $K$ of characteristic 0. Then $N(U)$ is finite.

Skolem (1935) proved this in the case that $U$ has its terms in $\mathbb{Q}$ and Mahler (1935a) did so for sequences $U$ with algebraic terms. Finally, Lech (1953) proved the general result. The proofs of Skolem, Mahler and Lech were all based on Skolem’s $p$-adic power series method (Skolem (1933)).

We discuss a generalization of (10.11.1) to exponential-polynomial equations in several variables. Let again $K$ be a field of characteristic 0 and $n \geq 1$. For $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_n) \in (K^*)^n$ and $\mathbf{m} = (m_1, \ldots, m_n) \in \mathbb{Z}^n$, we write $\boldsymbol{\alpha}^{\mathbf{m}} := \alpha_1^{m_1} \cdots \alpha_n^{m_n}$. We consider equations
\[ f_1(\mathbf{m})\boldsymbol{\alpha}_1^{\mathbf{m}} + \cdots + f_r(\mathbf{m})\boldsymbol{\alpha}_r^{\mathbf{m}} = 0 \quad \text{in } \mathbf{m} \in \mathbb{Z}^n, \tag{10.11.2} \]
where $f_i \in K[X_1, \ldots, X_n]$, $\boldsymbol{\alpha}_i \in (K^*)^n$ for $i = 1, \ldots, r$. A solution $\mathbf{m} \in \mathbb{Z}^n$ of (10.11.2) is called non-degenerate if $\sum_{i\in I} f_i(\mathbf{m})\boldsymbol{\alpha}_i^{\mathbf{m}} \neq 0$ for each proper, non-empty subset $I$ of $\{1, \ldots, r\}$. We recall the following result, which is a special case of a more general theorem of Laurent (1984, 1989). Define the group
\[ G := \{\mathbf{m} \in \mathbb{Z}^n : \boldsymbol{\alpha}_1^{\mathbf{m}} = \cdots = \boldsymbol{\alpha}_r^{\mathbf{m}}\}. \]

Theorem 10.11.2 Assume that $G = \{\mathbf{0}\}$. Then (10.11.2) has only finitely many non-degenerate solutions.

Proof of Theorem 10.11.2 $\Longrightarrow$ Theorem 10.11.1. We proceed by induction on $r$. For $r = 1$ Theorem 10.11.1 is trivial.
Let $r \geq 2$ and assume that none of the quotients $\alpha_i/\alpha_j$ $(1 \leq i < j \leq r)$ is a root of unity. This implies that
\[ \{m \in \mathbb{Z} : \alpha_1^m = \cdots = \alpha_r^m\} = \{0\}. \]
Hence (10.11.1) has only finitely many non-degenerate solutions. By applying the induction hypothesis to any of the vanishing subsums, we infer that there are also only finitely many degenerate solutions.

Proof of Theorem 10.11.2 (sketch). The proof depends on Theorem 6.1.1. By means of a specialization argument (see for instance Schmidt (2003), section 9), Theorem 10.11.2 can be reduced to the case that the coordinates of the $\boldsymbol{\alpha}_i$ and the coefficients of the polynomials $f_i$ lie in an algebraic number field $K$. We restrict ourselves to this special case. Choose a finite set of places $S$ of $K$, containing the infinite places, such that the coordinates of the $\boldsymbol{\alpha}_i$ $(i = 1, \ldots, r)$ are all $S$-units.

We apply Theorem 6.1.1 to (10.11.2). Pick a non-degenerate solution $\mathbf{m}$ of (10.11.2), and put $x_i := f_i(\mathbf{m})\boldsymbol{\alpha}_i^{\mathbf{m}}$ for $i = 1, \ldots, r$. Then $x_1 + \cdots + x_r = 0$ and no proper subsum of the left-hand side is 0. Put $\|\mathbf{m}\| := \max(|m_1|, \ldots, |m_n|)$. Thanks to our choice of $S$, we have
\[ N_S(x_1 \cdots x_r) = N_S(f_1(\mathbf{m}) \cdots f_r(\mathbf{m})) \leq C_1\|\mathbf{m}\|^{C_2}, \tag{10.11.3} \]
where here and below, the $C_i$ are constants $> 1$ depending on the $f_i$ and the $\boldsymbol{\alpha}_i$. Further, since $\mathbf{m}$ is non-degenerate, we have $f_i(\mathbf{m}) \neq 0$ for $i = 1, \ldots, r$, which implies
\[ H_S(x_1, \ldots, x_r) = \prod_{v\in S} \max(|x_1|_v, \ldots, |x_r|_v) \geq C_3^{-1}\|\mathbf{m}\|^{-C_4} \cdot \prod_{v\in S} \max_{1\leq i\leq r} |\boldsymbol{\alpha}_i^{\mathbf{m}}|_v. \tag{10.11.4} \]
By the Product Formula and our choice for $S$, we have for $\mathbf{z} \in \mathbb{Z}^n$,
\[ \log \prod_{v\in S} \max_{1\leq i\leq r} |\boldsymbol{\alpha}_i^{\mathbf{z}}|_v \geq \frac{1}{r^2} \sum_{1\leq i,j\leq r} h\bigl((\boldsymbol{\alpha}_i\boldsymbol{\alpha}_j^{-1})^{\mathbf{z}}\bigr) =: \psi(\mathbf{z}). \]
One can easily show that $\psi$ satisfies the triangle inequality, and $\psi(\lambda\mathbf{z}) = |\lambda|\psi(\mathbf{z})$ for $\lambda \in \mathbb{Z}$, $\mathbf{z} \in \mathbb{Z}^n$.
Further, if $\psi(\mathbf{z}) = 0$, then all terms $(\boldsymbol{\alpha}_i\boldsymbol{\alpha}_j^{-1})^{\mathbf{z}}$ are roots of unity, so some positive integer multiple of $\mathbf{z}$ lies in $G$. By our assumption on $G$, this implies that $\mathbf{z} = \mathbf{0}$. Hence $\psi$ defines a norm on $\mathbb{Z}^n$. Both $\psi$ and the maximum norm $\|\cdot\|$ can be extended to norms on $\mathbb{R}^n$, and by a simple compactness argument one shows that there is $c > 0$ such that $\psi(\mathbf{z}) \geq c\|\mathbf{z}\|$ for $\mathbf{z} \in \mathbb{R}^n$. So
\[ \prod_{v\in S} \max_{1\leq i\leq r} |\boldsymbol{\alpha}_i^{\mathbf{z}}|_v \geq \exp\bigl(r^{-2}c\|\mathbf{z}\|\bigr) \quad \text{for } \mathbf{z} \in \mathbb{Z}^n. \]
Together with (10.11.4) this implies
\[ H_S(x_1, \ldots, x_r) \geq C_3^{-1}\|\mathbf{m}\|^{-C_4}C_5^{\|\mathbf{m}\|}. \]
From Theorem 6.1.1 we deduce
\[ H_S(x_1, \ldots, x_r) \leq C_6N_S(x_1 \cdots x_r)^2, \]
say. Combining this with the lower bound for $H_S(x_1, \ldots, x_r)$ just derived and the upper bound for $N_S(x_1 \cdots x_r)$ from (10.11.3), we obtain
\[ C_3^{-1}\|\mathbf{m}\|^{-C_4}C_5^{\|\mathbf{m}\|} \leq C_6(C_1\|\mathbf{m}\|^{C_2})^2, \]
which implies that $\|\mathbf{m}\|$ is bounded.

Below, we discuss quantitative results (upper bounds for the number of solutions) for (10.11.1) and (10.11.2) that have been obtained as consequences of the Quantitative Subspace Theorem.

For a long time it was an open problem to obtain a uniform upper bound for the zero-multiplicity $N(U)$ of a non-degenerate linear recurrence sequence $U$ depending only on the order of $U$. This was finally settled by Schmidt (1999), who proved the following.

Theorem 10.11.3 Let $U$ be a non-degenerate linear recurrence sequence of order $k$ in a field of characteristic 0. Then
\[ N(U) \leq \exp\exp\exp(3k\log k). \]

Schmidt’s very intricate proof is based on the Quantitative Subspace Theorem from Evertse and Schlickewei (2002), but uses various other techniques. In fact, the special case where the polynomials $f_i$ in (10.11.1) are all constants follows easily from Theorem 6.1.3, but the extension to arbitrary polynomials $f_i$ was very difficult.
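The notion of zero-multiplicity can be explored numerically. The following is a minimal sketch, not part of the original text: the helper names `recurrence_terms` and `zero_indices` are hypothetical, the sample recurrence $u_{m+3} = 2u_{m+2} - 4u_{m+1} + 4u_m$ with $u_0 = u_1 = 0$, $u_2 = 1$ is an assumed illustrative input, and only indices $m \geq 0$ in a finite window are scanned. Since $N(U)$ counts zeros over all $m \in \mathbb{Z}$, such a scan can only give a lower bound for $N(U)$.

```python
def recurrence_terms(coeffs, initial, count):
    """Return [u_0, ..., u_{count-1}] for the recurrence
    u_{m+k} = c_1*u_{m+k-1} + ... + c_k*u_m, where coeffs = [c_1, ..., c_k]
    and initial = [u_0, ..., u_{k-1}] (exact integer arithmetic)."""
    k = len(coeffs)
    if len(initial) != k:
        raise ValueError("need exactly k initial values")
    terms = list(initial)
    while len(terms) < count:
        # terms[-1] is u_{m+k-1} (paired with c_1), terms[-k] is u_m (paired with c_k)
        terms.append(sum(c * terms[-j] for j, c in enumerate(coeffs, start=1)))
    return terms[:count]


def zero_indices(coeffs, initial, count):
    """Indices m in [0, count) with u_m = 0; a finite scan, so only a lower
    bound for the zero-multiplicity N(U)."""
    return [m for m, u in enumerate(recurrence_terms(coeffs, initial, count)) if u == 0]


if __name__ == "__main__":
    # Assumed sample input: an order-3 integer recurrence with several zeros.
    print(zero_indices([2, -4, 4], [0, 0, 1], 20))
    # Fibonacci numbers for comparison: the only zero with m >= 0 is u_0.
    print(zero_indices([1, 1], [0, 1], 30))
```

Because the terms are Python integers, the zero test is exact; for recurrences over a number field one would need exact algebraic arithmetic instead of floating point.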
Schmidt’s bound has subsequently been improved by Schmidt himself (Schmidt (2000)), by Allen (2007), and by Amoroso and Viada (2011). The best upper bound to date for $N(U)$, from the last mentioned paper, is $\exp\exp(70k)$. In Schmidt (2003), Schmidt worked out his method of proof in a special case, giving a flavour of the main ideas.

It is conjectured that under the assumption $G = \{\mathbf{0}\}$, the number of solutions of (10.11.2) is bounded above by a quantity depending only on $r$ and the total degrees of $f_1, \ldots, f_r$. Schlickewei and Schmidt (2000) proved the following weaker result for exponential-polynomial equations over number fields. We keep the notation from (10.11.2).

Theorem 10.11.4 Assume that the coordinates of the points $\boldsymbol{\alpha}_i$ and the coefficients of the polynomials $f_i$ lie in an algebraic number field $K$ of degree $d$. Let $\delta_i$ denote the total degree of $f_i$ for $i = 1, \ldots, r$ and put
\[ B := \max\Bigl(n, \sum_{i=1}^r \binom{n+\delta_i}{n}\Bigr). \]
Assume that $G = \{\mathbf{0}\}$. Then equation (10.11.2) has at most
\[ c(B, d) := 2^{35B^3}d^{6B^2} \]
non-degenerate solutions.

Again the main tool in the proof is the Quantitative Subspace Theorem from Evertse and Schlickewei (2002) (which was already proved a couple of years earlier). There are various generalizations of Theorem 10.11.4, see Schlickewei and Schmidt (2000) and Ahlgren (1999).

From Schmidt (2009) and Corvaja, Schmidt and Zannier (2010) the following special case of the above conjecture can be deduced. Let $K$ be any field of characteristic 0, $f \in K[X_1, \ldots, X_n]$ a polynomial of total degree $\delta$, and $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_n)$, where $\alpha_1, \ldots, \alpha_n$ are multiplicatively independent, non-zero elements of $K$. Then the equation
\[ \boldsymbol{\alpha}^{\mathbf{m}} = f(\mathbf{m}) \]
has at most $\exp(B^{9B})$ solutions $\mathbf{m} \in \mathbb{Z}^n$, where $B := 1 + \binom{n+\delta}{\delta}$.

10.12 Algebraic independence results

A possibly infinite sequence $\alpha_1, \alpha_2, \ldots$ is called algebraically independent over a field $K$ if there are no $N$ and $P \in K[X_1, \ldots, X_N] - \{0\}$ such that $P(\alpha_1, \ldots, \alpha_N) = 0$. Nishioka (1986, 1987, 1989, 1994) proved various algebraic independence results for values of certain power series at algebraic arguments. All these results are applications of the semi-effective Theorem 6.1.1. Here, we prove a special case of one of Nishioka’s results, and mention some of her other results.

Below, by algebraic numbers we always mean complex numbers that are algebraic over $\mathbb{Q}$. Let $K \subset \mathbb{C}$ be an algebraic number field and
\[ f(z) = \sum_{k=0}^{\infty} a_kz^{e_k} \]
a power series with coefficients $a_k \in K$ and with $\{e_k\}_{k=0}^{\infty}$ a strictly increasing sequence of non-negative integers. Assume that $f(z)$ has radius of convergence $R > 0$. Further, assume that $\{e_k\}$ grows rapidly, i.e.,
\[ \lim_{k\to\infty} \frac{e_k + \sum_{i=1}^k h(a_i)}{e_{k+1}} = 0, \]
where as usual $h(\alpha)$ denotes the absolute logarithmic height of an algebraic number $\alpha$. By Cijsouw and Tijdeman (1973), the number $f(\alpha)$ is transcendental for any algebraic number $\alpha$ with $0 < |\alpha| < R$. Further, it was shown by Bundschuh and Wylegala (1980) that $f(\alpha_1), \ldots, f(\alpha_n)$ are algebraically independent for any algebraic numbers $\alpha_1, \ldots, \alpha_n$ with $0 < |\alpha_1| < \cdots < |\alpha_n| < R$.

These results were extended by Nishioka as follows. Call non-zero algebraic numbers $\alpha_1, \ldots, \alpha_s$ $\{e_k\}$-dependent if there are $\gamma$, roots of unity $\zeta_1, \ldots, \zeta_s$, and algebraic numbers $d_1, \ldots, d_s$, not all zero, such that
\[ \alpha_i = \zeta_i\gamma \ \text{ for } i = 1, \ldots, s, \qquad \sum_{i=1}^s d_i\zeta_i^{e_k} = 0 \ \text{ for all sufficiently large } k. \]
Denote by $f^{(l)}$ the $l$-th derivative of $f$, where $f^{(0)} = f$. The following result is Theorem 1 of Nishioka (1987).

Theorem 10.12.1 Let $\alpha_1, \ldots, \alpha_n$ be algebraic numbers with $0 < |\alpha_i| < R$ for $i = 1, \ldots, n$. Then the following three assertions are equivalent:
(i) $f^{(l)}(\alpha_i)$ $(i = 1, \ldots, n,\ l \geq 0)$ are algebraically dependent over $\mathbb{Q}$;
(ii) there are distinct $i_1, \ldots, i_s \in \{1, \ldots, n\}$ such that $\alpha_{i_1}, \ldots, \alpha_{i_s}$ are $\{e_k\}$-dependent;
(iii) $1, f(\alpha_1), \ldots, f(\alpha_n)$ are linearly dependent over the algebraic numbers.

To give a flavour of Nishioka’s method of proof, we prove the following special case.

Theorem 10.12.2 Let $f(z) = \sum_{k=0}^{\infty} z^{e_k}$, where $\{e_k\}_{k=0}^{\infty}$ is a strictly increasing sequence of non-negative integers with $\lim_{k\to\infty} e_k/e_{k+1} = 0$. Further, let $\alpha_1, \ldots, \alpha_n$ be algebraic numbers such that $|\alpha_i| < 1$ for $i = 1, \ldots, n$ and none of the quotients $\alpha_i/\alpha_j$ $(1 \leq i < j \leq n)$ is a root of unity. Then the numbers $f^{(l)}(\alpha_i)$ $(i = 1, \ldots, n,\ l \geq 0)$ are algebraically independent over $\mathbb{Q}$.

In fact, this was proved earlier by Nishioka (1986), with $e_k = k!$ for all $k$.

We first prove a crucial lemma (see Nishioka (1989), Lemma 1), which is a consequence of Theorem 6.1.1.

Lemma 10.12.3 Let $\Lambda$ be an infinite set of non-negative integers. Further, let $K$ be a number field, $\gamma_1, \ldots, \gamma_n$ non-zero elements of $K$, and $\{A_i(m)\}_{m\in\Lambda}$ $(i = 1, \ldots, n)$ sequences of elements of $K$, such that
\[ \gamma_i/\gamma_j \text{ is not a root of unity for all } i, j \text{ with } 1 \leq i < j \leq n, \tag{10.12.1} \]
\[ A_i(m) \neq 0 \ \text{ for } i = 1, \ldots, n,\ m \in \Lambda, \tag{10.12.2} \]
\[ \lim_{\substack{m\to\infty \\ m\in\Lambda}} \frac{h(A_i(m))}{m} = 0 \ \text{ for } i = 1, \ldots, n. \tag{10.12.3} \]
Then for every $\theta$ with $0 < \theta < 1$ and every place $v$ of $K$, there are only finitely many $m \in \Lambda$ such that
\[ |A_1(m)\gamma_1^m + \cdots + A_n(m)\gamma_n^m|_v \leq |\gamma_1|_v^m\theta^m. \tag{10.12.4} \]
Remark This lemma easily implies Theorem 10.11.1 (the Skolem–Mahler–Lech Theorem for linear recurrence sequences) in the case of linear recurrence sequences with terms in an algebraic number field.

Proof. The proof is by induction on $n$. For $n = 1$, the lemma is an easy consequence of assumptions (10.12.2), (10.12.3) and of (1.9.1), more precisely the inequality $\log |A_1(m)|_v \geq -[K : \mathbb{Q}] \cdot h(A_1(m))$.

Now let $n \geq 2$ and suppose the lemma is true for sums with fewer than $n$ terms. The induction hypothesis implies that for all but finitely many $m \in \Lambda$, every proper subsum of $\sum_{i=1}^n A_i(m)\gamma_i^m$ is non-zero. We show that also $\sum_{i=1}^n A_i(m)\gamma_i^m$ can be 0 for at most finitely many $m$. Indeed, by (10.12.1) there is $v \in M_K$ such that $|\gamma_1/\gamma_n|_v \neq 1$. Assume without loss of generality that $|\gamma_1/\gamma_n|_v > 1$. Notice that by (10.12.3),
\[ \frac{h(A_i(m)/A_n(m))}{m} \leq \frac{h(A_i(m)) + h(A_n(m))}{m} \to 0 \quad \text{as } m \in \Lambda,\ m \to \infty. \]
Now if $\sum_{i=1}^n A_i(m)\gamma_i^m = 0$, then
\[ \Bigl|\frac{A_1(m)}{A_n(m)} \cdot \gamma_1^m + \cdots + \frac{A_{n-1}(m)}{A_n(m)} \cdot \gamma_{n-1}^m\Bigr|_v = |\gamma_n|_v^m, \]
and by the induction hypothesis (applied with $\theta := |\gamma_n/\gamma_1|_v < 1$) this is possible for only finitely many $m$. So, after removing at most finitely many integers, we obtain an infinite set of positive integers $\Lambda$ such that for every $m \in \Lambda$, each subsum of $\sum_{i=1}^n A_i(m)\gamma_i^m$ is non-zero.

By Lemma 1.9.1, for every $m \in \Lambda$ there is a positive rational integer $d_m$, with $\log d_m \leq [K : \mathbb{Q}] \sum_{i=1}^n h(A_i(m))$, such that $d_mA_i(m) \in O_K$ for $i = 1, \ldots, n$. Now, clearly, (10.12.3) remains valid with $d_mA_i(m)$ instead of $A_i(m)$, and $(\log d_m)/m \to 0$ as $m \to \infty$. Hence we may as well prove our lemma with $d_mA_i(m)$ instead of $A_i(m)$. So, without loss of generality, we may and will assume that all $A_i(m)$ are algebraic integers.

Now let $S$ be a finite set of places of $K$ such that $v \in S$, and $\gamma_1, \ldots, \gamma_n$ are all $S$-units. Put $u_i(m) := A_i(m)\gamma_i^m$ for $i = 1, \ldots, n$. We apply Proposition 6.2.1 (which is in fact equivalent to Theorem 6.1.1) with $x_i = u_i(m)$ for $i = 1, \ldots, n$, $S$ as above, and $T = \{v\}$. Define
\[ \delta_m := \frac{[K : \mathbb{Q}] \sum_{i=1}^n h(A_i(m))}{m}, \qquad H := \sum_{w\in S} \log\max(|\gamma_1|_w, \ldots, |\gamma_n|_w). \]
Then $H \geq 0$. Notice that for $m \in \Lambda$ we have by (1.9.1),
\[ \prod_{w\in S}\prod_{i=1}^n |u_i(m)|_w = \prod_{w\in S}\prod_{i=1}^n |A_i(m)|_w \leq \exp(\delta_mm), \]
while
\[ H_S(u_1(m), \ldots, u_n(m)) \leq \exp(mH) \cdot \prod_{w\in S} \max(1, |A_1(m)|_w, \ldots, |A_n(m)|_w) \leq \exp((H + \delta_m)m), \]
and lastly, by (1.9.1),
\[ \max(|u_1(m)|_v, \ldots, |u_n(m)|_v) \geq |\gamma_1|_v^m\exp(-\delta_mm). \]
By combining these three inequalities with Proposition 6.2.1 we obtain that for every $\varepsilon > 0$ there is a constant $C(\varepsilon) > 0$ such that for all $m \in \Lambda$,
\[ |A_1(m)\gamma_1^m + \cdots + A_n(m)\gamma_n^m|_v \geq C(\varepsilon)\max(|u_1(m)|_v, \ldots, |u_n(m)|_v) \times \Bigl(\prod_{w\in S}\prod_{i=1}^n |u_i(m)|_w\Bigr)^{-1} \cdot H_S(u_1(m), \ldots, u_n(m))^{-\varepsilon} \]
\[ \geq C(\varepsilon)|\gamma_1|_v^m\exp\bigl(-m((2 + \varepsilon)\delta_m + \varepsilon H)\bigr). \]
Since we can choose $\varepsilon$ arbitrarily small and since $\delta_m \to 0$ by (10.12.3), it follows that for every $\theta$ with $0 < \theta < 1$, inequality (10.12.4) has only finitely many solutions in $m \in \Lambda$.

Proof of Theorem 10.12.2. We use the following notation. Let $K$ be the number field $\mathbb{Q}(\sqrt{-1}, \alpha_1, \ldots, \alpha_n)$. Let $v$ be the (necessarily complex) place of $K$ such that $|\cdot|_v = |\cdot|^2$. For $\mathbf{x} = (x_1, \ldots, x_r) \in \mathbb{C}^r$ we put $\|\mathbf{x}\| := \max_{1\leq i\leq r} |x_i|$. Further, for $\mathbf{x} = (x_1, \ldots, x_r) \in K^r$, $w \in M_K$ we put $\|\mathbf{x}\|_w := \max_{1\leq i\leq r} |x_i|_w$.

Assume that Theorem 10.12.2 is false. Then there is $L \geq 0$ such that the numbers $f^{(l)}(\alpha_i)$ $(i = 1, \ldots, n,\ l = 0, \ldots, L)$ are algebraically dependent over $\mathbb{Q}$. Assume that
\[ |\alpha_1| = \max_{1\leq i\leq n} |\alpha_i|. \tag{10.12.5} \]
For $m \geq 0$ put $f_m(z) := \sum_{k=0}^m z^{e_k}$. Define vectors $\mathbf{u} \in \mathbb{C}^{n(L+1)}$ and $\mathbf{u}_m \in K^{n(L+1)}$ $(m = 0, 1, 2, \ldots)$ by
\[ \mathbf{u} := \bigl(\alpha_i^lf^{(l)}(\alpha_i)\bigr)_{\substack{i=1,\ldots,n \\ l=0,\ldots,L}}, \qquad \mathbf{u}_m := \bigl(\alpha_i^lf_m^{(l)}(\alpha_i)\bigr)_{\substack{i=1,\ldots,n \\ l=0,\ldots,L}}. \]
Then $\lim_{m\to\infty} \mathbf{u}_m = \mathbf{u}$.
Note that
\[ \alpha_i^lf_m^{(l)}(\alpha_i) = \sum_{k=0}^m e_k(e_k - 1) \cdots (e_k - l + 1)\alpha_i^{e_k}. \]
There is a non-zero polynomial $P \in \mathbb{Q}[X_{10}, \ldots, X_{nL}]$ in $n(L + 1)$ variables such that
\[ P(\mathbf{u}) = 0 \]
(i.e., $P$ evaluated at $X_{il} = \alpha_i^lf^{(l)}(\alpha_i)$ for $i = 1, \ldots, n$, $l = 0, \ldots, L$). We choose such $P$ of minimal total degree. Denote the total degree of $P$ by $D$.

The constants $C_i$ introduced below will be $\geq 1$ and depend on $P, \alpha_1, \ldots, \alpha_n$ only. Further, $\{C_{iw}\}_{w\in M_K}$ will be tuples of constants depending only on $P, \alpha_1, \ldots, \alpha_n$, where $C_{iw} \geq 1$ for all $w \in M_K$ and $C_{iw} = 1$ for all but finitely many $w$.

For $m = 0, 1, 2, \ldots$, we have by (10.12.5),
\[ |P(\mathbf{u}_m)|_v = |P(\mathbf{u}_m) - P(\mathbf{u})|^2 \leq C_1\|\mathbf{u}_m - \mathbf{u}\|^2 \leq C_2e_{m+1}^{2L}|\alpha_1|^{2e_{m+1}}. \]
Further, for $w \in M_K \setminus \{v\}$ we have
\[ |P(\mathbf{u}_m)|_w \leq C_{3w}\max(1, \|\mathbf{u}_m\|_w)^D \leq C_{3w}\max\bigl(1, |(m + 1)e_m^L|_w\bigr)^D\max(1, |\alpha_1|_w, \ldots, |\alpha_n|_w)^{De_m} \leq C_{4w}^{e_m}. \tag{10.12.6} \]
Hence
\[ \prod_{w\in M_K} |P(\mathbf{u}_m)|_w \leq C_5^{e_m}e_{m+1}^{2L}|\alpha_1|^{2e_{m+1}}, \]
which is $< 1$ for $m$ sufficiently large since $|\alpha_1| < 1$ and $e_m/e_{m+1} \to 0$ as $m \to \infty$. So by the Product Formula, $P(\mathbf{u}_m) = 0$ for all sufficiently large $m$.

We infer that for sufficiently large $m$ we have by Taylor’s formula,
\[ 0 = P(\mathbf{u}_m) - P(\mathbf{u}_{m-1}) = \sum_{\mathbf{a}} P_{\mathbf{a}}(\mathbf{u}_{m-1}) \prod_{i=1}^n\prod_{l=0}^L \bigl(e_m(e_m - 1) \cdots (e_m - l + 1)\alpha_i^{e_m}\bigr)^{a_{il}}, \]
where the sum is over all tuples of non-negative integers $\mathbf{a} = (a_{il})_{i=1,\ldots,n,\ l=0,\ldots,L}$ with $\mathbf{a} \neq \mathbf{0}$, $\sum_{i,l} a_{il} \leq D$, and where $P_{\mathbf{a}} := \bigl(\prod_{i=1}^n\prod_{l=0}^L (a_{il}!)^{-1}\partial^{a_{il}}/\partial X_{il}^{a_{il}}\bigr)P$. Estimating the terms with partial derivatives of order at least 2, we get by (10.12.5),
\[ \Bigl|\sum_{i=1}^n \beta_i(m)\alpha_i^{e_m}\Bigr| \leq C_6\max(1, \|\mathbf{u}_{m-1}\|)^De_m^{LD}|\alpha_1|^{2e_m} \leq C_7^{e_{m-1}}e_m^{LD}|\alpha_1|^{2e_m}, \tag{10.12.7} \]
where
\[ \beta_i(m) := \sum_{l=0}^L e_m(e_m - 1) \cdots (e_m - l + 1) \cdot \frac{\partial P}{\partial X_{il}}(\mathbf{u}_{m-1}) \quad \text{for } i = 1, \ldots, n. \]
We apply Lemma 10.12.3 with $\Lambda = \{e_k\}_{k=0}^{\infty}$ minus possibly a finite subset, $\gamma_i = \alpha_i$, $A_i(e_k) = \beta_i(k)$ for $i = 1, \ldots, n$, $k \geq 0$, to show that (10.12.7) cannot hold for infinitely many $m$, and derive a contradiction. Condition (10.12.1) is satisfied with $\gamma_i = \alpha_i$ for $i = 1, \ldots, n$ by assumption.

As for condition (10.12.2), we have to show that for all sufficiently large $m$ we have $\beta_i(m) \neq 0$ for $i = 1, \ldots, n$. Indeed, suppose that for some $i$ we have $\beta_i(m) = 0$ for infinitely many $m$. Then for these $m$,
\[ \frac{\partial P}{\partial X_{iL}}(\mathbf{u}_{m-1}) = -\sum_{l=0}^{L-1} \frac{1}{(e_m - l) \cdots (e_m - L + 1)} \cdot \frac{\partial P}{\partial X_{il}}(\mathbf{u}_{m-1}), \]
and by letting $m \to \infty$ we get
\[ \frac{\partial P}{\partial X_{iL}}(\mathbf{u}) = 0. \]
But this is impossible since we had chosen $P$ of minimal total degree with $P(\mathbf{u}) = 0$.

To verify (10.12.3) we have to estimate the absolute logarithmic height of $\beta_i(m)$. By a similar computation as in (10.12.6), one has for all sufficiently large $m$,
\[ |\beta_i(m)|_w \leq C_{8w}^{e_{m-1}} \quad \text{for } i = 1, \ldots, n,\ w \in M_K, \]
and so $h(\beta_i(m)) \leq C_9e_{m-1}$ for $i = 1, \ldots, n$. Hence
\[ \frac{h(\beta_i(m))}{e_m} \leq \frac{C_9e_{m-1}}{e_m} \to 0 \quad \text{as } m \to \infty. \]
So all conditions of Lemma 10.12.3 are satisfied. By (10.12.7) and $|\alpha_1|_v = |\alpha_1|^2 < 1$, $e_{m-1}/e_m \to 0$ as $m \to \infty$, there is $\theta$ with $0 < \theta < 1$ such that for all sufficiently large $m$,
\[ |\beta_1(m)\alpha_1^{e_m} + \cdots + \beta_n(m)\alpha_n^{e_m}|_v \leq (|\alpha_1|_v\theta)^{e_m}. \]
But this contradicts Lemma 10.12.3. Theorem 10.12.2 follows.

Nishioka (1989) proved algebraic independence results for values at algebraic points of power series
\[ f_\omega(z) = \sum_{k=0}^{\infty} [k\omega]z^k, \]
where $\omega$ is a real irrational number and $[x]$ denotes the integral part of a real number $x$. Her method of proof is a variation on that given above, and the main tool is again Theorem 6.1.1.
One of her results from that paper (i.e., Theorem 2) states that if $\omega$ has unbounded partial quotients in its continued fraction expansion, $\alpha_1, \ldots, \alpha_n$ are algebraic numbers such that $|\alpha_i| < 1$ for $i = 1, \ldots, n$ and none of the quotients $\alpha_i/\alpha_j$ $(1 \leq i < j \leq n)$ is a root of unity, then the numbers $f_\omega(\alpha_1), \ldots, f_\omega(\alpha_n)$ are algebraically independent over the rationals.

In Nishioka (1994), the author applies Theorem 6.1.1 to obtain algebraic independence results for values of Mahler functions at algebraic points. For an extensive treatment of the transcendence theory of Mahler functions we refer to Nishioka (1996). We state only the following special case of Nishioka (1994), Proposition: let
\[ F_r(z) := \sum_{k=0}^{\infty} z^{r^k}, \qquad G_r(z) := \prod_{k=0}^{\infty} (1 - z^{r^k}). \]
Then for every algebraic number $\alpha$ with $0 < |\alpha| < 1$, the numbers
\[ F_r(\alpha)\ (r = 2, 3, \ldots), \qquad G_r(\alpha)\ (r = 2, 3, \ldots) \]
are algebraically independent over the rationals.

There are various other transcendence results that follow from the Subspace Theorem but not specifically from Theorem 6.1.1, see for instance Corvaja and Zannier (2002b), the survey Bugeaud (2011), and the book Bugeaud (2012), in particular chapters 8 and 9.

References

Adamczewski, B. and J. P. Bell (2012), On vanishing coefficients of algebraic power series over fields of positive characteristic, Invent. Math. 187, 343–393.
Ahlgren, S. (1999), The set of solutions of a polynomial-exponential equation, Acta Arith. 87, 189–207.
Allen, P. B.
(2007), On the multiplicity of linear recurrence sequences, J. Number Theory 126, 212–216.
Amoroso, F. and E. Viada (2009), Small points on subvarieties of a torus, Duke Math. J. 150, 407–442.
Amoroso, F. and E. Viada (2011), On the zeros of linear recurrence sequences, Acta Arith. 147, 387–396.
Arenas-Carmona, L., D. Berend and V. Bergelson (2008), Ledrappier’s system is almost mixing of all orders, Ergodic Theory Dynam. Systems 28, 339–365.
Aschenbrenner, M. (2004), Ideal membership in polynomial rings over the integers, J. Amer. Math. Soc. 17, 407–442.
Ashrafi, N. and P. Vámos (2005), On the unit sum number of some rings, Quart. J. Math. 56, 1–12.
Baker, A. (1966), Linear forms in the logarithms of algebraic numbers, Mathematika 13, 204–216.
Baker, A. (1967a), Linear forms in the logarithms of algebraic numbers, II, Mathematika 14, 102–107.
Baker, A. (1967b), Linear forms in the logarithms of algebraic numbers, III, Mathematika 14, 220–228.
Baker, A. (1968a), Linear forms in the logarithms of algebraic numbers, IV, Mathematika 15, 204–216.
Baker, A. (1968b), Contributions to the theory of Diophantine equations, Philos. Trans. Roy. Soc. London, Ser. A 263, 173–208.
Baker, A. (1968c), The Diophantine equation $y^2 = ax^3 + bx^2 + cx + d$, J. London Math. Soc. 43, 1–9.
Baker, A. (1969), Bounds for the solutions of the hyperelliptic equation, Proc. Camb. Philos. Soc. 65, 439–444.
Baker, A. (1975), Transcendental number theory, Cambridge University Press.
Baker, A., ed. (1988), New Advances in Transcendence Theory, Cambridge University Press.
Baker, A. (1998), Logarithmic forms and the abc-conjecture, in: Number Theory, Diophantine, Computational and Algebraic Aspects, Proc. Conf. Eger, 1996, K.
Győry, A. Pethő and V. T. Sós, eds., de Gruyter, 37–44.
Baker, A. (2004), Experiments on the abc-conjecture, Publ. Math. Debrecen 65, 253–260.
Baker, A. and H. Davenport (1969), The equations $3x^2 - 2 = y^2$ and $8x^2 - 7 = z^2$, Quart. J. Math. Oxford Ser. (2) 20, 129–137.
Baker, A. and D. W. Masser, eds. (1977), Transcendence theory: advances and applications, Academic Press.
Baker, A. and G. Wüstholz (2007), Logarithmic Forms and Diophantine Geometry, Cambridge University Press.
Barroero, F., C. Frei and R. F. Tichy (2011), Additive unit representations in rings over global fields – a survey, Publ. Math. Debrecen 79, 291–307.
Belcher, P. (1974), Integers expressible as sums of distinct units, Bull. London Math. Soc. 6, 66–68.
Bertók, Cs. (2013), Representing integers as sums or differences of general power products, Acta Math. Hungar. 141, 291–300.
Bérczes, A. (2000), On the number of solutions of index form equations, Publ. Math. Debrecen 56, 251–262.
Bérczes, A. (2015a), Effective results for unit points over finitely generated domains, Math. Proc. Camb. Phil. Soc. 158, 331–353.
Bérczes, A. (2015b), Effective results for division points on curves in $\mathbb{G}_m^2$, J. Théor. Nombres Bordeaux, to appear.
Bérczes, A., J.-H. Evertse and K. Győry (2004), On the number of equivalence classes of binary forms of given degree and given discriminant, Acta Arith. 113, 363–399.
Bérczes, A., J.-H. Evertse and K. Győry (2007a), On the number of pairs of binary forms with given degree and given resultant, Acta Arith. 128, 19–54.
Bérczes, A., J.-H. Evertse and K. Győry (2007b), Diophantine problems related to discriminants and resultants of binary forms, in: Diophantine Geometry, proceedings of a trimester held from April–July 2005, U. Zannier, ed., CRM series, Scuola Normale Superiore Pisa, pp. 45–63.
Bérczes, A., J.-H. Evertse and K. Győry (2009), Effective results for linear equations in two unknowns from a multiplicative division group, Acta Arith. 136, 331–349.
Bérczes, A., J.-H.
Evertse and K. Győry (2013), Multiply monogenic orders, Ann. Sc. Norm. Super. Pisa Cl. Sci. (5) 12, 467–497.
Bérczes, A., J.-H. Evertse and K. Győry (2014), Effective results for Diophantine equations over finitely generated domains, Acta Arith. 163, 71–100.
Bérczes, A., J.-H. Evertse, K. Győry and C. Pontreau (2009), Effective results for points on certain subvarieties of tori, Math. Proc. Camb. Phil. Soc. 147, 69–94.
Bérczes, A. and K. Győry (2002), On the number of solutions of decomposable polynomial equations, Acta Arith. 101, 171–187.
Beukers, F. and H. P. Schlickewei (1996), The equation x + y = 1 in finitely generated groups, Acta Arith. 78, 189–199.
Beukers, F. and D. Zagier (1997), Lower bounds of heights of points on hypersurfaces, Acta Arith. 79, 103–111.
Bilu, Yu. F. (1995), Effective analysis of integral points on algebraic curves, Israel J. Math. 90, 235–252.
Bilu, Yu. F. (2002), Baker’s method and modular curves, in: A Panorama of Number Theory, or The View from Baker’s Garden, Proc. conf. ETH Zurich, 1999, G. Wüstholz, ed., Cambridge University Press, pp. 73–88.
Bilu, Yu. F. (2008), The many faces of the subspace theorem [after Adamczewski, Bugeaud, Corvaja, Zannier, . . .], Séminaire Bourbaki, Vol. 2006/2007, Astérisque 317, Exp. No. 967, vii, 1–38.
Bilu, Yu. F. and Y. Bugeaud (2000), Démonstration du théorème de Baker-Feldman via les formes linéaires en deux logarithmes, J. Théorie des Nombres, Bordeaux, 12, 13–23.
Bilu, Yu. F., I. Gaál and K. Győry (2004), Index form equations in sextic fields: a hard computation, Acta Arith. 115, 85–96.
Bilu, Yu. F. and G. Hanrot (1996), Solving Thue equations of high degree, J. Number Theory 60, 373–392.
Bilu, Yu. F. and G.
Hanrot (1998), Solving superelliptic Diophantine equations by Baker’s method, Compositio Math. 112, 273–312.
Bilu, Yu. F. and G. Hanrot (1999), Thue equations with composite fields, Acta Arith. 88, 311–326.
Birch, B. J. and J. R. Merriman (1972), Finiteness theorems for binary forms with given discriminant, Proc. London Math. Soc. 24, 385–394.
Bombieri, E. (1993), Effective diophantine approximation on $\mathbb{G}_m$, Ann. Scuola Norm. Sup. Pisa (IV) 20, 61–89.
Bombieri, E. (1994), On the Thue-Mahler equation (II), Acta Arith. 67, 69–96.
Bombieri, E. and P. B. Cohen (1997), Effective Diophantine approximation on $\mathbb{G}_m$, II, Ann. Scuola Norm. Sup. Pisa (IV) 24, 205–225.
Bombieri, E. and P. B. Cohen (2003), An elementary approach to effective Diophantine approximation on $\mathbb{G}_m$, in: Number Theory and Algebraic Geometry, To Peter Swinnerton-Dyer on his 75th birthday, London Math. Soc. Lecture Note Series 303, M. Reid and A. Skorobogatov, eds., Cambridge University Press, pp. 41–62.
Bombieri, E. and W. Gubler (2006), Heights in Diophantine Geometry, Cambridge University Press.
Bombieri, E., J. Mueller and M. Poe (1997), The unit equation and the cluster principle, Acta Arith. 79, 361–389.
Bombieri, E., J. Mueller and U. Zannier (2001), Equations in one variable over function fields, Acta Arith. 99, 27–39.
Bombieri, E. and W. M. Schmidt (1987), On Thue’s equation, Invent. Math. 88, 69–81.
Borevich, Z. I. and I. R. Shafarevich (1967), Number Theory, 2nd edn., Academic Press.
Borosh, I., M. Flahive, D. Rubin and B. Treybig (1989), A sharp bound for solutions of linear Diophantine equations, Proc. Amer. Math. Soc. 105, 844–846.
Bosma, W., J. Cannon and C. Playoust (1997), The Magma algebra system I. The user language, J. Symbolic Comput. 24, 235–265.
Brindza, B. (1984), On S-integral solutions of the equation $y^m = f(x)$, Acta Math. Hungar. 44, 133–139.
Brindza, B. and K. Győry (1990), On unit equations with rational coefficients, Acta Arith. 53, 367–388.
Broberg, N.
(1999), Some examples related to the abc-conjecture for algebraic number fields, Math. Comp. 69, 1707–1710.
Browkin, J. (2000), The abc-conjecture, in: Number Theory, R. P. Bambah, V. C. Dumir and R. J. Hans-Gill, eds., Birkhäuser, pp. 75–105.
Brownawell, W. D. and D. W. Masser (1986), Vanishing sums in function fields, Math. Proc. Camb. Phil. Soc. 100, 427–434.
Brunotte, H., A. Huszti and A. Pethő (2006), Bases of canonical number systems in quartic number fields, J. Théor. Nombres Bordeaux 18, 537–557.
Bugeaud, Y. (1998), Bornes effectives pour les solutions des équations en S-unités et des équations de Thue-Mahler, J. Number Theory 71, 227–244.
Bugeaud, Y. (2011), Quantitative versions of the subspace theorem and applications, J. Théor. Nombres Bordeaux 23, 35–57.
Bugeaud, Y. (2012), Distribution Modulo One and Diophantine Approximation, Cambridge Tracts in Mathematics 193, Cambridge University Press.
Bugeaud, Y. and K. Győry (1996a), Bounds for the solutions of unit equations, Acta Arith. 74, 67–80.
Bugeaud, Y. and K. Győry (1996b), Bounds for the solutions of Thue-Mahler equations and norm form equations, Acta Arith. 74, 273–292.
Bugeaud, Y. and F. Luca (2004), A quantitative lower bound for the greatest prime factor of $(ab + 1)(bc + 1)(ca + 1)$, Acta Arith. 114, 275–294.
Bundschuh, P. and F.-J. Wylegala (1980), Über algebraische Unabhängigkeit bei gewissen nichtfortsetzbaren Potenzreihen, Arch. Math. 34, 32–36.
Canci, J. K. (2007), Finite orbits for rational functions, Indag. Mathem., N.S. 18, 203–214.
Cassels, J. W. S. (1959), An Introduction to the Geometry of Numbers, Springer Verlag.
Cijsouw, P. L. and R. Tijdeman (1973), On the transcendence of certain power series of algebraic numbers, Acta Arith.
23, 301–305.
Coates, J. (1969), An effective p-adic analogue of a theorem of Thue, Acta Arith. 15, 279–305.
Coates, J. (1970), An effective p-adic analogue of a theorem of Thue II, The greatest prime factor of a binary form, Acta Arith. 16, 392–412.
Cohen, H. (1993), A Course in Computational Algebraic Number Theory, Springer Verlag.
Cohen, H. (2000), Advanced Topics in Computational Number Theory, Springer Verlag.
Conway, J. H. and A. J. Jones (1976), Trigonometric Diophantine equations (on vanishing sums of roots of unity), Acta Arith. 30, 229–240.
Corvaja, P., W. M. Schmidt and U. Zannier (2010), The Diophantine equation $\alpha_1^{x_1} \cdots \alpha_n^{x_n} = f(x_1, \ldots, x_n)$ II, Trans. Amer. Math. Soc. 362, 2115–2123.
Corvaja, P. and U. Zannier (2002a), A subspace theorem approach to integral points on curves, C.R. Math. Acad. Sci. Paris 334, 267–271.
Corvaja, P. and U. Zannier (2002b), Some new applications of the subspace theorem, Compos. Math. 131, 319–340.
Corvaja, P. and U. Zannier (2003), On the greatest prime factor of $(ab + 1)(ac + 1)$, Proc. Amer. Math. Soc. 131, 1705–1709.
Corvaja, P. and U. Zannier (2004a), On a general Thue’s equation, Amer. J. Math. 126, 1033–1055.
Corvaja, P. and U. Zannier (2004b), On integral points on surfaces, Ann. Math. 160, 705–726.
Corvaja, P. and U. Zannier (2006), On the integral points on certain surfaces, Int. Math. Res. Not. Art.ID 98623, 20 pp.
Corvaja, P. and U. Zannier (2008), Applications of the Subspace Theorem to certain Diophantine problems: a survey of some recent results, in: Diophantine Approximation, Festschrift for Wolfgang Schmidt, H. P. Schlickewei, K. Schmidt and R. Tichy, eds., Springer Verlag, pp. 161–174.
Daberkow, M., C. Fieker, J. Klüners, M. Pohst, K.
Roegner and K. Wildanger (1997), KANT V4, J. Symbolic Comput. 24, 267–283.
David, S. and P. Philippon (1999), Minorations des hauteurs normalisées des sous-variétés des tores, Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 28, 489–543, Errata, 29, 729–731.
Delone (Delaunay), B. N. (1930), Über die Darstellung der Zahlen durch die binären kubischen Formen von negativer Diskriminante, Math. Z. 31, 1–26.
Delone, B. N. and D. K. Faddeev (1940), The theory of irrationalities of the third degree (Russian), Inst. Math. Steklov 11, Acad. Sci. USSR. English translation, Amer. Math. Soc., 1964.
Derksen, H. (2007), A Skolem-Mahler-Lech theorem in positive characteristic and finite automata, Invent. Math. 168, 175–244.
Derksen, H. and D. W. Masser (2012), Linear equations over multiplicative groups, recurrences, and mixing I, Proc. London Math. Soc. 104, 1045–1083.
Dombek, D., L. Hajdu and A. Pethő (2014), Representing algebraic integers as linear combinations of units, Period. Math. Hung. 68, 135–142.
Dubois, E. and G. Rhin (1975), Approximation rationnelles simultanées de nombres algébriques réels et de nombres algébriques p-adiques, in: Journées Arithmétiques de Bordeaux (Conf. Univ. Bordeaux, 1974), W. W. Adams, ed., Astérisque 24/25, Soc. Math. France, pp. 211–227.
Dubois, E. and G. Rhin (1976), Sur la majoration de formes linéaires à coefficients algébriques réels et p-adiques. Démonstration d’une conjecture de K. Mahler, C.R. Acad. Sci. Paris Sér. A-B 282, A1211–A1214.
Dvornicich, R. and U. Zannier (2000), On sums of roots of unity, Monatsh. Math. 129, 97–108.
Dyson, F. J. (1947), The approximation of algebraic numbers by rationals, Acta Math. 79, 225–240.
Eichler, M. (1966), Introduction to the theory of algebraic numbers and functions, Academic Press.
Elkies, N. D. (1991), ABC implies Mordell, Int. Math. Res. Not. 7, 99–109.
Erdős, P. (1976), Problems in number theory and combinatorics, Proc. 6th Manitoba Conference on Numerical Math., pp. 35–58.
Erdős, P., C. L.
Stewart and R. Tijdeman (1988), Some Diophantine equations with many solutions, Compos. Math. 66, 37–56.
Erdős, P. and P. Turán (1934), On a problem in the elementary theory of numbers, Amer. Math. Monthly 41, 608–611.
Everest, G. R. and K. Győry (1997), Counting solutions of decomposable form equations, Acta Arith. 79, 173–191.
Evertse, J.-H. (1983), Upper bounds for the numbers of solutions of Diophantine equations, Ph.D. thesis, University of Leiden, Leiden. Also published as Math. Centre Tracts No. 168, CWI, Amsterdam.
Evertse, J.-H. (1984a), On equations in S-units and the Thue-Mahler equation, Invent. Math. 75, 561–584.
Evertse, J.-H. (1984b), On sums of S-units and linear recurrences, Compos. Math. 53, 225–244.
Evertse, J.-H. (1993), Estimates for reduced binary forms, J. Reine Angew. Math. 434, 159–190.
Evertse, J.-H. (1995), The number of solutions of decomposable form equations, Invent. Math. 122, 559–601.
Evertse, J.-H. (1996), An improvement of the quantitative subspace theorem, Compos. Math. 101, 225–311.
Evertse, J.-H. (1997), The number of solutions of the Thue-Mahler equation, J. Reine Angew. Math. 482, 121–149.
Evertse, J.-H. (1998), Lower bounds for resultants, II, in: Number Theory, Diophantine, Computational and Algebraic Aspects, Proc. Conf. Eger, Hungary, 1996, K. Győry, A. Pethő, V. T. Sós, eds., Walter de Gruyter, pp. 181–198.
Evertse, J.-H. (1999), The number of solutions of linear equations in roots of unity, Acta Arith. 89, 45–51.
Evertse, J.-H. (2002), Points on subvarieties of tori, in: A Panorama of Number Theory, or the View from Baker’s Garden, Proc. conf. ETH Zürich, 1999, G. Wüstholz, ed., Cambridge University Press, pp. 214–230.
Evertse, J.-H.
(2004), Linear equations with unknowns from a multiplicative group whose solutions lie in a small number of subspaces, Indag. Math. (N.S.) 15, 347–355.
Evertse, J.-H. and R. G. Ferretti (2002), Diophantine inequalities on projective varieties, Int. Math. Res. Not. 2002:25, 1295–1330.
Evertse, J.-H. and R. G. Ferretti (2008), A generalization of the Subspace Theorem with polynomials of higher degree, in: Diophantine Approximation, Festschrift for Wolfgang Schmidt, H. P. Schlickewei, K. Schmidt and R. Tichy, eds., Springer Verlag, pp. 175–198.
Evertse, J.-H. and R. G. Ferretti (2013), A further improvement of the Quantitative Subspace Theorem, Ann. Math. 177, 513–590.
Evertse, J.-H., I. Gaál and K. Győry (1989), On the numbers of solutions of decomposable polynomial equations, Arch. Math. 52, 337–353.
Evertse, J.-H. and K. Győry (1985), On unit equations and decomposable form equations, J. Reine Angew. Math. 358, 6–19.
Evertse, J.-H. and K. Győry (1988a), On the number of polynomials and integral elements of given discriminant, Acta Math. Hung. 51, 341–362.
Evertse, J.-H. and K. Győry (1988b), On the number of solutions of weighted unit equations, Compos. Math. 66, 329–354.
Evertse, J.-H. and K. Győry (1988c), Finiteness criteria for decomposable form equations, Acta Arith. 50, 357–379.
Evertse, J.-H. and K. Győry (1988d), Decomposable form equations, in: New Advances in Transcendence Theory, Proc. conf. Durham 1986, A. Baker, ed., pp. 175–202.
Evertse, J.-H. and K. Győry (1989), Thue-Mahler equations with a small number of solutions, J. Reine Angew. Math. 399, 60–80.
Evertse, J.-H. and K. Győry (1991), Effective finiteness results for binary forms with given discriminant, Compositio Math. 79, 169–204.
Evertse, J.-H. and K. Győry (1992a), Effective finiteness theorems for decomposable forms of given discriminant, Acta Arith. 60, 233–277.
Evertse, J.-H. and K. Győry (1992b), Discriminants of decomposable forms, in: New Trends in Probability and Statistics, F. Schweiger and E. Manstavičius, eds., pp. 39–56.
Evertse, J.-H. and K. Győry (1993), Lower bounds for resultants, I, Compositio Math. 88, 1–23.
Evertse, J.-H. and K. Győry (1997), The number of families of solutions of decomposable form equations, Acta Arith. 80, 367–394.
Evertse, J.-H. and K. Győry (2013), Effective results for unit equations over finitely generated domains, Math. Proc. Camb. Phil. Soc. 154, 351–380.
Evertse, J.-H. and K. Győry (2016), Discriminant Equations in Diophantine Number Theory, Cambridge: Cambridge University Press, to appear.
Evertse, J.-H., K. Győry, C. L. Stewart and R. Tijdeman (1988a), On S-unit equations in two unknowns, Invent. Math. 92, 461–477.
Evertse, J.-H., K. Győry, C. L. Stewart and R. Tijdeman (1988b), S-unit equations and their applications, in: New Advances in Transcendence Theory, Proc. conf. Durham 1986, A. Baker, ed., Cambridge University Press, pp. 110–174.
Evertse, J.-H., P. Moree, C. L. Stewart and R. Tijdeman (2003), Multivariate equations with many solutions, Acta Arith. 107, 103–125.
Evertse, J.-H. and H. P. Schlickewei (1999), The Absolute Subspace Theorem and linear equations with unknowns from a multiplicative group, in: Number Theory in Progress, proc. conf. Zakopane 1997 in honour of the 60th birthday of Prof. Andrzej Schinzel, K. Győry, H. Iwaniec and J. Urbanowicz, eds., Walter de Gruyter, pp. 121–142.
Evertse, J.-H. and H. P. Schlickewei (2002), A quantitative version of the Absolute Subspace Theorem, J. Reine Angew. Math. 548, 21–127.
Evertse, J.-H., H. P. Schlickewei and W. M. Schmidt (2002), Linear equations in variables which lie in a multiplicative group, Ann. Math.
155, 807–836.
Evertse, J.-H. and J. H. Silverman (1986), Uniform bounds for the number of solutions to $Y^n = f(X)$, Math. Proc. Camb. Phil. Soc. 100, 237–248.
Evertse, J.-H. and U. Zannier (2008), Linear equations with unknowns from a multiplicative group in a function field, Acta Arith. 133, volume dedicated to the 75th birthday of Wolfgang Schmidt, 157–170.
Faltings, G. (1983), Endlichkeitssätze für abelsche Varietäten über Zahlkörpern, Invent. Math. 73, 349–366, Erratum: Invent. Math. 75 (1984), 381.
Faltings, G. (1991), Diophantine approximation on abelian varieties, Ann. Math. 133, 549–576.
Faltings, G. (1994), The general case of S. Lang’s conjecture, in: Barsotti Symposium in Algebraic Geometry (Abano Terme, 1991), 175–182, Perspect. Math. 15, Academic Press.
Faltings, G. and G. Wüstholz (1994), Diophantine approximations on projective spaces, Invent. Math. 116, 109–138.
Feldman, N. I. and Y. V. Nesterenko (1998), Transcendental numbers, Springer Verlag, Vol. 44 of Encyclopaedia of Math. Sci.
Filipin, A., R. F. Tichy and V. Ziegler (2008), The additive unit structure of pure quartic complex fields, Funct. Approx. Comment. Math. 39, 113–131.
Fincke, U. and M. Pohst (1985), Improved methods for calculating vectors of short length in a lattice, including a complexity analysis, Math. Comp. 44, 463–471.
Frei, C. (2012), On rings of integers generated by their units, Bull. London Math. Soc. 44, 167–182.
Friedman, E. (1989), Analytic formulas for regulators of number fields, Invent. Math. 98, 599–622.
Fröhlich, A. and J. C. Shepherdson (1956), Effective procedures in field theory, Philos. Trans. Roy. Soc. London, Ser. A 248, 407–432.
Gaál, I.
(1984), Norm form equations with several dominating variables and explicit lower bounds for inhomogeneous linear forms with algebraic coefficients, Studia Sci. Math. Hungar. 19, 399–411.
Gaál, I. (1985), Norm form equations with several dominating variables and explicit lower bounds for inhomogeneous linear forms with algebraic coefficients, II, Studia Sci. Math. Hungar. 20, 333–344.
Gaál, I. (1986), Inhomogeneous discriminant form and index form equations and their applications, Publ. Math. Debrecen 33, 1–12.
Gaál, I. (1988a), Integral elements with given discriminant over function fields, Acta Math. Hungar. 52, 133–146.
Gaál, I. (1988b), Inhomogeneous norm form equations over function fields, Acta Arith. 51, 61–73.
Gaál, I. (2002), Diophantine equations and power integral bases, Birkhäuser.
Gaál, I. and M. Pohst (2002), On the resolution of relative Thue equations, Math. Comp. 71, no. 237, 429–440 (electronic).
Gaál, I. and M. Pohst (2006a), Diophantine equations over global function fields I, The Thue equation, J. Number Theory 119, 49–65.
Gaál, I. and M. Pohst (2006b), Diophantine equations over global function fields II, S-integral solutions of Thue equations, Exper. Math. 15, 1–6.
Gaál, I. and M. Pohst (2010), Diophantine equations over global function fields IV, S-unit equations in several variables with an application to norm form equations, J. Number Theory 130, 493–506.
Gebel, J., A. Pethő and H. G. Zimmer (1994), Computing integral points on elliptic curves, Acta Arith. 67, 171–192.
Gelfond, A. O. (1934), Sur le septième problème de Hilbert, Izv. Akad. Nauk SSSR 7, 623–630.
Gelfond, A. O. (1935), On approximating transcendental numbers by algebraic numbers, Dokl. Akad. Nauk SSSR 2, 177–182.
Gelfond, A. O. (1940), Sur la divisibilité de la différence des puissances de deux nombres entiers par une puissance d’un idéal premier, Mat. Sbornik 7 (49), 7–26.
Gelfond, A. O. (1960), Transcendental and algebraic numbers, New York, Dover.
Ghioca, D.
(2008), The isotrivial case in the Mordell-Lang theorem, Trans. Amer. Math. Soc. 360, 3839–3856.
Grant, D. (1996), Sequences of Fields with Many Solutions to the Unit Equation, The Rocky Mountain J. Math. 26, 1017–1029.
Granville, A. (1998), ABC allow us to count squarefrees, Int. Math. Res. Not. 19, 991–1009.
Granville, A. and H. M. Stark (2000), abc implies no “Siegel zeros” for L-functions of characters with negative discriminant, Invent. Math. 139, 509–523.
Green, B. and T. Tao (2008), The primes contain arbitrarily long arithmetic progressions, Ann. of Math. 167, 481–547.
Győry, K. (1971), Sur l’irréductibilité d’une classe des polynômes, I, Publ. Math. Debrecen 18, 289–307.
Győry, K. (1972), Sur l’irréductibilité d’une classe des polynômes, II, Publ. Math. Debrecen 19, 293–326.
Győry, K. (1973), Sur les polynômes à coefficients entiers et de discriminant donné, Acta Arith. 23, 419–426.
Győry, K. (1974), Sur les polynômes à coefficients entiers et de discriminant donné II, Publ. Math. Debrecen 21, 125–144.
Győry, K. (1976), Sur les polynômes à coefficients entiers et de discriminant donné III, Publ. Math. Debrecen 23, 141–165.
Győry, K. (1978a), On polynomials with integer coefficients and given discriminant IV, Publ. Math. Debrecen 25, 155–167.
Győry, K. (1978b), On polynomials with integer coefficients and given discriminant V, p-adic generalizations, Acta Math. Acad. Sci. Hung. 32, 175–190.
Győry, K. (1978/1979), On the greatest prime factors of decomposable forms at integer points, Ann. Acad. Sci. Fenn., Ser. A I, Math. 4, 341–355.
Győry, K. (1979), On the number of solutions of linear equations in units of an algebraic number field, Comment. Math. Helv. 54, 583–600.
Győry, K.
(1979/1980), On the solutions of linear diophantine equations in algebraic integers of bounded norm, Ann. Univ. Sci. Budapest. Eötvös, Sect. Math. 22–23, 225–233.
Győry, K. (1980a), Explicit upper bounds for the solutions of some diophantine equations, Ann. Acad. Sci. Fenn., Ser. A I, Math. 5, 3–12.
Győry, K. (1980b), Résultats effectifs sur la représentation des entiers par des formes décomposables, Queen’s Papers in Pure and Applied Math., No. 56.
Győry, K. (1980c), On certain graphs composed of algebraic integers of a number field and their applications I, Publ. Math. Debrecen 27, 229–242.
Győry, K. (1981a), On the representation of integers by decomposable forms in several variables, Publ. Math. Debrecen 28, 89–98.
Győry, K. (1981b), On S-integral solutions of norm form, discriminant form and index form equations, Studia Sci. Math. Hungar. 16, 149–161.
Győry, K. (1981c), On discriminants and indices of integers of an algebraic number field, J. Reine Angew. Math. 324, 114–126.
Győry, K. (1982a), Polynomials of given discriminant and integral elements of given discriminant over integral domains, C. R. Math. Rep. Acad. Sci. Canada 4, 75–80.
Győry, K. (1982b), On certain graphs associated with an integral domain and their applications to Diophantine problems, Publ. Math. Debrecen 29, 79–94.
Győry, K. (1982c), On the irreducibility of a class of polynomials III, J. Number Theory 15, 164–181.
Győry, K. (1983), Bounds for the solutions of norm form, discriminant form and index form equations in finitely generated integral domains, Acta Math. Hung. 42, 45–80.
Győry, K. (1984), Effective finiteness theorems for polynomials with given discriminant and integral elements with given discriminant over finitely generated domains, J. Reine Angew. Math. 346, 54–100.
Győry, K. (1990), On arithmetic graphs associated with integral domains, in: A Tribute to Paul Erdős, Cambridge University Press, pp. 207–222.
Győry, K. (1992a), Some recent applications of S-unit equations, Astérisque 209, 17–38.
Győry, K. (1992b), Upper bounds for the numbers of solutions of unit equations in two unknowns, Lithuanian Math. J. 32, 40–44.
Győry, K. (1992c), On the irreducibility of a class of polynomials IV, Acta Arith. 62, 399–405.
Győry, K. (1993a), On the numbers of families of solutions of systems of decomposable form equations, Publ. Math. Debrecen 42, 65–101.
Győry, K. (1993b), Some applications of decomposable form equations to resultant equations, Coll. Math. 65, 267–275.
Győry, K. (1993c), On the number of pairs of polynomials with given resultant or given semi-resultant, Acta Sci. Math. 57, 515–529.
Győry, K. (1994), On the irreducibility of neighbouring polynomials, Acta Arith. 67, 283–294.
Győry, K. (1996), Applications of unit equations, in: Analytic Number Theory, RIMS Kokyuroku 958, Kyoto, Japan, pp. 62–78.
Győry, K. (1998), Bounds for the solutions of decomposable form equations, Publ. Math. Debrecen 52, 1–31.
Győry, K. (1999), On the distribution of solutions of decomposable form equations, in: Number Theory in Progress, Proc. conf. in honour of 60th birthday of Andrzej Schinzel, K. Győry, H. Iwaniec and J. Urbanowicz, eds., de Gruyter, pp. 237–365.
Győry, K. (2002), Solving diophantine equations by Baker’s theory, in: A Panorama of Number Theory, Cambridge, pp. 38–72.
Győry, K. (2006), Polynomials and binary forms with given discriminant, Publ. Math. Debrecen 69, 473–499.
Győry, K. (2008a), On the abc-conjecture in algebraic number fields, Acta Arith. 133, 281–295.
Győry, K. (2008b), On certain arithmetic graphs and their applications to diophantine problems, Funct. Approx. Comment. Math. 39, 289–314.
Győry, K.
(2010), S-unit equations in number fields: effective results, generalizations, ABC-conjecture, in: Analytic number theory and related topics, RIMS Kokyuroku 1710, pp. 71–84.
Győry, K., L. Hajdu and R. Tijdeman (2011), Irreducibility criteria of Schur-type and Pólya-type, Monatsh. Math. 163, 415–443.
Győry, K., L. Hajdu and R. Tijdeman (2014), Representation of finite graphs as difference graphs of S-units, I, J. Combinatorial Theory, Ser. A, 127, 314–335.
Győry, K. and Z. Z. Papp (1977), On discriminant form and index form equations, Studia Sci. Math. Hungar. 12, 47–60.
Győry, K. and Z. Z. Papp (1978), Effective estimates for the integer solutions of norm form and discriminant form equations, Publ. Math. Debrecen 25, 311–325.
Győry, K. and A. Pethő (1980), Über die Verteilung der Lösungen von Normformen Gleichungen III, Acta Arith. 37, 143–165.
Győry, K., I. Pink and Á. Pintér (2004), Power values of polynomials and binomial Thue-Mahler equations, Publ. Math. Debrecen 65, 341–362.
Győry, K. and Á. Pintér (2008), Polynomial powers and a common generalization of binomial Thue-Mahler equations and S-unit equations, in: Diophantine Equations, Proc. conf. in honour of Tarlok Shorey’s 60th birthday, N. Saradha, ed., New Delhi, pp. 103–119.
Győry, K. and M. Ru (1998), Integer solutions of a sequence of decomposable form inequalities, Acta Arith. 86, 227–237.
Győry, K., A. Sárközy and C. L. Stewart (1996), On the number of prime factors of integers of the form ab + 1, Acta Arith. 74, 365–385.
Győry, K. and A. Schinzel (1994), On a conjecture of Posner and Rumsey, J. Number Theory 47, 63–78.
Győry, K., C. L. Stewart and R. Tijdeman (1986), On prime factors of sums of integers I, Compositio Math. 59, 81–88.
Győry, K.
and K. Yu (2006), Bounds for the solutions of S-unit equations and decomposable form equations, Acta Arith. 123, 9–41.
Hajdu, L. (1993), A quantitative version of Dirichlet’s S-unit theorem in algebraic number fields, Publ. Math. Debrecen 42, 239–246.
Hajdu, L. (1997), On a problem of Győry and Schinzel concerning polynomials, Acta Arith. 78, 287–295.
Hajdu, L. (2007), Arithmetic progressions in linear combinations of S-units, Period. Math. Hung. 54, 175–181.
Hajdu, L. (2009), Optimal systems of fundamental S-units for LLL-reduction, Periodica Math. Hung. 59, 79–105.
Hajdu, L. and F. Luca (2010), On the length of arithmetic progressions in linear combinations of S-units, Archiv Math. 94, 357–363.
Hajdu, L. and R. Tijdeman (2003), Polynomials dividing infinitely many quadrinomials or quintinomials, Acta Arith. 107, 381–404.
Hajdu, L. and R. Tijdeman (2008), A criterion for polynomials to divide infinitely many k-nomials, in: Diophantine Approximation, Festschrift for Wolfgang Schmidt, H. P. Schlickewei, K. Schmidt and R. Tichy, eds., Springer Verlag, pp. 175–198.
Halter-Koch, F. and W. Narkiewicz (1997), Polynomial cycles and dynamical units, in: Proc. Conf. Analytic and Elementary Number Theory, dedicated to the 80th birthday of E. Hlawka, W. G. Nowak and J. Schoißengeier, eds., Wien, 1997, 70–80.
Halter-Koch, F. and W. Narkiewicz (2000), Scarcity of finite polynomial orbits, Publ. Math. Debrecen 56, 405–414.
Hardy, G. H. and E. M. Wright (1980), An introduction to the theory of numbers, 5th. edn., Oxford University Press.
Haristoy, J. (2003), Équations diophantiennes exponentielles, Thèse de docteur, Strasbourg.
Harris, J. (1992), Algebraic Geometry, A First Course, Springer Verlag.
Hartshorne, R. (1977), Algebraic Geometry, Springer Verlag.
Hermann, G. (1926), Die Frage der endlich vielen Schritte in der Theorie der Polynomideale, Math. Ann. 95, 736–788.
Hermite, C. (1851), Sur l’introduction des variables continues dans la théorie des nombres, J. Reine Angew.
Math. 41, 191–216. Hernández, S. and F. Luca (2003), On the largest prime factor of (ab + 1)(ac + 1)(bc + 1), Bol. Soc. Mat. Mexicana, 9, 235–244. Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:30:52, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.013 348 References Hindry, M. (1988), Autour d’une Conjecture de Serge Lang, Invent. Math. 94, 575–603. Houriet, J. (2007), Exceptional units and Euclidean number fields, Archiv Math. 88, 425–433. Hrushovki, E. (1996), The Mordell-Lang conjecture for function fields, J. Amer. Math. Soc. 9, 667–690. Hsia, L.-C. and J. T.-Y. Wang (2004), The ABC theorem for higher-dimensional function fields, Trans. Amer. Math. Soc. 356, no. 7, 2871–2887. Jarden, M. and W. Narkiewicz (2007), On sums of units, Monatsh. Math. 150, 327–332. de Jong, R. S. (1999), On p-adic norm form inequalities, Master thesis, Leiden. de Jong, R. S. and G. Rémond (2011), Conjecture de Shafarevich effective pour les revêtements cycliques, Algebra and Number Theory 5, 1133–1143. von Känel, R. (2011), An effective proof of the hyperelliptic Shafarevich conjecture and applications, Ph.D. thesis, ETH Zürich. von Känel, R. (2013), On Szpiro’s discriminant conjecture, Internat. Math. Res. Notices 1–35. Published online: doi:10.193/imrn/vnt079. von Känel, R. (2014a), An effective proof of the hyperelliptic Shafarevich conjecture, J. Théorie des Nombres, Bordeaux, 26, 507–530. von Känel, R. (2014b) Modularity and integral points on moduli schemes, arXiv:1310.7263v2 [math.NT]. Karpilovsky, G. (1988), Unit groups of classical rings, Oxford University Press. Koblitz, N. (1984), p-adic Numbers, p-adic Analysis, and Zeta-Functions, Springer Verlag. Konyagin, S. and K. Soundararajan (2007), Two S-unit equations with many solutions, J. Number Theory 124, 193–199. Kotov, S. V. 
(1981), Effective bound for a linear form with algebraic coefficients in the archimedean and p-adic metrics, Inst. Math. Akad. Nauk BSSR, Preprint No. 24, Minsk (Russian). Kotov, S. V. and V. G. Sprindžuk (1973), An effective analysis of the Thue-Mahler equation in relative fields, Dokl. Akad. Nauk BSSR 17, 393–395 (Russian). Kotov, S. V. and L. Trelina (1979), S-ganze Punkte auf elliptischen Kurven, J. Reine Angew. Math. 306, 28–41. Kovács, B. (1981), Canonical number systems in algebraic number fields, Acta Math. Acad. Sci. Hungar. 37, 405–407. Kovács, B. and A. Pethő (1991), Number systems in integral domains, especially in orders of algebraic number fields, Acta Sci. Math. 55, 287–299. Koymans, P. (2015), The Catalan Equation, Master thesis, Leiden University. Lagarias, J. C. and K. Soundararajan (2011), Smooth solutions to the abc equation: the xyz conjecture, J. Théorie des Nombres de Bordeaux 23, 209–234. Lagrange, J. L. (1773), Recherches d’arithmétiques, Nouv. Mém. Acad. Berlin, 265–312; Oeuvres III, 693–758. Landau, E. (1918), Verallgemeinerung eines Pólyaschen Satzes auf algebraische Zahlkörper, Nachr. Ges. Wiss. Göttingen, 478–488. Lang, S. (1960), Integral points on curves, Inst. Hautes Études Sci. Publ. Math. 6, 27–43. Lang, S. (1962), Diophantine geometry, Wiley. Lang, S. (1970), Algebraic Number Theory, Addison-Wesley. Lang, S. (1978), Elliptic curves: Diophantine analysis, Springer Verlag. Lang, S. (1983), Fundamentals of Diophantine Geometry, Springer Verlag. Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:30:52, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.013 References 349 Lang, S. (1984), Algebra, 2nd. edn., Addison-Wesley. Langevin, M. (1999), Liens entre le théorème de Mason et la conjecture (abc), in: Number Theory (5th conf. of CNTA, Ottawa ON 1996), R. Gupta and K. S. Williams, eds. 187–213. 
CRM Proc. Lecture Notes 19, AMS, Providence RI. Laurent, M. (1984), Équations diophantiennes exponentielles, Invent. Math. 78, 299– 327. Laurent, M. (1989), Équations exponentielles polynômes et suites récurrentes linéaires, II, J. Number Theory 31, 24–53. Lech, C. (1953), A note on recurring series, Ark. Math. 2, 417–421. Lehmer, D. H. (1933), Factorization of certain cyclotomic functions, Ann. Math. (2) 34, 461–479. Leitner, D. J. (2012), Linear equations over multiplicative groups in positive characteristic, Acta Arith. 153, 325–347. Lenstra Jr., H. W. (1977), Euclidean number fields of large degree, Inventiones Math. 38, 237–254. Lenstra, A. K., H. W. Lenstra Jr. and L. Lovász (1982), Factoring polynomials with rational coefficients, Math. Ann. 261, 515–534. Leutbecher, A. (1985), Euclidean fields having a large Lenstra constant, Ann. Inst. Fourier 35, 83–106. Leutbecher, A. and J. Martinet (1982), Lenstra’s constant and euclidean number fields, Astérisque 94, 87–131. Leutbecher, A. and G. Niklasch (1989), On cliques of exceptional units and Lenstra’s construction of Euclidean fields, Lecture Notes Math. 1380, 150–178. LeVeque, W. J. (1964), On the equation y m = f (x), Acta Arith. 9, 209–219. LeVesque, C. and M. Waldschmidt (2011), Some remarks on diophantine equations and diophantine approximation, Vietnam J. Math. 39, 343–368. LeVesque, C. and M. Waldschmidt (2012), Familles d’équations de Thue-Mahler n’ayant que des solutions triviales, Acta Arith. 155, 117–138. Levin, A. (2006), One-parameter families of unit equations, Math. Res. Lett. 13, 935– 945. Levin, A. (2008), The dimension of integral points and holomorphic curves on the complements of hyperplanes, Acta Arith. 134, 259–270. Levin, A. (2014), Lower bounds in logarithms and integral points on higher dimensional varieties, Algebra Number Theory 8, 647–687. Lewis, D. J. and K. Mahler (1961), Representation of integers by binary forms, Acta Arith. 6, 333–363. Liardet, P. 
(1974), Sur une conjecture de Serge Lang, C.R. Acad. Sci. Paris 279, 435– 437. Liardet, P. (1975), Sur une conjecture de Serge Lang, Astérisque 24–25, Soc. Math. France. Liu, J. (2015), On p-adic Decomposable Form Inequalities, Ph.D. thesis, Leiden. Loher, T. and D. Masser (2004), Uniformly counting points of bounded height, Acta Arith. 111, 277–297. Louboutin, S. (2000), Explicit bounds for residues of Dedekind zeta functions, values of L-functions at s = 1, and relative class numbers, J. Number Theory 85, 263–282. Loxton, J. H. and A. J. van der Poorten (1983), Multiplicative dependence in number fields, Acta Arith. 42, 291–302. Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:30:52, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.013 350 References Luca, F. (2005), On the greatest common divisor of u − 1 and v − 1 with u and v near S-units, Monatsh. Math. 146, 239–256. Mahler, K. (1933a), Zur Approximation algebraischer Zahlen I: Über den grössten Primteiler binärer Formen, Math. Ann. 107, 691–730. Mahler, K. (1933b), Zur Approximation algebraischer Zahlen III: Über die mittlere Anzahl grosser Zahlen durch binäre Formen, Acta Math. 62, 91–166. Mahler, K. (1935a), Eine arithmetische Eigenschaft der Taylor-koeffizienten rationaler Functionen, Proc. Kon. Ned. Akad. Wetensch. 38, 50–60. Mahler, K. (1935b), Über transzendente p-adische Zahlen, Compos. Math. 2, 259–275. Mann, H. B. (1965), On linear relations between roots of unity, Mathematika 12, 107– 117. Mason, R. C. (1983), The hyperelliptic equation over function fields, Math. Proc. Camb. Phil. Soc. 93, 219–230. Mason, R. C. (1984), Diophantine equations over function fields, Cambridge University Press. Mason, R. C. (1986a), Norm form equations I, J. Number Theory 22, 190–207. Mason, R. C. (1986b), Norm form equations III: positive characteristic, Math. Proc. Camb. Phil. Soc. 
99, 409–423. Mason, R. C. (1987), Norm form equations V. Degenerate modules, J. Number Theory 25, 239–248. Mason, R. C. (1988), The study of Diophantine equations over function fields, in: New Advances in Transcendence Theory, Proc. conf. Durham 1986, A. Baker, ed., Cambridge University Press, pp. 229–247. Masser, D. W. (1985), Conjecture in “Open Problems” section, in: Proc. Symposium on Analytic Number Theory, London, 25. Masser, D. W. (2002), On abc and discriminants, Proc. Amer. Math. Soc. 130, 3141– 3150. Masser, D. W. (2004), Mixing and linear equations over groups in positive characteristic, Israel J. Math. 142, 189–204. Masser, D. W. and G. Wüstholz (1983), Fields of large transcendence degree generated by values of elliptic functions, Invent. Math. 72, 407–464. Matveev, E. M. (2000), An explicit lower bound for a homogeneous rational linear form in logarithms of algebraic numbers, II. Izvestiya: Mathematics 64, 1217–1269. McQuillan, M. (1995), Division points on semi-abelian varieties, Invent. Math. 120 (1995), 143–159. Mestre, J. F. (1981), Corps euclidiens, unités exceptionnelles et courbes elliptiques, J. Number Theory 13, 123–137. Minkowski, H. (1910), Geometrie der Zahlen, Teubner (Posthumously published; prepared by D. Hilbert and A. Speiser). Moosa, R. and T. Scanlon (2002), The Mordell-Lang conjecture in positive characteristic revisited, In: Model theory and applications, Quaderni di matematica 11, L. Belair, Z. Chatzidakis, P. D’Aquino, D. Marker, M. Otero, F. Point and A. Wilkie, eds. Dipartimento di Matematica Seconda Università di Napoli. pp. 273–296. Moosa, R. and T. Scanlon (2004), F -structures and integral points on semiabelian varieties over finite fields, Amer. J. Math. 126, 473–522. Mordell, L. J. (1922a), On the rational solutions of the indeterminate equations of the third and fourth degrees, Proc. Cambridge Philos. Soc. 21, 179–192. Downloaded from https:/www.cambridge.org/core. 
Columbia University Libraries, on 16 Jun 2017 at 04:30:52, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.013 References 351 Mordell, L. J. (1922b), Note on the integer solutions of the equation Ey 2 = Ax 3 + Bx 2 + Cx + D, Messenger Math. 51, 169–171. Mordell, L. J. (1923), On the integer solutions of the equation ey 2 = ax 3 + bx 2 + cx + d, Proc. London Math. Soc. (2) 21, 415–419. Moree, P. and C. L. Stewart (1990), Some Ramanujan-Nagell equations with many solutions, Indag Math. (N. S.), 1, 465–472. Morton, P. and J. H. Silverman (1994), Rational periodic points of rational functions, Intern. Math. Res. Not. (2), 97–110. Mueller, J. (2000), S-unit equations in function fields via the abc-theorem, Bull. London Math. Soc. 32, 163–170. Murty, M. R. and H. Pasten (2013), Modular forms and effective Diophantine approximation, J. Number Theory 133, 3739–3754. Nagell, T. (1930), Zur Theorie der kubischen Irrationalitäten, Acta Math. 55, 33–65. Nagell, T. (1964), Sur une propriété des unités d’un corps algébrique, Arkiv för Mat. 5, 343–356. Nagell, T. (1967), Sur les discriminants des nombres algébriques, Arkiv för Mat. 7, 265–282. Nagell, T. (1968a), Quelques propriétés des nombres algébriques du quatrième degré, Arkiv för Mat. 7, 517–525. Nagell, T. (1968b), Sur les unités dans les corps biquadratiques primitifs du premier rang, Arkiv för Mat. 7, 359–394. Nagell, T. (1970), Sur un type particulier d’unités algébriques, Arkiv för Mat. 8, 163– 184. Narkiewicz, W. (1989), Polynomial cycles in algebraic number fields, Colloq. Math. 58, 149–153. Narkiewicz, W. (1995), Polynomial Mappings, Lecture Notes Math. 1600, Springer Verlag. Narkiewicz, W. and T. Pezda (1997), Finite Polynomial Orbits in Finitely Generated Domains, Monatsh. Math. 124, 309–316. Neukirch, J. (1992), Algebraische Zahlentheorie, Springer Verlag. Nishioka, K. 
(1986), Proof of Masser’s Conjecture on the Algebraic Independence of Values of Liouville Series, Proc. Japan Acad. Ser. A 62, 219–222. Nishioka, K. (1987), Conditions for algebraic independence of certain power series of algebraic numbers, Compos. Math. 62, 53–61. Nishioka, K. (1989), Evertse theorem in algebraic independence, Arch. Math. 53, 159– 170. Nishioka, K. (1994), Algebraic independence by Mahler’s method and S-unit equations, Compos. Math. 92, 87–110. Nishioka, K. (1996), Mahler Functions and Transcendence, Lecture Notes Math. 1631, Springer Verlag. Northcott, D. G. (1950), Periodic points on an algebraic variety, Ann. Math. 51, 167–177. Parry, C. J. (1950), The p-adic generalization of the Thue-Siegel theorem, Acta Math. 83, 1–100. Pasten, H. (2014), Arithmetic problems around the abc-conjecture and connections with logic, Ph.D. thesis, Queen’s University, Canada. Pethő, A. and R. Schulenberg (1987), Effektives Lösen von Thue Gleichungen, Publ. Math. Debrecen 34, 189–196. Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:30:52, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.013 352 References Pethő, A. and B. M. M. de Weger (1986), Products of prime powers in binary recurrence sequences I. The hyperbolic case, with an application to the generalized Ramanujan-Nagell equation, Math. Comp. 47, 713–727. Pezda, T. (1994), Polynomial cycles in certain local domains, Acta Arith. 66, 11–22. Pezda, T. (2014), An algorithm determining cycles of polynomial mappings in integral domains, Publ. Math. Debrecen 84, 399–414. Poe, M. (1997), On distribution of solutions of S-unit equations, J. Number Theory 62, 221–241. Pohst, M. E. (1993), Computational Algebraic Number Theory, Birkhäuser Verlag. Pohst, M. E. and H. Zassenhaus (1989), Algorithmic algebraic number theory, Cambridge University Press. Poonen, R. 
(1999), Mordell-Lang plus Bogomolov, Invent. Math. 137, 413–425. van der Poorten, A. J. and H. P. Schlickewei (1982), The growth condition for recurrence sequences, Macquarie University Math. Rep. 82–0041. van der Poorten, A. J. and H. P. Schlickewei (1991), Additive relations in fields, J. Austral. Math. Soc. (Ser. A) 51, 154–170. Posner, E. C. and H. Rumsey, Jr. (1965), Polynomials that divide infinitely many trinomials, Michigan Math. J., 12, 339–348. Rémond, G. (2000a), Inégalité de Vojta en dimension supérieure, Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 29, 101–151. Rémond, G. (2000b), Décompte dans une conjecture de Lang, Invent. Math. 142, 513– 545. Rémond, G. (2002), Sur les sous-variétés des tores, Compos. Math. 134, 337–366. Rémond, G. (2003), Approximation diophantienne sur les variétés semi-abeliennes, Ann. Sci. École Norm. Sup. (4) 36, 191–212. Ridout, P. (1958), The p-adic generalization of the Thue-Siegel-Roth Theorem, Mathematika 5, 40–48. Robert, O., C. L. Stewart and G. Tenenbaum (2014), A refinement of the abc conjecture, Bull. London Math. Soc. 46, 1156–1166. Roquette, P. (1957), Einheiten und Divisorenklassen in endlich erzeugbaren Körpern, Jahresber. Deutsch. Math. Verein 60, 1–21. Rosser, J. B. and L. Schoenfeld (1962), Approximate formulas for some functions of prime numbers, Illinois J. Math. 6, 64–94. Roth, K. F. (1955), Rational approximations to algebraic numbers, Mathematika 2, 1–20. Ru, M. and P. Vojta (1997), Schmidt’s subspace theorem with moving targets, Invent. Math. 127, 51–65. Ru, M. and P. M. Wong (1991), Integral points of Pn \ {2n + 1 hyperplanes in general position}, Invent. Math. 106, 195–216. Schinzel, A. (1988), Reducibility of lacunary polynomials VIII, Acta Arith. 50, 91–106. Schlickewei, H. P. (1976a), Linearformen mit algebraischen Koeffizienten, Manuscripta Math. 18, 147–185. Schlickewei, H. P. (1976b), Die p-adische Verallgemeinerung des Satzes von ThueSiegel-Roth-Schmidt, J. Reine Angew. Math. 288, 86–105. 
Schlickewei, H. P. (1976c), On products of special linear forms with algebraic coefficients, Acta Arith. 31, 389–398. Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:30:52, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.013 References 353 Schlickewei, H. P. (1977a), Über die diophantische Gleichung x1 + · · · + xn = 0, Acta Arith. 33 (1977), 183–185. Schlickewei, H. P. (1977b), The p-adic Thue-Siegel-Roth-Schmidt theorem, Arch. Math. (Basel) 29, 267–270. Schlickewei, H. P. (1977c), On norm form equations, J. Number Theory 9, 370–380. Schlickewei, H. P. (1977d), On linear forms with algebraic coefficients and Diophantine equations, J. Number Theory 9, 381–392. Schlickewei, H. P. (1977e), Inequalities for decomposable forms, Astérisque 41–42, pp. 267–271. Schlickewei, H. P. (1990), S-unit equations over number fields, Invent. Math. 102, 95–107. Schlickewei, H. P. (1992), The quantitative Subspace Theorem for number fields, Compos. Math. 82, 245–273. Schlickewei, H. P. (1996a), Multiplicities of recurrence sequences, Acta Math. 176, 171–243. Schlickewei, H. P. (1996b), Equations in roots of unity, Acta Arith. 76, 99–108. Schlickewei, H. P. and W. M. Schmidt (2000), The Number of Solutions of PolynomialExponential Equations, Compos. Math. 120, 193–225. Schlickewei, H. P. and C. Viola (1997), Polynomials that divide many trinomials, Acta Arith. 78, 267–273. Schlickewei, H. P. and C. Viola (1999), Polynomials that divide many k-nomials, in: Number Theory in Progress, Vol. I, Proc. conf. in honour of the 60th birthday of Andrzej Schinzel, K. Győry, H. Iwaniec and J. Urbanowicz eds. de Gruyter, pp. 445–450. Schlickewei, H. P. and E. Wirsing (1997), Lower bounds for the heights of solutions of linear equations, Invent. Math. 129, 1–10. Schmidt, W. M. (1971), Linearformen mit algebraischen Koeffizienten II, Math. Ann. 191, 1–20. 
Schmidt, W. M. (1972), Norm form equations, Ann. Math. 96, 526–551. Schmidt, W. M. (1973), Inequalities for resultants and for decomposable forms, in: Diophantine Approximation and its Applications, Academic Press, pp. 235– 253. Schmidt, W. M. (1975), Simultaneous approximation to algebraic numbers by elements of a number field, Monatsh. Math. 79, 55–66. Schmidt, W. M. (1978), Thue’s equation over function fields, J. Austral. Math. Soc. Ser A 25, 385–422. Schmidt, W. M. (1980), Diophantine Approximation, Lecture Notes Math. 785, Springer Verlag. Schmidt, W. M. (1989), The subspace theorem in diophantine approximation, Compos. Math. 96, 121–173. Schmidt, W. M. (1990), The number of solutions of norm form equations, Trans. Amer. Math. Soc. 317, 197–227. Schmidt, W. M. (1991), Diophantine Approximations and Diophantine Equations, Lecture Notes Math. 1467, Springer Verlag. Schmidt, W. M. (1992), Integer points on curves of genus 1, Compositio Math. 81, 33–59. Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:30:52, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.013 354 References Schmidt, W. M. (1996), Heights of points on subvarieties of Gnm , In: Number Theory 1993–94, London Math. Soc. Lecture Note Ser. 235, S. David, ed., 157–187. Cambridge University Press. Schmidt, W. M. (1999), The zero multiplicity of linear recurrence sequences, Acta Math. 182, 243–282. Schmidt, W. M. (2000), Zeros of linear recurrence sequences, Publ. Math. Debrecen 56, 609–630. Schmidt, W. M. (2003), Linear recurrence sequences, in: Diophantine Approximation, C.I.M.E. Summer school, Cetraro, Italy, June 28–July 6, 2000, F. Amoroso, U. Zannier, eds., Lecture Notes Math. 1819, Springer Verlag, pp. 171–247. Schmidt, W. M. (2009), The Diophantine equation α1x1 · · · αnxn = f (x1 , . . . , xn ), in: Analytic Number Theory. Essays in Honour of Klaus Roth, W. W. 
L. Chen, W. T. Gowers, H. Halberstem and W. M. Schmidt, eds., pp. 414–420. Cambridge University Press. Schneider, T. (1934), Transzendenzuntersuchungen periodischer Funktionen: I Transzendenz von Potenzen; II Transzendenzeigenschaften elliptischer Funktionen, J. Reine Angew. Math. 172, 65–74. Sehgal, S. (1978), Topics in Group Rings, Marcel Dekker. Seidenberg, A. (1974), Constructions in algebra, Trans. Amer. Math. Soc. 197, 273– 313. Serre, J.-P. (1989), Lectures on the Mordell-Weil theorem, Aspects of Math. E15, Vieweg. Shorey, T. N. and R. Tijdeman (1986), Exponential Diophantine Equations, Cambridge University Press. Siegel, C. L. (1921), Approximation algebraischer Zahlen, Math. Z. 10, 173–213. Siegel, C. L. (1926), The integer solutions of the equation y 2 = ax n + bx n−1 + · · · + k, J. London Math. Soc. 1, 66–68. Siegel, C. L. (1929), Über einige Anwendungen diophantischer Approximationen, Abh. Preuss. Akad. Wiss., Phys. Math. Kl., No. 1. Siegel, C. L. (1969), Abschätzung von Einheiten, Nachr. Göttingen, 71–86. Silverman, J. H. (1984), The S-unit equation over function fields, Math. Proc. Camb. Phil. Soc. 95, 3–4. Silverman, J. H. (1995), Exceptional units and numbers of small Mahler measure, Experiment. Math. 4, 70–83. Silverman, J. H. (2007), The arithmetic of dynamical systems, Springer Verlag. Simmons, H. (1970), The solution of a decision problem for several classes of rings, Pacific J. Math. 34, 547–557. Simon, D. (2001), The index of nonmonic polynomials, Indag. Math. (N.S) 12, 505–517. Skolem, Th. (1933), Einige Sätze über gewisse Reihenentwicklungen und exponentiale Beziehungen mit Anwendung auf diophantische Gleichungen, Oslo Vid. akad. Skrifter 6, 1–61. Skolem, Th. (1935), Ein Verfahren zur Behandlung gewisser exponentialer Gleichungen, 8. Skand. Mat.-Kongr. Stockholm 163–188. Smart, N. (1995), The solution of triangularly connected decomposable form equations, Math. Comp. 64, 819–840. Smart, N. P. 
(1997), S-unit equations, binary forms and curves of genus 2, Proc. London Math. Soc. (3) 75, 271–307. Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:30:52, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.013 References 355 Smart, N. P. (1998), The Algorithmic Resolution of Diophantine Equations, Cambridge University Press. Smart, N. P. (1999), Determining the small solutions to S-unit equations, Math. Comput. 68, 1687–1699. Sprindžuk, V. G. (1969), Effective estimates in “ternary” exponential diophantine equations (Russian), Dokl. Akad. Nauk BSSR, 13, 777–780. Sprindžuk, V. G. (1973), Squarefree divisors of polynomials and class numbers of algebraic number fields (Russian), Acta Arith. 24, 143–149. Sprindžuk, V. G. (1974), Representation of numbers by the norm forms with two dominating variables, J. Number Theory, 6, 481–486. Sprindžuk, V. G. (1976), A hyperelliptic diophantine equation and class numbers (Russian), Acta Arith. 30, 95–108. Sprindžuk, V. G. (1982), Classical Diophantine Equations in Two Unknowns (Russian), Nauka. Sprindžuk, V. G. (1993), Classical Diophantine Equations, Lecture Notes Math. 1559, Springer Verlag. Stewart, C. L. and R. Tijdeman (1986), On the Oesterlé-Masser conjecture, Monatsh. Math. 102, 251–257. Stewart, C. L. and K. Yu (1991), On the abc conjecture, Math. Ann. 291, 225–230. Stewart, C. L. and K. Yu (2001), On the abc conjecture, II, Duke Math. J. 108, 169– 181. Stothers, W. W. (1981), Polynomial identities and Hauptmodulen, Quart. J. Math. Oxford Ser. (2) 32, 349–370. Stroeker, R. J. and N. Tzanakis (1994), Solving elliptic Diophantine equations by estimating linear forms in elliptic logarithms, Acta Arith. 67, 177–196. Sunley, J. S. (1973), Class numbers of totally imaginary quadratic extensions of totally real fields, Trans. Amer. Math. Soc. 175, 209–232. Surroca, A. 
(2007), Sur l’effectivité du théorème de Siegel et la conjecture abc, J. Number Theory, 124, 267-290. Szemerédi, E. (1975), On sets of integers containing no k elements in arithmetic progression, Acta Arith. 27, 299–345. Taylor, R. and A. Wiles (1995), Ring-theoretic properties of certain Hecke algebras, Ann. Math. (2) 141, 553–572. Teske, E. (1998), A space efficient algorithm for group structure computation, Math. Comp. 67, 1637–1663. Thue, A. (1909), Über Annäherungswerte algebraischer Zahlen, J. Reine Angew. Math. 135, 284–305. Thunder, J. L. (2001), Decomposable Form Inequalities, Ann. Math. 153, 767–804. Thunder, J. L. (2005), Asymptotic estimates for the number of integer solutions to decomposable form inequalities, Compos. Math. 141 (2005), 271–292. Tichy, R. F and V. Ziegler (2007), Units generating the ring of integers of complex cubic fields, Colloq. Math. 109, 71–83. Tzanakis, N. (2013), Elliptic Diophantine Equations, de Gruyter. Tzanakis, N. and B. M. M. de Weger (1989), On the practical solution of the Thue equation, J. Number Theory 31, 99–132. Vaaler, J. (2014), Heights on groups and small multiplicative dependencies, Trans. Amer. Math. Soc. 366, 3295–3323. Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:30:52, subject to the Cambridge Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.013 356 References Vojta, P. (1983), Integral points on varieties, Ph.D.-thesis, Harvard University. Vojta, P. (1987), Diophantine Approximation and Value Distribution Theory, Lecture Notes in Math. 1239. Springer Verlag. Vojta, P. (1996), Integral points on subvarieties of semiabelian varieties, I, Invent Math. 126, 133–181. Vojta, P. (2000), On the ABC-conjecture and diophantine approxination by rational points, Amer. J. Math. 122, 843–872. Correction, Amer. J. Math. 123 (2001), 383– 384. Voloch, J. F. (1985), Diagonal equations over function fields, Bol. 
Soc. Bras. Mat. 16, 29–39. Voloch, J. F. (1998), The equation ax + by = 1 in characteristic p, J. Number Th. 73, 195–200. Voutier, P. (1996), An effective lower bound for the height of algebraic numbers, Acta Arith. 74, 81–95. Voutier, P. (2014), Modules with many non-associates and norm form equations with many families of solutions, J. Number Theory 138, 20–36. van der Waerden, B. L. (1927), Beweis einer Baudetschen Vermutung, Nieuw. Arch. Wisk. (2) 15, 212–216. Waldschmidt, M. (1973), Propriétés arithmétiques des valeurs de fonctions méromorphes algébriquement indépendantes, Acta Arith. 23, 19–88. Waldschmidt, M. (1974), Nombres Transcendants, Springer Verlag. Waldschmidt, M. (2000), Diophantine approximation on linear algebraic groups, Springer Verlag. Wang, J. T.-Y. (1996), The truncated second main theorem of function fields, J. Number Theory 58, 139–157. Wang, J. T.-Y. (1999), A note on Wronskians and the ABC theorem, Manuscripta Math. 98, 255–264. de Weger, B. (1987), Algorithms for Diophantine Equations, Dissertation, Centrum voor Wiskunde en Informatica, Amsterdam. de Weger, B. (1989), Algorithms for Diophantine Equations, CWI Tract 65, Amsterdam. Wildanger, K. (1997), Über das Lösen von Einheiten- und Indexformgleichungen in algebraischen Zahlkörpern mit einer Anwerdung auf die Bestimmung aller ganzen Punkte einer Mordellschen Kurve, Dissertation, Technical University, Berlin. Wildanger, K. (2000), Über das Lösen von Einheiten- und Indexformgleichungen in algebraischen Zahlkörpern, J. Number Theory 82, 188–224. Wiles, A. (1995), Modular elliptic curves and Fermat’s Last Theorem, Ann. Math. (2) 141, 443–551. Wirsing, E. (1971), On approximation of algebraic numbers by algebraic numbers of bounded degree, in: Proc. Sympos. Pure Math. 20, Amer. Math. Soc., Providence, pp. 213–247. Wüstholz, G., ed. (2002), A panorama of number theory or the view from Baker’s garden, Cambridge University Press. Yu K. 
Glossary of frequently used notation

General

$|S|$: cardinality of a finite set $S$
$\log^* x$: $\max(1, \log x)$, $\log^* 0 := 1$
$\log^*_n x$: $\log^*$ iterated $n$ times applied to $x$
$\ll$, $\gg$: Vinogradov symbols; $A(x) \gg B(x)$ or $B(x) \ll A(x)$ means that there is a constant $c > 0$ such that $A(x) \ge cB(x)$ for all $x$ in the specified domain
$f(x) = O(g(x))$ as $x \to \infty$: there are constants $c_1, c_2 > 0$ such that $|f(x)| \le c_1 g(x)$ for all $x \ge c_2$
$f(x) = o(g(x))$ as $x \to \infty$: $\lim_{x \to \infty} f(x)/g(x) = 0$
$\mathbb{Z}_{>0}$, $\mathbb{Z}_{\ge 0}$: positive integers, non-negative integers
$\mathbb{F}_p$: finite field of $p$ elements
$\mathbb{P}^n(K)$: $n$-dimensional projective space over a field $K$
$A$, $A^+$, $A^*$: ring (always commutative with 1), additive group of $A$, group of units of $A$
$A[X_1, \ldots, X_n]$: ring of polynomials in $n$ variables with coefficients in $A$
$\gcd$: greatest common divisor
$GL(n, A)$, $SL(n, A)$: multiplicative group of $n \times n$-matrices with entries in $A$ and determinant in $A^*$, resp. determinant 1
$L/K$: field extension $L/K$
$\mathrm{Tr}_{L/K}(\alpha)$, $N_{L/K}(\alpha)$: trace, norm of $\alpha \in L$ over $K$
$D_{L/K}(\omega_1, \ldots, \omega_n)$: discriminant of a $K$-basis $\{\omega_1, \ldots, \omega_n\}$ of $L$
$D(f)$, $D(F)$: discriminant of a polynomial $f(X)$, binary form $F(X, Y)$
$R(f, g)$, $R(F, G)$: resultant of polynomials $f(X)$, $g(X)$, binary forms $F(X, Y)$, $G(X, Y)$

Number fields

$\mathrm{ord}_p(a)$: exponent of a prime number $p$ in the unique prime factorization of $a \in \mathbb{Q}$, $\mathrm{ord}_p(0) = \infty$
$|a|_p$: $p^{-\mathrm{ord}_p(a)}$, $p$-adic absolute value of $a \in \mathbb{Q}$
$|a|_\infty$: $\max(a, -a)$, ordinary absolute value of $a \in \mathbb{Q}$
$\mathbb{Q}_p$: $p$-adic completion of $\mathbb{Q}$, $\mathbb{Q}_\infty = \mathbb{R}$
$M_{\mathbb{Q}}$: $\{\infty\} \cup \{\text{primes}\}$, set of places of $\mathbb{Q}$
$O_K$, $D_K$, $h_K$, $R_K$: ring of integers, discriminant, class number, regulator of a number field $K$
$\mathfrak{p}$, $\mathfrak{a}$: non-zero prime ideal, fractional ideal of $O_K$
$(\alpha) = \alpha O_K$: fractional ideal generated by $\alpha$
$\mathrm{ord}_{\mathfrak{p}}(\mathfrak{a})$: exponent of $\mathfrak{p}$ in the unique prime ideal factorization of $\mathfrak{a}$
$\mathrm{ord}_{\mathfrak{p}}(\alpha)$: exponent of $\mathfrak{p}$ in the unique prime ideal factorization of $(\alpha)$ for $\alpha \in K$, with $\mathrm{ord}_{\mathfrak{p}}(0) := \infty$
$N_K(\mathfrak{a})$: absolute norm of a fractional ideal $\mathfrak{a}$ of $O_K$ (written as $N(\mathfrak{a})$ if it is clear which is the underlying number field)
$e(\mathfrak{P}|\mathfrak{p})$, $f(\mathfrak{P}|\mathfrak{p})$: ramification index, residue class degree of a prime ideal $\mathfrak{P}$ over a prime ideal $\mathfrak{p}$
$M_K$: set of places of a number field $K$
$M_K^\infty$: set of infinite (archimedean) places of $K$
$M_K^0$: set of finite (non-archimedean) places of $K$, identified with the non-zero prime ideals of $O_K$
$|\cdot|_v$ ($v \in M_K$): normalized absolute values of $K$, satisfying the product formula, with $|\alpha|_v := N_K(\mathfrak{p})^{-\mathrm{ord}_{\mathfrak{p}}(\alpha)}$ if $\alpha \in K$ and $v = \mathfrak{p}$ is a prime ideal of $O_K$
$K_v$: completion of $K$ at $v$
$S$: finite set of places of $K$, containing $M_K^\infty$
$O_S$: $\{\alpha \in K : |\alpha|_v \le 1 \text{ for } v \in M_K \setminus S\}$, ring of $S$-integers, written as $\mathbb{Z}_S$ if $K = \mathbb{Q}$
$O_S^*$: $\{\alpha \in K : |\alpha|_v = 1 \text{ for } v \in M_K \setminus S\}$, group of $S$-units, written as $\mathbb{Z}_S^*$ if $K = \mathbb{Q}$
$N_S(\alpha)$: $\prod_{v \in S} |\alpha|_v$, $S$-norm of $\alpha \in K$
$|\mathbf{x}|_v$ ($v \in M_K$): $\max_i |x_i|_v$, $v$-adic norm of $\mathbf{x} = (x_1, \ldots, x_n) \in K^n$
$H^{\mathrm{hom}}(\mathbf{x})$: $\bigl(\prod_{v \in M_K} |\mathbf{x}|_v\bigr)^{1/[K:\mathbb{Q}]}$, absolute homogeneous height of $\mathbf{x} \in K^n$
$H(\mathbf{x})$: $\bigl(\prod_{v \in M_K} \max(1, |\mathbf{x}|_v)\bigr)^{1/[K:\mathbb{Q}]}$, absolute height of $\mathbf{x} \in K^n$
$H(\alpha)$: $\bigl(\prod_{v \in M_K} \max(1, |\alpha|_v)\bigr)^{1/[K:\mathbb{Q}]}$, absolute height of $\alpha \in K$
$h^{\mathrm{hom}}(\mathbf{x})$, $h(\mathbf{x})$, $h(\alpha)$: $\log H^{\mathrm{hom}}(\mathbf{x})$, $\log H(\mathbf{x})$, $\log H(\alpha)$, absolute logarithmic heights

Function fields

$k$: field of constants (always algebraically closed)
$k((z))$: field of Laurent series in $z$
$g_{K/k}$: genus of function field $K$ with constant field $k$
$M_K$: set of (normalized discrete) valuations of $K$, trivial on $k$
$v(\mathbf{x})$ ($v \in M_K$): $\min_i v(x_i)$, $v$-adic norm of $\mathbf{x} = (x_1, \ldots, x_n) \in K^n$
$H_K^{\mathrm{hom}}(\mathbf{x})$: $-\sum_{v \in M_K} v(\mathbf{x})$, homogeneous height of $\mathbf{x} \in K^n$
$H_K(x)$: $\sum_{v \in M_K} \max(0, -v(x))$, height of $x \in K$
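For the simplest case $K = \mathbb{Q}$, the number-field entries of the glossary ($\mathrm{ord}_p$, $|a|_p$, the product formula, the absolute height and the group of $S$-units) can be checked numerically. The following is only a minimal sketch under that assumption and is not taken from the book: the function names `ord_p`, `abs_p`, `prime_factors`, `product_formula`, `height` and `is_s_unit` are illustrative, and trial division is used purely for simplicity.

```python
from fractions import Fraction
from math import prod


def ord_p(a: Fraction, p: int) -> float:
    """Exponent of the prime p in the factorization of a in Q; ord_p(0) = infinity."""
    if a == 0:
        return float("inf")
    n, d, e = a.numerator, a.denominator, 0
    while n % p == 0:
        n //= p
        e += 1
    while d % p == 0:
        d //= p
        e -= 1
    return e


def abs_p(a: Fraction, p: int) -> Fraction:
    """p-adic absolute value |a|_p = p^(-ord_p(a)), with |0|_p = 0."""
    if a == 0:
        return Fraction(0)
    return Fraction(p) ** (-ord_p(a, p))


def prime_factors(n: int) -> set[int]:
    """Set of primes dividing n (trial division; illustrative only)."""
    n = abs(n)
    found, q = set(), 2
    while q * q <= n:
        while n % q == 0:
            found.add(q)
            n //= q
        q += 1
    if n > 1:
        found.add(n)
    return found


def product_formula(a: Fraction) -> Fraction:
    """Product of |a|_v over all places v of Q, for non-zero a; should equal 1."""
    primes = prime_factors(a.numerator) | prime_factors(a.denominator)
    return abs(a) * prod(abs_p(a, p) for p in primes)


def height(x: list[Fraction]) -> Fraction:
    """Absolute height H(x) = prod_v max(1, |x|_v) for x in Q^n, where [K:Q] = 1."""
    # Only primes dividing some denominator can give |x|_p > 1.
    primes = set().union(*(prime_factors(xi.denominator) for xi in x))
    finite = prod(max([Fraction(1), *(abs_p(xi, p) for xi in x)]) for p in primes)
    infinite = max([Fraction(1), *(abs(xi) for xi in x)])
    return finite * infinite


def is_s_unit(a: Fraction, s_primes: set[int]) -> bool:
    """True if |a|_p = 1 for every prime p outside s_primes (S = {infinity} + s_primes)."""
    if a == 0:
        return False
    return prime_factors(a.numerator) | prime_factors(a.denominator) <= s_primes


if __name__ == "__main__":
    a = Fraction(-12, 35)
    print(product_formula(a))                          # 1
    print(height([Fraction(3, 4), Fraction(-5, 6)]))   # 12
    print(is_s_unit(Fraction(9, 8), {2, 3}))           # True
```

The height value 12 agrees with clearing denominators: $(1 : 3/4 : -5/6) = (12 : 9 : -10)$, whose coordinates are coprime with maximum absolute value 12.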
Cambridge University Press 978-1-107-09760-5 - Unit Equations in Diophantine Number Theory, Jan-Hendrik Evertse and Kálmán Győry

Index

A-order, 275
abc-conjecture, 90
  number field version, 92
abc-theorem for function fields, 174
absolute value, 12
  archimedean, 13
  continuation, 14
  equivalence, 13
  extension, 14
  non-archimedean, 13
  trivial, 12
additive unit representation, 287
algebraic coset, 321
algebraic function field, 30
algebraic subgroup, 321

bad reduction, of rational self-map, 296
Baker’s method, 98
Baker’s type inequalities, 97
binary form, 231

canonical number system, 310
characteristic polynomial, 3
class group, 10
class number, 10, 11
CM-field, 121
CNS basis, 310
CNS order, 310
completion, 13
complex place, 15
cycle, 291
cycle, polynomial, 292

decomposable form, 231
decomposable form equation, 232, 263, 272, 286
decomposable form inequality, 278, 282
decomposable form of discriminant type, 275
decomposable form, triangularly connected, 263
decomposable polynomial, 282
decomposable polynomial equations, 282
Dedekind domain, 6
derivative of algebraic function, 36
difference graph, 301
differential, 35
  holomorphic, 36
discrete valuation, 7, 13
discriminant
  of basis, 4
  of number field, 10, 11
discriminant equation, 305
discriminant form, 268, 276
discriminant form equation, 233, 263, 268, 272
division group, 324

effective specialization, 198
effectively computable algebraic number, 23
effectively computable fractional ideal, 24
effectively given algebraic number, 23
effectively given fractional ideal, 24
effectively given number field, 23
elliptic equation, 273
equivalence of binary forms, 311
equivalent
  of algebraic integers, 306
  of monic polynomials, 306
Euclidean norm, 123
exceptional units, 121
explicitly presented field, 37, 175
exponential-polynomial equations, 326
Extension Formula, 33

family of solutions of decomposable form equation, 248
Fermat’s Last Theorem, 91
field of p-adic numbers, 26
field with absolute value, 13
  complete, 13
  completion, 13
Fincke–Pohst algorithm, 117, 118
finite étale K-algebra, 246
finite place
  of Q, 14
  of number field, 15
fractional ideal, 5
  absolute norm, 9
  extension, 7
  generated by S, 5
  greatest common divisor, 6
  inverse, 6
  lowest common multiple, 6
  product, 6
  relative norm, 8
fundamental system of S-units, 18

Gal(G/K)-proper, 234
Gal(G/K)-stable, 234
Galois symmetric S-unit vector, 251
generalized Fermat equation, 91
genus, 36, 173
GL(2, A)-equivalence, 317
good reduction, of rational self-map, 296
Gram–Schmidt orthogonalization process, 123
group of S-units, 17

height
  S-height, 44, 130
  absolute logarithmic, 19
  absolute multiplicative, 19
  homogeneous of polynomial over function field, 35
  homogeneous of vector over function field, 33
  logarithmic of finite set S, 201
  logarithmic of matrix, 201
  logarithmic of vector, 21
  multiplicative homogeneous of vector, 21
  multiplicative of vector, 21
  of algebraic function, 34
  of polynomial, 22
  twisted, 46
hyperelliptic equation, 273

ideal membership algorithm, 199, 204
index form, 276
index form equation, 268, 272
infinite place
  of Q, 14
  of number field, 15
infinite valuation function field, 33
inner product on, 123
irreducible family of solutions of decomposable form equation, 255

KANT, 119, 123
k-nomial, 298
k-proportional solutions, 180

Lang’s Conjecture, 322, 323
Lang–Bogomolov Conjecture, 325
lattice
  full in real vector space, 68
  in real vector space, 68
lattice, full in real vector space, 10
Laurent series, 31
length, of cycle, 291
linear forms in logarithms, 52
linear recurrence sequence, 326
  companion polynomial, 326
  non-degenerate, 327
  order, 326
  zero-multiplicity, 327
LLL-reduced basis, 104, 110, 123
LLL-reduction algorithm, 103, 124, 125
local parameter, 31
local ring of discrete valuation, 30

MAGMA, 119
Mahler measure, 21
minimal polynomial over Z, 20
MINIMIZE, 117
Minkowski’s Theorem on successive minima, 68
monic minimal polynomial, 3
monogenic number field, 309
monogenic order, 309
monogenic, k times, 309
Mordell’s Conjecture, 322, 323
Mordell’s equation, 273
Mordell–Weil Theorem, 322

Noetherian module, 238
Noetherian ring, 238
non-degenerate solution, 180
non-degenerate solutions, 128
norm
  absolute, of fractional ideal, 9
  of algebraic number, 4
  on real vector space, 68
  relative, of fractional ideal, 8
  relative, of prime ideal, 8
  unit ball, 68
  v-adic of polynomial, 22
  v-adic of vector, 21
norm form, 244
norm form equation, 233, 244, 250, 263, 267
normal closure, 3

orbit, 291
orbit, finite, 291
orbit, finite polynomial, 292
order, in number field, 308
order, monogenic, 309

p-adic exponential, 28
p-adic logarithm, 27
p-adic numbers, 14
pair of representatives, 199
periodic point, 291
place lying above, 16
place lying below, 16
power integral basis, 268, 308
preperiodic point, 291
prime ideal of ring of integers, 6
Product Formula, 14, 15
Puiseux expansions, 31

radical, 89
ramification index, 8, 31
ramification index of local field, 26
Ramsey theory, 288
real place, 15
regulator, 11
representation of algebraic function, 37
representation of algebraic function, 175
representative, 199
residue class degree, 8
resultant, 315, 317
resultant equation, 280, 315
ring of p-adic integers, 26
ring of S-integers, 17
Roth’s Theorem, 42, 91

S-integer, 17
S-norm, 17
S-regulator, 18
-symmetric partition, 251
S-units, 17
  S-regulator, 18
  fundamental system, 18
S-unit of function field, 173
self-map, 291
semi-abelian variety, 322
Skolem–Mahler–Lech theorem, 327
specialization, 140, 218
splitting field, 3
Subspace Theorem, 43, 245
  p-adic, 44, 249
  parametric, 45, 46
  quantitative, 45, 252, 279, 329
successive minimum, 68
Sum Formula, 31
superelliptic equation, 273

Thue equation, 232, 250, 258
Thue–Mahler equation, 232, 249, 277
trace, 4
triangle graph, 302

ultrametric inequality, 13
Uniform Boundedness Conjecture, 297
unit equations, 61, 232
  homogeneous, 61
units, 10
  exceptional, 121
  fundamental system, 11
  regulator, 11
  unit group of ring of integers, 10
  unit rank, 11

valuation, 13
  discrete, 7, 13
  value group, 13
valuation on function field, 30, 173
  explicitly given, 38, 175
Vandermonde’s identity, 4

Weak Nullstellensatz, 140
wide family of solutions of decomposable form equation, 246