(Cambridge Studies in Advanced Mathematics 146) Jan-Hendrik Evertse, Kalman Gyory - Unit Equations in Diophantine Number Theory-Cambridge University Press (2015)

CAMBRIDGE STUDIES IN ADVANCED MATHEMATICS 146
Editorial Board
B. BOLLOBÁS, W. FULTON, A. KATOK, F. KIRWAN,
P. SARNAK, B. SIMON, B. TOTARO
UNIT EQUATIONS IN
DIOPHANTINE NUMBER THEORY
Diophantine number theory is an active area that has seen tremendous growth over the
past century, and in this theory unit equations play a central role. This comprehensive
treatment is the first volume devoted to these equations. The authors gather together all
the most important results and look at many different aspects, including effective results
on unit equations over number fields, estimates on the number of solutions, analogues for
function fields, and effective results for unit equations over finitely generated domains.
They also present a variety of applications. Introductory chapters provide the necessary
background in algebraic number theory and function field theory, as well as an account
of the required tools from Diophantine approximation and transcendence theory. This
makes the book suitable for young researchers as well as for experts who are looking
for an up-to-date overview of the field.
Jan-Hendrik Evertse works at the Mathematical Institute of Leiden University. His
research concentrates on Diophantine approximation and applications to Diophantine
problems. In this area he has obtained some influential results, in particular on estimates
for the numbers of solutions of Diophantine equations and inequalities. He has written
more than 75 research papers and co-authored one book with Bas Edixhoven entitled
Diophantine Approximation and Abelian Varieties.
Kálmán Győry is Professor Emeritus at the University of Debrecen, a member of the
Hungarian Academy of Sciences and a well-known researcher in Diophantine number
theory. Over his career he has obtained several significant and pioneering results, among
others on unit equations, decomposable form equations, and their various applications.
His results have been published in one book and 160 research papers. Győry is also the
founder and leader of the number theory research group in Debrecen, which consists of
his former students and their students.
CAMBRIDGE STUDIES IN ADVANCED MATHEMATICS
Editorial Board:
B. Bollobás, W. Fulton, A. Katok, F. Kirwan, P. Sarnak, B. Simon, B. Totaro
All the titles listed below can be obtained from good booksellers or from Cambridge University Press.
For a complete series listing visit www.cambridge.org/mathematics.
Already published
109 H. Geiges An introduction to contact topology
110 J. Faraut Analysis on Lie groups: An introduction
111 E. Park Complex topological K-theory
112 D. W. Stroock Partial differential equations for probabilists
113 A. Kirillov, Jr An introduction to Lie groups and Lie algebras
114 F. Gesztesy et al. Soliton equations and their algebro-geometric solutions, II
115 E. de Faria & W. de Melo Mathematical tools for one-dimensional dynamics
116 D. Applebaum Lévy processes and stochastic calculus (2nd Edition)
117 T. Szamuely Galois groups and fundamental groups
118 G. W. Anderson, A. Guionnet & O. Zeitouni An introduction to random matrices
119 C. Perez-Garcia & W. H. Schikhof Locally convex spaces over non-Archimedean valued fields
120 P. K. Friz & N. B. Victoir Multidimensional stochastic processes as rough paths
121 T. Ceccherini-Silberstein, F. Scarabotti & F. Tolli Representation theory of the symmetric groups
122 S. Kalikow & R. McCutcheon An outline of ergodic theory
123 G. F. Lawler & V. Limic Random walk: A modern introduction
124 K. Lux & H. Pahlings Representations of groups
125 K. S. Kedlaya p-adic differential equations
126 R. Beals & R. Wong Special functions
127 E. de Faria & W. de Melo Mathematical aspects of quantum field theory
128 A. Terras Zeta functions of graphs
129 D. Goldfeld & J. Hundley Automorphic representations and L-functions for the general linear group, I
130 D. Goldfeld & J. Hundley Automorphic representations and L-functions for the general linear group, II
131 D. A. Craven The theory of fusion systems
132 J. Väänänen Models and games
133 G. Malle & D. Testerman Linear algebraic groups and finite groups of Lie type
134 P. Li Geometric analysis
135 F. Maggi Sets of finite perimeter and geometric variational problems
136 M. Brodmann & R. Y. Sharp Local cohomology (2nd Edition)
137 C. Muscalu & W. Schlag Classical and multilinear harmonic analysis, I
138 C. Muscalu & W. Schlag Classical and multilinear harmonic analysis, II
139 B. Helffer Spectral theory and its applications
140 R. Pemantle & M. C. Wilson Analytic combinatorics in several variables
141 B. Branner & N. Fagella Quasiconformal surgery in holomorphic dynamics
142 R. M. Dudley Uniform central limit theorems (2nd Edition)
143 T. Leinster Basic category theory
144 I. Arzhantsev, U. Derenthal, J. Hausen & A. Laface Cox rings
145 M. Viana Lectures on Lyapunov exponents
146 J.-H. Evertse & K. Győry Unit equations in Diophantine number theory
147 A. Prasad Representation theory
148 S. R. Garcia, J. Mashreghi & W. T. Ross Introduction to model spaces and their operators
Unit Equations in
Diophantine Number Theory
JAN-HENDRIK EVERTSE
Universiteit Leiden
KÁLMÁN GYŐRY
Debreceni Egyetem, Hungary
University Printing House, Cambridge CB2 8BS, United Kingdom
Cambridge University Press is part of the University of Cambridge.
It furthers the University’s mission by disseminating knowledge in the pursuit of
education, learning and research at the highest international levels of excellence.
www.cambridge.org
Information on this title: www.cambridge.org/9781107097605
© Jan-Hendrik Evertse and Kálmán Győry 2015
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 2015
Printed in the United Kingdom by Clays, St Ives plc
A catalogue record for this publication is available from the British Library
ISBN 978-1-107-09760-5 Hardback
Cambridge University Press has no responsibility for the persistence or accuracy
of URLs for external or third-party internet websites referred to in this publication,
and does not guarantee that any content on such websites is, or will remain,
accurate or appropriate.
Contents
Preface
Summary
PART I PRELIMINARIES
1 Basic algebraic number theory
  1.1 Characteristic polynomial, trace, norm, discriminant
  1.2 Ideal theory for algebraic number fields
  1.3 Extension of ideals; norm of ideals
  1.4 Discriminant, class number, unit group and regulator
  1.5 Explicit estimates
  1.6 Absolute values: generalities
  1.7 Absolute values and places on number fields
  1.8 S-integers, S-units and S-norm
  1.9 Heights
    1.9.1 Heights of algebraic numbers
    1.9.2 v-adic norms and heights of vectors and polynomials
  1.10 Effective computations in number fields
  1.11 p-adic numbers
2 Algebraic function fields
  2.1 Valuations
  2.2 Heights
  2.3 Derivatives and genus
  2.4 Effective computations
3 Tools from Diophantine approximation and transcendence theory
  3.1 The Subspace Theorem and some variations
  3.2 Effective estimates for linear forms in logarithms
PART II UNIT EQUATIONS AND APPLICATIONS
4 Effective results for unit equations in two unknowns over number fields
  4.1 Effective bounds for the heights of the solutions
    4.1.1 Equations in units of a number field
    4.1.2 Equations with unknowns from a finitely generated multiplicative group
  4.2 Approximation by elements of a finitely generated multiplicative group
  4.3 Tools
    4.3.1 Some geometry of numbers
    4.3.2 Estimates for units and S-units
  4.4 Proofs
    4.4.1 Proofs of Theorems 4.1.1 and 4.1.2
    4.4.2 Proofs of Theorems 4.2.1 and 4.2.2
    4.4.3 Proofs of Theorem 4.1.3 and its corollaries
  4.5 Alternative methods, comparison of the bounds
    4.5.1 The results of Bombieri, Bombieri and Cohen, and Bugeaud
    4.5.2 The results of Murty, Pasten and von Känel
  4.6 The abc-conjecture
  4.7 Notes
    4.7.1 Historical remarks and some related results
    4.7.2 Some notes on applications
5 Algorithmic resolution of unit equations in two unknowns
  5.1 Application of Baker’s type estimates
    5.1.1 Infinite places
    5.1.2 Finite places
  5.2 Reduction of the bounds
    5.2.1 Infinite places
    5.2.2 Finite places
  5.3 Enumeration of the “small” solutions
  5.4 Examples
  5.5 Exceptional units
  5.6 Supplement: LLL lattice basis reduction
  5.7 Notes
6 Unit equations in several unknowns
  6.1 Results
    6.1.1 A semi-effective result
    6.1.2 Upper bounds for the number of solutions
    6.1.3 Lower bounds
  6.2 Proofs of Theorem 6.1.1 and Corollary 6.1.2
  6.3 A sketch of the proof of Theorem 6.1.3
    6.3.1 A reduction
    6.3.2 Notation
    6.3.3 Covering results
    6.3.4 The large solutions
    6.3.5 The small solutions, and conclusion of the proof
  6.4 Proof of Theorem 6.1.4
  6.5 Proof of Theorem 6.1.6
  6.6 Proofs of Theorems 6.1.7 and 6.1.8
  6.7 Notes
7 Analogues over function fields
  7.1 Mason’s inequality
  7.2 Proofs
  7.3 Effective results in the more unknowns case
  7.4 Results on the number of solutions
  7.5 Proof of Theorem 7.4.1
    7.5.1 Extension to the k-closure of Γ
    7.5.2 Some algebraic geometry
    7.5.3 Proof of Theorem 7.5.1
  7.6 Results in positive characteristic
8 Effective results for unit equations in two unknowns over finitely generated domains
  8.1 Statements of the results
  8.2 Effective linear algebra over polynomial rings
  8.3 A reduction
  8.4 Bounding the degree in Proposition 8.3.7
  8.5 Specializations
  8.6 Bounding the height in Proposition 8.3.7
  8.7 Proof of Theorem 8.1.3
  8.8 Notes
9 Decomposable form equations
  9.1 A finiteness criterion for decomposable form equations
  9.2 Reduction of unit equations to decomposable form equations
  9.3 Reduction of decomposable form equations to unit equations
    9.3.1 Proof of the equivalence (ii) ⇐⇒ (iii) in Theorem 9.1.1
    9.3.2 Proof of the implication (i) ⇒ (iii) in Theorem 9.1.1
    9.3.3 Proof of the implication (iii) ⇒ (i) in Theorem 9.1.1
  9.4 Finiteness of the number of families of solutions
  9.5 Upper bounds for the number of solutions
    9.5.1 Galois symmetric S-unit vectors
    9.5.2 Consequences for decomposable form equations and S-unit equations
  9.6 Effective results
    9.6.1 Thue equations
    9.6.2 Decomposable form equations in an arbitrary number of unknowns
  9.7 Notes
10 Further applications
  10.1 Prime factors of sums of integers
  10.2 Additive unit representations in finitely generated integral domains
  10.3 Orbits of polynomial and rational maps
  10.4 Polynomials dividing many k-nomials
  10.5 Irreducible polynomials and arithmetic graphs
  10.6 Discriminant equations and power integral bases in number fields
  10.7 Binary forms of given discriminant
  10.8 Resultant equations for monic polynomials
  10.9 Resultant inequalities and equations for binary forms
  10.10 Lang’s Conjecture for tori
  10.11 Linear recurrence sequences and exponential-polynomial equations
  10.12 Algebraic independence results
References
Glossary of frequently used notation
Index
Preface
Diophantine number theory (the study of Diophantine equations, Diophantine
inequalities and their applications) is a very active area in number theory with a
long history. This book is about unit equations, a class of Diophantine equations
of central importance in Diophantine number theory, and their applications.
Unit equations are equations of the form
$$a_1 x_1 + \cdots + a_n x_n = 1$$
to be solved in elements $x_1, \ldots, x_n$ from a finitely generated multiplicative
group $\Gamma$, contained in a field $K$, where $a_1, \ldots, a_n$ are non-zero elements of
$K$. Such equations were studied originally in the cases where the number
of unknowns $n = 2$, $K$ is a number field and $\Gamma$ is the group of units of the
ring of integers of $K$, or more generally, where $\Gamma$ is the group of $S$-units
in $K$. Unit equations have a great variety of applications, among others to
other classes of Diophantine equations, to algebraic number theory and to
Diophantine geometry.
Certain results concerning unit equations and their applications covered in
our book were already presented, mostly in special or weaker form, in the books
of Lang (1962, 1978, 1983), Győry (1980b), Sprindžuk (1982, 1993), Evertse
(1983), Mason (1984), Shorey and Tijdeman (1986), de Weger (1989), Schmidt
(1991), Smart (1998), Bombieri and Gubler (2006), Baker and Wüstholz (2007)
and Zannier (2009), and in the survey papers of Evertse, Győry, Stewart and
Tijdeman (1988b), Győry (1992a, 1996, 2002a, 2010) and Bérczes, Evertse
and Győry (2007b).
In 1988, we wrote, together with Stewart and Tijdeman, the survey Evertse,
Győry, Stewart and Tijdeman (1988b) on unit equations and their applications
giving the state of the art of the subject at that time. Since then, the theory
of unit equations has been greatly expanded. In the present book we have
tried to give a comprehensive and up-to-date treatment of unit equations and
their applications. We prove effective finiteness results for unit equations in
two unknowns, describe practical algorithms to solve such equations, give
explicit upper bounds for the number of solutions, discuss analogues of unit
equations over function fields and over finitely generated domains, and present
various applications. The proofs of the results concerning unit equations are
mostly based on the very powerful Thue–Siegel–Roth–Schmidt theory from
Diophantine approximation and Baker’s theory from transcendence theory. We
note that there are other important methods and applications, some discovered
very recently, that deserve a detailed discussion, but to which we could pay
only little or no attention due to lack of time and space.
The present book is the first in a series of two. The second book,
titled Discriminant Equations in Diophantine Number Theory, also published
by Cambridge University Press, is about polynomials and binary forms of
given discriminant, with applications to algebraic number theory, Diophantine
approximation and Diophantine geometry. There, we will apply the results
from the present book. The contents of these two books are an outgrowth of
research done by the two authors since the 1970s.
The present book is aimed at anybody (graduate students and experts) with
basic knowledge of algebra (groups, commutative rings, fields, Galois theory)
and elementary algebraic number theory. For the convenience of the reader, in Part
I of the book we have provided some necessary background.
Acknowledgments
We are very grateful to Yann Bugeaud, Andrej Dujella, István Gaál, Rafael
von Känel, Attila Pethő, Michael Pohst, Andrzej Schinzel and two anonymous
referees for carefully reading and critically commenting on some chapters of
our book, to Csaba Rakaczki for his careful typing of a considerable part of
this book, and to Cambridge University Press, in particular David Tranah, Sam
Harrison and Clare Dennison, for their suggestions for and assistance with the
final preparation of the manuscript.
The research of the second named author was supported in part by Grants
100339 and 104208 from the Hungarian National Foundation for Scientific
Research (OTKA).
Summary
We start with a brief historical overview and then outline the contents of
our book. Thue (1909) proved that if $F \in \mathbb{Z}[X, Y]$ is a binary form (i.e., a
homogeneous polynomial) of degree at least 3 which is irreducible over $\mathbb{Q}$ and
if $\delta$ is a non-zero integer, then the equation
$$F(x, y) = \delta \quad \text{in } x, y \in \mathbb{Z}$$
(nowadays called a Thue equation) has only finitely many solutions. To this end,
Thue developed a very original Diophantine approximation method concerning
the approximation of algebraic numbers by rationals, which was extended later
by Siegel, Dyson, Gelfond and Roth.
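As a small computational illustration of Thue’s theorem, the following Python sketch lists the solutions of one Thue equation inside a finite box. The form $X^3 - 2Y^3$, the value $\delta = 1$, the helper name `thue_solutions_in_box` and the box size 50 are all assumptions chosen for this example and are not taken from the book. A bounded search only exhibits solutions; it proves nothing about solutions outside the box.

```python
def thue_solutions_in_box(F, delta, bound):
    """Return all integer pairs (x, y) with |x|, |y| <= bound and F(x, y) == delta.

    Naive exhaustive search; it does not certify that no solutions
    exist outside the box.
    """
    return [
        (x, y)
        for x in range(-bound, bound + 1)
        for y in range(-bound, bound + 1)
        if F(x, y) == delta
    ]


def F(x, y):
    """Hypothetical example form X^3 - 2Y^3 (irreducible over Q, degree 3)."""
    return x**3 - 2 * y**3


solutions = thue_solutions_in_box(F, 1, 50)
print(solutions)
```

In this box the search finds the two pairs $(1, 0)$ and $(-1, -1)$.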
Thue’s result was generalized by Siegel (1921) as follows. Let $K$ be an
algebraic number field of degree $d$ with ring of integers $O_K$, let $F \in O_K[X, Y]$
be a binary form of degree $n > 4d^2 - 2d$ such that $F(1, 0) \neq 0$ and $F(X, 1)$
has no multiple zeros, and let $\delta$ be a non-zero element of $O_K$. Then the equation
$$F(x, y) = \delta \quad \text{in } x, y \in O_K$$
has only finitely many solutions. This has the following interesting consequence, which was not stated explicitly by Siegel, but which was implicitly
proved by him. Denote by $O_K^*$ the group of units of $O_K$. Let $a_1, a_2$ be non-zero
elements of the number field $K$. Then the equation
$$a_1 x_1 + a_2 x_2 = 1 \tag{1}$$
has only finitely many solutions in $x_1, x_2 \in O_K^*$. To prove this, choose an
integer $n > 4d^2 - 2d$. By Dirichlet’s Unit Theorem, the group $O_K^*$ is finitely
generated, and thus, any solution $x_1, x_2 \in O_K^*$ of (1) can be written as $x_i = \beta_i \varepsilon_i^n$
for $i = 1, 2$ with $\beta_i, \varepsilon_i \in O_K^*$, such that $\beta_i$ may assume only finitely many
values. Thus, we get a finite number of Thue equations
$$a_1 \beta_1 \varepsilon_1^n + a_2 \beta_2 \varepsilon_2^n = 1,$$
each of which has only finitely many solutions in $\varepsilon_1, \varepsilon_2$.
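The step "every unit is $\beta \varepsilon^n$ with $\beta$ from a finite set" can be made concrete in a toy case. The sketch below assumes $K = \mathbb{Q}(\sqrt{2})$, whose unit group is $\{\pm(1+\sqrt{2})^k : k \in \mathbb{Z}\}$; this choice of field, the exponent $n = 3$ used in the check, and the helper names `mul`, `power`, `split_unit` are illustrative assumptions, not the book's notation. Elements $a + b\sqrt{2}$ are stored as integer pairs `(a, b)`.

```python
def mul(u, v):
    """Multiply a + b*sqrt(2) and c + e*sqrt(2), stored as pairs (a, b), (c, e)."""
    a, b = u
    c, e = v
    return (a * c + 2 * b * e, a * e + b * c)


def power(u, k):
    """Return u**k for a unit u = (a, b) of Z[sqrt(2)]; k may be negative."""
    if k < 0:
        a, b = u
        norm = a * a - 2 * b * b           # equals +1 or -1 for a unit
        u, k = (a * norm, -b * norm), -k   # u^{-1} = norm * conjugate(u)
    result = (1, 0)
    for _ in range(k):
        result = mul(result, u)
    return result


ETA = (1, 1)  # the unit 1 + sqrt(2)


def split_unit(sign, k, n):
    """Write x = sign * ETA**k as beta * eps**n with beta = sign * ETA**r, 0 <= r < n."""
    q, r = divmod(k, n)                    # k = n*q + r with 0 <= r < n
    beta = mul((sign, 0), power(ETA, r))
    eps = power(ETA, q)
    return beta, eps


# Collect the values of beta that occur for n = 3 over a range of units.
betas = {split_unit(s, k, 3)[0] for s in (1, -1) for k in range(-6, 7)}
print(sorted(betas))
```

For $n = 3$ only the six values $\pm 1$, $\pm(1+\sqrt{2})$, $\pm(3+2\sqrt{2})$ occur as $\beta$, so equation (1) in units of this ring splits into finitely many equations in $\varepsilon_1, \varepsilon_2$.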
Mahler (1933a) proved another generalization of Thue’s theorem. Let
$F \in \mathbb{Z}[X, Y]$ be a binary form of degree $n \geq 3$ such that $F(1, 0) \neq 0$ and $F(X, 1)$
has no multiple zeros, and let $p_1, \ldots, p_t$ be distinct primes. Then the equation
$$F(x, y) = \pm p_1^{z_1} \cdots p_t^{z_t}$$
(today called a Thue–Mahler equation) has only finitely many solutions in integers $x, y, z_1, \ldots, z_t$ with $\gcd(x, y) = 1$. A consequence of this result, proved
by Mahler in a slightly different formulation, is as follows. Let $a_1, a_2$ be non-zero rational numbers and let $\Gamma$ be the multiplicative group generated by $-1$,
$p_1, \ldots, p_t$. Then (1) has only finitely many solutions in $x_1, x_2 \in \Gamma$. The argument is similar to that above. By extending the set of primes $p_1, \ldots, p_t$, we
may assume that the numerators and denominators of $a_1, a_2$ are composed of
primes from $p_1, \ldots, p_t$. Then, by clearing denominators, we can rewrite (1) as
$$u + v = w,$$
where $u, v, w$ are integers, composed of primes from $p_1, \ldots, p_t$, with
$\gcd(u, v, w) = 1$. Choose $n \geq 3$. Then we may write $u$ as $a x^n$ and $v$ as $b y^n$,
where $a, b, x, y$ are integers composed of primes from $p_1, \ldots, p_t$ and $a, b$ are
from a finite set independent of $x_1, x_2$. Thus, equation (1) can be reduced to a
finite number of Thue–Mahler equations as above with $F = aX^n + bY^n$ which
all have only finitely many solutions.
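To see such an equation concretely, the following Python sketch enumerates solutions of $x_1 + x_2 = 1$ with $x_1, x_2$ in the group generated by $-1, 2, 3$. The choice $a_1 = a_2 = 1$, the primes 2 and 3, the exponent bound 4 and the function names are assumptions made for illustration. The search is a plain bounded enumeration, not the reduction to Thue–Mahler equations described above, and on its own it does not prove that the list is complete.

```python
from fractions import Fraction
from itertools import product


def is_s_unit(q, primes):
    """True if the rational q is +-1 times a product of integer powers of the given primes."""
    if q == 0:
        return False
    n, d = abs(q.numerator), q.denominator
    for p in primes:
        while n % p == 0:
            n //= p
        while d % p == 0:
            d //= p
    return n == 1 and d == 1


def small_solutions(primes, bound):
    """Solutions (x1, x2) of x1 + x2 = 1 where x1 = +-prod p^e with all |e| <= bound
    and x2 = 1 - x1 is again +-1 times a product of powers of the primes."""
    found = set()
    for exps in product(range(-bound, bound + 1), repeat=len(primes)):
        x = Fraction(1)
        for p, e in zip(primes, exps):
            x *= Fraction(p) ** e
        for x1 in (x, -x):
            x2 = 1 - x1
            if is_s_unit(x2, primes):
                found.add((x1, x2))
    return found


sols = small_solutions([2, 3], 4)
print(len(sols))
for x1, x2 in sorted(sols):
    print(x1, "+", x2, "= 1")
```

With these parameters the search returns 21 pairs, among them $\tfrac{1}{9} + \tfrac{8}{9} = 1$ and $9 + (-8) = 1$.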
Lang (1960) considered equation (1) with unknowns x1 , x2 taken from a
finitely generated multiplicative group, and was the first to realize the central
importance of this equation. He proved the general result that if $a_1, a_2$ are non-zero elements from an arbitrary field $K$ of characteristic 0 and $\Gamma$ is an arbitrary
finitely generated multiplicative subgroup of $K^*$, then (1) has only finitely many
solutions in elements $x_1, x_2 \in \Gamma$. Inspired by Siegel’s original result, equations
of type (1) with unknowns from a finitely generated multiplicative group are
called unit equations (in two unknowns), although the group need not be the
unit group of a ring. The proofs of all results mentioned above are based on
extensions of Thue’s method, which are ineffective in the sense that they do
not provide a method to determine the solutions of the equations considered
above.
In the 1960s, A. Baker developed a new method in transcendence theory,
giving non-trivial effective lower bounds for linear forms in logarithms of
algebraic numbers. This turned out to be a very powerful tool to prove effective
finiteness results for Diophantine equations, that enable one to determine all
solutions of the equation, at least in principle. With this method, and extensions
thereof, it became possible to give explicit upper bounds for the heights of the
solutions of Thue equations and Thue–Mahler equations, and also for the
heights of the solutions of equations (1) in units of the ring of integers of a
number field or more generally, in S-units, that is, elements of the number
field in whose prime ideal factorizations only prime ideals from a prescribed,
finite set S occur. Baker (1968b) obtained explicit upper bounds for the solutions
of Thue equations. His result was extended by Coates (1969) to Thue–Mahler
equations. For explicit upper bounds for the heights of the solutions of unit
equations and S-unit equations in two unknowns, see Győry (1972, 1973,
1974, 1979), and the many subsequent improvements discussed in Chapter 4.
The bounds enabled one to determine, at least in principle, all solutions. Since
the 1980s, practical algorithms have been developed, combining Baker’s theory
with the Lenstra–Lenstra–Lovász (LLL) lattice basis reduction algorithm and
enumeration techniques, which allow one to solve in practice concrete Thue
equations, Thue–Mahler equations and (S-) unit equations, see for instance de
Weger (1989), Wildanger (1997) and Smart (1998).
In the 1960s and early 1970s, Schmidt developed his higher dimensional generalization of the Thue–Siegel–Roth method, leading to his Subspace Theorem
in Schmidt (1972). Schlickewei (1977b) proved an extension of the Subspace
Theorem, involving both archimedean and non-archimedean absolute values.
Using this so-called p-adic Subspace Theorem, several authors obtained finiteness results for the number of solutions of unit equations in an arbitrary number
of unknowns, i.e., for linear equations
$$a_1 x_1 + \cdots + a_n x_n = 1 \quad \text{in } x_1, \ldots, x_n \in \Gamma, \tag{2}$$
where $a_1, \ldots, a_n$ are non-zero elements, and $\Gamma$ is a finitely generated multiplicative group in a field $K$ of characteristic 0, see Dubois and Rhin (1976),
Schlickewei (1977a), Evertse (1984b), Evertse and Győry (1988b) and van der
Poorten and Schlickewei (1982, 1991). We mention that the p-adic Subspace
Theorem is ineffective, and so its consequences for equation (2) are ineffective.
It is still open to solve unit equations of the form (2) in more than two unknowns
effectively.
In part I of the book, consisting of the first three chapters, we have collected
some basic tools. Chapter 1 gives a collection of the results from elementary
algebraic number theory that we need throughout the book. In Chapter 2 we
recall some basic facts about algebraic function fields. These are used in Chapters 7 and 8. In Chapter 3 we have stated without proof some fundamental
results from Diophantine approximation and transcendence theory. We have
included some versions of the Subspace Theorem, due to Schmidt, Schlickewei and Evertse, and estimates of Matveev (2000) and Yu (2007) concerning
linear forms in logarithms, which are used in Chapters 4, 5 and 6.
Part II, consisting of the other chapters, is the main body of our book.
Chapter 4 provides a survey of effective results concerning unit equations in
two unknowns over number fields. We derive among others the best effective
upper bounds to date, established in Győry and Yu (2006), for the solutions of
equation (1) in S-units of a number field. For applications, we give the bounds
in completely explicit form. The main tools in the proofs are the results on
linear forms in logarithms mentioned above.
In Chapter 5 we address the problem of practically solving concrete equations of the form (1) in units and S-units. Here, we combine estimates for
linear forms in logarithms as mentioned in Chapter 3 with the LLL lattice basis
reduction algorithm and an enumeration process.
In Chapter 6, we give an overview of the ineffective theory of unit equations
in several unknowns. Among other things, we sketch a proof of the theorem
of Evertse, Schlickewei and Schmidt (2002), giving an explicit upper bound
for the number of those solutions of (2) for which the left side in (2) has no
vanishing subsum. The bound depends only on the number n of unknowns
and the rank of $\Gamma$. We also include a proof of the theorem of Beukers and
Schlickewei (1996) which gives a similar, but sharper, result for equations in
two unknowns. Further, we discuss some results giving lower bounds for the
number of solutions of unit equations.
In Chapter 7, we deal with analogues over function fields of characteristic 0
of some of the effective and ineffective results discussed in Chapters 4 and 6. In
particular, we present the Stothers–Mason abc-theorem due to Stothers (1981)
and Mason (1984) for algebraic functions, and a result of Evertse and Zannier
(2008) on the number of solutions of unit equations in two unknowns over
function fields, analogous to the result of Beukers and Schlickewei mentioned
above. Further, we give a brief overview of recent results on unit equations over
function fields of positive characteristic.
In Chapter 8, the effective results of Chapters 4 and 7 on S-unit equations in
two unknowns over number fields and over function fields are combined with
some effective specialization argument to prove a general effective finiteness
theorem, due to Evertse and Győry (2013), on the solutions of equation (1) in
units x1 , x2 of an arbitrary, effectively given finitely generated integral domain
A over Z.
Chapter 9 deals with applications of unit equations to decomposable form
equations, which are higher dimensional generalizations of Thue and Thue–
Mahler equations. It is proved that unit equations in an arbitrary number of
unknowns are in a certain sense equivalent to decomposable form equations,
and in particular unit equations in two unknowns are equivalent to Thue equations. Further, a complete description of the set of solutions of decomposable
form equations is presented. We give explicit upper bounds for the number
of solutions when this number is finite. The bounds do not depend on the
coefficients of the decomposable forms involved. We also discuss effective
results for some important classes of decomposable form equations, including
Thue equations, discriminant form equations, and certain norm form equations.
The presented results have many applications, especially to algebraic number
theory.
The results on unit equations have many further applications to other Diophantine problems. In Chapter 10 we have made a small selection. We give
among other things applications to prime factors of sums of integers, additive
unit representations in integral domains, dynamics of polynomial maps, arithmetic graphs, irreducible polynomials, equations and inequalities involving
discriminants and resultants, power integral bases in number fields, Diophantine geometry, exponential-polynomial equations, and transcendence theory.
As was mentioned in the Preface, a number of applications of the results
of the present book are given in our second book Discriminant Equations in
Diophantine Number Theory.
At the end of several chapters there are Notes in which some historical
remarks are made and further related results, generalizations and applications
are mentioned.
1 Basic algebraic number theory
We have collected some basic facts about algebraic number fields (finite field
extensions of Q), p-adic numbers, and related topics. For further details and
proofs, we refer to Lang (1970), chapters I–V, Neukirch (1992), Kapitel I–III
and Koblitz (1984).
In the present book, a ring is by default a commutative ring with unit element,
and an integral domain is a commutative ring with unit element and without
divisors of 0. Given a ring A, we denote by A+ its underlying additive group,
and by A∗ its unit group (multiplicative group of invertible elements).
The ring of integers of an algebraic number field K, that is the integral
closure of Z in K, is denoted by OK .
1.1 Characteristic polynomial, trace, norm, discriminant
For the moment, let $K$ be any field of characteristic 0. Choose an algebraic
closure $\overline{K}$ of $K$. For every $\alpha \in \overline{K}$, there is a unique, monic, irreducible polynomial $f_\alpha \in K[X]$, such that $f_\alpha(\alpha) = 0$, and $f_\alpha$ divides $g$ for every polynomial
$g \in K[X]$ with $g(\alpha) = 0$. We call $f_\alpha$ the monic minimal polynomial of $\alpha$.
Let $f \in K[X]$ be a non-zero polynomial. Then $f = a(X - \alpha_1) \cdots (X - \alpha_r)$
with $a \in K^*$, $\alpha_1, \ldots, \alpha_r \in \overline{K}$. We call $K(\alpha_1, \ldots, \alpha_r)$ the splitting field of $f$
over $K$.
Let $L$ be a finite extension of $K$ of degree $n$. Then there are precisely $n$
distinct $K$-isomorphic embeddings $L \to \overline{K}$, $\sigma_1, \ldots, \sigma_n$, say. The compositum
of the fields $\sigma_1(L), \ldots, \sigma_n(L)$ is called the normal closure of $L$ over $K$. We
define the characteristic polynomial of $\alpha \in L$ with respect to $L/K$ by
$$\chi_{L/K,\alpha} := \prod_{i=1}^{n} (X - \sigma_i(\alpha)).$$
In fact, we have $\chi_{L/K,\alpha} = f_\alpha^{[L:K(\alpha)]}$ and $\chi_{L/K,\alpha}$ is the characteristic polynomial
of the $K$-linear map $x \mapsto \alpha x$ from $L$ to $L$. So $\chi_{L/K,\alpha} \in K[X]$.
Downloaded from https:/www.cambridge.org/core. Cornell University Library, on 16 Jun 2017 at 04:27:23, subject to the Cambridge Core
terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.003
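We define the trace and norm of $\alpha \in L$ over $K$ by
$$\mathrm{Tr}_{L/K}(\alpha) := \sum_{i=1}^{n} \sigma_i(\alpha), \qquad N_{L/K}(\alpha) := \prod_{i=1}^{n} \sigma_i(\alpha).$$
These are, up to sign, coefficients of $\chi_{L/K,\alpha}$. So we have
$$\mathrm{Tr}_{L/K}(\alpha),\ N_{L/K}(\alpha) \in K \quad \text{for } \alpha \in L. \tag{1.1.1}$$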
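Notice that $\mathrm{Tr}_{L/K}$ is a $K$-linear map $L \to K$ and that $N_{L/K}$ is a multiplicative
map $L \to K$. Further, the trace and norm are transitive in towers: let $M \supset L \supset K$
be a tower of finite extension fields; then
$$\mathrm{Tr}_{M/K}(\alpha) = \mathrm{Tr}_{L/K}(\mathrm{Tr}_{M/L}(\alpha)), \qquad N_{M/K}(\alpha) = N_{L/K}(N_{M/L}(\alpha)) \quad \text{for } \alpha \in M.$$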
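Since the characteristic polynomial of $\alpha$ is that of the $K$-linear map $x \mapsto \alpha x$, the trace and norm can be read off as the trace and determinant of the matrix of this map. The Python sketch below does this in one hypothetical example, $K = \mathbb{Q}$ and $L = \mathbb{Q}(\theta)$ with $\theta^3 = 2$; the field, the basis $1, \theta, \theta^2$ and the function names are assumptions made for illustration.

```python
from fractions import Fraction

M_CUBE = 2  # theta^3 = M_CUBE; L = Q(theta) has Q-basis 1, theta, theta^2


def mul_matrix(c):
    """Matrix of x -> alpha*x in the basis 1, theta, theta^2,
    where alpha = c[0] + c[1]*theta + c[2]*theta^2.
    Column j holds the coordinates of alpha * theta^j."""
    c0, c1, c2 = (Fraction(v) for v in c)
    m = M_CUBE
    return [
        [c0, m * c2, m * c1],
        [c1, c0,     m * c2],
        [c2, c1,     c0],
    ]


def det3(mat):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = mat
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def trace(c):
    """Tr_{L/Q}(alpha) as the trace of the multiplication matrix."""
    mat = mul_matrix(c)
    return mat[0][0] + mat[1][1] + mat[2][2]


def norm(c):
    """N_{L/Q}(alpha) as the determinant of the multiplication matrix."""
    return det3(mul_matrix(c))


print(trace((1, 1, 0)), norm((1, 1, 0)))  # alpha = 1 + theta
print(trace((0, 1, 0)), norm((0, 1, 0)))  # alpha = theta
```

For $\alpha = \theta$ this gives trace 0 and norm 2, in agreement with the coefficients of $X^3 - 2$ up to sign; for $\alpha = 1 + \theta$ it gives trace 3 and norm 3.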
Let again $L$ be a finite extension of $K$ of degree $n$. Take a $K$-basis
$\{\omega_1, \ldots, \omega_n\}$ of $L$. Then the discriminant of this basis is given by
$$D_{L/K}(\omega_1, \ldots, \omega_n) := \det\bigl(\mathrm{Tr}_{L/K}(\omega_i \omega_j)\bigr)_{i,j=1,\ldots,n}.$$
By (1.1.1) we have $D_{L/K}(\omega_1, \ldots, \omega_n) \in K$. The discriminant can be expressed
otherwise as
$$D_{L/K}(\omega_1, \ldots, \omega_n) = \bigl(\det(\sigma_i(\omega_j))_{i,j=1,\ldots,n}\bigr)^2,$$
where $\sigma_1, \ldots, \sigma_n$ are the $K$-isomorphic embeddings of $L$ in $\overline{K}$. For instance,
if $L = K(\theta)$, then $\{1, \theta, \ldots, \theta^{n-1}\}$ is a $K$-basis of $L$ and by Vandermonde’s
identity,
$$D_{L/K}(1, \theta, \ldots, \theta^{n-1}) = \prod_{1 \leq i < j \leq n} (\sigma_i(\theta) - \sigma_j(\theta))^2 \neq 0. \tag{1.1.2}$$
Let $\{\theta_1, \ldots, \theta_n\}$, $\{\omega_1, \ldots, \omega_n\}$ be any two $K$-bases of $L$. Then
$$\omega_i = \sum_{j=1}^{n} a_{ij} \theta_j \quad \text{for } i = 1, \ldots, n$$
with $a_{ij} \in K$ and $\det(a_{ij}) \neq 0$. By a straightforward computation we have
$$D_{L/K}(\omega_1, \ldots, \omega_n) = \bigl(\det(a_{ij})_{i,j=1,\ldots,n}\bigr)^2 D_{L/K}(\theta_1, \ldots, \theta_n). \tag{1.1.3}$$
By applying this relation with $\{1, \theta, \ldots, \theta^{n-1}\}$ for $\{\theta_1, \ldots, \theta_n\}$, and using
(1.1.2), we deduce that if $\{\omega_1, \ldots, \omega_n\}$ is any $K$-basis of $L$, then
$$D_{L/K}(\omega_1, \ldots, \omega_n) \neq 0.$$
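The base-change relation (1.1.3) can be checked numerically in a small case. The sketch below assumes $K = \mathbb{Q}$, $L = \mathbb{Q}(\sqrt{5})$ and the two bases $\{1, \sqrt{5}\}$ and $\{1, (1+\sqrt{5})/2\}$; the field, the bases and the function names are illustrative assumptions. An element $a + b\sqrt{5}$ is stored as the pair `(a, b)`, and the discriminant is computed from the trace form.

```python
from fractions import Fraction

D = 5  # L = Q(sqrt(D)) over K = Q


def mul(u, v):
    """Product of a + b*sqrt(D) and c + e*sqrt(D), stored as pairs."""
    (a, b), (c, e) = u, v
    return (a * c + D * b * e, a * e + b * c)


def trace(u):
    """Tr_{L/Q}(a + b*sqrt(D)) = 2a, the sum over the two embeddings."""
    return 2 * u[0]


def discriminant(basis):
    """det(Tr(w_i * w_j)) for a two-element Q-basis (w_1, w_2) of L."""
    w1, w2 = basis
    t11 = trace(mul(w1, w1))
    t12 = trace(mul(w1, w2))
    t22 = trace(mul(w2, w2))
    return t11 * t22 - t12 * t12


one = (Fraction(1), Fraction(0))
theta_basis = [one, (Fraction(0), Fraction(1))]        # {1, sqrt(5)}
omega_basis = [one, (Fraction(1, 2), Fraction(1, 2))]  # {1, (1 + sqrt(5))/2}

# Transition matrix (a_ij) with omega_i = sum_j a_ij * theta_j: rows (1, 0), (1/2, 1/2).
det_a = Fraction(1) * Fraction(1, 2) - Fraction(0) * Fraction(1, 2)

print(discriminant(theta_basis))                 # discriminant of {1, sqrt(5)}
print(discriminant(omega_basis))                 # discriminant of {1, (1 + sqrt(5))/2}
print(det_a**2 * discriminant(theta_basis))      # right-hand side of (1.1.3)
```

The first basis has discriminant 20, which also equals $(\sqrt{5} - (-\sqrt{5}))^2$ as in (1.1.2); the second has discriminant 5, matching $(1/2)^2 \cdot 20$.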
We give an application to linear algebra. Let again $K$ be a field of characteristic 0, and let $G$ be a Galois extension of $K$. For a vector $x = (x_1, \ldots, x_g) \in G^g$
and for $\sigma$ in the Galois group $\mathrm{Gal}(G/K)$ of $G$ over $K$, we define
$\sigma(x) := (\sigma(x_1), \ldots, \sigma(x_g))$.
Lemma 1.1.1 Let $g \geq 1$, and let $V$ be a $G$-linear subspace of $G^g$ such that
$$\sigma(x) \in V \quad \text{for } x \in V,\ \sigma \in \mathrm{Gal}(G/K).$$
Then $V$ has a basis consisting of vectors from $K^g$.
Proof. Pick a non-zero vector $b \in V$. Let $L \subseteq G$ be the smallest Galois extension of $K$ containing the coefficients of $b$ and choose a $K$-basis $\{\omega_1, \ldots, \omega_n\}$ of
$L$. Let $\mathrm{Gal}(L/K) = \{\sigma_1, \ldots, \sigma_n\}$. We have $b = \sum_{j=1}^{n} \omega_j y_j$ with $y_j \in K^g$ for
$j = 1, \ldots, n$. Then also $\sigma_i(b) = \sum_{j=1}^{n} \sigma_i(\omega_j) y_j$ for $i = 1, \ldots, n$. The matrix
$(\sigma_i(\omega_j))_{i,j=1,\ldots,n}$ is invertible (the square of its determinant being the discriminant of $\omega_1, \ldots, \omega_n$), hence $y_1, \ldots, y_n$ are $L$-linear combinations of $\sigma_i(b)$
($i = 1, \ldots, n$). Now our assumption on $V$ implies that $y_1, \ldots, y_n \in V$. Since $b$ was arbitrary and is a linear combination of $y_1, \ldots, y_n$, it follows that $V$ is generated by vectors from $K^g$, hence it has a basis from $K^g$.
Now let K be an algebraic number field and L a finite extension of K. Then
for α ∈ L we have
α ∈ OL ⇐⇒ χL/K,α ∈ OK [X].
As a consequence,
TrL/K (α), NL/K (α) ∈ OK
for α ∈ OL ,
and
DL/K (ω1 , . . . , ωn ) ∈ OK
for every K-basis {ω1 , . . . , ωn } of L with ω1 , . . . , ωn ∈ OL .
1.2 Ideal theory for algebraic number fields
We start with some general notation. Let A be an integral domain with quotient
field K. For α ∈ K and a subset F of K, we define αF := {αx : x ∈ F}.
A fractional ideal of A is a subset a of K such that a ≠ {0} and there is
α ∈ A \ {0} such that αa is an ideal of A. In particular, for α ∈ K∗, the set αA
is a fractional ideal, which we denote by (α) when it is clear from the context
what the underlying domain A is. More generally, given a subset S ≠ {0} of
K such that there is α ∈ A \ {0} with αS ⊂ A, the set of all finite A-linear
combinations with elements from S is a fractional ideal of A, denoted by SA,
called the fractional ideal generated by S.
Let K be an algebraic number field. Recall that its ring of integers OK is a
Dedekind domain, that is, OK is integrally closed, every ideal of OK is finitely
generated, and every non-zero prime ideal of OK is a maximal ideal (see Lang
(1970), chapter 1, sections 2, 3). Henceforth, when we are dealing with prime
ideals of OK , we always exclude (0).
Let a, b be two fractional ideals of OK . We define their greatest common
divisor or sum, lowest common multiple and product by
gcd(a, b) = a + b := {α + β : α ∈ a, β ∈ b},
lcm(a, b) := a ∩ b,
ab := OK -module generated by all products αβ with α ∈ a and β ∈ b,
respectively. Further, the inverse of a fractional ideal a of OK is defined by
\[ \mathfrak{a}^{-1} := \{\alpha \in K : \alpha\mathfrak{a} \subseteq O_K\}. \]
The gcd, lcm and product of two fractional ideals of OK , and the inverse of a
fractional ideal of OK are again fractional ideals of OK .
We denote by P(OK ) the collection of non-zero prime ideals of OK . The
following result comprises the ideal theory for OK .
Theorem 1.2.1
(i) The fractional ideals of OK form an abelian group with product and
inverse as defined above, and with unit element OK = (1).
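(ii) Every fractional ideal a of OK can be decomposed uniquely as a product
of powers of prime ideals
\[ \mathfrak{a} = \prod_{\mathfrak{p} \in \mathcal{P}(O_K)} \mathfrak{p}^{\mathrm{ord}_{\mathfrak{p}}(\mathfrak{a})}, \]
where the exponents $\mathrm{ord}_{\mathfrak{p}}(\mathfrak{a})$ are rational integers, at most finitely many
of which are non-zero.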
(iii) A fractional ideal a of OK is contained in OK if and only if ordp (a) ≥ 0
for every p ∈ P(OK ).
Proof. See Lang (1970), chapter 1, section 6.
The group of fractional ideals of OK is denoted by I (OK ).
The following consequences are obvious.
Corollary 1.2.2 Let a, b be two fractional ideals of OK . Then
a ⊆ b ⇐⇒ ordp (a) ≥ ordp (b) for every p ∈ P(OK ).
Further, we have for every p ∈ P(OK ),
ordp (a · b) = ordp (a) + ordp (b),
ordp (a + b) = min(ordp (a), ordp (b)),
ordp (a ∩ b) = max(ordp (a), ordp (b)).
For p ∈ P(OK ) we define
ordp (x) := ordp ((x))
if x ∈ K ∗ , ordp (0) := ∞.
Corollary 1.2.2 implies that for every p ∈ P(OK ), ordp defines a discrete valuation on K, i.e., ordp is a surjective map from K to Z ∪ {∞} such that for
x, y ∈ K we have
ordp (xy) = ordp (x) + ordp (y);
ordp (x + y) ≥ min(ordp (x), ordp (y)),
ordp (x) = ∞ ⇐⇒ x = 0.
The next corollary, whose proof is straightforward, gives some other consequences.
Corollary 1.2.3
(i) Let a be a fractional ideal of OK . Then
x ∈ a ⇐⇒ ordp (x) ≥ ordp (a) for all p ∈ P(OK ).
In particular,
x ∈ OK ⇐⇒ ordp (x) ≥ 0 for all p ∈ P(OK ).
(ii) Let a be the fractional ideal of OK generated by a set S. Then
ordp (a) = min{ordp (α) : α ∈ S}
for p ∈ P(OK ).
1.3 Extension of ideals; norm of ideals
Let K be an algebraic number field and L a finite extension of K of degree n.
Every fractional ideal a of OK can be extended to a fractional ideal of OL ,
aOL := {αy : α ∈ a, y ∈ OL },
and the map a → aOL gives an injective group homomorphism from the group
of fractional ideals of OK to the group of fractional ideals of OL . The extension
of a prime ideal p of OK can be decomposed in a unique way as a product of
powers of prime ideals of OL, that is,
\[ \mathfrak{p}O_L = \prod_{i=1}^{g} \mathfrak{P}_i^{e_i}, \]
where P1 , . . . , Pg are distinct prime ideals of OL and e1 , . . . , eg are positive
integers. We call P1 , . . . , Pg the prime ideals of OL lying above p. The exponent ei , henceforth denoted by e(Pi |p), is called the ramification index of Pi
over p. The residue class ring OL /Pi is a finite field extension of OK /p. The
degree [OL /Pi : OK /p] of this extension, called the residue class degree of
Pi over p, is denoted by f (Pi |p). The next proposition gives some properties
of ramification indices and residue class degrees.
Proposition 1.3.1 Let $L$, $\mathfrak{p}$, $\mathfrak{P}_1, \ldots, \mathfrak{P}_g$ be as above.
(i) We have $\sum_{i=1}^{g} e(\mathfrak{P}_i|\mathfrak{p}) f(\mathfrak{P}_i|\mathfrak{p}) = [L : K]$.
(ii) Assume that L/K is a Galois extension. Then for any two prime ideals
Pi , Pj ∈ {P1 , . . . , Pg } there is σ ∈ Gal(L/K) such that Pj = σ Pi .
Further, e(P1 |p) = · · · = e(Pg |p) and f (P1 |p) = · · · = f (Pg |p).
Proof. See Lang (1970), chapter 1, section 7, proposition 21, corollary 2.
Proposition 1.3.2 (transitivity in towers) Let M ⊃ L ⊃ K be a tower of finite
field extensions. Further, let P be a prime ideal of OL in the prime ideal
factorization of pOL and Q a prime ideal in the prime ideal factorization of
POM . Then
e(Q|p) = e(Q|P) · e(P|p),
f (Q|p) = f (Q|P) · f (P|p).
Proof. See Lang (1970), chapter 1, section 7, proposition 20.
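Let again K be an algebraic number field and L a finite extension of K. We
define the norm over K of a prime ideal $\mathfrak{P}$ of $O_L$ by $N_{L/K}(\mathfrak{P}) := \mathfrak{p}^{f(\mathfrak{P}|\mathfrak{p})}$, where
$\mathfrak{p}$ is the prime ideal of $O_K$ such that $\mathfrak{P}$ occurs in the prime ideal factorization
of $\mathfrak{p}O_L$. Then the norm $N_{L/K}(\mathfrak{A})$ of an arbitrary fractional ideal $\mathfrak{A}$ of $O_L$ is
defined by multiplicativity, i.e.,
\[ N_{L/K}(\mathfrak{A}) := \prod_{\mathfrak{p} \in \mathcal{P}(O_K)} \mathfrak{p}^{\sum_{\mathfrak{P}|\mathfrak{p}} f(\mathfrak{P}|\mathfrak{p}) \cdot \mathrm{ord}_{\mathfrak{P}}(\mathfrak{A})}, \tag{1.3.1} \]
where the sum is over all prime ideals of $O_L$ lying above $\mathfrak{p}$. Thus, $N_{L/K}$ defines
a homomorphism from the group of fractional ideals of $O_L$ to the group of
fractional ideals of $O_K$.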
Below, we give some properties of the norm.
Proposition 1.3.3 Let L be a finite extension of K.
(i) Let A be a fractional ideal of OL . Then NL/K (A) is equal to the fractional
ideal generated by the numbers NL/K (α), α ∈ A.
(ii) For every α ∈ L∗ we have NL/K (αOL ) = NL/K (α)OK .
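(iii) Let $\mathfrak{p}$ be a prime ideal of $O_K$, and $\mathfrak{P}_1, \ldots, \mathfrak{P}_g$ the prime ideals of $O_L$
dividing $\mathfrak{p}$. Then for every $\alpha \in O_L$,
\[ \mathrm{ord}_{\mathfrak{p}}(N_{L/K}(\alpha)) = \sum_{i=1}^{g} f(\mathfrak{P}_i|\mathfrak{p})\,\mathrm{ord}_{\mathfrak{P}_i}(\alpha). \]
(iv) For every fractional ideal $\mathfrak{a}$ of $O_K$ we have $N_{L/K}(\mathfrak{a}O_L) = \mathfrak{a}^{[L:K]}$.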
(v) Let M be a finite extension of L. Then for every fractional ideal C of OM ,
NM/K (C) = NL/K (NM/L (C)).
Proof. For (i), (iv), (v) see Neukirch (1992), Kapitel III, Satz 1.6. Part (ii) is a
consequence of (i), and part (iii) a consequence of (ii) and (1.3.1).
Let K be an algebraic number field. The norm NK/Q (a) of a fractional ideal
a of OK is a fractional ideal of Z. Hence there is a positive rational number
a such that NK/Q (a) = (a). This number a is called the absolute norm of a,
notation NK (a) (often written as N (a) if it is clear from the context which is the
underlying number field). It is obvious that the absolute norm is multiplicative.
From parts (ii) and (iv) of Proposition 1.3.3, we obtain at once:
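\[ N_K((\alpha)) = |N_{K/\mathbb{Q}}(\alpha)| \ \text{ for } \alpha \in K^*, \qquad N_K((a)) = |a|^{[K:\mathbb{Q}]} \ \text{ for } a \in \mathbb{Q}^*. \]
If $\mathfrak{p}$ is a prime ideal of $O_K$ dividing a prime number p, we have $N_K(\mathfrak{p}) = p^{f(\mathfrak{p}|p)} = |O_K/\mathfrak{p}|$. More generally, for any non-zero ideal $\mathfrak{a}$ of $O_K$ we have
$N_K(\mathfrak{a}) = |O_K/\mathfrak{a}|$.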
1.4 Discriminant, class number, unit group and regulator
Let K be an algebraic number field of degree d over Q. There are d distinct
isomorphic embeddings of K in C, which we denote by σ1 , . . . , σd ; further, we
will write α (i) := σi (α) for α ∈ K. We assume that among these embeddings
there are precisely r1 real embeddings, i.e., embeddings σ with σ (K) ⊂ R,
and $r_2$ pairs of complex conjugate embeddings, i.e., pairs $\{\sigma, \overline{\sigma}\}$ where $\overline{\sigma}(\alpha) = \overline{\sigma(\alpha)}$
for α ∈ K. Thus, $d = r_1 + 2r_2$ and after reordering the embeddings we
may assume that σi (i = 1, . . . , r1 ) are the real embeddings and {σi , σi+r2 }
(i = r1 + 1, . . . r1 + r2 ) the pairs of complex conjugate embeddings.
Viewed as a Z-module, $O_K$ is free of rank d. Taking any Z-basis
$\{\omega_1, \ldots, \omega_d\}$ of $O_K$, we define the discriminant of K by
\[ D_K := D_{K/\mathbb{Q}}(\omega_1, \ldots, \omega_d) = \Bigl(\det\bigl(\omega_j^{(i)}\bigr)_{i,j=1,\ldots,d}\Bigr)^2. \]
This is a non-zero rational integer which, by (1.1.3), is independent of the
choice of the basis.
Denote as before by I (OK ) the group of fractional ideals of OK . Further,
denote by P (OK ) the subgroup of principal fractional ideals of OK . The quotient group Cl(OK ) = I (OK )/P (OK ) is called the class group of K.
Theorem 1.4.1 The class group Cl(OK ) of OK is finite.
Proof. See Neukirch (1992), Kapitel I, Satz 6.3.
The cardinality of the class group is called the class number of K, and we
denote this by hK .
We denote by WK the multiplicative group consisting of all roots of unity
in K. This is a finite, cyclic subgroup of K ∗ . We denote the number of roots of
unity of K by ωK .
We recall the following fundamental theorem of Dirichlet concerning the
unit group OK∗ of OK . Elements of OK∗ will usually be referred to as units of
K. Recall that if V is an n-dimensional vector space over R, then a full lattice
in V is an additive subgroup
{z1 a1 + · · · + zn an : z1 , . . . , zn ∈ Z},
where {a1 , . . . , an } is a basis of V .
Theorem 1.4.2 The map
\[ \mathrm{LOG}_K : \varepsilon \mapsto \bigl(e_1 \log|\varepsilon^{(1)}|, \ldots, e_{r_1+r_2} \log|\varepsilon^{(r_1+r_2)}|\bigr) \]
(where $e_j = 1$ for $j = 1, \ldots, r_1$ and $e_j = 2$ for $j = r_1 + 1, \ldots, r_1 + r_2$) defines
a surjective homomorphism from $O_K^*$ to a full lattice in the real vector space
given by
\[ \{x = (x_1, \ldots, x_{r_1+r_2}) \in \mathbb{R}^{r_1+r_2} : x_1 + \cdots + x_{r_1+r_2} = 0\} \]
with kernel $W_K$.
Proof. See Neukirch (1992), Kapitel I, Satz 7.1.
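As a small numerical illustration of Theorem 1.4.2, the following sketch evaluates the logarithmic map for the real quadratic field Q(√2), where r1 = 2 and r2 = 0. The choice of field, the unit 1 + √2, and the helper names `log_map` and `multiply` are assumptions made for this example only; they do not come from the text.

```python
import math

SQRT2 = math.sqrt(2.0)

def log_map(a, b):
    """LOG_K(a + b*sqrt(2)) for K = Q(sqrt(2)); here r1 = 2, r2 = 0, so e_1 = e_2 = 1."""
    # The two real embeddings send sqrt(2) to +sqrt(2) and to -sqrt(2).
    return (math.log(abs(a + b * SQRT2)), math.log(abs(a - b * SQRT2)))

def multiply(u, v):
    """Multiply a + b*sqrt(2) by c + d*sqrt(2); elements are integer pairs (a, b)."""
    (a, b), (c, d) = u, v
    return (a * c + 2 * b * d, a * d + b * c)

eps = (1, 1)                        # 1 + sqrt(2); its norm is 1 - 2 = -1, so it is a unit
eps_squared = multiply(eps, eps)    # 3 + 2*sqrt(2)

image = log_map(*eps)
print(image, sum(image))            # the coordinates sum to 0 up to rounding
print(log_map(*eps_squared))        # equals 2 * image up to rounding
print(log_map(-1, 0))               # the root of unity -1 lies in the kernel
```

The three printed lines correspond to the three assertions of the theorem that can be checked numerically here: the image lies in the hyperplane of coordinate sum zero, the map is additive on products, and roots of unity are sent to the zero vector.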
The following consequence is immediate.
Corollary 1.4.3 Put $r = r_K := r_1 + r_2 - 1$. Then
\[ O_K^* \cong W_K \times \mathbb{Z}^r. \]
More explicitly, there are $\varepsilon_1, \ldots, \varepsilon_r \in O_K^*$ such that every $\varepsilon \in O_K^*$ can be
expressed uniquely as
\[ \varepsilon = \zeta \varepsilon_1^{b_1} \cdots \varepsilon_r^{b_r} \]
where ζ is a root of unity in K and $b_1, \ldots, b_r$ are rational integers.
The number $r_K$ (denoted by r if it is clear to which number field it refers)
is called the unit rank of K. A set of units $\{\varepsilon_1, \ldots, \varepsilon_r\}$ as above is called a
fundamental system of units for K. Given such a system, we define the regulator
of K by
\[ R_K := \Bigl|\det\bigl(e_j \log|\varepsilon_i^{(j)}|\bigr)_{i,j=1,\ldots,r}\Bigr|. \]
This regulator is non-zero, and independent of the choice of $\varepsilon_1, \ldots, \varepsilon_r$.
1.5 Explicit estimates
We have collected from the literature some estimates for the field parameters
defined above. As before, K is an algebraic number field of degree d.
For the number $\omega_K$ of roots of unity of K we have the estimate
\[ \omega_K \le 20\, d \log\log d \quad \text{if } d \ge 3. \tag{1.5.1} \]
This follows from the observation that the degree of the maximal cyclotomic
subfield of K, which is ϕ(ωK ) where ϕ is Euler’s totient function, divides d,
and from Rosser and Schoenfeld (1962), Theorem 15, which gives an explicit
lower bound for ϕ(n) of the order n/ log log n.
For the class number and regulator of K we have
\[ h_K R_K \le |D_K|^{1/2} (\log^* |D_K|)^{d-1}. \tag{1.5.2} \]
The first inequality of this type was proved by Landau (1918). The above
version follows from Louboutin (2000) and (1.5.1); see (59) in Győry and Yu
(2006). The following lower bound for the regulator is due to Friedman (1989):
\[ R_K > 0.2052. \tag{1.5.3} \]
We recall an important lower estimate for discriminants. By an inequality due
to Minkowski (see, e.g., Lang (1970), chapter V, section 4, proof of corollary
of theorem 4) we have
\[ |D_K| > \left(\frac{\pi}{4}\right)^{d} \left(\frac{d^d}{d!}\right)^{2}. \tag{1.5.4} \]
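To get a feeling for the size of the right-hand side of (1.5.4), the following sketch evaluates it for small degrees and compares it with three discriminants. The sample fields (Q, Q(√−3) and the cubic field generated by a root of X^3 − X − 1) and their discriminants are standard values supplied here for illustration; they are not taken from the text, and the function name `minkowski_lower_bound` is hypothetical.

```python
import math

def minkowski_lower_bound(d):
    """Right-hand side of (1.5.4): (pi/4)^d * (d^d / d!)^2."""
    return (math.pi / 4) ** d * (d ** d / math.factorial(d)) ** 2

# Illustrative data (an assumption of this example): degree -> discriminant
# for Q, Q(sqrt(-3)), and the cubic field defined by X^3 - X - 1.
known = {1: 1, 2: -3, 3: -23}

for d, disc in known.items():
    bound = minkowski_lower_bound(d)
    # Each sample discriminant should exceed the bound in absolute value.
    print(d, round(bound, 4), abs(disc) > bound)
```

Since the bound grows exponentially in d, it implies in particular that |D_K| > 1 for every number field K other than Q.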
Further, we need the following lemma.
Lemma 1.5.1 Let g ∈ Z[X] be a monic polynomial of degree m with non-zero
discriminant. Assume that the coefficients of g have absolute values at most M.
Let K = Q(θ), where θ is a zero of g. Then
\[ |D_K| \le m^{2m-1} M^{2m-2}. \]
Proof. The monic minimal polynomial, say f , of θ is in Z[X] and it divides
g in Z[X]. Suppose K has degree d. Using the expression of the discriminant
of a monic polynomial as the product of the squares of the differences of its
zeros, one easily shows that the discriminant D(f ) of f divides D(g) in the
ring of algebraic integers and so also in Z. Further, by (1.1.2), we have $D(f) = D_{K/\mathbb{Q}}(1, \theta, \ldots, \theta^{d-1})$. Writing $1, \theta, \ldots, \theta^{d-1}$ as Z-linear combinations of a Z-basis of $O_K$, and using (1.1.3), we infer that $D_K$ divides D(f). Therefore, $D_K$
divides D(g). Using for instance an estimate from Lewis and Mahler (1961)
(bottom of p. 335), which uses a determinantal expression for D(g), one obtains
\[ |D(g)| \le m^{2m-1} M^{2m-2}. \]
This proves our lemma.
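The bound on |D(g)| used in the proof can be checked by hand in small cases. The sketch below restricts to monic cubics of the special shape X^3 + pX + q, whose discriminant is given by the classical formula −4p^3 − 27q^2; this restriction, the sample polynomials and the function names are assumptions of the example, not part of the text.

```python
def cubic_discriminant(p, q):
    """Discriminant of X^3 + p*X + q (classical closed formula)."""
    return -4 * p ** 3 - 27 * q ** 2

def lemma_bound(m, M):
    """Right-hand side of Lemma 1.5.1: m^(2m-1) * M^(2m-2)."""
    return m ** (2 * m - 1) * M ** (2 * m - 2)

for p, q in [(-1, -1), (0, -2)]:      # X^3 - X - 1 and X^3 - 2
    M = max(1, abs(p), abs(q))        # largest absolute value of a coefficient
    disc = cubic_discriminant(p, q)
    print((p, q), disc, lemma_bound(3, M), abs(disc) <= lemma_bound(3, M))
```

For X^3 − X − 1 the discriminant −23 is squarefree, so in that case D(g) and the field discriminant D_K coincide, and the lemma reads 23 ≤ 243.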
Remark There is an analogue for this lemma where for g one can take any non-zero polynomial in Z[X], not necessarily monic or of non-zero discriminant.
We will not work this out.
1.6 Absolute values: generalities
We have collected some facts on absolute values. Our basic reference is
Neukirch (1992), Kapitel II.
Let K be an infinite field. An absolute value on K is a function | · | : K →
R≥0 satisfying the following conditions:
|xy| = |x| · |y| for all x, y ∈ K;
there is C ≥ 1 such that |x + y| ≤ C max(|x|, |y|) for all x, y ∈ K;
|x| = 0 ⇐⇒ x = 0.
These conditions imply that |1| = 1. An absolute value | · | on K is called
trivial if |x| = 1 for x ∈ K ∗ . Henceforth, all absolute values we will consider
are non-trivial. Two absolute values $|\cdot|_1$, $|\cdot|_2$ on K are called equivalent if
there is c > 0 such that $|x|_2 = |x|_1^c$ for all x ∈ K.
An absolute value | · | on K is called non-archimedean if it satisfies the
ultrametric inequality
|x + y| ≤ max(|x|, |y|)
for x, y ∈ K
and archimedean if it does not satisfy this inequality.
A valuation on K is a function $v : K \to \mathbb{R} \cup \{\infty\}$ such that $C^{-v}$ defines a
non-archimedean absolute value on K, where C is any constant > 1. Equivalently,
\[ v(0) = \infty, \qquad v(x) \in \mathbb{R} \ \text{ for } x \in K^*, \]
\[ v(xy) = v(x) + v(y), \qquad v(x + y) \ge \min(v(x), v(y)) \ \text{ for } x, y \in K. \]
Notice that if v is a valuation on K, then
v(K ∗ ) = {v(x) : x ∈ K ∗ }
is an additive subgroup of R, called the value group of v. In this book, we agree
that a discrete valuation on K is a valuation v on K for which v(K ∗ ) = Z.
(In much of the literature, a discrete valuation on K is a valuation v for which
v(K ∗ ) is a non-trivial discrete subgroup of R; then a valuation v for which
v(K ∗ ) = Z is called a normalized discrete valuation).
A field with absolute value is a pair $(K, |\cdot|)$, where K is an infinite
field, and $|\cdot|$ a non-trivial absolute value on K. An injective homomorphism/isomorphism of fields with absolute value $\varphi : (K_1, |\cdot|_1) \to (K_2, |\cdot|_2)$
is an injective homomorphism/isomorphism $\varphi : K_1 \to K_2$ such that $|x|_1 = |\varphi(x)|_2$ for $x \in K_1$.
Let $(K, |\cdot|)$ be a field with absolute value. A sequence $\{a_n\} = \{a_n\}_{n=0}^{\infty}$ in
K is called a convergent sequence of $(K, |\cdot|)$, if there is α ∈ K such that
$\lim_{n\to\infty} |a_n - \alpha| = 0$. A Cauchy sequence of $(K, |\cdot|)$ is a sequence $\{a_n\}$ in
K with $\lim_{m,n\to\infty} |a_m - a_n| = 0$. We call $(K, |\cdot|)$ complete if every Cauchy
sequence of $(K, |\cdot|)$ converges.
Suppose that $(K, |\cdot|)$ is a non-complete field with absolute value. Then we
can extend this to a complete field with absolute value $(\widehat{K}, |\cdot|)$, the completion
of $(K, |\cdot|)$, as follows. Let R be the ring of Cauchy sequences of $(K, |\cdot|)$ with
componentwise addition and multiplication, and M the ideal of sequences of
$(K, |\cdot|)$ converging to 0. Then M is a maximal ideal of R and thus, R/M is a
field which will be our $\widehat{K}$. We view K as a subfield of $\widehat{K}$ by identifying α ∈ K
with the element of $\widehat{K}$ represented by the constant sequence {α}. We extend
$|\cdot|$ to $\widehat{K}$ by setting $|\alpha| := \lim_{n\to\infty} |a_n|$ for $\alpha \in \widehat{K}$, where $\{a_n\}$ is any Cauchy
sequence of $(K, |\cdot|)$ representing α. The field $\widehat{K}$ is the smallest complete field
containing K, in the sense that if there exists an injective homomorphism of
$(K, |\cdot|)$ into a complete field with absolute value $(L, |\cdot|')$, say, then this can
be extended in precisely one way to an injective homomorphism from $(\widehat{K}, |\cdot|)$
into $(L, |\cdot|')$.
It is easy to see that notions such as convergence, Cauchy sequence, completeness, completion, depend on the equivalence class of an absolute value
rather than the absolute value itself.
If K is a field with valuation v, then notions such as convergence, Cauchy
sequence, completeness with respect to v are meant to be the corresponding
notions with respect to the absolute value $C^{-v}$, where C > 1 is any constant.
Example 1: Absolute values on Q. Define MQ := {∞} ∪ {prime numbers}.
This is called the set of places of Q. We call ∞ the infinite place, and the prime
numbers the finite places of Q. We define absolute values $|\cdot|_p$ ($p \in M_{\mathbb{Q}}$) by
\[ |a|_\infty := \max(a, -a) \ \text{ for } a \in \mathbb{Q}, \qquad |a|_p := p^{-\mathrm{ord}_p(a)} \ \text{ for } a \in \mathbb{Q} \]
for every prime number p, where $\mathrm{ord}_p(a)$ is the exponent of p in the unique
prime factorization of a, i.e., if $a = p^m b/c$ with m, b, c ∈ Z and $p \nmid bc$, then
$\mathrm{ord}_p(a) = m$. We agree that $\mathrm{ord}_p(0) = \infty$ and $|0|_p = 0$.
The absolute value $|\cdot|_\infty$ is archimedean, while the other ones are non-archimedean. The completion of Q with respect to $|\cdot|_\infty$ is $\mathbb{Q}_\infty := \mathbb{R}$. For
a prime number p, the completion of Q with respect to $|\cdot|_p$ is the field
of p-adic numbers, denoted by $\mathbb{Q}_p$. The above absolute values satisfy the
Product Formula
\[ \prod_{p \in M_{\mathbb{Q}}} |a|_p = 1 \quad \text{for } a \in \mathbb{Q}^*. \]
By a theorem of Ostrowski (see Neukirch (1992), Kapitel II, Satz 3.7), every
non-trivial absolute value on Q is equivalent to one of | · |p (p ∈ MQ ).
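The definitions of Example 1 are easy to experiment with using exact rational arithmetic. The following sketch computes ord_p and the p-adic absolute value of a rational number and checks the Product Formula; the function names are hypothetical and the trial-division factorization is only meant for small inputs.

```python
from fractions import Fraction

def ord_p(a, p):
    """Exponent of the prime p in the factorization of the non-zero rational a."""
    a = Fraction(a)
    if a == 0:
        raise ValueError("ord_p(0) is infinite")
    num, den, m = a.numerator, a.denominator, 0
    while num % p == 0:
        num //= p
        m += 1
    while den % p == 0:
        den //= p
        m -= 1
    return m

def abs_p(a, p):
    """p-adic absolute value |a|_p = p^(-ord_p(a)), with |0|_p = 0."""
    a = Fraction(a)
    return Fraction(0) if a == 0 else Fraction(p) ** (-ord_p(a, p))

def prime_factors(n):
    """Set of primes dividing the non-zero integer n (trial division)."""
    n, primes, p = abs(n), set(), 2
    while p * p <= n:
        while n % p == 0:
            primes.add(p)
            n //= p
        p += 1
    if n > 1:
        primes.add(n)
    return primes

def product_over_all_places(a):
    """Product of |a|_v over v in M_Q; only finitely many factors differ from 1."""
    a = Fraction(a)
    result = abs(a)                                   # the infinite place
    for p in prime_factors(a.numerator) | prime_factors(a.denominator):
        result *= abs_p(a, p)
    return result

a = Fraction(-12, 25)
print(ord_p(a, 2), ord_p(a, 3), ord_p(a, 5))          # exponents of 2, 3, 5
print(abs_p(a, 5), product_over_all_places(a))        # |a|_5 and the full product
```

For a = −12/25 the factors are 12/25 at the infinite place, 1/4 at 2, 1/3 at 3 and 25 at 5, whose product is 1.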
Example 2. By another theorem of Ostrowski (see Neukirch (1992), Kapitel II,
Satz 4.2), if (K, | · |) is a complete field with archimedean absolute value, then
up to isomorphism, K = R or C, and | · | is equivalent to the ordinary absolute
value.
We finish with recalling some facts about extensions of absolute values. Let
(K, | · |) be a field with absolute value, and L an extension field of K. By an
extension or continuation of | · | to L we mean an absolute value on L whose
restriction to K is | · |.
Proposition 1.6.1 Let $(K, |\cdot|)$ be a complete field with absolute value.
(i) Let L be a finite extension of K. Then there is precisely one extension of
$|\cdot|$ to L, which is given by $|N_{L/K}(\cdot)|^{1/[L:K]}$. The field L is complete with
respect to this extension.
(ii) Let $\overline{K}$ be an algebraic closure of K. Then $|\cdot|$ has a unique extension to
$\overline{K}$. If we denote this extension also by $|\cdot|$, we have $|\tau(x)| = |x|$ for $x \in \overline{K}$,
$\tau \in \mathrm{Gal}(\overline{K}/K)$.
Proof. See for instance Neukirch (1992), Kapitel II, Theorem 4.8.
1.7 Absolute values and places on number fields
Let K be an algebraic number field. We introduce a collection of normalized
absolute values {| · |v }v∈MK on K by taking suitable powers of the extensions
to K of the absolute values | · |p (p ∈ MQ ) defined in the previous subsection.
A real place of K is a set $\{\sigma\}$ where $\sigma : K \to \mathbb{R}$ is a real embedding of
K. A complex place of K is a pair $\{\sigma, \overline{\sigma}\}$ of conjugate complex embeddings
$K \to \mathbb{C}$. An infinite place is a real or complex place. Clearly, if $r_1$, $r_2$ denote
the number of real and complex places of K, we have $r_1 + 2r_2 = [K : \mathbb{Q}]$. A
finite place of K is a non-zero prime ideal of $O_K$. We denote by $M_K^\infty$, $M_K^0$ the
sets of infinite places and finite places, respectively, of K, and by $M_K$ the set
of all places of K, i.e., $M_K := M_K^\infty \cup M_K^0$.
With every place $v \in M_K$ we associate an absolute value $|\cdot|_v$ on K, which
is defined as follows for α ∈ K:
\[ |\alpha|_v := \begin{cases} |\sigma(\alpha)| & \text{if } v = \{\sigma\} \text{ is real;}\\ |\sigma(\alpha)|^2 = |\overline{\sigma}(\alpha)|^2 & \text{if } v = \{\sigma, \overline{\sigma}\} \text{ is complex;}\\ N_K(\mathfrak{p})^{-\mathrm{ord}_{\mathfrak{p}}(\alpha)} & \text{if } v = \mathfrak{p} \text{ is a prime ideal of } O_K, \end{cases} \]
where $N_K(\mathfrak{p}) = |O_K/\mathfrak{p}|$ is the absolute norm of $\mathfrak{p}$, and $\mathrm{ord}_{\mathfrak{p}}(\alpha)$ is the exponent
of $\mathfrak{p}$ in the prime ideal factorization of (α), where we agree that $\mathrm{ord}_{\mathfrak{p}}(0) = \infty$.
We denote by Kv the completion of K with respect to | · |v . Notice that Kv = R
if v is real, Kv = C if v is complex, while Kv is a finite extension of Qp if v = p
is a prime ideal of OK , and p is the prime number with p ∩ Z = (p). Combining
the Product Formula over Q with the identity $N_K((\alpha)) = |N_{K/\mathbb{Q}}(\alpha)|$ for $\alpha \in K^*$,
where the left-hand side denotes the absolute norm of (α), one easily deduces
the Product Formula over K,
\[ \prod_{v \in M_K} |\alpha|_v = 1 \quad \text{for } \alpha \in K^*. \tag{1.7.1} \]
To deal with archimedean and non-archimedean absolute values simultaneously, it is convenient to use
\[ |x_1 + \cdots + x_n|_v \le n^{s(v)} \max(|x_1|_v, \ldots, |x_n|_v) \tag{1.7.2} \]
for $v \in M_K$, $x_1, \ldots, x_n \in K$, where
s(v) = 1 if v is real, s(v) = 2 if v is complex, s(v) = 0 if v is finite.
Note that $\sum_{v \in M_K^\infty} s(v) = [K : \mathbb{Q}]$.
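Let $\rho : K_1 \to K_2$ be an isomorphism of algebraic number fields, and v a
place of $K_2$. We define a place $v \circ \rho$ on $K_1$ by
\[ v \circ \rho := \begin{cases} \{\sigma\rho\} & \text{if } v = \{\sigma\} \text{ is real,}\\ \{\tau\rho, \overline{\tau}\rho\} & \text{if } v = \{\tau, \overline{\tau}\} \text{ is complex,}\\ \rho^{-1}(\mathfrak{p}) & \text{if } v = \mathfrak{p} \text{ is a prime ideal of } O_{K_2}. \end{cases} \]
Then
\[ |\alpha|_{v \circ \rho} = |\rho(\alpha)|_v \quad \text{for } \alpha \in K_1,\ v \in M_{K_2}. \tag{1.7.3} \]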
Let L be a finite extension of K and v, V places of K, L, respectively. We
say that V lies above v or v below V , notation V |v, if the restriction of | · |V to
K is a power of | · |v . This is the case precisely if either both v, V are infinite
and the embeddings in v are the restrictions to K of the embeddings in V ;
or if v = p, V = P are prime ideals of OK , OL , respectively with P ⊃ p. In
that case, the completion LV of L with respect to | · |V is a finite extension of
Kv . In fact, [LV : Kv ] is 1 or 2 if v, V are infinite, while if v = p, V = P are
finite, we have [LV : Kv ] = e(P|p)f (P|p), where e(P|p), f (P|p) denote the
ramification index and residue class degree of P over p.
We say that two places V1 , V2 of L are conjugate over K if there is a
K-automorphism σ of L such that V2 = V1 ◦ σ .
Proposition 1.7.1 Let K be a number field, and L a finite extension of K.
Further, let v be a place of K and $V_1, \ldots, V_g$ the places of L above v. Then
\[ |\alpha|_{V_k} = |\alpha|_v^{[L_{V_k} : K_v]} \quad \text{for } \alpha \in K,\ k = 1, \ldots, g, \tag{1.7.4} \]
\[ \prod_{k=1}^{g} |\alpha|_{V_k} = |N_{L/K}(\alpha)|_v \quad \text{for } \alpha \in L, \tag{1.7.5} \]
\[ \sum_{k=1}^{g} [L_{V_k} : K_v] = [L : K]. \tag{1.7.6} \]
Further, if L/K is Galois, then $V_1, \ldots, V_g$ are conjugate to each other, and we
have $[L_{V_k} : K_v] = [L : K]/g$ for $k = 1, \ldots, g$.
Proof. The verification is completely straightforward if v is an infinite place.
So assume that v = p is a finite place; then Vk = Pk (k = 1, . . . , g) are the
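prime ideals of $O_L$ containing $\mathfrak{p}$. The first identity (1.7.4) follows from the
observation that for $\alpha \in K$, $\mathfrak{P} \in \{\mathfrak{P}_1, \ldots, \mathfrak{P}_g\}$,
\[ |\alpha|_{\mathfrak{P}} = N_L(\mathfrak{P})^{-\mathrm{ord}_{\mathfrak{P}}(\alpha)} = N_K(\mathfrak{p})^{-f(\mathfrak{P}|\mathfrak{p})\, e(\mathfrak{P}|\mathfrak{p})\, \mathrm{ord}_{\mathfrak{p}}(\alpha)} = |\alpha|_{\mathfrak{p}}^{[L_{\mathfrak{P}} : K_{\mathfrak{p}}]}. \]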
Identity (1.7.5) follows by expressing both sides of the identity as powers of
NK (p) and showing by means of Proposition 1.3.3 (iii) that the exponents are
equal. Identity (1.7.6) follows by combining (1.7.4) with (1.7.5) with α ∈ K ∗ .
The last assertion follows from Proposition 1.3.1 (ii).
1.8 S-integers, S-units and S-norm
Let S denote a finite subset of MK containing all infinite places. We say that
α ∈ K is an S-integer if |α|v ≤ 1 for all v ∈ MK \ S. The S-integers form a
ring in K, denoted by OS . Its unit group, denoted OS∗ , is called the group of
S-units.
For $S = M_K^\infty$ the ring of S-integers is just $O_K$ and the group of S-units just
$O_K^*$. Otherwise, we have $S = M_K^\infty \cup \{\mathfrak{p}_1, \ldots, \mathfrak{p}_t\}$, where $\mathfrak{p}_1, \ldots, \mathfrak{p}_t$ are prime
ideals of $O_K$. Then $O_S = O_K[(\mathfrak{p}_1 \cdots \mathfrak{p}_t)^{-1}]$, and $O_S^*$ consists of those elements
α of K such that (α) is composed of prime ideals from $\mathfrak{p}_1, \ldots, \mathfrak{p}_t$. In the case
$K = \mathbb{Q}$, $S = \{\infty, p_1, \ldots, p_t\}$ where $p_1, \ldots, p_t$ are prime numbers, we write
$\mathbb{Z}_S$ for the ring of S-integers. Thus, $\mathbb{Z}_S = \mathbb{Z}[(p_1 \cdots p_t)^{-1}]$.
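We define the S-norm of α ∈ K by
\[ N_S(\alpha) := \prod_{v \in S} |\alpha|_v. \]
Notice that the S-norm is multiplicative. Let again $S = M_K^\infty \cup \{\mathfrak{p}_1, \ldots, \mathfrak{p}_t\}$.
Take $\alpha \in K^*$ and write
\[ (\alpha) = \mathfrak{p}_1^{k_1} \cdots \mathfrak{p}_t^{k_t}\, \mathfrak{a}, \]
where $\mathfrak{a}$ is a fractional ideal of $O_K$ composed of prime ideals outside S. Then
by the Product Formula,
\[ N_S(\alpha) = \prod_{v \notin S} |\alpha|_v^{-1} = \prod_{\mathfrak{p} \in \mathcal{P}(O_K) \setminus \{\mathfrak{p}_1, \ldots, \mathfrak{p}_t\}} N_K(\mathfrak{p})^{\mathrm{ord}_{\mathfrak{p}}(\alpha)} = N_K(\mathfrak{a}). \]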
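In the case K = Q the S-norm can be computed directly from the definition. The sketch below, with hypothetical helper names, does this for S = {∞, p1, ..., pt} and also tests membership in the ring of S-integers; it is meant only as an illustration of the identity N_S(α) = N_K(a) in the simplest case.

```python
from fractions import Fraction

def strip_prime(n, p):
    """Return (k, m) with n = p**k * m and p not dividing m (n a non-zero integer)."""
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k, n

def s_norm(alpha, finite_primes):
    """N_S(alpha) for K = Q and S = {infinity} together with the given primes."""
    alpha = Fraction(alpha)
    if alpha == 0:
        raise ValueError("alpha must be non-zero")
    result = abs(alpha)                               # the infinite place
    for p in finite_primes:
        k_num, _ = strip_prime(alpha.numerator, p)
        k_den, _ = strip_prime(alpha.denominator, p)
        result *= Fraction(p) ** (-(k_num - k_den))   # |alpha|_p = p^(-ord_p(alpha))
    return result

def is_s_integer(alpha, finite_primes):
    """True if the denominator of alpha is composed of primes from S only."""
    den = Fraction(alpha).denominator
    for p in finite_primes:
        _, den = strip_prime(den, p)
    return den == 1

S = [2, 3]
print(s_norm(Fraction(45, 14), S))        # the part of 45/14 coprime to 2 and 3
print(s_norm(Fraction(5, 12), S), is_s_integer(Fraction(5, 12), S))
print(is_s_integer(Fraction(45, 14), S))
```

For α = 45/14 and S = {∞, 2, 3} we have (α) = (2)^{-1}(3)^2 · (5/7), and the computed S-norm 5/7 is indeed the absolute norm of the ideal (5/7).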
Let L be a finite extension of K and T the set of places of L lying above the
places of S. Then OT , the ring of T -integers of L is the integral closure of OS
in L. Further, by Proposition 1.7.1,
\[ N_T(\alpha) = N_S(\alpha)^{[L:K]} \quad \text{for } \alpha \in K^*. \]
We recall the extension of Dirichlet’s Unit Theorem to S-units.
Theorem 1.8.1 (S-unit Theorem) Let $S = \{v_1, \ldots, v_s\}$ be a finite set of places
of K, containing all infinite places. Then the map
\[ \mathrm{LOG}_S : \varepsilon \mapsto (\log|\varepsilon|_{v_1}, \ldots, \log|\varepsilon|_{v_s}) \]
defines a surjective homomorphism from $O_S^*$ to a full lattice in the real vector
space
\[ \{x = (x_1, \ldots, x_s) \in \mathbb{R}^s : x_1 + \cdots + x_s = 0\} \]
with kernel $W_K$.
Proof. See Lang (1970), chapter V, section 1, Unit Theorem.
This implies at once:
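Corollary 1.8.2 We have
\[ O_S^* \cong W_K \times \mathbb{Z}^{s-1}. \]
More explicitly, there are $\varepsilon_1, \ldots, \varepsilon_{s-1} \in O_S^*$ such that every $\varepsilon \in O_S^*$ can be
expressed uniquely as
\[ \varepsilon = \zeta \varepsilon_1^{b_1} \cdots \varepsilon_{s-1}^{b_{s-1}}, \tag{1.8.1} \]
where ζ is a root of unity in K and $b_1, \ldots, b_{s-1}$ are rational integers.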
A system {ε1 , . . . , εs−1 } as above is called a fundamental system of S-units.
Analogously as for units of $O_K$, we define the S-regulator by
\[ R_S := \Bigl|\det\bigl(\log|\varepsilon_i|_{v_j}\bigr)_{i,j=1,\ldots,s-1}\Bigr|. \tag{1.8.2} \]
This quantity is non-zero, and independent of the choice of $\varepsilon_1, \ldots, \varepsilon_{s-1}$ and of
the choice $v_1, \ldots, v_{s-1}$ from S. In the case that $S = M_K^\infty$, the S-regulator $R_S$
is equal to the regulator $R_K$. More generally, we have
\[ R_S = R_K \cdot [I(S) : P(S)] \cdot \prod_{i=1}^{t} \log N(\mathfrak{p}_i), \tag{1.8.3} \]
where $\mathfrak{p}_1, \ldots, \mathfrak{p}_t$ are the prime ideals in S, I(S) is the group of fractional ideals
of $O_K$ composed of prime ideals from $\mathfrak{p}_1, \ldots, \mathfrak{p}_t$ and P(S) is the group of
principal fractional ideals of $O_K$ composed of prime ideals from $\mathfrak{p}_1, \ldots, \mathfrak{p}_t$.
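We note that the index [I(S) : P(S)] is a divisor of the class number $h_K$. By
combining (1.8.3) with (1.5.2) we obtain
\[ R_S \le h_K R_K \cdot \prod_{i=1}^{t} \log N_K(\mathfrak{p}_i) \le |D_K|^{1/2} (\log^* |D_K|)^{d-1} \cdot \prod_{i=1}^{t} \log N_K(\mathfrak{p}_i). \tag{1.8.4} \]
By combining (1.8.3) with (1.5.3), we obtain
\[ R_S \ge \begin{cases} (\log 2)(\log 3)^{s-2} & \text{if } K = \mathbb{Q},\ s := |S| \ge 3,\\ 0.2052\,(\log 2)^{s-2} & \text{if } K \ne \mathbb{Q},\ s \ge 3. \end{cases} \tag{1.8.5} \]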
1.9 Heights
There are various ways to define the height of an algebraic number, a vector
with algebraic coordinates or a polynomial with algebraic coefficients. Here we
have made a small selection. The other notions of height needed in this book
will be defined on the spot. We fix an algebraic closure $\overline{\mathbb{Q}}$ of $\mathbb{Q}$.
1.9.1 Heights of algebraic numbers
The absolute multiplicative height of $\alpha \in \overline{\mathbb{Q}}$ is defined by
\[ H(\alpha) := \prod_{v \in M_K} \max(1, |\alpha|_v)^{1/[K:\mathbb{Q}]}, \]
where $K \subset \overline{\mathbb{Q}}$ is any number field containing α. It follows from Proposition
1.7.1 that this is independent of the choice of K. The absolute logarithmic
height of α is given by
h(α) := log H (α).
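For a rational number the height can be evaluated straight from the definition with K = Q. The following sketch (function names are hypothetical) multiplies the local factors max(1, |α|_v) over the infinite place and the primes dividing the denominator; for a fraction a/b in lowest terms the result agrees with max(|a|, |b|).

```python
import math
from fractions import Fraction

def prime_factors(n):
    """Set of primes dividing the non-zero integer n (trial division)."""
    n, primes, p = abs(n), set(), 2
    while p * p <= n:
        while n % p == 0:
            primes.add(p)
            n //= p
        p += 1
    if n > 1:
        primes.add(n)
    return primes

def multiplicative_height(alpha):
    """H(alpha) for rational alpha: product over all places v of max(1, |alpha|_v)."""
    alpha = Fraction(alpha)
    height = max(Fraction(1), abs(alpha))             # the infinite place
    # Only primes dividing the denominator give |alpha|_p > 1.
    for p in prime_factors(alpha.denominator):
        den, k = alpha.denominator, 0
        while den % p == 0:
            den //= p
            k += 1
        height *= Fraction(p) ** k                    # |alpha|_p = p^k at such a prime
    return height

def logarithmic_height(alpha):
    """h(alpha) = log H(alpha)."""
    return math.log(float(multiplicative_height(alpha)))

for alpha in (Fraction(-12, 25), Fraction(7, 3), Fraction(0)):
    print(alpha, multiplicative_height(alpha), logarithmic_height(alpha))
```

The same local factors show that the product of max(1, |α|_p) over the primes p alone is the denominator of α.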
Below, we have brought together some properties of the absolute logarithmic height. These can easily be reformulated into properties of the absolute
multiplicative height.
We start with a trivial but useful observation: if K is an algebraic number
field, S any non-empty subset of $M_K$, and $\alpha \in K$ with α ≠ 0, then
\[ -h(\alpha) \le \frac{1}{[K : \mathbb{Q}]} \sum_{v \in S} \log|\alpha|_v \le h(\alpha). \tag{1.9.1} \]
Indeed, the upper bound is obvious from $h(\alpha) = \frac{1}{[K:\mathbb{Q}]} \sum_{v \in M_K} \log\max(1, |\alpha|_v)$,
and the lower bound follows in the same manner, applying the Product Formula
$\sum_{v \in S} \log|\alpha|_v = -\sum_{v \notin S} \log|\alpha|_v$. In the case that S is a finite subset of $M_K$,
containing the infinite places, (1.9.1) translates into
\[ -h(\alpha) \le \frac{1}{[K : \mathbb{Q}]} \log N_S(\alpha) \le h(\alpha). \tag{1.9.2} \]
The next lemma gives an estimate for the denominator of an algebraic
number.
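Lemma 1.9.1 Let K be a number field and $\alpha \in K^*$. Then there is a positive
integer d such that $d \le H(\alpha)^{[K:\mathbb{Q}]}$ and $d\alpha \in O_K$.
Proof. We take $d := \prod_{v \in M_K^0} \max(1, |\alpha|_v)$. It is clear that $d \le H(\alpha)^{[K:\mathbb{Q}]}$. We
show that d is a positive integer and $d\alpha \in O_K$. First observe that
\[ d = \prod_{\mathfrak{p} \in \mathcal{P}(O_K)} N_K(\mathfrak{p})^{\max(0, -\mathrm{ord}_{\mathfrak{p}}(\alpha))} = \prod_{p} p^{\sum_{\mathfrak{p}|p} f(\mathfrak{p}|p) \max(0, -\mathrm{ord}_{\mathfrak{p}}(\alpha))} \in \mathbb{Z}_{>0}, \]
where the product is over the rational primes. Further, if $\mathfrak{p}$ is a prime ideal of
$O_K$ lying above the prime p, say, then
\[ \mathrm{ord}_{\mathfrak{p}}(d) \ge \mathrm{ord}_p(d) \ge -\mathrm{ord}_{\mathfrak{p}}(\alpha), \]
implying $\mathrm{ord}_{\mathfrak{p}}(d\alpha) \ge 0$. This holds for every prime ideal $\mathfrak{p}$ of $O_K$, hence
$d\alpha \in O_K$.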
In the following lemma we have listed some further properties.
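Lemma 1.9.2 Let $\alpha, \alpha_1, \ldots, \alpha_n \in \overline{\mathbb{Q}}$, $m \in \mathbb{Z}$ and let σ be an automorphism
of $\overline{\mathbb{Q}}$. Then
\[ h(\sigma(\alpha)) = h(\alpha); \]
\[ h(\alpha_1 \cdots \alpha_n) \le \sum_{i=1}^{n} h(\alpha_i); \]
\[ h(\alpha_1 + \cdots + \alpha_n) \le \log n + \sum_{i=1}^{n} h(\alpha_i); \]
\[ h(\alpha^m) = |m|\, h(\alpha) \quad \text{if } \alpha \neq 0. \]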
Proof. The first property is a consequence of (1.7.3), the third of (1.7.2), while
the other two are obvious. See also Waldschmidt (2000), chapter 3.
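The last three properties are easy to test numerically for rational numbers, where $h(a/b) = \log\max(|a|,|b|)$ for $a/b$ in lowest terms. The sketch below is only a sanity check on sample values, not a proof; the helper name `h` is chosen to mirror the notation of the lemma.

```python
from fractions import Fraction
from math import log, isclose

def h(alpha: Fraction) -> float:
    """Absolute logarithmic height of a rational a/b in lowest terms."""
    return log(max(abs(alpha.numerator), alpha.denominator))

alphas = [Fraction(3, 4), Fraction(-5, 6), Fraction(7, 2)]
total = sum(h(a) for a in alphas)

product = alphas[0] * alphas[1] * alphas[2]
assert h(product) <= total + 1e-12                        # height of a product
assert h(sum(alphas)) <= log(len(alphas)) + total + 1e-12  # height of a sum

for m in (-3, -1, 2, 5):                                   # h(alpha^m) = |m| h(alpha)
    assert isclose(h(alphas[0] ** m), abs(m) * h(alphas[0]))
```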
The minimal polynomial of $\alpha\in\overline{\mathbb{Q}}$ over $\mathbb{Z}$, denoted by $P_\alpha$, is by definition the polynomial $P\in\mathbb{Z}[X]$ of minimal degree, having positive leading coefficient and coefficients with greatest common divisor 1, such that $P(\alpha) = 0$. Writing
$P_\alpha = a_0(X-\alpha^{(1)})\cdots(X-\alpha^{(d)})$ where $d = \deg\alpha$ and $\alpha^{(1)},\dots,\alpha^{(d)}$ are the conjugates of $\alpha$ in $\mathbb{C}$, we have
\[
H(\alpha) = \Bigl(|a_0|\prod_{i=1}^d \max\bigl(1,|\alpha^{(i)}|\bigr)\Bigr)^{1/d},
\]
i.e., $H(\alpha)$ is the $d$-th root of the Mahler measure of $\alpha$ (see Waldschmidt (2000), Lemma 3.10). Further, writing $P_\alpha = a_0X^d+\cdots+a_d$, we have
\[
-\frac{1}{2d}\log(d+1) + h(\alpha) \le \frac{1}{d}\,h(P_\alpha) \le \log 2 + h(\alpha), \tag{1.9.3}
\]
where $h(P_\alpha) := \log\max(|a_0|,\dots,|a_d|)$ (see Waldschmidt (2000), Lemma 3.11).
From this we deduce at once:
Theorem 1.9.3 (Northcott’s Theorem) Let $D$, $H$ be positive reals. Then there are only finitely many $\alpha\in\overline{\mathbb{Q}}$ such that $\deg\alpha\le D$ and $h(\alpha)\le H$.
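In the simplest case $D = 1$ the finiteness is completely explicit: a rational $a/b$ in lowest terms has multiplicative height $\max(|a|,|b|)$, so the numbers of height at most $B$ can simply be listed. The following Python sketch does this; the function name and the integer bound `B` (playing the role of $e^H$) are illustrative choices.

```python
from fractions import Fraction
from math import gcd

def rationals_of_height_at_most(B: int) -> list:
    """All rationals a/b in lowest terms with max(|a|, b) <= B, in increasing order."""
    found = set()
    for b in range(1, B + 1):
        for a in range(-B, B + 1):
            if gcd(a, b) == 1:          # gcd(0, b) = b, so 0 appears only as 0/1
                found.add(Fraction(a, b))
    return sorted(found)

small = rationals_of_height_at_most(2)
```

For `B = 2` the list consists of $0, \pm 1, \pm 2, \pm\tfrac12$. For higher degrees one would enumerate minimal polynomials with bounded coefficients instead, which is the idea behind deducing the theorem from (1.9.3).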
1.9.2 v-adic norms and heights of vectors and polynomials
Let $K$ be an algebraic number field, $v\in M_K$, and denote the unique extension of $|\cdot|_v$ to $K_v$ also by $|\cdot|_v$. We define the $v$-adic norm of a vector $\mathbf{x} = (x_1,\dots,x_n)\in K_v^n$ by
\[
|\mathbf{x}|_v = |x_1,\dots,x_n|_v := \max(|x_1|_v,\dots,|x_n|_v).
\]
Let $\mathbf{x} = (x_1,\dots,x_n)\in\overline{\mathbb{Q}}^n$ and choose an algebraic number field $K$ such that $\mathbf{x}\in K^n$. Then the multiplicative height and homogeneous multiplicative height of $\mathbf{x}$ are defined by
\[
H(\mathbf{x}) = H(x_1,\dots,x_n) := \Bigl(\prod_{v\in M_K}\max(1,|\mathbf{x}|_v)\Bigr)^{1/[K:\mathbb{Q}]},
\]
\[
H^{\mathrm{hom}}(\mathbf{x}) = H^{\mathrm{hom}}(x_1,\dots,x_n) := \Bigl(\prod_{v\in M_K}|\mathbf{x}|_v\Bigr)^{1/[K:\mathbb{Q}]},
\]
respectively. By Proposition 1.7.1, these definitions are independent of the choice of $K$. We define the corresponding logarithmic heights by
\[
h(\mathbf{x}) := \log H(\mathbf{x}),\qquad h^{\mathrm{hom}}(\mathbf{x}) := \log H^{\mathrm{hom}}(\mathbf{x}),
\]
respectively. For instance, for $\mathbf{x} = (x_1,\dots,x_n)\in\mathbb{Z}^n\setminus\{0\}$ we have
\[
\left.
\begin{aligned}
h(\mathbf{x}) &= \log\max(|x_1|,\dots,|x_n|),\\
h^{\mathrm{hom}}(\mathbf{x}) &= \log\frac{\max(|x_1|,\dots,|x_n|)}{\gcd(x_1,\dots,x_n)}.
\end{aligned}
\right\} \tag{1.9.4}
\]
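The two formulas in (1.9.4) translate directly into code. The sketch below, with illustrative function names, computes both heights of a non-zero integer vector and checks on examples that the homogeneous height does not exceed the inhomogeneous one and is unchanged under scaling.

```python
from functools import reduce
from math import gcd, log, isclose

def h(x) -> float:
    """h(x) = log max |x_i| for a non-zero integer vector x."""
    return log(max(abs(c) for c in x))

def h_hom(x) -> float:
    """h^hom(x) = log( max |x_i| / gcd(x_1, ..., x_n) ) for a non-zero integer vector x."""
    g = reduce(gcd, x)                       # math.gcd ignores signs
    return log(max(abs(c) for c in x) / g)

x = (4, -6)
assert isclose(h(x), log(6)) and isclose(h_hom(x), log(3))
assert h_hom(x) <= h(x)
assert isclose(h_hom((2, 4)), h_hom((6, 12)))   # invariance under scaling by 3
```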
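It is easy to verify that for $\mathbf{x} = (x_1,\dots,x_n)\in\overline{\mathbb{Q}}^n$, $\lambda\in\overline{\mathbb{Q}}^*$ and for $\mathbf{x}_1,\dots,\mathbf{x}_m\in\overline{\mathbb{Q}}^n$,
\begin{align}
h^{\mathrm{hom}}(\mathbf{x}) &\le h(\mathbf{x}), \tag{1.9.5}\\
\max_{1\le i\le n} h(x_i) \le h(\mathbf{x}) &\le \sum_{i=1}^n h(x_i), \tag{1.9.6}\\
h(\mathbf{x}) - h(\lambda) \le h(\lambda\mathbf{x}) &\le h(\mathbf{x}) + h(\lambda), \tag{1.9.7}\\
h^{\mathrm{hom}}(\lambda\mathbf{x}) &= h^{\mathrm{hom}}(\mathbf{x}), \tag{1.9.8}\\
h(\mathbf{x}_1+\cdots+\mathbf{x}_m) &\le \sum_{i=1}^m h(\mathbf{x}_i) + \log m. \tag{1.9.9}
\end{align}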
We recall a few facts on heights and norms of polynomials. Let K be an
algebraic number field and v ∈ MK . Denote the unique extension of | · |v to Kv
also by | · |v . For a polynomial P ∈ Kv [X1 , . . . , Xg ], we denote by |P |v the
v-adic norm of a vector, consisting of all non-zero coefficients of P . We write
as before s(v) = 1 if v is real, s(v) = 2 if v is complex, and s(v) = 0 if v is
finite.
Proposition 1.9.4 Let $P_1,\dots,P_m$ be non-zero polynomials in $K_v[X_1,\dots,X_g]$ and let $n$ be the sum of the partial degrees of $P := P_1\cdots P_m$. Then
\[
2^{-ns(v)} \le \frac{|P|_v}{|P_1|_v\cdots|P_m|_v} \le 2^{ns(v)}.
\]
Proof. In the case that $v$ is finite the term $2^{ns(v)}$ is 1, and so this is Gauss’ Lemma. In the case that $v$ is infinite this is a version of a lemma of Gelfond. Proofs of both can be found for instance in Bombieri and Gubler (2006), Lemmas 1.6.3, 1.6.11.
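For $K = \mathbb{Q}$, the real place $v = \infty$ (so $s(v) = 1$) and polynomials in one variable, $|P|_v$ is the largest absolute value of a coefficient and $n = \deg P$. The following Python sketch checks the two-sided bound on sample integer polynomials; coefficient lists are stored from the constant term upwards, and the function names are illustrative.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def sup_norm(p):
    """|P| at the real place of Q: the largest absolute value of a coefficient."""
    return max(abs(c) for c in p)

def gelfond_ratio(factors):
    """Return (|P| / (|P_1| ... |P_m|), deg P) for P = P_1 ... P_m.
    Each factor is assumed non-zero with non-zero leading coefficient."""
    prod, denom = [1], 1
    for f in factors:
        prod = poly_mul(prod, f)
        denom *= sup_norm(f)
    return Fraction(sup_norm(prod), denom), len(prod) - 1

ratio, n = gelfond_ratio([[1, 1]] * 4)           # (1 + X)^4 = 1 + 4X + 6X^2 + 4X^3 + X^4
assert Fraction(1, 2 ** n) <= ratio <= 2 ** n    # here ratio = 6 and n = 4
```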
For a polynomial $P\in\overline{\mathbb{Q}}[X_1,\dots,X_g]$, we denote by $H(P)$, $H^{\mathrm{hom}}(P)$, $h(P)$, $h^{\mathrm{hom}}(P)$ the respective heights of a vector whose coordinates are the non-zero coefficients of $P$. Obviously, for polynomials we have similar inequalities as in (1.9.5)–(1.9.9). From Proposition 1.9.4 we deduce at once:
Corollary 1.9.5 Let $P_1,\dots,P_m$ be non-zero polynomials in $\overline{\mathbb{Q}}[X_1,\dots,X_g]$ and let $n$ be the sum of the partial degrees of $P := P_1\cdots P_m$. Then
\[
\Bigl| h^{\mathrm{hom}}(P) - \sum_{i=1}^m h^{\mathrm{hom}}(P_i) \Bigr| \le n\log 2.
\]
Proof. Choose a number field $K$ containing the coefficients of $P_1,\dots,P_m$, apply Proposition 1.9.4 and take the product over $v\in M_K$.
Corollary 1.9.6 Let $P = (X-\alpha_1)\cdots(X-\alpha_n)\in\overline{\mathbb{Q}}[X]$. Then
\[
\Bigl| h(P) - \sum_{i=1}^n h(\alpha_i) \Bigr| \le n\log 2,
\qquad
2^{-n}H(P) \le \prod_{i=1}^n H(\alpha_i) \le 2^nH(P).
\]
Proof. The second assertion is an immediate consequence of the first one. To prove the first, observe that $h(\alpha) = h^{\mathrm{hom}}(X-\alpha)$ for $\alpha\in\overline{\mathbb{Q}}$ and that $h(P) = h^{\mathrm{hom}}(P)$ since $P$ is monic. Applying Corollary 1.9.5 to the identity $P(X) = \prod_{i=1}^n (X-\alpha_i)$, the first assertion follows.
For monic irreducible polynomials P with coefficients in Z, Corollary 1.9.6
gives a slightly weaker version of (1.9.3).
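For a monic irreducible $P\in\mathbb{Z}[X]$ the sum $\sum_i h(\alpha_i)$ is the logarithm of the Mahler measure $\prod_i\max(1,|\alpha_i|)$, by the formula for $H(\alpha)$ given earlier in this section. This makes Corollary 1.9.6 easy to test numerically. The sketch below does so for monic quadratics with floating-point roots; the function names are illustrative and the restriction to degree 2 is only for brevity.

```python
import cmath
from math import log

def quadratic_roots(b, c):
    """Complex roots of the monic polynomial X^2 + b*X + c."""
    s = cmath.sqrt(b * b - 4 * c)
    return [(-b + s) / 2, (-b - s) / 2]

def sum_of_root_heights(b, c):
    """log of the Mahler measure prod max(1, |root|) of X^2 + b*X + c.
    For an irreducible integer polynomial this equals h(alpha_1) + h(alpha_2)."""
    m = 1.0
    for r in quadratic_roots(b, c):
        m *= max(1.0, abs(r))
    return log(m)

def coefficient_height(b, c):
    """h(P) = log max(1, |b|, |c|) for P = X^2 + b*X + c with integer b, c."""
    return log(max(1, abs(b), abs(c)))

for b, c in [(-1, -1), (-6, 1), (0, -7)]:       # three irreducible examples
    gap = abs(coefficient_height(b, c) - sum_of_root_heights(b, c))
    assert gap <= 2 * log(2)                     # n log 2 with n = 2
```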
1.10 Effective computations in number fields
We have listed the basic algorithmic results for algebraic number fields that
will be needed later. We shall not present the algorithms themselves, but refer
to the literature for their description. Our main references are Borevich and
Shafarevich (1967), Pohst and Zassenhaus (1989) and Cohen (1993, 2000).
When we say that for any given input from a specified set we can determine/compute effectively an output, we mean that there exists an algorithm
(that is, a deterministic Turing machine) which, for any choice of input from
the given set, computes the output in finitely many steps. We say that an object
is given effectively if it is given in such a way that it can serve as input for an
algorithm.
An algebraic number field K is said to be effectively given if K = Q(θ) and the monic minimal polynomial P ∈ Q[X] of θ are given. Then K ≅ Q[X]/(P). Here we may assume that P ∈ Z[X] and that θ is an algebraic
integer. Throughout this section we assume that K is effectively given in the
form K = Q(θ ) with the monic minimal polynomial P of θ in Z[X]. We denote
by d the degree of P , that is the degree of K over Q.
We say that an element $\alpha$ of $K$ is effectively given/computable if in the representation
\[
\alpha = a_0 + a_1\theta + \cdots + a_{d-1}\theta^{d-1} \tag{1.10.1}
\]
of $\alpha$ the coefficients $a_0,\dots,a_{d-1}\in\mathbb{Q}$ are effectively given/computable. The representation (1.10.1) is regarded as the standard representation of $\alpha\in K$ with respect to $\theta$.
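To make the standard representation concrete, an element of $K$ can be stored as its list of coefficients $a_0,\dots,a_{d-1}$, and products can be reduced with the relation $P(\theta)=0$. The following Python sketch shows multiplication only; it assumes $P$ is monic and given as a coefficient list with the constant term first, and the function names are illustrative rather than taken from the cited references.

```python
from fractions import Fraction

def reduce_mod(coeffs, P):
    """Reduce a polynomial in theta modulo the monic polynomial P (constant term first)."""
    d = len(P) - 1
    c = [Fraction(x) for x in coeffs]
    while len(c) > d:
        lead = c.pop()               # coefficient of theta^(k + d), where k = len(c) - d
        k = len(c) - d
        for i in range(d):           # theta^d = -(P[0] + P[1] theta + ... + P[d-1] theta^(d-1))
            c[k + i] -= lead * P[i]
    return c + [Fraction(0)] * (d - len(c))

def multiply(a, b, P):
    """Standard representation of the product of two elements of Q(theta)."""
    prod = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] += Fraction(x) * Fraction(y)
    return reduce_mod(prod, P)

P = [-2, 0, 1]                                   # theta^2 = 2, i.e. K = Q(sqrt(2))
assert multiply([1, 1], [1, 1], P) == [3, 2]     # (1 + theta)^2 = 3 + 2 theta
```

Sums and differences are taken coefficientwise; division additionally needs an inverse, for instance via the extended Euclidean algorithm in $\mathbb{Q}[X]$, which is omitted here.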
We shall need the following algorithmic results.
(I) If α, β ∈ K are effectively given/computable then α ± β, αβ and
α/β (β ≠ 0) are also effectively computable; see e.g. Cohen (1993),
section 4.2.
(II) One can determine effectively an integral basis of K, that is a Z-module
basis {1, ω2 , . . . , ωd } of the ring of integers OK of K, and from that the
discriminant DK of K; see e.g. Cohen (1993), section 6.1. It is easy to
see that if α ∈ K is effectively given then one can determine b1 , . . . , bd
in Q such that
α = b1 + b2 ω2 + · · · + bd ωd .
(1.10.2)
In particular, one can decide whether α is in OK .
(III) For any given F ∈ K[X], one can factorize F into irreducible polynomials over K; see Pohst and Zassenhaus (1989) or Cohen (1993),
section 3.6. As a consequence, for given F ∈ K[X] one can determine
all zeros of F in K.
(IV) If α ∈ K is effectively given, then its characteristic polynomial relative
to K/Q and its monic minimal polynomial over Q can be effectively
determined. Conversely, if a monic, irreducible polynomial P (X) over
Q is given, then one can decide whether any of its zeros belongs to K,
and if it is so then all zeros of P (X) in K can be effectively determined.
Consequently, if K/Q is normal, then all conjugates of any given α ∈ K
can be effectively determined; see e.g. Győry (1983), remark 1.
(V) For given C > 0 one can determine a finite and effectively determinable
subset A of K such that if α ∈ K and h(α) ≤ C then α ∈ A. Indeed,
representing α in the form (1.10.2) with an effectively given integral
basis {1, ω2 , . . . , ωd }, taking conjugates with respect to K/Q and using
Cramer’s Rule, one can get an effective upper bound for maxi h(bi ).
But, for such bi , the numbers b1 + b2 ω2 + · · · + bd ωd form a finite and
effectively computable subset of K.
(VI) If α ∈ K is effectively given then one can effectively compute an upper
bound for h(α). Indeed, by (IV) we can compute the minimal polynomial
Pα (X) ∈ Z[X] of α with relatively prime coefficients. Then (1.9.3)
provides an upper bound for h(α).
We say that a fractional ideal a of OK is effectively given/
computable if a finite set of generators of a over OK is effectively
given/computable. For other representations of fractional ideals we
refer to Pohst and Zassenhaus (1989), section 6.3 and Cohen (1993),
section 4.7.
(VII) If a fractional ideal a of OK is effectively given then it can be decided
whether a is principal. Further, if it is, one can compute an α ∈ K such
that a = (α); see Cohen (1993), section 6.5.
(VIII) For effectively given fractional ideals of OK one can compute their
sum, product and their absolute norms. Further, one can test equality,
inclusion (i.e. divisibility) and whether a given element of K is in a
given fractional ideal; see e.g. Cohen (1993), section 4.7.
(IX) If a is an effectively given fractional ideal of OK then its prime ideal
factorization can be effectively determined; see e.g. Cohen (2000), section 2.3. In particular, one can decide whether a is an ideal of OK or a
prime ideal.
Let S be a finite set of places of K containing all infinite places. We
say that S is effectively given if the prime ideals in S are effectively
given. In what follows, we assume that S is effectively given. We recall
that OS (resp. OS∗ ) denotes the ring of S-integers (resp. the group of
S-units) in K.
(X) For an effectively given place v ∈ MK and an effectively given α ∈ K,
one can effectively compute |α|v . For the definition of an infinite place v
being effectively given, and the computation of |α|v see Cohen (1993),
section 4.2. For the computation of |α|v for a finite place v combine
(IX) and (VIII) above.
(XI) In view of (IX) one can decide for any given α ∈ K ∗ whether α ∈ OS
or α ∈ OS∗ .
(XII) A fundamental system of S-units can be effectively determined; see
Cohen (2000), section 7.4. In particular, a fundamental system of units
in K can be effectively found; see Borevich and Shafarevich (1967),
chapter 2, section 5 or Pohst and Zassenhaus (1989), section 5.7. Further,
the roots of unity in K can be effectively found; see e.g. Pohst and
Zassenhaus (1989), section 5.4 or Cohen (1993), section 4.9.
(XIII) If ε ∈ OS∗ and a fundamental system of S-units {ε1 , . . . , εs−1 } are
effectively given, then one can determine effectively rational integers
b1 , . . . , bs−1 and a root of unity ζ in K such that (1.8.1) holds.
Proof. By Corollary 1.8.2, $\varepsilon$ can be written in the form (1.8.1). Let $S = \{v_1,\dots,v_s\}$ be as in Theorem 1.8.1. Then (1.8.1) implies that
\[
\log|\varepsilon|_{v_i} = \sum_{j=1}^{s-1} b_j\log|\varepsilon_j|_{v_i} \quad\text{for } i=1,\dots,s.
\]
Considering this as a system of linear equations in $b_1,\dots,b_{s-1}$ and using Cramer’s Rule, (1.8.5) and the fact that $|\log|\varepsilon|_{v_i}|$ and $|\log|\varepsilon_j|_{v_i}|$ can be
effectively bounded above for each $i$ and $j$, one can derive an effectively computable upper bound for $\max_j|b_j|$. Testing all possible values of $b_1,\dots,b_{s-1}$, one can determine $b_1,\dots,b_{s-1}$ such that $\varepsilon/(\varepsilon_1^{b_1}\cdots\varepsilon_{s-1}^{b_{s-1}})$ is a root of unity.
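The idea of the proof can be illustrated in the smallest non-trivial case: $K = \mathbb{Q}(\sqrt2)$ with $S$ the two real places, where $\varepsilon_1 = 1+\sqrt2$ is a fundamental unit and the roots of unity are $\pm1$. The Python sketch below recovers the exponent from the logarithmic relation at one real place and then confirms the result exactly in $\mathbb{Z}[\sqrt2]$. It rounds a floating-point quotient instead of bounding $|b_1|$ and testing all candidates as in the proof, so it is an illustration under that simplifying assumption; the function names are hypothetical.

```python
from math import log, sqrt

def mul(u, v):
    """Product of a + b*sqrt(2) and c + d*sqrt(2), stored as integer pairs."""
    (a, b), (c, d) = u, v
    return (a * c + 2 * b * d, a * d + b * c)

def power(u, n):
    """u**n for a unit u = a + b*sqrt(2) of norm +1 or -1 and any integer n."""
    if n < 0:
        a, b = u
        norm = a * a - 2 * b * b              # +1 or -1 for a unit
        u, n = (a * norm, -b * norm), -n      # inverse = conjugate / norm
    result = (1, 0)
    for _ in range(n):
        result = mul(result, u)
    return result

def decompose(eps, eps1=(1, 1)):
    """Return (zeta, b) with eps = zeta * eps1**b and zeta in {1, -1}."""
    value = abs(eps[0] + eps[1] * sqrt(2))    # |eps| at the real place sqrt(2) -> +1.414...
    base = abs(eps1[0] + eps1[1] * sqrt(2))
    b = round(log(value) / log(base))         # solves log|eps| = b * log|eps1|
    candidate = power(eps1, b)
    for zeta in (1, -1):                      # exact check in Z[sqrt(2)]
        if (zeta * candidate[0], zeta * candidate[1]) == eps:
            return zeta, b
    raise ValueError("eps is not of the form +-eps1**b")

assert decompose((-7, -5)) == (-1, 3)         # -(7 + 5 sqrt(2)) = -(1 + sqrt(2))^3
```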
1.11 p-adic numbers
Let $p$ be a prime number. Recall that we have defined the absolute value $|\cdot|_p$ on $\mathbb{Q}$ by $|\cdot|_p := p^{-\mathrm{ord}_p(\cdot)}$, where $\mathrm{ord}_p(a)$ denotes the exponent of $p$ in the unique prime factorization of $a\in\mathbb{Q}^*$, and $\mathrm{ord}_p(0) = \infty$. We denote by $\mathbb{Q}_p$ the completion of $\mathbb{Q}$ with respect to $|\cdot|_p$ or equivalently, with respect to $\mathrm{ord}_p$.
Clearly, ordp defines a discrete valuation on Q, and hence on Qp . We define
the ring of p-adic integers by
Zp := {x ∈ Qp : ordp (x) ≥ 0}.
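On rational numbers both $\mathrm{ord}_p$ and $|\cdot|_p$ are directly computable. The following Python sketch, with illustrative function names, implements them for `Fraction` inputs and checks the ultrametric inequality on an example.

```python
from fractions import Fraction
from math import inf

def ord_p(a, p):
    """Exponent of the prime p in the rational number a; infinity for a = 0."""
    a = Fraction(a)
    if a == 0:
        return inf
    n, d, e = a.numerator, a.denominator, 0
    while n % p == 0:
        n //= p
        e += 1
    while d % p == 0:
        d //= p
        e -= 1
    return e

def abs_p(a, p):
    """p-adic absolute value |a|_p = p**(-ord_p(a)), returned as an exact Fraction."""
    v = ord_p(a, p)
    return Fraction(0) if v == inf else Fraction(p) ** (-v)

a, b = Fraction(18, 5), Fraction(9, 2)
assert ord_p(a, 3) == 2 and abs_p(a, 3) == Fraction(1, 9)
assert ord_p(a + b, 3) >= min(ord_p(a, 3), ord_p(b, 3))   # ultrametric inequality
assert ord_p(a, 3) >= 0                                    # a is a 3-adic integer
```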
Let $L$ be a finite extension of $\mathbb{Q}_p$. There is precisely one absolute value on $L$ that extends $|\cdot|_p$, given by $|N_{L/\mathbb{Q}_p}(\cdot)|_p^{1/[L:\mathbb{Q}_p]}$, and $L$ is complete with respect to this absolute value. We can extend $\mathrm{ord}_p$ to a valuation on $L$, by defining
\[
\mathrm{ord}_p(\alpha) := -\frac{\log|N_{L/\mathbb{Q}_p}(\alpha)|_p}{[L:\mathbb{Q}_p]\log p} \quad\text{for } \alpha\in L.
\]
Clearly, $\mathrm{ord}_p(\alpha)$ is a rational number with denominator dividing $[L:\mathbb{Q}_p]$. As a consequence, the value set of $\mathrm{ord}_p$ on $L^*$ is a cyclic subgroup of $\mathbb{Q}$ containing $\mathbb{Z}$, say of the shape $e^{-1}\mathbb{Z}$, where $e$ is a positive integer. This integer $e$ is called the ramification index of $L$ over $\mathbb{Q}_p$. Any positive integer $e$ may occur as ramification index; for instance if $\alpha^e = p$, then $\mathbb{Q}_p(\alpha)$ has ramification index $e$ over $\mathbb{Q}_p$.
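Now, let $\overline{\mathbb{Q}}_p$ denote an algebraic closure of $\mathbb{Q}_p$. Then the above considerations imply that $\mathrm{ord}_p$ extends uniquely to a valuation on $\overline{\mathbb{Q}}_p$, denoted also by $\mathrm{ord}_p$, with value group $\mathrm{ord}_p(\overline{\mathbb{Q}}_p^*) = \mathbb{Q}$. It can be shown that $\overline{\mathbb{Q}}_p$ is not complete with respect to $\mathrm{ord}_p$ but that the completion of $\overline{\mathbb{Q}}_p$ is algebraically closed (see Koblitz (1984), pp. 71–73). The ring of integers of $\overline{\mathbb{Q}}_p$ and its unit group are given by
\[
\overline{\mathbb{Z}}_p := \{x\in\overline{\mathbb{Q}}_p : \mathrm{ord}_p(x)\ge 0\},
\qquad
\overline{\mathbb{Z}}_p^* = \{x\in\overline{\mathbb{Q}}_p : \mathrm{ord}_p(x) = 0\}.
\]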
Let $K$ be an algebraic number field. Then any discrete valuation $v$ on $K$ lying above $\mathrm{ord}_p$ corresponds to a prime ideal of $\mathcal{O}_K$ above $p$, that is,
\[
\mathfrak{p} := \{x\in\mathcal{O}_K : v(x) > 0\}.
\]
Further, for $x\in K^*$, $v(x)$ is precisely the exponent of $\mathfrak{p}$ in the unique prime ideal factorization of $(x)$, i.e., $v = \mathrm{ord}_{\mathfrak{p}}$. The completion of $K$ with respect to $\mathrm{ord}_{\mathfrak{p}}$, denoted by $K_{\mathfrak{p}}$, is a finite extension of $\mathbb{Q}_p$, and the ramification index
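of $K_{\mathfrak{p}}$ over $\mathbb{Q}_p$ is precisely the ramification index $e(\mathfrak{p}|p)$ of $\mathfrak{p}$ over $p$. Further, there is an embedding $\sigma: K\to\overline{\mathbb{Q}}_p$ such that
\[
\mathrm{ord}_{\mathfrak{p}}(x) = e(\mathfrak{p}|p)\,\mathrm{ord}_p(\sigma(x)) \quad\text{for } x\in K.
\]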
We are going to define the p-adic logarithm. We start with some preliminaries.
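Lemma 1.11.1 Let $\alpha\in\overline{\mathbb{Z}}_p^*$ and $c\in\mathbb{R}_{>0}$. Then there is a positive integer $m$ such that $\mathrm{ord}_p(\alpha^m-1) > c$. This integer $m$ depends only on $p$, $c$ and the field $\mathbb{Q}_p(\alpha)$.
Proof. Let $L := \mathbb{Q}_p(\alpha)$ and define
\[
\mathfrak{o} := \{x\in L : \mathrm{ord}_p(x)\ge 0\},
\qquad
\mathfrak{m} := \{x\in L : \mathrm{ord}_p(x) > c\}.
\]
Then $\mathfrak{o}$ is the integral closure of $\mathbb{Z}_p$ in $L$ and $\mathfrak{m}$ is an ideal of $\mathfrak{o}$. Let $l$ be the smallest integer $> c$. Then $\mathfrak{m}\supseteq p^l\mathfrak{o}$. Since the additive structure of $\mathfrak{o}$ is that of a free $\mathbb{Z}_p$-module of rank $[L:\mathbb{Q}_p]$, the residue class ring $\mathfrak{o}/p^l\mathfrak{o}$ has cardinality $p^{l\cdot[L:\mathbb{Q}_p]}$. This shows that the residue class ring $\mathfrak{o}/\mathfrak{m}$ is finite. But then its unit group $(\mathfrak{o}/\mathfrak{m})^*$ is also finite, say of order $m$, and $\alpha^m-1\in\mathfrak{m}$. Clearly, $m$ depends only on $p$, $c$ and $L$.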
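When $\alpha$ is an ordinary integer not divisible by $p$ (so $L = \mathbb{Q}_p$), the lemma reduces to finding the multiplicative order of $\alpha$ modulo a power of $p$. The Python sketch below finds the smallest suitable exponent by direct search; it covers only this special case, and the function names are illustrative.

```python
def ord_p_int(n, p):
    """ord_p of a non-zero integer n."""
    e = 0
    while n % p == 0:
        n //= p
        e += 1
    return e

def smallest_exponent(alpha, p, c):
    """Smallest m >= 1 with ord_p(alpha**m - 1) > c, for an integer alpha prime to p and c > 0."""
    if alpha % p == 0:
        raise ValueError("alpha must be a p-adic unit")
    modulus = p ** (int(c) + 1)        # for integers x: ord_p(x) > c  iff  p**(floor(c)+1) divides x
    m, power = 1, alpha % modulus
    while power != 1:
        power = (power * alpha) % modulus
        m += 1
    return m

m = smallest_exponent(2, 3, 2)         # order of 2 modulo 27
assert ord_p_int(2 ** m - 1, 3) > 2
```

The exponent found this way is the order of $\alpha$ in $(\mathbb{Z}/p^l\mathbb{Z})^*$ and so divides the order of that group, which is the uniform exponent used in the proof.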
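Lemma 1.11.2
(i) Let $\alpha\in\overline{\mathbb{Z}}_p^*$ with $\mathrm{ord}_p(\alpha-1)>0$. Then the series
\[
\log_p(\alpha) := \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}\cdot(\alpha-1)^n
\]
converges to a limit in the field $\mathbb{Q}_p(\alpha)$.
(ii) Let $\alpha\in\overline{\mathbb{Z}}_p^*$ with $\mathrm{ord}_p(\alpha-1) > 1/(p-1)$. Then
\[
\mathrm{ord}_p(\log_p(\alpha)) = \mathrm{ord}_p(\alpha-1).
\]
(iii) Let $\alpha,\beta\in\overline{\mathbb{Z}}_p^*$ with $\mathrm{ord}_p(\alpha-1)>0$, $\mathrm{ord}_p(\beta-1)>0$. Then
\[
\log_p(\alpha\beta) = \log_p(\alpha)+\log_p(\beta).
\]
Proof. The proofs of (i) and (iii) can be found in Koblitz (1984), section 4.1. To prove (ii), put $\kappa := \mathrm{ord}_p(\alpha-1)$. Then, since $\kappa > 1/(p-1)$,
\[
\mathrm{ord}_p(\log_p(\alpha)) = \min(\kappa,\ p\kappa-1,\ p^2\kappa-2,\ \dots) = \kappa.
\]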
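Parts (ii) and (iii) can be observed numerically for rational $\alpha$ by truncating the series and measuring differences with $\mathrm{ord}_p$. In the sketch below partial sums are computed exactly with `Fraction`; the truncation length 30 and the function names are illustrative choices, and agreement is only expected up to the $p$-adic size of the neglected tail.

```python
from fractions import Fraction
from math import inf

def ord_p(a, p):
    """Exponent of the prime p in the rational number a; infinity for a = 0."""
    a = Fraction(a)
    if a == 0:
        return inf
    n, d, e = a.numerator, a.denominator, 0
    while n % p == 0:
        n //= p
        e += 1
    while d % p == 0:
        d //= p
        e -= 1
    return e

def log_p_truncated(alpha, p, terms):
    """Partial sum of the series for log_p(alpha), for rational alpha with ord_p(alpha - 1) > 0."""
    x = Fraction(alpha) - 1
    if ord_p(x, p) <= 0:
        raise ValueError("need ord_p(alpha - 1) > 0")
    total, power = Fraction(0), Fraction(1)
    for n in range(1, terms + 1):
        power *= x
        total += Fraction((-1) ** (n - 1), n) * power
    return total

p = 5
la, lb, lab = (log_p_truncated(t, p, 30) for t in (6, 11, 66))
assert ord_p(la, p) == ord_p(6 - 1, p)      # part (ii): ord_5(log_5 6) = ord_5(5) = 1
assert ord_p(lab - la - lb, p) >= 25        # part (iii), up to the truncation error
```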
We now define $\log_p$ on the whole group $\overline{\mathbb{Z}}_p^*$ as follows: take $\alpha\in\overline{\mathbb{Z}}_p^*$, choose a positive integer $m$ such that $\mathrm{ord}_p(\alpha^m-1)>0$ (which exists by
Lemma 1.11.1) and put
\[
\log_p(\alpha) := \frac{1}{m}\log_p(\alpha^m).
\]
By part (iii) of Lemma 1.11.2 this does not depend on the choice of $m$. Moreover,
\[
\log_p(\alpha\beta) = \log_p(\alpha)+\log_p(\beta) \quad\text{for } \alpha,\beta\in\overline{\mathbb{Z}}_p^*. \tag{1.11.1}
\]
Proposition 1.11.3
(i) $\log_p$ defines a surjective group homomorphism from $\overline{\mathbb{Z}}_p^*$ to the additive group of $\overline{\mathbb{Q}}_p$ with kernel the roots of unity in $\overline{\mathbb{Q}}_p$.
(ii) Let $L$ be a finite extension of $\mathbb{Q}_p$. Then $\log_p$ defines a non-surjective homomorphism from $\overline{\mathbb{Z}}_p^*\cap L$ to the additive group of $L$.
Proof. (i) By (1.11.1), $\log_p$ defines a homomorphism from $\overline{\mathbb{Z}}_p^*$ to the additive group of $\overline{\mathbb{Q}}_p$. We determine the kernel of $\log_p$. First, let $\alpha$ be a root of unity from $\overline{\mathbb{Q}}_p$. Then $\alpha\in\overline{\mathbb{Z}}_p^*$. Further, there is a positive integer $m$ with $\alpha^m = 1$ and so, $\log_p\alpha = m^{-1}\log_p(\alpha^m) = 0$. Now let $\alpha\in\overline{\mathbb{Z}}_p^*$ which is not a root of unity. By Lemma 1.11.1, there exists a positive integer $m$ such that $\mathrm{ord}_p(\alpha^m-1) > 1/(p-1)$. Then by part (ii) of Lemma 1.11.2,
\begin{align}
\mathrm{ord}_p(\log_p(\alpha)) &= \mathrm{ord}_p(\log_p(\alpha^m)) - \mathrm{ord}_p(m) \notag\\
&= \mathrm{ord}_p(\alpha^m-1) - \mathrm{ord}_p(m) < \infty \tag{1.11.2}
\end{align}
and thus $\log_p(\alpha)\neq 0$. This proves that the kernel of $\log_p$ consists of the roots of unity of $\overline{\mathbb{Q}}_p$.
To prove the surjectivity of $\log_p$, we use the $p$-adic exponential. By, e.g., Koblitz (1984), section 4.1, for $\alpha\in\overline{\mathbb{Q}}_p^*$ with $\mathrm{ord}_p(\alpha) > p/(p-1)$, the series
\[
\exp_p(\alpha) := \sum_{n=0}^{\infty}\frac{\alpha^n}{n!}
\]
converges to a limit in the field $\mathbb{Q}_p(\alpha)$. Moreover, again by Koblitz (1984), section 4.1, for these $\alpha$ we have $\mathrm{ord}_p(\exp_p(\alpha)-1) > 0$ and $\log_p(\exp_p(\alpha)) = \alpha$. Now let $\beta\in\overline{\mathbb{Q}}_p$ be arbitrary. Choose $k\in\mathbb{Z}_{>0}$ such that $k+\mathrm{ord}_p(\beta) > p/(p-1)$ and then $\alpha\in\overline{\mathbb{Q}}_p$ with $\alpha^{p^k} = \exp_p(p^k\beta)$. We have $\alpha\in\overline{\mathbb{Z}}_p^*$ since $\mathrm{ord}_p(\exp_p(p^k\beta)-1) > 0$. Now, clearly,
\[
\log_p(\alpha) = p^{-k}\log_p(\exp_p(p^k\beta)) = \beta.
\]
This proves the surjectivity of $\log_p$.
(ii) Let $\alpha\in\overline{\mathbb{Z}}_p^*\cap L$. By Lemma 1.11.1, there exists a positive integer $m$, depending only on $p$ and $L$, such that $\mathrm{ord}_p(\alpha^m-1) > 1/(p-1)$. Then $\log_p(\alpha) = m^{-1}\log_p(\alpha^m)\in L$ and also, similarly to (1.11.2),
\[
\mathrm{ord}_p(\log_p(\alpha)) = \mathrm{ord}_p(\alpha^m-1) - \mathrm{ord}_p(m) > \frac{1}{p-1} - \mathrm{ord}_p(m),
\]
which is independent of $\alpha$. So $\log_p: \overline{\mathbb{Z}}_p^*\cap L\to L$ is certainly not surjective.
For the computation of p-adic logarithms of algebraic numbers, see de
Weger (1989) and Smart (1998).
2 Algebraic function fields
By an algebraic function field in one variable over a field k, or, in short, function
field over k, we mean a finitely generated field extension of transcendence
degree 1 over k. We shall restrict ourselves to the case that k is algebraically
closed and of characteristic 0. Thus, if K is a function field over k and z is
any element from K \ k, then K is a finite extension of the field of rational
functions k(z).
We have collected here the concepts and results that are used in our book.
For further details and proofs, we refer to the books Eichler (1966) and Mason
(1984) and to the paper Schmidt (1978).
2.1 Valuations
Let $k$ be an algebraically closed field of characteristic 0 and $K$ an algebraic function field over $k$. By a valuation on $K$ over $k$ we mean a discrete valuation with value group $\mathbb{Z}$ such that $v(x) = 0$ for $x\in k^*$, i.e., a surjective map $v: K\to\mathbb{Z}\cup\{\infty\}$ such that
\begin{gather*}
v(x) = \infty \iff x = 0;\\
v(xy) = v(x)+v(y),\quad v(x+y)\ge\min(v(x),v(y)) \quad\text{for } x,y\in K;\\
v(x) = 0 \quad\text{for } x\in k^*.
\end{gather*}
The corresponding local ring and maximal ideal of v are given by
ov := {x ∈ K : v(x) ≥ 0},
mv := {x ∈ K : v(x) > 0},
respectively. The quotient ov /mv is a field, called the residue class field of v.
Since k is algebraically closed, we have ov /mv = k.
Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:27:57, subject to the Cambridge
Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.004
By a local parameter of v we mean an element zv of K such that v(zv ) = 1.
Then the completion of K at v is the field of formal Laurent series k((zv )).
Analogously to number fields, we denote the set of valuations on K by MK .
Let K be a function field over k and L a finite extension of K. Let v be a
valuation on K. We say that a valuation w of L lies above v, notation w|v, if the
restriction of w to K is a multiple of v. In that case, we have w(x) = e(w|v)v(x)
for x ∈ K, where e(w|v) is a positive integer, called the ramification index of
w over v.
First we describe the valuations on $k(z)$. For every element $a$ of $k$, each non-zero element $x$ of $k(z)$ may be expanded as a formal Laurent series
\[
\sum_{m=n}^{\infty} a_m(z-a)^m,
\]
where $a_m$ is an element of $k$ and $a_n\neq 0$. Then $\mathrm{ord}_a$ defined by $\mathrm{ord}_a(x) := n$ is a valuation on $k(z)$, and the field of Laurent series $k((z-a))$ is the completion of $k(z)$ at $\mathrm{ord}_a$. Similarly, we define a valuation $\mathrm{ord}_\infty$ on $k(z)$ expanding $x\in k(z)$ as a Laurent series in $z^{-1}$. In particular, $\mathrm{ord}_\infty(x) = -\deg(x)$ for $x\in k[z]$. The completion of $k(z)$ at $\mathrm{ord}_\infty$ is $k((z^{-1}))$. The valuations $\mathrm{ord}_a$ ($a\in k\cup\{\infty\}$) provide all valuations on $k(z)$. These valuations satisfy the Sum Formula
\[
\sum_{a\in k\cup\{\infty\}} \mathrm{ord}_a(x) = 0 \quad\text{for } x\in k(z)^*.
\]
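For a rational function $x = p/q$ whose zeros and poles are all rational numbers, the valuations $\mathrm{ord}_a(x)$ can be read off from root multiplicities, and $\mathrm{ord}_\infty(x) = \deg q - \deg p$. The Python sketch below verifies the Sum Formula in such a case. Polynomials are coefficient lists with the constant term first and non-zero leading coefficient; the example and the function names are illustrative, and the list of points must contain every zero and pole for the check to be meaningful.

```python
def root_multiplicity(poly, a):
    """Multiplicity of a as a root of the non-zero polynomial poly."""
    mult, coeffs = 0, list(poly)
    while len(coeffs) > 1:
        quotient, carry = [], 0
        for c in reversed(coeffs):       # Horner / synthetic division by (z - a)
            carry = carry * a + c
            quotient.append(carry)
        remainder = quotient.pop()       # last Horner value is the remainder poly(a)
        if remainder != 0:
            break
        coeffs = quotient[::-1]          # back to constant-term-first order
        mult += 1
    return mult

def ord_at(p, q, a):
    """ord_a(p/q) for a finite point a."""
    return root_multiplicity(p, a) - root_multiplicity(q, a)

def ord_at_infinity(p, q):
    """ord_infinity(p/q) = deg q - deg p."""
    return (len(q) - 1) - (len(p) - 1)

p = [2, -3, 0, 1]                        # z^3 - 3z + 2 = (z - 1)^2 (z + 2)
q = [-3, 1]                              # z - 3
orders = {a: ord_at(p, q, a) for a in (1, -2, 3)}
orders["inf"] = ord_at_infinity(p, q)
assert sum(orders.values()) == 0         # Sum Formula
```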
Now, let $K$ be an algebraic function field over $k$. We give a concrete description of the valuations on $K$ by means of Puiseux expansions. To this end, fix $z\in K\setminus k$, so that $K$ is a finite extension of $k(z)$. Put $d := [K:k(z)]$. The function field $K$ has a primitive element $y$ over $k(z)$ which satisfies an irreducible equation
\[
P(y,z) = y^d + p_1(z)y^{d-1}+\cdots+p_d(z) = 0 \tag{2.1.1}
\]
with coefficients $p_i(z)$ in $k(z)$. If $Q(z)$ is the common denominator of the rational functions $p_i(z)$, then $y_1 := Q(z)y$ satisfies an equation of the form (2.1.1) with coefficients from $k[z]$. Replacing $y$ by $y_1$, we may assume that in (2.1.1) $y$ is a primitive element of $K$ with polynomials $p_i(z)\in k[z]$.
The field $K$ may be embedded both in the field of fractional power series in $z-a$, where $a$ is an arbitrary element of $k$, and in the field of fractional power series in $z^{-1}$. These fields are all algebraically closed. Every element $x$ of $K$ may be expressed in a unique way in the form
\[
x = \sum_{i=1}^{d} q_i(z)y^{i-1}
\]
with some $q_1,\dots,q_d$ in $k(z)$. Hence, to expand the functions of $K$ in power series in $z-a$ or $z^{-1}$, it suffices to so expand the single function $y$.
We recall Puiseux’s classical theorem.
Theorem 2.1.1 For each element $a$ of $k$, there are positive integers $r_a\le d$ and $e_1,\dots,e_{r_a}$ with $e_1+\cdots+e_{r_a} = d$, and formal Puiseux series
\[
y_\rho = \sum_{m=n_\rho}^{\infty} a_{\rho m}(z-a)^{m/e_\rho},\qquad a_{\rho n_\rho}\neq 0,\qquad \rho = 1,\dots,r_a
\]
with coefficients $a_{\rho m}$ in $k$, that satisfy (2.1.1). Further, if $\zeta$ is a primitive $e_\rho$-th root of unity and
\[
y_{\rho j} = \sum_{m=n_\rho}^{\infty} a_{\rho m}\zeta^{jm}(z-a)^{m/e_\rho}\qquad (j = 1,\dots,e_\rho-1),
\]
then the left-hand side of (2.1.1) is identical with
\[
P(y,z) = \prod_{\rho,j}(y-y_{\rho j}).
\]
A similar assertion holds with $z^{-1}$ instead of $z-a$.
Proof. See, e.g., Eichler (1966) chapter III, section 1.
For each $a$ in $k$ and each $\rho$, $j$ as above, the map $\varphi_{\rho j}: y\mapsto y_{\rho j}$ determines uniquely an embedding of $K$ into the field of formal Laurent expansions in powers of $(z-a)^{1/e_\rho}$, i.e., for $x\in K$ we have
\[
\varphi_{\rho j}(x) = \sum_{m=n}^{\infty} a_m(z-a)^{m/e_\rho}\quad\text{with } a_m\in k \text{ for } m\ge n \text{ and } a_n\neq 0.
\]
For every $\rho$ with $1\le\rho\le r_a$, we construct a valuation $v$ on $K$ as follows: choose any $j$ with $1\le j\le e_\rho$. Then, for any $x\in K^*$, we define $v(x) := n$ in the above Laurent series expression for $\varphi_{\rho j}(x)$. Notice that the valuation $v$ on $K$ lies above $u := \mathrm{ord}_a$, that $(z-a)^{1/e_\rho}$ is a local parameter for $v$, and that $e_\rho$ is the ramification index $e(v|u)$ of $v$ over $u$. The above construction gives all extensions of $u = \mathrm{ord}_a$ to $K$.
One can construct in a similar way the extensions $v$ of $\mathrm{ord}_\infty$ to $K$. Each of these $v$ is defined as the order of vanishing of the Laurent expansion in a local parameter $z^{-1/e_\rho}$.
In this way all valuations $v$ of $K$ are described. For convenience we say that $v$ lies above $a$ ($a\in k\cup\{\infty\}$) if it lies above $\mathrm{ord}_a$ and write $e(v|a)$ for $e(v|\mathrm{ord}_a)$. Notice that for all $a\in k\cup\{\infty\}$ we have $\sum_{v|a} e(v|a) = [K:k(z)]$, where the sum is taken over all valuations $v$ of $K$ lying above $a$.
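As a small worked illustration (not taken from the text), consider $K = k(z)(y)$ with $y^2 = z$, so that $d = 2$ and (2.1.1) reads $y^2 - z = 0$. The Puiseux data of Theorem 2.1.1 can be written down directly:

```latex
% Worked example: K = k(z, y) with y^2 = z, so d = [K : k(z)] = 2.
\begin{align*}
a = 0:&\quad r_0 = 1,\ e_1 = 2,\qquad y = \pm z^{1/2},\\
a \neq 0:&\quad r_a = 2,\ e_1 = e_2 = 1,\qquad
  y = \pm\sqrt{a}\,\Bigl(1 + \tfrac{1}{2a}(z-a) - \tfrac{1}{8a^2}(z-a)^2 + \cdots\Bigr),\\
a = \infty:&\quad r_\infty = 1,\ e_1 = 2,\qquad y = \pm (z^{-1})^{-1/2}.
\end{align*}
% In each case the ramification indices above a add up to d = 2.
% For the unique valuations v_0 above 0 and v_\infty above \infty:
%   v_0(z) = 2,\ v_0(y) = 1, \qquad v_\infty(z) = -2,\ v_\infty(y) = -1,
% while v(y) = 0 for every valuation v above a point a \neq 0, \infty.
```

So there is one valuation above $0$ and one above $\infty$, each with ramification index 2, and two unramified valuations above every other point; the valuations of $y$ add up to $1 + (-1) = 0$.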
We say that $v\in M_K$ is infinite with respect to $z$ if it lies above $\infty$, i.e., if $v(z) < 0$. We denote this by $v\mid\infty$; otherwise, we say that $v$ is finite with respect to $z$.
To get a uniform notation, if $v$ lies above $a$ and corresponds to the pair $(\rho,j)$, we write $z_v$ for $(z-a)^{1/e_\rho}$ if $a\neq\infty$ and for $z^{-1/e_\rho}$ if $a = \infty$, and $y_v$ for $y_{\rho j}$. Thus, for every valuation $v$ of $K$, $z_v$ is a local parameter for $v$, and $y\mapsto y_v$ defines an isomorphic embedding $\varphi_v: K\to k((z_v))$. The valuations defined above have the following properties:
\begin{align}
v(x) = 0 \text{ for all } v\in M_K &\iff x\in k^*, \tag{2.1.2}\\
\sum_{v\in M_K} v(x) &= 0 \quad\text{for } x \text{ in } K^* \text{ (Sum Formula)}. \tag{2.1.3}
\end{align}
For each non-zero $x\in K$, only finitely many summands are non-zero.
Let $L$ be a finite extension of $K$. On $L$ we define valuations in the same way as for $K$. Then
\[
\sum_{w\mid v} e(w|v) = [L:K] \quad\text{for } v\in M_K, \tag{2.1.4}
\]
where the sum is taken over all valuations $w$ of $L$ lying above $v$. More generally, we have the Extension Formula
\[
\sum_{w\mid v} w(x) = v(N_{L/K}(x)) \quad\text{for } x\in L. \tag{2.1.5}
\]
2.2 Heights
Let again $K$ be an algebraic function field in one variable over an algebraically closed field $k$ of characteristic 0 and $M_K$ its set of valuations over $k$. For a vector $\mathbf{x} = (x_1,\dots,x_n)\in K^n\setminus\{0\}$ we define
\[
v(\mathbf{x}) := -\min(v(x_1),\dots,v(x_n)) \quad\text{for } v\in M_K,
\]
and then
\[
H_K^{\mathrm{hom}}(\mathbf{x}) = H_K^{\mathrm{hom}}(x_1,\dots,x_n) := \sum_v v(\mathbf{x}),
\]
where as usual $\sum_v$ indicates that the sum is taken over all valuations $v\in M_K$. This is called the homogeneous height of $\mathbf{x}$ with respect to $K$. The height $H_K^{\mathrm{hom}}$ may be viewed as the function field analogue of the logarithmic height $h_L^{\mathrm{hom}}(\mathbf{x}) := \sum_v \log\max_i|x_i|_v$ for $\mathbf{x} = (x_1,\dots,x_n)\in L^n\setminus\{0\}$ relative to a number field $L$; this is $[L:\mathbb{Q}]$ times the absolute logarithmic height $h^{\mathrm{hom}}$ defined
in Section 1.9. It has become common practice to denote function field heights by capital $H$. By the Sum Formula we have
\[
H_K^{\mathrm{hom}}(\lambda\mathbf{x}) = H_K^{\mathrm{hom}}(\mathbf{x}) \quad\text{for } \lambda\in K^*. \tag{2.2.1}
\]
For instance, let $p_1,\dots,p_n\in k[z]$ with $\gcd(p_1,\dots,p_n) = 1$. Then
\[
H_{k(z)}^{\mathrm{hom}}(p_1,\dots,p_n) = \max(\deg p_1,\dots,\deg p_n). \tag{2.2.2}
\]
If $L$ is a finite extension of $K$, the valuations on $L$ may be constructed as above, and the height in $L$ may be defined accordingly. Furthermore, for $\mathbf{x} = (x_1,\dots,x_n)\in K^n\setminus\{0\}$ we have
\[
H_L^{\mathrm{hom}}(\mathbf{x}) = [L:K]\,H_K^{\mathrm{hom}}(\mathbf{x}). \tag{2.2.3}
\]
We define a height for elements of $K$ by
\[
H_K(x) := H_K^{\mathrm{hom}}(1,x) = -\sum_v \min(0,v(x)) \quad\text{for } x\in K. \tag{2.2.4}
\]
For instance, if $K = k(z)$ and $x = p/q$ where $p$, $q$ are coprime polynomials from $k[z]$, then $H_{k(z)}(x) = \max(\deg p,\deg q)$. From (2.1.4) and (2.1.5), one deduces that for any finite extension $L$ of $K$,
\[
H_L(x) = [L:K]\cdot H_K(x) \quad\text{for } x\in K, \tag{2.2.5}
\]
where $H_L(x)$ denotes the height of $x$ with respect to $L$.
We mention some properties of the height $H_K$. It is evident from (2.1.2) and (2.1.3) that
\[
H_K(x)\ge 0 \quad\text{for } x\in K,\qquad H_K(x) = 0 \iff x\in k.
\]
Further, from simple manipulations with valuations and from the Sum Formula it follows that
\[
H_K(x^m) = |m|\,H_K(x) \quad\text{for } x\in K^*,\ m\in\mathbb{Z}, \tag{2.2.6}
\]
and
\[
\left.
\begin{aligned}
&H_K(x+y)\\
&H_K(xy)
\end{aligned}
\right\} \le H_K(x)+H_K(y) \quad\text{for } x,y\in K. \tag{2.2.7}
\]
Next, from (2.2.4) and (2.2.6) it follows that
\[
H_K(x) = \tfrac12\bigl(H_K(x)+H_K(x^{-1})\bigr) = \tfrac12\sum_{v\in M_K}|v(x)| \ge \tfrac12|S| \quad\text{for } x\in K^*, \tag{2.2.8}
\]
where $S$ is the set of valuations $v\in M_K$ for which $v(x)\neq 0$.
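Since only finitely many valuations are non-zero at a given $x\in K^*$, the height can be computed from a finite table of values $v(x)$. The Python sketch below does this for $x = (z-1)^2(z+2)/(z-3)$ in $k(z)$, whose non-zero valuations are entered by hand; the dictionary keys are just labels for the places, and the function names are illustrative.

```python
def height(orders):
    """H_K(x) = -sum over v of min(0, v(x)), from the table of non-zero values v(x)."""
    return -sum(min(0, n) for n in orders.values())

def power_orders(orders, m):
    """Valuations of x**m: each v(x) is multiplied by m."""
    return {place: m * n for place, n in orders.items()}

# x = (z - 1)^2 (z + 2) / (z - 3): numerator of degree 3, denominator of degree 1
orders = {"1": 2, "-2": 1, "3": -1, "inf": -2}

assert sum(orders.values()) == 0                              # Sum Formula (2.1.3)
assert height(orders) == max(3, 1)                            # max(deg p, deg q)
assert 2 * height(orders) == sum(abs(n) for n in orders.values())   # equality in (2.2.8)
assert 2 * height(orders) >= len(orders)                      # H_K(x) >= |S| / 2
assert height(power_orders(orders, -2)) == 2 * height(orders) # (2.2.6) with m = -2
```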
Let $P\in K[X_1,\dots,X_r]$ be a non-zero polynomial, and $\{p_1,\dots,p_n\}$ the set of non-zero coefficients of $P$. We define
\[
v(P) := v(p_1,\dots,p_n) = -\min(v(p_1),\dots,v(p_n)) \quad\text{for } v\in M_K,
\qquad
H_K^{\mathrm{hom}}(P) := \sum_v v(P).
\]
We have obvious analogues of (2.2.1), (2.2.2) and (2.2.5) for polynomials. By Gauss’ Lemma for valuations (the method of proof is similar to that of Bombieri and Gubler (2006), lemma 1.6.3) we have for any two polynomials $P,Q\in K[X_1,\dots,X_r]$,
\[
v(PQ) = v(P)+v(Q) \quad\text{for } v\in M_K,
\]
hence
\[
H_K^{\mathrm{hom}}(PQ) = H_K^{\mathrm{hom}}(P)+H_K^{\mathrm{hom}}(Q). \tag{2.2.9}
\]
Suppose that $P = f_0(X-\alpha_1)\cdots(X-\alpha_g)$ with $f_0,\alpha_1,\dots,\alpha_g\in K$. Then by (2.2.9) and the Sum Formula, applied to $f_0$, we obtain
\[
H_K^{\mathrm{hom}}(P) = \sum_{i=1}^{g} H_K(\alpha_i) \ge \max_{1\le i\le g} H_K(\alpha_i). \tag{2.2.10}
\]
2.3 Derivatives and genus
For the moment, let $L$ be any field extension, not necessarily of finite type, which has transcendence degree 1 over an algebraically closed field $k$ of characteristic 0. The $L$-vector space $\Omega(L/k)$ of differentials of $L$ over $k$ may be constructed as follows. We start with taking a variable $\delta_x$ for every $x\in L$. Then let $V$ be the $L$-vector space consisting of all finite formal linear combinations $\sum_i y_i\delta_{x_i}$ with $y_i,x_i\in L$, and let $V_0$ be the $L$-linear subspace of $V$ generated by $\delta_{x+y}-\delta_x-\delta_y$ and $\delta_{xy}-x\delta_y-y\delta_x$ for $x,y\in L$ and $\delta_x$ for $x\in k$. Then define $\Omega(L/k) := V/V_0$. For $x\in L$ denote by $dx$ the residue class of $\delta_x$ modulo $V_0$. Thus, $\Omega(L/k)$ consists of all finite linear combinations $\omega = \sum_i y_i\,dx_i$, where $x_i,y_i\in L$, and we have $dx\neq 0$ for $x\in L\setminus k$, $dx = 0$ for $x\in k$, and $d(x+y) = dx+dy$, $d(xy) = x\,dy+y\,dx$ for $x,y\in L$. Consequently, $d(\lambda x) = \lambda\,dx$ for $x\in L$, $\lambda\in k$.
It is clear that if $L'$ is a subfield of $L$ then up to isomorphism, $\Omega(L'/k)$ is contained in $\Omega(L/k)$.
For any $x,y\in L$ with $y\notin k$, there is an irreducible polynomial $Q\in k[X,Y]$ such that $Q(x,y) = 0$. Then we have
\[
\frac{\partial Q}{\partial X}(x,y)\,dx + \frac{\partial Q}{\partial Y}(x,y)\,dy = 0.
\]
Hence there exists a function in $L$ (in fact in $k(x,y)$), which we denote by $dx/dy$, such that
\[
dx = \frac{dx}{dy}\cdot dy.
\]
We call $dx/dy$ the derivative of $x$ with respect to $y$. Notice that we have the chain rule
\[
\frac{dx}{dz} = \frac{dx}{dy}\cdot\frac{dy}{dz}
\]
for any $x,y,z\in L$ with $y,z\notin k$. As a consequence, if we fix $z\in L\setminus k$, then every differential $\omega\in\Omega(L/k)$ can be expressed as $(\omega/dz)\cdot dz$ with $\omega/dz\in L$.
Now let again $K$ be a function field in one variable over $k$, i.e., a finite type extension of transcendence degree 1 over $k$. For every valuation $v\in M_K$ we choose a local parameter $z_v$. Let $x\in K^*$. Then for $v\in M_K$ we can express $x$ as a formal Laurent series $\sum_{i=n_0}^{\infty} a_iz_v^i$ with $n_0 = v(x)$, $a_i\in k$ for $i\ge n_0$ and $a_{n_0}\neq 0$, and then $dx/dz_v = \sum_{i=n_0}^{\infty} ia_iz_v^{i-1}$. As a consequence,
\begin{align}
v\Bigl(\frac{dx}{dz_v}\Bigr) &= v(x)-1 \quad\text{for any } x \text{ in } K \text{ with } v(x)\neq 0, \tag{2.3.1}\\
v\Bigl(\frac{dx}{dz_v}\Bigr) &\ge 0 \quad\text{if } v(x) = 0. \tag{2.3.2}
\end{align}
This shows that $v(dx/dz_v)$ is independent of the choice of $z_v$. Indeed, let $z_v'$ be another local parameter for $v$. Then $v(z_v') = 1$, hence $v(dz_v'/dz_v) = 0$, and so $v(dx/dz_v') = v(dx/dz_v)$.
A differential ω of K over k is called holomorphic if v(ω/dzv ) ≥ 0 for all
v ∈ MK ; this notion is independent of the choice of the zv . It can be shown that
the holomorphic differentials of K over k form a finite dimensional k-vector
space. The dimension of this space is called the genus of K over k, denoted by
gK/k .
Let $x \in K \setminus k$ be arbitrary. It follows from the Sum Formula (2.1.3) and the chain rule $dx/dz_v = (dx/dz) \cdot (dz/dz_v)$ that
\[ \sum_{v \in M_K} v\left(\frac{dx}{dz_v}\right) = \sum_{v \in M_K} v\left(\frac{dz}{dz_v}\right) \]
is independent of $x$ provided the right-hand side of the equality is finite. We need only the following special case of the Riemann–Roch Theorem.
Theorem 2.3.1 We have
\[ \sum_{v \in M_K} v\left(\frac{dx}{dz_v}\right) = 2g_{K/k} - 2 \quad \text{for every } x \in K \setminus k. \]
It is not difficult to check that k(z) has genus 0.
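As a check in this simplest case, the following computation sketches how the sum in Theorem 2.3.1 evaluates for $K = k(z)$ and $x = z$. The local parameters $z - a$ at $a \in k$ and $z^{-1}$ at $\infty$ are the standard choices and are an assumption of this sketch, not taken from the text above.

```latex
\begin{align*}
 v = v_a\ (a \in k):&\quad z_v = z - a,\quad \frac{dz}{dz_v} = 1,\quad
     v\!\left(\frac{dz}{dz_v}\right) = 0,\\
 v = v_\infty:&\quad z_v = z^{-1},\quad z = z_v^{-1},\quad \frac{dz}{dz_v} = -z_v^{-2},\quad
     v\!\left(\frac{dz}{dz_v}\right) = -2,\\
 \sum_{v \in M_K} v\!\left(\frac{dz}{dz_v}\right) &= -2 = 2g_{K/k} - 2
     \;\Longrightarrow\; g_{K/k} = 0.
\end{align*}
```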
The following genus estimate will be useful.
Proposition 2.3.2 Let $z \in K \setminus k$, let $F = X^g + f_1 X^{g-1} + \cdots + f_g$ with coefficients $f_1, \ldots, f_g \in k[z]$, and suppose that $K$ is the splitting field of $F$ over $k(z)$. Then
\[ g_{K/k} \le (d - 1)\, g \max(\deg f_1, \ldots, \deg f_g), \]
where $d := [K : k(z)]$.
Proof. This is Lemma H in Schmidt (1978).
2.4 Effective computations
In order to perform effective computations in the function field K, it is necessary
to assume that the ground field k is presented explicitly in the sense of Fröhlich
and Shepherdson (1956). This means here that there is an algorithm to determine
the zeros of any polynomial with coefficients in k. In particular, in this case we
can perform all the field operations with elements of k.
Further, we assume that $K$ is presented explicitly. This means that $K$ is given in the form $k(z)(y)$, with $z$ a variable, and $y$ a primitive element of $K$ over $k(z)$, with an explicitly given defining polynomial $y^d + p_1(z) y^{d-1} + \cdots + p_d(z)$ over $k(z)$. We may assume that $y$ is integral over $k[z]$, that is, that $p_1(z), \ldots, p_d(z)$ are polynomials with coefficients in $k$.
Every element $x$ of $K$ can be expressed uniquely in the form
\[ \sum_{i=1}^{d} \frac{q_i(z)}{q(z)} \cdot y^{i-1}, \]
where q1 , . . . , qd , q are polynomials of k[z] such that gcd(q, q1 , . . . , qd ) = 1
and qd is monic. We call (q1 , . . . , qd , q) a representation for x, and we say that
x is given explicitly if a representation for x is given explicitly, and that x can
be determined effectively from certain given input data if there is an algorithm
to determine a representation for x from these data. From representations for
elements $x_1, x_2 \in K$ one can determine representations for $x_1 \pm x_2$, $x_1 x_2$ and $x_1/x_2$, if $x_2 \neq 0$.
One can easily compute a minimal polynomial of an explicitly given $x$ over $k[z]$, i.e., a polynomial $F = f_0 X^r + f_1 X^{r-1} + \cdots + f_r \in k[z][X]$ of minimal degree such that $F(x) = 0$ and $\gcd(f_0, \ldots, f_r) = 1$. Indeed, one starts by computing representations for $x^2, \ldots, x^d$. Then by straightforward linear algebra one can determine the smallest $r$, which is $\le d$, for which there exist $g_0, \ldots, g_r \in k(z)$, not all 0, such that $g_0 + g_1 x + \cdots + g_r x^r = 0$, and having found such, one obtains a minimal polynomial of $x$ by clearing denominators.
It is important to note that if k and K are presented explicitly, then the
valuations of K can be described explicitly. Specifically, in Section 2.1 we
gave, for every valuation v of K, a local parameter zv for v as well as a Laurent
series yv in zv , such that y → yv gives rise to an isomorphic embedding of K
into k((zv )). The pair (zv , yv ) can be determined from the defining polynomial
of y and the element of k ∪ {∞} above which v lies. By determining yv we
mean that by an inductive procedure we can determine the coefficients of yv
one by one. We say that the valuation v is given explicitly, if the pair (zv , yv ) is
given, i.e., the inductive procedure to compute the coefficients of yv is given.
If $x \in K$ and the valuation $v$ are given, then we can express $x$ as a Laurent series in $z_v$ by substituting $y_v$ for $y$ in the expression $\sum_{i=1}^{d} (q_i(z)/q(z)) \cdot y^{i-1}$ for $x$ and by expressing $z$ as a Laurent series in $z_v$. Then we can compute $v(x)$ by searching for the first non-zero coefficient in the Laurent series expansion for $x$.
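In the simplest case $K = k(z)$ this procedure reduces to counting zeros and poles. The sketch below is only an illustration under assumptions not made in the text: it works over the rationals in place of an algebraically closed $k$, it handles only rational functions whose zeros and poles are rational, and the helper names `order_at` and `valuation` are hypothetical. It computes $v(x)$ at a finite place $z = a$ and at $\infty$, and checks the Sum Formula on one example.

```python
from fractions import Fraction


def order_at(poly, a):
    """Multiplicity of a as a root of a non-zero polynomial.

    poly lists the coefficients from degree 0 upwards, with a non-zero
    leading coefficient.
    """
    poly = [Fraction(c) for c in poly]
    a = Fraction(a)
    mult = 0
    while len(poly) > 1:
        # Synthetic (Horner) division of poly by (z - a).
        quotient = []
        carry = Fraction(0)
        for c in reversed(poly):
            carry = carry * a + c
            quotient.append(carry)
        remainder = quotient.pop()  # the last carry equals poly(a)
        if remainder != 0:
            break
        poly = list(reversed(quotient))
        mult += 1
    return mult


def valuation(num, den, place):
    """v(x) for x = num/den; place is a rational number a or the string 'inf'."""
    if place == "inf":
        return (len(den) - 1) - (len(num) - 1)  # deg(den) - deg(num)
    return order_at(num, place) - order_at(den, place)


# x = (z - 1)^2 / (z (z + 2)^3)
num = [1, -2, 1]          # (z - 1)^2
den = [0, 8, 12, 6, 1]    # z^4 + 6 z^3 + 12 z^2 + 8 z
places = [1, 0, -2, "inf"]
values = {p: valuation(num, den, p) for p in places}
print(values)
print(sum(values.values()))  # Sum Formula: the total is 0
```

For a general $K = k(z)(y)$ one would instead expand $y$ as a Laurent series in $z_v$, as described above; the sketch does not attempt this.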
Further details may be found in Eichler (1966), chapter III, section 1 and
Mason (1984), chapter V.
We recall a result of Mason (1984), p. 11, lemma 1.
Proposition 2.4.1 Suppose that k, K are presented explicitly, and a finite set
S of valuations of K and integers nv (v ∈ S) are explicitly given. Then we can
determine effectively whether there exists an element x in K such that
\[ v(x) = n_v \ \text{for } v \in S, \qquad v(x) = 0 \ \text{for } v \in M_K \setminus S. \tag{2.4.1} \]
Moreover, if such an x exists then it may be computed, and it is unique up to a
non-zero factor in k.
Proof. We do not lose any generality by augmenting S with a finite set of
explicitly given valuations and setting nv := 0 for the added valuations. So,
without loss of generality, we may assume that S contains all valuations that
lie above {∞, a1 , . . . , at }, where a1 , . . . , at are certain elements of k.
Further, it is enough to prove the assertion for the case when nv ≥ 0 for all
finite v ∈ S, i.e., not lying above ∞; then the elements x under consideration
are integral over $k[z]$. For assume that $x$ satisfies (2.4.1). Denote by $e_v$ the ramification index of $v$ over the element of $k \cup \{\infty\}$ over which it lies. Choose an integer $m$ such that $m > 0$ and $m + n_v \ge 0$ for $v \in S$, and put $q(z) := \prod_{j=1}^{t} (z - a_j)^m$. Then if $v \mid \infty$ we have $v(q(z)x) = n_v - e_v m t =: n_v'$, while if $v \mid a_i$ for some $i$ we have $v(q(z)x) = n_v + e_v m =: n_v' \ge 0$. Further, for $v$ outside $S$ we have $v(q(z)) = 0$ and so $v(q(z)x) = 0$. Clearly, $q(z)$ can be determined effectively. So it suffices to prove our assertion with $n_v'$ ($v \in S$) instead of $n_v$.
Recall that K is explicitly given in the form k(z)(y), with an explicitly given
minimal polynomial of y over k(z) which is monic and has its coefficients in
k[z]. Consider now the system of equations (2.4.1) in x, where it is assumed that
nv ≥ 0 for v ∈ S with v finite. Thus, the elements x ∈ K under consideration
are integral over k[z]. Each such x can be expressed in a unique way in the form
\[ x = \sum_{i=1}^{d} q_i(z)\, y^{i-1} \]
with $q_i(z) \in k(z)$, $i = 1, \ldots, d$. Denote by $\sigma_1, \ldots, \sigma_d$ the distinct embeddings of $K$ in a fixed algebraic closure of $K$. Then we infer that
\[ \sigma_j(x) = \sum_{i=1}^{d} q_i(z)\,(\sigma_j(y))^{i-1} \quad \text{for } j = 1, \ldots, d. \tag{2.4.2} \]
Let $D = \det(\sigma_j(y)^{i-1})^2 = \prod_{1 \le i < j \le d} (\sigma_i(y) - \sigma_j(y))^2$, i.e., $D$ is the discriminant of $y$. It has an explicit expression as a polynomial with integer
coefficients in terms of the coefficients of the minimal polynomial of y. Hence
it belongs to k[z] and is effectively computable. Since by assumption k is
presented explicitly, we can determine the zeros of D in k together with their
multiplicities. Hence we can give all valuations of K lying above the zeros of D
explicitly, and for each such v we can determine v(D). We may augment S with
these valuations. Thus, without loss of generality, v(D) = 0 for v outside S.
It follows from (2.4.2) that, for each i, Dqi (z) may be expressed as a
polynomial with integer coefficients in the σj (x) and σj (y). But σj (x) and
σj (y) are integral over k[z] for each j , hence Dqi (z) ∈ k[z] for all i. By
selecting σ1 , . . . , σd to be the embeddings corresponding to the infinite
valuations on K, we may determine an integer u depending only on the
integers nv (v|∞) such that ord∞ (Dqi (z)) ≥ −u for 1 ≤ i ≤ d, and hence
each Dqi (z) is a polynomial of degree at most u. Consequently, we may write
\[ Dx = \sum_{j=0}^{u} \sum_{i=1}^{d} a_{ij} z^j y^{i-1} \tag{2.4.3} \]
with some $a_{ij}$ in $k$ to be determined.
We now prove that, assuming that $x$ satisfies (2.4.3) and $x \neq 0$, we can replace (2.4.1) by the finite list of conditions
\[ v(x) \ge n_v \quad \text{for } v \in S. \tag{2.4.4} \]
Indeed, by the Sum Formula we must have $\sum_{v \in S} n_v = 0$; otherwise, (2.4.1) is not solvable. By (2.4.3) we have $v(Dx) \ge 0$ for every finite valuation $v$. So $v(x) \ge 0$ for every valuation $v$ outside $S$. Suppose that $v(x) > n_v$ for some $v \in S$ or $v(x) > 0$ for some $v$ outside $S$. Then
\[ 0 = \sum_{v \in M_K} v(x) > \sum_{v \in S} n_v = 0, \]
a contradiction.
So (2.4.1) is equivalent to the combination of (2.4.3) and (2.4.4). By replacing in (2.4.3) $z$ and $y$ by their Laurent expansions in terms of $z_v$, we obtain for $Dx$ an expansion $\sum_{m=m_v}^{\infty} L_{vm}(\mathbf{a}) z_v^m$ where every term $L_{vm}(\mathbf{a})$ is a linear form with coefficients in $k$ in the coefficients $a_{ij}$ in (2.4.3). Thus, $x$ satisfies (2.4.3) and (2.4.4) if and only if the $a_{ij}$ satisfy the finite system of linear equations
\[ L_{vm}(\mathbf{a}) = 0 \quad \text{for } v \in S,\ m = m_v, m_v + 1, \ldots, n_v + v(D) - 1. \]
Now we can decide whether this system of linear equations has a non-zero
solution in k, and if so, compute one. Consequently, we may determine
whether there exists an element x in K with (2.4.1) and if so, compute such
an x. Finally, if there are two elements x1 and x2 in K which satisfy (2.4.1),
then v(x1 /x2 ) = 0 for all v, whence x1 /x2 ∈ k. Thus, x is unique apart from
a non-zero factor from k. This completes the proof.
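For $K = k(z)$ the linear system of the proof is not needed, since a solution can be written down directly as a product of powers of $z - a$. The sketch below covers only this genus 0 case, over the rationals and with rational finite places; these restrictions and the function name `solve_valuation_conditions` are assumptions made for illustration. It decides solvability of (2.4.1) through the Sum Formula and returns the exponents of one solution.

```python
from fractions import Fraction


def solve_valuation_conditions(n):
    """Decide (2.4.1) for K = Q(z).

    n maps places (rational numbers a, or the string 'inf') to the
    prescribed integers n_v; all other places get the value 0.
    Returns the exponents {a: n_a} of x = prod (z - a)^(n_a) when a
    solution exists, and None otherwise.
    """
    finite = {Fraction(a): e for a, e in n.items() if a != "inf" and e != 0}
    n_inf = n.get("inf", 0)
    # v_inf(prod (z - a)^(n_a)) = -sum(n_a), so the n_v must sum to 0.
    if n_inf != -sum(finite.values()):
        return None
    return finite


print(solve_valuation_conditions({1: 2, 0: -1, -2: -3, "inf": 2}))
print(solve_valuation_conditions({1: 1, "inf": 0}))
```

The first call describes $x = (z-1)^2 z^{-1} (z+2)^{-3}$, unique up to a non-zero constant factor; the second call returns `None` because the prescribed values do not sum to 0.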
Proposition 2.4.2 Let $a_1, \ldots, a_r, b$ be explicitly given elements of $K$. Then it can be determined effectively whether
\[ \xi_1 a_1 + \cdots + \xi_r a_r = b \tag{2.4.5} \]
is solvable in $(\xi_1, \ldots, \xi_r) \in k^r$. If so, the set of solutions in $k^r$ of (2.4.5) is a linear variety of dimension $r - \operatorname{rank}_k(a_1, \ldots, a_r)$, a parameter representation of which can be determined effectively.
Proof. First assume that $a_1, \ldots, a_r$ are linearly independent over $k$. Recall that from any given $x \in K$, we can effectively determine its derivatives $x^{(j)} := d^j x/dz^j$ for all $j \ge 0$. Now, clearly, if $\xi_1, \ldots, \xi_r \in k$ satisfy (2.4.5), then
\[ \xi_1 a_1^{(j)} + \cdots + \xi_r a_r^{(j)} = b^{(j)} \quad (j = 0, \ldots, r - 1). \]
Since $a_1, \ldots, a_r$ are linearly independent over $k$, the Wronskian determinant $\det(a_i^{(j-1)})_{i,j=1,\ldots,r}$ is non-zero. Hence the latter system has a unique solution $(\xi_1, \ldots, \xi_r) \in K^r$ which can be determined effectively. Then it can be checked whether this solution belongs to $k^r$.
Now suppose that rank k {a1 , . . . , ar } = m < r. By means of the above procedure, we can select a k-linearly independent subset of m elements from
{a1 , . . . , ar } and express the other elements as k-linear combinations of this
subset. Assume that {a1 , . . . , am } is k-linearly independent. Check with the
above procedure whether b is a k-linear combination of a1 , . . . , am . If so,
express b and am+1 , . . . , ar as k-linear combinations of a1 , . . . , am , substitute
these into (2.4.5) and compare the coefficients of a1 , . . . , am . Thus, one can
rewrite (2.4.5) as a system of linear equations with coefficients in k, whose
solution set is a linear variety of dimension r − m, and it is straightforward to
compute a parameter representation of the latter.
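To illustrate the first step of the proof, here is a small worked instance in $K = k(z)$ with $r = 2$, $a_1 = 1$, $a_2 = z$. These elements and the two right-hand sides $b$ are chosen here only for illustration, and $b'$ denotes $db/dz$.

```latex
\begin{align*}
 &\text{System for } j = 0, 1:\quad
     \xi_1 \cdot 1 + \xi_2 \cdot z = b,\qquad \xi_1 \cdot 0 + \xi_2 \cdot 1 = b',\\
 &\text{Wronskian: } \det\begin{pmatrix} 1 & 0\\ z & 1 \end{pmatrix} = 1 \neq 0,
     \qquad \xi_2 = b',\quad \xi_1 = b - z\,b'.\\
 &b = 3 + 2z:\quad (\xi_1, \xi_2) = (3, 2) \in k^2,
     \ \text{so (2.4.5) is solvable};\\
 &b = z^2:\quad (\xi_1, \xi_2) = (-z^2, 2z) \notin k^2,
     \ \text{so (2.4.5) is not solvable}.
\end{align*}
```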
3 Tools from Diophantine approximation and transcendence theory
In this chapter, we have collected some fundamental results from Diophantine
approximation and transcendence theory on which the main results of this book
are based. Section 3.1 is about Schmidt’s Subspace Theorem and its variations.
These will be applied in Chapter 6. In Section 3.2 we recall the best known
effective estimates for linear forms in logarithms, which are used in Chapters 4
and 5. For more details and background, we refer to Schmidt (1980), Evertse
and Schlickewei (2002), Bombieri and Gubler (2006), chapter 6, 7 and Baker
and Wüstholz (2007).
3.1 The Subspace Theorem and some variations
In this section we formulate some versions of the Subspace Theorem that
are used in Chapter 6. In particular, we recall the p-adic Subspace Theorem,
the Parametric Subspace Theorem, and a quantitative version of a special
case of the latter. We start with a brief introduction, taking as starting point
Roth’s celebrated Theorem on the approximation of algebraic numbers by
rationals.
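Theorem 3.1.1 Let $\alpha \in \mathbb{R} \setminus \mathbb{Q}$ be an algebraic number and $\varepsilon > 0$. Then there are only finitely many pairs $(x, y) \in \mathbb{Z}^2$ with $y > 0$ such that
\[ \left| \alpha - \frac{x}{y} \right| \le \max(|x|, |y|)^{-2-\varepsilon}. \tag{3.1.1} \]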
Proof. See Roth (1955). Roth’s proof consists of two steps: first the deduction of
a non-vanishing result for polynomials, now known as Roth’s Lemma; second,
under the assumption that Theorem 3.1.1 is false the construction of an auxiliary
polynomial that violates Roth’s Lemma.
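A numerical experiment can make the statement concrete, although a finite search of course proves nothing about finiteness. The sketch below lists the solutions of (3.1.1) with $y$ up to a bound for $\alpha = \sqrt{2}$ and $\varepsilon = 1/2$; the choice of $\alpha$, $\varepsilon$ and the search bound, the use of floating-point arithmetic, and the function name `roth_solutions` are all assumptions made for illustration.

```python
import math


def roth_solutions(alpha, eps, max_y):
    """Pairs (x, y), 1 <= y <= max_y, with |alpha - x/y| <= max(|x|,|y|)^(-2-eps)."""
    found = []
    for y in range(1, max_y + 1):
        # The inequality forces |x - alpha*y| <= 1, so a small window
        # around the nearest integer to alpha*y contains every candidate.
        centre = round(alpha * y)
        for x in range(centre - 2, centre + 3):
            if abs(alpha - x / y) <= max(abs(x), abs(y)) ** (-2 - eps):
                found.append((x, y))
    return found


solutions = roth_solutions(math.sqrt(2), 0.5, 2000)
print(solutions)
```

In this range the search finds only the pair $(1, 1)$. For $\sqrt{2}$ this is consistent with the elementary lower bound $|\sqrt{2} - x/y| \ge 1/(y^2(\sqrt{2} + x/y))$, which already rules out large $y$ without appealing to Roth's Theorem.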
Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:26:50, subject to the Cambridge
Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.005
Weaker versions of Roth’s Theorem were proved earlier by Thue (1909) with exponent $\frac{1}{2}d + 1$ instead of 2 where $d = \deg \alpha$, Siegel (1921) with exponent $2\sqrt{d}$, and Dyson (1947) and Gelfond (1960) (result proved in the late 1940s), both with exponent $\sqrt{2d}$. The proofs of Thue–Roth are all ineffective, in that
they do not provide a method to determine the solutions of the inequality under
consideration.
Extending earlier work of Ridout (1958), Lang (1960) proved a generalization of Roth’s Theorem, usually referred to as the p-adic Roth’s Theorem,
where the underlying inequality takes its solutions from an algebraic number
field and where various archimedean and non-archimedean absolute values
from this number field are involved.
Roth’s Theorem was generalized in another direction by W. M. Schmidt to
simultaneous approximation. His work culminated in his so-called Subspace
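Theorem. Below, we denote by $\|\cdot\|$ the maximum norm on $\mathbb{C}^n$, i.e., $\|\mathbf{x}\| := \max(|x_1|, \ldots, |x_n|)$ for $\mathbf{x} = (x_1, \ldots, x_n) \in \mathbb{C}^n$.
Theorem 3.1.2 (Subspace Theorem) Let $n \ge 2$ and let $L_i = \sum_{j=1}^{n} \alpha_{ij} X_j$ ($i = 1, \ldots, n$) be linearly independent linear forms with algebraic coefficients in $\mathbb{C}$ and let $\varepsilon > 0$. Then the set of solutions of the inequality
\[ |L_1(\mathbf{x}) \cdots L_n(\mathbf{x})| \le \|\mathbf{x}\|^{-\varepsilon} \quad \text{in } \mathbf{x} \in \mathbb{Z}^n \setminus \{\mathbf{0}\} \tag{3.1.2} \]
is contained in a finite union of proper linear subspaces of $\mathbb{Q}^n$.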
Proof. See Schmidt (1972) or Schmidt (1980).
In general, inequalities of the shape (3.1.2) need not have finitely many
solutions.
Theorem 3.1.2 =⇒ Theorem 3.1.1. Notice that if $(x, y) \in \mathbb{Z}^2$ with $y > 0$ is a solution of (3.1.1), then
\[ |y(x - \alpha y)| \le \max(|x|, |y|)^{-\varepsilon}. \]
Now Theorem 3.1.2 implies that the solutions of the latter, hence of (3.1.1), lie in finitely many one-dimensional subspaces of $\mathbb{Q}^2$. But the solutions of (3.1.1) in a given one-dimensional subspace of $\mathbb{Q}^2$ are all of the shape $m(x_0, y_0)$ with $(x_0, y_0)$ a fixed pair of integers with $\gcd(x_0, y_0) = 1$ and $y_0 > 0$, and $m \in \mathbb{Z}_{>0}$. By substituting this into (3.1.1) we see that $m$ is bounded. This shows that a given one-dimensional subspace of $\mathbb{Q}^2$ contains only finitely many solutions of (3.1.1).
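The boundedness of $m$ in the last step can be made explicit as follows. This is a short sketch of the substitution, using only that $\alpha$ is irrational, so that $|\alpha - x_0/y_0| > 0$.

```latex
\begin{align*}
 \left|\alpha - \frac{x_0}{y_0}\right| = \left|\alpha - \frac{m x_0}{m y_0}\right|
   &\le \max(|m x_0|, |m y_0|)^{-2-\varepsilon}
    = m^{-2-\varepsilon} \max(|x_0|, |y_0|)^{-2-\varepsilon},\\
 \text{hence}\quad m^{2+\varepsilon}
   &\le \left|\alpha - \frac{x_0}{y_0}\right|^{-1} \max(|x_0|, |y_0|)^{-2-\varepsilon}.
\end{align*}
```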
Schmidt (1975) generalized his Subspace Theorem to inequalities of which
the unknowns are taken from an algebraic number field, and Schlickewei
(1977b) extended this further to inequalities involving both archimedean and non-archimedean absolute values, thus generalizing both Lang’s p-adic Roth’s Theorem mentioned above and Schmidt’s Subspace Theorem. We give a reformulation of his result that is better adapted to our purposes.
Let $K$ be an algebraic number field. We use the absolute values $|\cdot|_v$ ($v \in M_K$) defined in Section 1.7. Let $S$ be a finite set of places of $K$, containing all infinite places. Recall that the ring of $S$-integers of $K$ is given by $O_S = \{x \in K : |x|_v \le 1 \text{ for } v \in M_K \setminus S\}$. We define the $S$-height of $\mathbf{x} = (x_1, \ldots, x_n) \in O_S^n$ by
\[ H_S(\mathbf{x}) = H_S(x_1, \ldots, x_n) := \prod_{v \in S} \max(|x_1|_v, \ldots, |x_n|_v). \]
It follows easily from the Product Formula that $H_S(\varepsilon \mathbf{x}) = H_S(\mathbf{x})$ for $\varepsilon \in O_S^*$. We shall show below that for any $C > 0$ there are, up to multiplication with a scalar from $O_S^*$, only finitely many vectors $\mathbf{x} \in O_S^n$ with $H_S(\mathbf{x}) \le C$.
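The invariance of the $S$-height under $S$-units is easy to test in the simplest case. The sketch below assumes $K = \mathbb{Q}$ with the usual absolute value and the $p$-adic absolute values $|x|_p = p^{-\operatorname{ord}_p(x)}$, and takes $S = \{\infty, 2, 3\}$; the function names `abs_p` and `s_height` and the sample vector are hypothetical choices made for illustration.

```python
from fractions import Fraction
from math import prod


def abs_p(x, p):
    """p-adic absolute value |x|_p = p^(-ord_p(x)) of a rational number x."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    v, num, den = 0, x.numerator, x.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(1, p) ** v


def s_height(x, primes):
    """H_S(x) for K = Q and S = {infinity} together with the given primes."""
    archimedean = max(abs(Fraction(c)) for c in x)
    finite = prod(max(abs_p(c, p) for c in x) for p in primes)
    return archimedean * finite


primes = [2, 3]
x = (Fraction(6), Fraction(1, 4), Fraction(5))   # a vector of S-integers
eps = Fraction(8, 9)                             # an S-unit: 2^3 * 3^(-2)
scaled = tuple(eps * c for c in x)
print(s_height(x, primes), s_height(scaled, primes))
```

Both heights are equal to 24: scaling by the $S$-unit changes each local factor, but the changes cancel in the product over $S$, as the Product Formula predicts.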
Theorem 3.1.3 (p-adic Subspace Theorem) For $v \in S$, let $L_{1v}, \ldots, L_{nv}$ be linearly independent linear forms in $X_1, \ldots, X_n$ with coefficients in $K$. Further, let $\varepsilon > 0$. Then the set of solutions of
\[ \prod_{v \in S} |L_{1v}(\mathbf{x}) \cdots L_{nv}(\mathbf{x})|_v \le H_S(\mathbf{x})^{-\varepsilon} \quad \text{in } \mathbf{x} \in O_S^n \setminus \{\mathbf{0}\} \tag{3.1.3} \]
is contained in a union of finitely many proper linear subspaces of $K^n$.
Proof. This is a reformulation of a result of Schlickewei (1977b). His proof is
based on his earlier papers Schlickewei (1976a, 1976b, 1976c). A special case
of Schlickewei’s result was proved independently by Dubois and Rhin (1975).
A complete proof of Schlickewei’s theorem can also be found in Bombieri and
Gubler (2006), chapter 7.
Theorem 3.1.3 =⇒ Theorem 3.1.2. Let $L_1, \ldots, L_n$ be the linear forms from Theorem 3.1.2. Let $K$ be the algebraic number field generated by the coefficients of $L_1, \ldots, L_n$ and their conjugates, and suppose that $K$ has degree $d$. Let $S$ be the set of infinite places of $K$. Recall that if $v$ is an infinite place of $K$, then either $|\cdot|_v = |\sigma(\cdot)|$ if $v = \sigma$ is a real embedding of $K$ or $|\cdot|_v = |\sigma(\cdot)|^2$ if $v = \{\sigma, \overline{\sigma}\}$ is a pair of conjugate complex embeddings of $K$. For either $v = \sigma$ a real embedding or $v = \{\sigma, \overline{\sigma}\}$ a pair of conjugate complex embeddings, we put $L_{iv} := \sigma^{-1}(L_i)$, where $\sigma^{-1}(L_i)$ is the linear form obtained by applying $\sigma^{-1}$ to the coefficients of $L_i$. For $\mathbf{x} \in \mathbb{Z}^n$, the left- and right-hand sides of (3.1.3) are precisely the $d$-th powers of the left- and right-hand sides of (3.1.2). Thus, for $\mathbf{x} \in \mathbb{Z}^n$, inequality (3.1.2) implies (3.1.3), and then an application of Theorem 3.1.3 implies that the solutions of (3.1.2) lie in a union of finitely many proper linear subspaces of $\mathbb{Q}^n$.
Schmidt’s proof of Theorem 3.1.2 is basically an extension of Roth’s method,
i.e., the construction of an auxiliary polynomial and an application of Roth’s
Lemma, combined with techniques from the geometry of numbers. The arguments of Schlickewei and Dubois and Rhin are essentially a “p-adization”
of Schmidt’s method. In their groundbreaking paper Faltings and Wüstholz
(1994) gave a totally different proof of Theorem 3.1.3, where they avoided
the use of geometry of numbers by applying a very powerful generalization of
Roth’s Lemma due to Faltings, his Product Theorem, see Faltings (1991). We
mention that both the method of Schmidt and that of Faltings and Wüstholz are
ineffective, in that they do not provide a method to determine the subspaces
containing the solutions of the inequality under consideration.
Theorems 3.1.2 and 3.1.3 are very powerful tools to obtain finiteness results
for various types of Diophantine equations, such as unit equations, norm form
equations, decomposable form equations and exponential-polynomial equations, see Chapters 6, 9 and Section 10.11 in the present book. The proofs of
these finiteness results are all ineffective, in the sense that they do not provide a method to determine the solutions. On the other hand, there are now
good quantitative versions of Theorems 3.1.2 and 3.1.3, giving explicit upper
bounds for the number of subspaces, that led to explicit upper bounds for
the numbers of solutions of the above mentioned equations. Schmidt (1989)
obtained a quantitative version of Theorem 3.1.2, giving an explicit upper
bound for the number of subspaces containing the “large” solutions. Schlickewei (1992) generalized this, and obtained a quantitative version of Theorem
3.1.3. This was substantially improved by Evertse (1996), by using a quantitative version of Faltings’ Product Theorem. Schlickewei made the important
observation that a sufficiently good quantitative version of the so-called Parametric Subspace Theorem (see below), which deals with a parametrized class
of twisted heights, would lead to much better bounds for the number of solutions of certain classes of Diophantine equations, than quantitative versions
of Theorem 3.1.3. In Schlickewei (1996a), he proved a special case of such a
quantitative version, and applied this to obtain sharper estimates for the zero
multiplicity of a linear recurrence sequence (see Section 10.11). Evertse and
Schlickewei (2002) sharpened and extended Schlickewei’s result, and obtained
a completely general quantitative version of the Parametric Subspace Theorem.
For more historical information we refer to Evertse and Schlickewei (1999).
Evertse and Schlickewei essentially followed Schmidt’s proof of his Theorem
3.1.2, with the necessary refinements. Evertse and Ferretti (2013) obtained
a further improvement, following also Schmidt’s proof scheme, but inserting
ideas from Faltings and Wüstholz (1994).
We first state the Parametric Subspace Theorem in a qualitative form and
then give, in a special case relevant for our purposes, a quantitative version
of the latter. The Parametric Subspace Theorem is in fact a generalization of
Theorem 3.1.3, although this is not obvious at a first glance.
Let again $K$ be a number field, $S$ a finite set of places of $K$ containing all infinite places, $\varepsilon > 0$, and for $v \in S$, let $\{L_{1v}, \ldots, L_{nv}\}$ be a system of linearly independent linear forms in $K[X_1, \ldots, X_n]$. Take a solution $\mathbf{x} \in O_S^n \setminus \{\mathbf{0}\}$ of (3.1.3). Assume that the left-hand side of (3.1.3) is non-zero. Write
\[ |L_{iv}(\mathbf{x})|_v = H_S(\mathbf{x})^{d_{iv}} \quad (v \in S,\ i = 1, \ldots, n), \]
\[ L_{iv} := X_i, \quad d_{iv} := 0 \quad (v \in M_K \setminus S,\ i = 1, \ldots, n), \]
\[ \mathbf{d} = (d_{iv} : v \in M_K,\ i = 1, \ldots, n), \qquad Q := H_S(\mathbf{x}). \]
Define the so-called twisted height:
\[ H_{Q,\mathbf{d}}(\mathbf{x}) := \prod_{v \in M_K} \max_{1 \le i \le n} |L_{iv}(\mathbf{x})|_v\, Q^{-d_{iv}}. \tag{3.1.4} \]
Notice that by (3.1.3) we have
\[ \sum_{v \in M_K} \sum_{i=1}^{n} d_{iv} \le -\varepsilon, \]
and that
\[ H_{Q,\mathbf{d}}(\mathbf{x}) \le 1. \]
In the above observations, both $\mathbf{d}$ and $Q$ vary with $\mathbf{x}$. The Parametric Subspace Theorem deals with inequalities involving twisted heights, where $Q$ varies but $\mathbf{d}$ is fixed.
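The inequality for the twisted height can be verified place by place. The following lines are a sketch of that check, using only the definitions above: the exponents $d_{iv}$ defined for $v \in S$, the convention $L_{iv} = X_i$, $d_{iv} = 0$ outside $S$, and the fact that the coordinates of the solution are $S$-integers.

```latex
\begin{align*}
 v \in S:&\quad |L_{iv}(\mathbf{x})|_v\, Q^{-d_{iv}}
     = H_S(\mathbf{x})^{d_{iv}}\, H_S(\mathbf{x})^{-d_{iv}} = 1
     \quad (i = 1, \ldots, n),\\
 v \in M_K \setminus S:&\quad \max_{1 \le i \le n} |L_{iv}(\mathbf{x})|_v\, Q^{-d_{iv}}
     = \max_{1 \le i \le n} |x_i|_v \le 1
     \quad \text{since } \mathbf{x} \in O_S^n,\\
 \text{hence}&\quad H_{Q,\mathbf{d}}(\mathbf{x})
     = \prod_{v \in M_K \setminus S} \max_{1 \le i \le n} |x_i|_v \le 1.
\end{align*}
```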
Theorem 3.1.4 (Parametric Subspace Theorem) Let $K$ be an algebraic number field and $S$ a finite set of places of $K$ containing all infinite places. Further, let $n \ge 2$, let $\{L_{1v}, \ldots, L_{nv}\}$ ($v \in S$) be systems of linearly independent linear forms from $K[X_1, \ldots, X_n]$, and let $\mathbf{d} = (d_{iv} : v \in M_K,\ i = 1, \ldots, n)$ be a tuple of reals such that
\[ d_{iv} = 0 \quad \text{for } v \in M_K \setminus S,\ i = 1, \ldots, n. \]
Put
\[ \mu := \frac{1}{n} \sum_{v \in M_K} \sum_{i=1}^{n} d_{iv}. \]
Then for every $\delta > 0$ there are $Q_0$ and a finite collection $\{T_1, \ldots, T_t\}$ of proper linear subspaces of $K^n$ such that for every $Q \ge Q_0$ there is $T \in \{T_1, \ldots, T_t\}$ with
\[ \{\mathbf{x} \in K^n : H_{Q,\mathbf{d}}(\mathbf{x}) \le Q^{-\mu-\delta}\} \subseteq T. \]
Proof. This was first formulated by Evertse and Schlickewei (2002), in a quantitative form with explicit upper bounds for $Q_0$ and $t$. In fact, in their paper, Evertse and Schlickewei proved an “Absolute Parametric Subspace Theorem”, with solutions $\mathbf{x}$ taken from $\overline{\mathbb{Q}}^n$ instead of $K^n$.
Below we deduce Theorem 3.1.3 from Theorem 3.1.4. We first prove
a lemma. We keep our convention that K is a number field and S a
finite set of places of K, containing all infinite places. Further, we set
$d := [K : \mathbb{Q}]$, $s := |S|$. For $\mathbf{x} = (x_1, \ldots, x_n) \in K^n$, $v \in M_K$, we put $\|\mathbf{x}\|_v := \max(|x_1|_v, \ldots, |x_n|_v)$.
Lemma 3.1.5
(i) There is a constant $C$ depending only on $K$ and $S$, such that for every $\mathbf{x} \in O_S^n \setminus \{\mathbf{0}\}$, there is $\varepsilon \in O_S^*$ with
\[ \|\varepsilon \mathbf{x}\|_v \le C\, H_S(\mathbf{x})^{1/s} \quad \text{for } v \in M_K. \]
(ii) For every $A > 0$ there are, up to multiplication with a scalar from $O_S^*$, only finitely many vectors $\mathbf{x} \in O_S^n$ with $H_S(\mathbf{x}) \le A$.
Proof. (i) Let $S = \{v_1, \ldots, v_s\}$ and
\[ H := \{\mathbf{x} = (x_1, \ldots, x_s) \in \mathbb{R}^s : x_1 + \cdots + x_s = 0\}. \]
Then by the S-unit Theorem (see Theorem 1.8.1) the map
\[ \mathrm{LOG}_S : \varepsilon \mapsto (\log |\varepsilon|_{v_1}, \ldots, \log |\varepsilon|_{v_s}) \]
maps $O_S^*$ to an $(s - 1)$-dimensional lattice in $H$.
Let $\mathbf{x} \in O_S^n \setminus \{\mathbf{0}\}$. Then the point
\[ \mathbf{a} := \left(s^{-1} \log H_S(\mathbf{x}) - \log \|\mathbf{x}\|_{v_1}, \ldots, s^{-1} \log H_S(\mathbf{x}) - \log \|\mathbf{x}\|_{v_s}\right) \]
lies in $H$. Choose $\varepsilon \in O_S^*$ such that the lattice point $\mathrm{LOG}_S(\varepsilon)$ is closest to $\mathbf{a}$. Then in fact $\|\mathbf{a} - \mathrm{LOG}_S(\varepsilon)\| \le \gamma$, where $\|\cdot\|$ is the maximum norm on $\mathbb{R}^s$ and $\gamma$ is a constant depending only on $K$, $S$. This $\varepsilon$ satisfies (i) with $C = e^{\gamma}$.
(ii) Consider $\mathbf{x} \in O_S^n$ with $H_S(\mathbf{x}) \le A$. After multiplying $\mathbf{x}$ with a suitable $S$-unit, we can arrange that $\|\mathbf{x}\|_v \le C \cdot A^{1/s}$ for $v \in S$. Then the coordinates $x_1, \ldots, x_n$ of $\mathbf{x}$ have absolute heights
\[ H(x_i) := \Big( \prod_{v \in M_K} \max(1, |x_i|_v) \Big)^{1/d} \le C^{s/d} A^{1/d} \quad \text{for } i = 1, \ldots, n. \]
By Northcott’s Theorem (see Theorem 1.9.3) this leaves only finitely many possibilities for $x_1, \ldots, x_n$, hence for $\mathbf{x}$.
Theorem 3.1.4 =⇒ Theorem 3.1.3. By Lemma 3.1.5 (i), it suffices to show that the solutions $\mathbf{x} \in O_S^n \setminus \{\mathbf{0}\}$ of (3.1.3) with the additional property
\[ \|\mathbf{x}\|_v \le C\, H_S(\mathbf{x})^{1/s} \quad \text{for } v \in S \tag{3.1.5} \]
lie in finitely many proper linear subspaces of $K^n$. Lemma 3.1.5 (ii) implies that by assuming $H_S(\mathbf{x})$ to be sufficiently large, we exclude at most finitely many one-dimensional subspaces of solutions $\mathbf{x}$. Solutions with (3.1.5) and with $H_S(\mathbf{x})$ sufficiently large, in fact satisfy
\[ |L_{iv}(\mathbf{x})|_v \le H_S(\mathbf{x})^{2/s} \quad \text{for } v \in S,\ i = 1, \ldots, n. \tag{3.1.6} \]
Hence it suffices to prove that the solutions of (3.1.3) with (3.1.6) lie in finitely many proper linear subspaces of $K^n$.
Let $\mathbf{x}$ be a solution of (3.1.3) with (3.1.6), and define
\[ d_{iv}(\mathbf{x}) := \max\left( -2n - \varepsilon,\ \frac{\log |L_{iv}(\mathbf{x})|_v}{\log H_S(\mathbf{x})} \right) \quad \text{for } v \in S,\ i = 1, \ldots, n; \]
taking $\log 0 := -\infty$, this is well-defined also if $L_{iv}(\mathbf{x}) = 0$. Then
\[ \left. \begin{aligned} &-2n - \varepsilon \le d_{iv}(\mathbf{x}) \le 2/s \quad \text{for } v \in S,\ i = 1, \ldots, n,\\ &\sum_{v \in S} \sum_{i=1}^{n} d_{iv}(\mathbf{x}) \le -\varepsilon. \end{aligned} \right\} \tag{3.1.7} \]
We define a tuple $\mathbf{d} := (d_{iv} : v \in M_K,\ i = 1, \ldots, n)$ by $d_{iv} := 0$ for $v \in M_K \setminus S$, $i = 1, \ldots, n$ and
\[ d_{iv} \in \frac{\varepsilon}{2ns}\mathbb{Z}, \quad d_{iv} - \frac{\varepsilon}{2ns} < d_{iv}(\mathbf{x}) \le d_{iv} \quad \text{for } v \in S,\ i = 1, \ldots, n. \tag{3.1.8} \]
Then by (3.1.7),
\[ -2n - \varepsilon \le d_{iv} \le \frac{2}{s} + \frac{\varepsilon}{2ns} \quad \text{for } v \in S,\ i = 1, \ldots, n. \tag{3.1.9} \]
Notice that by (3.1.7) we have also
\[ \mu := \frac{1}{n} \sum_{v \in M_K} \sum_{i=1}^{n} d_{iv} \le -\frac{\varepsilon}{2n}. \]
Further, we have $|L_{iv}(\mathbf{x})|_v \le H_S(\mathbf{x})^{d_{iv}(\mathbf{x})} \le H_S(\mathbf{x})^{d_{iv}}$ for $v \in S$, $i = 1, \ldots, n$, hence with $Q := H_S(\mathbf{x})$ we have
\[ H_{Q,\mathbf{d}}(\mathbf{x}) \le 1. \]
By Theorem 3.1.4, the solutions of (3.1.3) satisfying (3.1.8) for some fixed tuple $\mathbf{d}$ lie in finitely many proper linear subspaces of $K^n$. Further, by (3.1.8) and (3.1.9), the tuples $\mathbf{d}$ belong to a finite set independent of $\mathbf{x}$. This proves Theorem 3.1.3.
The general statement of the quantitative version of Theorem 3.1.4, with
explicit upper bounds for Q0 , t, is quite technical. We give here only a special
case, which is sufficient for our purposes. We keep our assumptions that K
is an algebraic number field of degree d and S is a finite set of places of K,
containing the infinite places.
Theorem 3.1.6 Let $L_{iv}$ ($v \in M_K$, $i = 1, \ldots, n$) be linear forms such that for every $v \in M_K$, the set $\{L_{1v}, \ldots, L_{nv}\}$ is linearly independent and
\[ \{L_{1v}, \ldots, L_{nv}\} \subset \{X_1, \ldots, X_n, X_1 + \cdots + X_n\}. \]
Let $\mathbf{d} = (d_{iv} : v \in M_K,\ i = 1, \ldots, n)$ be any tuple of reals such that $d_{iv} = 0$ for $v \in M_K \setminus S$, $i = 1, \ldots, n$. Put
\[ \mu := \frac{1}{n} \sum_{v \in M_K} \sum_{i=1}^{n} d_{iv} \]
and suppose that
\[ \sum_{v \in M_K} \max(d_{1v}, \ldots, d_{nv}) \le \lambda \quad \text{with } \lambda > \mu. \]
Let $0 < \delta < \lambda - \mu$ and put
\[ \Delta := \frac{\lambda - \mu}{\delta}. \]
Let $H_{Q,\mathbf{d}}$ be defined by (3.1.4).
Then there is a finite collection $\{T_1, \ldots, T_t\}$ of proper linear subspaces of $K^n$ of cardinality
\[ t \le C(n, \Delta) \]
with $C(n, \Delta)$ effectively computable and depending only on $n$ and $\Delta$, such that for every $Q$ with
\[ Q > n^{2d/\delta} \tag{3.1.10} \]
there is $T \in \{T_1, \ldots, T_t\}$ such that
\[ \{\mathbf{x} \in K^n : H_{Q,\mathbf{d}}(\mathbf{x}) \le Q^{-\mu-\delta}\} \subseteq T. \tag{3.1.11} \]
This was proved by Evertse and Schlickewei (2002), Theorem 1.1 in the special case $\mu = 0$, $\lambda = 1$, with
\[ C(n, \Delta) = 4^{(n+9)^2} \Delta^{n+4}. \]
For our purposes, the precise value of $C(n, \Delta)$ will not matter, but we should mention here that Evertse and Ferretti (2013), Theorem 1.1 proved the same result with the better bound
\[ C(n, \Delta) = 10^6\, 2^{2n} n^{10} \Delta^3 (\log(6n\Delta))^2, \]
again in the special case $\mu = 0$, $\lambda = 1$.
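To get a feeling for the gap between the two bounds, one can tabulate them for small parameters. The sketch below writes `ratio` for the quantity $(\lambda - \mu)/\delta$ of Theorem 3.1.6 and takes the two expressions as $4^{(n+9)^2}\,\mathtt{ratio}^{\,n+4}$ and $10^6\, 2^{2n} n^{10}\,\mathtt{ratio}^3 (\log(6n \cdot \mathtt{ratio}))^2$; this reading of the formulas, the function names and the sample parameters are assumptions made for illustration, and the cited papers should be consulted for the exact statements.

```python
import math


def bound_es2002(n, ratio):
    """4^((n+9)^2) * ratio^(n+4), as an exact integer for integer inputs."""
    return 4 ** ((n + 9) ** 2) * ratio ** (n + 4)


def bound_ef2013(n, ratio):
    """10^6 * 2^(2n) * n^10 * ratio^3 * (log(6 n ratio))^2, as a float."""
    return 10 ** 6 * 2 ** (2 * n) * n ** 10 * ratio ** 3 * math.log(6 * n * ratio) ** 2


# Print log10 of both bounds for a few sample values.
for n in (2, 3, 5):
    for ratio in (2, 10, 100):
        older = bound_es2002(n, ratio)
        newer = bound_ef2013(n, ratio)
        print(n, ratio, f"{math.log10(older):.1f}", f"{math.log10(newer):.1f}")
```

The first bound grows like $4^{n^2}$ in $n$, the second only like $4^n n^{10}$, and the dependence on `ratio` drops from the power $n + 4$ to the power 3 up to a logarithmic factor.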
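It is not difficult to reduce Theorem 3.1.6 to the special case $\mu = 0$, $\lambda = 1$. Put
\[ d_{iv}' := \frac{1}{\lambda - \mu} \Big( d_{iv} - \frac{1}{n} \sum_{j=1}^{n} d_{jv} \Big) \quad (v \in M_K,\ i = 1, \ldots, n), \]
\[ \mathbf{d}' := (d_{iv}' : v \in M_K,\ i = 1, \ldots, n), \qquad Q' := Q^{\lambda - \mu}, \qquad \delta' := \frac{\delta}{\lambda - \mu}. \]
Then $d_{iv}' = 0$ for $v \in M_K \setminus S$, $i = 1, \ldots, n$,
\[ \sum_{v \in M_K} \sum_{i=1}^{n} d_{iv}' = 0, \qquad \sum_{v \in M_K} \max(d_{1v}', \ldots, d_{nv}') \le 1, \]
and (3.1.10) changes into $Q' > n^{2d/\delta'}$. Further,
\[ H_{Q,\mathbf{d}}(\mathbf{x}) = H_{Q',\mathbf{d}'}(\mathbf{x})\, Q^{-\mu}, \]
hence (3.1.11) changes into
\[ \{\mathbf{x} \in K^n : H_{Q',\mathbf{d}'}(\mathbf{x}) \le Q'^{-\delta'}\} \subseteq T. \]
Thus, Theorem 3.1.6 follows from the special case $\mu = 0$, $\lambda = 1$.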
The Subspace Theorem and its generalizations and quantitative refinements
have many applications. In this book we have focused on applications to unit
equations and subsequent applications thereof, see Chapters 6, 9, 10, but there is
much more, see for instance the survey papers Bilu (2008), Corvaja and Zannier
(2008), Bugeaud (2011), and the book Zannier (2003). Somewhat surprisingly,
from Theorem 3.1.3 one can derive extensions where the linear polynomials are
replaced by higher degree polynomials and where the solutions are taken from
an arbitrary algebraic variety instead of K n , see Corvaja and Zannier (2004a)
and Evertse and Ferretti (2002, 2008).
3.2 Effective estimates for linear forms in logarithms
In this section we present some results from Baker’s theory of logarithmic forms
that are used in Chapters 4 and 5. We formulate, without proof, the best known
effective estimates for linear forms in logarithms, due to Matveev (2000) in the
complex case and Yu (2007) in the p-adic case, as well as a common, uniform
formulation of them which will be more convenient to apply.
We first give a brief introduction, starting with the famous Gelfond–Schneider Theorem on transcendental numbers. For the moment, Q̄ denotes the algebraic closure of Q in C, and algebraic numbers are supposed to belong to Q̄. Here and below log denotes, unless otherwise stated, any fixed determination of the logarithm, and for α, β ∈ C with α ≠ 0 we define α^β := e^{β log α}.
Theorem 3.2.1 Suppose that α and β are algebraic numbers such that α ≠ 0, 1 and that β is not rational. Then α^β is transcendental.
Proof. See Gelfond (1934) and Schneider (1934). The theorem was proved
independently by Gelfond and Schneider. Their proofs are different, but both
depend on the construction of an auxiliary function. Assuming that in Theorem 3.2.1 α^β is algebraic and following the arguments of Gelfond, one can construct a function F(z) of a complex variable z which is a polynomial in α^z and α^{βz} with integral coefficients, not all zero, such that F^{(m)}(l) = 0 for all integers l, m with 1 ≤ l ≤ h and 0 ≤ m < k, where h, k are appropriate parameters. Then combining some arithmetic and analytic considerations and using induction on k, one can prove that F^{(m)}(l) = 0 for all m, which leads to a contradiction.
Theorem 3.2.1 provided an answer to Hilbert’s seventh problem. An equivalent formulation of the theorem is that if α_1, α_2 are non-zero algebraic numbers such that log α_1 and log α_2 are linearly independent over Q, then they are linearly independent over Q̄.
By means of a refinement of his method of proof, Gelfond (1935) gave a
non-trivial effective lower bound for the absolute value of β1 log α1 + β2 log α2 ,
where β1 , β2 denote algebraic numbers, not both 0, and α1 , α2 denote algebraic
numbers different from 0 and 1 such that log α1 / log α2 is not rational.
Mahler (1935b) proved a p-adic analogue of the Gelfond–Schneider Theorem. A generalization to the p-adic absolute value was given in Gelfond (1940)
in a quantitative form. In his book Gelfond (1960), Gelfond remarked that a generalization of his above results from two logarithms to arbitrarily many would be of great significance for the solution of many difficult problems in number theory.
In his celebrated series of papers, Baker (1966, 1967a, 1967b, 1968a) made a major breakthrough in transcendental number theory by generalizing the Gelfond–Schneider Theorem to arbitrarily many logarithms. In Baker (1966, 1967b), he proved the following.
Theorem 3.2.2 Let α_1, …, α_n denote non-zero algebraic numbers. If log α_1, …, log α_n are linearly independent over Q, then 1, log α_1, …, log α_n are linearly independent over Q̄.
Further, Baker (1967a, 1967b, 1968a) gave non-trivial lower bounds for the
absolute value of linear forms in logarithms of the form
β1 log α1 + · · · + βn log αn ,
where α1 , . . . , αn are non-zero algebraic numbers such that log α1 , . . . , log αn
are linearly independent over Q and β1 , . . . , βn are algebraic numbers, not
all 0.
Proof of Theorem 3.2.2 (sketch; see Baker (1967b) for full details). To illustrate most of the principal ideas of Baker, we sketch the main steps of the proof of a slightly weaker assertion, which states that if α_1, …, α_n, β_1, …, β_{n−1} are non-zero algebraic numbers such that α_1, …, α_n are multiplicatively independent, then
\[ \alpha_1^{\beta_1}\cdots\alpha_{n-1}^{\beta_{n-1}} = \alpha_n \]
cannot hold. Supposing the opposite and following the arguments of Baker, one can construct an auxiliary function F(z_1, …, z_{n−1}) in n − 1 complex variables, which generalizes the function of a single variable employed by Gelfond. The function is a polynomial in α_1^{z_1}, …, α_{n−1}^{z_{n−1}} and α_1^{β_1 z_1} ⋯ α_{n−1}^{β_{n−1} z_{n−1}}, such that
\[ F(z,\dots,z) = \sum_{\lambda_1=0}^{L}\cdots\sum_{\lambda_n=0}^{L} p(\lambda_1,\dots,\lambda_n)\,\alpha_1^{\lambda_1 z}\cdots\alpha_n^{\lambda_n z}, \]
where L is a large parameter and p(λ_1, …, λ_n) are rational integers, not all 0. Then for every positive integer l, the number F(l, …, l) lies in the algebraic number field Q(α_1, …, α_n). It follows from a well-known lemma on linear equations (known as Siegel’s Lemma) that the p(λ_1, …, λ_n) can be chosen such that their absolute values are not too large and such that
\[ F_{m_1,\dots,m_{n-1}}(l,\dots,l) = 0 \tag{3.2.1} \]
for all integers l, m_1, …, m_{n−1} with 1 ≤ l ≤ h and m_1 + ⋯ + m_{n−1} ≤ k, where F_{m_1,…,m_{n−1}} denotes the corresponding derivative of F(z_1, …, z_{n−1}) and h, k are appropriate parameters. In this situation, the basic interpolation techniques used earlier by Gelfond and others do not work in general. Using some analytic considerations Baker applied an ingenious extrapolation procedure to
extend (3.2.1) to a larger range of values for l, at the price of slightly diminishing the range of values for m_1 + ⋯ + m_{n−1}. Repeating this procedure, one can get F(l, …, l) = 0 for 1 ≤ l ≤ (L + 1)^n. This can be regarded as a system of
linear equations in the coefficients p(λ1 , . . . , λn ) of F , which, because of the
multiplicative independence of α1 , . . . , αn , cannot have a non-zero solution.
This proves the assertion.
Let again α1 , . . . , αn be n ≥ 2 non-zero algebraic numbers, and let
log α1 , . . . , log αn denote now the principal values of the logarithm.
Theorem 3.2.3 Let b1 , . . . , bn be rational integers and 0 < ε ≤ 1. Assume
that
\[ 0 < |b_1\log\alpha_1 + \cdots + b_n\log\alpha_n| < e^{-\varepsilon B}, \]
where B = max{|b_1|, …, |b_n|}. Then B ≤ B_0, where B_0 is effectively computable in terms of α_1, …, α_n and ε.
This was proved in Baker (1968a) with
\[ B_0 = \bigl(4^{n^2}\varepsilon^{-1} d^{2n} A\bigr)^{(2n+1)^2}, \]
where d ≥ 4 and A ≥ 4 are upper bounds for the degrees and heights, respectively, of α_1, …, α_n. Here, by the height of an algebraic number we mean the maximum of the absolute values of the coefficients in its minimal defining polynomial, which is chosen such that its coefficients are relatively prime integers.
Baker’s general effective estimates led to significant applications in number
theory. For applications to Diophantine equations, the inequalities of Baker
(1968a, 1968b) in which β1 , . . . , βn are rational integers proved to be particularly useful. Using his effective estimates, Baker (1968b, 1968c, 1969) gave
the first explicit upper bounds for the solutions of Thue equations, Mordell
equations, and superelliptic and hyperelliptic equations; see also Sections 9.6
and 9.7.
Later, several improvements and generalizations were established by Baker
and others, including Feldman, Baker and Stark, Tijdeman, van der Poorten,
Sprindžuk, Shorey, Wüstholz, Philippon and Waldschmidt, Waldschmidt,
Baker and Wüstholz, Laurent, Mignotte, Nesterenko and Matveev and, in the
p-adic case, Coates, Sprindžuk, Brumer, Vinogradov and Sprindžuk, van der
Poorten, Bugeaud, Laurent and Yu. They have introduced various new ideas
to improve or refine the previous bounds. Their results made it possible to
obtain enormously many applications. For further applications to Diophantine
problems, we refer to Győry (1980b, 2002, 2010), Sprindžuk (1982, 1993),
Shorey and Tijdeman (1986), Serre (1989), de Weger (1989), Bilu (1995),
Wildanger (1997, 2000), Smart (1998), Gaál (2002) and Tzanakis (2013), to
Chapters 4 and 5 of the present book, to our next book on discriminant equations, and to the references given there.
Using an elementary geometric lemma due to Bombieri (1993) and Bombieri
and Cohen (1997), Bilu and Bugeaud (2000) showed that one does not need
the full strength of Baker’s theory to get, for bn = ±1, an effective version
of Theorem 3.2.3: it can be deduced from an estimate for linear forms in just
two logarithms. However, the results of the theory of linear forms in n ≥ 2
logarithms provide much better bounds for B.
For comprehensive accounts of Baker’s theory, analogues for elliptic logarithms and algebraic groups and extensive bibliographies the reader can consult Baker (1975, 1988), Baker and Masser (1977), Lang (1978), Feldman and
Nesterenko (1998), Waldschmidt (2000), Wüstholz (2002) and, for the state of
the art as well, Baker and Wüstholz (2007).
We now state the results of Matveev and Yu and give a common, uniform
formulation of them.
Let again K be an algebraic number field of degree d, and assume that it is
embedded in C. We put χ = 1 if K is real, and χ = 2 otherwise. Let
\[ \Sigma = b_1\log\alpha_1 + \cdots + b_n\log\alpha_n, \]
where α_1, …, α_n are n (≥ 2) non-zero elements of K with some fixed non-zero values of log α_1, …, log α_n, and b_1, …, b_n are rational integers, not all zero. Let A_1, …, A_n be reals with
\[ A_i \ge \max\{d\,h(\alpha_i),\ |\log\alpha_i|,\ 0.16\} \quad (i = 1,\dots,n) \]
and put
\[ B := \max\{1,\ \max\{|b_i|A_i/A_n : 1 \le i \le n\}\}. \]
The following deep result was proved by Matveev (2000).
Theorem 3.2.4 Let K, α_1, …, α_n, b_1, …, b_n and Σ be as above, and suppose that Σ ≠ 0. Then
\[ \log|\Sigma| > -C_1(n,d)\,A_1\cdots A_n\log(eB), \]
where
\[ C_1(n,d) := \min\Bigl\{\frac{1}{\chi}\Bigl(\frac{1}{2}en\Bigr)^{\chi} 30^{n+3} n^{3.5},\ 2^{6n+20}\Bigr\}\, d^2\log(ed). \]
Further, B may be replaced by max(|b_1|, …, |b_n|).
Proof. This is Corollary 2.3 of Matveev (2000).
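To get a feeling for the size of the constant in Theorem 3.2.4, the following minimal sketch evaluates C_1(n, d) and the resulting lower bound for the logarithm of the absolute value of the linear form. The function names and the sample inputs are illustrative assumptions; only the formula itself is taken from the statement above.

```python
import math

def matveev_c1(n: int, d: int, chi: int) -> float:
    """C_1(n, d) of Theorem 3.2.4; chi = 1 if K is real, chi = 2 otherwise."""
    if n < 2 or d < 1 or chi not in (1, 2):
        raise ValueError("need n >= 2, d >= 1 and chi in {1, 2}")
    first = (1.0 / chi) * (math.e * n / 2.0) ** chi * 30.0 ** (n + 3) * n ** 3.5
    second = 2.0 ** (6 * n + 20)
    return min(first, second) * d ** 2 * math.log(math.e * d)

def matveev_log_lower_bound(n, d, chi, A, B):
    """Right-hand side -C_1(n, d) A_1...A_n log(eB) of Theorem 3.2.4.

    A is the list (A_1, ..., A_n); B is the quantity defined before the theorem.
    """
    if len(A) != n:
        raise ValueError("A must have n entries")
    return -matveev_c1(n, d, chi) * math.prod(A) * math.log(math.e * B)
```

For n = 2, d = 1 and real K the first term of the minimum is the smaller one; for large n with χ = 2 the term 2^{6n+20} takes over.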
We shall use the following consequence of Theorem 3.2.4. Let
\[ \Lambda = \alpha_1^{b_1}\cdots\alpha_n^{b_n} - 1 \tag{3.2.2} \]
and
\[ A_i \ge \max\{d\,h(\alpha_i),\ \pi\}, \quad i = 1,\dots,n. \]
Theorem 3.2.5 Suppose that Λ ≠ 0, b_n = ±1 and that B satisfies
\[ B \ge \max\Bigl\{|b_1|,\dots,|b_{n-1}|,\ 2e\max\Bigl(\frac{n\pi}{\sqrt{2}},\ A_1,\dots,A_{n-1}\Bigr)A_n\Bigr\}. \tag{3.2.3} \]
Then we have
\[ \log|\Lambda| > -C_2(n,d)\,A_1\cdots A_n\log\bigl(B/(\sqrt{2}A_n)\bigr), \tag{3.2.4} \]
where
\[ C_2(n,d) := \min\bigl\{1.451\,(30\sqrt{2})^{n+4}(n+1)^{5.5},\ \pi\,2^{6.5n+27}\bigr\}\, d^2\log(ed). \]
Proof. Let log denote the principal value of the logarithm. There exists an even rational integer b_0 such that |b_0| ≤ |b_1| + ⋯ + |b_n| and that |Im(Σ')| ≤ π, where
\[ \Sigma' := b_0\log\alpha_0 + b_1\log\alpha_1 + \cdots + b_n\log\alpha_n \]
and α_0 = −1. The assumption Λ ≠ 0 implies that Σ' ≠ 0. We may assume that |e^{Σ'} − 1| = |Λ| ≤ 1/3. Then it follows that |Σ'| ≤ 0.6, whence
\[ |\Lambda| \ge \tfrac{1}{2}|\Sigma'|. \tag{3.2.5} \]
Using |log|α_i|| ≤ d h(α_i), it is easy to show that
\[ |\log\alpha_i| \le \sqrt{2}\max\{d\,h(\alpha_i),\ \pi\}, \quad i = 1,\dots,n. \]
Thus, setting A_0 = π/√2, we have
\[ \sqrt{2}A_i \ge \max\{d\,h(\alpha_i),\ |\log\alpha_i|,\ 0.16\}, \quad i = 0,1,\dots,n. \]
Further, (3.2.3) implies
\[ \Bigl(\frac{B}{\sqrt{2}A_n}\Bigr)^{2} \ge e\max\Bigl\{1,\ \max_{0\le i\le n}\frac{|b_i|A_i}{A_n}\Bigr\}. \]
By applying now Theorem 3.2.4 to |Σ'| and using (3.2.5), we obtain (3.2.4).
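The numerical constants in C_2(n, d) can be traced through this proof. The following is a sketch of the bookkeeping only, under these assumptions: Theorem 3.2.4 is applied to the n + 1 logarithms log α_0, …, log α_n with the values √2 A_i in place of A_i (so that √2 A_0 = π), the case χ = 2 is taken as the less favourable one, the consequence of (3.2.3) is used in the form log(e B_M) ≤ 2 log(B/(√2 A_n)) where B_M denotes the quantity B of Theorem 3.2.4 for the extended linear form, and the additive term log 2 coming from (3.2.5) is not tracked.

```latex
% First term of C_1(n+1, d) with chi = 2, and the product of the new A-values:
\[
\tfrac{1}{2}\Bigl(\tfrac{e(n+1)}{2}\Bigr)^{2} 30^{n+4}(n+1)^{3.5}
  = \tfrac{e^{2}}{8}\,30^{n+4}(n+1)^{5.5},
\qquad
\prod_{i=0}^{n}\sqrt{2}\,A_i = \pi\,2^{n/2}A_1\cdots A_n .
\]
% Multiplying by the factor 2 from log(e B_M) <= 2 log(B / (sqrt(2) A_n)):
\[
\tfrac{e^{2}}{8}\cdot\pi\,2^{n/2}\cdot 2\cdot 30^{n+4}
  = \tfrac{e^{2}\pi}{16}\,(30\sqrt{2})^{n+4}
  \approx 1.4508\,(30\sqrt{2})^{n+4},
\qquad
2^{6(n+1)+20}\cdot\pi\,2^{n/2}\cdot 2 = \pi\,2^{6.5n+27}.
\]
```

The value e²π/16 ≈ 1.4508 is consistent with the constant 1.451 appearing in C_2(n, d).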
Remark 3.2.6 Since for any complex number z, |e^z − 1| ≤ |z|e^{|z|} holds, for |Σ| ≤ 1 it follows that
\[ |\Lambda| \le e|\Sigma|. \]
Together with (3.2.5) this implies that if we have an effective and quantitative result for |Λ| then we also have a similar one for the corresponding |Σ| or |Σ'|, and conversely.
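The two elementary inequalities behind this remark can be checked numerically. The sketch below samples grid points of a disc and tests |e^z − 1| ≤ |z| e^{|z|} as well as the lower bound |e^z − 1| ≥ |z|/2, which has the shape of (3.2.5) and is tested here only for |z| ≤ 0.6. The helper names and the grid resolution are assumptions made for illustration; a numerical check of this kind is of course no proof.

```python
import cmath
import math

def upper_bound_holds(z: complex) -> bool:
    """Check |e^z - 1| <= |z| e^{|z|}, allowing for floating-point rounding."""
    return abs(cmath.exp(z) - 1) <= abs(z) * math.exp(abs(z)) * (1 + 1e-12) + 1e-15

def lower_bound_holds(z: complex) -> bool:
    """Check |e^z - 1| >= |z| / 2, intended for |z| <= 0.6."""
    return abs(cmath.exp(z) - 1) >= 0.5 * abs(z) - 1e-15

def sample_disc(radius: float, steps: int = 40):
    """Grid points z = x + iy with |z| <= radius."""
    points = []
    for i in range(-steps, steps + 1):
        for j in range(-steps, steps + 1):
            z = complex(i * radius / steps, j * radius / steps)
            if abs(z) <= radius:
                points.append(z)
    return points
```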
Consider again Λ defined by (3.2.2). Let now B and B_n be real numbers satisfying
\[ B \ge \max\{|b_1|,\dots,|b_n|\}, \qquad B \ge B_n \ge |b_n|. \]
Let 𝔭 be a prime ideal of O_K and denote by e_𝔭 and f_𝔭 the ramification index and the residue class degree of 𝔭, respectively. Suppose that 𝔭 lies above the rational prime number p. Then the norm of 𝔭 is N(𝔭) = p^{f_𝔭}.
The following profound result is due to Yu (2007).
Theorem 3.2.7 Assume that ord_p b_n ≤ ord_p b_i for i = 1, …, n, and set
\[ h_i := \max\{h(\alpha_i),\ 1/(16e^2d^2)\}, \quad i = 1,\dots,n. \]
If Λ ≠ 0, then for any real δ with 0 < δ ≤ 1/2 we have
\[ \operatorname{ord}_{\mathfrak{p}}\Lambda < C_3(n,d)\,\frac{e_{\mathfrak{p}}^{\,n}N(\mathfrak{p})}{(\log N(\mathfrak{p}))^2}\,\max\Bigl\{h_1\cdots h_n\log(M\delta^{-1}),\ \frac{\delta B}{B_n C_4(n,d)}\Bigr\}, \tag{3.2.6} \]
where
\[ C_3(n,d) := (16ed)^{2(n+1)} n^{3/2}\log(2nd)\log(2d), \]
\[ C_4(n,d) := (2d)^{2n+1}\log(2d)\log^3(3d), \]
and
\[ M := B_n C_5(n,d)\,N(\mathfrak{p})^{n+1} h_1\cdots h_{n-1} \]
with
\[ C_5(n,d) := 2e^{(n+1)(6n+5)} d^{3n}\log(2d). \]
Proof. This is the second consequence of the Main Theorem in Yu (2007). As is remarked there, for p > 2, the expression (16ed)^{2(n+1)} can be replaced by (10ed)^{2(n+1)}.
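The constants of Theorem 3.2.7 grow very quickly, C_5(n, d) in particular, so it is convenient to work with their natural logarithms. The following minimal sketch does this; the function name and the returned dictionary layout are assumptions made for illustration, while the formulas are those stated in the theorem.

```python
import math

def yu_log_constants(n: int, d: int) -> dict:
    """Natural logarithms of C_3(n, d), C_4(n, d), C_5(n, d) from Theorem 3.2.7."""
    if n < 1 or d < 1:
        raise ValueError("need n >= 1 and d >= 1")
    log_c3 = (2 * (n + 1) * math.log(16 * math.e * d)
              + 1.5 * math.log(n)
              + math.log(math.log(2 * n * d))
              + math.log(math.log(2 * d)))
    log_c4 = ((2 * n + 1) * math.log(2 * d)
              + math.log(math.log(2 * d))
              + 3 * math.log(math.log(3 * d)))
    # log(2 e^{(n+1)(6n+5)} d^{3n} log(2d)), computed without forming e^{...}
    log_c5 = (math.log(2) + (n + 1) * (6 * n + 5)
              + 3 * n * math.log(d)
              + math.log(math.log(2 * d)))
    return {"log_C3": log_c3, "log_C4": log_c4, "log_C5": log_c5}
```

Working on the logarithmic scale avoids floating-point overflow, which would otherwise occur for C_5 already for moderate n.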
For the proof of Theorem 4.2.1, it will be more convenient to use a uniform lower bound for log |Λ|_v which is valid both for infinite and for finite places v.
For a place v ∈ M_K, we write as above
\[ N(v) := \begin{cases} 2 & \text{if } v \text{ is infinite},\\ N(\mathfrak{p}) & \text{if } v = \mathfrak{p} \text{ is finite}. \end{cases} \]
The following theorem is a consequence of Theorems 3.2.5 and 3.2.7.
Theorem 3.2.8 Let v ∈ M_K. Suppose that in (3.2.2) Λ ≠ 0, b_n = ±1 and that α_1, …, α_{n−1} are not roots of unity. Let
\[ \Theta := h(\alpha_1)\cdots h(\alpha_{n-1}), \qquad H := \max\{h(\alpha_n),\ 1\}. \]
If B is a real number such that
\[ B \ge \max\{|b_1|,\dots,|b_{n-1}|,\ 2e(3d)^{2n}\Theta H\}, \tag{3.2.7} \]
then
\[ \log|\Lambda|_v > -C_6(n,d)\,\frac{N(v)}{\log N(v)}\,\Theta H\log^*\Bigl(\frac{B\,N(v)}{H}\Bigr), \tag{3.2.8} \]
where C_6(n, d) := λ(16ed)^{3n+2}(log* d)^2, and λ = 1 or 12 according as n ≥ 3 or n = 2.
In the proof, we shall also need the following.
Proposition 3.2.9 Let α be a non-zero algebraic number of degree d which is not a root of unity. Then
\[ d\,h(\alpha) \ge \begin{cases} \log 2 & \text{if } d = 1,\\ 2/(\log 3d)^3 & \text{if } d \ge 2. \end{cases} \]
Proof. This result is due to Voutier (1996).
Remark 3.2.10 For d ≥ 2 this lower bound may be replaced by the quantity (1/4)(log log d / log d)^3; see Voutier (1996). It is a conjecture, inspired by a question of D. H. Lehmer (1933), that even d h(α) ≥ c > 0 should hold for some absolute constant c.
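The lower bound of Proposition 3.2.9 is easy to evaluate and to compare with concrete algebraic numbers. The sketch below uses the standard identity d h(α) = log M(α), where M(α) is the Mahler measure, that is, the absolute value of the leading coefficient of the minimal polynomial over Z times the product of max(1, |α_i|) over the conjugates α_i. The function names are illustrative assumptions, and the conjugates are supplied by the caller rather than computed.

```python
import math

def voutier_lower_bound(d: int) -> float:
    """Lower bound for d * h(alpha) from Proposition 3.2.9 (alpha not a root of unity)."""
    if d < 1:
        raise ValueError("degree must be at least 1")
    if d == 1:
        return math.log(2)
    return 2.0 / math.log(3 * d) ** 3

def d_times_height(leading_coeff: int, conjugates) -> float:
    """d * h(alpha) = log M(alpha), from the leading coefficient and all conjugates."""
    return math.log(abs(leading_coeff)) + sum(
        math.log(max(1.0, abs(c))) for c in conjugates
    )
```

For the golden ratio, a root of x² − x − 1 with conjugates (1 ± √5)/2, this gives d h(α) = log((1 + √5)/2) ≈ 0.481, which lies above the bound 2/(log 6)³ ≈ 0.348 for d = 2.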
Proof of Theorem 3.2.8. First assume that v is infinite. There is an embedding σ : K → C such that |Λ|_v = |σ(Λ)| or |σ(Λ)|^2 according as σ is real or not. Observe further that h(σ(α)) = h(α) for each α ∈ Q̄. Hence it suffices to prove (3.2.8) for |Λ|. Suppose that in Theorem 3.2.5 A_i = max{d h(α_i), π} for i = 1, …, n. Then, using Proposition 3.2.9, it is easy to see that
\[ A_1\cdots A_n \le (2.52d)^{2n}\Theta H. \]
Further, we have
\[ \sqrt{2}A_n > H/N(v) \quad\text{and}\quad 2e\max\Bigl(\frac{n\pi}{\sqrt{2}},\ A_1,\dots,A_{n-1}\Bigr)A_n \le 2e(3d)^{2n}\Theta H. \]
Now (3.2.7) implies (3.2.3), and (3.2.8) follows from the inequality (3.2.4) of Theorem 3.2.5.
Next assume that v is finite. Keeping the notation of Theorem 3.2.7 and using again Proposition 3.2.9, we infer that
\[ h_i = h(\alpha_i) \ \text{for } i = 1,\dots,n-1 \quad\text{and}\quad h_n \le \max\{h(\alpha_n),\ 1\} = H. \]
Choosing δ = h_1 ⋯ h_{n−1}H/B and B_n = 1 in Theorem 3.2.7, (3.2.7) implies that δ ≤ 1/2. Using the fact that |Λ|_v = N(𝔭)^{−ord_𝔭 Λ}, after some computation (3.2.8) follows from (3.2.6) of Theorem 3.2.7.
4 Effective results for unit equations in two unknowns over number fields
In this chapter we present effective finiteness results in quantitative form on
equations of the shape
a1 x1 + a2 x2 = 1, (4.1)
where a1, a2 are non-zero elements of an algebraic number field K, and the unknowns x1, x2 are units, S-units or, more generally, elements of a finitely generated multiplicative subgroup Γ of K*. We usually refer to such equations as “unit equations”, also if the unknowns are taken from a group that is not the unit group of a ring. In the case that the unknowns are S-units, we speak about an S-unit equation. In certain applications, it is more convenient to consider equation (4.1) in homogeneous form
a1 x1 + a2 x2 + a3 x3 = 0, (4.2)
where a1, a2, a3 denote non-zero elements of K, and the unknowns x1, x2, x3 are units, S-units or elements of Γ.
For a long time equations (4.1) and (4.2) were utilized merely in special
cases and in an implicit way. It was proved by Siegel (1921) (in an implicit
form) for units of a number field, and by Mahler (1933a) for S-units in Q
that equation (4.1) has only finitely many solutions. For S-unit equations over
number fields, the finiteness of the number of solutions follows from work
of Parry (1950). Extending results of Siegel, Mahler and Parry, Lang (1960)
proved that equation (4.1) has only finitely many solutions in x1, x2 ∈ Γ even in the case when K is any field of characteristic 0 and Γ is any finitely generated multiplicative subgroup of K*. This implies that, up to a common proportional factor, (4.2) also has only finitely many solutions. These results are ineffective.
In this chapter we restrict ourselves to the case when K is a number field.
The general case will be discussed in Chapters 6 and 8.
Using Baker’s theory of logarithmic forms, Győry (1972, 1973, 1974, 1979,
1979/1980) gave the first effective upper bounds for the heights of the solutions
Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:29:29, subject to the Cambridge
Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.006
of unit equations and S-unit equations over number fields. He systematically
applied his results among others to decomposable form equations, polynomials
and algebraic numbers of given discriminant, and irreducible polynomials, see
Győry (1972, 1973, 1974, 1976, 1978a,b, 1980a,b, 1981a,b,c, 1982c). Győry’s
bounds have been improved by several people, for references see the Notes in
Section 4.7.
In the present chapter, we derive effective upper bounds for the heights
of the solutions of S-unit equations by means of the best known variants of
the classical Baker’s method. There are now other methods giving effective
bounds for the solutions, see Bombieri (1993), Bombieri and Cohen (1997,
2003), Bugeaud (1998), Murty and Pasten (2013) and von Känel (2014b). A
brief discussion of these methods, together with a comparison of the bounds
they yield, is given in Section 4.5.
In Section 4.1, we present the best upper bounds to date for the heights of the
solutions of (4.1) and (4.2) in units, S-units and, more generally, in an arbitrary
finitely generated subgroup of a number field K. These results will be used
to prove the main results in Chapter 8 on unit equations over finitely generated
integral domains, and in Section 9.6 on decomposable form equations over K.
Further, they will be applied to discriminant equations in our next book. For
these and other possible applications, we give the upper bounds in completely
explicit form.
In Section 4.2 we state new effective and quantitative results on approximation of numbers from K ∗ by elements of a finitely generated subgroup of
K ∗ . These are the hard core of our proofs. In Section 4.6, an application is
presented in the direction of the abc-conjecture over number fields. Many other
applications are mentioned in the Notes, Section 4.7 of this chapter and in
Chapter 10.
Sections 3.2 and 4.3 contain the main tools needed in the proofs. We recalled
in Section 3.2 the best known effective estimates, due to Matveev (2000) and
Yu (2007), for linear forms in logarithms. Further, in Section 4.3 we prove a
new result from the geometry of numbers and give height estimates for units/
S-units in a fundamental/maximal independent system of units/S-units. Finally,
in Section 4.4 we prove the results from Sections 4.1 and 4.2.
4.1 Effective bounds for the heights of the solutions
4.1.1 Equations in units of a number field
Let K be an algebraic number field of degree d, O_K the ring of integers of K and O_K^* the group of units of O_K. We denote by R the regulator of K, by r the rank of O_K^*, by M_K the set of (infinite and finite) places, and by M_K^∞ the set of infinite places of K. For v ∈ M_K, |·|_v denotes the absolute value corresponding to v, defined in Section 1.7.
We recall that the absolute (multiplicative) height H(α) of α ∈ K is defined by
\[ H(\alpha) := \Bigl(\prod_{v \in M_K}\max(1, |\alpha|_v)\Bigr)^{1/d} \]
and the absolute logarithmic height h(α) by
\[ h(\alpha) := \log H(\alpha). \]
More generally, we define the height h(α) of α ∈ Q̄ by taking a number field
K containing α and using the above definition; one can show that this is
independent of the choice of K. For more details and for the most important
properties of the height, we refer to Section 1.9. We shall frequently use these
properties without any further reference.
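In the simplest case K = Q (so d = 1) the product over all places collapses to a familiar quantity: for a rational number x = p/q in lowest terms one has H(x) = max(|p|, |q|). The following minimal sketch implements only this special case; the function names are illustrative, and heights over larger number fields are not covered.

```python
from fractions import Fraction
import math

def multiplicative_height(x: Fraction) -> int:
    """H(x) = max(|p|, |q|) for x = p/q in lowest terms (K = Q only)."""
    # Fraction normalises to lowest terms with a positive denominator.
    return max(abs(x.numerator), x.denominator)

def logarithmic_height(x: Fraction) -> float:
    """h(x) = log H(x) for a rational number x."""
    return math.log(multiplicative_height(x))
```

For instance, 6/4 is stored as 3/2 and has h = log 3, while h(0) = h(1) = 0 and h(x) = h(1/x) for x ≠ 0.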
Let a1, a2, a3 be non-zero elements of K and let H be a real number with
\[ H \ge \max\{h(a_1), h(a_2), h(a_3)\}, \qquad H \ge \max\{1, \pi/d\}. \]
Consider the homogeneous unit equation
\[ a_1x_1 + a_2x_2 + a_3x_3 = 0 \quad\text{in } x_1, x_2, x_3 \in O_K^*. \tag{4.1.1} \]
The following theorem is due to Győry and Yu (2006).
Theorem 4.1.1 All solutions x1, x2, x3 of (4.1.1) satisfy
\[ \max_{i,j} h(x_i/x_j) \le c_1 R(\log^* R)H, \tag{4.1.2} \]
where
\[ c_1 := 4(r+1)^{2r+9}\,2^{3.2(r+12)}\log(2r+2)\,(d\log^*(2d))^3. \]
In some applications, for instance in our book on discriminant equations,
at least two of the unknowns x1 , x2 , x3 are conjugate to each other over Q. In
these situations the following theorem will lead to much sharper quantitative
results.
Let K_1 be a subfield of K with degree d_1, unit rank r_1 and regulator R_{K_1}. Assume that for some Q-isomorphism σ of K_1, σ(K_1) is also a subfield of K.
Theorem 4.1.2 All solutions x1, x2, x3 of (4.1.1) with x2 ∈ K_1, x3 = σ(x2) satisfy
\[ \max_{1 \le i,j \le 3} h(x_i/x_j) \le c_2 R_{K_1} H\log\Bigl(\frac{h(x_2)}{H}\Bigr), \tag{4.1.3} \]
provided that
\[ h(x_2) > c_3 R_{K_1} H, \tag{4.1.4} \]
where
\[ c_2 := 2^{5.5r_1+45}\, r_1^{2r_1+2.5}, \qquad c_3 := 320\,d^2 r_1^{2r_1}. \]
It should be observed that in (4.1.3) the upper bound depends on h(x2 ).
In terms of d and r1 , Theorem 4.1.2 is an improvement of a result of Győry
(1998).
In the next subsection we give more general versions of Theorem 4.1.1. A
similar generalization of Theorem 4.1.2 is given in Győry (1998). But Theorems 4.1.1 and 4.1.2 provide, in the special situation they deal with, much
better bounds in terms of d and r.
4.1.2 Equations with unknowns from a finitely generated multiplicative group
Let again K be an algebraic number field of degree d. Let Γ be a finitely generated multiplicative subgroup of K* of rank q > 0, and Γ_tors the torsion subgroup of Γ consisting of all elements of finite order. We recall that q is the smallest positive integer such that Γ/Γ_tors has a system of q generators. Let S denote the smallest set of places of K such that S contains all infinite places, and Γ ⊆ O_S^* where O_S^* denotes the group of S-units in K. Further, let a1, a2 ∈ K*. We consider the equation
\[ a_1x_1 + a_2x_2 = 1 \quad\text{in } x_1 \in \Gamma,\ x_2 \in O_S^*. \tag{4.1.5} \]
In our first theorem below the following notation is used:
H := max{1, h(a1), h(a2)};
{ξ_1, …, ξ_m} is a system of generators for Γ/Γ_tors (not necessarily a basis) and
Θ := h(ξ_1) ⋯ h(ξ_m);
s := |S|, 𝔭_1, …, 𝔭_t are the prime ideals in S, and
P := max{2, N(𝔭_1), …, N(𝔭_t)},
where N(𝔭_i) := |O_K/𝔭_i| denotes the norm of 𝔭_i; in the case that S consists only of the infinite places we put t := 0, P := 2.
Theorem 4.1.3 If x1, x2 is a solution of (4.1.5), then
\[ \max\{h(x_1), h(x_2)\} < 6.5\,c_4\,s\,\frac{P}{\log P}\,\Theta H\max\{\log(c_4 sP),\ \log^*\Theta\}, \tag{4.1.6} \]
where
\[ c_4 := 11\lambda\cdot(m+1)(\log^* m)(16ed)^{3m+5} \]
with λ = 12 if m = 1, λ = 1 if m ≥ 2.
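To see how large the bound (4.1.6) is in a concrete case, the following sketch evaluates its right-hand side. It assumes the convention log* x := max(1, log x) for x > 0, which is not restated in this section, and all function names are illustrative. The sample call in the usage note takes K = Q, S = {∞, 2, 3}, Γ generated modulo torsion by 2 and 3, and a1 = a2 = 1.

```python
import math

def log_star(x: float) -> float:
    """Assumed convention: log* x = max(1, log x) for x > 0."""
    if x <= 0:
        raise ValueError("log* is only evaluated at positive arguments here")
    return max(1.0, math.log(x))

def c4(m: int, d: int) -> float:
    """The constant c_4 of Theorem 4.1.3."""
    lam = 12 if m == 1 else 1
    return 11 * lam * (m + 1) * log_star(m) * (16 * math.e * d) ** (3 * m + 5)

def bound_4_1_6(m: int, d: int, s: int, P: float, theta: float, H: float) -> float:
    """Right-hand side of (4.1.6)."""
    c = c4(m, d)
    return (6.5 * c * s * (P / math.log(P)) * theta * H
            * max(math.log(c * s * P), log_star(theta)))
```

With m = 2, d = 1, s = 3, P = 3, Θ = (log 2)(log 3) and H = 1 the bound is of order 10^22, whereas the solution x1 = 9, x2 = −8 of x1 + x2 = 1 has height log 9 ≈ 2.2. The explicit bound is thus far from the true size of the solutions in this small example.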
For some of our applications it is essential that we allow ξ_1, …, ξ_m to be any set of generators of Γ/Γ_tors and not necessarily a basis; see for instance the proof of Theorem 9.6.2 and the proofs of certain results on discriminant equations, to be discussed in our next book. Almost the same bounds as in (4.1.6) were obtained in Bérczes, Evertse and Győry (2009), but with c4 replaced by a constant which, for m > q > 0, contains also the factor q^q. This improvement here will be important in our book on discriminant equations.
Theorem 4.1.3 implies in an effective way the finiteness of the number of solutions x1, x2 ∈ Γ of (4.1.5). To formulate this in a precise form we recall that as in Section 1.10, K is said to be effectively given if the minimal polynomial over Z of a primitive element θ of K over Q is given. We may assume that θ is an algebraic integer. Further, an element α of K is said to be given/effectively determinable if it is expressed in the form
\[ \alpha = (p_0 + p_1\theta + \cdots + p_{d-1}\theta^{d-1})/q \]
with rational integers p_0, …, p_{d−1}, q with gcd(p_0, …, p_{d−1}, q) = 1 that are given/can be effectively computed (see Section 1.10).
Corollary 4.1.4 For given a1, a2 ∈ K*, equation (4.1.5) has only finitely many solutions in x1, x2 ∈ Γ. Further, there exists an algorithm which, from effectively given K, a1, a2, a system of generators for Γ/Γ_tors and Γ_tors, computes all solutions x1, x2.
In the special case Γ = O_S^*, we obtain from Theorem 4.1.3 the following. Let S be a finite subset of M_K containing all infinite places, with the above parameters s, P. Denote by R_S the S-regulator (see (1.8.2)). Define
\[ c_5 := 11\lambda s^2(\log^* s)(16ed)^{3s+2} \quad\text{with } \lambda = 12 \text{ if } s = 2,\ \lambda = 1 \text{ if } s \ge 3, \]
\[ c_6 := ((s-1)!)^2/(2^{s-2}d^{s-1}). \]
Corollary 4.1.5 Every solution x1, x2 of
\[ a_1x_1 + a_2x_2 = 1 \quad\text{in } x_1, x_2 \in O_S^* \tag{4.1.7} \]
satisfies
\[ \max(h(x_1), h(x_2)) < 6.5\,c_5c_6\,(P/\log P)\,H R_S\max\{\log(c_5P),\ \log^*(c_6R_S)\}. \tag{4.1.8} \]
This was proved by Győry and Yu (2006) in a slightly sharper form in terms
of d and s. Their proof is a more general variant of that of Theorem 4.1.1. In
the special case S = M_K^∞, Corollary 4.1.5 gives Theorem 4.1.1 but only with a weaker bound in terms of d and r. From Theorem 4.1.3, a weaker version of
Theorem 4.1.2 can also be deduced.
We say that S is effectively given if the prime ideals in S are effectively
given in the sense defined in Section 1.10. The next corollary follows both
from Corollary 4.1.5 and from Corollary 4.1.4.
Corollary 4.1.6 For given a1, a2 ∈ K*, equation (4.1.7) has only finitely many solutions. Further, there exists an algorithm which, from effectively given K, a1, a2 and S, computes all solutions.
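For a very small instance the solutions of an S-unit equation can be listed by direct search. The sketch below treats x1 + x2 = 1 over K = Q with S = {∞, 2, 3}. It is a naive enumeration with an ad hoc exponent bound chosen by the caller, not the algorithm of the corollary, which would instead start from an explicit height bound such as (4.1.8); the function names and the bound are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

PRIMES = (2, 3)  # finite places of S; here S = {infinity, 2, 3} and K = Q

def is_s_unit(x: Fraction, primes=PRIMES) -> bool:
    """True if x is non-zero and, up to sign, a product of powers of the given primes."""
    if x == 0:
        return False
    for part in (abs(x.numerator), x.denominator):
        for p in primes:
            while part % p == 0:
                part //= p
        if part != 1:
            return False
    return True

def s_unit_solutions(exponent_bound: int, primes=PRIMES):
    """Pairs (x1, x2) of S-units with x1 + x2 = 1, where the exponents of x1
    are searched in the range [-exponent_bound, exponent_bound]."""
    found = set()
    exponent_range = range(-exponent_bound, exponent_bound + 1)
    for exps in product(exponent_range, repeat=len(primes)):
        x = Fraction(1)
        for p, e in zip(primes, exps):
            x *= Fraction(p) ** e
        for x1 in (x, -x):
            x2 = 1 - x1
            if is_s_unit(x2, primes):
                found.add((x1, x2))
    return sorted(found)
```

With exponent bound 4 the search returns 21 pairs, among them (9, −8), (1/9, 8/9) and (1/2, 1/2). They all come from the four relations 1 + 1 = 2, 1 + 2 = 3, 1 + 3 = 4 and 1 + 8 = 9.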
If the number t of finite places in S exceeds log P, then, in terms of S, s^s is the dominating factor in the bound occurring in (4.1.8). This factor is a consequence of the use of Proposition 4.3.9 concerning S-units whose proof is based on Minkowski’s Theorem on successive minima. In the following version of Corollary 4.1.5 there is no factor of the form s^s or t^t. This improvement
plays an important role in some applications, see e.g. Győry, Pink and Pintér
(2004), Győry and Yu (2006), Győry (2006) and it is also applied in our next
book on discriminant equations.
Let
\[ \mathcal{R} := \max\{h, R\}, \]
where h and R denote the class number and regulator of K, respectively. Further, let r denote the unit rank of K. From Theorem 4.2.1 below we shall deduce the following.
Theorem 4.1.7 Let t > 0. Then every solution x1, x2 of equation (4.1.7) satisfies
\[ \max\{h(x_1), h(x_2)\} < c_7\,d^{r+3}\,\mathcal{R}^{t+4}\,P H R_S, \tag{4.1.9} \]
where c7 is an effectively computable positive absolute constant.
This was established in Győry and Yu (2006) in a somewhat different and completely explicit form; for a slight improvement see Győry (2008a).
We note that in view of (1.5.2) and (1.5.3), ℛ can be estimated from above in terms of d and the discriminant of K. Further, in view of (1.8.3) we have
\[ R\prod_{i=1}^{t}\log N(\mathfrak{p}_i) \le R_S \le hR\prod_{i=1}^{t}\log N(\mathfrak{p}_i). \]
The linear dependence on H of the bounds in (4.1.6), (4.1.8) and (4.1.9) cannot be improved. Indeed, let a1 = 1 − ε with ε ∈ O_S^* and a2 = 1. Then equations (4.1.5) and (4.1.7) have the solution x1 = 1, x2 = ε, and it is easy to see that
\[ H - \log 2 \le \max\{h(x_1), h(x_2)\} \le H + \log 2. \]
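This example can be checked numerically over K = Q, where the height of a rational number p/q in lowest terms is log max(|p|, |q|). The sketch below builds a1 = 1 − ε and a2 = 1 for a given S-unit ε, verifies that x1 = 1, x2 = ε solves the equation, and returns H together with the height of the solution. The function names and the sample values of ε in the usage note are illustrative assumptions.

```python
from fractions import Fraction
import math

def height(x: Fraction) -> float:
    """Absolute logarithmic height of a rational number (K = Q)."""
    return math.log(max(abs(x.numerator), x.denominator))

def sharpness_example(eps: Fraction):
    """Return (H, max height of the solution) for a1 = 1 - eps, a2 = 1."""
    a1, a2 = 1 - eps, Fraction(1)
    x1, x2 = Fraction(1), eps
    if a1 == 0 or a1 * x1 + a2 * x2 != 1:
        raise ValueError("eps must differ from 1")
    H = max(1.0, height(a1), height(a2))
    return H, max(height(x1), height(x2))
```

For ε = 2^5 · 3^3 = 864 one gets H = log 863 and solution height log 864; for ε = 2^7/3^4 one gets H = log 81 and solution height log 128. In both cases the two quantities differ by less than log 2.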
4.2 Approximation by elements of a finitely generated multiplicative group
We deduce Theorem 4.1.3 from the following Diophantine approximation theorem. Keeping the above notation, we put N(v) := 2 if v is an infinite place,
and N (v) := N (p) if v = p is a finite place, i.e., prime ideal of OK .
Theorem 4.2.1 Let Γ be a finitely generated multiplicative subgroup of K* with system of generators {ξ_1, …, ξ_m} for Γ/Γ_tors. Let α ∈ K*, and put
\[ H := \max(h(\alpha), 1), \qquad \Theta := h(\xi_1)\cdots h(\xi_m). \]
Further, let v ∈ M_K. Then for every ξ ∈ Γ with αξ ≠ 1, we have
\[ \log|1 - \alpha\xi|_v > -c_8\,\frac{N(v)}{\log N(v)}\,\Theta H\log^*\Bigl(\frac{N(v)h(\xi)}{H}\Bigr), \tag{4.2.1} \]
where
\[ c_8 := 2\lambda\cdot(m+1)\log^*(dm)(\log^* d)^2(16ed)^{3m+5} \]
with λ = 12 if m = 1, λ = 1 if m ≥ 2.
The following theorem is an immediate consequence of Theorem 4.2.1. The
estimate (4.2.3) below is of a similar flavour to results in Bombieri (1993),
Bombieri and Cohen (1997, 2003) and Bugeaud (1998) (see also Bilu (2002),
Bombieri and Gubler (2006), section 5.4), but, as will be seen in Section 4.5,
inequality (4.2.3) below gives in many cases a better upper bound for h(ξ ).
Theorem 4.2.2 Let α ∈ K*, v ∈ M_K and 0 < κ ≤ 1. If ξ ∈ Γ is such that αξ ≠ 1 and
\[ \log|1 - \alpha\xi|_v < -\kappa h(\xi) \tag{4.2.2} \]
then
\[ h(\xi) < 6.4\,(c_8/\kappa)\,\frac{N(v)}{\log N(v)}\,\Theta H\max\{\log((c_8/\kappa)N(v)),\ \log^*\Theta\}. \tag{4.2.3} \]
Similar results were proved in Bérczes, Evertse and Győry (2009) but with c8 replaced by a constant which, for m > q > 0, contains also the factor q^q. Here q denotes the rank of Γ. It is crucial for some applications of Theorems 4.2.1 and 4.2.2, for example in Theorem 4.1.3 and Theorem 4.1.7, that no factor q^q occurs in c8.
The main tool in the proofs of Theorems 4.2.1 and 4.2.2 is the theory of logarithmic forms, more precisely Theorem 3.2.8. It will be combined with some
new results from the geometry of numbers and some estimates for fundamental/
independent units.
4.3 Tools
4.3.1 Some geometry of numbers
Let $V$ be a real vector space of finite dimension $n$. We endow $V$ with a topology by choosing a linear isomorphism $\varphi : V \to \mathbb{R}^n$ and taking the inverse images under $\varphi$ of the open sets of $\mathbb{R}^n$. This does not depend on the choice of $\varphi$.
By a lattice in $V$ we mean an additive group of the shape
\[ L = \Bigl\{ \sum_{i=1}^{q} z_i \mathbf{a}_i : z_1, \dots, z_q \in \mathbb{Z} \Bigr\}, \]
where $\mathbf{a}_1, \dots, \mathbf{a}_q$ are linearly independent vectors of $V$. We call $\{\mathbf{a}_1, \dots, \mathbf{a}_q\}$ a basis of $L$ and $q$ the dimension of $L$. Clearly, $q \leq n$. By a full lattice in $V$ we mean a lattice in $V$ of maximal dimension $n$.
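A norm or convex distance function on $V$ is a function $\|\cdot\| : V \to \mathbb{R}_{\geq 0}$ such that
\[ \|\mathbf{x} + \mathbf{y}\| \leq \|\mathbf{x}\| + \|\mathbf{y}\| \quad \text{for } \mathbf{x}, \mathbf{y} \in V; \]
\[ \|\lambda \mathbf{x}\| = |\lambda| \cdot \|\mathbf{x}\| \quad \text{for } \mathbf{x} \in V,\ \lambda \in \mathbb{R}; \]
\[ \|\mathbf{x}\| = 0 \quad \text{if and only if } \mathbf{x} = \mathbf{0}. \]
The unit ball of $\|\cdot\|$ is defined by
\[ B_{\|\cdot\|} = \{\mathbf{x} \in V : \|\mathbf{x}\| \leq 1\}. \]
It is a convex, compact, symmetric body in $V$, i.e., it is convex, symmetric about $\mathbf{0}$, and it is compact and has interior points with respect to the topology on $V$ defined above. Conversely, with any convex, compact, symmetric body $C$ in $V$ one can associate a norm $\|\cdot\|_C$ on $V$ such that $C$ is the unit ball of $\|\cdot\|_C$: take $\|\mathbf{x}\|_C := \lambda$, where $\lambda$ is the minimum of all reals $\mu \geq 0$ such that $\mathbf{x} \in \mu C := \{\mu \mathbf{y} : \mathbf{y} \in C\}$.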
Let $\|\cdot\|$ be a norm on $V$, and $L$ a $q$-dimensional lattice in $V$. For $i = 1, \dots, q$ we define the $i$-th successive minimum $\lambda_i = \lambda_i(\|\cdot\|, L)$ of $\|\cdot\|$ with respect to $L$ to be the minimum of all numbers $\lambda$ such that $\{\mathbf{x} \in V : \|\mathbf{x}\| \leq \lambda\}$ contains at least $i$ linearly independent vectors from $L$.
We recall Minkowski's Theorem on successive minima. For technical simplicity we restrict ourselves to the special case of full lattices in $\mathbb{R}^q$. But note that the general case can be reduced to this special case by means of a linear isomorphism. We denote by "vol" the Lebesgue measure on $\mathbb{R}^q$, normalized such that the unit cube $[0, 1]^q$ has measure 1. If $L$ is a full lattice in $\mathbb{R}^q$ with basis $\{\mathbf{a}_1, \dots, \mathbf{a}_q\}$, say, we define the determinant of $L$ by
\[ d(L) = |\det(\mathbf{a}_1, \dots, \mathbf{a}_q)|. \]
This is independent of the choice of the basis.
Theorem 4.3.1 Let $\lambda_1, \dots, \lambda_q$ be the successive minima of a norm $\|\cdot\|$ on $\mathbb{R}^q$ with respect to a full lattice $L$ in $\mathbb{R}^q$. Then
\[ \frac{2^q}{q!} \leq \lambda_1 \cdots \lambda_q \cdot \frac{\mathrm{vol}(B_{\|\cdot\|})}{d(L)} \leq 2^q. \]
Proof. For a proof, see Cassels (1959), chapter VIII or Minkowski (1910). We note that both the upper bound and the lower bound are best possible.
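As an illustration of Theorem 4.3.1, the following sketch computes the successive minima of the sum norm with respect to a planar lattice by brute force. The helper names, the example lattice with basis $(2,0), (1,3)$ and the search radius are assumptions made for this example; a bounded enumeration is only reliable when the given basis is reasonably short.

```python
import itertools

def l1_norm(x):
    """Sum norm |x_1| + |x_2|; its unit ball in the plane has area 2."""
    return sum(abs(t) for t in x)

def successive_minima_2d(basis, norm, search=6):
    """Brute-force successive minima of `norm` for the lattice Z*a + Z*b."""
    a, b = basis
    vectors = []
    for z1, z2 in itertools.product(range(-search, search + 1), repeat=2):
        if (z1, z2) != (0, 0):
            v = (z1 * a[0] + z2 * b[0], z1 * a[1] + z2 * b[1])
            vectors.append((norm(v), v))
    vectors.sort()
    lam1, v1 = vectors[0]
    for lam2, v2 in vectors:
        # The second minimum needs a vector linearly independent of v1.
        if v1[0] * v2[1] - v1[1] * v2[0] != 0:
            return lam1, lam2
    raise ValueError("no two independent vectors found in the search box")

def minkowski_middle_term(basis, lam1, lam2, ball_volume):
    """lambda_1 * lambda_2 * vol(B) / d(L) for a planar lattice."""
    a, b = basis
    d_lattice = abs(a[0] * b[1] - a[1] * b[0])
    return lam1 * lam2 * ball_volume / d_lattice
```

For the example lattice the minima are 2 and 4, the determinant is 6, and the middle term $\lambda_1\lambda_2 \cdot \mathrm{vol}(B)/d(L) = 8/3$ lies between $2^2/2! = 2$ and $2^2 = 4$.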
Corollary 4.3.2 Let $\|\cdot\|$ be a norm on $\mathbb{R}^q$, and $L$ a full lattice in $\mathbb{R}^q$ such that $\mathrm{vol}(B_{\|\cdot\|}) \geq 2^q d(L)$. Then there is a non-zero $\mathbf{x} \in L$ with $\|\mathbf{x}\| \leq 1$.
Proof. By Theorem 4.3.1 we have $\lambda_1 \leq (\lambda_1 \cdots \lambda_q)^{1/q} \leq 1$.
Theorem 4.3.3 Let $\|\cdot\|$ be a norm on $\mathbb{R}^q$, $L$ a full lattice in $\mathbb{R}^q$, and $\lambda_1, \dots, \lambda_q$ the successive minima of $\|\cdot\|$ with respect to $L$. Then $L$ has a basis $\{\mathbf{a}_1, \dots, \mathbf{a}_q\}$ such that $\|\mathbf{a}_i\| \leq \max(1, i/2)\lambda_i$ for $i = 1, \dots, q$.
Proof. See Cassels (1959), chapter V. The idea of the proof originates from Mahler.
We now prove a technical result, which will be applied later in combination
with logarithmic forms estimates. Proposition 4.4.1 from Section 4.4, which is
a consequence of Proposition 4.3.4 below, will play an important role in the
proof of Theorem 4.2.1.
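Proposition 4.3.4 Let $V$ be a real vector space, $L$ a lattice in $V$ of dimension $q \geq 1$, and $\|\cdot\|$ a norm on $V$, such that $\|\mathbf{x}\| \geq \theta > 0$ for all $\mathbf{x} \in L \setminus \{\mathbf{0}\}$. Further, let $m \geq q$ be an integer, and let $\mathbf{a}_1, \dots, \mathbf{a}_m$ be vectors in $L \setminus \{\mathbf{0}\}$ for which
\[ \mathbf{a}_1, \dots, \mathbf{a}_m \text{ generate } L \text{ as a } \mathbb{Z}\text{-module, and among all systems of } m \text{ vectors that generate } L,\ \prod_{i=1}^{m} \|\mathbf{a}_i\| \text{ is minimal.} \tag{4.3.1} \]
Then for every $\mathbf{x} \in L$ there are $b_1, \dots, b_m \in \mathbb{Z}$ such that
\[ \mathbf{x} = b_1 \mathbf{a}_1 + \cdots + b_m \mathbf{a}_m \quad \text{with } |b_i| \leq q^{2q} \, \frac{\|\mathbf{x}\|}{\theta} \ \text{ for } i = 1, \dots, m. \]
It is crucial for applications that in Proposition 4.3.4 $\mathbf{a}_1, \dots, \mathbf{a}_m$ do not have to form a basis of $L$.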
We assume that $V = \mathbb{R}^q$, $L = \mathbb{Z}^q$, which is no loss of generality. Indeed, we may assume without loss of generality that $V$ is the real vector space spanned by $L$. Let $\varphi : \mathbb{R}^q \to V$ be a linear isomorphism such that $\varphi(\mathbb{Z}^q) = L$. Define a norm $\|\cdot\|_\varphi$ on $\mathbb{R}^q$ by $\|\mathbf{x}\|_\varphi := \|\varphi(\mathbf{x})\|$. Then clearly, it suffices to prove Proposition 4.3.4 with $\mathbb{R}^q$, $\mathbb{Z}^q$, $\varphi^{-1}(\mathbf{a}_1), \dots, \varphi^{-1}(\mathbf{a}_m)$, $\|\cdot\|_\varphi$ instead of $V$, $L$, $\mathbf{a}_1, \dots, \mathbf{a}_m$, $\|\cdot\|$.
For the proof of Proposition 4.3.4 (with $V = \mathbb{R}^q$, $L = \mathbb{Z}^q$) we make some preparations. Since we assume $L = \mathbb{Z}^q$, the factor $d(L)$ in Theorem 4.3.1 and Corollary 4.3.2 disappears. Denote by $\lambda_1, \dots, \lambda_q$ the successive minima of $\|\cdot\|$ with respect to $\mathbb{Z}^q$. By assumption, we have $\lambda_1 \geq \theta$. We define
\[ V := \mathrm{vol}(B_{\|\cdot\|}) = \mathrm{vol}(\{\mathbf{x} \in \mathbb{R}^q : \|\mathbf{x}\| \leq 1\}). \]
We need a number of lemmas.
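Lemma 4.3.5 Let $\mathbf{f}_0, \mathbf{f}_1, \dots, \mathbf{f}_m$ be vectors in $\mathbb{Z}^q$ such that $\mathbf{f}_1, \dots, \mathbf{f}_m$ generate $\mathbb{Z}^q$. Then there are integers $b_1, \dots, b_m$ such that
\[ \mathbf{f}_0 = \sum_{i=1}^{m} b_i \mathbf{f}_i, \qquad |b_i| \leq M(\mathbf{f}_0, \dots, \mathbf{f}_m) \ \text{ for } i = 1, \dots, m, \]
where
\[ M(\mathbf{f}_0, \dots, \mathbf{f}_m) = \max_{0 \leq i_1 < \cdots < i_q \leq m} |\det(\mathbf{f}_{i_1}, \dots, \mathbf{f}_{i_q})|. \]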
Proof. This is a result of Borosh, Flahive, Rubin and Treybig (1989).
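A small worked instance may help to read Lemma 4.3.5. The sketch below computes $M(\mathbf{f}_0, \dots, \mathbf{f}_m)$ and then searches exhaustively for coefficients within that bound; the exhaustive search is only an illustration for tiny inputs and is not the method of Borosh, Flahive, Rubin and Treybig. All names and the example vectors are hypothetical.

```python
import itertools

def det(rows):
    """Exact integer determinant by Laplace expansion (small matrices only)."""
    n = len(rows)
    if n == 1:
        return rows[0][0]
    total = 0
    for j in range(n):
        minor = [r[:j] + r[j + 1:] for r in rows[1:]]
        total += (-1) ** j * rows[0][j] * det(minor)
    return total

def max_minor(vectors, q):
    """M(f_0, ..., f_m): largest |det| over all choices of q of the vectors."""
    return max(abs(det([list(v) for v in combo]))
               for combo in itertools.combinations(vectors, q))

def bounded_representation(f0, gens):
    """Search for integers b_i with f0 = sum b_i * gens[i] and |b_i| <= M."""
    q = len(f0)
    bound = max_minor([f0] + list(gens), q)
    for coeffs in itertools.product(range(-bound, bound + 1), repeat=len(gens)):
        if all(sum(c * g[k] for c, g in zip(coeffs, gens)) == f0[k]
               for k in range(q)):
            return bound, coeffs
    return bound, None
```

With $\mathbf{f}_0 = (3, -2)$ and the generating system $(1,0), (0,1), (1,1)$ of $\mathbb{Z}^2$, the bound is $M = 5$, and for example $(b_1, b_2, b_3) = (3, -2, 0)$ is an admissible representation.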
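Lemma 4.3.6 Let $\mathbf{f}_1, \dots, \mathbf{f}_q \in \mathbb{R}^q$. Then
\[ |\det(\mathbf{f}_1, \dots, \mathbf{f}_q)| \leq \frac{q!}{2^q} \, V \, \|\mathbf{f}_1\| \cdots \|\mathbf{f}_q\|. \]
Proof. We assume without loss of generality that $\mathbf{f}_1, \dots, \mathbf{f}_q$ are linearly independent. Put $\mathbf{g}_i := \|\mathbf{f}_i\|^{-1} \mathbf{f}_i$ for $i = 1, \dots, q$, and denote by $D$ the convex hull of the points $\pm \mathbf{g}_i$ ($i = 1, \dots, q$). Then our lemma follows at once from the observations $D \subset B_{\|\cdot\|}$ and
\[ \mathrm{vol}(D) = \frac{2^q}{q!} \cdot |\det(\mathbf{g}_1, \dots, \mathbf{g}_q)| = \frac{2^q}{q!} \cdot \frac{|\det(\mathbf{f}_1, \dots, \mathbf{f}_q)|}{\|\mathbf{f}_1\| \cdots \|\mathbf{f}_q\|}. \]
In what follows, let $\mathbf{a}_1, \dots, \mathbf{a}_m$ be as in Proposition 4.3.4, and assume again that $L = \mathbb{Z}^q$.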
Lemma 4.3.7 Let $i_1, \dots, i_q$ be any distinct indices from $\{1, \dots, m\}$. Then
\[ \prod_{j=1}^{q} \|\mathbf{a}_{i_j}\| \leq \frac{q!}{2^{q-1}} \, \lambda_1 \cdots \lambda_q. \]
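Proof. For convenience, we put
\[ \mu_1 = \cdots = \mu_{m-q+1} := \lambda_1, \qquad \mu_{m-q+2} := \lambda_2, \ \dots, \ \mu_m := \lambda_q. \]
By Theorem 4.3.3, the lattice $\mathbb{Z}^q$ has a basis $\{\mathbf{y}_1, \dots, \mathbf{y}_q\}$ such that $\|\mathbf{y}_i\| \leq \max(1, i/2)\lambda_i$ for $i = 1, \dots, q$. This implies
\[ \|\mathbf{a}_1\| \cdots \|\mathbf{a}_m\| \leq \|\mathbf{y}_1\|^{m-q+1} \|\mathbf{y}_2\| \cdots \|\mathbf{y}_q\| \leq \frac{q!}{2^{q-1}} \, \lambda_1^{m-q+1} \lambda_2 \cdots \lambda_q = \frac{q!}{2^{q-1}} \, \mu_1 \cdots \mu_m, \tag{4.3.2} \]
where we have used (4.3.1).
Without loss of generality we may assume that $\|\mathbf{a}_1\| \leq \cdots \leq \|\mathbf{a}_m\|$. Let $i_0 := 0$ and for $j = 1, \dots, q$ define $i_j$ to be the largest index $i$ such that $\mathrm{rank}\{\mathbf{a}_1, \dots, \mathbf{a}_i\} = j$. Then
\[ \|\mathbf{a}_i\| \geq \lambda_j \quad \text{for } i_{j-1} + 1 \leq i \leq i_j, \ j = 1, \dots, q, \]
and so
\[ \|\mathbf{a}_i\| \geq \mu_i \quad \text{for } i = 1, \dots, m. \]
Together with (4.3.2) this implies that for any subset $I$ of $\{1, \dots, m\}$,
\[ \prod_{i \in I} \frac{\|\mathbf{a}_i\|}{\mu_i} \leq \frac{q!}{2^{q-1}}. \]
Hence for any $q$ distinct indices $i_1, \dots, i_q$ from $\{1, \dots, m\}$,
\[ \prod_{j=1}^{q} \|\mathbf{a}_{i_j}\| \leq \|\mathbf{a}_{m-q+1}\| \cdots \|\mathbf{a}_m\| \leq \frac{q!}{2^{q-1}} \, \mu_{m-q+1} \cdots \mu_m \leq \frac{q!}{2^{q-1}} \, \lambda_1 \cdots \lambda_q, \]
which is our lemma.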
Proof of Proposition 4.3.4. Without loss of generality, we assume that $\mathbf{x} \neq \mathbf{0}$. In view of Lemma 4.3.5, it suffices to show that
\[ M(\mathbf{x}, \mathbf{a}_1, \dots, \mathbf{a}_m) \leq q^{2q} \cdot \frac{\|\mathbf{x}\|}{\theta}. \tag{4.3.3} \]
First, let $i_1, \dots, i_q$ be any $q$ distinct indices from $\{1, \dots, m\}$. Then by Lemmas 4.3.6 and 4.3.7, Theorem 4.3.1 (Minkowski's Theorem on successive minima) and our assumption $\|\mathbf{x}\| \geq \theta$, we have
\[ |\det(\mathbf{a}_{i_1}, \dots, \mathbf{a}_{i_q})| \leq \frac{q!}{2^q} \cdot V \cdot \|\mathbf{a}_{i_1}\| \cdots \|\mathbf{a}_{i_q}\| \leq \frac{(q!)^2}{2^{2q-1}} \cdot V \lambda_1 \cdots \lambda_q \leq \frac{(q!)^2}{2^{q-1}} \leq q^{2q} \cdot \frac{\|\mathbf{x}\|}{\theta}. \]
Next, let $i_1, \dots, i_{q-1}$ be any $q - 1$ distinct indices from $\{1, \dots, m\}$. Using Lemma 4.3.6, our assumption $\|\mathbf{a}_i\| \geq \theta$ for $i = 1, \dots, m$, and Lemma 4.3.7 and Theorem 4.3.1 (Minkowski's Theorem), we get
\[ |\det(\mathbf{x}, \mathbf{a}_{i_1}, \dots, \mathbf{a}_{i_{q-1}})| \leq \frac{q!}{2^q} \cdot V \|\mathbf{x}\| \cdot \|\mathbf{a}_{i_1}\| \cdots \|\mathbf{a}_{i_{q-1}}\| \leq \frac{q!}{2^q} \cdot V \|\mathbf{a}_{i_1}\| \cdots \|\mathbf{a}_{i_q}\| \cdot \frac{\|\mathbf{x}\|}{\theta} \]
(with $i_q \in \{1, \dots, m\} \setminus \{i_1, \dots, i_{q-1}\}$)
\[ \leq \frac{(q!)^2}{2^{2q-1}} \cdot V \lambda_1 \cdots \lambda_q \cdot \frac{\|\mathbf{x}\|}{\theta} \leq \frac{(q!)^2}{2^{q-1}} \cdot \frac{\|\mathbf{x}\|}{\theta} \leq q^{2q} \cdot \frac{\|\mathbf{x}\|}{\theta}. \]
This clearly proves (4.3.3) and Proposition 4.3.4.
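The last step of both chains uses the elementary inequality $(q!)^2 / 2^{q-1} \leq q^{2q}$. It follows from $q! \leq q^q$; the exact-integer check below is only a sanity test for small $q$, and the function name is illustrative.

```python
from math import factorial

def final_step_holds(q: int) -> bool:
    """Check (q!)^2 / 2^(q-1) <= q^(2q) in exact integer arithmetic."""
    return factorial(q) ** 2 <= 2 ** (q - 1) * q ** (2 * q)
```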
4.3.2 Estimates for units and S-units
Let K be an algebraic number field of degree d with ring of integers OK ,
unit rank r and regulator R. Denote by ωK the number of roots of unity
in K. We determine upper bounds for the heights of units and S-units in
a fundamental/maximal independent system. We start with some auxiliary
results. The first is due to Loher and Masser.
Proposition 4.3.8 For $n \geq 1$, let $\alpha_1, \dots, \alpha_n$ be multiplicatively independent non-zero elements of $K$. Then we have
\[ 58 \, (n! e^n / n^n) \, d^{n+1} (\log^* d) \, h(\alpha_1) \cdots h(\alpha_n) \geq \omega_K. \]
Proof. This is a consequence of Loher and Masser (2004), Theorem 3.
As is known, $n! e^n / n^n$ is asymptotic to $\sqrt{2\pi n}$ and $n! e^n / n^n \leq e\sqrt{n}$. Hence Proposition 4.3.8 gives
\[ 58 e \sqrt{n} \, d^{n+1} (\log^* d) \, h(\alpha_1) \cdots h(\alpha_n) \geq \omega_K. \tag{4.3.4} \]
For simplicity, we shall apply the consequence (4.3.4) of Proposition 4.3.8.
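The two facts about $n! e^n / n^n$ used in passing from Proposition 4.3.8 to (4.3.4) are easy to check numerically. The sketch below evaluates the quantity through logarithms to avoid overflow; the function name is an assumption of this example.

```python
import math

def stirling_ratio(n: int) -> float:
    """Return n! * e^n / n^n, computed via lgamma for numerical stability."""
    return math.exp(math.lgamma(n + 1) + n - n * math.log(n))

def within_bound(n: int) -> bool:
    """Check n! e^n / n^n <= e * sqrt(n), up to floating-point tolerance."""
    return stirling_ratio(n) <= math.e * math.sqrt(n) * (1 + 1e-12)
```

For $n = 1$ the bound $e\sqrt{n}$ is attained, and for large $n$ the ratio of `stirling_ratio(n)` to $\sqrt{2\pi n}$ tends to 1.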
Let $S = \{v_1, \dots, v_s\}$ be a finite set of places on $K$ which contains the set $M_K^\infty$ of the infinite places. Denote by $O_S$, $O_S^*$ and $R_S$ the ring of $S$-integers, the group of $S$-units and the $S$-regulator of $K$, respectively. If in particular $S = M_K^\infty$, then $s = r + 1$, $O_S = O_K$, $O_S^*$ is just the unit group $O_K^*$ of $K$, and $R_S = R$.
We define the constants
\[ c_9 := \frac{((s-1)!)^2}{2^{s-2} d^{s-1}}, \qquad c_9' := \frac{(s-1)!}{d^{s-1}}, \]
\[ c_{10} := 29 e \sqrt{s-2} \, d^{s-1} (\log^* d) \, c_9 \ \ (s \geq 3), \qquad c_{10}' := 29 e \sqrt{s-2} \, d^{s-1} (\log^* d) \, c_9' \ \ (s \geq 3), \]
\[ c_{11} := \frac{((s-1)!)^2}{2^{s-1}} \, (\log(3d))^3. \]
Proposition 4.3.9 Let $s \geq 2$. There exists in $K$ a fundamental (respectively independent) system $\{\varepsilon_1, \dots, \varepsilon_{s-1}\}$ of $S$-units with the following properties:
(i) $\prod_{i=1}^{s-1} h(\varepsilon_i) \leq c_9 R_S$ (resp. $c_9' R_S$);
(ii) $\max_{1 \leq i \leq s-1} h(\varepsilon_i) \leq c_{10} R_S$ (resp. $c_{10}' R_S$) if $s \geq 3$;
(iii) for such a fundamental system $\{\varepsilon_1, \dots, \varepsilon_{s-1}\}$, the absolute values of the entries of the inverse matrix of $(\log |\varepsilon_i|_{v_j})_{i,j=1,\dots,s-1}$ do not exceed $c_{11}$.
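To get a feeling for the constants in Proposition 4.3.9, they can be tabulated for given $s$ and $d$. The sketch assumes the convention $\log^* x = \max(1, \log x)$ and reads the primed constants as those attached to independent systems; the function and key names are illustrative.

```python
import math

def log_star(x: float) -> float:
    # Assumed convention: log* x = max(1, log x).
    return max(1.0, math.log(x))

def unit_constants(s: int, d: int) -> dict:
    """Constants c_9, c_9', c_11 (and c_10, c_10' when s >= 3)."""
    if s < 2:
        raise ValueError("Proposition 4.3.9 requires s >= 2")
    f = math.factorial(s - 1)
    consts = {
        "c9": f ** 2 / (2 ** (s - 2) * d ** (s - 1)),
        "c9_prime": f / d ** (s - 1),
        "c11": f ** 2 / 2 ** (s - 1) * math.log(3 * d) ** 3,
    }
    if s >= 3:
        k = 29 * math.e * math.sqrt(s - 2) * d ** (s - 1) * log_star(d)
        consts["c10"] = k * consts["c9"]
        consts["c10_prime"] = k * consts["c9_prime"]
    return consts
```

For $s = 2$ both $c_9$ and $c_9'$ equal $1/d$, in line with the case $s = 2$ of Lemma 4.3.10.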
Remark A similar result was proved earlier by Siegel (1969) for ordinary units, i.e., in the case $S = M_K^\infty$. The proof given below, which is a straightforward extension of Siegel's argument, is due to Győry and Yu (2006) and, in slightly weaker forms, to Hajdu (1993) and Bugeaud and Győry (1996a).
Recently, for multiplicatively independent $S$-units, Vaaler (2014), theorems 1, 2 obtained the slightly better upper bound $s!/(2d)^{s-1}$ instead of $c_9'$.
Proof. For $\alpha \in K \setminus \{0\}$, put
\[ \mathbf{v}(\alpha) := (\log |\alpha|_{v_1}, \dots, \log |\alpha|_{v_{s-1}}). \]
The full lattice $L$ in $\mathbb{R}^{s-1}$ spanned by the vectors $\mathbf{v}(\eta)$ with $\eta \in O_S^*$ has determinant $R_S$; see Section 1.8.
The function $\|\cdot\| : \mathbb{R}^{s-1} \to \mathbb{R}$ defined by
\[ \|\mathbf{x}\| := |x_1| + \cdots + |x_{s-1}| \]
for $\mathbf{x} = (x_1, \dots, x_{s-1}) \in \mathbb{R}^{s-1}$ is a norm; see Section 4.3.1. Denote by $V$ the volume of the unit ball $\{\mathbf{x} \in \mathbb{R}^{s-1} : \|\mathbf{x}\| \leq 1\}$. It is easy to check that
\[ V = 2^{s-1}/(s-1)!. \]
By Theorem 4.3.1 (Minkowski's Theorem on successive minima) the successive minima $\lambda_1, \dots, \lambda_{s-1}$ of $L$ with respect to $\|\cdot\|$ have the property
\[ \lambda_1 \cdots \lambda_{s-1} \leq 2^{s-1} R_S / V = (s-1)! \, R_S. \tag{4.3.5} \]
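Further, there are multiplicatively independent $S$-units $\eta_1, \dots, \eta_{s-1}$ in $O_S$ for which
\[ \|\mathbf{v}(\eta_i)\| = \lambda_i, \qquad i = 1, \dots, s-1. \tag{4.3.6} \]
However, for every $\eta \in O_S^*$ we have $\prod_{j=1}^{s} |\eta|_{v_j} = 1$, hence
\[ h(\eta) = \frac{1}{d} \sum_{j=1}^{s} \max(0, \log |\eta|_{v_j}) = \frac{1}{2d} \sum_{j=1}^{s} \bigl| \log |\eta|_{v_j} \bigr| = \frac{1}{2d} \left( \sum_{j=1}^{s-1} \bigl| \log |\eta|_{v_j} \bigr| + \Bigl| \sum_{i=1}^{s-1} \log |\eta|_{v_i} \Bigr| \right), \]
which implies that
\[ \frac{1}{2d} \|\mathbf{v}(\eta)\| \leq h(\eta) \leq \frac{1}{d} \|\mathbf{v}(\eta)\|. \tag{4.3.7} \]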
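A concrete instance of (4.3.7): take $K = \mathbb{Q}(\sqrt{2})$, so $d = 2$ and $s = 2$, and the unit $\eta = 1 + \sqrt{2}$. The sketch assumes the usual normalization in which the absolute value at a real place is the ordinary absolute value of the corresponding conjugate; the names are illustrative.

```python
import math

def height_from_logs(logs, d):
    """h(eta) = (1/d) * sum over the places in S of max(0, log|eta|_v)."""
    return sum(max(0.0, t) for t in logs) / d

def bounds_437(logs, d):
    """Return (||v(eta)||/(2d), h(eta), ||v(eta)||/d) for an S-unit eta."""
    norm = sum(abs(t) for t in logs[:-1])  # v(eta) uses v_1, ..., v_{s-1}
    return norm / (2 * d), height_from_logs(logs, d), norm / d

# log|eta|_v at the two real places of Q(sqrt 2) for eta = 1 + sqrt 2;
# the second conjugate is 1 - sqrt 2, of absolute value sqrt 2 - 1.
ETA_LOGS = [math.log(1 + math.sqrt(2)), math.log(math.sqrt(2) - 1)]
```

Here $h(\eta) = \tfrac12 \log(1 + \sqrt 2)$ coincides with the upper bound $\|\mathbf{v}(\eta)\|/d$, so the right-hand inequality in (4.3.7) can be an equality.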
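We infer from (4.3.5), (4.3.6) and (4.3.7) that
\[ \prod_{i=1}^{s-1} h(\eta_i) \leq \frac{(s-1)!}{d^{s-1}} \cdot R_S, \]
i.e. (i) holds for $\eta_1, \dots, \eta_{s-1}$.
It follows from Theorem 4.3.3 that there exists a fundamental system of $S$-units $\{\varepsilon_1, \dots, \varepsilon_{s-1}\}$ in $O_S$ such that
\[ \|\mathbf{v}(\varepsilon_i)\| \leq \max\{1, i/2\} \, \|\mathbf{v}(\eta_i)\|, \qquad i = 1, \dots, s-1. \tag{4.3.8} \]
Further, by (4.3.7), (4.3.8), (4.3.6) and (4.3.5) we have
\[ \prod_{i=1}^{s-1} h(\varepsilon_i) \leq \frac{1}{d^{s-1}} \prod_{i=1}^{s-1} \|\mathbf{v}(\varepsilon_i)\| \leq \frac{(s-1)!}{2^{s-2} d^{s-1}} \prod_{i=1}^{s-1} \|\mathbf{v}(\eta_i)\| \leq \frac{((s-1)!)^2}{2^{s-2} d^{s-1}} \cdot R_S, \tag{4.3.9} \]
which proves (i).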
(ii) is an immediate consequence of (i) and (4.3.4).
It remains to prove (iii). Putting $E := (\log |\varepsilon_i|_{v_j})_{i,j=1,\dots,s-1}$ we have $|\det(E)| = R_S$. If $s = 2$, then (1.5.3) and (1.8.5) prove (iii). Now let $s > 2$ and $e_{ij} := \det(E_{ij})/\det(E)$, where $E_{ij}$ denotes the matrix obtained from $E$ by omitting the $i$-th row and $j$-th column. It follows from (4.3.9) and Hadamard's inequality that
\[ |\det(E_{ij})| \leq \prod_{\substack{p=1 \\ p \neq i}}^{s-1} \Biggl( \sum_{\substack{q=1 \\ q \neq j}}^{s-1} \bigl| \log |\varepsilon_p|_{v_q} \bigr|^2 \Biggr)^{1/2} \leq \prod_{\substack{p=1 \\ p \neq i}}^{s-1} \|\mathbf{v}(\varepsilon_p)\| \leq \frac{((s-1)!)^2}{2^{s-2}} \cdot \frac{R_S}{\|\mathbf{v}(\varepsilon_i)\|}. \]
Together with (4.3.7), $|\det(E)| = R_S$ and Proposition 3.2.9, this proves (iii).
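The two estimates used for $|\det(E_{ij})|$ are Hadamard's inequality (the determinant is at most the product of the Euclidean row lengths) followed by the comparison of the Euclidean length of each row with its sum norm. A small numerical sketch with an arbitrary, hypothetical matrix:

```python
import math

def det(rows):
    """Determinant by Laplace expansion along the first row (small matrices)."""
    if len(rows) == 1:
        return rows[0][0]
    return sum((-1) ** j * rows[0][j] * det([r[:j] + r[j + 1:] for r in rows[1:]])
               for j in range(len(rows)))

def hadamard_chain(rows):
    """Return (|det|, product of Euclidean row lengths, product of sum norms)."""
    l2 = math.prod(math.sqrt(sum(t * t for t in r)) for r in rows)
    l1 = math.prod(sum(abs(t) for t in r) for r in rows)
    return abs(det(rows)), l2, l1

EXAMPLE = [[0.9, -0.3, 0.1], [0.2, 1.1, -0.5], [-0.4, 0.3, 0.7]]
```

`hadamard_chain(EXAMPLE)` returns three numbers in non-decreasing order, mirroring the first two inequalities of the display above.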
For $s \geq 3$, let
\[ c_{12} := 29 e \sqrt{s-2} \cdot \frac{((s-1)!)^2}{2^{s-2}} \cdot \pi^{s-2} d \log^* d. \]
When we apply Theorem 3.2.5 to unit equations, we shall get better bounds by using the following version of (i), Proposition 4.3.9.
Lemma 4.3.10 Let $\{\varepsilon_1, \dots, \varepsilon_{s-1}\}$ be a fundamental system of $S$-units in $K$ with the properties specified in Proposition 4.3.9. Then
\[ \prod_{i=1}^{s-1} \max(d h(\varepsilon_i), \pi) \leq \begin{cases} \max(R_S, \pi), & \text{if } s = 2, \\ c_{12} R_S, & \text{if } s \geq 3. \end{cases} \tag{4.3.10} \]
Proof. The case $s = 2$ is trivially true by Proposition 4.3.9. Suppose $s \geq 3$. Let $k$ denote the number of indices $i$ with $1 \leq i \leq s-1$ such that $d h(\varepsilon_i) < \pi$. Suppose first $1 \leq k \leq s-2$. Without loss of generality, we may assume $d h(\varepsilon_i) < \pi$ for $i = 1, \dots, k$ and $d h(\varepsilon_j) \geq \pi$ for $j = k+1, \dots, s-1$. Thus, using (4.3.4) and Proposition 4.3.9, we infer that
\[ \prod_{i=1}^{s-1} \max(d h(\varepsilon_i), \pi) = \frac{\pi^k}{d^k h(\varepsilon_1) \cdots h(\varepsilon_k)} \cdot d^{s-1} h(\varepsilon_1) \cdots h(\varepsilon_{s-1}) \leq 29 e \sqrt{k} \cdot \frac{((s-1)!)^2}{2^{s-2}} \cdot \pi^k d (\log^* d) R_S \leq c_{12} R_S, \]
which proves (4.3.10).
If $k = 0$, then (4.3.10) immediately follows from (i) of Proposition 4.3.9. For $k = s-1$, we have $d h(\varepsilon_i) < \pi$ for each $i$. Further, if $s \geq 3$ then (1.8.5) gives a lower bound for $R_S$ and (4.3.10) follows.
Let $\mathfrak{p}_1, \dots, \mathfrak{p}_t$ be the prime ideals in $S$, and put
\[ Q := N(\mathfrak{p}_1 \cdots \mathfrak{p}_t) \ \text{ if } t > 0, \qquad Q := 1 \ \text{ if } t = 0. \]
Let $h_K$ denote the class number of $K$, and let
\[ c_{13} := \begin{cases} 0, & \text{if } r = 0, \\ 1/d, & \text{if } r = 1, \\ 29 e \, r! \, r \sqrt{r-1} \, \log d, & \text{if } r \geq 2. \end{cases} \]
Proposition 4.3.11 Let $\theta_v$ ($v \in S$) be reals with $\sum_{v \in S} \theta_v = 0$. Then there exists $\varepsilon \in O_S^*$ such that
\[ \sum_{v \in S} \bigl| \log |\varepsilon|_v - \theta_v \bigr| \leq c_{13} d R + h_K \log Q. \tag{4.3.11} \]
Remark As will follow from the proof, in the special case $t = 0$ the unit $\varepsilon \in O_K^*$ occurring in Proposition 4.3.11 can be chosen from the group generated by independent units having the properties specified in (i) and (ii) of Proposition 4.3.9.
Proof. We start with the case $t = 0$. Then $S = \{v_1, \dots, v_{r+1}\}$, where $v_1, \dots, v_{r+1}$ are the infinite places of $K$. Write $\theta_i$ for $\theta_{v_i}$. If $r = 0$, then $S = \{v_1\}$, hence $\theta_1 = 0$, and thus the assertion holds with $\varepsilon = 1$. Assume that $r > 0$. Choose a system of independent units $\varepsilon_1, \dots, \varepsilon_r$ in $K$ with the properties specified in Proposition 4.3.9. Consider the system of linear equations
\[ \sum_{j=1}^{r} \log |\varepsilon_j|_{v_i} \, x_j = \theta_i, \qquad i = 1, \dots, r+1 \]
in $x_1, \dots, x_r$. The equations with $i = 1, \dots, r$ have a unique solution $(x_1, \dots, x_r) \in \mathbb{R}^r$, since $\det(\log |\varepsilon_j|_{v_i})_{i,j=1,\dots,r} = R \neq 0$. This solution satisfies also the equation with $i = r+1$, since $\sum_{i=1}^{r+1} \log |\varepsilon_j|_{v_i} = 0$ for $j = 1, \dots, r$, and $\sum_{i=1}^{r+1} \theta_i = 0$. Let $b_1, \dots, b_r$ be the rational integers with
\[ -\tfrac12 < b_j - x_j \leq \tfrac12 \quad \text{for } j = 1, \dots, r \]
and take $\varepsilon = \varepsilon_1^{b_1} \cdots \varepsilon_r^{b_r}$. Then
\[ \sum_{i=1}^{r+1} \bigl| \log |\varepsilon|_{v_i} - \theta_i \bigr| = \sum_{i=1}^{r+1} \Bigl| \sum_{j=1}^{r} (b_j - x_j) \log |\varepsilon_j|_{v_i} \Bigr| \leq \frac12 \sum_{i=1}^{r+1} \sum_{j=1}^{r} \bigl| \log |\varepsilon_j|_{v_i} \bigr| \leq \sum_{j=1}^{r} \sum_{i=1}^{r} \bigl| \log |\varepsilon_j|_{v_i} \bigr|. \]
We assert that if $r > 1$, then the inner sum over $i$ in the last expression is at most $(d/r) c_{13} R$. This can be seen by using (4.3.5), (4.3.6), the second inequality of (4.3.7) and by applying Proposition 4.3.8 to any $r - 1$ of the $\varepsilon_i$ ($1 \leq i \leq r$). Thus Proposition 4.3.11 is proved for $r > 1$. If $r = 1$, we can use (i) of Proposition 4.3.9 to prove the assertion.
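The rounding step of this proof can be traced in the simplest case $r = 1$. The sketch below takes $K = \mathbb{Q}(\sqrt 2)$ with fundamental unit $\varepsilon_1 = 1 + \sqrt 2$ and regulator $R = \log(1 + \sqrt 2)$ (standard facts about this field, used here as assumptions), solves the one-variable system and rounds to the nearest integer; the function name is illustrative.

```python
import math

# log|eps_1|_{v_1} for eps_1 = 1 + sqrt(2); at the second real place the
# value is -LOG_EPS, and the regulator of Q(sqrt 2) equals LOG_EPS.
LOG_EPS = math.log(1 + math.sqrt(2))

def approximate_by_unit(theta1: float):
    """Given (theta_1, theta_2) = (theta1, -theta1), return (b, total_error)
    where eps = eps_1^b and total_error = sum_i |log|eps|_{v_i} - theta_i|."""
    x = theta1 / LOG_EPS          # real solution of the linear system
    b = math.floor(x + 0.5)       # the integer with -1/2 < b - x <= 1/2
    error = abs(b * LOG_EPS - theta1) + abs(-b * LOG_EPS + theta1)
    return b, error
```

Since the error equals $2|b - x| \log(1 + \sqrt 2) \leq R$, this agrees with (4.3.11) for $t = 0$, $r = 1$, where $c_{13} d R = R$.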
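Now let $t > 0$. Recall that $S = M_K^\infty \cup S_0$, where $S_0 = \{\mathfrak{p}_1, \dots, \mathfrak{p}_t\}$ is the set of finite places, i.e., prime ideals in $S$. For $\mathfrak{p} \in S_0$, let $k_\mathfrak{p}$ be the integer such that
\[ -\tfrac12 < k_\mathfrak{p} + \frac{\theta_\mathfrak{p}}{h_K \log N(\mathfrak{p})} \leq \tfrac12. \]
There is $\alpha \in K^*$ such that $(\alpha) = \bigl( \prod_{\mathfrak{p} \in S_0} \mathfrak{p}^{k_\mathfrak{p}} \bigr)^{h_K}$. We have $\alpha \in O_S^*$ and
\[ \bigl| \log |\alpha|_\mathfrak{p} - \theta_\mathfrak{p} \bigr| = \bigl| -k_\mathfrak{p} h_K \log N(\mathfrak{p}) - \theta_\mathfrak{p} \bigr| \leq \tfrac12 h_K \log N(\mathfrak{p}) \quad (\mathfrak{p} \in S_0). \tag{4.3.12} \]
By the Product Formula, $\sum_{v \in S} \log |\alpha|_v = 0$. Together with what we just proved, this implies that there is $\eta \in O_K^*$ such that
\[ A := \sum_{v \in M_K^\infty} \Bigl| \log |\eta|_v + \log |\alpha|_v - \theta_v + \frac{B}{r+1} \Bigr| \leq c_{13} d R, \tag{4.3.13} \]
where
\[ B := \sum_{\mathfrak{p} \in S_0} (\log |\alpha|_\mathfrak{p} - \theta_\mathfrak{p}). \]
Now take $\varepsilon := \eta\alpha$. Clearly, (4.3.12) holds with $\varepsilon$ instead of $\alpha$. Hence
\[ \sum_{\mathfrak{p} \in S_0} \bigl| \log |\varepsilon|_\mathfrak{p} - \theta_\mathfrak{p} \bigr| \leq \tfrac12 h_K \log Q. \]
Further, (4.3.13) and (4.3.12) imply
\[ \sum_{v \in M_K^\infty} \bigl| \log |\varepsilon|_v - \theta_v \bigr| \leq A + |B| \leq c_{13} d R + \tfrac12 h_K \log Q. \]
Now (4.3.11) follows by a simple addition.
Recall that we have defined the $S$-norm $N_S(\alpha) := \prod_{v \in S} |\alpha|_v$ for $\alpha \in K$; see Section 1.8. In the case of $S = M_K^\infty$, this is just $|N_{K/\mathbb{Q}}(\alpha)|$. In addition, we define
\[ M_S(\alpha) := \max\Biggl( \prod_{v \in M_K \setminus S} \max(1, |\alpha|_v), \ \prod_{v \in M_K \setminus S} \max(1, |\alpha|_v^{-1}) \Biggr) \]
for $\alpha \in K^*$. By the Product Formula we have
\[ M_S(\alpha) = \prod_{v \in M_K \setminus S} |\alpha|_v^{-1} = N_S(\alpha) \quad \text{for } \alpha \in O_S \setminus \{0\}. \]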
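Proposition 4.3.12 Let $\alpha \in K^*$ and let $n$ be a positive integer. Then there exists $\varepsilon \in O_S^*$ such that
\[ h(\varepsilon^n \alpha) \leq \frac{1}{d} \log M_S(\alpha) + n \Bigl( c_{13} R + \frac{h_K}{d} \log Q \Bigr). \tag{4.3.14} \]
In particular, if $\alpha \in O_S \setminus \{0\}$ then there exists $\varepsilon \in O_S^*$ such that
\[ h(\varepsilon^n \alpha) \leq \frac{1}{d} \log N_S(\alpha) + n \Bigl( c_{13} R + \frac{h_K}{d} \log Q \Bigr). \tag{4.3.15} \]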
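Proof. Inequality (4.3.15) is an immediate consequence of (4.3.14). We prove (4.3.14). We assume that $N_S(\alpha) \geq 1$. This is no loss of generality since both the height and $M_S$ are invariant under $x \mapsto x^{-1}$, and $N_S(\alpha) \geq 1$ can be achieved by replacing $\alpha$ by $\alpha^{-1}$ if necessary.
By Proposition 4.3.11, there is $\varepsilon \in O_S^*$ such that
\[ B := \sum_{v \in S} \Bigl| \log |\varepsilon|_v + \frac{1}{n} \log |\alpha|_v - \frac{1}{sn} \log N_S(\alpha) \Bigr| \leq c_{13} d R + h_K \log Q, \]
where $s = |S|$. Hence
\[ \frac{1}{d} \log \prod_{v \in S} \max(1, |\varepsilon^n \alpha|_v) = \frac{1}{d} \sum_{v \in S} \max(0, n \log |\varepsilon|_v + \log |\alpha|_v) \leq \frac{n}{d} \cdot B + \frac{1}{d} \log N_S(\alpha) \leq n \Bigl( c_{13} R + \frac{h_K}{d} \log Q \Bigr) + \frac{1}{d} \log N_S(\alpha) \]
since by assumption $N_S(\alpha) \geq 1$. By adding $\frac{1}{d} \log \prod_{v \in M_K \setminus S} \max(1, |\varepsilon^n \alpha|_v)$ on both sides and observing that by the Product Formula
\[ N_S(\alpha) \prod_{v \in M_K \setminus S} \max(1, |\varepsilon^n \alpha|_v) = N_S(\alpha) \prod_{v \in M_K \setminus S} \max(1, |\alpha|_v) = \prod_{v \in M_K \setminus S} |\alpha|_v^{-1} \cdot \max(1, |\alpha|_v) \leq M_S(\alpha), \]
our Proposition follows.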
4.4 Proofs
4.4.1 Proofs of Theorems 4.1.1 and 4.1.2
We keep the notation from Section 4.1. Thus, K is an algebraic number field
of degree d, R is the regulator of K and r the rank of OK∗ . Further, H is a real
with H ≥ max(h(a1 ), h(a2 ), h(a3 )) and H ≥ max(1, π/d).
Proof of Theorem 4.1.1. For $r = 0$ the assertion is trivial, hence we assume that $r \geq 1$. Let $x_1, x_2, x_3$ be a solution of (4.1.1). Assume without loss of generality that $h(x_1/x_3) \geq h(x_i/x_j)$ for $1 \leq i < j \leq 3$. Put
\[ \alpha := -a_1/a_3, \quad \beta := -a_2/a_3, \quad x := x_1/x_3, \quad y := x_2/x_3. \tag{4.4.1} \]
Then
\[ \alpha x + \beta y = 1, \qquad x, y \in O_K^*, \tag{4.4.2} \]
\[ \max\{h(\alpha), h(\beta)\} \leq 2H. \tag{4.4.3} \]
Clearly, $h(x) \geq h(y)$. We give an upper bound for the height of $x$. Let $\varepsilon_1, \dots, \varepsilon_r$ be a fundamental system of units in $K$ with the properties specified in Proposition 4.3.9. Then $y$ can be written in the form
\[ y = \zeta \varepsilon_1^{b_1} \cdots \varepsilon_r^{b_r}, \tag{4.4.4} \]
where $\zeta$ is a root of unity in $K$ and $b_1, \dots, b_r$ are rational integers. Denote by $v_1, \dots, v_{r+1}$ the infinite places of $K$. We infer from (4.4.4) that
\[ \log |y|_{v_j} = \sum_{i=1}^{r} b_i \log |\varepsilon_i|_{v_j}, \qquad j = 1, \dots, r, \]
whence, using (iii) of Proposition 4.3.9 and the fact that $y$ is a unit, we get
\[ \max\{|b_1|, \dots, |b_r|\} \leq c_{11}' \sum_{j=1}^{r} \bigl| \log |y|_{v_j} \bigr| \leq 2 c_{11}' d h(y) \leq 2 c_{11}' d h(x), \tag{4.4.5} \]
where $c_{11}'$ denotes the constant $c_{11}$ with $s - 1$ replaced by $r$. Set $\alpha_{r+1} := \zeta\beta$ and $b_{r+1} = 1$. Let $v$ be an infinite place for which $|x|_v$ is minimal. Then, from (4.4.2) we deduce that
\[ \log \bigl| \varepsilon_1^{b_1} \cdots \varepsilon_r^{b_r} \alpha_{r+1}^{b_{r+1}} - 1 \bigr|_v = \log |\alpha x|_v \leq -\frac{d}{r+1} h(x) + 2dH. \tag{4.4.6} \]
We shall prove that
\[ h(x) < c_{14} R (\log^* R) H, \tag{4.4.7} \]
where
\[ c_{14} := \min\{(r+1)^{2r+9} \, 2^{3.2r+38.4}, \ (r+1)^{2r+3.5} \, 2^{4.3r+44.3}\} \times 2 \log(2r+2) \, (d \log^*(2d))^3, \]
which is somewhat stronger than (4.1.2). Set
\[ A_i = \max(d h(\varepsilon_i), \pi), \ i = 1, \dots, r, \qquad A_{r+1} = 2dH \geq \max(d h(\alpha_{r+1}), \pi). \tag{4.4.8} \]
We may assume that $h(x) > 4(r+1)H$ and
\[ 2 c_{11}' d h(x) > 2e \max\Bigl( \frac{(r+1)\pi}{\sqrt{2}}, A_1, \dots, A_r \Bigr) A_{r+1}, \]
since otherwise, using (1.5.3), Proposition 4.3.9 and (4.3.4), the upper bound (4.4.7) easily follows. In view of (4.4.5), we can apply Theorem 3.2.5 with $B = 2 c_{11}' d h(x)$. Combining this with Lemma 4.3.10, and using (4.4.6) and (4.4.8), we infer that
\[ \log |\alpha x|_v > \begin{cases} -2 d_v C_2(2, d) \, dH \max\{R, \pi\} \log \dfrac{2 c_{11}' h(x)}{2\sqrt{2} H} & \text{if } r = 1, \\[2ex] -2 d_v C_2(r+1, d) \, c_{12}' \, dH R \log \dfrac{2 c_{11}' h(x)}{2\sqrt{2} H} & \text{if } r \geq 2, \end{cases} \]
where $C_2(r+1, d)$ is the constant occurring in Theorem 3.2.5 and $c_{12}'$ denotes the constant $c_{12}$ with $s - 1$ replaced by $r$. Together with (4.4.6) this implies (4.4.7), hence (4.1.2).
Proof of Theorem 4.1.2. We follow the arguments of the proof of Theorem 4.1.1. Let $x_1, x_2, x_3$ be an arbitrary but fixed solution of (4.1.1) with $x_2 \in K_1$ and $x_3 = \sigma(x_2)$. The cases $x_3 = x_2$ and $r_1 = 0$ being trivial, we assume that $x_3 \neq x_2$ and $r_1 \geq 1$. Then $d \geq d_1 \geq 2$. We define again $\alpha := -a_1/a_3$, $\beta := -a_2/a_3$, $x := x_1/x_3$, $y := x_2/x_3$ so that we have again (4.4.2), (4.4.3). Let $\{\varepsilon_1, \dots, \varepsilon_{r_1}\}$ be a fundamental system of units in $K_1$ with the properties specified in Proposition 4.3.9. Then
\[ x_2 = \zeta \varepsilon_1^{b_1} \cdots \varepsilon_{r_1}^{b_{r_1}} \]
with a root of unity $\zeta$ in $K_1$ and with rational integers $b_1, \dots, b_{r_1}$. We obtain as in the proof of Theorem 4.1.1 that
\[ \max\{|b_1|, \dots, |b_{r_1}|\} \leq 2 c_{15} d_1 h(x_2), \tag{4.4.9} \]
where
c15 := 2 (r1 !)2 /2r1 (log(3d1 ))3 .
Consider the infinite place $v$ on $K$ for which $|x|_v$ is minimal. Setting $\alpha_{r_1+1} = -a_2 \zeta / (a_3 \sigma(\zeta))$ and $\eta_i = \varepsilon_i / \sigma(\varepsilon_i)$ for $i = 1, \dots, r_1$, we deduce from (4.4.2) that
\[ \log \bigl| \eta_1^{b_1} \cdots \eta_{r_1}^{b_{r_1}} \alpha_{r_1+1} - 1 \bigr|_v = \log \Bigl| \frac{a_1 x_1}{a_3 x_3} \Bigr|_v \leq -\frac{d}{r+1} h(x_1/x_3) + 2dH. \tag{4.4.10} \]
We have $h(\alpha_{r_1+1}) \leq 2H$ and $h(\eta_i) \leq 2h(\varepsilon_i)$ for $i = 1, \dots, r_1$.
To apply Theorem 3.2.5 to the left-hand side of (4.4.10), set
\[ A_i := \max\{d h(\varepsilon_i), \pi\}, \ i = 1, \dots, r_1, \qquad A_{r_1+1} := 2dH. \tag{4.4.11} \]
These imply
\[ \max\{d h(\eta_i), \pi\} \leq 2A_i, \qquad i = 1, \dots, r_1, \tag{4.4.12} \]
\[ \max\{d h(\alpha_{r_1+1}), \pi\} \leq A_{r_1+1}. \tag{4.4.13} \]
We may assume that $h(x_1/x_3) > 4(r+1)H$ and
\[ 2 c_{15} d_1 h(x_2) > 2e \max\Bigl( \frac{(r_1+1)\pi}{\sqrt{2}}, 2A_1, \dots, 2A_{r_1} \Bigr) A_{r_1+1} \]
since otherwise, using (1.5.3), Proposition 4.3.9 and (4.4.11), (4.4.13), we obtain $h(x_2) < 320 d^2 r_1 R_{K_1} H$ which contradicts our assumption (4.1.4). Applying now Theorem 3.2.5 and using (4.4.9), (4.4.10), (4.4.12) and (4.4.13), we obtain
\[ \log \Bigl| \frac{a_1 x_1}{a_3 x_3} \Bigr|_v > -C_2(r_1+1, d) \, 2^{r_1} A_1 \cdots A_{r_1+1} \log \Bigl( \frac{c_{15} d_1 h(x_2)}{d \sqrt{2} H} \Bigr), \tag{4.4.14} \]
where $C_2(r_1+1, d)$, coming from Theorem 3.2.5, is
\[ C_2(r_1+1, d) = \min\bigl\{ 1.451 \, (30\sqrt{2})^{r_1+5} (r_1+2)^{5.5}, \ \pi \, 2^{6.5 r_1 + 33.5} \bigr\} \, d^2 \log(ed). \]
Comparing (4.4.14) with (4.4.10) and using (4.4.11), (4.4.13), Lemma 4.3.10, (4.1.4) and (1.5.3) we deduce first for $h(x_1/x_3)$ and then, by (4.4.1)–(4.4.3), for each $h(x_i/x_j)$ the estimate (4.1.3).
4.4.2 Proofs of Theorems 4.2.1 and 4.2.2
The proof of Theorem 4.2.1 will be based on Theorem 3.2.8. Theorem 4.2.2 is
a simple corollary of Theorem 4.2.1. We need also the following.
Proposition 4.4.1 Let $\Gamma$ be a finitely generated multiplicative subgroup of $K^*$ with $\mathrm{rank}\, \Gamma = q > 0$. Let $m \geq q$ be a given integer, and let $\{\xi_1, \dots, \xi_m\}$ be a system of generators for $\Gamma/\Gamma_{\mathrm{tors}}$ such that the product
\[ h(\xi_1) \cdots h(\xi_m) \ \text{ is minimal} \tag{4.4.15} \]
among all systems of $m$ elements that generate $\Gamma/\Gamma_{\mathrm{tors}}$. Then for every $\xi \in \Gamma$ there are rational integers $b_1, \dots, b_m$ and a root of unity $\zeta$ such that $\xi = \zeta \xi_1^{b_1} \cdots \xi_m^{b_m}$ and
\[ \max(|b_1|, \dots, |b_m|) \leq c_{16} \, (d/2)(\log 3d)^3 \, h(\xi), \tag{4.4.16} \]
where $c_{16} := q^{2q}$.
Proof. Let $S = \{v_1, \dots, v_s\}$ be a finite subset of $M_K$ such that $S \supseteq M_K^\infty$ and $\Gamma$ is a subgroup of $O_S^*$. Then for $\xi \in \Gamma$ we have
\[ h(\xi) = \frac{1}{2d} \sum_{i=1}^{s} \bigl| \log |\xi|_{v_i} \bigr|. \]
Let $\{\eta_1, \dots, \eta_q\}$ be a basis for $\Gamma/\Gamma_{\mathrm{tors}}$. Then every $\xi \in \Gamma$ can be expressed uniquely as
\[ \xi = \zeta \eta_1^{x_1} \cdots \eta_q^{x_q} \quad \text{with } x_1, \dots, x_q \in \mathbb{Z} \text{ and a root of unity } \zeta. \tag{4.4.17} \]
We define a norm on $\mathbb{R}^q$ by
\[ \|\mathbf{x}\| := \frac{1}{2d} \sum_{i=1}^{s} \Bigl| \sum_{j=1}^{q} x_j \log |\eta_j|_{v_i} \Bigr| \quad \text{for } \mathbf{x} = (x_1, \dots, x_q) \in \mathbb{R}^q. \]
Then if $\xi$ and $\mathbf{x} = (x_1, \dots, x_q) \in \mathbb{Z}^q$ are related by (4.4.17), we have
\[ h(\xi) = \|\mathbf{x}\|. \tag{4.4.18} \]
Further, by Proposition 3.2.9,
\[ \|\mathbf{x}\| \geq \theta := \frac{2}{d (\log 3d)^3} \quad \text{for } \mathbf{x} \in \mathbb{Z}^q \setminus \{\mathbf{0}\}. \]
Define vectors $\mathbf{a}_i = (a_{i1}, \dots, a_{iq}) \in \mathbb{Z}^q$ by
\[ \xi_i = \zeta_i \eta_1^{a_{i1}} \cdots \eta_q^{a_{iq}}, \qquad i = 1, \dots, m, \]
where $\zeta_i$ is a root of unity. Then $\mathbf{a}_1, \dots, \mathbf{a}_m$ generate $\mathbb{Z}^q$ and by (4.4.15) and (4.4.18), the product $\|\mathbf{a}_1\| \cdots \|\mathbf{a}_m\|$ is minimal. Further, if $\xi$ and the vector $\mathbf{x} = (x_1, \dots, x_q) \in \mathbb{Z}^q$ are related by (4.4.17), it follows that
\[ \xi = \zeta \xi_1^{b_1} \cdots \xi_m^{b_m} \quad \text{with } \zeta \in \Gamma_{\mathrm{tors}}, \ b_1, \dots, b_m \in \mathbb{Z} \tag{4.4.19} \]
holds for some root of unity $\zeta$ if and only if
\[ \mathbf{x} = \sum_{i=1}^{m} b_i \mathbf{a}_i. \]
Now Proposition 4.4.1 follows immediately by applying Proposition 4.3.4 to the norm $\|\cdot\|$ defined above.
Proof of Theorem 4.2.1. We may assume without loss of generality that $\xi_1, \dots, \xi_m$ have been chosen so that $h(\xi_1) \cdots h(\xi_m)$ is minimal among all systems of $m$ elements that generate $\Gamma/\Gamma_{\mathrm{tors}}$. By Proposition 4.4.1, there are rational integers $b_1, \dots, b_m$ and a root of unity $\zeta$ in $K$ for which (4.4.19) and (4.4.16) hold. Then we have
\[ 1 - \alpha\xi = 1 - \alpha' \xi_1^{b_1} \cdots \xi_m^{b_m} \quad \text{with } \alpha' = \zeta\alpha. \]
Let
\[ c_{17} := m^{2m} (d/2)(\log 3d)^3. \]
We distinguish two cases.
First assume that $c_{17} h(\xi) \geq 2e(3d)^{2(m+1)} H$. Then we can apply Theorem 3.2.8 with $B = c_{17} h(\xi)$ and (4.2.1) follows.
Next consider the case when $c_{17} h(\xi) < 2e(3d)^{2(m+1)} H$. Then, by the Product Formula we have the following Liouville-type inequality
\[ |1 - \alpha\xi|_v = \prod_{\substack{w \in M_K \\ w \neq v}} |1 - \alpha\xi|_w^{-1} \geq 2^{-d} \prod_{\substack{w \in M_K \\ w \neq v}} \max(1, |\alpha\xi|_w)^{-1} \geq 2^{-d} \exp(-d h(\alpha\xi)) \geq 2^{-d} \exp\Bigl( -d \Bigl( H + \frac{2e(3d)^{2(m+1)} H}{c_{17}} \Bigr) \Bigr). \]
In view of Proposition 3.2.9 we have
\[ \Theta \geq \Bigl( \frac{2}{d (\log 3d)^3} \Bigr)^m \ \text{ if } d \geq 2 \quad \text{and} \quad \Theta \geq (\log 2)^m \ \text{ if } d = 1, \tag{4.4.20} \]
hence we obtain again (4.2.1).
Proof of Theorem 4.2.2. Let $\xi \in \Gamma$ be such that $\alpha\xi \neq 1$ and (4.2.2) holds. By Theorem 4.2.1 we have (4.2.1). Then, with the notation $X := N(v) h(\xi)/H$ and $b := c_8 \kappa^{-1} \Theta N(v)^2 / (\log N(v))$, it follows that $X \leq b \log X$. In view of (4.4.20) we have $b \geq e^2$. We use now that if $a, b, X$ are real numbers with $a \geq 0$, $b \geq 1$, $X \geq 1$ and $X \leq b \log X + a$ then
\[ X \leq \begin{cases} 2(b \log b + a) & \text{if } b > e^2, \\ 2(2e^2 + a) & \text{if } b \leq e^2 \end{cases} \tag{4.4.21} \]
(see Pethő and de Weger (1986)). This gives (4.2.3).
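The elementary inequality (4.4.21) can be tested numerically. The sketch below finds, by bisection, the largest $X \geq 1$ with $X \leq b \log X + a$ and compares it with the stated bound; it is a sanity check on sample values, not a proof, and the function names are illustrative.

```python
import math

def largest_solution(b: float, a: float, tol: float = 1e-9):
    """Largest X >= 1 with X <= b*log(X) + a, or None if there is none.

    g(X) = b*log(X) + a - X is decreasing for X >= b, so on [max(b, 1), inf)
    the solution set is an interval starting at max(b, 1) when non-empty.
    """
    g = lambda x: b * math.log(x) + a - x
    lo = max(b, 1.0)
    if g(lo) < 0:
        return None
    hi = 2 * lo + 2
    while g(hi) >= 0:
        hi *= 2
    while hi - lo > tol * hi:
        mid = (lo + hi) / 2
        if g(mid) >= 0:
            lo = mid
        else:
            hi = mid
    return lo

def bound_4421(b: float, a: float) -> float:
    """Right-hand side of (4.4.21)."""
    if b > math.e ** 2:
        return 2 * (b * math.log(b) + a)
    return 2 * (2 * math.e ** 2 + a)
```

For example, with $b = 10$, $a = 0$ the largest solution is about $35.8$, while (4.4.21) gives $20 \log 10 \approx 46.1$.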
4.4.3 Proofs of Theorem 4.1.3 and its corollaries
We keep the notation from Section 4.1. In particular, $K$ is an algebraic number field of degree $d$, $\Gamma$ is a finitely generated subgroup of $K^*$ of rank $q > 0$, and $S$ is the smallest set of places of $K$ containing all infinite places, such that $\Gamma \subseteq O_S^*$.
Proof of Theorem 4.1.3. Let x1 , x2 be a solution of (4.1.5). It follows from
(4.1.5) that
h(x1 ) ≤ 3H + h(x2 ) + log 2.
(4.4.22)
First assume that h(x2 ) < 400sH . Then (4.4.22) gives
h(x1 ) < 404sH,
whence P · h(x1 )/H ≤ 404sP . Using now Proposition 3.2.9 and the fact that
the function X/ log X is monotone increasing for X > e, (4.1.6) easily follows.
Now assume that
h(x2 ) ≥ 400sH.
(4.4.23)
Pick $v \in S$ for which $|x_2|_v$ is minimal. Then we deduce from (4.1.5) that
\[ \log |1 - a_1 x_1|_v = \log |a_2 x_2|_v \leq -\frac{d}{s} h(x_2) + dH. \tag{4.4.24} \]
Further, (4.4.22) and (4.4.23) imply that
h(x1 ) ≤ 1.01h(x2 ).
Hence we infer from (4.4.23) and (4.4.24) that
log |1 − a1 x1 |v < −κh(x1 )
with the choice κ = d/(2.02s). By applying Theorem 4.2.2, for h(x1 ) we get the
upper bound occurring in (4.1.6) with 6.5 replaced by 6.4 in the bound. Finally,
(4.1.5) implies h(x2 ) ≤ 3H + h(x1 ) + log 2. But, in view of Proposition 3.2.9
the bound obtained for h(x1 ) is much larger than 3H + log 2, hence we get
(4.1.6) for h(x2 ) as well.
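The numerical constants $1.01$ and $\kappa = d/(2.02s)$ used in this proof can be verified as follows. This is a sketch that assumes $\log 2 \leq H$, which is also what the step $h(x_1) < 404sH$ above requires.

```latex
% From (4.4.23): H <= h(x_2)/(400 s) <= h(x_2)/400.
\begin{align*}
h(x_1) &\le h(x_2) + 3H + \log 2 \le h(x_2) + 4H
        \le \Bigl(1 + \frac{4}{400s}\Bigr) h(x_2) \le 1.01\, h(x_2), \\
\log|1 - a_1 x_1|_v
       &\le -\frac{d}{s}\, h(x_2) + dH
        \le -\frac{d}{s}\Bigl(1 - \frac{1}{400}\Bigr) h(x_2) \\
       &\le -\frac{d}{s}\cdot\frac{0.9975}{1.01}\, h(x_1)
        < -\frac{d}{2.02\, s}\, h(x_1),
\end{align*}
% since 0.9975/1.01 > 0.98 > 1/2.02.
```

Note also that $\kappa = d/(2.02s) \leq 1$, as Theorem 4.2.2 requires, because $s \geq r + 1 \geq d/2$.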
Proof of Corollary 4.1.4. For given a1 , a2 ∈ K ∗ , the finiteness of the number of
solutions of equation (4.1.5) in x1 , x2 ∈ immediately follows from Theorem
4.1.3 and Theorem 1.9.3.
The group of roots of unity in K being cyclic, tors is also cyclic. Suppose
that K, a1 , a2 , a system of generators {ξ1 , . . . , ξm } of / tors and a generator
Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:29:29, subject to the Cambridge
Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.006
4.4 Proofs
85
ζ of $\Gamma_{\mathrm{tors}}$ are effectively given. We shall utilize some algorithmic results from algebraic number theory, references to the literature of which are listed in Section 1.10. The factorizations of $\xi_1,\dots,\xi_m$ into prime ideals can be effectively determined. Then the set S consisting of all infinite places of K and of all prime ideals occurring in these factorizations can be effectively determined. If $x_1, x_2 \in \Gamma$ is a solution of (4.1.5), then $x_1, x_2$ belong to $O_S^*$, the group of S-units, and Theorem 4.1.3 provides an effectively computable upper bound for $h(x_1)$ and $h(x_2)$. Therefore $x_1, x_2$ are contained in a finite and effectively computable subset, say H, of $K^*$.
We can select those pairs $(x_1, x_2)$ from H × H that satisfy (4.1.5). From the remaining $x_1, x_2$ one can select those that are S-units. We have still to decide whether such $x_1, x_2$ are contained in Γ or not, that is, whether
\[ x_1 = \zeta^{z_0}\xi_1^{z_1}\cdots\xi_m^{z_m} \quad\text{with some } z_0,\dots,z_m \in \mathbb{Z}, \tag{4.4.25} \]
and similarly for $x_2$.
One can determine a fundamental system $\{\varepsilon_1,\dots,\varepsilon_{s-1}\}$ of S-units in K, where s = |S|, and a generator ρ of the group of roots of unity in K. Further, for any effectively given $\varepsilon \in O_S^*$, one can determine effectively rational integers $b_1,\dots,b_{s-1}$ and b with $0 \le b < w_K$ such that
\[ \varepsilon = \rho^{b}\varepsilon_1^{b_1}\cdots\varepsilon_{s-1}^{b_{s-1}}, \tag{4.4.26} \]
where $w_K$ denotes the number of roots of unity in K. In (4.4.25) we now represent $x_1, \zeta, \xi_1,\dots,\xi_m$ in the form (4.4.26) and compare the representations of the left- and right-hand sides of (4.4.25), the exponents of ρ being compared modulo $w_K$. Then we arrive at a system of linear equations in $z_0, z_1,\dots,z_m$. But one can decide whether this system of equations is solvable in Z or not, that is, whether $x_1 \in \Gamma$ or not. In case of $x_2$ one can proceed in the same way.
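The final step of this proof, deciding whether a system of linear equations is solvable in Z, can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name `solvable_over_Z` and the toy data (K = Q, S = {∞, 2, 3}, ρ = −1, w_K = 2, ε1 = 2, ε2 = 3, and Γ generated by ζ = −1, ξ1 = 4, ξ2 = 6) are hypothetical and not taken from the text. The congruence modulo w_K for the exponent of ρ is turned into an equation by an auxiliary integer unknown t.

```python
def solvable_over_Z(A, b):
    """Decide whether A z = b has a solution z in Z^m (A is a list of integer rows)."""
    A = [row[:] for row in A]            # work on a copy
    n, m = len(A), len(A[0])
    y = []                               # values of the transformed unknowns fixed so far
    col = 0                              # first column not yet used as a pivot
    for r in range(n):
        # Euclidean algorithm on columns col..m-1: unimodular column operations
        # leave at most one non-zero entry of row r among these columns.
        while True:
            nz = [j for j in range(col, m) if A[r][j] != 0]
            if len(nz) <= 1:
                break
            p = min(nz, key=lambda j: abs(A[r][j]))
            for j in nz:
                if j != p:
                    q = A[r][j] // A[r][p]
                    for i in range(n):
                        A[i][j] -= q * A[i][p]
        residual = b[r] - sum(A[r][j] * y[j] for j in range(col))
        if nz:
            p = nz[0]
            for i in range(n):           # move the pivot column to position col
                A[i][col], A[i][p] = A[i][p], A[i][col]
            if residual % A[r][col] != 0:
                return False             # the pivot does not divide the residual
            y.append(residual // A[r][col])
            col += 1
        elif residual != 0:
            return False                 # the row reduces to 0 = non-zero
    return True


# Rows: exponent of rho = -1, exponent of 2, exponent of 3.
# Columns: unknowns z0 (for zeta = -1), z1 (for 4), z2 (for 6), t (multiples of w_K = 2).
A = [[1, 0, 0, 2],
     [0, 2, 1, 0],
     [0, 0, 1, 0]]
print(solvable_over_Z(A, [1, 3, 1]))     # x = -24 = (-1) * 2^3 * 3
print(solvable_over_Z(A, [0, 2, 1]))     # x = 12 = 2^2 * 3
```

In this toy case −24 = (−1)·4·6 lies in Γ, whereas 12 does not, since 2z1 + z2 = 2 and z2 = 1 have no common integer solution.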
Proof of Corollary 4.1.5. Let $\{\varepsilon_1,\dots,\varepsilon_{s-1}\}$ be a fundamental system of S-units in K with the properties described in Proposition 4.3.9. Then, putting
\[ \Theta := h(\varepsilon_1)\cdots h(\varepsilon_{s-1}), \]
we have $\Theta \le c_9R_S$ with $c_9 = ((s-1)!)^2/(2^{s-2}d^{s-1})$. Now the assertion follows from Theorem 4.1.3.
Proof of Corollary 4.1.6. Corollary 4.1.6 can be deduced both from Corollary
4.1.5 and from Corollary 4.1.4. The finiteness of the number of solutions of
(4.1.7) follows immediately from Corollary 4.1.4 with the choice $\Gamma = O_S^*$. If
S is effectively given, a fundamental system of S-units and the roots of unity
in K are effectively determinable. Hence the effective part of Corollary 4.1.6
is also an immediate consequence of Corollary 4.1.4.
Theorem 4.1.7 will be deduced from Theorem 4.2.1.
Proof of Theorem 4.1.7. Let x1 , x2 be a solution of (4.1.7). We infer as in the
proof of Theorem 4.1.3 that, for some v ∈ S,
0 < log |1 − a1 x1 |v < −(d/2.02s)h(x1 ),
(4.4.27)
whence |1 − a1 x1 |v ≤ 1. This implies that |a1 x1 |v ≤ 1 or 4 according as v is
finite or not. Consequently, we have
\[ |1 - (a_1x_1)^{h_K}|_v = |1 - a_1x_1|_v\cdot|1 + a_1x_1 + \cdots + (a_1x_1)^{h_K-1}|_v \le c_{18}|1 - a_1x_1|_v, \tag{4.4.28} \]
where $c_{18} := 1$ or $4^{h_K}$, according as v is finite or not. Here $h_K$ denotes the class number of K.
We shall give a lower bound for $|1 - (a_1x_1)^{h_K}|_v$ by means of Theorem 4.2.1. We first construct a subgroup Γ of $O_S^*$ such that $x_1^{h_K} \in \Gamma$. Denote by $\mathfrak{p}_1,\dots,\mathfrak{p}_t$ the prime ideals in S. There are $\pi_1,\dots,\pi_t$ in $O_K$ such that $(\pi_i) = \mathfrak{p}_i^{h_K}$ and by Proposition 4.3.12, they can be chosen so that
h(πi ) ≤ c19 d r R log N (pi ),
i = 1, . . . , t.
Here c19 and c20 , . . . , c25 below denote effectively computable absolute constants. By Proposition 4.3.9, there exists a fundamental system of units
{ε1 , . . . , εr } such that h(ε1 ) · · · h(εr ) ≤ c20 d r R. Then
$\Theta := h(\varepsilon_1)\cdots h(\varepsilon_r)h(\pi_1)\cdots h(\pi_t)$
t
t+1 t+1 ≤ c21 d r
R
log N (pi ).
(4.4.29)
i=1
Since $x_1$ is an S-unit in K, we can write $(x_1) = \mathfrak{p}_1^{u_1}\cdots\mathfrak{p}_t^{u_t}$ with appropriate integers $u_1,\dots,u_t$. Consequently, we can write
\[ x_1^{h_K} = \zeta\,\varepsilon_1^{b_1}\cdots\varepsilon_r^{b_r}\pi_1^{u_1}\cdots\pi_t^{u_t}, \]
where ζ is a root of unity and $b_1,\dots,b_r$ are integers. That is, $x_1^{h_K}$ belongs to the multiplicative subgroup Γ of $O_S^*$ generated by $\varepsilon_1,\dots,\varepsilon_r$, $\pi_1,\dots,\pi_t$ and the roots of unity of K. Further, putting $H := \max(h(a_1), 1)$, we have
\[ h_KH \ge \max\bigl(h(a_1^{h_K}), 1\bigr) \ge H. \]
If $a_1^{h_K}x_1^{h_K} = 1$, together with (4.1.7) and (1.8.5) this implies immediately (4.1.9). If $a_1^{h_K}x_1^{h_K} \ne 1$, then Theorem 4.2.1 gives
\[ \log|1 - a_1^{h_K}x_1^{h_K}|_v > -c'(d,h_K,s)\,\frac{P}{\log P}\,\Theta H\log^*\Bigl(\frac{P\,h(x_1)}{H}\Bigr), \tag{4.4.30} \]
where $c'(d,h_K,s) = (c_{22}d)^{3s+2}(\log^*d)^3\log^*h_K$. We may assume that $h(x_1) \ge 12\log(4sh_K/d)$, since otherwise by (4.1.7) we are done. Then (4.4.27) and (4.4.28) imply that $\log|1 - a_1^{h_K}x_1^{h_K}|_v$ can be estimated from above by $-(d/4s)h(x_1)$. Comparing this with (4.4.30), we infer that
\[ h(x_1) \le c''(d,h_K,s)\,\frac{P}{\log P}\,\Theta H\log^*(P\Theta), \]
where $c''(d,h_K,s) = (c_{23}d)^{3s+2}(\log^*h_K)^2$. In view of (4.4.29) we get
\[ \log^*(P\Theta) \le c_{24}\,t\,d\,(\log^*d)(\log^*R)\log P. \]
Finally, using again (4.4.29), (1.5.3) and (1.8.3), we obtain
t
t+4
t+4
h(x1 ) ≤ c25 d r+3 R
PH
log N (pi ) ≤ c25 d r+3 R
P H RS ,
i=1
which gives (4.1.9) for h(x1 ). An upper bound of the same form follows for
h(x2 ) from the equation (4.1.7).
4.5 Alternative methods, comparison of the bounds
Baker’s theory of logarithmic forms made it possible to derive effective bounds
for the solutions of S-unit equations. Later, some alternative methods were
developed to obtain effective results for such equations. We briefly discuss
these methods.
4.5.1 The results of Bombieri, Bombieri and Cohen, and Bugeaud
Bombieri (1993) and Bombieri and Cohen (1997, 2003) developed an effective
method in Diophantine approximation, based on an extended version of the
Thue–Siegel principle, the Dyson Lemma and some geometry of numbers, to
prove an earlier, weaker version of Theorem 4.2.2. Bugeaud (1998), following
their approach and combining it with estimates for linear forms in logarithms,
obtained results which are in certain parameters sharper than those of Bombieri
and Cohen. This improvement is partly due to the use of linear forms in at most
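three logarithms. It follows from Bugeaud's results that if (4.2.2) holds with 0 < κ ≤ 1, then
\[ h(\xi) \le \begin{cases} 10\,T\max\{H, T\} & \text{if } v \text{ is infinite},\\ 8c_{26}\,T\max\{H, 40T\} & \text{if } v \text{ is finite}, \end{cases} \tag{4.5.1} \]
where
\[ T := (2mc_{26})^m N(v)(\log N(v)), \]
with
\[ c_{26} := \begin{cases} 8\times 10^{19}\,(d^4(\log 3d)^7/\kappa)\log^*(2d/\kappa) & \text{if } v \text{ is infinite},\\ 8\times 10^{6}\,(d^5/\kappa)(\log^*(2d/\kappa))^2 & \text{if } v \text{ is finite}. \end{cases} \]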
It is easily seen that the bound in (4.2.3) has a better dependence on each parameter than the bound in (4.5.1), except possibly Θ and H. In fact, the bound in (4.5.1) is smaller than that in (4.2.3) precisely when both Θ and $H\log\Theta/\Theta$ are large relative to d, κ and N(v), and in that case, the bound (4.5.1) is at most a factor $\log\Theta$ better than (4.2.3). It is important to observe that in contrast with (4.5.1), the bound (4.2.3) does not contain the factor $m^m$.
Bugeaud (1998) used his result (4.5.1) to derive the bound
\[ \max\{h(x_1), h(x_2)\} < c_{27}P(\log^*P)R_S\max\{c_{27}P(\log^*P)R_S,\ H\} \tag{4.5.2} \]
for the solutions $x_1, x_2$ of the S-unit equation (4.1.7), where
\[ c_{27} := (10^{23}s^4(\log^*s)^2d^3)^s. \]
Observe that (4.1.8) is better than (4.5.2) in terms of each parameter, except possibly $R_S$ and H. The bound in (4.5.2) is smaller than that in (4.1.8) precisely if both $R_S$ and $H\log R_S/R_S$ are large relative to P, d and s, and in that case the bound in (4.5.2) is at most a factor $\log^*R_S$ better than that in (4.1.8). If $H > c_{27}P(\log^*P)R_S$, then there is no $\log^*R_S$ factor in (4.5.2); however, this bound contains $s^s$. Our bound in (4.1.9) contains neither $\log^*R_S$ nor $s^s$, but it depends on $R^t$.
4.5.2 The results of Murty, Pasten and von Känel
Let $S \subset M_{\mathbb{Q}}$ consist of the infinite place and the prime numbers $p_1,\dots,p_t$. Then the corresponding ring of S-integers is $\mathbb{Z}_S = \mathbb{Z}[(p_1\cdots p_t)^{-1}]$. Murty and Pasten (2013) (see also Pasten (2014)) developed a new effective method to bound the heights of the solutions of special S-unit equations of the form
\[ x_1 + x_2 = 1 \quad\text{in } x_1, x_2 \in \mathbb{Z}_S^*. \tag{4.5.3} \]
A similar method was obtained later and independently by von Känel (2014b).
The basic idea behind the approach of Murty and Pasten and that of von Känel
is an observation by Frey, that the now proved Shimura–Taniyama Conjecture,
which states that the L-function of an elliptic curve is equal to that of an
associated modular form, implies that (4.5.3) has only finitely many solutions.
Murty and Pasten and von Känel observed that this can be made effective, and
used this to obtain an explicit upper bound for the heights of the solutions of
(4.5.3). We formulate the result of Murty and Pasten, which is slightly sharper.
To be precise, let S be as above, and put Q := p1 · · · pt . Further, assume that
t ≥ 2 and 2 ∈ S. Murty and Pasten proved that any solution x1 , x2 of the S-unit
equation (4.5.3) satisfies
h(x1 ), h(x2 ) ≤ 4.8Q log Q + 13Q + 25.
(4.5.4)
In the special case of equation (4.1), where the number field is Q and the
coefficients a1 , a2 are equal to 1, (4.5.4) can be compared with the estimates
obtained in Theorem 4.1.3 and Corollary 4.1.5, and even with the slightly
sharper Theorem 2 of Győry and Yu (2006). In this case this latter result gives
the estimate
\[ h(x_1), h(x_2) \le 2^{10t+2}\,t^4\,\frac{P}{\log P}\prod_{i=1}^{t}\log p_i \tag{4.5.5} \]
for the solutions $x_1, x_2$ of (4.5.3), where $P := \max_i p_i$. It is easily seen that (4.5.4) improves (4.5.5) if Q is small, in particular if $Q \le 2^{30}$. However, if t is small and Q is large then (4.5.5) gives a better bound for the solutions.
It should be remarked that for most applications of S-unit equations, more
general results concerning equations of the form (4.1) a1 x1 + a2 x2 = 1 over
number fields are needed.
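To get a feeling for the size of the bound (4.5.4), one may evaluate its right-hand side for a small set S and compare it with the height of a known solution. The following is a minimal sketch; the function names `murty_pasten_bound` and `height` are hypothetical, and the height of a non-zero rational number in lowest terms is taken to be the logarithm of the maximum of the absolute values of its numerator and denominator.

```python
import math
from fractions import Fraction


def murty_pasten_bound(primes):
    """Right-hand side of (4.5.4) for S consisting of infinity and the given primes."""
    if len(primes) < 2 or 2 not in primes:
        raise ValueError("(4.5.4) is stated for t >= 2 and 2 in S")
    Q = math.prod(primes)
    return 4.8 * Q * math.log(Q) + 13 * Q + 25


def height(x):
    """Absolute logarithmic height of a non-zero rational number."""
    x = Fraction(x)
    return math.log(max(abs(x.numerator), abs(x.denominator)))


bound = murty_pasten_bound([2, 3])
x1, x2 = Fraction(9), Fraction(-8)       # 9 + (-8) = 1, and both are {2, 3}-units
print(bound, height(x1), height(x2))
```

For S = {∞, 2, 3} the bound is about 154.6, while the solution (9, −8) has heights log 9 and log 8, so the estimate is far from the actual size of the solutions in this small case.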
4.6 The abc-conjecture
An extremely important S-unit equation is the abc-equation
a + b = c,
where S is a finite set of primes, and a, b, c are coprime positive integers not
divisible by primes outside S. Then Corollary 4.1.5 and Theorem 4.1.7 provide
explicit upper bounds for c in terms of S. However, these bounds are far from
being best possible.
The radical of (a, b, c) is defined as
\[ Q(a, b, c) := \prod_{p \mid abc} p. \]
Oesterlé and, in a refined form, Masser (1985) proposed the following.
abc-conjecture For every ε > 0, there is a positive number C(ε) such that if a, b, c are coprime positive integers with
\[ a + b = c \quad\text{and radical } Q = Q(a, b, c), \tag{4.6.1} \]
then
\[ c < C(\varepsilon)Q^{1+\varepsilon}. \tag{4.6.2} \]
This is already sharp in the sense that (4.6.2) does not remain valid for ε = 0.
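The failure of (4.6.2) for ε = 0 can be observed numerically on a standard family of examples, which is not taken from the text: a = 1, b = 3^(2^k) − 1, c = 3^(2^k). Since 2^(k+2) divides b, the radical Q is at most 3b/2^(k+1), so c/Q exceeds 2^(k+1)/3 and is unbounded. The sketch below, with the hypothetical helper `radical`, computes c/Q for k = 1, ..., 4.

```python
def radical(n):
    """Product of the distinct primes dividing the positive integer n (trial division)."""
    rad, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            rad *= p
            while n % p == 0:
                n //= p
        p += 1
    return rad * n if n > 1 else rad


ratios = []
for k in range(1, 5):
    c = 3 ** (2 ** k)
    a, b = 1, c - 1                      # a + b = c with a, b, c coprime
    Q = radical(a * b * c)
    ratios.append(c / Q)
    print(k, c, Q, c / Q)
```

The printed ratios c/Q increase with k, in accordance with the lower bound 2^(k+1)/3.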
On August 30, 2012, Shinichi Mochizuki (Kyoto University), posted on
the internet a sequence of four papers on Inter-universal Teichmüller theory
in which he claims to prove the abc-conjecture. For recent updates of these
papers, see Mochizuki’s home page
www.kurims.kyoto-u.ac.jp/~motizuki/top-english.html.
At the moment of completion of this book, his proof had not yet been checked.
There are several refinements or modifications of the abc-conjecture; for
references see e.g. Robert, Stewart and Tenenbaum (2014) where the authors
propose and motivate the following conjecture. We denote by $\log_n$ the n times iterated natural logarithm.
There exists a real number $C_1$ such that if a, b, c are coprime positive integers as in (4.6.1) then
\[ c < Q\exp\Bigl(4\sqrt{3\log Q/\log_2 Q}\,\bigl(1 + (\log_3 Q + C_1)/(2\log_2 Q)\bigr)\Bigr). \]
Furthermore, there exists a real number $C_2$ and infinitely many triples of coprime positive integers a, b, c with (4.6.1) such that
\[ c > Q\exp\Bigl(4\sqrt{3\log Q/\log_2 Q}\,\bigl(1 + (\log_3 Q + C_2)/(2\log_2 Q)\bigr)\Bigr). \]
For any positive integer m we denote by ω(m) the number of distinct
prime factors of m. Baker (1998, 2004) and Granville (1998) formulated such
refinements of the abc-conjecture which involve also ω(abc). The following
completely explicit refined version is due to Baker (2004).
If a, b, c are coprime positive integers with (4.6.1) then
\[ c < \frac{6}{5\,t!}\,Q(\log Q)^t, \]
where t = ω(abc).
The abc-conjecture has a very extensive literature. It unifies and motivates
a number of results and problems in number theory. Further, it has several
striking consequences. We mention here only some of them.
• It is easy to show that the abc-conjecture implies Fermat's Last Theorem for every sufficiently large exponent. Indeed, assume that x, y, z are relatively prime positive integers, n > 3 and $x^n + y^n = z^n$. Then, for ε = 1, the abc-conjecture gives $z^n < C(1)Q^2$, where
\[ Q := \prod_{p \mid x^ny^nz^n} p = \prod_{p \mid xyz} p \le xyz < z^3, \]
whence $z^n < C(1)z^6$. This proves that there exists $n_0$ such that $n \le n_0$ or, in other words, Fermat's Last Theorem is asymptotically true. The weaker version of the abc-conjecture when ε = 1, C(1) = 1 implies in the same way that n ≤ 5. As is known, Fermat's Last Theorem has been proved by Wiles (1995), Taylor and Wiles (1995).
• It follows in a similar manner from the abc-conjecture that the generalized Fermat equation $Ax^k + By^m + Cz^n = 0$, where A, B, C are given non-zero integers, has finitely many solutions in relatively prime integers x, y, z greater than 1, and positive integers k, m, n which satisfy 1/k + 1/m + 1/n < 1.
• Elkies (1991) proved that the abc-conjecture implies Roth's Approximation Theorem, that is Theorem 3.1.1, and that an effective abc-theorem would make Roth's Theorem effective. See also Langevin (1999).
• Confirming a conjecture of Mordell (1922a), Faltings (1983) proved that a geometrically irreducible smooth projective curve of genus g ≥ 2, defined over Q, has only finitely many rational points. Faltings' Theorem is ineffective. Elkies (1991) showed that the abc-conjecture implies this theorem of Faltings, and in fact even an effective version of this if an effective version of the abc-conjecture is available.
• By a result of Lagarias and Soundararajan (2011), the abc-conjecture implies that for any fixed κ < 1, there are only finitely many coprime positive integers a, b, c such that
\[ a + b = c \quad\text{and}\quad P(abc) \le (\log c)^{\kappa}. \]
On the other hand, under the Generalized Riemann Hypothesis the authors proved that for κ ≥ 8 there are infinitely many triples a, b, c satisfying these properties.
Some weaker versions of the abc-conjecture have been proved in an effective
way. By means of the theory of logarithmic forms, Stewart and Tijdeman (1986),
Stewart and Yu (1991, 2001), Győry and Yu (2006), Surroca (2007) and Győry
(2008a) obtained upper bounds for c as a function of Q(a, b, c). Stewart and
Yu (2001) proved that
\[ c < \exp\bigl(c_{28}Q^{1/3}(\log Q)^3\bigr), \tag{4.6.3} \]
where Q = Q(a, b, c) and c28 is an effectively computable positive absolute
constant. Further, Győry (2008a) deduced from a slightly improved and completely explicit version of Theorem 4.1.7 that
\[ c < \exp\Bigl(\frac{2^{10t+22}}{t^{t-4}}\,Q(\log Q)^t\Bigr). \tag{4.6.4} \]
For a deeper connection between the abc-conjecture and the theory of logarithmic forms, we refer to Baker (2004).
We present now a number field version of the abc-conjecture. Let K be
an algebraic number field of degree d, and MK the set of places on K; see
Section 1.7. The height of (a, b, c) ∈ (K ∗ )3 is defined as
\[ H_K(a, b, c) = \prod_{v \in M_K}\max(|a|_v, |b|_v, |c|_v) \]
and the radical of (a, b, c) as
\[ Q_K(a, b, c) := \prod_{\mathfrak{p}}(N_K(\mathfrak{p}))^{e(\mathfrak{p})}. \]
Here the product is taken over all prime ideals p for which |a|p , |b|p , |c|p are
not all equal and e(p) is the ramification index of p over the rational prime
below it. Denote by DK the absolute value of the discriminant of K.
Vojta (1987) proposed a very general conjecture, and, as a consequence,
suggested the first number field version of the abc-conjecture. Later, several
refinements of Vojta’s version were suggested, see Elkies (1991), Broberg
(1999), Vojta (2000), Granville and Stark (2000), Browkin (2000) and Masser
(2002). The following uniform version is due to Masser.
ABC-conjecture for the number field K For every ε > 0 there exists C(ε) > 0, such that if
\[ a + b + c = 0 \ \text{ with } a, b, c \in K^*, \quad Q_K = Q_K(a, b, c), \tag{4.6.5} \]
then
\[ H_K(a, b, c) < C(\varepsilon)^d(|D_K|\cdot Q_K)^{1+\varepsilon}. \]
For K = Q, this reduces to the Oesterlé–Masser Conjecture. The upper bound is again best possible in terms of ε. This general conjecture has also a very rich
literature, and has many profound implications; see the abc-conjecture home
page mentioned below.
The bounds obtained for the solutions of S-unit equations can be used to
derive weaker but unconditional upper bounds for HK (a, b, c). Let a, b, c be
non-zero elements of K with a + b + c = 0, and let
\[ S = M_K^{\infty} \cup \{\text{finite } v \in M_K \text{ such that } |a|_v, |b|_v, |c|_v \text{ are not all equal}\}. \]
Then x = −a/c, y = −b/c is a solution of the S-unit equation
\[ x + y = 1 \quad\text{in } x, y \in O_S^*. \]
Every bound for h(x), h(y) gives a bound for HK (a, b, c). Using a result
of Bugeaud and Győry (1996a) concerning S-unit equations, Surroca (2007)
derived a bound for HK (a, b, c). By means of a slightly improved and explicit
version of Theorem 4.1.7, Győry (2008a) considerably improved Surroca’s
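bound by showing that if ε > 0 and (4.6.5) holds then $H_K(a, b, c)$ can be estimated from above by
\[ \exp\bigl(c_{29}(d, D_K, \varepsilon)\,Q_K^{1+\varepsilon}\bigr) \tag{4.6.6} \]
and, if
\[ Q_K = Q_K(a, b, c) > \max\bigl(|D_K|^{2/\varepsilon},\ \exp\exp(\max(|D_K|, e))\bigr), \]
by
\[ \exp\bigl(c_{30}(d, \varepsilon)(|D_K|\cdot Q_K)^{1+\varepsilon}\bigr), \tag{4.6.7} \]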
where c29 , c30 are effectively computable constants depending only on the
parameters occurring in the parentheses. Clearly, the bounds in (4.6.3), (4.6.4),
(4.6.6) and (4.6.7) are still far from the conjectured best bounds.
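The passage from a + b + c = 0 to the S-unit equation x + y = 1 described above can be made concrete over K = Q. The following sketch is an illustration only; the function names are hypothetical. It assumes coprime non-zero integers a, b, c, for which the finite places in S are exactly the primes dividing abc and $H_{\mathbb{Q}}(a, b, c) = \max(|a|, |b|, |c|)$, since every finite place contributes the factor 1.

```python
import math
from fractions import Fraction


def prime_divisors(n):
    """Set of primes dividing the non-zero integer n (trial division)."""
    n, p, primes = abs(n), 2, set()
    while p * p <= n:
        if n % p == 0:
            primes.add(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.add(n)
    return primes


def is_S_unit(x, primes):
    """True if numerator and denominator of the non-zero rational x involve only the given primes."""
    return prime_divisors(x.numerator * x.denominator) <= primes


def height(x):
    """Absolute logarithmic height of a non-zero rational number."""
    return math.log(max(abs(x.numerator), abs(x.denominator)))


def abc_to_unit_equation(a, b, c):
    """For coprime non-zero integers with a + b + c = 0, return (finite part of S, x, y)."""
    if a * b * c == 0 or a + b + c != 0 or math.gcd(math.gcd(a, b), c) != 1:
        raise ValueError("need coprime non-zero integers with a + b + c = 0")
    return prime_divisors(a * b * c), Fraction(-a, c), Fraction(-b, c)


S, x, y = abc_to_unit_equation(1, 8, -9)
print(sorted(S), x, y, max(height(x), height(y)))
```

Here S = {∞, 2, 3}, x = 1/9, y = 8/9, and max(h(x), h(y)) = log 9 = log H_Q(1, 8, −9), which illustrates how a bound for h(x), h(y) translates into a bound for the height of the triple.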
For other details, including generalizations and applications of the abc-conjecture, we refer the reader to Bombieri and Gubler (2006), Baker and Wüstholz (2007), the abc-conjecture home page created and maintained by Nitaj, www.math.unicaen.fr/~nitaj/abc.html, and the references given there.
4.7 Notes
4.7.1 Historical remarks and some related results
• In the special case of S-unit equations over Q, effective finiteness results can
be deduced for the solutions from a theorem of Coates (1969) on the greatest
prime factor of binary forms and also from a result of Sprindžuk (1969)
on ternary exponential equations. In the general case, for unit and S-unit
equations over number fields, various effective bounds for the solutions were
established in several papers and books, including Győry (1972, 1973, 1974,
1979, 1979/1980, 1980b, 2008a), Sprindžuk (1973, 1976, 1982, 1993), Lang
(1978), Kotov and Trelina (1979), Schmidt (1992), Bugeaud and Győry
(1996a), and Haristoy (2003). The best known bounds can be found in Győry
and Yu (2006) and, in a more general form, for the solutions from a finitely
generated multiplicative subgroup of $\overline{\mathbb{Q}}^*$, in Section 4.1 of the present book.
Later, Bérczes, Evertse and Győry (2009) gave effective bounds for the
heights and degrees of the solutions from the division group of a finitely
generated multiplicative subgroup of $\overline{\mathbb{Q}}^*$. We note that in certain applications
of Baker’s theory, Bilu systematically used so-called functional units instead
of applying unit equations; see Bilu (2002) and the references given there.
• Corollary 4.1.6 states that over a number field K, the S-unit equation $x_1 + x_2 = 1$ in $x_1, x_2 \in O_S^*$ has only finitely many solutions, and all of them
can be, at least in principle, effectively determined. An equivalent statement
is that the set of S-integral points of P1 (K) \ {0, 1, ∞} is finite, and these
points can be, at least in principle, effectively computed. Here P1 (K) denotes
the projective line over K. For this and other equivalent statements, see e.g.
Section 9.2 and LeVesque and Waldschmidt (2011).
• Let $p_1,\dots,p_t$ be distinct rational primes, let $S = \{\infty, p_1,\dots,p_t\}$, and denote by $\mathbb{Z}_S^*$ the group of S-units in Q. As a common generalization of S-unit equations and binomial Thue equations, Győry and Pintér (2008) considered over Q the equation
\[ u_1^nx_1 + u_2^nx_2 = 1 \ \text{ in } u_1, u_2 \in \mathbb{Z}\setminus\{0\},\ n \ge 3,\ x_1, x_2 \in \mathbb{Z}_S^* \ \text{ with } \gcd(u_1u_2, p_1\cdots p_t) = 1. \tag{4.7.1} \]
They proved that the heights of $u_1^n$, $u_2^n$, $x_1$ and $x_2$ can be effectively bounded above in terms of S. This implies that there are only finitely many $u_1^n$, $u_2^n$ with the given properties for which equation (4.7.1) can have a solution $x_1$, $x_2$, and these $u_1^n$, $u_2^n$, together with the possible solutions $x_1$, $x_2$, can be, at least in principle, effectively determined. All the results mentioned above were proved by means of the theory of logarithmic forms.
4.7.2 Some notes on applications
• The effective results concerning equations (4.1) and (4.2) led to a great
number of applications, among others to
– Thue equations, Thue–Mahler equations and decomposable form equations, see Section 9.6 and the Notes in Chapter 9,
– discriminant form and index form equations, see Section 9.6, the Notes in
Chapter 9 and our book on discriminant equations,
– discriminant equations and power integral bases, see Section 10.6 and our
book on discriminant equations,
– binary forms and decomposable forms of given discriminant, see
Section 10.7 and our book on discriminant equations,
– irreducible polynomials and arithmetic graphs, see Section 10.5,
– unit equations in two unknowns over finitely generated domains, see
Chapter 8,
– bounding the number of solutions of S-unit equations, see the Notes in
Chapter 6.
For applications of the so-obtained results and for references, the reader
should consult Chapters 9 and 10, the books and survey papers Győry (1980b,
1992a, 2002, 2010), Sprindžuk (1982, 1993), Shorey and Tijdeman (1986),
Evertse, Győry, Stewart and Tijdeman (1988b), Bombieri and Gubler (2006),
Baker and Wüstholz (2007), and our book on discriminant equations.
• In many cases, the applicability of Baker’s theory can be considerably
extended by reducing the Diophantine problem under consideration to the
study of such systems of unit equations in which the equations possess certain
graph-theoretic connectedness properties. Then a combination of the effective results concerning equation (4.1) with some combinatorial arguments
enables one to derive a bound for the solutions of the initial Diophantine
problem; see Sections 9.6, 10.5, 10.6, Győry (1980c, 1981a, 1981c, 1982c)
and Evertse, Győry, Stewart and Tijdeman (1988b).
• We now mention some recent applications of the results presented in this
chapter. There are many important applications to polynomials and binary
forms of given discriminant; these will be discussed in full detail in our book
on discriminant equations. Further, Theorem 1 of Győry and Yu (2006), that
is Corollary 4.1.5 of the present chapter, has been recently used to obtain
among others the following effective results.
– In von Känel (2011, 2014a), an effective version of Shafarevich’s
conjecture/Faltings’ Theorem is proved for hyperelliptic curves. This has
been worked out in our book on discriminant equations.
– In de Jong and Rémond (2011), the authors give an effective version
of Shafarevich’s conjecture/Faltings’ Theorem for cyclic covers of the
projective line of prime degree.
– Finally, in von Känel (2013) a generalization of Szpiro’s Discriminant
conjecture concerning elliptic curves is formulated for hyperelliptic curves,
and a completely explicit exponential version of Szpiro’s Generalized
conjecture is established.
5 Algorithmic resolution of unit equations in two unknowns
Let K be an algebraic number field, $\Gamma_1$, $\Gamma_2$ two finitely generated multiplicative subgroups of $K^*$, and $a_1$, $a_2$ two non-zero elements of K. It follows from the results of the preceding chapter that the equation
\[ a_1x_1 + a_2x_2 = 1 \quad\text{in } (x_1, x_2) \in \Gamma_1 \times \Gamma_2 \]
has only finitely many solutions, and effective upper bounds can be given for the heights of the solutions. These bounds are, however, too large for practical use, that is, for finding all solutions of concrete equations of the above form. In this chapter a practical method will be provided to locate all the solutions to such equations, subject to the conditions that $a_1$, $a_2$ and the generators of $\Gamma_1$ and $\Gamma_2$ are effectively given and that the ranks of $\Gamma_1$ and $\Gamma_2$ are not too large; at present the bound is about 12. In particular, we present an efficient algorithm for solving completely S-unit equations in two unknowns.
The unknowns $x_1$ and $x_2$ can be represented as a power product of the generators of $\Gamma_1$ and $\Gamma_2$, respectively. Assuming that the generators of infinite
order are multiplicatively independent, these representations are unique up
to powers of roots of unity. Thus, we arrive at an exponential Diophantine
equation of the form (5.1.3) below which has to be solved. As in Chapter 4,
we first derive an explicit upper bound for the absolute values of the unknown
exponents, using the best known Baker’s type inequalities concerning linear
forms in logarithms. In this way the existence of “large” solutions will be
excluded. This part is an adaptation of Győry’s method (Győry (1979)) who
was the first to give explicit bounds for the solutions in case of S-unit equations
over number fields. Then, in concrete cases, we can considerably reduce the
obtained bound by means of de Weger’s reduction techniques (de Weger (1987,
1989)) based on the LLL lattice basis reduction algorithm. This means that even
“medium” sized solutions do not exist. Finally, some enumeration procedures
due to Wildanger (1997, 2000) and Smart (1999) can be utilized to determine
Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:29:29, subject to the Cambridge
Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.007
the “small” solutions under the reduced bound. We shall briefly illustrate the
resolution process on two concrete equations. Of course, during the process
some standard algebraic number-theoretical concepts and algorithms will also
be needed, references to which, for convenience, are collected in Section 1.10.
For further details, related results, methods, applications and examples we
refer to de Weger (1987, 1989), Wildanger (1997, 2000), Smart (1998, 1999),
Győry (2002), Gaál (2002) and Baker and Wüstholz (2007).
5.1 Application of Baker’s type estimates
Let K be an algebraic number field of degree d, given by the minimal polynomial of a primitive integral element θ of K over Q. Assume that two non-zero algebraic numbers $a_1$, $a_2$ are explicitly given, as defined in Section 1.10, that is,
\[ a_i = (p_{i,0} + p_{i,1}\theta + \cdots + p_{i,d-1}\theta^{d-1})/q_i \]
with given rational integers $p_{i,0},\dots,p_{i,d-1}, q_i$ with $\gcd(p_{i,0},\dots,p_{i,d-1}, q_i) = 1$ for i = 1, 2. Further, for i = 1, 2, let $\Gamma_i$ be a multiplicative subgroup of rank $r_i$ in $K^*$, and $\Gamma_{i,\infty}$ the torsion subgroup of $\Gamma_i$. We assume that for i = 1, 2, a system of generators $\xi_{i,1},\dots,\xi_{i,r_i}$, that is, a basis of $\Gamma_i/\Gamma_{i,\infty}$, is explicitly given. We consider the equation
\[ a_1x_1 + a_2x_2 = 1 \quad\text{in } (x_1, x_2) \in \Gamma_1 \times \Gamma_2. \tag{5.1.1} \]
To avoid trivialities, we deal only with the case r1 , r2 ≥ 1. Then each solution
x1 , x2 of (5.1.1) can be written uniquely in the form
\[ x_i = \zeta_i\prod_{j=1}^{r_i}\xi_{i,j}^{b_{i,j}}, \quad i = 1, 2, \tag{5.1.2} \]
where $\zeta_i$ is a root of unity in K and the $b_{i,j}$ are rational integers. Hence (5.1.1) takes the form
\[ (a_1\zeta_1)\prod_{j=1}^{r_1}\xi_{1,j}^{b_{1,j}} + (a_2\zeta_2)\prod_{j=1}^{r_2}\xi_{2,j}^{b_{2,j}} = 1 \tag{5.1.3} \]
with unknown integer exponents bi,j .
Let B := maxi,j |bi,j |. We are going to derive an upper bound for B. Such a
bound could be deduced from the general effective results of Chapter 4. However, as will be seen, in concrete cases it is more profitable to reduce (5.1.3) to
Baker’s type inequalities; see (5.1.10), (5.1.13) and (5.1.15) below. Then we
can apply Baker’s method to the left-hand sides of these inequalities to get a
bound for B, and then we can use the LLL-algorithm to reduce the bound so
obtained.
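The shape of equation (5.1.3) and the role of the bound B can be illustrated on a toy case by exhaustive search. This is only a sketch under stated assumptions, and it is not the method of this chapter, which relies on Baker-type bounds, LLL reduction and refined enumeration: we take K = Q, $a_1 = a_2 = 1$, and $\Gamma_1 = \Gamma_2$ generated by −1, 2 and 3; the function name `unit_equation_small_solutions` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def unit_equation_small_solutions(gens, B):
    """All (x1, x2) with x1 + x2 = 1, where each x_i = ±prod gens[j]**b_ij and |b_ij| <= B."""
    units = set()
    for sign in (1, -1):                                  # the roots of unity of Q
        for exps in product(range(-B, B + 1), repeat=len(gens)):
            x = Fraction(sign)
            for g, e in zip(gens, exps):
                x *= Fraction(g) ** e                     # power product as in (5.1.2)
            units.add(x)
    # x1 runs through the candidates; x2 = 1 - x1 must be a candidate as well
    return sorted((x, 1 - x) for x in units if 1 - x in units)


sols = unit_equation_small_solutions([2, 3], 3)
for x1, x2 in sols:
    print(x1, "+", x2, "= 1")
```

Such a direct search is feasible only because the rank and the exponent bound are tiny here; the number of candidates grows like $(2B+1)^{r}$, which is why the initial Baker-type bound for B has to be reduced before any enumeration.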
Let $M_K$ denote the set of places on K. Further, for i = 1, 2 let $S_i$ be the support of the group $\Gamma_i$, that is the subset of $M_K$ which consists of the infinite places and of those finite places v for which
\[ |\alpha|_v \ne 1 \quad\text{for some } \alpha \in \Gamma_i. \]
In view of the assumptions made on $\Gamma_1$ and $\Gamma_2$, the sets $S_1$ and $S_2$ can be
effectively determined in the sense defined in Section 1.10. In what follows,
we assume some implicit fixed order for the real and complex infinite places
and for the finite places in MK . This gives an order on S1 and S2 .
We first consider the case when B = maxj |b1,j |. We infer from (5.1.2) that
\[ \log|x_1|_v = \sum_{j=1}^{r_1} b_{1,j}\log|\xi_{1,j}|_v \]
for all $v \in M_K$. Let $S_1 = \{v_1,\dots,v_{s_1}\}$, and choose $k, l \in \{1,\dots,s_1\}$ such that
\[ |\log|x_1|_{v_k}| = \max_{v \in S_1}|\log|x_1|_v|, \qquad |x_1|_{v_l} = \min_{v \in S_1}|x_1|_v. \]
We need to perform our calculations for each possible l. Using the algorithms (XII) and (XIII) mentioned in Section 1.10, one can determine a fundamental system $\{\varepsilon_1,\dots,\varepsilon_{s_1-1}\}$ of $S_1$-units, and write $\xi_{1,j} = \zeta_{1,j}\,\varepsilon_1^{a_{1,j}}\cdots\varepsilon_{s_1-1}^{a_{s_1-1,j}}$ with a root of unity $\zeta_{1,j}$ and with rational integers $a_{i,j} \in \mathbb{Z}$. In view of the multiplicative independence of $\xi_{1,1},\dots,\xi_{1,r_1}$ it follows that the rank of the matrix $(a_{i,j})_{1 \le i \le s_1-1,\ 1 \le j \le r_1}$ is $r_1$. But the matrix $(\log|\varepsilon_i|_{v_j})_{1 \le i \le s_1-1,\ 1 \le j \le s_1-1}$ is invertible, hence it is easy to show that the matrix $(\log|\xi_{1,i}|_{v_j})_{1 \le i \le r_1,\ 1 \le j \le s_1-1}$ is also of rank $r_1$. Consequently, there is a subset $\{u_1,\dots,u_{r_1}\}$ of $S_1$ such that the matrix
\[ M = \begin{pmatrix} \log|\xi_{1,1}|_{u_1} & \cdots & \log|\xi_{1,r_1}|_{u_1}\\ \vdots & & \vdots\\ \log|\xi_{1,1}|_{u_{r_1}} & \cdots & \log|\xi_{1,r_1}|_{u_{r_1}} \end{pmatrix} \]
is invertible. Thus we have
\[ \begin{pmatrix} b_{1,1}\\ \vdots\\ b_{1,r_1} \end{pmatrix} = M^{-1}\begin{pmatrix} \log|x_1|_{u_1}\\ \vdots\\ \log|x_1|_{u_{r_1}} \end{pmatrix}. \]
This gives
\[ B \le c_1 |\log |x_1|_{v_k}|, \tag{5.1.4} \]
where $c_1$ is the row norm of $M^{-1}$, that is the maximum, taken over all rows, of
the sum of the absolute values of the elements of a row of $M^{-1}$.
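The row norm in (5.1.4) can be sketched computationally as follows. This is only a minimal illustration, not the procedure used by the authors: the function names `invert` and `row_norm` and the $2 \times 2$ sample matrix are hypothetical, and in practice the entries of $M$ are logarithms known only to finite precision.

```python
from fractions import Fraction

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination over the rationals."""
    n = len(matrix)
    # Augment the matrix with the identity matrix.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv_pivot = 1 / aug[col][col]
        aug[col] = [x * inv_pivot for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def row_norm(matrix):
    """Maximum over all rows of the sum of absolute values of the row entries."""
    return max(sum(abs(x) for x in row) for row in matrix)

# Hypothetical stand-in for M (illustration only, not actual logarithms).
M = [[2, 1], [1, 1]]
c1 = row_norm(invert(M))
```

For this sample matrix the inverse has rows $(1, -1)$ and $(-1, 2)$, so the sketch yields $c_1 = 3$.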
Remark 5.1.1 The value of $c_1$ depends not only on $\Gamma_1$ but also on the choice
of the generators ξ1,1 , . . . , ξ1,r1 and the matrix M. As will be seen later, the
bounds that will be derived for B depend heavily on c1 .
When $B = \max_j |b_{2,j}|$, we obtain an inequality similar to (5.1.4). Thus we
can compute a constant $c_1^* \ge c_1$ such that
\[ \max_{i,j} |b_{i,j}| \le c_1^* \max_{v \in S_1 \cup S_2} \{ |\log |x_1|_v|,\ |\log |x_2|_v| \}. \tag{5.1.5} \]
This constant $c_1^*$ will be needed in Section 5.3. For later purposes it is worth
keeping $c_1$ and $c_1^*$ as small as possible.
Remark 5.1.2 In the case of S-unit equations, Hajdu (2009) proved that there
is a system of fundamental S-units which is optimal with respect to c1 and such
a system can be constructed.
Choose now $c_2$ such that $0 < c_2 < 1/(c_1(s_1-1))$. We shall see that an appropriate choice for $c_2$ is $0.999/(c_1(s_1-1))$, provided that $s_1$ is not too large. We
show that
\[ |x_1|_{v_l} \le \exp\{-c_2 B\}. \tag{5.1.6} \]
Assuming the contrary, in view of (5.1.4) there are two possibilities. If $|x_1|_{v_k} \ge \exp\{\tfrac{1}{c_1} B\}$, then using the Product Formula (1.7.1) we get
\[ \exp\Big\{\frac{1}{c_1} B\Big\} \le \prod_{\substack{j=1 \\ j \ne k}}^{s_1} |x_1|_{v_j}^{-1} < \exp\{c_2 (s_1-1) B\}, \]
which is impossible because of $1/c_1 > c_2(s_1-1)$. On the other hand, if $|x_1|_{v_k} \le \exp\{-\tfrac{1}{c_1} B\}$ then
\[ \exp\{-c_2 B\} < |x_1|_{v_l} \le |x_1|_{v_k} \le \exp\Big\{-\frac{1}{c_1} B\Big\}, \]
which is again a contradiction. This proves (5.1.6).
Set
\[ \Lambda := 1 - a_2 x_2 = 1 - a_2' \prod_{j=1}^{r_2} \xi_{2,j}^{b_{2,j}}, \tag{5.1.7} \]
where $a_2' := a_2 \zeta_2$ and, in view of (5.1.1), $\Lambda \ne 0$. The following computations
must be performed for all roots of unity $\zeta_2$ in $K$. Let $c_3 := \max_{v \in S_1} |a_1|_v$. Then
it follows from (5.1.1) and (5.1.6) that
\[ |\Lambda|_{v_l} \le c_3 \exp\{-c_2 B\}. \tag{5.1.8} \]
We shall give a lower bound for $|\Lambda|_{v_l}$ which, together with (5.1.8), will yield
an upper bound for $B$. We could apply here Theorem 3.2.8 which is valid for
each $v$. We shall, however, get a slightly better bound for $B$ if we use Theorem
3.2.4 or Theorem 3.2.7 according as $v_l$ is infinite or finite.
5.1.1 Infinite places
First consider the case when $v_l$ is infinite. There is an embedding $\sigma$ of $K$ in $\mathbb{C}$
such that $|\Lambda|_{v_l} = |\sigma(\Lambda)|$ if $v_l$ is real and $|\sigma(\Lambda)|^2$ otherwise. Since $h(\sigma(\alpha)) = h(\alpha)$ for each $\alpha \in K$, in applying Theorem 3.2.4 we omit $v_l$ and $\sigma$ and we write
simply
\[ |\Lambda| \le c_3' \exp\{-c_2' B\} \tag{5.1.9} \]
in place of (5.1.8), where $c_3' := c_3$, $c_2' := c_2$ if $v_l$ is real and $c_3' := \sqrt{c_3}$, $c_2' := c_2/2$ otherwise.
$v_l$ is real. Using the inequality $|\log z| \le 2|z - 1|$ which holds for $|z - 1| < 0.795$, we deduce from (5.1.3), (5.1.7) and (5.1.9) that putting
\[ \Lambda' := \log |a_2'| + b_{2,1} \log |\xi_{2,1}| + \cdots + b_{2,r_2} \log |\xi_{2,r_2}|, \]
we have
\[ |\Lambda'| = |\log |a_2 x_2|| \le 2|1 - |a_2 x_2|| \le 2|1 - a_2 x_2| = 2|\Lambda| \le 2c_3' \exp\{-c_2' B\}. \tag{5.1.10} \]
Further, $\Lambda \ne 0$ implies that $\Lambda' \ne 0$. Let
\[ H := \max\{d\,h(\alpha_2),\ |\log \alpha_2|,\ 0.16\} \quad \text{and} \quad c_4 := \max_{1 \le j \le r_2} \{d\,h(|\xi_{2,j}|),\ |\log |\xi_{2,j}||,\ 0.16\}. \]
We recall that $B = \max_j |b_{1,j}|$. Applying Theorem 3.2.4 to $|\Lambda'|$ with $B$
replaced by $c_4 B/H$, we can compute explicit constants $c_5$ and $c_6$ such that
either
\[ B \le \frac{1}{c_4} H \]
or
\[ |\Lambda'| > \exp\Big\{-c_5 H \log\Big(\frac{c_6 B}{H}\Big)\Big\}. \tag{5.1.11} \]
In the second case (5.1.10) and (5.1.11) imply that
\[ \frac{c_6 B}{H} < \frac{c_5 c_6}{c_2'} \log\Big(\frac{c_6 B}{H}\Big) + \frac{c_6 \log(2c_3')}{c_2' H}. \tag{5.1.12} \]
Hence (5.1.12) and (4.4.21) give
\[ B \le 2H \max\Big\{\frac{c_5}{c_2'} \log\Big(\frac{c_5 c_6}{c_2'}\Big),\ \frac{2e^2}{c_6}\Big\} + 2\,\frac{\log(2c_3')}{c_2'} =: c_7. \]
Thus we get
\[ B_0(v_l) := \max\Big\{\frac{1}{c_4} H,\ c_7\Big\} \]
as an upper bound for $B$.
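To make the shape of this bound concrete, the following sketch evaluates $c_7$ and $B_0(v_l)$ for the real case. It is only an illustration: the function name `real_place_bound` and the numerical inputs are hypothetical, and the constants $c_4$, $c_5$, $c_6$ must in reality come from Theorem 3.2.4.

```python
import math

def real_place_bound(c2, c3, c4, c5, c6, H):
    """Evaluate c7 and B0(v_l) = max(H / c4, c7) for a real place.

    c2 and c3 play the roles of c2' and c3' in (5.1.9).
    """
    c7 = (2 * H * max((c5 / c2) * math.log(c5 * c6 / c2), 2 * math.e ** 2 / c6)
          + 2 * math.log(2 * c3) / c2)
    return max(H / c4, c7)

# Hypothetical constants chosen only to exercise the formula.
B0 = real_place_bound(c2=1.0, c3=1.0, c4=1.0, c5=1.0, c6=math.e, H=1.0)
```

With these sample inputs the maximum is attained by the term $2e^2/c_6$, so the second branch of the maximum matters whenever $c_5 c_6 / c_2'$ is small.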
$v_l$ is complex. Let $\log$ denote the principal value of the logarithm. There exists
an even rational integer $b_{2,0}$ such that
\[ |b_{2,0}| \le 1 + |b_{2,1}| + \cdots + |b_{2,r_2}| \le (r_2 + 1) \max_j |b_{2,j}| \]
and that $|\mathrm{Im}(\Lambda')| \le \pi$, where now
\[ \Lambda' = \log a_2' + b_{2,0} \log \xi_{2,0} + \cdots + b_{2,r_2} \log \xi_{2,r_2} \]
with $\xi_{2,0} = -1$. It follows from $\Lambda \ne 0$ that $\Lambda' \ne 0$. We infer from (5.1.9) that
either
\[ B < \frac{\log(3c_3')}{c_2'} =: c_8 \]
or $|e^{\Lambda'} - 1| = |\Lambda| \le 1/3$. In the latter case $|\Lambda'| \le 0.6$, whence $|\Lambda'| \le 2|\Lambda|$ and
so, by (5.1.9),
\[ |\Lambda'| \le 2c_3' \exp\{-c_2' B\}. \tag{5.1.13} \]
We apply now Theorem 3.2.4 to $|\Lambda'|$. We can compute explicit constants $c_9$, $c_{10}$
and $c_{11}$ such that either
\[ B \le c_9 H \]
or
\[ |\Lambda'| > \exp\Big\{-c_{10} H \log\Big(\frac{c_{11} B}{H}\Big)\Big\}. \tag{5.1.14} \]
Comparing now (5.1.13) and (5.1.14), we deduce that
\[ \frac{c_{11} B}{H} < \frac{c_{10} c_{11}}{c_2'} \log\Big(\frac{c_{11} B}{H}\Big) + \frac{c_{11} \log(2c_3')}{c_2' H}, \]
whence, using (4.4.21), we infer that
\[ B < 2H \max\Big\{\frac{c_{10}}{c_2'} \log\Big(\frac{c_{10} c_{11}}{c_2'}\Big),\ \frac{2e^2}{c_{11}}\Big\} + \frac{2 \log(2c_3')}{c_2'} =: c_{12}. \]
Thus
\[ B_0(v_l) := \max\{c_8,\ c_9 H,\ c_{12}\} \]
is an upper bound for $B$.
5.1.2 Finite places
Suppose now that in (5.1.8) $v_l$ is finite. Let $\mathfrak{p}$ denote the prime ideal of $K$ which
corresponds to $v_l$. Using
\[ \log |\Lambda|_{v_l} = -(\operatorname{ord}_{\mathfrak{p}} \Lambda) \log N(\mathfrak{p}), \]
we infer from (5.1.8) that
\[ \operatorname{ord}_{\mathfrak{p}} \Lambda \ge (c_2 B - \log c_3)/\log N(\mathfrak{p}). \tag{5.1.15} \]
Recall that the generators $\xi_{2,1}, \ldots, \xi_{2,r_2}$ are not roots of unity. Taking into
consideration Proposition 3.2.9, we can apply Theorem 3.2.7 to $\operatorname{ord}_{\mathfrak{p}} \Lambda$ with
the choice $h_j = h(\xi_{2,j})$, $j = 1, \ldots, r_2$, $H = \max(h(a_2), 1)$, $B_n = 1$ and $\delta = h(\xi_{2,1}) \cdots h(\xi_{2,r_2}) H/B$. We can compute explicit constants $c_{13}$, $c_{14}$, $c_{15}$ which
depend among others on $v_l$ such that $2h(\xi_{2,1}) \cdots h(\xi_{2,r_2}) \le c_{13}$ and either
\[ B \le c_{13} H \]
or $B > c_{13} H$, which guarantees that in Theorem 3.2.7, $\delta \le 1/2$. In the second
case Theorem 3.2.7 gives
\[ \operatorname{ord}_{\mathfrak{p}} \Lambda < c_{14} H \log\Big(c_{15} \frac{B}{H}\Big). \tag{5.1.16} \]
Now (5.1.15) and (5.1.16) imply that
\[ c_{15} \frac{B}{H} < \frac{c_{15} c_{16}}{c_2} \log\Big(c_{15} \frac{B}{H}\Big) + c_{15} \frac{\log c_3}{c_2 H}, \]
where $c_{16} := c_{14} \log N(\mathfrak{p})$. In view of (4.4.21) this gives
\[ B < 2H \max\Big\{\frac{c_{16}}{c_2} \log\Big(\frac{c_{15} c_{16}}{c_2}\Big),\ 2e^2\Big\} + \frac{2 \log c_3}{c_2} =: c_{17}, \]
whence we obtain
\[ B_0(v_l) := \max\{c_{13} H,\ c_{17}\} \]
as an upper bound for $B$.
When $B = \max_j |b_{2,j}|$, we can get in the same way an upper bound for $B$
for each $v \in S_2$. So in all cases we have a bound on $B = \max_{i,j} |b_{i,j}|$.
5.2 Reduction of the bounds
The bounds obtained above for B are too large for practical use, to find all
solutions of (5.1.3) in bi,j . We now show how to reduce these bounds by means
of the LLL-algorithm. For the LLL-algorithm, we refer to Section 5.6. Further
details on the applications of the LLL-algorithm to reduce Baker’s type bounds
can be found in de Weger (1989), Smart (1998) and, in case of infinite places,
in Gaál (2002).
We first consider the case when B = maxj |b1,j |. We shall distinguish again
two cases according as vl is infinite or finite. We illustrate the reduction procedure on the inequalities (5.1.10) (vl infinite and real), (5.1.13) (vl infinite and
complex) and (5.1.15) (vl finite), reducing the corresponding bounds B0 (vl ) to
much smaller ones.
5.2.1 Infinite places
The inequalities (5.1.10) and (5.1.13) are of the form
\[ |b_1 \vartheta_1 + \cdots + b_t \vartheta_t| < c_{18} \exp\{-c_{19} B\}, \tag{5.2.1} \]
where $\vartheta_1, \ldots, \vartheta_t$ are logarithms of some non-zero algebraic numbers, $c_{18}$,
$c_{19}$ are given explicit positive constants, and $b_1, \ldots, b_t$ are unknown rational
integers such that
\[ 0 < \max(|b_1|, \ldots, |b_t|) \le B \quad \text{and} \quad B \le B_0 \]
with some explicit constant $B_0$.
Remark We could also work with an inhomogeneous version of (5.1.1), when
b1 = 1.
We want to considerably reduce the upper bound B0 in the following way.
Consider the inequality (5.2.1), where $\vartheta_1, \ldots, \vartheta_t$ are real or complex numbers. Denote by $L$ the $t$-dimensional lattice spanned by the columns of the
$(t + 2) \times t$ matrix
\[ \begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1 \\
C\,\mathrm{Re}(\vartheta_1) & C\,\mathrm{Re}(\vartheta_2) & \cdots & C\,\mathrm{Re}(\vartheta_t) \\
C\,\mathrm{Im}(\vartheta_1) & C\,\mathrm{Im}(\vartheta_2) & \cdots & C\,\mathrm{Im}(\vartheta_t)
\end{pmatrix} \]
where $C$ is a large constant to be specified in numerical cases. The last row
can be omitted if $\vartheta_1, \ldots, \vartheta_t$ are all real. Let $a_1$ denote the first vector of an
LLL-reduced basis of $L$.
Lemma 5.2.1 If in (5.2.1) $\max_i |b_i| \le B \le B_0$ and
\[ \|a_1\| \ge \sqrt{(t + 1) 2^{t-1}}\, B_0, \tag{5.2.2} \]
then
\[ B \le \frac{\log C + \log c_{18} - \log B_0}{c_{19}}. \tag{5.2.3} \]
This is a slight extension of a result of Gaál and Pohst (2002) where it is
assumed that B = max(|b1 |, . . . , |bt |). Our version will be important below,
applying Lemma 5.2.1 to (5.1.10) and (5.1.13).
Proof. Following the proof of Lemma 1 in Gaál and Pohst (2002), we denote
by $a_0$ the shortest non-zero vector in $L$. Then it follows from the inequality (iv)
of Proposition 5.6.1 that $\|a_1\|^2 \le 2^{t-1} \|a_0\|^2$. Using (5.2.1) and the assumptions
of our lemma, we infer that
\[ 2^{1-t} (t + 1) 2^{t-1} B_0^2 \le 2^{1-t} \|a_1\|^2 \le \|a_0\|^2 \le t B_0^2 + C^2 c_{18}^2 \exp\{-2c_{19} B\}. \]
This gives
\[ B_0 \le C \cdot c_{18} \exp\{-c_{19} B\}, \]
whence (5.2.3) follows.
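The use of Lemma 5.2.1 for real $\vartheta_i$ can be sketched as follows. This is a minimal illustration under several assumptions: it uses a textbook LLL implementation with parameter $3/4$ in exact rational arithmetic, the $\vartheta_i$ are replaced by their floating-point approximations (a rigorous computation needs certified high-precision values), and all function names and sample inputs are hypothetical.

```python
from fractions import Fraction
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(basis):
    """Return the Gram-Schmidt vectors and coefficients mu of a basis."""
    ortho, mu = [], [[Fraction(0)] * len(basis) for _ in basis]
    for i, b in enumerate(basis):
        v = list(b)
        for j in range(i):
            mu[i][j] = dot(b, ortho[j]) / dot(ortho[j], ortho[j])
            v = [x - mu[i][j] * y for x, y in zip(v, ortho[j])]
        ortho.append(v)
    return ortho, mu

def lll(basis, delta=Fraction(3, 4)):
    """Textbook LLL reduction of a list of linearly independent row vectors."""
    basis = [[Fraction(x) for x in b] for b in basis]
    n, k = len(basis), 1
    ortho, mu = gram_schmidt(basis)
    while k < n:
        for j in range(k - 1, -1, -1):          # size-reduce the k-th vector
            q = round(mu[k][j])
            if q:
                basis[k] = [x - q * y for x, y in zip(basis[k], basis[j])]
                ortho, mu = gram_schmidt(basis)
        lhs = dot(ortho[k], ortho[k])
        rhs = (delta - mu[k][k - 1] ** 2) * dot(ortho[k - 1], ortho[k - 1])
        if lhs >= rhs:                          # Lovasz condition holds
            k += 1
        else:                                   # swap and step back
            basis[k], basis[k - 1] = basis[k - 1], basis[k]
            ortho, mu = gram_schmidt(basis)
            k = max(k - 1, 1)
    return basis

def reduced_bound(thetas, B0, c18, c19, C):
    """Return the bound (5.2.3) if condition (5.2.2) holds, otherwise None."""
    t = len(thetas)
    # Row i is the i-th column of the matrix in the text (real case, last row omitted).
    rows = [[Fraction(int(i == j)) for j in range(t)] + [Fraction(th) * C]
            for i, th in enumerate(thetas)]
    a1 = lll(rows)[0]
    if dot(a1, a1) < (t + 1) * 2 ** (t - 1) * B0 ** 2:
        return None                             # C was too small; try a larger C
    return (math.log(C) + math.log(c18) - math.log(B0)) / c19
```

A call such as `reduced_bound([math.log(2), math.log(3), math.log(5)], 1000, 2.0, 1.0, 10**12)` either returns a new bound of size roughly $\log C$ or returns `None`, in which case a larger $C$ should be tried.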
We note that if in (5.2.1) the numbers ϑ1 , . . . , ϑt are linearly dependent over
Q, then the number of unknowns can be reduced and we can apply Lemma
5.2.1 to a lower dimensional lattice.
We expect our Lemma 5.2.1 to reduce our upper bound $B_0$ for $B$, because it
is believed that the logarithms of algebraic numbers behave as random complex
numbers. To ensure (5.2.2) we have to choose $C$ sufficiently large. A suitable
value of $C$ is usually of magnitude $B_0^t$. Then the bound $B_0$ is reduced almost to
its logarithm. If Lemma 5.2.1 does not reduce our upper bound, a larger $C$ can
be chosen and we repeat the procedure.
Keeping the notation of Section 5.1, we apply Lemma 5.2.1, for each infinite
$v_l$, to the corresponding inequality
\[ \big|\log |a_2'| + b_{2,1} \log |\xi_{2,1}| + \cdots + b_{2,r_2} \log |\xi_{2,r_2}|\big| \le 2c_3 \exp\{-c_2 B\} \tag{5.1.10} \]
or
\[ \big|\log a_2' + b_{2,0} \log(-1) + b_{2,1} \log \xi_{2,1} + \cdots + b_{2,r_2} \log \xi_{2,r_2}\big| \le 2\sqrt{c_3} \exp\Big\{-\frac{c_2}{2(r_2 + 1)} B'\Big\}, \tag{5.1.13} \]
according as $v_l$ is real or not. We recall that in the first case $\max_{1 \le j \le r_2} |b_{2,j}| \le B$,
while in the second case $\max_{0 \le j \le r_2} |b_{2,j}| \le B'$, where $B' = (r_2 + 1)B$. In the
previous section we derived in each case an explicit upper bound $B_0(v_l)$ for $B$.
Lemma 5.2.1 can be applied to (5.1.10) and (5.1.13) repeatedly. In every step
we take as $B_0$ the previous bound, initially the bound $B_0(v_l)$, to get smaller and
smaller bounds for $B$. The reduction is very efficient in the first and second
steps. After about four or five steps the procedure stabilizes, that is, it does not yield an
improvement any more. The final reduced bound is usually between 100 and
1000.
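The repeated application can be simulated numerically. The sketch below only iterates formula (5.2.3) with the heuristic choice $C = B_0^t$; it assumes, without checking, that condition (5.2.2) holds at every step, which in a real computation must be verified with an LLL-reduced basis. The function name and the sample constants are hypothetical.

```python
import math

def iterate_reduction(B0, t, c18, c19, max_steps=10):
    """Iterate (5.2.3) with C = B0**t until the bound no longer improves."""
    bounds = [B0]
    for _ in range(max_steps):
        current = bounds[-1]
        C = current ** t                      # heuristic magnitude of C
        new = (math.log(C) + math.log(c18) - math.log(current)) / c19
        new = max(1, math.floor(new))         # B is an integer bound
        if new >= current:                    # no further improvement
            break
        bounds.append(new)
    return bounds

# Hypothetical starting bound and constants, for illustration only.
history = iterate_reduction(B0=1e30, t=5, c18=2.0, c19=0.5)
```

With these sample values the first step already brings the bound from $10^{30}$ down to a few hundred, and the sequence stabilizes after a handful of further steps, in line with the behaviour described above.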
5.2.2 Finite places
Now let $v_l$ be finite, and $\mathfrak{p}$ the prime ideal of $O_K$ corresponding to $v_l$. We recall
that
\[ a_2' = \zeta_2 a_2 \quad \text{and} \quad x_2 = \zeta_2 \prod_{j=1}^{r_2} \xi_{2,j}^{b_{2,j}}. \]
Consider now (5.1.15) in the form
\[ \operatorname{ord}_{\mathfrak{p}}\Big(a_2' \prod_{j=1}^{r_2} \xi_{2,j}^{b_{2,j}} - 1\Big) \ge c_{20} B - c_{21}, \tag{5.2.4} \]
where $b_{2,1}, \ldots, b_{2,r_2}$ are rational integers with $\max_{1 \le j \le r_2} |b_{2,j}| \le B$ and $c_{20} := c_2/\log N(\mathfrak{p})$, $c_{21} := \log c_3/\log N(\mathfrak{p})$. In Section 5.1.2 we derived an upper
bound $B_0(v_l)$ for $B$. We are now going to reduce this bound to a much smaller
one by means of the LLL-reduction algorithm. To apply the reduction procedure
we have to convert (5.2.4) into a linear form estimate. This will be done by
using $p$-adic logarithms.
We shall proceed in several steps. We follow the arguments of de Weger
(1989) and Smart (1998).
Step 1. Firstly we show that $a_2 x_2$ can be written in the form
\[ a_2 x_2 = \eta_0 \prod_{j=1}^{q_2} \eta_j^{d_j}, \tag{5.2.5} \]
where $d_j$ are rational integers with $|d_j| \le |b_{2,j}| \le B$ and $\eta_j$, $j \ge 1$, are multiplicatively independent elements of $K^*$ with $\operatorname{ord}_{\mathfrak{p}}(\eta_j) = 0$ for $j = 0, \ldots, q_2$,
$q_2 = r_2$ if $\operatorname{ord}_{\mathfrak{p}}(a_2) = 0$ and $v_l \notin S_2$, and $q_2 = r_2 - 1$ otherwise.
It follows from (5.2.4) and (5.1.1) that $\operatorname{ord}_{\mathfrak{p}}(a_1 x_1) > 0$ if $B > c_{21}/c_{20}$.
Hence (5.1.1) implies that $\operatorname{ord}_{\mathfrak{p}}(a_2 x_2) = 0$. Put
\[ m_0 := \operatorname{ord}_{\mathfrak{p}}(a_2) \quad \text{and} \quad m_j := \operatorname{ord}_{\mathfrak{p}}(\xi_{2,j}) \quad \text{for } j = 1, \ldots, r_2. \]
Then we infer that
\[ m_0 + \sum_{j=1}^{r_2} m_j b_{2,j} = 0. \tag{5.2.6} \]
If $m_j = 0$ for each $j$ with $0 \le j \le r_2$, then we may take $\eta_0 = a_2'$, $\eta_j = \xi_{2,j}$
for $j = 1, \ldots, r_2$ and we are done. Next assume that not all $m_j$ are zero.
Choose $k > 0$ such that $|m_k|$ is minimal among the non-zero numbers $|m_j|$,
$j = 1, \ldots, r_2$. Let
\[ \eta_j := \xi_{2,j}^{m_k}\, \xi_{2,k}^{-m_j} \quad \text{for } j = 1, \ldots, r_2. \]
Then $\eta_k = 1$, the other $\eta_j$ are multiplicatively independent and $\operatorname{ord}_{\mathfrak{p}}(\eta_j) = 0$
for $j \ge 1$, $j \ne k$. Let $d_j$, $t_j$ be rational integers such that
\[ b_{2,j} = m_k d_j + t_j \quad \text{with } 0 \le t_j < |m_k|, \quad j = 1, \ldots, r_2 \tag{5.2.7} \]
and let
\[ m = -\Big(m_0 + \sum_{\substack{j=1 \\ j \ne k}}^{r_2} m_j t_j\Big), \qquad \eta_0 = a_2' \Big(\prod_{\substack{j=1 \\ j \ne k}}^{r_2} \xi_{2,j}^{t_j}\Big) \xi_{2,k}^{m/m_k}. \tag{5.2.8} \]
It follows from (5.2.6), (5.2.7) and (5.2.8) that
\[ m = m_k \sum_{j=1}^{r_2} m_j d_j + m_k t_k, \tag{5.2.9} \]
whence $m \equiv 0 \pmod{m_k}$ which implies that $\eta_0 \in K^*$. Further, we obtain that
\[ \eta_0 \prod_{\substack{j=1 \\ j \ne k}}^{r_2} \eta_j^{d_j} = a_2' \Big(\prod_{\substack{j=1 \\ j \ne k}}^{r_2} \xi_{2,j}^{t_j}\Big) \xi_{2,k}^{m/m_k} \prod_{\substack{j=1 \\ j \ne k}}^{r_2} \big(\xi_{2,j}^{m_k} \xi_{2,k}^{-m_j}\big)^{d_j} = a_2' \prod_{j=1}^{r_2} \xi_{2,j}^{b_{2,j}} = a_2 x_2, \]
whence (5.2.5) follows. Finally, in view of (5.2.7), (5.2.8) and (5.2.9) we have
$\operatorname{ord}_{\mathfrak{p}}(\eta_0) = 0$ which proves our claim.
Remark 5.2.2 It is important to note that $m_0, m_1, \ldots, m_{r_2}$ and hence the
numbers $\eta_0, \eta_1, \ldots, \eta_{r_2}$ can be explicitly determined. However, we get different
$\eta_0$ for each possible choice of $t_1, \ldots, t_{r_2}$ with $0 \le t_j < |m_k|$, $j = 1, \ldots, r_2$ and
$m_0 + \sum_{j=1}^{r_2} m_j t_j \equiv 0 \pmod{m_k}$, and we have to perform our computations for
each $\eta_0$.
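The bookkeeping of Step 1 on the level of exponents can be sketched as follows. This is only an illustration: the function name `decompose` and the sample orders $m_j$ and exponents $b_{2,j}$ are hypothetical, and the code works purely with the integers in (5.2.6) to (5.2.9), not with the algebraic numbers themselves.

```python
def decompose(m, b, k):
    """Split exponents as in (5.2.7) and compute m from (5.2.8).

    m = [m0, m1, ..., mr] are the orders, b = [b1, ..., br] the exponents
    with m0 + sum(m_j * b_j) == 0, and k (1-based) is the chosen index.
    """
    mk = m[k]
    d, t = [], []
    for bj in b:
        tj = bj % abs(mk)              # remainder with 0 <= tj < |mk|
        d.append((bj - tj) // mk)      # exact division, also for negative mk
        t.append(tj)
    r = len(b)
    big_m = -(m[0] + sum(m[j] * t[j - 1] for j in range(1, r + 1) if j != k))
    return d, t, big_m

# Hypothetical data satisfying (5.2.6): 2 + 3*2 + (-2)*4 + 0*(-5) = 0.
m = [2, 3, -2, 0]
b = [2, 4, -5]
k = 2                                  # |m_2| = 2 is minimal among non-zero |m_j|
d, t, big_m = decompose(m, b, k)
# Exponent of xi_{2,k} in eta_0 * prod eta_j^{d_j}; it should equal b_k.
exp_k = big_m // m[k] - sum(m[j] * d[j - 1] for j in range(1, len(b) + 1) if j != k)
```

In this example $m = -2$ is divisible by $m_k = -2$, as (5.2.9) predicts, and the exponent of $\xi_{2,k}$ recovered from $\eta_0 \prod \eta_j^{d_j}$ is again $b_{2,k} = 4$.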
Step 2. We reduce (5.2.4) to an inequality concerning a linear form in $p$-adic
logarithms. In view of (5.2.4) and (5.2.5) we have
\[ \operatorname{ord}_{\mathfrak{p}}(\Lambda) \ge c_{20} B - c_{21}, \tag{5.2.10} \]
where
\[ \Lambda = 1 - \eta_0 \prod_{j=1}^{q_2} \eta_j^{d_j}. \]
Then (5.2.10) implies that $\operatorname{ord}_{\mathfrak{p}}(\Lambda) > \frac{1}{p-1}$ whenever
\[ B > \frac{1}{c_{20}} \Big(c_{21} + \frac{1}{p-1}\Big) =: c_{22}. \]
Using $p$-adic logarithms in $\overline{\mathbb{Q}_p}$, the algebraic closure of $\mathbb{Q}_p$, and applying
Lemma 1.11.2 and (1.11.1), we infer that
\[ \operatorname{ord}_{\mathfrak{p}} \Lambda' = \operatorname{ord}_{\mathfrak{p}}(\Lambda) \ge c_{20} B - c_{21}, \tag{5.2.11} \]
where
\[ \Lambda' = \log_p \eta_0 + d_1 \log_p \eta_1 + \cdots + d_{q_2} \log_p \eta_{q_2}. \]
We note that here $\operatorname{ord}_{\mathfrak{p}}(\eta_j) = 0$ for $j = 0, \ldots, q_2$, hence the $p$-adic logarithms
of $\eta_0, \ldots, \eta_{q_2}$ are well-defined (see Section 1.11). Further, $\log_p \eta_0, \ldots, \log_p \eta_{q_2}$
are elements of $K_{\mathfrak{p}}$, the $\mathfrak{p}$-adic completion of $K$ (see Proposition 1.11.3), and
they can be approximated with any desired accuracy; see de Weger (1989) and
Smart (1998), chapter 5.
Step 3. We consider now (5.2.11) in the following more general form:
\[ \infty > \operatorname{ord}_p(b_1 \vartheta_1 + \cdots + b_t \vartheta_t) \ge c_{23} B - c_{24}, \tag{5.2.12} \]
where $\vartheta_1, \ldots, \vartheta_t$ are given elements of $K_{\mathfrak{p}}$, $c_{23} > 0$, $c_{24}$ are given explicit
constants, $b_1, \ldots, b_t \in \mathbb{Z}$ with $|b_j| \le B$ for $j = 1, \ldots, t$ and $B \le B_0$ for some
explicit constant $B_0$. This is the $p$-adic analogue of the inequality (5.2.1).
We first show that (5.2.12) can be reduced to the case when, in (5.2.12), all
$\vartheta_i$ are integers in $\mathbb{Q}_p$. The field $\mathbb{Q}_p(\vartheta_1, \ldots, \vartheta_t)$ is a finite extension of degree
$m$, say, of $\mathbb{Q}_p$. Using standard arguments, we can determine an element $\delta$ which
is integral over $\mathbb{Q}_p$, and $p$-adic numbers $\vartheta_{ij}$ ($i = 1, \ldots, t$, $j = 0, \ldots, m - 1$),
such that $\mathbb{Q}_p(\vartheta_1, \ldots, \vartheta_t) = \mathbb{Q}_p(\delta)$, and
\[ \vartheta_i = \sum_{j=0}^{m-1} \vartheta_{ij} \delta^j, \quad i = 1, \ldots, t. \]
Putting
\[ \Lambda(b) := \sum_{i=1}^{t} b_i \vartheta_i \quad \text{and} \quad \Lambda_j(b) = \sum_{i=1}^{t} b_i \vartheta_{ij} \quad (j = 0, \ldots, m - 1), \]
we have
\[ \Lambda(b) = \sum_{j=0}^{m-1} \Lambda_j(b) \delta^j. \tag{5.2.13} \]
We claim that
\[ \operatorname{ord}_p(\Lambda_j(b)) \ge c_{23} B - c_{24} - \tfrac{1}{2} \operatorname{ord}_p(D(\delta)) \tag{5.2.14} \]
for $j = 0, \ldots, m - 1$, where $D(\delta)$ denotes the discriminant of $\delta$ over $\mathbb{Q}_p$.
To prove (5.2.14), consider the conjugates $\delta^{(1)} = \delta, \ldots, \delta^{(m)}$ of $\delta$ over $\mathbb{Q}_p$.
Taking the corresponding conjugates in (5.2.13) we get
\[ \Lambda^{(i)}(b) = \sum_{j=0}^{m-1} \Lambda_j(b) (\delta^{(i)})^j, \quad i = 1, \ldots, m. \tag{5.2.15} \]
Put $\Delta(\delta) := \prod_{1 \le i < j \le m} (\delta^{(i)} - \delta^{(j)})$. It follows from (5.2.15) that there are $p$-adic algebraic numbers $\kappa_{ij}$ such that $\operatorname{ord}_p(\kappa_{ij}) \ge 0$ and
\[ \Delta(\delta) \Lambda_j(b) = \sum_{i=1}^{m} \kappa_{ij} \Lambda^{(i)}(b) \quad \text{for } j = 0, \ldots, m - 1. \tag{5.2.16} \]
Since $\operatorname{ord}_p(\Lambda^{(i)}(b))$ does not depend on $i$, we infer from (5.2.12), (5.2.16) and
$2 \operatorname{ord}_p(\Delta(\delta)) = \operatorname{ord}_p(D(\delta))$ that (5.2.14) holds.
We could consider here the linear form estimates (5.2.14) simultaneously,
as is done in Smart (1995, 1998). This would give a better reduced bound than
using only one linear form. Nevertheless, we work only with one form, say
$\Lambda_{j_0}(b) = \sum_{i=1}^{t} b_i \vartheta_{ij_0}$ such that $\Lambda_{j_0}(b) \ne 0$. On one hand, this case is simpler
to apply. On the other hand it will enable us to apply the LLL-reduction
algorithm similarly to that used in the complex case.
For simplicity we omit the index $j_0$. Then, in view of (5.2.14), we arrive
at (5.2.12) under the assumption that now $\vartheta_1, \ldots, \vartheta_t$ are elements of $\mathbb{Q}_p$ and
$c_{24}$ is replaced by $c_{25} := c_{24} + \tfrac{1}{2} \operatorname{ord}_p(D(\delta))$. We may assume without loss of
generality that $\min_i \operatorname{ord}_p(\vartheta_i) = \operatorname{ord}_p(\vartheta_t) =: c_{26}$. Then
\[ \vartheta_i' := -\vartheta_i/\vartheta_t \in \mathbb{Z}_p \quad \text{for } i = 1, \ldots, t - 1 \]
and (5.2.12) implies that
\[ \infty > \operatorname{ord}_p(-b_1 \vartheta_1' - \cdots - b_{t-1} \vartheta_{t-1}' + b_t) \ge c_{23} B - c_{27}, \tag{5.2.17} \]
where $c_{27} := c_{25} + c_{26}$.
Step 4. We apply now the LLL-reduction algorithm to (5.2.17) to reduce the
bound $B_0$.
For any $\vartheta \in \mathbb{Z}_p$ and for any positive integer $u$ denote by $\vartheta^{\{u\}}$ the unique
rational integer such that
\[ \vartheta \equiv \vartheta^{\{u\}} \pmod{p^u} \quad \text{and} \quad 0 \le \vartheta^{\{u\}} < p^u. \]
Let $L_u$ denote the $t$-dimensional lattice generated by the columns of the matrix
\[ A = \begin{pmatrix}
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & & \vdots & \vdots \\
0 & 0 & \cdots & 1 & 0 \\
{\vartheta_1'}^{\{u\}} & {\vartheta_2'}^{\{u\}} & \cdots & {\vartheta_{t-1}'}^{\{u\}} & p^u
\end{pmatrix}. \]
For any $b = (b_1, \ldots, b_t) \in \mathbb{Z}^t$, write
\[ \Lambda(b) = -b_1 \vartheta_1' - \cdots - b_{t-1} \vartheta_{t-1}' + b_t. \]
We claim that
\[ L_u = \{ b^T : b = (b_1, \ldots, b_t) \in \mathbb{Z}^t \text{ and } \operatorname{ord}_p \Lambda(b) \ge u \}. \]
Indeed, if $b^T \in L_u$ then $b^T = A x^T$ for some $x = (x_1, \ldots, x_t) \in \mathbb{Z}^t$. This implies
that $b_i = x_i$ for $i = 1, \ldots, t - 1$ and
\[ b_t = \sum_{i=1}^{t-1} x_i {\vartheta_i'}^{\{u\}} + x_t p^u \equiv \sum_{i=1}^{t-1} b_i \vartheta_i' \pmod{p^u}, \]
whence $\operatorname{ord}_p \Lambda(b) \ge u$ follows. Conversely, if $\operatorname{ord}_p(\Lambda(b)) \ge u$ for some $b \in \mathbb{Z}^t$
then there exists $x \in \mathbb{Z}^t$ such that $b^T = A x^T$, that is $b^T \in L_u$ which proves our
claim.
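The description of $L_u$ can be checked on a toy example. The sketch below assumes, purely for illustration, that the $\vartheta_i'$ are rational numbers with denominators prime to $p$, so that $\vartheta^{\{u\}}$ can be computed with a modular inverse; in the actual application the $\vartheta_i'$ are $p$-adic numbers known to finite precision. All names and sample values are hypothetical.

```python
from fractions import Fraction

def truncate(theta, p, u):
    """The integer in [0, p^u) congruent to theta modulo p^u.

    theta must be a Fraction whose denominator is not divisible by p.
    """
    mod = p ** u
    return theta.numerator * pow(theta.denominator, -1, mod) % mod

def ord_p(x, p):
    """p-adic order of a non-zero rational number."""
    x = Fraction(x)
    num, den, n = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        n += 1
    while den % p == 0:
        den //= p
        n -= 1
    return n

def linear_form(b, thetas):
    """Lambda(b) = -b_1*theta_1 - ... - b_{t-1}*theta_{t-1} + b_t."""
    return b[-1] - sum(bi * th for bi, th in zip(b, thetas))

def in_lattice(b, thetas, p, u):
    """Membership test for L_u via the congruence satisfied by b_t."""
    mod = p ** u
    return (b[-1] - sum(bi * truncate(th, p, u) for bi, th in zip(b, thetas))) % mod == 0

# Hypothetical data: p = 5, u = 3, two rational 5-adic integers.
p, u = 5, 3
thetas = [Fraction(1, 3), Fraction(-2, 7)]
columns = [(1, 0, truncate(thetas[0], p, u)),
           (0, 1, truncate(thetas[1], p, u)),
           (0, 0, p ** u)]
```

Each column of $A$ built this way lies in $L_u$ and gives a value $\Lambda(b)$ of $p$-adic order at least $u$, while a vector such as $(1, 0, 0)$ fails both tests.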
We recall that in (5.2.17) we have a bound $B_0$ for $B$. Choose an integer
constant $u$ such that $p^u \ge B_0^{t+1}$. We may expect that $u$ is large enough to bound
$B$ using the following lemma. If it is not sufficiently large then we make $u$ a
little larger and apply the lemma again.
Let $a_1$ denote the first vector of an LLL-reduced basis of $L_u$. We prove the
following analogue of Lemma 5.2.1.
Lemma 5.2.3 If in (5.2.17) $\max_i |b_i| \le B \le B_0$ and
\[ \|a_1\| > \sqrt{t\, 2^{t-1}}\, B_0, \tag{5.2.18} \]
then
\[ B \le \frac{1}{c_{23}} (u - 1 + c_{27}). \tag{5.2.19} \]
Proof. Using (iv) of Proposition 5.6.1 and (5.2.18) we infer that
\[ \|a_0\|^2 \ge 2^{1-t} \|a_1\|^2 > t B_0^2, \]
where $a_0$ denotes the shortest non-zero vector in the lattice $L_u$. Hence we infer
that
\[ \|a_0\| > \sqrt{t}\, B_0. \]
This means that for $b = (b_1, \ldots, b_t) \in \mathbb{Z}^t$ with $\max_i |b_i| \le B \le B_0$ which satisfies (5.2.17), $(b_1, \ldots, b_t)^T$ cannot be a lattice point in $L_u$. Hence, for such a
$b$, $\operatorname{ord}_p \Lambda(b) \le u - 1$, which, together with (5.2.17), gives (5.2.19).
Similarly to Lemma 5.2.1, Lemma 5.2.3 also reduces the bound $B_0$ to almost
its logarithm. Of course, Lemma 5.2.3 can also be applied repeatedly as long as
it yields a better bound than the previous one.
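The choice of $u$ and the conclusion of Lemma 5.2.3 amount to the following small computation. This is a sketch only: the first vector $a_1$ of an LLL-reduced basis of $L_u$ is taken as given input, and the function names and sample numbers are hypothetical.

```python
def choose_u(p, B0, t):
    """Smallest positive integer u with p**u >= B0**(t + 1) (B0 an integer)."""
    u = 1
    while p ** u < B0 ** (t + 1):
        u += 1
    return u

def padic_reduced_bound(a1, t, B0, u, c23, c27):
    """Return the bound (5.2.19) if condition (5.2.18) holds, otherwise None."""
    norm_sq = sum(x * x for x in a1)
    if norm_sq <= t * 2 ** (t - 1) * B0 ** 2:
        return None                    # u was too small; enlarge u and retry
    return (u - 1 + c27) / c23

# Hypothetical inputs for illustration.
u = choose_u(p=5, B0=10, t=2)
new_bound = padic_reduced_bound(a1=(40, 30), t=2, B0=10, u=u, c23=0.5, c27=1.0)
```

Here $5^5 = 3125$ is the first power of $5$ exceeding $B_0^{t+1} = 1000$, so $u = 5$, and the sample vector is long enough for (5.2.18) to hold.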
Finally we note that performing the above procedure in (5.2.11) for all
$v \in S_1$ (when $B = \max_j |b_{1,j}|$), then proceeding similarly for all $v \in S_2$ (when
$B = \max_j |b_{2,j}|$), and denoting by $B_R$ the maximum of the reduced bounds
obtained, we get
\[ B \le B_R \]
in our equation (5.1.3).
5.3 Enumeration of the “small” solutions
In the first section we gave an upper bound for the solutions $b_{i,j}$ of the equation (5.1.3). Further, in the previous section we considerably reduced this bound
to a new bound $B_R$. A crucial problem in the resolution of equation (5.1.3) is
now to check the remaining $(2B_R + 1)^{r_1 + r_2}$ cases for the exponents, where $r_1$,
$r_2$ denote the ranks of $\Gamma_1$ and $\Gamma_2$, respectively. Even if the bound $B_R$ is moderate
(say $< 100$) the direct enumeration is almost hopeless whenever the number
$r_1 + r_2$ of exponents is greater than eight.
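The size of the naive search space is easy to quantify. The helper below is a hypothetical illustration of the count $(2B_R + 1)^{r_1 + r_2}$, not part of the algorithm itself.

```python
def brute_force_cases(B_R, r1, r2):
    """Number of exponent vectors with all |b_{i,j}| <= B_R."""
    return (2 * B_R + 1) ** (r1 + r2)

# With B_R = 100 and nine exponents the count already exceeds 10**20.
cases = brute_force_cases(100, 4, 5)
```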
We now present an efficient algorithm for finding all solutions of (5.1.3)
under the reduced bound $B_R$. This algorithm has been established by Wildanger
(1997, 2000) for the case when both $\Gamma_1$ and $\Gamma_2$ are the unit group of $O_K$, and by
Smart (1999) in the general case. We follow the presentation of Smart (1999)
with certain simplifications.
For any real number H > 1 and for any finite set S of places of K containing
the infinite places, we define the set
1
≤ |α|v ≤ H for all v ∈ S .
H, S := α ∈ K:
H
Denote by $\mathcal{S}$ the set of solutions $(x_1, x_2) \in \Gamma_1 \times \Gamma_2$ of (5.1.1). Writing $x_1$, $x_2$
in the form (5.1.2), we consider (5.1.1) as the exponential equation (5.1.3) in
integers $b_{i,j}$ with $B = \max_{i,j} |b_{i,j}|$. For a positive integer $B_k$, denote by $\mathcal{S}_{B_k}$
the set of solutions of (5.1.1) such that the absolute values of the corresponding
exponents $b_{i,j}$ are at most $B_k$. Then $\mathcal{S} = \mathcal{S}_{B_R}$. We define
SBk (H ) := {(x1 , x2 ) ∈ SBk : x1 ∈ H, S1 }.
We first show that for
\[ H_0 := \max_{v \in S_1} \exp\Big(B_R \sum_{j=1}^{r_1} \big|\log |\xi_{1,j}|_v\big|\Big), \]
we have
\[ \mathcal{S} = \mathcal{S}_{B_R}(H_0). \tag{5.3.1} \]
Indeed, using (5.1.2), for every solution $(x_1, x_2)$ of (5.1.1) and for each $v \in S_1$
we infer that
\[ |\log |x_1|_v| = \Big|\sum_{j=1}^{r_1} b_{1,j} \log |\xi_{1,j}|_v\Big| \le B \sum_{j=1}^{r_1} \big|\log |\xi_{1,j}|_v\big| \le \max_{v \in S_1} \Big(B_R \sum_{j=1}^{r_1} \big|\log |\xi_{1,j}|_v\big|\Big) = \log H_0. \]
This means that
\[ \frac{1}{H_0} \le |x_1|_v \le H_0, \]
whence (5.3.1) follows.
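As a numerical illustration of the definition of $H_0$, the following sketch computes it from a table of values $\log |\xi_{1,j}|_v$. The function name and the sample table are hypothetical; in practice the logarithms come from the chosen generators of $\Gamma_1$.

```python
import math

def initial_height(B_R, log_abs):
    """H0 = max over v of exp(B_R * sum_j |log|xi_{1,j}|_v|).

    log_abs[v][j] holds log|xi_{1,j}|_v for the v-th place of S1.
    """
    return max(math.exp(B_R * sum(abs(x) for x in row)) for row in log_abs)

# Hypothetical table: two places, two generators.
table = [[0.5, -0.25], [1.0, 0.0]]
H0 = initial_height(2, table)
```

For this table the second place gives the larger sum, so $H_0 = e^2$, and every exponent vector with $|b_{1,j}| \le 2$ satisfies $|\log |x_1|_v| \le \log H_0$ at both places.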
In what follows, we shall proceed in several steps.
Step 1. We first decompose the solution set $\mathcal{S}$ into appropriate subsets.
Set
\[ t_i := \max_{v \in S_1 \cup S_2} \max\{|a_i|_v,\ |a_i^{-1}|_v\} \quad \text{for } i = 1, 2, \]
and
\[ t_3 := \max_{v \in S_1 \cup S_2} \min\{|a_2|_v,\ |a_2^{-1}|_v\}. \]
For $k \ge 0$, let $B_k$ be a positive number with the choice $B_0 = B_R$, and let $H_k$,
$H_{k+1}$ be real numbers such that
\[ \max\Big\{t_1,\ t_2,\ t_3,\ \frac{t_3 - 1}{t_1}\Big\} < H_{k+1} < H_k. \]
Note that $H_{k+1} > 1$. We intend to find a positive number $B_{k+1} < B_k$ and then
decompose the set $\mathcal{S}_{B_k}(H_k)$ into the union of $\mathcal{S}_{B_{k+1}}(H_{k+1})$ and a union of some
subsets, each containing a few elements which can be easily determined. If,
starting with $k = 0$, that is with (5.3.1), this process can be repeated, finally it
will remain to enumerate a set of the form $\mathcal{S}_{B_{k_0}}(H_{k_0})$ for some small values of
$B_{k_0}$ and $H_{k_0}$.
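We define the sets $T_{j,v} = T_{j,v}(B_k, H_k, H_{k+1})$, $j = 1, 2, 3, 4$, in the following
way:
\[ T_{1,v} := \Big\{(x_1, x_2) \in \mathcal{S}_{B_k}(H_k) : |a_1 x_1 - 1|_v < \frac{1}{1 + t_1 H_{k+1}}\Big\} \quad (v \in S_2), \]
\[ T_{2,v} := \Big\{(x_1, x_2) \in \mathcal{S}_{B_k}(H_k) : \Big|\frac{1}{a_1 x_1} - 1\Big|_v < \frac{1}{1 + t_1 H_{k+1}}\Big\} \quad (v \in S_1 \cup S_2), \]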
'
t1
,
T3,v := (x1 , x2 ) ∈ SBk (Hk ) : |a2 x2 − 1|v <
Hk+1
(
a2 x2 ∈ 1 + t1 Hk , S2 (v ∈ S1 ),
'
a2 x2
t1
T4,v := (x1 , x2 ) ∈ SBk (Hk ) : −
− 1 <
,
a1 x1
Hk+1
v
(
a2 x2
∈ 1 + t1 Hk , S1 ∪ S2 (v ∈ S1 ).
a1 x1
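Further, let
\[ \begin{aligned}
T_1(B_k, H_k, H_{k+1}) &:= \bigcup_{v \in S_2} T_{1,v}(B_k, H_k, H_{k+1}), \\
T_2(B_k, H_k, H_{k+1}) &:= \bigcup_{v \in S_1 \cup S_2} T_{2,v}(B_k, H_k, H_{k+1}), \\
T_3(B_k, H_k, H_{k+1}) &:= \bigcup_{v \in S_1} T_{3,v}(B_k, H_k, H_{k+1}), \\
T_4(B_k, H_k, H_{k+1}) &:= \bigcup_{v \in S_1} T_{4,v}(B_k, H_k, H_{k+1}).
\end{aligned} \]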
We recall that c1∗ denotes the constant occurring in (5.1.5).
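Lemma 5.3.1 Let
\[ c_{28} := \max\Big\{\log\Big(\frac{t_1 H_{k+1} + 1}{t_3}\Big),\ \log(H_{k+1})\Big\} \]
and $B_{k+1} := c_1^* c_{28}$. Then
\[ \mathcal{S}_{B_k}(H_k) = \mathcal{S}_{B_{k+1}}(H_{k+1}) \cup \bigcup_{j=1}^{4} T_j(B_k, H_k, H_{k+1}). \tag{5.3.2} \]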
Proof. Assume that $(x_1, x_2) \in \mathcal{S}_{B_k}(H_k)$ and that $(x_1, x_2) \notin \mathcal{S}_{B_k}(H_{k+1})$. Then
there is a $v \in S_1$ such that either $|x_1|_v < 1/H_{k+1}$ or $|x_1|_v > H_{k+1}$. In the first
case we infer that
\[ |a_2 x_2 - 1|_v = |a_1 x_1|_v < \frac{t_1}{H_{k+1}}. \]
If $(x_1, x_2) \notin T_1(B_k, H_k, H_{k+1})$ then, for each $u \in S_2$,
\[ |a_2 x_2|_u = |a_1 x_1 - 1|_u \ge \frac{1}{1 + t_1 H_{k+1}}. \]
Further, we have
\[ |a_2 x_2|_u = |a_1 x_1 - 1|_u \le 1 + |a_1 x_1|_u \le \begin{cases} 1 + t_1 H_k & \text{if } u \in S_1, \\ 1 + t_1 & \text{if } u \in S_2 \setminus S_1. \end{cases} \]
Consequently, for each $u \in S_2$ we have
\[ |\log |a_2 x_2|_u| \le \max\{\log(1 + t_1),\ \log(1 + t_1 H_{k+1}),\ \log(1 + t_1 H_k)\} = \log(1 + t_1 H_k). \]
This implies that if $|x_1|_v < 1/H_{k+1}$ for some $v \in S_1$ and $(x_1, x_2) \notin T_1(B_k, H_k, H_{k+1})$, then $(x_1, x_2) \in T_3(B_k, H_k, H_{k+1})$.
Next consider the case when $|x_1|_v > H_{k+1}$. Then it follows that
\[ \Big|-\frac{a_2 x_2}{a_1 x_1} - 1\Big|_v = \Big|\frac{1}{a_1 x_1}\Big|_v < \frac{t_1}{H_{k+1}}. \]
Assume that $(x_1, x_2) \notin T_2(B_k, H_k, H_{k+1})$. Then for each $u \in S_1 \cup S_2$ we have
\[ \Big|\frac{a_2 x_2}{a_1 x_1}\Big|_u = \Big|1 - \frac{1}{a_1 x_1}\Big|_u \ge \frac{1}{1 + t_1 H_{k+1}}. \]
Further, it follows that
\[ \Big|\frac{a_2 x_2}{a_1 x_1}\Big|_u = \Big|1 - \frac{1}{a_1 x_1}\Big|_u \le 1 + \Big|\frac{1}{a_1 x_1}\Big|_u \le \begin{cases} 1 + t_1 H_k & \text{if } u \in S_1, \\ 1 + t_1 & \text{if } u \in S_2 \setminus S_1. \end{cases} \]
This implies that $(x_1, x_2) \in T_4(B_k, H_k, H_{k+1})$. Hence
\[ \mathcal{S}_{B_k}(H_k) = \mathcal{S}_{B_k}(H_{k+1}) \cup \Big(\bigcup_{j=1}^{4} T_j(B_k, H_k, H_{k+1})\Big). \tag{5.3.3} \]
Now consider the case when $(x_1, x_2) \in \mathcal{S}_{B_k}(H_{k+1})$. Then for each $v \in S_1 \cup S_2$ we have $|\log |x_1|_v| \le \log H_{k+1}$ and
\[ |x_2|_v = \Big|\frac{a_1 x_1 - 1}{a_2}\Big|_v \le \frac{|a_1 x_1|_v + 1}{|a_2|_v} \le \frac{t_1 H_{k+1} + 1}{t_3}. \]
If $(x_1, x_2) \notin T_1(B_k, H_k, H_{k+1})$, then for each $v \in S_2$
\[ |x_2|_v = \Big|\frac{a_1 x_1 - 1}{a_2}\Big|_v \ge \frac{t_3}{t_1 H_{k+1} + 1}. \]
Clearly, this inequality holds for $v \in S_1 \setminus S_2$ as well. Thus, we deduce that if
$(x_1, x_2) \in \mathcal{S}_{B_k}(H_{k+1}) \setminus T_1(B_k, H_k, H_{k+1})$, then for each $v \in S_1 \cup S_2$
\[ |\log |x_1|_v| \le c_{28}, \qquad |\log |x_2|_v| \le c_{28}. \]
However, in view of (5.1.5) we must have
\[ B \le c_1^* c_{28} = B_{k+1}. \]
Thus
\[ \mathcal{S}_{B_k}(H_{k+1}) = \mathcal{S}_{B_{k+1}}(H_{k+1}) \cup T_1(B_k, H_k, H_{k+1}) \]
which, together with (5.3.3), completes the proof.
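One step of the sieve changes the exponent bound according to Lemma 5.3.1. The following sketch evaluates $c_{28}$ and $B_{k+1}$; the function name and the sample values of $c_1^*$, $t_1$, $t_3$ and $H_{k+1}$ are hypothetical.

```python
import math

def next_exponent_bound(c1_star, t1, t3, H_next):
    """B_{k+1} = c1* * c28 with c28 as in Lemma 5.3.1."""
    c28 = max(math.log((t1 * H_next + 1) / t3), math.log(H_next))
    return c1_star * c28

# Hypothetical inputs: here (t1 * H + 1) / t3 = e, so c28 = 1.
B_next = next_exponent_bound(c1_star=2.0, t1=1.0, t3=1.0, H_next=math.e - 1)
```

Since $B_{k+1}$ grows only like $\log H_{k+1}$, shrinking $H_{k+1}$ lowers the exponent bound slowly, which is why the choice of the sequence $H_k$ discussed next matters in practice.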
Remark 5.3.2 When applying Lemma 5.3.1 we need to choose a value $H_{k+1}$
such that the algorithm described below allows us to deduce that the sets
$T_{j,v}(B_k, H_k, H_{k+1})$ are easy to enumerate for each $j$ and $v$ under consideration.
Wildanger (2000) provides a heuristic method to find the best value for $H_{k+1}$ in
the case when $\Gamma_1 = \Gamma_2$ is the unit group of $O_K$. In the general case the analysis
appears similar, and the choice of Wildanger for $H_{k+1}$ seems to be sufficient.
Step 2. We are going to show how to enumerate, for each j and v in question,
all the possible elements in Tj,v (Bk , Hk , Hk+1 ).
In all cases our problem can be reformulated as trying to enumerate all nontrivial solutions of the following problem. Let $\alpha, \xi_1, \ldots, \xi_r$ be explicitly given
elements of $K^*$ such that $\xi_1, \ldots, \xi_r$ are multiplicatively independent. Further,
let $S = \{v_1, \ldots, v_s\}$ be the support of the multiplicative group generated by
$\xi_1, \ldots, \xi_r$. Set
\[ x = \zeta^{b_0} \prod_{j=1}^{r} \xi_j^{b_j}, \tag{5.3.4} \]
where $\zeta$ is a primitive root of unity in $K$, and $b_0, \ldots, b_r$ are rational integers
with $0 \le b_0 < w$, where $w$ denotes the number of roots of unity in $K$. We have,
for some $H > 1$ and for all $v \in S$,
\[ \frac{1}{H} \le |\alpha x|_v \le H. \tag{5.3.5} \]
We wish to determine all $x$ for which (5.3.4), (5.3.5) and, for some $v \in S$ and
some given $\varepsilon \in (0, 1)$,
\[ |\alpha x - 1|_v < \varepsilon \tag{5.3.6} \]
hold.
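We shall distinguish two cases according as $v$ is infinite or finite in (5.3.6).
$v$ is infinite. Let $v = v_i$. For any $z \in \mathbb{C}$, $|z - 1| < \varepsilon$ implies that $|\log |z|| < \log(1/(1 - \varepsilon))$. Hence we deduce from (5.3.6) that
\[ |\log |\alpha x|_v| \le \varepsilon' := \begin{cases} \log \dfrac{1}{1 - \varepsilon}, & v \text{ real}, \\[2mm] \dfrac{1}{2} \log \dfrac{1}{1 - \sqrt{\varepsilon}}, & v \text{ complex}. \end{cases} \]
Further, we have, with obvious notation
\[ \big|\mathrm{Arg}\,(\alpha x)^{(i)}\big| \le \arccos \sqrt{1 - \varepsilon} =: \varepsilon''. \]
For simplicity, for any $\beta \in K^*$ we denote by $\beta^{(i)}$ the conjugate of $\beta$ over $\mathbb{Q}$
corresponding to $v_i$.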
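Consider the sublattice of $\mathbb{R}^{s+1}$ which is generated by the columns of the
matrix $M$ obtained from the $(s + 1) \times (r + 1)$ matrix
\[ \frac{1}{\log H} \begin{pmatrix}
\log |\xi_1|_{v_1} & \cdots & \log |\xi_r|_{v_1} & 0 \\
\vdots & & \vdots & \vdots \\
\log |\xi_1|_{v_s} & \cdots & \log |\xi_r|_{v_s} & 0 \\
0 & \cdots & 0 & 0
\end{pmatrix} \]
by replacing the $i$-th row by
\[ \frac{1}{\varepsilon'} \big(\log |\xi_1|_{v_i}, \ldots, \log |\xi_r|_{v_i}, 0\big) \]
and the last row by
\[ \frac{1}{\varepsilon''} \big(\mathrm{Arg}\,\xi_1^{(i)}, \ldots, \mathrm{Arg}\,\xi_r^{(i)}, \mathrm{Arg}\,\zeta^{(i)}\big). \]
We expect the $i$-th and last rows of $M$ to have much larger entries than the other
rows.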
Let $\mathbf{x} = M\mathbf{b}$, where $\mathbf{b} = (b_1, \ldots, b_r, b_0)^T$, and consider the vector $\mathbf{y}$
obtained from the vector
\[ \frac{1}{\log H} \big(-\log |\alpha|_{v_1}, \ldots, -\log |\alpha|_{v_r}, 0\big)^T \in \mathbb{R}^{r+1} \]
by replacing the $i$-th coordinate by $-\log |\alpha|_{v_i}/\varepsilon'$ for $i = 1, \ldots, s$, and the last
coordinate by $\mathrm{Arg}((1/\alpha)^{(i)})/\varepsilon''$. Then we have
\[ \|\mathbf{x} - \mathbf{y}\|^2 = \sum_{\substack{v \in S \\ v \ne v_i}} \frac{\log^2 |\alpha x|_v}{\log^2 H} + \frac{\log^2 |\alpha x|_{v_i}}{(\varepsilon')^2} + \frac{\mathrm{Arg}^2 (\alpha x)^{(i)}}{(\varepsilon'')^2} \le s + 1. \tag{5.3.7} \]
Hence we have proved that for any $(b_0, b_1, \ldots, b_r) \in \mathbb{Z}^{r+1}$ which corresponds
by (5.3.4) to a solution $x$ of (5.3.5) and (5.3.6), inequality (5.3.7) holds. The
inequality (5.3.7) defines an ellipsoid with center $\mathbf{y}$. The lattice points contained
in this ellipsoid can be enumerated by means of the algorithm of Fincke and
Pohst (1985). The enumeration is usually very fast. However, it is essential that
the improved version (see Fincke and Pohst (1985)) of the algorithm must be
used, involving LLL reduction.
v is finite. Let again p be the prime ideal of OK that corresponds to v.
We proceed as above with the following modifications. Then (5.3.6) implies
that $\mathrm{ord}_{\mathfrak{p}}(\alpha x) = 0$. As in Step 1 of Section 5.2.2, we can reduce (5.3.4) to
a similar problem where $\alpha x = \eta_0 \prod_{j=1}^{q} \eta_j^{d_j}$ with multiplicatively independent
$\eta_1, \ldots, \eta_q$ such that $\mathrm{ord}_{\mathfrak{p}}(\eta_j) = 0$ for $j = 0, \ldots, q$. We recall that $\eta_0$ may
assume finitely many values, each of which can be determined. We have to
perform our computations for every possible value of η0 . Suppose that p has
residue degree f and that it lies above the rational prime p. Choose a positive
integer n such that
≤ p−f n .
Then in view of (5.3.6) we have
\[
\eta_0 \prod_{j=1}^{q} \eta_j^{d_j} \equiv 1 \pmod{\mathfrak{p}^n}. \tag{5.3.8}
\]
First assume that $\eta_0, \ldots, \eta_q$ are multiplicatively independent. Let $G$ denote
the subgroup of $K^*$ generated by $\eta_0, \ldots, \eta_q$. Since $\mathrm{ord}_{\mathfrak{p}}(\eta_j) = 0$ for all $j$, we
can consider the image of $G$ in $(O_K/\mathfrak{p}^n)^*$ under reduction mod $\mathfrak{p}^n$. The order
of $\eta_j \pmod{\mathfrak{p}^n}$ can be computed very quickly, as it is a divisor of the order
of $(O_K/\mathfrak{p}^n)^*$ which is $p^{(n-1)f}(p^f - 1)$. All $\mathbf{d} = (d_0, \ldots, d_q) \in \mathbb{Z}^{q+1}$ for which
$\eta_0^{d_0} \cdots \eta_q^{d_q} \equiv 1 \pmod{\mathfrak{p}^n}$ form a full lattice in $\mathbb{Z}^{q+1}$, a basis of which can be
computed by using the algorithm MINIMIZE; see Teske (1998). This algorithm
computes such a basis in the form
\[
\mathbf{d}_j = (d_{0j}, \ldots, d_{jj}, 0, \ldots, 0) \quad \text{for } j = 0, \ldots, q,
\]
where $d_{ij} \in \mathbb{Z}$ and $d_{jj}$ is the smallest positive integer for which $\eta_j^{d_{jj}}$ belongs to
the subgroup of $G/\mathfrak{p}^n$ generated by $\{\eta_0, \ldots, \eta_{j-1}\}$. Then putting
\[
\eta'_j := \eta_0^{d_{0j}} \cdots \eta_j^{d_{jj}},
\]
we can write
\[
\alpha x = \prod_{j=0}^{q} {\eta'_j}^{n_j}
\]
with suitable integers $n_0, \ldots, n_q$. Let $S' = \{u_1, \ldots, u_{s'}\}$ denote the support of
the group generated by $\eta'_0, \ldots, \eta'_q$. Obviously $S' \subset S$. We can now proceed in
a similar way as in the case of infinite places. Consider the sublattice of $\mathbb{R}^{s'}$
generated by the vectors
\[
\boldsymbol{\eta}'_i = \frac{1}{\log H} \bigl( \log|\eta'_i|_{u_1}, \ldots, \log|\eta'_i|_{u_{s'}} \bigr)
\]
for $i = 0, \ldots, q$. We have
\[
\| n_0 \boldsymbol{\eta}'_0 + \cdots + n_q \boldsymbol{\eta}'_q \|^2 = \sum_{u \in S'} \frac{\log^2|\alpha x|_u}{\log^2 H} \le s + 1.
\]
Then, as above, we can determine all $(n_0, \ldots, n_q)$ and hence all $x$ under
consideration using the Fincke–Pohst algorithm.
Next consider the case when $\eta_0, \ldots, \eta_q$ are multiplicatively dependent, and
let $h$ be a positive integer for which $\eta_0^h$ is contained in the multiplicative group
generated by $\eta_1, \ldots, \eta_q$. Then it follows from (5.3.8) that
\[
(\alpha x)^h = \prod_{j=1}^{q} \eta_j^{d_j} \equiv 1 \pmod{\mathfrak{p}^n}
\]
with some rational integers $d_1, \ldots, d_q$. Then we infer as above that
\[
(\alpha x)^h = \prod_{j=1}^{q} {\eta'_j}^{n_j}
\]
with some rational integers $n_1, \ldots, n_q$. Thus, following the above procedure,
we can determine all $(n_1, \ldots, n_q)$ and hence, up to a factor of a root of unity
in $K$, all $x$ can also be found. The factor in question can be easily determined
from equation (5.1.1).
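The remark in the finite case above, that the order of an element modulo a prime power is a divisor of the group order and hence quick to compute, can be illustrated in the simplest situation. The sketch below assumes $K = \mathbb{Q}$, so that the prime ideal is generated by a rational prime $p$, the residue degree is $f = 1$ and the group order is $p^{n-1}(p-1)$. The function names are hypothetical, and this is not the algorithm MINIMIZE of Teske (1998).

```python
def prime_factors(n):
    """Distinct prime factors of n by trial division (adequate for small n)."""
    factors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def order_mod_prime_power(a, p, n):
    """Multiplicative order of a modulo p**n, for an integer a not divisible by p."""
    if a % p == 0:
        raise ValueError("a must be a unit modulo p")
    modulus = p ** n
    order = p ** (n - 1) * (p - 1)  # order of the unit group of Z/p^n
    # Strip prime factors from the group order while the power stays trivial.
    for q in prime_factors(order):
        while order % q == 0 and pow(a, order // q, modulus) == 1:
            order //= q
    return order

print(order_mod_prime_power(2, 3, 2))   # order of 2 modulo 9
print(order_mod_prime_power(10, 3, 3))  # order of 10 modulo 27
```

Only a few modular exponentiations are needed, one for each prime power dividing the group order.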
Step 3. By means of the above process we can determine all elements of the set
$T_{j,v}(B_k, H_k, H_{k+1})$ for $j = 1, 2, 3, 4$, and for all $v$ in question.
Finally, at the end of the repeated procedure described in Steps 1 and 2 of
this section we arrive at a set of the form $S_{B_{k_0}}(H_{k_0})$ for some small values of
$B_{k_0}$ and $H_{k_0}$. Consider the lattice in $\mathbb{R}^{s_1}$ generated by the vectors
\[
\boldsymbol{\xi}_i := \frac{1}{\log H_{k_0}} \bigl( \log|\xi_{1,i}|_{v_1}, \ldots, \log|\xi_{1,i}|_{v_{s_1}} \bigr)^T
\]
for $i = 1, \ldots, r_1$, where $\{v_1, \ldots, v_{s_1}\} = S_1$. Then the set $S_{B_{k_0}}(H_{k_0})$ is contained in the ellipsoid
\[
\| b_{1,1} \boldsymbol{\xi}_1 + \cdots + b_{1,r_1} \boldsymbol{\xi}_{r_1} \|^2 \le s_1
\]
whose points can be found by using again the Fincke–Pohst algorithm.
5.4 Examples
We now briefly illustrate the use of the above presented algorithm on two
concrete S-unit equations. In these examples $\Gamma_1 = \Gamma_2$ holds, and this group
is the unit group of the ring of integers, respectively an S-unit group of the
underlying number field. In our book on discriminant equations, we will meet
examples where $\Gamma_1$, $\Gamma_2$ are distinct.
We note that in the examples below the fundamental units were computed
by the KANT package; see Daberkow et al. (1997). For an alternative package,
we mention MAGMA; see Bosma et al. (1997).
Example 5.4.1 (Smart (1997, 1999)). Let $K_{16}$ denote the 16th cyclotomic field
generated by $\zeta$, where $\zeta$ is a primitive 16th root of unity. There is a prime ideal
$\mathfrak{p}$ in $K_{16}$ such that $\mathfrak{p}^8 = (2)$. This ideal is principal and we can take $\pi = 1 - \zeta$
as a generator for $\mathfrak{p}$. Let $S$ denote the set of places of $K_{16}$ which consists of all
infinite places and the single finite place $v = \mathfrak{p}$. Consider the S-unit equation
\[
x_1 + x_2 = 1 \quad \text{in S-units } x_1, x_2 \text{ of } K_{16}. \tag{5.4.1}
\]
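As a numerical sanity check of the setting of Example 5.4.1, the sketch below computes the degree of $K_{16}$, its unit rank, and the norm of $\pi = 1 - \zeta$ as the product of $1 - \zeta^k$ over the $k$ coprime to 16. A norm equal to 2 is consistent with $\pi$ generating a prime ideal $\mathfrak{p}$ with $\mathfrak{p}^8 = (2)$. The function name is hypothetical and the computation uses floating-point arithmetic only.

```python
import cmath
from math import gcd

def norm_of_one_minus_zeta(m):
    """Return (degree, norm) where norm is the product of 1 - zeta^k over gcd(k, m) = 1."""
    product = 1 + 0j
    degree = 0
    for k in range(1, m):
        if gcd(k, m) == 1:
            degree += 1
            product *= 1 - cmath.exp(2j * cmath.pi * k / m)
    return degree, product

degree, norm = norm_of_one_minus_zeta(16)
# K_16 is totally imaginary, so its unit rank is degree / 2 - 1.
unit_rank = degree // 2 - 1
print(degree, unit_rank, round(norm.real, 9), round(abs(norm.imag), 9))
```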
The degree and unit rank of K16 are 8 and 3, respectively. One can take
ε1 = ζ 2 + ζ 4 + ζ 6 , ε2 = − ζ 2 + ζ 3 + ζ 4 , ε3 = 1 + ζ 3 − ζ 5
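as generators for the unit group of $K_{16}$. Then the solutions of (5.4.1) can be
uniquely represented in the form
\[
x_1 = \zeta^{b_{1,0}} \varepsilon_1^{b_{1,1}} \varepsilon_2^{b_{1,2}} \varepsilon_3^{b_{1,3}} \pi^{b_{1,4}}, \qquad
x_2 = \zeta^{b_{2,0}} \varepsilon_1^{b_{2,1}} \varepsilon_2^{b_{2,2}} \varepsilon_3^{b_{2,3}} \pi^{b_{2,4}},
\]
with rational integer exponents $b_{i,j}$. Obviously one can assume that
\[
0 \le b_{1,0},\ b_{2,0} \le 15. \tag{5.4.2}
\]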
Using Baker’s method and reduction techniques, it was shown in Smart (1997,
1999) that
\[
\max_{i,j} |b_{i,j}| \le 1066 = B_R,
\]
where $B_R$ denotes the reduced bound as defined in Section 5.2.
We note that in Smart (1997, 1999) some earlier versions of Theorems 3.2.4, 3.2.7 and of the reduction algorithm were utilized. Using our
versions described in Sections 5.1 and 5.2 we could get a slightly better value
for BR , but this is in fact irrelevant for the last part of the computations.
The enumeration process was applied repeatedly with the initial values
$B_0 = B_R$, $H_0 = 10^{3598}$ and with $c_1^* = 1.63189$. Then $S_{B_0}(H_0)$ is just the set of
solutions of (5.4.1). Smart (1999) then chose
\[
H_1 = 10^{90}, \quad H_2 = 10^{30}, \quad H_3 = 10^{15}, \quad H_4 = 10^{6}, \quad H_5 = 10^{3}.
\]
After the necessary computations it turned out that the sets $T_{j,v}(B_k, H_k, H_{k+1})$
are empty both for $0 \le k \le 4$, $1 \le j \le 4$ and all infinite $v \in S$, and for
$0 \le k \le 2$, $1 \le j \le 4$ and $v$ finite. For the finite $v$ and for $k = 3, 4$, $1 \le j \le 4$,
the solutions in $T_{j,v}(B_k, H_k, H_{k+1})$ were determined by the Fincke–Pohst
algorithm.
Finally, it remained to enumerate the set $S_{B_5}(H_5)$ for $B_5 = 11$, which was
accomplished again by means of the Fincke–Pohst method.
The equation (5.4.1) has exactly 795 solutions, each of which satisfies (5.4.2)
and
\[
\max_{1 \le j \le 4} \bigl( |b_{1,j}|, |b_{2,j}| \bigr) \le 11.
\]
This result was needed in Smart (1997) for the calculation of curves of genus
2 with good reduction away from 2.
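Example 5.4.2 (Wildanger (2000)). Let $K_{19}$ be the 19th cyclotomic field generated
by $\zeta = \exp(2\pi i/19)$, and denote by $K_{19}^+$ the maximal real subfield of
$K_{19}$. Then $K_{19}^+ = \mathbb{Q}(\theta)$, where $\theta = \zeta + \zeta^{-1}$. Consider the unit equation
\[
x_1 + x_2 = 1 \quad \text{in units } x_1, x_2 \text{ of the ring of integers of } K_{19}^+. \tag{5.4.3}
\]
The number field $K_{19}^+$ is totally real, its degree is 9 and its unit rank is 8. Further,
\begin{align*}
\varepsilon_1 &= 1 - 4\theta - 10\theta^2 + 10\theta^3 + 15\theta^4 - 6\theta^5 - 7\theta^6 + \theta^7 + \theta^8, \\
\varepsilon_2 &= 3\theta - \theta^3, \\
\varepsilon_3 &= 1 - 2\theta - 3\theta^2 + \theta^3 + \theta^4, \\
\varepsilon_4 &= 2 - 9\theta^2 + 6\theta^4 - \theta^6, \\
\varepsilon_5 &= \theta, \\
\varepsilon_6 &= 2 - \theta^2, \\
\varepsilon_7 &= 2 - 4\theta^2 + \theta^4, \\
\varepsilon_8 &= -5\theta + 5\theta^2 + 10\theta^3 - 5\theta^4 - 6\theta^5 + \theta^6 + \theta^7
\end{align*}
is a system of fundamental units in $K_{19}^+$. The solutions of (5.4.3) can be written
uniquely in the form
\[
x_1 = \pm \prod_{j=1}^{8} \varepsilon_j^{b_{1,j}}, \qquad x_2 = \pm \prod_{j=1}^{8} \varepsilon_j^{b_{2,j}},
\]
where $b_{i,j}$ are rational integers.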
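The eight elements listed in Example 5.4.2 can be checked numerically to be units: the norm of a polynomial expression in $\theta$ is the product of its values at the nine conjugates $2\cos(2\pi k/19)$, $k = 1, \ldots, 9$, and it should equal $\pm 1$. The sketch below performs this check in floating-point arithmetic. It does not verify that the units are fundamental, and the names `UNITS`, `evaluate` and `norm` are hypothetical.

```python
import math

# Coefficient lists (constant term first) of the eight units as polynomials in theta.
UNITS = {
    1: [1, -4, -10, 10, 15, -6, -7, 1, 1],
    2: [0, 3, 0, -1],
    3: [1, -2, -3, 1, 1],
    4: [2, 0, -9, 0, 6, 0, -1],
    5: [0, 1],
    6: [2, 0, -1],
    7: [2, 0, -4, 0, 1],
    8: [0, -5, 5, 10, -5, -6, 1, 1],
}

# The nine real conjugates of theta = zeta + zeta^(-1).
CONJUGATES = [2 * math.cos(2 * math.pi * k / 19) for k in range(1, 10)]

def evaluate(coeffs, x):
    """Horner evaluation of a polynomial given by its coefficients, constant term first."""
    value = 0.0
    for c in reversed(coeffs):
        value = value * x + c
    return value

def norm(coeffs):
    """Norm from the real subfield to Q as the product over all conjugates."""
    product = 1.0
    for x in CONJUGATES:
        product *= evaluate(coeffs, x)
    return product

for index, coeffs in UNITS.items():
    print(index, round(norm(coeffs), 6))
```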
By means of Baker’s method Wildanger proved that
\[
\max_{i,j} |b_{i,j}| \le 10^{38}.
\]
Further, using reduction techniques he showed that
\[
\max_{i,j} |b_{i,j}| \le 2076 = B_R,
\]
$B_R$ being the reduced bound.
Finally, Wildanger’s variant of the enumeration algorithm was used repeatedly with the initial values $B_0 = B_R$, $H_0 = 6.9 \times 10^{4843}$ and then with the
values
\[
H_1 = 1.49 \times 10^{30}, \quad H_2 = 3.89 \times 10^{11}, \quad H_3 = 5.52 \times 10^{7}, \quad H_4 = 982\,337.37,
\]
\[
H_5 = 73\,360.74, \quad H_6 = 9896.88, \quad H_7 = 1780.14, \quad H_8 = 365.36, \quad H_9 = 74.25, \quad H_{10} = 11.47.
\]
At the final enumeration all the 28 398 solutions of (5.4.3) were found.
5.5 Exceptional units
The units ε of the ring of integers of a number field K such that 1 − ε is also a
unit of this ring of integers are called exceptional units of K, see Nagell (1970).
Nagell (1964, 1968b, 1970) determined all exceptional units in number fields
of unit rank 1 and in certain number fields of unit rank 2.
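A small illustration of the definition, under stated assumptions: in $\mathbb{Q}(\sqrt{5})$ the ring of integers is $\mathbb{Z}[\varphi]$ with $\varphi = (1 + \sqrt{5})/2$, and $a + b\varphi$ has norm $a^2 + ab - b^2$. The sketch below searches a box of coefficients for elements $\varepsilon$ such that both $\varepsilon$ and $1 - \varepsilon$ have norm $\pm 1$. The function names and the box size are hypothetical choices; a box search by itself proves nothing about elements outside the box, although here the six elements found agree with the count listed for $m = 5$ in Wildanger's table in this section.

```python
def norm(a, b):
    """Norm of a + b*phi in Z[phi], where phi^2 = phi + 1."""
    return a * a + a * b - b * b

def exceptional_units_in_box(box=10):
    """Pairs (a, b) with |a|, |b| <= box such that a + b*phi and 1 - (a + b*phi) are units."""
    found = []
    for a in range(-box, box + 1):
        for b in range(-box, box + 1):
            # 1 - (a + b*phi) = (1 - a) + (-b)*phi
            if abs(norm(a, b)) == 1 and abs(norm(1 - a, -b)) == 1:
                found.append((a, b))
    return sorted(found)

print(exceptional_units_in_box())
```

The set found is closed under $\varepsilon \mapsto 1 - \varepsilon$, as it must be.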
As was mentioned above, the number field $K_{19}^+$ has exactly 28 398 exceptional
units. For a positive integer $m$ for which $m \not\equiv 2 \pmod 4$, denote by $K_m$
the $m$-th cyclotomic field and by $K_m^+$ its maximal real subfield. Using the above
method, Wildanger (2000) determined all exceptional units in the number fields
$K_m^+$ for $m \le 23$. Further, by means of the next lemma he extended his result to
the number fields $K_m$ as well.
A number field is called a CM-field if it is a totally imaginary quadratic
extension of a totally real number field. For example, the imaginary quadratic
number fields and the cyclotomic fields are all CM-fields.
Lemma 5.5.1 Let $K$ be a CM-field. Then all non-real exceptional units in $K$
are of the form
\[
\frac{1 - \zeta_2}{\zeta_1 - \zeta_2},
\]
where $\zeta_1, \zeta_2$ are roots of unity in $K$.
Proof. See Győry (1971).
Denote by $S_m$ and $S_m^+$ the set of exceptional units in $K_m$ and $K_m^+$, respectively,
and let $|S_m|$ and $|S_m^+|$ be their cardinalities. The following table given
by Wildanger contains the values of $|S_m|$ and $|S_m^+|$ for those $m$ for which the
unit rank of $K_m^+$ is at most 10.

  m    [K_m^+ : Q]    |S_m^+|    [K_m : Q]     |S_m|
  1         1              0          1             0
  3         1              0          2             2
  4         1              0          2             0
  5         2              6          4            18
  7         3             42          6            72
  8         2              0          4             0
  9         3             18          6            38
 11         5            570         10           660
 12         2              0          4            14
 13         6           1830         12          1962
 15         4             90          8           440
 16         4              0          8             0
 17         8         11 700         16        11 940
 19         9         28 398         18        28 704
 20         4             54          8           138
 21         6           1416         12          2192
 23        11        130 812         22       131 274
 24         4              0          8            86
 25        10         47 766         20        48 078
 27         9           8676         18          8858
 28         6            678         12           888
 32         8              0         16             0
 33        10         73 110         20        75 242
 36         6            354         12           710
 40         8           4398         16          4914
 44        10         30 030         20        30 660
 48         8              0         16           422
 60         8         14 274         16        16 340

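We remark that for $m = 8, 16, 24, 32, 48$ there is a prime ideal in $O_{K_m^+}$ of
norm 2. Modulo such a prime ideal every integer of the field is congruent to 0 or 1,
so $\varepsilon$ and $1 - \varepsilon$ cannot both be units. Hence these number fields $K_m^+$ cannot have exceptional units. This
implies that in the solutions of the equation (5.4.1) in Example 5.4.1 one of the
exponents $b_{1,4}$ and $b_{2,4}$ must be different from zero.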
For each d ∈ {2, . . . , 8}, r ∈ {2, . . . , d − 1}, Wildanger (2000) considered
the number fields of degree d and unit rank r having one of the five discriminants
with smallest absolute values, and computed for each of them all exceptional
units.
Finally we note that Wildanger’s method was implemented in KANT, see
Daberkow et al. (1997).
5.6 Supplement: LLL lattice basis reduction
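Let $n$ be an integer $\ge 2$. The standard inner product on $\mathbb{R}^n$ is defined by
\[
\langle \mathbf{a}, \mathbf{b} \rangle = \sum_{i=1}^{n} a_i b_i \quad \text{for } \mathbf{a} = (a_1, \ldots, a_n),\ \mathbf{b} = (b_1, \ldots, b_n) \in \mathbb{R}^n.
\]
We use $\|\cdot\|$ to denote the Euclidean norm on $\mathbb{R}^n$. Thus for $\mathbf{a} = (a_1, \ldots, a_n) \in \mathbb{R}^n$ we have
\[
\|\mathbf{a}\| = \langle \mathbf{a}, \mathbf{a} \rangle^{1/2} = \sqrt{a_1^2 + \cdots + a_n^2}.
\]
Let $L$ be a $t$-dimensional lattice in $\mathbb{R}^n$, i.e.,
\[
L := \{ z_1 \mathbf{a}_1 + \cdots + z_t \mathbf{a}_t : z_1, \ldots, z_t \in \mathbb{Z} \},
\]
where $\mathbf{a}_1, \ldots, \mathbf{a}_t$ are linearly independent vectors in $\mathbb{R}^n$. Then the determinant
$d(L)$ of $L$ is given by
\[
d(L) = \Bigl( \det \bigl( \langle \mathbf{a}_i, \mathbf{a}_j \rangle \bigr)_{1 \le i,j \le t} \Bigr)^{1/2}.
\]
If in particular $L$ is a full lattice in $\mathbb{R}^n$, i.e., with $t = n$, then
\[
d(L) = | \det(\mathbf{a}_1, \ldots, \mathbf{a}_n) |.
\]
The determinant of $L$ is independent of the choice of $\mathbf{a}_1, \ldots, \mathbf{a}_t$.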
In this section, by a basis of a lattice or vector space we mean an ordered
tuple of vectors a1 , . . . , at and not just a set {a1 , . . . , at }, since the outcome
of the LLL-algorithm depends on the order in which the vectors of the initial
basis are inserted.
Let $\mathbf{a}_1, \ldots, \mathbf{a}_t$ be a basis of a $t$-dimensional lattice $L$ in $\mathbb{R}^n$, where $1 \le t \le n$.
To define an LLL-reduced basis of $L$ we need an appropriate orthogonal
basis in the subspace of $\mathbb{R}^n$ spanned by $L$. By means of the Gram–Schmidt
orthogonalization process such an orthogonal basis $\mathbf{a}_1^*, \ldots, \mathbf{a}_t^*$ can be defined
inductively by
\[
\mathbf{a}_i^* = \mathbf{a}_i - \sum_{j=1}^{i-1} \mu_{ij} \mathbf{a}_j^*, \quad 1 \le i \le t, \tag{5.6.1}
\]
where
\[
\mu_{ij} = \langle \mathbf{a}_i, \mathbf{a}_j^* \rangle / \|\mathbf{a}_j^*\|^2, \quad 1 \le j < i \le t. \tag{5.6.2}
\]
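A. K. Lenstra, H. W. Lenstra and Lovász (1982) introduced the notion of
what is nowadays called an LLL-reduced basis of a lattice. A basis $\mathbf{a}_1, \ldots, \mathbf{a}_t$
of a lattice $L$ in $\mathbb{R}^n$ is called LLL-reduced if $\mathbf{a}_1, \ldots, \mathbf{a}_t$ and the vectors
$\mathbf{a}_1^*, \ldots, \mathbf{a}_t^*$ of the corresponding orthogonal basis satisfy
\[
|\mu_{ij}| \le \tfrac{1}{2}, \quad 1 \le j < i \le t
\]
and
\[
\| \mathbf{a}_i^* + \mu_{i,i-1} \mathbf{a}_{i-1}^* \|^2 \ge \tfrac{3}{4} \| \mathbf{a}_{i-1}^* \|^2, \quad 1 < i \le t. \tag{5.6.3}
\]
Clearly, (5.6.3) can be rewritten as
\[
\| \mathbf{a}_i^* \|^2 \ge \bigl( \tfrac{3}{4} - \mu_{i,i-1}^2 \bigr) \| \mathbf{a}_{i-1}^* \|^2.
\]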
Lenstra, Lenstra and Lovász proved that every lattice in Rn has such a basis.
Further, they developed a very practical algorithm, which from any given lattice
and any basis of this lattice computes an LLL-reduced basis of this lattice. (In
fact, Lenstra, Lenstra and Lovász formally stated their results only for full
lattices, but the generalization to arbitrary lattices is implicit in their proof; see
also Pohst (1993)).
LLL-reduced bases have several useful properties. In our book the inequality
(iv) below plays an important role in solving concrete Diophantine equations.
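Proposition 5.6.1 Let $\mathbf{a}_1, \ldots, \mathbf{a}_t$ be an LLL-reduced basis of a lattice $L$ in $\mathbb{R}^n$
with associated orthogonal basis $\mathbf{a}_1^*, \ldots, \mathbf{a}_t^*$ defined in (5.6.1). Then we have
(i) $\|\mathbf{a}_j\|^2 \le 2^{i-1} \|\mathbf{a}_i^*\|^2$ for $1 \le j \le i \le t$;
(ii) $d(L) \le \prod_{i=1}^{t} \|\mathbf{a}_i\| \le 2^{t(t-1)/4} d(L)$;
(iii) $\|\mathbf{a}_1\| \le 2^{(t-1)/4} d(L)^{1/t}$;
(iv) $\|\mathbf{a}_1\|^2 \le 2^{t-1} \|\mathbf{x}\|^2$ for every $\mathbf{x} \in L$, $\mathbf{x} \ne \mathbf{0}$;
(v) for $1 \le j \le s$, where $1 \le s \le t$,
\[
\|\mathbf{a}_j\|^2 \le 2^{t-1} \max \bigl( \|\mathbf{x}_1\|^2, \ldots, \|\mathbf{x}_s\|^2 \bigr)
\]
and $\mathbf{x}_1, \ldots, \mathbf{x}_s$ are linearly independent vectors of $L$.
Proof. See Lenstra, Lenstra and Lovász (1982) for $t = n$, and Pohst (1993) in
the case $t \le n$.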
Following Lenstra, Lenstra and Lovász (1982) and Pohst (1993), we now
briefly present the LLL-lattice basis reduction algorithm, that transforms a
given basis a1 , . . . , at of a given lattice L in Rn into an LLL-reduced one.
First the constants $\mu_{ij}$ and the orthogonal basis vectors $\mathbf{a}_i^*$ are calculated using
(5.6.1) and (5.6.2). Then an LLL-reduced basis can be constructed by induction
on the number of reduced basis vectors. The vectors $\mathbf{a}_1, \ldots, \mathbf{a}_t$ will be changed
several times. However, the $\mathbf{a}_i^*$ and $\mu_{ij}$ will be updated at each step so that
(5.6.1) and (5.6.2) remain valid.
Assume that for some $m$ with $2 \le m \le t + 1$, the vectors $\mathbf{a}_1, \ldots, \mathbf{a}_{m-1}$ are
already LLL-reduced, that is, form an LLL-reduced basis of the lattice generated
by them. In other words, we assume that
\[
|\mu_{ij}| \le \tfrac{1}{2} \quad \text{for } 1 \le j < i < m
\]
and
\[
\| \mathbf{a}_i^* + \mu_{i,i-1} \mathbf{a}_{i-1}^* \|^2 \ge \tfrac{3}{4} \| \mathbf{a}_{i-1}^* \|^2 \quad \text{for } 1 < i < m.
\]
These inequalities trivially hold if $m = 2$. For $m = t + 1$ the algorithm terminates because then the full basis $\mathbf{a}_1, \ldots, \mathbf{a}_t$ is reduced. Next consider the case
$m \le t$.
The major steps are as follows.
(a) Reduce $\mu_{m,m-1}$ to $|\mu_{m,m-1}| \le 1/2$, subtracting an appropriate multiple of
$\mathbf{a}_{m-1}$ from $\mathbf{a}_m$. After these changes all the vectors $\mathbf{a}_i^*$ remain unchanged.
(b) If (5.6.3) holds for $i = m$, one can proceed to (c). Otherwise interchange
$\mathbf{a}_{m-1}$ and $\mathbf{a}_m$ and, if $m > 2$, replace $m$ by $m - 1$. Then one can go on
with (a).
(c) Reduce $\mu_{mj}$ as in (a) to $|\mu_{mj}| \le 1/2$ for $j = m - 2, m - 3, \ldots, 1$. Then
take $m + 1$ in place of $m$. If $m > t$, the algorithm terminates, otherwise we
can go on with (a).
The vectors $\mathbf{a}_i^*$ are not used explicitly in the algorithm, only the squares of
their norms
\[
A_i := \| \mathbf{a}_i^* \|^2.
\]
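LLL-reduction algorithm (Pohst (1993)).
Input: A basis $\mathbf{a}_1, \ldots, \mathbf{a}_t$ of a $t$-dimensional lattice $L \subseteq \mathbb{R}^n$.
Output: A basis $\mathbf{a}_1, \ldots, \mathbf{a}_t$ of $L$ which is LLL-reduced.
(a) (Initialization)
For $i = 1, \ldots, t$ set:
\[
\mu_{ij} \leftarrow \langle \mathbf{a}_i, \mathbf{a}_j^* \rangle / A_j \quad (1 \le j \le i - 1), \qquad
\mathbf{a}_i^* \leftarrow \mathbf{a}_i - \sum_{j=1}^{i-1} \mu_{ij} \mathbf{a}_j^*, \qquad
A_i \leftarrow \langle \mathbf{a}_i, \mathbf{a}_i^* \rangle.
\]
Then set $m \leftarrow 2$.
(b) (Set $l$). Set $l \leftarrow m - 1$.
(c) (Change $\mu_{ml}$ in the case $|\mu_{ml}| > \tfrac{1}{2}$). If $|\mu_{ml}| > \tfrac{1}{2}$ set $r$ to the closest rational
integer to $\mu_{ml}$ and
\[
\mathbf{a}_m \leftarrow \mathbf{a}_m - r \mathbf{a}_l, \qquad
\mu_{mj} \leftarrow \mu_{mj} - r \mu_{lj} \quad (1 \le j \le l - 1), \qquad
\mu_{ml} \leftarrow \mu_{ml} - r.
\]
For $l = m - 1$ go to (d), else to (e).
(d) For $A_m < \bigl( \tfrac{3}{4} - \mu_{m,m-1}^2 \bigr) A_{m-1}$ go to (f).
(e) (Decrease $l$). Set $l \leftarrow l - 1$. For $l > 0$ go to (c). For $m = t$ terminate; else
set $m \leftarrow m + 1$ and go to (b).
(f) (Interchange $\mathbf{a}_{m-1}, \mathbf{a}_m$). Set $\mu \leftarrow \mu_{m,m-1}$, $A \leftarrow A_m + \mu^2 A_{m-1}$,
$\mu_{m,m-1} \leftarrow \mu A_{m-1}/A$, $A_m \leftarrow A_{m-1} A_m / A$, $A_{m-1} \leftarrow A$; then set for
$1 \le j \le m - 2$ and $m + 1 \le i \le t$
\[
\begin{pmatrix} \mathbf{a}_{m-1} \\ \mathbf{a}_m \end{pmatrix} \leftarrow \begin{pmatrix} \mathbf{a}_m \\ \mathbf{a}_{m-1} \end{pmatrix}, \qquad
\begin{pmatrix} \mu_{m-1,j} \\ \mu_{mj} \end{pmatrix} \leftarrow \begin{pmatrix} \mu_{mj} \\ \mu_{m-1,j} \end{pmatrix},
\]
\[
\begin{pmatrix} \mu_{i,m-1} \\ \mu_{im} \end{pmatrix} \leftarrow
\begin{pmatrix} 1 & \mu_{m,m-1} \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 0 & 1 \\ 1 & -\mu \end{pmatrix}
\begin{pmatrix} \mu_{i,m-1} \\ \mu_{im} \end{pmatrix}.
\]
For $m > 2$ decrease $m$ by 1. Then go to (b).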
As is proved in Lenstra, Lenstra and Lovász (1982), see also Pohst (1993), the
above algorithm always terminates. Further, it is shown that if $L$ is a sublattice
of $\mathbb{Z}^n$ of rank $n$ with basis $\mathbf{a}_1, \ldots, \mathbf{a}_n$ with $\|\mathbf{a}_i\|^2 \le A$ for $i = 1, \ldots, n$, where
$A \ge 2$, then the algorithm uses $O(n^4 \log A)$ arithmetic operations, while the
integers occurring in the algorithm have binary lengths $O(n \log A)$.
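The reduction procedure can be sketched in a few lines. The code below is a simplified textbook variant written for readability, not the algorithm of Pohst (1993) stated above: it recomputes the Gram–Schmidt data from scratch instead of updating $\mu_{ij}$ and $A_i$ in place, and it size-reduces the whole row before testing (5.6.3). It uses exact rational arithmetic, assumes the input vectors are linearly independent, and all function names are hypothetical.

```python
from fractions import Fraction

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def gram_schmidt(basis):
    """Return the orthogonal vectors a_i^* and the coefficients mu[i][j] of (5.6.1), (5.6.2)."""
    ortho = []
    mu = [[Fraction(0)] * len(basis) for _ in basis]
    for i, a in enumerate(basis):
        v = [Fraction(x) for x in a]
        for j in range(i):
            mu[i][j] = dot(a, ortho[j]) / dot(ortho[j], ortho[j])
            v = [vk - mu[i][j] * ok for vk, ok in zip(v, ortho[j])]
        ortho.append(v)
    return ortho, mu

def lll(basis, delta=Fraction(3, 4)):
    """Return an LLL-reduced basis of the lattice spanned by the given basis."""
    b = [[Fraction(x) for x in v] for v in basis]
    t, m = len(b), 1
    while m < t:
        # Size reduction of b[m] against b[m-1], ..., b[0].
        for l in range(m - 1, -1, -1):
            _, mu = gram_schmidt(b)
            r = round(mu[m][l])
            if r:
                b[m] = [x - r * y for x, y in zip(b[m], b[l])]
        ortho, mu = gram_schmidt(b)
        norms = [dot(v, v) for v in ortho]
        # Condition (5.6.3) in its rewritten form.
        if norms[m] >= (delta - mu[m][m - 1] ** 2) * norms[m - 1]:
            m += 1
        else:
            b[m - 1], b[m] = b[m], b[m - 1]
            m = max(m - 1, 1)
    return b

def is_lll_reduced(basis, delta=Fraction(3, 4)):
    """Check |mu_ij| <= 1/2 and condition (5.6.3) for every index."""
    ortho, mu = gram_schmidt(basis)
    norms = [dot(v, v) for v in ortho]
    size_ok = all(abs(mu[i][j]) <= Fraction(1, 2)
                  for i in range(len(basis)) for j in range(i))
    swap_ok = all(norms[i] >= (delta - mu[i][i - 1] ** 2) * norms[i - 1]
                  for i in range(1, len(basis)))
    return size_ok and swap_ok

reduced = lll([[1, 1, 1], [-1, 0, 2], [3, 5, 6]])
print([[int(x) for x in v] for v in reduced], is_lll_reduced(reduced))
```

Recomputing the orthogonalization at every step is wasteful; the in-place updates in steps (c) and (f) above are what make the operation count stated in the text possible.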
For more detailed treatments of the LLL-algorithm as well as for some
refinements, we refer to the books de Weger (1989), Pohst (1993), Cohen
(1993), Smart (1998) and Gaál (2002).
5.7 Notes
• In the inhomogeneous version of (5.2.1), the first reduction algorithm was
established in Baker and Davenport (1969). Generalizations of this algorithm
to the case of several variables were given in Pethő and Schulenberg (1987)
and de Weger (1989).
• We note that the enumeration algorithm presented in Section 5.3 can be
made even more efficient by combining it with some sieving procedure with
appropriate prime ideals; see e.g. Smart (1998).
• Exceptional units have several applications. An important application was
given by Lenstra (1977) who showed that if a number field $K$ contains a
“large” subset $\{\varepsilon_1, \ldots, \varepsilon_m\}$ of integers of $K$ such that $\varepsilon_i - \varepsilon_j$ is a unit for
each $i \ne j$ then (the ring of integers in) $K$ is Euclidean (with respect to the
norm). This was used by Lenstra and others, see, e.g., Lenstra (1977), Mestre
(1981), Leutbecher and Martinet (1982), Leutbecher (1985), Leutbecher and
Niklasch (1989), Houriet (2007) to obtain several hundreds of new examples
of Euclidean number fields.
• There is also a link between exceptional units, Lenstra’s result and the dynamics of iterated polynomial mappings; see Zieve (1996).
6
Unit equations in several unknowns
In the previous chapters we considered equations
\[
a_1 x_1 + a_2 x_2 = 1, \tag{6.1}
\]
where the unknowns $x_1, x_2$ are taken from the group of S-units, or more
generally from a finitely generated multiplicative group in a number field. We
proved effective finiteness results, which enable one to determine all solutions
at least in principle. In fact, in several cases there are even practical algorithms
to solve such equations. Our proofs are based on Baker-type inequalities for
linear forms in ordinary or p-adic logarithms of algebraic numbers.
In this chapter, we consider equations
\[
a_1 x_1 + \cdots + a_n x_n = 1 \tag{6.2}
\]
in an arbitrary number of unknowns $x_1, \ldots, x_n$, which again may be S-units
of a number field, or elements from a finitely generated multiplicative group.
It should be noticed that equations of the type (6.2) in $n > 2$ unknowns may
have infinitely many solutions. For instance, consider (6.2) with solutions taken
from an infinite multiplicative group $\Gamma$, and let $(x_1, \ldots, x_n)$ be a solution of
this equation, with $a_1 x_1 + \cdots + a_m x_m = 0$, say, where $2 \le m < n$. Then one
obtains an infinite family of solutions by taking $(u x_1, \ldots, u x_m, x_{m+1}, \ldots, x_n)$
with $u$ an arbitrary element of $\Gamma$. To obstruct such obvious constructions of
infinite families, we usually consider only non-degenerate solutions of (6.2),
i.e., with
\[
\sum_{i \in I} a_i x_i \ne 0 \quad \text{for each non-empty subset } I \text{ of } \{1, \ldots, n\}.
\]
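The notion of a non-degenerate solution can be made concrete with a small check. The sketch below tests every non-empty subsum with exact rational arithmetic; the example group, the units $\pm 2^k$ of $\mathbb{Z}[1/2]$, and all function names are illustrative assumptions and do not come from the text.

```python
from fractions import Fraction
from itertools import combinations

def is_nondegenerate(coeffs, xs):
    """True if no non-empty subsum of the terms a_i * x_i vanishes."""
    terms = [Fraction(a) * Fraction(x) for a, x in zip(coeffs, xs)]
    for size in range(1, len(terms) + 1):
        for subset in combinations(terms, size):
            if sum(subset) == 0:
                return False
    return True

def degenerate_family(k):
    """Solutions (u, -u, 1) of x1 + x2 + x3 = 1 with u = 2^k, one for every integer k."""
    u = Fraction(2) ** k
    return (u, -u, Fraction(1))

coeffs = (1, 1, 1)
print(is_nondegenerate(coeffs, (4, -2, -1)))           # 4 - 2 - 1 = 1, no vanishing subsum
print(is_nondegenerate(coeffs, degenerate_family(5)))  # x1 + x2 = 0, so degenerate
```

The second call shows the infinite family described above: every member solves the equation, but the subsum over $I = \{1, 2\}$ vanishes.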
We mention that for equations of type (6.2) in more than two unknowns, we
can prove only ineffective finiteness results, as the only available methods to
deal with such equations are ineffective. On the other hand, these methods make
Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:29:39, subject to the Cambridge
Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.008
it possible to give an explicit upper bound for the number of non-degenerate
solutions of equation (6.2). The first method, which is the one followed in this
book, is based on the p-adic Subspace Theorem of Schmidt and Schlickewei.
The second method, originating from ideas in Faltings (1991) and further
developed in Rémond (2002), is independent of the Subspace Theorem but
uses instead Faltings’ Product Theorem. We should mention here that the
second method has a wider applicability but that the first method based on
the Subspace Theorem leads to smaller upper bounds for the number of non-degenerate solutions of (6.2).
We give a quick overview of the results proved in this chapter. Our first
theorem is a so-called “semi-effective” result, which is a reformulation of a
result from Evertse (1984b). Let $K$ be an algebraic number field, and $S$ a finite
set of places of $K$, containing the infinite places. For a vector $\mathbf{x} = (x_0, \ldots, x_n) \in O_S^{n+1}$, define
\[
H_S(x_0, \ldots, x_n) := \prod_{v \in S} \max_i |x_i|_v, \qquad N_S(x_0 \cdots x_n) := \prod_{v \in S} |x_0 \cdots x_n|_v.
\]
Then for every $\varepsilon > 0$, and every $\mathbf{x} \in O_S^{n+1}$ with
\[
x_0 + \cdots + x_n = 0
\]
and $\sum_{i \in I} x_i \ne 0$ for each proper, non-empty subset $I$ of $\{0, \ldots, n\}$, we have
\[
H_S(x_0, \ldots, x_n) \ll_{K,S,n,\varepsilon} N_S(x_0 \cdots x_n)^{1+\varepsilon},
\]
where the implied constant depends only on $K, S, n, \varepsilon$. This constant is not
effectively computable from our method of proof. We deduce this result from
Theorem 3.1.3 (the p-adic Subspace Theorem).
A consequence of this result is that equation (6.2) has only finitely many
non-degenerate solutions in S-units $x_1, \ldots, x_n$.
More generally, we consider equation (6.2) as an equation with unknowns
from a multiplicative group $\Gamma$ of finite rank, contained in any field $K$ of characteristic 0. Taking as starting point Theorem 3.1.6 (a quantitative version of
the Parametric Subspace Theorem), we prove a result from Evertse, Schlickewei and Schmidt (2002), stating that equation (6.2) has only finitely many
non-degenerate solutions in $x_1, \ldots, x_n \in \Gamma$, whose number is bounded above
by $C(n)^{r+1}$, where $r = \mathrm{rank}\, \Gamma$, and $C(n)$ is an effectively computable number
depending only on $n$.
Next, we consider again equation (6.1), in unknowns $x_1, x_2 \in \Gamma$. We have
included a proof of the result of Beukers and Schlickewei (1996), implying that
for every pair of non-zero coefficients $a_1, a_2$, equation (6.1) has at most $C^{r+1}$
solutions, where $r = \mathrm{rank}\, \Gamma$, and $C$ is an effectively computable constant. We
should mention here that the approach of Evertse, Schlickewei and Schmidt
(2002) gives a similar result, but with a much larger constant $C$. Further, we
prove a result from Evertse, Győry, Stewart and Tijdeman (1988a), which states
in a precise way that for most pairs $(a_1, a_2)$, equation (6.1) has at most two
solutions.
We finish with some results concerning lower bounds for the number of
solutions of (6.1) and (6.2). In particular, we have included a result by Konyagin
and Soundararajan (2007) which implies that for every $\beta < 2 - \sqrt{2}$ there are
groups $\Gamma$ of arbitrarily large rank $r$ and $a_1, a_2$ such that (6.1) has at least $\exp(r^{\beta})$
solutions.
In Section 6.7, the Notes of this chapter, we give an overview of some
historical developments, and some related results.
The results presented in this chapter have applications in Chapter 9 to
decomposable form equations. Further, they will be applied in our book on
discriminant equations.
6.1 Results
6.1.1 A semi-effective result
Let $K$ be an algebraic number field, and $S$ a finite set of places of $K$ containing
all infinite places. We define the S-height of $\mathbf{x} = (x_0, \ldots, x_n) \in O_S^{n+1}$ by
\[
H_S(\mathbf{x}) = H_S(x_0, \ldots, x_n) := \prod_{v \in S} \max(|x_0|_v, \ldots, |x_n|_v),
\]
where the absolute value $|\cdot|_v$ is normalized as in Section 1.7. Recall that the
S-norm of $a \in O_S$ is defined by
\[
N_S(a) := \prod_{v \in S} |a|_v.
\]
Our first result is as follows.
Theorem 6.1.1 Let $\varepsilon > 0$, $n \ge 1$. There is a constant $C^{\mathrm{ineff}}(K, S, n, \varepsilon)$ depending only on $K, S, n, \varepsilon$ for which the following holds. For any non-zero
$x_0, x_1, \ldots, x_n \in O_S$ with
\[
x_0 + x_1 + \cdots + x_n = 0,
\]
\[
\sum_{i \in I} x_i \ne 0 \quad \text{for each proper, non-empty subset } I \text{ of } \{0, \ldots, n\}
\]
we have
\[
H_S(x_0, x_1, \ldots, x_n) \le C^{\mathrm{ineff}}(K, S, n, \varepsilon)\, N_S(x_0 \cdots x_n)^{1+\varepsilon}. \tag{6.1.1}
\]
This is in fact an equivalent formulation to Evertse (1984b), theorem 1. We
have indicated by means of the superscript “ineff” that the constant $C^{\mathrm{ineff}}$
is not effectively computable by means of our method of proof. We view
Theorem 6.1.1 as a “semi-effective result”, since it is effective in terms of
$N_S(x_0 \cdots x_n)$, but ineffective in terms of $n, K, S, \varepsilon$.
From Theorem 6.1.1 we deduce a finiteness result on the equation
\[
a_1 x_1 + \cdots + a_n x_n = 1 \quad \text{in } x_1, \ldots, x_n \in O_S^*, \tag{6.1.2}
\]
where $n \ge 2$ and $a_1, \ldots, a_n$ are non-zero elements of $K$. Recall that a solution
$(x_1, \ldots, x_n)$ of (6.1.2) is called non-degenerate if
\[
\sum_{i \in I} a_i x_i \ne 0 \quad \text{for each non-empty subset } I \text{ of } \{1, \ldots, n\}
\]
and degenerate otherwise. Theorem 6.1.1 implies the following finiteness
result.
Corollary 6.1.2 Equation (6.1.2) has only finitely many non-degenerate solutions in $x_1, \ldots, x_n \in O_S^*$.
This result was proved independently in Evertse (1984b) and van der Poorten
and Schlickewei (1982). It was announced in van der Poorten and Schlickewei
(1982) and then proved in Evertse and Győry (1988b) and later in van der
Poorten and Schlickewei (1991) that Corollary 6.1.2 is valid in the more general
situation as well when, in (6.1.2), $K$ is any field of characteristic 0, and $O_S^*$ is
replaced by any finitely generated multiplicative subgroup $\Gamma$ of $K^*$. Further,
in Evertse and Győry (1988b) it was shown that the number of non-degenerate
solutions can be estimated from above by a number depending only on $n$ and
$\Gamma$, but with the method of proof in that paper it is not possible to effectively
compute this number.
In Section 6.2 we deduce Theorem 6.1.1 from the p-adic Subspace Theorem
and then deduce from this Corollary 6.1.2. Here we follow Evertse (1984b).
6.1.2 Upper bounds for the number of solutions
In this subsection we consider a generalization of (6.1.2) and give an upper
bound for the number of its solutions. We say that a multiplicatively written
abelian group $\Gamma$ is of finite rank $r$ if $\Gamma$ has a free subgroup $\Gamma_0$ of rank $r$ such
that for every $x \in \Gamma$ there is a positive integer $m$ such that $x^m \in \Gamma_0$. We say
that $\Gamma$ is of rank 0 if every element of $\Gamma$ has finite order.
Let now $K$ be any field of characteristic 0, let $n \ge 2$, and denote by $(K^*)^n$
the $n$-fold direct product of the multiplicative group $K^*$ of $K$, endowed with
coordinatewise multiplication $(x_1, \ldots, x_n)(y_1, \ldots, y_n) = (x_1 y_1, \ldots, x_n y_n)$ and
exponentiation $(x_1, \ldots, x_n)^m = (x_1^m, \ldots, x_n^m)$. The following result was established in Evertse, Schlickewei and Schmidt (2002).
Theorem 6.1.3 Let $K$ be a field of characteristic 0, let $n \ge 2$, let $a_1, \ldots, a_n \in K^*$
and let $\Gamma$ be a subgroup of $(K^*)^n$ of finite rank $r$. Then the number of
non-degenerate solutions of
\[
a_1 x_1 + \cdots + a_n x_n = 1 \quad \text{in } (x_1, \ldots, x_n) \in \Gamma \tag{6.1.3}
\]
can be estimated from above by a quantity $A(n, r)$ depending on $n$ and $r$ only.
For $A(n, r)$ one may take $\exp((6n)^{3n}(r + 1))$.
The main ingredients of the proof of this result are a specialization argument, to
make a reduction to the case that $K$ is a number field and $\Gamma$ is finitely generated,
a version of the Quantitative Subspace Theorem (Evertse and Schlickewei
(2002)) and an estimate of Schmidt (1996) for the number of points of very
small height on an algebraic subvariety of a linear torus. This estimate of
Schmidt was recently improved substantially by Amoroso and Viada (2009).
By going through the proof of Evertse, Schlickewei and Schmidt, but replacing
Schmidt’s estimate by theirs, they obtained a stronger version of the above
Theorem 6.1.3 with
\[
A(n, r) = (8n)^{4n^4(n+r+1)}. \tag{6.1.4}
\]
We note that by a different approach, based on Faltings’ Product Theorem
instead of the Subspace Theorem, Rémond (2002) proved a general quantitative
result for subvarieties of tori (see Section 10.10), which gives as a special case
that equation (6.1.3) has at most $\exp(n^{4n^2}(r + 1))$ non-degenerate solutions.
If $n = 2$, then every solution is non-degenerate. In that case we have the
following sharper result, which was proved by Beukers and Schlickewei (1996).
Theorem 6.1.4 Let $K$ be a field of characteristic 0 and $\Gamma$ a subgroup of
$K^* \times K^*$ of finite rank $r$. Then the equation
\[
x_1 + x_2 = 1 \quad \text{in } (x_1, x_2) \in \Gamma \tag{6.1.5}
\]
has at most $2^{8(r+1)}$ solutions.
We immediately obtain the following corollary.
Corollary 6.1.5 Let $K$, $\Gamma$ be as in Theorem 6.1.4 and $a_1, a_2 \in K^*$. Then the
equation
\[
a_1 x_1 + a_2 x_2 = 1 \quad \text{in } (x_1, x_2) \in \Gamma \tag{6.1.6}
\]
has at most $2^{8(r+2)}$ solutions.
Proof. Apply Theorem 6.1.4 with, instead of $\Gamma$, the group generated by $\Gamma$ and $(a_1, a_2)$.
In most cases, the bound $2^{8(r+2)}$ in Corollary 6.1.5 can be improved. Let
$K$, $\Gamma$ be as in this corollary. Two pairs $(a_1,a_2), (b_1,b_2)\in K^*\times K^*$ are called
$\Gamma$-equivalent if there is $(u_1,u_2)\in\Gamma$ such that $(b_1,b_2) = (a_1,a_2)(u_1,u_2)$. Obviously, the number of solutions of (6.1.6) does not change if $(a_1,a_2)$ is replaced
by a $\Gamma$-equivalent pair. Then we have the following result, which was proved
by Evertse, Győry, Stewart and Tijdeman (1988a).
Theorem 6.1.6 Let $K$, $\Gamma$ be as in Theorem 6.1.4. There is a collection of
finitely many $\Gamma$-equivalence classes of pairs in $K^*\times K^*$, such that for every
pair $(a_1,a_2)\in K^*\times K^*$ outside the union of these classes, equation (6.1.6) has
at most two solutions. The number of these $\Gamma$-equivalence classes is bounded
above by a function $B(r)$ depending on the rank $r$ of $\Gamma$ only.
In fact, the method of proof gives
\[ B(r) = 12A(5,2r) + 24A(3,2r) + 60A(2,2r)^2, \tag{6.1.7} \]
where $A(n,r)$ is any upper bound depending only on $n$ and $r$ for the number
of non-degenerate solutions of (6.1.3). By using (6.1.4) we obtain $B(r) =
e^{20000(r+3)}$. For earlier bounds for $B(r)$, see Győry (1992b) and Bérczes (2000).
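As a sanity check on the arithmetic, the sketch below inserts the bound $A(n,r) = (8n)^{4n^4(n+r+1)}$ into $12A(5,2r) + 24A(3,2r) + 60A(2,2r)^2$ and compares the logarithm of the result with $20000(r+3)$. It works entirely with logarithms; the helper names are illustrative, not from the text.

```python
import math

def log_A(n, r):
    # log of (8n)^(4 n^4 (n+r+1)), the bound labelled (6.1.4)
    return 4 * n**4 * (n + r + 1) * math.log(8 * n)

def log_B(r):
    # log of 12 A(5,2r) + 24 A(3,2r) + 60 A(2,2r)^2, via log-sum-exp
    terms = [
        math.log(12) + log_A(5, 2 * r),
        math.log(24) + log_A(3, 2 * r),
        math.log(60) + 2 * log_A(2, 2 * r),
    ]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))

for r in (0, 1, 5):
    print(r, round(log_B(r), 1), 20000 * (r + 3))
```

The first term dominates: its logarithm is $5000(r+3)\log 40 + \log 12$, which stays below $20000(r+3)$.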
The bound 2 in Theorem 6.1.6 is optimal. For suppose that the set
$\Gamma^+ := \{(u_1,u_2)\in\Gamma : u_1\ne u_2\}$ is infinite. Then for any $(u_1,u_2)\in\Gamma^+$, the equation
\[ \frac{1-u_2}{u_1-u_2}\,x_1 + \frac{u_1-1}{u_1-u_2}\,x_2 = 1 \]
has two solutions in $\Gamma$, namely $(1,1)$ and $(u_1,u_2)$. But, by Corollary 6.1.5, the
$\Gamma$-equivalence class of such an equation can have only finitely many equations
with solution $(1,1)$. Hence if $(u_1,u_2)$ runs through $\Gamma^+$, then
$\bigl(\frac{1-u_2}{u_1-u_2}, \frac{u_1-1}{u_1-u_2}\bigr)$
runs through infinitely many $\Gamma$-equivalence classes.
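The two solutions in this example can be verified with exact rational arithmetic. The sketch below is only an illustration: the sample pairs are hypothetical and are taken from the group generated by $(2,3)$ in $\mathbb{Q}^*\times\mathbb{Q}^*$.

```python
from fractions import Fraction

def coefficients(u1, u2):
    """Coefficients (a1, a2) of the equation attached to (u1, u2), u1 != u2."""
    a1 = Fraction(1 - u2) / (u1 - u2)
    a2 = Fraction(u1 - 1) / (u1 - u2)
    return a1, a2

def is_solution(a1, a2, x1, x2):
    """True if a1*x1 + a2*x2 equals 1 exactly."""
    return a1 * x1 + a2 * x2 == 1

# Hypothetical sample pairs with u1 != u2.
samples = [(Fraction(2), Fraction(3)),
           (Fraction(4), Fraction(9)),
           (Fraction(1, 2), Fraction(1, 3))]
for u1, u2 in samples:
    a1, a2 = coefficients(u1, u2)
    print(a1, a2, is_solution(a1, a2, 1, 1), is_solution(a1, a2, u1, u2))
```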
In Section 6.3 we sketch a proof of the following: equation (6.1.3) has at
most $c(n)^{r+1}$ non-degenerate solutions, where $c(n)$ is an effectively computable
constant depending only on n. For a detailed proof of Theorem 6.1.3 we
refer to Evertse, Schlickewei and Schmidt (2002) (see also Rémond (2002)).
Theorem 6.1.4 is proved in Section 6.4. In Section 6.5 we deduce Theorem
6.1.6 from Theorem 6.1.3. Here we follow the proof of Evertse, Győry, Stewart
and Tijdeman (1988a) and Bérczes (2000). For further historical comments
related to the theorems in this subsection, we refer to Section 6.7.
6.1.3 Lower bounds
Erdős, Stewart and Tijdeman were the first to consider lower bounds for the
number of solutions of S-unit equations. For a set of distinct primes $S =
\{p_1,\ldots,p_t\}$, denote by $N(S)$ the number of solutions of
\[ x_1 + x_2 = 1 \quad\text{in } x_1, x_2\in\{\pm p_1^{z_1}\cdots p_t^{z_t} : z_1,\ldots,z_t\in\mathbb{Z}\}. \tag{6.1.8} \]
Then, in Erdős, Stewart and Tijdeman (1988), it was shown that for every $\varepsilon > 0$
and every sufficiently large $t$, there is a set of primes $S$ of cardinality $t$ such
that
\[ N(S) \ge \exp\bigl((4-\varepsilon)(t/\log t)^{1/2}\bigr). \]
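For very small sets $S$ the quantity $N(S)$ can be explored by brute force. The sketch below enumerates the numbers $\pm p_1^{z_1}\cdots p_t^{z_t}$ with all $|z_i|$ at most a chosen bound and counts pairs with $x_1 + x_2 = 1$. The exponent bound is an assumption of this illustration, so in general the count is only a lower bound for $N(S)$; for $S=\{2,3\}$ the classical value is 21, and all of these solutions already occur with exponents of absolute value at most 3.

```python
from fractions import Fraction
from itertools import product

def s_units(primes, bound):
    """All numbers ±p1^z1···pt^zt with |z_i| <= bound, as exact fractions."""
    units = set()
    for exps in product(range(-bound, bound + 1), repeat=len(primes)):
        u = Fraction(1)
        for p, z in zip(primes, exps):
            u *= Fraction(p) ** z
        units.update((u, -u))
    return units

def count_solutions(primes, bound):
    """Number of x1 in the truncated unit set with 1 - x1 also in the set."""
    units = s_units(primes, bound)
    return sum(1 for x in units if (1 - x) in units)

print(count_solutions([2], 5))
print(count_solutions([2, 3], 4))
```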
Recall that Theorem 6.1.4 implies $N(S)\le C^{t+1}$ with $C$ a constant $>1$.
Stewart conjectured that there are absolute constants $c_1, c_2 > 1$, such that
for every $t > 0$ and every set of primes $S$ of cardinality $t$ we have $N(S)\le c_1^{t^{2/3}}$,
while conversely, for arbitrarily large $t$ there is a set of primes $S$ of cardinality
$t$ such that $N(S)\ge c_2^{t^{2/3}}$.
Konyagin and Soundararajan (2007) obtained the following result, which is
a small further step towards Stewart’s conjecture.
Theorem 6.1.7 For every $\beta < 2-\sqrt{2} = 0.586\ldots$, there are sets of primes
$S$ of arbitrarily large cardinality $t$, such that $N(S)\ge\exp(t^{\beta})$.
In Section 6.6 we have included the ingenious proof of Konyagin and
Soundararajan.
In their paper mentioned above, Konyagin and Soundararajan proved also
that for every sufficiently large t there are distinct primes p1 , . . . , pt such that
the equation
\[ x - y = 1 \]
has at least $\exp(t^{1/16})$ solutions in positive integers $x, y$ composed of $p_1,\ldots,p_t$.
We omit the proof of this result, which is based on much deeper analytic number
theory.
There are also results on lower bounds for the number of solutions of S-unit
equations in an arbitrary number of unknowns. Let again S = {p1 , . . . , pt }
be a finite set of primes, let $n\ge 2$, and denote by $N(n,S)$ the number of
non-degenerate solutions to
\[ x_1 + \cdots + x_n = 1 \quad\text{in } x_1,\ldots,x_n\in\{\pm p_1^{z_1}\cdots p_t^{z_t} : z_1,\ldots,z_t\in\mathbb{Z}\}. \tag{6.1.9} \]
In Evertse, Moree, Stewart and Tijdeman (2003), the authors proved that for
every $n\ge 2$, $\varepsilon > 0$, and every sufficiently large $t$ there is a set of primes $S$ of
cardinality $t$ such that
\[ N(n,S) \ge \exp\Bigl((1-\varepsilon)\,\frac{n^2}{n-1}\,t^{1-1/n}(\log t)^{-1/n}\Bigr). \]
This is a slight improvement of an unpublished result by Granville. It would be
of interest to improve this further, by extending the approach of Konyagin and
Soundararajan.
We introduce another quantity, which more or less measures how much algebraic structure there is in the set of solutions of (6.1.9). Denote by g(n, S) the
smallest integer g such that there is a non-zero polynomial P ∈ C[X1 , . . . , Xn ]
of total degree g, not divisible by X1 + · · · + Xn − 1, such that
P (x1 , . . . , xn ) = 0
for every solution (x1 , . . . , xn ) of (6.1.9).
In other words, the set of solutions of (6.1.9) cannot be contained in a hypersurface of $\mathbb{C}^n$ of degree $< g(n,S)$.
It is not hard to show that
\[ g(n,S) \le 2^{n-1} - n + (n-1)N(n,S)^{1/(n-1)}. \tag{6.1.10} \]
Indeed, let $N := N(n,S)$ and let $g$ be the smallest integer such that
$\binom{n+g-1}{n-1} > N$. Then there is a non-zero polynomial $P_1\in\mathbb{C}[X_1,\ldots,X_{n-1}]$
of total degree $\le g$ such that $P_1(x_1,\ldots,x_{n-1}) = 0$ for each non-degenerate
solution $(x_1,\ldots,x_n)$ of (6.1.9). This can be seen by viewing the relations
$P_1(x_1,\ldots,x_{n-1}) = 0$ as linear equations in the coefficients of $P_1$. Thus, we
obtain a system of $N$ linear equations in $\binom{n+g-1}{n-1}$ unknowns and by our choice
of $g$ it has a non-trivial solution. Our choice of $g$ implies
\[ \Bigl(\frac{g}{n-1}\Bigr)^{n-1} \le \binom{n+g-2}{n-1} \le N, \]
hence $g\le (n-1)N^{1/(n-1)}$. Now let $P$ be the product of $P_1$ and of all polynomials
$\sum_{i\in I}X_i$ with $I$ a subset of $\{1,\ldots,n-1\}$ of cardinality at least 2.
Then $P$ has total degree $g + 2^{n-1} - n \le 2^{n-1} - n + (n-1)N^{1/(n-1)}$, $P$ is not
divisible by $X_1 + \cdots + X_n - 1$ since it depends only on $X_1,\ldots,X_{n-1}$, and
every solution of (6.1.9), degenerate or non-degenerate, is a zero of $P$. This
implies (6.1.10).
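The counting step in this argument, namely that the smallest $g$ with $\binom{n+g-1}{n-1} > N$ satisfies $g \le (n-1)N^{1/(n-1)}$, can be checked numerically. The sketch below does so in exact integer arithmetic for a small, arbitrarily chosen range of $n$ and $N$; the function names are illustrative.

```python
from math import comb

def smallest_g(n, N):
    """Smallest g >= 0 with C(n+g-1, n-1) > N."""
    g = 0
    while comb(n + g - 1, n - 1) <= N:
        g += 1
    return g

def bound_holds(n, N):
    """Check g <= (n-1) N^(1/(n-1)) in the equivalent integer form."""
    g = smallest_g(n, N)
    return g ** (n - 1) <= (n - 1) ** (n - 1) * N

for n, N in [(2, 10), (3, 10), (4, 100)]:
    print(n, N, smallest_g(n, N), bound_holds(n, N))
```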
Again in Evertse, Moree, Stewart and Tijdeman (2003), it was shown that
for every $n\ge 2$, $\varepsilon > 0$ and every sufficiently large $t$ there is a set of primes $S$
of cardinality $t$ such that
\[ g(n,S) \ge \exp\bigl((4-\varepsilon)(t/\log t)^{1/2}\bigr). \]
Using the above theorem of Konyagin and Soundararajan we improve this as
follows.
Theorem 6.1.8 For every $n\ge 2$, $\beta < 2-\sqrt{2}$, there are sets of primes $S$ of
arbitrarily large cardinality $t$, such that $g(n,S)\ge\exp(t^{\beta})$.
The proof of this result is given in Section 6.6.
6.2 Proofs of Theorem 6.1.1 and Corollary 6.1.2
We take as starting point Theorem 3.1.3. As before, K is an algebraic number
field, S a finite set of places of K containing all infinite places and n an integer
with n ≥ 2.
We prove the following result, which is in fact equivalent to Theorem 6.1.1.
Proposition 6.2.1 Let $T$ be a subset of $S$ and $\varepsilon > 0$. There is a constant
$C^{\mathrm{ineff}}(K,S,n,\varepsilon) > 0$ such that for all vectors $\mathbf{x} = (x_1,\ldots,x_n)\in O_S^n$ satisfying
\[ \sum_{i\in I} x_i \ne 0 \quad\text{for each non-empty subset $I$ of } \{1,\ldots,n\} \tag{6.2.1} \]
we have
\[ \prod_{v\in S}\prod_{i=1}^n |x_i|_v \prod_{v\in T}|x_1+\cdots+x_n|_v \ge C^{\mathrm{ineff}}(K,S,n,\varepsilon)\Bigl(\prod_{v\in T}\max(|x_1|_v,\ldots,|x_n|_v)\Bigr) H_S(\mathbf{x})^{-\varepsilon}. \tag{6.2.2} \]
Since $|x_1+\cdots+x_n|_v \ll \max(|x_1|_v,\ldots,|x_n|_v)$ for $v\in M_K$, the special
case of Proposition 6.2.1 with $T = S$ implies the general case of arbitrary
subsets $T$ of $S$. On the other hand, Proposition 6.2.1 with $T = S$ is a reformulation of Theorem 6.1.1. Indeed, writing $x_0 := -(x_1+\cdots+x_n)$, we see that
(6.2.2) with $T = S$ can be rewritten as
\[ N_S(x_0\cdots x_n) \gg H_S(x_0,\ldots,x_n)^{1-\varepsilon} \]
(with implied constant depending on $K, S, n, \varepsilon$), which in turn is equivalent to
inequality (6.1.1) in Theorem 6.1.1.
A weaker version of Proposition 6.2.1 was proved earlier by van der Poorten
and Schlickewei (unpublished).
The proof of Proposition 6.2.1 is by induction on n. For n = 1 the assertion
is trivially true. Assume Proposition 6.2.1 is true for vectors with fewer than
n coordinates, where n ≥ 2. We proceed to prove (6.2.2) under this induction
hypothesis. Henceforth we restrict ourselves to vectors $\mathbf{x} = (x_1,\ldots,x_n)\in O_S^n$
satisfying (6.2.1) and
\[ \prod_{v\in S}\prod_{i=1}^n |x_i|_v \prod_{v\in T}|x_1+\cdots+x_n|_v \le \prod_{v\in T}\max(|x_1|_v,\ldots,|x_n|_v)\, H_S(\mathbf{x})^{-\varepsilon}. \tag{6.2.3} \]
This is obviously no loss of generality.
We start with a lemma.
Lemma 6.2.2 The set of solutions $\mathbf{x}\in O_S^n\setminus\{\mathbf{0}\}$ of (6.2.3) is contained in a
union of finitely many proper linear subspaces of $K^n$.
Proof. Let $\mathbf{x} = (x_1,\ldots,x_n)$ be a solution of (6.2.3). Define the linear form
$X_0 := -(X_1+\cdots+X_n)$ and put $x_0 := -(x_1+\cdots+x_n)$. For $v\in S\setminus T$, let
\[ L_{1v} = X_1,\ \ldots,\ L_{nv} = X_n. \]
For $v\in T$, let $i_v\in\{0,\ldots,n\}$ with $|x_{i_v}|_v = \max(|x_0|_v,\ldots,|x_n|_v)$, and let
$L_{1v},\ldots,L_{nv}$ be the linear forms from $\{X_0,\ldots,X_n\}\setminus\{X_{i_v}\}$ in some order.
Then
\[ |x_{i_v}|_v \gg \max(|x_1|_v,\ldots,|x_n|_v) \quad\text{for } v\in T, \]
so (6.2.3) implies
\[ \prod_{v\in S}|L_{1v}(\mathbf{x})\cdots L_{nv}(\mathbf{x})|_v \ll H_S(\mathbf{x})^{-\varepsilon}. \]
By Theorem 3.1.3, the solutions $\mathbf{x}\in O_S^n\setminus\{\mathbf{0}\}$ of the latter inequality, and hence
the solutions of (6.2.3) with $|x_{i_v}|_v = \max(|x_0|_v,\ldots,|x_n|_v)$ for $v\in T$, lie in a
union of finitely many proper linear subspaces of $K^n$. By applying this to all
tuples $(i_v : v\in T)$, Lemma 6.2.2 follows.
Let T1 , . . . , Tt be the subspaces from Lemma 6.2.2 and let T ∈ {T1 , . . . , Tt }.
Then T can be given by an equation
\[ x_1+\cdots+x_n = \beta_{i_1}x_{i_1}+\cdots+\beta_{i_m}x_{i_m}, \tag{6.2.4} \]
where $\beta_{i_1},\ldots,\beta_{i_m}\in K$ and $m < n$. Let $\mathbf{x}\in O_S^n$ be a vector with (6.2.1), (6.2.3)
and with $\mathbf{x}\in T$. So $\mathbf{x}$ satisfies (6.2.4). Let $J$ be a minimal subset of $\{i_1,\ldots,i_m\}$
such that
j ∈J
βj xj = 0. By re-indexing, we may assume that
\[ \left.\begin{aligned} &x_1+\cdots+x_n = \sum_{i=1}^{u}\beta_i x_i, \quad u < n,\\ &\sum_{i\in I}\beta_i x_i \ne 0 \ \text{ for each non-empty subset $I$ of } \{1,\ldots,u\}. \end{aligned}\right\} \tag{6.2.5} \]
We now consider vectors $\mathbf{x}\in O_S^n$ with (6.2.1), (6.2.3), (6.2.5) and show
that these satisfy (6.2.2) with an appropriate constant $C^{\mathrm{ineff}}$. Below, constants
implied by the Vinogradov symbols $\ll$, $\gg$ will be ineffective, and will depend
only on $K, S, n, \varepsilon, \beta_1,\ldots,\beta_u$. But notice that $\beta_1,\ldots,\beta_u$ in turn depend only
on $K, S, n, \varepsilon$, as they come from Lemma 6.2.2. So the constants implied
by $\ll$, $\gg$ ultimately depend only on $K, S, n, \varepsilon$. Choose $\delta\in O_S\setminus\{0\}$ such that
$\delta\beta_i\in O_S$ for $i = 1,\ldots,u$, and for a solution $\mathbf{x}\in O_S^n$ of (6.2.1), (6.2.3) and
(6.2.5), write $z_i := \delta\beta_i x_i$ $(i = 1,\ldots,u)$, $\mathbf{z} = (z_1,\ldots,z_u)$.
Let $\mathbf{x} = (x_1,\ldots,x_n)\in O_S^n$ be a vector with (6.2.1), (6.2.3) and (6.2.5). Then
$|x_i|_v \gg |z_i|_v$ for $v\in S$, $i = 1,\ldots,u$, and $|x_1+\cdots+x_n|_v \gg |z_1+\cdots+z_u|_v$
for $v\in T$. Hence
\[ \prod_{v\in S}\prod_{i=1}^n |x_i|_v \prod_{v\in T}|x_1+\cdots+x_n|_v \gg \prod_{v\in S}\prod_{i=u+1}^n |x_i|_v \cdot \prod_{v\in S}\prod_{i=1}^u |z_i|_v \prod_{v\in T}|z_1+\cdots+z_u|_v. \]
Now a first application of the induction hypothesis gives
\[ \begin{aligned} \prod_{v\in S}\prod_{i=1}^n |x_i|_v \prod_{v\in T}|x_1+\cdots+x_n|_v &\gg \prod_{v\in S}\prod_{i=u+1}^n |x_i|_v \prod_{v\in T}\max(|z_1|_v,\ldots,|z_u|_v)\, H_S(\mathbf{z})^{-\varepsilon/2}\\ &\gg \prod_{v\in S}\prod_{i=u+1}^n |x_i|_v \prod_{v\in T}\max(|x_1|_v,\ldots,|x_u|_v)\, H_S(\mathbf{x})^{-\varepsilon/2}. \end{aligned} \]
Let
\[ T_1 = \{v\in T : \max(|x_1|_v,\ldots,|x_n|_v) = \max(|x_1|_v,\ldots,|x_u|_v)\}, \qquad T_2 = T\setminus T_1. \]
If $T_1 = T$, we obtain at once (6.2.2), since $\prod_{v\in S}\prod_{i=u+1}^n |x_i|_v \ge 1$. Suppose
$T_1\subsetneq T$. Then
\[ \begin{aligned} \max(|x_1|_v,\ldots,|x_u|_v) &= \max(|x_1|_v,\ldots,|x_n|_v) && \text{for } v\in T_1,\\ \max(|x_1|_v,\ldots,|x_u|_v) &\gg |(\beta_1-1)x_1+\cdots+(\beta_u-1)x_u|_v = |x_{u+1}+\cdots+x_n|_v && \text{for } v\in T_2. \end{aligned} \]
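Now a second application of the induction hypothesis yields
\[ \begin{aligned} \prod_{v\in S}\prod_{i=1}^n |x_i|_v \prod_{v\in T}|x_1+\cdots+x_n|_v &\gg \prod_{v\in S}\prod_{i=u+1}^n |x_i|_v \prod_{v\in T_2}|x_{u+1}+\cdots+x_n|_v \times \prod_{v\in T_1}\max(|x_1|_v,\ldots,|x_n|_v)\, H_S(\mathbf{x})^{-\varepsilon/2}\\ &\gg \prod_{v\in T_2}\max(|x_{u+1}|_v,\ldots,|x_n|_v)\, H_S(x_{u+1},\ldots,x_n)^{-\varepsilon/2} \times \prod_{v\in T_1}\max(|x_1|_v,\ldots,|x_n|_v)\, H_S(\mathbf{x})^{-\varepsilon/2}\\ &\gg \prod_{v\in T}\max(|x_1|_v,\ldots,|x_n|_v)\, H_S(\mathbf{x})^{-\varepsilon}, \end{aligned} \]
as required. This proves Proposition 6.2.1, hence Theorem 6.1.1.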
Proof of Corollary 6.1.2. Let $T$ be the smallest set of places such that $S\subseteq T$
and $a_1,\ldots,a_n$ are $T$-units. Then for every non-degenerate solution $\mathbf{x}\in (O_S^*)^n$
of (6.1.2), we have
\[ H(a_i x_i) \le \Bigl(\prod_{v\in T}\max(1, |a_1x_1|_v,\ldots,|a_nx_n|_v)\Bigr)^{1/[K:\mathbb{Q}]} = H_T(1, a_1x_1,\ldots,a_nx_n)^{1/[K:\mathbb{Q}]} \le C^{\mathrm{ineff}} \]
for some ineffective constant $C^{\mathrm{ineff}}$. This leaves only finitely many possibilities
for $a_i x_i$ for $i = 1,\ldots,n$. This implies Corollary 6.1.2 at once.
6.3 A sketch of the proof of Theorem 6.1.3
We outline a proof of the following result.
Theorem 6.3.1 Let $K$ be a field of characteristic 0, $n\ge 2$, $\Gamma$ a subgroup of
$(K^*)^n$ of finite rank $r$, and $a_1,\ldots,a_n\in K^*$. Then the equation
\[ a_1x_1+\cdots+a_nx_n = 1 \quad\text{in } (x_1,\ldots,x_n)\in\Gamma \tag{6.1.3} \]
has at most $c_1(n)^{r+1}$ non-degenerate solutions, where $c_1(n)$ is an effectively
computable number depending only on $n$.
Constants $c_2(n), c_3(n),\ldots$ introduced below will also be effectively computable and depend only on $n$.
6.3.1 A reduction
We reduce Theorem 6.3.1 to the following apparently weaker result.
Theorem 6.3.2 Let $n\ge 2$ and let $\Gamma$ be a finitely generated subgroup of $(\overline{\mathbb{Q}}^*)^n$
of rank $r$. Then the set of solutions of
\[ x_1+\cdots+x_n = 1 \quad\text{in } (x_1,\ldots,x_n)\in\Gamma \tag{6.3.1} \]
is contained in a union of at most $c_2(n)^{r+1}$ proper linear subspaces of $\overline{\mathbb{Q}}^n$.
In the proof of Theorem 6.3.1, we have to make a reduction from the
case that $\Gamma$ is contained in $(K^*)^n$ for an arbitrary field $K$ of characteristic 0 to the
case $\Gamma\subset(\overline{\mathbb{Q}}^*)^n$, and for this, we need the following specialization result from
algebraic geometry.
Lemma 6.3.3 Let $K$ be a field of characteristic 0 with $K\supset\mathbb{Q}$, and $u_1,\ldots,u_m$
$(m\ge 1)$ non-zero elements of $K$. Then there exists a ring homomorphism
$\varphi : \mathbb{Q}[u_1,\ldots,u_m]\to\overline{\mathbb{Q}}$, leaving $\mathbb{Q}$ invariant.
Proof. Define the ideal
\[ I := \{f\in\mathbb{Q}[X_1,\ldots,X_m] : f(u_1,\ldots,u_m) = 0\} \]
and let
\[ Z(I) := \{\mathbf{x} = (x_1,\ldots,x_m)\in\overline{\mathbb{Q}}^m : f(\mathbf{x}) = 0 \text{ for all } f\in I\}. \]
Obviously, $I\ne(1)$, so by the Weak Nullstellensatz (see, e.g., Harris (1992),
Theorem 5.17), the set $Z(I)$ is not empty. Choose $\mathbf{c} = (c_1,\ldots,c_m)\in Z(I)$.
Then there is a well-defined ring homomorphism $\varphi : \mathbb{Q}[u_1,\ldots,u_m]\to\overline{\mathbb{Q}}$,
mapping $u_i$ to $c_i$ for $i = 1,\ldots,m$, and mapping the elements of $\mathbb{Q}$ to themselves.
Proof of Theorem 6.3.1. First suppose that $\Gamma$ is a finitely generated subgroup
of $(\overline{\mathbb{Q}}^*)^n$ of rank $r$, and that $a_1,\ldots,a_n\in\overline{\mathbb{Q}}^*$. By applying Theorem 6.3.2 to
the group generated by $\Gamma$ and $(a_1,\ldots,a_n)$, we infer that the set of solutions
of (6.1.3) is contained in a union of at most $c_2(n)^{r+2}$ proper linear subspaces
of $\overline{\mathbb{Q}}^n$.
By induction on $n$, it now follows that (6.1.3) has at most $c_1(n)^{r+1}$ non-degenerate solutions. We give the argument. For $n = 2$ Theorem 6.3.1 is
obviously true. Let $n\ge 3$ and assume Theorem 6.3.1 is true for equations
in fewer than $n$ unknowns. Consider the solutions of (6.1.3) lying in a
fixed proper linear subspace of $\overline{\mathbb{Q}}^n$, given by a non-trivial linear equation
$\beta_1x_1+\cdots+\beta_nx_n = 0$, say. By combining this with (6.1.3) we can eliminate
one of the unknowns, and obtain an equation $\sum_{i\in I}\gamma_i x_i = 1$, where $I$ is a proper
subset of $\{1,\ldots,n\}$. For each subset $J$ of $I$, consider the solutions of the latter
equation such that $\sum_{i\in J}\gamma_i x_i = 1$ but $\sum_{i\in J'}\gamma_i x_i \ne 0$ for each non-empty
subset $J'$ of $J$. Assuming $J$ has cardinality $m$, by the induction hypothesis the
latter equation has at most $c_1(m)^{r+1}$ solutions $(x_i : i\in J)$. Substituting a tuple
$(x_i : i\in J)$ in (6.1.3) we obtain $\sum_{i\in J^c} a_i x_i = b$, where $J^c := \{1,\ldots,n\}\setminus J$
and $b := 1 - \sum_{i\in J} a_i x_i$, and $b$, as well as each proper subsum of the left-hand
side, are non-zero. By applying again the induction hypothesis, we see that there
are at most $c_1(n-m)^{r+1}$ possibilities for the remaining tuple $(x_i : i\in J^c)$. So
for given $\beta_1,\ldots,\beta_n$ and $J$, we have at most $(c_1(m)c_1(n-m))^{r+1}$ solutions
$(x_1,\ldots,x_n)$. By summing over $(\beta_1,\ldots,\beta_n)$ (the number of which is bounded
by $c_2(n)^{r+2}$) and over all $J$, we obtain an upper bound $c_1(n)^{r+1}$ for the total number of non-degenerate solutions of (6.1.3). This completes the induction step.
We now consider the general case that $\Gamma$ is a subgroup of $(K^*)^n$ of finite
rank $r$, and that $a_1,\ldots,a_n\in K^*$, where $K$ is any field of characteristic 0. We
assume without loss of generality that $\mathbb{Q}$ is contained in $K$.
Assume that (6.1.3) has $M > c_1(n)^{r+1}$ non-degenerate solutions,
$\mathbf{x}_1,\ldots,\mathbf{x}_M$, say, where $\mathbf{x}_i = (x_{i1},\ldots,x_{in})$ for $i = 1,\ldots,M$. We apply
Lemma 6.3.3 with the set $\{u_1,\ldots,u_m\}$ consisting of $a_1,\ldots,a_n$, $x_{11},\ldots,x_{Mn}$,
the subsums $\sum_{j\in I} a_j x_{ij}$ ($i = 1,\ldots,M$, $I\subseteq\{1,\ldots,n\}$ non-empty), the non-zero numbers
among $x_{i_1 j} - x_{i_2 j}$ ($1\le i_1 < i_2\le M$, $j = 1,\ldots,n$), and also the multiplicative
inverses of all of these numbers. Thus, the images under $\varphi$ of $u_1,\ldots,u_m$ are all
non-zero. Put $a_j' := \varphi(a_j)$, $x_{ij}' := \varphi(x_{ij})$ ($i = 1,\ldots,M$, $j = 1,\ldots,n$). Then
$a_j'\ne 0$ for $j = 1,\ldots,n$ and $\mathbf{x}_i' = (x_{i1}',\ldots,x_{in}')$ ($i = 1,\ldots,M$) are distinct,
non-degenerate solutions of
\[ a_1'x_1+\cdots+a_n'x_n = 1. \tag{6.3.2} \]
Let $\Gamma'$ be the group generated by $\mathbf{x}_1',\ldots,\mathbf{x}_M'$. Then $\Gamma'$ is a finitely generated
subgroup of $(\overline{\mathbb{Q}}^*)^n$, and has rank at most $r$ since it is a homomorphic image of
a subgroup of $\Gamma$. So (6.3.2) has at least $M > c_1(n)^{r+1}$ non-degenerate solutions
in $\Gamma'$, contrary to what has been established above. This shows that in the general
case, (6.1.3) cannot have more than $c_1(n)^{r+1}$ non-degenerate solutions.
The remainder of this section is devoted to the proof of Theorem 6.3.2.
6.3.2 Notation
Assume henceforth that $\Gamma$ is a finitely generated subgroup of $(\overline{\mathbb{Q}}^*)^n$ of rank $r$.
Let $K$ be an algebraic number field such that $\Gamma\subset(K^*)^n$, and let $S$ be a finite
set of places of $K$, containing all infinite places, such that $\Gamma\subseteq(O_S^*)^n$. Put
\[ d := [K:\mathbb{Q}], \qquad s := |S|. \]
For $\mathbf{x} = (x_1,\ldots,x_n)\in K^n$ define the heights
\[ h(\mathbf{x}) := \frac{1}{d}\sum_{v\in M_K}\max(0, \log|x_1|_v,\ldots,\log|x_n|_v), \]
\[ h^{+}(\mathbf{x}) := \sum_{i=1}^n h(x_i) = \frac{1}{d}\sum_{i=1}^n\sum_{v\in M_K}\max(0, \log|x_i|_v), \]
where $h(x)$ denotes the absolute logarithmic height of an algebraic number $x$.
These heights can be extended in the usual manner to $(\overline{\mathbb{Q}}^*)^n$ by picking any
number field $K$ containing $x_1,\ldots,x_n$ and applying the above definitions. It is
straightforward to show that for $\mathbf{x} = (x_1,\ldots,x_n)\in(\overline{\mathbb{Q}}^*)^n$ we have
\[ \frac{1}{n}\,h^{+}(\mathbf{x}) \le h(\mathbf{x}) \le h^{+}(\mathbf{x}). \tag{6.3.3} \]
Further, for $\mathbf{x} = (x_1,\ldots,x_n)\in\Gamma$ we have
\[ \left.\begin{aligned} h(\mathbf{x}) &= \frac{1}{d}\sum_{v\in S}\max(0, \log|x_1|_v,\ldots,\log|x_n|_v),\\ h^{+}(\mathbf{x}) &= \frac{1}{d}\sum_{v\in S}\sum_{i=1}^n\max(0, \log|x_i|_v) = \frac{1}{2d}\sum_{v\in S}\sum_{i=1}^n\bigl|\log|x_i|_v\bigr|, \end{aligned}\right\} \tag{6.3.4} \]
where the last equality is a consequence of the Product Formula.
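The last equality rests on the elementary fact that if real numbers sum to zero, then the sum of their positive parts equals half the sum of their absolute values. The sketch below illustrates this for $K=\mathbb{Q}$ (so $d=1$) and a rational $S$-unit, with $S$ consisting of the archimedean place and a few primes. The function name and the sample values are assumptions of this illustration.

```python
import math

def log_abs_values(num, den, primes):
    """Return log|x|_v for x = num/den at v = infinity and at each prime in primes."""
    def vp(m, p):
        # exponent of p in the non-zero integer m
        k = 0
        while m % p == 0:
            m //= p
            k += 1
        return k
    values = [math.log(abs(num / den))]
    for p in primes:
        values.append(-(vp(num, p) - vp(den, p)) * math.log(p))
    return values

vals = log_abs_values(8, 9, [2, 3])          # x = 8/9 is a {2,3}-unit
positive_part = sum(max(0.0, a) for a in vals)
half_abs = 0.5 * sum(abs(a) for a in vals)
print(sum(vals), positive_part, half_abs)
```

For $x = 8/9$ both quantities equal $\log 9$, the logarithmic height of $x$.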
6.3.3 Covering results
We treat this in more detail since we can simplify the argument in Evertse,
Schlickewei and Schmidt (2002).
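Lemma 6.3.4 Let $V$ be an $r$-dimensional real vector space and $\|\cdot\|$ a norm
on $V$. Let $C$, $\delta$ be positive reals, and let $S$ be a subset of
\[ \{\mathbf{x}\in V : \|\mathbf{x}\|\le C\}. \]
Then $S$ has a subset $S_0$ such that
\[ |S_0| \le \Bigl(1+\frac{2C}{\delta}\Bigr)^{r}, \tag{6.3.5} \]
\[ \text{for every } \mathbf{x}\in S \text{ there is } \mathbf{x}_0\in S_0 \text{ with } \|\mathbf{x}-\mathbf{x}_0\|\le\delta. \tag{6.3.6} \]
Proof. Let $S_0$ be any subset of $S$ with the property that
\[ \|\mathbf{x}'-\mathbf{x}''\| > \delta \quad\text{for any two distinct } \mathbf{x}', \mathbf{x}''\in S_0. \]
We show that $S_0$ satisfies (6.3.5). Knowing this, we can choose $S_0$ of maximal
cardinality; then it satisfies (6.3.6) as well.
For $\mathbf{u}\in V$, define $B_{\mathbf{u}} := \{\mathbf{x}\in V : \|\mathbf{x}-\mathbf{u}\|\le\delta/2\}$. Then by the triangle
inequality, the balls $B_{\mathbf{u}}$ ($\mathbf{u}\in S_0$) are pairwise disjoint, and are all contained
in $B := \{\mathbf{x}\in V : \|\mathbf{x}\|\le C+\delta/2\}$. Let $\mu$ be the Lebesgue measure on $V$ normalized such that the unit ball $\{\mathbf{x}\in V : \|\mathbf{x}\|\le 1\}$ has measure 1. Then, by
comparing measures,
\[ |S_0|\Bigl(\frac{\delta}{2}\Bigr)^{r} = \sum_{\mathbf{u}\in S_0}\mu(B_{\mathbf{u}}) \le \mu(B) = \Bigl(C+\frac{\delta}{2}\Bigr)^{r}, \]
which implies (6.3.5).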
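The proof is constructive in spirit: a maximal $\delta$-separated subset can be built greedily. The following sketch does this for a random finite sample in $\mathbb{R}^2$ with the $\ell^1$-norm and checks both the cardinality bound and the covering property. The choice of norm, dimension, sample size and parameters is an assumption made only for illustration.

```python
import random

def norm1(u):
    """l^1 norm of a vector given as a list of floats."""
    return sum(abs(c) for c in u)

def greedy_net(points, delta):
    """Maximal delta-separated subset of points, built greedily."""
    net = []
    for p in points:
        if all(norm1([a - b for a, b in zip(p, q)]) > delta for q in net):
            net.append(p)
    return net

random.seed(0)
r, C, delta = 2, 1.0, 0.25
points = []
while len(points) < 2000:
    p = [random.uniform(-C, C) for _ in range(r)]
    if norm1(p) <= C:            # keep only points in the ball of radius C
        points.append(p)

net = greedy_net(points, delta)
print(len(net), (1 + 2 * C / delta) ** r)
```

Every sample point that was not added to the net was rejected because it lies within $\delta$ of an earlier net element, which is exactly the covering property.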
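Write vectors in $\mathbb{R}^{ns}$ as $\mathbf{u} = (u_{iv} : v\in S,\ i = 1,\ldots,n)$ and define the following homomorphism from $\Gamma$ to the additive group of $\mathbb{R}^{ns}$:
\[ \varphi : (x_1,\ldots,x_n)\mapsto\Bigl(\frac{\log|x_i|_v}{d} : v\in S,\ i = 1,\ldots,n\Bigr). \]
Then the kernel of $\varphi$ is the torsion subgroup $\Gamma_{\mathrm{tors}}$ of $\Gamma$. Let $V$ be the real vector
space generated by $\varphi(\Gamma)$. Then $V$ has dimension $r$. Define a norm $\|\cdot\|$ on $\mathbb{R}^{ns}$ by
\[ \|\mathbf{u}\| := \frac{1}{2}\sum_{v\in S}\sum_{i=1}^n |u_{iv}|. \tag{6.3.7} \]
Then by (6.3.4) we have
\[ h^{+}(\mathbf{x}) = \|\varphi(\mathbf{x})\| \quad\text{for } \mathbf{x}\in\Gamma. \tag{6.3.8} \]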
By combining this with Lemma 6.3.4 we obtain the following.
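Lemma 6.3.5 Let $C$, $\delta$ be positive reals, and let $S$ be a non-empty subset of
$\{\mathbf{x}\in\Gamma : h^{+}(\mathbf{x})\le C\}$. Then $S$ has a subset $S_0$ such that
\[ |S_0| \le \Bigl(1+\frac{2C}{\delta}\Bigr)^{r}, \tag{6.3.9} \]
\[ \text{for every } \mathbf{x}\in S \text{ there is } \mathbf{x}_0\in S_0 \text{ with } h^{+}(\mathbf{x}\cdot\mathbf{x}_0^{-1})\le\delta. \tag{6.3.10} \]
Proof. Let $S^* := \varphi(S)$. Choose $S_0^*\subset S^*$ as in Lemma 6.3.4. Then choose
for each $\mathbf{u}_0\in S_0^*$ precisely one element $\mathbf{x}_0\in S$ with $\varphi(\mathbf{x}_0) = \mathbf{u}_0$, and let $S_0$ be
the set of all elements thus chosen. Then clearly, $S_0$ satisfies (6.3.9). To show
that it also satisfies (6.3.10), let $\mathbf{x}\in S$, choose $\mathbf{u}_0\in S_0^*$ with $\|\varphi(\mathbf{x})-\mathbf{u}_0\|\le\delta$,
and then $\mathbf{x}_0\in S_0$ with $\varphi(\mathbf{x}_0) = \mathbf{u}_0$. Then by (6.3.8), $h^{+}(\mathbf{x}\cdot\mathbf{x}_0^{-1}) = \|\varphi(\mathbf{x})-\varphi(\mathbf{x}_0)\|\le\delta$.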
We give another application.
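Lemma 6.3.6 Let $\theta > 0$. There is a subset $S_1$ of $V$ of cardinality
\[ |S_1| \le \Bigl(1+\frac{4n}{\theta}\Bigr)^{r} \]
such that for every $\mathbf{x} = (x_1,\ldots,x_n)\in\Gamma$ there is $\mathbf{c} = (c_{iv} : v\in S,\ i =
1,\ldots,n)\in S_1$ with
\[ \sum_{v\in S}\sum_{i=1}^n\Bigl|\frac{\log|x_i|_v}{d\,h(\mathbf{x})} - c_{iv}\Bigr| \le \theta. \tag{6.3.11} \]
Proof. We apply Lemma 6.3.4 to the set $S$ of vectors
\[ \mathbf{u}(\mathbf{x}) = \Bigl(\frac{\log|x_i|_v}{d\,h(\mathbf{x})} : v\in S,\ i = 1,\ldots,n\Bigr) \quad (\mathbf{x}\in\Gamma), \]
which is contained in $V$, and with the norm given by (6.3.7). By (6.3.8) we
have for $\mathbf{x}\in\Gamma$,
\[ \|\mathbf{u}(\mathbf{x})\| = \frac{1}{h(\mathbf{x})}\,\|\varphi(\mathbf{x})\| = \frac{h^{+}(\mathbf{x})}{h(\mathbf{x})} \le n. \]
So $S\subseteq\{\mathbf{u}\in V : \|\mathbf{u}\|\le n\}$. By Lemma 6.3.4 with $\delta = \theta/2$ there is a subset
$S_1$ of $S$ of cardinality at most $\bigl(1+\frac{2n}{\theta/2}\bigr)^{r} = \bigl(1+\frac{4n}{\theta}\bigr)^{r}$ such that for every $\mathbf{x}\in\Gamma$
there is $\mathbf{c}\in S_1$ with $\|\mathbf{u}(\mathbf{x})-\mathbf{c}\|\le\theta/2$. This implies (6.3.11).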
6.3.4 The large solutions
We give an upper bound for the number of subspaces containing the solutions
x of (6.3.1) for which h(x) is large. Our main tool is Theorem 3.1.6, i.e., the
quantitative version of the Parametric Subspace Theorem stated in Section 3.1.
We apply Lemma 6.3.6 with $\theta = c_3(n)^{-1}$, where $c_3(n)$ is a sufficiently large
function of $n$. Let $S_1$ be the set from Lemma 6.3.6. With our choice of $\theta$, we
have
\[ |S_1| \le c_4(n)^{r}. \tag{6.3.12} \]
Pick a tuple $\mathbf{c} = (c_{iv} : v\in S,\ i = 1,\ldots,n)$ from $S_1$, and consider the solutions $\mathbf{x} = (x_1,\ldots,x_n)\in\Gamma$ of (6.3.1) with
\[ \sum_{v\in M_K}\sum_{i=1}^n\Bigl|\frac{\log|x_i|_v}{d\,h(\mathbf{x})} - c_{iv}\Bigr| \le \theta = c_3(n)^{-1}. \tag{6.3.13} \]
Put $x_0 := 1$, $X_0 := X_1+\cdots+X_n$, $c_{0v} := 0$ for $v\in S$, and for $v\in S$, choose
$i_v\in\{0,\ldots,n\}$ such that
\[ c_{i_v,v} = \max(c_{0v},\ldots,c_{nv}). \]
Let $L_{1v},\ldots,L_{nv}$ be the linear forms $X_i$ ($i\in\{0,\ldots,n\}\setminus\{i_v\}$) in some order,
and put $d_{jv} = c_{iv}$ if $L_{jv} = X_i$. Further, for $v\in M_K\setminus S$, $i = 1,\ldots,n$ put $L_{iv} =
X_i$, $d_{iv} = 0$. Finally, put
\[ Q := \exp(d\,h(\mathbf{x})), \qquad \mathbf{d} := (d_{iv} : v\in M_K,\ i = 1,\ldots,n) \]
and define the twisted height $H_{Q,\mathbf{d}}$ by (3.1.4). By (6.3.13) we have
\[ \sum_{v\in M_K}\sum_{j=1}^n\Bigl|\frac{\log|L_{jv}(\mathbf{x})|_v}{\log Q} - d_{jv}\Bigr| \le c_3(n)^{-1}, \]
and this implies
\[ H_{Q,\mathbf{d}}(\mathbf{x}) \le Q^{c_3(n)^{-1}}. \tag{6.3.14} \]
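Let
\[ \xi_{iv} := \frac{\log|x_i|_v}{d\,h(\mathbf{x})} \quad (v\in S,\ i = 1,\ldots,n), \qquad \xi_{0v} := 0 \quad (v\in S). \]
Then
\[ \begin{aligned} \sum_{v\in S}\Bigl(\sum_{i=0}^n\xi_{iv} - \max(\xi_{0v},\ldots,\xi_{nv})\Bigr) &= \frac{1}{d\,h(\mathbf{x})}\Bigl(\log\prod_{v\in S}|x_1x_2\cdots x_n|_v - \log\prod_{v\in S}\max(1, |x_1|_v,\ldots,|x_n|_v)\Bigr)\\ &= \frac{-d\,h(\mathbf{x})}{d\,h(\mathbf{x})} = -1 \end{aligned} \]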
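and
\[ \sum_{v\in M_K}\max(\xi_{0v},\ldots,\xi_{nv}) = 1. \]
In combination with (6.3.13) this implies, assuming that $c_3(n)$ is sufficiently
large, that
\[ \mu := \frac{1}{n}\sum_{v\in M_K}\sum_{j=1}^n d_{jv} = \frac{1}{n}\sum_{v\in S}\Bigl(\sum_{i=0}^n c_{iv} - \max(c_{0v},\ldots,c_{nv})\Bigr) \]
is approximately equal to
\[ \frac{1}{n}\sum_{v\in S}\Bigl(\sum_{i=0}^n\xi_{iv} - \max(\xi_{0v},\ldots,\xi_{nv})\Bigr) = -\frac{1}{n}; \]
more precisely,
\[ -\frac{1}{n} - c_5(n)^{-1} \le \mu \le -\frac{1}{n} + c_5(n)^{-1}, \]
where $c_5(n) = 2c_3(n)$. Likewise,
\[ \lambda := \sum_{v\in M_K}\max(d_{1v},\ldots,d_{nv}) \le 1 + c_5(n)^{-1}. \]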
Together with (6.3.14) this implies
\[ H_{Q,\mathbf{d}}(\mathbf{x}) \le Q^{-\mu-\delta} \tag{6.3.15} \]
where, provided that $c_3(n)$ is sufficiently large,
\[ \delta = c_6(n)^{-1} := \frac{1}{n} - c_5(n)^{-1} - c_3(n)^{-1} > 0. \]
Recall that every solution $\mathbf{x}$ of (6.3.1) with (6.3.13) satisfies (6.3.15) with
$Q = \exp(d\,h(\mathbf{x}))$. Now Theorem 3.1.6 (the Quantitative Parametric Subspace
Theorem) implies that the set of solutions of (6.3.1) with (6.3.13) and with
\[ h(\mathbf{x}) = \frac{1}{d}\log Q \ge 2c_6(n)\log n =: c_7(n) \]
is contained in a union of at most $c_8(n)$ proper linear subspaces of $K^n$. Taking
into account the upper bound (6.3.12) for the cardinality of $S_1$, which is an
upper bound for the number of different inequalities (6.3.13), we arrive at the
following.
Lemma 6.3.7 The set of solutions $\mathbf{x} = (x_1,\ldots,x_n)\in\Gamma$ of (6.3.1) with
\[ h(\mathbf{x}) \ge c_7(n) \]
is contained in a union of at most $c_8(n)c_4(n)^{r}$ proper linear subspaces of $K^n$.
6.3.5 The small solutions, and conclusion of the proof
It remains to consider the solutions x of (6.3.1) with h(x) < c7 (n). The crucial
tool is the following, which we state without proof.
Proposition 6.3.8 Let $b_1,\ldots,b_n\in\overline{\mathbb{Q}}^*$. Then the equation
\[ b_1y_1+\cdots+b_ny_n = 1 \quad\text{in } \mathbf{y} = (y_1,\ldots,y_n)\in(\overline{\mathbb{Q}}^*)^n \tag{6.3.16} \]
has at most $c_9(n)$ non-degenerate solutions with $h^{+}(\mathbf{y})\le c_{10}(n)^{-1}$.
This result is a special case of more general estimates for the number of points
of small height lying on an arbitrary subvariety of a linear torus. From a result
of Schmidt (1996), Theorem 4, which was obtained by an elementary method,
one can deduce the above Proposition with
\[ c_9(n) = c_{10}(n) = \exp((4n)^{2n+2}). \]
From David and Philippon (1999), Theorem 1.3 and errata, which is much
deeper and uses difficult commutative algebra, it follows that Proposition 6.3.8
holds with
\[ c_9(n) = 2^{(n+26)7^{n-1}}, \qquad c_{10}(n) = c_9(n)^{3/4}. \]
Finally, a result of Amoroso and Viada (2009) implies that Proposition 6.3.8
holds with
\[ c_9(n) = (400n^5\log n)^{n^2(n-1)^2}, \qquad c_{10}(n) = 2(400n^5\log n)^{n(n-1)^2}. \]
The proof of Amoroso and Viada also uses commutative algebra but it is not as
difficult as that of David and Philippon.
Consider the set $S$ of solutions $\mathbf{x}$ of (6.3.1) with $h(\mathbf{x}) < c_7(n)$. By (6.3.3),
these solutions satisfy $h^{+}(\mathbf{x}) < nc_7(n)$. Apply Lemma 6.3.5 with $C = nc_7(n)$,
$\delta = c_{10}(n)^{-1}$. According to that lemma, there is a subset $S_0\subseteq S$ of cardinality
\[ |S_0| \le (1 + 2nc_7(n)c_{10}(n))^{r} \le c_{11}(n)^{r} \]
such that for every $\mathbf{x}\in S$ there is $\mathbf{x}_0\in S_0$ with
\[ h^{+}(\mathbf{x}\cdot\mathbf{x}_0^{-1}) \le c_{10}(n)^{-1}. \tag{6.3.17} \]
Write $\mathbf{x}_0 = (b_1,\ldots,b_n)$, $\mathbf{y} = (y_1,\ldots,y_n) := \mathbf{x}\cdot\mathbf{x}_0^{-1}$. Then clearly, the number
of non-degenerate solutions $\mathbf{x}$ of (6.3.1) with (6.3.17) is at most the number
of non-degenerate solutions $\mathbf{y}$ of (6.3.16) with $h^{+}(\mathbf{y})\le c_{10}(n)^{-1}$, hence at most $c_9(n)$. Taking into
account the cardinality of $S_0$, it follows that (6.3.1) has at most
\[ c_9(n)c_{11}(n)^{r} \]
non-degenerate solutions with $h(\mathbf{x}) < c_7(n)$.
The degenerate solutions of (6.3.1) lie in at most $2^n$ proper linear subspaces
of $K^n$, each given by a vanishing subsum. We infer that the solutions $\mathbf{x}$ of
(6.3.1) (non-degenerate or not) with $h(\mathbf{x}) < c_7(n)$ lie in a union of at most
\[ 2^n + c_9(n)c_{11}(n)^{r} \]
proper linear subspaces.
Adding to this the quantity from Lemma 6.3.7, it follows that the complete
set of solutions of (6.3.1) is contained in a union of at most
\[ c_8(n)c_4(n)^{r} + 2^n + c_9(n)c_{11}(n)^{r} \le c_2(n)^{r+1} \]
proper linear subspaces of $K^n$. This completes the proof of Theorem 6.3.2.
6.4 Proof of Theorem 6.1.4
We follow Beukers and Schlickewei (1996). Let $K$ be a field of characteristic
0, and $\Gamma$ a subgroup of $K^*\times K^*$ of rank $r$. We first show that Theorem 6.1.4
can be reduced to the following special case.
Theorem 6.4.1 Let $\Gamma$ be a finitely generated subgroup of $\overline{\mathbb{Q}}^*\times\overline{\mathbb{Q}}^*$ of rank $r$.
Then the equation
\[ x_1 + x_2 = 1 \quad\text{in } (x_1,x_2)\in\Gamma \tag{6.4.1} \]
has at most $2^{8(r+1)}$ solutions.
Proof of Theorem 6.1.4. We use again specializations. Let $K$ be any field of
characteristic 0, and let $\Gamma$ be a subgroup of $K^*\times K^*$ of rank $r$. We have to
prove that any finite subset of the set of solutions of (6.4.1) has cardinality at
most $2^{8(r+1)}$. Let $\{(x_{i1}, x_{i2}) : i = 1,\ldots,N\}$ be such a finite subset. We apply
Lemma 6.3.3 with $\{u_1,\ldots,u_m\}$ consisting of the numbers $x_{ik}$ ($i = 1,\ldots,N$,
$k = 1, 2$), the non-zero numbers among $x_{ik} - x_{jk}$ ($1\le i < j\le N$, $k = 1, 2$),
and the multiplicative inverses of all these numbers. Let $\varphi$ be the ring homomorphism from Lemma 6.3.3, and put $y_{ik} := \varphi(x_{ik})$ for $i = 1,\ldots,N$, $k = 1, 2$.
Since the images under $\varphi$ of the numbers listed above are all non-zero, $(y_{i1}, y_{i2})$
($i = 1,\ldots,N$) are distinct pairs from $\overline{\mathbb{Q}}^*\times\overline{\mathbb{Q}}^*$. In fact, they yield $N$ distinct
solutions of
\[ y_1 + y_2 = 1 \quad\text{in } (y_1,y_2)\in\Gamma', \]
where $\Gamma'$ is the group generated by $(y_{i1}, y_{i2})$ ($i = 1,\ldots,N$). But $\Gamma'$ is a subgroup of $\varphi(\Gamma)$, hence it has rank $r'\le r$. Now it follows from Theorem 6.4.1
that $N\le 2^{8(r'+1)}\le 2^{8(r+1)}$.
In the remainder of this section, we prove Theorem 6.4.1. Instead of the Quantitative Parametric Subspace Theorem, we can now use a much simpler method from Diophantine approximation, based on certain polynomial identities, going back to Thue and Siegel. For $N \in \mathbb{Z}_{>0}$ define the binary form
\[ W_N(X, Y) := \sum_{m=0}^{N} \binom{2N-m}{N-m} \binom{N+m}{m} X^{N-m} (-Y)^m, \]
and set $Z := -X - Y$, so that $X + Y + Z = 0$.
Lemma 6.4.2 We have the following polynomial identities, valid for every positive integer $N$:
\[ W_N(Y, X) = (-1)^N W_N(X, Y); \tag{6.4.2} \]
\[ X^{2N+1} W_N(Y, Z) + Y^{2N+1} W_N(Z, X) + Z^{2N+1} W_N(X, Y) = 0; \tag{6.4.3} \]
\[ \begin{vmatrix} Z^{2N+1} W_N(X, Y) & Y^{2N+1} W_N(Z, X) \\ Z^{2N+3} W_{N+1}(X, Y) & Y^{2N+3} W_{N+1}(Z, X) \end{vmatrix} = c_N (XYZ)^{2N+1} (X^2 + XY + Y^2) \quad \text{with } c_N \neq 0. \tag{6.4.4} \]
Proof. Identity (6.4.2) is obvious.
Identity (6.4.3) can be deduced from classical relations between hypergeometric functions (see, for instance, Bombieri and Gubler (2006), chapter 5), but we give a direct proof. Fix a positive integer $N$, let $x, y, z$ be non-zero complex numbers with $x + y + z = 0$ and consider the rational function
\[ f(t) := \frac{1}{(t(1 - xt)(1 + yt))^{N+1}}. \]
This function has poles of order $N + 1$ at $t = 0$, $t = 1/x$, $t = -1/y$ and no other poles on $\mathbb{C} \cup \{\infty\}$. The sum of the residues at these poles is 0. We compute the residues. The residue of $f$ at $t = 0$ is the coefficient of $t^N$ in the power series expansion of $(1 - xt)^{-N-1} (1 + yt)^{-N-1}$, which is
\[ \sum_{m=0}^{N} \binom{-N-1}{N-m} (-x)^{N-m} \binom{-N-1}{m} y^m = \sum_{m=0}^{N} \binom{2N-m}{N-m} \binom{N+m}{m} x^{N-m} (-y)^m = W_N(x, y). \]
The residue of $f$ at $t = 1/x$ is equal to the residue at $t = 0$ of
\[ f(t + 1/x) = \frac{1}{((t + 1/x)(-xt)(-z/x + yt))^{N+1}} = \frac{(x/z)^{N+1}}{(t(1 - xyt/z)(1 + xt))^{N+1}}. \]
This residue is equal to $(x/z)^{N+1} W_N(xy/z, x) = (x/z)^{2N+1} W_N(y, z)$. A similar computation gives that the residue of $f$ at $t = -1/y$ equals $(y/z)^{2N+1} W_N(z, x)$. Summing the residues and multiplying with $z^{2N+1}$ shows that (6.4.3) holds for all non-zero $x, y, z \in \mathbb{C}$ with $x + y + z = 0$.
Denote the left-hand side of (6.4.4) by $\Delta_N$. We first show that $\Delta_N \not\equiv 0$. Indeed, by (6.4.2), the value of $\Delta_N$ at $X = 2$, $Y = Z = -1$ is
\[ \begin{vmatrix} W_N(2, -1) & W_N(-1, 2) \\ W_{N+1}(2, -1) & W_{N+1}(-1, 2) \end{vmatrix} = \pm 2\, W_N(2, -1)\, W_{N+1}(2, -1), \]
which is easily seen to be non-zero. By (6.4.3), up to sign, $\Delta_N$ is invariant under the substitutions $(X, Y) \mapsto (Y, Z)$, $(X, Y) \mapsto (Z, X)$. Hence $\Delta_N$ is divisible by $(XYZ)^{2N+1}$. Since $\Delta_N$ is homogeneous of degree $6N + 5$, the quotient $\Delta_N / (XYZ)^{2N+1}$ is a quadratic form, which is up to sign invariant under the above mentioned substitutions. So this quadratic form must be a scalar multiple of $X^2 + XY + Y^2$.
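The identities of Lemma 6.4.2 can be tested in exact integer arithmetic. The following Python sketch evaluates $W_N$ at integer points, checks (6.4.2) and (6.4.3), and recovers the constant $c_N$ of (6.4.4) as a quotient. The function names `W`, `delta` and `c` are ours and serve only as illustration; the computation is a sanity check at sample points, not a proof.

```python
from fractions import Fraction
from math import comb


def W(N, X, Y):
    """W_N(X, Y) = sum_{m=0}^{N} C(2N-m, N-m) C(N+m, m) X^(N-m) (-Y)^m."""
    return sum(
        comb(2 * N - m, N - m) * comb(N + m, m) * X ** (N - m) * (-Y) ** m
        for m in range(N + 1)
    )


def symmetry_holds(N, X, Y):
    """Identity (6.4.2): W_N(Y, X) = (-1)^N W_N(X, Y)."""
    return W(N, Y, X) == (-1) ** N * W(N, X, Y)


def three_term_holds(N, X, Y):
    """Identity (6.4.3), with Z = -X - Y."""
    Z = -X - Y
    e = 2 * N + 1
    return X ** e * W(N, Y, Z) + Y ** e * W(N, Z, X) + Z ** e * W(N, X, Y) == 0


def delta(N, X, Y):
    """The 2x2 determinant on the left-hand side of (6.4.4), with Z = -X - Y."""
    Z = -X - Y
    return (
        Z ** (2 * N + 1) * W(N, X, Y) * Y ** (2 * N + 3) * W(N + 1, Z, X)
        - Y ** (2 * N + 1) * W(N, Z, X) * Z ** (2 * N + 3) * W(N + 1, X, Y)
    )


def c(N, X, Y):
    """delta / ((XYZ)^(2N+1) (X^2 + XY + Y^2)); by (6.4.4) independent of X, Y."""
    Z = -X - Y
    denominator = (X * Y * Z) ** (2 * N + 1) * (X * X + X * Y + Y * Y)
    return Fraction(delta(N, X, Y), denominator)
```

For instance, `c(1, 2, -1)` and `c(1, 1, 1)` both evaluate to 24, in agreement with the claim that the quotient is a constant.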
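Recall that the homogeneous logarithmic height of $\mathbf{x} = (x_1, \ldots, x_n) \in \overline{\mathbb{Q}}^n$ is given by
\[ h^{\mathrm{hom}}(\mathbf{x}) = h^{\mathrm{hom}}(x_1, \ldots, x_n) := \frac{1}{[K:\mathbb{Q}]} \sum_{v \in M_K} \log \max_{1 \le i \le n} |x_i|_v, \]
where $K$ is a number field with $\mathbf{x} \in K^n$ (see Section 1.9). The other heights used in this chapter are related to this by
\[ h(\mathbf{x}) = h^{\mathrm{hom}}(1, x_1, \ldots, x_n), \qquad \hat{h}(\mathbf{x}) = \sum_{i=1}^{n} h^{\mathrm{hom}}(1, x_i). \]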
Lemma 6.4.3 Let $a, b, c$ be non-zero elements of $\overline{\mathbb{Q}}$, and let $(x_i, y_i, z_i)$ ($i = 1, 2$) be two linearly independent vectors from $\overline{\mathbb{Q}}^3$ such that $a x_i + b y_i + c z_i = 0$ for $i = 1, 2$. Then
\[ h^{\mathrm{hom}}(a, b, c) \le h^{\mathrm{hom}}(x_1, y_1, z_1) + h^{\mathrm{hom}}(x_2, y_2, z_2) + \log 2. \]
Proof. The vector $(a, b, c)$ is proportional to the exterior product of $(x_1, y_1, z_1)$, $(x_2, y_2, z_2)$, which is $(y_1 z_2 - y_2 z_1,\ z_1 x_2 - x_1 z_2,\ x_1 y_2 - x_2 y_1)$. So
\[ h^{\mathrm{hom}}(a, b, c) = h^{\mathrm{hom}}(y_1 z_2 - y_2 z_1,\ z_1 x_2 - x_1 z_2,\ x_1 y_2 - x_2 y_1). \]
Choose a number field $K$ containing $x_i, y_i, z_i$ for $i = 1, 2$. Let $s(v) := 1$ if $v$ is a real place, $s(v) := 2$ if $v$ is a complex place, and $s(v) := 0$ if $v$ is a finite place of $K$. Recall that $\sum_{v \in M_K} s(v) = [K : \mathbb{Q}]$. Now the lemma follows easily by observing that
\[ \max(|y_1 z_2 - y_2 z_1|_v, |z_1 x_2 - z_2 x_1|_v, |x_1 y_2 - x_2 y_1|_v) \le 2^{s(v)} \max(|x_1|_v, |y_1|_v, |z_1|_v) \max(|x_2|_v, |y_2|_v, |z_2|_v) \]
for $v \in M_K$, and then taking the product over $v \in M_K$, taking logarithms, and dividing by $[K : \mathbb{Q}]$.
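Over $\mathbb{Q}$ the homogeneous height is elementary to compute: for a non-zero rational vector it equals $\log \max |a_i|$, where $(a_i)$ is the primitive integer vector proportional to it. The Python sketch below uses this special case only (the general definition requires the places of a number field) to illustrate Lemma 6.4.3 on a small example. The helper names `h_hom` and `exterior_product` are ours.

```python
from fractions import Fraction
from functools import reduce
from math import gcd, log


def h_hom(vec):
    """Homogeneous logarithmic height of a non-zero vector with rational entries.

    Over Q this is log(max |a_i|) for the primitive integer vector (a_i)
    proportional to vec.
    """
    fracs = [Fraction(v) for v in vec]
    # Least common multiple of the denominators.
    common_den = reduce(lambda a, b: a * b // gcd(a, b),
                        (f.denominator for f in fracs), 1)
    ints = [int(f * common_den) for f in fracs]
    g = reduce(gcd, ints, 0)
    return log(max(abs(a) for a in ints) // g)


def exterior_product(p, q):
    """(y1 z2 - y2 z1, z1 x2 - x1 z2, x1 y2 - x2 y1) for p = (x1, y1, z1), q = (x2, y2, z2)."""
    (x1, y1, z1), (x2, y2, z2) = p, q
    return (y1 * z2 - y2 * z1, z1 * x2 - x1 * z2, x1 * y2 - x2 * y1)


# Two independent vectors orthogonal to (a, b, c) = (2, 3, -5).
p, q = (1, 1, 1), (5, 0, 2)
abc = exterior_product(p, q)
lemma_holds = h_hom(abc) <= h_hom(p) + h_hom(q) + log(2)
```

Here `abc` is `(2, 3, -5)`, and the inequality of the lemma reads $\log 5 \le 0 + \log 5 + \log 2$.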
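Lemma 6.4.4 Let $\mathbf{x}_i = (x_{i1}, x_{i2}) \in \overline{\mathbb{Q}}^* \times \overline{\mathbb{Q}}^*$ with $x_{i1} + x_{i2} = 1$ for $i = 1, 2$ and with $\mathbf{x}_1 \neq \mathbf{x}_2$. Then
\[ h(\mathbf{x}_1) \le h(\mathbf{x}_2 \mathbf{x}_1^{-1}) + \log 2. \]
Proof. Apply Lemma 6.4.3 with $(a, b, c) = (x_{11}, x_{12}, -1)$, $(x_1, y_1, z_1) = (1, 1, 1)$, $(x_2, y_2, z_2) = (x_{21} x_{11}^{-1}, x_{22} x_{12}^{-1}, 1)$. Here $h^{\mathrm{hom}}(1, 1, 1) = 0$, and the two vectors are linearly independent because $\mathbf{x}_1 \neq \mathbf{x}_2$.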
Lemma 6.4.5 Let $\mathbf{x}_i = (x_{i1}, x_{i2}) \in \overline{\mathbb{Q}}^* \times \overline{\mathbb{Q}}^*$ with $x_{i1} + x_{i2} = 1$ for $i = 1, 2$. Then for every positive integer $N$ there is $M \in \{N, N + 1\}$ such that
\[ h(\mathbf{x}_1) \le \frac{1}{M+1}\, h(\mathbf{x}_2 \mathbf{x}_1^{-2M-1}) + \log 8. \]
Proof. If both $x_{11}, x_{12}$ are roots of unity, then $h(\mathbf{x}_1) = 0$ and the lemma is obviously true. Assume that $x_{11}, x_{12}$ are not both roots of unity. Choose a number field $K$ containing $x_{i1}, x_{i2}$ for $i = 1, 2$. By (6.4.3) we have
\[ x_{11}^{2M+1} W_M(x_{12}, -1) + x_{12}^{2M+1} W_M(-1, x_{11}) - W_M(x_{11}, x_{12}) = 0 \]
for $M \in \mathbb{Z}_{>0}$, while also
\[ x_{11}^{2M+1} \left( x_{21} x_{11}^{-2M-1} \right) + x_{12}^{2M+1} \left( x_{22} x_{12}^{-2M-1} \right) - 1 = 0. \]
Let $N$ be a positive integer. By (6.4.4), and since $\mathbf{x}_1$ does not consist of roots of unity, there is $M \in \{N, N + 1\}$ such that the vectors $(x_{21}, x_{22}, -1)$ and
\[ \left( x_{11}^{2M+1} W_M(x_{12}, -1),\ x_{12}^{2M+1} W_M(-1, x_{11}),\ -W_M(x_{11}, x_{12}) \right) \]
are linearly independent. This implies that the two vectors
\[ \left( x_{21} x_{11}^{-2M-1},\ x_{22} x_{12}^{-2M-1},\ -1 \right), \qquad \left( W_M(x_{12}, -1),\ W_M(-1, x_{11}),\ -W_M(x_{11}, x_{12}) \right) =: (a, b, c) \]
are linearly independent. So by Lemma 6.4.3,
\[ (2M + 1)\, h(\mathbf{x}_1) \le h(\mathbf{x}_2 \mathbf{x}_1^{-2M-1}) + h^{\mathrm{hom}}(a, b, c) + \log 2. \tag{6.4.5} \]
We estimate $h^{\mathrm{hom}}(a, b, c)$. Choose a number field $K$ containing $x_{11}, x_{12}$. The binary form $W_M$ has integer coefficients, whose absolute values have sum
\[ \sum_{m=0}^{M} \binom{2M-m}{M-m} \binom{M+m}{m} = \binom{3M+1}{M} \le 2^{3M} \]
(this can be seen by comparing the coefficients of $X^M$ in the power series identity $(1 - X)^{-M-1} \cdot (1 - X)^{-M-1} = (1 - X)^{-2M-2}$). As a consequence, we have for $v \in M_K$,
\[ \max(|a|_v, |b|_v, |c|_v) \le 2^{3M s(v)} \max(1, |x_{11}|_v, |x_{12}|_v)^M. \]
By taking the product over $v \in M_K$, then logarithms and then dividing by $[K : \mathbb{Q}]$, we obtain
\[ h^{\mathrm{hom}}(a, b, c) \le M \cdot h(\mathbf{x}_1) + 3M \log 2. \]
Together with (6.4.5) this gives
\[ (M + 1)\, h(\mathbf{x}_1) \le h(\mathbf{x}_2 \mathbf{x}_1^{-2M-1}) + (3M + 1) \log 2, \]
which easily implies our lemma.
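The two numerical facts used at the end of this proof, namely the closed form $\binom{3M+1}{M} \le 2^{3M}$ for the coefficient sum and the inequality $\frac{3M+1}{M+1} \log 2 < \log 8$, can be checked for a range of $M$ with a few lines of Python. This is only a spot check under our own function names, not part of the argument.

```python
from math import comb, log


def coefficient_abs_sum(M):
    """Sum of the absolute values of the coefficients of W_M."""
    return sum(comb(2 * M - m, M - m) * comb(M + m, m) for m in range(M + 1))


def closed_form_and_bound_hold(M):
    """Checks that the sum equals C(3M+1, M) and is at most 2^(3M)."""
    total = coefficient_abs_sum(M)
    return total == comb(3 * M + 1, M) and total <= 2 ** (3 * M)


def final_constant_ok(M):
    """Checks ((3M+1)/(M+1)) log 2 < log 8, the last step of the proof."""
    return (3 * M + 1) / (M + 1) * log(2) < log(8)
```

For $M = 2$ the coefficients of $W_2$ are $6, -9, 6$, so the sum is $21 = \binom{7}{2}$.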
The next result, which is needed to deal with the “small” solutions, is due
to Beukers and Zagier.
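Lemma 6.4.6 Let $\mathbf{x}_i = (x_{i1}, x_{i2})$ ($i = 0, 1, 2$) be three distinct points from $\overline{\mathbb{Q}}^* \times \overline{\mathbb{Q}}^*$ with $x_{i1} + x_{i2} = 1$ for $i = 0, 1, 2$. Then
\[ h(\mathbf{x}_1 \mathbf{x}_0^{-1}) + h(\mathbf{x}_2 \mathbf{x}_0^{-1}) > 0.09. \]
Proof. This is a consequence of Corollary 2.4 of Beukers and Zagier (1997). A similar result of this type, with a lower bound 1/2400 instead of 0.09, was obtained earlier by Schlickewei and Wirsing (1997). We give a sketch of the proof of Beukers and Zagier, referring for certain details to their paper.
Write $\mathbf{y}_i = (y_{i1}, y_{i2}) = \mathbf{x}_i \mathbf{x}_0^{-1}$ for $i = 1, 2$ and $a_j := x_{0j}$ for $j = 1, 2$. Then $(1, 1)$, $\mathbf{y}_1$, $\mathbf{y}_2$ lie on the line $L : a_1 x_1 + a_2 x_2 = 1$. Hence
\[ \begin{vmatrix} 1 & 1 & 1 \\ 1 & y_{11} & y_{12} \\ 1 & y_{21} & y_{22} \end{vmatrix} = 0. \tag{6.4.6} \]
Further,
\[ \Delta := \begin{vmatrix} 1 & 1 & 1 \\ y_{11} y_{12} & y_{12} & y_{11} \\ y_{21} y_{22} & y_{22} & y_{21} \end{vmatrix} = y_{11} y_{12} y_{21} y_{22} \begin{vmatrix} 1 & 1 & 1 \\ 1 & y_{11}^{-1} & y_{12}^{-1} \\ 1 & y_{21}^{-1} & y_{22}^{-1} \end{vmatrix} \neq 0. \tag{6.4.7} \]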
For assume the contrary. Then the three points $(1, 1)$, $\mathbf{y}_1$, $\mathbf{y}_2$ lie on a conic $C : b_1 x_1 x_2 + b_2 x_2 + b_3 x_1 = 0$, where at least two among the coefficients $b_1, b_2, b_3$ must be non-zero. It is easy to see that $L$ and $C$ can have no more than two points in common, giving a contradiction.
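In what follows, we need functions on nine-dimensional complex space. We write points in $\mathbb{C}^9$ as $\mathbf{z}$, $(z_{00}, z_{01}, \ldots, z_{22})$, or as $(\mathbf{z}_0, \mathbf{z}_1, \mathbf{z}_2)$ where $\mathbf{z}_i = (z_{i0}, z_{i1}, z_{i2})$ for $i = 0, 1, 2$. Define $F : \mathbb{C}^9 \to \mathbb{C}$ by
\[ F(\mathbf{z}) := \begin{vmatrix} z_{01} z_{02} & z_{02} z_{00} & z_{00} z_{01} \\ z_{11} z_{12} & z_{12} z_{10} & z_{10} z_{11} \\ z_{21} z_{22} & z_{22} z_{20} & z_{20} z_{21} \end{vmatrix}. \]
Notice that $F(\mathbf{z}) = \left( \prod_{i,j=0}^{2} z_{ij} \right) \det(z_{ij}^{-1})$ if all $z_{ij} \neq 0$. Let $\mu, \nu$ be reals with
\[ \mu \ge 0, \quad \nu \ge 0, \quad 2\mu + 3\nu = 1, \]
which will be chosen optimally later, and define $\Phi_{\mu,\nu} : \mathbb{C}^9 \to \mathbb{R}$ by
\[ \Phi_{\mu,\nu}(\mathbf{z}) := |F(\mathbf{z})|^{\mu} \prod_{i,j=0}^{2} |z_{ij}|^{\nu}. \]
This is equal to $|\det(z_{ij}^{-1})|^{\mu} \prod_{i,j=0}^{2} |z_{ij}|^{\mu+\nu}$ if all $z_{ij} \neq 0$. Finally, define the set
\[ D := \left\{ \mathbf{z} \in \mathbb{C}^9 : \det(z_{ij}) = 0,\ |z_{ij}| \le 1 \text{ for } i, j = 0, 1, 2 \right\} \]
and put
\[ m(\mu, \nu) := \sup_{\mathbf{z} \in D} \Phi_{\mu,\nu}(\mathbf{z}). \]
We first show that
\[ h(\mathbf{y}_1) + h(\mathbf{y}_2) \ge -\log m(\mu, \nu). \tag{6.4.8} \]
Choose a number field $K$ containing the coordinates of $\mathbf{y}_1, \mathbf{y}_2$. For $v \in M_K$, $i = 1, 2$, let $\lambda_{iv} := \max(1, |y_{i1}|_v, |y_{i2}|_v)$. First, let $v$ be an infinite place, and choose an embedding $\sigma_v : K \hookrightarrow \mathbb{C}$ such that $|\cdot|_v = |\sigma_v(\cdot)|^{s(v)}$, where as usual, $s(v) = 1$ if $v$ is real, $s(v) = 2$ if $v$ is complex. Let $\mathbf{z} = (\mathbf{z}_0, \mathbf{z}_1, \mathbf{z}_2) = (z_{00}, \ldots, z_{22})$, where $\mathbf{z}_0 = (1, 1, 1)$, and $\mathbf{z}_i$ is a scalar multiple of $(1, \sigma_v(y_{i1}), \sigma_v(y_{i2}))$ such that $\max_{0 \le j \le 2} |z_{ij}| = 1$, for $i = 1, 2$. Then
\[ \frac{|\Delta|_v^{\mu} \cdot |y_{11} y_{12} y_{21} y_{22}|_v^{\nu}}{(\lambda_{1v} \lambda_{2v})^{2\mu+3\nu}} = \Phi_{\mu,\nu}(\mathbf{z})^{s(v)} \le m(\mu, \nu)^{s(v)}. \]
If $v$ is finite then we have by the ultrametric inequality,
\[ \frac{|\Delta|_v^{\mu} \cdot |y_{11} y_{12} y_{21} y_{22}|_v^{\nu}}{(\lambda_{1v} \lambda_{2v})^{2\mu+3\nu}} \le 1. \]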
By taking the product over $v \in M_K$, using (6.4.7) and the Product Formula, and the condition $2\mu + 3\nu = 1$, inequality (6.4.8) easily follows.
We first derive an upper bound for $m(\mu, \nu)$, and then determine the minimum of this upper bound over the set of $(\mu, \nu) \in \mathbb{R}^2_{\ge 0}$ with $2\mu + 3\nu = 1$. Since the set $D$ is compact, the function $\Phi_{\mu,\nu}$ attains a maximum on $D$, say at $\mathbf{z}$. We use the fact that $\mathbf{z}$ can be chosen in such a way that at most one of the coordinates of $\mathbf{z}$ has absolute value $< 1$ (see Beukers and Zagier (1997), Lemma 3.2 for a proof). By symmetry, we may assume that $|z_{00}| \le 1$ and $|z_{ij}| = 1$ if $(i, j) \neq (0, 0)$. So $z_{ij}^{-1} = \overline{z_{ij}}$ if $(i, j) \neq (0, 0)$. Assume for the moment that $z_{00} \neq 0$. Then
\[ \begin{aligned} \Phi_{\mu,\nu}(\mathbf{z}) &= |z_{00}|^{\mu+\nu} |\det(z_{ij}^{-1})|^{\mu} = |z_{00}|^{\mu+\nu} |\det(z_{ij}^{-1}) - \det(\overline{z_{ij}})|^{\mu} \\ &= |z_{00}|^{\mu+\nu} |z_{00}^{-1} - \overline{z_{00}}|^{\mu} \cdot |\overline{z_{11}} \cdot \overline{z_{22}} - \overline{z_{12}} \cdot \overline{z_{21}}|^{\mu} \\ &\le 2^{\mu} |z_{00}|^{\nu} (1 - |z_{00}|^2)^{\mu}. \end{aligned} \]
This is also true if $z_{00} = 0$ since in that case one can prove directly that $|F(\mathbf{z})| \le 2$. Computing the maximum of $f(x) = 2^{\mu} x^{\nu} (1 - x^2)^{\mu}$ on $[0, 1]$, we obtain
\[ m(\mu, \nu) \le 2^{\mu} \left( \frac{\nu}{2\mu + \nu} \right)^{\nu/2} \left( \frac{2\mu}{2\mu + \nu} \right)^{\mu}. \]
We now let $\mu, \nu$ vary, and determine the minimum of the right-hand side under the constraints $2\mu + 3\nu = 1$, $\mu \ge 0$, $\nu \ge 0$. Elementary calculus (see Beukers and Zagier (1997), Lemma 3.3) shows that this minimum is equal to the unique root $x_0 \in (0, 1)$ of $\frac{1}{2} x^2 + x^6 = 1$. Inserting this into (6.4.8), we obtain
\[ h(\mathbf{y}_1) + h(\mathbf{y}_2) \ge -\log x_0 > 0.09. \]
This proves our lemma.
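The final optimisation can be reproduced numerically. The Python sketch below minimises the upper bound $2^{\mu} (\nu/(2\mu+\nu))^{\nu/2} (2\mu/(2\mu+\nu))^{\mu}$ on a grid along the line $2\mu + 3\nu = 1$ and compares the result with the root $x_0$ of $\frac{1}{2}x^2 + x^6 = 1$ found by bisection. The grid size, the tolerance and the function names are our choices, and a grid search only approximates the true minimum.

```python
from math import exp, log


def log_upper_bound(nu):
    """log of 2^mu (nu/(2mu+nu))^(nu/2) (2mu/(2mu+nu))^mu, with mu = (1 - 3 nu)/2."""
    mu = (1 - 3 * nu) / 2
    s = 2 * mu + nu
    return mu * log(2) + (nu / 2) * log(nu / s) + mu * log(2 * mu / s)


def minimise_bound(steps=30000):
    """Grid search over 0 < nu < 1/3 (end points excluded to avoid log(0))."""
    return min(exp(log_upper_bound(k / (3 * steps))) for k in range(1, steps))


def root_x0(tol=1e-14):
    """Bisection for the root of x^2/2 + x^6 = 1 in (0, 1)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid * mid / 2 + mid ** 6 < 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Calling `-log(root_x0())` gives a value slightly above 0.09, which shows that the constant in Lemma 6.4.6 leaves very little room.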
We proceed further with equation (6.4.1) and assume henceforth that $\Gamma$ is a finitely generated subgroup of $\overline{\mathbb{Q}}^* \times \overline{\mathbb{Q}}^*$ of rank $r$. Then there exist an algebraic number field $K$ and a finite set of places $S$ of $K$ containing the infinite places, such that
\[ \Gamma \subseteq O_S^* \times O_S^*. \]
Let $[K : \mathbb{Q}] = d$, $|S| = s$. We denote elements of $\mathbb{R}^{2s}$ as $\mathbf{u} = (u_{iv} : v \in S, i = 1, 2)$, and define a homomorphism from $\Gamma$ to the additive group of $\mathbb{R}^{2s}$ by
\[ \varphi : (x_1, x_2) \mapsto \left( \frac{\log |x_i|_v}{d} : v \in S,\ i = 1, 2 \right). \]
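Let $V \subseteq \mathbb{R}^{2s}$ be the real vector space spanned by $\varphi(\Gamma)$. Then $V$ has dimension $r$. By (6.3.8) we have
\[ \hat{h}(\mathbf{x}) = \|\varphi(\mathbf{x})\| \quad \text{for } \mathbf{x} \in \Gamma, \tag{6.4.9} \]
where $\|\cdot\|$ is the norm on $\mathbb{R}^{2s}$, given by
\[ \|\mathbf{u}\| := \frac{1}{2} \sum_{v \in S} \sum_{i=1}^{2} |u_{iv}| \quad \text{for } \mathbf{u} = (u_{iv} : v \in S, i = 1, 2) \in \mathbb{R}^{2s}. \]
Denote by $\mathcal{S}$ the image under $\varphi$ of the set of solutions of (6.4.1). We have collected some properties of $\mathcal{S}$.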
Lemma 6.4.7 For every $\mathbf{u} \in \mathcal{S}$ there are at most two solutions $\mathbf{x}$ of (6.4.1) such that $\varphi(\mathbf{x}) = \mathbf{u}$.
Proof. Let $v$ be an infinite place of $K$. Then there is an embedding $\sigma : K \hookrightarrow \mathbb{C}$ such that $|x|_v = |\sigma(x)|^{s(v)}$ for $x \in K$, where $s(v) = 1$ if $v$ is real, $s(v) = 2$ if $v$ is complex. Consider the solutions $\mathbf{x} = (x_1, x_2)$ of (6.4.1) with $\varphi(\mathbf{x}) = \mathbf{u}$. For these solutions, the absolute values $|\sigma(x_1)|$ and $|1 - \sigma(x_1)| = |\sigma(x_2)|$ have prescribed values, depending only on $\mathbf{u}$. In geometric terms, $\sigma(x_1)$ is an intersection point of two given circles in the complex plane that depend on $\mathbf{u}$. This implies that for any given $\mathbf{u}$ there are at most two possibilities for $\sigma(x_1)$, hence for $\mathbf{x}$.
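The geometric step can be made explicit. Writing $r_1 = |\sigma(x_1)|$ and $r_2 = |1 - \sigma(x_1)|$, the point $\sigma(x_1)$ lies on the circle of radius $r_1$ about 0 and on the circle of radius $r_2$ about 1. The Python sketch below, in which the function name `candidates` and the floating-point treatment are ours, computes the at most two intersection points.

```python
from math import sqrt


def candidates(r1, r2):
    """Complex numbers z with |z| = r1 and |1 - z| = r2 (at most two)."""
    # Subtracting |z - 1|^2 = r2^2 from |z|^2 = r1^2 gives 2 Re(z) - 1 = r1^2 - r2^2.
    re = (1 + r1 * r1 - r2 * r2) / 2
    im_sq = r1 * r1 - re * re
    if im_sq < 0:
        return []                      # the circles do not meet
    if im_sq == 0:
        return [complex(re, 0.0)]      # tangent circles: a single point
    im = sqrt(im_sq)
    return [complex(re, im), complex(re, -im)]
```

For $r_1 = r_2 = 1$ the two candidates are the primitive sixth roots of unity $\frac{1}{2} \pm \frac{\sqrt{3}}{2} i$.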
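Lemma 6.4.8 The set $\mathcal{S}$ has the following properties:
(i) for any two distinct $\mathbf{u}_1, \mathbf{u}_2 \in \mathcal{S}$ we have
\[ \|\mathbf{u}_1\| \le 2 \|\mathbf{u}_2 - \mathbf{u}_1\| + \log 4; \]
(ii) for any two distinct $\mathbf{u}_1, \mathbf{u}_2 \in \mathcal{S}$ and any positive integer $N$, there is $M \in \{N, N + 1\}$ such that
\[ \|\mathbf{u}_1\| \le \frac{2}{M+1} \|\mathbf{u}_2 - (2M + 1)\mathbf{u}_1\| + \log 64; \]
(iii) for any three distinct $\mathbf{u}_0, \mathbf{u}_1, \mathbf{u}_2 \in \mathcal{S}$ we have
\[ \|\mathbf{u}_1 - \mathbf{u}_0\| + \|\mathbf{u}_2 - \mathbf{u}_0\| > 0.09. \]
Proof. This is simply a combination of Lemmas 6.4.4–6.4.6, the inequality $h(\mathbf{x}) \le \hat{h}(\mathbf{x}) \le 2h(\mathbf{x})$ for $\mathbf{x} \in \overline{\mathbb{Q}}^* \times \overline{\mathbb{Q}}^*$, and (6.4.9).
Our strategy to prove Theorem 6.4.1 is to estimate the cardinality of $\mathcal{S}$, using (i)–(iii) of Lemma 6.4.8. Then in view of Lemma 6.4.7, we only have to multiply the upper bound for $|\mathcal{S}|$ by 2 to get an upper bound for the number of solutions of (6.4.1).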
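We cover $\mathcal{S}$ by cones and estimate the number of points in a cone. Let $\theta > 0$ be a parameter whose value will be specified later. By Lemma 6.3.4, there is a set $E \subset \{\mathbf{e} \in V : \|\mathbf{e}\| = 1\}$, with
\[ |E| \le (1 + 2\theta^{-1})^r \tag{6.4.10} \]
such that for every non-zero $\mathbf{u} \in V$ there is $\mathbf{e} \in E$ for which
\[ \left\| \|\mathbf{u}\|^{-1} \mathbf{u} - \mathbf{e} \right\| \le \theta. \tag{6.4.11} \]
For $\mathbf{e} \in E$ denote by $\mathcal{S}_{\mathbf{e}}$ the set of $\mathbf{u} \in \mathcal{S}$ with (6.4.11). Notice that every $\mathbf{u} \in \mathcal{S}_{\mathbf{e}}$ can be written as
\[ \mathbf{u} = \|\mathbf{u}\| \mathbf{e} + \mathbf{u}' \quad \text{with } \|\mathbf{u}'\| \le \theta \|\mathbf{u}\|. \tag{6.4.12} \]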
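Lemma 6.4.9 Let $\mathbf{e} \in E$, $0 < \theta < \frac{1}{9}$.
(i) For any two distinct $\mathbf{u}_1, \mathbf{u}_2 \in \mathcal{S}_{\mathbf{e}}$ with $\|\mathbf{u}_2\| \ge \|\mathbf{u}_1\| \ge \frac{\log 16}{1 - 9\theta}$ we have $\|\mathbf{u}_2\| \ge \frac{5}{4} \|\mathbf{u}_1\|$.
(ii) For any two distinct $\mathbf{u}_1, \mathbf{u}_2 \in \mathcal{S}_{\mathbf{e}}$ with $\|\mathbf{u}_2\| \ge \|\mathbf{u}_1\| \ge \frac{\log 64}{1 - 9\theta}$ we have $\|\mathbf{u}_2\| < 10\theta^{-1} \|\mathbf{u}_1\|$.
(iii) The set of $\mathbf{u} \in \mathcal{S}_{\mathbf{e}}$ with $\|\mathbf{u}\| \ge \frac{\log 16}{1 - 9\theta}$ has cardinality at most $3 + [\log(10\theta^{-1}) / \log(5/4)]$.
Proof. Part (i) is an elementary gap principle. Part (ii) is the more involved result, based on the polynomial identities from Lemma 6.4.2. Part (iii) follows from (i) and (ii).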
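We first prove (i) and (ii). Let $\mathbf{u}_1, \mathbf{u}_2$ be distinct elements of $\mathcal{S}_{\mathbf{e}}$ with $\|\mathbf{u}_2\| \ge \|\mathbf{u}_1\|$. Put $\lambda_i := \|\mathbf{u}_i\|$ for $i = 1, 2$. By (6.4.12), we have $\mathbf{u}_i = \lambda_i \mathbf{e} + \mathbf{u}_i'$ with $\|\mathbf{u}_i'\| \le \theta \lambda_i$ for $i = 1, 2$.
Assume that $\lambda_2 < \frac{5}{4} \lambda_1$. Then by property (i) of Lemma 6.4.8,
\[ \lambda_1 \le 2 \|(\lambda_2 - \lambda_1)\mathbf{e} + \mathbf{u}_2' - \mathbf{u}_1'\| + \log 4 \le 2(\lambda_2 - \lambda_1 + \theta\lambda_2 + \theta\lambda_1) + \log 4 < \left( \tfrac{1}{2} + \tfrac{9}{2}\theta \right) \lambda_1 + \log 4. \]
Hence $\lambda_1 < \frac{\log 16}{1 - 9\theta}$. This implies (i).
Next, assume that $\lambda_2 \ge 10\theta^{-1} \lambda_1$. Let $N$ be the positive integer with $2N + 1 \le \lambda_2/\lambda_1 < 2N + 3$, and let $M \in \{N, N + 1\}$ be the integer from Lemma 6.4.8 (ii). Thus, $|\lambda_2 - (2M + 1)\lambda_1| \le 2\lambda_1$, and moreover,
\[ M > \frac{4}{\theta}. \]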
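It follows that
\[ \begin{aligned} \lambda_1 &\le \frac{2}{M+1} \|(\lambda_2 - (2M + 1)\lambda_1)\mathbf{e} + \mathbf{u}_2' - (2M + 1)\mathbf{u}_1'\| + \log 64 \\ &\le \frac{2}{M+1} (2\lambda_1 + \lambda_2\theta + (2M + 1)\lambda_1\theta) + \log 64 \\ &\le \frac{2}{M+1} (2 + (2M + 3)\theta + (2M + 1)\theta)\lambda_1 + \log 64 \\ &= \left( \frac{4}{M+1} + 8\theta \right) \lambda_1 + \log 64 < 9\theta\lambda_1 + \log 64, \end{aligned} \]
implying $\lambda_1 < \frac{\log 64}{1 - 9\theta}$. This proves (ii).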
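We next prove (iii). We first consider the points $\mathbf{u} \in \mathcal{S}_{\mathbf{e}}$ with
\[ \frac{\log 16}{1 - 9\theta} \le \|\mathbf{u}\| < \frac{\log 64}{1 - 9\theta}. \tag{6.4.13} \]
Let $\mathbf{u}_1, \mathbf{u}_2, \ldots$ be these points, ordered such that $\|\mathbf{u}_1\| \le \|\mathbf{u}_2\| \le \cdots$. Then by (i), we have for the $n$-th point in this sequence that $\|\mathbf{u}_n\| \ge (5/4)^{n-1} \|\mathbf{u}_1\|$. Hence $(5/4)^{n-1} < (\log 64)/(\log 16) = 3/2$, implying $n \le 2$. So $\mathcal{S}_{\mathbf{e}}$ has at most two points $\mathbf{u}$ with (6.4.13).
Next, we count the points $\mathbf{u} \in \mathcal{S}_{\mathbf{e}}$ with
\[ \|\mathbf{u}\| \ge \frac{\log 64}{1 - 9\theta}. \tag{6.4.14} \]
Similarly as above, we order these points in a sequence $\mathbf{u}_1, \mathbf{u}_2, \ldots$ such that $\|\mathbf{u}_1\| \le \|\mathbf{u}_2\| \le \cdots$. Then again, by (i), we have for the $n$-th point in this sequence that $\|\mathbf{u}_n\| \ge (5/4)^{n-1} \|\mathbf{u}_1\|$. On the other hand, by (ii) we have $\|\mathbf{u}_n\| < 10\theta^{-1} \|\mathbf{u}_1\|$. Hence $(5/4)^{n-1} < 10\theta^{-1}$. Thus, we obtain an upper bound
\[ 1 + [\log(10\theta^{-1}) / \log(5/4)] \]
for the number of $\mathbf{u} \in \mathcal{S}_{\mathbf{e}}$ with (6.4.14). Combined with the upper bound 2 for the number of points with (6.4.13), this gives (iii).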
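Proof of Theorem 6.4.1. Let $0 < \theta < \frac{1}{9}$. We divide $\mathcal{S}$ into large points, i.e., with $\|\mathbf{u}\| \ge \frac{\log 16}{1 - 9\theta}$, and small points, i.e., with $\|\mathbf{u}\| < \frac{\log 16}{1 - 9\theta}$.
Combining the upper bound (6.4.10) for $|E|$ with (iii) of Lemma 6.4.9, we see that $\mathcal{S}$ has at most
\[ \left( 3 + \left[ \frac{\log(10\theta^{-1})}{\log(5/4)} \right] \right) \cdot (1 + 2\theta^{-1})^r \]
large points.
To estimate the number of small points of $\mathcal{S}$, we observe that by Lemma 6.4.8 (iii), for any $\mathbf{u}_0 \in \mathcal{S}$, the set
\[ \{\mathbf{u} \in \mathcal{S} : \|\mathbf{u} - \mathbf{u}_0\| \le 0.045\} \]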
has cardinality at most 2. By Lemma 6.3.4, the set of small points of $\mathcal{S}$ can be covered by at most
\[ \left( 1 + \frac{2 \log 16}{0.045(1 - 9\theta)} \right)^r \]
such sets. Hence $\mathcal{S}$ has at most
\[ 2 \left( 1 + \frac{2 \log 16}{0.045(1 - 9\theta)} \right)^r \]
small points.
We now choose $\theta$ such that $\theta^{-1} = (\log 16) / (0.045(1 - 9\theta))$, i.e., $\theta^{-1} = 9 + (\log 16)/0.045 = 70.613\ldots$, add the upper bounds for the number of large points and the number of small points of $\mathcal{S}$ obtained above, and finally multiply with 2 to get an upper bound for the number of solutions of (6.4.1). The resulting bound is $68 \times 143^r$, which is smaller than the bound stated in Theorem 6.4.1.
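The arithmetic behind the constants 68 and 143 can be retraced as follows. The Python sketch below, with variable names of our own choosing, computes $\theta^{-1}$, checks that the two bases $1 + 2\theta^{-1}$ and $1 + 2\log 16 / (0.045(1 - 9\theta))$ coincide for this choice of $\theta$, and assembles the factor $2 \cdot (32 + 2)$.

```python
from math import ceil, floor, log


def theorem_641_constants():
    """Return (theta^-1, base for large points, base for small points, leading factor)."""
    theta_inv = 9 + log(16) / 0.045
    theta = 1 / theta_inv
    cone_base = 1 + 2 * theta_inv                              # from (6.4.10)
    ball_base = 1 + 2 * log(16) / (0.045 * (1 - 9 * theta))    # covering of small points
    large_per_cone = 3 + floor(log(10 * theta_inv) / log(5 / 4))
    small_per_ball = 2
    # Factor 2 from Lemma 6.4.7: at most two solutions per point of the image set.
    factor = 2 * (large_per_cone + small_per_ball)
    return theta_inv, cone_base, ball_base, factor


theta_inv, cone_base, ball_base, factor = theorem_641_constants()
```

With these values `factor` equals 68 and `ceil(cone_base)` equals 143, and $68 \times 143^r \le 2^{8(r+1)}$ holds for every $r \ge 0$ since $68 \le 256$ and $143 \le 256$.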
6.5 Proof of Theorem 6.1.6
Let $K$ be a field of characteristic 0, $a_1, a_2 \in K^*$, and $\Gamma$ a subgroup of $K^* \times K^*$ of finite rank $r$. We consider equations
\[ a_1 x_1 + a_2 x_2 = 1 \quad \text{in } (x_1, x_2) \in \Gamma \tag{6.1.6} \]
having at least three distinct solutions, and we have to show that there are at most $B(r)$ possibilities for the $\Gamma$-equivalence class of $(a_1, a_2)$, where $B(r)$ is given by (6.1.7).
Thus, let $(u_1, u_2)$, $(v_1, v_2)$, $(w_1, w_2)$ be three distinct solutions of (6.1.6). Then
\[ \begin{vmatrix} 1 & u_1 & u_2 \\ 1 & v_1 & v_2 \\ 1 & w_1 & w_2 \end{vmatrix} = 0, \]
i.e.
\[ v_1 w_2 - v_2 w_1 + u_2 w_1 - u_1 w_2 + u_1 v_2 - u_2 v_1 = 0 \tag{6.5.1} \]
and
\[ \text{each } 2 \times 2 \text{ subdeterminant of the above determinant is } \neq 0. \tag{6.5.2} \]
A vanishing subsum of the left-hand side of (6.5.1) is called minimal if none of the proper subsums of this subsum is 0. We have to distinguish various cases depending on how (6.5.1) splits into minimal vanishing subsums. Clearly, two
possible splittings that can be transformed into each other by permuting $(u_1, u_2)$, $(v_1, v_2)$, $(w_1, w_2)$ or interchanging the indices $(1, 2)$ can be treated in the same manner. Notice that, in this way, one can derive at most 12 splittings from a given splitting. More precisely, after permuting $(u_1, u_2)$, $(v_1, v_2)$, $(w_1, w_2)$ or interchanging $(1, 2)$, we are left with the following cases:
(I) no proper subsum of the left-hand side of (6.5.1) vanishes,
(II) $v_1 w_2 + u_2 w_1 = 0$, $\ -v_2 w_1 - u_1 w_2 + u_1 v_2 - u_2 v_1 = 0$,
(III) $v_1 w_2 - v_2 w_1 + u_2 w_1 = 0$, $\ -u_1 w_2 + u_1 v_2 - u_2 v_1 = 0$,
(IV) $v_1 w_2 + u_2 w_1 + u_1 v_2 = 0$, $\ -v_2 w_1 - u_1 w_2 - u_2 v_1 = 0$,
(V) $v_1 w_2 + u_2 w_1 - u_2 v_1 = 0$, $\ -v_2 w_1 - u_1 w_2 + u_1 v_2 = 0$.
Other splittings into minimal vanishing subsums are in conflict with (6.5.2).
We define the quantities $y_1 := v_1/u_1$, $y_2 := v_2/u_2$, $z_1 := w_1/u_1$, $z_2 := w_2/u_2$. The pair $(y_1, y_2)$ determines uniquely the $\Gamma$-equivalence class of $(a_1, a_2)$, since $(a_1 u_1, a_2 u_2)$ is the unique solution $(\xi_1, \xi_2)$ to $\xi_1 + \xi_2 = 1$, $y_1 \xi_1 + y_2 \xi_2 = 1$. Likewise, $(z_1, z_2)$ determines uniquely the $\Gamma$-equivalence class of $(a_1, a_2)$.
Case I. We rewrite (6.5.1) (by dividing by $u_2 v_1$) as
\[ z_2 - \frac{y_2 z_1}{y_1} + \frac{z_1}{y_1} - \frac{z_2}{y_1} + \frac{y_2}{y_1} = 1. \]
Let $\Gamma'$ be the image of $\Gamma \times \Gamma$ under the group homomorphism
\[ ((y_1, y_2), (z_1, z_2)) \mapsto \left( z_2, \frac{y_2 z_1}{y_1}, \frac{z_1}{y_1}, \frac{z_2}{y_1}, \frac{y_2}{y_1} \right). \]
Then $\Gamma'$ has rank at most $2r$, and $(z_2, \ldots, \frac{y_2}{y_1})$ is a non-degenerate solution of
\[ x_1 - x_2 + x_3 - x_4 + x_5 = 1 \quad \text{in } (x_1, \ldots, x_5) \in \Gamma'. \]
By Theorem 6.1.3 there are at most $A(5, 2r)$ possibilities for $(z_2, \ldots, \frac{y_2}{y_1})$. Such a tuple determines $z_2$ and $\frac{y_2 z_1}{y_1} \left( \frac{y_2}{y_1} \right)^{-1} = z_1$, hence the $\Gamma$-equivalence class of $(a_1, a_2)$. So case I gives rise to at most
\[ A(5, 2r) \]
possible $\Gamma$-equivalence classes of pairs $(a_1, a_2)$.
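The normalisation used in Case I can be checked on a concrete example. The Python sketch below takes, purely for illustration, the coefficients $a_1 = 2$, $a_2 = -3$ and three rational points on the line $a_1 x_1 + a_2 x_2 = 1$ (the group $\Gamma$ plays no role here), verifies (6.5.1), and evaluates the five-term relation in $y_1, y_2, z_1, z_2$. All names are ours.

```python
from fractions import Fraction


def second_coordinate(a1, a2, x1):
    """x2 with a1*x1 + a2*x2 = 1."""
    return (1 - a1 * x1) / a2


def lhs_651(u, v, w):
    """Left-hand side of (6.5.1)."""
    (u1, u2), (v1, v2), (w1, w2) = u, v, w
    return v1 * w2 - v2 * w1 + u2 * w1 - u1 * w2 + u1 * v2 - u2 * v1


def case_one_lhs(u, v, w):
    """z2 - y2 z1/y1 + z1/y1 - z2/y1 + y2/y1 with y_i = v_i/u_i, z_i = w_i/u_i."""
    y1, y2 = v[0] / u[0], v[1] / u[1]
    z1, z2 = w[0] / u[0], w[1] / u[1]
    return z2 - y2 * z1 / y1 + z1 / y1 - z2 / y1 + y2 / y1


a1, a2 = Fraction(2), Fraction(-3)
# Three points on the line: (2, 1), (5, 3), (-1, -1).
u, v, w = [(Fraction(t), second_coordinate(a1, a2, Fraction(t))) for t in (2, 5, -1)]
xi = (a1 * u[0], a2 * u[1])   # the pair (a1 u1, a2 u2)
```

In this example `lhs_651(u, v, w)` is 0, `case_one_lhs(u, v, w)` is 1, and `xi = (4, -3)` satisfies both $\xi_1 + \xi_2 = 1$ and $y_1 \xi_1 + y_2 \xi_2 = 1$ with $(y_1, y_2) = (5/2, 3)$.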
Case II. This implies
\[ \frac{y_1 z_2}{z_1} = -1, \qquad -\frac{y_2 z_1}{y_1} - \frac{z_2}{y_1} + \frac{y_2}{y_1} = 1. \]
Theorem 6.1.3 implies that we have at most $A(3, 2r)$ possibilities for the triple $\left( -\frac{y_2 z_1}{y_1}, \frac{y_2}{y_1}, \frac{z_2}{y_1} \right)$ (using again the argument based on a homomorphic image of $\Gamma \times \Gamma$). In combination with $\frac{y_1 z_2}{z_1} = -1$ this tuple determines $\frac{y_1 z_2}{z_1} \cdot \frac{y_2 z_1}{y_1} \cdot \frac{y_1}{z_2} = y_1 y_2$, hence $y_1^2 = y_1 y_2 \cdot y_1/y_2$. This leads to two possibilities for $(y_1, y_2)$, hence two possible $\Gamma$-equivalence classes for $(a_1, a_2)$. So case II gives rise to at most
\[ 2A(3, 2r) \]
possible $\Gamma$-equivalence classes of pairs $(a_1, a_2)$.
Case III. This implies
\[ -\frac{y_1 z_2}{z_1} + y_2 = 1, \qquad -\frac{z_2}{y_1} + \frac{y_2}{y_1} = 1. \]
By Theorem 6.1.3 we have at most $A(2, 2r)^2$ possibilities for $\left( \frac{y_1 z_2}{z_1}, y_2, \frac{z_2}{y_1}, \frac{y_2}{y_1} \right)$. Each such tuple determines uniquely the pair $(y_1, y_2)$, hence the $\Gamma$-equivalence class of $(a_1, a_2)$. So case III gives rise to at most
\[ A(2, 2r)^2 \]
possible $\Gamma$-equivalence classes of pairs $(a_1, a_2)$.
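Case IV. This implies
\[ -\frac{y_1 z_2}{y_2} - \frac{z_1}{y_2} = 1, \qquad -\frac{y_2 z_1}{y_1} - \frac{z_2}{y_1} = 1. \]
According to Theorem 6.1.3, there are at most $A(2, 2r)^2$ possibilities for the tuple $\left( \frac{y_1 z_2}{y_2}, \frac{z_1}{y_2}, \frac{y_2 z_1}{y_1}, \frac{z_2}{y_1} \right)$. From this tuple we can compute
\[ \frac{y_2}{y_1 z_2} \cdot \frac{y_2}{z_1} \cdot \frac{y_2 z_1}{y_1} \cdot \frac{z_2}{y_1} = \left( \frac{y_2}{y_1} \right)^3. \]
This gives three possibilities for $\frac{y_2}{y_1}$, hence for $(z_1, z_2)$, and hence for the $\Gamma$-equivalence class of $(a_1, a_2)$. Thus in case IV there are at most
\[ 3A(2, 2r)^2 \]
possible $\Gamma$-equivalence classes of pairs $(a_1, a_2)$.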
Case V. This implies
\[ z_2 + \frac{z_1}{y_1} = 1, \qquad z_1 + \frac{z_2}{y_2} = 1. \]
Theorem 6.1.3 implies that we have at most $A(2, 2r)^2$ possibilities for the tuple $\left( z_2, \frac{z_1}{y_1}, z_1, \frac{z_2}{y_2} \right)$, hence for $(z_1, z_2)$, and hence for the $\Gamma$-equivalence class of $(a_1, a_2)$. So in case V we have at most
\[ A(2, 2r)^2 \]
possibilities for the $\Gamma$-equivalence class of $(a_1, a_2)$.
By adding the upper bounds for the numbers of possible $\Gamma$-equivalence classes of pairs $(a_1, a_2)$ found in cases I–V, and multiplying this with the number of permutations of $(u_1, u_2)$, $(v_1, v_2)$, $(w_1, w_2)$ and of the indices 1, 2, we obtain that the number of $\Gamma$-equivalence classes of pairs $(a_1, a_2) \in K^* \times K^*$ such that equation (6.1.6) has more than two solutions is at most
\[ 12 \left( A(5, 2r) + 2A(3, 2r) + 5A(2, 2r)^2 \right) = B(r). \]
This proves Theorem 6.1.6.
6.6 Proofs of Theorems 6.1.7 and 6.1.8
Proof of Theorem 6.1.7. We follow Konyagin and Soundararajan (2007). Recall that we are considering the equation
\[ x_1 + x_2 = 1 \quad \text{in } x_1, x_2 \in \left\{ \pm p_1^{z_1} \cdots p_t^{z_t} : z_1, \ldots, z_t \in \mathbb{Z} \right\}, \tag{6.1.8} \]
where $S = \{p_1, \ldots, p_t\}$ is a set of distinct primes. We intend to show that for every $\beta < 2 - \sqrt{2}$ there are sets of primes $S$ of arbitrarily large cardinality $t$ such that the number $N(S)$ of solutions of (6.1.8) is at least $\exp(t^{\beta})$.
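For a small set of primes the solutions of (6.1.8) can be enumerated directly. The Python sketch below searches over exponents bounded by a parameter `bound`, which is our own device: in general such a search only yields a lower bound for $N(S)$, since it cannot exclude solutions with larger exponents. For $S = \{2, 3\}$ all solutions come from the relations $1 + 1 = 2$, $1 + 2 = 3$, $1 + 3 = 4$ and $1 + 8 = 9$, so a bound of 4 already captures them.

```python
from fractions import Fraction
from itertools import product


def s_units(primes, bound):
    """All numbers +-p1^z1 ... pt^zt with |z_i| <= bound, as exact fractions."""
    units = set()
    for exps in product(range(-bound, bound + 1), repeat=len(primes)):
        value = Fraction(1)
        for p, e in zip(primes, exps):
            value *= Fraction(p) ** e
        units.add(value)
        units.add(-value)
    return units


def count_solutions(primes, bound):
    """Number of pairs (x1, x2) of bounded S-units with x1 + x2 = 1."""
    units = s_units(primes, bound)
    return sum(1 for x1 in units if (1 - x1) in units)
```

Calling `count_solutions((2, 3), 4)` returns 21.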
Let $y$ be a large real number and fix real numbers $\beta, \gamma$ with $0 < \beta, \gamma < 1$, which will be chosen optimally later. We introduce two sets $\mathcal{L}$, $\mathcal{M}$. The set $\mathcal{L}$ is the set of numbers that are the product of exactly $[y^{\beta}]$ distinct primes from the interval $[y/2, y]$, while $\mathcal{M}$ is the set of numbers that are the product of exactly $[\gamma y^{\beta}]$ distinct primes from $[y/4, y/2)$. Thus, the integers in $\mathcal{L}$ are coprime to those in $\mathcal{M}$.
Using the Prime Number Theorem and
\[ \frac{\log \binom{a}{b}}{b \log(a/b)} \to 1 \quad \text{as } b, \frac{a}{b} \to \infty \]
(which follows from Stirling's Formula) we obtain that the set $\mathcal{L}$ has cardinality
\[ |\mathcal{L}| = \binom{\pi(y) - \pi(y/2)}{[y^{\beta}]} = L^{1-\beta+o(1)}, \tag{6.6.1} \]
where $L := y^{[y^{\beta}]}$ and here and below, $o(1)$ is used to denote functions of $y$ that tend to 0 as $y \to \infty$. In a similar manner,
\[ |\mathcal{M}| = L^{\gamma(1-\beta)+o(1)}. \tag{6.6.2} \]
The idea is to find a positive integer $u$ for which there are many triples $(l_1, l_2, m)$ such that $l_1, l_2 \in \mathcal{L}$, $m \in \mathcal{M}$ and $(l_1 - l_2)/m = u$, and then take for $S$ the set
of primes in $[y/4, y]$ and those dividing $u$. Then the pairs $(um/l_1, l_2/l_1)$ yield many solutions to (6.1.8).
We first count the number of triples $(l_1, l_2, m)$ with
\[ l_1 \equiv l_2 \pmod{m}, \quad l_1, l_2 \in \mathcal{L}, \quad m \in \mathcal{M}, \quad l_1 > l_2. \tag{6.6.3} \]
Let $m \in \mathcal{M}$. For $a \in \mathbb{Z}$, denote by $r(\mathcal{L}, a, m)$ the number of elements in $\mathcal{L}$ that lie in the residue class $a \pmod{m}$. By the Cauchy–Schwarz inequality,
\[ \sum_{a=1}^{m} r(\mathcal{L}, a, m)^2 \ge \frac{1}{m} \left( \sum_{a=1}^{m} r(\mathcal{L}, a, m) \right)^2 = \frac{|\mathcal{L}|^2}{m}. \]
The left-hand side counts the pairs $(l_1, l_2)$ in $\mathcal{L}$ that are congruent modulo $m$. Among these, there are $|\mathcal{L}|$ trivial solutions with $l_1 = l_2$. Note that $m \le y^{[\gamma y^{\beta}]} \le L^{\gamma}$. We assume that $\gamma < 1 - \beta$. Then in view of (6.6.1), the integer $m$ is of smaller order of magnitude than $|\mathcal{L}|$. Deleting the pairs with $l_1 = l_2$ and using symmetry, we infer that for any fixed $m \in \mathcal{M}$, the number of pairs $l_1, l_2 \in \mathcal{L}$ with $l_1 > l_2$ that are congruent modulo $m$ is bounded below by
\[ \frac{|\mathcal{L}|^2}{2m} - |\mathcal{L}| = L^{2(1-\beta)-\gamma+o(1)}. \]
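The counting step via Cauchy–Schwarz is easy to illustrate on a toy example. In the Python sketch below an arbitrary list of distinct integers stands in for the set of products of primes (an assumption made only to keep the example small), and the function names are ours.

```python
from collections import Counter


def congruent_pairs(elements, m):
    """Number of ordered pairs (l1, l2) from `elements` with l1 = l2 (mod m).

    This is the sum over residue classes a of r(a)^2, where r(a) counts
    the elements in class a.
    """
    counts = Counter(l % m for l in elements)
    return sum(r * r for r in counts.values())


def strict_pairs_and_bound(elements, m):
    """Exact number of pairs l1 > l2 congruent mod m, and the bound |L|^2/(2m) - |L|."""
    n = len(elements)
    exact = (congruent_pairs(elements, m) - n) // 2   # drop l1 = l2, then use symmetry
    return exact, n * n / (2 * m) - n


elements = list(range(100, 200, 3))   # 34 distinct integers
exact, bound = strict_pairs_and_bound(elements, 7)
```

The inequality `congruent_pairs(elements, m) >= len(elements) ** 2 / m` holds for every modulus, and the exact count of pairs with $l_1 > l_2$ is never smaller than the stated lower bound.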
Now, using (6.6.2), and summing over the elements of $\mathcal{M}$, we see that the number of triples $(l_1, l_2, m)$ with (6.6.3) is at least
\[ L^{2(1-\beta)-\beta\gamma+o(1)}. \]
The elements of $\mathcal{L}$ are all $\le y^{[y^{\beta}]} = L$, and the integers of $\mathcal{M}$ are all $\ge (y/4)^{[\gamma y^{\beta}]}$, so the integers $(l_1 - l_2)/m$ with $l_1, l_2, m$ satisfying (6.6.3), are all bounded above by $L^{1-\gamma+o(1)}$. If we assume that $2(1 - \beta) - \beta\gamma > 1 - \gamma$, or equivalently,
\[ (2 + \gamma)(1 - \beta) > 1, \]
we see that there is a positive integer $u \le L^{1-\gamma+o(1)}$ with the property that there are at least
\[ L^{2(1-\beta)-\beta\gamma-1+\gamma+o(1)} = L^{(2+\gamma)(1-\beta)-1+o(1)} \]
triples $(l_1, l_2, m)$ with $l_1, l_2 \in \mathcal{L}$, $m \in \mathcal{M}$ and $(l_1 - l_2)/m = u$. Notice that $\gcd(l_1, l_2)$ is a divisor of $u$. By elementary number theory (see, e.g., Hardy and Wright (1980), chapter XVIII, Theorem 317), the number $u$ has at most $u^{O(1/\log\log u)} = L^{o(1)}$ divisors. Hence there are a divisor $v$ of $u$, and at least $L^{(2+\gamma)(1-\beta)-1+o(1)}$ triples $(l_1, l_2, m)$ such that $l_1, l_2$ are coprime integers composed of primes from $[y/2, y]$, $m$ is an integer composed of primes from $[y/4, y)$, and $(l_1 - l_2)/m = v$. Let $S$ consist of the primes from $[y/4, y]$ and the
primes dividing $v$. Then these triples $(l_1, l_2, m)$ yield at least $L^{(2+\gamma)(1-\beta)-1+o(1)}$ solutions $(mv/l_1, l_2/l_1)$ to (6.1.8).
In the course of the proof we assumed that $\gamma < 1 - \beta$ and $(2 + \gamma)(1 - \beta) > 1$. Such a number $\gamma$ exists precisely if $\beta < 2 - \sqrt{2}$. Since $v \le u \le L^{1-\gamma+o(1)}$, the cardinality of $S$ is at most $\pi(y) - \pi(y/4) + \log v < y$ for $y$ sufficiently large. Further, for $y$ sufficiently large, we have
\[ L^{(2+\gamma)(1-\beta)-1+o(1)} = y^{[y^{\beta}]((2+\gamma)(1-\beta)-1+o(1))} > \exp(y^{\beta}). \]
This completes the proof of Theorem 6.1.7.
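The threshold $2 - \sqrt{2}$ comes from requiring that the two conditions on $\gamma$ be compatible: $(2 + \gamma)(1 - \beta) > 1$ means $\gamma > 1/(1 - \beta) - 2$, and this must leave room below $1 - \beta$. A short Python sketch (with a function name of our own, and assuming $0 < \beta < 1$) makes the admissible interval explicit.

```python
from math import sqrt


def admissible_gamma(beta):
    """Open interval of gamma in (0, 1) with gamma < 1 - beta and
    (2 + gamma)(1 - beta) > 1, or None if no such gamma exists.
    Assumes 0 < beta < 1."""
    lower = max(0.0, 1 / (1 - beta) - 2)
    upper = 1 - beta
    return (lower, upper) if lower < upper else None


threshold = 2 - sqrt(2)   # approximately 0.5858
```

For example, `admissible_gamma(0.58)` returns a non-empty interval, while `admissible_gamma(0.59)` returns `None`.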
Proof of Theorem 6.1.8. Let $\beta < 2 - \sqrt{2}$ and choose $t$ and a set of primes $S = \{p_1, \ldots, p_t\}$ such that $N(S) \ge \exp(t^{\beta}) =: A(t)$. We consider
\[ x_1 + \cdots + x_n = 1 \quad \text{in } x_1, \ldots, x_n \in \left\{ \pm p_1^{z_1} \cdots p_t^{z_t} : z_1, \ldots, z_t \in \mathbb{Z} \right\}. \tag{6.1.9} \]
Recall that $g(n, S)$ denotes the minimal integer $g$ such that there exists a non-zero polynomial $P \in \mathbb{C}[X_1, \ldots, X_n]$ of total degree $g$, which is not divisible by $X_1 + \cdots + X_n - 1$, and which has the property that $P(x_1, \ldots, x_n) = 0$ for every solution $(x_1, \ldots, x_n)$ of (6.1.9). We prove by induction on $n$ that $g(n, S) \ge A(t)$ for $n \ge 2$.
First, let n = 2. Let P ∈ C[X1 , X2 ] be a polynomial of total degree g(2, S),
not divisible by X1 + X2 − 1, such that P (x1 , x2 ) = 0 for every solution (x1 , x2 )
of (6.1.8). Let Q(X) := P (X, 1 − X). Then Q(x1 ) = 0 for every x1 for which
there exists x2 such that (x1 , x2 ) is a solution of (6.1.8). Hence
g(2, S) = deg P = deg Q ≥ A(t).
Suppose now that $n \ge 3$, and that $g(n - 1, S) \ge A(t)$ is known to hold. Let $U$ be the set of tuples
\[ (x_1, \ldots, x_n) = (y_1, \ldots, y_{n-2}, y_{n-1} x_1, y_{n-1} x_2), \]
where $(y_1, \ldots, y_{n-1})$ runs through the solutions of
\[ y_1 + \cdots + y_{n-1} = 1, \quad y_1, \ldots, y_{n-1} \in \left\{ \pm p_1^{z_1} \cdots p_t^{z_t} : z_1, \ldots, z_t \in \mathbb{Z} \right\} \tag{6.6.4} \]
and where $(x_1, x_2)$ runs through the solutions of (6.1.8). By construction, the tuples in $U$ satisfy
\[ y_1 + \cdots + y_{n-2} + y_{n-1}(x_1 + x_2) = 1 \]
and so they are solutions of (6.1.9).
Let $P \in \mathbb{C}[X_1, \ldots, X_n]$ be a polynomial of total degree $g(n, S)$, not divisible by $X_1 + \cdots + X_n - 1$, such that $P(x_1, \ldots, x_{n-1}, x_n) = 0$ for every solution
$(x_1, \ldots, x_n)$ of (6.1.9). Put
\[ Q(X_1, \ldots, X_{n-1}) := P(X_1, \ldots, X_{n-1}, 1 - X_1 - \cdots - X_{n-1}). \]
Then $Q$ has total degree $g(n, S)$, and is not identically 0. So we have to prove that $Q$ has total degree at least $A(t)$.
Clearly, we have
\[ Q(y_1, \ldots, y_{n-2}, y_{n-1} x_1) = 0 \tag{6.6.5} \]
for every solution $(y_1, \ldots, y_{n-1})$ of (6.6.4) and every solution $(x_1, x_2)$ of (6.1.8). Define a new polynomial in $n - 1$ variables,
\[ Q^*(Y_1, \ldots, Y_{n-2}, Z) := Q(Y_1, \ldots, Y_{n-2}, Z \cdot (1 - Y_1 - \cdots - Y_{n-2})). \tag{6.6.6} \]
Then $Q^*$ is not identically zero since $Q$ is not identically zero and since the change of variables
\[ (X_1, \ldots, X_{n-1}) \mapsto (Y_1, \ldots, Y_{n-2}, Z \cdot (1 - Y_1 - \cdots - Y_{n-2})) \]
is invertible. Now from (6.6.6), (6.6.5), it follows that
\[ Q^*(y_1, \ldots, y_{n-2}, x_1) = 0 \tag{6.6.7} \]
for every solution $(y_1, \ldots, y_{n-1})$ of (6.6.4) and every solution $(x_1, x_2)$ of (6.1.8). We distinguish two cases.
Case I. There is a solution $(x_1, x_2)$ of (6.1.8) such that the polynomial
\[ Q^*_{x_1}(Y_1, \ldots, Y_{n-2}) := Q^*(Y_1, \ldots, Y_{n-2}, x_1) \]
is not identically zero. Then by (6.6.7), $Q^*_{x_1}$ is a non-zero polynomial with $Q^*_{x_1}(y_1, \ldots, y_{n-2}) = 0$ for every solution $(y_1, \ldots, y_{n-1})$ of (6.6.4). Hence $Q^*_{x_1}$ has total degree
\[ \ge g(n - 1, S) \ge A(t). \]
Now by (6.6.6) this implies that the total degree of $Q$ is at least $A(t)$.
Case II. The polynomial $Q^*_{x_1}(Y_1, \ldots, Y_{n-2})$ is identically zero for every solution $(x_1, x_2)$ of (6.1.8). Then since (6.1.8) has at least $A(t)$ solutions, the polynomial $Q^*$ must have degree at least $A(t)$ in the variable $Z$. By (6.6.6) this implies that $Q$ has degree at least $A(t)$ in the variable $X_{n-1}$. So again we conclude that the total degree of $Q$ is at least $A(t)$. This completes our induction step and our proof.
6.7 Notes
We recall some history concerning the number of solutions of unit equations
and discuss some related results.
• Lewis and Mahler (1961) obtained an explicit upper bound for the number
of solutions of the S-unit equation over Q,
\[ x_1 + x_2 = 1 \quad \text{in } x_1, x_2 \in \mathbb{Z}_S^*, \tag{6.7.1} \]
where $S = \{\infty, p_1, \ldots, p_t\}$ with distinct primes $p_1, \ldots, p_t$, and
$\mathbb{Z}_S^* = \{\pm p_1^{z_1} \cdots p_t^{z_t} : z_i \in \mathbb{Z}\}$ is the corresponding group of S-units. However, their
bound depends on $p_1, \ldots, p_t$. Lewis and Mahler derived it by applying a general result of theirs on Thue–Mahler equations to equations of
the type $|ax^n + by^n| = p_1^{z_1} \cdots p_t^{z_t}$ in $x, y, z_1, \ldots, z_t \in \mathbb{Z}$. In fact, although
Lewis and Mahler did not notice this, applying their general result instead to
$|xy(x + y)| = p_1^{z_1} \cdots p_t^{z_t}$ yields an upper bound $c^{t+1}$ for the number of
solutions of (6.7.1), with $c$ an absolute constant independent of $p_1, \ldots, p_t$.
A similar result was independently obtained by Silverman around 1984, by
a different method (unpublished).
The above result was generalized and improved by Evertse (1984a) as
follows. Let K be an algebraic number field of degree d, S a finite set of
places of K of cardinality s containing the infinite places, and a1 , a2 ∈ K ∗ .
Then the equation
\[ a_1 x_1 + a_2 x_2 = 1 \quad \text{in } x_1, x_2 \in O_S^* \]
has at most $3 \times 7^{d+2s}$ solutions. We note that earlier Győry (1979), under
certain assumptions concerning the S-norms of a1 and a2 , obtained the better
upper bound 4s + 1 by means of the theory of logarithmic forms. These were
the first upper bounds that depend only on d and s, but not on the coefficients
a1 , a2 .
• Schlickewei considered the equation
\[ a_1 x_1 + a_2 x_2 = 1 \quad \text{in } (x_1, x_2) \in \Gamma, \tag{6.1.6} \]
where a1 , a2 are non-zero elements of an arbitrary field K of characteristic 0,
and Γ is a subgroup of K ∗ × K ∗ of finite rank r. He derived a uniform upper
bound for the number of solutions, depending only on r. His unpublished
result was later improved by Beukers and Schlickewei (1996) who obtained
the upper bound $2^{8(r+2)}$ for the number of solutions.
• First Poe (1997) alone in a special case, and then Poe together with Bombieri
and Mueller (Bombieri, Mueller and Poe (1997)) developed a “cluster principle” for the solutions of (6.1.6), in the case that K is a number field and
Γ is again a subgroup of K ∗ × K ∗ of rank r. Here, a cluster is a set of solutions
such that for any two solutions $x_1, x_2$ in the cluster, the height $h(x_1 x_2^{-1})$ is
small, and the cluster principle gives an upper bound for the number of such
clusters. By combining this cluster principle with Baker-type upper bounds
for the heights of the solutions of (6.1.6), Bombieri et al. proved that (6.1.6)
2
has at most d 9r e86r solutions, where d = [K : Q]. Although this bound is
much larger than that of Beukers and Schlickewei, the method of proof is
very different, and it may be applicable to other situations.
• In the special case when K is a number field and $\Gamma = O_S^* \times O_S^*$, where $O_S^*$ is
the group of S-units for some finite set of places S of K containing the infinite
places, a weaker but effective version of Theorem 6.1.6 was established in
Evertse, Győry, Stewart and Tijdeman (1988a). Using some earlier versions
of Theorems 3.2.5, 3.2.7 and Corollary 4.1.5, due to Baker, van der Poorten
and Győry, respectively, it was proved that apart from finitely many and
effectively determinable OS∗ -equivalence classes of pairs (a1 , a2 ) ∈ K ∗ ×
K ∗ , the equation a1 x1 + a2 x2 = 1 has at most s + 1 solutions (x1 , x2 ) ∈
OS∗ × OS∗ , where s denotes the cardinality of S. Further, in the case when S
is the set of infinite places of K and a1 , a2 ∈ Q∗ , the following more precise
result was obtained in Brindza and Győry (1990). For given coprime positive
integers a1 , a2 , there are only finitely many and effectively determinable
positive integers c such that the equation a1 x1 + a2 x2 = c has more than one
solution (up to conjugacy) in x1 , x2 ∈ OK∗ . The proof utilizes a simultaneous
variant of Baker’s method.
• Corvaja and Zannier (2006), and in greater generality Levin (2006),
considered one-parameter families of S-unit equations
a1 (t)x1 + a2 (t)x2 = c(t) in t ∈ K, x1 , x2 ∈ OS∗ ,
(6.7.2)
where as before K is a number field, S a finite set of places of K containing
all infinite places, and where a1 , a2 , c ∈ K[X] are given polynomials. In his
paper, Levin proved, among other things, that if (a1 , a2 , c) is a general triple of
non-constant polynomials with deg a1 + deg a2 = deg c > 2, then (6.7.2) has
only finitely many solutions with a1 (t)a2 (t)c(t) ≠ 0. Here “general” means
that if we view triples (a1 , a2 , c) as points in the affine space $K^{2 \deg c + 3}$, then
the set of triples (a1 , a2 , c) for which the above mentioned finiteness result
does not hold is contained in a proper Zariski closed subset of $K^{2 \deg c + 3}$.
Levin’s proof allows us to effectively determine this Zariski closed subset.
Notice that Levin’s result provides many examples of S-unit equations that
have no solutions. In his proof, Levin heavily uses the finiteness results
of Corvaja and Zannier (2002a, 2004b) on S-integral points on curves and
surfaces, which they derived from the Subspace Theorem. As a consequence,
Levin’s finiteness result on (6.7.2) is ineffective.
• As was mentioned in Section 4.7, Győry and Pintér (2008) considered over
Q the three-parameter family of S-unit equations
\[ u_1^n x_1 + u_2^n x_2 = 1 \quad \text{in } u_1, u_2 \in \mathbb{Z} \setminus \{0\},\ n \ge 3,\ x_1, x_2 \in \mathbb{Z}_S^* \ \text{with } \gcd(u_1 u_2, p_1 \cdots p_t) = 1, \tag{6.7.3} \]
where p1 , . . . , pt are distinct rational primes, S = {∞, p1 , . . . , pt } and Z∗S
denotes the group of S-units in Q. They showed that apart from finitely many
and effectively computable pairs $(u_1^n, u_2^n)$, the equations under consideration
have no solution in x1 , x2 .
• We now compare the above result with the special case K = Q, Γ = Z∗S × Z∗S
of Theorem 6.1.6 and with the remark occurring after that theorem. Further,
we complete these results with a new one.
For given a1 , a2 ∈ Q∗ , consider the equation
a1 x1 + a2 x2 = 1 in x1 , x2 ∈ Z∗S .
(6.7.4)
We call two pairs (a1 , a2 ), (b1 , b2 ) ∈ Q∗ × Q∗ S-equivalent if they are Z∗S ×
Z∗S -equivalent, i.e., if there is (ε1 , ε2 ) ∈ Z∗S × Z∗S such that bi = ai εi for
i = 1, 2. Then the number of solutions of (6.7.4) does not change if (a1 , a2 )
is replaced by an S-equivalent pair.
Theorem 6.7.1 The following assertions hold.
(i) There are only finitely many S-equivalence classes of pairs (a1 , a2 ) in
Q∗ × Q∗ for which equation (6.7.4) has more than two solutions.
(ii) For each N ∈ {0, 1, 2}, there are infinitely many S-equivalence classes
of pairs (a1 , a2 ) ∈ Q∗ × Q∗ such that equation (6.7.4) has exactly N
solutions.
The assertion (i) is a special case of Theorem 6.1.6, hence is ineffective.
The statement (ii) for N = 2 has been proved in a more general form after the
statement of Theorem 6.1.6, while for N = 0 it is an immediate consequence
of the above result concerning equation (6.7.3). We now give a sketch of the
proof for N = 1. The proof of each case of (ii) is constructive.
Sketch of the proof of the case N = 1 of (ii). Let A be a large integer, and S
the set of integers composed of the primes p1 , . . . , pt . Denote by H (A) the
set of pairs (a, b) of relatively prime positive integers a, b with a, b ≤ A, and
by P (A) the set of those triples (a, b, c) of positive integers a, b, c for which
(a, b) ∈ H (A), gcd(ab, p1 · · · pt ) = 1 and a + b = c. It is known that H (A)
has cardinality at least $c_1 A^2$, where c1 is an effectively computable positive
absolute constant. This implies that the cardinality of P (A) is at least $c_2 A^2$.
Here c2 and c3 , c4 , c5 below are effectively computable positive numbers
depending only on p1 , . . . , pt . If x, y, z is a solution of the equation
ax + by = cz
in x, y, z ∈ S with gcd(x, y, z) = 1
(6.7.5)
for some (a, b, c) in P (A), then by Corollary 4.1.5
\[ \max(|x|, |y|, |z|) \le c_3 A^{c_4}. \]
Thus the total number of triples (x, y, z) with relatively prime x, y, z ∈ S
for which there exists (a, b, c) in P (A) satisfying (6.7.5) is at most $c_5 (\log A)^{3t}$.
If (a, b, c) ∈ P (A) so that (6.7.5) holds for some relatively prime x, y,
z in S and (x, y, z) is not (1, 1, 1) or (−1, −1, −1), then (a, b, c) = (y − z,
z − x, y − x)/d with d = gcd(y − z, z − x, y − x). Hence (a, b, c) is
uniquely determined by (x, y, z). Consequently, the number of (a, b, c) ∈
P (A) for which up to proportionality (1, 1, 1) is the only solution of (6.7.5)
in S is at least $c_2 A^2 - c_5 (\log A)^{3t}$, which tends to infinity as A tends to
infinity. One can inductively construct an infinite sequence of such (a, b, c).
For a triple (a, b, c) of this kind, write c = σ c0 with positive integers σ , c0
such that σ ∈ S, gcd(c0 , p1 · · · pt ) = 1. Then (1/σ, 1/σ ) is the only solution
of the equation (a/c0 )x1 + (b/c0 )x2 = 1 in x1 , x2 ∈ Z∗S . Since a, b and c0
are pairwise relatively prime, the pairs (a/c0 , b/c0 ) under consideration are
pairwise S-inequivalent. This proves the case N = 1 of (ii).
• We discussed above results that give bounds for the number of solutions of
S-unit equations. Here, we consider equations of the form
x1 + x2 = 1 in x1 , x2 ∈ OK∗ ,
(6.7.6)
where K is a number field. Recall that Evertse’s result mentioned above gives
an upper bound $3 \times 7^{2d+3}$ for the number of solutions of this equation. Grant
(1996) gave examples of number fields K of arbitrarily large degree d such
that (6.7.6) has $d^2$ solutions. In fact, Grant’s examples were cyclotomic
fields $\mathbb{Q}(e^{2\pi i/p})$ with p a prime, and certain number fields arising from elliptic
curves.
We can get much better upper bounds for the number of solutions of
(6.7.6) if we impose some restrictions on x1 , x2 . For instance, Silverman
(1995) proved that if ε is a fixed element of $O_K^*$, then the equation $\varepsilon^m + y = 1$
has at most $d^{1+o(1)}$ solutions $m \in \mathbb{Z}$, $y \in O_K^*$.
• We now deal with the equation
\[ a_1 x_1 + \cdots + a_n x_n = 1 \quad \text{in } (x_1, \ldots, x_n) \in \Gamma, \tag{6.1.3} \]
where a1 , . . . , an are non-zero elements of a field K of characteristic 0 and
Γ is a subgroup of finite rank of the n-fold direct product $(K^*)^n$. Already in
the 1970s, Dubois and Rhin (1976) and independently Schlickewei (1977a)
obtained finiteness results for (6.1.3) in the special case that K = Q and
$\Gamma = (\mathbb{Z}_S^*)^n$ for some finite set of places S of Q, and also with a condition
imposed on the solutions stronger than non-degeneracy. The general result
that for arbitrary K of characteristic 0 and Γ of finite rank, equation (6.1.3)
has only finitely many non-degenerate solutions, was proved in several steps
in the 1980s. Van der Poorten and Schlickewei in their unpublished preprint
van der Poorten and Schlickewei (1982) and independently Evertse (1984b)
proved that this equation has only finitely many non-degenerate solutions if
K is a number field and $\Gamma = (O_S^*)^n$ for some finite set of places S of K.
Also in their above mentioned preprint, van der Poorten and Schlickewei
claimed a generalization of this to the case that K is an arbitrary field of
characteristic 0 and Γ a finitely generated subgroup of $(K^*)^n$, but their proof
was incomplete. In van der Poorten and Schlickewei (1991) they published
the complete proof of their claim. Meanwhile, Evertse and Győry (1988b)
gave a different proof of the claim of van der Poorten and Schlickewei, and
showed that the number of non-degenerate solutions can be estimated from
above by a (with their method of proof not effectively computable) number
depending only on n, K and Γ. Further, Laurent (1984) developed some
Kummer theory, which made it possible to extend the finiteness result on
(6.1.3) from finitely generated groups to groups of finite rank.
• Schlickewei (1990) was the first to obtain an explicit upper bound for the
number of non-degenerate solutions of (6.1.3) in the case that K is a number
field and $\Gamma = (O_S^*)^n$, where S is a finite set of places of K, containing all
infinite places. His bound was improved in Evertse (1995) to $(2^{35} n^2)^{n^3 s}$,
where s is the cardinality of S (see also Subsection 9.5.2 of the present
book). In the case where K is a number field and $\Gamma = (O_S^*)^n$ this has not
been improved so far.
Building further on unpublished weaker results of the last two authors,
Evertse, Schlickewei and Schmidt (2002) proved that if K is any field of zero
characteristic, a1 , . . . , an ∈ K ∗ , and Γ a subgroup of rank r of $(K^*)^n$, then
(6.1.3) has at most $A(n, r) = \exp((6n)^{3n}(r + 1))$ non-degenerate solutions.
In Amoroso and Viada (2009) this was improved to $A(n, r) = (8n)^{4n^4(n+r+1)}$.
• We consider the case that Γ has rank 0, i.e., we consider the equation
\[ a_1 \zeta_1 + \cdots + a_n \zeta_n = 1 \quad \text{in roots of unity } \zeta_1, \ldots, \zeta_n, \tag{6.7.7} \]
where a1 , . . . , an again lie in a field K of characteristic 0. Results from Mann
(1965) and Conway and Jones (1976) imply that if a1 , . . . , an ∈ Q∗ , then for
each non-degenerate solution (ζ1 , . . . , ζn ) of (6.7.7), the lowest common
multiple of the orders of ζ1 , . . . , ζn is ≤ C(n) with C(n) effectively computable in terms of n only. Further, results from Schinzel (1988) and Dvornicich and Zannier (2000) imply that if a1 , . . . , an generate a number field
K of degree d, then for each non-degenerate solution of (6.7.7) the lowest
common multiple of the orders of their components is bounded above by
an effectively computable number C(n, d) depending on n and d only. This
implies that the non-degenerate solutions of (6.7.7) can be determined effectively, and it implies also that the number of non-degenerate solutions of
(6.7.7) is bounded above by a number depending on n and d only. Schlickewei (1996b) considered equations (6.7.7) with coefficients a1 , . . . , an from
an arbitrary field K of characteristic 0 and obtained an upper bound $2^{4(n+1)!}$
for the number of non-degenerate solutions. This was improved by Evertse
(1999) to $(n + 1)^{3(n+1)^2}$. The proofs of the two last mentioned results use only
simple properties of cyclotomic fields.
• Evertse, Győry, Stewart and Tijdeman (1988a) proved the following result,
which shows that Theorem 6.1.6 has no obvious generalization to equations
in more than two unknowns. Let K be a field of characteristic 0, n ≥ 3,
and Γ a subgroup of $(K^*)^n$ of finite rank. Call two tuples of coefficients
$(a_1, \ldots, a_n), (b_1, \ldots, b_n) \in (K^*)^n$ Γ-equivalent if $(a_1 b_1^{-1}, \ldots, a_n b_n^{-1}) \in \Gamma$.
Then there are groups Γ of finite rank such that for every m > 0 there are
infinitely many Γ-equivalence classes of tuples $(a_1, \ldots, a_n) \in (K^*)^n$ with
the property that (6.1.3) has at least m non-degenerate solutions.
We give an easy construction different from that of Evertse et al. Choose
m points $(x_{i1}, \ldots, x_{i,n-1}) \in (K^*)^{n-1}$ such that $x_{i1} + \cdots + x_{i,n-1} = 1$ for i =
1, . . . , m and no proper subsums of the left-hand sides vanish. Let $\Gamma_1$ be the
multiplicative group generated by $x_{ij}$, for all i = 1, . . . , m, j = 1, . . . , n − 1.
Then the equation
\[ x_1 + \cdots + x_{n-1} = 1 \quad \text{in } (x_1, \ldots, x_{n-1}) \in \Gamma_1^{n-1} \]
has at least m non-degenerate solutions. It follows that for all but finitely
many α ∈ K \ {0, −1}, the equation
\[ \frac{1}{1+\alpha} x_1 + \cdots + \frac{1}{1+\alpha} x_{n-1} + \frac{\alpha}{1+\alpha} x_n = 1 \quad \text{in } (x_1, \ldots, x_n) \in \Gamma_1^{n} \]
has at least m non-degenerate solutions, all with $x_n = 1$. We claim that the
tuples $\varphi(\alpha) := \left( \frac{1}{1+\alpha}, \ldots, \frac{1}{1+\alpha}, \frac{\alpha}{1+\alpha} \right)$ with α ∈ K \ {0, −1} lie in infinitely
many different $\Gamma_1^n$-equivalence classes. Indeed, it is easy to see that the $\Gamma_1^n$-equivalence
of ϕ(α), ϕ(β) implies that $\alpha/\beta \in \Gamma_1$. Now if the tuples ϕ(α) with
α ∈ K \ {0, −1} lay in finitely many $\Gamma_1^n$-equivalence classes, the group K ∗
would be finitely generated, which is clearly absurd.
Instead, Evertse and Győry (1988b) proved the following result.
Theorem 6.7.2 Let K be a field of characteristic 0 and Γ a subgroup
of $(K^*)^n$ of finite rank. Then for all tuples $(a_1, \ldots, a_n) \in (K^*)^n$ with the
exception of at most finitely many Γ-equivalence classes, the (non-degenerate
or degenerate) solutions of (6.1.3) lie in a union of at most $2^{(n+1)!}$ proper
linear subspaces of $K^n$.
This was improved in Evertse (2004) to $2^{n+1}$. This bound is probably
not best possible. It is as yet not clear what the optimal bound should
be.
• Let K be a number field, S a finite set of places of K containing the infinite
places, and consider again equation (6.1.3) in S-units x1 , . . . , xn . There is
as yet no general effective method to find all non-degenerate solutions if the
number of unknowns is larger than 2. However, in his thesis, Vojta (1983)
gave an effective method to determine all non-degenerate solutions in S-units
of (6.1.3) if the number n of unknowns is 3, and the cardinality of the set S is
at most 3. Recently, Bennett (not published when this book went to press)
extended this to n = 4, |S| ≤ 3. Both Vojta and Bennett proved more general
effective results for systems of S-unit equations. More recently, Levin (2014)
extended Vojta’s result to an effective result for S-integral points on certain
quasi-projective varieties, where again |S| is small enough. The proofs of
Vojta, Bennett and Levin all use Baker-type lower bounds for linear forms in
logarithms.
• An effective version of the p-adic Subspace Theorem of Schmidt and Schlickewei would yield an effective version of Corollary 6.1.2, stating the finiteness
of the number of non-degenerate solutions of the S-unit equation (6.1.2), i.e.
(6.1.3). It seems, however, hopeless to make the Subspace Theorem effective
by the present methods. As is pointed out in Győry (1992a), an effective
variant of the following weaker Diophantine result would also imply an
effective version of Corollary 6.1.2, which would be of great importance for
its applications.
Let k, n ≥ 1 be integers, α0 , . . . , αk , β1 , . . . , βn non-zero elements of a
number field K, and $b_{i1}, \ldots, b_{in}$ (i = 1, . . . , k) rational integers with absolute values at most B such that
\[ \Lambda = \sum_{i=1}^{k} \alpha_i \beta_1^{b_{i1}} \cdots \beta_n^{b_{in}} - \alpha_0 \]
has no vanishing subsum containing α0 . Let | . |v be a normalized absolute
value on K.
Proposition If
\[ 0 < |\Lambda|_v < e^{-\delta B} \]
for some δ > 0, then B < C, where C is a number depending only on k, n,
K, α0 , . . . , αk , β1 , . . . , βn , v and δ.
For k = 1, this is a non-effective version of Baker’s Theorem and its
p-adic analogue; see Section 3.2. For k ≥ 1, the above proposition is a
straightforward consequence of Proposition 6.2.1, which was deduced from
the Subspace Theorem. Hence the bound C is not effectively computable for
k > 1 by the method of proof.
In Győry (1992a) it is shown that an effective version of the above
Proposition would imply an effective variant of Corollary 6.1.2 on S-unit
equations.
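As a concrete illustration of the two-term S-unit equation (6.7.1) from the beginning of these notes, the following minimal Python sketch searches for solutions of x1 + x2 = 1 in Z∗S by brute force. The function names and the exponent bound are assumptions made for this illustration only. A bounded search of this kind proves nothing about completeness; that is the business of the finiteness theorems discussed above.

```python
from fractions import Fraction
from itertools import product


def is_s_unit(q, primes):
    """True if the rational q is non-zero and, up to sign, a product of powers of the given primes."""
    if q == 0:
        return False
    for part in (abs(q.numerator), q.denominator):
        for p in primes:
            while part % p == 0:
                part //= p
        if part != 1:
            return False
    return True


def s_unit_solutions(primes, bound):
    """Solutions (x1, x2) of x1 + x2 = 1 with x1 = +-prod p^e, |e| <= bound, and x2 an S-unit.

    `bound` is a hypothetical search parameter; solutions whose first
    coordinate has larger exponents are not found.
    """
    found = set()
    for exps in product(range(-bound, bound + 1), repeat=len(primes)):
        magnitude = Fraction(1)
        for p, e in zip(primes, exps):
            magnitude *= Fraction(p) ** e
        for x1 in (magnitude, -magnitude):
            x2 = 1 - x1
            if is_s_unit(x2, primes):
                found.add((x1, x2))
    return found


solutions = s_unit_solutions((2, 3), 4)
print(len(solutions))  # 21 solutions inside this exponent box
```

For S = {∞, 2, 3} the solutions found come from the four relations 1 + 1 = 2, 1 + 2 = 3, 1 + 3 = 4 and 1 + 8 = 9 after dividing by one of the three terms.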
7 Analogues over function fields
Let k be an algebraically closed field of characteristic 0, and K a function field
in one variable over k, i.e., a finitely generated extension of k of transcendence
degree 1. Thus, K is a finite extension of the field of rational functions k(z),
where z is any element of K \ k. For definitions and more information on
function fields we refer to Chapter 2.
We denote by gK/k the genus of K/k. By a valuation on K we mean a
discrete valuation on K with value group Z such that v(x) = 0 for x ∈ k∗ . Let
MK denote the set of valuations of K. We recall that for a finite subset S of MK
a non-zero element u of K is called an S-unit if v(u) = 0 for all v ∈ MK \ S.
In this chapter we deal with equations
\[ a_1 x_1 + a_2 x_2 = 1 \tag{7.1} \]
and, in a less detailed manner,
\[ a_1 x_1 + \cdots + a_n x_n = 1 \tag{7.2} \]
to be solved in S-units x1 , . . . , xn and with some generalizations. The coefficients are non-zero elements of K.
In Section 7.1 we state Stothers’ and Mason’s Theorem, giving a function
field analogue of the abc-conjecture, as well as a corollary which states that
(7.1) has only finitely many solutions in S-units x1 , x2 with a1 x1 , a2 x2 ∉ k∗ ,
which can be effectively determined in a well-defined sense. The theorem of
Stothers and Mason and its corollary are proved in Section 7.2. In Section 7.3
we give a survey without proofs on effective results on S-unit equations (7.2)
in an arbitrary number of unknowns, and explain the structure of the set of
solutions of such equations, which is somewhat more complicated than that of
S-unit equations over number fields.
Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:29:36, subject to the Cambridge
Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.009
In Sections 7.4 and 7.5 we consider, among other things, the equation
\[ a_1 x_1 + a_2 x_2 = 1 \quad \text{in } (x_1, x_2) \in \Gamma, \tag{7.3} \]
where again a1 , a2 ∈ K ∗ and where Γ is a multiplicative subgroup of K ∗ × K ∗
containing k∗ × k∗ such that Γ/(k∗ × k∗ ) is a group of finite rank r. We prove
a result from Evertse and Zannier (2008), stating that equation (7.3) has at
most $3^r$ solutions with a1 x1 , a2 x2 ∉ k. The method of proof we use, which is
based on algebraic geometry, was developed by Bombieri, Mueller and Zannier
(2001) and Zannier (2004).
In the last section of this chapter, we give a brief overview of recent results
on unit equations over fields of positive characteristic.
7.1 Mason’s inequality
Recall that for any α ∈ K, the height of α relative to K is defined by
\[ H_K(\alpha) := -\sum_{v} \min(0, v(\alpha)). \]
We have HK (α) ≥ 0, and equality holds precisely when α ∈ k. Further, we
denote by |S| the cardinality of a set S.
We start with a theorem of Mason (1983, 1984). It is a generalization of an
earlier result of Stothers (1981).
Theorem 7.1.1 (abc-theorem for function fields) Let S be a finite, non-empty
subset of MK , and let x1 , x2 and x3 be non-zero elements of K with
\[ x_1 + x_2 + x_3 = 0 \tag{7.1.1} \]
such that
\[ v(x_1) = v(x_2) = v(x_3) \quad \text{for every } v \in M_K \setminus S. \tag{7.1.2} \]
Then either x1 /x2 lies in k, or
\[ H_K(x_1/x_2) \le |S| + 2g_{K/k} - 2. \tag{7.1.3} \]
We note that (7.1.3) is best possible in the sense that for every g ≥ 0 there is
a function field K over k of genus g such that equality holds for infinitely many
values of |S|; see Silverman (1984) for g = 0 and Brownawell and Masser
(1986) for arbitrary g.
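The inequality (7.1.3) can be checked by hand in the simplest situation. The sketch below assumes K = k(z), so that the genus is 0, and takes x1 = a, x2 = b, x3 = −(a + b) with coprime polynomials a, b having rational coefficients. Under these assumptions H_K(a/b) = max(deg a, deg b), the finite places in S correspond to the distinct roots of abc in an algebraic closure, and the place at infinity belongs to S exactly when the three degrees are not all equal. All function names are hypothetical, and polynomials are coefficient lists with the constant term first.

```python
from fractions import Fraction


def trim(p):
    """Drop trailing zero coefficients (lists store the constant term first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def deg(p):
    return len(trim(p)) - 1  # the zero polynomial gets degree -1


def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)])


def mul(p, q):
    if not p or not q:
        return []
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return trim(r)


def derivative(p):
    return trim([i * c for i, c in enumerate(p)][1:])


def poly_mod(p, q):
    """Remainder of p on division by the non-zero polynomial q, in exact arithmetic."""
    p = [Fraction(c) for c in trim(p)]
    q = trim(q)
    while len(p) >= len(q):
        factor = p[-1] / q[-1]
        shift = len(p) - len(q)
        for i, c in enumerate(q):
            p[shift + i] -= factor * c
        p = trim(p)
    return p


def gcd(p, q):
    p, q = trim(p), trim(q)
    while q:
        p, q = q, poly_mod(p, q)
    return p


def distinct_roots(p):
    """Number of distinct roots in an algebraic closure: deg p - deg gcd(p, p')."""
    return deg(p) - deg(gcd(p, derivative(p)))


def mason_check(a, b):
    """Return (H_K(a/b), |S| - 2) for coprime a, b in Q[z], assuming genus 0."""
    if deg(gcd(a, b)) != 0:
        raise ValueError("a and b must be coprime")
    c = [-x for x in add(a, b)]
    height = max(deg(a), deg(b))
    finite_places = distinct_roots(mul(mul(a, b), c))
    infinite_place = 0 if deg(a) == deg(b) == deg(c) else 1
    return height, finite_places + infinite_place - 2


# (z^2 - 1)^2 + 4 z^2 = (z^2 + 1)^2
print(mason_check([1, 0, -2, 0, 1], [0, 0, 4]))  # (4, 4)
```

In this example the height and the bound coincide, so the identity (z^2 − 1)^2 + (2z)^2 = (z^2 + 1)^2 gives a case of equality in (7.1.3) with |S| = 6.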
Theorem 7.1.1 implies at once that if S is again a finite subset of MK
and x1 , x2 are S-units with x1 + x2 = 1 and x1 , x2 ∉ k, then
$H_K(x_i) \le |S| + 2g_{K/k} - 2$ for i = 1, 2. We state a more general result for the S-unit equation
a1 x1 + a2 x2 = 1 in S-units x1 , x2 ,
(7.1.4)
where a1 , a2 ∈ K ∗ .
Theorem 7.1.2 Let (x1 , x2 ) be a solution of (7.1.4) with ai xi ∉ k∗ for i = 1, 2.
Then
\[ \max_{i=1,2} H_K(x_i) \le |S| + 2g_{K/k} - 2 + 5 \max_{i=1,2} H_K(a_i). \]
We observe that equation (7.1.4) may have infinitely many solutions (x1 , x2 )
such that one of a1 x1 , a2 x2 lies in k. Indeed, suppose (7.1.4) has such a solution
(x1,0 , x2,0 ). Then both a1 x1,0 , a2 x2,0 ∈ k∗ . Put a1 x1,0 /a2 x2,0 =: η. Then we
obtain infinitely many solutions (x1 , x2 ) of (7.1.4) with a1 x1 , a2 x2 ∈ k∗ by
taking (x1 , x2 ) = (x1,0 ξ, x2,0 (1 + (1 − ξ )η)) for any ξ ∈ k∗ .
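The family of solutions just described can be checked with exact arithmetic. The sketch below is only an illustration: it uses Q as a stand-in for the constant field k, and it works directly with the products a1 x1 and a2 x2, writing s for a1 x1,0 (so that a2 x2,0 = 1 − s). The function name is hypothetical.

```python
from fractions import Fraction


def family_member(s, xi):
    """Return (a1*x1, a2*x2) for x1 = x_{1,0}*xi and x2 = x_{2,0}*(1 + (1 - xi)*eta).

    s = a1*x_{1,0} is assumed to lie in Q \ {0, 1}, so a2*x_{2,0} = 1 - s
    and eta = s / (1 - s).
    """
    eta = s / (1 - s)
    first = s * xi
    second = (1 - s) * (1 + (1 - xi) * eta)
    return first, second


s = Fraction(2, 5)
for xi in (Fraction(1, 3), Fraction(7), Fraction(-4)):
    first, second = family_member(s, xi)
    # Both products are constants and they still add up to 1.
    assert first + second == 1
```

Expanding the second product gives 1 − sξ, so it vanishes for ξ = 1/s; that single value of ξ presumably has to be excluded, since x2 must be non-zero.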
From Theorem 7.1.2, we obtain the following effective finiteness result.
Here it is necessary to assume that k is presented explicitly in the sense of
Fröhlich and Shepherdson (1956). This means that there is an algorithm to
determine the zeros of any polynomial with coefficients in k. In particular, in
this case we can perform the field operations in k. Further, we assume that K is
presented explicitly, that is, K is given in the form k(z)(y) where z is a variable
and y is a primitive element of K over k(z), with explicitly given minimal
polynomial in k(z)[X].
We say that an element x of K is explicitly given if it is given in the form
\[ x = \sum_{i=1}^{d} \bigl(q_i(z)/q(z)\bigr)\, y^{i-1}, \]
where d = [K : k(z)] and q1 , . . . , qd , q are explicitly given polynomials from
k[z]. We call (q1 , . . . , qd , q) a representation for x. We say that a valuation v
of K is explicitly given, if we are given a local parameter zv and a Laurent
series yv in k((zv )) such that y → yv defines an isomorphic embedding of K
into k((zv )). By a Laurent series being explicitly given we mean that we are
given an inductive procedure to compute its coefficients one by one. If a nonzero x ∈ K and a valuation v are explicitly given, then v(x) can be determined
by computing a Laurent series of x in terms of zv and searching for the first
non-zero coefficient.
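The search for the first non-zero coefficient can be sketched as follows. Here a Laurent series in the local parameter is modelled, as an assumption of this illustration, by a starting exponent together with a function returning its coefficients one by one; the cut-off max_terms is a hypothetical safeguard, since for a non-zero x the search terminates on its own.

```python
def valuation(start, coeff, max_terms=1000):
    """Order of vanishing of the series sum_{i >= 0} coeff(i) * z_v**(start + i).

    `start` is the lowest exponent that may occur and `coeff(i)` computes the
    coefficient of z_v**(start + i).
    """
    for i in range(max_terms):
        if coeff(i) != 0:
            return start + i
    raise ValueError("no non-zero coefficient found among the first terms")


# x = 1/(1 - z) - 1 - z = z^2 + z^3 + ... : the first two coefficients cancel.
print(valuation(0, lambda i: 1 - (1 if i < 2 else 0)))  # 2

# x = z^(-3) + z^(-2): a pole of order 3.
print(valuation(-3, lambda i: 1 if i < 2 else 0))  # -3
```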
Finally, an element x of K is said to be effectively determinable from certain
given input data if there is an algorithm to determine an explicit representation
of x from these data. We note that if elements x1 and x2 of K are effectively
determinable, then so are x1 ± x2 , x1 x2 and x1 /x2 (x2 ≠ 0).
Corollary 7.1.3 Equation (7.1.4) has only finitely many solutions with ai xi ∉
k∗ for i = 1, 2, and these can be determined effectively if we assume that k, K
are presented explicitly and a1 , a2 and the valuations in S are given explicitly
in the sense described above.
For a1 = a2 = 1, Mason (1983, 1984) deduced this corollary from his Theorem 7.1.1 stated above and from Propositions 2.4.1 and 2.4.2. Further, in
this special case he extended Corollary 7.1.3 to the case of positive characteristic. We deduce Corollary 7.1.3 in a manner similar to Mason’s, using
Theorem 7.1.2 instead of Theorem 7.1.1.
We mention that in his book, Mason (1984) gave various applications of the
results mentioned above, to Thue equations, hyper- and superelliptic equations,
and curves of genus 0 and genus 1.
7.2 Proofs
We prove Theorems 7.1.1, 7.1.2 and Corollary 7.1.3.
Proof of Theorem 7.1.1. We assume without loss of generality that S is precisely the set of all v ∈ MK such that v(x1 ), v(x2 ), v(x3 ) are not all equal. For convenience, we write u := x3 /x1 . Thus, (7.1.1) implies that x2 /x1 = −(u + 1)
and our assumption translates into
\[ S = \{ v \in M_K : v(u) \ne 0 \ \text{or} \ v(u + 1) \ne 0 \}. \]
We may assume that u does not lie in k. Since v(u + 1) ≥ 0 if v(u) = 0, we
can partition S into a disjoint union S∞ ∪ S0 ∪ S−1 , where
\[ S_\infty = \{ v \in S : v(u) < 0 \}, \quad S_0 = \{ v \in S : v(u) > 0 \}, \quad S_{-1} = \{ v \in S : v(u + 1) > 0 \}. \]
These sets are pairwise disjoint. Notice that by the Sum Formula (see (2.1.3)
in Section 2.1), $\sum_{v \in S_\infty \cup S_0} v(u) = 0$.
Now choose for every valuation v ∈ MK a local parameter zv . We compare
the order of vanishing at v of u and the local derivative du/dzv . From (2.3.1)
and (2.3.2) in Section 2.3, we infer
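\[
\begin{aligned}
v\!\left(\frac{du}{dz_v}\right) &= v(u) - 1 && \text{for } v \in S_\infty \cup S_0, \\
v\!\left(\frac{du}{dz_v}\right) &= v\!\left(\frac{d(u+1)}{dz_v}\right) = v(u + 1) - 1 && \text{for } v \in S_{-1}, \\
v\!\left(\frac{du}{dz_v}\right) &\ge 0 && \text{for } v \in M_K \setminus S.
\end{aligned}
\]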
By combining these with Theorem 2.3.1 and the Sum Formula we obtain
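\[
\begin{aligned}
2g_{K/k} - 2 = \sum_{v} v\!\left(\frac{du}{dz_v}\right) &\ge \sum_{v \in S_\infty \cup S_0} (v(u) - 1) + \sum_{v \in S_{-1}} (v(u + 1) - 1) \\
&= \sum_{v \in S_{-1}} v(u + 1) - |S| = H_K((u + 1)^{-1}) - |S| = H_K(x_1/x_2) - |S|.
\end{aligned}
\]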
This implies Theorem 7.1.1.
Proof of Theorem 7.1.2. Put $H := \max_{i=1,2} H_K(a_i)$. For i = 1, 2, let Si be the
set of valuations v outside S with v(ai ) ≠ 0. By (2.2.8) we have
$|S_i| \le 2H_K(a_i) \le 2H$ for i = 1, 2. Take a solution (x1 , x2 ) of (7.1.4) with ai xi ∉ k∗ for
i = 1, 2. We have v(a1 x1 ) = v(a2 x2 ) = v(1) = 0 for v ∈ MK \ (S ∪ S1 ∪ S2 ).
Notice that by (2.2.7) and (2.2.6),
\[ H_K(x_i) \le H_K(a_i x_i) + H_K(a_i^{-1}) = H_K(a_i x_i) + H_K(a_i) \le H_K(a_i x_i) + H \]
for i = 1, 2. Now an application of Theorem 7.1.1 with a1 x1 , a2 x2 , 1 instead
of x1 , x2 , x3 and S ∪ S1 ∪ S2 instead of S gives for i = 1, 2,
\[ H_K(x_i) \le H_K(a_i x_i) + H \le |S| + |S_1| + |S_2| + 2g_{K/k} - 2 + H \le |S| + 2g_{K/k} - 2 + 5H. \]
Proof of Corollary 7.1.3. Let (x1 , x2 ) be a solution of (7.1.4) with ai xi ∉ k∗
for i = 1, 2. Pick i ∈ {1, 2}. Then v(xi ) = 0 for every valuation v ∈ MK \ S.
Further, by (2.2.8),
\[ \sum_{v \in S} |v(x_i)| = 2H_K(x_i) \le 2C, \]
where C is the bound from Theorem 7.1.2. As explained in Section 2.4,
we can compute a minimal polynomial over k[z] of each ai , and then estimate from above the heights of the ai using (2.2.10). This leads to an effectively computable upper bound for C. We conclude that the tuple of integers
(v(xi ) : i = 1, 2, v ∈ S) has only a finite number of effectively determinable
possibilities as (x1 , x2 ) runs over the solutions of (7.1.4) with ai xi ∉ k∗ for
i = 1, 2.
Applying Proposition 2.4.1 we infer that apart from a non-zero factor in k,
x1 , x2 have only a finite number of possibilities which are effectively determinable. Hence for i = 1, 2 we may write xi = yi ξi , where ξi is some nonzero element of k and yi belongs to a finite computable subset of K. Now
Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:29:36, subject to the Cambridge
Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.009
178
Analogues over function fields
equation (7.1.4) transforms into
\[ b_1 \xi_1 + b_2 \xi_2 = 1, \tag{7.2.1} \]
where $b_i := a_i y_i \notin k^*$ for i = 1, 2. The pair $(b_1, b_2)$ belongs to a finite, effectively determinable set, and for each such pair we have to determine the solutions $\xi_1, \xi_2 \in k^*$ of (7.2.1).
Fix $b_1, b_2$. We have seen that $b_1, b_2 \notin k$. If $b_1, b_2$ are linearly dependent over
k then (7.2.1) is unsolvable. Assume that b1 , b2 are linearly independent over
k. Then equation (7.2.1) has at most one solution, which by Proposition 2.4.2 can be determined effectively. This completes the proof of our assertion.
7.3 Effective results in the more unknowns case
For completeness, we now present without proof some generalizations of the
results stated in Section 7.1.
Let n ≥ 2 be a given integer. For non-zero elements x0 , x1 , . . . , xn of K, we
define the homogeneous height by
\[ H_K^{\mathrm{hom}}(x_0, \ldots, x_n) = -\sum_{v \in M_K} \min(v(x_0), \ldots, v(x_n)). \]
The Sum Formula on K shows that this is actually a height on the projective space $\mathbb{P}^n(K)$. Further, we have
\[ H_K(x_i/x_j) \le H_K^{\mathrm{hom}}(x_0, \ldots, x_n) \]
for each i, j with 0 ≤ i, j ≤ n.
Write
\[ N_0 := 0, \qquad N_n := \tfrac{1}{2}(n-1)(n-2) \quad \text{if } n \ge 1. \]
Brownawell and Masser (1986) proved the following general theorem.
Theorem 7.3.1 Suppose that $x_0, \ldots, x_n$ are non-zero elements of K such that
\[ x_0 + \cdots + x_n = 0, \tag{7.3.1} \]
and no proper subset of $\{x_0, \ldots, x_n\}$ is k-linearly dependent. For each valuation v of K, let
\[ m(v) = m(v; x_0, \ldots, x_n) := |\{ i : 0 \le i \le n,\ v(x_i) = 0 \}|. \]
Then
\[ H_K^{\mathrm{hom}}(x_0, \ldots, x_n) \le N_{n+1}(2g_{K/k} - 2) + \sum_{v \in M_K} \bigl(N_{n+1} - N_{m(v)}\bigr). \tag{7.3.2} \]
The proof of Brownawell and Masser uses logarithmic Wronskians. Fix $z \in K \setminus k$, so that K is a finite extension of k(z). For $f \in K$, define $f^{(k)} := (d/dz)^k f$. Now we define the logarithmic Wronskian of $f_1, \ldots, f_n \in K^*$ by
\[ \lambda(f_1, \ldots, f_n) := \det\bigl( f_i^{(j-1)} / f_i \bigr)_{i,j=1,\ldots,n}. \]
Given a solution $(x_0, \ldots, x_n)$ of (7.3.1), let $\lambda_i$ be the logarithmic Wronskian of $x_0, \ldots, x_n$ with $x_i$ omitted. Then the argument of Brownawell and Masser consists of showing that $H_K^{\mathrm{hom}}(x_0, \ldots, x_n) = H_K^{\mathrm{hom}}(\lambda_0, \ldots, \lambda_n)$, and estimating $v(\lambda_i)$ from below for i = 0, . . . , n and $v \in M_K$, which leads to an upper bound for $H_K^{\mathrm{hom}}(\lambda_0, \ldots, \lambda_n)$.
The following consequence of Theorem 7.3.1 was obtained independently
by Voloch (1985).
Corollary 7.3.2 Suppose that for some finite subset S of MK , x0 , . . . , xn give
rise to a solution of (7.3.1) in S-units, and that no proper subset of {x0 , . . . , xn }
is k-linearly dependent. Then
\[ H_K^{\mathrm{hom}}(x_0, \ldots, x_n) \le \tfrac{1}{2} n(n-1)\bigl(|S| + 2g_{K/k} - 2\bigr). \]
In his proof, Voloch did not use the Wronskian argument of Brownawell and
Masser, but instead used properties of Weierstrass points on algebraic curves.
Notice that we obtain Mason’s result, Theorem 7.1.1, from Corollary 7.3.2
by taking x1 , x2 , x3 in K ∗ with x1 + x2 + x3 = 0 and with (7.1.2) and applying
Corollary 7.3.2 to (x1 /x2 ) + (x3 /x2 ) + 1 = 0.
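As a sanity check of Corollary 7.3.2 in its simplest case, one may take K = k(z), so that $g_{K/k} = 0$, and pairwise coprime polynomials $x_0, x_1, x_2$ with $x_0 + x_1 + x_2 = 0$. The sketch below rests on assumptions that are not spelled out in the text: for such polynomials the homogeneous height is taken to be the maximum of the degrees, and S is taken to consist of the zeros of $x_0 x_1 x_2$ together with the place at infinity. The helper names (`trim`, `rem`, `distinct_roots`, `mason_check`) are illustrative only, and the coefficients are rational numbers so that the arithmetic is exact.

```python
from fractions import Fraction

def trim(p):
    """Copy of p (coefficients, lowest degree first) without trailing zeros."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p

def mul(p, q):
    """Product of two polynomials."""
    p, q = trim(p), trim(q)
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def deriv(p):
    """Formal derivative."""
    return trim([i * c for i, c in enumerate(trim(p))][1:])

def rem(p, q):
    """Remainder of p on division by the non-zero polynomial q."""
    p, q = trim(p), trim(q)
    while len(p) >= len(q):
        factor, shift = p[-1] / q[-1], len(p) - len(q)
        for i, c in enumerate(q):
            p[shift + i] -= factor * c
        p = trim(p)  # the leading term has cancelled exactly
    return p

def gcd(p, q):
    """Euclidean algorithm; the result is a gcd up to a constant factor."""
    p, q = trim(p), trim(q)
    while q:
        p, q = q, rem(p, q)
    return p

def distinct_roots(p):
    """Number of distinct roots in an algebraic closure (characteristic 0)."""
    p = trim(p)
    return (len(p) - 1) - (len(gcd(p, deriv(p))) - 1)

def mason_check(x0, x1, x2):
    """Return (height, |S|, bound) for coprime polynomials with x0 + x1 + x2 = 0."""
    height = max(len(trim(x)) - 1 for x in (x0, x1, x2))
    size_S = distinct_roots(mul(mul(x0, x1), x2)) + 1  # +1: place at infinity
    n, genus = 2, 0                                    # assumption: K = k(z)
    bound = n * (n - 1) // 2 * (size_S + 2 * genus - 2)
    return height, size_S, bound

# Example: z^2 + (2z + 1) - (z + 1)^2 = 0
print(mason_check([0, 0, 1], [1, 2], [-1, -2, -1]))
```

In this example the height and the bound coincide, so the estimate cannot be improved for n = 2 in general.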
Independently, Mason (1986a) proved Corollary 7.3.2 with a larger bound
in terms of n. Further, he showed that apart from a common proportional
S-unit factor, the full range of possibilities for such x0 , . . . , xn is finite, and
may be determined effectively whenever k, K are presented explicitly and the
valuations in S are given explicitly. A sharpening of Corollary 7.3.2 was given
by Zannier (1993). Hsia and Wang (2004) obtained a generalization of the result
of Brownawell and Masser to function fields of arbitrary transcendence degree
over constant fields of arbitrary characteristic. Their proof uses generalized
Wronskians.
A solution $x_0, \ldots, x_n$ of (7.3.1) is called non-degenerate if $\sum_{i \in I} x_i \neq 0$ for
every non-empty proper subset I of {0, . . . , n}, and degenerate otherwise.
Brownawell and Masser (1986) proved that the inequality (7.3.2) in Theorem
7.3.1 remains true, in a slightly modified form, if the assumption of linear
independence is replaced by the weaker hypothesis of non-degeneracy.
Set
GK := max(0, 2gK/k − 2).
Theorem 7.3.3 Suppose that $x_0, \ldots, x_n$ is a non-degenerate solution of (7.3.1). Then
\[ H_K^{\mathrm{hom}}(x_0, \ldots, x_n) \le N_{n+1} \cdot G_K + \sum_{v \in M_K} \bigl(N_{n+1} - N_{m(v)}\bigr). \]
This implies the following version of Corollary 7.3.2.
Corollary 7.3.4 Suppose that for some finite subset S of $M_K$, $x_0, \ldots, x_n$ is a non-degenerate solution of (7.3.1) in S-units. Then
\[ H_K^{\mathrm{hom}}(x_0, \ldots, x_n) \le \tfrac{1}{2} n(n-1)\bigl(|S| + G_K\bigr). \tag{7.3.3} \]
It is likely that for n > 2 the factor $\tfrac{1}{2} n(n-1)$ in (7.3.3) is not the best possible one.
We derive a result on the inhomogeneous equation
\[ a_1 x_1 + \cdots + a_n x_n = 1 \quad \text{in S-units } x_1, \ldots, x_n, \tag{7.3.4} \]
where, as before, S is a finite subset of $M_K$ and where $a_1, \ldots, a_n$ are non-zero elements of K. A solution $(x_1, \ldots, x_n)$ of this equation is called non-degenerate if $\sum_{i \in I} a_i x_i \neq 0$ for each non-empty subset I of {1, . . . , n}.
Theorem 7.3.5 For every non-degenerate solution $(x_1, \ldots, x_n)$ of (7.3.4) we have
\[ \max_{1 \le i \le n} H_K(x_i) \le \tfrac{1}{2} n(n-1)\bigl(|S| + G_K\bigr) + (n^3 - n^2 + 1) \max_{1 \le i \le n} H_K(a_i). \]
Proof. Put $H := \max_{1 \le i \le n} H_K(a_i)$. For i = 1, . . . , n, let $S_i$ be the set of valuations v outside S for which $v(a_i) \neq 0$. Then by (2.2.8) we have $|S_i| \le 2H_K(a_i) \le 2H$ for i = 1, . . . , n. Choose a non-degenerate solution $(x_1, \ldots, x_n)$ of (7.3.4). Then, exactly as in the proof of Theorem 7.1.2, we have $H_K(x_i) \le H_K(a_i x_i) + H$ for i = 1, . . . , n. Now by applying Corollary 7.3.4 with $(1, a_1 x_1, \ldots, a_n x_n)$ and $S \cup S_1 \cup \cdots \cup S_n$ instead of $(x_0, x_1, \ldots, x_n)$, S, we obtain for i = 1, . . . , n,
\[ \begin{aligned} H_K(x_i) &\le H + H_K(a_i x_i) \le H + H_K^{\mathrm{hom}}(1, a_1 x_1, \ldots, a_n x_n) \\ &\le H + \tfrac{1}{2} n(n-1)\bigl(G_K + |S| + |S_1| + \cdots + |S_n|\bigr) \\ &\le \tfrac{1}{2} n(n-1)\bigl(G_K + |S|\bigr) + (n^3 - n^2 + 1)H. \end{aligned} \]
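The last inequality in the proof above is a short piece of arithmetic, spelled out here for convenience. It uses only the estimate $|S_i| \le 2H$ from the first lines of the proof:

```latex
\[
H + \tfrac{1}{2} n(n-1)\bigl(G_K + |S| + |S_1| + \cdots + |S_n|\bigr)
  \le H + \tfrac{1}{2} n(n-1)\bigl(G_K + |S|\bigr) + \tfrac{1}{2} n(n-1) \cdot 2nH
  = \tfrac{1}{2} n(n-1)\bigl(G_K + |S|\bigr) + (n^3 - n^2 + 1)H .
\]
```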
The above result does not imply that (7.3.4) has only finitely many solutions. We say that two solutions (x1 , . . . , xn ), (x̃1 , . . . , x̃n ) of (7.3.4) are
k-proportional, or lie in the same k-proportionality class, if xi /x̃i ∈ k for
i = 1, . . . , n. In general, a k-proportionality class may contain infinitely many
non-degenerate solutions.
Corollary 7.3.6 The set of non-degenerate solutions of (7.3.4) is contained
in a union of finitely many k-proportionality classes, and if we assume that k,
K are presented explicitly and S, a1 , . . . , an are explicitly given, a full system
of representatives of these classes can be determined effectively.
Proof. Let $(x_1, \ldots, x_n)$ be a non-degenerate solution of (7.3.4). Then by (2.2.8) we have
\[ \sum_{v \in S} |v(x_i)| = 2H_K(x_i) \le 2C \quad \text{for } i = 1, \ldots, n, \]
where C is the upper bound from Theorem 7.3.5. By the same method as in the
proof of Corollary 7.1.3, one can effectively compute an upper bound for C.
This shows that the tuple (v(xi ) : i = 1, . . . , n, v ∈ S) runs through a finite,
effectively determinable set as (x1 , . . . , xn ) runs through the non-degenerate
solutions of (7.3.4), and the non-degenerate solutions corresponding to a given
tuple lie in the same k-proportionality class. Hence the non-degenerate solutions
of (7.3.4) lie in only finitely many k-proportionality classes.
By Proposition 2.4.1, it can be decided effectively whether for a given tuple $(c_{iv} : i = 1, \ldots, n,\ v \in S)$ there exist $b_1, \ldots, b_n \in O_S^*$ with $v(b_i) = c_{iv}$ for i = 1, . . . , n, $v \in S$ and if so, such $b_1, \ldots, b_n$ can be determined effectively. The non-degenerate solutions of (7.3.4) that are k-proportional to $(b_1, \ldots, b_n)$ are of the shape $(b_1 \xi_1, \ldots, b_n \xi_n)$ with
\[ a_1 b_1 \xi_1 + \cdots + a_n b_n \xi_n = 1, \quad (\xi_1, \ldots, \xi_n) \in k^n, \tag{7.3.5} \]
\[ \sum_{i \in I} a_i b_i \xi_i \neq 0 \quad \text{for each non-empty } I \subset \{1, \ldots, n\}. \tag{7.3.6} \]
The tuples (ξ1 , . . . , ξn ) with (7.3.5) form a linear subvariety V of kn , and
the elements of V with (7.3.6) lie in the complement of a finite number of
linear subvarieties of V , say V1 , . . . , Vr . Thus, the set of non-degenerate solutions of (7.3.4) that are k-proportional to (b1 , . . . , bn ) can be described as
(b1 ξ1 , . . . , bn ξn ) with (ξ1 , . . . , ξn ) ∈ U := V \ (V1 ∪ · · · ∪ Vr ).
So we have to decide whether or not $U \neq \emptyset$ and if so, find an element of U. Notice that $U \neq \emptyset$ precisely if $V \neq \emptyset$ and $V_1, \ldots, V_r$ are proper linear subvarieties of V. This can be checked using Proposition 2.4.2. Further, assuming $U \neq \emptyset$ one can find an element of U using the parameter representations of $V, V_1, \ldots, V_r$ that can be computed according to Proposition 2.4.2.
Assume that $U \neq \emptyset$. Then one easily checks that U consists of only one
element if V has dimension 0, that is, if a1 b1 , . . . , an bn are linearly independent
over k, and U is infinite if V is positive dimensional, which is the case precisely
if a1 b1 , . . . , an bn are linearly dependent over k.
7.4 Results on the number of solutions
Let again k be an algebraically closed field of characteristic 0 and let now
K be an extension field of k of arbitrary positive transcendence degree
over k.
Let n ≥ 2, and denote by $(K^*)^n$ the n-fold direct product of the multiplicative group $K^*$, endowed with coordinatewise multiplication. We consider equations with solution vectors from a subgroup Γ of $(K^*)^n$ such that $(k^*)^n \subset \Gamma$, and $\Gamma/(k^*)^n$ has finite rank r. If r = 0 this means that $\Gamma = (k^*)^n$, while if r > 0, this means that there are multiplicatively independent elements $u_1, \ldots, u_r$ of Γ such that every element of Γ can be expressed as $\xi \cdot u_1^{w_1} \cdots u_r^{w_r}$ with $\xi \in (k^*)^n$ and $w_1, \ldots, w_r \in \mathbb{Q}$. (The coordinates of $u_i^{w_i}$ are determined only up to multiplication by roots of unity, but we just make any choice for them.)
We start with the equation in two unknowns
\[ a_1 x_1 + a_2 x_2 = 1 \quad \text{in } (x_1, x_2) \in \Gamma, \tag{7.4.1} \]
where Γ is a subgroup of $(K^*)^2$ and $a_1, a_2 \in K^*$. The following theorem is a generalization of a result of Zannier (2004).
Theorem 7.4.1 Suppose that $\Gamma \supset (k^*)^2$ and that $\Gamma/(k^*)^2$ has finite rank r ≥ 0. Then (7.4.1) has at most $3^r$ solutions with $a_j x_j \notin k^*$ for j = 1, 2.
We now consider equations in an arbitrary number of unknowns, i.e.,
\[ a_1 x_1 + \cdots + a_n x_n = 1 \quad \text{in } (x_1, \ldots, x_n) \in \Gamma, \tag{7.4.2} \]
where Γ is a subgroup of $(K^*)^n$ and $a_1, \ldots, a_n \in K^*$. Recall that a solution of (7.4.2) is called non-degenerate if $\sum_{i \in I} a_i x_i \neq 0$ for each non-empty subset I
of {1, . . . , n}. Further, we say that two solutions (x1 , . . . , xn ), (x̃1 , . . . , x̃n ) are
k-proportional, or belong to the same k-proportionality class, if xi /x̃i ∈ k∗ for
i = 1, . . . , n. The next theorem is the main result from Evertse and Zannier
(2008).
Theorem 7.4.2 Let n ≥ 2. Suppose that $\Gamma \supset (k^*)^n$ and that $\Gamma/(k^*)^n$ has finite rank r ≥ 0. Then the non-degenerate solutions of (7.4.2) lie in at most
\[ \sum_{i=2}^{n+1} \binom{i}{2}^{r} - n + 1 \]
k-proportionality classes.
Theorem 7.4.1 follows at once from Theorem 7.4.2. Indeed, all solutions of (7.4.1) are non-degenerate. Further, the solutions with $a_j x_j \notin k^*$ for j = 1, 2 are pairwise k-non-proportional, and by substituting n = 2 into the bound of Theorem 7.4.2 we obtain precisely the bound $3^r$ from Theorem 7.4.1.
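The substitution n = 2 can be checked mechanically. The following minimal sketch evaluates the bound of Theorem 7.4.2; the function name `class_bound` is an illustrative choice, not notation from the text.

```python
from math import comb

def class_bound(n, r):
    """Bound of Theorem 7.4.2: sum_{i=2}^{n+1} binom(i, 2)**r - n + 1."""
    if n < 2 or r < 0:
        raise ValueError("need n >= 2 and r >= 0")
    return sum(comb(i, 2) ** r for i in range(2, n + 2)) - n + 1

# For n = 2 the sum is 1 + 3**r, so the bound collapses to 3**r.
for r in range(5):
    print(r, class_bound(2, r), 3 ** r)
```

For r = 0 the bound equals 1 for every n, in line with the fact that for $\Gamma = (k^*)^n$ all solutions are k-proportional.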
We mention that the proof of Theorem 7.4.2, given in Evertse and Zannier
(2008) depends heavily on ideas introduced in Bombieri, Mueller and Zannier
(2001) and Zannier (2004). Weaker and less general results were obtained
earlier in Evertse and Győry (1988b) and Mueller (2000).
To give a flavour of the techniques used in the papers mentioned above, in
the next section we prove Theorem 7.4.1 in the special case that K has transcendence degree 1 over k. The general case that K has arbitrary transcendence
degree over k can be reduced to this special case by means of a specialization
argument. The proof of Theorem 7.4.2, which is not given here, is based on the
same ideas as the proof of Theorem 7.4.1.
7.5 Proof of Theorem 7.4.1
Let k be an algebraically closed field of characteristic 0, let K be an extension of k of transcendence degree 1, and let Γ be a subgroup of $(K^*)^2$ which contains $(k^*)^2$ and such that $\Gamma/(k^*)^2$ has finite rank r. We would like to define in some way the k-closure $\overline{\Gamma}$ of Γ, which is such that if a point $(x_1, x_2)$ belongs to this k-closure, then so does $(x_1^w, x_2^w)$ for any $w \in k$. Then we would like to consider equation (7.4.1) with solutions from the k-closure of Γ instead of Γ itself. The importance of this is that it will allow us to use techniques from algebraic geometry.
It does not suffice to define the k-closure of Γ formally by taking the tensor product of Γ with k, but we have to embed this k-closure somehow into a ring or field, to make sense of our desired extension of (7.4.1). As it turns out, one can define exponentiation with elements from k for formal power series. Then the k-closure of Γ can be defined after embedding K into a formal power series ring.
7.5.1 Extension to the k-closure of Γ
We keep the notation introduced in Section 7.4. Let again k be an algebraically closed field of characteristic 0, K an extension of k of transcendence degree 1, and Γ a subgroup of $(K^*)^2$ such that $\Gamma \supset (k^*)^2$ and $\Gamma/(k^*)^2$ has rank r. Choose pairs $(b_{i1}, b_{i2})$ (i = 1, . . . , r) from Γ that are multiplicatively independent over $(k^*)^2$, i.e., there is no non-zero vector $w = (w_1, \ldots, w_r) \in \mathbb{Z}^r$ with $\prod_{i=1}^{r} (b_{i1}, b_{i2})^{w_i} \in (k^*)^2$. Then every element of Γ can be expressed as
\[ (\xi_1, \xi_2) \prod_{i=1}^{r} (b_{i1}, b_{i2})^{w_i}, \tag{7.5.1} \]
where (ξ1 , ξ2 ) ∈ (k∗ )2 , w1 , . . . , wr ∈ Q, and where the powers with rational
exponents are defined up to multiplication by points consisting of roots of
unity.
Let L be the extension of k generated by a1 , a2 , bij (i = 1, . . . , r, j = 1, 2).
Notice that L is an algebraic function field in one variable over k. Choose a valuation v of L such that v(aj ) = 0, v(bij ) = 0 for j = 1, 2, i = 1, . . . , r. Since
v has residue class field k, after multiplying a1 , a2 and the bij by appropriate
elements from k∗ , we can achieve that
\[ v(a_j - 1) > 0, \quad v(b_{ij} - 1) > 0 \quad \text{for } j = 1, 2,\ i = 1, \ldots, r. \tag{7.5.2} \]
Choose a local parameter z for v. Then the completion of L at v is the field of
formal Laurent series k((z)), and we may assume that L is a subfield of k((z)).
The valuation v is extended to k((z)) by setting $v\bigl(\sum_{i=i_0}^{\infty} c_i z^i\bigr) = i_0$ if $c_i \in k$ for $i \ge i_0$ and $c_{i_0} \neq 0$. We say that a sequence $\{f_m\}_{m=0}^{\infty}$ in k((z)) converges to f if $v(f_m - f) \to \infty$ as $m \to \infty$. The derivative of $f = \sum_{i=i_0}^{\infty} c_i z^i \in k((z))$ is defined by $f' = df/dz = \sum_{i=i_0}^{\infty} i c_i z^{i-1}$. If $\lim_{m \to \infty} f_m = f$ for some sequence $\{f_m\}$ in k((z)), then also $\lim_{m \to \infty} f_m' = f'$.
Denote by k[[z]] the ring of formal power series in z, and denote by 1 + zk[[z]] the set of power series
\[ 1 + \sum_{i=1}^{\infty} c_i z^i \quad \text{with } c_i \in k \text{ for } i \ge 1. \]
Notice that 1 + zk[[z]] is a multiplicative group.
By (7.5.2) the elements aj , bij (j = 1, 2, i = 1, . . . , r) all belong to the
group 1 + zk[[z]]. Further, they belong to L, hence are algebraic over k(z).
We are now ready to define exponentiation with elements from k. For $f \in 1 + zk[[z]]$, $w \in k$ we put
\[ f^w := \sum_{j=0}^{\infty} \binom{w}{j} (f - 1)^j, \]
where $\binom{w}{j} := w(w-1) \cdots (w-j+1)/j!$. In the topology of k[[z]] defined by v, this infinite series converges to a limit which belongs to 1 + zk[[z]]. Obviously, by Newton’s binomial formula, $f^w = f \cdots f$ (w times) for any non-negative integer w. This exponentiation has the usual properties:
\[ (f^w)' = w f^{w-1} f' \quad \text{for } f \in 1 + zk[[z]],\ w \in k; \tag{7.5.3} \]
\[ (fg)^w = f^w g^w \quad \text{for } f, g \in 1 + zk[[z]],\ w \in k; \tag{7.5.4} \]
\[ f^{w_1 + w_2} = f^{w_1} f^{w_2} \quad \text{for } f \in 1 + zk[[z]],\ w_1, w_2 \in k; \tag{7.5.5} \]
\[ (f^{w_1})^{w_2} = f^{w_1 w_2} \quad \text{for } f \in 1 + zk[[z]],\ w_1, w_2 \in k. \tag{7.5.6} \]
Property (7.5.3) can be proved using the fact that $((f - 1)^j)' = j (f - 1)^{j-1} f'$ for all j and that infinite series can be differentiated termwise. The other properties can be proved using logarithmic derivatives, where the logarithmic derivative of $f \in k((z))$ is $f'/f$. The map $f \mapsto f'/f$ defines an injective homomorphism from the multiplicative group 1 + zk[[z]] to the additive group of k[[z]], and $(f^w)'/f^w = w f'/f$ for $f \in 1 + zk[[z]]$, $w \in k$. For instance, (7.5.4) is proved by showing that both $(fg)^w$ and $f^w g^w$ have logarithmic derivatives $w \cdot ((f'/f) + (g'/g))$. The identities (7.5.5) and (7.5.6) can be proved likewise.
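The exponentiation and its properties (7.5.3)–(7.5.6) can be tried out on truncated power series. The following is a minimal sketch under stated assumptions: the coefficients are rational numbers (standing in for elements of k), all series are truncated modulo $z^N$ with N = 8, and the helper names `series`, `mul`, `binom`, `power` and `deriv` are illustrative only.

```python
from fractions import Fraction

N = 8  # all series are truncated modulo z**N

def series(*coeffs):
    """Truncated power series with the given low-order coefficients."""
    c = [Fraction(x) for x in coeffs][:N]
    return c + [Fraction(0)] * (N - len(c))

def mul(f, g):
    """Product modulo z**N."""
    out = [Fraction(0)] * N
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            if i + j < N:
                out[i + j] += a * b
    return out

def binom(w, j):
    """Generalized binomial coefficient w(w-1)...(w-j+1)/j!."""
    c = Fraction(1)
    for t in range(j):
        c = c * (w - t) / (t + 1)
    return c

def power(f, w):
    """f**w = sum_j binom(w, j) (f - 1)**j for f with constant term 1."""
    if f[0] != 1:
        raise ValueError("f must lie in 1 + z k[[z]]")
    g = [Fraction(0)] + f[1:]      # g = f - 1 has order >= 1
    term, out = series(1), series(0)
    for j in range(N):             # g**j vanishes modulo z**N for j >= N
        out = [o + binom(w, j) * t for o, t in zip(out, term)]
        term = mul(term, g)
    return out

def deriv(f):
    """Formal derivative; only the first N - 1 coefficients are meaningful."""
    return [i * c for i, c in enumerate(f)][1:] + [Fraction(0)]

f = series(1, 1, 2)                # f = 1 + z + 2 z^2
w1, w2 = Fraction(1, 2), Fraction(-2, 3)
print(power(f, w1)[:4])
print(mul(power(f, w1), power(f, w2)) == power(f, w1 + w2))   # (7.5.5)
```

Because only finitely many terms $(f-1)^j$ contribute modulo $z^N$, the truncated computation is exact, which reflects the convergence of the defining series in the topology given by v.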
In the usual manner, we put $(x_1, \ldots, x_n)^w := (x_1^w, \ldots, x_n^w)$ for $x_1, \ldots, x_n \in 1 + zk[[z]]$ and $w \in k$. We now define the k-closure of Γ by
\[ \overline{\Gamma} := \Bigl\{ (\xi_1, \xi_2) \prod_{i=1}^{r} (b_{i1}, b_{i2})^{w_i} : \xi_1, \xi_2 \in k^*,\ w_1, \ldots, w_r \in k \Bigr\}. \tag{7.5.7} \]
By (7.5.1), this group indeed contains Γ. In what follows, we write w for vectors $(w_1, \ldots, w_r) \in k^r$. Our result for $\overline{\Gamma}$ is as follows.
Theorem 7.5.1 Let r be a positive integer and let $a_j, b_{ij}$ (j = 1, 2, i = 1, . . . , r) be elements of 1 + zk[[z]] such that
\[ a_j, b_{ij} \text{ are algebraic over } k(z) \text{ for } j = 1, 2,\ i = 1, \ldots, r, \tag{7.5.8} \]
\[ \text{there is no } w \in \mathbb{Z}^r \setminus \{0\} \text{ with } \prod_{i=1}^{r} (b_{i1}, b_{i2})^{w_i} \in (k^*)^2. \tag{7.5.9} \]
Let $\overline{\Gamma}$ be given by (7.5.7). Then the equation
\[ a_1 x_1 + a_2 x_2 = 1 \quad \text{in } (x_1, x_2) \in \overline{\Gamma} \text{ with } a_1 x_1 \notin k^*,\ a_2 x_2 \notin k^* \tag{7.5.10} \]
has at most $3^r$ solutions.
The idea of the proof is to consider the vectors w ∈ kr in the representation
(7.5.7) for the solutions (x1 , x2 ) of (7.5.10), and to estimate the number of these
w using techniques from algebraic geometry. Here it will be crucial that w can
be chosen from $k^r$ and not just from $\mathbb{Q}^r$, as was the case with the group Γ.
7.5.2 Some algebraic geometry
We have collected some basic facts from algebraic geometry. Our basic reference is Hartshorne (1977), chapter 1. As before, k is an algebraically closed
field of characteristic 0. Let r be a positive integer.
By an algebraic subset of $k^r$ we mean the set of common zeros in $k^r$ of a collection of polynomials in $k[X_1, \ldots, X_r]$. The algebraic subset of $k^r$ given by $f_1, \ldots, f_m \in k[X_1, \ldots, X_r]$, notation $V(f_1, \ldots, f_m)$, is defined as the set of common zeros in $k^r$ of $f_1, \ldots, f_m$. The algebraic subsets of $k^r$ are the closed sets of the Zariski topology on $k^r$. We say that an algebraic subset of $k^r$ is defined over a subfield $k'$ of k if it is the set of common zeros in $k^r$ of polynomials with coefficients in $k'$.
The collection of all polynomials f ∈ k[X1 , . . . , Xr ] vanishing identically
on a given algebraic set X ⊂ kr is an ideal of k[X1 , . . . , Xr ], which is denoted
by I (X ). Clearly, if X , Y are algebraic subsets of kr , then X ⊆ Y if and only
if I (X ) ⊇ I (Y). Since k[X1 , . . . , Xr ] is a Noetherian domain, every ascending
chain of ideals of k[X1 , . . . , Xr ] is eventually constant. By applying this to
ideals associated with algebraic sets, we obtain the descending chain property
for algebraic sets:
if Xi (i = 1, 2, . . .) are algebraic subsets of kr with X1 ⊇ X2 ⊇ X3 ⊇ · · · , then
there is j0 such that Xj = Xj0 for j ≥ j0 .
An algebraic subset of kr is called irreducible if it is not the union of two
strictly smaller algebraic subsets of kr . An irreducible algebraic subset of kr
is also called an algebraic subvariety of kr . A linear subvariety of kr is an
algebraic subvariety of kr given by linear polynomials.
From the descending chain property for algebraic sets it follows easily that
every non-empty algebraic subset X of kr is a union V1 ∪ · · · ∪ Vg of finitely
many algebraic subvarieties of kr . If we assume in addition that none of these
algebraic subvarieties is contained in the union of the others, they are uniquely
determined. In that case, V1 , . . . , Vg are called the irreducible components of
X . From the definition of irreducibility it follows at once that any algebraic
subvariety contained in X must be contained in an irreducible component
of X .
Let V be an algebraic subvariety of kr . Then its associated ideal I (V) is a
prime ideal of k[X1 , . . . , Xr ], hence the quotient ring k[X1 , . . . , Xr ]/I (V) is
an integral domain. The quotient field of this domain is called the function field
of V, notation k(V). The transcendence degree over k of this field is called
the dimension of V, notation dim V. A zero-dimensional algebraic variety is a
point, and a one-dimensional algebraic variety is an algebraic curve. Further,
if V1 , V2 are algebraic subvarieties of kr with V1 strictly contained in V2 , then
dim V1 < dim V2 .
We recall some results from intersection theory. To state these, we need
the notion of degree of an algebraic variety. Let V be an n-dimensional algebraic subvariety of kr . Denote by Vm the k-vector space of polynomials in
k[X1 , . . . , Xr ] of total degree at most m and define HV (m) to be the dimension of the quotient k-vector space Vm /(Vm ∩ I (V)). One can show that there
is a polynomial pV ∈ Q[X], called the Hilbert polynomial of V, such that
$H_V(m) = p_V(m)$ for every sufficiently large integer m. Further, there is a positive integer deg V, called the degree of V, such that $p_V(m) = \deg V \cdot m^n/n! +$ (lower powers of m). For instance, deg V = 1 if $V = k^r$ or if V is a point.
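To make the definition of the degree concrete, the Hilbert function can be written down in two easy cases. The sketch below assumes, beyond what is stated in the text, that V is either $k^r$ itself or a hypersurface V(f) with f irreducible of total degree d, so that I(V) is generated by f and the polynomials of degree at most m in I(V) are exactly the products f·g with g of degree at most m − d. The function names are illustrative.

```python
from math import comb

def dim_polys(r, m):
    """Dimension of the space of polynomials in r variables of total degree <= m."""
    return comb(m + r, r) if m >= 0 else 0

def hilbert_affine_space(r, m):
    """H_V(m) for V = k^r, where I(V) = 0."""
    return dim_polys(r, m)

def hilbert_hypersurface(r, d, m):
    """H_V(m) for V = V(f), f irreducible of degree d in r variables (assumption)."""
    return dim_polys(r, m) - dim_polys(r, m - d)

# A plane cubic curve (r = 2, d = 3): dimension n = 1 and H_V(m) = 3m,
# so the leading coefficient deg V * m^1 / 1! gives deg V = 3.
print([hilbert_hypersurface(2, 3, m) for m in range(3, 8)])
```

For $V = k^r$ the same reading gives $\binom{m+r}{r} = m^r/r! + \cdots$, consistent with deg V = 1.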
Proposition 7.5.2 Let V be an algebraic subvariety of $k^r$ and let X be an algebraic subset of $k^r$ given by polynomials of total degree at most d. Then V ∩ X is an algebraic subset of $k^r$ with at most $d^{\dim V} \cdot \deg V$ irreducible components.
Proof. We proceed by induction on dim V. If V has dimension 0, i.e., is a point,
the assertion is obvious.
Suppose that dim V = n > 0. If V ⊆ X we are done since by definition, V itself is irreducible. Assume that V ⊄ X. Then there is a polynomial $f \in k[X_1, \ldots, X_r]$ of total degree at most d that vanishes identically on X, but does not vanish identically on V. We now invoke a version of Bézout’s Theorem (see Hartshorne (1977), chapter 1, Theorem 7.7 for a more precise version with multiplicities) which states that if $V_1, \ldots, V_g$ are the irreducible components of V ∩ V(f), then
\[ \dim V_i = n - 1 \text{ for } i = 1, \ldots, g, \qquad \sum_{i=1}^{g} \deg V_i \le d \cdot \deg V. \]
Now, by the induction hypothesis, we have for i = 1, . . . , g that $V_i \cap X$ has at most $d^{n-1} \cdot \deg V_i$ irreducible components. Consequently, the number of irreducible components of V ∩ X is at most
\[ \sum_{i=1}^{g} d^{n-1} \cdot \deg V_i \le d^n \deg V = d^{\dim V} \cdot \deg V. \]
Corollary 7.5.3 Let X be an algebraic subset of $k^r$ given by polynomials of total degree at most d. Let Y be an algebraic subset of X such that X \ Y is finite. Then X \ Y has cardinality at most $d^r$.
Proof. Assume that X \ Y = {P1 , . . . , Pg }. Notice that the irreducible components of Y, together with P1 , . . . , Pg , form a decomposition of X into irreducible subsets, none of which is contained in the union of the others. This
shows that {P1 }, . . . , {Pg } are irreducible components of X . Now our corollary
follows at once by applying Proposition 7.5.2 with V = kr .
7.5.3 Proof of Theorem 7.5.1
We keep the notation and assumptions from Subsection 7.5.1. We denote by E the algebraic closure of k(z) in the field k((z)). It is important to observe that the differentiation $x \mapsto x' = dx/dz$ maps elements from E to elements from E.
Suppose that $a_j, b_{ij}$ (j = 1, 2, i = 1, . . . , r) satisfy (7.5.8) and (7.5.9). Denote by X the set of $w = (w_1, \ldots, w_r) \in k^r$ such that the three functions
\[ 1, \quad a_1 \prod_{i=1}^{r} b_{i1}^{w_i}, \quad a_2 \prod_{i=1}^{r} b_{i2}^{w_i} \]
are k-linearly dependent. Further, denote by Y the set of $w \in k^r$ such that some two among these functions are k-linearly dependent.
Let $(x_1, x_2)$ be a solution of (7.5.10). Representing $(x_1, x_2)$ as in (7.5.7), we obtain
\[ \xi_1 a_1 \prod_{i=1}^{r} b_{i1}^{w_i} + \xi_2 a_2 \prod_{i=1}^{r} b_{i2}^{w_i} = 1 \tag{7.5.11} \]
with $\xi_1, \xi_2 \in k^*$, $w = (w_1, \ldots, w_r) \in k^r$. Hence w ∈ X. Further, the condition $a_1 x_1 \notin k^*$, $a_2 x_2 \notin k^*$ implies that w ∉ Y. This shows that w ∈ X \ Y. Conversely, let w ∈ X \ Y. Then there are unique $\xi_1, \xi_2 \in k^*$ with (7.5.11), and this leads to a unique solution $(x_1, x_2) = (\xi_1, \xi_2) \prod_{i=1}^{r} (b_{i1}, b_{i2})^{w_i}$ of (7.5.10).
So in order to prove Theorem 7.5.1 it suffices to prove the following.
Proposition 7.5.4 The set X \ Y has cardinality at most $3^r$.
Eventually, we will apply Corollary 7.5.3 from the previous subsection. We
first show that X \ Y is finite, and then that X , Y are algebraic sets where X is
given by polynomials of total degree at most 3.
We first prove a number of lemmas.
Lemma 7.5.5 Let L be a finite extension of k(z) contained in E, and let $\beta_1, \ldots, \beta_r, \alpha \in L^*$ and $w_1, \ldots, w_r \in k$ be such that $\beta_1^{w_1} \cdots \beta_r^{w_r} = \alpha$. Then for every valuation v of L we have $\sum_{i=1}^{r} w_i v(\beta_i) = v(\alpha)$.
Proof. We need some facts on residues. Choose a local parameter $z_v$ of v. Then every $f \in L$ can be expressed as a Laurent series $\sum_{i=i_0}^{\infty} c_i z_v^i$ with $c_i \in k$. We define the residue of f at $z_v$ by $\operatorname{res}_{z_v}(f) := c_{-1}$; this defines a k-linear map from L to k. One can show that the residue depends only on v, i.e., that it is independent of the choice of $z_v$, but we do not need this. We need only the easily verifiable fact that for the logarithmic derivative of $f \in L^*$ with respect to $z_v$ we have $\operatorname{res}_{z_v}(f^{-1}\, df/dz_v) = v(f)$.
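The fact about residues just quoted can be verified directly from the Laurent expansion. A short sketch, in which the factorization $f = z_v^m u$ and the symbols m, u are notation introduced only for this illustration:

```latex
Write $f = z_v^{m} u$ with $m = v(f)$ and $u = \sum_{i \ge 0} c_i z_v^{i}$, $c_0 \neq 0$. Then
\[
\frac{1}{f}\,\frac{df}{dz_v} = \frac{m}{z_v} + \frac{1}{u}\,\frac{du}{dz_v},
\]
and $u^{-1}\, du/dz_v$ is a power series in $z_v$ (as $u$ is invertible in $k[[z_v]]$),
so it has no term in $z_v^{-1}$. Hence
$\operatorname{res}_{z_v}\bigl(f^{-1}\, df/dz_v\bigr) = m = v(f)$.
```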
Recall that the logarithmic derivative $x \mapsto x^{-1}\, dx/dz$ is defined on the group 1 + zk[[z]] and that it maps products of powers with exponents in k to linear combinations. Thus, $\prod_{i=1}^{r} \beta_i^{w_i} = \alpha$ maps to
\[ \sum_{i=1}^{r} w_i \cdot \frac{d\beta_i/dz}{\beta_i} = \frac{d\alpha/dz}{\alpha}, \]
which is an identity with functions in L. By multiplying with $dz/dz_v$, which also belongs to L, we obtain the same identity, but with $z_v$ instead of z. Then by taking residues with respect to $z_v$, our lemma follows.
Lemma 7.5.6 Let $\beta_{ij}$ (i = 1, . . . , r, j = 1, . . . , m) be elements of (1 + zk[[z]]) ∩ E such that there is no non-zero $w = (w_1, \ldots, w_r) \in \mathbb{Z}^r$ with
\[ \prod_{i=1}^{r} \beta_{ij}^{w_i} \in k^* \quad \text{for } j = 1, \ldots, m. \]
Then the map
\[ \psi : w \mapsto \Bigl( \prod_{i=1}^{r} \beta_{i1}^{w_i}, \ldots, \prod_{i=1}^{r} \beta_{im}^{w_i} \Bigr) \]
defines an injective homomorphism from $k^r$ to $(1 + zk[[z]])^m$ with coordinatewise multiplication.
Proof. By (7.5.5), the map ψ defines a homomorphism on $k^r$. Denote by H the kernel of ψ. Notice that H is the set of $w \in k^r$ such that
\[ \prod_{i=1}^{r} \beta_{ij}^{w_i} = 1 \quad \text{for } j = 1, \ldots, m. \tag{7.5.12} \]
We have to prove that H = (0).
Let L be the extension of k(z) generated by the elements $\beta_{ij}$ (i = 1, . . . , r, j = 1, . . . , m). Then L ⊂ E and L is a finite extension of k(z). By Lemma 7.5.5, for every w ∈ H we have
\[ \sum_{i=1}^{r} w_i \cdot v(\beta_{ij}) = 0 \quad \text{for } j = 1, \ldots, m,\ v \in M_L, \]
where $M_L$ is the set of valuations of L. The latter system defines a linear subspace $H'$ of $k^r$, defined over $\mathbb{Q}$, containing H. Suppose H ≠ (0). Then $H' \neq (0)$, and since $H'$ is defined over $\mathbb{Q}$ it contains a non-zero vector $w = (w_1, \ldots, w_r) \in \mathbb{Z}^r$. Put $x_j := \prod_{i=1}^{r} \beta_{ij}^{w_i}$ for j = 1, . . . , m. Then $x_j \in L$ and $v(x_j) = 0$ for j = 1, . . . , m, $v \in M_L$. But this implies $x_j \in k^*$ for j = 1, . . . , m, contradicting our assumption.
Lemma 7.5.7 Let m, r be positive integers, let $R \in E(U_1, \ldots, U_m)$ be a rational function in m variables, and let $\beta_1, \ldots, \beta_r \in (1 + zk[[z]]) \cap E$. Then there are only finitely many α ∈ E for which there exist $w = (w_1, \ldots, w_r) \in k^r$, $u \in k^m$ such that
\[ \prod_{i=1}^{r} \beta_i^{w_i} = R(u) = \alpha. \tag{7.5.13} \]
Proof. The assertion is obvious if all $\beta_i$ are equal to 1. Suppose that not all $\beta_i$ are equal to 1. Then since (1 + zk[[z]]) ∩ k = {1}, not all $\beta_i$ belong to $k^*$. Write R = P/Q, where $P, Q \in E[U_1, \ldots, U_m]$. Let L be the extension of k(z) generated by $\beta_1, \ldots, \beta_r$ and the coefficients of P, Q. Then L is a finite extension of k(z) with L ⊂ E. There is a valuation v of L such that the integers $v(\beta_i)$ are not all equal to 0.
We claim that there are integers a, b independent of u such that if R(u) is defined and non-zero, then a ≤ v(R(u)) ≤ b. Choose a local parameter $z_v$ for v. By expressing the coefficients of P as Laurent series in $z_v$, we obtain
\[ P(u) = \sum_{i=i_0}^{\infty} p_i(u) z_v^i \]
with $p_i \in k[U_1, \ldots, U_m]$ not all identically 0. Put $X_{i_0 - 1} := k^m$, and for $j \ge i_0$, denote by $X_j$ the set of $u \in k^m$ such that $p_i(u) = 0$ for $i_0 \le i \le j$. Then $X_{i_0} \supseteq X_{i_0+1} \supseteq \cdots$, and by the descending chain property for algebraic sets, there is $j_0$ such that $X_j = X_{j_0}$ for $j \ge j_0$. Let $j_0$ be the smallest index with this property. If $u \in k^m$ is such that $P(u) \neq 0$ and v(P(u)) = j, say, we have $u \in X_{j-1} \setminus X_j$. Hence $v(P(u)) \in \{i_0, \ldots, j_0\}$. By applying the same reasoning to Q, our claim follows.
Let α ∈ E be such that there exist $w \in k^r$, $u \in k^m$ with (7.5.13). By Lemma 7.5.5 we have
\[ j = \sum_{i=1}^{r} w_i v(\beta_i), \]
where j = v(R(u)) ∈ {a, a + 1, . . . , b}. Thus, w belongs to one of finitely many linear subvarieties of $k^r$ of dimension r − 1, all defined over $\mathbb{Q}$.
We now proceed by induction on r. If r = 1, we have only finitely many
possibilities for w, and then our lemma follows at once. Suppose that r ≥ 2,
and let L be one of the linear varieties from above. We have to show that L gives
rise to only finitely many α as in (7.5.13). There are $c_0 \in \mathbb{Q}^r$ and linearly independent $c_k \in \mathbb{Z}^r$ (k = 1, . . . , r − 1) such that every w ∈ L can be expressed as $w = c_0 + \sum_{k=1}^{r-1} t_k c_k$ with $t_1, \ldots, t_{r-1} \in k$. Write $c_k = (c_{1k}, \ldots, c_{rk})$,
$\gamma_k := \prod_{i=1}^{r} \beta_i^{c_{ik}}$ for k = 0, . . . , r − 1. Notice that $\gamma_k \in E$ for k = 0, . . . , r − 1. By substituting our expression for w ∈ L into (7.5.13), we obtain
\[ \prod_{k=1}^{r-1} \gamma_k^{t_k} = \gamma_0^{-1} R(u) = \gamma_0^{-1} \alpha. \]
By the induction hypothesis, we have only finitely many possibilities for γ0−1 α,
hence for α. This completes our induction step.
Lemma 7.5.8 The set X \ Y is finite.
Proof. Let $w = (w_1, \ldots, w_r) \in X \setminus Y$. Put $y_j := \prod_{i=1}^{r} b_{ij}^{w_i}$ for j = 1, 2. Then the logarithmic derivative of $a_j y_j$ with respect to z is
\[ \frac{a_j'}{a_j} + \sum_{i=1}^{r} w_i \cdot \frac{b_{ij}'}{b_{ij}} =: Q_j \quad \text{for } j = 1, 2. \]
Since w ∈ X , there are ξ1 , ξ2 ∈ k∗ with
ξ1 a1 y1 + ξ2 a2 y2 = 1.
Upon differentiating this identity with respect to z, we obtain
Q1 · ξ1 a1 y1 + Q2 · ξ2 a2 y2 = 0.
Since $w \notin Y$ we have $a_1 y_1 \neq a_2 y_2$, and since logarithmic differentiation $x \mapsto x'/x$ is injective on $1 + zk[[z]]$, we have $Q_1 \neq Q_2$. Hence the last two equations have a unique solution $(y_1, y_2)$, and on applying Cramer's rule we obtain
\[ \prod_{i=1}^{r} b_{ij}^{w_i} = y_j = R_j(\xi_1, \xi_2, w_1, \ldots, w_r) \quad \text{for } j = 1, 2, \]
where R1 , R2 are certain rational functions in E(U1 , . . . , Ur+2 ).
Put $\psi(w) := \bigl(\prod_{i=1}^{r} b_{i1}^{w_i},\ \prod_{i=1}^{r} b_{i2}^{w_i}\bigr)$. Lemma 7.5.7 implies that if w runs
through X \ Y, then ψ(w) runs through a finite set. On the other hand, by
condition (7.5.9) and Lemma 7.5.6, ψ defines an injective map. This shows
that X \ Y is finite.
Lemma 7.5.9 The set X is an algebraic subset of kr , given by polynomials in
k[X1 , . . . , Xr ] of total degree at most 3. Further, Y is an algebraic subset of
X.
Proof. We apply the Wronskian criterion, that functions $1, f_1, f_2 \in k((z))$
are linearly dependent over k if and only if $f_1' f_2'' - f_2' f_1'' = 0$, where
$f_j'' := d^2 f_j/dz^2$. Let $f_j := a_j \prod_{i=1}^{r} b_{ij}^{w_i}$ for j = 1, 2. By a straightforward computation, $f_j' = p_j(w) f_j$, $f_j'' = q_j(w) f_j$ for j = 1, 2, where the $p_j$ are linear polynomials and the $q_j$ are quadratic polynomials with coefficients in E.
Since $f_1 f_2 \neq 0$ for every w, it follows that w ∈ X if and only if h(w) = 0,
where $h := p_1 q_2 - p_2 q_1$. Notice that h has total degree at most 3. From the
coefficients of h we select a maximal subset which is linearly independent over
k, {a1 , . . . , as }, say. Then the other coefficients of h can be expressed as k-linear combinations of a1 , . . . , as , and we get $h = \sum_{k=1}^{s} a_k h_k$ with polynomials
$h_k \in k[X_1, \ldots, X_r]$ ($k = 1, \ldots, s$) of total degree at most 3. Now, clearly, for
w ∈ kr we have h(w) = 0 if and only if hk (w) = 0 for k = 1, . . . , s. Hence X
is an algebraic set given by h1 , . . . , hs .
To prove that Y is an algebraic subset of X , we use the Wronskian criterion
that two functions $f_1, f_2 \in k((z))$ are k-linearly dependent if and only if $f_1 f_2' - f_1' f_2 = 0$, and follow the same arguments as above.
Now Proposition 7.5.4 follows at once by combining Lemmas 7.5.8 and
7.5.9 with Corollary 7.5.3.
7.6 Results in positive characteristic
We give an overview of some results on S-unit equations and generalizations
thereof over function fields of positive characteristic.
Let k be a field of characteristic p > 0 and K a finite extension of the rational
function field k(z). We assume that k is algebraically closed in K. Denote by
gK/k the genus of K over k.
As in the characteristic 0 case, we can endow K with a set of
valuations $M_K$ (i.e., normalized discrete valuations that are trivial on k) satisfying the Sum Formula $\sum_{v \in M_K} v(x) = 0$ for $x \in K^*$. Further, we define for
$x = (x_1, \ldots, x_n) \in K^n \setminus \{0\}$,
\[ v(x) := -\min(v(x_1), \ldots, v(x_n)) \ \text{ for } v \in M_K, \qquad H_K^{\mathrm{hom}}(x) := \sum_{v \in M_K} v(x). \]
The height of $x \in K$ is given by $H_K(x) := -\sum_{v \in M_K} \min(0, v(x))$. For a finite
set S of valuations from $M_K$, we define the group of S-units
\[ O_S^* := \{x \in K : v(x) = 0 \text{ for } v \in M_K \setminus S\}. \]
Recall that the Frobenius map $x \mapsto x^p$ defines an injective field
homomorphism on K. As a consequence, the sets $K^{p^m} := \{x^{p^m} : x \in K\}$
(m = 1, 2, . . .) are subfields of K. We say that a1 , . . . , am ∈ K are linearly
independent over a subset U of K if there are no c1 , . . . , cm ∈ U \ {0} such
that c1 a1 + · · · + cm am = 0.
We start with an analogue of Theorem 7.1.1 in the case of characteristic p,
also due to Mason (1984), chapter VI, Lemma 10.
Theorem 7.6.1 Let x1 , x2 , x3 be non-zero elements of K, and let S be a finite
set of valuations of K such that
x1 + x2 + x3 = 0,
v(x1 ) = v(x2 ) = v(x3 ) for v ∈ MK \ S.
Then either x1 /x2 ∈ K p or
HK (x1 /x2 ) ≤ 2gK/k − 2 + |S|.
Of course, if we allow $x_1/x_2 \in K^p$, the result may become false. For instance,
if $x_1, x_2, x_3$ are any elements of K with $x_1 + x_2 + x_3 = 0$ and $x_1/x_2$ not in the
constant field k, then we also have $x_1^{p^m} + x_2^{p^m} + x_3^{p^m} = 0$ for every positive
integer m, and clearly $H_K(x_1^{p^m}/x_2^{p^m})$ may become arbitrarily large.
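The phenomenon just described can be observed on a toy instance. The sketch below assumes K = F_p(z) (genus 0) with p = 3, takes x1 = z, x2 = 1 − z, x3 = −1 and S = {0, 1, ∞}, so that the bound of Theorem 7.6.1 equals 1. Polynomials are stored as coefficient lists; all helper names are illustrative.

```python
def poly_mul(f, g, p):
    """Multiply two polynomials over F_p (coefficient lists, lowest degree first)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        if a == 0:
            continue
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

def poly_pow(f, e, p):
    """Raise f to the e-th power over F_p by repeated squaring."""
    result, base = [1], f[:]
    while e > 0:
        if e & 1:
            result = poly_mul(result, base, p)
        e >>= 1
        if e:
            base = poly_mul(base, base, p)
    return result

def poly_add(f, g, p):
    """Add two polynomials over F_p."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [(a + b) % p for a, b in zip(f, g)]

def degree(f):
    """Degree of a coefficient list, ignoring trailing zeros (-1 for zero)."""
    d = len(f) - 1
    while d >= 0 and f[d] == 0:
        d -= 1
    return d

p = 3
x1, x2, x3 = [0, 1], [1, p - 1], [p - 1]   # z, 1 - z, -1 over F_3

heights = []
for m in range(4):
    e = p ** m
    y1, y2, y3 = (poly_pow(x, e, p) for x in (x1, x2, x3))
    total = poly_add(poly_add(y1, y2, p), y3, p)
    assert degree(total) == -1          # the Frobenius twist is again a solution
    # z^e and (1 - z)^e are coprime, so the height of y1/y2 in F_p(z)
    # is max(deg y1, deg y2).
    heights.append(max(degree(y1), degree(y2)))

print(heights)   # grows like p^m, while 2g - 2 + |S| stays equal to 1
```

For m ≥ 1 the quotient y1/y2 lies in K^p, which is exactly the case excluded by the theorem.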
Mason’s proof of Theorem 7.6.1 is similar to his proof of Theorem 7.1.1,
based on derivations. Silverman (1984) proved a similar result (stated in another
but equivalent form) by a different geometric method, based on the Riemann–
Hurwitz formula and properties of Fermat curves $x^N + y^N = 1$.
We now turn to S-unit equations in several unknowns. First Mason (1986b)
and later in a sharper form Wang (1996, 1999) proved analogues of Corollary
7.3.4 in the case of positive characteristic. We recall Wang (1996), Corollary 1.
Theorem 7.6.2 Suppose that K has genus g. Let S be a finite set of valuations
of K, let m be a positive integer, and let x0 , . . . , xn be elements of OS∗ such that
\[ x_0 + \cdots + x_n = 0 \]
and every proper subset of $\{x_0, \ldots, x_n\}$ is linearly independent over $k \cdot K^{p^m}$.
Then
\[ H_K^{\mathrm{hom}}(x) \le \frac{n(n-1)}{2}\, p^{m-1} \max(0,\ 2g - 2 + |S|). \]
Wang’s proof is a positive characteristic analogue of the Wronskian argument
of Brownawell and Masser. In Wang (1999) she slightly sharpened her result.
Hsia and Wang (2004) proved a further extension to function fields of arbitrary
transcendence degree, in the cases of both zero characteristic and positive
characteristic. Their proof uses generalized Wronskians.
We now turn to linear equations with unknowns from a finitely generated
multiplicative group. Let p be a prime, $\mathbb{F}_p$ the field of p elements, and $\overline{\mathbb{F}}_p$
its algebraic closure. Again, K is a field of characteristic p > 0 but we now
assume that K is a finitely generated, transcendental extension of Fp . Let k be
the algebraic closure of Fp in K; so k is a finite field.
Let us start with the equation
\[ a_1 x_1 + a_2 x_2 = 1 \quad \text{in } x = (x_1, x_2) \in \Gamma, \tag{7.6.1} \]
where Γ is a subgroup of $(K^*)^2$ of finite rank not contained in $(\overline{\mathbb{F}}_p^*)^2$ and
$a = (a_1, a_2) \in (K^*)^2$. For instance, if there is an integer l coprime with p such
that $a^l \in \Gamma$, then (7.6.1) may have infinitely many solutions. Specifically, let q
be a power of p such that $l \mid q - 1$ and let $u = (u_1, u_2)$ be a solution of (7.6.1)
with $u \notin (\overline{\mathbb{F}}_p^*)^2$; then
\[ a^{q^e - 1} \cdot u^{q^e} \quad (e = 0, 1, 2, \ldots) \tag{7.6.2} \]
yield infinitely many different solutions of (7.6.1). The following nice result is
due to Voloch (1998), Theorem 2.
Theorem 7.6.3 Let Γ have rank r ≥ 0. Assume there is no positive integer l
such that $a^l \in \Gamma$. Then (7.6.1) has at most
\[ \frac{p^r (p^r + p - 2)}{p - 1} \]
solutions.
The proof uses derivations on K.
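To get a feeling for the size of the bound in Theorem 7.6.3, the expression can simply be evaluated for small p and r. The function name below is illustrative and not taken from the literature.

```python
def voloch_bound(p: int, r: int) -> int:
    """Evaluate p^r (p^r + p - 2) / (p - 1), the bound of Theorem 7.6.3."""
    numerator = p**r * (p**r + p - 2)
    # p^r + p - 2 = (p^r - 1) + (p - 1) is divisible by p - 1,
    # so the quotient is an integer.
    assert numerator % (p - 1) == 0
    return numerator // (p - 1)

for p, r in [(2, 0), (2, 1), (3, 1), (5, 2)]:
    print(p, r, voloch_bound(p, r))
```

The bound depends only on p and the rank r, not on the coefficients a1, a2.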
Now let n ≥ 2, Γ a finitely generated subgroup of $(K^*)^n$ (n-fold direct
product, not to be confused with the field of p-th powers $K^p$ in case n, p
happen to be equal), and $a = (a_1, \ldots, a_n) \in (K^*)^n$, and consider the equation
\[ a_1 x_1 + \cdots + a_n x_n = 1 \quad \text{in } x = (x_1, \ldots, x_n) \in \Gamma. \tag{7.6.3} \]
The following result is a consequence of Hsia’s and Wang’s analogue of Theorem 7.6.2 for function fields in several variables mentioned above.
Theorem 7.6.4 Equation (7.6.3) has only finitely many solutions such that
$a_1 x_1, \ldots, a_n x_n$ are linearly independent over $K^p$.
We want to weaken the condition that $a_1 x_1, \ldots, a_n x_n$ be linearly independent
over $K^p$ to the condition that $(x_1, \ldots, x_n)$ be non-degenerate, that is,
\[ \sum_{i \in I} a_i x_i \neq 0 \quad \text{for each non-empty proper subset } I \text{ of } \{1, \ldots, n\}. \]
Then the situation becomes much more complicated, due to the Frobenius
action on K, and in fact the solutions can be divided into finitely many infinite
classes with a particular structure.
In Mason (1986b), Masser (2004), Adamczewski and Bell (2012) and
Derksen and Masser (2012) various descriptions are given for the set of non-degenerate solutions of (7.6.3). We discuss here a result from the last paper,
which unlike Voloch’s result does not give an upper bound for the number of
classes of solutions, but instead implies that these classes can be determined
effectively.
As an introduction to the result of Derksen and Masser, note that we can
write the set of solutions of (7.6.1) given in (7.6.2) as
\[ \psi_a^{-1} \varphi_q^{e} \psi_a(u) \quad (e = 0, 1, 2, \ldots), \tag{7.6.4} \]
where $\psi_a$, $\varphi_q$ are the maps given by
\[ \psi_a(x) := a \cdot x, \qquad \varphi_q(x) := x^q. \]
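That (7.6.4) reproduces (7.6.2) can be checked by unwinding the three maps, with products and powers taken coordinatewise. The second line records why each such point again satisfies (7.6.1), using that raising to the power q^e is additive in characteristic p.

```latex
\begin{align*}
\psi_a^{-1}\varphi_q^{e}\psi_a(u)
  &= \psi_a^{-1}\bigl((a\cdot u)^{q^e}\bigr)
   = a^{-1}\cdot a^{q^e}\cdot u^{q^e}
   = a^{q^e-1}\cdot u^{q^e},\\
a_1\bigl(a_1^{q^e-1}u_1^{q^e}\bigr) + a_2\bigl(a_2^{q^e-1}u_2^{q^e}\bigr)
  &= (a_1u_1)^{q^e} + (a_2u_2)^{q^e}
   = (a_1u_1 + a_2u_2)^{q^e} = 1.
\end{align*}
```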
We now proceed to state the result of Derksen and Masser on the non-degenerate solutions of (7.6.3). Denote by $\sqrt[K]{\Gamma}$ the group of $u \in (K^*)^n$ for
which there exists $l \in \mathbb{Z}_{>0}$ with $u^l \in \Gamma$. For a power q of p and for $u \in \Gamma$,
$g_1, \ldots, g_h \in \sqrt[K]{\Gamma}$, define the set
\[ [g_1, \ldots, g_h]_q(u) := \bigl\{ \psi_{g_1}^{-1} \varphi_q^{e_1} \psi_{g_1} \cdots \psi_{g_h}^{-1} \varphi_q^{e_h} \psi_{g_h}(u) : e_1, \ldots, e_h \in \mathbb{Z}_{\ge 0} \bigr\}, \]
where $\varphi_q$ and the $\psi_{g_i}$ are maps from $(K^*)^n$ to $(K^*)^n$ given by (7.6.4). We call
such a set a $(\sqrt[K]{\Gamma}, q)$-set of order h. We agree here that a $(\sqrt[K]{\Gamma}, q)$-set of order 0
is a single element. Computing a $(\sqrt[K]{\Gamma}, q)$-set means computing q and the tuple
$(g_1, \ldots, g_h, u)$ by which it is defined.
The following result is part of Derksen and Masser (2012), Theorem 3.
Theorem 7.6.5 Let $a_1, \ldots, a_n \in K^*$. Assume that Γ is not contained in $(\overline{\mathbb{F}}_p^*)^n$
and that $\sqrt[K]{\Gamma}$ is finitely generated. Then there is a power q of p such that the set
of non-degenerate solutions of (7.6.3) is contained in a finite union of $(\sqrt[K]{\Gamma}, q)$-sets of order at most n − 1. Further, if suitable effective representations of K
and Γ are given, the prime power q and these $(\sqrt[K]{\Gamma}, q)$-sets can be determined
effectively.
In their proof, Derksen and Masser derived a sharpening of Theorem 7.6.4,
basically by extending the argument of Brownawell and Masser based on
Wronskians. From there, they completed the proof of Theorem 7.6.5 by means
of an inductive argument.
We mention that earlier, Masser (2004) obtained a less precise and ineffective
version of Theorem 7.6.5. Building further on work from Derksen (2007) on
linear recurrence sequences, Adamczewski and Bell (2012), Theorem 3.1 gave
a description of the set of non-degenerate solutions of (7.6.3) in terms of finite
p-automata, and in fact they proved a much more general result for semi-abelian varieties. Finally, general results on semi-abelian varieties over fields
of positive characteristic, implying Theorem 7.6.5, have been proved by means
of methods from logic; see Hrushovski (1996), Moosa and Scanlon (2002, 2004)
and Ghioca (2008).
We illustrate Theorem 7.6.5 with an example from Masser (2004). Let
$K = \mathbb{F}_p(z)$ with z transcendental over $\mathbb{F}_p$ and let $\Gamma = G^3$, where G is the
multiplicative subgroup of $K^*$ generated by z and 1 − z. Consider the equation
\[ x_1 + x_2 - x_3 = 1 \quad \text{in } (x_1, x_2, x_3) \in \Gamma. \tag{7.6.5} \]
This equation has (among others) the non-degenerate solutions
\[ \bigl( z^{(q-1)q'},\ (1-z)^{qq'},\ z^{(q-1)q'}(1-z)^{q'} \bigr), \]
where q, q' run independently through the powers of p different from 1. This
set of solutions may be described as $[g_1, g_2]_p(u)$, where
\[ g_1 = (1, 1, 1), \qquad g_2 = (z, 1, z(1-z)^{-1}), \qquad u = \bigl( z^{(p-1)p},\ (1-z)^{p^2},\ z^{(p-1)p}(1-z)^{p} \bigr). \]
Leitner (2012), Theorem 2 gave a complete description of the set of solutions
of (7.6.5) as a union of $(\sqrt[K]{\Gamma}, p)$-sets. As it turned out, the set of non-degenerate
solutions is contained in a union of 40 $(\sqrt[K]{\Gamma}, p)$-sets if p ≥ 5, 48 $(\sqrt[K]{\Gamma}, 3)$-sets
if p = 3, and 240 $(\sqrt[K]{\Gamma}, 2)$-sets if p = 2. For p = 2, his result was obtained
earlier by Arenas-Carmona, Berend and Bergelson (2008).
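That the displayed family really consists of solutions of (7.6.5) can be confirmed by direct computation in F_p[z]. The following sketch checks x1 + x2 − x3 = 1 for a few small values of p, q and q'; it does not test non-degeneracy or the description as a [g1, g2]_p(u)-set, and the helper names are illustrative.

```python
def poly_mul(f, g, p):
    """Multiply two polynomials over F_p (coefficient lists, lowest degree first)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        if a == 0:
            continue
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

def poly_pow(f, e, p):
    """Raise f to the e-th power over F_p by repeated squaring."""
    result, base = [1], f[:]
    while e > 0:
        if e & 1:
            result = poly_mul(result, base, p)
        e >>= 1
        if e:
            base = poly_mul(base, base, p)
    return result

def poly_add(f, g, p, sign=1):
    """Return f + sign * g over F_p with trailing zeros removed."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    out = [(a + sign * b) % p for a, b in zip(f, g)]
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out

def masser_solution(p, q, q2):
    """The triple (z^((q-1)q'), (1-z)^(qq'), z^((q-1)q') (1-z)^q') with q' = q2."""
    z, one_minus_z = [0, 1], [1, p - 1]
    x1 = poly_pow(z, (q - 1) * q2, p)
    x2 = poly_pow(one_minus_z, q * q2, p)
    x3 = poly_mul(x1, poly_pow(one_minus_z, q2, p), p)
    return x1, x2, x3

def verify_family(p):
    """Check x1 + x2 - x3 = 1 for q, q' in {p, p^2}."""
    for q in (p, p * p):
        for q2 in (p, p * p):
            x1, x2, x3 = masser_solution(p, q, q2)
            lhs = poly_add(poly_add(x1, x2, p), x3, p, sign=-1)
            if lhs != [1]:
                return False
    return True

print([verify_family(p) for p in (2, 3, 5)])
```

The identity behind the check is that, with Z = z^{q'}, one has Z^{q−1} + (1 − Z)^q − Z^{q−1}(1 − Z) = Z^q + (1 − Z)^q = 1 in characteristic p.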
8 Effective results for unit equations in two unknowns over finitely generated domains
In Chapter 4 we established effective finiteness results on unit equations and
S-unit equations in two unknowns in an algebraic number field. In this chapter
we extend these results to the finitely generated case. More precisely, let A ⊃ Z
be an integral domain which is finitely generated over Z, i.e., A = Z[z1 , . . . , zr ]
for certain not necessarily algebraic generators z1 , . . . , zr , K the quotient field
of A, and a1 , a2 , a3 non-zero elements of K. By a theorem of Roquette (1957),
the group A∗ of units of A is finitely generated, hence we know from, e.g., the
results of Chapter 6 that the equation
\[ a_1 x_1 + a_2 x_2 = a_3 \quad \text{in } x_1, x_2 \in A^* \tag{8.1} \]
has only finitely many solutions. This result is, however, ineffective.
In this chapter, we give an effective proof of this finiteness statement, which
is valid for an arbitrary integral domain A that is finitely generated over Z.
In fact, our main result, Theorem 8.1.1, provides effective upper bounds for
the “sizes” of the solutions x1 , x2 in terms of suitable effective representations
for A, a1 , a2 , a3 . This enables one to determine all solutions in principle; see
Corollary 8.1.2 below. As a further consequence of Theorem 8.1.1 we deduce
an effective finiteness theorem on equation (8.1) in unknowns x1 , x2 taken from
a finitely generated and effectively given multiplicative subgroup of K ∗ , see
Theorem 8.1.3 below.
Our strategy of proof of Theorem 8.1.1 is roughly as follows. We construct
an integral domain B ⊃ A of a special type that can be dealt with more easily,
and consider instead of (8.1) the equation
\[ a_1 x_1 + a_2 x_2 = a_3 \quad \text{in } x_1, x_2 \in B^*. \tag{8.2} \]
In the construction of B we use ideas from Seidenberg (1974). We reduce
(8.2) to the function field case, and using Mason’s Theorem 7.1.1 we derive
an effective upper bound for the degrees of the solutions. Next, by means of
Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:29:36, subject to the Cambridge
Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.010
effective specializations, i.e., explicitly given ring homomorphisms $B \to \overline{\mathbb{Q}}$, we
reduce (8.2) to various S-unit equations in different algebraic number fields, and
apply the results of Chapter 4 to the S-unit equations obtained. This provides
enough information to derive an effective upper bound for the heights of the
solutions of (8.2). From this, we can effectively determine all solutions of (8.2).
The final, crucial step is to go back from (8.2) to (8.1) and to select from the
solutions in B ∗ those that belong to A∗ . For this we have developed an effective
procedure, based on an effective result of Aschenbrenner (2004) on systems of
linear equations over polynomial rings over Z.
The above approach was developed by Győry (1983, 1984). However, at
that time Aschenbrenner’s result was not yet available. Hence, to select those
solutions from B ∗ of the equations under consideration that belong to A∗ ,
certain restrictions on the integral domain A had to be imposed.
This chapter is organized as follows. In Section 8.1 we give the necessary
definitions and state our results. In Sections 8.2–8.6 we prove Theorem 8.1.1.
More precisely, in Sections 8.2 and 8.3 we construct the domain B and give
the effective procedure to select those elements of B ∗ that belong to A∗ , in
Section 8.4 we reduce (8.2) to the function field case and apply Mason’s
Theorem, in Section 8.5 we develop some effective specialization theory, and
in Section 8.6 we reduce (8.2) to S-unit equations over number fields, apply
the results from Chapter 4, and complete the proof. In Section 8.7 we prove
Theorem 8.1.3. In Section 8.8 we briefly discuss some related results.
The results in this chapter were proved for the first time in Evertse and
Győry (2013). We closely follow the exposition of that paper.
8.1 Statements of the results
We introduce the notation used in our theorems. Let again A ⊃ Z be an integral
domain which is finitely generated over Z, say A = Z[z1 , . . . , zr ]. Let I be the
ideal of polynomials f ∈ Z[X1 , . . . , Xr ] such that f (z1 , . . . , zr ) = 0. Then I
is finitely generated, hence
\[ A \cong \mathbb{Z}[X_1, \ldots, X_r]/I, \qquad I = (f_1, \ldots, f_m) \]
for some finite set of polynomials f1 , . . . , fm ∈ Z[X1 , . . . , Xr ]. We observe
here that given f1 , . . . , fm , it can be checked effectively whether A is a domain
containing Z. Indeed, this holds if and only if I is a prime ideal of Z[X1 , . . . , Xr ]
with I ∩ Z = (0), and the latter can be checked effectively for instance using
Aschenbrenner (2004), Proposition 4.10, Corollary 3.5.
Denote by K the quotient field of A. For α ∈ A, we call f a representative
for α or say that f represents α if f ∈ Z[X1 , . . . , Xr ] and α = f (z1 , . . . , zr ).
Further, for α ∈ K, we call (f, g) a pair of representatives for α or say that
(f, g) represents α if f, g ∈ Z[X1 , . . . , Xr ], g ∉ I and α = f (z1 , . . . , zr )/
g(z1 , . . . , zr ). We say that α ∈ A (resp. α ∈ K) is given if a representative
(resp. pair of representatives) for α is given.
To do explicit computations in A and K, one needs an ideal membership
algorithm for Z[X1 , . . . , Xr ], which decides, for any given polynomial and
ideal of Z[X1 , . . . , Xr ], whether the polynomial belongs to the ideal. In the
literature there are various such algorithms; we mention only the algorithm
of Simmons (1970), and the more precise algorithm of Aschenbrenner (2004)
which plays an important role in this chapter; see Proposition 8.2.5 below for a
statement of his result. One can perform arithmetic operations on A and K by
using representatives. Further, one can decide effectively whether two polynomials g1 , g2 represent the same element of A, i.e., g1 − g2 ∈ I , or whether two
pairs of polynomials (g1 , h1 ), (g2 , h2 ) represent the same element of K, i.e.,
g1 h2 − g2 h1 ∈ I , by using one of the ideal membership algorithms mentioned
above.
The degree deg f of a polynomial f ∈ Z[X1 , . . . , Xr ] is by definition its
total degree. By the logarithmic height h(f ) of f we mean the logarithm of the
maximum of the absolute values of its coefficients. The size of f is defined by
s(f ) := max(1, deg f, h(f )).
Clearly, there are only finitely many polynomials in Z[X1 , . . . , Xr ] of size
below a given bound, and these can be determined effectively.
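The definition of the size is easy to make concrete. The sketch below assumes that a non-zero polynomial in Z[X1, ..., Xr] is stored as a dictionary mapping exponent tuples to integer coefficients; this representation and the function names are illustrative choices, not part of the text.

```python
import math

def degree(f):
    """Total degree of f, given as {exponent tuple: integer coefficient}."""
    return max((sum(exp) for exp, c in f.items() if c != 0), default=0)

def log_height(f):
    """Logarithm of the maximum absolute value of the coefficients of f.

    Assumes f is non-zero, so that the maximum is at least 1.
    """
    return math.log(max(abs(c) for c in f.values() if c != 0))

def size(f):
    """s(f) := max(1, deg f, h(f))."""
    return max(1, degree(f), log_height(f))

# 7*X1^2*X2 - 3*X2 + 1000: degree 3, height log(1000), so the height dominates.
f = {(2, 1): 7, (0, 1): -3, (0, 0): 1000}
# X1^5 + 2: degree 5 dominates.
g = {(5, 0): 1, (0, 0): 2}
print(size(f), size(g))
```

Since both the degree and the coefficients are bounded in terms of s(f), enumerating all polynomials of size below a bound is a finite search.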
We consider equations
\[ a_1 x_1 + a_2 x_2 = a_3 \quad \text{in } x_1, x_2 \in A^*, \tag{8.1.1} \]
where a1 , a2 , a3 are non-zero elements of A.
Theorem 8.1.1 Assume that r ≥ 1. Let $\tilde{a}_1, \tilde{a}_2, \tilde{a}_3$ be representatives for
$a_1, a_2, a_3$, respectively. Assume that $f_1, \ldots, f_m$ and $\tilde{a}_1, \tilde{a}_2, \tilde{a}_3$ all have degree
at most d and logarithmic height at most h, where d ≥ 1, h ≥ 1. Then for
each solution $(x_1, x_2)$ of (8.1.1), there are representatives $\tilde{x}_1, \tilde{x}_1', \tilde{x}_2, \tilde{x}_2'$ of
$x_1, x_1^{-1}, x_2, x_2^{-1}$, respectively, such that
\[ s(\tilde{x}_i),\ s(\tilde{x}_i') \le \exp\bigl( (2d)^{c_1^r} (h + 1) \bigr) \quad \text{for } i = 1, 2, \]
where $c_1$ is an effectively computable absolute constant > 1.
By a theorem of Roquette (1957), the unit group of an integral domain
finitely generated over Z is finitely generated. In the case that A = OS is the
ring of S-integers of a number field it is possible to determine effectively a
system of generators for A∗ , and this was used in all effective finiteness proofs
for (8.1.1) with A = OS . However, no general algorithm is known to determine
a system of generators for the unit group of an arbitrary finitely generated
domain A. In our proof of Theorem 8.1.1, we do not need any information on
the generators of A∗ .
By combining Theorem 8.1.1 with an ideal membership algorithm for the
polynomial ring Z[X1 , . . . , Xr ], one easily deduces the following.
Corollary 8.1.2 Given f1 , . . . , fm such that A is an integral domain containing Z, and given a1 , a2 , a3 ∈ A \ {0}, the solutions of (8.1.1) can be determined
effectively.
Proof. Clearly, $(x_1, x_2)$ is a solution of (8.1.1) if and only if there
are polynomials $\tilde{x}_i, \tilde{x}_i' \in \mathbb{Z}[X_1, \ldots, X_r]$ (i = 1, 2) such that $\tilde{x}_i$ represents $x_i$
for i = 1, 2, and
\[ \tilde{a}_1 \cdot \tilde{x}_1 + \tilde{a}_2 \cdot \tilde{x}_2 - \tilde{a}_3 \in I, \qquad \tilde{x}_i \cdot \tilde{x}_i' - 1 \in I \quad \text{for } i = 1, 2. \tag{8.1.2} \]
Thus, we obtain all solutions of (8.1.1) by checking, for each quadruple of
polynomials $\tilde{x}_1, \tilde{x}_1', \tilde{x}_2, \tilde{x}_2' \in \mathbb{Z}[X_1, \ldots, X_r]$ of size at most $\exp\bigl((2d)^{c_1^r}(h + 1)\bigr)$,
whether it satisfies (8.1.2). Further, using the ideal membership algorithm, it
can be checked effectively whether two different pairs $(\tilde{x}_1, \tilde{x}_2)$ represent the
same solution of (8.1.1). Thus, we can make a list of representatives, one for
each solution of (8.1.1).
Let γ1 , . . . , γs be multiplicatively independent elements of K ∗ . For given
elements γ1 , . . . , γs ∈ K ∗ the multiplicative independence of γ1 , . . . , γs can be
checked effectively, see for instance Lemma 8.7.2 below. Let again a1 , a2 , a3
be non-zero elements of A and consider the equation
\[ a_1 \gamma_1^{v_1} \cdots \gamma_s^{v_s} + a_2 \gamma_1^{w_1} \cdots \gamma_s^{w_s} = a_3 \quad \text{in } v_1, \ldots, v_s, w_1, \ldots, w_s \in \mathbb{Z}. \tag{8.1.3} \]
Theorem 8.1.3 Let $\tilde{a}_1, \tilde{a}_2, \tilde{a}_3$ be representatives for $a_1, a_2, a_3$ and for i =
1, . . . , s, let $(g_{i1}, g_{i2})$ be a pair of representatives for $\gamma_i$. Suppose that $f_1, \ldots, f_m$, $\tilde{a}_1, \tilde{a}_2, \tilde{a}_3$, and $g_{i1}, g_{i2}$ (i = 1, . . . , s) all have degree at most d and logarithmic height at most h, where d ≥ 1, h ≥ 1. Then for each solution $(v_1, \ldots, w_s)$
of (8.1.3) we have
r
max(|v1 |, . . . , |vs |, |w1 |, . . . , |ws |) ≤ exp (2s d)c2 (h + 1) ,
where c2 is an effectively computable absolute constant > 1.
An immediate consequence of Theorem 8.1.3 is that for given f1 , . . . , fm ,
a1 , a2 , a3 and γ1 , . . . , γs , the solutions of (8.1.3) can be determined effectively.
Since every integral domain finitely generated over Z has a finitely generated
unit group, equation (8.1.1) may be viewed as a special case of (8.1.3). But since
no general effective algorithm is known to find a finite system of generators
for the unit group of a finitely generated integral domain, we cannot deduce
an effective result for (8.1.1) from Theorem 8.1.3. In fact, we argue in the reverse direction,
and prove Theorem 8.1.3 by combining Theorem 8.1.1 with an effective result
on Diophantine equations of the type $\gamma_1^{v_1} \cdots \gamma_s^{v_s} = \gamma_0$ in integers $v_1, \ldots, v_s$,
where $\gamma_1, \ldots, \gamma_s, \gamma_0 \in K^*$ (see Corollary 8.7.3 below).
8.2 Effective linear algebra over polynomial rings
We have gathered from the literature some effective results for systems of
linear equations to be solved in polynomials with coefficients in a field, or with
coefficients in Z.
As usual, we write
log∗ u := max(1, log u)
for u > 0, log∗ 0 := 1.
We use the notation O(·) as an abbreviation for c times the expression between
the parentheses, where c is an effectively computable positive absolute constant
(notice that the meaning of the O-symbol is different from that of the usual O-symbol which means “at most c times the expression between the parentheses”).
At each occurrence of O(·), the value of c may be different.
Given a ring R, we denote by R m,n the R-module of m × n-matrices with
entries in R and by R n the R-module of n-dimensional column vectors with
entries in R. Further, as usual GL(n, R) denotes the group of matrices in
R n,n with determinant in the unit group R ∗ . The degree of a polynomial f ∈
R[X1 , . . . , XN ], that is, its total degree, is denoted by deg f .
From matrices U, V with the same number of rows, we form a matrix [U, V ]
by placing the columns of V after those of U, and from two matrices U, V with
the same number of columns we form $\begin{bmatrix} U \\ V \end{bmatrix}$ by placing the rows of V below
those of U .
The logarithmic height h(S) of a finite set S = {a1 , . . . , at } ⊂ Z is defined
by h(S) := log max(|a1 |, . . . , |at |). The logarithmic height h(U ) of a matrix
with entries in Z is defined by the logarithmic height of the set of entries of
U . The logarithmic height h(f ) of a polynomial with coefficients in Z is the
logarithmic height of the set of non-zero coefficients of f .
Lemma 8.2.1 Let $U \in \mathbb{Z}^{m,n}$. Then the Q-vector space of $y \in \mathbb{Q}^n$ with $Uy = 0$
is generated by vectors in $\mathbb{Z}^n$ of logarithmic height at most $m\,h(U) + \tfrac{1}{2} m \log m$.
Proof. Without loss of generality we may assume that U has rank m, and
moreover, that the matrix V consisting of the first m columns of U is invertible. Let $\Delta := \det V$. By multiplying with $\Delta V^{-1}$, we can rewrite $Uy = 0$ as
$[\Delta I_m, W]y = 0$, where $I_m$ is the m × m-unit matrix, and W consists of m × m-subdeterminants of U . The solution space of this system is generated by the
columns of $\begin{bmatrix} -W \\ \Delta I_{n-m} \end{bmatrix}$. An application of Hadamard's inequality gives the upper
bound from the lemma for the logarithmic heights of these columns.
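The construction in this proof can be tried out on small matrices. The sketch below assumes that U has rank m with invertible leading m × m block V, computes the entries of (det V)·V^{-1}·u_j by Cramer's rule as m × m subdeterminants, and compares the heights of the resulting kernel vectors with the bound of Lemma 8.2.1. All function names are illustrative.

```python
import math

def det(M):
    """Determinant of a small square integer matrix by cofactor expansion."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def kernel_generators(U):
    """Integer generators of {y : U y = 0}, assuming the first m columns are invertible."""
    m, n = len(U), len(U[0])
    V = [row[:m] for row in U]
    delta = det(V)
    if delta == 0:
        raise ValueError("the first m columns of U must form an invertible matrix")
    gens = []
    for j in range(m, n):
        # Cramer's rule: the i-th entry of delta * V^{-1} * u_j is the determinant
        # of V with its i-th column replaced by the j-th column u_j of U.
        w = [det([row[:i] + [U[k][j]] + row[i + 1:] for k, row in enumerate(V)])
             for i in range(m)]
        gens.append([-wi for wi in w] + [delta if t == j else 0 for t in range(m, n)])
    return gens

def satisfies_bound(U):
    """Check U y = 0 and h(y) <= m h(U) + (1/2) m log m for each generator y."""
    m = len(U)
    h_U = math.log(max(abs(c) for row in U for c in row if c != 0))
    bound = m * h_U + 0.5 * m * math.log(m)
    for y in kernel_generators(U):
        if any(sum(a * b for a, b in zip(row, y)) != 0 for row in U):
            return False
        if math.log(max(abs(c) for c in y)) > bound:
            return False
    return True

U = [[1, 2, 3], [4, 5, 6]]
print(kernel_generators(U), satisfies_bound(U))
```

For this U the leading block has determinant −3 and the single generator is (−3, 6, −3), of height log 6, well below 2 log 6 + log 2.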
Proposition 8.2.2 Let F be a field, N ≥ 1, and R := F [X1 , . . . , XN ]. Further, let U be an m × n-matrix and b an m-dimensional column vector, both
consisting of polynomials from R of degree ≤ d where d ≥ 1.
(i) The R-module of $x \in R^n$ with $Ux = 0$ is generated by vectors x whose
coordinates are polynomials of degree at most $(2md)^{2^N}$.
(ii) Suppose that $Ux = b$ is solvable in $x \in R^n$. Then it has a solution x whose
coordinates are polynomials of degree at most $(2md)^{2^N}$.
Proof. See Aschenbrenner (2004), Theorems 3.2, 3.4. Part (ii) of Proposition
8.2.2 was obtained earlier by Masser and Wüstholz (1983), Proposition on
p. 440, estimate on top of p. 442, with the slightly smaller bound $(2md)^{2^{N-1}}$.
Results of this type, but not with a completely correct proof, were given in
Hermann (1926) and Seidenberg (1974).
Corollary 8.2.3 Let $R := \mathbb{Q}[X_1, \ldots, X_N]$. Further, let U be an m × n-matrix
of polynomials in $\mathbb{Z}[X_1, \ldots, X_N]$ of degrees at most d and logarithmic heights
at most h where d ≥ 1, h ≥ 1. Then the R-module of $x \in R^n$ with $Ux = 0$ is
generated by vectors x, consisting of polynomials in $\mathbb{Z}[X_1, \ldots, X_N]$ of degree
at most $(2md)^{2^N}$ and height at most $(2md)^{6^N}(h + 1)$.
Proof. By Proposition 8.2.2 (i) we have to study $Ux = 0$, restricted to vectors
$x \in R^n$ consisting of polynomials of degree at most $(2md)^{2^N}$. The set of these x is
a finite dimensional Q-vector space, and we have to prove that it is generated by
vectors whose coordinates are polynomials in $\mathbb{Z}[X_1, \ldots, X_N]$ of logarithmic
height at most $(2md)^{6^N}(h + 1)$.
If x consists of polynomials of degree at most $(2md)^{2^N}$, then $Ux$ consists of
m polynomials with coefficients in Q of degrees at most $(2md)^{2^N} + d$, all of
whose coefficients have to be set to 0. This leads to a system of linear equations
$Vy = 0$, where y consists of the coefficients of the polynomials in x and V
consists of integers of logarithmic heights at most h. Notice that the number
$m^*$ of rows of V is m times the number of monomials in N variables of degree
at most $(2md)^{2^N} + d$, that is
\[ m^* \le m \binom{(2md)^{2^N} + d + N}{N}. \]
By Lemma 8.2.1 the solution space of $Vy = 0$ is generated by integer vectors
of logarithmic height at most
\[ m^* h + \tfrac{1}{2} m^* \log m^* \le (2md)^{6^N}(h + 1). \]
This completes the proof of our corollary.
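The final inequality is stated without computation. As a numerical sanity check only (not a proof), the sketch below evaluates the upper bound for m* and compares m*h + (1/2) m* log m* with (2md)^{6^N}(h + 1) for a few small parameter values. The function names and the chosen test values are illustrative.

```python
import math

def row_count_bound(m, d, N):
    """m times the number of monomials in N variables of total degree
    at most (2md)^(2^N) + d, i.e. the upper bound for m*."""
    D = (2 * m * d) ** (2 ** N) + d
    return m * math.comb(D + N, N)

def height_bound_holds(m, d, N, h):
    """Check m* h + (1/2) m* log m* <= (2md)^(6^N) (h + 1) with m* at its bound."""
    m_star = row_count_bound(m, d, N)
    lhs = m_star * h + 0.5 * m_star * math.log(m_star)
    rhs = (2 * m * d) ** (6 ** N) * (h + 1)   # exact integer for integer h
    return lhs <= rhs

cases = [(1, 1, 1, 1), (2, 2, 2, 1), (1, 3, 2, 5), (3, 1, 3, 2)]
print([height_bound_holds(*case) for case in cases])
```

In these cases the right-hand side exceeds the left by many orders of magnitude, which suggests that the exponent 6^N leaves ample room.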
Lemma 8.2.4 Let $U \in \mathbb{Z}^{m,n}$, $b \in \mathbb{Z}^m$ be such that $Uy = b$ is solvable in $\mathbb{Z}^n$.
Then it has a solution $y \in \mathbb{Z}^n$ with $h(y) \le m\,h([U, b]) + \tfrac{1}{2} m \log m$.
Proof. Assume without loss of generality that U and [U, b] have rank m. By a
result of Borosh, Flahive, Rubin and Treybig (1989) (see also Lemma 4.3.5),
$Uy = b$ has a solution $y \in \mathbb{Z}^n$ such that the absolute values of the entries of
y are bounded above by the maximum of the absolute values of the m × m-subdeterminants of [U, b]. The upper bound for h(y) as in the lemma easily
follows from Hadamard's inequality.
Proposition 8.2.5 Let N ≥ 1 and let f1 , . . . , fm , b ∈ Z[X1 , . . . , XN ] be
polynomials of degrees at most d and logarithmic heights at most h where
d ≥ 1, h ≥ 1, such that
\[ f_1 x_1 + \cdots + f_m x_m = b \tag{8.2.1} \]
is solvable in $x_1, \ldots, x_m \in \mathbb{Z}[X_1, \ldots, X_N]$. Then (8.2.1) has a solution in polynomials $x_1, \ldots, x_m \in \mathbb{Z}[X_1, \ldots, X_N]$ with
\[ \deg x_i \le (2d)^{\exp O(N \log^* N)} (h + 1), \qquad h(x_i) \le (2d)^{\exp O(N \log^* N)} (h + 1)^{N+1} \tag{8.2.2} \]
for i = 1, . . . , m.
Proof. Aschenbrenner’s main theorem (Aschenbrenner (2004), Theorem A)
states that equation (8.2.1) has a solution x1 , . . . , xm ∈ Z[X1 , . . . , XN ] with
deg xi ≤ d0 for i = 1, . . . , m, where
\[ d_0 = (2d)^{\exp O(N \log^* N)} (h + 1). \]
So it remains to show the existence of a solution with small logarithmic height.
Let us restrict ourselves to solutions (x1 , . . . , xm ) of (8.2.1) of degree ≤ d0 ,
and denote by y the vector of coefficients of the polynomials x1 , . . . , xm . Then
(8.2.1) translates into a system of linear equations U y = b which is solvable
over Z. Here, the number of equations, i.e., number of rows of U , is equal to
$m^* := \binom{d_0 + d + N}{N}$. Further, $h([U, b]) \le h$. By Lemma 8.2.4, $Uy = b$ has a solution
y with coordinates in Z of height at most
\[ m^* h + \tfrac{1}{2} m^* \log m^* \le (2d)^{\exp O(N \log^* N)} (h + 1)^{N+1}. \]
It follows that (8.2.1) has a solution x1 , . . . , xm ∈ Z[X1 , . . . , XN ] satisfying (8.2.2).
Remarks
1 Aschenbrenner (2004) gives an example which shows that the upper bound
for the degrees of the xi cannot depend on d and N only.
2 Proposition 8.2.5 gives an effective criterion for ideal membership in the
polynomial ring $\mathbb{Z}[X_1, \ldots, X_N]$. Let $b \in \mathbb{Z}[X_1, \ldots, X_N]$ be given. Further,
suppose that an ideal I of $\mathbb{Z}[X_1, \ldots, X_N]$ is given by a finite set of generators $f_1, \ldots, f_m$. By Proposition 8.2.5, if b ∈ I then there are $x_1, \ldots, x_m \in
\mathbb{Z}[X_1, \ldots, X_N]$ with upper bounds for the degrees and heights as in (8.2.2)
such that $b = \sum_{i=1}^{m} x_i f_i$. It requires only a finite computation to check whether
such $x_i$ exist.
8.3 A reduction
We reduce the general unit equation (8.1.1) to a unit equation over an integral
domain B of a special type that can be dealt with more easily.
Let again A = Z[z1 , . . . , zr ] be an integral domain finitely generated over
Z and denote by K the quotient field of A. We assume that r > 0. We have
$$A \cong \mathbb{Z}[X_1, \ldots, X_r]/I, \tag{8.3.1}$$
where $I$ is the ideal of polynomials $f \in \mathbb{Z}[X_1, \ldots, X_r]$ such that $f(z_1, \ldots, z_r) = 0$. The ideal $I$ is finitely generated. Let d ≥ 1, h ≥ 1 and assume that
$$I = (f_1, \ldots, f_m) \quad\text{with } \deg f_i \le d,\ h(f_i) \le h \quad (i = 1, \ldots, m). \tag{8.3.2}$$
Suppose that K has transcendence degree q ≥ 0. In the case that q > 0,
we assume without loss of generality that z1 , . . . , zq form a transcendence
basis of K/Q. We write t := r − q and rename zq+1 , . . . , zr as y1 , . . . , yt ,
respectively. In the case that t = 0 we have A = Z[z1 , . . . , zq ], A∗ = {±1} and
Theorem 8.1.1 is trivial. So we assume henceforth that t > 0.
Define
$$A_0 := \mathbb{Z}[z_1, \ldots, z_q],\quad K_0 := \mathbb{Q}(z_1, \ldots, z_q) \quad\text{if } q > 0; \qquad A_0 := \mathbb{Z},\quad K_0 := \mathbb{Q} \quad\text{if } q = 0.$$
Then
$$A = A_0[y_1, \ldots, y_t], \qquad K = K_0(y_1, \ldots, y_t).$$
Clearly, K is a finite extension of K0 , so in particular is an algebraic number
field if q = 0. Using standard algebra techniques, worked out in detail below,
one can show that there exist y ∈ A and f ∈ A0 such that K = K0 (y), y is
integral over A0 , and
$$A \subseteq B := A_0[f^{-1}, y], \qquad a_1, a_2, a_3 \in B^*,$$
where $a_1, a_2, a_3$ are the coefficients in (8.1.1). If $x_1, x_2 \in A^*$ is a solution to (8.1.1), then $x_i' := a_i x_i / a_3$ (i = 1, 2) satisfy
$$x_1' + x_2' = 1, \qquad x_1', x_2' \in B^*. \tag{8.3.3}$$
At the end of this section, we formulate Proposition 8.3.7 which gives an effective result for equations of the type (8.3.3). More precisely, we introduce a different type of degree and height, $\overline{\deg}(\alpha)$ and $\overline{h}(\alpha)$, for elements $\alpha$ of $B$, and give effective upper bounds for the $\overline{\deg}$ and $\overline{h}$ of the solutions of (8.3.3). Subsequently we deduce Theorem 8.1.1.
The deduction of Theorem 8.1.1 is based on some auxiliary results which
are proved first. We start with an explicit construction of y, f , with effective
upper bounds in terms of r, d, h and a1 , a2 , a3 for the degrees and logarithmic
heights of f and of the coefficients in A0 of the monic minimal polynomial of
y over A0 . Here we follow more or less Seidenberg (1974). Second, for a given
solution $x_1, x_2$ of (8.1.1), we derive effective upper bounds for the degrees and logarithmic heights of representatives for $x_i$, $x_i^{-1}$ (i = 1, 2) in terms of $\overline{\deg}(x_i')$, $\overline{h}(x_i')$ (i = 1, 2), where $x_i' = a_i x_i / a_3$. Here we use Proposition 8.2.5 (Aschenbrenner’s result).
We introduce some further notation. First let q > 0. Then since z1 , . . . , zq
are algebraically independent, we may view them as independent variables, and
for α ∈ A0 , we denote by deg α, h(α) the total degree and logarithmic height
of α, viewed as a polynomial in z1 , . . . , zq . In the case that q = 0, we have
A0 = Z, and we agree that deg α = 0, h(α) = log |α| for α ∈ A0 .
We write Y = (Xq+1 , . . . , Xr ) and K0 (Y) = K0 (Xq+1 , . . . , Xr ). Given f ∈
Q(X1 , . . . , Xr ) we denote by f ∗ the rational function of K0 (Y) obtained by
substituting zi for Xi for i = 1, . . . , q (and f ∗ = f if q = 0). We denote by
degY f ∗ the (total) degree of f ∗ ∈ K0 [Y] with respect to Y. We recall that the
total degree deg g is defined for elements g ∈ A0 and is taken with respect to z1 , . . . , zq . With this notation, we can rewrite (8.3.1) and (8.3.2) as
$$\left.\begin{array}{l} A \cong A_0[\mathbf{Y}]/(f_1^*, \ldots, f_m^*),\\ \deg_{\mathbf{Y}} f_i^* \le d \ \text{ for } i = 1, \ldots, m,\\ \text{the coefficients of } f_1^*, \ldots, f_m^* \text{ in } A_0 \text{ have degrees at most } d\\ \text{and logarithmic heights at most } h. \end{array}\right\} \tag{8.3.4}$$
Put $D := [K : K_0]$ and denote by $\sigma_1, \ldots, \sigma_D$ the $K_0$-isomorphic embeddings of $K$ in an algebraic closure $\overline{K_0}$ of $K_0$.
Lemma 8.3.1
(i) We have $D \le d^t$.
(ii) There exist integers $a_1, \ldots, a_t$ with $|a_i| \le D^2$ for i = 1, . . . , t such that for $w := a_1 y_1 + \cdots + a_t y_t$ we have $K = K_0(w)$.
Proof. (i) The images of $(y_1, \ldots, y_t)$ under $\sigma_1, \ldots, \sigma_D$ all lie in
$$\mathcal{W} := \{\mathbf{y} \in \overline{K_0}^{\,t} : f_1^*(\mathbf{y}) = \cdots = f_m^*(\mathbf{y}) = 0\}.$$
Conversely, using the fact that $K \cong K_0[\mathbf{Y}]/(f_1^*, \ldots, f_m^*)$, one sees that each assignment $(y_1, \ldots, y_t) \mapsto \mathbf{y}$ with $\mathbf{y} \in \mathcal{W}$ yields a $K_0$-isomorphic embedding of $K$ in $\overline{K_0}$. Hence $|\mathcal{W}| = D < \infty$. Now Corollary 7.5.3 with $k = K_0$, $X = \overline{K_0}^{\,t}$, $Y = \emptyset$ implies that $|\mathcal{W}| \le d^t$. Hence $D \le d^t$.
(ii) Let $a_1, \ldots, a_t$ be integers. Then $w := \sum_{i=1}^t a_i y_i$ generates $K$ over $K_0$ if and only if $\sum_{j=1}^t a_j \sigma_i(y_j)$ (i = 1, . . . , D) are distinct. There are integers $a_i$ with $|a_i| \le D^2$ for which this holds.
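The criterion in part (ii) can be illustrated by a brute-force search. The sketch below is a toy model only: it assumes the D conjugate vectors are given with integer coordinates (a stand-in for the actual elements of an algebraic closure), and the name `find_separating_coefficients` is hypothetical.

```python
from itertools import product


def find_separating_coefficients(conjugate_tuples, bound):
    """Search for integers a_1..a_t with |a_i| <= bound such that the linear
    forms sum_j a_j * s_j take pairwise distinct values on the given tuples.

    Returns the first coefficient vector found, or None if there is none."""
    t = len(conjugate_tuples[0])
    for coeffs in product(range(-bound, bound + 1), repeat=t):
        values = [
            sum(a * s for a, s in zip(coeffs, tup)) for tup in conjugate_tuples
        ]
        if len(set(values)) == len(values):
            return coeffs
    return None


if __name__ == "__main__":
    # Toy data: D = 4 "conjugates" of a pair (y_1, y_2), searched up to D^2 = 16.
    tuples = [(0, 0), (1, 0), (0, 1), (1, 1)]
    print(find_separating_coefficients(tuples, 16))
```

The search space has size (2·bound + 1)^t, so this is only practical for very small t and D.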
Lemma 8.3.2 There are $G_0, \ldots, G_D \in A_0$ such that
$$\sum_{i=0}^{D} G_i w^{D-i} = 0, \qquad G_0 G_D \ne 0, \tag{8.3.5}$$
$$\deg G_i \le (2d)^{\exp O(r)}, \qquad h(G_i) \le (2d)^{\exp O(r)}(h+1) \qquad (i = 0, \ldots, D). \tag{8.3.6}$$
Proof. In what follows we write $\mathbf{Y} = (X_{q+1}, \ldots, X_r)$ and $\mathbf{Y}^{\mathbf{u}} := X_{q+1}^{u_1} \cdots X_{q+t}^{u_t}$, $|\mathbf{u}| := u_1 + \cdots + u_t$ for tuples of non-negative integers $\mathbf{u} = (u_1, \ldots, u_t)$. Further, we define $W := \sum_{j=1}^{t} a_j X_{q+j}$.
$G_0, \ldots, G_D$ as in (8.3.5) clearly exist since $w$ has degree $D$ over $K_0$. By (8.3.4), there are $g_1^*, \ldots, g_m^* \in A_0[\mathbf{Y}]$ such that
$$\sum_{i=0}^{D} G_i W^{D-i} = \sum_{j=1}^{m} g_j^* f_j^*. \tag{8.3.7}$$
By Proposition 8.2.2 (ii), applied with the field $F = K_0$, there are polynomials $g_j^* \in K_0[\mathbf{Y}]$ (so with coefficients being rational functions in $\mathbf{z}$) satisfying (8.3.7) of degree at most $(2\max(d, D))^{2^t} \le (2d^t)^{2^t} =: d_0$ in $\mathbf{Y}$. By multiplying $G_0, \ldots, G_D$ with an appropriate non-zero factor from $A_0$ we may assume that the $g_j^*$ are polynomials in $A_0[\mathbf{Y}]$ of degree at most $d_0$ in $\mathbf{Y}$. By considering (8.3.7) with such polynomials $g_j^*$, we obtain
$$\sum_{i=0}^{D} G_i W^{D-i} = \sum_{j=1}^{m} \Big(\sum_{|\mathbf{u}| \le d_0} g_{j,\mathbf{u}} \mathbf{Y}^{\mathbf{u}}\Big) \cdot \Big(\sum_{|\mathbf{v}| \le d} f_{j,\mathbf{v}} \mathbf{Y}^{\mathbf{v}}\Big), \tag{8.3.8}$$
where $g_{j,\mathbf{u}} \in A_0$ and $f_j^* = \sum_{|\mathbf{v}| \le d} f_{j,\mathbf{v}} \mathbf{Y}^{\mathbf{v}}$ with $f_{j,\mathbf{v}} \in A_0$. We view $G_0, \ldots, G_D$ and the polynomials $g_{j,\mathbf{u}}$ as the unknowns of (8.3.8). Then (8.3.8) has solutions with $G_0 G_D \ne 0$.
We may view (8.3.8) as a system of linear equations U x = 0 over K0 ,
where x consists of Gi (i = 0, . . . , D) and gj,u (j = 1, . . . , m, |u| ≤ d0 ).
By Lemma 8.3.1 and an elementary estimate, the polynomial $W^{D-i} = (\sum_{k=1}^{t} a_k X_{q+k})^{D-i}$ has logarithmic height at most $O(D \log(2D^2 t)) \le (2d)^{O(t)}$. By combining this with (8.3.4), it follows that the entries of the matrix $U$ are elements of $A_0$ of degrees at most $d$ and logarithmic heights at most $h_0 := \max((2d)^{O(t)}, h)$. Further, the number of rows of $U$ is at most the number of monomials in $\mathbf{Y}$ of degree at most $d_0 + d$, which is bounded above by $m_0 := \binom{d_0+d+t}{t}$. So, by Corollary 8.2.3, the solution module of (8.3.8) is generated by vectors $\mathbf{x} = (G_0, \ldots, G_D, \{g_{j,\mathbf{u}}\})$, consisting of elements from $A_0$ of degree and height at most
$$(2m_0 d)^{2^q} \le (2d)^{\exp O(r)}, \qquad (2m_0 d)^{6^q}(h_0 + 1) \le (2d)^{\exp O(r)}(h+1),$$
respectively.
At least one of these vectors $\mathbf{x}$ must have $G_0 G_D \ne 0$ since otherwise (8.3.8) would have no solution with $G_0 G_D \ne 0$, contradicting what we already observed about (8.3.5). Thus, there exists a solution $\mathbf{x}$ whose components $G_0, \ldots, G_D$ satisfy both (8.3.5) and (8.3.6). This proves our lemma.
It will be more convenient to work with
y := G0 w = G0 · (a1 y1 + · · · + at yt ).
In the case D = 1 we set y := 1. The following properties of y follow at once
from Corollary 1.9.5 and Lemmas 8.3.1 and 8.3.2.
Corollary 8.3.3 We have $K = K_0(y)$, $y \in A$, $y$ is integral over $A_0$, and $y$ has minimal polynomial $F(X) = X^D + F_1 X^{D-1} + \cdots + F_D$ over $K_0$ with
$$F_i \in A_0, \qquad \deg F_i \le (2d)^{\exp O(r)}, \qquad h(F_i) \le (2d)^{\exp O(r)}(h+1)$$
for i = 1, . . . , D.
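One way to see why y = G0 w is integral is to multiply the relation G0 w^D + G1 w^(D−1) + · · · + GD = 0 by G0^(D−1), which gives a monic relation for y with coefficients Gi G0^(i−1). The sketch below checks this identity with integers in place of elements of A0 (the case q = 0); the function names are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction


def monic_coefficients(G):
    """Given [G_0, ..., G_D] with sum_i G_i * w**(D - i) == 0, return
    [F_1, ..., F_D] with F_i = G_i * G_0**(i - 1), so that y = G_0 * w
    satisfies y**D + F_1 * y**(D - 1) + ... + F_D == 0."""
    G0 = G[0]
    return [G[i] * G0 ** (i - 1) for i in range(1, len(G))]


def evaluate_monic(F, y):
    """Evaluate y**D + F_1 * y**(D - 1) + ... + F_D."""
    D = len(F)
    return y ** D + sum(F[i - 1] * y ** (D - i) for i in range(1, D + 1))


if __name__ == "__main__":
    G = [2, -3, 1]               # 2w^2 - 3w + 1 = 0, roots w = 1/2 and w = 1
    F = monic_coefficients(G)    # coefficients of the monic relation for y = 2w
    for w in (Fraction(1, 2), Fraction(1)):
        print(F, evaluate_monic(F, G[0] * w))
```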
Recall that A0 = Z if q = 0 and Z[z1 , . . . , zq ] if q > 0, where in the latter
case, z1 , . . . , zq are algebraically independent. Hence A0 is a unique factorization domain, and so the greatest common divisor of a finite set of elements
of A0 is well-defined and up to sign uniquely determined. With every element
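$\alpha \in K$ we can associate an up to sign unique tuple $P_{\alpha,0}, \ldots, P_{\alpha,D-1}, Q_\alpha$ of elements of $A_0$ such that
$$\alpha = Q_\alpha^{-1} \sum_{j=0}^{D-1} P_{\alpha,j}\, y^j \quad\text{with } Q_\alpha \ne 0,\ \gcd(P_{\alpha,0}, \ldots, P_{\alpha,D-1}, Q_\alpha) = 1. \tag{8.3.9}$$
Put
$$\overline{\deg}\,\alpha := \max(\deg P_{\alpha,0}, \ldots, \deg P_{\alpha,D-1}, \deg Q_\alpha), \qquad \overline{h}(\alpha) := \max(h(P_{\alpha,0}), \ldots, h(P_{\alpha,D-1}), h(Q_\alpha)).$$
Then for q = 0 we have $\overline{\deg}\,\alpha = 0$, $\overline{h}(\alpha) = \log\max(|P_{\alpha,0}|, \ldots, |P_{\alpha,D-1}|, |Q_\alpha|)$.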
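For q = 0 the normalized tuple and the resulting height can be computed directly from the rational coordinates of α with respect to 1, y, . . . , y^(D−1). The following is a minimal sketch under that assumption; `normalized_tuple` and `height` are hypothetical helper names, and the sign is fixed by taking Q > 0.

```python
from fractions import Fraction
from math import gcd, lcm, log


def normalized_tuple(coeffs):
    """coeffs: rationals c_0..c_{D-1} with alpha = sum_j c_j * y**j.
    Return (P, Q) with integer list P, integer Q > 0, alpha = Q^{-1} sum P_j y^j
    and gcd(P_0, ..., P_{D-1}, Q) = 1."""
    coeffs = [Fraction(c) for c in coeffs]
    Q = lcm(*(c.denominator for c in coeffs))   # least common denominator
    P = [int(c * Q) for c in coeffs]
    return P, Q


def height(coeffs):
    """log max(|P_0|, ..., |P_{D-1}|, |Q|) for the normalized tuple."""
    P, Q = normalized_tuple(coeffs)
    return log(max([abs(p) for p in P] + [Q]))


if __name__ == "__main__":
    # alpha = 1/2 + (3/4) y
    P, Q = normalized_tuple([Fraction(1, 2), Fraction(3, 4)])
    print(P, Q, gcd(Q, *P), height([Fraction(1, 2), Fraction(3, 4)]))
```

Taking Q as the least common denominator of fractions in lowest terms already forces the greatest common divisor to be 1, so no further division is needed in this special case.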
Lemma 8.3.4 Let $\alpha \in K^*$ and let $(a, b)$ be a pair of representatives for $\alpha$, with $a, b \in \mathbb{Z}[X_1, \ldots, X_r]$, $b \notin I$. Put
$$d^* := \max(d, \deg a, \deg b), \qquad h^* := \max(h, h(a), h(b)).$$
Then
$$\overline{\deg}\,\alpha \le (2d^*)^{\exp O(r)}, \qquad \overline{h}(\alpha) \le (2d^*)^{\exp O(r)}(h^* + 1).$$
Proof. Consider the linear equation
$$Q \cdot \alpha = \sum_{j=0}^{D-1} P_j\, y^j \tag{8.3.10}$$
in unknowns $P_0, \ldots, P_{D-1}, Q \in A_0$. This equation has a solution with $Q \ne 0$, since $\alpha \in K = K_0(y)$ and $y$ has degree $D$ over $K_0$. Write again $\mathbf{Y} = (X_{q+1}, \ldots, X_r)$ and put $Y := G_0 \cdot (\sum_{j=1}^{t} a_j X_{q+j})$. Let $a^*, b^* \in A_0[\mathbf{Y}]$ be obtained from $a, b$ by substituting $z_i$ for $X_i$ for i = 1, . . . , q ($a^* = a$, $b^* = b$ if q = 0). By (8.3.4), there are $g_j^* \in A_0[\mathbf{Y}]$ such that
$$Q \cdot a^* - b^* \sum_{j=0}^{D-1} P_j Y^j = \sum_{j=1}^{m} g_j^* f_j^*. \tag{8.3.11}$$
By Proposition 8.2.2 (ii) this identity holds with polynomials $g_j^* \in A_0[\mathbf{Y}]$ of degree in $\mathbf{Y}$ at most $(2\max(d^*, D))^{2^t} \le (2d^*)^{t2^t}$, where possibly we have to multiply $(P_0, \ldots, P_{D-1}, Q)$ by a non-zero element from $A_0$. Now, exactly as in the proof of Lemma 8.3.2, one can rewrite (8.3.11) as a system of linear equations over $K_0$ and then apply Corollary 8.2.3. It follows that (8.3.10) is satisfied by $P_0, \ldots, P_{D-1}, Q \in A_0$ with $Q \ne 0$ and
$$\deg P_i, \deg Q \le (2d^*)^{\exp O(r)}, \qquad h(P_i), h(Q) \le (2d^*)^{\exp O(r)}(h^* + 1) \qquad (i = 0, \ldots, D-1).$$
By dividing $P_0, \ldots, P_{D-1}, Q$ by their greatest common divisor and using Corollary 1.9.5 we obtain $P_{\alpha,0}, \ldots, P_{\alpha,D-1}, Q_\alpha \in A_0$ satisfying both (8.3.9) and
$$\deg P_{\alpha,i}, \deg Q_\alpha \le (2d^*)^{\exp O(r)}, \qquad h(P_{\alpha,i}), h(Q_\alpha) \le (2d^*)^{\exp O(r)}(h^* + 1) \qquad (i = 0, \ldots, D-1).$$
Lemma 8.3.5 Let $\alpha_1, \ldots, \alpha_n \in K^*$. For i = 1, . . . , n, let $(a_i, b_i)$ be a pair of representatives for $\alpha_i$, with $a_i, b_i \in \mathbb{Z}[X_1, \ldots, X_r]$, $b_i \notin I$. Put
$$d^{**} := \max(d, \deg a_1, \deg b_1, \ldots, \deg a_n, \deg b_n), \qquad h^{**} := \max(h, h(a_1), h(b_1), \ldots, h(a_n), h(b_n)).$$
Then there is a non-zero $f \in A_0$ such that
$$A \subseteq A_0[y, f^{-1}], \qquad \alpha_1, \ldots, \alpha_n \in A_0[y, f^{-1}]^*, \tag{8.3.12}$$
$$\deg f \le (n+1)(2d^{**})^{\exp O(r)}, \qquad h(f) \le (n+1)(2d^{**})^{\exp O(r)}(h^{**} + 1). \tag{8.3.13}$$
Proof. Take
$$f := \prod_{i=1}^{t} Q_{y_i} \cdot \prod_{j=1}^{n} Q_{\alpha_j} Q_{\alpha_j^{-1}}.$$
Since in general, $Q_\beta \beta \in A_0[y]$ for $\beta \in K^*$, we have $f\beta \in A_0[y]$ for $\beta = y_1, \ldots, y_t, \alpha_1, \alpha_1^{-1}, \ldots, \alpha_n, \alpha_n^{-1}$. This implies (8.3.12). The inequalities (8.3.13) follow at once from Lemma 8.3.4 and Corollary 1.9.5.
Lemma 8.3.6 Let $\lambda \in K^*$ and let $x$ be a non-zero element of $A$. Let $(a, b)$ with $a, b \in \mathbb{Z}[X_1, \ldots, X_r]$ be a pair of representatives for $\lambda$. Put
$$d_0 := \max(\deg f_1, \ldots, \deg f_m, \deg a, \deg b, \overline{\deg}\,\lambda x), \qquad h_0 := \max(h(f_1), \ldots, h(f_m), h(a), h(b), \overline{h}(\lambda x)).$$
Then $x$ has a representative $\tilde{x} \in \mathbb{Z}[X_1, \ldots, X_r]$ such that
$$\deg \tilde{x} \le (2d_0)^{\exp O(r \log^* r)}(h_0 + 1), \qquad h(\tilde{x}) \le (2d_0)^{\exp O(r \log^* r)}(h_0 + 1)^{r+1}.$$
If moreover $x \in A^*$, then $x^{-1}$ has a representative $\tilde{x}' \in \mathbb{Z}[X_1, \ldots, X_r]$ with
$$\deg \tilde{x}' \le (2d_0)^{\exp O(r \log^* r)}(h_0 + 1), \qquad h(\tilde{x}') \le (2d_0)^{\exp O(r \log^* r)}(h_0 + 1)^{r+1}.$$
Proof. In the case q > 0, we identify $z_i$ with $X_i$ and view elements of $A_0$ as polynomials in $\mathbb{Z}[X_1, \ldots, X_q]$. Put $Y := G_0 \cdot (\sum_{i=1}^{t} a_i X_{q+i})$. We have
$$\lambda x = Q^{-1} \sum_{i=0}^{D-1} P_i\, y^i \tag{8.3.14}$$
with $P_0, \ldots, P_{D-1}, Q \in A_0$ and $\gcd(P_0, \ldots, P_{D-1}, Q) = 1$. According to (8.3.14), $\tilde{x} \in \mathbb{Z}[X_1, \ldots, X_r]$ is a representative for $x$ if and only if there are $g_1, \ldots, g_m \in \mathbb{Z}[X_1, \ldots, X_r]$ such that
$$\tilde{x} \cdot (Q \cdot a) + \sum_{i=1}^{m} g_i f_i = b \sum_{i=0}^{D-1} P_i Y^i. \tag{8.3.15}$$
We may view (8.3.15) as an inhomogeneous linear equation in the unknowns $\tilde{x}, g_1, \ldots, g_m$. Notice that by Lemmas 8.3.1–8.3.4 the degrees and logarithmic heights of $Qa$ and $b \sum_{i=0}^{D-1} P_i Y^i$ are all bounded above by
$$(2d_0)^{\exp O(r)}, \qquad (2d_0)^{\exp O(r)}(h_0 + 1),$$
respectively. Now Proposition 8.2.5 implies that (8.3.15) has a solution with upper bounds for $\deg \tilde{x}$, $h(\tilde{x})$ as stated in the lemma.
Now suppose that $x \in A^*$. Again by (8.3.14), $\tilde{x}' \in \mathbb{Z}[X_1, \ldots, X_r]$ is a representative for $x^{-1}$ if and only if there are $g_1', \ldots, g_m' \in \mathbb{Z}[X_1, \ldots, X_r]$ such that
$$\tilde{x}' \cdot b \sum_{i=0}^{D-1} P_i Y^i + \sum_{i=1}^{m} g_i' f_i = Q \cdot a.$$
Similarly as above, this equation has a solution with upper bounds for $\deg \tilde{x}'$, $h(\tilde{x}')$ as stated in the lemma.
Recall that we have defined $A_0 = \mathbb{Z}[z_1, \ldots, z_q]$, $K_0 = \mathbb{Q}(z_1, \ldots, z_q)$ if q > 0 and $A_0 = \mathbb{Z}$, $K_0 = \mathbb{Q}$ if q = 0, and that in the case q = 0, degrees and $\overline{\deg}$-s are always zero. Theorem 8.1.1 can be deduced from the following proposition, which makes sense also if q = 0. The proof of this proposition is given in Sections 8.4–8.6.
Proposition 8.3.7 Let $f \in A_0$ with $f \ne 0$, and let
$$F = X^D + F_1 X^{D-1} + \cdots + F_D \in A_0[X] \qquad (D \ge 1)$$
be the minimal polynomial of $y$ over $K_0$. Let $d_1 \ge 1$, $h_1 \ge 1$ and suppose
$$\max(\deg f, \deg F_1, \ldots, \deg F_D) \le d_1, \qquad \max(h(f), h(F_1), \ldots, h(F_D)) \le h_1. \tag{8.3.16}$$
Define the domain $B := A_0[y, f^{-1}]$. Then for each pair $(x_1, x_2)$ with
$$x_1 + x_2 = 1, \qquad x_1, x_2 \in B^* \tag{8.3.17}$$
we have
$$\overline{\deg}\, x_1, \overline{\deg}\, x_2 \le 4qD^2 \cdot d_1, \tag{8.3.18}$$
$$\overline{h}(x_1), \overline{h}(x_2) \le \exp O\big(2D(q + d_1)(\log^*\{2D(q + d_1)\})^2 + D \log^* Dh_1\big). \tag{8.3.19}$$
Proof of Theorem 8.1.1. Let $a_1, a_2, a_3 \in A \setminus \{0\}$ be the coefficients of (8.1.1), and $\tilde{a}_1, \tilde{a}_2, \tilde{a}_3$ the representatives for $a_1, a_2, a_3$ from the statement of Theorem 8.1.1. By Lemma 8.3.5, there exists a non-zero $f \in A_0$ such that $A \subseteq B := A_0[y, f^{-1}]$, $a_1, a_2, a_3 \in B^*$, and moreover,
$$\deg f \le (2d)^{\exp O(r)}, \qquad h(f) \le (2d)^{\exp O(r)}(h + 1).$$
By Corollary 8.3.3 we have the same type of upper bounds for the degrees and logarithmic heights of $F_1, \ldots, F_D$. So in Proposition 8.3.7 we may take $d_1 = (2d)^{\exp O(r)}$, $h_1 = (2d)^{\exp O(r)}(h + 1)$. Finally, by Lemma 8.3.1 we have $D \le d^t$.
Let $(x_1, x_2)$ be a solution of (8.1.1) and put $x_i' := a_i x_i / a_3$ for i = 1, 2. Let i ∈ {1, 2}. By Proposition 8.3.7 we have
$$\overline{\deg}\, x_i' \le 4q d^{2t} (2d)^{\exp O(r)} \le (2d)^{\exp O(r)}, \qquad \overline{h}(x_i') \le \exp\big((2d)^{\exp O(r)}(h + 1)\big).$$
We apply Lemma 8.3.6 with $\lambda = a_i / a_3$. Notice that $\lambda$ is represented by $(\tilde{a}_i, \tilde{a}_3)$. By assumption, $\tilde{a}_i$ and $\tilde{a}_3$ have degrees at most $d$ and logarithmic heights at most $h$. Letting $\tilde{a}_i$, $\tilde{a}_3$ play the role of $a$, $b$ in Lemma 8.3.6, we see that in that lemma we may take $h_0 = \exp((2d)^{\exp O(r)}(h + 1))$ and $d_0 = (2d)^{\exp O(r)}$. It follows that $x_i$, $x_i^{-1}$ have representatives $\tilde{x}_i, \tilde{x}_i' \in \mathbb{Z}[X_1, \ldots, X_r]$ such that
$$\deg \tilde{x}_i, \deg \tilde{x}_i', h(\tilde{x}_i), h(\tilde{x}_i') \le \exp\big((2d)^{\exp O(r)}(h + 1)\big).$$
We observe here that the upper bound for $\overline{h}(x_i')$ dominates by far the other terms in our estimation. This completes the proof of Theorem 8.1.1.
Proposition 8.3.7 is proved in Sections 8.4–8.6. In Section 8.4 we deduce
the degree bound (8.3.18). Here, our main tool is Theorem 7.1.1 (Mason’s
effective result on S-unit equations over function fields). In Section 8.5 we work
out a more precise version of an effective specialization argument of Győry
(1983, 1984). In Section 8.6 we prove (8.3.19) by combining the specialization
argument from Section 8.5 with a recent effective result for S-unit equations
over number fields, due to Győry and Yu (2006).
8.4 Bounding the degree in Proposition 8.3.7
We keep the notation from Proposition 8.3.7. We may assume that q > 0
because the case q = 0 is trivial. Let as before K0 = Q(z1 , . . . , zq ), K = K0 (y),
$A_0 = \mathbb{Z}[z_1, \ldots, z_q]$, $B = \mathbb{Z}[z_1, \ldots, z_q, f^{-1}, y]$. Choose an algebraic closure $\overline{K_0}$ of $K_0$. Then there are precisely $D$ $K_0$-isomorphic embeddings of $K$ into $\overline{K_0}$, which we denote by $x \mapsto x^{(i)}$ (i = 1, . . . , D).
Fix i ∈ {1, . . . , q}. Let $k_i$ be an algebraic closure of $\mathbb{Q}(z_1, \ldots, z_{i-1}, z_{i+1}, \ldots, z_q)$, contained in $\overline{K_0}$. Thus, $A_0$ is contained in $k_i[z_i]$. Define the field
$$M_i := k_i\big(z_i, y^{(1)}, \ldots, y^{(D)}\big).$$
That is, $M_i$ is the splitting field of the polynomial $X^D + F_1 X^{D-1} + \cdots + F_D$ over $k_i(z_i)$. The subring
$$B_i := k_i\big[z_i, f^{-1}, y^{(1)}, \ldots, y^{(D)}\big]$$
of $M_i$ contains $B = \mathbb{Z}[z_1, \ldots, z_q, f^{-1}, y]$ as a subring. Put $\Delta_i := [M_i : k_i(z_i)]$.
We will apply the estimates from Sections 2.2 and 2.3 with $z_i, k_i, M_i$ instead of $z, k, K$. Denote by $g_{M_i/k_i}$ the genus of $M_i$ over $k_i$. The height $H_{M_i}$ is taken with respect to $M_i/k_i$. For $G \in A_0$, we denote by $\deg_{z_i} G$ the degree of $G$ in the variable $z_i$.
Lemma 8.4.1 For $\alpha \in K$ we have
$$\overline{\deg}\,\alpha \le qD \cdot d_1 + \sum_{i=1}^{q} \Delta_i^{-1} \sum_{j=1}^{D} H_{M_i}\big(\alpha^{(j)}\big).$$
Proof. We have
$$\alpha = Q^{-1} \sum_{j=0}^{D-1} P_j\, y^j$$
for certain $P_0, \ldots, P_{D-1}, Q \in A_0$ with $\gcd(Q, P_0, \ldots, P_{D-1}) = 1$. Clearly,
$$\overline{\deg}\,\alpha \le \sum_{i=1}^{q} \mu_i, \quad\text{where } \mu_i := \max(\deg_{z_i} Q, \deg_{z_i} P_0, \ldots, \deg_{z_i} P_{D-1}). \tag{8.4.1}$$
Below, we estimate $\mu_1, \ldots, \mu_q$ from above. We heavily use the height properties listed in Section 2.2. We fix i ∈ {1, . . . , q} and use the notation introduced above.
By taking conjugates, we obtain
$$\alpha^{(k)} = Q^{-1} \sum_{j=0}^{D-1} P_j \cdot (y^{(k)})^j \quad\text{for } k = 1, \ldots, D.$$
Let $\Omega$ be the $D \times D$-matrix with rows
$$(1, \ldots, 1),\ \big(y^{(1)}, \ldots, y^{(D)}\big),\ \ldots,\ \big((y^{(1)})^{D-1}, \ldots, (y^{(D)})^{D-1}\big).$$
By Cramer’s rule, $P_j/Q = \delta_j/\delta$, where $\delta = \det \Omega$, and $\delta_j$ is the determinant of the matrix obtained by replacing the $j$-th row of $\Omega$ by $(\alpha^{(1)}, \ldots, \alpha^{(D)})$.
Gauss’ Lemma implies that $\gcd(P_0, \ldots, P_{D-1}, Q) = 1$ in the ring $k_i[z_i]$. So by (2.2.2) (with $z_i$ in place of $z$) we have
$$\mu_i = H^{\mathrm{hom}}_{k_i(z_i)}(Q, P_0, \ldots, P_{D-1}).$$
Notice that $(\delta, \delta_1, \ldots, \delta_D)$ is a scalar multiple of $(Q, P_0, \ldots, P_{D-1})$. By combining (2.2.3), (2.2.1) and inserting $[M_i : k_i(z_i)] = \Delta_i$, we obtain
$$\mu_i = \Delta_i^{-1} H_{M_i}(Q, P_0, \ldots, P_{D-1}) = \Delta_i^{-1} H_{M_i}(\delta, \delta_1, \ldots, \delta_D). \tag{8.4.2}$$
We bound from above the right-hand side. A straightforward estimate yields that for every valuation $v$ of $M_i/k_i$,
$$-\min(v(\delta), v(\delta_1), \ldots, v(\delta_D)) \le -D \sum_{j=1}^{D} \min(0, v(y^{(j)})) - \sum_{j=1}^{D} \min(0, v(\alpha^{(j)})),$$
and then summation over $v$ gives
$$H_{M_i}(\delta, \delta_1, \ldots, \delta_D) \le D \sum_{j=1}^{D} H_{M_i}(y^{(j)}) + \sum_{j=1}^{D} H_{M_i}(\alpha^{(j)}). \tag{8.4.3}$$
A combination of (2.2.10), (2.2.3), (2.2.2) and assumption (8.3.16) gives
$$\sum_{j=1}^{D} H_{M_i}\big(y^{(j)}\big) = H_{M_i}(F) = \Delta_i H_{k_i(z_i)}(F) = \Delta_i \max(0, \deg_{z_i} F_1, \ldots, \deg_{z_i} F_D) \le \Delta_i \cdot d_1.$$
Together with (8.4.2) and (8.4.3) this leads to
$$\mu_i \le D d_1 + \Delta_i^{-1} \sum_{j=1}^{D} H_{M_i}\big(\alpha^{(j)}\big).$$
Now these bounds for i = 1, . . . , q together with (8.4.1) imply our lemma.
Proof of (8.3.18). We fix again i ∈ {1, . . . , q} and use the notation introduced above. By Proposition 2.3.2, applied with $k_i, z_i, M_i$ instead of $k, z, K$ and with the polynomial $F = X^D + F_1 X^{D-1} + \cdots + F_D$, we have
$$g_{M_i/k_i} \le (\Delta_i - 1) D \max_j \deg_{z_i}(F_j) \le (\Delta_i - 1) \cdot D d_1. \tag{8.4.4}$$
Let $S$ denote the subset of valuations $v$ of $M_i/k_i$ such that $v(z_i) < 0$ or $v(f) > 0$. Each valuation of $k_i(z_i)$ can be extended to at most $[M_i : k_i(z_i)] = \Delta_i$ valuations of $M_i$. Hence $M_i$ has at most $\Delta_i$ valuations $v$ with $v(z_i) < 0$ and at most $\Delta_i \deg f$ valuations with $v(f) > 0$. Thus,
$$|S| \le \Delta_i + \Delta_i \deg_{z_i} f \le \Delta_i(1 + \deg f) \le \Delta_i(1 + d_1). \tag{8.4.5}$$
Define the ring of $S$-integers in $M_i$,
$$O_S = \{x \in M_i : v(x) \ge 0 \text{ for } v \in M_{M_i} \setminus S\}.$$
This ring contains $k_i$, $z_i$, $f$ and is integrally closed. As a consequence, $A_0 = \mathbb{Z}[z_1, \ldots, z_q] \subset O_S$. The elements $y^{(1)}, \ldots, y^{(D)}$ belong to $M_i$ and are integral over $A_0$ so they certainly belong to $O_S$. As a consequence, the elements of $B$ and their conjugates over $\mathbb{Q}(z_1, \ldots, z_q)$ belong to $O_S$. In particular, if $x_1, x_2 \in B^*$ and $x_1 + x_2 = 1$, then
$$x_1^{(j)} + x_2^{(j)} = 1, \qquad x_1^{(j)}, x_2^{(j)} \in O_S^* \qquad\text{for } j = 1, \ldots, D.$$
We apply Mason’s inequality, Theorem 7.1.1, and insert the upper bounds (8.4.4) and (8.4.5). It follows that for j = 1, . . . , D we have either $x_1^{(j)} \in k_i$ or
$$H_{M_i}\big(x_1^{(j)}\big) \le |S| + 2g_{M_i/k_i} - 2 \le 3\Delta_i \cdot D d_1;$$
in fact the last upper bound is valid also if $x_1^{(j)} \in k_i$. Together with Lemma 8.4.1 this gives
$$\overline{\deg}\, x_1 \le qD d_1 + qD \cdot 3D d_1 \le 4qD^2 d_1.$$
For $\overline{\deg}\, x_2$ we derive the same estimate. This proves (8.3.18).
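For the reader’s convenience, the final arithmetic can be spelled out as follows; this is only an expansion of the step above, using the bound 3Δ_i · D d_1 for each of the D conjugates and each of the q indices i, together with D ≥ 1:

$$\sum_{i=1}^{q} \Delta_i^{-1} \sum_{j=1}^{D} H_{M_i}\big(x_1^{(j)}\big) \le \sum_{i=1}^{q} \Delta_i^{-1} \cdot D \cdot 3\Delta_i D d_1 = 3qD^2 d_1, \qquad qDd_1 + 3qD^2 d_1 = qDd_1(1 + 3D) \le 4qD^2 d_1.$$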
8.5 Specializations
In this section we prove some results about specialization homomorphisms from
$B$ to $\overline{\mathbb{Q}}$, where $B$ is the integral domain from Proposition 8.3.7. We start with three auxiliary results that are used in the construction of our specializations.
Lemma 8.5.1 Let m ≥ 1, let $\alpha_1, \ldots, \alpha_m \in \overline{\mathbb{Q}}$ be distinct and suppose that $G(X) := \prod_{i=1}^{m} (X - \alpha_i) \in \mathbb{Z}[X]$. Let $q, p_0, \ldots, p_{m-1}$ be integers with
$$\gcd(q, p_0, \ldots, p_{m-1}) = 1,$$
and put
$$\beta_i := \frac{1}{q} \sum_{j=0}^{m-1} p_j \alpha_i^j \qquad (i = 1, \ldots, m).$$
Then
$$\log \max(|q|, |p_0|, \ldots, |p_{m-1}|) \le 2m^2 + (m-1)h(G) + \sum_{j=1}^{m} h(\beta_j).$$
Proof. We use the height estimates from Section 1.9. For m = 1 the assertion is obvious, so we assume m ≥ 2. Let $L = \mathbb{Q}(\alpha_1, \ldots, \alpha_m)$ and $d := [L : \mathbb{Q}]$. Let $\Omega$ be the $m \times m$ matrix with rows $(\alpha_1^i, \ldots, \alpha_m^i)$ (i = 0, . . . , m − 1). By Cramer’s rule we have $p_i/q = \delta_i/\delta$ (i = 0, . . . , m − 1), where $\delta = \det \Omega$ and $\delta_i$ is the determinant of the matrix obtained by replacing the $i$-th row of $\Omega$ by $(\beta_1, \ldots, \beta_m)$. Put
$$\mu := \log \max(|q|, |p_0|, \ldots, |p_{m-1}|).$$
Then since $(\delta, \delta_0, \ldots, \delta_{m-1})$ is a scalar multiple of $(q, p_0, \ldots, p_{m-1})$ we have, by (1.9.4) and (1.9.8),
$$\mu = h^{\mathrm{hom}}(q, p_0, \ldots, p_{m-1}) = h^{\mathrm{hom}}(\delta, \delta_0, \ldots, \delta_{m-1}) = \frac{1}{d} \sum_{v \in M_L} \log \max(|\delta|_v, |\delta_0|_v, \ldots, |\delta_{m-1}|_v). \tag{8.5.1}$$
Estimating the determinants using Hadamard’s inequality for the infinite places and the ultrametric inequality for the finite places, we get
$$\max(|\delta|_v, |\delta_0|_v, \ldots, |\delta_{m-1}|_v) \le m^{ms(v)/2} \prod_{i=1}^{m} \max(1, |\alpha_i|_v)^{m-1} \max(1, |\beta_i|_v)$$
for $v \in M_L$, where s(v) = 1 if v is real, s(v) = 2 if v is complex, and s(v) = 0 if v is finite. Together with (8.5.1) this implies
$$\mu \le \tfrac{1}{2} m \log m + \sum_{i=1}^{m} \big((m-1)h(\alpha_i) + h(\beta_i)\big).$$
A combination with Corollary 1.9.6 implies Lemma 8.5.1.
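Lemma 8.5.2 Let $g \in \mathbb{Z}[z_1, \ldots, z_q]$ be a non-zero polynomial of degree $d$ and $\mathcal{N}$ a subset of $\mathbb{Z}$ of cardinality > d. Then
$$|\{\mathbf{u} \in \mathcal{N}^q : g(\mathbf{u}) \ne 0\}| \ge |\mathcal{N}|^q - d|\mathcal{N}|^{q-1}, \quad\text{i.e.,}\quad |\{\mathbf{u} \in \mathcal{N}^q : g(\mathbf{u}) = 0\}| \le d|\mathcal{N}|^{q-1}.$$
Proof. We proceed by induction on q. For q = 1 the assertion is clear. Let q ≥ 2. Write $g = \sum_{i=0}^{d_0} g_i(z_1, \ldots, z_{q-1}) z_q^i$ with $g_i \in \mathbb{Z}[z_1, \ldots, z_{q-1}]$ and $g_{d_0} \ne 0$. Then $\deg g_{d_0} \le d - d_0$. The induction hypothesis implies that there are at most $(d - d_0)|\mathcal{N}|^{q-2} \cdot |\mathcal{N}|$ tuples $(u_1, \ldots, u_q) \in \mathcal{N}^q$ with $g_{d_0}(u_1, \ldots, u_{q-1}) = 0$. Further, there are at most $|\mathcal{N}|^{q-1} \cdot d_0$ tuples $\mathbf{u} \in \mathcal{N}^q$ with $g_{d_0}(u_1, \ldots, u_{q-1}) \ne 0$ and $g(u_1, \ldots, u_q) = 0$. Summing these two quantities implies that $g$ has at most $d|\mathcal{N}|^{q-1}$ zeros in $\mathcal{N}^q$.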
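The zero count in Lemma 8.5.2 can be checked by brute force on a small grid. The sketch below does this for one sample polynomial; the polynomial, the grid and the name `count_zeros` are illustrative choices, not data from the text.

```python
from itertools import product


def count_zeros(g, grid, q):
    """Count the tuples u in grid^q at which the polynomial function g vanishes."""
    return sum(1 for u in product(grid, repeat=q) if g(*u) == 0)


def sample_polynomial(z1, z2):
    """g = z1*z2 - z1 = z1*(z2 - 1), a polynomial of total degree d = 2."""
    return z1 * z2 - z1


if __name__ == "__main__":
    grid = [0, 1, 2]              # |N| = 3 > d = 2
    d, q = 2, 2
    zeros = count_zeros(sample_polynomial, grid, q)
    bound = d * len(grid) ** (q - 1)
    print(zeros, bound)
```

Here the zeros are the three points with z1 = 0 and the two further points with z2 = 1, z1 ≠ 0, so the count stays below the bound d|N|^(q−1) = 6.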
Lemma 8.5.3 Let $g_1, g_2 \in \mathbb{Z}[z_1, \ldots, z_q]$ be two non-zero polynomials of degrees $D_1, D_2$, respectively, and let $N$ be an integer ≥ max(D1 , D2 ). Define
$$S := \{\mathbf{u} \in \mathbb{Z}^q : |\mathbf{u}| \le N,\ g_2(\mathbf{u}) \ne 0\}.$$
Then $S$ is non-empty, and
$$|g_1|_p \le (4N)^{qD_1(D_1+1)/2} \max\{|g_1(\mathbf{u})|_p : \mathbf{u} \in S\} \quad\text{for } p \in M_{\mathbb{Q}} = \{\infty\} \cup \{\text{primes}\}. \tag{8.5.2}$$
Proof. Put $C_p := \max\{|g_1(\mathbf{u})|_p : \mathbf{u} \in S\}$ for $p \in M_{\mathbb{Q}}$. We proceed by induction on q, starting with q = 0. In the case q = 0 we interpret $g_1, g_2$ as non-zero constants with $|g_1|_p = C_p$ for $p \in M_{\mathbb{Q}}$. Then the lemma is trivial. Let q ≥ 1. Write
$$g_1 = \sum_{j=0}^{D_1} g_{1j}(z_1, \ldots, z_{q-1}) z_q^j, \qquad g_2 = \sum_{j=0}^{D_2} g_{2j}(z_1, \ldots, z_{q-1}) z_q^j,$$
where $g_{1,D_1}, g_{2,D_2} \ne 0$. By the induction hypothesis, the set
$$S' := \{\mathbf{u}' \in \mathbb{Z}^{q-1} : |\mathbf{u}'| \le N,\ g_{2,D_2}(\mathbf{u}') \ne 0\}$$
is non-empty and moreover,
$$\max_{0 \le j \le D_1} |g_{1j}|_p \le (4N)^{(q-1)D_1(D_1+1)/2}\, C_p' \quad\text{for } p \in M_{\mathbb{Q}}, \tag{8.5.3}$$
where
$$C_p' := \max\{|g_{1j}(\mathbf{u}')|_p : \mathbf{u}' \in S',\ j = 0, \ldots, D_1\}.$$
We estimate $C_p'$ from above in terms of $C_p$. Fix $\mathbf{u}' \in S'$. There are at least $2N + 1 - D_2 \ge D_1 + 1$ integers $u_q$ with $|u_q| \le N$ such that $g_2(\mathbf{u}', u_q) \ne 0$. Let $a_0, \ldots, a_{D_1}$ be distinct integers from this set. By Lagrange’s interpolation formula,
$$g_1(\mathbf{u}', X) = \sum_{j=0}^{D_1} g_{1j}(\mathbf{u}') X^j = \sum_{j=0}^{D_1} g_1(\mathbf{u}', a_j) \prod_{\substack{i=0 \\ i \ne j}}^{D_1} \frac{X - a_i}{a_j - a_i}.$$
Since, in general, the coefficients of a polynomial $\prod_{k=1}^{m} (X - c_k)$ with $c_1, \ldots, c_m \in \mathbb{C}$ have absolute values at most $\prod_{k=1}^{m} (1 + |c_k|)$, we deduce
$$\max_{0 \le j \le D_1} |g_{1j}(\mathbf{u}')| \le C_\infty \sum_{j=0}^{D_1} \prod_{\substack{i=0 \\ i \ne j}}^{D_1} \frac{1 + |a_i|}{|a_j - a_i|} \le C_\infty (D_1 + 1)(N + 1)^{D_1} \le (4N)^{D_1(D_1+1)/2}\, C_\infty.$$
Now let $p$ be a prime and put $\Delta := \prod_{1 \le i < j \le D_1} |a_j - a_i|$. Then
$$\max_{0 \le j \le D_1} |g_{1j}(\mathbf{u}')|_p \le C_p |\Delta|_p^{-1} \le \Delta\, C_p \le (4N)^{D_1(D_1+1)/2}\, C_p.$$
It follows that $C_p' \le (4N)^{D_1(D_1+1)/2}\, C_p$ for $p \in M_{\mathbb{Q}}$. A combination with (8.5.3) gives (8.5.2).
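The interpolation step recovers the coefficients of a one-variable polynomial of degree at most n − 1 from its values at n distinct integers. A minimal sketch in exact rational arithmetic follows; the helper names `poly_mul_linear` and `lagrange_coefficients` are hypothetical, and the example polynomial is chosen only for illustration.

```python
from fractions import Fraction


def poly_mul_linear(poly, c):
    """Multiply poly (coefficient list, lowest degree first) by (X - c)."""
    result = [Fraction(0)] * (len(poly) + 1)
    for k, coef in enumerate(poly):
        result[k + 1] += coef        # contribution of X * coef * X^k
        result[k] -= c * coef        # contribution of -c * coef * X^k
    return result


def lagrange_coefficients(nodes, values):
    """Coefficients (lowest degree first) of the unique polynomial of degree
    < len(nodes) taking values[j] at nodes[j], via Lagrange's formula."""
    n = len(nodes)
    total = [Fraction(0)] * n
    for j in range(n):
        basis = [Fraction(1)]        # running product of (X - a_i), i != j
        denom = Fraction(1)          # running product of (a_j - a_i), i != j
        for i in range(n):
            if i != j:
                basis = poly_mul_linear(basis, nodes[i])
                denom *= nodes[j] - nodes[i]
        for k in range(n):
            total[k] += values[j] * basis[k] / denom
    return total


if __name__ == "__main__":
    g = lambda x: 2 * x * x - 3 * x + 5
    nodes = [-1, 0, 2]
    print(lagrange_coefficients(nodes, [g(a) for a in nodes]))
```

The denominators appearing are exactly the products of differences a_j − a_i, which is why the p-adic estimate in the proof involves the product Δ of these differences.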
We now introduce our specializations $B \to \overline{\mathbb{Q}}$ and prove some properties.
We assume q > 0 and apart from that keep the notation and assumptions from
Proposition 8.3.7. In particular, A0 = Z[z1 , . . . , zq ], K0 = Q(z1 , . . . , zq ) and
K = Q(z1 , . . . , zq , y),
B = Z[z1 , . . . , zq , f −1 , y],
where f is a non-zero element of A0 , y is integral over A0 , and y has minimal
polynomial
F := XD + F1 XD−1 + · · · + FD ∈ A0 [X]
over K0 . In the case D = 1, we take y = 1, F = X − 1.
To allow for other applications (e.g., Lemma 8.7.2 below), we consider a
more general situation than what is needed for the proof of Proposition 8.3.7.
Let $d_1 \ge d_0 \ge 1$, $h_1 \ge h_0 \ge 1$ and assume that
$$\max(\deg F_1, \ldots, \deg F_D) \le d_0, \qquad \max(d_0, \deg f) \le d_1,$$
$$\max(h(F_1), \ldots, h(F_D)) \le h_0, \qquad \max(h_0, h(f)) \le h_1.$$
Let $\mathbf{u} = (u_1, \ldots, u_q) \in \mathbb{Z}^q$. Then the substitution $z_1 \mapsto u_1, \ldots, z_q \mapsto u_q$ defines a ring homomorphism (specialization)
$$\varphi_{\mathbf{u}} : \alpha \mapsto \alpha(\mathbf{u}) : \{g_1/g_2 : g_1, g_2 \in A_0,\ g_2(\mathbf{u}) \ne 0\} \to \mathbb{Q}.$$
We want to extend this to a ring homomorphism from $B$ to $\overline{\mathbb{Q}}$ and for this, we have to impose some restrictions on $\mathbf{u}$. Denote by $\Delta_F$ the discriminant of $F$ (with $\Delta_F := 1$ if D = deg F = 1), and let
$$\mathcal{H} := \Delta_F F_D \cdot f. \tag{8.5.4}$$
Then $\mathcal{H} \in A_0$. Using the fact that $\Delta_F$ is a polynomial of degree 2D − 2 with integer coefficients in $F_1, \ldots, F_D$, it follows easily that
$$\deg \mathcal{H} \le (2D - 1)d_0 + d_1 \le 2Dd_1. \tag{8.5.5}$$
Now assume that
$$\mathcal{H}(\mathbf{u}) \ne 0.$$
Then $f(\mathbf{u}) \ne 0$ and, moreover, the polynomial
$$F_{\mathbf{u}} := X^D + F_1(\mathbf{u})X^{D-1} + \cdots + F_D(\mathbf{u})$$
has $D$ distinct zeros which are all different from 0, say $y_1(\mathbf{u}), \ldots, y_D(\mathbf{u})$ (these numbers should not be confused with the algebraic functions $y_1, \ldots, y_t$ from Section 8.3). Thus, for j = 1, . . . , D the assignment
$$z_1 \mapsto u_1, \ldots, z_q \mapsto u_q, \qquad y \mapsto y_j(\mathbf{u})$$
defines a ring homomorphism $\varphi_{\mathbf{u},j}$ from $B$ to $\overline{\mathbb{Q}}$; in the case D = 1 it is just $\varphi_{\mathbf{u}}$. The image of $\alpha \in B$ under $\varphi_{\mathbf{u},j}$ is denoted by $\alpha_j(\mathbf{u})$. Recall that we may express elements $\alpha$ of $B$ as
$$\alpha = \sum_{i=0}^{D-1} (P_i/Q)\, y^i \tag{8.5.6}$$
with $P_0, \ldots, P_{D-1}, Q \in A_0$, $\gcd(P_0, \ldots, P_{D-1}, Q) = 1$.
Since $\alpha \in B$, the denominator $Q$ must divide a power of $f$, hence $Q(\mathbf{u}) \ne 0$. So we have
$$\alpha_j(\mathbf{u}) = \sum_{i=0}^{D-1} (P_i(\mathbf{u})/Q(\mathbf{u}))\, y_j(\mathbf{u})^i \qquad (j = 1, \ldots, D). \tag{8.5.7}$$
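The admissibility condition on u (non-vanishing of the product of the discriminant of F, its constant term and f at u) can be made concrete in the smallest non-trivial case q = 1, D = 2. The sketch below assumes F = X^2 + F1 X + F2 with F1, F2, f given as Python functions of one integer variable; the function name `admissible` and the sample data are illustrative only.

```python
def admissible(F1, F2, f, u):
    """Return True when Delta_F(u) * F_2(u) * f(u) != 0 for F = X^2 + F1*X + F2,
    where Delta_F = F1^2 - 4*F2 is the discriminant of a monic quadratic."""
    disc = F1(u) ** 2 - 4 * F2(u)
    return disc * F2(u) * f(u) != 0


if __name__ == "__main__":
    # Sample data: y = sqrt(z), so F = X^2 - z, and f = z - 1.
    F1 = lambda z: 0
    F2 = lambda z: -z
    f = lambda z: z - 1
    print([u for u in range(-3, 4) if admissible(F1, F2, f, u)])
```

In this example the excluded values are u = 0, where the specialized polynomial X^2 has a double zero equal to 0, and u = 1, where f vanishes and f^(-1) cannot be specialized.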
It is obvious that $\varphi_{\mathbf{u},j}$ is the identity on $B \cap \mathbb{Q}$. Thus, if $\alpha \in B \cap \overline{\mathbb{Q}}$, then $\varphi_{\mathbf{u},j}(\alpha)$ has the same minimal polynomial as $\alpha$ and so it is conjugate to $\alpha$.
For $\mathbf{u} = (u_1, \ldots, u_q) \in \mathbb{Z}^q$, we put $|\mathbf{u}| := \max(|u_1|, \ldots, |u_q|)$. It is easy to verify that for any $g \in A_0$, $\mathbf{u} \in \mathbb{Z}^q$,
$$\log |g(\mathbf{u})| \le q \log \deg g + h(g) + \deg g \log \max(1, |\mathbf{u}|). \tag{8.5.8}$$
In particular,
$$h(F_{\mathbf{u}}) \le q \log d_0 + h_0 + d_0 \log \max(1, |\mathbf{u}|). \tag{8.5.9}$$
Now an application of Corollary 1.9.6 gives
$$\sum_{j=1}^{D} h(y_j(\mathbf{u})) \le D + 1 + q \log d_0 + h_0 + d_0 \log \max(1, |\mathbf{u}|). \tag{8.5.10}$$
Define the algebraic number fields $K_{\mathbf{u},j} := \mathbb{Q}(y_j(\mathbf{u}))$ (j = 1, . . . , D). We derive an upper bound for the discriminant $D_{K_{\mathbf{u},j}}$ of $K_{\mathbf{u},j}$.
Lemma 8.5.4 Let $\mathbf{u} \in \mathbb{Z}^q$ with $\mathcal{H}(\mathbf{u}) \ne 0$. Then for j = 1, . . . , D we have $[K_{\mathbf{u},j} : \mathbb{Q}] \le D$ and
$$|D_{K_{\mathbf{u},j}}| \le D^{2D-1} \big(d_0^{\,q} \cdot e^{h_0} \max(1, |\mathbf{u}|)^{d_0}\big)^{2D-2}.$$
Proof. Let j ∈ {1, . . . , D}. The estimate for the degree is obvious. By Lemma 1.5.1 we have
$$|D_{K_{\mathbf{u},j}}| \le D^{2D-1} H(F_{\mathbf{u}})^{2D-2},$$
where $H(F_{\mathbf{u}})$ denotes the maximum of the absolute values of the coefficients of $F_{\mathbf{u}}$. Now our lemma follows by combining this with (8.5.9).
We finish with two lemmas, which relate h(α) to the heights of αj (u) for
α ∈ B, u ∈ Zq .
Lemma 8.5.5 Let u ∈ Zq with H(u) = 0. Let α ∈ B. Then for j = 1, . . . , D,
h(αj (u)) ≤ D 2 + q(D log d0 + log deg α) + Dh0 + h(α)
+ (Dd0 + deg α) log max(1, |u|).
Proof. Let P0 , . . . , PD−1 , Q be as in (8.5.6) and write αj (u) as in (8.5.7). Let
L = Q(yj (u)). Then for v ∈ ML , we have
|αj (u)|v ≤ D s(v) Av max(1, |yj (u)|)D−1
,
v
where s(v) = 1 if v is real, s(v) = 2 if v is complex, s(v) = 0 if v is finite, and
Av = max(1, |P0 (u)/Q(u)|v , . . . , |PD−1 (u)/Q(u)|v ).
Hence
\[ h(\alpha_j(u)) \le \log D + \frac{1}{[L:\mathbb{Q}]} \sum_{v \in M_L} \log A_v + (D - 1) h(y_j(u)). \tag{8.5.11} \]
From (1.9.8), (1.9.4) and (8.5.8) we infer
\[
\begin{aligned}
\frac{1}{[L:\mathbb{Q}]} \sum_{v \in M_L} \log A_v
 &= h(P_0(u)/Q(u), \ldots, P_{D-1}(u)/Q(u)) \\
 &= h^{\mathrm{hom}}(Q(u), P_0(u), \ldots, P_{D-1}(u)) \\
 &\le \log \max(|Q(u)|, |P_0(u)|, \ldots, |P_{D-1}(u)|) \\
 &\le q \log \deg \alpha + h(\alpha) + \deg \alpha \cdot \log \max(1, |u|).
\end{aligned}
\]
By combining (8.5.11) with this inequality and with (8.5.10), our lemma easily
follows.
Lemma 8.5.6 Let $\alpha \in B$, $\alpha \neq 0$, and let $N$ be an integer with
\[ N \ge \max(\deg \alpha,\ 2D d_0 + 2(q + 1)(d_1 + 1)). \]
Then the set
\[ \mathcal{S} := \{u \in \mathbb{Z}^q : |u| \le N,\ \mathcal{H}(u) \neq 0\} \]
is non-empty, and
\[ h(\alpha) \le 5N^4 (h_1 + 1)^2 + 2D(h_1 + 1)H, \]
where $H := \max\{h(\alpha_j(u)) : u \in \mathcal{S},\ j = 1, \ldots, D\}$.
Proof. It follows from our assumption on $N$, (8.5.5) and Lemma 8.5.3 that $\mathcal{S}$
is non-empty. We proceed with estimating $h(\alpha)$.
Let $P_0, \ldots, P_{D-1}, Q \in A_0$ be as in (8.5.6). We analyse $Q$ more closely. Let
\[ f = \pm p_1^{k_1} \cdots p_m^{k_m} g_1^{l_1} \cdots g_n^{l_n} \]
be the unique factorization of $f$ in $A_0$, where $p_1, \ldots, p_m$ are distinct prime
numbers, and $\pm g_1, \ldots, \pm g_n$ distinct irreducible elements of $A_0$ of positive
degree. Notice that
\[ m \le h(f)/\log 2 \le h_1/\log 2, \tag{8.5.12} \]
\[ \sum_{i=1}^{n} l_i h(g_i) \le q d_1 + h_1, \tag{8.5.13} \]
where the last inequality is a consequence of Corollary 1.9.5. Since $\alpha \in B$, the
polynomial $Q$ is also composed of $p_1, \ldots, p_m, g_1, \ldots, g_n$. Hence
\[ Q = a\widetilde{Q} \quad \text{with } a = \pm p_1^{k_1'} \cdots p_m^{k_m'}, \quad \widetilde{Q} = g_1^{l_1'} \cdots g_n^{l_n'} \tag{8.5.14} \]
for certain non-negative integers $k_1', \ldots, l_n'$. Clearly,
\[ l_1' + \cdots + l_n' \le \deg \widetilde{Q} \le \deg \alpha \le N, \]
and by Corollary 1.9.5 and (8.5.13),
\[ h(\widetilde{Q}) \le q \deg \widetilde{Q} + \sum_{i=1}^{n} l_i' h(g_i) \le N(q + q d_1 + h_1) \le N^2 (h_1 + 1). \tag{8.5.15} \]
In view of (8.5.8), we have for $u \in \mathcal{S}$,
\[ \log |\widetilde{Q}(u)| \le q \log d_1 + h(\widetilde{Q}) + \deg \widetilde{Q} \cdot \log N \le \tfrac{3}{2} N \log N + N^2 (h_1 + 1) \le N^2 (h_1 + 2). \]
Hence
\[ h(\widetilde{Q}(u)\alpha_j(u)) \le N^2 (h_1 + 2) + H \]
for $u \in \mathcal{S}$, $j = 1, \ldots, D$. Further, by (8.5.7) and (8.5.14) we have
\[ \widetilde{Q}(u)\alpha_j(u) = \sum_{i=0}^{D-1} (P_i(u)/a)\, y_j(u)^i. \]
Put
\[ \delta(u) := \gcd(a, P_0(u), \ldots, P_{D-1}(u)). \]
Then by applying Lemma 8.5.1 and then (8.5.9) we obtain
\[
\begin{aligned}
\log \frac{\max(|a|, |P_0(u)|, \ldots, |P_{D-1}(u)|)}{\delta(u)}
 &\le 2D^2 + (D - 1)h(F_u) + D\bigl(N^2 (h_1 + 2) + H\bigr) \\
 &\le 2D^2 + (D - 1)(q \log d_1 + h_1 + d_1 \log N) + D\bigl(N^2 (h_1 + 2) + H\bigr) \\
 &\le N^3 (h_1 + 2) + DH.
\end{aligned}
\tag{8.5.16}
\]
Our assumption that $\gcd(Q, P_0, \ldots, P_{D-1}) = 1$ implies that the greatest
common divisor of $a$ and the coefficients of $P_0, \ldots, P_{D-1}$ is 1. Let $p \in
\{p_1, \ldots, p_m\}$ be one of the prime factors of $a$. There is $j \in \{0, \ldots, D - 1\}$ such
that $|P_j|_p = 1$. Our assumption on $N$ and (8.5.5) imply that $N \ge \max(\deg \mathcal{H},
\deg P_j)$. This means that Lemma 8.5.3 is applicable with $g_1 = P_j$ and $g_2 = \mathcal{H}$.
It follows that
\[ \max\{|P_j(u)|_p : u \in \mathcal{S}\} \ge (4N)^{-qN(N+1)/2}. \]
That is, there is $u_0 \in \mathcal{S}$ with $|P_j(u_0)|_p \ge (4N)^{-qN(N+1)/2}$. Hence
\[ |\delta(u_0)|_p \ge (4N)^{-qN(N+1)/2}. \]
Together with (8.5.16), this implies
\[ \log |a|_p^{-1} \le \log |a/\delta(u_0)| + \log |\delta(u_0)|_p^{-1} \le N^3 (h_1 + 2) + DH + \tfrac{1}{2} N^3 \log 4N \le N^4 (h_1 + 3) + DH. \]
Combining this with the upper bound (8.5.12) for the number of prime factors
of $a$, we obtain
\[ \log |a| \le 2N^4 h_1 (h_1 + 3) + 2D h_1 \cdot H. \tag{8.5.17} \]
Together with (8.5.14) and (8.5.15), this implies
\[ h(Q) \le 2N^4 h_1 (h_1 + 3) + 2D h_1 \cdot H + N^2 (h_1 + 1) \le 3N^4 (h_1 + 1)^2 + 2D h_1 \cdot H. \tag{8.5.18} \]
Further, the right-hand side of (8.5.17) is also an upper bound for $\log \delta(u)$, for
$u \in \mathcal{S}$. Combining this with (8.5.16) gives
\[
\begin{aligned}
\log \max\{|P_j(u)| : u \in \mathcal{S},\ j = 0, \ldots, D - 1\}
 &\le N^3 (h_1 + 2) + DH + 3N^4 (h_1 + 1)^2 + 2D h_1 \cdot H \\
 &\le 4N^4 (h_1 + 1)^2 + 2D(h_1 + 1) \cdot H.
\end{aligned}
\]
Another application of Lemma 8.5.3 yields
\[ h(P_j) \le \tfrac{1}{2} qN(N + 1) \log 4N + 4N^4 (h_1 + 1)^2 + 2D(h_1 + 1) \cdot H \le 5N^4 (h_1 + 1)^2 + 2D(h_1 + 1) \cdot H \]
for $j = 0, \ldots, D - 1$. Together with (8.5.18) this gives the upper bound for
$h(\alpha)$ from our lemma.
8.6 Bounding the height in Proposition 8.3.7
It remains to prove the height bound in (8.3.19). As before, we use O(·) to
denote a quantity which is c times the expression between the parentheses,
where c is an effectively computable positive absolute constant which may be
different at each occurrence of the O-symbol.
We first consider the case $q > 0$. Pick $u \in \mathbb{Z}^q$ with $\mathcal{H}(u) \neq 0$, pick $j \in
\{1, \ldots, D\}$ and put $L := K_{u,j}$. Further, let the set of places $S$ consist of all
infinite places of $L$, and all finite places of $L$ lying above the rational prime
divisors of $f(u)$. Let $\mathfrak{p}_1, \ldots, \mathfrak{p}_t$ be the prime ideals in $S$ and define, in the usual
manner,
\[ s := |S|, \qquad P := \max(2, N_K(\mathfrak{p}_1), \ldots, N_K(\mathfrak{p}_t)), \qquad Q := \prod_{i=1}^{t} N_K(\mathfrak{p}_i). \]
Further, denote by $R_S$ the $S$-regulator (see (1.8.2)). Note that $y_j(u)$ is an
algebraic integer, and $f(u) \in O_S^*$. Hence $\varphi_{u,j}(B) \subseteq O_S$ and $\varphi_{u,j}(B^*) \subseteq O_S^*$.
Let $x_1, x_2$ be a solution of (8.3.17). So
\[ x_{1,j}(u) + x_{2,j}(u) = 1, \qquad x_{1,j}(u),\ x_{2,j}(u) \in O_S^*, \]
where $x_{1,j}(u), x_{2,j}(u)$ are the images of $x_1, x_2$ under $\varphi_{u,j}$. We apply Corollary 4.1.5. In a slightly less precise form, this result gives
\[ \max(h(x_{1,j}(u)), h(x_{2,j}(u))) \le \exp(O(s \log s)) \cdot \frac{P}{\log P} \cdot R_S \cdot \max(\log P, \log^* R_S). \tag{8.6.1} \]
We estimate this bound from above. By assumption, $f$ has degree at most $d_1$
and logarithmic height at most $h_1$, hence
\[ |f(u)| \le d_1^{\,q} e^{h_1} \max(1, |u|)^{d_1} =: R(u). \tag{8.6.2} \]
Since the degree of $L$ is at most $D$, the cardinality $s$ of $S$ is at most
$D(1 + \omega)$, where $\omega$ is the number of prime divisors of $f(u)$. Using the inequality
from elementary number theory, $\omega \le O(\log |f(u)| / \log \log |f(u)|)$, we obtain
\[ s \le O\!\left( \frac{D \log^* R(u)}{\log^* \log^* R(u)} \right). \tag{8.6.3} \]
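The elementary inequality for $\omega$ used above can be explored numerically. The following is a minimal sketch, not part of the proof: the helper names `omega` and `primorial` are introduced here for illustration only. It evaluates the ratio $\omega(n) \log\log n / \log n$ on products of the first $k$ primes, which are the integers with the most distinct prime factors for their size.

```python
import math


def omega(n: int) -> int:
    """Number of distinct prime divisors of |n|, by trial division."""
    n = abs(n)
    count, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            count += 1
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:  # one prime factor larger than sqrt(n) may remain
        count += 1
    return count


def primorial(k: int) -> int:
    """Product of the first k primes (hypothetical helper for this sketch)."""
    prod, found, cand = 1, 0, 2
    while found < k:
        if all(cand % d for d in range(2, math.isqrt(cand) + 1)):
            prod *= cand
            found += 1
        cand += 1
    return prod


# The ratio omega(n) * log log n / log n stays bounded on these extremal inputs.
for k in range(2, 9):
    n = primorial(k)
    ratio = omega(n) * math.log(math.log(n)) / math.log(n)
    print(k, n, omega(n), round(ratio, 3))
```

In the text, the same inequality is applied with $n = |f(u)| \le R(u)$, which is what yields (8.6.3).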
Next, we estimate $P$ and $R_S$. By (8.6.2), we have
\[ P \le Q \le |f(u)|^D \le \exp O(D \log^* R(u)). \tag{8.6.4} \]
By inequality (1.8.4) we have
\[ R_S \le |D_L|^{1/2} (\log^* |D_L|)^{D-1} \prod_{i=1}^{t} \log N_K(\mathfrak{p}_i) \le |D_L|^{1/2} (\log^* |D_L|)^{D-1} (\log Q)^s. \]
In view of Lemma 8.5.4 (using $d_0 \le d_1$) we have
\[ |D_L| \le D^{2D-1} \bigl( d_1^{\,q} e^{h_1} \max(1, |u|)^{d_1} \bigr)^{2D-2} \le \exp O(D \log^* DR(u)), \]
and this easily implies
\[ |D_L|^{1/2} (\log^* |D_L|)^{D-1} \le \exp O(D \log^* DR(u)). \]
Together with the estimates (8.6.3) and (8.6.4) for $s$ and $Q$, this leads to
\[ R_S \le \exp O(D \log^* DR(u) + s \log^* \log^* Q) \le \exp O(D \log^* DR(u)). \tag{8.6.5} \]
Now by inserting (8.6.3)–(8.6.5) into the upper bound (8.6.1) we obtain
\[ h(x_{1,j}(u)),\ h(x_{2,j}(u)) \le \exp O(D \log^* D \cdot \log^* R(u)). \]
We apply Lemma 8.5.6 with $N := 4D^2 (q + d_1 + 1)^2$. From the already
established (8.3.18) it follows that $\deg x_1, \deg x_2 \le N$. Further, since $d_1 \ge d_0$
we have $N \ge 2D d_0 + 2(d_1 + 1)(q + 1)$. So indeed, Lemma 8.5.6 is applicable
with this value of $N$. It follows that the set $\mathcal{S} := \{u \in \mathbb{Z}^q : |u| \le N,\ \mathcal{H}(u) \neq 0\}$
is not empty. Further, for $u \in \mathcal{S}$, $j = 1, \ldots, D$, we have
\[
\begin{aligned}
h(x_{1,j}(u)) &\le \exp O\bigl(D \log^* D \cdot (q \log d_1 + h_1 + d_1 \log^* N)\bigr) \\
 &\le \exp O\bigl(N^{1/2} (\log^* N)^2 + (D \log^* D) h_1\bigr),
\end{aligned}
\]
and so by Lemma 8.5.6,
\[ h(x_1) \le \exp O\bigl(N^{1/2} (\log^* N)^2 + (D \log^* D) h_1\bigr). \]
For $h(x_2)$ we obtain the same upper bound. This easily implies (8.3.19) in the
case $q > 0$.
Now assume $q = 0$. In this case, $K_0 = \mathbb{Q}$, $A_0 = \mathbb{Z}$ and $B = \mathbb{Z}[f^{-1}, y]$,
where $y$ is an algebraic integer with minimal polynomial $F = X^D +
F_1 X^{D-1} + \cdots + F_D \in \mathbb{Z}[X]$ over $\mathbb{Q}$, and $f$ is a non-zero rational integer. By assumption, $\log |f| \le h_1$, $\log |F_i| \le h_1$ for $i = 1, \ldots, D$. Denote by
$y_1, \ldots, y_D$ the conjugates of $y$, and let $L = \mathbb{Q}(y_j)$ for some $j$. By Lemma 1.5.1
we have $|D_L| \le D^{2D-1} e^{(2D-2)h_1}$. The isomorphism given by $y \mapsto y_j$ maps $K$
to $L$ and $B$ to $O_S$, where $S$ consists of the infinite places of $L$ and of the prime
ideals of $O_L$ that divide $f$. The estimates (8.6.2)–(8.6.5) remain valid if we
replace $R(u)$ by $e^{h_1}$. Hence for any solution $(x_1, x_2)$ of (8.3.17),
\[ h(x_{1,j}),\ h(x_{2,j}) \le \exp O((D \log^* D) h_1), \]
where $x_{1,j}, x_{2,j}$ are the $j$-th conjugates of $x_1, x_2$, respectively. Now an application of Lemma 8.5.1 with $g = F$, $m = D$, $\beta_j = x_{1,j}$ gives
\[ h(x_1) \le \exp O((D \log^* D) h_1). \]
Again we derive the same upper bound for $h(x_2)$, and deduce (8.3.19). This
completes the proof of Proposition 8.3.7.
8.7 Proof of Theorem 8.1.3
We start with some results on multiplicative (in)dependence. We first recall a
result that was published by Loher and Masser, but attributed by them to Yu
Kunrui. Another result of this type was obtained earlier in Loxton and van der
Poorten (1983).
Lemma 8.7.1 Let $L$ be an algebraic number field of degree $d$, and $\gamma_0, \ldots, \gamma_s$
non-zero elements of $L$ such that $\gamma_0, \ldots, \gamma_s$ are multiplicatively dependent, but
any $s$ elements among $\gamma_0, \ldots, \gamma_s$ are multiplicatively independent. Then there
are non-zero integers $k_0, \ldots, k_s$ such that
\[ \gamma_0^{k_0} \cdots \gamma_s^{k_s} = 1, \]
\[ |k_i| \le 58 (s!\, e^s / s^s) \cdot d^{s+1} (\log^* d)\, h(\gamma_0) \cdots h(\gamma_s) / h(\gamma_i) \quad \text{for } i = 0, \ldots, s. \]
Proof. See Loher and Masser (2004), Corollary 3.2.
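To make the hypothesis of Lemma 8.7.1 concrete, here is a small illustrative sketch over $L = \mathbb{Q}$. It is a brute-force search and has nothing to do with the method of Loher and Masser; the function name `relations` and the sample numbers 2, 3, 12 are assumptions chosen for illustration. The three numbers are multiplicatively dependent, any two of them are independent, and every relation found has all exponents non-zero, as the lemma predicts.

```python
from fractions import Fraction
from itertools import product


def relations(gammas, bound):
    """Non-zero exponent vectors k with |k_i| <= bound and prod gamma_i**k_i == 1."""
    found = []
    for ks in product(range(-bound, bound + 1), repeat=len(gammas)):
        if not any(ks):  # skip the trivial all-zero vector
            continue
        value = Fraction(1)
        for g, k in zip(gammas, ks):
            value *= g ** k  # exact rational arithmetic, negative exponents allowed
        if value == 1:
            found.append(ks)
    return found


gammas = [Fraction(2), Fraction(3), Fraction(12)]
rels = relations(gammas, 2)
print(rels)

# Any two of the three numbers are multiplicatively independent in this box.
pairs_independent = all(
    relations([gammas[i], gammas[j]], 4) == []
    for i in range(3) for j in range(i + 1, 3)
)
print(pairs_independent)
```

The relations found correspond to $12 = 2^2 \cdot 3$ and its inverse.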
We prove a generalization for arbitrary finitely generated integral domains.
As before, let $A = \mathbb{Z}[z_1, \ldots, z_r] \supseteq \mathbb{Z}$ be an integral domain finitely generated
over $\mathbb{Z}$, and suppose that the ideal $I$ of polynomials $f \in \mathbb{Z}[X_1, \ldots, X_r]$ with
$f(z_1, \ldots, z_r) = 0$ is generated by $f_1, \ldots, f_m$. Let $K$ be the quotient field of $A$.
Let $\gamma_0, \ldots, \gamma_s$ be non-zero elements of $K$, and for $i = 0, \ldots, s$, let $(g_{i1}, g_{i2})$
be a pair of representatives for $\gamma_i$, i.e., elements of $\mathbb{Z}[X_1, \ldots, X_r]$ such that
\[ \gamma_i = \frac{g_{i1}(z_1, \ldots, z_r)}{g_{i2}(z_1, \ldots, z_r)}. \]
Lemma 8.7.2 Assume that $f_1, \ldots, f_m$ and $g_{i1}, g_{i2}$ ($i = 0, \ldots, s$) have degrees
at most $d$ and logarithmic heights at most $h$, where $d \ge 1$, $h \ge 1$. Further,
assume that $\gamma_0, \ldots, \gamma_s$ are multiplicatively dependent. Then there are integers
$k_0, \ldots, k_s$, not all equal to 0, such that
\[ \gamma_0^{k_0} \cdots \gamma_s^{k_s} = 1, \qquad |k_i| \le (2d)^{\exp O(r+s)} (h + 1)^s \quad \text{for } i = 0, \ldots, s. \]
Proof. We assume without loss of generality that any s numbers among
γ0 , . . . , γs are multiplicatively independent (if this is not the case, take a minimal multiplicatively dependent subset of {γ0 , . . . , γs } and proceed further
with this subset). We first assume that q > 0. We use an argument of van der
Poorten and Schlickewei (1991). We keep the notation and assumptions from
Sections 8.3–8.5. In particular, we assume that z1 , . . . , zq is a transcendence
basis of K, and rename zq+1 , . . . , zr as y1 , . . . , yt , respectively. For brevity,
we have included the case t = 0 as well in our proof. But it should be possible to prove in this case a sharper result by means of a more elementary
method. In the case $t > 0$, $y$ and $F = X^D + F_1 X^{D-1} + \cdots + F_D$ will be as in
Corollary 8.3.3. In the case $t = 0$ we take $m = 1$, $f_1 = 0$, $d = h = 1$, $y = 1$,
$F = X - 1$, $D = 1$. We construct a specialization such that among the images
of $\gamma_0, \ldots, \gamma_s$ no $s$ elements are multiplicatively dependent, and then apply
Lemma 8.7.1.
Let $V \ge 2d$ be a positive integer. Later we shall make our choice of $V$ more
precise. Let
\[ \mathcal{V} := \left\{ v = (v_0, \ldots, v_s) \in \mathbb{Z}^{s+1} \setminus \{0\} : |v_i| \le V \text{ for } i = 0, \ldots, s,\ v_i = 0 \text{ for some } i \right\}. \tag{8.7.1} \]
Then
\[ \gamma_v := \prod_{i=0}^{s} \gamma_i^{v_i} - 1 \qquad (v \in \mathcal{V}) \]
are non-zero elements of $K$, since each proper subset of $\{\gamma_0, \ldots, \gamma_s\}$ is multiplicatively independent. It is not difficult to show that for $v \in \mathcal{V}$, $\gamma_v$ has a pair
of representatives $(g_{1,v}, g_{2,v})$ such that
\[ \deg g_{1,v},\ \deg g_{2,v} \le s d V. \]
In the case $t > 0$, there exists by Lemma 8.3.5 a non-zero $f \in A_0$ such that
\[ A \subseteq B := A_0[y, f^{-1}], \qquad \gamma_v \in B^* \quad \text{for } v \in \mathcal{V} \]
and
\[ \deg f \le V^{s+1} (2sdV)^{\exp O(r)} \le V^{\exp O(r+s)}. \]
In the case $t = 0$ this holds true as well, with $y = 1$ and $f = \prod_{v \in \mathcal{V}} (g_{1,v} \cdot g_{2,v})$.
We apply the theory on specializations explained in Section 8.5 with this $f$. We
put $\mathcal{H} := \Delta_F F_D f$, where $\Delta_F$ is the discriminant of $F$. Using Corollary 8.3.3
and inserting the bound $D \le d^t$ from Lemma 8.3.1 we get for $t > 0$,
\[ d_0 \le (2d)^{\exp O(r)}, \qquad h_0 \le (2d)^{\exp O(r)} (h + 1), \tag{8.7.2} \]
where
\[ d_0 := \max(\deg f_1, \ldots, \deg f_m, \deg F_1, \ldots, \deg F_D), \]
\[ h_0 := \max(h(f_1), \ldots, h(f_m), h(F_1), \ldots, h(F_D)). \]
With the provision $\deg 0 = h(0) = -\infty$, the inequalities (8.7.2) hold true also
if $t = 0$. Combining this with Lemma 8.3.4, we obtain
\[ \deg \mathcal{H} \le (2D - 1) d_0 + \deg f \le V^{\exp O(r+s)}. \]
By Lemma 8.5.2 there exists $u \in \mathbb{Z}^q$ with
\[ \mathcal{H}(u) \neq 0, \qquad |u| \le V^{\exp O(r+s)}. \tag{8.7.3} \]
We proceed further with this $u$.
As we have seen before, $\gamma_v \in B^*$ for $v \in \mathcal{V}$. By our choice of $u$, there are
$D$ distinct specialization maps $\varphi_{u,j}$ ($j = 1, \ldots, D$) from $B$ to $\overline{\mathbb{Q}}$. We fix one
of these specializations, say $\varphi_u$. Given $\alpha \in B$, we write $\alpha(u)$ for $\varphi_u(\alpha)$. As
the elements $\gamma_v$ are all units in $B$, their images under $\varphi_u$ are non-zero. So we
have
\[ \prod_{i=0}^{s} \gamma_i(u)^{v_i} \neq 1 \quad \text{for } v \in \mathcal{V}, \tag{8.7.4} \]
where $\mathcal{V}$ is defined by (8.7.1).
We use Lemma 8.5.5 to estimate the heights $h(\gamma_i(u))$ for $i = 0, \ldots, s$.
Recall that by Lemma 8.3.4 we have
\[ \deg \gamma_i \le (2d)^{\exp O(r)}, \qquad h(\gamma_i) \le (2d)^{\exp O(r)} (h + 1) \]
for $i = 0, \ldots, s$. By inserting these bounds, together with the bound $D \le d^t$
from Lemma 8.3.1, those for $d_0, h_0$ from (8.7.2) and that for $u$ from (8.7.3) into
the bound from Lemma 8.5.5, we obtain for $i = 0, \ldots, s$,
\[
\begin{aligned}
h(\gamma_i(u)) &\le (2d)^{\exp O(r)} (1 + h + \log \max(1, |u|)) \\
 &\le (2d)^{\exp O(r+s)} (1 + h + \log V).
\end{aligned}
\tag{8.7.5}
\]
Assume that among $\gamma_0(u), \ldots, \gamma_s(u)$ there are $s$ numbers that are multiplicatively dependent. By Lemma 8.7.1 there are integers $k_0, \ldots, k_s$, at least
one of which is non-zero and at least one of which is 0, such that
\[ \prod_{i=0}^{s} \gamma_i(u)^{k_i} = 1, \qquad |k_i| \le (2d)^{\exp O(r+s)} (1 + h + \log V)^{s-1} \quad \text{for } i = 0, \ldots, s. \]
Now for
\[ V = (2d)^{\exp O(r+s)} (h + 1)^{s-1} \tag{8.7.6} \]
(with a sufficiently large constant in the O-symbol), the upper bound for the
numbers $|k_i|$ is smaller than $V$. But this would imply that $\prod_{i=0}^{s} \gamma_i(u)^{v_i} = 1$
for some $v \in \mathcal{V}$, contrary to (8.7.4). Thus we conclude that with the choice
(8.7.6) for $V$, there exists $u \in \mathbb{Z}^q$ with (8.7.3), such that any $s$ numbers among
$\gamma_0(u), \ldots, \gamma_s(u)$ are multiplicatively independent. The numbers $\gamma_0(u), \ldots, \gamma_s(u)$
are multiplicatively dependent, since they are the images under $\varphi_u$ of $\gamma_0, \ldots, \gamma_s$,
which are multiplicatively dependent. Substituting (8.7.6) into (8.7.5) we obtain
\[ h(\gamma_i(u)) \le (2d)^{\exp O(r+s)} (h + 1) \quad \text{for } i = 0, \ldots, s. \tag{8.7.7} \]
Now Lemma 8.7.1 implies that there are non-zero integers $k_0, \ldots, k_s$ such that
\[ \prod_{i=0}^{s} \gamma_i(u)^{k_i} = 1, \tag{8.7.8} \]
\[ |k_i| \le (2d)^{\exp O(r+s)} (h + 1)^s \quad \text{for } i = 0, \ldots, s. \tag{8.7.9} \]
Our assumption on $\gamma_0, \ldots, \gamma_s$ implies that there are non-zero integers
$l_0, \ldots, l_s$ such that $\prod_{i=0}^{s} \gamma_i^{l_i} = 1$. Hence $\prod_{i=0}^{s} \gamma_i(u)^{l_i} = 1$. Together with (8.7.8)
this implies
\[ \prod_{i=1}^{s} \gamma_i(u)^{l_0 k_i - l_i k_0} = 1. \]
But $\gamma_1(u), \ldots, \gamma_s(u)$ are multiplicatively independent, hence $l_i k_0 = k_i l_0$ for
$i = 1, \ldots, s$. Therefore,
\[ \bigl( \gamma_0^{k_0} \cdots \gamma_s^{k_s} \bigr)^{l_0} = \bigl( \gamma_0^{l_0} \cdots \gamma_s^{l_s} \bigr)^{k_0} = 1, \]
implying that $\prod_{i=0}^{s} \gamma_i^{k_i} = \rho$ for some root of unity $\rho$. But $\varphi_u(\rho) = 1$ and it is
conjugate to $\rho$. Hence $\rho = 1$. So in fact we have $\prod_{i=0}^{s} \gamma_i^{k_i} = 1$ with non-zero
integers $k_i$ satisfying (8.7.9). This proves our lemma, but under the assumption
$q > 0$. If $q = 0$ then a much simpler argument, without specializations, gives
$h(\gamma_i) \le (2d)^{\exp O(r+s)} (h + 1)$ for $i = 0, \ldots, s$ instead of (8.7.7). Then the proof
is finished in the same way as in the case $q > 0$.
Corollary 8.7.3 Assume that $f_1, \ldots, f_m$ and $g_{i1}, g_{i2}$ ($i = 0, \ldots, s$) have
degrees at most $d$ and logarithmic heights at most $h$, where $d \ge 1$, $h \ge 1$.
Further, assume that $\gamma_1, \ldots, \gamma_s$ are multiplicatively independent and
\[ \gamma_0 = \gamma_1^{k_1} \cdots \gamma_s^{k_s} \]
for certain integers $k_1, \ldots, k_s$. Then
\[ |k_i| \le (2d)^{\exp O(r+s)} (h + 1)^s \quad \text{for } i = 1, \ldots, s. \]
Proof. By Lemma 8.7.2, and by the multiplicative independence of $\gamma_1, \ldots, \gamma_s$,
there are integers $l_0, \ldots, l_s$ such that
\[ \prod_{i=0}^{s} \gamma_i^{l_i} = 1, \qquad l_0 \neq 0, \qquad |l_i| \le (2d)^{\exp O(r+s)} (h + 1)^s \quad \text{for } i = 0, \ldots, s. \]
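Now, clearly, substituting $\gamma_0 = \gamma_1^{k_1} \cdots \gamma_s^{k_s}$ into this relation, we have also
\[ \prod_{i=1}^{s} \gamma_i^{l_0 k_i + l_i} = 1, \]
hence $l_0 k_i + l_i = 0$ for $i = 1, \ldots, s$. It follows that
\[ |k_i| = |l_i / l_0| \le (2d)^{\exp O(r+s)} (h + 1)^s \quad \text{for } i = 1, \ldots, s. \]
This implies our corollary.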
Proof of Theorem 8.1.3. We keep the notation and assumptions from the statement of Theorem 8.1.3. For $i = 1, \ldots, s$, $j = 1, 2$, let $\alpha_{ij} := g_{ij}(z_1, \ldots, z_r)$.
Then $\alpha_{i1}, \alpha_{i2} \in A$ and $\gamma_i = \alpha_{i1}/\alpha_{i2}$ for $i = 1, \ldots, s$. Further, let
\[ g := \prod_{i=1}^{s} (g_{i1} g_{i2}), \qquad \gamma := \prod_{i=1}^{s} (\alpha_{i1} \alpha_{i2}) \]
and define the ring
\[ \widetilde{A} := A[\gamma^{-1}]. \]
Then
\[ \widetilde{A} \cong \mathbb{Z}[X_1, \ldots, X_r, X_{r+1}] / \widetilde{I} \quad \text{with } \widetilde{I} = (f_1, \ldots, f_m, g X_{r+1} - 1). \]
Clearly, $\gamma \in \widetilde{A}^*$, therefore also $\alpha_{i1}, \alpha_{i2} \in \widetilde{A}^*$, and hence $\gamma_i \in \widetilde{A}^*$ for $i =
1, \ldots, s$. Further, $g$ has total degree at most $O(sd)$ and logarithmic height at
most $O(sh)$. As a consequence, $\widetilde{I}$ is generated by polynomials of total degrees
at most $O(sd)$ and logarithmic heights at most $O(sh)$.
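Let $(v_1, \ldots, v_s, w_1, \ldots, w_s)$ be a solution of (8.1.3), and put
\[ x_1 := \prod_{i=1}^{s} \gamma_i^{v_i}, \qquad x_2 := \prod_{i=1}^{s} \gamma_i^{w_i}. \]
Then
\[ a_1 x_1 + a_2 x_2 = a_3, \qquad x_1, x_2 \in \widetilde{A}^*. \]
By Theorem 8.1.1, $x_1$ has a representative $\widetilde{x}_1 \in \mathbb{Z}[X_1, \ldots, X_{r+1}]$ of degree and
logarithmic height both bounded above by
\[ \exp\bigl( (2sd)^{\exp O(r)} (h + 1) \bigr). \]
Now Corollary 8.7.3 implies
\[ |v_i| \le \exp\bigl( (2sd)^{\exp O(r)} (h + 1) \bigr) \quad \text{for } i = 1, \ldots, s. \]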
For |wi | (i = 1, . . . , s) we derive a similar upper bound. This completes the
proof of Theorem 8.1.3.
8.8 Notes
• It is well known that if $A$ is a finitely generated integral domain over $\mathbb{Z}$, then
its quotient field $K$ can be represented in the form $K = \mathbb{Q}(z_1, \ldots, z_q, y)$,
where $\{z_1, \ldots, z_q\}$ is a transcendence basis of $K$ over $\mathbb{Q}$, $y$ is an integral
element over $A_0 := \mathbb{Z}[z_1, \ldots, z_q]$, and $A$ is contained in the integral domain
$B := \mathbb{Z}[z_1, \ldots, z_q, f^{-1}, y]$, for some non-zero $f \in A_0$. As was seen in
Sections 8.4–8.6, such an overring $B$ has the advantage that it is easier
to deal with its elements. The generating sets $\{z_1, \ldots, z_q, y\}$ of the above
type proved to be useful in several other applications, among others in
transcendental number theory; see e.g. Waldschmidt (1973, 1974) where
an appropriate size is introduced for the elements of $K$ with respect to a
generating set $\{z_1, \ldots, z_q, y\}$.
• Following a method of Győry (1983, 1984), analogues of some results of
this chapter can be established over integral domains finitely generated over
a field of characteristic zero instead of Z. However, in this case finiteness
results cannot be obtained; upper bounds can be derived only for the degrees
of the solutions.
• Theorems 8.1.1–8.1.3 as well as the methods of their proofs have several
applications. For instance, Theorems 8.1.1–8.1.3 are applied in our book on
discriminant equations. Further, their methods of proof are used in several
papers to obtain general effective finiteness results for various classes of
Diophantine equations over finitely generated domains over Z, namely for
Thue equations and superelliptic equations in Bérczes, Evertse and Győry
(2014), for polynomial equations f (x, y) = 0 in solutions x, y from a finitely
generated multiplicative group, and even from the division group of the latter,
in Bérczes (2015a, 2015b), and for Catalan’s equation in Koymans (2015).
9 Decomposable form equations
Let $F \in \mathbb{Z}[X, Y]$ be a binary form, i.e., a homogeneous polynomial, of degree
$n \ge 3$, that is irreducible over $\mathbb{Q}$, and $\delta$ a non-zero integer. Thue (1909) proved
that the equation
\[ F(x, y) = \delta \quad \text{in } x, y \in \mathbb{Z} \]
has only finitely many solutions. This was extended by Mahler (1933a) as
follows. Let $p_1, \ldots, p_t$ be distinct prime numbers. Then the equation
\[ F(x, y) = \pm \delta p_1^{z_1} \cdots p_t^{z_t} \quad \text{in } x, y, z_1, \ldots, z_t \in \mathbb{Z} \text{ with } \gcd(x, y, p_1 \cdots p_t) = 1 \]
has only finitely many solutions. Mahler’s result can be reformulated as follows.
In accordance with terminology introduced before, let $S = \{\infty, p_1, \ldots, p_t\}$,
where $\infty$ is the infinite place of $\mathbb{Q}$, and let $\mathbb{Z}_S = \mathbb{Z}[(p_1 \cdots p_t)^{-1}]$ be the ring of
$S$-integers. Then the set of solutions of the equation
\[ F(x, y) \in \delta \mathbb{Z}_S^* \quad \text{in } x, y \in \mathbb{Z}_S \]
is a union of only finitely many $\mathbb{Z}_S^*$-cosets, i.e., sets of the form $\{\varepsilon(x_0, y_0) : \varepsilon \in
\mathbb{Z}_S^*\}$, where $(x_0, y_0)$ is a solution of the equation.
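As a concrete illustration of Thue's theorem, the following sketch enumerates the integer solutions of one Thue equation inside a finite box. The form $X^3 - 2Y^3$, the right-hand side $\delta = 1$, the box size and the function names are all choices made here for illustration; a finite search of this kind can only list solutions, it cannot by itself prove that none lie outside the box.

```python
def thue_solutions(form, delta, box):
    """Integer pairs (x, y) with |x|, |y| <= box and form(x, y) == delta."""
    return sorted(
        (x, y)
        for x in range(-box, box + 1)
        for y in range(-box, box + 1)
        if form(x, y) == delta
    )


def cubic(x, y):
    """The irreducible binary cubic form X^3 - 2Y^3 (illustrative choice)."""
    return x ** 3 - 2 * y ** 3


small = thue_solutions(cubic, 1, 50)
print(small)
```

Enlarging `box` does not produce further pairs for this particular equation, in line with the finiteness statement.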
In this chapter, we deal with generalizations, where instead of equations over
Z or ZS we consider equations over integral domains that are finitely generated
over Z, and where instead of a binary form F we take a decomposable form
in an arbitrary number of variables, that is, a homogeneous polynomial that
factors into linear forms over an extension of its field of definition.
More precisely, let K be a finitely generated (but not necessarily algebraic)
extension field of Q, and F ∈ K[X1 , . . . , Xm ] a decomposable form in m ≥ 2
variables, which factors into linear forms over a finite extension of K. Further,
let δ ∈ K ∗ and let A be a subring of K that is finitely generated over Z. We
Downloaded from https:/www.cambridge.org/core. Cornell University Library, on 16 Jun 2017 at 04:34:11, subject to the Cambridge Core
terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.011
consider the equations
\[ F(\mathbf{x}) = \delta \quad \text{in } \mathbf{x} = (x_1, \ldots, x_m) \in A^m \tag{9.1} \]
and
\[ F(\mathbf{x}) \in \delta A^* \quad \text{in } \mathbf{x} = (x_1, \ldots, x_m) \in A^m, \tag{9.2} \]
where $A^*$ denotes the unit group of $A$. Equations of the type (9.1) and (9.2)
are called decomposable form equations. The set of solutions of (9.2) can be
divided into $A^*$-cosets $\mathbf{x}_0 A^* = \{\varepsilon \mathbf{x}_0 : \varepsilon \in A^*\}$ where $\mathbf{x}_0$ is a solution of (9.2).
By Roquette’s Theorem (Roquette (1957), p.3), the unit group A∗ is finitely
generated. Hence it is easy to see that (9.2) can be reduced to finitely many
equations of the form (9.1).
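The reduction just mentioned can be sketched as follows. This is one standard way to carry it out, written with notation introduced here (it is not taken from the text): $n := \deg F$, and $\eta_1, \ldots, \eta_k$ denote representatives of the finitely many cosets of $(A^*)^n$ in $A^*$, which exist because $A^*$ is finitely generated.

```latex
\begin{align*}
&F(\mathbf{x}) = \delta\varepsilon,\quad \varepsilon \in A^*,\quad
  \varepsilon = \eta_i\,\zeta^{n}\ \text{ with } \zeta \in A^*,\ i \in \{1,\ldots,k\} \\
&\qquad\Longrightarrow\quad
  F(\zeta^{-1}\mathbf{x}) = \zeta^{-n}F(\mathbf{x}) = \delta\eta_i,
  \qquad \zeta^{-1}\mathbf{x} \in A^m .
\end{align*}
```

So every $A^*$-coset of solutions of (9.2) contains a solution of one of the $k$ equations $F(\mathbf{x}) = \delta\eta_i$, each of which is of the form (9.1).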
Clearly, every binary form is a decomposable form in two variables. Equations of type (9.1) and (9.2) are called Thue equations and Thue–Mahler equations, respectively, in the case that F is a binary form. Unlike in the results of
Thue and Mahler mentioned above, we do not require that the binary form is
irreducible over its field of definition. Other important special cases of decomposable form equations are norm form equations, discriminant form equations
and index form equations. As we shall see, decomposable form equations are in
a certain sense equivalent to unit equations and in particular, Thue equations are
equivalent to unit equations in two unknowns. Decomposable form equations
have many number-theoretic applications.
This chapter is basically an extensive survey, in which for some of the stated
results we give a complete proof, whereas for the proofs of others we refer to
the literature. Below, we give a brief overview.
In Section 9.1 we present a general finiteness criterion for equations (9.1)
and (9.2), and in particular for Thue equations when F is a binary form.
For convenience for applications, we establish our criterion for slightly more
general equations. This criterion gives effectively decidable necessary and
sufficient conditions in terms of the linear factors of F such that equations (9.1)
and (9.2) have only finitely many (A∗ -cosets of) solutions.
In Section 9.2 we explain how our finiteness criterion for decomposable form
equations implies the finiteness result for unit equations established in Chapter 6. In Section 9.3 we deduce our finiteness criterion for equations (9.1) and
(9.2) from the finiteness result for unit equations. This shows that the finiteness
results for unit equations and decomposable form equations are equivalent.
In Section 9.4 we give, without proof, a complete description of the set of
solutions of (9.1) and (9.2) in the case when this set is infinite. More precisely,
in this case, the set of solutions can be divided in a natural way into infinite
families, and the number of these families is finite.
In Section 9.5, we present, without proofs, explicit upper bounds for the
number of families and, in the case of finitely many solutions, for the number
of S-integral solutions for decomposable form equations over the ring of S-integers of a number field. In Section 9.6 we derive, with full proofs, effective
bounds for the heights of the S-integral solutions of Thue equations, and of
decomposable form equations in an arbitrary number of unknowns from a
restricted class, including discriminant form equations and certain norm form
equations. The proofs are based on effective results from Chapter 4 concerning
S-unit equations.
In our next book, Discriminant Equations in Diophantine Number Theory,
we work out various applications of unit equations to discriminant form equations, index form equations, decomposable form equations of discriminant type
and related problems.
There is an extensive literature on decomposable form equations. Almost
all books and survey papers listed in the Preface of this book that deal with
unit equations and their applications are also concerned with decomposable
form equations. We refer also to Borevich and Shafarevich (1967), Evertse
and Győry (1988d), Feldman and Nesterenko (1998) and Győry (1999) on the
subject. Some further references can be found in the Notes (Section 9.7).
9.1 A finiteness criterion for decomposable form equations
We present a general finiteness criterion which guarantees the finiteness of
the number of solutions of equation (9.1), and the finiteness of the number of
A∗ -cosets of solutions of equation (9.2) for every δ ∈ K ∗ and every subring A
of K which is finitely generated over Z.
Let again $K$ be a field which is finitely generated over $\mathbb{Q}$. We fix an algebraic
closure $\overline{K}$ of $K$. Let $A \subset K$ be a ring finitely generated over $\mathbb{Z}$. Further, let
$F \in K[X_1, \ldots, X_m]$ be a non-zero decomposable form in $m \ge 2$ variables and
let $\delta \in K^*$. For applications it is convenient to extend (9.1) and (9.2) mentioned
in the introduction and to consider the equations
\[ F(\mathbf{x}) = \delta \quad \text{in } \mathbf{x} \in M \text{ with } l(\mathbf{x}) \neq 0 \text{ for } l \in L, \tag{9.1.1} \]
and
\[ F(\mathbf{x}) \in \delta A^* \quad \text{in } \mathbf{x} \in M \text{ with } l(\mathbf{x}) \neq 0 \text{ for } l \in L, \tag{9.1.2} \]
where $M$ is a finitely generated $A$-module with $M \subset K^m$, and $L$ is a finite
set of non-zero linear forms from $\overline{K}[X_1, \ldots, X_m]$. In the special case where
$M = A^m$ and $L$ consists of the linear factors of $F$, equations (9.1.1) and (9.1.2)
give (9.1) and (9.2), respectively.
We give necessary and sufficient conditions, in terms of L and the linear
factors of F , such that (9.1.1) and (9.1.2) have only finitely many (A∗ -cosets
of) solutions.
We introduce some convenient notation. In what follows we let G be a finite,
normal extension of K over which F factorizes into linear factors. For a linear
form l = α1 X1 + · · · + αm Xm ∈ G[X1 , . . . , Xm ] and for σ ∈ Gal(G/K) we
define σ (l) := σ (α1 )X1 + · · · + σ (αm )Xm . For a subset L of linear forms from
G[X1 , . . . , Xm ], we define the following:
σ (L) = {σ (l) : l ∈ L} for σ ∈ Gal(G/K);
[L] is the G-vector space generated by L;
L is called Gal(G/K)-stable if σ (L) = L for each σ ∈ Gal(G/K);
L is called Gal(G/K)-proper if for each σ ∈ Gal(G/K) we have either
σ (L) = L or σ (L) ∩ L = ∅.
Given G-linear subspaces V1 , . . . , Vt of the space of linear forms from
G[X1 , . . . , Xm ], we denote by V1 + · · · + Vt the smallest G-vector space
containing them.
We have
\[ F = c\, l_1^{e_1} \cdots l_n^{e_n}, \tag{9.1.3} \]
where
\[ L_0 := \{l_1, \ldots, l_n\} \subset G[X_1, \ldots, X_m] \]
is a Gal(G/K)-stable set of pairwise non-proportional linear forms, $c \in K^*$
and $e_1, \ldots, e_n$ are positive integers. Clearly, $e_i = e_j$ if $l_i = \sigma(l_j)$ for some
$\sigma \in \mathrm{Gal}(G/K)$.
Let $L$ be a finite set of pairwise non-proportional linear forms with
\[ L \supseteq L_0, \qquad L \subset G[X_1, \ldots, X_m]. \]
The main result of this section is as follows.
Theorem 9.1.1 Let $m$, $K$, $F$, $G$, $L_0$, $L$ be as above. Then the following three
assertions are equivalent:
(i) $\mathrm{rank}_G L_0 = m$, and for each non-empty subset $L_1 \subsetneq L_0$ such that $L_1$ is
Gal(G/K)-proper, we have
\[ L \cap \Biggl( \sum_{\sigma \in \mathrm{Gal}(G/K)} [\sigma(L_1)] \cap [L_0 \setminus \sigma(L_1)] \Biggr) \neq \emptyset; \tag{9.1.4} \]
(ii) for every subring $A$ of $K$ which is finitely generated over $\mathbb{Z}$, every finitely
generated $A$-module $M \subset K^m$, and every $\delta \in K^*$, equation (9.1.1) has
only finitely many solutions;
(iii) for every $A$, $M$, $\delta$ as in (ii), equation (9.1.2) has only finitely many $A^*$-cosets of solutions.
Theorem 9.1.1 is new. We shall prove it in Section 9.3. An important feature
of this theorem is that it relates statements (i.e., (ii), (iii)) about Diophantine
equations to a statement (i.e., (i)) in linear algebra. Furthermore, assertion
(i) is effectively decidable provided K, G and the coefficients of the linear
forms in L0 and L are effectively given in some sense.
The following corollary is an immediate consequence of Theorem 9.1.1.
Corollary 9.1.2 Let $m$, $K$, $F$, $G$, $L_0$ and $L$ be as in Theorem 9.1.1. If
(i′) $\mathrm{rank}_G L_0 = m$ and $L \cap ([L_1] \cap [L_0 \setminus L_1]) \neq \emptyset$ for every proper, non-empty
subset $L_1$ of $L_0$,
then (ii), (iii) hold. Moreover, if $G = K$, then (i′), (ii), (iii) are equivalent.
Similar finiteness criteria were established in Evertse and Győry (1988c);
see also Evertse, Gaál and Győry (1989).
We deduce some further consequences.
Corollary 9.1.3 Let m, K, F , G, L0 and L be as in Theorem 9.1.1, and let
L = L0 . Assume that |L0 | ≥ 2m − 1 and that L0 is in general position, i.e., each
subset of m linear forms from L0 is linearly independent. Then equation (9.1.1)
has only finitely many solutions.
For M = Am , this was proved in Győry (1993b).
Proof of Corollary 9.1.3. We apply Corollary 9.1.2. Notice that for each
proper, non-empty subset L1 of L0 we have |L1 | ≥ m or |L0 \ L1 | ≥ m, i.e.,
rankG [L1 ] = m or rankG [L0 \ L1 ] = m. Hence [L1 ] ∩ [L0 \ L1 ] contains L1 or
L0 \ L1 . This implies (i′) with L = L0 , and, by Corollary 9.1.2, (ii) follows.
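The condition in Corollary 9.1.2 is a finite linear-algebra check, so for linear forms with rational coefficients (the case G = K = Q) it can be tested mechanically. The following Python sketch is an illustration only: the function names `rank`, `in_span` and `condition_i_prime` and the two sample sets of forms are assumptions made here and do not come from the text. A linear form is represented by its coefficient vector.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank of a list of rational vectors, by Gaussian elimination."""
    mat = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(mat[0]) if mat else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(mat)) if mat[i][col] != 0), None)
        if pivot is None:
            continue
        mat[r], mat[pivot] = mat[pivot], mat[r]
        for i in range(len(mat)):
            if i != r and mat[i][col] != 0:
                factor = mat[i][col] / mat[r][col]
                mat[i] = [a - factor * b for a, b in zip(mat[i], mat[r])]
        r += 1
    return r


def in_span(form, forms):
    """True if `form` lies in the vector space [forms] spanned by `forms`."""
    return rank(list(forms) + [form]) == rank(list(forms))


def condition_i_prime(L0, L, m):
    """Check rank L0 = m and that, for every proper non-empty subset L1 of L0,
    some form of L lies in both [L1] and [L0 minus L1]."""
    if rank(L0) != m:
        return False
    for size in range(1, len(L0)):
        for L1 in combinations(L0, size):
            rest = [l for l in L0 if l not in L1]
            if not any(in_span(l, L1) and in_span(l, rest) for l in L):
                return False
    return True


# Hypothetical sample data: five forms in general position in three variables,
# and the first four of them.
five_forms = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1), (1, 2, 3)]
four_forms = five_forms[:4]
print(condition_i_prime(five_forms, five_forms, 3))
print(condition_i_prime(four_forms, four_forms, 3))
```

The first set has 2m − 1 = 5 forms in general position, so the check succeeds exactly as in the proof of Corollary 9.1.3. For the second set, the subset {X1, X2} gives an intersection spanned by X1 + X2, which contains no form of L0, so the check fails.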
Corollary 9.1.4 Let F ∈ K[X, Y ] be a non-zero binary form. Then the following two assertions are equivalent:
(iv) F is divisible by at least three pairwise non-proportional linear forms
from K̄[X, Y ], where K̄ denotes an algebraic closure of K;
(v) for every subring A of K which is finitely generated over Z and every
δ ∈ K ∗ , the equation
$$F(x, y) = \delta \quad \text{in } x, y \in A \qquad (9.1.5)$$
has only finitely many solutions.
The implication (iv) ⇒ (v) follows from work in Thue (1909) for K = Q,
A = Z, Siegel (1921) for K an arbitrary algebraic number field and A its ring
of integers, Mahler (1933a) for K = Q and A = ZS for some finite set of places
S of Q, Parry (1950) for K an arbitrary algebraic number field and A = OS for
some finite set of places S of K, and Lang (1960) in the most general case. As
was mentioned before, equation (9.1.5) is usually called a Thue equation.
Proof of Corollary 9.1.4. Let G be the splitting field of F over K. Then G is
a finite, normal extension of K. Let L0 be a maximal Gal(G/K)-stable set of
pairwise non-proportional linear forms from G[X, Y ] that divide F . We have
to show that (i) with m = 2, L = L0 is equivalent to |L0 | ≥ 3. First assume that
|L0 | ≥ 3. Then rankG L0 = 2. Next, for each proper, non-empty subset L1 of L0
we have |L1 | ≥ 2 or |L0 \ L1 | ≥ 2, i.e., rankG [L1 ] = 2 or rankG [L0 \ L1 ] = 2
and this implies that [L1 ] ∩ [L0 \ L1 ] contains L1 or L0 \ L1 . This gives (9.1.4).
Conversely, assume that |L0 | ≤ 2. If |L0 | ≤ 1, then rankG L0 < 2 and (i) fails. If |L0 | = 2, then each proper, non-empty subset L1 of
L0 has |L1 | = 1 and so [σ (L1 )] ∩ [L0 \ σ (L1 )] = (0) for every σ ∈ Gal(G/K).
Hence (9.1.4) cannot hold.
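The dichotomy of Corollary 9.1.4 can be observed numerically for K = Q, A = Z, δ = 1. The sketch below counts integer solutions in growing boxes for X³ − 2Y³, which has three pairwise non-proportional linear factors over its splitting field, and for X² − 2Y², which has only two. The helper name `solutions_in_box` and the box sizes are choices made here for illustration; a finite search of course proves nothing about finiteness and only shows the expected behaviour.

```python
def solutions_in_box(form, delta, bound):
    """All integer pairs (x, y) with |x|, |y| <= bound and form(x, y) == delta."""
    box = range(-bound, bound + 1)
    return [(x, y) for x in box for y in box if form(x, y) == delta]


def cubic(x, y):
    # X^3 - 2Y^3: three pairwise non-proportional linear factors.
    return x**3 - 2 * y**3


def quadratic(x, y):
    # X^2 - 2Y^2 = (X - sqrt(2) Y)(X + sqrt(2) Y): only two linear factors.
    return x**2 - 2 * y**2


for bound in (10, 100):
    n_cubic = len(solutions_in_box(cubic, 1, bound))
    n_quadratic = len(solutions_in_box(quadratic, 1, bound))
    print(bound, n_cubic, n_quadratic)
```

The count for the cubic form stays at two, namely (1, 0) and (−1, −1), while the count for the quadratic form keeps growing with the box, reflecting the infinitely many solutions of the Pell equation.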
9.2 Reduction of unit equations to decomposable form equations
It can be shown that unit equations and decomposable form equations are
equivalent in the sense that every unit equation leads to a decomposable form
equation (over a suitable ring which is finitely generated over Z), and every
decomposable form equation can be reduced to finitely many unit equations (in
an appropriate finite field extension). Consequently, general finiteness results
for unit equations imply general finiteness results for decomposable form equations and vice versa. In the two unknowns case (i.e. for unit equations in two
unknowns and for Thue equations) this equivalence was (implicitly) pointed
out by Siegel (1926, 1929), while the general case was worked out by Evertse
and Győry (1988c).
More precisely, we show that Theorem 9.1.1 is equivalent to the following.
Theorem 9.2.1 Let K be a field of characteristic 0, Γ a finitely generated
multiplicative subgroup of K ∗ and let a1 , . . . , am ∈ K ∗ . Then the equation
$$a_1 x_1 + \cdots + a_m x_m = 1 \quad \text{in } x_1, \ldots, x_m \in \Gamma \qquad (9.2.1)$$
has at most finitely many non-degenerate solutions, i.e., solutions with
$$\sum_{i \in I} a_i x_i \neq 0 \quad \text{for each non-empty subset } I \text{ of } \{1, \ldots, m\}.$$
This theorem is proved in Chapter 6 in a more precise quantitative form, see
Theorem 6.1.3. For further historical comments, see Section 6.7. All known
proofs of Theorem 9.2.1 are ineffective.
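As an illustration of Theorem 9.2.1 in the simplest case m = 2, a1 = a2 = 1, take K = Q and let Γ be the group of positive rationals of the form 2^a 3^b. The Python sketch below searches for solutions of x + y = 1 with x, y ∈ Γ; the names `is_in_gamma` and `unit_equation_solutions` and the exponent bound are assumptions made here. The search only covers |a|, |b| ≤ bound for x, so by itself it cannot certify that the list is complete.

```python
from fractions import Fraction


def is_in_gamma(q):
    """True if q is a positive rational of the form 2**a * 3**b."""
    if q <= 0:
        return False
    for part in (q.numerator, q.denominator):
        for p in (2, 3):
            while part % p == 0:
                part //= p
        if part != 1:
            return False
    return True


def unit_equation_solutions(bound):
    """Solutions (x, y) of x + y = 1 in Gamma with x = 2**a * 3**b, |a|, |b| <= bound."""
    solutions = set()
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            x = Fraction(2) ** a * Fraction(3) ** b
            y = 1 - x
            if is_in_gamma(y):
                solutions.add((x, y))
    return sorted(solutions)


for x, y in unit_equation_solutions(6):
    print(x, "+", y, "= 1")
```

The list does not grow when the bound is increased beyond 3. In this particular case completeness can also be checked by hand: clearing denominators gives A + B = C with pairwise coprime integers whose only prime factors are 2 and 3, so one of them equals 1, which leaves only 1 + 1 = 2, 1 + 2 = 3, 1 + 3 = 4 and 1 + 8 = 9.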
In the next section, we prove Theorem 9.1.1, taking Theorem 9.2.1 as a
starting point. In the present section we show that Theorem 9.1.1 implies
Theorem 9.2.1.
Proof of the implication Theorem 9.1.1 ⇒ Theorem 9.2.1. Let K be a field of
characteristic 0, Γ a finitely generated subgroup of K ∗ , and a1 , . . . , am ∈ K ∗
with m ≥ 2. Define the decomposable form
$$F := X_1 \cdots X_m (a_1 X_1 + \cdots + a_m X_m).$$
Let L0 = {a1 X1 , . . . , am Xm , a1 X1 + . . . + am Xm }, and L be the set of all linear
forms of the form ai1 Xi1 + · · · + ais Xis , where {i1 , . . . , is } is a non-empty
subset of {1, . . . , m}. Then we have L ⊃ L0 . It is easy to check that these L0
and L satisfy statement (i) in Theorem 9.1.1 with G = K (and even (i′) in
Corollary 9.1.2). Let A be the subring of K generated by a1 , . . . , am and the
elements of Γ. Then A is finitely generated over Z, and Γ is a subgroup of A∗ .
It is now clear that every non-degenerate solution x = (x1 , . . . , xm ) of (9.2.1)
satisfies
$$F(x) \in A^*, \qquad x \in A^m, \qquad l(x) \neq 0 \ \text{for } l \in L. \qquad (9.2.2)$$
Theorem 9.1.1 (or in this case Corollary 9.1.2) implies that there are at most
finitely many pairwise linearly independent x with (9.2.2). This implies that
(9.2.1) has only finitely many pairwise linearly independent non-degenerate
solutions. But obviously, any two linearly dependent solutions of (9.2.1) have
to be equal. This implies Theorem 9.2.1.
9.3 Reduction of decomposable form equations to unit equations
In this section we prove the equivalence of assertions (i), (ii) and (iii) of
Theorem 9.1.1, taking Theorem 9.2.1 as starting point. This section has been
divided into three subsections: the first contains the proof of the equivalence
of (ii) and (iii), which is elementary and is independent of unit equations, the
second contains the proof of the implication (i)⇒(iii) and the last the proof
of the implication (iii)⇒(i). In both the second and third subsections we have
used Theorem 9.2.1. We keep the notation and definitions from Section 9.1.
We need a few facts on Noetherian rings and modules; for a proof of these,
we refer to Lang (1984), chapter 6. A commutative ring is called Noetherian
if all its ideals are finitely generated. Let A be a Noetherian commutative ring.
Then for any ideal I of A the residue class ring A/I is Noetherian. Further, for
any integer r ≥ 1 the polynomial ring in r variables A[X1 , . . . , Xr ] is Noetherian. Any finitely generated A-module is Noetherian, i.e., all its A-submodules
are finitely generated. Any integral domain finitely generated over Z is isomorphic to Z[X1 , . . . , Xr ]/I for some ideal I of Z[X1 , . . . , Xr ], hence it is
Noetherian.
9.3.1 Proof of the equivalence (ii) ⇐⇒ (iii) in Theorem 9.1.1
We need the following result of Roquette.
Proposition 9.3.1 Let A be an integral domain that is finitely generated over
Z. Then A∗ is a finitely generated group.
Proof. See Roquette (1957).
(ii)⇐⇒(iii). First assume that (ii) holds. Let A, M, δ be as in the statement of
(iii). Proposition 9.3.1 implies that there is a finite set S ⊂ A∗ such that every
ε ∈ A∗ can be expressed as ηζ^n with η ∈ S, ζ ∈ A∗ , where n := deg F . Now
if x ∈ M is a solution of (9.1.2), then F (x) = δε with ε ∈ A∗ . Hence there
are η ∈ S, ζ ∈ A∗ such that F (ζ^{−1} x) = δη. By (ii), each equation F (y) = δη
(η ∈ S) in y ∈ M with l(y) ≠ 0 for l ∈ L has only finitely many solutions.
This implies (iii).
Conversely, assume (iii), and take again A, M, δ as in (ii). Then the solutions
of (9.1.1) lie in finitely many A∗ -cosets. If x1 , x2 are two solutions in the same
A∗ -coset then x2 = εx1 for some ε ∈ A∗ , and ε^n = F (x2 )/F (x1 ) = 1. So each
A∗ -coset contains at most n solutions of (9.1.1). This proves (ii).
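The finite set S used in the first half of this argument can be made explicit in a simple case. Assume, purely for illustration, that A = Z[1/2], whose unit group is {±2^k : k ∈ Z}, and that n = 3. Then S = {±1, ±2, ±4} works, and the decomposition ε = ηζ^n comes from division with remainder on the exponent. The function name `split_unit` is hypothetical.

```python
from fractions import Fraction


def split_unit(sign, k, n):
    """Write the unit sign * 2**k of Z[1/2] as eta * zeta**n with eta in a finite set."""
    q, r = divmod(k, n)  # k = n*q + r with 0 <= r < n, also for negative k
    eta = sign * Fraction(2) ** r
    zeta = Fraction(2) ** q
    return eta, zeta


n = 3
# The finite set S of representatives: sign * 2**r with 0 <= r < n.
representatives = {s * Fraction(2) ** r for s in (1, -1) for r in range(n)}

for k in range(-6, 7):
    for sign in (1, -1):
        eta, zeta = split_unit(sign, k, n)
        # eta lies in S and eta * zeta**n recovers the original unit.
        assert eta in representatives
        assert eta * zeta ** n == sign * Fraction(2) ** k
```

For a solution x of (9.1.2) with F(x) = δε, replacing x by ζ^{−1}x then gives F(ζ^{−1}x) = δη with η in this six-element set.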
9.3.2 Proof of the implication (i) ⇒ (iii) in Theorem 9.1.1
We need the following consequence of Theorem 9.2.1.
Proposition 9.3.2 The solutions of (9.2.1) lie in a union of finitely many
proper linear subspaces of K m .
Proof. The degenerate solutions, i.e., those with $\sum_{i \in I} a_i x_i = 0$ for some non-empty
subset I of {1, . . . , m}, lie in finitely many proper subspaces, and so do the non-degenerate solutions, since by Theorem 9.2.1 there are only finitely many of them.
Remark It was pointed out in Evertse and Győry (1988b), and it is also implicit
in the proof of Theorem 6.1.3, that Theorem 9.2.1 and Proposition 9.3.2 are
equivalent.
(i) ⇒ (iii). We assume assertion (i) of Theorem 9.1.1 and deduce (iii). Let
A, M, F , L0 , L, δ be as in (i), (iii) and let V := KM. We proceed by
induction on dimK V . If dimK V = 1, assertion (iii) is trivially true. Assume
that dimK V =: d ≥ 2, and that the implication (i) ⇒ (iii) is true for finitely
generated A-modules in K m that generate a K-vector space of dimension
smaller than d. Without loss of generality we assume that none of the linear
forms l ∈ L vanishes identically on V .
We say that a set of linear forms {l1 , . . . , lt } ⊂ G[X1 , . . . , Xm ] is V -linearly
dependent if there are c1 , . . . , ct ∈ G, not all 0, such that c1 l1 + · · · + ct lt vanishes identically on V ; otherwise, {l1 , . . . , lt } is called V -linearly independent.
Further, {l1 , . . . , lt } is said to be minimally V -linearly dependent if the set itself
is linearly dependent on V , but each of its non-empty proper subsets is linearly
independent on V .
We first show that there is a subset of L0 of cardinality ≥ 3 that is minimally
V -linearly dependent. Assume the contrary. We divide L0 into classes such
that two linear forms belong to the same class if and only if they are V -linearly
dependent. Let {l1 , . . . , ls } be a full set of representatives for these classes.
Then by our assumption, {l1 , . . . , ls } is V -linearly independent. Let L1 be the
class of l1 . As is easily seen, L1 is Gal(G/K)-proper.
We show that all linear forms in W := [L1 ] ∩ [L0 \ L1 ] vanish identically
on V . Let l ∈ W . Since all linear forms in L1 are V -linearly dependent on l1 and
since each linear form in L0 \ L1 is V -linearly dependent on one of l2 , . . . , ls ,
there are c1 , . . . , cs ∈ G such that
$$l(x) = c_1 l_1(x) = -\sum_{i=2}^{s} c_i l_i(x) \quad \text{for } x \in V.$$
But then $\sum_{i=1}^{s} c_i l_i$ vanishes identically on V , implying that c1 = · · · = cs = 0.
So l vanishes identically on V .
In the same way it follows that for each σ ∈ Gal(G/K), the linear forms
in σ (W ) := [σ (L1 )] ∩ [L0 \ σ (L1 )] vanish identically on V , hence so do the
linear forms in $\sum_{\sigma \in \mathrm{Gal}(G/K)} \sigma(W)$. But then the latter vector space cannot
contain elements of L since we assumed that these do not vanish identically on
V . This violates assumption (i).
So L0 has a minimal V -linearly dependent subset, say {l0 , . . . , lt } with t ≥ 2.
This implies that there are a1 , . . . , at ∈ G∗ such that
$$l_0(x) = a_1 l_1(x) + \cdots + a_t l_t(x) \quad \text{for } x \in V.$$
Let the set S consist of the coefficients of l1 , . . . , ln (i.e., the linear factors of
F in L0 ), a finite set of generators for M, and c, c^{−1} , δ, δ^{−1} , where c, δ are as
in (9.1.3). Let B := A[S]. Then for any solution x ∈ M of (9.1.2) we have
$$l_1(x) \in B^*, \ \ldots, \ l_n(x) \in B^*.$$
This shows that if x ∈ M is a solution to (9.1.2), then the tuple
$$\left( \frac{l_1(x)}{l_0(x)}, \ldots, \frac{l_t(x)}{l_0(x)} \right)$$
is a solution to
$$a_1 y_1 + \cdots + a_t y_t = 1 \quad \text{in } y_1, \ldots, y_t \in B^*. \qquad (9.3.1)$$
The domain B is finitely generated over Z, so by Proposition 9.3.1, the group
B ∗ is finitely generated. By Proposition 9.3.2, there are at most finitely many
non-zero tuples (b1 , . . . , bt ) ∈ Gt such that every solution y = (y1 , . . . , yt ) of
(9.3.1) satisfies one of the relations b1 y1 + · · · + bt yt = 0. As a consequence,
every solution x ∈ M of (9.1.2) satisfies one of the relations
b1 l1 (x) + · · · + bt lt (x) = 0.
Since {l0 , . . . , lt } is minimally V -linearly dependent, each of these relations
defines a proper linear subspace of V . Hence the solutions x ∈ M of (9.1.2)
lie in a finite union of proper linear subspaces of V . By applying the induction
hypothesis to the intersection of M with any of these subspaces (which is a
finitely generated A-module since A is a Noetherian ring and M is a finitely
generated A-module), we infer that (9.1.2) has only finitely many A∗ -cosets of solutions.
9.3.3 Proof of the implication (iii) ⇒ (i) in Theorem 9.1.1
We need another consequence of Theorem 9.2.1. It is in fact a special case of the
Skolem–Mahler–Lech Theorem on the zero multiplicity of linear recurrence
sequences, see Theorem 10.11.1 below.
Proposition 9.3.3 Let a1 , . . . , am , b1 , . . . , bm ∈ K ∗ and suppose that none
of the quotients bi /bj (1 ≤ i < j ≤ m) is a root of unity. Then there are only
finitely many z ∈ Z with
$$a_1 b_1^z + \cdots + a_m b_m^z = 0. \qquad (9.3.2)$$
Proof. We proceed by induction on m. For m = 2 the assertion is clear. Let
m ≥ 3. Apply Proposition 9.3.2, with Γ the group generated by b1 , . . . , bm , to the equation obtained from (9.3.2) after dividing by $-a_m b_m^z$. By
that proposition, there are a finite number of tuples (c1 , . . . , cm−1 ) ≠ 0 such
that each solution of (9.3.2) satisfies one of the relations
$$\sum_{i=1}^{m-1} c_i (b_i/b_m)^z = 0.$$
By the induction hypothesis, each of these relations is satisfied by at most
finitely many integers z.
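Proposition 9.3.3, and the role of the root-of-unity hypothesis, can be illustrated with rational data. The sketch below lists the integers z in a finite window for which the exponential sum in (9.3.2) vanishes; the function name `exponential_zeros`, the window and the two sample sums are assumptions made here, and a finite window can only suggest, not prove, finiteness.

```python
from fractions import Fraction


def exponential_zeros(coeffs, bases, z_values):
    """Integers z in z_values with sum(a_i * b_i**z) == 0, computed exactly."""
    zeros = []
    for z in z_values:
        total = sum(Fraction(a) * Fraction(b) ** z for a, b in zip(coeffs, bases))
        if total == 0:
            zeros.append(z)
    return zeros


window = range(-20, 21)

# 2 - 3*2^z + 4^z = (2^z - 1)(2^z - 2): no quotient of bases is a root of unity.
print(exponential_zeros((2, -3, 1), (1, 2, 4), window))

# 1^z - (-1)^z: the quotient of the bases is -1, a root of unity.
print(len(exponential_zeros((1, -1), (1, -1), window)))
```

In the first sum only z = 0 and z = 1 occur, in line with the proposition. In the second sum the hypothesis fails and every even z is a zero.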
(iii) ⇒ (i). We assume that assertion (i) of Theorem 9.1.1 does not hold and
deduce that (iii) does not hold, that is, there are A, M, δ as in (iii), such that
equation (9.1.2) has infinitely many A∗ -cosets of solutions.
First assume that rankG L0 < m. Then the vector space of x ∈ G^m with
l(x) = 0 for l ∈ L0 is non-zero. By Lemma 1.1.1 and since L0 is Gal(G/K)-stable this vector space has a basis from K^m . So we can choose x1 ∈ K^m \ {0}
with l(x1 ) = 0 for l ∈ L0 . Choose x0 ∈ K^m with l(x0 ) ≠ 0 for l ∈ L. Let A be
any subring of K which is finitely generated over Z, M the A-module generated
by x0 , x1 and δ = F (x0 ). Consider the vectors x0 + kx1 (k ∈ Z). These vectors
lie in different A∗ -cosets since x0 , x1 are K-linearly independent. For all but
finitely many k we have l(x0 + kx1 ) ≠ 0 for l ∈ L, and by (9.1.3),
$$F(x_0 + k x_1) = c \prod_{i=1}^{n} l_i(x_0 + k x_1)^{e_i} = F(x_0) = \delta.$$
Hence (9.1.2) has infinitely many A∗ -cosets of solutions.
Now assume that rankG L0 = m. Since by assumption, assertion (i) of
Theorem 9.1.1 does not hold, there is a Gal(G/K)-proper subset L1 of L0
with ∅ ⊊ L1 ⊊ L0 such that
$$L \cap W = \emptyset, \quad \text{with } W := \sum_{\sigma \in \mathrm{Gal}(G/K)} [\sigma(L_1)] \cap [L_0 \setminus \sigma(L_1)].$$
Then dimG W < m. Hence the G-vector space
W ∗ := {x ∈ Gm : l(x) = 0 for all l ∈ W }
has dimension m − dimG W > 0. Since also W is Gal(G/K)-stable, we infer
from Lemma 1.1.1 that W ∗ is generated by vectors from K m . Moreover, none
of the linear forms in L vanishes identically on W ∗ and so neither do they on
W ∗ ∩ K m . Thus, there is x0 ∈ K m with
$$l(x_0) = 0 \ \text{for } l \in W, \qquad l(x_0) \neq 0 \ \text{for } l \in L. \qquad (9.3.3)$$
We make a partition {L1 , . . . , Lt } of L0 as follows. Take the distinct sets among
σ (L1 ) (σ ∈ Gal(G/K)). Since L1 is Gal(G/K)-proper, these sets are pairwise
disjoint. Let
$$L_0^* := \bigcup_{\sigma \in \mathrm{Gal}(G/K)} \sigma(L_1).$$
If L0∗ = L0 , let L1 , . . . , Lt be the distinct sets among σ (L1 ) (σ ∈ Gal(G/K)). If
L0∗ ⊊ L0 , let L1 , . . . , Lt−1 be the distinct sets among σ (L1 ) (σ ∈ Gal(G/K)),
and take Lt := L0 \ L0∗ . Then σ (Lt ) = Lt for all σ ∈ Gal(G/K).
Let
$$U := \Bigl\{ u = (u_l : l \in L_0) \in G^n : \sum_{l \in L_0} u_l\, l = 0 \Bigr\}.$$
We show that
$$\sum_{l \in L_i} u_l\, l(x) = 0 \quad \text{for } x \in W^*,\ u \in U,\ i = 1, \ldots, t. \qquad (9.3.4)$$
Let u ∈ U , x ∈ W ∗ . For σ ∈ Gal(G/K) we have
$$\sum_{l \in \sigma(L_1)} u_l\, l = -\sum_{l \in L_0 \setminus \sigma(L_1)} u_l\, l \ \in\ [\sigma(L_1)] \cap [L_0 \setminus \sigma(L_1)] \subseteq W.$$
If L0∗ = L0 then (9.3.4) follows at once. If L0∗ ⊊ L0 , then (9.3.4) holds for
i = 1, . . . , t − 1. But since $\sum_{l \in L_0} u_l\, l(x) = 0$, it must hold for i = t as well.
We now construct numbers θl ∈ G∗ (l ∈ L0 ) with the following properties:
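$$\theta_l = \theta_i \ \text{for } l \in L_i,\ i = 1, \ldots, t, \ \text{with } \theta_i \text{ independent of } l; \qquad (9.3.5)$$
$$\theta_{\sigma(l)} = \sigma(\theta_l) \ \text{for } l \in L_0,\ \sigma \in \mathrm{Gal}(G/K); \qquad (9.3.6)$$
$$\theta_i/\theta_j \ \text{is not a root of unity for } 1 \le i < j \le t. \qquad (9.3.7)$$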
The construction is as follows. Define the field M by
Gal(G/M) := {σ ∈ Gal(G/K) : σ (L1 ) = L1 }.
We first show that there is θ1 such that M = K(θ1 ) and no quotient of any two
distinct conjugates of θ1 over K is a root of unity. We start by taking θ with
M = K(θ ). Let $\theta^{(1)}, \ldots, \theta^{(d)}$ be the conjugates of θ over K in G. Since the field
G is finitely generated, its group of roots of unity is finite, say of order D. Now
we may take θ1 := θ + a, where a ∈ Z is such that the numbers $(\theta^{(i)} + a)^D$
(i = 1, . . . , d) are distinct.
Let θ1 ∈ M be as above and put θi := σi (θ1 ), where σi ∈ Gal(G/K) is such
that σi (L1 ) = Li . This does not depend on the choice of σi . In the case L0∗ ⊊ L0 ,
choose θt ∈ K ∗ such that θt /θi is not a root of unity for i = 1, . . . , t − 1. Finally,
put θl := θi for l ∈ Li , i = 1, . . . , t. If σ ∈ Gal(G/K) is such that σ (Li ) = Lj ,
with 1 ≤ i < j ≤ t if L0∗ = L0 , and with 1 ≤ i < j ≤ t − 1 if L0∗ ⊊ L0 , then
$\sigma_j^{-1} \sigma \sigma_i$ ∈ Gal(G/M), hence θj = σ (θi ). Further, if L0∗ ⊊ L0 , then σ (Lt ) = Lt
and σ (θt ) = θt for σ ∈ Gal(G/K). Thus, (9.3.5)–(9.3.7) follow.
We now construct A, M, δ such that (9.1.2) has infinitely many A∗ -cosets
of solutions. Pick x0 ∈ K m with (9.3.3). We claim that for every k ∈ Z≥0 there
is a unique xk ∈ K m such that
$$l(x_k) = l(x_0)\,\theta_l^k \quad \text{for } l \in L_0, \qquad (9.3.8)$$
and that, moreover, these vectors xk are pairwise non-proportional. Indeed, by
(9.3.3), (9.3.4) and (9.3.5) we have for any u ∈ U ,
$$\sum_{l \in L_0} u_l\, l(x_0)\,\theta_l^k = \sum_{i=1}^{t} \Bigl( \sum_{l \in L_i} u_l\, l(x_0) \Bigr) \theta_i^k = 0.$$
Hence there is xk ∈ G^m with (9.3.8). Further, since rankG L0 = m, it is uniquely
determined. By (9.3.6) we have
$$\sigma(l)(\sigma(x_k)) = \sigma(l)(x_0)\,\theta_{\sigma(l)}^k \quad \text{for } l \in L_0,\ \sigma \in \mathrm{Gal}(G/K),$$
and then σ (xk ) satisfies (9.3.8) since L0 is Gal(G/K)-stable. Now since (9.3.8)
has only one solution, we must have σ (xk ) = xk for σ ∈ Gal(G/K), hence xk ∈
K^m . Finally, by (9.3.7), the tuples ($\theta_l^k$ : l ∈ L0 ) (k ∈ Z≥0 ) are pairwise non-proportional. Hence the vectors xk (k ∈ Z≥0 ) are pairwise non-proportional.
Notice that by (9.1.3), (9.3.8) and (9.3.6),
$$F(x_k) = c \prod_{l \in L_0} l(x_k)^{e_l} = F(x_0)\, u^k, \quad \text{where } u := \prod_{l \in L_0} \theta_l^{e_l} \in K^*.$$
We show that l(xk ) ≠ 0 for l ∈ L and for all but finitely many k. Let l∗ ∈ L.
Then since rankG L0 = m, we have
$$l^* = \sum_{l \in L_0} \eta_l\, l \quad \text{with } \eta_l \in G \text{ for } l \in L_0.$$
So by (9.3.5),
$$l^*(x_k) = \sum_{i=1}^{t} \Bigl( \sum_{l \in L_i} \eta_l\, l(x_0) \Bigr) \theta_i^k.$$
By l∗ (x0 ) ≠ 0 and Proposition 9.3.3, we have l∗ (xk ) = 0 for at most finitely
many k. Putting all this together, we infer for all but finitely many k ∈ Z≥0 ,
$$F(x_k) = F(x_0)\, u^k, \qquad l(x_k) \neq 0 \ \text{for } l \in L. \qquad (9.3.9)$$
We finish by constructing A, M, δ. Let δ := F (x0 ). Then δ ≠ 0 since
L0 ⊆ L. Further, let $f(X) = X^s + c_{s-1} X^{s-1} + \cdots + c_0$ ∈ K[X] be a monic
polynomial such that θl (l ∈ L0 ) are all zeros of f . Let
$$A := \mathbb{Z}[u, u^{-1}, c_0, \ldots, c_{s-1}].$$
Then u ∈ A∗ . Clearly, for k ∈ Z, k ≥ s, l ∈ L0 , we have
$$\theta_l^k = -c_{s-1}\theta_l^{k-1} - \cdots - c_0\,\theta_l^{k-s},$$
and so, by the fact that xk ∈ K^m is the only solution of (9.3.8),
$$x_k = -c_{s-1} x_{k-1} - \cdots - c_0\, x_{k-s} \quad \text{for } k \in \mathbb{Z},\ k \ge s.$$
Now let M be the A-module generated by x0 , . . . , xs−1 . Then xk ∈ M for
k ∈ Z≥0 . Invoking (9.3.9), we infer that for all but finitely many k the vector xk
is a solution to (9.1.2). Moreover, the vectors xk are pairwise non-proportional.
Hence (9.1.2) has infinitely many distinct A∗ -cosets of solutions, i.e., assertion
(iii) of Theorem 9.1.1 does not hold.
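The construction can be followed in the smallest case. Assume, for illustration only, K = Q, G = Q(√2), F = X² − 2Y² with L = L0 = {X + √2 Y, X − √2 Y}, L1 = {X + √2 Y}, θ1 = 1 + √2 and x0 = (1, 0); these choices are made here and are one possible instance of the construction, not taken from the text. Then u = (1 + √2)(1 − √2) = −1, one may take f(X) = X² − 2X − 1, so A = Z, and the recurrence reads x_k = 2x_{k−1} + x_{k−2}. The function name `construction_vectors` is hypothetical.

```python
def construction_vectors(count):
    """First `count` vectors x_k = (a_k, b_k) with a_k + b_k*sqrt(2) = (1 + sqrt(2))**k,
    generated by the recurrence x_k = 2*x_{k-1} + x_{k-2}."""
    vectors = [(1, 0), (1, 1)]  # x_0 and x_1
    while len(vectors) < count:
        (a, b), (c, d) = vectors[-2], vectors[-1]
        vectors.append((2 * c + a, 2 * d + b))
    return vectors[:count]


def F(x, y):
    """The decomposable form X^2 - 2Y^2."""
    return x * x - 2 * y * y


vectors = construction_vectors(10)
values = [F(x, y) for x, y in vectors]
print(vectors[:5])
print(values)
```

Every value F(x_k) equals (−1)^k = F(x0)u^k, a unit of Z, and the vectors are pairwise non-proportional, so the equation F(x) ∈ {1, −1} has infinitely many cosets of solutions in M = Z², as the failure of condition (i) predicts.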
9.4 Finiteness of the number of families of solutions
In this section we describe the structure of the set of solutions of the decomposable form equations (9.1) and (9.2).
Let K be a finitely generated extension field of Q, L a finite extension of K
of degree n ≥ 2 and G a finite, normal extension of K containing L. There are
n distinct K-isomorphisms of L in G, σ1 , . . . , σn say. Let α1 , . . . , αm (m ≥ 2)
be elements of L and consider the linear form l = α1 X1 + · · · + αm Xm . Define
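the conjugates of l, $l^{(i)} = \sigma_i(l) = \sum_{j=1}^{m} \sigma_i(\alpha_j) X_j$ (i = 1, . . . , n). Then
$$N_{L/K}(l) := \prod_{i=1}^{n} l^{(i)} = \prod_{i=1}^{n} \bigl( \sigma_i(\alpha_1) X_1 + \cdots + \sigma_i(\alpha_m) X_m \bigr)$$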
is a decomposable form of degree n in K[X1 , . . . , Xm ], called a norm form,
and the equation
$$N_{L/K}(l(x)) = \delta \quad \text{in } x = (x_1, \ldots, x_m) \in A^m \qquad (9.4.1)$$
is called a norm form equation over K, where δ ∈ K ∗ and A is a subring of K
which is finitely generated over Z.
In what follows, it will be more convenient to consider equation (9.4.1) in
the form
$$N_{L/K}(\mu) = \delta \quad \text{in } \mu \in M, \qquad (9.4.2)$$
where M := {μ = l(x) : x ∈ A^m }. Notice that M is a finitely generated A-submodule of L. If we assume that α1 , . . . , αm are linearly independent over
K, there is a one-to-one correspondence between the solutions of (9.4.1) and
(9.4.2).
Using the Subspace Theorem and its p-adic generalization, Schmidt (1971,
1972) and Schlickewei (1977c) established very important finiteness theorems on these equations over Q. The results of Schmidt and Schlickewei were
later extended in Laurent (1984) to the case where the ground field K is a
finitely generated extension of Q. These will be presented later as special
cases of more general results concerning decomposable form equations stated
below.
Equation (9.4.1) is a special decomposable form equation. Let now F ∈
K[X1 , . . . , Xm ] be an arbitrary decomposable form of degree n ≥ 2 and let G
be a finite, normal extension of K over which F factorizes into linear factors.
Consider the decomposable form equation
$$F(x) = \delta \quad \text{in } x = (x_1, \ldots, x_m) \in A^m, \qquad (9.1)$$
where A is a subring of K which is finitely generated over Z. We can reformulate
this in a shape similar to (9.4.2) as follows. First observe that F can be expressed
as
$$F = c \prod_{j=1}^{q} N_{L_j/K}(l_j), \qquad (9.4.3)$$
where L1 , . . . , Lq are finite extensions of K, lj is a linear form from
Lj [X1 , . . . , Xm ] for j = 1, . . . , q, and c ∈ K ∗ . Indeed, we may write F as
$$F = c\, l_1 \cdots l_n, \qquad (9.4.4)$$
where c ∈ K ∗ and $l_j = X_{n_j} + \alpha_{n_j+1,j} X_{n_j+1} + \cdots + \alpha_{mj} X_m$ with $\alpha_{ij}$ ∈ G for
j = 1, . . . , n, i ∈ {$n_j$ + 1, . . . , m}. For each σ ∈ Gal(G/K) we have σ (F ) =
F ∈ K[X1 , . . . , Xm ]. Since G[X1 , . . . , Xm ] is a unique factorization domain,
(9.4.4) implies that there is a permutation (σ (1), . . . , σ (n)) of (1, . . . , n) such
that $\sigma(l_j) = l_{\sigma(j)}$ for j = 1, . . . , n. The index set {1, . . . , n} can be partitioned
into subsets C1 , . . . , Cq such that i, j belong to the same subset if and only
if σ (i) = j for some σ ∈ Gal(G/K). Assume without loss of generality that
j ∈ Cj for j = 1, . . . , q, and let $L_j = K(\alpha_{n_j+1,j}, \ldots, \alpha_{mj})$ for j = 1, . . . , q.
Then
$$\prod_{i \in C_j} l_i = N_{L_j/K}(l_j) \quad \text{for } j = 1, \ldots, q,$$
and (9.4.3) follows.
Define the K-algebra
$$\Omega := L_1 \times \cdots \times L_q$$
which is endowed with coordinatewise addition and multiplication. Recall that
any K-algebra isomorphic to a direct product of finite field extensions of K is
called a finite étale K-algebra. Let 1 = (1, . . . , 1) denote the unit element of
Ω. We agree that K-subalgebras of Ω contain by default 1. It can be shown
that any K-subalgebra of Ω is itself a finite étale K-algebra. We view K as a
subalgebra of Ω by identifying a ∈ K with a · 1.
We define the norm $N_{\Omega/K}(\alpha)$ of α = (α1 , . . . , αq ) ∈ Ω by
$$N_{\Omega/K}(\alpha) = N_{L_1/K}(\alpha_1) \cdots N_{L_q/K}(\alpha_q). \qquad (9.4.5)$$
It can be shown that this is the determinant of the K-linear map x → αx from
Ω to itself.
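The description of the norm as a determinant can be checked by hand in a small example. Assume, for illustration only, K = Q and a finite étale algebra Q(√2) × Q with K-basis (1, 0), (√2, 0), (0, 1); the element α = (a + b√2, c) then acts by the 3 × 3 matrix written out below. The function names are hypothetical, and the sketch merely compares the determinant with the product of norms on the right-hand side of (9.4.5).

```python
def multiplication_matrix(a, b, c):
    """Matrix of x -> alpha*x on Q(sqrt 2) x Q for alpha = (a + b*sqrt 2, c),
    in the basis (1, 0), (sqrt 2, 0), (0, 1); columns are images of basis vectors."""
    return [[a, 2 * b, 0],
            [b, a, 0],
            [0, 0, c]]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (p, q, r), (s, t, u), (v, w, x) = m
    return p * (t * x - u * w) - q * (s * x - u * v) + r * (s * w - t * v)


def product_of_norms(a, b, c):
    """N_{Q(sqrt 2)/Q}(a + b*sqrt 2) * N_{Q/Q}(c), as in (9.4.5)."""
    return (a * a - 2 * b * b) * c


for a in range(-3, 4):
    for b in range(-3, 4):
        for c in range(-3, 4):
            assert det3(multiplication_matrix(a, b, c)) == product_of_norms(a, b, c)
```

The block structure of the matrix mirrors the product decomposition of the algebra, which is why the determinant factors as a product of field norms.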
The A-module
$$M := \{ \mu = (l_1(x), \ldots, l_q(x)) : x \in A^m \}$$
is contained in Ω. Replacing δ/c by δ in (9.1), the identities (9.4.3) and (9.4.5)
imply that every solution x of the equation (9.1) yields a solution of the equation
$$N_{\Omega/K}(\mu) = \delta \quad \text{in } \mu \in M. \qquad (9.4.6)$$
Further, if F is of maximal rank, that is, if F has m linearly independent linear
factors in its factorization over G, then there is a one-to-one correspondence
between the solutions of (9.1) and (9.4.6). For q = 1, (9.4.6) reduces to a norm
form equation.
In what follows we consider (9.4.6) where we allow M to be any finitely
generated non-zero A-module in Ω. Denote by KM the K-vector space generated
by M in Ω. For each K-subalgebra Υ of Ω, denote by $A_\Upsilon$ the integral closure of
A in Υ, and by $E_\Upsilon$ the multiplicative subgroup of $A_\Upsilon^*$ consisting of all elements
ε ∈ $A_\Upsilon^*$ with $N_{\Omega/K}(\varepsilon) = 1$. The group $E_\Upsilon$ is finitely generated. For every
solution μ of (9.4.6) and every K-subalgebra Υ of Ω for which μΥ ⊆ KM,
all elements of $(\mu E_\Upsilon) \cap M$ are solutions of (9.4.6). Such a subset of solutions
$(\mu E_\Upsilon) \cap M$ is called a wide (M, Υ)-family of solutions of (9.4.6).
We state some results of Győry without proof.
Theorem 9.4.1 The set of solutions of (9.4.6) is a union of at most finitely
many wide families of solutions of (9.4.6).
Proof. See Győry (1993a).
Consider now the equation
$$N_{\Omega/K}(\mu) \in \delta A^* \quad \text{in } \mu \in M. \qquad (9.4.7)$$
For any K-subalgebra Υ of Ω, denote by $U_\Upsilon$ the subgroup of $A_\Upsilon^*$ consisting of
all elements ε with $N_{\Omega/K}(\varepsilon) \in A^*$. The group $U_\Upsilon$ is finitely generated. Further,
for every solution μ of (9.4.7) with μΥ ⊆ KM, all elements of $(\mu U_\Upsilon) \cap M$
are also solutions of (9.4.7). Such a set of solutions is called a wide (M, Υ)-family of solutions of (9.4.7).
Theorem 9.4.1 easily follows from the following.
Theorem 9.4.2 The set of solutions of (9.4.7) is a union of finitely many wide
families of solutions of (9.4.7).
Proof. See Győry (1993a).
The proof of Theorem 9.4.2 depends again on Proposition 9.3.2 concerning
the unit equation (9.2.1).
Let V be a non-zero K-linear subspace of Ω. For a K-subalgebra Υ of Ω define
$$V^{\Upsilon} := \{ \mu \in V : \mu\Upsilon \subseteq V \}.$$
We call V non-degenerate if $V^{\Upsilon} = (0)$ for every K-subalgebra Υ of Ω different
from K, and degenerate otherwise. Let M be a finitely generated A-module in
Ω with KM = V . If V is non-degenerate, then by Theorem 9.4.1, all solutions
of (9.4.6) are contained in a union of finitely many sets of the form $(\mu E_K) \cap M$.
But $E_K$ is finite, hence this proves the implication (i) ⇒ (ii) of the following.
Theorem 9.4.3 Let V be a fixed, non-zero K-linear subspace of Ω. Then the
following three statements are equivalent:
(i) V is non-degenerate;
(ii) for every ring A with quotient field K which is finitely generated over Z,
every finitely generated A-module M with KM = V and every δ ∈ K ∗ ,
equation (9.4.6) has only finitely many solutions;
(iii) for every A, M, δ as in (ii), equation (9.4.7) has only finitely many A∗ -cosets of solutions.
Proof. See Győry (1993a).
This theorem is equivalent to Theorem 9.1.1 with L = L0 .
We now specialize the above results to the norm form equations (9.4.1)
and (9.4.2). In the classical case K = Q, A = Z, Schmidt (1971) proved
a fundamental theorem, which states that the norm form equation (9.4.2) has
finitely many solutions for all δ ∈ Q∗ if and only if the Q-vector space QM has
no subspace of the form μL′ , where μ ∈ L∗ and L′ is a subfield of L different
from Q and the imaginary quadratic number fields. The result of Schmidt was
generalized by Schlickewei (1977c) to the case K = Q and A a ring of S-integers. In the case of norm form equations, Theorem 9.4.3 as well as Theorems
9.4.1 and 9.4.2 were proved in Laurent (1984).
In the number field case, when in (9.4.1) and (9.4.2) K is an algebraic
number field, Schmidt (1972) for K = Q, A = Z, Schlickewei (1977c) for
K = Q and Laurent (1984) for an arbitrary number field K gave a more precise
description of the set of solutions of the norm form equations (9.4.1) and (9.4.2),
in which the solutions are divided into more restrictive families of solutions
instead of the wide families of Theorems 9.4.1 and 9.4.2. The next theorem is
a generalization of these results to arbitrary decomposable form equations.
Let K be an algebraic number field, S a finite set of places of K containing
all infinite places, OS the ring of S-integers in K, and Ω a finite étale K-algebra.
Let δ ∈ K ∗ , M a finitely generated OS -module contained in Ω, and consider
the equation
$$N_{\Omega/K}(\mu) \in \delta O_S^* \quad \text{in } \mu \in M. \qquad (9.4.8)$$
For each K-subalgebra Υ of Ω, denote by $O_{S,\Upsilon}$ the integral closure of OS in
Υ. Further, we define the sets
$$(KM)^{\Upsilon} := \{ \mu \in KM : \mu\Upsilon \subseteq KM \}, \qquad M^{\Upsilon} := (KM)^{\Upsilon} \cap M,$$
where KM is the K-vector space in Ω generated by M. Consider the subgroup
$$U_{M,\Upsilon} := \{ \varepsilon \in O_{S,\Upsilon}^* : \varepsilon M^{\Upsilon} = M^{\Upsilon} \}$$
of the unit group of $O_{S,\Upsilon}$. The group $O_{S,\Upsilon}^*$ is finitely generated, hence its rank
is finite. One can show that $U_{M,\Upsilon}$ is of finite index in $O_{S,\Upsilon}^*$, and thus has the same
rank as $O_{S,\Upsilon}^*$. An (M, Υ)-family of solutions of (9.4.8) is a coset $\mu U_{M,\Upsilon}$,
where Υ is a K-subalgebra of Ω and μ ∈ $M^{\Upsilon}$ is a solution of (9.4.8). Every
element of $\mu U_{M,\Upsilon}$ is a solution of (9.4.8).
Theorem 9.4.4 The set of solutions of (9.4.8) is a union of finitely many
families.
Proof. See Győry (1993a).
As was mentioned above, in the case of norm form equations, Theorem 9.4.4
is due to Schmidt (1972) for K = Q, OS = Z, Schlickewei (1977c) for K = Q
and Laurent (1984) for arbitrary number fields K. Theorem 9.4.4 was deduced
from Theorem 9.4.2 by showing that every wide family of solutions splits into
finitely many families of solutions. It follows from an observation of Laurent
(1984) that Theorem 9.4.4 cannot be extended to the case of an arbitrary finitely
generated ground field K.
The proofs of Theorems 9.4.1–9.4.4 in Győry (1993a) are based on
Theorem 9.2.1 on unit equations. See also Bombieri and Gubler (2006) where,
in the case of norm form equations over Z, the proof of the above Theorem 9.4.4
involves also Theorem 9.2.1. As was explained in Chapter 6, the proof of Theorem 9.2.1 depends on the p-adic Subspace Theorem. We note that in contrast,
Schmidt and Schlickewei deduced their results concerning norm form equations
directly from the Subspace Theorem and its p-adic generalization.
9.5 Upper bounds for the number of solutions
In this section, we consider decomposable form equations over the ring of
S-integers in an algebraic number field. We give an overview of quantitative
results, giving explicit upper bounds for the number of solutions. We first recall
some history. In Subsection 9.5.1 we recall from Evertse (1995) a general result
on systems of S-unit equations with a Galois action, and in Subsection 9.5.2
we deduce, among other things, a quantitative version of Theorem 9.1.1.
Let K be an algebraic number field, S a finite set of places of K containing
the infinite places, δ a non-zero element of OS , and F ∈ OS [X, Y ] a binary
form of degree n ≥ 3 with at least three pairwise non-proportional linear factors
over K. Consider the equation
\[ F(x, y) \in \delta O_S^* \quad \text{in } (x, y) \in (O_S^*)^2. \tag{9.5.1} \]
Denote by s the cardinality of S and by ωS (δ) the number of prime ideals outside
S occurring in the factorization of δ. The solutions of (9.5.1) are divided into
OS∗ -cosets in the usual manner. Lewis and Mahler (1961) were the first to
give, in the case K = Q, a completely explicit upper bound for the number of
OS∗ -cosets of solutions of (9.5.1), depending on s, ωS (δ), n, and also on the
heights of the coefficients of F . In chapter 6 of his PhD thesis Evertse (1983)
extended the result of Lewis and Mahler to arbitrary number fields and sets
of places S, and derived an explicit upper bound for the number of OS∗ -cosets of
solutions of (9.5.1) that depends only on n, s, ωS (δ), and so is independent of
the coefficients of F . On the other hand, Evertse’s bound had a much worse
dependence on the degree n of F than that of Lewis and Mahler. Later, Evertse’s
bound was reduced substantially in the case that F is irreducible over K.
Bombieri and Schmidt (1987) proved that if F ∈ Z[X, Y ] is an irreducible
binary form of degree n ≥ 3, then the equation
\[ F(x, y) = 1 \quad \text{in } x, y \in \mathbb{Z} \]
has at most $cn$ solutions with c an absolute constant. For n sufficiently large, c
can be taken equal to 430. The example $(x - a_1 y) \cdots (x - a_n y) + y^n = 1$ with
$a_1, \ldots, a_n$ distinct integers shows that the bound of Bombieri and Schmidt is
best possible in terms of n. Bombieri considered more generally (9.5.1) with
arbitrary K, S but with F irreducible over K and of degree n ≥ 6. In Bombieri
(1994) he obtained the upper bound $(12n)^{12(s+\omega_S(\delta))}$ for the number of
$O_S^*$-cosets of solutions of (9.5.1). For binary forms $F \in O_S[X, Y]$ of degree n ≥ 3
irreducible over K, this was improved in Evertse (1997) to $(10^5 n)^{s+\omega_S(\delta)}$. This
is still the best bound for general Thue–Mahler equations.
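The extremal example of Bombieri and Schmidt can be checked directly: at each point $(a_i, 1)$ the product term vanishes and $y^n = 1$ remains. The following is only a minimal sketch; the helper name `thue_example` and the sample integers are illustrative assumptions, and the code confirms the n solutions $(a_i, 1)$ without testing irreducibility of the form.

```python
from math import prod

def thue_example(a, x, y):
    """Evaluate F(x, y) = (x - a_1 y)...(x - a_n y) + y^n for the integer list a."""
    n = len(a)
    return prod(x - ai * y for ai in a) + y**n

# Hypothetical sample: n = 5 distinct integers a_1, ..., a_n.
a = [-2, -1, 0, 1, 3]

# Every pair (a_i, 1) solves F(x, y) = 1, which gives at least n solutions.
solutions = [(ai, 1) for ai in a if thue_example(a, ai, 1) == 1]
print(len(solutions))
```

The count grows linearly with the number of chosen integers, which is the sense in which a bound of the shape $cn$ cannot be improved in terms of n.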
Schmidt generalized the above mentioned results on Thue equations to
norm form equations over Z in more than two unknowns. Let L be a number
field of degree n and $l = \alpha_1 X_1 + \cdots + \alpha_m X_m$, where $L = \mathbb{Q}(\alpha_1, \ldots, \alpha_m)$ and
$\alpha_1, \ldots, \alpha_m$ are linearly independent over Q. Let c be a non-zero integer such
that
\[ F := c N_{L/\mathbb{Q}}(l) = c \prod_{\sigma} \bigl(\sigma(\alpha_1) X_1 + \cdots + \sigma(\alpha_m) X_m\bigr) \in \mathbb{Z}[X_1, \ldots, X_m], \]
where the product is over all embeddings $\sigma : L \to \overline{\mathbb{Q}}$. Recall that F is called
non-degenerate if the Q-vector space $V := \{l(x) : x \in \mathbb{Q}^m\}$ does not contain
$\mu L'$ for some $\mu \in L^*$ and some subfield $L'$ of L that is not equal to Q or an
imaginary quadratic field. Under this hypothesis, Schmidt (1990), Theorem 1
proved that the equation
\[ |F(x)| = 1 \quad \text{in } x \in \mathbb{Z}^m \]
has at most
\[ c_1(m, n) = \min\bigl( n^{2^{30m} n^2},\ n^{c_2(m)} \bigr) \quad \text{with } c_2(m) = (2m)^{m \cdot 2^{m+4}} \]
solutions. In the same paper, Schmidt proved also that if δ is any positive
integer, then the equation
\[ |F(x)| = \delta \]
has at most
\[ c_1(m, n) \binom{n}{m-1}^{\omega(\delta)} d_{m-1}(\delta^n) \]
primitive solutions, i.e., with coordinates having greatest common divisor 1,
where ω(δ) denotes the number of distinct primes dividing δ, and $d_{m-1}(\delta^n)$
denotes the number of ways that $\delta^n$ can be expressed as a product of m − 1
positive integers. Schmidt’s main tool was his quantitative version of the
Subspace Theorem that he had established shortly before in Schmidt (1989).
Győry (1993a) gave explicit upper bounds for the number of solutions of
arbitrary decomposable form equations over the ring of S-integers of a number
field K, in the case that this number is finite. More generally, in the case that
the number of solutions is infinite, he gave an explicit upper bound for the
number of families of solutions. He derived his bounds by making a reduction
to S-unit equations over the splitting field over K of the decomposable form
involved and this led to bounds that are exponential in both the cardinality of
S and the degree of the splitting field. Notice that if the decomposable form
involved has degree n, then in the worst case, its splitting field has degree n!
and then Győry’s bound is exponential in n!.
Evertse (1995) proved a general quantitative result on “Galois symmetric
S-unit vectors”, and this enabled him to prove much sharper upper bounds for
the number of solutions (if finite) of decomposable form equations over OS .
This was extended in Evertse and Győry (1997) to estimates for the number of
families of solutions in the case when the number of solutions is infinite.
In the next subsection, we recall, without proof, Evertse’s result on Galois
symmetric S-unit vectors. In the subsequent subsection we discuss some consequences for decomposable form equations and S-unit equations.
9.5.1 Galois symmetric S-unit vectors
Let K be an algebraic number field, S a finite set of places of K, and G a finite
normal extension of K. Denote by $O_{S,G}$ the integral closure of $O_S$ in G.
Let n ≥ 3 be an integer and fix an action of Gal(G/K) on {1, . . . , n}, i.e., a
homomorphism from Gal(G/K) to the permutation group of {1, . . . , n}. That
is, each σ ∈ Gal(G/K) is mapped to a permutation (σ(1), . . . , σ(n)) of (1, . . . , n). We
define the K-algebra
\[ \Lambda := \{ u = (u_1, \ldots, u_n) \in G^n : \sigma(u_i) = u_{\sigma(i)} \ \text{for } \sigma \in \mathrm{Gal}(G/K),\ i = 1, \ldots, n \} \]
with coordinatewise addition, multiplication, and scalar multiplication with
K. The unit element of Λ is $\mathbf{1} := (1, \ldots, 1)$. We embed K into Λ via
$\iota : a \mapsto a \cdot \mathbf{1}$.
A Λ-symmetric partition is a collection of non-empty, pairwise disjoint sets
$\mathcal{P} = \{P_1, \ldots, P_t\}$ such that
\[ \bigcup_{i=1}^{t} P_i = \{1, \ldots, n\}, \qquad \sigma(P_i) \in \mathcal{P} \ \text{for } \sigma \in \mathrm{Gal}(G/K),\ i = 1, \ldots, t. \]
In particular we have the trivial Λ-symmetric partition $\mathcal{P}_0 := \{\{1, \ldots, n\}\}$.
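The definition of a symmetric partition is purely combinatorial once the action on the indices is known, so it can be tested mechanically. The sketch below is illustrative only: the function name, the encoding of each automorphism as a dictionary i ↦ σ(i), and the sample action are assumptions. It suffices to pass generators of the image of Gal(G/K), since invariance under generators implies invariance under the whole group.

```python
def is_symmetric_partition(partition, perms, n):
    """Check that `partition` is a partition of {1..n} whose blocks are
    permuted among themselves by every index permutation in `perms`.

    Each element of `perms` is a dict i -> sigma(i).
    """
    blocks = {frozenset(b) for b in partition}
    covered = sorted(i for b in blocks for i in b)
    if not blocks or any(not b for b in blocks) or covered != list(range(1, n + 1)):
        return False  # empty block, overlapping blocks, or missing indices
    # sigma(P_i) must again be a block of the partition.
    return all(frozenset(s[i] for i in b) in blocks for s in perms for b in blocks)

# Hypothetical action of a group of order 2 on {1, 2, 3, 4}:
# sigma swaps 1 <-> 2 and 3 <-> 4.
sigma = {1: 2, 2: 1, 3: 4, 4: 3}
print(is_symmetric_partition([{1, 2}, {3, 4}], [sigma], 4))   # blocks fixed by sigma
print(is_symmetric_partition([{1, 3}, {2, 4}], [sigma], 4))   # blocks swapped by sigma
print(is_symmetric_partition([{1}, {2, 3, 4}], [sigma], 4))   # sigma({1}) = {2} is no block
```

The trivial partition {{1, . . . , n}} and the partition into singletons pass this test for every action.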
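A pair $i \overset{\mathcal{P}}{\sim} j$ is a pair i, j ∈ {1, . . . , n} belonging to the same set of $\mathcal{P}$. With
a Λ-symmetric partition $\mathcal{P}$ we associate the sets
\[ \Lambda_{\mathcal{P}} := \{ u \in \Lambda : u_i = u_j \ \text{for each pair } i \overset{\mathcal{P}}{\sim} j \}, \qquad O_{S,\mathcal{P}} := O_{S,G}^n \cap \Lambda_{\mathcal{P}}. \]
The set $\Lambda_{\mathcal{P}}$ is a K-subalgebra of Λ, and $O_{S,\mathcal{P}}$ is the integral closure of $O_S$ in
$\Lambda_{\mathcal{P}}$. For instance, $\Lambda_{\mathcal{P}_0} = \iota(K)$, and $\Lambda_{\mathcal{P}} = \Lambda$ for $\mathcal{P} = \{\{1\}, \ldots, \{n\}\}$.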
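Let W be a K-linear subspace of Λ. Define
\[ W^{\perp} := \Bigl\{ y = (y_1, \ldots, y_n) \in G^n : \sum_{i=1}^{n} y_i u_i = 0 \ \text{for all } u \in W \Bigr\}. \]
For a Λ-symmetric partition $\mathcal{P} = \{P_1, \ldots, P_t\}$, we define the subspace of W,
\[ W_{\mathcal{P}} := \Bigl\{ u \in W : \sum_{j \in P_i} y_j u_j = 0 \ \text{for all } y \in W^{\perp},\ i = 1, \ldots, t \Bigr\}. \]
One can show that
\[ W_{\mathcal{P}} = \{ u \in W : u \Lambda_{\mathcal{P}} \subseteq W \} \]
(see Evertse (1995), Lemma 10). As a consequence, $\Lambda_{\mathcal{P}} W_{\mathcal{P}} \subseteq W_{\mathcal{P}}$.
A $K^*$-coset is a set $\{a \cdot u : a \in K^*\}$ with some fixed u ∈ Λ.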
We are now ready to state our result.
Theorem 9.5.1 Let K be a number field, G a finite normal extension of K,
n ≥ 3, Λ the K-algebra associated with a Gal(G/K)-action on {1, . . . , n}, W a K-linear
subspace of Λ of dimension m and S a finite set of places of K of cardinality s, containing all
infinite places. Then the set of $u = (u_1, \ldots, u_n)$ with
\[ u \in W, \qquad u_1 \cdots u_n \neq 0, \qquad u_i/u_j \in O_{S,G}^* \ \text{for } i, j = 1, \ldots, n, \tag{9.5.2} \]
\[ u \notin W_{\mathcal{P}} \ \text{for each } \Lambda\text{-symmetric partition } \mathcal{P} \ \text{such that } O_{S,\mathcal{P}}^*/\iota(O_S^*) \ \text{is infinite} \tag{9.5.3} \]
is a union of at most $(2^{33} n^2)^{m^3 s}$ $K^*$-cosets.
Proof. See Evertse (1995), Theorem 4, Lemma 10. The proof is based on a
quantitative version of the Subspace Theorem, proved in Evertse (1996).
9.5.2 Consequences for decomposable form equations and S-unit equations
Let K be a number field and S a finite set of places of K containing all infinite
places. Suppose |S| = s. For non-zero δ ∈ OS , we denote by ωS (δ) the number
of prime ideals outside S occurring in the prime ideal factorization of δ.
Let $F \in O_S[X_1, \ldots, X_m]$ be a decomposable form, and denote by G its
splitting field over K. Recall that there exists a Gal(G/K)-stable set
$\mathcal{L}_0 = \{l_1, \ldots, l_n\} \subset G[X_1, \ldots, X_m]$ of pairwise non-proportional linear forms,
$c \in K^*$ and positive integers $e_1, \ldots, e_n$, such that
\[ F = c\, l_1^{e_1} \cdots l_n^{e_n}. \]
Let $\mathcal{L}$ be a finite set of pairwise non-proportional linear forms from
$G[X_1, \ldots, X_m]$ with $\mathcal{L} \supseteq \mathcal{L}_0$.
We deduce a quantitative version of the implication (i) ⇒ (iii) of
Theorem 9.1.1 from Theorem 9.5.1.
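Theorem 9.5.2 Let m, K, S, F, G, $\mathcal{L}_0$, $\mathcal{L}$ be as above. Assume that
\[ \operatorname{rank}_G \mathcal{L}_0 = m, \tag{9.5.4} \]
\[ \mathcal{L} \cap \Bigl( \bigcup_{\sigma \in \mathrm{Gal}(G/K)} [\sigma(\mathcal{L}_1)] \cap [\mathcal{L}_0 \setminus \sigma(\mathcal{L}_1)] \Bigr) \neq \emptyset \tag{9.5.5} \]
for each non-empty Gal(G/K)-proper subset $\mathcal{L}_1 \subsetneq \mathcal{L}_0$. Then the solutions of
\[ F(x) \in \delta O_S^* \quad \text{in } x \in O_S^m \ \text{with } l(x) \neq 0 \ \text{for } l \in \mathcal{L} \tag{9.5.6} \]
lie in at most $(2^{33} n^2)^{m^3 (s+\omega_S(\delta))}$ $O_S^*$-cosets.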
Proof. Let $S'$ consist of the places in S and of the prime ideals in the factorization
of δ. Then $|S'| = s + \omega_S(\delta)$. Assume that (9.5.6) is solvable (if not we are done)
and choose a solution $x_0 \in O_S^m$. After multiplying $l_1, \ldots, l_n$ by suitable scalars,
which does not affect the above assumptions on $\mathcal{L}_0$, $\mathcal{L}$, we may assume that
$l_i(x_0) = 1$ for i = 1, . . . , n and $c \in O_{S'}^*$.
Denote by $O_{S',G}$ the integral closure of $O_{S'}$ in G. We first show that for
every solution $x \in O_S^m$ of (9.5.6) we have
\[ l_i(x) \in O_{S',G}^* \quad \text{for } i = 1, \ldots, n. \tag{9.5.7} \]
This is equivalent to the assertion that $|l_i(x)|_V = 1$ for i = 1, . . . , n and every
place V of G not lying above a place from $S'$. To prove this, take such a place
V. For a polynomial H with coefficients in G denote by $|H|_V$ the maximum of
the $|\cdot|_V$-values of the coefficients of H. By our assumption on $l_1, \ldots, l_n$ we
have $|l_i|_V \geq 1$ for i = 1, . . . , n (since $l_i(x_0) = 1$ and the coordinates of $x_0$ are
V-integral) while on the other hand, by Proposition 1.9.4
and our assumption $F \in O_S[X_1, \ldots, X_m]$,
\[ \prod_{i=1}^{n} |l_i|_V^{e_i} = |F|_V \leq 1. \]
Hence $|l_i|_V = 1$ for i = 1, . . . , n. So if $x \in O_S^m$ is a solution of (9.5.6) then
$|l_i(x)|_V \leq 1$ for i = 1, . . . , n and $\prod_{i=1}^{n} |l_i(x)|_V^{e_i} = |F(x)|_V = 1$. This implies
$|l_i(x)|_V = 1$ for i = 1, . . . , n, as required.
Define the K-linear map
\[ \varphi : x \mapsto (l_1(x), \ldots, l_n(x)) : K^m \to G^n. \]
By (9.5.4), it is injective. Let $\varphi(K^m) =: W$. Since $\{l_1, \ldots, l_n\}$ is Gal(G/K)-stable,
there is an action of Gal(G/K) on {1, . . . , n} such that $\sigma(l_i) = l_{\sigma(i)}$
for i = 1, . . . , n, σ ∈ Gal(G/K). This implies that W is an m-dimensional,
K-linear subspace of the associated K-algebra Λ.
In view of (9.5.7) we have for every solution $x \in O_S^m$ of (9.5.6) that
\[ \varphi(x) \in W, \qquad \varphi(x) \in (O_{S',G}^*)^n, \tag{9.5.8} \]
so certainly, $u := \varphi(x)$ satisfies (9.5.2) with $S'$ instead of S.
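We next show that if $x \in O_S^m$ is a solution of (9.5.6), then
\[ \varphi(x) \notin W_{\mathcal{P}} \quad \text{for each } \Lambda\text{-symmetric partition } \mathcal{P} \neq \{\{1, \ldots, n\}\}, \tag{9.5.9} \]
which is stronger than (9.5.3). Let $x \in O_S^m$ be a solution of (9.5.6), and
$\mathcal{P} = \{P_1, \ldots, P_t\}$ a Λ-symmetric partition different from {{1, . . . , n}}. Further, let
$\mathcal{L}_1 = \{l_i : i \in P_1\}$. Then $\mathcal{L}_1 \subsetneq \mathcal{L}_0$ and $\mathcal{L}_1$ is Gal(G/K)-proper. By assumption
(9.5.5), there is a linear form in
\[ \bigcup_{\sigma \in \mathrm{Gal}(G/K)} [\sigma(\mathcal{L}_1)] \cap [\mathcal{L}_0 \setminus \sigma(\mathcal{L}_1)] \]
that does not vanish at x. This implies that there are σ ∈ Gal(G/K) and
$l \in [\sigma(\mathcal{L}_1)] \cap [\mathcal{L}_0 \setminus \sigma(\mathcal{L}_1)]$ such that $l(x) \neq 0$. There is a set $P_i \in \mathcal{P}$ such that
$\sigma(\mathcal{L}_1) = \{l_j : j \in P_i\}$. Now there are $c_j \in G$ for j = 1, . . . , n such that
\[ l = \sum_{j \in P_i} c_j l_j = - \sum_{j \in P_i^c} c_j l_j, \]
where $P_i^c = \{1, \ldots, n\} \setminus P_i$. The vector $(c_1, \ldots, c_n)$ belongs to $W^{\perp}$, and our
observation $l(x) \neq 0$ implies that for the vector $u = \varphi(x)$ we have $\sum_{j \in P_i} c_j u_j \neq 0$.
So indeed, $\varphi(x) \notin W_{\mathcal{P}}$.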
We conclude that if $x \in O_S^m$ is a solution of (9.5.6), then $\varphi(x)$ satisfies (9.5.8),
(9.5.9), hence (9.5.2), (9.5.3) with $S'$ instead of S. Now Theorem 9.5.1 with
$S'$, $s' = s + \omega_S(\delta)$, instead of S, s, implies that the vectors $\varphi(x)$, with $x \in O_S^m$
a solution of (9.5.6), lie in at most $N := (2^{33} n^2)^{m^3 (s+\omega_S(\delta))}$ $K^*$-cosets. Since φ
is an injective, K-linear map, this implies that the solutions x themselves lie
in at most N $K^*$-cosets. But, clearly, any two solutions of (9.5.6) in the same
$K^*$-coset lie in fact in the same $O_S^*$-coset. Theorem 9.5.2 follows.
The next consequence is an improvement of Theorem 6.1.3 in the case
$\Gamma = (O_S^*)^m$.
Theorem 9.5.3 Let K, S be as above, and let $a_1, \ldots, a_m \in K^*$. Then the
equation
\[ a_1 u_1 + \cdots + a_m u_m = 1 \quad \text{in } u_1, \ldots, u_m \in O_S^* \tag{9.5.10} \]
has at most $(2^{35} m^2)^{m^3 s}$ solutions with
\[ \sum_{i \in I} a_i u_i \neq 0 \quad \text{for each non-empty } I \subseteq \{1, \ldots, m\}. \tag{9.5.11} \]
Proof. We apply Theorem 9.5.1 with n = m + 1, G = K, and
\[ W = \{(u_1, \ldots, u_m, u_{m+1}) \in K^{m+1} : a_1 u_1 + \cdots + a_m u_m = u_{m+1}\}. \]
Then the points $(u_1, \ldots, u_m, 1)$ with (9.5.10) and (9.5.11) satisfy (9.5.2) and
(9.5.3). Since these points lie in different $K^*$-cosets and
$(2^{33}(m+1)^2)^{m^3 s} \leq (2^{35} m^2)^{m^3 s}$, Theorem 9.5.3 follows.
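To get a feeling for the size of these bounds, and to see that the bound of Theorem 9.5.1 with n = m + 1 is indeed dominated by the bound stated in Theorem 9.5.3, one may evaluate both with exact integer arithmetic. This is only an illustrative sketch; the function names and the sample values of m and s are assumptions.

```python
def bound_9_5_1(n, m, s):
    """(2^33 * n^2)^(m^3 * s): the coset bound stated in Theorem 9.5.1."""
    return (2**33 * n**2) ** (m**3 * s)

def bound_9_5_3(m, s):
    """(2^35 * m^2)^(m^3 * s): the solution bound stated in Theorem 9.5.3."""
    return (2**35 * m**2) ** (m**3 * s)

for m, s in [(2, 2), (3, 2), (3, 5)]:
    via_9_5_1 = bound_9_5_1(m + 1, m, s)   # n = m + 1, as in the proof
    stated = bound_9_5_3(m, s)
    assert via_9_5_1 <= stated             # because (m + 1)^2 <= 4 m^2
    print(m, s, len(str(stated)))          # number of decimal digits of the bound
```

The comparison rests only on $(m+1)^2 \leq 4m^2$ and $2^{33} \cdot 4 = 2^{35}$.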
We state without proof a consequence of Theorem 9.5.1 for the number of
families of solutions of decomposable form equations, giving a quantitative
version of Theorem 9.4.4. We keep the notation from Section 9.4.
Let as before K be an algebraic number field, S a finite set of places of
K containing all infinite places and Ω a finite étale K-algebra. Let c, δ be
non-zero elements of $O_S$, M a finitely generated $O_S$-module contained in Ω,
and consider the equation
\[ c N_{\Omega/K}(\mu) \in \delta O_S^* \quad \text{in } \mu \in M. \tag{9.5.12} \]
We assume that for some $O_S$-module generating set $\{\alpha_1, \ldots, \alpha_t\}$ of M, the
polynomial $c N_{\Omega/K}(X_1 \alpha_1 + \cdots + X_t \alpha_t)$ has its coefficients in $O_S$. In fact, this
does not depend on the choice of the generating set.
For the definition of the submodules $M^{ϒ}$ and the groups $U_{M,ϒ}$ (for ϒ
a K-subalgebra of Ω) and that of a family of solutions of (9.5.12) we refer
to Section 9.4. Recall that $U_{M,ϒ}$ is a subgroup of $O_{S,ϒ}^*$ of finite index if
$M^{ϒ} \neq (0)$. A family of solutions of (9.5.12) is called irreducible if it is not a
union of finitely many strictly smaller families of solutions.
Let $n := [\Omega : K]$, $m := \dim_K KM$, $s := |S|$, and put
\[ \psi(\delta) := \binom{n}{m-1}^{\omega_S(\delta)} \prod_{v \notin S} \binom{\operatorname{ord}_v(\delta) + m - 1}{m-1}, \]
where $\operatorname{ord}_v(\delta)$ is the exponent of the prime ideal corresponding to v in the
factorization of δ. Consider the K-subalgebras ϒ of Ω such that (9.5.12) has
irreducible (M, ϒ)-families of solutions, and denote by $I_M$ the maximum of
the indices $[O_{S,ϒ}^* : U_{M,ϒ}]$, taken over all such algebras ϒ.
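In the simplest situation K = Q, S = {∞}, the places outside S are the primes, $\omega_S(\delta)$ is the number of primes dividing δ, and ψ(δ) becomes an explicit arithmetic function. The sketch below evaluates it in that special case only; the restriction to Q, the function names, and the trial-division factorisation are assumptions made for illustration.

```python
from math import comb

def prime_exponents(delta):
    """Trial-division factorisation of a positive integer: {prime: exponent}."""
    exps, p = {}, 2
    while p * p <= delta:
        while delta % p == 0:
            exps[p] = exps.get(p, 0) + 1
            delta //= p
        p += 1
    if delta > 1:
        exps[delta] = exps.get(delta, 0) + 1
    return exps

def psi(delta, n, m):
    """psi(delta) for K = Q and S = {infinity} (assumed special case)."""
    exps = prime_exponents(delta)
    value = comb(n, m - 1) ** len(exps)      # binom(n, m-1)^{omega_S(delta)}
    for e in exps.values():
        value *= comb(e + m - 1, m - 1)      # primes not dividing delta contribute 1
    return value

print(psi(12, 3, 2))
```

For δ = 12 = 2² · 3, n = 3, m = 2 this gives 3² · 3 · 2 = 54, and ψ(1) = 1 for all n and m.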
We state without proof the following quantitative result, which can be
deduced from Theorem 9.5.1.
Theorem 9.5.4 The set of solutions of (9.5.12) is a union of at most
\[ (2^{33} n^2)^{m^3 s}\, \psi(\delta) \cdot I_M \]
irreducible families.
Proof. This is a simplified version of Evertse and Győry (1997), Theorem 1.
Notice that by taking for Ω a finite extension field of K, we obtain from
Theorem 9.5.4 an upper bound for the number of families of solutions of a
norm form equation.
By an $O_S^*$-coset of solutions, we mean a coset $\mu O_S^*$, where μ is a solution
of (9.5.12).
Corollary 9.5.5 Assume that (9.5.12) has only finitely many $O_S^*$-cosets of
solutions. Then the number of these is at most
\[ (2^{33} n^2)^{m^3 s}\, \psi(\delta). \]
Proof. By assumption, (9.5.12) cannot have irreducible families of solutions
that are the union of infinitely many OS∗ -cosets. So it has only irreducible
families that are the union of only finitely many OS∗ -cosets, and such families must be OS∗ -cosets themselves. In this situation, IM = 1. Corollary 9.5.5
follows.
We can express the set of solutions of (9.5.12) as a minimal finite union
of irreducible families $\mathcal{F}_1 \cup \cdots \cup \mathcal{F}_t$, i.e., none of the families in this union is
contained in the union of the others. Evertse and Győry (1997) showed that this
way of expressing the set of solutions is unique, and moreover, that $\mathcal{F}_1, \ldots, \mathcal{F}_t$
are precisely the maximal irreducible families of solutions of (9.5.12), that is,
if $\mathcal{F}$ is any other irreducible family of solutions of (9.5.12), then $\mathcal{F} \subseteq \mathcal{F}_i$ for
some i ∈ {1, . . . , t}.
Voutier (2014) showed that if L is an algebraic number field of degree
n > 3 and M a free Z-module of rank 3 contained in OL , then the norm form
equation
\[ N_{L/\mathbb{Q}}(\mu) = 1 \quad \text{in } \mu \in M \tag{9.5.13} \]
has at most 10969 n10 families of solutions. On the other hand, in his paper,
Voutier showed that for every number field L of degree n ≥ 3 and every integer
N > 0, there exists a full module M ⊆ OL , i.e., of rank equal to n, such that
(9.5.13) has at least N families of solutions. This implies that the bound in
Theorem 9.5.4 cannot be replaced by one depending only on m, n, s, δ and
independent of M.
9.6 Effective results
In this section, effective results are presented for some important classes of
decomposable form equations of the form
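\[ F(x) = \delta \quad \text{in } x = (x_1, \ldots, x_m) \in O_S^m \ \text{with } l(x) \neq 0 \ \text{for } l \in \mathcal{L} \tag{9.6.1} \]
and
\[ F(x) \in \delta O_S^* \quad \text{in } x = (x_1, \ldots, x_m) \in O_S^m \ \text{with } l(x) \neq 0 \ \text{for } l \in \mathcal{L}, \tag{9.6.2} \]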
where OS is the ring of S-integers of a number field K, δ ∈ OS \ {0}, F (X)
is a decomposable form of degree n ≥ 3 with coefficients in OS and L is a
finite set of non-zero linear forms from K[X1 , . . . , Xm ]. Using the effective
results of Section 4.1 on S-unit equations, we derive effective bounds for the
S-integral solutions of Thue equations, discriminant equations, certain norm
form equations and decomposable form equations of an arbitrary number of
unknowns. In the case of equation (9.6.1), these imply the finiteness of the
number of solutions, and make it possible, at least in principle, to determine the
solutions, provided that K, S, δ, n and the coefficients of F are given effectively
in the sense described in Section 1.10. The results presented in this section have
many important applications in number theory.
As was already mentioned, equation (9.6.2) can be reduced to finitely many
equations of the form (9.6.1). This can be carried out in an effective way.
Indeed, let x be a solution of (9.6.2). Then F(x) = δη with some $\eta \in O_S^*$. By
Proposition 4.3.12 there is an $\varepsilon \in O_S^*$ for which $h(\eta\varepsilon^n)$ and hence $h(\delta\eta\varepsilon^n)$ are
effectively bounded. Further, since F is homogeneous of degree n, we have
$F(\varepsilon x) = \varepsilon^n F(x) = \delta\eta\varepsilon^n$, so εx is a solution of equation (9.6.1) with δ replaced
by $\delta\eta\varepsilon^n$. In what follows, we deal only with equation (9.6.1).
Further effective applications of S-unit equations to discriminant form and
index form equations and related Diophantine problems are given in our next
book on discriminant equations.
9.6.1 Thue equations
Let K be an algebraic number field and S a finite set of places of K, containing
all infinite places. Let F ∈ OS [X, Y ] be a binary form of degree n ≥ 3 having at
least three pairwise non-proportional linear factors over K, and let δ ∈ OS \ {0}.
Consider the Thue equation
\[ F(x, y) = \delta \quad \text{in } x, y \in O_S. \tag{9.6.3} \]
In the classical case when K = Q, S = {∞} and F (X, Y ) is irreducible over
Q, the first explicit upper bound for the solutions of this equation was obtained
in Baker (1968a). His bound depends only on δ, n, and the maximum of the
absolute values of the coefficients of F . Baker’s proof is based on his effective
estimates for linear forms in logarithms of algebraic numbers. Baker’s result
was extended in Coates (1969) to the case when K = Q and S is arbitrary and
in Kotov and Sprindžuk (1973) to the case of equation (9.6.3). Later, several
improvements and generalizations have been established; for references see
the Notes (Section 9.7). We note that better bounds can be obtained for the
solutions if certain parameters of the number field generated by one or more
zeros of F (x, 1) are also involved.
For applications, we give completely explicit upper bounds for the solutions
of equation (9.6.3). Let d, $h_K$ and $R_K$ denote the degree, class number and
regulator of K. Further, let s = |S|, $R_S$ the S-regulator of K (see (1.8.2)),
\[ P_K := \max_{\mathfrak{p}} N(\mathfrak{p}) \ \text{if } S \supsetneq M_K^{\infty} \quad \text{and} \quad P_K := 2 \ \text{if } S = M_K^{\infty}, \]
and
\[ Q_K := \prod_{\mathfrak{p}} N(\mathfrak{p}) \ \text{if } S \supsetneq M_K^{\infty} \quad \text{and} \quad Q_K := 1 \ \text{if } S = M_K^{\infty}, \]
where the maximum and product are taken over all prime ideals $\mathfrak{p}$ from S,
and $N(\mathfrak{p}) := |O_K/\mathfrak{p}|$ denotes the norm of $\mathfrak{p}$. The case s = 1 being trivial, we
assume that s ≥ 2. Finally, let H (≥ 2) be an upper bound for the maximum of
the logarithmic heights of the coefficients of F(X, Y).
The next theorem is a slightly weaker version of Corollary 3 of Győry and
Yu (2006).
Theorem 9.6.1 Suppose that the binary form F(X, Y) in (9.6.3) factorizes
over K into linear factors and that at least three of these factors are pairwise
non-proportional. Then all solutions x, y of (9.6.3) satisfy
\[ \max(h(x), h(y)) < \frac{1}{n} h(\delta) + 7n (64eds)^{2s+5} P_K N_K R_S (\log^* R_S), \tag{9.6.4} \]
where
\[ N_K = n^5 H + \frac{1}{d} \log N_S(\delta) + d^d R_K + \frac{h_K}{d} \log Q_K. \]
The proof is based on Corollary 4.1.5 on S-unit equations.
Consider now the case when F(X, Y) does not factorize over K into linear
forms. For later convenience, we assume that $F(1, 0) \neq 0$. Then we may assume
that three zeros, say $\alpha_1, \alpha_2, \alpha_3$, of F(X, 1) are distinct, and $\alpha_1$ is not contained
in K. Let $L = K(\alpha_1)$, $h_L$ and $R_L$ be the class number and regulator of L, T the
set of places of L lying above those of S, and $R_T$ the T-regulator of L. Further,
let $M = K(\alpha_1, \alpha_2, \alpha_3)$ and $P_M = P_K^{[M:K]}$ if $S \supsetneq M_K^{\infty}$ and $P_M = 2$ if $S = M_K^{\infty}$.
Theorem 9.6.2 Let F(X, Y) be a binary form as in (9.6.3). Suppose that
$F(1, 0) \neq 0$, that $\alpha_1, \alpha_2, \alpha_3$ are distinct zeros of F(X, 1), and that $\alpha_1$ is not
contained in K. Then, with the above notation, all solutions of (9.6.3) satisfy
\[ \max(h(x), h(y)) < \frac{1}{n} h(\delta) + 17 (64edn^2 s)^{2ns+5} P_M N_L R_T (\log^* R_T), \tag{9.6.5} \]
where
\[ N_L = n^2 H + \frac{1}{d} \log N_S(\delta) + (nd)^{nd} R_L + \frac{h_L}{d} \log Q_K. \]
In particular, if K = Q, S = {∞} and F(X, Y) is irreducible over Q, then
\[ \max(|x|, |y|) < \exp\bigl\{ c (H + \log|\delta| + n^n R_L) R_L (\log^* R_L) \bigr\}, \]
where $c = 34(64en^2)^{2n+6}$.
Apart from the value of c, this latter bound was established in Bugeaud and
Győry (1996b). Combining this bound with (1.5.2) and Lemma 1.5.1, one gets
at once an upper bound that depends only on δ, n and H .
Using their methods mentioned in Section 4.5, Bombieri (1993) in the case
$S = M_K^{\infty}$, F(X, 1) monic, and Bugeaud (1998) in the case F(X, 1) monic and
irreducible over K derived similar bounds for the solutions of equation (9.6.3).
Theorem 9.6.2 is a generalization and, apart from the factor $\log^* R_T$ in (9.6.5),
is an improvement of these results of Bombieri and Bugeaud.
Remark The restriction $F(1, 0) \neq 0$ is not an essential one. Indeed, there is
an a ∈ Z with 1 ≤ a ≤ n such that $F(1, a) \neq 0$. Then one may take the binary
form G(X, Y) = F(X, aX + Y) instead of F(X, Y), in which the coefficient
of $X^n$ is $F(1, a) \neq 0$ and the logarithmic heights of the coefficients of G do
not exceed $(n + 1)(H + n \log n) + \log(n + 1)$.
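The substitution in the Remark is easy to carry out on coefficient vectors. Writing $F = \sum_k c_k X^{n-k} Y^k$, the binomial theorem gives the coefficient of $X^{n-j} Y^j$ in G(X, Y) = F(X, aX + Y) as $\sum_{k \geq j} c_k \binom{k}{j} a^{k-j}$, and for j = 0 this is F(1, a). The sketch below assumes integer coefficients and a non-zero form with F(1, 0) = 0; the function names and the sample form are illustrative.

```python
from math import comb

def eval_at_1(coeffs, a):
    """F(1, a) for F = sum_k coeffs[k] * X^(n-k) * Y^k."""
    return sum(c * a**k for k, c in enumerate(coeffs))

def shift_form(coeffs):
    """Return (a, coefficients of G(X, Y) = F(X, aX + Y)) for the least
    a in {1, ..., n} with F(1, a) != 0."""
    n = len(coeffs) - 1
    a = next(a for a in range(1, n + 1) if eval_at_1(coeffs, a) != 0)
    g = [sum(coeffs[k] * comb(k, j) * a**(k - j) for k in range(j, n + 1))
         for j in range(n + 1)]
    return a, g

# Hypothetical example: F = X^2 Y + X Y^2 = XY(X + Y), so F(1, 0) = 0.
a, g = shift_form([0, 1, 1, 0])
print(a, g)
```

For this sample one finds a = 1 and G = 2X³ + 3X²Y + XY² = X(X + Y)(2X + Y), whose leading coefficient is F(1, 1) = 2.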
For convenience, we give a common proof for Theorems 9.6.1 and 9.6.2.
We first prove Theorem 9.6.1 by means of Corollary 4.1.5. Proving a version of
Theorem 9.6.2, we could also use this corollary in the field M = K(α1 , α2 , α3 )
with the set of places V consisting of the set of places of M lying above the
places of S. However, we get a better bound by applying Theorem 4.1.3, where
one of the unknowns of the V -unit equation involved belongs to a finitely
generated subgroup of M ∗ which is much smaller than the group of V -units
in M.
Proof of Theorems 9.6.1 and 9.6.2. We shall use some basic facts from
Chapter 1 without any further mention.
In view of the above remark we may assume that $F(1, 0) \neq 0$ holds in
Theorem 9.6.1, too. Then, in the proof below of Theorem 9.6.1, one has to
work with $H_1 = (n + 1)(H + n \log n) + \log(n + 1)$ instead of H. Further, we
may assume that in both cases $\alpha_1, \alpha_2, \alpha_3$ are distinct zeros of F(X, 1) (in the
latter case, not necessarily in K). For i = 1, 2, 3, let $L_i := K(\alpha_i)$ with $L_1 = L$,
$h_{L_i}$, $R_{L_i}$ the class number and regulator of $L_i$, $T_i$ with $T_1 = T$ the set of places
of $L_i$ lying above those in S, $O_{T_i}$, $O_{T_i}^*$ the ring of $T_i$-integers and the group of
$T_i$-units in $L_i$, and
\[ Q_{L_i} := \prod_{\mathfrak{P}} N_{L_i}(\mathfrak{P}) \ \text{if } S \supsetneq M_K^{\infty} \quad \text{and} \quad Q_{L_i} := 1 \ \text{if } S = M_K^{\infty}, \]
where the product is taken over all prime ideals $\mathfrak{P}$ from $T_i$ and
$N_{L_i}(\mathfrak{P}) := |O_{L_i}/\mathfrak{P}|$ is the absolute norm of $\mathfrak{P}$.
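Let x, y be a solution of (9.6.3), and let $a_0 = F(1, 0)$. The number $a_0 \alpha_i$
is integral over $O_S$, and so it is in $O_{T_i}$ for i = 1, 2, 3. Thus $a_0(x - \alpha_i y)$ is
also in $O_{T_i}$, it divides $a_0^{n-1} F(x, y)$ and hence $a_0^{n-1} \delta$ in $O_{T_i}$, i = 1, 2, 3. By
Proposition 4.3.12 there is an $\varepsilon_i$ in $O_{T_i}^*$ such that, putting $\delta_i = \varepsilon_i a_0 (x - \alpha_i y)$
and using the fact that $\log N_S(a_0) \leq d\,h(a_0) \leq dH$, we have
\[ \begin{aligned} h(\delta_i) &\leq \frac{1}{[L_i : \mathbb{Q}]} \log N_{T_i}(a_0(x - \alpha_i y)) + 300 R_{L_i} \Bigl(\frac{nd}{2}\Bigr)^{nd} + \frac{h_{L_i}}{[L_i : \mathbb{Q}]} \log Q_{L_i} \\ &\leq (n - 1)H + \frac{1}{d} \log N_S(\delta) + 300 R_{L_i} \Bigl(\frac{nd}{2}\Bigr)^{nd} + \frac{h_{L_i}}{d} \log Q_K =: A_i \quad \text{for } i = 1, 2, 3. \end{aligned} \tag{9.6.6} \]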
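Substituting $x - \alpha_i y = \delta_i/(a_0 \varepsilon_i)$ into the identity
\[ (\alpha_3 - \alpha_2)(x - \alpha_1 y) + (\alpha_2 - \alpha_1)(x - \alpha_3 y) + (\alpha_1 - \alpha_3)(x - \alpha_2 y) = 0, \]
we infer that
\[ \tau \frac{\varepsilon_2}{\varepsilon_1} + \rho \frac{\varepsilon_2}{\varepsilon_3} = 1, \tag{9.6.7} \]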
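where
\[ \tau = \frac{\alpha_3 - \alpha_2}{\alpha_3 - \alpha_1} \cdot \frac{\delta_1}{\delta_2}, \qquad \rho = \frac{\alpha_2 - \alpha_1}{\alpha_3 - \alpha_1} \cdot \frac{\delta_3}{\delta_2}. \tag{9.6.8} \]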
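The passage from the three-term identity to the unit equation (9.6.7) is a purely formal computation, which can be confirmed with exact rational arithmetic. In the sketch below all numerical values are hypothetical stand-ins: rational numbers play the role of the zeros $\alpha_i$ and of the units $\varepsilon_i$, and $a_0 = 1$ is assumed, so that $\delta_i = \varepsilon_i (x - \alpha_i y)$.

```python
from fractions import Fraction as Fr

# Hypothetical sample data: three distinct "zeros", a point (x, y), a_0 = 1,
# and arbitrary non-zero stand-ins for the units eps_i.
alpha = [Fr(1), Fr(-2), Fr(5, 3)]
x, y = Fr(7), Fr(4)
eps = [Fr(3), Fr(-1, 2), Fr(5)]

a1, a2, a3 = alpha
lin = [x - a * y for a in alpha]                       # x - alpha_i y

# The identity used in the proof.
assert (a3 - a2) * lin[0] + (a2 - a1) * lin[2] + (a1 - a3) * lin[1] == 0

delta = [e * l for e, l in zip(eps, lin)]              # delta_i = eps_i (x - alpha_i y)
tau = (a3 - a2) / (a3 - a1) * delta[0] / delta[1]      # as in (9.6.8)
rho = (a2 - a1) / (a3 - a1) * delta[2] / delta[1]
lhs = tau * eps[1] / eps[0] + rho * eps[1] / eps[2]    # left-hand side of (9.6.7)
print(lhs)
```

The value of the left-hand side is 1 whatever non-zero values are chosen for the $\varepsilon_i$, since these cancel against the $\delta_i$.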
We shall give an upper bound for $h(\varepsilon_2/\varepsilon_1)$. First we must derive an upper
bound for h(τ) and h(ρ). The numbers $a_0 \alpha_i$ are zeros of the monic polynomial
$\widetilde{F}(X) := a_0^{n-1} F(X/a_0, 1)$. The maximum of the logarithmic heights of the
coefficients of $\widetilde{F}$ is at most nH. Then Corollary 1.9.6 and (1.9.6) give
\[ h(a_0 \alpha_i) \leq n^2 H + n \log 2 =: A_4 \quad \text{for } i = 1, 2, 3, \]
whence, using (9.6.6) and (9.6.8), it follows that
\[ \max(h(\tau), h(\rho)) < 4A_4 + 2 \log 2 + 2 \max_{1 \leq i \leq 3} A_i =: A_5. \tag{9.6.9} \]
We first prove Theorem 9.6.1 when $\alpha_1, \alpha_2, \alpha_3$ are in K. Then we must take
$H_1$ in place of H. Further, $\delta_i \in O_S$, $\varepsilon_i \in O_S^*$ and, instead of (9.6.6), we get
\[ h(\delta_i) < (n - 1)H_1 + \frac{1}{d} \log N_S(\delta) + 300 R_K \Bigl(\frac{d}{2}\Bigr)^{d} + \frac{h_K}{d} \log Q_K =: A_6 \quad \text{for } i = 1, 2, 3. \tag{9.6.10} \]
In this case it follows as in (9.6.9) that
\[ \max(h(\tau), h(\rho)) < 4A_4 + 2 \log 2 + 2A_6 =: A_7 \]
with $H_1$ instead of H in $A_4$. Since $\varepsilon_2/\varepsilon_1$, $\varepsilon_2/\varepsilon_3$ are S-units in K, we can apply
Corollary 4.1.5 to the S-unit equation (9.6.7) and we get
\[ h(\varepsilon_2/\varepsilon_1) < 6.5 c_1 c_2 (P_K/\log P_K) A_7 R_S \max(\log(c_1 P_K), \log^*(c_2 R_S)), \]
where $c_1 = 11\lambda s^2 (\log^* s)(16ed)^{3s+2}$ with λ = 12 if s = 2, λ = 1 if s ≥ 3, and
$c_2 = ((s - 1)!)^2/(2^{s-2} d^{s-1})$. But we have $m!\, e^m/m^m \leq e\sqrt{m}$ for any integer
m ≥ 1. Hence after some computation and simplification we obtain
\[ h(\varepsilon_2/\varepsilon_1) < 1.9 (64eds)^{2s+5} (P_K/\log P_K) N_K R_S \max(\log P_K, \log^* R_S) =: A_8, \tag{9.6.11} \]
where
\[ N_K := n^5 H + \frac{1}{d} \log N_S(\delta) + d^d R_K + \frac{h_K}{d} \log Q_K. \]
We now give an upper bound for h(x/y). Put $\kappa := (x - \alpha_1 y)/(x - \alpha_2 y)$.
Then $\kappa = (\delta_1/\delta_2)(\varepsilon_2/\varepsilon_1)$ and, by (9.6.10) and (9.6.11), we infer that
$h(\kappa) < 2A_6 + A_8 \leq 1.1 A_8$. But $x/y = (\kappa\alpha_2 - \alpha_1)/(\kappa - 1)$, hence we get
$h(x/y) < 3.3 A_8$. Finally, using $y^n F(x/y, 1) = \delta$, we get (9.6.4) for h(y). The bound for
h(x) follows in the same way.
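The elementary inequality $m!\, e^m/m^m \leq e\sqrt{m}$ used in the simplification leading to (9.6.11) can be spot-checked numerically on a logarithmic scale. The following sketch is a sanity check over an assumed finite range and not a proof; the function name and the tolerance are illustrative choices.

```python
from math import lgamma, log

def stirling_gap(m):
    """log(e * sqrt(m)) - log(m! * e^m / m^m); non-negative iff the inequality holds."""
    return (1 + 0.5 * log(m)) - (lgamma(m + 1) + m - m * log(m))

# Spot check for m = 1, ..., 10^4. A small tolerance absorbs rounding at m = 1,
# where the inequality is an equality.
worst = min(stirling_gap(m) for m in range(1, 10001))
print(worst >= -1e-9)
```

Working with logarithms avoids overflow of m! for large m. By Stirling's formula the gap tends to $1 - \tfrac{1}{2}\log(2\pi) > 0$ as m grows.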
Next we prove Theorem 9.6.2. Then $\alpha_1$ is not contained in K and we may
assume that $\alpha_2 = \sigma(\alpha_1)$ with some K-isomorphism σ. We recall that
$M = K(\alpha_1, \alpha_2, \alpha_3)$. Let V be the set of places of M lying above the places of S, and
$O_V$, $O_V^*$ the ring of V-integers and group of V-units in M. By Proposition 4.3.9
there exists in L a fundamental system $\{\xi_1, \ldots, \xi_{t-1}\}$ of T-units such that
\[ \prod_{j=1}^{t-1} h(\xi_j) \leq c_3 R_T, \]
where t = |T| ≤ sn and $c_3 = ((t - 1)!)^2/(2^{t-2} [L : \mathbb{Q}]^{t-1})$. Denote by Γ the
subgroup of $O_V^*$ generated by $\sigma(\xi_1)/\xi_1, \ldots, \sigma(\xi_{t-1})/\xi_{t-1}$. In this situation $\varepsilon_2$, $\delta_2$
above can be chosen so that $\varepsilon_2 = \sigma(\varepsilon_1)$ and $\delta_2 = \sigma(\delta_1)$. Then, in the
equation (9.6.7), $\varepsilon_2/\varepsilon_1 \in \Gamma$ and $\varepsilon_2/\varepsilon_3 \in O_V^*$. We apply now Theorem 4.1.3 to the
equation (9.6.7) under these conditions.
Set
\[ \Theta := \prod_{j=1}^{t-1} h(\sigma(\xi_j)/\xi_j). \]
Then by Theorem 4.1.3 we have
\[ h(\varepsilon_2/\varepsilon_1) < 6.5 c_4 v (P_M/\log P_M) A_5 \Theta \max(\log(c_4 v P_M), \log^* \Theta), \]
where v = |V| and
\[ c_4 = 11\lambda t (\log^* t)(16e[M : \mathbb{Q}])^{3t+2} \quad \text{with } \lambda = 12 \ \text{if } t = 1,\ \lambda = 1 \ \text{if } t \geq 2. \]
Further,
\[ [M : \mathbb{Q}] \leq dn(n - 1)(n - 2), \qquad v \leq sn(n - 1)(n - 2), \]
t ≤ sn, and it follows that
\[ \Theta \leq 2^{t-1} \prod_{j=1}^{t-1} h(\xi_j) \leq 2^{sn} c_3 R_T. \]
Using these inequalities and simplifying the bound so obtained for $h(\varepsilon_2/\varepsilon_1)$ we
infer as above in the proof of Theorem 9.6.1 that
\[ h(\varepsilon_2/\varepsilon_1) < 5.1 (64edn^2 s)^{2sn+4.5} P_M N_L R_T (\log^* R_T), \]
where
\[ N_L = n^2 H + \frac{1}{d} \log N_S(\delta) + (nd)^{nd} R_L + \frac{h_L}{d} \log Q_K. \]
Finally, we can derive the bound in (9.6.5) for h(x) and h(y) as at the end of the
proof of Theorem 9.6.1.
9.6.2 Decomposable form equations in an arbitrary number of unknowns
Let again $K$ be an algebraic number field, and $S$ a finite set of places of $K$
containing all infinite places. Consider now the general decomposable form
equation
\[ F(\mathbf{x}) = \delta \text{ in } \mathbf{x} = (x_1, \ldots, x_m) \in O_S^m \text{ with } l(\mathbf{x}) \ne 0 \text{ for } l \in \mathcal{L}, \tag{9.6.1} \]
where $O_S$ is the ring of $S$-integers of $K$, $\delta \in O_S \setminus \{0\}$, $F \in O_S[X_1, \ldots, X_m]$ is a
decomposable form of degree $n \ge 3$ and $\mathcal{L}$ is a finite set of non-zero linear forms
from $K[X_1, \ldots, X_m]$. In this subsection we prove effective finiteness results for
some important classes of equations of the form (9.6.1), including discriminant
form equations and certain norm form equations. In case of discriminant form
equations and norm form equations in an arbitrary number of unknowns the
first effective results were established in Győry (1976) and Győry and Papp
(1978), respectively.
The arguments of Section 9.3.2 show that equation (9.6.1) leads to systems
of unit equations in some finite extension of K. There are no general effective
results for unit equations in more than two unknowns, hence one cannot obtain
for (9.6.1) effective theorems in full generality. However, it will be seen that if
the linear factors of F possess appropriate connectedness properties, then one
can arrive at systems of unit equations consisting of equations in two unknowns
in which the equations have similar connectedness properties. Then one can
apply the effective results from Chapter 4 to the solutions of the arising unit
equations, and using the connectedness properties of these equations, one can
derive an effective upper bound for the heights of the solutions of (9.6.1). For
simplicity, we shall give the bounds explicitly in terms of S only. For completely
explicit bounds, we shall refer to some original papers.
Extending the ground field $K$ if necessary, we may assume that in (9.6.1)
$F$ factorizes into linear forms over $K$. These linear factors of $F$ are uniquely
determined over $K$ up to proportional factors from $K^*$. Fix a factorization of
$F$ into linear forms $l_1, \ldots, l_n$, and denote by $\mathcal{L}_0$ a maximal subset of pairwise
linearly independent linear factors of $F$. To obtain effective finiteness results
on equation (9.6.1), we make some assumptions on $\mathcal{L}_0$.
We denote by $\mathcal{G}(\mathcal{L}_0)$ the graph with vertex set $\mathcal{L}_0$ in which the edges are the
unordered pairs $\{l, l'\}$, where $l, l'$ are distinct elements of $\mathcal{L}_0$ with the property
that there exists a third linear form $l'' \in \mathcal{L}_0$ that is a $K$-linear combination of
$l, l'$. If $\mathcal{L}_0$ has at least three elements and $\mathcal{G}(\mathcal{L}_0)$ is connected, then $F$ is said to
be triangularly connected. In this case one can reduce equation (9.6.1) to a so-called
triangularly connected system of unit equations in two unknowns, and,
as a consequence, can give an effective upper bound for the heights of the
solutions of (9.6.1). The first effective result of this type was obtained in Győry
and Papp (1978) for $S = M_K^\infty$, and in Győry (1978/1979, 1980a) for arbitrary $S$.
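The definition of triangular connectedness can be tested mechanically in small cases. The following is only a minimal sketch under simplifying assumptions that are not part of the text: the ground field is taken to be Q, each linear form is given by its coefficient vector, and the input list is assumed to consist of pairwise non-proportional forms. The helper names `rank`, `triangle_graph` and `is_triangularly_connected` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank over Q of a list of coefficient vectors (Gaussian elimination)."""
    mat = [[Fraction(c) for c in row] for row in rows]
    ncols = len(mat[0]) if mat else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(mat)) if mat[i][col] != 0), None)
        if pivot is None:
            continue
        mat[r], mat[pivot] = mat[pivot], mat[r]
        for i in range(len(mat)):
            if i != r and mat[i][col] != 0:
                factor = mat[i][col] / mat[r][col]
                mat[i] = [a - factor * b for a, b in zip(mat[i], mat[r])]
        r += 1
    return r


def triangle_graph(forms):
    """Edges (i, j), i < j, such that a third form lies in the span of forms i, j.

    Assumes the forms are pairwise non-proportional, so that the span of two
    of them has dimension 2.
    """
    edges = set()
    for i, j in combinations(range(len(forms)), 2):
        for k in range(len(forms)):
            if k not in (i, j) and rank([forms[i], forms[j], forms[k]]) == 2:
                edges.add((i, j))
                break
    return edges


def is_triangularly_connected(forms):
    """True if there are at least three forms and the graph is connected."""
    if len(forms) < 3:
        return False
    edges = triangle_graph(forms)
    seen, stack = {0}, [0]
    while stack:
        v = stack.pop()
        for a, b in edges:
            if v in (a, b):
                w = b if v == a else a
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
    return len(seen) == len(forms)


# X1 - X2, X2 - X3, X1 - X3: the third form is the sum of the first two.
disc_forms = [(1, -1, 0), (0, 1, -1), (1, 0, -1)]
# X1, X2, X3: no form is a combination of two others, so there are no edges.
coord_forms = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

print(is_triangularly_connected(disc_forms))
print(is_triangularly_connected(coord_forms))
```

Over a general number field the same test applies with exact arithmetic in that field in place of `Fraction`.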
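When $\mathcal{G}(\mathcal{L}_0)$ is not connected, let $\mathcal{L}_{01}, \ldots, \mathcal{L}_{0k}$ denote the vertex sets
of the connected components of $\mathcal{G}(\mathcal{L}_0)$. If $k > 1$, we introduce the graph
$\mathcal{H}(\mathcal{L}_{01}, \ldots, \mathcal{L}_{0k})$ with vertex set $\{\mathcal{L}_{01}, \ldots, \mathcal{L}_{0k}\}$, in which the pair $\{\mathcal{L}_{0i}, \mathcal{L}_{0j}\}$
is an edge if there exists a non-zero linear form $l_{ij}$ which can be expressed
simultaneously as a $K$-linear combination of the forms in $\mathcal{L}_{0i}$ and in $\mathcal{L}_{0j}$. In
this case $l_{ij}$ can be chosen so that the total number of non-zero terms in both
representations
\[ l_{ij} = \sum_{l \in \mathcal{L}_{0i}} \lambda_l \cdot l = \sum_{l \in \mathcal{L}_{0j}} \lambda_l \cdot l \]
is minimal. We pick for each edge $\{\mathcal{L}_{0i}, \mathcal{L}_{0j}\}$ such an $l_{ij}$, and we denote by
$\mathcal{L}$ the union of $\mathcal{L}_0$ and the set of the $l_{ij}$ so chosen.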
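Whether two components are joined by an edge in this second graph is a question about the intersection of two linear spans, which can be decided by a dimension count. The sketch below is an illustration only, assuming the ground field is Q and forms are given as coefficient vectors; the function names and the sample forms are hypothetical and not taken from the text. For the sample data one can check by hand that the two triples are the connected components of the first graph, and that X3 lies in both spans.

```python
from fractions import Fraction


def rank(rows):
    """Rank over Q of a list of coefficient vectors (Gaussian elimination)."""
    mat = [[Fraction(c) for c in row] for row in rows]
    ncols = len(mat[0]) if mat else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(mat)) if mat[i][col] != 0), None)
        if pivot is None:
            continue
        mat[r], mat[pivot] = mat[pivot], mat[r]
        for i in range(len(mat)):
            if i != r and mat[i][col] != 0:
                factor = mat[i][col] / mat[r][col]
                mat[i] = [a - factor * b for a, b in zip(mat[i], mat[r])]
        r += 1
    return r


def spans_meet(forms_a, forms_b):
    """True if the spans of the two sets share a non-zero linear form.

    Uses dim(U ∩ W) = dim U + dim W - dim(U + W).
    """
    both = list(forms_a) + list(forms_b)
    return rank(forms_a) + rank(forms_b) - rank(both) > 0


# Component 1: X1 + X3, X1 + 2*X3, X1 + 3*X3 (all inside span{X1, X3}).
comp1 = [(1, 0, 1), (1, 0, 2), (1, 0, 3)]
# Component 2: X2 + X3, X2 + 2*X3, X2 - X3 (all inside span{X2, X3}).
comp2 = [(0, 1, 1), (0, 1, 2), (0, 1, -1)]

print(spans_meet(comp1, comp2))                # the spans share X3
print(spans_meet([(1, 0, 0)], [(0, 1, 0)]))    # span{X1} and span{X2} do not meet
```

Here a common form is X3 = (X1 + 2X3) - (X1 + X3) = (X2 + 2X3) - (X2 + X3), which uses two non-zero terms on each side.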
The following generalization was proved in Győry (1998) with an explicit
but weaker upper bound in terms of S. The improvement in S given below is
due to the use of the recent Theorem 4.1.7 in which the upper bound is better
in terms of S than in the other effective results concerning S-unit equations.
In the formulation of the next theorem we keep the notation of Subsection 9.6.1.
Namely, $s$ denotes the cardinality of $S$, $R_S$ the $S$-regulator of $K$,
$P_K$ the maximal norm and $Q_K$ the product of the norms of the prime ideals
in $S$ if $S \ne M_K^\infty$, and $P_K = 2$, $Q_K = 1$ if $S = M_K^\infty$. Further, let $d$ and $D_K$ be
the degree and discriminant of $K$, and $H$ an upper bound for the logarithmic
heights of the coefficients of $F$.
Theorem 9.6.3 Let $F \in O_S[X_1, \ldots, X_m]$ be a decomposable form of degree
$n$ that factors into linear forms over $K$ and satisfies the following conditions:
(i) the set $\mathcal{L}_0$ has rank $m$,
(ii) either $k = 1$ or $k > 1$ and the graph $\mathcal{H}(\mathcal{L}_{01}, \ldots, \mathcal{L}_{0k})$ is connected.
Then every solution $\mathbf{x} = (x_1, \ldots, x_m) \in O_S^m$ of (9.6.1) with $l(\mathbf{x}) \ne 0$ for all
$l \in \mathcal{L}$ if $k > 1$, satisfies
\[ \max_{1 \le i \le m} h(x_i) < c_5^s P_K (\log^* Q_K) R_S, \tag{9.6.12} \]
where $c_5$ is an effectively computable positive number which depends only on
$d$, $D_K$, $H$, $m$, $n$ and $h(\delta)$.
The improved dependence on S has applications in Corollaries 9.6.4 and
9.6.5; see also the Notes (Section 9.7). A completely explicit version of Theorem 9.6.3 can be found in Győry and Yu (2006). We mention that Theorem 9.6.3
is also applicable if F does not factor into linear forms over K but over a finite
extension, G, say of K: by applying the above theorem with G, T instead of
K, S where T is the set of places of G lying above those in S one obtains
an upper bound for h(xi ) (i = 1, . . . , m) like (9.6.12) but with K, S replaced
by G, T . In Győry (1998), another effective result on (9.6.1) has been derived
for decomposable forms $F$ satisfying conditions (i) and (ii) and with splitting
field $G \ne K$, which gives much better bounds if $G$ is large. This is important
for applications to norm form equations and discriminant form equations, see
Corollaries 9.6.6–9.6.8 below.
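Remark Theorem 9.6.3 implies that under the assumptions (i) and (ii), equation
(9.6.1) has only finitely many solutions, and all of them can be determined
effectively, at least in principle. We note that the finiteness of the number of
solutions in Theorem 9.6.3, and hence in Corollaries 9.6.6–9.6.8 below, follows
already from Corollary 9.1.2 in the more general case as well, when $K$ is
replaced by a finitely generated extension of $\mathbb{Q}$ and $O_S$ by a finitely generated
subring $A$ of $K$ over $\mathbb{Z}$. More precisely, the finiteness condition
(i') $\mathcal{L} \cap ([\mathcal{L}_1] \cap [\mathcal{L}_0 \setminus \mathcal{L}_1]) \ne \emptyset$ for every proper, non-empty subset $\mathcal{L}_1$ of $\mathcal{L}_0$,
with $\mathcal{L} = \mathcal{L}_0$ if $k = 1$,
of Corollary 9.1.2 is a consequence of the condition (ii) of Theorem 9.6.3.
Indeed, let $\mathcal{L}_1$ be a proper, non-empty subset of $\mathcal{L}_0$. First consider the case
when, in Theorem 9.6.3, $k = 1$. Since $\mathcal{G}(\mathcal{L}_0)$ is connected, there are $l \in \mathcal{L}_1$ and
$l' \in \mathcal{L}_0 \setminus \mathcal{L}_1$ such that $l, l'$ are connected by an edge in $\mathcal{G}(\mathcal{L}_0)$, i.e.,
$\lambda l + \lambda' l' + \lambda'' l'' = 0$ for some $l'' \in \mathcal{L}_0$ and non-zero $\lambda, \lambda', \lambda'' \in K$, which proves (i'). Next
assume that $k > 1$. If there is an $\mathcal{L}_{0i}$ with $1 \le i \le k$ such that $\mathcal{L}_{0i} \cap \mathcal{L}_1 \ne \emptyset$
and $\mathcal{L}_{0i} \cap (\mathcal{L}_0 \setminus \mathcal{L}_1) \ne \emptyset$, then (i') follows as in the case $k = 1$. Suppose that
every $\mathcal{L}_{0i}$ is contained either in $\mathcal{L}_1$ or in $\mathcal{L}_0 \setminus \mathcal{L}_1$. Since by assumption
$\mathcal{H}(\mathcal{L}_{01}, \ldots, \mathcal{L}_{0k})$ is connected, there is a pair $\mathcal{L}_{0i}, \mathcal{L}_{0j}$ which is an edge in $\mathcal{H}$ and
$\mathcal{L}_{0i}$ is in $\mathcal{L}_1$ and $\mathcal{L}_{0j}$ in $\mathcal{L}_0 \setminus \mathcal{L}_1$ or conversely. But there is a non-zero linear form $l_{ij}$
contained in $[\mathcal{L}_{0i}]$ and $[\mathcal{L}_{0j}]$ and hence in $[\mathcal{L}_1]$ and $[\mathcal{L}_0 \setminus \mathcal{L}_1]$, which yields (i').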
We now present some consequences of Theorem 9.6.3. We start with another
version of Theorem 9.6.1 which gives a better bound for the solutions of
equation (9.6.3) in terms of S. Consider again the equation
\[ F(x, y) = \delta \text{ in } x, y \in O_S, \tag{9.6.3} \]
where $F \in O_S[X, Y]$ is a binary form of degree $n \ge 3$ which factorizes into
linear factors over $K$ and at least three of these factors are pairwise
non-proportional. Further, let $\delta \in O_S \setminus \{0\}$ and $H$ an upper bound for the logarithmic
heights of the coefficients of F . It is easy to check that in this case F satisfies
the conditions (i), (ii) of Theorem 9.6.3 with m = 2, k = 1. Hence Theorem
9.6.3 implies the following.
Corollary 9.6.4 Under the above assumptions and notation, all solutions $x$,
$y$ of (9.6.3) satisfy
\[ \max(h(x), h(y)) < c_6^s P_K (\log^* Q_K) R_S, \]
where $c_6$ is an effectively computable positive number depending only on $d$,
$D_K$, $H$, $n$ and $h(\delta)$.
For an explicit value of c6 , we refer to Győry and Yu (2006).
The following consequence of Theorem 9.6.3 provides some information
about the arithmetical properties of decomposable forms at integral points with
coordinates in OK , where OK denotes the ring of integers of K. We denote
by ω(α) the number of distinct prime ideal divisors of α ∈ OK \ {0}, and by
$P(\alpha)$ the greatest of the norms of these prime ideals, with the convention that
$P(\alpha) = 1$ if $\alpha \in O_K^*$.
Corollary 9.6.5 Let $F \in O_K[X_1, \ldots, X_m]$ be a decomposable form as in
Theorem 9.6.3, and let $N_0$ be a positive integer. Further, let $\mathbf{x} = (x_1, \ldots, x_m) \in O_K^m$
be such that
\[ N_K((x_1, \ldots, x_m)) \le N_0, \quad F(\mathbf{x}) \ne 0, \quad l(\mathbf{x}) \ne 0 \text{ for } l \in \mathcal{L} \text{ if } k > 1. \]
Then
\[ P (\log P)^{\omega} > c_7 (\log N)^{c_8} \]
and
\[ P > \begin{cases} c_9 (\log N)^{c_{10}} & \text{if } \omega \le \log P / \log_2 P, \\ c_{11} (\log_2 N)(\log_3 N)/(\log_4 N) & \text{otherwise,} \end{cases} \]
provided that $N = \max_{1 \le i \le m} |N_{K/\mathbb{Q}}(x_i)| \ge N_1$, where $P = P(F(\mathbf{x}))$ and
$\omega = \omega(F(\mathbf{x}))$. Here $c_7, \ldots, c_{11}$ and $N_1$ are effectively computable positive numbers
which depend at most on $K$, $F$ and $N_0$.
The deduction of this corollary from Theorem 9.6.3 is straightforward; for
the details we refer to Győry and Yu (2006).
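To make the quantities P and ω in Corollary 9.6.5 concrete, here is a small sketch for one illustrative choice that is an assumption of this example, not of the text: K = Q (so prime ideals correspond to rational primes and their norms are the primes themselves) and the binary form F(X, Y) = XY(X + Y), which has three pairwise non-proportional linear factors over Q. The function names are hypothetical, and the code only evaluates P and ω at given points; it does not verify the inequalities of the corollary.

```python
def prime_factors(n):
    """Distinct prime factors of |n| by trial division (n must be non-zero)."""
    n = abs(n)
    primes = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def F(x, y):
    """Sample binary form X*Y*(X + Y) with three non-proportional linear factors."""
    return x * y * (x + y)


def omega_and_P(x, y):
    """Return (omega, P) for F(x, y): number of prime divisors and the largest one.

    Follows the convention P = 1 when F(x, y) is a unit, i.e. equals 1 or -1.
    """
    value = F(x, y)
    if value == 0:
        raise ValueError("F(x, y) must be non-zero")
    primes = prime_factors(value)
    return len(primes), (max(primes) if primes else 1)


for point in [(1, 1), (8, 1), (3, 5)]:
    print(point, F(*point), omega_and_P(*point))
```

For instance, F(8, 1) = 72 = 2^3 * 3^2, so ω = 2 and P = 3 at that point.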
An important special case of Corollary 9.6.5 is m = 2, k = 1 when F is
a binary form with splitting field K and with at least three pairwise
non-proportional linear factors. In this special case the corollary implies a similar
result for polynomials F (X) ∈ OK [X]. Corollary 9.6.5 is a generalization and
improvement of the corresponding results of Győry (1978/1979, 1981a), Haristoy (2003) and many earlier special lower estimates. It motivates the following.
Conjecture (Győry and Yu (2006)) Under the assumptions and notation of
Corollary 9.6.5,
\[ P > c_{12} (\log N)^{c_{13}} \text{ if } N \ge N_1 \]
holds, where $c_{12}$, $c_{13}$ and $N_1$ are effectively computable positive numbers
depending at most on $K$, $F$ and $N_0$.
Let now $L$ be an extension of $K$ of degree $n \ge 3$ and $\alpha_1 = 1, \alpha_2, \ldots, \alpha_m$
elements of $L$ with $m \ge 2$ which are linearly independent over $K$ and integral
over $O_S$. Consider the norm form equation
\[ N_{L/K}(\alpha_1 x_1 + \cdots + \alpha_m x_m) = \delta \text{ in } x_1, \ldots, x_m \in O_S, \tag{9.6.13} \]
where $\delta \in O_S \setminus \{0\}$. For $m = 2$, this is a Thue equation over $O_S$.
Corollary 9.6.6 Suppose that $\alpha_m$ is of degree $\ge 3$ over $K(\alpha_1, \ldots, \alpha_{m-1})$. Then
all solutions $(x_1, \ldots, x_m)$ of (9.6.13) with $x_m \ne 0$ satisfy
\[ \max_{1 \le i \le m} h(x_i) \le C_1, \]
where $C_1$ is an effectively computable positive number which depends only on
$K$, $L$, $S$, $m$, $n$, $\alpha_1, \ldots, \alpha_m$ and $\delta$.
This implies the following.
Corollary 9.6.7 Suppose that $\alpha_{i+1}$ is of degree $\ge 3$ over $K(\alpha_1, \ldots, \alpha_i)$ for
$i = 1, \ldots, m - 1$. Then every solution $(x_1, \ldots, x_m)$ of (9.6.13) satisfies
\[ \max_{1 \le i \le m} h(x_i) \le C_2, \]
where $C_2$ is an effectively computable positive number which depends only on
$K$, $L$, $S$, $m$, $n$, $\alpha_1, \ldots, \alpha_m$ and $\delta$.
For $S = M_K^\infty$, the first version of Corollary 9.6.7 was obtained in Győry
and Papp (1978). In case of arbitrary S, Corollaries 9.6.6 and 9.6.7 were first
proved by Győry (1981a, 1981b) and independently by Kotov (1981). The best
known, completely explicit upper bounds for the solutions of equation (9.6.13)
are given in Bugeaud and Győry (1996b) and Győry (1998).
Remark In Corollaries 9.6.6, 9.6.7 and hence in Theorem 9.6.3, the respective
assumptions $x_m \ne 0$ and $l(\mathbf{x}) \ne 0$ for $l \in \mathcal{L}$ cannot be dropped, and the lower
bound 3 for the degrees of $\alpha_i$ cannot be diminished in general. Indeed, let
$\alpha \in \overline{\mathbb{Q}}$ be of degree $\ge 3$ over $L_1 = \mathbb{Q}(\sqrt{2})$ and let $L_2 = \mathbb{Q}(\sqrt{2}, \alpha)$. Then the equations
\[ N_{L_1/\mathbb{Q}}(x_1 + \sqrt{2}\,x_2) = \pm 1 \quad \text{and} \quad N_{L_2/\mathbb{Q}}(x_1 + \sqrt{2}\,x_2 + \alpha x_3) = \pm 1 \]
have infinitely many integral solutions $(x_1, x_2, x_3)$ with $x_3 = 0$.
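The infinitude of solutions in the first equation of the Remark is easy to observe numerically, since the norm of x1 + √2 x2 from Q(√2) to Q is x1^2 - 2 x2^2. The sketch below, with hypothetical function names, lists the coefficients of the powers of 1 + √2 and evaluates the norm form on them; it is an illustration rather than a proof.

```python
def norm_form(x1, x2):
    """Norm of x1 + x2*sqrt(2) from Q(sqrt(2)) to Q."""
    return x1 * x1 - 2 * x2 * x2


def unit_power_coefficients(count):
    """Pairs (x1, x2) with x1 + x2*sqrt(2) = (1 + sqrt(2))**k for k = 1..count."""
    x1, x2 = 1, 1
    pairs = []
    for _ in range(count):
        pairs.append((x1, x2))
        # Multiply x1 + x2*sqrt(2) by 1 + sqrt(2).
        x1, x2 = x1 + 2 * x2, x1 + x2
    return pairs


for x1, x2 in unit_power_coefficients(5):
    print((x1, x2), norm_form(x1, x2))
```

Each pair printed has norm 1 or -1, and the coordinates grow without bound, so no height bound independent of the solution can hold once the restriction on the last coordinate is dropped.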
Let again $L$ be an extension of degree $n \ge 3$ of $K$ and $1, \alpha_1, \ldots, \alpha_m$
$K$-linearly independent elements of $L$, integral over $O_S$, such that
$L = K(\alpha_1, \ldots, \alpha_m)$, and let $l = X_0 + \alpha_1 X_1 + \cdots + \alpha_m X_m$. Denote by $\sigma_1, \ldots, \sigma_n$
the $K$-isomorphic embeddings of $L$ in $\overline{\mathbb{Q}}$, and define
$l^{(i)} := X_0 + \sigma_i(\alpha_1) X_1 + \cdots + \sigma_i(\alpha_m) X_m$ for $i = 1, \ldots, n$. Put
\[ D_{L/K}(\alpha_1 X_1 + \cdots + \alpha_m X_m) := \prod_{1 \le i < j \le n} \left( l^{(i)} - l^{(j)} \right)^2. \]
This is a decomposable form in $O_S[X_1, \ldots, X_m]$ of degree $n(n - 1)$, independent
of $X_0$. It is called a discriminant form. Consider now the discriminant
form equation
\[ D_{L/K}(\alpha_1 x_1 + \cdots + \alpha_m x_m) = \delta \text{ in } (x_1, \ldots, x_m) \in O_S^m, \tag{9.6.14} \]
where $\delta \in O_S \setminus \{0\}$.
Corollary 9.6.8 Under the above assumptions, all solutions $(x_1, \ldots, x_m)$ of
(9.6.14) satisfy
\[ \max_{1 \le i \le m} h(x_i) < C_3, \]
where $C_3$ is an effectively computable positive number which depends only on
$K$, $S$, $m$, $n$, $\alpha_1, \ldots, \alpha_m$ and $\delta$.
For K = Q, S = {∞}, the first version of this corollary was proved in
Győry (1976), and for arbitrary K and S, in Győry and Papp (1977) and Győry
(1981b). The best known, explicit version of Corollary 9.6.8 can be found in
Győry (1998).
Corollary 9.6.8 and its other versions have several applications, among others
to index form equations, algebraic integers of given discriminant or of given
index and power integral bases. Such results are treated in detail in our next
book on discriminant equations. Some related results are also briefly discussed
in the Notes (Section 9.7) and in Section 10.6.
We note that from Theorem 9.6.3 one could easily deduce in Corollaries 9.6.6–9.6.8 explicit bounds in terms of S. Moreover, combining the explicit
version of Theorem 9.6.3 from Győry and Yu (2006) with some arguments from
Győry (1998), one can give completely explicit versions of Corollaries 9.6.6–9.6.8
with slightly better upper bounds than those in Bugeaud and Győry
(1996b) and Győry (1998).
Finally, we observe that Corollary 9.6.5 is in particular applicable in the
case that F is a discriminant form, or a norm form like in Corollaries 9.6.6
and 9.6.7.
We give only a sketch of the proof of Theorem 9.6.3. For a detailed proof
we refer to Győry and Yu (2006).
Proof of Theorem 9.6.3 (sketch). We keep the above notation of Subsection 9.6.2. We shall denote by c14 , c15 , . . . , c44 effectively computable positive
numbers which depend at most on d, the class number hK and regulator RK of
K, and on H , m, n and h(δ). But by (1.5.2) and (1.5.3) hK , RK can be estimated
from above in terms of d and the discriminant DK of K. Hence we can replace
the dependence on hK and RK by DK .
We make some preliminary remarks. We show in two steps that equation (9.6.1)
can be written in the form
\[ l_1(\mathbf{x}) \cdots l_n(\mathbf{x}) = \delta \text{ in } \mathbf{x} \in O_S^m \text{ with } l(\mathbf{x}) \ne 0 \text{ for } l \in \mathcal{L}, \tag{9.6.15} \]
where, up to a proportional factor, $l_1 \cdots l_n$ is a factorization of $F$ into linear
forms in $X_1, \ldots, X_m$ with coefficients in $O_K$, the logarithmic heights of the
coefficients of $l_1, \ldots, l_n$ do not exceed $c_{14}$ and the new $\delta \in O_S \setminus \{0\}$ has height
$h(\delta) \le c_{15} \log^* Q_K$.
First we recall that as in (9.4.4), $F$ can be written as $c\,l_1 \cdots l_n$, where
$c \in O_S$, $c \ne 0$ and $l_i = X_{n_i} + \alpha_{n_i+1,i} X_{n_i+1} + \cdots + \alpha_{m,i} X_m$ with $\alpha_{j,i} \in K$ for
$i = 1, \ldots, n$, $j \in \{n_i + 1, \ldots, m\}$. Then by (1.9.5) and (1.9.6) we have
$h^{\mathrm{hom}}(F) \le c_{16}$ and, by Corollary 1.9.5, $h(l_i) = h^{\mathrm{hom}}(l_i) \le c_{17}$ for $i = 1, \ldots, n$. Thus the
maximum of the logarithmic heights of the coefficients of $l_i$ is at most $c_{18}$. Since
$c$ is a coefficient of $F$, we have $h(c) \le c_{19}$ which implies that the coefficients
of $c\,l_1$ have logarithmic heights at most $c_{20}$.
In the next step we multiply cl1 , l2 , . . . , ln and δ by the product of the
denominators of the coefficients of cl1 , l2 , . . . , ln . Then the logarithmic heights
of the new δ and the coefficients of the new linear factors, for simplicity denoted
again by l1 , . . . , ln , are at most c21 . Therefore our claim is proved.
Let now $\mathbf{x} \in O_S^m$ be a solution of equation (9.6.15) with $l(\mathbf{x}) \ne 0$ for $l \in \mathcal{L}$
if $k > 1$, and write
\[ l_i(\mathbf{x}) = \delta_i, \quad i = 1, \ldots, n. \tag{9.6.16} \]
Then $\delta_i$ is a divisor of $\delta$ in $O_S$ and so, by (1.9.2) and the above upper bound for
$h(\delta)$, we have $\log N_S(\delta_i) \le \log N_S(\delta) \le c_{22} h(\delta) \le c_{23}$. By Proposition 4.3.12
there is an $\varepsilon_i \in O_S^*$ such that
\[ h(\delta_i/\varepsilon_i) \le c_{24} \log^* Q_K, \quad i = 1, \ldots, n. \tag{9.6.17} \]
Let L0 be a maximal subset of pairwise linearly independent linear forms in
the set of new linear forms l1 , . . . , ln . Then the new L0 and its associated graph
G(L0 ) also satisfy the assumptions (i) and (ii) of the theorem. Let L01 , . . . , L0k
denote the vertex sets of the connected components of $\mathcal{G}(\mathcal{L}_0)$. First assume that
$k = 1$. Then by assumption (i), $\mathcal{G}(\mathcal{L}_0)$ is of order at least 3. If $\{l_i, l_j\}$ is an edge
in $\mathcal{G}(\mathcal{L}_0)$, then $\lambda_i l_i + \lambda_j l_j + \lambda l = 0$ for some $l \in \mathcal{L}_0$ and some non-zero $\lambda_i$, $\lambda_j$,
$\lambda$ in $K$ with logarithmic heights not exceeding $c_{25}$. Together with (9.6.17) this
yields an $S$-unit equation
\[ \tau_i \varepsilon_i + \tau_j \varepsilon_j + \tau \varepsilon = 0 \text{ in } \varepsilon_i, \varepsilon_j, \varepsilon \in O_S^*, \tag{9.6.18} \]
where the coefficients $\tau_i$, $\tau_j$, $\tau$ are non-zero elements of $K$ with logarithmic
height $\le c_{25} \log^* Q_K$. Now applying Theorem 4.1.7 to equation (9.6.18), we
infer that
\[ \max(h(\varepsilon_i/\varepsilon), h(\varepsilon_j/\varepsilon)) \le c_{26}^s P_K (\log^* Q_K) R_S \]
and so, by (9.6.16) and (9.6.17),
\[ \max(h(\delta_i/\varepsilon), h(\delta_j/\varepsilon)) \le c_{27}^s P_K (\log^* Q_K) R_S =: A. \tag{9.6.19} \]
If now $\{l_j, l_q\}$ is an edge in $\mathcal{G}(\mathcal{L}_0)$ then we deduce in the same way that there is
an $\varepsilon' \in O_S^*$ such that
\[ \max(h(\delta_j/\varepsilon'), h(\delta_q/\varepsilon')) \le A. \]
Together with (9.6.19) this implies $h(\varepsilon'/\varepsilon) \le 2A$, whence $h(\delta_q/\varepsilon) \le 3A$. Using
the assumption that $\mathcal{G}(\mathcal{L}_0)$ is connected and repeating the above procedure with
the shortest path connecting two vertices, we infer that $h(\delta_i/\varepsilon) \le c_{28} A$ for each
$i$ with $l_i \in \mathcal{L}_0$. Further, if $l_i$ ($1 \le i \le n$) does not belong to $\mathcal{L}_0$, then it is
proportional to a linear form $l_{i'} \in \mathcal{L}_0$, so $l_i = \rho l_{i'}$ with some non-zero $\rho \in K$
with $h(\rho) \le c_{29}$; hence $h(\delta_i/\varepsilon) \le c_{30} A$ for $i = 1, \ldots, n$. Together with (9.6.15)
this gives $h(\delta/\varepsilon^n) \le c_{31} A$. Thus
$h(\varepsilon) \le c_{32} A$, and so $h(\delta_i) \le c_{33} A$ for $i = 1, \ldots, n$. Considering (9.6.16) as a
system of linear equations in $\mathbf{x} = (x_1, \ldots, x_m)$ and using the assumption (i),
we infer by Cramer's Rule that
\[ h(x_t) \le c_{34} A \quad \text{for } t = 1, \ldots, m. \tag{9.6.20} \]
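The way the bound spreads along paths in the case k = 1 can be traced explicitly. Assuming only the standard height inequalities h(ab) ≤ h(a) + h(b) and h(1/a) = h(a), one step along an edge {l_j, l_q} turns a bound h(δ_j/ε) ≤ cA into h(ε'/ε) ≤ (c + 1)A and h(δ_q/ε) ≤ (c + 2)A, which for c = 1 gives the values 2A and 3A of the text. The sketch below is hypothetical bookkeeping, not the argument of the cited papers: it performs a breadth-first search from the initial edge and records, for each vertex, a multiplier c with h(δ/ε) ≤ cA. The graph is passed as an adjacency dictionary, and the function name is an assumption of this example.

```python
from collections import deque


def height_multipliers(adjacency, root_edge):
    """Multipliers c_v with h(delta_v / eps) <= c_v * A along shortest paths.

    Both ends of root_edge start with multiplier 1 (inequality (9.6.19));
    each further edge on a shortest path adds 2 to the multiplier.
    """
    i, j = root_edge
    mult = {i: 1, j: 1}
    queue = deque([i, j])
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            if w not in mult:
                mult[w] = mult[v] + 2
                queue.append(w)
    return mult


# A path 0 - 1 - 2 - 3, starting from the edge {0, 1}.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(height_multipliers(path_graph, (0, 1)))
```

Since a connected graph on at most n vertices has shortest paths of length at most n - 1, such multipliers stay bounded in terms of n alone, which is consistent with the constant c_28 depending only on the parameters listed at the start of the proof.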
Next consider the case when $k > 1$ and the graph $\mathcal{H}(\mathcal{L}_{01}, \ldots, \mathcal{L}_{0k})$ is connected.
For $j = 1, \ldots, k$, let $J_j$ denote the set of indices $i$ with $l_i \in \mathcal{L}_{0j}$. We
may assume without loss of generality that $\{\mathcal{L}_{01}, \mathcal{L}_{02}\}$ is an edge in this graph.
Then by assumption there is a non-zero $l_{1,2} \in \mathcal{L}$ which can be represented in
the form
\[ \sum_{i \in J_1} \lambda_i l_i = \sum_{i \in J_2} \lambda_i l_i \tag{9.6.21} \]
such that the total number of non-zero $\lambda_i \in K$ in both sides of (9.6.21) is minimal.
Then, up to a proportional factor, these $\lambda_i$ provide a uniquely determined
solution of (9.6.21) as a system of linear equations in $\lambda_i$ with $i \in J_1 \cup J_2$. One
can prove that there is a non-zero $\lambda_{1,2}$ in $K$ such that $\lambda_{1,2} l_{1,2}$ can be expressed
in the form (9.6.21) with non-zero $\lambda_i \in K$ for which $h(\lambda_i) \le c_{35}$.
As was seen above in the case $k = 1$, we have
\[ h(\delta_i/\varepsilon_1) \le c_{36} A \ \text{ for } i \in J_1 \quad \text{and} \quad h(\delta_i/\varepsilon_2) \le c_{37} A \ \text{ for } i \in J_2 \tag{9.6.22} \]
with some $\varepsilon_1, \varepsilon_2 \in O_S^*$. By (9.6.17), this also holds if $J_1$ or $J_2$ consists of a
single element. For the solution $\mathbf{x}$ considered above we deduce from (9.6.21)
and (9.6.22) that
\[ h(\lambda_{1,2} l_{1,2}(\mathbf{x})/\varepsilon_q) \le c_{38} A \quad \text{for } q = 1, 2. \]
But $l_{1,2}(\mathbf{x}) \ne 0$, hence it follows that $h(\varepsilon_2/\varepsilon_1) \le c_{39} A$, whence, by (9.6.22),
\[ h(\delta_i/\varepsilon_1) \le c_{40} A \quad \text{for } i \in J_1 \cup J_2. \]
Using the fact that the graph $\mathcal{H}(\mathcal{L}_{01}, \ldots, \mathcal{L}_{0k})$ is connected and repeating this
process with the shortest path connecting two vertices, we infer that
$h(\delta_i/\varepsilon_1) \le c_{41} A$ for each $i$ in $J_1 \cup \cdots \cup J_k$. It follows as above in the case $k = 1$ that
$h(\delta_i/\varepsilon_1) \le c_{42} A$ and so, in view of (9.6.15), $h(\delta_i) \le c_{43} A$ for $i = 1, \ldots, n$. We
now infer as in the case $k = 1$ that (9.6.20) holds with a $c_{44}$ in place of $c_{34}$ for
$t = 1, \ldots, m$, whence (9.6.12) follows.
Proof of Corollary 9.6.6. Put $M = K(\alpha_1, \ldots, \alpha_m)$, and denote by $\mathcal{L}_0$ the set
of the conjugates of the linear form $l = \alpha_1 X_1 + \cdots + \alpha_m X_m$ with respect
to $M/K$. By assumption $\alpha_1 = 1$, hence the forms in $\mathcal{L}_0$ are pairwise
non-proportional. They form a maximal subset of such forms in the set of linear
forms of $N_{L/K}(l)$. Partition the linear forms in $\mathcal{L}_0$ into subsets so that $l'$, $l''$ belong
to the same subset if the coefficients of $X_1, \ldots, X_{m-1}$ in $l'$, $l''$ coincide. Then we
get a partition $\mathcal{L}_{01}, \ldots, \mathcal{L}_{0k}$ with $k$ denoting the degree of $K(\alpha_1, \ldots, \alpha_{m-1})$ over
$K$, and it is easily seen that each of the graphs $\mathcal{G}(\mathcal{L}_{01}), \ldots, \mathcal{G}(\mathcal{L}_{0k})$ defined above
is connected. Further, $\mathcal{L}_0$ has the properties (i), (ii) from Theorem 9.6.3 with
$\mathcal{L} = \mathcal{L}_0 \cup \{X_m\}$. Considering now equation (9.6.13) over the normal closure,
say $G$, of $L$ over $K$, Theorem 9.6.3 applies to equation (9.6.13) and gives an
effective upper bound for the solutions in terms of $H$, $m$, $n$, $h(\delta)$, the degree $g$
and discriminant $D_G$ of $G$, and the parameters involved of $S_G$, that is, the set of
places of $G$ lying above those of $S$. But $H$ can be effectively bounded in terms of
$m$, $n$ and $h(\alpha_1), \ldots, h(\alpha_m)$. Further, using explicit estimates from Sections 1.4,
1.5 and 1.8, $g$, $|D_G|$ and the parameters mentioned can be effectively estimated
from above in terms of $S$ and the degrees and discriminants of $K$ and $L$. This
completes the proof.
Proof of Corollary 9.6.7. Let $(x_1, \ldots, x_m)$ be a solution of (9.6.13), and denote
by $m'$ the greatest integer with $x_{m'} \ne 0$. If $m' \ge 2$, Corollary 9.6.6 applies with
$m'$ instead of $m$, while for $m' = 1$ the assertion is trivial.
Proof of Corollary 9.6.8. Using the notation and assumptions of the corollary,
$L = K(\alpha_1, \ldots, \alpha_m)$ implies that the linear forms $l^{(1)}, \ldots, l^{(n)}$ are pairwise
non-proportional. Further, it follows from the linear independence of $1, \alpha_1, \ldots, \alpha_m$
over $K$ that there are indices $i_1, \ldots, i_{m+1}$ such that
$\operatorname{rank}\{l^{(i_1)}, \ldots, l^{(i_{m+1})}\} = m + 1$. Notice that the linear forms
$l_{ij} := l^{(i)} - l^{(j)}$ ($1 \le i, j \le n$) depend only on $X_1, \ldots, X_m$, and that
\[ D_{L/K}(\alpha_1 X_1 + \cdots + \alpha_m X_m) = (-1)^{n(n-1)/2} \prod_{\substack{1 \le i, j \le n \\ i \ne j}} l_{ij}. \]
Further, $\operatorname{rank}\{l_{i_1, i_{m+1}}, \ldots, l_{i_m, i_{m+1}}\} = m$. This means that $\operatorname{rank} \mathcal{L}_0 = m$, where
$\mathcal{L}_0$ denotes a maximal set of pairwise non-proportional linear factors of the
left-hand side. For distinct $u, v, w \in \{1, \ldots, n\}$ we have $l_{uv} + l_{vw} + l_{wu} = 0$.
It is easy to check that the graph $\mathcal{G}(\mathcal{L}_0)$ is connected, and hence Theorem 9.6.3
combined with some arguments from the end of the proof of Corollary 9.6.6
yields Corollary 9.6.8.
9.7 Notes
We make some historical notes and mention some refinements, applications
and generalizations of the results presented in this chapter.
• There are many papers on effective results for decomposable form equations.
The first effective upper bound for the solutions of Thue equations over Z
was established by Baker (1968b) by means of his effective estimates for
linear forms in logarithms. In the case of discriminant form and index form
equations, the first effective bounds for the solutions were given in Győry
(1976), and for the case of certain norm form and decomposable form equations, in Győry and Papp (1978). Their proofs also involved Baker’s method
but via Győry’s effective results on unit equations in two unknowns. Later, a
number of various effective results with explicit bounds and generalizations
were obtained on the equations mentioned; for results and references, see the
books and survey papers Győry (1980b, 2002), Shorey and Tijdeman (1986),
Evertse, Győry, Stewart and Tijdeman (1988b), Evertse and Győry (1988d),
Sprindžuk (1993) and Feldman and Nesterenko (1998). Practical algorithms
for solving concrete equations of these types were also worked out; see
de Weger (1989), Tzanakis and de Weger (1989), Bilu and Hanrot (1996,
1999), Smart (1998), Gaál (2002), our book on discriminant equations and
the references given there. All these results were established by Baker’s
method, many of them via unit equations. As was mentioned in Subsection 9.6.1, another method was developed and used in Bombieri (1993),
Bombieri and Cohen (1997, 2003) and Bugeaud (1998) to obtain effective
bounds for the solutions of Thue equations.
For the solutions of decomposable form equations, the best effective upper
bounds to date are given in Bugeaud and Győry (1996b), Bugeaud (1998),
Győry (1998), Győry and Yu (2006) and in Section 9.6 above. The effective
results concerning decomposable form equations have many applications
in Diophantine number theory and algebraic number theory. Several such
applications are treated in detail in our book on discriminant equations.
• Thue equations have many applications. We present here a classical application.
Let $f \in \mathbb{Z}[X]$ be a non-linear polynomial of degree $n$ and $m$ a given
integer $\ge 2$ and consider the equation
\[ f(x) = y^m \text{ in } x, y \in \mathbb{Z}. \tag{9.7.1} \]
An important special case is Mordell's equation $x^3 + k = y^2$, where $k$ is a
non-zero integer. Equation (9.7.1) is called an elliptic equation if $m = 2$,
$\deg f = 3$, a hyperelliptic equation if $m = 2$ and $\deg f \ge 3$ and a superelliptic
equation if $m \ge 3$ and $\deg f \ge 2$. The example of the Pell equation
$dx^2 + 1 = y^2$ shows that (9.7.1) may have infinitely many solutions if $m = 2$
and $\deg f = 2$.
Mordell (1922b, 1923) in the elliptic case and later Siegel (1926) in the
case that f has degree ≥ 3 and no multiple zeros, proved that (9.7.1) has only
finitely many solutions. LeVeque (1964) gave a general finiteness criterion
for equation (9.7.1). Their proofs are based on Thue’s and Siegel’s ineffective
finiteness theorems on Thue equations over Q resp. over number fields, hence
they are also ineffective.
Baker (1968b, 1968c, 1969) was the first to give effective upper bounds
for the solutions of (9.7.1) in the case when f has at least 3 simple zeros if
m = 2 and at least 2 simple zeros if m ≥ 3. We sketch the main steps of his
proof. Assume, for simplicity, that f is monic, and that α1 , α2 and, in the case
m = 2, α3 are simple zeros of f . Put Ki = Q(αi ) for i = 1, 2, 3. If (x, y)
is a solution of (9.7.1), then following Siegel’s argument one deduces that
$x - \alpha_i = \beta_i \sigma_i^m$, where $\beta_i$ is a non-zero element of $K_i$ with bounded height,
and $\sigma_i$ is an unknown integer in $K_i$ for $i = 1, 2, 3$. This implies that
\[ \beta_1 \sigma_1^m - \beta_2 \sigma_2^m = \alpha_2 - \alpha_1. \]
For m ≥ 3, this is a Thue equation over K1 K2 . In this case, Baker applied
his effective result concerning Thue equations over number fields to give an
effective upper bound for the heights of σ1 , σ2 and thereby for x and y. If
m = 2, we have a system of three equations
\[ \beta_i \sigma_i^2 - \beta_j \sigma_j^2 = \alpha_j - \alpha_i \quad (1 \le i < j \le 3). \]
Following Siegel (1926), Baker reduced this system to a single Thue equation
over an appropriate finite extension of K1 K2 K3 and applied his effective
result on Thue equations to the latter. We mention that alternatively one
can combine the above system into a single equation over K1 K2 K3 in three
unknowns σ1 , σ2 , σ3 ,
\[ \prod_{1 \le i < j \le 3} \left( \beta_i \sigma_i^2 - \beta_j \sigma_j^2 \right) = \prod_{1 \le i < j \le 3} (\alpha_j - \alpha_i), \]
where the left-hand side is a triangularly connected decomposable form in
σ1 , σ2 , σ3 , whose linear factors form a system of rank 3, and then apply
Theorem 9.6.3. We note that using here Theorem 9.6.1 or 9.6.2, or the
explicit version of Theorem 9.6.3 from Győry and Yu (2006) one can get
better bounds for the solutions x, y of equation (9.7.1).
Quantitative improvements and generalizations of Baker’s theorems were
later obtained by many authors, including Brindza (1984), who gave an
effective upper bound for the solutions x, y of (9.7.1) under LeVeque’s
general criterion. For practical methods for complete resolution of elliptic
and superelliptic equations, we refer to Gebel, Pethő and Zimmer (1994),
Stroeker and Tzanakis (1994), Bilu and Hanrot (1998) and Tzanakis (2013).
All these results and methods are based on the theory of logarithmic forms or
its elliptic analogue. Recently, Bérczes, Evertse and Győry (2014) proved an
effective finiteness result for hyper- and superelliptic equations over finitely
generated domains.
There are a couple of results on the number of solutions of hyper- and
superelliptic equations. We recall without proof the following special case
of a result of Evertse and Silverman (1986). Its proof takes as starting point
Evertse’s quantitative results on S-unit equations in two unknowns and the
Thue–Mahler equation (Evertse (1984a)) and follows the same lines as the
argument sketched above. Let $m$ be an integer $\ge 2$ and $f \in \mathbb{Z}[X]$ a polynomial
of degree $n$ and discriminant $D(f) \ne 0$. Let $\omega(D(f))$ denote the number
of primes dividing $D(f)$. Assume that $n \ge 3$ if $m = 2$ and $n \ge 2$ if $m \ge 3$.
Let K be a number field, containing three zeros of f if m = 2 and two zeros
of f if m ≥ 3, and denote by hm (K) the number of ideal classes of OK whose
m-th power is the principal ideal class. Then the number of solutions of
f (x) = y m
is at most
3
717n
(ω(D(f ))+1)
13
in x, y ∈ Z
h2 (K)2
2 n2 (ω(D(f ))+1)
(17 m )
(9.7.1)
if m = 2,
hm (K) if m ≥ 3.
A folklore conjecture asserts that $h_m(K) \ll_{m,[K:\mathbb{Q}],\varepsilon} |D_K|^{\varepsilon}$ for every
$\varepsilon > 0$, where $D_K$ denotes the discriminant of $K$ and the implied constant
depends only on $m$, $[K : \mathbb{Q}]$ and $\varepsilon$. By elementary estimates, one can estimate
$|D_K|$ from above by a power of $|D(f)|$. This leads to the conjecture
that equation (9.7.1) has $\ll_{m,n,\varepsilon} |D(f)|^{\varepsilon}$ solutions, for every $\varepsilon > 0$.
• The simplest discriminant equation is
\[ D_m(x) = \prod_{1\le i<j\le m} (x_i - x_j)^2 \in A^* \quad\text{in } x = (x_1,\dots,x_m) \in A^m, \tag{9.7.2} \]
where A is a subring of a field K of characteristic 0 which is integrally closed
and finitely generated over Z. The form
\[ D_m := \prod_{1\le i<j\le m} (X_i - X_j)^2, \]
called a decomposable form of discriminant type, is just the discriminant
of the polynomial $f(X) = (X - X_1)\cdots(X - X_m)$. Two solutions $x = (x_1,\dots,x_m)$, $x' = (x'_1,\dots,x'_m)$ of (9.7.2) are called A-equivalent if there
are $u \in A^*$, $a \in A$ such that $x'_i = u x_i + a$ for i = 1, . . . , m.
It is easily seen that the decomposable form Dm is triangularly connected.
Hence, in the number field case when A = OS , the ring of S-integers of a
number field K, Theorem 9.6.3 gives that the set of solutions of (9.7.2) is
the union of finitely many OS -equivalence classes of solutions which can be
effectively determined. A generalization for the finitely generated case and
other related results will be treated in detail in our next book on discriminant
equations.
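The notion of A-equivalence can be illustrated with a minimal sketch over A = Z, where the units are 1 and −1. The function names and the sample point below are illustrative choices, not taken from the text. Replacing each x_i by u·x_i + a multiplies every factor (x_i − x_j)^2 by u^2, so D_m is multiplied by u^{m(m−1)}, which is again a unit.

```python
from itertools import combinations

def disc_form(xs):
    """D_m(x): product of (x_i - x_j)^2 over all pairs i < j."""
    value = 1
    for xi, xj in combinations(xs, 2):
        value *= (xi - xj) ** 2
    return value

def equivalent_point(xs, u, a):
    """Image of x under x_i -> u*x_i + a; over Z the units are u = 1 and u = -1."""
    return tuple(u * xi + a for xi in xs)

x = (0, 1, 2)                     # sample point (hypothetical choice)
y = equivalent_point(x, -1, 7)    # hypothetical choice u = -1, a = 7
print(disc_form(x), disc_form(y))  # both values agree

# Over Z, D_2(x) is a unit exactly when x_1 and x_2 are consecutive integers.
print(disc_form((0, 1)))
```

In this toy case every solution of D_2(x) ∈ Z* is Z-equivalent to (0, 1), so there is a single equivalence class.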
• For applications, discriminant form equations and index form equations
belong to the most important classes of decomposable form equations. For a
detailed treatment of these equations and their applications we refer again to
our next book on discriminant equations. We mention here some basic facts
only about these equations. See also Section 10.6.
Let K be a field of characteristic 0, L an extension of K of degree n ≥ 2,
A a domain with quotient field K which is integrally closed in K and O an A-order, that is a subring of L containing A that as an A-module is free of rank n.
Let {ω1 = 1, ω2 , . . . , ωn } be an A-module basis of O. Define the linear form
$l := X_1 + \omega_2 X_2 + \cdots + \omega_n X_n$, let $l^{(1)} = l, l^{(2)}, \dots, l^{(n)}$ be the conjugates of
l over K, and define the discriminant form $D_{L/K}(\omega_2 X_2 + \cdots + \omega_n X_n)$ as in
Section 9.6, i.e., $\prod_{1\le i<j\le n} (l^{(i)} - l^{(j)})^2$. Then
\[ D_{L/K}(\omega_2 X_2 + \cdots + \omega_n X_n) = I(\omega_2 X_2 + \cdots + \omega_n X_n)^2 \cdot D_{L/K}(1, \omega_2, \dots, \omega_n), \]
where $I = I(\omega_2 X_2 + \cdots + \omega_n X_n)$ is a decomposable form in $A[X_2, \dots, X_n]$
of degree n(n − 1)/2 and $D_{L/K}(1, \omega_2, \dots, \omega_n)$ is the discriminant of
the basis {1, ω2 , . . . , ωn }. Using the finiteness result of Lang (1960) on unit
equations in two unknowns, it was proved in Győry (1982a, 1982b) that apart
from a proportional factor from A∗ and a translation of the form α → α + a,
a ∈ A, the equations
(i) $O = A[\alpha]$,
(ii) $D_{L/K}(\alpha) \in \delta A^*$,
(iii) $D_{L/K}(\omega_2 x_2 + \cdots + \omega_n x_n) \in \delta A^*$,
(iv) $I(\omega_2 x_2 + \cdots + \omega_n x_n) \in \delta A^*$,
where δ ∈ A \ {0}, have only finitely many solutions in α ∈ O resp. in
$x_2, \dots, x_n \in A$. Moreover, putting $\alpha = \sum_{i=1}^{n} \omega_i x_i$ with $x_1, \dots, x_n \in A$, one
can show that these equations are equivalent. In the special case K = Q,
A = Z, the quantity |I (ω2 x2 + · · · + ωn xn )| is just the index of the additive
group $\mathbb{Z}[\alpha]^+$ in $O^+$; therefore, the form I is called an index form. Effective
versions of the above finiteness assertions are given in Győry (1976, 1978b,
1981b) and Győry and Papp (1977) over number fields, and, in our next book
on discriminant equations, over finitely generated domains.
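As a sanity check on the definitions, the quadratic case n = 2 can be worked out by hand. This is only an illustrative sketch; the notation ω' for the conjugate of ω = ω_2 over K is an assumption introduced here.

```latex
\[
l^{(1)} = X_1 + \omega X_2, \qquad l^{(2)} = X_1 + \omega' X_2,
\]
\[
D_{L/K}(\omega X_2) = \bigl(l^{(1)} - l^{(2)}\bigr)^2 = (\omega - \omega')^2 X_2^2,
\qquad
D_{L/K}(1, \omega) = (\omega - \omega')^2,
\]
\[
\text{so that } I(\omega X_2) = \pm X_2, \text{ a form of degree } n(n-1)/2 = 1 .
\]
```

In this case equation (iv) simply says that x_2 ∈ δA^*, which makes the finiteness up to units and translations transparent.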
• Let K be a number field with ring of integers $O_K$, S a finite set of places
of K containing all infinite places, $\mathfrak{p}_1, \dots, \mathfrak{p}_s$ the prime ideals in S, and
$\mathfrak{p}_i^{h_K} = (\pi_i)$, where $h_K$ denotes the class number of K and $\pi_i \in O_K \setminus \{0\}$
which, by Proposition 4.3.12, can be chosen so that $h(\pi_i)$ is effectively
bounded, for i = 1, . . . , s. Let $F \in K[X_1, \dots, X_m]$ be a decomposable form,
and let δ ∈ K \ {0}. It is easy to see that the equation
\[ F(x) = \delta \quad\text{in } x = (x_1,\dots,x_m) \in O_S^m \tag{9.7.3} \]
leads to finitely many equations of the form
\[ F(x) = \delta' \pi_1^{z_1} \cdots \pi_s^{z_s} \quad\text{in } x = (x_1,\dots,x_m) \in O_K^m \text{ and } z_1,\dots,z_s \in \mathbb{Z}_{\ge 0}, \tag{9.7.4} \]
where δ' can take only finitely many and effectively determinable values
from K \ {0}. Conversely, any equation of the shape (9.7.4) can be reduced
to finitely many and effectively determinable equations of the form (9.7.3).
In our book, we considered equations in the form (9.7.3), but in the earlier
literature many results were formulated and proved for equations (9.7.4).
• Combining the proof of Theorem 9.6.3 with the effective results of
Chapter 8 on unit equations, Theorem 9.6.3 and its corollaries formulated
in Section 9.6 can be generalized for the case where the ground ring is an
arbitrary finitely generated integral domain over Z. Using the method presented in Chapter 8, in Bérczes, Evertse and Győry (2014) and in our next
book on discriminant equations effective finiteness results are obtained in a
more direct way concerning the solutions of Thue equations and discriminant
equations over finitely generated domains.
• Schmidt (1971) gave a finiteness criterion for decomposable form equations
over Z in m unknowns, F (x) = δ in $x \in \mathbb{Z}^m$, but he restricted himself to
decomposable forms F with the property that F (x) ≠ 0 for all non-zero
$x \in \mathbb{Z}^m$. Schmidt’s result was later extended by Schlickewei (1977d) to the
case of decomposable form equations over rings of S-integers in Q. We
note that the condition F (x) ≠ 0 for $x \in \mathbb{Z}^m \setminus \{0\}$ is independent of condition (i) of Theorem 9.1.1. For instance, the decomposable form F = X1 · · ·
Xm (a1 X1 + · · · + am Xm ) with a1 , . . . , am non-zero integers satisfies (i) with
L consisting of all subsums of a1 X1 + · · · + am Xm , but it certainly vanishes
at non-zero integral points.
• Let K be a number field, S a finite set of places of K of cardinality s containing
all infinite places, δ a non-zero element of $O_S$, and $F \in O_S[X, Y]$ a binary
form of degree n ≥ 3. It was proved in Evertse (1997) that the Thue–Mahler
equation
\[ F(x, y) \in \delta O_S^* \quad\text{in } x, y \in O_S \tag{9.7.5} \]
has at most $(5 \cdot 10^6 n)^{s+\omega_S(\delta)}$ $O_S^*$-cosets of solutions.
Erdős, Stewart and Tijdeman (1988) proved the following result, which
implies that Evertse’s bound cannot be replaced by a bound polynomial in s,
say. Let $p_1 = 2, p_2 = 3, \dots$ be the sequence of primes and n an integer ≥ 2.
Then for every ε > 0, there exists $t_0(n, \varepsilon)$ such that for every $t \ge t_0(n, \varepsilon)$
there is a polynomial f ∈ Z[X] of degree n with n distinct zeros in Q for
which the equation
\[ f(x) = p_1^{z_1} \cdots p_t^{z_t} \]
has at least $\exp\bigl((n^2 - \varepsilon)\, t^{1/n} (\log t)^{-(n-1)/n}\bigr)$ solutions in $x, z_1, \dots, z_t \in \mathbb{Z}$.
Moree and Stewart (1990) proved a similar result with f irreducible over Q.
On the other hand it turned out that, in a certain sense, most of the equations
of type (9.7.5) have much fewer solutions. Let K, S be as above. Two binary
forms F , G ∈ OS [X, Y ] are said to be GL(2, OS )-equivalent if
G(X, Y ) = εF (aX + bY, cX + dY )
for some ε ∈ OS∗ and a, b, c, d ∈ OS with ad − bc ∈ OS∗ . Obviously, the
number of OS∗ -cosets of solutions of (9.7.5) does not change when F is
replaced by a GL(2, OS )-equivalent form. Using the number field case of
Theorem 6.1.6 (see Evertse, Győry, Stewart and Tijdeman (1988a)), Evertse
and Győry (1989) proved the following: for every finite extension L of K
and every integer n ≥ 3 there are up to GL(2, OS )-equivalence only finitely
many binary forms F ∈ OS [X, Y ] of degree n with non-zero discriminant that
factorize into linear factors over L and for which equation (9.7.5) has more
than two OS∗ -cosets of solutions. Here the bound 2 is already best possible.
Further, the assertion does not remain valid without fixing the splitting field.
The proof of Evertse and Győry is ineffective in the sense that it does not allow
us to determine effectively a full system of representatives for the exceptional
equivalence classes. In Evertse and Győry (1989) the authors established
also an effective version, but with the bound 1 + s · min(m, n(n − 1)(n − 2))
instead of 2 where m := [L : K]. For a connection with inequalities involving
resultants of binary forms, see Section 10.9.
• Mahler (1933b) gave asymptotic formulas for the number of solutions of
Thue and Thue–Mahler inequalities, and these were later generalized to
decomposable form inequalities. We give an overview of the recent results.
We need the following notation. Let S = {∞, p1 , . . . , pt } be a finite set
of places of Q. Call a point x = (x1 , . . . , xm ) ∈ Zm S-primitive if
gcd(x1 , . . . , xm , p1 · · · pt ) = 1
(in the case S = {∞} this condition is void). Denote by μ = μ∞ the Lebesgue
measure on R normalized such that μ∞ ([0, 1]) = 1. For a prime number p,
denote by μp the Haar measure on Qp , normalized such that μp (Zp ) = 1.
Further, denote by $\mu_S$ the product measure $\prod_{p\in S} \mu_p$ on $\prod_{p\in S} \mathbb{Q}_p$ and, for
positive integers m, by $\mu_S^m$ the product measure on $\prod_{p\in S} \mathbb{Q}_p^m$. We call a point
$(x_p : p \in S) \in \prod_{p\in S} \mathbb{Q}_p^m$ S-primitive if $|x_p|_p = 1$ for p ∈ S \ {∞}, where
as usual we define $|x|_p := \max_i |x_i|_p$ for $x = (x_1, \dots, x_m)$.
For a decomposable form $F \in \mathbb{Z}[X_1, \dots, X_m]$, we denote by $N_{F,S}(k)$ the
number of solutions of the decomposable form inequality
\[ \prod_{p\in S} |F(x)|_p \le k \quad\text{in S-primitive } x \in \mathbb{Z}^m \]
and we write $N_F(k)$ for $N_{F,S}(k)$ if S = {∞}. Further, we define the set
\[ A_{F,S}(k) := \Bigl\{ (x_p : p \in S) \in \prod_{p\in S} \mathbb{Q}_p^m \;:\; \prod_{p\in S} |F(x_p)|_p \le k,\ (x_p : p \in S) \text{ S-primitive} \Bigr\}. \]
Then for k > 0 we have $\mu_S^m(A_{F,S}(k)) = \mu_{F,S}\, k^{m/n}$, where $\mu_{F,S} := \mu_S^m(A_{F,S}(1))$. In the case S = {∞} we write $\mu_F$ for $\mu_{F,S}$.
It is an obvious problem to compare $N_{F,S}(k)$ with $\mu_{F,S}\, k^{m/n}$. In the 1930s,
Mahler (1933b) proved that if F ∈ Z[X, Y ] is an irreducible binary form of
degree n ≥ 3, then
\[ |N_{F,S}(k) - \mu_{F,S}\, k^{2/n}| \ll_{F,S} k^{1/(n-1)} (\log k)^t \quad\text{as } k \to \infty, \]
where the implied constant depends on F and S. In his master’s thesis, de
Jong (1999) proved analogues of Mahler’s results for certain classes of norm
form inequalities. In the case S = {∞}, Thunder proved a more substantial
generalization of Mahler’s result to decomposable form inequalities, and
made Mahler’s result more precise. For the simplicity of our presentation,
we give slightly weaker versions of Thunder’s results.
Let $F \in \mathbb{Z}[X_1, \dots, X_m]$ be a decomposable form satisfying condition (i)
of Theorem 9.1.1 with L = L0 and also F (x) ≠ 0 for all $x \in \mathbb{Z}^m \setminus \{0\}$. This
condition is slightly stronger than the one imposed by Thunder. In his paper
Thunder (2001) proved that $\mu_F \ll_{m,n} 1$ and
\[ N_F(k) \ll_{m,n} k^{m/n} \quad\text{as } k \to \infty, \]
where the implied constants are effectively computable and depend only on
m, n and moreover,
\[ |N_F(k) - \mu_F\, k^{m/n}| \ll_F k^{m/(n+m^{-2})} \quad\text{as } k \to \infty, \tag{9.7.6} \]
where the implied constant is effectively computable and depends on F . In
the special case that gcd(m, n) = 1, Thunder (2005) obtained an estimate
similar to (9.7.6) with an effectively computable implied constant depending
only on m and n. Thunder’s arguments consisted of an application of the
Quantitative Subspace Theorem and geometry of numbers. It is still open to
prove an estimate like (9.7.6), with implicit constant depending only on m, n,
without the constraint gcd(m, n) = 1.
J. Liu (2015) obtained in his PhD-thesis generalizations of Thunder’s
results for arbitrary finite sets of places S. More precisely, he proved that
$\mu_{F,S} \ll_{m,n,S} 1$, $N_{F,S}(k) \ll_{m,n,S} k^{m/n}$ as k → ∞, and
\[ |N_{F,S}(k) - \mu_{F,S}\, k^{m/n}| \ll_{F,S} k^{m/(n+m^{-2})} \quad\text{as } k \to \infty, \]
where now all implied constants depend also on the primes in S. Further,
in the case that m and n are coprime, he obtained a similar estimate with
implicit constant depending only on m, n and the primes in S. Here again,
all implicit constants are effectively computable.
• Let A be an integral domain that is finitely generated over Z and K its quotient
field. Further, let P ∈ A[X] be a polynomial of degree n without multiple
zeros. Consider the resultant equation
\[ R(P, Q) = \delta \quad\text{in } Q \in A[X] \text{ with } \deg Q = m, \tag{9.7.7} \]
where R(P , Q) denotes the resultant of P and Q and where δ ∈ K \ {0}.
Writing $P = a_0 (X - \alpha_1) \cdots (X - \alpha_n)$ with distinct $\alpha_1, \dots, \alpha_n$ from a finite
extension of K and $Q = x_0 X^m + x_1 X^{m-1} + \cdots + x_m \in A[X]$, we have
\[ R(P, Q) = a_0^m \prod_{i=1}^{n} \bigl( x_0 \alpha_i^m + x_1 \alpha_i^{m-1} + \cdots + x_m \bigr). \]
Thus, (9.7.7) can be regarded as a decomposable form equation. By means
of an earlier version of Corollary 9.1.2 it was proved in Győry (1993b) that
this equation has only finitely many solutions if m < n/2 and this bound n/2
is in general sharp. This improved and generalized results of Wirsing (1971),
Schmidt (1973) and Schlickewei (1977e) obtained in the case K = Q, A = Z
or ZS , a ring of S-integers in Q. In the case when K is a number field and
A = OS is a ring of S-integers in K, a quantitative finiteness result from
Evertse (1995) on decomposable form equations was used in Győry (1994)
to derive the upper bound $(2^{34} n^2)^{m^3 s}$ for the number of solutions of (9.7.7),
where s = |S|.
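The product formula for R(P, Q) quoted above can be checked numerically on a small example. The following is only a sketch under simplifying assumptions: P is taken with integer zeros so that everything stays in exact arithmetic, the resultant is computed as the determinant of the Sylvester matrix, and all function names are illustrative.

```python
from fractions import Fraction

def poly_eval(coeffs, x):
    """Evaluate a polynomial (coefficients listed from highest degree) by Horner's rule."""
    acc = 0
    for c in coeffs:
        acc = acc * x + c
    return acc

def poly_from_roots(a0, roots):
    """Coefficients (highest degree first) of a0 * prod (X - r)."""
    coeffs = [a0]
    for r in roots:
        # multiply the current polynomial by (X - r)
        coeffs = [c1 - r * c0 for c1, c0 in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

def sylvester_resultant(p, q):
    """Resultant of p and q as the determinant of their Sylvester matrix."""
    n, m = len(p) - 1, len(q) - 1
    size = n + m
    rows = [[0] * i + list(p) + [0] * (m - 1 - i) for i in range(m)]
    rows += [[0] * i + list(q) + [0] * (n - 1 - i) for i in range(n)]
    mat = [[Fraction(v) for v in row] for row in rows]
    det = Fraction(1)
    for col in range(size):  # exact Gaussian elimination
        pivot = next((r for r in range(col, size) if mat[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            mat[col], mat[pivot] = mat[pivot], mat[col]
            det = -det
        det *= mat[col][col]
        for r in range(col + 1, size):
            factor = mat[r][col] / mat[col][col]
            for c in range(col, size):
                mat[r][c] -= factor * mat[col][c]
    return det

def product_formula(a0, roots, q):
    """a0^m * prod_i Q(alpha_i), with m = deg Q."""
    value = a0 ** (len(q) - 1)
    for r in roots:
        value *= poly_eval(q, r)
    return value

a0, roots = 2, [1, 2, -3]          # P = 2(X - 1)(X - 2)(X + 3), a hypothetical example
Q = [1, 5]                         # Q = X + 5, so m = 1 < n/2
print(sylvester_resultant(poly_from_roots(a0, roots), Q), product_formula(a0, roots, Q))
```

Both numbers printed agree, which reflects the fact that R(P, Q), viewed as a polynomial in the unknown coefficients x_0, . . . , x_m, factors into n linear forms.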
We note that in Sections 10.8, 10.9 other versions of (9.7.7) are considered,
where both P and Q are unknowns, but the splitting field of P · Q is fixed.
• The next application is concerned with irreducible polynomials. It gave an
affirmative answer to a problem of M. Szegedy. Let P ∈ Z[X] be a monic
polynomial of degree n without multiple zeros. Further, let p1 , . . . , ps be
distinct primes and denote by S the set of integers not divisible by primes
different from p1 , . . . , ps . It was proved in Győry (1994) that there are at
most $(2^{17} n)^{n^3 (s+1)/3}$ values a ∈ S for which P (X) + a is reducible over Q.
Indeed, if for some a ∈ S, $Q(X) = X^m + x_1 X^{m-1} + \cdots + x_m$ is a divisor
of degree m ≤ n/2 of P (X) + a in Z[X] and if $\alpha_1, \dots, \alpha_n$ denote the zeros
of P (X), then $(1, x_1, \dots, x_m)$ is a solution of the equation
\[ F(x_0, x_1, \dots, x_m) \in S \quad\text{in } (x_0, x_1, \dots, x_m) \in \mathbb{Z}^{m+1}, \tag{9.7.8} \]
where $F = X_0 \prod_{i=1}^{n} (\alpha_i^m X_0 + \alpha_i^{m-1} X_1 + \cdots + X_m)$ is a decomposable form
with coefficients in Z. Using an earlier version of Corollary 9.5.5 from
Evertse (1995), one can get an upper bound for the number of solutions of
(9.7.8), i.e. for the number of polynomials Q under consideration and the
assertion follows. From this it is easy to deduce that for any monic P ∈ Z[X]
of degree n there is an a ∈ Z with $|a| \le \exp\{(2^{17} n)^{n^3}\}$ for which P (X) + a
is irreducible over Q. It is an important feature of this bound that it depends
only on n.
• In the number field case when in (9.2) K is a number field and A is a ring of
S-integers in K, Győry (1993a) gave a criterion for (9.2) to have only finitely
many A∗ -cosets of solutions. Also in the number field case when Γ is a group
of S-units in K, Theorem 9.2.1 on unit equations is equivalent not only to
Theorem 9.1.1 on decomposable form equations, but also to the following
assertion. For any set of n + 2 distinct hyperplanes H0 , . . . , Hn+1 in Pn (K),
the set of S-integral points of Pn (K) \ (H0 ∪ · · · ∪ Hn+1 ) is contained in a
finite union of hyperplanes of Pn (K); see LeVesque and Waldschmidt (2011),
Ru and Wong (1991), Győry (1993b) and, for more general results, Vojta
(1987, 1996) and Levin (2008). Some refinements of Theorems 9.4.1 to 9.4.4
can be found in Győry (1993a) and Evertse and Győry (1997).
• Consider a decomposable form equation F (x) = ±δ over Z and its reformulation of the form (9.4.8) with K = Q, OS = Z. Assume that this equation
has infinitely many solutions in $x \in \mathbb{Z}^m$. Then the maximal rank r of its
families of solutions satisfies 1 ≤ r < ∞. Denote by P (N ) the number of
solutions $x = (x_1, \dots, x_m)$ with $\max_{1\le i\le m} |x_i| \le N$. In Everest and Győry
(1997) it was deduced from the case K = Q, OS = Z of Theorem 9.4.4 that
\[ P(N) = c_1 (\log N)^r + O\bigl((\log N)^{r-1}\bigr) \quad\text{as } N \to \infty, \]
where c1 is a positive number which depends only upon F and δ. See also
Győry and Pethő (1980) and Evertse and Győry (1997).
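The logarithmic growth of P(N) can be observed on a classical example that is not taken from the text: the norm form equation x^2 − 2y^2 = ±1, whose solutions are the pairs (±x_k, ±y_k) with x_k + y_k√2 = (1 + √2)^k, k ≥ 0. The sketch below assumes this description of the solutions, cross-checks it by brute force for a small N, and prints the ratio P(N)/log N; the function names are illustrative.

```python
import math

def count_solutions(N):
    """Number of (x, y) in Z^2 with x^2 - 2y^2 = +-1 and max(|x|, |y|) <= N."""
    count = 0
    x, y = 1, 0
    while max(x, y) <= N:
        count += 2 if y == 0 else 4    # account for the sign changes of x and y
        x, y = x + 2 * y, x + y         # multiply x + y*sqrt(2) by 1 + sqrt(2)
    return count

def brute_force(N):
    """Same count by exhaustive search, usable only for small N."""
    return sum(1 for x in range(-N, N + 1) for y in range(-N, N + 1)
               if abs(x * x - 2 * y * y) == 1)

for N in (10, 10**3, 10**6, 10**12):
    p = count_solutions(N)
    print(N, p, round(p / math.log(N), 3))
```

Here the solutions form a single family of rank r = 1, and the printed ratios stay bounded, in line with an asymptotic of the shape c_1 log N.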
• Let G be a finite abelian group, and let Z[G] denote the integral group
ring which consists of all formal expressions $\sum_{g\in G} x_g \cdot g$ with $x_g \in \mathbb{Z}$. Then
Z[G]∗ , the unit group of Z[G], is finitely generated. There is a considerable
interest in the units of Z[G]; see e.g. Karpilovsky (1988) and Sehgal (1978).
For $x = \sum_{g\in G} x_g \cdot g \in \mathbb{Z}[G]$, let $|x| := \max_{g\in G} |x_g|$, and let
\[ U_G(N) := |\{x \in \mathbb{Z}[G]^* : |x| \le N\}|. \]
Suppose that Z[G]∗ has rank r > 0. It was proved in Everest and Győry
(1997) as a special case of the above result concerning P (N ) that
\[ U_G(N) = c_2 (\log N)^r + O\bigl((\log N)^{r-1}\bigr) \quad\text{as } N \to \infty, \]
where c2 is a positive number which depends only on G.
• Let $F \in \mathbb{Z}[X_1, \dots, X_m]$ be a decomposable form of degree n in m ≥ 2
variables. For given c > 0, ν ≥ 0 consider the decomposable form inequality
\[ 0 < |F(x)| \le c\,|x|^{\nu} \quad\text{in } x = (x_1, \dots, x_m) \in \mathbb{Z}^m, \tag{9.7.9} \]
where $|x| := \max_{1\le i\le m} |x_i|$. By means of his Subspace Theorem Schmidt
(1973, 1980) proved that (9.7.9) has only finitely many solutions, provided
that
(i) n > 2(m − 1), ν < n − 2(m − 1) and the linear factors of F are in general position (i.e. any m of them are linearly independent over Q),
(ii) F is not divisible in Q[X1 , . . . , Xm ] by any form of degree less
than m.
This was extended by Schlickewei (1977e) to the case when the ground ring
is an arbitrary finitely generated subring of Q. These results have obvious
applications to decomposable form equations of the form
F (x) = G(x) ≠ 0,
where G ∈ Z[X1 , . . . , Xm ] is a non-zero polynomial of degree ν < n −
2(m − 1).
The above results were generalized in Győry and Ru (1998) for the number
field case, without assuming (ii). The proof involves Schmidt’s Subspace
Theorem with moving targets proved by Ru and Vojta (1997).
• As a generalization of decomposable form equations, several people studied
decomposable polynomial equations of the form
\[ F(x) = \delta \quad\text{in } x = (x_1, \dots, x_m) \in A^m, \tag{9.7.10} \]
where A is a subring of a finitely generated extension K of Q which is finitely
generated over Z, δ ∈ K \ {0} and
F ∈ K[X1 , . . . , Xm ]
is a decomposable polynomial, i.e., it factorizes into not necessarily homogeneous linear polynomials over a finite extension of K. In Evertse, Gaál and
Győry (1989) a finiteness criterion was given for equation (9.7.10). Later, in
the case K = Q, explicit upper bounds were derived in Bérczes and Győry
(2002) for the number of solutions, provided that this number is finite. Over
number fields, effective bounds were derived for the solutions in Sprindžuk
(1974) and Bilu (1995) for m = 2, and in Gaál (1984, 1985, 1986) for certain
norm polynomial and discriminant polynomial equations.
• Let $f_1, \dots, f_n$, G be non-zero polynomials in $K[X_1, \dots, X_m]$ (m ≥ 2),
where K is a number field. Let $F = f_1 \cdots f_n$, and assume that
\[ \deg F > m \max_{1\le i\le n} (\deg f_i) + \deg G. \]
Further, let $O_S$ be a ring of S-integers in K, and as a generalization of
decomposable polynomial equations, consider the equation
\[ F(x) = G(x) \quad\text{in } x \in O_S^m. \]
Let X be the hypersurface defined by F = G. It is proved in Corvaja and
Zannier (2004a) that under certain additional assumptions, $X \cap O_S^m$ is not
Zariski dense in X .
• Finally, we note that Győry (1983), Mason (1986a, 1986b, 1987, 1988) and
Gaál (1988a, 1988b) established effective results for various decomposable
form equations over function fields. Their proofs are based on some earlier
variants of results from Chapter 7 concerning unit equations. Gaál and Pohst
(2006a, 2006b, 2010) gave the complete resolution of some norm form
equations over certain function fields over a finite field.
10
Further applications
In the previous chapters several applications of unit equations were presented or
mentioned. Moreover, in Chapter 9 we showed that unit equations and decomposable form equations are in a certain sense equivalent, and using results
concerning unit equations we proved several general results for decomposable
form equations. Unit equations have, however, a great variety of other applications. In this chapter we briefly present some of these applications in their
simplest form, without aiming at completeness. We note that numerous further applications to discriminant equations are treated in our subsequent book
Discriminant Equations in Diophantine Number Theory.
The following topics are discussed: prime factors of sums of integers in
Section 10.1, representations of elements of integral domains as sums of units
in Section 10.2, lengths of finite orbits of polynomial maps on integral domains
in Section 10.3, divisibility properties of polynomials with few non-zero coefficients in Section 10.4, arithmetic graphs with applications to irreducibility
problems for polynomials in Section 10.5, discriminant equations and power
integral bases in number fields in Section 10.6, finiteness results for binary
forms of given discriminant in Section 10.7, equations involving resultants of
monic polynomials in Section 10.8, equations and inequalities involving resultants of binary forms in Section 10.9, Lang’s conjecture for tori in Section 10.10,
linear recurrence sequences and exponential-polynomial equations in Section
10.11, and finally algebraic independence results for values of lacunary power
series in Section 10.12.
10.1 Prime factors of sums of integers
We start with a simple application. Denote by ω(n) the number of distinct
prime factors of a positive integer n, and by P (n) the greatest prime factor of n.
Erdős and Turán (1934) proved that for any finite subset A of Z>0 with |A| ≥ 2,
\[ \omega\Bigl( \prod_{a, a' \in A} (a + a') \Bigr) > c_1 \log |A|, \]
where c1 denotes an effectively computable positive number. Further, they
conjectured, see Erdős (1976), that for every t there is a number c(t) so that if
A and B are finite subsets of Z>0 with |A| = |B| ≥ c(t) then
\[ \omega\Bigl( \prod_{a \in A,\, b \in B} (a + b) \Bigr) > t. \]
Using the result from Evertse (1984a) on the number of solutions of S-unit
equations in two unknowns, see also the Notes in Section 6.7, Győry, Stewart
and Tijdeman (1986) proved the conjecture in the following more general and
more precise form.
Theorem 10.1.1 There exists an effectively computable positive absolute constant c2 such that if A and B are any finite subsets of Z>0 with |A| ≥ |B| ≥ 2,
then
\[ \omega\Bigl( \prod_{a \in A,\, b \in B} (a + b) \Bigr) > c_2 \log |A|. \tag{10.1.1} \]
Since the n-th prime can be estimated from below by a constant times n log n,
Theorem 10.1.1 implies the following result.
Corollary 10.1.2 There exists an effectively computable positive absolute
constant c3 such that if A and B are any finite subsets of Z>0 with |A| ≥ |B| ≥ 2,
then there exist integers a ∈ A and b ∈ B for which
P (a + b) > c3 log |A| log log |A|.
(10.1.2)
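The step from Theorem 10.1.1 to Corollary 10.1.2 can be sketched as follows. This is a hedged outline rather than the authors' argument: p_t denotes the t-th prime, c is an absolute constant from the lower bound for p_t mentioned above, and |A| is assumed large enough for log log |A| to be positive, with small |A| absorbed into c_3.

```latex
\[
t := \omega\Bigl(\prod_{a\in A,\, b\in B}(a+b)\Bigr) > c_2 \log|A|
\quad\Longrightarrow\quad
\max_{a\in A,\, b\in B} P(a+b) \;\ge\; p_t \;\ge\; c\, t \log t \;\ge\; c_3 \log|A| \,\log\log|A| .
\]
```

The first inequality on the right holds because the product has t distinct prime factors, so its largest prime factor is at least the t-th prime, and that prime divides some single sum a + b.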
Erdős, Stewart and Tijdeman (1988) proved that (10.1.1) and (10.1.2) are
not far from being best possible. More precisely, they showed that there is a
positive number c4 such that for each integer k, with k ≥ 3, there exist subsets
A and B of Z>0 with k = |A| ≥ |B| ≥ 2 such that
\[ \omega\Bigl( \prod_{a \in A,\, b \in B} (a + b) \Bigr) < c_4 (\log |A|)^2 \log\log |A|. \]
Further, they obtained a similar result for $P\bigl( \prod_{a \in A,\, b \in B} (a + b) \bigr)$ as well.
Proof of Theorem 10.1.1. We deduce Theorem 10.1.1 from Corollary 6.1.5
concerning unit equations. It is enough to prove this theorem in the case when
|B| = 2. Let $a_1, \dots, a_k$ denote the elements of A and let $b_1, b_2$ be the elements
of B. Let $p_1, \dots, p_t$ be the primes which divide
\[ \prod_{i=1}^{k} \prod_{j=1}^{2} (a_i + b_j). \]
Each $a_i$ yields a solution $x = a_i + b_1$, $y = a_i + b_2$ of the equation
\[ x - y = b_1 - b_2 \]
in positive integers x, y composed only of the primes $p_1, \dots, p_t$.
By Corollary 6.1.5, there are at most $2^{8(2t+2)}$ such pairs $(a_i + b_1, a_i + b_2)$.
Hence $k \le 2^{16(t+1)}$, which gives t > c5 log k for some effectively computable
positive absolute constant c5 .
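For small sets the quantity in Theorem 10.1.1 can be computed directly. The following sketch uses trial division and an arbitrary example (A = {1, . . . , 32}, B = {1, 2}); the function names and the example are illustrative assumptions, not part of the proof.

```python
import math

def prime_factors(n):
    """Set of distinct prime factors of an integer n >= 1, by trial division."""
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def omega_of_sumset_product(A, B):
    """omega of the product of all a + b: size of the union of their prime factors."""
    primes = set()
    for a in A:
        for b in B:
            primes |= prime_factors(a + b)
    return len(primes)

A = range(1, 33)    # hypothetical example with |A| = 32
B = (1, 2)          # |B| = 2, the case to which the proof reduces
t = omega_of_sumset_product(A, B)
print(t, math.log(len(A)))
```

In this example the sums a + b run through 2, . . . , 34, so t is the number of primes up to 34, comfortably larger than log |A|. The theorem asserts that such a lower bound of order log |A| persists for arbitrary A and B.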
Győry, Sárközy and Stewart (1996) proved a multiplicative analogue of
(10.1.1) by showing that there exists an effectively computable positive number
c6 , such that if A and B are any finite subsets of Z>0 with |A| ≥ |B| ≥ 2, then
\[ \omega\Bigl( \prod_{a \in A,\, b \in B} (ab + 1) \Bigr) > c_6 \log |A|. \tag{10.1.3} \]
This implies a similar multiplicative analogue of (10.1.2). Further, they obtained
the following common generalization of (10.1.1) and (10.1.3).
Theorem 10.1.3 Let n ≥ 2 be an integer, and let A and B be ordered finite
subsets of $(\mathbb{Z}_{>0})^n$ with |A| ≥ |B| ≥ 2(n − 1) and with the following properties:
the n-th coordinate of each vector in A is equal to 1 and any n vectors in
B ∪ {(0, . . . , 0, 1)} are linearly independent. Then
\[ \omega\Bigl( \prod_{\substack{(a_1, \dots, a_n) \in A \\ (b_1, \dots, b_n) \in B}} (a_1 b_1 + \cdots + a_n b_n) \Bigr) > c_7 \log |A| \]
with an effectively computable positive number c7 depending only on n.
Note that (10.1.1) follows from Theorem 10.1.3 by taking n = 2 and b1 = 1
for all (b1 , b2 ) in B. Further, for n = 2, Theorem 10.1.3 gives (10.1.3) if b2 = 1
for each (b1 , b2 ) in B.
In Theorem 10.1.3, all assumptions are necessary. The proof of Theorem
10.1.3 depends on some finiteness results of Evertse and Győry (1988c) and
Evertse (1995) on decomposable form equations.
In their above-mentioned paper, Győry, Sárközy and Stewart formulated the conjecture that if a, b and c denote distinct positive integers and
max (a, b, c) → ∞, then
P ((ab + 1)(bc + 1)(ca + 1)) → ∞.
The conjecture was confirmed in stronger forms by Corvaja and Zannier (2003)
and, independently, by Hernández and Luca (2003). For further related results,
see Bugeaud and Luca (2004), Luca (2005) and Zannier (2012).
10.2 Additive unit representations in finitely generated
integral domains
Many people have investigated additive unit representations of elements in
various rings. A central problem is whether as a Z-module the ring of integers
of a number field or, more generally, a finitely generated integral domain of
characteristic 0 can be generated by its units. Further, if the answer is yes, how
many units are needed to represent the elements of the ring? Ashrafi and Vámos
(2005) proved that if K is a quadratic, a complex cubic or a cyclotomic number
field generated by a primitive $2^m$-th root of unity then there is no integer n ≥ 1
such that every integer in K can be represented as the sum of not more than n
units. Further, they conjectured that it holds true for all algebraic number fields
K. Jarden and Narkiewicz (2007) proved the conjecture in the following more
general situation.
Theorem 10.2.1 If A is a finitely generated integral domain of characteristic
0, then there is no integer n such that every element of A is a sum of at most n
units.
In particular, this holds for the rings of integers and the rings of S-integers
of number fields.
Theorem 10.2.1 is a consequence of the next theorem from Jarden and
Narkiewicz (2007) and a classical result of van der Waerden.
Theorem 10.2.2 If A is a finitely generated integral domain of characteristic
0 and n ≥ 1 is an integer then there exists a constant C1 (A, n), depending only
on A and n, such that every non-constant arithmetic progression in A having
more than C1 (A, n) elements contains an element which is not a sum of n units.
Theorem 10.2.2 is a special case of the following theorem which was established independently by Hajdu (2007). Let K be a field of characteristic 0 and
Γ a multiplicative subgroup of K ∗ of finite rank r. Further, let n ≥ 1 be an
integer, and $\mathcal{A}$ a non-empty finite subset of $K^n$ of cardinality t.
Theorem 10.2.3 There exists a constant C2 (r, n, t) depending only on r, n
and t such that the length of any non-constant arithmetic progression in the set
\[ \Bigl\{ \sum_{i=1}^{n} a_i s_i \;:\; (a_1, \dots, a_n) \in \mathcal{A},\ (s_1, \dots, s_n) \in \Gamma^n \Bigr\} \]
is at most C2 (r, n, t).
In the special case when A is a finitely generated integral domain of
zero characteristic, Γ = A∗ , t = 1 and $\mathcal{A} = \{(1, \dots, 1)\}$, Theorem 10.2.3 gives
Theorem 10.2.2. We recall that A∗ , i.e., the unit group of A, is finitely generated,
and hence of finite rank.
The proofs of Theorems 10.2.2 and 10.2.3 are both based on earlier versions
of Theorem 6.1.3 on unit equations and a result of van der Waerden (1927)
from Ramsey theory.
Later, Hajdu and Luca (2010) proved Theorem 10.2.3 with a completely
explicit value of C2 (r, n, t). Its proof depends only on Theorem 6.1.3, and
avoids the use of van der Waerden’s Theorem.
Below we sketch the proofs of Theorems 10.2.2 and 10.2.1. We shall use
the following version of van der Waerden’s Theorem.
Theorem 10.2.4 Let r, s be fixed positive integers. Then for any integer N
sufficiently large in terms of r, s the following holds: for any arithmetic progression P of length N of rational integers, and any splitting of P into r subsets,
at least one of these subsets contains an arithmetic progression of length s.
Proof. See van der Waerden (1927).
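The smallest non-trivial instance of Theorem 10.2.4 can be verified by exhaustive search. The sketch below takes r = 2 and s = 3 and represents a splitting of the progression {0, 1, . . . , N − 1} as a tuple of colours; this is no loss of generality, since an affine map sends arithmetic progressions to arithmetic progressions. The function names are illustrative, and the classical value N = 9 for this case is a standard fact not stated in the text.

```python
from itertools import product

def has_mono_ap(coloring, s):
    """True if some colour class of `coloring` contains an s-term arithmetic progression (s >= 2)."""
    N = len(coloring)
    for start in range(N):
        # largest difference d with start + (s - 1) * d <= N - 1
        for d in range(1, (N - 1 - start) // (s - 1) + 1):
            if all(coloring[start + i * d] == coloring[start] for i in range(s)):
                return True
    return False

def every_coloring_has_ap(N, r, s):
    """Check all r^N splittings of {0, ..., N-1} into r subsets."""
    return all(has_mono_ap(c, s) for c in product(range(r), repeat=N))

print(every_coloring_has_ap(8, 2, 3))   # some 2-colouring of 8 points avoids 3-term progressions
print(every_coloring_has_ap(9, 2, 3))   # with 9 points this is no longer possible
```

The search space grows like r^N, so this brute-force check is only feasible for very small parameters; the proofs in this section use the theorem purely as an existence statement.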
Proof of Theorem 10.2.2. We proceed by induction on n. Let first n = 1. Let
aj = a0 + (j − 1)δ, j = 1, . . . , N , be an arithmetic progression consisting of
units of A, where δ is a non-zero element of A. We have aj +1 − aj = δ for
j = 1, . . . , N − 1, hence an earlier version, due to Evertse and Győry (1988b),
of Theorem 6.1.3 implies that N is bounded by a number depending only
on A.
Next let n ≥ 1, and assume that the assertion holds with a constant C1 (A, k)
for each positive integer k not exceeding n. For δ ∈ A \ {0}, consider now a
finite arithmetic progression aj = a0 + (j − 1)δ in A, j = 1, . . . , N , each term
of which is a sum of n + 1 units from A. We show that N can be bounded above
by a number which depends only on A and n.
Denote by (δ) the set of all units u in A which appear in a proper representation of the form δ = u1 + · · · + um with m = 1, 2, . . . , 2n + 2, that is the
unit sum u1 + · · · + um has no vanishing subsum. Put (δ) = {x1 , . . . , xM }. It
follows from the above-mentioned result from Evertse and Győry (1988b) that
M is bounded above by a number C3 (A, n) which depends only on A and n.
We have
\[ a_j = \sum_{k=1}^{n+1} u_{k,j} \quad \text{for some } u_{k,j} \in A^*,\ j = 1, \dots, N, \]
whence
\[ \delta = a_{j+1} - a_j = \sum_{k=1}^{n+1} u_{k,j+1} - \sum_{k=1}^{n+1} u_{k,j}, \quad j = 1, \dots, N. \tag{10.2.1} \]
Cancel the possible vanishing subsums at the right-hand side of (10.2.1). Then,
for each j, at least one of the units in (10.2.1) belongs to (δ). We may assume
without loss of generality that for every j either u_{1,j} or u_{1,j+1} belongs to (δ).
For t = 1, 2, . . . , M, put
\[ X_t = \{ 1 \le j \le N : u_{1,j} = x_t \}, \quad Y_t = \{ 1 \le j \le N : u_{1,j+1} = x_t \}. \]
Then the set {1, 2, . . . , N} is the union of the sets Xt , Yt , t = 1, . . . , M. It
follows from van der Waerden’s Theorem stated above that at least one of the
sets X1 , . . . , XM , Y1 , . . . , YM contains an arithmetic progression P of length
T > C1 (A, n) if N is sufficiently large with respect to C1 (A, n) and C3 (A, n).
We may assume that X_1 has this property. Let d be the difference of P, and put
P = {n_1, . . . , n_T}, where n_i = n_1 + (i − 1)d for i = 1, . . . , T. Then one can
easily verify that a_{n_i} − x_1 = a_{n_1} − x_1 + (i − 1)dδ for i = 1, . . . , T, and hence
a_{n_1} − x_1, . . . , a_{n_T} − x_1 is an arithmetic progression of length > C1(A, n) in
A, each term of which is a sum of n units. This contradicts the induction
hypothesis. Thus N ≤ C4(A, n) where C4(A, n) depends only on A and n.
Proof of Theorem 10.2.1. Assume that every non-zero element of A can be
represented as the sum of at most n units, and let n be the smallest positive
integer with this property. Consider a sufficiently long arithmetic progression
aj = a0 + (j − 1)δ, j = 1, . . . , N
in A, where δ is a non-zero element of A. We follow the above argument. Let Xi ,
1 ≤ i ≤ n, be the set of those indices j ∈ {1, 2 . . . , N} for which aj is a proper
sum of i units from A. Then the set {1, 2 . . . , N} is the union of X1 , . . . , Xn . If
N is large enough then van der Waerden’s Theorem implies that one of the Xi ,
say X_k, contains a long arithmetic progression P = {n_1, . . . , n_T}. Then one can
see similarly as above that if N is sufficiently large then a_{n_1}, . . . , a_{n_T} is a long
arithmetic progression each term of which is the sum of k units, contradicting
Theorem 10.2.2. This proves Theorem 10.2.1.
We present some consequences of Theorems 10.2.2 and 10.2.3 as well as
some related results. For a finite set S of primes, denote by ZS the ring of
S-integers, and by Z∗S the group of S-units. Jarden and Narkiewicz proved the
following consequence of their Theorem 10.2.2.
Corollary 10.2.5 Let n ≥ 1 be an integer, and S a finite set of primes. Then
the set of positive integers which are sums of at most n elements of Z∗S has
density zero.
This follows from Theorem 10.2.2 applied with A = ZS and from
Szemerédi’s Theorem (Szemerédi (1975)) on arithmetic progressions.
In his above-mentioned paper, Hajdu deduced from his Theorem 10.2.3 and
from the theorem of Green and Tao (2008) about arithmetic progressions of
primes the following.
Corollary 10.2.6 Let n ≥ 1, S = {p_1, . . . , p_t} be a finite set of primes, U_S
the set of integers of the shape ±p_1^{z_1} · · · p_t^{z_t} with z_1, . . . , z_t ∈ Z≥0, and A a
non-empty finite subset of Z^n. Then there are infinitely many primes outside the
set
\[ \Big\{ \sum_{i=1}^{n} a_i s_i : (a_1, \dots, a_n) \in A,\ (s_1, \dots, s_n) \in U_S^n \Big\}. \]
For A = (1, . . . , 1) this gives the following
Corollary 10.2.7 Let n ≥ 1, S and U_S be as in Corollary 10.2.6. There are
infinitely many primes which are not the sum of n elements of U_S.
For n = 2, S = {2, 3}, this provided a negative answer to a question of
Pohst (oral communication) who asked whether every prime can be written in
the form 2^u ± 3^v with some non-negative integers u, v.
It is easy to see that there are number fields whose rings of integers cannot
be generated by their units. Such number fields are, for example, the imaginary
quadratic fields Q(√d) with squarefree integers d < −3. Jarden and
Narkiewicz (2007) formulated the following problem.
Problem Give a criterion for an algebraic extension of Q to have the property
that its ring of integers is generated by its units.
Jarden and Narkiewicz provided some examples of infinite algebraic extensions of Q having this property. For example, the fields of all algebraic numbers
and all real algebraic numbers are such fields. Further, by the Kronecker–Weber
Theorem the maximal abelian extension of Q also has the property mentioned.
In particular, the ring of integers of an abelian number field is generated by its
units. Besides these results the above problem has been solved for quadratic
number fields in Belcher (1974) and later in Ashrafi and Vámos (2005), for
pure cubic fields in Tichy and Ziegler (2007) and for pure quartic complex
fields in Filipin, Tichy and Ziegler (2008).
Answering affirmatively another problem of Jarden and Narkiewicz, Frei
(2012) proved that for any number field K, there exists a finite extension L of
K such that the ring of integers of L is generated by its units.
For further related results, we refer to Bertók (2013), Dombek, Hajdu and
Pethő (2014), the survey paper Barroero, Frei, Tichy (2011) and the references
given there.
10.3 Orbits of polynomial and rational maps
We start with some generalities. Let for the moment X be any non-empty set
and φ : X → X any map from X to itself (usually called self-map of X). We
denote by φ^(i) the i-th iterate of φ (φ applied i times) where we agree that φ^(0)
is the identity.
An orbit of φ is a sequence Oφ(a0) := {φ^(i)(a0)}_{i=0}^∞, where a0 ∈ X.
A cycle of φ is a sequence (a0 , . . . , am−1 ) in X, where a0 , . . . , am−1 are
distinct, ai = φ(ai−1 ) for i = 1, . . . , m − 1, and a0 = φ(am−1 ). We call m the
length of the cycle. In this case the orbit Oφ (a0 ) is periodic with period m. Any
a0 ∈ X that is the starting point of a cycle of φ of length m is called a periodic
point of φ of period m.
An orbit Oφ(a0) of φ is called finite if there are only finitely many distinct
elements among φ^(i)(a0) (i = 0, 1, 2, . . .). Suppose this is the case. Write a_i :=
φ^(i)(a0) for i ≥ 0. Then there exists l > 0 such that there is k with 0 ≤ k < l
and a_k = a_l. Take l with this property minimal and put m := l − k. Then
a_0, . . . , a_{k+m−1} are distinct, and a_{i+m} = a_i for i ≥ k. We express the orbit
Oφ(a0) conveniently as
\[ O_\phi(a_0) = (a_0, \dots, a_{k-1}, \overline{a_k, \dots, a_{k+m-1}}), \]
where the overline indicates that (a_k, . . . , a_{k+m−1}) is the recurring cycle of the
orbit. We call k + m the length of Oφ(a0), (a_0, . . . , a_{k−1}) the tail of Oφ(a0),
and (a_k, . . . , a_{k+m−1}) the cycle of Oφ(a0). Any a0 ∈ X for which Oφ(a0) is
finite is called a preperiodic point of φ. In particular, every periodic point is
preperiodic.
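The decomposition of a finite orbit into tail and cycle can be made concrete with a short routine. This is a minimal sketch, assuming the self-map is given as a Python function on hashable values; the name finite_orbit and the step limit max_steps are hypothetical, and the limit is needed because an orbit may be infinite.

```python
def finite_orbit(phi, a0, max_steps=100):
    """Iterate phi from a0 and return (tail, cycle) once a value repeats.

    Returns None if no repetition is found within max_steps iterations,
    which is inconclusive: the orbit may be infinite or merely longer.
    """
    first_index = {}  # value -> index i with a_i equal to that value
    orbit = []
    a = a0
    for _ in range(max_steps):
        if a in first_index:
            k = first_index[a]          # a_k = a_l with l = len(orbit) minimal
            return orbit[:k], orbit[k:]
        first_index[a] = len(orbit)
        orbit.append(a)
        a = phi(a)
    return None


# x -> x^2 - 2 on Z, starting at 0: 0 -> -2 -> 2 -> 2 -> ...
tail, cycle = finite_orbit(lambda x: x * x - 2, 0)
print(tail, cycle)                      # [0, -2] [2]
print(len(tail) + len(cycle))           # length k + m = 3
```

Here k = 2 and m = 1, so the point 0 is preperiodic but not periodic, while 2 is a periodic point of period 1.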
There is a vast literature on orbits of maps defined by polynomials on rings
(see for instance Narkiewicz (1995)) or of morphisms of algebraic varieties
(see, e.g., Silverman (2007)). Here, we restrict ourselves to certain aspects that
are closely related to unit equations. These concern bounding the lengths of
cycles and finite orbits in the cases that X = A is an integral domain and φ
is defined by a polynomial, or X is the one-dimensional projective line P1 (K)
over a field K and φ is a rational self-map of P1 (K).
Let A be an integral domain. A polynomial cycle, resp. finite polynomial
orbit in A is a cycle, resp. finite orbit of a map of the type x ↦ f(x) : A → A
where f ∈ A[X]. We sloppily say that it is a cycle or finite orbit of f. We
denote by f^(i) the i-th iterate of the map x ↦ f(x).
Notice that linear polynomials cX + d with c, d ∈ A, c ≠ 0 do not give rise
to finite orbits or cycles in A, unless c is a root of unity different from 1.
Narkiewicz (1989) proved that every polynomial cycle in Z has length at
most 2, and in Narkiewicz and Pezda (1997) it is shown that every finite
polynomial orbit in Z has length at most 4. Results from Narkiewicz (1989),
Pezda (1994) and Narkiewicz and Pezda (1997) imply that for a large class
of integral domains A the lengths of polynomial cycles and finite polynomial
orbits in A are uniformly bounded in terms of A. We define the following
quantities:
\[ N_1(A, b) := |\{ (x_1, x_2) \in A^* \times A^* : x_1 + x_2 = b \}| \quad (b \in A \setminus \{0\}), \]
\[ N_1(A) := \sup \{ N_1(A, b) : b \in A \setminus \{0\} \}. \]
The following result is part of Pezda (2014), Theorem 1. Its proof is based
on ideas from Narkiewicz (1989).
Theorem 10.3.1 Let A be an integral domain for which N1 (A) is finite. Then
every polynomial cycle in A has length at most 6(N_1(A) + 2)^2.
By Roquette’s Theorem (Roquette (1957)) (see also Proposition 9.3.1 in this
book), if A is an integral domain of characteristic 0 that is finitely generated
over Z, then its unit group A∗ is finitely generated. We consider more generally
integral domains of which the unit group has finite rank. If A is such a domain,
and A∗ has rank r, then N_1(A) ≤ 2^{16r+16} by Corollary 6.1.5. This leads at once
to the following corollary.
Corollary 10.3.2 Let A be an integral domain of characteristic 0 such that A∗
has finite rank r. Then every polynomial cycle of A has length at most 2^{32r+35}.
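The step from Theorem 10.3.1 to Corollary 10.3.2 is pure arithmetic: one substitutes the bound 2^{16r+16} for N_1(A) into 6(N_1(A) + 2)^2. The following sketch checks the resulting inequality with exact integers for a range of ranks; the function name is hypothetical and the check is a sanity test, not a proof for all r.

```python
def corollary_10_3_2_holds(r):
    """Check 6 * (2^(16r+16) + 2)^2 <= 2^(32r+35) with exact integers."""
    n1_bound = 2 ** (16 * r + 16)          # bound on N_1(A) from Corollary 6.1.5
    cycle_bound = 6 * (n1_bound + 2) ** 2  # bound of Theorem 10.3.1
    return cycle_bound <= 2 ** (32 * r + 35)


print(all(corollary_10_3_2_holds(r) for r in range(200)))  # True
```

In general, (2^{16r+16} + 2)^2 is only slightly larger than 2^{32r+32}, and 6 < 8 = 2^3 leaves enough room to absorb the difference.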
In the proof of Theorem 10.3.1 we need the following simple lemma.
Lemma 10.3.3 Let g ∈ A[X] and a, b ∈ A with a ≠ b. Then a − b divides
g(a) − g(b).
Proof. Use the fact that (g(X) − g(a))/(X − a) ∈ A[X].
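For A = Z the lemma is easy to test numerically. The sketch below is an illustration under that assumption, with a hypothetical sample polynomial; it evaluates g by Horner's rule and checks the divisibility for all pairs in a small range.

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial given by integer coefficients, highest degree first."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value


def lemma_holds(coeffs, a, b):
    """Check that a - b divides g(a) - g(b) in Z (requires a != b)."""
    return (evaluate(coeffs, a) - evaluate(coeffs, b)) % (a - b) == 0


g = [3, 0, -2, 7]  # g(X) = 3X^3 - 2X + 7, a sample polynomial in Z[X]
print(all(lemma_holds(g, a, b)
          for a in range(-10, 11) for b in range(-10, 11) if a != b))  # True
```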
Proof of Theorem 10.3.1. Notice that if (a_0, a_1, . . . , a_{m−1}) (with m ≥ 2) is a
cycle in A of f ∈ A[X], then
\[ \Big( 0,\ 1,\ \frac{a_2 - a_0}{a_1 - a_0},\ \dots,\ \frac{a_{m-1} - a_0}{a_1 - a_0} \Big) \]
is a cycle of
\[ g(X) := (a_1 - a_0)^{-1} \big( f((a_1 - a_0)X + a_0) - a_0 \big), \]
which is a polynomial in A[X]. So
there is no loss of generality to consider only polynomial cycles starting with
0, 1.
Let (a0 = 0, a1 = 1, a2 , . . . , am−1 ) be such a cycle, say of f ∈ A[X], and
assume without loss of generality that m ≥ 6.
Let V be the set of integers i ∈ {0, . . . , m − 1} that are coprime with m(m −
1)/2. Let p_1, . . . , p_t be the distinct primes dividing m, with p_1 < · · · < p_t.
Then V consists precisely of the integers i ∈ {0, . . . , m − 1} such that i ≢
0, 2 (mod p_j) for j = 1, . . . , t. Now the Chinese Remainder Theorem and a
very generous estimate yield
\[ |V| = \begin{cases} m \cdot \prod_{j=1}^{t} \big( 1 - 2p_j^{-1} \big) & \text{if } p_1 > 2, \\[4pt] m \cdot \frac{1}{2} \prod_{j=2}^{t} \big( 1 - 2p_j^{-1} \big) & \text{if } p_1 = 2 \end{cases} \ \ \ge\ \sqrt{m/6}. \tag{10.3.1} \]
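Both the counting formula and the estimate in (10.3.1) can be checked numerically. The sketch below is an illustration only, with hypothetical function names. It assumes that V is described by the congruence conditions stated above, that is, i and i − 2 are both coprime to m, and compares a direct count with the product formula and with the lower bound in the squared form 6|V|^2 ≥ m.

```python
from fractions import Fraction
from math import gcd


def prime_divisors(m):
    """Distinct prime divisors of m, in increasing order."""
    primes, p = [], 2
    while p * p <= m:
        if m % p == 0:
            primes.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        primes.append(m)
    return primes


def count_v(m):
    """Number of i in {0, ..., m-1} with i and i - 2 both coprime to m."""
    return sum(1 for i in range(m) if gcd(i, m) == 1 and gcd(i - 2, m) == 1)


def formula_v(m):
    """Right-hand product of (10.3.1): factor 1/2 for p = 2, 1 - 2/p otherwise."""
    value = Fraction(m)
    for p in prime_divisors(m):
        value *= Fraction(1, 2) if p == 2 else 1 - Fraction(2, p)
    return value


print(count_v(30), formula_v(30))  # 3 3
print(all(count_v(m) == formula_v(m) and 6 * count_v(m) ** 2 >= m
          for m in range(1, 500)))  # True
```

The case m = 6 gives |V| = 1 and equality in the lower bound, which suggests that the constant 6 cannot be improved with this argument.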
Given an integer k with 1 ≤ k ≤ m − 1, we easily see, by repeatedly applying Lemma 10.3.3 with g = f^(k), that a_k divides a_{tk} − a_{(t−1)k} for t = 1, 2, . . . .
This implies that a_k divides a_{tk} for t = 1, 2, . . . . Let i ∈ V with i ≥ 3. There are
k, l ∈ Z such that ik = 1 + lm, and so a_i divides a_{1+lm} = 1. Hence a_i ∈ A∗.
Likewise, a_{i−2} ∈ A∗. Further, by applying Lemma 10.3.3 with g = f^(2),
g = f^(m−2), respectively, we deduce that a_{i−2} divides a_i − a_2 and a_i − a_2
divides a_{i−2}, that is, a_i − a_2 ∈ A∗. It follows that (a_i, a_2 − a_i) (i ∈ V, i ≥ 3)
are all solutions to the unit equation
x_1 + x_2 = a_2 in x_1, x_2 ∈ A∗.
Hence |V| ≤ N_1(A) + 2. Together with (10.3.1) this implies m ≤ 6(N_1(A) +
2)^2. This proves Theorem 10.3.1.
Pezda (1994) proved the following, by a totally different, local method,
independent of unit equations. Let K be a field of characteristic 0 with discrete valuation v and A = {x ∈ K : v(x) ≥ 0} the associated discrete valuation
domain. Assume that the residue class field of A is finite, say with p^f elements,
where p is a prime number. Then every polynomial cycle of A has length at
most
\[ p^f (p^f - 1)\, p^{1 + \log v(p)/\log 2}. \]
By applying this with K a number field of degree d and v the discrete valuation
corresponding to a prime ideal of the ring of integers OK of K lying above 2,
one obtains that every polynomial cycle of OK has length at most
\[ 2^{d+1} (2^d - 1). \]
This bound is comparable with the bound of Corollary 10.3.2 in that it is exponential in rank O_K^*. In his Ph.D. thesis, Zieve (1996) proved various extensions
of Pezda’s result.
We now consider the finite polynomial orbits of an integral domain A.
Denote by B(A) the supremum of the lengths of the polynomial cycles of A.
Narkiewicz and Pezda (1997), Theorem 1 proved that if B(A) is finite and
if moreover the number of non-degenerate solutions of x1 + x2 + x3 = 1 in
x1 , x2 , x3 ∈ A∗ is finite, say C(A), then every finite polynomial orbit of A has
length at most
1
B(A)(31
3
+ C(A)) − 1.
We prove a variation on this result with a simpler proof. Define
\[ N_2(A, b) := |\{ (x_1, x_2) \in A^* \times A^* : (1 + x_1)(1 + x_2) = b \}| \quad (b \in A \setminus \{0, 1\}), \]
\[ N_2(A) := \sup \{ N_2(A, b) : b \in A \setminus \{0, 1\} \}. \]
Theorem 10.3.4 Let A be an integral domain for which both B(A) and N2 (A)
are finite. Then every finite polynomial orbit of A has length at most
B(A)(2N2 (A) + 5).
This has the following consequence for integral domains of characteristic 0
with unit group of finite rank.
Corollary 10.3.5 Let A be an integral domain of characteristic 0 such that
A∗ has finite rank r. Then every finite polynomial orbit of A has length at most
2^{1600(r+5)}.
Proof. The equation (1 + x_1)(1 + x_2) = b in x_1, x_2 ∈ A∗ (with b ∈ A, b ≠
0, 1) can be rewritten as a three term unit equation
x_1 + x_2 + x_1x_2 = b − 1.
Since x_1, x_2 ≠ −1, there can be no solutions with x_1 + x_1x_2 = 0 or x_2 + x_1x_2 =
0. Further, there are at most two solutions with x_1 + x_2 = 0. So apart from at
most two solutions, each proper subsum of the left-hand side is non-zero. Now
by applying the bound (6.1.4) of Amoroso and Viada with n = 3, we infer
N_2(A) ≤ 2 + 24^{324(r+4)}.
Together with the upper bound for B(A) from Corollary 10.3.2 this implies
Corollary 10.3.5.
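The final step of this proof is again pure arithmetic and can be checked with exact integers. The sketch below is a sanity test rather than a proof for all r, and the function name is hypothetical. It inserts the bounds B(A) ≤ 2^{32r+35} and N_2(A) ≤ 2 + 24^{324(r+4)} into the bound B(A)(2N_2(A) + 5) of Theorem 10.3.4.

```python
def corollary_10_3_5_holds(r):
    """Check B * (2 * N2 + 5) <= 2^(1600(r+5)) for the stated bounds on B and N2."""
    b_bound = 2 ** (32 * r + 35)            # bound on B(A) from Corollary 10.3.2
    n2_bound = 2 + 24 ** (324 * (r + 4))    # bound on N_2(A) from the proof above
    orbit_bound = b_bound * (2 * n2_bound + 5)
    return orbit_bound <= 2 ** (1600 * (r + 5))


print(all(corollary_10_3_5_holds(r) for r in range(40)))  # True
```

Since 24^{324} is roughly 2^{1485.5}, the left-hand side grows like 2^{1517.6 r} while the right-hand side grows like 2^{1600 r}, so the margin widens as r increases.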
Proof of Theorem 10.3.4. Let
\[ (a_0, \dots, a_{k-1}, \overline{a_k, \dots, a_{k+m-1}}) \]
be a finite polynomial orbit in A, say of f ∈ A[X], where a0 , . . . , ak+m−1 are
distinct. Then (ak , . . . , ak+m−1 ) is a polynomial cycle, and so m ≤ B(A).
We first make some reductions. Write k = qm + r with q, r ∈ Z and 0 ≤
r ≤ m − 1. Then (a_r, a_{r+m}, . . . , a_{r+qm}) is a finite orbit of f^(m). Let
\[ b_i := \frac{a_{r+im} - a_k}{a_r - a_k} \quad (i = 0, 1, 2, \dots), \]
\[ h(X) := (a_r - a_k)^{-1} \big( f^{(m)}((a_r - a_k)X + a_k) - a_k \big). \]
Then h ∈ A[X], b_0 = 1, b_i = 0 for i ≥ q, b_0, . . . , b_{q−1} are distinct, and
(1, b_1, . . . , b_{q−1}, 0)
is a finite orbit of h. We show that
q ≤ 2N2 (A) + 3.
(10.3.2)
Then using k + m < (q + 2)m ≤ (q + 2)B(A) we obtain at once Theorem
10.3.4.
We use that by Lemma 10.3.3 with g = h^(t), b_i − b_j divides b_{i+t} − b_{j+t} for
any i, j with 0 ≤ i, j ≤ q and i ≠ j and any t > 0. Assume without loss of
generality that q ≥ 5 and let i be an index with q/2 ≤ i ≤ q − 2. Then b_i − 1 =
b_i − b_0 divides b_{2i} − b_i = −b_i, hence x_1 := b_i − 1 ∈ A∗. Further, b_{q−1} − b_i
divides b_{2q−2−i} − b_{q−1} = −b_{q−1} and hence also b_i, while b_i = b_i − b_q divides
b_{q−1} − b_{2q−1−i} = b_{q−1}, and hence also b_{q−1} − b_i. So x_2 := (b_{q−1}/b_i) − 1 ∈
A∗. Notice that x_1, x_2 are elements of A∗ satisfying
(1 + x_1)(1 + x_2) = b_{q−1}
and that b_{q−1} ≠ 0, 1. So the number of indices i with q/2 ≤ i ≤ q − 2 is at
most N_2(A). This implies (10.3.2) and hence Theorem 10.3.4.
Let again A be an integral domain. We call two sequences {a_i}_{i=0}^r, {b_i}_{i=0}^r
in A (with r finite or infinite) equivalent if there are ε ∈ A∗, a ∈ A such that
b_i = εa_i + a for i = 0, 1, 2, . . . . If {a_i}_{i=0}^r is a cycle or orbit of a polynomial
f ∈ A[X], then {b_i}_{i=0}^r is a cycle or orbit of g(X) := εf(ε^{−1}(X − a)) + a.
A polynomial cycle of A is called linear if it is a cycle of a linear polynomial
from A[X], otherwise non-linear. A finite polynomial orbit of A is called
(non-)linear if its cycle is (non-)linear.
Halter-Koch and Narkiewicz (1997, 2000) proved that if A is an integral
domain of characteristic 0 that is finitely generated over Z and integrally closed,
then it has up to equivalence only finitely many non-linear polynomial cycles
and only finitely many finite non-linear polynomial orbits. The non-linearity
assumption is needed here. For instance, we obtain infinitely many pairwise
inequivalent linear orbits by taking (1, 0, a) (a ∈ A \ {0, 1}), which is a finite
orbit of f = (X − 1)(X − a) with linear cycle (0, a) coming from a − X. The
proof of Halter-Koch and Narkiewicz heavily uses finiteness results on unit
equations. Pezda (2014) gave an effective algorithm that computes, for any
given number field K, a full set of representatives for the equivalence classes
of the non-linear polynomial cycles and finite orbits of OK .
We state without proof some results on the lengths of cycles and finite orbits
of rational maps on the projective line. In general, for an arbitrary field K, a
rational map φ : P1 (K) → P1 (K) of degree n is given by
\[ \phi : (x : y) \mapsto (F(x, y) : G(x, y)), \tag{10.3.3} \]
where F, G ∈ K[X, Y] are two binary forms of degree n without a common
factor, i.e., with resultant R(F, G) ≠ 0. Notice that the map φ is unaffected if
we replace F, G by λF, λG for some λ ∈ K∗.
We assume henceforth that K is a number field. Let φ be the rational self-map of P1(K) of degree n, given by (10.3.3). Let p be a prime ideal of OK. We
say that φ has good reduction at p if the following holds: choose F, G such that
their coefficients lie in OK but not all in p; then R(F, G) ∉ p. Otherwise, we
say that φ has bad reduction at p. It is not difficult to show that for a rational
self-map of P1 (K) there are only finitely many prime ideals of OK at which it
has bad reduction.
This notion of good reduction has an alternative interpretation. Let Fp :=
OK /p denote the residue class field of p and denote by Fp , Gp the binary forms
in Fp [X, Y ], obtained by reducing the coefficients of F, G modulo p. Then the
reduction φp of φ at p is the self-map of P1 (Fp ) given by
\[ (x : y) \mapsto \Big( \frac{F_p(x, y)}{H(x, y)} : \frac{G_p(x, y)}{H(x, y)} \Big), \]
where H is the greatest common divisor of F_p, G_p in F_p[X, Y]. Notice that
R(F, G) ∉ p if and only if H is constant. This means that φ has good reduction
at p if and only if φ_p has the same degree as φ.
For more information on reduction of rational maps, see Silverman (2007),
sections 2.3–2.5.
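For K = Q the primes of bad reduction can be computed directly from the resultant. The sketch below is a minimal illustration under that assumption and is not taken from the text: the function names are hypothetical, the resultant is computed as the determinant of the Sylvester matrix of the two forms, and the coefficients are first divided by their greatest common divisor so that they are not all divisible by any prime.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def sylvester_resultant(F, G):
    """Resultant of two binary forms of the same degree n with integer
    coefficients, listed for X^n, X^(n-1)Y, ..., Y^n."""
    n = len(F) - 1
    size = 2 * n
    rows = [[0] * i + list(F) + [0] * (n - 1 - i) for i in range(n)]
    rows += [[0] * i + list(G) + [0] * (n - 1 - i) for i in range(n)]
    m = [[Fraction(x) for x in row] for row in rows]
    det = Fraction(1)
    for col in range(size):  # exact Gaussian elimination
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return int(det)


def prime_divisors(n):
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def bad_primes(F, G):
    """Primes of bad reduction of (x : y) -> (F : G) over Q."""
    content = reduce(gcd, list(F) + list(G))
    F = [c // content for c in F]
    G = [c // content for c in G]
    res = sylvester_resultant(F, G)
    if res == 0:
        raise ValueError("F and G have a common factor")
    return prime_divisors(abs(res))


print(bad_primes([1, 0, -1], [0, 0, 1]))  # z -> z^2 - 1: []
print(bad_primes([1, 0, 1], [0, 2, 0]))   # z -> (z^2 + 1)/(2z): [2]
```

In the second example the forms reduce modulo 2 to (X + Y)^2 and 0, so the reduced map has smaller degree, in line with the alternative interpretation above.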
We state without proof the following result.
Theorem 10.3.6 Let K be an algebraic number field of degree d and let φ be
a rational self-map of P1 (K). Let S be the set of places of K consisting of the
infinite places of K and the prime ideals at which φ has bad reduction. Denote
by t the number of prime ideals of OK at which φ has bad reduction, and let
s := |S|.
(i) Every cycle of φ has length at most
\[ C_1(d, t) := \big( 12(t + 2) \log 5(t + 2) \big)^{d}. \]
(ii) Every finite orbit of φ has length at most
\[ C_2(s) := \big( e^{10^{12}} (s + 1)^8 (\log 5(s + 1))^8 \big)^{s}. \]
Part (i) has been proved by Morton and Silverman (1994), corollary B. The
proof is by means of a local method, extending that of Pezda (1994). For similar
and related results see Zieve (1996) and Silverman (2007), section 2.6. Part (ii)
has been proved by Canci (2007), Theorem 1. His proof is an extension of
that of Theorem 10.3.4. His main tools are part (i), and Theorem 6.1.3 on unit
equations.
We consider the preperiodic points of rational self-maps of P1 (K). Let
φ be a rational self-map of P1(K). First suppose that φ is linear, that is,
φ(x : y) = (ax + by : cx + dy) where B := \begin{pmatrix} a & b \\ c & d \end{pmatrix} ∈ GL(2, K). If B has two
eigenvalues in K whose quotient is a root of unity, then there is m > 0 such that
φ^(m) is the identity and every point in P1(K) is a periodic point of φ. Otherwise,
B has at most two fixed points, depending on the number of eigenvalues of B
in K, and no other preperiodic points.
Assume henceforth that φ has degree at least 2. We denote by PrePerK (φ)
the set of preperiodic points of φ in P1 (K). More generally, we may extend
φ to a rational self-map of P1(\overline{Q}) and consider the set PrePerK,D(φ) of all
preperiodic points of φ that have degree at most D over K. Then Northcott
(1950) proved that for any integer D > 0, the set PrePerK,D (φ) is finite.
We state a special case of the Uniform Boundedness Conjecture, which was
first formulated in Morton and Silverman (1994).
Conjecture 10.3.7 Let K be a number field of degree d, and φ a rational
self-map of P1 (K) of degree n ≥ 2. Then
|PrePerK (φ)| ≤ C(d, n),
where C(d, n) depends on d and n only.
From Theorem 10.3.6 we deduce a weaker version.
Corollary 10.3.8 Let K, d, φ, n be as in Conjecture 10.3.7 and assume that
φ has bad reduction at precisely t prime ideals of OK . Then
|PrePerK (φ)| ≤ C(d, n, t),
where C(d, n, t) is an effectively computable number, depending on d, n and t
only.
Proof. For the i-th iterate of φ we have φ^(i)(x : y) = (F_i(x, y) : G_i(x, y)),
where both F_i, G_i are binary forms of degree n^i with R(F_i, G_i) ≠ 0. By
Theorem 10.3.6, for every point (x : y) ∈ PrePerK(φ), there are k, l with 0 ≤
k < l ≤ C2(s), such that φ^(k)(x : y) = φ^(l)(x : y), that is, F_k(x, y)G_l(x, y) =
F_l(x, y)G_k(x, y). This shows that the preperiodic points of φ are among the
zeros of the binary form
\[ P := \prod_{0 \le k < l \le C_2(s)} (F_k G_l - F_l G_k), \]
which is not identically zero since F_i, G_i are coprime for i ≥ 0. Now the
number of preperiodic points of φ is at most the degree of P, which can be
estimated from above effectively in terms of s and n, hence in terms of d, t
and n.
10.4 Polynomials dividing many k-nomials
By a monic k-nomial over Q we will mean a polynomial of the form
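\[ X^{m_1} + a_2 X^{m_2} + \cdots + a_{k-1} X^{m_{k-1}} + a_k X^{m_k} \in \mathbb{Q}[X] \quad \text{with } m_1 > \cdots > m_{k-1} > m_k = 0. \]
If the polynomial is not a (k − 1)-nomial, i.e., if all a_i ≠ 0, we call (m_1, . . . , m_k)
its exponent k-tuple. Put
\[ \mathcal{PR}_k := \big\{ P \in \mathbb{Q}[X] : \exists\, Q \in \mathbb{Q}[X],\ r \in \mathbb{Z}_{\ge 1} \text{ with } \deg(Q) < k \text{ such that } P(X) \mid Q(X^r) \text{ over } \mathbb{Q} \big\}. \]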
Posner and Rumsey (1965) noted that P(X) ∈ PR_k implies that P(X) divides
infinitely many monic k-nomials over Q. Indeed, if P(X) divides Q(X^r) over
Q for some Q(X) of degree < k and integer r ≥ 1, then the vector space of
polynomials in Q[X] modulo Q(X) is at most (k − 1)-dimensional, and hence
Q(X) divides infinitely many k-nomials T(X) over Q. But then Q(X^r) divides
T(X^r) and so P(X) divides T(X^r) over Q. Conversely, Posner and Rumsey
conjectured that if a polynomial P ∈ Q[X] divides infinitely many monic k-nomials over Q then P ∈ PR_k.
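The observation of Posner and Rumsey can be illustrated with a small example, chosen here for convenience and not taken from the text. Take P = X^2 + X + 1, Q = X − 1 and r = 3, so that P divides Q(X^3) = X^3 − 1. Every trinomial T = X^a + cX^b − (1 + c) with a > b > 0 and c ≠ 0, −1 vanishes at 1, hence is divisible by Q, and so P should divide T(X^3). The sketch below checks this by exact polynomial division; all function names are hypothetical.

```python
from fractions import Fraction


def poly_rem(num, den):
    """Remainder of num modulo den over Q; coefficient lists, lowest degree first."""
    rem = [Fraction(c) for c in num]
    d = len(den) - 1
    lead = Fraction(den[-1])
    while len(rem) > d:
        factor = rem[-1] / lead
        shift = len(rem) - 1 - d
        for i, c in enumerate(den):
            rem[shift + i] -= factor * c
        rem.pop()  # the leading coefficient is now zero
    return rem


def divides(den, num):
    return all(c == 0 for c in poly_rem(num, den))


def trinomial_at_power(a, b, c, r):
    """Coefficients of T(X^r) for T = X^a + c X^b - (1 + c), with a > b > 0."""
    coeffs = [Fraction(0)] * (r * a + 1)
    coeffs[0] = -(1 + Fraction(c))
    coeffs[r * b] = Fraction(c)
    coeffs[r * a] = Fraction(1)
    return coeffs


P = [1, 1, 1]  # X^2 + X + 1
print(all(divides(P, trinomial_at_power(a, b, c, 3))
          for a in range(2, 6) for b in range(1, a)
          for c in (1, 2, Fraction(1, 2), -3)))  # True
print(divides(P, [1, 1, 0, 0, 1]))               # X^4 + X + 1: False
```

Varying a, b and c gives infinitely many monic trinomials divisible by P, as the argument above predicts.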
For k = 2 the conjecture is obvious. For k = 3, Posner and Rumsey proved
a weaker version of their conjecture. Later, Győry and Schinzel (1994) showed
that the conjecture is true for k = 3 and false for k ≥ 4. The disproof for the
case k ≥ 4 is elementary. For k = 3, the proof involves some deep results on
S-unit equations in two unknowns.
Győry and Schinzel (1994) proved the following stronger assertion.
Theorem 10.4.1 Let P ∈ Q[X] be a non-constant polynomial with t distinct
zeros, K the splitting field of P , d the degree of K over Q, and s the number
of distinct prime ideal factors of the zeros different from 0 of P . There are
effectively computable numbers C1 , C2 depending only on d and s such that if
P divides more than C_1 · C_2^t monic trinomials over Q then P ∈ PR_3.
Győry and Schinzel gave C1 and C2 in explicit form. It should be observed
that these numbers do not depend on the size of the coefficients of P .
Proof (sketch). The proof of Theorem 10.4.1 is based on some earlier, quantitative versions of Corollary 6.1.5 and Theorem 6.1.6. We sketch the basic idea of
the proof. Let P be a polynomial as in the theorem, and let T = X^m + aX^n + b
be a trinomial over Q which is divisible by P. If X divides P(X) or if ab = 0,
then P ∈ PR_3 easily follows. Hence we assume that X does not divide P and
ab ≠ 0. It is easy to show that P can be written in the form Ps P_1 P_2^2, where
P_1 and P_2 are relatively prime squarefree polynomials in Q[X]. Denote by
α_1, . . . , α_t the distinct zeros of P_1P_2, and by S the set of prime ideal factors
of these zeros in K. Then, for i = 1, . . . , t, (α_i^m, α_i^n) is a solution of the S-unit
equation
\[ (-1/b)x_1 + (-a/b)x_2 = 1 \quad \text{in } S\text{-units } x_1, x_2. \tag{10.4.1} \]
First consider those trinomials T = X^m + aX^n + b (ab ≠ 0) over Q which
are divisible by P and for which the corresponding equation (10.4.1) has at
most two solutions. One can show that if there are more than 15 such trinomials
then P ∈ PR_3.
Next consider those trinomials T = X^m + aX^n + b (ab ≠ 0) over Q which
are divisible by P(X) and for which the corresponding equation (10.4.1) has more than
two solutions. If X^m + aX^n + b and X^{m'} + a'X^{n'} + b' are such trinomials
and if the corresponding equations of the form (10.4.1) are S-equivalent in
the sense defined before the enunciation of Theorem 6.1.6 then a'/a, b'/b ∈
O_S^* ∩ Q∗, where O_S^* denotes as usual the S-unit group in K. Hence it follows
from a quantitative version, due to Győry (1992b), of Theorem 6.1.6 over
number fields that there is a subset A of (Q∗)^2 of cardinality at most C_3 such
that for each trinomial X^m + aX^n + b under consideration, a = εa_0, b = ηb_0
with ε, η ∈ O_S^* ∩ Q∗ and some (a_0, b_0) ∈ A. Here C_3 is a number depending
only on d and s which can be given explicitly.
Fix such a pair (a_0, b_0) ∈ A and consider all the trinomials of the form
X^m + εa_0X^n + ηb_0 with ε, η ∈ O_S^* ∩ Q∗, which are divisible by P over Q. If
X^m + εa_0X^n + ηb_0 and X^m + ε'a_0X^n + η'b_0 are such trinomials then P(X)
divides X^n + c with some c ∈ Q∗ and so P ∈ PR_3. Hence it suffices to deal
with those trinomials for which the pairs (m, n) are pairwise distinct. We
may assume that in the pairs (m, n) in question, say m_1, . . . , m_u are pairwise
distinct for u > C_4^t with a number C_4 specified below. Then P divides T_i =
X^{m_i} + ε_i a_0 X^{n_i} + η_i b_0 over Q for i = 1, . . . , u, and so, for each i,
\[ (-1/b_0)\, \alpha_j^{m_i}/\eta_i + (-a_0/b_0)\, \varepsilon_i \alpha_j^{n_i}/\eta_i = 1 \]
for j = 1, . . . , t, where ε_i, η_i ∈ O_S^* ∩ Q∗ for i = 1, . . . , u. By the above-mentioned version, due to Evertse (1984a), of Corollary 6.1.5, C_4 can be chosen
as an explicit expression of d and s such that for each j with 1 ≤ j ≤ t, α_j^{m_i}/η_i
can assume at most C_4 values. Since by assumption u > C_4^t, there are distinct
i_1 and i_2 with 1 ≤ i_1, i_2 ≤ u such that α_j^{m_{i_1}}/η_{i_1} = α_j^{m_{i_2}}/η_{i_2} for j = 1, . . . , t
and, if m_{i_1} > m_{i_2}, then putting r = m_{i_1} − m_{i_2} and η = η_{i_1}/η_{i_2}, we get α_j^r = η
for j = 1, . . . , t. Consequently, P_1P_2 divides X^r − η, i.e. P divides (X^r − η)^2
and so P ∈ PR_3. Finally, we obtain that if P divides more than 15 + C_3 · C_4^{2t}
trinomials then P ∈ PR_3.
Schlickewei and Viola (1997) improved the bound occurring in Theorem
10.4.1. They proved the theorem with a bound of the form C_5 · q^{C_6} where q
denotes the degree of P and C_5, C_6 are explicitly given absolute constants. We
note that under the above notation, t ≤ q ≤ 2t and q ≤ d ≤ q! hold. In their
paper Schlickewei and Viola made the conjecture that the bound C_5 · q^{C_6} may
be replaced by an absolute constant which does not involve the degree of P at
all. However, as is mentioned by them, at present this seems to be out of reach.
In Győry and Schinzel (1994), the authors proposed as a problem a modified
version of the conjecture of Posner and Rumsey for k ≥ 4. Hajdu (1997) gave
a negative answer to the problem and proposed a further refinement of the
conjecture. For k = 5, this was disproved by Hajdu and Tijdeman (2003).
Further, they noticed that if P divides two monic k-nomials, say T_1 and T_2,
over Q with the same exponent k-tuple, then it divides infinitely many k-nomials,
for example the k-nomials (a/(a+b)) T_1 + (b/(a+b)) T_2 for every pair (a, b) of
positive rationals. Then in their paper (Hajdu and Tijdeman (2003)), they made
the following conjecture.
Conjecture 10.4.2 For any k ≥ 5, a polynomial P ∈ Q[X] with P(0) ≠ 0
divides infinitely many monic k-nomials with non-zero constant terms over Q
if and only if either
(i) P ∈ P Rk or
(ii) P divides over Q two monic k-nomials with the same exponent k-tuple.
In the same paper, the authors proved this assertion for k = 4 and for polynomials P with only simple zeros. Further, in Hajdu and Tijdeman (2008) they
confirmed the conjecture for k ≥ 5 in the important special case when P is
irreducible over Q and its Galois group is [2k/3]-times transitive. The proof is
complicated; it depends on Theorem 6.1.3 which gives an upper bound for the
number of solutions of multivariate unit equations.
Finally, we note that Schlickewei and Viola (1999) described a so-called
“proper” family Fk of monic k-nomials such that if a polynomial P having
only simple zeros divides more than C7 (k) elements of Fk with a C7 (k) given
explicitly in terms of k, then P ∈ P Rk .
10.5 Irreducible polynomials and arithmetic graphs
Let K be an algebraic number field, S a finite set of places on K containing all
infinite places, OS the ring of S-integers, NS (·) the S-norm and N a positive
integer. For any finite subset A = {α1 , . . . , αm } of OS with m ≥ 3, we denote
by GS (A) = GS (A, N ) the graph whose vertex set is A and whose edges are
the unordered pairs {αi , αj } with
NS (αi − αj ) > N.
When S consists of the infinite places, this graph will be denoted by G(A) =
G(A, N). These graphs G(A) and GS (A) were introduced in Győry (1971, 1972,
1980c) and were studied and applied by Győry and others; see Győry (2008b),
Győry, Hajdu and Tijdeman (2011) and the references given there.
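For readers who wish to experiment, the definition can be sketched in Python in the simplest case K = Q with S consisting of the infinite place only, so that OS = Z and NS (x) = |x|. This restriction, and the function names `arithmetic_graph` and `connected_components`, are assumptions made for illustration; they are not notation from the literature.

```python
from itertools import combinations

def arithmetic_graph(A, N):
    """Edge set of G(A, N) for a finite set A of rational integers.

    Assumption for this sketch: K = Q and S = {infinite place},
    so that O_S = Z and N_S(x) = |x|.
    """
    vertices = sorted(set(A))
    return {(a, b) for a, b in combinations(vertices, 2) if abs(a - b) > N}

def connected_components(A, edges):
    """Connected components (as sets of vertices), found by depth-first search."""
    adjacency = {v: set() for v in A}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in component:
                component.add(v)
                stack.extend(adjacency[v] - component)
        seen |= component
        components.append(component)
    return components

A = [0, 1, 2, 10]
print(connected_components(A, arithmetic_graph(A, 1)))
print(connected_components([0, 1, 2], arithmetic_graph([0, 1, 2], 2)))
```

With A = {0, 1, 2, 10} and N = 1 the graph is connected, whereas for A = {0, 1, 2} and N = 2 no difference has absolute value exceeding N, so the graph consists of three isolated vertices.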
Several Diophantine problems, for instance related to irreducibility of polynomials (see Theorem 10.5.3), decomposable form equations (see Subsection 9.6.2), discriminant equations (see Theorems 10.6.1–10.6.3) and resultant
equations (see Theorem 10.8.1) can be reduced to the study of connectedness
properties of graphs GS (A, N). Such properties are stated in Theorems 10.5.1
and 10.5.2 below.
In the complement of GS (A, 1), {αi , αj } is an edge if and only if αi − αj
is an S-unit. Hence this complement is called a difference graph of S-units.
For any finite (simple) graph G of order ≥ 3 there is a finite set S of places on
K containing all infinite ones such that GS (A, 1) is isomorphic to G for some
subset A of OS . Further, such S and A can be effectively determined, provided
that K is effectively given; see Győry, Hajdu and Tijdeman (2014).
The subsets A = {α1 , . . . , αm }, A′ = {α′1 , . . . , α′m } of OS are called S-equivalent if, after some reordering of α′1 , . . . , α′m ,
α′i = εαi + β,   i = 1, . . . , m
for some ε ∈ OS∗ and β ∈ OS . In this case the graphs GS (A) and GS (A′ ) are
obviously isomorphic; they have the same structure. There are infinitely many
S-equivalence classes of subsets A of OS with given cardinality m ≥ 3.
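The invariance of the graph under S-equivalence can be illustrated numerically. The following is a minimal sketch under the assumption K = Q and S = {∞, 2}, so that OS = Z[1/2], the S-units are the numbers ±2^k, and the S-norm of a non-zero rational x is |x| multiplied by its 2-adic absolute value. The helper names `s_norm` and `edge_indices` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def s_norm(x, primes):
    """S-norm of a rational x for S = {infinity} together with the given primes.

    It equals |x| * prod_p |x|_p, i.e. |x| with all powers of the primes removed.
    """
    x = abs(Fraction(x))
    if x == 0:
        return Fraction(0)
    num, den = x.numerator, x.denominator
    for p in primes:
        while num % p == 0:
            num //= p
        while den % p == 0:
            den //= p
    return Fraction(num, den)

def edge_indices(A, N, primes):
    """Edges of G_S(A, N), recorded as pairs of positions in the list A."""
    return {(i, j) for i, j in combinations(range(len(A)), 2)
            if s_norm(A[i] - A[j], primes) > N}

A = [Fraction(0), Fraction(1), Fraction(3), Fraction(8)]
eps, beta = Fraction(-1, 4), Fraction(5, 2)   # eps is an S-unit, beta an S-integer
A_prime = [eps * a + beta for a in A]         # an S-equivalent set

print(edge_indices(A, 1, (2,)))
print(edge_indices(A_prime, 1, (2,)))
```

Both calls give the same edge set, because multiplying a difference by an S-unit does not change its S-norm and the translation by β cancels in differences.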
We present two theorems in simplified form on the structure of graphs
GS (A). Denote by d the degree of K, and let s := |S|. Further, as in Chapter 4,
let P denote the greatest norm and Q the product of norms of the prime ideals
involved in S, and let RS be the S-regulator of K.
The following theorem was proved by Győry (2008b) in a more precise
form. Its first, weaker version can be found in Győry (1980c).
Theorem 10.5.1 Let m ≥ 3 be an integer, and A = {α1 , . . . , αm } a subset of
OS . Then the graph GS (A, N ) has at most two connected components, except
possibly in the case when there is an ε ∈ OS∗ such that
\[ \max_{1 \le i,j \le m} h\bigl((\alpha_i - \alpha_j)/\varepsilon\bigr) \le C_1 m^3 (C_2 s)^{2(s+2)} P R_S (\log^* R_S)(\log^* QN). \]
Here C1 , C2 are effectively computable positive numbers such that C1 depends
only on the degree d of K and the regulator and class number of K, and C2
only on d.
This means that the number of exceptional S-equivalence classes is finite,
and a representative of each class can be, at least in principle, effectively
determined.
Proof (sketch). Theorem 10.5.1 is proved by repeated application of Corollary
4.1.5. We sketch some ideas behind the proof.
For a finite graph G we denote by G^△ the triangle graph of G, i.e. the
graph whose vertices are the edges of G, and two vertices e1 and e2 of G^△
are connected by an edge if and only if G contains a triangle having e1 and
e2 as edges. Further, if both G and G^△ are connected then we say that G is
△-connected.
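The triangle graph is easy to compute for small examples. The sketch below is only an illustration of the definition just given; the function names `triangle_graph`, `is_connected` and `is_triangle_connected` are hypothetical, and edges are represented as two-element frozensets.

```python
from itertools import combinations

def triangle_graph(vertices, edges):
    """Return (vertex set, edge set) of the triangle graph of G.

    The vertices of the triangle graph are the edges of G; two of them are
    joined when G contains a triangle having both as sides.
    """
    edge_set = {frozenset(e) for e in edges}
    tri_edges = set()
    for a, b, c in combinations(sorted(vertices), 3):
        sides = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if all(side in edge_set for side in sides):
            for e1, e2 in combinations(sides, 2):
                tri_edges.add(frozenset((e1, e2)))
    return edge_set, tri_edges

def is_connected(vertices, edges):
    """Connectivity test by graph search; the empty graph counts as connected."""
    vertices = set(vertices)
    if not vertices:
        return True
    adjacency = {v: set() for v in vertices}
    for e in edges:
        a, b = tuple(e)
        adjacency[a].add(b)
        adjacency[b].add(a)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adjacency[v] - seen:
            seen.add(w)
            stack.append(w)
    return seen == vertices

def is_triangle_connected(vertices, edges):
    """True when both G and its triangle graph are connected."""
    tri_vertices, tri_edges = triangle_graph(vertices, edges)
    return is_connected(vertices, edges) and is_connected(tri_vertices, tri_edges)

two_triangles_sharing_edge = [(1, 2), (2, 3), (1, 3), (2, 4), (3, 4)]
bowtie = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (3, 5)]
print(is_triangle_connected([1, 2, 3, 4], two_triangles_sharing_edge))
print(is_triangle_connected([1, 2, 3, 4, 5], bowtie))
```

Two triangles glued along a common edge pass the test, while two triangles meeting in a single vertex do not: the graph is connected but its triangle graph splits into two components.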
Consider now GS (A) = GS (A, N) in Theorem 10.5.1 and assume that this
graph has at least three connected components. It is easy to see that in this case
the complement of GS (A), for simplicity denoted by G, is △-connected. Let
{αi , αj , αk } be a triangle in G. Then we have
NS (αi − αj ) ≤ N, NS (αj − αk ) ≤ N, NS (αk − αi ) ≤ N.
Using Proposition 4.3.12, this gives that up to unknown S-unit factors, the
numbers αi − αj , αj − αk , αk − αi have effectively bounded heights. But
(αi − αj )/(αi − αk ) + (αj − αk )/(αi − αk ) = 1,
hence Corollary 4.1.5 implies that the height of (αi − αj )/(αj − αk ) can be
effectively bounded above. If {αj , αk , αl } is another triangle in G, then the
heights of (αj − αk )/(αk − αl ) and so (αi − αj )/(αk − αl ) are also effectively
bounded. Continuing this procedure, it follows that for any two connected vertices {αi , αj }, {αp , αq } in G^△ , the height of (αi − αj )/(αp − αq ) is effectively
bounded. But G^△ is connected, hence for each quadruple {αi , αj , αp , αq } for
which {αi , αj } and {αp , αq } are edges in G, the height of (αi − αj )/(αp − αq )
can be effectively bounded. Fix p and q. Since G is connected, each distinct αa and αb can be connected by a path in G. Summing over all terms
(αi − αj )/(αp − αq ) for the edges in this path we infer that for each pair (a, b)
the height of (αa − αb )/(αp − αq ) can be effectively bounded. From these facts
it follows easily that up to a common S-unit factor, the height of αa − αb is
effectively bounded for each distinct αa , αb , as stated in Theorem 10.5.1.
The following theorem is a more precise but ineffective version of Theorem
10.5.1.
There are only finitely many pairwise non-associate α ∈ OS with NS (α) ≤
N. Denote by $S (N ) the maximal number of such α.
Theorem 10.5.2 Let m ≥ 3 be an integer with m ≠ 4. Apart from at most
finitely many S-equivalence classes of subsets A = {α1 , . . . , αm } of OS ,
GS (A) has a connected component of order at least m − 1.   (10.5.1)
Further, if
m > 3 · 2^{16s} · $S (N )^2 ,   (10.5.2)
then (10.5.1) holds for all subsets A = {α1 , . . . , αm } of OS .
We note that the assumption m ≠ 4 is necessary, and the lower bound m − 1
in (10.5.1) is sharp.
A more general and quantitative version of the first part of Theorem 10.5.2
is given in Győry (2008b); see also Győry (1990). The second part is a special
case of Theorem 2.3 of Győry (2008b). For earlier versions of this part, see
Győry (1980c, 1990).
Proof (sketch). The proof of the first part of Theorem 10.5.2 depends on Corollary 6.1.2 or, in the quantitative case, on Theorem 6.1.3 and (6.1.4) concerning
S-unit equations. We now sketch the ideas behind the proof of the second
statement. Let A = {α1 , . . . , αm } be a subset of OS , and let G1 , . . . , Gl be the
connected components of GS (A) such that |G1 | ≤ |G2 | ≤ · · · ≤ |Gl |. Suppose
that l ≥ 3 or l = 2 and |G1 | ≥ 2. If l ≥ 3, let αi1 , αi2 be vertices of G1 and G2 ,
respectively, while if l = 2, let αi1 , αi2 be vertices of G1 . Then
αi2 − αi1 = (αi2 − αj ) + (αj − αi1 )
(10.5.3)
follows for every vertex αj of G3 , . . . , Gl if l ≥ 3, and of G2 if l = 2. Further,
αi2 − αj and αj − αi1 have S-norms at most N for each j . There are $S (N )^2 pairs
(β1 , β2 ) ∈ OS^2 with non-zero β1 , β2 such that αi2 − αj = β1 x1 , αj − αi1 = β2 x2
with S-units x1 , x2 . For fixed αi1 , αi2 , (10.5.3) leads to at most $S (N )^2 S-unit equations whose total number of solutions is by Theorem 6.1.4 at most
2^{16s} · $S (N )^2 . But the number of αj in question is at least m/3. This shows that if
(10.5.2) holds, then l = 1 or l = 2 and |G1 | = 1, which was to be proved.
Theorems 10.5.1 and 10.5.2 have applications to irreducible polynomials.
I. Schur and later A. Brauer, R. Brauer, H. Hopf and others investigated the
irreducibility of polynomials of the form g(f (X)), where f , g are monic
polynomials with integral coefficients, g is irreducible over Q, and the zeros
of f are distinct integers. For a survey of results of this type, see Győry (1972,
1982c).
These investigations were extended in Győry (1971, 1972, 1982c, 1992c)
to the more general case that the zeros of f are in an arbitrary but fixed
totally real number field K. Let A = {α1 , . . . , αm } be the set of zeros of such
a monic polynomial f ∈ Z[X] and suppose that g ∈ Z[X] is an irreducible
monic polynomial whose splitting field is a CM-field, i.e. a totally imaginary
quadratic extension of a totally real number field. In this case g is called of
CM-type. For example, cyclotomic polynomials and quadratic polynomials of
negative discriminant are of CM-type. Consider the graph G(A) = G(A, N)
with N = 2^d |g(0)|^{d/ deg (g)} , where d = [K : Q]. It was proved in Győry (1971)
that if this graph G(A) has a connected component having k vertices, then the
number of irreducible factors of g(f (X)) over Q is not greater than deg (f )/k.
This estimate is in general best possible; see Győry (1972).
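The estimate can be evaluated mechanically in the simplest case. The sketch below assumes K = Q (so d = 1 and the zeros of f are distinct rational integers) and takes only g(0) and deg (g) as input; it is the caller's responsibility that g is monic, irreducible and of CM-type. To avoid irrational thresholds, the condition |αi − αj | > 2 |g(0)|^{1/ deg (g)} is tested in the equivalent integer form |αi − αj |^{deg (g)} > 2^{deg (g)} |g(0)|. The function names are hypothetical.

```python
from itertools import combinations

def largest_component_size(zeros, g_constant, g_degree):
    """Size of the largest component of G(A, N) with N = 2 |g(0)|^(1/deg g).

    Assumption for this sketch: K = Q, d = 1, and A is the set of integer zeros of f.
    """
    zeros = sorted(set(zeros))
    # |a - b| > 2 |g(0)|^(1/deg g)  <=>  |a - b|^deg g > 2^deg g * |g(0)|
    threshold = 2 ** g_degree * abs(g_constant)
    parent = {z: z for z in zeros}

    def find(z):
        # union-find root lookup with path halving
        while parent[z] != z:
            parent[z] = parent[parent[z]]
            z = parent[z]
        return z

    for a, b in combinations(zeros, 2):
        if abs(a - b) ** g_degree > threshold:
            parent[find(a)] = find(b)
    sizes = {}
    for z in zeros:
        root = find(z)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())

def factor_bound(zeros, g_constant, g_degree):
    """Upper bound deg(f)/k (rounded down) for the number of irreducible factors of g(f(X))."""
    k = largest_component_size(zeros, g_constant, g_degree)
    return len(set(zeros)) // k

# g = X^2 + 1 (g(0) = 1, deg g = 2), f = X (X - 3) (X - 6)
print(factor_bound([0, 3, 6], 1, 2))
# same g, f = X (X - 1) (X - 2) (X - 3)
print(factor_bound([0, 1, 2, 3], 1, 2))
```

For g = X^2 + 1 and f = X (X − 3)(X − 6) all differences of zeros exceed N = 2, the graph is connected, and the quoted estimate yields at most one irreducible factor, that is, f (X)^2 + 1 is irreducible over Q. For zeros 0, 1, 2, 3 only the pair {0, 3} forms an edge, so the estimate gives at most two factors.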
For f ∈ Z[X] and a ∈ Z, the polynomials f (X) and f (X + a) will be called
equivalent. Then, for irreducible g ∈ Z[X], g(f (X)) and g(f (X + a)) are at the
same time reducible or irreducible. Using the fact that $S (N ) ≤ C3 N for S = M_K^∞ , with an
effectively computable number C3 depending only on d and the discriminant
of K (see Sunley (1973)), Theorem 10.5.2 implies immediately the following
theorem.
Theorem 10.5.3 Let g ∈ Z[X] be a monic irreducible polynomial of CM-type, and K a totally real number field of degree d. There are only finitely
many equivalence classes of monic polynomials f ∈ Z[X] with deg (f ) ≥ 3,
deg (f ) ≠ 4, and with distinct zeros in K for which g(f (X)) is reducible over
Q. Further, if
deg(f ) > C4 |g(0)|^{2d/ deg(g)}
then g(f (X)) is irreducible over Q. Here C4 is an effectively computable
number depending only on d and the discriminant of K.
We note that for suitable g and K, in Theorem 10.5.3 there exist infinitely
many exceptional equivalence classes of quartic f for which g(f (X)) is
reducible, and these exceptions are described in Győry (1992c). Further,
Theorem 10.5.3 does not remain valid for any monic irreducible g ∈ Z[X]
and for any number field K; see e.g. Győry (1992c).
In Győry, Hajdu and Tijdeman (2011), an upper bound is given for the
number of exceptional equivalence classes of polynomials f . Theorem 10.5.3
is ineffective, in the sense that the method of proof does not make it possible to
determine the exceptional equivalence classes. A weaker but effective version
can be deduced from Theorem 10.5.1. For the first effective results of this type,
see Győry (1982c).
10.6 Discriminant equations and power integral bases in number fields
Several Diophantine problems of number theory lead to discriminant equations. To illustrate applications of unit equations to such equations, we restrict
ourselves here to some finiteness results in their simplest form. Many other,
more general results, quantitative versions and applications are discussed in
our book Discriminant Equations in Diophantine Number Theory.
Two important discriminant equations are
DK/Q (α) = D   in α ∈ OK   (10.6.1)
and
D(f ) = D   in monic polynomials f ∈ Z[X],   (10.6.2)
where K is an algebraic number field, OK its ring of integers, D(f ) the discriminant of f , DK/Q (α) the discriminant of the minimal polynomial, say fα , of α
over Z, and D a non-zero rational integer. In other words, if α satisfies (10.6.1)
then fα satisfies (10.6.2). Equation (10.6.2) can have, however, other, not necessarily irreducible solutions without zeros in K. Hence equation (10.6.2) is
more general than (10.6.1).
If α is a solution of (10.6.1) then so is α + a for all a ∈ Z. Elements α,
α ∗ ∈ OK with α − α ∗ ∈ Z are called equivalent. Similarly, if f is a solution
of (10.6.2), then so is f ∗ (X) = f (X + a) for every a ∈ Z. As in Section 10.5,
such polynomials f , f ∗ are called equivalent. The minimal polynomials of
equivalent α, α ∗ from OK are obviously equivalent.
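That equivalent polynomials have the same discriminant can be checked numerically. The following sketch computes discriminants exactly through the standard formula D(f ) = (−1)^{n(n−1)/2} Res(f, f ′)/a_n, with the resultant evaluated as the determinant of the Sylvester matrix. Polynomials are represented as coefficient lists with the leading coefficient first; this representation and all function names are choices made for the illustration.

```python
from fractions import Fraction
from math import comb

def derivative(p):
    """Coefficients (leading first) of the derivative of p."""
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]

def determinant(rows):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in rows]
    n, det = len(a), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def resultant(p, q):
    """Resultant of p and q as the determinant of their Sylvester matrix."""
    m, n = len(p) - 1, len(q) - 1
    rows = [[0] * i + list(p) + [0] * (n - 1 - i) for i in range(n)]
    rows += [[0] * i + list(q) + [0] * (m - 1 - i) for i in range(m)]
    return determinant(rows)

def discriminant(p):
    """D(f) = (-1)^(n(n-1)/2) * Res(f, f') / a_n."""
    n = len(p) - 1
    sign = -1 if (n * (n - 1) // 2) % 2 else 1
    return sign * resultant(p, derivative(p)) / p[0]

def translate(p, a):
    """Coefficients of f(X + a), obtained from the binomial theorem."""
    n = len(p) - 1
    out = [0] * (n + 1)
    for i, c in enumerate(p):
        k = n - i                      # this term is c * (X + a)^k
        for j in range(k + 1):
            out[n - j] += c * comb(k, j) * a ** (k - j)
    return out

f = [1, 0, 0, -2]                      # X^3 - 2
print(discriminant(f), discriminant(translate(f, 5)))
```

For f = X^3 − 2 both values are −108, in accordance with the remark that f (X) and f (X + a) satisfy (10.6.2) simultaneously.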
In the quadratic case, when in (10.6.1) K is a quadratic number field and in
(10.6.2) the polynomials f are quadratic, the solutions of the above equations
can be easily found. Delone (1930) and Nagell (1930) proved independently
of each other that up to equivalence, there are only finitely many irreducible
monic polynomials f ∈ Z[X] of degree 3 for which (10.6.2) holds. This implies
that for a cubic number field K, equation (10.6.1) has also only finitely many
equivalence classes of solutions. In the quartic case, the same assertions were
obtained later by Nagell (1967, 1968a). The proofs of Delone and Nagell are
ineffective. Nagell (1967) conjectured that the finiteness assertion concerning
equation (10.6.1) is true for every number field K.
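The quadratic case mentioned above can be made explicit by a short computation. It is given here only as an illustration, under the assumption that K is a quadratic number field with discriminant DK and that ω := (DK + √DK )/2, so that OK = Z[ω]; the symbols a, b, ω and the prime denoting conjugation are introduced for this sketch.

```latex
\begin{align*}
\alpha &= a + b\omega \quad (a, b \in \mathbb{Z},\ b \neq 0), \\
\alpha - \alpha' &= b(\omega - \omega') = b\sqrt{D_K}, \\
D_{K/\mathbb{Q}}(\alpha) &= (\alpha - \alpha')^2 = b^2 D_K .
\end{align*}
```

Hence, for such α, equation (10.6.1) amounts to b^2 DK = D. It is solvable only if D/DK is the square of a non-zero integer b, and then every solution is equivalent to bω or to −bω.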
Let K be as above an algebraic number field, and denote by d and DK the
degree and discriminant of K. By repeatedly applying an earlier version of
Theorem 4.1.1, Győry (1973) proved the following general effective result.
Theorem 10.6.1 Every solution α of (10.6.1) is equivalent to a solution
α ∗ ∈ OK for which
h(α ∗ ) < C1 ,   (10.6.3)
where C1 is an effectively computable number depending only on d, DK and
D.
This implies that there are only finitely many pairwise inequivalent elements
in OK with discriminant D, and a full set of representatives of such elements
can be, at least in principle, effectively determined. This finiteness assertion
was proved independently in an ineffective form in Birch and Merriman (1972).
In view of Minkowski’s inequality (1.5.4), the degree d of K can be estimated
from above in terms of |DK |. Further, if (10.6.1) is solvable then DK divides
D. Hence, in (10.6.3), the dependence of the bound on d and DK , and hence
on K can be dropped; see Győry (1973).
Proof of Theorem 10.6.1 (sketch). We reduce (10.6.1) to a system of unit equations. Let G denote the normal closure of K/Q, let g be its degree over Q,
and let α (1) = α, α (2) , . . . , α (d) be the conjugates of α with respect to K/Q. If
d ≥ 3 then
\[ \frac{\alpha^{(i)} - \alpha^{(2)}}{\alpha^{(1)} - \alpha^{(2)}} + \frac{\alpha^{(1)} - \alpha^{(i)}}{\alpha^{(1)} - \alpha^{(2)}} = 1 \quad \text{for } i = 3, \ldots, d. \tag{10.6.4} \]
Further, the numbers α (1) − α (2) , α (1) − α (i) and α (i) − α (2) divide D in the
ring of integers of G. Hence Proposition 4.3.12 implies that apart from
some unknown unit factors, the heights of these differences can be effectively bounded above. Thus, equation (10.6.4) reduces indeed to finitely many
unit equations in two unknowns in G. Finally, by Theorem 4.1.1 the heights
of α (1) − α (i) , α (2) − α (i) and so α (i) − α (j ) can be effectively estimated from
above up to the common factor α (1) − α (2) whose height can be effectively
bounded above from (10.6.1) in terms of D, g and the class number and regulator of G. Since these parameters of G can be estimated from above in terms
of d, DK and D, Theorem 10.6.1 follows.
Theorem 10.6.1 is in fact a consequence of Theorem 10.5.1. Let
\[ A = \{\alpha^{(1)}, \ldots, \alpha^{(d)}\}, \qquad N = |D|^{g}. \]
Then A is a subset of the ring of integers of G, and (10.6.1) gives
\[ |N_{G/\mathbb{Q}}(\alpha^{(i)} - \alpha^{(j)})| \le N \quad \text{for } 1 \le i < j \le d. \]
Hence the graph G(A, N ) defined in Section 10.5 consists of isolated vertices.
Thus Theorem 10.5.1 applies and the heights of the differences α (i) − α (j ) can
be effectively bounded above apart from a common unit factor ε in G, while
the height of ε can be estimated from above from (10.6.1).
As was mentioned above, one may assume that in (10.6.1) DK divides D.
Let ω denote the number of distinct prime factors of the quotient D/DK . Using
Theorem 6.1.4 concerning unit equations, one can prove the following theorem,
as a special case of a more general result of Evertse and Győry from their book
on discriminant equations.
Theorem 10.6.2 Equation (10.6.1) has at most
\[ 2^{5d^2(\omega+1)} \]
equivalence classes of solutions.
The first, weaker version of this type was proved in Evertse and Győry
(1985).
Concerning equation (10.6.2), Delone and Faddeev (1940) posed the problem of giving an algorithm for finding all cubic monic polynomials with integer
coefficients and given non-zero discriminant. In 1973, Győry (1973) proved
the following general theorem.
Theorem 10.6.3 Every solution f ∈ Z[X] of (10.6.2) is equivalent to a solution f ∗ ∈ Z[X] for which
deg(f ∗ ) ≤ C2 ,   H (f ∗ ) ≤ C3 ,   (10.6.5)
where H (f ∗ ) denotes the maximum of the absolute values of the coefficients of
f ∗ and C2 , C3 are effectively computable numbers depending only on D.
This makes it possible, at least in principle, to determine all monic polynomials in Z[X] with given non-zero discriminant.
Later, several quantitative versions of Theorems 10.6.1 and 10.6.3, and
generalizations for S-integers and for polynomials with S-integral coefficients
in number fields, were established by Győry. References and the best known
values for C1 and C3 are given in our book on discriminant equations. The best
possible upper bound C2 can be found in Győry (1974).
For irreducible polynomials f ∈ Z[X], Theorem 10.6.1 implies Theorem
10.6.3. The “reducible” case can be reduced to the “irreducible” one by means
of the relation
\[ D(f) = \prod_{i=1}^{k} D(f_i) \cdot \Bigl( \prod_{1 \le i < j \le k} R(f_i, f_j) \Bigr)^{2}, \]
where f = \prod_{i=1}^{k} f_i is the irreducible factorization of f in Z[X] and R(fi , fj )
denotes the resultant of fi and fj . Another option is to apply Theorem 10.5.1
to equation (10.6.2) as in the proof of Theorem 10.6.1, and then estimate in the
bound obtained for H (f ∗ ) the parameters involved in terms of D. An upper
bound can also be derived for deg (f ∗ ) by means of Theorem 10.5.2.
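As a small check of the relation between D(f ), the discriminants D(fi ) and the resultants R(fi , fj ), consider the polynomial f = X^4 − 5X^2 + 6 = (X^2 − 2)(X^2 − 3). This example is not taken from the works cited; it is included only to make the formula concrete.

```latex
\begin{align*}
D(X^2 - 2) &= 8, \qquad D(X^2 - 3) = 12, \\
R(X^2 - 2,\ X^2 - 3) &= \prod_{\theta^2 = 2} (\theta^2 - 3) = (-1)(-1) = 1, \\
D(f) &= 8 \cdot 12 \cdot 1^2 = 96 .
\end{align*}
```

The same value is obtained by evaluating the product of the squared differences of the four zeros ±√2, ±√3 directly.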
We present some consequences of Theorems 10.6.1 and 10.6.2. For other
applications, for example to discriminant form and index form equations, we
refer to Győry (1976, 1980b), Evertse and Győry (1988a) and our book on
discriminant equations.
As is known, there exist algebraic number fields K having power integral
bases (i.e. integral bases of the form {1, α, . . . , α^{d−1} } where d = [K : Q]),
but this is not the case in general. The existence of such a basis considerably
facilitates the calculations in K and the study of arithmetical properties of OK ,
the ring of integers of K.
More generally, we consider orders in K; these are the subrings of OK
whose quotient field is K. There are infinitely many orders in K, and OK is
the maximal one among them. The order O in K is said to be monogenic
if O = Z[α] for some α ∈ O. Equivalently, in this case {1, α, . . . , α^{d−1} } is a
Z-module basis of O, where d = [K : Q]. In particular, the number field K is
called monogenic if OK is monogenic, that is, if K has a power integral basis.
It is known that α ∈ O generates O if and only if DK/Q (α) = DO , where
DO denotes the discriminant of O. If α is a generator of O then so are all
α ∗ ∈ O which are equivalent to α. Choosing D = DO , and using the fact that
DK divides DO , Theorem 10.6.1 gives at once the following corollary, see
Győry (1976):
Corollary 10.6.4 If O = Z[α] for some α ∈ O, then there is an α ∗ ∈ O which
is equivalent to α such that
h(α ∗ ) < C4 ,
where C4 is an effectively computable number depending only on d and DO .
In the special case O = OK , we get immediately the following consequence,
already obtained in Győry (1976).
Corollary 10.6.5 If {1, α, . . . , α^{d−1} } is an integral basis of K, then there is an
α ∗ ∈ OK which is equivalent to α such that
h(α ∗ ) < C5 ,
where C5 is an effectively computable number depending only on d and DK .
Thus, up to equivalence, there are only finitely many elements in OK and,
more generally in O, which generate a power integral basis and they can be,
at least in principle, effectively determined. Combining this effective approach
with some reduction procedures, all power integral bases have been determined
in many number fields of relatively small degree; see e.g. Gaál (2002), Bilu,
Gaál and Győry (2004) and our book on discriminant equations.
An immediate consequence of Theorem 10.6.2 is the following.
Corollary 10.6.6 Let O be an order in K. Up to equivalence, there are at
most 2^{5d^2} elements α ∈ O such that O = Z[α].
In particular, the same assertion is true for OK .
An order O in K is said to be k times monogenic if there are at least k distinct
equivalence classes of α satisfying O = Z[α]. The following result was proved
by Bérczes, Evertse and Győry (2013).
Theorem 10.6.7 Let K be an algebraic number field of degree ≥ 3. Then there
are at most finitely many three times monogenic orders in K.
The bound 3 is best possible, that is there are number fields having infinitely
many two times monogenic orders. The proof of Theorem 10.6.7 depends
on earlier, qualitative versions of Corollary 6.1.5 and Theorem 6.1.6 on unit
equations.
A non-zero element α in an order O of an algebraic number field K is called
a basis of a canonical number system (or CNS basis) for O if every non-zero
element of O can be represented in the form
a0 + a1 α + · · · + am α m
with m ≥ 0, ai ∈ {0, 1, . . . , |NK/Q (α)| − 1} for i = 0, . . . , m and am ≠ 0.
Canonical number systems can be viewed as natural generalizations of radix
representations of rational integers to algebraic integers.
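A classical example is the base α = −1 + i in the order Z[i] of Gaussian integers, which has norm 2 and therefore digit set {0, 1}. The sketch below computes such expansions by repeated division with remainder; it relies on the well-known fact that −1 + i is a CNS basis for Z[i], which guarantees termination. Gaussian integers are represented as pairs (x, y) standing for x + yi, and the function names are chosen for the illustration.

```python
def to_cns(x, y):
    """Digits (least significant first) of x + yi in base -1 + i with digits 0, 1."""
    digits = []
    while (x, y) != (0, 0):
        # x + yi is divisible by -1 + i exactly when x + y is even
        d = (x + y) % 2
        x -= d
        # (x + yi) / (-1 + i) = (x + yi)(-1 - i) / 2 = ((y - x) + (-x - y) i) / 2
        x, y = (y - x) // 2, (-x - y) // 2
        digits.append(d)
    return digits

def evaluate(digits):
    """Horner evaluation of sum digits[k] * (-1 + i)^k, returned as (real, imaginary)."""
    x, y = 0, 0
    for d in reversed(digits):
        # (x + yi)(-1 + i) = (-x - y) + (x - y) i
        x, y = -x - y + d, x - y
    return x, y

print(to_cns(-1, 0))
print(evaluate(to_cns(3, -4)))
```

For instance, −1 = 1 + α^2 + α^3 + α^4 with α = −1 + i, so the digit list for −1 is 1, 0, 1, 1, 1 when read from the lowest power upwards.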
If there exists a canonical number system in O, then O is called a CNS
order. Orders of this kind have been intensively investigated; we refer to the
survey paper Brunotte, Huszti and Pethő (2006) and the references given there.
It was proved in Kovács (1981) and Kovács and Pethő (1991) that O is
a CNS order if and only if O is monogenic. More precisely, if α is a CNS
basis in O, then it is easy to see that O = Z[α]. Conversely, O = Z[α] does
not imply in general that α is a CNS basis. However, in this case there are
infinitely many α′ which are equivalent to α such that α′ is a CNS basis for O.
A characterization of CNS bases in O is given in Kovács and Pethő (1991).
The close connection between elements α of O with O = Z[α] and CNS
bases in O enables one to apply results concerning monogenic orders to CNS
orders and CNS bases. For example, it follows from Corollary 10.6.4 that up
to equivalence there are only finitely many canonical number systems in O.
We say that O is a k times CNS order if there are at least k pairwise
inequivalent CNS bases in O. Theorem 10.6.7 implies the following result, see
also Bérczes, Evertse and Győry (2013).
Corollary 10.6.8 Let K be an algebraic number field of degree ≥ 3. Then
there are at most finitely many three times CNS orders in K.
10.7 Binary forms of given discriminant
Let F = a0 X^n + a1 X^{n−1} Y + · · · + an Y^n be a binary form of degree n ≥ 2 with
coefficients in a field K. We can factor F over an algebraic closure of K as
\[ F = \prod_{i=1}^{n} (\alpha_i X - \beta_i Y); \tag{10.7.1} \]
then the discriminant of F is given by
\[ D(F) = \prod_{1 \le i < j \le n} (\alpha_i \beta_j - \alpha_j \beta_i)^2 . \]
We can express D(F ) otherwise as a homogeneous polynomial of degree 2n − 2
in Z[a0 , . . . , an ]. Define the binary form FU by
\[ F_U(X, Y) = F(aX + bY,\ cX + dY) \quad \text{for } U = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in GL(2, K). \]
Then we have
\[ D(\lambda F_U) = \lambda^{2n-2} (\det U)^{n(n-1)} D(F) \quad \text{for } \lambda \in K^*,\ U \in GL(2, K). \tag{10.7.2} \]
Given a subring A of K, we say that two binary forms F, G ∈ A[X, Y ] are
GL(2, A)-equivalent if there are a unit u ∈ A∗ and U ∈ GL(2, A) such that
G = uFU . By (10.7.2), two GL(2, A)-equivalent binary forms have, up to
multiplication with a unit from A, the same discriminant.
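The transformation rule for discriminants can be verified on examples by working directly with the linear factors of a form. The sketch below assumes that F splits over Z, so that F is given as a list of integer pairs (αi , βi ) representing the factors αi X − βi Y ; the substitution (X, Y ) → (aX + bY, cX + dY ) sends such a factor to (αi a − βi c)X − (βi d − αi b)Y . The function names are hypothetical.

```python
from itertools import combinations
from math import prod

def discriminant(factors):
    """D(F) for F = prod (alpha_i X - beta_i Y), given as a list of (alpha_i, beta_i)."""
    return prod((a1 * b2 - a2 * b1) ** 2
                for (a1, b1), (a2, b2) in combinations(factors, 2))

def act(factors, U):
    """Linear factors of F_U(X, Y) = F(aX + bY, cX + dY) for U = ((a, b), (c, d))."""
    (a, b), (c, d) = U
    # alpha (aX + bY) - beta (cX + dY) = (alpha a - beta c) X - (beta d - alpha b) Y
    return [(al * a - be * c, be * d - al * b) for al, be in factors]

def scale(factors, lam):
    """Linear factors of lam * F, obtained by scaling the first factor."""
    (al, be), rest = factors[0], factors[1:]
    return [(lam * al, lam * be)] + rest

F = [(1, 0), (0, -1), (1, 1), (1, 4)]     # X * Y * (X - Y) * (X - 4Y)
U = ((3, 1), (2, 5))                      # det U = 13
n, lam = len(F), 3
lhs = discriminant(scale(act(F, U), lam))
rhs = lam ** (2 * n - 2) * 13 ** (n * (n - 1)) * discriminant(F)
print(discriminant(F), lhs == rhs)
```

Here D(F ) = 144, and the discriminant of 3 · FU equals 3^6 · 13^12 · 144, as predicted by (10.7.2) with n = 4.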
We now restrict ourselves to binary forms with coefficients in Z. By
(10.7.2), two GL(2, Z)-equivalent binary forms have the same discriminant, since here u = ±1, det U = ±1 and the exponents 2n − 2 and n(n − 1) are even.
We have the following fundamental theorem.
Theorem 10.7.1 Let n, D be integers with n ≥ 2 and D ≠ 0. Then there are
only finitely many GL(2, Z)-equivalence classes of binary forms F ∈ Z[X, Y ]
of degree n and discriminant D.
For n = 2 this is a classical theorem of Lagrange (1773) and for n = 3 a
classical theorem of Hermite (1851). For n ≥ 4 this was proved only in 1972
by Birch and Merriman (Birch and Merriman (1972), Theorem 2). The proofs
of Lagrange and Hermite are effective, while that of Birch and Merriman is
ineffective.
Proof of Birch and Merriman (sketch). We give a brief sketch of the proof of
Birch and Merriman, explaining at which point it fails to be effective. Take a
binary form F ∈ Z[X, Y ] of degree n ≥ 4 and discriminant D ≠ 0. The discriminant of the splitting field of F can be estimated from above in terms of D,
and by the Hermite–Minkowski Theorem, this leaves only a finite, effectively
determinable collection of possible splitting fields for F . So we may restrict
ourselves to binary forms F with given splitting field L, say. Let H denote the
Hilbert class field of L, and let S be a finite set of places of H such that D ∈ OS∗ .
Then F can be factored as in (10.7.1) with αi , βi ∈ OS and αi βj − αj βi ∈ OS∗
for 1 ≤ i < j ≤ n. There is a matrix U ∈ GL(2, OS ) such that
FU = εXY (X − Y )(X − γ3 Y ) · · · (X − γn Y )
with ε ∈ OS∗ , γ3 , . . . , γn ∈ OS . Further, by (10.7.2) and the definition of the discriminant,
\[ D(F_U) = \varepsilon^{2n-2} \prod_{i=3}^{n} \bigl(\gamma_i (1 - \gamma_i)\bigr)^2 \prod_{3 \le i < j \le n} (\gamma_i - \gamma_j)^2 \in O_S^* , \]
which implies that γi , 1 − γi ∈ OS∗ for i = 3, . . . , n. In this way, the problem of
finding the binary forms F of given discriminant reduces to an S-unit equation
in two unknowns x + y = 1 in x, y ∈ OS∗ .
Now by an effective finiteness result for such equations such as Theorem
4.1.3, one can show that there are only finitely many possibilities for γ3 , . . . , γn
that can be determined effectively. This shows that the binary forms F ∈
Z[X, Y ] of degree n and discriminant D lie in only finitely many GL(2, OS )-equivalence classes. The final step of the proof of Birch and Merriman is to
show that the binary forms in Z[X, Y ] of discriminant D in a given GL(2, OS )-equivalence class lie in only finitely many GL(2, Z)-equivalence classes. At
this point, the argument of Birch and Merriman is ineffective, since it does not
give an effective procedure to check whether a given GL(2, OS )-equivalence
class contains a binary form from Z[X, Y ].
Evertse and Győry (1991) managed to give an effective version of the result
of Birch and Merriman. The following is a less precise version of Theorem
1 from their paper. Given a binary form F ∈ Z[X, Y ], denote by H (F ) the
maximum of the absolute values of the coefficients of F .
Theorem 10.7.2 Let n, D be integers with n ≥ 2 and D ≠ 0. Then there is
an effectively computable number C1 , depending only on n and D, such that
for every binary form F ∈ Z[X, Y ] of degree n and discriminant D there is
U ∈ GL(2, Z) such that H (FU ) ≤ C1 .
Proof (sketch). We give only the main idea of the proof. We may again restrict
ourselves to binary forms with given splitting field L. Let F ∈ Z[X, Y ] be
a binary form of degree n and discriminant D with splitting field L. Take
a factorization of F as in (10.7.1). After multiplying F by a small integer
(effectively bounded in terms of L), we may assume that F has a factorization
as in (10.7.1) with αi , βi ∈ OL for i = 1, . . . , n. Put Δij := αi βj − αj βi for
1 ≤ i, j ≤ n. Then for any quadruple i, j, k, l of distinct indices we have the
identity
Δij Δkl + Δjk Δil + Δki Δjl = 0.   (10.7.3)
Notice that all terms Δij are in OL and divide D; hence |NL/Q (Δij )| ≤ |D|^{[L:Q]}
for all i, j . Using Proposition 4.3.12 we can express each term Δij as a product
of an element of height effectively bounded in terms of n, D, L and an element
of OL∗ . By substituting this into the identities (10.7.3) we obtain homogeneous
unit equations as in Theorem 4.1.1. By applying the latter, we obtain effective
upper bounds for the heights of the quotients Δij Δkl /(Δik Δjl ). We have some
freedom to choose the αi , βi in (10.7.1). By doing this in an appropriate way, we
can deduce in fact effective upper bounds for the heights of the numbers Δij
themselves. Then, with the help of an argument from the geometry of numbers,
one can construct a matrix U ∈ GL(2, Z) as in Theorem 10.7.2.
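The passage from the identity (10.7.3) to unit equations can be written out as follows. Here Δij stands for αi βj − αj βi , and the symbols δij , εij are notation introduced only for this sketch: each Δij is written as δij εij with δij of effectively bounded height and εij ∈ OL∗ . All Δij with i ≠ j are non-zero because D ≠ 0.

```latex
\begin{align*}
&\Delta_{ij}\Delta_{kl} + \Delta_{jk}\Delta_{il} + \Delta_{ki}\Delta_{jl} = 0,
  \qquad \Delta_{ki} = -\Delta_{ik}, \\
&\frac{\Delta_{ij}\Delta_{kl}}{\Delta_{ik}\Delta_{jl}}
  + \frac{\Delta_{jk}\Delta_{il}}{\Delta_{ik}\Delta_{jl}} = 1, \\
&a x + b y = 1 \quad \text{with} \quad
  a = \frac{\delta_{ij}\delta_{kl}}{\delta_{ik}\delta_{jl}},\quad
  b = \frac{\delta_{jk}\delta_{il}}{\delta_{ik}\delta_{jl}},\quad
  x = \frac{\varepsilon_{ij}\varepsilon_{kl}}{\varepsilon_{ik}\varepsilon_{jl}},\quad
  y = \frac{\varepsilon_{jk}\varepsilon_{il}}{\varepsilon_{ik}\varepsilon_{jl}} .
\end{align*}
```

The coefficients a, b have effectively bounded heights and x, y are units of OL , so an effective result on unit equations in two unknowns bounds the heights of x and y, and hence of the two quotients on the second line.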
In our book Discriminant Equations in Diophantine Number Theory we give
a complete proof of Theorem 10.7.2, with the explicit value
\[ C_1 = \exp\bigl( (16 n^3)^{25 n^2} |D|^{5n-3} \bigr). \]
It is possible to give a semi-effective version of Theorem 10.7.2 in which the
bound C1 has a much better dependence on D, but depends ineffectively
on the splitting field of the binary form F . The following result is Theorem 1
of Evertse (1993).
Theorem 10.7.3 Let F ∈ Z[X, Y ] be a binary form of degree n ≥ 4 and
of discriminant D ≠ 0. Assume that F has splitting field L. Then there is
U ∈ GL(2, Z) such that
\[ H(F_U) \le C^{\mathrm{ineff}}(n, L)\, |D|^{21/(n-1)} . \]
Here, C^{ineff} (n, L) is a number, not effectively computable from the proof, that
depends only on n and L.
Proof (sketch). The proof is similar to that of Theorem 10.7.2 but one has to
apply Theorem 6.1.1 with n = 2 to (10.7.3). Some precise combinatorics is
needed to get an exponent O(1/n) on |D|.
It is possible to give explicit upper bounds for the number of GL(2, Z)-equivalence classes of binary forms, under certain additional constraints. Although
it is possible to treat reducible binary forms as well, we restrict ourselves to
binary forms that are irreducible over Q.
Let F = a0 X^n + a1 X^{n−1} Y + · · · + an Y^n ∈ Z[X, Y ] be a binary form of
degree n ≥ 2. We say that F is associated with a number field K if F is
irreducible over Q, and there is α with F (α, 1) = 0, K = Q(α). This being the
case, we can factor F over K as
\[ F = (X - \alpha Y)\bigl( a_0 X^{n-1} + \omega_1 X^{n-2} Y + \cdots + \omega_{n-1} Y^{n-1} \bigr), \]
where ω1 , . . . , ωn−1 ∈ K. Denote by OF the Z-module generated by the numbers 1, ω1 , . . . , ωn−1 . We call OF the invariant order of F . This naming is
motivated by work of Simon (2001), who showed that OF is in fact an order in
K, i.e., a subring of K of rank n as a Z-module, and that GL(2, Z)-equivalent
binary forms have isomorphic invariant orders. Of course $O_F$ depends on the
choice of K and α, but it is unique up to Z-algebra isomorphism. It is not hard
to show that for the discriminant of $O_F$, i.e., $D_{K/\mathbb{Q}}(1, \omega_1, \ldots, \omega_{n-1})$, we have
\[ D(O_F) = D(F). \]
This implies
\[ D(F) = c^2 D_K, \qquad (10.7.4) \]
where $c = [O_K : O_F]$. The following result is a less precise form of Corollary
2.2 of Bérczes, Evertse and Győry (2004).
Theorem 10.7.4 Let K be a number field of degree n ≥ 2 and c a positive
integer. Then for every ε > 0, the number of GL(2, Z)-equivalence classes
of irreducible binary forms F ∈ Z[X, Y] that are associated with K and satisfy
(10.7.4) is
\[ \ll c^{(2/n(n-1))+\varepsilon}, \]
where the implied constant is effectively computable and depends only on n
and ε.
It is shown in Bérczes, Evertse and Győry (2004) that the bound in Theorem
10.7.4 cannot be replaced by one of order $c^{\alpha}$ with $\alpha < \frac{2}{n(n-1)}$.
We subdivide the irreducible binary forms with (10.7.4) further and consider
binary forms with given invariant order. By a result of Delone and Faddeev
(1940), section 15, for every cubic number field K and every order O in K,
there is precisely one GL(2, Z)-equivalence class of cubic forms F ∈ Z[X, Y ]
such that $O_F \cong O$. On the other hand, in his paper referred to above, Simon
proved that for every n ≥ 4 there are number fields K of degree n such that $O_K$
is not the invariant order of a binary form. The following result is Corollary 2.1
of Bérczes, Evertse and Győry (2004).
Theorem 10.7.5 Let O be an order in a number field K of degree n ≥ 4. Then
there are at most $2^{24n^3}$ GL(2, Z)-equivalence classes of irreducible binary forms
F ∈ Z[X, Y] such that $O_F \cong O$.
In our book Discriminant Equations in Diophantine Number Theory the bound
$2^{24n^3}$ is improved to $2^{5n^2}$. One can define more generally the invariant order
of a reducible binary form of degree n, which is an order of rank n, i.e. a
commutative ring which as a Z-module is free of rank n. When F has non-zero
discriminant, its invariant order has no nilpotents. In our book on discriminant
equations, we have proved a generalization of Theorem 10.7.5 where O is a
given nilpotent-free order of rank n.
We mention that the proofs of Theorems 10.7.4 and 10.7.5 both use Theorem
6.1.4 (the result of Beukers and Schlickewei).
We finally remark that the papers Evertse and Győry (1991), Evertse (1993)
and Bérczes, Evertse and Győry (2004), as well as our book on discriminant
equations, contain proofs of generalizations of Theorems 10.7.2–10.7.5 for
binary forms with S-integral coefficients in number fields. A further generalization of Theorem 10.7.2 is given in Evertse and Győry (1992a, 1992b) for
decomposable forms of given discriminant. See also our book on discriminant
equations.
10.8 Resultant equations for monic polynomials
Recall that the resultant of two monic polynomials
\[ f = \prod_{i=1}^{m}(X - \alpha_i), \qquad g = \prod_{i=m+1}^{m+n}(X - \alpha_i) \]
is given by
\[ R(f, g) = \prod_{\substack{1 \le i \le m \\ m+1 \le j \le m+n}} (\alpha_i - \alpha_j) \]
and that R(f, g) is a polynomial with integer coefficients in terms of the
coefficients of f and g.
Let K be an algebraic number field, and consider the resultant equation
\[ R(f, g) = R \quad \text{in monic } f, g \in \mathbb{Z}[X] \text{ having all their zeros in } K, \qquad (10.8.1) \]
where R is a non-zero rational integer. If f, g is a solution of (10.8.1) then so
is
\[ f^*(X) = f(X + a), \qquad g^*(X) = g(X + a) \]
for all a ∈ Z. Such pairs of polynomials f, g and f*, g* are called equivalent.
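The invariance of the resultant under the shift X ↦ X + a can be checked numerically in the simplest case K = Q. The following Python sketch is only an illustration and not part of the original argument: it assumes that f and g are given by lists of integer zeros, and the helper names `resultant_from_roots` and `shift` are chosen here for the example.

```python
from math import prod

def resultant_from_roots(f_roots, g_roots):
    """R(f, g) as the product of (alpha_i - alpha_j), alpha_i a zero of f, alpha_j a zero of g."""
    return prod(a - b for a in f_roots for b in g_roots)

def shift(roots, a):
    """Zeros of f(X + a): every zero of f translated by -a."""
    return [r - a for r in roots]

# Assumed example data: f = X(X - 3), g = (X - 1)(X - 2)(X - 5).
f_roots = [0, 3]
g_roots = [1, 2, 5]

R = resultant_from_roots(f_roots, g_roots)
shifted = [
    resultant_from_roots(shift(f_roots, a), shift(g_roots, a))
    for a in range(-3, 4)
]
print(R, shifted)
```

All differences of zeros are unchanged by the shift, which is why every entry of `shifted` agrees with `R`.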
The following result was obtained in Győry (1990).
Theorem 10.8.1 There are only finitely many equivalence classes of pairs f ,
g with deg (f ) ≥ 2, deg (g) ≥ 2 and deg (f ) + deg (g) ≥ 5, without multiple
zeros, such that (10.8.1) holds.
We note that the assumptions concerning the degrees of f and g are necessary. Further, the condition that the zeros of f and g are contained in a
fixed number field cannot be dropped. However, the restriction concerning the
multiplicity of the zeros can be weakened, see Győry (1993c, 2008b).
Győry (1990, 1993c, 2008b) and Bérczes, Evertse and Győry (2007a)
obtained quantitative versions of Theorem 10.8.1 which provide upper bounds
for the degrees of f and g and for the number of equivalence classes of pairs
f , g under consideration. For example, it is proved in Győry (1990) that, in
Theorem 10.8.1,
\[ \deg(f) + \deg(g) \le 12 \cdot 7^{3d+2\omega}, \]
where d is the degree of K over Q, and ω denotes the number of distinct prime
factors of R.
It should be remarked that Theorem 10.8.1 is established in Győry (1990)
in the more general case when the ground ring is any integrally closed integral
domain of characteristic 0 which is finitely generated over Z.
Proof of Theorem 10.8.1 (sketch). We reduce equation (10.8.1) to unit equations. The basic idea is as follows. Let f , g be a solution of (10.8.1) with
deg(f ) = m ≥ 2, deg(g) = n ≥ 2, m + n ≥ 5,
and let $\{\alpha_1, \ldots, \alpha_m\}$, $\{\alpha_{m+1}, \ldots, \alpha_{m+n}\}$ be the zeros of f and g in K. Since f,
g are monic, these zeros are contained in $O_K$, the ring of integers of K, and by
assumption they are distinct. Then (10.8.1) can be written in the form
\[ \prod_{\substack{1 \le i \le m \\ m+1 \le j \le m+n}} (\alpha_i - \alpha_j) = R. \qquad (10.8.2) \]
The differences $\alpha_i - \alpha_j$ divide R in $O_K$. Hence taking norms, we infer that
\[ |N_{K/\mathbb{Q}}(\alpha_i - \alpha_j)| \le N \quad \text{for each } i, j, \qquad (10.8.3) \]
where $N = |R|^d$. By Proposition 4.3.12, $\alpha_i - \alpha_j$ may take only finitely many
values up to a unit factor from $O_K$. There exist several linear relations among
these differences, for example
(αi − αj ) + (αj − αk ) + (αk − αl ) = αi − αl ,
with 1 ≤ i, k ≤ m, m + 1 ≤ j, l ≤ m + n. This leads to inhomogeneous unit
equations in three unknowns. We arrive in this way at a complicated system of
unit equations. However, in contrast with the case of discriminant equations, in
this situation we get unit equations in more than two unknowns. Thus one has
to apply the ineffective Corollary 6.1.2 or Theorem 6.1.3 to obtain Theorem
10.8.1. Therefore, Theorem 10.8.1 is ineffective.
It is simpler to deduce Theorem 10.8.1 from Theorem 10.5.2. We recall, however, that the proof of Theorem 10.5.2 is also based on the results concerning
unit equations mentioned in Chapter 6. Consider the graph G(A) = G(A, N),
where A = {α1 , . . . , αm , . . . , αm+n }. Using the above notation, it follows from
(10.8.2) and (10.8.3) that G(A) has either at least three connected components
or two connected components of order at least 2. Hence Theorem 10.5.2 implies
that m + n is bounded. Further, for fixed m and n, we have
\[ \alpha_i = \varepsilon\alpha_i' + \beta, \qquad i = 1, \ldots, m + n, \]
with some $\varepsilon \in O_K^*$, $\beta \in O_K$ and with $\alpha_1', \ldots, \alpha_{m+n}' \in O_K$ which may take only
finitely many values. This gives
\[ \alpha_i - \alpha_j = \varepsilon(\alpha_i' - \alpha_j'), \qquad 1 \le i \le m, \quad m + 1 \le j \le m + n. \qquad (10.8.4) \]
We see from (10.8.2) and (10.8.4) that for fixed $\alpha_1', \ldots, \alpha_{m+n}'$, $\varepsilon^{mn}$ is also
fixed, that is, ε can assume only finitely many values. Finally, one can infer
that $\alpha_i = \alpha_i^* + a$ with some a ∈ Z and with finitely many possible $\alpha_i^* \in O_K$,
i = 1, . . . , m + n, whence Theorem 10.8.1 follows.
10.9 Resultant inequalities and equations for binary forms
We keep the notation introduced in Section 10.7. Let $F = \sum_{i=0}^{m} a_iX^{m-i}Y^i$,
$G = \sum_{i=0}^{n} b_iX^{n-i}Y^i$ be two binary forms with coefficients in a field K. Assume
that over an algebraic closure of K, the forms F, G factor as
\[ F = \prod_{i=1}^{m}(\alpha_iX - \beta_iY), \qquad G = \prod_{j=1}^{n}(\gamma_jX - \delta_jY); \qquad (10.9.1) \]
then the resultant of F, G is given by
\[ R(F, G) = \prod_{i=1}^{m}\prod_{j=1}^{n}(\beta_i\gamma_j - \alpha_i\delta_j). \]
We can express R(F, G) otherwise as a polynomial with integer coefficients in
$a_0, \ldots, a_m, b_0, \ldots, b_n$, homogeneous of degree n in $a_0, \ldots, a_m$ and homogeneous of degree m in $b_0, \ldots, b_n$. Notice that for $\lambda, \mu \in K^*$ and U ∈ GL(2, K)
we have
\[ R(\lambda F_U, \mu G_U) = \lambda^n\mu^m(\det U)^{mn}R(F, G). \qquad (10.9.2) \]
Let A be a subring of K. We call two pairs of binary forms (F, G), (F', G')
with coefficients in A GL(2, A)-equivalent if $F' = u_1F_U$, $G' = u_2G_U$ for
some $u_1, u_2 \in A^*$ and U ∈ GL(2, A). By (10.9.2), GL(2, A)-equivalent pairs of
binary forms have, up to multiplication with a unit from $A^*$, the same resultant.
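The transformation rule (10.9.2) can be tested on a small integer example. The Python sketch below is an illustration under stated assumptions, not the book's method: forms are represented by lists of pairs (α, β) standing for the linear factors αX − βY, the convention $F_U(X, Y) = F(aX + bY, cX + dY)$ for U with rows (a, b), (c, d) is assumed, and the helper names `resultant`, `transform` and `scale` are hypothetical.

```python
from math import prod

def resultant(F, G):
    """Product of (beta_i*gamma_j - alpha_i*delta_j) over all pairs of linear factors."""
    return prod(b * c - a * d for (a, b) in F for (c, d) in G)

def transform(F, U):
    """Linear factors of F_U(X, Y) = F(aX + bY, cX + dY)."""
    (a, b), (c, d) = U
    # alpha*(aX + bY) - beta*(cX + dY) = (alpha*a - beta*c) X - (beta*d - alpha*b) Y
    return [(al * a - be * c, be * d - al * b) for (al, be) in F]

def scale(F, lam):
    """Multiply the form by lam by scaling its first linear factor."""
    (al, be), rest = F[0], F[1:]
    return [(lam * al, lam * be)] + rest

# Assumed example data: deg F = m = 3, deg G = n = 2.
F = [(1, 2), (3, -1), (2, 5)]
G = [(1, 1), (4, -3)]
U = ((2, 1), (3, 4))
lam, mu = 3, -2
m, n = len(F), len(G)
det_U = U[0][0] * U[1][1] - U[0][1] * U[1][0]

lhs = resultant(scale(transform(F, U), lam), scale(transform(G, U), mu))
rhs = lam**n * mu**m * det_U**(m * n) * resultant(F, G)
print(lhs == rhs)
```

Each factor $\beta_i\gamma_j - \alpha_i\delta_j$ is multiplied by det U under the substitution, which accounts for the exponent mn.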
We restrict ourselves to binary forms with coefficients in Z. We start with
formulating some results for resultant inequalities and then deduce some analogues for binary forms of some of the results from the previous section. By
$C_i^{\mathrm{ineff}}(\cdot)$ we denote positive numbers, depending on the parameters between the
parentheses, that are not effectively computable by the method of proof of the
theorem in which they appear. We call a binary form square-free if it is not
divisible by the square of a non-constant binary form.
Our first result, which is Theorem 1 of Evertse and Győry (1993), gives a
lower bound for the resultant of two binary forms in terms of their discriminants.
Theorem 10.9.1 Let L be a finite, normal extension of Q, and F, G ∈ Z[X, Y]
binary forms such that
\[ \deg F = m \ge 3, \quad \deg G = n \ge 3, \quad FG \text{ is square-free}, \quad FG \text{ has splitting field } L. \qquad (10.9.3) \]
Then
\[ |R(F, G)| \ge C_1^{\mathrm{ineff}}(m, n, L)\bigl(|D(F)|^{n/(m-1)}\,|D(G)|^{m/(n-1)}\bigr)^{1/18}. \]
It was shown in Evertse and Győry (1993) that the dependence on L in Theorem
10.9.1 is necessary, and that neither of the conditions m ≥ 3, n ≥ 3 can be
removed.
Proof (sketch). Let F, G ∈ Z[X, Y ] be binary forms as in the statement of
Theorem 10.9.1. After multiplying F, G by small integers bounded above in
terms of L which will not have an effect on our result, we may assume that F, G
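have factorizations as in (10.9.1) with $\alpha_i, \beta_i, \gamma_j, \delta_j \in O_L$ for i = 1, . . . , m,
j = 1, . . . , n. Put $\Delta_{ij} := \beta_i\gamma_j - \alpha_i\delta_j$ for i = 1, . . . , m, j = 1, . . . , n. Then
$\Delta_{ij} \in O_L$ for all i, j and $\prod_{i,j}\Delta_{ij} = R(F, G)$. Further, for all distinct i, j, k ∈
{1, . . . , m}, p, q, r ∈ {1, . . . , n} we have
\[
\begin{vmatrix}
\Delta_{ip} & \Delta_{iq} & \Delta_{ir} \\
\Delta_{jp} & \Delta_{jq} & \Delta_{jr} \\
\Delta_{kp} & \Delta_{kq} & \Delta_{kr}
\end{vmatrix}
= \Delta_{ip}\Delta_{jq}\Delta_{kr} + \Delta_{iq}\Delta_{jr}\Delta_{kp} + \Delta_{ir}\Delta_{jp}\Delta_{kq}
- \Delta_{iq}\Delta_{jp}\Delta_{kr} - \Delta_{ip}\Delta_{jr}\Delta_{kq} - \Delta_{ir}\Delta_{jq}\Delta_{kp} = 0. \qquad (10.9.4)
\]
Similarly as in the proof of Theorem 6.1.6, we consider all possible splittings of
(10.9.4) into minimal non-vanishing subsums, and then apply Theorem 6.1.1
to each of these minimal sums. This leads to lower bounds for the quantities
$|N_{L/\mathbb{Q}}(\Delta_{ip}\Delta_{jp}\Delta_{kp}\Delta_{iq}\Delta_{jq}\Delta_{kq}\Delta_{ir}\Delta_{jr}\Delta_{kr})|$ for all i, j, k, p, q, r. By taking the
product of these, the theorem follows.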
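The vanishing of the determinant in (10.9.4) reflects the fact that the matrix of the numbers $\beta_i\gamma_j - \alpha_i\delta_j$ has rank at most 2. The following Python sketch checks this on assumed integer data; the representation of the forms by pairs (α, β), (γ, δ) and the helper names `delta_matrix` and `det3` are choices made here for illustration only.

```python
from itertools import combinations

def delta_matrix(F, G):
    """Matrix with entries beta_i*gamma_j - alpha_i*delta_j."""
    return [[b * c - a * d for (c, d) in G] for (a, b) in F]

def det3(M):
    """3x3 determinant, expanded into the six products appearing in (10.9.4)."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * e * i + b * f * g + c * d * h - b * d * i - a * f * h - c * e * g

# Assumed example data: m = 4 linear factors for F, n = 3 for G.
F = [(1, 2), (3, -1), (2, 5), (1, -4)]
G = [(1, 1), (4, -3), (2, 7)]
D = delta_matrix(F, G)

minors = [
    det3([[D[i][p] for p in cols] for i in rows])
    for rows in combinations(range(len(F)), 3)
    for cols in combinations(range(len(G)), 3)
]
print(minors)
```

Every row of `D` is a combination of the two vectors $(\gamma_j)_j$ and $(\delta_j)_j$, so all 3 × 3 minors vanish.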
It is also possible to give a lower bound for |R(F, G)| in terms of the heights
of a pair of binary forms that is GL(2, Z)-equivalent to F, G. The following
result is Theorem 1 of Evertse (1998).
Theorem 10.9.2 Let F, G ∈ Z[X, Y] be binary forms with (10.9.3). Then
there is U ∈ GL(2, Z) such that
\[ |R(F, G)| \ge C_2^{\mathrm{ineff}}(m, n, L)\bigl(H(F_U)^n H(G_U)^m\bigr)^{1/718}. \]
Of course, the theorem does not hold without the matrix U , since by varying
the pairs (F, G) in a given GL(2, Z)-equivalence class, |R(F, G)| remains the
same, while H (F ), H (G) may become arbitrarily large.
Proof (sketch). Apply Theorem 10.9.1. According to Theorem 10.7.3, there is
U ∈ GL(2, Z) such that $H(G_U)$ is bounded above in terms of |D(G)|, and so
in terms of |R(F, G)|. Writing $F_U = \prod_{i=1}^{m}(\alpha_iX - \beta_iY)$, we get
\[ \prod_{i=1}^{m} G_U(\alpha_i, \beta_i) = \pm R(F_U, G_U) = \pm R(F, G). \]
Thus, for i = 1, . . . , m, the number $G_U(\alpha_i, \beta_i)$ divides R(F, G) in $O_L$. We
may view the pairs $(\alpha_i, \beta_i)$ as solutions to a Thue equation over $O_L$. This leads
to upper bounds for the heights of the $\alpha_i, \beta_i$, and hence for $H(F_U)$, in terms
of $H(G_U)$ and |R(F, G)|. Thus, both $H(F_U)$, $H(G_U)$ can be estimated from
above in terms of R(F, G). A precise computation gives Theorem 10.9.2.
We deduce some consequences. The first is an analogue of Theorem 10.8.1.
Corollary 10.9.3 Let R be a non-zero integer. Then the pairs of binary forms
F, G ∈ Z[X, Y ] with (10.9.3) and with
R(F, G) = R
lie in at most finitely many GL(2, Z)-equivalence classes.
Proof. Immediate consequence of Theorem 10.9.2.
The next consequence is a special case of Theorem 1 of Evertse and Győry
(1989). Given a binary form F ∈ Z[X, Y ] and an integer m > 0, we consider
the Thue inequality
0 < |F (x, y)| ≤ m in x, y ∈ Z.
(10.9.5)
Two solutions (x, y), (x', y') of (10.9.5) are called proportional if (x', y') =
a(x, y) for some $a \in \mathbb{Q}^*$.
Corollary 10.9.4 Let n ≥ 3 be an integer and L a finite normal extension of
Q. Then up to GL(2, Z)-equivalence, there are only finitely many binary forms
F ∈ Z[X, Y ] of degree n and with splitting field L such that (10.9.5) has more
than two pairwise non-proportional solutions.
Proof. Let F ∈ Z[X, Y ] be a binary form of degree n and splitting field L and
suppose that (10.9.5) has three pairwise non-proportional solutions, say (x1 , y1 ),
$(x_2, y_2)$, $(x_3, y_3)$. Define the binary form $G := \prod_{i=1}^{3}(y_iX - x_iY)$. Then
\[ 0 < |R(F, G)| = |F(x_1, y_1)F(x_2, y_2)F(x_3, y_3)| \le m^3. \]
By applying Corollary 10.9.3 with $R = \pm 1, \ldots, \pm m^3$, we see that up to
GL(2, Z)-equivalence there are only finitely many possibilities for the pairs
F, G, and so in particular only finitely many possibilities for F .
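The identity $|R(F, G)| = |F(x_1, y_1)F(x_2, y_2)F(x_3, y_3)|$ used in this proof can be verified on a toy example. The Python sketch below is an illustration only: it assumes a form F that splits into linear factors over Z (so its splitting field is Q), represents each factor αX − βY by the pair (α, β), and uses the hypothetical helper names `evaluate` and `resultant`.

```python
from math import prod

def evaluate(F, x, y):
    """Value F(x, y) of a form given by its linear factors alpha*X - beta*Y."""
    return prod(a * x - b * y for (a, b) in F)

def resultant(F, G):
    """Product of (beta_i*gamma_j - alpha_i*delta_j) over all pairs of linear factors."""
    return prod(b * c - a * d for (a, b) in F for (c, d) in G)

# Assumed example: F = X (X - Y)(X + Y) and the Thue inequality 0 < |F(x, y)| <= 6.
F = [(1, 0), (1, 1), (1, -1)]
m = 6
solutions = [(1, 0), (2, 1), (1, 2)]   # pairwise non-proportional

# G = product of (y_i X - x_i Y), i.e. factors (gamma, delta) = (y_i, x_i).
G = [(y, x) for (x, y) in solutions]

values = [evaluate(F, x, y) for (x, y) in solutions]
R = resultant(F, G)
print(values, R)
```

Here every value lies in the range allowed by the inequality, and the resultant is bounded in absolute value by m cubed, as in the proof.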
We finish with a result of LeVesque and Waldschmidt on parametrized Thue
inequalities. Let $F = X^n + a_1X^{n-1}Y + \cdots + a_nY^n \in \mathbb{Z}[X, Y]$ be a square-free binary form of degree n ≥ 3 and with given splitting field L. We can
factor F over L as
\[ F = (X - \alpha_1Y)\cdots(X - \alpha_nY), \]
with $\alpha_1, \ldots, \alpha_n$ distinct elements of L. Consider tuples $\varepsilon := (\varepsilon_1, \ldots, \varepsilon_n)$ with
\[ \varepsilon_1, \ldots, \varepsilon_n \in O_L^*, \quad \varepsilon_1\alpha_1, \ldots, \varepsilon_n\alpha_n \text{ distinct}, \quad F_\varepsilon := (X - \varepsilon_1\alpha_1Y)\cdots(X - \varepsilon_n\alpha_nY) \in \mathbb{Z}[X, Y]. \qquad (10.9.6) \]
Notice that for ε with (10.9.6) we necessarily have $\varepsilon_1\cdots\varepsilon_n = \pm 1$. Let m be an
integer with m ≥ |F(0, 1)|. Then for every ε with (10.9.6), the Thue inequality
\[ |F_\varepsilon(x, y)| \le m \quad \text{in } x, y \in \mathbb{Z} \qquad (10.9.7) \]
has solutions (1, 0), (0, 1). Solutions (x, y) of (10.9.7) with xy = 0 are called
trivial.
The following result is a special case of Theorem 3.1 of LeVesque and
Waldschmidt (2012).
Corollary 10.9.5 There are only finitely many ε with (10.9.6) such that (10.9.7)
has non-trivial solutions.
Proof. By Corollary 10.9.4, the binary forms Fε (with ε as in (10.9.6)) such that
(10.9.7) has non-trivial solutions lie in only finitely many GL(2, Z)-equivalence
classes. So it suffices to show that a GL(2, Z)-equivalence class can contain
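only finitely many binary forms $F_\varepsilon$. Let ε be as in (10.9.6), and suppose that
$F_\varepsilon = \pm F_U$ for some $U = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in GL(2, \mathbb{Z})$. Then
\[ F(a, c) = \pm F_\varepsilon(1, 0) = \pm F(1, 0) = \pm 1, \qquad F(b, d) = \pm F_\varepsilon(0, 1) = \pm F(0, 1). \]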
Now by Thue’s Theorem there are only finitely many possibilities for a, b, c, d,
hence for ε. This proves Corollary 10.9.5.
We mention that in all papers referred to above, generalizations of the
theorems and corollaries stated above have been proved for binary forms with
S-integral coefficients in a number field.
10.10 Lang’s Conjecture for tori
Let K be an algebraically closed field of characteristic 0 and n an integer
≥ 2. Let $(K^*)^n$ denote the n-fold direct product of the multiplicative group $K^*$
of non-zero elements of K. That is, $(K^*)^n$ is the group with coordinatewise
multiplication
\[ x \cdot y := (x_1y_1, \ldots, x_ny_n) \quad \text{for } x = (x_1, \ldots, x_n),\ y = (y_1, \ldots, y_n) \in (K^*)^n, \]
and with unit element 1 = (1, . . . , 1). We write polynomials $f \in K[X_1, \ldots, X_n]$
as $\sum_{a \in I} c(a)X^a$, where I is a finite subset of $(\mathbb{Z}_{\ge 0})^n$, $c(a) \in K^*$
for a ∈ I and $X^a := X_1^{a_1}\cdots X_n^{a_n}$ if $a = (a_1, \ldots, a_n)$.
A subvariety of (K ∗ )n is a set
X = {x ∈ (K ∗ )n : f1 (x) = 0, . . . , fr (x) = 0},
where f1 , . . . , fr ∈ K[X1 , . . . , Xn ]. We do not require here that X is irreducible. An algebraic subgroup of (K ∗ )n is a subvariety of (K ∗ )n that is also
a subgroup of (K ∗ )n . For instance, a subvariety of (K ∗ )n given by equations
$x^{a_i} = x^{b_i}$ (i = 1, . . . , r) with $a_i, b_i \in \mathbb{Z}_{\ge 0}^n$ for i = 1, . . . , r is an algebraic subgroup of $(K^*)^n$ and in fact any algebraic subgroup of $(K^*)^n$ can be expressed
in this form (see, e.g., Schmidt (1996)). An algebraic coset is a subvariety of
(K ∗ )n of the shape uH = {u · x : x ∈ H } where H is an algebraic subgroup of
(K ∗ )n and u ∈ (K ∗ )n . Such a coset is more precisely called a coset of H .
The following is a more precise quantitative version of theorems of Liardet
(1974, 1975) for n = 2, and Laurent (1984) for n ≥ 3.
Theorem 10.10.1 Let X be a subvariety of $(K^*)^n$ given by polynomials of
total degree at most δ. Let Γ be a subgroup of $(K^*)^n$ of finite rank r. Then
X ∩ Γ is contained in a finite union $u_1H_1 \cup \cdots \cup u_tH_t$ of algebraic cosets with
$u_iH_i \subseteq X$ for i = 1, . . . , t, where
\[ t \le C(n, \delta)^{r+1}, \]
with C(n, δ) effectively computable in terms of n and δ.
Below, we give a simple proof of Theorem 10.10.1 by making a reduction
to unit equations
\[ a_1x_1 + \cdots + a_nx_n = 1 \quad \text{in } (x_1, \ldots, x_n) \in \Gamma, \qquad (10.10.1) \]
where $a_1, \ldots, a_n \in K^*$, and using the fact that such equations have only finitely
many non-degenerate solutions (see Chapter 6). But, conversely, this finiteness
result for (10.10.1) is a consequence of Theorem 10.10.1. Indeed, let X be the
subvariety of $(K^*)^n$ given by the linear equation $a_1x_1 + \cdots + a_nx_n = 1$. Denote
by $X^0$ the set of points of X that remain if we remove all algebraic cosets of
dimension > 0 that are contained in X. By Theorem 10.10.1, the set $X^0 \cap \Gamma$ is
finite. It can be shown that $X^0$ consists precisely of the non-degenerate points
in X, i.e., those for which $\sum_{i \in J} a_ix_i \ne 0$ for each proper, non-empty subset J of
{1, . . . , n}. So Theorem 10.10.1 gives back the result that (10.10.1) has only
finitely many non-degenerate solutions.
Theorem 10.10.1 may be viewed as the special case for algebraic tori of a
general conjecture of Lang on semi-abelian varieties (see Lang (1960)). We
do not formally define the n-dimensional algebraic torus $\mathbb{G}_{m,K}^n$ over a field K.
We only need the fact that its group of K-rational points is $\mathbb{G}_{m,K}^n(K) = (K^*)^n$,
endowed with coordinatewise multiplication. A semi-abelian variety over a
field K is a commutative group variety A over K, for which there exists a short
exact sequence of group varieties over K,
\[ 0 \to \mathbb{G}_{m,K}^n \to A \to A_0 \to 0, \]
where n ≥ 0 and $A_0$ is an abelian variety over K. If n = 0 then A is an abelian
variety, while if $A_0 = 0$ then A is an algebraic torus. Writing + for the group
operation of A, we define a translate of a semi-abelian subvariety over K of A to
be a subvariety of the shape a + B := {a + x : x ∈ B}, with B a semi-abelian
subvariety of A over K and a ∈ A(K).
Then Lang’s general conjecture for semi-abelian varieties is as follows.
Conjecture Let A be a semi-abelian variety and X a subvariety of A, both
defined over an algebraically closed field K of characteristic 0. Let Γ be a
subgroup of A(K) of finite rank. Then X(K) ∩ Γ is contained in a finite union
of translates (a1 + B1 ) ∪ · · · ∪ (at + Bt ) of semi-abelian subvarieties of A that
are each contained in X .
Lang’s Conjecture implies Mordell’s Conjecture (Mordell (1922a)) that for
any irreducible algebraic curve C of genus g ≥ 2 defined over Q, and any
number field L, the set of L-rational points C(L) is finite. Indeed, we may
view C as a subvariety of its Jacobian Jac(C), which is a g-dimensional abelian
variety over Q. By the Mordell–Weil Theorem, the group Jac(C)(L) is finitely
generated. One-dimensional abelian subvarieties of Jac(C) are elliptic curves,
and so translates of those are curves of genus 1. The curve C itself cannot
be a translate of an abelian subvariety of JacC since it has genus at least 2.
So the translates of abelian subvarieties of Jac(C) that are contained in C are
necessarily points, and one deduces that C(L) is finite.
Lang’s Conjecture is now a theorem. Faltings (1983) proved Mordell’s Conjecture. Laurent (1984) proved Lang’s Conjecture in the case of tori. Faltings
(1991, 1994) then proved Lang’s Conjecture in the case that $K = \overline{\mathbb{Q}}$ and A
is an abelian variety, but for Γ a finitely generated subgroup of $A(\overline{\mathbb{Q}})$ instead
of an arbitrary group of finite rank. Vojta (1996) proved Lang’s Conjecture
for arbitrary semi-abelian varieties over $\overline{\mathbb{Q}}$, but still with Γ finitely generated.
Finally, McQuillan (1995), combining Vojta’s arguments with Hindry’s (Hindry
(1988)), proved Lang’s Conjecture in full generality, with K an arbitrary algebraically closed field of characteristic 0, and Γ an arbitrary subgroup of A(K)
of finite rank.
We now prove Theorem 10.10.1. Our main tool is Theorem 6.1.3 on unit
equations.
Proof of Theorem 10.10.1. We have
\[ X = \Bigl\{ x \in (K^*)^n : f_i(x) = \sum_{a \in I_i} c_i(a)x^a = 0 \ (i = 1, \ldots, t) \Bigr\}, \]
where $I_i \subset \mathbb{Z}_{\ge 0}^n$ is finite, $c_i(a) \in K^*$ for i = 1, . . . , t, $a \in I_i$, and the polynomials $f_1, \ldots, f_t$ have total degree at most δ.
Let $I := \bigcup_{i=1}^{t} I_i$. With a point x ∈ X we associate an unordered graph $G_x$
as follows. The vertices of $G_x$ are the elements of I, and a pair {p, q} is an edge
of $G_x$ if there are i ∈ {1, . . . , t} and a non-empty subset J of $I_i$ such that
\[
p, q \in J, \qquad \sum_{a \in J} c_i(a)x^a = 0, \qquad
\sum_{a \in J'} c_i(a)x^a \ne 0 \ \text{ for each proper, non-empty subset } J' \text{ of } J. \qquad (10.10.2)
\]
Notice that there are at most $2^{\binom{n+\delta}{n}^2}$ possibilities for the graph $G_x$. For a graph
G on I, let
\[ X_G = \{x \in X : G_x = G\}. \]
We fix a graph G on I, and show that $X_G \cap \Gamma$ is contained in a union of at
most $C_1^{r+1}$ algebraic cosets, each of which is contained in X, where $C_1$ is
an effectively computable number depending only on n and δ. This clearly
suffices. In fact, these cosets will all be cosets of the algebraic group
\[ H_G := \{x \in (K^*)^n : x^p = x^q \text{ for each edge } \{p, q\} \text{ of } G\}. \]
We first show that if $u \in X_G$, then $uH_G \subset X$. We can express $I_1$ as a union
of pairwise disjoint sets $J_1 \cup \cdots \cup J_r$ such that
\[ \sum_{a \in J_i} c_1(a)u^a = 0 \quad \text{for } i = 1, \ldots, r, \]
and $\sum_{a \in J} c_1(a)u^a \ne 0$ if J is a proper, non-empty subset of one of the $J_i$. Let $x \in H_G$.
Clearly, any pair {p, q} contained in the same set $J_i$ is an edge of G, hence
$x^p = x^q$. Consequently,
\[ f_1(u \cdot x) = \sum_{i=1}^{r}\sum_{a \in J_i} c_1(a)(u \cdot x)^a = 0. \]
Similarly, $f_i(u \cdot x) = 0$ for i = 2, . . . , t. This shows that u · x ∈ X for every
$x \in H_G$.
Let {p, q} be an edge of G and $x \in X_G \cap \Gamma$. Choose a set J as in (10.10.2).
Then
\[ -\sum_{a \in J \setminus \{q\}} \frac{c_i(a)}{c_i(q)}\,x^{a-q} = 1. \]
Notice that the tuple $(x^{a-q} : a \in J \setminus \{q\})$ is a non-degenerate solution to this
equation, and that it lies in a homomorphic image of the group Γ. Now Theorem
6.1.3 implies that $x^{p-q} \in U_{p,q}$, where $U_{p,q}$ is a set, that may depend on p, q
but is otherwise independent of x, of cardinality at most $C_2^{r+1}$, where $C_2$ is
effectively computable and depends only on n, δ. The values $x^{p-q}$, taken for
all edges {p, q} of G, uniquely determine the coset $xH_G$. It follows that the
points $x \in X_G \cap \Gamma$ lie in a union of at most $C_1^{r+1}$ cosets of $H_G$. This completes
our proof.
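The graph attached to a point in the proof above can be computed by brute force in small cases. The Python sketch below is an illustration under stated assumptions and not part of the original proof: polynomials are represented as dictionaries from exponent tuples to coefficients, points have rational coordinates, and the function names `monomial`, `evaluate`, `minimal_vanishing_subsets` and `graph_edges` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def monomial(x, a):
    """x^a = x_1^{a_1} * ... * x_n^{a_n}."""
    value = Fraction(1)
    for xi, ai in zip(x, a):
        value *= Fraction(xi) ** ai
    return value

def evaluate(poly, x):
    """Value of the polynomial sum of c(a) x^a at the point x."""
    return sum(c * monomial(x, a) for a, c in poly.items())

def minimal_vanishing_subsets(poly, x):
    """Non-empty sets J of exponents whose subsum vanishes at x, with no proper non-empty vanishing subsum."""
    terms = {a: c * monomial(x, a) for a, c in poly.items()}
    exps = sorted(terms)
    vanishing = [
        frozenset(J)
        for size in range(1, len(exps) + 1)
        for J in combinations(exps, size)
        if sum(terms[a] for a in J) == 0
    ]
    return [J for J in vanishing if not any(Jp < J for Jp in vanishing)]

def graph_edges(polys, x):
    """Edges {p, q} of the graph at x: pairs lying in a common minimal vanishing subset."""
    edges = set()
    for poly in polys:
        for J in minimal_vanishing_subsets(poly, x):
            edges.update(frozenset(pair) for pair in combinations(sorted(J), 2))
    return edges

# Assumed example: f = X1 - X2 + X3 - 1 in three variables, at the point u = (2, 2, 1).
f = {(1, 0, 0): 1, (0, 1, 0): -1, (0, 0, 1): 1, (0, 0, 0): -1}
u = (2, 2, 1)
edges = graph_edges([f], u)

# The edges give the group x1 = x2, x3 = 1; sample points u * (t, t, 1) of its coset.
coset_values = [evaluate(f, (2 * t, 2 * t, 1)) for t in (Fraction(1, 3), 5, -7)]
print(edges, coset_values)
```

In this example the two edges correspond to the vanishing subsums $X_1 - X_2$ and $X_3 - 1$, and f vanishes along the whole coset, in line with the first step of the proof.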
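We give an overview of some extensions and refinements of Theorem
10.10.1. For $x = (x_1, \ldots, x_n) \in (\overline{\mathbb{Q}}^*)^n$ we define
\[ \widehat{h}(x) := \sum_{i=1}^{n} h(x_i), \]
where, as usual, h(x) denotes the absolute logarithmic height of $x \in \overline{\mathbb{Q}}$. Let Γ
be a finitely generated subgroup of $(\overline{\mathbb{Q}}^*)^n$. We denote by $\overline{\Gamma}$ the division group
of Γ, that is the subgroup of $(\overline{\mathbb{Q}}^*)^n$ consisting of the points $x \in (\overline{\mathbb{Q}}^*)^n$ for which
there is $m \in \mathbb{Z}_{>0}$ such that $x^m \in \Gamma$. Define the following enlargements of $\overline{\Gamma}$:
\[ \overline{\Gamma}_\varepsilon := \bigl\{ y \cdot z : y \in \overline{\Gamma},\ z \in (\overline{\mathbb{Q}}^*)^n,\ \widehat{h}(z) < \varepsilon \bigr\}, \]
\[ C(\overline{\Gamma}, \varepsilon) := \bigl\{ y \cdot z : y \in \overline{\Gamma},\ z \in (\overline{\mathbb{Q}}^*)^n,\ \widehat{h}(z) < \varepsilon\bigl(1 + \widehat{h}(y)\bigr) \bigr\}. \]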
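We may view these as a “cylinder” and “truncated cone” around $\overline{\Gamma}$. The sets $\overline{\Gamma}_\varepsilon$
and $C(\overline{\Gamma}, \varepsilon)$ were introduced by Poonen (1999) and Evertse (2002), respectively, in a more general context. Clearly, for ε > 0 we have
\[ \overline{\Gamma} \subset \overline{\Gamma}_\varepsilon \subset C(\overline{\Gamma}, \varepsilon). \]
It is important to note that $\overline{\Gamma}_\varepsilon$ and $C(\overline{\Gamma}, \varepsilon)$ are not groups.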
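For points with rational coordinates, the height of a tuple, defined above as the sum of the heights of its coordinates, is easy to compute, since the absolute logarithmic height of a rational number p/q in lowest terms is log max(|p|, |q|). The Python sketch below covers only this special case, as an illustration; general algebraic coordinates would require more machinery, and the function names `height` and `height_sum` are chosen here.

```python
from fractions import Fraction
from math import log, isclose

def height(x):
    """Absolute logarithmic height of a rational number p/q in lowest terms: log max(|p|, |q|)."""
    x = Fraction(x)
    return log(max(abs(x.numerator), abs(x.denominator)))

def height_sum(point):
    """Sum of the heights of the coordinates of a point with rational coordinates."""
    return sum(height(xi) for xi in point)

# Assumed example points in (Q^*)^2.
y = (Fraction(3, 2), 4)
z = (Fraction(2, 3), Fraction(1, 5))
yz = tuple(a * b for a, b in zip(y, z))  # coordinatewise product y * z

print(height_sum(y), height_sum(z), height_sum(yz))
```

The example also illustrates the inequality h(ab) ≤ h(a) + h(b), which is what makes the "cylinder" and "truncated cone" enlargements natural neighbourhoods of a group.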
Poonen (1999) formulated a “Lang–Bogomolov Conjecture” for semi-abelian varieties. In the case of algebraic tori, this states that if X is a subvariety
of $(\overline{\mathbb{Q}}^*)^n$ and Γ is a finitely generated subgroup of $(\overline{\mathbb{Q}}^*)^n$, then there is ε > 0
such that $X \cap \overline{\Gamma}_\varepsilon$ is contained in a finite union of algebraic cosets, all contained
in X. For ε = 0 this is Lang’s Conjecture for tori, and for $\overline{\Gamma} = (\overline{\mathbb{Q}}^*_{\mathrm{tors}})^n$ we
get Bogomolov’s Conjecture for tori. The general conjecture for semi-abelian
varieties is similar, except that there one has to use a suitable canonical height
on the semi-abelian variety under consideration. Poonen himself and independently S. Zhang (2000) proved the Lang–Bogomolov Conjecture for almost
split semi-abelian varieties; these include algebraic tori and abelian varieties.
The full conjecture was proved by Rémond (2003).
Let X be a subvariety of $(\overline{\mathbb{Q}}^*)^n$ and denote again by $X^0$ the set that remains if
we remove from X all algebraic cosets of positive dimension that are contained
in X. In Evertse (2002) it was stated, with a sketch of the proof, that there is ε > 0 such
that $X^0 \cap C(\overline{\Gamma}, \varepsilon)$ is finite. Rémond (2003) proved a generalization of this for
semi-abelian varieties.
Rémond (2002) obtained a quantitative version of the Lang–Bogomolov
Conjecture for tori, a somewhat simplified version of which is as follows.
Suppose that X is given by polynomials of degree at most δ. Define the
number ε := exp(−(n3n+3 log(n)). Assume Γ has rank r. Then $X \cap \overline{\Gamma}_\varepsilon$ is
contained in a union of at most
2
exp n3n +3 log(n)(r + 1)
algebraic cosets, each contained in X . In his proof, Rémond did not use the
Subspace Theorem or results on unit equations, but instead the ideas introduced
by Faltings (1991), which led to bounds with a better dependence on . Rémond
(2000a, 2000b) proved an analogous result for subvarieties of abelian varieties.
In very few cases, effective versions of the above-mentioned finiteness results
for algebraic tori have been proved. The first case is when X is a curve in
$\overline{\mathbb{Q}}^* \times \overline{\mathbb{Q}}^*$, i.e., X is given by
\[ P(x_1, x_2) = 0 \]
with $P \in \overline{\mathbb{Q}}[X_1, X_2]$. We assume that P is an absolutely irreducible polynomial
not of the form $aX_1^m + bX_2^n$ or $aX_1^mX_2^n + b$, so that X is not an algebraic coset
of $\overline{\mathbb{Q}}^* \times \overline{\mathbb{Q}}^*$, which means that X does not contain one-dimensional algebraic
cosets. Let Γ be a finitely generated subgroup of $\overline{\mathbb{Q}}^* \times \overline{\mathbb{Q}}^*$. Bombieri and Gubler
(2006), Theorem 5.4.5, gave an effectively computable upper bound, in terms
of the heights of the coefficients of P and of a set of generators for Γ, for the
heights of the points x ∈ X ∩ Γ.
The result of Bombieri and Gubler was extended by Bérczes, Evertse, Győry
and Pontreau (2009) to sets $X \cap C(\overline{\Gamma}, \varepsilon)$, where Γ is a finitely generated subgroup
of $\overline{\mathbb{Q}}^* \times \overline{\mathbb{Q}}^*$, and ε > 0 is an effectively computable number depending
only on the coefficients of P and a generating set for Γ. Moreover, in this latter
work explicit upper bounds are given both for the heights and for the degrees
of the coordinates of the points of these sets, all in terms of the coefficients
of P and the given generators of Γ. Further, this work contains effective versions of Lang’s Conjecture for tori, with extensions to $X \cap \overline{\Gamma}_\varepsilon$, $X \cap C(\overline{\Gamma}, \varepsilon)$,
for higher dimensional subvarieties X of $(\overline{\mathbb{Q}}^*)^n$ from a very restricted class,
namely subvarieties given by polynomials with at most three non-zero terms.
Applying the specialization techniques discussed in Chapter 8, Bérczes
(2015a) proved effective finiteness results for equations $P(x_1, x_2) = 0$ in
$x_1, x_2 \in A^*$, where A is an integral domain that is finitely generated over
Z and $P \in A[X_1, X_2]$. In a subsequent paper, Bérczes (2015b) proved an
effective finiteness result for $P(x_1, x_2) = 0$ in $(x_1, x_2) \in \Gamma$, where Γ is a finitely
generated subgroup of $K^* \times K^*$.
10.11 Linear recurrence sequences and
exponential-polynomial equations
We give a brief overview of some results concerning zeros of linear recurrence
sequences, and more generally, integer solutions of exponential-polynomial
equations. Much more on these topics can be found in Schmidt (2003).
Let K be an algebraically closed field of characteristic 0. Recall that a
(two-sided) linear recurrence sequence $U = \{u_m\}_{m=-\infty}^{\infty}$ in K is given by initial
values $u_0, \ldots, u_{k-1}$ and a linear recurrence
\[ u_{m+k} = c_1u_{m+k-1} + \cdots + c_ku_m \quad \text{for } m \in \mathbb{Z}, \]
where the $c_i$ belong to K and $c_k \ne 0$. Further, assume that the length k of
the recurrence has been chosen minimally. Then we call k the order of U,
and $f_U := X^k - c_1X^{k-1} - \cdots - c_k$ the companion polynomial of U. These
are uniquely determined by U. Assume that
\[ f_U = (X - \alpha_1)^{e_1}\cdots(X - \alpha_r)^{e_r}, \]
where $\alpha_1, \ldots, \alpha_r$ are distinct elements of K and $e_1, \ldots, e_r$ positive integers.
Then the terms $u_m$ can be expressed otherwise as
\[ u_m = f_1(m)\alpha_1^m + \cdots + f_r(m)\alpha_r^m \quad (m \in \mathbb{Z}), \]
where $f_i \in K[X]$ is a polynomial of degree exactly $e_i - 1$, for $i = 1, \ldots, r$.
The sequence U is called non-degenerate if $\alpha_1 \cdots \alpha_r \neq 0$ and $\alpha_i/\alpha_j$ is not
a root of unity for any two distinct indices i, j from {1, . . . , r}.
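The definitions above can be made concrete with a small sketch. The function name `two_sided_terms` and the sample sequence are illustrative assumptions, not part of the text. The code computes terms of a two-sided recurrence in exact rational arithmetic, running the recurrence backwards by dividing by $c_k$ (which is why $c_k \neq 0$ is needed), and compares them with the closed form for the sample $u_m = 2^m + 3^m$, whose companion polynomial is $X^2 - 5X + 6 = (X-2)(X-3)$.

```python
from fractions import Fraction


def two_sided_terms(coeffs, initial, lo, hi):
    """Return {m: u_m} for lo <= m <= hi, where
    u_{m+k} = c_1 u_{m+k-1} + ... + c_k u_m and coeffs = (c_1, ..., c_k)."""
    k = len(coeffs)
    c = [Fraction(x) for x in coeffs]
    if c[-1] == 0:
        raise ValueError("c_k must be non-zero for a two-sided sequence")
    u = {i: Fraction(x) for i, x in enumerate(initial)}
    # Forward direction: u_m = sum_{j=1}^k c_j u_{m-j}.
    for m in range(k, hi + 1):
        u[m] = sum(c[j - 1] * u[m - j] for j in range(1, k + 1))
    # Backward direction: solve the recurrence for u_m, dividing by c_k.
    for m in range(-1, lo - 1, -1):
        known = sum(c[j - 1] * u[m + k - j] for j in range(1, k))
        u[m] = (u[m + k] - known) / c[-1]
    return {m: u[m] for m in range(lo, hi + 1)}


# Sample sequence (an assumption for illustration): u_m = 2^m + 3^m.
terms = two_sided_terms(coeffs=(5, -6), initial=(2, 5), lo=-5, hi=8)
closed_form = {m: Fraction(2) ** m + Fraction(3) ** m for m in range(-5, 9)}
```

In this sample both roots are simple, so the polynomials $f_1, f_2$ are the constant 1, in line with the degree condition $\deg f_i = e_i - 1$. The quotient 3/2 is not a root of unity, so the sample sequence is non-degenerate.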
The zero-multiplicity N(U) of U is the number of integers m such that
$u_m = 0$, that is, the number of solutions of the exponential-polynomial equation
\[ f_1(m)\alpha_1^m + \cdots + f_r(m)\alpha_r^m = 0 \quad \text{in } m \in \mathbb{Z}. \tag{10.11.1} \]
We have the following general result, due to Skolem, Mahler and Lech.
Theorem 10.11.1 Let U be a non-degenerate linear recurrence sequence in
a field K of characteristic 0. Then N (U ) is finite.
Skolem (1935) proved this in the case that U has its terms in Q and Mahler
(1935a) did so for sequences U with algebraic terms. Finally, Lech (1953)
proved the general result. The proofs of Skolem, Mahler and Lech were all
based on Skolem’s p-adic power series method (Skolem (1933)).
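As an illustration of a finite zero set, the following sketch scans a window of indices for zeros of an integer recurrence. The helper name `zeros_in_window` and the sample recurrence $u_{m+3} = 2u_{m+2} - 4u_{m+1} + 4u_m$ with $u_0 = u_1 = 0$, $u_2 = 1$ (an example often attributed to Berstel) are assumptions chosen for illustration; they do not come from the text. A finite scan only finds zeros in the window and proves nothing about indices outside it, and negative indices are not examined here.

```python
def zeros_in_window(coeffs, initial, upper):
    """Indices 0 <= m < upper with u_m = 0, using exact integer arithmetic.

    coeffs = (c_1, ..., c_k) for u_{m+k} = c_1 u_{m+k-1} + ... + c_k u_m.
    """
    k = len(coeffs)
    u = list(initial)
    for m in range(k, upper):
        # coeffs[j] is c_{j+1}, which multiplies u_{m-1-j}.
        u.append(sum(coeffs[j] * u[m - 1 - j] for j in range(k)))
    return [m for m in range(upper) if u[m] == 0]


# Sample recurrence (an assumption for illustration only).
berstel_zeros = zeros_in_window((2, -4, 4), (0, 0, 1), 20)
```

Enlarging `upper` extends the scan; the theorem guarantees that for a non-degenerate sequence the list eventually stops growing, but gives no indication of where.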
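We discuss a generalization of (10.11.1) to exponential-polynomial equations
in several variables. Let again K be a field of characteristic 0 and $n \geq 1$.
For $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_n) \in (K^*)^n$ and $\mathbf{m} = (m_1, \ldots, m_n) \in \mathbb{Z}^n$, we write
$\boldsymbol{\alpha}^{\mathbf{m}} := \alpha_1^{m_1} \cdots \alpha_n^{m_n}$. We consider equations
\[ f_1(\mathbf{m})\boldsymbol{\alpha}_1^{\mathbf{m}} + \cdots + f_r(\mathbf{m})\boldsymbol{\alpha}_r^{\mathbf{m}} = 0 \quad \text{in } \mathbf{m} \in \mathbb{Z}^n, \tag{10.11.2} \]
where $f_i \in K[X_1, \ldots, X_n]$, $\boldsymbol{\alpha}_i \in (K^*)^n$ for $i = 1, \ldots, r$. A solution $\mathbf{m} \in \mathbb{Z}^n$
of (10.11.2) is called non-degenerate if $\sum_{i \in I} f_i(\mathbf{m})\boldsymbol{\alpha}_i^{\mathbf{m}} \neq 0$ for each proper,
non-empty subset I of {1, . . . , r}. We recall the following result, which is a
special case of a more general theorem of Laurent (1984, 1989). Define the
group
\[ G := \{\mathbf{m} \in \mathbb{Z}^n : \boldsymbol{\alpha}_1^{\mathbf{m}} = \cdots = \boldsymbol{\alpha}_r^{\mathbf{m}}\}. \]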
Theorem 10.11.2 Assume that G = {0}. Then (10.11.2) has only finitely many
non-degenerate solutions.
Proof of Theorem 10.11.2 =⇒ Theorem 10.11.1. We proceed by induction on
r. For r = 1 Theorem 10.11.1 is trivial. Let r ≥ 2 and assume that none of the
quotients αi /αj (1 ≤ i < j ≤ r) is a root of unity. This implies that
\[ \{m \in \mathbb{Z} : \alpha_1^m = \cdots = \alpha_r^m\} = \{0\}. \]
Hence (10.11.1) has only finitely many non-degenerate solutions. By applying
the induction hypothesis to any of the vanishing subsums, we infer that there
are also only finitely many degenerate solutions.
Proof of Theorem 10.11.2 (sketch). The proof depends on Theorem 6.1.1. By
means of a specialization argument (see for instance Schmidt (2003), section
9), Theorem 10.11.2 can be reduced to the case that the coordinates of the
α i and the coefficients of the polynomials fi lie in an algebraic number field
K. We restrict ourselves to this special case. Choose a finite set of places
S of K, containing the infinite places, such that the coordinates of the α i
(i = 1, . . . , r) are all S-units. We apply Theorem 6.1.1 to (10.11.2). Pick a non-degenerate solution $\mathbf{m}$ of (10.11.2), and put $x_i := f_i(\mathbf{m})\boldsymbol{\alpha}_i^{\mathbf{m}}$ for $i = 1, \ldots, r$.
Then $x_1 + \cdots + x_r = 0$ and no proper subsum of the left-hand side is 0. Put
$\|\mathbf{m}\| := \max(|m_1|, \ldots, |m_n|)$. Thanks to our choice of S, we have
\[ N_S(x_1 \cdots x_r) = N_S(f_1(\mathbf{m}) \cdots f_r(\mathbf{m})) \leq C_1 \|\mathbf{m}\|^{C_2}, \tag{10.11.3} \]
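where here and below, the $C_i$ are constants > 1 depending on the $f_i$ and the
$\boldsymbol{\alpha}_i$. Further, since $\mathbf{m}$ is non-degenerate, we have $f_i(\mathbf{m}) \neq 0$ for $i = 1, \ldots, r$,
which implies
\[ H_S(x_1, \ldots, x_r) = \prod_{v \in S} \max(|x_1|_v, \ldots, |x_r|_v) \geq C_3^{-1} \|\mathbf{m}\|^{-C_4} \cdot \prod_{v \in S} \max_{1 \leq i \leq r} |\boldsymbol{\alpha}_i^{\mathbf{m}}|_v. \tag{10.11.4} \]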
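By the Product Formula and our choice for S, we have for $\mathbf{z} \in \mathbb{Z}^n$,
\[ \log \prod_{v \in S} \max_{1 \leq i \leq r} |\boldsymbol{\alpha}_i^{\mathbf{z}}|_v \geq \frac{1}{r^2} \sum_{1 \leq i, j \leq r} h\big((\boldsymbol{\alpha}_i \boldsymbol{\alpha}_j^{-1})^{\mathbf{z}}\big) =: \psi(\mathbf{z}). \]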
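One can easily show that ψ satisfies the triangle inequality, and $\psi(\lambda \mathbf{z}) = |\lambda| \psi(\mathbf{z})$ for $\lambda \in \mathbb{Z}$, $\mathbf{z} \in \mathbb{Z}^n$. Further, if $\psi(\mathbf{z}) = 0$, then all terms $(\boldsymbol{\alpha}_i \boldsymbol{\alpha}_j^{-1})^{\mathbf{z}}$ are
roots of unity. By our assumption on G, this implies that $\mathbf{z} = \mathbf{0}$. Hence ψ
defines a norm on $\mathbb{Z}^n$. Both ψ and the maximum norm $\| \cdot \|$ can be extended
to norms on $\mathbb{R}^n$ and by a simple compactness argument one shows that there is
c > 0 such that $\psi(\mathbf{z}) \geq c\|\mathbf{z}\|$ for $\mathbf{z} \in \mathbb{R}^n$. So
\[ \prod_{v \in S} \max_{1 \leq i \leq r} |\boldsymbol{\alpha}_i^{\mathbf{z}}|_v \geq \exp\big(r^{-2} c \|\mathbf{z}\|\big) \quad \text{for } \mathbf{z} \in \mathbb{Z}^n. \]
Together with (10.11.4) this implies
\[ H_S(x_1, \ldots, x_r) \geq C_3^{-1} \|\mathbf{m}\|^{-C_4} C_5^{\|\mathbf{m}\|}. \]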
From Theorem 6.1.1 we deduce HS (x1 , . . . , xr ) ≤ C6 NS (x1 · · · xr )2 , say.
Combining this with the lower bound for HS (x1 , . . . , xr ) just derived and the
upper bound for $N_S(x_1 \cdots x_r)$ from (10.11.3), we obtain
\[ C_3^{-1} \|\mathbf{m}\|^{-C_4} C_5^{\|\mathbf{m}\|} \leq C_6 \big(C_1 \|\mathbf{m}\|^{C_2}\big)^2, \]
which implies that $\|\mathbf{m}\|$ is bounded.
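The final step rests on an exponential in $\|\mathbf{m}\|$ being dominated by a polynomial in $\|\mathbf{m}\|$, which can only happen for small $\|\mathbf{m}\|$. The sketch below makes this visible for sample constants. The function name `largest_admissible_norm`, the scan limit `t_max` and the numerical values of $C_1, \ldots, C_6$ are hypothetical; the proof is ineffective and does not provide such constants.

```python
import math


def largest_admissible_norm(C1, C2, C3, C4, C5, C6, t_max=10_000):
    """Largest integer t in [1, t_max] with
    C3^{-1} t^{-C4} C5^t <= C6 (C1 t^{C2})^2, or None if there is none.

    The comparison is done on logarithms to avoid huge numbers.
    """
    def gap(t):
        lhs = -math.log(C3) - C4 * math.log(t) + t * math.log(C5)
        rhs = math.log(C6) + 2 * (math.log(C1) + C2 * math.log(t))
        return lhs - rhs

    admissible = [t for t in range(1, t_max + 1) if gap(t) <= 0]
    return max(admissible) if admissible else None


# Hypothetical constants, all equal to 2: the inequality becomes 2^t <= 16 t^6.
bound = largest_admissible_norm(2, 2, 2, 2, 2, 2)
```

With these sample values the inequality reads $2^t \leq 16 t^6$, and the scan returns the last integer $t$ for which it holds.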
Below, we discuss quantitative results (upper bounds for the number of
solutions) of (10.11.1) and (10.11.2), that have been obtained as consequences
of the Quantitative Subspace Theorem.
It has been an open problem for a long time to obtain a uniform upper bound
for the zero multiplicity N (U ) of a non-degenerate linear recurrence sequence
U depending only on the order of U . This was finally settled by Schmidt (1999),
who proved the following.
Theorem 10.11.3 Let U be a non-degenerate linear recurrence sequence of
order k in a field of characteristic 0. Then
N (U ) ≤ exp exp exp(3k log k).
Schmidt’s very intricate proof is based on the Quantitative Subspace Theorem from Evertse and Schlickewei (2002), but uses various other techniques.
In fact, the special case where the polynomials fi in (10.11.1) are all constants
follows easily from Theorem 6.1.3, but the extension to arbitrary polynomials
fi was very difficult. Schmidt’s bound was subsequently improved by
Schmidt himself (Schmidt (2000)), by Allen (2007), and by Amoroso and Viada
(2011). The best upper bound to date for N (U ), from the last mentioned paper,
is exp exp(70k). In Schmidt (2003), Schmidt worked out his method of proof
in a special case, giving a flavour of the main ideas.
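To compare the two bounds quoted above, it is convenient to take logarithms twice: $\log\log$ of $\exp\exp\exp(3k\log k)$ is $k^{3k}$, while $\log\log$ of $\exp\exp(70k)$ is $70k$. The following sketch tabulates both; the function names are illustrative assumptions.

```python
def loglog_schmidt_1999(k):
    """log log of exp exp exp(3k log k), which equals exp(3k log k) = k^(3k)."""
    return k ** (3 * k)


def loglog_amoroso_viada_2011(k):
    """log log of exp exp(70k), which equals 70k."""
    return 70 * k


# (k, log log of the 1999 bound, log log of the 2011 bound) for small orders k.
comparison = [(k, loglog_schmidt_1999(k), loglog_amoroso_viada_2011(k))
              for k in range(2, 7)]
```

On this scale the later bound grows linearly in the order k, whereas the earlier one grows like $k^{3k}$, so from k = 3 onwards the later bound is the smaller of the two.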
It is conjectured that under the assumption G = {0}, the number of solutions
of (10.11.2) is bounded above by a quantity depending only on r and the total
degrees of f1 , . . . , fr . Schlickewei and Schmidt (2000) proved the following
weaker result for exponential-polynomial equations over number fields. We
keep the notation from (10.11.2).
Theorem 10.11.4 Assume that the coordinates of the points $\boldsymbol{\alpha}_i$ and the coefficients of the polynomials $f_i$ lie in an algebraic number field K of degree d.
Let $\delta_i$ denote the total degree of $f_i$ for $i = 1, \ldots, r$ and put
\[ B := \max\Big(n, \sum_{i=1}^{r} \binom{n + \delta_i}{n}\Big). \]
Assume that G = {0}. Then equation (10.11.2) has at most $c(B, d) := 2^{35B^3} d^{6B^2}$ non-degenerate solutions.
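A small sketch of how the quantities in Theorem 10.11.4 are evaluated, reading the bound as $c(B, d) = 2^{35B^3} d^{6B^2}$ with $B = \max(n, \sum_i \binom{n+\delta_i}{n})$. The function names are illustrative assumptions; the code returns $\log_2 c(B, d)$ rather than $c(B, d)$ itself, since the latter is astronomically large even for tiny inputs.

```python
import math


def schlickewei_schmidt_B(n, degrees):
    """B = max(n, sum_i binom(n + delta_i, n)) for total degrees delta_i."""
    return max(n, sum(math.comb(n + delta, n) for delta in degrees))


def log2_solution_bound(n, degrees, d):
    """log_2 of c(B, d) = 2^(35 B^3) * d^(6 B^2)."""
    B = schlickewei_schmidt_B(n, degrees)
    return 35 * B ** 3 + 6 * B ** 2 * math.log2(d)


# One variable, two constant coefficient polynomials, over a field of degree 2.
example_B = schlickewei_schmidt_B(1, (0, 0))
example_log2 = log2_solution_bound(1, (0, 0), 2)
```

Already in this smallest case (n = 1, r = 2, constant $f_i$) the bound has 304 binary digits, which shows that the theorem is about uniformity in the data rather than about sharp counts.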
Again the main tool in the proof is the Quantitative Subspace Theorem from
Evertse and Schlickewei (2002) (which was already proved a couple of years
earlier).
There are various generalizations of Theorem 10.11.4, see Schlickewei
and Schmidt (2000) and Ahlgren (1999). From Schmidt (2009) and Corvaja,
Schmidt and Zannier (2010) the following special case of the above conjecture
can be deduced. Let K be any field of characteristic 0, f ∈ K[X1 , . . . , Xn ]
a polynomial of total degree δ, and α = (α1 , . . . , αn ), where α1 , . . . , αn are
multiplicatively independent, non-zero elements of K. Then the equation
\[ \boldsymbol{\alpha}^{\mathbf{m}} = f(\mathbf{m}) \]
has at most $\exp(B^{9B})$ solutions $\mathbf{m} \in \mathbb{Z}^n$, where $B := 1 + \binom{n+\delta}{\delta}$.
10.12 Algebraic independence results
A possibly infinite sequence α1 , α2 , . . . is called algebraically independent
over a field K if there are no N and P ∈ K[X1 , . . . , XN ] − {0} such that
P (α1 , . . . , αN ) = 0. Nishioka (1986, 1987, 1989, 1994) proved various algebraic independence results for values of certain power series at algebraic arguments. All these results are applications of the semi-effective Theorem 6.1.1.
Here, we prove a special case of one of Nishioka’s results, and mention some
of her other results. Below, by algebraic numbers we always mean complex
numbers that are algebraic over Q.
Let K ⊂ C be an algebraic number field and
\[ f(z) = \sum_{k=0}^{\infty} a_k z^{e_k} \]
a power series with coefficients $a_k \in K$ and with $\{e_k\}_{k=0}^{\infty}$ a strictly increasing
sequence of non-negative integers. Assume that f(z) has radius of convergence
R > 0. Further, assume that $\{e_k\}$ grows rapidly, i.e.,
\[ \lim_{k \to \infty} \frac{e_k + \sum_{i=1}^{k} h(a_i)}{e_{k+1}} = 0, \]
where as usual h(α) denotes the absolute logarithmic height of an algebraic
number α. By Cijsouw and Tijdeman (1973), the number f (α) is transcendental for any algebraic number α with 0 < |α| < R. Further, it was shown by
Bundschuh and Wylegala (1980), that f (α1 ), . . . , f (αn ) are algebraically independent for any algebraic numbers α1 , . . . , αn with 0 < |α1 | < · · · < |αn | < R.
These results were extended by Nishioka as follows. Call non-zero algebraic
numbers $\alpha_1, \ldots, \alpha_s$ $\{e_k\}$-dependent if there are γ, roots of unity $\zeta_1, \ldots, \zeta_s$, and
algebraic numbers $d_1, \ldots, d_s$, not all zero, such that
\[ \alpha_i = \zeta_i \gamma \ \text{ for } i = 1, \ldots, s, \qquad \sum_{i=1}^{s} d_i \zeta_i^{e_k} = 0 \ \text{ for all sufficiently large } k. \]
Denote by $f^{(l)}$ the l-th derivative of f, where $f^{(0)} = f$. The following result
is Theorem 1 of Nishioka (1987).
Theorem 10.12.1 Let $\alpha_1, \ldots, \alpha_n$ be algebraic numbers with $0 < |\alpha_i| < R$ for
i = 1, . . . , n. Then the following three assertions are equivalent:
(i) $f^{(l)}(\alpha_i)$ (i = 1, . . . , n, l ≥ 0) are algebraically dependent over Q;
(ii) there are distinct $i_1, \ldots, i_s \in \{1, \ldots, n\}$ such that $\alpha_{i_1}, \ldots, \alpha_{i_s}$ are $\{e_k\}$-dependent;
(iii) 1, $f(\alpha_1), \ldots, f(\alpha_n)$ are linearly dependent over the algebraic numbers.
To give a flavour of Nishioka’s method of proof, we prove the following
special case.
Theorem 10.12.2 Let $f(z) = \sum_{k=0}^{\infty} z^{e_k}$, where $\{e_k\}_{k=0}^{\infty}$ is a strictly increasing sequence of non-negative integers with $\lim_{k \to \infty} e_k/e_{k+1} = 0$. Further, let
$\alpha_1, \ldots, \alpha_n$ be algebraic numbers such that $|\alpha_i| < 1$ for i = 1, . . . , n and none
of the quotients $\alpha_i/\alpha_j$ (1 ≤ i < j ≤ n) is a root of unity. Then the numbers
$f^{(l)}(\alpha_i)$ (i = 1, . . . , n, l ≥ 0) are algebraically independent over Q.
In fact, this was proved earlier by Nishioka (1986), with ek = k! for all k.
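For orientation, here is a sketch of a gap series of the kind covered by Theorem 10.12.2. The exponent choice $e_k = (k+1)!$ is an assumption made here so that the sequence is strictly increasing from k = 0 on; the function names are likewise illustrative. The code computes partial sums $f_m(\alpha) = \sum_{k=0}^{m} \alpha^{e_k}$ exactly and the elementary geometric-series bound $|f(\alpha) - f_m(\alpha)| \leq |\alpha|^{e_{m+1}}/(1 - |\alpha|)$, which shows how quickly the partial sums converge.

```python
from fractions import Fraction
from math import factorial


def exponents(count):
    """e_k = (k+1)! for k = 0, ..., count-1 (illustrative choice of exponents)."""
    return [factorial(k + 1) for k in range(count)]


def partial_sum(alpha, m):
    """f_m(alpha) = sum_{k=0}^{m} alpha^{e_k}, computed exactly."""
    return sum(Fraction(alpha) ** e for e in exponents(m + 1))


def tail_bound(alpha, m):
    """Upper bound |alpha|^{e_{m+1}} / (1 - |alpha|) for |f(alpha) - f_m(alpha)|."""
    a = abs(Fraction(alpha))
    return a ** exponents(m + 2)[-1] / (1 - a)


# Growth condition e_k / e_{k+1} -> 0: here the ratio is exactly 1/(k+2).
ratios = [Fraction(e, e_next)
          for e, e_next in zip(exponents(6), exponents(7)[1:])]
approx = partial_sum(Fraction(1, 2), 2)
error = tail_bound(Fraction(1, 2), 2)
```

At α = 1/2 three terms already determine f(α) to within $2^{-23}$, a reflection of the rapid growth of the exponents that the proof exploits.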
We first prove a crucial lemma (see Nishioka (1989), Lemma 1), which is a
consequence of Theorem 6.1.1.
Lemma 10.12.3 Let $\mathcal{M}$ be an infinite set of non-negative integers. Further,
let K be a number field, $\gamma_1, \ldots, \gamma_n$ non-zero elements of K, and $\{A_i(m)\}_{m \in \mathcal{M}}$
sequences of elements of K, such that
\[ \gamma_i/\gamma_j \text{ is not a root of unity for all } i, j \text{ with } 1 \leq i < j \leq n, \tag{10.12.1} \]
\[ A_i(m) \neq 0 \text{ for } i = 1, \ldots, n,\ m \in \mathcal{M}, \tag{10.12.2} \]
\[ \lim_{\substack{m \to \infty \\ m \in \mathcal{M}}} \frac{h(A_i(m))}{m} = 0 \text{ for } i = 1, \ldots, n. \tag{10.12.3} \]
Then for every θ with 0 < θ < 1 and every place v of K, there are only finitely
many $m \in \mathcal{M}$ such that
\[ |A_1(m)\gamma_1^m + \cdots + A_n(m)\gamma_n^m|_v \leq |\gamma_1|_v^m \theta^m. \tag{10.12.4} \]
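A toy instance may help to read the lemma. Take K = Q with the ordinary absolute value, constant coefficients $A_1 = 1$, $A_2 = -1$, and $\gamma_1 = 3$, $\gamma_2 = 2$; all of these choices, as well as the function name below, are assumptions for illustration. The sketch lists the indices m in a finite window for which $|A_1\gamma_1^m + A_2\gamma_2^m| \leq |\gamma_1|^m \theta^m$ holds, using exact rational arithmetic.

```python
from fractions import Fraction


def exceptional_indices(gammas, coeffs, theta, upper):
    """Indices 0 <= m < upper with
    |sum_i A_i gamma_i^m| <= |gamma_1|^m theta^m (ordinary absolute value on Q).
    """
    theta = Fraction(theta)
    found = []
    for m in range(upper):
        value = sum(Fraction(a) * Fraction(g) ** m
                    for a, g in zip(coeffs, gammas))
        if abs(value) <= abs(Fraction(gammas[0])) ** m * theta ** m:
            found.append(m)
    return found


# Toy data: 3^m - 2^m compared with 3^m * (9/10)^m.
small_cases = exceptional_indices((3, 2), (1, -1), Fraction(9, 10), 200)
```

In this toy case finiteness is elementary, since $1 - (2/3)^m$ increases to 1 while $(9/10)^m$ decreases to 0. The content of the lemma is that the same conclusion holds for arbitrary places and for coefficients $A_i(m)$ whose heights grow slower than m.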
Remark This lemma easily implies Theorem 10.11.1 (the Skolem–Mahler–
Lech Theorem for linear recurrence sequences) in the case of linear recurrence
sequences with terms in an algebraic number field.
Proof. The proof is by induction on n. For n = 1, the lemma is an easy consequence of assumptions (10.12.2), (10.12.3) and of (1.9.1), more precisely
the inequality $\log |A_1(m)|_v \geq -[K : \mathbb{Q}] \cdot h(A_1(m))$. Now let n ≥ 2 and suppose the lemma is true for sums with fewer than n terms. The induction
hypothesis implies that for all but finitely many $m \in \mathcal{M}$, every proper subsum
of $\sum_{i=1}^{n} A_i(m)\gamma_i^m$ is non-zero. We show that also $\sum_{i=1}^{n} A_i(m)\gamma_i^m$ can be
0 for at most finitely many m. Indeed, by (10.12.1) there is $v \in M_K$ such that
$|\gamma_1/\gamma_n|_v \neq 1$. Assume without loss of generality that $|\gamma_1/\gamma_n|_v > 1$. Notice that
by (10.12.3),
\[ \frac{h(A_i(m)/A_n(m))}{m} \leq \frac{h(A_i(m)) + h(A_n(m))}{m} \to 0 \quad \text{as } m \in \mathcal{M},\ m \to \infty. \]
Now if $\sum_{i=1}^{n} A_i(m)\gamma_i^m = 0$, then
\[ \Big| \frac{A_1(m)}{A_n(m)} \cdot \gamma_1^m + \cdots + \frac{A_{n-1}(m)}{A_n(m)} \cdot \gamma_{n-1}^m \Big|_v = |\gamma_n|_v^m, \]
and by the induction hypothesis this is possible for only finitely many m.
So, after removing at most finitely many integers, we obtain an infinite set $\mathcal{M}$ of
positive integers such that for every $m \in \mathcal{M}$, each subsum of $\sum_{i=1}^{n} A_i(m)\gamma_i^m$
is non-zero. By Lemma 1.9.1, for every $m \in \mathcal{M}$ there is a positive rational
integer $d_m$, with $\log d_m \leq [K : \mathbb{Q}] \sum_{i=1}^{n} h(A_i(m))$, such that $d_m A_i(m) \in O_K$
for i = 1, . . . , n. Now, clearly, (10.12.3) remains valid with $d_m A_i(m)$ instead
of $A_i(m)$, and $(\log d_m)/m \to 0$ as m → ∞. Hence we may as well prove our
lemma with $d_m A_i(m)$ instead of $A_i(m)$. So without loss of generality we
assume that all $A_i(m)$ are algebraic integers.
Now let S be a finite set of places of K such that v ∈ S, and $\gamma_1, \ldots, \gamma_n$ are all
S-units. Put $u_i(m) := A_i(m)\gamma_i^m$ for i = 1, . . . , n. We apply Proposition 6.2.1
(which is in fact equivalent to Theorem 6.1.1) with $x_i = u_i(m)$ for i = 1, . . . , n,
S as above, and T = {v}. Define
\[ \delta_m := \frac{[K : \mathbb{Q}] \sum_{i=1}^{n} h(A_i(m))}{m}, \qquad H := \log \prod_{w \in S} \max(|\gamma_1|_w, \ldots, |\gamma_n|_w). \]
Then H ≥ 0. Notice that for $m \in \mathcal{M}$ we have by (1.9.1),
\[ \prod_{w \in S} \prod_{i=1}^{n} |u_i(m)|_w = \prod_{w \in S} \prod_{i=1}^{n} |A_i(m)|_w \leq \exp(\delta_m m) \]
while
\[ H_S(u_1(m), \ldots, u_n(m)) \leq \exp(mH) \cdot \prod_{w \in S} \max(1, |A_1(m)|_w, \ldots, |A_n(m)|_w) \leq \exp((H + \delta_m)m) \]
and lastly, by (1.9.1),
\[ \max(|u_1(m)|_v, \ldots, |u_n(m)|_v) \geq |\gamma_1|_v^m \exp(-\delta_m m). \]
By combining these three inequalities with Proposition 6.2.1 we obtain that for
every ε > 0 there is a constant C(ε) > 0 such that for all $m \in \mathcal{M}$,
\[ |A_1(m)\gamma_1^m + \cdots + A_n(m)\gamma_n^m|_v \geq C(\varepsilon) \max(|u_1(m)|_v, \ldots, |u_n(m)|_v) \times \Big( \prod_{w \in S} \prod_{i=1}^{n} |u_i(m)|_w \Big)^{-1} \cdot H_S(u_1(m), \ldots, u_n(m))^{-\varepsilon} \]
\[ \geq C(\varepsilon) |\gamma_1|_v^m \exp\big( -m((2 + \varepsilon)\delta_m + \varepsilon H) \big). \]
Since we can choose ε arbitrarily small and since $\delta_m \to 0$ by (10.12.3), it
follows that for every θ with 0 < θ < 1, inequality (10.12.4) has only finitely
many solutions in $m \in \mathcal{M}$.
Proof of Theorem 10.12.2. We use the following notation. Let K be the number
field $\mathbb{Q}(\sqrt{-1}, \alpha_1, \ldots, \alpha_n)$. Let v be the (necessarily complex) place of K
such that $|\cdot|_v = |\cdot|^2$. For $\mathbf{x} = (x_1, \ldots, x_r) \in \mathbb{C}^r$ we put $\|\mathbf{x}\| := \max_{1 \leq i \leq r} |x_i|$.
Further, for $\mathbf{x} = (x_1, \ldots, x_r) \in K^r$, $w \in M_K$ we put $\|\mathbf{x}\|_w := \max_{1 \leq i \leq r} |x_i|_w$.
Assume that Theorem 10.12.2 is false. Then there is L ≥ 0 such that the
numbers $f^{(l)}(\alpha_i)$ (i = 1, . . . , n, l = 0, . . . , L) are algebraically dependent over
Q. Assume that
\[ |\alpha_1| = \max_{1 \leq i \leq n} |\alpha_i|. \tag{10.12.5} \]
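For m ≥ 0 put $f_m(z) := \sum_{k=0}^{m} z^{e_k}$. Define vectors $\mathbf{u} \in \mathbb{C}^{n(L+1)}$ and $\mathbf{u}_m \in K^{n(L+1)}$
(m = 0, 1, 2, . . .) by
\[ \mathbf{u} := \big( \alpha_i^l f^{(l)}(\alpha_i) \big)_{i=1,\ldots,n;\ l=0,\ldots,L}, \qquad \mathbf{u}_m := \big( \alpha_i^l f_m^{(l)}(\alpha_i) \big)_{i=1,\ldots,n;\ l=0,\ldots,L}. \]
Then $\lim_{m \to \infty} \mathbf{u}_m = \mathbf{u}$. Note that
\[ \alpha_i^l f_m^{(l)}(\alpha_i) = \sum_{k=0}^{m} e_k (e_k - 1) \cdots (e_k - l + 1) \alpha_i^{e_k}. \]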
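The identity $\alpha^l f_m^{(l)}(\alpha) = \sum_{k=0}^{m} e_k(e_k - 1)\cdots(e_k - l + 1)\alpha^{e_k}$ can be checked mechanically. The sketch below differentiates the polynomial $f_m$ term by term and compares both sides in exact arithmetic; the function names and the sample exponents 1, 2, 6, 24 are assumptions for illustration.

```python
from fractions import Fraction


def falling_factorial(e, l):
    """e (e-1) ... (e-l+1), with the empty product equal to 1 for l = 0."""
    out = 1
    for j in range(l):
        out *= e - j
    return out


def scaled_derivative(alpha, exps, l):
    """alpha^l * f_m^{(l)}(alpha) for f_m(z) = sum of z^e over e in exps."""
    alpha = Fraction(alpha)
    poly = {e: 1 for e in exps}          # exponent -> coefficient
    for _ in range(l):                   # one formal derivative per pass
        poly = {e - 1: c * e for e, c in poly.items() if e > 0}
    return alpha ** l * sum(c * alpha ** e for e, c in poly.items())


def falling_factorial_sum(alpha, exps, l):
    """sum over e in exps of e (e-1) ... (e-l+1) alpha^e."""
    alpha = Fraction(alpha)
    return sum(falling_factorial(e, l) * alpha ** e for e in exps)


sample_exps = [1, 2, 6, 24]
checks = [scaled_derivative(Fraction(1, 3), sample_exps, l)
          == falling_factorial_sum(Fraction(1, 3), sample_exps, l)
          for l in range(5)]
```

The factor $\alpha^l$ is what makes every coordinate of $\mathbf{u}_m - \mathbf{u}_{m-1}$ a multiple of $\alpha_i^{e_m}$, which is the shape needed later in the proof.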
There is a non-zero polynomial $P \in \mathbb{Q}[X_{10}, \ldots, X_{nL}]$ in $n(L + 1)$ variables such that
\[ P(\mathbf{u}) = 0 \]
(i.e., P evaluated at $X_{il} = \alpha_i^l f^{(l)}(\alpha_i)$ for i = 1, . . . , n, l = 0, . . . , L). We
choose such P of minimal total degree. Denote the total degree of P by D.
The constants Ci introduced below will be ≥ 1 and depend on P , α1 , . . . , αn
only. Further, {Ciw }w∈MK will be tuples of constants depending only on
P , α1 , . . . , αn , where Ciw ≥ 1 for all w ∈ MK and Ciw = 1 for all but finitely
many w.
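For m = 0, 1, 2, . . . , we have by (10.12.5),
\[ |P(\mathbf{u}_m)|_v = |P(\mathbf{u}_m) - P(\mathbf{u})|^2 \leq C_1 \|\mathbf{u}_m - \mathbf{u}\|^2 \leq C_2 e_{m+1}^{2L} |\alpha_1|^{2e_{m+1}}. \]
Further, for $w \in M_K \setminus \{v\}$ we have
\[ |P(\mathbf{u}_m)|_w \leq C_{3w} \max(1, \|\mathbf{u}_m\|_w)^D \leq C_{3w} \max\big(1, |(m + 1)e_m^L|_w\big)^D \max(1, |\alpha_1|_w, \ldots, |\alpha_n|_w)^{D e_m} \leq C_{4w}^{e_m}. \]
Hence
\[ \prod_{w \in M_K} |P(\mathbf{u}_m)|_w \leq C_5^{e_m} e_{m+1}^{2L} |\alpha_1|^{2e_{m+1}}, \tag{10.12.6} \]
which is < 1 for m sufficiently large since $|\alpha_1| < 1$ and $e_m/e_{m+1} \to 0$ as
m → ∞. So by the Product Formula, $P(\mathbf{u}_m) = 0$ for all sufficiently large m.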
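The step just made uses the Product Formula in the form: a non-zero element of K has product of normalized absolute values equal to 1, so an element whose product is < 1 must be 0. The following sketch checks the formula in the simplest case K = Q, where the places are the ordinary absolute value and the p-adic absolute values; this restriction to Q and the function names are assumptions for illustration.

```python
from fractions import Fraction


def prime_factors(n):
    """Set of primes dividing the non-zero integer n (trial division)."""
    n, p, out = abs(n), 2, set()
    while p * p <= n:
        while n % p == 0:
            out.add(p)
            n //= p
        p += 1
    if n > 1:
        out.add(n)
    return out


def p_adic_abs(x, p):
    """|x|_p = p^(-ord_p(x)) for a non-zero rational x."""
    x = Fraction(x)
    num, den, order = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        order += 1
    while den % p == 0:
        den //= p
        order -= 1
    return Fraction(p) ** (-order)


def product_over_all_places(x):
    """|x| times the product of |x|_p over all primes p; equals 1 for x != 0."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the Product Formula concerns non-zero numbers")
    result = abs(x)
    # Only primes dividing numerator or denominator contribute a factor != 1.
    for p in prime_factors(x.numerator) | prime_factors(x.denominator):
        result *= p_adic_abs(x, p)
    return result


check = product_over_all_places(Fraction(-12, 35))
```

For x = -12/35 the factors are 12/35, 1/4, 1/3, 5 and 7, whose product is 1.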
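We infer that for sufficiently large m we have by Taylor’s formula,
\[ 0 = P(\mathbf{u}_m) - P(\mathbf{u}_{m-1}) = \sum_{\mathbf{a}} P_{\mathbf{a}}(\mathbf{u}_{m-1}) \prod_{i=1}^{n} \prod_{l=0}^{L} \big( e_m (e_m - 1) \cdots (e_m - l + 1) \alpha_i^{e_m} \big)^{a_{il}}, \]
where the sum is over all tuples of non-negative integers $\mathbf{a} = (a_{il})_{i=1,\ldots,n;\ l=0,\ldots,L}$
with $\mathbf{a} \neq \mathbf{0}$, $\sum_{i,l} a_{il} \leq D$ and where $P_{\mathbf{a}} := \big( \prod_{i=1}^{n} \prod_{l=0}^{L} (a_{il}!)^{-1} \partial^{a_{il}}/\partial X_{il}^{a_{il}} \big) P$.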
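Estimating the terms with partial derivatives of order at least 2, we get by
(10.12.5),
\[ \Big| \sum_{i=1}^{n} \beta_i(m) \alpha_i^{e_m} \Big| \leq C_6 \max(1, \|\mathbf{u}_{m-1}\|)^D e_m^{LD} |\alpha_1|^{2e_m} \leq C_7^{e_{m-1}} e_m^{LD} |\alpha_1|^{2e_m}, \tag{10.12.7} \]
where
\[ \beta_i(m) := \sum_{l=0}^{L} e_m (e_m - 1) \cdots (e_m - l + 1) \cdot \frac{\partial P(\mathbf{u}_{m-1})}{\partial X_{il}} \quad \text{for } i = 1, \ldots, n. \]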
We apply Lemma 10.12.3 with $\mathcal{M} = \{e_k\}_{k=0}^{\infty}$ minus possibly a finite subset,
$\gamma_i = \alpha_i$, $A_i(e_k) = \beta_i(k)$ for i = 1, . . . , n, k ≥ 0, to show that (10.12.7) cannot
hold for infinitely many m and derive a contradiction. Condition (10.12.1)
is satisfied with $\gamma_i = \alpha_i$ for i = 1, . . . , n by assumption. As for condition
(10.12.2), we have to show that for all sufficiently large m we have $\beta_i(m) \neq 0$
for i = 1, . . . , n. Indeed, suppose that for some i we have $\beta_i(m) = 0$ for
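infinitely many m. Then for these m,
\[ \frac{\partial P(\mathbf{u}_{m-1})}{\partial X_{iL}} = - \sum_{l=0}^{L-1} \frac{1}{(e_m - l) \cdots (e_m - L + 1)} \cdot \frac{\partial P(\mathbf{u}_{m-1})}{\partial X_{il}} \]
and by letting m → ∞ we get
\[ \frac{\partial P(\mathbf{u})}{\partial X_{iL}} = 0. \]
But this is impossible since we had chosen P of minimal total degree with
$P(\mathbf{u}) = 0$.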
To verify (10.12.3) we have to estimate the absolute logarithmic height of
$\beta_i(m)$. By a similar computation as in (10.12.6), one has for all sufficiently
large m,
\[ |\beta_i(m)|_w \leq C_{8w}^{e_{m-1}} \quad \text{for } i = 1, \ldots, n,\ w \in M_K \]
and so $h(\beta_i(m)) \leq C_9 e_{m-1}$ for i = 1, . . . , n. Hence
\[ \frac{h(\beta_i(m))}{e_m} \leq \frac{C_9 e_{m-1}}{e_m} \to 0 \quad \text{as } m \to \infty. \]
So all conditions of Lemma 10.12.3 are satisfied. By (10.12.7) and $|\alpha_1|_v = |\alpha_1|^2 < 1$, $e_{m-1}/e_m \to 0$ as m → ∞, there is θ with 0 < θ < 1 such that for
all sufficiently large m,
\[ |\beta_1(m)\alpha_1^{e_m} + \cdots + \beta_n(m)\alpha_n^{e_m}|_v \leq (|\alpha_1|_v \theta)^{e_m}. \]
But this contradicts Lemma 10.12.3. Theorem 10.12.2 follows.
Nishioka (1989) proved algebraic independence results for values at algebraic points of power series
\[ f_\omega(z) = \sum_{k=0}^{\infty} [k\omega] z^k, \]
where ω is a real irrational number and [x] denotes the integral part of a real
number x. Her method of proof is a variation on that given above, and the
main tool is again Theorem 6.1.1. One of her results from that paper (i.e.,
Theorem 2) states that if ω has unbounded partial quotients in its continued
fraction expansion, $\alpha_1, \ldots, \alpha_n$ are algebraic numbers such that $|\alpha_i| < 1$ for
i = 1, . . . , n and none of the quotients $\alpha_i/\alpha_j$ (1 ≤ i < j ≤ n) is a root of
unity, then the numbers $f_\omega(\alpha_1), \ldots, f_\omega(\alpha_n)$ are algebraically independent over
the rationals.
In Nishioka (1994), the author applies Theorem 6.1.1 to obtain algebraic
independence results for values of Mahler functions at algebraic points. For an
extensive treatment of transcendence theory of Mahler functions we refer to
Nishioka (1996). We state only the following special case of Nishioka (1994),
Proposition: let
\[ F_r(z) := \sum_{k=0}^{\infty} z^{r^k}, \qquad G_r(z) := \prod_{k=0}^{\infty} \big(1 - z^{r^k}\big). \]
Then for every algebraic number α with 0 < |α| < 1, the numbers $F_r(\alpha)$ (r =
2, 3, . . .), $G_r(\alpha)$ (r = 2, 3, . . .) are algebraically independent over the rationals.
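Reading $F_r(z) = \sum_{k \geq 0} z^{r^k}$ and $G_r(z) = \prod_{k \geq 0} (1 - z^{r^k})$, both functions satisfy simple functional equations under $z \mapsto z^r$, namely $F_r(z^r) = F_r(z) - z$ and $G_r(z) = (1 - z) G_r(z^r)$, which is what makes them Mahler functions. The sketch below verifies the corresponding exact identities for truncations; the function names and the truncation parameter N are illustrative assumptions.

```python
from fractions import Fraction


def F_trunc(r, z, N):
    """Truncation sum_{k=0}^{N} z^(r^k) of F_r(z)."""
    return sum(Fraction(z) ** (r ** k) for k in range(N + 1))


def G_trunc(r, z, N):
    """Truncation prod_{k=0}^{N} (1 - z^(r^k)) of G_r(z)."""
    out = Fraction(1)
    for k in range(N + 1):
        out *= 1 - Fraction(z) ** (r ** k)
    return out


z = Fraction(1, 2)
# Truncated forms of F_r(z^r) = F_r(z) - z and G_r(z) = (1 - z) G_r(z^r).
F_checks = [F_trunc(r, z ** r, 3) == F_trunc(r, z, 4) - z for r in (2, 3)]
G_checks = [G_trunc(r, z, 4) == (1 - z) * G_trunc(r, z ** r, 3) for r in (2, 3)]
```

The truncated identities hold exactly because substituting $z^r$ shifts the index k by one; letting N tend to infinity gives the functional equations for $|z| < 1$.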
There are various other transcendence results that follow from the Subspace
Theorem but not specifically from Theorem 6.1.1, see for instance Corvaja and
Zannier (2002b), the survey Bugeaud (2011), and the book Bugeaud (2012), in
particular chapters 8 and 9.
References
Adamczewski, B. and J. P. Bell (2012), On vanishing coefficients of algebraic power
series over fields of positive characteristic, Invent. Math. 187, 343–393.
Ahlgren, S. (1999), The set of solutions of a polynomial-exponential equation, Acta
Arith. 87, 189–207.
Allen, P. B. (2007), On the multiplicity of linear recurrence sequences, J. Number Theory
126, 212–216.
Amoroso, F. and E. Viada (2009), Small points on subvarieties of a torus, Duke Math.
J. 150, 407–442.
Amoroso, F. and E. Viada (2011), On the zeros of linear recurrence sequences, Acta
Arith. 147, 387–396.
Arenas-Carmona, L., D. Berend and V. Bergelson (2008), Ledrappier’s system is almost
mixing of all orders, Ergodic Theory Dynam. Systems 28, 339–365.
Aschenbrenner, M. (2004), Ideal membership in polynomial rings over the integers,
J. Amer. Math. Soc. 17, 407–442.
Ashrafi, N. and P. Vámos (2005), On the unit sum number of some rings, Quart. J.
Math. 56, 1–12.
Baker, A. (1966), Linear forms in the logarithms of algebraic numbers, Mathematika
13, 204–216.
Baker, A. (1967a), Linear forms in the logarithms of algebraic numbers, II, Mathematika
14, 102–107.
Baker, A. (1967b), Linear forms in the logarithms of algebraic numbers, III, Mathematika 14, 220–228.
Baker, A. (1968a), Linear forms in the logarithms of algebraic numbers, IV, Mathematika
15, 204–216.
Baker, A. (1968b), Contributions to the theory of Diophantine equations, Philos. Trans.
Roy. Soc. London, Ser. A 263, 173–208.
Baker, A. (1968c), The Diophantine equation $y^2 = ax^3 + bx^2 + cx + d$, J. London
Math. Soc. 43, 1–9.
Baker, A. (1969), Bounds for the solutions of the hyperelliptic equation, Proc. Camb.
Philos. Soc. 65, 439–444.
Baker, A. (1975), Transcendental number theory, Cambridge University Press.
Baker, A., ed. (1988), New Advances in Transcendence Theory, Cambridge University
Press.
Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:30:52, subject to the Cambridge
Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.013
Baker, A. (1998), Logarithmic forms and the abc-conjecture, in: Number Theory: Diophantine, Computational and Algebraic Aspects, Proc. Conf. Eger, 1996, K. Győry,
A. Pethő and V. T. Sós, eds., de Gruyter, 37–44.
Baker, A. (2004), Experiments on the abc-conjecture, Publ. Math. Debrecen 65, 253–260.
Baker, A. and H. Davenport (1969), The equations $3x^2 - 2 = y^2$ and $8x^2 - 7 = z^2$,
Quart. J. Math. Oxford Ser. (2) 20, 129–137.
Baker, A. and D. W. Masser, eds. (1977), Transcendence theory: advances and applications, Academic Press.
Baker, A. and G. Wüstholz (2007), Logarithmic Forms and Diophantine Geometry,
Cambridge University Press.
Barroero, F., C. Frei and R. F. Tichy (2011), Additive unit representations in rings over
global fields – a survey, Publ. Math. Debrecen 79, 291–307.
Belcher, P. (1974), Integers expressible as sums of distinct units, Bull. London Math.
Soc. 6, 66–68.
Bertók, Cs. (2013), Representing integers as sums or differences of general power
products, Acta Math. Hungar. 141, 291–300.
Bérczes, A. (2000), On the number of solutions of index form equations, Publ. Math.
Debrecen 56, 251–262.
Bérczes, A. (2015a), Effective results for unit points over finitely generated domains,
Math. Proc. Camb. Phil. Soc. 158, 331–353.
Bérczes, A. (2015b), Effective results for division points on curves in $\mathbb{G}_m^2$, J. Théor. Nombres
Bordeaux, to appear.
Bérczes, A., J.-H. Evertse and K. Győry (2004), On the number of equivalence classes
of binary forms of given degree and given discriminant, Acta Arith. 113, 363–399.
Bérczes, A., J.-H. Evertse and K. Győry (2007a), On the number of pairs of binary
forms with given degree and given resultant, Acta Arith. 128, 19–54.
Bérczes, A., J.-H. Evertse and K. Győry (2007b), Diophantine problems related to discriminants and resultants of binary forms, in: Diophantine Geometry, proceedings
of a trimester held from April–July 2005, U. Zannier, ed., CRM series, Scuola
Normale Superiore Pisa, pp. 45–63.
Bérczes, A., J.-H. Evertse and K. Győry (2009), Effective results for linear equations in
two unknowns from a multiplicative division group, Acta Arith. 136, 331–349.
Bérczes, A., J.-H. Evertse and K. Győry (2013), Multiply monogenic orders, Ann. Sc.
Norm. Super. Pisa Cl. Sci. (5) 12, 467–497.
Bérczes, A., J.-H. Evertse and K. Győry (2014), Effective results for Diophantine
equations over finitely generated domains, Acta Arith. 163, 71–100.
Bérczes, A., J.-H. Evertse, K. Győry and C. Pontreau (2009), Effective results for points
on certain subvarieties of tori, Math. Proc. Camb. Phil. Soc. 147, 69–94.
Bérczes, A. and K. Győry (2002), On the number of solutions of decomposable polynomial equations, Acta Arith. 101, 171–187.
Beukers, F. and H. P. Schlickewei (1996), The equation x + y = 1 in finitely generated
groups, Acta Arith. 78, 189–199.
Beukers, F. and D. Zagier (1997), Lower bounds of heights of points on hypersurfaces,
Acta Arith. 79, 103–111.
Bilu, Yu. F. (1995), Effective analysis of integral points on algebraic curves, Israel J.
Math. 90, 235–252.
Bilu, Yu. F. (2002), Baker’s method and modular curves, in: A Panorama of Number
Theory, or The View from Baker’s Garden, Proc. conf. ETH Zurich, 1999,
G. Wüstholz, ed., Cambridge University Press, pp. 73–88.
Bilu, Yu. F. (2008), The many faces of the subspace theorem [after Adamczewski,
Bugeaud, Corvaja, Zannier, . . .], Séminaire Bourbaki, Vol. 2006/2007, Astérisque
317, Exp. No. 967, vii, 1–38.
Bilu, Yu. F. and Y. Bugeaud (2000), Démonstration du théorème de Baker-Feldman via
les formes linéaires en deux logarithmes, J. Théorie des Nombres, Bordeaux, 12,
13–23.
Bilu, Yu. F., I. Gaál and K. Győry (2004), Index form equations in sextic fields: a hard
computation, Acta Arith. 115, 85–96.
Bilu, Yu. F. and G. Hanrot (1996), Solving Thue equations of high degree, J. Number
Theory, 60, 373–392.
Bilu, Yu. F. and G. Hanrot (1998), Solving superelliptic Diophantine equations by
Baker’s method, Compositio Math. 112, 273–312.
Bilu, Yu. F. and G. Hanrot (1999), Thue equations with composite fields, Acta Arith.,
88, 311–326.
Birch, B. J. and J. R. Merriman (1972), Finiteness theorems for binary forms with given
discriminant, Proc. London Math. Soc. 24, 385–394.
Bombieri, E. (1993), Effective Diophantine approximation on $\mathbb{G}_m$, Ann. Scuola Norm.
Sup. Pisa (IV) 20, 61–89.
Bombieri, E. (1994), On the Thue-Mahler equation (II), Acta Arith. 67, 69–96.
Bombieri, E. and P. B. Cohen (1997), Effective Diophantine approximation on $\mathbb{G}_m$, II,
Ann. Scuola Norm. Sup. Pisa (IV) 24, 205–225.
Bombieri, E. and P. B. Cohen (2003), An elementary approach to effective Diophantine approximation on $\mathbb{G}_m$, in: Number Theory and Algebraic Geometry, To
Peter Swinnerton-Dyer on his 75th birthday, London Math. Soc. Lecture Note
Series 303, M. Reid and A. Skorobogatov, eds., Cambridge University Press,
pp. 41–62.
Bombieri, E. and W. Gubler (2006), Heights in Diophantine Geometry, Cambridge
University Press.
Bombieri, E., J. Mueller and M. Poe (1997), The unit equation and the cluster principle,
Acta Arith. 79, 361–389.
Bombieri, E., J. Mueller and U. Zannier (2001), Equations in one variable over function
fields, Acta Arith. 99, 27–39.
Bombieri, E. and W. M. Schmidt (1987), On Thue’s equation, Invent. Math. 88, 69–81.
Borevich, Z. I. and I. R. Shafarevich (1967), Number Theory, 2nd edn., Academic Press.
Borosh, I., M. Flahive, D. Rubin and B. Treybig (1989), A sharp bound for solutions of
linear Diophantine equations, Proc. Amer. Math. Soc. 105, 844–846.
Bosma, W., J. Cannon and C. Playoust (1997), The Magma algebra system I. The user
language, J. Symbolic Comput. 24, 235–265.
Brindza, B. (1984), On S-integral solutions of the equation $y^m = f(x)$, Acta Math.
Hungar. 44, 133–139.
Brindza, B. and K. Győry (1990), On unit equations with rational coefficients, Acta
Arith. 53, 367–388.
Broberg, N. (1999), Some examples related to the abc-conjecture for algebraic number
fields, Math. Comp. 69, 1707–1710.
Browkin, J. (2000), The abc-conjecture, in: Number Theory, R. P. Bambah, V. C. Dumir
and R. J. Hans-Gill, eds., Birkhäuser, pp. 75–105.
Brownawell, W. D. and D. W. Masser (1986), Vanishing sums in function fields, Math.
Proc. Camb. Phil. Soc. 100, 427–434.
Brunotte, H., A. Huszti and A. Pethő (2006), Bases of canonical number systems in
quartic number fields, J. Théor. Nombres Bordeaux 18, 537–557.
Bugeaud, Y. (1998), Bornes effectives pour les solutions des équations en S-unités et
des équations de Thue-Mahler, J. Number Theory 71, 227–244.
Bugeaud, Y. (2011), Quantitative versions of the subspace theorem and applications,
J. Théor. Nombres Bordeaux 23, 35–57.
Bugeaud, Y. (2012), Distribution Modulo One and Diophantine Approximation,
Cambridge Tracts in Mathematics 193, Cambridge University Press.
Bugeaud, Y. and K. Győry (1996a), Bounds for the solutions of unit equations, Acta
Arith. 74, 67–80.
Bugeaud, Y. and K. Győry (1996b), Bounds for the solutions of Thue-Mahler equations
and norm form equations, Acta Arith. 74, 273–292.
Bugeaud, Y. and F. Luca (2004), A quantitative lower bound for the greatest prime factor
of (ab + 1)(bc + 1)(ca + 1), Acta Arith. 114, 275–294.
Bundschuh, P. and F.-J. Wylegala (1980), Über algebraische Unabhängigkeit bei gewissen nichtfortsetzbaren Potenzreihen, Arch. Math. 34, 32–36.
Canci, J. K. (2007), Finite orbits for rational functions, Indag. Mathem., N.S. 18, 203–214.
Cassels, J. W. S. (1959), An Introduction to the Geometry of Numbers, Springer
Verlag.
Cijsouw, P. L. and R. Tijdeman (1973), On the transcendence of certain power series of
algebraic numbers, Acta Arith. 23, 301–305.
Coates, J. (1969), An effective p-adic analogue of a theorem of Thue, Acta Arith. 15,
279–305.
Coates, J. (1970), An effective p-adic analogue of a theorem of Thue II, The greatest
prime factor of a binary form, Acta Arith. 16, 392–412.
Cohen, H. (1993), A Course in Computational Algebraic Number Theory, Springer
Verlag.
Cohen, H. (2000), Advanced Topics in Computational Number Theory, Springer Verlag.
Conway, J. H. and A. J. Jones (1976), Trigonometric Diophantine equations (on vanishing sums of roots of unity), Acta Arith. 30, 229–240.
Corvaja, P., W. M. Schmidt and U. Zannier (2010), The Diophantine equation
$\alpha_1^{x_1} \cdots \alpha_n^{x_n} = f(x_1, \ldots, x_n)$ II, Trans. Amer. Math. Soc. 362, 2115–2123.
Corvaja, P. and U. Zannier (2002a), A subspace theorem approach to integral points on
curves, C.R. Math. Acad. Sci. Paris 334, 267–271.
Corvaja, P. and U. Zannier (2002b), Some new applications of the subspace theorem,
Compos. Math. 131, 319–340.
Corvaja, P. and U. Zannier (2003), On the greatest prime factor of (ab + 1)(ac + 1),
Proc. Amer. Math. Soc. 131, 1705–1709.
Corvaja, P. and U. Zannier (2004a), On a general Thue’s equation, Amer. J. Math. 126,
1033–1055.
Corvaja, P. and U. Zannier (2004b), On integral points on surfaces, Ann. Math. 160,
705–726.
Corvaja, P. and U. Zannier (2006), On the integral points on certain surfaces, Int. Math.
Res. Not. Art.ID 98623, 20 pp.
Corvaja, P. and U. Zannier (2008), Applications of the Subspace Theorem to certain
Diophantine problems: a survey of some recent results, in: Diophantine Approximation, Festschrift for Wolfgang Schmidt, H. P. Schlickewei, K. Schmidt and R.
Tichy, eds., Springer Verlag, pp. 161–174.
Daberkow, M., C. Fieker, J. Klüners, M. Pohst, K. Roegner and K. Wildanger (1997),
KANT V4, J. Symbolic Comput. 24, 267–283.
David, S. and P. Philippon (1999), Minorations des hauteurs normalisées des sous-variétés des tores, Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 28, 489–543, Errata,
29, 729–731.
Delone (Delaunay), B. N. (1930), Über die Darstellung der Zahlen durch die binären
kubischen Formen von negativer Diskriminante, Math. Z. 31, 1–26.
Delone, B. N. and D. K. Faddeev (1940), The theory of irrationalities of the third degree
(Russian), Inst. Math. Steklov 11, Acad. Sci. USSR. English translation, Amer.
Math. Soc., 1964.
Derksen, H. (2007), A Skolem-Mahler-Lech theorem in positive characteristic and finite
automata, Invent. Math. 168, 175–244.
Derksen, H. and D. W. Masser (2012), Linear equations over multiplicative groups,
recurrences, and mixing I, Proc. London Math. Soc. 104, 1045–1083.
Dombek, D., L. Hajdu and A. Pethő (2014), Representing algebraic integers as linear
combinations of units, Period. Math. Hung. 68, 135–142.
Dubois, E. and G. Rhin (1975), Approximations rationnelles simultanées de nombres
algébriques réels et de nombres algébriques p-adiques, in: Journées Arithmétiques
de Bordeaux (Conf. Univ. Bordeaux, 1974), W. W. Adams, ed., Astérisque 24/25,
Soc. Math. France, pp. 211–227.
Dubois, E. and G. Rhin (1976), Sur la majoration de formes linéaires à coefficients
algébriques réels et p-adiques. Démonstration d’une conjecture de K. Mahler,
C.R. Acad. Sci. Paris Sér. A-B 282, A1211–A1214.
Dvornicich, R. and U. Zannier (2000), On sums of roots of unity, Monatsh. Math. 129,
97–108.
Dyson, F. J. (1947), The approximation of algebraic numbers by rationals, Acta Math.
79, 225–240.
Eichler, M. (1966), Introduction to the theory of algebraic numbers and functions,
Academic Press.
Elkies, N. D. (1991), ABC implies Mordell, Int. Math. Res. Not. 7, 99–109.
Erdős, P. (1976), Problems in number theory and combinatorics, Proc. 6th Manitoba
Conference on Numerical Math. pp. 35–58.
Erdős, P., C. L. Stewart and R. Tijdeman (1988), Some Diophantine equations with
many solutions, Compos. Math. 66, 37–56.
Erdős, P. and P. Turán (1934), On a problem in the elementary theory of numbers, Amer.
Math. Monthly 41, 608–611.
Everest, G. R. and K. Győry (1997), Counting solutions of decomposable form equations, Acta Arith. 79, 173–191.
Evertse, J.-H. (1983), Upper bounds for the numbers of solutions of Diophantine equations, Ph.D. thesis, University of Leiden, Leiden. Also published as Math. Centre
Tracts No. 168, CWI, Amsterdam.
Evertse, J.-H. (1984a), On equations in S-units and the Thue-Mahler equation, Invent.
Math. 75, 561–584.
Evertse, J.-H. (1984b), On sums of S-units and linear recurrences, Compos. Math. 53,
225–244.
Evertse, J.-H. (1993), Estimates for reduced binary forms, J. Reine Angew. Math. 434,
159–190.
Evertse, J.-H. (1995), The number of solutions of decomposable form equations, Invent.
Math. 122, 559–601.
Evertse, J.-H. (1996), An improvement of the quantitative subspace theorem, Compos.
Math. 101, 225–311.
Evertse, J.-H. (1997), The number of solutions of the Thue-Mahler equation, J. Reine
Angew. Math. 482, 121–149.
Evertse, J.-H. (1998), Lower bounds for resultants, II, in: Number Theory, Diophantine,
Computational and Algebraic Aspects, Proc. Conf. Eger, Hungary, 1996, K. Győry,
A. Pethő, V. T. Sós, eds., Walter de Gruyter, pp. 181–198.
Evertse, J.-H. (1999), The number of solutions of linear equations in roots of unity, Acta
Arith. 89, 45–51.
Evertse, J.-H. (2002), Points on subvarieties of tori, in: A Panorama of Number Theory,
or the View from Baker’s Garden, Proc. conf. ETH Zürich, 1999, G. Wüstholz, ed.,
Cambridge University Press, pp. 214–230.
Evertse, J.-H. (2004), Linear equations with unknowns from a multiplicative group
whose solutions lie in a small number of subspaces, Indag. Math. (N.S.) 15, 347–
355.
Evertse, J.-H. and R. G. Ferretti (2002), Diophantine inequalities on projective varieties,
Int. Math. Res. Not. 2002:25, 1295–1330.
Evertse, J.-H. and R. G. Ferretti (2008), A generalization of the Subspace Theorem
with polynomials of higher degree, in: Diophantine Approximation, Festschrift for
Wolfgang Schmidt, H. P. Schlickewei, K. Schmidt and R. Tichy, eds., Springer
Verlag, pp. 175–198.
Evertse, J.-H. and R. G. Ferretti (2013), A further improvement of the Quantitative
Subspace Theorem, Ann. Math. 177, 513–590.
Evertse, J.-H., I. Gaál and K. Győry (1989), On the numbers of solutions of decomposable polynomial equations, Arch. Math. 52, 337–353.
Evertse, J.-H. and K. Győry (1985), On unit equations and decomposable form equations, J. Reine Angew. Math. 358, 6–19.
Evertse, J.-H. and K. Győry (1988a), On the number of polynomials and integral
elements of given discriminant, Acta Math. Hung. 51, 341–362.
Evertse, J.-H. and K. Győry (1988b), On the number of solutions of weighted unit
equations, Compos. Math. 66, 329–354.
Evertse, J.-H. and K. Győry (1988c), Finiteness criteria for decomposable form equations, Acta Arith. 50, 357–379.
Evertse, J.-H. and K. Győry (1988d), Decomposable form equations, in: New Advances
in Transcendence Theory, Proc. conf. Durham 1986, A. Baker, ed., pp. 175–202.
Evertse, J.-H. and K. Győry (1989), Thue-Mahler equations with a small number of
solutions, J. Reine Angew. Math. 399, 60–80.
Evertse, J.-H. and K. Győry (1991), Effective finiteness results for binary forms with
given discriminant, Compositio Math., 79, 169–204.
Evertse, J.-H. and K. Győry (1992a), Effective finiteness theorems for decomposable
forms of given discriminant, Acta Arith. 60, 233–277.
Evertse, J.-H. and K. Győry (1992b), Discriminants of decomposable forms, in: New
Trends in Probability and Statistics, F. Schweiger and E. Manstavičius, eds., pp. 39–
56.
Evertse, J.-H. and K. Győry (1993), Lower bounds for resultants, I, Compositio Math.
88, 1–23.
Evertse, J.-H. and K. Győry (1997), The number of families of solutions of decomposable form equations, Acta Arith. 80, 367–394.
Evertse, J.-H. and K. Győry (2013), Effective results for unit equations over finitely
generated domains, Math. Proc. Camb. Phil. Soc. 154, 351–380.
Evertse, J.-H. and K. Győry (2016), Discriminant Equations in Diophantine Number
Theory, Cambridge: Cambridge University Press, to appear.
Evertse, J.-H., K. Győry, C. L. Stewart and R. Tijdeman (1988a), On S-unit equations
in two unknowns, Invent. Math. 92, 461–477.
Evertse, J.-H., K. Győry, C. L. Stewart and R. Tijdeman (1988b), S-unit equations
and their applications, in: New Advances in Transcendence Theory, Proc. conf.
Durham 1986, A. Baker, ed., pp. 110–174. Cambridge University Press.
Evertse, J.-H., P. Moree, C. L. Stewart and R. Tijdeman (2003), Multivariate equations
with many solutions, Acta Arith. 107, 103–125.
Evertse, J.-H. and H. P. Schlickewei (1999), The Absolute Subspace Theorem and
linear equations with unknowns from a multiplicative group, in: Number Theory
in Progress, proc. conf. Zakopane 1997 in honour of the 60th birthday of Prof.
Andrzej Schinzel, K. Győry, H. Iwaniec and J. Urbanowicz, eds., Walter de Gruyter,
pp. 121–142.
Evertse, J.-H. and H. P. Schlickewei (2002), A quantitative version of the Absolute
Subspace Theorem, J. Reine Angew. Math. 548, 21–127.
Evertse, J.-H., H. P. Schlickewei and W. M. Schmidt (2002), Linear equations in variables
which lie in a multiplicative group, Ann. Math. 155, 807–836.
Evertse, J.-H. and J. H. Silverman (1986), Uniform bounds for the number of solutions
to $Y^n = f(X)$, Math. Proc. Camb. Phil. Soc. 100, 237–248.
Evertse, J.-H. and U. Zannier (2008), Linear equations with unknowns from a multiplicative group in a function field, Acta Arith. 133, volume dedicated to the 75th
birthday of Wolfgang Schmidt, 157–170.
Faltings, G. (1983), Endlichkeitssätze für abelsche Varietäten über Zahlkörpern, Invent.
Math. 73, 349–366, Erratum: Invent. Math. 75 (1984), 381.
Faltings, G. (1991), Diophantine approximation on abelian varieties, Ann. Math. 133,
549–576.
Faltings, G. (1994), The general case of S. Lang’s conjecture, in: Barsotti Symposium in
Algebraic Geometry (Abano Terme, 1991), 175–182, Perspect. Math. 15, Academic
Press.
Faltings, G. and G. Wüstholz (1994), Diophantine approximations on projective spaces,
Invent. Math. 116, 109–138.
Feldman, N. I. and Y. V. Nesterenko (1998), Transcendental numbers, Springer Verlag,
Vol. 44 of Encyclopaedia of Math. Sci.
Filipin, A., R. F. Tichy and V. Ziegler (2008), The additive unit structure of pure quartic
complex fields, Funct. Approx. Comment. Math. 39, 113–131.
Fincke, U. and M. Pohst (1985), Improved methods for calculating vectors of short
length in a lattice, including a complexity analysis, Math. Comp. 44, 463–471.
Frei, C. (2012), On rings of integers generated by their units, Bull. London Math. Soc.
44, 167–182.
Friedman, E. (1989), Analytic formulas for regulators of number fields, Invent. Math.
98, 599–622.
Fröhlich, A. and J. C. Shepherdson (1956), Effective procedures in field theory, Philos.
Trans. Roy. Soc. London, Ser. A 248, 407–432.
Gaál, I. (1984), Norm form equations with several dominating variables and explicit
lower bounds for inhomogeneous linear forms with algebraic coefficients, Studia
Sci. Math. Hungar 19, 399–411.
Gaál, I. (1985), Norm form equations with several dominating variables and explicit
lower bounds for inhomogeneous linear forms with algebraic coefficients, II, Studia
Sci. Math. Hungar 20, 333–344.
Gaál, I. (1986), Inhomogeneous discriminant form and index form equations and their
applications, Publ. Math. Debrecen 33, 1–12.
Gaál, I. (1988a), Integral elements with given discriminant over function fields, Acta
Math. Hungar. 52, 133–146.
Gaál, I. (1988b), Inhomogeneous norm form equations over function fields, Acta Arith.
51, 61–73.
Gaál, I. (2002), Diophantine equations and power integral bases, Birkhäuser.
Gaál, I. and M. Pohst (2002), On the resolution of relative Thue equations, Math. Comp.
71, no. 237, 429–440 (electronic).
Gaál, I. and M. Pohst (2006a), Diophantine equations over global function fields I, The
Thue equation, J. Number Theory 119, 49–65.
Gaál, I. and M. Pohst (2006b), Diophantine equations over global function fields II,
S-integral solutions of Thue equations, Exper. Math. 15, 1–6.
Gaál, I. and M. Pohst (2010), Diophantine equations over global function fields IV,
S-unit equations in several variables with an application to norm form equations,
J. Number Theory 130, 493–506.
Gebel, J., A. Pethő and H. G. Zimmer (1994), Computing integral points on elliptic
curves, Acta Arith. 67, 171–192.
Gelfond, A. O. (1934), Sur le septième problème de Hilbert, Izv. Akad. Nauk SSSR 7,
623–630.
Gelfond, A. O. (1935), On approximating transcendental numbers by algebraic numbers,
Dokl. Akad. Nauk SSSR 2, 177–182.
Gelfond, A. O. (1940), Sur la divisibilité de la différence des puissances de deux nombres
entiers par une puissance d’un idéal premier, Mat. Sbornik 7 (49), 7–26.
Gelfond, A. O. (1960), Transcendental and algebraic numbers, New York, Dover.
Ghioca, D. (2008), The isotrivial case in the Mordell-Lang theorem, Trans. Amer. Math.
Soc. 360, 3839–3856.
Grant, D. (1996), Sequences of Fields with Many Solutions to the Unit Equation, The
Rocky Mountain J. Math. 26, 1017–1029.
Granville, A. (1998), ABC allows us to count squarefrees, Int. Math. Res. Not. 19,
991–1009.
Granville, A. and H. M. Stark (2000), abc implies no “Siegel zeros” for L-functions of
characters with negative discriminant, Invent. Math. 139, 509–523.
Green, B. and T. Tao (2008), The primes contain arbitrarily long arithmetic progressions,
Ann. of Math. 167, 481–547.
Győry, K. (1971), Sur l’irréductibilité d’une classe des polynômes, I, Publ. Math.
Debrecen 18, 289–307.
Győry, K. (1972), Sur l’irréductibilité d’une classe des polynômes, II, Publ. Math.
Debrecen 19, 293–326.
Győry, K. (1973), Sur les polynômes à coefficients entiers et de discriminant donné,
Acta Arith. 23, 419–426.
Győry, K. (1974), Sur les polynômes à coefficients entiers et de discriminant donné II,
Publ. Math. Debrecen 21, 125–144.
Győry, K. (1976), Sur les polynômes à coefficients entiers et de discriminant donné III,
Publ. Math. Debrecen 23, 141–165.
Győry, K. (1978a), On polynomials with integer coefficients and given discriminant IV,
Publ. Math. Debrecen 25, 155–167.
Győry, K. (1978b), On polynomials with integer coefficients and given discriminant V,
p-adic generalizations, Acta Math. Acad. Sci. Hung. 32, 175–190.
Győry, K. (1978/1979), On the greatest prime factors of decomposable forms at integer
points, Ann. Acad. Sci. Fenn., Ser. A I, Math. 4, 341–355.
Győry, K. (1979), On the number of solutions of linear equations in units of an algebraic
number field, Comment. Math. Helv. 54, 583–600.
Győry, K. (1979/1980), On the solutions of linear diophantine equations in algebraic
integers of bounded norm, Ann. Univ. Sci. Budapest. Eötvös, Sect. Math. 22–23,
225–233.
Győry, K. (1980a), Explicit upper bounds for the solutions of some diophantine equations, Ann. Acad. Sci. Fenn., Ser. A I, Math. 5, 3–12.
Győry, K. (1980b), Résultats effectifs sur la représentation des entiers par des formes
décomposables, Queen’s Papers in Pure and Applied Math., No. 56.
Győry, K. (1980c), On certain graphs composed of algebraic integers of a number field
and their applications I, Publ. Math. Debrecen 27, 229–242.
Győry, K. (1981a), On the representation of integers by decomposable forms in several
variables, Publ. Math. Debrecen 28, 89–98.
Győry, K. (1981b), On S-integral solutions of norm form, discriminant form and index
form equations, Studia Sci. Math. Hungar 16, 149–161.
Győry, K. (1981c), On discriminants and indices of integers of an algebraic number
field, J. Reine Angew. Math. 324, 114–126.
Győry, K. (1982a), Polynomials of given discriminant and integral elements of given
discriminant over integral domains, C. R. Math. Rep. Acad. Sci. Canada 4, 75–80.
Győry, K. (1982b), On certain graphs associated with an integral domain and their
applications to Diophantine problems, Publ. Math. Debrecen 29, 79–94.
Győry, K. (1982c), On the irreducibility of a class of polynomials III, J. Number Theory
15, 164–181.
Győry, K. (1983), Bounds for the solutions of norm form, discriminant form and
index form equations in finitely generated integral domains, Acta Math. Hung. 42,
45–80.
Győry, K. (1984), Effective finiteness theorems for polynomials with given discriminant
and integral elements with given discriminant over finitely generated domains,
J. Reine Angew. Math. 346, 54–100.
Győry, K. (1990), On arithmetic graphs associated with integral domains, in: A Tribute
to Paul Erdős, Cambridge University Press, pp. 207–222.
Győry, K. (1992a), Some recent applications of S-unit equations, Astérisque 209,
17–38.
Győry, K. (1992b), Upper bounds for the numbers of solutions of unit equations in two
unknowns, Lithuanian Math. J. 32, 40–44.
Győry, K. (1992c), On the irreducibility of a class of polynomials IV, Acta Arith. 62,
399–405.
Győry, K. (1993a), On the numbers of families of solutions of systems of decomposable
form equations, Publ. Math. Debrecen 42, 65–101.
Győry, K. (1993b), Some applications of decomposable form equations to resultant
equations, Coll. Math. 65, 267–275.
Győry, K. (1993c), On the number of pairs of polynomials with given resultant or given
semi-resultant, Acta Sci. Math. 57, 515–529.
Győry, K. (1994), On the irreducibility of neighbouring polynomials, Acta Arith. 67,
283–294.
Győry, K. (1996), Applications of unit equations, in: Analytic Number Theory, RIMS
Kokyuroku 958, Kyoto, Japan, pp. 62–78.
Győry, K. (1998), Bounds for the solutions of decomposable form equations, Publ.
Math. Debrecen 52, 1–31.
Győry, K. (1999), On the distribution of solutions of decomposable form equations,
in: Number Theory in Progress, Proc. conf. in honour of 60th birthday of Andrzej
Schinzel, K. Győry, H. Iwaniec and J. Urbanowicz, eds., de Gruyter, pp. 237–265.
Győry, K. (2002), Solving diophantine equations by Baker’s theory, in: A Panorama of
Number Theory, Cambridge, pp. 38–72.
Győry, K. (2006), Polynomials and binary forms with given discriminant, Publ. Math.
Debrecen 69, 473–499.
Győry, K. (2008a), On the abc-conjecture in algebraic number fields, Acta Arith. 133,
281–295.
Győry, K. (2008b), On certain arithmetic graphs and their applications to diophantine
problems, Funct. Approx. Comment. Math., 39, 289–314.
Győry, K. (2010), S-unit equations in number fields: effective results, generalizations,
ABC-conjecture, in: Analytic number theory and related topics, RIMS Kokyuroku
1710, pp. 71–84.
Győry, K., L. Hajdu and R. Tijdeman (2011), Irreducibility criteria of Schur-type and
Pólya-type, Monatsh. Math. 163, 415–443.
Győry, K., L. Hajdu and R. Tijdeman (2014), Representation of finite graphs as difference graphs of S-units, I, J. Combinatorial Theory, Ser. A, 127, 314–335.
Győry, K. and Z. Z. Papp (1977), On discriminant form and index form equations,
Studia Sci. Math. Hungar. 12, 47–60.
Győry, K. and Z. Z. Papp (1978), Effective estimates for the integer solutions of norm
form and discriminant form equations, Publ. Math. Debrecen 25, 311–325.
Győry, K. and A. Pethő (1980), Über die Verteilung der Lösungen von Normformen
Gleichungen III, Acta Arith. 37, 143–165.
Győry, K., I. Pink and Á. Pintér (2004), Power values of polynomials and binomial
Thue-Mahler equations, Publ. Math. Debrecen 65, 341–362.
Győry, K. and Á. Pintér (2008), Polynomial powers and a common generalization of
binomial Thue-Mahler equations and S-unit equations, in: Diophantine Equations,
Proc. conf. in honour of Tarlok Shorey’s 60th birthday, N. Saradha, ed., New Delhi,
pp. 103–119.
Győry, K. and M. Ru (1998), Integer solutions of a sequence of decomposable form
inequalities, Acta Arith. 86, 227–237.
Győry, K., A. Sárközy and C. L. Stewart (1996), On the number of prime factors of
integers of the form ab + 1, Acta Arith. 74, 365–385.
Győry, K. and A. Schinzel (1994), On a conjecture of Posner and Rumsey, J. Number
Theory, 47, 63–78.
Győry, K., C. L. Stewart and R. Tijdeman (1986), On prime factors of sums of integers
I, Compositio Math. 59, 81–88.
Győry, K. and K. Yu (2006), Bounds for the solutions of S-unit equations and decomposable form equations, Acta Arith. 123, 9–41.
Hajdu, L. (1993), A quantitative version of Dirichlet’s S-unit theorem in algebraic
number fields, Publ. Math. Debrecen 42, 239–246.
Hajdu, L. (1997), On a problem of Győry and Schinzel concerning polynomials, Acta
Arith. 78, 287–295.
Hajdu, L. (2007), Arithmetic progressions in linear combinations of S-units, Period.
Math. Hung. 54, 175–181.
Hajdu, L. (2009), Optimal systems of fundamental S-units for LLL-reduction, Periodica
Math. Hung. 59, 79–105.
Hajdu, L. and F. Luca (2010), On the length of arithmetic progressions in linear combinations of S-units, Archiv Math. 94, 357–363.
Hajdu, L. and R. Tijdeman (2003), Polynomials dividing infinitely many quadrinomials
or quintinomials, Acta Arith. 107, 381–404.
Hajdu, L. and R. Tijdeman (2008), A criterion for polynomials to divide infinitely many
k-nomials, in: Diophantine Approximation, Festschrift for Wolfgang Schmidt, H. P.
Schlickewei, K. Schmidt and R. Tichy, eds., Springer Verlag, pp. 175–198.
Halter-Koch, F. and W. Narkiewicz (1997), Polynomial cycles and dynamical units,
in: Proc. Conf. Analytic and Elementary Number Theory, dedicated to the 80th
birthday of E. Hlawka, W. G. Nowak and J. Schoißengeier, eds., Wien, 1997,
70–80.
Halter-Koch, F. and W. Narkiewicz (2000), Scarcity of finite polynomial orbits, Publ.
Math. Debrecen 56, 405–414.
Hardy, G. H. and E. M. Wright (1980), An introduction to the theory of numbers,
5th. edn., Oxford University Press.
Haristoy, J. (2003), Équations diophantiennes exponentielles, Thèse de docteur,
Strasbourg.
Harris, J. (1992), Algebraic Geometry, A First Course, Springer Verlag.
Hartshorne, R. (1977), Algebraic Geometry, Springer Verlag.
Hermann, G. (1926), Die Frage der endlich vielen Schritte in der Theorie der Polynomideale, Math. Ann. 95, 736–788.
Hermite, C. (1851), Sur l’introduction des variables continues dans la théorie des nombres, J. Reine Angew. Math. 41, 191–216.
Hernández, S. and F. Luca (2003), On the largest prime factor of (ab + 1)(ac + 1)(bc +
1), Bol. Soc. Mat. Mexicana, 9, 235–244.
Hindry, M. (1988), Autour d’une Conjecture de Serge Lang, Invent. Math. 94, 575–603.
Houriet, J. (2007), Exceptional units and Euclidean number fields, Archiv Math. 88,
425–433.
Hrushovski, E. (1996), The Mordell-Lang conjecture for function fields, J. Amer. Math.
Soc. 9, 667–690.
Hsia, L.-C. and J. T.-Y. Wang (2004), The ABC theorem for higher-dimensional function
fields, Trans. Amer. Math. Soc. 356, no. 7, 2871–2887.
Jarden, M. and W. Narkiewicz (2007), On sums of units, Monatsh. Math. 150, 327–332.
de Jong, R. S. (1999), On p-adic norm form inequalities, Master thesis, Leiden.
de Jong, R. S. and G. Rémond (2011), Conjecture de Shafarevich effective pour les
revêtements cycliques, Algebra and Number Theory 5, 1133–1143.
von Känel, R. (2011), An effective proof of the hyperelliptic Shafarevich conjecture and
applications, Ph.D. thesis, ETH Zürich.
von Känel, R. (2013), On Szpiro’s discriminant conjecture, Internat. Math. Res. Notices
1–35. Published online: doi:10.1093/imrn/rnt079.
von Känel, R. (2014a), An effective proof of the hyperelliptic Shafarevich conjecture,
J. Théorie des Nombres, Bordeaux, 26, 507–530.
von Känel, R. (2014b), Modularity and integral points on moduli schemes,
arXiv:1310.7263v2 [math.NT].
Karpilovsky, G. (1988), Unit groups of classical rings, Oxford University Press.
Koblitz, N. (1984), p-adic Numbers, p-adic Analysis, and Zeta-Functions, Springer
Verlag.
Konyagin, S. and K. Soundararajan (2007), Two S-unit equations with many solutions,
J. Number Theory 124, 193–199.
Kotov, S. V. (1981), Effective bound for a linear form with algebraic coefficients in the
archimedean and p-adic metrics, Inst. Math. Akad. Nauk BSSR, Preprint No. 24,
Minsk (Russian).
Kotov, S. V. and V. G. Sprindžuk (1973), An effective analysis of the Thue-Mahler
equation in relative fields, Dokl. Akad. Nauk BSSR 17, 393–395 (Russian).
Kotov, S. V. and L. Trelina (1979), S-ganze Punkte auf elliptischen Kurven, J. Reine
Angew. Math. 306, 28–41.
Kovács, B. (1981), Canonical number systems in algebraic number fields, Acta Math.
Acad. Sci. Hungar. 37, 405–407.
Kovács, B. and A. Pethő (1991), Number systems in integral domains, especially in
orders of algebraic number fields, Acta Sci. Math. 55, 287–299.
Koymans, P. (2015), The Catalan Equation, Master thesis, Leiden University.
Lagarias, J. C. and K. Soundararajan (2011), Smooth solutions to the abc equation: the
xyz conjecture, J. Théorie des Nombres de Bordeaux 23, 209–234.
Lagrange, J. L. (1773), Recherches d’arithmétiques, Nouv. Mém. Acad. Berlin, 265–312;
Oeuvres III, 693–758.
Landau, E. (1918), Verallgemeinerung eines Pólyaschen Satzes auf algebraische Zahlkörper, Nachr. Ges. Wiss. Göttingen, 478–488.
Lang, S. (1960), Integral points on curves, Inst. Hautes Études Sci. Publ. Math. 6, 27–43.
Lang, S. (1962), Diophantine geometry, Wiley.
Lang, S. (1970), Algebraic Number Theory, Addison-Wesley.
Lang, S. (1978), Elliptic curves: Diophantine analysis, Springer Verlag.
Lang, S. (1983), Fundamentals of Diophantine Geometry, Springer Verlag.
Lang, S. (1984), Algebra, 2nd. edn., Addison-Wesley.
Langevin, M. (1999), Liens entre le théorème de Mason et la conjecture (abc), in:
Number Theory (5th conf. of CNTA, Ottawa ON 1996), R. Gupta and K. S. Williams,
eds. 187–213. CRM Proc. Lecture Notes 19, AMS, Providence RI.
Laurent, M. (1984), Équations diophantiennes exponentielles, Invent. Math. 78, 299–
327.
Laurent, M. (1989), Équations exponentielles polynômes et suites récurrentes linéaires,
II, J. Number Theory 31, 24–53.
Lech, C. (1953), A note on recurring series, Ark. Mat. 2, 417–421.
Lehmer, D. H. (1933), Factorization of certain cyclotomic functions, Ann. Math. (2) 34,
461–479.
Leitner, D. J. (2012), Linear equations over multiplicative groups in positive characteristic, Acta Arith. 153, 325–347.
Lenstra Jr., H. W. (1977), Euclidean number fields of large degree, Inventiones Math.
38, 237–254.
Lenstra, A. K., H. W. Lenstra Jr. and L. Lovász (1982), Factoring polynomials with
rational coefficients, Math. Ann. 261, 515–534.
Leutbecher, A. (1985), Euclidean fields having a large Lenstra constant, Ann. Inst.
Fourier 35, 83–106.
Leutbecher, A. and J. Martinet (1982), Lenstra’s constant and euclidean number fields,
Astérisque 94, 87–131.
Leutbecher, A. and G. Niklasch (1989), On cliques of exceptional units and Lenstra’s
construction of Euclidean fields, Lecture Notes Math. 1380, 150–178.
LeVeque, W. J. (1964), On the equation $y^m = f(x)$, Acta Arith. 9, 209–219.
LeVesque, C. and M. Waldschmidt (2011), Some remarks on diophantine equations and
diophantine approximation, Vietnam J. Math. 39, 343–368.
LeVesque, C. and M. Waldschmidt (2012), Familles d’équations de Thue-Mahler
n’ayant que des solutions triviales, Acta Arith. 155, 117–138.
Levin, A. (2006), One-parameter families of unit equations, Math. Res. Lett. 13, 935–
945.
Levin, A. (2008), The dimension of integral points and holomorphic curves on the
complements of hyperplanes, Acta Arith. 134, 259–270.
Levin, A. (2014), Lower bounds in logarithms and integral points on higher dimensional
varieties, Algebra Number Theory 8, 647–687.
Lewis, D. J. and K. Mahler (1961), Representation of integers by binary forms, Acta
Arith. 6, 333–363.
Liardet, P. (1974), Sur une conjecture de Serge Lang, C.R. Acad. Sci. Paris 279, 435–
437.
Liardet, P. (1975), Sur une conjecture de Serge Lang, Astérisque 24–25, Soc. Math.
France.
Liu, J. (2015), On p-adic Decomposable Form Inequalities, Ph.D. thesis, Leiden.
Loher, T. and D. Masser (2004), Uniformly counting points of bounded height, Acta
Arith. 111, 277–297.
Louboutin, S. (2000), Explicit bounds for residues of Dedekind zeta functions, values of
L-functions at s = 1, and relative class numbers, J. Number Theory 85, 263–282.
Loxton, J. H. and A. J. van der Poorten (1983), Multiplicative dependence in number
fields, Acta Arith. 42, 291–302.
Luca, F. (2005), On the greatest common divisor of u − 1 and v − 1 with u and v near
S-units, Monatsh. Math. 146, 239–256.
Mahler, K. (1933a), Zur Approximation algebraischer Zahlen I: Über den grössten
Primteiler binärer Formen, Math. Ann. 107, 691–730.
Mahler, K. (1933b), Zur Approximation algebraischer Zahlen III: Über die mittlere
Anzahl der Darstellungen grosser Zahlen durch binäre Formen, Acta Math. 62, 91–166.
Mahler, K. (1935a), Eine arithmetische Eigenschaft der Taylor-koeffizienten rationaler
Funktionen, Proc. Kon. Ned. Akad. Wetensch. 38, 50–60.
Mahler, K. (1935b), Über transzendente p-adische Zahlen, Compos. Math. 2, 259–275.
Mann, H. B. (1965), On linear relations between roots of unity, Mathematika 12, 107–
117.
Mason, R. C. (1983), The hyperelliptic equation over function fields, Math. Proc. Camb.
Phil. Soc. 93, 219–230.
Mason, R. C. (1984), Diophantine equations over function fields, Cambridge University
Press.
Mason, R. C. (1986a), Norm form equations I, J. Number Theory 22, 190–207.
Mason, R. C. (1986b), Norm form equations III: positive characteristic, Math. Proc.
Camb. Phil. Soc. 99, 409–423.
Mason, R. C. (1987), Norm form equations V. Degenerate modules, J. Number Theory
25, 239–248.
Mason, R. C. (1988), The study of Diophantine equations over function fields, in:
New Advances in Transcendence Theory, Proc. conf. Durham 1986, A. Baker, ed.,
Cambridge University Press, pp. 229–247.
Masser, D. W. (1985), Conjecture in “Open Problems” section, in: Proc. Symposium on
Analytic Number Theory, London, 25.
Masser, D. W. (2002), On abc and discriminants, Proc. Amer. Math. Soc. 130, 3141–
3150.
Masser, D. W. (2004), Mixing and linear equations over groups in positive characteristic,
Israel J. Math. 142, 189–204.
Masser, D. W. and G. Wüstholz (1983), Fields of large transcendence degree generated
by values of elliptic functions, Invent. Math. 72, 407–464.
Matveev, E. M. (2000), An explicit lower bound for a homogeneous rational linear form
in logarithms of algebraic numbers, II. Izvestiya: Mathematics 64, 1217–1269.
McQuillan, M. (1995), Division points on semi-abelian varieties, Invent. Math. 120,
143–159.
Mestre, J. F. (1981), Corps euclidiens, unités exceptionnelles et courbes elliptiques,
J. Number Theory 13, 123–137.
Minkowski, H. (1910), Geometrie der Zahlen, Teubner (Posthumously published; prepared by D. Hilbert and A. Speiser).
Moosa, R. and T. Scanlon (2002), The Mordell-Lang conjecture in positive characteristic
revisited, In: Model theory and applications, Quaderni di matematica 11, L. Belair,
Z. Chatzidakis, P. D’Aquino, D. Marker, M. Otero, F. Point and A. Wilkie, eds.
Dipartimento di Matematica Seconda Università di Napoli. pp. 273–296.
Moosa, R. and T. Scanlon (2004), F-structures and integral points on semiabelian
varieties over finite fields, Amer. J. Math. 126, 473–522.
Mordell, L. J. (1922a), On the rational solutions of the indeterminate equations of the
third and fourth degrees, Proc. Cambridge Philos. Soc. 21, 179–192.
Mordell, L. J. (1922b), Note on the integer solutions of the equation $Ey^2 = Ax^3 + Bx^2 + Cx + D$, Messenger Math. 51, 169–171.
Mordell, L. J. (1923), On the integer solutions of the equation $ey^2 = ax^3 + bx^2 + cx + d$, Proc. London Math. Soc. (2) 21, 415–419.
Moree, P. and C. L. Stewart (1990), Some Ramanujan-Nagell equations with many
solutions, Indag. Math. (N.S.) 1, 465–472.
Morton, P. and J. H. Silverman (1994), Rational periodic points of rational functions,
Intern. Math. Res. Not. (2), 97–110.
Mueller, J. (2000), S-unit equations in function fields via the abc-theorem, Bull. London
Math. Soc. 32, 163–170.
Murty, M. R. and H. Pasten (2013), Modular forms and effective Diophantine approximation, J. Number Theory 133, 3739–3754.
Nagell, T. (1930), Zur Theorie der kubischen Irrationalitäten, Acta Math. 55, 33–65.
Nagell, T. (1964), Sur une propriété des unités d’un corps algébrique, Arkiv för Mat. 5,
343–356.
Nagell, T. (1967), Sur les discriminants des nombres algébriques, Arkiv för Mat. 7,
265–282.
Nagell, T. (1968a), Quelques propriétés des nombres algébriques du quatrième degré,
Arkiv för Mat. 7, 517–525.
Nagell, T. (1968b), Sur les unités dans les corps biquadratiques primitifs du premier
rang, Arkiv för Mat. 7, 359–394.
Nagell, T. (1970), Sur un type particulier d’unités algébriques, Arkiv för Mat. 8, 163–
184.
Narkiewicz, W. (1989), Polynomial cycles in algebraic number fields, Colloq. Math. 58,
149–153.
Narkiewicz, W. (1995), Polynomial Mappings, Lecture Notes Math. 1600, Springer
Verlag.
Narkiewicz, W. and T. Pezda (1997), Finite Polynomial Orbits in Finitely Generated
Domains, Monatsh. Math. 124, 309–316.
Neukirch, J. (1992), Algebraische Zahlentheorie, Springer Verlag.
Nishioka, K. (1986), Proof of Masser’s Conjecture on the Algebraic Independence of
Values of Liouville Series, Proc. Japan Acad. Ser. A 62, 219–222.
Nishioka, K. (1987), Conditions for algebraic independence of certain power series of
algebraic numbers, Compos. Math. 62, 53–61.
Nishioka, K. (1989), Evertse theorem in algebraic independence, Arch. Math. 53, 159–
170.
Nishioka, K. (1994), Algebraic independence by Mahler’s method and S-unit equations,
Compos. Math. 92, 87–110.
Nishioka, K. (1996), Mahler Functions and Transcendence, Lecture Notes Math. 1631,
Springer Verlag.
Northcott, D. G. (1950), Periodic points on an algebraic variety, Ann. Math. 51, 167–177.
Parry, C. J. (1950), The p-adic generalization of the Thue-Siegel theorem, Acta Math.
83, 1–100.
Pasten, H. (2014), Arithmetic problems around the abc-conjecture and connections with
logic, Ph.D. thesis, Queen’s University, Canada.
Pethő, A. and R. Schulenberg (1987), Effektives Lösen von Thue Gleichungen, Publ.
Math. Debrecen 34, 189–196.
Pethő, A. and B. M. M. de Weger (1986), Products of prime powers in binary recurrence sequences I. The hyperbolic case, with an application to the generalized
Ramanujan-Nagell equation, Math. Comp. 47, 713–727.
Pezda, T. (1994), Polynomial cycles in certain local domains, Acta Arith. 66, 11–22.
Pezda, T. (2014), An algorithm determining cycles of polynomial mappings in integral
domains, Publ. Math. Debrecen 84, 399–414.
Poe, M. (1997), On distribution of solutions of S-unit equations, J. Number Theory 62,
221–241.
Pohst, M. E. (1993), Computational Algebraic Number Theory, Birkhäuser Verlag.
Pohst, M. E. and H. Zassenhaus (1989), Algorithmic algebraic number theory,
Cambridge University Press.
Poonen, R. (1999), Mordell-Lang plus Bogomolov, Invent. Math. 137, 413–425.
van der Poorten, A. J. and H. P. Schlickewei (1982), The growth condition for recurrence
sequences, Macquarie University Math. Rep. 82–0041.
van der Poorten, A. J. and H. P. Schlickewei (1991), Additive relations in fields,
J. Austral. Math. Soc. (Ser. A) 51, 154–170.
Posner, E. C. and H. Rumsey, Jr. (1965), Polynomials that divide infinitely many trinomials, Michigan Math. J., 12, 339–348.
Rémond, G. (2000a), Inégalité de Vojta en dimension supérieure, Ann. Scuola Norm.
Sup. Pisa Cl. Sci. (4) 29, 101–151.
Rémond, G. (2000b), Décompte dans une conjecture de Lang, Invent. Math. 142, 513–
545.
Rémond, G. (2002), Sur les sous-variétés des tores, Compos. Math. 134, 337–366.
Rémond, G. (2003), Approximation diophantienne sur les variétés semi-abéliennes,
Ann. Sci. École Norm. Sup. (4) 36, 191–212.
Ridout, P. (1958), The p-adic generalization of the Thue-Siegel-Roth Theorem, Mathematika 5, 40–48.
Robert, O., C. L. Stewart and G. Tenenbaum (2014), A refinement of the abc conjecture,
Bull. London Math. Soc. 46, 1156–1166.
Roquette, P. (1957), Einheiten und Divisorenklassen in endlich erzeugbaren Körpern,
Jahresber. Deutsch. Math. Verein 60, 1–21.
Rosser, J. B. and L. Schoenfeld (1962), Approximate formulas for some functions of
prime numbers, Illinois J. Math. 6, 64–94.
Roth, K. F. (1955), Rational approximations to algebraic numbers, Mathematika 2,
1–20.
Ru, M. and P. Vojta (1997), Schmidt’s subspace theorem with moving targets, Invent.
Math. 127, 51–65.
Ru, M. and P. M. Wong (1991), Integral points of $\mathbb{P}^n \setminus \{2n + 1 \text{ hyperplanes in general position}\}$, Invent. Math. 106, 195–216.
Schinzel, A. (1988), Reducibility of lacunary polynomials VIII, Acta Arith. 50, 91–106.
Schlickewei, H. P. (1976a), Linearformen mit algebraischen Koeffizienten, Manuscripta
Math. 18, 147–185.
Schlickewei, H. P. (1976b), Die p-adische Verallgemeinerung des Satzes von Thue-Siegel-Roth-Schmidt, J. Reine Angew. Math. 288, 86–105.
Schlickewei, H. P. (1976c), On products of special linear forms with algebraic coefficients, Acta Arith. 31, 389–398.
Schlickewei, H. P. (1977a), Über die diophantische Gleichung $x_1 + \cdots + x_n = 0$, Acta
Arith. 33, 183–185.
Schlickewei, H. P. (1977b), The p-adic Thue-Siegel-Roth-Schmidt theorem, Arch. Math.
(Basel) 29, 267–270.
Schlickewei, H. P. (1977c), On norm form equations, J. Number Theory 9, 370–380.
Schlickewei, H. P. (1977d), On linear forms with algebraic coefficients and Diophantine
equations, J. Number Theory 9, 381–392.
Schlickewei, H. P. (1977e), Inequalities for decomposable forms, Astérisque 41–42,
pp. 267–271.
Schlickewei, H. P. (1990), S-unit equations over number fields, Invent. Math. 102,
95–107.
Schlickewei, H. P. (1992), The quantitative Subspace Theorem for number fields,
Compos. Math. 82, 245–273.
Schlickewei, H. P. (1996a), Multiplicities of recurrence sequences, Acta Math. 176,
171–243.
Schlickewei, H. P. (1996b), Equations in roots of unity, Acta Arith. 76, 99–108.
Schlickewei, H. P. and W. M. Schmidt (2000), The Number of Solutions of Polynomial-Exponential Equations, Compos. Math. 120, 193–225.
Schlickewei, H. P. and C. Viola (1997), Polynomials that divide many trinomials, Acta
Arith. 78, 267–273.
Schlickewei, H. P. and C. Viola (1999), Polynomials that divide many k-nomials, in:
Number Theory in Progress, Vol. I, Proc. conf. in honour of the 60th birthday
of Andrzej Schinzel, K. Győry, H. Iwaniec and J. Urbanowicz eds. de Gruyter,
pp. 445–450.
Schlickewei, H. P. and E. Wirsing (1997), Lower bounds for the heights of solutions of
linear equations, Invent. Math. 129, 1–10.
Schmidt, W. M. (1971), Linearformen mit algebraischen Koeffizienten II, Math. Ann.
191, 1–20.
Schmidt, W. M. (1972), Norm form equations, Ann. Math. 96, 526–551.
Schmidt, W. M. (1973), Inequalities for resultants and for decomposable forms,
in: Diophantine Approximation and its Applications, Academic Press, pp. 235–
253.
Schmidt, W. M. (1975), Simultaneous approximation to algebraic numbers by elements
of a number field, Monatsh. Math. 79, 55–66.
Schmidt, W. M. (1978), Thue’s equation over function fields, J. Austral. Math. Soc. Ser.
A 25, 385–422.
Schmidt, W. M. (1980), Diophantine Approximation, Lecture Notes Math. 785, Springer
Verlag.
Schmidt, W. M. (1989), The subspace theorem in diophantine approximation, Compos.
Math. 69, 121–173.
Schmidt, W. M. (1990), The number of solutions of norm form equations, Trans. Amer.
Math. Soc. 317, 197–227.
Schmidt, W. M. (1991), Diophantine Approximations and Diophantine Equations,
Lecture Notes Math. 1467, Springer Verlag.
Schmidt, W. M. (1992), Integer points on curves of genus 1, Compos. Math. 81,
33–59.
Schmidt, W. M. (1996), Heights of points on subvarieties of $\mathbb{G}_m^n$, In: Number Theory 1993–94, London Math. Soc. Lecture Note Ser. 235, S. David, ed., 157–187.
Cambridge University Press.
Schmidt, W. M. (1999), The zero multiplicity of linear recurrence sequences, Acta Math.
182, 243–282.
Schmidt, W. M. (2000), Zeros of linear recurrence sequences, Publ. Math. Debrecen
56, 609–630.
Schmidt, W. M. (2003), Linear recurrence sequences, in: Diophantine Approximation,
C.I.M.E. Summer school, Cetraro, Italy, June 28–July 6, 2000, F. Amoroso, U.
Zannier, eds., Lecture Notes Math. 1819, Springer Verlag, pp. 171–247.
Schmidt, W. M. (2009), The Diophantine equation $\alpha_1^{x_1} \cdots \alpha_n^{x_n} = f(x_1, \ldots, x_n)$, in: Analytic Number Theory. Essays in Honour of Klaus Roth, W. W. L. Chen, W. T. Gowers, H. Halberstam and W. M. Schmidt, eds., pp. 414–420. Cambridge University
Press.
Schneider, T. (1934), Transzendenzuntersuchungen periodischer Funktionen: I Transzendenz von Potenzen; II Transzendenzeigenschaften elliptischer Funktionen,
J. Reine Angew. Math. 172, 65–74.
Sehgal, S. (1978), Topics in Group Rings, Marcel Dekker.
Seidenberg, A. (1974), Constructions in algebra, Trans. Amer. Math. Soc. 197, 273–
313.
Serre, J.-P. (1989), Lectures on the Mordell-Weil theorem, Aspects of Math. E15,
Vieweg.
Shorey, T. N. and R. Tijdeman (1986), Exponential Diophantine Equations, Cambridge
University Press.
Siegel, C. L. (1921), Approximation algebraischer Zahlen, Math. Z. 10, 173–213.
Siegel, C. L. (1926), The integer solutions of the equation $y^2 = ax^n + bx^{n-1} + \cdots + k$,
J. London Math. Soc. 1, 66–68.
Siegel, C. L. (1929), Über einige Anwendungen diophantischer Approximationen, Abh.
Preuss. Akad. Wiss., Phys. Math. Kl., No. 1.
Siegel, C. L. (1969), Abschätzung von Einheiten, Nachr. Göttingen, 71–86.
Silverman, J. H. (1984), The S-unit equation over function fields, Math. Proc. Camb.
Phil. Soc. 95, 3–4.
Silverman, J. H. (1995), Exceptional units and numbers of small Mahler measure,
Experiment. Math. 4, 70–83.
Silverman, J. H. (2007), The arithmetic of dynamical systems, Springer Verlag.
Simmons, H. (1970), The solution of a decision problem for several classes of rings,
Pacific J. Math. 34, 547–557.
Simon, D. (2001), The index of nonmonic polynomials, Indag. Math. (N.S.) 12, 505–517.
Skolem, Th. (1933), Einige Sätze über gewisse Reihenentwicklungen und exponentiale
Beziehungen mit Anwendung auf diophantische Gleichungen, Oslo Vid. akad.
Skrifter 6, 1–61.
Skolem, Th. (1935), Ein Verfahren zur Behandlung gewisser exponentialer Gleichungen,
8. Skand. Mat.-Kongr. Stockholm 163–188.
Smart, N. (1995), The solution of triangularly connected decomposable form equations,
Math. Comp. 64, 819–840.
Smart, N. P. (1997), S-unit equations, binary forms and curves of genus 2, Proc. London
Math. Soc. (3) 75, 271–307.
Smart, N. P. (1998), The Algorithmic Resolution of Diophantine Equations, Cambridge
University Press.
Smart, N. P. (1999), Determining the small solutions to S-unit equations, Math. Comp.
68, 1687–1699.
Sprindžuk, V. G. (1969), Effective estimates in “ternary” exponential diophantine equations (Russian), Dokl. Akad. Nauk BSSR, 13, 777–780.
Sprindžuk, V. G. (1973), Squarefree divisors of polynomials and class numbers of
algebraic number fields (Russian), Acta Arith. 24, 143–149.
Sprindžuk, V. G. (1974), Representation of numbers by the norm forms with two
dominating variables, J. Number Theory, 6, 481–486.
Sprindžuk, V. G. (1976), A hyperelliptic diophantine equation and class numbers
(Russian), Acta Arith. 30, 95–108.
Sprindžuk, V. G. (1982), Classical Diophantine Equations in Two Unknowns (Russian),
Nauka.
Sprindžuk, V. G. (1993), Classical Diophantine Equations, Lecture Notes Math. 1559,
Springer Verlag.
Stewart, C. L. and R. Tijdeman (1986), On the Oesterlé-Masser conjecture, Monatsh.
Math. 102, 251–257.
Stewart, C. L. and K. Yu (1991), On the abc conjecture, Math. Ann. 291, 225–230.
Stewart, C. L. and K. Yu (2001), On the abc conjecture, II, Duke Math. J. 108, 169–
181.
Stothers, W. W. (1981), Polynomial identities and Hauptmodulen, Quart. J. Math.
Oxford Ser. (2) 32, 349–370.
Stroeker, R. J. and N. Tzanakis (1994), Solving elliptic Diophantine equations by
estimating linear forms in elliptic logarithms, Acta Arith. 67, 177–196.
Sunley, J. S. (1973), Class numbers of totally imaginary quadratic extensions of totally
real fields, Trans. Amer. Math. Soc. 175, 209–232.
Surroca, A. (2007), Sur l’effectivité du théorème de Siegel et la conjecture abc,
J. Number Theory 124, 267–290.
Szemerédi, E. (1975), On sets of integers containing no k elements in arithmetic progression, Acta Arith. 27, 299–345.
Taylor, R. and A. Wiles (1995), Ring-theoretic properties of certain Hecke algebras,
Ann. Math. (2) 141, 553–572.
Teske, E. (1998), A space efficient algorithm for group structure computation, Math.
Comp. 67, 1637–1663.
Thue, A. (1909), Über Annäherungswerte algebraischer Zahlen, J. Reine Angew. Math.
135, 284–305.
Thunder, J. L. (2001), Decomposable Form Inequalities, Ann. Math. 153, 767–804.
Thunder, J. L. (2005), Asymptotic estimates for the number of integer solutions to
decomposable form inequalities, Compos. Math. 141, 271–292.
Tichy, R. F. and V. Ziegler (2007), Units generating the ring of integers of complex cubic
fields, Colloq. Math. 109, 71–83.
Tzanakis, N. (2013), Elliptic Diophantine Equations, de Gruyter.
Tzanakis, N. and B. M. M. de Weger (1989), On the practical solution of the Thue
equation, J. Number Theory 31, 99–132.
Vaaler, J. (2014), Heights on groups and small multiplicative dependencies, Trans. Amer.
Math. Soc. 366, 3295–3323.
Vojta, P. (1983), Integral points on varieties, Ph.D. thesis, Harvard University.
Vojta, P. (1987), Diophantine Approximation and Value Distribution Theory, Lecture
Notes in Math. 1239. Springer Verlag.
Vojta, P. (1996), Integral points on subvarieties of semiabelian varieties, I, Invent. Math.
126, 133–181.
Vojta, P. (2000), On the ABC-conjecture and diophantine approximation by rational
points, Amer. J. Math. 122, 843–872. Correction, Amer. J. Math. 123 (2001), 383–
384.
Voloch, J. F. (1985), Diagonal equations over function fields, Bol. Soc. Bras. Mat. 16,
29–39.
Voloch, J. F. (1998), The equation ax + by = 1 in characteristic p, J. Number Theory 73,
195–200.
Voutier, P. (1996), An effective lower bound for the height of algebraic numbers, Acta
Arith. 74, 81–95.
Voutier, P. (2014), Modules with many non-associates and norm form equations with
many families of solutions, J. Number Theory 138, 20–36.
van der Waerden, B. L. (1927), Beweis einer Baudetschen Vermutung, Nieuw Arch.
Wisk. (2) 15, 212–216.
Waldschmidt, M. (1973), Propriétés arithmétiques des valeurs de fonctions
méromorphes algébriquement indépendantes, Acta Arith. 23, 19–88.
Waldschmidt, M. (1974), Nombres Transcendants, Springer Verlag.
Waldschmidt, M. (2000), Diophantine approximation on linear algebraic groups,
Springer Verlag.
Wang, J. T.-Y. (1996), The truncated second main theorem of function fields, J. Number
Theory 58, 139–157.
Wang, J. T.-Y. (1999), A note on Wronskians and the ABC theorem, Manuscripta Math.
98, 255–264.
de Weger, B. (1987), Algorithms for Diophantine Equations, Dissertation, Centrum
voor Wiskunde en Informatica, Amsterdam.
de Weger, B. (1989), Algorithms for Diophantine Equations, CWI Tract 65, Amsterdam.
Wildanger, K. (1997), Über das Lösen von Einheiten- und Indexformgleichungen in
algebraischen Zahlkörpern mit einer Anwendung auf die Bestimmung aller ganzen
Punkte einer Mordellschen Kurve, Dissertation, Technical University, Berlin.
Wildanger, K. (2000), Über das Lösen von Einheiten- und Indexformgleichungen in
algebraischen Zahlkörpern, J. Number Theory 82, 188–224.
Wiles, A. (1995), Modular elliptic curves and Fermat’s Last Theorem, Ann. Math. (2)
141, 443–551.
Wirsing, E. (1971), On approximation of algebraic numbers by algebraic numbers of
bounded degree, in: Proc. Sympos. Pure Math. 20, Amer. Math. Soc., Providence,
pp. 213–247.
Wüstholz, G., ed. (2002), A panorama of number theory or the view from Baker’s
garden, Cambridge University Press.
Yu, K. (2007), p-adic logarithmic forms and group varieties III, Forum Mathematicum
19, 187–280.
Zannier, U. (1993), Some remarks on the S-unit equation in function fields, Acta Arith.
64, 87–98.
Zannier, U. (2003), Some applications of diophantine approximation to diophantine
equations (with special emphasis on the Schmidt subspace theorem), Forum.
Zannier, U. (2004), On the integer solutions of exponential equations in function fields,
Ann. Inst. Fourier (Grenoble) 54, 849–874.
Zannier, U. (2009), Lecture notes on Diophantine analysis, Edizioni della Normale.
Zannier, U. (2012), Some Problems of Unlikely Intersections in Arithmetic and
Geometry, Princeton University Press.
Zhang, S. (2000), Distribution of almost division points, Duke Math. J. 103, 39–46.
Zieve, M. E. (1996), Cycles of polynomial mappings, Ph.D. thesis, University of
California, Berkeley.
Glossary of frequently used notation
General
$|S|$: cardinality of a finite set $S$
$\log^* x$: $\max(1, \log x)$, with $\log^* 0 := 1$
$\log^*_n x$: $\log^*$ iterated $n$ times applied to $x$
$\ll$, $\gg$: Vinogradov symbols; $A(x) \gg B(x)$ or $B(x) \ll A(x)$ means that there is a constant $c > 0$ such that $A(x) \ge cB(x)$ for all $x$ in the specified domain
$f(x) = O(g(x))$ as $x \to \infty$: there are constants $c_1, c_2 > 0$ such that $|f(x)| \le c_1 g(x)$ for all $x \ge c_2$
$f(x) = o(g(x))$ as $x \to \infty$: $\lim_{x \to \infty} f(x)/g(x) = 0$
$\mathbb{Z}_{>0}$, $\mathbb{Z}_{\ge 0}$: positive integers, non-negative integers
$\mathbb{F}_p$: finite field of $p$ elements
$\mathbb{P}^n(K)$: $n$-dimensional projective space over a field $K$
$A$, $A^+$, $A^*$: ring (always commutative with 1), additive group of $A$, group of units of $A$
$A[X_1, \ldots, X_n]$: ring of polynomials in $n$ variables with coefficients in $A$
$\gcd$: greatest common divisor
$GL(n, A)$, $SL(n, A)$: multiplicative group of $n \times n$-matrices with entries in $A$ and determinant in $A^*$, resp. determinant 1
$L/K$: field extension $L/K$
$\mathrm{Tr}_{L/K}(\alpha)$, $N_{L/K}(\alpha)$: trace, norm of $\alpha \in L$ over $K$
$D_{L/K}(\omega_1, \ldots, \omega_n)$: discriminant of a $K$-basis $\{\omega_1, \ldots, \omega_n\}$ of $L$
$D(f)$, $D(F)$: discriminant of a polynomial $f(X)$, binary form $F(X, Y)$
$R(f, g)$, $R(F, G)$: resultant of polynomials $f(X)$, $g(X)$, binary forms $F(X, Y)$, $G(X, Y)$
Downloaded from https:/www.cambridge.org/core. Columbia University Libraries, on 16 Jun 2017 at 04:30:51, subject to the Cambridge
Core terms of use, available at https:/www.cambridge.org/core/terms. https://doi.org/10.1017/CBO9781316160749.014
Number fields
$\mathrm{ord}_p(a)$: exponent of a prime number $p$ in the unique prime factorization of $a \in \mathbb{Q}$, $\mathrm{ord}_p(0) = \infty$
$|a|_p$: $p^{-\mathrm{ord}_p(a)}$, $p$-adic absolute value of $a \in \mathbb{Q}$
$|a|_\infty$: $\max(a, -a)$, ordinary absolute value of $a \in \mathbb{Q}$
$\mathbb{Q}_p$: $p$-adic completion of $\mathbb{Q}$, $\mathbb{Q}_\infty = \mathbb{R}$
$M_{\mathbb{Q}}$: $\{\infty\} \cup \{\text{primes}\}$, set of places of $\mathbb{Q}$
$O_K$, $D_K$, $h_K$, $R_K$: ring of integers, discriminant, class number, regulator of a number field $K$
$\mathfrak{p}$, $\mathfrak{a}$: non-zero prime ideal, fractional ideal of $O_K$
$(\alpha) = \alpha O_K$: fractional ideal generated by $\alpha$
$\mathrm{ord}_{\mathfrak{p}}(\mathfrak{a})$: exponent of $\mathfrak{p}$ in the unique prime ideal factorization of $\mathfrak{a}$
$\mathrm{ord}_{\mathfrak{p}}(\alpha)$: exponent of $\mathfrak{p}$ in the unique prime ideal factorization of $(\alpha)$ for $\alpha \in K$, with $\mathrm{ord}_{\mathfrak{p}}(0) := \infty$
$N_K(\mathfrak{a})$: absolute norm of a fractional ideal $\mathfrak{a}$ of $O_K$ (written as $N(\mathfrak{a})$ if it is clear which is the underlying number field)
$e(\mathfrak{P}|\mathfrak{p})$, $f(\mathfrak{P}|\mathfrak{p})$: ramification index, residue class degree of a prime ideal $\mathfrak{P}$ over a prime ideal $\mathfrak{p}$
$M_K$: set of places of a number field $K$
$M_K^\infty$: set of infinite (archimedean) places of $K$
$M_K^0$: set of finite (non-archimedean) places of $K$, identified with the non-zero prime ideals of $O_K$
$|\cdot|_v$ ($v \in M_K$): normalized absolute values of $K$, satisfying the product formula, with $|\alpha|_v := N_K(\mathfrak{p})^{-\mathrm{ord}_{\mathfrak{p}}(\alpha)}$ if $\alpha \in K$ and $v = \mathfrak{p}$ is a prime ideal of $O_K$
$K_v$: completion of $K$ at $v$
$S$: finite set of places of $K$, containing $M_K^\infty$
$O_S$: $\{\alpha \in K : |\alpha|_v \le 1 \text{ for } v \in M_K \setminus S\}$, ring of $S$-integers, written as $\mathbb{Z}_S$ if $K = \mathbb{Q}$
$O_S^*$: $\{\alpha \in K : |\alpha|_v = 1 \text{ for } v \in M_K \setminus S\}$, group of $S$-units, written as $\mathbb{Z}_S^*$ if $K = \mathbb{Q}$
$N_S(\alpha)$: $\prod_{v \in S} |\alpha|_v$, $S$-norm of $\alpha \in K$
$|\mathbf{x}|_v$ ($v \in M_K$): $\max_i |x_i|_v$, $v$-adic norm of $\mathbf{x} = (x_1, \ldots, x_n) \in K^n$
$H^{\mathrm{hom}}(\mathbf{x})$: $\left(\prod_{v \in M_K} |\mathbf{x}|_v\right)^{1/[K:\mathbb{Q}]}$, absolute homogeneous height of $\mathbf{x} \in K^n$
$H(\mathbf{x})$: $\left(\prod_{v \in M_K} \max(1, |\mathbf{x}|_v)\right)^{1/[K:\mathbb{Q}]}$, absolute height of $\mathbf{x} \in K^n$
$H(\alpha)$: $\left(\prod_{v \in M_K} \max(1, |\alpha|_v)\right)^{1/[K:\mathbb{Q}]}$, absolute height of $\alpha \in K$
$h^{\mathrm{hom}}(\mathbf{x})$, $h(\mathbf{x})$, $h(\alpha)$: $\log H^{\mathrm{hom}}(\mathbf{x})$, $\log H(\mathbf{x})$, $\log H(\alpha)$, absolute logarithmic heights
Function fields
$k$: field of constants (always algebraically closed)
$k((z))$: field of Laurent series in $z$
$g_{K/k}$: genus of function field $K$ with constant field $k$
$M_K$: set of (normalized discrete) valuations of $K$, trivial on $k$
$v(\mathbf{x})$ ($v \in M_K$): $\min_i v(x_i)$, $v$-adic norm of $\mathbf{x} = (x_1, \ldots, x_n) \in K^n$
$H_K^{\mathrm{hom}}(\mathbf{x})$: $-\sum_{v \in M_K} v(\mathbf{x})$, homogeneous height of $\mathbf{x} \in K^n$
$H_K(x)$: $\sum_{v \in M_K} \max(0, -v(x))$, height of $x \in K$
Cambridge University Press
978-1-107-09760-5 - Unit Equations in Diophantine Number Theory
Jan-Hendrik Evertse and Kálmán Győry
Index
A-order, 275
abc-conjecture, 90
  number field version, 92
abc-theorem for function fields, 174
absolute value, 12
  archimedean, 13
  continuation, 14
  equivalence, 13
  extension, 14
  non-archimedean, 13
  trivial, 12
additive unit representation, 287
algebraic coset, 321
algebraic function field, 30
algebraic subgroup, 321
bad reduction, of rational self-map, 296
Baker’s method, 98
Baker’s type inequalities, 97
binary form, 231
canonical number system, 310
characteristic polynomial, 3
class group, 10
class number, 10, 11
CM-field, 121
CNS basis, 310
CNS order, 310
completion, 13
complex place, 15
cycle, 291
cycle, polynomial, 292
decomposable form, 231
decomposable form equation, 232, 263, 272, 286
decomposable form inequality, 278, 282
decomposable form of discriminant type, 275
decomposable form, triangularly connected, 263
decomposable polynomial, 282
decomposable polynomial equations, 282
Dedekind domain, 6
derivative
  of algebraic function, 36
difference graph, 301
differential, 35
  holomorphic, 36
discrete valuation, 7, 13
discriminant
  of basis, 4
  of number field, 10, 11
discriminant equation, 305
discriminant form, 268, 276
discriminant form equation, 233, 263, 268, 272
division group, 324
effective specialization, 198
effectively computable algebraic number, 23
effectively computable fractional ideal, 24
effectively given algebraic number, 23
effectively given fractional ideal, 24
effectively given number field, 23
elliptic equation, 273
equivalence
  of binary forms, 311
equivalent
  of algebraic integers, 306
  of monic polynomials, 306
Euclidean norm, 123
exceptional units, 121
explicitly presented field, 37, 175
exponential-polynomial equations, 326
Extension Formula, 33
family of solutions of decomposable form equation, 248
Fermat’s Last Theorem, 91
field of p-adic numbers, 26
field with absolute value, 13
  complete, 13
  completion, 13
Fincke–Pohst algorithm, 117, 118
finite étale K-algebra, 246
finite place
  of Q, 14
  of number field, 15
fractional ideal, 5
  absolute norm, 9
  extension, 7
  generated by S, 5
  greatest common divisor, 6
  inverse, 6
  lowest common multiple, 6
  product, 6
  relative norm, 8
fundamental system of S-units, 18
Gal(G/K)-proper, 234
Gal(G/K)-stable, 234
Galois symmetric S-unit vector, 251
generalized Fermat equation, 91
genus, 36, 173
GL(2, A)-equivalence, 317
good reduction, of rational self-map, 296
Gram–Schmidt orthogonalization process, 123
group of S-units, 17
height
  S-height, 44, 130
  absolute logarithmic, 19
  absolute multiplicative, 19
  homogeneous of polynomial over function field, 35
  homogeneous of vector over function field, 33
  logarithmic of finite set S, 201
  logarithmic of matrix, 201
  logarithmic of vector, 21
  multiplicative homogeneous of vector, 21
  multiplicative of vector, 21
  of algebraic function, 34
  of polynomial, 22
  twisted, 46
hyperelliptic equation, 273
ideal membership algorithm, 199, 204
index form, 276
index form equation, 268, 272
infinite place
  of Q, 14
  of number field, 15
infinite valuation function field, 33
inner product on, 123
irreducible family of solutions of decomposable form equation, 255
KANT, 119, 123
k-nomial, 298
k-proportional solutions, 180
Lang’s Conjecture, 322, 323
Lang–Bogomolov Conjecture, 325
lattice
  full in real vector space, 68
  in real vector space, 68
lattice, full in real vector space, 10
Laurent series, 31
length, of cycle, 291
linear forms in logarithms, 52
linear recurrence sequence, 326
  companion polynomial, 326
  non-degenerate, 327
  order, 326
  zero-multiplicity, 327
LLL-reduced basis, 104, 110, 123
LLL-reduction algorithm, 103, 124, 125
local parameter, 31
local ring of discrete valuation, 30
MAGMA, 119
Mahler measure, 21
minimal polynomial over Z, 20
MINIMIZE, 117
Minkowski’s Theorem on successive minima, 68
monic minimal polynomial, 3
monogenic number field, 309
monogenic order, 309
monogenic, k times, 309
Mordell’s Conjecture, 322, 323
Mordell’s equation, 273
Mordell–Weil Theorem, 322
Noetherian module, 238
Noetherian ring, 238
non-degenerate solution, 180
non-degenerate solutions, 128
norm
  absolute, of fractional ideal, 9
  of algebraic number, 4
  on real vector space, 68
  relative, of fractional ideal, 8
  relative, of prime ideal, 8
  unit ball, 68
  v-adic of polynomial, 22
  v-adic of vector, 21
norm form, 244
norm form equation, 233, 244, 250, 263, 267
normal closure, 3
orbit, 291
orbit, finite, 291
orbit, finite polynomial, 292
order, in number field, 308
order, monogenic, 309
p-adic exponential, 28
p-adic logarithm, 27
p-adic numbers, 14
pair of representatives, 199
periodic point, 291
place lying above, 16
place lying below, 16
power integral basis, 268, 308
preperiodic point, 291
prime ideal
  of ring of integers, 6
Product Formula, 14, 15
Puiseux expansions, 31
radical, 89
ramification index, 8, 31
ramification index of local field, 26
Ramsey theory, 288
real place, 15
regulator, 11
representation of algebraic function, 37
representation of algebraic function, 175
representative, 199
residue class degree, 8
resultant, 315, 317
resultant equation, 280, 315
ring of p-adic integers, 26
ring of S-integers, 17
Roth’s Theorem, 42, 91
S-integer, 17
S-norm, 17
S-regulator, 18
-symmetric partition, 251
S-units, 17
  S-regulator, 18
  fundamental system, 18
S-unit of function field, 173
self-map, 291
semi-abelian variety, 322
Skolem–Mahler–Lech theorem, 327
specialization, 140, 218
splitting field, 3
Subspace Theorem, 43, 245
  p-adic, 44, 249
  parametric, 45, 46
  quantitative, 45, 252, 279, 329
successive minimum, 68
Sum Formula, 31
superelliptic equation, 273
Thue equation, 232, 250, 258
Thue–Mahler equation, 232, 249, 277
trace, 4
triangle graph, 302
ultrametric inequality, 13
Uniform Boundedness Conjecture, 297
unit equations, 61, 232
  homogeneous, 61
units, 10
  exceptional, 121
  fundamental system, 11
  regulator, 11
  unit group of ring of integers, 10
  unit rank, 11
valuation, 13
  discrete, 7, 13
  value group, 13
valuation on function field, 30, 173
  explicitly given, 38, 175
Vandermonde’s identity, 4
Weak Nullstellensatz, 140
wide family of solutions of decomposable form equation, 246