1010 INTRODUCTION*

1010 A. Scope and Application of Methods

The procedures described in Standard Methods for the Examination of Water and Wastewater are intended for use in analyzing a wide range of waters, including surface water, ground water, saline water, domestic and industrial water supplies, cooling or circulating water, boiler water, boiler feed water, and treated and untreated municipal and industrial wastewaters. In recognition of the unity of the water, wastewater, and watershed management fields, the analytical methods are categorized based on constituent, not type of water.

An effort has been made to present methods that apply generally. When alternative methods are necessary for samples of different composition, the basis for selecting the most appropriate method is presented as clearly as possible. In specific instances (e.g., samples with extreme concentrations or otherwise unusual compositions or characteristics), analysts may have to modify a method for it to be suitable. If so, they should plainly state the nature of the modification when reporting the results.

Certain procedures are intended for use with sludges and sediments. Here again, the effort has been made to present methods with the widest possible application. However, these methods may require modification or be inappropriate for chemical sludges or slurries, or other samples with highly unusual composition.

Most of the methods included here have been endorsed by regulators. Regulators may not accept procedures that were modified without formal approval.

Methods for analyzing bulk water-treatment chemicals are not included. American Water Works Association committees prepare and issue standards for water treatment chemicals.

Laboratories that desire to produce analytical results of known quality (i.e., results are demonstrated to be accurate within a specified degree of uncertainty) should use established quality control (QC) procedures consistently.
Part 1000 provides a detailed overview of QC procedures used in the individual standard methods as prescribed throughout Standard Methods. Other sections of Part 1000 address laboratory safety, sampling procedures, and method development and validation. Material presented in Part 1000 is not necessarily intended to be prescriptive nor to replace or supersede specific QC requirements given in individual sections of this book. Parts 2000 through 9000 contain sections describing QC practices specific to the methods in the respective Parts; these practices are considered to be integral to the methods. Most individual methods will contain explicit instructions to be followed for that method (either in general or for certain regulatory applications). Similarly, the overview of topics covered in Part 1000 is not intended to replace or be the sole basis for technical education and training of analysts. Rather, the discussions are intended as aids to augment and facilitate reliable use of the test procedures herein. Each Section in Part 1000 contains references that can be reviewed to gain more depth or details for topics of interest.

* Reviewed by Standard Methods Committee, 2011.

1010 B. Statistics

1. Normal Distribution

If a measurement is repeated many times under essentially identical conditions, the results of each measurement (x) will be distributed randomly about a mean value (arithmetic average) because of uncontrollable or experimental uncertainty. If an infinite number of such measurements were accumulated, then the individual values would be distributed in a curve similar to those shown in Figure 1010:1. Figure 1010:1A illustrates the Gaussian (normal) distribution, which is described precisely by the mean (μ) and the standard deviation (σ). The mean (average) is simply the sum of all values (xi) divided by the number of values (n).
For the entire population:

μ = (Σxi)/n

Because no measurements are repeated infinitely, it is only possible to make an estimate of the mean (x̄) using the same summation procedure but with n equal to a finite number of repeated measurements (10, 20, 30, etc.):

x̄ = (Σxi)/n

The standard deviation of the entire population measured is as follows:

σ = [Σ(xi − μ)²/n]^(1/2)

The empirical estimate of the sample standard deviation (s) is as follows:

s = [Σ(xi − x̄)²/(n − 1)]^(1/2)

[Figure 1010:1. Three types of frequency distribution curves—normal Gaussian (A), positively skewed (B), and negatively skewed (C)—and their measures of central tendency: mean, median, and mode. Courtesy: L. Malcolm Baker.]

The standard deviation fixes the width (spread) of the normal distribution and consists of a fixed fraction of the measurements that produce the curve. For example, 68.27% of the measurements lie within μ ± 1σ, 95.45% within μ ± 2σ, and 99.73% within μ ± 3σ. (It is sufficiently accurate to state that 95% of the values are within ±2σ and 99% within ±3σ.) When values are assigned to the ±σ multiples, they are called confidence limits, and the range between them is called the confidence interval. For example, 10 ± 4 indicates that the confidence limits are 6 and 14, and the confidence interval ranges from 6 to 14.

Another useful statistic is the standard error of the mean—the standard deviation divided by the square root of the number of values (σ/√n). This is an estimate of sampling accuracy; it implies that the mean of another sample from the same population would lie within some multiple of the standard error of the first mean. As with σ, 68.27% of the measurements lie within ±1, 95.45% within ±2, and 99.73% within ±3 standard errors.

In practice, a relatively small number of average values is available, so the confidence intervals about the mean are expressed as:

x̄ ± ts/√n

where t has the following values for 95% confidence intervals:

  n        t
  2    12.71
  3     4.30
  4     3.18
  5     2.78
 10     2.26
  ∞     1.96

Using t compensates for the tendency of a small number of values to underestimate uncertainty.
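As a brief illustrative sketch (not part of the original method text), the estimated mean, sample standard deviation, and 95% confidence interval described above can be computed as follows. The replicate values are hypothetical, and t = 2.78 is the tabulated value for n = 5:

```python
import math

# Hypothetical replicate measurements (mg/L); illustrative only.
x = [9.2, 10.1, 9.8, 10.4, 9.9]
n = len(x)

# Estimated mean: x-bar = (sum of xi) / n
mean = sum(x) / n

# Sample standard deviation: s = [sum of (xi - x-bar)^2 / (n - 1)]^(1/2)
s = math.sqrt(sum((xi - mean) ** 2 for xi in x) / (n - 1))

# 95% confidence interval about the mean: x-bar +/- t*s/sqrt(n),
# with t = 2.78 for n = 5.
t = 2.78
half_width = t * s / math.sqrt(n)
print(f"mean = {mean:.2f}, s = {s:.3f}, "
      f"95% CI = {mean - half_width:.2f} to {mean + half_width:.2f}")
```

For larger data sets (n > 15), substituting t = 2 gives nearly the same interval, as noted in the text.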
For n > 15, it is common to use t = 2 to estimate the 95% confidence interval.

Still another statistic is the relative standard deviation (σ/μ) with its estimate (s/x̄), also known as the coefficient of variation (CV), which commonly is expressed as a percentage. This statistic normalizes and sometimes facilitates direct comparisons among analyses involving a wide range of concentrations. For example, if analyses at low concentrations yield a result of 10 ± 1.5 mg/L and at high concentrations yield a result of 100 ± 8 mg/L, the standard deviations do not appear comparable. However, the percent relative standard deviations are 100(1.5/10) = 15% and 100(8/100) = 8%, indicating that the variability is not as great as it first appears.

The mean, median, and mode for each curve in Figure 1010:1 were calculated as follows: 1) Mean is the value at the 50th percentile level, or arithmetic average; 2) Mode is the value that appears most frequently; and 3) Median1 is estimated as follows:

Median = 1/3 (2 × Mean + Mode)

2. Log-Normal Distribution

In many cases, the results obtained from analyzing environmental samples will not be normally distributed [i.e., a graph of the data distribution will be obviously skewed (see Figure 1010:1B and C)], so the mode, median, and mean will be distinctly different. To obtain a nearly normal distribution, convert the measured variable results to logarithms and then calculate x̄ and s. The antilogarithms of x̄ and s are the estimates of geometric mean (x̄g) and geometric standard deviation (sg). The geometric mean is defined as:

x̄g = [Π(xi)]^(1/n) = antilog {(1/n)[Σ log(xi)]}

3. Least Squares Curve Fitting

Calibration curve data can be fitted to a straight line or quadratic curve by the least squares method, which is used to determine the constants of the curve that the data points best fit.
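As an illustrative sketch (hypothetical calibration data, not from the method text), a linear least squares fit can be computed directly from sums over the data points:

```python
import math

# Hypothetical calibration data: concentration (x) vs. instrument
# response (y); illustrative only.
x = [0.0, 1.0, 2.0, 4.0, 8.0]
y = [0.02, 0.11, 0.21, 0.39, 0.82]
n = len(x)

sx, sy = sum(x), sum(y)
sxy = sum(xi * yi for xi, yi in zip(x, y))
sxx = sum(xi ** 2 for xi in x)
syy = sum(yi ** 2 for yi in y)

# Slope (m) and y intercept (b) of y = mx + b by least squares.
m = (sx * sy / n - sxy) / (sx ** 2 / n - sxx)
b = (sy - m * sx) / n

# Correlation coefficient (degree of fit); r = 1 is a perfect fit.
r = math.sqrt(m * (sx * sy / n - sxy) / (sy ** 2 / n - syy))
print(f"m = {m:.4f}, b = {b:.4f}, r = {r:.4f}")
```

In practice, as the text notes, such calculations usually are performed by software supplied with the instrument; the sketch above only shows what that software computes.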
To do this, choose the equation that best fits the data points and assume that x is the independent variable and y is the dependent variable (i.e., use x to predict the value of y). The sum of the squares of the differences between each actual data point and its predicted value is minimized. For a linear least squares fit of

y = mx + b

the slope (m) and the y intercept (b)1–3 are computed as follows:

m = (ΣxΣy/n − Σxy) / [(Σx)²/n − Σx²]

b = (Σy − mΣx)/n

The correlation coefficient1–3 (degree of fit) is:

r = [m(ΣxΣy/n − Σxy) / ((Σy)²/n − Σy²)]^(1/2)

The best fit is when r = 1. There is no fit when r = 0.

For a quadratic least squares fit of

y = a2x² + a1x + a0

the constants (a0, a1, and a2)1–3 must be calculated. Typically, these calculations are performed using software provided by instrument manufacturers or independent software vendors. For a more detailed description of the algebraic manipulations, see the cited references. In this case, the correlation coefficient1 is:

r = {1 − [Σy² − a0Σy − a1Σxy − a2Σx²y] / [Σy² − (Σy)²/n]}^(1/2)

4. Rejecting Data

In a series of measurements, one or more results may differ greatly from the others. Theoretically, no result should be arbitrarily rejected because it may indicate either a faulty technique (casting doubt on all results) or a true variant in the distribution. In practice, it is permissible to reject the result of any analysis in which a known error occurred. In environmental studies, extremely high and low concentrations of contaminants may indicate either problematic or uncontaminated areas, so they should not be rejected arbitrarily.

An objective test for outliers has been described.4 If a set of data is ordered from low to high (xL, x2, . . ., xH) and the mean and standard deviation are calculated, then suspected high or low outliers can be tested via the following procedure. First, calculate the statistic T using the discordancy test for outliers:

T = (xH − x̄)/s for a high value, or

T = (x̄ − xL)/s for a low value.

Second, compare T with the value in Table 1010:I for either a 5% or 1% level of significance for the number of measurements (n). If T is larger than that value, then xH or xL is an outlier.

TABLE 1010:I. CRITICAL VALUES FOR 5% AND 1% TESTS OF DISCORDANCY FOR A SINGLE OUTLIER IN A NORMAL SAMPLE

Number of            Critical Value
Measurements n       5%      1%
     3               1.15    1.15
     4               1.46    1.49
     5               1.67    1.75
     6               1.82    1.94
     7               1.94    2.10
     8               2.03    2.22
     9               2.11    2.32
    10               2.18    2.41
    12               2.29    2.55
    14               2.37    2.66
    15               2.41    2.71
    16               2.44    2.75
    18               2.50    2.82
    20               2.56    2.88
    30               2.74    3.10
    40               2.87    3.24
    50               2.96    3.34
    60               3.03    3.41
   100               3.21    3.60
   120               3.27    3.66

SOURCE: BARNETT, V. & T. LEWIS. 1995. Outliers in Statistical Data, 3rd ed. John Wiley & Sons, New York, N.Y.

Further information on statistical techniques is available elsewhere.5–7

5. References

1. SPIEGEL, M.R. & L.J. STEPHENS. 1998. Schaum's Outline—Theory and Problems of Statistics. McGraw-Hill, New York, N.Y.
2. LAFARA, R.L. 1973. Computer Methods for Science and Engineering. Hayden Book Co., Rochelle Park, N.J.
3. TEXAS INSTRUMENTS, INC. 1975. Texas Instruments Programmable Calculator Program Manual ST1. Statistics Library, Dallas, Texas.
4. BARNETT, V. & T. LEWIS. 1995. Outliers in Statistical Data, 3rd ed. John Wiley & Sons, New York, N.Y.
5. NATRELLA, M.G. 1963. Experimental Statistics, Handbook 91. National Bur. Standards, Washington, D.C.
6. SNEDECOR, G.W. & W.G. COCHRAN. 1980. Statistical Methods. Iowa State University Press, Ames.
7. VERMA, S.P. & A. QUIROZ-RUIZ. 2006. Critical values for 22 discordancy test variants for outliers in normal samples up to sizes 100, and applications in science and engineering. Revista Mexicana de Ciencias Geologicas 23(3):302.

1010 C. Glossary

This glossary defines concepts, not regulatory terms. It is not intended to be all-inclusive.

Accuracy—estimate of how close a measured value is to the true value; includes expressions for bias and precision.

Analyte—the element, compound, or component being analyzed.
Bias—consistent deviation of measured values from the true value, caused by systematic errors in a procedure.

Calibration check standard—standard used to determine an instrument's accuracy between recalibrations.

Confidence coefficient—the probability (%) that a measurement will lie within the confidence interval (between the confidence limits).

Confidence interval—set of possible values within which the true value will lie with a specified level of probability.

Confidence limit—one of the boundary values defining the confidence interval.

Detection levels—various levels in use are:

  Instrument detection level (IDL)—the constituent concentration that produces a signal greater than five times the instrument's signal-to-noise ratio. The IDL is similar to the critical level and criterion of detection, which is 1.645 times the s of blank analyses (where s is the estimate of standard deviation).

  Lower level of detection (LLD) [also called detection level and level of detection (LOD)]—the constituent concentration in reagent water that produces a signal 2(1.645)s above the mean of blank analyses. This establishes both Type I and Type II errors at 5%.

  Method detection level (MDL)—the constituent concentration that, when processed through the entire method, produces a signal that has a 99% probability of being different from the blank. For seven replicates of the sample, the mean must be 3.14s above the blank result (where s is the standard deviation of the seven replicates). Compute the MDL from replicate measurements of samples spiked with analyte at concentrations of one to five times the estimated MDL. The MDL will be larger than the LLD because typically seven or fewer replicates are used. Additionally, the MDL will vary with matrix.

  Reporting level (RL)—the lowest quantified level within an analytical method's operational range deemed reliable enough, and therefore appropriate, for reporting by the laboratory.
RLs may be established by regulatory mandate or client specifications, or chosen based on a preferred level of acceptable reliability. Examples of RLs typically used (besides the MDL) include:

  Level of quantitation (LOQ)/minimum quantifiable level (MQL)—the analyte concentration that produces a signal sufficiently stronger than the blank's that it can be detected with a specified level of reliability during routine operations. Typically, it is the concentration that produces a signal 10s above the reagent water blank signal, and it should have a defined precision and bias at that level.

  Minimum reporting level (MRL)—the minimum concentration that can be reported as a quantified value for a target analyte in a sample. This defined concentration is no lower than the concentration of the lowest calibration standard for that analyte and can be used only if acceptable QC criteria for this standard are met.

Duplicate—1) the smallest number of replicates (two), or 2) duplicate samples (i.e., two samples taken at the same time from one location) (field duplicate), or a replicate of a laboratory-analyzed sample.

Fortification—adding a known quantity of analyte to a sample or blank to increase the analyte concentration, usually to compare the result to that for the unfortified sample and estimate percent recovery or matrix effects, thereby assessing the test's accuracy.

Internal standard—a pure compound added to a sample extract just before instrumental analysis to permit correction for inefficiencies.

Laboratory control standard—a standard, usually certified by an outside agency, that is used to measure the bias in a procedure. For certain constituents and matrices, use National Institute of Standards and Technology (NIST) or other national or international traceable sources (Standard Reference Materials), when available.

Mean—the arithmetic average (the sum of measurements divided by the number of items being summed) of a data set.
Median—the middle value (odd count) or mean of the two middle values (even count) of a data set.

Mode—the most frequent value in a data set.

Percentile—a value between 1 and 100 that indicates what percentage of the data set is below the expressed value.

Precision (usually expressed as standard deviation)—a measure of the degree of agreement among replicate analyses of a sample.

Quality assessment—procedure for determining the quality of laboratory measurements via data from internal and external quality control measures.

Quality assurance—a definitive plan for laboratory operations that specifies the measures used to produce data with known precision and bias.

Quality control—set of measures used during an analytical method to ensure that the process is within specified control parameters.

Random error—the deviation in any step of an analytical procedure that can be treated by standard statistical techniques. Random error is a major component of measurement error and uncertainty.

Range—the difference between the largest and smallest values in a data set.

Replicate—repeated operation during an analytical procedure. Two or more analyses for the same constituent in an extract of one sample constitute replicate extract analyses.

Spike—see Fortification.

Surrogate standard—a pure compound added to a sample in the laboratory just before processing so a method's overall efficiency can be determined.

Type I error (also called alpha error)—the probability of determining that a constituent is present when it actually is absent.

Type II error (also called beta error)—the probability of not detecting a constituent that actually is present.

1010 D. Dilution/Concentration Operations
1. Adjusting Solution Volume

Analysts frequently must dilute or concentrate the amount of analyte in a standard or sample aliquot to within a range suitable for the analytical method so analysis can be performed with specified accuracy. The following equations enable analysts to compute the concentration of a diluted or concentrated aliquot based on the original aliquot concentration and an appropriate factor or fractional constant. (A factor in this context is the ratio of final adjusted volume to original volume.) They also can compute the concentration of the original aliquot based on the adjusted aliquot concentration.

Concentration of diluted aliquot = original aliquot concentration × dilution fraction

Concentration of original aliquot = diluted aliquot concentration × dilution factor

Concentration of concentrated aliquot = original aliquot concentration × concentration factor

Concentration of original aliquot = concentrated aliquot concentration × concentration fraction

where:

Dilution fraction = original volume/adjusted volume,
Dilution factor = adjusted volume/original volume,
Concentration factor = original volume/adjusted volume, and
Concentration fraction = adjusted volume/original volume.

2. Types of Dilutions

Several types of dilutions are used in Standard Methods procedures. Two of the most common volumetric techniques critical to analytical chemistry results are:

a. Volumetric addition [a/(a + b)]. This method typically is used to dilute microbiological samples and prepare reagents from concentrated reagents. It assumes that volumes a and b are additive (i.e., when a is combined with b in one container, the total volume will equal a + b, which is not always the case). Most aqueous-solution volumes are additive, but alcoholic solutions or concentrated acid may be only partially volumetrically additive, so be aware of potential problems when combining nonaqueous solutions with aqueous diluents.

b. Volumetric dilution to a measured volume (a/c). This method is used to dilute an aliquot to a given volume via a pipet and volumetric flask. It is the most accurate means of dilution, but when fortifying sample matrices, some error can be introduced if a regular Class A volumetric flask is used. The error will be proportional to the volumes of both spiking solution and flask. For the most accurate work, measure the unfortified sample aliquot in a 100-mL Cassia Class A volumetric flask to the 100-mL mark (0.0 on the flask neck*), and then pipet in the volume of fortifying solution. Mix the solution and note the graduated volume on the neck of the flask. The fortified solution's true volume equals 100 mL plus the graduated volume noted on the neck. The true total volume is necessary when computing the dilution factor for the percent recovery of fortified analyte (LFM) in Sections 1020B.12e and 4020B.3a to obtain the most accurate analytical estimate of recovery.

Dilution factors for multiple volumetric dilutions are calculated as the product of the individual dilutions. Generally, serial dilution is preferred when making dilutions of more than two or three orders of magnitude. Avoid trying to pipet quantities of less than 1.0 mL into large volumes (e.g., <1.0 mL into 100 or 1000 mL) to avoid large relative error propagation.

Some biological test methods (e.g., BOD or toxicity testing) may include dilution techniques that do not strictly conform to the preceding descriptions. For example, such techniques may use continuous-flow dilutors and dilutions prepared directly in test equipment, where volumes are not necessarily prepared via Class A volumetric equipment. Follow the method-specific dilution directions.

3. Bibliography

NIEMELA, S.I. 2003. Uncertainty of Quantitative Determinations Derived by Cultivation of Microorganisms, Publication J4/2003. MIKES, Helsinki, Finland.

* Pyrex, or equivalent.
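The volume-adjustment relations in 1010 D.1 can be sketched in code. This is a minimal illustration under assumed, hypothetical volumes and concentrations; the function names are not from the text:

```python
def dilution_fraction(original_vol, adjusted_vol):
    # Dilution fraction = original volume / adjusted volume
    return original_vol / adjusted_vol

def dilution_factor(original_vol, adjusted_vol):
    # Dilution factor = adjusted volume / original volume
    return adjusted_vol / original_vol

# Diluting a hypothetical 10-mL aliquot of a 50-mg/L solution to 100 mL:
c_original = 50.0  # mg/L
c_diluted = c_original * dilution_fraction(10.0, 100.0)  # 5.0 mg/L

# Recovering the original concentration from the diluted aliquot:
c_back = c_diluted * dilution_factor(10.0, 100.0)  # 50.0 mg/L

# Dilution factors for multiple volumetric dilutions multiply:
# 10 mL -> 100 mL, then 1 mL of that -> 100 mL gives 10 x 100 = 1000.
overall = dilution_factor(10.0, 100.0) * dilution_factor(1.0, 100.0)
print(c_diluted, c_back, overall)
```

Note that the serial dilution shown spans three orders of magnitude, the situation for which the text recommends serial rather than single-step dilution.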