JOURNAL OF LIGHTWAVE TECHNOLOGY, VOL. 23, NO. X, MONTH 2005

Design and Performance Analysis of Scheduling Algorithms for WDM-PON under SUCCESS-HPON Architecture

Kyeong Soo Kim, Member, IEEE, David Gutierrez, Student Member, IEEE, Fu-Tai An, Member, IEEE, and Leonid G. Kazovsky, Fellow, IEEE, Fellow, OSA

Abstract— We report the results of our design and performance analysis of two new algorithms for efficient and fair scheduling of variable-length frames in a wavelength division multiplexing (WDM)-passive optical network (PON) under the Stanford University aCCESS-Hybrid PON (SUCCESS-HPON) architecture. The WDM-PON under the SUCCESS-HPON architecture has unique features that have direct impacts on the design of scheduling algorithms: First, an optical line terminal (OLT) uses tunable transmitters and receivers that are shared by all the optical network units (ONUs) served by the OLT to reduce the number of expensive dense WDM (DWDM) transceivers. Second, also for cost reduction, ONUs have no local DWDM light sources but use optical modulators to modulate optical continuous wave (CW) bursts provided by the OLT for upstream transmissions. Therefore, the tunable transmitters at the OLT are used for both upstream and downstream transmissions. To provide efficient bidirectional communications between the OLT and the ONUs and guarantee fairness between upstream and downstream traffic, we have designed two scheduling algorithms – batching earliest departure first (BEDF) and sequential scheduling with schedule-time framing (S3F). BEDF is based on the batch scheduling mode, where frames arriving at the OLT during a batch period are stored in virtual output queues (VOQs) and scheduled at the end of the batch period. It improves transmission efficiency by selecting the frame with the earliest departure time from a batch of multiple frames, which optimizes the usage of tunable transmitters in scheduling.
Considering the high complexity of the optimization process in BEDF, we have also designed S3F based on the sequential scheduling mode as in the original sequential scheduling algorithm proposed earlier. In S3F we use VOQs to provide memory space protection among traffic flows and a granting scheme together with schedule-time framing for both upstream and downstream traffic to reduce framing and guard band overhead. Through extensive simulations under various configurations of the tunable transmitters and receivers, we have demonstrated that both BEDF and S3F substantially improve the throughput and delay performances over the original sequential scheduling algorithm, while guaranteeing better fairness between upstream and downstream traffic.

Index Terms— Access, media access control (MAC) protocols, passive optical network (PON), scheduling, wavelength division multiplexing (WDM)

This work was supported in part by the Stanford Networking Research Center and STMicroelectronics. K. S. Kim is with the Advanced System Technology, STMicroelectronics, Stanford, CA 94305, USA (e-mail: [email protected]). D. Gutierrez and L. G. Kazovsky are with the Photonics and Networking Research Laboratory, Stanford University, Stanford, CA 94305, USA (e-mail: {degm,kazovsky}@stanford.edu). F.-T. An is with the Marvell Technology Group Ltd. (e-mail: [email protected]). This paper was presented in part at GLOBECOM 2004, Dallas, TX, November 2004.

I. INTRODUCTION

Efficient and fair scheduling of variable-length messages under the constraints of shared resources is critical for the success of advanced, next-generation wavelength-routed optical networks, where tunable transmitters and receivers are shared by many users in order to reduce the high cost of wavelength division multiplexing (WDM) optical components.
The scheduling problem we study in this paper is for a WDM-passive optical network (PON) under the Stanford University aCCESS-Hybrid PON (SUCCESS-HPON) architecture, which was proposed for next-generation hybrid WDM/time division multiplexing (TDM) optical access networks [1]. The SUCCESS-HPON architecture is based on a topology consisting of a collector ring and several distribution stars connecting a central office (CO) and optical network units (ONUs). By clever use of coarse WDM (CWDM) and dense WDM (DWDM) technologies, it guarantees the coexistence of current-generation TDM-PON and next-generation WDM-PON systems on the same network. The semi-passive configuration of remote nodes (RNs), together with the hybrid topology, also enables supporting both business and residential users on the same access infrastructure by providing protection and restoration capability, a frequently missing feature in traditional PON systems. In designing the SUCCESS-HPON architecture, we mainly focused on providing economical migration paths from the current-generation TDM-PONs to future WDM-based optical access networks. This has been achieved by sharing some high-performance but costly components and resources in the SUCCESS WDM-PON: First, an optical line terminal (OLT) uses tunable transmitters and receivers that are shared by all the ONUs served by the OLT to reduce the number of expensive DWDM transceivers. Second, also for cost reduction, ONUs have no local DWDM light sources but use optical modulators to modulate optical continuous wave (CW) bursts provided by the OLT for upstream transmissions. Therefore, the tunable transmitters at the OLT are used for both upstream and downstream transmissions.
Footnote 1: We have changed the name of the architecture from SUCCESS to SUCCESS-HPON to distinguish it from other architectures under the same research initiative of SUCCESS at PNRL, Stanford.

Footnote 2: In this paper we use the term SUCCESS WDM-PON to denote the WDM-PON under the SUCCESS-HPON architecture.

The sharing of tunable transmitters and receivers at the OLT and the use of tunable transmitters for both upstream and downstream transmissions, however, pose a great challenge in designing scheduling algorithms: A scheduling algorithm for the SUCCESS WDM-PON has to keep track of the status of all shared resources (i.e., tunable transmitters, tunable receivers, and wavelengths assigned to ONUs) and arrange them properly in both time and wavelength domains to avoid any conflicts among them for both upstream and downstream transmissions. While many researchers have studied the issue of scheduling messages in both time and wavelength domains in network architectures based on tunable transmitters and/or receivers (e.g., [2]–[5]), only a few schemes have been proposed to support variable-length message transmissions without segmentation and reassembly processes. In [4], we studied scheduling algorithms for an unslotted carrier sense multiple access with collision avoidance (CSMA/CA) with backoff media access control (MAC) protocol to address the issues of fairness and bandwidth efficiency in multiple-access WDM ring networks. In [5], the authors studied distributed algorithms for scheduling variable-length messages in a single-hop multichannel local lightwave network with a focus on reducing tuning overhead. To the best of our knowledge, however, scheduling algorithms for a network where tunable transmitters are used for both upstream and downstream transmissions, as in the SUCCESS WDM-PON, have not been investigated by other researchers.
In [1] we proposed a sequential scheduling algorithm for the SUCCESS WDM-PON, which emulates virtual global first-in-first-out (FIFO) queueing for all incoming frames. In this algorithm incoming frames are scheduled sequentially in the order of their arrival at the OLT. This original sequential scheduling algorithm is simple to implement, but suffers from poor transmission efficiency and a poor fairness guarantee between upstream and downstream traffic. To address the limitations of the original sequential scheduling algorithm, we propose in this paper two new scheduling algorithms – batching earliest departure first (BEDF) and sequential scheduling with schedule-time framing (S3F). The key idea in the design of BEDF is to provide room for optimization and priority queueing by scheduling over more than one frame: In BEDF, frames arriving at the OLT during a batch period are stored in virtual output queues (VOQs) and scheduled at the end of the batch period, which allows the scheduler to select the best frame according to a given optimal scheduling policy from the batch of multiple frames in the VOQs. We choose the EDF as an optimal scheduling policy to minimize the unused time of the tunable transmitters. The throughput versus scheduling delay tradeoff is a major design issue in BEDF. In S3F, considering the high complexity of the BEDF optimization process, we adopt the sequential scheduling mode as in the original sequential scheduling algorithm, but use VOQs to provide memory space protection among traffic flows as in BEDF and a granting scheme together with schedule-time framing for both upstream and downstream traffic to reduce overhead due to framing and guard bands.

The rest of the paper is organized as follows. In Section II we provide a high-level overview of the SUCCESS-HPON architecture and review the MAC protocol, frame formats, and original sequential scheduling algorithm for the WDM-PON under the SUCCESS-HPON architecture. In Section III we describe the BEDF and S3F scheduling algorithms based on the system model and procedures used in the description of the original sequential scheduling algorithm in Section II. In Section IV, we provide the results of the performance analysis of the designed scheduling algorithms through simulations. Section V summarizes our work in this paper and discusses future directions for further studies.

II. WDM-PON UNDER SUCCESS-HPON ARCHITECTURE

A. Overall Architecture

A high-level overview of the SUCCESS-HPON, including TDM-PONs and WDM-PONs as its subsystems with wavelength allocations, is shown in Fig. 1. A single-fiber collector ring with stars attached to it forms the basic topology. The collector ring strings up RNs, which are the centers of the stars. The ONUs attached to the RN on the west side of the ring talk and listen to the transceivers on the west side of the OLT, and likewise for the ONUs attached to the RN on the east side of the ring. Logically there is a point-to-point connection between each RN and the OLT. No wavelength is reused on the collector ring. When there is a fiber cut, the affected RNs will switch to the transceivers on the other side of the OLT for continuous operations as soon as they sense a signal loss. The RN for a TDM-PON has a pair of CWDM band splitters to add and drop wavelengths for upstream and downstream transmissions, respectively. On the other hand, the RN for a WDM-PON has one CWDM band splitter, adding and dropping a group of DWDM wavelengths within a CWDM grid, and a DWDM MUX/DEMUX device, i.e., an arrayed waveguide grating (AWG), per PON. Each ONU has its own dedicated wavelength for both upstream and downstream transmissions on a DWDM grid to communicate with the OLT.
Since the insertion loss of a typical AWG is roughly 6 dB regardless of the number of ports, AWGs with more than eight ports will likely be employed to enjoy a better power budget compared to passive splitters.

Fig. 2 shows block diagrams of the portion of the OLT and the ONU for the SUCCESS WDM-PON. Tunable components, such as fast tunable lasers and tunable filters, are employed for DWDM channels. Because the average load of the network is usually lower than the peak load [6], we can expect statistical multiplexing gain by sharing tunable components at the OLT, which also reduces the total system cost by minimizing the transceiver count for a given number of ONUs and user demand on bandwidth. Downstream optical signals from the tunable transmitters in DWDM channels enter both ends of the ring through passive splitters and circulators. Upstream optical signals from the ring pass the same devices but in reverse order and are separated from the downstream signals by the circulators. The scheduler controls the operation of both tunable transmitters and tunable receivers based on the scheduling algorithms that will be described in Section III. Note that the tunable transmitters at the OLT are used for both downstream frames and CW optical bursts to be modulated by the ONUs for their upstream frames.

[Fig. 1. Overview of SUCCESS-HPON.]

[Fig. 2. Block diagrams of (a) the portion of the OLT and (b) the ONU for SUCCESS WDM-PON.]

[Fig. 3. Frame formats for SUCCESS WDM-PON MAC protocol: downstream frames carry an 8-bit preamble, 16-bit delimiter, 8-bit flags, and a 16-bit grant field, followed by Ethernet frames or the grant (CW burst); upstream frames carry an 8-bit preamble, 16-bit delimiter, and a 16-bit report field, followed by Ethernet frames.]

With this configuration only half-duplex communications are possible at the physical layer between the OLT and each ONU, using a variation of the time compression multiplexing (TCM) scheme [7]. Compared to a similar architecture with a two-fiber ring, two sets of light sources, and two sets of MUX/DEMUX for full-duplex communications [8], our design significantly lowers deployment cost. As a tradeoff, however, we need a careful design of a scheduling algorithm to provide efficient bidirectional communications at the MAC layer. As discussed before, the ONU has no local light source and uses an optical modulator to modulate optical CW bursts received from the OLT for its upstream transmission. A semiconductor optical amplifier (SOA) can be used as a modulator for this purpose [9]. The ONU MAC protocol block not only controls the switching between upstream and downstream transmissions but also coordinates with the scheduler at the OLT through a polling mechanism. For implementation details, especially at the physical layer of the SUCCESS WDM-PON, readers are referred to [1].

B. MAC Protocol and Frame Formats for SUCCESS WDM-PON

Like APON and EPON systems [10], the SUCCESS WDM-PON OLT polls to check the amount of upstream traffic stored at the ONUs and sends grants – but in the form of optical CW bursts in this case – to allow the ONUs to transmit upstream traffic.
Since there is neither a separate control channel nor a control message embedding scheme using escape sequences as in [11], the MAC protocol has to rely on in-band signaling and uses the frame formats shown in Fig. 3, where the report and grant fields are defined for the polling process. (Footnote 3: In this paper, we assume that Ethernet frames are carried in the payload part of SUCCESS WDM-PON frames. Note, however, that any other protocol frame or packet, e.g., IP packets, can be encapsulated and carried in a SUCCESS WDM-PON frame because it does not depend on any specific layer 2 or 3 protocols, unlike APON or EPON.) Note that the 1-bit 'ID' field in [1] for downstream frames has been extended to 8-bit flags for future extensions: Now the 'Frame Type' field of the flags is used to indicate whether this frame is for normal data traffic or not. Usage of the fields in the 8-bit flags is summarized in Table I.

TABLE I: 8-BIT FLAGS IN DOWNSTREAM SUCCESS WDM-PON FRAME

Bit   Field          Values
0-3   Frame Type     0: Normal Data; 1: Grant; 2-15: Unused
4     Force Report   0: No action required; 1: ONU should report in the corresponding upstream frame
5     Unused         -
6     Unused         -
7     Unused         -

Each ONU reports the amount of traffic waiting in its upstream traffic queue in octets through the report field in an upstream frame when the 'Force Report' field of a received downstream frame is set (Footnote 4: In this paper, we assume this field is always set.), and the OLT uses the grant field to indicate the actual size of each grant (also in octets). Note that, as shown in Fig. 3, the length of the whole CW burst corresponds to that of all upstream Ethernet frames (i.e., the size of the grant) plus the report field and the overhead. We use two control parameters to govern the polling process, which consists of reporting and granting operations, as follows:

• ONU_TIMEOUT: The OLT maintains one timer per ONU and resets it whenever a grant frame is sent downstream to an ONU.
It clears the timer when the corresponding upstream frame with a nonzero report field is received. If the timer expires after the ONU_TIMEOUT period, which means either there was no upstream traffic when the ONU received a grant frame or the report message was lost during the transmission to the OLT, the OLT sends a new grant to poll that ONU again and resets the timer. This parameter keeps the polling process going even in the case of the loss of polling messages and bounds the maximum polling cycle. It also affects the average packet delay of upstream traffic when the system is under light load.

• MAX_GRANT: This parameter limits the maximum size of a grant (i.e., the payload part of the CW burst) for ONU upstream traffic.

C. Original Sequential Scheduling Algorithm

Here we describe how the scheduling of the transmission and/or reception of a SUCCESS WDM-PON frame is done under the original sequential scheduling algorithm proposed in [1]. This will be the basic building block of the new scheduling algorithms in Section III. For this purpose, we consider a SUCCESS WDM-PON system with W ONUs (and therefore W wavelengths), M tunable transmitters, and N tunable receivers. Because the tunable transmitters are used for both upstream and downstream traffic but the tunable receivers are only for upstream traffic, we usually need more transmitters than receivers, i.e., W ≥ M ≥ N. We include in the algorithm description a guard band of G ns between consecutive SUCCESS WDM-PON frames that accounts for the effects of unstable local ONU clock frequencies and the tuning time of the tunable transmitters and receivers at the OLT. We define the following arrays of global status variables used in the algorithm description:

• CAT: Array of Channel Available Times. CAT[i] = t, where i = 1, 2, ..., W, means that the wavelength λi will be available for transmission after time t.

• TAT: Array of Transmitter Available Times.
TAT[i] = t, where i = 1, 2, ..., M, means that the ith tunable transmitter will be available for transmission after time t.

• RAT: Array of Receiver Available Times. RAT[i] = t, where i = 1, 2, ..., N, means that the ith tunable receiver will be available for reception after time t.

• RTT: Array of round-trip times (RTTs) between the OLT and the ONUs. RTT[i] denotes the RTT between the OLT and the ith ONU.

When scheduling each SUCCESS WDM-PON frame, we first select the earliest available transmitter and receiver. Assuming that the ith transmitter and the jth receiver are the earliest available transmitter and receiver, respectively, we can obtain the transmission time t of a SUCCESS WDM-PON frame destined for the kth ONU as follows:

    t = max(RAT[j] + G - RTT[k] - G_OH, TAT[i] + G, CAT[k] + G)
        if the frame is a grant for upstream traffic,
    t = max(TAT[i] + G, CAT[k] + G)
        if the frame is for downstream traffic,                  (1)

where G_OH is the transmission delay for the grant overhead, consisting of the overhead, 8-bit flags, and grant fields of the SUCCESS WDM-PON grant frame at a line rate of R bit/s. If the frame is a grant frame for upstream traffic, the reception of the corresponding upstream frame from the ONU should be scheduled at t + G_OH + RTT[k]. After scheduling the frame transmission and/or reception, the related status variables should be updated as follows:

    TAT[i] = t + l/R,  CAT[k] = t + l/R,                         (2)

and, if the frame is a grant frame for upstream traffic,

    RAT[j] = t + l/R + RTT[k],                                   (3)

where l is the length of the whole frame in bits.

Fig. 4 illustrates the timing relations among the tunable transmitters and receivers and the frames over channels through an example: At t1, a report for upstream traffic from ONU4 arrives at the OLT. First, the scheduler at the OLT checks the transmitter availability and finds that TX3 is available now. Then, it checks the receiver availability and finds that RX1 will be available at t0 + G_OH + RTT1 + l_CW1.
Then, it also checks the channel availability and finds that λ4 is available now. Finally, based on all this information, the scheduler schedules the transmission of a grant frame at t0 + RTT1 + l_CW1 + G - RTT4 through TX3 on λ4 and the reception of the corresponding upstream frame from ONU4 at t0 + G_OH + RTT1 + l_CW1 + G. Pseudocode for the whole procedure is given in Fig. 5.

[Fig. 4. An example of the original sequential scheduling at t1 for a system with W = 4, M = 3, and N = 2.]

begin
    k ← destination(frame); l ← length(frame); t_now ← current time;
    for i = 1 to W do CAT[i] ← max(t_now, CAT[i]);
    for i = 1 to M do TAT[i] ← max(t_now, TAT[i]);
    for i = 1 to N do RAT[i] ← max(t_now, RAT[i]);
    select i s.t. TAT[i] ≤ TAT[m] ∀m = 1, ..., M;
    if the frame is a grant for upstream traffic then
        select j s.t. RAT[j] ≤ RAT[n] ∀n = 1, ..., N;
        t ← max(RAT[j] + G - RTT[k] - G_OH, TAT[i] + G, CAT[k] + G);
        schedule reception at time t + G_OH + RTT[k] with the jth receiver via the wavelength λk;
        RAT[j] ← t + l/R + RTT[k];
    else
        t ← max(TAT[i] + G, CAT[k] + G);
    end
    TAT[i] ← t + l/R;
    CAT[k] ← t + l/R;
    schedule transmission at time t with the ith transmitter via the wavelength λk;
end

Fig. 5: Pseudocode for the original sequential scheduling algorithm.

III. DESIGN OF NEW BATCH AND SEQUENTIAL SCHEDULING ALGORITHMS FOR SUCCESS WDM-PON

In this section we describe two new scheduling algorithms – the BEDF and the S3F – designed to improve the following performance measures over the original sequential scheduling algorithm: 1) the fairness guarantee between upstream and downstream traffic flows for a given ONU and 2) the overall throughput. Here we use a simple but intuitive definition of 'fairness': On the assumption that all received traffic flows are legitimate, the scheduler assigns bandwidth so that the resulting throughput of a traffic flow is in proportion to its incoming rate. By 'traffic flow' we mean the aggregated traffic between the OLT and each ONU in each direction (upstream or downstream); thus, the scheduler at the OLT deals with a total of 2W separate traffic flows.

In the original sequential scheduling algorithm, a downstream Ethernet frame is encapsulated in a SUCCESS WDM-PON frame immediately after its arrival and put into a global FIFO queue that is shared by all upstream and downstream traffic. As the simulation results in [1] show, the lack of protection for memory space among traffic flows leads to poor fairness between upstream and downstream traffic. Also, because there is no room for optimization in scheduling, the maximum achievable throughput is much lower than the total transmission capacity. To address the issue of memory space protection among traffic flows, we base both scheduling algorithms on VOQing, with one VOQ per traffic flow, either upstream or downstream, for an ONU.

A. Batching Earliest Departure First (BEDF) Scheduling

The idea of batch scheduling, where a batch of messages arrived during a certain period forms a task set to which a scheduling algorithm is applied, has already been studied in [12], but in a slightly different context, where the main concern is the reduction of the frequency and complexity of the scheduling algorithm at the cost of deferring consideration of new tasks.
On the other hand, the major concern in our design of BEDF is to provide room for optimization in scheduling by forming a task set consisting of multiple frames through a batching process. Rather than sequentially scheduling each frame in the order of arrival, we form a batch of the arrived frames and search for the frame with an optimal value according to a given scheduling policy, thereby optimizing the scheduling performance. Because transmission efficiency under the constraint of sharing limited resources is one of the major design goals, we select the EDF as the scheduling policy to minimize the time during which transmitters and channels are wasted. Building upon the basic sequential scheduling algorithm description in Section II, we can describe the BEDF scheduling algorithm as follows: At the end of each batch period,

Step 1: Choose the earliest available transmitter and receiver (i.e., those whose TAT and RAT are minimum).

Step 2: Given the earliest available transmitter and receiver, calculate a possible transmission time using Eq. (1) for the first unscheduled frame in each VOQ that is not marked as 'Unschedulable'.

Step 3: Select the frame with the minimum transmission time (i.e., the earliest departure time) and, if the transmission time is within the boundary of the next batch period, schedule its transmission; otherwise, cancel its scheduling and mark the corresponding VOQ as 'Unschedulable'. If the scheduled frame is a grant for upstream traffic, schedule the reception of the corresponding upstream frame from the ONU at G_OH + RTT after its transmission time.

Step 4: Update the status variables using Eq. (2) for the transmitter, the channel, and, if needed, the receiver.

Step 5: Repeat the whole procedure from Steps 1 through 4 until there is no unscheduled frame or all VOQs are marked as 'Unschedulable'.
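To make the batching procedure concrete, the following minimal Python sketch implements Steps 1 through 5 on top of the transmission-time rule of Eq. (1) and the updates of Eqs. (2) and (3). The class and method names (BEDFScheduler, tx_time, schedule_batch) and the simplified frame representation are our own illustration, not part of the paper; for brevity it schedules only VOQ-head frames, expresses all times in seconds, and omits the flooring of the status variables at the current time.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Frame:
    dest: int        # destination ONU index k
    length: int      # frame length l in bits
    is_grant: bool   # grant (upstream) frame vs. downstream data frame

class BEDFScheduler:
    def __init__(self, W, M, N, rtt, rate, guard, grant_oh):
        self.TAT = [0.0] * M   # transmitter available times
        self.RAT = [0.0] * N   # receiver available times
        self.CAT = [0.0] * W   # channel (wavelength) available times
        self.RTT = list(rtt)   # per-ONU round-trip times
        self.R = rate          # line rate in bit/s
        self.G = guard         # guard band
        self.G_OH = grant_oh   # grant-overhead transmission delay

    def tx_time(self, i, j, frame):
        """Possible transmission time per Eq. (1)."""
        k = frame.dest
        if frame.is_grant:
            return max(self.RAT[j] + self.G - self.RTT[k] - self.G_OH,
                       self.TAT[i] + self.G, self.CAT[k] + self.G)
        return max(self.TAT[i] + self.G, self.CAT[k] + self.G)

    def schedule_batch(self, voqs, batch_end):
        """Steps 1-5: schedule VOQ-head frames that fit before batch_end."""
        schedule, unschedulable = [], set()
        while True:
            # Step 1: earliest available transmitter and receiver
            i = min(range(len(self.TAT)), key=self.TAT.__getitem__)
            j = min(range(len(self.RAT)), key=self.RAT.__getitem__)
            best = None
            for q, voq in enumerate(voqs):          # Step 2
                if q in unschedulable or not voq:
                    continue
                t = self.tx_time(i, j, voq[0])
                if best is None or t < best[0]:
                    best = (t, q)
            if best is None:
                break                               # Step 5: nothing left
            t, q = best
            if t > batch_end:                       # Step 3: cancel and mark
                unschedulable.add(q)
                continue
            frame = voqs[q].popleft()
            end = t + frame.length / self.R
            schedule.append((t, frame))
            self.TAT[i] = end                       # Step 4, Eq. (2)
            self.CAT[frame.dest] = end
            if frame.is_grant:
                self.RAT[j] = end + self.RTT[frame.dest]  # Eq. (3)
        return schedule
```

Note how the Step 3 check against batch_end realizes the protection between batches described below: a VOQ whose head frame would spill into the next batch period is excluded from the rest of the current scheduling round.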
Note that, in contrast to the batch scheduling scheme proposed in [12], once the scheduled transmission time exceeds the boundary of the next batch period, we cancel the scheduling of that frame, mark the corresponding VOQ as 'Unschedulable', and exclude the frames in that VOQ from further scheduling during the current batch period. This prevents the frames arriving in the current batch period from consuming the resources available in the next batch period and therefore provides some protection for network resources between batches of frames. The interleaving of the scheduling and transmission phases in BEDF is illustrated through an example in Fig. 6.

[Fig. 6. A scheduling example showing the interleaving of scheduling and transmission phases in BEDF: snapshots of the VOQs at the beginning of the first, second, and third batch periods, with the scheduling results of the first batch, and of the second batch plus remnants from the first, transmitted over the following batch periods.]

B. Sequential Scheduling with Schedule-time Framing (S3F)

The major downside of BEDF, compared to the original sequential scheduling algorithm, is the higher computational complexity due to the optimization process in scheduling to search for the frame with the earliest departure time. For example, in the worst case where all VOQs have frames to schedule, BEDF needs roughly 2W times as many calculations as the original sequential scheduling algorithm to schedule one frame. Here we propose S3F, an improved sequential scheduling algorithm. S3F is based on the sequential scheduling mode, but unlike the original sequential scheduling algorithm, the scheduling is done at the end of each frame transmission (except in the case when a frame arrives at an empty VOQ, where the scheduling is done immediately after its arrival).
It also uses grants for downstream as well as upstream traffic to provide a better fairness guarantee, and schedule-time framing of the downstream Ethernet frames in the VOQs to overcome the low transmission efficiency of the original scheduling algorithm. Due to the memory space protection among traffic flows through VOQing, S3F can provide a better fairness guarantee than the original sequential scheduling algorithm.

For the purpose of granting downstream traffic, we maintain a downstream transmission counter per downstream VOQ. When granting upstream traffic based on a received request from an ONU, we also grant downstream traffic based on the VOQ status at the time of the arrival of the report message. Granting downstream traffic is done by setting this counter to the minimum of the queue length of the VOQ and MAX_GRANT. When scheduling a downstream transmission, the counter value controls the number of Ethernet frames to be scheduled and transmitted in one SUCCESS frame through the procedure shown in Fig. 7. Note that the procedure in Fig. 7 allows at least one Ethernet frame transmission to be scheduled, irrespective of the value of the downstream transmission counter (dsTxCtr[i]). This allows the OLT to transmit downstream traffic for a particular ONU even when there is no granting for the ONU: In the case where there is no request for upstream traffic from that ONU and therefore no granting, it is still possible to transmit downstream frames, but one at a time.

begin
    if VOQ[i] is not empty then
        numBits ← 0; pos ← 0;
        ptr ← &ethFrame(VOQ[i], pos);
        repeat
            dsTxCtr[i] ← dsTxCtr[i] - length(*ptr);
            numBits ← numBits + length(*ptr);
            pos ← pos + 1;
            ptr ← &ethFrame(VOQ[i], pos);
            if ptr is NULL then    // no more frames to schedule
                exit the loop;
            end
        until dsTxCtr[i] < length(*ptr);
        schedule the transmission of a SUCCESS frame whose payload length is numBits;    // using the sequential scheduling algorithm in Fig. 5
        store pos for the scheduled transmission later;
    end
end

Fig. 7: Pseudocode for the scheduling of downstream data frame transmission for a given channel i in S3F. Note that pos denotes the relative position of an Ethernet frame from the head of the VOQ (e.g., pos = 0 means it is the head-of-line (HOL) frame).

The benefit of granting and schedule-time framing of downstream traffic is threefold. First, by encapsulating multiple Ethernet frames in one SUCCESS WDM-PON frame as in upstream transmission, we can reduce the overhead due to the SUCCESS WDM-PON framing and the guard bands. Second, we can also reduce the waste of tunable transmitters and channels and therefore minimize scheduling delays by preventing the spread of smaller frames over multiple transmitters and channels. This is well illustrated in the examples in Fig. 8. Here we assume that, for both framing schemes, there are three Ethernet frames at t1 in the VOQ for channel 1, four Ethernet frames at t2 for channel 2, and one Ethernet frame at t3 for channel 3. ti' denotes the resulting scheduled transmission time of the first frame for channel i, so the corresponding scheduling delay is given by ti' - ti. The effects of the inefficient use of transmitters in the arrival-time framing, where each incoming Ethernet frame is encapsulated in a SUCCESS WDM-PON frame at the moment of its arrival, become clear when we compare its scheduling delay of the frame for channel 3 in (b) to that of the schedule-time framing in (a). Third, by integrated and intelligent granting of both upstream and downstream traffic, we could better control the whole traffic flows to guarantee fairness and support quality of service (QoS) in the future.

IV. PERFORMANCE ANALYSIS

We have developed a simulation model for the performance evaluation of the designed scheduling algorithms using Objective Modular Network Testbed in C++ (OMNeT++) [13].
[Fig. 8. Effects of the framing on the scheduling delays, where ti' - ti is the scheduling delay of the first frame for channel i: (a) for the schedule-time framing and (b) for the arrival-time framing.]

TABLE II: DEFAULT PARAMETER VALUES FOR SIMULATIONS

Parameter      Value       Description
R              10 Gbps     Line rate for upstream and downstream transmissions
W              16          Number of ONUs and wavelengths
G              50 ns       Guard band between adjacent SUCCESS WDM-PON frames (including tuning overhead of tunable transmitters and receivers)
Q              10 Mbytes   Size of OLT VOQs and ONU upstream traffic queue
ONU_TIMEOUT    1 ms        Expiration time of ONU timer

The simulation model is for a WDM-PON system under the SUCCESS-HPON architecture with 16 ONUs. The ONUs are divided into four groups with four ONUs per group and placed 5 km, 10 km, 15 km, and 20 km from the OLT, respectively. The line rate R for both upstream and downstream transmissions is set to 10 Gbps. The ONU_TIMEOUT and the guard band G are set to 1 ms and 50 ns, respectively. As for traffic modeling, we choose a simple Poisson process for IP packet generation because the major purpose of the simulations in this paper is to compare the relative performances of the designed scheduling algorithms rather than to investigate the actual performances under realistic conditions. The packet size distribution is configured to match that of a measurement trace from one of MCI's backbone OC-3 links [14], and the destination distribution for downstream packets at the OLT follows a uniform distribution.
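As a minimal illustration of the traffic model just described, a Poisson arrival generator with uniformly distributed destination ONUs can be sketched in a few lines of Python. The function name and parameters are our own, and packet sizes are deliberately omitted here because they follow the measured trace of [14], which we do not reproduce.

```python
import random

def poisson_arrivals(rate_pps, horizon_s, num_onus, seed=1):
    """Generate (arrival_time, destination_onu) pairs for a Poisson
    process: i.i.d. exponential interarrival gaps with mean 1/rate_pps,
    and a destination drawn uniformly over the ONUs per packet."""
    rng = random.Random(seed)
    t, arrivals = 0.0, []
    while True:
        t += rng.expovariate(rate_pps)   # exponential gap between packets
        if t > horizon_s:
            return arrivals
        arrivals.append((t, rng.randrange(num_onus)))
```

Exponential interarrival times are the standard way to realize a Poisson process in discrete-event simulators such as OMNeT++, where each generated arrival would be scheduled as a self-message.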
The generated IP packets are encapsulated in Ethernet frames before their arrival at the OLT and ONUs. The size of the VOQs at the OLT and of the upstream traffic queue at each ONU is set to 10 Mbytes. The default parameter values for the simulations are summarized in Table II. For each scheduling algorithm, we ran simulations for several different configurations of tunable transmitters and receivers. Due to space limitations, however, we show in this paper the simulation results only for the subsets of configurations summarized in Table III.

TABLE III
TUNABLE TRANSMITTER AND RECEIVER CONFIGURATIONS FOR SIMULATIONS

Number of Transmitters (M)   Number of Receivers (N)   Total Offered Load [Gbps]
4                            2                         1, 2, ..., 40
4                            4                         1, 2, ..., 40
8                            4                         1, 2, ..., 80
8                            8                         1, 2, ..., 80

The total offered load is the sum of the arrival rates for downstream and upstream traffic, where we fix the ratio of the former to the latter at 2, considering that there is more downstream than upstream traffic in access networks. The maximum load for each configuration is set to the total transmitter capacity (= M × R), which slightly overloads the system. We first investigate the effects of important control parameters on the performances of the scheduling algorithms (the batch period for the BEDF and the maximum grant size MAX_GRANT for the S3F) to determine optimal parameter values. Then, based on the optimal values of those control parameters, we compare the performances of the two scheduling algorithms.

A. Effects of Batch Period on BEDF Performance

To investigate the effects of the batch period on the BEDF performance, we ran simulations for three different batch periods and show the throughput and delay results in Figs. 9 and 10, respectively. In the simulations we set MAX_GRANT to 120 percent of the average amount of ONU incoming traffic during a given batch period when the system load is at its maximum. For example, with M = 8, the maximum system load is 80 Gbps and the average ONU incoming rate is 1.67 Gbps.
For a batch period of 10 ms, because the average amount of ONU incoming traffic during this period is 16.7 Mbits, MAX_GRANT is set to 20 Mbits. In this way we can minimize the effects of MAX_GRANT in our investigation of the effects of the batch period on the BEDF performance. Note that MAX_GRANT should be large enough to accommodate the longest possible Ethernet frame; otherwise such a long Ethernet frame in an ONU upstream queue would block all the upstream traffic from that ONU. Likewise, the batch period should be long enough to handle the maximum-size requests, which implies a lower limit on the batch period. From the simulation results we found that the batch period of 1 ms provides the best overall performance of the three periods considered. Contrary to our intuitive expectation, the effect of a longer batch period on the actual transmission performance is not always positive: As the batch period increases, the negative effect of increasing delay comes to dominate the benefit of better optimization in scheduling over a bigger task set consisting of more frames. In general, however, the performance differences are not significant, especially in throughput. We also observed that the number of receivers, given the number of transmitters, has minor impacts on the BEDF performance.

B. Effects of Maximum Grant Size on S3F Performance

In Figs. 11 and 12 we show the throughput and delay performances of S3F with four different values of MAX_GRANT. In Fig. 11 we can see that the total throughput approaches the total transmitter capacity in most of the cases except when the maximum grant size is less than 5 Mbits and the number of receivers is half the number of transmitters.
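The MAX_GRANT sizing example above (M = 8, a 10-ms batch period, a 2:1 downstream:upstream ratio, and 16 ONUs, all from the text and Table II) can be reproduced with a short calculation:

```python
# Reproduces the MAX_GRANT sizing rule in the text: MAX_GRANT is set to
# 120% of the average ONU incoming traffic during one batch period at
# maximum system load.

M, R = 8, 10e9          # transmitters, line rate [bit/s]
NUM_ONUS = 16           # W in Table II
DOWN_UP_RATIO = 2       # downstream:upstream arrival-rate ratio = 2:1

max_load = M * R                                 # 80 Gbps total offered load
upstream_load = max_load / (DOWN_UP_RATIO + 1)   # one third is upstream
onu_rate = upstream_load / NUM_ONUS              # ~1.67 Gbps per ONU

batch_period = 10e-3                             # 10 ms
avg_batch_bits = onu_rate * batch_period         # ~16.7 Mbits per batch
max_grant = 1.2 * avg_batch_bits                 # 120% -> 20 Mbits
```

The same rule yields proportionally smaller MAX_GRANT values for the shorter batch periods (5 ms and 1 ms) used in the other simulation runs.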
This higher transmission efficiency has been achieved because, as we expected, the schedule-time framing combined with the granting scheme efficiently reduces the framing and guard band overhead in downstream transmission by encapsulating multiple Ethernet frames in one SUCCESS WDM-PON frame. Note that the upstream traffic is no longer penalized by the downstream traffic as in [1], even when the system is highly overloaded, which results from the memory space protection by VOQing as well as the change in downstream frame scheduling. From the results, we also observed that when the maximum grant size is less than 5 Mbits and there are fewer receivers than transmitters, the downstream throughput actually decreases after reaching its maximum as the system load further increases. Although the overall throughput and the fairness between upstream and downstream traffic have been greatly improved by the granting with schedule-time framing, one may wonder whether there is any downside in the delay performance due to the large transmission delay of possibly very long SUCCESS WDM-PON frames. From the delay performances shown in Fig. 12, we can verify that this is not the case: In fact, the total average packet delay is maintained well below 5 ms until the total offered load exceeds around 95 percent of the total transmitter capacity, again except for those cases where the maximum grant size is less than 5 Mbits and the number of receivers is half the number of transmitters. Unlike traditional TDM-PONs, because there can exist multiple simultaneous channels between the OLT and the ONUs in the SUCCESS WDM-PON, giant frames occupying one channel hardly block other frames. The effects of the number of simultaneous channels between the OLT and the ONUs, which is directly related to the number of transmitters and receivers, become clear when we compare the results in Fig.
12, where the average packet delay for the larger number of transmitters and receivers in (a) is less than that for the smaller number of transmitters and receivers in (b). Note that the effects of the maximum grant size on the downstream packet delay are opposite to those on the upstream packet delay: As the maximum grant size increases, the downstream packet delay decreases while the upstream packet delay increases. In our simulations, where the downstream traffic dominates the upstream traffic, the opposite effects on the upstream traffic are negligible in the total average packet delay; under different traffic conditions, however, one may have to take these opposite effects into account. Also note the initial dip in upstream packet delay under light system load: Under the granting scheme described in this paper, no further grant frame is generated for an ONU when the ONU reports to the OLT that no frame is waiting in its upstream traffic queue. Therefore the regular polling cycle of granting and reporting pauses until the ONU timer at the OLT expires, at which point the polling cycle is restarted by sending a new polling message to the ONU. This whole procedure related to the ONU timer expiration results in the increase in upstream packet delay. This can be controlled by adjusting the ONU_TIMEOUT value or by a new granting scheme that generates a certain minimum-size grant even when the ONU reports no frame in the upstream traffic queue.
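The polling pause described above can be sketched with a minimal model. This is our own illustration of the mechanism, not the paper's implementation; the grant/report round-trip time RTT is an assumed value, while ONU_TIMEOUT comes from Table II.

```python
# Minimal sketch (illustrative assumption) of the polling pause: after an
# ONU reports an empty upstream queue, the OLT issues no further grant
# until the ONU timer expires, so a packet arriving just after the empty
# report waits roughly ONU_TIMEOUT longer for its grant than a packet
# arriving during an active polling cycle.

ONU_TIMEOUT = 1e-3   # OLT-side ONU timer [s] (Table II)
RTT = 100e-6         # assumed grant/report round-trip per polling cycle [s]

def extra_wait(report_was_empty: bool) -> float:
    """Added upstream delay before the next grant reaches the ONU."""
    if report_was_empty:
        # Polling paused: the OLT waits for the ONU timer to expire, then
        # restarts the cycle by sending a new polling (grant) message.
        return ONU_TIMEOUT + RTT / 2
    # Polling active: the next grant arrives within the normal cycle.
    return RTT / 2

assert extra_wait(True) > extra_wait(False)
```

Under light load, empty reports are frequent, so the ONU_TIMEOUT term dominates the upstream delay; as the load grows, reports are rarely empty and the pause disappears, which is consistent with the initial dip noted in the text.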
In general the effects of the maximum grant size are more salient with a smaller number of receivers for a given number of transmitters, which differs from the results for the original sequential scheduling algorithm, where the number of receivers, given the number of transmitters, has marginal impacts on the overall performances: This is because, when there are fewer receivers, the polling period for the upstream traffic becomes longer, and the maximum grant size is then a more limiting factor than in the case with a shorter polling period. Considering the throughput and delay performances altogether, we can conclude that the maximum grant size of 5 Mbits provides the best overall performance for all the configurations considered in the simulations.

C. Performance Comparison between BEDF and S3F Scheduling Algorithms

We compare the performances of the BEDF and the S3F with the optimal parameter values (the 1-ms batch period for the BEDF and the 5-Mbit MAX_GRANT for the S3F) through simulations and show the results in Figs. 13 and 14. The results for the original sequential scheduling algorithm in [1] are also shown in the figures for the purpose of comparison.
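The receiver-limited polling effect noted in Section IV-B can be sketched with a back-of-envelope model. The linear scaling of the polling period with W/N below is our simplifying assumption, not a formula from the paper; R, W, and the guard band come from Table II.

```python
# Back-of-envelope sketch (simplifying assumption): W ONUs share N tunable
# receivers for upstream reception, so each ONU can be granted roughly once
# per W/N grant transmission times, lengthening the polling period when
# there are fewer receivers.

R = 10e9      # line rate [bit/s] (Table II)
W = 16        # number of ONUs (Table II)
G = 50e-9     # guard band [s] (Table II)

def polling_period(n_receivers, grant_bits):
    """Approximate upstream polling period seen by one ONU."""
    grant_time = grant_bits / R + G    # time to receive one maximum grant
    return (W / n_receivers) * grant_time

# Halving the receivers roughly doubles the polling period, so a given
# MAX_GRANT becomes the binding constraint sooner, as observed in the text.
assert polling_period(4, 5e6) > 1.9 * polling_period(8, 5e6)
```

This is consistent with the simulation observation that the maximum grant size matters most when N is half of M.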
From the results we can see that both the BEDF and S3F greatly improve the scheduling performances over the original sequential scheduling algorithm. The maximum achievable total throughput now approaches the total transmitter capacity, and the upstream traffic is no longer penalized by the downstream traffic as the system load increases. The performance improvement of the new scheduling algorithms also becomes clear in average packet delay. Both the BEDF and S3F maintain the total average packet delay below 2 ms until the system load exceeds 87.5 percent of the total transmitter capacity for all the configurations in consideration.

Fig. 9. Throughput of BEDF scheduling algorithm for total, downstream, and upstream traffic: (a) for M = 8; (b) for M = 4.
The comparison study shows that, of the two scheduling algorithms, the S3F provides better overall performances than the BEDF in terms of both throughput and average packet delay. Considering the lower complexity of the sequential scheduling mode and the potential for better control of QoS and fairness through integrated granting of both upstream and downstream traffic flows, we can conclude that the S3F is a better choice for a scheduling algorithm in practical implementations.

V. CONCLUSIONS

We have presented the results of the design and performance analysis of two new scheduling algorithms, BEDF and S3F, providing efficient and fair bidirectional communications between the OLT and the ONUs in a WDM-PON under the SUCCESS-HPON architecture. The major design goal is to overcome the low transmission efficiency and the poor
fairness guarantee between upstream and downstream traffic flows of the original sequential scheduling algorithm proposed in [1]. To achieve this goal, we adopt the batch scheduling mode in BEDF to optimize scheduling over a batch of frames. In S3F we maintain the sequential scheduling mode as in the original sequential scheduling algorithm but use grants for downstream traffic, in addition to upstream traffic, together with schedule-time framing to reduce the overhead due to framing and guard bands. We base both scheduling algorithms on VOQing to separate and protect memory spaces among traffic flows. Through simulations we found that in BEDF the effects of the batch period on the throughput are not significant; on the other hand, the average packet delay is strongly dependent upon the length of the batch period. The simulation results also showed that the number of receivers, given the number of transmitters, has negligible effects on the performances. Of the three batch periods considered, the 1-ms batch period provides the best overall performances. In S3F we investigated the effects of the maximum grant size (i.e., MAX_GRANT). The simulation results showed that as MAX_GRANT increases, the throughput also increases; as for delay, the downstream delay decreases but the upstream delay increases.

Fig. 10. Average packet delay of BEDF scheduling algorithm for total, downstream, and upstream traffic: (a) for M = 8; (b) for M = 4.
In most of the cases we can see that the total throughput approaches the total transmitter capacity except when the maximum grant size is less than 5 Mbits and the number of receivers is half the number of transmitters.

Fig. 11. Throughput of S3F scheduling algorithm for total, downstream, and upstream traffic: (a) for M = 8; (b) for M = 4.
In the case of unlimited granting, we observed a minor decrease in downstream throughput only when the system is highly overloaded, but the overall performance is as good as in the best case. Considering the throughput and delay performances altogether, we found that the maximum grant size of 5 Mbits is the best of the four values considered. The comparison study with the optimal parameter values (the 1-ms batch period for the BEDF and the 5-Mbit MAX_GRANT for the S3F) showed that S3F provides better overall performances than BEDF in terms of both throughput and average packet delay, although the differences between the two are not significant. Considering the lower complexity of the sequential scheduling mode and the potential for better control of QoS and fairness through integrated granting of both upstream and downstream traffic flows, we can conclude that the S3F is a better choice for practical implementations. As for the fairness issue, we have mainly focused in this paper on the guarantee of fairness between upstream and downstream traffic flows for a given ONU. We demonstrated through simulations that both BEDF and S3F can guarantee good fairness between upstream and downstream traffic with a proper choice of control parameter values except when the system is severely overloaded. In practice, however, fairness guarantee with QoS support among individual user connections with a guaranteed minimum bandwidth and a
weight per connection will be important, which is beyond the scope of the current paper.

Fig. 12. Average packet delay of S3F scheduling algorithm for total, downstream, and upstream traffic: (a) for M = 8; (b) for M = 4.
In this regard, the extension of the results in [15] for the cousin-fair hierarchical scheduling in access networks (mainly in the context of the current-generation TDM-PONs) to the case of the next-generation hybrid WDM/TDM networks with shared tunable transmitters and receivers, including WDM-PONs under the SUCCESS-HPON architecture, could be a solution and an interesting topic for further research.

ACKNOWLEDGMENT

The authors would like to thank the Associate Editor and anonymous reviewers for their constructive comments. The authors would also like to thank Mr. Salvatore Rotolo of STMicroelectronics for his encouragement and support of this work.

REFERENCES

[1] F.-T. An, K. S. Kim, D. Gutierrez, S. Yam, E. Hu, K. Shrikhande, and L. G. Kazovsky, "SUCCESS: A next-generation hybrid WDM/TDM optical access network architecture," J. Lightwave Technol., vol. 22, no. 11, pp. 2557–2569, Nov. 2004.
[2] A. Bianco, M. Guido, and E. Leonardi, "Incremental scheduling algorithms for WDM/TDM networks with arbitrary tuning latencies," IEEE Trans. Commun., vol. 51, no. 3, pp. 464–475, Mar. 2003.
[3] K. Ross, N. Bambos, K. Kumaran, I. Saniee, and I. Widjaja, "Scheduling bursts in time-domain wavelength interleaved networks," IEEE J. Select. Areas Commun., vol. 21, no. 9, pp. 1441–1451, Nov. 2003.
Fig. 13. Comparison of throughput for total, downstream, and upstream traffic: (a) for M = 8; (b) for M = 4.

[4] K. S. Kim and L. G. Kazovsky, "Design and performance evaluation of scheduling algorithms for unslotted CSMA/CA with backoff MAC protocol in multiple-access WDM ring networks," Information Sciences, vol. 149, no. 1–2, pp. 135–148, Jan. 2003, Invited Paper.
[5] F. Jia, B. Mukherjee, and J. Iness, "Scheduling variable-length messages in a single-hop multichannel local lightwave network," IEEE/ACM Trans. Networking, vol. 3, no. 4, pp. 477–488, Aug. 1995.
[6] K. Khalil, K. Luc, and D. Wilson, "LAN traffic analysis and workload characterization," in Proc. Local Computer Networks, Sept. 1990, pp. 112–122.
[7] B. Bosik and S.
Kartalopoulos, "A time compression multiplexing system for a circuit switched digital capability," IEEE Trans. Commun., vol. 30, no. 9, pp. 2046–2052, Sept. 1982.
[8] J.-I. Kani, M. Teshima, K. Akimoto, N. Takachio, H. Suzuki, and K. Iwatsuki, "A WDM-based optical access network for wide-area gigabit access services," IEEE Optical Commun. Mag., vol. 41, no. 2, pp. S43–S48, Feb. 2003.
[9] J. Prat, C. Arellano, V. Polo, and C. Bock, "Optical network unit based on a bidirectional reflective semiconductor optical amplifier for fiber-to-the-home networks," IEEE Photon. Technol. Lett., vol. 17, no. 1, pp. 250–252, Jan. 2005.
[10] K. S. Kim, "On the evolution of PON-based FTTH solutions," Information Sciences, vol. 149, no. 1–2, pp. 21–30, Jan. 2003, Invited Paper.
[11] G. Kramer, B. Mukherjee, and G. Pesavento, "IPACT: A dynamic protocol for an Ethernet PON (EPON)," IEEE Commun. Mag., vol. 40, pp. 74–80, Feb. 2002.
[12] M. Moghaddas and B. Hamidzadeh, "Batching earliest deadline first scheduling," in Proc. Fifth International Workshop on Object-Oriented Real-Time Dependable Systems, Monterey, CA, Nov. 1999, pp. 29–34.
[13] A. Varga, OMNeT++: Discrete Event Simulation System, Technical University of Budapest, June 2003, version 2.3.
[14] WAN Packet Size Distribution [Online]. Available: http://www.nlanr.net/NA/Learn/packetsizes.html
[15] G. Kramer, A. Banerjee, N. K. Singhal, B. Mukherjee, S. Dixit, and
Y. Ye, "Fair queueing with service envelopes (FQSE): A cousin-fair hierarchical scheduler for subscriber access networks," IEEE J. Select. Areas Commun., vol. 22, no. 8, pp. 1497–1513, Oct. 2004.

Fig. 14. Comparison of average packet delay for total, downstream, and upstream traffic: (a) for M = 8; (b) for M = 4.

Kyeong Soo Kim (S'89-M'97) received the B.S., M.E., and Ph.D. degrees, all in electronics engineering, from Seoul National University, Seoul, Korea, in 1989, 1991, and 1995, respectively. From 1996 to 1997, he was engaged in the development of multi-channel ATM switching systems as a Post-Doctoral Researcher at Washington University in St.
Louis, Missouri, where he also taught undergraduate and graduate courses as an Instructor at Washington University and an Adjunct Professor at the University of Missouri, St. Louis. From 1997 to 2000, he was with the PON Systems R&D organization of Lucent Technologies as a Member of Technical Staff and co-developed the first commercial APON-based Fiber-To-The-Home/Business (FTTH/B) system, which won the 1999 Bell Labs President's Silver Award. Since 2001 he has been with STMicroelectronics, working on next-generation access and metro area networks as Researcher-in-Residence at the Stanford Networking Research Center. Dr. Kim has served as a Member of the Technical Program Committee for ICC 2005, STFOC 2005, GLOBECOM 2004, and JCIS 2005, 2003, and 2002. Dr. Kim is a member of IEEE.

David Gutierrez (S'93) received the B.S. degree in electrical engineering from the Universidad de los Andes, Colombia, in 1998, and the M.S. degree in electrical engineering from Stanford University, Stanford, CA, in 2002. He is currently working toward the Ph.D. degree in the Electrical Engineering Department, Stanford University. He has previously worked with such companies as Nortel, Reuters, BASF, and AT&T. At Stanford, he has worked with the Stanford Learning Laboratory and the Stanford University Medical Media and Information Technologies (SUMMIT) Laboratory. He is a Member of the Photonics and Networking Research Laboratory (PNRL), where he is working on access networks. Mr. Gutierrez is also a Fellow of STMicroelectronics, Stanford, CA.

Fu-Tai An (S'98-M'04) received the B.S. degree in electrical engineering from National Taiwan University, Taiwan, in 1996, and the M.S. and Ph.D. degrees in electrical engineering from Stanford University, Stanford, CA, in 1998 and 2004, respectively. During the summer of 1999, he helped to start a company, Excess Bandwidth Company.
He was a Member of Research Staff of the Analog-Front-End Group for DSL applications. During the summers of 2000 and 2001, he was with Sprint ATL to investigate high-performance optical transmission gear. He is now with Marvell Technology Group Ltd. His research interests include photonic networking, optical communication system design, wireless and wired communication system design, and mixed-signal circuit design. Dr. An received the IEEE Lasers & Electro-Optics Society (LEOS) Japanese Chapter Student Award at the IEEE OptoElectronics and Communications Conference (OECC).

Leonid G. Kazovsky (M'80-SM'83-F'91) has been a Professor of electrical engineering at Stanford University, Stanford, CA, since 1990. After joining Stanford, he founded the Photonics and Networking Research Laboratory (PNRL). Prior to joining Stanford, he was with Bellcore (now Telcordia), conducting research on wavelength-division multiplexing and high-speed and coherent optical fiber communication systems. He has authored or coauthored two books, some 150 journal papers, and a similar number of conference papers. Dr. Kazovsky is a Fellow of the Optical Society of America (OSA).