Algorithms II
PhD. Mohammed-Amine KOULALI
UM6P-CS
Mohammed VI Polytechnic University
13 May 2022
Table of Contents
1 Introduction
2 Hash Tables
3 Python Hash Table Implementation
Maps and Dictionaries
In a dictionary, unique keys are mapped to associated values (e.g., Python's dict class).
The keys are assumed to be unique, but the values are not necessarily unique.
Dictionaries are commonly known as associative arrays or maps.
Map as Python Dictionary
    import sys

    filename = sys.argv[1]

    freq = {}
    for piece in open(filename).read().lower().split():
        # only consider alphabetic characters within this piece
        word = ''.join(c for c in piece if c.isalpha())
        if word:                            # require at least one alphabetic character
            freq[word] = 1 + freq.get(word, 0)

    max_word = ''
    max_count = 0
    for (w, c) in freq.items():             # (key, value) tuples represent (word, count)
        if c > max_count:
            max_word = w
            max_count = c
    print('The most frequent word is', max_word)
    print('Its number of occurrences is', max_count)
Map Example
The Map ADT
The five most significant behaviors of a map M are as follows:
M[k]: Return the value v associated with key k in map M, if one exists; otherwise raise a KeyError (i.e., __getitem__).
M[k] = v: Associate value v with key k in map M, replacing the existing value if the map already contains an item with key equal to k (i.e., __setitem__).
del M[k]: Remove from map M the item with key equal to k; if M has no such item, then raise a KeyError (i.e., __delitem__).
len(M): Return the number of items in map M (i.e., __len__).
iter(M): The default iteration for a map generates a sequence of keys in the map (i.e., __iter__).
For additional convenience, map M should also support the following behaviors:
k in M: Return True if the map contains an item with key k (i.e., __contains__).
M.get(k, d=None): Return M[k] if key k exists in the map; otherwise return default value d.
M.setdefault(k, d): If key k exists in the map, simply return M[k]; if key k does not exist, set M[k] = d and return that value.
M.pop(k, d=None): Remove the item associated with key k from the map and return its associated value v. If key k is not in the map, return default value d (or raise a KeyError if no default is given).
M.popitem(): Remove an arbitrary key-value pair from the map, and return a (k, v) tuple representing the removed pair. If the map is empty, raise a KeyError.
M.clear(): Remove all key-value pairs from the map.
M.keys(): Return a set-like view of all keys of M.
M.values(): Return a view of all values of M.
M.items(): Return a set-like view of (k, v) tuples for all entries of M.
M.update(M2): Assign M[k] = v for every (k, v) pair in map M2.
M == M2: Return True if maps M and M2 have identical key-value associations.
M != M2: Return True if maps M and M2 do not have identical key-value associations.
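As a quick illustration, the behaviors listed above can be exercised on Python's built-in dict, which supports all of them. The keys and values below are made up for the example.

```python
# Exercise the map ADT behaviors on Python's built-in dict.
M = {}
M['K'] = 2                       # __setitem__: add a new item
M['B'] = 4
M['K'] = 9                       # __setitem__: replace the existing value

value = M['B']                   # __getitem__
missing = M.get('F', 5)          # key absent, so the default is returned
created = M.setdefault('U', 4)   # key absent, so it is inserted with value 4

popped = M.pop('K')              # remove 'K' and return its value
del M['B']                       # __delitem__

try:
    M['B']                       # 'B' was deleted above
    lookup_failed = False
except KeyError:
    lookup_failed = True

remaining_keys = list(M)         # default iteration yields the keys
print(value, missing, created, popped, lookup_failed, remaining_keys, len(M))
```

Accessing a deleted key raises a KeyError, exactly as the ADT requires for M[k].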
Example
MutableMapping
The collections.abc module provides abstract base classes for read-only mappings (Mapping) and mutable mappings (MutableMapping).
MutableMapping provides a framework to assist in creating a user-defined map class.
It provides concrete implementations for all behaviors other than the five core ones (__getitem__, __setitem__, __delitem__, __len__, and __iter__), built on top of those abstract methods.
    def __contains__(self, k):
        try:
            self[k]              # access via __getitem__
            return True
        except KeyError:
            return False         # attempt failed
MapBase
    from collections.abc import MutableMapping

    class MapBase(MutableMapping):
        """Our own abstract base class that includes a nonpublic _Item class."""

        #------------------------------- nested _Item class -------------------------------
        class _Item:
            """Lightweight composite to store key-value pairs as map items."""
            __slots__ = '_key', '_value'

            def __init__(self, k, v):
                self._key = k
                self._value = v

            def __eq__(self, other):
                return self._key == other._key     # compare items based on their keys

            def __ne__(self, other):
                return not (self == other)         # opposite of __eq__

            def __lt__(self, other):
                return self._key < other._key      # compare items based on their keys
Simple Unsorted Map Implementation
    from map_base import MapBase

    class UnsortedTableMap(MapBase):
        """Map implementation using an unordered list."""

        def __init__(self):
            """Create an empty map."""
            self._table = []                        # list of _Item's

        def __getitem__(self, k):
            """Return value associated with key k (raise KeyError if not found)."""
            for item in self._table:
                if k == item._key:
                    return item._value
            raise KeyError('Key Error: ' + repr(k))

        def __setitem__(self, k, v):
            """Assign value v to key k, overwriting existing value if present."""
            for item in self._table:
                if k == item._key:                  # Found a match:
                    item._value = v                 # reassign value
                    return                          # and quit
            # did not find match for key
            self._table.append(self._Item(k, v))

        def __delitem__(self, k):
            """Remove item associated with key k (raise KeyError if not found)."""
            for j in range(len(self._table)):
                if k == self._table[j]._key:        # Found a match:
                    self._table.pop(j)              # remove item
                    return                          # and quit
            raise KeyError('Key Error: ' + repr(k))

        def __len__(self):
            """Return number of items in the map."""
            return len(self._table)

        def __iter__(self):
            """Generate iteration of the map's keys."""
            for item in self._table:
                yield item._key                     # yield the KEY

    un = UnsortedTableMap()
    un.setdefault(10, "ab")
    un.setdefault(100, "cd")
    print(un.values())
    print(un.keys())
    print(un.popitem())
    print(un.values())
    print(un.keys())
Simple Unsorted Map manipulation
Hash Tables
The hash table is one of the most practical data structures for implementing a map, and it is the one used by Python's own implementation of the dict class.
Why Use a Hash Table?
Consider a map with n items that uses integer keys in a range from 0
to N − 1 for some N.
In this case, we can represent the map using a lookup table of length
N.
We store the value associated with key k at index k of the table.
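A minimal sketch of such a lookup table follows. The capacity N = 11 and the helper names put and get are assumptions made for illustration, and the sketch uses None to mark an empty slot, so it cannot store None as a value.

```python
# Direct-address lookup table for integer keys in the range 0..N-1.
N = 11                       # hypothetical key range
table = [None] * N           # one slot per possible key

def put(k, v):
    """Store value v at index k of the table."""
    table[k] = v

def get(k):
    """Return the value stored for key k, or raise KeyError if the slot is empty."""
    if table[k] is None:
        raise KeyError(k)
    return table[k]

put(3, 'F')
put(7, 'Q')
print(get(3), get(7))
```

Both operations run in constant time, but the table always occupies N slots regardless of how many items it holds.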
There are two challenges in extending this framework to the more general setting of a map.
First, we may not wish to devote an array of length N if it is the case that N >> n.
Second, we do not in general require that a map's keys be integers.
Solution: Using a hash function
The novel concept for a hash table is the use of a hash function to map
general keys to corresponding indices in a table.
Ideally, keys will be well distributed in the range from 0 to N - 1 by a
hash function, but in practice there may be two or more distinct keys
that get mapped to the same index.
As a result, we will conceptualize our table as a bucket array.
Each bucket may manage a collection of items that are sent to a specific
index by the hash function.
Figure: a bucket array of capacity 11 with items (1, D), (25, C), (3, F), (14, Z), (6, A), (… and (7, Q), using a simple hash function.
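A minimal sketch of such a bucket array, assuming the simple hash function is h(k) = k mod 11 (the caption does not state it) and using only the items that are legible in the caption:

```python
N = 11
buckets = [[] for _ in range(N)]       # each bucket holds a list of (key, value) pairs

def h(k):
    return k % N                       # assumed hash function, not given in the caption

items = [(1, 'D'), (25, 'C'), (3, 'F'), (14, 'Z'), (6, 'A'), (7, 'Q')]
for k, v in items:
    buckets[h(k)].append((k, v))       # colliding keys share a bucket

for j, bucket in enumerate(buckets):
    if bucket:
        print(j, bucket)
```

Under this assumed hash function, keys 25, 3, and 14 all land in bucket 3, which is why each bucket must be able to hold more than one item.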
Hash functions
A hash function h maps each key k to an integer in the range [0, N − 1], where N is the capacity of the bucket array for a hash table.
We use h(k) as an index into our bucket array instead of the key k (which may not be appropriate for direct use as an index).
We store the item (k, v) in the bucket A[h(k)].
If there are two or more keys with the same hash value, then two different items will be mapped to the same bucket (i.e., a collision).
Hash functions goodness
A hash function is "good" if it maps the keys in our map so as to sufficiently minimize collisions.
For practical reasons, we also would like a hash function to be fast and easy to compute.
Hash functions decomposition
It is common to view the evaluation of a hash function h(k) as consisting of two portions:
A hash code that maps a key k to an integer.
A compression function that maps the hash code to an integer within a range of indices, [0, N − 1], for a bucket array.
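The two portions can be sketched as separate functions. This is only an illustration: it borrows Python's built-in hash for the hash code and uses a plain modulo as the compression step, and the names hash_code, compress, and h are hypothetical.

```python
def hash_code(key):
    """Stage 1: map a key to an arbitrary integer (may be negative or very large)."""
    return hash(key)

def compress(code, n):
    """Stage 2: map an integer hash code into the range [0, n - 1]."""
    return code % n          # for n > 0, Python's % always yields a value in [0, n - 1]

def h(key, n):
    """Full hash function: hash code followed by compression."""
    return compress(hash_code(key), n)

index_small = h(12345, 11)
index_large = h(12345, 23)   # same hash code, different table size
print(index_small, index_large)
```

Only compress needs to know the table size, so the same hash code can be reused when the table is resized.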
Decomposition Pros
The hash code portion of that computation is independent of a specific
hash table size.
Only the compression function depends upon the table size.
Indeed, the underlying bucket array for a hash table may be dynamically
resized, depending on the number of items currently stored in the map.
Hash Codes
The first action of a hash function is to take an arbitrary key k in our map and compute an integer hash code for it; this integer need not lie in the range [0, N − 1] and may even be negative.
The set of hash codes assigned to the keys should avoid collisions as much as possible.
If the hash codes cause collisions, then there is no hope for the compression function to avoid them.
Possible Hash Codes
There are different possibilities for creating hash codes:
Treating the bit representation as an integer.
Polynomial hash codes.
Cyclic-shift hash codes.
Treating the Bit Representation as an Integer
For any data type x, we can simply take as a hash code for x an integer interpretation of its bits.
When a key is longer than the desired hash code, we can combine its parts; for example, the high-order and low-order 32-bit portions of a 64-bit key can be combined to form a 32-bit hash code.
A simple technique is to add the two components as 32-bit numbers (ignoring overflow), or to take their exclusive-or.
These approaches can be extended to any object x whose representation can be viewed as a tuple $(x_0, x_1, \ldots, x_{n-1})$ of 32-bit integers, using $\sum_{i=0}^{n-1} x_i$ or $x_0 \oplus x_1 \oplus \cdots \oplus x_{n-1}$.
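The folding techniques described above can be sketched as follows. The function names are hypothetical, and Python integers are unbounded, so the 32-bit overflow is simulated with a mask.

```python
MASK32 = (1 << 32) - 1

def fold_add(key64):
    """Add the high and low 32-bit halves of a 64-bit key, ignoring overflow."""
    high = (key64 >> 32) & MASK32
    low = key64 & MASK32
    return (high + low) & MASK32

def fold_xor(key64):
    """Combine the high and low 32-bit halves of a 64-bit key with exclusive-or."""
    high = (key64 >> 32) & MASK32
    low = key64 & MASK32
    return high ^ low

def xor_code(components):
    """Exclusive-or of a sequence of 32-bit components x_0, ..., x_{n-1}."""
    code = 0
    for x in components:
        code ^= x & MASK32
    return code

key = 0x0000000100000002
print(fold_add(key), fold_xor(key), xor_code([1, 2, 4]))
```

Both combinations are symmetric in their components, so reordering the components does not change the hash code.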
Polynomial Hash Codes
The summation and exclusive-or hash codes are not good choices for character strings or other variable-length objects that can be viewed as tuples where the order is significant.
Consider, for example, a 16-bit hash code for a character string s that sums the Unicode values of the characters in s.
This hash code unfortunately produces lots of unwanted collisions for common groups of strings (e.g., 'temp01' and 'temp10').
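The collision is easy to reproduce. The function name sum_hash_code is hypothetical; it simply sums the code points and keeps the low 16 bits.

```python
def sum_hash_code(s):
    """Sum the Unicode values of the characters of s, truncated to 16 bits."""
    return sum(ord(c) for c in s) & 0xFFFF

a = sum_hash_code('temp01')
b = sum_hash_code('temp10')
print(a, b, a == b)
```

Any two strings made of the same characters in a different order collide under this hash code.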
A better hash code takes the positions of the components into account: choose a nonzero constant a ≠ 1 and use as a hash code the value
$x_0 a^{n-1} + x_1 a^{n-2} + \cdots + x_{n-2} a + x_{n-1}$.
By Horner's rule, this polynomial can be computed as
$x_{n-1} + a(x_{n-2} + a(x_{n-3} + \cdots + a(x_2 + a(x_1 + a x_0)) \cdots))$.
A polynomial hash code uses multiplication by different powers of a as a way to spread out the influence of each component across the resulting hash code.
We should choose the constant a so that it has some nonzero low-order bits, to preserve some of the information content even when the computation overflows.
The choice of a has to be made empirically.
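A minimal sketch of a polynomial hash code evaluated with Horner's rule. The constant a = 33 and the 32-bit width are assumptions chosen for illustration, not values given in the slides.

```python
def polynomial_hash_code(s, a=33, bits=32):
    """Evaluate x_0*a^(n-1) + ... + x_(n-1) by Horner's rule, truncated to `bits` bits."""
    mask = (1 << bits) - 1
    h = 0
    for c in s:
        h = (h * a + ord(c)) & mask    # one Horner step, discarding overflow
    return h

print(polynomial_hash_code('temp01'), polynomial_hash_code('temp10'))
```

Unlike the summation hash code, this one distinguishes 'temp01' from 'temp10', because each character is weighted by a different power of a.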
Cyclic-Shift Hash Codes
A variant of the polynomial hash code replaces multiplication by a with a cyclic shift of a partial sum by a certain number of bits.
For example, a 5-bit cyclic shift of the 32-bit value 00111101100101101010100010101000 is 10110010110101010001010100000111.
    def hash_code(s):
        mask = (1 << 32) - 1                  # limit to 32-bit integers
        h = 0
        for character in s:
            h = (h << 5 & mask) | (h >> 27)   # 5-bit cyclic shift of the running sum
            h += ord(character)               # add in the value of the next character
        return h

    print(hash_code("abcd"))

    > python hash_code.py
    3282116
Hash Codes in Python
Python has a built-in function with signature hash(x) that returns an integer value as the hash code for object x.
Among the built-in data types, only the immutable ones are hashable.
The int, float, str, tuple, and frozenset classes produce robust hash codes.
Hash codes for character strings are based on a technique similar to polynomial hash codes with exclusive-or computations.
Hash codes for tuples are computed as a combination of the hash codes of the individual elements of the tuple.
If hash(x) is called for an instance x of a mutable built-in type, such as a list, a TypeError is raised.
Instances of user-defined classes are hashable by default, with a hash code derived from object identity; a class that defines __eq__ without also defining __hash__ becomes unhashable, and hash raises a TypeError for its instances.
A class can compute its own hash codes by implementing the special method named __hash__.
The returned hash code should reflect the immutable attributes of an instance.
For example, a Color class that maintains three numeric red, green, and blue components might implement the method as:
    def __hash__(self):
        return hash((self.red, self.green, self.blue))   # hash combined tuple
An important rule to obey is that if a class defines equivalence through __eq__, then any implementation of __hash__ must be consistent, in that if x == y, then hash(x) == hash(y).
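A fuller sketch of such a class is shown below. The constructor and the __eq__ method are assumptions added to make the example runnable; only the __hash__ body comes from the slide.

```python
class Color:
    """Hypothetical RGB color whose components are treated as immutable."""

    def __init__(self, red, green, blue):
        self.red = red
        self.green = green
        self.blue = blue

    def __eq__(self, other):
        if not isinstance(other, Color):
            return NotImplemented
        return (self.red, self.green, self.blue) == (other.red, other.green, other.blue)

    def __hash__(self):
        return hash((self.red, self.green, self.blue))   # hash combined tuple

x = Color(255, 128, 0)
y = Color(255, 128, 0)
palette = {x: 'orange'}
print(x == y, hash(x) == hash(y), palette[y])
```

Because equal colors produce equal hash codes, y can be used to look up the entry that was stored under x.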
Compression Functions
The hash code for a key k will typically not be suitable for immediate use with a bucket array (e.g., it may be negative or may exceed the capacity N).
Thus, once we have determined an integer hash code for a key object k, there is still the issue of mapping that integer into the range [0, N − 1].
This computation, known as a compression function, is the second action performed as part of an overall hash function.
A good compression function is one that minimizes the number of collisions for a given set of distinct hash codes.
The Division Method
A simple compression function is the division method, which maps an integer i to i mod N, where N, the size of the bucket array, is a fixed positive integer.
Additionally, if we take N to be a prime number, then this compression function helps spread out the distribution of hashed values.
Indeed, if N is not prime, then there is greater risk that patterns in the distribution of hash codes will be repeated in the distribution of hash values, thereby causing collisions.
For example, if we insert keys with hash codes {200, 205, 210, 215, 220, . . . , 600} into a bucket array of size 100, then each hash code will collide with at least three others. But if we use a bucket array of size 101, then there will be no collisions.
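The example above can be checked directly by counting how many hash codes fall into each bucket. The helper name bucket_sizes is hypothetical.

```python
from collections import Counter

hash_codes = range(200, 601, 5)              # 200, 205, ..., 600

def bucket_sizes(n):
    """Count how many hash codes land in each bucket of an array of size n."""
    return Counter(code % n for code in hash_codes)

sizes_100 = bucket_sizes(100)
sizes_101 = bucket_sizes(101)
print(len(sizes_100), min(sizes_100.values()), max(sizes_100.values()))
print(len(sizes_101), max(sizes_101.values()))
```

With N = 100 the 81 hash codes occupy only 20 buckets, each holding at least four codes, while with N = 101 every code gets its own bucket.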
The MAD Method
A more sophisticated compression function, which helps eliminate repeated patterns in a set of integer keys, is the Multiply-Add-and-Divide (or "MAD") method.
This method maps an integer i to [(ai + b) mod p] mod N, where N is the size of the bucket array, p is a prime number larger than N, and a and b are integers chosen at random from the interval [0, p − 1], with a > 0.
This compression function is chosen in order to eliminate repeated patterns in the set of hash codes and get us closer to having a "good" hash function, that is, one such that the probability that any two different keys collide is 1/N.
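A minimal sketch of MAD compression as a closure. The factory name make_mad_compressor is hypothetical, and the default prime is simply the value used by the HashMapBase class later in these slides.

```python
from random import randrange

def make_mad_compressor(n, p=109345121):
    """Return a MAD compression function for a bucket array of size n.

    p must be a prime larger than n.
    """
    a = 1 + randrange(p - 1)       # scale chosen at random from [1, p - 1]
    b = randrange(p)               # shift chosen at random from [0, p - 1]

    def compress(i):
        return ((a * i + b) % p) % n

    return compress

compress = make_mad_compressor(100)
indices = [compress(code) for code in range(200, 601, 5)]
print(indices[:10])
```

The parameters a and b are drawn once when the compressor is created, so a given hash code always maps to the same index for the lifetime of the table.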
Collision-Handling Schemes
A hash table takes a bucket array, A, and a hash function, h, and uses them to implement a map by storing each item (k, v) in the "bucket" A[h(k)].
This simple idea is challenged, however, when we have two distinct keys, k1 and k2, such that h(k1) = h(k2).
The existence of such collisions prevents us from simply inserting a new item (k, v) directly into the bucket A[h(k)].
It also complicates our procedure for performing insertion, search, and deletion operations.
Separate Chaining
Each bucket A[j] stores its own secondary container, holding the items (k, v) such that h(k) = j.
A natural choice for the secondary container is a small map instance implemented using a list, such as the UnsortedTableMap described earlier.
This collision resolution rule is known as separate chaining.
Open Addressing
Separate chaining requires the use of an auxiliary data structure.
If space is at a premium, then we can use the alternative approach of always storing each item directly in a table slot.
Open addressing employs no auxiliary structures, but it requires a bit more complexity to deal with collisions.
Linear Probing and Its Variants
A simple method for collision handling with open addressing is linear probing.
With this approach, if we try to insert an item (k, v) into a bucket A[j] that is already occupied, where j = h(k), then we next try A[(j + 1) mod N].
If A[(j + 1) mod N] is also occupied, then we try A[(j + 2) mod N], and so on, until we find an empty bucket that can accept the new item.
Once this bucket is located, we simply insert the item there.
This strategy also changes how we search for an existing key, which is the first step of every __getitem__, __setitem__, or __delitem__ operation.
In particular, to attempt to locate an item with key equal to k, we must examine consecutive slots, starting from A[h(k)], until we either find an item with that key or we find an empty bucket.
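The insertion and search procedures can be sketched on a plain list. The hash function h(k) = k mod 11, the keys, and the helper names are assumptions for illustration; deletion needs extra care with linear probing and is omitted here.

```python
N = 11
table = [None] * N                 # each slot holds a (key, value) pair or None

def h(k):
    return k % N                   # assumed hash function, for illustration only

def probe_insert(k, v):
    """Insert (k, v) with linear probing and return the slot that was used."""
    j = h(k)
    for _ in range(N):             # visit each slot at most once
        if table[j] is None or table[j][0] == k:
            table[j] = (k, v)
            return j
        j = (j + 1) % N            # try the next slot, wrapping around
    raise RuntimeError('table is full')

def probe_search(k):
    """Return the value stored for key k, or raise KeyError."""
    j = h(k)
    for _ in range(N):
        if table[j] is None:       # an empty slot ends the search
            raise KeyError(k)
        if table[j][0] == k:
            return table[j][1]
        j = (j + 1) % N
    raise KeyError(k)

slots = [probe_insert(k, v) for k, v in [(3, 'F'), (14, 'Z'), (25, 'C'), (4, 'X')]]
print(slots, probe_search(25))
```

Keys 3, 14, and 25 all hash to index 3, so they occupy consecutive slots, and key 4 is pushed past them even though its own home slot is index 4.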
HashMapBase
    from random import randrange          # used to pick MAD parameters

    from map_base import MapBase

    class HashMapBase(MapBase):
        """Abstract base class for map using hash-table with MAD compression.

        Keys must be hashable and non-None.
        """

        def __init__(self, cap=11, p=109345121):
            """Create an empty hash-table map.

            cap     initial table size (default 11)
            p       positive prime used for MAD (default 109345121)
            """
            self._table = cap * [None]
            self._n = 0                                   # number of entries in the map
            self._prime = p                               # prime for MAD compression
            self._scale = 1 + randrange(p - 1)            # scale from 1 to p-1 for MAD
            self._shift = randrange(p)                    # shift from 0 to p-1 for MAD

        def _hash_function(self, k):
            return (hash(k) * self._scale + self._shift) % self._prime % len(self._table)

        def __len__(self):
            return self._n

        def __getitem__(self, k):
            j = self._hash_function(k)
            return self._bucket_getitem(j, k)             # may raise KeyError

        def __setitem__(self, k, v):
            j = self._hash_function(k)
            self._bucket_setitem(j, k, v)                 # subroutine maintains self._n
            if self._n > len(self._table) // 2:           # keep load factor <= 0.5
                self._resize(2 * len(self._table) - 1)    # number 2^x - 1 is often prime

        def __delitem__(self, k):
            j = self._hash_function(k)
            self._bucket_delitem(j, k)                    # may raise KeyError
            self._n -= 1

        def _resize(self, c):
            """Resize bucket array to capacity c and rehash all items."""
            old = list(self.items())       # use iteration to record existing items
            self._table = c * [None]       # then reset table to desired capacity
            self._n = 0                    # n recomputed during subsequent adds
            for (k, v) in old:
                self[k] = v                # reinsert old key-value pair
ChainHashMap
    from hash_map_base import HashMapBase
    from unsorted_table_map import UnsortedTableMap

    class ChainHashMap(HashMapBase):
        """Hash map implemented with separate chaining for collision resolution."""

        def _bucket_getitem(self, j, k):
            bucket = self._table[j]
            if bucket is None:
                raise KeyError('Key Error: ' + repr(k))   # no match found
            return bucket[k]                              # may raise KeyError

        def _bucket_setitem(self, j, k, v):
            if self._table[j] is None:
                self._table[j] = UnsortedTableMap()       # bucket is new to the table
            oldsize = len(self._table[j])
            self._table[j][k] = v
            if len(self._table[j]) > oldsize:             # key was new to the table
                self._n += 1                              # increase overall map size

        def _bucket_delitem(self, j, k):
            bucket = self._table[j]
            if bucket is None:
                raise KeyError('Key Error: ' + repr(k))   # no match found
            del bucket[k]                                 # may raise KeyError

        def __iter__(self):
            for bucket in self._table:
                if bucket is not None:                    # a nonempty slot
                    for key in bucket:
                        yield key
ProbeHashMap
    from hash_map_base import HashMapBase

    class ProbeHashMap(HashMapBase):
        """Hash map implemented with linear probing for collision resolution."""
        _AVAIL = object()       # sentinel marks locations of previous deletions

        def _is_available(self, j):
            """Return True if index j is available in table."""
            return self._table[j] is None or self._table[j] is ProbeHashMap._AVAIL

        def _find_slot(self, j, k):
            """Search for key k in bucket at index j.

            Return (success, index) tuple, described as follows:
            If match was found, success is True and index denotes its location.
            If no match found, success is False and index denotes first available slot.
            """
            firstAvail = None
            while True:
                if self._is_available(j):
                    if firstAvail is None:
                        firstAvail = j                      # mark this as first avail
                    if self._table[j] is None:
                        return (False, firstAvail)          # search has failed
                elif k == self._table[j]._key:
                    return (True, j)                        # found a match
                j = (j + 1) % len(self._table)              # keep looking (cyclically)

        def _bucket_getitem(self, j, k):
            found, s = self._find_slot(j, k)
            if not found:
                raise KeyError('Key Error: ' + repr(k))     # no match found
            return self._table[s]._value

        def _bucket_setitem(self, j, k, v):
            found, s = self._find_slot(j, k)
            if not found:
                self._table[s] = self._Item(k, v)           # insert new item
                self._n += 1                                # size has increased
            else:
                self._table[s]._value = v                   # overwrite existing

        def _bucket_delitem(self, j, k):
            found, s = self._find_slot(j, k)
            if not found:
                raise KeyError('Key Error: ' + repr(k))     # no match found
            self._table[s] = ProbeHashMap._AVAIL            # mark as vacated

        def __iter__(self):
            for j in range(len(self._table)):               # scan entire table
                if not self._is_available(j):
                    yield self._table[j]._key