Exercise Sheet - ENSTA ParisTech

IN101 - TD07
Exercise Sheet
General instructions
• If you run into problems, ask for help! But remember to consult the lecture slides and the bibliographic references first. The index available online will help you link the concepts covered to the course materials.
• Use the text editor of your choice for the exercises. If in doubt, prefer gedit.
• If you have not already done so, create a folder IN101/.
• In this IN101/ folder, create a new subfolder for this TD session, named TD07/.
• For questions of the type "Write, in natural language, an algorithm that...", open a new text file (extension ".txt") rather than a Python file (extension ".py"). Example: helloworld.txt
• Document and comment your code precisely: describe the overall goal of the algorithm, explain the key points of its implementation, detail the normal execution conditions of the script (i.e. the ".py" file), and state the edge cases and errors you anticipate.
Everyone
1 Hashing
A hash table is a collection of items which are stored in such a way as to make it easy to find
them later. Each position of the hash table, often called a slot, can hold an item and is named
by an integer value starting at 0. For example, we will have a slot named 0, a slot named 1, a
slot named 2, and so on. Initially, the hash table contains no items so every slot is empty.
2 HashTable creation
We can implement a hash table by using a list with each element initialized to the special
Python value None. A second list can be used to store the actual data.
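As a minimal sketch of this layout, the two parallel lists could look like the following. The names slots and data, the table size 11, and the remainder hash used to pick a position are assumptions chosen for illustration; they are not taken from the starter code.

```python
# Illustrative sketch only: names, table size and hash function are assumptions.
size = 11

slots = [None] * size   # slots[i] holds the key stored at position i, or None if empty
data = [None] * size    # data[i] holds the value associated with slots[i]

# Store the pair (54, "cat") at the position given by a simple remainder hash.
key, value = 54, "cat"
position = key % size   # 54 % 11 == 10
slots[position] = key
data[position] = value

print(slots)
print(data)
```

Keeping keys and values in two lists of the same length means a single index identifies both the key and its data.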
Aims: Implement a class for hashtables. Implement functions with pre-defined specifications.
As in the lecture (CM), implement a HashTable class in the HashTable.py module, with the
member functions listed below (see note 1). For Q6, Q7, and Q8, we recommend that you draw the
different steps of the function (with links between nodes, as in the lecture) on a piece of paper
before implementing it in Python.
You can find the starter code at
http://perso.ensta-paristech.fr/~paun/ENSTA_IN101/hash_table_td.py.
Note 1: defining the special method __len__ in the class allows you to call len(H) on a HashTable H.
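The effect of defining __len__ can be seen on a small stand-alone class. The class Box below is hypothetical and exists only for this illustration; it is not part of the starter code.

```python
class Box:
    """Hypothetical container, used only to illustrate __len__."""

    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)

    def __len__(self):
        # Python calls this method whenever len(box) is evaluated.
        return len(self.items)


box = Box()
box.add("a")
box.add("b")
print(len(box))  # prints 2
```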
Make sure that all the tests we provided pass without error.
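For the collision resolution asked for in Q8, one common choice is linear probing: when the target slot is taken, try the next one, wrapping around at the end of the table. The sketch below is only one possible approach, written as stand-alone functions over two parallel lists rather than as methods of the HashTable class. It assumes integer keys and a remainder hash (key % size); the starter code and the lecture may use different conventions.

```python
def rehash(oldhash, size):
    """Linear probing: try the next slot, wrapping around at the end."""
    return (oldhash + 1) % size


def put(slots, data, key, value):
    """Insert or update (key, value); return False if the table is full."""
    size = len(slots)
    position = key % size          # assumed hash function: remainder
    for _ in range(size):          # visit each slot at most once
        if slots[position] is None or slots[position] == key:
            slots[position] = key
            data[position] = value
            return True
        position = rehash(position, size)
    return False


def get(slots, data, key):
    """Return the value stored for key, or None if key is absent."""
    size = len(slots)
    position = key % size
    for _ in range(size):
        if slots[position] is None:    # empty slot: key was never inserted
            return None
        if slots[position] == key:
            return data[position]
        position = rehash(position, size)
    return None


slots, data = [None] * 5, [None] * 5
put(slots, data, 7, "seven")    # 7 % 5 == 2
put(slots, data, 12, "twelve")  # 12 % 5 == 2: collision, moves on to slot 3
print(slots)
print(get(slots, data, 12))
```

Bounding both loops by the table size guarantees termination even when every slot is occupied. This sketch does not handle deletion, which would need extra care with linear probing.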
Q1   __init__(size=default_value)   initializes an empty HashTable
Q2   __str__()                      returns a string representation of the HashTable
Q3   __len__()                      returns the number of elements in the HashTable
Q4   size()                         returns the capacity of the HashTable
Q5   is_full()                      returns True if the HashTable is full, False otherwise
Q6   put(key,data)                  adds data to the HashTable
Q7   get(key)                       gets the data stored in the HashTable at the index obtained from key
Q8   rehash(oldhash,size)           implements a simple collision resolution technique
3 Word Count
Aims: Use the HashTable to solve a problem
In this exercise we will use the hashtable in order to store the word count for every word in
a given file.
Q9 A file2words(filename) function is provided that reads data from a file and converts it
into a list. How can we use it? Test it on the file at
http://perso.ensta-paristech.fr/~paun/ENSTA_IN101/lorem_ipsum.txt.
Q10 Write a function hash_text(filename) that uses the HashTable structure to store the
number of occurrences of each word in the input file.
Q11 Compare the speed of the previous implementation with the Python default one. You
may use the timeit module.
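One way to set up such a comparison is sketched below. It assumes that "the Python default one" refers to the built-in dict, and it times only that baseline. The function count_with_dict and the placeholder word list are hypothetical; in the exercise the list would come from file2words(filename), and the same timeit call would be repeated with your HashTable-based function.

```python
import timeit


def count_with_dict(words):
    """Count occurrences of each word using a built-in dict."""
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


# Placeholder input; in the exercise this list would come from file2words(filename).
words = "lorem ipsum dolor sit amet lorem ipsum".split() * 1000

# timeit.timeit accepts a callable; number is how many times it is executed.
elapsed = timeit.timeit(lambda: count_with_dict(words), number=100)
print(f"dict-based count: {elapsed:.4f} s for 100 runs")
```

Timing both implementations on the same word list and with the same number of runs keeps the comparison fair.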