SRI Python course: scientific tools
October 4, 2016

# 1 Scientific programming in Python

Python is increasingly becoming an alternative to Matlab. Why?

• free
• a "real" programming language
• benefits from the open-source community: steadily growing
• unifies many libraries that exist in different languages (algebra, analysis, statistics, signal processing, etc.)
• a Python library to "wrap" it all: scipy, http://www.scipy.org/

We will cover here:

• matrix computation with numpy (the basis of many other tools)
• an introduction to a few relevant tools
  – signal processing
  – image processing
  – machine learning
  – graphs
  – robotics

## 1.1 Vector programming: Numerical Python (numpy)

An introduction to scientific programming and the use of vectors/matrices:

• definitions, creation
• indexing/references
• a few operations / examples of vectorized computation
• "broadcasting" (composition) rules
• more complex operations, convolution (e.g. cellular automata, fractals)

In [2]: import numpy as np
        a = np.random.random(1000)
        print type(a)

<type 'numpy.ndarray'>

In [7]: print a[:10]
        print a.min(),a.max(),a.mean()

[ 0.30925646  0.86760964  0.48519242  0.77411482  0.38657677  0.50936001
  0.58858523  0.06037206  0.26884881  0.24828936]
0.00125452472117 0.999758911835 0.49546800591

## 1.2 Vectorized computation

The cos function of the math module only accepts scalars, so applying it to an array fails:

In [8]: from math import cos
        print cos(a)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-aaa6d586df86> in <module>()
      1 from math import cos
----> 2 print cos(a)

TypeError: only length-1 arrays can be converted to Python scalars

The numpy version of cos is applied to every element of the array:

In [5]: from numpy import cos
        c = cos(a)
        print c[:10]

[ 0.97542417  0.7051168   0.9559976   0.7920507   0.889291    0.93883746
  0.9395283   0.98941373  0.8121354   0.6542547 ]

In [10]: (a+a)[:10]

Out[10]: array([ 0.61851293,  1.73521928,  0.97038485,  1.54822963,  0.77315354,
                 1.01872002,  1.17717045,  0.12074412,  0.53769763,  0.49657872])
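To make the contrast between the scalar math.cos and the element-wise numpy cos concrete, here is a minimal sketch. The array `values` and the names `looped` and `vectorized` are illustrative choices, not part of the original session.

```python
import math
import numpy as np

# Hypothetical sample data; any 1-D float array would do.
values = np.linspace(0.0, 1.0, 5)

# Element-by-element loop using the scalar function from the math module.
looped = np.array([math.cos(v) for v in values])

# Vectorized call: np.cos is applied to the whole array at once.
vectorized = np.cos(values)

# Both approaches agree up to floating-point rounding.
print(np.allclose(looped, vectorized))
```

The loop and the vectorized call compute the same numbers; the difference lies in where the iteration happens (in the Python interpreter versus inside numpy's compiled code).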
### 1.2.1 Comparison

In [13]: b = list(a)
         print b[:10]

[0.3092564639249985, 0.86760964130719376, 0.48519242363955573, 0.77411481507324209, 0.38657676754846271,

In [11]: %timeit a**2

The slowest run took 16.04 times longer than the fastest. This could mean that an intermediate result is
1000000 loops, best of 3: 1.93 µs per loop

In [10]: %timeit [x**2 for x in b]

1000 loops, best of 3: 274 µs per loop

In [11]: %%timeit
         s = 0
         for x in b:
             s = s + x**2

1000 loops, best of 3: 320 µs per loop

On these 1000 elements, the vectorized expression is more than a hundred times faster than the list comprehension or the explicit loop.

## 1.3 Indexing and dimensions

In [14]: c = np.array(b)
         print type(c)
         print c[9]
         c.shape

<type 'numpy.ndarray'>
0.248289362122

Out[14]: (1000,)

In [18]: c = c.reshape(10,100)
         print c[0,9]
         print
         print c[:,2:4]

0.857599610473

[[ 0.47500376  0.14563654]
 [ 0.20858125  0.00736357]
 [ 0.48992152  0.70656261]
 [ 0.2826899   0.1070452 ]
 [ 0.43615071  0.99472248]
 [ 0.03902043  0.75047057]
 [ 0.87706503  0.72165283]
 [ 0.78677078  0.88379719]
 [ 0.0239854   0.67950247]
 [ 0.62720625  0.5302494 ]]

## 1.4 Creation

In [18]: c = np.zeros(shape=(3,4))
         print c.shape
         print c
         c.ravel()

(3, 4)
[[ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]]

Out[18]: array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.])

In [19]: c = np.arange(10)
         print c

[0 1 2 3 4 5 6 7 8 9]

In [20]: c = np.arange(0.,1.,0.05)
         c

Out[20]: array([ 0.  ,  0.05,  0.1 ,  0.15,  0.2 ,  0.25,  0.3 ,  0.35,  0.4 ,
                 0.45,  0.5 ,  0.55,  0.6 ,  0.65,  0.7 ,  0.75,  0.8 ,  0.85,
                 0.9 ,  0.95])

In [21]: c = np.linspace(0, 1.,21)
         c

Out[21]: array([ 0.  ,  0.05,  0.1 ,  0.15,  0.2 ,  0.25,  0.3 ,  0.35,  0.4 ,
                 0.45,  0.5 ,  0.55,  0.6 ,  0.65,  0.7 ,  0.75,  0.8 ,  0.85,
                 0.9 ,  0.95,  1.  ])

## 1.5 Creation (continued)

In [94]: np.zeros((2,2))

Out[94]: array([[ 0.,  0.],
                [ 0.,  0.]])

In [95]: np.ones((2,2))

Out[95]: array([[ 1.,  1.],
                [ 1.,  1.]])

In [24]: np.eye(3)

Out[24]: array([[ 1.,  0.,  0.],
                [ 0.,  1.,  0.],
                [ 0.,  0.,  1.]])

## 1.6 References

Beware of references: unlike a list slice, an array slice is not a copy. It refers to the same data as the original array, so a change made through one is visible through the other.

In [29]: c[1] = 0
         print(c)
         a = c[1:-1]
         c[1] = -1
         print a[:10]

[ 0.    0.    0.1   0.15  0.2   0.25  0.3   0.35  0.4   0.45  0.5   0.55
  0.6   0.65  0.7   0.75  0.8   0.85  0.9   0.95  1.  ]
[-1.    0.1   0.15  0.2   0.25  0.3   0.35  0.4   0.45  0.5 ]
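The difference between list slices and array slices can be checked directly. The sketch below is only an illustration: the names `numbers`, `list_slice`, `view` and `independent` are made up for the example, and `np.shares_memory` is used as one possible way to test whether two arrays overlap in memory.

```python
import numpy as np

# A plain Python list: slicing builds a new, independent list.
numbers = [0, 1, 2, 3, 4]
list_slice = numbers[1:-1]
numbers[1] = -1
print(list_slice[0])    # still the old value

# A numpy array: basic slicing returns a view on the same memory.
c = np.linspace(0.0, 1.0, 21)
view = c[1:-1]
independent = c[1:-1].copy()   # explicit copy with its own memory
c[1] = -1.0
print(view[0])          # follows the change made through c
print(independent[0])   # keeps the value it had when copied

# np.shares_memory reports whether two arrays overlap in memory.
print(np.shares_memory(c, view))
print(np.shares_memory(c, independent))
```

Calling `.copy()` is one way to obtain independent data; assigning into a preallocated array is another.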
To get an independent copy, copy the values into a new array:

In [32]: c[0] = 0
         a = np.zeros(c.shape)
         # or
         #a[:] = c
         a[...] = c
         c[0] = 2
         print a[:10]

[ 0.   -1.    0.1   0.15  0.2   0.25  0.3   0.35  0.4   0.45]

## 1.7 Conditional indexing

In [34]: ia1 = np.argwhere(a>0.55)
         ia2 = np.argwhere((a>.55) | (a<0.))
         print ia2

[[ 1]
 [12]
 [13]
 [14]
 [15]
 [16]
 [17]
 [18]
 [19]
 [20]]

In [47]: print a[ia1].ravel()
         print a[ia2].ravel()

[ 0.6   0.65  0.7   0.75  0.8   0.85  0.9   0.95  1.  ]
[-1.    0.6   0.65  0.7   0.75  0.8   0.85  0.9   0.95  1.  ]

In [35]: print a[(a<0.) | (a>0.55)]

[ 0.8   0.85  0.9   0.95  1.  ]

In [106]: %pylab inline

Populating the interactive namespace from numpy and matplotlib

## 1.8 Usage examples

In [37]: from numpy import sin
         %pylab inline
         x = np.linspace(-20,20,1000)
         a = sin(x/2.) + np.random.normal(scale=0.2,size=1000)
         pylab.plot(x,a)

Populating the interactive namespace from numpy and matplotlib

WARNING: pylab import has clobbered these variables: ['cos']
`%matplotlib` prevents importing * from pylab and numpy

Out[37]: [<matplotlib.lines.Line2D at 0x10621e050>]

Let us redo the moving average:

In [38]: c = np.copy(a[1:-1])
         c += a[:-2] + a[2:]
         c *= 1./3
         pylab.plot(x[1:-1],c)

Out[38]: [<matplotlib.lines.Line2D at 0x10627f810>]

In [44]: def smooth3(a):
             c = np.zeros(len(a)-2)
             c += a[:-2] + a[1:-1] + a[2:]
             c *= 1./3
             return c

         c = smooth3(a)
         for i in range(10):
             c = smooth3(c)
         #pylab.plot(c)
         #pylab.ylim((0.0,1.0))
         pylab.plot(c)

Out[44]: [<matplotlib.lines.Line2D at 0x106c81390>]

In fact this is a special case of a more general operation, convolution, which already exists in scipy: convolve, and smooth.

## 1.9 Composition rules ("broadcasting")

In [49]: a = np.arange(100)
         a = a.reshape(10,10)
         a

Out[49]: array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
                [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
                [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
                [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
                [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
                [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
                [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
                [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
                [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
                [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])
A scalar is combined with every element of the array:

In [50]: a+10

Out[50]: array([[ 10,  11,  12,  13,  14,  15,  16,  17,  18,  19],
                [ 20,  21,  22,  23,  24,  25,  26,  27,  28,  29],
                [ 30,  31,  32,  33,  34,  35,  36,  37,  38,  39],
                [ 40,  41,  42,  43,  44,  45,  46,  47,  48,  49],
                [ 50,  51,  52,  53,  54,  55,  56,  57,  58,  59],
                [ 60,  61,  62,  63,  64,  65,  66,  67,  68,  69],
                [ 70,  71,  72,  73,  74,  75,  76,  77,  78,  79],
                [ 80,  81,  82,  83,  84,  85,  86,  87,  88,  89],
                [ 90,  91,  92,  93,  94,  95,  96,  97,  98,  99],
                [100, 101, 102, 103, 104, 105, 106, 107, 108, 109]])

In [133]: a[:2,0:3]

Out[133]: array([[ 0,  1,  2],
                 [10, 11, 12]])

In [47]: b = np.arange(10)-5
         print b

[-5 -4 -3 -2 -1  0  1  2  3  4]

Here b has shape (10,) and a has shape (10, 10): b is added to every row of a.

In [54]: print(a)
         a+b

[[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]
 [20 21 22 23 24 25 26 27 28 29]
 [30 31 32 33 34 35 36 37 38 39]
 [40 41 42 43 44 45 46 47 48 49]
 [50 51 52 53 54 55 56 57 58 59]
 [60 61 62 63 64 65 66 67 68 69]
 [70 71 72 73 74 75 76 77 78 79]
 [80 81 82 83 84 85 86 87 88 89]
 [90 91 92 93 94 95 96 97 98 99]]

Out[54]: array([[ -5,  -3,  -1,   1,   3,   5,   7,   9,  11,  13],
                [  5,   7,   9,  11,  13,  15,  17,  19,  21,  23],
                [ 15,  17,  19,  21,  23,  25,  27,  29,  31,  33],
                [ 25,  27,  29,  31,  33,  35,  37,  39,  41,  43],
                [ 35,  37,  39,  41,  43,  45,  47,  49,  51,  53],
                [ 45,  47,  49,  51,  53,  55,  57,  59,  61,  63],
                [ 55,  57,  59,  61,  63,  65,  67,  69,  71,  73],
                [ 65,  67,  69,  71,  73,  75,  77,  79,  81,  83],
                [ 75,  77,  79,  81,  83,  85,  87,  89,  91,  93],
                [ 85,  87,  89,  91,  93,  95,  97,  99, 101, 103]])

Shifted slices of the same array can be combined in the same way:

In [144]: a = np.eye(5)
          a[1,2] = 1
          print a
          print
          print a[0:-2,0:-2]
          print
          print a[1:-1,1:-1]
          print
          a[0:-2,0:-2] + a[1:-1,1:-1]

[[ 1.  0.  0.  0.  0.]
 [ 0.  1.  1.  0.  0.]
 [ 0.  0.  1.  0.  0.]
 [ 0.  0.  0.  1.  0.]
 [ 0.  0.  0.  0.  1.]]

[[ 1.  0.  0.]
 [ 0.  1.  1.]
 [ 0.  0.  1.]]

[[ 1.  1.  0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]]

Out[144]: array([[ 2.,  1.,  0.],
                 [ 0.,  2.,  1.],
                 [ 0.,  0.,  2.]])
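The broadcasting cells in this section rely on how numpy aligns shapes. The following sketch, on a smaller array, shows the three typical cases; the names `row`, `col`, `by_row`, `by_col` and `failed` are chosen for illustration only.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)   # shape (3, 4)
row = np.array([0, 10, 20, 30])   # shape (4,)
col = np.array([100, 200, 300])   # shape (3,)

# Shapes are compared from the last axis backwards: (3, 4) and (4,) match,
# so `row` is added to every row of `a`.
by_row = a + row

# (3, 4) and (3,) do not match on the last axis; adding a new axis
# gives shape (3, 1), which is stretched along the columns.
by_col = a + col[:, np.newaxis]

# Shapes that cannot be aligned raise a ValueError.
try:
    a + col
    failed = False
except ValueError:
    failed = True

print(by_row)
print(by_col)
print(failed)
```

In short, two axes are compatible when their lengths are equal or when one of them is 1; a missing leading axis is treated as length 1.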
## 1.10 Computations / 2D grids

In [56]: x = np.linspace(0.,5.,3)
         y = np.linspace(0.,3.,4)
         (X,Y) = np.meshgrid(x,y)
         print X
         print
         print Y

[[ 0.   2.5  5. ]
 [ 0.   2.5  5. ]
 [ 0.   2.5  5. ]
 [ 0.   2.5  5. ]]

[[ 0.  0.  0.]
 [ 1.  1.  1.]
 [ 2.  2.  2.]
 [ 3.  3.  3.]]

In [57]: Y

Out[57]: array([[ 0.,  0.,  0.],
                [ 1.,  1.,  1.],
                [ 2.,  2.,  2.],
                [ 3.,  3.,  3.]])

In [58]: a = X*X + Y*Y
         a

Out[58]: array([[  0.  ,   6.25,  25.  ],
                [  1.  ,   7.25,  26.  ],
                [  4.  ,  10.25,  29.  ],
                [  9.  ,  15.25,  34.  ]])

Signal processing in Python: Scipy, a layer on top of numpy

## 1.11 Images and matrices

In [62]: image = np.random.rand(50, 50)
         plt.imshow(image, cmap=plt.cm.hot)
         plt.colorbar()

Out[62]: <matplotlib.colorbar.Colorbar instance at 0x10a27b3f8>

In [63]: def smooth(image):
             new = np.zeros(image.shape)
             Z = image
             new[1:-1,1:-1] = (Z[0:-2,0:-2] + Z[0:-2,1:-1] + Z[0:-2,2:] +
                               Z[1:-1,0:-2] + Z[1:-1,1:-1] + Z[1:-1,2:] +
                               Z[2: ,0:-2] + Z[2: ,1:-1] + Z[2: ,2:])/9.
             return new
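The smooth function of this section hard-codes nine equal weights. As a hedged sketch of how the same slicing idea extends to arbitrary 3x3 weights, the hypothetical helper `filter3x3` below (not part of the original notebook) accumulates one shifted slice per weight. It applies the weights without flipping them, which coincides with a convolution only for symmetric kernels such as the mean filter, and it leaves the border pixels at zero like smooth does.

```python
import numpy as np

def filter3x3(image, weights):
    """Weighted sum over each 3x3 neighbourhood; border pixels are left at 0."""
    rows, cols = image.shape
    out = np.zeros(image.shape)
    for di in range(3):
        for dj in range(3):
            # Slice shifted by (di, dj), aligned with the interior pixels.
            shifted = image[di:rows - 2 + di, dj:cols - 2 + dj]
            out[1:-1, 1:-1] += weights[di, dj] * shifted
    return out

# Small deterministic test image (illustrative data).
image = np.arange(25.0).reshape(5, 5)
mean_kernel = np.ones((3, 3)) / 9
result = filter3x3(image, mean_kernel)

# Each interior pixel is the mean of its 3x3 neighbourhood.
print(np.isclose(result[1, 1], image[0:3, 0:3].mean()))
```

With the mean kernel this reproduces the behaviour of smooth on interior pixels; other kernels (for example an edge detector) only require a different `weights` array. For real work, scipy.ndimage offers a ready-made convolve function, whose border handling differs from this sketch.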
In [210]: new = smooth(image)
          plt.imshow(new, cmap=plt.cm.hot)
          plt.colorbar()

Out[210]: <matplotlib.colorbar.Colorbar instance at 0x11757e758>

In [181]: filtre = np.ones((3,3))/9

In [185]: filtre, image[0:3,0:3]

Out[185]: (array([[ 0.11111111,  0.11111111,  0.11111111],
                  [ 0.11111111,  0.11111111,  0.11111111],
                  [ 0.11111111,  0.11111111,  0.11111111]]),
           array([[ 0.67870554,  0.99102435,  0.95144261],
                  [ 0.13887435,  0.0137605 ,  0.08352475],
                  [ 0.69173264,  0.33549461,  0.18722741]]))

In [186]: filtre*image[0:3,0:3]

Out[186]: array([[ 0.07541173,  0.11011382,  0.10571585],
                 [ 0.01543048,  0.00152894,  0.00928053],
                 [ 0.07685918,  0.03727718,  0.02080305]])

In [187]: weights = filtre

In [188]: weights

Out[188]: array([[ 0.11111111,  0.11111111,  0.11111111],
                 [ 0.11111111,  0.11111111,  0.11111111],
                 [ 0.11111111,  0.11111111,  0.11111111]])

In [64]: from scipy import misc
         lena = misc.lena()
         plt.imshow(lena,cmap=plt.cm.gray)
         print type(lena)
         print lena.shape, lena.dtype

<type 'numpy.ndarray'>
(512, 512) int64

In [65]: new = lena
         for i in range(120):
             new = smooth(new)
         plt.imshow(new,cmap=plt.cm.gray)

Out[65]: <matplotlib.image.AxesImage at 0x109d3c390>

Several Python libraries (or libraries with a Python interface) exist for images:

• PIL, the Python Imaging Library: basic handling of image formats
• opencv: advanced processing / computer vision (originally in C++) / pattern recognition, http://opencv.org

## 1.12 Graphs

Several libraries are available:

• networkx (pure Python)
• igraph (C with a Python API)
• sparse matrices (scipy.sparse, built on numpy)

In [69]: # Example
         import networkx as nx
         # random graph
         G = nx.random_geometric_graph(150, 0.12)
         pos = nx.get_node_attributes(G, 'pos')
         # node positions,
         # -> find the node closest to the centre at (0.5,0.5)
         dists = [(x - 0.5)**2 + (y - 0.5)**2 for x, y in pos.values()]
         ncenter = np.argmin(dists)
         # compute the path lengths from the centre
         p = nx.single_source_shortest_path_length(G, ncenter)
         # draw the figure: colour tied to the path length
         pylab.figure()
         nx.draw_networkx_edges(G, pos, alpha=0.4)
         nx.draw_networkx_nodes(G, pos, nodelist=p.keys(),
                                node_size=120, alpha=0.5,
                                node_color=p.values(), cmap=pylab.cm.hot)
         pylab.show()

In [ ]: # igraph example
        # numpy example

## 1.13 Machine learning / pattern recognition

(starting next semester)

From examples, build a model that makes decisions on new data:

• classification: determine the category of an event; example: spam detection
• regression: predict the values of a function from examples; example:
• learning action policies for a robot

scikit-learn (a layer on top of numpy), http://scikit-learn.org

opencv also has related tools, specialized for image processing.

## 1.14 Robotics

ROS: Robot Operating System, used in the robotics lab sessions. http://wiki.ros.org/

• hardware management
• visualization
• communication between modules

with a Python API