Diffusion de l'information : Modélisation et Analyse

Telechargé par Ouafa Amrini

Téléchargement

HAL Id: tel-01100255

https://tel.archives-ouvertes.fr/tel-01100255

Submitted on 6 Jan 2015

HAL is a multi-disciplinary open access

archive for the deposit and dissemination of sci-

entic research documents, whether they are pub-

lished or not. The documents may come from

teaching and research institutions in France or

abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est

destinée au dépôt et à la diusion de documents

scientiques de niveau recherche, publiés ou non,

émanant des établissements d’enseignement et de

recherche français ou étrangers, des laboratoires

publics ou privés.

Distributed under a Creative Commons Attribution - NoDerivatives| 4.0 International

License

Diusion de l’information dans les médias sociaux :

modélisation et analyse

Adrien Guille

To cite this version:

Adrien Guille. Diusion de l’information dans les médias sociaux : modélisation et analyse. Informa-

tique [cs]. Université Lumière Lyon 2, 2014. Français. �tel-01100255�

Thèse présentée pour obtenir le grade de

Docteur de l’Université Lumière Lyon 2

École Doctorale Informatique et Mathématiques (ED 512)

Laboratoire ERIC (EA 3083)

Discipline : Informatique

Diffusion de l’information dans les médias sociaux

Modélisation et analyse

Par : Adrien Guille

Présentée et soutenue publiquement le 25 novembre 2014, devant un jury composé de :

Pascal Poncelet,Professeur des Universités, Université Montpellier 2 Rapporteur

Emmanuel Viennet,Professeur des Universités, Université Paris 13 Rapporteur

Vincent Labatut,Maître de Conférences, Université d’Avignon et des Pays du Vaucluse Examinateur

Christine Largeron,Professeur des Universités, Université Jean Monnet Saint-Etienne Examinatrice

Cécile Favre,Maître de Conférences, Université Lumière Lyon 2 Co-directrice

Djamel Zighed,Professeur des Universités, Université Lumière Lyon 2 Directeur

Abstract

Social media have greatly modiﬁed the way we produce, diffuse and consume

information, and have become powerful information vectors. The goal of this thesis

is to help in the understanding of the information diffusion phenomenon in social

media by providing means of modeling and analysis.

First, we propose MABED (Mention-Anomaly-Based Event Detection), a statistical

method for automatically detecting events that most interest social media users from

the stream of messages they publish. In contrast with existing methods, it doesn’t

only focus on the textual content of messages but also leverages the frequency of

social interactions that occur between users. MABED also differs from the literature in

that it dynamically estimates the period of time during which each event is discussed

rather than assuming a predeﬁned ﬁxed duration for all events. Secondly, we propose

T-BASIC (Time-Based ASynchronous Independent Cascades), a probabilistic model

based on the network structure underlying social media for predicting information

diffusion, more speciﬁcally the evolution of the number of users that relay a given

piece of information through time. In contrast with similar models that are also based

on the network structure, the probability that a piece of information propagate from

one user to another isn’t ﬁxed but depends on time. We also describe a procedure

for inferring the latent parameters of that model, which we formulate as functions of

observable characteristics of social media users. Thirdly, we propose SONDY (SOcial

Network DYnamics), a free and extensible software that implements state-of-the-art

methods for mining data generated by social media, i.e. the messages published by

users and the structure of the social network that interconnects them. As opposed

to existing academic tools that either focus on analyzing messages or analyzing the

network, SONDY permits the joint analysis of these two types of data through the

analysis of inﬂuence with respect to each detected event.

The experiments, conducted on data collected on Twitter, demonstrate the rele-

vance of our proposals and shed light on some properties that give us a better un-

derstanding of the mechanisms underlying information diffusion. First, we compare

the performance of MABED against those of methods from the literature and ﬁnd that

taking into account the frequency of social interactions between users leads to more

accurate event detection and improved robustness in presence of noisy content. We

also show that MABED helps with the interpretation of detected events by providing

clear textual descriptions and precise temporal descriptions. Secondly, we demons-

trate the relevancy of the procedure we propose for estimating the pairwise diffusion

probabilities on which T-BASIC relies. For that, we illustrate the predictive power of

users’ characteristics, and compare the performance of the method we propose to es-

timate the diffusion probabilities against those of state-of-the-art methods. We show

the importance of having non-constant diffusion probabilities, which allows incorpo-

rating the variation of users’ level of receptivity through time into T-BASIC. We also

study how – and in which proportion – the social, topical and temporal characteristics

of users impact information diffusion. Thirdly, we illustrate with various scenarios the

usefulness of SONDY, both for non-experts – thanks to its advanced user interface and

adapted visualizations – and for researchers – thanks to its application programming

interface.

Keywords. Social media data mining ; Event detection and tracking ; Modeling and

predicting information diffusion ; Scientiﬁc software development.

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

166

167

168

169

170

171

172

173

174

175

176

177

178

179

180

181

182

183

184

185

186

187

188

189

190

1 / 190 100%

Documents connexes

UE01 Evénement : conjuguer créativité et faisabilité 15 UE02

Test auto-évaluation pour le trouble de stress post

fiche-inscription-black-friday (, 270.68 Ko)

BBQ estival 4 juin 2014 Membres présents : membre de l`AGE et les

Traduction automatique : Diagramme explicatif

Module_3_Pr_Dalery_CC_Psychiatrie_corrigé (PDF

Exercice 48 p 296 : Notons P l`évènement « On obtient Pile au

Soirée privée en lieu clos:

Mlle Galié ALI ORDJO - Amazon Web Services

Merci pour votre participation!

Faire une suggestion

Avez-vous trouvé des erreurs dans l'interface ou les textes ? Ou savez-vous comment améliorer l'interface utilisateur de StudyLib ? N'hésitez pas à envoyer vos suggestions. C'est très important pour nous!

GDPR Confidentialité Conditions d''utilisation

Diffusion de l'information : Modélisation et Analyse

Documents connexes

Faire une suggestion

Produits

Assistance

Produits

Assistance

Diffusion de l'information : Modélisation et Analyse

Documents connexes

Faire une suggestion

Produits

Assistance

Ajouter ce document à la (aux) collections

Ajouter ce document à enregistré

Suggérez-nous comment améliorer StudyLib