Automatic Analysis of Gossip in E-Mail

Michael Henninger, University of Applied Sciences Northwestern Switzerland

Hauke Fuehres, University of Cologne

Peter A. Gloor, MIT Center for Collective Intelligence

This paper proposes a novel type of link: the “gossip link”. While the automatic generation of links in e-mail archives is straightforward, gossip links raise a number of interesting research problems. For simple e-mail links, if A sends an e-mail to B, there is a link from A to B. Gossip links can likewise be extracted automatically from e-mail text bodies: if A talks about C in his/her e-mail, we construct a link from A to C. However, such links can also be constructed in two additional ways. We can have a gossip link from A to B if A gossips about somebody to B. Finally, we can have a gossip link from B to C if C is spoken about in an e-mail message sent to B.
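The three link constructions described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the extraction method used in Condor: the function name `gossip_links`, the link labels, the `people` dictionary, and the naive whole-word name matching are all assumptions made for the example, as is the choice to ignore mentions of the sender and the recipients themselves.

```python
import re

def gossip_links(sender, recipients, body, people):
    """Return (kind, source, target) gossip links for a single message.

    `people` maps a person id to the name searched for in the body
    (a hypothetical format; real name resolution is much harder).
    """
    # Third parties C: named in the body, neither sender nor recipient.
    mentioned = sorted(
        pid for pid, name in people.items()
        if pid != sender
        and pid not in recipients
        and re.search(r"\b" + re.escape(name) + r"\b", body, re.IGNORECASE)
    )
    links = []
    for c in mentioned:
        links.append(("talks_about", sender, c))      # A -> C
        for b in recipients:
            links.append(("hears_about", b, c))       # B -> C
    if mentioned:
        for b in recipients:
            links.append(("gossips_to", sender, b))   # A -> B
    return links

people = {"alice": "Alice", "bob": "Bob", "carol": "Carol"}
sample = gossip_links("alice", ["bob"], "I think Carol did great work.", people)
```

For the sample message, in which Alice writes to Bob about Carol, the sketch yields one link of each kind: Alice to Carol, Bob to Carol, and Alice to Bob.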

We speculate that “gossip links” are “honest signals” in the same sense as the “honest signals” captured with Sandy Pentland’s sociometric badges. If a person is spoken about a lot, she/he is presumably important for the community. In combination with automatic sentiment analysis, we could then see whether a person is important and well-liked, or important and not-so-well-liked. We also speculate that some people are more active chatterboxes than others, and that some people are more popular as receivers of gossip than others. It will be interesting to explore whether the perception of people in these roles in the face-to-face world corresponds to gossip in e-mail.

We have built a first version of an automatic gossip link recognition system in Condor, in combination with a simple “bag-of-words” sentiment analysis system. We are currently testing the system with an e-mail archive of an active group of medical researchers to explore these hypotheses.
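To make the idea of bag-of-words sentiment scoring concrete, a minimal sketch follows. It is not the system built in Condor: the word lists `POSITIVE` and `NEGATIVE` are tiny hypothetical lexicons, and the score in the range from -1 to 1 is one possible normalization chosen for the example.

```python
import re

# Hypothetical toy lexicons; a real system would use a much larger word list.
POSITIVE = {"great", "good", "helpful", "excellent", "thanks"}
NEGATIVE = {"bad", "late", "rude", "poor", "wrong"}

def sentiment(text):
    """Score text in [-1, 1] from counts of positive and negative words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    # Text without any lexicon word is treated as neutral.
    return 0.0 if total == 0 else (pos - neg) / total
```

Because word order is ignored, such a scorer cannot handle negation or irony; applying it only to the sentences that mention the person being gossiped about would be one way to attach the score to a gossip link.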

We envision many practical applications for our gossip links. For example, we will track the acceptance of new members of a community to see whether the amount of gossip about a new member correlates with the amount of e-mail traffic that the new member sends and receives. It might be that some people are embraced by the community more quickly than others. Another application could be the tracking of new ideas and events. Are new ideas associated with certain people, who might also be the creators of these ideas? Tracking gossip links also offers another microscope into the inner health of a team, measuring whether people talk about each other in positive or negative undertones. Finally, it will be interesting to check whether the amount of gossip about a person correlates with the amount of e-mail sent to and received from this person. It could be that there are hidden influencers, who send and receive relatively little e-mail but are spoken about a lot.
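The comparison between gossip volume and e-mail traffic could be sketched as follows. The per-person counts are invented purely for illustration, and the median-based rule for flagging hidden influencers is an assumption made for this example, not a criterion proposed in this paper.

```python
from math import sqrt
from statistics import median

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def hidden_influencers(mentions, traffic):
    """People mentioned more than the median but with below-median traffic."""
    m_med = median(mentions.values())
    t_med = median(traffic.values())
    return sorted(
        p for p in mentions
        if mentions[p] > m_med and traffic.get(p, 0) < t_med
    )

# Invented counts: times spoken about, and messages sent plus received.
mentions = {"ann": 2, "ben": 4, "cat": 6, "dan": 20}
traffic = {"ann": 10, "ben": 20, "cat": 30, "dan": 5}

people = sorted(mentions)
r = pearson([mentions[p] for p in people], [traffic[p] for p in people])
flagged = hidden_influencers(mentions, traffic)
```

In this invented data set, `flagged` contains only `"dan"`, who is spoken about most but exchanges the least e-mail.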

The same concept of gossip can also be extended to the analysis of online forum posts and chat transcripts. Who are the persons about whom others speak most? Do they talk about them positively or negatively? One could imagine a system of automatic troll detection in online forums, where posts of trolls trigger a subsequent discussion about the troll and her/his posts. Another application in online forums would be to find the most active and valuable posters, because they will be spoken about most and in positive undertones.

While we have only just scratched the surface of this exciting new area of “gossip-based” social networking research, we are convinced that there will be numerous applications of this new type of “gossip link”.