Privacy Detective: Detecting Private Information and Collective Privacy Behavior in a Large Social Network
Conference proceeding

Aylin Caliskan Islam, Jonathan Walsh and Rachel Greenstadt
Proceedings of the 13th Workshop on Privacy in the Electronic Society, pp. 35-46
03 Nov 2014

Abstract

Keywords: detecting private information; privacy; privacy behavior; sensitive information; social network; text classification
Detecting the presence and amount of private information being shared in online media is the first step towards analyzing the information-revealing habits of users in social networks, and a useful method for researchers to study aggregate privacy behavior. In this work, we aim to find out whether text contains private content by using our novel learning-based approach, 'Privacy Detective', which combines topic modeling, named entity recognition, privacy ontology, sentiment analysis, and text normalization to represent privacy features. Privacy Detective investigates a broader range of privacy concerns than previous approaches, which focus on keyword searching or profile-related properties. We collected 500,000 tweets from 100,000 Twitter users, along with other information such as tweet linkages and follower relationships. We reach 95.45% accuracy in a two-class task that distinguishes Twitter users who do not reveal much private information from Twitter users who share sensitive information. We score timelines according to three privacy levels after having Amazon Mechanical Turk (AMT) workers annotate collected tweets according to privacy categories. Supervised machine learning classification results on these annotations reach 69.63% accuracy on a three-class task. Inter-annotator agreement on timeline privacy scores between various AMT workers and our classifiers falls under the same positive agreement level. Additionally, we show that a user's privacy level is correlated with her friends' privacy scores and also with the privacy scores of people mentioned in her text, but not with the number of her followers. As such, privacy in social networks appears to be socially constructed, which can have great implications for privacy-enhancing technologies and educational interventions.
