Bridging distinct domains in privacy related learning problems

Rebekah Overdorf

doi:10.17918/etd-7607

Dissertation

Open access

Doctor of Philosophy (Ph.D.), Drexel University

Sep 2017

DOI: https://doi.org/10.17918/etd-7607

Files and links (1)

pdf: Overdorf_Rebekah_2017 (1.19 MB), Open Access (License Unspecified)

Subjects: Privacy; Computer Science; Computer Security; Machine Learning

Abstract

The development of efficient and effective machine learning methods has prompted a surge of research on their application, from spam filtering to recommender systems. Blindly applying machine learning tools to learning problems in privacy and security, however, often fails to produce the desired results. Such applications are complicated by the fact that adversaries are ordinarily present and training data with reliable ground truth is frequently difficult to obtain. The problem is exacerbated when the data used to test a method differs from the real-world data that the model is built for. This thesis addresses three learning problems in privacy and security, each of which involves data from different domains that must be considered. In authorship attribution, we tackle the cross-domain case, in which the training and testing data are written in different contexts and media. Research in this area has been limited to texts written in the same domain, an assumption that often cannot be made in real-world settings. We explore cross-domain attribution in three such domains: blogs, Twitter feeds, and Reddit comments. Research in website fingerprinting focuses on a single domain, the incoming and outgoing packets on a network, to determine which webpage a user is visiting. In addition to this domain, we consider the websites themselves and develop methods that successfully determine which website-level features make a site more or less susceptible to this type of attack. Similarly, most research on the economies and structure of cybercriminal forums focuses only on the domain of private messages. While some research has investigated what can be learned from public interactions on these forums, no work has tried to bridge these domains. We present a method to predict which public threads are likely to trigger private interactions.

Details

Title: Bridging distinct domains in privacy related learning problems
Creators: Rebekah Overdorf - DU
Contributors: Rachel Greenstadt (Advisor) - Drexel University (1970-)
Awarding Institution: Drexel University
Degree Awarded: Doctor of Philosophy (Ph.D.)
Publisher: Drexel University; Philadelphia, Pennsylvania
Number of pages: xiv, 112 pages
Resource Type: Dissertation
Language: English
Academic Unit: Computer Science (Computing) (2013-2026); College of Computing and Informatics (2013-2026); Drexel University
Other Identifier: 7607; 991014632258404721
