Feature Exploration by Industry Segment in Ad-Level CS Data

HTI Labs

Purpose/Context

We want to understand, for each industry segment, the features and the relationships between features in our ad-level online commercial sex (CS) data. For now, we prioritize the features most relevant to our industry segment vetting process. The analyses rely on a sample of online CS ads that were manually vetted to determine segment type. In addition, select features (e.g., price, race, and Hispanic status) were manually annotated. The current analysis therefore relies on a hybrid of automatically generated and manually annotated feature values.
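As a rough illustration of what a hybrid feature value could look like, the sketch below resolves one feature per ad from a manual annotation and an automatically generated value. The precedence rule (manual first, automatic as fallback) is an assumption and is not stated in this report. The field names `price_manual` and `price_auto`, the helper `resolve_feature`, and the example values are all hypothetical.

```python
from typing import Optional, Tuple


def resolve_feature(manual: Optional[float],
                    auto: Optional[float]) -> Tuple[Optional[float], str]:
    """Return (value, source), preferring the manual annotation.

    The manual-first rule is an assumption made for this sketch only.
    """
    if manual is not None:
        return manual, "manual"
    if auto is not None:
        return auto, "auto"
    return None, "missing"


# Hypothetical ads; field names and values are illustrative, not real data.
ads = [
    {"ad_id": 1, "price_manual": None, "price_auto": 200.0},
    {"ad_id": 2, "price_manual": 120.0, "price_auto": None},
    {"ad_id": 3, "price_manual": 160.0, "price_auto": 150.0},
    {"ad_id": 4, "price_manual": None, "price_auto": None},
]

# Attach the resolved value and its provenance to each ad.
for ad in ads:
    ad["price"], ad["price_source"] = resolve_feature(
        ad["price_manual"], ad["price_auto"]
    )
```

Recording the source next to the value makes it possible to report later how much of each feature comes from manual annotation versus automatic extraction.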

Proportion of Ads by Industry Segment

Feature Missingness by Industry Segment
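A minimal sketch of how per-segment missingness could be computed, assuming each ad is a record with a `segment` label and that a missing feature is stored as `None`. The field names, the helper `missingness_by_segment`, and the example records are hypothetical, and this is not necessarily how the figures in this section were produced.

```python
from collections import defaultdict


def missingness_by_segment(ads, features, segment_key="segment"):
    """Return {segment: {feature: share of ads where the feature is None}}."""
    counts = defaultdict(int)                        # ads per segment
    missing = defaultdict(lambda: defaultdict(int))  # missing values per segment and feature
    for ad in ads:
        seg = ad[segment_key]
        counts[seg] += 1
        for feature in features:
            if ad.get(feature) is None:
                missing[seg][feature] += 1
    return {
        seg: {feature: missing[seg][feature] / n for feature in features}
        for seg, n in counts.items()
    }


# Hypothetical records; values are illustrative, not real data.
sample_ads = [
    {"segment": "Escort Services", "age": 25, "price_per_hour": None},
    {"segment": "Escort Services", "age": None, "price_per_hour": None},
    {"segment": "Massage Parlors", "age": None, "price_per_hour": 80.0},
]

rates = missingness_by_segment(sample_ads, ["age", "price_per_hour"])
```

Dividing by the number of ads in each segment, rather than by the whole sample, keeps the rates comparable across segments of very different sizes.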

Escort Services

Picture/Video Sales

Massage Parlors

Brothel/Residences

Stripclubs/Bars/Casinos

Outdoor Solicitation

Feature Co-occurrence by Industry Segment

Escort Services

Picture/Video Sales

Massage Parlors

Brothel/Residences

Stripclubs/Bars/Casinos

Outdoor Solicitation

Single Categorical Features by Industry Segment

Gender

Hispanic

Race

Multiple Sex Providers

Organization-Related

Single Numeric Features by Industry Segment

Age

Price Per Hour

Phone Number Count

Venue Count

Feature Relationships by Industry Segment

Age & Price - All Ages

Age & Price - Younger than 50

Age & Price - Younger than 30