A few of my students are working on pornography detection for video-sharing social networks (an early draft of our work is available on arXiv). Pornography is a contentious issue, littered with polemics, fallacies and rhetorical traps. We have tried, as much as possible, to steer clear of those. Thus, we refrain from value judgments, which are the realm of Philosophy and the Social Sciences, well outside our jurisdiction.
An interesting difficulty I have faced in reporting on this work is showing representative images without offending the sensibilities of reviewers and readers. So far, my (admittedly cowardly) choice has been to take the tamest images that are still representative of the phenomenon I want to illustrate. For example: to illustrate that the dataset is ethnically diverse, I would choose frames where only the actors’ faces are shown; to illustrate that the dataset contains gay porn as well as straight porn, I would show a frame with the actors kissing instead of having sex; and so on.
But recently, I had a tough choice to make. A student was about to submit his Master’s dissertation to the viva-voce committee, and, as usually happens in Brazil, he sent me a draft for corrections and suggestions. His “Results” chapter contained, among the cold graphs and tables, several very explicit images, illustrating in detail the successes and failures of our algorithm. The only thing is: all the images had censor bars.
I returned the draft with several corrections, among them a note begging him to remove the bars:
“Don’t censor the images — it’s extremely distasteful: this is a scientific work for an adult audience. Either remove the images entirely (if they are not needed), or keep them uncensored (don’t mess with the data!). In the worst case, put them in an annex or in a separate supplement.”
In the end, he decided to keep the images uncensored, which I feel was the right scientific decision.
Nevertheless, every time I open his “Experimental Results” chapter I cringe a little. Again, admittedly cowardly, I am looking forward to the defense, when I’ll be able to share the responsibility for the final decision — keeping the images in the definitive version or taking them out — with the rest of the committee.
* * *
Taking a (superficial) look at the literature, I noticed that many authors (myself included) practice a form of “partial self-censorship”: choosing “tame” images, making them tiny on the page, or using washed-out grayscale reproductions — a compromise between scientific truth and respect for the taboo? Or just plain cowardice? Most authors simply don’t include images, and a few choose to employ censor bars. The full-fledged honesty of my student is rare.
The censor bars, IMHO, are the worst choice — at once hypocritical and unscientific. Hypocritical, because the reader can perfectly well imagine what is behind them, so any of the “dirtiness” from which they are supposedly “protecting” the reader is still created in his or her mind. The effect is exactly the same as using euphemisms like “f-word”: the actual word is still conjured in the listener’s mind. Unscientific, because they rely on the reader’s imagination (with its distortions, imprecisions and, often, amplifications) instead of depicting precisely the phenomena under study.
Interestingly, in one paper the authors censor the actors’ faces (by pixelization). This choice raises a question I had not considered: since we collect our dataset from pornography-sharing social networks, we cannot assume that everyone in the videos is a professional actor. I hope that none of our examples feature Computer Vision scientists unaware that their amateur videos have escaped onto the net!
* * *
At the end of the day, this is 2011 — 64 years since the first Kinsey report! Shouldn’t science have grown some guts by now?