Recently accepted papers at ICIP and SIBGRAPI

I had a paper accepted at ICIP, the IEEE International Conference on Image Processing, by my student Sandra de Avila (whose main supervisor is my former M.Sc. supervisor Prof. Arnaldo de Araújo). Sandra is currently in France at the prestigious LIP6 lab, under the supervision of my former Ph.D. supervisor Prof. Matthieu Cord and our colleague Prof. Nicolas Thome. The paper presents an interesting extension to the “bag of visual words” approach (which is based on quantized local features using a codebook / “visual dictionary”), taking into consideration a histogram of the distances between the features actually found in the images and the features chosen to compose the codebook. Here are the title and abstract:

Bossa: Extended BoW Formalism for Image Classification
In image classification, the most powerful statistical learning approaches are based on the Bag-of-Words paradigm. In this article, we propose an extension of this formalism. Considering the Bag-of-features, dictionary coding and pooling steps, we propose to focus on the pooling step. Instead of using the classical sum or max pooling strategies, we introduced a density function-based pooling strategy. This flexible formalism allows us to better represent the links between dictionary codewords and local descriptors in the resulting image signature. We evaluate our approach in two very challenging tasks of video and image classification, involving very high level semantic categories with large and nuanced visual diversity.

I’ve also had two papers accepted at our counterpart national conference, SIBGRAPI. The first is the work of the Ph.D. student Ana Lopes and her R.A. Elerson Santos (supervised by Prof. Arnaldo Araújo and co-supervised by Prof. Jussara Almeida; I give her some technical and nontechnical support every now and then). It concerns the use of transfer learning of concepts from (static) image datasets to video datasets in order to recognize human actions. We show that learning the concepts present in the Caltech256 dataset allows a classifier to obtain improved results on the challenging “in the wild” human action Hollywood2 dataset.

Transfer Learning for Human Action Recognition
To manually collect action samples from realistic videos is a time-consuming and error-prone task. This is a serious bottleneck to research related to video understanding, since the large intra-class variations of such videos demand training sets large enough to properly encompass those variations. Most authors dealing with this issue rely on (semi-) automated procedures to collect additional, generally noisy, examples. In this paper, we exploit a different approach, based on a Transfer Learning (TL) technique, to address the target task of action recognition. More specifically, we propose a framework that transfers the knowledge about concepts from a previously labeled still image database to the target action video database. It is assumed that, once identified in the target action database, these concepts provide some contextual clues to the action classifier. Our experiments with Caltech256 and Hollywood2 databases indicate: a) the feasibility of successfully using transfer learning techniques to detect concepts and, b) that it is indeed possible to enhance action recognition with the transferred knowledge of even a few concepts. In our case, only four concepts were enough to obtain statistically significant improvements for most actions.

The second is the work of my Ph.D. student Marcelo Coelho and his R.A. Cássio dos Santos Jr. (again his main supervisor is Prof. Arnaldo de Araújo). It concerns the clean-up of noisy SIFT features of street-view images (urban façades). We have found that subspace clustering, an unsupervised technique, is able to isolate clusters of useful and non-useful SIFT features for the task of retrieving a target image. The challenge is identifying a priori which cluster is the relevant one. This work compares and contrasts two subspace clustering techniques: FINDIT (based on dimension voting) and MSSC (based on a fuzzy mean-shift).

Subspace Clustering for Information Retrieval in Urban Scene Databases
We present a comprehensive study of two important subspace clustering algorithms and their contribution to enhance results from the difficult task of matching images taken of the same object using different devices at different conditions. Our experiments were done on two distinct databases containing urban scenes which were tested using state-of-the-art matching algorithms. After initial evaluation of both datasets by that procedure, clustering algorithms were applied to them. An exhaustive comparison was performed in every cluster found and a significant amelioration in the results was obtained.

I’ll put a link to the preprints as soon as they become available.

Posted in publications, science

Scientific sense and hurt sensibilities

A few of my students are working on pornography detection for video sharing social networks (an early draft of our work is available on arXiv). Pornography is a contentious issue, littered with polemics, fallacies and rhetorical traps. We have tried, as much as possible, to keep away from those. We refrain, thus, from value judgements, which are the realm of Philosophy and Social Sciences, way outside our jurisdiction.

An interesting difficulty I have faced in reporting on this work is showing representative images without hurting the sensibilities of reviewers and readers. So far, my (admittedly cowardly) choice has been to take the tamest images that are still representative of the phenomena I want to illustrate. For example: to illustrate that the dataset is ethnically diverse, I would choose frames where only the faces of the actors are shown; to illustrate that the dataset contains gay porn as well as straight porn, I would show a frame with the actors kissing instead of having sex; etc.

But recently, I had a tough choice to make. A student was about to submit his Master’s dissertation to the viva-voce committee, and, as usually happens in Brazil, he sent me a draft for corrections and suggestions. His “Results” chapter contained, among cold graphs and tables, several very explicit images, illustrating in detail the cases of success and failure of our algorithm. The only thing is: all images contained censor bars.

I returned the draft with several corrections, among them a note begging him to remove the bars:

“Don’t censor the images. It’s extremely distasteful: this is a scientific work for an adult audience. Either remove the images entirely (if they are not needed), or keep them uncensored (don’t mess with the data !). In the worst case, put them in an Annex or in a separate supplement.”

In the end, he decided to keep the images uncensored, which I feel was the right scientific decision.

Nevertheless, every time I open his “Experimental Results” chapter I cringe a little bit. Again, admittedly cowardly, I am looking forward to the defense, when I’ll be able to share the responsibility for the final decision (keeping or taking away the images from the definitive version) with the rest of the committee.

* * *

Taking a (superficial) look at the literature, I noticed that many authors (including myself) practice a form of “partial self-censorship”: choosing “tame” images, making them tiny on the page, or using washed out grayscale reproductions. Is that a compromise between scientific truth and respect for the taboo ? Or just plain cowardice ? Most authors simply don’t include images, and a few choose to employ the censor bars. The full-fledged honesty of my student is rare.

The censor bars, IMHO, are the worst choice: at once hypocritical and unscientific. Hypocritical, because the reader can perfectly imagine what is behind them, so any of the “dirtiness” from which they would be supposedly “protecting” the reader is still being created in his or her mind. The effect is exactly the same as when using euphemisms like “f-word”: the correct word is still created in the listener’s mind. Unscientific, because they count on the reader’s imagination (with its distortions, imprecisions, and, often, amplifications) instead of depicting precisely the phenomena under study.

Interestingly, in one paper, the authors censor the faces of the actors (by pixelization). This is an interesting choice and raises a question I have not considered: since we collect our dataset from pornography sharing social networks, we cannot assume that everyone in the video is a professional actor. I hope that none of our examples have Computer Vision scientists unaware that their amateur videos have escaped to the net !

* * *

At the end of the day, this is 2011, 63 years since the first Kinsey report ! Shouldn’t science have got some guts by now ?

Posted in science

No easy answers

Barely a few months (6, 7 ?) after starting to use Mac OS X, my system has become basically unusable: MacPorts has died and gone to hell, root certificates get systematically rejected by Safari, the system is sssoooo ssssllllloooowwww that using it feels like moving in water. Firefox crashes every few hours. Parallels isn’t working anymore in Coherence mode.

The system is so unreliable that my only option is to reformat the machine and restart from scratch (basically what I had to do when I used XP, but XP usually lasted a good year and a half between reformats).

With that and all the software incompatibility problems, I am seriously considering switching to Windows 7 + Cygwin and throwing this whole Mac OS experiment onto the huge pile of “tried: didn’t work”.

Posted in technology

For better and for worse

Apparently the new Xcode has broken MacPorts so completely that the thing is now essentially useless.

More and more, for better and for worse, MacOS is the new Windows.

Posted in technology

Monteiro Lobato Doodle

I commented a while ago on the Viscount of Corncob and his Algebra Congestion (of which I am also a constant sufferer). The Viscount and Emilia, the Marquise of Shortail, were pictured in today’s Google Doodle on the Brazilian search page. Both are creations of the Brazilian writer Monteiro Lobato, whose 129th birthday is commemorated today.

Posted in blogging

How much Cultural Heritage is destroyed by copyright ?

Okay, now that the bombastic title has been said, the disclaimers:

  1. I am not against copyright per se;
  2. I do not automatically condone acts against copyright;
  3. When I first wrote this post it was 2h30 (in good civilized 24-hour format) and I always feel very brave this late in the night (“oh, yeah, let’s definitely post that !”) but much less so the next morning (“you know what, let’s not touch the beautiful and fragile seal of that can of worms”).

But I have been reflecting on the recent events in Egypt, and on how “War” and “Civil Unrest” are known as two important causes of Cultural Heritage destruction. Together with “Flood” and “Fire” they are certainly right in the top 10 list of every conservator’s worst nightmares.

But what about “Law” ? And specifically, “Copyright Law” ?

I am thinking about this because of Google’s Video Identification service (Beta, of course) on YouTube, which allows participating copyright holders to automatically identify uploaded videos containing their material. The applied phlebotinum is quite interesting, actually, and involves near-duplicate identification using content-based techniques. I am particularly interested in this kind of technology, and I feel particularly concerned, scientifically and ethically, because this was the subject matter (mutatis mutandis, with images in place of videos) of my Ph.D. Thesis.

The terms of service are very interesting: they allow the copyright holders the draconian choice of just removing the “offending” material, but also offer them the more nuanced choice of embracing the Millennium, allowing ‘their fans to participate in the creative process’ and even “splitting the loot”, by becoming partners with YouTube and sharing advertisement revenues.

It is a fact that many are embracing the Millennium, for matters of profit, fun, philosophy, or all of the above. But what happens when they don’t ? It’s then that content, id est, digital cultural artefacts, hits the proverbial bit bucket.

The alarm bell rang for me a while ago, during the infamous case of the “disappearing Hitler movies”. You’ve heard the story: there was this unknown European film about Hitler, and then there was a parody based on one of its hammy scenes, and then there were a few derivative parodies, and then there were thousands of derivative parodies, and now the thing was viral and all bets were off. Suddenly everybody knew about both the parodies and the film. Sounds like a fair deal, no ? Well…

One day the parodies started to disappear. Silently, quickly and deadly. Predictably, YouTube users were outraged. There were even those daring enough to point out the irony of the situation, proving that Godwin’s Law is alive and well. Fortunately, after much hesitation, YouTube came to its senses and concluded that if there is one canonical example of fair use, the ill-fated Hitler parody is it, and stopped taking the videos down.

This is now official, recorded History, with a big H, and Wikipedia tells it better than I. I quote:

One scene in the film, in which Hitler launches into a furious tirade upon finally realizing that the war is truly lost, has become a staple of internet viral videos.[16] In these wildly anachronistic videos, the original audio of Ganz’s voice is retained, but new subtitles are added so that he now seems to be reacting instead to some setback in present-day politics, sports, popular culture, navajo moccasins, etc. One parody depicted Hitler flying into a rage in response to being banned from Xbox Live. The creator of this parody was the one who originally came up with the idea of Downfall parodies, his video Hitler gets Banned from Xbox Live was the first ever Downfall parody (and the first parody to be taken down as well).[17] This video accumulated a vast number of YouTube views and was posted on video game related sites.

By 2010, there were thousands of such parodies, including many in which a self-aware Hitler is incensed that people keep making Downfall parodies.[18]

The film’s director, Oliver Hirschbiegel, spoke positively about these parodies in a 2010 interview with New York magazine, saying that many of them were funny and they were a fitting extension of the film’s purpose: “The point of the film was to kick these terrible people off the throne that made them demons, making them real and their actions into reality. I think it’s only fair if now it’s taken as part of our history, and used for whatever purposes people like.”[19] Nevertheless, Constantin Films has taken an “ambivalent” view of the parodies, and has asked video sites to remove many of them.[20] On April 21, 2010, the producers initiated a massive removal of parody videos on YouTube.[21] However, there has been a resurgence of the videos on the site since the mass removal.[22] On July 28, 2010, Constantin responded by issuing DMCA takedown notices on videos which had countered the blocking of the videos using a Fair Use argument.[citation needed]

As of October 2010, Youtube no longer blocks any Downfall-derived parodies,[23] and is now placing ads on some of them. This was seen by many as a sign of relief, ending the cat-and-mouse game that involved parodists and Constantin Film.

Corynne McSherry, an attorney specializing in intellectual property and free speech issues[24] for the Electronic Frontier Foundation, stated “All the [Downfall parody videos] that I’ve seen are very strong Fair Use cases and so they’re not infringing, and they shouldn’t be taken down.”[25]

So, happy ending ? Well, as an Archivist in spirit, I am not so convinced. I doubt that all the memetic diversity of the movies pre-, well, censorship, has been preserved.

I also wonder how many no less violent memecausts have been perpetrated in the silence of the night without getting any publicity, without their stories being told on Wikipedia, without anybody complaining that citations are needed. There are no citations left.

Nowadays, every broken link followed by a friendly ‘Sorry about that.’ and preceded by mysterious messages of ‘has been terminated’, ‘no longer available’ and ‘copyright infringement’ sends a chill down my spine. Just in case, I quickly clear my history and jump to disney.com.

* * *

I wonder if, considering everything, anonymous isn’t a critical actor in the protection of endangered digital cultural artifacts. By keeping them circulating in alternative ecosystems when they can no longer exist in the official, sanitized world of the law-abiding internet, how many pieces of software, games, images, videos, texts, mail exchanges and other important testimonies of our Culture have been saved from the unforgiving jaws of /dev/null ?

EDIT 6/June: There is a very interesting commentary on a 2005 report by the U.S.-based CLIR (Council on Library and Information Resources), focused on sound recordings on obsolete media, but that also considers more unconventional media.

Posted in heritage

Must everything be so difficult ? (Part 1)

I am trying to avoid ranting too much on this blog, but the computing industry is not cooperating.

So, I bought (actually, I was given) an HP Mini 1000 (model 1030NR) netbook computer. I would have found it useful for talks and short trips if it weren’t for HP’s incredibly absurd choice of: 1) including a non-standard external video port; 2) failing to provide the adapter to standard VGA or DVI for more than a year after the machine had hit the market. No, I’m not kidding. Nevertheless, I still found it rather convenient for leisurely browsing the web or doing quick jobs on Microsoft Office.

However, I’ve been noticing that the netbook has become slower and slower with the passing months, to the point that, lately, it’s been as useful as a self-heating paperweight. That’s when I decided to do a fresh install of Ubuntu Netbook Edition and use it only for web browsing.

After no little struggle to get Ubuntu NE installed on an SD card (I ended up using the Universal USB Installer in Parallels, since the solution described in the Mac section creates a filesystem that apparently will only boot on a Mac machine; on a PC, it is recognized as “unformatted”), I finally booted into Ubuntu NE, only to discover that the default installation does not include the proprietary Broadcom firmware for the (in)famous B43 kernel module. But, wait ! You can still install it quite easily, by clicking an icon on the system notification area: “Install Additional Drivers”. The only thing is: you have to connect to the Internet. Which you can’t, since you don’t have the drivers in the first place. Talk about a Catch-22 !

Fortunately, I found a walkthrough to solve that circular dependency. I quote:

If you do not have any other means of Internet access on your computer, you will have to install b43-fwcutter and patch packages from the install media. After that you will need to setup firmware manually (without the firmware automatically downloading and being set up).

Step 1

b43-fwcutter is located on the Ubuntu install media under ../pool/main/b/b43-fwcutter/ and patch is located under ../pool/main/p/patch/ or both in the official repositories online. Double click on the package to install or in a terminal (under the desktop menu Applications > Accessories > Terminal) navigate to the folder containing the package and issue the following command:

/b43-fwcutter/$ sudo dpkg -i b43-fwcutter*

Step 2

On a computer with Internet access, download the required firmware files from http://downloads.openwrt.org/sources/wl_apsta-3.130.20.0.o and http://mirror2.openwrt.org/sources/broadcom-wl-4.150.10.5.tar.bz2

Step 3

Copy the downloaded files to your home folder and execute the following commands consecutively in a terminal to extract and install the firmware:

~$ tar xfvj broadcom-wl-4.150.10.5.tar.bz2
~$ sudo b43-fwcutter -w /lib/firmware wl_apsta-3.130.20.0.o
~$ sudo b43-fwcutter --unsupported -w /lib/firmware broadcom-wl-4.150.10.5/driver/wl_apsta_mimo.o

Step 4

Under the desktop menu System > Administration > Hardware/Additional Drivers, the b43 drivers can be activated for use. Note: A computer restart may be required before using the wifi card. LiveCD/LiveUSB Note: The install media contents are mounted under /cdrom of the filesystem.

Step 5

For temporary use with the LiveCD and LiveUSB environments, instead of a computer restart, in a terminal issue the following commands:

~$ sudo modprobe -r b43 ssb
~$ sudo modprobe b43

Note: Allow several seconds for the network manager to scan for available networks before attempting a connection.

End of the drama ? And they connected happily ever after ? Not quite. I’ve got the wireless card working — for about 4 or 5 seconds. And then it would either drop the connection, or fail to connect altogether, or even stop showing the available networks. What sortilege could be keeping my card from its ethernet blessing ?

A quick inspection of dmesg revealed the matter:

b43-phy0 ERROR: Fatal DMA error: 0x00000400, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000
b43-phy0 ERROR: This device does not support DMA on your system. Please use PIO instead.
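The check above can be scripted. What follows is only a minimal sketch: the function name `b43_needs_pio` is made up for illustration, and the pattern is based solely on the two error lines quoted above, so other kernel versions may word the message differently.

```sh
#!/bin/sh
# Hypothetical helper (the name is an assumption, not part of any b43 tool):
# reads kernel log text on stdin and succeeds (exit status 0) if it
# contains a b43 DMA error like the ones quoted above.
b43_needs_pio() {
    # -E: extended regex, -q: print nothing, only set the exit status
    grep -Eq 'b43-phy[0-9]+ ERROR: .*DMA'
}

# Usage (assumes the current user is allowed to read the kernel log):
#   dmesg | b43_needs_pio && echo "b43 reports DMA errors; PIO mode may help"
```

Reading from stdin instead of calling dmesg inside the function keeps the check usable on a saved log file as well.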

Again, Saint Google had a pointer to the solution. To load the modules during an active session, type:

sudo modprobe -r b43 ssb
sudo modprobe b43 pio=1 qos=0

To make the settings permanent, type:

sudo touch /etc/modprobe.d/b43.conf
echo "options b43 pio=1 qos=0" | sudo tee -a /etc/modprobe.d/b43.conf
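One caveat with the `tee -a` line above: it appends every time it is run, so repeating the step leaves duplicate lines in the file. A possible way around that is sketched below; the function name `ensure_modprobe_option` is hypothetical, and the sketch assumes a POSIX shell with a standard grep.

```sh
#!/bin/sh
# Hypothetical helper (not part of modprobe or Ubuntu): append an options
# line to a modprobe config file only if that exact line is not there yet.
#   $1 = path of the config file
#   $2 = the full options line
ensure_modprobe_option() {
    conf_file=$1
    option_line=$2
    # -x: match whole lines only, -F: fixed string (no regex), -q: quiet.
    # A missing file makes grep fail, so the line is then written as well.
    if ! grep -qxF -- "$option_line" "$conf_file" 2>/dev/null; then
        printf '%s\n' "$option_line" >> "$conf_file"
    fi
}

# Usage (as root, since /etc/modprobe.d is normally not user-writable):
#   ensure_modprobe_option /etc/modprobe.d/b43.conf "options b43 pio=1 qos=0"
```

Running it twice leaves a single copy of the line, which makes the step safe to repeat while experimenting.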

Then, and only then, did I get the system working !

Two more hints:

  1. If you are trying to install Ubuntu on the netbook and the installer is stalling, try unchecking both options (download updates, install third party proprietary software). You can always do those later !
  2. After going through all the pains above to get everything working, don’t use the “Install Proprietary Hardware” automated GUI of Ubuntu: it will only mess everything up.

(When computers were created, weren’t they supposed to solve our problems ?)

Posted in technology