Paper published at CVIU

Our paper “Pooling in image representation: the visual codeword point of view” has been published in the May issue of the Computer Vision and Image Understanding journal (CVIU). The paper is available at the publisher’s site (DOI: 10.1016/j.cviu.2012.09.007). The last preprint is also available on my publications page.

In this paper, we explore and extend the bags-of-visual-words formalism. We propose a new pooling function, based upon preserving information about the distances between the low-level descriptors found in the image and the codewords in the visual dictionary. That density-based approach allows the creation of more compact representations than the parametric approaches (based upon moments of multidimensional Gaussians) commonly found in the literature. Here’s the abstract:

In this work, we propose BossaNova, a novel representation for content-based concept detection in images and videos, which enriches the Bag-of-Words model. Relying on the quantization of highly discriminant local descriptors by a codebook, and the aggregation of those quantized descriptors into a single pooled feature vector, the Bag-of-Words model has emerged as the most promising approach for concept detection on visual documents. BossaNova enhances that representation by keeping a histogram of distances between the descriptors found in the image and those in the codebook, preserving thus important information about the distribution of the local descriptors around each codeword. Contrarily to other approaches found in the literature, the non-parametric histogram representation is compact and simple to compute. BossaNova compares well with the state-of-the-art in several standard datasets: MIRFLICKR, ImageCLEF 2011, PASCAL VOC 2007 and 15-Scenes, even without using complex combinations of different local descriptors. It also complements well the cutting-edge Fisher Vector descriptors, showing even better results when employed in combination with them. BossaNova also shows good results in the challenging real-world application of pornography detection.

I’d like to shoot a small video, exploring the bags-of-words model in general, and this work in particular, later this month — let’s hope I can find the time !
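
Meanwhile, to give a feeling for what “keeping a histogram of distances” means, here is a toy sketch in Python. It is a simplification, not the code used in the paper: the function name, the fixed distance range (max_dist) and the number of bins are assumptions chosen for illustration, and the normalization and weighting steps of a real system are left out.

```python
import math


def distance_histogram_pooling(descriptors, codebook, num_bins=4, max_dist=2.0):
    """Pool local descriptors into a single vector.

    For each codeword, build a histogram of the Euclidean distances between
    that codeword and the descriptors, then concatenate the histograms.
    """
    bin_width = max_dist / num_bins
    pooled = []
    for codeword in codebook:
        hist = [0] * num_bins
        for desc in descriptors:
            d = math.dist(desc, codeword)
            if d < max_dist:  # descriptors too far from the codeword are ignored
                # min() guards against rounding at the upper edge of the range
                hist[min(int(d / bin_width), num_bins - 1)] += 1
        pooled.extend(hist)
    return pooled


# Toy data (invented): three 2-D descriptors and a codebook of two codewords
descriptors = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0)]
codebook = [(0.0, 0.0), (0.0, 2.5)]
print(distance_histogram_pooling(descriptors, codebook))
```

The output has one block of num_bins counts per codeword, so its length is the codebook size times num_bins.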

Posted in publications, science

Paper published at JASIST

Prof. Jacques Wainer and I had our paper “What happens to computer science research after it is published ? Tracking CS research lines” issued for early view in the Journal of the American Society for Information Science and Technology (JASIST) (DOI: 10.1002/asi.22818). The last preprint, before the publisher’s corrections, is also available on my publications page. Here’s the abstract :

Are computer science papers extended after they are published ? We have surveyed 200 computer science publications, 100 journal articles, and 100 conference papers, using self-citations to identify potential and actual continuations. We are interested in determining the proportion of papers that do indeed continue, how and when the continuation takes place, and whether any distinctions are found between the journal and conference populations. Despite the implicit assumption of a research line behind each paper, manifest in the ubiquitous “future research” notes that close many of them, we find that more than 70% of the papers are never continued.

In this paper we try to shed light on that “early stopping” phenomenon. Why do so many CS papers stay in the “first idea” phase ? Does this interact with the atypical value that CS attributes to conferences ? Is there a correlation (positive or negative) between any “quality” metric of the work and the probability of a continuation popping up ?

My colleague and friend Jacques is a specialist in scientometrics and in social networks of cooperating scientists, which he analyses through webs of publications, co-publications, citations and co-citations. We spent about a year discussing the best statistical tools to tackle such complex phenomena, and then trying to translate the results back into socially meaningful conclusions. I’ll let you judge how much we have succeeded in the latter effort.
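
To make the idea of using self-citations as a filter concrete, here is a toy Python sketch. It only illustrates the first step (a citing paper is a candidate continuation if it shares at least one author with the cited paper); deciding whether a candidate is an actual continuation requires human judgment, which no snippet captures. The function names and the sample data are invented for illustration.

```python
def is_self_citation(cited_authors, citing_authors):
    """A citation counts as a self-citation when both papers share an author."""
    return bool(set(cited_authors) & set(citing_authors))


def potential_continuations(paper, citing_papers):
    """Return the citing papers that share at least one author with `paper`."""
    return [c for c in citing_papers
            if is_self_citation(paper["authors"], c["authors"])]


# Invented toy data
paper = {"title": "A first idea", "authors": ["A. Author", "B. Author"]}
citing = [
    {"title": "Follow-up", "authors": ["B. Author", "C. Author"]},
    {"title": "Unrelated survey", "authors": ["D. Author"]},
]
print([c["title"] for c in potential_continuations(paper, citing)])
```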

Posted in publications, science

Excuse me while I go wrap my head in tin foil

I never imagined that this blog would degenerate into so much ranting about technology. Maybe it is something about the 13 in 2013 ? Or am I becoming grumpy as I age ? (My detractors would reply that I’ve always been grumpy.)

I have Google Chrome, Apple Safari and Mozilla Firefox installed on my MacBook because, more than twenty years after the inception of the Web, designers still can’t agree on which browser to support (and browser makers can’t agree on which standards to commit to). I switch defaults every few months, as I get fed up with the inconveniences du jour. Safari is the current default.

Now the horror story: whenever I open Chrome, it asks me for access to the keys of one of my blogs. A blog whose existence it is not supposed to know. It is not in my bookmarks, it is not in my Google Account, it is not in the browsing history (supposedly: I’ve cleared it “from the beginning of time”, but that probably means only that I don’t have access to it anymore). Yet, every time I open Chrome, it asks me for the keys to the private content of that blog. Twice. So far, I’ve resisted.

Google Chrome requesting access to OS X keychain

I’ve looked around, and there’s a general procedure to solve Chrome keychain indiscretions in OS X, which consists of opening Utilities / Keychain Access and deleting the Chrome Safe Storage item. The procedure actually works, but it results in Chrome not syncing with the Google Account anymore. If I type my password to link Chrome to the KGB, the interrogation starts again, meaning that the Central Agency already knows about my secrets.

I might need a double layer on that tin foil hat.

EDIT April 1st : Victory at last ! I’ve found out, deep inside the bowels of Chrome, how to delete the information. Hit Chrome’s tools menu (in the upper right corner) and hit Settings (or on a Mac just hit Command+, ). Go to the bottom of the page and click on Show advanced settings. In the Passwords and forms section, click on Manage saved passwords. There you’ll find Chrome’s stash of sites whose passwords it wants to follow. Delete the offending entries (by clicking on the little “x” that appears when you hover), and Chrome will leave you alone.

Posted in technology

Sharing Mathematica with Yourself

Maybe I forgot to click on some half-hidden checkbox when I installed it, but I’ve found out that my Mathematica 9 copy was working for only one of the users on my MacBook. That is annoying, because I keep separate users for my everyday usage and for giving presentations and classes (so there is zero risk that one of my friends suddenly appears on Skype telling a dirty joke in the middle of a presentation to the School’s president).

But I’ve found out the problem is easy to remedy. There are three folders where Mathematica 9 searches for the license files : $BaseDirectory/Licensing, $InstallationDirectory/Configuration/Licensing, and $UserBaseDirectory/Licensing (open a new Mathematica notebook and evaluate $BaseDirectory, etc. to find out exactly what the paths are in your system). Sure enough, mine was in $UserBaseDirectory/Licensing, meaning it was accessible by just that user.

This simple sequence of commands solved the problem :


$ sudo su
$ mkdir -p $BaseDirectory
$ mv $UserBaseDirectory/Licensing $BaseDirectory

Again, be sure to replace the $variables above with the correct paths. I’ve double-checked the permissions, and mine were already ok (all users had read permission). If that’s not the case for you, try this command :

$ chmod -R a+rX $BaseDirectory

And that is all. (I hope that doesn’t violate any terms, but I can’t see why it would : this kind of Mathematica license is per machine, and even if it were per user, well, both users are the same person, and they are never both “on” at once, are they ?)

I don’t know if this Mathematica single-machine/multi-user license problem (or solution) applies to systems other than Mac OS X. If you find out, I’d be glad to know.
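
If you would rather check where the Licensing folder currently lives before moving anything, a small Python sketch like the one below can do it. The paths in the example are placeholders (use the values that Mathematica prints for $BaseDirectory and friends), the function name is invented, and the check is deliberately crude: it only looks at the read bit for “other” users on the folder itself.

```python
import os
import stat


def find_license_folders(candidate_dirs):
    """Report, for each candidate directory, the state of its Licensing folder.

    Each directory maps to "missing", "world-readable" or "restricted".
    """
    report = {}
    for base in candidate_dirs:
        licensing = os.path.join(base, "Licensing")
        if os.path.isdir(licensing):
            mode = os.stat(licensing).st_mode
            # S_IROTH is the read permission bit for users other than the owner/group
            report[base] = "world-readable" if mode & stat.S_IROTH else "restricted"
        else:
            report[base] = "missing"
    return report


# Placeholder paths: replace them with the real values from your system
candidates = ["/path/to/BaseDirectory", "/path/to/UserBaseDirectory"]
for base, status in find_license_folders(candidates).items():
    print(base, "->", status)
```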

Posted in technology

Ad blocking, Paid content

With all the ado about Google Glass, it seems that wearable augmented reality is fashionable again, despite previous attempts having failed to take root.

But suppose that the technology does take root. A lot of ink is flowing about how it would revolutionize advertising, by allowing on-the-fly personalized ads that integrate with the user’s daily experience. The concept has even been parodied.

But suppose the opposite happens. Just as ad-blocking is gaining momentum in our browsers, couldn’t the concept work for real life ? Two days ago I sent an e-mail to a friend, who studies architecture and urban design :

Subject : One crazy idea

You know those glasses of Google that created so much buzz last year ? And those ad-blocking browser add-ons ? How about combining the two : a virtual urban intervention — virtual augmented glasses that take ads from outdoor panels, subway panels, electronic panels and exchange them for fine art paintings or beautiful nature scenery.

What I didn’t know is that Ad-Block+ themselves had already imagined the concept, as an April Fools’ joke.

But must the idea be confined to the realm of jokes ? I don’t know about the state of the art of image registration in augmented reality, but the image recognition portion seems to me completely feasible.

Would some Computer Graphics geek like to work with me on a prototype ? (Or would we be setting ourselves up to be killed by mobs of contemporary Mad People ?)

* * *

Craziest idea : a new World Wide Web where instead of giving up your private data and drowning in ads, you pay for both the services and the content. At a cost high enough to let you demand that your data remains yours, that your privacy remains inviolate, and that your eyes are spared any ads.

(Okay, no need to lock me up in the funny farm.  Forget I said anything.)

Posted in leisure, science, technology

Goodbye, Google+

When Google+ started, I had the feeling it could become the social network for “the rest of us”, who don’t have a Facebook account for a reason. Those of us who want to share ideas, interesting stuff, without flinging open our personal lives online, without bothering our friends with endless invites to play some silly minigame.

It seems, though, that just as the service starts to get critical mass, Google has decided to make it “Orkut 2.0”, i.e., to turn it into yet another classical social network. Thanks, but no, thanks.

While the service had been getting slowly but steadily more intrusive over the last months, I was under the impression that it was just bothering me, and that if I was careful enough with all the “new terms of service”, and “discover new features”, and “new settings” screens, I could keep enjoying it. But then my sister came to me and said:

“— Could you just stop sending me those Google+ invites? It’s annoying.”

“— Say what ?!”

I found out that for weeks Google+ has been sending her e-mails every other day, telling her how much I want to keep in touch with her, and how much she is missing out on my updates by not joining the service. In my name. This, of course, is a huge breach of trust, and I don’t think I’ve overreacted by deleting my profile immediately. Call it ragequit, if you want, but it’s over.

If you have a Google+ account, ask around : you might be, unknowingly, annoying friends, family or, worse, colleagues and clients. If you had the same experience as me, I’d love to hear from you. I’d be reassured to know I wasn’t the victim of a freak bug or something.

Posted in technology

Talk at LIP6 Lab, UPMC, Paris

I’m giving my “Scalability Issues in Multimedia Information Retrieval” talk at the LIP6, UPMC, this Tuesday, February 19th, at 13h. I thank my colleague and former advisor Prof. Matthieu Cord for this opportunity.

Posted in science

Apple Blues

I don’t know what is happening. Maybe the ghost of Steve Jobs is haunting my devices. Or maybe Apple devices only work as they are supposed to for the fervent believers. The fact is that both my iPhone 4S and my MacBook Pro (13″, circa 2010) have taken a nasty turn south since the last software update.

Both are experiencing network issues. The iPhone is often on EDGE when everyone around is on 3G. Turning it off and on, or turning 3G off and on in Settings, sometimes solves the problem. The MacBook is having issues with WiFi access: the nasty “!” now often appears over the menu bar icon. Again, rebooting the device usually solves the problem.

But the MacBook Pro is the most infuriating. The rainbow wheel of hell has decided to test my patience, appearing even after seemingly innocent actions (changing fields in a web form, for example). Only a month ago, I reinstalled OS X on this machine from scratch because it had become too slow, so the reappearance of this problem after such a short interval is particularly annoying.

The main problem, I guess, is psychological. Whenever my Apple devices start to act up, I can’t avoid feeling cheated. I have paid a premium to have something that works smoothly, out of the box. If I am to Google for solutions every other Wednesday, and spend 3 hours hacking configuration files, I might as well switch to a cheaper (and more open) platform.

(Erratum 17/2: I had also blamed an autocorrect bug on the new iOS update, but it seems that it only occurs in Safari, and then only on a few sites, so I have been hasty in my anger.)

Posted in technology

Talk at the I3S Lab, Université de Nice, Sophia-Antipolis

As part of my visit to the I3S Lab, I’m giving a talk on February 11th :

Title: Scalability Issues in Multimedia Information Retrieval
Where: I3S conference room (level 0)
When: Monday, February 11th, at 14h00

Abstract:
The Millennium marked a turning point for textual Information Retrieval, a moment when Search Engines and Social Networks changed our relationship to the World Wide Web: gigantic corpora of knowledge suddenly felt friendly, accessible and manageable. Ten years later, the same phenomenon is happening for complex non-textual data, including multimedia. The challenge is how to provide intuitive, convenient, fast services for those data, in collections whose size and growth rate are so large that our intuitions fail to grasp them.

Two issues have dominated the scientific discourse when we aim at that goal: our ability to represent multimedia information in a way that allows answering the high-level queries posed by the users, and our ability to process those queries fast.

In this talk, I will focus on the latter issue, examining similarity search in high-dimensional spaces, a pivotal operation found in a variety of database applications, including Multimedia Information Retrieval. Similarity search is conceptually very simple: find the objects in the dataset that are similar to the query, i.e., those that are close to the query according to some notion of distance. However, due to the infamous “curse of dimensionality”, performing it fast is challenging from both the theoretical and the practical point of view.

I have selected for this talk Hypercurves, my latest research endeavor, which is a distributed technique aimed at hybrid CPU–GPU environments. Hypercurves’ goal is to employ throughput-oriented GPUs to keep answer times optimal, under several load regimes. The parallelization also poses interesting theoretical questions of how much we can optimize the parallelization of approximate k-nearest neighbors, if we relax the equivalence to the sequential algorithm from exact to probabilistic.

The talk will be in English. I thank my colleague and friend Prof. Frédéric Precioso, for this opportunity.
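
For readers who have never met similarity search, the “conceptually very simple” version mentioned in the abstract is an exhaustive scan: compute the distance from the query to every object and keep the k closest. The Python sketch below shows that exact baseline with Euclidean distance. The function name and the toy data are only illustrative, and this is not Hypercurves, just the operation that indexing techniques try to speed up.

```python
import heapq
import math


def k_nearest_neighbors(query, dataset, k):
    """Exact k-nearest-neighbours search by exhaustive scan.

    Computes one Euclidean distance per object in the dataset and returns
    the k closest objects, nearest first.
    """
    return heapq.nsmallest(k, dataset, key=lambda point: math.dist(query, point))


# Toy 2-D dataset (invented); real descriptors have hundreds of dimensions
dataset = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.5, 0.0)]
print(k_nearest_neighbors((0.0, 0.1), dataset, k=2))
```

The cost grows linearly with both the number of objects and the number of dimensions, which is why the scan stops being practical on very large collections.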

Posted in science

Keep your iPhone alive in France

I’ve traded my tropical Brazilian summer for what turned out to be a harsh French winter. My brain is delighted to exchange ideas with my colleague (and former advisor) Prof. Matthieu Cord at the Université Pierre et Marie Curie, and my colleague (and old friend) Prof. Frédéric Precioso at the Université de Nice Sophia Antipolis. My nose, however, is complaining a lot.

* * *

I’ve heard that traveling with a smartphone makes for a much enhanced experience. Living it first-hand, I realize the huge difference some simple things make when you are abroad, like having Google Maps at your fingertips. Apps for specific destinations are la cerise sur le gâteau, the cherry on the cake (the Parisian metro/bus company RATP has a terrific one, also available for Android).

However, keeping the gourmand smartphone alive without breaking the bank might be a challenge. The solution I adopted was buying a pre-paid local SIM card. Making it happen, however, is harder than it should be.

As far as I know, none of the main French carriers has interesting pre-paid Internet offers in micro-SIM format. Among the pre-paid mini-SIM offers, Orange’s Internet Max is interesting : it gives you “unlimited” (actually something around 500 Gb) data access for a month.

  1. Having chosen to go with Orange Fr, and their pre-paid offer, the Mobicarte, I tried to buy it online, so it would be waiting for me when I arrived. Their damn online shop, however, would not take my credit card. Maybe you’ll be luckier ?
  2. So, one day after I landed, I went to a physical shop and asked for a Mobicarte (priced around 10€). They asked in which phone I’d use it and I answered “oh, it’s an old Nokia, I was using a friend’s Mobicarte in it but now I have to return it”. (No, my pants weren’t on fire.)
  3. They asked me for an identity document. I didn’t have my passport on me, but they accepted my Brazilian identity card. If in doubt, I think it’s safer to bring the passport.
  4. I also bought some credit, around 25€, to complement the 5€ of credit that comes with the Mobicarte.

All that was incredibly easy, although I kept reading stories around the net on how buying the chip is a nightmare. Maybe the secret is to feign ignorance : don’t mention the words “iPhone”, “Android” or “Internet Max”, and you should be safe.

Now, the difficult parts are : (I) to make the card fit in the iPhone ; (II) to be patient enough to go through all the hoops in order to activate the Internet Max options.

Part I is a matter of having guts. I’ve heard of people who use their own (or a friend’s) micro-SIM as a template, and a very sharp kitchen knife, with good results. I’m not so brave, so, just before my trip, I ordered this gadget on Amazon.fr : a mini-SIM trimmer, which performs the operation without requiring much dexterity. It arrived two days after I landed. You can check both the trimmer and the results below :

SIM clipper used to convert a mini-SIM (a Mobicarte from Orange Fr) to a micro-SIM

One obvious recommendation is to clip and test the chip on the iPhone before loading the credits, so if something goes wrong you’re losing 10€ instead of 34€. I’m only emphasizing this because, as a good absent-minded professor, I went and loaded the credits immediately after buying them.

Activating the Internet Max is a matter of patience. First, chances are you’ll have to wait at least 24h after buying the chip, until the system identifies you. Otherwise you’ll receive a frightening message saying that you are not identified, and that your account will be cancelled, and that you’ll be guillotined at the place de la Concorde. Nothing like that will happen. Ignore the message and wait another day :

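Vous ne pouvez pas souscrire à l’option Internet Max car nous ne pouvons pas vous identifier. Vous devez vous rendre dans votre point de vente accompagné de votre pièce d’identité. Votre ligne sera rétablie quelques jours après l’enregistrement de vos coordonnées. Sans action de votre part, votre ligne sera suspendue un mois après l’activation de la ligne. Vous ne pourrez plus passer des appels et votre ligne sera résiliée deux mois après son activation.

(In English: You cannot subscribe to the Internet Max option because we cannot identify you. You must go to your point of sale with your identity document. Your line will be restored a few days after your contact details are recorded. If you take no action, your line will be suspended one month after its activation. You will no longer be able to make calls, and your line will be terminated two months after its activation.)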

Services menu of Orange Fr, invoked by calling #123#

Then, to activate the Internet Max option, invoke the services menu by calling the number #123#, then choose the options :

4 – Mon espace (My space)
3 – Ajouter une option (Add an option)
5 – Suite (Next)
1 – Internet + Mail
2 – Internet Max
1 – Suite
1 – Souscrire (Subscribe)
1 – Valider (Validate)

Or something like that, to be honest. The menu changes a lot, according to the current design and promotions, but you should be able to find your way.

Internet Max will cost you 9€ for a month, at current prices, but one word of warning : it doesn’t cover POP / SMTP / IMAP traffic, used, for example, by most e-mail apps and e-mail push notifications. If you’re on a budget, avoid using those apps, and disable push notifications for e-mail. Otherwise, you can buy an “Option mail” (following instructions very close to the ones above, but changing the 5th step), which covers unlimited traffic for those protocols, for 6€. So, the real cost for “unlimited” net is 15€ per month.

Finally, the most difficult step : waiting. Beware, because the Internet Max option will not be activated immediately : that will take 48h ! With the identification delay after buying the card, that makes for a minimum of three days of waiting, so this solution is only practical for longer trips.

Big thumbs-up for the guys at VeloNomad : I first learned about that possibility there. They seem to have a service that sends a French SIM to your home before the trip, so you can have your phone ready to communicate from day one. I discovered that service too close to my trip for it to be useful. If you use it, I’m curious to hear about your experience. They add these very good remarks :

  • Disable the 3G/data roaming before you put the SIM in the phone, or the credits might disappear surprisingly fast (if you are serious about avoiding roaming charges, it’s actually a good idea to do it at home, before turning off the phone in the plane) ;
  • Before diving into the net or mail, use it a bit and check if the credits are draining : if they are, the option is not yet activated ;
  • Obvious, but sometimes overlooked : ensure that your phone is unlocked before you leave home !

(Finally, my lawyer is telling me to stress that these instructions are provided in good faith, but that you should be careful, find information and make your own decisions : if you follow the instructions and your phone turns into a brick, or you are attacked by a flock of angry birds, I’m not liable.)

(Postscriptum : the intention here is having access to the web at a reasonable cost, so as to have a more enjoyable trip in France, not to cheat Orange Fr. Be reasonable : don’t go and tether all your party of 20 travelers on a single chip, don’t download an entire Hollywood worth of pirated movies, etc. In other words : be a conscientious hacker and don’t spoil this for everyone else.)

Posted in technology