My postgraduate offer for 2016/1 : Deep Learning From a Statistician’s Viewpoint

With few exceptions, my postgraduate offers follow a pattern. In the second semester, I offer my “101” Multimedia Information Retrieval course, which introduces multimedia representations, machine learning, computer vision, and… information retrieval. In the first semester, I offer a topics course, usually following a book : so far we have explored Bishop’s PRML, Hofstadter’s GEB, and Jaynes’ “Probability Theory”.

For 2016/1, I’m risking something different :

“Artificial Intelligence is trending again, and much of the buzz is due to Deep Neural Networks. Long considered untrainable, Deep Networks were boosted by a leap in computing power and in data availability.

Deep Networks stunned the world by classifying images into thousands of categories with accuracy, by writing fake Wikipedia articles with panache, and by playing difficult videogames with competence.

My aim here is a less “neural” path to deep models. Let us take the biological metaphors with a healthy dose of cynicism and seek explanations instead in statistics, in information theory, in probability theory. Remember linear regression ? Deep models are multi-layered generalized linear models whose parameters are learned by maximum likelihood. Let us start from there and then explore the most promising avenues leading to the current state of the art.

This course will be nothing like your typical classroom experience. There will be no lectures. We will meet once a week for an in-person session to discuss previous work, and plan our attack for the next week. I’ll expect you to continue working throughout the week. There will be no exams. I’ll grade your work based on participation during the sessions, progress between sessions, self-assessment, and peer assessment.

Active participation will be mandatory. This means (surprise !) talking in public. Everyone will be learning together, so all of us must accept the risk of being wrong. This course won’t work for those who always want to appear wise and knowledgeable. The course will be in English.

Deep networks can be seen as hierarchical generalized linear models.

We’ll be a cozy small group : at most 12 students. I’ll select the candidates based on a letter of intent, and on previous experience. Write a short e-mail to dovalle@dca.fee.unicamp.br. No need to be fancy : just state your reasons for participating, and any previous experience (academic, professional, and extra-curricular) with Machine Learning, Statistics, Probability, or Information Theory.

This course is not for beginners, nor for the faint of heart. We are jumping in head first at the deep (tee hee !) end. After all, we will delve into one of the most engaging intellectual frontiers of our time. I dare you to join us !”
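To make the announcement's claim concrete (deep models as multi-layered generalized linear models learned by maximum likelihood), here is a minimal sketch in plain Python. It is an illustration only, not course material : the toy data, the function names (`glm_layer`, `deep_model`, `fit_logistic`), the learning rate and the weights of the two-layer stack are all assumptions made up for this example. A single layer is ordinary logistic regression, fitted by gradient descent on the Bernoulli negative log-likelihood ; a deep model is obtained by feeding the outputs of one such layer into the next.

```python
import math
import random

def sigmoid(z):
    """Inverse of the logit link, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def glm_layer(weights, biases, x):
    """One layer: each unit is a logistic-regression GLM, sigmoid(w.x + b)."""
    return [sigmoid(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, biases)]

def deep_model(layers, x):
    """Stack GLM layers: the outputs of one layer are the inputs of the next."""
    for weights, biases in layers:
        x = glm_layer(weights, biases, x)
    return x

def neg_log_likelihood(w, b, data):
    """Bernoulli negative log-likelihood of a one-unit model (cross-entropy)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in data:
        p = glm_layer([w], [b], x)[0]
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total

def fit_logistic(data, steps=200, lr=0.1):
    """Maximum likelihood by plain gradient descent on the mean NLL."""
    dim = len(data[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in data:
            err = glm_layer([w], [b], x)[0] - y  # derivative of NLL w.r.t. the logit
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [w_j - lr * g / len(data) for w_j, g in zip(w, grad_w)]
        b -= lr * grad_b / len(data)
    return w, b

random.seed(0)
# Toy data (made up): the label is 1 when x0 + x1 > 0
data = []
for _ in range(100):
    x = [random.uniform(-1, 1), random.uniform(-1, 1)]
    data.append((x, 1 if x[0] + x[1] > 0 else 0))

nll_before = neg_log_likelihood([0.0, 0.0], 0.0, data)
w, b = fit_logistic(data)
nll_after = neg_log_likelihood(w, b, data)
print("NLL before and after fitting:", round(nll_before, 2), round(nll_after, 2))

# A two-layer stack with arbitrary weights, only to show the composition
layers = [([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # 2 inputs -> 2 hidden units
          ([[2.0, -2.0]], [0.1])]                   # 2 hidden units -> 1 output
output = deep_model(layers, [0.3, -0.2])
print("Output of the two-layer stack:", output)
```

The sketch only trains the single-layer case. Training the stacked version by maximum likelihood uses the same objective, with gradients propagated through the layers.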

Very important ! If you want to enroll in this course without being enrolled in the program (what UNICAMP awfully calls “special students”), you have to complete your pre-enrollment by 7/Dec/2015 (hard deadline !). Even if you are enrolled in the program (“regular student”), send me your application by 31/Dec/2015 at the latest, because I’ll select regular and special (urgh !) students at the same time.

EDIT 20/01 : I have sent the acceptance notices, and I am looking forward to working with a swell group of very motivated students !

What : Post-graduate course for the Master or Doctorate in Electrical Engineering program of UNICAMP (4 credits)

When : 2016/1st semester ; mandatory in-person meetings Tuesdays from 19 to 21h ; support meetings same day from 16 to 18h

Image credit : composite from Ramón y Cajal’s first publication showing a cerebellum cut, and scatterplots from Fisher’s iris dataset drawn by Indon~commonswiki, Wikimedia Commons.

Associate director of undergraduate studies

For the next few months I’ll be occupying the position of associate director of undergraduate studies of the Computer Engineering course, left by Prof. Ivan Ricarte, who got his full professorship at another academic unit of UNICAMP. Currently, the director is Prof. Helio Pedrini of the Institute of Computing. Prof. Akebo Yamakami has kindly agreed to be my “vice-associate”, an informal position that exists due to the direction being shared between two academic units. This is good news, because I’m a rookie when it comes to academic administration, while Prof. Yamakami has been involved in undergraduate studies direction since… forever. His experience will be invaluable.

I was appointed by the Electrical and Computer Engineering School steering committee in an indirect election, for a provisional mandate. Next June, the entire electoral college (faculty, staff and students) will vote for the next director here at FEEC, and for the next associate director at the Institute of Computing, since the positions switch between the two units at the end of the mandates. (I know, I know, it’s complicated, but you get used to the idiosyncrasies of Brazilian public administration after a while…)

I thank my colleagues of the steering committee for their trust.

Call for Contributions — Symposium of Signal Processing @ UNICAMP

The fifth edition of the University of Campinas Signal Processing Symposium (SPS-Unicamp) will take place this year on September 15-17.

This local symposium, promoted by the research community of São Paulo, is gaining importance as a dynamic, interactive event that offers young scientists the opportunity to network among themselves and with industrial partners.

The call for contributions is open. SPS-Unicamp welcomes paper and mini-course proposals in the following areas :

  • Biomedical engineering ;
  • Image and video processing, visualization and computer graphics ;
  • Signal processing applied to forensics, biometrics and bioinformatics ;
  • Control and automation ;
  • Seismic processing ;
  • Communications ;
  • Signal processing applied to sports science ;
  • Theory of signal processing ;
  • Hardware implementation of signal processing.

Papers can be written in either English or Portuguese. Both 4-page short papers and 1-page extended abstracts are accepted. Not only original works with results, but also works in progress and research-project papers are welcome.

Deadline : August 4th, 2014. 

For more information, please check SPS-Unicamp Homepage.

The IEEE Women in Engineering South Brazil student chapter, hosted at Unicamp, and the IEEE Signal Processing Society São Paulo chapter support this event.

Am I forgetting anything ?

I have just realized : the most important event in my professional life since the Ph.D. viva-voce defense went unannounced in this blog. I have been recently accepted as a faculty member of the Department of Computer Engineering and Industrial Automation (DCA) of the School of Electrical and Computer Engineering (FEEC) of the State University of Campinas (UNICAMP). I am now officially an absent-minded professor.

Balancing a faculty career, with its research, teaching and administrative obligations, is more challenging than people outside academia usually realize. For the last 5 years, I was exclusively focused on research, so I am rediscovering the thrill of being in a classroom. I am also discovering the painstaking work needed to sustain academic institutions, for, if their horizontal, democratic nature grants their members many freedoms, they require in return much debate, discussion and politics.

Nevertheless, I am loving every minute of my new duties. I know that the passing years take their toll, but for the moment, at least, I am in my element.

Back from the USA

I have just arrived (suitcases still to be unpacked) from my trip to the USA. This time, I went to Philadelphia for the MIR Conference, where I presented a poster on the work of my student Fábio Faria. I met many interesting people at MIR and heard exciting, new ideas from them, but (without any intention to dismiss the hard work of the organizers) I must confess I was expecting a more diverse array of works (especially considering how broad the “Multimedia” community is).

Instead, I was astonished by how similar the presented works were in their technical foundations: classification based on discriminative approaches (almost always using SVM) and representation based on “bags of visual features”. It is not that those do not interest me (after all, our own work sits squarely on those pillars), but I was very interested in hearing about and seeing other approaches: generative models based on latent or explicit semantics, representations based on constellation models, or (what do I know ?) perhaps something completely new, which I haven’t even heard about.

I was left wondering why those “competing theories” were so notably absent. Has the community decided that SVM + Bags of Features is so conspicuously better than everything else ? (If that is the case, I would like to know how they reached this conclusion — though I like the results given by the pair “bags + SVM”, I am far from considering the “case closed”).

Was it self-selection by the authors, who didn’t submit their works to this particular community ?

Or (and this is obviously the worst scenario) have all the alternative works been held back at the peer review barrier, because ideological considerations have (maybe unconsciously?) tainted the assessment of quality ? I would like to quickly dismiss this latter possibility, but the similarity between the works was really astounding. My student Otávio Penatti, who is in the first months of his Ph.D. (he was there presenting a demo of his M.Sc. work), remarked on it immediately.

I was very glad, nevertheless, to have this opportunity to visit Philadelphia. It was a very moving experience for me, because it gave me a very concrete, very immediate realization of how strongly The Enlightenment was shining in America at that time.

* * *

Otávio and I took advantage of our trip to the USA to visit Prof. Edward Fox at Virginia Tech, who was the Ph.D. advisor of Prof. Ricardo Torres, Otávio’s current Ph.D. advisor and my Post-Doc advisor. We have an ongoing cooperation with Prof. Fox. In fact, while we were there, we met a Brazilian colleague of ours, Nadia Kozievitch, who is spending a year of her Ph.D. with Prof. Fox.

While we were there, we gave a talk on our current work and got acquainted with several exciting projects Prof. Fox is conducting, on a broad array of applications of digital libraries, including identification of fingerprints, biodiversity databases, e-Science, cooperation for crisis situations, and education.

We also met Brazilian Prof. João Setúbal, who showed us the Virginia Bioinformatics Institute, and talked about his work in genomics, and the new field of transcriptomics.

We were very impressed not only with the infrastructure of Virginia Tech, but also with the kindness and attentiveness of everyone who received us.

Reasoning for Complex Data

Together with Prof. Anderson Rocha, Prof. Jacques Wainer, Prof. Ricardo Torres (my Post-Doc advisor, by the way) and Prof. Siome Goldenstein, I have recently founded a new laboratory at the Computing Institute of the State University of Campinas (UNICAMP).

The new lab — which we named RECOD — aims to embrace the research subjects of machine learning, multimedia retrieval and classification, multimodality and digital forensics.

The foundation of this new lab both celebrates a history of fruitful collaboration between its participating members and inaugurates a new phase of tighter cooperation, in which the synergy of our complementary competencies will be fostered in an optimized environment.

I cannot help being proud that my colleagues have accepted both my name and logo suggestions for the new lab.

Long live RECOD !

RECOD Lab Logotype, with the lab motto "reasoning for complex data"

Tutorial Accepted at SBBD 2009

My tutorial Similarity Search and Indexing for High-Dimensional Data has been accepted at SBBD 2009 (The Brazilian Symposium on Databases). Here’s the abstract:

Searching by similarity is a critical operation on many systems, and thus has attracted the attention of many disciplines in Computer Science, including Computational Geometry, Machine Learning, Multimedia and, of course, Databases. To perform efficiently, similarity search requires the support of indexing, which suffers from the infamous “curse of dimensionality”. In this tutorial we will introduce the challenges of indexing and searching high-dimensional data, and present the most recent tools available to “tame the curse”. At the end, the audience will have a good grasp of the current state of the art, the most promising research trends and the challenges still faced by the technology.
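One symptom of the “curse” mentioned in the abstract can be observed with a quick experiment: as the dimensionality grows, the nearest and the farthest points from a query end up at nearly the same distance, so distance-based pruning in an index loses its grip. The sketch below is only an illustration under assumed settings (uniform random data in the unit hypercube, Euclidean distance, 200 points); the function name `relative_contrast` is made up for this example and does not come from the tutorial.

```python
import math
import random

def relative_contrast(dim, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from a random query to random points."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))  # Euclidean distance (Python 3.8+)
    nearest, farthest = min(dists), max(dists)
    return (farthest - nearest) / nearest

# The contrast shrinks as the number of dimensions grows
for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(dim), 3))
```

When the contrast is close to zero, almost every point is "about as near" as the true nearest neighbour, which is why exact indexing tends to degrade towards a sequential scan in high dimensions.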

The tutorials, as I understand, are open to all participants at the conference. Mine will be held on Wednesday, October 7th from 14h40 to 18h20, with a 20-minute coffee break.

* * *

I’ve unintentionally let an awful lot of time pass since my last post: the move to Campinas (and to UNICAMP) has been wonderful, but also laborious. I thought that after moving across countries three times, moving across states would be a piece of cake, but it seems that, no matter the distance, moving is always a lot of hassle!

EDIT 11/11/09: The tutorial presentation, for the moment without narrative, is available on my talks and courses page.

Post-Doctoral Internship at UNICAMP

I am glad to announce that I’ve got a post-doctoral position at the Computing Institute of the State University of Campinas, where I’ll be supervised by Prof. Ricardo Torres.

Prof. Torres and I met last December at Cergy Pontoise, France, on the occasion of his visit to the ETIS Labs and my research internship at the LIP6 Labs. We quickly found common research interests and decided to submit a request for a post-doctoral scholarship to FAPESP, one of the biggest scientific sponsoring foundations in Brazil. We received the acceptance a few days ago.

During this new internship I will try to broaden some of the results of my thesis, and continue to work on scalability issues of Machine Learning and Information Retrieval. And I will continue to cooperate with my French colleagues (whom I am visiting in September, by the way), and with my current team, at the Federal University of Minas Gerais.

I am looking forward to starting it !

Upcoming Talk: Three New Methods for kNN Search

Prof. Ricardo Torres has invited me to the Institute of Computing of the State University of Campinas, where I am giving a talk on the work I’ve done in my thesis. I will explore the challenges of kNN search (also known as k nearest neighbours search, or simply similarity search) and discuss the three original methods I’ve proposed: the 3-way trees, which are based on the traditional KD-Tree with the addition of redundant overlapping nodes; the projection KD-Forests, my first attempt at using an index composed of multiple moderate-dimensional sub-indexes; and finally the Multicurves, an index based on the use of multiple moderate-dimensional space-filling curves, which has several nice properties like ease of implementation, dynamicity (tolerance to insertions and deletions without performance degradation) and avoidance of random accesses (thus making secondary-memory implementation easier).

The talk will be in Portuguese.
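For readers unfamiliar with space-filling curves, the general idea behind curve-based indexing can be sketched in a few lines. To be clear, this is not the Multicurves method itself: it is a simplified illustration that uses a single Z-order (Morton) curve over all dimensions, a sorted list standing in for a disk-based index, and made-up names and parameters (`z_order_key`, `approx_knn`, `BITS`, `window`). Points are sorted by their position along the curve; a query then inspects only one contiguous run of entries around its own position, which is what makes sequential, secondary-memory-friendly access possible.

```python
import bisect
import math
import random

BITS = 10  # bits kept per coordinate (arbitrary choice for this sketch)

def z_order_key(point, bits=BITS):
    """Map a point in [0, 1)^d to its position along a Z-order curve."""
    top = (1 << bits) - 1
    cells = [min(int(c * (1 << bits)), top) for c in point]  # quantize
    key = 0
    for bit in range(bits - 1, -1, -1):   # most significant bit first
        for cell in cells:                # interleave one bit per dimension
            key = (key << 1) | ((cell >> bit) & 1)
    return key

def build_index(points):
    """Sort the points by curve position (a sorted list stands in for a B-tree)."""
    entries = sorted((z_order_key(p), p) for p in points)
    keys = [key for key, _ in entries]
    ordered = [p for _, p in entries]
    return keys, ordered

def approx_knn(index, query, k, window=32):
    """Approximate kNN: rank only the points stored near the query on the curve."""
    keys, ordered = index
    pos = bisect.bisect_left(keys, z_order_key(query))
    # One contiguous slice: sequential access, no random jumps through the data
    candidates = ordered[max(0, pos - window): pos + window]
    return sorted(candidates, key=lambda p: math.dist(p, query))[:k]

def exact_knn(points, query, k):
    """Brute-force baseline: compare the query against every point."""
    return sorted(points, key=lambda p: math.dist(p, query))[:k]

rng = random.Random(42)
points = [tuple(rng.random() for _ in range(4)) for _ in range(2000)]
query = tuple(rng.random() for _ in range(4))
index = build_index(points)
approx = approx_knn(index, query, k=5)
exact = exact_knn(points, query, k=5)
print("approximate:", [round(math.dist(p, query), 3) for p in approx])
print("exact:      ", [round(math.dist(p, query), 3) for p in exact])
```

A single curve can miss true neighbours that happen to fall far apart along the curve, so the approximate distances are never better than the exact ones. Using several curves over lower-dimensional projections, as the description of Multicurves above suggests, is one way to reduce such misses.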