Workshop on Fairness in AI

Next Monday, June 27, I am organizing a workshop on issues around fairness, bias and discrimination in AI and Machine Learning.

Here is a link to the program. Remote participation is possible (link on the website), and in-person participation is free, but we ask people to register so we can print badges and order the appropriate number of coffee breaks.

This workshop is being organized in partnership with EDGE, an Italian NGO that works on LGBT rights, and it is the first event of their initiative “A+I: Algoritmi + Inclusivi”, which will feature an awareness campaign and a series of video interviews that will start after the summer.

In next week’s workshop, Oreste Pollicino from Bocconi will talk about the perspective of the legal community around algorithmic discrimination, Symeon Papadopoulos from ITI Patras will give a survey on issues of fairness in image processing and image understanding, Sanghamitra Dutta from J.P. Morgan AI will talk about how to use the theory of causality to reason about fairness, Debora Nozza and Dirk Hovy from Bocconi will talk about issues of fairness in language models and natural language processing, and Omer Reingold from Stanford and Cynthia Dwork from Harvard will talk about modeling and achieving fairness in prediction models.

The last morning session will be a panel discussion moderated by Damiano Terziotti from EDGE about perspectives from the social sciences and from outside academia. It will feature, among others, Brando Benifei, a member of the European Parliament who has played a leading role in the 2021 draft EU regulations on AI. The other panel members are Alessandro Bonaita, who is a data science lead at Generali (Italy’s largest insurance company), Luisella Giani, who is leading a technology consulting branch of Oracle for Europe, Middle East and Africa, Cinzia Maiolini, who is in the national secretariat of CGIL, an Italian union, and Massimo Airoldi from the University of Milan.

If you are in or near Milan next week, come to what is shaping up to be a memorable event!

unary communication

When Twitter started to become popular, I remember thinking that the premise of its service, that its distinguishing feature was its limitation, was ridiculous. (Remember never to ask me for investment advice.)

At the time, I thought that it would be really fun to create a parody site where you could only post one-bit messages. Clearly, the site would be called bitter, and when you log in the prompt would ask “Are you bitter?” and if you answered yes your post would be a frowny face, while if you answered no your post would be a smiley face. I went as far as checking that this didn’t seem too hard to pull off in Drupal, to make sure no such parody site existed already, and to see if or were available. (Of course they weren’t!)

Anyways, I was mistaken in thinking that two possible messages, and hence one bit of information, was the end of the road. Indeed, it is possible to have only one possible message, and this is the insight pursued by yo, which, apparently, is not a parody and has received one million dollars in funding.
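To spell out the arithmetic, here is a minimal sketch, assuming all possible messages are equally likely: picking one out of n messages conveys log2(n) bits, so two messages give one bit and a single possible message gives zero. The function name is made up for illustration.

```python
import math


def bits_per_message(num_possible_messages: int) -> float:
    """Bits conveyed by one message out of n equally likely ones (hypothetical helper)."""
    if num_possible_messages < 1:
        raise ValueError("need at least one possible message")
    return math.log2(num_possible_messages)


print(bits_per_message(2))  # two possible messages (frowny or smiley): 1.0
print(bits_per_message(1))  # only one possible message: 0.0
```

So a site with a single possible message carries no information in the message itself; whatever is communicated comes from who sent it and when.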

The long tail of free online education

Last Fall, three Stanford classes were “offered online” for free: Andrew Ng’s machine learning class, Sebastian Thrun’s AI class, and Jennifer Widom’s database class. There had been interest and experiments in online free education for a long time, with the MITx initiative being a particularly significant one, but there were a few innovations in last year’s Stanford classes, and they probably contributed to their runaway success and six-digit enrollment.

One difference was that they did not post videos of the in-class lectures. There was, in fact, no in-class lecture. Instead, they taped short videos, rehearsed and edited, with the content of a standard 90-minute class broken down into four ten-minute videos or so. This is about the difference between taping a play and making a movie. Then the videos came with some forms of “interactivity” (quizzes that had to be answered to continue), and they were released at the rate at which the class progressed, so that there was a community of students watching the videos at the same time and able to answer each other’s questions in forums. Finally, the videos were used in the Stanford offerings of the classes: the students were instructed to watch the videos by themselves, and during the lecture time they would solve problems, or have discussions or have guest lectures and so on. (In K-12 education, this is called the “flipped classroom” model, in which students take lectures at home and solve homework in class, instead of the traditional other way around.)

In the past few months, there has been a lot of thinking, and a lot of acting, about the success of this experiment. Sebastian Thrun started a company called udacity to offer online courses “branded” by the company itself, and Daphne Koller and Andrew Ng started a company called coursera to provide a platform for universities to put their courses online, and, meanwhile, Harvard and Berkeley joined MIT to create edX.

At a time when the growth of higher education costs in the United States appears unsustainable, particularly in second-tier universities, and when the demand for high-quality higher education is exploding in the developing world, these projects have attracted a lot of interest.

While the discussion has been focused on the “summer blockbusters” of higher education, and what they should be like, who is going to produce them, how to make money from them, and so on, I would like to start a discussion on the “art house” side of things.

In universities all over the world, tens of thousands of my colleagues, after they have “served” their departments teaching large undergraduate classes and maybe a required graduate class, get to have fun teaching a research-oriented graduate class. Their hard-earned insights into problems about which they are the world’s leading experts, be it a particular organ of the fruit fly or a certain corner of the Langlands program, are distilled into a series of lectures featuring content that cannot be found anywhere else. All for the benefit of half a dozen or a dozen students.

If these research-oriented, hyper-specialized courses were available online, those courses might have an audience of 20 or 30 students, instead of 100,000+, but their aggregate effect on their research communities would be, I believe, very significant.

One could also imagine such courses being co-taught by people at different universities. For example, imagine James Lee and Assaf Naor co-teaching a course on metric embeddings and approximation algorithms: they would devise a lesson plan together, each would produce half of the videos, and then at both NYU and UW the students would watch the videos and meet in class for discussions and working on problems; meanwhile study groups would probably pop up in many theory groups, of students watching the videos and working on the problem sets together.

So someone should put a research-oriented graduate course online, and see what happens. This is all to say that I plan to teach my class on graph partitioning, expander graphs, and random walks online in Winter 2013. Wish me luck!

Is the British government hiring Italian political consultants?

David Willetts, the British minister for higher education, has recently announced that the government was “inviting proposals for a new type of university with a focus on science and technology and on postgraduates.” The NYT article on the matter may be biased, but it looks like this announcement could have come from the Italian ministry of university and research, and I mean it as an offense.

So, how much will the government invest in this new university? “There will be no additional government funding,” Mr. Willetts says, and all the funding will have to come from the private sector. And what is the government’s vision and plan for this new university? Mr. Willetts says that “We are not intending to issue any guidelines. We want people to come to us with ideas.”

So the idea of the minister for higher education is that the private sector comes up with all the funding and all the planning for a new university. (Imagine the home secretary stating the goal of increasing the police force, but all the new police force would be paid for by the private sector, which, after all, has an interest in reducing street crime, and that it is not the intention of the home office to dictate how this private police force should operate.) This is exactly what an Italian minister would talk about at a press conference, only to be forgotten the following week.

The reaction from academia, however, is different. In Italy, you would see people throw their hands in the air and say “madonna mia, in mano a chi siamo,” while the British are masters of understatement.

“We at Oxford feel that keeping the U.K. a world leader in science and research is a very important objective,” said Ian Walmsley, Oxford’s chief research officer, “and we’re pleased that the government agrees with that.”

Stephen Caddick, of University College London, says the proposal is “not uninteresting”.

Should the NYT and the ACM report facts?

Last week, the public editor of the New York Times wrote a post asking the following question: when the paper reports a statement from a public figure which is not true, should it also report the fact that the statement is false?

The post received hundreds of comments, mostly of the form “Wait, what? Is this even a question?” and it has stimulated a rich online discussion, much of it equally incredulous. In fact, the reluctance of mainstream media to report facts (as opposed to reporting statements) has been a long-standing problem. More than eleven years ago, Paul Krugman wrote: “If a presidential candidate were to declare that the earth is flat, you would be sure to see a news analysis under the headline ‘Shape of the Planet: Both Sides Have a Point.’ After all, the earth isn’t perfectly spherical.”

Krugman was joking, but something rather similar happened when reports of prisoner abuse surfaced during the wars in Afghanistan and Iraq. The Times was taken to task for never referring to the abuse as “torture” and eventually there was an article (from the public editor at the time? I wasn’t able to find it) explaining that since the Administration had taken the position that whatever happened to the prisoners was not torture, to say otherwise would have meant choosing sides in a political controversy.

If you are reading in theory, you are probably aware of two bills making their way in the Senate and the House, called PIPA and SOPA, respectively, which have the goal of shutting down sites that illegally contain (or link to) copyrighted material. While the DMCA already allows the shutting down of web sites in such cases, it can only be enforced within the US. PIPA and SOPA would allow copyright holders to go after a foreign website by requiring Domain Name Servers to stop resolving the domain name, and requiring search engines to stop linking to the website.

Objections to the bills include concerns about the unintended consequences of giving the state the power and the technical ability to “censor” websites, about the security vulnerabilities that would arise from any tampering with the DNS protocols, and about the ramifications for search engines, both in terms of quality of results and of free speech considerations.

Meanwhile, the Research Works Act would forbid the NIH and other federal agencies from mandating open access publishing of the papers resulting from sponsored research. The intention of the bill, which is to restrict access to federally funded research, is an attack on the academic community, for which open access is an unqualified good thing.

As the leading association of technical and academic computer science professionals, one would expect the ACM to publicize the technical issues involved with SOPA and PIPA (the free speech issues are clearly a political issue on which the ACM might want to stay neutral) and to come out strongly in opposition to the Research Works Act, especially considering that ACM is a member of the Association of American Publishers, which has been the main lobbying force behind the bill, so that not speaking out is essentially the same as supporting the bill. The MIT Press, for example, which is a member of the AAP, has come out against the Research Works Act.

The President of the ACM, however, has stated his intention of keeping the ACM away from any discussion of SOPA, PIPA and the Research Works Act. (An ACM technical committee might produce a statement on SOPA and PIPA, but it would not be a position statement of the organization as a whole.)

Edited to add: the White House has come out against the DNS-blocking provision, and there are moves under way to amend both bills to remove DNS-blocking. Meanwhile, as Doug Tyger correctly points out in the comments, the United States ACM Public Policy Council has sent letters to Congress highlighting the technical concerns raised by the proposed legislation.

And where have you been?

As you may have heard, iPhones store, unencrypted, a list of locations where the phone has been while turned on. This information, like all the user information on the phone, is also backed up unencrypted on one’s computer when syncing the phone with iTunes.

There is a cute program, called iPhone tracker, which will read this information and plot it on a map, also creating a video showing where your phone (and you) have been over time.

This is my neighborhood:

And this is me attending TCC 2011:

What could possibly go wrong with this?

The keyboard that goes “CLICK”

When I got a computer for my new office at Stanford last year, it came with the Apple wireless keyboard, a piece of equipment that some people like very much, and that has a handsome and minimalist design. The bluetooth connection was, however, occasionally flaky, and it seemed silly to have a battery operated device sitting in front of a desktop. So I decided to buy a wired keyboard instead, and since everybody raves about how good it is to type on Lenovo laptops, I thought Lenovo would sell a keyboard made to feel like their laptops. Unfortunately there does not seem to be such a thing.

Searching for information about keyboards, however, I found a whole online cult devoted to the keyboards that IBM made for its PC in the 1980s: the IBM Model M keyboard. Although I never owned or used an IBM PC, I remember using similar keyboards when I was a graduate student in the mid 1990s, and we each had a terminal on our desk connected to a mainframe in the basement. The terminals had monochromatic, text-only displays, but the keyboards were good, and, every time you would press a key, they would go *CLICK*, just like the 1980s IBM keyboards. (The terminals were made by HP and were 1980s technology.)

The license/patents to make these keyboards went to Lexmark, when IBM spun off its printers/devices business. Making the keyboards, however, was not profitable because they never break — see for example this video of a Model M versus a watermelon.

Lexmark then sold the license/patents to Unicomp, an American company whose business is to make clones of the IBM Model M keyboard and other 1980s models.

So that’s what I got for my office and, while I feel rather self-conscious about showing enthusiasm about a keyboard, it is awesome. (To be precise, the keyboard I got is not an exact clone of the IBM model M: mine has a USB cable, a “Windows” key, and it works with a Mac without drivers. The Windows key becomes the “command” key. For the purists, it is possible to buy actual IBM models M, with a PS/2 interface which can be connected to a USB via converter, at, where they even have “mint condition never used” ones.)

This term, as readers of in theory might have noticed, I am writing notes for two classes, which means that I am typing for roughly 15-20 hours a week, mostly at home. Usually, at home I would work using the laptop on the sofa, and use my desk for storage, but this wasn’t good this term, so I (mostly) cleared the desk, got a monitor, a mouse, another, awesome Unicomp keyboard, and hooked it all up to my MacBook Air (which has new hinges, yay!).


I often read relatively long news articles or essays online, from blogs and from the websites of newspapers and magazines, and I am a big fan of readability. Clicking on the previous link takes you to a page where you choose the page width and font size and type that you prefer, and you get a button that you can drag to the link toolbar of your browser. Instead of being a link, however, it’s a javascript application that, when clicked on, clears up the page you are reading of all the junk. Only the text and (quite impressively) the right pictures stay. If your article spans multiple pages, and you need to click to advance to the next page, the application is usually able to collect all the pages together. Try it on an article in the New York Times web site, and be in awe as all the clutter disappears, or try it on a post in the complexity blog, and get away from the green background and from the comments.

Of course, for all I know, the application also collects all your passwords and sends them to a server in Russia, but I doubt it. Talking about losing all your passwords, I was shocked when I realized that, in Firefox, if you go to Preferences->Security->Saved Passwords, it gives you an option to show all the saved passwords in the clear. I can’t imagine any circumstance in which this would be a useful feature, and I can certainly imagine a lot of circumstances in which this is a terrible thing. If you “set a master password,” then it won’t show the saved passwords unless one enters the master password, but then one has to also enter the master password all the time, defeating the purpose of saving passwords in the browser in the first place.

I found out about readability from instapaper, which I also like very much, but which requires some setup (signing up, downloading the iPhone app) and makes sense only if one reads long online articles fairly often, while I think that readability is useful for everybody. Readability is also available as a Firefox add-on, but I much prefer the simplicity of dragging a button from their site to the toolbar.