Friday, February 16

Not The Emperor's New Security Studies

Andrew Patrick's Blog has interesting commentary on the research methodology used in The Emperor’s New Security Indicators: An evaluation of website authentication and the effect of role playing on usability studies (henceforth to be called TENSI:AEoWAatEoRPoUS for short or better, TENSI for shorter). The paper is to appear in the IEEE Symposium on Security and Privacy, and has already been noted in Wired, /., Computerworld and the NYT.

My gloss on the paper's result: a new, supposedly stronger security technology -- site authentication images -- already in use by several financial institutions, can easily be defeated. That's all you'll get from me; read the paper! I'm posting to comment on Andrew's critique of the paper. OK? We'll get there.

Before we go much further, full disclosure: 1) I work closely with Rachna (one of the paper's authors) now that she's a Fellow at CommerceNet; and 2) I reviewed earlier versions of the paper. I think the paper is terrific work and reinforces the skepticism many security people have felt about the silly PassMark technology for years.

Now finally to Andrew's post. He makes interesting points; many very good ones, but a few not so good. I want to pick on the not-so-good ones, but it's worth reading his post to pick up on his general methodological concern about user studies (and hence, focus groups -- product managers take note). He highlights bias introduced by research settings that may be particularly pernicious for security usability studies:
  1. Task Focus. Emphasis on a study's inevitably artificial task would tend to reduce the subject's attention to secondary matters which might be the study's real purpose. In other words, tell the subject to look up his bank balance with the web browser and he might be so fixated on pleasing the white-coated lady with the clipboard that he overlooks the little lock he would otherwise pay attention to. That is a big problem if the study's real aim is to see whether people look for the little lock, even though I'm damn sure nobody pays attention to it outside the lab either. This is science!
  2. Obedience To Authority. As was infamously shown by the Milgram experiment, study participants are astonishingly willing to harm others at the behest of the researcher. One supposes they might be willing to harm themselves (i.e., access their online bank insecurely) to please the researcher by continuing with the study.
It certainly might be better if we could do studies that were more like computer usage in the wild. I'm all for it. But in the meantime, lab user studies are very useful and some of Andrew's other comments are quite off-base.

Selection Bias? He faults the recruiting procedure of the study, suggesting that security-minded people were selected against:

It is important to note that 21 people were recruited but were not able to participate in the study. Three of these people refused to sign the consent form and explicitly expressed concerns about the privacy of their banking data. Also, five people stated they could not remember their login information when it came time to complete the tasks. These people may actually have had concerns about using their personal banking information for the study, and used forgetting of their login information as an excuse for not completing the study.

The result is that up to 8 people with, expressed or perhaps hidden, concerns about providing private information during the experiment were excluded from the study. This means that the final group of participants was biased towards people who did not have concerns about their banking privacy in this setting. We know that people have different defaults attitudes towards their personal privacy, with some research showing that 27% are privacy liberals, with few concerns, 17% are privacy fundamentalists, with serious concerns, and the remaining 56% being privacy pragmatists (see Cranor, Reagle, Ackerman, 1999). The recruiting procedures for this study appears to have eliminated the privacy fundamentalists, and thus biased the results towards the liberals and pragmatists who were willing to continue banking in spite of repeated security/privacy indicators.

Perhaps, but this is only a hypothesis, and not a persuasive one. For one thing, it conflates greater concern about privacy with greater vigilance when it comes to security indicators. Just because a prospective subject was more worried than others that researchers might learn their secrets doesn't at all imply that they would have insisted on seeing their security image.

Then there is the trick in going from 3 to 8. It would be nice to have an investigation of the ratio of password resets to successful logins at financial sites (my guess: >10%). Would you really guess that all 88 people could recall their passwords under these conditions?

Now look at the real demographics in the study: 88 prospective subjects, only 3 of whom actually declined to participate by explicitly citing privacy concerns. Sounds different from 21 out of 67, doesn't it? [12 prospects were eliminated because they were unfamiliar with the bank's website. As an aside, Andrew, show more respect for the authors' scrupulous detailing of study participant selection!]
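
For anyone who wants the fractions spelled out, here is a minimal Python sketch of the arithmetic. It uses only the counts quoted above (88 recruited, 67 participants, 3 explicit refusals, 5 forgotten logins); the variable names and the `share` helper are illustrative, not anything from the paper.

```python
# Counts as quoted in this post; names are illustrative assumptions.
RECRUITED = 88      # all prospective subjects
PARTICIPATED = 67   # subjects who completed the study
REFUSED = 3         # declined, explicitly citing privacy concerns
FORGOT_LOGIN = 5    # said they could not remember their login


def share(count, total):
    """Return count/total as a percentage, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# Only the people who actually cited privacy concerns.
explicit_refusals = share(REFUSED, RECRUITED)

# The generous reading: every forgotten login was a hidden privacy worry.
upper_bound = share(REFUSED + FORGOT_LOGIN, RECRUITED)

# The framing that sounds alarming: everyone excluded, over participants.
excluded_over_participants = share(RECRUITED - PARTICIPATED, PARTICIPATED)

print(f"explicit refusals:          {explicit_refusals}% of recruits")
print(f"refusals + forgotten login: {upper_bound}% of recruits")
print(f"all excluded / participants: {excluded_over_participants}%")
```

Even on the generous reading, the excluded group is under a tenth of the recruits, and that still assumes every forgotten password was a privacy protest.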

If they don't understand, whose fault is that? I'm frankly baffled that Andrew apparently believes that the subjects' lack of awareness regarding their online risks is a fault of the study:

Another flaw in the current experiment is failing to confirm that the participants perceived they were taking risks when they persisted to login to the bank site without security indicators. Although the researchers feel that failing to heed the security warnings as risky behavior, did the participants see it that way? Most people have not personally experienced harm because of their behavior on the Internet, so the perceived risk may be have been very low.

Uh, that's the point of this type of study? Users don't realize that ignoring security warnings is bad. Secure UI designers had better take that into account.

Also, people who have been using the Internet for any time have become accustomed to logging in to systems where there are no security indicators, and it is often necessary to do tasks on the Internet (e.g., the common practice of including login forms on non-secure web pages). Participants may also have accounts at other banking sites that do not provide the SiteKey security indicators, so they may be being quite willing to login without the indicator.

Again, that's the point. The notion that we can roll out an indicator-of-the-month (well, an indicator that only works on one particular site) and expect users to interpret it, or its absence, is a bad one. Andrew thinks so too, as he says early in his comments:

In fact, I think any technique that requires users to notice something is different when they are doing their banking tasks is bound to have limited success.

Quite so. But we need studies demonstrating this, because important web sites persist in deploying security mechanisms which depend on users reliably noticing things that almost all users don't actually notice.