A famous New Yorker cartoon from 1993 showed two dogs at a computer, with one saying to the other, “On the Internet, nobody knows you’re a dog.”
That may no longer be true.
A new analysis of online consumer data shows that large Web companies are learning more about people than ever from what they search for and do on the Internet, gathering clues about the tastes and preferences of a typical user several hundred times a month.
These companies use that information to predict what content and advertisements people most likely want to see. They can charge steep prices for carefully tailored ads because of their high response rates.
The analysis, conducted for The New York Times by the research firm comScore, provides what advertising executives say is the first broad estimate of the amount of consumer data that is transmitted to Internet companies.
Privacy advocates have previously sounded alarms about the practices of Internet companies and provided vague estimates about the volume of data they collect, but they did not give comprehensive figures.
The Web companies are, in effect, taking the trail of crumbs people leave behind as they move around the Internet, and then analyzing it to anticipate people’s next steps. So anybody who searches for information on such disparate topics as iron supplements, airlines, hotels and soft drinks may see ads for those products and services later on.
Consumers have not complained to any great extent about data collection online. But privacy experts say that is because the collection is invisible to them. Unlike Facebook’s Beacon program, which stirred controversy last year when it broadcast its members’ purchases to their online friends, most companies do not flash a notice on the screen when they collect data about visitors to their sites.
“When you start to get into the details, it’s scarier than you might suspect,” said Marc Rotenberg, executive director of the Electronic Privacy Information Center, a privacy rights group. “We’re recording preferences, hopes, worries and fears.”
But executives from the largest Web companies say that privacy fears are misplaced, and that they have policies in place to protect consumers’ names and other personal information from advertisers. Moreover, they say, the data is a boon to consumers, because it makes the ads they see more relevant.
These companies often connect consumer data to unique codes identifying their computers, rather than their names.
“What is targeting in the long term?” said Michael Galgon, Microsoft’s chief advertising strategist. “You’re getting content about things and messaging about things that are spot-on to who you are.”
The rich troves of data at the fingertips of the biggest Internet companies are also creating a new kind of digital divide within the industry. Traditional media companies, which collect far less data about visitors to their sites, are increasingly at a disadvantage when they compete for ad dollars.
The major television networks and magazine and newspaper companies “aren’t even in the same league,” said Linda Abraham, an executive vice president at comScore. “They can’t really play in this sandbox.”
During the Internet’s short life, most people have used a yardstick from traditional media to measure success: audience size. Like magazines and newspapers, Web sites are most often ranked based on how many people visit them and how long they are there.
But on the Internet, advertisers are increasingly choosing where to place their ads based on how much sites know about Web surfers. ComScore’s analysis is a novel attempt to estimate how many times major Web companies can collect data about their users in a given month.
Web companies once could monitor the actions of consumers only on their own sites. But over the last couple of years, the Internet giants have spread their reach by acting as intermediaries that place ads on thousands of Web sites, and now can follow people’s activities on far more sites.
Large Web companies like Microsoft and Yahoo have also acquired a number of companies in the last year that have rich consumer data.
“So many of the deals are really about data,” said David Verklin, chief executive of Carat Americas, an ad agency in the Aegis Group that decides where to place ads for clients.
“Everyone feels that if we can get more data, we could put ads in front of people who are interested in them,” he said. “That’s the whole idea here: put dog food ads in front of people who have dogs.”
Web companies also can collect more data as people spend more time online. The number of searches that American Web users enter each month has nearly doubled since summer of 2006, to 14.6 billion searches in January, according to comScore.
ComScore analyzed 15 major media companies’ potential to collect online data in December. The analysis captured how many searches, display ads, videos and page views occurred on those sites and estimated the number of ads shown in their ad networks.
These actions represented “data transmission events” — times when consumer data was zapped back to the Web companies’ servers. Five large Web operations — Yahoo, Google, Microsoft, AOL and MySpace — record at least 336 billion transmission events in a month, not counting their ad networks.
The methodology was worked out with comScore and based on the advice of senior online advertising executives at two of the largest Internet companies.
“I think it’s a reasonable way to look at how many touch-points companies have with their consumers,” Jules Polonetsky, the chief privacy officer for AOL, said of the comScore findings on Friday.
But Mr. Polonetsky cautioned that not all of the data at every company is used together. Much of it is stored separately.
The information transmitted might include the person’s ZIP code, a search for anything from vacation information to celebrity gossip, or a purchase of prescription drugs or other intimate items. Some types of data, like search queries, tend to be more valuable than others.
Yahoo came out with the most data collection points in a month on its own sites — about 110 billion collections, or 811 for the average user. In addition, Yahoo has about 1,700 other opportunities to collect data about the average person on partner sites like eBay, where Yahoo sells the ads.
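The per-user averages reported for Yahoo rest on simple division, and the same arithmetic can be run in reverse to see roughly what audience size the figures imply. The sketch below is only an illustration: the function name `implied_users` is hypothetical, the implied audience is a derived number rather than one comScore reported, and adding the own-site and partner-site figures assumes they do not overlap.

```python
def implied_users(total_events: float, events_per_user: float) -> float:
    """Audience size implied by a monthly total and a per-user average."""
    return total_events / events_per_user


# Figures reported in the article for Yahoo's own sites (both approximate).
yahoo_total = 110e9     # about 110 billion collections per month
yahoo_per_user = 811    # collections for the average user per month

# Derived value, not a comScore figure: total divided by per-user average.
users = implied_users(yahoo_total, yahoo_per_user)

# Assumption: own-site and partner-site opportunities are simply additive.
partner_per_user = 1700
combined_per_user = yahoo_per_user + partner_per_user

print(f"Implied monthly audience: about {users / 1e6:.0f} million people")
print(f"Combined opportunities per user: about {combined_per_user}")
```

Because both inputs are rounded, the results are best read as orders of magnitude rather than precise counts.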
MySpace, which is owned by the News Corporation, and AOL, a unit of Time Warner, were not far behind.
ComScore said it recorded the ad networks using different methods and that the exact ordering of these top companies might vary with a different methodology, but the overall picture would be similar.
Google also has scores of data collection events, but the company says it is unique in that it mostly uses only current information rather than past actions to select ads.
The depth of Yahoo’s database goes far in explaining why AOL is talking with Yahoo about a merger and Microsoft is willing to pay more than $41.2 billion to acquire the company.
Traditional media companies come in far behind.
Condé Nast magazine sites, for example, have only 34 data collection events for the average site visitor each month. The numbers for other traditional media companies, as generated by comScore, were 45 for The New York Times Company; 49 for another newspaper company, the McClatchy Corporation; and 64 for the Walt Disney Company.
Some companies are trying to close the gap. Walt Disney, for example, is studying how to combine data from its divisions like ESPN, Disney and ABC. The News Corporation is exploring ways to use information that MySpace members post on that site to select ads for those members when they visit other News Corporation sites.
IAC is using data from its LendingTree site to deliver ads on its other sites to people it knows are looking for mortgages.
Some advertising executives say media companies will have little choice but to outsource their ad sales to companies like Microsoft and Yahoo to benefit from their data. The Web companies may prove they can use their algorithms and consumer information to select ads for visitors better than media companies can.
“I think a lot of publishers are going to find they don’t have enough data,” said David W. Kenny, chief executive of Digitas, a digital advertising agency in the Publicis Groupe. “There’s only going to be a handful of big players who can manage the data.”
People who spend more time on the Internet, of course, will have more information transmitted about them. The comScore per-person figures are averages; occasional Web users have far less transmitted about them.
The comScore figures do not include the data that consumers offer voluntarily when registering for sites or e-mail services. When consumers do so, they often give sites permission to link some of their interests or searches to their user name.
The figures also do not account for information people enter on social network pages. MySpace, for example, collects billions of user actions each day in the form of blogs, comments and profile updates, said Peter Levinsohn, president of Fox Interactive Media, which owns MySpace.
Even with all the data Web companies have, they are finding ways to obtain more. The giant Internet portals have been buying ad-delivery companies like DoubleClick and Atlas, which have stockpiles of information. Atlas, for example, delivers 6 billion ads every day. The comScore figures do not capture such data.
Executives from Web companies said they had been working to inform consumers about their data practices.
These companies noted their consumer-protection policies. AOL, for example, lets users opt out of some ad targeting; Google lets users edit the search histories that are linked to their user names; Yahoo is working on a policy to obscure people’s computer identification addresses that are connected to search results; and Microsoft says it does not link any of its visitors’ behavior to their user names, even if those people are registered.
A study of California adults last year found that 85 percent thought sites should not be allowed to track their behavior around the Web to show them ads, according to the Samuelson Law, Technology & Public Policy Clinic at the University of California at Berkeley, which conducted the study.
Louise Story, The New York Times. March 10, 2008
Copyright © 2008 The New York Times Company. All rights reserved.