Facebook has said that political data firm Cambridge Analytica inappropriately harvested the public profile data of up to 87 million of its users, including their political beliefs, interests and friends’ information.
Now the social network has revealed that the extent of the harvesting went even further — it included people’s private messages, too.
On Monday, Facebook started informing people whose data may have been compromised by Cambridge Analytica through an app developed by the researcher Aleksandr Kogan. In its notifications, Facebook said that while the information harvested was largely limited to what was on people’s public profiles, “a small number of people” also shared information from their Facebook timeline, posts and messages.
Facebook did not determine how many people’s messages were gathered and said it was taking as broad a view as possible when notifying people that their data may have been taken.
The revelation widens the known scope of the data harvested from people on Facebook and shows that it went beyond what was publicly available. The development may provide grist for lawmakers on a day when Mark Zuckerberg, Facebook’s chief executive, is set to be grilled on Capitol Hill over data privacy and other matters, such as Russian misuse of the social network to influence the 2016 election.
How Mr. Zuckerberg publicly addresses these issues in congressional hearings on Tuesday and Wednesday will be closely scrutinized. Facebook faces probable regulation and other changes amid a backlash over the lack of data privacy.
The Cambridge Analytica outcry began after The New York Times and others reported last month that a quiz app made by Mr. Kogan had collected information on Facebook users. That information was then used by Cambridge Analytica to build psychological profiles of voters in the United States and other countries.
It is not clear whether the direct messages were among the data eventually provided to Cambridge Analytica. In an interview on Tuesday, Mr. Kogan told The Times that the private messages were harvested from a limited number of people, likely “a couple thousand,” as part of a separate academic research project and never provided to Cambridge Analytica.
He said the messages were collected as part of research that he conducted at Cambridge University in 2013 and the first half of 2014, before he began working with Cambridge Analytica. The messages were collected for research into how people use emojis to convey emotions.
Mr. Kogan said the messages were kept securely in his university lab, known as the Cambridge Prosociality and Well-Being Lab, and access was restricted to a small group of people.
The message data “was obviously sensitive so we tried to be careful about who could access it,” Mr. Kogan said. He stressed that his Facebook app collected messages only from a “couple thousand” people who completed his questionnaire, not from their friends.
During Mr. Kogan’s later work for Cambridge Analytica, his Facebook app collected data from people who took his questionnaire and from all their friends. But the data did not include private messages; it included only names, birth dates, locations and pages the users had liked, he said.