Analysis of Google AI training data has found HKFP to be the most cited news outlet in Hong Kong, overseas outlet The Chaser News has revealed.
The Washington Post collaborated with researchers from the Allen Institute for AI to publish Google’s AI training data, transforming it into a searchable dataset of 10 million domains that is open for the public to use.
The website of Hong Kong Free Press has contributed 1.8 million tokens – small bits of text used to process disorganised information – to the database, according to the dataset.
Tokens contributed by selected Hong Kong news outlets and government websites.
English-language newspapers the South China Morning Post and The Standard followed Hong Kong Free Press on the list of the most cited Hong Kong outlets, while independent online news platform InMedia HK was the Chinese-language news website most referenced by the AI language model in Hong Kong.
Hong Kong government websites were also part of the AI’s training data. The Legislative Council website contributed the most tokens overall, while the domain for government press releases contributed the fourth-most.
China-owned newspapers Ta Kung Pao and Wen Wei Po, along with pro-Beijing newspaper Oriental Daily News, were among the least cited Hong Kong news platforms in the database.
The dataset also contained 160,000 and 38,000 tokens from defunct pro-democracy news outlets Stand News and Citizen News, respectively.