Report finds child sexual abuse images in an AI training database, raising alarm over their misuse to generate realistic fake exploitation pictures. (IT World Canada)


December 21, 2023

The discovery of thousands of sexually abusive images of children in a widely used database for training artificial intelligence (AI) image generators has sparked concerns about the potential misuse of this material. A recently released report from the Stanford University Internet Observatory (SIO) reveals that the images were part of LAION-5B, a vast database that AI developers use for training.

The discovery has prompted action to remove the source images. Researchers reported the image URLs to organizations dedicated to combating child exploitation, including the National Center for Missing and Exploited Children (NCMEC) in the United States and the Canadian Centre for Child Protection (C3P).

LAION-5B is recognized as one of the largest repositories of images used by AI developers for training. It contains billions of images sourced from a range of platforms, including mainstream social media and popular adult video sites.

Following the report's release, the nonprofit Large-scale Artificial Intelligence Open Network (LAION) emphasized that it has a stringent policy against illegal content. LAION took the datasets down from its platform until the offending images can be removed.

The SIO report described its investigation methods, which relied on hashing tools such as Microsoft’s PhotoDNA to match image fingerprints against databases maintained by nonprofits addressing online child sexual exploitation. The researchers did not view abusive content directly, and matches were verified by relevant organizations such as NCMEC and C3P wherever possible.

Addressing the challenges surrounding datasets used to train AI models, the SIO suggested safety measures for future data collection and model training. Recommendations included cross-checking images against known databases of child sexual abuse material (CSAM) and collaborating with child safety organizations like NCMEC and C3P.
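The cross-checking step recommended above can be illustrated with a minimal sketch. PhotoDNA is a proprietary perceptual-hashing tool, so this example swaps in a plain SHA-256 digest from Python's standard library, which only catches byte-identical files and is not how the SIO study matched images. The function names and the blocklist here are hypothetical; a real pipeline would obtain its hash lists through a child safety organization rather than build them itself.

```python
import hashlib
from typing import Iterable, List, Set, Tuple


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of raw file bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def split_by_blocklist(
    items: Iterable[Tuple[str, bytes]],
    blocked_hashes: Set[str],
) -> Tuple[List[str], List[str]]:
    """Split (url, content) pairs into kept and flagged URL lists.

    Assumption: blocked_hashes is a set of hex digests supplied by a
    trusted partner. Exact-match hashing misses resized or re-encoded
    copies, which perceptual hashes are designed to catch.
    """
    kept: List[str] = []
    flagged: List[str] = []
    for url, content in items:
        if sha256_hex(content) in blocked_hashes:
            flagged.append(url)  # exclude from training and report upstream
        else:
            kept.append(url)
    return kept, flagged
```

Flagged URLs are kept in a separate list rather than silently dropped, so they can be reported to the relevant organization, as the researchers did with NCMEC and C3P.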

Although LAION-5B's creators attempted to filter explicit content and identify underage explicit material, the report highlighted problems with widely used AI image-generating models such as Stable Diffusion, which were trained on a broad range of content, including explicit material. LAION data used to train other models, such as Google’s Imagen, was also found to contain inappropriate content, leading to concerns about public use.

The SIO acknowledged that its own efforts to identify CSAM within LAION-5B were limited by incomplete industry hash sets, content attrition, limited access to original reference sets, and inaccuracies in classifying "unsafe" content.

The report concluded that web-scale datasets pose significant problems. It advocated restricting them to research settings and using more curated, carefully sourced datasets for publicly distributed AI models.
