The dark secret behind those cute AI-generated animal images
It’s no secret that large models, such as DALL-E 2 and Imagen, trained on vast numbers of documents and images taken from the web, absorb the worst aspects of that data as well as the best. OpenAI and Google explicitly acknowledge this.
Scroll down the Imagen website—past the dragon fruit wearing a karate belt and the small cactus wearing a hat and sunglasses—to the section on societal impact and you get this: “While a subset of our training data was filtered to remove noise and undesirable content, such as pornographic imagery and toxic language, we also utilized [the] LAION-400M dataset which is known to contain a wide range of inappropriate content including pornographic imagery, racist slurs, and harmful social stereotypes. Imagen relies on text encoders trained on uncurated web-scale data, and thus inherits the social biases and limitations of large language models. As such, there is a risk that Imagen has encoded harmful stereotypes and representations, which guides our decision to not release Imagen for public use without further safeguards in place.”
It’s the same kind of acknowledgment that OpenAI made when it revealed GPT-3 in 2020: “internet-trained models have internet-scale biases.” And as Mike Cook, who researches AI creativity at Queen Mary University of London, has pointed out, it’s in the ethics statements that accompanied Google’s large language model PaLM and OpenAI’s DALL-E 2. In short, these firms know that their models are capable of producing awful content, and they have no idea how to fix that.