YU News

Study Reveals AI Diffusion Models Mostly Rearrange, Not Reinvent, What They Learn

Lakshmikar Polamreddy, a student in the Katz School's Ph.D. in Mathematics, is co-author of the study published in the journal Information Fusion.

By Dave DeFusco

Diffusion models have become the artistic and scientific darlings of artificial intelligence. They power image generators like DALL·E and Stable Diffusion, producing stunning, lifelike pictures from simple text prompts. But a recent study led by researchers at the Katz School of Science and Health asks a fundamental question: Are these models really creating something new or just rearranging what they’ve already seen?

That question lies at the heart of a study published in the journal Information Fusion by Lakshmikar Polamreddy, a Ph.D. student in mathematics at the Katz School, and Jialu Li, a student in the M.S. in Artificial Intelligence. Their research challenges a popular belief that diffusion models “imagine” in the same way humans do.

“Diffusion models have been the state-of-the-art for image and video generation,” said Polamreddy. “We wanted to test whether they really generate new data or not. My assumption was that they don’t, that they just replicate the existing content in different forms.”

Diffusion models work by gradually turning random noise into a detailed image, learning from large datasets of real pictures. Because their results can be impressively realistic, it’s easy to assume they’re generating novel ideas. But Polamreddy’s team found otherwise. When they asked a model trained on tens of thousands of images to produce new ones, almost all the results were variations of existing data.
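
For technically minded readers, a minimal sketch of that denoising loop, written here in Python with NumPy, gives a sense of the mechanics. It is illustrative only, not the authors’ code; the `toy_denoiser` function is a hypothetical stand-in for the trained noise-prediction network a real diffusion model would use.

```python
import numpy as np

T = 1000                                   # number of diffusion steps
betas = np.linspace(1e-4, 0.02, T)         # noise schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def toy_denoiser(x, t):
    """Hypothetical noise predictor; a real model would be a trained network."""
    return np.zeros_like(x)                # placeholder: predicts "no noise"

def sample(shape=(28, 28), seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(shape)         # start from pure Gaussian noise
    for t in reversed(range(T)):
        eps_hat = toy_denoiser(x, t)       # predicted noise at step t
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps_hat) / np.sqrt(alphas[t])
        noise = rng.standard_normal(shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise   # one denoising step
    return x

image = sample()   # after T steps, the noise has been gradually shaped into an image
```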

“If I generate 10,000 images,” he said, “maybe only 10 of them contain truly new features not seen in the training data. Those 10 are what we call ‘diverse samples.’”

These diverse samples are special. They contain elements that are different but relevant to the original data. Polamreddy distinguishes them from so-called out-of-distribution samples, which are completely unrelated.

“If I give the model brain images and ask it to generate more, but it produces a heart image, that’s out of distribution,” he said. “We discard those. But if it gives a new kind of brain image with a slightly different structure, that’s a diverse sample and it’s valuable.”

The team’s most striking finding came from applying their method to medical images, where data scarcity is a real problem. Hospitals often can’t share patient scans because of privacy concerns, and collecting new images for training AI diagnostic systems is expensive and time-consuming.

That’s where data augmentation comes in: creating additional training images artificially. Most augmentation techniques, like flipping or rotating existing images, don’t add new information. Polamreddy’s study suggests that even a small number of truly diverse samples can significantly improve diagnostic models.

“Data augmentation is especially critical in the medical field,” said Polamreddy. “Because of privacy concerns, we don’t have enough data. Generating diverse samples with new content helps counter that scarcity and improves downstream tasks, like image classification and disease diagnosis.”
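
As a rough illustration of the difference, the sketch below contrasts classic geometric augmentation, which only rearranges pixels the model has already seen, with adding rare generated diverse samples to the training pool. It is an assumed example, not the study’s pipeline; `real_images` and `diverse_samples` are hypothetical inputs.

```python
import numpy as np

def classic_augment(image: np.ndarray) -> list[np.ndarray]:
    """Flips and 90-degree rotations: same content, no new information."""
    return [np.fliplr(image), np.flipud(image),
            np.rot90(image, 1), np.rot90(image, 2)]

def augmented_training_set(real_images, diverse_samples):
    """Combine real scans, their geometric variants, and the rare
    generated samples that contain genuinely new structure."""
    augmented = [aug for img in real_images for aug in classic_augment(img)]
    return list(real_images) + augmented + list(diverse_samples)
```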

Using chest X-rays and breast ultrasound images, the researchers trained an image-classification model with and without diverse samples. The results were striking: adding diverse samples improved classification accuracy by several percentage points, sometimes more than five points higher than models trained on standard generated images.

“Even a few diverse samples can make a big difference,” said Jialu Li, co-author of the study. “They diversify the training data and help the model generalize better, which means it performs more accurately on real-world medical images.”

To measure novelty, the team turned to information theory, a mathematical framework that studies how information is stored and transmitted. They used metrics like entropy and mutual information to see whether the generated images truly contained new data.

“If there’s no relationship between the training and generated images, entropy will be high,” said Polamreddy. “These measurements help us see whether there’s really new information or just a repetition of what the model already knows.”
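
Entropy and mutual information can be estimated in many ways. The snippet below is only a rough, histogram-based illustration of the kind of measurement the quote describes, not the paper’s exact estimator.

```python
import numpy as np

def entropy(img: np.ndarray, bins: int = 64) -> float:
    """Shannon entropy (bits) of an image's pixel-intensity histogram."""
    counts, _ = np.histogram(img, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def mutual_information(a: np.ndarray, b: np.ndarray, bins: int = 64) -> float:
    """Mutual information (bits) between two images' intensity histograms."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

# Low mutual information between a generated image and its nearest training
# image points toward genuinely new content; high mutual information suggests
# a rearrangement of what the model has already seen.
```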

Their conclusion is that ideal diffusion models don’t create new information at all. Any new content comes from small imperfections in how the model reverses the diffusion process; it is, essentially, a lucky byproduct of noise and complexity.

To find diverse samples, the researchers had to take a brute-force approach. They generated thousands of images repeatedly, filtering each batch to identify the rare few that contained novel features.

“If I want 100 diverse samples,” said Polamreddy, “I might have to run the model many times. Each iteration gives me one or two, so I keep going until I get what I need.”
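
A generate-and-filter loop in that spirit might look like the following sketch, where `generate_batch` and `is_diverse` are hypothetical placeholders for the diffusion sampler and the novelty test.

```python
def collect_diverse_samples(generate_batch, is_diverse, target: int = 100,
                            batch_size: int = 1000, max_rounds: int = 500):
    """Keep sampling until `target` diverse images are found or we give up."""
    diverse = []
    for _ in range(max_rounds):
        for image in generate_batch(batch_size):   # run the diffusion model
            if is_diverse(image):                  # e.g., low mutual information
                diverse.append(image)
                if len(diverse) >= target:
                    return diverse
    return diverse
```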

That method, while effective, is slow. The team’s next goal is to design “diversity-aware” diffusion models: ones that can produce semantically rich, varied images in a single pass.

“We need better conditioning in the diffusion process,” said Polamreddy. “That’s how we can teach models to generate more diverse samples automatically instead of relying on brute force.”
