Computer Science - Distributed, Parallel, and Cluster Computing
We performed a billion locality-sensitive hash comparisons between
artificially generated data samples to answer a critical question: can we
reproduce the results of generative AI models? Reproducibility is one of the
pillars of scientific research for verifiability, benchmarking, trust, and
transparency. Furthermore, we take this research to the next level by verifying
the "correctness" of generative AI output in a non-deterministic, trustless,
decentralized network. We generate millions of data samples from a variety of
open-source diffusion and large language models and describe the procedures and
trade-offs between generating more versus less deterministic output.
Additionally, we analyze the outputs to provide empirical evidence of different
parameterizations of tolerance and error bounds for verification. Our
results show that, with a majority vote among three independent verifiers,
we can detect perceptual collisions in AI-generated images with over
99.89% probability and less than a 0.0267% chance of intra-class collision. For
large language models (LLMs), we reach 100% consensus using greedy
methods or n-way beam searches, as demonstrated on different
LLMs. In the context of generative AI training, we pinpoint and minimize the
major sources of stochasticity and present gossip and synchronization training
techniques for verifiability. Thus, this work provides a practical, solid
foundation for AI verification, reproducibility, and consensus for generative
AI applications.
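
The majority vote among three verifiers described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each verifier regenerates the output and reduces it to an integer-encoded perceptual hash, that two hashes "match" when their Hamming distance is within a tolerance, and that the result is accepted when more than half of the verifiers match. The function names, the 8-bit example hashes, and the tolerance value are all hypothetical.

```python
from typing import Sequence


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two integer-encoded hashes."""
    return bin(a ^ b).count("1")


def verifier_accepts(claimed: int, recomputed: int, tolerance: int) -> bool:
    """A single verifier accepts when its hash is within the tolerance (assumed rule)."""
    return hamming_distance(claimed, recomputed) <= tolerance


def majority_vote(claimed: int, recomputed: Sequence[int], tolerance: int) -> bool:
    """Accept the claimed hash when a strict majority of verifiers accept it."""
    votes = sum(verifier_accepts(claimed, h, tolerance) for h in recomputed)
    return 2 * votes > len(recomputed)


# Hypothetical example: two verifiers land within 2 bits, one diverges completely.
claimed_hash = 0b10110110
verifier_hashes = [0b10110110, 0b10110111, 0b01001001]
accepted = majority_vote(claimed_hash, verifier_hashes, tolerance=2)
print(accepted)
```

A larger tolerance admits more benign nondeterminism between verifiers but raises the chance that two genuinely different outputs are treated as the same, which is the trade-off the reported collision probabilities quantify.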
Details
Title
Generative Artificial Intelligence Reproducibility and Consensus
Creators
Edward Kim
Isamu Isozaki
Naomi Sirkin
Michael Robson
Publication Details
arXiv.org
Resource Type
Preprint
Language
English
Academic Unit
Computer Science; College of Computing and Informatics