What does the data rights policy mean for AI models trained on proprietary data? Would these models need to be restricted to data rights holders?
When I talked with folks about this, there seemed to be a distinction between AI models that simply use proprietary data to create derived products (e.g., images → redshift), which would be fine, and models that reconstruct images, which would not be fine. I want to confirm this.
Specifically, would releasing a model that generates LSST-like images, pre-trained on proprietary LSST images, be allowed? This is probably the most on-the-nose case.