AB 2013: What AI Developers Must Disclose & Protect

1.2.26

California Assembly Bill 2013 (AB 2013), the Generative Artificial Intelligence: Training Data Transparency Act, took effect on January 1, 2026. The law requires developers of generative artificial intelligence (AI) systems serving California to publicly disclose information about the data used to train those systems.

AB 2013 applies to individuals, corporations, and government agencies that provide generative AI systems for public use. Generative AI is a type of artificial intelligence that can create text, images, or other media based on user inputs and prompts. Generative AI systems are trained on large datasets and learn to generate new content by finding patterns within that data.

Before any generative AI system or update is made publicly available in California, the developer must post on their website a summary of the training data used.

The posted documentation must include, among other things:

  1. The source of the data (e.g., public websites, purchased databases).
  2. What kinds of data were used (text, images, etc.).
  3. A description of how the dataset furthers the intended purpose of the system.
  4. Whether the data is protected by copyright or is in the public domain.
  5. Whether the data includes personal or aggregated consumer information.
  6. Whether synthetic data was used.
  7. Whether any processing or “cleaning” was done to the data.
  8. The time frame during which the data was collected.
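
For developers who track this information internally, the eight items above can be sketched as a simple checklist record. This is only an illustrative sketch, not a compliance tool: the class name, field names, and the `missing_items` helper are hypothetical and are not terms from the statute, and because the list above is introduced with "among other things," the record is not exhaustive.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DatasetSummary:
    """Hypothetical record of the eight listed items for one training dataset."""
    source: Optional[str] = None                  # item 1: where the data came from
    data_types: Optional[str] = None              # item 2: text, images, etc.
    purpose_description: Optional[str] = None     # item 3: how the data furthers the system's purpose
    copyright_status: Optional[str] = None        # item 4: copyrighted or public domain
    has_personal_info: Optional[bool] = None      # item 5: personal or aggregated consumer information
    uses_synthetic_data: Optional[bool] = None    # item 6: synthetic data used
    processing_description: Optional[str] = None  # item 7: processing or cleaning performed
    collection_period: Optional[str] = None       # item 8: time frame of collection


def missing_items(summary: DatasetSummary) -> List[str]:
    """Return the names of fields that are still unanswered (None or blank)."""
    missing = []
    for f in fields(summary):
        value = getattr(summary, f.name)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(f.name)
    return missing


# Example: a partially completed draft for one dataset.
draft = DatasetSummary(
    source="public websites",
    data_types="text",
    has_personal_info=False,
    uses_synthetic_data=True,
)
print(missing_items(draft))
```

A record like this only flags which items have not yet been answered. Whether a given answer satisfies AB 2013 is a legal question that the sketch does not address.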

AB 2013 is a part of a larger effort to regulate AI in California and provide transparency so users can make informed decisions about the AI systems they purchase and engage with. 

The performance of an AI system depends directly on the quality and relevance of the data used to train it. Major AI developers such as OpenAI, Meta, and Google are competing to lead the industry and therefore seek large quantities of training data. In the race to obtain data, there is a significant risk that copyrighted material is being used without permission and that personal, sensitive information could be misused.

Another concern the bill addresses is AI developers’ use of synthetic data to train their systems. Synthetic data is not collected from real-world sources; it is artificially generated to mimic the statistical properties of authentic data. This eases obstacles for AI developers in accessing data that may be scarce, protected, or sensitive. However, reliance on synthetic data is concerning because it cannot wholly replicate the complex and variable data collected from real-world events, and it may therefore lead generative AI systems to produce biased or inaccurate output.

As AI is becoming unavoidable in our daily lives, proponents of AB 2013 believe the bill will promote transparency and build public trust by giving users clearer information when deciding whether to implement AI or exchange personal information with generative AI systems.

In contrast, opponents of the bill, including developers, question whether compiling the required disclosures is feasible. Preparing a comprehensive summary may be difficult because AI systems are trained on massive amounts of data from varied sources, many of which lack clear records of origin. Many datasets are obtained from third parties, and developers may be unable to determine whether that data is copyrighted or otherwise protected.

Moreover, opponents are concerned that AB 2013 will hinder innovation and competition in the digital marketplace. They believe the bill exposes companies to greater liability for intellectual property infringement and to greater privacy risks. Developers also worry that the disclosure requirements leave their trade secrets and intellectual property unprotected by potentially forcing them to reveal valuable proprietary or competitive information.

As developers prepare to comply with AB 2013, it is advisable to consult legal counsel to best navigate the significant challenges of disclosure requirements while protecting valuable competitive information.

This is not legal advice. For detailed legal advice or assistance, reach out to the attorneys at Schwartz Semerdjian Cauley Schena & Bush, LLP.