Two federal courts issued their first-of-kind AI-evidence rulings on the same day in February 2026 and reached opposite results. Both applied orthodox doctrine. Both got there logically. The difference was the facts, and that is exactly the lesson practitioners need to carry into discovery planning right now.
What Heppner actually held
The facts are specific and matter for any attempt to generalize the holding. Bradley Heppner was indicted on securities and wire fraud charges in October 2025. After receiving a grand jury subpoena and retaining counsel, but before his arrest in November, he used the consumer version of Claude to generate approximately thirty-one documents analyzing his legal exposure and developing potential defense theories. When federal agents executed a search warrant at his residence, they seized electronic devices containing those documents. He later shared them with his attorneys and asserted privilege.
Judge Rakoff rejected the claim on what he described as "at least two, if not all three" elements of the governing test. The attorney-client privilege analysis was categorical: Claude is not an attorney, and the communications were not between a client and counsel. That finding alone was sufficient to deny the claim. On confidentiality, the court held that Anthropic's consumer privacy policy, which expressly reserved the right to collect user inputs, train the model on those inputs, and disclose information to governmental authorities and third parties, put Heppner on notice that his exchanges were not confidential. The work product doctrine failed separately: the documents were prepared by the defendant of his own volition, not by or at the direction of counsel, and they did not reflect defense counsel's litigation strategy at the time of creation.[^1]
Two procedural details carry weight for civil practitioners. First, the AI Documents were not produced through ordinary civil discovery; agents seized them via search warrant. That sequence illustrates a point worth internalizing: AI interactions exist in multiple locations beyond the vendor's servers. Exports, screenshots, saved PDFs, downloaded reports, and cached files are all potential custodial sources. Second, Judge Rakoff applied the standard forwarding principle: transmitting unprivileged material to counsel does not privilege it retroactively. No AI-specific carve-out created that result. The same logic governs any document.
The work product counterpoint
Warner turns on the pro se posture. Sohyon Warner was representing herself in an employment discrimination action against Gilbarco. The defendants sought production of "all documents and information" concerning her use of third-party AI tools in connection with the lawsuit and argued that inputting litigation materials into ChatGPT constituted a disclosure that waived work product protection.
Magistrate Judge Patti denied the motion on two independent grounds. First, the request failed relevance and proportionality review under Fed. R. Civ. P. 26(b)(1). Second, and more significant doctrinally, the AI materials were protected work product. Patti applied Fed. R. Civ. P. 26(b)(3)(A), which protects "documents and tangible things that are prepared in anticipation of litigation or for trial by or for another party or its representative." Warner was proceeding pro se; she qualified as "another party," and her AI-assisted litigation preparation fell within the doctrine. On waiver, Patti applied the distinction that Sixth Circuit law requires: attorney-client privilege waiver can flow from voluntary disclosure to any third party, but work product waiver requires disclosure to an adversary, or disclosure in circumstances likely to reach one.
In the court's words: "ChatGPT and other generative AI programs are tools, not persons, even if they may have administrators somewhere in the background."
Heppner and Warner are doctrinally consistent despite their divergent results. Heppner involved a client who acted independently of counsel and used a platform governed by consumer terms that affirmatively disclaimed confidentiality. Warner involved a party using AI as her own litigation instrument, with no gap between the person using the tool and the person managing the case. The outcome in each follows from the facts applied to unchanged doctrine.
The direction of travel: AI logs as ordinary ESI
Courts are not treating "AI chats" as a category that escapes standard discovery analysis. In the consolidated copyright MDL against OpenAI in the S.D.N.Y., Magistrate Judge Ona Wang ordered production of twenty million de-identified ChatGPT logs and denied a stay while OpenAI's objections were pending.[^2] OpenAI objected on privacy grounds and characterized the demand as requiring indefinite retention of consumer ChatGPT and API content going forward.
That order establishes the frame. Relevance, proportionality, privacy protections, protective orders, and de-identification protocols are the tools courts apply to AI chat logs. The label "AI chats" confers no reflexive protection from discovery.
Forensic triage without junk science
The threshold principle: you cannot prove "this text was written by an LLM" from text alone to an evidentiary standard. Contemporary AI detectors carry material false positive and false negative rates. Some are demonstrably biased against non-native English writing. OpenAI discontinued its own public "AI classifier for indicating AI-written text" due to low accuracy.[^3] Linguistic style inference is not a foundation for sanctions motions, privilege challenges, or broad discovery expansions.
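The base-rate arithmetic behind that caution can be sketched in a few lines. This is a minimal illustration of Bayes' rule, not a model of any real detector: the function name and every input value below are hypothetical and chosen only to show how quickly a "flagged" result loses probative force when most documents in the population are human-written.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """Probability that a flagged document is actually AI-written (Bayes' rule)."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * false_positive_rate
    denominator = true_positives + false_positives
    if denominator == 0:
        raise ValueError("no documents would be flagged under these inputs")
    return true_positives / denominator


# Hypothetical inputs for illustration only; NOT measurements of any real tool.
# 10% of documents AI-written, detector catches 80% of them, flags 10% of human text.
ppv = positive_predictive_value(prevalence=0.10, sensitivity=0.80, false_positive_rate=0.10)
print(f"Share of flagged documents that are actually AI-written: {ppv:.2f}")
```

Under those assumed inputs, fewer than half of the flagged documents would actually be AI-written, which is why a detector score alone is a poor foundation for a motion.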
The practical alternative is outside-in forensic triage. Identify process artifacts that make AI use more probable, then use proportional discovery tools to confirm or falsify the hypothesis. The evidentiary targets worth prioritizing are hard to fabricate at scale: metadata, enterprise audit logs, account records, and native exports.
Document forensics. Native files carry more information about authorship and drafting behavior than prose style does. Microsoft's documentation of "document properties" notes that author, title, and other metadata fields travel with Office files unless intentionally scrubbed. eDiscovery tools can expose those fields across custodians at scale, which is more defensible than any linguistic inference.[^4]
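For a single native file, the same fields can be read without a review platform. The sketch below assumes the file is an Office Open XML package (.docx, .xlsx, .pptx) that stores its core properties at the conventional `docProps/core.xml` location; the function name and the selection of fields are illustrative choices, and a scrubbed or nonstandard file may simply return empty values.

```python
import zipfile
import xml.etree.ElementTree as ET

# Conventional location of the core-properties part (an assumption; the package
# relationships file is the authoritative pointer).
CORE_PART = "docProps/core.xml"

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Output key -> namespaced element holding the value.
FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "revision": "cp:revision",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_properties(path):
    """Return selected core metadata fields from an OOXML file."""
    with zipfile.ZipFile(path) as archive:
        try:
            raw = archive.read(CORE_PART)
        except KeyError:
            # Part missing: metadata was stripped or the file is nonstandard.
            return {name: None for name in FIELDS}
    root = ET.fromstring(raw)
    return {
        name: root.findtext(tag, default=None, namespaces=NS)
        for name, tag in FIELDS.items()
    }
```

Calling `read_core_properties("produced_file.docx")` on a hypothetical produced file returns a small dictionary that can be compared against the producing party's drafting narrative. Work from a forensic copy, not the original, so the inspection itself does not alter file-system timestamps.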
Two text-level artifacts are real but require precise framing. Neither establishes LLM authorship. Both warrant a proportional follow-on inquiry.
Invisible Unicode characters. The Unicode standard defines zero-width format characters, including U+200B ZERO WIDTH SPACE, ordinarily invisible and intended for line-break segmentation in specific writing systems. Academic and security literature shows these characters appear in text steganography and watermarking pipelines. Clusters of them in English-language produced documents justify a narrow inquiry into what system generated the text. Copy-paste pipelines, template systems, and deliberate obfuscation produce the same artifacts.
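A first-pass scan for these characters is simple enough to run on extracted text before deciding whether a follow-on inquiry is warranted. The sketch below is a minimal example under stated assumptions: the set of characters is an illustrative shortlist, not an exhaustive inventory of invisible code points, and the function name is hypothetical. A hit shows only that the character is present, not how it got there.

```python
import unicodedata

# Illustrative shortlist of invisible format characters (not exhaustive).
SUSPECT_CHARACTERS = {
    "\u200b",  # ZERO WIDTH SPACE
    "\u200c",  # ZERO WIDTH NON-JOINER
    "\u200d",  # ZERO WIDTH JOINER
    "\u2060",  # WORD JOINER
    "\ufeff",  # ZERO WIDTH NO-BREAK SPACE (byte order mark)
}


def find_invisible_characters(text):
    """Return (offset, code point, Unicode name) for each suspect character."""
    hits = []
    for offset, char in enumerate(text):
        if char in SUSPECT_CHARACTERS:
            code_point = f"U+{ord(char):04X}"
            hits.append((offset, code_point, unicodedata.name(char, "UNKNOWN")))
    return hits


sample = "fair\u200buse"
print(find_invisible_characters(sample))
# [(4, 'U+200B', 'ZERO WIDTH SPACE')]
```

Record the offsets and the source file alongside any hits; the pattern and density of the characters matter more than a single occurrence.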
Provenance metadata. The Coalition for Content Provenance and Authenticity (C2PA) specification supports embedding signed provenance manifests into PDF and Office-format containers. Where adopted, C2PA creates machine-verifiable records of creation and edits. Routine civil litigation has not yet widely encountered C2PA, but it is the architectural path toward evidence-grade provenance. In the meantime, ordinary metadata already does some of this work: a produced file whose listed author is "python-docx" is a usable data point.
Enterprise and platform logging. For organizational parties, the strongest evidence of AI use may come from the organization's own compliance telemetry. Microsoft's documentation states that audit logs are generated for Copilot user interactions, capturing who acted, when, and what resources were accessed. Microsoft's Purview eDiscovery tools are designed to search Copilot prompts and responses and to preserve them for investigations. Copilot use inside an enterprise is materially more discoverable than consumer chatbot use on a personal device.
On the vendor side, OpenAI's enterprise documentation describes a "Compliance Logs Platform" that exports time-windowed immutable JSONL logs for auditing and compliance. That architecture changes the calculus: enterprise AI activity is more likely to exist in a structured, exportable format than in ad hoc screenshots.
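Structured logs of this kind lend themselves to narrow, reproducible filtering by custodian and date range. The sketch below shows the general technique for line-delimited JSON; the field names `timestamp` and `user_id` are assumptions made for illustration and are not taken from any vendor's documented schema, so map them to the actual export before relying on the output.

```python
import json
from datetime import datetime, timezone


def filter_log_lines(lines, user_id, start, end):
    """Yield (line number, record) for one user inside the window [start, end).

    Assumes each line is a JSON object with hypothetical fields:
    "timestamp" (ISO 8601 with UTC offset) and "user_id".
    """
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the export
        record = json.loads(line)
        stamp = datetime.fromisoformat(record["timestamp"])
        if record.get("user_id") == user_id and start <= stamp < end:
            yield line_number, record


# Hypothetical sample records, for illustration only.
sample_lines = [
    '{"timestamp": "2026-01-10T09:00:00+00:00", "user_id": "u1", "event": "prompt"}',
    '{"timestamp": "2026-02-20T09:00:00+00:00", "user_id": "u1", "event": "prompt"}',
    '{"timestamp": "2026-01-12T09:00:00+00:00", "user_id": "u2", "event": "prompt"}',
]
window_start = datetime(2026, 1, 1, tzinfo=timezone.utc)
window_end = datetime(2026, 2, 1, tzinfo=timezone.utc)
matches = list(filter_log_lines(sample_lines, "u1", window_start, window_end))
```

Keeping the original line number with each match makes it easier to tie a produced excerpt back to the unaltered log file.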
On watermarks. Text watermarking is a genuine research direction. One prominent approach biases token selection so output carries a statistically detectable signature. Consumer LLMs have not universally deployed it, it degrades under paraphrasing and formatting changes, and recent literature demonstrates watermark-stealing and spoofing attacks against major schemes.[^5] Treat a watermark as one corroborating signal: if the provider offers a verification service, use it, but do not treat the result as a forensic silver bullet.
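The statistical idea, and its fragility, can be illustrated with a one-proportion z-test. The sketch assumes a "green list" style scheme from the research literature, in which a fraction `gamma` of the vocabulary is favored at each generation step and a detector counts how many tokens landed on the favored list. The function name and the token counts are hypothetical, and real detection requires the provider's key to know which tokens count as favored.

```python
import math


def green_token_z_score(green_count, total_tokens, gamma):
    """Z-score of the observed favored-token count against the no-watermark baseline.

    Without a watermark, roughly gamma * total_tokens tokens are expected to
    fall on the favored list by chance.
    """
    if total_tokens <= 0 or not 0.0 < gamma < 1.0:
        raise ValueError("need total_tokens > 0 and 0 < gamma < 1")
    expected = gamma * total_tokens
    std_dev = math.sqrt(total_tokens * gamma * (1.0 - gamma))
    return (green_count - expected) / std_dev


# Hypothetical counts for illustration only.
z_original = green_token_z_score(green_count=140, total_tokens=200, gamma=0.5)
z_paraphrased = green_token_z_score(green_count=110, total_tokens=200, gamma=0.5)
print(f"{z_original:.2f} {z_paraphrased:.2f}")
# 5.66 1.41
```

In this toy example, editing that replaces thirty favored tokens drops the score from a strong signal to one that is indistinguishable from chance, which is the degradation problem described above.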
Building a proportional discovery theory
A request for "all AI chats related to this case" will lose credibility and will typically lose the motion. Fed. R. Civ. P. 26(b)(1) requires that discovery be proportional to the needs of the case, and ESI is a mandated planning topic under Fed. R. Civ. P. 26(f). The right structure is a case-specific relevance hook tied to the narrowest evidentiary target that actually matters: knowledge, intent, notice, reliance, state of mind, provenance of specific documents, or spoliation.
A practical framing for any request:
Relevance hook. AI interactions are ESI that can show what a party knew, believed, or was trying to accomplish, particularly when the AI was used to draft or justify factual assertions later served in discovery or filed with the court.
Proportionality guardrail. Specify a defined time window, defined custodians, defined subject matter, and a connection to specific produced documents or interrogatory answers.
Form of production. Push for native or reasonably usable electronic forms when metadata or audit trails matter. Fed. R. Civ. P. 34(b)(2)(E) contemplates ESI production forms and requires usability.
Device imaging at the outset will likely backfire. The Rule 34 committee notes explicitly caution that inspection and testing of electronic systems is not a routine right, and courts guard against undue intrusiveness. The escalation ladder starts with admissions and exports, then moves to device inspection only on a developed record of need.
What to request from the party
The most reliable civil litigation path is to compel the party, not the platform, to produce what the party can access: exports, screenshots, downloaded archives, and enterprise logs. The major providers supply export tools that create litigation-ready archives.
OpenAI documents a user-facing export of ChatGPT chat history and account data. Anthropic documents a Claude "Export data" process through privacy settings. Google documents exporting Gemini Apps data via Takeout and My Activity, noting that downloading does not delete server-side data. These mechanisms make targeted requests technically concrete and proportional.
The core interrogatory. Lead with identification:
Identify each generative AI system or feature used by you or anyone acting on your behalf to draft, revise, summarize, translate, or generate any content relating to the claims, defenses, discovery responses, witness statements, or documents produced in this action, including the system name, account identifier, device used, date range of use, and whether any chat history or logs are retained, deleted, or exported.
This framing is proportional because it produces a map first and aligns with the obligation to identify and discuss ESI sources in the Rule 26(f) conference.
Targeted RFPs. Tie requests to artifacts or topics.
Exports for specific documents. Produce exported chat transcripts, including prompts, outputs, and attachments, for conversations used to create or revise the following produced documents: [Bates ranges]. The existence of official export mechanisms makes this technically feasible and proportional.
Enterprise audit logs. For organizational parties using Microsoft Copilot, request the Purview audit and eDiscovery exports for the relevant custodians and date ranges. Microsoft frames these tools as the appropriate mechanism to audit and investigate Copilot interactions.
Retention settings and deletion actions. OpenAI's documentation states that chats persist until the user deletes them, that deleted chats are generally scheduled for permanent deletion within 30 days, subject to legal and security exceptions, and that legal holds can override standard deletion behavior. Understanding what retention settings were in place, and when, is essential for spoliation analysis.
Billing and access records. When the opposing party denies AI use, subscription receipts, reimbursements, and account emails establish access and opportunity. They are more probative than prose style arguments, and courts in the OpenAI MDL have treated structured logs and account-scale records as cognizable discovery objects when properly framed.
The Stored Communications Act trap
Fed. R. Civ. P. 45 authorizes subpoenas commanding production of documents and ESI from nonparties. It does not override statutory constraints on what providers can disclose. The Stored Communications Act, 18 U.S.C. §§ 2701-2713, generally prohibits providers of electronic communication services and remote computing services from disclosing the contents of communications to any person or entity, absent a recognized exception.
In re Subpoena Duces Tecum to AOL, O'Grady, and Crispin collectively establish that a civil Rule 45 subpoena served on an electronic communication service provider for message contents will run into SCA limitations. The same structural barrier applies to AI chat logs. A subpoena served on OpenAI or Anthropic for the contents of a party's chat sessions is likely to draw SCA objections unless a statutory exception applies.
The most commonly available exception in civil litigation is consent: if the account holder authorizes production, the prohibition lifts. The practical litigation sequence is: compel the party to export and produce via consent-based production; use the subpoena as leverage; seek a court order requiring the party to consent if the party resists. Treating a Rule 45 subpoena to the provider as a shortcut around party discovery is a strategy courts have consistently rejected in the email and social media context. They will apply the same reasoning to AI vendors.
Subpoenas for non-content records (subscriber information, billing records, login timestamps, IP session logs) face a lower SCA barrier but will still draw aggressive objections and require narrow tailoring. The statute's distinction between "electronic communication services" and "remote computing services" determines which disclosure rules apply to a given provider. That classification question has not yet been resolved for large language model providers specifically.[^6]
Proving denial and building the record
When an opposing party denies AI use, the objective is not to win an argument about writing style. It is to build a circumstantial record that either forces a correction or supports sanctions if the denial was false and evidence was destroyed.
Start with the lowest-friction proofs. Account-level records (exports, subscription receipts, enterprise audit logs) are harder to explain away than "this paragraph reads like ChatGPT." Office metadata fields and eDiscovery metadata reports can show authorship chains that undermine claimed drafting narratives. Then anchor the deposition: if a discovery response contains technical or legal assertions that appear generated, put the witness to "how was this drafted, what sources were consulted, what tools were used," then follow with document requests keyed to the answers. Warner signals that courts will resist intrusive fishing into internal drafting processes absent a particularized, concrete showing of need. Build the record to meet that threshold before you ask.
On preservation: Fed. R. Civ. P. 37(e) supplies the sanctions framework when a party fails to take reasonable steps to preserve ESI that should have been preserved in the anticipation or conduct of litigation. OpenAI's own retention documentation and public statements about legal holds demonstrate that standard deletion windows give way when preservation obligations apply. What deletion settings were active, and whether a litigation hold was in place, are essential to any spoliation argument.
When you obtain AI logs or exports and intend to use them at trial, build authenticity foundations early. Fed. R. Evid. 902(13) provides a self-authentication path for certified records generated by an electronic process or system. Fed. R. Evid. 902(14) covers certified data copied from an electronic device, storage medium, or file and authenticated by a process of digital identification, in practice usually a hash value. Both map cleanly onto properly handled AI exports and forensic extractions. Plan that foundation when you receive the documents.
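One common form of digital identification is a cryptographic hash recorded at intake. The sketch below is a minimal example, assuming the export has been unpacked into a local directory; the function names are illustrative, and the resulting manifest supports but does not replace the certification by a qualified person that the rule requires.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large exports do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_hash_manifest(export_dir):
    """Map each file's path (relative to export_dir) to its SHA-256 digest."""
    root = Path(export_dir)
    return {
        path.relative_to(root).as_posix(): sha256_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
```

Running `build_hash_manifest` on receipt and again before trial lets the proponent show that the files offered in evidence are bit-for-bit identical to what was produced.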
[^1]: Venable's alert on Heppner notes that Judge Rakoff's opinion expressly declined to address enterprise-grade AI tools subject to no-training provisions and contractual confidentiality commitments. That gap is the most consequential open question the decision leaves behind. A client using an enterprise instance at counsel's direction, governed by enforceable confidentiality terms, presents a structurally different case from Heppner. Several commentators, including at Lawfare, have also criticized the court's confidentiality analysis as overly reliant on Anthropic's broadest contractual reserved rights without examining the specific terms or product tier that governed Heppner's actual use.
[^2]:
[^3]: OpenAI announced the discontinuation of its public AI text classifier in July 2023. The stated rationale was low accuracy across text lengths and a demonstrated false positive rate against human-written text, particularly from non-native English speakers.
[^4]: eDiscovery metadata reports generated from Office files typically include "Author," "Last Modified By," and revision history fields. When those fields show an author name inconsistent with the producing party's claimed drafting narrative, they create a documentable evidentiary discrepancy that is harder to dismiss than any linguistic argument.
[^5]: The watermark-stealing literature includes work demonstrating that an adversary can observe sufficient watermarked outputs to replicate the statistical signature on non-watermarked text or remove it from watermarked text. This limits the forensic reliability of watermarking absent a provider-run verification endpoint with cryptographic controls.
[^6]: The SCA's ECS/RCS classification matters because 18 U.S.C. § 2702 and § 2703 impose different disclosure obligations on the two categories. AI chatbot providers likely qualify as remote computing services, which makes their disclosure obligations more flexible in some respects, but no court has addressed the question directly for large language model providers.