8.2 Legal Validity and Scrutiny of AI-Generated Evidence

As Artificial Intelligence (AI) technology, and generative AI (AIGC) in particular, develops rapidly and spreads ever more widely, two kinds of AI-marked material are appearing with growing frequency in legal practice: content generated directly by AI (e.g., analysis reports automatically generated from input data, simulated accident scene images, or even synthesized “witness” voice statements), and “derivative information” or “derivative evidence” formed when AI deeply processes or analyzes traditional evidence (e.g., surveillance footage made clearer through AI enhancement algorithms, key information lists automatically extracted and summarized from massive volumes of email, risk scores given by AI pattern recognition, or automatically generated case summaries and timelines). These novel information carriers, bearing the imprint of “intelligence,” inevitably find their way into litigation and arbitration proceedings, proffered by one party or another as a basis for proving case facts.

This emerging phenomenon poses unprecedented, even fundamental challenges to the long-established, rule-bound, sophisticated system of evidence law. Our existing traditional evidence law framework, evolved over centuries, with its core concepts, fundamental principles, and specific rules, was primarily constructed around human perception (sight, hearing), human memory and recollection, human oral and written statements, and traces in the physical world that can be objectively recorded and verified (like documents, objects). Now, when the potential source of “evidence” might be a piece of text “created” by a non-human algorithm based on statistical patterns rather than firsthand experience, or a blurry image “enhanced” by AI through complex computational “inpainting,” a series of extremely thorny legal questions concerning the very foundation of judicial fairness inevitably arise:

  • Can these novel forms of “evidence,” bearing the mark of AI, meet the basic threshold for admission into court for review—the requirement of Admissibility? Do they satisfy the fundamental attributes of evidence: authenticity, legality, and relevance?
  • Even if formally admitted, how should their Probative Value—their persuasive strength or weight in proving the fact in issue—be objectively and accurately assessed? To what extent can (or should) we trust them?
  • How can we judge the Authenticity and Reliability of information generated or processed by algorithms that are extremely complex, often operate as “black boxes,” and whose internal logic is often incomprehensible to humans? Can they be easily forged, tampered with, or maliciously manipulated?
  • Are our existing, intricate rules of evidence and standards of proof, established for different types of evidence (e.g., distinguishing original vs. copy, direct vs. circumstantial, documentary vs. real, testimonial vs. physical evidence; and regulating hearsay, opinion evidence, character evidence, etc.), still effective or sufficient to address these novel forms of evidence generated or significantly mediated by non-human intelligence? Do existing rules require significant interpretation, adaptation, or even supplementation?

This section aims to delve into the core dilemmas, main points of controversy, potential risks, and current/future coping strategies regarding the assessment of legal validity and court (or arbitral tribunal) scrutiny of various types of information or evidence directly generated or deeply processed by AI when introduced into legal proceedings.

I. AI-Generated Content (AIGC) Submitted as Evidence: Numerous Hurdles, Long Road Ahead, Extreme Caution Needed

First, let’s examine content directly generated by AI that parties attempt to use to describe or reflect some objective state of facts. Typical examples might include:

  • A seemingly complete “case fact summary” or “legal risk assessment report” automatically drafted by AI based on the lawyer’s input description and evidence points.
  • Simulated 3D scene images or dynamic accident reconstruction animations automatically generated by AI based on scattered witness testimonies or accident scene photos to visualize the event.
  • A “statement” recording synthesized using voice cloning technology to mimic the voice of a key person in the case, with content potentially favorable or unfavorable to them.

Submitting such purely AI-created content (Generated Content) as substantive evidence to directly prove the truth or falsity of case facts faces extremely significant, even fundamental legal hurdles, both now and in the foreseeable future. Both its ability to meet Admissibility requirements and, even if it is admitted, its Probative Value are subject to profound, difficult-to-overcome doubts.

  • Inherent, Systemic Flaws in Authenticity and Reliability:

    • “Hallucination” Risk as AIGC’s Original Sin, Contradicting Evidentiary Truth Requirement: As repeatedly emphasized in Sections 6.1 & 6.6, all current generative AI (especially LLMs) inherently and unavoidably risk producing “Hallucinations”. They can confidently, fluently, even seemingly professionally fabricate completely non-existent facts, fake data, erroneous citations, or severely distort, misunderstand, or over-extrapolate input information. Directly admitting such content—which naturally may contain substantial fictional elements and whose truthfulness lacks inherent guarantee—as evidence for fact-finding in court fundamentally violates the most basic requirement of evidence law for objective truthfulness.
    • Shadow of Deepfakes Looming Over Generated Audio/Video Content: For AI-generated audio and video content (i.e., Deepfakes), their increasingly realistic forgery capabilities cast fundamental, hard-to-dispel doubts on their authenticity (risks detailed in Section 6.6). In the current absence of universally effective, highly reliable deepfake detection and verification technologies, any court or arbitral tribunal facing such “evidence” will inevitably, and rightfully, exercise the highest level of caution and skepticism.
    • Lack of Trustworthy Human Perception and Cognitive Basis: Traditional forms of evidence (eyewitness testimony, handwritten letters, crime scene fingerprints, reliably captured photos) possess credibility and probative value because they can usually be traced, directly or indirectly, to verifiable human perceptual activities, memory processes, subjective statements, or objective records of the physical world. AIGC production, however, results from complex, opaque algorithms learning, imitating, and probabilistically combining patterns from massive training data. It lacks this verifiable, direct or indirect perceptual or recording link to the objective facts sought to be proved. Its “creation” process is fundamentally different from human cognition based on experience, observation, and logic. Therefore, we cannot simply apply traditional evidence review logic and reliability standards to AIGC.
  • Fundamental Conflicts with Established Core Rules of Evidence:

    • Best Evidence Rule / Original Document Rule: This ancient and important rule (though specifics vary across jurisdictions, the core idea is ensuring originality and accuracy) generally requires that when needing to prove the content of a writing, recording, or photograph, the “Original” must be produced in principle, unless statutory exceptions apply (e.g., original lost not due to proponent’s bad faith). For text, images, or audio/video purely generated by AI “ex nihilo” directly in the digital realm, what constitutes the “original”? Is it the first digital file generated and stably saved? The specific code or algorithm version that generated it? Or is it essentially a “reproduction” or “simulation” based on training data, lacking a traditional “original” altogether? This conceptual ambiguity makes applying the original document rule difficult and controversial.
    • Hearsay Rule (Rule Against Hearsay): This rule (central in common law evidence systems, with similar spirit in others like China emphasizing direct/original evidence) generally strictly limits or prohibits submitting Out-of-court Statements as evidence to prove the truth of the matter asserted therein. The primary rationale is that the out-of-court declarant was not under oath in court, and their statement’s truthfulness, accuracy, memory, perception, and clarity cannot be directly observed by the court or (crucially) tested and challenged by the opposing party through Cross-examination.
      • Is AIGC an “Out-of-court Statement”?: Could AI-generated content containing factual assertions or descriptions (e.g., an AI-generated “case analysis report,” “narrative text” of an event, or even a synthesized “witness statement” recording) be considered a type of “statement” made “out of court” by a non-human entity? This is a novel legal theory question. But more crucially and clearly, AI itself is obviously not a competent witness capable of appearing in court, taking an oath, and being questioned by the judge and cross-examined by parties. It cannot vouch for the truthfulness of its “statements” or respond to challenges about its “perceptual” basis, “memory” accuracy, or “expressive” intent. Therefore, even if AIGC were treated as a special type of out-of-court statement, it would almost certainly fail to meet the strict conditions under which hearsay exceptions are typically allowed (these exceptions usually require the declarant’s availability for cross-examination or fall into categories with special reliability guarantees like business records, dying declarations, etc., none of which apply to AIGC).
    • Opinion Evidence Rule: This rule generally restricts witnesses (especially Lay Witnesses) from offering opinions based on personal inference, speculation, subjective evaluation, or requiring specialized knowledge. Witnesses should, in principle, only testify to objectively perceived facts. Only qualified Expert Witnesses can offer opinion evidence on matters within their field of expertise, under specific conditions.
      • AIGC Often Contains “Opinions”: Much AI-generated content, such as risk assessment reports (containing risk ratings), case analyses (involving causation inferences or liability suggestions), predictive conclusions (win probability, recidivism risk), or even seemingly objective summaries (whose selection and organization might imply evaluation), substantively includes significant “opinion” components, rather than purely objective factual statements.
      • Can AI Qualify as an “Expert”?: The question then becomes, can AI be considered an “Expert” qualified to offer opinions to the court? The answer currently is almost certainly no. Expert witnesses need recognized qualifications, ability to explain their reasoning process, accountability for their opinions, and capability to withstand cross-examination. AI clearly cannot meet these requirements. Thus, submitting AI-generated content containing opinions, judgments, or predictions directly as evidence is likely inadmissible under the opinion evidence rule.
  • Severe Challenges in Meeting Admissibility Thresholds:

    • Relevance: Even if formally relevant, does the AI-generated content truly have a substantive, logical connection to the core Facts in Issue / Material Facts needing proof? Or is it merely an output generated by algorithmic pattern matching on training data, appearing relevant but potentially misleading? Proving this connection might require additional explanation and argument.
    • Legality / Lawful Acquisition: Was the AI system itself and its underlying training data sourced, obtained, and used in a fully lawful and compliant manner? E.g., did the training data potentially infringe third-party copyrights or privacy rights? Did the AI system’s development and deployment comply with relevant national regulations on AI ethics and security? If the evidence itself originated from illegal or non-compliant means, its admissibility would be fundamentally questioned (similar to exclusionary rules for illegally obtained evidence).
    • Credibility & Reliability Assessment of the Source (i.e., the AI Model): Before admitting AIGC evidence, the court might need to assess the reliability, accuracy, stability, known biases, and the development, testing, and validation processes of the specific AI model that generated the content. This usually requires introducing extremely complex technical evidence and very expensive expert testimony for explanation and debate, undoubtedly increasing litigation costs, delaying proceedings, and raising complexity.

Current Practical Considerations & Appropriate Positioning: Synthesizing the above analysis, it is clear that under current legal frameworks and at current levels of technological maturity, attempting to submit content generated directly by AI in order to prove the truth of case facts (e.g., “According to my analysis, this contract is invalid,” “The AI simulation shows the defendant was speeding”) as independent, substantive evidence faces near-insurmountable legal hurdles and almost certain failure.

Does this mean AIGC has no place in current judicial procedures? Not necessarily. Its more plausible, appropriate, and lower-risk application lies in using it as auxiliary, illustrative, or demonstrative material during court activities (or pre-trial preparation). For example:

  • Lawyers might use AI-generated simulation images or animations based on existing evidence to help judges, juries (in jury systems), or arbitrators more intuitively understand a complex technical principle, a hard-to-describe accident process, or a crucial spatial relationship at a scene. (This is akin to using physical models, charts, presentations, or simple hand-drawn diagrams in the past, just with more advanced technology).
  • AI-generated text summaries or key information lists can serve as “working drafts” or “reference notes” for lawyers or judicial staff to quickly grasp lengthy evidence materials, sort out case timelines, or prepare hearing outlines. (Crucially, formal legal documents or court statements must still directly cite and rely on the original evidence itself, not AI summaries).

The key is clearly defining its role and limitations: When using such AI-generated auxiliary materials in judicial proceedings, one must clearly, explicitly, and unequivocally inform the court and opposing parties of their nature—i.e., they are simulation results, visualizations, or preliminary organizational materials generated by AI algorithms based on specific input information, intended for aiding understanding or illustration, and absolutely not independent evidence directly proving case facts. Furthermore, the underlying facts and original data upon which these materials are based still need to be fully proven through other admissible evidence compliant with evidence rules. Also, the specific generation process, core algorithms/models relied upon, and potential assumptions or limitations might also need to be subject to court scrutiny and cross-examination.

II. “Derivative Evidence” or “Analysis Results” from AI Processing: Scrutiny Shifts to Process Reliability & Impact Assessment

Besides directly generating new content (AIGC), AI technology is also increasingly used to perform various forms of deep processing, complex analysis, intelligent enhancement, or pattern mining on traditional, original evidentiary materials. Based on these processes or analyses, new “derivative information,” “analysis results,” or items claimed to have evidentiary value (“derivative evidence”) are formed. Typical examples include:

  • A blurry, hard-to-discern surveillance video processed by AI image/video enhancement techniques (super-resolution, denoising, deblurring), resulting in a “clearer version” claimed to reveal key details (faces, license plates).
  • A secret recording with extremely noisy background and faint voices, processed by AI audio enhancement and denoising, resulting in a “cleaned version” claimed to make the conversation intelligible.
  • An hours-long court hearing recording automatically converted by AI Speech-to-Text (STT) technology into an electronic text record (draft transcript); a minimal sketch of this kind of transcription step follows this list.
  • A summary table of key information, a consolidated list of specific clauses (like risk clauses), or an automated analysis report with risk scores, automatically generated by AI Information Extraction (IE) or Natural Language Processing (NLP) techniques from millions of emails or thousands of contract documents.
  • A probabilistic outcome (e.g., predicted win probability range for a case type, assessed recidivism risk level for a defendant, or fraud risk score for a financial transaction) given by an AI risk assessment model or predictive algorithm based on vast historical case data or specific case feature inputs.
  • Seemingly anomalous transaction patterns, hidden relationship networks, or lists of highly “similar cases” identified by AI pattern recognition or clustering algorithms from massive datasets.
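
To make the “draft transcript” example above concrete, here is a minimal sketch of how such a machine transcript might be produced and labeled as a draft. It assumes the open-source openai-whisper package purely as a stand-in for whatever STT engine is actually deployed; the file name hearing.wav and the model size are illustrative assumptions, and the output would still need human verification against the original recording before any evidentiary use.

```python
# Minimal sketch of an STT "draft transcript", using the open-source
# openai-whisper package (pip install openai-whisper; requires ffmpeg)
# as a stand-in for whatever engine is actually used. File name and model
# size are illustrative assumptions, not a recommendation.
import json
import whisper

AUDIO_PATH = "hearing.wav"  # hypothetical recording of a court hearing

model = whisper.load_model("base")       # larger models trade speed for accuracy
result = model.transcribe(AUDIO_PATH)    # returns full text plus timestamped segments

draft = {
    "source_file": AUDIO_PATH,
    "engine": "openai-whisper (model: base)",
    "text": result["text"],
    "segments": [
        {"start": seg["start"], "end": seg["end"], "text": seg["text"]}
        for seg in result["segments"]
    ],
    # The label matters: this is machine output, not the evidence itself.
    "status": "DRAFT - must be verified against the original recording",
}
print(json.dumps(draft, ensure_ascii=False, indent=2))
```

Keeping per-segment timestamps and an explicit draft label means every passage of the machine transcript can later be checked back against the original audio rather than being treated as the evidence itself.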

For this type of “derivative evidence” or “analysis result”—not entirely created by AI but formed based on deep processing or analysis of existing original evidence—the focus of scrutiny regarding its admissibility and probative value in legal proceedings shifts beyond just the final presented result (e.g., the “clarified” video, the risk report, the prediction score). More critically, it demands a comprehensive, rigorous examination of the entire process that generated it. That is, we need to delve into:

  • Was the AI processing or analysis process itself sufficiently reliable, accurate, and scientifically sound?

  • Were the AI algorithms and techniques used adequately validated? Are their capability limits and potential errors known?

  • What substantive impact might this processing or analysis have had on the authenticity and integrity of the original evidence?

  • To what extent can the final “derivative result” be trusted as a basis for fact-finding?

  • Reliability & Scientific Soundness of the AI Process is Core:

    • Validity, Maturity & Domain Acceptance of the Algorithm: Was the specific AI algorithm or model used (e.g., which deep learning architecture for image enhancement? which STT engine version? what’s the risk model algorithm?) based on solid scientific theory? Are its technical principles relatively clear and explainable (at least high-level)? Has the algorithm or technology undergone sufficient academic research, peer review, and practical validation within its relevant professional field (e.g., computer vision, speech recognition, data mining, or specific application domains like financial risk control, medical image analysis)? What are its known average accuracy rates, common error types, optimal operating conditions, and clear limitations in similar scenarios? Did it follow relevant industry technical standards or best practice guidelines?
    • Quality & Suitability of Input Data: How was the quality of the original data fed into the AI for processing or analysis (e.g., the original blurry video for enhancement, the raw audio for transcription, the base dataset for training the risk model)? Did it suffer from severe noise, missing information, formatting errors, or corruption? Did the quality of this input data meet the basic requirements for the chosen AI algorithm to operate effectively and produce reliable results? (GIGO principle applies here too; low-quality input rarely yields high-quality output).
    • Normativity & Impact of Human Operations in the Process: Was there human intervention during the AI processing or analysis? E.g., did humans need to set key algorithm parameters (enhancement intensity, risk threshold)? Did humans filter or preprocess the input data? Were the AI’s intermediate or final results manually corrected or interpreted? If so, did the human operators possess necessary qualifications, skills, and objectivity? Was their specific operational process standardized, consistent, and well-documented? Could human intervention have introduced new biases or errors?
    • Reproducibility & Verifiability of the Process: For given input data and defined AI algorithm/parameters, can the processing or analysis process be repeated (Reproducibility)? I.e., does processing the same data with the same method yield identical or statistically highly similar results? More importantly, can the result be verified (Verifiability) or reproduced (Replicability) by an independent third party (e.g., an opposing party’s expert with equivalent or higher qualifications) using the same or similar methods? Process reproducibility and result verifiability are crucial for establishing trust in its scientific reliability. (A minimal re-run comparison is sketched after this list.)
  • Prudently Assessing Potential Impact of AI Processing on Original Evidence Authenticity & Integrity:

    • Core Question: Did the process alter the substantive content of the evidence?: This is one of the foremost concerns for courts deciding admissibility of AI-processed evidence. Did the AI processing—especially operations aimed at “enhancing” or “repairing” original evidence (like image/video super-resolution, deblurring, denoising, color restoration) or “simplifying” or “extracting” from original evidence (like text summarization, key info extraction)—potentially, inadvertently (or intentionally, if misused), alter the substantive information crucial for proving case facts contained in the original evidence? E.g.:
      • Could AI image enhancement have “hallucinated” seemingly plausible details not present in the original blur (a clearer but wrong facial feature, an “inpainted” license plate number)?
      • Could AI audio denoising have excessively filtered out background sounds that, although noise, might contain important environmental cues or context (a gunshot, distant conversation)? Or did the process alter subtle acoustic features (pitch, rate changes) affecting assessment of authenticity or speaker emotion?
      • Could AI-generated text summaries, due to the algorithm’s biased judgment of “importance” or oversimplification for brevity, have omitted critical qualifying conditions, context, or opposing viewpoints from the original, leading to a distorted or misleading understanding of the source text’s core meaning?
    • Did the process introduce new distortions or artifacts?: Besides altering substance, could the AI process itself have introduced new, non-original digital artifacts into the data (e.g., checkerboard or ringing effects in enhanced images; “metallic” or unnatural transitions in processed audio)? Or did it leave detectable traces indicating processing by a specific algorithm? These need consideration.
    • Fidelity & Comprehensiveness of Transcribed Text: For AI-generated speech transcripts, beyond textual accuracy, consider whether they fully and faithfully reflect all potentially valuable information in the original recording. E.g., do they accurately label speaker identities? Record important non-linguistic sounds (sighs, crying, background noises, object impacts)? Reflect speaker’s tone, pace variations, hesitations that might imply emotion or credibility? Current STT often focuses on lexical content, with limited capacity for capturing these non-semantic cues, potentially losing some evidentiary value of the original audio.
    • Objectivity, Completeness & Potential Bias of AI Analysis Reports: For AI-generated reports containing analysis or predictions (e.g., risk analysis from emails, similar case reports from precedents, risk scores from models), carefully assess whether they objectively and comprehensively reflect all information in the underlying source data. Or could they be skewed by inherent algorithmic biases (training data, model design), data quality issues, or intentional/unintentional selectivity by designers/users in choosing analysis dimensions or presenting results, thus carrying some tendency or misleading nature?
  • Applying Stricter Evidence Scrutiny Standards & Introducing Multi-layered Expert Evidence:

    • Potential Triggering of Scientific Evidence Standards: For AI analysis or prediction results derived from extremely complex, opaque algorithms (especially deep learning models) and potentially having significant impact on case outcome (e.g., AI assisting complex DNA profile analysis for identification; securities fraud risk scores from complex financial models; AI diagnostic system conclusions used to assist in determining medical malpractice causation), courts deciding admissibility are highly likely to apply the jurisdiction’s specific standards for admissibility of scientific or expert evidence.
      • In US federal courts and many adopting states, this typically means the Daubert standard (requiring judges as “gatekeepers” to assess if the theory/technique can be and has been tested, subjected to peer review and publication, has a known or potential rate of error, existence/maintenance of standards controlling its operation, and general acceptance in the relevant scientific community).
      • Other jurisdictions might still use the older Frye standard, focusing primarily on general acceptance in the relevant field.
      • While concepts differ, similar spirits exist elsewhere (e.g., China’s rules on expert opinions emphasize legality of procedure, scientific validity of method, reliability of conclusion). Meeting these strict standards usually requires proponents to provide extensive, detailed technical documentation, validation reports, and almost always necessitates reliance on qualified expert witnesses testifying for explanation and defense.
    • Central & Diverse Roles of Expert Witnesses in AI Evidence Review: In reviewing and admitting evidence involving complex AI processing or analysis, the role of expert witnesses with relevant qualifications becomes absolutely crucial and indispensable. Moreover, it might require not just one type of expert, but collaboration among experts from different fields:
      • AI Technology Experts (Computer Scientists, Data Scientists, Algorithm Engineers): Their core role is to explain to the court the basic working principles, core algorithm characteristics, data basis, performance metrics (accuracy, error rates), inherent limitations, potential bias risks of the specific AI system used in the case, and the general acceptance and state-of-the-art of the technology in its field. They need to convey complex technical issues in language understandable to the court (a challenge itself) and defend the reliability and validity of the technology under rigorous cross-examination (potentially against opposing tech experts).
      • Substantive Domain Experts (Forensic Scientists, Financial Risk Analysts, Clinical Physicians, Digital Forensics Investigators, Auditors, etc.): Their core role is not to explain the AI technology itself, but to interpret the actual meaning, relevance, and probative value of the AI analysis or processing results within their specific professional domain. They need to assess if the AI output aligns with professional knowledge, practical experience, and industry standards in their field. Does the result support or rebut a specific fact in issue? How should this result, combined with other evidence, influence the final professional judgment or conclusion? AI analysis results themselves (e.g., a risk score, a match probability) usually cannot directly serve as final legal or factual conclusions. They must be “translated,” confirmed, and contextualized by human domain experts to become usable expert opinions for the court.
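
To illustrate the reproducibility check mentioned above in concrete terms, the sketch below re-runs one and the same processing step on the same input and compares the outputs. It is only a sketch under stated assumptions: enhance stands in for whatever enhancement or analysis model was actually used, and the mean-absolute-difference fallback is a rough placeholder for a proper domain-specific similarity metric.

```python
# Minimal sketch: re-running an AI processing step on identical input to test
# reproducibility. "enhance" is a hypothetical placeholder for the real model.
import hashlib
import numpy as np

def sha256_of_array(arr: np.ndarray) -> str:
    """Fingerprint an output so byte-identical re-runs can be demonstrated."""
    return hashlib.sha256(arr.tobytes()).hexdigest()

def reproducibility_report(enhance, original: np.ndarray, runs: int = 3) -> dict:
    """Process the same input several times and compare the outputs."""
    outputs = [enhance(original) for _ in range(runs)]
    hashes = [sha256_of_array(out) for out in outputs]
    # If outputs are not byte-identical, report how far they drift apart
    # (a crude stand-in for a domain-appropriate similarity measure).
    max_drift = max(
        float(np.mean(np.abs(outputs[0].astype(float) - out.astype(float))))
        for out in outputs[1:]
    )
    return {
        "byte_identical": len(set(hashes)) == 1,
        "output_hashes": hashes,
        "max_mean_abs_difference": max_drift,
    }

if __name__ == "__main__":
    # Illustrative use with a trivial, deterministic "enhancement".
    frame = np.random.default_rng(0).integers(0, 256, size=(64, 64), dtype=np.uint8)
    brighten = lambda x: np.clip(x.astype(float) * 1.2, 0, 255).astype(np.uint8)
    print(reproducibility_report(brighten, frame))
```

An independent expert asked to verify the result would run the same kind of report on their own copy of the original file; matching hashes, or drift within an agreed tolerance, is what the verifiability discussion above points at.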

Key Practical Considerations & Recommendations:

  • Must Maintain Complete, Detailed Records of the Entire AI Processing: For all instances where AI technology is used to process, analyze, or enhance evidence materials (including which specific tool/model/version was used, when, by whom, with what key parameter settings, following which specific steps, and producing what key intermediate and final outputs), detailed, accurate, and preferably tamper-proof records must be kept; a minimal audit-record sketch follows this list. This record is the foundation for later proving the reliability, reproducibility, and compliance of the process in court, and a prerequisite for withstanding cross-examination and judicial scrutiny.
  • Must Properly Preserve the Original Evidence Securely: Regardless of what AI processing or enhancement is performed, the original, unprocessed evidence file (Original Evidence) must be preserved with highest priority, security, and integrity, ensuring its clear provenance, unbroken chain of custody, and freedom from any contamination or tampering. The original evidence serves as the baseline and ultimate reference for all subsequent processing and analysis. In court, if disputes arise over AI-processed evidence, the court will likely demand the original for comparison and verification.
  • Clearly Define AI’s Auxiliary Role When Using/Presenting: When submitting or presenting any results derived from AI analysis or processing to the court, arbitral tribunal, or opposing parties, clearly, candidly, and unambiguously state that AI played only an auxiliary role (e.g., “The risk scores in this report are based on preliminary analysis of the data by the XX AI model and are for reference only; final judgment was made by our expert team”). Emphasize that AI conclusions/outputs must be interpreted in conjunction with all other evidence, relevant legal provisions, and human professional knowledge and experience, and are by no means absolute “truth” replacing human judgment.
  • Enhance Process Transparency Where Feasible: Within the limits of protecting legitimate trade secrets or system security, strive to increase the transparency of the working process of AI systems used for evidence processing. E.g., appropriately disclose to the court and opposing parties the basic type and principle of the algorithm used, key performance metrics (known accuracy/error rates), and known limitations or potential risks of the technology. Increased transparency helps reduce unnecessary suspicion and enhances the credibility and procedural acceptability of the results.
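
The record-keeping and original-preservation recommendations above can be made concrete with a small sketch. Everything specific here is an assumption for illustration: the function names, file paths, and the JSON-lines log format are hypothetical, only the Python standard library is used, and a real workflow would write such entries to a write-once or otherwise tamper-evident store.

```python
# Minimal sketch: a tamper-evident audit entry for one AI processing step.
# File paths, tool identifiers, and the log format are illustrative assumptions.
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

def sha256_file(path: str) -> str:
    """Fingerprint a file so the untouched original remains a verifiable baseline."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def record_processing_step(original_path: str, output_path: str, tool: str,
                           version: str, operator: str, parameters: dict,
                           log_path: str = "ai_processing_log.jsonl") -> dict:
    """Append one entry recording who ran what, on which file, with which settings."""
    entry = {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "operator": operator,
        "tool": tool,
        "tool_version": version,
        "parameters": parameters,
        "original_file": original_path,
        "original_sha256": sha256_file(original_path),
        "output_file": output_path,
        "output_sha256": sha256_file(output_path),
    }
    # Chain each entry to the previous one so that deleting or editing an
    # earlier entry is detectable later.
    log = Path(log_path)
    previous_line = ""
    if log.exists():
        lines = log.read_text(encoding="utf-8").splitlines()
        if lines:
            previous_line = lines[-1]
    entry["previous_entry_sha256"] = hashlib.sha256(previous_line.encode("utf-8")).hexdigest()
    with log.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return entry

# Hypothetical usage after an enhancement run:
# record_processing_step("cam03_original.mp4", "cam03_enhanced.mp4",
#                        tool="video-enhancer", version="2.1.0",
#                        operator="J. Doe", parameters={"denoise": "high"})
```

Because every entry embeds the hash of the one before it, the log itself can be checked for gaps or edits, which is precisely what makes such a record useful when the processing is later challenged in court.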

III. Special Challenges & Counter-Strategies for AI Deepfake Evidence (Re-emphasizing Severity)

AI-driven Deepfake technology (especially highly realistic audio and video forgeries) poses unprecedented, disruptive, extremely severe challenges to the traditional assessment of authenticity for audio/video evidence within the evidence system (principles, harms, countermeasures detailed in Section 6.6). This issue is extremely important and increasingly urgent in the practice of reviewing AI-related evidence, requiring re-emphasis of core challenges and strategies:

  • Burden of Proof Rules May Need Adjustment: Under traditional rules, the party challenging the authenticity of evidence often bears the initial burden of proving forgery. However, facing technically extremely difficult-to-detect deepfakes, placing the full burden on the challenger might be unduly harsh and unrealistic. Future evidence rules might evolve or be explored in directions like: when one party raises specific, reasonably grounded doubts about the authenticity of critical audio/video evidence (e.g., providing preliminary technical analysis suggesting anomalies, or showing significant contradiction with other established facts), could the burden of proof shift, requiring the proponent of the evidence to bear a higher, more active responsibility to prove its authenticity and integrity (i.e., that it’s genuine and untampered)? Evolution of rules in this area warrants close attention.
  • Professional Technical Forensics Will Become Increasingly Routine: It’s foreseeable that in future cases involving critical, disputed digital audio/video evidence, conducting professional technical forensic examination to rule out or confirm the possibility of deepfakes might gradually shift from an exception to a more routine requirement. This poses new practical challenges:
    • Need for Sufficient Qualified Experts & Institutions: Currently, specialized institutions and personnel expert in deepfake detection and forensics are relatively scarce.
    • Cost & Time Investment: High-quality technical forensics usually involve significant costs and lengthy turnaround times, potentially pressuring litigation efficiency and parties’ financial capacity.
    • Need for Standardized & Accepted Methods: Requires urgent research and establishment of scientific, reliable, and judicially accepted technical standards and procedural norms for deepfake forensic examination.
  • Upholding & Strengthening the Principle of Holistic Judgment: Even with a technical forensic report (whether finding “no obvious signs of forgery” or “high suspicion of forgery”), the report must absolutely not be treated as the sole or decisive basis for determining evidence authenticity. The court (or tribunal) must still, in its final judgment, consider the forensic opinion as one piece of evidence among many, and make a comprehensive assessment based on all other evidence in the case (e.g., the context in which the audio/video was created, testimonies of relevant witnesses, corroborating or contradicting documents and physical evidence, the overall logical consistency of the case, etc.). Apply rigorous logical reasoning and common sense judgment based on life experience to ultimately evaluate the authenticity of the audio/video evidence and its probative value regarding the facts in issue.
  • Stricter Requirements for Evidence Provenance & Chain of Custody: Against the backdrop where deepfake technology makes tampering easier and more covert, courts will undoubtedly become even more critical and stringent in examining the original source of audio/video evidence (how and by whom was it obtained?), the legality and compliance of the acquisition method, and the completeness, clarity, and integrity of the chain of custody from initial creation/acquisition to final submission to court. Any doubt regarding provenance or chain of custody could significantly undermine the evidence’s credibility.

IV. Adaptability Challenges for Existing Evidence Rules & Future Development Prospects

Our existing evidence rule systems—evolved over a long history primarily to deal with human actions and traditional physical/analog evidence forms (e.g., requirements for Relevance, Legality, Authenticity; distinctions and rules for different evidence types like original vs. copy, direct vs. circumstantial, testimonial vs. real; admissibility limitations on Hearsay, Opinion, Character evidence; special standards for Expert Testimony and Scientific Evidence)—provide fundamental analytical frameworks, core legal principles, and important dimensions of analysis when reviewing and evaluating evidence potentially involving AI. These basic principles (pursuit of truth, ensuring due process, excluding illegal evidence) remain valid and must be upheld in the AI era.

However, AI, as a novel, non-human source or processor of evidence, based on complex data and algorithms, often with opaque internal mechanisms, indeed poses numerous unprecedented challenges and new issues requiring adaptive adjustment in the specific interpretation and practical application of this traditional system.

  • Interpretive & Applicability Challenges for Existing Rules in AI Contexts: Courts, arbitral bodies, legal practitioners, and academia need to continuously explore and accumulate experience through future cases, studying how to reasonably, creatively, and consistently with the spirit of rule of law, interpret and apply traditional evidence rules designed for the “human world” to address new situations brought by AI. E.g.:
    • Under the Best Evidence Rule, for content purely generated by AI in digital space (AI-created image, AI analysis report), how should the legally significant “original” be defined? Is it the first stably saved digital file? The specific algorithm code/model version generating it? Or is it essentially derivative data lacking a traditional “original”?
    • Could analysis reports, risk assessments, or predictive conclusions generated by AI based on learning from vast data constitute a new form of “Hearsay” (as it indirectly relies on potentially numerous original sources or statements within training data not directly tested in court)? If so, could it ever meet existing hearsay exceptions?
    • Is content generated by AI containing judgments, evaluations, or predictions considered Opinion Evidence? If so, can AI itself qualify as an “Expert” to offer such opinions (currently clearly no)? Or should AI output be treated as “raw data” or an “analytical tool” requiring interpretation and adoption by a qualified human expert, with only the human expert’s final opinion being the actual evidence?
    • If AI is used to assist human experts in their analysis and judgment (e.g., AI aids radiologist reading scans for diagnosis; AI aids accountant auditing financials for anomalies), does the reliability basis for the final human expert opinion rest solely on the human’s judgment, or must it also involve scrutiny and disclosure of the reliability, accuracy, and limitations of the underlying AI tool relied upon? How should the qualifications, scope of testimony, and cross-examination methods for “AI expert witnesses” (those explaining the AI technology itself) be defined?
  • Necessity & Potential Directions for Amending/Supplementing Evidence Rules: As AI technology deepens and its use in judicial practice becomes more common, relying solely on interpretation of existing rules might become insufficient to address all emerging complexities and risks. It is highly likely that legislatures or supreme judicial bodies will need, in the future, to provide clearer, more specific guidance adapted to AI’s characteristics, possibly through amending existing evidence laws/rules, or issuing specialized judicial interpretations, trial work guidelines, or even dedicated legislation concerning AI-related evidence issues. Potential directions could include:
    • Clearly defining the basic preconditions, admissibility standards, and limitations for using different types of AI-generated content or AI processing results as evidence.
    • Developing more targeted identification standards, burden of proof rules, and legal consequences specifically addressing evidence forgery or tampering using Deepfakes and other AI techniques.
    • Researching and stipulating the reasonable scope, procedural requirements, and necessary protection for trade secrets and IP regarding discovery of AI algorithms, models, or related data in litigation.
    • Clarifying disclosure requirements for transparency and explainability regarding the AI process when using AI analysis results or expert opinions in court, and defining acceptable levels and forms of explanation.
    • Updating and detailing rules concerning the collection, preservation, submission, authentication, and integrity of electronic evidence to specifically cover new risks introduced by AI processing (e.g., algorithmic errors, irreversible processing).
  • Enhancing AI Literacy of the Legal Community as Fundamental Guarantee: Regardless of how legal rules evolve, ultimately, the ability to effectively review, challenge, utilize, and adjudicate evidence involving AI in practice rests upon human legal professionals equipped with the necessary knowledge and skills. Therefore, systematically and continuously enhancing the entire legal community’s (including judges, prosecutors, lawyers, arbitrators, forensic experts, legal educators, and researchers) basic understanding of AI technology (especially its legal applications), ability to identify its potential risks and ethical challenges, and capacity to critically evaluate its output and evidentiary value will be the most fundamental guarantee that the justice system can successfully adapt to the intelligent transformation while upholding judicial fairness and efficiency. This requires immense effort across all levels, including legal education, professional admission, continuing training, and interdisciplinary exchange.

Conclusion: Navigating Unknown Waters in the Intelligent Age – Prudence is the Best Compass, Humans are the Ultimate Helmsmen

Novel forms of evidence or information directly generated or deeply processed by AI are entering legal practice at an unprecedented pace and in unprecedented ways, profoundly challenging our traditional understanding and existing rules regarding evidence authenticity, reliability, originality, admissibility, and the evidence law system as a whole. Their inherent issues and potential risks—such as potential inaccuracy (“hallucination” risk), opacity of origin and process (“black box” problem), difficulty in determining authorship (copyright dilemma), and extreme vulnerability to malicious forgery and tampering (especially Deepfakes)—mean that, under current legal frameworks and technological understanding, submitting such “imprints of intelligence” directly as independent, substantive evidence to prove case facts faces immense legal hurdles and an extremely high risk of inadmissibility.

For “derivative evidence” or “analysis results” formed from AI processing or analysis of original evidence, the focus of legal scrutiny must shift from merely the final presented result to a comprehensive, rigorous, meticulous examination of the entire AI process that generated it—its reliability, scientific validity, transparency, reproducibility, and potential impact on the original evidence’s substantive content. This often necessitates professional technical forensic opinions, multi-layered expert testimony, and potentially applying stricter scientific evidence standards.

The proliferation of Deepfake technology, in particular, poses a severe, potentially subversive test to the foundational credibility of traditionally reliable audio/video evidence, demanding utmost vigilance and skepticism in future evidence review practices, along with active development and application of corresponding technical detection means and legal regulatory strategies.

Addressing these unprecedented challenges requires prudent, adaptive interpretation and application innovation based on full understanding and respect for existing evidence law frameworks and fundamental principles, possibly necessitating necessary supplementation and refinement of relevant rules through future legislation or judicial interpretation. But more centrally and fundamentally, it requires the entire legal community itself to continuously enhance its understanding of AI technology, risk awareness, and critical thinking capabilities.

It is foreseeable that for a considerable time to come, when navigating the vast ocean of legal practice and encountering various forms of “AI evidence,” maintaining healthy skepticism, insisting on independent cross-verification against authoritative original sources for all facts and bases, emphasizing full transparent recording and review of AI processing, upholding the core role of human experts in final judgment and responsibility, and strictly adhering to fundamental requirements of due process will serve as our best compass and most reliable anchor for sailing safely, utilizing technology effectively, and ultimately upholding judicial fairness in these uncharted waters of the intelligent age. Technology can assist, even augment our capabilities, but the final helm must, and can only, be held by humans endowed with legal wisdom and ethical responsibility.