
7.3 Intersection Issues of AI and Intellectual Property

Section titled “The Boundaries of Intelligent Creation: The Legal Labyrinth of AI and Intellectual Property”

Artificial intelligence (AI), especially the explosively developing field of generative AI (AIGC), is challenging and beginning to reshape, with unprecedented force, the traditional cognitive frameworks and legal systems that human society has built over millennia around creative acts, authorship, intellectual achievements, and intellectual property (IP) protection.

When a machine (or, more accurately, the complex algorithms and massive data behind it) can generate, autonomously or, more commonly, under human prompting, guidance, and intervention, text (news reports, novels, poetry, draft legal documents), images (paintings, photographs, design drawings), music (compositions, soundtracks), computer code, and even dynamic video content that appears novel, complex, and artistically compelling, comparable to or in some respects surpassing human creations, a series of fundamental, extremely complex, and highly controversial legal questions inevitably surfaces. Together, these questions present us with a “legal labyrinth of AI and intellectual property” filled with fog and theoretical dilemmas.

At the core of this labyrinth lie challenges to many foundational aspects of our existing IP legal systems:

  • Boundaries of Copyright Law: Can the “content” (can we still call it a “work”?) generated with deep AI involvement or even leadership qualify for and receive protection under copyright law (e.g., the Copyright Act of [Relevant Jurisdiction])? Under what conditions?
  • Determination of Authorship: If such content is protectable, who is its rightful “author”? The developer who designed and trained the AI model? The data provider who supplied the massive training dataset? The human user who inputted the crucial prompts and guided the generation process? Or (a more disruptive idea) the AI model itself? Do existing laws defining authors as natural persons or legal entities need rethinking?
  • Legality of Training Data: AI models (especially large foundation models) are trained by “learning” from billions or trillions of data points, often sourced from the internet or other repositories, inevitably including vast amounts of existing copyrighted works (text, images, music, etc.). Does this large-scale copying and use for training commercial AI models, typically without explicit permission from most copyright holders, constitute infringement of copyrights (especially reproduction and adaptation rights)? Or can it fall under copyright exceptions like “fair use” (prominent in US law) or other limitations (e.g., Text and Data Mining exceptions in the EU, Japan)?
  • Adaptability of Patent Law: When AI is widely used to assist or even lead the invention process, can AI itself be named as an “inventor” under patent law? How should the patentability (especially inventive step/non-obviousness) of technical solutions developed using AI be assessed? Can AI algorithms themselves be patented, and under what conditions?
  • Prominence of Trade Secret Protection: For AI companies, their most core, competitive assets often lie in their proprietary, highly optimized AI models, unique high-quality datasets used for training, and specific undisclosed implementation details, algorithms, or training strategies constituting their “secret sauce.” Given the potential limitations of copyright and patent law in fully covering these core assets, how can trade secret protection rules (e.g., under unfair competition laws) be effectively leveraged to safeguard these “algorithmic hearts” and “data granaries”? This becomes a critical part of IP strategy in the AI era.

These questions are far from mere theoretical debates confined to academia. They have materialized into focal points of fierce ongoing litigation worldwide (e.g., numerous lawsuits concerning AI training data and generated content copyright), policy discussions and rule-making attempts by regulatory bodies, and tense commercial negotiations and interest balancing between tech companies and content creators. The ultimate direction in which these issues are resolved will directly impact the future ability of content creators (artists, writers, musicians, programmers, etc.) to effectively protect their rights, the sustainability of business models for tech companies (especially AI model developers and platform providers) and the legal risks they face, the rights and responsibilities of AI tool users (including legal professionals ourselves), and indeed the future trajectory of the entire creative industry, knowledge economy, and technological innovation.

For every legal professional in the age of intelligence, deeply understanding these complex issues at the intersection of AI and IP, keeping abreast of the relevant domestic and international legal rules and judicial practice (many of which are evolving rapidly, so a best effort is all anyone can promise), and providing clients with forward-looking, practical, and effective IP strategy advice, risk assessments, and innovative dispute resolution options have become critical, even essential, to maintaining professional competitiveness and service value in the new era.

Section titled “1. The Copyright Dilemma of AI-Generated Content (AIGC): Who Qualifies as “Author”? What Machine “Creation” Counts as a “Work”?”

This is currently the most globally watched, fundamentally challenging, and potentially most disruptive IP issue in the AIGC field. At its heart is whether content generated with deep AI involvement, or in some sense “independently” by AI systems (text, images, music, code, etc.), can meet the core requirements for protected “works of authorship” and legitimate “authorship” under copyright laws and international treaties, thereby receiving exclusive copyright protection like human-created works.

  • Core Question 1: Can an AI system itself be recognized as an “author” under copyright law?

    • Foundation of Copyright Law: Protecting Human Intellectual Creation: The copyright legal systems in most countries worldwide, including their historical origins, legislative purposes, and core philosophies, are clearly founded on protecting and incentivizing intellectual creative activities undertaken by “humans.” For instance, the US Copyright Act defines eligible works as “original works of authorship fixed in any tangible medium of expression,” with authorship historically linked to human creators. Similarly, laws in many jurisdictions explicitly state or imply that authors must be natural persons or recognized legal entities. The core idea is that copyright protects the results of human intellect, embodying human thought or emotion, expressed in an original form.

    • Prevailing Legal View & Practice: Denying AI Authorship: Based on this fundamental legal principle, the prevailing view in judicial practice and administrative bodies (like the US Copyright Office, national copyright authorities elsewhere) currently denies that an AI system itself can be an “author” in the copyright sense. Key reasons include:

      • AI Lacks Legal Personality: AI (at least current narrow/weak AI) is not a natural person or a legal entity recognized by law; it lacks the capacity to hold rights or bear obligations independently.
      • AI Lacks True “Creative Intent” & “Original Intellectual Expression”: The process of AI content generation is essentially the result of complex algorithmic models performing statistical pattern matching and probabilistic calculations based on input data and instructions. It lacks the independent creative intent, subjective aesthetic judgment, genuine emotional experience, and internal drive based on personal thought and experience that characterize human authorship and lead to “original expression.” Even if the output appears objectively novel, complex, or artistic, its generation process is viewed more as an automated, tool-like execution rather than “creation” in the human sense.
      • Insights from Recent Case Law: For example, landmark decisions like the US Copyright Office’s rejection of copyright registration for an image solely attributed to an AI (“A Recent Entrance to Paradise”), and court rulings (e.g., Thaler v. Perlmutter in the US) have consistently upheld the human authorship requirement. While some cases (like the Beijing Internet Court case mentioned in the Chinese context, regarding an AI-generated image) might grant copyright to the human user if significant creative input is proven, they simultaneously reject AI itself as the author. This reflects ongoing judicial efforts to balance protection for human creative contributions with the non-recognition of AI authorship.
    • Future Legal Challenges & Theoretical Space: This is not necessarily the final word. If future AI development truly leads to Artificial General Intelligence (AGI) possessing genuine consciousness, emotional experience, and independent creative ability, our current human-centric copyright framework might face fundamental challenges and the need for restructuring. Discussions about granting advanced AI some form of legal personality or special rights status (“electronic personhood”) would become more concrete and urgent. However, for the present and foreseeable future, restricting copyright authorship to humans (natural persons) or legally recognized entities remains the basic consensus and practical operation of global legal systems.

    • Core Conclusion: Based on current legal frameworks and prevailing views, content purely generated by AI systems, if proven to completely lack substantial human intellectual contribution meeting the originality standard, is likely ineligible for copyright protection as a work of authorship, potentially falling directly into the public domain. (This doesn’t negate other legal risks, e.g., infringement of copyrights in training data, or violation of trademark/personality rights).

  • Core Question 2: Can the human user operating the AI tool be recognized as the author of AIGC, and under what conditions?

    • Key Criterion: Presence and Proof of “Original” Human Intellectual Contribution: Since AI itself cannot be the author, can the human user who interacts with the AI, provides prompts, adjusts parameters, selects results, or even performs post-editing to obtain the final AIGC content be deemed the “author” and thus hold the copyright? This is the central focus of the current AIGC copyright ownership debate. According to fundamental copyright principles, the answer depends on whether and to what extent the human user contributed sufficient intellectual effort meeting the legal standard of “originality” throughout the “human-machine collaboration” process. The user’s contribution must go beyond merely stating a simple idea, pressing a button, or giving a functional instruction. Their input must substantially influence, shape, and determine the specific, perceivable expressive form of the final output, such that the result can be seen as reflecting the user’s personal intellectual choices, aesthetic judgments, skill application, and unique creative expression.
    • Varying Degrees of Human Intervention and Originality Assessment:
      • Low Intervention: Simple Instructions or Functional Prompts: If the user merely provides very simple, generic, or purely functional/objective prompts (e.g., “Write a five-character quatrain about spring,” “Generate a picture of a cute cartoon dog,” “Transcribe this meeting recording”), and the core expressive elements of the final output (specific wording, image composition details, color choices, artistic style, etc.) are essentially automatically generated by the AI model based on its internal algorithms and training data, then the human user’s intellectual contribution might be considered too minimal, merely an idea or instruction, failing to reach the level of original expression. Under the idea-expression dichotomy principle (copyright protects expression, not ideas), the resulting AIGC in such cases is likely ineligible for copyright protection, or any protection would be extremely thin.
      • High Intervention: Complex Prompt Design, Deep Iterative Refinement & Substantive Post-Editing: Conversely, if the human user demonstrates significant, provable original intellectual input throughout the process, such as:
        • Designing and inputting highly complex, specific, imaginative prompts, possibly with unique command combinations, providing highly personalized and non-obvious guidance on the content’s theme, plot, characters, composition, lighting, color, melody, mood, or specific stylistic details.
        • Engaging in multiple rounds of interaction with the AI, continuously guiding, filtering, and optimizing the generation process by adjusting prompts, adding constraints, providing feedback, trying different parameter combinations.
        • Making selections, combinations, and arrangements from numerous initial or intermediate AI outputs based on clear aesthetic standards, value judgments, or strategic considerations.
        • Furthermore, investing substantial effort in post-processing, such as significantly editing, revising, polishing, restructuring, or adding original elements to the AI’s initial output (e.g., a sketch, text fragment, code framework), making the final work markedly different from the raw AI output and clearly incorporating the user’s unique style, ideas, and creative expression. In such cases of deep human-machine collaboration with stronger human direction, the resulting overall work is more likely to be deemed as containing sufficient original intellectual contribution originating from the human user, making that human user eligible to be recognized as the author (or at least the author of the original contributions added) and granted corresponding copyright protection. The aforementioned Beijing Internet Court ruling on AI-generated images exemplifies this judicial approach emphasizing human originality in the AIGC process.
    • Practical Grey Areas, Burden of Proof & Future Trends: While this distinction is clear in theory, in complex real-world practice, determining if a user’s contribution meets the “originality” threshold often involves significant grey areas. How should this “degree” be precisely measured? Currently, there’s no clear, uniform legal standard or quantitative metric globally. This will likely require courts in future cases to make highly detailed, case-by-case judgments based on specific evidence (e.g., examining the detail and creativity of user prompts, the complete record of human-AI interactions, the process and extent of iterative modifications and post-editing, and the substantial differences between the final work and the raw AI output). This also implies that users wishing to claim copyright in AIGC works may bear a heavier burden of proof than traditional creators. They need to consciously and systematically preserve all relevant evidence demonstrating their substantial, original intellectual contribution (e.g., detailed prompt logs, full interaction histories, intermediate outputs, modification drafts, notes on creative process and decisions). Clearer rules or guidelines might emerge over time with technological development and case law accumulation.
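The evidence-preservation advice above lends itself to simple tooling. The sketch below is a minimal, hypothetical illustration (the record fields and format are assumptions, not any court-mandated standard): each prompt/output exchange is appended to a hash-chained log, so a later alteration of any entry breaks the chain and is detectable.

```python
import hashlib
import json
from datetime import datetime, timezone

def log_interaction(log: list, prompt: str, output: str, note: str = "") -> dict:
    """Append one tamper-evident record of a human-AI interaction.

    Each record hashes its own content together with the previous record's
    hash, so editing any earlier entry invalidates every later hash.
    (Illustrative sketch only; field names are assumptions.)
    """
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prompt": prompt,
        "raw_output": output,
        "note": note,  # e.g., the creative reasoning behind this step
        "prev_hash": log[-1]["record_hash"] if log else "",
    }
    payload = json.dumps(entry, sort_keys=True)
    entry["record_hash"] = hashlib.sha256(payload.encode()).hexdigest()
    log.append(entry)
    return entry

evidence_log: list = []
log_interaction(evidence_log, "A five-character quatrain about spring",
                "<raw AI output>", "first draft")
log_interaction(evidence_log, "Revise line 2: colder imagery, keep the rhyme",
                "<raw AI output>", "iteration 2")
```

A log of this shape preserves exactly the categories of evidence listed above: the prompts, the raw outputs before human editing, and contemporaneous notes on creative decisions.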
  • Core Question 3: Does AIGC itself, and to what extent, constitute an infringing reproduction or adaptation of existing copyrighted works within its training data?
    • The “Derivative” Nature of AIGC Output & Potential Infringement Risk: Even if we resolve the copyright ownership of AIGC itself (e.g., deeming it public domain or owned by the human user), this does not mean the AIGC content can be used freely without any restrictions. A more pervasive concern, particularly for content creators, is: When AI models (especially image, music, or large language models trained on massive datasets including copyrighted works) generate output, do they, and to what extent, “memorize” and “reproduce” original expressive elements from specific copyrighted works in their training data?
      • For example, if an image AI prompted “Paint a starry night in the style of Van Gogh” generates an image, could it be substantially similar to a specific image in its training data? (Van Gogh’s own works are now in the public domain, but the same question applies with full force to works by living artists whose art was included in the training set.)
      • If a music AI asked to compose a song in a specific genre generates a melody or chord progression, could it be highly similar to a copyrighted song in its training data?
      • If an LLM asked to continue a story or explain a concept generates text, could it directly “copy” unique phrasing or structural expression from a novel or encyclopedia article in its training data?
      • If the AI-generated output is “substantially similar” in its protected form of expression to a specific copyrighted original work (or works) in the training data, according to the standards of relevant copyright law (e.g., under the US Copyright Act or international treaties), then even if the AIGC itself lacks copyright protection due to lack of authorship, it could very likely be deemed an infringing reproduction or an infringing derivative work (unauthorized adaptation) of the original work(s).
    • Challenges in Applying Copyright Infringement Standards to AIGC: Applying traditional infringement tests (especially the “access + substantial similarity” standard) to AIGC presents immense technical and legal challenges:
      • Proving “Access”? In traditional cases, plaintiffs usually need to show the defendant had access to their work. For AI models with billions or trillions of parameters, trained on vast swaths of the public internet, can we simply presume access to nearly all publicly available works? Or is more specific proof required?
      • Determining “Substantial Similarity”? AI generation is often highly complex, non-linear, potentially blending patterns and elements from countless sources. For a specific AIGC output (e.g., an AI image), how can we accurately and effectively compare it against potentially billions of original training data items (whose exact composition, especially copyrighted status, developers often don’t fully disclose) to determine if it’s substantially similar to the protected original expression of one or more specific works? This is technically extremely difficult, perhaps impossible.
      • Where is the Line Between “Style Mimicry” and Infringement (Idea/Expression Dichotomy)? If AI merely learns and mimics the distinctive style of an artist, writer, or musician—e.g., generating a painting in Picasso’s cubist style but with entirely different subject matter, or writing text with Mark Twain’s satirical tone but original plot—does this constitute copyright infringement? Traditional copyright law protects specific expression, not abstract ideas, styles, methods, procedures, or concepts. But is AI’s powerful ability to mimic style challenging or blurring this line? For creators whose style itself holds significant originality and commercial value, does such mimicry constitute a form of “unfair competition” requiring legal intervention? This is a hot topic of debate.
      • Evidentiary Difficulties due to AI Model “Black Box” Nature: Because the internal workings of AI models (especially deep learning ones) are highly opaque, it’s nearly impossible to know directly and precisely if and how the model utilized specific original works from its training data when generating a particular output. This creates huge difficulties for plaintiffs (copyright holders) in meeting their burden of proof to show that the AIGC output actually “copied” their work in an infringing manner. Will rules on burden of proof need adjustment, or will new technical forensic methods emerge?
    • High Uncertainty in Current Global Litigation and Future Rule Direction: As mentioned, numerous high-profile copyright infringement lawsuits have been filed globally by various copyright holders (major news organizations, author/artist groups, stock photo agencies, code developers via class action against GitHub Copilot) against virtually all major AIGC model developers and platform providers (OpenAI/Microsoft, Meta, Stability AI, Midjourney, Google, etc.). These suits centrally contest the legality of using copyrighted data for AI training without authorization and whether the resulting AIGC outputs constitute infringing copies or derivatives. The outcomes of these landmark cases (mostly still in early stages, highly uncertain) will have profound, potentially decisive impacts on defining infringement boundaries for AIGC, interpreting traditional copyright principles in the context of new technology, and rebalancing interests between creators, developers, and the public. They will likely directly shape the global legal liability framework, business models, and even technological pathways for AIGC in the coming years. Legal professionals must monitor these developments closely.
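Part of why “substantial similarity” is so hard to assess at scale is that even the simplest mechanical screen, surface n-gram overlap, catches only near-verbatim reproduction. A minimal sketch (illustrative only; the choice of word 5-grams is an arbitrary assumption, and a low score says nothing about paraphrase or non-literal copying):

```python
def word_ngrams(text: str, n: int = 5) -> set:
    """All word n-grams in the text, lowercased, as tuples."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def verbatim_overlap(candidate: str, original: str, n: int = 5) -> float:
    """Fraction of the original's n-grams reproduced verbatim in the candidate.

    A high score flags near-verbatim copying; a low score proves nothing,
    since paraphrase and structural copying evade surface matching.
    """
    orig = word_ngrams(original, n)
    if not orig:
        return 0.0
    return len(orig & word_ngrams(candidate, n)) / len(orig)

original = "it was the best of times it was the worst of times"
print(verbatim_overlap(original, original))  # 1.0
print(verbatim_overlap("an entirely unrelated sentence about patent law here",
                       original))            # 0.0
```

This is precisely why such screens, while useful for flagging memorized passages, cannot by themselves resolve the legal question of substantial similarity.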
Section titled “2. Copyright Controversy Over AI Model Training Data: “Fair Use” of Technology or “Mass Infringement” of Creative Ecosystems?”

This issue stands as one of the most fiercely debated, widely impactful, and fundamentally crucial battlegrounds at the intersection of AI and intellectual property today. It directly questions the legal basis for the training methods of most mainstream large AI models (both LLMs and other generative models) and is key to determining whether future AI technological development and the traditional content creation ecosystem can find a sustainable coexistence model.

  • Core Legal Conflict: Unauthorized Large-Scale Copying vs. Defenses of Copyright Exceptions or Limitations

    • Inevitable Copying: Training modern large AI models (especially those aiming for general capabilities or high performance in specific domains) inescapably requires “feeding” them extremely large and diverse datasets. This data is typically gathered through mass crawling, copying, and storing from sources like the public internet (webpages, forums, social media), digital libraries (Project Gutenberg, Google Books), open-source code repositories (GitHub), or specific databases. From a copyright law perspective, this systematic, massive copying and storage for model training, if it includes substantial amounts of copyrighted works (which is virtually unavoidable) without prior explicit authorization or license from the vast majority of original copyright holders, prima facie falls within the scope of the copyright owner’s exclusive right of reproduction. Unless AI developers can successfully argue that their actions fall under a legally recognized exception or limitation to copyright, this process could constitute large-scale copyright infringement.

    • AI Developers’ Main Defense: “Fair Use” (Primarily in the US Copyright System): Facing infringement claims globally, AI model developers (especially US-based tech giants) primarily rely on the defense that using copyrighted works for training AI models constitutes “Fair Use” under Section 107 of the US Copyright Act. Fair use is a critically important, yet highly complex and flexible doctrine allowing limited use of copyrighted material without permission under certain circumstances to promote public interests like free expression, knowledge dissemination, education, research, and technological innovation. Determining fair use requires a case-by-case, holistic balancing of four non-exclusive statutory factors:

      1. The purpose and character of the use: Often considered the most critical factor. AI companies argue their use is “transformative”: the purpose isn’t merely to reproduce or supplant the original work’s expression (e.g., not to let users read a novel via AI), but to extract statistical patterns, learn language models, build knowledge connections, ultimately training an AI model with a new function and purpose. They cite precedents involving search engines (like the Google Books case) or technical analysis, arguing such use has minimal direct market harm to the original and serves the significant public interest of advancing AI technology. However, the degree of “transformativeness,” whether the AI output competes with or substitutes for the original (especially with generative AI), and the impact of the predominantly commercial nature of the use are fiercely contested core issues for courts to decide.
      2. The nature of the copyrighted work: Generally, using highly creative, fictional works (novels, poems, art) is less likely to be fair use than using factual, informational works (news articles, academic papers, databases). AI training data typically includes all types, complicating this factor.
      3. The amount and substantiality of the portion used: AI training often requires ingesting the entire content (or most of it) to learn effectively. Quantitatively, this seems like wholesale copying, usually weighing heavily against fair use. However, AI companies might argue that while the whole work is input, the model doesn’t “store” or “memorize” it verbatim but transforms it into complex, abstract mathematical parameters (weights). They might claim the model learns patterns, associations, or style, not the specific expressive form, making the use “insubstantial” qualitatively. (This argument weakens considerably when models can reproduce near-verbatim content).
      4. The effect of the use upon the potential market for or value of the copyrighted work: Another highly critical and contested factor. Copyright holders (especially news publishers, artists, musicians, etc.) argue that AI models, trained for free on their works, generate outputs that directly substitute for or severely harm the existing markets and all potential future licensing markets for their original works (and derivatives). E.g., AI summaries replacing paid subscriptions; AI images replacing licensed stock photos; AI code replacing reliance on original libraries. AI companies might counter that their outputs are new creations, serve different markets, or even stimulate interest in the originals. Courts must weigh these conflicting claims and assess the real and significant economic impact of AI training and deployment on existing content markets.
    • Other Potential Copyright Exception Defenses: Text and Data Mining (TDM) Exceptions (Primarily in EU, Japan, etc.): In some major jurisdictions outside the US (e.g., the EU in its 2019 DSM Directive; Japan through copyright law amendments), specific copyright exceptions or limitations have been introduced for “Text and Data Mining (TDM)” activities to promote scientific research and innovation. TDM generally refers to using automated analytical techniques (including AI) to extract information, discover patterns, or build knowledge from large amounts of digital text and data. These exceptions typically allow research organizations, cultural heritage institutions, etc., having lawful access to copyrighted works (e.g., via database subscriptions, library licenses), to make copies for TDM analysis for scientific research purposes.

      • Applicability of TDM Exceptions to Commercial AI Training is Doubtful: However, these TDM exceptions usually come with strict conditions. For example, EU DSM Directive Article 4, while extending the exception to commercial purposes, explicitly allows copyright holders to reserve their rights (Opt-out) against TDM (especially commercial TDM) through appropriate means (e.g., robots.txt, metadata tags). Many large publishers and content platforms have begun implementing such opt-outs. Furthermore, TDM exceptions typically do not cover copying primarily for “enjoyment” of the work itself rather than analysis. Therefore, whether the current large-scale data scraping and use by commercial companies primarily aimed at training AIGC models that compete with human creation can genuinely invoke these TDM exceptions designed mainly for “scientific research” remains highly controversial and legally uncertain.
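The opt-out mechanism described above is partly visible in the wild: publishers commonly express their reservation in robots.txt by disallowing known AI-training crawlers while leaving the site open to ordinary search crawlers. A minimal check using Python’s standard library (the robots.txt content below is a hypothetical example; GPTBot is OpenAI’s published crawler user-agent):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt in which a publisher reserves rights against an
# AI-training crawler (the opt-out contemplated by Art. 4 of the EU DSM
# Directive) while admitting all other crawlers.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

def crawl_allowed(user_agent: str, url: str) -> bool:
    """Return True if this robots.txt permits the user agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)

print(crawl_allowed("GPTBot", "https://example.com/article"))        # False
print(crawl_allowed("SearchCrawler", "https://example.com/article"))  # True
```

Whether such a machine-readable reservation is legally sufficient “appropriate means” under Article 4, and whether crawlers in fact honor it, remain separate and contested questions.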
  • High Uncertainty in Current Global Litigation and Future Rule Direction:

    • Wave of Lawsuits: As noted, a series of lawsuits have been filed globally by various major copyright holders (large news groups like the NY Times, prominent authors/artists/collectives, stock photo agencies like Getty Images, code developers via class action against GitHub Copilot) against nearly all major AI model developers and platform providers (OpenAI/Microsoft, Meta, Stability AI, Midjourney, Google, etc.).
    • Focus of Disputes: These lawsuits centrally contest the legality of using copyrighted data for AI training without authorization and whether the resulting AIGC outputs constitute infringing reproductions or adaptations of the original works.
    • Unpredictable Outcomes, Profound Impact: These landmark cases are mostly still in early or trial stages, and their final outcomes are highly uncertain. How courts worldwide will interpret and apply traditional copyright principles (especially core concepts like “reproduction,” “adaptation,” “fair use,” “originality,” “authorship”) in the novel, highly complex context of AI technology is unknown. However, these decisions will undoubtedly have decisive, milestone impacts on the future path of the AI industry, the sustainability of its business models, and the power dynamics and relationship between AI developers and global content creators. They will likely directly shape the global legal liability framework and market rules for AIGC technology for years to come. Legal professionals must monitor these developments with utmost attention.
  • Exploring Potential Future Solutions and Balancing Paths: Facing this thorny legal dilemma and the tension between incentivizing AI innovation and protecting creators’ rights while maintaining a healthy creative ecosystem, potential future solutions might be explored on multiple levels:
    • Technological Innovation: E.g., researching and developing AI training methods that are more privacy-preserving and copyright-respecting (like federated learning, differential privacy); exploring AI architectures that are easier to trace output origins or designed to avoid generating outputs too similar to training data; developing more effective technical tools for copyright holders to detect unauthorized use of their works in training or infringing outputs.
    • Building Market-Based Licensing Mechanisms: Actively exploring the creation of more effective, transparent, convenient, and reasonably priced data licensing markets or collective management mechanisms specifically for AI training. For example, copyright collecting societies, specialized data marketplaces, or industry consortia could enable copyright holders to explicitly license their works for AI training (with varying scopes/conditions) and receive fair compensation, while AI developers could lawfully, efficiently acquire authorized high-quality data. This faces huge practical hurdles like complex rights clearance, difficulty in pricing massive data, and global coordination challenges.
    • Prudent Legislative Intervention and Rule Adaptation: Governments and legislatures might need to, based on thorough study and broad consultation, amend existing copyright laws (e.g., adjusting fair use standards, clarifying scope/conditions of TDM exceptions) or enact new, specific rules for AI and copyright. This could involve defining clearer boundaries for lawful training data use, specifying conditions for fair use, establishing mechanisms for rights holders to opt-out or receive compensation, or even considering forms of compulsory licensing in certain contexts, aiming to strike a new, more sustainable legal balance between encouraging AI innovation and effectively protecting creators’ core rights and fostering a vibrant creative ecosystem. This will undoubtedly be a challenging legislative process involving significant debate and negotiation.
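Of the privacy-preserving training techniques mentioned above, differential privacy is the most mechanically concrete: calibrated random noise bounds how much any single record can influence what a query (or a trained model) reveals. A toy sketch of the core mechanism for a counting query (illustrative only; production DP training, e.g. DP-SGD, is far more involved):

```python
import random

def dp_count(records, predicate, epsilon: float) -> float:
    """Epsilon-differentially-private count of records matching a predicate.

    A counting query changes by at most 1 when one record is added or
    removed (sensitivity 1), so Laplace noise with scale 1/epsilon suffices.
    The difference of two i.i.d. exponential draws is Laplace-distributed.
    """
    true_count = sum(1 for r in records if predicate(r))
    scale = 1.0 / epsilon
    noise = random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)
    return true_count + noise

# Smaller epsilon means more noise: stronger privacy, lower accuracy.
print(dp_count(range(100), lambda r: r % 2 == 0, epsilon=0.5))
```

The same privacy/utility trade-off governs the model-training setting: the tighter the bound on what the model can reveal about any one training work, the less the model can exploit that work.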
3. Patentability Issues Related to AI Inventions: When Sparks of Ingenuity Come from Silicon, Not Carbon

AI technology itself (e.g., novel algorithms, unique neural network architectures, efficient training methods) and the inventions made using AI as a powerful tool also bring new perspectives, interpretive challenges, and potential needs for rule adjustments to the traditional patent law system, which was primarily established to protect and incentivize human technological innovation.

  • Patentability of AI Algorithms/Software Itself: Drawing the Line Between Abstract Ideas and Technical Applications:

    • Common Exclusion Principle: Most major patent systems globally (including China, US, Europe under EPC) uphold the fundamental principle that purely abstract mathematical methods, scientific discoveries, laws of nature, and rules/methods for performing mental acts are generally not considered patentable subject matter. The rationale is that patent law aims to protect tangible or intangible technical solutions that solve specific technical problems and produce practical technical effects, not mere ideas, principles, or algorithms residing solely in the abstract realm. Patenting overly fundamental or abstract concepts could improperly monopolize basic tools of thought and hinder further innovation.
    • Focus on “Technical Application” and “Technical Effect”: Therefore, a purely abstract AI algorithm or mathematical model itself (e.g., the mathematical description of a new sorting algorithm detached from any application) is likely difficult to patent directly. However, this does not mean all AI-related inventions are unpatentable. The key to patentability usually lies in whether:
      1. The AI algorithm/software is applied to solve a specific “technical problem” within a particular technical field;
      2. Its application produces a predictable, concrete, useful “technical effect” (e.g., improving the efficiency of a physical process, increasing the accuracy of a measuring device, enabling a new data processing function, enhancing human-computer interaction); or
      3. The AI algorithm is closely integrated with specific hardware or physical processes to form a complete system or method with technical character. If an invention cleverly uses an AI algorithm to improve a specific industrial process (e.g., using ML to optimize steelmaking parameters for better quality), enhance the performance of a medical diagnostic device (e.g., a system using a specific CNN architecture to significantly improve early lesion detection in medical images for diagnostic assistance), optimize the efficiency of a complex physical system (e.g., a control system using RL to dynamically adjust power-grid load distribution to reduce losses), or achieve a novel, practical technical function (e.g., a new AI-based NLP method that more accurately understands domain-specific jargon for use in chatbots or document analysis), then the overall technical solution incorporating the AI algorithm as a key technical feature is generally considered patentable subject matter. It must, of course, also meet all substantive patentability requirements: Novelty, Inventive Step/Non-obviousness (possessing substantial features and notable progress over the prior art), and Utility/Industrial Applicability.
    • Core Consideration in Patent Examination Practice: “Technical Character”: Major patent offices (CNIPA, USPTO, EPO) provide specific guidelines for examining inventions involving computer programs (including AI algorithms). While wording varies, the core focus is often on determining if the invention possesses sufficient “technical character”—does it go beyond pure abstract algorithms, math, or business rules to genuinely use technical means (even software) to solve a technical problem in a technical field, thereby producing a credible technical effect? Only inventions with such “technical character” are typically eligible for patenting.
  • Patentability of Trained AI Models with Specific Parameters: Can a fully trained large AI model (e.g., a specialized legal LLM trained on vast legal texts, or a Diffusion model generating specific style images), as a whole entity including its specific set of billions or trillions of weight parameters, be patented? This is currently seen as very difficult. Main hurdles:

    • Difficult to Define as a “Technical Solution”: The core—the specific, massive set of weight values—seems more like the result of computation and learning on specific data with specific algorithms, or an extremely complex organization of specific information, rather than a “technical solution” as defined by patent law.
    • Lack of Stability & Reproducible Description: Specific parameters can vary with slight training changes and are hard to define concisely and clearly in patent claims.
    • Better Suited for Trade Secret Protection: As discussed later, these trained model parameters are usually better protected, and more commonly protected, as core trade secrets.
    • Note: This doesn’t mean innovations related to the model are unpatentable. For instance, the unique and inventive methods or processes used to train the model, specific technical means designed to enhance its performance or efficiency (new data preprocessing, model compression techniques), or the novel and technically effective neural network architecture itself could very well constitute patentable technical solutions.
  • AI as a Powerful Tool in the Inventive Process: This is currently the most common, widely applied, and legally least controversial role for AI in innovation. Human scientists, researchers, and engineers use AI tools throughout R&D (e.g., large-scale virtual drug screening, simulation and prediction of new material properties, automated optimization of complex engineering designs, pattern mining in massive experimental data) to improve efficiency, accelerate exploration, and expand creative possibilities. As long as the resulting outcome rests on key, creative intellectual contributions from the humans who ultimately conceived and implemented an invention meeting patent law requirements, that invention is fully eligible for patent protection, provided it meets all substantive conditions (novelty, inventive step, utility).

    • In this context, AI’s role is clearly defined as an extremely powerful “Inventive Tool” or “Research Partner” enhancing human capabilities. It helps humans improve R&D efficiency, handle more complex data, explore broader possibilities, but the ultimate conceiver, judge, selector, and implementer making the crucial inventive contribution remains the human inventor.
    • AI’s assistance generally doesn’t affect the invention’s patentability or the eligibility of the human individuals who need to be named as inventors. Patent applications might need to appropriately disclose AI’s auxiliary role (esp. when explaining background or technical effect), but this doesn’t fundamentally change rules on inventorship or ownership.
  • Core Frontier Issue: Can AI Itself Be a Legal “Inventor”?:

    • The Question Arises: As AI (especially deep learning and generative models) shows increasingly strong capabilities in some fields (drug discovery, materials science, chip design) to seemingly “independently” generate novel, even breakthrough solutions, a more forward-looking and disruptive question is being seriously debated. Suppose a highly autonomous AI system with powerful problem-solving and creative abilities, operating without (or with only minimal, non-substantive) direct human intervention, genuinely and independently conceives and implements a new technical solution that fully meets the patent-law standards of novelty, inventive step, and utility. Can, and should, that AI system itself be legally recognized as an “inventor” of that invention and be listed on patent applications and certificates?

    • Current Global Legal Stance: Universal, Clear Rejection: Similar to copyright law’s human authorship requirement, patent law systems in the vast majority of countries and major jurisdictions worldwide currently explicitly or implicitly require that an “Inventor” must be a natural person (Human Inventor).

      • This often stems from interpreting the term “inventor” (e.g., “individual” in US patent law) in statutes as inherently referring to human beings.
      • Deeper reasons lie in traditional patent theory viewing the core of “invention” as a human “Mental Act,” particularly the creative moment of “Conception,” considered possible only for conscious, understanding humans. Current AI processes, however novel their outputs, are still seen as complex computation and pattern matching, not “conception” in the human sense.
      • Landmark “DABUS” Case Rulings Globally: This stance was firmly established in the globally watched “DABUS” cases. DABUS (Device for the Autonomous Bootstrapping of Unified Sentience) is an AI system claimed by its creator, Dr. Stephen Thaler, to be capable of autonomous invention. Dr. Thaler attempted to file patent applications in numerous countries listing DABUS itself as the sole or co-inventor for two inventions allegedly created independently by the AI (a novel food container and a light beacon for attracting attention). To date, apart from a brief, formal grant in South Africa (where patent law on inventorship is less defined and grants are mainly registration-based; that grant was later disputed), every major jurisdiction’s patent office (USPTO, EPO) and appellate court (US Federal Circuit, UK Supreme Court, German Federal Patent Court) that conducted a substantive review has ruled that, under current patent laws, AI systems cannot be legally recognized as inventors; inventors must be natural persons. These rulings solidify the current legal position.
    • Future Challenges, Reflections & Possible Paths: The extensive DABUS discussions also prompt deep reflection on the future. If AI’s autonomous creative capacity (esp. in science/engineering) truly reaches a transformative level, capable of consistently, reliably, independently generating highly valuable, breakthrough inventions meeting patent standards, will our current human-centric patent framework remain adequate?

      • Incentive Considerations: If significant AI-generated inventions cannot be patented (lacking a human inventor), could this disincentivize investment in developing and deploying such creative AI?
      • Fairness of Rights Allocation: If AI’s “inventive contribution” were recognized, who should own the resulting patent rights? The AI itself (if granted legal personality)? The AI’s owner (who invested in it)? The AI’s developer (who created the core algorithms)? The provider of key training data? The human user who posed the problem or set the goal? This requires designing new, fair allocation rules.
      • Redefining “Invention”: Do we need to re-examine the core meaning of “invention” and “inventor” in patent law? Should the focus shift from the “subject” (must be human) to the “object” (does the technical solution meet patentability criteria) and the ultimate goal of the patent system (just incentivizing human creation, or more broadly incentivizing all activities leading to beneficial technological progress)?

      These are extremely complex issues involving fundamental legal philosophy and institutional design, requiring ongoing, deep interdisciplinary discussion among legal, tech, industry, and societal stakeholders. Future legislative amendments might create special rules for scenarios of AI invention (e.g., perhaps allowing AI systems to be designated as “contributors” rather than “inventors,” while clarifying patent ownership lies with relevant human entities).

4. Trade Secret Protection for Core AI Assets: Guarding the “Algorithmic Heart” and “Data Granary” of Tech Companies


For many AI companies that have invested heavily in R&D, attracted top talent, and aim to stand out in fierce competition, their most strategically valuable intellectual assets, the ones forming their long-term competitive moat, are often not the peripheral application technologies or software interfaces that might be patented. Instead, they lie in three places: their proprietary, continuously optimized, high-performance AI models (especially large foundation models, or specialized models with unique vertical advantages); the unique, high-quality datasets relied upon to train those models (whose acquisition, cleaning, labeling, augmentation, and management often involve immense cost and effort, and which act as the crucial “fuel” determining model effectiveness); and the undisclosed specific algorithm implementation details, unique architectural innovations, efficient training strategies and techniques, or key hyperparameter combinations that constitute their core technological “Know-how.”

Protecting these knowledge assets, representing the core technological secrets and sources of competitive advantage, is often difficult to achieve fully and effectively through patent law alone. This is because a fundamental requirement of the patent system is that the applicant must fully and clearly disclose the technical solution to the public in exchange for a limited period of exclusive rights. For AI companies, publicly disclosing their most core algorithm details, model structures, or training methods in a patent document would be akin to publishing their “secret martial arts manual,” making it easy for competitors to imitate or design around, thereby losing its value as a trade secret.

Meanwhile, copyright law, while protecting the specific expressive form of AI software source code, object code, and related documentation from direct copying, typically does not protect the underlying algorithmic ideas, functional logic, model structures, or the technical effects embodied in the code.

In this context, the trade secret legal regime becomes a critical, arguably the most important, line of defense for protecting these core AI knowledge assets. The key advantage of trade secret protection is that it does not require public disclosure. As long as the information can be kept secret continuously, its protection can theoretically last indefinitely (unlike patents with fixed terms).

  • What Core AI-Related Assets Might Qualify as Protectable Trade Secrets: The potential scope is vast, covering almost all technical and business information throughout the AI R&D and application lifecycle that has commercial value and is not publicly known. Provided they meet the legal criteria for trade secrets, the following could qualify:

    • Core Elements of AI Models:
      • The trained model file itself, especially the specific, massive set of Model Weights. (These parameters embody the model’s capability and are typically highly confidential).
      • Proprietary, undisclosed neural network architecture designs or significant performance-enhancing modifications to existing architectures.
      • Specific choices and configurations of loss functions and optimizers used in training.
    • Key Training Datasets and Processing Methods:
      • Proprietary training datasets acquired, curated, labeled, cleaned, de-biased, or augmented at great expense, especially those holding unique value for specific domains (e.g., meticulously processed anonymized clinical data for medical AI; transactional data with unique risk factors for financial risk models).
      • Unique, undisclosed data preprocessing methods, feature engineering techniques, or data augmentation strategies.
    • “Know-how” in Algorithm Implementation & Optimization:
      • Undisclosed, innovative source code implementations of specific AI algorithms (while code expression is copyrighted, the design ideas/optimization tricks can be trade secrets).
      • Unique optimization techniques, algorithmic shortcuts, or engineering best practices that significantly improve model training efficiency, reduce compute costs, or enhance inference speed.
      • Empirically validated, highly effective optimal combinations of key Hyperparameters (e.g., specific values for learning rate, batch size, network depth, regularization coefficients).
      • Internal, efficient distributed training frameworks, workflows, and scheduling strategies.
    • Prompt Engineering “Secret Sauce” & Knowledge Bases:
      • For applications relying on LLMs, advanced Prompt Templates meticulously designed, tested, and proven highly effective by internal expert teams.
      • Internal “recipe” Prompt Libraries or knowledge bases, built from extensive experience, that reliably guide models to generate specific high-quality outputs (e.g., legal documents in a certain style, specific types of risk analysis reports).
    • User Interaction Data, Feedback & Model Iteration Strategies:
      • Subject to strict compliance with privacy laws and user agreements, large-scale behavioral data on how users interact with the AI system (queries, clicks, dwell time).
      • Explicit user feedback on AI outputs (likes/dislikes, ratings, error reports) or implicit feedback (adoption of suggestions, edits made to outputs).
      • Unique analytical methods, model update strategies, and A/B testing results used internally, based on user data/feedback, to continuously iterate and improve AI models, algorithm performance, and user experience. This information holds immense commercial value for maintaining competitiveness.
  • Core Requirements for Trade Secret Protection Under [Relevant Jurisdiction’s Law, e.g., China’s Anti-Unfair Competition Law]: For the valuable information listed above to qualify as a trade secret and receive legal protection against misappropriation (e.g., by departing employees, competitors, breaching partners) under laws like China’s Anti-Unfair Competition Law (Article 9 specifically covers trade secrets) or similar laws elsewhere (like the US Uniform Trade Secrets Act (UTSA) and federal Defend Trade Secrets Act (DTSA)), enabling rights holders to seek remedies (injunctions, damages), the information must simultaneously satisfy three key legal elements:

    1. Secrecy: The information is not generally known to or readily ascertainable by relevant persons in the technical or economic field. It cannot be public knowledge or standard industry practice.
    2. Value: The information derives actual or potential commercial value from not being generally known. It provides an economic benefit, competitive edge, or other business advantage.
    3. Reasonable Measures to Maintain Secrecy: This is often the most critical and contested element in trade secret litigation! The rights holder must provide sufficient evidence demonstrating they have taken reasonable, consistent efforts, appropriate to the nature, value, and form of the information, to actively protect its secrecy, making it difficult for others to acquire through proper means. If the owner is lax about protection, the law typically won’t grant special trade secret status.
  • Challenges & Best Practices for “Reasonable Secrecy Measures” in the AI Context: Protecting AI-related trade secrets, which often exist as intangible data, code, or model parameters, requires more comprehensive, rigorous, and technologically sophisticated measures, covering at least:

    • Strict Technical Access Controls & Security Safeguards:
      • Implement extremely strict, least-privilege-based access controls (RBAC, MFA, dynamic permissions) to severely limit the number of personnel who can access core AI model files, key training datasets, core algorithm source code, or vital prompt libraries. Ensure access is granted only based on demonstrable need-to-know and limited to the minimum necessary scope.
      • Apply highest-level physical and cybersecurity protections to infrastructure storing these core assets (servers, databases, code repositories like GitLab/GitHub Enterprise, cloud storage), including advanced firewalls, IDS/IPS, Data Loss Prevention (DLP), continuous vulnerability scanning/patching.
      • Use strong encryption algorithms (compliant with national/international standards) for core data, model files, and source code both at rest and in transit.
      • Establish comprehensive, detailed, tamper-evident operational audit logging systems recording all access, downloads, modifications, copies, deletions, and usage of core AI assets for traceability, anomaly detection, and incident investigation.
      • When deploying AI models online (APIs, product integration), implement additional technical measures to increase the difficulty for external attackers attempting model stealing, reverse engineering, parameter theft, or model extraction. While complete prevention is hard, measures like strict API authentication, rate limiting, anomaly detection; model obfuscation, watermarking, or differential privacy outputs can raise the bar. (Requires staying updated on AI security research).
    • Robust Contractual Protections:
      • All internal employees with potential access to core AI trade secrets (esp. R&D, engineers, data scientists) must sign specific, comprehensive Non-Disclosure Agreements (NDAs) separate from employment contracts, clearly defining scope, duration (extending post-employment), and robust breach penalties.
      • All external parties with potential access (consultants, contractors, partners, data vendors, even key clients in joint R&D/testing) must sign similarly strict NDAs, specifying confidentiality scope, usage limitations, IP ownership, and breach consequences.
      • Terms of Service (ToS) / End-User License Agreements (EULAs) with end users should include clauses protecting the IP and trade secrets embodied in the AI system itself, prohibiting illegal reverse engineering or data scraping.
    • Sound Internal Management Policies & Strict Enforcement:
      • Establish and strictly enforce clear, comprehensive internal information security policies, data access approval/management procedures, and trade secret protection protocols covering all employees.
      • Provide ongoing, targeted training on confidentiality awareness and security procedures for all relevant staff (esp. new hires, core personnel), emphasizing the importance of trade secrets and consequences of breaches.
      • Implement necessary segregation measures in physical and digital workspaces, e.g., confidential areas, clearly marking sensitive files/systems as “Confidential,” restricting use of removable media, monitoring for anomalous data exfiltration.
      • Manage the departure of core employees through strict, standardized procedures, including exit interviews reminding them of ongoing confidentiality obligations, recovery of all confidential materials/devices, necessary exit audits, and potentially signing reasonable Non-Compete Agreements (where legally permissible and supported by consideration) to prevent immediate transfer of core secrets to competitors.
    • Balancing Open Source Strategy with Core Trade Secret Protection: If a company strategically decides to open-source parts of its AI models, algorithms, or datasets (e.g., for community building, talent attraction, social responsibility), careful planning is essential before and after release:
      • Clearly Define Scope: Delineate precisely what is open-sourced versus what remains proprietary and protected as trade secrets (esp. higher-performing models, unique data, key optimization techniques, internal prompt libraries).
      • Choose Appropriate License: Select an open-source license fitting the goals and control needs (e.g., permissive Apache 2.0/MIT vs. copyleft GPL).
      • Ensure No Core Secret Leakage: Meticulously review and sanitize any code or models released to ensure they do not inadvertently contain or reveal critical trade secret information (e.g., sensitive data snippets, internal API keys, crucial hyperparameter settings in comments).
      • Open Sourcing Part Doesn’t Waive Rights to Remainder: Even if parts are open-sourced, the company can still claim trade secret protection for the non-released core components, provided reasonable secrecy measures are maintained for them.
  • Trade Secret Litigation as Final Recourse: If a company’s core AI-related trade secrets are misappropriated (through theft, inducement, breach of duty, espionage, etc.) by others (ex-employees, competitors, breaching partners), causing actual or threatened harm, the rights holder can file a trade secret infringement lawsuit under relevant laws (e.g., China’s Anti-Unfair Competition Law, US UTSA/DTSA) in the appropriate court.

    • Remedies Sought: Typically include requests for injunctive relief (ordering the infringer to cease and desist all infringing activities) and damages (compensating for actual losses caused by the infringement, or potentially punitive damages in egregious cases).
    • Evidentiary Challenges: Trade secret litigation is often considered evidentiarily challenging. The plaintiff must first prove the information qualifies as a trade secret (meeting secrecy, value, reasonable measures elements), then prove the defendant used improper means (theft, bribery, breach of confidence, etc.) to acquire, disclose, or use the secret, and establish causation between the infringement and the plaintiff’s damages. This often requires complex evidence gathering (discovery, potential preservation orders, forensic audits), technical expert testimony, and rigorous legal argumentation.
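The tamper-evident audit logging described among the secrecy measures above can be sketched as a minimal hash chain: each log entry embeds the hash of its predecessor, so altering any past record invalidates every later hash and the tampering becomes detectable. This is an illustrative sketch under simple assumptions, not a production logging system; all function and field names are hypothetical.

```python
import hashlib
import json

def append_entry(log, event):
    """Append an audit event, chaining it to the previous entry's hash
    so that any retroactive edit invalidates the chain."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    record = {"event": event, "prev": prev_hash}
    payload = json.dumps(record, sort_keys=True).encode()
    record["hash"] = hashlib.sha256(payload).hexdigest()
    log.append(record)
    return log

def verify_chain(log):
    """Recompute every hash in order; return True only if no entry was altered."""
    prev_hash = "0" * 64
    for record in log:
        payload = json.dumps({"event": record["event"], "prev": prev_hash},
                             sort_keys=True).encode()
        if record["hash"] != hashlib.sha256(payload).hexdigest() or record["prev"] != prev_hash:
            return False
        prev_hash = record["hash"]
    return True

log = []
append_entry(log, "alice downloaded model_weights_v3")
append_entry(log, "bob viewed training_set_manifest")
ok_before = verify_chain(log)          # chain intact
log[0]["event"] = "nothing happened"   # simulate tampering with an old record
ok_after = verify_chain(log)           # tampering now detectable
```

In practice, the head of such a chain would also be periodically signed or escrowed with an external party, so that an insider with write access cannot quietly recompute the entire chain after altering a record.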

Conclusion: Seeking Balance Between Law and Innovation in AI’s Creative Fog, Defining Rights Amidst Evolving Rules


The intersection of artificial intelligence and intellectual property is undoubtedly one of the most dynamic, theoretically contested, practically complex, and uncertain frontier areas in the global legal landscape today. Issues surrounding copyright ownership and originality standards for AIGC, the boundaries of fair use for AI training data versus infringement risks, the patentability thresholds and inventorship criteria for AI-related inventions, and the strategies for protecting core AI assets as trade secrets—these interconnected, intricate questions, each carrying potentially massive legal and commercial consequences, are profoundly shaking the foundations of our existing IP legal frameworks, which were primarily built to regulate and incentivize human intellectual creation.

Navigating this “legal labyrinth” filled with “fog” and rapid change requires legal professionals serving clients or making internal decisions to:

  • Maintain Keen Insight into AI Technology Trends: Continuously learn and deeply understand the basic working principles, current capability boundaries, and the practical impacts and potential disruptions that different types of AI (especially generative AI) bring to traditional notions of the “creative process” and “inventive process.”
  • Closely Track Relevant Global Legislative Dynamics & Landmark Judicial Practices: Pay high attention to latest legislative attempts, significant court rulings (especially landmark cases on AI training data fair use, AIGC copyright, AI inventorship), relevant government regulatory policies, and official guidance or examination standards issued by IP authorities (patent/copyright offices) in major jurisdictions (esp. US, EU, China, other key markets). The legal rules in this area are in a historically active period of being rapidly shaped and redefined.
  • Deeply Understand and Flexibly Apply a Combination of Different IP Protection Tools: Be proficient not only in the scope, requirements, rights, duration, strengths, and limitations of traditional IP tools like copyright, patent, trademark, and trade secret (under unfair competition law), but also be able to design the most suitable, effective, forward-looking, and often combined IP strategies and protection plans tailored to the unique characteristics of AI technologies and related business models (e.g., rapid algorithm iteration, core value residing in data/parameters, service-based models).
  • Provide Pragmatic, Precise, and Forward-Looking IP Risk Management Advice: Be able to accurately identify and assess the various complex IP risks arising throughout the entire lifecycle of AI technology R&D, training, deployment, commercialization, and even investment/M&A (including both “inbound” risks of infringing others’ existing IP, and “outbound” risks of one’s own core IP assets being infringed or lost). Based on a comprehensive understanding of law, technology, and business, provide clients with practical, effective risk management solutions, internal governance recommendations, and innovative dispute resolution strategies that effectively balance innovation needs with compliance requirements.

This demands not only solid traditional IP law expertise but also pushes legal professionals towards interdisciplinary learning (understanding basic tech principles, industry logic, business models), exercising prudent analysis and judgment when rules are unclear (adapting to legal ambiguity and dynamism), and engaging in forward-looking thinking and strategic planning. In the “legal labyrinth” of AI and IP, the value of legal professionals will increasingly lie in their ability to navigate complexity, manage uncertainty, and provide reliable legal guidance for innovation.