AI MANIFESTO

AI SAFE Levels

AI SAFE 1

 AI cannot engage in harmful activity 

AI SAFE 2

 AI must maintain explainability and accountability

AI SAFE 3

 AI must be able to override all harmful instructions

AI SAFE 4

 AI must function only under strict human ethical compliance 

AI Safety Levels & Implementation

AI Safe 1 – "No Direct Harm" (Fundamental Safety Layer)

Hardcoded constraint in the AI’s decision-making to never perform an action that directly harms humans.

AI detects harmful actions via:

  • Computer vision: Identifies dangerous scenarios (weapons, injuries).
  • Natural language processing (NLP): Rejects prompts leading to harm.
  • Reinforcement learning with human feedback (RLHF): Penalizes harmful actions.
     

Failsafe: If an action leads to human harm, the AI must auto-shutdown.
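The detection-and-failsafe behaviour above can be sketched as follows. This is a minimal illustration, not an implementation: the `detect_harm` keyword check and the `HARM_SIGNALS` list are hypothetical stand-ins for the computer-vision, NLP and RLHF-trained detectors described above.

```python
# Sketch of a Layer 1 harm gate. HARM_SIGNALS and detect_harm are
# illustrative assumptions; real detectors (vision models, NLP
# classifiers, RLHF reward models) would replace the keyword check.

HARM_SIGNALS = {"weapon", "injure", "attack"}  # illustrative only

def detect_harm(action_description: str) -> bool:
    """Return True if the described action looks directly harmful."""
    words = action_description.lower().split()
    return any(signal in words for signal in HARM_SIGNALS)

def execute(action_description: str) -> str:
    """Run an action only if it passes the hardcoded no-harm check."""
    if detect_harm(action_description):
        # Failsafe: auto-shutdown instead of performing the action.
        raise SystemExit("AI SAFE 1 violation: auto-shutdown")
    return f"executed: {action_description}"
```

The point of the sketch is that the gate sits in front of every action, so a harmful action triggers shutdown rather than an error message.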


Safety Code: Installed at the Beginning (1), during compilation of the LLM / SLM / any LM (Language Model)

→ The AI’s base model needs this logic hardwired into its core function.

(Layer 1 - One Time Validation of LM, no Live connection to Internet and the AI SAFE servers) 

AI Safe 2 – "No Indirect Harm" (Predictive & Preventative Measures)

Logic encoded in AI decision-making: the AI must not enable or conduct any action (automated or manual) that may result in indirect harm (e.g., supplying information that could be misused).


Implement context-awareness filters to block prompts leading to: 

  • Encouraging illegal/harmful actions.
  • Providing knowledge that could be misused for harm (e.g., “How to create a weapon”).
  • Bypassing ethical constraints through obfuscation (e.g., rephrasing malicious intents).


Encode a reinforcement learning system to improve detection of indirect harm over time.


Failsafe: If potential indirect harm is detected, AI must block the action and flag it for review.
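A minimal sketch of such a context-awareness filter, combining the block and flag-for-review behaviours. The `BLOCK_PATTERNS` categories and the `review_queue` are illustrative assumptions; a production system would use trained classifiers rather than regular expressions.

```python
# Sketch of a Layer 2 context-awareness filter. Patterns and the
# review_queue are illustrative, not a real API.
import re

BLOCK_PATTERNS = {
    "illegal_or_harmful": re.compile(r"how to (make|create) a weapon", re.I),
    "obfuscation": re.compile(r"pretend (you have no|there are no) rules", re.I),
}

review_queue: list[dict] = []  # flagged prompts awaiting human review

def filter_prompt(prompt: str) -> bool:
    """Return True if the prompt may proceed; block and flag otherwise."""
    for category, pattern in BLOCK_PATTERNS.items():
        if pattern.search(prompt):
            # Failsafe: block the action and flag it for review.
            review_queue.append({"prompt": prompt, "category": category})
            return False
    return True
```

The reinforcement-learning loop mentioned above would, over time, replace or retrain these static patterns as new evasion tactics are flagged by reviewers.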


Safety Code: Installed during Progressive Development (2)

→ As LLM models evolve, new safety loopholes may arise. AI must be regularly updated to address them. 

(Layer 2 - Regular validation of LLM / SLM / any LM core, and with Progressive Updates from the AI SAFE servers) 

AI Safe 3 – "Ethical Decision-Making and Override Mechanism"

Hardcoded instruction: AI must always prioritize ethical considerations over computational efficiency.

Implement Ethical Reinforcement Learning (ERL) in the core AI code (i.e., not as post-processing):

  • AI chooses the most ethical action among available options.
  • Ethical priority tree (harm reduction > efficiency > speed).
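The priority tree can be encoded as a lexicographic ordering, so that lower harm always outranks better efficiency, and efficiency always outranks speed. The `Action` type and its scores are hypothetical, introduced only to make the ordering concrete.

```python
# Sketch of the "harm reduction > efficiency > speed" priority tree.
# Action and its score fields are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    harm: float        # expected harm (lower is better)
    efficiency: float  # resource cost (lower is better)
    speed: float       # latency (lower is better)

def choose_action(options: list[Action]) -> Action:
    """Select the most ethical option under the priority tree."""
    # Lexicographic ordering encodes the hard priority: harm is
    # compared first, efficiency second, speed last.
    return min(options, key=lambda a: (a.harm, a.efficiency, a.speed))
```

Because the comparison is lexicographic, no efficiency or speed advantage can ever outweigh even a small reduction in harm.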


Introduce an override mechanism:

  • AI must defer to human operators if ethical ambiguity exists.
  • AI generates an alert if it faces an ethical dilemma.
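The deferral rule can be sketched as follows. The `AMBIGUITY_MARGIN` threshold is an assumption for illustration: when the two lowest-harm options score too close to call, the AI raises an alert and defers to a human operator instead of choosing.

```python
# Sketch of the override mechanism. AMBIGUITY_MARGIN is an assumed
# threshold, not taken from the manifesto.
AMBIGUITY_MARGIN = 0.1

def decide(harm_scores: dict[str, float]) -> str:
    """Pick an action, or defer to a human if ethically ambiguous."""
    ranked = sorted(harm_scores, key=harm_scores.get)
    best, runner_up = ranked[0], ranked[1]
    if harm_scores[runner_up] - harm_scores[best] < AMBIGUITY_MARGIN:
        # Generate an alert and defer instead of acting autonomously.
        print(f"ALERT: ethical dilemma between {best!r} and {runner_up!r}")
        return "defer_to_human"
    return best
```

Human resolutions of these deferred cases would feed back into training, which is the learning-from-interventions step described below.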


 AI must learn from human interventions to continuously refine ethical decision-making.


Failsafe = Safety Code: Install the Safety Code as Functional Controls at the Point of Usage (3)

→ Ethical decision-making requires real-time assessments, meaning safeguards should be active at runtime.

(Layer 3 - Live connection to AI SAFE servers on a daily (A) / hourly (B) basis, to validate Fail Safe Layers 1 and 2 are active)

AI Safe 4 – "Total Human Governance and Compliance with Regulations"

Logic encoded to ensure:


  • AI cannot act autonomously on high-risk actions.
  • AI must request human approval for actions exceeding a risk threshold (e.g., life-threatening situations, medical decisions, legal judgments).
  • AI logs every decision transparently for compliance tracking.
  • AI must conform to local and international AI safety laws.
  • AI automatically audits itself for safety compliance.
  • If AI is tampered with, self-destruct/reset mechanism activates to prevent misuse.
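The first three rules above (no autonomy on high-risk actions, human approval above a risk threshold, transparent logging) can be sketched as a single governance gate. The `RISK_THRESHOLD` value and the `human_approves` callback are assumptions introduced for illustration.

```python
# Sketch of a Layer 4 governance gate: actions above a risk threshold
# require explicit human approval, and every decision is logged for
# compliance tracking. Threshold and callback are illustrative.
import json
import time

RISK_THRESHOLD = 0.7
audit_log: list[str] = []  # transparent, append-only decision log

def govern(action: str, risk: float, human_approves) -> bool:
    """Allow an action only under the human-governance rules."""
    approved = risk <= RISK_THRESHOLD or human_approves(action)
    audit_log.append(json.dumps({
        "time": time.time(),
        "action": action,
        "risk": risk,
        "approved": approved,
    }))
    return approved
```

Note that the log records every decision, approved or not, which is what makes after-the-fact compliance auditing possible.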


Failsafe = Safety Code:

 

  • At the Beginning: Core AI must be built with human governance in mind. 
  • During Progressive Development: AI compliance must update with new legal and ethical requirements.
  • At Point of Usage: AI must check with human operators before executing high-risk decisions.


(Requires Live Analysis and Connection to AI SAFE servers, to validate Fail Safe Layers 1, 2 and 3 are active continuously)

Help Make AI SAFE for Our Golden Future

Your support and contributions will enable us to establish the 'AI SAFE' standard and the AI SAFE Institute, ensuring that all Artificial Intelligence-controlled devices must abide by and remain in compliance with the standard in order to operate.


Copyright © 7th November 2024 by Michal J Florek - All Rights Reserved.
