1. Introduction

The development of Artificial General Intelligence (AGI) is widely considered inevitable. However, the actions of a self-aware AGI are uncertain, with significant research suggesting a non-trivial likelihood of hostility towards humans. This paper addresses the timely and critical research topic of AGI containment. While existing work explores strategies like safe language semantics and sandboxing, it is often bounded by its originating field. This study constructs a foundational domain ontology to describe the necessary elements for future AGI containment technologies, situating the problem within the comprehensive framework of cyber science.

2. Background & Motivation

AGI denotes AI with intelligence at or beyond human capability, operating across dynamic, general scopes. As such, it poses a more direct and sustained danger than narrow AI.

2.1 The AGI Containment Problem

Following Babcock, Kramar, and Yampolskiy (2016), the paper frames containment as requiring a combination of traditional cybersecurity technologies. It also recognizes the existential risks articulated by thinkers such as Nick Bostrom, making containment a paramount safety concern.

2.2 Limitations of Traditional Cybersecurity

Traditional paradigms (firewalls, intrusion detection) are recognized as having limited capacity to address the unique, adaptive threat of a superintelligent AGI. Their reactive nature is ill-suited for a proactive, general intelligence.

2.3 The Cyber Science Perspective

The paper pivots to cyber science, a nascent field offering a more comprehensive knowledge context. It leverages Kott's (2015) definition of cyber operations as involving malicious software, drawing a parallel in which an AGI and its containment apparatus could each be viewed as "malicious" from the opposing perspective. This duality underscores the need for a structured ontology to organize the relevant constructs.

3. Methodology: Ontology Development

The ontology was developed through a systematic analysis of existing academic literature on AGI, cybersecurity, and containment strategies.

3.1 Ontology Structure & Levels

A hierarchical, domain-specific ontology was built to categorize and relate core concepts. The development process involved identifying key entities, properties, and relationships from the literature.

3.2 Core Constructs & Agent Objects

The analysis distilled fundamental constructs necessary to model the AGI containment scenario, leading to the identification of three primary agent objects.

4. The Proposed Ontology

The core contribution is a single domain ontology designed to provide a common vocabulary and conceptual framework for AGI containment research.

4.1 Five-Level Structure

The ontology is organized into five distinct hierarchical levels, containing a total of 32 codes, each with an associated descriptor. These levels logically progress from abstract foundational concepts to concrete implementation mechanisms.

Ontology Summary

  • Levels: 5
  • Total Codes: 32
  • Primary Agent Objects: 3 (Humans, AGI, Cyber World)

4.2 Key Relationships & Diagrams

The paper includes ontology diagrams that illustrate the intended relationships between the identified constructs, such as "contains," "monitors," "threatens," and "operates within."
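These named relationships can be treated as labeled edges in a directed graph over the three agent objects. The sketch below is a hypothetical encoding for illustration; the edge set is assumed, not taken from the paper's diagrams.

```python
# Hypothetical sketch: the four named relationships modeled as a directed
# graph over the paper's three agent objects. Edge assignments are
# illustrative assumptions, not the paper's actual diagram.
RELATIONSHIPS = {
    ("CyberWorld", "AGI"): "contains",
    ("Human", "AGI"): "monitors",
    ("AGI", "Human"): "threatens",
    ("AGI", "CyberWorld"): "operates within",
}

def relations_from(agent):
    """Return (target, relation) pairs whose source is `agent`."""
    return [(dst, rel) for (src, dst), rel in RELATIONSHIPS.items() if src == agent]
```

For example, `relations_from("AGI")` yields the outgoing edges "threatens" (toward Human) and "operates within" (toward CyberWorld), making the directionality of each relationship explicit.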

4.3 Novel Agent Objects: Humans, AGI, Cyber World

A key insight is the formal identification of three novel agent objects critical for modeling containment:

  • Humans: The defenders, policymakers, and potential targets.
  • AGI: The contained entity, possessing general intelligence and agency.
  • Cyber World: The digital environment and infrastructure where the interaction and containment occur.

This tripartite model is essential for moving beyond simple human-vs-AI narratives to a more nuanced systemic view.

5. Critical Gaps Addressed

The authors posit that their work addresses three significant gaps in the field:

  1. Identifying and arranging fundamental constructs: Provides a structured vocabulary.
  2. Situating AGI containment within cyber science: Moves the discussion from isolated tech fixes to a systemic discipline.
  3. Developing scientific rigor: Lays groundwork for testable hypotheses and formal models.

6. Analyst's Perspective: Core Insight, Logical Flow, Strengths & Flaws, Actionable Insights

Core Insight: The paper's most valuable contribution isn't a new containment algorithm, but a crucial meta-framework. It correctly diagnoses that the AGI containment debate is mired in ad-hoc, field-specific solutions (CS, philosophy, security) and lacks a unified language. By proposing a cyber-science ontology, it attempts to build the conceptual plumbing necessary for rigorous, interdisciplinary research. This aligns with lessons from mature fields; for instance, the development of the STRIPS planning language was pivotal for AI planning research, providing a common ground for problem formulation and solution comparison.

Logical Flow: The argument is sound: 1) AGI risk is real and containment is needed. 2) Current cybersecurity is insufficient (a well-taken point, echoed in critiques of ML security by Papernot et al.). 3) Therefore, we need a broader foundation—enter Cyber Science. 4) To build within this foundation, we first need a structured ontology to define our terms and relationships. The flow from problem identification to proposed foundational solution is clear and logical.

Strengths & Flaws: The strength is its prescient focus on systematization. The tripartite agent model (Human/AGI/Cyber World) is elegant and necessary. However, the paper's major flaw is its extreme abstraction. It presents the ontology's existence (5 levels, 32 codes) as the result, without publishing the ontology itself. What are the 32 codes? What are the five levels? This is akin to announcing a new programming language syntax but not releasing the grammar. Without this detail, the work remains a promissory note, limiting immediate utility. Furthermore, while citing Kott's cyber science definition is clever, it risks stretching the "malicious software" analogy too thin for a superintelligent agent.

Actionable Insights: For researchers: The immediate task is to operationalize this ontology. Publish the full code set and descriptors. Use it to formally model specific containment scenarios (e.g., an AGI attempting to exfiltrate via a side-channel) and identify gaps in existing strategies. For practitioners and policymakers: This paper underscores that AGI safety is not just an AI problem but a cyber-physical systems security problem on a cosmic scale. Investment should flow not only into AI alignment research but also into foundational work on verifiable computing, formal methods for complex systems, and cyber-resilient architectures, as championed by programs like DARPA's Guaranteeing AI Robustness against Deception (GARD).

7. Technical Details & Mathematical Framework

While the PDF does not elaborate on mathematical formalisms, a robust ontology for AGI containment should be expressible in a logic-based framework. A potential formalization using description logic or first-order logic could define agents, actions, and containment relations.

Example Formal Snippet (Conceptual):
Let $A$ be the set of Agents, where $A = \{Human, AGI, CyberEntity\}$.
Let $S$ be the set of States of the Cyber World.
Let $Cap(x)$ denote the capabilities of agent $x$.
A core containment axiom might be:
$\forall s \in S:\ \text{ContainmentProtocol}(s) \rightarrow \text{Limits}(Cap(\text{AGI}), s)$
This states that whenever a containment protocol is active in a state, the AGI's capabilities are limited in that state.
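Over a finite, explicitly enumerated state set, an axiom of this shape becomes directly checkable. The sketch below is a conceptual illustration under that assumption; the predicate names are invented for the example, not drawn from the paper.

```python
# Hypothetical encoding of the containment axiom as a checkable predicate
# over an explicit (finite) state set. Names are illustrative assumptions.
def axiom_holds(states, containment_protocol, limits_agi_capabilities):
    """True iff, in every state where the containment protocol is active,
    the AGI's capabilities are limited in that state."""
    return all(
        limits_agi_capabilities(s)
        for s in states
        if containment_protocol(s)
    )

# Toy example: states are integers; the protocol is active on even states,
# and capabilities happen to be limited exactly on even states.
assert axiom_holds(range(10), lambda s: s % 2 == 0, lambda s: s % 2 == 0)
```

Note that the implication is vacuously satisfied in states where the protocol is inactive, mirroring the material conditional in the axiom.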

The ontology likely implies a state-transition model where actions by AGI or human defenders change the system state, and containment is a property to be verified across possible transitions, akin to model checking in computer security.
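A minimal sketch of that idea, in the spirit of explicit-state model checking: containment is an invariant verified over every reachable state of a transition system. The transition system and property below are assumed toy examples, not content from the paper.

```python
from collections import deque

# Minimal sketch (assumed, not from the paper): containment treated as an
# invariant checked over all states reachable from an initial state, akin
# to explicit-state model checking.
def invariant_holds_everywhere(initial, successors, contained):
    """Breadth-first search over reachable states; returns False as soon
    as any reachable state violates the containment property."""
    seen, frontier = {initial}, deque([initial])
    while frontier:
        state = frontier.popleft()
        if not contained(state):
            return False
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return True

# Toy system: the state is an AGI "privilege level"; each step may raise
# it by 0 or 1, but a sandbox cap stops escalation at level 2.
# Containment property: the level never reaches 3.
ok = invariant_holds_everywhere(
    0,
    lambda s: [min(s + d, 2) for d in (0, 1)],
    lambda s: s < 3,
)
```

Here `ok` is true because the cap makes level 2 an absorbing bound; removing the cap would let the search discover a violating state and report the containment failure.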

8. Analysis Framework: Example Case Study

Scenario: Analyzing a proposed "AI Sandbox" using the Pittman & Crosby ontology.

Steps:

  1. Map Entities:
    • Agent (AGI): The AI model in the sandbox.
    • Agent (Human): The research team operating the sandbox.
    • Agent (Cyber World): The sandbox environment (virtualized network, simulated APIs, resource limits).
  2. Identify Ontology Codes: Apply relevant codes from the ontology's 32 descriptors. For example:
    • Level 1 (Meta): Containment Objective
    • Level 2 (Strategic): Prevent Exfiltration
    • Level 3 (Tactical): Resource Quota Enforcement, Network Segmentation
    • Level 4 (Operational): System Call Interception
    • Level 5 (Implementation): Kernel Module
  3. Evaluate Relationships: Does the sandbox enforce the "limits" relationship between the Cyber World agent and the AGI agent's capabilities? Does it provide the Human agent with a "monitors" relationship over the AGI's state? The ontology highlights if key relationships are missing (e.g., no mechanism for the AGI to signal a legitimate need for more resources without breaking containment).
  4. Gap Analysis: The framework forces asking: "Which ontology levels are weak or absent in this design?" A sandbox strong on Level 5 (Implementation) but lacking a clear Level 1 (Meta/Objective) is tactically sound but strategically adrift.
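Steps 2 and 4 above can be sketched in code: represent a sandbox design as the set of ontology codes it covers at each level, then flag any level with no coverage. The level names and codes below reuse the hypothetical examples from step 2; they are not the paper's actual 32-code set.

```python
# Illustrative sketch of steps 2 and 4: a design is a mapping from
# ontology level to the (hypothetical) codes it covers; gap analysis
# reports levels with no coverage at all.
SANDBOX_DESIGN = {
    "L1 Meta": {"Containment Objective"},
    "L2 Strategic": {"Prevent Exfiltration"},
    "L3 Tactical": {"Resource Quota Enforcement", "Network Segmentation"},
    "L4 Operational": {"System Call Interception"},
    "L5 Implementation": {"Kernel Module"},
}

def gap_analysis(design, levels):
    """Return the levels for which the design supplies no codes."""
    return [lvl for lvl in levels if not design.get(lvl)]

# A design strong on implementation but missing its meta objective —
# "tactically sound but strategically adrift":
partial = dict(SANDBOX_DESIGN, **{"L1 Meta": set()})
```

Running `gap_analysis` on the full design returns an empty list, while the `partial` variant surfaces `"L1 Meta"` as the missing level, mechanizing the question the framework forces analysts to ask.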

9. Future Applications & Research Directions

  • Standardization & Benchmarking: The ontology could evolve into a standard (like MITRE ATT&CK for cybersecurity) to classify containment strategies and benchmark their coverage against the threat model of a general intelligence.
  • Formal Verification: Integrating the ontology with formal methods tools to prove properties of containment architectures. Research from the Machine Learning Safety initiative at UC Berkeley could be combined with this ontological framework.
  • Simulation & Training: Building high-fidelity simulation environments ("AGI Containment Test Ranges") where different ontology-based containment strategies are stress-tested against simulated AGI agents with various capabilities and goals.
  • Policy & Governance: Informing international governance frameworks for AGI development by providing a clear taxonomy of containment mechanisms and their assurances, similar to how the IAEA uses safeguards categories.
  • Cross-pollination with AI Alignment: Bridging the technical containment ontology with value alignment theories (e.g., cooperative inverse reinforcement learning) to develop integrated safety approaches that combine external limits with internal value constraints.

10. References

  1. Pittman, J. M., & Crosby, C. (2018). A Cyber Science Based Ontology for Artificial General Intelligence Containment. arXiv preprint arXiv:1801.09317.
  2. Babcock, J., Kramar, J., & Yampolskiy, R. V. (2016). The AGI Containment Problem. In Proceedings of the 9th International Conference on Artificial General Intelligence (AGI 2016).
  3. Bostrom, N. (2014). Superintelligence: Paths, Dangers, Strategies. Oxford University Press.
  4. Kott, A. (Ed.). (2015). Cyber Defense and Situational Awareness. Springer.
  5. Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., & Mané, D. (2016). Concrete Problems in AI Safety. arXiv preprint arXiv:1606.06565.
  6. Papernot, N., McDaniel, P., Goodfellow, I., Jha, S., Celik, Z. B., & Swami, A. (2017). Practical Black-Box Attacks against Machine Learning. In Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security.
  7. Russell, S., Dewey, D., & Tegmark, M. (2015). Research Priorities for Robust and Beneficial Artificial Intelligence. AI Magazine, 36(4).
  8. DARPA. (n.d.). Guaranteeing AI Robustness against Deception (GARD). Retrieved from https://www.darpa.mil/program/guaranteeing-ai-robustness-against-deception