
Unapproved AI Usage

5 automated security scanners


Purpose: The Shadow LLM Service Detection Scanner is designed to identify unauthorized usage of ChatGPT and Claude AI services, detect exposure of personal API keys, and uncover the presence of email-based services by probing DNS, HTTP, TLS, ports, and APIs. This tool helps organizations maintain security and compliance with regulatory requirements by detecting potential threats associated with these services.

What It Detects:

  • Unauthorized ChatGPT/Claude Usage: Identifies references to unauthorized use of ChatGPT or Claude in web content and scans for specific API endpoints related to these services.
  • Personal API Key Exposure: Detects the presence of personal API keys in HTTP responses and DNS TXT records, looking for patterns indicative of exposed API keys.
  • Email-Based Services Detection: Identifies email service providers through MX, SPF, DKIM, and DMARC DNS records and scans web content for references to common email services.
  • Security Headers Analysis: Checks for the presence of critical security headers such as Strict-Transport-Security, Content-Security-Policy, X-Frame-Options, and X-Content-Type-Options.
  • TLS/SSL Vulnerabilities: Inspects SSL/TLS configurations to identify outdated protocols (TLSv1.0, TLSv1.1) and weak cipher suites (RC4, DES, MD5).
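The API key detection described above is essentially a pattern match over fetched content and DNS TXT records. A minimal sketch, assuming the publicly known `sk-` (OpenAI) and `sk-ant-` (Anthropic) key prefixes; the length thresholds are illustrative heuristics, not exact key formats:

```python
import re

# Assumed patterns: OpenAI keys begin with "sk-", Anthropic keys with
# "sk-ant-". The {20,} length thresholds are illustrative heuristics.
API_KEY_PATTERNS = {
    "openai": re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"),
    "anthropic": re.compile(r"\bsk-ant-[A-Za-z0-9-]{20,}"),
}

def find_exposed_keys(text):
    """Return a provider -> matched strings mapping for key-like tokens in text."""
    hits = {}
    for provider, pattern in API_KEY_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            hits[provider] = matches
    return hits
```

The same function can be run over HTTP response bodies and over the strings returned by a TXT record lookup, since both are just text to the matcher.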

Inputs Required:

  • domain (string): The primary domain to analyze (e.g., acme.com).

Business Impact: This scanner is crucial for organizations that rely on AI services and handle sensitive data. Detecting unauthorized usage of these services can prevent potential security breaches, protect intellectual property, and ensure compliance with privacy regulations such as GDPR or HIPAA.

Risk Levels:

  • Critical: Conditions where the presence of ChatGPT or Claude API keys in DNS TXT records poses a significant risk to sensitive information and regulatory compliance.
  • High: Situations where weak security headers are exposed, potentially allowing for unauthorized access or data leakage.
  • Medium: Issues related to outdated TLS protocols or weak cipher suites; exploitation typically requires specific conditions, but they still pose a threat in certain environments.
  • Low: Informational findings regarding the presence of email services without immediate security implications.
  • Info: General information about DNS records and headers, not considered critical unless directly linked to high-risk activities.


Example Findings:

  • A website contains references to ChatGPT API endpoints that were not configured by the organization, indicating unauthorized use of AI services.
  • DNS TXT records contain personal API keys exposed to the public internet, posing a significant risk for data breaches.

Purpose: The AI Notebook Environment Usage Scanner is designed to identify and assess potential security risks associated with unauthorized use of AI notebook environments such as Google Colab instances and Jupyter notebooks on a given domain. This tool aims to safeguard sensitive data and computational processes by detecting unauthorized exposure to external platforms, which could pose significant security threats.

What It Detects:

  • Colab Instance Detection: Identifies references to Google Colab in web content, looking for specific URLs or embedded scripts related to Colab.
  • Jupyter Notebook Detection: Searches for Jupyter notebook files (.ipynb) hosted on the domain and detects common Jupyter interface elements and configurations.
  • Research Environment Indicators: Identifies mentions of research-related tools and libraries commonly used in AI notebooks, including specific configuration files or directories associated with these environments.
  • Security Headers Analysis: Checks for the presence and correctness of security headers that should be implemented to protect web applications, focusing on missing or weak configurations such as Strict-Transport-Security, Content-Security-Policy, X-Frame-Options, and X-Content-Type-Options.
  • TLS/SSL Inspection: Analyzes the SSL/TLS configuration of the domain for outdated or insecure protocols and cipher suites, including detection of TLSv1.0, TLSv1.1, RC4, DES, and MD5.
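The Colab and Jupyter detection described above reduces to matching indicator strings in fetched page content. A minimal sketch, with assumed (non-exhaustive) signature strings:

```python
# Assumed indicator strings for each environment; a production scanner
# would use a broader, regularly updated signature set.
NOTEBOOK_SIGNATURES = {
    "colab": ("colab.research.google.com", "open in colab"),
    "jupyter": (".ipynb", "jupyter notebook", "/nbextensions/"),
}

def detect_notebook_environments(page_text):
    """Return the environments whose signatures appear in page_text."""
    lowered = page_text.lower()
    return sorted(env for env, signatures in NOTEBOOK_SIGNATURES.items()
                  if any(sig in lowered for sig in signatures))
```

Matching is done case-insensitively so markup variations ("Open In Colab" badges, uppercase file names) are still caught.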

Inputs Required:

  • domain (string): Primary domain to analyze (e.g., acme.com). Required for all of the detections above.

Business Impact: Detecting unauthorized AI notebook environments on a domain can significantly mitigate potential data leakage and exposure risks, safeguarding critical information that could otherwise be accessed by external parties without proper security measures in place. The scanner helps organizations maintain control over their sensitive data and computational processes, reducing the likelihood of security breaches and compliance issues.

Risk Levels:

  • Critical: This risk level applies if unauthorized AI notebook environments are detected on the domain, posing a high threat to sensitive information exposure.
  • High: Applicable if weak or missing security headers are identified, which could lead to easier exploitation of vulnerabilities in web applications.
  • Medium: Indicates issues with outdated TLS configurations that might be susceptible to attacks, though not as severe as critical risks.
  • Low: Informational findings regarding the presence of Jupyter notebooks without significant implications for data exposure or security unless accompanied by other indicators of risk.
  • Info: Provides basic information about detected Colab instances and specific URLs associated with them.

Example Findings:

  • A domain hosting a Google Colab instance might flag “colab_detected” as true, indicating the presence of an unauthorized AI notebook environment potentially exposing sensitive data to external platforms.
  • Inadequate security headers could be flagged under “detected_security_headers”, highlighting weak application defenses against common web attacks.

Purpose: The AI Browser Plugin Monitoring Scanner is designed to detect unauthorized use of AI tools and extensions in web browsers by analyzing various aspects of HTTP requests, DNS records, TLS/SSL configurations, and socket connections. This helps organizations identify potential security risks associated with the presence of code assistants, writing tools, and data analysis extensions within their networks.

What It Detects:

  • Security Headers Analysis: Checks for missing or weak security headers such as Strict-Transport-Security, Content-Security-Policy, X-Frame-Options, and X-Content-Type-Options.
  • TLS/SSL Configuration Issues: Identifies outdated TLS versions (e.g., TLSv1.0, TLSv1.1) and weak cipher suites (e.g., RC4, DES, MD5).
  • DNS Record Analysis: Examines TXT, MX, NS, CAA, and DMARC records for potential misconfigurations or suspicious entries.
  • HTTP Content Analysis: Scans HTTP responses for known patterns associated with AI tools and extensions in the content body.
  • Socket Connection Fingerprinting: Performs port scanning and service fingerprinting to detect unauthorized services running on the target domain that could be related to AI usage.
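The security-header analysis above is essentially a set difference between the headers a response carries and the headers it should carry. A sketch over the four headers the scanner checks:

```python
# The four headers the scanner checks for, per the list above.
REQUIRED_HEADERS = (
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Frame-Options",
    "X-Content-Type-Options",
)

def missing_security_headers(response_headers):
    """response_headers: mapping of header name -> value, any casing.
    Returns the required headers absent from the response."""
    present = {name.lower() for name in response_headers}
    return [h for h in REQUIRED_HEADERS if h.lower() not in present]
```

Header names are compared case-insensitively, since HTTP header names are not case-sensitive and servers emit them in varying forms.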

Inputs Required:

  • domain (string): The primary domain to analyze (e.g., acme.com).

Business Impact: This scanner is crucial for organizations looking to secure their digital assets and prevent potential data breaches or unauthorized use of AI tools that could compromise sensitive information. Detecting such unauthorized usage helps in maintaining a robust security posture, ensuring compliance with regulations, and safeguarding intellectual property.

Risk Levels:

  • Critical: Confirmed presence of an AI tool on infrastructure that also exhibits TLS/SSL misconfigurations, outdated protocols, or weak cipher suites with known vulnerabilities.
  • High: DNS records reveal suspicious entries, or required security headers are missing, either of which could indicate unauthorized use of AI tools.
  • Medium: Conditions where HTTP content contains patterns indicative of AI tools but does not necessarily pose a critical risk, requiring further investigation and potential mitigation strategies.
  • Low: Conditions where initial analysis suggests minimal to no presence of AI tools, considered informational unless corroborated by additional findings.
  • Info: Conditions that provide basic information about the domain’s network configuration; these do not directly point to specific AI tool usage but remain relevant for a comprehensive security audit.

Example Findings:

  1. The Strict-Transport-Security header is missing, which attackers could exploit to intercept sensitive data in transit.
  2. An outdated TLS version (e.g., TLSv1.0) and a weak cipher suite (e.g., RC4) are identified, indicating encryption standards that malicious actors can readily break.
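The protocol and cipher checks behind finding 2 can be expressed as a small predicate plus a handshake helper. The sketch below uses Python's ssl module; note that modern default contexts already refuse TLS below 1.2, so a real probe for legacy protocols would have to relax the context's minimum version:

```python
import socket
import ssl

WEAK_PROTOCOLS = {"TLSv1", "TLSv1.1"}        # names as ssl.SSLSocket.version() reports them
WEAK_CIPHER_MARKERS = ("RC4", "DES", "MD5")

def is_weak(protocol, cipher_name):
    """Flag outdated protocols and cipher suites containing weak primitives."""
    return protocol in WEAK_PROTOCOLS or any(
        marker in cipher_name for marker in WEAK_CIPHER_MARKERS)

def inspect_tls(host, port=443, timeout=5.0):
    """Negotiate TLS with host and report protocol, cipher, and a weak flag.
    A default context is used here; probing for TLSv1.0/1.1 would require
    lowering ctx.minimum_version, which some OpenSSL builds disallow."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            protocol = tls.version()      # e.g. "TLSv1.3"
            cipher_name = tls.cipher()[0]
    return {"protocol": protocol, "cipher": cipher_name,
            "weak": is_weak(protocol, cipher_name)}
```

Separating `is_weak` from the handshake keeps the risk classification testable without a live endpoint.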

Purpose: The Local AI Tool Installation Scanner is designed to identify locally installed AI models and desktop applications that could pose potential security risks due to unauthorized usage, lack of proper management, or vulnerabilities in offline processing capabilities. This tool aims to provide a comprehensive analysis of the local environment for any unauthorized or potentially risky AI tools.

What It Detects:

  • Locally Installed AI Models: Identifies presence of known AI model files (e.g., .h5, .pt, .pb) on local systems, scanning directories for common AI framework-specific files and configurations.
  • Desktop AI Applications: Detects installed applications related to AI, such as TensorFlow Model Server or PyTorch, checking system registry or application directories for AI tool installations.
  • Offline Capabilities: Scans for scripts and executables that enable offline processing of data using AI models, identifying local servers or services running AI applications without internet connectivity.
  • Security Headers Analysis: Examines HTTP security headers to ensure proper configuration against known vulnerabilities, checking for the presence of critical security headers like Strict-Transport-Security, Content-Security-Policy, and others.
  • TLS/SSL Inspection: Inspects SSL/TLS certificates for outdated protocols (e.g., TLSv1.0, TLSv1.1) and weak cipher suites, verifying certificate validity and checking for known vulnerabilities in the TLS stack.
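Assuming filesystem access to the host being audited, the model-file sweep above reduces to an extension match over a directory walk:

```python
from pathlib import Path

# Extensions named in the description; other formats (e.g. ONNX)
# would need to be added for broader coverage.
MODEL_EXTENSIONS = {".h5", ".pt", ".pb"}

def find_model_files(root):
    """Recursively list files under root whose suffix looks like an AI model."""
    return sorted(str(p) for p in Path(root).rglob("*")
                  if p.is_file() and p.suffix.lower() in MODEL_EXTENSIONS)
```

Suffixes are lowercased before comparison so files like `model.PT` are not missed on case-sensitive filesystems.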

Inputs Required:

  • domain (string): Primary domain to analyze (e.g., acme.com). This input is essential for DNS resolution, SSL/TLS inspection, and security header analysis.

Business Impact: Detecting unauthorized or potentially risky AI tools on local systems is crucial as it helps in mitigating potential threats from malicious actors exploiting these vulnerabilities. It also ensures compliance with data protection regulations by identifying any unauthorized use of sensitive information processing technologies.

Risk Levels:

  • Critical: The scanner identifies outdated TLS protocols, weak cipher suites, and missing critical security headers that significantly weaken the security posture of the system.
  • High: Presence of AI models or applications without proper authorization, potentially leading to unauthorized data access or manipulation.
  • Medium: Inadequate configuration of security headers which might lead to limited protection against common web attacks.
  • Low: Minor issues such as presence of experimental AI frameworks that are not commonly used in production environments.
  • Info: Informational findings about the existence of specific AI tools, useful for auditing and compliance purposes but generally low risk.

Example Findings:

  • A local system is found to be hosting an unauthorized AI model file (.h5) that external parties could exploit for malicious activities.
  • An installed application “TensorFlow Model Server” is detected without proper authorization, indicating potential misuse of corporate resources for non-business purposes.

Purpose: The Private Model API Usage Scanner is designed to identify unauthorized usage of Hugging Face endpoints, custom models, and self-hosted instances by analyzing various network parameters such as DNS records, HTTP headers, TLS configurations, and open ports. This tool aims to detect potential security vulnerabilities related to AI model exposure and ensure compliance with data privacy and security standards.

What It Detects:

  • Hugging Face Endpoint Detection: Identifies DNS TXT records containing references to Hugging Face services, checks for MX records pointing to Hugging Face domains, and scans for NS records that include Hugging Face nameservers.
  • Custom Model Hosting Identification: Analyzes HTTP headers for indicators of custom model hosting solutions and searches for specific content patterns in HTML responses related to AI models.
  • Self-Hosted Instance Exposure: Performs port scanning to detect open ports commonly used by AI services, uses service fingerprinting to identify running AI model servers, checks for missing or weak security headers, inspects TLS certificates for outdated protocols and identifies weak cipher suites and deprecated hashing algorithms.
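The port-scanning step above can be sketched as a series of short TCP connection attempts; the port list reflects the examples given later in this section (5000 for Flask-style model APIs, 8501 for TensorFlow Serving's REST endpoint):

```python
import socket

# Ports the description associates with AI services (e.g. 5000 for
# Flask-style model APIs, 8501 for TensorFlow Serving's REST endpoint).
AI_SERVICE_PORTS = (5000, 8501)

def scan_ports(host, ports=AI_SERVICE_PORTS, timeout=1.0):
    """Return the subset of ports accepting TCP connections on host."""
    open_ports = []
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                open_ports.append(port)
        except OSError:
            pass
    return open_ports
```

A connect scan like this is the simplest probe; service fingerprinting would additionally read the banner or issue a protocol-specific request on each open port.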

Inputs Required:

  • domain (string): Primary domain to analyze (e.g., acme.com)

Business Impact: This scanner is crucial as it helps organizations safeguard their sensitive data by detecting unauthorized access or exposure of AI models, which could lead to significant security breaches and compliance issues.

Risk Levels:

  • Critical: The critical risk level applies when outdated TLS protocols (e.g., TLSv1.0, TLSv1.1) are detected in the SSL/TLS configuration, indicating a severe vulnerability that must be addressed immediately.
  • High: High risks are associated with missing or weak security headers such as Strict-Transport-Security, Content-Security-Policy, X-Frame-Options, and X-Content-Type-Options. These issues can lead to significant exposure of sensitive information.
  • Medium: Medium risk findings involve the detection of specific content patterns in HTML responses that indicate potential unauthorized usage of Hugging Face endpoints or custom models, which could be exploited for data exfiltration or other malicious activities.
  • Low: Low risks pertain to open ports commonly used by AI services (e.g., 5000, 8501) being detected without proper security measures in place; these warrant attention, though with lower urgency than critical issues.
  • Info: Informational findings include the detection of specific patterns in HTTP headers and content that may not pose a severe risk on their own but could indicate potential exposure or unauthorized usage, and should still be monitored and investigated.

Example Findings:

  1. A DNS TXT record containing “huggingface” references is detected, indicating possible unauthorized access to Hugging Face services.
  2. Missing Strict-Transport-Security header in HTTP responses suggests a potential risk of data interception via unencrypted channels.
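Finding 1 above amounts to a pattern pass over the domain's TXT records. Record retrieval itself would need a resolver library such as dnspython; the sketch below assumes the records have already been fetched as strings:

```python
import re

# Case-insensitive markers for Hugging Face references; "hf.co" is the
# service's short domain. Treat these as illustrative, not exhaustive.
HF_PATTERN = re.compile(r"huggingface|hf\.co", re.IGNORECASE)

def flag_hf_txt_records(txt_records):
    """Return TXT record strings that reference Hugging Face services."""
    return [record for record in txt_records if HF_PATTERN.search(record)]
```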