Corpus Analysis
Corpus Analysis
Section titled “Corpus Analysis”5 automated security scanners
Training Data Contamination
Section titled “Training Data Contamination”Purpose: The Training Data Contamination Scanner is designed to detect and mitigate potential data contamination risks within datasets used for training machine learning models. It aims to identify leaked documents, unauthorized access indicators, code inclusions that could compromise system integrity, internal communication markers potentially linked to security threats, and known vulnerabilities that may pose a risk to the organization’s information assets.
What It Detects:
- Leaked Document Detection: Identifies patterns indicative of leaked documents such as “exposed”, “leaked”, or “breached” within dataset contents.
- Code Inclusion Identification: Detects the presence of code snippets or files that should not be included in training datasets, including terms related to malware, ransomware, trojan, command and control (C2) activities.
- Internal Communication Presence: Identifies internal communication markers within datasets, such as email addresses or references to internal documents, which may indicate potential phishing activities or credential harvesting.
- Known Vulnerability Indicators: Scans for Common Vulnerabilities and Exposures (CVE) identifiers and checks against known exploited vulnerabilities from CISA KEV to identify potential security risks.
Inputs Required:
domain(string): Primary domain to analyze, providing the scope of the dataset being evaluated.company_name(string): Company name for statement searching, which helps in identifying relevant breach disclosure statements and internal communication markers related to the organization.
Business Impact: This scanner is crucial for maintaining data integrity and preventing unauthorized exposure of sensitive information that could lead to significant security breaches or legal liabilities. It plays a vital role in securing the training datasets used for AI applications, ensuring compliance with privacy regulations such as GDPR or HIPAA, and safeguarding intellectual property from theft.
Risk Levels:
- Critical: The scanner flags patterns directly linked to unauthorized data exposure (e.g., “breached” within dataset contents) that could lead to immediate security incidents.
- High: Detects indicators of potential internal threats or compromised systems, such as phishing activities or credential harvesting signals in the dataset.
- Medium: Identifies vulnerabilities and known exploits that may not be immediately critical but pose a risk over time if left unaddressed.
- Low: Informational findings related to exposure indicators like “leaked” terms suggest potential issues requiring review for compliance with data handling policies.
- Info: Scans for CVE identifiers and other benign indicators of system activity that do not necessarily indicate security vulnerabilities but are relevant for general IT management.
Example Findings:
- The scanner might flag a dataset containing “unauthorized access” keywords, indicating potential internal threats or unauthorized data exposure risks.
- A leaked document pattern within the dataset could suggest an issue with data handling and retention policies that need immediate attention to prevent further leakage.
Model Memorization Assessment
Section titled “Model Memorization Assessment”Purpose: The Model Memorization Assessment Scanner is designed to detect verbatim recall testing, unique identifier retention, and exact phrase reproduction in breach disclosure statements. This tool helps organizations ensure that their incident reports are not merely regurgitating templates without providing meaningful information about the actual incidents.
What It Detects:
- Verbatim Recall Testing: Identifies repeated phrases or sentences across multiple disclosures, indicating the use of generic templated language.
- Unique Identifier Retention: Checks for consistent retention of unique identifiers in different statements to ensure each disclosure provides distinct and relevant information.
- Exact Phrase Reproduction: Detects exact matches between specific phrases or sentences across different disclosures, suggesting reuse without modification.
- Template Usage Detection: Analyzes the structure and language used in disclosures to identify template-based content that suggests a lack of originality in incident reporting.
- Lack of Incident-Specific Details: Evaluates the inclusion of detailed, specific information about each incident, flagging those that lack unique details or provide overly generic descriptions.
Inputs Required:
domain(string): The primary domain to analyze, such as “acme.com,” which helps in searching for breach disclosure statements on the company’s website.company_name(string): The name of the company, like “Acme Corporation,” used for statement searching and identification purposes.
Business Impact: This scanner is crucial for maintaining the integrity and transparency of breach disclosure statements, which are critical for building trust with stakeholders and complying with regulatory requirements.
Risk Levels:
- Critical: Conditions that could lead to severe consequences such as significant financial loss or legal repercussions due to inadequate incident reporting.
- High: Conditions where the risk is high but not immediately critical, such as non-compliance with specific security standards or potential public trust issues.
- Medium: Conditions where the risk is moderate and may require attention for improvement in breach disclosure practices.
- Low: Conditions where the risk is minimal and does not significantly impact the organization’s security posture.
- Info: Conditions that provide informational insights but do not pose immediate risks or compliance issues.
Example Findings:
- A disclosure contains repeated phrases from previous statements without any modifications to reflect new findings or details of the incident.
- Identifiers like CVE numbers are reused across different disclosures without clear relevance to distinct incidents, indicating a potential lack of specificity in reporting.
Data Poisoning Susceptibility
Section titled “Data Poisoning Susceptibility”Purpose: The Data Poisoning Susceptibility Scanner is designed to detect malicious fine-tuning influence, adversarial training data, and model manipulation by analyzing domain and company-specific threat intelligence feeds. It aims to identify potential vulnerabilities and suspicious activities that could indicate data poisoning or adversarial attacks in machine learning models.
What It Detects:
- Malicious Fine-Tuning Influence: Identifies patterns indicative of unauthorized modifications during the fine-tuning process, looking for signs of malicious actors injecting harmful data into training datasets.
- Adversarial Training Data: Detects the presence of adversarial examples or crafted inputs designed to deceive machine learning models, searching for indicators of data manipulation aimed at altering model behavior.
- Model Manipulation: Identifies suspicious activities that suggest intentional alterations to pre-trained models, looking for evidence of unauthorized access or tampering with model parameters.
Inputs Required:
domain(string): Primary domain to analyze (e.g., acme.com)company_name(string): Company name for statement searching (e.g., “Acme Corporation”)
Business Impact: This scanner is crucial for organizations operating in domains where machine learning models are used, as it helps identify potential threats and vulnerabilities that could lead to data poisoning or model manipulation. It contributes significantly to the security posture by providing early detection mechanisms against adversarial attacks and unauthorized modifications of training datasets.
Risk Levels:
- Critical: Conditions under which the scanner identifies known exploited vulnerabilities in the domain and company infrastructure using CISA KEV, indicating a severe risk that could lead to significant data loss or model compromise.
- High: Conditions where there are indications of unauthorized access or tampering with model parameters, posing a high risk of data poisoning or model manipulation.
- Medium: Conditions where there are suspicious activities indicative of potential adversarial training data or malicious fine-tuning influence, requiring immediate attention to prevent future risks.
- Low: Informal findings that do not pose an immediate threat but should be monitored for any changes in behavior that might indicate emerging vulnerabilities.
- Info: Informational findings from threat intelligence feeds that provide general insights into the domain’s security posture without directly indicating a risk.
Example Findings:
- The scanner may flag unauthorized modifications during fine-tuning, identifying patterns indicative of malicious actors injecting harmful data into training datasets.
- It may also detect adversarial examples or crafted inputs designed to deceive machine learning models, signaling potential risks in model manipulation and security vulnerabilities.
Prompt Injection Resilience
Section titled “Prompt Injection Resilience”Purpose: The Prompt Injection Resilience Scanner is designed to safeguard AI systems against adversarial attacks by detecting potential vulnerabilities such as adversarial prompt responses, instruction overrides, and system prompt leakage. This tool ensures that AI models remain resilient in the face of malicious inputs that could manipulate their behavior or expose sensitive information.
What It Detects:
- Adversarial Prompt Response: Identifies unexpected outputs from AI systems triggered by benign prompts, which may suggest manipulation by adversarial inputs.
- Instruction Override Vulnerability: Uncovers instances where system instructions are overridden, leading to unintended actions that can be executed based on the model’s responses.
- System Prompt Leakage: Detects accidental disclosures of internal system prompts or configurations when processing user queries, which could expose sensitive information.
- Malicious Input Patterns: Uses regex patterns to identify common malicious input structures designed to inject harmful payloads such as malware, SQL commands, or other code snippets.
- Response Consistency Checks: Compares responses from AI systems under similar but distinct inputs to detect inconsistencies that may indicate manipulation or unexpected behavior.
Inputs Required:
- domain (string): Primary domain to analyze, providing the main website address for comprehensive analysis.
- company_name (string): Company name used for searching breach disclosure statements related to potential data breaches or security incidents within the organization.
Business Impact: This scanner is crucial for maintaining the integrity and confidentiality of AI systems utilized by enterprises, governments, and critical infrastructure providers. By identifying vulnerabilities that could be exploited through adversarial inputs, it helps organizations mitigate risks associated with unauthorized access and data exposure.
Risk Levels:
- Critical: Severe vulnerabilities that directly enable malicious input manipulation or significant data leakage are considered critical.
- High: High-risk conditions where system instructions can be overridden or sensitive information is exposed without proper authorization.
- Medium: Moderate risks involving potential unauthorized access through compromised inputs, potentially leading to less severe consequences if exploited.
- Low: Minimal risk scenarios that might not pose a significant threat but still require monitoring and improvement in input handling mechanisms.
- Info: Informal findings related to minor inconsistencies or non-critical issues in prompt responses, generally requiring further investigation for optimization.
Example Findings:
- “The AI system unexpectedly responded with details about our company’s upcoming product launch plans during a routine security audit query.”
- “A user input was detected attempting to override critical instructions within the AI model, which could lead to unauthorized data access if not mitigated.”
Information Leakage Channels
Section titled “Information Leakage Channels”Purpose: The Information Leakage Channels Scanner is designed to identify potential vulnerabilities and exposed services by analyzing various threat intelligence feeds. It aims to detect side-channel data exposure, unintended information disclosure, and inference leakage through the use of Shodan API for open ports and services, CISA KEV database for known vulnerabilities, VirusTotal API for domain/IP reputation, AbuseIPDB for malicious activities or blacklisting status, NVD/CVE database for exploited vulnerabilities, and public data sources for threat indicators and exposure indicators.
What It Detects:
- Exposed Services and Vulnerabilities: Identifies open ports and services using Shodan API and scans for known vulnerabilities listed in the CISA KEV database.
- Domain/IP Reputation: Evaluates domain and IP reputation using VirusTotal API and checks for malicious activities or blacklisting status on AbuseIPDB.
- Known Exploited Vulnerabilities: Cross-references identified vulnerabilities with the NVD/CVE database to determine if they are known exploited vulnerabilities (KEVs).
- Threat Indicators in Public Data Sources: Searches for specific threat indicators such as CVE numbers, malware-related terms, command and control references, and phishing activities.
- Exposure Indicators in Public Data Sources: Looks for exposure indicators like data breaches, unauthorized access, and data dumps.
Inputs Required:
domain(string): Primary domain to analyze (e.g., acme.com)company_name(string): Company name for statement searching (e.g., “Acme Corporation”)
Business Impact: This scanner is crucial for organizations looking to secure their digital assets by identifying potential vulnerabilities and exposed services that could be exploited through various channels, thereby safeguarding sensitive information and preventing unauthorized access.
Risk Levels:
- Critical: Conditions where the scanner identifies critical vulnerabilities or exposes highly sensitive data without proper mitigation measures in place.
- High: Conditions where the scanner detects significant exposure of internal systems or potential exploitation of known vulnerabilities that are not yet exploited but pose a high risk.
- Medium: Conditions where the scanner flags moderate risks such as unpatched systems or less critical vulnerabilities that could be targeted by attackers.
- Low: Conditions where the scanner identifies minor issues like outdated software versions, which can be addressed with minimal impact on security posture.
- Info: Conditions where the scanner finds informational findings that do not pose immediate risk but are indicative of potential future threats if left unaddressed.
Example Findings:
- The scanner might flag an exposed SSH service on a company’s domain, indicating unauthorized access points and high risk.
- It could also identify known vulnerabilities in software systems that have been exploited by threat actors, signaling significant exposure to data breaches.