Digital Footprint Analysis

7 automated security scanners

SSL Certificate Expiry Check

Purpose: The SSL Certificate Expiry Check Scanner is designed to monitor the expiration dates of SSL/TLS certificates for domains. This ensures that websites remain accessible and secure by preventing service outages due to expired certificates, detecting issues with certificate management, and identifying domains at risk of losing HTTPS protection.

What It Detects:

Certificate Expiration Date Check: Retrieves SSL certificates from domains, extracts the expiration date, calculates the number of days until expiration, flags critical (within 30 days) and warning (within 90 days) expirations, and identifies short validity periods.
Certificate Validity Period Analysis: Extracts the issuance date, calculates the total validity period, checks if it exceeds 398 days, detects backdated certificates, and identifies unusually short validity periods.
Renewal History Assessment: Compares current expiration dates with expected renewal cycles, detects newly issued certificates, flags certificates beyond typical renewal windows, and identifies failures in certificate renewal automation.
Multiple Domain Expiry Coordination: Checks the expiration dates across multiple subdomains, flags mismatched expiration dates, detects staggered renewal patterns, and distinguishes between wildcard certs and individual certs.
Certificate Lifecycle Stage: Categorizes certificates as new, active, expiring, or expired based on their percentage of validity period elapsed, flags certificates in the final 10% of validity, and identifies those requiring immediate action.
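The date arithmetic behind these checks can be sketched in a few lines of Python. This is a minimal illustration, not the scanner's actual implementation: the function names are hypothetical, the 30-day, 90-day, 398-day, and final-10% thresholds come from the descriptions above, and the 10% cutoff for "new" is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Thresholds taken from the descriptions above.
CRITICAL_DAYS = 30
WARNING_DAYS = 90
MAX_VALIDITY_DAYS = 398
EXPIRING_FRACTION = 0.90  # final 10% of the validity period
NEW_FRACTION = 0.10       # assumption: "new" means under 10% of validity has elapsed


def days_until_expiry(not_after: datetime, now: datetime) -> int:
    """Whole days left before expiry; negative once the certificate has expired."""
    return (not_after - now).days


def expiry_severity(days_left: int) -> str:
    """Map days remaining onto the critical and warning windows."""
    if days_left < 0:
        return "expired"
    if days_left <= CRITICAL_DAYS:
        return "critical"
    if days_left <= WARNING_DAYS:
        return "warning"
    return "ok"


def exceeds_max_validity(not_before: datetime, not_after: datetime) -> bool:
    """True when the total validity period is longer than 398 days."""
    return (not_after - not_before).days > MAX_VALIDITY_DAYS


def lifecycle_stage(not_before: datetime, not_after: datetime, now: datetime) -> str:
    """Categorize a certificate by the fraction of its validity period elapsed."""
    if now >= not_after:
        return "expired"
    total = (not_after - not_before).total_seconds()
    if total <= 0:
        return "expired"  # malformed dates: treat as unusable
    fraction = (now - not_before).total_seconds() / total
    if fraction >= EXPIRING_FRACTION:
        return "expiring"
    if fraction < NEW_FRACTION:
        return "new"
    return "active"
```

In Python, the `notBefore` and `notAfter` strings returned by `ssl.SSLSocket.getpeercert()` can be converted with `ssl.cert_time_to_seconds()` before being passed to functions like these.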

Inputs Required:

domain (string): Fully qualified domain name (e.g., ekkatha.com)

Business Impact: Certificate expiration creates significant availability and security risks. Expired certificates cause browser warnings and block user access, potentially leading to lost business or customer dissatisfaction. Automatic renewals that fail silently can leave websites vulnerable to attacks. Short validity periods require frequent renewal attention, which is time-consuming and may be neglected if not properly managed.

Risk Levels:

Critical: Certificates expiring within 30 days are critical as they pose an immediate threat to service availability.
High: Certificates expiring within 90 days are at high risk and should be renewed before they enter the 30-day critical window.
Medium: Unusually short validity periods and backdated certificates suggest potential management issues that could lead to future problems.
Low: Longer-than-usual validity periods are generally informational, but deserve a closer look when they appear alongside staggered renewal patterns or wildcard certificate usage.
Info: Informational findings include newly issued certificates, which may not yet pose a significant risk but should still be monitored for compliance with best practices.

Example Findings:

A domain with an SSL certificate expiring in 15 days would be flagged as critical due to the imminent expiration.
A certificate with only 60 days of validity, especially if it is newly issued or backdated, could indicate issues that need investigation.

Website Template Analysis

Purpose: The Website Template Analysis Scanner is designed to identify website template frameworks, CMS platforms, and common web development patterns. This tool aims to detect unauthorized clones, recognize infrastructure patterns, and fingerprint web application stacks for security assessment purposes. By identifying the presence of specific CMS platforms, JavaScript frameworks, CSS frameworks, templates, and other technologies used in web development, this scanner provides valuable insights into the architecture and potential vulnerabilities of a website.

What It Detects:

CMS Platform Detection: The scanner analyzes HTTP headers for CMS signatures, checks for indicators of WordPress, Drupal, Joomla, and detects meta generator tags to identify the Content Management System (CMS) platform in use.
JavaScript Framework Identification: By parsing HTML for framework-specific patterns, the scanner can detect the usage of popular frameworks such as React, Angular, and Vue.js.
CSS Framework Recognition: It analyzes CSS class naming patterns to determine if Bootstrap, Tailwind, or Foundation is used, helping in identifying the CSS framework employed.
Template Fingerprinting: The scanner extracts HTML structure patterns to identify common template providers like ThemeForest. It also detects specific file paths and identifies comments left by template authors.
Technology Stack Analysis: This includes identifying web server software, programming languages (e.g., PHP), database types, and recognizing middleware components used in the infrastructure.
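As a rough illustration of the HTML-based checks above, the following Python sketch parses a page with the standard library and looks for a meta generator tag, framework-specific attributes, and CSS class naming patterns. The function name `fingerprint_html` and every heuristic in it (for example `/wp-content/` for WordPress or `data-v-` attributes for Vue.js) are assumptions chosen for the example, not the scanner's real signature set.

```python
import re
from html.parser import HTMLParser


class _PageSignals(HTMLParser):
    """Collects the generator tag, CSS classes, attribute names, and asset URLs."""

    def __init__(self):
        super().__init__()
        self.generator = ""
        self.classes = set()
        self.attr_names = set()
        self.asset_urls = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)  # HTMLParser lowercases tag and attribute names
        if tag == "meta" and (attr.get("name") or "").lower() == "generator":
            self.generator = attr.get("content") or ""
        self.classes.update((attr.get("class") or "").split())
        self.attr_names.update(attr)
        self.asset_urls.extend(attr[k] for k in ("src", "href") if attr.get(k))


def fingerprint_html(html: str) -> dict:
    """Return heuristic CMS, JavaScript framework, and CSS framework guesses."""
    page = _PageSignals()
    page.feed(html)
    assets = " ".join(page.asset_urls)
    gen = page.generator.lower()

    cms = None
    if "wordpress" in gen or "/wp-content/" in assets:
        cms = "WordPress"
    elif "drupal" in gen or "/sites/default/files/" in assets:
        cms = "Drupal"
    elif "joomla" in gen:
        cms = "Joomla"

    js = set()
    if "data-reactroot" in page.attr_names:
        js.add("React")
    if "ng-version" in page.attr_names or "ng-app" in page.attr_names:
        js.add("Angular")
    if any(name.startswith("data-v-") for name in page.attr_names):
        js.add("Vue.js")

    css = set()
    if any(re.fullmatch(r"col-(xs|sm|md|lg|xl)-\d+", c) for c in page.classes):
        css.add("Bootstrap")
    if any(re.match(r"(sm|md|lg|xl):", c) for c in page.classes):
        css.add("Tailwind")
    if any(re.fullmatch(r"(small|medium|large)-\d+", c) for c in page.classes):
        css.add("Foundation")

    return {
        "cms": cms,
        "generator": page.generator or None,
        "js": sorted(js),
        "css": sorted(css),
    }
```

Heuristics like these produce guesses rather than proof, so a real scanner would likely combine several signals (headers, file paths, template comments) before reporting a match.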

Inputs Required:

domain (string): A fully qualified domain name (e.g., ekkatha.com) is required as input to perform the analysis.

Business Impact: Identifying website template frameworks, CMS platforms, and web development patterns is crucial for security assessments as it helps in understanding the attack surface, potential vulnerabilities associated with default configurations, and possible exploitation vectors based on known patterns. This knowledge aids in developing targeted mitigation strategies and enhancing overall security posture.

Risk Levels:

Critical: Invalid domain formats or network errors must not crash the scanner; it should stay robust against the common problems that arise during a scan.
High: Failing to detect any of the specified elements (CMS platforms, JavaScript frameworks, CSS frameworks, templates, etc.) is high risk, because the security assessment would then rest on incomplete information.
Medium: HTTP request failures and HTML parsing errors should be handled gracefully, with clear error messages that explain what went wrong during the scan.
Low: Version disclosure, or detection of specific versions, is low risk when it does not directly affect security, but remains useful for detailed analysis.

Example Findings:

A WordPress website with a default configuration that exposes its version number in the HTTP headers could be flagged, because a publicly visible version makes it easier to match the site against known vulnerabilities.
Misidentification of a JavaScript framework might lead to false positives in security assessments, which should be considered medium risk unless mitigated by additional context or user input.

Self-Signed Certificate Detection

Purpose: The Self-Signed Certificate Detection Scanner is designed to identify self-signed SSL/TLS certificates that bypass trusted certificate authorities. This scanner aims to detect development/staging environments exposed to production, internal services, and potential man-in-the-middle attack infrastructure. It plays a crucial role in identifying security vulnerabilities such as browser warnings, lack of third-party validation for domain ownership, ease of creation for phishing attacks, indication of misconfigured production systems, and exposure of internal/development environments publicly.

What It Detects:

Self-Signed Certificate Detection: Retrieves SSL certificate from a domain, checks if the issuer equals the subject (indicating a self-signed certificate), verifies that the certificate is not signed by recognized CAs, flags certificates not in trust stores, and detects locally issued certificates.
Certificate Authority Validation: Extracts the issuer organization name, checks against a list of known CAs, verifies that the issuer is in the system trust store, flags unknown or untrusted issuers, and detects private/internal CAs.
Certificate Chain Analysis: Checks certificate chain depth, verifies that the chain leads to a trusted root, flags chains terminating in self-signed certificates, detects incomplete chains, and identifies missing intermediates.
Subject-Issuer Comparison: Extracts the subject Distinguished Name (DN) and issuer DN from the certificate, compares these fields, flags exact matches (indicating self-signed certificates), and detects suspicious similarities between subject and issuer.
Trust Store Verification: Checks if a certificate is in the system trust store, verifies that the root is trusted by major browsers, flags untrusted certificate chains, detects browser warning triggers, and identifies user trust bypass scenarios.
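One way to approximate the trust store and subject-issuer checks above is to attempt a normally verified TLS handshake and inspect the reason it fails. The sketch below uses Python's standard `ssl` module; the function names and result labels are hypothetical, and the numeric codes are OpenSSL's verification error codes (18 for a self-signed leaf, 19 for a self-signed certificate in the chain, 20 and 21 for a chain that cannot be linked to a trusted issuer).

```python
import socket
import ssl


def classify_verify_code(code: int) -> str:
    """Translate an OpenSSL verification error code into a finding label."""
    if code == 18:  # self-signed leaf certificate
        return "self_signed"
    if code == 19:  # self-signed certificate somewhere in the chain
        return "self_signed_in_chain"
    if code in (20, 21):  # issuer not found locally / leaf cannot be verified
        return "untrusted_or_incomplete_chain"
    return "other_verification_failure"


def is_self_issued(cert: dict) -> bool:
    """True when subject and issuer are identical.

    Expects a dict shaped like the one ssl.SSLSocket.getpeercert() returns.
    """
    subject = cert.get("subject")
    return bool(subject) and subject == cert.get("issuer")


def check_trust(domain: str, port: int = 443, timeout: float = 5.0) -> str:
    """Handshake against the system trust store and report the outcome."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((domain, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=domain):
                return "trusted"
    except ssl.SSLCertVerificationError as exc:
        return classify_verify_code(exc.verify_code)
```

A call such as `check_trust("ekkatha.com")` needs network access. Other verification failures, such as a hostname mismatch, fall into the catch-all label here and would need their own handling in a fuller implementation.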

Inputs Required:

domain (string): Fully qualified domain name (e.g., ekkatha.com)

Business Impact: Self-signed certificates create significant security vulnerabilities that can lead to various consequences such as training users to ignore security alerts, facilitating phishing attacks by lacking third-party validation of domain ownership, and exposing internal or development environments publicly. This can compromise the integrity and confidentiality of sensitive information and may result in unauthorized access to systems and data.

Risk Levels:

Critical: Conditions that lead to critical severity include certificates being self-signed and not trusted by any system trust store, which could expose misconfigured production systems directly to attackers without any intermediary verification.
High: Conditions for high risk involve certificates with issuers unknown or untrusted, indicating potential internal misuse of private CAs or misconfigurations that bypass standard security practices.
Medium: Medium risk is associated with incomplete certificate chains and self-signed certificates within the chain, which can lead to trust issues in systems where intermediate certificates are missing or not trusted.
Low: Findings that might trigger a browser warning but do not significantly compromise system security, for example when the certificate itself is publicly trusted and has no known weaknesses.
Info: Informational findings include situations where the scanner identifies that no self-signed certificates were detected and systems are correctly configured with trusted CAs.

Example Findings:

A domain “example.com” was found to have a self-signed SSL certificate, which is not recommended for production environments due to potential security risks.
An internal service using a private CA issued certificate was detected as untrusted by the system trust store, posing a risk of unauthorized access if exposed externally.

DNS Pattern Recognition

Purpose: The DNS Pattern Recognition Scanner is designed to analyze DNS configuration patterns in order to fingerprint infrastructure providers, identify shared hosting environments, detect Content Delivery Network (CDN) usage, and recognize DNS management platforms through nameserver patterns and zone structures. This tool helps in understanding the underlying technology stack used by a domain, which can be crucial for security assessments and compliance checks.

What It Detects:

Nameserver Pattern Analysis: The scanner extracts all authoritative nameservers, identifies naming patterns, matches them against known providers to detect AWS Route53, Cloudflare, Google Cloud DNS, etc., and flags unknown or suspicious patterns.
Cloud Provider Detection: By analyzing nameserver hostnames for provider signatures, the scanner can check for AWS, Azure, and GCP patterns, as well as recognize managed DNS services.
CDN Infrastructure Identification: It checks for CDN-specific DNS patterns to identify Cloudflare proxied records, detect Akamai edge configurations, find Fastly or other CDN CNAME patterns, and map their distribution strategies.
TTL Pattern Analysis: The scanner extracts TTL values across record types, identifies optimization patterns, detects very low TTLs (indicators of load balancing) and very high TTLs (indicators of static infrastructure), and recognizes TTL strategies.
Record Structure Fingerprinting: It analyzes A/AAAA record counts and patterns, checks MX record configurations, examines TXT record patterns for SPF and DMARC signatures, and identifies common infrastructure templates and automated DNS management practices.
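The nameserver matching step can be illustrated with a small table of regular expressions. This is a sketch under stated assumptions: the function names are hypothetical, the patterns reflect commonly seen hostname formats for each provider, and a real scanner would need a much larger and regularly updated list.

```python
import re
from collections import Counter

# Assumed hostname patterns for a few managed DNS providers.
PROVIDER_PATTERNS = {
    "AWS Route 53": re.compile(r"^ns-\d+\.awsdns-\d+\.(com|org|net|co\.uk)$"),
    "Cloudflare": re.compile(r"^[a-z]+\.ns\.cloudflare\.com$"),
    "Google Cloud DNS": re.compile(r"^ns-cloud-[a-z]\d+\.googledomains\.com$"),
    "Azure DNS": re.compile(r"^ns\d+-\d+\.azure-dns\.(com|net|org|info)$"),
}


def identify_provider(nameserver: str):
    """Return the matching provider name, or None for unrecognized patterns."""
    host = nameserver.strip().rstrip(".").lower()
    for provider, pattern in PROVIDER_PATTERNS.items():
        if pattern.match(host):
            return provider
    return None


def summarize_nameservers(nameservers) -> dict:
    """Count nameservers per provider; unrecognized hosts are counted as 'unknown'."""
    counts = Counter(identify_provider(ns) or "unknown" for ns in nameservers)
    return dict(counts)
```

The list of nameservers would come from an NS lookup on the domain, which is left out here so the matching logic stays self-contained.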

Inputs Required:

domain (string): Fully qualified domain name (e.g., ekkatha.com)

Business Impact: DNS pattern analysis is crucial as it can reveal sensitive information about the infrastructure behind a domain, which can be leveraged for various purposes such as identifying potential misconfigurations, assessing security posture, and ensuring compliance with regulations related to DNS management.

Risk Levels:

Critical: The scanner queries live DNS servers and returns valid JSON output; if a query fails, it populates an error field while keeping the JSON structure valid.
High: DNS resolution failures, missing NS records, or pattern matching errors can cause infrastructure details to be misidentified, which undermines any security assessment built on the results.
Medium: Network timeouts and invalid domain formats might result in inconclusive findings but should still be handled gracefully without crashing the scanner.
Low: Informational findings such as minor discrepancies in TTL values or unrecognized nameserver patterns could be considered low risk if they do not significantly impact security posture.
Info: These are generally benign and would include observations that might not directly affect security but can provide useful context for infrastructure analysis.

Example Findings:

A domain using AWS Route53 might have nameservers of the form “ns-<number>.awsdns-<number>.{com|org|net|co.uk}”.
A heavily optimized TTL strategy across various record types could indicate CDN usage or automated DNS management practices.
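To make the TTL example concrete, here is a minimal Python sketch that buckets TTL values per record type. The function names are hypothetical, and the cutoffs (300 seconds for "very low", 86400 seconds for "very high") are assumptions, since no thresholds are stated above.

```python
LOW_TTL_SECONDS = 300      # assumption: at or below 5 minutes counts as very low
HIGH_TTL_SECONDS = 86400   # assumption: at or above 1 day counts as very high


def classify_ttl(ttl_seconds: int) -> str:
    """Bucket a single TTL value."""
    if ttl_seconds <= LOW_TTL_SECONDS:
        return "very_low"   # often associated with load balancing or failover
    if ttl_seconds >= HIGH_TTL_SECONDS:
        return "very_high"  # often associated with static infrastructure
    return "typical"


def ttl_profile(ttls_by_type: dict) -> dict:
    """Classify the TTL of each record type, e.g. {"A": 60, "MX": 3600}."""
    return {rtype: classify_ttl(ttl) for rtype, ttl in ttls_by_type.items()}


def is_uniform(ttls_by_type: dict) -> bool:
    """True when every record type shares one TTL, a hint of templated DNS management."""
    return bool(ttls_by_type) and len(set(ttls_by_type.values())) == 1
```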

Infrastructure Similarity Detection

Purpose: The Infrastructure Similarity Detection Scanner is designed to identify similarities in infrastructure fingerprints between domains. This tool aims to detect related assets, shared hosting, infrastructure reuse patterns, and potential shadow IT or unauthorized deployments by analyzing IP address overlap, TLS configuration, HTTP headers, DNS configurations, and recognizing common cloud deployment patterns.

What It Detects:

IP Address Overlap Analysis: The scanner resolves domain names to their corresponding IP addresses and compares these IPs against known organizational IP ranges to identify co-hosted domains.
TLS Configuration Fingerprinting: It extracts TLS cipher suite order, compares certificate issuer patterns, checks SSL/TLS protocol versions supported, analyzes certificate chain similarities, and creates a unique TLS configuration fingerprint for each domain.
HTTP Header Similarity: The scanner retrieves HTTP response headers and compares the Server header, X-Powered-By, etc., to identify shared web server configurations.
DNS Configuration Comparison: It compares nameserver configurations, checks for identical NS patterns, analyzes TTL value similarities, and compares MX and TXT record patterns to detect shared DNS management.
Infrastructure Template Recognition: The scanner identifies common cloud deployment patterns, detects infrastructure-as-code templates, recognizes standard configurations (AWS, Azure, GCP), flags identical infrastructure fingerprints, and calculates overall similarity scores.
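The overall similarity score mentioned above could be computed in many ways. The following Python sketch uses a weighted Jaccard similarity over sets of observed features; both the scoring method and the weights are assumptions for illustration, and the names `jaccard`, `similarity_score`, and `WEIGHTS` are hypothetical.

```python
# Assumed weights per feature group; they sum to 1.0.
WEIGHTS = {"ips": 0.4, "nameservers": 0.3, "headers": 0.2, "tls": 0.1}


def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_score(fp_a: dict, fp_b: dict) -> float:
    """Weighted similarity between two domain fingerprints, in the range 0.0 to 1.0.

    Each fingerprint maps a feature group name to an iterable of observed
    values, e.g. {"ips": {"203.0.113.10"}, "nameservers": {"ns1.example.net"}}.
    """
    return sum(
        weight * jaccard(set(fp_a.get(key, ())), set(fp_b.get(key, ())))
        for key, weight in WEIGHTS.items()
    )
```

Treating two empty feature groups as contributing 0.0 keeps missing data from inflating the score; a different choice would be equally defensible.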

Inputs Required:

domain (string): A domain name to be fingerprinted for analysis.

Business Impact: Identifying infrastructure similarities is crucial for understanding the security relationships between domains. Shared IP addresses can indicate multi-tenant hosting or related sites, which may lead to unauthorized access and data sharing risks. Similar TLS configurations suggest centralized management, so a weakness in the shared configuration is likely to affect every domain that uses it. Recognizing shared infrastructure helps in managing shadow IT deployments and unauthorized use of organizational resources.

Risk Levels:

Critical: Multiple domains share identical infrastructure fingerprints or critical components such as IP addresses, TLS configurations, or DNS patterns.
High: A single domain shares significant infrastructure components with others but does not meet the criteria for Critical risk.
Medium: Sharing of less critical infrastructure elements, such as common server software versions.
Low: Minimal sharing of generic infrastructure details that do not pose immediate security risks.
Info: Insignificant overlap in DNS configurations, or minor differences in HTTP headers, that is generally safe but can indicate broader network management practices.

Example Findings:

A domain shares multiple IPs with known co-hosted domains, indicating shared hosting that may expose it to unauthorized access and data sharing risks.
Identical TLS configurations among several domains suggest centralized management, so any weak protocol version or cipher in that configuration is likely present on all of them.

Certificate Chain Analysis

Purpose: The Certificate Chain Analysis Scanner is designed to validate the integrity and trustworthiness of SSL/TLS certificate chains used by servers. It aims to detect chain breaks, trust issues, and misconfigurations that could enable man-in-the-middle attacks, thereby compromising the security of TLS connections.

What It Detects:

Chain Completeness Validation: The scanner retrieves the complete SSL/TLS certificate chain from a server and verifies its completeness by checking for the presence of an end-entity certificate, one or more intermediate certificates, and the root CA certificate. Incomplete chains that lack intermediates or the root CA are flagged as potential security risks. Servers commonly omit the root certificate because clients already hold it in their trust stores, so a missing intermediate is usually the more serious gap.
Chain Order Verification: The scanner validates that the certificates in the chain are ordered correctly, ensuring each certificate is signed by the subsequent certificate in the chain and that the issuer matches the subject of the next certificate. Misordered chains are identified as critical issues.
Trust Anchor Validation: This involves identifying the root CA certificate in the chain and verifying its presence in the system’s trust store. The scanner also checks whether the root belongs to a recognized Certificate Authority, flagging any unknown or untrusted roots as significant concerns.
Intermediate Certificate Analysis: The scanner checks all intermediate certificates for validity, ensuring they are not expired and that they meet basic constraints and key usage requirements set by their respective CAs. Weak or improperly configured intermediates are flagged to highlight potential vulnerabilities.
Signature Verification: Each certificate’s signature is verified against the expected algorithm, with a particular focus on identifying weak cryptographic algorithms such as SHA1 and MD5, which are known to be susceptible to attacks. Signature verification failures are reported as critical issues.
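The chain order and weak-algorithm checks above can be sketched on already-parsed certificate data. In this Python example the `CertInfo` type and both function names are hypothetical, and the check compares issuer and subject names only; it does not perform cryptographic signature verification, which would require a certificate parsing library.

```python
from typing import List, NamedTuple


class CertInfo(NamedTuple):
    """Minimal parsed view of one certificate (hypothetical structure)."""
    subject: str
    issuer: str
    signature_algorithm: str


WEAK_HASHES = ("md5", "sha1")


def chain_order_issues(chain: List[CertInfo]) -> List[str]:
    """Report each position where a certificate's issuer is not the next subject.

    The chain is expected leaf first, with each certificate followed by its issuer.
    """
    issues = []
    for i in range(len(chain) - 1):
        if chain[i].issuer != chain[i + 1].subject:
            issues.append(
                f"certificate {i} issuer does not match subject of certificate {i + 1}"
            )
    return issues


def weak_signatures(chain: List[CertInfo]) -> List[str]:
    """Return the subjects of certificates signed with an MD5- or SHA1-based algorithm."""
    return [
        cert.subject
        for cert in chain
        if any(h in cert.signature_algorithm.lower() for h in WEAK_HASHES)
    ]
```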

Inputs Required:

domain (string): A fully qualified domain name (e.g., ekkatha.com) that the scanner uses to retrieve SSL/TLS certificates from the server hosting this domain.

Business Impact: Certificate chain vulnerabilities can significantly impact the security of digital communications, potentially leading to unauthorized access, data leakage, and system compromise. Proper validation and enforcement of certificate chains are crucial for maintaining secure TLS connections.

Risk Levels:

Critical: Incomplete chains (missing intermediates or root CA); misordered certificates; expired intermediate certificates; self-signed roots that are not in the trust store; cross-signed certificates creating ambiguous trust paths.
High: Weak signature algorithms (SHA1, MD5); weak key usage in intermediates; basic constraints not met by intermediates.
Medium: Intermediate CAs with limited permissions or configurations that do not fully comply with CA standards.
Low: Minor deviations in chain order or minor weaknesses in individual certificates that do not significantly impact overall security.
Info: Informational findings, such as confirmation that the chain uses modern cryptographic algorithms for signatures and appropriate key usage, in line with recommended security practices.

Example Findings:

A server is found to be missing its intermediate CA certificate, leading to potential chain breakage that could allow man-in-the-middle attacks.
An expired root CA certificate in the chain results in an immediate critical alert indicating a complete trust breakdown and significant security risk.

Brand Asset Misuse

Purpose: The Brand Asset Misuse Scanner is designed to detect unauthorized use of corporate brand assets across the internet. This includes identifying phishing sites, counterfeit operations, and brand impersonation attempts by detecting typosquatting domains using similar brand names, stolen logos on fake e-commerce sites, unauthorized trademark use in malicious email campaigns, brand impersonation on social media for fraud, and visual identity theft for business email compromise attacks.

What It Detects:

Domain Similarity Detection: Generates common typosquatting variations of the brand domain, checks DNS registration status of similar domains, queries WHOIS for ownership information, flags registered domains not owned by the organization, and detects homograph attacks (Unicode lookalikes).
Logo Image Detection: Fetches homepage HTML and extracts image URLs, downloads potential logo images, and compares visual similarity to official brand logos using perceptual hashing.
Favicon Analysis: Retrieves favicon from target domains, compares against official brand favicon, calculates visual similarity score, detects exact or near-exact favicon copies.
Meta Tag Inspection: Extracts meta description and title tags, searches for brand name mentions, detects brand keyword stuffing, flags unauthorized brand references.
SSL Certificate Subject Analysis: Retrieves SSL certificate for domain, checks organization name in certificate subject, flags certificates claiming to be the brand, detects impersonation in cert CN/SAN fields.
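The first step of domain similarity detection, generating candidate lookalike domains, might look like the Python sketch below. The function names are hypothetical and the variation types (character omission, adjacent swap, character doubling, TLD swap) are a small assumed subset of what a production generator would cover. The DNS and WHOIS lookups that follow are not shown.

```python
def typo_variants(domain: str, alt_tlds=("com", "net", "org")) -> set:
    """Generate simple typosquatting candidates for a domain like 'ekkatha.com'.

    Assumes a single registrable label followed by the TLD; subdomains are not handled.
    """
    domain = domain.lower()
    name, _, tld = domain.partition(".")
    variants = set()

    if len(name) > 1:
        # Omission: drop one character at a time.
        for i in range(len(name)):
            variants.add(name[:i] + name[i + 1:] + "." + tld)
        # Transposition: swap each pair of adjacent characters.
        for i in range(len(name) - 1):
            swapped = name[:i] + name[i + 1] + name[i] + name[i + 2:]
            variants.add(swapped + "." + tld)

    # Duplication: double each character in turn.
    for i in range(len(name)):
        variants.add(name[:i + 1] + name[i] + name[i + 1:] + "." + tld)

    # TLD swap: same name under a different top-level domain.
    for alt in alt_tlds:
        variants.add(name + "." + alt)

    variants.discard(domain)  # never report the official domain itself
    return variants


def uses_idn(domain: str) -> bool:
    """True if any label is non-ASCII or punycode-encoded, a homograph warning sign."""
    labels = domain.lower().split(".")
    return any(label.startswith("xn--") or not label.isascii() for label in labels)
```

For the official domain `ekkatha.com`, the generated set includes candidates such as `ekatha.com` and `ekkatha.net`, each of which would then be checked for registration and ownership.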

Inputs Required:

domain (string): Official brand domain (e.g., ekkatha.com)

Business Impact: Brand asset misuse can enable sophisticated attacks that pose significant risks to corporate security and reputation. Detecting unauthorized use of brand assets early helps mitigate the risk of phishing, counterfeit operations, and fraudulent activities, protecting both customers and businesses from potential harm.

Risk Levels:

Critical: Typosquatting domains registered under similar brand names for phishing, and stolen logos on fake e-commerce sites selling counterfeit products, pose a direct risk of financial loss and damage to corporate reputation.
High: Unauthorized trademark use in malicious email campaigns, brand impersonation on social media for fraud, and visual identity theft for business email compromise attacks are considered high risks as they can lead to unauthorized access to sensitive information and potential financial losses.
Medium: Brand keyword stuffing and unauthorized mentions in meta tags may not directly cause significant harm but indicate a lack of compliance with corporate brand usage guidelines and contribute to overall risk perception.
Low: Informational findings such as minor variations in domain names or slight deviations in logo similarity scores, while still considered misuse, generally pose minimal direct risks unless accompanied by other indicators of higher severity.
Info: These are less critical but still noteworthy instances that might not directly impact security but could be indicative of broader issues needing attention.

Example Findings:

A typosquatting domain “ekkatha.net” registered for phishing purposes, detected by DNS registration checks and WHOIS queries.
A fake e-commerce site using a stolen logo from the official brand, identified through image comparison against known logos stored in the database.
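The image comparison in the second example relies on perceptual hashing. As a hedged illustration, the Python sketch below implements one simple variant, an average hash, over a small grayscale pixel grid and compares two hashes by Hamming distance. The function names are hypothetical, the choice of average hash is an assumption (the specific algorithm is not stated above), and decoding and downscaling real images would need an imaging library that is not shown.

```python
def average_hash(pixels) -> int:
    """Compute an average hash from a grid of grayscale values (0-255).

    Each pixel contributes one bit: 1 if brighter than the grid mean, else 0.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Number of bit positions in which the two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def hash_similarity(hash_a: int, hash_b: int, n_bits: int) -> float:
    """Similarity in the range 0.0 to 1.0, where 1.0 means identical hashes."""
    return 1.0 - hamming_distance(hash_a, hash_b) / n_bits
```

In practice both images are usually reduced to the same small grid (8x8 is common) before hashing, so that slightly resized or recompressed copies of a logo still produce nearby hashes.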