Apache Tika Hit by Critical XXE Bug CVE-2025-66516

A major security warning has been issued for Apache Tika, a widely used tool for content detection and file parsing. A newly discovered vulnerability, identified as CVE-2025-66516, has been rated CVSS 10.0, the highest possible severity score. This makes the bug extremely dangerous for organisations and developers who rely on Tika for processing documents such as PDFs, XML files, and many other formats.

The flaw exposes Apache Tika to XML External Entity (XXE) injection, a type of attack that can allow cybercriminals to read sensitive files from the server, access internal network resources, and in some cases even achieve remote code execution (RCE). Because Tika is often used in enterprise environments, document-processing pipelines, and search systems, this vulnerability poses a high-impact risk to many businesses.

CVE-2025-66516 is a critical XXE vulnerability affecting multiple Apache Tika modules across all supported platforms. According to the official advisory, the issue allows attackers to exploit Tika using a specially crafted XFA (XML Forms Architecture) file embedded inside a PDF. If the malicious file is parsed by an affected version of Tika, the attacker can trigger XXE behavior and gain unauthorized access to sensitive data stored on the server.

Affected Apache Tika Modules

The vulnerability impacts the following Maven packages:

  • org.apache.tika:tika-core
    Versions: 1.13 to 3.2.1
    Patched in: 3.2.2

  • org.apache.tika:tika-parser-pdf-module
    Versions: 2.0.0 to 3.2.1
    Patched in: 3.2.2

  • org.apache.tika:tika-parsers
    Versions: 1.13 to 1.28.5
    Patched in: 2.0.0

Because these components are commonly used in Tika-based deployments, a large number of applications may be at risk if not properly updated.
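
As a rough aid for checking a dependency list against the ranges above, the following Python sketch encodes the three affected ranges and tests a given artifact and version against them. The names `AFFECTED`, `parse_version`, and `is_affected` are illustrative and not part of any Tika tooling, and the version parsing is deliberately simple: it ignores qualifiers such as `-SNAPSHOT`, which is an assumption rather than full Maven version ordering.

```python
import re

# Affected ranges as listed in this article (inclusive on both ends).
AFFECTED = {
    "org.apache.tika:tika-core": ((1, 13, 0), (3, 2, 1)),
    "org.apache.tika:tika-parser-pdf-module": ((2, 0, 0), (3, 2, 1)),
    "org.apache.tika:tika-parsers": ((1, 13, 0), (1, 28, 5)),
}


def parse_version(text):
    """Turn '1.28.5' or '3.2.1-SNAPSHOT' into a 3-tuple of ints.

    Simplification: anything after the first '-' is ignored.
    """
    numbers = [int(n) for n in re.findall(r"\d+", text.split("-")[0])]
    return tuple((numbers + [0, 0, 0])[:3])


def is_affected(artifact, version):
    """Return True if the artifact/version pair falls in a listed range."""
    bounds = AFFECTED.get(artifact)
    if bounds is None:
        return False  # artifact not listed in the advisory summary above
    low, high = bounds
    return low <= parse_version(version) <= high
```

For example, `is_affected("org.apache.tika:tika-core", "3.2.1")` returns `True`, while the same call with `"3.2.2"` returns `False`. A result of `False` for one artifact does not clear a deployment on its own, since every listed module present in the build needs to be checked.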

To understand the seriousness of this bug, it is important to know what XML External Entity (XXE) attacks are. XXE vulnerabilities occur when an application processes XML input without properly restricting external entity references. This weakness lets attackers:

  • read sensitive files on the server

  • send unauthorized requests to internal systems (SSRF)

  • steal credentials or private keys

  • disrupt application logic

  • potentially execute remote code in extreme cases

Since Apache Tika automatically parses documents uploaded or submitted to applications, it becomes a powerful target for XXE exploitation, especially in systems that handle user-submitted files.
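
To make the attack shape concrete, here is a minimal Python sketch of one common defence: refusing any XML input that declares an entity. It is not Tika's code and not the actual fix for CVE-2025-66516; it only illustrates the general idea using Python's standard `xml.parsers.expat` module. The helper name `assert_no_entities`, the exception `EntityDeclared`, and the sample payloads are assumptions made for illustration.

```python
from xml.parsers import expat


class EntityDeclared(ValueError):
    """Raised when the XML input declares an entity."""


def assert_no_entities(xml_bytes):
    """Parse xml_bytes and raise EntityDeclared on any entity declaration."""
    parser = expat.ParserCreate()

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        # Called by expat for every <!ENTITY ...> declaration in the DTD.
        raise EntityDeclared(
            f"entity {name!r} declared (system id: {system_id!r})"
        )

    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_bytes, True)  # True marks this as the final chunk


# A harmless document with no DTD.
SAFE = b"<form><field>hello</field></form>"

# A typical XXE payload: an external entity pointing at a local file.
MALICIOUS = (
    b'<?xml version="1.0"?>'
    b'<!DOCTYPE form [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
    b"<form><field>&xxe;</field></form>"
)
```

Calling `assert_no_entities(SAFE)` returns without error, while `assert_no_entities(MALICIOUS)` raises `EntityDeclared` as soon as the declaration is seen, before the entity is ever referenced. Rejecting declarations outright is stricter than disabling external resolution, and it is a reasonable default when the application has no need for DTDs.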

CVE-2025-66516 is closely related to an earlier XXE vulnerability in Tika, tracked as CVE-2025-54988, which carried a CVSS score of 8.4. That flaw was patched in August 2025, but the Apache Tika team has now clarified that the earlier fix did not fully address all affected components.

The new advisory highlights two major oversights:

First, the previous report focused on the tika-parser-pdf-module, but the actual root cause was inside tika-core. This means that users who updated the PDF parser module after the earlier advisory but did not upgrade tika-core remained vulnerable. The fix for CVE-2025-66516 is included only in tika-core version 3.2.2 or later.

Second, the earlier disclosure did not mention that in Tika 1.x versions, the PDF parser code lived in the tika-parsers module. As a result, organisations using Tika 1.x may have mistakenly assumed they were unaffected.

This expanded scope makes CVE-2025-66516 a far broader and more critical issue than initially expected.

A CVSS score of 10.0 reflects a vulnerability that:

  • is easy to exploit

  • does not require authentication

  • can lead to data exposure or total system compromise

  • affects all platforms (Windows, Linux, macOS)

Apache Tika often runs in:

  • content indexing systems

  • document ingestion pipelines

  • data extraction services

  • enterprise search engines

  • cloud-based document analysis tools

Any organisation using Tika in automated workflows where users can upload or submit files is at high risk.

Given the severity of CVE-2025-66516, experts strongly recommend immediate upgrades to safe versions:

  • Update tika-core to 3.2.2 or higher

  • Update tika-parser-pdf-module to 3.2.2 or higher

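  • If using the 1.x branch, move off tika-parsers 1.x (listed as patched in 2.0.0) and confirm that tika-core is also at 3.2.2 or higher, since the fix ships only in tika-core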

You should also:

  • scan your application logs for suspicious XML-based activity

  • audit systems where user-uploaded PDFs are processed

  • review your XML parser configurations

  • enable secure parsing options where applicable

  • ensure your DevOps pipelines use the latest packages
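
As a starting point for auditing stored uploads, the Python sketch below flags PDF files whose raw bytes mention the `/XFA` key, since the advisory describes the attack as arriving through an XFA file embedded in a PDF. This is only a heuristic under stated assumptions: the key can be hidden inside compressed object streams or written with hex escapes, so a miss does not prove a file is clean, and a hit only means the PDF carries an XFA form, not that it is malicious. The function names `mentions_xfa` and `find_xfa_pdfs` are hypothetical.

```python
from pathlib import Path


def mentions_xfa(pdf_bytes):
    """Return True if the raw bytes contain the literal /XFA key name."""
    return b"/XFA" in pdf_bytes


def find_xfa_pdfs(directory):
    """Yield paths of *.pdf files under `directory` that mention /XFA."""
    for path in sorted(Path(directory).rglob("*.pdf")):
        if mentions_xfa(path.read_bytes()):
            yield path
```

A call such as `list(find_xfa_pdfs("/path/to/uploads"))`, with the directory replaced by a real upload location, gives a shortlist of files worth closer inspection with a proper PDF analysis tool.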

CVE-2025-66516 is one of the most critical vulnerabilities reported in Apache Tika to date. Because it enables XXE injection through malicious PDF files, it exposes organisations to serious risks including data leakage and possible remote code execution. The expanded scope of affected modules means that many applications previously considered safe may still be vulnerable.

Updating to the patched versions is the only reliable way to mitigate this threat. With a CVSS score of 10.0 and widespread use of Tika across industries, applying the fix should be treated as an immediate priority.
