Improve security posture of PDF reporting (#12160)

* Add custom URL fetcher for PDF rendering
* Fix for report helper functions
* Use new fetcher
* Additional unit tests
* Add new setting to control remote URL fetching
* validate URLs against SSRF
* Add global setting to disable URL fetching entirely
* Update docs
* Fix capitalization
* Fix logging backend
* Update CHANGELOG
2026-07-04 06:00:38 +00:00 · 2026-06-14 10:55:51 +10:00
parent b294bba66b
commit 2b4f303770
12 changed files with 327 additions and 7 deletions
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Breaking Changes

+- [#12160](https://github.com/inventree/InvenTree/pull/12160) changes the way that remote URIs are loaded into the PDF report generator. Remote URIs (e.g. `http://` and `https://`) are now blocked by default, and can be enabled via the `REPORT_FETCH_URLS` system setting. This change was made to improve the security of the report generation process, as allowing remote URL fetching can potentially expose internal network services to SSRF attacks. Additionally, file URIs (e.g. `file://`) are now always blocked, and assets must be embedded as `data:` URIs before reaching the PDF generator.
 - [#12107](https://github.com/inventree/InvenTree/pull/12107) makes a breaking change to the `SalesOrderStatusGroups` enum, fixing a bug where the "shipped" status was not included in the "active" group. This change may affect any external client applications which make use of the `SalesOrderStatusGroups` enum, as the "shipped" status will now be included in the "active" group instead of the "complete" group. If you are using this enum in an external client application, you will need to update your application to account for this change.
 - [#9604](https://github.com/inventree/InvenTree/pull/9604) refactors user API endpoint to be less ambiguous
 - [#11893](https://github.com/inventree/InvenTree/pull/11893) bumps Node environment to version 24 LTS - this is only relevant if you build the frontend assets yourself
@@ -230,3 +230,35 @@ And the snippet file `stock_row.html` may be written as follows:
 </tr>
 {% endraw %}
 ```
+
+## Security
+
+Report templates are powerful by design — they have access to the full Django template language and to model data across the InvenTree database. For this reason, **template upload is restricted to staff users only**.
+
+### URL Fetching
+
+When WeasyPrint renders a template to PDF it can make outbound requests to load images, stylesheets, and fonts referenced in the HTML. InvenTree restricts this through a custom URL fetcher with the following rules:
+
+| URL Type | Behavior |
+|---|---|
+| `data:` URIs | Always permitted — self-contained, no network access |
+| `file://` | Always blocked — assets and images must be inlined as `data:` URIs before reaching WeasyPrint |
+| `http` / `https` | Disabled by default, but can be enabled - see *Remote URL Fetching* below |
+| Any other scheme | Always blocked |
+
+HTTP redirects are also disabled: a URL that passes validation cannot redirect to an internal address.
+
+### Remote URL Fetching
+
+The **Report URL Fetching** system setting (`REPORT_FETCH_URLS`) controls whether `http://` and `https://` URLs in templates are permitted. It defaults to **disabled**.
+
+When enabled, URLs are still validated against private, loopback, link-local, and reserved IP ranges before the request is made, preventing templates from being used as a vector for [Server-Side Request Forgery (SSRF)](https://owasp.org/www-community/attacks/Server_Side_Request_Forgery) attacks against internal network services.
+
+!!! warning "Enable with care"
+    Enabling remote URL fetching allows report templates to trigger outbound HTTP requests from the InvenTree server. Only enable this if your templates genuinely require it, and ensure that templates are reviewed before deployment.
+
+### Asset Files
+
+Asset files uploaded through the admin interface are embedded directly into the rendered PDF as base64 `data:` URIs — they are read via the Django storage API and never loaded through WeasyPrint's URL fetcher. This means assets work correctly regardless of whether remote URL fetching is enabled, and also work with remote storage backends such as S3.
+
+There are various [helper functions](./helpers.md#report-assets) available to assist with embedding assets into templates.
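
The scheme rules documented in the Security section can be summarised as a small decision function. This is a minimal sketch for illustration only, not the code added by this change: `classify_url` and its `fetch_urls_enabled` parameter are hypothetical names (the parameter stands in for the `REPORT_FETCH_URLS` setting), and the sketch leaves out the address validation that still applies to permitted `http` / `https` URLs.

```python
from urllib.parse import urlparse


def classify_url(url: str, fetch_urls_enabled: bool = False) -> str:
    """Return 'allow' or 'block' for a URL, following the documented scheme rules.

    Hypothetical helper: fetch_urls_enabled stands in for REPORT_FETCH_URLS.
    """
    scheme = urlparse(url).scheme.lower()

    # data: URIs are self-contained and never touch the network.
    if scheme == 'data':
        return 'allow'

    # Remote URLs are only considered when the setting is enabled.
    if scheme in ('http', 'https'):
        return 'allow' if fetch_urls_enabled else 'block'

    # file:// and every other scheme are always refused.
    return 'block'
```

With the default argument, `classify_url('https://example.com/image.png')` returns `'block'`, which matches the documented default of remote fetching being disabled.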
@@ -1,5 +1,5 @@
 ---
-title: Report and LabelGeneration
+title: Report and Label Generation
 ---

 ## Custom Reports
@@ -144,6 +144,7 @@ Configuration of report generation:
 {{ globalsetting("REPORT_ENABLE") }}
 {{ globalsetting("REPORT_DEFAULT_PAGE_SIZE") }}
 {{ globalsetting("REPORT_DEBUG_MODE") }}
+{{ globalsetting("REPORT_FETCH_URLS") }}
 {{ globalsetting("REPORT_LOG_ERRORS") }}

 ### Label Printing
@@ -677,6 +677,12 @@ SYSTEM_SETTINGS: dict[str, InvenTreeSettingsKeyType] = {
        'default': False,
        'validator': bool,
    },
+    'REPORT_FETCH_URLS': {
+        'name': _('Report URL Fetching'),
+        'description': _('Allow fetching of remote URLs when generating reports'),
+        'default': False,
+        'validator': bool,
+    },
    'REPORT_LOG_ERRORS': {
        'name': _('Log Report Errors'),
        'description': _('Log errors which occur when generating reports'),
@@ -15,6 +15,7 @@ from common.models import DataOutput
 from InvenTree.helpers import str2bool
 from plugin import InvenTreePlugin
 from plugin.mixins import LabelPrintingMixin, SettingsMixin
+from report.fetcher import InvenTreeURLFetcher
 from report.models import LabelTemplate

 logger = structlog.get_logger('inventree')
@@ -168,7 +169,7 @@ class InvenTreeLabelSheetPlugin(LabelPrintingMixin, SettingsMixin, InvenTreePlug
            generated_file = ContentFile(html_data, 'labels.html')
        else:
            # Render HTML to PDF
-            html = weasyprint.HTML(string=html_data)
+            html = weasyprint.HTML(string=html_data, url_fetcher=InvenTreeURLFetcher())
            document = html.render().write_pdf()
            generated_file = ContentFile(document, 'labels.pdf')

@@ -0,0 +1,61 @@
+"""WeasyPrint URL fetcher with security restrictions for report generation."""
+
+from urllib.parse import urlparse
+
+import structlog
+from weasyprint.urls import URLFetcher
+
+logger = structlog.get_logger('inventree')
+
+
+class InvenTreeURLFetcher(URLFetcher):
+    """WeasyPrint URL fetcher restricted to safe origins."""
+
+    def __init__(self, **kwargs):
+        """Disable redirect following so a same-origin URL cannot be used for SSRF."""
+        kwargs.setdefault('allow_redirects', False)
+        super().__init__(**kwargs)
+
+    def fetch(self, url, headers=None):
+        """Validate *url* before delegating to the parent fetcher."""
+        parsed = urlparse(url)
+        scheme = parsed.scheme.lower()
+
+        if scheme in ('data', 'http', 'https'):
+            self._validate_http_url(url, parsed)
+            return super().fetch(url, headers)
+
+        if scheme == 'file':
+            logger.warning("InvenTreeURLFetcher: blocked file:// URL: '%s'", url)
+            raise ValueError(
+                f'file:// URLs are not permitted in report templates: {url}'
+            )
+
+        raise ValueError(f"URL scheme '{scheme}' is not permitted in report templates")
+
+    def _validate_http_url(self, url: str, parsed) -> None:
+        """Raise if HTTP/HTTPS fetching is disabled or the URL is an SSRF risk."""
+        from common.settings import get_global_setting
+        from InvenTree.helpers_model import validate_url_no_ssrf
+
+        if not parsed.netloc:
+            # data: URIs — self-contained, no network access required.
+            return
+
+        if not get_global_setting('REPORT_FETCH_URLS', cache=False):
+            logger.warning(
+                "InvenTreeURLFetcher: blocked URL '%s': remote fetching is disabled (REPORT_FETCH_URLS=False)",
+                url,
+            )
+            raise ValueError(
+                f'Remote URL fetching is disabled in report templates: {url}'
+            )
+
+        try:
+            validate_url_no_ssrf(url)
+        except ValueError:
+            logger.warning(
+                "InvenTreeURLFetcher: blocked URL '%s': resolves to a private or reserved address",
+                url,
+            )
+            raise
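
The body of `validate_url_no_ssrf` is not part of this diff. As a rough illustration of the kind of check such a validator might apply to an IP address, here is a sketch using the standard `ipaddress` module. `is_blocked_address` is a hypothetical name, the four categories are taken from the documentation text in this change, and the real helper may differ (for example in how it resolves hostnames before checking).

```python
import ipaddress


def is_blocked_address(ip: str) -> bool:
    """Return True if the IP literal falls in a range the docs list as blocked.

    Hypothetical sketch: covers private, loopback, link-local and reserved ranges.
    """
    addr = ipaddress.ip_address(ip)
    return (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_reserved
    )
```

Under this sketch, a loopback address such as `127.0.0.1` (the one used in `test_ssrf_url_blocked_in_render`) is rejected, while a public address such as `8.8.8.8` is not.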
@@ -36,6 +36,8 @@ from plugin.registry import registry

 try:
    from weasyprint import HTML
+
+    from report.fetcher import InvenTreeURLFetcher
 except OSError as err:  # pragma: no cover
    print(f'OSError: {err}')
    print("Unable to import 'weasyprint' module.")
@@ -277,7 +279,9 @@ class ReportTemplateBase(
            bytes: PDF data
        """
        html = self.render_as_string(instance, context=context, **kwargs)
-        pdf = HTML(string=html).write_pdf(pdf_forms=True)
+        pdf = HTML(string=html, url_fetcher=InvenTreeURLFetcher()).write_pdf(
+            pdf_forms=True
+        )

        return pdf

@@ -2,6 +2,7 @@

 import base64
 import logging
+import mimetypes
 from datetime import date, datetime
 from decimal import Decimal, InvalidOperation
 from io import BytesIO
@@ -288,13 +289,21 @@ def asset(filename: str, raise_error: bool = False) -> str | None:
        else:
            return None

-    # In debug mode, return a web URL to the asset file (rather than a local file path)
+    # In debug mode, return a web URL to the asset file (rather than encoded data)
    if get_global_setting('REPORT_DEBUG_MODE', cache=False):
        return default_storage.url(str(full_path))

-    storage_path = default_storage.path(str(full_path))
+    file_data = get_media_file_contents(full_path, raise_error=raise_error)

-    return f'file://{storage_path}'
+    if not file_data:
+        return None
+
+    mime_type, _encoding = mimetypes.guess_type(str(filename))
+    if not mime_type:
+        mime_type = 'application/octet-stream'
+
+    encoded = base64.b64encode(file_data).decode('ascii')
+    return f'data:{mime_type};base64,{encoded}'


@register.simple_tag()
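
The new return path of `asset` builds a base64 `data:` URI from the file contents. The same technique in isolation might look like the sketch below. `to_data_uri` is a hypothetical standalone helper that takes raw bytes directly, whereas the changed code reads them through `get_media_file_contents`.

```python
import base64
import mimetypes


def to_data_uri(file_data: bytes, filename: str) -> str:
    """Encode raw bytes as a base64 data: URI, guessing the MIME type from the name."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    if not mime_type:
        # Fall back to a generic binary type when the extension is unknown.
        mime_type = 'application/octet-stream'

    encoded = base64.b64encode(file_data).decode('ascii')
    return f'data:{mime_type};base64,{encoded}'
```

For example, `to_data_uri(b'hello', 'test.txt')` yields `data:text/plain;base64,aGVsbG8=`, which has the `data:text/plain;base64,` prefix that the updated unit test asserts on.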
@@ -88,7 +88,9 @@ class ReportTagTest(PartImageTestMixin, InvenTreeTestCase):

        self.debug_mode(False)
        asset = report_tags.asset('test.txt')
-        self.assertEqual(asset, f'file://{settings.MEDIA_ROOT}/report/assets/test.txt')
+        # Non-debug mode returns a base64 data: URI (no file:// URLs)
+        self.assertIsNotNone(asset)
+        self.assertTrue(asset.startswith('data:text/plain;base64,'))

        # Test for attempted path traversal
        with self.assertRaises(ValidationError):
@@ -2,12 +2,14 @@

 import os
 from io import StringIO
+from unittest.mock import patch

 from django.apps import apps
 from django.conf import settings
 from django.core.cache import cache
 from django.core.files.base import ContentFile
 from django.core.files.storage import default_storage
+from django.test import TestCase
 from django.urls import reverse

 from pypdf import PdfReader
@@ -871,3 +873,203 @@ class AdminTest(AdminTestCase):
    def test_admin(self):
        """Test the admin URL."""
        self.helper(model=ReportTemplate)
+
+
+class URLFetcherE2ETest(ReportTest):
+    """End-to-end test: the URL fetcher blocks malicious URLs during a real PDF render.
+
+    Extends ReportTest so that fixtures, auth, and default report templates are all
+    available.  The print task runs synchronously in test mode (no workers), so the
+    render completes inline and log output is captured within the same request.
+    """
+
+    def test_file_url_blocked_in_render(self):
+        """A template embedding a file:// URL must still produce a PDF, but the URL must be blocked and logged."""
+        from io import StringIO
+
+        # Upload a minimal report template that embeds a malicious file:// reference.
+        html = (
+            '<html><body>'
+            '<img src="file:///etc/passwd">'
+            '<p>Security test content</p>'
+            '</body></html>'
+        )
+        template_io = StringIO(html)
+        template_io.name = 'security_test_template.html'
+
+        response = self.post(
+            reverse('api-report-template-list'),
+            data={
+                'name': 'Security Test',
+                'description': 'Tests that file:// URLs are blocked during rendering',
+                'template': template_io,
+                'model_type': 'stockitem',
+            },
+            format=None,
+            expected_code=201,
+        )
+        template_pk = response.data['pk']
+
+        item = StockItem.objects.first()
+        self.assertIsNotNone(item)
+
+        # Render the template.  WeasyPrint catches the ValueError from our fetcher and
+        # continues, so the PDF is still generated — the blocked resource is just skipped.
+        with self.assertLogs('inventree', level='WARNING') as captured:
+            response = self.post(
+                reverse('api-report-print'),
+                {'template': template_pk, 'items': [item.pk]},
+                expected_code=201,
+            )
+
+        # A PDF output should have been produced despite the blocked resource.
+        self.assertTrue(response.data['output'].endswith('.pdf'))
+
+        # The fetcher must have logged a warning identifying the blocked URL.
+        blocked_warnings = [
+            msg
+            for msg in captured.output
+            if 'blocked file://' in msg and '/etc/passwd' in msg
+        ]
+        self.assertTrue(
+            blocked_warnings, 'Expected a blocked file:// warning in the log output'
+        )
+
+    def test_ssrf_url_blocked_in_render(self):
+        """A template embedding an HTTP URL to a private/reserved address must be blocked and logged."""
+        from io import StringIO
+
+        # 127.0.0.1 is loopback — validate_url_no_ssrf rejects it regardless of port.
+        html = (
+            '<html><body>'
+            '<img src="http://127.0.0.1/ssrf-probe">'
+            '<p>Security test content</p>'
+            '</body></html>'
+        )
+        template_io = StringIO(html)
+        template_io.name = 'ssrf_test_template.html'
+
+        response = self.post(
+            reverse('api-report-template-list'),
+            data={
+                'name': 'SSRF Test',
+                'description': 'Tests that SSRF URLs are blocked during rendering',
+                'template': template_io,
+                'model_type': 'stockitem',
+            },
+            format=None,
+            expected_code=201,
+        )
+        template_pk = response.data['pk']
+
+        item = StockItem.objects.first()
+        self.assertIsNotNone(item)
+
+        # Render the template.  WeasyPrint catches the ValueError from our fetcher and
+        # continues, so the PDF is still generated — the blocked resource is just skipped.
+        with self.assertLogs('inventree', level='WARNING') as captured:
+            response = self.post(
+                reverse('api-report-print'),
+                {'template': template_pk, 'items': [item.pk]},
+                expected_code=201,
+            )
+
+        # A PDF output should have been produced despite the blocked resource.
+        self.assertTrue(response.data['output'].endswith('.pdf'))
+
+        # The fetcher must have logged a warning for the blocked SSRF attempt.
+        blocked_warnings = [
+            msg
+            for msg in captured.output
+            if 'blocked URL' in msg and '127.0.0.1' in msg
+        ]
+        self.assertTrue(
+            blocked_warnings, 'Expected a blocked SSRF URL warning in the log output'
+        )
+
+    def test_fetch_urls_disabled_blocks_http(self):
+        """When REPORT_FETCH_URLS is False, any http/https URL in a template must be blocked."""
+        from io import StringIO
+
+        from common.settings import set_global_setting
+
+        # Use a publicly routable address so the test would reach the network if
+        # our guard were absent — we want to confirm it is stopped by the setting,
+        # not by a secondary SSRF IP check.
+        html = (
+            '<html><body>'
+            '<img src="https://example.com/image.png">'
+            '<p>Security test content</p>'
+            '</body></html>'
+        )
+        template_io = StringIO(html)
+        template_io.name = 'fetch_disabled_test_template.html'
+
+        response = self.post(
+            reverse('api-report-template-list'),
+            data={
+                'name': 'Fetch Disabled Test',
+                'description': 'Tests that HTTP fetching is blocked when REPORT_FETCH_URLS=False',
+                'template': template_io,
+                'model_type': 'stockitem',
+            },
+            format=None,
+            expected_code=201,
+        )
+        template_pk = response.data['pk']
+
+        item = StockItem.objects.first()
+        self.assertIsNotNone(item)
+
+        set_global_setting('REPORT_FETCH_URLS', False, change_user=None)
+
+        with self.assertLogs('inventree', level='WARNING') as captured:
+            response = self.post(
+                reverse('api-report-print'),
+                {'template': template_pk, 'items': [item.pk]},
+                expected_code=201,
+            )
+
+        self.assertTrue(response.data['output'].endswith('.pdf'))
+
+        blocked_warnings = [
+            msg
+            for msg in captured.output
+            if 'REPORT_FETCH_URLS' in msg and 'example.com' in msg
+        ]
+        self.assertTrue(
+            blocked_warnings, 'Expected a REPORT_FETCH_URLS warning in the log output'
+        )
+
+
+class URLFetcherTest(TestCase):
+    """Tests for InvenTreeURLFetcher security restrictions."""
+
+    def setUp(self):
+        """Import fetcher for each test."""
+        from report.fetcher import InvenTreeURLFetcher
+
+        self.fetcher = InvenTreeURLFetcher()
+
+    def test_file_url_blocked(self):
+        """file:// URLs must always be rejected regardless of path."""
+        for url in [
+            'file:///etc/passwd',
+            'file:///proc/self/environ',
+            f'file://{settings.MEDIA_ROOT}/report/assets/anything.png',
+            f'file://{settings.STATIC_ROOT}/some/font.ttf',
+        ]:
+            with self.assertRaises(ValueError, msg=f'Expected block for {url}'):
+                self.fetcher.fetch(url)
+
+    def test_unknown_scheme_blocked(self):
+        """Non-http/data/file schemes must be rejected."""
+        for url in ['ftp://example.com/file.txt', 'javascript://x']:
+            with self.assertRaises(ValueError, msg=f'Expected block for {url}'):
+                self.fetcher.fetch(url)
+
+    def test_data_uri_allowed(self):
+        """data: URIs must always be permitted."""
+        with patch('weasyprint.urls.URLFetcher.fetch', return_value={}):
+            self.fetcher.fetch('data:image/png;base64,abc123')
+            self.fetcher.fetch('data:text/css;base64,abc123')
@@ -199,6 +199,7 @@ export default function SystemSettings() {
              'REPORT_ENABLE',
              'REPORT_DEFAULT_PAGE_SIZE',
              'REPORT_DEBUG_MODE',
+              'REPORT_FETCH_URLS',
              'REPORT_LOG_ERRORS',
              'LABEL_ENABLE',
              'LABEL_DPI'