-
Notifications
You must be signed in to change notification settings - Fork 0
Conversation
WalkthroughThe email attachment extraction now parses URLs from content-disposition safely, returning None instead of raising errors when parsing fails. Attachments are filtered to include only HTTP/HTTPS URLs with non-localhost hostnames. URL parsing and hostname validation are added, and docstrings are updated accordingly. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant E as EmailParser
participant CD as _parse_url_from_content_disposition
participant U as urlparse
participant F as AttachmentFilter
E->>CD: Extract URL from Content-Disposition
CD->>U: Parse URL
alt Valid URL
U-->>CD: url (scheme, host, ...)
CD-->>E: URL
E->>F: Validate scheme/hostname
alt http/https and non-localhost
F-->>E: Keep attachment
else Invalid host or scheme
F-->>E: Drop attachment
end
else Missing/invalid header
CD-->>E: None
E->>F: No URL -> Drop attachment
end
note over E: Returns only validated attachments
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
agentuity/io/email.py (1)
90-121: Add None check for _url before use.The
data()method usesself._urlat line 105 without checking if it's None. While theattachmentsproperty filters out None URLs, if anIncomingEmailAttachmentis instantiated directly anddata()is called, it will fail with a non-descriptive error (TypeError: unsupported operand type(s)when httpx tries to use None as a URL).Add a guard at the beginning of the method:
async def data(self): """ Return a Data object that streams the attachment data asynchronously. """ + if self._url is None: + raise ValueError("Attachment URL is not available") tracer = trace.get_tracer("email")
🧹 Nitpick comments (2)
agentuity/io/email.py (2)
298-299: Consider more specific exception handling.Catching all exceptions with a bare
except Exceptionmay hide legitimate parsing or validation errors that should be logged or handled differently. Consider being more specific about which exceptions to catch.- except Exception: - continue + except (ValueError, TypeError, AttributeError) as e: + # Log or handle specific parsing errors if needed + continue
387-387: Optional: Remove explicit return None.Python functions implicitly return
Nonewhen execution completes without a return statement. The explicitreturn Noneis unnecessary but harmless.response = await client.post(url, content=email_body, headers=headers) response.raise_for_status() - return None
| hostname_lower = hostname.lower() | ||
| if hostname_lower in ('localhost', '127.0.0.1', '::1'): | ||
| continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incomplete localhost/loopback address filtering.
The hostname filtering only checks for exact matches of 'localhost', '127.0.0.1', and '::1'. This misses:
- Other 127.x.x.x loopback addresses (e.g., 127.0.0.2, 127.1.1.1)
- Alternative IPv6 loopback representations (e.g.,
0:0:0:0:0:0:0:1) - Private network ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) that could enable SSRF
Consider using a more comprehensive check:
- hostname_lower = hostname.lower()
- if hostname_lower in ('localhost', '127.0.0.1', '::1'):
- continue
+ import ipaddress
+
+ hostname_lower = hostname.lower()
+ if hostname_lower == 'localhost':
+ continue
+
+ # Check if hostname is an IP address
+ try:
+ ip = ipaddress.ip_address(hostname)
+ if ip.is_loopback or ip.is_private or ip.is_link_local:
+ continue
+ except ValueError:
+ # Not an IP address, proceed with domain name
+ pass📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| hostname_lower = hostname.lower() | |
| if hostname_lower in ('localhost', '127.0.0.1', '::1'): | |
| continue | |
| import ipaddress | |
| hostname_lower = hostname.lower() | |
| if hostname_lower == 'localhost': | |
| continue | |
| # Check if hostname is an IP address | |
| try: | |
| ip = ipaddress.ip_address(hostname) | |
| if ip.is_loopback or ip.is_private or ip.is_link_local: | |
| continue | |
| except ValueError: | |
| # Not an IP address, proceed with domain name | |
| pass |
Summary by CodeRabbit