An intended feature with security implications
Last year, security researchers from Bishop Fox found and reported five vulnerabilities in the Ray framework. Anyscale, the company that maintains the software, patched four of them (CVE-2023-6019, CVE-2023-6020, CVE-2023-6021 and CVE-2023-48023) in version 2.8.1, but left the fifth, CVE-2023-48022, unfixed, arguing that it was not really a vulnerability.
That’s because CVE-2023-48022 stems directly from the fact that the Ray dashboard and client API do not implement authentication controls. As a result, any attacker who can reach the API endpoints can submit new jobs, delete existing jobs, retrieve sensitive information, and essentially achieve remote command execution.
The problem is that, for a framework whose main goal is to facilitate the execution of workloads across compute clusters, “remote command execution” is essentially a feature, and the lack of authentication is also by design. “Due to Ray’s nature as a distributed execution framework, Ray’s security boundary is outside of the Ray cluster,” Anyscale said in its advisory. “That is why we emphasize that you must prevent access to your Ray cluster from untrusted machines (e.g., the public internet). This is why the fifth CVE (the lack of authentication built into Ray) has not been addressed, and why it is not in our opinion a vulnerability, or even a bug.”
The Ray documentation clearly states that “Ray expects to run in a safe network environment and to act upon trusted code” and that it’s the responsibility of developers and platform providers to ensure those conditions for safe operation. However, as we’ve seen with other technologies in the past that lacked authentication by default, users don’t always follow best practices, and insecure deployments will make their way onto the internet sooner or later. Anyscale doesn’t want users to rely on an authentication control inside Ray instead of isolating the entire framework and its clusters with external controls, but it has nonetheless decided to work on adding an authentication mechanism in future versions.
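As a rough illustration of what an external control can look like, the sketch below checks whether a client address belongs to a trusted network before any traffic would be forwarded to a cluster. It is not part of Ray or of Anyscale's guidance: the function name `is_trusted_client` and the network ranges are hypothetical, and in a real deployment this job is normally done by a firewall, security group, or network policy rather than by application code.

```python
import ipaddress

# Hypothetical trusted ranges, chosen only for illustration.
# A real deployment would list its own internal networks here.
TRUSTED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_trusted_client(client_ip: str) -> bool:
    """Return True if client_ip falls inside one of the trusted networks."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in network for network in TRUSTED_NETWORKS)


if __name__ == "__main__":
    for ip in ("10.1.2.3", "8.8.8.8"):
        verdict = "allow" if is_trusted_client(ip) else "deny"
        print(f"{ip}: {verdict}")
```

The point of the sketch is only that the decision is made outside the framework: anything that passes the check is treated as trusted, which matches the security boundary Anyscale describes.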
Insecure-by-default configurations
Until then, however, many organizations are likely to continue to unwittingly expose such servers to the internet because, according to Oligo, many deployment guides and repositories for Ray, including some of the official ones, come with insecure deployment configurations. Misconfiguration is also made easier by the fact that, by default, the Ray dashboard and the Jobs API bind to 0.0.0.0, the wildcard address that means all available network interfaces on a system, so the services accept connections on every interface, including any that face the internet.
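To make the difference between bind addresses concrete, here is a minimal Python sketch that classifies a host string by how widely a service bound to it would listen. The function name `bind_scope` and its return labels are hypothetical and not part of Ray; the sketch uses only the standard library and does not inspect an actual Ray configuration.

```python
import ipaddress


def bind_scope(host: str) -> str:
    """Classify a bind address by how widely a service bound to it listens."""
    addr = ipaddress.ip_address(host)
    if addr.is_unspecified:
        # 0.0.0.0 (or :: for IPv6) is the wildcard: every interface on the host.
        return "all-interfaces"
    if addr.is_loopback:
        # 127.0.0.1 or ::1: reachable only from the same machine.
        return "loopback-only"
    # Any other address: only the interface that owns that address.
    return "single-interface"


if __name__ == "__main__":
    for host in ("0.0.0.0", "127.0.0.1", "10.0.0.5"):
        print(f"{host}: {bind_scope(host)}")
```

A service in the "all-interfaces" category is reachable from the internet whenever any interface on the host is, unless a firewall or similar control sits in front of it.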
“AI experts are NOT security experts—leaving them potentially dangerously unaware of the very real risks posed by AI frameworks,” the researchers said. “Without authorization for Ray’s Jobs API, the API can be exposed to remote code execution attacks when not following best practices.”