Java: improve qhelp for java/unsafe-deserialization#21807
Java: improve qhelp for java/unsafe-deserialization#21807owen-mc wants to merge 8 commits intogithub:mainfrom
java/unsafe-deserialization#21807Conversation
|
QHelp previews: csharp/ql/src/Security Features/CWE-502/UnsafeDeserialization.qhelpUnsafe deserializerDeserializing an object from untrusted input may result in security problems, such as denial of service or remote code execution. Note that a deserialization method is only dangerous if it can instantiate arbitrary classes. Serialization frameworks that use a schema to instantiate only expected, predefined types are generally not tracked by this query. Such frameworks are generally safe with respect to arbitrary-class-instantiation and gadget-chain attacks when the schema is trusted and does not permit user-controlled type resolution. However, care must be taken to ensure the schema strictly limits the allowed types. Permitting common standard library classes can still leave the application vulnerable to gadget-chain attacks. RecommendationAvoid using an unsafe deserialization framework. ExampleIn this example, a string is deserialized using a using System.Web.Script.Serialization;
class Bad
{
public static object Deserialize(string s)
{
JavaScriptSerializer sr = new JavaScriptSerializer(new SimpleTypeResolver());
// BAD
return sr.DeserializeObject(s);
}
}To fix this specific vulnerability, we avoid using a type resolver. In other cases, it may be necessary to use a different deserialization framework. using System.Web.Script.Serialization;
class Good
{
public static object Deserialize(string s)
{
// GOOD
JavaScriptSerializer sr = new JavaScriptSerializer();
return sr.DeserializeObject(s);
}
}References
csharp/ql/src/Security Features/CWE-502/UnsafeDeserializationUntrustedInput.qhelpDeserialization of untrusted dataDeserializing an object from untrusted input may result in security problems, such as denial of service or remote code execution. Note that a deserialization method is only dangerous if it can instantiate arbitrary classes. Serialization frameworks that use a schema to instantiate only expected, predefined types are generally not tracked by this query. Such frameworks are generally safe with respect to arbitrary-class-instantiation and gadget-chain attacks when the schema is trusted and does not permit user-controlled type resolution. However, care must be taken to ensure the schema strictly limits the allowed types. Permitting common standard library classes can still leave the application vulnerable to gadget-chain attacks. RecommendationAvoid deserializing objects from an untrusted source, and if not possible, make sure to use a safe deserialization framework. ExampleIn this example, text from an HTML text box is deserialized using a using System.Web.UI.WebControls;
using System.Web.Script.Serialization;
class Bad
{
public static object Deserialize(TextBox textBox)
{
JavaScriptSerializer sr = new JavaScriptSerializer(new SimpleTypeResolver());
// BAD
return sr.DeserializeObject(textBox.Text);
}
}To fix this specific vulnerability, we avoid using a type resolver. In other cases, it may be necessary to use a different deserialization framework. using System.Web.UI.WebControls;
using System.Web.Script.Serialization;
class Good
{
public static object Deserialize(TextBox textBox)
{
JavaScriptSerializer sr = new JavaScriptSerializer();
// GOOD: no unsafe type resolver
return sr.DeserializeObject(textBox.Text);
}
}In the following example potentially untrusted stream and type is deserialized using a using System.Runtime.Serialization.Json;
using System.IO;
using System;
class BadDataContractJsonSerializer
{
public static object Deserialize(string type, Stream s)
{
// BAD: stream and type are potentially untrusted
var ds = new DataContractJsonSerializer(Type.GetType(type));
return ds.ReadObject(s);
}
}To fix this specific vulnerability, we are using hardcoded Plain Old CLR Object (POCO) type. In other cases, it may be necessary to use a different deserialization framework. using System.Runtime.Serialization.Json;
using System.IO;
using System;
class Poco
{
public int Count;
public string Comment;
}
class GoodDataContractJsonSerializer
{
public static Poco Deserialize(Stream s)
{
// GOOD: while stream is potentially untrusted, the instantiated type is hardcoded
var ds = new DataContractJsonSerializer(typeof(Poco));
return (Poco)ds.ReadObject(s);
}
}References
java/ql/src/Security/CWE/CWE-502/UnsafeDeserialization.qhelpDeserialization of user-controlled dataDeserializing untrusted data using any deserialization framework that allows the construction of arbitrary serializable objects is easily exploitable and in many cases allows an attacker to execute arbitrary code. Even before a deserialized object is returned to the caller of a deserialization method a lot of code may have been executed, including static initializers, constructors, and finalizers. Automatic deserialization of fields means that an attacker may craft a nested combination of objects on which the executed initialization code may have unforeseen effects, such as the execution of arbitrary code. There are many different serialization frameworks. This query currently supports Kryo, XmlDecoder, XStream, SnakeYaml, JYaml, JsonIO, YAMLBeans, HessianBurlap, Castor, Burlap, Jackson, Jabsorb, Jodd JSON, Flexjson, Gson, JMS, and Java IO serialization through Note that a deserialization method is only dangerous if it can instantiate arbitrary classes. Serialization frameworks that use a schema to instantiate only expected, predefined types are generally not tracked by this query. For example, Apache Avro's deserialization methods follow a schema and are therefore generally safe with respect to arbitrary-class-instantiation and gadget-chain attacks when the schema is trusted and does not permit user-controlled type resolution. However, care must be taken to ensure the schema strictly limits the allowed types. Permitting common standard library classes can still leave the application vulnerable to gadget-chain attacks. RecommendationAvoid deserialization of untrusted data if at all possible. If the architecture permits it then use other formats instead of serialized objects, for example JSON or XML. However, these formats should not be deserialized into complex objects because this provides further opportunities for attack. For example, XML-based deserialization attacks are possible through libraries such as XStream and XmlDecoder. Alternatively, a tightly controlled whitelist can limit the vulnerability of code, but be aware of the existence of so-called Bypass Gadgets, which can circumvent such protection measures. Recommendations specific to particular frameworks supported by this query: FastJson -
FasterXML -
Kryo -
ObjectInputStream -
SnakeYAML -
XML Decoder -
ObjectMesssage -
ExampleThe following example calls public MyObject {
public int field;
MyObject(int field) {
this.field = field;
}
}
public MyObject deserialize(Socket sock) {
try(ObjectInputStream in = new ObjectInputStream(sock.getInputStream())) {
return (MyObject)in.readObject(); // BAD: in is from untrusted source
}
}Rewriting the communication protocol to only rely on reading primitive types from the input stream removes the vulnerability. public MyObject deserialize(Socket sock) {
try(DataInputStream in = new DataInputStream(sock.getInputStream())) {
return new MyObject(in.readInt()); // GOOD: read only an int
}
}References
python/ql/src/Security/CWE-502/UnsafeDeserialization.qhelpDeserialization of user-controlled dataDeserializing untrusted data using any deserialization framework that allows the construction of arbitrary serializable objects is easily exploitable and in many cases allows an attacker to execute arbitrary code. Even before a deserialized object is returned to the caller of a deserialization method a lot of code may have been executed, including static initializers, constructors, and finalizers. Automatic deserialization of fields means that an attacker may craft a nested combination of objects on which the executed initialization code may have unforeseen effects, such as the execution of arbitrary code. There are many different serialization frameworks. This query currently supports Pickle, Marshal and Yaml. Note that a deserialization method is only dangerous if it can instantiate arbitrary classes. Serialization frameworks that use a schema to instantiate only expected, predefined types are generally not tracked by this query. Such frameworks are generally safe with respect to arbitrary-class-instantiation and gadget-chain attacks when the schema is trusted and does not permit user-controlled type resolution. However, care must be taken to ensure the schema strictly limits the allowed types. Permitting common standard library classes can still leave the application vulnerable to gadget-chain attacks. RecommendationAvoid deserialization of untrusted data if at all possible. If the architecture permits it then use other formats instead of serialized objects, for example JSON. If you need to use YAML, use the ExampleThe following example calls from django.conf.urls import url
import pickle
def unsafe(pickled):
return pickle.loads(pickled)
urlpatterns = [
url(r'^(?P<object>.*)$', unsafe)
]Changing the code to use from django.conf.urls import url
import json
def safe(pickled):
return json.loads(pickled)
urlpatterns = [
url(r'^(?P<object>.*)$', safe)
]References
ruby/ql/src/queries/security/cwe-502/UnsafeDeserialization.qhelpDeserialization of user-controlled dataDeserializing untrusted data using any method that allows the construction of arbitrary objects is easily exploitable and, in many cases, allows an attacker to execute arbitrary code. Note that a deserialization method is only dangerous if it can instantiate arbitrary classes or objects. Serialization frameworks that use a schema to instantiate only expected, predefined types are generally not tracked by this query. Such frameworks are generally safe with respect to arbitrary-class-instantiation and gadget-chain attacks when the schema is trusted and does not permit user-controlled type resolution. However, care must be taken to ensure the schema strictly limits the allowed types. Permitting common standard library classes can still leave the application vulnerable to gadget-chain attacks. RecommendationAvoid deserialization of untrusted data if possible. If the architecture permits it, use serialization formats that cannot represent arbitrary objects. For libraries that support it, such as the Ruby standard library's If deserializing an untrusted YAML document using the If deserializing an untrusted XML document using the To safely deserialize Property List files using the ExampleThe following example calls the require 'json'
require 'yaml'
require 'oj'
class UserController < ActionController::Base
def marshal_example
data = Base64.decode64 params[:data]
object = Marshal.load data
# ...
end
def json_example
object = JSON.load params[:json]
# ...
end
def yaml_example
object = YAML.load params[:yaml]
# ...
end
def oj_example
object = Oj.load params[:json]
# ...
end
def ox_example
object = Ox.parse_obj params[:xml]
# ...
end
endUsing require 'json'
class UserController < ActionController::Base
def safe_json_example
object = JSON.parse params[:json]
# ...
end
def safe_yaml_example
object = YAML.safe_load params[:yaml]
# ...
end
def safe_oj_example
object = Oj.load params[:yaml], { mode: :strict }
# or
object = Oj.safe_load params[:yaml]
# ...
end
endReferences
|
There was a problem hiding this comment.
Pull request overview
Updates the qhelp documentation for the java/unsafe-deserialization query to better explain when deserialization is dangerous, aiming to clarify that schema-driven deserialization (that doesn’t allow arbitrary class instantiation) is generally not in scope.
Changes:
- Adds explanatory text distinguishing “arbitrary class instantiation” deserialization from schema-based deserialization.
- Uses Apache Avro as an example of schema-following deserialization.
- Minor whitespace/formatting cleanups in the qhelp text.
Show a summary per file
| File | Description |
|---|---|
| java/ql/src/Security/CWE/CWE-502/UnsafeDeserialization.qhelp | Adds guidance about schema-based deserialization being generally safer / out of scope, plus minor formatting tweaks. |
Copilot's findings
- Files reviewed: 1/1 changed files
- Comments generated: 1
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
|
Does it also make sense to add the same note to the qhelp for the same CWE in other languages too? |
|
Good idea @BazookaMusic . I've done that for the queries where it makes sense. |
Clarify that deserialization that follows a schema is safe.