Yaml python libraries is also capable to serialize python objects and not just raw data:
print(yaml.dump(str("lol")))
lol
...
print(yaml.dump(tuple("lol")))
!!python/tuple
- l
- o
- l
print(yaml.dump(range(1,10)))
!!python/object/apply:builtins.range
- 1
- 10
- 1
Check how the tuple isn’t a raw type of data and therefore it was serialized. And the same happened with the range (taken from the builtins).
safe_load() or safe_load_all() uses SafeLoader and don’t support class object deserialization. Class object deserialization example:
import yamlfrom yaml import UnsafeLoader, FullLoader, Loaderdata =b'!!python/object/apply:builtins.range [1, 10, 1]'print(yaml.load(data, Loader=UnsafeLoader))#range(1, 10)print(yaml.load(data, Loader=Loader))#range(1, 10)print(yaml.load_all(data))#<generator object load_all at 0x7fc4c6d8f040>print(yaml.load_all(data, Loader=Loader))#<generator object load_all at 0x7fc4c6d8f040>print(yaml.load_all(data, Loader=UnsafeLoader))#<generator object load_all at 0x7fc4c6d8f040>print(yaml.load_all(data, Loader=FullLoader))#<generator object load_all at 0x7fc4c6d8f040>print(yaml.unsafe_load(data))#range(1, 10)print(yaml.full_load_all(data))#<generator object load_all at 0x7fc4c6d8f040>print(yaml.unsafe_load_all(data))#<generator object load_all at 0x7fc4c6d8f040>#The other ways to load data will through an error as they won't even attempt to#deserialize the python object
The previous code used unsafe_load to load the serialized python class. This is because in version >= 5.1, it doesn’t allow to deserialize any serialized python class or class attribute, with Loader not specified in load() or Loader=SafeLoader.
Note that in recent versions you cannot no longer call .load()without a Loader and the FullLoader is no longer vulnerable to this attack.
RCE
Custom payloads can be created using Python YAML modules such as PyYAML or ruamel.yaml. These payloads can exploit vulnerabilities in systems that deserialize untrusted input without proper sanitization.