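This post walks through the principles and the engineering behind how Rex reproduces and analyzes a crash. The relevant code lives in the Crash class. The paper describes the reproduction approach as follows: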


Vulnerable States. Unlike AEG/Mayhem, but similar to AXGEN, we generate exploits by performing concolic execution on crashing program inputs using angr. We drive concolic execution forward, forcing it to follow the same path as a dynamic trace gathered by concretely executing the crashing input applied to the program. Concolic execution is stopped at the point where the program crashed, and we inspect the symbolic state to determine the cause of the crash and measure exploitability. By counting the number of symbolic bits in certain registers, we can triage a crash into a number of categories such as frame pointer overwrite, instruction pointer overwrite, or arbitrary write, among others.

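The key term is clearly "concolic execution", i.e. mixed concrete and symbolic execution. Concolic execution drives the program from its initial state to the crash state, and the vulnerability type is then decided from the number of symbolic bits in registers such as EIP.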
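The counting idea can be illustrated without angr. The following is a minimal sketch in plain Python, not Rex's implementation: a register is modelled as a list of bits, where each bit is either a concrete 0/1 or a string naming the input bit it came from. The names `Bit`, `make_register`, and `count_symbolic_bits` are hypothetical.

```python
from typing import List, Union

# A bit is either concrete (0 or 1) or symbolic (a string naming an input bit).
Bit = Union[int, str]


def make_register(concrete_value: int, width: int = 32) -> List[Bit]:
    """Build a fully concrete register, least significant bit first."""
    return [(concrete_value >> i) & 1 for i in range(width)]


def count_symbolic_bits(register: List[Bit]) -> int:
    """Return how many bits of the register depend on program input."""
    return sum(1 for bit in register if isinstance(bit, str))


# A saved return address whose low two bytes were overwritten by input bytes 20 and 21.
eip = make_register(0x08048000)
for i in range(16):
    eip[i] = f"input[{20 + i // 8}].bit{i % 8}"

print(count_symbolic_bits(eip))                        # 16 of the 32 bits are input-controlled
print(count_symbolic_bits(make_register(0x08048000)))  # 0: fully concrete
```

A count equal to the register width would mean full control of the register, while a smaller non-zero count corresponds to a partial overwrite.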

0x00 Concrete Execution

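This post does not cover the theory of concolic execution in detail. In its implementation, Rex first loads the binary and the PoC under QEMU and runs them concretely to obtain crash_state. Rex calls into the Tracer module for this.

As the Tracer documentation notes, Tracer was originally written to help angr perform concolic tracing, but as angr evolved that functionality was folded into angr itself. When Rex calls Tracer, it only uses the QEMURunner class to carry out the concrete execution. The relevant code in crash.py: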

tracer_args = {
    'ld_linux': os.path.join(bin_location, 'tests/i386/ld-linux.so.2'),
    'library_path': os.path.join(bin_location, 'tests/i386'),
}
r = tracer.QEMURunner(binary=binary, input=input_data, argv=argv,
                      trace_timeout=trace_timeout, **tracer_args)
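Stripped of the QEMU details, the contract of this step is "run the target on the PoC and observe that it crashes, and how". The sketch below is only a stand-in for that contract: it uses a plain subprocess instead of QEMU and records no basic-block trace, so it does not show what QEMURunner does internally. `TARGET` and `run_concretely` are hypothetical names, and the signal handling assumes a POSIX system.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in target: a child Python process that kills itself with
# SIGSEGV when it reads more than 8 bytes from stdin.
TARGET = """
import os, signal, sys
data = sys.stdin.buffer.read()
if len(data) > 8:
    os.kill(os.getpid(), signal.SIGSEGV)
"""


def run_concretely(input_data: bytes, timeout: float = 10.0):
    """Run the target on input_data and return (crashed, signal_name)."""
    proc = subprocess.run(
        [sys.executable, "-c", TARGET],
        input=input_data,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout,
    )
    # On POSIX, a return code of -N means "terminated by signal N".
    if proc.returncode < 0:
        return True, signal.Signals(-proc.returncode).name
    return False, None


print(run_concretely(b"ok"))       # benign input
print(run_concretely(b"A" * 64))   # crashing input
```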

0x01 Concolic Execution

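Once the crash state has been obtained, the result of the concrete execution is used as a guide: execution is restricted to the path that leads to the crash, and symbolic execution is performed along that path. angr is a general-purpose symbolic execution framework (it also offers an under-constrained mode in the spirit of the UCSE technique proposed in UC-KLEE), so the choice of constraints and initial state, the path exploration strategy, and the analysis techniques all have a significant effect on the final result.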
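The idea of letting a concrete run steer the symbolic one can be shown with a toy model. This is a minimal sketch under simplifying assumptions, not angr's tracer: the "program" is a fixed list of branch points, every branch is reached regardless of earlier outcomes, and constraints are plain Python predicates rather than solver expressions. `PROGRAM`, `concrete_trace`, `follow_trace`, and `satisfies` are hypothetical names.

```python
from typing import Callable, List, Tuple

# A branch point is (label, predicate over the input bytes).
Branch = Tuple[str, Callable[[bytes], bool]]

# Hypothetical toy program: three branch points reached in order.
PROGRAM: List[Branch] = [
    ("len(inp) >= 2", lambda inp: len(inp) >= 2),
    ("inp[0] == 0x41", lambda inp: len(inp) > 0 and inp[0] == 0x41),
    ("inp[1] > 0x7f", lambda inp: len(inp) > 1 and inp[1] > 0x7F),
]


def concrete_trace(inp: bytes) -> List[bool]:
    """Concrete execution: record which way every branch went."""
    return [pred(inp) for _, pred in PROGRAM]


def follow_trace(trace: List[bool]) -> List[Branch]:
    """Guided pass: at each branch take the recorded direction and collect
    the matching path constraint instead of exploring both sides."""
    constraints: List[Branch] = []
    for (label, pred), taken in zip(PROGRAM, trace):
        if taken:
            constraints.append((label, pred))
        else:
            constraints.append((f"not ({label})", lambda inp, p=pred: not p(inp)))
    return constraints


def satisfies(inp: bytes, constraints: List[Branch]) -> bool:
    """Check whether an input would follow the same path."""
    return all(pred(inp) for _, pred in constraints)


poc = b"A\xff"
path = follow_trace(concrete_trace(poc))
print([label for label, _ in path])
print(satisfies(b"A\x80", path))  # True: a different input on the same path
print(satisfies(b"B\xff", path))  # False: diverges at the second branch
```

Because the guided pass never forks, it ends with exactly one state whose constraints describe every input that reaches the same crash site along the same path.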

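1. simulation_manager()

The simulation manager is one of angr's core concepts. Rex's setup refers to r.crash_mode, which is a boolean (True or False).

The program's initial state is configured through the full_init_state() method.

The options passed to full_init_state() are defined in ./angr/sim_options.py and have the following meanings: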

  • mode = 'tracing'
  • add_options:

    Option name                              Description
    so.MEMORY_SYMBOLIC_BYTES_MAP             Maintain a mapping from each symbolic variable to the memory address it actually corresponds to, at the paged-memory level
    so.TRACK_ACTION_HISTORY                  Track the history of actions through a path (multiple states); this option affects things at the angr level
    so.CONCRETIZE_SYMBOLIC_WRITE_SIZES       Concretize the sizes of symbolic writes to memory
    so.CONCRETIZE_SYMBOLIC_FILE_READ_SIZES   Concretize the sizes of file reads
    so.TRACK_MEMORY_ACTIONS                  Keep a SimAction for each memory read and write

  • remove_options:
    Because 'tracing' mode ships with a preset group of options, tuning the strategy requires remove_options as well as add_options.

    Option name                      Description
    so.TRACK_REGISTER_ACTIONS        Keep a SimAction for each register read and write
    so.TRACK_TMP_ACTIONS             Keep a SimAction for each temporary variable read and write
    so.TRACK_JMP_ACTIONS             Keep a SimAction for each jump or branch
    so.ACTION_DEPS                   Track dependencies in SimActions
    so.TRACK_CONSTRAINT_ACTIONS      Keep a SimAction for each constraint added
    so.LAZY_SOLVES                   Don't check satisfiability until absolutely necessary
    so.SIMPLIFY_MEMORY_WRITES        Run values stored to memory through z3's simplification
    so.ALL_FILES_EXIST               Attempting to open an unknown file will result in creating it with a symbolic length
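How a mode preset combines with add_options and remove_options can be pictured as plain set arithmetic. The sketch below assumes additions are applied before removals and uses option names as strings. The contents of `TOY_MODE_PRESETS` are invented for illustration and do not reproduce angr's real 'tracing' preset; `effective_options` is a hypothetical helper.

```python
# Hypothetical stand-in for a mode preset; the real 'tracing' preset lives in
# angr's sim_options.py and contains different and many more options.
TOY_MODE_PRESETS = {
    "tracing": {
        "TRACK_REGISTER_ACTIONS",
        "LAZY_SOLVES",
        "SIMPLIFY_MEMORY_WRITES",
        "SOME_OTHER_OPTION",
    },
}


def effective_options(mode, add_options=(), remove_options=()):
    """Start from the mode's preset, then add, then remove."""
    options = set(TOY_MODE_PRESETS[mode])
    options |= set(add_options)
    options -= set(remove_options)
    return options


opts = effective_options(
    "tracing",
    add_options={"TRACK_MEMORY_ACTIONS", "TRACK_ACTION_HISTORY"},
    remove_options={"TRACK_REGISTER_ACTIONS", "LAZY_SOLVES", "SIMPLIFY_MEMORY_WRITES"},
)
print(sorted(opts))
# ['SOME_OTHER_OPTION', 'TRACK_ACTION_HISTORY', 'TRACK_MEMORY_ACTIONS']
```

Since the preset already switches some options on, the only way to turn one of them off is to name it in the removal set, which is why both lists appear in Rex's call.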

The constraints are likewise set up through full_init_state().

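2. state_register_plugin()

The program state is another core angr concept. States are implemented with a plugin architecture, so the plugins best suited to a given analysis task can be selected. Rex registers the 'posix' and 'preconstrainer' plugins by default. The plugin sources are located under ./angr/state_plugins/.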

  • SimSystemPosix()
    Data storage and interaction mechanisms for states with an environment conforming to posix.
    Available as state.posix.

  • SimStatePreconstrainer()
    This state plugin manages the concept of preconstraining - adding constraints which you would like to remove later.
    :param constrained_addrs : SimActions for memory operations whose addresses should be constrained during crash analysis
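The notion of "constraints which you would like to remove later" can be made concrete with a toy state. This is a minimal sketch, not angr's SimStatePreconstrainer: the input is two symbolic bytes, constraints are Python predicates, and "solving" is brute-force enumeration. `ToyState` and its methods are hypothetical.

```python
from itertools import product


class ToyState:
    """Hypothetical state over a two-byte symbolic input with a brute-force solver."""

    def __init__(self):
        self.path_constraints = []  # learned while following the trace
        self.preconstraints = []    # pin the input to the PoC; dropped later

    def preconstrain(self, index: int, value: int) -> None:
        """Temporarily force input byte `index` to equal `value`."""
        self.preconstraints.append(lambda inp, i=index, v=value: inp[i] == v)

    def add_path_constraint(self, pred) -> None:
        self.path_constraints.append(pred)

    def remove_preconstraints(self) -> None:
        self.preconstraints.clear()

    def solutions(self):
        """Enumerate every input that satisfies all active constraints."""
        active = self.path_constraints + self.preconstraints
        return [inp for inp in product(range(256), repeat=2)
                if all(c(inp) for c in active)]


state = ToyState()
for i, byte in enumerate(b"A\xff"):                     # the crashing PoC
    state.preconstrain(i, byte)
state.add_path_constraint(lambda inp: inp[0] == 0x41)   # branch taken on the trace

print(len(state.solutions()))   # 1: only the PoC itself
state.remove_preconstraints()
print(len(state.solutions()))   # 256: the second byte is free to choose
```

While the preconstraints are in place the state has a single solution, which keeps tracing cheap; once they are dropped, only the genuine path constraints remain and the freedom left to an attacker becomes visible.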

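3. exploration_techniques()

The path exploration strategy plays a crucial role in symbolic execution. Because Rex reproduces the crash through concolic execution, it uses two exploration techniques: 'Tracer' and 'Oppologist'.

Both are implemented under ./angr/exploration_techniques/: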

  • Tracer
    Forces symbolic execution to follow the path recorded by the concrete run of the crashing input.

  • Oppologist
    The Oppologist is an exploration technique that forces uncooperative code through QEMU.
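The general shape of "force uncooperative code through a concrete emulator" is a try-symbolic-then-fall-back loop. The sketch below shows only that shape, as an assumption about the idea. How angr's Oppologist decides what to hand off, and how it concretizes state before doing so, is not modelled. All names here are hypothetical.

```python
class UnsupportedOperation(Exception):
    """Raised by the toy symbolic stepper for operations it cannot model."""


# Operations the toy symbolic engine knows how to handle.
SUPPORTED = {"add", "mov", "cmp"}


def symbolic_step(op: str) -> str:
    if op not in SUPPORTED:
        raise UnsupportedOperation(op)
    return f"symbolic:{op}"


def concrete_step(op: str) -> str:
    # Stand-in for handing the operation to a concrete emulator.
    return f"concrete:{op}"


def step_with_fallback(op: str) -> str:
    """Try the symbolic engine first; fall back to concrete execution."""
    try:
        return symbolic_step(op)
    except UnsupportedOperation:
        return concrete_step(op)


print([step_with_fallback(op) for op in ["mov", "fsqrt", "add"]])
# ['symbolic:mov', 'concrete:fsqrt', 'symbolic:add']
```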

0x03 Triage

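_triage_crash() classifies the crash into vulnerability types based on how many symbolic bits the instruction pointer (eip) and frame pointer (ebp) contain, and on the memory operation that was being performed when the crash occurred. The version shown here distinguishes eight types.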

def _triage_crash(self):
    ip = self.state.regs.ip
    bp = self.state.regs.bp

    # any arbitrary receives or transmits
    # TODO: receives
    zp = self.state.get_plugin('zen_plugin') if self.os == 'cgc' else None
    if zp is not None and len(zp.controlled_transmits):
        l.debug("detected arbitrary transmit vulnerability")
        self.crash_types.append(Vulnerability.ARBITRARY_TRANSMIT)

    # we assume a symbolic eip is always exploitable
    if self.state.solver.symbolic(ip):
        # how much control of ip do we have?
        if self._symbolic_control(ip) >= self.state.arch.bits:
            l.info("detected ip overwrite vulnerability")
            self.crash_types.append(Vulnerability.IP_OVERWRITE)
        else:
            l.info("detected partial ip overwrite vulnerability")
            self.crash_types.append(Vulnerability.PARTIAL_IP_OVERWRITE)

        return

    if self.state.solver.symbolic(bp):
        # how much control of bp do we have
        if self._symbolic_control(bp) >= self.state.arch.bits:
            l.info("detected bp overwrite vulnerability")
            self.crash_types.append(Vulnerability.BP_OVERWRITE)
        else:
            l.info("detected partial bp overwrite vulnerability")
            self.crash_types.append(Vulnerability.PARTIAL_BP_OVERWRITE)

        return

    # if nothing obvious is symbolic let's look at actions
    # grab the all actions in the last basic block
    symbolic_actions = [ ]
    if self._t is not None and self._t.last_state is not None:
        recent_actions = reversed(self._t.last_state.history.recent_actions)
        state = self._t.last_state
        # TODO: this is a dead assignment! what was this supposed to be?
    else:
        recent_actions = reversed(self.state.history.actions)
        state = self.state
    for a in recent_actions:
        if a.type == 'mem':
            if self.state.solver.symbolic(a.addr):
                symbolic_actions.append(a)

    # TODO: pick the crashing action based off the crashing instruction address,
    # crash fixup attempts will break on this
    #import ipdb; ipdb.set_trace()
    for sym_action in symbolic_actions:
        if sym_action.action == "write":
            if self.state.solver.symbolic(sym_action.data):
                l.info("detected write-what-where vulnerability")
                self.crash_types.append(Vulnerability.WRITE_WHAT_WHERE)
            else:
                l.info("detected write-x-where vulnerability")
                self.crash_types.append(Vulnerability.WRITE_X_WHERE)
            self.violating_action = sym_action
            break

        if sym_action.action == "read":
            # special vulnerability type, if this is detected we can explore the crash further
            l.info("detected arbitrary-read vulnerability")
            self.crash_types.append(Vulnerability.ARBITRARY_READ)

            self.violating_action = sym_action
            break

    return
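The priority order in this function (instruction pointer first, then frame pointer, then memory accesses through symbolic addresses) can be exercised on plain data. The following is a simplified sketch, not a port of the Rex code: it takes the symbolic-bit counts as integers, leaves out the CGC-only arbitrary-transmit check, and returns a single label. `MemAction`, `triage`, and the lowercase string labels are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MemAction:
    action: str                  # "read" or "write"
    addr_symbolic: bool          # is the accessed address input-controlled?
    data_symbolic: bool = False  # for writes: is the written value input-controlled?


def triage(ip_bits: int, bp_bits: int, recent_actions: List[MemAction],
           arch_bits: int = 32) -> Optional[str]:
    """Classify a crash from symbolic-bit counts and recent memory actions."""
    # 1. A symbolic instruction pointer takes priority.
    if ip_bits > 0:
        return "ip_overwrite" if ip_bits >= arch_bits else "partial_ip_overwrite"
    # 2. Then the frame pointer.
    if bp_bits > 0:
        return "bp_overwrite" if bp_bits >= arch_bits else "partial_bp_overwrite"
    # 3. Otherwise inspect accesses through symbolic addresses, newest first.
    for act in reversed(recent_actions):
        if not act.addr_symbolic:
            continue
        if act.action == "write":
            return "write_what_where" if act.data_symbolic else "write_x_where"
        if act.action == "read":
            return "arbitrary_read"
    return None


print(triage(32, 0, []))                                  # ip_overwrite
print(triage(16, 32, []))                                 # partial_ip_overwrite
print(triage(0, 0, [MemAction("write", True, True)]))     # write_what_where
print(triage(0, 0, [MemAction("read", False)]))           # None
```

The second call shows the effect of the early return: even with a fully controlled frame pointer, a partially symbolic instruction pointer decides the classification.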

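Rex defines the vulnerability types as attributes of Vulnerability. The ones referenced in _triage_crash() above are ARBITRARY_TRANSMIT, IP_OVERWRITE, PARTIAL_IP_OVERWRITE, BP_OVERWRITE, PARTIAL_BP_OVERWRITE, WRITE_WHAT_WHERE, WRITE_X_WHERE, and ARBITRARY_READ.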

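0x04 Summary

This has been a brief look at how Rex reproduces a crash: in principle it relies on concolic execution, and in practice mainly on QEMU and angr. It is only a surface-level analysis; later posts will go deeper.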