Analysis of CVE-2024-27284, a Binary Vulnerability in Rust - 先知社区

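Rust is billed as the language that avoids memory-safety bugs, and it has become popular with developers in recent years. A UAF (use-after-free) vulnerability that can still occur in Rust is therefore worth studying. This article analyzes a UAF vulnerability found in a commonly used open-source library.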

Vulnerability Background

The vulnerability comes from an open-source library called cassandra-rs.

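Cassandra is an open-source distributed database management system developed and maintained by the Apache Software Foundation. It is designed as a highly scalable, fault-tolerant distributed storage system for applications that need high throughput and low latency over large data sets. Cassandra uses a query language called CQL (Cassandra Query Language), which resembles SQL but has Cassandra-specific extensions and features. CQL offers a flexible data model and query options to suit a wide range of applications. (Source: Apache)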

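cassandra-rs is written in Rust, and in theory Rust code rarely has this kind of problem. In practice, though, code that works with low-level resources has to use the unsafe keyword; cassandra-rs does so to call into the underlying C++ driver. Inside unsafe code the compiler can no longer verify that raw pointers are valid, so the guarantees depend on the programmer upholding them by hand, and that is where vulnerabilities creep in.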
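To make the role of unsafe concrete, here is a minimal sketch. It is not taken from cassandra-rs, and the function name is made up for illustration. Creating a raw pointer is allowed in safe Rust, but dereferencing it requires an unsafe block, and at that point it is the programmer, not the compiler, who vouches that the pointer is still valid.

```rust
/// Reads a value through a raw pointer.
/// `read_through_raw` is a hypothetical helper used only for illustration.
fn read_through_raw(value: &u32) -> u32 {
    // Turning a reference into a raw pointer is safe: nothing is dereferenced yet.
    let ptr: *const u32 = value;

    // Dereferencing needs `unsafe`. The compiler does not check that `ptr`
    // is still valid; it is valid here only because `value` is still borrowed.
    unsafe { *ptr }
}
```

If the pointer came from a C or C++ library instead, nothing in the type system would say how long it stays valid, which is the situation the rest of this article deals with.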

Patch Analysis

The security advisory describes the vulnerability as follows:

Code that attempts to use an item (e.g., a row) returned by an iterator after the iterator has advanced to the next item will be accessing freed memory and experience undefined behaviour. Code that uses the item and then advances the iterator is unaffected. This problem has always existed.

This is a use-after-free bug, so it's rated high severity. If your code uses a pre-3.0.0 version of cassandra-rs, and uses an item returned by a cassandra-rs iterator after calling next() on that iterator, then it is vulnerable. However, such code will almost always fail immediately - so we believe it is unlikely that any code using this pattern would have reached production. For peace of mind, we recommend you upgrade anyway.

From this description we learn several characteristics of the vulnerability:

The vulnerability type is UAF
It involves iterators
Triggering it involves next()

We can also locate the patch. One passage in it is particularly important:

## Lending iterator API (version 3.0)

Version 3.0 fixes a soundness issue with the previous API. The iterators in the
underlying Cassandra driver invalidate the current item when `next()` is called,
and this was not reflected in the Rust binding prior to version 3.

To deal with this, the various iterators (`ResultIterator`, `RowIterator`,
`MapIterator`, `SetIterator`, `FieldIterator`, `UserTypeIterator`,
`KeyspaceIterator`, `FunctionIterator`, `AggregateIterator`, `TableIterator`,
`ColumnIterator`) no longer implement `std::iter::Iterator`. Instead, since this
is a [lending
iterator,](https://blog.rust-lang.org/2022/11/03/Rust-1.65.0.html#generic-associated-types-gats)
these types all implement a new `LendingIterator` trait. We define this
ourselves because there is currently no widely-used crate that implements it.

Looking at the fix, the changes fall roughly into two categories:

The first adds lifetime declarations:

/// A field's metadata
-   pub struct Field {
+   //
+   // Borrowed from wherever the value is borrowed from.
+   pub struct Field<'a> {
        /// The field's name
        pub name: String,
        /// The field's value
-       pub value: Value,
+       pub value: Value<'a>,
}

The second adds declarations involving both lifetimes and phantom data:

#[derive(Debug)]
-   pub struct RowIterator(pub *mut _CassIterator);
+   pub struct RowIterator<'a>(*mut _CassIterator, PhantomData<&'a _Row>);

/// skip code

-   impl<'a> Iterator for &'a RowIterator {
-       type Item = Value;
+   impl LendingIterator for RowIterator<'_> {
+       type Item<'a> = Value<'a> where Self: 'a;

-       fn next(&mut self) -> Option<<Self as Iterator>::Item> {
+       fn next(&mut self) -> Option<<Self as LendingIterator>::Item<'_>> {
            unsafe {
                match cass_iterator_next(self.0) {
                    cass_false => None,
                    cass_true => Some(Value::build(cass_iterator_get_column(self.0))),
                }
            }
        }
}

As shown, the RowIterator type gains a lifetime parameter. LendingIterator appears to be a newly introduced concept; it is the one described in the README section quoted above ("Lending iterator API (version 3.0)"), which the author added as part of this patch.

In the fix commit, the author also notes:

Make ResultIterator a LendingIterator

In other words, turning these iterators, ResultIterator in particular, into LendingIterators resolves the issue. To summarize, the fix roughly consists of:

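Changing the iterators from Iterator to LendingIterator
Adding lifetime parameters to the data types, and adding phantom members to some structs so that they carry a lifetime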

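The whole fix relies on Rust language features. To understand what it actually does, we first need some background on lifetimes in Rust. Readers already familiar with them can skip ahead to the Vulnerability Analysis section.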

Rust Feature Primer

Phantom data: `PhantomData`

A struct itself can carry a lifetime parameter, for example:

struct Tmp<'a>{
    index: &'a u32
}

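Although index is just a reference, this declaration tells the compiler that a Tmp value must not outlive the data that index refers to. It does not artificially extend any lifetime, however. For example: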

let test1: u32 = 2;
{
    // tmp borrows test1 only inside this block
    let tmp = Tmp { index: &test1 };
}

println!("{:?}", test1);

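Even though the lifetime annotation ties tmp to test1, that does not mean test1 becomes unusable once the tmp struct is destroyed, as above. This is the common case. In some situations, however, it is not a member of the struct but the struct as a whole that is logically tied to another object. This is less common, but it does happen. Consider the following code: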

#[derive(Debug)]
pub struct Test1 {
    n1: u32
}
impl Test1 {

    pub fn new() -> Test1 {
        Test1 {n1:1}
    }

    pub fn set_n(&mut self, n:u32) {
        self.n1 = n;
    }
    pub fn get_test2(&self) -> Test2{
        Test2 {n1:2}
    }

}

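Here a Test2 object is produced by a Test1 object. This pattern is common in types that wrap unsafe data, for example obtaining a connection from a session object, or obtaining items from an iterator. Rust, however, does not let a struct declare a lifetime parameter that none of its fields use (the compiler rejects it with error E0392), because a struct's lifetime has to be attached to some member. In the pattern above, the lifetime is tied to program logic rather than to a field. To handle this, Rust provides a type called PhantomData: it occupies no space in the struct, yet it can carry a lifetime. For example: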

pub struct Test2<'a> {
    n1: u32,
    _marker: PhantomData<&'a Test1>,
}

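Test2<'a> can now be thought of as holding a &'a Test1, and it is covariant in 'a. Covariance is a fairly involved topic, but for this example a simpler reading suffices: however long the Test2 value would otherwise live, its lifetime is shrunk to fit within that of the Test1 it borrows from. The method on Test1 then needs to be changed to: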

pub fn get_test2<'a>(&'a self) -> Test2<'a>{
        Test2 {n1:2, _marker:PhantomData}
    }

This states the lifetime relationship explicitly. The following code illustrates the effect:

fn main()
{
    let test3;
    println!("start test3");
    {
        let test1 = Test1::new();
        let test2 = test1.get_test2();
        test3 = test2;
    }
    println!("test3 is {:?}", test3);
}

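Here test3 ends up holding the Test2 object and lives longer than test1. Before the phantom data was declared, the two structs were unrelated and this code compiled without complaint. With the phantom data in place, the Test2 object (that is, test3) may no longer outlive test1, so the compiler reports an error: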

error[E0597]: `test1` does not live long enough
   --> src/main.rs:213:21
    |
212 |         let test1 = Test1::new();
    |             ----- binding `test1` declared here
213 |         let test2 = test1.get_test2();
    |                     ^^^^^^^^^^^^^^^^^ borrowed value does not live long enough
214 |         test3 = test2;
215 |     }
    |     - `test1` dropped here while still borrowed
...
219 | }
    | - borrow might be used here, when `test3` is dropped and runs the `Drop` code for type `Test2`
    |
    = note: values in a scope are dropped in the opposite order they are defined
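For contrast, here is a minimal sketch of the accepted usage, built from the Test1 and Test2 definitions above. The demo function and the size comparison are additions for illustration; they show that the borrow is fine as long as Test2 stays inside the scope of Test1, and that the PhantomData member adds no bytes to the struct.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

#[derive(Debug)]
pub struct Test1 {
    n1: u32,
}

#[derive(Debug)]
pub struct Test2<'a> {
    n1: u32,
    // Zero-sized marker: makes Test2 behave as if it held a `&'a Test1`.
    _marker: PhantomData<&'a Test1>,
}

impl Test1 {
    pub fn new() -> Test1 {
        Test1 { n1: 1 }
    }

    // The returned Test2 borrows from `self` for as long as it lives.
    pub fn get_test2(&self) -> Test2<'_> {
        Test2 { n1: self.n1 + 1, _marker: PhantomData }
    }
}

/// Hypothetical helper: returns the value stored in Test2 together with
/// the sizes of Test2 and of a bare u32.
pub fn demo() -> (u32, usize, usize) {
    let test1 = Test1::new();
    // test2 is used and dropped while test1 is still alive, so this compiles.
    let test2 = test1.get_test2();
    (test2.n1, size_of::<Test2<'static>>(), size_of::<u32>())
}
```

Moving test2 out of the scope of test1, as in the failing example, is what the marker forbids.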

Iterators: iter

Like many languages, Rust has the concept of an iterator. For example, over the familiar vector:

let mut test_vec = vec![1,2,3,4,5,6];
    let it_vec = test_vec.iter();

    for val in it_vec {
        println!("Got: {}", val);
    }

Here it_vec, obtained from test_vec, is simply an iterator. The iterator interface looks like this:

pub trait Iterator {
    type Item;

    fn next(&mut self) -> Option<Self::Item>;

    // methods with default implementations are omitted here
}

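The type keyword here declares what Rust calls an associated type. It usually appears in traits and means that each implementation of the trait must specify a concrete type. It is similar in spirit to generics. For example, the interface can be implemented as follows: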

impl Iterator for Counter {
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        // --snip--

Here we implement the Iterator trait for Counter and state that Item is u32, so each call to next() on a Counter returns an Option<u32>. Collections offer several different kinds of iterators:

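iter: yields immutable references, so the elements cannot be modified. Ownership of the elements is not transferred and nothing is destroyed, so the collection remains usable afterwards
iter_mut: unlike iter, yields mutable references, so the elements can be modified in place
into_iter: consumes the collection, that is, ownership is transferred to the iterator, and the collection can no longer be used afterwards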

Feature	iter	iter_mut	into_iter
Item type	&T (immutable)	&mut T (mutable)	T (owned value)
Ownership of the collection	Unchanged	Unchanged	Moved into the iterator

The following code shows the difference between the three:

let mut test = vec![1,2,3,4];
    // let mut iter = test.iter();
    println!("iter mutable");
    for it in test.iter_mut() {
        println!("target is {}", it);
        *it = 1;
    }
    println!("iter");
    for it in test.iter() {
        println!("target is {}", it);
    }
    println!("iter into");
    for it in test.into_iter() {
        println!("target is {}", it);
    }
    // test has been moved into the iterator and is no longer usable here
    // println!("{:?}",test); this line would fail to compile
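One property of std::iter::Iterator matters for the rest of this article: nothing in the trait stops a caller from keeping an item alive while the iterator advances. The sketch below illustrates this with a Counter modeled on the truncated example above. The body of next and the helper first_two are assumptions added for illustration.

```rust
struct Counter {
    count: u32,
}

impl Counter {
    fn new() -> Counter {
        Counter { count: 0 }
    }
}

impl Iterator for Counter {
    type Item = u32;

    // Yields 1 through 5, then None.
    fn next(&mut self) -> Option<Self::Item> {
        if self.count < 5 {
            self.count += 1;
            Some(self.count)
        } else {
            None
        }
    }
}

/// Hypothetical helper: takes two items from any iterator.
/// `first` is still alive when `next()` is called the second time,
/// and the `Iterator` trait allows that for every implementor.
fn first_two<I: Iterator>(mut iter: I) -> Option<(I::Item, I::Item)> {
    let first = iter.next()?;
    let second = iter.next()?;
    Some((first, second))
}
```

For Counter this is harmless because each item is an independent u32. It becomes a problem when an item is really a view into storage that the iterator reuses or frees on the next call.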

Vulnerability Analysis

Patch Analysis

ResultIterator, which the fix notes single out, is the entry point for the analysis. First, let us review this iterator's logic:

#[derive(Debug)]
-   pub struct ResultIterator<'a>(pub *mut _CassIterator, usize, PhantomData<&'a CassResult>);
+   pub struct ResultIterator<'a>(*mut _CassIterator, usize, PhantomData<&'a _CassResult>);

-   // The underlying C type has no thread-local state, but does not support access
-   // from multiple threads: https://datastax.github.io/cpp-driver/topics/#thread-safety
-   unsafe impl<'a> Send for ResultIterator<'a> {}
+   // The underlying C type has no thread-local state, and forbids only concurrent
+   // mutation/free: https://datastax.github.io/cpp-driver/topics/#thread-safety
+   unsafe impl Send for ResultIterator<'_> {}
+   unsafe impl Sync for ResultIterator<'_> {}

    impl<'a> Drop for ResultIterator<'a> {
        fn drop(&mut self) {
            unsafe { cass_iterator_free(self.0) }
        }
    }

-   impl<'a> Iterator for ResultIterator<'a> {
-       type Item = Row<'a>;
-       fn next(&mut self) -> Option<<Self as Iterator>::Item> {
+   impl LendingIterator for ResultIterator<'_> {
+       type Item<'a> = Row<'a> where Self: 'a;
+
+       fn next(&mut self) -> Option<<Self as LendingIterator>::Item<'_>> {
            unsafe {
                match cass_iterator_next(self.0) {
                    cass_false => None,
                    cass_true => Some(self.get_row()),
                }
            }
        }
        fn size_hint(&self) -> (usize, Option<usize>) {
            (0, Some(self.1))
        }
    }
-   impl<'a> ResultIterator<'a> {
-       /// Gets the next row in the result set
-       pub fn get_row(&mut self) -> Row<'a> {
+   impl ResultIterator<'_> {
+       /// Gets the current row in the result set
+       pub fn get_row(&self) -> Row {
            unsafe { Row::build(cass_iterator_get_row(self.0)) }
    }
}

Focus on the next function. Both before and after the change, the code declares lifetimes for Row and ResultIterator, and next simply calls get_row, which ResultIterator implements.

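LendingIterator is a trait the library defines itself. It is essentially similar to the original Iterator; its definition is omitted here, but it also declares a lifetime, as discussed later.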

get_row calls cass_iterator_get_row, a function implemented in C++, shown below:

const CassRow* cass_iterator_get_row(const CassIterator* iterator) {
  if (iterator->type() != CASS_ITERATOR_TYPE_RESULT) {
    return NULL;
  }
  return CassRow::to(static_cast<const ResultIterator*>(iterator->from())->row());
}

ResultIterator here is the C++ class that represents the iterator. Its implementation is:

class ResultIterator : public Iterator {
public:
  ResultIterator(const ResultResponse* result)
      : Iterator(CASS_ITERATOR_TYPE_RESULT)
      , result_(result)
      , index_(-1)
      , row_(result) {
    decoder_ = (const_cast<ResultResponse*>(result))->row_decoder();
    row_.values.reserve(result->column_count());
  }

  virtual bool next() {
    // skip code
  }

  const Row* row() const {
    assert(index_ >= 0 && index_ < result_->row_count());
    if (index_ > 0) {
      return &row_;
    } else {
      return &result_->first_row();
    }
  }
  private:
  const ResultResponse* result_;
  Decoder decoder_;
  int32_t index_;
  Row row_;
};

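As shown, a ResultIterator object contains a Row object named row_ as a member, and row_ is initialized when the iterator is created. The row() function returns a different pointer depending on the current index_: either the iterator's own row_ or the first row stored in the result. This gives us the first conclusion: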

ResultIterator and Row live in the same memory, so when the ResultIterator is destroyed, the Row is destroyed too

Next, let us see how ResultIterator is created and used in the Rust code:

impl CassResult {
    /// Gets the number of rows for the specified result.
    // ...

    /// Creates a new iterator for the specified result. This can be
    /// used to iterate over rows in the result.
    pub fn iter(&self) -> ResultIterator {
        unsafe {
            ResultIterator(
                cass_iterator_from_result(self.0),
                cass_result_row_count(self.0),
                PhantomData,
            )
        }
    }
}

The iterator is created from a CassResult object. The CassResult pointer is exactly the pointer passed to the C++ ResultIterator constructor shown earlier:

ResultIterator(const ResultResponse* result)
      : Iterator(CASS_ITERATOR_TYPE_RESULT)
      , result_(result)         // CassResult pointer
      , index_(-1)
      , row_(result)            // CassResult pointer

This gives the second conclusion:

The raw CassResult pointer is passed to ResultIterator, and ResultIterator uses result_ to operate on the object

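This suggests a first possible problem: if the CassResult is destroyed before the ResultIterator, calling next on the ResultIterator would access freed memory... or would it? Let us try writing a PoC along these lines: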

let tmp_iter;
{
    let result = get_result();
    tmp_iter = result.iter();
}
println!("Using tmp iter here {:?}", tmp_iter);

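The compiler rejects this immediately, which shows that the Rust compiler catches this kind of mistake. The credit goes to the PhantomData field declared on ResultIterator: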

#[derive(Debug)]
-   pub struct ResultIterator<'a>(pub *mut _CassIterator, usize, PhantomData<&'a CassResult>);
+   pub struct ResultIterator<'a>(*mut _CassIterator, usize, PhantomData<&'a _CassResult>);

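The PhantomData is present both before and after the change, so the lifetime of ResultIterator has always been tied to CassResult and this protection has always been in effect. In other words, this is not the vulnerability described in the advisory.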

The Core Vulnerability

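So where is the vulnerability? Going back to our first observations and the maintainers' mention of next, the bug must be triggered through the iterator itself. Re-examining the patch reveals an easily overlooked detail: many of the example files contain a change like this: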

-   for row in result.iter() {
+   let mut iter = result.iter();
+   while let Some(row) = iter.next() {

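At first I thought this change was inconsequential, since it looks like a mere difference in style. But when I forced the code back to the pre-patch calling pattern, the compiler reported the following: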

error[E0277]: `cassandra_cpp::cassandra::result::ResultIterator<'_>` is not an iterator
  --> examples/simple2.rs:19:16
   |
19 |     for row in result.iter() {
   |                ^^^^^^^^^^^^^ `cassandra_cpp::cassandra::result::ResultIterator<'_>` is not an iterator
   |
   = help: the trait `Iterator` is not implemented for `cassandra_cpp::cassandra::result::ResultIterator<'_>`
   = note: required for `cassandra_cpp::cassandra::result::ResultIterator<'_>` to implement `IntoIterator`

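In other words, the old pattern is now a compile error, because the fixed ResultIterator does not implement the Iterator trait. The author also points this out in the documentation: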

/// An iterator over the results of a query. The result holds the data, so
/// the result must last for at least the lifetime of the iterator.
///
/// This is a lending iterator (you must stop using each item before you move to
/// the next), and so it does not implement `std::iter::Iterator`. The best way
/// to use it is as follows:

Combining the error with the lifetime declarations, a few points stand out:

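The fixed ResultIterator does not implement Iterator; it implements a custom iterator trait instead, which is why the for-in loop can no longer be used
ResultIterator is a C++ object that contains a Row object by value, not a pointer to one
On the Rust side, the lifetimes of ResultIterator and Row were not strongly bound to each other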

The fix notes stress that ResultIterator no longer implements Iterator but LendingIterator. The relevant code is:

-   impl<'a> Iterator for ResultIterator<'a> {
-       type Item = Row<'a>;
-       fn next(&mut self) -> Option<<Self as Iterator>::Item> {
+   impl LendingIterator for ResultIterator<'_> {
+       type Item<'a> = Row<'a> where Self: 'a;
+
+       fn next(&mut self) -> Option<<Self as LendingIterator>::Item<'_>> {

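The pre-patch code is somewhat deceptive. At first glance it looks equivalent to the patched version: both appear to keep the Row type named by Item in step with the lifetime of ResultIterator. One names the lifetime explicitly while the other goes through Self; one sets Item to the lifetime-carrying Row<'a>, the other declares a lifetime-parameterized Item<'a>. In fact, though, the lifetime on Row<'a> does not describe the Row data itself. Check the definition: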

/// A collection of column values. Read-only, so thread-safe.
-   pub struct Row<'a>(*const _Row, PhantomData<&'a CassResult>);
+   //
+   // Borrowed immutably.
+   pub struct Row<'a>(*const _Row, PhantomData<&'a _Row>);

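With this definition in view, we can see that before the fix, the 'a in ResultIterator<'a> was the same lifetime that Row declared, namely that of the CassResult. CassResult provides the interface for obtaining a ResultIterator, so the relationship looks like this: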

CassResult --- Create --> ResultIterator 
                            |
                            +-- Create from self --> Row

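From a design standpoint this seems reasonable: it is natural for each row of a query result to live as long as the result itself. In the implementation, however, a Row comes from the ResultIterator, and nothing explicitly expressed the relationship between Row and ResultIterator. Before the fix, Rust therefore allowed a Row and its ResultIterator to have different lifetimes, even though in C++ the two objects come from the same block of memory. Once a variable of type Row outlives the ResultIterator, the Row keeps being used after the ResultIterator has been destroyed. The same applies when the iterator merely advances: as the README notes, calling next() invalidates the current item. Because the lifetime declaration was wrong, the Rust compiler cannot detect any of this, which produces the UAF described above.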

Here is an example (for illustration only; it does not run as-is):

let mut tmp_row = None;
let result = function.get_result();
{
    for row in result.iter() {
        if condition.satisfied() {
            // row is kept, but the iterator it points into is dropped
            // as soon as the loop ends
            tmp_row = Some(row);
            break;
        }
    }
}

println!("here will cause problem {:?}", tmp_row);

Code like this could plausibly exist in real projects.
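The mismatch can be reproduced in miniature without the C++ driver. The following is a toy model under stated assumptions, not cassandra-rs code: ToyResult, ToyIter and ToyRow are invented names, and a Box<u32> stands in for the row_ member that the C++ iterator owns. Because the row's lifetime names the result rather than the iterator, the compiler accepts a function that returns a row after its iterator is gone. The sketch never dereferences the dangling pointer, since doing so would be undefined behaviour.

```rust
use std::marker::PhantomData;

struct ToyResult {
    rows: Vec<u32>,
}

/// Pre-fix shape: the lifetime names the result, not the iterator.
struct ToyRow<'r>(*const u32, PhantomData<&'r ToyResult>);

impl ToyRow<'_> {
    /// Returns the raw pointer without dereferencing it.
    fn as_ptr(&self) -> *const u32 {
        self.0
    }
}

struct ToyIter<'r> {
    result: &'r ToyResult,
    index: usize,
    // Stand-in for the C++ `row_` member: storage owned by the iterator
    // and overwritten on every call to `next()`.
    current: Box<u32>,
}

impl<'r> Iterator for ToyIter<'r> {
    // The item claims to live as long as the result ('r), even though
    // its pointer targets storage owned by the iterator.
    type Item = ToyRow<'r>;

    fn next(&mut self) -> Option<ToyRow<'r>> {
        let value = *self.result.rows.get(self.index)?;
        self.index += 1;
        *self.current = value;
        let ptr: *const u32 = &*self.current;
        Some(ToyRow(ptr, PhantomData))
    }
}

/// Compiles without complaint: the row escapes, the iterator (and the
/// storage the row points into) is dropped when this function returns.
fn escaping_row(result: &ToyResult) -> Option<ToyRow<'_>> {
    let mut iter = ToyIter { result, index: 0, current: Box::new(0) };
    iter.next()
}
```

The borrow checker only sees that the returned ToyRow does not outlive the ToyResult, which is true. The relationship that actually matters, between the row and the iterator, is simply not written down anywhere in the types.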

Fix Strategy

The author first introduces LendingIterator, whose interface is as follows:

pub trait LendingIterator {
    /// The type of each item.
    type Item<'a>
    where
        Self: 'a;

    /// skip some code
}

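When the associated type Item is declared, it takes a lifetime parameter 'a bounded by where Self: 'a, that is, the implementing type must outlive 'a. In other words, an item handed out through the trait cannot outlive the object it was borrowed from. This relies on a relatively new Rust feature, generic associated types (GATs), which the author links to in the README: https://blog.rust-lang.org/2022/11/03/Rust-1.65.0.html#generic-associated-types-gats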

In short, the feature makes the following possible:

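When defining a trait, an associated type can be declared with its own lifetime parameter, and that lifetime can be bounded by Self
When a struct implements the trait, the values returned by trait methods that use the associated type are tied to the lifetime of the borrow of the struct itself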

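The textbook case is exactly our scenario: the items produced by an iterator must not outlive the iterator or, more precisely, the borrow taken by each next() call. The fix is built mainly on this feature.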
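The points above can be sketched in safe Rust. This is a cut-down illustration, assuming Rust 1.65 or later: the trait below keeps only the associated type and next from the signatures shown in the patch, while Buffered and collect_owned are hypothetical names that do not exist in cassandra-rs.

```rust
/// A cut-down lending iterator trait; the library's own trait has more to it.
trait LendingIterator {
    type Item<'a>
    where
        Self: 'a;

    // The item borrows from `self`, so it must be gone before the next call.
    fn next(&mut self) -> Option<Self::Item<'_>>;
}

/// Hypothetical iterator that reuses one slot for every item,
/// similar in spirit to the C++ iterator reusing `row_`.
struct Buffered {
    data: Vec<u32>,
    index: usize,
    current: u32,
}

impl LendingIterator for Buffered {
    type Item<'a> = &'a u32 where Self: 'a;

    fn next(&mut self) -> Option<Self::Item<'_>> {
        let value = *self.data.get(self.index)?;
        self.index += 1;
        self.current = value; // overwrite the shared slot
        Some(&self.current)
    }
}

/// Copies each lent item into owned storage before advancing.
fn collect_owned(mut iter: Buffered) -> Vec<u32> {
    let mut out = Vec::new();
    // `for` loops need `Iterator`, so a lending iterator is driven with `while let`.
    while let Some(item) = iter.next() {
        out.push(*item);
    }
    out
}
```

Holding item across a second iter.next() call would be rejected by the borrow checker, which is the compile-time guarantee the fix is after. Data that is needed later has to be copied out first, as collect_owned does.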

Second, Row was also modified:

/// A collection of column values. Read-only, so thread-safe.
-   pub struct Row<'a>(*const _Row, PhantomData<&'a CassResult>);
+   //
+   // Borrowed immutably.
+   pub struct Row<'a>(*const _Row, PhantomData<&'a _Row>);

The phantom data now refers to the row itself (_Row is the type behind the pointer to the C++ Row).

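With these changes combined, the lifetime of the Row pointer is bound to the ResultIterator. If we now try to use a retrieved Row outside the lifetime of the ResultIterator, the compiler reports that one outlives the other and refuses to compile: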

error[E0597]: `iter` does not live long enough
  --> example.rs
   |
21 |     let mut iter = result.iter();
   |         -------- binding `iter` declared here
22 |     while let Some(row) = iter.next() {
   |                           ^^^^ borrowed value does not live long enough
...
28 |     }
   |     - `iter` dropped here while still borrowed
29 |
30 |     println!("here will cause problem {:?}", tmp_row);
   |

Other Changes

Beyond the vulnerability just discussed, the patch also adds phantom data to many other types, for example:

#[derive(Debug)]
-   pub struct RowIterator(pub *mut _CassIterator);
+   pub struct RowIterator<'a>(*mut _CassIterator, PhantomData<&'a _Row>);

It also updates the corresponding trait implementations and methods:

-   impl Drop for RowIterator {
+   impl Drop for RowIterator<'_> {
        fn drop(&mut self) {
            unsafe { cass_iterator_free(self.0) }
        }
    }
-   impl<'a> Iterator for &'a RowIterator {
-       type Item = Value;
+   impl LendingIterator for RowIterator<'_> {
+       type Item<'a> = Value<'a> where Self: 'a;

-       fn next(&mut self) -> Option<<Self as Iterator>::Item> {
+       fn next(&mut self) -> Option<<Self as LendingIterator>::Item<'_>> {
            unsafe {
            match cass_iterator_next(self.0) {
                cass_false => None,
                cass_true => Some(Value::build(cass_iterator_get_column(self.0))),
            }
        }
    }

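In the original implementation RowIterator had no lifetime at all, and as its name suggests it ultimately hands out data belonging to a _Row. That matches the model discussed earlier, in which a Test2 is obtained from a Test1, so these types may well have had similar problems before the fix. On closer inspection, though, for most of the iterator types the data they return does not share memory with the iterator the way Row and ResultIterator do, so these changes are presumably preemptive hardening against the same class of bug.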

References

https://kaisery.github.io/trpl-zh-cn/title-page.html
