Publications

PUMA: Secure Inference of LLaMA-7B in Five Minutes

Published in arXiv preprint arXiv:2307.12533, 2023

Authors: Ye Dong, Wen-jie Lu, Yancheng Zheng, Haoqi Wu, Derun Zhao, Jin Tan, Zhicong Huang, Cheng Hong, Tao Wei, Wenguang Chen

Abstract: With ChatGPT as a representative example, many companies have begun to provide services based on large Transformer models. However, using such a service inevitably leaks users’ prompts to the model provider. Previous work has studied secure inference for Transformer models using secure multiparty computation (MPC), where model parameters and clients’ prompts are kept secret. Despite this, these frameworks are still limited in terms of model performance, efficiency, and deployment. To address these limitations, we propose the framework PUMA to enable fast and secure Transformer model inference. Our framework designs high-quality approximations for expensive functions, such as GeLU and Softmax, which significantly reduce the cost of secure inference while preserving model performance. Additionally, we design secure Embedding and LayerNorm procedures that faithfully implement the desired functionality without undermining the Transformer architecture. PUMA is about 2× faster than the state-of-the-art MPC framework MPCFORMER (ICLR 2023) and has accuracy similar to plaintext models without fine-tuning (which previous works failed to achieve). Moreover, PUMA can evaluate LLaMA-7B in around 5 minutes to generate 1 token. To the best of our knowledge, this is the first time that a model of this parameter size has been evaluated under MPC. PUMA has been open-sourced in the GitHub repository of SecretFlow-SPU.

Download here

Practical and General Backdoor Attacks against Vertical Federated Learning

Published in ECML-PKDD 2023, 2023

Authors: Yuexin Xuan, Xiaojun Chen, Zhendong Zhao, Bisheng Tang, Ye Dong

Abstract: Federated learning (FL), which aims to facilitate data collaboration across multiple organizations without exposing private data, encounters potential security risks. One serious threat is backdoor attacks, where an attacker injects a specific trigger into the training dataset to manipulate the model’s prediction. Most existing FL backdoor attacks are based on horizontal federated learning (HFL), where the data owned by different parties have the same features. However, compared to HFL, backdoor attacks on vertical federated learning (VFL), where each party holds only a disjoint subset of features and the labels are owned by only one party, are rarely studied. The main challenge of this attack is to allow an attacker without access to the data labels to perform an effective attack. To this end, we propose BadVFL, a novel and practical approach to inject backdoor triggers into victim models without label information. BadVFL mainly consists of two key steps. First, to address the challenge of attackers having no knowledge of labels, we introduce an SDD module that can trace data categories based on gradients. Second, we propose an SDP module that can improve the attack’s effectiveness by enhancing the decision dependency between the trigger and the attack target. Extensive experiments show that BadVFL supports diverse datasets and models, and achieves an attack success rate of over 93% with a poisoning rate of only 1%.

Download here

GTree: GPU-Friendly Privacy-preserving Decision Tree Training and Inference

Published in arXiv preprint arXiv:2305.00645, 2023

Authors: Qifan Wang, Shujie Cui, Lei Zhou, Ye Dong, Jianli Bai, Yun Sing Koh, Giovanni Russello

Abstract: The decision tree (DT) is a widely used machine learning model due to its versatility, speed, and interpretability. However, for privacy-sensitive applications, outsourcing DT training and inference to cloud platforms raises concerns about data privacy. Researchers have developed privacy-preserving approaches for DT training and inference using cryptographic primitives, such as Secure Multi-Party Computation (MPC). While these approaches have shown progress, they still suffer from heavy computation and communication overheads. A few recent works employ Graphics Processing Units (GPUs) to improve the performance of MPC-protected deep learning. This raises a natural question: can MPC-protected DT training and inference be accelerated by GPU? We present GTree, the first scheme that uses GPUs to accelerate MPC-protected secure DT training and inference. GTree is built across 3 parties who securely and jointly perform each step of DT training and inference with GPU. Each MPC protocol in GTree is designed in a GPU-friendly version. The performance evaluation shows that GTree achieves ∼11× and ∼21× improvements in training on the SPECT and Adult datasets, compared to the most efficient prior CPU-based work. For inference, GTree shows superior efficiency when the DT has fewer than 10 levels: it is 126× faster than the most efficient prior work when inferring 10^4 instances with a tree of 7 levels. GTree also achieves a stronger security guarantee than prior solutions: it leaks only the tree depth and the size of the data samples, whereas prior solutions also leak the tree structure. With oblivious array access, the access pattern on the GPU is also protected.

Download here

FLEXBNN: Fast Private Binary Neural Network Inference with Flexible Bit-Width

Published in IEEE Transactions on Information Forensics and Security, 2023

Authors: Ye Dong, Xiaojun Chen, Xiangfu Song, Kaiyun Li

Abstract: Advancements in deep learning enable neural network (NN) inference to be offered as a service, but service providers and clients want to keep their inputs secret for privacy protection. Private inference is the task of evaluating an NN without leaking private inputs. Existing secure multiparty computation (MPC)-based solutions mainly focus on a fixed bit-width methodology, such as 32 and 64 bits. A Binary Neural Network (BNN) is efficient when evaluated in MPC and has achieved reasonable accuracy on commonly used datasets, but prior private BNN inference solutions, which focus on Boolean Circuits, are still costly in communication and run-time. In this paper, we introduce FLEXBNN, a fast private BNN inference framework using three-party computation (3PC) in Arithmetic Circuits against semi-honest adversaries with an honest majority. In FLEXBNN, we propose to employ flexible and small bit-widths equipped with a seamless bit-width conversion method, and we design several specific optimizations for the basic operations: i) We propose bit-width determination methods for Matrix Multiplication and the Sign-based Activation function. ii) We integrate Batch Normalization and Max-Pooling into the Sign-based Activation function for better efficiency. iii) More importantly, we achieve seamless bit-width conversion within the Sign-based Activation function at no additional cost. Extensive experiments illustrate that FLEXBNN outperforms state-of-the-art solutions in communication, run-time, and scalability. On average, FLEXBNN is 11× faster than XONN (USENIX Security’19) in LAN, 46× (resp. 9.3×) faster than QUOTIENT (ACM CCS’19) in LAN (resp. WAN), 10× faster than BANNERS (ACM IH&MMSec’21) in LAN, and 1.1–2.9× (resp. 1.5–2.7×) faster than FALCON (semi-honest, PoPETs’21) in LAN (resp. WAN), and improves the respective communication by 500×, 127×, and 1.3–1.5× compared to XONN, BANNERS, and FALCON.

Download here

Meteor: Improved Secure 3-Party Neural Network Inference with Reducing Online Communication Costs

Published in The 2023 ACM Web Conference, 2023

Authors: Ye Dong, Xiaojun Chen, Weizhan Jing, Kaiyun Li, Weiping Wang

Abstract: Secure neural network inference has been a promising solution to private Deep-Learning-as-a-Service, which enables the service provider and user to execute neural network inference without revealing their private inputs. However, the expensive overhead of current schemes is still an obstacle when applied in real applications. In this work, we present Meteor, an online communication-efficient and fast secure 3-party computation neural network inference system against a semi-honest adversary in the honest-majority setting. The main contributions of Meteor are two-fold: i) We propose a new and improved 3-party secret sharing scheme stemming from the linearity of replicated secret sharing, and design efficient protocols for the basic cryptographic primitives, including linear operations, multiplication, most significant bit extraction, and multiplexer. ii) Furthermore, we build efficient and secure blocks for the widely used neural network operators such as Matrix Multiplication, ReLU, and Maxpool, along with several specific optimizations for better efficiency. Our total communication including the setup phase is a little larger than that of SecureNN (PoPETs’19) and Falcon (PoPETs’21), two state-of-the-art solutions, but the gap is not significant when the online phase must be optimized as a priority. Using Meteor, we perform extensive evaluations on various neural networks. Compared to SecureNN and Falcon, we reduce the online communication costs by up to 25.6× and 1.5×, and improve the running time by at most 9.8× (resp. 8.1×) and 1.5× (resp. 2.1×) in LAN (resp. WAN) for the online inference.

Download here

ABNN2: Secure Two-Party Arbitrary-Bitwidth Quantized Neural Network Predictions

Published in 59th ACM/IEEE Design Automation Conference, 2022

Authors: Liyan Shen, Ye Dong, Binxing Fang, Jinqiao Shi, Xuebin Wang, Shengli Pan, Ruisheng Shi

Abstract: Data privacy and security issues are preventing many potential cloud-based machine-learning-as-a-service applications from being deployed. In the recent past, secure multi-party computation (MPC) has been used to achieve secure neural network predictions, guaranteeing the privacy of data. However, existing two-party solutions are expensive and impractical in real-world settings.

In this work, we utilize the advantages of quantized neural networks (QNN) and MPC to present ABNN2, a practical secure two-party framework that can realize arbitrary-bitwidth quantized neural network predictions. Concretely, we propose an efficient and novel matrix multiplication protocol based on 1-out-of-N OT extension and optimize the protocol through a parallel scheme. In addition, we design an optimized protocol for the ReLU function. The experiments demonstrate that our protocols are about 2×–36× and 1.4×–7× faster than SecureML (S&P’17) and MiniONN (CCS’17), respectively. ABNN2 also achieves efficiency comparable to the state-of-the-art QNN prediction protocol QUOTIENT (CCS’19), but the latter only supports ternary neural networks.

Download here

Distributed Fog Computing and Federated Learning enabled Secure Aggregation for IoT Devices

Published in IEEE Internet of Things Journal, 2022

Authors: Yiran Liu, Ye Dong, Hao Wang, Han Jiang, Qiuliang Xu

Abstract: Federated learning (FL), as a promising way to process and analyze the massive data from Internet of Things (IoT) devices, has attracted increasing attention from academia and industry. However, considering the unreliable nature of IoT devices, ensuring the efficiency of FL while protecting the privacy of devices’ input data is a challenging task. To address these issues, we propose a secure aggregation protocol based on efficient additive secret sharing in the fog-computing (FC) setting. As secure aggregation is performed frequently in the training process of FL, the protocol should have low communication and computation overhead. First, we use a fog node (FN) as an intermediate processing unit to provide local services that help the cloud server aggregate the sum during the training process. Second, we design a lightweight Request-then-Broadcast method to make our protocol robust to dropped-out clients. Our protocol also provides two simple new client selection methods. The security and performance of our protocol are analyzed and compared with existing schemes. We conduct experiments on high-dimensional inputs, and our experimental results demonstrate about 24–168× improvement in computation overhead and 87–287× improvement in communication overhead compared to Google’s secure aggregation protocol (Bonawitz et al., CCS’17).
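As a rough illustration of the additive secret sharing that the abstract builds on, the sketch below splits each client input into random shares and lets each aggregator sum only the shares it receives. This is a generic textbook construction, not the paper's protocol: the ring size `MODULUS`, the function names, and the two-aggregator setup are assumptions made for the example, and fog nodes, dropout handling, and client selection are all omitted.

```python
import secrets

MODULUS = 2**32  # assumed ring size, chosen only for this sketch


def share(value, n_shares, modulus=MODULUS):
    """Split an integer into n additive shares that sum to value mod modulus."""
    shares = [secrets.randbelow(modulus) for _ in range(n_shares - 1)]
    last = (value - sum(shares)) % modulus
    return shares + [last]


def aggregate(client_inputs, n_aggregators=2, modulus=MODULUS):
    """Each client shares its input; each aggregator sums the shares it holds."""
    partial_sums = [0] * n_aggregators
    for x in client_inputs:
        for j, s in enumerate(share(x, n_aggregators, modulus)):
            partial_sums[j] = (partial_sums[j] + s) % modulus
    # Only the combination of all partial sums reveals the total.
    return sum(partial_sums) % modulus


print(aggregate([3, 10, 25]))  # 38
```

Each individual share is uniformly random, so a single aggregator learns nothing about any one client's input from the shares it holds.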

Download here

DEFEAT: Deep Hidden Feature Backdoor Attacks by Imperceptible Perturbation and Latent Representation Constraints

Published in IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Authors: Zhendong Zhao, Xiaojun Chen, Yuexin Xuan, Ye Dong, Dakui Wang, Kaitai Liang

Abstract: Backdoor attacks are a serious security threat to deep learning models. An adversary can provide users with a model trained on poisoned data to manipulate prediction behavior at test time using a backdoor. The backdoored models behave normally on clean images, yet can be activated to output incorrect predictions if the input is stamped with a specific trigger pattern. Most existing backdoor attacks focus on manually defining imperceptible triggers in input space without considering the abnormality of triggers’ latent representations in the poisoned model. These attacks are susceptible to backdoor detection algorithms and even visual inspection. In this paper, we propose a novel and stealthy backdoor attack, DEFEAT. It poisons the clean data using adaptive imperceptible perturbation and restricts latent representations during the training process to strengthen our attack’s stealthiness and resistance to defense algorithms. We conduct extensive experiments on multiple image classifiers using real-world datasets to demonstrate that our attack can 1) hold against the state-of-the-art defenses, 2) deceive the victim model with a high attack success rate without jeopardizing model utility, and 3) provide practical stealthiness on image data.

Download here

Efficient Byzantine-Resilient Stochastic Gradient Descent

Published in International Workshop on Federated and Transfer Learning for Data Sparsity and Confidentiality in Conjunction with IJCAI, 2021

Authors: Kaiyun Li, Xiaojun Chen, Ye Dong, Peng Zhang, Dakui Wang, and Shuai Zeng

Abstract: Distributed Learning often suffers from Byzantine failures, and there have been a number of works studying the problem of distributed stochastic optimization under Byzantine failures, where only a portion of workers, instead of all the workers in a distributed learning system, compute stochastic gradients at each iteration. These methods, albeit workable under Byzantine failures, have the shortcomings of either a sub-optimal convergence rate or high computation cost. To this end, we propose a new Byzantine-resilient stochastic gradient descent algorithm (BrSGD for short) which is provably robust against Byzantine failures. BrSGD obtains the optimal statistical performance and efficient computation simultaneously. In particular, BrSGD can achieve an order-optimal statistical error rate for strongly convex loss functions. The computation complexity of BrSGD is O(md), where d is the model dimension and m is the number of machines. Experimental results show that BrSGD can obtain competitive results compared with non-Byzantine machines in terms of effectiveness and convergence.

Download here

FLOD: Oblivious Defender for Private Byzantine-Robust Federated Learning with Dishonest-Majority

Published in European Symposium on Research in Computer Security, 2021

Authors: Ye Dong, Xiaojun Chen, Kaiyun Li, Dakui Wang, and Shuai Zeng

Abstract: Privacy and Byzantine-robustness are two major concerns of federated learning (FL), but mitigating both threats simultaneously is highly challenging: privacy-preserving strategies prohibit access to individual model updates to avoid leakage, while Byzantine-robust methods require access for comprehensive mathematical analysis. Besides, most Byzantine-robust methods only work in the honest-majority setting.

We present FLOD, a novel oblivious defender for private Byzantine-robust FL in the dishonest-majority setting. Specifically, we propose a novel Hamming distance-based aggregation method to resist >1/2 Byzantine attacks using a small root-dataset and server-model for bootstrapping trust. Furthermore, we employ two non-colluding servers and use additive homomorphic encryption (AHE) and secure two-party computation (2PC) primitives to construct efficient privacy-preserving building blocks for secure aggregation, in which we propose two novel in-depth variants of Beaver Multiplication triples (MT) to significantly reduce the overhead of Bit to Arithmetic (Bit2A) conversion and vector weighted sum aggregation (VSWA). Experiments on real-world and synthetic datasets demonstrate its effectiveness and efficiency: FLOD (i) defeats known Byzantine attacks with a negligible effect on accuracy and convergence, (ii) achieves a reduction of ≈2× in offline (resp. online) overhead of Bit2A and VSWA compared to ABY-AHE (resp. ABY-MT) based methods (NDSS’15), and (iii) reduces total online communication and run-time by 167–1416× and 3.1–7.4× compared to FLGUARD (Crypto Eprint 2021/025).
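For readers unfamiliar with Beaver multiplication triples, the following sketch shows the standard two-party construction on additive shares. It is not FLOD's variant: the prime modulus `P`, the helper `share2`, and the use of a trusted dealer to generate the triple inside the function are all assumptions made to keep the example short (real protocols generate triples in an offline phase, for instance with AHE or oblivious transfer).

```python
import secrets

P = 2**61 - 1  # assumed prime modulus for this sketch


def share2(x):
    """Split x into two additive shares modulo P."""
    r = secrets.randbelow(P)
    return r, (x - r) % P


def beaver_multiply(x_shares, y_shares):
    """Multiply two secret-shared values using one triple (a, b, c = a*b)."""
    # Assumption: a trusted dealer samples and shares the triple.
    a = secrets.randbelow(P)
    b = secrets.randbelow(P)
    a_sh, b_sh, c_sh = share2(a), share2(b), share2(a * b % P)

    # The parties open the masked values e = x - a and f = y - b.
    e = (x_shares[0] - a_sh[0] + x_shares[1] - a_sh[1]) % P
    f = (y_shares[0] - b_sh[0] + y_shares[1] - b_sh[1]) % P

    # Local computation: z = c + e*b + f*a + e*f, with e*f added by one party.
    z0 = (c_sh[0] + e * b_sh[0] + f * a_sh[0] + e * f) % P
    z1 = (c_sh[1] + e * b_sh[1] + f * a_sh[1]) % P
    return z0, z1


z0, z1 = beaver_multiply(share2(6), share2(7))
print((z0 + z1) % P)  # 42
```

The opened values `e` and `f` are masked by the uniformly random `a` and `b`, so they reveal nothing about `x` and `y` on their own.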

Download here

Efficient and Secure Federated Learning Based on Secret Sharing and Gradients Selection (in Chinese)

Published in Journal of Computer Research and Development, 2020

Authors: Dong Ye, Hou Wei, Chen Xiaojun, and Zeng Shuai

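Abstract: In recent years, federated learning has become an emerging collaborative machine learning method. In federated learning, distributed users can train various models by sharing only gradients. However, some studies have shown that gradients can also leak users’ private information, and secure multi-party computation is regarded as an effective tool for protecting privacy. On the other hand, some researchers have proposed the Top-K gradient selection algorithm to reduce the communication overhead of synchronizing gradients among users. However, few existing works balance the advantages of these two areas. We combine secret sharing with Top-K gradient selection to design an efficient and secure federated learning protocol that reduces communication overhead and improves training efficiency while guaranteeing user privacy and data security. In addition, we propose an efficient method for constructing message authentication codes to verify the validity of the aggregated results returned by the server, where the communication overhead introduced by the authentication codes is independent of the number of gradients. Experimental results show that, compared with plaintext training under the same conditions, our security techniques introduce a small amount of additional overhead in both communication and computation, while the scheme achieves the same level of model accuracy as plaintext training.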
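The Top-K gradient selection named in this abstract can be illustrated with a minimal sketch: each user keeps only the k gradient entries of largest magnitude and transmits them with their indices. The function names and the plain-list representation are assumptions for the example; the paper's actual protocol additionally secret-shares the selected values, which is not shown here.

```python
def top_k_sparsify(gradient, k):
    """Keep the k entries of largest magnitude; return (indices, values)."""
    if k <= 0:
        return [], []
    order = sorted(range(len(gradient)),
                   key=lambda i: abs(gradient[i]), reverse=True)
    kept = sorted(order[:k])  # report indices in ascending order
    return kept, [gradient[i] for i in kept]


def densify(indices, values, length):
    """Rebuild a dense vector, filling unselected positions with zero."""
    dense = [0.0] * length
    for i, v in zip(indices, values):
        dense[i] = v
    return dense


idx, vals = top_k_sparsify([0.1, -2.0, 0.5, 1.5], k=2)
print(idx, vals)              # [1, 3] [-2.0, 1.5]
print(densify(idx, vals, 4))  # [0.0, -2.0, 0.0, 1.5]
```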

Download here

An Efficient 3-Party Framework for Privacy-Preserving Neural Network Inference

Published in European Symposium on Research in Computer Security, 2020

Authors: Liyan Shen, Xiaojun Chen, Jinqiao Shi, Ye Dong, and Binxing Fang

Abstract: In the era of big data, users pay more attention to data privacy issues in many application fields, such as healthcare and finance. However, in the current application scenarios of machine learning as a service, service providers require users’ private inputs to complete neural network inference tasks. Previous works have shown that some cryptographic tools can be used to achieve secure neural network inference, but a performance gap still prevents these techniques from being practical.

In this paper, we focus on the efficiency problem of privacy-preserving neural network inference and propose novel 3-party secure protocols to implement a range of nonlinear activation functions such as ReLU and Sigmoid. Experiments on five popular neural network models demonstrate that our protocols achieve about 1.2×–11.8× and 1.08×–4.8× performance improvement over the state-of-the-art 3-party protocols (SecureNN) in terms of computation and communication overhead. Furthermore, we are the first to implement privacy-preserving inference of graph convolutional networks.

Download here

EaSTFLy: Efficient and secure ternary federated learning

Published in Computers & Security, 2020

Authors: Ye Dong, Xiaojun Chen, Liyan Shen, and Dakui Wang

Abstract: Privacy-preserving machine learning allows multiple parties to perform distributed data analytics while guaranteeing individual privacy. In this area, researchers have proposed many schemes that combine machine learning with privacy-preserving technologies, but these works have shortcomings in terms of efficiency. Meanwhile, federated learning has received widespread attention due to its ability to update parameters without collecting users’ raw data, but this method falls short in communication efficiency and privacy. Recently, ternary gradients federated learning (TernGrad) has been proposed to reduce communication, but it is still vulnerable to various security and privacy threats.

In this paper, we first analyze the privacy leakages of TernGrad. Then, we present our solution, EaSTFLy, to solve the privacy issue. More concretely, in EaSTFLy, we combine TernGrad with secret sharing and homomorphic encryption to design our privacy-preserving protocols against a semi-honest adversary. In addition, we optimize our protocols via SIMD. Compared to prior works on floating-point gradients, our protocols are more efficient in communication and computation overheads, and the accuracy is as high as that of plaintext ternary federated learning. To the best of our knowledge, this is the first research combining ternary federated learning with privacy-preserving technologies. Finally, we report experiments that show these improvements.
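To make the ternary gradients mentioned above concrete, the sketch below shows the usual stochastic ternarization step: a gradient vector is replaced by a single scale and a vector of values in {-1, 0, +1}, where each entry survives with probability proportional to its magnitude. The function name and return format are assumptions for the example, and none of EaSTFLy's secret sharing or homomorphic encryption is included.

```python
import random


def ternarize(gradient, rng=random):
    """Stochastically map a gradient to (scale, signs) with signs in {-1, 0, 1}.

    Entry g is kept with probability |g| / scale, where scale = max |g|,
    so scale * sign equals g in expectation.
    """
    scale = max((abs(g) for g in gradient), default=0.0)
    if scale == 0.0:
        return 0.0, [0] * len(gradient)
    signs = []
    for g in gradient:
        keep = rng.random() < abs(g) / scale
        signs.append((1 if g > 0 else -1) if keep else 0)
    return scale, signs


scale, signs = ternarize([0.5, -2.0, 0.0, 1.0])
print(scale, signs)  # scale is 2.0; signs vary from run to run
```

Because each entry needs only about two bits plus one shared scale, the transmitted update is much smaller than a floating-point gradient.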

Download here

Privacy-Preserving Distributed Machine Learning Based on Secret Sharing

Published in International Conference on Information and Communications Security, 2019

Authors: Ye Dong, Xiaojun Chen, Liyan Shen, and Dakui Wang

Abstract: Machine learning has been widely applied in practice, for example in disease diagnosis and target detection. Commonly, a good model relies on massive training data collected from different sources. However, the collected data might expose sensitive information. To solve the problem, researchers have proposed many excellent methods that combine machine learning with privacy protection technologies, such as secure multiparty computation (MPC), homomorphic encryption (HE), and differential privacy. Meanwhile, other researchers have proposed distributed machine learning, which allows the clients to store their data locally but train a model collaboratively. The first kind of method focuses on security, but its performance and accuracy remain to be improved, while the second provides higher accuracy and better performance but weaker security; for instance, the adversary can launch membership attacks from the gradient updates in plaintext.

In this paper, we combine secret sharing with distributed machine learning to achieve reliable performance, accuracy, and high-level security. We then design, implement, and evaluate a practical system to jointly learn an accurate model under semi-honest and servers-only malicious adversary security, respectively. The experiments show that our protocols also achieve the best overall performance.

Download here

Efficient and private set intersection of human genomes

Published in IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2018

Authors: Liyan Shen, Xiaojun Chen, Dakui Wang, Binxing Fang, and Ye Dong

Abstract: With the development of human genome sequencing technology, biological and medical research has been greatly accelerated, and a wide range of health-related applications and services have become increasingly ubiquitous and affordable. However, digitized genome sequences raise serious privacy issues, since a genome contains an individual’s extremely sensitive information. In this paper, we mainly focus on an efficient and privacy-preserving set intersection protocol for human genomes. It allows paternity and ancestry tests to be performed safely, without disclosing any additional genomic information about the individual. Experimental results demonstrate that the proposed techniques have better performance.

Download here

Ye Dong (董业)
