- Think Outside the Dataset: Finding Fraudulent Reviews using Cross-Dataset Analysis (WWW 2019)
- Nilizadeh et al.
While online review services provide a two-way conversation between brands and consumers, malicious actors, including misbehaving businesses, have an equal opportunity to distort the reviews for their own gains. We propose OneReview, a method for locating fraudulent reviews by correlating data from multiple crowd-sourced review sites. Our approach utilizes Change Point Analysis to locate points at which a business's reputation shifts. Inconsistent trends in reviews of the same businesses across multiple websites are used to identify suspicious reviews. We then extract an extensive set of textual and contextual features from these suspicious reviews and employ supervised machine learning to detect fraudulent reviews. We evaluated OneReview on about 805K and 462K reviews from Yelp and TripAdvisor, respectively, to identify fraud on Yelp. Supervised machine learning yields excellent results, with 97% accuracy. We applied the created model to suspicious reviews and detected about 62K fraudulent reviews (about 8% of all the Yelp reviews). We further analyzed the detected fraudulent reviews and their authors, and located several spam campaigns in the wild, including campaigns against specific businesses, as well as campaigns consisting of several hundred socially-networked untrustworthy accounts.
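The abstract above relies on Change Point Analysis to find where a business's rating trend shifts, without spelling out the exact procedure. As a minimal sketch under that caveat (this is a generic CUSUM-style estimator, not necessarily OneReview's method), the change point can be estimated as the index where the cumulative sum of deviations from the overall mean is largest in magnitude. The function name `cusum_change_point` and the sample ratings are hypothetical.

```python
from itertools import accumulate


def cusum_change_point(values):
    """Estimate the index at which a new regime starts in a numeric series.

    Returns None for series with fewer than two points. This always returns
    a candidate index; a real pipeline would also test whether the shift is
    statistically significant before flagging it.
    """
    if len(values) < 2:
        return None
    mean = sum(values) / len(values)
    # Running sum of deviations from the overall mean.
    cumulative = list(accumulate(v - mean for v in values))
    # The last running sum is (numerically) zero by construction, so skip it.
    k = max(range(len(values) - 1), key=lambda i: abs(cumulative[i]))
    return k + 1


# Hypothetical monthly average ratings for one business on one site.
ratings = [2, 2, 2, 2, 5, 5, 5, 5]
print(cusum_change_point(ratings))
```

Running the same estimator on the ratings of the same business from a second site and comparing the two estimated indices is one simple way to look for the kind of cross-site inconsistency the abstract describes.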
Fraud transactions are one of the major threats faced by online e-commerce platforms. Recently, deep learning based classifiers have been deployed to detect fraud transactions. Inspired by findings on adversarial examples, this paper is the first to analyze the vulnerability of deep fraud detectors to slight perturbations on input transactions, which is very challenging since the sparsity and discretization of transaction data result in a non-convex discrete optimization. Inspired by the iterative Fast Gradient Sign Method (FGSM) for the L∞ attack, we first propose the Iterative Fast Coordinate Method (IFCM) for discrete L1 and L2 attacks, which efficiently generates large numbers of instances with satisfactory effectiveness. We then provide two novel attack algorithms to solve the discrete optimization. The first one is the Augmented Iterative Search (AIS) algorithm, which repeatedly searches for effective "simple" perturbations. The second one is called the Rounded Relaxation with Reparameterization (R3), which rounds the solution obtained by solving a relaxed and unconstrained optimization problem with reparameterization tricks. Finally, we conduct extensive experimental evaluation on the deployed fraud detector in TaoBao, one of the largest e-commerce platforms in the world, with millions of real-world transactions. Results show that (i) the deployed detector is highly vulnerable to attacks, as the average precision decreases from nearly 90% to as low as 20% with small perturbations; (ii) our proposed attacks significantly outperform adaptations of the state-of-the-art attacks; and (iii) the model trained with an adversarial training process is significantly more robust against attacks and performs well on the unperturbed data.
Many approaches focus on detecting dense blocks in the tensor of multimodal data to prevent fraudulent entities (e.g., accounts, links) from retweet boosting, hashtag hijacking, link advertising, etc. However, no existing method is effective at finding a dense block if it only possesses high density on a subset of all dimensions in tensors. In this paper, we take the novel step of identifying dense-block detection with dense-subgraph mining, by modeling a tensor as a weighted graph without any loss of density information. Based on the weighted graph, which we call the information sharing graph (ISG), we propose an algorithm for finding multiple densest subgraphs, D-Spot, that is faster (up to 11x faster than the state-of-the-art algorithm) and can be computed in parallel. In an N-dimensional tensor, the entity group found by ISG+D-Spot is at least 1/2 of the optimum with respect to density, compared with the 1/N guarantee ensured by competing methods. We use nine datasets to demonstrate that ISG+D-Spot becomes the new state-of-the-art dense-block detection method in terms of accuracy, specifically for fraud detection.
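To make the dense-subgraph idea concrete, the sketch below implements the classic greedy peeling heuristic for the densest-subgraph problem on an unweighted graph, which is known to return a subgraph whose density (edges per node) is at least 1/2 of the optimum. It is shown only as an illustration of dense-subgraph mining; it is not D-Spot, and the ISG construction from a tensor is not reproduced here. The function name and the toy edge list are assumptions.

```python
def densest_subgraph_peeling(edges):
    """Greedy peeling: repeatedly drop the minimum-degree node and keep the
    densest intermediate node set, where density = edges / nodes."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops in this sketch
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    if not adj:
        return set(), 0.0

    nodes = set(adj)
    num_edges = sum(len(neigh) for neigh in adj.values()) // 2
    best_nodes, best_density = set(nodes), num_edges / len(nodes)

    while len(nodes) > 1:
        # Peel the node with the fewest remaining neighbours.
        u = min(nodes, key=lambda n: len(adj[n]))
        for v in adj[u]:
            adj[v].discard(u)
        num_edges -= len(adj[u])
        nodes.remove(u)
        del adj[u]
        density = num_edges / len(nodes)
        if density > best_density:
            best_nodes, best_density = set(nodes), density
    return best_nodes, best_density


# Toy graph: a 4-clique (a "lockstep" group) with a short tail attached.
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
group, density = densest_subgraph_peeling(toy_edges)
print(sorted(group), density)
```

This version rescans all nodes for the minimum degree at every step, which keeps the code short; a bucket queue would bring the running time down to linear in the number of edges.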
In large e-commerce websites, sellers have been observed to engage in fraudulent behaviour, faking historical transactions in order to receive favourable treatment from the platforms, specifically through the allocation of additional buyer impressions which results in higher revenue for them, but not for the system as a whole. This emergent phenomenon has attracted considerable attention, with previous approaches focusing on trying to detect illicit practices and to punish the miscreants. In this paper, we employ the principles of reinforcement mechanism design, a framework that combines the fundamental goals of classical mechanism design, i.e. the consideration of agents' incentives and their alignment with the objectives of the designer, with deep reinforcement learning for optimizing the performance based on these incentives. In particular, first we set up a deep-learning framework for predicting the sellers' rationality, based on real data from any allocation algorithm. We use data from one of the largest e-commerce platforms worldwide and train a neural network model to predict the extent to which the sellers will engage in fraudulent behaviour. Using this rationality model, we employ an algorithm based on deep reinforcement learning to optimize the objectives and compare its performance against several natural heuristics, including the platform's implementation and incentive-based mechanisms from the related literature.
- Adapting to Concept Drift in Credit Card Transaction Data Streams Using Contextual Bandits and Decision Trees (AAAI 2018)
- Soemers et al.
Credit card transactions predicted to be fraudulent by automated detection systems are typically handed over to human experts for verification. To limit costs, it is standard practice to select only the most suspicious transactions for investigation. We claim that a trade-off between exploration and exploitation is imperative to enable adaptation to changes in behavior (concept drift). Exploration consists of the selection and investigation of transactions with the purpose of improving predictive models, and exploitation consists of investigating transactions detected to be suspicious. Modeling the detection of fraudulent transactions as rewarding, we use an incremental Regression Tree learner to create clusters of transactions with similar expected rewards. This enables the use of a Contextual Multi-Armed Bandit (CMAB) algorithm to provide the exploration/exploitation trade-off. We introduce a novel variant of a CMAB algorithm that makes use of the structure of this tree, and use Semi-Supervised Learning to grow the tree using unlabeled data. The approach is evaluated on a real dataset and data generated by a simulator that adds concept drift by adapting the behavior of fraudsters to avoid detection. It outperforms frequently used offline models in terms of cumulative rewards, in particular in the presence of concept drift.
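The exploration/exploitation trade-off described above can be illustrated with plain UCB1 over a fixed set of transaction clusters, treating each cluster (for example, each leaf of a regression tree) as an arm and a confirmed fraud as a reward of 1. This is a generic bandit rule chosen for illustration; the paper's tree-aware CMAB variant is not reproduced here, and the function name and the counts below are hypothetical.

```python
import math


def ucb1_select(counts, reward_sums):
    """Pick the cluster to investigate next.

    counts[a]      : how many transactions from cluster a were investigated
    reward_sums[a] : how many of those turned out to be fraudulent
    """
    # Investigate every cluster at least once before comparing scores.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)
    scores = [
        reward_sums[a] / counts[a] + math.sqrt(2.0 * math.log(total) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


# Cluster 0 has a high observed fraud rate, but cluster 1 has barely been
# looked at, so its exploration bonus dominates and it is chosen next.
print(ucb1_select([1000, 1], [600.0, 0.0]))
```

The bonus term shrinks as a cluster is investigated more often, so rarely checked clusters are revisited from time to time. Note that standard UCB1 assumes stationary rewards; handling concept drift needs extra machinery such as discounting or windowing of the counts.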
Most of the current anti-money laundering (AML) systems, using handcrafted rules, are heavily reliant on existing structured databases, which are not capable of effectively and efficiently identifying hidden and complex money laundering (ML) activities, especially those with dynamic and time-varying characteristics, resulting in a high percentage of false positives. Therefore, analysts are engaged for further investigation, which significantly increases human capital cost and processing time. To alleviate these issues, this paper presents a novel framework for the next generation AML by applying and visualizing deep learning-driven natural language processing (NLP) technologies in a distributed and scalable manner to augment AML monitoring and investigation. The proposed distributed framework performs news and tweet sentiment analysis, entity recognition, relation extraction, entity linking and link analysis on different data sources (e.g. news articles and tweets) to provide additional evidence to human investigators for final decision-making. Each NLP module is evaluated on a task-specific data set, and the overall experiments are performed on synthetic and real-world datasets. Feedback from AML practitioners suggests that our system can reduce time and cost by approximately 30% compared to their previous manual approaches to AML investigation.
The collaboration of financial institutes against fraudsters is a promising path for reducing resource investments and increasing coverage. Yet, such collaboration is held back by two somewhat conflicting challenges: effective knowledge sharing and limiting leakage of private information. While the censorship of private information is likely to reduce knowledge sharing effectiveness, the generalization of private information to a desired degree can potentially allow, on one hand, to limit the leakage, and on the other hand, to reveal some properties of the private information that can be beneficial for sharing. In this demo we present a system that allows knowledge sharing via effective adaptation of fraud detection rules while preserving privacy. The system uses taxonomies to generalize concrete values appearing in fraud detection rules to higher level concepts which conform to some privacy/utility requirements set by the owner. Our demonstration will engage the CIKM'18 audience by showing that private information can be abstracted to enforce privacy while maintaining its usage by (partially) trusted allies.
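As a rough illustration of taxonomy-based generalization, the sketch below replaces concrete values in a fraud rule with ancestors from a parent map before the rule is shared. The taxonomy, the field names, and the per-field generalization levels are all made up for this example; the demo system's actual rule language and privacy/utility requirements are not described in the abstract.

```python
def generalize(value, parent_of, levels=1):
    """Climb up to `levels` steps in the taxonomy, stopping at the root."""
    current = value
    for _ in range(levels):
        if current not in parent_of:
            break
        current = parent_of[current]
    return current


def generalize_rule(rule, parent_of, levels_by_field):
    """Generalize each field of a rule by the number of levels its owner allows."""
    return {
        field: generalize(value, parent_of, levels_by_field.get(field, 0))
        for field, value in rule.items()
    }


# Hypothetical taxonomy: child -> parent.
taxonomy = {
    "Paris": "France",
    "France": "Europe",
    "Europe": "World",
    "Online casino": "Gambling",
}

# Hypothetical rule condition owned by one institution.
rule = {"city": "Paris", "merchant_type": "Online casino"}

# The owner is willing to reveal the continent, and the exact merchant type.
shared = generalize_rule(rule, taxonomy, {"city": 2})
print(shared)
```

Choosing how many levels to climb per field is where the privacy/utility trade-off lives: more levels leak less about the owner's data but make the shared rule less specific.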
Fraud detection is of great importance because fraudulent behaviors may mislead consumers or bring huge losses to enterprises. Due to the lockstep feature of fraudulent behaviors, fraud detection problem can be viewed as finding suspicious dense blocks in the attributed bipartite graph. In reality, existing attribute-based methods are not adversarially robust, because fraudsters can take some camouflage actions to cover their behavior attributes as normal. More importantly, existing structural information based methods only consider shallow topology structure, making their effectiveness sensitive to the density of suspicious blocks. In this paper, we propose a novel deep structure learning model named DeepFD to differentiate normal users and suspicious users. DeepFD can preserve the non-linear graph structure and user behavior information simultaneously. Experimental results on different types of datasets demonstrate that DeepFD outperforms the state-of-the-art baselines.
Fraud detection is usually regarded as finding a needle in a haystack, which is a challenging task because fraudulent behaviors are buried in massive normal behaviors. Indeed, a fraudulent incident usually takes place in consecutive time steps to gain illegal benefits, which provides unique clues for probing frauds by considering a complete behavioral sequence, rather than detecting frauds from a snapshot of behaviors. Also, fraudulent behaviors may entail different parties, such that the interaction pattern between sources and targets can help distinguish frauds from normal behaviors. Therefore, in this paper, we model the attributed behavioral sequences generated from consecutive behaviors in order to capture the sequential patterns, while sequences that deviate from the pattern can be regarded as fraudulent. Considering the characteristics of behavioral sequences, we propose a novel model, HAInt-LSTM, by augmenting the traditional LSTM with a modified forget gate in which the interval time between consecutive time steps is considered. Meanwhile, we employ a self-historical attention mechanism to allow for long-time dependencies, which can help identify repeated or cyclical appearances. In addition, we encode the source information as an interaction module to enhance the learning of behavioral sequences. To validate the effectiveness of the learned sequential behavior representations, we experiment on a real-world telecommunication dataset under both supervised and unsupervised scenarios. Experimental results show that the learned representations can better identify fraudulent behaviors, and also show a clear separation from normal sequences in the lower dimensional embedding space through visualization. Last but not least, we visualize the weights of the attention mechanism to provide a rational interpretation of human behavioral periodicity.
Conducting fraud transactions has become popular among e-commerce sellers to make their products favorable to the platform and buyers, which decreases the utilization efficiency of buyer impressions and jeopardizes the business environment. Fraud detection techniques are necessary but not enough for the platform since it is impossible to recognize all the fraud transactions. In this paper, we focus on improving the platform's impression allocation mechanism to maximize its profit and reduce the sellers' fraudulent behaviors simultaneously. First, we learn a seller behavior model to predict the sellers' fraudulent behaviors from the real-world data provided by one of the largest e-commerce companies in the world. Then, we formulate the platform's impression allocation problem as a continuous Markov Decision Process (MDP) with unbounded action space. In order to make the action executable in practice and facilitate learning, we propose a novel deep reinforcement learning algorithm, DDPG-ANP, that introduces an action norm penalty to the reward function. Experimental results show that our algorithm significantly outperforms existing baselines in terms of scalability and solution quality.
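The abstract does not give the exact form of the action norm penalty, so the following is only one plausible reading: subtract a multiple of the L2 norm of the action from the environment reward, which discourages the policy from drifting toward very large actions in an unbounded action space. The function name, the coefficient `penalty_weight`, and the choice of the L2 norm are assumptions.

```python
import math


def penalized_reward(reward, action, penalty_weight=0.1):
    """Return reward - penalty_weight * ||action||_2 (an assumed penalty form)."""
    action_norm = math.sqrt(sum(a * a for a in action))
    return reward - penalty_weight * action_norm


# A large action keeps less of its raw reward than a small one.
print(penalized_reward(10.0, [3.0, 4.0]))
print(penalized_reward(10.0, [0.3, 0.4]))
```

In a DDPG-style training loop, a shaped reward of this kind would simply replace the raw reward when transitions are stored in the replay buffer.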
In e-commerce, different payment transactions have different levels of risk. Risk is generally higher for digital goods, but it also differs based on product and its popularity, the offer type (packaged game, virtual currency to a game or subscription service), storefront and geography. Existing fraud policies and models make decisions independently for each transaction based on transaction attributes, payment velocities, user characteristics, and other relevant information. However, suspicious transactions may still evade detection and hence we propose a novel approach leveraging a graph based perspective to uncover relationships among suspicious transactions, i.e., inter-transaction dependency. Our focus is to detect suspicious transactions by capturing common fraudulent behaviors that would not be considered suspicious when being considered in isolation. In this paper, we present HitFraud that leverages heterogeneous information networks for collective fraud detection by exploring correlated and fast evolving fraudulent behaviors. First, a heterogeneous information network is designed to link entities of interest in the transaction database via different semantics. Then, graph based features are efficiently discovered from the network exploiting the concept of meta-paths, and decisions on frauds are made collectively on test instances. Experiments on real-world payment transaction data from Electronic Arts demonstrate that the prediction performance is effectively boosted by HitFraud where the computation of meta-path based features is largely optimized. Notably, recall can be improved up to 7.93% and F-score 4.62% compared to baselines.
We consider the problem of anomaly detection in finance. An application of interest is the detection of first-time fraud where new classes of fraud need to be detected using unsupervised learning to augment the existing supervised learning techniques that capture known classes of frauds. This domain usually has the following requirements – (i) the ability to handle data containing both numerical and categorical features, (ii) very low latency real-time detection, and (iii) interpretability. We propose the use of a variant of density estimation trees (DETs) (Ram and Gray, 2011) for anomaly detection using distributional properties of the data. We formally present a procedure for handling data sets with both categorical and numerical features while Ram and Gray (2011) focused mainly on data sets with all numerical features. DETs have demonstrably fast prediction times, orders of magnitude faster than other density estimators like kernel density estimators. The estimation of the density and the anomalousness score for any new item can be done very efficiently. Beyond the flexibility and efficiency, DETs are also quite interpretable. For the task of anomaly detection, DETs can generate a set of decision rules that lead to high anomalous-ness scores. We empirically demonstrate these capabilities on a publicly available fraud data set.
- Detection of Money Laundering Groups: Supervised Learning on Small Networks (AAAI 2017)
- Savage et al.
Money laundering is a major global problem, enabling criminal organisations to hide their ill-gotten gains and to finance further operations. Prevention of money laundering is seen as a high priority by many governments, however detection of money laundering without prior knowledge of predicate crimes remains a significant challenge. Previous detection systems have tended to focus on individuals, considering transaction histories and applying anomaly detection to identify suspicious behaviour. However, money laundering involves groups of collaborating individuals, and evidence of money laundering may only be apparent when the collective behaviour of these groups is considered. In this paper we describe a detection system that is capable of analysing group behaviour, using a combination of network analysis and supervised learning. This system is designed for real-world application and operates on networks consisting of millions of interacting parties. Evaluation of the system using real-world data indicates that suspicious activity is successfully detected. Importantly, the system exhibits a low rate of false positives, and is therefore suitable for use in a live intelligence environment.
In this paper, we focus on fraud detection on a signed graph with only a small set of labeled training data. We propose a novel framework that combines deep neural networks and spectral graph analysis. In particular, we use the node projection (called as spectral coordinate) in the low dimensional spectral space of the graph's adjacency matrix as input of deep neural networks. Spectral coordinates in the spectral space capture the most useful topology information of the network. Due to the small dimension of spectral coordinates (compared with the dimension of the adjacency matrix derived from a graph), training deep neural networks becomes feasible. We develop and evaluate two neural networks, deep autoencoder and convolutional neural network, in our fraud detection framework. Experimental results on a real signed graph show that our spectrum based deep neural networks are effective in fraud detection.
- The Many Faces of Link Fraud (ICDM 2017)
- Shah et al.
Most past work on social network link fraud detection tries to separate genuine users from fraudsters, implicitly assuming that there is only one type of fraudulent behavior. But is this assumption true? And, in either case, what are the characteristics of such fraudulent behaviors? In this work, we set up honeypots ("dummy" social network accounts), and buy fake followers (after careful IRB approval). We report the signs of such behaviors including oddities in local network connectivity, account attributes, and similarities and differences across fraud providers. Most valuably, we discover and characterize several types of fraud behaviors. We discuss how to leverage our insights in practice by engineering strongly performing entropy-based features and demonstrating high classification accuracy. Our contributions are (a) instrumentation: we detail our experimental setup and carefully engineered data collection process to scrape Twitter data while respecting API rate-limits, (b) observations on fraud multimodality: we analyze our honeypot fraudster ecosystem and give surprising insights into the multifaceted behaviors of these fraudster types, and (c) features: we propose novel features that give strong (>0.95 precision/recall) discriminative power on ground-truth Twitter data.
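The abstract mentions entropy-based features without defining them. A minimal sketch of the general idea, assuming the feature is the Shannon entropy of some categorical attribute over an account's followers: a follower set purchased from one provider may look unusually uniform (low entropy) on attributes where organic followers vary. The attribute used here (account creation year) and the sample values are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical follower attribute: account creation year.
organic_followers = [2009, 2011, 2012, 2014, 2015, 2016, 2016, 2017]
purchased_followers = [2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016]

print(shannon_entropy(organic_followers))
print(shannon_entropy(purchased_followers))
```

One such value per attribute can then be fed, alongside other account features, into any standard classifier.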
- Uncovering Unknown Unknowns in Financial Services Big Data by Unsupervised Methodologies: Present and Future trends (KDD 2017)
- Shabat et al.
Currently, unknown unknowns in high dimensional big data environments can go unnoticed for a long period of time. The failure to detect anomalies in critical infrastructure data can result in extensive financial, operational, reputational and life threatening consequences. In this paper, we describe algorithms for automatic and unsupervised anomaly detection that do not necessitate domain expertise, signatures, rules, patterns or semantic understanding of the features. We propose several new methodologies for anomaly detection to protect critical infrastructures, with emphasis on finance, spanning from theory to actionable technology. Although anomalies can originate from several sources, we also show that cyber threat, financial and operational malfunction are converging into a single detection paradigm. A performance comparison between different algorithms (ours and others) is presented, as well as examples from real use cases.
Credit card fraud detection is an endless war between fraudsters and payment service providers. Indeed, annual global financial loss by credit card frauds has increased. Fraudsters have been organized and systematized, attempting to find weak points of existing fraud detection system (FDS). State-of-the-art FDS approaches utilize already existing fraud cases, which can result in different FDS by payment service providers. Therefore, a new payment service provider may not have room for installing a FDS due to the lack of fraudulent cases. Moreover, credit card transactions contain the legitimate owner’s personal information, which can be exposed to an honest but curious fraud analyst. In this paper, we propose a purchase density based FDS (PD-FDS) that uses three features which are not related to personal information. PD-FDS does not require already existing fraudulent transactions and also shows low false positive rate (<0.01).
In this paper, we present an automated feature engineering based approach to dramatically reduce false positives in fraud prediction. False positives plague the fraud prediction industry. It is estimated that only 1 in 5 declared as fraud are actually fraud and roughly 1 in every 6 customers have had a valid transaction declined in the past year. To address this problem, we use the Deep Feature Synthesis algorithm to automatically derive behavioral features based on the historical data of the card associated with a transaction. We generate 237 features (>100 behavioral patterns) for each transaction, and use a random forest to learn a classifier. We tested our machine learning model on data from a large multinational bank and compared it to their existing solution. On an unseen data of 1.852 million transactions, we were able to reduce the false positives by 54% and provide a savings of 190K euros. We also assess how to deploy this solution, and whether it necessitates streaming computation for real time scoring. We found that our solution can maintain similar benefits even when historical features are computed once every 7 days.
Rapid growth of modern technologies such as internet and mobile computing is bringing dramatically increased e-commerce payments, as well as an explosion in transaction fraud. Meanwhile, fraudsters are continually refining their tricks, making it difficult for rule-based fraud detection systems to handle the ever-changing fraud patterns. Many data mining and artificial intelligence methods have been proposed for identifying small anomalies in large transaction data sets, increasing detection efficiency to some extent. Nevertheless, there is always a contradiction: most methods ignore the transaction sequence, yet sequence-related methods usually cannot learn information at the single-transaction level well. In this paper, a new "within->between->within" sandwich-structured sequence learning architecture is proposed by stacking an ensemble method, a deep sequential learning method and another top-layer ensemble classifier in proper order. Moreover, an attention mechanism is introduced to further improve performance. Models with this structure have been shown to be very efficient in scenarios like fraud detection, where the information sequence is made up of vectors with complex interconnected features.
- A Survey of Credit Card Fraud Detection Techniques: Data and Technique Oriented Perspective
- Sorournejad et al.
Credit cards play a very important role in today's economy. They have become an unavoidable part of household, business and global activities. Although using credit cards provides enormous benefits when used carefully and responsibly, significant credit and financial damages may be caused by fraudulent activities. Many techniques have been proposed to confront the growth in credit card fraud. Although all of these techniques share the goal of preventing credit card fraud, each one has its own drawbacks, advantages and characteristics. In this paper, after investigating the difficulties of credit card fraud detection, we seek to review the state of the art in credit card fraud detection techniques, data sets and evaluation criteria. The advantages and disadvantages of fraud detection methods are enumerated and compared. Furthermore, a classification of the mentioned techniques into two main fraud detection approaches, namely misuse (supervised) and anomaly detection (unsupervised), is presented. In addition, a classification of techniques is proposed based on their capability to process numerical and categorical data sets. Different data sets used in the literature are then described and grouped into real and synthesized data, and the effective and common attributes are extracted for further usage. Moreover, evaluation criteria employed in the literature are collected and discussed. Finally, open issues for credit card fraud detection are explained as guidelines for new researchers.