Showing 1–17 of 17 results for author: Bontempi, G

  1. arXiv:2401.05386  [pdf, ps, other]

    eess.SP cs.HC cs.LG

    EMG subspace alignment and visualization for cross-subject hand gesture classification

    Authors: Martin Colot, Cédric Simar, Mathieu Petieau, Ana Maria Cebolla Alvarez, Guy Cheron, Gianluca Bontempi

    Abstract: Electromyograms (EMG)-based hand gesture recognition systems are a promising technology for human/machine interfaces. However, one of their main limitations is the long calibration time that is typically required to handle new users. The paper discusses and analyses the challenge of cross-subject generalization thanks to an original dataset containing the EMG signals of 14 human subjects during ha…

    Submitted 18 December, 2023; originally announced January 2024.

    Comments: 8 pages + 1 appendix page, 6 figures (one in appendix). Published in the Adapting to Change: Reliable Learning Across Domains workshop at ECML-PKDD 2023

  2. arXiv:2312.07206  [pdf, other]

    cs.LG

    A churn prediction dataset from the telecom sector: a new benchmark for uplift modeling

    Authors: Théo Verhelst, Denis Mercier, Jeevan Shrestha, Gianluca Bontempi

    Abstract: Uplift modeling, also known as individual treatment effect (ITE) estimation, is an important approach for data-driven decision making that aims to identify the causal impact of an intervention on individuals. This paper introduces a new benchmark dataset for uplift modeling focused on churn prediction, coming from a telecom company in Belgium, Orange Belgium. Churn, in this context, refers to cust…

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: 8 pages, 2 figures, 5 tables, post-proceedings of the ECML PKDD 2023 Workshop on Uplift Modeling and Causal Machine Learning for Operational Decision Making

  3. arXiv:2311.18746  [pdf, other]

    cs.LG

    A data-science pipeline to enable the Interpretability of Many-Objective Feature Selection

    Authors: Uchechukwu F. Njoku, Alberto Abelló, Besim Bilalli, Gianluca Bontempi

    Abstract: Many-Objective Feature Selection (MOFS) approaches use four or more objectives to determine the relevance of a subset of features in a supervised learning task. As a consequence, MOFS typically returns a large set of non-dominated solutions, which have to be assessed by the data scientist in order to proceed with the final choice. Given the multi-variate nature of the assessment, which may include…

    Submitted 30 November, 2023; originally announced November 2023.

    Comments: 8 pages, 5 figures, 6 tables

  4. arXiv:2310.02029  [pdf, other]

    cs.LG

    Between accurate prediction and poor decision making: the AI/ML gap

    Authors: Gianluca Bontempi

    Abstract: Intelligent agents rely on AI/ML functionalities to predict the consequence of possible actions and optimise the policy. However, the effort of the research community in addressing prediction accuracy has been so intense (and successful) that it created the illusion that the more accurate the learner prediction (or classification) the better would have been the final decision. Now, such an assumpt…

    Submitted 3 October, 2023; originally announced October 2023.

    Comments: Position paper presented at the BENELEARN 2022 conference

  5. arXiv:2309.12036  [pdf, other]

    cs.LG

    Uplift vs. predictive modeling: a theoretical analysis

    Authors: Théo Verhelst, Robin Petit, Wouter Verbeke, Gianluca Bontempi

    Abstract: Despite the growing popularity of machine-learning techniques in decision-making, the added value of causal-oriented strategies with respect to pure machine-learning approaches has rarely been quantified in the literature. These strategies are crucial for practitioners in various domains, such as marketing, telecommunications, health care and finance. This paper presents a comprehensive treatment…

    Submitted 21 September, 2023; originally announced September 2023.

    Comments: 46 pages, 6 figures

  6. Adversarial Learning in Real-World Fraud Detection: Challenges and Perspectives

    Authors: Daniele Lunghi, Alkis Simitsis, Olivier Caelen, Gianluca Bontempi

    Abstract: Data economy relies on data-driven systems and complex machine learning applications are fueled by them. Unfortunately, however, machine learning models are exposed to fraudulent activities and adversarial attacks, which threaten their security and trustworthiness. In the last decade or so, the research interest on adversarial machine learning has grown significantly, revealing how learning applic…

    Submitted 3 July, 2023; originally announced July 2023.

  7. arXiv:2304.05982  [pdf, other]

    cs.NI

    Traffic Modeling with SUMO: a Tutorial

    Authors: Davide Andrea Guastella, Gianluca Bontempi

    Abstract: This paper presents a step-by-step guide to generating and simulating a traffic scenario using the open-source simulation tool SUMO. It introduces the common pipeline used to generate a synthetic traffic model for SUMO, how to import existing traffic data into a model to achieve accuracy in traffic simulation (that is, producing a traffic model which dynamics is similar to the real one). It also d…

    Submitted 1 March, 2023; originally announced April 2023.

  8. arXiv:2211.07264  [pdf, other]

    cs.LG

    Partial counterfactual identification and uplift modeling: theoretical results and real-world assessment

    Authors: Théo Verhelst, Denis Mercier, Jeevan Shrestha, Gianluca Bontempi

    Abstract: Counterfactuals are central in causal human reasoning and the scientific discovery process. The uplift, also called conditional average treatment effect, measures the causal effect of some action, or treatment, on the outcome of an individual. This paper discusses how it is possible to derive bounds on the probability of counterfactual statements based on uplift terms. First, we derive some origin…

    Submitted 14 November, 2022; originally announced November 2022.

  9. arXiv:2107.09323  [pdf, other]

    cs.LG cs.CR

    Transfer Learning for Credit Card Fraud Detection: A Journey from Research to Production

    Authors: Wissam Siblini, Guillaume Coter, Rémy Fabry, Liyun He-Guelton, Frédéric Oblé, Bertrand Lebichot, Yann-Aël Le Borgne, Gianluca Bontempi

    Abstract: The dark face of digital commerce generalization is the increase of fraud attempts. To prevent any type of attacks, state-of-the-art fraud detection systems are now embedding Machine Learning (ML) modules. The conception of such modules is only communicated at the level of research and papers mostly focus on results for isolated benchmark datasets and metrics. But research is only a part of the jo…

    Submitted 4 November, 2021; v1 submitted 20 July, 2021; originally announced July 2021.

  10. Streaming Active Learning Strategies for Real-Life Credit Card Fraud Detection: Assessment and Visualization

    Authors: Fabrizio Carcillo, Yann-Aël Le Borgne, Olivier Caelen, Gianluca Bontempi

    Abstract: Credit card fraud detection is a very challenging problem because of the specific nature of transaction data and the labeling process. The transaction data is peculiar because they are obtained in a streaming fashion, they are strongly imbalanced and prone to non-stationarity. The labeling is the outcome of an active learning process, as every day human investigators contact only a small number of…

    Submitted 20 April, 2018; originally announced April 2018.

    Journal ref: International Journal of Data Science and Analytics 2018

  11. SCARFF: a Scalable Framework for Streaming Credit Card Fraud Detection with Spark

    Authors: Fabrizio Carcillo, Andrea Dal Pozzolo, Yann-Aël Le Borgne, Olivier Caelen, Yannis Mazzer, Gianluca Bontempi

    Abstract: The expansion of the electronic commerce, together with an increasing confidence of customers in electronic payments, makes of fraud detection a critical factor. Detecting frauds in (nearly) real time setting demands the design and the implementation of scalable learning techniques able to ingest and analyse massive amounts of streaming data. Recent advances in analytics and the availability of op…

    Submitted 26 September, 2017; originally announced September 2017.

    Journal ref: Information Fusion 41C (2018) pp. 182-194

  12. arXiv:1709.02327  [pdf, other]

    cs.DC cs.LG stat.ML

    Feature selection in high-dimensional dataset using MapReduce

    Authors: Claudio Reggiani, Yann-Aël Le Borgne, Gianluca Bontempi

    Abstract: This paper describes a distributed MapReduce implementation of the minimum Redundancy Maximum Relevance algorithm, a popular feature selection method in bioinformatics and network inference problems. The proposed approach handles both tall/narrow and wide/short datasets. We further provide an open source implementation based on Hadoop/Spark, and illustrate its scalability on datasets involving mil…

    Submitted 7 September, 2017; originally announced September 2017.

    MSC Class: 68W15; ACM Class: D.1.3

  13. arXiv:1611.02118  [pdf, other]

    cs.CY

    OpenTED Browser: Insights into European Public Spendings

    Authors: Yann-Aël Le Borgne, Adriana Homolova, Gianluca Bontempi

    Abstract: We present the OpenTED browser, a Web application allowing to interactively browse public spending data related to public procurements in the European Union. The application relies on Open Data recently published by the European Commission and the Publications Office of the European Union, from which we imported a curated dataset of 4.2 million contract award notices spanning the period 2006-2015.…

    Submitted 16 September, 2016; originally announced November 2016.

    Comments: ECML, PKDD, SoGood workshop 2016

  14. arXiv:1412.6285  [pdf, other]

    cs.LG cs.AI stat.ML

    From dependency to causality: a machine learning approach

    Authors: Gianluca Bontempi, Maxime Flauder

    Abstract: The relationship between statistical dependency and causality lies at the heart of all statistical approaches to causal inference. Recent results in the ChaLearn cause-effect pair challenge have shown that causal directionality can be inferred with good accuracy also in Markov indistinguishable configurations thanks to data driven approaches. This paper proposes a supervised machine learning appro…

    Submitted 19 December, 2014; originally announced December 2014.

    Comments: submitted to JMLR

  15. arXiv:1408.2430  [pdf, other]

    cs.IR cs.CL

    Optimizing Component Combination in a Multi-Indexing Paragraph Retrieval System

    Authors: Boris Iolis, Gianluca Bontempi

    Abstract: We demonstrate a method to optimize the combination of distinct components in a paragraph retrieval system. Our system makes use of several indices, query generators and filters, each of them potentially contributing to the quality of the returned list of results. The components are combined with a weighed sum, and we optimize the weights using a heuristic optimization algorithm. This allows us to…

    Submitted 11 August, 2014; originally announced August 2014.

    Comments: 5 pages, 1 figure, unpublished

    MSC Class: 68T50

  16. arXiv:1108.3259  [pdf, other]

    stat.ML cs.AI cs.LG stat.AP

    A review and comparison of strategies for multi-step ahead time series forecasting based on the NN5 forecasting competition

    Authors: Souhaib Ben Taieb, Gianluca Bontempi, Amir Atiya, Antti Sorjamaa

    Abstract: Multi-step ahead forecasting is still an open challenge in time series forecasting. Several approaches that deal with this complex problem have been proposed in the literature but an extensive comparison on a large number of tasks is still missing. This paper aims to fill this gap by reviewing existing strategies for multi-step ahead forecasting and comparing them in theoretical and practical term…

    Submitted 16 August, 2011; originally announced August 2011.

  17. Distributed Principal Component Analysis for Wireless Sensor Networks

    Authors: Yann-Aël Le Borgne, Sylvain Raybaud, Gianluca Bontempi

    Abstract: The Principal Component Analysis (PCA) is a data dimensionality reduction technique well-suited for processing data from sensor networks. It can be applied to tasks like compression, event detection, and event recognition. This technique is based on a linear transform where the sensor measurements are projected on a set of principal components. When sensor measurements are correlated, a small se…

    Submitted 9 March, 2010; originally announced March 2010.

    Journal ref: Sensors 2008, 8(8), 4821-4850