Samuel Marchal

Helsinki, Uusimaa, Finland
655 followers 500+ connections

About

I am the Research Team Leader for the Network Security research group at VTT Technical…

Experience & Education

  • VTT

Publications

  • Mitigating Mimicry Attacks Against the Session Initiation Protocol

    IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT

    The U.S. National Academies of Science’s Board on Science, Technology and Economic Policy estimates that the Internet and voice-over-IP (VoIP) communications infrastructure generates 10% of U.S. economic growth. As market forces move increasingly towards Internet and VoIP communications, there is a proportional increase in telephony denial of service (TDoS) attacks. Like denial of service (DoS) attacks, TDoS attacks seek to disrupt business and commerce by directing a flood of anomalous traffic towards key communication servers. In this work, we focus on a new class of anomalous traffic that exhibits a mimicry TDoS attack. Such an attack can be launched by crafting malformed messages with small changes from normal ones. We show that such malicious messages easily bypass intrusion detection systems (IDS) and drastically degrade the goodput of the server by forcing it to parse the message looking for the needed token. Our approach is not to parse at all; instead, we use multiple classifier systems (MCS) to exploit the strength of multiple learners to predict the true class of a message with high probability (98.50% ≤ p ≤ 99.12%). We proceed systematically by first formulating an optimization problem of picking the minimum number of classifiers such that their combination yields the optimal classification performance. Next, we analytically bound the maximum performance of such a system and empirically demonstrate that it is possible to attain close to the maximum theoretical performance across varied datasets. Finally, guided by our analysis, we construct an MCS appliance that demonstrates superior classification accuracy with O(1) runtime complexity across varied datasets.

  • PhishScore: Hacking Phishers’ Minds

    Proceedings of the 10th International Conference on Network and Service Management

    Despite the growth of prevention techniques, phishing remains an important threat since the principal countermeasures in use are still based on reactive URL blacklisting. This technique is inefficient due to the short lifetime of phishing Web sites, making recent approaches that rely on real-time or proactive phishing URL detection more appropriate. In this paper we introduce PhishScore, an automated real-time phishing detection system. We observed that phishing URLs usually have few relationships between the part of the URL that must be registered (upper level domain) and the remaining part of the URL (low level domain, path, query). Hence, we define this concept as intra-URL relatedness and evaluate it using features extracted from the words that compose a URL, based on query data from the Google and Yahoo search engines. These features are then used in machine learning based classification to detect phishing URLs from a real dataset.

  • A Big Data Architecture for Large Scale Security Monitoring

    Proceedings of the 3rd IEEE Congress on Big Data

    Network traffic is a rich source of information for security monitoring. However, the increasing volume of data to process raises issues, rendering holistic analysis of network traffic difficult. In this paper we propose a solution to cope with the tremendous amount of data to analyse for security monitoring purposes. We introduce an architecture dedicated to security monitoring of local enterprise networks. The application domain of such a system is mainly network intrusion detection and prevention, but it can be used as well for forensic analysis. This architecture integrates two systems, one dedicated to scalable distributed data storage and management and the other dedicated to data exploitation. DNS data, NetFlow records, HTTP traffic and honeypot data are mined and correlated in a distributed system that leverages state-of-the-art big data solutions. Data correlation schemes are proposed and their performance is evaluated against several well-known big data frameworks including Hadoop and Spark.

  • Advanced Detection Tool for PDF Threats

    Proceedings of the 6th International Workshop on Autonomous and Spontaneous Security

    In this paper we introduce an efficient application for malicious PDF detection: ADEPT. With targeted attacks rising over the recent past, exploring a new detection and mitigation paradigm becomes mandatory. The use of malicious PDF files that exploit vulnerabilities in well-known PDF readers has become a popular vector for targeted attacks, for which few efficient approaches exist. Although simple in theory, parsing followed by analysis of such files is resource-intensive and may even be impossible due to several obfuscation and reader-specific artifacts. Our paper describes a new approach for detecting such malicious payloads that leverages machine learning techniques and an efficient feature selection mechanism for rapidly detecting anomalies. We assess our approach on a large selection of malicious files and report the experimental performance results for the developed prototype.

  • Semantic based DNS Forensics

    Proceedings of the IEEE International Workshop on Information Forensics and Security - WIFS'12

    In network level forensics, the Domain Name System (DNS) is a rich source of information. This paper describes a new approach to mine DNS data for forensic purposes. We propose a new technique that leverages semantic and natural language processing tools in order to analyze large volumes of DNS data. The main research novelty consists in detecting malicious and dangerous domain names by evaluating their semantic similarity with already known names. This process can provide valuable information for reconstructing network and user activities. We show the efficiency of the method on real experimental datasets gathered from a national passive DNS system.

  • Proactive discovery of phishing related domain names

    Proceedings of the 15th International Symposium in Research in Attacks, Intrusions and Defences - RAID 2012

    Phishing is an important security issue for the Internet, with a significant economic impact. The main solution to counteract this threat is currently reactive blacklisting; however, as phishing attacks are mainly performed over short periods of time, reactive methods are too slow. As a result, new approaches to identify malicious websites early are needed. In this paper a new proactive discovery of phishing related domain names is introduced. We mainly focus on the automated detection of possible domain registrations for malicious activities. We leverage techniques coming from natural language modelling in order to build proactive blacklists. The entries in these lists are built using language models and vocabularies encountered in phishing related activities - “secure”, “banking”, brand names, etc. Once a proactive blacklist is created, ongoing and daily monitoring of only these domains can lead to the efficient detection of phishing web sites.

  • Large Scale DNS analysis

    Proceedings of the 6th International Conference on Autonomous Infrastructure, Management and Security - AIMS 2012 – Ph.D. Student Workshop

    In this paper we present an architecture for large scale DNS monitoring. The analysis of DNS traffic is becoming critically important, as it allows monitoring of most interactions on the Internet. DNS traffic can reveal anomalies such as worm infected hosts, botnets or spam participating hosts. The efficiency and speed of detection of such anomalies rely on the capacity of the DNS monitoring system to quickly process huge quantities of data. We propose a system that leverages distributed processing and storage facilities.

  • Semantic exploration of DNS

    Proceedings of the 11th IFIP/TC6 Networking 2012 Conference - Networking 2012

    The DNS structure discloses useful information about the organization and the operation of an enterprise network, which can be used for designing attacks as well as monitoring domains supporting malicious activities. Thus, this paper introduces a new method for exploring DNS domains. Our previous work described a tool to generate existing DNS names accurately in order to probe a domain automatically; here the approach is extended by leveraging semantic analysis of domain names. In particular, the semantic distributional similarity and relatedness of sub-domains are considered as well as sequential patterns. The evaluation shows that the discovery is highly improved while the overhead remains low, compared with non-semantic DNS probing tools, including ours and others.

  • DNSSM: A large scale passive DNS security monitoring framework

    Proceedings of the IEEE/IFIP Network Operations and Management Symposium - NOMS 2012

    We present a monitoring approach and the supporting software architecture for passive DNS traffic. Monitoring DNS traffic can reveal essential network and system level activity profiles. Worm infected and botnet participating hosts can be identified and malicious backdoor communications can be detected. Any passive DNS monitoring solution needs to address several challenges that range from architectural approaches for dealing with large volumes of data up to specific data mining approaches for this purpose. We describe a framework that leverages state-of-the-art distributed processing facilities with clustering techniques in order to detect anomalies in both online and offline DNS traffic. This framework, entitled DNSSM, is implemented and operational on several networks. We validate the framework against two large trace sets.


Languages

  • French

    Native or bilingual proficiency

  • English

    Full professional proficiency
