TraMineR is an R package for exploring and rendering categorical sequence data, such as sequences describing family life trajectories or professional careers. The "traminer" tag is intended for questions about using TraMineR, including data preparation and output handling, and about its companion packages TraMineRextras, WeightedCluster, and PST.

TraMineR is an R package for mining, describing, and visualizing sequences of states or events and, more generally, discrete sequential data. Its primary aim is the analysis of biographical longitudinal data in the social sciences, such as data describing careers or family trajectories. Most of its features also apply to non-temporal data such as text or DNA sequences. The package includes:

  • Handling of longitudinal data and conversion between various sequence formats
  • Plotting sequences (density plot, frequency plot, index plot, etc.)
  • Individual longitudinal characteristics of sequences (length, time in each state, longitudinal entropy, turbulence, complexity, etc.)
  • Sequence transversal characteristics by age point (transversal state distribution, transversal entropy, modal state)
  • Other aggregated characteristics (transition rates, average duration in each state, sequence frequency)
  • Dissimilarities between pairs of sequences (optimal matching, longest common subsequence, Hamming, dynamic Hamming, multichannel, etc.)
  • Centro-type and heterogeneity measures of a set of sequences
  • Discovering and plotting representative sequences
  • ANOVA-like analysis of sequences and tree-structured ANOVA from dissimilarities
  • Extracting frequent event subsequences
  • Identifying most discriminating event subsequences
  • Association rules between subsequences
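
A few of the features listed above can be sketched on a small example. The snippet below is only an illustration, not an official recipe: the `toy` data frame, its state codes (E, W, U), and all `toy.*` object names are made up for this sketch, and it assumes the TraMineR package is installed. It builds a state sequence object, computes one longitudinal and one transversal summary, and derives two dissimilarity matrices.

```r
library(TraMineR)

# Hypothetical toy data: 4 individuals observed at 5 time points.
# State codes are invented for this sketch:
# E = education, W = work, U = unemployed.
toy <- data.frame(
  t1 = c("E", "E", "E", "W"),
  t2 = c("E", "E", "W", "W"),
  t3 = c("W", "E", "W", "U"),
  t4 = c("W", "W", "W", "U"),
  t5 = c("W", "W", "U", "W"),
  stringsAsFactors = FALSE
)

# Build a state sequence object from the five columns.
toy.seq <- seqdef(toy, var = 1:5,
                  alphabet = c("E", "W", "U"),
                  labels = c("Education", "Work", "Unemployed"))

# Plotting: state distribution (density) plot.
seqdplot(toy.seq)

# Individual longitudinal characteristic: entropy of each sequence.
toy.ient <- seqient(toy.seq)

# Transversal characteristic: state distribution at each position.
toy.statd <- seqstatd(toy.seq)

# Pairwise Hamming dissimilarities (sequences have equal length here).
toy.ham <- seqdist(toy.seq, method = "HAM")

# Pairwise optimal matching distances, using a constant substitution
# cost of 2 and an indel cost of 1 (values chosen arbitrarily).
toy.costs <- seqsubm(toy.seq, method = "CONSTANT", cval = 2)
toy.om <- seqdist(toy.seq, method = "OM", indel = 1, sm = toy.costs)
```

Both `toy.ham` and `toy.om` are 4 x 4 matrices that can be passed on to clustering or to the dissimilarity-based analyses mentioned in the list. With real data, the cost settings deserve more thought than the arbitrary values used here.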

Resources: