All Questions

1 vote
0 answers
45 views

Comparing log-loss values for a probabilistic suffix tree?

In the PST package one can estimate the prediction quality of individual sequences using the log-loss, e.g.: R> ex2 <- c("a-a-b", "a-b-a-a-b", "b-b-b-b-a") R> ex2 <- seqdef(ex2) R> ...
histelheim • 5,018

1 vote
1 answer
86 views

Meaning of lag parameter in PST?

In the pmine() function in PST you can use lags. What is this lag? Does it mean that the first lag positions in the sequence are ignored? Or does it mean that lags are allowed within the subsequences?...
histelheim • 5,018

1 vote
1 answer
55 views

What is the meaning of alpha in the context of an information gain pruning function?

In the PST package we use the value C as a cut-off for the information gain function used to prune the tree. The C value for an alpha of 0.05 is calculated as follows: C95 <- qchisq(0.95, 1) / 2 ...
histelheim • 5,018

2 votes
1 answer
118 views

Fitting a VLMC to very long sequences

I am trying to fit a VLMC to a dataset where the longest sequence is 296 states. I do it as shown below: # Load libraries library(PST) library(RCurl) library(TraMineR) # Load and transform data x &...
histelheim • 5,018

2 votes
1 answer
98 views

Predicting conditional probabilities based on contexts with only 1 state

It seems that PST cannot predict the conditional probabilities of the next state after contexts that consist of a single state, e.g. EX-EX. Consider this code: # Load libraries library(RCurl) ...
histelheim • 5,018

2 votes
1 answer
57 views

Calculate lift for context-state relationship in a probabilistic suffix tree?

PST gives me probabilities and conditional probabilities for various contexts and following states. However, it would be very helpful to be able to calculate the lift (and its significance) of the ...
histelheim • 5,018

2 votes
1 answer
273 views

Where in the sequence of a Probabilistic Suffix Tree does "e" occur?

In my data, missing values (*) occur only on the right side of the sequences. That means that no sequence starts with * and no sequence has any other markers after *. Despite this the PST (...
histelheim • 5,018

2 votes
2 answers
107 views

Getting log-likelihood from probabilistic suffix tree

Here is my code: library(RCurl) library(TraMineR) library(PST) x <- getURL("https://gist.githubusercontent.com/aronlindberg/08228977353bf6dc2edb3ec121f54a29/raw/...
histelheim • 5,018