Class StringTrigramModel

  • All Implemented Interfaces:
    TextGenerativeModel

    public class StringTrigramModel
    extends MarkovModel<Trip<java.lang.String>>
    implements TextGenerativeModel
    An extension of MarkovModel intended specifically for text generation. Very similar to StringMarkovModel, except that it works with transitions between sequences of three strings rather than between single strings. It produces sequences more closely related to the original training set, but at high risk of over-fitting unless it is trained on a very large set of string sequences.
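    As a rough illustration of what "sequences of three strings" means, the sketch below slides a window of three tokens over a sequence. The class name TrigramWindows and the choice of overlapping windows are assumptions made for this example; this page does not show how the library builds its Trip states.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the documented API.
class TrigramWindows {

    /** Returns every run of three consecutive tokens, in order of appearance. */
    static List<List<String>> trigrams(List<String> tokens) {
        List<List<String>> result = new ArrayList<>();
        // Stop as soon as fewer than three tokens remain.
        for (int i = 0; i + 3 <= tokens.size(); i++) {
            result.add(new ArrayList<>(tokens.subList(i, i + 3)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "cat", "sat", "on", "the", "mat");
        // prints [[the, cat, sat], [cat, sat, on], [sat, on, the], [on, the, mat]]
        System.out.println(trigrams(tokens));
    }
}
```

    A six-token sentence yields only four triples, and each triple is far rarer than any single word, which is one way to see why a trigram model needs much more training text than a single-string model.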
    • Method Summary

      Modifier and Type Method Description
      static StringTrigramModel fromStrings(java.util.Collection<java.lang.String> strings, java.lang.String regex)
      A static factory method that creates a model from a collection of strings, each of which is split into a state sequence with the given regex, using a default start and end string
      static StringTrigramModel fromStrings(java.util.Collection<java.lang.String> strings, java.lang.String regex, java.lang.String startString, java.lang.String endString)
      A static factory method that creates a model from a collection of strings, each of which is split into a state sequence with the given regex, using the given start and end strings
      java.lang.String generateString()
      Generates a random string sequence using the model's probabilities of transitioning from a given string triple to another
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Method Detail

      • fromStrings

        public static StringTrigramModel fromStrings(java.util.Collection<java.lang.String> strings,
                                                     java.lang.String regex,
                                                     java.lang.String startString,
                                                     java.lang.String endString)
        A static factory method that creates a model from a collection of strings, each of which is split into a state sequence with the given regex, using the given start and end strings
        Parameters:
        strings - the collection of string state sequences
        regex - the regex used to split the strings into actual state sequences
        startString - a unique string, not seen anywhere else in the sequences, used internally to indicate the start of a sentence
        endString - a unique string, not seen anywhere else in the sequences, used internally to indicate the end of a sentence
        Returns:
        a StringTrigramModel used for generating string sequences according to the transition probabilities of string triples calculated from the given collection of string sequences
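        The parameters above can be pictured with the following sketch, which splits one string with a regex and wraps the resulting tokens in start and end markers. The class PaddedSplit, the method toStates, and the markers "<s>" and "</s>" are hypothetical; the library's actual padding scheme and its default markers are not documented on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the documented API.
class PaddedSplit {

    /** Splits text with the regex and surrounds the tokens with the given markers. */
    static List<String> toStates(String text, String regex, String start, String end) {
        List<String> states = new ArrayList<>();
        states.add(start);
        for (String token : text.split(regex)) {
            // String.split can yield an empty leading token; skip it.
            if (!token.isEmpty()) {
                states.add(token);
            }
        }
        states.add(end);
        return states;
    }

    public static void main(String[] args) {
        // prints [<s>, the, cat, sat, </s>]
        System.out.println(toStates(" the cat sat", "\\s+", "<s>", "</s>"));
    }
}
```

        The markers must not occur in the training text itself. Otherwise a real token would be indistinguishable from a sentence boundary.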
      • fromStrings

        public static StringTrigramModel fromStrings(java.util.Collection<java.lang.String> strings,
                                                     java.lang.String regex)
        A static factory method that creates a model from a collection of strings, each of which is split into a state sequence with the given regex, using a default start and end string
        Parameters:
        strings - the collection of string state sequences
        regex - the regex used to split the strings into actual state sequences
        Returns:
        a StringTrigramModel used for generating string sequences according to the transition probabilities of string triples calculated from the given collection of string sequences
      • generateString

        public java.lang.String generateString()
        Generates a random string sequence using the model's probabilities of transitioning from a given string triple to another
        Specified by:
        generateString in interface TextGenerativeModel
        Returns:
        a generated string sequence concatenated into a single string
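        One way to picture generation is as a random walk over triples, sketched below. The class TripleWalk, its methods, the overlapping-window transitions, and the space-joined output are all assumptions made for illustration; the library's own sampling and concatenation logic may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch, not the library's implementation.
class TripleWalk {
    // Observed successors of each triple. Repeated entries are kept so that
    // uniform sampling from the list is proportional to observed frequency.
    private final Map<List<String>, List<List<String>>> successors = new HashMap<>();
    private final Random random;

    TripleWalk(Random random) {
        this.random = random;
    }

    /** Records a transition from each triple to the triple one token later. */
    void train(List<String> states) {
        for (int i = 0; i + 4 <= states.size(); i++) {
            List<String> from = new ArrayList<>(states.subList(i, i + 3));
            List<String> to = new ArrayList<>(states.subList(i + 1, i + 4));
            successors.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        }
    }

    /** Walks from the first triple until the end marker, a dead end, or maxSteps. */
    String generate(List<String> first, String end, int maxSteps) {
        List<String> current = first;
        StringBuilder out = new StringBuilder(String.join(" ", first));
        for (int step = 0; step < maxSteps; step++) {
            List<List<String>> options = successors.get(current);
            if (options == null) {
                break; // no observed transition out of this triple
            }
            current = options.get(random.nextInt(options.size()));
            String newest = current.get(2); // the only token the new triple adds
            if (newest.equals(end)) {
                break;
            }
            out.append(' ').append(newest);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        TripleWalk walk = new TripleWalk(new Random());
        walk.train(Arrays.asList("the", "cat", "sat", "on", "the", "mat", "</s>"));
        // Every triple here has exactly one successor, so this prints: the cat sat on the mat
        System.out.println(walk.generate(Arrays.asList("the", "cat", "sat"), "</s>", 10));
    }
}
```

        With a single training sentence the walk can only reproduce that sentence, which is the over-fitting risk noted in the class description in miniature.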