Efficient Frequent Subtree Mining Beyond Forests
ebook ∣ Dissertations in Artificial Intelligence
By Pascal Welke
Sign up to save your library
With an OverDrive account, you can save your favorite libraries for at-a-glance information about availability. Find out more about OverDrive accounts.
Find this title in Libby, the library reading app by OverDrive.

Search for a digital library with this title
Title found at these libraries:
Library Name | Distance |
---|---|
Loading... |
A common paradigm in distance-based learning is to embed the instance space into a feature space equipped with a metric and define the dissimilarity between instances by the distance of their images in the feature space. Frequent connected subgraphs are sometimes used to define such feature spaces if the instances are graphs, but identifying the set of frequent connected subgraphs and subsequently computing embeddings for graph instances is computationally intractable. As a result, existing frequent subgraph mining algorithms either restrict the structural complexity of the instance graphs or require exponential delay between the output of subsequent patterns, meaning that distance-based learners lack an efficient way to operate on arbitrary graph data.
This book presents a mining system that gives up the demand on the completeness of the pattern set, and instead guarantees a polynomial delay between subsequent patterns. To complement this, efficient methods devised to compute the embedding of arbitrary graphs into the Hamming space spanned by the pattern set are described. As a result, a system is proposed that allows the efficient application of distance-based learning methods to arbitrary graph databases. In addition to an introduction and conclusion, the book is divided into chapters covering: preliminaries; related work; probabilistic frequent subtrees; boosted probabilistic frequent subtrees; and fast computation, with a further two chapters on Hamiltonian path for cactus graphs and Poisson binomial distribution.