1-4hit |
The issues of comparing the similarity or dissimilarity (distance) between structures have been studied. Especially, several distances between trees and their efficient algorithms have been proposed. However, all of the tree distances are defined based on mapping between vertices only, and they are helpless to compare the tree structures whose vertices and edges hold information. In this paper, we will propose a mapping between rooted and unordered trees based on vertex translation and edge translation, and then define a distance based on proposed mapping, and develop an efficient algorithm for computing proposed distance. Proposed distance can be used to compare the similarity or distance between two natural language sentences.
This paper discusses the largest common similar substructure (in short, LCSS) problem for trees. The problem is, for all pairs of "substructure of A and that of B," to find one of them, denoted by A and B', such that A is most similar to B' and the sum of the number of vertices of A and that of B' is largest. An algorithm for the LCSS problem for unrooted and unordered trees (in short, trees) and that for trees embedded in a plane (in short, Co-trees) are proposed. The time complexity of the algorithm for trees is O (max (ma, mb)2 NaNb) and that for CO-trees is O (mambNaNb), where, ma (mb) and Na (Nb) are the largest degree of a vertex of tree Ta (Tb) and the number of vertices of Ta (Tb), respectively. It is easy to modify the algorithms for enumerating all of the LCSSs for trees and CO-trees. The algorithms can be applied to structure-activity studies in chemistry and various structure comparison problems.
Shaoming LIU Eiichi TANAKA Sumio MASUDA
Several distances between trees have been proposed. However, most of the reports on distances have dealt with rooted and ordered trees. This paper proposes two distances between unrooted and cyclically ordered trees (CO-trees) and their computing methods. A CO-tree is a tree embedded in a plane. These distances are defined based on Tai's mapping (TM) and a strongly structure preserving mapping (SSPM) between CO-trees. The time complexities to compute the distances between two CO-trees Ta and Tb are OT (N 2aN 2b) for the distance based on a TM and OT(mambNaNb) for that on an SSPM, respectively, where ma(mb) and Na(Nb) are the largest degree of a vertex and the number of vertices of Ta(Tb), respectively. The space complexities of both methods are Os(NaNb). Those distances can be applied to the clustering of CO-trees.
This paper discusses the problems of largest similar substructures (in short, LSS) in rooted and unordered trees (in short, R-trees) and those in unrooted and unordered trees (in short, trees). For two R-trees (or trees) Ta and Tb, LSS in Tb to Ta is defined, and two algorithms for finding one of the LSSs for R-trees and that for trees are proposed. The time and space complexities of both algorithms are OT (m3NaNb) and OS(mNaNb), respectively, where m is the largest degree of a vertex of Ta and Tb, and Na(Nb)is the number of vertices of Ta(Tb).