We describe efficient algorithms for accurately estimating the number of matches of a small
node-label tree, i.e., a twig, in a large node-label tree, using a summary data structure.
The problem is of interest for queries on XML and other hierarchical data, to provide query
feedback and for cost-based query optimization. Our summary data structure represents approximate
frequency information about twiglets, i.e., small twigs, in the data tree. Given a twig
query, the number of matches is estimated by creating a set of query twiglets, and combining
two complementary approaches; set hashing, used to estimate the number of matches of each
query twiglet, and maximal overlap, used to combine the query twiglet estimates into an
estimate for the twig query. We propose several estimation algorithms that apply these
approaches on query twiglets formed using variations on different twiglet decomposition
techniques. We present an extensive experimental evaluation using several real XML data
sets, with a variety of twig queries. Our results demonstrate that accurate and robust
estimates can be achieved, even with limited space.