论文标题
结合加入基数估计的学位序列
Degree Sequence Bound For Join Cardinality Estimation
论文作者
论文摘要
最近的工作表明,较差的基数估计对查询处理时间的灾难性影响。特别是,低估的查询基数可能会导致过度乐观的查询计划,该计划的数量级比真正的基数生成的订单更长。通过使用有关数据库的统计信息,例如表格和学位,即值频率,基础性边界通过计算查询输出大小的严格上限来避免这种陷阱。在本文中,我们通过证明一个名为“度序列结合”的新颖结合来扩展这一工作,该序列结合了整个度序列和最大元组多重性。这一结合对以前的工作进行了改进,其中包含了侧重于最高程度而不是程度序列的学位约束。此外,我们描述了如何使用真实程度序列的学习近似值实际计算这种绑定。
Recent work has demonstrated the catastrophic effects of poor cardinality estimates on query processing time. In particular, underestimating query cardinality can result in overly optimistic query plans which take orders of magnitude longer to complete than one generated with the true cardinality. Cardinality bounding avoids this pitfall by computing a strict upper bound on the query's output size using statistics about the database such as table sizes and degrees, i.e. value frequencies. In this paper, we extend this line of work by proving a novel bound called the Degree Sequence Bound which takes into account the full degree sequences and the max tuple multiplicity. This bound improves upon previous work incorporating degree constraints which focused on the maximum degree rather than the degree sequence. Further, we describe how to practically compute this bound using a learned approximation of the true degree sequences.