How (and where) should I combine one-to-many relationships?

0

I have a user table and a number of dependent tables, each in a one-to-many relationship with it, e.g. an email table, an address table and a groups table (i.e. one user can have multiple email addresses and physical addresses, and can be a member of many groups).

Is it better to:

  1. Join all these tables, and process the heap of data in code,

  2. Use something like GROUP_CONCAT and return one row, and split apart the fields in code,

  3. Or query each table independently?

Thanks.

  • I do not think there is any good generic answer to this question.

    Sinan Ünür, September 25, 2009 14:35

3 Answers

3

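It really depends on the amount of data in the related tables and on the number of users you query at a time.

Option 1 tends to be messy to deal with.

Option 2 is messy to process as well, and grouping tends to be slow, especially on large data sets.

Option 3 is the easiest to deal with, but it generates more queries overall. If your data set is small and you don't plan to scale much beyond your current needs, it is probably the best option. If you only want to display a single record, it is definitely the best option.

However, there is a fourth option, a middle-of-the-road approach that I use at my job, where we deal with a very similar situation. Instead of fetching the related records for each row one at a time, use IN() to fetch all the related records for your result set at once. Then loop in your code to match them to the appropriate record for display. If you cache your search queries, you can cache that second query as well. It's only two queries and only one loop in the code (no parsing; use hashes to relate things by their key).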
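As a rough sketch of the IN() approach described above: the thread names no language or database, so this uses Python with an in-memory SQLite database, and the `users` and `emails` tables, their columns and the sample rows are all made up for illustration. With more dependent tables you would presumably repeat the second query once per table.

```python
import sqlite3
from collections import defaultdict

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emails (user_id INTEGER, email TEXT);
    INSERT INTO users  VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO emails VALUES (1, 'alice@example.com'), (1, 'a@example.org'),
                              (2, 'bob@example.com');
""")

# Query 1: the result set of users to display.
users = conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()
user_ids = [uid for uid, _ in users]

# Query 2: every related row for those users in one go, via IN (...).
emails_by_user = defaultdict(list)  # the "hash" keyed by user id
if user_ids:  # skip the query rather than build an empty IN () list
    placeholders = ", ".join("?" * len(user_ids))
    rows = conn.execute(
        f"SELECT user_id, email FROM emails "
        f"WHERE user_id IN ({placeholders}) ORDER BY rowid",
        user_ids,
    )
    # The single loop: file each related row under its user's key.
    for user_id, email in rows:
        emails_by_user[user_id].append(email)

# Attach the related rows to each user by key lookup, with no string parsing.
result = [
    {"id": uid, "name": name, "emails": emails_by_user.get(uid, [])}
    for uid, name in users
]
print(result)
```

Users without any related rows simply get an empty list, which is one thing the lookup-by-key design makes easy.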

  • After partially implementing option 1, I’ve changed my mind and gone for option 3. With option 1 I was pulling out maybe 20 rows per person, but I could see that as the system expands and requirements change, this could easily spiral up into the hundreds or even thousands, as you pointed out. It just doesn’t scale. I like your IN idea, thanks.

    aidan, September 28, 2009 09:00
0

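Personally, assuming my tables were indexed properly from the outset, I would use table joins to fetch all the data in one go and then process it into a nested data structure. That way you play to the strengths of each system.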
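A minimal sketch of that join-then-nest idea, assuming Python with an in-memory SQLite database; the table names, columns and sample rows are hypothetical. Because the join repeats each email once per address (and vice versa), this sketch collects values into sets, which would also collapse genuine duplicates if the data had any.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users     (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emails    (user_id INTEGER, email TEXT);
    CREATE TABLE addresses (user_id INTEGER, address TEXT);
    INSERT INTO users     VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO emails    VALUES (1, 'alice@example.com'), (1, 'a@example.org');
    INSERT INTO addresses VALUES (1, '1 Main St'), (1, '2 Side Rd'), (2, '9 Elm Ave');
""")

# One query: LEFT JOINs keep users that have no emails or no addresses.
rows = conn.execute("""
    SELECT u.id, u.name, e.email, a.address
    FROM users u
    LEFT JOIN emails    e ON e.user_id = u.id
    LEFT JOIN addresses a ON a.user_id = u.id
    ORDER BY u.id
""").fetchall()

# Fold the flat rows into one nested record per user.
nested = {}
for uid, name, email, address in rows:
    user = nested.setdefault(
        uid, {"name": name, "emails": set(), "addresses": set()}
    )
    if email is not None:      # NULL from the LEFT JOIN means "no email"
        user["emails"].add(email)
    if address is not None:    # NULL from the LEFT JOIN means "no address"
        user["addresses"].add(address)

print(len(rows), nested)
```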

  • Remember though, that approach multiplies the number of records: each user comes back as (emails × addresses × groups) rows, which has the effect of denormalising everything. On a large data set your application ends up using a lot of memory, which is usually undesirable. If the data set is small enough to deal with that, then you might as well use option 3, which is faster to code for.

    Cfreak, September 25, 2009 14:52
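To put numbers on the row growth this comment warns about, here is a small sketch, again assuming Python with in-memory SQLite and made-up tables (`user_groups` stands in for the groups table). One user with 2 emails, 3 addresses and 4 groups is counted both ways.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users       (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emails      (user_id INTEGER, email TEXT);
    CREATE TABLE addresses   (user_id INTEGER, address TEXT);
    CREATE TABLE user_groups (user_id INTEGER, group_name TEXT);
    INSERT INTO users VALUES (1, 'alice');
""")
conn.executemany("INSERT INTO emails VALUES (1, ?)",
                 [(f"e{i}@example.com",) for i in range(2)])
conn.executemany("INSERT INTO addresses VALUES (1, ?)",
                 [(f"{i} Main St",) for i in range(3)])
conn.executemany("INSERT INTO user_groups VALUES (1, ?)",
                 [(f"group{i}",) for i in range(4)])

# Option 1: joining all dependent tables yields the product of their sizes.
joined_rows = conn.execute("""
    SELECT COUNT(*)
    FROM users u
    JOIN emails      e ON e.user_id = u.id
    JOIN addresses   a ON a.user_id = u.id
    JOIN user_groups g ON g.user_id = u.id
""").fetchone()[0]

# Option 3: querying each table separately yields the sum of their sizes.
separate_rows = sum(
    conn.execute(f"SELECT COUNT(*) FROM {table} WHERE user_id = 1").fetchone()[0]
    for table in ("emails", "addresses", "user_groups")
)

print(joined_rows, separate_rows)  # 24 9
```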
0

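Generally speaking, write the most efficient query for the situation you are in. So don't create one big query that gets used in every case. Create case-specific queries that return only the information you need.

As for processing the results: if you use GROUP_CONCAT, you have to split all the resulting values during processing. If your GROUP_CONCAT'd values contain extra delimiter characters, this can cause problems. My preferred method is to put the GROUPED BY field into a $holder during the output loop. On each iteration, compare the field against $holder and change the output accordingly.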
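Both points can be sketched briefly. The answer's `$holder` suggests PHP or Perl, but this sketch assumes Python with in-memory SQLite, and the tables and sample rows are invented for illustration. The first part shows the delimiter problem (one address contains a comma, which is also the default GROUP_CONCAT separator); the second is one possible reading of the `$holder` loop, keyed on the user id.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users     (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE addresses (user_id INTEGER, address TEXT);
    INSERT INTO users     VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO addresses VALUES (1, '1 Main St, Springfield'), (1, '2 Side Rd'),
                                 (2, '9 Elm Ave');
""")

# Pitfall: splitting on the separator breaks values that contain it.
concat = conn.execute(
    "SELECT GROUP_CONCAT(address) FROM addresses WHERE user_id = 1"
).fetchone()[0]
parts = concat.split(",")
print(len(parts))  # 3 pieces, although alice has only 2 addresses

# Holder approach: order by the grouping field, then watch for it changing.
rows = conn.execute("""
    SELECT u.id, u.name, a.address
    FROM users u
    JOIN addresses a ON a.user_id = u.id
    ORDER BY u.id, a.rowid
""")
lines = []
holder = None                      # last user id seen so far
for uid, name, address in rows:
    if uid != holder:              # grouping field changed: start a new group
        lines.append(name)
        holder = uid
    lines.append("  " + address)   # detail line under the current group

print("\n".join(lines))
```

The holder loop depends on the ORDER BY: if rows for one user were not adjacent, the same heading would be emitted more than once.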