2017-07-07

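Querying areas of overlap with SQL, geographic coordinates, and radii

I have a MySQL dataset made up of longitudes, latitudes, and values. I am trying to sum the values whose latitude/longitude coordinates fall within a given radius of other latitude/longitude coordinates (let's call those "focal points"). The tricky part is that I am trying to separate out the coordinates that fall in areas of overlap, for example where radius 1 overlaps radius 2.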

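Each focal point has several radius "zones", so for any given latitude/longitude coordinate there can be quite a lot to sum up. I managed to cobble together a query that mostly works, although it is a bit slow: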

Select 
      Sum(If(`zone`='z0_0x1_0',`value`,0)) as `z0_0x1_0`, 
      Sum(If(`zone`='z0_0x1_1',`value`,0)) as `z0_0x1_1`, 
      Sum(If(`zone`='z0_0x1_2',`value`,0)) as `z0_0x1_2`, 
      Sum(If(`zone`='z0_0x1_3',`value`,0)) as `z0_0x1_3`, 
      Sum(If(`zone`='z0_1x1_0',`value`,0)) as `z0_1x1_0`, 
      Sum(If(`zone`='z0_1x1_1',`value`,0)) as `z0_1x1_1`, 
      Sum(If(`zone`='z0_1x1_2',`value`,0)) as `z0_1x1_2`, 
      Sum(If(`zone`='z0_2x1_0',`value`,0)) as `z0_2x1_0`, 
      Sum(If(`zone`='z0_2x1_1',`value`,0)) as `z0_2x1_1`, 
      Sum(If(`zone`='z0_3x1_0',`value`,0)) as `z0_3x1_0`, 
      Sum(If(`zone`='z0_3x1_1',`value`,0)) as `z0_3x1_1`, 
      Sum(If(`zone`='z0_0',`value`,0)) as `z0_0`, 
      Sum(If(`zone`='z0_1',`value`,0)) as `z0_1`, 
      Sum(If(`zone`='z0_2',`value`,0)) as `z0_2`, 
      Sum(If(`zone`='z0_3',`value`,0)) as `z0_3`, 
      Sum(If(`zone`='z1_0',`value`,0)) as `z1_0`, 
      Sum(If(`zone`='z1_1',`value`,0)) as `z1_1`, 
      Sum(If(`zone`='z1_2',`value`,0)) as `z1_2`, 
      Sum(If(`zone`='z1_3',`value`,0)) as `z1_3` 
    From 
     (Select `lat`, `lng`, `value`, 
       Case 
          When ((`dist_0` Between 2.8723597844095 And 4.3343662110324) And (`dist_1` Between 3.6260179152491 And 5.4681062617155)) Then 'z0_0x1_0' 
          When ((`dist_0` Between 2.8723597844095 And 4.3343662110324) And (`dist_1` Between 2.1278369006061 And 3.6260179152491)) Then 'z0_0x1_1' 
          When ((`dist_0` Between 2.8723597844095 And 4.3343662110324) And (`dist_1` Between 1.3333495959677 And 2.1278369006061)) Then 'z0_0x1_2' 
          When ((`dist_0` Between 2.8723597844095 And 4.3343662110324) And (`dist_1` Between 0 And 1.3333495959677)) Then 'z0_0x1_3' 
          When ((`dist_0` Between 1.68658498678 And 2.8723597844095) And (`dist_1` Between 3.6260179152491 And 5.4681062617155)) Then 'z0_1x1_0' 
          When ((`dist_0` Between 1.68658498678 And 2.8723597844095) And (`dist_1` Between 2.1278369006061 And 3.6260179152491)) Then 'z0_1x1_1' 
          When ((`dist_0` Between 1.68658498678 And 2.8723597844095) And (`dist_1` Between 1.3333495959677 And 2.1278369006061)) Then 'z0_1x1_2' 
          When ((`dist_0` Between 1.0573158612197 And 1.68658498678) And (`dist_1` Between 3.6260179152491 And 5.4681062617155)) Then 'z0_2x1_0' 
          When ((`dist_0` Between 1.0573158612197 And 1.68658498678) And (`dist_1` Between 2.1278369006061 And 3.6260179152491)) Then 'z0_2x1_1' 
          When ((`dist_0` Between 0 And 1.0573158612197) And (`dist_1` Between 3.6260179152491 And 5.4681062617155)) Then 'z0_3x1_0' 
          When ((`dist_0` Between 0 And 1.0573158612197) And (`dist_1` Between 2.1278369006061 And 3.6260179152491)) Then 'z0_3x1_1' 
          When ((`dist_0` Between 2.8723597844095 And 4.3343662110324)) Then 'z0_0' 
          When ((`dist_0` Between 1.68658498678 And 2.8723597844095)) Then 'z0_1' 
          When ((`dist_0` Between 1.0573158612197 And 1.68658498678)) Then 'z0_2' 
          When ((`dist_0` Between 0 And 1.0573158612197)) Then 'z0_3' 
          When ((`dist_1` Between 3.6260179152491 And 5.4681062617155)) Then 'z1_0' 
          When ((`dist_1` Between 2.1278369006061 And 3.6260179152491)) Then 'z1_1' 
          When ((`dist_1` Between 1.3333495959677 And 2.1278369006061)) Then 'z1_2' 
          When ((`dist_1` Between 0 And 1.3333495959677)) Then 'z1_3' 
       End As `zone` 

       From 
        (Select `lat`, `lng`, `value`, 
         (acos(0.65292272498833*sin(radians(`lat`)) + 0.75742452772129*cos(radians(`lat`))*cos(radians(`lng`)-(-1.2910922519714))) * 6371) as `dist_0`, 
         (acos(0.65251345816785*sin(radians(`lat`)) + 0.75777713538338*cos(radians(`lat`))*cos(radians(`lng`)-(-1.2916315412569))) * 6371) as `dist_1` 
        From `pop` 
        Where 
         ((`lat` Between 40.714353892125 And 40.810300107875) And (`lng` Between -74.037474145971 And -73.910799854029)) Or 
         ((`lat` Between 40.673205922895 And 40.789544077105) And (`lng` Between -74.081798776797 And -73.928273223203)) 
        ) 
       As FirstCut 
     ) 
     As Zonecut 

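Here is the logic of the thing:

  1. First, it grabs a bounding box around the largest radius of each focal point. (That is the FirstCut query.) This cuts the number of data points we are looking at by several orders of magnitude.

  2. Then it goes through all of that data and gets each data point's distance from the focal points (in this case dist_0 and dist_1, but there can be any number of focal points; I have used two in this example just to show how it works). This is the great-circle distance. I think of it as the Haversine formula, although the acos form used here is strictly the spherical law of cosines.

  3. Then the Case statement kicks in and assigns each coordinate to a "zone", with the zones tested from most complex to least complex. The zone code just means "zone X, radius Y", so "z0_1" means "zone 0, radius 1". If there is an "x" in it, it is an intersection of more than one zone. This "zone code" is simply assigned as a string.

  4. Finally, with the zone names assigned, the Sum(If()) statements use the zone codes to total everything up. (For whatever reason, If() seems to run slightly faster than Case here.)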
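Steps 1 and 2 can be sketched outside SQL to make the geometry easier to check. The following is a minimal Python sketch, not the asker's actual preprocessing code: the function names `great_circle_km` and `bounding_box` are made up for illustration, the bounding box uses a simple approximation that is only reasonable for small radii away from the poles, and the clamp before `acos` is an added safeguard that the query itself does not have.

```python
import math

EARTH_RADIUS_KM = 6371  # same constant the query multiplies by


def great_circle_km(lat1, lng1, lat2, lng2):
    """Great-circle distance using the spherical law of cosines,
    i.e. the same acos(sin*sin + cos*cos*cos) form as dist_0 / dist_1."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlng = math.radians(lng2 - lng1)
    c = math.sin(p1) * math.sin(p2) + math.cos(p1) * math.cos(p2) * math.cos(dlng)
    # Assumption: clamp to [-1, 1], because rounding can push the value
    # slightly past 1 for near-identical points and acos would then fail.
    c = max(-1.0, min(1.0, c))
    return math.acos(c) * EARTH_RADIUS_KM


def bounding_box(lat, lng, radius_km):
    """Approximate lat/lng box around a circle (hypothetical helper).
    Returns (south, north, west, east) in degrees."""
    dlat = math.degrees(radius_km / EARTH_RADIUS_KM)
    # A degree of longitude shrinks by cos(latitude).
    dlng = dlat / math.cos(math.radians(lat))
    return lat - dlat, lat + dlat, lng - dlng, lng + dlng
```

The box is only a cheap prefilter that an index on lat and lng can serve; the exact distance is still what decides zone membership.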

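This outputs a list of zones and sums to my script (PHP). Obviously the whole query is generated programmatically, because you have to work out ahead of time every zone that could actually be "hit", and that is done as preprocessing to avoid doing it in SQL.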
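The preprocessing itself is not shown in the question. One plausible way to decide which intersection zones can be "hit" is to treat each zone as a ring (annulus) and test whether two rings around different focal points can share a point, given the distance between the focal points. The sketch below is an assumption about how that might be done, written in Python rather than the asker's PHP. The sin/cos/longitude constants and ring edges are copied from the query above; the helper names are hypothetical, and the overlap test treats distances as planar, which seems reasonable at a few kilometres.

```python
import math
from itertools import product

EARTH_RADIUS_KM = 6371

# Copied from the query: (sin(lat), cos(lat), lng in radians) for each focal point.
FOCI = [
    (0.65292272498833, 0.75742452772129, -1.2910922519714),
    (0.65251345816785, 0.75777713538338, -1.2916315412569),
]
# Ring edges in km, outermost first, also copied from the query.
EDGES = [
    [4.3343662110324, 2.8723597844095, 1.68658498678, 1.0573158612197, 0.0],
    [5.4681062617155, 3.6260179152491, 2.1278369006061, 1.3333495959677, 0.0],
]


def focus_separation_km(a, b):
    """Distance between two focal points, same acos form as the query."""
    (sin_a, cos_a, lng_a), (sin_b, cos_b, lng_b) = a, b
    c = sin_a * sin_b + cos_a * cos_b * math.cos(lng_a - lng_b)
    return math.acos(max(-1.0, min(1.0, c))) * EARTH_RADIUS_KM


def rings(edges):
    """Turn [outer, ..., 0] into (inner, outer) pairs; ring 0 is outermost."""
    return [(inner, outer) for outer, inner in zip(edges, edges[1:])]


def annuli_can_overlap(d, ring_a, ring_b):
    """True if some point can lie in both rings when the centres are d apart.
    Needs radii r_a, r_b with |r_a - r_b| <= d <= r_a + r_b."""
    (a_in, a_out), (b_in, b_out) = ring_a, ring_b
    return a_out + b_out >= d and a_in - b_out <= d and b_in - a_out <= d


def reachable_intersections():
    """Zone codes such as 'z0_1x1_2' for every ring pair that can overlap."""
    d = focus_separation_km(*FOCI)
    pairs = product(enumerate(rings(EDGES[0])), enumerate(rings(EDGES[1])))
    return [f"z0_{i}x1_{j}" for (i, ra), (j, rb) in pairs
            if annuli_can_overlap(d, ra, rb)]
```

With the constants above, `reachable_intersections()` yields the same eleven "x" zones that appear in the Case statement, which suggests the generated query was pruned by a test of roughly this kind.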

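Is there a smarter way to do this? The bit where I assign each row a string and then filter that string into fields seems hacky and not very elegant. But I could not find a better way to sort the rows into fields than one big Case statement (which seems to be much faster than many separate Case statements).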

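Any and all feedback would be greatly appreciated. The MySQL table is huge (millions of rows) and indexed to all holy hell. Running the query above takes about 0.6 seconds, which is not too bad, but as more focal points are added the query starts to take longer, and I am at the limit of how far I can think through the SQL logic at this stage. Thanks.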

Answer


I have not checked this thoroughly, but it looks like this might shorten that big CASE somewhat:

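-- Each focal point gets its own CASE; CONCAT joins the two labels.
-- Note: an overlap comes out as e.g. 'z0_0z1_0' rather than 'z0_0x1_0'.
CONCAT(
    (CASE
     WHEN (dist_0 ...) THEN 'z0_0'
     WHEN (dist_0 ...) THEN 'z0_1'
     ...
     ELSE '' END),
    (CASE
     WHEN (dist_1 ...) THEN 'z1_0'
     WHEN (dist_1 ...) THEN 'z1_1'
     ...
     ELSE '' END)) AS zone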
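To see what labels this produces, here is a small runnable sketch. It is not MySQL: it uses Python's built-in sqlite3 module, so the `||` operator stands in for MySQL's CONCAT, and the `sample` table with its precomputed dist_0/dist_1 values is made up for the demonstration. The ring edges are the ones from the question; the helper `ring_case` is hypothetical.

```python
import sqlite3

# Ring edges in km, outermost first, copied from the question's query.
RINGS_0 = [4.3343662110324, 2.8723597844095, 1.68658498678, 1.0573158612197, 0]
RINGS_1 = [5.4681062617155, 3.6260179152491, 2.1278369006061, 1.3333495959677, 0]


def ring_case(column, prefix, edges):
    """Build one CASE expression that labels a distance column by ring."""
    whens = [
        f"WHEN {column} BETWEEN {inner!r} AND {outer!r} THEN '{prefix}_{i}'"
        for i, (outer, inner) in enumerate(zip(edges, edges[1:]))
    ]
    return "CASE " + " ".join(whens) + " ELSE '' END"


# One CASE per focal point, joined together (|| here, CONCAT in MySQL).
zone_sql = (
    f"({ring_case('dist_0', 'z0', RINGS_0)}) || "
    f"({ring_case('dist_1', 'z1', RINGS_1)})"
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (dist_0 REAL, dist_1 REAL, value REAL)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?)",
    [
        (3.0, 4.0, 10.0),  # inside both outer rings
        (0.5, 9.0, 5.0),   # inside focus 0 only
        (9.0, 1.0, 7.0),   # inside focus 1 only
        (9.0, 9.0, 1.0),   # outside everything
    ],
)
rows = conn.execute(
    f"SELECT {zone_sql} AS zone, value FROM sample ORDER BY rowid"
).fetchall()
zones = [zone for zone, _ in rows]
# zones == ['z0_0z1_0', 'z0_3', 'z1_3', '']
```

The number of WHEN branches now grows with the sum of the ring counts instead of their product. The trade-off is that every combination gets a label, including pairs that the original query left out because they can never occur, and rows outside all rings get an empty string instead of NULL.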

Oh, that is very clever. I will see whether I can implement it; I think it could work. – nucleon