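Querying with SQL, geographic coordinates, and zones of overlapping radii
I have a MySQL data set made up of longitude, latitude, and a value. I am trying to sum the values whose longitude and latitude coordinates fall within a given radius of other latitude/longitude coordinates (let's call them "focus points"). The trickiest part is that I am trying to separate out the coordinates that fall in overlapping regions, for example where radius 1 overlaps radius 2.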
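Each focus point has several radius "zones" around it, so for any given latitude/longitude coordinate there can be a lot of things to sum. I have managed to cobble together a query that mostly works, although it is a bit slow: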
Select
Sum(If(`zone`='z0_0x1_0',`value`,0)) as `z0_0x1_0`,
Sum(If(`zone`='z0_0x1_1',`value`,0)) as `z0_0x1_1`,
Sum(If(`zone`='z0_0x1_2',`value`,0)) as `z0_0x1_2`,
Sum(If(`zone`='z0_0x1_3',`value`,0)) as `z0_0x1_3`,
Sum(If(`zone`='z0_1x1_0',`value`,0)) as `z0_1x1_0`,
Sum(If(`zone`='z0_1x1_1',`value`,0)) as `z0_1x1_1`,
Sum(If(`zone`='z0_1x1_2',`value`,0)) as `z0_1x1_2`,
Sum(If(`zone`='z0_2x1_0',`value`,0)) as `z0_2x1_0`,
Sum(If(`zone`='z0_2x1_1',`value`,0)) as `z0_2x1_1`,
Sum(If(`zone`='z0_3x1_0',`value`,0)) as `z0_3x1_0`,
Sum(If(`zone`='z0_3x1_1',`value`,0)) as `z0_3x1_1`,
Sum(If(`zone`='z0_0',`value`,0)) as `z0_0`,
Sum(If(`zone`='z0_1',`value`,0)) as `z0_1`,
Sum(If(`zone`='z0_2',`value`,0)) as `z0_2`,
Sum(If(`zone`='z0_3',`value`,0)) as `z0_3`,
Sum(If(`zone`='z1_0',`value`,0)) as `z1_0`,
Sum(If(`zone`='z1_1',`value`,0)) as `z1_1`,
Sum(If(`zone`='z1_2',`value`,0)) as `z1_2`,
Sum(If(`zone`='z1_3',`value`,0)) as `z1_3`
From
(Select `lat`, `lng`, `value`,
Case
When ((`dist_0` Between 2.8723597844095 And 4.3343662110324) And (`dist_1` Between 3.6260179152491 And 5.4681062617155)) Then 'z0_0x1_0'
When ((`dist_0` Between 2.8723597844095 And 4.3343662110324) And (`dist_1` Between 2.1278369006061 And 3.6260179152491)) Then 'z0_0x1_1'
When ((`dist_0` Between 2.8723597844095 And 4.3343662110324) And (`dist_1` Between 1.3333495959677 And 2.1278369006061)) Then 'z0_0x1_2'
When ((`dist_0` Between 2.8723597844095 And 4.3343662110324) And (`dist_1` Between 0 And 1.3333495959677)) Then 'z0_0x1_3'
When ((`dist_0` Between 1.68658498678 And 2.8723597844095) And (`dist_1` Between 3.6260179152491 And 5.4681062617155)) Then 'z0_1x1_0'
When ((`dist_0` Between 1.68658498678 And 2.8723597844095) And (`dist_1` Between 2.1278369006061 And 3.6260179152491)) Then 'z0_1x1_1'
When ((`dist_0` Between 1.68658498678 And 2.8723597844095) And (`dist_1` Between 1.3333495959677 And 2.1278369006061)) Then 'z0_1x1_2'
When ((`dist_0` Between 1.0573158612197 And 1.68658498678) And (`dist_1` Between 3.6260179152491 And 5.4681062617155)) Then 'z0_2x1_0'
When ((`dist_0` Between 1.0573158612197 And 1.68658498678) And (`dist_1` Between 2.1278369006061 And 3.6260179152491)) Then 'z0_2x1_1'
When ((`dist_0` Between 0 And 1.0573158612197) And (`dist_1` Between 3.6260179152491 And 5.4681062617155)) Then 'z0_3x1_0'
When ((`dist_0` Between 0 And 1.0573158612197) And (`dist_1` Between 2.1278369006061 And 3.6260179152491)) Then 'z0_3x1_1'
When ((`dist_0` Between 2.8723597844095 And 4.3343662110324)) Then 'z0_0'
When ((`dist_0` Between 1.68658498678 And 2.8723597844095)) Then 'z0_1'
When ((`dist_0` Between 1.0573158612197 And 1.68658498678)) Then 'z0_2'
When ((`dist_0` Between 0 And 1.0573158612197)) Then 'z0_3'
When ((`dist_1` Between 3.6260179152491 And 5.4681062617155)) Then 'z1_0'
When ((`dist_1` Between 2.1278369006061 And 3.6260179152491)) Then 'z1_1'
When ((`dist_1` Between 1.3333495959677 And 2.1278369006061)) Then 'z1_2'
When ((`dist_1` Between 0 And 1.3333495959677)) Then 'z1_3'
End As `zone`
From
(Select `lat`, `lng`, `value`,
(acos(0.65292272498833*sin(radians(`lat`)) + 0.75742452772129*cos(radians(`lat`))*cos(radians(`lng`)-(-1.2910922519714))) * 6371) as `dist_0`,
(acos(0.65251345816785*sin(radians(`lat`)) + 0.75777713538338*cos(radians(`lat`))*cos(radians(`lng`)-(-1.2916315412569))) * 6371) as `dist_1`
From `pop`
Where
((`lat` Between 40.714353892125 And 40.810300107875) And (`lng` Between -74.037474145971 And -73.910799854029)) Or
((`lat` Between 40.673205922895 And 40.789544077105) And (`lng` Between -74.081798776797 And -73.928273223203))
)
As FirstCut
)
As Zonecut
The logic of things here:
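First, it grabs a bounding box around the largest radius of each focus point. (This is the FirstCut query.) This cuts the number of data points we are looking at by several orders of magnitude.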
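As a rough illustration of this first cut, here is a minimal Python sketch of how such a bounding box could be derived from a focus point and a radius. The function name `bounding_box` is hypothetical, and the focus coordinates are inferred as the centre of the first box in the Where clause; the actual pre-processing is done in PHP and is not shown in the post.

```python
import math

EARTH_RADIUS_KM = 6371.0  # same Earth radius the query multiplies by


def bounding_box(lat_deg, lng_deg, radius_km):
    """Return (lat_min, lat_max, lng_min, lng_max) in degrees for a box that
    approximately encloses a circle of radius_km around the point.

    Assumes a small radius that does not reach a pole or the antimeridian.
    """
    # Angular radius converted to degrees of latitude.
    dlat = math.degrees(radius_km / EARTH_RADIUS_KM)
    # Meridians converge away from the equator, so widen the longitude span.
    dlng = dlat / math.cos(math.radians(lat_deg))
    return (lat_deg - dlat, lat_deg + dlat, lng_deg - dlng, lng_deg + dlng)


# Focus 0 (inferred), using its largest radius from the query plus 1 km.
box = bounding_box(40.762327, -73.974137, 4.3343662110324 + 1.0)
```

With the largest focus-0 radius plus 1 km, this matches the first box in the Where clause to within rounding. The 1 km of padding is inferred from the numbers in the query, not stated in the post.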
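Then it goes through all of that data and gets the distance of each data point from the focus points (in this case dist_0 and dist_1, but there can be any number of focus points; I have used just two in this example to show how it works). This is a great-circle distance formula (the acos form used here is the spherical law of cosines, which is often loosely referred to as Haversine). Then the Case statement kicks in and assigns each coordinate to a "zone", with the zones processed from most complex to least complex. A zone code just means "zone X, radius Y", so "z0_1" means "zone 0, radius 1". If there is an "x", it is an intersection of multiple zones. This zone code is assigned simply as a string.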
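Finally, the zone codes are used to total everything up by zone name with the Sum(If()) statements. (For whatever reason, If() seems to run slightly faster than Case here.)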
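This outputs a list of zones and sums to my script (PHP). Obviously this whole query is generated programmatically, since you have to work out ahead of time all of the zones that could actually be "hit", and that is all done as pre-processing to avoid doing it in SQL.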
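The post does not show how the "hittable" zones are worked out, so the following Python sketch is only one plausible way to do it. It assumes two rings around different foci can share points only if their radii satisfy the triangle inequality against the distance between the foci. The focus coordinates are inferred from the box centres in the Where clause, and all function names are hypothetical.

```python
import math
from itertools import product

EARTH_RADIUS_KM = 6371.0

# Ring edges in km for each focus, outermost first, taken from the query.
RING_EDGES = {
    0: [4.3343662110324, 2.8723597844095, 1.68658498678, 1.0573158612197, 0.0],
    1: [5.4681062617155, 3.6260179152491, 2.1278369006061, 1.3333495959677, 0.0],
}
# Focus coordinates in degrees, inferred from the bounding-box centres (assumption).
FOCI = {0: (40.762327, -73.974137), 1: (40.731375, -74.005036)}


def great_circle_km(a, b):
    """Spherical law of cosines distance between two (lat, lng) pairs in degrees."""
    p1, p2 = math.radians(a[0]), math.radians(b[0])
    dl = math.radians(b[1] - a[1])
    c = math.sin(p1) * math.sin(p2) + math.cos(p1) * math.cos(p2) * math.cos(dl)
    return math.acos(max(-1.0, min(1.0, c))) * EARTH_RADIUS_KM


def rings_can_overlap(inner_a, outer_a, inner_b, outer_b, d):
    """True if some point can lie in both rings when their centres are d apart."""
    return (outer_a + outer_b >= d      # the rings reach each other
            and inner_a - outer_b <= d  # ring b is not lost inside ring a's hole
            and inner_b - outer_a <= d)  # and vice versa


def reachable_zones(f0=0, f1=1):
    """Zone codes, intersections first, in the order the Case statement needs."""
    d = great_circle_km(FOCI[f0], FOCI[f1])
    e0, e1 = RING_EDGES[f0], RING_EDGES[f1]
    zones = [f"z{f0}_{i}x{f1}_{j}"
             for i, j in product(range(len(e0) - 1), range(len(e1) - 1))
             if rings_can_overlap(e0[i + 1], e0[i], e1[j + 1], e1[j], d)]
    zones += [f"z{f0}_{i}" for i in range(len(e0) - 1)]
    zones += [f"z{f1}_{j}" for j in range(len(e1) - 1)]
    return zones


def sum_columns(zones):
    """Emit the Sum(If()) select list for the outer query."""
    return ",\n".join(f"Sum(If(`zone`='{z}',`value`,0)) as `{z}`" for z in zones)


print(sum_columns(reachable_zones()))
```

Under these assumptions the sketch yields the same 19 zone codes, in the same order, as the select list in the query above.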
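Is there a smarter way to do this? The bit where I assign each row a string and then filter that string into fields seems hacky and not very elegant. But I could not find a better way to sort the rows into fields than one big Case statement (which seemed to be much faster than many separate Case statements).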
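Any and all feedback on this would be appreciated. The MySQL table is pretty huge (millions of rows) and is indexed to all holy hell. Running the above query takes about 0.6 seconds, which is not too bad, but as more focus points are added the query starts to take longer, and I am just trying to think my way through the SQL logic at this stage. Thanks.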
Oh, that is very clever. I will see if I can implement it; I think it could work. – nucleon