5.12 Looking for the journey requiring the most stops (Practical Gremlin: An Apache TinkerPop Tutorial)

Sometimes it is interesting to ponder questions such as "Starting from a given airport, what are the most difficult places to get to?" For the sake of this example, let's define "difficult to get to" as meaning "requires the most hops". By modifying the query from the previous section, we can come up with a way to detect some of the longest routes. At the end of the section I have also included examples that show alternative ways to generate the same result.

The query shown below finds every airport that has outgoing routes and is not Austin (AUS). For each airport found, an attempt is made to get to Austin in ten or fewer hops, and that attempt is only made once per airport. I chose a limit of 10 because I was already fairly confident that no airport would be further away than that in terms of stops. However, it really does not matter what value is used so long as it is big enough. A value of 100 does not adversely affect the query, given that the emit step will return a result as soon as AUS is reached. The query finds a single path between each starting airport and AUS. The where step on the last line of the query only keeps paths that have a length of eight or more. A path of length eight contains eight airports and therefore seven hops, so this essentially says "find me only routes that take at least seven hops to get to Austin".
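The logic described above can be mimicked outside Gremlin. The following is a minimal plain-Python sketch, not the Gremlin query itself: it runs a breadth-first search from every start vertex that has outgoing edges, stops at the first path that reaches the target within a hop cap, and then keeps only paths at or above a minimum length. The tiny `routes` graph, the codes in it, and the function names `first_path` and `hard_to_reach` are all made up for illustration; they are not data or APIs from the air-routes graph, and a Gremlin engine need not evaluate the traversal this way internally.

```python
from collections import deque

# Hypothetical toy route map (directed adjacency lists), not real air-routes data.
routes = {
    "AAA": ["BBB"],
    "BBB": ["AAA", "CCC"],
    "CCC": ["BBB", "DDD", "HUB"],
    "DDD": ["CCC", "HUB"],
    "HUB": ["CCC", "DDD"],
    "ZZZ": [],  # no outgoing routes, so it is never used as a start
}


def first_path(graph, start, target, max_hops):
    """Return one shortest path from start to target using at most max_hops edges, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path  # stop at the first (shortest) hit, like keeping one result
        if len(path) - 1 == max_hops:
            continue  # hop cap reached; do not extend this path further
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def hard_to_reach(graph, target, max_hops, min_path_len):
    """One path per start vertex, kept only if it has at least min_path_len vertices."""
    results = []
    for start, outgoing in graph.items():
        if start == target or not outgoing:
            continue  # skip the target itself and vertices with no outgoing edges
        path = first_path(graph, start, target, max_hops)
        if path is not None and len(path) >= min_path_len:
            results.append(path)
    return results


long_paths = hard_to_reach(routes, "HUB", max_hops=10, min_path_len=4)
print(long_paths)
```

In this toy graph only one start vertex needs three hops to reach the target, so only its four-vertex path survives the length filter. Raising `max_hops` from 10 to 100 changes nothing, because each search ends as soon as the target is reached.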

When run, the query may take a few seconds to complete, but should not take more than that on a typical laptop or desktop. Here are the results of running the query.

The query could also be written using repeat and until steps as shown below.

As before, the same three paths are found.

5.12.1. Quickly finding the hardest to get to airports

There are other ways we could write a query that looks for "hard to get to" airports. The technique shown below essentially relies on using a de-duplication strategy to make sure you only visit an airport exactly once. Note this is different from the way the simplePath step works. That step allows a query to fan out and visit the same vertex from many places, a many-to-one type of pattern, but will not allow a traversal to loop back on itself. By using a de-duplication approach, we only visit a vertex exactly once and remove all other instances of it as the query progresses. This gives the query processor a lot less work to do and, as a result, the query executes more efficiently. The queries below do a bit less filtering than the ones above but work well if all you are looking for are destinations that, from a given starting point, require at least a specified number of hops to reach. This time I decided to start with Austin (AUS) and look for any airports that are only reachable in seven hops. The fact that we are removing duplicates along the way guarantees that there is no shorter path to the destinations found. If you are good at reading backwards, you will see that the results match those found above but in the reverse order.

The first example keeps track of where it has been using a store step and only continues on to places it has not seen before until seven hops have completed.
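As a rough plain-Python analogy (a sketch, not Gremlin and not the author's query), the visit-once idea amounts to expanding a frontier one hop at a time while recording every vertex already seen in a shared collection, which plays the role of the collection that the store step fills in the description above. The toy `routes` graph and the name `reachable_only_at` are hypothetical.

```python
# Hypothetical toy route map; the codes are invented for illustration.
routes = {
    "HUB": ["CCC", "DDD"],
    "CCC": ["HUB", "BBB", "DDD"],
    "DDD": ["HUB", "CCC"],
    "BBB": ["CCC", "AAA"],
    "AAA": ["BBB"],
}


def reachable_only_at(graph, start, hops):
    """Return paths to the vertices first reached after exactly `hops` hops from start."""
    seen = {start}        # every vertex visited so far
    frontier = [[start]]  # one path per vertex in the current frontier
    for _ in range(hops):
        next_frontier = []
        for path in frontier:
            for nxt in graph.get(path[-1], []):
                if nxt in seen:
                    continue  # already visited via an equal or shorter route
                seen.add(nxt)
                next_frontier.append(path + [nxt])
        frontier = next_frontier
    return frontier


print(reachable_only_at(routes, "HUB", 3))
```

Because each vertex enters `seen` the first time it is reached, any vertex still in the frontier after the requested number of rounds has no shorter route from the start, which is the guarantee the text relies on.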

As you can see, the results look familiar.

This alternate version of the query uses a dedup step to achieve the same goal. On most Gremlin query engines this is an efficient way to look for targets that can only be reached in at least the given number of hops. This query is similar to the one used in the "Does any route exist between two airports?" section.

Once again, the same results are generated.

When looking for longest routes in a graph, especially in a highly connected one, you need to be careful. Notice how I chose either a very specific target or a specific number of hops for my searches.

Looking for longest paths must be done carefully, as it can result in very long-running queries.

If you were to try to write a query that arbitrarily looked for the longest route between any two airports, you would run the risk of writing a query that runs for an extremely long time. In many ways the concept of "longest path" can be viewed as an anti-pattern. This is especially true in highly connected graphs, of which air-routes is an example.
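To get a feel for why an open-ended longest-path search is risky, consider how many loop-free paths exist between two vertices in a fully connected graph. The sketch below is plain Python rather than Gremlin, and the complete graph is a deliberately extreme, hypothetical stand-in for a highly connected graph, not a model of air-routes. It counts the loop-free paths by brute force; the function name `count_simple_paths` is invented for this illustration.

```python
def count_simple_paths(n):
    """Count loop-free paths from vertex 0 to vertex n-1 in a complete directed graph on n vertices."""
    target = n - 1

    def walk(current, visited):
        if current == target:
            return 1  # reached the target: this is one complete path
        total = 0
        for nxt in range(n):
            if nxt not in visited:  # never revisit a vertex on the same path
                total += walk(nxt, visited | {nxt})
        return total

    return walk(0, frozenset([0]))


for n in range(2, 9):
    print(n, count_simple_paths(n))
```

Going from 4 to 8 vertices takes the count from 5 to 1957, and the growth only accelerates from there. A query that has to consider every loop-free path therefore becomes impractical very quickly, whereas fixing a specific target or a specific hop count keeps the search bounded.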

Posted on 2022-04-29 21:13 by bokeyuannicheng0000
