Use toposort instead of topological_op_nodes for DAG reconstruction#15987
Use toposort instead of topological_op_nodes for DAG reconstruction#15987mtreinish wants to merge 3 commits intoQiskit:mainfrom
Conversation
This commit switches the topological sort function we use in transpiler passes when reconstructing a dag from scratch. In several passes where we typically replace or remove a large numbers of gates we iterate over the input dag in topological order and construct a copy of it making the alterations as we go. Right now when we do this we rely on `DAGCircuit::topological_op_nodes` which makes sense because it's or built-in method for iterating over a dag's op node in topological ordering. Internally this uses rustworkx's lexicographical topological sort function with our custom sort function that maintains our desired tie breaker using the bits of a node. However, since Qiskit#14762 where we're asserting structural equality in passes we don't need to use that sort anymore for these reconstruction cases. We just need a consistent topological sort. In optimizing Qiskit#13419 one thing that showed in profiles was that for very large circuits the overhead of the lexicographical topological sort for the iteration. The toposort function in petgraph is lower overhead because it doesn't need to work about the lexicographical tie breaker. By switching to use this instead we can reduce the overhead of the final sort in all these passes. In asv benchmarking this commit speeds up transpiler benchmarks are 2-5% faster (although without asv flagging it as significant).
|
One or more of the following people are relevant to this code:
|
jakelishman
left a comment
There was a problem hiding this comment.
Got to say I don't love this in the way it's been done. I don't like all these extra explicit calls to unwrap just to call what should be an infallible function, and I'm not a fan of having a specific DAGCircuit::topological_op_nodes function that you shouldn't use.
Is there any way we can fix topological_op_nodes in place?
This does not replace all the uses of That being said I can refactor this into a rebuilding method instead. Where you give it a callback that is passed every node and it will return a new dag instance based on that. Since we do this enough times having a streamlined api makes sense anyway. |
This commit adds a new method to DAGCircuit that is designed to replace the explicit rebuilding of a dag that is done in several passes. The previous commit updated the common pattern of rebuilding the dag to use toposort instead of topological_sort, but this wasn't the cleanest interface and it was resulting in some ugliness between the API boundaries. To facilitate making the more performant path easier to use correctly this commit opts to just extract the common pattern into a method that enables performant rebuilding of a dag.
Summary
This commit switches the topological sort function we use in transpiler passes when reconstructing a dag from scratch. In several passes where we typically replace or remove a large numbers of gates we iterate over the input dag in topological order and construct a copy of it making the alterations as we go. Right now when we do this we rely on
DAGCircuit::topological_op_nodeswhich makes sense because it's or built-in method for iterating over a dag's op node in topological ordering. Internally this uses rustworkx's lexicographical topological sort function with our custom sort function that maintains our desired tie breaker using the bits of a node. However, since #14762 where we're asserting structural equality in passes we don't need to use that sort anymore for these reconstruction cases. We just need a consistent topological sort. In optimizing #13419 one thing that showed in profiles was that for very large circuits the overhead of the lexicographical topological sort for the iteration. The toposort function in petgraph is lower overhead because it doesn't need to work about the lexicographical tie breaker. By switching to use this instead we can reduce the overhead of the final sort in all these passes. In asv "utility scale" transpile benchmarking this commit speeds up runtime 2-5% (although without asv flagging it as significant).Details and comments