Mirror of https://github.com/pytorch/pytorch.git, synced 2025-11-04 16:04:58 +08:00

	[inductor] Expand use of generic benchmark function (#164938)
Use the more generic `Benchmarker.benchmark` function so that other devices that support the required functionality can be benchmarked. For example, prologue and epilogue fusion can be benchmarked for Triton CPU.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164938
Approved by: https://github.com/nmacchioni, https://github.com/eellison

Committed by: PyTorch MergeBot
Parent: 0c14f55de6
Commit: 5c583e2573

@@ -2671,8 +2671,10 @@ class AlgorithmSelectorCache(PersistentCache):
 
         # Templates selected with input_gen_fns require specific input data to avoid IMA
         # Passing custom input gen fns to benchmark_fusion NYI, so skip deferred template selection
-        # TODO(jgong5): support multi-template on CPU
-        if input_gen_fns is not None or layout.device.type == "cpu":
+        # TODO(jgong5): support multi-template on CPU C++ backend
+        if input_gen_fns is not None or (
+            layout.device.type == "cpu" and config.cpu_backend != "triton"
+        ):
             return_multi_template = False
 
         # TODO - assert that we have not mutating kernels here
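
The effect of the changed condition can be sketched outside of Inductor as a pair of plain functions. This is an illustrative sketch only: the function names and flat string arguments are hypothetical and are not part of `AlgorithmSelectorCache`, and `"cpp"` is assumed here to be the name of the non-Triton CPU backend. The only logic taken from the commit is the boolean condition itself.

```python
from typing import Callable, Optional


def multi_template_before(
    requested: bool,
    input_gen_fns: Optional[dict],
    device_type: str,
) -> bool:
    """Hypothetical standalone form of the condition before this commit."""
    # Any CPU device disabled deferred (multi-template) selection.
    if input_gen_fns is not None or device_type == "cpu":
        return False
    return requested


def multi_template_after(
    requested: bool,
    input_gen_fns: Optional[dict],
    device_type: str,
    cpu_backend: str,
) -> bool:
    """Hypothetical standalone form of the condition after this commit."""
    # CPU only disables deferred selection when the backend is not Triton.
    if input_gen_fns is not None or (
        device_type == "cpu" and cpu_backend != "triton"
    ):
        return False
    return requested


if __name__ == "__main__":
    # Assumed example inputs; "cpp" stands in for the C++ CPU backend.
    identity: Callable = lambda x: x
    cases = [
        (None, "cpu", "triton"),
        (None, "cpu", "cpp"),
        (None, "cuda", "cpp"),
        ({0: identity}, "cpu", "triton"),
    ]
    for gen_fns, device, backend in cases:
        before = multi_template_before(True, gen_fns, device)
        after = multi_template_after(True, gen_fns, device, backend)
        print(device, backend, gen_fns is not None, before, after)
```

The only case whose result differs between the two functions is a CPU device with the Triton backend and no custom input generators, which matches the commit's stated goal of allowing fusion benchmarking for Triton CPU.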