
Commit 24b454b

jbuchocx authored and kuba-moo committed
ice: Fix ice module unload
Clearing the interrupt scheme before PFR reset, during the removal routine, could cause the hardware errors and possibly lead to system reboot, as the PF reset can cause the interrupt to be generated.

Place the call for PFR reset inside ice_deinit_dev(), wait until reset and all pending transactions are done, then call ice_clear_interrupt_scheme().

This introduces a PFR reset to multiple error paths.

Additionally, remove the call for the reset from ice_load() - it will be a part of ice_unload() now.

Error example:
[ 75.229328] ice 0000:ca:00.1: Failed to read Tx Scheduler Tree - User Selection data from flash
[ 77.571315] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
[ 77.571418] {1}[Hardware Error]: event severity: recoverable
[ 77.571459] {1}[Hardware Error]: Error 0, type: recoverable
[ 77.571500] {1}[Hardware Error]: section_type: PCIe error
[ 77.571540] {1}[Hardware Error]: port_type: 4, root port
[ 77.571580] {1}[Hardware Error]: version: 3.0
[ 77.571615] {1}[Hardware Error]: command: 0x0547, status: 0x4010
[ 77.571661] {1}[Hardware Error]: device_id: 0000:c9:02.0
[ 77.571703] {1}[Hardware Error]: slot: 25
[ 77.571736] {1}[Hardware Error]: secondary_bus: 0xca
[ 77.571773] {1}[Hardware Error]: vendor_id: 0x8086, device_id: 0x347a
[ 77.571821] {1}[Hardware Error]: class_code: 060400
[ 77.571858] {1}[Hardware Error]: bridge: secondary_status: 0x2800, control: 0x0013
[ 77.572490] pcieport 0000:c9:02.0: AER: aer_status: 0x00200000, aer_mask: 0x00100020
[ 77.572870] pcieport 0000:c9:02.0: [21] ACSViol (First)
[ 77.573222] pcieport 0000:c9:02.0: AER: aer_layer=Transaction Layer, aer_agent=Receiver ID
[ 77.573554] pcieport 0000:c9:02.0: AER: aer_uncor_severity: 0x00463010
[ 77.691273] {2}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
[ 77.691738] {2}[Hardware Error]: event severity: recoverable
[ 77.691971] {2}[Hardware Error]: Error 0, type: recoverable
[ 77.692192] {2}[Hardware Error]: section_type: PCIe error
[ 77.692403] {2}[Hardware Error]: port_type: 4, root port
[ 77.692616] {2}[Hardware Error]: version: 3.0
[ 77.692825] {2}[Hardware Error]: command: 0x0547, status: 0x4010
[ 77.693032] {2}[Hardware Error]: device_id: 0000:c9:02.0
[ 77.693238] {2}[Hardware Error]: slot: 25
[ 77.693440] {2}[Hardware Error]: secondary_bus: 0xca
[ 77.693641] {2}[Hardware Error]: vendor_id: 0x8086, device_id: 0x347a
[ 77.693853] {2}[Hardware Error]: class_code: 060400
[ 77.694054] {2}[Hardware Error]: bridge: secondary_status: 0x0800, control: 0x0013
[ 77.719115] pci 0000:ca:00.1: AER: can't recover (no error_detected callback)
[ 77.719140] pcieport 0000:c9:02.0: AER: device recovery failed
[ 77.719216] pcieport 0000:c9:02.0: AER: aer_status: 0x00200000, aer_mask: 0x00100020
[ 77.719390] pcieport 0000:c9:02.0: [21] ACSViol (First)
[ 77.719557] pcieport 0000:c9:02.0: AER: aer_layer=Transaction Layer, aer_agent=Receiver ID
[ 77.719723] pcieport 0000:c9:02.0: AER: aer_uncor_severity: 0x00463010

Fixes: 5b246e5 ("ice: split probe into smaller functions")
Signed-off-by: Jakub Buchocki <jakubx.buchocki@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230612171421.21570-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
1 parent d6858e1 commit 24b454b

1 file changed

Lines changed: 5 additions & 11 deletions


drivers/net/ethernet/intel/ice/ice_main.c

@@ -4802,9 +4802,13 @@ static int ice_init_dev(struct ice_pf *pf)
 static void ice_deinit_dev(struct ice_pf *pf)
 {
 	ice_free_irq_msix_misc(pf);
-	ice_clear_interrupt_scheme(pf);
 	ice_deinit_pf(pf);
 	ice_deinit_hw(&pf->hw);
+
+	/* Service task is already stopped, so call reset directly. */
+	ice_reset(&pf->hw, ICE_RESET_PFR);
+	pci_wait_for_pending_transaction(pf->pdev);
+	ice_clear_interrupt_scheme(pf);
 }
 
 static void ice_init_features(struct ice_pf *pf)
@@ -5094,10 +5098,6 @@ int ice_load(struct ice_pf *pf)
 	struct ice_vsi *vsi;
 	int err;
 
-	err = ice_reset(&pf->hw, ICE_RESET_PFR);
-	if (err)
-		return err;
-
 	err = ice_init_dev(pf);
 	if (err)
 		return err;
@@ -5354,12 +5354,6 @@ static void ice_remove(struct pci_dev *pdev)
 	ice_setup_mc_magic_wake(pf);
 	ice_set_wake(pf);
 
-	/* Issue a PFR as part of the prescribed driver unload flow. Do not
-	 * do it via ice_schedule_reset() since there is no need to rebuild
-	 * and the service task is already stopped.
-	 */
-	ice_reset(&pf->hw, ICE_RESET_PFR);
-	pci_wait_for_pending_transaction(pdev);
 	pci_disable_device(pdev);
 }
 