Commit 4eeffc7

Myron Stowe committed
PCI: Extend ACS configurability
JIRA: https://issues.redhat.com/browse/RHEL-48601
Upstream Status: 47c8846

commit 47c8846
Author: Vidya Sagar <[email protected]>
Date: Tue Jun 25 21:01:50 2024 +0530

PCI: Extend ACS configurability

PCIe ACS settings control the level of isolation and the possible P2P paths between devices. With greater isolation the kernel will create smaller iommu_groups and with less isolation there is more HW that can achieve P2P transfers. From a virtualization perspective all devices in the same iommu_group must be assigned to the same VM as they lack security isolation.

There is no way for the kernel to automatically know the correct ACS settings for any given system and workload. Existing command line options (e.g., disable_acs_redir) allow only for large scale change, disabling all isolation, but this is not sufficient for more complex cases.

Add a kernel command-line option 'config_acs' to directly control all the ACS bits for specific devices, which allows the operator to setup the right level of isolation to achieve the desired P2P configuration. The definition is future proof; when new ACS bits are added to the spec the open syntax can be extended.

ACS needs to be setup early in the kernel boot as the ACS settings affect how iommu_groups are formed. iommu_group formation is a one time event during initial device discovery, so changing ACS bits after kernel boot can result in an inaccurate view of the iommu_groups compared to the current isolation configuration.

ACS applies to PCIe Downstream Ports and multi-function devices. The default ACS settings are strict and deny any direct traffic between two functions. This results in the smallest iommu_group the HW can support. Frequently these values result in slow or non-working P2PDMA.

ACS offers a range of security choices controlling how traffic is allowed to go directly between two devices. Some popular choices:

- Full prevention
- Translated requests can be direct, with various options
- Asymmetric direct traffic, A can reach B but not the reverse
- All traffic can be direct

Along with some other less common ones for special topologies.

The intention is that this option would be used with expert knowledge of the HW capability and workload to achieve the desired configuration.

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Vidya Sagar <[email protected]>
[bhelgaas: add example, tidy printk formats]
Signed-off-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Myron Stowe <[email protected]>
1 parent 2e069ad commit 4eeffc7

2 files changed: +122 -58 lines changed

Documentation/admin-guide/kernel-parameters.txt

Lines changed: 32 additions & 0 deletions
@@ -4463,6 +4463,38 @@
 				bridges without forcing it upstream. Note:
 				this removes isolation between devices and
 				may put more devices in an IOMMU group.
+		config_acs=
+				Format:
+				<ACS flags>@<pci_dev>[; ...]
+				Specify one or more PCI devices (in the format
+				specified above) optionally prepended with flags
+				and separated by semicolons. The respective
+				capabilities will be enabled, disabled or
+				unchanged based on what is specified in
+				flags.
+
+				ACS Flags is defined as follows:
+				  bit-0 : ACS Source Validation
+				  bit-1 : ACS Translation Blocking
+				  bit-2 : ACS P2P Request Redirect
+				  bit-3 : ACS P2P Completion Redirect
+				  bit-4 : ACS Upstream Forwarding
+				  bit-5 : ACS P2P Egress Control
+				  bit-6 : ACS Direct Translated P2P
+				Each bit can be marked as:
+				  '0' – force disabled
+				  '1' – force enabled
+				  'x' – unchanged
+				For example,
+				  pci=config_acs=10x
+				would configure all devices that support
+				ACS to enable P2P Request Redirect, disable
+				Translation Blocking, and leave Source
+				Validation unchanged from whatever power-up
+				or firmware set it to.
+
+				Note: this may remove isolation between devices
+				and may put more devices in an IOMMU group.
 		force_floating [S390] Force usage of floating interrupts.
 		nomio [S390] Do not use MIO instructions.
 		norid [S390] ignore the RID field and force use of
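
The flag syntax documented in this hunk reads right to left: the last character before '@' is bit-0. A minimal standalone sketch of that mapping, modeled on the parsing loop this commit adds to drivers/pci/pci.c, might look like the following. The names parse_acs_flags and ACS_* are hypothetical and exist only for illustration; the bit values follow the bit-0 to bit-6 list above, the 16-character guard is an addition of this sketch, and the device string after '@' is not interpreted here.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative bit values, following the bit-0..bit-6 list in the docs. */
#define ACS_SV  0x01	/* Source Validation */
#define ACS_TB  0x02	/* Translation Blocking */
#define ACS_RR  0x04	/* P2P Request Redirect */
#define ACS_CR  0x08	/* P2P Completion Redirect */
#define ACS_UF  0x10	/* Upstream Forwarding */
#define ACS_EC  0x20	/* P2P Egress Control */
#define ACS_DT  0x40	/* Direct Translated P2P */
#define ACS_ALL 0x7f

/*
 * Hypothetical helper: turn the characters before '@' into a mask (bits
 * the user wants to control) and flags (bits the user wants set).
 * The rightmost character corresponds to bit-0.
 * Returns 0 on success, -1 on a malformed flag string.
 */
int parse_acs_flags(const char *s, uint16_t *mask, uint16_t *flags)
{
	const char *at = strchr(s, '@');
	unsigned int shift = 0;
	int end;

	*mask = 0;
	*flags = 0;
	if (!at)
		return -1;		/* no '@': flags are missing */

	for (end = (int)(at - s) - 1; end >= 0; end--, shift++) {
		if (shift >= 16)
			return -1;	/* guard added by this sketch */
		switch (s[end]) {
		case '0':		/* controlled, not set */
			*mask |= (uint16_t)(1u << shift);
			break;
		case '1':		/* controlled and set */
			*mask |= (uint16_t)(1u << shift);
			*flags |= (uint16_t)(1u << shift);
			break;
		case 'x':
		case 'X':		/* left out of the mask */
			break;
		default:
			return -1;	/* invalid flag character */
		}
	}

	/* Reject bits beyond the seven documented ones. */
	if (*mask & ~ACS_ALL)
		return -1;
	return 0;
}
```

Under these assumptions, calling parse_acs_flags("10x@00:1c.0", &mask, &flags) yields mask 0x06 and flags 0x04: Request Redirect is marked to be set, Translation Blocking is marked as controlled but not set, and Source Validation stays out of the mask. The device string 00:1c.0 is only a placeholder.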

drivers/pci/pci.c

Lines changed: 90 additions & 58 deletions
@@ -946,30 +946,67 @@ void pci_request_acs(void)
 }
 
 static const char *disable_acs_redir_param;
+static const char *config_acs_param;
 
-/**
- * pci_disable_acs_redir - disable ACS redirect capabilities
- * @dev: the PCI device
- *
- * For only devices specified in the disable_acs_redir parameter.
- */
-static void pci_disable_acs_redir(struct pci_dev *dev)
+struct pci_acs {
+	u16 cap;
+	u16 ctrl;
+	u16 fw_ctrl;
+};
+
+static void __pci_config_acs(struct pci_dev *dev, struct pci_acs *caps,
+			     const char *p, u16 mask, u16 flags)
 {
+	char *delimit;
 	int ret = 0;
-	const char *p;
-	int pos;
-	u16 ctrl;
 
-	if (!disable_acs_redir_param)
+	if (!p)
 		return;
 
-	p = disable_acs_redir_param;
 	while (*p) {
+		if (!mask) {
+			/* Check for ACS flags */
+			delimit = strstr(p, "@");
+			if (delimit) {
+				int end;
+				u32 shift = 0;
+
+				end = delimit - p - 1;
+
+				while (end > -1) {
+					if (*(p + end) == '0') {
+						mask |= 1 << shift;
+						shift++;
+						end--;
+					} else if (*(p + end) == '1') {
+						mask |= 1 << shift;
+						flags |= 1 << shift;
+						shift++;
+						end--;
+					} else if ((*(p + end) == 'x') || (*(p + end) == 'X')) {
+						shift++;
+						end--;
+					} else {
+						pci_err(dev, "Invalid ACS flags... Ignoring\n");
+						return;
+					}
+				}
+				p = delimit + 1;
+			} else {
+				pci_err(dev, "ACS Flags missing\n");
+				return;
+			}
+		}
+
+		if (mask & ~(PCI_ACS_SV | PCI_ACS_TB | PCI_ACS_RR | PCI_ACS_CR |
+			    PCI_ACS_UF | PCI_ACS_EC | PCI_ACS_DT)) {
+			pci_err(dev, "Invalid ACS flags specified\n");
+			return;
+		}
+
 		ret = pci_dev_str_match(dev, p, &p);
 		if (ret < 0) {
-			pr_info_once("PCI: Can't parse disable_acs_redir parameter: %s\n",
-				     disable_acs_redir_param);
-
+			pr_info_once("PCI: Can't parse ACS command line parameter\n");
 			break;
 		} else if (ret == 1) {
 			/* Found a match */
@@ -989,56 +1026,38 @@ static void pci_disable_acs_redir(struct pci_dev *dev)
 	if (!pci_dev_specific_disable_acs_redir(dev))
 		return;
 
-	pos = dev->acs_cap;
-	if (!pos) {
-		pci_warn(dev, "cannot disable ACS redirect for this hardware as it does not have ACS capabilities\n");
-		return;
-	}
-
-	pci_read_config_word(dev, pos + PCI_ACS_CTRL, &ctrl);
+	pci_dbg(dev, "ACS mask = %#06x\n", mask);
+	pci_dbg(dev, "ACS flags = %#06x\n", flags);
 
-	/* P2P Request & Completion Redirect */
-	ctrl &= ~(PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_EC);
+	/* If mask is 0 then we copy the bit from the firmware setting. */
+	caps->ctrl = (caps->ctrl & ~mask) | (caps->fw_ctrl & mask);
+	caps->ctrl |= flags;
 
-	pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl);
-
-	pci_info(dev, "disabled ACS redirect\n");
+	pci_info(dev, "Configured ACS to %#06x\n", caps->ctrl);
 }
 
 /**
  * pci_std_enable_acs - enable ACS on devices using standard ACS capabilities
  * @dev: the PCI device
+ * @caps: default ACS controls
  */
-static void pci_std_enable_acs(struct pci_dev *dev)
+static void pci_std_enable_acs(struct pci_dev *dev, struct pci_acs *caps)
 {
-	int pos;
-	u16 cap;
-	u16 ctrl;
-
-	pos = dev->acs_cap;
-	if (!pos)
-		return;
-
-	pci_read_config_word(dev, pos + PCI_ACS_CAP, &cap);
-	pci_read_config_word(dev, pos + PCI_ACS_CTRL, &ctrl);
-
 	/* Source Validation */
-	ctrl |= (cap & PCI_ACS_SV);
+	caps->ctrl |= (caps->cap & PCI_ACS_SV);
 
 	/* P2P Request Redirect */
-	ctrl |= (cap & PCI_ACS_RR);
+	caps->ctrl |= (caps->cap & PCI_ACS_RR);
 
 	/* P2P Completion Redirect */
-	ctrl |= (cap & PCI_ACS_CR);
+	caps->ctrl |= (caps->cap & PCI_ACS_CR);
 
 	/* Upstream Forwarding */
-	ctrl |= (cap & PCI_ACS_UF);
+	caps->ctrl |= (caps->cap & PCI_ACS_UF);
 
 	/* Enable Translation Blocking for external devices and noats */
 	if (pci_ats_disabled() || dev->external_facing || dev->untrusted)
-		ctrl |= (cap & PCI_ACS_TB);
-
-	pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl);
+		caps->ctrl |= (caps->cap & PCI_ACS_TB);
 }
 
 /**
@@ -1047,23 +1066,33 @@ static void pci_std_enable_acs(struct pci_dev *dev)
  */
 static void pci_enable_acs(struct pci_dev *dev)
 {
-	if (!pci_acs_enable)
-		goto disable_acs_redir;
+	struct pci_acs caps;
+	int pos;
+
+	pos = dev->acs_cap;
+	if (!pos)
+		return;
 
-	if (!pci_dev_specific_enable_acs(dev))
-		goto disable_acs_redir;
+	pci_read_config_word(dev, pos + PCI_ACS_CAP, &caps.cap);
+	pci_read_config_word(dev, pos + PCI_ACS_CTRL, &caps.ctrl);
+	caps.fw_ctrl = caps.ctrl;
 
-	pci_std_enable_acs(dev);
+	/* If an iommu is present we start with kernel default caps */
+	if (pci_acs_enable) {
+		if (pci_dev_specific_enable_acs(dev))
+			pci_std_enable_acs(dev, &caps);
+	}
 
-disable_acs_redir:
 	/*
-	 * Note: pci_disable_acs_redir() must be called even if ACS was not
-	 * enabled by the kernel because it may have been enabled by
-	 * platform firmware. So if we are told to disable it, we should
-	 * always disable it after setting the kernel's default
-	 * preferences.
+	 * Always apply caps from the command line, even if there is no iommu.
+	 * Trust that the admin has a reason to change the ACS settings.
 	 */
-	pci_disable_acs_redir(dev);
+	__pci_config_acs(dev, &caps, disable_acs_redir_param,
+			 PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_EC,
+			 ~(PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_EC));
+	__pci_config_acs(dev, &caps, config_acs_param, 0, 0);
+
+	pci_write_config_word(dev, pos + PCI_ACS_CTRL, caps.ctrl);
 }
 
 /**
@@ -6724,6 +6753,8 @@ static int __init pci_setup(char *str)
 				pci_add_flags(PCI_SCAN_ALL_PCIE_DEVS);
 			} else if (!strncmp(str, "disable_acs_redir=", 18)) {
 				disable_acs_redir_param = str + 18;
+			} else if (!strncmp(str, "config_acs=", 11)) {
+				config_acs_param = str + 11;
 			} else {
 				pr_err("PCI: Unknown option `%s'\n", str);
 			}
@@ -6748,6 +6779,7 @@ static int __init pci_realloc_setup_params(void)
 	resource_alignment_param = kstrdup(resource_alignment_param,
 					   GFP_KERNEL);
 	disable_acs_redir_param = kstrdup(disable_acs_redir_param, GFP_KERNEL);
+	config_acs_param = kstrdup(config_acs_param, GFP_KERNEL);
 
 	return 0;
 }
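
To make the arithmetic in __pci_config_acs() easier to follow, here is a minimal user-space sketch of the two statements that combine the parsed mask and flags with the control word. It copies those two statements as they appear in this diff; struct acs_state and apply_acs are hypothetical stand-ins for struct pci_acs and the kernel function, and the register values in the usage note are made-up inputs, not readings from real hardware.

```c
#include <stdint.h>

/* Hypothetical user-space stand-in for struct pci_acs from the diff. */
struct acs_state {
	uint16_t cap;		/* ACS capability bits */
	uint16_t ctrl;		/* control word being assembled */
	uint16_t fw_ctrl;	/* control word as left by firmware */
};

/*
 * Mirrors the two assignments in the diff:
 *   1. bits inside the mask are replaced by the firmware value,
 *      bits outside the mask keep the current value;
 *   2. every bit set in flags is then ORed in.
 */
void apply_acs(struct acs_state *st, uint16_t mask, uint16_t flags)
{
	st->ctrl = (uint16_t)((st->ctrl & ~mask) | (st->fw_ctrl & mask));
	st->ctrl |= flags;
}
```

With mask 0x06 and flags 0x04 (the values a flag string of 10x parses to), a control word of 0x1d and a firmware value of 0x00 give 0x1d: Request Redirect stays set and Translation Blocking stays clear. If firmware had left Translation Blocking set (fw_ctrl 0x02, ctrl 0x1f), the result is 0x1f, so in this sketch a bit marked '0' takes the firmware value rather than being cleared. Whether that matches the documented "force disabled" behavior is worth checking against the kernel tree in use.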
