Oct 29 07:40:01 mypc-PC anacron[3153]: Job `cron.weekly' started
Oct 29 07:40:01 mypc-PC anacron[2414]: Updated timestamp for job `cron.weekly' to 2019-10-29
Oct 29 07:40:01 mypc-PC kernel: [1207844.240253] AMD-Vi: Event logged [IO_PAGE_FAULT device=1b:00.0 domain=0x000d address=0x0000000000000000 flags=0x0000]
Oct 29 07:40:01 mypc-PC kernel: [1207844.240260] AMD-Vi: Event logged [IO_PAGE_FAULT device=1b:00.0 domain=0x000d address=0x0000000000000400 flags=0x0000]
Oct 29 07:40:01 mypc-PC kernel: [1207844.240347] AMD-Vi: Event logged [IO_PAGE_FAULT device=1b:00.0 domain=0x000d address=0x0000000000000000 flags=0x0000]
Oct 29 07:40:01 mypc-PC kernel: [1207844.240349] AMD-Vi: Event logged [IO_PAGE_FAULT device=1b:00.0 domain=0x000d address=0x0000000000000200 flags=0x0000]
My PC kept shutting down periodically at 7:40 am JST, so I tried to figure out why.
I checked the log in /var/log/syslog. It contained many repetitions of the errors shown above.
It seems to be caused by the fstrim command, which is triggered by the cron.weekly job that anacron runs. That is why it happens at a regular interval. When I manually executed fstrim --all, the same errors were raised.
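To see how often the error fires and which device triggers it, the log can be summarised with a few standard tools. This is only a rough sketch: the function name count_page_faults is made up for illustration, and it assumes the AMD-Vi line format shown at the top of this post.

```shell
#!/bin/sh
# count_page_faults (hypothetical helper): print "<device> <count>" for every
# PCI device that logged an AMD-Vi IO_PAGE_FAULT in the given log file.
count_page_faults() {
    grep -o 'IO_PAGE_FAULT device=[0-9a-f:.]*' "$1" |
        sed 's/.*device=//' |
        sort | uniq -c |
        awk '{ print $2, $1 }'
}

# Summarise the system log if it is readable (on Ubuntu this may need sudo
# or membership in the adm group).
if [ -r /var/log/syslog ]; then
    count_page_faults /var/log/syslog
fi
```

For the four kernel lines at the top of this post, this prints 1b:00.0 4. Running lspci -s 1b:00.0 then shows which device sits at that address.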
Then I found the bug report on Bugzilla.
202665 – NVMe AMD-Vi IO_PAGE_FAULT only with hardware IOMMU and fstrim/discard
According to the report, this is a kernel bug that shows up on machines with a Ryzen CPU.
My Environment
Ubuntu 16.04 x64
kernel 4.4.0-166.195
CPU : AMD Ryzen 7 2700X Eight-Core Processor
Motherboard : X470 GAMING PLUS (MS-7B79)
M.2 SSD: Silicon Power P34A80
Easy solution
The easiest solution is to set the iommu kernel parameter to soft, for example by adding iommu=soft to the kernel command line in the GRUB defaults. I still don't know much about how the software IOMMU works, but it seems to add overhead to DMA mapping because the translation is handled in software. I would rather keep using the hardware IOMMU.
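On Ubuntu, kernel parameters are usually added to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by sudo update-grub and a reboot. Below is a minimal sketch of that edit. The helper name add_kernel_param is hypothetical, and the demo works on a scratch file rather than the real one so it can be tried safely.

```shell
#!/bin/sh
# add_kernel_param FILE PARAM (hypothetical helper): append PARAM to the
# GRUB_CMDLINE_LINUX_DEFAULT line of FILE unless it is already there.
add_kernel_param() {
    file=$1
    param=$2
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*$param" "$file"; then
        return 0    # already set, nothing to do
    fi
    # GNU sed: keep the existing value and add the parameter before the
    # closing quote.
    sed -i "s/^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"/GRUB_CMDLINE_LINUX_DEFAULT=\"\1 $param\"/" "$file"
}

# Demo on a scratch file with a typical default value (an assumption; your
# file may differ).
scratch=$(mktemp)
echo 'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"' > "$scratch"
add_kernel_param "$scratch" iommu=soft
cat "$scratch"    # GRUB_CMDLINE_LINUX_DEFAULT="quiet splash iommu=soft"
rm -f "$scratch"
```

To apply it for real, run the function as root against /etc/default/grub, then run sudo update-grub, reboot, and check that iommu=soft appears in /proc/cmdline.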
Fix the kernel bug and build a custom kernel
That was really painful, but I finally made it work. All I needed was to apply a diff like the one below to the kernel. The fix was mentioned on the bug report page.
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 6cbde30..a8bd71c 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -470,7 +470,7 @@ static blk_status_t nvme_setup_discard(struct nvme_ns *ns, struct request *req,
 	struct nvme_dsm_range *range;
 	struct bio *bio;
 
-	range = kmalloc_array(segments, sizeof(*range), GFP_ATOMIC);
+	range = kmalloc_array(256, sizeof(*range), GFP_ATOMIC);
 	if (!range)
 		return BLK_STS_RESOURCE;
 
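Before spending time on a full build, it can help to check whether a patch like this fits the source tree at all. The sketch below uses the dry-run mode of GNU patch for that. The helper name patch_applies, the KERNEL_SRC and PATCH_FILE variables, and the default source path are assumptions for illustration only.

```shell
#!/bin/sh
# patch_applies SRC_DIR PATCH_FILE (hypothetical helper): succeed only if the
# patch would apply cleanly to SRC_DIR. Nothing is modified.
patch_applies() {
    # -p1 strips the leading a/ and b/ path components used in the diff,
    # --force suppresses interactive questions, --dry-run writes no files.
    patch -d "$1" -p1 --dry-run --force --silent < "$2" > /dev/null 2>&1
}

# Example paths are assumptions; point them at your own tree and patch file.
src=${KERNEL_SRC:-$HOME/src/linux}
fix=${PATCH_FILE:-$HOME/linux-patches/ubuntu16.04/kernel4.15.0/fix-nvme_18_04.patch}
if [ -d "$src" ] && [ -f "$fix" ]; then
    if patch_applies "$src" "$fix"; then
        echo "patch applies cleanly"
    else
        echo "patch does not apply to this tree"
    fi
fi
```

A failed dry run is a hint that the target line is missing or different in that kernel version.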
However, if you are using kernel 4.4, there is no kmalloc_array call in the core.c file. So if you are on Ubuntu 16.04, you need to upgrade your kernel or find your own way.
sudo apt install linux-generic-hwe-16.04
After executing the preceding command, you can use kernel 4.15.0, which is the kernel used by Ubuntu 18.04.
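Once the machine has been rebooted into the new kernel, a quick check can confirm that the running version is at least 4.15.0. This is a small sketch: the function name kernel_at_least is made up, and it relies on sort -V from GNU coreutils.

```shell
#!/bin/sh
# kernel_at_least REQUIRED [VERSION] (hypothetical helper): succeed if VERSION
# (default: the running kernel) is REQUIRED or newer.
kernel_at_least() {
    required=$1
    current=${2:-$(uname -r)}
    # Drop the Ubuntu ABI/flavour suffix, e.g. 4.15.0-66-generic -> 4.15.0
    current=${current%%-*}
    # sort -V orders version strings; the smaller one comes first.
    lowest=$(printf '%s\n%s\n' "$required" "$current" | sort -V | head -n 1)
    [ "$lowest" = "$required" ]
}

if kernel_at_least 4.15.0; then
    echo "running kernel $(uname -r) is 4.15.0 or newer"
else
    echo "running kernel $(uname -r) is older than 4.15.0"
fi
```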
I made a Docker image to build kernel 4.15.0 for Ubuntu 16.04. Make sure the branch is ubuntu16.04-kernel4.15.0.
Docker Image
GitHub – fx-kirin/docker-ubuntu-kernel-build at ubuntu16.04-kernel4.15.0
I also tried to build kernel 4.15 for Ubuntu 18.04. The branch name is ubuntu18.04. The patch file on that branch is not ready to apply the fix, but the branch can build the kernel. So if you want to use it, please check carefully how the patch is applied in ubuntu16.04-kernel4.15.0.
Docker command
docker build -t kernel-build-16.04-4.15.0 .
docker run -it --rm -v ~/kbuild/ubuntu16.04/kernel4.15.0:/data -v ~/linux-patches/ubuntu16.04/kernel4.15.0/:/patches -e KERNEL_MAJOR=4.15.0 -e BUILD_CLEAN=Yes kernel-build-16.04-4.15.0
Put the patch file fix-nvme_18_04.patch in ~/linux-patches/ubuntu16.04/kernel4.15.0/, and then you can build the patched kernel. After that, copy all the .deb files to your PC and install them with the sudo dpkg -i *.deb command. If that fails, run sudo apt install -f and try again.
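The install-and-retry step can be wrapped in a small function. This is only a sketch of the procedure described above: the name install_kernel_debs and the RUN variable are hypothetical, and the directory argument is wherever you copied the .deb files.

```shell
#!/bin/sh
# install_kernel_debs DIR (hypothetical helper): install every .deb in DIR.
# RUN selects the command prefix: sudo by default, echo for a dry run.
install_kernel_debs() {
    dir=$1
    run=${RUN:-sudo}
    # Collect the packages; fail early if the directory holds none.
    set -- "$dir"/*.deb
    if [ ! -e "$1" ]; then
        echo "no .deb files in $dir" >&2
        return 1
    fi
    # If dpkg stops on missing dependencies, let apt fix them and retry once.
    $run dpkg -i "$@" || { $run apt install -f && $run dpkg -i "$@"; }
}

# Usage (not executed here):
#   RUN=echo install_kernel_debs /path/to/debs   # print the command only
#   install_kernel_debs /path/to/debs            # really install
```

Setting RUN=echo prints the dpkg command instead of running it, which is a cheap way to review the file list first.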
Result
After this patch, the fstrim command works perfectly. Congrats.
Note
Interestingly, if you want to use a custom version name, you have to change the version name in the changelog file, which is normally in the debian.master directory (this time it is debian.hwe because I am using HWE Ubuntu). The custom version name must not contain a hyphen (-). See here.