• 6 Posts
  • 40 Comments
Joined 2 years ago
Cake day: June 14th, 2023

  • What troubleshooting steps have you taken so far? I would try these:

    • a different OS, e.g. a live USB running Fedora or Ubuntu, if the workload where this appears can be reproduced there
    • a BIOS reset to defaults: no overclocking, not even XMP
    • a memory test; either the Memtest86+ boot ISO or the userspace memtester tool can catch obvious errors
    • a long SMART self-test on the OS drive, plus an fsck or scrub depending on the filesystem (rough commands sketched below)
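
    A minimal sketch of those disk and memory checks from a running system (the /dev/sda device name and the 4 GiB test size are assumptions, adjust to your setup):

    ```bash
    # userspace memory test: lock and test 4 GiB of RAM for one pass
    sudo memtester 4G 1

    # long SMART self-test on the OS drive, then check the result once it finishes
    sudo smartctl -t long /dev/sda
    sudo smartctl -a /dev/sda

    # filesystem check, depending on the FS
    sudo btrfs scrub start -B /      # btrfs: online scrub of the mounted filesystem
    # sudo fsck.ext4 -f /dev/sda2    # ext4: run from a live USB with the partition unmounted
    ```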

    Also, the logs show a very old NVIDIA GPU that is not supported by the current driver. I don’t know whether this can cause crashes; I haven’t used one in ages, maybe someone else has more insight.

  • The FS feature is great, it’s just cumbersome to use without a tool.

    Snapper works well as a local, backup-like history against both botched updates and accidental deletion, but it eats up the free space with the default settings (see the sketch below).

    Timeshift is an easy-to-use GUI, but it doesn’t support non-default subvolume layouts.

    Also, the quota support had a nasty side effect: it froze the whole system on snapshot deletion.
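
    A minimal sketch of the Snapper tuning I mean (the `root` config name is the usual default; the values are illustrative, not recommendations):

    ```bash
    # keep fewer timeline snapshots so they stop eating the free space
    sudo snapper -c root set-config \
        TIMELINE_LIMIT_HOURLY=5 \
        TIMELINE_LIMIT_DAILY=7 \
        TIMELINE_LIMIT_WEEKLY=0 \
        TIMELINE_LIMIT_MONTHLY=0 \
        TIMELINE_LIMIT_YEARLY=0

    # with btrfs quotas enabled, cap snapshot usage at 50% of the filesystem
    sudo snapper -c root set-config SPACE_LIMIT=0.5
    ```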


  • I think calling it a “cache” is not precise. The primary function of the DRAM is to hold the mapping table that translates logical addresses (e.g. sectors) from the OS to physical addresses (which NAND chip, which block, etc.). This indirection is what lets the controller do wear leveling without corrupting the filesystem.

    On a SATA SSD without DRAM, each read IO can mean two actual reads: first the mapping table to locate the data, then the data itself. As you said, HMB helps (on NVMe drives) by eliminating this extra read.
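
    A toy sketch of that indirection (nothing like real firmware, just the two-step lookup; the sector and NAND location names are made up):

    ```bash
    #!/usr/bin/env bash
    # toy logical-to-physical table: maps a sector to a made-up NAND location
    declare -A l2p=(
      [sector_0]="chip0/block12/page3"
      [sector_1]="chip1/block05/page9"
    )

    read_sector() {
      # step 1: look up where the sector actually lives
      #         (on a DRAM-less drive this lookup is itself a NAND read)
      local phys=${l2p[$1]}
      # step 2: read the data from that physical location
      echo "read $1 -> $phys"
    }

    read_sector sector_0
    ```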

    The read and write caching is just a use of the remaining DRAM capacity. Since modern operating systems use main RAM for the same purpose, it usually adds only a small throughput increase.

  • Thanks for the links! I updated my config from z3fold to zsmalloc and adjusted vm.page-cluster to test these out.

    Reading a bit more, I think that with a large max_pool_percent (>30), Zswap and Zram are more similar than not. A crucial difference is which worst case is more acceptable for the use-case: Zswap can cause unresponsiveness (and a potential lockup) under high memory pressure, while Zram could result in an OOM kill in a similar scenario.
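
    For anyone else trying this, a sketch of the runtime knobs involved (values are just what I’m testing, and they reset on reboot; persist them via the kernel cmdline or sysctl.d if they work out):

    ```bash
    # switch the zswap compressed-pool allocator to zsmalloc
    echo zsmalloc | sudo tee /sys/module/zswap/parameters/zpool

    # let zswap use up to 30% of RAM for the compressed pool
    echo 30 | sudo tee /sys/module/zswap/parameters/max_pool_percent

    # swap readahead: 0 disables grouping, which tends to suit compressed swap
    sudo sysctl vm.page-cluster=0
    ```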