EC2 swap device management & fixing "swapoff failed: Cannot allocate memory"

One of the sillier things I've done as an AWS/Linux admin is provision an EBS disk as swap for an EC2 instance. I kept getting max allocate errors for a script I needed to run to execute a series of database queries. Reprovisioning to a new EC2 instance class with more RAM wasn't feasible at the time for some long-forgotten reason.

I would never do this if I owned the disks. Provisioning swap to SSD will greatly reduce the lifetime of the disk, among many other reasons why this is less than ideal. But Amazon has plenty of money. I figured I could cheaply provision an EBS volume & buy myself enough swap to complete the query. Then, at some point in the future, I could create a more beautimous solution.

Well, if you're a sysadmin you know how this story ends. I moved on to other fires/projects, quickly forgot about the swap situation, and here I am years later, deprovisioning the server, in all its swappy glory.

This wouldn't warrant a blog post, except for the fact that I received an error when trying to disable swap using "swapoff -a":

swapoff failed: Cannot allocate memory

In this case, about 750MB of swap was in use, and this tiny little EC2 Nano instance had only about 5MB of free RAM. swapoff has to move every page still held on the swap device somewhere else before it can release the device, so to detach the EBS swap device while keeping the server online, I needed a temporary place to put that data. Another option would have been to edit /etc/fstab, comment out the line for the EBS swap device, and reboot.

Another method of resolving the issue is to shift the used swap space to a temporary swap file. This does not require a reboot and allows me to reclaim the EBS device immediately. That's what I opted to do, and here is how I did that:

First, you need to find the path to the swap device or file. Because I am a dummy and used an EBS device for this purpose, I could easily find that path using the blkid command (note I am censoring command output, so your output will look different than mine):

$ sudo blkid
/dev/nvme1n1: UUID="****" TYPE="swap"

I also need to determine how much swap is in use, because the swap file I create must be at least that size. NOTE: The new swap file does not need to match the size of the old swap device, or use the same block size. That means the only number I need is the "used" figure for Swap from the "free" command:

$ free -h
              total        used        free      shared  buff/cache   available
Mem:           464M        202M        5.5M         15M        260M        233M
Swap:          8.0G        816M        7.2G
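
For anyone scripting this check, the same numbers are available in /proc/meminfo. The following is a minimal sketch, assuming a Linux /proc/meminfo with SwapTotal and SwapFree lines; the function names and the 25% headroom are illustrative assumptions, not something the steps in this post depend on:

```shell
# Print swap currently in use, in KiB, from a meminfo-style file.
# Defaults to /proc/meminfo; another path can be passed for testing.
swap_used_kib() {
    awk '/^SwapTotal:/ {total = $2}
         /^SwapFree:/  {free = $2}
         END           {printf "%d\n", total - free}' "${1:-/proc/meminfo}"
}

# Suggest a temporary swap file size in KiB: current usage plus 25% headroom.
# The 25% margin is an arbitrary choice, not a kernel requirement.
suggested_swapfile_kib() {
    echo $(( $1 * 125 / 100 ))
}
```

Running suggested_swapfile_kib "$(swap_used_kib)" prints a size in KiB, which maps directly onto a dd count when bs=1024.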

I then proceed to create the temporary swap file. I went a bit bigger than I needed to, creating a 1GB swap file. It's generally a good idea to give a bit of wiggle room here to be on the safe side. Note how in the dd command below, I am specifying a count of 1,024,000 blocks, each 1024 bytes in size. Multiply those numbers and you get 1048576000 bytes, or roughly 1GB:

$ sudo dd if=/dev/zero of=/home/swap bs=1024 count=1024000
1024000+0 records in
1024000+0 records out
1048576000 bytes (1.0 GB) copied, 6.8052 s, 154 MB/s
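
The arithmetic is easy to double-check in the shell itself. This small sketch only uses the block size and count from the dd command above; it also shows that the result is exactly 1000 MiB, a little short of a binary gigabyte, which is why the swap tools later report a size just under 1G:

```shell
# Bytes written by dd = block size * block count
bs=1024
count=1024000
bytes=$(( bs * count ))
echo "$bytes bytes"                          # 1048576000 bytes
echo "$(( bytes / 1024 / 1024 )) MiB"        # 1000 MiB

# Block count needed for a full 1 GiB (1024 MiB) at the same block size
echo "$(( 1024 * 1024 * 1024 / bs )) blocks" # 1048576 blocks
```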

Just to confirm blocksize isn't required to match, my old EBS swap device used blocks half that size (there are other situations where blocksize is very relevant, just not here):

$ sudo fdisk -l /dev/nvme1n1
Disk /dev/nvme1n1: 8589 MB, 8589934592 bytes, 16777216 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Lock down permissions for the new swap file:

$ sudo chmod 0600 /home/swap

Use 'mkswap' to properly configure the file for swapping:

$ sudo mkswap /home/swap
Setting up swapspace version 1, size = 1023996 KiB
no label, UUID=1f3f78ed-8e5d-4672-bbd1-59e6f29c8b08

And tell the host to use the new file with 'swapon':

$ sudo swapon /home/swap

Unless you use the -v flag, a successful swapon will not return output to the command line. It's still a good idea to check the "free" command to make sure that the new swap has been added:

$ free -h
              total        used        free      shared  buff/cache   available
Mem:           464M        198M        5.5M         15M        260M        237M
Swap:          9.0G        734M        8.3G

Notice the total swap amount: 9.0G. This number includes the 1GB file we just created as well as the 8GB swap device that already existed. The "free" command reports only the combined total; it does not list the two swap areas separately, just as it would not list two sticks of RAM separately.
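
The per-area breakdown that "free" hides is available in /proc/swaps (the same data that "swapon -s" prints). Below is a rough sketch, assuming the usual column layout of Filename, Type, Size, Used, Priority with sizes in KiB. Both function names are hypothetical, and the pre-check is only a heuristic: it ignores free RAM and kernel tuning, so a "yes" is a hint rather than a guarantee that swapoff will succeed.

```shell
# List each active swap area as "name type size_kib used_kib".
# Reads /proc/swaps by default; pass another path to test against a sample.
list_swap_areas() {
    awk 'NR > 1 {printf "%s %s %d %d\n", $1, $2, $3, $4}' "${1:-/proc/swaps}"
}

# Heuristic pre-check before "swapoff <area>": do the pages in use on that
# area fit into the free space of the remaining swap areas?
# $1 = swap area name as listed in /proc/swaps, $2 = optional sample file.
fits_in_other_swap() {
    awk -v target="$1" '
        NR > 1 && $1 == target {used = $4}
        NR > 1 && $1 != target {room += $3 - $4}
        END {if (used + 0 <= room + 0) print "yes"; else print "no"}
    ' "${2:-/proc/swaps}"
}
```

On the host above, fits_in_other_swap /dev/nvme1n1 would compare the roughly 734M in use on the EBS device against the empty 1GB swap file.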

Finally, we are ready to disable the old swap device. Here I am using the -v flag to look out for additional information in the event of an error. But the command was successful, so swapoff simply prints back my command when it completes: 

$ sudo swapoff -v /dev/nvme1n1
swapoff /dev/nvme1n1

Running the free command again, we can see that the total Swap value has been reduced by about 8GB:

$ free -h
              total        used        free      shared  buff/cache   available
Mem:           464M        390M        5.9M         27M         67M         34M
Swap:          999M        524M        475M

Looks good. I'm not quite done yet, though. Even though I don't plan to reboot immediately, I still need to remove the /etc/fstab entry for the old swap device. I used a text editor to add a hash symbol (#) to the line relevant to the swap device:

#/dev/nvme1n1                                none                    swap    sw              0 0
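
The same edit can be scripted instead of done by hand. This is a sketch only, assuming a conventional whitespace-separated fstab where the third field is the filesystem type; the function name is hypothetical. It writes the modified file to standard output rather than editing in place, so the result can be reviewed before it replaces /etc/fstab:

```shell
# Print an fstab-style file with the swap entry for one device commented out.
# $1 = device exactly as written in the first fstab field
#      (for example /dev/nvme1n1, or UUID=... if the entry uses a UUID)
# $2 = fstab path (default /etc/fstab)
comment_out_swap_entry() {
    awk -v dev="$1" '
        $1 == dev && $3 == "swap" {print "#" $0; next}
        {print}
    ' "${2:-/etc/fstab}"
}
```

For example, comment_out_swap_entry /dev/nvme1n1 > /tmp/fstab.new followed by a diff against /etc/fstab shows exactly the one changed line before anything is copied into place.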

Before taking the final step of running "swapoff" on the new swap file and deleting it, take a moment to double-check the memory settings of any RAM-hungry applications.

Running MySQL? Check out your innodb_buffer_pool_size & innodb_log_file_size settings in my.cnf (or my.cnf.d/server.cnf). 

Running PHP CGIs? Check out your memory_limit settings. 
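
If it isn't obvious which applications are the hungry ones, the kernel reports swapped-out memory per process as a VmSwap line in /proc/<pid>/status. The sketch below assumes that line is present (kernel threads don't have one); the function name is hypothetical, and process names containing spaces are cut at the first word:

```shell
# Print "swap_kib process_name pid" for every process with swapped-out pages,
# largest first. $1 is the proc root (default /proc), overridable for testing.
swap_by_process() {
    for status in "${1:-/proc}"/[0-9]*/status; do
        [ -r "$status" ] || continue
        # Processes can exit mid-loop, so read errors are ignored.
        awk '/^Name:/   {name = $2}
             /^Pid:/    {pid = $2}
             /^VmSwap:/ {swap = $2}
             END        {if (swap > 0) printf "%d %s %s\n", swap, name, pid}
        ' "$status" 2>/dev/null || true
    done | sort -rn
}
```

Run as root (for example with sudo sh -c) to see every process; as a regular user, some status files may not be readable.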

In my case, removing a few unused databases and resetting innodb_buffer_pool_size to reflect the current available memory made an enormous difference. Here is the output of "free" immediately after those changes; those changes freed up over 500MB of memory (RAM and swap combined) ... not bad for a system with, well, 500MB of RAM.

$ free -h
              total        used        free      shared  buff/cache   available
Mem:           464M        238M         77M         14M        149M        199M
Swap:          999M         84M        915M

But there are memory-relevant settings in the kernel itself, too. Check out the kernel's swappiness value as shown below:

$ sysctl vm.swappiness
vm.swappiness = 10

The higher the swappiness value, the more inclined the kernel is to swap memory out of RAM. You want a low value here if you plan to disable swap.

You will also want to check out the vm.vfs_cache_pressure kernel setting. This value controls how readily the kernel reclaims the memory it uses to cache directory and inode objects (dentries and inodes). From the kernel documentation:

At the default value of vfs_cache_pressure=100 the kernel will attempt to reclaim dentries and inodes at a "fair" rate with respect to pagecache and swapcache reclaim. Decreasing vfs_cache_pressure causes the kernel to prefer to retain dentry and inode caches. When vfs_cache_pressure=0, the kernel will never reclaim dentries and inodes due to memory pressure and this can easily lead to out-of-memory conditions. Increasing vfs_cache_pressure beyond 100 causes the kernel to prefer to reclaim dentries and inodes. Increasing vfs_cache_pressure significantly beyond 100 may have negative performance impact. Reclaim code needs to take various locks to find freeable directory and inode objects. With vfs_cache_pressure=1000, it will look for ten times more freeable objects than there are.

As it turns out, I was unable to "swapoff" the temporary swap file until I modified these values, which had been customized as part of MariaDB performance tuning that long ago stopped being relevant:

# swapoff /home/swap
swapoff: /home/swap: swapoff failed: Cannot allocate memory

Typically, permanent changes to kernel parameters are made in /etc/sysctl.conf. I modified my own values as follows:

# changed from a value of 10
vm.swappiness=1
# changed from a value of 200
vm.vfs_cache_pressure=100
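
Editing /etc/sysctl.conf by itself does not change the running kernel; the values take effect when the file is loaded (at boot, or with "sysctl -p") or when they are set directly with "sysctl -w". Here is a small sketch of checking and applying the two settings at runtime. The helper names are hypothetical; the /proc/sys/vm paths and the sysctl flags are standard:

```shell
# Print the current values of the two settings discussed above.
# $1 is the directory holding them (default /proc/sys/vm), overridable for testing.
show_vm_settings() {
    for key in swappiness vfs_cache_pressure; do
        printf 'vm.%s = %s\n' "$key" "$(cat "${1:-/proc/sys/vm}/$key")"
    done
}

# Apply new values to the running kernel without a reboot (requires root).
# These do not persist on their own; /etc/sysctl.conf covers the next boot.
# $1 = swappiness, $2 = vfs_cache_pressure
apply_vm_settings() {
    sudo sysctl -w vm.swappiness="$1"
    sudo sysctl -w vm.vfs_cache_pressure="$2"
}
```

With the values from this post, that would be apply_vm_settings 1 100, followed by show_vm_settings to confirm. Running sudo sysctl -p instead loads everything in /etc/sysctl.conf in one go.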

So a quick review: at this point, all swaps have been disabled, and thanks to the tweaks above, the host no longer requires swap to meet its memory requirements, as can be seen in free:

# free -h
              total        used        free      shared  buff/cache   available
Mem:           464M        282M         66M         46M        116M        123M
Swap:            0B          0B          0B

However, I am still paying Amazon for that silly EBS device. If we just absent-mindedly follow the Amazon documentation for detaching EBS devices, we would find the device name for the swap volume and umount it.

In the AWS console, the smaller 8GB volume is listed as /dev/sdf - that is the swap device. But if we try to umount that device, Linux cannot find it:

$ sudo umount -d /dev/sdf
umount: /dev/sdf: mountpoint not found

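This is partly because Amazon never planned for users to do something as silly as use an EBS volume as swap, but mostly it comes down to naming: the volume that the console calls /dev/sdf shows up inside Linux as an NVMe device, /dev/nvme1n1, which is the name our earlier commands used. Swap is also never mounted in the filesystem sense, so if we try to umount that device, there is nothing to unmount: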

# umount -d /dev/nvme1n1
umount: /dev/nvme1n1: not mounted

The point is, swapoff already did the equivalent of unmounting when we disabled the swap device. So we can skip directly to detaching the volume using the AWS console. Be careful to select the correct Volume ID (it's a good idea to use a Name tag to avoid mistakes).
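
One way to tie the kernel's device name back to a Volume ID: for EBS volumes exposed as NVMe devices, the device serial number is the volume ID without its hyphen. The sketch below assumes that convention holds on your instance and that sysfs exposes the serial at /sys/block/<device>/device/serial; the function name is hypothetical.

```shell
# Print the EBS volume ID for an NVMe block device name such as nvme1n1.
# $1 = device name without /dev/, $2 = sysfs block directory
#      (default /sys/block, overridable for testing).
nvme_volume_id() {
    # The serial may be padded with whitespace; strip it before reformatting.
    serial=$(tr -d '[:space:]' < "${2:-/sys/block}/$1/device/serial")
    # Turn "vol0123..." into "vol-0123..." (leave an existing hyphen alone).
    printf '%s\n' "$serial" | sed 's/^vol-\{0,1\}/vol-/'
}
```

Comparing the output of nvme_volume_id nvme1n1 with the Volume ID column in the console is a cheap sanity check before detaching. The detach itself can also be done from the AWS CLI with "aws ec2 detach-volume --volume-id" and that ID.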

And that's it. What could be easier?  :/