Understanding memory mapping
The speed at which application instructions are processed on a system depends on the number of access operations required to obtain data from outside program-addressable memory: the more such operations are needed, the slower the processing.
The system provides two methods for reducing the transactional overhead associated with these external read and write operations. You can map file data into the process address space. You can also map processes to anonymous memory regions that may be shared by cooperating processes.
Memory mapped files provide a mechanism for a process to access files by directly incorporating file data into the process address space. The use of mapped files can significantly reduce I/O data movement since the file data does not have to be copied into process data buffers, as is done by the read and write subroutines. When more than one process maps the same file, its contents are shared among them, providing a low-overhead mechanism by which processes can synchronize and communicate.
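As a minimal sketch of the idea above, the following C function reads a file through a mapping instead of through a read buffer. The helper name `count_byte_mapped` is hypothetical and exists only for illustration; the calls it uses (`open`, `fstat`, `mmap`, `munmap`) are the standard ones.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: count how often byte `c` occurs in a file by
 * scanning a read-only mapping of it. Returns -1 on error. */
long count_byte_mapped(const char *path, char c)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    struct stat st;
    if (fstat(fd, &st) == -1) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* a zero-length mapping is not valid */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    const char *data = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping outlives the descriptor */
    if (data == MAP_FAILED)
        return -1;

    long hits = 0;
    for (size_t i = 0; i < len; i++)    /* plain pointer access, no read() */
        if (data[i] == c)
            hits++;

    munmap((void *)data, len);
    return hits;
}
```

A caller might use `count_byte_mapped("notes.txt", '\n')` to count lines without copying the file into a process buffer.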
Mapped memory regions, also called shared memory areas, can serve as a large pool for exchanging data among processes. The available subroutines do not provide locks or access control among the processes. Therefore, processes using shared memory areas must set up a signal or semaphore control method to prevent access conflicts and to keep one process from changing data that another is using. Shared memory areas can be most beneficial when the amount of data to be exchanged between processes is too large to transfer with messages, or when many processes maintain a common large database.
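The need for explicit access control can be illustrated with a short sketch. It assumes a System V semaphore (`semget`, `semop`) as the control method and a private shared segment obtained with `shmget`; the names `guarded_total`, `add_guarded`, and `union sem_arg` are hypothetical. A parent and a child each increment a shared counter, and the semaphore keeps the increments from interleaving.

```c
#include <assert.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Argument type for semctl; many systems expect the caller to define it. */
union sem_arg { int val; struct semid_ds *buf; unsigned short *array; };

/* Increment the shared counter `times` times, holding the semaphore
 * around every update. */
static void add_guarded(int semid, long *counter, int times)
{
    struct sembuf lock   = { .sem_num = 0, .sem_op = -1, .sem_flg = 0 };
    struct sembuf unlock = { .sem_num = 0, .sem_op =  1, .sem_flg = 0 };

    for (int i = 0; i < times; i++) {
        semop(semid, &lock, 1);
        (*counter)++;
        semop(semid, &unlock, 1);
    }
}

/* Parent and child each add `times` to one shared counter.
 * Returns the final value, or -1 on error. */
long guarded_total(int times)
{
    assert(times >= 0);

    int shmid = shmget(IPC_PRIVATE, sizeof(long), IPC_CREAT | 0600);
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    long *counter = (shmid == -1) ? (void *)-1 : shmat(shmid, NULL, 0);
    union sem_arg one = { .val = 1 };   /* semaphore starts unlocked */
    long total = -1;

    if (semid != -1 && counter != (void *)-1 &&
        semctl(semid, 0, SETVAL, one) != -1) {
        *counter = 0;
        pid_t pid = fork();
        if (pid == 0) {                 /* child shares the attached segment */
            add_guarded(semid, counter, times);
            _exit(0);
        }
        if (pid > 0) {
            add_guarded(semid, counter, times);
            waitpid(pid, NULL, 0);
            total = *counter;
        }
    }

    /* Release everything that was acquired. */
    if (counter != (void *)-1)
        shmdt(counter);
    if (shmid != -1)
        shmctl(shmid, IPC_RMID, NULL);
    if (semid != -1)
        semctl(semid, 0, IPC_RMID);
    return total;
}
```

Without the `semop` calls around the increment, the two processes could overwrite each other's updates and the total would be unpredictable.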
The system provides two methods for mapping files and anonymous memory regions. The following subroutines, known collectively as the shmat services, are typically used to create and use shared memory segments from a program:
Subroutine | Definition |
---|---|
shmctl | Controls shared memory operations |
shmget | Gets or creates a shared memory segment |
shmat | Attaches a shared memory segment to a process. Does not allow you to map block devices. |
shmdt | Detaches a shared memory segment from a process |
mprotect | Modifies the access protections of a specified address range within a shared memory segment. |
disclaim | Removes a mapping from a specified address range within a shared memory segment |
The ftok subroutine provides the key that the shmget subroutine uses to create the shared segment.
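The typical call sequence for these services can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the helper name `shm_roundtrip`, the 4096-byte segment size, and the choice to remove the segment immediately are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper: create a segment from an ftok key, store `msg`
 * in it, read it back into `out`, then detach and remove the segment.
 * Returns 0 on success, -1 on failure. */
int shm_roundtrip(const char *key_path, int project_id,
                  const char *msg, char *out, size_t out_len)
{
    assert(key_path != NULL && msg != NULL && out != NULL && out_len > 0);
    enum { SEGMENT_BYTES = 4096 };      /* assumed size for this sketch */

    key_t key = ftok(key_path, project_id);     /* key_path must exist */
    if (key == (key_t)-1)
        return -1;

    int shmid = shmget(key, SEGMENT_BYTES, IPC_CREAT | 0600);
    if (shmid == -1)
        return -1;

    char *mem = shmat(shmid, NULL, 0);  /* let the system pick the address */
    if (mem == (char *)-1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }

    size_t n = strlen(msg);
    if (n >= SEGMENT_BYTES)
        n = SEGMENT_BYTES - 1;
    memcpy(mem, msg, n);                /* ordinary memory access */
    mem[n] = '\0';
    snprintf(out, out_len, "%s", mem);

    int rc = shmdt(mem);
    if (shmctl(shmid, IPC_RMID, NULL) == -1)
        rc = -1;
    return rc;
}
```

In a real application, a second process would call `ftok` with the same path and project ID, obtain the same segment from `shmget`, and attach it with `shmat` before the segment is removed.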
The second set of services, collectively known as the mmap services, is typically used for mapping files, although it may be used for creating shared memory segments as well.
All operations valid on memory resulting from mmap() of a file are valid on memory resulting from mmap() of a block device. A block device is a special file that provides access to a device driver that presents a block interface. A block interface to a device driver requires data access in blocks of a fixed size. The interface is typically used for data storage devices.
The mmap services include the following subroutines:
Subroutine | Definition |
---|---|
madvise | Advises the system of a process' expected paging behavior |
mincore | Determines residency of memory pages |
mmap | Maps an object file into virtual memory. Allows you to map block devices one process at a time. |
mprotect | Modifies the access protections of memory mapping |
msync | Synchronizes a mapped file with its underlying storage device |
munmap | Unmaps a mapped memory region |
The msem_init, msem_lock, msem_unlock, msem_remove, msleep, and mwakeup subroutines provide access control for the processes mapped using the mmap services.
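A minimal sketch of the mmap, msync, and munmap subroutines working together follows. The helper name `set_first_byte_mapped` is hypothetical; the sketch assumes a regular, non-empty file that the caller may open for reading and writing.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: overwrite the first byte of a file through a
 * shared mapping and flush the change to the underlying storage.
 * Returns 0 on success, -1 on failure. */
int set_first_byte_mapped(const char *path, char value)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR);
    if (fd == -1)
        return -1;

    struct stat st;
    if (fstat(fd, &st) == -1 || st.st_size == 0) {
        close(fd);
        return -1;
    }

    size_t len = (size_t)st.st_size;
    char *data = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (data == MAP_FAILED)
        return -1;

    data[0] = value;                    /* a store replaces write() */

    int rc = msync(data, len, MS_SYNC); /* wait until the file is updated */
    if (munmap(data, len) == -1)
        rc = -1;
    return rc;
}
```

Because the mapping is created with `MAP_SHARED`, the store is made to the file itself; `msync` with `MS_SYNC` is used here so that the function returns only after the change has been written.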
Refer to the following sections to learn more about memory mapping:
Comparing mmap with shmat
As with the shmat services, the portion of the process address space available for mapping files with the mmap services depends on whether a process is a 32-bit process or a 64-bit process. For 32-bit processes, the portion of address space available for mapping consists of addresses in the range 0x30000000-0xCFFFFFFF, for a total of 2.5 GB of address space. The portion of address space available for mapping files consists of addresses in the ranges 0x30000000-0xCFFFFFFF and 0xE0000000-0xEFFFFFFF, for a total of 2.75 GB of address space. In AIX® 5.2 and later, a 32-bit process run with the very large address-space model has the range 0x30000000-0xFFFFFFFF available for mappings, for a total of up to 3.25 GB of address space.
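The totals quoted for the 32-bit ranges can be checked with a few lines of arithmetic. The helper names `range_bytes`, `to_gib`, and `print_32bit_ranges` are hypothetical; the sketch treats 1 GB as 2^30 bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Number of bytes in the inclusive address range [first, last]. */
uint64_t range_bytes(uint64_t first, uint64_t last)
{
    assert(last >= first);
    return last - first + 1;
}

/* Convert a byte count to gigabytes, taking 1 GB as 2^30 bytes. */
double to_gib(uint64_t bytes)
{
    return (double)bytes / (double)(1ULL << 30);
}

/* Print the size of each 32-bit range named in the text. */
void print_32bit_ranges(void)
{
    uint64_t base  = range_bytes(0x30000000ULL, 0xCFFFFFFFULL);
    uint64_t extra = range_bytes(0xE0000000ULL, 0xEFFFFFFFULL);
    uint64_t large = range_bytes(0x30000000ULL, 0xFFFFFFFFULL);

    printf("0x30000000-0xCFFFFFFF: %.2f GB\n", to_gib(base));
    printf("plus 0xE0000000-0xEFFFFFFF: %.2f GB\n", to_gib(base + extra));
    printf("0x30000000-0xFFFFFFFF: %.2f GB\n", to_gib(large));
}
```

Each 256 MB segment contributes 0x10000000 bytes, so the first range spans ten segments, the second adds one more, and the very large address-space model covers thirteen.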
All available ranges within the 32-bit process address space are available for both fixed-location and variable-location mappings. Fixed-location mappings occur when applications specify that a mapping be placed at a fixed location within the address space. Variable-location mappings occur when applications specify that the system should decide the location at which a mapping should be placed.
For 64-bit processes, two sets of address ranges within the process address space are available for mmap or shmat mappings. The first, consisting of the single range 0x07000000_00000000-0x07FFFFFF_FFFFFFFF, is available for both fixed-location and variable-location mappings. The second set of address ranges is available for fixed-location mappings only and consists of the ranges 0x30000000-0xCFFFFFFF, 0xE0000000-0xEFFFFFFF, and 0x10_00000000-0x06FFFFFF_FFFFFFFF. The last range of this set, 0x10_00000000-0x06FFFFFF_FFFFFFFF, is also made available to the system loader to hold program text, data, and heap, so only unused portions of the range are available for fixed-location mappings.
Both the mmap and shmat services provide the capability for multiple processes to map the same region of an object such that they share addressability to that object. However, the mmap subroutine extends this capability beyond that provided by the shmat subroutine by allowing a relatively unlimited number of such mappings to be established. While this capability increases the number of mappings supported per file object or memory segment, it can prove inefficient for applications in which many processes map the same file data into their address space.
The mmap subroutine provides a unique object address for each process that maps to an object. The software accomplishes this by providing each process with a unique virtual address, known as an alias. The shmat subroutine allows processes to share the addresses of the mapped objects.
Because only one of the existing aliases for a given page in an object has a real address translation at any given time, only one of the mmap mappings can make a reference to that page without incurring a page fault. Any reference to the page by a different mapping (and thus a different alias) results in a page fault that causes the existing real-address translation for the page to be invalidated. As a result, a new translation must be established for it under a different alias. Processes share pages by moving them between these different translations.
For applications in which many processes map the same file data into their address space, this toggling process may have an adverse effect on performance. In these cases, the shmat subroutine may provide more efficient file-mapping capabilities.
Use the shmat services under the following circumstances:
- For a 32-bit application, when eleven or fewer files are mapped simultaneously and each is smaller than 256MB.
- When mapping files larger than 256MB.
- When mapping shared memory regions which need to be shared among unrelated processes (no parent-child relationship).
- When mapping entire files.
Use mmap under the following circumstances:
- Portability of the application is a concern.
- Many files are mapped simultaneously.
- Only a portion of a file needs to be mapped.
- Page-level protection needs to be set on the mapping.
- Private mapping is required.
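Several of the mmap cases listed above (mapping only a portion of a file, private mapping, and page-level protection) can be combined in one short sketch. The helper name `private_page_edit` is hypothetical, and the sketch assumes the file is at least `page_index + 1` pages long.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map a single page of a file privately, change
 * its first byte in memory, make the page read-only, and return the
 * byte that the file itself still holds at that offset (-1 on error). */
int private_page_edit(const char *path, long page_index, char new_value)
{
    assert(path != NULL && page_index >= 0);

    long page = sysconf(_SC_PAGESIZE);
    off_t offset = (off_t)page_index * page;   /* must be page aligned */

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    /* Only one page of the file is mapped, and the mapping is private. */
    char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE, fd, offset);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    p[0] = new_value;           /* the write goes to a private copy */

    /* Page-level protection: further stores to this page would fault. */
    int rc = mprotect(p, (size_t)page, PROT_READ);

    char on_disk = 0;
    if (rc == 0 && pread(fd, &on_disk, 1, offset) != 1)
        rc = -1;

    int visible = (p[0] == new_value);  /* the mapping sees the change */
    munmap(p, (size_t)page);
    close(fd);
    return (rc == 0 && visible) ? (unsigned char)on_disk : -1;
}
```

The value returned is the original file content, which shows that a store into a private mapping does not reach the file.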
An "extended shmat" capability is available for 32-bit applications with their limited address spaces. If you define the environment variable EXTSHM=ON, then processes executing in that environment can create and attach more than eleven shared memory segments. Each segment is attached into the address space using only as much space as the size of the segment requires, so another segment can be attached at the end of the first one in the same 256MB region. The address at which a process can attach must be on a page boundary, that is, a multiple of SHMLBA_EXTSHM bytes.
Some restrictions exist on the use of the extended shmat feature. These shared memory regions cannot be used as I/O buffers where the unpinning of the buffer occurs in an interrupt handler. The restrictions on the use of extended shmat I/O buffers are the same as those on mmap buffers.
The environment variable provides the option of executing an application with either the additional functionality of attaching more than 11 segments when EXTSHM=ON, or the higher-performance access to 11 or fewer segments when the environment variable is not set. Again, the "extended shmat" capability only applies to 32-bit processes.
mmap Compatibility Considerations
The mmap services are specified by various standards and commonly used as the file-mapping interface of choice in other operating system implementations. However, the system's implementation of the mmap subroutine may differ from other implementations. The mmap subroutine incorporates the following modifications:
- Mapping into the process private area is not supported.
- Mappings are not implicitly unmapped. An mmap operation which specifies MAP_FIXED will fail if a mapping already exists within the range specified.
- For private mappings, the copy-on-write semantic makes a copy of a page on the first write reference.
- Mapping of I/O or device memory is not supported.
- Mapping of character devices or use of an mmap region as a buffer for a read-write operation to a character device is not supported.
- The madvise subroutine is provided for compatibility only. The system takes no action on the advice specified.
- The mprotect subroutine allows the specified region to contain unmapped pages. In operation, the unmapped pages are simply skipped over.
- The OSF/AES-specific options for default exact mapping and for the MAP_INHERIT, MAP_HASSEMAPHORE, and MAP_UNALIGNED flags are not supported.
Using the semaphore subroutines
The msem_init, msem_lock, msem_unlock, msem_remove, msleep and mwakeup subroutines conform to the OSF Application Environment specification. They provide an alternative to IPC interfaces such as the semget and semop subroutines. Benefits of using the semaphores include an efficient serialization method and the reduced overhead of not having to make a system call in cases where there is no contention for the semaphore.
Semaphores should be located in a shared memory region. Semaphores are specified by msemaphore structures. All of the values in an msemaphore structure should result from an msem_init subroutine call. This call may or may not be followed by a sequence of calls to the msem_lock subroutine or the msem_unlock subroutine. If the values of an msemaphore structure originated in any other manner, the results of the semaphore subroutines are undefined.
The address of the msemaphore structure is significant. You should be careful not to modify the structure's address. If the structure contains values copied from a msemaphore structure at another address, the results of the semaphore subroutines are undefined.
The semaphore subroutines may prove less efficient when the semaphore structures exist in anonymous memory regions created with the mmap subroutine, particularly in cases where many processes reference the same semaphores. In these instances, the semaphore structures should be allocated out of shared memory regions created with the shmget and shmat subroutines.
Mapping files with the shmat subroutine
Mapping can be used to reduce the overhead involved in writing and reading the contents of files. Once the contents of a file are mapped to an area of user memory, the file may be manipulated as if it were data in memory, using pointers to that data instead of input/output calls. The copy of the file on disk also serves as the paging area for that file, saving paging space.
A program can use any regular file as a mapped data file. You can also extend the features of mapped data files to files containing compiled and executable object code. Because mapped files can be accessed more quickly than regular files, the system can load a program more quickly if its executable object file is mapped to a file.
To create a program as a mapped executable file, compile and link the program using the -K flag with the cc or ld command. The -K flag tells the linker to create an object file with a page-aligned format. That is, each part of the object file starts on a page boundary (an address that can be divided by 2K bytes with no remainder). This option results in some empty space in the object file but allows the executable file to be mapped into memory. When the system maps an object file into memory, the text and data portions are handled differently.
Copy-on-write mapped files
To prevent changes made to mapped files from appearing immediately in the file on disk, map the file as a copy-on-write file. This option creates a mapped file with changes that are saved in the system paging space, instead of to the copy of the file on disk. You must choose to write those changes to the copy on disk to save the changes. Otherwise, you lose the changes when closing the file.
Because the changes are not immediately reflected in the copy of the file that other users may access, use copy-on-write mapped files only among processes that cooperate with each other.
The system does not detect the end of files mapped with the shmat subroutine. Therefore, if a program writes beyond the current end of file in a copy-on-write mapped file by storing into the corresponding memory segment (where the file is mapped), the actual file on disk is extended with blocks of zeros in preparation for the new data. If the program does not use the fsync subroutine before closing the file, the data written beyond the previous end of file is not written to disk. The file appears larger, but contains only the added zeros. Therefore, always use the fsync subroutine before closing a copy-on-write mapped file to preserve any added or changed data.
Mapping shared memory segments with the shmat subroutine
The system uses shared memory segments similarly to the way it creates and uses files. Defining the terms used for shared memory with respect to the more familiar file-system terms is critical to understanding shared memory. A definition list of shared memory terms follows:
Term | Definition |
---|---|
key | The unique identifier of a particular shared segment. It is associated with the shared segment as long as the shared segment exists. In this respect, it is similar to the file name of a file. |
shmid | The identifier assigned to the shared segment for use within a particular process. It is similar in use to a file descriptor for a file. |
attach | Specifies that a process must attach a shared segment in order to use it. Attaching a shared segment is similar to opening a file. |
detach | Specifies that a process must detach a shared segment once it is finished using it. Detaching a shared segment is similar to closing a file. |
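The attach and detach terms can be observed directly with the shmctl subroutine, which reports how many processes currently have a segment attached. The following sketch is illustrative only: the helper names `attached` and `attach_counts` are hypothetical, and it uses `IPC_PRIVATE` instead of a key so that it does not depend on any existing file.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Current number of attaches for a segment, or -1 on error. */
static long attached(int shmid)
{
    struct shmid_ds info;
    if (shmctl(shmid, IPC_STAT, &info) == -1)
        return -1;
    return (long)info.shm_nattch;
}

/* Record the attach count before attaching, while attached, and after
 * detaching. Returns 0 on success, -1 on failure. */
int attach_counts(long counts[3])
{
    assert(counts != NULL);

    /* shmid plays the role of a file descriptor for the segment. */
    int shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (shmid == -1)
        return -1;
    counts[0] = attached(shmid);

    void *mem = shmat(shmid, NULL, 0);      /* similar to opening a file */
    if (mem == (void *)-1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }
    counts[1] = attached(shmid);

    shmdt(mem);                             /* similar to closing a file */
    counts[2] = attached(shmid);

    return shmctl(shmid, IPC_RMID, NULL);   /* remove the segment itself */
}
```

Detaching does not remove the segment, just as closing a file does not delete it; the segment exists until it is removed with `IPC_RMID`.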