Optimization and tuning
You can control the optimization and tuning process, which can improve the run-time performance of your application, using the options in the following table. Not all options benefit all applications: the improvements that optimization provides can come at the cost of longer compile times and reduced debugging capability. In addition to the option descriptions in this section, consult the XL Fortran Optimization and Programming Guide for details on the optimization and tuning process and on writing optimization-friendly source code.
Some of the options in Floating-point and integer control can also improve performance, but you must use them with care to ensure your application retains the floating-point semantics it requires.
Option name | @PROCESS directive | Description |
---|---|---|
-qalias | ALIAS(argument_list) | Indicates whether a program contains certain categories of aliasing or does not conform to Fortran standard aliasing rules. The compiler limits the scope of some optimizations when there is a possibility that different names are aliases for the same storage location. |
-qarch | None. | Specifies the processor architecture, or family of architectures, where the code may run. This allows the compiler to take maximum advantage of the machine instructions specific to an architecture, or common to a family of architectures. |
-qassert | ASSERT | Provides information about the characteristics of your code that can help the compiler fine-tune optimizations. |
-qcache | None. | Specifies the cache configuration for a specific execution machine. |
-qcompact | COMPACT | Avoids optimizations that increase code size. |
-qdirectstorage | None. | Informs the compiler that a given compilation unit may reference write-through-enabled or cache-inhibited storage. |
-qessl | None. | Allows the compiler to substitute the Engineering and Scientific Subroutine Library (ESSL) routines in place of Fortran 90 intrinsic procedures. |
-qfdpr | None. | Provides object files with information that the IBM® Feedback Directed Program Restructuring (FDPR®) performance-tuning utility needs to optimize the resulting executable file. |
-qhot | HOT(suboptions) | Performs high-order loop analysis and transformations (HOT) during optimization. |
-qinline | None. | Attempts to inline procedures instead of generating calls to those procedures, for improved performance. |
-qipa | None. | Enables or customizes a class of optimizations known as interprocedural analysis (IPA). |
-qlargepage | None. | Takes advantage of large pages provided on POWER4 and higher systems, for applications designed to execute in a large page memory environment. |
-qlibansi | None. | Assumes that all functions with the name of an ANSI C library function are, in fact, the library functions and not user functions with different semantics. |
-qlibessl | None. | Assumes that all functions with the name of an ESSL library function are, in fact, the library functions and not user functions with different semantics. |
-qlibmpi | None. | Assumes that all functions with Message Passing Interface (MPI) names are, in fact, MPI functions and not user functions with different semantics. |
-qlibposix | None. | Assumes that all functions with the name of a POSIX 1003.1 library function are, in fact, the system functions and not user functions with different semantics. |
-qmaxmem | MAXMEM | Limits the amount of memory that the compiler allocates while performing specific, memory-intensive optimizations to the specified number of kilobytes. |
-qminimaltoc | None. | Minimizes the number of entries in the global entity table of contents (TOC). |
-O | OPTIMIZE | Specifies whether to optimize code during compilation and, if so, at which level. |
-p | None. | Prepares the object files produced by the compiler for profiling. |
-qpdf1, -qpdf2 | None. | Tunes optimizations through profile-directed feedback (PDF), where results from sample program execution are used to improve optimization near conditional branches and in frequently executed code sections. |
-qprefetch | None. | Inserts prefetch instructions automatically where there are opportunities to improve code performance. |
-qshowpdf | None. | When used with -qpdf1 and a minimum optimization level of -O2 at compile and link steps, creates a PDF map file that contains additional profiling information for all procedures in your application. |
-qsimd | None. | Controls whether the compiler can automatically take advantage of vector instructions for processors that support them. |
-qsmallstack | None. | Minimizes stack usage where possible. |
-qsmp | None. | Enables parallelization of program code. |
-qstacktemp | None. | Determines where to allocate certain XL Fortran compiler temporaries at run time. |
-qstrict | STRICT | Ensures that optimizations performed by default at -O3 and higher optimization levels, and optionally at -O2, do not alter certain program semantics, mostly those related to strict IEEE floating-point conformance. |
-qstrict_induction | None. | Prevents the compiler from performing induction (loop counter) variable optimizations. These optimizations may be unsafe (may alter the semantics of your program) when there are integer overflow operations involving the induction variables. |
-qtune | None. | Tunes instruction selection, scheduling, and other architecture-dependent performance enhancements to run best on a specific hardware architecture. Allows specification of a target SMT mode to direct optimizations for best performance in that mode. |
-qunroll | None. | Specifies whether unrolling DO loops is allowed in a program. Unrolling is allowed on outer and inner DO loops. |
-qunwind | None. | Specifies that the compiler will preserve the default behavior for saves and restores to volatile registers during a procedure call. |
-qvisibility | VISIBILITY(suboption) | Specifies the visibility attribute for external linkage symbols in object files. |
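
The -qpdf1, -qpdf2, and -qshowpdf entries in the table describe a two-pass build: compile with instrumentation, run the program on sample input, then recompile using the collected profile. The following is a minimal dry-run sketch of that sequence, not a definitive build recipe. The `xlf90` invocation name, the source file `app.f90`, and the executable names `app_instr` and `app` are assumptions chosen for illustration. The script only prints the commands, so it can be read and tried on a machine without the compiler.

```shell
#!/bin/sh
# Dry-run sketch of a two-pass profile-directed feedback (PDF) build.
# Assumptions (not from the option table): the xlf90 invocation name,
# the source file app.f90, and the executable names are placeholders.
FC=${FC:-xlf90}
SRC=${SRC:-app.f90}

pdf_build_commands() {
  # Pass 1: instrumented build. -qshowpdf is used with -qpdf1
  # and a minimum optimization level of -O2.
  echo "$FC -O2 -qpdf1 -qshowpdf $SRC -o app_instr"
  # Run the instrumented program on representative input to collect a profile.
  echo "./app_instr"
  # Pass 2: rebuild so the optimizer can use the collected profile.
  echo "$FC -O2 -qpdf2 $SRC -o app"
}

# Print the commands instead of running them.
pdf_build_commands
```

On a system where the compiler is installed, the printed lines could be reviewed and then run by hand. Because the profile only reflects the sample run, the input used between the two passes should resemble typical workloads.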