This commit is contained in:
2025-07-12 12:17:44 +03:00
parent c759f60ff7
commit 792e1b937a
3507 changed files with 492613 additions and 0 deletions


@@ -0,0 +1,110 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head><title>
FFTW Frequently Asked Questions with Answers
</title>
<link rev="made" href="mailto:fftw@fftw.org">
<link rel="Contents" href="index.html">
<link rel="Start" href="index.html">
<META name="description"
content="Frequently asked questions and answers (FAQ) for FFTW.">
<link rel="Bookmark" title="FFTW FAQ" href="index.html">
<LINK rel="Bookmark" title="FFTW Home Page"
href="http://www.fftw.org">
<LINK rel="Bookmark" title="FFTW Manual"
href="http://www.fftw.org/doc/">
</head><body text="#000000" bgcolor="#FFFFFF"><h1>
FFTW Frequently Asked Questions with Answers
</h1>
This is the list of Frequently Asked Questions about FFTW, a
collection of fast C routines for computing the Discrete Fourier
Transform in one or more dimensions.
<h1>
Index
</h1>
<ul>
<li><b><font size="+2"><a href="section1.html" rel=subdocument>Section 1. Introduction and General Information</a></font></b>
<li><a href="section1.html#whatisfftw" rel=subdocument>Q1.1. What is FFTW?</a>
<li><a href="section1.html#whereisfftw" rel=subdocument>Q1.2. How do I obtain FFTW?</a>
<li><a href="section1.html#isfftwfree" rel=subdocument>Q1.3. Is FFTW free software?</a>
<li><a href="section1.html#nonfree" rel=subdocument>Q1.4. What is this about non-free licenses?</a>
<li><a href="section1.html#west" rel=subdocument>Q1.5. In the West? I thought MIT was in the East?</a>
<br><br><li><b><font size="+2"><a href="section2.html" rel=subdocument>Section 2. Installing FFTW</a></font></b>
<li><a href="section2.html#systems" rel=subdocument>Q2.1. Which systems does FFTW run on?</a>
<li><a href="section2.html#runOnWindows" rel=subdocument>Q2.2. Does FFTW run on Windows?</a>
<li><a href="section2.html#compilerCrashes" rel=subdocument>Q2.3. My compiler has trouble with FFTW.</a>
<li><a href="section2.html#solarisSucks" rel=subdocument>Q2.4. FFTW does not compile on Solaris, complaining about
<code>const</code>.</a>
<li><a href="section2.html#3dnow" rel=subdocument>Q2.5. What's the difference between <code>--enable-3dnow</code> and <code>--enable-k7</code>?</a>
<li><a href="section2.html#fma" rel=subdocument>Q2.6. What's the difference between the fma and the non-fma
versions?</a>
<li><a href="section2.html#languages" rel=subdocument>Q2.7. Which language is FFTW written in?</a>
<li><a href="section2.html#fortran" rel=subdocument>Q2.8. Can I call FFTW from Fortran?</a>
<li><a href="section2.html#cplusplus" rel=subdocument>Q2.9. Can I call FFTW from C++?</a>
<li><a href="section2.html#whynotfortran" rel=subdocument>Q2.10. Why isn't FFTW written in Fortran/C++?</a>
<li><a href="section2.html#singleprec" rel=subdocument>Q2.11. How do I compile FFTW to run in single precision?</a>
<li><a href="section2.html#64bitk7" rel=subdocument>Q2.12. --enable-k7 does not work on x86-64</a>
<br><br><li><b><font size="+2"><a href="section3.html" rel=subdocument>Section 3. Using FFTW</a></font></b>
<li><a href="section3.html#fftw2to3" rel=subdocument>Q3.1. Why not support the FFTW 2 interface in FFTW
3?</a>
<li><a href="section3.html#planperarray" rel=subdocument>Q3.2. Why do FFTW 3 plans encapsulate the input/output arrays and not just
the algorithm?</a>
<li><a href="section3.html#slow" rel=subdocument>Q3.3. FFTW seems really slow.</a>
<li><a href="section3.html#slows" rel=subdocument>Q3.4. FFTW slows down after repeated calls.</a>
<li><a href="section3.html#segfault" rel=subdocument>Q3.5. An FFTW routine is crashing when I call it.</a>
<li><a href="section3.html#fortran64" rel=subdocument>Q3.6. My Fortran program crashes when calling FFTW.</a>
<li><a href="section3.html#conventions" rel=subdocument>Q3.7. FFTW gives results different from my old
FFT.</a>
<li><a href="section3.html#nondeterministic" rel=subdocument>Q3.8. FFTW gives different results between runs</a>
<li><a href="section3.html#savePlans" rel=subdocument>Q3.9. Can I save FFTW's plans?</a>
<li><a href="section3.html#whyscaled" rel=subdocument>Q3.10. Why does your inverse transform return a scaled
result?</a>
<li><a href="section3.html#centerorigin" rel=subdocument>Q3.11. How can I make FFTW put the origin (zero frequency) at the center of
its output?</a>
<li><a href="section3.html#imageaudio" rel=subdocument>Q3.12. How do I FFT an image/audio file in <i>foobar</i> format?</a>
<li><a href="section3.html#linkfails" rel=subdocument>Q3.13. My program does not link (on Unix).</a>
<li><a href="section3.html#linkheader" rel=subdocument>Q3.14. I included your header, but linking still
fails.</a>
<li><a href="section3.html#nostack" rel=subdocument>Q3.15. My program crashes, complaining about stack
space.</a>
<li><a href="section3.html#leaks" rel=subdocument>Q3.16. FFTW seems to have a memory leak.</a>
<li><a href="section3.html#allzero" rel=subdocument>Q3.17. The output of FFTW's transform is all zeros.</a>
<li><a href="section3.html#vbetalia" rel=subdocument>Q3.18. How do I call FFTW from the Microsoft language du
jour?</a>
<li><a href="section3.html#pruned" rel=subdocument>Q3.19. Can I compute only a subset of the DFT outputs?</a>
<li><a href="section3.html#transpose" rel=subdocument>Q3.20. Can I use FFTW's routines for in-place and out-of-place matrix
transposition?</a>
<br><br><li><b><font size="+2"><a href="section4.html" rel=subdocument>Section 4. Internals of FFTW</a></font></b>
<li><a href="section4.html#howworks" rel=subdocument>Q4.1. How does FFTW work?</a>
<li><a href="section4.html#whyfast" rel=subdocument>Q4.2. Why is FFTW so fast?</a>
<br><br><li><b><font size="+2"><a href="section5.html" rel=subdocument>Section 5. Known bugs</a></font></b>
<li><a href="section5.html#rfftwndbug" rel=subdocument>Q5.1. FFTW 1.1 crashes in rfftwnd on Linux.</a>
<li><a href="section5.html#fftwmpibug" rel=subdocument>Q5.2. The MPI transforms in FFTW 1.2 give incorrect results/leak
memory.</a>
<li><a href="section5.html#testsingbug" rel=subdocument>Q5.3. The test programs in FFTW 1.2.1 fail when I change FFTW to use single
precision.</a>
<li><a href="section5.html#teststoobig" rel=subdocument>Q5.4. The test program in FFTW 1.2.1 fails for n &gt;
46340.</a>
<li><a href="section5.html#linuxthreads" rel=subdocument>Q5.5. The threaded code fails on Linux Redhat 5.0</a>
<li><a href="section5.html#bigrfftwnd" rel=subdocument>Q5.6. FFTW 2.0's rfftwnd fails for rank &gt; 1 transforms with a final
dimension &gt;= 65536.</a>
<li><a href="section5.html#primebug" rel=subdocument>Q5.7. FFTW 2.0's complex transforms give the wrong results with prime
factors 17 to 97.</a>
<li><a href="section5.html#mpichbug" rel=subdocument>Q5.8. FFTW 2.1.1's MPI test programs crash with
MPICH.</a>
<li><a href="section5.html#aixthreadbug" rel=subdocument>Q5.9. FFTW 2.1.2's multi-threaded transforms don't work on
AIX.</a>
<li><a href="section5.html#bigprimebug" rel=subdocument>Q5.10. FFTW 2.1.2's complex transforms give incorrect results for large prime
sizes.</a>
<li><a href="section5.html#solaristhreadbug" rel=subdocument>Q5.11. FFTW 2.1.3's multi-threaded transforms don't give any speedup on
Solaris.</a>
<li><a href="section5.html#aixflags" rel=subdocument>Q5.12. FFTW 2.1.3 crashes on AIX.</a>
</ul><hr>
<address>
<A href="http://www.fftw.org">Matteo Frigo and Steven G. Johnson</A> / <A href="mailto:fftw@fftw.org">fftw@fftw.org</A>
- 14 September 2021
</address><br>
Extracted from FFTW Frequently Asked Questions with Answers,
Copyright &copy; 2021 Matteo Frigo and Massachusetts Institute of Technology.
</body></html>


@@ -0,0 +1,85 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head><title>
FFTW FAQ - Section 1
</title>
<link rev="made" href="mailto:fftw@fftw.org">
<link rel="Contents" href="index.html">
<link rel="Start" href="index.html">
<link rel="Next" href="section2.html"><link rel="Bookmark" title="FFTW FAQ" href="index.html">
</head><body text="#000000" bgcolor="#FFFFFF"><h1>
FFTW FAQ - Section 1 <br>
Introduction and General Information
</h1>
<ul>
<li><a href="#whatisfftw" rel=subdocument>Q1.1. What is FFTW?</a>
<li><a href="#whereisfftw" rel=subdocument>Q1.2. How do I obtain FFTW?</a>
<li><a href="#isfftwfree" rel=subdocument>Q1.3. Is FFTW free software?</a>
<li><a href="#nonfree" rel=subdocument>Q1.4. What is this about non-free licenses?</a>
<li><a href="#west" rel=subdocument>Q1.5. In the West? I thought MIT was in the East?</a>
</ul><hr>
<h2><A name="whatisfftw">
Question 1.1. What is FFTW?
</A></h2>
FFTW is a free collection of fast C routines for computing the
Discrete Fourier Transform in one or more dimensions. It includes
complex, real, symmetric, and parallel transforms, and can handle
arbitrary array sizes efficiently. FFTW is typically faster than
other publicly available FFT implementations, and is even
competitive with vendor-tuned libraries. (See our web page for
extensive benchmarks.) To achieve this performance, FFTW uses novel
code-generation and runtime self-optimization techniques (along with
many other tricks).
<h2><A name="whereisfftw">
Question 1.2. How do I obtain FFTW?
</A></h2>
FFTW can be found at <A href="http://www.fftw.org">the FFTW web page</A>. You can also retrieve it from <code>ftp.fftw.org</code> in <A href="ftp://ftp.fftw.org/pub/fftw"><code>/pub/fftw</code></A>.
<h2><A name="isfftwfree">
Question 1.3. Is FFTW free software?
</A></h2>
Starting with version 1.3, FFTW is Free Software in the technical
sense defined by the Free Software Foundation (see
<A href="http://www.gnu.org/philosophy/categories.html">Categories of Free and Non-Free Software</A>), and is distributed under the terms of the GNU General Public License. Previous versions of FFTW were
distributed without fee for noncommercial use, but were not
technically &quot;free.&quot;
<p>
Non-free licenses for FFTW are also available that permit different
terms of use than the GPL.
<h2><A name="nonfree">
Question 1.4. What is this about non-free
licenses?
</A></h2>
The non-free licenses are for companies that wish to use FFTW in their
products but are unwilling to release their software under the GPL
(which would require them to release source code and allow free
redistribution). Such users can purchase an unlimited-use license
from MIT. Contact us for more details.
<p>
We could instead have released FFTW under the LGPL, or even disallowed
non-Free usage. Suffice it to say, however, that MIT owns the
copyright to FFTW and they only let us GPL it because we convinced
them that it would neither affect their licensing revenue nor irritate
existing licensees.
<h2><A name="west">
Question 1.5. In the West? I thought MIT was in the
East?
</A></h2>
Not to an Italian. You could say that we're a Spaghetti Western
(with apologies to Sergio Leone). <hr>
Next: <a href="section2.html" rel=precedes>Installing FFTW</a>.<br>
<a href="index.html" rev=subdocument>Return to contents</a>.<p>
</body></html>


@@ -0,0 +1,285 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head><title>
FFTW FAQ - Section 2
</title>
<link rev="made" href="mailto:fftw@fftw.org">
<link rel="Contents" href="index.html">
<link rel="Start" href="index.html">
<link rel="Next" href="section3.html"><link rel="Previous" href="section1.html"><link rel="Bookmark" title="FFTW FAQ" href="index.html">
</head><body text="#000000" bgcolor="#FFFFFF"><h1>
FFTW FAQ - Section 2 <br>
Installing FFTW
</h1>
<ul>
<li><a href="#systems" rel=subdocument>Q2.1. Which systems does FFTW run on?</a>
<li><a href="#runOnWindows" rel=subdocument>Q2.2. Does FFTW run on Windows?</a>
<li><a href="#compilerCrashes" rel=subdocument>Q2.3. My compiler has trouble with FFTW.</a>
<li><a href="#solarisSucks" rel=subdocument>Q2.4. FFTW does not compile on Solaris, complaining about
<code>const</code>.</a>
<li><a href="#3dnow" rel=subdocument>Q2.5. What's the difference between <code>--enable-3dnow</code> and <code>--enable-k7</code>?</a>
<li><a href="#fma" rel=subdocument>Q2.6. What's the difference between the fma and the non-fma
versions?</a>
<li><a href="#languages" rel=subdocument>Q2.7. Which language is FFTW written in?</a>
<li><a href="#fortran" rel=subdocument>Q2.8. Can I call FFTW from Fortran?</a>
<li><a href="#cplusplus" rel=subdocument>Q2.9. Can I call FFTW from C++?</a>
<li><a href="#whynotfortran" rel=subdocument>Q2.10. Why isn't FFTW written in Fortran/C++?</a>
<li><a href="#singleprec" rel=subdocument>Q2.11. How do I compile FFTW to run in single precision?</a>
<li><a href="#64bitk7" rel=subdocument>Q2.12. --enable-k7 does not work on x86-64</a>
</ul><hr>
<h2><A name="systems">
Question 2.1. Which systems does FFTW run
on?
</A></h2>
FFTW is written in ANSI C, and should work on any system with a decent
C compiler. (See also <A href="#runOnWindows">Q2.2 `Does FFTW run on Windows?'</A>, <A href="#compilerCrashes">Q2.3 `My compiler has trouble with FFTW.'</A>.) FFTW can also take advantage of certain hardware-specific features,
such as cycle counters and SIMD instructions, but this is optional.
<h2><A name="runOnWindows">
Question 2.2. Does FFTW run on Windows?
</A></h2>
Yes, many people have reported successfully using FFTW on Windows with
various compilers. FFTW was not developed on Windows, but the source
code is essentially straight ANSI C. See also the
<A href="http://www.fftw.org/install/windows.html">FFTW Windows installation notes</A>, <A href="#compilerCrashes">Q2.3 `My compiler has trouble with FFTW.'</A>, and <A href="section3.html#vbetalia">Q3.18 `How do I call FFTW from the Microsoft language du
jour?'</A>.
<h2><A name="compilerCrashes">
Question 2.3. My compiler has trouble with
FFTW.
</A></h2>
Complain fiercely to the vendor of the compiler.
<p>
We have successfully used <code>gcc</code> 3.2.x on x86 and PPC, a recent Compaq C compiler for Alpha, version 6 of IBM's
<code>xlc</code> compiler for AIX, Intel's <code>icc</code> versions 5-7, and Sun WorkShop <code>cc</code> version 6.
<p>
FFTW is likely to push compilers to their limits, however, and several
compiler bugs have been exposed by FFTW. A partial list follows.
<p>
<code>gcc</code> 2.95.x for Solaris/SPARC produces incorrect code for
the test program (workaround: recompile the
<code>libbench2</code> directory with <code>-O2</code>).
<p>
NetBSD/macppc 1.6 comes with a <code>gcc</code> version that also miscompiles the test program. (Please report a workaround if you know
one.)
<p>
<code>gcc</code> 3.2.3 for ARM reportedly crashes during compilation.
This bug is reportedly fixed in later versions of
<code>gcc</code>.
<p>
Versions 8.0 and 8.1 of Intel's <code>icc</code> falsely claim to be <code>gcc</code>, so you should specify <code>CC=&quot;icc -no-gcc&quot;</code>; this is automatic in FFTW 3.1. <code>icc-8.0.066</code> reportedly produces incorrect code for FFTW 2.1.5, but this is fixed in version 8.1.
<code>icc-7.1</code> compiler build 20030402Z appears to produce
incorrect dependencies, causing the compilation to fail.
<code>icc-7.1</code> build 20030307Z appears to work fine. (Use
<code>icc -V</code> to check which build you have.) As of 2003/04/18,
build 20030402Z appears not to be available any longer on Intel's
website, whereas the older build 20030307Z is available.
<p>
<code>ranlib</code> of GNU <code>binutils</code> 2.9.1 on Irix has been observed to corrupt the FFTW libraries, causing a link failure when
FFTW is compiled. Since <code>ranlib</code> is completely superfluous on Irix, we suggest deleting it from your system and replacing it with
a symbolic link to <code>/bin/echo</code>.
<p>
If support for SIMD instructions is enabled in FFTW, further compiler
problems may appear:
<p>
<code>gcc</code> 3.4.[0123] for x86 produces incorrect SSE2 code for
FFTW when <code>-O2</code> (the best choice for FFTW) is used, causing
FFTW to crash (<code>make check</code> crashes). This bug is fixed in <code>gcc</code> 3.4.4. On x86_64 (amd64/em64t), <code>gcc</code> 3.4.4 reportedly still has a similar problem, but this is fixed as of
<code>gcc</code> 3.4.6.
<p>
<code>gcc-3.2</code> for x86 produces incorrect SIMD code if
<code>-O3</code> is used. The same compiler produces incorrect SIMD
code if no optimization is used, too. When using
<code>gcc-3.2</code>, it is a good idea not to change the default
<code>CFLAGS</code> selected by the <code>configure</code> script.
<p>
Some 3.0.x and 3.1.x versions of <code>gcc</code> on <code>x86</code> may crash. <code>gcc</code> so-called 2.96 shipping with RedHat 7.3 crashes
when compiling SIMD code. In both cases, please upgrade to
<code>gcc-3.2</code> or later.
<p>
Intel's <code>icc</code> 6.0 misaligns SSE constants, but FFTW has a
workaround. <code>icc</code> 8.x fails to compile FFTW 3.0.x because it
falsely claims to be <code>gcc</code>; we believe this to be a bug in <code>icc</code>, but FFTW 3.1 has a workaround.
<p>
Visual C++ 2003 reportedly produces incorrect code for SSE/SSE2 when
compiling FFTW. This bug was reportedly fixed in VC++ 2005;
alternatively, you could switch to the Intel compiler. VC++ 6.0 also
reportedly produces incorrect code for the file
<code>reodft11e-r2hc-odd.c</code> unless optimizations are disabled for that file.
<p>
<code>gcc</code> 2.95 on MacOS X miscompiles AltiVec code (fixed in
later versions). <code>gcc</code> 3.2.x miscompiles AltiVec permutations, but FFTW has a workaround.
<code>gcc</code> 4.0.1 on MacOS for Intel crashes when compiling FFTW; a workaround is to
compile one file without optimization: <code>cd kernel; make CFLAGS=&quot; &quot; trig.lo</code>.
<p>
<code>gcc</code> 4.1.1 reportedly crashes when compiling FFTW for MIPS;
the workaround is to compile the file it crashes on
(<code>t2_64.c</code>) with a lower optimization level.
<p>
<code>gcc</code> versions 4.1.2 to 4.2.0 for x86 reportedly miscompile
FFTW 3.1's test program, causing <code>make check</code> to crash (<code>gcc</code> bug #26528). The bug was reportedly fixed in
<code>gcc</code> version 4.2.1 and later. A workaround is to compile
<code>libbench2/verify-lib.c</code> without optimization.
<h2><A name="solarisSucks">
Question 2.4. FFTW does not compile on Solaris, complaining about
<code>const</code>.
</A></h2>
We know that at least on Solaris 2.5.x with Sun's compilers 4.2 you
might get error messages from <code>make</code> such as
<p>
<code>&quot;./fftw.h&quot;, line 88: warning: const is a keyword in ANSI
C</code>
<p>
This is the case when the <code>configure</code> script reports that <code>const</code> does not work:
<p>
<code>checking for working const... (cached) no</code>
<p>
You should be aware that Solaris comes with two compilers, namely,
<code>/opt/SUNWspro/SC4.2/bin/cc</code> and <code>/usr/ucb/cc</code>. The latter compiler is non-ANSI. Indeed, it is a perverse shell script
that calls the real compiler in non-ANSI mode. In order
to compile FFTW, change your path so that the right
<code>cc</code> is used.
<p>
To know whether your compiler is the right one, type
<code>cc -V</code>. If the compiler prints &quot;<code>ucbcc</code>&quot;, as in
<p>
<code>ucbcc: WorkShop Compilers 4.2 30 Oct 1996 C
4.2</code>
<p>
then the compiler is wrong. The right message is something like
<p>
<code>cc: WorkShop Compilers 4.2 30 Oct 1996 C
4.2</code>
<h2><A name="3dnow">
Question 2.5. What's the difference between
<code>--enable-3dnow</code> and <code>--enable-k7</code>?
</A></h2>
<code>--enable-k7</code> enables 3DNow! instructions on K7 processors
(AMD Athlon and its variants). K7 support is provided by assembly
routines generated by a special purpose compiler.
As of fftw-3.2, <code>--enable-k7</code> is no longer supported.
<p>
<code>--enable-3dnow</code> enables generic 3DNow! support using <code>gcc</code> builtin functions. This works on earlier AMD
processors, but it is not as fast as our special assembly routines.
As of fftw-3.1, <code>--enable-3dnow</code> is no longer supported.
<h2><A name="fma">
Question 2.6. What's the difference between the fma and the non-fma
versions?
</A></h2>
The fma version tries to exploit the fused multiply-add instructions
implemented in many processors such as PowerPC, ia-64, and MIPS. The
two FFTW packages are otherwise identical. In FFTW 3.1, the fma and
non-fma versions were merged together into a single package, and the
<code>configure</code> script attempts to automatically guess which
version to use.
<p>
The FFTW 3.1 <code>configure</code> script enables fma by default on PowerPC, Itanium, and PA-RISC, and disables it otherwise. You can
force one or the other by using the <code>--enable-fma</code> or <code>--disable-fma</code> flag for <code>configure</code>.
<p>
Definitely use fma if you have a PowerPC-based system with
<code>gcc</code> (or IBM <code>xlc</code>). This includes all GNU/Linux systems for PowerPC and the older PowerPC-based MacOS systems. Also
use it on PA-RISC and Itanium with the HP/UX compiler.
<p>
Definitely do not use the fma version if you have an ia-32 processor
(Intel, AMD, MacOS on Intel, etc.).
<p>
For other architectures/compilers, the situation is not so clear. For
example, ia-64 has the fma instruction, but
<code>gcc-3.2</code> appears not to exploit it correctly. Other compilers may do the right thing,
but we have not tried them. Please send us your feedback so that we
can update this FAQ entry.
<h2><A name="languages">
Question 2.7. Which language is FFTW written
in?
</A></h2>
FFTW is written in ANSI C. Most of the code, however, was
automatically generated by a program called
<code>genfft</code>, written in the Objective Caml dialect of ML. You do not need to know ML or to
have an Objective Caml compiler in order to use FFTW.
<p>
<code>genfft</code> is provided with the FFTW sources, which means that
you can play with the code generator if you want. In this case, you
need a working Objective Caml system. Objective Caml is available
from <A href="http://caml.inria.fr">the Caml web page</A>.
<h2><A name="fortran">
Question 2.8. Can I call FFTW from Fortran?
</A></h2>
Yes, FFTW (versions 1.3 and higher) contains a Fortran-callable
interface, documented in the FFTW manual.
<p>
By default, FFTW configures its Fortran interface to work with the
first compiler it finds, e.g. <code>g77</code>. To configure for a different, incompatible Fortran compiler
<code>foobar</code>, use <code>./configure F77=foobar</code> when installing FFTW. (In the case of <code>g77</code>, however, FFTW 3.x also includes an extra set of
Fortran-callable routines with one fewer underscore at the end of
identifiers, which should cover most other Fortran compilers on Linux
at least.)
<h2><A name="cplusplus">
Question 2.9. Can I call FFTW from C++?
</A></h2>
Most definitely. FFTW should compile and/or link under any C++
compiler. Moreover, it is likely that the C++
<code>&lt;complex&gt;</code> template class is bit-compatible with FFTW's complex-number format
(see the FFTW manual for more details).
<h2><A name="whynotfortran">
Question 2.10. Why isn't FFTW written in
Fortran/C++?
</A></h2>
Because we don't like those languages, and neither approaches the
portability of C.
<h2><A name="singleprec">
Question 2.11. How do I compile FFTW to run in single
precision?
</A></h2>
On a Unix system: <code>configure --enable-float</code>. On a non-Unix system: edit <code>config.h</code> to <code>#define</code> the symbol <code>FFTW_SINGLE</code> (for FFTW 3.x). In both cases, you must then
recompile FFTW. In FFTW 3, all FFTW identifiers will then begin with
<code>fftwf_</code> instead of <code>fftw_</code>.
<h2><A name="64bitk7">
Question 2.12. --enable-k7 does not work on
x86-64
</A></h2>
Support for <code>--enable-k7</code> was discontinued in fftw-3.2.
<p>
The fftw-3.1 release supports <code>--enable-k7</code>. This option only works on
32-bit x86 machines that implement 3DNow!, including the AMD Athlon
and the AMD Opteron in 32-bit mode. <code>--enable-k7</code> does not work on AMD
Opteron in 64-bit mode. Use <code>--enable-sse</code> for x86-64 machines.
<p>
FFTW supports 3DNow! by means of assembly code generated by a
special-purpose compiler. It is hard to produce assembly code that
works in both 32-bit and 64-bit mode. <hr>
Next: <a href="section3.html" rel=precedes>Using FFTW</a>.<br>
Back: <a href="section1.html" rev=precedes>Introduction and General Information</a>.<br>
<a href="index.html" rev=subdocument>Return to contents</a>.<p>
</body></html>


@@ -0,0 +1,334 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head><title>
FFTW FAQ - Section 3
</title>
<link rev="made" href="mailto:fftw@fftw.org">
<link rel="Contents" href="index.html">
<link rel="Start" href="index.html">
<link rel="Next" href="section4.html"><link rel="Previous" href="section2.html"><link rel="Bookmark" title="FFTW FAQ" href="index.html">
</head><body text="#000000" bgcolor="#FFFFFF"><h1>
FFTW FAQ - Section 3 <br>
Using FFTW
</h1>
<ul>
<li><a href="#fftw2to3" rel=subdocument>Q3.1. Why not support the FFTW 2 interface in FFTW
3?</a>
<li><a href="#planperarray" rel=subdocument>Q3.2. Why do FFTW 3 plans encapsulate the input/output arrays and not just
the algorithm?</a>
<li><a href="#slow" rel=subdocument>Q3.3. FFTW seems really slow.</a>
<li><a href="#slows" rel=subdocument>Q3.4. FFTW slows down after repeated calls.</a>
<li><a href="#segfault" rel=subdocument>Q3.5. An FFTW routine is crashing when I call it.</a>
<li><a href="#fortran64" rel=subdocument>Q3.6. My Fortran program crashes when calling FFTW.</a>
<li><a href="#conventions" rel=subdocument>Q3.7. FFTW gives results different from my old
FFT.</a>
<li><a href="#nondeterministic" rel=subdocument>Q3.8. FFTW gives different results between runs</a>
<li><a href="#savePlans" rel=subdocument>Q3.9. Can I save FFTW's plans?</a>
<li><a href="#whyscaled" rel=subdocument>Q3.10. Why does your inverse transform return a scaled
result?</a>
<li><a href="#centerorigin" rel=subdocument>Q3.11. How can I make FFTW put the origin (zero frequency) at the center of
its output?</a>
<li><a href="#imageaudio" rel=subdocument>Q3.12. How do I FFT an image/audio file in <i>foobar</i> format?</a>
<li><a href="#linkfails" rel=subdocument>Q3.13. My program does not link (on Unix).</a>
<li><a href="#linkheader" rel=subdocument>Q3.14. I included your header, but linking still
fails.</a>
<li><a href="#nostack" rel=subdocument>Q3.15. My program crashes, complaining about stack
space.</a>
<li><a href="#leaks" rel=subdocument>Q3.16. FFTW seems to have a memory leak.</a>
<li><a href="#allzero" rel=subdocument>Q3.17. The output of FFTW's transform is all zeros.</a>
<li><a href="#vbetalia" rel=subdocument>Q3.18. How do I call FFTW from the Microsoft language du
jour?</a>
<li><a href="#pruned" rel=subdocument>Q3.19. Can I compute only a subset of the DFT outputs?</a>
<li><a href="#transpose" rel=subdocument>Q3.20. Can I use FFTW's routines for in-place and out-of-place matrix
transposition?</a>
</ul><hr>
<h2><A name="fftw2to3">
Question 3.1. Why not support the FFTW 2 interface in FFTW
3?
</A></h2>
FFTW 3 has semantics incompatible with earlier versions: its plans can
only be used for a given stride, multiplicity, and other
characteristics of the input and output arrays; these stronger
semantics are necessary for performance reasons. Thus, it is
impossible to efficiently emulate the older interface (whose plans can
be used for any transform of the same size). We believe that it
should be possible to upgrade most programs without any difficulty,
however.
<h2><A name="planperarray">
Question 3.2. Why do FFTW 3 plans encapsulate the input/output arrays
and not just the algorithm?
</A></h2>
There are several reasons:
<ul>
<li>It was important for performance reasons that the plan be specific to
array characteristics like the stride (and alignment, for SIMD), and
requiring that the user maintain these invariants is error prone.
<li>In most high-performance applications, as far as we can tell, you are
usually transforming the same array over and over, so FFTW's semantics
should not be a burden.
<li>If you need to transform another array of the same size, creating a
new plan once the first exists is a cheap operation.
<li>If you need to transform many arrays of the same size at once, you
should really use the <code>plan_many</code> routines in FFTW's &quot;advanced&quot;
interface.
<li>If the above-mentioned array characteristics are the same, you are
willing to pay close attention to the documentation, and you really
need to, we provide a &quot;new-array execution&quot; interface to
apply a plan to a new array.
</ul>
<h2><A name="slow">
Question 3.3. FFTW seems really slow.
</A></h2>
You are probably recreating the plan before every transform, rather
than creating it once and reusing it for all transforms of the same
size. FFTW is designed to be used in the following way:
<ul>
<li>First, you create a plan. This may take several seconds, depending on the transform size and the planner flags.
<li>Then, you reuse the plan many times to perform FFTs. These are fast.
</ul>
If you don't need to compute many transforms and the time for the
planner is significant, you have two options. First, you can use the
<code>FFTW_ESTIMATE</code> option in the planner, which uses heuristics
instead of runtime measurements and produces a good plan in a short
time. Second, you can use the wisdom feature to precompute the plan;
see <A href="#savePlans">Q3.9 `Can I save FFTW's plans?'</A>.
<h2><A name="slows">
Question 3.4. FFTW slows down after repeated
calls.
</A></h2>
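Most likely, NaNs or other non-finite values are creeping into your data, and the
slowdown is due to the resulting floating-point exceptions. For
example, be aware that repeatedly FFTing the same array is a diverging
process: because FFTW computes the unnormalized transform, each
transform multiplies the norm of the data by the square root of the
array size, so the values eventually overflow.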
<h2><A name="segfault">
Question 3.5. An FFTW routine is crashing when I call
it.
</A></h2>
Did the FFTW test programs pass (<code>make check</code>, or <code>cd tests; make bigcheck</code> if you want to be paranoid)? If so, you almost
certainly have a bug in your own code. For example, you could be
passing invalid arguments (such as wrongly-sized arrays) to FFTW, or
you could simply have memory corruption elsewhere in your program that
causes random crashes later on. Please don't complain to us unless
you can come up with a minimal self-contained program (preferably
under 30 lines) that illustrates the problem.
<h2><A name="fortran64">
Question 3.6. My Fortran program crashes when calling
FFTW.
</A></h2>
As described in the manual, on 64-bit machines you must store the
plans in variables large enough to hold a pointer, for example
<code>integer*8</code>. We recommend using <code>integer*8</code> on 32-bit machines as well, to simplify porting.
<h2><A name="conventions">
Question 3.7. FFTW gives results different from my old
FFT.
</A></h2>
People follow many different conventions for the DFT, and you should
be sure to know the ones that we use (described in the FFTW manual).
In particular, you should be aware that the
<code>FFTW_FORWARD</code>/<code>FFTW_BACKWARD</code> directions correspond to signs of -1/+1 in the exponent of the DFT definition.
(<i>Numerical Recipes</i> uses the opposite convention.)
<p>
You should also know that we compute an unnormalized transform. In
contrast, Matlab is an example of a program that normalizes its inverse
transform (its <code>ifft</code> divides by the array size). See <A href="#whyscaled">Q3.10 `Why does your inverse transform return a scaled
result?'</A>.
<p>
Finally, note that floating-point arithmetic is not exact, so
different FFT algorithms will give slightly different results (on the
order of the numerical accuracy; typically a fractional difference of
1e-15 or so in double precision).
<h2><A name="nondeterministic">
Question 3.8. FFTW gives different results between
runs
</A></h2>
If you use <code>FFTW_MEASURE</code> or <code>FFTW_PATIENT</code> mode, then the algorithm FFTW employs is not deterministic: it depends on
runtime performance measurements. The results may therefore vary from
run to run, but only by an amount on the order of the floating-point
precision, which should have no practical impact on most
applications.
<p>
If you use saved plans (wisdom) or <code>FFTW_ESTIMATE</code> mode, however, then the algorithm is deterministic and the results should be
identical between runs.
<h2><A name="savePlans">
Question 3.9. Can I save FFTW's plans?
</A></h2>
Yes. Starting with version 1.2, FFTW provides the
<code>wisdom</code> mechanism for saving plans; see the FFTW manual.
<h2><A name="whyscaled">
Question 3.10. Why does your inverse transform return a scaled
result?
</A></h2>
Computing the forward transform followed by the backward transform (or
vice versa) yields the original array scaled by the size of the array.
(For multi-dimensional transforms, the size of the array is the
product of the dimensions.) We could, instead, have chosen a
normalization that would have returned the unscaled array. Or, to
accommodate the many conventions in this matter, the transform routines
could have accepted a &quot;scale factor&quot; parameter. We did not
do this, however, for two reasons. First, we didn't want to sacrifice
performance in the common case where the scale factor is 1. Second, in
real applications the FFT is followed or preceded by some computation
on the data, into which the scale factor can typically be absorbed at
little or no cost.
<h2><A name="centerorigin">
Question 3.11. How can I make FFTW put the origin (zero frequency) at
the center of its output?
</A></h2>
For human viewing of a spectrum, it is often convenient to put the
origin in frequency space at the center of the output array, rather
than in the zero-th element (the default in FFTW). If all of the
dimensions of your array are even, you can accomplish this by simply
multiplying each element of the input array by (-1)^(i + j + ...),
where i, j, etcetera are the indices of the element. (This trick is a
general property of the DFT, and is not specific to FFTW.)
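<p>
As a rough sketch of the sign trick for a two-dimensional array, assuming row-major storage and even dimensions, the input could be modulated as follows before it is handed to the transform.
The function name <code>modulate_for_centered_origin</code> is hypothetical and is not part of FFTW.

```c
#include <complex.h>
#include <stddef.h>

/* Multiply element (i, j) of a row-major rows x cols array by (-1)^(i + j),
   in place. Both dimensions are assumed to be even. */
void modulate_for_centered_origin(size_t rows, size_t cols, double complex *a)
{
    for (size_t i = 0; i < rows; ++i)
        for (size_t j = 0; j < cols; ++j)
            if ((i + j) % 2 != 0)
                a[i * cols + j] = -a[i * cols + j];
}
```

After this modulation, transforming the array places the zero-frequency component at index (rows/2, cols/2) instead of (0, 0).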
<h2><A name="imageaudio">
Question 3.12. How do I FFT an image/audio file in
<i>foobar</i> format?
</A></h2>
FFTW performs an FFT on an array of floating-point values. You can
certainly use it to compute the transform of an image or audio stream,
but you are responsible for figuring out your data format and
converting it to the form FFTW requires.
<h2><A name="linkfails">
Question 3.13. My program does not link (on
Unix).
</A></h2>
The libraries must be listed in the correct order
(<code>-lfftw3 -lm</code> for FFTW 3.x) and <i>after</i> your program sources/objects. (The general rule is that if <i>A</i> uses <i>B</i>, then <i>A</i> must be listed before <i>B</i> in the link command.)
<h2><A name="linkheader">
Question 3.14. I included your header, but linking still
fails.
</A></h2>
You're a C++ programmer, aren't you? You have to compile the FFTW
library and link it into your program, not just
<code>#include &lt;fftw3.h&gt;</code>. (Yes, this is really a FAQ.)
<h2><A name="nostack">
Question 3.15. My program crashes, complaining about stack
space.
</A></h2>
You cannot declare large arrays with automatic storage (e.g. via
<code>fftw_complex array[N]</code>); you should use <code>fftw_malloc</code> (or equivalent) to allocate the arrays you want
to transform if they are larger than a few hundred elements.
<h2><A name="leaks">
Question 3.16. FFTW seems to have a memory
leak.
</A></h2>
After you create a plan, FFTW caches the information required to
quickly recreate the plan. (See <A href="#savePlans">Q3.9 `Can I save FFTW's plans?'</A>) It also maintains a small amount of other persistent memory. You can deallocate all of
FFTW's internally allocated memory, if you wish, by calling
<code>fftw_cleanup()</code>, as documented in the manual.
<h2><A name="allzero">
Question 3.17. The output of FFTW's transform is all
zeros.
</A></h2>
You should initialize your input array <i>after</i> creating the plan, unless you use <code>FFTW_ESTIMATE</code>: planning with <code>FFTW_MEASURE</code> or <code>FFTW_PATIENT</code> overwrites the input/output arrays, as described in the manual.
<h2><A name="vbetalia">
Question 3.18. How do I call FFTW from the Microsoft language du
jour?
</A></h2>
Please <i>do not</i> ask us Windows-specific questions. We do not
use Windows. We know nothing about Visual Basic, Visual C++, or .NET.
Please find the appropriate Usenet discussion group and ask your
question there. See also <A href="section2.html#runOnWindows">Q2.2 `Does FFTW run on Windows?'</A>.
<h2><A name="pruned">
Question 3.19. Can I compute only a subset of the DFT
outputs?
</A></h2>
In general, no: an FFT intrinsically computes all outputs from all
inputs. In principle, there is something called a
<i>pruned FFT</i> that can do what you want, but to compute K outputs out of N the
complexity is in general O(N log K) instead of O(N log N), thus saving
only a small additive factor in the log. (The same argument holds if
you instead have only K nonzero inputs.)
<p>
There are some specific cases in which you can get the O(N log K)
performance benefits easily, however, by combining a few ordinary
FFTs. In particular, the case where you want the first K outputs,
where K divides N, can be handled by performing N/K transforms of size
K and then summing the outputs multiplied by appropriate phase
factors. For more details, see <A href="http://www.fftw.org/pruned.html">pruned FFTs with FFTW</A>.
<p>
There are also some algorithms that compute pruned transforms
<i>approximately</i>, but they are beyond the scope of this FAQ.
<h2><A name="transpose">
Question 3.20. Can I use FFTW's routines for in-place and
out-of-place matrix transposition?
</A></h2>
You can use the FFTW guru interface to create a rank-0 transform of
vector rank 2 where the vector strides are transposed. (A rank-0
transform is equivalent to a 1D transform of size 1, which just
copies the input into the output.) Specifying the same location for
the input and output makes the transpose in-place.
<p>
For double-valued data stored in row-major format, plan creation looks
like this: <pre>
fftw_plan plan_transpose(int rows, int cols, double *in, double *out)
{
    const unsigned flags = FFTW_ESTIMATE; /* other flags are possible */
    fftw_iodim howmany_dims[2];

    howmany_dims[0].n  = rows;
    howmany_dims[0].is = cols;
    howmany_dims[0].os = 1;

    howmany_dims[1].n  = cols;
    howmany_dims[1].is = 1;
    howmany_dims[1].os = rows;

    return fftw_plan_guru_r2r(/*rank=*/ 0, /*dims=*/ NULL,
                              /*howmany_rank=*/ 2, howmany_dims,
                              in, out, /*kind=*/ NULL, flags);
}
</pre>
(This entry was written by Rhys Ulerich.)
<hr>
Next: <a href="section4.html" rel=precedes>Internals of FFTW</a>.<br>
Back: <a href="section2.html" rev=precedes>Installing FFTW</a>.<br>
<a href="index.html" rev=subdocument>Return to contents</a>.<p>
<address>
<A href="http://www.fftw.org">Matteo Frigo and Steven G. Johnson</A> / <A href="mailto:fftw@fftw.org">fftw@fftw.org</A>
- 14 September 2021
</address><br>
Extracted from FFTW Frequently Asked Questions with Answers,
Copyright &copy; 2021 Matteo Frigo and Massachusetts Institute of Technology.
</body></html>

@@ -0,0 +1,64 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head><title>
FFTW FAQ - Section 4
</title>
<link rev="made" href="mailto:fftw@fftw.org">
<link rel="Contents" href="index.html">
<link rel="Start" href="index.html">
<link rel="Next" href="section5.html"><link rel="Previous" href="section3.html"><link rel="Bookmark" title="FFTW FAQ" href="index.html">
</head><body text="#000000" bgcolor="#FFFFFF"><h1>
FFTW FAQ - Section 4 <br>
Internals of FFTW
</h1>
<ul>
<li><a href="#howworks" rel=subdocument>Q4.1. How does FFTW work?</a>
<li><a href="#whyfast" rel=subdocument>Q4.2. Why is FFTW so fast?</a>
</ul><hr>
<h2><A name="howworks">
Question 4.1. How does FFTW work?
</A></h2>
The innovation (if it can be so called) in FFTW consists in having a
variety of composable <i>solvers</i>, representing different FFT algorithms and implementation strategies, whose combination into a
particular <i>plan</i> for a given size can be determined at runtime according to the characteristics of your machine/compiler.
This peculiar software architecture allows FFTW to adapt itself to
almost any machine.
<p>
For more details (albeit somewhat outdated), see the paper &quot;FFTW:
An Adaptive Software Architecture for the FFT&quot;, by M. Frigo and
S. G. Johnson, <i>Proc. ICASSP</i> 3, 1381 (1998), also available at <A href="http://www.fftw.org">the FFTW web page</A>.
<h2><A name="whyfast">
Question 4.2. Why is FFTW so fast?
</A></h2>
This is a complex question, and there is no simple answer. In fact,
the authors do not fully know the answer, either. In addition to many
small performance hacks throughout FFTW, there are three general
reasons for FFTW's speed.
<ul>
<li> FFTW uses a variety of FFT algorithms and implementation styles
that can be arbitrarily composed to adapt itself to
a machine. See <A href="#howworks">Q4.1 `How does FFTW work?'</A>.
<li> FFTW uses a code generator to produce highly-optimized
routines for computing small transforms.
<li> FFTW uses explicit divide-and-conquer to take advantage
of the memory hierarchy.
</ul>
For more details (albeit somewhat outdated), see the paper &quot;FFTW:
An Adaptive Software Architecture for the FFT&quot;, by M. Frigo and
S. G. Johnson, <i>Proc. ICASSP</i> 3, 1381 (1998), available along with other references at
<A href="http://www.fftw.org">the FFTW web page</A>. <hr>
Next: <a href="section5.html" rel=precedes>Known bugs</a>.<br>
Back: <a href="section3.html" rev=precedes>Using FFTW</a>.<br>
<a href="index.html" rev=subdocument>Return to contents</a>.<p>
</body></html>

@@ -0,0 +1,148 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head><title>
FFTW FAQ - Section 5
</title>
<link rev="made" href="mailto:fftw@fftw.org">
<link rel="Contents" href="index.html">
<link rel="Start" href="index.html">
<link rel="Previous" href="section4.html"><link rel="Bookmark" title="FFTW FAQ" href="index.html">
</head><body text="#000000" bgcolor="#FFFFFF"><h1>
FFTW FAQ - Section 5 <br>
Known bugs
</h1>
<ul>
<li><a href="#rfftwndbug" rel=subdocument>Q5.1. FFTW 1.1 crashes in rfftwnd on Linux.</a>
<li><a href="#fftwmpibug" rel=subdocument>Q5.2. The MPI transforms in FFTW 1.2 give incorrect results/leak
memory.</a>
<li><a href="#testsingbug" rel=subdocument>Q5.3. The test programs in FFTW 1.2.1 fail when I change FFTW to use single
precision.</a>
<li><a href="#teststoobig" rel=subdocument>Q5.4. The test program in FFTW 1.2.1 fails for n &gt;
46340.</a>
<li><a href="#linuxthreads" rel=subdocument>Q5.5. The threaded code fails on Linux Redhat 5.0</a>
<li><a href="#bigrfftwnd" rel=subdocument>Q5.6. FFTW 2.0's rfftwnd fails for rank &gt; 1 transforms with a final
dimension &gt;= 65536.</a>
<li><a href="#primebug" rel=subdocument>Q5.7. FFTW 2.0's complex transforms give the wrong results with prime
factors 17 to 97.</a>
<li><a href="#mpichbug" rel=subdocument>Q5.8. FFTW 2.1.1's MPI test programs crash with
MPICH.</a>
<li><a href="#aixthreadbug" rel=subdocument>Q5.9. FFTW 2.1.2's multi-threaded transforms don't work on
AIX.</a>
<li><a href="#bigprimebug" rel=subdocument>Q5.10. FFTW 2.1.2's complex transforms give incorrect results for large prime
sizes.</a>
<li><a href="#solaristhreadbug" rel=subdocument>Q5.11. FFTW 2.1.3's multi-threaded transforms don't give any speedup on
Solaris.</a>
<li><a href="#aixflags" rel=subdocument>Q5.12. FFTW 2.1.3 crashes on AIX.</a>
</ul><hr>
<h2><A name="rfftwndbug">
Question 5.1. FFTW 1.1 crashes in rfftwnd on
Linux.
</A></h2>
This bug was fixed in FFTW 1.2. There was a bug in
<code>rfftwnd</code> causing an incorrect amount of memory to be allocated. The bug showed
up in Linux with libc-5.3.12 (and nowhere else that we know of).
<h2><A name="fftwmpibug">
Question 5.2. The MPI transforms in FFTW 1.2 give incorrect
results/leak memory.
</A></h2>
These bugs were corrected in FFTW 1.2.1. The MPI transforms (really,
just the transpose routines) in FFTW 1.2 had bugs that could cause
errors in some situations.
<h2><A name="testsingbug">
Question 5.3. The test programs in FFTW 1.2.1 fail when I change FFTW
to use single precision.
</A></h2>
This bug was fixed in FFTW 1.3. (Older versions of FFTW did
work in single precision, but the test programs didn't--the error
tolerances in the tests were set for double precision.)
<h2><A name="teststoobig">
Question 5.4. The test program in FFTW 1.2.1 fails for n &gt;
46340.
</A></h2>
This bug was fixed in FFTW 1.3. FFTW 1.2.1 produced the right answer,
but the test program was wrong. For n &gt; 46340, the product n*n in
the naive transform that we used for comparison exceeds the largest
signed 32-bit integer (2^31 - 1), and the resulting overflow broke the
test.
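<p>
The threshold of 46340 can be checked with a short sketch.
It is not taken from the FFTW test program, and the function name <code>square_fits_int32</code> is hypothetical.
The product is formed in 64-bit arithmetic so that the check itself cannot overflow.

```c
#include <stdint.h>

/* Returns 1 if n*n is representable as a signed 32-bit integer, else 0.
   The operands are widened to 64 bits before multiplying. */
int square_fits_int32(int32_t n)
{
    int64_t wide = (int64_t)n * (int64_t)n;
    return wide <= INT32_MAX;
}
```

Here 46340*46340 = 2147395600 still fits, while 46341*46341 = 2147488281 exceeds 2^31 - 1 = 2147483647.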
<h2><A name="linuxthreads">
Question 5.5. The threaded code fails on Linux Redhat
5.0
</A></h2>
We had problems with glibc-2.0.5. The code should work with
glibc-2.0.7.
<h2><A name="bigrfftwnd">
Question 5.6. FFTW 2.0's rfftwnd fails for rank &gt; 1 transforms
with a final dimension &gt;= 65536.
</A></h2>
This bug was fixed in FFTW 2.0.1. (There was a 32-bit integer
overflow due to a poorly-parenthesized expression.)
<h2><A name="primebug">
Question 5.7. FFTW 2.0's complex transforms give the wrong results
with prime factors 17 to 97.
</A></h2>
There was a bug in the complex transforms that could cause incorrect
results under (hopefully rare) circumstances for lengths with
intermediate-size prime factors (17-97). This bug was fixed in FFTW
2.1.1.
<h2><A name="mpichbug">
Question 5.8. FFTW 2.1.1's MPI test programs crash with
MPICH.
</A></h2>
This bug was fixed in FFTW 2.1.2. The 2.1/2.1.1 MPI test programs
crashed when using the MPICH implementation of MPI with the
<code>ch_p4</code> device (TCP/IP); the transforms themselves worked fine.
<h2><A name="aixthreadbug">
Question 5.9. FFTW 2.1.2's multi-threaded transforms don't work on
AIX.
</A></h2>
This bug was fixed in FFTW 2.1.3. The multi-threaded transforms in
previous versions didn't work with AIX's
<code>pthreads</code> implementation, which idiosyncratically creates threads in detached
(non-joinable) mode by default.
<h2><A name="bigprimebug">
Question 5.10. FFTW 2.1.2's complex transforms give incorrect results
for large prime sizes.
</A></h2>
This bug was fixed in FFTW 2.1.3. FFTW's complex-transform algorithm
for prime sizes (in versions 2.0 to 2.1.2) had an integer overflow
problem that caused incorrect results for many primes greater than
32768 (on 32-bit machines). (Sizes without large prime factors are
not affected.)
<h2><A name="solaristhreadbug">
Question 5.11. FFTW 2.1.3's multi-threaded transforms don't give any
speedup on Solaris.
</A></h2>
This bug was fixed in FFTW 2.1.4. (By default, Solaris creates
threads that do not parallelize over multiple processors, so one has
to request the proper behavior specifically.)
<h2><A name="aixflags">
Question 5.12. FFTW 2.1.3 crashes on AIX.
</A></h2>
The FFTW 2.1.3 <code>configure</code> script picked incorrect compiler flags for the <code>xlc</code> compiler on newer IBM processors. This
is fixed in FFTW 2.1.4. <hr>
Back: <a href="section4.html" rev=precedes>Internals of FFTW</a>.<br>
<a href="index.html" rev=subdocument>Return to contents</a>.<p>
</body></html>