Updates
110
fftw-3.3.10/doc/FAQ/fftw-faq.html/index.html
Normal file
@@ -0,0 +1,110 @@
|
||||
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
|
||||
<html>
|
||||
<head><title>
|
||||
FFTW Frequently Asked Questions with Answers
|
||||
</title>
|
||||
<link rev="made" href="mailto:fftw@fftw.org">
|
||||
<link rel="Contents" href="index.html">
|
||||
<link rel="Start" href="index.html">
|
||||
<META name="description"
|
||||
content="Frequently asked questions and answers (FAQ) for FFTW.">
|
||||
<link rel="Bookmark" title="FFTW FAQ" href="index.html">
|
||||
<LINK rel="Bookmark" title="FFTW Home Page"
|
||||
href="http://www.fftw.org">
|
||||
<LINK rel="Bookmark" title="FFTW Manual"
|
||||
href="http://www.fftw.org/doc/">
|
||||
</head><body text="#000000" bgcolor="#FFFFFF"><h1>
|
||||
FFTW Frequently Asked Questions with Answers
|
||||
</h1>
|
||||
This is the list of Frequently Asked Questions about FFTW, a
|
||||
collection of fast C routines for computing the Discrete Fourier
|
||||
Transform in one or more dimensions.
|
||||
<h1>
|
||||
Index
|
||||
</h1>
|
||||
|
||||
<ul>
|
||||
<li><b><font size="+2"><a href="section1.html" rel=subdocument>Section 1. Introduction and General Information</a></font></b>
|
||||
<li><a href="section1.html#whatisfftw" rel=subdocument>Q1.1. What is FFTW?</a>
|
||||
<li><a href="section1.html#whereisfftw" rel=subdocument>Q1.2. How do I obtain FFTW?</a>
|
||||
<li><a href="section1.html#isfftwfree" rel=subdocument>Q1.3. Is FFTW free software?</a>
|
||||
<li><a href="section1.html#nonfree" rel=subdocument>Q1.4. What is this about non-free licenses?</a>
|
||||
<li><a href="section1.html#west" rel=subdocument>Q1.5. In the West? I thought MIT was in the East?</a>
|
||||
<br><br><li><b><font size="+2"><a href="section2.html" rel=subdocument>Section 2. Installing FFTW</a></font></b>
|
||||
<li><a href="section2.html#systems" rel=subdocument>Q2.1. Which systems does FFTW run on?</a>
|
||||
<li><a href="section2.html#runOnWindows" rel=subdocument>Q2.2. Does FFTW run on Windows?</a>
|
||||
<li><a href="section2.html#compilerCrashes" rel=subdocument>Q2.3. My compiler has trouble with FFTW.</a>
|
||||
<li><a href="section2.html#solarisSucks" rel=subdocument>Q2.4. FFTW does not compile on Solaris, complaining about
|
||||
<code>const</code>.</a>
|
||||
<li><a href="section2.html#3dnow" rel=subdocument>Q2.5. What's the difference between <code>--enable-3dnow</code> and <code>--enable-k7</code>?</a>
|
||||
<li><a href="section2.html#fma" rel=subdocument>Q2.6. What's the difference between the fma and the non-fma
|
||||
versions?</a>
|
||||
<li><a href="section2.html#languages" rel=subdocument>Q2.7. Which language is FFTW written in?</a>
|
||||
<li><a href="section2.html#fortran" rel=subdocument>Q2.8. Can I call FFTW from Fortran?</a>
|
||||
<li><a href="section2.html#cplusplus" rel=subdocument>Q2.9. Can I call FFTW from C++?</a>
|
||||
<li><a href="section2.html#whynotfortran" rel=subdocument>Q2.10. Why isn't FFTW written in Fortran/C++?</a>
|
||||
<li><a href="section2.html#singleprec" rel=subdocument>Q2.11. How do I compile FFTW to run in single precision?</a>
|
||||
<li><a href="section2.html#64bitk7" rel=subdocument>Q2.12. --enable-k7 does not work on x86-64</a>
|
||||
<br><br><li><b><font size="+2"><a href="section3.html" rel=subdocument>Section 3. Using FFTW</a></font></b>
|
||||
<li><a href="section3.html#fftw2to3" rel=subdocument>Q3.1. Why not support the FFTW 2 interface in FFTW
|
||||
3?</a>
|
||||
<li><a href="section3.html#planperarray" rel=subdocument>Q3.2. Why do FFTW 3 plans encapsulate the input/output arrays and not just
|
||||
the algorithm?</a>
|
||||
<li><a href="section3.html#slow" rel=subdocument>Q3.3. FFTW seems really slow.</a>
|
||||
<li><a href="section3.html#slows" rel=subdocument>Q3.4. FFTW slows down after repeated calls.</a>
|
||||
<li><a href="section3.html#segfault" rel=subdocument>Q3.5. An FFTW routine is crashing when I call it.</a>
|
||||
<li><a href="section3.html#fortran64" rel=subdocument>Q3.6. My Fortran program crashes when calling FFTW.</a>
|
||||
<li><a href="section3.html#conventions" rel=subdocument>Q3.7. FFTW gives results different from my old
|
||||
FFT.</a>
|
||||
<li><a href="section3.html#nondeterministic" rel=subdocument>Q3.8. FFTW gives different results between runs</a>
|
||||
<li><a href="section3.html#savePlans" rel=subdocument>Q3.9. Can I save FFTW's plans?</a>
|
||||
<li><a href="section3.html#whyscaled" rel=subdocument>Q3.10. Why does your inverse transform return a scaled
|
||||
result?</a>
|
||||
<li><a href="section3.html#centerorigin" rel=subdocument>Q3.11. How can I make FFTW put the origin (zero frequency) at the center of
|
||||
its output?</a>
|
||||
<li><a href="section3.html#imageaudio" rel=subdocument>Q3.12. How do I FFT an image/audio file in <i>foobar</i> format?</a>
|
||||
<li><a href="section3.html#linkfails" rel=subdocument>Q3.13. My program does not link (on Unix).</a>
|
||||
<li><a href="section3.html#linkheader" rel=subdocument>Q3.14. I included your header, but linking still
|
||||
fails.</a>
|
||||
<li><a href="section3.html#nostack" rel=subdocument>Q3.15. My program crashes, complaining about stack
|
||||
space.</a>
|
||||
<li><a href="section3.html#leaks" rel=subdocument>Q3.16. FFTW seems to have a memory leak.</a>
|
||||
<li><a href="section3.html#allzero" rel=subdocument>Q3.17. The output of FFTW's transform is all zeros.</a>
|
||||
<li><a href="section3.html#vbetalia" rel=subdocument>Q3.18. How do I call FFTW from the Microsoft language du
|
||||
jour?</a>
|
||||
<li><a href="section3.html#pruned" rel=subdocument>Q3.19. Can I compute only a subset of the DFT outputs?</a>
|
||||
<li><a href="section3.html#transpose" rel=subdocument>Q3.20. Can I use FFTW's routines for in-place and out-of-place matrix
|
||||
transposition?</a>
|
||||
<br><br><li><b><font size="+2"><a href="section4.html" rel=subdocument>Section 4. Internals of FFTW</a></font></b>
|
||||
<li><a href="section4.html#howworks" rel=subdocument>Q4.1. How does FFTW work?</a>
|
||||
<li><a href="section4.html#whyfast" rel=subdocument>Q4.2. Why is FFTW so fast?</a>
|
||||
<br><br><li><b><font size="+2"><a href="section5.html" rel=subdocument>Section 5. Known bugs</a></font></b>
|
||||
<li><a href="section5.html#rfftwndbug" rel=subdocument>Q5.1. FFTW 1.1 crashes in rfftwnd on Linux.</a>
|
||||
<li><a href="section5.html#fftwmpibug" rel=subdocument>Q5.2. The MPI transforms in FFTW 1.2 give incorrect results/leak
|
||||
memory.</a>
|
||||
<li><a href="section5.html#testsingbug" rel=subdocument>Q5.3. The test programs in FFTW 1.2.1 fail when I change FFTW to use single
|
||||
precision.</a>
|
||||
<li><a href="section5.html#teststoobig" rel=subdocument>Q5.4. The test program in FFTW 1.2.1 fails for n >
|
||||
46340.</a>
|
||||
<li><a href="section5.html#linuxthreads" rel=subdocument>Q5.5. The threaded code fails on Linux Redhat 5.0</a>
|
||||
<li><a href="section5.html#bigrfftwnd" rel=subdocument>Q5.6. FFTW 2.0's rfftwnd fails for rank > 1 transforms with a final
|
||||
dimension >= 65536.</a>
|
||||
<li><a href="section5.html#primebug" rel=subdocument>Q5.7. FFTW 2.0's complex transforms give the wrong results with prime
|
||||
factors 17 to 97.</a>
|
||||
<li><a href="section5.html#mpichbug" rel=subdocument>Q5.8. FFTW 2.1.1's MPI test programs crash with
|
||||
MPICH.</a>
|
||||
<li><a href="section5.html#aixthreadbug" rel=subdocument>Q5.9. FFTW 2.1.2's multi-threaded transforms don't work on
|
||||
AIX.</a>
|
||||
<li><a href="section5.html#bigprimebug" rel=subdocument>Q5.10. FFTW 2.1.2's complex transforms give incorrect results for large prime
|
||||
sizes.</a>
|
||||
<li><a href="section5.html#solaristhreadbug" rel=subdocument>Q5.11. FFTW 2.1.3's multi-threaded transforms don't give any speedup on
|
||||
Solaris.</a>
|
||||
<li><a href="section5.html#aixflags" rel=subdocument>Q5.12. FFTW 2.1.3 crashes on AIX.</a>
|
||||
</ul><hr>
|
||||
<address>
|
||||
<A href="http://www.fftw.org">Matteo Frigo and Steven G. Johnson</A> / <A href="mailto:fftw@fftw.org">fftw@fftw.org</A>
|
||||
- 14 September 2021
|
||||
</address><br>
|
||||
Extracted from FFTW Frequently Asked Questions with Answers,
|
||||
Copyright © 2021 Matteo Frigo and Massachusetts Institute of Technology.
|
||||
</body></html>
|
||||
85
fftw-3.3.10/doc/FAQ/fftw-faq.html/section1.html
Normal file
@@ -0,0 +1,85 @@
|
||||
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
|
||||
<html>
|
||||
<head><title>
|
||||
FFTW FAQ - Section 1
|
||||
</title>
|
||||
<link rev="made" href="mailto:fftw@fftw.org">
|
||||
<link rel="Contents" href="index.html">
|
||||
<link rel="Start" href="index.html">
|
||||
<link rel="Next" href="section2.html"><link rel="Bookmark" title="FFTW FAQ" href="index.html">
|
||||
</head><body text="#000000" bgcolor="#FFFFFF"><h1>
|
||||
FFTW FAQ - Section 1 <br>
|
||||
Introduction and General Information
|
||||
</h1>
|
||||
|
||||
<ul>
|
||||
<li><a href="#whatisfftw" rel=subdocument>Q1.1. What is FFTW?</a>
|
||||
<li><a href="#whereisfftw" rel=subdocument>Q1.2. How do I obtain FFTW?</a>
|
||||
<li><a href="#isfftwfree" rel=subdocument>Q1.3. Is FFTW free software?</a>
|
||||
<li><a href="#nonfree" rel=subdocument>Q1.4. What is this about non-free licenses?</a>
|
||||
<li><a href="#west" rel=subdocument>Q1.5. In the West? I thought MIT was in the East?</a>
|
||||
</ul><hr>
|
||||
|
||||
<h2><A name="whatisfftw">
|
||||
Question 1.1. What is FFTW?
|
||||
</A></h2>
|
||||
|
||||
FFTW is a free collection of fast C routines for computing the
|
||||
Discrete Fourier Transform in one or more dimensions. It includes
|
||||
complex, real, symmetric, and parallel transforms, and can handle
|
||||
arbitrary array sizes efficiently. FFTW is typically faster than
|
||||
other publicly available FFT implementations, and is even
|
||||
competitive with vendor-tuned libraries. (See our web page for
|
||||
extensive benchmarks.) To achieve this performance, FFTW uses novel
|
||||
code-generation and runtime self-optimization techniques (along with
|
||||
many other tricks).
|
||||
<h2><A name="whereisfftw">
|
||||
Question 1.2. How do I obtain FFTW?
|
||||
</A></h2>
|
||||
|
||||
FFTW can be found at <A href="http://www.fftw.org">the FFTW web page</A>. You can also retrieve it from <code>ftp.fftw.org</code> in <A href="ftp://ftp.fftw.org/pub/fftw"><code>/pub/fftw</code></A>.
|
||||
<h2><A name="isfftwfree">
|
||||
Question 1.3. Is FFTW free software?
|
||||
</A></h2>
|
||||
|
||||
Starting with version 1.3, FFTW is Free Software in the technical
|
||||
sense defined by the Free Software Foundation (see
|
||||
<A href="http://www.gnu.org/philosophy/categories.html">Categories of Free and Non-Free Software</A>), and is distributed under the terms of the GNU General Public License. Previous versions of FFTW were
|
||||
distributed without fee for noncommercial use, but were not
|
||||
technically ``free.''
|
||||
<p>
|
||||
Non-free licenses for FFTW are also available that permit different
|
||||
terms of use than the GPL.
|
||||
<h2><A name="nonfree">
|
||||
Question 1.4. What is this about non-free
|
||||
licenses?
|
||||
</A></h2>
|
||||
|
||||
The non-free licenses are for companies that wish to use FFTW in their
|
||||
products but are unwilling to release their software under the GPL
|
||||
(which would require them to release source code and allow free
|
||||
redistribution). Such users can purchase an unlimited-use license
|
||||
from MIT. Contact us for more details.
|
||||
|
||||
<p>
|
||||
We could instead have released FFTW under the LGPL, or even disallowed
|
||||
non-Free usage. Suffice it to say, however, that MIT owns the
|
||||
copyright to FFTW and they only let us GPL it because we convinced
|
||||
them that it would neither affect their licensing revenue nor irritate
|
||||
existing licensees.
|
||||
<h2><A name="west">
|
||||
Question 1.5. In the West? I thought MIT was in the
|
||||
East?
|
||||
</A></h2>
|
||||
|
||||
Not to an Italian. You could say that we're a Spaghetti Western
|
||||
(with apologies to Sergio Leone). <hr>
|
||||
Next: <a href="section2.html" rel=precedes>Installing FFTW</a>.<br>
|
||||
<a href="index.html" rev=subdocument>Return to contents</a>.<p>
|
||||
<address>
|
||||
<A href="http://www.fftw.org">Matteo Frigo and Steven G. Johnson</A> / <A href="mailto:fftw@fftw.org">fftw@fftw.org</A>
|
||||
- 14 September 2021
|
||||
</address><br>
|
||||
Extracted from FFTW Frequently Asked Questions with Answers,
|
||||
Copyright © 2021 Matteo Frigo and Massachusetts Institute of Technology.
|
||||
</body></html>
|
||||
285
fftw-3.3.10/doc/FAQ/fftw-faq.html/section2.html
Normal file
@@ -0,0 +1,285 @@
|
||||
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
|
||||
<html>
|
||||
<head><title>
|
||||
FFTW FAQ - Section 2
|
||||
</title>
|
||||
<link rev="made" href="mailto:fftw@fftw.org">
|
||||
<link rel="Contents" href="index.html">
|
||||
<link rel="Start" href="index.html">
|
||||
<link rel="Next" href="section3.html"><link rel="Previous" href="section1.html"><link rel="Bookmark" title="FFTW FAQ" href="index.html">
|
||||
</head><body text="#000000" bgcolor="#FFFFFF"><h1>
|
||||
FFTW FAQ - Section 2 <br>
|
||||
Installing FFTW
|
||||
</h1>
|
||||
|
||||
<ul>
|
||||
<li><a href="#systems" rel=subdocument>Q2.1. Which systems does FFTW run on?</a>
|
||||
<li><a href="#runOnWindows" rel=subdocument>Q2.2. Does FFTW run on Windows?</a>
|
||||
<li><a href="#compilerCrashes" rel=subdocument>Q2.3. My compiler has trouble with FFTW.</a>
|
||||
<li><a href="#solarisSucks" rel=subdocument>Q2.4. FFTW does not compile on Solaris, complaining about
|
||||
<code>const</code>.</a>
|
||||
<li><a href="#3dnow" rel=subdocument>Q2.5. What's the difference between <code>--enable-3dnow</code> and <code>--enable-k7</code>?</a>
|
||||
<li><a href="#fma" rel=subdocument>Q2.6. What's the difference between the fma and the non-fma
|
||||
versions?</a>
|
||||
<li><a href="#languages" rel=subdocument>Q2.7. Which language is FFTW written in?</a>
|
||||
<li><a href="#fortran" rel=subdocument>Q2.8. Can I call FFTW from Fortran?</a>
|
||||
<li><a href="#cplusplus" rel=subdocument>Q2.9. Can I call FFTW from C++?</a>
|
||||
<li><a href="#whynotfortran" rel=subdocument>Q2.10. Why isn't FFTW written in Fortran/C++?</a>
|
||||
<li><a href="#singleprec" rel=subdocument>Q2.11. How do I compile FFTW to run in single precision?</a>
|
||||
<li><a href="#64bitk7" rel=subdocument>Q2.12. --enable-k7 does not work on x86-64</a>
|
||||
</ul><hr>
|
||||
|
||||
<h2><A name="systems">
|
||||
Question 2.1. Which systems does FFTW run
|
||||
on?
|
||||
</A></h2>
|
||||
|
||||
FFTW is written in ANSI C, and should work on any system with a decent
|
||||
C compiler. (See also <A href="#runOnWindows">Q2.2 `Does FFTW run on Windows?'</A>, <A href="#compilerCrashes">Q2.3 `My compiler has trouble with FFTW.'</A>.) FFTW can also take advantage of certain hardware-specific features,
|
||||
such as cycle counters and SIMD instructions, but this is optional.
|
||||
|
||||
<h2><A name="runOnWindows">
|
||||
Question 2.2. Does FFTW run on Windows?
|
||||
</A></h2>
|
||||
|
||||
Yes, many people have reported successfully using FFTW on Windows with
|
||||
various compilers. FFTW was not developed on Windows, but the source
|
||||
code is essentially straight ANSI C. See also the
|
||||
<A href="http://www.fftw.org/install/windows.html">FFTW Windows installation notes</A>, <A href="#compilerCrashes">Q2.3 `My compiler has trouble with FFTW.'</A>, and <A href="section3.html#vbetalia">Q3.18 `How do I call FFTW from the Microsoft language du
|
||||
jour?'</A>.
|
||||
<h2><A name="compilerCrashes">
|
||||
Question 2.3. My compiler has trouble with
|
||||
FFTW.
|
||||
</A></h2>
|
||||
|
||||
Complain fiercely to the vendor of the compiler.
|
||||
|
||||
<p>
|
||||
We have successfully used <code>gcc</code> 3.2.x on x86 and PPC, a recent Compaq C compiler for Alpha, version 6 of IBM's
|
||||
<code>xlc</code> compiler for AIX, Intel's <code>icc</code> versions 5-7, and Sun WorkShop <code>cc</code> version 6.
|
||||
<p>
|
||||
FFTW is likely to push compilers to their limits, however, and several
|
||||
compiler bugs have been exposed by FFTW. A partial list follows.
|
||||
|
||||
<p>
|
||||
<code>gcc</code> 2.95.x for Solaris/SPARC produces incorrect code for
|
||||
the test program (workaround: recompile the
|
||||
<code>libbench2</code> directory with <code>-O2</code>).
|
||||
<p>
|
||||
NetBSD/macppc 1.6 comes with a <code>gcc</code> version that also miscompiles the test program. (Please report a workaround if you know
|
||||
one.)
|
||||
<p>
|
||||
<code>gcc</code> 3.2.3 for ARM reportedly crashes during compilation.
|
||||
This bug is reportedly fixed in later versions of
|
||||
<code>gcc</code>.
|
||||
<p>
|
||||
Versions 8.0 and 8.1 of Intel's <code>icc</code> falsely claim to be <code>gcc</code>, so you should specify <code>CC="icc -no-gcc"</code>; this is automatic in FFTW 3.1. <code>icc-8.0.066</code> reportedly produces incorrect code for FFTW 2.1.5, but this is fixed in version 8.1.
|
||||
<code>icc-7.1</code> compiler build 20030402Z appears to produce
|
||||
incorrect dependencies, causing the compilation to fail.
|
||||
<code>icc-7.1</code> build 20030307Z appears to work fine. (Use
|
||||
<code>icc -V</code> to check which build you have.) As of 2003/04/18,
|
||||
build 20030402Z appears not to be available any longer on Intel's
|
||||
website, whereas the older build 20030307Z is available.
|
||||
|
||||
<p>
|
||||
<code>ranlib</code> of GNU <code>binutils</code> 2.9.1 on Irix has been observed to corrupt the FFTW libraries, causing a link failure when
|
||||
FFTW is compiled. Since <code>ranlib</code> is completely superfluous on Irix, we suggest deleting it from your system and replacing it with
|
||||
a symbolic link to <code>/bin/echo</code>.
|
||||
<p>
|
||||
If support for SIMD instructions is enabled in FFTW, further compiler
|
||||
problems may appear:
|
||||
<p>
|
||||
<code>gcc</code> 3.4.[0123] for x86 produces incorrect SSE2 code for
|
||||
FFTW when <code>-O2</code> (the best choice for FFTW) is used, causing
|
||||
FFTW to crash (<code>make check</code> crashes). This bug is fixed in <code>gcc</code> 3.4.4. On x86_64 (amd64/em64t), <code>gcc</code> 3.4.4 reportedly still has a similar problem, but this is fixed as of
|
||||
<code>gcc</code> 3.4.6.
|
||||
<p>
|
||||
<code>gcc-3.2</code> for x86 produces incorrect SIMD code if
|
||||
<code>-O3</code> is used. The same compiler produces incorrect SIMD
|
||||
code if no optimization is used, too. When using
|
||||
<code>gcc-3.2</code>, it is a good idea not to change the default
|
||||
<code>CFLAGS</code> selected by the <code>configure</code> script.
|
||||
<p>
|
||||
Some 3.0.x and 3.1.x versions of <code>gcc</code> on <code>x86</code> may crash. The so-called <code>gcc</code> 2.96 that shipped with RedHat 7.3 crashes
|
||||
when compiling SIMD code. In both cases, please upgrade to
|
||||
<code>gcc-3.2</code> or later.
|
||||
<p>
|
||||
Intel's <code>icc</code> 6.0 misaligns SSE constants, but FFTW has a
|
||||
workaround. <code>icc</code> 8.x fails to compile FFTW 3.0.x because it
|
||||
falsely claims to be <code>gcc</code>; we believe this to be a bug in <code>icc</code>, but FFTW 3.1 has a workaround.
|
||||
<p>
|
||||
Visual C++ 2003 reportedly produces incorrect code for SSE/SSE2 when
|
||||
compiling FFTW. This bug was reportedly fixed in VC++ 2005;
|
||||
alternatively, you could switch to the Intel compiler. VC++ 6.0 also
|
||||
reportedly produces incorrect code for the file
|
||||
<code>reodft11e-r2hc-odd.c</code> unless optimizations are disabled for that file.
|
||||
<p>
|
||||
<code>gcc</code> 2.95 on MacOS X miscompiles AltiVec code (fixed in
|
||||
later versions). <code>gcc</code> 3.2.x miscompiles AltiVec permutations, but FFTW has a workaround.
|
||||
<code>gcc</code> 4.0.1 on MacOS for Intel crashes when compiling FFTW; a workaround is to
|
||||
compile one file without optimization: <code>cd kernel; make CFLAGS=" " trig.lo</code>.
|
||||
<p>
|
||||
<code>gcc</code> 4.1.1 reportedly crashes when compiling FFTW for MIPS;
|
||||
the workaround is to compile the file it crashes on
|
||||
(<code>t2_64.c</code>) with a lower optimization level.
|
||||
<p>
|
||||
<code>gcc</code> versions 4.1.2 to 4.2.0 for x86 reportedly miscompile
|
||||
FFTW 3.1's test program, causing <code>make check</code> to crash (<code>gcc</code> bug #26528). The bug was reportedly fixed in
|
||||
<code>gcc</code> version 4.2.1 and later. A workaround is to compile
|
||||
<code>libbench2/verify-lib.c</code> without optimization.
|
||||
<h2><A name="solarisSucks">
|
||||
Question 2.4. FFTW does not compile on Solaris, complaining about
|
||||
<code>const</code>.
|
||||
</A></h2>
|
||||
|
||||
We know that at least on Solaris 2.5.x with Sun's compilers 4.2 you
|
||||
might get error messages from <code>make</code> such as
|
||||
<p>
|
||||
<code>"./fftw.h", line 88: warning: const is a keyword in ANSI
|
||||
C</code>
|
||||
<p>
|
||||
This is the case when the <code>configure</code> script reports that <code>const</code> does not work:
|
||||
<p>
|
||||
<code>checking for working const... (cached) no</code>
|
||||
<p>
|
||||
You should be aware that Solaris comes with two compilers, namely,
|
||||
<code>/opt/SUNWspro/SC4.2/bin/cc</code> and <code>/usr/ucb/cc</code>. The latter compiler is non-ANSI. Indeed, it is a perverse shell script
|
||||
that calls the real compiler in non-ANSI mode. In order
|
||||
to compile FFTW, change your path so that the right
|
||||
<code>cc</code> is used.
|
||||
<p>
|
||||
To know whether your compiler is the right one, type
|
||||
<code>cc -V</code>. If the compiler prints ``<code>ucbcc</code>'', as in
|
||||
<p>
|
||||
<code>ucbcc: WorkShop Compilers 4.2 30 Oct 1996 C
|
||||
4.2</code>
|
||||
<p>
|
||||
then the compiler is wrong. The right message is something like
|
||||
|
||||
<p>
|
||||
<code>cc: WorkShop Compilers 4.2 30 Oct 1996 C
|
||||
4.2</code>
|
||||
<h2><A name="3dnow">
|
||||
Question 2.5. What's the difference between
|
||||
<code>--enable-3dnow</code> and <code>--enable-k7</code>?
|
||||
</A></h2>
|
||||
|
||||
<code>--enable-k7</code> enables 3DNow! instructions on K7 processors
|
||||
(AMD Athlon and its variants). K7 support is provided by assembly
|
||||
routines generated by a special purpose compiler.
|
||||
As of fftw-3.2, --enable-k7 is no longer supported.
|
||||
|
||||
<p>
|
||||
<code>--enable-3dnow</code> enables generic 3DNow! support using <code>gcc</code> builtin functions. This works on earlier AMD
|
||||
processors, but it is not as fast as our special assembly routines.
|
||||
As of fftw-3.1, --enable-3dnow is no longer supported.
|
||||
|
||||
<h2><A name="fma">
|
||||
Question 2.6. What's the difference between the fma and the non-fma
|
||||
versions?
|
||||
</A></h2>
|
||||
|
||||
The fma version tries to exploit the fused multiply-add instructions
|
||||
implemented in many processors such as PowerPC, ia-64, and MIPS. The
|
||||
two FFTW packages are otherwise identical. In FFTW 3.1, the fma and
|
||||
non-fma versions were merged together into a single package, and the
|
||||
<code>configure</code> script attempts to automatically guess which
|
||||
version to use.
|
||||
<p>
|
||||
The FFTW 3.1 <code>configure</code> script enables fma by default on PowerPC, Itanium, and PA-RISC, and disables it otherwise. You can
|
||||
force one or the other by using the <code>--enable-fma</code> or <code>--disable-fma</code> flag for <code>configure</code>.
|
||||
<p>
|
||||
Definitely use fma if you have a PowerPC-based system with
|
||||
<code>gcc</code> (or IBM <code>xlc</code>). This includes all GNU/Linux systems for PowerPC and the older PowerPC-based MacOS systems. Also
|
||||
use it on PA-RISC and Itanium with the HP/UX compiler.
|
||||
|
||||
<p>
|
||||
Definitely do not use the fma version if you have an ia-32 processor
|
||||
(Intel, AMD, MacOS on Intel, etcetera).
|
||||
|
||||
<p>
|
||||
For other architectures/compilers, the situation is not so clear. For
|
||||
example, ia-64 has the fma instruction, but
|
||||
<code>gcc-3.2</code> appears not to exploit it correctly. Other compilers may do the right thing,
|
||||
but we have not tried them. Please send us your feedback so that we
|
||||
can update this FAQ entry.
|
||||
<h2><A name="languages">
|
||||
Question 2.7. Which language is FFTW written
|
||||
in?
|
||||
</A></h2>
|
||||
|
||||
FFTW is written in ANSI C. Most of the code, however, was
|
||||
automatically generated by a program called
|
||||
<code>genfft</code>, written in the Objective Caml dialect of ML. You do not need to know ML or to
|
||||
have an Objective Caml compiler in order to use FFTW.
|
||||
|
||||
<p>
|
||||
<code>genfft</code> is provided with the FFTW sources, which means that
|
||||
you can play with the code generator if you want. In this case, you
|
||||
need a working Objective Caml system. Objective Caml is available
|
||||
from <A href="http://caml.inria.fr">the Caml web page</A>.
|
||||
<h2><A name="fortran">
|
||||
Question 2.8. Can I call FFTW from Fortran?
|
||||
</A></h2>
|
||||
|
||||
Yes, FFTW (versions 1.3 and higher) contains a Fortran-callable
|
||||
interface, documented in the FFTW manual.
|
||||
|
||||
<p>
|
||||
By default, FFTW configures its Fortran interface to work with the
|
||||
first compiler it finds, e.g. <code>g77</code>. To configure for a different, incompatible Fortran compiler
|
||||
<code>foobar</code>, use <code>./configure F77=foobar</code> when installing FFTW. (In the case of <code>g77</code>, however, FFTW 3.x also includes an extra set of
|
||||
Fortran-callable routines with one fewer underscore at the end of
|
||||
identifiers, which should cover most other Fortran compilers on Linux
|
||||
at least.)
|
||||
<h2><A name="cplusplus">
|
||||
Question 2.9. Can I call FFTW from C++?
|
||||
</A></h2>
|
||||
|
||||
Most definitely. FFTW should compile and/or link under any C++
|
||||
compiler. Moreover, it is likely that the C++
|
||||
<code>&lt;complex&gt;</code> template class is bit-compatible with FFTW's complex-number format
|
||||
(see the FFTW manual for more details).
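<p>
As a rough illustration of the layout argument, here is a minimal C sketch of the analogous C99 case. It assumes that the complex-number format in question is an array of two doubles with the real part first, as described in the FFTW manual. The typedef <code>pair_complex</code> is a hypothetical stand-in for that format, not FFTW's own declaration, and <code>layouts_match</code> is an illustrative helper.

```c
#include <complex.h>
#include <string.h>

/* Hypothetical stand-in for a "two doubles, real part first" complex
   format. This is NOT FFTW's own typedef. */
typedef double pair_complex[2];

/* Copy the bytes of a C99 complex number into the pair format and
   return 1 if the real and imaginary parts land in slots 0 and 1. */
int layouts_match(double re, double im)
{
    double complex z = re + im * I;
    pair_complex p;

    if (sizeof z != sizeof p)
        return 0;                 /* sizes differ: not bit-compatible */
    memcpy(p, &z, sizeof p);      /* reinterpret the same bytes */
    return p[0] == re && p[1] == im;
}
```

On a conforming C99 compiler this should return 1 for finite inputs, because C99 requires a complex type to be laid out like a two-element array of the corresponding real type. The C++ case discussed above is analogous but depends on the C++ library, so the FFTW manual remains the reference there.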
|
||||
|
||||
<h2><A name="whynotfortran">
|
||||
Question 2.10. Why isn't FFTW written in
|
||||
Fortran/C++?
|
||||
</A></h2>
|
||||
|
||||
Because we don't like those languages, and neither approaches the
|
||||
portability of C.
|
||||
<h2><A name="singleprec">
|
||||
Question 2.11. How do I compile FFTW to run in single
|
||||
precision?
|
||||
</A></h2>
|
||||
|
||||
On a Unix system: <code>configure --enable-float</code>. On a non-Unix system: edit <code>config.h</code> to <code>#define</code> the symbol <code>FFTW_SINGLE</code> (for FFTW 3.x). In both cases, you must then
|
||||
recompile FFTW. In FFTW 3, all FFTW identifiers will then begin with
|
||||
<code>fftwf_</code> instead of <code>fftw_</code>.
|
||||
<h2><A name="64bitk7">
|
||||
Question 2.12. --enable-k7 does not work on
|
||||
x86-64
|
||||
</A></h2>
|
||||
|
||||
Support for --enable-k7 was discontinued in fftw-3.2.
|
||||
|
||||
<p>
|
||||
The fftw-3.1 release supports --enable-k7. This option only works on
|
||||
32-bit x86 machines that implement 3DNow!, including the AMD Athlon
|
||||
and the AMD Opteron in 32-bit mode. --enable-k7 does not work on AMD
|
||||
Opteron in 64-bit mode. Use --enable-sse for x86-64 machines.
|
||||
|
||||
<p>
|
||||
FFTW supports 3DNow! by means of assembly code generated by a
|
||||
special-purpose compiler. It is hard to produce assembly code that
|
||||
works in both 32-bit and 64-bit mode. <hr>
|
||||
Next: <a href="section3.html" rel=precedes>Using FFTW</a>.<br>
|
||||
Back: <a href="section1.html" rev=precedes>Introduction and General Information</a>.<br>
|
||||
<a href="index.html" rev=subdocument>Return to contents</a>.<p>
|
||||
<address>
|
||||
<A href="http://www.fftw.org">Matteo Frigo and Steven G. Johnson</A> / <A href="mailto:fftw@fftw.org">fftw@fftw.org</A>
|
||||
- 14 September 2021
|
||||
</address><br>
|
||||
Extracted from FFTW Frequently Asked Questions with Answers,
|
||||
Copyright © 2021 Matteo Frigo and Massachusetts Institute of Technology.
|
||||
</body></html>
|
||||
334
fftw-3.3.10/doc/FAQ/fftw-faq.html/section3.html
Normal file
@@ -0,0 +1,334 @@
|
||||
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
|
||||
<html>
|
||||
<head><title>
|
||||
FFTW FAQ - Section 3
|
||||
</title>
|
||||
<link rev="made" href="mailto:fftw@fftw.org">
|
||||
<link rel="Contents" href="index.html">
|
||||
<link rel="Start" href="index.html">
|
||||
<link rel="Next" href="section4.html"><link rel="Previous" href="section2.html"><link rel="Bookmark" title="FFTW FAQ" href="index.html">
|
||||
</head><body text="#000000" bgcolor="#FFFFFF"><h1>
|
||||
FFTW FAQ - Section 3 <br>
|
||||
Using FFTW
|
||||
</h1>
|
||||
|
||||
<ul>
|
||||
<li><a href="#fftw2to3" rel=subdocument>Q3.1. Why not support the FFTW 2 interface in FFTW
|
||||
3?</a>
|
||||
<li><a href="#planperarray" rel=subdocument>Q3.2. Why do FFTW 3 plans encapsulate the input/output arrays and not just
|
||||
the algorithm?</a>
|
||||
<li><a href="#slow" rel=subdocument>Q3.3. FFTW seems really slow.</a>
|
||||
<li><a href="#slows" rel=subdocument>Q3.4. FFTW slows down after repeated calls.</a>
|
||||
<li><a href="#segfault" rel=subdocument>Q3.5. An FFTW routine is crashing when I call it.</a>
|
||||
<li><a href="#fortran64" rel=subdocument>Q3.6. My Fortran program crashes when calling FFTW.</a>
|
||||
<li><a href="#conventions" rel=subdocument>Q3.7. FFTW gives results different from my old
|
||||
FFT.</a>
|
||||
<li><a href="#nondeterministic" rel=subdocument>Q3.8. FFTW gives different results between runs</a>
|
||||
<li><a href="#savePlans" rel=subdocument>Q3.9. Can I save FFTW's plans?</a>
|
||||
<li><a href="#whyscaled" rel=subdocument>Q3.10. Why does your inverse transform return a scaled
|
||||
result?</a>
|
||||
<li><a href="#centerorigin" rel=subdocument>Q3.11. How can I make FFTW put the origin (zero frequency) at the center of
|
||||
its output?</a>
|
||||
<li><a href="#imageaudio" rel=subdocument>Q3.12. How do I FFT an image/audio file in <i>foobar</i> format?</a>
|
||||
<li><a href="#linkfails" rel=subdocument>Q3.13. My program does not link (on Unix).</a>
|
||||
<li><a href="#linkheader" rel=subdocument>Q3.14. I included your header, but linking still
|
||||
fails.</a>
|
||||
<li><a href="#nostack" rel=subdocument>Q3.15. My program crashes, complaining about stack
|
||||
space.</a>
|
||||
<li><a href="#leaks" rel=subdocument>Q3.16. FFTW seems to have a memory leak.</a>
|
||||
<li><a href="#allzero" rel=subdocument>Q3.17. The output of FFTW's transform is all zeros.</a>
|
||||
<li><a href="#vbetalia" rel=subdocument>Q3.18. How do I call FFTW from the Microsoft language du
|
||||
jour?</a>
|
||||
<li><a href="#pruned" rel=subdocument>Q3.19. Can I compute only a subset of the DFT outputs?</a>
|
||||
<li><a href="#transpose" rel=subdocument>Q3.20. Can I use FFTW's routines for in-place and out-of-place matrix
|
||||
transposition?</a>
|
||||
</ul><hr>
|
||||
|
||||
<h2><A name="fftw2to3">
|
||||
Question 3.1. Why not support the FFTW 2 interface in FFTW
|
||||
3?
|
||||
</A></h2>
|
||||
|
||||
FFTW 3 has semantics incompatible with earlier versions: its plans can
|
||||
only be used for a given stride, multiplicity, and other
|
||||
characteristics of the input and output arrays; these stronger
|
||||
semantics are necessary for performance reasons. Thus, it is
|
||||
impossible to efficiently emulate the older interface (whose plans can
|
||||
be used for any transform of the same size). We believe that it
|
||||
should be possible to upgrade most programs without any difficulty,
|
||||
however.
|
||||
<h2><A name="planperarray">
|
||||
Question 3.2. Why do FFTW 3 plans encapsulate the input/output arrays
|
||||
and not just the algorithm?
|
||||
</A></h2>
|
||||
|
||||
There are several reasons:
|
||||
<ul>
|
||||
<li>It was important for performance reasons that the plan be specific to
|
||||
array characteristics like the stride (and alignment, for SIMD), and
|
||||
requiring that the user maintain these invariants is error prone.
|
||||
|
||||
<li>In most high-performance applications, as far as we can tell, you are
|
||||
usually transforming the same array over and over, so FFTW's semantics
|
||||
should not be a burden.
|
||||
<li>If you need to transform another array of the same size, creating a
|
||||
new plan once the first exists is a cheap operation.
|
||||
|
||||
<li>If you need to transform many arrays of the same size at once, you
|
||||
should really use the <code>plan_many</code> routines in FFTW's "advanced"
|
||||
interface.
|
||||
<li>If the abovementioned array characteristics are the same, you are
|
||||
willing to pay close attention to the documentation, and you really
|
||||
need to, we provide a "new-array execution" interface to
|
||||
apply a plan to a new array.
|
||||
</ul>
|
||||
|
||||
<h2><A name="slow">
|
||||
Question 3.3. FFTW seems really slow.
|
||||
</A></h2>
|
||||
|
||||
You are probably recreating the plan before every transform, rather
|
||||
than creating it once and reusing it for all transforms of the same
|
||||
size. FFTW is designed to be used in the following way:
|
||||
|
||||
<ul>
|
||||
<li>First, you create a plan. This will take several seconds.
|
||||
|
||||
<li>Then, you reuse the plan many times to perform FFTs. These are fast.
|
||||
|
||||
</ul>
|
||||
If you don't need to compute many transforms and the time for the
|
||||
planner is significant, you have two options. First, you can use the
|
||||
<code>FFTW_ESTIMATE</code> option in the planner, which uses heuristics
|
||||
instead of runtime measurements and produces a good plan in a short
|
||||
time. Second, you can use the wisdom feature to precompute the plan;
|
||||
see <A href="#savePlans">Q3.9 `Can I save FFTW's plans?'</A>.
|
||||
<h2><A name="slows">
|
||||
Question 3.4. FFTW slows down after repeated
|
||||
calls.
|
||||
</A></h2>
|
||||
|
||||
Probably, NaNs or similar are creeping into your data, and the
|
||||
slowdown is due to the resulting floating-point exceptions. For
|
||||
example, be aware that repeatedly FFTing the same array is a diverging
|
||||
process (because FFTW computes the unnormalized transform).
|
||||
|
||||
<h2><A name="segfault">
|
||||
Question 3.5. An FFTW routine is crashing when I call
|
||||
it.
|
||||
</A></h2>
|
||||
|
||||
Did the FFTW test programs pass (<code>make check</code>, or <code>cd tests; make bigcheck</code> if you want to be paranoid)? If so, you almost
|
||||
certainly have a bug in your own code. For example, you could be
|
||||
passing invalid arguments (such as wrongly-sized arrays) to FFTW, or
|
||||
you could simply have memory corruption elsewhere in your program that
|
||||
causes random crashes later on. Please don't complain to us unless
|
||||
you can come up with a minimal self-contained program (preferably
|
||||
under 30 lines) that illustrates the problem.
|
||||
|
||||
<h2><A name="fortran64">
|
||||
Question 3.6. My Fortran program crashes when calling
|
||||
FFTW.
|
||||
</A></h2>
|
||||
|
||||
As described in the manual, on 64-bit machines you must store the
|
||||
plans in variables large enough to hold a pointer, for example
|
||||
<code>integer*8</code>. We recommend using <code>integer*8</code> on 32-bit machines as well, to simplify porting.
|
||||
|
||||
<h2><A name="conventions">
|
||||
Question 3.7. FFTW gives results different from my old
|
||||
FFT.
|
||||
</A></h2>
|
||||
|
||||
People follow many different conventions for the DFT, and you should
|
||||
be sure to know the ones that we use (described in the FFTW manual).
|
||||
In particular, you should be aware that the
|
||||
<code>FFTW_FORWARD</code>/<code>FFTW_BACKWARD</code> directions correspond to signs of -1/+1 in the exponent of the DFT definition.
|
||||
(<i>Numerical Recipes</i> uses the opposite convention.)
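<p>
As an illustration of the sign convention, here is a minimal sketch of a naive 4-point DFT written without FFTW. The helper <code>dft4</code> is a hypothetical name, not part of FFTW's API; its <code>sign</code> argument plays the role of <code>FFTW_FORWARD</code> (-1) or <code>FFTW_BACKWARD</code> (+1). The size is fixed at 4 only so that the twiddle factors are exactly 1, i, -1, and -i and no trigonometric functions are needed.

```c
#include <complex.h>

/* Unnormalized 4-point DFT (illustrative helper, not an FFTW routine):
 *   out[k] = sum over j of in[j] * exp(sign * 2*pi*i * j*k / 4)
 * For n = 4 the factor exp(sign * 2*pi*i * m / 4) equals (sign*i)^m,
 * so the four possible twiddles can be tabulated exactly. */
void dft4(const double complex in[4], double complex out[4], int sign)
{
    const double complex w[4] = { 1.0, sign * I, -1.0, -sign * I };

    for (int k = 0; k < 4; ++k) {
        double complex sum = 0.0;
        for (int j = 0; j < 4; ++j)
            sum += in[j] * w[(j * k) % 4];
        out[k] = sum;
    }
}
```

Transforming a unit impulse at index 1 gives 1, -i, -1, i with <code>sign = -1</code> and 1, i, -1, -i with <code>sign = +1</code>; a library that follows the opposite convention swaps the two.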
|
||||
<p>
|
||||
You should also know that we compute an unnormalized transform. In
|
||||
contrast, Matlab is an example of a program that computes a normalized
|
||||
transform. See <A href="#whyscaled">Q3.10 `Why does your inverse transform return a scaled
|
||||
result?'</A>.
|
||||
<p>
|
||||
Finally, note that floating-point arithmetic is not exact, so
|
||||
different FFT algorithms will give slightly different results (on the
|
||||
order of the numerical accuracy; typically a fractional difference of
|
||||
1e-15 or so in double precision).
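<p>
The effect is easy to reproduce without any FFT at all: floating-point addition is not associative, so two algorithms that add the same terms in a different order can disagree in the last digits. A minimal sketch follows; the function names are illustrative only, and the values mentioned afterwards assume IEEE 754 double arithmetic without extended intermediate precision.

```c
/* Sum three doubles left to right: (a + b) + c. */
double sum_left(double a, double b, double c)
{
    return (a + b) + c;
}

/* Sum the same three doubles right to left: a + (b + c). */
double sum_right(double a, double b, double c)
{
    return a + (b + c);
}
```

With a = 1e16, b = -1e16 and c = 1, the first ordering returns 1 while the second returns 0, because b + c is rounded back to -1e16 before a is added.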
|
||||
<h2><A name="nondeterministic">
|
||||
Question 3.8. FFTW gives different results between
|
||||
runs
|
||||
</A></h2>
|
||||
|
||||
If you use <code>FFTW_MEASURE</code> or <code>FFTW_PATIENT</code> mode, then the algorithm FFTW employs is not deterministic: it depends on
|
||||
runtime performance measurements. This will cause the results to vary
|
||||
slightly from run to run. However, the differences should be slight,
|
||||
on the order of the floating-point precision, and therefore should
|
||||
have no practical impact on most applications.
|
||||
|
||||
<p>
|
||||
If you use saved plans (wisdom) or <code>FFTW_ESTIMATE</code> mode, however, then the algorithm is deterministic and the results should be
|
||||
identical between runs.
|
||||
<h2><A name="savePlans">
|
||||
Question 3.9. Can I save FFTW's plans?
|
||||
</A></h2>
|
||||
|
||||
Yes. Starting with version 1.2, FFTW provides the
|
||||
<code>wisdom</code> mechanism for saving plans; see the FFTW manual.
|
||||
|
||||
<h2><A name="whyscaled">
|
||||
Question 3.10. Why does your inverse transform return a scaled
|
||||
result?
|
||||
</A></h2>
|
||||
|
||||
Computing the forward transform followed by the backward transform (or
|
||||
vice versa) yields the original array scaled by the size of the array.
|
||||
(For multi-dimensional transforms, the size of the array is the
|
||||
product of the dimensions.) We could, instead, have chosen a
|
||||
normalization that would have returned the unscaled array. Or, to
|
||||
accommodate the many conventions in this matter, the transform routines
|
||||
could have accepted a "scale factor" parameter. We did not
|
||||
do this, however, for two reasons. First, we didn't want to sacrifice
|
||||
performance in the common case where the scale factor is 1. Second, in
|
||||
real applications the FFT is followed or preceded by some computation
|
||||
on the data, into which the scale factor can typically be absorbed at
|
||||
little or no cost.
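<p>
The scaling can be seen in a small sketch that does not use FFTW at all. The helpers <code>dft4</code> and <code>roundtrip4</code> are hypothetical names introduced here; the transform size is fixed at 4 so that the twiddle factors are exact. A forward transform followed by a backward transform returns the input multiplied by 4, and one division per element restores it.

```c
#include <complex.h>

/* Unnormalized 4-point DFT (illustrative helper, not an FFTW routine).
 * sign = -1 is the forward direction, sign = +1 the backward one. */
void dft4(const double complex in[4], double complex out[4], int sign)
{
    const double complex w[4] = { 1.0, sign * I, -1.0, -sign * I };

    for (int k = 0; k < 4; ++k) {
        double complex sum = 0.0;
        for (int j = 0; j < 4; ++j)
            sum += in[j] * w[(j * k) % 4];
        out[k] = sum;
    }
}

/* Forward then backward transform of `in`.
 * `scaled` receives the raw round-trip result (4 times the input);
 * `restored` receives the same data divided by the array size. */
void roundtrip4(const double complex in[4],
                double complex scaled[4],
                double complex restored[4])
{
    double complex freq[4];

    dft4(in, freq, -1);
    dft4(freq, scaled, +1);
    for (int j = 0; j < 4; ++j)
        restored[j] = scaled[j] / 4.0;
}
```

In a real application the division would usually be folded into whatever computation follows or precedes the transform, as described above, rather than performed as a separate pass.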
|
||||
<h2><A name="centerorigin">
|
||||
Question 3.11. How can I make FFTW put the origin (zero frequency) at
|
||||
the center of its output?
|
||||
</A></h2>
|
||||
|
||||
For human viewing of a spectrum, it is often convenient to put the
|
||||
origin in frequency space at the center of the output array, rather
|
||||
than in the zero-th element (the default in FFTW). If all of the
|
||||
dimensions of your array are even, you can accomplish this by simply
|
||||
multiplying each element of the input array by (-1)^(i + j + ...),
|
||||
where i, j, etcetera are the indices of the element. (This trick is a
|
||||
general property of the DFT, and is not specific to FFTW.)
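<p>
For a two-dimensional, row-major array the sign flip might be sketched as follows. The function name <code>checkerboard_sign</code> is hypothetical, and the sketch handles a real-valued array for brevity; for complex data the same sign would be applied to both the real and the imaginary part.

```c
#include <stddef.h>

/* Multiply element (i, j) of a row-major rows x cols array by (-1)^(i + j).
 * Intended to be applied to the input before the transform; the trick
 * assumes that both rows and cols are even. */
void checkerboard_sign(double *a, size_t rows, size_t cols)
{
    for (size_t i = 0; i < rows; ++i)
        for (size_t j = 0; j < cols; ++j)
            if ((i + j) % 2 == 1)
                a[i * cols + j] = -a[i * cols + j];
}
```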
|
||||
|
||||
<h2><A name="imageaudio">
|
||||
Question 3.12. How do I FFT an image/audio file in
|
||||
<i>foobar</i> format?
|
||||
</A></h2>
|
||||
|
||||
FFTW performs an FFT on an array of floating-point values. You can
|
||||
certainly use it to compute the transform of an image or audio stream,
|
||||
but you are responsible for figuring out your data format and
|
||||
converting it to the form FFTW requires.
|
||||
|
||||
<h2><A name="linkfails">
|
||||
Question 3.13. My program does not link (on
|
||||
Unix).
|
||||
</A></h2>
|
||||
|
||||
The libraries must be listed in the correct order
|
||||
(<code>-lfftw3 -lm</code> for FFTW 3.x) and <i>after</i> your program sources/objects. (The general rule is that if <i>A</i> uses <i>B</i>, then <i>A</i> must be listed before <i>B</i> in the link command.)
|
||||
<h2><A name="linkheader">
|
||||
Question 3.14. I included your header, but linking still
|
||||
fails.
|
||||
</A></h2>
|
||||
|
||||
You're a C++ programmer, aren't you? You have to compile the FFTW
|
||||
library and link it into your program, not just
|
||||
<code>#include &lt;fftw3.h&gt;</code>. (Yes, this is really a FAQ.)
|
||||
<h2><A name="nostack">
|
||||
Question 3.15. My program crashes, complaining about stack
|
||||
space.
|
||||
</A></h2>
|
||||
|
||||
You cannot declare large arrays with automatic storage (e.g. via
|
||||
<code>fftw_complex array[N]</code>); you should use <code>fftw_malloc</code> (or equivalent) to allocate the arrays you want
|
||||
to transform if they are larger than a few hundred elements.
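<p>
A minimal sketch of heap allocation is shown below. To keep it compilable without FFTW installed, it uses plain <code>malloc</code> as the "equivalent" and a stand-in type <code>cplx</code> with the same layout as the default <code>fftw_complex</code> (two doubles); both the type name and the function name <code>alloc_signal</code> are assumptions made for this example. In a program linked against FFTW, <code>fftw_malloc</code> and <code>fftw_free</code> would take the place of <code>malloc</code> and <code>free</code>.

```c
#include <stdlib.h>

/* Stand-in for fftw_complex: real part in [0], imaginary part in [1]. */
typedef double cplx[2];

/* Allocate n complex elements on the heap and zero them.
 * Returns NULL if the allocation fails; the caller releases the array
 * with free() (fftw_free() if fftw_malloc() was used instead). */
cplx *alloc_signal(size_t n)
{
    cplx *a = malloc(n * sizeof *a);

    if (a != NULL) {
        for (size_t i = 0; i < n; ++i) {
            a[i][0] = 0.0;
            a[i][1] = 0.0;
        }
    }
    return a;
}
```

An array of a million elements allocated this way occupies about 16 MB of heap, which would typically not fit in the stack space available to an automatic array.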
|
||||
|
||||
<h2><A name="leaks">
|
||||
Question 3.16. FFTW seems to have a memory
|
||||
leak.
|
||||
</A></h2>
|
||||
|
||||
After you create a plan, FFTW caches the information required to
|
||||
quickly recreate the plan. (See <A href="#savePlans">Q3.9 `Can I save FFTW's plans?'</A>.) It also maintains a small amount of other persistent memory. You can deallocate all of
|
||||
FFTW's internally allocated memory, if you wish, by calling
|
||||
<code>fftw_cleanup()</code>, as documented in the manual.
|
||||
<h2><A name="allzero">
|
||||
Question 3.17. The output of FFTW's transform is all
|
||||
zeros.
|
||||
</A></h2>
|
||||
|
||||
You should initialize your input array <i>after</i> creating the plan, unless you use <code>FFTW_ESTIMATE</code>: planning with <code>FFTW_MEASURE</code> or <code>FFTW_PATIENT</code> overwrites the input/output arrays, as described in the manual.
|
||||
|
||||
<h2><A name="vbetalia">
|
||||
Question 3.18. How do I call FFTW from the Microsoft language du
|
||||
jour?
|
||||
</A></h2>
|
||||
|
||||
Please <i>do not</i> ask us Windows-specific questions. We do not
|
||||
use Windows. We know nothing about Visual Basic, Visual C++, or .NET.
|
||||
Please find the appropriate Usenet discussion group and ask your
|
||||
question there. See also <A href="section2.html#runOnWindows">Q2.2 `Does FFTW run on Windows?'</A>.
|
||||
<h2><A name="pruned">
|
||||
Question 3.19. Can I compute only a subset of the DFT
|
||||
outputs?
|
||||
</A></h2>
|
||||
|
||||
In general, no: an FFT intrinsically computes all outputs from all
|
||||
inputs. In principle, there is something called a
|
||||
<i>pruned FFT</i> that can do what you want, but to compute K outputs out of N the
|
||||
complexity is in general O(N log K) instead of O(N log N), thus saving
|
||||
only a small additive factor in the log. (The same argument holds if
|
||||
you instead have only K nonzero inputs.)
|
||||
|
||||
<p>
|
||||
There are some specific cases in which you can get the O(N log K)
|
||||
performance benefits easily, however, by combining a few ordinary
|
||||
FFTs. In particular, the case where you want the first K outputs,
|
||||
where K divides N, can be handled by performing N/K transforms of size
|
||||
K and then summing the outputs multiplied by appropriate phase
|
||||
factors. For more details, see <A href="http://www.fftw.org/pruned.html">pruned FFTs with FFTW</A>.
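<p>
The decomposition for the first K outputs can be written out explicitly. The following is a sketch of the standard derivation using the forward sign convention; the symbols L, q, r and the notation for the roots of unity are introduced here for illustration and do not come from the FAQ.

```latex
\begin{aligned}
Y_k &= \sum_{n=0}^{N-1} x_n\,\omega_N^{nk},
      \qquad \omega_N = e^{-2\pi i/N}, \quad 0 \le k < K, \\
    &= \sum_{r=0}^{L-1} \omega_N^{rk}
       \sum_{q=0}^{K-1} x_{Lq+r}\,\omega_K^{qk},
      \qquad n = Lq + r, \quad L = N/K .
\end{aligned}
```

The second line uses the identity that the N-th root of unity raised to the power Lqk equals the K-th root of unity raised to the power qk. Each inner sum is a size-K DFT of the strided subsequence starting at offset r, so there are L = N/K such transforms costing O(N log K) in total, and the outer sum with the phase factors adds O(N) further operations.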
|
||||
<p>
|
||||
There are also some algorithms that compute pruned transforms
|
||||
<i>approximately</i>, but they are beyond the scope of this FAQ.
|
||||
|
||||
<h2><A name="transpose">
|
||||
Question 3.20. Can I use FFTW's routines for in-place and
|
||||
out-of-place matrix transposition?
|
||||
</A></h2>
|
||||
|
||||
You can use the FFTW guru interface to create a rank-0 transform of
|
||||
vector rank 2 where the vector strides are transposed. (A rank-0
|
||||
transform is equivalent to a 1D transform of size 1, which just
|
||||
copies the input into the output.) Specifying the same location for
|
||||
the input and output makes the transpose in-place.
|
||||
|
||||
<p>
|
||||
For double-valued data stored in row-major format, plan creation looks
|
||||
like this: <pre>
|
||||
fftw_plan plan_transpose(int rows, int cols, double *in, double *out)
|
||||
{
|
||||
const unsigned flags = FFTW_ESTIMATE; /* other flags are possible */
|
||||
fftw_iodim howmany_dims[2];
|
||||
|
||||
howmany_dims[0].n = rows;
|
||||
howmany_dims[0].is = cols;
|
||||
howmany_dims[0].os = 1;
|
||||
|
||||
howmany_dims[1].n = cols;
|
||||
howmany_dims[1].is = 1;
|
||||
howmany_dims[1].os = rows;
|
||||
|
||||
return fftw_plan_guru_r2r(/*rank=*/ 0, /*dims=*/ NULL,
|
||||
/*howmany_rank=*/ 2, howmany_dims,
|
||||
in, out, /*kind=*/ NULL, flags);
|
||||
}
|
||||
</pre>
|
||||
(This entry was written by Rhys Ulerich.)
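<p>
For checking the result, the index mapping that the strides above describe can be spelled out as a plain loop. This reference version is a sketch that is independent of FFTW; the name <code>transpose_reference</code> is hypothetical, and it handles only the out-of-place case, where <code>in</code> and <code>out</code> do not overlap.

```c
#include <stddef.h>

/* Out-of-place transpose of a row-major rows x cols matrix.
 * Element (i, j) is read at offset i*cols + j (input strides cols and 1)
 * and written at offset j*rows + i (output strides 1 and rows). */
void transpose_reference(int rows, int cols, const double *in, double *out)
{
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j)
            out[(size_t)j * rows + i] = in[(size_t)i * cols + j];
}
```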
|
||||
<hr>
|
||||
Next: <a href="section4.html" rel=precedes>Internals of FFTW</a>.<br>
|
||||
Back: <a href="section2.html" rev=precedes>Installing FFTW</a>.<br>
|
||||
<a href="index.html" rev=subdocument>Return to contents</a>.<p>
|
||||
<address>
|
||||
<A href="http://www.fftw.org">Matteo Frigo and Steven G. Johnson</A> / <A href="mailto:fftw@fftw.org">fftw@fftw.org</A>
|
||||
- 14 September 2021
|
||||
</address><br>
|
||||
Extracted from FFTW Frequently Asked Questions with Answers,
|
||||
Copyright © 2021 Matteo Frigo and Massachusetts Institute of Technology.
|
||||
</body></html>
|
||||
64
fftw-3.3.10/doc/FAQ/fftw-faq.html/section4.html
Normal file
@@ -0,0 +1,64 @@
|
||||
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
|
||||
<html>
|
||||
<head><title>
|
||||
FFTW FAQ - Section 4
|
||||
</title>
|
||||
<link rev="made" href="mailto:fftw@fftw.org">
|
||||
<link rel="Contents" href="index.html">
|
||||
<link rel="Start" href="index.html">
|
||||
<link rel="Next" href="section5.html"><link rel="Previous" href="section3.html"><link rel="Bookmark" title="FFTW FAQ" href="index.html">
|
||||
</head><body text="#000000" bgcolor="#FFFFFF"><h1>
|
||||
FFTW FAQ - Section 4 <br>
|
||||
Internals of FFTW
|
||||
</h1>
|
||||
|
||||
<ul>
|
||||
<li><a href="#howworks" rel=subdocument>Q4.1. How does FFTW work?</a>
|
||||
<li><a href="#whyfast" rel=subdocument>Q4.2. Why is FFTW so fast?</a>
|
||||
</ul><hr>
|
||||
|
||||
<h2><A name="howworks">
|
||||
Question 4.1. How does FFTW work?
|
||||
</A></h2>
|
||||
|
||||
The innovation (if it can be so called) in FFTW consists in having a
|
||||
variety of composable <i>solvers</i>, representing different FFT algorithms and implementation strategies, whose combination into a
|
||||
particular <i>plan</i> for a given size can be determined at runtime according to the characteristics of your machine/compiler.
|
||||
This peculiar software architecture allows FFTW to adapt itself to
|
||||
almost any machine.
|
||||
<p>
|
||||
For more details (albeit somewhat outdated), see the paper "FFTW:
|
||||
An Adaptive Software Architecture for the FFT", by M. Frigo and
|
||||
S. G. Johnson, <i>Proc. ICASSP</i> 3, 1381 (1998), also available at <A href="http://www.fftw.org">the FFTW web page</A>.
|
||||
<h2><A name="whyfast">
|
||||
Question 4.2. Why is FFTW so fast?
|
||||
</A></h2>
|
||||
|
||||
This is a complex question, and there is no simple answer. In fact,
|
||||
the authors do not fully know the answer, either. In addition to many
|
||||
small performance hacks throughout FFTW, there are three general
|
||||
reasons for FFTW's speed.
|
||||
<ul>
|
||||
<li> FFTW uses a variety of FFT algorithms and implementation styles
|
||||
that can be arbitrarily composed to adapt itself to
|
||||
a machine. See <A href="#howworks">Q4.1 `How does FFTW work?'</A>.
|
||||
<li> FFTW uses a code generator to produce highly-optimized
|
||||
routines for computing small transforms.
|
||||
|
||||
<li> FFTW uses explicit divide-and-conquer to take advantage
|
||||
of the memory hierarchy.
|
||||
</ul>
|
||||
For more details (albeit somewhat outdated), see the paper "FFTW:
|
||||
An Adaptive Software Architecture for the FFT", by M. Frigo and
|
||||
S. G. Johnson, <i>Proc. ICASSP</i> 3, 1381 (1998), available along with other references at
|
||||
<A href="http://www.fftw.org">the FFTW web page</A>. <hr>
|
||||
Next: <a href="section5.html" rel=precedes>Known bugs</a>.<br>
|
||||
Back: <a href="section3.html" rev=precedes>Using FFTW</a>.<br>
|
||||
<a href="index.html" rev=subdocument>Return to contents</a>.<p>
|
||||
<address>
|
||||
<A href="http://www.fftw.org">Matteo Frigo and Steven G. Johnson</A> / <A href="mailto:fftw@fftw.org">fftw@fftw.org</A>
|
||||
- 14 September 2021
|
||||
</address><br>
|
||||
Extracted from FFTW Frequently Asked Questions with Answers,
|
||||
Copyright © 2021 Matteo Frigo and Massachusetts Institute of Technology.
|
||||
</body></html>
|
||||
148
fftw-3.3.10/doc/FAQ/fftw-faq.html/section5.html
Normal file
@@ -0,0 +1,148 @@
|
||||
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
|
||||
<html>
|
||||
<head><title>
|
||||
FFTW FAQ - Section 5
|
||||
</title>
|
||||
<link rev="made" href="mailto:fftw@fftw.org">
|
||||
<link rel="Contents" href="index.html">
|
||||
<link rel="Start" href="index.html">
|
||||
<link rel="Previous" href="section4.html"><link rel="Bookmark" title="FFTW FAQ" href="index.html">
|
||||
</head><body text="#000000" bgcolor="#FFFFFF"><h1>
|
||||
FFTW FAQ - Section 5 <br>
|
||||
Known bugs
|
||||
</h1>
|
||||
|
||||
<ul>
|
||||
<li><a href="#rfftwndbug" rel=subdocument>Q5.1. FFTW 1.1 crashes in rfftwnd on Linux.</a>
|
||||
<li><a href="#fftwmpibug" rel=subdocument>Q5.2. The MPI transforms in FFTW 1.2 give incorrect results/leak
|
||||
memory.</a>
|
||||
<li><a href="#testsingbug" rel=subdocument>Q5.3. The test programs in FFTW 1.2.1 fail when I change FFTW to use single
|
||||
precision.</a>
|
||||
<li><a href="#teststoobig" rel=subdocument>Q5.4. The test program in FFTW 1.2.1 fails for n >
|
||||
46340.</a>
|
||||
<li><a href="#linuxthreads" rel=subdocument>Q5.5. The threaded code fails on Linux Redhat 5.0</a>
|
||||
<li><a href="#bigrfftwnd" rel=subdocument>Q5.6. FFTW 2.0's rfftwnd fails for rank > 1 transforms with a final
|
||||
dimension >= 65536.</a>
|
||||
<li><a href="#primebug" rel=subdocument>Q5.7. FFTW 2.0's complex transforms give the wrong results with prime
|
||||
factors 17 to 97.</a>
|
||||
<li><a href="#mpichbug" rel=subdocument>Q5.8. FFTW 2.1.1's MPI test programs crash with
|
||||
MPICH.</a>
|
||||
<li><a href="#aixthreadbug" rel=subdocument>Q5.9. FFTW 2.1.2's multi-threaded transforms don't work on
|
||||
AIX.</a>
|
||||
<li><a href="#bigprimebug" rel=subdocument>Q5.10. FFTW 2.1.2's complex transforms give incorrect results for large prime
|
||||
sizes.</a>
|
||||
<li><a href="#solaristhreadbug" rel=subdocument>Q5.11. FFTW 2.1.3's multi-threaded transforms don't give any speedup on
|
||||
Solaris.</a>
|
||||
<li><a href="#aixflags" rel=subdocument>Q5.12. FFTW 2.1.3 crashes on AIX.</a>
|
||||
</ul><hr>
|
||||
|
||||
<h2><A name="rfftwndbug">
|
||||
Question 5.1. FFTW 1.1 crashes in rfftwnd on
|
||||
Linux.
|
||||
</A></h2>
|
||||
|
||||
This bug was fixed in FFTW 1.2. There was a bug in
|
||||
<code>rfftwnd</code> causing an incorrect amount of memory to be allocated. The bug showed
|
||||
up in Linux with libc-5.3.12 (and nowhere else that we know of).
|
||||
|
||||
<h2><A name="fftwmpibug">
|
||||
Question 5.2. The MPI transforms in FFTW 1.2 give incorrect
|
||||
results/leak memory.
|
||||
</A></h2>
|
||||
|
||||
These bugs were corrected in FFTW 1.2.1. The MPI transforms (really,
|
||||
just the transpose routines) in FFTW 1.2 had bugs that could cause
|
||||
errors in some situations.
|
||||
<h2><A name="testsingbug">
|
||||
Question 5.3. The test programs in FFTW 1.2.1 fail when I change FFTW
|
||||
to use single precision.
|
||||
</A></h2>
|
||||
|
||||
This bug was fixed in FFTW 1.3. (Older versions of FFTW did
|
||||
work in single precision, but the test programs didn't--the error
|
||||
tolerances in the tests were set for double precision.)
|
||||
|
||||
<h2><A name="teststoobig">
|
||||
Question 5.4. The test program in FFTW 1.2.1 fails for n >
|
||||
46340.
|
||||
</A></h2>
|
||||
|
||||
This bug was fixed in FFTW 1.3. FFTW 1.2.1 produced the right answer,
|
||||
but the test program was wrong. For large n, n*n in the naive
|
||||
transform that we used for comparison overflows 32-bit integer
|
||||
precision, breaking the test.
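<p>
The threshold of 46340 follows from the range of a signed 32-bit integer: 46340 squared is 2147395600, which still fits, while 46341 squared is 2147488281, which exceeds the maximum of 2147483647. A small sketch of the check, with a function name chosen here for illustration, computes the square in 64 bits so that the check itself cannot overflow:

```c
#include <stdint.h>

/* Return 1 if n*n is representable as a signed 32-bit integer, else 0.
 * The product is formed in 64-bit arithmetic to avoid the very overflow
 * being tested for. */
int square_fits_int32(int32_t n)
{
    int64_t sq = (int64_t)n * (int64_t)n;

    return sq <= INT32_MAX;
}
```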
|
||||
<h2><A name="linuxthreads">
|
||||
Question 5.5. The threaded code fails on Linux Redhat
|
||||
5.0
|
||||
</A></h2>
|
||||
|
||||
We had problems with glibc-2.0.5. The code should work with
|
||||
glibc-2.0.7.
|
||||
<h2><A name="bigrfftwnd">
|
||||
Question 5.6. FFTW 2.0's rfftwnd fails for rank > 1 transforms
|
||||
with a final dimension >= 65536.
|
||||
</A></h2>
|
||||
|
||||
This bug was fixed in FFTW 2.0.1. (There was a 32-bit integer
|
||||
overflow due to a poorly-parenthesized expression.)
|
||||
<h2><A name="primebug">
|
||||
Question 5.7. FFTW 2.0's complex transforms give the wrong results
|
||||
with prime factors 17 to 97.
|
||||
</A></h2>
|
||||
|
||||
There was a bug in the complex transforms that could cause incorrect
|
||||
results under (hopefully rare) circumstances for lengths with
|
||||
intermediate-size prime factors (17-97). This bug was fixed in FFTW
|
||||
2.1.1.
|
||||
<h2><A name="mpichbug">
|
||||
Question 5.8. FFTW 2.1.1's MPI test programs crash with
|
||||
MPICH.
|
||||
</A></h2>
|
||||
|
||||
This bug was fixed in FFTW 2.1.2. The 2.1/2.1.1 MPI test programs
|
||||
crashed when using the MPICH implementation of MPI with the
|
||||
<code>ch_p4</code> device (TCP/IP); the transforms themselves worked fine.
|
||||
|
||||
<h2><A name="aixthreadbug">
|
||||
Question 5.9. FFTW 2.1.2's multi-threaded transforms don't work on
|
||||
AIX.
|
||||
</A></h2>
|
||||
|
||||
This bug was fixed in FFTW 2.1.3. The multi-threaded transforms in
|
||||
previous versions didn't work with AIX's
|
||||
<code>pthreads</code> implementation, which idiosyncratically creates threads in detached
|
||||
(non-joinable) mode by default.
|
||||
<h2><A name="bigprimebug">
|
||||
Question 5.10. FFTW 2.1.2's complex transforms give incorrect results
|
||||
for large prime sizes.
|
||||
</A></h2>
|
||||
|
||||
This bug was fixed in FFTW 2.1.3. FFTW's complex-transform algorithm
|
||||
for prime sizes (in versions 2.0 to 2.1.2) had an integer overflow
|
||||
problem that caused incorrect results for many primes greater than
|
||||
32768 (on 32-bit machines). (Sizes without large prime factors are
|
||||
not affected.)
|
||||
<h2><A name="solaristhreadbug">
|
||||
Question 5.11. FFTW 2.1.3's multi-threaded transforms don't give any
|
||||
speedup on Solaris.
|
||||
</A></h2>
|
||||
|
||||
This bug was fixed in FFTW 2.1.4. (By default, Solaris creates
|
||||
threads that do not parallelize over multiple processors, so one has
|
||||
to request the proper behavior specifically.)
|
||||
|
||||
<h2><A name="aixflags">
|
||||
Question 5.12. FFTW 2.1.3 crashes on AIX.
|
||||
</A></h2>
|
||||
|
||||
The FFTW 2.1.3 <code>configure</code> script picked incorrect compiler flags for the <code>xlc</code> compiler on newer IBM processors. This
|
||||
is fixed in FFTW 2.1.4. <hr>
|
||||
Back: <a href="section4.html" rev=precedes>Internals of FFTW</a>.<br>
|
||||
<a href="index.html" rev=subdocument>Return to contents</a>.<p>
|
||||
<address>
|
||||
<A href="http://www.fftw.org">Matteo Frigo and Steven G. Johnson</A> / <A href="mailto:fftw@fftw.org">fftw@fftw.org</A>
|
||||
- 14 September 2021
|
||||
</address><br>
|
||||
Extracted from FFTW Frequently Asked Questions with Answers,
|
||||
Copyright © 2021 Matteo Frigo and Massachusetts Institute of Technology.
|
||||
</body></html>
|
||||