pyvkfft-test

Run pyvkfft unit tests, regular or systematic

usage: pyvkfft-test [-h] [--colour] [--html [HTML ...]] [--gpu GPU]
                    [--opencl_platform OPENCL_PLATFORM] [--mailto MAILTO]
                    [--mailto_fail MAILTO_FAIL] [--mailto_smtp MAILTO_SMTP]
                    [--nproc NPROC] [--silent] [--c2c] [--systematic]
                    [--axes [AXES ...]]
                    [--backend {pycuda,cupy,pyopencl} [{pycuda,cupy,pyopencl} ...]]
                    [--bluestein] [--db [DB ...]] [--dct [{1,2,3,4}]]
                    [--dst [{1,2,3,4}]] [--double] [--dry-run]
                    [--fast-random FAST_RANDOM] [--inplace] [--graph [GRAPH]]
                    [--lut] [--max-nb-tests MAX_NB_TESTS]
                    [--ndim {1,2,3,12,123}] [--norm {0,1}] [--ref-long-double]
                    [--r2c] [--fstride] [--radix [{2,3,5,7,11,13} ...]]
                    [--radix-max-pow RADIX_MAX_POW] [--range RANGE RANGE]
                    [--range-mb RANGE_MB RANGE_MB]
                    [--range-nd-narrow RANGE_ND_NARROW RANGE_ND_NARROW]
                    [--serial] [--timeout TIMEOUT]

Named Arguments

--colour

Use colour depending on how good the measured accuracy is

Default: False

--html

Summarises the results in html row(s). This is saved to 'pyvkfft-test%04d.html', starting at i=1001 and incrementing. Files with i=1000 and i=1999 are the beginning and the end of the html file, which can be concatenated to form a valid html page. If --graph is also used, this includes a graph of the accuracy which can be displayed by clicking on the type of transform.
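The concatenation step can be done with a few lines of Python. This is a minimal sketch, assuming all fragments sit in one directory and follow the 'pyvkfft-test%04d.html' naming described above; the helper name concat_html and the output filename are hypothetical and not part of pyvkfft.

```python
from pathlib import Path


def concat_html(directory, output="pyvkfft-test-full.html"):
    """Join pyvkfft-test%04d.html fragments into one page (hypothetical helper)."""
    directory = Path(directory)
    # Four-digit indices sort correctly as strings:
    # 1000 (beginning), 1001, 1002, ... (result rows), 1999 (end).
    parts = sorted(directory.glob("pyvkfft-test[0-9][0-9][0-9][0-9].html"))
    page = "".join(p.read_text() for p in parts)
    (directory / output).write_text(page)
    # Return the fragment names in the order they were joined
    return [p.name for p in parts]
```

Calling concat_html(".") in the directory holding the fragments writes the combined page next to them and returns the list of files that were joined.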

--gpu

Name (or sub-string) of the GPU to use

--opencl_platform

Name (or sub-string) of the OpenCL platform to use (case-insensitive). Note that by default the PoCL platform is skipped, unless it is specifically requested or it is the only one available (PoCL has some issues with VkFFT for some transforms).

--mailto

Email address the results will be sent to

--mailto_fail

Email address the results will be sent to, only if the test fails

--mailto_smtp

SMTP server address to mail the results

Default: "localhost"

--nproc

Number of parallel processes to use to speed up tests. Make sure the parallel processes together will not use too much GPU memory.

Default: [1]

--silent

Use this to minimise the written output (note that tests can take a long time, be patient)

Default: False

--c2c

When used without --systematic, perform only c2c quick tests and skip the long r2c/dct/dst unless they were also requested.

Default: False

--systematic

Perform a systematic accuracy test over a range of array sizes. Without this argument a faster test (a few minutes) will be performed with selected array sizes for all possible transforms.

Default: False

--fast-random

Use this option to run a random percentage of the full test suite, for faster results. A number between 5 and 100 is required.

systematic

Options for --systematic:

--axes

Transform axes: x (fastest) is 1, y is 2, z is 3, e.g. '--axes 1', '--axes 2 3'. The default is to perform the transform along the ndim fastest axes. Using this overrides --ndim

--backend

Possible choices: pycuda, cupy, pyopencl

Choose single or multiple GPU backends. By default all available backends are selected.

--bluestein, --nonradix

Only perform transforms with non-radix dimensions, i.e. the largest number in the prime decomposition of each array dimension must be larger than 13

Default: False
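The non-radix criterion can be checked for a given length with a short sketch. This assumes only the rule stated above (largest prime factor larger than 13); the function names are hypothetical and this is not the code pyvkfft uses internally.

```python
def largest_prime_factor(n):
    """Return the largest prime factor of n (n >= 2), by trial division."""
    largest = 1
    p = 2
    while p * p <= n:
        while n % p == 0:
            largest = p
            n //= p
        p += 1
    if n > 1:
        # Whatever remains is a prime larger than all factors found so far
        largest = n
    return largest


def is_non_radix(n, max_radix=13):
    """True if the length n has a prime factor larger than max_radix."""
    return largest_prime_factor(n) > max_radix
```

For instance, is_non_radix(34) is true since 34 = 2 * 17, while is_non_radix(30) is false since 30 = 2 * 3 * 5.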

--db

Save the results to an sql database. If no filename is given, pyvkfft-test.sql will be used. If the file already exists, the results are added to the file. Fields stored include HOSTNAME, EPOCH, BACKEND, LANGUAGE, TRANSFORM (c2c, r2c or dct/dst1/2/3/4), AXES, ARRAY_SHAPE, NDIMS, NDIM, PRECISION, INPLACE, NORM, LUT, N, N2_FFT, N2_IFFT, NI_FFT, NI_IFFT, TOLERANCE, DT_APP, DT_FFT, DT_IFFT, SRC_UNCHANGED_FFT, SRC_UNCHANGED_IFFT, GPU_NAME, SUCCESS, ERROR, VKFFT_ERROR_CODE
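A saved database can be inspected from Python. The sketch below assumes the file is an SQLite database readable with the standard sqlite3 module, which this page does not state explicitly. It does not assume any table name and simply reports what it finds; describe_db is a hypothetical helper.

```python
import sqlite3


def describe_db(filename):
    """Return {table name: (list of column names, number of rows)}."""
    con = sqlite3.connect(filename)
    try:
        tables = [r[0] for r in con.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
        info = {}
        for t in tables:
            # Second field of each table_info row is the column name
            cols = [r[1] for r in con.execute(f'PRAGMA table_info("{t}")')]
            nrows = con.execute(f'SELECT COUNT(*) FROM "{t}"').fetchone()[0]
            info[t] = (cols, nrows)
        return info
    finally:
        con.close()
```

Once the table and column names are known, ordinary SQL queries (for example filtering on the SUCCESS field) can be run through the same connection.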

--dct

Possible choices: 1, 2, 3, 4

Test discrete cosine transforms (default is c2c): '--dct' (defaults to dct 2), '--dct 1'

Default: False

--dst

Possible choices: 1, 2, 3, 4

Test discrete sine transforms (default is c2c): '--dst' (defaults to dst 2), '--dst 1'

Default: False

--double

Use double precision (float64/complex128) instead of single

Default: False

--dry-run

Perform a dry-run, printing the number of array shapes to test

Default: False

--inplace

Use inplace transforms

Default: False

--graph

Save the graph of the accuracy as a function of the size to the given filename (if no name is given, it will be automatically generated). Requires matplotlib, and scipy for linear regression.

--lut

Force the use of a LUT for the transform, to improve accuracy. By default VkFFT will activate the LUT on some GPUs with less accurate accelerated trigonometric functions. This is automatically true for double precision

Default: False

--max-nb-tests

Maximum number of tests. If the number of generated test cases is larger, the program will abort.

Default: [1000]

--ndim

Possible choices: 1, 2, 3, 12, 123

Number of dimensions for the transform. Using 12 or 123 will result in testing both 1 and 2, or 1, 2 and 3. It is recommended to use --range-mb and

Default: [1]

--norm

Possible choices: 0, 1

Normalisation to test (must be 1 for dct or dst)

Default: [1]

--ref-long-double

Use long double precision for the reference calculation (requires scipy). This gives more objective accuracy plots but can be slower (or much slower on some architectures).

Default: False

--r2c

Test real-to-complex transform (default is c2c)

Default: False

--fstride

Test F-ordered arrays (default is C-ordered). Not supported for DCT/DST

Default: False

--radix

Possible choices: 2, 3, 5, 7, 11, 13

Perform only radix transforms. If no value is given, all available radix transforms are allowed. Alternatively a list can be given: '--radix 2' (only 2**n array sizes), '--radix 2 3 5' (only 2**N1 * 3**N2 * 5**N3)

--radix-max-pow

For radix runs, specify the maximum exponent of each base integer, e.g. '--radix 2 3 --radix-max-pow 2' will limit lengths to 2**N1 * 3**N2 with N1,N2<=2
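To see which lengths a given combination of --radix, --radix-max-pow and --range selects, the rule above can be enumerated directly. This is a minimal sketch of that rule only; radix_sizes is a hypothetical helper, not the generator pyvkfft-test uses.

```python
from itertools import product


def radix_sizes(bases, nmin, nmax, max_pow=None):
    """Sorted lengths in [nmin, nmax] of the form prod(b**e for b in bases)."""
    exponent_ranges = []
    for b in bases:
        # Largest exponent worth trying for this base
        e = 0
        while b ** (e + 1) <= nmax and (max_pow is None or e + 1 <= max_pow):
            e += 1
        exponent_ranges.append(range(e + 1))
    sizes = set()
    for exps in product(*exponent_ranges):
        n = 1
        for b, e in zip(bases, exps):
            n *= b ** e
        if nmin <= n <= nmax:
            sizes.add(n)
    return sorted(sizes)
```

With bases 2 and 3, a range of 2 to 128 and a maximum exponent of 2, this yields 2, 3, 4, 6, 9, 12, 18 and 36.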

--range

Range of array lengths [min, max] along each transform dimension, e.g. '--range 2 128'

Default: [2, 128]

--range-mb

Allowed range of array sizes [min, max] in Mbytes, e.g. '--range-mb 2 128'. This can be used to limit the array size while allowing large lengths along individual dimensions. It can also be used to separate runs with a given size range and different nproc values. This takes into account the type (single or double), and also whether the transform is made inplace, so this represents the total GPU memory used.

Default: [0, 128]
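As a rough illustration of how the precision and the inplace setting change the footprint of a C2C test, consider the sketch below. It assumes that an out-of-place transform needs a second array of the same size and that 1 Mbyte is 1024**2 bytes; the exact accounting used by pyvkfft-test may differ, and both function names are hypothetical.

```python
from math import prod


def c2c_footprint_mb(shape, double=False, inplace=False):
    """Estimated GPU memory (Mbytes) for a C2C transform of the given shape."""
    itemsize = 16 if double else 8  # complex128 vs complex64
    nbuffers = 1 if inplace else 2  # assumption: separate destination array
    return prod(shape) * itemsize * nbuffers / 1024 ** 2


def in_range_mb(shape, range_mb, **kwargs):
    """True if the estimated footprint lies within [min, max] Mbytes."""
    low, high = range_mb
    return low <= c2c_footprint_mb(shape, **kwargs) <= high
```

Under these assumptions a 1024x1024 single precision out-of-place transform uses 16 Mbytes, and a 4096x4096 double precision out-of-place transform (512 Mbytes) falls outside the default range.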

--range-nd-narrow

Two values (drel dabs), e.g. '--range-nd-narrow 0.10 11', with 0<=drel<=1 and dabs (integer>=0), must be given to allow 2D and 3D tests to be done on arrays with different lengths along every dimension, while limiting the difference between lengths. For example in 2D for an (N1,N2) array shape, generated lengths will verify abs(N2-N1)<max(dabs+drel*N1). The default value of (0,0) only allows the same lengths. This makes it possible to test more diverse configurations while limiting the number of tests.

Default: ['0', '0']

--serial

Serialise the tests instead of spawning them in separate processes, which makes it possible to diagnose more errors. Incompatible with nproc>1.

Default: False

--timeout

Change the timeout (in seconds) to raise a TimeOut error for individual tests. After 4 have failed, give up.

Default: [120]

Examples:
pyvkfft-test

The regular test, which tries the fft interface, using parallel streams (for pycuda), and C2C/R2C/DCT/DST transforms for sizes N=15,17,30,34 with 1D to 4D or 5D transforms, and also N=808,2988,4200,13000,13001,13002,130172 for 1D and 2D transforms. All tests are done with single and double precision, in-place and out-of-place, norm=0 and 1, and all available backends (pyopencl, pycuda and cupy). For C2C, arrays up to dimension 5 are tested, with all possible combinations of transform axes. That makes a total of tens of thousands of transforms, which are tested against the result of numpy, scipy or pyfftw (when available) for accuracy. The text output gives the N2 and Ninf (aka max) relative norm of the transform, with the ratio in () to the expected tolerance for both direct and inverse transforms.

pyvkfft-test --nproc 8 --gpu v100 --mailto_fail toto@pyvkfft.org

Same test, but using 8 parallel processes to speed it up, and using a GPU with 'v100' in its name. Also, send the results to the given email address in case of a failure.

pyvkfft-test --systematic --backend pycuda --nproc 8 --radix --range 2 10000

Perform a systematic test of C2C transforms in (by default) 1D and single precision, for N=2 to 10000, only for radix transforms

pyvkfft-test --systematic --backend pycuda --nproc 8 --radix 2 7 11 --range 2 10000 --double

Same test, but only for radix sizes with factors 2, 7 and 11, and double precision

pyvkfft-test --systematic --backend cupy --nproc 8 --r2c --bluestein --range 2 10000 --ndim 2 --lut --inplace

Test with the cupy backend, only non-radix 2D inplace R2C transforms, using a lookup table (LUT) for higher single precision accuracy.

Columns in the text output:
  • backend

  • type of transform

  • array shape

  • axes for the transform. If None, axes are set by the number of transform dimensions

  • number of dimensions for the transform. Can be None if axes are given.

  • type of algorithm for each axis: r=radix, R=Rader, B=Bluestein, -=skipped axis

  • number of uploads for each axis: 0 if not transformed, 1 if the axis length fits in the cache and the transform can be done in 1 read+write, 2 or 3 if multi-upload is used

  • data type and precision

  • use of a Look-Up-Table (LUT), for single precision only

  • inplace or out-of-place transform

  • normalisation for the transform: 0 or 1

  • order of the array: C (fast axis is last) or F (fast axis is first)

  • N2 and N_inf error norm for the forward transform, with the comparison to the maximum allowed error (and in parentheses the ratio to this maximum), and finally 0 or 1 depending on whether the source array was modified (0) or not (1)

  • Same values for the inverse transform

  • temporary buffer size allocated by VkFFT if necessary, for large transforms

  • status: OK, FAIL (if accuracy is above limit or source array unexpectedly changed) or ERROR (an error was raised during execution, e.g. compilation, memory,...)
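
The N2 and N_inf columns are relative error norms of the GPU result against a reference transform. The exact definitions are not spelled out here, so the following pure-Python sketch uses one common convention (error norm divided by the norm of the reference) as an assumption; the function names are hypothetical.

```python
from math import sqrt


def relative_l2(result, reference):
    """Relative L2 (N2) error: sqrt(sum|a-b|^2 / sum|b|^2)."""
    num = sum(abs(a - b) ** 2 for a, b in zip(result, reference))
    den = sum(abs(b) ** 2 for b in reference)
    return sqrt(num / den)


def relative_linf(result, reference):
    """Relative L-infinity (N_inf, max) error: max|a-b| / max|b|."""
    err = max(abs(a - b) for a, b in zip(result, reference))
    return err / max(abs(b) for b in reference)


def tolerance_ratio(norm, tolerance):
    """Ratio of a measured norm to the allowed tolerance."""
    return norm / tolerance
```

Under this convention a ratio above 1 corresponds to an accuracy above the limit, which is what the FAIL status reports.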