VOLK News

Release v2.2.1

24 Feb 2020 by Johannes Demel

Hi everyone,

with VOLK 2.2.0, we introduced another AVX rotator bug, which is fixed in this release. In the process, two more bugs were identified and fixed. The release also includes some documentation improvements.

Contributors

Changes

  • Fix loop bound in AVX rotator
  • Fix out-of-bounds read in AVX2 square dist kernel
  • Fix length checks in AVX2 index max kernels
  • includes: rearrange attributes to simplify macros Whitespace
  • kernels: fix usage in header comments

Release v2.2.0

16 Feb 2020 by Johannes Demel

Hi everyone,

we have a new VOLK release v2.2.0!

We want to thank all contributors. This release wouldn't have been possible without them.

We're curious about VOLK users. We'd especially like to learn about those who use VOLK outside GNU Radio.

If you have ideas for VOLK enhancements, let us know. Start with an issue to discuss your idea. We'll be happy to see new features get merged into VOLK.

The v2.1.0 release was rather large because we had a lot of backlog. We aim for more incremental releases in order to get new features out there.

Highlights

VOLK v2.2.0 updates our build tools and adds support functionality to make it easier to use VOLK in your projects.

  • Dropped Python 2 build support
    • Removed Python six module dependency
  • Use C11 aligned_alloc whenever possible
    • MacOS posix_memalign fall-back
    • MSVC _aligned_malloc fall-back
  • Add VOLK version in volk_version.h (included in volk.h)
  • Improved CMake code
  • Improved code with lots of refactoring and performance tweaks
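
The allocation strategy listed above can be sketched roughly as follows. This is a minimal illustration, not VOLK's actual implementation: the names sketch_aligned_alloc, sketch_aligned_free, and sketch_is_aligned are hypothetical, and the sketch assumes the alignment is a power of two and a multiple of sizeof(void *).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#if defined(_MSC_VER)
#include <malloc.h> /* _aligned_malloc, _aligned_free */
#endif

/* Hypothetical helper: allocate `size` bytes aligned to `alignment`.
 * Assumes alignment is a power of two and a multiple of sizeof(void *). */
static void *sketch_aligned_alloc(size_t alignment, size_t size)
{
#if defined(_MSC_VER)
    /* MSVC fall-back: note the reversed argument order. */
    return _aligned_malloc(size, alignment);
#elif defined(__APPLE__)
    /* macOS fall-back. */
    void *ptr = NULL;
    if (posix_memalign(&ptr, alignment, size) != 0) {
        return NULL;
    }
    return ptr;
#else
    /* C11 aligned_alloc expects size to be a multiple of alignment,
     * so round the request up. */
    size_t rounded = (size + alignment - 1) / alignment * alignment;
    return aligned_alloc(alignment, rounded);
#endif
}

/* Hypothetical helper: release memory from sketch_aligned_alloc. */
static void sketch_aligned_free(void *ptr)
{
#if defined(_MSC_VER)
    _aligned_free(ptr);
#else
    free(ptr);
#endif
}

/* Hypothetical helper: returns 1 if ptr is aligned to `alignment`. */
static int sketch_is_aligned(const void *ptr, size_t alignment)
{
    return ((uintptr_t)ptr % alignment) == 0;
}
```

The two helpers have to be used as a pair, because memory obtained from _aligned_malloc must be released with _aligned_free rather than free.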

Contributors

Changes

  • CMake
    • Fix detection of AVX and NEON
    • Fix for macOS
    • lib/CMakeLists: use __asm__ instead of asm for ARM tests
    • lib/CMakeLists: fix detection when the compiler supports NEON but neither neonv7 nor neonv8
    • lib/CMakeLists.txt: use __VOLK_ASM instead of __asm__
    • lib/CMakeLists.txt: let VOLK choose preferred neon version when both are supported
    • lib/CMakeLists.txt: simplify neon test support. Unset neon version if not supported
    • For attribute, change from clang to "clang but not MSC"
  • Readme
    • logo: Add logo at top of README.md
  • Build dependencies
    • python: Drop Python2 support
    • python: Reduce six usage
    • python: Move to Python3 syntax and modules
    • six: Remove build dependency on python six
  • Allocation
    • alloc: Use C11 aligned_alloc
    • alloc: Implement fall backs for C11 aligned_alloc
    • alloc: Fix for incomplete MSVC standard compliance
    • alloc: update to reflect alloc changes
  • Library usage
    • Fixup VolkConfigVersion
    • add volk_version.h
  • Refactoring
    • qa_utils.cc: fix always false expression
    • volk_prefs.c: check null realloc and use temporary pointer
    • volk_profile.cc: double assignment and return 0
    • volk_32f_x2_pow_32f.h: do not need to _mm256_setzero_ps()
    • volk_8u_conv_k7_r2puppet_8u.h: d_polys[i] is positive
    • kernels: change one iteration for's to if's
    • kernels: get rid of some assignments
    • qa_utils.cc: actually throw something
    • qa_utils.cc: fix always true code
    • rotator: Refactor AVX kernels
    • rotator: Remove unnecessary variable
    • kernel: Refactor square_dist_scalar_mult
    • square_dist_scalar_mult: Speed-Up AVX, Add unaligned
    • square_dist_scalar_mult: refactor AVX2 kernel
    • kernel: create AVX2 meta intrinsics
  • CI
    • appveyor: Test with python 3.4 and 3.8
    • appveyor: Add job names
    • appveyor: Make ctest more verbose
  • Performance
    • Improve performance of generic kernels with complex multiply
    • square_dist_scalar_mult: Add SSE version
    • Adds NEON versions of cos, sin and tan

Website Updates

02 Jan 2020 by Andrej Rode

After a long time, libvolk.org is brought to you from the GNU Radio infrastructure at OSUOSL with love! The canonical URL is now https://www.libvolk.org. A big thank you to the OSU Open Source Lab for providing the GNU Radio project with server infrastructure to run websites, continuous integration, and other web services.

Website repository

You can now find the source code of libvolk.org at https://github.com/gnuradio/libvolk.org and submit pull requests with updates and corrections. Once your pull request is accepted, it is deployed here automatically, so getting content online on libvolk.org is now easy!

Future updates

Stay tuned for future additions to this website and infrastructure!

Release v2.1.0

22 Dec 2019 by Johannes Demel

Hi everyone,

we would like to announce that Michael Dickens and Johannes Demel are the new VOLK maintainers. We want to review and merge PRs in a timely manner and comment on issues in order to resolve them.

We want to thank all contributors. This release wouldn't have been possible without them.

We're curious about VOLK users. We'd especially like to learn about those who use VOLK outside GNU Radio.

If you have ideas for VOLK enhancements, let us know. Start with an issue to discuss your idea. We'll be happy to see new features get merged into VOLK.

Highlights

VOLK v2.1.0 is a collection of really cool changes. We'd like to highlight some of them.

  • The AVX FMA rotator bug is fixed
  • VOLK offers volk::vector<> for C++ to follow RAII
  • Move towards modern dependencies
    • CMake 3.8
    • Prefer Python3
      • We will drop Python2 support in a future release!
    • Use C++17 std::filesystem
      • This enables VOLK to be built without Boost where std::filesystem is available!
  • more stable CI
  • lots of bugfixes
  • more optimized kernels, especially more NEON versions

Contributors

Changes

  • Usage

    • Update README to reflect how to build on Raspberry Pi and the importance of running volk_profile
  • Toolchain

    • Add toolchain file for Raspberry Pi 3
    • Update Raspberry 4 toolchain file
  • Kernels

    • Add neonv7 to volk_16ic_magnitude_16i
    • Add neonv7 to volk_32fc_index_max_32u
    • Add neonv7 to volk_32fc_s32f_power_spectrum_32f
    • Add NEONv8 to volk_32f_64f_add_64f
    • Add Neonv8 to volk_32fc_deinterleave_64f_x2
    • Add volk_32fc_x2_s32fc_multiply_conjugate_add_32fc
    • Add NEONv8 to volk_32fc_convert_16ic
  • CI

    • Fix AVX FMA rotator
    • appveyor: Enable testing on windows
    • Fixes for flaky kernels for more reliable CI
      • volk_32f_log2_32f
      • volk_32f_x3_sum_of_poly_32f
      • volk_32f_index_max_{16,32}u
      • volk_32f_8u_polarbutterflypuppet_32f
      • volk_8u_conv_k7_r2puppet_8u
      • volk_32fc_convert_16ic
      • volk_32fc_s32f_magnitude_16i
      • volk_32f_s32f_convert_{8,16,32}i
      • volk_16ic_magnitude_16i
      • volk_32f_64f_add_64f
    • Use Intel SDE to test all kernels
    • TravisCI
      • Add native tests on arm64
      • Add native tests on s390x and ppc64le (allow failure)
  • Build

    • Build Volk without Boost if C++17 std::filesystem or std::experimental::filesystem is available
    • Update to more modern CMake
    • Prevent CMake from choosing previously installed VOLK headers
    • CMake
      • bump minimum version to 3.8
      • Use sha256 instead of md5 for unique target name hash
    • Python: Prefer Python3 over Python2 if available
  • C++

    • VOLK C++ allocator and C++11 std::vector type alias added

Release v1.4

26 Mar 2018 by Nathan West

A lot of really good changes came to VOLK with v1.4. It wouldn't have been possible without the following contributors:

Contributors

Changes

Generally, there are a lot of kernel changes and some minor dependency changes. I'm trying to remove Boost as a dependency, and we've introduced Mako templates in place of the old Cheetah templates to keep in line with GNU Radio. There are also several new CI files that support appveyor, travis-ci, and gitlab. Right now all pull requests must pass travis-ci.

Kernels

The easiest way to show these changes is simply with two lists:

New kernels

  • 32 bit reversal
  • 32f_s32f_s32f_mod_range_32f
  • double precision (64f_XXX...)
    • multiply
    • add
  • 32f_64f_multiply_64f
  • 32f_64f_add_64f
  • 32fc_x2_add_32fc

New proto-kernels by architecture

AVX(2):

Note that in some cases an unaligned version was added where an aligned version already existed

  • volk_64f_convert_32f
  • volk_64f_x2_max_64f
  • volk_64f_x2_min_64f
  • volk_32f_x2_add_32f
  • 32i_x2_and_32i
  • 32i_x2_or_32i
  • conjugate dot products
  • 32f_accumulator_32f
  • stddev_and_mean
  • volk_32f_* kernels
  • 32f_x2_divide_32f
  • 32f_x2_dot_prod_16i
  • volk_32f_s32f_normalize
  • volk_32f_s32f_stddev_32f
  • volk_32f_sqrt_32f
  • volk_32f_x2_max_32f
  • volk_32f_x2_min_32f
  • 32f_x2_s32f_interleave_16ic
  • 32f_x2_subtract_32f
  • volk_8ic_s32f_deinterleave_*
  • 32f_log2_32f
  • volk_32f_s32f_convert_8i and 16i

NEON:

  • move all neonasm to aligned protokernels
  • added ARM version of volk_32u_reverse_32u (RBIT)
  • volk_32fc_x2_divide_32fc
  • volk_32fc_32f_add_32fc
  • volk_32f_x2_divide_32f
  • volk_8i_s32f_convert_32f

Additionally, there are new protokernel intrinsics available for use in writing new kernels.

We also had some general kernel and protokernel bug fixes, and switched to properly type-named C functions, which happened to increase performance:

  • The polarbutterfly went through some heavy refactoring and bug fixes, and gained an AVX version.
  • Fix GH issue #139 for 32fc_index_max_* kernels, which resulted in a slightly wrong index being returned.
  • Fix bug 106 (volk_64u_popcnt bug in generic implementation).

CI and Builds

As previously mentioned there are appveyor, travis-ci, and gitlab CI files available. There is a travis-ci instance checking all pull requests at https://travis-ci.org/gnuradio/volk/ and a gitlab mirror running CI checks at https://gitlab.com/n-west/volk.

While working on these CI files the kernel tests were split in to individual ctest targets so that each kernel is its own test rather than running them as a monolithic binary. This allows parallel testing, but mostly enables easier diagnostics when a test fails. The readme is now a markdown file that renders well on GitHub and Gitlab along with the travis-ci status as a badge.

Within this release two tools were run that reorganized includes and fixed a bunch of typos within code.

As part of the attempt to build VOLK without Boost, a number of app and build utilities were written to replace Boost code. This shouldn't be visible to the user, but will hopefully make future builds easier and smaller, with fewer build and run-time dependencies. Builds with Python 2.7 and 3 should work, although six is required for Python 2.7 support.

Some build changes make it easier to do a relocatable build and order all files before building so that building from a particular revision (from now on) should be reproducible across machines building the same architectures. To use a relocatable install use the VOLK_PREFIX environment variable. This should support snaps (Canonical packaging environment).

Modtool

modtool: update the cmake find module for volk mods
modtool: deconflict module include guards from main volk

Release v1.3.1

25 Mar 2018 by Nathan West

Contributors

The following people had commits in this release. Thanks to them for making VOLK possible!

Changes

This is an API-compatible support release that only includes bug fixes.

Kernels

Fix GH issue #139 for 32fc_index_max_* kernels. Note that this is a minor API change that modern compilers should be OK with if they can handle the implicit type conversion.

Use 'powf' to match variables and avoid implicit type conversion. This makes some older compilers happy, allowing 'make test' to pass.

kernels: Add AVX support to 32f_x2_divide_32f and 32f_x2_dot_prod_16i.

Fix bug 106 (volk_64u_popcnt bug in generic implementation)

Adds protokernels for AVX support. Some kernels show modest speed improvements, although the gain seems to depend on the host architecture.

Adds AVX support to volk_32f_s32f_normalize, volk_32f_s32f_stddev_32f, volk_32f_sqrt_32f, volk_32f_x2_max_32f and volk_32f_x2_min_32f. Some speed improvements can be seen with the new protokernel addition.

Adds unaligned protokernels to 32f_x2_s32f_interleave_16ic and 32f_x2_subtract_32f.

Adds unaligned versions to the aforementioned kernels; relative speed improvements were seen in both cases.

Add NEON, AVX and unaligned versions of SSE4.1 and SSE.

Added __VOLK_PREFETCH() compatibility macro

__VOLK_PREFETCH() performs __builtin_prefetch() on GCC compilers and is otherwise a NOP for other systems. The use of __builtin_prefetch was replaced with __VOLK_PREFETCH() to make the kernels portable.
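
A compatibility macro of this kind can be sketched as follows. SKETCH_PREFETCH and sketch_sum are hypothetical names used for illustration only; this is not the definition used in VOLK, and the prefetch distance of 16 elements is an arbitrary assumption.

```c
#include <stddef.h>

/* SKETCH_PREFETCH is a hypothetical stand-in for __VOLK_PREFETCH():
 * a prefetch hint on GCC-compatible compilers, a no-op elsewhere. */
#if defined(__GNUC__)
#define SKETCH_PREFETCH(addr) __builtin_prefetch(addr)
#else
#define SKETCH_PREFETCH(addr) ((void)0)
#endif

/* Sum a float buffer while hinting at data a few iterations ahead.
 * The distance of 16 elements is an arbitrary choice for this sketch. */
static float sketch_sum(const float *in, size_t num_points)
{
    float acc = 0.0f;
    for (size_t i = 0; i < num_points; ++i) {
        if (i + 16 < num_points) {
            SKETCH_PREFETCH(&in[i + 16]);
        }
        acc += in[i];
    }
    return acc;
}
```

Because the hint never changes the result, a kernel written this way behaves identically on compilers where the macro expands to nothing.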

Documentation

Fix a minor bug in the log2 docstring.

Build Support

Support relocated install with VOLK_PREFIX env var.

Some packaging systems such as snaps will install the volk library to a dynamically chosen location. The install script can set an environment variable so that the library reports the correct prefix.
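
The lookup could work roughly like the sketch below. Only the VOLK_PREFIX variable name comes from these notes; sketch_prefix and the default path are hypothetical, and the real library may resolve its prefix differently.

```c
#include <stdlib.h>

/* Hypothetical compile-time default; a real build would inject its own. */
#define SKETCH_DEFAULT_PREFIX "/usr/local"

/* Return the install prefix, preferring the VOLK_PREFIX environment
 * variable over the compile-time default when it is set and non-empty. */
static const char *sketch_prefix(void)
{
    const char *prefix = getenv("VOLK_PREFIX");
    if (prefix != NULL && prefix[0] != '\0') {
        return prefix;
    }
    return SKETCH_DEFAULT_PREFIX;
}
```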

cmake: support empty CMAKE_INSTALL_PREFIX

QA and CI

qa: lower tolerance for 32fc_mag to fix issue #96

apps: fix profile update reading end of lines

Add an AppVeyor-compatible YAML file for building on the AppVeyor CI

Modtool

Update the cmake find module for volk mods and deconflict module include guards from main volk.

Release v1.2.3

02 Jul 2016 by Nathan West

Contributors

Changes

The index_max kernels were named with the wrong output datatype. To fix this, there are new kernels that return a 32u (uint32_t), and the existing kernels had their signatures changed to return 16u (uint16_t).

The output to stdout and stderr has been shuffled around. There is no longer a message that prints what VOLK machine is being used and the warning messages go to stderr rather than stdout.

MSVC builds without explicitly set flags.

VolkConfig.cmake includes a hardcoded install path so that VOLK is easier to find in non-standard prefixes. Similarly the BOOST_ROOT environment variable is no longer overridden so that it is easier to find BOOST in non-standard prefixes.

The 32fc_index_max kernels previously were only accurate to the SSE register width (4 points). This was a pretty serious and long-lived bug that's been fixed and the QA updated appropriately.
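
For comparison, a scalar reference that returns the exact index looks like the sketch below. It is an illustration only, not the VOLK generic kernel: sketch_complex_t and sketch_index_max are hypothetical names, and the sketch uses a 32-bit unsigned index. A SIMD version has to do the extra work of finding which lane of the register holds the maximum, which is the step the bug concerned.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical complex type for this sketch. */
typedef struct {
    float re;
    float im;
} sketch_complex_t;

/* Return the index of the point with the largest squared magnitude.
 * Ties resolve to the first occurrence; returns 0 for an empty input. */
static uint32_t sketch_index_max(const sketch_complex_t *src, uint32_t num_points)
{
    uint32_t best = 0;
    float best_mag2 = -1.0f; /* below any real squared magnitude */
    for (uint32_t i = 0; i < num_points; ++i) {
        float mag2 = src[i].re * src[i].re + src[i].im * src[i].im;
        if (mag2 > best_mag2) {
            best_mag2 = mag2;
            best = i;
        }
    }
    return best;
}
```

A QA check against such a reference should place the maximum at an index that is not a multiple of 4, since a result that is only accurate to the register width can otherwise pass by accident.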

Release v1.3

02 Jul 2016 by Nathan West

Contributors

Changes

Several new kernels are available. These include several type conversions, some fixed point complex operations, and a complex float divide.

Volk_config functions are now overloaded to be able to read and write to a custom volk_config path which can be controlled with a new --path option in volk_profile.

volk-config-info can now report the alignment VOLK uses for your CPU and the malloc implementation used in volk_profile.

Builds define the Dual ABI macro _GLIBCXX_USE_CXX11_ABI to 1, which should allow builds with GCC 4 when linking against C++ libraries that are built with GCC 5.

Release v1.2.2

07 Apr 2016 by Nathan West

Contributors

Changes

This is a maintenance release that primarily addresses build issues, mostly for MSVC. Additionally, it fixes an issue when building as a sub-project of GNU Radio with cmake versions >= 3.5.

Release v1.2.1

07 Feb 2016 by Nathan West

Contributors

Changes

Profiler

  • Fixed a segfault in the polar butterfly puppet
  • Reverted back to input values in range [-1, 1] rather than [-pi, pi].

Builds (all windows related)

  • Add MSVC 14 to processor detection
  • Minor tweaks to fix builds on MSVC 2015

Kernels

Small performance improvement in log2 by switching to log2f, which is explicitly for floats.

Add reusable intrinsics for NEON. This continues the push to create an internal library of intrinsics that makes it possible to compose more complex kernels from simpler, reusable building blocks without going back to memory.

Release v1.2

23 Dec 2015 by Nathan West

Contributors

Changes

Kernels

New kernels for polar codes are available with generic, SSE3, and AVX implementations. They are the result of ESA SoC work by Johannes Demel and are used in GNU Radio.

The rotator protokernels now normalize the phase each time the main for loop finishes. This guarantees that normalization also happens for a series of calls with small vector lengths.

Some kernels now use inacc tolerances for QA. The kernels themselves have exactly the same functionality as before, but with the existing QA tolerance they would fail on NEON because of the sqrt and inverse approximations NEON uses.

Release v1.1.1

31 Oct 2015 by Nathan West

This is the first maintenance release with only bug fixes since v1.1

Contributors

The following authors have contributed code to this release:

Changes

Coverity

Ben spent the post-GRcon hackfest fixing a few errors that the GNU Radio coverity scan reported. The critical fix was a potential buffer overflow in the profiler while reading existing results.

Builds

Based on feedback from packagers VOLK has made strides towards reproducible builds. The latest effort is removing the builddate.

Several header includes and ifdef guards have been shuffled around to make VOLK out of tree modules easier to work with and ARM builds more robust to compiler whims.

Release v1.1

24 Aug 2015 by Nathan West

Contributors

The following authors have contributed code to this release:

Changes

This release contains all of the bug fixes from v1.0.1 and v1.0.2 as well as new features and other changes that didn't belong on maint. The following is a summary of non-maint changes.

Architectures

New architectures exist for the AVX2 and FMA ISAs. Along with the build-system support, the following kernels now have proto-kernels taking advantage of these architectures:

  • 32f_x2_dot_prod_32f
  • 32fc_x2_multiply_32fc
  • 64u_byteswap
  • 32f_binary_slicer_8i
  • 16u_byteswap
  • 32u_byteswap

QA/profiler

The profiler now generates buffers that are vlen + a tiny amount and generates random data to fill buffers. This is intended to catch bugs in protokernels that write beyond num_points.
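
The overrun check can be illustrated with the sketch below. It swaps in a fixed canary byte where the profiler uses random data, so it shows the principle and not the profiler's actual code; all names and the padding size are hypothetical. One limitation of a fixed canary is that a stray write of the canary value itself goes unnoticed.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAD 8       /* hypothetical padding beyond num_points */
#define SKETCH_CANARY 0xA5 /* hypothetical fill value */

/* A kernel under test: writes num_points bytes to out. */
typedef void (*sketch_kernel_t)(unsigned char *out, size_t num_points);

/* Returns 1 if the kernel left the padding untouched, 0 if it wrote
 * beyond num_points, and -1 if the buffer could not be allocated. */
static int sketch_check_overrun(sketch_kernel_t kernel, size_t num_points)
{
    unsigned char *buf = malloc(num_points + SKETCH_PAD);
    if (buf == NULL) {
        return -1;
    }
    memset(buf, SKETCH_CANARY, num_points + SKETCH_PAD);
    kernel(buf, num_points);

    int ok = 1;
    for (size_t i = 0; i < SKETCH_PAD; ++i) {
        if (buf[num_points + i] != SKETCH_CANARY) {
            ok = 0;
        }
    }
    free(buf);
    return ok;
}

/* Well-behaved example kernel: writes exactly num_points bytes. */
static void sketch_good_kernel(unsigned char *out, size_t num_points)
{
    for (size_t i = 0; i < num_points; ++i) {
        out[i] = 0;
    }
}

/* Faulty example kernel: off-by-one, writes one byte too many. */
static void sketch_bad_kernel(unsigned char *out, size_t num_points)
{
    for (size_t i = 0; i <= num_points; ++i) {
        out[i] = 0;
    }
}
```

Here sketch_check_overrun(sketch_bad_kernel, 16) reports the off-by-one write, while the same call with sketch_good_kernel passes.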

Miscellaneous

  • All builds now use '-Wall'
  • Removed stray references to PPC and Altivec

Maintenance Release v1.0.2

24 Jul 2015 by Nathan West

This is a relatively minor maintenance release with bug fixes since v1.0.1.

Contributors

The following have contributed code to this release:

Changes

The major change is the CMake logic to add ASM protokernels. Rather than depending on CFLAGS and ASMFLAGS we use the results of VOLK's built in has_ARCH tests. All configurations should work the same as before, but manually specifying CFLAGS and ASMFLAGS on the cmake call for ARM native builds should no longer be necessary.

The 32fc_s32fc_x2_rotator_32fc generic protokernel now includes a previously implied header.

Finally, there is a fix to return the "best" protokernel to the dispatcher when no volk_config exists. Thanks to Alexandre Raymond for pointing this out.

Maintenance Release v1.0.1

08 Jul 2015 by Nathan West

This is a maintenance release with bug fixes since the initial release of v1.0 in April.

Contributors

The following authors have contributed code to this release:

Changes

Kernels

Several bug fixes in different kernels. The NEON implementations of the following kernels have been fixed:

  • 32f_x2_add_32f
  • 32f_x2_dot_prod_32f
  • 32fc_s32fc_multiply_32fc
  • 32fc_x2_multiply_32fc

Additionally the NEON asm based 32f_x2_add_32f protokernels were not being used and are now included and available for use via the dispatcher.

The 32f_s32f_x2_fm_detect_32f kernel now has a puppet. This solves QA seg faults on 32-bit machines and provides a better test for this kernel.

The 32fc_s32fc_x2_rotator_32fc generic protokernel replaced cabsf with hypotf for better Android support.

Building

Static builds now trigger the applications (volk_profile and volk-config-info) to be statically linked.

The file gcc_x86_cpuid.h has been removed since it was no longer being used. Previously it provided cpuid functionality for ancient compilers that we do not support.

All build types now use -Wall.

QA and Testing

The documentation around the --update option to volk_profile now makes it clear that the option will only profile kernels without entries in volk_profile. The signature of run_volk_tests with expanded args changed signed types to unsigned types to reflect the actual input.

The remaining changes are all non-functional changes to address issues from Coverity.

Initial Release

11 Apr 2015 by Nathan West

VOLK 1.0 is available. This is the first release of VOLK as an independently tracked sub-project of GNU Radio.

Contributors

VOLK has been tracked separately from GNU Radio since 2014 Dec 23. Contributors between the split and the initial release are

Changes

QA

The test and profiler have significantly changed. The profiler supports run-time changes to vlen and iters to help kernel development and provide more flexibility on embedded systems. Additionally, there is a new option to update an existing volk_profile results file with only new kernels, which will save time when updating to newer versions of VOLK.

The QA system creates a static list of kernels and test cases. The QA testing and profiler iterate over this static list rather than each source file keeping its own list. The QA also emits XML results to lib/.unittest/kernels.xml which is formatted similarly to JUnit results.

Modtool

Modtool was updated to support the QA and profiler changes.

Kernels

New proto-kernels:

  • 16ic_deinterleave_real_8i_neon
  • 16ic_s32f_deinterleave_32f_neon
  • fix preprocessor errors for some compilers on byteswap and popcount puppets

ORC was moved to the asm kernels directory.

volk_malloc

The posix_memalign implementation of Volk_malloc now falls back to a standard malloc if alignment is 1.
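
The reason such a fall-back is needed is that posix_memalign only accepts alignments that are a power of two and a multiple of sizeof(void *), so a request with alignment 1 would be rejected. A minimal sketch of the idea, with the hypothetical name sketch_malloc and not taken from the VOLK sources:

```c
#define _POSIX_C_SOURCE 200112L /* for posix_memalign */
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: aligned allocation that falls back to plain
 * malloc when no alignment is requested (alignment == 1). */
static void *sketch_malloc(size_t size, size_t alignment)
{
    if (alignment == 1) {
        return malloc(size);
    }
    void *ptr = NULL;
    if (posix_memalign(&ptr, alignment, size) != 0) {
        return NULL;
    }
    return ptr;
}
```

On POSIX systems both paths return memory that can be released with free.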

Miscellaneous

Several build system and cmake changes have made it possible to build VOLK both independently with proper soname versions and in-tree for projects such as GNU Radio.

The static builds take advantage of cmake object libraries to speed up builds.

Finally, there are a number of changes to satisfy compiler warnings and make QA work on multiple machines.