Timestamp:
08/26/16 19:35:26 (8 years ago)
Author:
Hal Finkel <hfinkel@…>
Branches:
master, pympi
Children:
8ebc79b
Parents:
cda87e9
git-author:
Hal Finkel <hfinkel@…> (08/26/16 19:35:26)
git-committer:
Hal Finkel <hfinkel@…> (08/26/16 19:35:26)
Message:

Upgrade to latest blosc library

blosc git: e394f327ccc78319d90a06af0b88bce07034b8dd

File:
1 edited

  • thirdparty/blosc/README.rst

    r00587dc r981e22c  
    44 
    55:Author: Francesc Alted 
    6 :Contact: f[email protected] 
     6:Contact: f[email protected] 
    77:URL: http://www.blosc.org 
     8:Gitter: |gitter| 
     9:Travis CI: |travis| 
     10:Appveyor: |appveyor| 
     11 
     12.. |gitter| image:: https://badges.gitter.im/Blosc/c-blosc.svg 
     13        :alt: Join the chat at https://gitter.im/Blosc/c-blosc 
     14        :target: https://gitter.im/Blosc/c-blosc?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge 
     15 
     16.. |travis| image:: https://travis-ci.org/Blosc/c-blosc.svg?branch=master 
     17        :target: https://travis-ci.org/Blosc/c-blosc 
     18 
     19.. |appveyor| image:: https://ci.appveyor.com/api/projects/status/3mlyjc1ak0lbkmte?svg=true 
     20        :target: https://ci.appveyor.com/project/FrancescAlted/c-blosc/branch/master 
     21 
    822 
    923What is it? 
     
    1832 
    1933It uses the blocking technique (as described in [2]_) to reduce 
    20 activity on the memory bus as much as possible.  In short, this 
     34activity on the memory bus as much as possible. In short, this 
    2135technique works by dividing datasets in blocks that are small enough 
    2236to fit in caches of modern processors and perform compression / 
    2337decompression there.  It also leverages, if available, SIMD 
    24 instructions (SSE2) and multi-threading capabilities of CPUs, in order 
    25 to accelerate the compression / decompression process to a maximum. 
    26  
    27 You can see some recent benchmarks about Blosc performance in [3]_ 
     38instructions (SSE2, AVX2) and multi-threading capabilities of CPUs, in 
     39order to accelerate the compression / decompression process to a 
     40maximum. 
     41 
     42Blosc is actually a metacompressor, that meaning that it can use a range 
     43of compression libraries for performing the actual 
     44compression/decompression. Right now, it comes with integrated support 
     45for BloscLZ (the original one), LZ4, LZ4HC, Snappy, Zlib and Zstd. Blosc 
     46comes with full sources for all compressors, so in case it does not find 
     47the libraries installed in your system, it will compile from the 
     48included sources and they will be integrated into the Blosc library 
     49anyway. That means that you can trust in having all supported 
     50compressors integrated in Blosc in all supported platforms. 
     51 
     52You can see some benchmarks about Blosc performance in [3]_ 
    2853 
    2954Blosc is distributed using the MIT license, see LICENSES/BLOSC.txt for 
     
    3257.. [1] http://www.blosc.org 
    3358.. [2] http://blosc.org/docs/StarvingCPUs-CISE-2010.pdf 
    34 .. [3] http://blosc.org/trac/wiki/SyntheticBenchmarks 
     59.. [3] http://blosc.org/synthetic-benchmarks.html 
    3560 
    3661Meta-compression and other advantages over existing compressors 
    3762=============================================================== 
    3863 
    39 Blosc is not like other compressors: it should rather be called a 
     64C-Blosc is not like other compressors: it should rather be called a 
    4065meta-compressor.  This is so because it can use different compressors 
    41 and pre-conditioners (programs that generally improve compression 
    42 ratio).  At any rate, it can also be called a compressor because it 
    43 happens that it already integrates one compressor and one 
    44 pre-conditioner, so it can actually work like so. 
    45  
    46 Currently it uses BloscLZ, a compressor heavily based on FastLZ 
    47 (http://fastlz.org/), and a highly optimized (it can use SSE2 
    48 instructions, if available) Shuffle pre-conditioner. However, 
    49 different compressors or pre-conditioners may be added in the future. 
    50  
    51 Blosc is in charge of coordinating the compressor and pre-conditioners 
    52 so that they can leverage the blocking technique (described above) as 
    53 well as multi-threaded execution (if several cores are available) 
    54 automatically. That makes that every compressor and pre-conditioner 
     66and filters (programs that generally improve compression ratio).  At 
     67any rate, it can also be called a compressor because it happens that 
     68it already comes with several compressor and filters, so it can 
     69actually work like so. 
     70 
     71Currently C-Blosc comes with support of BloscLZ, a compressor heavily 
     72based on FastLZ (http://fastlz.org/), LZ4 and LZ4HC 
     73(https://github.com/Cyan4973/lz4), Snappy 
     74(https://github.com/google/snappy) and Zlib (http://www.zlib.net/), as 
     75well as a highly optimized (it can use SSE2 or AVX2 instructions, if 
     76available) shuffle and bitshuffle filters (for info on how and why 
     77shuffling works, see slide 17 of 
     78http://www.slideshare.net/PyData/blosc-py-data-2014).  However, 
     79different compressors or filters may be added in the future. 
     80 
     81C-Blosc is in charge of coordinating the different compressor and 
     82filters so that they can leverage the blocking technique (described 
     83above) as well as multi-threaded execution (if several cores are 
     84available) automatically. That makes that every compressor and filter 
    5585will work at very high speeds, even if it was not initially designed 
    5686for doing blocking or multi-threading. 
     
    6090* Meant for binary data: can take advantage of the type size 
    6191  meta-information for improved compression ratio (using the 
    62   integrated shuffle pre-conditioner). 
    63  
    64 * Small overhead on non-compressible data: only a maximum of 16 
    65   additional bytes over the source buffer length are needed to 
    66   compress *every* input. 
    67  
    68 * Maximum destination length: contrarily to many other 
    69   compressors, both compression and decompression routines have 
    70   support for maximum size lengths for the destination buffer. 
    71  
    72 * Replacement for memcpy(): it supports a 0 compression level that 
    73   does not compress at all and only adds 16 bytes of overhead. In 
    74   this mode Blosc can copy memory usually faster than a plain 
    75   memcpy(). 
     92  integrated shuffle and bitshuffle filters). 
     93 
     94* Small overhead on non-compressible data: only a maximum of (16 + 4 * 
     95  nthreads) additional bytes over the source buffer length are needed 
     96  to compress *any kind of input*. 
     97 
     98* Maximum destination length: contrarily to many other compressors, 
     99  both compression and decompression routines have support for maximum 
     100  size lengths for the destination buffer. 
    76101 
    77102When taken together, all these features set Blosc apart from other 
    78103similar solutions. 
    79104 
    80 Compiling your application with Blosc 
    81 ===================================== 
    82  
    83 Blosc consists of the next files (in blosc/ directory):: 
    84  
    85     blosc.h and blosc.c      -- the main routines 
    86     blosclz.h and blosclz.c  -- the actual compressor 
    87     shuffle.h and shuffle.c  -- the shuffle code 
     105Compiling your application with a minimalistic Blosc 
     106==================================================== 
     107 
     108The minimal Blosc consists of the next files (in `blosc/ directory 
     109<https://github.com/Blosc/c-blosc/tree/master/blosc>`_):: 
     110 
     111    blosc.h and blosc.c        -- the main routines 
     112    shuffle*.h and shuffle*.c  -- the shuffle code 
     113    blosclz.h and blosclz.c    -- the blosclz compressor 
    88114 
    89115Just add these files to your project in order to use Blosc.  For 
    90 information on compression and decompression routines, see blosc.h. 
    91  
    92 To compile using GCC (4.4 or higher recommended) on Unix: 
    93  
    94 .. code-block:: console 
    95  
    96    $ gcc -O3 -msse2 -o myprog myprog.c blosc/*.c -lpthread 
     116information on compression and decompression routines, see `blosc.h 
     117<https://github.com/Blosc/c-blosc/blob/master/blosc/blosc.h>`_. 
     118 
     119To compile using GCC (4.9 or higher recommended) on Unix: 
     120 
     121.. code-block:: console 
     122 
     123   $ gcc -O3 -mavx2 -o myprog myprog.c blosc/*.c -Iblosc -lpthread 
    97124 
    98125Using Windows and MINGW: 
     
    100127.. code-block:: console 
    101128 
    102    $ gcc -O3 -msse2 -o myprog myprog.c blosc\*.c 
    103  
    104 Using Windows and MSVC (2008 or higher recommended): 
    105  
    106 .. code-block:: console 
    107  
    108   $ cl /Ox /Femyprog.exe myprog.c blosc\*.c 
    109  
    110 A simple usage example is the benchmark in the bench/bench.c file. 
    111 Also, another example for using Blosc as a generic HDF5 filter is in 
    112 the hdf5/ directory. 
    113  
    114 I have not tried to compile this with compilers other than GCC, MINGW, 
    115 Intel ICC or MSVC yet. Please report your experiences with your own 
    116 platforms. 
    117  
    118 Testing Blosc 
    119 ============= 
    120  
    121 Go to the test/ directory and issue: 
    122  
    123 .. code-block:: console 
    124  
    125   $ make test 
    126  
    127 These tests are very basic, and only valid for platforms where GNU 
    128 make/gcc tools are available.  If you really want to test Blosc the 
    129 hard way, look at: 
    130  
    131 http://blosc.org/trac/wiki/SyntheticBenchmarks 
    132  
    133 where instructions on how to intensively test (and benchmark) Blosc 
    134 are given.  If while running these tests you get some error, please 
    135 report it back! 
     129   $ gcc -O3 -mavx2 -o myprog myprog.c -Iblosc blosc\*.c 
     130 
     131Using Windows and MSVC (2013 or higher recommended): 
     132 
     133.. code-block:: console 
     134 
     135  $ cl /Ox /Femyprog.exe /Iblosc myprog.c blosc\*.c 
     136 
     137In the `examples/ directory 
     138<https://github.com/Blosc/c-blosc/tree/master/examples>`_ you can find 
     139more hints on how to link your app with Blosc. 
     140 
     141I have not tried to compile this with compilers other than GCC, clang, 
     142MINGW, Intel ICC or MSVC yet. Please report your experiences with your 
     143own platforms. 
     144 
     145Adding support for other compressors with a minimalistic Blosc 
     146~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
     147 
     148The official cmake files (see below) for Blosc try hard to include 
     149support for LZ4, LZ4HC, Snappy, Zlib inside the Blosc library, so 
     150using them is just a matter of calling the appropriate 
     151`blosc_set_compressor() API call 
     152<https://github.com/Blosc/c-blosc/blob/master/blosc/blosc.h>`_.  See 
     153an `example here 
     154<https://github.com/Blosc/c-blosc/blob/master/examples/many_compressors.c>`_. 
     155 
     156Having said this, it is also easy to use a minimalistic Blosc and just 
     157add the symbols HAVE_LZ4 (will include both LZ4 and LZ4HC), 
     158HAVE_SNAPPY and HAVE_ZLIB during compilation as well as the 
     159appropriate libraries. For example, for compiling with minimalistic 
     160Blosc but with added Zlib support do: 
     161 
     162.. code-block:: console 
     163 
     164   $ gcc -O3 -msse2 -o myprog myprog.c blosc/*.c -Iblosc -lpthread -DHAVE_ZLIB -lz 
     165 
     166In the `bench/ directory 
     167<https://github.com/Blosc/c-blosc/tree/master/bench>`_ there are a couple 
     168of Makefile files (one for UNIX and the other for MinGW) with more 
     169complete building examples, like switching between libraries or 
     170internal sources for the compressors. 
     171 
     172Supported platforms 
     173~~~~~~~~~~~~~~~~~~~ 
     174 
     175Blosc is meant to support all platforms where a C89 compliant C 
     176compiler can be found.  The ones that are mostly tested are Intel 
     177(Linux, Mac OSX and Windows) and ARM (Linux), but exotic ones as IBM 
     178Blue Gene Q embedded "A2" processor are reported to work too. 
    136179 
    137180Compiling the Blosc library with CMake 
    138181====================================== 
    139182 
    140 Blosc can also be built, tested and installed using CMake_. 
     183Blosc can also be built, tested and installed using CMake_. Although 
     184this procedure might seem a bit more involved than the one described 
     185above, it is the most general because it allows to integrate other 
     186compressors than BloscLZ either from libraries or from internal 
     187sources. Hence, serious library developers are encouraged to use this 
     188way. 
     189 
    141190The following procedure describes the "out of source" build. 
    142191 
     
    148197  $ cd build 
    149198 
    150 Configure Blosc in release mode (enable optimizations) specifying the 
    151 installation directory: 
    152  
    153 .. code-block:: console 
    154  
    155   $ cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=INSTALL_DIR \ 
    156       PATH_TO_BLOSC_SOURCE_DIR 
    157  
    158 Please note that configuration can also be performed using UI tools 
    159 provided by CMake_ (ccmake or cmake-gui): 
    160  
    161 .. code-block:: console 
    162  
    163   $ cmake-gui PATH_TO_BLOSC_SOURCE_DIR 
     199Now run CMake configuration and optionally specify the installation 
     200directory (e.g. '/usr' or '/usr/local'): 
     201 
     202.. code-block:: console 
     203 
     204  $ cmake -DCMAKE_INSTALL_PREFIX=your_install_prefix_directory .. 
     205 
     206CMake allows to configure Blosc in many different ways, like preferring 
     207internal or external sources for compressors or enabling/disabling 
     208them.  Please note that configuration can also be performed using UI 
     209tools provided by CMake_ (ccmake or cmake-gui): 
     210 
     211.. code-block:: console 
     212 
     213  $ ccmake ..      # run a curses-based interface 
     214  $ cmake-gui ..   # run a graphical interface 
    164215 
    165216Build, test and install Blosc: 
     
    167218.. code-block:: console 
    168219 
    169   $ make 
    170   $ make test 
    171   $ make install  
     220  $ cmake --build . 
     221  $ ctest 
     222  $ cmake --build . --target install 
    172223 
    173224The static and dynamic version of the Blosc library, together with 
    174 header files, will be installed into the specified INSTALL_DIR. 
     225header files, will be installed into the specified 
     226CMAKE_INSTALL_PREFIX. 
    175227 
    176228.. _CMake: http://www.cmake.org 
     229 
     230Once you have compiled your Blosc library, you can easily link your 
     231apps with it as shown in the `example/ directory 
     232<https://github.com/Blosc/c-blosc/blob/master/examples>`_. 
     233 
     234Adding support for other compressors (LZ4, LZ4HC, Snappy, Zlib) with CMake 
     235~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
     236 
     237The CMake files in Blosc are configured to automatically detect other 
     238compressors like LZ4, LZ4HC, Snappy or Zlib by default.  So as long as 
     239the libraries and the header files for these libraries are accessible, 
     240these will be used by default.  See an `example here 
     241<https://github.com/Blosc/c-blosc/blob/master/examples/many_compressors.c>`_. 
     242 
     243*Note on Zlib*: the library should be easily found on UNIX systems, 
     244although on Windows, you can help CMake to find it by setting the 
     245environment variable 'ZLIB_ROOT' to where zlib 'include' and 'lib' 
     246directories are. Also, make sure that Zlib DLL library is in your 
     247'\Windows' directory. 
     248 
     249However, the full sources for LZ4, LZ4HC, Snappy and Zlib have been 
     250included in Blosc too. So, in general, you should not worry about not 
     251having (or CMake not finding) the libraries in your system because in 
     252this case, their sources will be automatically compiled for you. That 
     253effectively means that you can be confident in having a complete 
     254support for all the supported compression libraries in all supported 
     255platforms. 
     256 
     257If you want to force Blosc to use external libraries instead of 
     258the included compression sources: 
     259 
     260.. code-block:: console 
     261 
     262  $ cmake -DPREFER_EXTERNAL_LZ4=ON .. 
     263 
     264You can also disable support for some compression libraries: 
     265 
     266.. code-block:: console 
     267 
     268  $ cmake -DDEACTIVATE_SNAPPY=ON .. 
     269 
     270Mac OSX troubleshooting 
     271~~~~~~~~~~~~~~~~~~~~~~~ 
     272 
     273If you run into compilation troubles when using Mac OSX, please make 
     274sure that you have installed the command line developer tools.  You 
     275can always install them with: 
     276 
     277.. code-block:: console 
     278 
     279  $ xcode-select --install 
    177280 
    178281Wrapper for Python 
     
    181284Blosc has an official wrapper for Python.  See: 
    182285 
    183 https://github.com/FrancescAlted/python-blosc 
     286https://github.com/Blosc/python-blosc 
     287 
     288Command line interface and serialization format for Blosc 
     289========================================================= 
     290 
     291Blosc can be used from command line by using Bloscpack.  See: 
     292 
     293https://github.com/Blosc/bloscpack 
    184294 
    185295Filter for HDF5 
    186296=============== 
    187297 
    188 For those that want to use Blosc as a filter in the HDF5 library, 
    189 there is a sample implementation in the hdf5/ directory. 
     298For those who want to use Blosc as a filter in the HDF5 library, 
     299there is a sample implementation in the blosc/hdf5 project in: 
     300 
     301https://github.com/Blosc/hdf5 
    190302 
    191303Mailing list 
     
    200312=============== 
    201313 
    202 I'd like to thank the PyTables community that have collaborated in the 
    203 exhaustive testing of Blosc.  With an aggregate amount of more than 300 TB of 
    204 different datasets compressed *and* decompressed successfully, I can say that 
    205 Blosc is pretty safe now and ready for production purposes. 
    206  
    207 Other important contributions: 
    208  
    209 * Thibault North contributed a way to call Blosc from different threads in a 
    210   safe way. 
    211  
    212 * The cmake support was a contribution of Thibault North, Antonio Valentino 
    213   and Mark Wiebe. 
    214  
    215 * Valentin Haenel did a terrific work fixing typos and improving docs and the 
    216   plotting script. 
     314See THANKS.rst. 
    217315 
    218316 
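The README text in this changeset refers to the shuffle filter without showing what it does to a buffer. The following is a minimal, unoptimized sketch of a byte shuffle in plain C. It is not the code shipped in ``shuffle*.c`` (which can use SSE2 or AVX2); the function names ``shuffle_bytes`` and ``unshuffle_bytes`` are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative byte shuffle (hypothetical helper, not the Blosc API).
 * For nelems elements of typesize bytes each, byte k of every element
 * is gathered into one contiguous run in dest. */
static void shuffle_bytes(size_t typesize, size_t nelems,
                          const unsigned char *src, unsigned char *dest)
{
    assert(src != dest);  /* this sketch does not work in place */
    for (size_t k = 0; k < typesize; k++) {
        for (size_t i = 0; i < nelems; i++) {
            dest[k * nelems + i] = src[i * typesize + k];
        }
    }
}

/* Inverse transform: scatter each run of k-th bytes back into the
 * original element layout. */
static void unshuffle_bytes(size_t typesize, size_t nelems,
                            const unsigned char *src, unsigned char *dest)
{
    assert(src != dest);  /* this sketch does not work in place */
    for (size_t k = 0; k < typesize; k++) {
        for (size_t i = 0; i < nelems; i++) {
            dest[i * typesize + k] = src[k * nelems + i];
        }
    }
}
```

For an array of small 32-bit integers stored in little-endian order, the three high-order bytes of each element are zero, so after shuffling they form long runs of zeros that a general-purpose compressor can typically encode very compactly. This is why the type size matters for the compression ratio on binary data.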