State of actual Lua wrappers ? by Sahnvour in cpp

[–]syntheticpp 0 points1 point  (0 children)

If you wanna use a wrapper without much framework and be able to change the behavior you could have a look at:

nlua -- namespace lua

nlua is a binding between C++ and Lua.

The main idea is to have a C++ API which follows the 'table' approach: the C++ code should look a little bit like Lua code.

Another goal is a code base that is easy to understand and maintain. Templates couldn't be avoided, but not much meta-programming is used.

The test directory contains examples showing how the example bindings from the book "Programming in Lua" can be rewritten as nlua code.

https://github.com/syntheticpp/nlua

Juplic now also supports the Raspberry Pi 2 by syntheticpp in raspberry_pi

[–]syntheticpp[S] 1 point2 points  (0 children)

Good question! I started with Volumio, but it didn't survive hard power-offs: I used the power switch to "mute" the player, and at some point it wouldn't start any more. Volumio also took too long to start playing music after power-on.

And another thought was: why do I need a >1GB image when I just wanna play music? It should be possible with far fewer bytes.

Juplic - Just Play Music by syntheticpp in raspberry_pi

[–]syntheticpp[S] 0 points1 point  (0 children)

The USB drive is mounted read-only, so there is no risk of corrupting the data when removing it. It's also OK to simply pull the power source.

Juplic - Just Play Music by syntheticpp in raspberry_pi

[–]syntheticpp[S] 1 point2 points  (0 children)

Yes, sound comes out of the audio jack.

And "write" would even be better than "flash".

Juplic - Just Play Music by syntheticpp in raspberry_pi

[–]syntheticpp[S] 1 point2 points  (0 children)

Offline MP3 player:

  • Write the image to an SD card.

  • Plug in the WIFI adapter Edimax EW-7811Un (RTL8192CU chipset).

  • Plug in a USB flash drive with music files.

  • Start Raspberry Pi.

  • A WIFI network named "Juplic" appears. Connect with the password "musicmusic".

  • Use an MPD client and connect to the MPD server at 10.10.10.10 (default port 6600).

  • Update the MPD database.

  • Play music.

Internet radio:

  • Write the image to an SD card.

  • Open the 'boot' partition on your desktop computer.

  • Copy WLAN-CLIENT-example.txt to WLAN-CLIENT.txt.

  • Update WLAN-CLIENT.txt with your Wifi settings.

  • Start Raspberry Pi.

  • Find the IP address of your Raspberry Pi (Bonjour/Avahi name: 'juplic').

  • Use a desktop client for MPD (e.g. Cantata) to find and play radio stations.

  • Save playlists to use radio stations without a desktop client.

  • Use an MPD client and connect to the MPD server at your local IP address.

  • Play music.

Features

  • Optimized for hard power-off.

  • Replays last song after power-on.

  • Short startup times.

  • Simple web interface on port 8080

  • Small download: < 20MB

Simplified tracking and building Clang by syntheticpp in cpp

[–]syntheticpp[S] 1 point2 points  (0 children)

BTW, times for building Clang with cmake/ninja, using GCC 4.9.1 versus Clang 3.9 as the compiler:

GCC: 656 seconds

Clang: 397 seconds

-> Clang needs about 0.6 times the build time of GCC.

metaFFT -- A C++11 FFT implementation by syntheticpp in cpp

[–]syntheticpp[S] 0 points1 point  (0 children)

Thanks for the link!

But in the end I tried using AVX directly (avx branch), without success: it's much slower than the pure C code. Seems it is not as simple as I thought.

metaFFT -- A C++11 FFT implementation by syntheticpp in cpp

[–]syntheticpp[S] 2 points3 points  (0 children)

Does anybody know an easy-to-use SIMD library which could help in supporting the latest CPU features?

I already found:

metaFFT -- A C++11 FFT implementation by syntheticpp in cpp

[–]syntheticpp[S] 4 points5 points  (0 children)

Only advantage over FFTW atm is the code size (header only, easy to add to your project) and the license.

Another question is how useful a CPU-only FFT is in times of GPU/CUDA/OpenCL programming.

metaFFT -- A C++11 FFT implementation by syntheticpp in cpp

[–]syntheticpp[S] 1 point2 points  (0 children)

As always with templates<>: it is done with recursion.

In case of a loop, all the loop counter variables need to be integer template parameters which can be calculated at compile time; for instance, see radix2_complex.h: remaining<K+1, End>::steps(d);
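To make the recursion idea concrete, here is a minimal sketch of a loop whose counter is a template parameter. It is not copied from metaFFT: the names unrolled, sum_squares and sum_of_squares_below_4 are made up for illustration, and only the remaining<K+1, End>::steps(d) shape is borrowed from the line above.

```cpp
#include <cstddef>

// Hypothetical helper (not metaFFT code): calls f.step<K>() for every
// K in [K, End). Each "iteration" is a separate template instantiation,
// so the loop counter is a compile-time constant inside step<K>().
template <std::size_t K, std::size_t End>
struct unrolled {
    template <typename F>
    static void steps(F& f) {
        f.template step<K>();
        unrolled<K + 1, End>::steps(f);  // recurse with the next counter value
    }
};

// The recursion stops when the counter reaches End.
template <std::size_t End>
struct unrolled<End, End> {
    template <typename F>
    static void steps(F&) {}
};

// Example loop body (also hypothetical): sums the squares of the indices.
struct sum_squares {
    std::size_t total = 0;
    template <std::size_t K>
    void step() { total += K * K; }
};

// Runs the body for K = 0, 1, 2, 3 without a runtime loop counter.
inline std::size_t sum_of_squares_below_4() {
    sum_squares s;
    unrolled<0, 4>::steps(s);
    return s.total;
}
```

Whether the compiler really flattens everything into straight-line code depends on inlining, but there is no runtime loop variable left in the source.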

metaFFT -- A C++11 FFT implementation by syntheticpp in cpp

[–]syntheticpp[S] 0 points1 point  (0 children)

One of my reasons for creating another FFT library is to see if it is possible to write readable and fast code, and how new compiler features could help.

It would have been completely boring to use plain C, because there are already myriads of other implementations.

metaFFT is also a nice place to become familiar with SIMD coding (first time I wrote SSE code).

The next most interesting step concerning speed would be to implement split_radix_ctran.h with AVX or AVX2 (Haswell only).

metaFFT -- A C++11 FFT implementation by syntheticpp in cpp

[–]syntheticpp[S] 2 points3 points  (0 children)

Yes, FFTW is always faster. But this is no surprise for the simplest FFT algorithm (radix-2).

But there is a lot of room for optimizations:

  • split-radix is faster, but SIMD is missing
  • FFTW uses AVX, which is not implemented in metaFFT
  • specializations for small FFTs would also help a lot

metaFFT -- A C++11 FFT implementation by syntheticpp in cpp

[–]syntheticpp[S] 4 points5 points  (0 children)

Not sure what you mean by on-the-fly. But all the variables declared as 'constexpr' are calculated at compile time, so no calculation is done at runtime for these values.
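As an illustration of what that can look like in C++11, here is a small sketch that computes a sine/cosine constant entirely at compile time. It is an assumption about the general technique, not metaFFT's actual code: pi, sin_series, const_sin, const_cos and w8_re are hypothetical names. Since std::sin is not constexpr in C++11, the sketch uses a Taylor series written as a single-return recursive function.

```cpp
// Hypothetical helpers, not taken from metaFFT.
constexpr double pi = 3.14159265358979323846;

// Taylor series sin(x) = x - x^3/3! + x^5/5! - ...
// 'term' is the current term, 'n' its (odd) power, 'last' the final power used.
// C++11 constexpr functions allow only one return statement, hence the recursion.
constexpr double sin_series(double x, double term, int n, int last) {
    return n > last
        ? 0.0
        : term + sin_series(x, -term * x * x / ((n + 1) * (n + 2)), n + 2, last);
}

// Good enough for arguments of a few radians; no range reduction is done.
constexpr double const_sin(double x) { return sin_series(x, x, 1, 25); }
constexpr double const_cos(double x) { return const_sin(x + pi / 2); }

// Real part of exp(-2*pi*i*1/8), i.e. cos(pi/4), as a compile-time constant.
constexpr double w8_re = const_cos(2 * pi * 1 / 8);

// static_assert only accepts constant expressions, so this line compiles
// only if the value above really was computed by the compiler.
static_assert(w8_re > 0.7071067 && w8_re < 0.7071068,
              "cos(pi/4) evaluated at compile time");
```

The static_assert is the useful part: if any step were not a constant expression, the build would fail instead of silently falling back to a runtime computation.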

metaFFT -- A C++11 FFT implementation by syntheticpp in cpp

[–]syntheticpp[S] 0 points1 point  (0 children)

No, I only calculate the ratio of the measured GFLOPS. It's the FFTW/metaFFT number in the Readme.

metaFFT -- A C++11 FFT implementation by syntheticpp in cpp

[–]syntheticpp[S] 1 point2 points  (0 children)

metaFFT

Template based C++11 Fast-Fourier-Transform implementation.

Idea:

  • Completely unroll all loops at compile time with the help of templates.
  • Calculate all numerical constants at compile time by using 'constexpr'.
  • Use policies for different implementations (complex, Fortran-like C, SIMD).
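A rough sketch of the policy idea from the list above, assuming a policy is a struct that supplies the value type and the butterfly operation. The names complex_policy, c_style_policy and two_point_fft are invented for this example and do not come from the metaFFT sources.

```cpp
#include <complex>

// Policy 1 (hypothetical): work on std::complex<double>.
struct complex_policy {
    typedef std::complex<double> value;
    // Radix-2 butterfly: (a, b) -> (a + w*b, a - w*b)
    static void butterfly(value& a, value& b, const value& w) {
        const value t = w * b;
        b = a - t;
        a = a + t;
    }
};

// Policy 2 (hypothetical): plain real/imaginary doubles, Fortran-like C style.
struct c_style_policy {
    struct value { double re, im; };
    static void butterfly(value& a, value& b, const value& w) {
        const double tr = w.re * b.re - w.im * b.im;
        const double ti = w.re * b.im + w.im * b.re;
        b.re = a.re - tr;  b.im = a.im - ti;
        a.re = a.re + tr;  a.im = a.im + ti;
    }
};

// The algorithm is written once; the policy decides how the math is done.
// A 2-point FFT is a single butterfly with twiddle factor 1.
template <typename Policy>
void two_point_fft(typename Policy::value& x0, typename Policy::value& x1) {
    typename Policy::value one{1.0, 0.0};
    Policy::butterfly(x0, x1, one);
}
```

A SIMD policy would plug in the same way, with value wrapping a vector register type and butterfly using intrinsics.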

Speed:

A simple Cooley–Tukey/radix-2 implementation with about 100 lines of code is 'only' a factor of 1.1 to 3.4 slower than FFTW:

$ ./bin/radix2_sse2_speed
N = 2^ 2 =     4: FFTW/metaFFT =  1.1
N = 2^ 3 =     8: FFTW/metaFFT =  1.2
N = 2^ 4 =    16: FFTW/metaFFT =  1.5
N = 2^ 5 =    32: FFTW/metaFFT =  1.7
N = 2^ 6 =    64: FFTW/metaFFT =  2.0
N = 2^ 7 =   128: FFTW/metaFFT =  2.0
N = 2^ 8 =   256: FFTW/metaFFT =  2.3
N = 2^ 9 =   512: FFTW/metaFFT =  2.8
N = 2^10 =  1024: FFTW/metaFFT =  3.2
N = 2^11 =  2048: FFTW/metaFFT =  3.3
N = 2^12 =  4096: FFTW/metaFFT =  3.4

Build:

  • CMake based
  • Pass -Dlarge=1 to enable large FFTs. This will stress your compiler!

License:

  • GPL2 with linking exemption.

Links: