Hi everyone. A number of you have asked me to keep you informed of
any major updates on the OpenCL gr-clenabled project, and the past
couple of weeks have been pretty active. There's now a version up in
the repo with a significant number of updates and all blocks have been
validated (at least in their basic modes).
So here are the major updates:
Validation flowgraphs - Almost all test flowgraphs have been posted in
the examples directory, so you can run the comparisons on your own
hardware. This is important on older cards that don't support double
precision (you can check with the included clview command-line tool).
Signal Source Block - A discrepancy in the output was due to an
OpenCL precision issue. It turns out single (float) precision wasn't
producing accurate enough numbers. This block now uses double
precision if the hardware supports it (most new hardware will) for an
even cleaner signal than the native block (no secondary nodes).
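To get a feel for the single vs. double precision effect described above, here is a minimal sketch in plain Python. It is not the project's OpenCL kernel: the phase-accumulator structure, the function names, and the test frequencies are all assumptions chosen for illustration, and float32 is emulated by rounding through `struct`.

```python
import math
import struct


def to_float32(x):
    """Round a Python float (a double) to the nearest IEEE-754 single."""
    return struct.unpack("f", struct.pack("f", x))[0]


def max_sine_error(num_samples, freq_hz, sample_rate_hz, single_precision):
    """Largest deviation of an accumulated-phase sine from a reference.

    With single_precision=True the phase step, the running phase, the
    2*pi wrap constant, and each output sample are rounded to float32
    to emulate a single-precision kernel (an assumed structure).
    """
    two_pi = 2.0 * math.pi
    step = two_pi * freq_hz / sample_rate_hz
    if single_precision:
        two_pi = to_float32(two_pi)
        step = to_float32(step)

    phase = 0.0
    worst = 0.0
    for n in range(num_samples):
        sample = math.sin(phase)
        if single_precision:
            sample = to_float32(sample)

        # Reference: phase computed directly from n in double precision.
        reference = math.sin(2.0 * math.pi * freq_hz * n / sample_rate_hz)
        worst = max(worst, abs(sample - reference))

        # Advance and wrap the phase accumulator.
        phase += step
        if phase >= two_pi:
            phase -= two_pi
        if single_precision:
            phase = to_float32(phase)
    return worst


err32 = max_sine_error(8192, 1000.0, 32000.0, single_precision=True)
err64 = max_sine_error(8192, 1000.0, 32000.0, single_precision=False)
print("float32 max error:", err32)
print("float64 max error:", err64)
```

The exact numbers depend on the frequency and run length, but the float32 run should show a visibly larger worst-case error than the float64 run.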
Quad Demod - Same issue: the trig results differed between single and
double precision, and this has been corrected.
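For anyone unfamiliar with where the trig comes in, a quadrature demodulator takes the angle of each sample multiplied by the conjugate of the previous one. The following is a rough plain-Python sketch of that calculation, not the gr-clenabled kernel; the function name and the `gain` default are placeholders.

```python
import cmath
import math


def quad_demod(samples, gain=1.0):
    """Return gain * arg(x[n] * conj(x[n-1])) for each consecutive pair."""
    out = []
    for prev, cur in zip(samples, samples[1:]):
        product = cur * prev.conjugate()
        # atan2 is the trig call whose precision affects the output.
        out.append(gain * math.atan2(product.imag, product.real))
    return out


# A complex tone advancing 0.3 rad per sample demodulates to a constant.
tone = [cmath.exp(1j * 0.3 * n) for n in range(16)]
print(quad_demod(tone)[:3])
```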
Filters - A lot of this week's work went into filter validation
(hence the few emails about TD vs. FD from yesterday):
- Both FIR and FFT implementations are now in place and producing
correct output
- A generic tap-based block was added for more flexibility
- A test-clfilter command-line tool was added to measure performance
for a given number of taps across OpenCL FIR, GNURadio FIR, OpenCL
FFT, and GNURadio FFT, so you can pick the best-performing filter for
your implementation.
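As background on the TD vs. FD comparison: a time-domain FIR and an FFT-based filter compute the same convolution, just at different cost as the tap count grows. The sketch below shows that equivalence in plain Python. It is illustrative only: it computes a full linear convolution on one buffer rather than the streaming, history-based processing a real block does, and the small recursive FFT is a stand-in for a proper FFT library.

```python
import cmath


def fir_direct(samples, taps):
    """Time-domain FIR as a full linear convolution, O(N * num_taps)."""
    out = [0.0] * (len(samples) + len(taps) - 1)
    for i, s in enumerate(samples):
        for j, t in enumerate(taps):
            out[i + j] += s * t
    return out


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); length must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fir_fft(samples, taps):
    """Frequency-domain FIR: zero-pad, multiply spectra, inverse transform."""
    out_len = len(samples) + len(taps) - 1
    size = 1
    while size < out_len:
        size *= 2
    padded_samples = list(samples) + [0.0] * (size - len(samples))
    padded_taps = list(taps) + [0.0] * (size - len(taps))
    spectrum = [a * b for a, b in zip(fft(padded_samples), fft(padded_taps))]
    result = fft(spectrum, inverse=True)
    # The inverse above is unnormalized, so divide by the transform size.
    return [v.real / size for v in result[:out_len]]


samples = [float(n % 7) - 3.0 for n in range(50)]
taps = [0.1, 0.2, 0.4, 0.2, 0.1]
diff = max(abs(a - b) for a, b in zip(fir_direct(samples, taps),
                                      fir_fft(samples, taps)))
print("max difference between TD and FD outputs:", diff)
```

The outputs agree to within floating-point rounding, so the choice between the two comes down to speed for your tap count and hardware, which is what test-clfilter measures.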
Costas Loop - A Costas Loop was added; however, the performance as a
GPU kernel is horrible. Because the calculations are sequential, it
can't be processed in SIMD-parallel fashion, so it was written as an
OpenCL task-based kernel. This means it just runs single-threaded on
a single core, which is why the performance is so bad. However, if
anyone has an OpenCL-capable FPGA card like an Altera, I'd love to see
the results of running the included test-clenabled timing tool to see
how the Costas Loop performs. I just don't have access to one.
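The sequential dependency is easy to see in a generic second-order BPSK Costas loop. This is a textbook-style sketch in plain Python, not the gr-clenabled kernel, and the `alpha` and `beta` loop gains are made-up illustrative values.

```python
import cmath


def costas_loop_bpsk(samples, alpha=0.1, beta=0.0025):
    """Generic second-order BPSK Costas loop (illustrative gains)."""
    phase = 0.0
    freq = 0.0
    out = []
    for sample in samples:
        # Derotate by the current phase estimate.
        corrected = sample * cmath.exp(-1j * phase)
        out.append(corrected)

        # BPSK phase detector: zero when the symbol lies on the real axis.
        error = corrected.real * corrected.imag

        # Loop filter. The phase used for the NEXT sample depends on the
        # error from THIS sample, so iterations cannot run in parallel.
        freq += beta * error
        phase += freq + alpha * error
    return out


# BPSK symbols received with a fixed 0.5 rad phase offset.
symbols = [1.0 if n % 3 else -1.0 for n in range(600)]
received = [s * cmath.exp(0.5j) for s in symbols]
locked = costas_loop_bpsk(received)
print("last output sample:", locked[-1])
```

Every pass through the loop reads `phase` and `freq` written by the previous pass, which is the reason a one-work-item-per-sample kernel doesn't apply here.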
Performance - Code was added to detect if the hardware supports Fused
Multiply/Add functionality for added kernel performance. If it's
available it's used.
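The details of the detection aren't covered here. One standard route in the OpenCL API is to query CL_DEVICE_SINGLE_FP_CONFIG through clGetDeviceInfo and test the CL_FP_FMA bit of the returned bitfield. The sketch below only decodes such a bitfield in plain Python; whether gr-clenabled does it this way is an assumption, and the `USE_FMA` define is a hypothetical name, not one taken from the project.

```python
# Bit positions of cl_device_fp_config, as defined in the OpenCL headers.
CL_FP_DENORM = 1 << 0
CL_FP_INF_NAN = 1 << 1
CL_FP_ROUND_TO_NEAREST = 1 << 2
CL_FP_ROUND_TO_ZERO = 1 << 3
CL_FP_ROUND_TO_INF = 1 << 4
CL_FP_FMA = 1 << 5


def supports_fma(fp_config):
    """True if the device's fp_config bitfield advertises fused multiply-add."""
    return bool(fp_config & CL_FP_FMA)


def kernel_build_options(fp_config):
    """Pick kernel build options; USE_FMA is a hypothetical define name."""
    return "-DUSE_FMA" if supports_fma(fp_config) else ""


# Example bitfield values, as might be returned for CL_DEVICE_SINGLE_FP_CONFIG.
with_fma = CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST | CL_FP_FMA
without_fma = CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST
print(kernel_build_options(with_fma))
print(repr(kernel_build_options(without_fma)))
```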
OpenCL Setup Instructions - For those that may not have OpenCL set up,
I added some installation guides in the setup_help directory for
Ubuntu and Debian with step-by-steps on getting it up and running.
I've run through both of those processes on several systems and been up and
running pretty quickly. I also pulled some of the important points
into the main page's README, since in my experience that's generally
all I look through too.
Study - Based on the filter updates, the filter section in the study
in the docs directory was completely rewritten. The report was noted
as updated.
I think those are the biggest updates for now. As always, let me know if
anyone runs into any issues.
_______________________________________________
Discuss-gnuradio mailing list
Discuss-gnuradio@gnu.org
https://lists.gnu.org/mailman/listinfo/discuss-gnuradio