Saturday, July 26, 2014

Re: [Discuss-gnuradio] comments on stream tags and metadata storage

I'm confused: are we to discuss these issues on the mailing list, or as
comments on the wiki issues I created? I thought the latter was the
right location. Putting detailed discussion in two places that do not
link to each other is not a good approach.

My comments below address interpretation and philosophy, not technical
issues.

On 07/26/2014 05:36 AM, Marcus Müller wrote:
> Hi Peter,
> I'm on very limited time and internet, so let me just answer things
> in-text, shortening where contributing to readability.
> On 25.07.2014 23:37, Peter A. Bigot wrote:
>> I've added five issues to cover the topics from my original email and
>> followups.
>>
>> http://gnuradio.org/redmine/issues/698 proposes that the key of a
>> stream tag be a namespaced identifier to avoid conflicts between
>> individually-developed components.
> That's a good idea, and actually, standarizing the way stream tags and
> messages should be named and structured was one of the points of the
> last dev call, and is thus work to be done.

I've updated the issue with my proposed approach. If this is already
being worked and people have an alternative architecture in mind, then
please comment on that issue so I stop thinking about it.

>
>> The other four are listed inline below with additional responses to
>> Marcus:
>>
>> On 07/25/2014 08:46 AM, Marcus Müller wrote:
>>> < issue at hand: inserting tags only seems save from within work, but
>>> this is documented nowhere>
>>>
>>> sooo let me just whip up a few comments:
>>> (1) That's a documentation issue, isn't it?
>> I don't believe so.
>>
>> http://gnuradio.org/redmine/issues/699 expands on point (1), that
>> add_item_tag() can only be safely called from within gr::block::work().
> And I have to contradict Sean and you :)
> Inserting tags *is* safe at any time. You get logical errors when you
> don't take care that you don't attach tags to your stream that belong in
> ranges that might already have been processed. This is logical; I do
> think it would be a good idea to explicitely state it in the documentation.
> However, there's no reason not to insert tags anywhere else, just make
> sure that nitems_written() is bigger than your inserted tags' offsets.

I do not see how this allows me to guarantee association of a tag with a
specific absolute sample. I can guess why you want to insert from
outside work(), and could propose compromises, but I'd prefer to do that
at http://gnuradio.org/redmine/issues/699

>
>>
>>> Anyway, I'm not quite sure
>>> you're right; the insertion of tags is mutexed IIRC, and the
>>> get_tags_in_range() functionality, too, so once the user got his vector
>>> of tags, that won't change anymore. There's the possibility that he
>>> *misses* some of the tags for the range that get inserted after he
>>> got_tags_in_range(), but that's only fair -- it's quite intuitive not to
>>> insert tags after you've handed off the samples to which they would
>>> belong to downstream tags.
>> I wasn't concerned about tags across multiple calls to
>> get_tags_in_range(), but for tags that are added to a stream after one
>> call to work() and before the end of the next call to work().
> I'm not sure I understand you correctly.
> If (general_)work gets called, the samples that can be processed within
> that call to work have left the upstream block's work function already.
> If that upstream block decides that he wants to insert tags to samples
> he produced in a prior call already, my comment to (1) and basic "how
> the hell is this going to look causal to the block downstream" applies!

I don't think we disagree on this behavior, only on how to ensure that
the original insertion doesn't get lost or misordered (and whether
that's a problem for which the infrastructure should take some
responsibility).

>>
>>> (3) I don't see the relation to the discussion, as you said ;) but that
>>> sounds like a bug, so if you opened up a new thread or filed a bug at
>>> gnuradio.org, that would be awesome :)
>> http://gnuradio.org/redmine/issues/700 records point (3), that GRC
>> parameter callbacks can be invoked multiple times as a result of a
>> single user action due to the architecture of GUI and other components.
> Thanks! Sounds hard to fix, though, if you say it's due to architecture.
>>> (4) I'm fairly certain the buffers use a deque to store tags, not a map
>>> of any kind. So maybe I'm misunderstanding you, or you misread code
>>> somewhere?
>>> I think what you're describing might be a bug in metadata_filesink, so
>>> that might need some attention! see (3).
>> Yes, the lead-in example was specific to the behavior of
>> file_metadata_sink, but the fundamental requirement goes across the
>> entire infrastructure. I'm hoping to see support for making that
>> requirement stick.
> It is *NOT* a requirement (so far). So far it is the rule that everyone
> that expects multiple tags will do the sorting. And this makes sense --
> computationally. I don't think we should "break" correctness of existing
> blocks by taking away that requirement from the tag readers just to
> shift it to the tag producers. So I strongly disagree with you here, sorry.

If there is information that is conveyed by receiving tags in a
non-monotonic offset order, I'd like to have more details.

If the concern is that having the infrastructure maintain the tag order
I propose at http://gnuradio.org/redmine/issues/701 will introduce a
performance bottleneck, I'd like to see an argument. Preferably as a
comment on that issue.

Otherwise, since I feel that a deterministic non-decreasing order of
arrival of tags simplifies implementing blocks that process those tags,
if you feel that every block that might take advantage of ordered
arrival should instead enforce it in their work() functions then we do
indeed disagree.

>
>> http://gnuradio.org/redmine/issues/701 expands on points (2) and (4),
>> that the infrastructure should promise to maintain all tags inserted
>> by blocks in their original order (for any sample offset) and should
>> document the situations where tags may be discarded by the
>> infrastructure.
>>
>>
> Also, I don't understand the bullet points at the end of your ticket #701.
> * preserve order of insertion: GR does this, so how is this an issue?
You missed the "within the same offset". Yes, I really do want
downstream blocks to be guaranteed that the tags they receive are sorted
by non-decreasing offset. No, I don't want to make every block that
inserts or receives tags have to do the sorting.

> * preserve incoming tag order: I'm not quite sure I get your point, but
> if I do: GR does this, so how is this an issue?
> * never discard a tag at the block that inserted it: Why would someone
> discard a tag he inserted it? GR doesn't discard tags, so how is this an
> issue?

The implementation of file_metadata_sink discards tags, as described.
This suggests that GNU Radio as a holistic system does not have an
architectural requirement that metadata be preserved. I'd like to see
that resolution adopted.

From the rest of your mail I get the impression that our expectations
of what the framework should provide versus what individual component
developers are responsible for handling are at odds. If the consensus
of the GNU Radio development team is that the solution to all this is to
clarify and document the existing behavior with no framework changes at
all, then most of my issues can be closed once that's done. Ten years
experience with systems named X-Midas and Midas 2k, which do much of
what GNU Radio does, leads me to believe that would have significant
long term costs in development effort and system reliability.

Peter

> * clearly document duplicate elimination process: duplicates are *not*
> detected, so how is this an issue?
>
> Seriously, tag insertion is *this* 2 lines of code
> (gnuradio-runtime/lib/buffer_detail.cc):
> void
> buffer::add_item_tag(const tag_t &tag)
> {
> gr::thread::scoped_lock guard(*mutex());
> d_item_tags.push_back(tag);
> }
>
> So how did you come to the conclusion that there might be race
> conditions, necessity to call it only from work, elimination and
> reordering problems?
> All of the problems you raise are logical problems that can -- in my
> private opinion -- best be solved by what should be called a contract,
> ie. "do not shoot yourself in the knee by inserting tags to item numbers
> that you already processed, and do not read tags from samples you don't
> have yet. If you need your tags sorted, sort 'em", which basically boils
> down to "um, you now, don't try to circumvent causality by using tags,
> and by the way, don't assume they're sorted", which, for the signal
> processing guys can further be reduced to "tags aren't per se sorted.
> Usual considerations apply."
>
> As a serious attempt to get something like a private conclusion to this
> discussion, I'd like to point out that so far tag propagation works ok;
> the performance of get_tags_in_range leaves a lot of room for concepts
> that perform better, but that should be applied under the hood; if there
> are two tags at the same item: someone put them there, let's keep it
> like that. If tags are in a certain order: we should keep it like that
> (which metadata filesink seems to break, so thanks for pointing that
> out!). If you insert tags to items that have been processed in the past:
> don't assume someone else will ever look at them. If you're expecting
> tags to be in a specific order and don't know who inserted them: sort
> em. In most cases, this is a non-issue because tags will usually be
> inserted to mark something in a stream, and therefore get the
> item-chronological order automatically.
>
>>> (5)>(5) All stream tags are placed in the extras block,
>>> sorry, can't follow you there. Extras block?
>> The extras block is described in the section "Extras Information" at:
>> http://gnuradio.org/doc/doxygen/page_metadata.html
>>
> This is all about the on-disk storage format of the metadata file sink,
> so it's not really related to the way tags are handled. Maybe we're
> misunderstanding each other, but there is a section in the file format
> for extras.
>
>> http://gnuradio.org/redmine/issues/702 records points (2) and (5),
>> that the existing file_meta_source/sink corrupt (IMO) the metadata.
>>
> Thanks for that bug report!
>
>> Again I may not have made it clear that I'm not intending to raise
>> these simply as complaints. I have the necessary software skills to
>> fix them, and they generally include an outline of the approach I'd
>> propose. At this time I won't make any promises, but I'd consider
>> contributing the necessary effort if I can get enough
>> feedback/sanity-checks/interface validation/similar assistance from
>> you folks to be confident that it'd be seen as a worthwhile enhancement.
> :)
> Greetings,
> Marcus


_______________________________________________
Discuss-gnuradio mailing list
Discuss-gnuradio@gnu.org
https://lists.gnu.org/mailman/listinfo/discuss-gnuradio

No comments:

Post a Comment