Friday, July 25, 2014

Re: [Discuss-gnuradio] comments on stream tags and metadata storage

I'd hoped my comments below would start a more extensive dialog on GNU
Radio's metadata infrastructure. My several years of experience with
this capability in a non-commercial C++ DSP framework suggest many
possible enhancements in flow, representation, and utilities.

I have a slight itch to contribute to a solution, but without community
involvement I can't hope to provide anything mergeable. Is this simply not
something anybody feels needs to be addressed, or did I ask in the wrong
forum?

Peter

On 07/17/2014 05:11 PM, Peter A. Bigot wrote:
> Some comments after playing with stream tags and metadata this
> afternoon.
>
> (1) Although the discussion of stream tag insertion hints that this
> should be done within the scheduler's call to work() it could be more
> clear that doing it in any other context can result in race conditions.
> (I did think I saw it stated more clearly somewhere, but can't find
> that now, so maybe this point has been addressed.)
>
> (2) In the current implementation it's further necessary that tags be
> added to an output in monotonic non-decreasing offset order.
> file_meta_sink does not sort the return value from get_tags_in_range(),
> and emits all data up to the timestamp of the next tag, so a subsequent
> tag with an earlier offset is dropped from the archive.
>
> (I note that tagged_file_sink() does sort the tags it receives in one
> case, but not in others.)
>
> I don't see this requirement on ordered generation documented. In some
> cases, it may be inconvenient to do this, e.g. when a block's analysis
> discovers after-the-fact that something interesting can be associated
> with a past sample. Similarly, a user might want a block to associate
> a tag with a sample that has not yet arrived, to notify a downstream
> block that will need to process the event.
>
> A simple solution for the infrastructure is to require that tags only be
> generated from within work(), with offsets corresponding to samples
> generated in that call to work(), and in non-decreasing offset order
> (though this last requirement could be handled by add_item_tag()). The
> developer must then handle the too-late/too-early tag associations
> through some other mechanism, such as carrying the effective offset as
> part of the tag value.
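
The rule proposed in (2) can be sketched without GNU Radio itself. What
follows is a minimal Python illustration under stated assumptions, not
the GNU Radio API: Tag, emit_tags_for_work() and the "effective_offset"
key are hypothetical names. Tags whose intended offset falls outside the
samples produced by the current call are clamped into that window and
carry the intended offset as part of the value; the result is then sorted
into non-decreasing offset order.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass(frozen=True)
class Tag:
    """Hypothetical stand-in for a stream tag (not the GNU Radio type)."""
    offset: int  # absolute sample offset the tag is attached to
    key: str
    value: Any


def emit_tags_for_work(pending: List[Tag], first: int, count: int) -> List[Tag]:
    """Return tags that may be attached during one work() call that
    produces samples [first, first + count)."""
    if count <= 0:
        return []
    last = first + count - 1
    out = []
    for tag in pending:
        if first <= tag.offset <= last:
            out.append(tag)
        else:
            # Too late or too early: attach to the nearest sample in this
            # window and keep the intended offset inside the value.
            clamped = min(max(tag.offset, first), last)
            wrapped = {"effective_offset": tag.offset, "value": tag.value}
            out.append(Tag(clamped, tag.key, wrapped))
    # sorted() is stable, so tags at equal offsets keep their queue order.
    return sorted(out, key=lambda t: t.offset)
```

A downstream block would have to know to look for the wrapped value, so
this only moves the problem; it does show that the ordering requirement
itself is cheap to enforce at the point of insertion.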
>
> (3) Qt GUI Range with widget Counter + Slider invokes callbacks twice,
> even if the value itself was set exactly once through the counter text
> entry. If the callback records the change by queuing a stream tag for
> addition to the output, multiple tags with the same offset/key/value
> will be generated.
>
> There are ugly solutions to this but it's probably sufficient to note
> somewhere that it can happen. It's really not specific to tags, but is
> clearly visible in that case.
>
> (4) The in-memory stream of tags can produce multiple settings of the
> same key at the same offset. However, when stored to a file only the
> last setting of the key is recorded.
>
> I believe this last behavior is incorrect and that it's a mistake to use
> a map instead of a multimap or simple list for the metadata record of
> stream tags associated with a sample.
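
The difference between the two record types in (4) is easy to show in
isolation. This is a sketch in plain Python, assuming tags are simple
(offset, key, value) tuples; store_as_map() and store_as_list() are
hypothetical helpers and do not reflect the actual file_meta_sink code.

```python
from collections import defaultdict

# Three settings of the same key at the same offset, one an exact repeat.
stream_tags = [
    (1000, "freq", 100e6),
    (1000, "freq", 101e6),
    (1000, "freq", 101e6),
]


def store_as_map(tags):
    """Map-style record: a later setting overwrites an earlier one."""
    record = {}
    for offset, key, value in tags:
        record[(offset, key)] = value
    return record


def store_as_list(tags):
    """List-style record: every setting is kept, in arrival order."""
    record = defaultdict(list)
    for offset, key, value in tags:
        record[offset].append((key, value))
    return dict(record)
```

With the map-style record only the final value at offset 1000 survives;
the list-style record keeps all three entries, including the repeat.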
>
> One argument is that it's critical that a stream archive of a processing
> session faithfully record the contents of the stream so that re-running
> the application using playback reproduces that stream and thus the
> original behavior (absent non-determinism due to asynchrony). This
> faithful reproduction is what would allow a maintainer to diagnose an
> operational failure caused by a block with a runtime failure when the
> same tag is processed twice at the same offset. This is true even if
> the same key is set to the same value at the same sample offset multiple
> times, which some might otherwise want to argue is redundant.
>
> A corollary argument is that an event like a tuner configuration
> change usually can't be associated with an exact sample number; the
> best estimate is likely to be the index of the first sample generated
> by the next call to work(). But depending on processing
> speed an application might change an attribute of a data source multiple
> times before work was invoked. The effect of those intermediate changes
> may be visible in the signal, and to lose the fact they occurred by
> discarding all but the last change affects both reproducibility and
> interpretation of the signal itself.
>
> (5) All stream tags are placed in the extras block, and when a segment
> is completed file_meta_sink will generate a new header. The new header
> contains copies of the unique tags, but updates their offsets to be the
> start of the new segment.
>
> This is incorrect as the original stream did not have those tags
> associated with those samples, so re-playing will introduce a behavioral
> difference. For example, a tag that is meant to be associated with the
> start of a packet will be duplicated at an offset that is probably not
> the start of a packet.
>
> Solutions include (a) leave the original offset setting for tags in the
> extras section when they're reproduced in a new segment, even though
> that offset is not present in the segment; (b) treat stream tags as
> ephemeral and do not persist them in the extras section when generating
> a new segment; (c) extend the add_item_tag API to record whether the
> tag is ephemeral or persistent. Offhand I can see no argument
> supporting persisting a tag and updating its offset, and only rare cases
> where it's appropriate to replicate outdated information in a new
> segment, so (b) seems to be the right move.
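
The behavior described in (5) and the alternatives (a) and (b) can be
compared side by side. This is a sketch only: it assumes each entry in
the extras section is a dict with "key", "value" and "offset" fields,
and carry_over() with its policy names is hypothetical rather than part
of file_meta_sink.

```python
def carry_over(extras, segment_start, policy):
    """Return the tag entries to write into the next segment header."""
    if policy == "rewrite":
        # Behavior described in (5): copy tags, moving them to the new start.
        return [dict(tag, offset=segment_start) for tag in extras]
    if policy == "keep":
        # Option (a): copy tags but leave the original offset untouched.
        return [dict(tag) for tag in extras]
    if policy == "ephemeral":
        # Option (b): stream tags do not persist into the next segment.
        return []
    raise ValueError("unknown policy: %r" % (policy,))
```

For a packet-start tag recorded at offset 4096 and a new segment starting
at 8192, "rewrite" yields a tag at 8192 that the original stream never
had, "keep" yields a tag whose offset lies outside the segment, and
"ephemeral" yields nothing.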
>
> All the above is based on my understanding and expectations of how
> stream tags are/should be used. If my understanding is mistaken,
> please let me know.
>
> Peter
>
>
> _______________________________________________
> Discuss-gnuradio mailing list
> Discuss-gnuradio@gnu.org
> https://lists.gnu.org/mailman/listinfo/discuss-gnuradio


