Thanks for your answer!
> What happens when you do NOT add Latency, Datarate etc. to the query text?

You mean neither adding the metadata nor the operators? Then I will get exceptions:
Code:
de.uniol.inf.is.odysseus.core.server.planmanagement.QueryParseException: EvaluationPreTransformation throwed an exception. Message: Missing attribute 'no such attribute: latency'
...
Caused by: de.uniol.inf.is.odysseus.core.sdf.schema.NoSuchAttributeException: Missing attribute 'no such attribute: latency'
...
289202 ERROR StandardExecutor - Could not add query '#PRETRANSFORM EvaluationPreTransformation
> Datarate measurements are only useful at the inputs; otherwise you would measure e.g. the selectivity of the query. Together with latency, this should be all you need, I guess. Like e.g. precision and recall, it is important to use both values, as the throughput can be raised by introducing buffers, but this will of course lead to higher latency, too.

Hmm, ok, I understand what you mean. But let's think about a stream whose data rate is measured at the beginning, "in the middle", and at the end. The tuples are not filtered etc., they are only transformed. If the data rate "in the middle" is much lower than at the beginning, that would point to a bottleneck or some heavy processing (see the sketch below). Furthermore, the data rate at the end could indicate how many user requests Odysseus can process per second.
Or is this not really practicable?
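
To make my bottleneck argument concrete, here is a toy two-stage pipeline in plain Java (not Odysseus code; all names are made up for this sketch): a fast source fills an unbounded buffer while a slow transform drains it. The input rate stays near the source rate, the mid-stream rate is capped by the slow stage, and the difference shows up as a growing buffer, not as lost tuples:

Code:
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.LongAdder;

public class BottleneckDemo {
    static final LongAdder inCount = new LongAdder();   // tuples read from the source
    static final LongAdder midCount = new LongAdder();  // tuples leaving the transform

    public static void main(String[] args) throws InterruptedException {
        LinkedBlockingQueue<Integer> buffer = new LinkedBlockingQueue<>();

        Thread source = new Thread(() -> {        // fast source, roughly 500-1000 tuples/s
            try {
                for (int i = 0; i < 100_000; i++) {
                    buffer.put(i);
                    inCount.increment();
                    Thread.sleep(1);
                }
            } catch (InterruptedException ignored) { }
        });

        Thread transform = new Thread(() -> {     // slow, transform-only stage, ~200 tuples/s
            try {
                while (true) {
                    buffer.take();
                    Thread.sleep(5);              // stands in for heavy processing
                    midCount.increment();
                }
            } catch (InterruptedException ignored) { }
        });

        source.start();
        transform.start();
        Thread.sleep(5_000);                      // measure for 5 seconds

        System.out.printf("after 5s: read=%d, transformed=%d, buffered=%d%n",
                inCount.sum(), midCount.sum(), buffer.size());
        source.interrupt();
        transform.interrupt();
    }
}

Since nothing is filtered, a mid-stream rate far below the input rate cannot be selectivity; it singles out the heavy stage.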
Regarding the buffers: I filled my RabbitMQ message queue with different numbers of routing results of user requests (100, 1000, 2500, ...) and then started processing (1 user request = approx. 36 tuples to process). The idea was to see how many requests/tuples Odysseus can process per second. The reading rate varies a lot, I guess depending on the buffers. The final result output rate (data rate operator before the final sender) drops the more tuples are in the queue. But if I calculate the real throughput by hand as number of tuples / (lend of the last tuple - lstart of the first tuple), as in the helper below, I can see that the throughput rises and reaches its peak at 5000 user requests (about 180,000 tuples in the queue). I guess the reason for that are the buffers: the dropping data rate at the final sender is, I suppose, due to the fact that with fewer tuples the buffers are never completely filled and the overall processing can be faster.
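
For reference, my hand calculation as a small helper, assuming lstart/lend are nanosecond timestamps taken from the latency metadata of the first and the last result tuple (the names here are placeholders, not the real Odysseus API):

Code:
public class ThroughputCalc {

    // throughput = number of tuples / (lend of last tuple - lstart of first tuple)
    static double throughputPerSecond(long tupleCount,
                                      long lstartFirstTupleNanos,
                                      long lendLastTupleNanos) {
        double seconds = (lendLastTupleNanos - lstartFirstTupleNanos) / 1e9;
        return tupleCount / seconds;
    }

    public static void main(String[] args) {
        // illustrative numbers: 180,000 tuples spanning 300 s -> 600 tuples/s
        System.out.println(throughputPerSecond(180_000L, 0L, 300_000_000_000L));
    }
}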
Additionally, I tried to write a specific number of tuples per second into the message queue to see how the latency behaves for different numbers of requests per second. Based on my further results, I assumed that Odysseus could process about 600 tuples per second (which does not sound like much, but given the payload and the time-consuming processing, it is not bad at all). But if I fill the queue with 1000 tuples per second, they are also read at that speed. It seems they are buffered and processed later (the latency will of course be higher). The buffers will eventually be full, but it is really hard to find a good way to get reliable values for that (at least for me). The tuples in the queue cannot easily be imported; I have to generate them, which is a time-consuming process. This is automated, of course, but the processing time is high.
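
Pre-generating the payloads and replaying them at a fixed rate would take the expensive tuple generation out of the measurement itself. A minimal sketch of such a paced writer, assuming the standard RabbitMQ Java client (com.rabbitmq.client, 5.x); host, queue name, and payload format are placeholders:

Code:
import java.nio.charset.StandardCharsets;

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class PacedPublisher {
    public static void main(String[] args) throws Exception {
        final int RATE = 1_000;                  // offered load in tuples per second
        final int TOTAL = 60_000;                // total tuples to send
        final long intervalNanos = 1_000_000_000L / RATE;

        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        try (Connection conn = factory.newConnection();
             Channel channel = conn.createChannel()) {
            channel.queueDeclare("requests", false, false, false, null);

            long next = System.nanoTime();
            for (int i = 0; i < TOTAL; i++) {
                // embed a send timestamp so end-to-end latency can be computed downstream
                byte[] body = (i + ";" + System.currentTimeMillis())
                        .getBytes(StandardCharsets.UTF_8);
                channel.basicPublish("", "requests", null, body);

                next += intervalNanos;                    // pace against absolute deadlines
                long sleep = next - System.nanoTime();    // so drift does not accumulate
                if (sleep > 0) Thread.sleep(sleep / 1_000_000, (int) (sleep % 1_000_000));
            }
        }
    }
}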
What I am looking for is actually just a way to measure the maximum requests per second and how the latency changes between 1 tuple/second and that maximum. Or at least some reliable values that show how good/fast the processing is. Do you have an idea?
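
For concreteness, the kind of measurement I have in mind, as a rough sketch: step the offered rate up in fixed increments and watch the measured end-to-end latency. As long as the system keeps up, latency stays roughly flat; once the offered rate exceeds the service rate, the buffers grow without bound and latency diverges, so the last flat step is the maximum sustainable rate. Here publishAt(), averageLatencyMillis() and baselineLatencyMillis() are hypothetical hooks, to be backed by the paced writer above and the latency values from the query output:

Code:
public class SaturationSearch {
    public static void main(String[] args) throws Exception {
        double baseline = baselineLatencyMillis();        // latency at a very low rate
        for (int rate = 100; rate <= 2_000; rate += 100) {
            publishAt(rate, 60);                          // offer `rate` tuples/s for 60 s
            double latency = averageLatencyMillis();      // read from the latency operator
            System.out.printf("rate=%d/s latency=%.1f ms%n", rate, latency);
            if (latency > 10 * baseline) {                // latency diverges -> saturated
                System.out.println("saturation near " + (rate - 100) + " tuples/s");
                break;
            }
        }
    }

    // hypothetical hooks, see above
    static void publishAt(int tuplesPerSecond, int seconds) throws Exception { }
    static double averageLatencyMillis() { return 0.0; }
    static double baselineLatencyMillis() { return 1.0; }
}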