RSDK-11991: Historical module data #524

lia-viam · 2025-12-19T21:01:42Z

Implements historical module data for C++ modules using the DataConsumer class.

Most of this PR consists of implementing rudimentary BSON serialization, which is covered by the unit tests.

stuqdog

Hmm... maybe it's just that I don't have any real BSON experience but on reviewing this, it's not really clear to me how the semantics actually work. Do we expect that a typical user would know how to construct the appropriate BSONBytes for a query? If so, disregard! If not, we should maybe spend some time thinking about how to make this a bit more intuitive.

src/viam/sdk/module/data_consumer.cpp

stuqdog · 2025-12-22T19:27:49Z

src/viam/sdk/module/data_consumer.hpp

+    static DataClient::BSONBytes default_query(
+        const std::string& part_id,
+        const std::string& resource,
+        std::chrono::time_point<std::chrono::system_clock, std::chrono::milliseconds> time_point);


Milliseconds seems maybe overly precise to me. It's a bit unwieldy to work with and I imagine the cases where a user will care about sub-second precision in a query are probably not too high?

Also: it's a little confusing that the time_point argument for the default query sets the time since, and is distinct from the time_back field. I could easily imagine someone seeing that time_back is the name of the field and so, e.g., create a time point with an arg of one hour.

Perhaps also it would be nice to create an override of default_query that doesn't ask for a time_point but sets a default of 24 hours ago, akin to what we do in the go SDK.

Good points! So default_query is kind of an implementation detail, maybe we could move it to a private header entirely. Users shouldn't actually call default_query, because the other method will do it for them, I just wanted to be able to unit test it.

I chose milliseconds just on the basis of the BSON spec for timestamp, which is in milliseconds, but you can do

using namespace std::chrono; func_taking_ms(milliseconds{hours{5}});

but maybe this could be a remark in the header.

The use of time point vs duration also comes down to testing, because I can't reliably do unit tests on something that calls now() - duration.
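One way to keep the testability described above while still offering a "last 24 hours" default is to split the computation from the clock read. This is only a sketch: `ms_time_point` and `start_of_window` are hypothetical names, not part of the PR.

```cpp
#include <chrono>

// Hypothetical alias; the PR spells this type out in the signature.
using ms_time_point =
    std::chrono::time_point<std::chrono::system_clock, std::chrono::milliseconds>;

// Pure function: easy to unit test because "now" is passed in.
constexpr ms_time_point start_of_window(ms_time_point now,
                                        std::chrono::milliseconds time_back) {
    return now - time_back;
}

// Convenience wrapper that reads the clock; defaults to the last 24 hours.
inline ms_time_point start_of_window(
    std::chrono::milliseconds time_back = std::chrono::hours{24}) {
    const auto now = std::chrono::time_point_cast<std::chrono::milliseconds>(
        std::chrono::system_clock::now());
    return start_of_window(now, time_back);
}
```

Coarser durations such as `hours{5}` convert to `milliseconds` implicitly, so callers of either overload never need to spell out millisecond counts.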

I could easily imagine someone seeing that time_back is the name of the field and so, e.g., create a time point with an arg of one hour.

This is a situation where the C++ type system would prevent someone from doing that, but also based on this discussion I think I should just hide this helper function away somewhere

This is a situation where the C++ type system would prevent someone from doing that

So I agree that the type system won't let someone pass milliseconds as opposed to a time_point but I'm worried about the inversion logic of a time_point being "time since epoch", but the argument time_back being "time before now". So yeah, someone couldn't accidentally do

using namespace std::chrono; auto hour = milliseconds(hours{1}); auto query = default_query("id", "resource", hour);

but I think they could accidentally do

auto query = default_query("id", "resource", {hour});

thinking that this will give them everything in the past hour but actually it's giving them everything since an hour past the epoch (if I'm mistaken and you can't curly brace construct a time_point that way, let me know!).

I'm certainly willing to be convinced that this is overly defensive thinking on my part, but also I think just hiding this helper away from users is probably the best solution.
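On the brace-construction question: as far as the standard declares it, the `time_point` constructor taking a duration is `explicit`, so passing `{hour}` as a function argument (copy-list-initialization) should not compile. Writing the type out, as in `ms_time_point{hour}`, does compile and yields a point one hour after the epoch, which is the inversion risk described above. The sketch below checks both facts; `ms_time_point` and `since_epoch_ms` are hypothetical names used only for illustration.

```cpp
#include <chrono>
#include <type_traits>

using ms_time_point =
    std::chrono::time_point<std::chrono::system_clock, std::chrono::milliseconds>;

// No implicit conversion from a duration: a call like f({hour}) is rejected.
static_assert(!std::is_convertible<std::chrono::milliseconds, ms_time_point>::value,
              "duration -> time_point must not convert implicitly");

// Explicit construction is allowed and means "this long after the epoch".
static_assert(std::is_constructible<ms_time_point, std::chrono::milliseconds>::value,
              "duration -> time_point is available explicitly");

// Hypothetical helper standing in for a function that takes the time point.
inline std::chrono::milliseconds since_epoch_ms(ms_time_point tp) {
    return tp.time_since_epoch();
}
```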

stuqdog · 2025-12-22T20:03:58Z

src/viam/sdk/module/data_consumer.hpp

+        const std::string& resource,
+        std::chrono::time_point<std::chrono::system_clock, std::chrono::milliseconds> time_point);
+
+    DataConsumer(DataClient& dc);


Hmm... can we not use ViamClient's from_env method to get the DataClient within the constructor (similar to how we do in python) to avoid asking users to pass around the DataClient?

I think we could also provide DataConsumer::from_env, the difference would just be that this implies DataConsumer owns its DataClient which is probably a moot point in a GC language like Python

I think that's probably fine if the DataConsumer owns/is responsible for the DataClient? Maybe there's a C++ memory reason to not like that but from a code design perspective I don't have a problem with it, I'd hope that in general users kind of just ignore the internals of the DataConsumer which becomes easier if there's a constructor that uses env vars to avoid asking for anything.

You could give DataConsumer both an optional DataClient and a DataClient&. Then if you have DataConsumer::from_env the optional is engaged and the reference points into the internal field. If you call DataConsumer::DataConsumer(DataClient&) then the optional is disengaged and the caller is responsible for ensuring that the lifetime of the DataConsumer is a subset of the lifetime of the DataClient.
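A minimal sketch of the optional-plus-reference layout described above, assuming C++17 (`std::optional`, guaranteed copy elision). `Client`, `Consumer`, and `make_owning` are hypothetical stand-ins for `DataClient`, `DataConsumer`, and `DataConsumer::from_env`. They are not the SDK's types.

```cpp
#include <optional>
#include <string>
#include <utility>

struct Client {  // hypothetical stand-in for DataClient
    std::string endpoint;
};

class Consumer {
   public:
    // Non-owning: the caller guarantees that `client` outlives this object.
    explicit Consumer(Client& client) : client_(client) {}

    // Owning: the optional is engaged and the reference points into it.
    static Consumer make_owning(std::string endpoint) {
        return Consumer(Client{std::move(endpoint)});
    }

    // The reference may point into *this, so copying would leave it dangling.
    Consumer(const Consumer&) = delete;
    Consumer& operator=(const Consumer&) = delete;

    bool owns_client() const { return owned_.has_value(); }
    const std::string& endpoint() const { return client_.endpoint; }

   private:
    explicit Consumer(Client&& owned) : owned_(std::move(owned)), client_(*owned_) {}

    std::optional<Client> owned_;  // declared first so it is initialized before client_
    Client& client_;
};
```

Deleting copy and move is the price of the self-referential reference; a design that needs movable consumers would have to store a pointer and re-seat it instead.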

I was trying to implement the hybrid-ownership optional but I think we run into the same problem with DataClient, which has a relationship of strictly observing (but not owning) a ViamClient, see
https://github.com/viamrobotics/viam-cpp-sdk/blob/main/src/viam/sdk/app/viam_client.hpp
https://github.com/viamrobotics/viam-cpp-sdk/blob/main/src/viam/sdk/app/data_client.cpp

ViamClient has a ViamChannel

DataClient observes a ViamChannel obtained from a ViamClient

So for DataConsumer to potentially own its DataClient, it would need space for optionally storing an owned DataClient and an owned ViamClient.

I'm inclined to leave it as is, unless we want to go whole hog the other way and mandate that DataConsumer always owns its client(s)

OK. I'd vote to leave it as-is. Can you add though a doxygen comment explaining that the DataConsumer must not outlive the DataClient?

src/viam/sdk/tests/test_data_consumer.cpp

stuqdog

generally this looks good to me, though I do think the default_query method should probably be made private.

acmorrow · 2026-01-06T13:50:34Z

.clang-tidy

 # readability-implicit-bool-conversion: We have decided that !ptr-type is cleaner than ptr-type==nullptr
 # readability-magic-numbers: This encourages useless variables and extra lint lines
 # misc-include-cleaner: TODO(RSDK-5479) this is overly finnicky, add IWYU support and fix.
+# misc-no-recursion: global ban on recursion seems absurd


There is a reason for this. Stack is a limited resource, stack overflows are often silent, and stack corruption is often exploitable. So, unbounded recursion on user controlled input is a potent vector for security issues. It is made worse by the fact that there isn't a portable way to ask "how much stack space do I have left", or "how much stack will I need to invoke this function", so there isn't a natural way to decide what the bounds should be. So there are often good arguments for abandoning recursion in favor of either 1) iteration, or 2) moving the state to an explicit stack held in the heap.

If there is just one place this is happening, maybe a NOLINT is better.
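For reference, the second alternative (an explicit stack held in the heap) can look like the sketch below. `Node` and `max_depth` are hypothetical and unrelated to the PR's writer. The point is only that nesting depth now consumes heap memory rather than call stack, so pathological input fails with an allocation error instead of a stack overflow.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

struct Node {  // hypothetical nested structure
    std::vector<Node> children;
};

// Depth of the deepest node, counting the root as depth 1, without recursion.
std::size_t max_depth(const Node& root) {
    std::vector<std::pair<const Node*, std::size_t>> pending;  // explicit stack
    pending.emplace_back(&root, std::size_t{1});
    std::size_t deepest = 0;
    while (!pending.empty()) {
        const auto [node, depth] = pending.back();
        pending.pop_back();
        deepest = std::max(deepest, depth);
        for (const Node& child : node->children) {
            pending.emplace_back(&child, depth + 1);
        }
    }
    return deepest;
}
```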

this is good to know! i'll just nolint where needed

acmorrow · 2026-01-06T13:56:50Z

src/viam/sdk/module/data_consumer.hpp

+        const std::string& resource,
+        std::chrono::time_point<std::chrono::system_clock, std::chrono::milliseconds> time_point);
+
+    DataConsumer(DataClient& dc);


You could give DataConsumer both an optional DataClient and a DataClient&. Then if you have DataConsumer::from_env the optional is engaged and the reference points into the internal field. If you call DataConsumer::DataConsumer(DataClient&) then the optional is disengaged and the caller is responsible for ensuring that the lifetime of the DataConsumer is a subset of the lifetime of the DataClient.

acmorrow · 2026-01-06T13:58:34Z

src/viam/sdk/module/data_consumer.cpp

+    const std::string org_id = get_env("VIAM_PRIMARY_ORG_ID").value_or("");
+    const std::string part_id = get_env("VIAM_MACHINE_PART_ID").value_or("");


I have a feeling this has already been debated, but this makes me kind of sad. I really dislike using environment variables to communicate configuration between processes like this.

agreed, I feel there surely must be a better way but this is what we're doing in the other SDKs

acmorrow · 2026-01-06T14:01:59Z

src/viam/sdk/module/private/data_consumer_query.cpp

+namespace sdk {
+namespace impl {
+
+struct writer {


I know it is a hassle, but, I'd appreciate it if we explicitly ensured we were writing as little endian, per the bson spec, from the start. I know you think: hey, we will never ever care about that. We thought the same once, in another codebase, only to need to spend more than a year fixing the mistake. I believe boost has an endian library you can use to make this portable.
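A dependency-free way to get this is to emit integers byte by byte with shifts, which produces little-endian output on any host. The sketch below is hypothetical (`le_writer` and `encode_string_document` are not the PR's code) and encodes a one-element document with a string value, following the layout in the BSON spec: int32 total size, type byte, NUL-terminated key, int32 string length including the terminator, the string bytes, and a final zero byte.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct le_writer {  // hypothetical minimal writer
    std::vector<std::uint8_t> buf;

    // Append a 32-bit value least-significant byte first, whatever the host order.
    void write_int32(std::int32_t value) {
        const auto u = static_cast<std::uint32_t>(value);
        for (int shift = 0; shift < 32; shift += 8) {
            buf.push_back(static_cast<std::uint8_t>((u >> shift) & 0xFFu));
        }
    }

    // Append the bytes of a string followed by a terminating zero byte.
    void write_cstring(const std::string& s) {
        buf.insert(buf.end(), s.begin(), s.end());
        buf.push_back(0);
    }
};

// Encode the document {key: val}, where val is a BSON string (type 0x02).
std::vector<std::uint8_t> encode_string_document(const std::string& key,
                                                 const std::string& val) {
    le_writer w;
    w.write_int32(0);        // placeholder for the total size
    w.buf.push_back(0x02);   // element type: string
    w.write_cstring(key);    // element name
    w.write_int32(static_cast<std::int32_t>(val.size() + 1));  // length incl. NUL
    w.write_cstring(val);
    w.buf.push_back(0x00);   // end of document

    // Patch the size prefix one byte at a time, again least-significant first.
    const auto total = static_cast<std::uint32_t>(w.buf.size());
    for (std::size_t i = 0; i < 4; ++i) {
        w.buf[i] = static_cast<std::uint8_t>((total >> (8 * i)) & 0xFFu);
    }
    return w.buf;
}
```

Encoding `{"hello": "world"}` this way gives the 22-byte sequence used as the example in the BSON spec, starting with `16 00 00 00`.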

acmorrow · 2026-01-06T14:03:25Z

src/viam/sdk/module/private/data_consumer_query.cpp

+    }
+
+    void write_entry(const std::string& key, const std::string& val) {
+        write_header(int8_t{2}, key);


Consider creating an int8_t enum rather than using numeric constants. You can include only the types you care about.

And if you wanted to get fancy, you could have a trait that mapped T to enum value so you could do write_header(type_to_typecode_v<decltype(val)>) or whatever.
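A sketch of what the enum plus trait could look like, assuming C++17. The numeric values are the element type codes from the BSON spec. The names `bson_type` and `type_to_typecode`, and the choice of which C++ type maps to which code, are illustrative assumptions rather than the PR's final design.

```cpp
#include <chrono>
#include <cstdint>
#include <string>
#include <type_traits>

// Only the element types this sketch cares about.
enum class bson_type : std::int8_t {
    double_ = 0x01,
    string = 0x02,
    document = 0x03,
    datetime = 0x09,  // UTC datetime: int64 milliseconds since the Unix epoch
    int32 = 0x10,
    int64 = 0x12,
};

// Primary template left undefined so unsupported types fail to compile.
template <typename T>
struct type_to_typecode;

template <>
struct type_to_typecode<double> {
    static constexpr bson_type value = bson_type::double_;
};
template <>
struct type_to_typecode<std::string> {
    static constexpr bson_type value = bson_type::string;
};
template <>
struct type_to_typecode<std::int32_t> {
    static constexpr bson_type value = bson_type::int32;
};
template <>
struct type_to_typecode<std::int64_t> {
    static constexpr bson_type value = bson_type::int64;
};
template <>
struct type_to_typecode<
    std::chrono::time_point<std::chrono::system_clock, std::chrono::milliseconds>> {
    static constexpr bson_type value = bson_type::datetime;
};

// decay_t strips const and references, so decltype(val) of a parameter works.
template <typename T>
constexpr bson_type type_to_typecode_v = type_to_typecode<std::decay_t<T>>::value;
```

With this in place, a call such as `write_header(type_to_typecode_v<decltype(val)>, key)` picks the code from the argument type, and the magic number `2` disappears from the call site.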

acmorrow · 2026-01-06T14:07:39Z

src/viam/sdk/module/private/data_consumer_query.cpp

+
+        write_bytes(uint8_t{0});  // end of object
+
+        *reinterpret_cast<int32_t*>(&buf.front()) = static_cast<int32_t>(buf.size());


No guarantee of alignment here, you should use copy.
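A sketch of the copy-based alternative, using `std::memcpy` so that no `int32_t` object is ever accessed through a possibly misaligned pointer. `patch_size_prefix` is a hypothetical helper name. Note that a plain `memcpy` stores the value in the host's byte order, so on its own it does not address the little-endian requirement of the BSON spec.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Overwrite the first four bytes of buf with its own size.
// Returns false if the buffer is too small to hold the prefix.
bool patch_size_prefix(std::vector<std::uint8_t>& buf) {
    if (buf.size() < sizeof(std::int32_t)) {
        return false;
    }
    const auto size = static_cast<std::int32_t>(buf.size());
    std::memcpy(buf.data(), &size, sizeof(size));  // byte-wise copy, no alignment needed
    return true;
}
```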

acmorrow · 2026-01-06T14:10:33Z

src/viam/sdk/module/private/data_consumer_query.cpp

+namespace sdk {
+namespace impl {
+
+struct writer {


The writer can be in the unnamed namespace.

acmorrow

LGTM mod what looks like an oversight.

acmorrow · 2026-01-06T18:23:09Z

src/viam/sdk/module/data_consumer.hpp

+        const std::string& resource,
+        std::chrono::time_point<std::chrono::system_clock, std::chrono::milliseconds> time_point);
+
+    DataConsumer(DataClient& dc);


OK. I'd vote to leave it as-is. Can you add though a doxygen comment explaining that the DataConsumer must not outlive the DataClient?

acmorrow · 2026-01-06T18:23:26Z

.clang-tidy

  -readability-implicit-bool-conversion,
  -readability-magic-numbers,
  -misc-include-cleaner,
+  -misc-no-recursion,


Still needed?

lia-viam added 30 commits November 18, 2025 14:52

introduce grpc version check macro

87ef4ad

initial port of old dial direct pr

78d6f87

try changing one

36bdd4f

get viamsdk compiling

1b71ab0

update pipeline helper

ec1cf54

auth token and channel go in impl

c7a9497

dial options to viamchannel::options

081b4ed

Merge branch 'main' of github.com:viamrobotics/viam-cpp-sdk into dial-direct-v2

3f6cb9d

add deprecation comment

066c267

registry takes viam channel by ref to const

386e5e7

remove accidentally reintroduced ctor

59ab5e3

revert todo comment

df5b393

re-set viam channel member var

9508b8d

create new files

657a1e1

add lifetime comment

63a5685

rename variables to channel_options

9863d3e

convenience alias

ed64d67

rearrange doc comments

ae9508b

set and document value returned by get_channel_addr

6429f20

use viam prefixed version

6d627f3

document close

b62e1a4

Merge branch 'dial-direct-v2' into app-client

5bd1382

constructor

0c1607b

concise deprecation message

e356ab5

add data protos and build data

21234fd

Merge branch 'dial-direct-v2' into app-client

615d04d

Merge branch 'main' into dial-direct-v2

0ba163d

move comments to setters

6ff2ce5

Merge branch 'dial-direct-v2' of github.com:lia-viam/viam-cpp-sdk into dial-direct-v2

a0aa1c6

remove outdated comment

00d06e1

lia-viam added 5 commits December 18, 2025 18:09

finish writer implementation

1600aea

data consumer tests

0cae0fb

add test case

a787dd4

rearrange default ctors and use default param

ee735f3

Merge branch 'main' of github.com:viamrobotics/viam-cpp-sdk into historical-data

9dbe237

lia-viam requested a review from acmorrow December 19, 2025 21:01

lia-viam requested a review from a team as a code owner December 19, 2025 21:01

lia-viam requested review from njooma and stuqdog and removed request for a team December 19, 2025 21:01

lia-viam added 2 commits December 22, 2025 13:12

noexcept

b228a36

unban recursion

a0118cc

stuqdog reviewed Dec 22, 2025

lia-viam added 3 commits December 23, 2025 12:14

silence linter

62fc9b2

doc comments

80c406f

extra explanatory comment

57159df

lia-viam requested a review from stuqdog December 23, 2025 18:00

static cast

ceb0a3e

stuqdog reviewed Dec 26, 2025

lia-viam added 3 commits January 5, 2026 11:48

Merge branch 'main' into historical-data

f2eaec9

factor out query helper into private source and add test case

6ee6e4b

missing include

a01be58

lia-viam requested a review from stuqdog January 5, 2026 17:30

acmorrow requested changes Jan 6, 2026

lia-viam added 4 commits January 6, 2026 10:49

use anon ns and add object_id enum

61d1238

revert global nolint

2ded47d

endian awareness

1b8d431

nolint recursion

2845b22

lia-viam requested a review from acmorrow January 6, 2026 16:35

acmorrow approved these changes Jan 6, 2026

3 participants