Learnitweb

Modeling More Complex Data Structures in Protobuf

When you start working with Protocol Buffers, you quickly notice that simple key–value mappings are easy to represent. Protobuf provides a map<K, V> type that works very well when each key corresponds to a single value. However, real-world data is often more complex, and sometimes a single key must be associated with multiple values rather than just one.

This raises a natural question: how do we represent a structure similar to a Map<Key, List<Value>> from Java inside Protobuf?

The Limitation of map in Protobuf

A Protobuf map associates each key with exactly one value. The value type can be a scalar, an enum, or a message, but it cannot carry the repeated label and it cannot be another map. This means you cannot directly write something like:

map<int32, repeated Car> cars = 1; // ❌ Not allowed

This restriction follows from how maps are defined: a map field is equivalent on the wire to a repeated field of small entry messages, each holding exactly one key and one value. The map<K, V> syntax leaves no place for a repeated label on that value, so the idea of “a list per key” cannot be expressed directly using a map.
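To make the "repeated entry messages" view concrete, here is a minimal Java sketch. The records MapEntrySketch.Car and MapEntrySketch.Entry are hand-written stand-ins invented for illustration (requires Java 16+), not classes generated by protoc. The sketch collapses a list of entries into a map the way Protobuf parsers treat duplicate keys, where the last entry seen for a key wins.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration; not protoc-generated classes.
class MapEntrySketch {
    record Car(String brand, String model) {}

    // One entry of a map<int32, Car> field: exactly one key and one value.
    record Entry(int key, Car value) {}

    // Collapse entries into a map. A later entry with the same key replaces
    // the earlier one, so a key can never accumulate a list of values.
    static Map<Integer, Car> toMap(List<Entry> entries) {
        Map<Integer, Car> result = new LinkedHashMap<>();
        for (Entry e : entries) {
            result.put(e.key(), e.value());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Entry> entries = List.of(
            new Entry(1, new Car("Toyota", "Corolla")),
            new Entry(1, new Car("Honda", "Civic")),   // same key: replaces the Corolla
            new Entry(2, new Car("Ford", "Focus")));
        System.out.println(toMap(entries));
    }
}
```

Two entries share key 1 here, yet the resulting map holds a single Car for that key, which is why a list per key needs a different schema shape.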

Rethinking the Data Model

Instead of trying to force Protobuf to exactly match a Java data structure, the better approach is to slightly remodel your schema in a Protobuf-friendly way. Protobuf is not meant to be a one-to-one mirror of your in-memory Java collections; it is a language-neutral, serialization-focused format that encourages stable and simple schemas.

A common and effective solution is to introduce an intermediate message.

Using a Wrapper Message for Lists

Suppose you want a structure like:

Map<Integer, List<Car>>

You can model this by creating a message that contains a repeated Car field, and then using that message as the map value.

Example:

message Car {
  string brand = 1;
  string model = 2;
}

message CarList {
  repeated Car cars = 1;
}

message Garage {
  map<int32, CarList> cars_by_owner = 1;
}

Here, each key maps to a CarList message, and that message contains the repeated list of cars. This design achieves the same logical structure as a map of lists while staying within Protobuf’s rules.
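On the Java side, moving between Map<Integer, List<Car>> and the wrapper shape is a matter of wrapping and unwrapping each list. The sketch below assumes hand-written records (Java 16+) that mirror the Car, CarList, and Garage messages. They are stand-ins for illustration only. Real code would use the classes protoc generates from the .proto file, and the helper names toGarage and toJavaMap are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins that mirror the schema; not protoc-generated classes.
class WrapperSketch {
    record Car(String brand, String model) {}                 // message Car
    record CarList(List<Car> cars) {}                         // message CarList
    record Garage(Map<Integer, CarList> carsByOwner) {}       // message Garage

    // Map<Integer, List<Car>> -> Garage: wrap each list in a CarList.
    static Garage toGarage(Map<Integer, List<Car>> source) {
        Map<Integer, CarList> wrapped = new LinkedHashMap<>();
        source.forEach((owner, cars) ->
            wrapped.put(owner, new CarList(List.copyOf(cars))));
        return new Garage(wrapped);
    }

    // Garage -> Map<Integer, List<Car>>: unwrap each CarList again.
    static Map<Integer, List<Car>> toJavaMap(Garage garage) {
        Map<Integer, List<Car>> plain = new LinkedHashMap<>();
        garage.carsByOwner().forEach((owner, list) ->
            plain.put(owner, list.cars()));
        return plain;
    }

    public static void main(String[] args) {
        Map<Integer, List<Car>> source = Map.of(
            7, List.of(new Car("Toyota", "Corolla"), new Car("Honda", "Civic")));
        Garage garage = toGarage(source);
        // The round trip should give back an equal map.
        System.out.println(toJavaMap(garage).equals(source));
    }
}
```

The wrapper costs one extra level of nesting, but because CarList is an ordinary message, further fields can be added to it later without changing the map field itself.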

Alternative: Skip the Map Entirely

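In some cases, you may not need a map at all and can model your data as a repeated message that contains both the key and the list. This can be useful if ordering matters, since a repeated field preserves element order while the iteration order of a map is undefined, or if your access patterns do not require constant-time key lookup. The trade-off is that nothing in the schema prevents the same key from appearing in more than one entry.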

Example:

message OwnerCars {
  int32 owner_id = 1;
  repeated Car cars = 2;
}

message Garage {
  repeated OwnerCars entries = 1;
}

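This approach can be simpler, and it is sometimes more flexible when the schema evolves later: the key is an ordinary field, so it is not restricted to the integral and string types that map keys require.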
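With the repeated OwnerCars layout, keyed lookup has to be rebuilt by the reader. The following is a minimal Java sketch of that step, again using hand-written records (Java 16+) as stand-ins for generated classes. The helper indexByOwner is hypothetical, and merging entries that share an owner_id is an assumed policy chosen for this example. Rejecting duplicates would be an equally valid choice.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins that mirror the schema; not protoc-generated classes.
class RepeatedEntriesSketch {
    record Car(String brand, String model) {}             // message Car
    record OwnerCars(int ownerId, List<Car> cars) {}      // message OwnerCars
    record Garage(List<OwnerCars> entries) {}             // message Garage

    // Build a lookup index from the repeated entries. Owners keep the order
    // of their first appearance, and entries that repeat an owner_id are
    // merged into one list (an assumed policy for this sketch).
    static Map<Integer, List<Car>> indexByOwner(Garage garage) {
        Map<Integer, List<Car>> index = new LinkedHashMap<>();
        for (OwnerCars entry : garage.entries()) {
            index.computeIfAbsent(entry.ownerId(), id -> new ArrayList<>())
                 .addAll(entry.cars());
        }
        return index;
    }

    public static void main(String[] args) {
        Garage garage = new Garage(List.of(
            new OwnerCars(2, List.of(new Car("Toyota", "Corolla"))),
            new OwnerCars(1, List.of(new Car("Ford", "Focus"))),
            new OwnerCars(2, List.of(new Car("Honda", "Civic")))));
        System.out.println(indexByOwner(garage));
    }
}
```

The index is built once per message in a single pass over the entries, after which lookups by owner behave like the map-based design.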

Key Design Insight

Protobuf schema design is about modeling data for serialization and compatibility, not about perfectly copying your Java collection types. If you try to replicate every Java structure literally, you will often run into limitations or produce rigid schemas.

Instead:

  • Think in terms of messages and relationships rather than language-specific collections.
  • Use additional messages as building blocks to represent nested or grouped data.
  • Choose clarity and forward compatibility over exact structural mirroring.