Protobuf Schema Evolution and Backward Compatibility

In real-world systems, APIs do not remain static. Requirements change, fields are added, names are improved, and sometimes structures evolve. When using Protocol Buffers, these changes must be handled carefully so that different versions of services can still communicate safely.

This tutorial explains how Protobuf handles API changes, what happens when clients and servers use different schema versions, and what rules you must follow to maintain compatibility.

1. Real-World Scenario: Multiple Teams, One API

Imagine an organization with multiple teams:

Team A owns a service
Team B and Team C are clients of that service
The service communicates using Protobuf messages

Team A publishes a V1 schema. Everyone generates code and starts using it.

Later:

Team A upgrades to V2
Some clients upgrade, others do not

Later again:

Team A upgrades to V3
Clients are on mixed versions

The big question is:

Will communication break when versions differ?

To understand this, we simulate version evolution.

2. Version 1 (V1) Schema

television.proto (V1)

syntax = "proto3";

package section05.v1;

option java_package = "com.example.section05.v1";
option java_multiple_files = true;

message Television {
  string brand = 1;
  int32 year = 2;
}

2.1 V1 Producer Example (Java)

Television tv = Television.newBuilder()
        .setBrand("Samsung")
        .setYear(2019)
        .build();

byte[] bytes = tv.toByteArray();

2.2 V1 Parser (Client)

Television tv = Television.parseFrom(bytes);

System.out.println(tv.getBrand());
System.out.println(tv.getYear());

Everything works as expected.

3. New Requirements → V2

New needs arise:

Add a new field: TV type (HD, UHD, OLED)
Rename year to model for clarity

We create V2.

4. Version 2 (V2) Schema

syntax = "proto3";

package section05.v2;

option java_package = "com.example.section05.v2";
option java_multiple_files = true;

message Television {

  enum TvType {
    HD = 0;
    UHD = 1;
    OLED = 2;
  }

  string brand = 1;

  // Renamed from year → model
  int32 model = 2;

  TvType type = 3;
}

Changes:

Field 2 renamed
Field 3 added

5. Scenario: V2 Server → V1 Client

Server sends V2 message:

Television tv = Television.newBuilder()
        .setBrand("Samsung")
        .setModel(2019)
        .setType(Television.TvType.UHD)
        .build();

The message is serialized and sent to a V1 client.

What Happens?

V1 client sees:

Tag 1 → brand
Tag 2 → year

It does NOT know about tag 3.

Result:

brand = Samsung
year = 2019
type ignored

No failure occurs.

Why?

Because Protobuf decodes by field number, not field name.
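To make the tag-based decoding concrete, here is a minimal sketch that encodes the V2 message by hand and reads it back the way a V1 client would. It does not use protobuf-java. The class TagDecodingSketch, the V1View holder, and all method names are made up for this illustration, and the reader only handles the varint and length-delimited wire types.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hand-rolled sketch of the wire format. Class and method names are
// illustrative, not part of protobuf-java.
final class TagDecodingSketch {

    // What a V1 reader can see: fields 1 and 2 only.
    static final class V1View {
        String brand = "";
        int year = 0;
        int skippedFields = 0;
    }

    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    static long readVarint(byte[] data, int[] pos) {
        long result = 0;
        for (int shift = 0; ; shift += 7) {
            byte b = data[pos[0]++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
    }

    // Writes brand (field 1), model (field 2) and type (field 3) like a V2 sender.
    // Unlike a real proto3 writer, this sketch also writes zero values.
    static byte[] encodeV2(String brand, int model, int type) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] b = brand.getBytes(StandardCharsets.UTF_8);
        writeVarint(out, (1 << 3) | 2); // field 1, wire type 2 (length-delimited)
        writeVarint(out, b.length);
        out.write(b, 0, b.length);
        writeVarint(out, 2 << 3);       // field 2, wire type 0 (varint)
        writeVarint(out, model);
        writeVarint(out, 3 << 3);       // field 3, wire type 0 (varint)
        writeVarint(out, type);
        return out.toByteArray();
    }

    // Reads the bytes with V1 knowledge: only tags 1 and 2 are recognized.
    static V1View decodeAsV1(byte[] data) {
        V1View view = new V1View();
        int[] pos = {0};
        while (pos[0] < data.length) {
            long tag = readVarint(data, pos);
            int fieldNumber = (int) (tag >>> 3);
            int wireType = (int) (tag & 7);
            if (fieldNumber == 1 && wireType == 2) {
                int len = (int) readVarint(data, pos);
                view.brand = new String(data, pos[0], len, StandardCharsets.UTF_8);
                pos[0] += len;
            } else if (fieldNumber == 2 && wireType == 0) {
                view.year = (int) readVarint(data, pos);
            } else if (wireType == 0) {
                readVarint(data, pos);  // unknown varint field: step over it
                view.skippedFields++;
            } else if (wireType == 2) {
                int len = (int) readVarint(data, pos);
                pos[0] += len;          // unknown length-delimited field
                view.skippedFields++;
            } else {
                throw new IllegalArgumentException("wire type not handled in this sketch");
            }
        }
        return view;
    }

    public static void main(String[] args) {
        V1View v = decodeAsV1(encodeV2("Samsung", 2019, 1)); // 1 = UHD
        System.out.println(v.brand + " " + v.year + " skipped=" + v.skippedFields);
    }
}
```

The reader never looks for a name. It matches on the field number in each tag and steps over anything it does not recognize.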

6. Key Rule: Renaming Fields Is Safe

Changing:

int32 year = 2;

to:

int32 model = 2;

is safe because:

Tag number unchanged
Type unchanged

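Protobuf still maps tag 2 correctly.

Field names never appear in the binary encoding, so old and new binaries keep working together. The rename is still a source-level change: regenerated Java code exposes getModel() instead of getYear(), and the JSON mapping and text format, which do identify fields by name, are affected.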
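A small sketch can show why the rename leaves the bytes untouched. The encoder below is hand-written for illustration (FieldNameSketch and its methods are hypothetical, not protobuf-java API). It takes a field number and a value, and has no parameter for a name at all.

```java
import java.io.ByteArrayOutputStream;

// Hand-rolled sketch; names are illustrative, not protobuf-java API.
final class FieldNameSketch {

    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Encodes one varint field. The tag is built from the field number and
    // the wire type only; the field name is not an input.
    static byte[] encodeVarintField(int fieldNumber, long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, (long) fieldNumber << 3); // wire type 0 = varint
        writeVarint(out, value);
        return out.toByteArray();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "int32 year = 2" and "int32 model = 2" both encode 2019 this way.
        System.out.println(hex(encodeVarintField(2, 2019))); // prints 10e30f
    }
}
```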

7. Key Rule: Changing Types Is Dangerous

Changing:

int32 year = 2;

to:

string model = 2;

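is unsafe because the encoding depends on the type. An int32 is written as a varint (wire type 0), while a string is length-delimited (wire type 2), so an old reader no longer finds the value it expects under tag 2.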

This can cause:

Parsing errors
Corrupted data
Runtime failures

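Treat a released field’s type as fixed. A handful of types share a wire encoding (for example int32, int64, uint32, uint64, and bool), but even between those a value can be truncated or reinterpreted.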
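The sketch below illustrates two ways a type change can go wrong, assuming a V1-style reader that only accepts tag 2 as a varint. Everything here is hand-written for illustration (TypeChangeSketch and its methods are hypothetical), and a real runtime may handle a mismatch differently, for example by filing the value under unknown fields.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hand-rolled sketch; names are illustrative, not protobuf-java API.
final class TypeChangeSketch {

    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    static long readVarint(byte[] data, int[] pos) {
        long result = 0;
        for (int shift = 0; ; shift += 7) {
            byte b = data[pos[0]++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
    }

    // Field 2 as int32: tag with wire type 0, plain varint payload.
    static byte[] writeAsInt32(int year) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, 2 << 3);
        writeVarint(out, year);
        return out.toByteArray();
    }

    // Field 2 as string: tag with wire type 2, length prefix, UTF-8 bytes.
    static byte[] writeAsString(String model) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] b = model.getBytes(StandardCharsets.UTF_8);
        writeVarint(out, (2 << 3) | 2);
        writeVarint(out, b.length);
        out.write(b, 0, b.length);
        return out.toByteArray();
    }

    // Field 2 as sint32: same wire type as int32, but ZigZag-encoded.
    static byte[] writeAsSint32(int year) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, 2 << 3);
        writeVarint(out, ((year << 1) ^ (year >> 31)) & 0xFFFFFFFFL);
        return out.toByteArray();
    }

    // V1-style reader: accepts field 2 only when it arrives as a varint.
    static int readYearAsV1(byte[] data) {
        int[] pos = {0};
        int year = 0;
        while (pos[0] < data.length) {
            long tag = readVarint(data, pos);
            if (tag == (2 << 3)) {
                year = (int) readVarint(data, pos);
            } else if ((tag & 7) == 0) {
                readVarint(data, pos);
            } else if ((tag & 7) == 2) {
                int len = (int) readVarint(data, pos);
                pos[0] += len; // value is stepped over, not interpreted
            } else {
                throw new IllegalArgumentException("wire type not handled in this sketch");
            }
        }
        return year;
    }

    public static void main(String[] args) {
        System.out.println(readYearAsV1(writeAsInt32(2019)));    // 2019
        System.out.println(readYearAsV1(writeAsString("2019"))); // 0: value silently lost
        System.out.println(readYearAsV1(writeAsSint32(2019)));   // 4038: value misread
    }
}
```

Neither bad case throws. The string value disappears into the default, and the sint32 value parses cleanly into the wrong number.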

8. Unknown Fields Behavior

When a client receives fields it does not know:

They are stored as unknown fields (proto3 runtimes before 3.5 discarded them instead)
They are ignored by getters
They do not break parsing

V1 client receiving tag 3 (type):

Cannot access it
But it exists internally as unknown

This allows forward compatibility.
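One consequence of keeping unknown fields is that a V1 service can pass a V2 message along without losing data. The following sketch imitates that behavior by hand. UnknownFieldsSketch, V1Message, and the helper methods are hypothetical names for this illustration, not protobuf-java API, and only varint and length-delimited fields are handled.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hand-rolled sketch; names are illustrative, not protobuf-java API.
final class UnknownFieldsSketch {

    // A V1 message plus the raw bytes of any fields V1 does not recognize.
    static final class V1Message {
        String brand = "";
        int year = 0;
        final ByteArrayOutputStream unknown = new ByteArrayOutputStream();
    }

    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    static long readVarint(byte[] data, int[] pos) {
        long result = 0;
        for (int shift = 0; ; shift += 7) {
            byte b = data[pos[0]++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
    }

    static void putString(ByteArrayOutputStream out, int field, String s) {
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        writeVarint(out, (field << 3) | 2);
        writeVarint(out, b.length);
        out.write(b, 0, b.length);
    }

    static void putVarint(ByteArrayOutputStream out, int field, long value) {
        writeVarint(out, field << 3);
        writeVarint(out, value);
    }

    static V1Message parse(byte[] data) {
        V1Message m = new V1Message();
        int[] pos = {0};
        while (pos[0] < data.length) {
            int start = pos[0];
            long tag = readVarint(data, pos);
            if (tag == ((1 << 3) | 2)) {
                int len = (int) readVarint(data, pos);
                m.brand = new String(data, pos[0], len, StandardCharsets.UTF_8);
                pos[0] += len;
            } else if (tag == (2 << 3)) {
                m.year = (int) readVarint(data, pos);
            } else {
                // Unknown field: step over the payload, then keep tag + payload verbatim.
                if ((tag & 7) == 0) {
                    readVarint(data, pos);
                } else if ((tag & 7) == 2) {
                    int len = (int) readVarint(data, pos);
                    pos[0] += len;
                } else {
                    throw new IllegalArgumentException("wire type not handled in this sketch");
                }
                m.unknown.write(data, start, pos[0] - start);
            }
        }
        return m;
    }

    // Known fields first, then the retained unknown bytes.
    static byte[] serialize(V1Message m) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        putString(out, 1, m.brand);
        putVarint(out, 2, m.year);
        byte[] u = m.unknown.toByteArray();
        out.write(u, 0, u.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        ByteArrayOutputStream v2 = new ByteArrayOutputStream();
        putString(v2, 1, "Samsung");
        putVarint(v2, 2, 2019);
        putVarint(v2, 3, 1); // type = UHD, unknown to V1
        byte[] original = v2.toByteArray();
        byte[] roundTripped = serialize(parse(original));
        System.out.println(Arrays.equals(original, roundTripped));
    }
}
```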

9. Scenario: V1 Server → V2 Client

Server sends V1 message:

brand
year

V2 client parses:

brand → OK
model (tag 2) → OK
type (tag 3) → missing

Since type is missing:

Default enum value used
Default = 0 → HD

Again, no failure occurs. Note that the V2 client cannot tell a missing type from one explicitly set to HD.
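The fallback to the zero value can be sketched with a hand-written reader that looks for tag 3 and returns HD when it never shows up. MissingFieldSketch and readType are hypothetical names. The UNRECOGNIZED constant is a stand-in for numbers this reader does not know, loosely modeled on the sentinel that proto3 Java enums carry.

```java
// Hand-rolled sketch; names are illustrative, not protobuf-java API.
final class MissingFieldSketch {

    // The first three constants mirror the V2 enum numbers 0, 1, 2.
    enum TvType { HD, UHD, OLED, UNRECOGNIZED }

    static long readVarint(byte[] data, int[] pos) {
        long result = 0;
        for (int shift = 0; ; shift += 7) {
            byte b = data[pos[0]++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
    }

    // Returns the value of field 3, or HD if field 3 never appears.
    static TvType readType(byte[] data) {
        int number = 0; // zero value used when the field is absent
        int[] pos = {0};
        while (pos[0] < data.length) {
            long tag = readVarint(data, pos);
            if (tag == (3 << 3)) {
                number = (int) readVarint(data, pos);
            } else if ((tag & 7) == 0) {
                readVarint(data, pos);
            } else if ((tag & 7) == 2) {
                int len = (int) readVarint(data, pos);
                pos[0] += len;
            } else {
                throw new IllegalArgumentException("wire type not handled in this sketch");
            }
        }
        return (number >= 0 && number <= 2) ? TvType.values()[number] : TvType.UNRECOGNIZED;
    }

    public static void main(String[] args) {
        // V1 bytes: brand = "Samsung" (field 1), year = 2019 (field 2), no field 3.
        byte[] v1 = {0x0A, 0x07, 'S', 'a', 'm', 's', 'u', 'n', 'g', 0x10, (byte) 0xE3, 0x0F};
        System.out.println(readType(v1)); // HD
    }
}
```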

10. Why This Works

Protobuf design principles:

Tag-based decoding
Default values for missing fields
Ignoring unknown fields

This enables:

Backward compatibility
Forward compatibility

11. Safe vs Unsafe Changes

Safe Changes

Adding new fields
Renaming fields
Adding enum values (older readers see them as unrecognized)
Reordering fields in file

Unsafe Changes

Changing field types
Reusing tag numbers
Changing tag numbers
Removing fields without reserving tags

12. Practical Guidelines

Always follow these rules in production:

Never change field numbers
Never change field types
Prefer adding new fields instead of modifying old ones
Use new tags for new fields
Keep old fields for compatibility
Reserve removed field numbers

13. Version 3 (V3) — Removing a Field

Suppose the requirement says:

We no longer receive model/year from a third-party service, so we must remove it.

So V3 becomes:

syntax = "proto3";

package section05.v3;

option java_package = "com.example.section05.v3";
option java_multiple_files = true;

message Television {

  enum TvType {
    HD = 0;
    UHD = 1;
    OLED = 2;
  }

  string brand = 1;
  TvType type = 3;
}

Field 2 (model/year) is removed.

14. What Happens to Older Clients?

V1 Client Receiving V3 Data

V1 expects:

string brand = 1;
int32 year = 2;

But V3 does not send tag 2.

So V1 sees:

brand → correct
year → default value 0

Nothing breaks. Missing fields simply use defaults.

15. Default Values in Proto3

Proto3 automatically assigns defaults:

int32 → 0
bool → false
string → empty string
enum → first value, which must be 0 in proto3

So when a field is absent:

It does NOT throw an error
It silently uses the default

This is a major reason Protobuf supports compatibility.

16. Version 4 (V4) — A Common Mistake

A new developer joins and adds price:

int32 price = 2;  // ❌ BAD

Why this is dangerous:

Tag 2 used to mean year/model
Old clients will decode price as year

So:

price = 50000
V1 sees year = 50000

This creates semantic corruption. No crash happens, but the data becomes wrong. This is worse than a failure.
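The corruption is easy to reproduce in a few lines. In this hand-written sketch (TagReuseSketch and its methods are hypothetical, and only varint fields are handled), a V4 writer puts price on tag 2 and a V1 reader decodes the same bytes as year.

```java
import java.io.ByteArrayOutputStream;

// Hand-rolled sketch; names are illustrative, not protobuf-java API.
final class TagReuseSketch {

    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    static long readVarint(byte[] data, int[] pos) {
        long result = 0;
        for (int shift = 0; ; shift += 7) {
            byte b = data[pos[0]++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
    }

    // A V4 writer that (incorrectly) puts price on tag 2.
    static byte[] encodeV4Price(int price) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, 2 << 3);
        writeVarint(out, price);
        return out.toByteArray();
    }

    // A V1 reader that still believes tag 2 means year.
    static int decodeV1Year(byte[] data) {
        int[] pos = {0};
        int year = 0;
        while (pos[0] < data.length) {
            long tag = readVarint(data, pos);
            long value = readVarint(data, pos); // this sketch handles varint fields only
            if (tag == (2 << 3)) {
                year = (int) value;
            }
        }
        return year;
    }

    public static void main(String[] args) {
        System.out.println(decodeV1Year(encodeV4Price(50000))); // 50000
    }
}
```

Nothing in the bytes says "price". The reader has no way to notice that the meaning of tag 2 changed.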

17. Correct Approach — Reserving Removed Fields

Whenever a field is removed, reserve its tag. In the schema below, tag 2 is permanently blocked: protoc rejects any field that tries to use it, so future developers cannot reuse it accidentally.

message Television {

  reserved 2;

  enum TvType {
    HD = 0;
    UHD = 1;
    OLED = 2;
  }

  string brand = 1;
  TvType type = 3;
}

18. Reserving Field Names Too

You can also reserve names:

reserved "year", "model";

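This prevents reuse of old names, which matters for the JSON and text formats because they identify fields by name. Numbers and names must go in separate reserved statements. Helpful for large teams and long-lived APIs.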

19. Adding Price Correctly

Instead of reusing 2:

int32 price = 4;  // ✅ Correct

Now:

Old clients ignore it
No confusion
No data corruption

20. Handling Default Value Ambiguity

Sometimes 0 is not acceptable as a default.

Example:

price = 0 → Is it free or unset?

Two solutions:

Option 1 — Wrapper Types

google.protobuf.Int32Value price = 4;

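This requires import "google/protobuf/wrappers.proto"; in the schema. Because price becomes a message field, the generated Java class allows:

hasPrice()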

Option 2 — `optional`

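optional int32 price = 4;

This also generates hasPrice(), without wrapping the value in a message. Proto3 optional is enabled by default from protoc 3.15 onward.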
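Both options work for the same reason: a price that was explicitly set to 0 is still written to the wire, while an unset price is not written at all. A plain proto3 int32 omits zero, so the two cases produce identical bytes. The sketch below shows the reader side under that assumption, using java.util.OptionalInt to carry presence. PresenceSketch and readPrice are hypothetical names, not generated code.

```java
import java.util.OptionalInt;

// Hand-rolled sketch; names are illustrative, not protobuf-java API.
final class PresenceSketch {

    static long readVarint(byte[] data, int[] pos) {
        long result = 0;
        for (int shift = 0; ; shift += 7) {
            byte b = data[pos[0]++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
    }

    // Reads field 4 (price) from a message that contains only varint fields.
    // Empty result = tag 4 never appeared; present result = it appeared, even with 0.
    static OptionalInt readPrice(byte[] data) {
        OptionalInt price = OptionalInt.empty();
        int[] pos = {0};
        while (pos[0] < data.length) {
            long tag = readVarint(data, pos);
            long value = readVarint(data, pos);
            if (tag == (4 << 3)) {
                price = OptionalInt.of((int) value);
            }
        }
        return price;
    }

    public static void main(String[] args) {
        byte[] unset = {};           // price never set: nothing on the wire
        byte[] free = {0x20, 0x00};  // tag 4, value 0: price explicitly set to 0
        System.out.println(readPrice(unset).isPresent()); // false
        System.out.println(readPrice(free).getAsInt());   // 0
    }
}
```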