[TOC]

文章参考:https://juejin.cn/post/7155788141075365919#heading-2

概述

通常我们使用Java的序列化与反序列化时,只需要将类实现Serializable接口即可,剩下的事情就交给了jdk。今天我们就来探究一下,Java序列化是怎么实现的,然后探讨一下几个常见的集合类,他们是如何处理序列化带来的问题的。

下面我们来思考几个问题:

  1. 为什么序列化一个对象时,仅需要实现Serializable接口就可以了。

  2. 通常我们序列化一个类时,为什么推荐的做法是要实现一个静态final成员变量serialVersionUID

  3. 序列化机制是怎么忽略transient关键字的, static变量也不会被序列化。

下面我们来依次解答这些问题。

Serializable接口

先看Serializable接口,源码很简单,一个空的接口,没有方法也没有成员变量。但是注释非常详细,很清楚的描述了Serializable怎么用、能做什么,很值得一看,我们捡几个重点的翻译一下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
/**
* Serializability of a class is enabled by the class implementing the
* java.io.Serializable interface. Classes that do not implement this
* interface will not have any of their state serialized or
* deserialized. All subtypes of a serializable class are themselves
* serializable. The serialization interface has no methods or fields
* and serves only to identify the semantics of being serializable. <p>
*
* To allow subtypes of non-serializable classes to be serialized, the
* subtype may assume responsibility for saving and restoring the
* state of the supertype's public, protected, and (if accessible)
* package fields. The subtype may assume this responsibility only if
* the class it extends has an accessible no-arg constructor to
* initialize the class's state. It is an error to declare a class
* Serializable if this is not the case. The error will be detected at
* runtime. <p>
*
* During deserialization, the fields of non-serializable classes will
* be initialized using the public or protected no-arg constructor of
* the class. A no-arg constructor must be accessible to the subclass
* that is serializable. The fields of serializable subclasses will
* be restored from the stream. <p>
*
* When traversing a graph, an object may be encountered that does not
* support the Serializable interface. In this case the
* NotSerializableException will be thrown and will identify the class
* of the non-serializable object. <p>
*
* Classes that require special handling during the serialization and
* deserialization process must implement special methods with these exact
* signatures:
*
* <PRE>
* private void writeObject(java.io.ObjectOutputStream out)
* throws IOException
* private void readObject(java.io.ObjectInputStream in)
* throws IOException, ClassNotFoundException;
* private void readObjectNoData()
* throws ObjectStreamException;
* </PRE>
*
* <p>The writeObject method is responsible for writing the state of the
* object for its particular class so that the corresponding
* readObject method can restore it. The default mechanism for saving
* the Object's fields can be invoked by calling
* out.defaultWriteObject. The method does not need to concern
* itself with the state belonging to its superclasses or subclasses.
* State is saved by writing the individual fields to the
* ObjectOutputStream using the writeObject method or by using the
* methods for primitive data types supported by DataOutput.
*
* <p>The readObject method is responsible for reading from the stream and
* restoring the classes fields. It may call in.defaultReadObject to invoke
* the default mechanism for restoring the object's non-static and
* non-transient fields. The defaultReadObject method uses information in
* the stream to assign the fields of the object saved in the stream with the
* correspondingly named fields in the current object. This handles the case
* when the class has evolved to add new fields. The method does not need to
* concern itself with the state belonging to its superclasses or subclasses.
* State is saved by writing the individual fields to the
* ObjectOutputStream using the writeObject method or by using the
* methods for primitive data types supported by DataOutput.
*
* <p>The readObjectNoData method is responsible for initializing the state of
* the object for its particular class in the event that the serialization
* stream does not list the given class as a superclass of the object being
* deserialized. This may occur in cases where the receiving party uses a
* different version of the deserialized instance's class than the sending
* party, and the receiver's version extends classes that are not extended by
* the sender's version. This may also occur if the serialization stream has
* been tampered; hence, readObjectNoData is useful for initializing
* deserialized objects properly despite a "hostile" or incomplete source
* stream.
*
* <p>Serializable classes that need to designate an alternative object to be
* used when writing an object to the stream should implement this
* special method with the exact signature:
*
* <PRE>
* ANY-ACCESS-MODIFIER Object writeReplace() throws ObjectStreamException;
* </PRE><p>
*
* This writeReplace method is invoked by serialization if the method
* exists and it would be accessible from a method defined within the
* class of the object being serialized. Thus, the method can have private,
* protected and package-private access. Subclass access to this method
* follows java accessibility rules. <p>
*
* Classes that need to designate a replacement when an instance of it
* is read from the stream should implement this special method with the
* exact signature.
*
* <PRE>
* ANY-ACCESS-MODIFIER Object readResolve() throws ObjectStreamException;
* </PRE><p>
*
* This readResolve method follows the same invocation rules and
* accessibility rules as writeReplace.<p>
*
* The serialization runtime associates with each serializable class a version
* number, called a serialVersionUID, which is used during deserialization to
* verify that the sender and receiver of a serialized object have loaded
* classes for that object that are compatible with respect to serialization.
* If the receiver has loaded a class for the object that has a different
* serialVersionUID than that of the corresponding sender's class, then
* deserialization will result in an {@link InvalidClassException}. A
* serializable class can declare its own serialVersionUID explicitly by
* declaring a field named <code>"serialVersionUID"</code> that must be static,
* final, and of type <code>long</code>:
*
* <PRE>
* ANY-ACCESS-MODIFIER static final long serialVersionUID = 42L;
* </PRE>
*
* If a serializable class does not explicitly declare a serialVersionUID, then
* the serialization runtime will calculate a default serialVersionUID value
* for that class based on various aspects of the class, as described in the
* Java(TM) Object Serialization Specification. However, it is <em>strongly
* recommended</em> that all serializable classes explicitly declare
* serialVersionUID values, since the default serialVersionUID computation is
* highly sensitive to class details that may vary depending on compiler
* implementations, and can thus result in unexpected
* <code>InvalidClassException</code>s during deserialization. Therefore, to
* guarantee a consistent serialVersionUID value across different java compiler
* implementations, a serializable class must declare an explicit
* serialVersionUID value. It is also strongly advised that explicit
* serialVersionUID declarations use the <code>private</code> modifier where
* possible, since such declarations apply only to the immediately declaring
* class--serialVersionUID fields are not useful as inherited members. Array
* classes cannot declare an explicit serialVersionUID, so they always have
* the default computed value, but the requirement for matching
* serialVersionUID values is waived for array classes.
*
* Android implementation of serialVersionUID computation will change slightly
* for some classes if you're targeting android N. In order to preserve compatibility,
* this change is only enabled is the application target SDK version is set to
* 24 or higher. It is highly recommended to use an explicit serialVersionUID
* field to avoid compatibility issues.
*
* <h3>Implement Serializable Judiciously</h3>
* Refer to <i>Effective Java</i>'s chapter on serialization for thorough
* coverage of the serialization API. The book explains how to use this
* interface without harming your application's maintainability.
*
* <h3>Recommended Alternatives</h3>
* <strong>JSON</strong> is concise, human-readable and efficient. Android
* includes both a {@link android.util.JsonReader streaming API} and a {@link
* org.json.JSONObject tree API} to read and write JSON. Use a binding library
* like <a href="http://code.google.com/p/google-gson/">GSON</a> to read and
* write Java objects directly.
*
* @author unascribed
* @see java.io.ObjectOutputStream
* @see java.io.ObjectInputStream
* @see java.io.ObjectOutput
* @see java.io.ObjectInput
* @see java.io.Externalizable
* @since JDK1.1
*/
public interface Serializable {
}

类的可序列化性通过实现java.io.Serializable接口开启。未实现序列化接口的类不能序列化,所有实现了序列化的子类都可以被序列化。Serializable接口没有方法和属性,只是一个识别类可被序列化的标志。

在序列化过程中,如果类想要做一些特殊处理,可以通过实现以下方法writeObject(), readObject(), readObjectNoData(),其中,

  • writeObject方法负责为其特定类写入对象的状态,以便相应的readObject()方法可以还原它。

  • readObject()方法负责从流中读取并恢复类字段。

  • 如果某个超类不支持序列化,但又不希望使用默认值怎么办?writeReplace() 方法可以使对象被写入流之前,用一个对象来替换自己。

  • readResolve()通常在单例模式中使用,对象从流中被读出时,可以用一个对象替换另一个对象。

serialVersionUID

当一个对象实现 Serializable 接口时,多数 ide 会提示声明一个静态常量 serialVersionUID(版本标识),那 serialVersionUID 到底有什么作用呢?应该如何使用 serialVersionUID ?

serialVersionUID 是实现 Serializable 接口而来的,而 Serializable 则是应用于Java 对象序列化/反序列化。对象的序列化主要有两种用途:

  • 把对象序列化成字节码,保存到指定介质上(如磁盘等)
  • 用于网络传输

现在反过来说就是,serialVersionUID 会影响到上述所提到的两种行为。那到底会造成什么影响呢?

serialVersionUID 是 Java 为每个序列化类产生的版本标识,可用来保证在反序列时,发送方发送的和接受方接收的是可兼容的对象。如果接收方接收的类的 serialVersionUID 与发送方发送的 serialVersionUID 不一致,进行反序列时会抛出 InvalidClassException。序列化的类可显式声明 serialVersionUID 的值,如下:

1
static final long serialVersionUID = 1L;

当显式定义 serialVersionUID 的值时,Java 根据类的多个方面(具体可参考 Java 序列化规范)动态生成一个默认的 serialVersionUID 。尽管这样,还是建议你在每一个序列化的类中显式指定 serialVersionUID 的值,因为不同的 jdk 编译很可能会生成不同的 serialVersionUID 默认值,进而导致在反序列化时抛出 InvalidClassExceptions 异常。所以,为了保证在不同的 jdk 编译实现中,其 serialVersionUID 的值也一致,可序列化的类必须显式指定 serialVersionUID 的值。另外,serialVersionUID 的修饰符最好是 private,因为 serialVersionUID 不能被继承,所以建议使用 private 修饰 serialVersionUID。

举例说明如下: 现在尝试通过将一个类 Person 序列化到磁盘和反序列化来说明 serialVersionUID 的作用: Person 类如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
public class Person implements Serializable {

private static final long serialVersionUID = 1L;

private String name;
private Integer age;
private String address;

public Person() {
}

public Person(String name, Integer age, String address) {
this.name = name;
this.age = age;
this.address = address;
}


@Override
public String toString() {
return "Person{" +
"name='" + name + '\'' +
", age=" + age +
", address='" + address + '\'' +
'}';
}
}

简单的测试一下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
@Test
public void testversion1L() throws Exception {
File file = new File("person.out");
// 序列化
ObjectOutputStream oout = new ObjectOutputStream(new FileOutputStream(file));
Person person = new Person("Haozi", 22, "上海");
oout.writeObject(person);
oout.close();
// 反序列化
ObjectInputStream oin = new ObjectInputStream(new FileInputStream(file));
Object newPerson = oin.readObject();
oin.close();
System.out.println(newPerson);
}

测试发现没有什么问题。有一天,因发展需要, 需要在 Person 中增加了一个字段 email,如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
public class Person implements Serializable {

private static final long serialVersionUID = 1L;

private String name;
private Integer age;
private String address;
private String email;

public Person() {
}

public Person(String name, Integer age, String address) {
this.name = name;
this.age = age;
this.address = address;
}

public Person(String name, Integer age, String address,String email) {
this.name = name;
this.age = age;
this.address = address;
this.email = email;
}

@Override
public String toString() {
return "Person{" +
"name='" + name + '\'' +
", age=" + age +
", address='" + address + '\'' +
", email='" + email + '\'' +
'}';
}
}

这时我们假设和之前序列化到磁盘的 Person 类是兼容的,便不修改版本标识 serialVersionUID。再次测试如下:

1
2
3
4
5
6
7
8
@Test
public void testversion1LWithExtraEmail() throws Exception {
File file = new File("person.out");
ObjectInputStream oin = new ObjectInputStream(new FileInputStream(file));
Object newPerson = oin.readObject();
oin.close();
System.out.println(newPerson);
}

将以前序列化到磁盘的旧 Person 反序列化到新 Person 类时,没有任何问题。

可当我们增加 email 字段后,不作向后兼容。即放弃原来序列化到磁盘的 Person 类,这时我们可以将版本标识提高,如下:

1
private static final long serialVersionUID = 2L;

再次进行反序列化,则会报错,如下:

1
java.io.InvalidClassException:Person local class incompatible: stream classdesc serialVersionUID = 1, local class serialVersionUID = 2

谈到这里,我们大概可以清楚,serialVersionUID 就是控制版本是否兼容的,若我们认为修改的 Person 是向后兼容的,则不修改 serialVersionUID;反之,则提高 serialVersionUID的值。再回到一开始的问题,为什么 ide 会提示声明 serialVersionUID 的值呢?

因为若不显式定义 serialVersionUID 的值,Java 会根据类细节自动生成 serialVersionUID 的值,如果对类的源代码作了修改,再重新编译,新生成的类文件的serialVersionUID的取值有可能也会发生变化。类的serialVersionUID的默认值完全依赖于Java编译器的实现,对于同一个类,用不同的Java编译器编译,也有可能会导致不同的serialVersionUID。所以 ide 才会提示声明 serialVersionUID 的值。