Common Pitfalls in JPA (Hibernate)

ORM has come to play an important role in object-oriented programming, and JPA is now considered the standard approach to ORM in the Java world. In this post, I summarize several behaviors that violate my intuition and are error-prone.

Since JPA itself is just a specification, there are various underlying implementations. This post focuses only on the Hibernate implementation. In fact, I have never used or tested any other implementation, so a problem described here may not be reproducible with other JPA implementations.

Prerequisites

As in the post Common Pitfalls of Declarative Transaction Management in Spring, all the samples are written in Kotlin, and Spring Data JPA is used for convenience. The full source code can be found at common-pitfalls-in-jpa-hibernate.

Pitfall 1: Don't be fooled by the equals and hashCode methods

You may already know that there are several contracts we must obey when implementing the equals and hashCode methods, namely reflexivity, symmetry, transitivity, consistency and "non-nullity". For a JPA entity, things become even more difficult, since entity state transitions must be taken into account. In other words, equals and hashCode must behave consistently across all entity state transitions. We can therefore immediately conclude that the logical key (usually auto-generated when the entity is first persisted) should not be taken into consideration. AbstractPersistable from the Spring Data JPA library is a perfect counterexample: it implements equals and hashCode based on the auto-generated id. The following code demonstrates the flaw:

@Entity
class Demo1 : AbstractPersistable<Int>()

@Test
fun test() {
    val demo = Demo1()
    val set = hashSetOf(demo)
    set.contains(demo) // true

    entityManager.persist(demo)
    entityManager.flush()
    set.contains(demo) // false
}

The HashSet fails to recognize the same entity because its hash code changed after it was persisted. This is certainly error-prone. For a similar reason, the default equals and hashCode inherited from java.lang.Object are not suitable for JPA entities either. The code below shows that a merged entity is not equal to itself, because entityManager.merge may return a different object reference.

@Entity
class Demo2 : Persistable<Int> {
    @Id
    @GeneratedValue
    private var id: Int? = null

    override fun getId(): Int? {
        return id
    }

    override fun isNew(): Boolean {
        return id == null
    }

    // inherit equals and hashCode from Object
}

@Test
fun test() {
    val demo = Demo2()
    val set = hashSetOf(demo)
    set.contains(demo) // true

    entityManager.persist(demo)
    entityManager.flush()
    entityManager.detach(demo)

    val managed = entityManager.merge(demo)
    set.contains(managed) // false
}

Now the only option left is to implement equals and hashCode based on some business key, and never change that key after the entity is created. However, you cannot always find such a key in practice. In that case, the best we can do, whichever way we choose to implement the methods, is to be aware of its shortcomings and document them clearly.
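As a rough illustration of the business-key approach, here is a minimal sketch in plain Kotlin. The Account class and its email key are hypothetical names chosen for this example, and the JPA annotations are left out so that the snippet runs on its own; assigning id by hand only simulates what persist and merge would do.

```kotlin
// Hypothetical entity-like class; JPA annotations are omitted so the sketch runs standalone.
class Account(val email: String) {   // email is the assumed immutable business key
    var id: Int? = null               // stands in for the auto-generated identifier

    // Equality depends only on the business key, never on the generated id.
    override fun equals(other: Any?): Boolean =
        this === other || (other is Account && email == other.email)

    override fun hashCode(): Int = email.hashCode()
}

fun main() {
    val account = Account("alice@example.com")
    val set = hashSetOf(account)

    account.id = 1                    // simulates the id being assigned on persist
    println(set.contains(account))    // true: the hash code did not change

    // Simulates merge returning a different instance of the same entity.
    val copy = Account("alice@example.com").apply { id = 1 }
    println(set.contains(copy))       // true: equality follows the business key
}
```

The trade-off is the one mentioned above: this only works if the key really is unique and never changes during the entity's lifetime.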


Common Pitfalls of Declarative Transaction Management in Spring

Spring supports two types of transaction management: programmatic and declarative. Although programmatic management is more flexible, declarative management is usually preferred because it is less invasive to application code. In this post, I summarize several pitfalls you may encounter while using declarative transaction management. Certainly, if you read the official documentation thoroughly, you should be able to avoid them on your own. But if you think, as I did, that it is all about annotating your method with @Transactional, you may never find them until the day a customer reports that their balance is incorrect.

Prerequisites

Our samples are written in Kotlin. In addition, I assume that you are already familiar with the following frameworks:

  • Spring
  • JPA (Hibernate implementation)
  • Spring Data JPA

The examples in this post are based on the following classes:

// entity
@Entity
class DemoEntity(
    val name: String
) : AbstractPersistable<Int>()

// repository
interface DemoRepo : CrudRepository<DemoEntity, Int>

Read the Spring Data JPA manual if you are not able to understand the above code. Finally, the related test code is available at common-pitfalls-of-declarative-transaction-management-in-spring.

Pitfall 1: @Transactional annotation may have no effect at all

It is common to put some code in a private method so that it can be reused. If the code involves a transaction, we may end up writing code like this:

@Service
class DemoService(
    private val demoRepo: DemoRepo
) {
    fun persistAndDoSomething(demo: DemoEntity) {
        persist(demo)
        // do something
    }

    fun persistAndDoOtherthings(demo: DemoEntity) {
        persist(demo)
        // do other things
    }

    @Transactional
    private fun persist(demo: DemoEntity) {
        // you may think this action will be rolled back if an exception occurs
        demoRepo.save(demo)
        unpredictableMethod()
    }

    // simulate a method which may or may not throw an exception
    fun unpredictableMethod() {
        if (ThreadLocalRandom.current().nextBoolean())
            throw Exception("Oops!")
    }
}

In this case, the persist method is invoked as if no @Transactional annotation were present. To understand why, we need to know that declarative transactions are implemented on top of AOP (Aspect-Oriented Programming) proxies. A proxied method invocation looks like this:
[Figure: transactional-proxy]
From the figure, it is not hard to see that the begin-transaction, commit and rollback logic is implemented in a so-called "advice" component. "Advice" is a core AOP concept (read the Spring AOP documentation for more information), and Spring transaction management supports two advice modes, called "PROXY" and "ASPECTJ". As the documentation says:

When using proxies, you should apply the @Transactional annotation only to methods with public visibility. If you do annotate protected, private or package-visible methods with the @Transactional annotation, no error is raised, but the annotated method does not exhibit the configured transactional settings. Consider the use of AspectJ if you need to annotate non-public methods.

Now the reason is clear. To fix the problem, we can either switch the advice mode from "PROXY" (the default) to "ASPECTJ", or remove the private modifier from the persist method. Suppose we choose to remove the modifier: you will find that persist is still invoked without any transaction, because we have just fallen into the next pitfall.
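To make the proxy idea more concrete, here is a hand-written sketch in plain Kotlin. It is not how Spring implements its proxies; DemoOperations, PlainService and TransactionalProxy are hypothetical names, the "transaction" is just a list of log entries, and the rollback rule is simplified to "any RuntimeException". The only point is that the begin/commit/rollback logic lives in a wrapper around the target object, so only calls that go through the wrapper are affected.

```kotlin
// Hypothetical names throughout; a hand-written stand-in, not Spring's implementation.
interface DemoOperations {
    fun persist(name: String)
}

// The target object: it contains business logic only and knows nothing about transactions.
class PlainService(private val log: MutableList<String>) : DemoOperations {
    override fun persist(name: String) {
        log += "save $name"
        if (name.isBlank()) throw IllegalArgumentException("Oops!")
    }
}

// Plays the role of the AOP proxy together with the transactional advice.
class TransactionalProxy(
    private val target: DemoOperations,
    private val log: MutableList<String>
) : DemoOperations {
    override fun persist(name: String) {
        log += "begin"
        try {
            target.persist(name)
            log += "commit"
        } catch (e: RuntimeException) {
            log += "rollback"   // simplified rollback rule, assumed for this sketch
            throw e
        }
    }
}

fun main() {
    val log = mutableListOf<String>()
    val target = PlainService(log)
    val proxy: DemoOperations = TransactionalProxy(target, log)

    proxy.persist("a")    // goes through the advice
    target.persist("b")   // bypasses the advice entirely
    println(log)          // [begin, save a, commit, save b]
}
```

Any call that reaches the target without passing through the wrapper gets no begin, commit or rollback at all, which is worth keeping in mind when reading the pitfalls in this post.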


Understanding Zombie Process

As a programmer, I usually feel uneasy when the top command reports zombie processes on my computer. After some study, I found that zombie processes are not as scary as I thought. This article briefly introduces zombie processes in UNIX-like systems.

What is a "zombie process"?

"In UNIX System terminology, a process that has terminated, but whose parent has not yet waited for it, is called a zombie."

After we create a process with the fork function, we have a parent process and a child process. The parent sometimes needs to know how the child terminated. Normally, it calls wait or waitpid to fetch the termination status. However, a child process may terminate before its parent waits for it. If the system cleared the child's information completely at that point, the parent would not be able to learn its status. The kernel therefore has to keep a small amount of information about a process after it terminates. A process like this, which has terminated but has not completely disappeared, is called a zombie process.
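The "parent fetches the child's termination status" step can be sketched from the JVM side in Kotlin. This is only an analogue of calling fork and wait directly: ProcessBuilder starts the child, and Process.waitFor blocks until the child has terminated and returns its exit status, with the JVM doing the underlying wait on our behalf. The helper name exitStatusOf is hypothetical, and the sketch assumes a POSIX sh is available on the PATH.

```kotlin
// Hypothetical helper: runs a shell command as a child process and returns its exit status.
// Assumes a POSIX `sh` is available on the PATH.
fun exitStatusOf(command: String): Int {
    val child = ProcessBuilder("sh", "-c", command)
        .inheritIO()      // let the child share this process's stdin/stdout/stderr
        .start()
    // Blocks until the child has terminated, then returns its exit status.
    return child.waitFor()
}

fun main() {
    println(exitStatusOf("exit 3"))   // 3
}
```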

Note that zombie processes should not be confused with orphan processes: an orphan process is a process that is still executing but whose parent has died. Orphans do not remain as zombie processes when they die; instead, like all orphaned processes, they are adopted by init, which waits on its children. As a result, a process that is both a zombie and an orphan is reaped automatically.


JavaScript Closures Revisited

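There are countless articles about JavaScript closures online. I am writing yet another one mainly because those articles are either not friendly enough to beginners or miss the point entirely. For a long time after reading them, I only half understood closures. It was not until I gradually came into contact with functional programming while learning React that I began to truly understand them.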

Trick

Please think about the following two code snippets first:

function createFunction() {
  let arr = [];

  for (var i = 0; i < 5; i++) {
    arr[i] = function () {
      return i;
    };
  }

  return arr;
}

const result = createFunction().map(e => e());
console.log(result); // [5, 5, 5, 5, 5]

function counter() {
  let n = 0;
  return {
    increase() { ++n; },
    get() { return n; },
  };
}

const cnt1 = counter();
const cnt2 = counter();

cnt1.increase();

console.log(cnt1.get()); // 1
console.log(cnt2.get()); // 0

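Before reading on, please make sure once more that you have taken the time to understand the two snippets above. They have nothing whatsoever to do with the rest of this post, but a little mental exercise never hurts...
My point here is simply that a large share of the articles about closures online follow this pattern: first they produce a pile of functions like the examples above, then they ask the reader to predict the output, and finally they add a round of their own explanation, as if understanding this code meant understanding closures. The truth is that being able to read this code does not mean you understand closures, and even after you understand closures you may not be able to state the output of such code right away.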


The B-tree Data Structure

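Almost every file system index and database index relies on the B-tree or one of its variants, so it is well worth a programmer's time to understand how it works. However, the many articles I read recently tend to start right away by explaining branchingFactor/degree, B-1<=keys<2B-1, B<=children<2B, blah blah blah..., describing implementation details before the reader understands the principle. That made a self-taught programmer like me, who never took formal CS courses, seriously doubt my own intelligence. This article tries to describe the B-tree for beginners like me (provided that you understand binary search).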

1. B tree, B-tree, B+tree

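If you search Google for the B-tree data structure, you may find articles under all three of the terms above, which for a while made me think they were three different data structures. (Since there is a BPlusTree, it is easy to imagine that there is also a BMinusTree, and some irresponsible translated algorithm books even go as far as writing B- tree, note the space, to mean B-tree.) In fact, the - here is just a hyphen: B-tree and B tree are the same thing, while B+tree is an evolved version of them. You may also notice that the B-tree in some articles is implemented differently from the B tree in others. Do not let that mislead you either: I have read about twenty articles on the B-tree, and so far I have not found two implementations that are exactly the same...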
