Articles published by Gluck

When we use a virtual machine on a cloud platform such as Azure, we sometimes run into the following scenario: the hard disk is running out of space, so we expand it from the cloud platform's dashboard. After logging in to the system again, we find that the disk itself has grown, but the partition is still its original size.

- Read the rest -

An example of a vulnerability in an early Node.js JWT library:

Basic Introduction to JWT

According to standard RFC 7519, JSON Web Token (JWT) is a compact, URL-safe means of representing claims to be transferred between two parties. The claims in a JWT are encoded as a JSON object that is used as the payload of a JSON Web Signature (JWS) structure or as the plaintext of a JSON Web Encryption (JWE) structure, enabling the claims to be digitally signed or integrity protected with a Message Authentication Code (MAC) and/or encrypted.
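To make the structure concrete, here is a minimal sketch of a JWT carried as the payload of a JWS and protected with a MAC, using only the Python standard library. It assumes the HS256 algorithm (HMAC with SHA-256) and the compact serialization `header.payload.signature`; the function names `sign_hs256` and `verify_hs256` are hypothetical and chosen only for this illustration. It is not the code of any particular JWT library.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data: bytes) -> str:
    """Base64url-encode and strip the '=' padding, as the compact form does."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Hypothetical helper: serialize the claims and append an HMAC-SHA256 tag."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    tag = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url_encode(tag)


def verify_hs256(token: str, secret: bytes) -> dict:
    """Hypothetical helper: check the tag and return the claims if it matches."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    # The verifier decides which algorithm it accepts; it does not simply
    # follow whatever the (attacker-controllable) header announces.
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    # Constant-time comparison of the recomputed and the presented tag.
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    return json.loads(b64url_decode(payload_b64))


if __name__ == "__main__":
    token = sign_hs256({"sub": "alice"}, b"demo-secret")
    print(token)
    print(verify_hs256(token, b"demo-secret"))
```

The token is three base64url segments joined by dots. Anyone can decode the first two, so the claims are integrity protected by the MAC but not confidential; confidentiality would need the JWE form instead.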

- Read the rest -


If you are into Docker and Kubernetes and have some IaaS resources at hand, you could try building a Kubernetes cluster yourself instead of deploying your work to a cloud platform. This way you learn more about the design of Kubernetes and also have full control of your cluster at the host-machine level.

Some Good Reading Materials

- Read the rest -

When we build applications, one of our aims should be to make them resilient. A good application can sustain its operations in the face of different kinds of failure. The final tests of this don't begin until the application is deployed into a production environment, after which we cannot predict its trials or their results. A newer approach is to change our perspective on errors in software systems: instead of only trying to prevent them, we trigger faults in a controlled situation, learn from the behavior of the application, and finally improve its resilience. To this end, we will design a chaos agent, and the first version will focus on the verification and analysis of error handling in the JVM.

About Chaos Engineering and Antifragile Software

If you are not familiar with chaos engineering, we provide introductory materials about this technique at the end of this article. Chaos engineering is the practice of experimenting on a distributed system in order to build confidence in the system’s capability to withstand unexpected conditions in production. Antifragility is the antonym of fragility. Traditional means of combating fragility include fault prevention, fault tolerance, fault removal, and fault forecasting. However, these techniques alone are insufficient, so we propose another perspective on system errors. If we can build mechanisms that let the system experience errors and learn from the failures in a controlled environment, we can build confidence in our system's resilience. The goal of chaos engineering and antifragile design is to perform these perturbations and learn from the experience.
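The idea of letting a system experience errors in a controlled way can be illustrated with a small sketch. It is written in Python for brevity and is not the JVM agent discussed in this article; every name in it (`InjectedFault`, `with_fault_injection`, `fetch_price`, `price_or_default`, `run_experiment`) is hypothetical. A wrapper makes a dependency fail at a configurable rate, and the experiment counts how often the calling code degrades gracefully.

```python
import random


class InjectedFault(RuntimeError):
    """Marks an exception raised on purpose by the experiment."""


def with_fault_injection(func, rate, rng):
    """Wrap func so that each call fails with probability `rate`."""
    def wrapper(*args, **kwargs):
        if rng.random() < rate:
            raise InjectedFault(f"injected before {func.__name__}")
        return func(*args, **kwargs)
    return wrapper


def fetch_price(item):
    # Stand-in for a call to a real dependency.
    return {"apple": 3, "pear": 5}[item]


def price_or_default(fetch, item, default=0):
    """Code under test: its error handling is what the experiment probes."""
    try:
        return fetch(item)
    except InjectedFault:
        return default


def run_experiment(calls=1000, rate=0.2, seed=42):
    """Perturb the dependency and record how the code under test behaved."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    flaky = with_fault_injection(fetch_price, rate, rng)
    results = [price_or_default(flaky, "apple") for _ in range(calls)]
    return {"calls": calls, "ok": results.count(3), "degraded": results.count(0)}


if __name__ == "__main__":
    print(run_experiment())
```

The injection rate and the seed are the "controlled" part: the perturbation can be turned off with `rate=0`, and a given run can be reproduced when an unexpected behavior needs to be investigated.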

A Chaos Engineering System for Live Analysis and Falsification of Exception-handling in the JVM

- Read the rest -

Short Introduction to This Paper

The paper is well written and introduces several interesting self-healing strategies for an operating system (not for a specific application). However, there is already a plethora of related work on hardware and software fault tolerance.

The contribution of this paper is mainly a survey of techniques that can be applied to provide self-healing functionality to an OS. It discusses the concepts, implementation, and evaluation of exception handling, code reloading, operating system component isolation, micro-rebooting, automatic system service restarts, watchdog-timer-based recovery, and transactional components.

- Read the rest -