Making a mistake is one of the worst things you can do at work, especially when working in a country like Singapore. If you want to be a professional, don’t make mistakes. And yesterday, I made one. But first, allow me to tell you a bit about my working environment 🙂
We are a small, young, creative and innovative company that is growing and moving fast. We want to be the first name that comes to mind for people in Singapore and around SEA when they want to buy insurance to cover the things they care about, such as expensive smartphones, transit, and travel. We have a small but strong team, in terms of both technical skill and relationships. They all hold the highest degrees and experience in their fields: Bachelor’s, Master’s and PhD degrees in Computer Science, Machine Learning & AI, and other domains. They are all experts, and above all we have a very strong mentor to support us: our CEO, who is also a tech expert.
We move fast, come up with breakthrough ideas, and build partnerships with giants like FWD, SOMPO and Bukalapak, with more to come. We know we need to build products of the highest quality: simple but solid, convenient and easy to use, deeply integrated with cutting-edge technology but still beautiful, and, above all, functioning properly. But today, ours didn’t.
That critical issue appeared as a very small alert message shown in our colleague’s app: “Something went wrong, please contact our customer support…”. To an engineer’s eyes it could have come from anywhere: a network connection issue, an internal server error, and so on. All of these are very common cases for apps nowadays, and they can be solved easily with an instruction like “check your internet connection”, or by fixing an error related to memory caching on the server. But this time, behind the baby face was a demon, and I had created that demon.
I jumped into the investigation with those ideas in mind. I opened the app, saw the message, tried to close it, turned the internet connection off and on, and opened the app again. It was still there. I asked my colleagues whether any errors were showing up in the server log; everything was fine, quiet, peaceful and clean. My experience told me this was not a baby bug. I felt as if someone had just slapped me. I stayed still for a few seconds, and then my brain switched itself into boost mode. I got serious about finding this demon.
I deleted the app and installed it again. It was still there.
I checked out our release branch, repeated the environment setup steps and built the production app again. It ran perfectly: no alert, no error, clean and beautiful. But the more beautiful it was, the more scared I felt. I was chasing a ghost bug in production, meaning one that cannot be reproduced in the developer’s environment. All the other departments in the Singapore office were looking into it and waiting; it was 2:30 PM on a Friday. The Taiwan team also asked for more details, and the Chengdu team was wondering whether a wrong implementation on their side had caused the issue. I did not yet know that it was a demon I had created.
I quickly dropped all the ideas above, which were a tester’s or end user’s thoughts, and switched to an engineer’s mindset. My gut told me I needed to capture all the requests going in and out of the running production app. I quickly turned on my proxy debugger and redirected my phone to connect through the proxy I had set up on my laptop. Basically, every request going out of or into my app now had to pass through my laptop, like an immigration gate where I could see, scan, filter and check the address of every request. I hoped it was not what I was thinking at that moment: that I had compiled the app with the wrong server endpoint.
It turned out to be exactly what I was thinking. The worst case that could happen had happened.
Not panicking is the most important thing at this point. Suppose you were driving a car and accidentally hit a bus. Would you jump straight into investigating why you hit it? Of course, if there are other people in your car, they may ask: why didn’t you hit the brakes? Why didn’t you see the bus earlier? All of those questions are about why it happened.
I had the same questions on my mind, too. But I was calm enough to understand the situation. First, I needed to check whether anybody had been hurt:
- I told my colleagues I had found the issue, to avoid further experiments. But I did not tell them what it was yet. They had the right to know, but not before I had assessed the damage.
- I quickly checked how many people were using that version. The result showed 11, which calmed me down more. 11 users saw that message, and some of them couldn’t register a new account (if they meant to).
- Was there any way those people could lose their purchased products? No.
- 1 hour after starting the investigation, I announced the issue: What happened? Which country and which version of the app were affected? How many users were affected? Why did it happen? And what was I going to do to solve it?
- 2 hours after getting the report, I submitted a new version to patch it. Of course, I tested it to make sure it was working with the right endpoint.
It’s very uncomfortable and embarrassing to tell everyone that it was my mistake and to explain myself. But I know I need to be courageous, take responsibility for what I have done and move on, instead of standing still and hiding, waiting for someone else to fix it, or hoping that no one would ever find out.
I had never made that kind of mistake in my 9-year career. I am not offering excuses, but solutions to keep that kind of mistake from happening again. With help from my colleagues, I will improve the release scripts for the app and reduce the number of steps that need to be run. The simplest procedure is the best, and to get there, automation will be the answer for us. Actually, we were already automating our releases, but it seems it was not good enough. I hope we can become stronger after solving this issue properly. Development happens to solve problems; problems appear, and then we have development.
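To make the idea concrete, here is a minimal sketch of the kind of guard a release script could run before submitting a build: it refuses to continue when the endpoint configured in the build does not match the target environment. This is not our actual release script. The host names, the `EXPECTED_HOSTS` table and the `check_endpoint` function are all hypothetical, made up for illustration.

```python
from urllib.parse import urlparse

# Hypothetical hosts, for illustration only; these are not real endpoints.
EXPECTED_HOSTS = {
    "production": "api.example.com",
    "staging": "staging-api.example.com",
}


def check_endpoint(target_env: str, configured_url: str) -> None:
    """Raise ValueError if the build's endpoint does not match the target environment."""
    if target_env not in EXPECTED_HOSTS:
        raise ValueError(f"unknown environment: {target_env!r}")

    parsed = urlparse(configured_url)
    if parsed.scheme != "https":
        raise ValueError(f"endpoint must use https: {configured_url!r}")

    expected = EXPECTED_HOSTS[target_env]
    if parsed.hostname != expected:
        raise ValueError(
            f"{target_env} build points at {parsed.hostname!r}, expected {expected!r}"
        )
```

A release script could call `check_endpoint("production", url)` with the URL read from the build configuration and abort the release when it raises, so that a wrong endpoint is caught by the machine instead of by a user.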
This time it is my fault, and I take it. Thanks to all my colleagues for trusting me and supporting me.
Have a nice weekend, guys.