2020.12.25 Technology

Looking Back DeNA Engineers' Blog

by mazgi

#blog #aws #circleci #docker #github #hugo #terraform

This article was published as the December 25th entry of the “DeNA Advent Calendar 2020”.
To those of you who only read English, I’m sorry that most of the links are in Japanese.

Hi, I’m @mazgi. In the last article, I wrote that we were renewing this blog.
This blog was renewed in April 2020, and thanks to you, the readership is growing and everything is going well.

The internet is flooded with announcements of new releases. However, there are few articles following up on what’s happened to those new releases.
So, as the person assigned the last day of the Advent Calendar, I’ll follow up on how the blog renewal turned out, and whether it can be considered a success or a failure.

Renewal: What Changed?

Our blog was built ten years ago on Movable Type (MT), which was the standard back then, and it had been hosted on a Linux virtual machine on our on-prem infrastructure.

However, technology has improved over these ten years, so nowadays hosting a static website such as a blog doesn’t require servers.
Therefore, I decided, “I will not use servers anymore!” for this renewal.

As a result, we designed a new blog that is mainly powered by Hugo, GitHub Enterprise, CircleCI Enterprise, and Amazon S3, and that requires no servers or complex operations.

For the architectural details, please refer to the following article.
https://engineer.dena.com/posts/2018.12/we-just-building-new-dena-engineers-blog/

Or, please refer to my personal article which describes this in more detail.
https://mazgi.github.io/posts/2019.04/built-noops-blog-with-github-circleci-s3/

This blog is likely to keep growing, and a few years from now someone else will say, “This structure is a relic, so let’s renew it,” and redesign it once again.
I feel this is how it should be.

The Result of Renewal

Account Perspective: Did it Become Easier to Write this Blog?

With the old blog, the knowledge of how to write for it had been lost.
The MT accounts were separate from our other internal accounts, so only people who were good at searching could dig the account application procedure out of the depths of our internal wiki.

Of course, the procedure is obvious once you have been through it, but it is not easy for newcomers.
Most likely our MT was left behind while other internal systems and accounts were renewed and integrated over the last ten years.
It made me feel the weight of time.

In this new blog, if you have a GitHub Enterprise account, you can write articles, and any employee can review them.
Our engineers and employees have a GitHub Enterprise account, so there is no need to apply for an account for writing blog articles to begin with.

Moreover, the Pull Request for each article flows into Slack, so many employees are able to review it, which creates an attractive and active atmosphere for writing articles.

However, just having a system is not enough to create an interactive blogosphere.
Our blog is lively thanks to our Technical PR team’s efforts in improving and promoting these articles.
https://engineer.dena.com/posts/2020.12/add_another_twist/

Writing and Reviewing Perspective: Did it Become Easier to Write this Blog?

With our old blog, articles were submitted through MT’s web UI; each author wrote a draft and requested a review via email and Google Docs.
In other words, only the author and the reviewers knew that an article was being written.

Of course, if the author was writing the draft in Google Docs, they could share the draft and request a review from any employee.
Some authors had been sharing the Google Docs URL and requesting reviews from many employees via Slack and IRC (which no longer exists).

However, after our renewal, the process for publishing articles changed from submitting them to our MT to opening Pull Requests on our GitHub Enterprise, and our article format changed from MT data on the server to Markdown text in a Git repository.
At DeNA, our CTO and tech managers are using GitHub Enterprise as well, so anyone can request a review from them via mentions and assignments.

Moreover, we designed it so that a URL is generated for each review request, in addition to the traditional review request methods mentioned above.
All employees can access this URL, so the author can ask for a review just by sharing it with people in legal, business, and other departments who don’t necessarily have a GitHub Enterprise account.

Additionally, the review URLs are posted to Slack as Pull Request comments, so naturally many employees have the opportunity to see the article.

However, we don’t currently provide any editor application, so people who are not used to Markdown might find it a little difficult.
I feel it would be useful if we could provide an editor application or a web UI for writing articles in Markdown.

Migrating the Old Blog to the New One: How Did It Go?

As those of you who have published websites before probably already know, it’s not that difficult to create a new static website and switch over to it from the old one.
The process is simple. All you have to do is make a new hosting location and update the DNS records.

On the other hand, the content migration is not simple.
Our articles, which many people have contributed over the last ten years in several MT formats, are an invaluable technological history for our company, but migrating them requires a lot of work.

It wouldn’t be that difficult to simply archive these articles. Still, we wanted to migrate them so that you could continue to read them, even as our technology stack spread from Perl to many languages and layers, and our phones changed from feature phones to smartphones.

However, our blog doesn’t have anything to do with our sales, so we could invest only a limited number of working hours in this migration.
Therefore, we planned to convert the formats by machine and to verify the conversion results manually.

First, we couldn’t find any tools to convert the formats from MT to Hugo, so I developed a simple CLI tool.

The source code of the tool that converts articles from MT to Hugo is published at the following URL.
https://github.com/mazgi/mt-to-hugo-article-converter

The package is also published on PyPI, so you can install it with pip install.
https://pypi.org/project/mt-to-hugo-article-converter/
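To give a feel for the kind of conversion involved, here is a minimal TypeScript sketch. It is not the logic of mt-to-hugo-article-converter (which is a Python package). It assumes the Movable Type export format, where an entry starts with `KEY: value` metadata lines, sections are separated by a line of five hyphens, and the body follows a `BODY:` line. The names `MtEntry`, `parseMtEntry`, and `toHugoMarkdown` are hypothetical, and the body is passed through unchanged.

```typescript
// Hypothetical shape of one parsed Movable Type entry.
interface MtEntry {
  title: string;
  date: string; // raw MT date string, assumed to look like "01/31/2006 10:00:00 AM"
  basename: string;
  body: string;
}

// Parse a single entry (assumption: "\n" line endings, metadata first,
// then sections separated by a line containing exactly "-----").
function parseMtEntry(raw: string): MtEntry {
  const sections = raw.split(/^-----$/m).map((s) => s.trim());
  const head = sections[0] ?? "";
  const meta = new Map<string, string>();
  for (const line of head.split("\n")) {
    const idx = line.indexOf(":");
    if (idx > 0) {
      meta.set(line.slice(0, idx).trim(), line.slice(idx + 1).trim());
    }
  }
  const bodySection = sections.find((s) => s.startsWith("BODY:")) ?? "BODY:";
  return {
    title: meta.get("TITLE") ?? "",
    date: meta.get("DATE") ?? "",
    basename: meta.get("BASENAME") ?? "",
    body: bodySection.slice("BODY:".length).trim(),
  };
}

// Render the entry as a Hugo content file placed under posts/Y.M/.
function toHugoMarkdown(entry: MtEntry): { path: string; content: string } {
  const m = /^(\d{2})\/(\d{2})\/(\d{4})/.exec(entry.date);
  if (!m) {
    throw new Error(`Unrecognized date: ${entry.date}`);
  }
  const [, month, day, year] = m;
  const frontMatter = [
    "---",
    `title: ${JSON.stringify(entry.title)}`,
    `date: ${year}-${month}-${day}`,
    "---",
  ].join("\n");
  return {
    path: `content/posts/${year}.${month}/${entry.basename}.md`,
    content: `${frontMatter}\n\n${entry.body}\n`,
  };
}
```

A real converter also has to deal with several MT format variants, HTML bodies, and categories, which is why we checked the machine output by hand.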

Second, I put the assets to be converted, such as blog articles and images, together with this tool into one GitHub Enterprise repository.

These assets in the repository can be updated.
The articles were exported in MT format from the web UI of our old blog, and we committed them to the repository as a file with a predetermined name.
The images were downloaded for each article, which is a function of the tool.
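As a rough illustration of the image step, the sketch below collects the image URLs referenced by one article’s HTML so they could be downloaded afterwards. The function name `extractImageUrls` is hypothetical, the regular expression is a simplification rather than a full HTML parser, and the actual tool may work differently.

```typescript
// Collect the distinct src values of <img> tags in an HTML fragment,
// in order of first appearance. A regex is enough for a sketch, but it
// will also match attributes such as data-src and ignores edge cases.
function extractImageUrls(html: string): string[] {
  const pattern = /<img\b[^>]*?\bsrc\s*=\s*["']([^"']+)["']/gi;
  const urls: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    const url = match[1];
    if (urls.indexOf(url) === -1) {
      urls.push(url);
    }
  }
  return urls;
}
```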

I also included a function for previewing the machine-converted articles with Hugo.
By combining these features into one repository and dockerizing them, I simplified the environment setup needed to convert articles.

Finally, since in this case the work could be done using only Macs, I wrote the procedure for macOS in the README.

README

In other words, I made it ready to hand over to our teammates.

Fortunately, only a few new articles were published in the meantime, so we completed the conversion without needing to redo it.
While I only set up the tools and environments, I really want to thank my co-workers who verified the conversion results, and our information systems department, which delivered the Macs required for the conversion right away.

Thanks also go to the original designers and operators of our blog system.
We succeeded in the migration because our original blog system was built on MT and maintained properly and carefully.

Migrating Blog URLs: How Did It Go?

Changes in content URLs are a difficult point when migrating blogs and CMSs.
The FQDN sometimes changes for various reasons, and paths and query parameters may change when you switch systems such as the CMS.

Thank you to those of you who bookmarked our old blog articles.
We wanted to maintain our blog so that you can continue to read the articles even when you arrive through SNS posts and bookmarks.

Therefore, I developed a small library that maps our old URLs to the new ones and redirects them.
https://github.com/mazgi/express-middleware-redirector

However, whenever you release something, even a small web application, you have to operate it: keep it secure, update its dependencies, and more.
Therefore, I decided to develop this small application as an Express middleware and ship it on AWS Lambda with Serverless Framework.
This URL redirection application will be running continuously, so running it on AWS Lambda reduces our operating cost and labor.

The example code for running applications on Lambda with Serverless Framework is sls-aws/src/index.ts in the repository I mentioned above.
This example is almost the same as our production code.

import { createRedirector } from "express-middleware-redirector";
import express from "express";
import serverless from "serverless-http";

const app = express();

// eslint-disable-next-line @typescript-eslint/no-var-requires
const config = require("/data/config/app/config.json");
const redirector = createRedirector(config);
app.use(redirector);

export const handler = serverless(app);

We benefited from this small application. You can still access articles via the URLs you originally bookmarked, we can keep the blog running without unnecessary operations, and we can hand down the redirection settings to the next generation through configuration files committed to our Git repository.
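The core idea of mapping old URLs to new ones can be sketched as below. This is an assumption-laden illustration, not the real configuration schema or implementation of express-middleware-redirector: the `RedirectRule` shape, the function names, and the example paths are all hypothetical.

```typescript
// Hypothetical rule shape; the real config schema may differ.
interface RedirectRule {
  from: string; // old path, e.g. an MT permalink
  to: string; // new path on the Hugo site
}

// Treat "/archives" and "/archives/" as the same old path.
function stripTrailingSlash(path: string): string {
  return path.length > 1 && path.endsWith("/") ? path.slice(0, -1) : path;
}

// Build a lookup table once so that each request is a single Map lookup.
function buildRedirectTable(rules: RedirectRule[]): Map<string, string> {
  const table = new Map<string, string>();
  for (const rule of rules) {
    table.set(stripTrailingSlash(rule.from), rule.to);
  }
  return table;
}

// Return the new location for an old path, or undefined when no rule
// matches, so the caller can fall through to its default handling.
function resolveRedirect(
  table: Map<string, string>,
  requestPath: string
): string | undefined {
  return table.get(stripTrailingSlash(requestPath));
}
```

In an Express middleware, a hit would typically end in `res.redirect(301, target)` and a miss in `next()`. Keeping the rules as plain data is what lets them live in a configuration file in the repository.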

This application was dockerized as well, so we can operate it via docker-compose up and docker-compose down.
Dockerizing turns an application environment into a black box that can still be opened up: you can start working with a few commands such as git clone && docker-compose up && docker-compose down, and once you have read and understood the contents in more detail, you can develop and update it.
As a matter of fact, now that I’m no longer in charge of this application, our Technical PR people are maintaining it.

Additionally, Hugo has an alias feature.
If you are using Hugo and facing the same problem as we were, I recommend using this alias feature.
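With Hugo, aliases are listed in the `aliases` field of an article’s front matter. As a small sketch, a migration script could write the old path of each article into that field like this; the function name `renderFrontMatter` and the example old path are hypothetical.

```typescript
// Render YAML front matter for a Hugo article, listing the old URL paths
// under "aliases" so that Hugo can redirect them to the new page.
function renderFrontMatter(
  title: string,
  date: string,
  aliases: string[]
): string {
  const lines = ["---", `title: ${JSON.stringify(title)}`, `date: ${date}`];
  if (aliases.length > 0) {
    lines.push("aliases:");
    for (const alias of aliases) {
      lines.push(`  - ${JSON.stringify(alias)}`);
    }
  }
  lines.push("---");
  return lines.join("\n");
}
```

The trade-off is that every migrated article then carries its old URL inside its own file.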

In our case, we wanted to keep the articles in our repository as simple as possible, so we resolved this problem without using this Hugo feature.

Other Issues I See When I Look Back

There are a few things I regret.

One of them is the directory structure of the articles.
Normally, Hugo maps the article path directly to the URL.

content/posts/2006.01/an-article.md
▶️ /posts/2006.01/an-article/

It’s partly a matter of personal taste, but I think the following points are important.

  • IT articles go out of date quickly, so I want the publishing date to be clearly visible.
    • For example, I want the URLs or titles to include at least the year and month.
  • I want a delimiter between each unit of meaning.
    • For example, I prefer 2006-01/my-first-article to 2006-01-my-first-article.

As a result, both the blog URLs and the article directory structure follow posts/Y.M/title.md.

This directory structure works, but once ten years’ worth of articles are in place, it creates 10 years * 12 months = 120 directories directly under posts/.
I regret this. I should have made it posts/Y/M/title.md instead.
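The difference between the two layouts can be made concrete with a short sketch. The names `Layout`, `articlePath`, and `topLevelDirCount` are hypothetical, and the count assumes at least one article is published every month.

```typescript
// "flat" is the current posts/Y.M/ layout; "nested" is posts/Y/M/.
type Layout = "flat" | "nested";

// Build the content path of an article for the given layout.
function articlePath(
  year: number,
  month: number,
  slug: string,
  layout: Layout
): string {
  const mm = month < 10 ? `0${month}` : `${month}`;
  const dir = layout === "flat" ? `${year}.${mm}` : `${year}/${mm}`;
  return `posts/${dir}/${slug}.md`;
}

// Number of directories directly under posts/ after the given number
// of years: one per month when flat, one per year when nested.
function topLevelDirCount(years: number, layout: Layout): number {
  return layout === "flat" ? years * 12 : years;
}
```

For ten years this gives 120 directories under posts/ for the flat layout and 10 for the nested one, where each year directory holds at most 12 month directories.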

If we keep writing blog articles for the next 300 years and end up with 3,600 directories under posts/, so many that even an ls command becomes heavy, I’m the culprit. Sorry.
Before that happens, please fix the structure.

We’ve already started fixing other problems, such as the naive design of our CI/CD.
I hope our co-workers will contribute to improving our blog by sending Pull Requests when they write articles, and start planning its redesign a few years from now.

In Closing

As I mentioned at the beginning, operating existing systems and cleaning up old services are important missions, but they might not be as glamorous as new releases.

In this article, I described the inconspicuous but important problems we faced when migrating our blog, such as converting articles and redirecting URLs, along with our solutions. I’d be glad if it helps those of you who are struggling with the same issues.

With systems such as blogs that have a long lifespan and where no one can be in charge full-time, the background and knowledge get lost very easily.
In this article, I wrote down the background, constraints, concepts, and reasons behind the choices we made.
It would make me happy if this could be used as a reference when the next opportunity to renew this blog comes.

I think the lifespan of web systems built in the 2020s is about five years.
This blog was renewed using the modern technology of 2020, but by around 2025, better technologies will have arrived and AWS specifications will have changed, so you, our co-workers of the future, will say, “Let’s renew our blog!”
I believe in you.

Thanks to our co-workers, we now have a new blog whose articles are managed with Git and written as plain text.
Next time, please renew our blog with the best structure at the time!
