Distributed systems are like the difference between "copy" and "email."
DataStax provides a database built on the open-source Apache Cassandra, along with ancillary services. Mr. Kimoto explained its major features to me in simple terms.
“It's like the difference between copying and emailing. When you want to give people information on paper, you have to make copies to distribute. But with email, all you have to do is enter multiple addresses and send it once. I want people to think of it as the database version of this.”
Once the data are in, they can be sent to any data center in the world.
“The Internet is connected to the entire world, so apps can go to a specific data center to view data, but if there is a physical distance between them, it takes longer and slows things down. So, we connect to a nearby data center. This is because, if you replicate the data to a nearby data center, the app can respond faster."
Cassandra originated when an engineer at Facebook built and published a system that was a "good mix" of two papers on what a distributed system should be, one written by a member of Google and the other by a member of Amazon.com.
“But Facebook isn't a database company, so it was not going to maintain it. So they decided to make it open source and said, ‘Let us make a better distributed system together.’”
The strength of replicating data is that the system can keep operating 365 days a year, even if a machine goes down.
This is how the open-source Apache Cassandra project began, and this is where the founding members of DataStax came into play.
“The age of the Cloud is coming. We thought that not only Internet companies but also general companies would need distributed systems. Even though it was open source, we thought it would be better to have people who could provide support and consulting. That is why the company was created: to support people who wanted to use Apache Cassandra.”
With distributed systems, data can be retrieved quickly from any location. Global Internet companies such as Netflix, Spotify, Instagram, and Apple were the first to realize this appeal.
“Netflix, for example, started in the U.S. and expanded to Europe and Asia in a short period of time. That was possible because it could replicate the databases that were originally on the west and east coasts of the U.S. It was able to set up bases all over the world in a cloud environment, deploy the databases there, and expand from there.”
Netflix was using Amazon's AWS cloud service, and as it added more data centers, the databases were replicated one after another.
“The beauty of Apache Cassandra is that you can choose which data to replicate and where to replicate them. And how many copies to have in each data center. If one of the machines goes down, the database is still up and running 24 hours a day, 365 days a year.”
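As a rough illustration of what "how many copies in each data center" can look like in practice, the sketch below builds a CQL `CREATE KEYSPACE` statement that uses Cassandra's `NetworkTopologyStrategy` replication class. The keyspace name, the data center names, and the replica counts are hypothetical, and the helper function `create_keyspace_cql` is an illustration written for this article, not part of any DataStax product.

```python
def create_keyspace_cql(keyspace, replicas_per_dc):
    """Build a CQL CREATE KEYSPACE statement with per-data-center replica counts."""
    if not replicas_per_dc:
        raise ValueError("at least one data center is required")
    # NetworkTopologyStrategy lets each data center have its own replica count.
    options = {"class": "NetworkTopologyStrategy"}
    options.update(replicas_per_dc)
    rendered = ", ".join(
        f"'{name}': {value}" if isinstance(value, int) else f"'{name}': '{value}'"
        for name, value in options.items()
    )
    return f"CREATE KEYSPACE IF NOT EXISTS {keyspace} WITH replication = {{{rendered}}};"


# Hypothetical keyspace and data center names, for illustration only.
statement = create_keyspace_cql("user_profiles", {"us_west": 3, "tokyo": 2})
print(statement)
```

In this sketch, data in the `user_profiles` keyspace would be kept as three copies in one data center and two in another, so losing a single machine in either location would still leave other copies available.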
Full-fledged launch of a Japanese subsidiary in response to rapidly growing needs
Database technology is still based on the theory of relational databases, which was born in the 1970s. However, there are concerns that the technology can no longer cope with the ever-increasing volume of data due to the spread of the Internet.
“The more data we had, the bigger we scaled up our machines. But this would get slower and slower. That is why we came up with the idea of distributing the load."
But then, as the data grew, it would constantly be necessary to specify where and how to distribute them. This is where Apache Cassandra showed its strength.
“Apache Cassandra has a mechanism where you can hand off that work to the database. You just add machines alongside the existing ones, and it expands on its own.”
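One general technique behind this kind of hands-off expansion is consistent hashing, where each machine owns ranges of a hash "ring" and a newly added machine takes over only a share of the keys. The sketch below is a simplified illustration of that idea, not Cassandra's actual implementation: the `Ring` class, the node names, the use of MD5 as the hash, and the number of virtual nodes are all assumptions made for the example.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a position on the ring (MD5 is a simple stand-in hash)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """A toy consistent-hash ring with several virtual nodes per machine."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self._tokens = []   # sorted ring positions
        self._owners = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each machine claims several positions so load spreads evenly.
        for i in range(self.vnodes):
            t = token(f"{node}#{i}")
            self._owners[t] = node
            bisect.insort(self._tokens, t)

    def owner(self, key):
        # A key belongs to the first ring position clockwise from its token.
        idx = bisect.bisect_right(self._tokens, token(key)) % len(self._tokens)
        return self._owners[self._tokens[idx]]


keys = [f"user-{i}" for i in range(1000)]
ring = Ring(["node-a", "node-b", "node-c"])   # hypothetical machine names
before = {k: ring.owner(k) for k in keys}

ring.add_node("node-d")                        # "just add a machine to the side"
after = {k: ring.owner(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved")
print("all moved keys went to node-d:", all(after[k] == "node-d" for k in moved))
```

The point of the design is that nobody has to specify by hand where each piece of data goes: when a machine joins, only the keys it now owns are relocated, and everything else stays where it was.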
In Japan, the need was steadily growing. Starting with major IT companies, adoption spread to major telecommunications companies, financial institutions, and manufacturers. In 2017, the company established a Japanese subsidiary. Currently, it is staffed by a presales engineer, a postsales engineer, and a sales representative.
“We're sometimes asked by people to explain Apache Cassandra, and if they’re already using Apache Cassandra, we ask if they need support. Sometimes we reach out from the engineering side, and sometimes we organize meetups because we want people to use Apache Cassandra."
There are three main products and services:
Providing technical services and consulting services to customers using open source;
Providing the open-source version as a packaged product with additional features for corporations, such as security, analysis, full-text search, geospatial search, and a database engine based on graph theory, as well as providing support and consulting services;
A cloud service, fully launched in 2020, where DataStax manages the database on the Cloud.
Aiming to reduce the burden of operation and management for customers through cloud services in Japan
The cloud service entrusted with operational management is on a par with Amazon's AWS and Google's cloud services.
“The world is clearly moving toward the Cloud. However, if you want to get technically deep services, you need to have a certain level of knowledge. And it is not easy to get people with the knowledge. This is especially true in Japan."
Kimoto says that customers should not have to bear the burden of such operational management.
“In fact, we want our customers to use our services with greater peace of mind. Because it is a cloud service, you can use it whenever you want and stop when you do not need it. We also want it used in a way that does not become a recurring expense. We would like to respond more and more to such customer needs.”
There are many variations in the form of contracts. DataStax has also started working with partners.
“We don't want to be dependent on any one cloud vendor. We also think that one of the features of the service is that it can handle cases in which a part of the system is used on-premises and a part is used on the cloud, or in which both are temporarily used. We'd love for customers to experience a system that automates operations management.”
Location in front of Tokyo Station is a business advantage
Since the establishment of DataStax’s Japanese subsidiary in 2017, EGG JAPAN has been the office of choice.
“The manager of a company that used to be here was a former colleague of mine, and he recommended it to me as a great place to work. In fact, in front of Tokyo Station is a great location, and there are many offices of large companies that are our clients. It's convenient for getting anywhere, and it's also convenient for people to visit us.”
Mr. Kimoto also manages a base in Southeast Asia, and he says the location makes it convenient for him to travel by plane. Likewise, it is easy for head office colleagues to come from the U.S. This is an important point for business, he says.
“Prior to COVID, we had a party with our partners in the collaboration space. We can use it for that kind of thing, too."
Another often-cited attraction of the office is the chance to connect with other tenants, but DataStax has not yet made full use of it.
“We have not been able to participate in any events or social gatherings. I am not always in the office. I often say this half-jokingly, but we're a distributed database company, so it's always been the norm for us to work in different locations (laughs).”
Because the product was based on open-source technology, development was usually done by approving code sent in from engineers around the world.
“It would have been difficult to hire otherwise. If you want to attract talented people from around the world, you cannot set regional limits. Because the U.S. is such a large country, our sales staff are dispersed and assigned to different regions, but many of them work from home. That is why our work has not changed at all under COVID.”
The theme for the future is to let people know more about the company. DataStax believes that there is a great need for its services in Japan. It also has ideas to meet those needs.
“We want to meet the needs of people who use a lot of data but want a fast response time."
Interview and text by Toru Uesaka
Editing: Kanae Maruyama
Photo: Tomoyasu Osakabe