Cloud - Freedom Penguin

Get to the Cloud!!

April 27, 2024 by Steven Van Setten

When you sit down with consultants, or listen at any AWS conference, they’ll state that there is a right way and a wrong way to migrate. They’ll explain that it is vital to refactor your applications before moving them up to the cloud. They’ll tell you that trying to move your applications without changing them is futile, and that you’ll be setting yourself up to fail. They’ll tell you that it’s messy and drastically more difficult, nigh impossible, to migrate and then refactor. They’ll exclaim that the costs incurred by running your current platform as it stands are prohibitive. They will explain all this, and they mean it, and they are right… kind of.

The Forklift and the Pallets

“Lift and shift” and “shift and lift” are the two most common ways to describe public cloud migration practices. By the sound of them, it seems they describe the same basic thing, just in two different orders. However, this is not entirely the case. At any AWS conference, they will explain that it is necessary to shift (refactor) before you lift (migrate). While it is ideal to proceed in this manner, it isn’t always an option. Fortunately, most of the public cloud offerings are built on the concept of hypervisors running virtual machines. This means that they all offer basic virtual servers and networking that can be created and used in much the same way as existing private infrastructure.

The reason refactoring is recommended before migrating is primarily cost: running unmodified applications on cloud infrastructure costs more. It’s hard to sell the concept of moving to the cloud when you have to tell clients it will cost them significantly more than they’re currently paying for infrastructure. By changing applications to use vendor offerings such as function as a service, database as a service, and object storage, the costs incurred are dramatically reduced. This level of refactoring requires company-wide buy-in, as in many cases it requires completely rewriting applications if they weren’t already designed to leverage such services. Depending on where the drive to the cloud is coming from, it may not be an option to have that level of developer involvement. In this case, the Infrastructure as a Service option is always there, but make sure your story weighs the benefits against the costs.

There are Always Trade-offs

The public cloud offerings are a great way for startups and small companies to have access to infrastructure without having to raise the capital required to purchase or lease the servers and networking components necessary to build it themselves. Moreover, most of the vendors offer the flexibility of paying on a metered or annual basis. This can allow for adaptation if a product suddenly or only occasionally requires a large amount of resources, because costs can be incurred for only the time necessary until resource usage is reduced.

While the technological benefits being offered are substantial, they do come at a price. As engineers, most of us expect a certain amount of authority over our environment, and by moving to someone else’s computers we give that up. Managers may think it’s a control issue, but it really comes down to accountability and predictability. We create environments with a level of redundancy and reliability known to us, and then set and manage expectations based on this known quality. When this is handled by a third party, we lose our insight into the functionality of our environment and with it our ability to anticipate failures and performance. We do eventually develop a level of understanding, but never the same level we have with a fully controlled environment. Additionally, our responsibility is not reduced: we are still the owners of the environment, but we can no longer resolve issues in the low-level architecture. This leaves us at the mercy of the vendor when fixing or investigating outages.

What we get for the price of losing low-level access is, for all intents and purposes, infinite expandability. That is to say, a lack of computing resources can no longer be considered an issue, since you are no longer bound by the physical servers, storage, and CPU on hand, but have the resources of a company the size of Amazon, Google, or Microsoft at your disposal. Should a situation requiring a large amount of scaling arise, it can be done at a moment’s notice, and with hourly pricing the costs are kept at what is necessary to support the demand. With internal infrastructure, your options are limited: the application can fail (or run unacceptably slowly); more equipment can be purchased to support the high load periods, but would be wasted otherwise, and might not be purchased in time; or you can use a hybrid approach or service (which puts you in public cloud infrastructure anyway).
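The scaling argument above can be made concrete with a rough sketch. The Python below compares the server-hours consumed when capacity follows demand with the server-hours paid for when hardware is sized for the peak. Every figure (4 baseline servers, 20 at peak, a 48-hour peak, a 720-hour month) and both function names are hypothetical, chosen only for illustration. It counts server-hours rather than dollars, because per-hour cloud prices and the cost of owned hardware are not directly comparable.

```python
def metered_server_hours(baseline: int, peak: int, peak_hours: int, total_hours: int) -> int:
    """Server-hours used when extra capacity runs only during the peak."""
    return baseline * total_hours + (peak - baseline) * peak_hours


def peak_sized_server_hours(peak: int, total_hours: int) -> int:
    """Server-hours available (and paid for) when hardware is sized for the peak."""
    return peak * total_hours


# Hypothetical month: 4 servers normally, 20 servers for one 48-hour peak.
metered = metered_server_hours(baseline=4, peak=20, peak_hours=48, total_hours=720)
fixed = peak_sized_server_hours(peak=20, total_hours=720)

print(f"capacity follows demand: {metered} server-hours")
print(f"capacity sized for peak: {fixed} server-hours")
print(f"share of peak-sized capacity actually needed: {metered / fixed:.0%}")
```

The gap between the two numbers is the idle capacity that purchased equipment would carry outside the peak. Whether metered pricing is actually cheaper still depends on the per-hour rate, which this sketch deliberately leaves out.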

Additional benefits to leveraging public cloud infrastructure are difficult to truly quantify, because one hopes to never find out how beneficial they really are. Case in point: at my current place of employment, my initiative from hire has been to move our operations to “the cloud.” The real benefit was explained within the concept of disaster recovery. While our backup system permitted the recreation of our environment fairly readily and quickly, the requirement of purchasing, installing and configuring the new equipment necessary to recreate our production environment would have been prohibitive. By having access to near instantaneous resources at our fingertips, a full recovery from a catastrophic loss has gone from weeks to hours.

Workarounds

One of the most valuable lessons learned from our migration was the realization that even when you think you have all of the requirements, there may be some that don’t even register. This was the case with our database migration. We have two mid-sized databases (in the 2-4TB range) that we needed to get into the cloud. Because of licensing, we were unable to utilize database-as-a-service offerings, and had to create virtual machines that we manually configured for the task. What we learned was that despite the block storage being solid state, the IOPS available to servers at the computing and memory level we required were not even remotely high enough. Our findings showed that even though the storage was made up of SSDs with provisioned IOPS, the limit set on the networking of smaller VMs kept write speeds around 60MB/s with bursts up to 120MB/s. While this is often unnoticeable on a majority of tasks, it didn’t even come close to the 300MB/s our database needed to keep up with our applications. The resolution we discovered was to increase the size of our VM until it was able to receive 10Gbps networking, which relieved the bottleneck but presented its own problem. The minimum size VM required to get 10Gbps networking put us over the CPU core limit of our license, incurring new costs, because our vendor had no way of offering a compromise.
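Storage throughput is usually quoted in megabytes per second and network links in gigabits per second, which makes this kind of bottleneck easy to miss. The following sketch converts the figures quoted above into the same unit. It assumes decimal units (1 MB = 10^6 bytes, 1 Gbit = 10^9 bits) and ignores protocol overhead, so the results are upper bounds rather than measurements; the function names are illustrative.

```python
def mb_per_s_to_gbps(mb_per_s: float) -> float:
    """Convert megabytes per second to gigabits per second (decimal units)."""
    return mb_per_s * 8 / 1000


def gbps_to_mb_per_s(gbps: float) -> float:
    """Convert gigabits per second to megabytes per second (decimal units)."""
    return gbps * 1000 / 8


# Write speeds quoted in the paragraph above.
for label, rate in [("sustained", 60), ("burst", 120), ("required", 300)]:
    print(f"{label:>9}: {rate} MB/s is about {mb_per_s_to_gbps(rate):.2f} Gbps")

# Theoretical ceiling of a 10Gbps link, before any overhead.
print(f"10 Gbps link: at most {gbps_to_mb_per_s(10):.0f} MB/s")
```

Seen this way, the required 300MB/s needs well over 2 Gbps of network capacity to the storage, while the observed burst rate sits just under 1 Gbps.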

File shares are a vital part of our environment. We utilize an enterprise storage appliance to make management of the mixed NFS and CIFS environment easier and to utilize the active/active failover that it offers. These devices are fairly common in datacenters and there are a myriad of vendors that provide them. Most of these vendors also offer virtual appliances available in your major public cloud of choice. At the time of our migration, neither the appliance we preferred nor any of the others we investigated and trusted offered automatic failover between availability zones within AWS. As such, part of our migration required us to move to a manual failover process. What this means: if the primary appliance fails, we have to manually change a CNAME in our DNS configuration and break the mirroring between devices. This change, though inconvenient and not ideal, is not difficult and propagates quickly, and it was considered an acceptable loss weighed against the gains of being able to expand the storage as necessary and having it span multiple datacenters.

Unavoidable Caveats

Most of the drawbacks in migrating are tied to the abstraction from the hardware and the reliance upon a vendor. This seems simple enough, because obviously the resources available to these companies are vastly greater than most. What comes from this sprawling infrastructure is scale, and the thing we often forget to consider when scaling is that doing so doesn’t just increase available resources, it increases risk. The likelihood that something fails rises every time you add more devices to an environment. It’s an unavoidable fact. The more disks you have, the more likely it is that one will fail, and while these companies work to build redundancy into all of their systems, it ultimately comes down to probability. Moving into the infrastructure of a company that has to worry about bit-flipping from cosmic rays in servers that utilize ECC memory should speak volumes about the rates of probability. At this level, it’s time to question what all those 9’s really mean.
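The effect of device count on the odds of failure can be sketched with basic probability. The Python below is a minimal sketch that assumes every device fails independently with the same probability p over some period, which is a simplification of real hardware. The 2% figure is hypothetical and the function name is illustrative.

```python
def prob_at_least_one_failure(p: float, n: int) -> float:
    """Probability that at least one of n independent devices fails,
    given that each one fails with probability p."""
    return 1.0 - (1.0 - p) ** n


# Hypothetical per-device failure probability over the period of interest.
p = 0.02
for n in (1, 10, 100, 1000):
    print(f"{n:>5} devices: {prob_at_least_one_failure(p, n):.1%} chance of at least one failure")
```

At small scale a failure in any given period is unlikely, while at large scale it becomes close to certain. That is why large providers plan for failure as routine rather than exceptional.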

Again, this is something to consider within the constructs of a migration. Despite a claim of 99.999% availability on our block storage volumes, we have had several fail on us. This occurred on our most critical of systems, the database, on multiple occasions, and at one point a couple of weeks apart. In a private environment, this would be unheard of, and if it did happen it would be indicative of a malfunctioning piece of equipment that would be replaced by the manufacturer. In the cloud, failures aren’t actually unusual, although we were told it is unusual for them to happen multiple times to a single company. Again, this comes down to probability. A company with that sheer size of infrastructure will have failures on a much more regular basis than a small private cloud. It’s simply the nature of reality.
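To put the nines in perspective, an availability percentage can be translated into the downtime it permits per year. The sketch below assumes a 365-day year and treats availability as a simple fraction of time; a provider’s actual service terms may define and measure it differently. The function name is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Minutes of downtime per year permitted by an availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for nines in (99.0, 99.9, 99.99, 99.999):
    print(f"{nines}% availability: about {allowed_downtime_minutes(nines):.1f} minutes per year")
```

Under these assumptions, 99.999% works out to a little over five minutes a year. An average like this says nothing about whether a particular volume will be among the ones that fail.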

Once again, assessment and mitigation are key. In our particular case, we run redundant failover databases, which meant that each time we had to flip our primary and rebuild a secondary. The interference this causes in our environment is minimal and as such is acceptable in comparison to the cost increase required to configure redundancy within the database or operating system. This would not be the case in all implementations, and such failures could be mitigated using operating system tools such as logical volume management within Linux.

Is it Worth Moving?

Shifting prior to lifting, though preferred, is not always necessary, because of the nature of the cloud. The benefits of doing so are substantial, as they result in dramatic infrastructure cost savings, increased reliability, and minimal administration, but they require full buy-in from all sides and incur the additional costs of completely rewriting most if not all applications. Simply migrating to the cloud to leverage infrastructure as a service allows instant and near infinite scaling, low-cost entry into multi-datacenter redundancy, and the ability to utilize periodic scaling during high-traffic or resource-intensive times. It’s important to weigh the cost to benefit ratio across the board and be ready to solve unexpected problems.

At the end of the day, infrastructure teams need to be able to work with whatever is given to them and make sure they have the ability to configure their environment to fit the needs of the teams they work for. Whether in the public cloud or on private servers, the priority should be making sure the business requirements are met. Public cloud infrastructure offers resources and an agility difficult to attain on private infrastructure, but the driving force should always be the direction of the business.

Linux Cloud Servers Explained

April 12, 2024 by Matt Hartley

Over the years, there has been a lot of mixed information as to what Linux cloud servers actually are. This article aims to clear the air once and for all with a concise explanation while providing you with a list of Linux cloud server resources that you can investigate for yourself.

Yes, the term “Linux cloud server” may seem silly, as it should be defined as a group of servers working together for a common purpose. But the fact is that most people are not aware of this. Therefore this article was written to provide a more accurate outline of what “the cloud” actually is.

What is Linux Cloud Storage?

The term “putting something into the cloud” often confuses people. I’ve seen the definition of cloud hosting evolve into whatever one’s marketing department happens to want it to mean. But at its simplest, storing something in the cloud means you’re storing data on a computer that is not locally available. Okay, cloud computing is actually a bit more complex than that.

In truth, storing data on the cloud means at least one or two of the following are part of the server solution:

Cloud Storage is best suited towards anyone needing off-site storage for their data.
Your stored data is stored at a location not local to you. This could be across the street or on the other side of the globe.
The server instance you’re using to store your data is flexible and scalable. This means that with very little effort, an administrator can remotely resize the storage capacity of a cloud hosting server. Additionally, increasing the cloud host’s processing power is also an option made readily available despite being in a remote location.
You only pay for what you use. This might mean your cloud hosting service is billed to you only for the time it was used. This service period usually ranges from hourly to monthly pricing.
Usually, cloud hosting providers offer you far greater reliability and physical data security. This translates to your data being spread out across multiple servers instead of being tied to a single server environment. This storage approach is called logical storage pools and can sometimes span beyond a single physical location.

A common example of Linux-based cloud storage would be to utilize Amazon’s S3 object storage. Accessible from various installable tools and web interfaces, Amazon’s S3 provides its users with the ability to meet the above criteria for Linux-based cloud storage.

Tip: The best way to understand cloud storage is to remember that the data sent to the cloud is usually uploaded and stored in an inactive storage state. Think of this as a remote hard drive where you might back up your data.

What is Linux Cloud Hosting?

Cloud hosting is sometimes referred to as cloud computing. Regardless of the term you prefer, the end result is that this is a cloud server instance used for active data interaction. This means a user can run the Linux cloud host in an active environment instead of a passive one.

When looking for a Linux powered cloud host, you’ll generally find they all have the following things in common:

Cloud hosting is targeted towards developers and businesses. Unlike cloud storage, hosting is primarily sought out by those looking to develop web applications or to host a website.
Running a cloud hosting environment requires greater Linux-based computing power than merely storing data.
Like cloud storage, hosting with a cloud instance allows for a scalable environment that you can grow as needed within the host’s virtual environment. Does a website or project require additional resources? No problem, just scale up the CPU and memory as needed.

A well-known example of cloud hosting is Amazon’s EC2 virtual server. Like Amazon’s storage option, EC2 provides web hosting or web application access from anywhere with an Internet connection.

Tip: Unlike traditional Linux web hosting, a cloud host provides what some refer to as elastic computing. Shared, VPS and dedicated hosting require you to select a set amount of resources ahead of time. Cloud hosting resources can be expanded from a control panel anytime it’s needed.

Cloud Storage Recommendations:

Below are cloud based storage and hosting recommendations. Note, this does not include self-hosted solutions like NextCloud as that would require you to figure out a set remote destination for its installation and files.

Provider Name	Encryption Offered	Capacity	Linux Desktop Compatible	License
Dropbox	Encrypted storage & client-side encryption	Unlimited if Dropbox Business	Linux Client Available	Proprietary
Jungle Disk	Encrypted storage & client-side encryption	Unlimited using Amazon S3 or Rackspace	Linux Client Available	Proprietary
Tarsnap	Encrypted storage & client-side encryption	Unlimited	Bash Terminal Linux Client Available	BSD License
SpiderOak One	Encrypted storage & client-side encryption	Up to 5,000 GB for SpiderOak One or Unlimited for SpiderOak Semaphor	Linux Client Available	Proprietary (GPLv3 for some tools)
Google Drive	Encrypted storage & client-side encryption	Up to 30 TB	Insync / overGrive / Gdrive CLI	Proprietary

Cloud Hosting Recommendations:

Provider Name	Resources Available	Capacity
Digital Ocean	Up to 224 GB of RAM; up to 32 CPU cores; up to 10 TB data transfer	Up to 500 GB SSD (Additional Block Storage up to 1.95 TB)
Linode	Up to 200 GB of RAM; up to 16 CPU cores; up to 20 TB data transfer	Up to 1536 GB SSD (No visible details on additional block storage)
Amazon EC2	See Amazon for details	Unlimited
Rackspace cloud hosting	See Rackspace for details	See Rackspace for details

Cloud Server Security Considerations

No matter what cloud backup or hosting solution is selected, the key is to make sure you understand how off-site security works. In the simplest terms possible, once data has been moved away from the original PC, it’s potentially viewable by anyone.

This is less of an issue in LAN-based environments where data isn’t moving beyond the watchful eye of the LAN’s firewall. However, when moving data to and from a cloud server (or cloud instance), it’s critical to know that the transmission of said data is encrypted.

This encryption of data in transit is done in a few different ways:

SSL - Secure Sockets Layer (today implemented by its successor, TLS) provides a layer of encryption for data communications between a server and a client machine. For websites, it’s a must-have, as it ensures that when you log in to a website your credentials aren’t spilling out to watchful eyes looking to exploit weak sign-in accounts. This applies to any site using a login capability, including blogs. You’ll know that SSL is being used effectively when you see the green padlock in your browser followed by https. SSL can be looked at as a secure tunnel between one’s browser and the destination website.
SSH - This is a secure method to remotely access another computer or server. For Linux cloud servers, this is useful as it allows the administrator to remotely log in to an offsite machine to update, upgrade or otherwise maintain it within a secure tunnel from their original workstation. It’s also common practice to use SSH connections to connect to a remote file system over SFTP. Unlike regular FTP, SFTP transfers files through an encrypted SSH connection. SSH can be looked at as a secure tunnel for remote access.

Tip: What about using a VPN? Historically, a VPN shares similar benefits to SSH in terms of encrypting internet traffic. However, it differs from SSH in that its core purpose is to simply connect a workstation to a private network over a public network, in this case the Internet. This is useful for employees who need to connect to their workplace network while away from the office.

To summarize, SSH is best used for local workstation access to a cloud server, whereas a VPN would be best used to connect a local workstation to a private remote network.

Data already uploaded to a cloud server needs securing too: the destination of the data should also be encrypted. It’s important to note that not all cloud servers do this automatically. While it’s common practice for cloud storage solutions to encrypt your stored data, cloud hosting usually leaves this up to you.

The reason why cloud hosts won’t provide encryption by default is because many of the files on the cloud host are to be read by the public. Those files that are not will be protected through directory permissions set to the correct values:

Apache users can protect a directory’s contents with an .htaccess file.
Nginx users can use the Nginx HttpAuthBasic module (ngx_http_auth_basic_module).

In both examples, the data isn’t encrypted, rather, only accessible for those with the appropriate authentication.

Cloud servers vs Physical Servers

In the old days, having a server to utilize meant having access to a single physical server. For shared web hosting, this may have meant taking a single machine and sharing its resources among hundreds of people. However, for the lucky few who could afford their own box, the dedicated server meant untapped control and power. If you were hacked or something crashed, odds are the local administrator had a late night ahead of them.

Cloud servers in many ways made things better. Unfortunately, at least in the web development space, they make things a bit more confusing as well. Cloud servers used for storage or backups are almost always cloud servers in the truest sense of the term. The data isn’t centralized and is spread across multiple servers (often across multiple locations).

This isn’t always true with cloud hosting servers. These hosting instances are usually centralized to a single location and sometimes even to a single machine running as a VPS (virtual private server). This means even with backups, if that data center goes up in smoke, your data might as well.

Some cloud hosting providers offset this issue by giving you true root access to the hosting instance. This is useful as it allows the administrator of the cloud instance to offload important data to a local destination for additional backup assurance using rsync or a similar option.

So which is better? It doesn’t matter – the keys to remember are as follows:

Web hosting – Cloud hosting is just fine for web developers and most lower traffic websites. Higher traffic websites, however, may do better with the additional power and functionality found with a dedicated server.
Data backup – Cloud storage is a great option. There are providers listed above with Linux desktop support and the prices for managed services are difficult to beat. Obviously, some content is best left out of the cloud, but most of it is just fine, since managed cloud backup services usually encrypt your data anyway.

Whether Linux enthusiasts of the world decide to use or avoid cloud server solutions, one thing is for sure – they’re here to stay and are a growing part of the Linux landscape.