by Noah Fiedel, Blogger Engineering Tech Lead
Many of you have posted asking why Google doesn't continue supporting FTP publishing indefinitely or even charge money to those who would like this support. I'd like to give you some insight into our decision making and technology.
FTP is one of the earliest protocols on the Internet. It was drafted in 1971, before security was a concern, as the Internet at the time connected universities and research labs. Unlike nearly all other Internet protocols, it uses two unencrypted connections simultaneously, one for commands and one for data. This makes securing FTP effectively impossible at both the server and network levels. FTP servers at ISPs are therefore vulnerable to attack, and your password can be 'sniffed' by anyone with access to the traffic to or within your ISP. sFTP, while more secure than FTP, still requires us to store your user credentials, which is itself undesirable from a security perspective.
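To make the 'sniffing' risk concrete, here is a minimal sketch of the login commands an FTP client sends over the network. The USER and PASS commands are part of the FTP specification. The helper name and the sample credentials are invented for illustration and have nothing to do with Blogger's actual code.

```python
def ftp_login_bytes(user: str, password: str) -> bytes:
    """Build the bytes a client sends to log in over plain FTP.

    Hypothetical helper for illustration: it only formats the standard
    USER and PASS commands, it does not open a connection.
    """
    return f"USER {user}\r\nPASS {password}\r\n".encode("ascii")


# Sample credentials are made up.
captured = ftp_login_bytes("blogowner", "hunter2")

# The password appears in the traffic as-is; no decoding step is needed.
print(captured)
```

Anyone positioned to read these bytes in transit can read the password directly, which is the weakness described above.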
Compare this to HTTP, drafted 20 years later in 1991. FTP doesn't have a mechanism to discover whether an FTP server is up, down, slow, or temporarily unavailable. HTTP supports all of these and more, and is now the basis for nearly all activity on the Internet.
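As a rough illustration of the HTTP side of this comparison, the sketch below maps standard HTTP status codes to the kinds of server state mentioned above. The function name and the wording of the returned labels are assumptions made for this example; only the status codes and the Retry-After header come from HTTP itself.

```python
from http import HTTPStatus
from typing import Optional


def describe_server_state(status: int, retry_after: Optional[str] = None) -> str:
    """Map an HTTP status code to a coarse server state (hypothetical helper)."""
    if 200 <= status < 400:
        return "up"
    if status == HTTPStatus.SERVICE_UNAVAILABLE:  # 503
        # A server may send a Retry-After header along with a 503.
        if retry_after is not None:
            return f"temporarily unavailable, retry after {retry_after}"
        return "temporarily unavailable"
    if status == HTTPStatus.GATEWAY_TIMEOUT:  # 504
        return "slow or timed out upstream"
    if status >= 500:
        return "down or failing"
    return "client-side problem"


for code in (200, 503, 504, 500):
    print(code, describe_server_state(code))
```

A publishing client that receives one of these responses can decide whether to retry, wait, or report an error, without guessing at what went wrong.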
Due to FTP's weaknesses, many ISPs restrict access to their FTP servers. They do this by limiting your FTP account to a list of approved Internet addresses (via an IP whitelist), which makes your account less likely to be hijacked. In the next section you will see why this also makes it difficult for Blogger to reliably provide FTP publishing.
Google runs many datacenters, and Blogger runs in several of them. Each datacenter has a different Internet address (a.k.a. "IP address") when it connects to your FTP server, so if your hosting service requires an IP whitelist, you would have to list all of the IP addresses associated with each of our active datacenters. We really don't like having a "primary" datacenter for anything, and instead prefer to let our traffic flow to the most efficient and lowest-latency datacenter for our users. This makes the IP whitelist problem even worse, as some ISPs allow only a single IP (or very few) to be whitelisted. Your ISP would need to whitelist all of Blogger's datacenters, and since those change regularly, your ISP's whitelist would have to be updated each time as well. This leads to a significant amount of user frustration, and regularly results in blogs failing to publish. Diagnosing these issues has taken up a large part of our engineering and support teams' time.
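The whitelist problem can be sketched in a few lines. This is a simplified model of what a hosting provider's check might look like, not any ISP's real implementation. The function name, the datacenter labels, and the addresses (taken from ranges reserved for documentation) are all assumptions for illustration.

```python
import ipaddress
from typing import Iterable


def is_allowed(source_ip: str, whitelist: Iterable[str]) -> bool:
    """Return True if source_ip matches any whitelist entry.

    Hypothetical check: each entry is a single IPv4 address or a CIDR range.
    """
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(entry) for entry in whitelist)


# An ISP that permits only one whitelisted address (made-up values).
isp_whitelist = ["192.0.2.10"]

# Publishing traffic may come from whichever datacenter is most efficient.
datacenter_ips = {"dc-a": "192.0.2.10", "dc-b": "198.51.100.20"}

for name, ip in datacenter_ips.items():
    outcome = "publish accepted" if is_allowed(ip, isp_whitelist) else "publish rejected"
    print(name, ip, outcome)
```

In this model the same blog publishes successfully from one datacenter and fails from another, and every change to the datacenter list requires a matching change on the ISP's side.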
FTP Web Hosting Providers
Late last year, during scheduled maintenance on a datacenter, we moved FTP publishing to another datacenter and updated our publicly posted IPs for your ISP's whitelists. Even after doing this, we received a considerable number of complaints from users unable to publish via FTP to thousands of ISPs. Many of these ISPs maintain their own IP whitelists, often in an undocumented way. Troubleshooting this is extremely difficult and time-consuming for us (and for you), as it's rarely clear where the underlying issue is. Our engineering, product and support teams often ended up contacting ISPs directly and waiting on hold, frequently without resolution, all to support a single user's report of "can't publish via FTP". In many cases the user or ISP had simply entered the IP whitelist incorrectly. In other cases the hosting service's FTP server was unreachable. On more than one occasion, the ISP had set up "staging" and "production" environments without telling their users, so while Blogger was successfully publishing (to the staging server), the posts were not visible on the web and the user had no idea why. In a great many cases, FTP publishing works but is extremely slow because shared hosting plans give each user slow or limited network or disk access. We have seen full FTP republishes take over a month, entirely because the FTP server was slow.
What about sFTP?
If sFTP addresses some of the concerns with FTP, why are we shutting it down too? Fewer than 15% of users have adopted sFTP as their publishing mechanism, and many of the same challenges apply to both sFTP and FTP. Even with sFTP, a republish of a blog can take longer than a month to complete. Not all ISPs support sFTP and of those that do, many lock down FTP and sFTP with the same IP whitelist.
Blogger's current FTP support was completely rewritten for stability and maintainability in 2008. We added redundant queues on our side to make sure we never missed a file. We have spent a significant amount of engineering time improving FTP support in the last two years, not counting the time spent on support and on troubleshooting user issues. Even after this effort, approximately 10% of all FTP publishes fail.
As the original blog post mentioned, Google infrastructure is changing and would require us to again rewrite a major portion of our FTP support. Even after that rewrite, things would be no better than they are today in terms of stability, and we would be running just to stay in place.
I hope this post helps you understand the issues we face, and why it is not simply a question of money or a small bit of time. We want to deliver a best-in-class product experience for all users. Supporting a protocol with known security vulnerabilities and dependencies on downstream ISPs was preventing us from delivering the stable, reliable, and functional product we want and our users demand. It was also preventing us from doing more for the 99.5% of users who host their blogs with us, either on their own domain or on blogspot.com. While we deeply regret the impact this has on some of our users, many of whom have relied on Blogger for years, we remain confident that this was the right decision.