Case Study Explained: “How NASA used ANSIBLE to migrate cloud and increase cloud efficiency?”

Ansible is a powerful IT automation tool that is widely used across the industry. It can handle a broad range of operational tasks and saves time by running a single task on thousands of machines simultaneously. According to Wikipedia, “Ansible is an open-source software provisioning, configuration management, and application-deployment tool enabling infrastructure as code. It runs on many Unix-like systems and can configure both Unix-like systems as well as Microsoft Windows.”
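To make the idea of running one task on many machines at once concrete, here is a minimal Python sketch. It is not how Ansible is implemented; `run_task`, `run_on_all`, and the host names are hypothetical, and the remote work is replaced by a stand-in function. The `forks` parameter is only loosely modelled on Ansible's setting of the same name, which caps how many hosts are handled in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def run_task(host: str) -> tuple[str, str]:
    """Stand-in for real remote work; a real tool would connect to the host."""
    return host, f"patched {host}"


def run_on_all(hosts: list[str], forks: int = 5) -> dict[str, str]:
    """Apply run_task to every host, handling at most `forks` hosts at a time."""
    with ThreadPoolExecutor(max_workers=forks) as pool:
        # pool.map yields results in the same order as the input hosts.
        return dict(pool.map(run_task, hosts))


if __name__ == "__main__":
    # Hypothetical host names, used only for illustration.
    hosts = [f"web{i:02d}.example.com" for i in range(1, 4)]
    for host, result in run_on_all(hosts).items():
        print(host, "->", result)
```

The point of the sketch is that the task is written once and the host list is just data, so adding more machines does not mean writing more automation.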

Another term we need to understand is “Ansible Tower”. Ansible Tower centralizes and controls IT infrastructure with a visual dashboard, role-based access control, job scheduling, integrated notifications, and graphical inventory management. It is a product built around Ansible.

Ansible is used by many leading organizations, and each has its own success story about how Ansible solved the challenges it faced. Together, these use cases show how much impact Ansible has had on the automation world. Among them, one case study stands out: NASA’s. How did NASA use Ansible to migrate to the cloud and increase cloud efficiency? Let’s take a look.

About NASA & WESTPrime

The NASA Web Enterprise Service Technologies contract (WESTPrime) was established to create a standard for public cloud usage within NASA. WESTPrime covers everything from the well-known www.nasa.gov site to privately accessible web applications used by NASA staff around the world.

The Timeline

Therein was the challenge:

“With media in so many different places, you needed institutional knowledge of NASA to know where to look,” says Rodney Grubbs, Imagery Experts Program Manager at NASA.

“If you wanted a video of the space shuttle launch, you had to go to the Kennedy Space Center website. If you wanted pictures from the Hubble Space Telescope, you went to the Goddard Space Flight Center website. With 10 different centers and dozens of distributed image collections, it took a lot of digging around to find what you wanted.”

“We wanted to build our new solution in the cloud for two reasons,” says Grubbs. “By 2014, like many government agencies, NASA was trying to get away from buying hardware and building data centers, which are expensive to build and manage. The cloud also provided the ability to scale with ease, as needed, paying for only the capacity we use instead of having to make a large up-front investment.”

The Web Enterprise Service Technologies (WESTPrime) service contract, one of five agency-wide service contracts under NASA’s Enterprise Services program, provided a delivery vehicle for building and managing the new site.

The development of the new NASA Image and Video Library was handled by the Web Services Office within NASA’s Enterprise Service and Integration Division. Technology selection, solution design, and implementation were managed by InfoZen, the WESTPrime contract service provider. As an Advanced Consulting Partner of the AWS Partner Network (APN), InfoZen chose to build the solution on Amazon Web Services (AWS).

“Amazon was the largest cloud services provider, had a strong government cloud presence, and offered the most suitable cloud in terms of elasticity,” recalls Sandeep Shilawat, Cloud Program Manager at InfoZen.

Requirements

Business Challenge

The migration left NASA with an environment spanning multiple virtual private clouds (VPCs) and AWS accounts that could not be easily managed. Even simple things, like ensuring every system administrator had access to every server or applying routine patches, were extremely troublesome.

The overall business goals were twofold: migrate to the cloud and increase cloud efficiency.

The Solution

As a result of implementing Ansible Tower, NASA achieved the following efficiencies:

  • NASA web app servers are patched routinely and automatically through Ansible Tower with a very simple 10-line Ansible playbook.
  • Ansible is also used to remediate security issues and was leveraged to remediate both OpenSSL issues earlier this year. This not only saved the team time but also allowed NASA to quickly remediate a very daunting security issue.
  • Every week, both the full and mobile versions of www.nasa.gov are updated via Ansible, which generally takes only about 5 minutes.
  • OS-level user accounts for mission-critical staff are continually checked and created if missing. NASA can now say with certainty that everyone who needs access has it, and a user can be added to or removed from all servers almost instantly.
  • NASA has also integrated Ansible facts into its CMDB, CloudAware, for better management visibility of its entire AWS inventory. As a result, NASA can organize its inventory of AWS resources in a very granular way that was not possible before.
  • Ansible is also used to ensure the environment is compliant with the necessary Federal security standards as outlined by FedRAMP and other regulatory requirements. The Federal Risk and Authorization Management Program (FedRAMP) provides a standardized approach to security assessment, authorization, and continuous monitoring for cloud products and services.
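The account-management point above rests on a declarative idea: state which accounts should and should not exist, compare that with what each server actually has, and change only what differs. The following Python sketch illustrates that idea under stated assumptions. It is not NASA's playbook or Ansible's code; the function names, host names, and user names are hypothetical, and no real servers are touched.

```python
def plan_account_changes(
    should_exist: set[str],
    should_not_exist: set[str],
    actual: dict[str, set[str]],
) -> dict[str, dict[str, list[str]]]:
    """List, per host, the accounts to create and remove. Hosts already correct are omitted."""
    plan = {}
    for host, existing in actual.items():
        create = sorted(should_exist - existing)       # required but missing
        remove = sorted(should_not_exist & existing)   # revoked but still there
        if create or remove:
            plan[host] = {"create": create, "remove": remove}
    return plan


def apply_plan(
    actual: dict[str, set[str]],
    plan: dict[str, dict[str, list[str]]],
) -> dict[str, set[str]]:
    """Return the state after applying the plan (simulation only)."""
    new_state = {host: set(users) for host, users in actual.items()}
    for host, changes in plan.items():
        new_state[host] |= set(changes["create"])
        new_state[host] -= set(changes["remove"])
    return new_state


if __name__ == "__main__":
    # Hypothetical inventory: which accounts each server has right now.
    actual = {"web01": {"alice", "bob"}, "web02": {"alice"}}
    plan = plan_account_changes({"alice", "carol"}, {"bob"}, actual)
    print(plan)
    after = apply_plan(actual, plan)
    # A second check finds nothing left to do, so repeated runs are safe.
    print(plan_account_changes({"alice", "carol"}, {"bob"}, after))
```

Because the second check returns an empty plan, the same check can run continually without side effects, which is what makes "continually checked and created if missing" practical.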

Why NASA chose Ansible to solve this challenge

  • Ansible does not require agents to be installed on hosts; it uses SSH natively
  • The learning curve is gentle; it took less than a day to learn
  • Non-technical staff can read an Ansible Playbook and know what’s happening
  • It has the most active open source community among its competitors

Conclusion

  • Updating nasa.gov went from over 1 hour to under 5 minutes
  • Patching updates went from a multi-day process to 45 minutes
  • Achieving near real-time RAM and disk monitoring (accomplished without agents)
  • Provisioning OS Accounts across the entire environment in under 10 minutes
  • Baselining standard AMIs went from 1 hour of manual configuration to becoming an invisible and seamless background process
  • Application stack set up from 1–2 hours to under 10 minutes per stack

“As a result of implementing Ansible, we are better equipped to manage our environment. Ansible has allowed us to provide better operations and security to our clients. It has also increased our efficiency as a team,” says Jonathan Davila, DevOps Lead at InfoZen.

Increasing Usage of Ansible in the Future

In the future, Ansible will be used to manage NASA’s stack of Windows servers and deliver the same results the team has achieved in its Linux environments. The end goal is a completely automated production environment, with system administrators needing to SSH/WinRM into instances manually only for troubleshooting. All other instance changes would happen exclusively through Ansible (and the occasional CloudFormation template).

***

This article is written, edited, and published by Shobhit Sharma

Email Me @ shobhit0812@gmail.com

Shobhit Sharma (born 8 December 2000) is an Indian Technology Journalist, Computer Engineer, EDM Artist, Blogger, and Entrepreneur from Agra, Uttar Pradesh.