NuzuSys Network and Datacenter

  • 8:10 a.m.

NuzuSys specialises in leveraging technology for sustainability and development and offers various services. This post briefly explains how we run our networks and servers.

Project Details

Name of assignment or project: NuzuSys Network and Datacenter
Year: 2013
Location: Philippines, United States, Canada, Australia, Vanuatu
Client: NuzuSys
Main project features: Build the NuzuSys technology infrastructure from the ground up to support its services, and develop the standards, policies and procedures associated with that infrastructure
Positions held: Senior Programmer, Data Analyst, MIS Expert, Systems Integrator, Systems Developer, Database Administrator, Network Administrator
Activities performed: design the computer networks including workstations, servers, firewalls, managed switches and routers; manage procurement; deploy a completely virtualised infrastructure, making more efficient use of hardware and simplifying network/system administration; develop a platform for configuration management and automation, saving time and managing the state of systems more effectively over time; design a platform for deployment of client managed hosting services; deploy enterprise grade systems to monitor the infrastructure, maintaining a high service level quality and reporting on resource health metrics; develop a highly secure infrastructure environment; deploy an effective backup platform

Report Restricted Access

Summary of Activities and Deliverables

Design the computer networks including workstations, servers, firewalls, managed switches and routers. A very brief introduction to the NuzuSys environment would look like this:

  • The workstations form a heterogeneous environment with Linux Mint as the primary operating system (OS), some Mac OS X machines for graphics work and Windows to manage Samba4, the free implementation of Microsoft's Active Directory;
  • The firewall is powered by pfSense running on NuzuSys' own hardware;
  • Huawei managed switches are used instead of more expensive Cisco equipment to connect all nodes together, segregating department networks (e.g. management, development lab, workstations) using Virtual Local Area Networks (VLANs);
  • Servers are almost entirely Linux, with a mixture of Debian and CentOS;
  • NuzuSys' managed hosting services run in a few data centers around the world, similar to the one on the left below, while less critical services and our development lab currently run on a home office network, shown below to the right.

    [Images: a data center (left) and the home office network (right)]

    Deploy a completely virtualised infrastructure making more efficient use of hardware and simplifying network/system administration. While NuzuSys can offer support with a number of virtualisation platforms including VMware, Microsoft's Hyper-V, Xen, KVM and various container technologies (e.g. Docker, Linux Containers), we ourselves currently use a mixture of the following:

    • Proxmox as the main virtualisation platform;
    • Within Proxmox, KVM for virtual machines requiring a high level of operating system isolation (e.g. different OS, firewall);
    • Within Proxmox, Linux containers for all the clients' Virtual Private Servers;
    • Direct KVM on Linux for several of our hardware products that are also virtualised. We manage those environments using libvirt and Virtual Machine Manager;
    • And Docker only when it is necessary and is the recommended way to install a particular application we support.

    Develop a platform for configuration management and automation saving time and managing the state of systems more effectively over time. While managing one or a handful of servers for a single small business is straightforward, scaling this to dozens or hundreds of servers is no easy feat. The graphical user interface paradigm quickly falls apart when managing such infrastructures. When point and click no longer works, one must return to the good old-fashioned command line and to tools such as Puppet or Chef that are designed to scale automation and to manage configuration state and complex deployments. At NuzuSys we opted for Ansible since it offered a seemingly easier learning curve at the time of adoption, backed by a very active and promising open source community, with great flexibility and capability. Ansible has since been acquired by Red Hat and is a world class product of its kind, and we don't regret our choice. For example, below is a text file in YAML format, a human friendly data serialisation standard, describing the deployment of a Django web application on a server running a standardised set of tools. With a quick glance at the file, someone familiar with our tools knows exactly how this server is deployed. The evolution of the configuration is easily managed using a version control system, as is done with software, so you can easily know who did what and when, which is great for auditing! We use a private Git server for this.
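To make the idea of "managing the state of systems" concrete, here is a minimal Python sketch of the declare-then-converge pattern that configuration management tools follow. It is not the YAML playbook mentioned above and not Ansible's implementation; the `ServerState` model and both function names are hypothetical and only track package and service names.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ServerState:
    """Hypothetical, heavily simplified model of a server's configuration."""
    packages: set[str] = field(default_factory=set)
    services: set[str] = field(default_factory=set)


def plan_changes(desired: ServerState, actual: ServerState) -> list[str]:
    """List the actions needed to bring `actual` in line with `desired`."""
    actions = []
    for pkg in sorted(desired.packages - actual.packages):
        actions.append(f"install {pkg}")
    for svc in sorted(desired.services - actual.services):
        actions.append(f"start {svc}")
    return actions


def apply_changes(desired: ServerState, actual: ServerState) -> ServerState:
    """Return the state after converging; anything already present is left alone."""
    return ServerState(
        packages=actual.packages | desired.packages,
        services=actual.services | desired.services,
    )
```

Running `plan_changes` a second time after `apply_changes` yields an empty list. That property (idempotence) is what makes it safe to re-run the same declared configuration against dozens of servers.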

    Design a platform for deployment of client managed hosting services. We have a database of clients running in PostgreSQL which integrates nicely with the above tooling, though this system is still a work in progress.
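One plausible way for a client database to feed Ansible is a dynamic inventory script, which prints JSON describing groups and hosts. The sketch below is an assumption about how such an integration could look, not a description of the NuzuSys system: the row layout (client group, hostname, IP address) is hypothetical, and the rows are passed in directly rather than queried from PostgreSQL.

```python
import json


def build_inventory(rows):
    """Build an Ansible-style dynamic inventory from (group, hostname, ip) rows.

    The rows stand in for the result of a database query (hypothetical schema).
    """
    inventory = {"_meta": {"hostvars": {}}}
    for group, hostname, ip in rows:
        # One inventory group per client, each holding a list of hosts.
        inventory.setdefault(group, {"hosts": []})["hosts"].append(hostname)
        # Per-host variables live under _meta.hostvars.
        inventory["_meta"]["hostvars"][hostname] = {"ansible_host": ip}
    return inventory


def inventory_json(rows):
    """Serialise the inventory the way a dynamic inventory script would print it."""
    return json.dumps(build_inventory(rows), indent=2, sort_keys=True)
```

The addresses used when trying this out should come from the real database; any values hard-coded in an example are placeholders.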

    Deploy a network latency monitoring tool to monitor the quality of the Internet connection. The tool is configured to measure a wide range of network latency data to various points on the Internet (e.g. NuzuSys hosting servers, commonly used public services) and to provide alerts on suboptimal conditions for further analysis. Below is a simple screenshot of one of our test deployments showing latency visualisation to Google's free DNS servers.
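The measure-then-alert loop described above can be sketched in a few lines of Python. This is an illustrative stand-in, not the monitoring tool we deploy: it times TCP connections rather than ICMP pings, and the function names and the default thresholds are assumptions chosen for the example.

```python
import socket
import statistics
import time


def tcp_connect_ms(host, port=53, timeout=2.0):
    """Time one TCP connection to host:port in milliseconds, or None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        return None


def summarise(samples, threshold_ms=100.0, max_loss=0.2):
    """Summarise latency samples and flag suboptimal conditions.

    `samples` is a list of millisecond values, with None marking a failed probe.
    """
    ok = [s for s in samples if s is not None]
    loss = 1 - len(ok) / len(samples) if samples else 0.0
    median = statistics.median(ok) if ok else None
    # Alert when nothing answered, the median is too slow, or too many probes failed.
    alert = median is None or median > threshold_ms or loss > max_loss
    return {"median_ms": median, "loss": loss, "alert": alert}
```

A typical use would be to collect a handful of probes, for example `summarise([tcp_connect_ms("8.8.8.8") for _ in range(5)])`, on a schedule and store the summaries for graphing.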

    Deploy enterprise grade systems to monitor the infrastructure, maintaining a high service level quality, reporting on resource health metrics and monitoring network traffic. We've adopted a collection of tools that we briefly discuss below:

    • OpenNMS as the primary infrastructure monitoring platform. While there are several open source network monitoring systems to choose from, OpenNMS is a proven, battle-tested solution that has long been used in large data centers around the world.
    • NtopNG, the de facto open source network traffic monitoring and analysis tool, is a no-brainer when you need more than the simple web access usage of network users, which we track with Squid and Squid log analysers such as Lightsquid. NtopNG is not the only option though; comparitech.com offers a nice comparison guide.
    • The ELK platform for our centralised logging server, though this one is experimental and not yet a robust solution we rely on the way we do OpenNMS.

    Some screenshots from our test deployments are included below, starting with OpenNMS' main dashboard providing an overview of services and outages, then the page of a particular node with more detailed information, and finally some very detailed health metrics pulled from SNMP configured on the servers, on which you can set thresholds and trigger alarms. Note that we aim for much better service availability than depicted in the illustrations below; those are merely test deployments to show how it works.

    NtopNG screenshots showing a dashboard of global traffic, followed by a simple list of hosts which you can drill down into to analyse traffic in detail.

    Kibana screenshot from the ELK platform showing a dashboard constructed by mining the data of our firewall, whose logs are consolidated on our centralised logging server. It clearly shows that at that time

    Develop a highly secure infrastructure environment. Properly managing the state of configuration on servers is an important security component of any network. The servers are always kept updated and protected against the latest known security vulnerabilities. Only encrypted key-based authentication is allowed to servers and only public facing services are opened on the firewalls. An active directory is used to centrally manage users and keys, making it easy to enforce regular password change policies and deploy single sign-on (SSO) for authentication to most internal applications. Most services are already enforced through encrypted channels (SSL/TLS).
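A policy such as "key-based authentication only" is easy to state and easy to drift away from, so it helps to check it automatically. The sketch below audits the text of an OpenSSH `sshd_config` file for two real directives, `PasswordAuthentication` and `PubkeyAuthentication`. It is a simplified illustration under stated assumptions, not our audit tooling: it ignores `Match` blocks, `Include` files and other authentication methods, and the function names are hypothetical.

```python
# OpenSSH defaults for the two directives this sketch looks at.
DEFAULTS = {"passwordauthentication": "yes", "pubkeyauthentication": "yes"}


def effective_settings(config_text):
    """Return the global settings from sshd_config text (simplified parser)."""
    settings = {}
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts[0].lower(), parts[1].strip().lower()
        if key == "match":
            break  # conditional blocks are out of scope for this sketch
        # sshd uses the first value it sees for each keyword.
        settings.setdefault(key, value)
    return {**DEFAULTS, **settings}


def key_only_auth(config_text):
    """True when passwords are disabled and public keys are enabled."""
    s = effective_settings(config_text)
    return s["passwordauthentication"] == "no" and s["pubkeyauthentication"] == "yes"
```

A check like this can run from the same automation that deploys the servers, so a host that falls out of policy is reported rather than discovered by accident.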

    Deploy an effective backup platform. This is arguably the most important component of any network. Our backup strategy is not yet complete but it is already pretty good. All virtual machines are snapshotted (cloned) nightly, making quick recovery of any system a breeze. In addition, we have a rotating data backup platform in place based on the enterprise grade BackupPC. The rotating backup enables us to retrieve individual pieces of a system (e.g. a particular dataset) from multiple points back in time (e.g. last 3 days, last 3 weeks, last 3 months) if required. In combination with the Ansible deployment playbooks (see above), it is also possible to completely recover systems by executing just a few commands, which could be useful when re-arranging applications on different servers. Screenshots of our test BackupPC deployment are shown below, starting with the main dashboard, then an individual server page showing its backups and associated logs, and finally one showing the actual files in the backup.
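The "last 3 days, last 3 weeks, last 3 months" rotation can be illustrated with a small retention function. This is a generic sketch of tiered rotation, not BackupPC's own scheduling algorithm; the function name and the `per_level` parameter are assumptions made for the example.

```python
from datetime import date, timedelta


def select_backups(backup_dates, per_level=3):
    """Pick which backups to keep under a daily/weekly/monthly rotation.

    For each level, the newest backup in each of the `per_level` most recent
    days, ISO weeks and calendar months is kept; everything else may be pruned.
    """
    bucket_keys = (
        lambda d: d,                    # one bucket per day
        lambda d: d.isocalendar()[:2],  # one bucket per (ISO year, ISO week)
        lambda d: (d.year, d.month),    # one bucket per calendar month
    )
    keep = set()
    for key in bucket_keys:
        newest = {}
        for d in backup_dates:
            k = key(d)
            if k not in newest or d > newest[k]:
                newest[k] = d
        # Keep the representative of each of the most recent buckets.
        for k in sorted(newest, reverse=True)[:per_level]:
            keep.add(newest[k])
    return sorted(keep)
```

With nightly backups, the levels overlap (the newest daily backup is also the newest weekly and monthly one), so roughly three months of history is covered by only a handful of retained backups.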